Methods and Compositions for the Recombinant Biosynthesis of Fatty Acids and Esters Berry; David Arthur ; et al. [Joule Unlimited Technologies, Inc.]

Patent Application Summary

U.S. patent application number 14/559168 was filed with the patent office on 2015-03-26 for methods and compositions for the recombinant biosynthesis of fatty acids and esters. The applicant listed for this patent is Joule Unlimited Technologies, Inc. Invention is credited to Noubar Boghos Afeyan, David Arthur Berry, Christian Perry Ridley, Dan Eric Robertson, Martha Sholl, Frank Anthony Skraly, Regina Wilpiszeski.

Application Number	20150082691 14/559168
Family ID	43974445
Filed Date	2015-03-26

United States Patent Application	20150082691
Kind Code	A1
Berry; David Arthur ; et al.	March 26, 2015

Methods and Compositions for the Recombinant Biosynthesis of Fatty Acids and Esters

Abstract

The present disclosure identifies methods and compositions for modifying photoautotrophic organisms, such that the organisms efficiently convert carbon dioxide and light into compounds such as esters and fatty acids. In certain embodiments, the compounds produced are secreted into the medium used to culture the organisms.

Inventors:

Berry; David Arthur; (Brookline, MA) ; Afeyan; Noubar Boghos; (Lexington, MA) ; Skraly; Frank Anthony; (Watertown, MA) ; Ridley; Christian Perry; (Acton, MA) ; Robertson; Dan Eric; (Belmont, MA) ; Wilpiszeski; Regina; (Cambridge, MA) ; Sholl; Martha; (Haverhill, MA)

Applicant:

Name	City	State	Country	Type
Joule Unlimited Technologies, Inc.	Bedford	MA	US

Family ID:

43974445

Appl. No.:

14/559168

Filed:

December 3, 2014

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
13765211	Feb 12, 2013	8906665
14559168
13243165	Sep 23, 2011	8399227
13765211
12876056	Sep 3, 2010	8048654
13243165
PCT/US2009/035937	Mar 3, 2009
12876056
61121532	Dec 10, 2008
61033411	Mar 3, 2008
61033402	Mar 3, 2008
61353145	Jun 9, 2010

Current U.S. Class:	44/388
Current CPC Class:	Y02E 50/13 20130101; C12Y 602/01003 20130101; C12N 15/79 20130101; C10L 2270/026 20130101; Y02E 50/10 20130101; C10L 2200/0476 20130101; C10L 1/026 20130101; C12P 7/649 20130101
Class at Publication:	44/388
International Class:	C10L 1/02 20060101 C10L001/02

Claims

1. A fuel composition comprising a mixture of one or more fatty acid esters, wherein at least one of the fatty acid esters in the mixture is tetradecanoic acid ester, Δ9-hexadecenoic acid ester, hexadecanoic acid ester, heptadecanoic acid ester, Δ9-octadecenoic acid ester, or octadecanoic acid ester, and wherein at least a portion of the carbon used as raw material of the one or more fatty acid esters in the mixture is inorganic carbon.

2. The fuel composition of claim 1, wherein at least two of the fatty acid esters in the mixture are hexadecanoic acid ester and octadecanoic acid ester.

3. The fuel composition of claim 2, wherein the amount of hexadecanoic acid ester in the mixture is between 1.5 and 10 fold greater than the amount of octadecanoic acid ester in the mixture.

4. The fuel composition of claim 1, wherein at least one of the fatty acid esters in the mixture is hexadecanoic acid ester.

5. The fuel composition of claim 4, wherein at least 50% of the fatty acid ester in the mixture is hexadecanoic acid ester.

6. The fuel composition of claim 1, wherein the fuel composition is a low-sulfur fuel composition.

7. The fuel composition of claim 1, wherein the fuel composition is a carbon-neutral fuel composition.

8. The fuel composition of claim 1, wherein the fuel composition has a higher δp than a comparable fuel composition made from fixed atmospheric carbon or plant-derived biomass.

9. The fuel composition of claim 1, wherein the inorganic carbon is carbon dioxide.

10. A fuel composition comprising a mixture of one or more fatty acid esters, wherein at least one of the fatty acid esters in the mixture is tetradecanoic acid ester, Δ9-hexadecenoic acid ester, hexadecanoic acid ester, heptadecanoic acid ester, Δ9-octadecenoic acid ester, or octadecanoic acid ester, and wherein at least a portion of the carbon in the one or more fatty acid esters in the mixture is inorganic carbon.

11. The fuel composition of claim 10, wherein at least two of the fatty acid esters in the mixture are hexadecanoic acid ester and octadecanoic acid ester.

12. The fuel composition of claim 11, wherein the amount of hexadecanoic acid ester in the mixture is between 1.5 and 10 fold greater than the amount of octadecanoic acid ester in the mixture.

13. The fuel composition of claim 10, wherein at least one of the fatty acid esters in the mixture is hexadecanoic acid ester.

14. The fuel composition of claim 13, wherein at least 50% of the fatty acid ester in the mixture is hexadecanoic acid ester.

15. The fuel composition of claim 10, wherein the fuel composition is a low-sulfur fuel composition.

16. The fuel composition of claim 10, wherein the fuel composition is a carbon-neutral fuel composition.

17. The fuel composition of claim 10, wherein the fuel composition has a higher δp than a comparable fuel composition made from fixed atmospheric carbon or plant-derived biomass.

18. The fuel composition of claim 10, wherein the inorganic carbon is derived from carbon dioxide.

19. A fuel composition comprising a mixture of one or more fatty acid esters, wherein at least one of the fatty acid esters in the mixture is tetradecanoic acid ester, Δ9-hexadecenoic acid ester, hexadecanoic acid ester, heptadecanoic acid ester, Δ9-octadecenoic acid ester, or octadecanoic acid ester, and wherein the fuel composition has a higher δp than a comparable fuel composition made from fixed atmospheric carbon or plant-derived biomass.

20. The fuel composition of claim 19, wherein at least a portion of the carbon used as raw material of the one or more fatty acid esters in the mixture is inorganic carbon or wherein at least a portion of the carbon in the one or more fatty acid esters in the mixture is inorganic carbon.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional of U.S. patent application Ser. No. 13/765,211, filed Feb. 12, 2013, which is a divisional of U.S. patent application Ser. No. 13/243,165, filed Sep. 23, 2011, which is a continuation of U.S. patent application Ser. No. 12/876,056, filed Sep. 3, 2010, which is a continuation-in-part of international application PCT/US2009/035937, filed Mar. 3, 2009, which claims the benefit of earlier filed U.S. Provisional Patent Application No. 61/121,532, filed Dec. 10, 2008, U.S. Provisional Patent Application No. 61/033,411, filed Mar. 3, 2008, and U.S. Provisional Application No. 61/033,402, filed Mar. 3, 2008; this application also claims priority to U.S. Provisional Application No. 61/353,145, filed Jun. 9, 2010. The disclosures of each of these applications are incorporated herein by reference, in their entirety, for all purposes.

REFERENCE TO SEQUENCE LISTING

[0002] The instant application contains a Sequence Listing which has been submitted via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Sep. 23, 2011, is named "19578_US_Sequence_Listing.txt", lists 25 sequences, and is 91.4 kb in size.

FIELD OF THE INVENTION

[0003] The present disclosure relates to methods for conferring fatty acid and fatty acid ester-producing properties to a heterotrophic or photoautotrophic host, such that the modified host can be used in the commercial production of fuels and chemicals.

BACKGROUND OF THE INVENTION

[0004] Many existing photoautotrophic organisms (i.e., plants, algae, and photosynthetic bacteria) are poorly suited for industrial bioprocessing and have therefore not demonstrated commercial viability. Such organisms typically have slow doubling times (3-72 hrs) compared to industrialized heterotrophic organisms such as Escherichia coli (20 minutes), reflective of low total productivities. A need exists, therefore, for engineered photosynthetic microbes which produce increased yields of fatty acids and esters.

SUMMARY OF THE INVENTION

[0005] In one embodiment, the invention provides a method for producing fatty acid esters, comprising: (i) culturing an engineered photosynthetic microorganism in a culture medium, wherein said engineered photosynthetic microorganism comprises a recombinant thioesterase, a recombinant acyl-CoA synthetase, and a recombinant wax synthase; and (ii) exposing said engineered photosynthetic microorganism to light and carbon dioxide, wherein said exposure results in the incorporation of an alcohol into a fatty acid ester produced by said engineered photosynthetic microorganism. In a related embodiment, the engineered photosynthetic microorganism is an engineered cyanobacterium. In another related embodiment, at least one of said fatty acid esters produced by the engineered cyanobacterium is selected from the group consisting of a tetradecanoic acid ester, a hexadecanoic acid ester, a heptadecanoic acid ester, a Δ9-octadecenoic acid ester, and an octadecanoic acid ester. In another related embodiment, the amount of said fatty acid esters produced by said engineered cyanobacterium is increased relative to the amount of fatty acid produced by an otherwise identical cell lacking said recombinant thioesterase, acyl-CoA synthetase or wax synthase. In certain embodiments, the incorporated alcohol is an exogenously added alcohol selected from the group consisting of methanol, ethanol, propanol, isopropanol, butanol, hexanol, cyclohexanol, and isoamyl alcohol.

[0006] In another related embodiment, the esters produced by the engineered cyanobacteria include a hexadecanoic acid ester and an octadecanoic acid ester. In another related embodiment, the amount of hexadecanoic acid ester produced is between 1.5 and 10 fold greater than the amount of octadecanoic acid ester. In yet another related embodiment, the amount of hexadecanoic acid ester produced is between 1.5 and 5 fold greater than the amount of octadecanoic acid ester produced. In yet another related embodiment, at least 50% of the esters produced by said engineered cyanobacterium are hexadecanoic acid esters. In yet another related embodiment, between 65% and 85% of the esters produced by said engineered cyanobacterium are hexadecanoic acid esters.
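The fold and percentage ranges in this paragraph can be checked against a measured ester profile with simple arithmetic. The following is a minimal sketch only: the function name, dictionary keys, and example amounts are hypothetical and are not data from the disclosure, and it assumes all amounts are given in one consistent unit.

```python
# Illustrative sketch; names and numbers are hypothetical, not from the disclosure.
def ester_profile(amounts):
    """Summarize an ester mixture given a dict of ester name -> amount.

    Amounts may be in any unit, as long as the same unit is used throughout.
    """
    total = sum(amounts.values())
    c16 = amounts.get("hexadecanoic", 0.0)
    c18 = amounts.get("octadecanoic", 0.0)
    # Fold excess of hexadecanoic over octadecanoic acid ester.
    fold = c16 / c18 if c18 > 0 else float("inf")
    # Fraction of all esters that are hexadecanoic acid esters.
    fraction = c16 / total if total > 0 else 0.0
    return {
        "c16_to_c18_fold": fold,
        "c16_fraction": fraction,
        "fold_1_5_to_10": 1.5 <= fold <= 10,
        "fold_1_5_to_5": 1.5 <= fold <= 5,
        "c16_at_least_half": fraction >= 0.5,
        "c16_65_to_85_pct": 0.65 <= fraction <= 0.85,
    }


# Made-up example profile (arbitrary units).
example = {
    "tetradecanoic": 5.0,
    "hexadecanoic": 70.0,
    "delta9_octadecenoic": 5.0,
    "octadecanoic": 20.0,
}
profile = ester_profile(example)
```

With the made-up profile above, the hexadecanoic ester is 3.5 fold more abundant than the octadecanoic ester and makes up 70% of the total, so every range named in the paragraph would be satisfied.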

[0007] In a related embodiment of the method for producing fatty acid esters described above, the exogenously added alcohol is butanol and fatty acid butyl esters are produced. In yet another related embodiment, the yield of fatty acid butyl esters is at least 5% dry cell weight. In yet another related embodiment, the yield of fatty acid butyl esters is at least 10% dry cell weight. In yet another related embodiment, exogenously added butanol is present in said culture at concentrations between 0.01 and 0.2% (vol/vol). In yet another related embodiment, the concentration of exogenously added butanol is about 0.05 to 0.075% (vol/vol).
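The units in this paragraph, percent vol/vol for the added alcohol and percent of dry cell weight for the yield, reduce to two short calculations. The sketch below is illustrative only: the helper names and the example volumes and masses are assumptions, and the volume calculation ignores the small change in total volume caused by adding the alcohol.

```python
# Illustrative sketch; helper names and example values are hypothetical.
def alcohol_volume_ml(culture_volume_ml, percent_vol_vol):
    """Volume of alcohol (mL) giving the stated % (vol/vol) in a culture.

    Approximation: the added alcohol is treated as negligible relative
    to the total culture volume.
    """
    return culture_volume_ml * percent_vol_vol / 100.0


def yield_percent_dcw(ester_mass_mg, dry_cell_mass_mg):
    """Ester yield expressed as a percentage of dry cell weight."""
    if dry_cell_mass_mg <= 0:
        raise ValueError("dry cell mass must be positive")
    return 100.0 * ester_mass_mg / dry_cell_mass_mg


# Butanol needed for a 1 L culture at the two ends of the 0.05-0.075% range.
low_ml = alcohol_volume_ml(1000.0, 0.05)
high_ml = alcohol_volume_ml(1000.0, 0.075)

# Hypothetical measurement: 12 mg butyl esters from 100 mg dry cells.
example_yield = yield_percent_dcw(12.0, 100.0)
```

For a 1 L culture this gives 0.5 mL and 0.75 mL of butanol, and the hypothetical measurement corresponds to a yield of 12% dry cell weight, which would exceed both the 5% and 10% thresholds.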

[0008] In another related embodiment of the method for producing fatty acid esters described above, the exogenously added alcohol is ethanol. In yet another related embodiment, the yield of ethyl esters is at least 1% dry cell weight.

[0009] In another related embodiment of the method for producing fatty acid esters described above, the exogenously added alcohol is methanol. In yet another related embodiment, the yield of methyl esters is at least 0.01% dry cell weight.

[0010] In another related embodiment, said engineered cyanobacterium further comprises a recombinant resistance nodulation cell division type ("RND-type") transporter, e.g., a TolC-AcrAB transporter. In another related embodiment, the expression of TolC is controlled by a promoter separate from the promoter that controls expression of AcrAB. In another related embodiment, the genes encoding the recombinant transporter are encoded by a plasmid. In another related embodiment, the fatty acid esters are secreted into the culture medium at increased levels relative to an otherwise identical cyanobacterium lacking the recombinant transporter.

[0011] In certain embodiments of the methods for producing fatty acid esters described above, the recombinant thioesterase, wax synthase, and acyl-CoA synthetase are expressed as an operon under the control of a single promoter. In certain embodiments, the single promoter is an inducible promoter. In other embodiments of the methods described above, the expression of at least two of the genes selected from the group consisting of a recombinant thioesterase, wax synthase, and acyl-CoA synthetase is under the control of different promoters. One or more of the promoters can be an inducible promoter. In related embodiments, at least one of said recombinant genes is encoded on a plasmid. In yet other related embodiments, at least one of said recombinant genes is integrated into the chromosome of the engineered cyanobacteria. In yet other related embodiments, at least one of said recombinant genes is a gene that is native to the engineered cyanobacteria, but whose expression is controlled by a recombinant promoter. In yet other related embodiments, one or more promoters are selected from the group consisting of a cI promoter, a cpcB promoter, a lacI-Ptrc promoter, an EM7 promoter, a PaphII promoter, a NirA-type promoter, a PnrsA promoter, and a PnrsB promoter.

[0012] In another embodiment, the invention provides a method for producing fatty acid esters, comprising: (i) culturing an engineered cyanobacterium in a culture medium, wherein said engineered cyanobacterium comprises a recombinant acyl-CoA synthetase and a recombinant wax synthase; and (ii) exposing said engineered cyanobacterium to light and carbon dioxide, wherein said exposure results in the conversion of an alcohol by said engineered cyanobacterium into fatty acid esters, wherein at least one of said fatty acid esters is selected from the group consisting of a tetradecanoic acid ester, a hexadecanoic acid ester, a heptadecanoic acid ester, a Δ9-octadecenoic acid ester, and an octadecanoic acid ester, wherein the amount of said fatty acid esters produced by said engineered cyanobacterium is increased relative to the amount of fatty acid produced by an otherwise identical cell lacking said recombinant acyl-CoA synthetase or wax synthase. In a related embodiment, the alcohol is an exogenously added alcohol selected from the group consisting of methanol, ethanol, propanol, isopropanol, butanol, hexanol, cyclohexanol, and isoamyl alcohol.

[0013] In another embodiment, the invention provides a method for producing a fatty acid ester, comprising: (i) culturing an engineered cyanobacterium in a culture medium, wherein said engineered cyanobacterium comprises a recombinant RND-type transporter; and (ii) exposing said engineered cyanobacterium to light and carbon dioxide, wherein said exposure results in the production of a fatty acid ester by said engineered cyanobacterium, and wherein said RND-type transporter secretes said fatty acid ester into said culture medium. In a related embodiment, said RND-type transporter is a TolC-AcrAB transporter.

[0014] In an embodiment related to the methods described above, the invention further comprises isolating said fatty acid ester from said engineered cyanobacterium or said culture medium.

[0015] In another embodiment, the invention also provides an engineered cyanobacterium, wherein said cyanobacterium comprises a recombinant thioesterase, a recombinant acyl-CoA synthetase, and a recombinant wax synthase. In certain embodiments, the engineered cyanobacterium additionally comprises a recombinant RND-type transporter, e.g., a TolC-AcrAB transporter.

[0016] In a related embodiment, at least one of said recombinant enzymes is heterologous with respect to said engineered cyanobacterium. In another embodiment, said cyanobacterium does not synthesize fatty acid esters in the absence of the expression of one or both of the recombinant enzymes. In another embodiment, at least one of said recombinant enzymes is not heterologous to said engineered cyanobacterium.

[0017] In yet another related embodiment, the recombinant thioesterase, acyl-CoA synthetase and wax synthase are selected from the enzymes listed in Table 3A, Table 3B and Table 3C, respectively. In yet another related embodiment, the recombinant thioesterase has an amino acid sequence that is identical to SEQ ID NO: 1. In yet another related embodiment, the recombinant thioesterase has an amino acid sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1. In yet another related embodiment, the recombinant acyl-CoA synthetase is identical to SEQ ID NO: 2. In yet another related embodiment, the recombinant acyl-CoA synthetase has an amino acid sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2. In yet another related embodiment, the recombinant wax synthase is identical to SEQ ID NO: 3. In yet another related embodiment, the recombinant wax synthase has an amino acid sequence that is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3. In yet another related embodiment, the recombinant TolC transporter amino acid sequence is identical to SEQ ID NO: 7. In yet another related embodiment, the recombinant TolC transporter has an amino acid sequence at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 7. In yet another related embodiment, the recombinant AcrA amino acid sequence is identical to SEQ ID NO: 8. In yet another related embodiment, the recombinant AcrA amino acid sequence is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8. In yet another related embodiment, the recombinant AcrB amino acid sequence is identical to SEQ ID NO: 9. In yet another related embodiment, the recombinant AcrB amino acid sequence is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 9.

[0018] In related embodiments of the above-described embodiments, an engineered photosynthetic microorganism other than a cyanobacterium can be used. In other related embodiments, a thermophilic cyanobacterium can be used.

[0019] In another embodiment, the invention provides methods and compositions for producing fatty acids using an engineered photosynthetic microorganism. For example, in one embodiment, the invention provides a method for producing fatty acids, comprising: (a) culturing an engineered photosynthetic microorganism, wherein said engineered photosynthetic microorganism comprises a modification which reduces the expression of said microorganism's endogenous acyl-ACP synthetase; and (b) exposing said engineered photosynthetic microorganism to light and carbon dioxide, wherein said exposure results in the production of fatty acids by said engineered cyanobacterium, wherein the amount of fatty acids produced is increased relative to the amount of fatty acids produced by an otherwise identical microorganism lacking said modification. In a related embodiment, the engineered microorganism is a thermophile. In another related embodiment, the engineered microorganism is a cyanobacterium. In yet another related embodiment, the engineered microorganism is a thermophilic cyanobacterium. In yet another related embodiment, the engineered microorganism is Thermosynechococcus elongatus BP-1. In yet another related embodiment of the method for producing fatty acids, the modification is a knock-out or deletion of the gene encoding said endogenous acyl-ACP synthetase. In yet another related embodiment, the gene encoding said acyl-ACP synthetase is the acyl-ACP synthetase or aas gene, e.g., GenBank accession number NP_682091.1. In yet another related embodiment, the increase in fatty acid production is at least a 2 fold increase. In yet another related embodiment, the increase in fatty acid production is between 2 and 4.5 fold. In yet another related embodiment, the increase in fatty acid production includes an increase in fatty acids secreted into a culture medium. In yet another related embodiment, most of said increase in fatty acid production arises from the increased production of myristic and oleic acid. In yet another related embodiment of the method for producing fatty acids, the engineered photosynthetic microorganism further comprises a TolC-AcrAB transporter.

[0020] In another embodiment, the invention provides an engineered photosynthetic microorganism, wherein said microorganism comprises a deletion or knock-out of an endogenous gene encoding an acyl-ACP synthetase or long-chain fatty acid ligase. In a related embodiment, the engineered photosynthetic microorganism is a thermophile. In yet another related embodiment, the engineered photosynthetic microorganism is a cyanobacterium or a thermophilic cyanobacterium. In yet another related embodiment, the cyanobacterium is Thermosynechococcus elongatus BP-1. In yet another related embodiment, the acyl-ACP synthetase is the aas gene of the thermophilic cyanobacterium, e.g., GenBank accession number NP_682091.1. In yet another related embodiment, the engineered photosynthetic microorganism further comprises a TolC-AcrAB transporter.

[0021] In yet another embodiment, the invention provides an engineered cyanobacterial strain selected from the group consisting of JCC723, JCC803, JCC1215, JCC803, JCC1132, and JCC1585. In yet another embodiment, the invention provides an engineered cyanobacterial strain selected from the group consisting of the engineered Synechococcus sp. PCC7002 strains JCC1648 (Δaas tesA, with tesA under control of P(nir07) on pAQ4), JCC1704 (Δaas fatB, with fatB inserted at aquI under the control of P(nir07)), JCC1705 (Δaas fatB1, with fatB1 inserted at aquI under the control of P(nir07)), JCC1706 (Δaas fatB2, with fatB2 inserted at aquI under the control of P(nir07)), JCC1751 (Δaas tesA, with tesA under control of P(nir07) on pAQ3), and JCC1755 (Δaas fatB_mat, with fatB_mat under control of P(nir07) on pAQ3). In yet another embodiment, the invention provides the engineered cyanobacterial strain JCC1862 (Thermosynechococcus elongatus BP-1 kanR Δaas).

[0022] These and other embodiments of the invention are further described in the Figures, Description, Examples and Claims, herein.

BRIEF DESCRIPTION OF THE FIGURES

[0023] FIG. 1 depicts a GC/MS chromatogram overlay comparing cell pellet extracts of JCC803 incubated with either methanol (top trace) or ethanol (bottom traces). The peaks due to methyl esters (MEs) or ethyl esters (EEs) are labeled.

[0024] FIG. 2 shows three stacked GC/FID chromatograms comparing cell pellet extracts of the indicated cyanobacterial strains when cultured in the presence of ethanol. The interval between tick marks on the FID response axis is 20,000.

[0025] FIG. 3 depicts stacks of GC/FID chromatograms comparing cell pellet extracts of JCC803 cultures incubated with different alcohols (indicated on respective chromatograms). Numbers indicate the respective fatty acid ester corresponding to the alcohol added (1=myristate; 2=palmitate; 3=oleate; 4=stearate). EA=ethyl arachidate. The interval between tick marks on the FID response axis is 400,000.

[0026] FIG. 4 depicts a GC/chromatogram of a cell pellet extract from a JCC803 culture incubated with ethanol. 1=ethyl myristate; 2=ethyl palmitoleate; 3=ethyl palmitate; 4=ethyl margarate; 5=ethyl oleate; 6=ethyl stearate.

[0027] FIG. 5 depicts a GC/chromatogram of a cell pellet extract from a JCC803 culture incubated with butanol. 1=butyl myristate, 2=butyl palmitoleate, 3=butyl palmitate, 4=butyl margarate, 5=butyl oleate, 6=butyl stearate.

DETAILED DESCRIPTION OF THE INVENTION

[0028] Unless otherwise defined herein, scientific and technical terms used in connection with the present invention shall have the meanings that are commonly understood by those of ordinary skill in the art. Further, unless otherwise required by context, singular terms shall include the plural and plural terms shall include the singular. Generally, nomenclatures used in connection with, and techniques of, biochemistry, enzymology, molecular and cellular biology, microbiology, genetics and protein and nucleic acid chemistry and hybridization described herein are those well known and commonly used in the art.

[0029] The methods and techniques of the present invention are generally performed according to conventional methods well known in the art and as described in various general and more specific references that are cited and discussed throughout the present specification unless otherwise indicated. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989); Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002); Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1990); Taylor and Drickamer, Introduction to Glycobiology, Oxford Univ. Press (2003); Worthington Enzyme Manual, Worthington Biochemical Corp., Freehold, N.J.; Handbook of Biochemistry: Section A Proteins, Vol I, CRC Press (1976); Handbook of Biochemistry: Section A Proteins, Vol II, CRC Press (1976); Essentials of Glycobiology, Cold Spring Harbor Laboratory Press (1999).

[0030] All publications, patents and other references mentioned herein are hereby incorporated by reference in their entireties.

[0031] The following terms, unless otherwise indicated, shall be understood to have the following meanings:

[0032] The term "polynucleotide" or "nucleic acid molecule" refers to a polymeric form of nucleotides of at least 10 bases in length. The term includes DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules (e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing non-natural nucleotide analogs, non-native internucleoside bonds, or both. The nucleic acid can be in any topological conformation. For instance, the nucleic acid can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially double-stranded, branched, hairpinned, circular, or in a padlocked conformation.

[0033] Unless otherwise indicated, and as an example for all sequences described herein under the general format "SEQ ID NO:", "nucleic acid comprising SEQ ID NO:1" refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO:1, or (ii) a sequence complementary to SEQ ID NO:1. The choice between the two is dictated by the context. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target.
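The "comprising SEQ ID NO:" convention in this paragraph amounts to accepting either the listed sequence or its complement. The sketch below is one possible way to express that test. It assumes plain A/C/G/T DNA strings written 5' to 3', so the complementary strand is represented by the reverse complement; the function names and toy sequences are hypothetical and are not sequences from the Sequence Listing.

```python
# Illustrative sketch; assumes unambiguous DNA letters written 5' to 3'.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def comprises(nucleic_acid, reference):
    """True if at least a portion of nucleic_acid has either
    (i) the reference sequence or (ii) its complement."""
    target = nucleic_acid.upper()
    ref = reference.upper()
    return ref in target or reverse_complement(ref) in target


# Toy sequences for illustration only.
toy_reference = "ATGC"
hit_by_complement = comprises("TTGCATTT", toy_reference)
no_hit = comprises("AAAAAAAA", toy_reference)
```

Here the first toy molecule contains only the complement of the reference (GCAT), which is the case a probe would require, and the second contains neither form.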

[0034] An "isolated" RNA, DNA or a mixed polymer is one which is substantially separated from other cellular components that naturally accompany the native polynucleotide in its natural host cell, e.g., ribosomes, polymerases and genomic sequences with which it is naturally associated.

[0035] As used herein, an "isolated" organic molecule (e.g., a fatty acid or a fatty acid ester) is one which is substantially separated from the cellular components (membrane lipids, chromosomes, proteins) of the host cell from which it originated, or from the medium in which the host cell was cultured. The term does not require that the biomolecule has been separated from all other chemicals, although certain isolated biomolecules may be purified to near homogeneity.

[0036] The term "recombinant" refers to a biomolecule, e.g., a gene or protein, that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the gene is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, or (4) does not occur in nature. The term "recombinant" can be used in reference to cloned DNA isolates, chemically synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as proteins and/or mRNAs encoded by such nucleic acids.

[0037] As used herein, an endogenous nucleic acid sequence in the genome of an organism (or the encoded protein product of that sequence) is deemed "recombinant" herein if a heterologous sequence is placed adjacent to the endogenous nucleic acid sequence, such that the expression of this endogenous nucleic acid sequence is altered. In this context, a heterologous sequence is a sequence that is not naturally adjacent to the endogenous nucleic acid sequence, whether or not the heterologous sequence is itself endogenous (originating from the same host cell or progeny thereof) or exogenous (originating from a different host cell or progeny thereof). By way of example, a promoter sequence can be substituted (e.g., by homologous recombination) for the native promoter of a gene in the genome of a host cell, such that this gene has an altered expression pattern. This gene would now become "recombinant" because it is separated from at least some of the sequences that naturally flank it.

[0038] A nucleic acid is also considered "recombinant" if it contains any modifications that do not naturally occur to the corresponding nucleic acid in a genome. For instance, an endogenous coding sequence is considered "recombinant" if it contains an insertion, deletion or a point mutation introduced artificially, e.g., by human intervention. A "recombinant nucleic acid" also includes a nucleic acid integrated into a host cell chromosome at a heterologous site and a nucleic acid construct present as an episome.

[0039] As used herein, the phrase "degenerate variant" of a reference nucleic acid sequence encompasses nucleic acid sequences that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence. The term "degenerate oligonucleotide" or "degenerate primer" is used to signify an oligonucleotide capable of hybridizing with target nucleic acid sequences that are not necessarily identical in sequence but that are homologous to one another within one or more particular segments.
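Whether one coding sequence is a degenerate variant of another, in the sense defined here, can be tested by translating both with the standard genetic code and comparing the resulting amino acid sequences. The following is a minimal sketch under stated assumptions: both inputs are in-frame DNA coding sequences using only A, C, G and T, and the function names and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(candidate, reference):
    """True if both sequences encode the identical amino acid sequence."""
    return translate(candidate) == translate(reference)


# Toy sequences: GCT and GCA both encode alanine; GAT encodes aspartate.
same_protein = is_degenerate_variant("ATGGCATAA", "ATGGCTTAA")
different_protein = is_degenerate_variant("ATGGATTAA", "ATGGCTTAA")
```

The first toy pair differs only at a synonymous third codon position and so translates identically, while the second pair changes the encoded residue and would not count as a degenerate variant.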

[0040] The term "percent sequence identity" or "identical" in the context of nucleic acid sequences refers to the residues in the two sequences which are the same when aligned for maximum correspondence. The length of sequence identity comparison may be over a stretch of at least about nine nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 32 nucleotides, and preferably at least about 36 or more nucleotides. There are a number of different algorithms known in the art which can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (hereby incorporated by reference in its entirety). For instance, percent sequence identity between nucleic acid sequences can be determined using FASTA with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) or using Gap with its default parameters as provided in GCG Version 6.1, herein incorporated by reference. Alternatively, sequences can be compared using the computer program, BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).
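Once two sequences have been aligned by a program such as those named above, the percent identity itself is a simple count of matching positions. The sketch below illustrates only that final counting step and is not a reimplementation of FASTA, Gap, Bestfit or BLAST. It assumes the two inputs are already aligned to equal length with "-" as the gap character, and it divides by the number of alignment columns that are not gaps in both sequences; real tools may use a different denominator, so this convention is an assumption.

```python
# Illustrative sketch; operates on sequences that are ALREADY aligned.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns at which both sequences carry
    the same residue. Columns that are gaps in both are skipped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignments: one mismatch in ten columns, and one gap in eight columns.
one_mismatch = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
one_gap = percent_identity("ACGT-ACG", "ACGTTACG")
```

The first toy alignment gives 90% identity and the second 87.5%, since a gap opposite a residue is counted here as a non-matching column.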

[0041] The term "substantial homology" or "substantial similarity," when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 76%, 80%, 85%, preferably at least about 90%, and more preferably at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.

[0042] Alternatively, substantial homology or similarity exists when a nucleic acid or fragment thereof hybridizes to another nucleic acid, to a strand of another nucleic acid, or to the complementary strand thereof, under stringent hybridization conditions. "Stringent hybridization conditions" and "stringent wash conditions" in the context of nucleic acid hybridization experiments depend upon a number of different physical parameters. Nucleic acid hybridization will be affected by such conditions as salt concentration, temperature, solvents, the base composition of the hybridizing species, length of the complementary regions, and the number of nucleotide base mismatches between the hybridizing nucleic acids, as will be readily appreciated by those skilled in the art. One having ordinary skill in the art knows how to vary these parameters to achieve a particular stringency of hybridization.

[0043] In general, "stringent hybridization" is performed at about 25 °C below the thermal melting point (Tm) for the specific DNA hybrid under a particular set of conditions. "Stringent washing" is performed at temperatures about 5 °C lower than the Tm for the specific DNA hybrid under a particular set of conditions. The Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), page 9.51, hereby incorporated by reference. For purposes herein, "stringent conditions" are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6×SSC (where 20×SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65 °C for 8-12 hours, followed by two washes in 0.2×SSC, 0.1% SDS at 65 °C for 20 minutes. It will be appreciated by the skilled worker that hybridization at 65 °C will occur at different rates depending on a number of factors including the length and percent identity of the sequences which are hybridizing.
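The arithmetic implied by the preceding paragraph can be sketched as follows, purely for illustration. The function names are hypothetical, and the melting temperature passed in is an assumed input; the sketch does not estimate a melting temperature from sequence.

```python
# Composition of the 20-fold SSC stock as stated in the text above.
STOCK_20X_SSC = {"NaCl_M": 3.0, "sodium_citrate_M": 0.3}


def ssc_molarity(fold):
    """Molar concentrations in an SSC buffer diluted to the given fold strength."""
    return {name: molar * fold / 20.0 for name, molar in STOCK_20X_SSC.items()}


def stringent_temperatures(melting_temp_celsius):
    """Approximate hybridization and wash temperatures relative to the melting temperature."""
    return {
        "hybridization_C": melting_temp_celsius - 25.0,  # about 25 degrees C below
        "wash_C": melting_temp_celsius - 5.0,            # about 5 degrees C below
    }


# 6-fold SSC used for hybridization and 0.2-fold SSC used for the washes.
print(ssc_molarity(6))
print(ssc_molarity(0.2))
# Hypothetical hybrid with an assumed melting temperature of 90 degrees C.
print(stringent_temperatures(90.0))
```

Under these definitions the 6-fold hybridization buffer contains about 0.9 M NaCl and 0.09 M sodium citrate, and the 0.2-fold wash buffer about 0.03 M NaCl and 0.003 M sodium citrate.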

[0044] The nucleic acids (also referred to as polynucleotides) of the present invention may include both sense and antisense strands of RNA, cDNA, genomic DNA, and synthetic forms and mixed polymers of the above. They may be modified chemically or biochemically or may contain non-natural or derivatized nucleotide bases, as will be readily appreciated by those of skill in the art. Such modifications include, for example, labels, methylation, substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoramidates, carbamates, etc.), charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.), pendent moieties (e.g., polypeptides), intercalators (e.g., acridine, psoralen, etc.), chelators, alkylators, and modified linkages (e.g., alpha anomeric nucleic acids, etc.). Also included are synthetic molecules that mimic polynucleotides in their ability to bind to a designated sequence via hydrogen bonding and other chemical interactions. Such molecules are known in the art and include, for example, those in which peptide linkages substitute for phosphate linkages in the backbone of the molecule. Other modifications can include, for example, analogs in which the ribose ring contains a bridging moiety or other structure such as the modifications found in "locked" nucleic acids.

[0045] The term "mutated" when applied to nucleic acid sequences means that nucleotides in a nucleic acid sequence may be inserted, deleted or changed compared to a reference nucleic acid sequence. A single alteration may be made at a locus (a point mutation) or multiple nucleotides may be inserted, deleted or changed at a single locus. In addition, one or more alterations may be made at any number of loci within a nucleic acid sequence. A nucleic acid sequence may be mutated by any method known in the art including but not limited to mutagenesis techniques such as "error-prone PCR" (a process for performing PCR under conditions where the copying fidelity of the DNA polymerase is low, such that a high rate of point mutations is obtained along the entire length of the PCR product; see, e.g., Leung et al., Technique, 1:11-15 (1989) and Caldwell and Joyce, PCR Methods Applic. 2:28-33 (1992)); and "oligonucleotide-directed mutagenesis" (a process which enables the generation of site-specific mutations in any cloned DNA segment of interest; see, e.g., Reidhaar-Olson and Sauer, Science 241:53-57 (1988)).

[0046] The term "attenuate" as used herein generally refers to a functional deletion, including a mutation, partial or complete deletion, insertion, or other variation made to a gene sequence or a sequence controlling the transcription of a gene sequence, which reduces or inhibits production of the gene product, or renders the gene product non-functional. In some instances a functional deletion is described as a knockout mutation. Attenuation also includes amino acid sequence changes by altering the nucleic acid sequence, placing the gene under the control of a less active promoter, down-regulation, expressing interfering RNA, ribozymes or antisense sequences that target the gene of interest, or through any other technique known in the art. In one example, the sensitivity of a particular enzyme to feedback inhibition or inhibition caused by a composition that is not a product or a reactant (non-pathway specific feedback) is lessened such that the enzyme activity is not impacted by the presence of a compound. In other instances, an enzyme that has been altered to be less active can be referred to as attenuated.

[0047] Deletion: The removal of one or more nucleotides from a nucleic acid molecule or one or more amino acids from a protein, the regions on either side being joined together.

[0048] Knock-out: A gene whose level of expression or activity has been reduced to zero. In some examples, a gene is knocked-out via deletion of some or all of its coding sequence. In other examples, a gene is knocked-out via introduction of one or more nucleotides into its open reading frame, which results in translation of a non-sense or otherwise non-functional protein product.

[0049] The term "vector" as used herein is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid," which generally refers to a circular double stranded DNA loop into which additional DNA segments may be ligated, but also includes linear double-stranded molecules such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a circular plasmid with a restriction enzyme. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments may be ligated into the viral genome (discussed in more detail below). Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., vectors having an origin of replication which functions in the host cell). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain preferred vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as "recombinant expression vectors" (or simply "expression vectors").

[0050] "Operatively linked" or "operably linked" expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.

[0051] The term "expression control sequence" as used herein refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence. The term "control sequences" is intended to include, at a minimum, all components whose presence is essential for expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.

[0052] Promoters useful for expressing the recombinant genes described herein include both constitutive and inducible/repressible promoters. Examples of inducible/repressible promoters include nickel-inducible promoters (e.g., PnrsA, PnrsB; see, e.g., Lopez-Maury et al., Cell (2002) v. 43:247-256, incorporated by reference herein) and urea repressible promoters such as PnirA (described in, e.g., Qi et al., Applied and Environmental Microbiology (2005) v. 71: 5678-5684, incorporated by reference herein). In other embodiments, a PaphII and/or a lacIq-Ptrc promoter can be used to control expression. Where multiple recombinant genes are expressed in an engineered cyanobacterium of the invention, the different genes can be controlled by different promoters or by identical promoters in separate operons, or the expression of two or more genes may be controlled by a single promoter as part of an operon.

[0053] The term "recombinant host cell" (or simply "host cell"), as used herein, is intended to refer to a cell into which a recombinant vector has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term "host cell" as used herein. A recombinant host cell may be an isolated cell or cell line grown in culture or may be a cell which resides in a living tissue or organism.

[0054] The term "peptide" as used herein refers to a short polypeptide, e.g., one that is typically less than about 50 amino acids long and more typically less than about 30 amino acids long. The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.

[0055] The term "polypeptide" encompasses both naturally-occurring and non-naturally-occurring proteins, and fragments, mutants, derivatives and analogs thereof. A polypeptide may be monomeric or polymeric. Further, a polypeptide may comprise a number of different domains each of which has one or more distinct activities.

[0056] The term "isolated protein" or "isolated polypeptide" is a protein or polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in its native state, (2) exists in a purity not found in nature, where purity can be adjudged with respect to the presence of other cellular material (e.g., is free of other proteins from the same species) (3) is expressed by a cell from a different species, or (4) does not occur in nature (e.g., it is a fragment of a polypeptide found in nature or it includes amino acid analogs or derivatives not found in nature or linkages other than standard peptide bonds). Thus, a polypeptide that is chemically synthesized or synthesized in a cellular system different from the cell from which it naturally originates will be "isolated" from its naturally associated components. A polypeptide or protein may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art. As thus defined, "isolated" does not necessarily require that the protein, polypeptide, peptide or oligopeptide so described has been physically removed from its native environment.

[0057] The term "polypeptide fragment" as used herein refers to a polypeptide that has a deletion, e.g., an amino-terminal and/or carboxy-terminal deletion compared to a full-length polypeptide. In a preferred embodiment, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, preferably at least 12, 14, 16 or 18 amino acids long, more preferably at least 20 amino acids long, more preferably at least 25, 30, 35, 40 or 45 amino acids long, even more preferably at least 50 or 60 amino acids long, and even more preferably at least 70 amino acids long.

[0058] A "modified derivative" refers to polypeptides or fragments thereof that are substantially homologous in primary structural sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications or which incorporate amino acids that are not found in the native polypeptide. Such modifications include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily appreciated by those skilled in the art. A variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well known in the art, and include radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H, ligands which bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands which can serve as specific binding pair members for a labeled ligand. The choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation. Methods for labeling polypeptides are well known in the art. See, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002) (hereby incorporated by reference).

[0059] The term "fusion protein" refers to a polypeptide comprising a polypeptide or fragment coupled to heterologous amino acid sequences. Fusion proteins are useful because they can be constructed to contain two or more desired functional elements from two or more different proteins. A fusion protein comprises at least 10 contiguous amino acids from a polypeptide of interest, more preferably at least 20 or 30 amino acids, even more preferably at least 40, 50 or 60 amino acids, yet more preferably at least 75, 100 or 125 amino acids. Fusions that include the entirety of the proteins of the present invention have particular utility. The heterologous polypeptide included within the fusion protein of the present invention is at least 6 amino acids in length, often at least 8 amino acids in length, and usefully at least 15, 20, and 25 amino acids in length. Fusions that include larger polypeptides, such as an IgG Fc region, and even entire proteins, such as the green fluorescent protein ("GFP") chromophore-containing proteins, have particular utility. Fusion proteins can be produced recombinantly by constructing a nucleic acid sequence which encodes the polypeptide or a fragment thereof in frame with a nucleic acid sequence encoding a different protein or peptide and then expressing the fusion protein. Alternatively, a fusion protein can be produced chemically by crosslinking the polypeptide or a fragment thereof to another protein.

[0060] As used herein, the term "antibody" refers to a polypeptide, at least a portion of which is encoded by at least one immunoglobulin gene, or fragment thereof, and that can bind specifically to a desired target molecule. The term includes naturally-occurring forms, as well as fragments and derivatives.

[0061] Fragments within the scope of the term "antibody" include those produced by digestion with various proteases, those produced by chemical cleavage and/or chemical dissociation and those produced recombinantly, so long as the fragment remains capable of specific binding to a target molecule. Among such fragments are Fab, Fab', Fv, F(ab')₂, and single chain Fv (scFv) fragments.

[0062] Derivatives within the scope of the term include antibodies (or fragments thereof) that have been modified in sequence, but remain capable of specific binding to a target molecule, including: interspecies chimeric and humanized antibodies; antibody fusions; heteromeric antibody complexes and antibody fusions, such as diabodies (bispecific antibodies), single-chain diabodies, and intrabodies (see, e.g., Intracellular Antibodies: Research and Disease Applications, (Marasco, ed., Springer-Verlag New York, Inc., 1998), the disclosure of which is incorporated herein by reference in its entirety).

[0063] As used herein, antibodies can be produced by any known technique, including harvest from cell culture of native B lymphocytes, harvest from culture of hybridomas, recombinant expression systems and phage display.

[0064] The term "non-peptide analog" refers to a compound with properties that are analogous to those of a reference polypeptide. A non-peptide compound may also be termed a "peptide mimetic" or a "peptidomimetic." See, e.g., Jones, Amino Acid and Peptide Synthesis, Oxford University Press (1992); Jung, Combinatorial Peptide and Nonpeptide Libraries: A Handbook, John Wiley (1997); Bodanszky et al., Peptide Chemistry--A Practical Textbook, Springer Verlag (1993); Synthetic Peptides: A Users Guide, (Grant, ed., W. H. Freeman and Co., 1992); Evans et al., J. Med. Chem. 30:1229 (1987); Fauchere, J. Adv. Drug Res. 15:29 (1986); Veber and Freidinger, Trends Neurosci., 8:392-396 (1985); and references cited in each of the above, which are incorporated herein by reference. Such compounds are often developed with the aid of computerized molecular modeling. Peptide mimetics that are structurally similar to useful peptides of the present invention may be used to produce an equivalent effect and are therefore envisioned to be part of the present invention.

[0065] A "polypeptide mutant" or "mutein" refers to a polypeptide whose sequence contains an insertion, duplication, deletion, rearrangement or substitution of one or more amino acids compared to the amino acid sequence of a native or wild-type protein. A mutein may have one or more amino acid point substitutions, in which a single amino acid at a position has been changed to another amino acid, one or more insertions and/or deletions, in which one or more amino acids are inserted or deleted, respectively, in the sequence of the naturally-occurring protein, and/or truncations of the amino acid sequence at either or both the amino or carboxy termini. A mutein may have the same but preferably has a different biological activity compared to the naturally-occurring protein.

[0066] A mutein has at least 85% overall sequence homology to its wild-type counterpart. Even more preferred are muteins having at least 90% overall sequence homology to the wild-type protein.

[0067] In an even more preferred embodiment, a mutein exhibits at least 95% sequence identity, even more preferably 98%, even more preferably 99% and even more preferably 99.9% overall sequence identity.

[0068] Sequence homology may be measured by any common sequence analysis algorithm, such as Gap or Bestfit.

[0069] Amino acid substitutions can include those which: (1) reduce susceptibility to proteolysis, (2) reduce susceptibility to oxidation, (3) alter binding affinity for forming protein complexes, (4) alter binding affinity or enzymatic activity, and (5) confer or modify other physicochemical or functional properties of such analogs.

[0070] As used herein, the twenty conventional amino acids and their abbreviations follow conventional usage. See Immunology--A Synthesis (Golub and Green eds., Sinauer Associates, Sunderland, Mass., 2nd ed. 1991), which is incorporated herein by reference. Stereoisomers (e.g., D-amino acids) of the twenty conventional amino acids, unnatural amino acids such as α,α-disubstituted amino acids, N-alkyl amino acids, and other unconventional amino acids may also be suitable components for polypeptides of the present invention. Examples of unconventional amino acids include: 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline). In the polypeptide notation used herein, the left-hand end corresponds to the amino terminal end and the right-hand end corresponds to the carboxy-terminal end, in accordance with standard usage and convention.

[0071] A protein has "homology" or is "homologous" to a second protein if the nucleic acid sequence that encodes the protein has a similar sequence to the nucleic acid sequence that encodes the second protein. Alternatively, a protein has homology to a second protein if the two proteins have "similar" amino acid sequences. (Thus, the term "homologous proteins" is defined to mean that the two proteins have similar amino acid sequences.) As used herein, homology between two regions of amino acid sequence (especially with respect to predicted structural similarities) is interpreted as implying similarity in function.

[0072] When "homologous" is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. A "conservative amino acid substitution" is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a protein. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of homology may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. See, e.g., Pearson, 1994, Methods Mol. Biol. 24:307-31 and 25:365-89 (herein incorporated by reference).

[0073] The following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine (S), Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine (V), and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
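For illustration only, the six groups listed above can be encoded directly and used to test whether a given single-residue change is a conservative substitution. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical; the group membership is taken from the preceding paragraph.

```python
# The six conservative-substitution groups, in one-letter amino acid codes.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original, replacement):
    """True if the two different residues fall within the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # same group (5)
print(is_conservative("D", "K"))  # different groups
```

Residues that appear in none of the six groups (for example glycine, proline, cysteine and histidine) are never reported as conservative substitutions by this sketch.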

[0074] Sequence homology for polypeptides, which is also referred to as percent sequence identity, is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using a measure of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as "Gap" and "Bestfit" which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild-type protein and a mutein thereof. See, e.g., GCG Version 6.1.

[0075] A preferred algorithm when comparing a particular polypeptide sequence to a database containing a large number of sequences from different organisms is the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).

[0076] Preferred parameters for BLASTp are: Expectation value: 10 (default); Filter: seg (default); Cost to open a gap: 11 (default); Cost to extend a gap: 1 (default); Max. alignments: 100 (default); Word size: 11 (default); No. of descriptions: 100 (default); Penalty Matrix: BLOSUM62.

[0077] The length of polypeptide sequences compared for homology will generally be at least about 16 amino acid residues, usually at least about 20 residues, more usually at least about 24 residues, typically at least about 28 residues, and preferably more than about 35 residues. When searching a database containing sequences from a large number of different organisms, it is preferable to compare amino acid sequences. Database searching using amino acid sequences can be performed with algorithms other than blastp known in the art. For instance, polypeptide sequences can be compared using FASTA, a program in GCG Version 6.1. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990) (incorporated by reference herein). For example, percent sequence identity between amino acid sequences can be determined using FASTA with its default parameters (a word size of 2 and the PAM250 scoring matrix), as provided in GCG Version 6.1, herein incorporated by reference.

[0078] "Specific binding" refers to the ability of two molecules to bind to each other in preference to binding to other molecules in the environment. Typically, "specific binding" discriminates over adventitious binding in a reaction by at least two-fold, more typically by at least 10-fold, often at least 100-fold. Typically, the affinity or avidity of a specific binding reaction, as quantified by a dissociation constant, is about 10⁻⁷ M or stronger (e.g., about 10⁻⁸ M, 10⁻⁹ M or even stronger).

[0079] "Percent dry cell weight" refers to a production measurement of esters of fatty acids or fatty acids obtained as follows: a defined volume of culture is centrifuged to pellet the cells. Cells are washed then dewetted by at least one cycle of microcentrifugation and aspiration. Cell pellets are lyophilized overnight, and the tube containing the dry cell mass is weighed again such that the mass of the cell pellet can be calculated within ±0.1 mg. At the same time cells are processed for dry cell weight determination, a second sample of the culture in question is harvested, washed, and dewetted. The resulting cell pellet, corresponding to 1-3 mg of dry cell weight, is then extracted by vortexing in approximately 1 ml acetone plus butylated hydroxytoluene (BHT) as antioxidant and an internal standard, e.g., ethyl arachidate. Cell debris is then pelleted by centrifugation and the supernatant (extractant) is taken for analysis by GC. For accurate quantitation of the molecules, flame ionization detection (FID) is used rather than MS total ion count. The concentrations of the esters or fatty acids in the biological extracts are calculated using calibration relationships between GC-FID peak area and known concentrations of authentic standards. Knowing the volume of the extractant, the resulting concentrations of the products in the extractant, and the dry cell weight of the cell pellet extracted, the percentage of dry cell weight that comprised the esters or fatty acids can be determined.
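The final calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `percent_dry_cell_weight`, the choice of units (mg/ml, ml, mg) and the example values are hypothetical, and the product concentration is assumed to have already been obtained from the GC-FID calibration.

```python
def percent_dry_cell_weight(product_conc_mg_per_ml, extractant_volume_ml, dry_cell_weight_mg):
    """Percentage of the dry cell weight accounted for by the extracted product."""
    if dry_cell_weight_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    # Mass of product recovered in the extractant.
    product_mass_mg = product_conc_mg_per_ml * extractant_volume_ml
    return 100.0 * product_mass_mg / dry_cell_weight_mg


# Hypothetical values: 0.05 mg/ml of ester in 1 ml of extractant
# obtained from a pellet corresponding to 2 mg of dry cell weight.
print(percent_dry_cell_weight(0.05, 1.0, 2.0))
```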

[0080] The term "region" as used herein refers to a physically contiguous portion of the primary structure of a biomolecule. In the case of proteins, a region is defined by a contiguous portion of the amino acid sequence of that protein.

[0081] The term "domain" as used herein refers to a structure of a biomolecule that contributes to a known or suspected function of the biomolecule. Domains may be co-extensive with regions or portions thereof; domains may also include distinct, non-contiguous regions of a biomolecule. Examples of protein domains include, but are not limited to, an Ig domain, an extracellular domain, a transmembrane domain, and a cytoplasmic domain.

[0082] As used herein, the term "molecule" means any compound, including, but not limited to, a small molecule, peptide, protein, sugar, nucleotide, nucleic acid, lipid, etc., and such a compound can be natural or synthetic.

[0083] "Carbon-based Products of Interest" include alcohols such as ethanol, propanol, isopropanol, butanol, fatty alcohols, fatty acid esters, wax esters; hydrocarbons and alkanes such as propane, octane, diesel, Jet Propellant 8 (JP8); polymers such as terephthalate, 1,3-propanediol, 1,4-butanediol, polyols, Polyhydroxyalkanoates (PHA), poly-beta-hydroxybutyrate (PHB), acrylate, adipic acid, ε-caprolactone, isoprene, caprolactam, rubber; commodity chemicals such as lactate, Docosahexaenoic acid (DHA), 3-hydroxypropionate, γ-valerolactone, lysine, serine, aspartate, aspartic acid, sorbitol, ascorbate, ascorbic acid, isopentenol, lanosterol, omega-3 DHA, lycopene, itaconate, 1,3-butadiene, ethylene, propylene, succinate, citrate, citric acid, glutamate, malate, 3-hydroxypropionic acid (HPA), lactic acid, THF, gamma butyrolactone, pyrrolidones, hydroxybutyrate, glutamic acid, levulinic acid, acrylic acid, malonic acid; specialty chemicals such as carotenoids, isoprenoids, itaconic acid; pharmaceuticals and pharmaceutical intermediates such as 7-aminodeacetoxycephalosporanic acid (7-ADCA)/cephalosporin, erythromycin, polyketides, statins, paclitaxel, docetaxel, terpenes, peptides, steroids, omega fatty acids and other such suitable products of interest. Such products are useful in the context of biofuels, industrial and specialty chemicals, as intermediates used to make additional products, such as nutritional supplements, nutraceuticals, polymers, paraffin replacements, personal care products and pharmaceuticals.

[0084] Biofuel: A biofuel refers to any fuel that derives from a biological source. Biofuel can refer to one or more hydrocarbons, one or more alcohols, one or more fatty esters or a mixture thereof.

[0085] The term "hydrocarbon" generally refers to a chemical compound that consists of the elements carbon (C), hydrogen (H) and optionally oxygen (O). There are essentially three types of hydrocarbons: aromatic hydrocarbons, saturated hydrocarbons and unsaturated hydrocarbons such as alkenes, alkynes, and dienes. The term also encompasses fuels, biofuels, plastics, waxes, solvents and oils. A "fatty acid" is a carboxylic acid with a long unbranched aliphatic tail (chain), which is either saturated or unsaturated. Most naturally occurring fatty acids have a chain of four to 28 carbons.

[0086] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this present invention pertains. Exemplary methods and materials are described below, although methods and materials similar or equivalent to those described herein can also be used in the practice of the present invention and will be apparent to those of skill in the art. All publications and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. The materials, methods, and examples are illustrative only and not intended to be limiting.

[0087] Throughout this specification and claims, the word "comprise" or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.

Nucleic Acid Sequences

[0088] Esters are chemical compounds with the basic formula:

R-C(=O)-O-R'

where R and R' denote any alkyl or aryl group. In one embodiment, the invention provides one or more isolated or recombinant nucleic acids encoding one or more genes which, when recombinantly expressed in a photosynthetic microorganism, catalyze the synthesis of esters by the microorganism. The first gene is a thioesterase, which catalyzes the synthesis of fatty acids from an acyl-Acyl Carrier Protein ("acyl-ACP") molecule. The second gene is an acyl-CoA synthetase, which synthesizes fatty acyl-CoA from a fatty acid. The third gene is a wax synthase, which synthesizes esters from a fatty acyl-CoA molecule and an alcohol (e.g., methanol, ethanol, propanol, butanol, etc.). In certain related embodiments, additional genes expressing a recombinant resistance nodulation cell division type ("RND-type") transporter such as TolC/AcrAB are also recombinantly expressed to facilitate the transport of ethyl esters outside of the engineered photosynthetic cell and into the culture medium.
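
The three enzymatic steps described above can be summarized as a small data structure. The following Python sketch is illustrative only: the names `Step`, `PATHWAY` and `trace` are hypothetical and not part of the disclosure, and cofactors, by-products and the alcohol co-substrate of the wax synthase are deliberately left out.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str


# Steps as described in the paragraph above. Cofactors and by-products are
# omitted; the wax synthase additionally consumes an alcohol (e.g., ethanol).
PATHWAY = [
    Step("thioesterase", "acyl-ACP", "fatty acid"),
    Step("acyl-CoA synthetase", "fatty acid", "fatty acyl-CoA"),
    Step("wax synthase", "fatty acyl-CoA", "fatty acid ester"),
]


def trace(start: str) -> List[str]:
    """Follow the pathway from `start`, returning each compound in order."""
    compounds = [start]
    for step in PATHWAY:
        if step.substrate == compounds[-1]:
            compounds.append(step.product)
    return compounds


print(" -> ".join(trace("acyl-ACP")))
```

Starting the trace from a later intermediate, such as `trace("fatty acid")`, returns only the downstream part of the chain.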

[0089] Accordingly, the present invention provides isolated nucleic acid molecules for genes encoding thioesterase, acyl-CoA synthetase and wax synthase enzymes, and variants thereof. An exemplary full-length expression-optimized nucleic acid sequence for a gene encoding a thioesterase is presented as SEQ ID NO: 4. The corresponding amino acid sequence is presented as SEQ ID NO: 1. Additional genes encoding thioesterases are presented in Table 3A. An exemplary full-length expression-optimized nucleic acid sequence for a gene encoding an acyl-CoA synthetase is presented as SEQ ID NO: 5, and the corresponding amino acid sequence is presented as SEQ ID NO: 2. Additional genes encoding acyl-CoA synthetases are presented in Table 3B. An exemplary full-length expression-optimized nucleic acid sequence for a gene encoding a wax synthase is presented as SEQ ID NO: 6, and the corresponding amino acid sequence is presented as SEQ ID NO: 3. Additional genes encoding wax synthases are presented in Table 3C.

[0090] One skilled in the art will recognize that the redundancy of the genetic code will allow many other nucleic acid sequences to encode the identical enzymes. The sequences of the nucleic acids disclosed herein can be optimized as needed to yield the desired expression levels in a particular photosynthetic microorganism. Such a nucleic acid sequence can have 70%, 75%, 80%, 85%, 90%, 95%, 98%, 99%, 99.9% or even higher identity to the native gene sequence.
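
The redundancy of the genetic code can be illustrated with a short Python sketch. The two DNA sequences below are hypothetical and are not sequences of the invention; the codon table is only the part of the standard genetic code needed for this example.

```python
# Partial standard codon table: only the codons used in this example.
CODON_TABLE = {
    "ATG": "M",
    "CTG": "L", "TTA": "L",
    "AGC": "S", "TCT": "S",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate an in-frame DNA string codon by codon, stopping at a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


seq_a = "ATGCTGAGCAAATAA"  # hypothetical coding sequence
seq_b = "ATGTTATCTAAGTGA"  # same peptide, different synonymous codons

# Count positions at which the two nucleotide sequences agree.
identical = sum(a == b for a, b in zip(seq_a, seq_b))
print(translate(seq_a), translate(seq_b))
print(f"{100 * identical / len(seq_a):.0f}% nucleotide identity")
```

Both sequences translate to the same peptide even though their nucleotide identity is well below 100%, which is why a codon-optimized gene can differ substantially from the native gene sequence.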

[0091] In another embodiment, the nucleic acid molecule of the present invention encodes a polypeptide having the amino acid sequence of SEQ ID NO:1, 2, 3, 7, 8, or 9. Preferably, the nucleic acid molecule of the present invention encodes a polypeptide sequence of at least 50%, 60%, 70%, 80%, 85%, 90% or 95% identity to SEQ ID NO:1, 2, 3, 7, 8 or 9, and the identity can even more preferably be 96%, 97%, 98%, 99%, 99.9% or even higher.
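
As a rough illustration of how a percent-identity threshold might be checked, the Python sketch below counts matching columns in two sequences that have already been aligned. The column-counting convention, the function names and the example peptides are assumptions made for illustration; sequence alignment programs may compute identity differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap and never counts as a match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=95.0):
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


reference = "MKTAYIAKQRQISFVKSHFS"  # hypothetical 20-residue peptide
variant = "MKTAFIAKQRQISFVKSHFS"    # one substitution (Y to F at position 5)

print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, threshold=95.0))
```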

[0092] The present invention also provides nucleic acid molecules that hybridize under stringent conditions to the above-described nucleic acid molecules. As defined above, and as is well known in the art, stringent hybridizations are performed at about 25° C. below the thermal melting point (Tm) for the specific DNA hybrid under a particular set of conditions, where the Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. Stringent washing is performed at temperatures about 5° C. lower than the Tm for the specific DNA hybrid under a particular set of conditions.
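
The two temperature offsets stated above can be expressed as a short calculation. In this Python sketch the function name and the example Tm of 85° C. are hypothetical; the melting point itself must be determined separately for each DNA hybrid and set of conditions.

```python
def stringent_temperatures(tm_celsius):
    """Return (hybridization, wash) temperatures in degrees C for a given Tm.

    Uses the approximate offsets given in the text: hybridization at about
    25 degrees below Tm, washing at about 5 degrees below Tm.
    """
    hybridization = tm_celsius - 25.0
    wash = tm_celsius - 5.0
    return hybridization, wash


hyb, wash = stringent_temperatures(85.0)  # 85.0 is a hypothetical Tm
print(f"hybridize at about {hyb} C, wash at about {wash} C")
```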

[0093] Nucleic acid molecules comprising a fragment of any one of the above-described nucleic acid sequences are also provided. These fragments preferably contain at least 20 contiguous nucleotides. More preferably the fragments of the nucleic acid sequences contain at least 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or even more contiguous nucleotides.
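
For illustration, contiguous fragments of a given minimum length can be enumerated with a sliding window. The Python sketch below uses a hypothetical 25-nucleotide sequence and a hypothetical helper name; it is not a sequence or method of the invention.

```python
def contiguous_fragments(sequence, length=20):
    """Return every contiguous fragment of exactly `length` nucleotides."""
    if length < 1:
        raise ValueError("length must be positive")
    # A sequence shorter than `length` yields an empty list.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


seq = "ATGCTGAGCAAATTCGGTACCGATG"  # hypothetical 25-nt sequence
fragments = contiguous_fragments(seq, length=20)
print(len(fragments), fragments[0])
```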

[0094] The nucleic acid sequence fragments of the present invention display utility in a variety of systems and methods. For example, the fragments may be used as probes in various hybridization techniques. Depending on the method, the target nucleic acid sequences may be either DNA or RNA. The target nucleic acid sequences may be fractionated (e.g., by gel electrophoresis) prior to the hybridization, or the hybridization may be performed on samples in situ. One of skill in the art will appreciate that nucleic acid probes of known sequence find utility in determining chromosomal structure (e.g., by Southern blotting) and in measuring gene expression (e.g., by Northern blotting). In such experiments, the sequence fragments are preferably detectably labeled, so that their specific hybridization to target sequences can be detected and optionally quantified. One of skill in the art will appreciate that the nucleic acid fragments of the present invention may be used in a wide variety of blotting techniques not specifically described herein.

[0095] It should also be appreciated that the nucleic acid sequence fragments disclosed herein also find utility as probes when immobilized on microarrays. Methods for creating microarrays by deposition and fixation of nucleic acids onto support substrates are well known in the art and are reviewed in DNA Microarrays: A Practical Approach (Practical Approach Series), Schena (ed.), Oxford University Press (1999) (ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60 (1999); and Microarray Biochip: Tools and Technology, Schena (ed.), Eaton Publishing Company/BioTechniques Books Division (2000) (ISBN: 1881299376), the disclosures of which are incorporated herein by reference in their entireties. Analysis of, for example, gene expression using microarrays comprising nucleic acid sequence fragments, such as the nucleic acid sequence fragments disclosed herein, is a well-established utility for sequence fragments in the field of cell and molecular biology. Other uses for sequence fragments immobilized on microarrays are described in Gerhold et al., Trends Biochem. Sci. 24:168-173 (1999) and Zweiger, Trends Biotechnol. 17:429-436 (1999), as well as in the references cited above, the disclosure of each of which is incorporated herein by reference in its entirety.

[0096] As is well known in the art, enzyme activities can be measured in various ways. For example, the pyrophosphorolysis of OMP may be followed spectroscopically (Grubmeyer et al., (1993) J. Biol. Chem. 268:20299-20304). Alternatively, the activity of the enzyme can be followed using chromatographic techniques, such as by high performance liquid chromatography (Chung and Sloan, (1986) J. Chromatogr. 371:71-81). As another alternative, the activity can be indirectly measured by determining the levels of product made from the enzyme activity. These levels can be measured with techniques including aqueous chloroform/methanol extraction as known and described in the art (cf. M. Kates (1986) Techniques of Lipidology; Isolation, analysis and identification of Lipids. Elsevier Science Publishers, New York (ISBN: 0444807322)). More modern techniques include using gas chromatography linked to mass spectrometry (Niessen, W. M. A. (2001). Current practice of gas chromatography-mass spectrometry. New York, N.Y.: Marcel Dekker. (ISBN: 0824704738)). Additional modern techniques for identification of recombinant protein activity and products, including liquid chromatography-mass spectrometry (LCMS), high performance liquid chromatography (HPLC), capillary electrophoresis, Matrix-Assisted Laser Desorption Ionization time of flight-mass spectrometry (MALDI-TOF MS), nuclear magnetic resonance (NMR), near-infrared (NIR) spectroscopy, viscometry (Knothe, G (1997) Am. Chem. Soc. Symp. Series, 666: 172-208), titration for determining free fatty acids (Komers (1997) Fett/Lipid, 99(2): 52-54), enzymatic methods (Bailer (1991) Fresenius J. Anal. Chem. 340(3): 186), physical property-based methods, wet chemical methods, etc., can be used to analyze the levels and the identity of the product produced by the organisms of the present invention. Other methods and techniques may also be suitable for the measurement of enzyme activity, as would be known by one of skill in the art.

Vectors

[0097] Also provided are vectors, including expression vectors, which comprise the above nucleic acid molecules of the present invention, as described further herein. In a first embodiment, the vectors include the isolated nucleic acid molecules described above. In an alternative embodiment, the vectors of the present invention include the above-described nucleic acid molecules operably linked to one or more expression control sequences. The vectors of the instant invention may thus be used to express a thioesterase, an acyl-CoA synthetase, and/or a wax synthase, contributing to the synthesis of esters by the cell.

[0098] In a related embodiment, vectors may include nucleic acid molecules encoding an RND-type transporter such as TolC/AcrAB to facilitate the extracellular transport of esters. Exemplary vectors of the invention include any of the vectors expressing a thioesterase, an acyl-CoA synthetase, a wax synthase, and/or a TolC/AcrAB transporter disclosed herein, e.g., pJB532, pJB634, pJB578 and pJB1074. The invention also provides other vectors such as pJB161 which are capable of receiving nucleic acid sequences of the invention. Vectors such as pJB161 comprise sequences which are homologous with sequences that are present in plasmids which are endogenous to certain photosynthetic microorganisms (e.g., plasmids pAQ7 or pAQ1 of certain Synechococcus species). Recombination between pJB161 and the endogenous plasmids in vivo yields engineered microbes expressing the genes of interest from their endogenous plasmids. Alternatively, vectors can be engineered to recombine with the host cell chromosome, or the vector can be engineered to replicate and express genes of interest independent of the host cell chromosome or any of the host cell's endogenous plasmids.

[0099] Vectors useful for expression of nucleic acids in prokaryotes are well known in the art.

Isolated Polypeptides

[0100] According to another aspect of the present invention, isolated polypeptides (including muteins, allelic variants, fragments, derivatives, and analogs) encoded by the nucleic acid molecules of the present invention are provided. In one embodiment, the isolated polypeptide comprises the polypeptide sequence corresponding to SEQ ID NO:1, 2, 3, 7, 8, or 9. In an alternative embodiment of the present invention, the isolated polypeptide comprises a polypeptide sequence at least 85% identical to SEQ ID NO:1, 2, 3, 7, 8, or 9. Preferably the isolated polypeptide of the present invention has at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98%, 98.1%, 98.2%, 98.3%, 98.4%, 98.5%, 98.6%, 98.7%, 98.8%, 98.9%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% or even higher identity to SEQ ID NO:1, 2, 3, 7, 8 or 9.

[0101] According to other embodiments of the present invention, isolated polypeptides comprising a fragment of the above-described polypeptide sequences are provided. These fragments preferably include at least 20 contiguous amino acids, more preferably at least 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 or even more contiguous amino acids.

[0102] The polypeptides of the present invention also include fusions between the above-described polypeptide sequences and heterologous polypeptides. The heterologous sequences can, for example, include sequences designed to facilitate purification, e.g. histidine tags, and/or visualization of recombinantly-expressed proteins. Other non-limiting examples of protein fusions include those that permit display of the encoded protein on the surface of a phage or a cell, fusions to intrinsically fluorescent proteins, such as green fluorescent protein (GFP), and fusions to the IgG Fc region.

Host Cell Transformants

[0103] In another aspect of the present invention, host cells transformed with the nucleic acid molecules or vectors of the present invention, and descendants thereof, are provided. In some embodiments of the present invention, these cells carry the nucleic acid sequences of the present invention on vectors, which may but need not be freely replicating vectors. In other embodiments of the present invention, the nucleic acids have been integrated into the genome of the host cells and/or into an endogenous plasmid of the host cells.

[0104] In a preferred embodiment, the host cell comprises one or more recombinant thioesterase-, acyl-CoA synthetase-, wax synthase-, or TolC/AcrAB-encoding nucleic acids which express thioesterase, acyl-CoA synthetase, wax synthase or TolC/AcrAB, respectively, in the host cell.

[0105] In an alternative embodiment, the host cells of the present invention can be mutated by recombination with a disruption, deletion or mutation of the isolated nucleic acid of the present invention so that the activity of a native thioesterase, acyl-CoA synthetase, wax synthase, and/or TolC/AcrAB protein in the host cell is reduced or eliminated compared to a host cell lacking the mutation.

Selected or Engineered Microorganisms for the Production of Fatty Acids, Esters, and Other Carbon-Based Products of Interest

[0106] Microorganism: Includes prokaryotic and eukaryotic microbial species from the Domains Archaea, Bacteria and Eucarya, the latter including yeast and filamentous fungi, protozoa, algae, or higher Protista. The terms "microbial cells" and "microbes" are used interchangeably with the term microorganism.

[0107] A variety of host organisms can be transformed to produce a product of interest. Photoautotrophic organisms include eukaryotic plants and algae, as well as prokaryotic cyanobacteria, green-sulfur bacteria, green non-sulfur bacteria, purple sulfur bacteria, and purple non-sulfur bacteria.

[0108] Extremophiles are also contemplated as suitable organisms. Such organisms withstand various environmental parameters such as temperature, radiation, pressure, gravity, vacuum, desiccation, salinity, pH, oxygen tension, and chemicals. They include hyperthermophiles, which grow at or above 80° C., such as Pyrolobus fumarii; thermophiles, which grow between 60-80° C., such as Synechococcus lividis; mesophiles, which grow between 15-60° C.; and psychrophiles, which grow at or below 15° C., such as Psychrobacter and some insects. Radiation-tolerant organisms include Deinococcus radiodurans. Pressure-tolerant organisms include piezophiles, which tolerate pressure of 130 MPa. Weight-tolerant organisms include barophiles. Hypergravity (e.g., >1 g) and hypogravity (e.g., <1 g) tolerant organisms are also contemplated. Vacuum-tolerant organisms include tardigrades, insects, microbes and seeds. Desiccation-tolerant and anhydrobiotic organisms include xerophiles such as Artemia salina, as well as nematodes, microbes, fungi and lichens. Salt-tolerant organisms include halophiles (e.g., 2-5 M NaCl) such as Halobacteriaceae and Dunaliella salina. pH-tolerant organisms include alkaliphiles such as Natronobacterium, Bacillus firmus OF4 and Spirulina spp. (e.g., pH>9) and acidophiles such as Cyanidium caldarium and Ferroplasma sp. (e.g., low pH). Anaerobes, which cannot tolerate O2, such as Methanococcus jannaschii; microaerophiles, which tolerate some O2, such as Clostridium; and aerobes, which require O2, are also contemplated. Gas-tolerant organisms, which tolerate pure CO2, include Cyanidium caldarium, and metal-tolerant organisms include metallotolerants such as Ferroplasma acidarmanus (e.g., Cu, As, Cd, Zn) and Ralstonia sp. CH34 (e.g., Zn, Co, Cd, Hg, Pb). See Gross, Michael. Life on the Edge: Amazing Creatures Thriving in Extreme Environments. New York: Plenum (1998) and Seckbach, J. "Search for Life in the Universe with Terrestrial Microbes Which Thrive Under Extreme Conditions." In Cristiano Batalli Cosmovici, Stuart Bowyer, and Dan Wertheimer, eds., Astronomical and Biochemical Origins and the Search for Life in the Universe, p. 511. Milan: Editrice Compositori (1997).

[0109] Plants include but are not limited to the following genera: Arabidopsis, Beta, Glycine, Jatropha, Miscanthus, Panicum, Phalaris, Populus, Saccharum, Salix, Simmondsia and Zea.

[0110] Algae and cyanobacteria include but are not limited to the following genera: Acanthoceras, Acanthococcus, Acaryochloris, Achnanthes, Achnanthidium, Actinastrum, Actinochloris, Actinocyclus, Actinotaenium, Amphichrysis, Amphidinium, Amphikrikos, Amphipleura, Amphiprora, Amphithrix, Amphora, Anabaena, Anabaenopsis, Aneumastus, Ankistrodesmus, Ankyra, Anomoeoneis, Apatococcus, Aphanizomenon, Aphanocapsa, Aphanochaete, Aphanothece, Apiocystis, Apistonema, Arthrodesmus, Arthrospira, Ascochloris, Asterionella, Asterococcus, Audouinella, Aulacoseira, Bacillaria, Balbiania, Bambusina, Bangia, Basichlamys, Batrachospermum, Binuclearia, Bitrichia, Blidingia, Botrydiopsis, Botrydium, Botryococcus, Botryosphaerella, Brachiomonas, Brachysira, Brachytrichia, Brebissonia, Bulbochaete, Bumilleria, Bumilleriopsis, Caloneis, Calothrix, Campylodiscus, Capsosiphon, Carteria, Catena, Cavinula, Centritractus, Centronella, Ceratium, Chaetoceros, Chaetochloris, Chaetomorpha, Chaetonella, Chaetonema, Chaetopeltis, Chaetophora, Chaetosphaeridium, Chamaesiphon, Chara, Characiochloris, Characiopsis, Characium, Charales, Chilomonas, Chlainomonas, Chlamydoblepharis, Chlamydocapsa, Chlamydomonas, Chlamydomonopsis, Chlamydomyxa, Chlamydonephris, Chlorangiella, Chlorangiopsis, Chlorella, Chlorobotrys, Chlorobrachis, Chlorochytrium, Chlorococcum, Chlorogloea, Chlorogloeopsis, Chlorogonium, Chlorolobion, Chloromonas, Chlorophysema, Chlorophyta, Chlorosaccus, Chlorosarcina, Choricystis, Chromophyton, Chromulina, Chroococcidiopsis, Chroococcus, Chroodactylon, Chroomonas, Chroothece, Chrysamoeba, Chrysapsis, Chrysidiastrum, Chrysocapsa, Chrysocapsella, Chrysochaete, Chrysochromulina, Chrysococcus, Chrysocrinus, Chrysolepidomonas, Chrysolykos, Chrysonebula, Chrysophyta, Chrysopyxis, Chrysosaccus, Chrysosphaerella, Chrysostephanosphaera, Cladophora, Clastidium, Closteriopsis, Closterium, Coccomyxa, Cocconeis, Coelastrella, Coelastrum, Coelosphaerium, Coenochloris, Coenococcus, Coenocystis, Colacium, Coleochaete, Collodictyon, Compsogonopsis, Compsopogon, Conjugatophyta, Conochaete, Coronastrum, Cosmarium, Cosmioneis, Cosmocladium, Crateriportula, Craticula, Crinalium, Crucigenia, Crucigeniella, Cryptoaulax, Cryptomonas, Cryptophyta, Ctenophora, Cyanodictyon, Cyanonephron, Cyanophora, Cyanophyta, Cyanothece, Cyanothomonas, Cyclonexis, Cyclostephanos, Cyclotella, Cylindrocapsa, Cylindrocystis, Cylindrospermum, Cylindrotheca, Cymatopleura, Cymbella, Cymbellonitzschia, Cystodinium, Dactylococcopsis, Debarya, Denticula, Dermatochrysis, Dermocarpa, Dermocarpella, Desmatractum, Desmidium, Desmococcus, Desmonema, Desmosiphon, Diacanthos, Diacronema, Diadesmis, Diatoma, Diatomella, Dicellula, Dichothrix, Dichotomococcus, Dicranochaete, Dictyochloris, Dictyococcus, Dictyosphaerium, Didymocystis, Didymogenes, Didymosphenia, Dilabifilum, Dimorphococcus, Dinobryon, Dinococcus, Diplochloris, Diploneis, Diplostauron, Distrionella, Docidium, Draparnaldia, Dunaliella, Dysmorphococcus, Ecballocystis, Elakatothrix, Ellerbeckia, Encyonema, Enteromorpha, Entocladia, Entomoneis, Entophysalis, Epichrysis, Epipyxis, Epithemia, Eremosphaera, Euastropsis, Euastrum, Eucapsis, Eucocconeis, Eudorina, Euglena, Euglenophyta, Eunotia, Eustigmatophyta, Eutreptia, Fallacia, Fischerella, Fragilaria, Fragilariforma, Franceia, Frustulia, Furcilla, Geminella, Genicularia, Glaucocystis, Glaucophyta, Glenodiniopsis, Glenodinium, Gloeocapsa, Gloeochaete, Gloeochrysis, Gloeococcus, Gloeocystis, Gloeodendron, Gloeomonas, Gloeoplax, Gloeothece, Gloeotila, Gloeotrichia, Gloiodictyon, Golenkinia, Golenkiniopsis, Gomontia, Gomphocymbella, Gomphonema, Gomphosphaeria, Gonatozygon, Gongrosia, Gongrosira, Goniochloris, Gonium, Gonyostomum, Granulochloris, Granulocystopsis, Groenbladia, Gymnodinium, Gymnozyga, Gyrosigma, Haematococcus, Hafniomonas, Hallassia, Hammatoidea, Hannaea, Hantzschia, Hapalosiphon, Haplotaenium, Haptophyta, Haslea, Hemidinium, Hemitoma, Heribaudiella, Heteromastix, Heterothrix, Hibberdia, Hildenbrandia, Hillea, Holopedium, Homoeothrix, Hormanthonema, Hormotila, Hyalobrachion, Hyalocardium, Hyalodiscus, Hyalogonium, Hyalotheca, Hydrianum, Hydrococcus, Hydrocoleum, Hydrocoryne, Hydrodictyon, Hydrosera, Hydrurus, Hyella, Hymenomonas, Isthmochloron, Johannesbaptistia, Juranyiella, Karayevia, Kathablepharis, Katodinium, Kephyrion, Keratococcus, Kirchneriella, Klebsormidium, Kolbesia, Koliella, Komarekia, Korshikoviella, Kraskella, Lagerheimia, Lagynion, Lamprothamnium, Lemanea, Lepocinclis, Leptosira, Lobococcus, Lobocystis, Lobomonas, Luticola, Lyngbya, Malleochloris, Mallomonas, Mantoniella, Marssoniella, Martyana, Mastigocoleus, Mastogloia, Melosira, Merismopedia, Mesostigma, Mesotaenium, Micractinium, Micrasterias, Microchaete, Microcoleus, Microcystis, Microglena, Micromonas, Microspora, Microthamnion, Mischococcus, Monochrysis, Monodus, Monomastix, Monoraphidium, Monostroma, Mougeotia, Mougeotiopsis, Myochloris, Myrmecia, Myxosarcina, Naegeliella, Nannochloris, Nautococcus, Navicula, Neglectella, Neidium, Nephrochlamys, Nephrocytium, Nephrodiella, Nephroselmis, Netrium, Nitella, Nitellopsis, Nitzschia, Nodularia, Nostoc, Ochromonas, Oedogonium, Oligochaetophora, Onychonema, Oocardium, Oocystis, Opephora, Ophiocytium, Orthoseira, Oscillatoria, Oxyneis, Pachycladella, Palmella, Palmodictyon, Pandorina, Pannus, Paralia, Pascherina, Paulschulzia, Pediastrum, Pedinella, Pedinomonas, Pedinopera, Pelagodictyon, Penium, Peranema, Peridiniopsis, Peridinium, Peronia, Petroneis, Phacotus, Phacus, Phaeaster, Phaeodermatium, Phaeophyta, Phaeosphaera, Phaeothamnion, Phormidium, Phycopeltis, Phyllariochloris, Phyllocardium, Phyllomitas, Pinnularia, Pitophora, Placoneis, Planctonema, Planktosphaeria, Planothidium, Plectonema, Pleodorina, Pleurastrum, Pleurocapsa, Pleurocladia, Pleurodiscus, Pleurosigma, Pleurosira, Pleurotaenium, Pocillomonas, Podohedra, Polyblepharides, Polychaetophora, Polyedriella, Polyedriopsis, Polygoniochloris, Polyepidomonas, Polytaenia, Polytoma, Polytomella, Porphyridium, Posteriochromonas, Prasinochloris, Prasinocladus, Prasinophyta, Prasiola, Prochlorophyta, Prochlorothrix, Protoderma, Protosiphon, Provasoliella, Prymnesium, Psammodictyon, Psammothidium, Pseudanabaena, Pseudendoclonium, Pseudocarteria, Pseudochaete, Pseudocharacium, Pseudococcomyxa, Pseudodictyosphaerium, Pseudokephyrion, Pseudoncobyrsa, Pseudoquadrigula, Pseudosphaerocystis, Pseudostaurastrum, Pseudostaurosira, Pseudotetrastrum, Pteromonas, Punctastruata, Pyramichlamys, Pyramimonas, Pyrrophyta, Quadrichloris, Quadricoccus, Quadrigula, Radiococcus, Radiofilum, Raphidiopsis, Raphidocelis, Raphidonema, Raphidophyta, Reimeria, Rhabdoderma, Rhabdomonas, Rhizoclonium, Rhodomonas, Rhodophyta, Rhoicosphenia, Rhopalodia, Rivularia, Rosenvingiella, Rossithidium, Roya, Scenedesmus, Scherffelia, Schizochlamydella, Schizochlamys, Schizomeris, Schizothrix, Schroederia, Scolioneis, Scotiella, Scotiellopsis, Scourfieldia, Scytonema, Selenastrum, Selenochloris, Sellaphora, Semiorbis, Siderocelis, Siderocystopsis, Simonsenia, Siphononema, Sirocladium, Sirogonium, Skeletonema, Sorastrum, Spermatozopsis, Sphaerellocystis, Sphaerellopsis, Sphaerodinium, Sphaeroplea, Sphaerozosma, Spiniferomonas, Spirogyra, Spirotaenia, Spirulina, Spondylomorum, Spondylosium, Sporotetras, Spumella, Staurastrum, Staurodesmus, Stauroneis, Staurosira, Staurosirella, Stenopterobia, Stephanocostis, Stephanodiscus, Stephanoporos, Stephanosphaera, Stichococcus, Stichogloea, Stigeoclonium, Stigonema, Stipitococcus, Stokesiella, Strombomonas, Stylochrysalis, Stylodinium, Stylopyxis, Stylosphaeridium, Surirella, Sykidion, Symploca, Synechococcus, Synechocystis, Synedra, Synochromonas, Synura, Tabellaria, Tabularia, Teilingia, Temnogametum, Tetmemorus, Tetrachlorella, Tetracyclus, Tetradesmus, Tetraedriella, Tetraedron, Tetraselmis, Tetraspora, Tetrastrum, Thalassiosira, Thamniochaete, Thorakochloris, Thorea, Tolypella, Tolypothrix, Trachelomonas, Trachydiscus, Trebouxia, Trentepohlia, Treubaria, Tribonema, Trichodesmium, Trichodiscus, Trochiscia, Tryblionella, Ulothrix, Uroglena, Uronema, Urosolenia, Urospora, Uva, Vacuolaria, Vaucheria, Volvox, Volvulina, Westella, Woloszynskia, Xanthidium, Xanthophyta, Xenococcus, Zygnema, Zygnemopsis, and Zygonium.

[0111] Additional cyanobacteria include members of the genera Chamaesiphon, Chroococcus, Cyanobacterium, Cyanobium, Cyanothece, Dactylococcopsis, Gloeobacter, Gloeocapsa, Gloeothece, Microcystis, Prochlorococcus, Prochloron, Synechococcus, Synechocystis, Cyanocystis, Dermocarpella, Stanieria, Xenococcus, Chroococcidiopsis, Myxosarcina, Arthrospira, Borzia, Crinalium, Geitlerinema, Leptolyngbya, Limnothrix, Lyngbya, Microcoleus, Oscillatoria, Planktothrix, Prochlorothrix, Pseudanabaena, Spirulina, Starria, Symploca, Trichodesmium, Tychonema, Anabaena, Anabaenopsis, Aphanizomenon, Cyanospira, Cylindrospermopsis, Cylindrospermum, Nodularia, Nostoc, Scytonema, Calothrix, Rivularia, Tolypothrix, Chlorogloeopsis, Fischerella, Geitleria, Iyengariella, Nostochopsis, Stigonema and Thermosynechococcus.

[0112] Green non-sulfur bacteria include but are not limited to the following genera: Chloroflexus, Chloronema, Oscillochloris, Heliothrix, Herpetosiphon, Roseiflexus, and Thermomicrobium.

[0113] Green sulfur bacteria include but are not limited to the following genera:

[0114] Chlorobium, Clathrochloris, and Prosthecochloris.

[0115] Purple sulfur bacteria include but are not limited to the following genera: Allochromatium, Chromatium, Halochromatium, Isochromatium, Marichromatium, Rhodovulum, Thermochromatium, Thiocapsa, Thiorhodococcus, and Thiocystis.

[0116] Purple non-sulfur bacteria include but are not limited to the following genera: Phaeospirillum, Rhodobaca, Rhodobacter, Rhodomicrobium, Rhodopila, Rhodopseudomonas, Rhodothalassium, Rhodospirillum, Rhodovibrio, and Roseospira.

[0117] Aerobic chemolithotrophic bacteria include but are not limited to nitrifying bacteria such as Nitrobacteraceae sp., Nitrobacter sp., Nitrospina sp., Nitrococcus sp., Nitrospira sp., Nitrosomonas sp., Nitrosococcus sp., Nitrosospira sp., Nitrosolobus sp., Nitrosovibrio sp.; colorless sulfur bacteria such as Thiovulum sp., Thiobacillus sp., Thiomicrospira sp., Thiosphaera sp., Thermothrix sp.; obligately chemolithotrophic hydrogen bacteria such as Hydrogenobacter sp.; iron and manganese-oxidizing and/or depositing bacteria such as Siderococcus sp.; and magnetotactic bacteria such as Aquaspirillum sp.

[0118] Archaeobacteria include but are not limited to methanogenic archaeobacteria such as Methanobacterium sp., Methanobrevibacter sp., Methanothermus sp., Methanococcus sp., Methanomicrobium sp., Methanospirillum sp., Methanogenium sp., Methanosarcina sp., Methanolobus sp., Methanothrix sp., Methanococcoides sp., Methanoplanus sp.; extremely thermophilic S-Metabolizers such as Thermoproteus sp., Pyrodictium sp., Sulfolobus sp., Acidianus sp.; and other microorganisms such as Bacillus subtilis, Saccharomyces cerevisiae, Streptomyces sp., Ralstonia sp., Rhodococcus sp., Corynebacteria sp., Brevibacteria sp., Mycobacteria sp., and oleaginous yeast.

[0119] Preferred organisms for the manufacture of esters according to the methods disclosed herein include: Arabidopsis thaliana, Panicum virgatum, Miscanthus giganteus, and Zea mays (plants); Botryococcus braunii, Chlamydomonas reinhardtii and Dunaliella salina (algae); Synechococcus sp. PCC 7002, Synechococcus sp. PCC 7942, Synechocystis sp. PCC 6803, Thermosynechococcus elongatus BP-1 (cyanobacteria); Chlorobium tepidum (green sulfur bacteria); Chloroflexus aurantiacus (green non-sulfur bacteria); Chromatium tepidum and Chromatium vinosum (purple sulfur bacteria); Rhodospirillum rubrum, Rhodobacter capsulatus, and Rhodopseudomonas palustris (purple non-sulfur bacteria).

[0120] Yet other suitable organisms include synthetic cells or cells produced by synthetic genomes as described in Venter et al. US Pat. Pub. No. 2007/0264688, and cell-like systems or synthetic cells as described in Glass et al. US Pat. Pub. No. 2007/0269862.

[0121] Still other suitable organisms include microorganisms that can be engineered to fix carbon dioxide, such as the bacteria Escherichia coli, Acetobacter aceti, Bacillus subtilis, Clostridium ljungdahlii, Clostridium thermocellum, Pseudomonas fluorescens and Zymomonas mobilis, and yeast and fungi such as Penicillium chrysogenum, Pichia pastoris, Saccharomyces cerevisiae and Schizosaccharomyces pombe.

[0122] The capability to use carbon dioxide as the sole source of cell carbon (autotrophy) is found in almost all major groups of prokaryotes. The CO2 fixation pathways differ between groups, and there is no clear distribution pattern of the four presently-known autotrophic pathways. See, e.g., Fuchs, G. 1989. Alternative pathways of autotrophic CO2 fixation, p. 365-382, in H. G. Schlegel, and B. Bowien (ed.), Autotrophic bacteria. Springer-Verlag, Berlin, Germany. The reductive pentose phosphate cycle (Calvin-Bassham-Benson cycle) represents the CO2 fixation pathway in almost all aerobic autotrophic bacteria, for example, the cyanobacteria.

[0123] For producing esters via the recombinant expression of thioesterase, acyl-CoA synthetase and/or wax synthase enzymes, an engineered cyanobacterium, e.g., a Synechococcus or Thermosynechococcus species, is especially preferred. Other preferred organisms include Synechocystis, Klebsiella oxytoca, Escherichia coli or Saccharomyces cerevisiae. Other prokaryotic, archaeal and eukaryotic host cells are also encompassed within the scope of the present invention. Engineered ester-producing organisms expressing thioesterase, acyl-CoA synthetase and/or wax synthase enzymes can be further engineered to express recombinant TolC/AcrAB to enhance the extracellular transport of esters.

Carbon-Based Products of Interest: Esters

[0124] In various embodiments of the invention, desired esters or a mixture thereof can be produced. For example, by including a particular alcohol or mixture of alcohols in the culture media, methyl esters, ethyl esters, propyl esters, butyl esters, and esters of higher chain length alcohols (or mixtures thereof, depending on the substrate alcohols available to the photosynthetic microbe) can be synthesized. The carbon chain lengths of the esters can vary from C10 to C20. For example, using ethanol as a substrate, diverse esters including, e.g., ethyl myristate, ethyl palmitate, ethyl oleate, and/or ethyl stearate and/or mixtures thereof can be produced by a single engineered photosynthetic microorganism of the invention. Accordingly, the invention provides methods and compositions for the production of various chain lengths of esters, each of which is suitable for use as a fuel or any other chemical use.
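
The relationship between the alcohol supplied in the medium and the esters obtained can be sketched as a simple lookup. In the Python sketch below, the dictionaries and the function `ester_names` are hypothetical illustrations; only the alcohols and the four ethyl esters named above are taken from the text, and the chain lengths are standard values for the corresponding fatty acids.

```python
# Alcohol substrate -> alkyl prefix of the resulting ester.
ALKYL = {
    "methanol": "methyl",
    "ethanol": "ethyl",
    "propanol": "propyl",
    "butanol": "butyl",
}

# Fatty acid -> (ester suffix, number of carbons in the acyl chain).
ACYL = {
    "myristic acid": ("myristate", 14),
    "palmitic acid": ("palmitate", 16),
    "oleic acid": ("oleate", 18),
    "stearic acid": ("stearate", 18),
}


def ester_names(alcohols, fatty_acids):
    """Name every ester formed from each alcohol/fatty acid combination."""
    return [f"{ALKYL[a]} {ACYL[f][0]}" for a in alcohols for f in fatty_acids]


print(ester_names(["ethanol"], list(ACYL)))
```

Supplying more than one alcohol, e.g. `ester_names(["methanol", "ethanol"], list(ACYL))`, lists the corresponding mixture of methyl and ethyl esters.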

[0125] In preferred aspects, the methods provide for culturing host cells for direct product secretion, allowing easy recovery without the need to extract biomass. These carbon-based products of interest are secreted directly into the medium. Since the invention enables production of various defined chain lengths of hydrocarbons and alcohols, the secreted products are easily recovered or separated. The products of the invention, therefore, can be used directly or used with minimal processing.

Media and Culture Conditions

[0126] One skilled in the art will recognize that a variety of media and culture conditions can be used in conjunction with the methods and engineered cyanobacteria disclosed herein for the bioproduction of fatty acid esters (see, e.g., Rogers and Gallon, Biochemistry of the Algae and Cyanobacteria, Clarendon Press Oxford (1988); Burlew, Algal Culture: From Laboratory to Pilot Plant, Carnegie Institution of Washington Publication 600 Washington, D.C., (1961); and Round, F. E. The Biology of the Algae. St Martin's Press, New York, 1965; Golden S S et al. (1987) Methods Enzymol 153:215-231; Golden and Sherman, J. Bacteriology 158:36 (1984), each of which is incorporated herein by reference). Exemplary culture conditions and media are also described in, e.g., WO/2010/068288, filed May 21, 2009, published Jun. 17, 2010, and incorporated by reference herein. Typical culture conditions for the methods of the present invention include the use of JB 2.1 culture media or A+ media. A recipe for one liter of JB 2.1 appears in Table A, below.

TABLE-US-00001 TABLE A JB 2.1 media (1 L)
Chemical                             mg/L added   FW        Molarity   Units   Source
NaCl                                 18000        58.44     308        mM      Fisher
KCl                                  600          74.55     8.05       mM      Fisher
NaNO.sub.3                           4000         84.99     47.06      mM      Sigma Aldrich
MgSO.sub.4--7H.sub.2O                5000         246.47    20.29      mM      Sigma Aldrich
KH.sub.2PO.sub.4                     200          136.09    1.47       mM      Fisher
CaCl.sub.2                           266          110.99    2.40       mM      Sigma
NaEDTA.sub.tetra                     30           372.24    80.59      .mu.M   Fisher
Ferric Citrate                       14.1         244.95    57.48      .mu.M   Acros Organics
Tris                                 1000         121.14    8.25       mM      Fisher
Vitamin B.sub.12 (Cyanocobalamin)    0.004        1355.37   2.95E-03   .mu.M   Sigma Aldrich
H.sub.3BO.sub.3                      34           61.83     554        .mu.M   Acros Organics
MnCl.sub.2--4H.sub.2O                4.3          197.91    21.83      .mu.M   Sigma
ZnCl                                 0.32         136.28    2.31       .mu.M   Sigma
MoO.sub.3                            0.030        143.94    0.21       .mu.M   Sigma Aldrich
CuSO.sub.4--5H.sub.2O                0.0030       249.69    0.012      .mu.M   Sigma Aldrich
CoCl.sub.2--6H.sub.2O                0.012        237.93    0.051      .mu.M   Sigma
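The molarity column of Table A follows from dividing the mass added per liter by the formula weight (mg/L divided by g/mol gives mmol/L). The following Python sketch illustrates that arithmetic for a few rows transcribed from Table A; the function and variable names are illustrative assumptions and are not part of the disclosed recipe.

```python
def molarity_mM(mg_per_l: float, formula_weight: float) -> float:
    """Return the millimolar concentration: (mg/L) / (g/mol) = mmol/L."""
    return mg_per_l / formula_weight


# A few rows transcribed from Table A: name -> (mg/L added, formula weight).
TABLE_A_SAMPLE = {
    "NaCl": (18000, 58.44),
    "KCl": (600, 74.55),
    "NaNO3": (4000, 84.99),
    "Tris": (1000, 121.14),
}

for name, (mg_per_l, fw) in TABLE_A_SAMPLE.items():
    print(f"{name}: {molarity_mM(mg_per_l, fw):.2f} mM")
```

The same function can be applied to the micromolar rows by multiplying the result by 1000.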

[0127] As described in more detail in the Examples, below, in certain embodiments one or more alcohols (e.g., methanol, ethanol, propanol, butanol, etc.) may be added during culturing to produce the desired fatty acid ester(s) of interest (e.g., a fatty acid methyl ester, a fatty acid ethyl ester, etc., and mixtures thereof). For organisms that require or metabolize most efficiently in the presence of light and carbon dioxide, either carbon dioxide or bicarbonate can be used during culturing.

Fuel Compositions

[0128] In various embodiments, compositions produced by the methods of the invention are used as fuels. Such fuels comply with ASTM standards, for instance, standard specifications for diesel fuel oils D 975-09b, and Jet A, Jet A-1 and Jet B as specified in ASTM Specification D 1655-68. Fuel compositions may require blending of several products to produce a uniform product. The blending process is relatively straightforward, but the determination of the amount of each component to include in a blend is much more difficult. Fuel compositions may, therefore, include aromatic and/or branched hydrocarbons, for instance, 75% saturated and 25% aromatic, wherein some of the saturated hydrocarbons are branched and some are cyclic. Preferably, the methods of the invention produce an array of hydrocarbons, such as C.sub.13-C.sub.17 or C.sub.10-C.sub.15 to alter cloud point. Furthermore, the compositions may comprise fuel additives, which are used to enhance the performance of a fuel or engine. For example, fuel additives can be used to alter the freezing/gelling point, cloud point, lubricity, viscosity, oxidative stability, ignition quality, octane level, and flash point. Fuel compositions may also comprise, among others, antioxidants, static dissipaters, corrosion inhibitors, icing inhibitors, biocides, metal deactivators and thermal stability improvers.

[0129] In addition to many environmental advantages of the invention such as CO.sub.2 conversion and renewable source, other advantages of the fuel compositions disclosed herein include low sulfur content, low emissions, being free or substantially free of alcohol and having high cetane number.

Carbon Fingerprinting

[0130] Biologically-produced carbon-based products, e.g., ethanol, fatty acids, alkanes, isoprenoids, represent a new commodity for fuels, such as alcohols, diesel and gasoline. Such biofuels have not been produced using biomass but use CO.sub.2 as their carbon source. These new fuels may be distinguishable from fuels derived from petrochemical carbon on the basis of dual carbon-isotopic fingerprinting. Such products, derivatives, and mixtures thereof may be completely distinguished from their petrochemical derived counterparts on the basis of .sup.14C (fM) and dual carbon-isotopic fingerprinting, indicating new compositions of matter.

[0131] There are three naturally occurring isotopes of carbon: .sup.12C, .sup.13C, and .sup.14C. These isotopes occur in above-ground total carbon at fractions of 0.989, 0.011, and 10.sup.-12, respectively. The isotopes .sup.12C and .sup.13C are stable, while .sup.14C decays naturally to .sup.14N, a beta particle, and an anti-neutrino in a process with a half-life of 5730 years. The isotope .sup.14C originates in the atmosphere, due primarily to neutron bombardment of .sup.14N caused ultimately by cosmic radiation. Because of its relatively short half-life (in geologic terms), .sup.14C occurs at extremely low levels in fossil carbon. Over the course of 1 million years without exposure to the atmosphere (about 175 half-lives), only about 1 part in 10.sup.52 to 10.sup.53 of the original .sup.14C will remain.
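The depletion of .sup.14C in fossil carbon follows from first-order radioactive decay, in which the surviving fraction after t years is (1/2) raised to the power t divided by the half-life. The Python sketch below illustrates this with the 5730-year half-life stated above; the function names are illustrative assumptions, and the logarithmic form is included only because the fraction becomes very small for geologic time spans.

```python
import math

HALF_LIFE_YEARS = 5730.0  # 14C half-life as stated in the text


def fraction_remaining(years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction of the original 14C atoms surviving after `years`."""
    return 0.5 ** (years / half_life)


def log10_fraction_remaining(years: float, half_life: float = HALF_LIFE_YEARS) -> float:
    """Base-10 logarithm of the surviving fraction (stable for very long times)."""
    return -(years / half_life) * math.log10(2.0)


for years in (5730.0, 50_000.0, 1_000_000.0):
    print(f"{years:>9.0f} y: fraction = {fraction_remaining(years):.3e}, "
          f"log10 = {log10_fraction_remaining(years):.2f}")
```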

[0132] The .sup.13C:.sup.12C ratio varies slightly but measurably among natural carbon sources. Generally these differences are expressed as deviations from the .sup.13C:.sup.12C ratio in a standard material. The international standard for carbon is Pee Dee Belemnite, a form of limestone found in South Carolina, with a .sup.13C fraction of 0.0112372. For a carbon source a, the deviation of the .sup.13C:.sup.12C ratio from that of Pee Dee Belemnite is expressed as: .delta..sub.a=(R.sub.a/R.sub.s)-1, where R.sub.a=.sup.13C:.sup.12C ratio in the natural source, and R.sub.s=.sup.13C:.sup.12C ratio in Pee Dee Belemnite, the standard. For convenience, .delta..sub.a is expressed in parts per thousand, or ‰. A negative value of .delta..sub.a shows a bias toward .sup.12C over .sup.13C as compared to Pee Dee Belemnite. Table 1 shows .delta..sub.a and .sup.14C fraction for several natural sources of carbon.

TABLE-US-00002 TABLE 1 .sup.13C:.sup.12C variations in natural carbon sources
Source               -.delta..sub.a (‰)   References
Underground coal     32.5                 Farquhar et al. (1989) Plant Mol. Biol., 40: 503-37
Fossil fuels         26                   Farquhar et al. (1989) Plant Mol. Biol., 40: 503-37
Ocean DIC*           0-1.5                Goericke et al. (1994) Chapter 9 in Stable Isotopes in Ecology and Environmental Science, by K. Lajtha and R. H. Michener, Blackwell Publishing; Ivlev (2010) Separation Sci. Technol. 36: 1819-1914
Atmospheric CO.sub.2 6-8                  Ivlev (2010) Separation Sci. Technol. 36: 1819-1914; Farquhar et al. (1989) Plant Mol. Biol., 40: 503-37
Freshwater DIC*      6-14                 Dettman et al. (1999) Geochim. Cosmochim. Acta 63: 1049-1057
Pee Dee Belemnite    0                    Ivlev (2010) Separation Sci. Technol. 36: 1819-1914
*DIC = dissolved inorganic carbon

[0133] Biological processes often discriminate among carbon isotopes. The natural abundance of .sup.14C is very small, and hence discrimination for or against .sup.14C is difficult to measure. Biological discrimination between .sup.13C and .sup.12C, however, is well-documented. For a biological product p, we can define similar quantities to those above: .delta..sub.p=(R.sub.p/R.sub.s)-1, where R.sub.p=.sup.13C:.sup.12C ratio in the biological product, and R.sub.s=.sup.13C:.sup.12C ratio in Pee Dee Belemnite, the standard. Table 2 shows measured deviations in the .sup.13C:.sup.12C ratio for some biological products.

TABLE-US-00003 TABLE 2 .sup.13C:.sup.12C variations in selected biological products
Product                                         -.delta..sub.p (‰)   -D (‰)*    References
Plant sugar/starch from atmospheric CO.sub.2    18-28                10-20      Ivlev (2010) Separation Sci. Technol. 36: 1819-1914
Cyanobacterial biomass from marine DIC          18-31                16.5-31    Goericke et al. (1994) Chapter 9 in Stable Isotopes in Ecology and Environmental Science, by K. Lajtha and R. H. Michener, Blackwell Publishing; Sakata et al. (1997) Geochim. Cosmochim. Acta, 61: 5379-89
Cyanobacterial lipid from marine DIC            39-40                37.5-40    Sakata et al. (1997) Geochim. Cosmochim. Acta, 61: 5379-89
Algal lipid from marine DIC                     17-28                15.5-28    Goericke et al. (1994) Chapter 9 in Stable Isotopes in Ecology and Environmental Science, by K. Lajtha and R. H. Michener, Blackwell Publishing; Abelson et al. (1961) Proc. Natl. Acad. Sci., 47: 623-32
Algal biomass from freshwater DIC               17-36                3-30       Marty et al. (2008) Limnol. Oceanogr.: Methods 6: 51-63
E. coli lipid from plant sugar                  15-27                near 0     Monson et al. (1980) J. Biol. Chem., 255: 11435-41
Cyanobacterial lipid from fossil carbon         63.5-66              37.5-40    --
Cyanobacterial biomass from fossil carbon       42.5-57              16.5-31    --
*D = discrimination by a biological process in its utilization of .sup.12C vs. .sup.13C (see text)

[0134] Table 2 introduces a new quantity, D. This is the discrimination by a biological process in its utilization of .sup.12C vs. .sup.13C. We define D as follows: D=(R.sub.p/R.sub.a)-1. This quantity is very similar to .delta..sub.a and .delta..sub.p, except we now compare the biological product directly to the carbon source rather than to a standard. Using D, we can combine the bias effects of a carbon source and a biological process to obtain the bias of the biological product as compared to the standard. Solving for .delta..sub.p, we obtain: .delta..sub.p=(D)(.delta..sub.a)+D+.delta..sub.a, and, because (D)(.delta..sub.a) is generally very small compared to the other terms, .delta..sub.p.apprxeq..delta..sub.a+D.
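The intermediate step in this derivation can be written out explicitly. The following LaTeX block restates the definitions given in the text and factors the ratio R.sub.p/R.sub.s through the carbon source; it introduces no quantities beyond those defined above.

```latex
\begin{align*}
\delta_a &= \frac{R_a}{R_s} - 1, \qquad
D = \frac{R_p}{R_a} - 1, \qquad
\delta_p = \frac{R_p}{R_s} - 1 \\[4pt]
\delta_p &= \frac{R_p}{R_a}\cdot\frac{R_a}{R_s} - 1
          = (1 + D)(1 + \delta_a) - 1 \\[4pt]
         &= D\,\delta_a + D + \delta_a
          \;\approx\; \delta_a + D
          \quad \text{when } |D\,\delta_a| \ll |D| + |\delta_a|
\end{align*}
```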

[0135] For a biological product having a production process with a known D, we may therefore estimate .delta..sub.p by summing .delta..sub.a and D. We assume that D operates irrespective of the carbon source. This has been done in Table 2 for cyanobacterial lipid and biomass produced from fossil carbon. As shown in Table 1 and Table 2, above, cyanobacterial products made from fossil carbon (in the form of, for example, flue gas or other emissions) will have a higher -.delta..sub.p (i.e., a more negative .delta..sub.p) than comparable biological products made from other sources, distinguishing them on the basis of composition of matter from these other biological products. In addition, any product derived solely from fossil carbon will have a negligible fraction of .sup.14C, while products made from above-ground carbon will have a .sup.14C fraction of approximately 10.sup.-12.
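The estimate for fossil-carbon products can be reproduced numerically. The Python sketch below applies both the additive approximation and the exact product form to the fossil-fuel source value (-.delta..sub.a of 26) and the cyanobacterial D ranges from the tables above; the function names are illustrative assumptions, and all values are in parts per thousand.

```python
def delta_product_approx(delta_a_permil: float, d_permil: float) -> float:
    """Additive approximation: delta_p ~ delta_a + D (all in per mil)."""
    return delta_a_permil + d_permil


def delta_product_exact(delta_a_permil: float, d_permil: float) -> float:
    """Exact form: delta_p = (1 + D)(1 + delta_a) - 1, converted back to per mil."""
    delta_a = delta_a_permil / 1000.0
    d = d_permil / 1000.0
    return ((1.0 + d) * (1.0 + delta_a) - 1.0) * 1000.0


FOSSIL_DELTA_A = -26.0            # fossil fuels, Table 1 (sign restored)
D_RANGES = {
    "cyanobacterial lipid": (-37.5, -40.0),    # Table 2
    "cyanobacterial biomass": (-16.5, -31.0),  # Table 2
}

for product, (d_low, d_high) in D_RANGES.items():
    for d in (d_low, d_high):
        approx = delta_product_approx(FOSSIL_DELTA_A, d)
        exact = delta_product_exact(FOSSIL_DELTA_A, d)
        print(f"{product}, D = {d}: approx {approx:.1f}, exact {exact:.3f}")
```

The additive form gives the fossil-carbon rows of Table 2; the exact form differs by the small cross term (D)(.delta..sub.a), which is under 1.1 parts per thousand for these inputs.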

[0136] Accordingly, in certain aspects, the invention provides various carbon-based products of interest characterized as -.delta..sub.p (‰) of about 63.5 to about 66 and -D (‰) of about 37.5 to about 40.

[0137] The following examples are for illustrative purposes and are not intended to limit the scope of the present invention.

Example 1

Recombinant Genes for the Biosynthesis of Biodiesel and Biodiesel-Like Compounds

[0138] In one embodiment of the invention, a cyanobacterium strain is transformed or engineered to express one or more enzymes selected from the following list: a wax synthase (EC: 2.3.1.75), a thioesterase (EC: 3.1.2.-, 3.1.2.14), and an acyl-CoA synthetase (EC: 6.2.1.3). For example, a typical embodiment utilizes a thioesterase gene from E. coli (tesA; SEQ ID NO:1), an acyl-CoA synthetase gene from E. coli (fadD; SEQ ID NO:2), and a wax synthase gene from A. baylyi (wax; SEQ ID NO:3). Thioesterase generates fatty acid from acyl-ACP. Acyl-CoA synthetase (also referred to as acyl-CoA ligase) generates fatty acyl-CoA from fatty acid. Wax synthase (EC 2.3.1.75) generates fatty acid esters using acyl-CoA and acyl alcohol (e.g., methanol, ethanol, butanol, etc.) as substrates.

[0139] Additional thioesterase, acyl-CoA synthetase and wax synthase genes that can be recombinantly expressed in cyanobacteria are set forth in Table 3A, Table 3B, and Table 3C, respectively.

TABLE-US-00004 TABLE 3A Exemplary Thioesterases*
Source                      Enzyme                          GenBank: gene accession number   GenBank: protein accession number
E. coli                     C-18:1 thioesterase             NC_000913                        NP_415027
Cuphea hookeriana           C-8:0 to C-10:0 thioesterase    U39834.1                         AAC49269
Umbellularia californica    C-12:0 thioesterase             M94159.1                         Q41635
Cinnamomum camphora         C-14:0 thioesterase             U17076.1                         Q39473
Arabidopsis thaliana        C-18:1 thioesterase             822102                           NP_189147.1
*where leader sequences are present in the native protein, as in the case of E. coli tesA, the leader sequences are typically removed before the activity is recombinantly expressed

TABLE-US-00005 TABLE 3B Exemplary Acyl-CoA Synthetases
Source                                    Gene name              GenBank: gene accession number   GenBank: protein accession number
E. coli                                   Acyl-CoA synthetase    NC_000913                        NP_416319.1
Geobacillus thermodenitrificans NG80-2    Acyl-CoA synthetase    CP000557.1                       ABO66726.1

TABLE-US-00006 TABLE 3C Exemplary Wax Synthases
Source                               Gene or protein name             GenBank: gene accession number   GenBank: protein accession number
Acinetobacter baylyi                 wxs                              AF529086.1                       AAO17391.1
Mycobacterium tuberculosis H37Rv     acyltransferase, WS/DGAT/MGAT    --                               NP_218257.1
Saccharomyces cerevisiae             Eeb1                             --                               NP_015230
Saccharomyces cerevisiae             YMR210w                          --                               NP_013937
Rattus norvegicus (rat)              FAEE synthase                    --                               P16303
Fundibacter jadensis DSM 12178       wst9                             --                               --
Acinetobacter sp. H01-N              Wshn                             --                               --
H. sapiens                           mWS                              --                               --
Fragaria x ananassa                  SAAT                             --                               --
Malus x domestica                    mpAAT                            --                               --
Simmondsia chinensis                 JjWs                             --                               Q9XGY6
Mus musculus                         mWS                              --                               Q6E1M8

[0140] The engineered cyanobacterium expressing one or more of the thioesterase, acyl-CoA synthetase, and wax synthase genes set forth above is grown in suitable media, under appropriate conditions (e.g., temperature, shaking, light, etc.). After a certain optical density is reached, the cells are separated from the spent medium by centrifugation. The cell pellet is re-suspended and the cell suspension and the spent medium are then extracted with a suitable solvent, e.g., ethyl acetate. The resulting ethyl acetate phases from the cell suspension and the supernatant are subjected to GC-MS analysis. The fatty acid esters in the ethyl acetate phases can be quantified, e.g., using commercial palmitic acid ethyl ester as a reference standard.

[0141] Fatty acid esters can be made according to this method by adding an alcohol (e.g., methanol, propanol, isopropanol, butanol, etc.) to the fermentation media, whereby fatty acid esters of the added alcohols are produced by the engineered cyanobacterium. Alternatively, one or more alcohols can be synthesized by the engineered cyanobacterium, natively or recombinantly, and used as substrates for fatty acid ester synthesis by a recombinantly expressed wax synthase. As detailed in the Examples below, the engineered cyanobacterium can also be modified to recombinantly express a TolC/AcrAB transporter to facilitate secretion of the fatty acid esters into the culture medium.

Example 2

Synthesis of Ethyl and Methyl Fatty Acid Esters by an Engineered Cyanobacterium

[0142] Genes and Plasmids:

[0143] The pJB5 base vector was designed as an empty expression vector for recombination into Synechococcus sp. PCC 7002. Two regions of homology, the Upstream Homology Region (UHR) and the Downstream Homology Region (DHR), were designed to flank the construct of interest. These 500 bp regions of homology correspond to positions 3301-3800 and 3801-4300 (Genbank Accession NC_005025) for the UHR and DHR, respectively. The aadA promoter, gene sequence, and terminator were designed to confer spectinomycin and streptomycin resistance to the integrated construct. For expression, pJB5 was designed with the aphII kanamycin resistance cassette promoter and ribosome binding site (RBS). Downstream of this promoter and RBS, the restriction endonuclease recognition sites for NdeI, EcoRI, SpeI and PacI were inserted. Following the EcoRI site, the natural terminator from the alcohol dehydrogenase gene (adhII) of Zymomonas mobilis was included. Convenient XbaI restriction sites flank the UHR and the DHR, allowing cleavage of the DNA intended for recombination from the rest of the vector.

[0144] The E. coli thioesterase tesA gene with the leader sequence removed (SEQ ID NO:4; Genbank # NC_000913; Cho and Cronan, 1993), the E. coli acyl-CoA synthetase fadD (SEQ ID NO:5; Genbank # NC_000913; Kameda and Nunn, 1981) and the wax synthase gene (wax) from Acinetobacter baylyi strain ADP1 (SEQ ID NO:6; Genbank # AF529086.1; Stoveken et al. 2005) were purchased from DNA 2.0, following codon optimization, checking for secondary structure effects, and removal of any unwanted restriction sites (NdeI, XhoI, BamHI, NgoMIV, NcoI, SacI, BsrGI, AvrII, BmtI, MluI, EcoRI, SbfI, NotI, SpeI, XbaI, PacI, AscI, FseI). These genes were received on a pJ201 vector and assembled into a three-gene operon (tesA-fadD-wax, SEQ ID NO: 10) with flanking NdeI-EcoRI sites on the recombination vector pJB5 under the control of the PaphII kanamycin resistance cassette promoter; the resulting plasmid was named pJB494. A second plasmid (pJB532; SEQ ID NO:11) was constructed which is identical to pJB494 except that the PaphII promoter was replaced with SEQ ID NO:12, a Ptrc promoter and a lacIq repressor. As a control, a third plasmid (pJB413) was prepared with only tesA under the control of the PaphII promoter.

[0145] Strain Construction:

[0146] The constructs described above were integrated onto the plasmid pAQ1 in Synechococcus sp. PCC 7002 according to the following protocol. Synechococcus 7002 was grown for 48 h from colonies in an incubated shaker flask at 37.degree. C. at 2% CO.sub.2 to an OD.sub.730 of 1 in A.sup.+ medium described in Frigaard et al., Methods Mol. Biol., 274:325-340 (2004). 450 .mu.L of culture was added to an epi-tube with 50 .mu.L containing 5 .mu.g of plasmid DNA digested with XbaI (New England Biolabs; Ipswich, Mass.) that was not purified following restriction digest. Cells were incubated in the dark for four hours at 37.degree. C. The entire volume of cells was plated on A.sup.+ medium plates with 1.5% agarose and grown at 37.degree. C. in a lighted incubator (40-60 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)) for about 24 hours. 25 .mu.g/mL of spectinomycin was underlayed on the plates. Resistant colonies were visible in 7-10 days after further incubation, and recombinant strains were confirmed by PCR using internal and external primers to check insertion and confirm location of the genes on pAQ1 in the strains (Table 4).

TABLE-US-00007 TABLE 4 Joule Culture Collection (JCC) numbers of Synechococcus sp. PCC 7002 recombinant strains with gene insertions on the native plasmid pAQ1
JCC #     Promoter      Genes            Marker
JCC879    PaphII        --               aadA
JCC750    PaphII        tesA             aadA
JCC723    PaphII        tesA-fadD-wax    aadA
JCC803    lacIq Ptrc    tesA-fadD-wax    aadA

[0147] Ethyl Ester Production Culturing Conditions:

[0148] One colony of each of the four strains listed in Table 4 was inoculated into 10 ml of A+ media containing 50 .mu.g/ml spectinomycin and 1% ethanol (v/v). These cultures were incubated for about 4 days in a bubble tube at 37.degree. C. sparged at approximately 1-2 bubbles of 1% CO.sub.2/air every 2 seconds in light (40-50 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)). The cultures were then diluted so that the following day they would have OD.sub.730 of 2-6. The cells were washed with 2.times.10 ml JB 2.1/spec200, and inoculated into duplicate 28 ml cultures in JB 2.1/spec200+1% ethanol (v/v) media to an OD.sub.730=0.07. IPTG was added to the JCC803 cultures to a final concentration of 0.5 mM. These cultures were incubated in a shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)) for ten days. Water loss through evaporation was replaced with the addition of sterile Milli-Q water. Every 48 hours, 0.5% (v/v) ethanol was added to the cultures to replace loss due to evaporation. At 68 and 236 hours, 5 ml and 3 ml of culture, respectively, were removed from each flask for ethyl ester analysis. The OD.sub.730 values reached by the cultures are given in Table 5.
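Inoculating to a defined starting optical density is a simple C1 x V1 = C2 x V2 dilution. The Python sketch below illustrates the calculation for the 28 ml, OD.sub.730=0.07 cultures described above, assuming a hypothetical seed-culture OD.sub.730 of 4.0 (within the stated 2-6 range) and ignoring any volume change from the wash steps; the function name is illustrative and not part of the disclosed protocol.

```python
def inoculum_volume_ml(seed_od: float, target_od: float, final_volume_ml: float) -> float:
    """Volume of seed culture needed so the final culture starts at target_od."""
    if seed_od <= target_od:
        raise ValueError("seed culture must be denser than the target culture")
    return target_od * final_volume_ml / seed_od


# Target values from the text: OD730 = 0.07 in a 28 ml culture.
# The seed OD730 of 4.0 is a hypothetical example value.
volume = inoculum_volume_ml(seed_od=4.0, target_od=0.07, final_volume_ml=28.0)
print(f"Use {volume:.2f} ml of seed culture and bring to 28 ml with medium")
```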

TABLE-US-00008 TABLE 5 OD.sub.730s reached by recombinant Synechococcus sp. PCC 7002 strains at timepoints 68 and 236 h
Time point   JCC879 #1   JCC879 #2   JCC750 #1   JCC750 #2   JCC723 #1   JCC723 #2   JCC803 #1   JCC803 #2
68 h         3.6         4.0         4.6         5.0         6.6         6.0         5.4         5.8
236 h        21.2        18.5        19.4        20.9        22.2        21.4        17.2        17.7

[0149] The culture aliquots were pelleted using a Sorvall RC6 Plus superspeed centrifuge (Thermo Electron Corp) and a F13S-14X50CY rotor (5000 rpm for 10 min). The spent media supernatant was removed and the cells were resuspended in 1 ml of Milli-Q water. The cells were pelleted again using a benchtop centrifuge, the supernatant discarded and the cell pellet was stored at -80.degree. C. until analyzed for the presence of ethyl esters.

[0150] Detection and Quantification of Ethyl Esters in Strains:

[0151] Cell pellets were thawed and 1 ml aliquots of acetone (Acros Organics 326570010) containing 100 mg/L butylated hydroxytoluene (Sigma-Aldrich B1378) and 50 mg/L ethyl valerate (Fluka 30784) were added. The cell pellets were mixed with the acetone using a Pasteur pipette and vortexed twice for 10 seconds (total extraction time of 1-2 min). The suspensions were centrifuged for 5 min to pellet debris, and the supernatants were removed with Pasteur pipettes and subjected to analysis with a gas chromatograph using flame ionization detection (GC/FID).

[0152] An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used to detect the ethyl esters. One .mu.L of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 mL/min) and an inlet temperature of 280.degree. C. The column was a HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 mL/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree./min increase to 280.degree. C.; hold ten minutes. The GC/MS interface was 290.degree. C., and the MS range monitored was 25 to 600 amu. Ethyl myristate [C14:0; retention time (rt): 17.8 min], ethyl palmitate (C16:0; rt: 19.8 min) and ethyl stearate (C18:0; rt: 21.6 min) were identified based on comparison to a standard mix of C4-C24 even carbon saturated fatty acid ethyl esters (Supelco 49454-U). Ethyl oleate (C18:1; rt: 21.4 min) was identified by comparison with an ethyl oleate standard (Sigma Aldrich 268011). These identifications were confirmed by GC/MS (see following Methyl Ester Production description for details). Calibration curves were constructed for these ethyl esters using the commercially available standards, and the concentrations of ethyl esters present in the extracts were determined and normalized to the concentration of ethyl valerate (internal standard).
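Quantification against an internal standard generally works by calibrating the ratio of analyte peak area to internal-standard peak area against known concentrations, then inverting that calibration for each sample. The Python sketch below shows one common way to do this with a linear least-squares fit. It is not the exact procedure used to generate the data here: the calibration points and peak areas are hypothetical placeholder values, and the function names are illustrative assumptions.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_mg_per_l(analyte_area: float, istd_area: float,
                           slope: float, intercept: float) -> float:
    """Invert the calibration: area ratio (analyte / internal standard) -> mg/L."""
    ratio = analyte_area / istd_area
    return (ratio - intercept) / slope


# Hypothetical calibration for one ethyl ester: standard concentration (mg/L)
# versus peak-area ratio relative to the ethyl valerate internal standard.
STANDARD_CONC = [1.0, 5.0, 10.0, 50.0]
STANDARD_RATIO = [0.02, 0.10, 0.20, 1.00]

slope, intercept = fit_line(STANDARD_CONC, STANDARD_RATIO)

# Hypothetical sample peak areas (arbitrary detector units).
sample = concentration_mg_per_l(3000.0, 10000.0, slope, intercept)
print(f"Estimated concentration in extract: {sample:.1f} mg/L")
```

Normalizing to the internal standard compensates for variation in injection volume and extraction recovery between samples.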

[0153] Four different ethyl esters were found in the extracts of JCC723 and JCC803 (Table 6 and Table 7). In general, JCC803 produced 2-10 times as much of each ethyl ester as JCC723, but ethyl myristate (C14:0) was only produced in low quantities of about 1 mg/L or less for all these cultures. Both JCC723 and JCC803 produced ethyl esters with the relative amounts C16:0>C18:0>C18:1 (cis-9)>C14:0. No ethyl esters were found in the extracts of JCC879 or JCC750, indicating that the strain cannot make ethyl esters naturally and that expression of only the tesA gene is not sufficient to confer production of ethyl esters.

TABLE-US-00009 TABLE 6 Amounts of respective ethyl esters found in the cell pellet extracts of JCC723 and JCC803 given as mg/L of culture
Sample             C14:0 myristate   C16:0 palmitate   C18:1 (cis-9) oleate   C18:0 stearate   % Yield*
JCC723 #1 68 h     0.08              0.34              0.22                   0.21             0.04
JCC723 #2 68 h     0.12              1.0               0.43                   0.40             0.1
JCC803 #1 68 h     0.45              6.6               1.4                    0.74             0.6
JCC803 #2 68 h     0.63              8.6               2.0                    0.94             0.7
JCC723 #1 236 h    1.04              15.3              2.1                    4.5              0.3
JCC723 #2 236 h    0.59              9.0               1.3                    3.7              0.2
JCC803 #1 236 h    0.28              35.3              13.4                   19.2             1.3
JCC803 #2 236 h    0.49              49.4              14.9                   21.2             1.6
*Yield (%) = ((sum of EEs)/dry cell weight)*100

TABLE-US-00010 TABLE 7 % of total ethyl esters by mass
Sample             C14:0 myristate   C16:0 palmitate   C18:1 oleate   C18:0 stearate
JCC723 #1 68 h     9.4               40.0              25.9           24.7
JCC723 #2 68 h     6.2               51.3              22.1           20.5
JCC803 #1 68 h     4.9               71.8              15.2           8.1
JCC803 #2 68 h     5.2               70.7              16.4           7.7
JCC723 #1 236 h    4.5               66.7              9.2            19.6
JCC723 #2 236 h    4.0               61.7              8.9            25.4
JCC803 #1 236 h    0.4               51.8              19.7           28.2
JCC803 #2 236 h    0.6               57.4              17.3           24.7
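The mass percentages in Table 7 follow from the mg/L values in Table 6 by dividing each ester amount by the sum for that sample. The Python sketch below reproduces two rows; the dictionary layout and function name are illustrative assumptions, while the numbers are transcribed from Table 6.

```python
from typing import Dict

# Ethyl ester amounts (mg/L of culture) transcribed from two rows of Table 6.
TABLE_6_MG_PER_L: Dict[str, Dict[str, float]] = {
    "JCC723 #1 68 h": {"C14:0": 0.08, "C16:0": 0.34, "C18:1": 0.22, "C18:0": 0.21},
    "JCC803 #1 236 h": {"C14:0": 0.28, "C16:0": 35.3, "C18:1": 13.4, "C18:0": 19.2},
}


def percent_by_mass(amounts: Dict[str, float]) -> Dict[str, float]:
    """Express each ester as a percentage of the summed ester mass (1 decimal)."""
    total = sum(amounts.values())
    return {ester: round(100.0 * value / total, 1) for ester, value in amounts.items()}


for sample, amounts in TABLE_6_MG_PER_L.items():
    print(sample, percent_by_mass(amounts))
```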

[0154] Methyl Ester Production Culturing Conditions:

[0155] One colony of JCC803 (Table 4) was inoculated into 10 mL of A+ media containing 50 .mu.g/ml spectinomycin and 1% ethanol (v/v). This culture was incubated for 3 days in a bubble tube at 37.degree. C. sparged at approximately 1-2 bubbles of 1% CO.sub.2/air every 2 seconds in light (40-50 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)). The culture was inoculated into two flasks to a final volume of 20.5 ml and OD.sub.730=0.08 in A+ media containing 200 .mu.g/ml spectinomycin and 0.5 mM IPTG with either 0.5% methanol or 0.5% ethanol (v/v). These cultures were incubated in a shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)) for three days. Water loss through evaporation was replaced with the addition of sterile Milli-Q water. Samples of 5 ml of these cultures (OD.sub.730=5-6) were analyzed for the presence of ethyl or methyl esters.

[0156] Detection of Methyl Esters and Comparison with Ethyl Ester Production in the Same Strain:

[0157] Cell pellets were thawed and 1 ml aliquots of acetone (Acros Organics 326570010) containing 100 mg/L butylated hydroxytoluene (Sigma-Aldrich B1378) and 50 mg/L ethyl valerate (Fluka 30784) were added. The cell pellets were mixed with the acetone using a Pasteur pipette and vortexed twice for 10 seconds (total extraction time of 1-2 min). The suspensions were centrifuged for 5 min to pellet debris, and the supernatants were removed with Pasteur pipettes and subjected to analysis with a gas chromatograph using mass spectral detection (GC/MS).

[0158] An Agilent 7890A GC/5975C EI-MS equipped with a 7683 series autosampler was used to measure the ethyl esters. One .mu.L of each sample was injected into the GC inlet using pulsed splitless injection (pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 mL/min) and an inlet temperature of 280.degree. C. The column was a HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 mL/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree./min increase to 280.degree. C.; hold ten minutes. The GC/MS interface was 290.degree. C., and the MS range monitored was 25 to 600 amu. Compounds indicated by peaks present in total ion chromatograms were identified by matching experimentally determined mass spectra associated with the peaks with mass spectral matches found by searching in a NIST 08 MS database.

[0159] The culture of JCC803 incubated with ethanol contained ethyl palmitate [C16:0; retention time (rt): 18.5 min], ethyl heptadecanoate (C17:0; rt: 19.4 min), ethyl oleate (C18:1; rt: 20.1 min) and ethyl stearate (C18:0; rt: 20.3 min) (FIG. 1). The relative amounts produced were C16:0>C18:0>C18:1>C17:0. The production of low levels of C17:0 and the absence of measured levels of C14:0/myristate in this experiment are likely a result of the use of A+ medium (JB 2.1 was used to generate the data in Table 7, above).

[0160] No ethyl esters were detected in the strain incubated with methanol. Instead, methyl palmitate (C16:0; retention time ("rt"): 17.8 min), methyl heptadecanoate (C17:0; rt: 18.8 min) and methyl stearate (C18:0) were found (FIG. 1; methyl palmitate: 0.1 mg/L; methyl heptadecanoate: 0.062 mg/L; methyl stearate: 0.058 mg/L; total FAMEs: 0.22 mg/L; % of DCW: 0.01).

[0161] The data presented herein show that JCC803 and other cyanobacterial strains engineered with tesA-fadD-wax genes can utilize methanol, ethanol, butanol, and other alcohols, including exogenously added alcohols, to produce a variety of fatty acid esters. In certain embodiments, multiple types of exogenous or endogenous alcohols (e.g., methanol and ethanol; butanol or ethanol; methanol and butanol; etc.) could be added to the culture medium and utilized as substrates.

Example 3

Production of Fatty-Acid Esters Through Heterologous Expression of an Acyl-CoA Synthetase and a Wax Synthase

[0162] In order to compare the yields of fatty-acid esters produced by recombinant strains expressing tesA-fadD or fadD-wax (i.e., two of the three genes in the tesA-fadD-wax synthetic operon), fadD-wax and tesA-fadD were assembled as two-gene operons and inserted into pJB5 to yield pJB634 and pJB578, respectively. These recombination plasmids were transformed into Synechococcus sp. PCC 7002 as described in Example 2, above, to generate the strains listed in Table 8. Table 8 also lists JCC723, described above.

TABLE-US-00011 TABLE 8 Joule Culture Collection (JCC) numbers of the Synechococcus sp. PCC 7002 recombinant strains with gene insertions on the native plasmid pAQ1.
Strain #   Promoter   Genes            Promoter-operon sequences   Marker   OD.sub.730   % DCW FAEE
JCC723     PaphII     tesA-fadD-wax    SEQ ID NO: 10               aadA     15.35        0.20
JCC1215    PaphII     fadD-wax         SEQ ID NO: 13               aadA     10.10        0.04
JCC1216    PaphII     tesA-fadD        SEQ ID NO: 14               aadA     10.00        0.00

[0163] One 30-ml culture of each strain listed in Table 8 was prepared in JB 2.1 medium containing 200 mg/L spectinomycin and 1% ethanol (vol/vol) at an OD.sub.730=0.1 in 125 ml flasks equipped with foam plugs (inocula were from five ml A+ cultures containing 200 mg/L spectinomycin started from colonies incubated for 3 days in a Multitron II Infors shaking photoincubator under continuous light of .about.100 .mu.E m.sup.-2s.sup.-1 photosynthetically active radiation (PAR) at 37.degree. C. at 150 rpm in 2% CO.sub.2-enriched air). The cultures were incubated for seven days in the Infors incubators under continuous light of .about.100 .mu.E m.sup.-2s.sup.-1 photosynthetically active radiation (PAR) at 37.degree. C. at 150 rpm in 2% CO.sub.2-enriched air. Fifty percent of the starting volume of ethanol was added approximately at day 5 based on experimentally determined stripping rates of ethanol under these conditions. Water loss was compensated by adding back milli-Q water (based on weight loss of flasks). Optical density measurements at 730 nm (OD.sub.730) were taken (Table 8), and esters were extracted from cell pellets using the acetone procedure detailed in Example 2, above. Ethyl arachidate (Sigma A9010) at 100 mg/L was used as an internal standard instead of ethyl valerate. The dry cell weights (DCWs) were estimated based on the OD measurement using an experimentally determined average of 300 mg L.sup.-1 OD.sub.730.sup.-1.
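The dry cell weight estimate and the % DCW figure are linked by the stated conversion factor of 300 mg per liter per OD.sub.730 unit. The Python sketch below illustrates the conversion in both directions using the JCC723 values from Table 8; the function names are illustrative assumptions, and the back-calculated ester titre is shown only to illustrate the arithmetic and is not a value reported in the source.

```python
DCW_MG_PER_L_PER_OD730 = 300.0  # conversion factor stated in the text


def estimate_dcw_mg_per_l(od730: float) -> float:
    """Estimate dry cell weight (mg/L) from an OD730 reading."""
    return od730 * DCW_MG_PER_L_PER_OD730


def percent_dcw(ester_mg_per_l: float, od730: float) -> float:
    """Ester titre expressed as a percentage of estimated dry cell weight."""
    return 100.0 * ester_mg_per_l / estimate_dcw_mg_per_l(od730)


def titre_from_percent_dcw(percent: float, od730: float) -> float:
    """Inverse of percent_dcw: ester titre (mg/L) implied by a % DCW value."""
    return percent / 100.0 * estimate_dcw_mg_per_l(od730)


# JCC723 values from Table 8: OD730 = 15.35, FAEE = 0.20 % of DCW.
dcw = estimate_dcw_mg_per_l(15.35)
titre = titre_from_percent_dcw(0.20, 15.35)
print(f"Estimated DCW: {dcw:.0f} mg/L; implied FAEE titre: {titre:.2f} mg/L")
```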

[0164] The acetone extracts were analyzed by GC/FID (for instrument conditions, see Example 2). In order to quantify the various esters, response factors (RF) were estimated from RFs measured for authentic ethyl ester standards and these RFs were used to determine the titres in the acetone extracts. The % DCW of the fatty-acid esters and the sum of the esters as % DCW are given in Table 8. Expression of fadD-wax was sufficient to allow production of fatty-acid ethyl esters (FAEEs), while expression of tesA-fadD did not result in any FAEEs (FIG. 2). The overall yield of the fadD-wax strain (JCC1215) was lower than that of JCC723, indicating that the co-expression of tesA is beneficial for increasing yields of FAEEs in this strain.

Example 4

Production of Longer-Chain Fatty-Acid Esters by Addition of Respective Alcohols to tesA-fadD-Wax Cultures

[0165] Seven 30-ml cultures of JCC803 (prepared from a single JCC803 culture that was diluted into 250 ml of JB 2.1 media containing 200 mg/L spectinomycin at an OD.sub.730=0.1) in 125-ml flasks were used to evaluate the ability of JCC803 to esterify different alcohols with fatty acids. Seven different alcohols were added at concentrations previously determined to allow growth of JCC803 (Table 9). The cultures were incubated for seven days in a Multitron II Infors shaking photoincubator under continuous light of .about.100 .mu.E m.sup.-2s.sup.-1 photosynthetically active radiation (PAR) at 37.degree. C. at 150 rpm in 2% CO.sub.2-enriched air. Water loss was compensated by adding back milli-Q water (based on weight loss of flasks). Optical density measurements at 730 nm (OD.sub.730) were taken (Table 9), and esters were extracted from cell pellets using the acetone procedure detailed in Example 2, above. Ethyl arachidate (Sigma A9010) at 100 mg/L was used as an internal standard instead of ethyl valerate. The dry cell weights (DCWs) were also determined for each culture so that the % DCW of the esters could be reported.

TABLE-US-00012 TABLE 9
Alcohol	Catalog #	Concentration % (vol/vol)	Final OD.sub.730
Propanol	256404 (Sigma)	0.25	12.6
Isopropanol	BP2632 (Fisher)	0.25	12.6
Butanol	34867 (Sigma)	0.1	12.5
Hexanol	H13303 (Sigma)	0.01	8.6
Cyclohexanol	105899 (Sigma)	0.01	13.6
Isoamyl alcohol	A393 (Fisher)	0.05	13.6
Ethanol	2716 (Decon Labs Inc.)	1.0	14.0

[0166] The acetone extracts were analyzed by GC/MS and GC/FID, as described above. The compounds indicated by peaks present in the total ion chromatograms were identified by matching the mass spectra associated with the peaks with mass spectral matches found by searching the NIST 08 MS database or by interpretation of the mass spectra when a respective mass spectrum of an authentic standard was not available in the database. In all cases, the corresponding alcohol esters of fatty acids were produced by JCC803 (FIG. 3). Six fatty-acid esters were detected and quantified in the cell pellet extracts: myristate (C14:0), palmitoleate (C16:1.DELTA.9), palmitate (C16:0), margarate (C17:0), oleate (C18:1.DELTA.9) and stearate (C18:0). Magnified chromatograms for JCC803 incubated with ethanol and butanol are shown in FIG. 4 and FIG. 5, respectively, so that the lower-yielding palmitoleate and margarate esters could be indicated on the chromatograms. To quantify the various esters, response factors (RFs) were estimated from RFs measured for authentic ethyl ester standards, and these RFs were used to determine the titres in the acetone extracts. The % DCW of the different esters and the sum of the esters as % DCW are given in Table 10. The % of the individual esters by weight and the total ester yield in mg/L are given in Table 11.

[0167] In general, the provision of longer-chain alcohols increased the yields of fatty-acid esters. The addition of butanol resulted in the highest yields of fatty-acid esters. Because butanol can be made biosynthetically (Nielsen et al. 2009, and references therein), exogenous butanol biosynthetic pathways could be expressed by one skilled in the art to generate a photosynthetic strain which can produce butyl esters without the addition of butanol. The use of butanol and butanol-producing pathways in other microbes containing the tesA-fadD-wax pathway would also be expected to increase yields of fatty-acid esters.

TABLE-US-00013 TABLE 10 The yield of the fatty-acid esters individually and total as % dry cell weight
Alkyl ester	Myristate	Palmitoleate	Palmitate	Margarate	Oleate	Stearate	Total Ester
Ethyl	0.05	0.02	0.94	0.01	0.11	0.15	1.3
Propyl	0.26	0.06	3.22	0.03	0.21	0.48	4.3
Isopropyl	0.20	0.04	2.42	0.02	0.08	0.42	3.2
Butyl	0.59	0.06	3.67	0.03	0.19	0.56	5.1
Hexyl	0.11	0.04	1.33	0.02	0.17	0.19	1.8
Cyclohexyl	0.09	0.03	1.88	0.01	0.09	0.31	2.4
Isoamyl	0.31	0.05	2.84	0.02	0.15	0.46	3.8

TABLE-US-00014 TABLE 11 The % of the individual esters by weight and total ester yield in mg/L.
Alkyl ester	Myristate (%)	Palmitoleate (%)	Palmitate (%)	Margarate (%)	Oleate (%)	Stearate (%)	Total Ester (mg/L)
Ethyl	4.2	1.2	73.4	0.7	8.6	12.0	77.6
Propyl	6.0	1.3	76.0	0.7	4.9	11.1	251.7
Isopropyl	6.2	1.2	76.4	0.8	2.4	13.0	188.5
Butyl	11.4	1.1	72.6	0.5	3.7	10.8	308.9
Hexyl	6.0	2.1	71.9	1.1	8.9	10.0	65.3
Cyclohexyl	3.6	1.1	78.5	0.6	3.6	12.7	139.6
Isoamyl	8.1	1.2	74.6	0.5	3.9	11.8	226.8
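As a non-limiting sketch of how a by-weight composition such as that of Table 11 relates to per-ester % DCW values such as those of Table 10, each ester's share can be computed as its % DCW divided by the row total. The ethyl and butyl rows below are copied from Table 10; the function name is hypothetical. Because the Table 10 values are rounded to two decimals, shares computed this way are expected to approximate, not reproduce exactly, the Table 11 percentages.

```python
ESTERS = ["myristate", "palmitoleate", "palmitate", "margarate", "oleate", "stearate"]

# Per-ester yields in % DCW, copied from the ethyl and butyl rows of Table 10.
TABLE_10 = {
    "ethyl": [0.05, 0.02, 0.94, 0.01, 0.11, 0.15],
    "butyl": [0.59, 0.06, 3.67, 0.03, 0.19, 0.56],
}


def composition_percent(values):
    """Return each value as a percentage of the row total."""
    total = sum(values)
    return [100.0 * v / total for v in values]


for alkyl, row in TABLE_10.items():
    shares = composition_percent(row)
    print(alkyl, "total % DCW:", round(sum(row), 2))
    for ester, share in zip(ESTERS, shares):
        print("  ", ester, round(share, 1))
```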

Example 5

Reproducibility of Butanol Yields in tesA-fadD-Wax Cultures

[0168] Six 30-ml cultures of JCC803 (prepared from a single JCC803 culture that was diluted into 200 ml of JB 2.1 media/spec200 at an OD.sub.730=0.1) in 125 ml flasks were used to evaluate the ability of JCC803 cultures to produce butyl esters when containing different concentrations of butanol. Six different concentrations were tested (Table 12). The cultures were incubated for 21 days in a Multitron II Infors shaking photoincubator under continuous light at .about.100 .mu.E m.sup.-2s.sup.-1 PAR at 37.degree. C. at 150 rpm in 2% CO.sub.2-enriched air. Fifty percent of the starting volume of butanol was added approximately every 3.5 days based on experimentally determined stripping rates of butanol under these conditions. Water loss was compensated by adding back milli-Q water (based on weight loss of flasks). OD.sub.730s were taken and esters were extracted from cell pellets using the acetone procedure detailed above. 100 mg/L ethyl arachidate (Sigma A9010) was used as an internal standard instead of ethyl valerate. The dry cell weights (DCWs) were also determined for each culture so that the % DCW of the esters could be reported.

[0169] An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used to quantify the butyl esters. One microliter of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 mL/min), which was at a temperature of 280.degree. C. The column was an HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m), and the carrier gas was helium at a flow of 1.0 mL/min. The GC oven temperature program was: 50.degree. C., hold one minute; 10.degree./min increase to 280.degree. C.; hold ten minutes. Butyl myristate, butyl palmitate, butyl margarate, butyl oleate and butyl stearate were quantified by determining appropriate response factors for the number of carbons present in the butyl esters from commercially available fatty-acid ethyl esters ("FAEEs") and fatty acid butyl esters ("FABEs"). The calibration curves were prepared for ethyl laurate (Sigma 61630), ethyl myristate (Sigma E39600), ethyl palmitate (Sigma P9009), ethyl oleate (Sigma 268011), ethyl stearate (Fluka 85690), butyl laurate (Sigma W220604) and butyl stearate (Sigma S5001). The concentrations of the butyl esters present in the extracts were determined and normalized to the concentration of ethyl arachidate (internal standard).
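The normalization to the internal standard described above can be sketched as follows. This is a minimal illustration that assumes the common relative-response-factor convention (analyte response per unit concentration divided by internal-standard response per unit concentration); the disclosure does not state the exact formula used, and all peak areas below are hypothetical. Only the 100 mg/L ethyl arachidate level is taken from the text.

```python
def relative_response_factor(area_std, conc_std, area_is, conc_is):
    """RRF of an authentic standard relative to the internal standard (IS)."""
    return (area_std / conc_std) / (area_is / conc_is)


def analyte_concentration(area_analyte, area_is, conc_is, rrf):
    """Analyte concentration normalised to the internal standard."""
    return (area_analyte / area_is) * conc_is / rrf


IS_CONC_MG_PER_L = 100.0  # ethyl arachidate level stated in the text

# Hypothetical calibration injection: a 50 mg/L standard alongside the IS.
rrf = relative_response_factor(area_std=50.0, conc_std=50.0,
                               area_is=80.0, conc_is=IS_CONC_MG_PER_L)

# Hypothetical sample injection.
conc = analyte_concentration(area_analyte=120.0, area_is=80.0,
                             conc_is=IS_CONC_MG_PER_L, rrf=rrf)
print(rrf, conc)
```

Where no authentic butyl ester standard exists for a given chain length, an RRF would have to be estimated from the available standards, as the paragraph above indicates; that estimation step is not modelled here.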

[0170] The yields of the JCC803 cultures, expressed as the % DCW of the fatty acid butyl esters, are given in Table 12. The highest yield, 14.7% DCW, was obtained from the culture incubated with 0.05% butanol (vol/vol), although the culture containing 0.075% butanol gave approximately the same yield (14.5% DCW).

TABLE-US-00015 TABLE 12 Yield of total FABEs as % DCW for the JCC803 cultures containing different concentrations of butanol and final OD.sub.730 of the cultures.
Concentration of butanol % (vol/vol)	OD.sub.730	% DCW
0.2	10.6	11.75
0.1	9.0	12.43
0.075	12.8	14.53
0.05	12.0	14.71
0.025	13.4	10.43
0.01	16.0	6.12

Example 6

Secretion of Esters Produced by an Engineered Cyanobacterium

[0171] Plasmids.

[0172] Escherichia coli exports alkanes and other hydrophobic molecules out of the cell via the TolC-AcrAB transporter complex (Tsukagoshi and Aono, 2000; Chollet et al. 2004). PCR primer sets were designed to amplify tolC (Genbank # NC_000913.2, locus b3035) and acrA-acrB as an operon (Genbank # NC_000913.2, loci b0463, b0462) from E. coli MG1655 (ATCC #700926). The tolC and acrAB genes were amplified from MG1655 genomic DNA using the Phusion High-Fidelity PCR kit F-553 from New England BioLabs (Ipswich, Mass.) following the manufacturer's instructions. Buffer GC and 3% dimethyl sulfoxide (DMSO) were used for the PCR reactions. The amplicons were assembled into a three-gene, two-promoter construct ("transporter insert"; P.sub.psaA-tolC-P.sub.tsr2142-acrAB) and placed in the multiple cloning site of recombination vector pJB161 (SEQ ID #15) to yield pJB1074. pJB161 (and pJB161-derived plasmids, including pJB1074) contains an upstream homology region (UHR) and a downstream homology region (DHR) that allow recombination into the pAQ7 plasmid of Synechococcus sp. PCC 7002 at the lactate dehydrogenase locus (for pAQ7 plasmid sequence, see Genbank # CP000957). The homology regions flank a multiple cloning site (mcs), the natural terminator from the alcohol dehydrogenase gene from Zymomonas mobilis (adhII) and a kanamycin cassette which provides resistance in both E. coli and Synechococcus sp. PCC 7002. The transporter insert with flanking homology regions is provided as SEQ ID 16.

[0173] Strain Construction.

[0174] As described above, JCC803 is a strain of Synechococcus sp. PCC 7002 that has been engineered to produce esters of fatty acids (such as those found in biodiesel) when incubated in the presence of alcohols. The strain contains a thioesterase (tesA), an acyl-CoA synthetase (fadD) and a wax synthase (wxs) inserted into plasmid pAQ1 by homologous recombination.

[0175] The genes present in pJB161 and pJB1074 were integrated into the plasmid pAQ7 in Synechococcus sp. PCC 7002 (specifically, strain JCC803) using the following procedure. A 5 ml culture of JCC803 in A+ medium containing 200 mg/L spectinomycin was incubated in an Infors shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)) until it reached an OD.sub.730 of 1.14. For each plasmid, 500 .mu.l of culture and 5 .mu.g of plasmid DNA were added into a microcentrifuge tube. The tubes were then incubated at 37.degree. C. in the dark rotating on a Rotamix RKSVD (ATR, Inc.) on a setting of approximately 20. After 4 hours for pJB161 or 7 hours for pJB1074, the cells were pelleted using a microcentrifuge. All but .about.100 .mu.l of the supernatants were removed and the cell pellets were resuspended using the remaining supernatant and plated on A+ agar plates. The plates were incubated for about 24 hours in a Percival lighted incubator under constant illumination (40-60 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)) at 37.degree. C. On the following day, spectinomycin and kanamycin solution was added underneath the agar of the plates to an estimated concentration of 25 mg/L spectinomycin and 50 mg/L kanamycin (assuming 40 ml A+ agar in the plate). These plates were placed back into the incubator until tiny colonies became visible. The plates were then moved to another Percival incubator under the same conditions except that 1% CO.sub.2 was maintained in the air (which allows for faster growth). Approximately 110 colonies formed for recombinant strains resulting from the pJB1074 transformation and 2800 colonies resulting from the pJB161 transformation. A colony from the pJB161 transformation plate was designated JCC1132.

[0176] Thirty colonies were picked from the tolC-acrAB transformation plate and streaked onto both an A+ plate with 100 mg/L spectinomycin and 0.05 mg/L erythromycin and an A+ plate with 100 mg/L spectinomycin and 0.1 mg/L erythromycin. Erythromycin is a substrate for the TolC-AcrAB transporter (Chollet et al. 2004) and served to verify function of the transporter in naturally erythromycin-sensitive Synechococcus sp. PCC 7002. The plates were incubated in a Percival lighted incubator at 37.degree. C. under constant illumination (40-60 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)). After two days, slight growth was visible on both plates. Eight days after streaking, variable growth and survival were evident on most of the streaks on the 0.05 mg/L erythromycin plate. On the 0.1 mg/L erythromycin plate, all of the streaks except for two had become nonviable. The same source colonies that produced the two viable streaks on 0.1 mg/L erythromycin produced streaks that were healthy on the 0.05 mg/L erythromycin plate. One of these strains on the 0.1 mg/L erythromycin plate was designated JCC1585 (see Table 13 for a list of strains).

TABLE-US-00016 TABLE 13 Strains and control strain investigated for the secretion of butyl esters.
JCC #	Parent strain	Recombinant genes/Promoters with loci	Marker
JCC1132	JCC803	pAQ1::P.sub.trc-tesA-fadD-wxs-aadA; pAQ7::kan.sup.r	spectinomycin, kanamycin
JCC1585	JCC803	pAQ1::P.sub.trc-tesA-fadD-wxs-aadA; pAQ7::P.sub.psaA-tolC-P.sub.tsr2142-acrAB-kan.sup.r	spectinomycin, kanamycin

[0177] Erythromycin Tolerance in Liquid Culture.

[0178] To verify the improved tolerance of JCC1585 to erythromycin compared to JCC1132, a 5 ml A+ culture containing 200 mg/L spectinomycin and 0.5 mg/L erythromycin (JCC1585) or containing 200 mg/L spectinomycin and 50 mg/L kanamycin (JCC1132) was used to inoculate 30 ml of JB 2.1 containing 200 mg/L spectinomycin and 0.5, 0.6, 0.7, 0.8, 0.9, or 1 mg/L erythromycin in 125 ml culture flasks at an OD.sub.730 of 0.1. These cultures were incubated in an Infors shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)). Timepoints were taken at 5 and 10 days of growth, during which water loss was replaced through addition of milli-Q water. Table 14 shows OD.sub.730 values of JCC1132 and JCC1585 cultures at days 5 and 10 with different concentrations of erythromycin present in the medium. The JCC1585 cultures were tolerant of erythromycin concentrations of up to 1 mg/L (the highest concentration tested) after 10 days, while the JCC1132 cultures had bleached under all concentrations of erythromycin tested.

TABLE-US-00017 TABLE 14
Strain	Erythromycin Concentration (mg/L)	OD.sub.730 Start of Experiment	OD.sub.730 Day 5	OD.sub.730 Day 10*
JCC1132	0.5	0.1	5.72	--
JCC1132	0.6	0.1	4.76	--
JCC1132	0.7	0.1	4.98	--
JCC1132	0.8	0.1	2.94	--
JCC1132	0.9	0.1	2.50	--
JCC1132	1.0	0.1	2.26	--
JCC1585	0.5	0.1	6.60	7.34
JCC1585	0.6	0.1	6.34	6.20
JCC1585	0.7	0.1	5.82	5.74
JCC1585	0.8	0.1	5.80	4.84
JCC1585	0.9	0.1	5.34	5.04
JCC1585	1.0	0.1	5.58	5.12
*"--" indicates culture had bleached

[0179] To verify the improved tolerance of JCC1585 to erythromycin compared to JCC1132, a 5 ml A+ culture containing 200 mg/L spectinomycin and 0.5 mg/L erythromycin (JCC1585) or containing 200 mg/L spectinomycin and 50 mg/L kanamycin (JCC1132) was used to inoculate 30 ml of JB 2.1 media containing 200 mg/L spectinomycin and 0.5, 0.6, 0.7, 0.8, 0.9, or 1 mg/L erythromycin in 125 ml culture flasks at an OD.sub.730 of 0.1. These cultures were incubated in an Infors shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)). Timepoints were taken at 5 and 10 days of growth, during which water loss was replaced through addition of milli-Q water. The JCC1585 cultures were tolerant of erythromycin concentrations of up to 1 mg/L (the highest concentration tested) after 10 days, while the JCC1132 cultures had bleached under all concentrations of erythromycin tested (Table 14).

[0180] Culture Conditions.

[0181] To test for secretion of butyl esters, 5 ml A+ cultures with 200 mg/L spectinomycin and 50 mg/L kanamycin were inoculated from colonies for JCC1132 and JCC1585. These cultures were used to inoculate duplicate 30 ml cultures in JB2.1 medium containing 200 mg/L spectinomycin and 50 mg/L kanamycin. At the beginning of the experiment, 15 .mu.l butanol (Sigma 34867) was added to each flask so that fatty acid butyl esters (FABEs) would be produced by the cultures. These cultures were incubated in an Infors shaking incubator at 150 rpm at 37.degree. C. under 2% CO.sub.2/air and continuous light (70-130 .mu.E m.sup.-2 s.sup.-1 PAR, measured with a LI-250A light meter (LI-COR)) for three days. At day 4 of the experiment, 7.5 .mu.l butanol was added to the cultures to compensate for the experimentally determined stripping rate of butanol under these conditions. Water loss through evaporation was replaced with the addition of sterile Milli-Q water at day 7 and OD.sub.730 readings were taken for each culture.
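The butanol additions above can be checked with simple volume arithmetic: 15 .mu.l in a 30 ml culture corresponds to 0.05% (vol/vol), and the 7.5 .mu.l top-up is fifty percent of that starting volume. The sketch below is illustrative only; the function names are hypothetical, and the fifty-percent top-up fraction is borrowed from the replenishment rule stated in Example 5.

```python
def alcohol_volume_ul(culture_ml: float, percent_vol: float) -> float:
    """Volume of alcohol (microlitres) giving percent_vol % (vol/vol) in culture_ml."""
    return culture_ml * 1000.0 * percent_vol / 100.0


def percent_vol(alcohol_ul: float, culture_ml: float) -> float:
    """% (vol/vol) obtained by adding alcohol_ul microlitres to culture_ml millilitres."""
    return 100.0 * alcohol_ul / (culture_ml * 1000.0)


def top_up_ul(starting_ul: float, fraction: float = 0.5) -> float:
    """Replacement dose as a fraction of the starting volume (0.5 per Example 5)."""
    return starting_ul * fraction


start = alcohol_volume_ul(culture_ml=30.0, percent_vol=0.05)
print(start, top_up_ul(start), percent_vol(15.0, 30.0))
```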

[0182] Detection of Butyl Esters.

[0183] An aliquot of 250 .mu.l was removed from each culture and centrifuged at 1500 rpm in a Microcentrifuge 5424 (Eppendorf) for .about.2 min. The supernatants were removed and the pellets were suspended in 500 .mu.l milli-Q H.sub.2O. The samples were centrifuged and the supernatants discarded. An additional centrifugation step for 4 min was performed, and any remaining supernatant was removed. The weight of the tube and the cell pellet was measured. One milliliter of acetone (Acros Organics 326570010) containing 100 mg/L butylated hydroxytoluene (BHT, Sigma-Aldrich B1378) and 100 mg/L ethyl arachidate (Sigma A9010) was added to each pellet, and the mixture was pipetted up and down until none of the pellet remained on the wall of the tube. Each tube was then vortexed for 15 s, and the weight of the tube, acetone solution, and cells was taken. The tubes were then spun down and 500 .mu.l of supernatant was submitted for GC analysis. From these samples, the percent dry cell weights of fatty acid butyl esters in the cell pellets were determined.

[0184] To quantify FABEs in the medium, 300 .mu.L of a 20% (v/v) Span80 (Fluka 85548) solution was added to each flask and mixed by swirling for 30 seconds. These mixtures were then poured into 50 mL Falcon tubes. Five mL of isooctane containing 0.01% BHT and 0.005% ethyl arachidate was added to the flasks and swirled for several seconds. The solutions were then poured into the appropriate 50 mL Falcon tubes containing the culture from the flasks. Each tube was then shaken for 10 seconds and centrifuged using a Sorvall RC6 Plus superspeed centrifuge (Thermo Electron Corp) and a F13S-14X50CY rotor (6000 rpm for 20 min). One milliliter of the organic phase (upper phase) was removed and submitted for GC analysis.

[0185] The butyl esters produced by JCC803 and JCC803-derived strains were identified by GC/MS employing an Agilent 7890A GC/5975C EI-MS equipped with a 7683 series autosampler. One microliter of each sample was injected into the GC inlet using a pulsed splitless injection (pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 mL/min) and an inlet temperature of 280.degree. C. The column was an HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 mL/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree./min increase to 280.degree. C.; hold ten minutes. The GC/MS interface was 290.degree. C., and the MS range monitored was 25 to 600 amu. Butyl myristate [retention time (rt): 19.72 min], butyl palmitate (rt: 21.58 min), butyl heptadecanoate (rt: 22.40 min), butyl oleate (rt: 23.04 min) and butyl stearate (rt: 23.24 min) were identified by matching experimentally determined mass spectra associated with the peaks with mass spectral matches found by searching in a NIST 08 MS database.

[0186] An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used to quantify the butyl esters. One microliter of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 mL/min), which was at a temperature of 280.degree. C. The column was an HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m), and the carrier gas was helium at a flow of 1.0 mL/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree./min increase to 280.degree. C.; hold ten minutes. Butyl myristate (rt: 19.68 min), butyl palmitate (rt: 21.48 min), butyl heptadecanoate (rt: 22.32 min), butyl oleate (rt: 22.95 min) and butyl stearate (rt: 23.14 min) were quantified by determining appropriate response factors for the number of carbons present in the butyl esters from commercially available fatty acid ethyl esters (FAEEs) and FABEs. The calibration curves were prepared for ethyl laurate (Sigma 61630), ethyl myristate (Sigma E39600), ethyl palmitate (Sigma P9009), ethyl oleate (Sigma 268011), ethyl stearate (Fluka 85690), butyl laurate (Sigma W220604) and butyl stearate (Sigma S5001). The concentrations of the butyl esters present in the extracts were determined and normalized to the concentration of ethyl arachidate (internal standard).

[0187] Peaks with areas greater than 0.05 could be integrated by the Chemstation.TM. software (Agilent.RTM.), and the concentrations of the butyl esters in both the cell pellet extracts and the media extracts were determined from these values. The dry cell weight (DCW) of these strains was calculated from the OD.sub.730 measurement using the observed average DCW/OD relationship of 0.29 g L.sup.-1 OD.sup.-1. In the case of the JCC1585 culture supernatant, small peaks for butyl myristate (flask 1 area: 1.26, flask 2: 2.23) and butyl palmitate (flask 1 area: 5.16, flask 2: 5.62) were observed, while no peak with an area greater than 0.05 at these retention times was found in the media extraction of the JCC1132 cultures. The OD.sub.730 values and the percent dry cell weights of the FABEs in the cell pellets and the media are given in Table 15. The total % DCW of FABEs found in the cell pellets is indicated, as is the % DCW of butyl myristate and butyl palmitate found in the pellets and the media.

TABLE-US-00018 TABLE 15
Strain (flask)	OD.sub.730	FABEs (% DCW)	Pellet butyl myristate + butyl palmitate (% DCW)	Media butyl myristate + butyl palmitate (% DCW)
JCC1585 (1)	9.65	7.76	6.59	0.013
JCC1132 (1)	5.44	4.93	4.20	0
JCC1585 (2)	8.50	7.79	6.65	0.018
JCC1132 (2)	4.48	4.60	3.85	0

[0188] Table 15 shows that the recombinant expression of tolC in an engineered cyanobacterium provides for the secretion of a detectable fraction of esters (in this case, butyl esters) synthesized by the engineered cell. The amount of secretion achieved can be modulated by increasing concentrations of erythromycin or other transporter substrates, and/or through optimization of expression levels (promoter strength and codon optimization strategies) and/or by specifically targeting a cyanobacterial membrane by employing appropriate cyanobacterial N-terminal leader sequences.
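To put the detectable secreted fraction in perspective, the share of butyl myristate plus butyl palmitate recovered from the media can be expressed relative to the pellet plus media total. The sketch below uses the values of Table 15; the function name and the choice of denominator (pellet plus media) are assumptions made for illustration and are not a calculation reported in the disclosure.

```python
# (pellet % DCW, media % DCW) for butyl myristate + butyl palmitate, from Table 15.
TABLE_15 = {
    "JCC1585 (1)": (6.59, 0.013),
    "JCC1132 (1)": (4.20, 0.0),
    "JCC1585 (2)": (6.65, 0.018),
    "JCC1132 (2)": (3.85, 0.0),
}


def secreted_fraction_percent(pellet: float, media: float) -> float:
    """Media share of the combined pellet + media amount, in percent."""
    return 100.0 * media / (pellet + media)


for strain, (pellet, media) in TABLE_15.items():
    print(strain, round(secreted_fraction_percent(pellet, media), 3))
```

On these figures the secreted share for JCC1585 is a fraction of one percent of the two esters, and zero for the JCC1132 control.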

Example 7

Secretion of Fatty Acids in Thermosynechococcus elongatus BP-1 (.DELTA.aas)

[0189] Strain Construction.

[0190] The Thermosynechococcus elongatus BP-1 long-chain-fatty-acid CoA ligase gene (aas, GenBank accession number NP_682091.1) was replaced with a thermostable kanamycin resistance marker (kan_HTK, GenBank accession number AB121443.1) as follows:

[0191] Regions of homology flanking the BP-1 aas gene (Accession Number: NP_682091.1) were amplified directly from BP-1 genomic DNA using the primers in Table 16. PCR amplifications were performed with Phusion High Fidelity PCR Master Mix (New England BioLabs) and standard amplification conditions.

TABLE-US-00019 TABLE 16
Primer	Sequence	SEQ ID NO:	Restriction site added
Upstream forward	5'-GCTATGCCTGCAGGGGCCTTTTATGAGGAGCGGTA-3'	21	SbfI
Upstream reverse	5'-GCTATGGCGGCCGCTCTTCATGACAGACCCTATGGATACTA-3'	22	NotI
Downstream forward	5'-GCTATGGGCGCGCCTTATCTGACTCCAGACGCAACA-3'	23	AscI
Downstream reverse	5'-GCTATGGGCCGGCCGATCCTTGGATCAACTCACCCT-3'	24	FseI
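As an illustrative check, each primer of Table 16 can be scanned for the recognition sequence of the restriction site it is stated to add. The primer sequences are copied from Table 16; the recognition sequences (SbfI CCTGCAGG, NotI GCGGCCGC, AscI GGCGCGCC, FseI GGCCGGCC) are the standard published sequences and are supplied here as outside knowledge, not from the disclosure. The function name is hypothetical.

```python
# Standard recognition sequences (not stated in the source text).
RECOGNITION_SITES = {
    "SbfI": "CCTGCAGG",
    "NotI": "GCGGCCGC",
    "AscI": "GGCGCGCC",
    "FseI": "GGCCGGCC",
}

# Primer sequences (5' to 3') and the site each is stated to add, from Table 16.
PRIMERS = {
    "Upstream forward": ("GCTATGCCTGCAGGGGCCTTTTATGAGGAGCGGTA", "SbfI"),
    "Upstream reverse": ("GCTATGGCGGCCGCTCTTCATGACAGACCCTATGGATACTA", "NotI"),
    "Downstream forward": ("GCTATGGGCGCGCCTTATCTGACTCCAGACGCAACA", "AscI"),
    "Downstream reverse": ("GCTATGGGCCGGCCGATCCTTGGATCAACTCACCCT", "FseI"),
}


def site_position(primer: str, enzyme: str) -> int:
    """0-based index of the enzyme's recognition site in the primer, or -1 if absent."""
    return primer.find(RECOGNITION_SITES[enzyme])


for name, (sequence, enzyme) in PRIMERS.items():
    print(name, enzyme, site_position(sequence, enzyme))
```

In each primer the site follows a six-nucleotide 5' extension (GCTATG), so the reported index is 6 in all four cases.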

[0192] The amplified upstream homologous region (UHR) was cloned into the UHR of a pJB5 expression vector containing kan_HTK by digesting the insert and vector individually with SbfI and NotI restriction endonucleases (New England BioLabs) following well known laboratory techniques. Digestions were isolated on 1% TAE agarose gel, purified using a Gel Extraction Kit (Qiagen), and ligated with T4 DNA Ligase (New England BioLabs) incubated at room temperature for 1 hour. The ligated product was transformed into NEB 5-alpha chemically competent E. coli cells (New England BioLabs) using standard techniques and confirmed by PCR. The downstream homologous region (DHR) was cloned into the resulting plasmid following a similar protocol using AscI and FseI restriction endonucleases (New England BioLabs). The final plasmid (pJB1349) was purified using QIAprep Spin Miniprep kit (Qiagen) and the construct was confirmed by digestion with HindIII, AseI, and PstI restriction endonucleases (New England BioLabs).

[0193] BP-1 was grown in 5 ml B-HEPES liquid media in a glass test tube (45.degree. C., 120 rpm, 2% CO.sub.2) to an OD.sub.730 of 1.28. A 1 ml aliquot of culture was transferred to a fresh tube and combined with 1 .mu.g of purified pJB1349. The culture was incubated in the dark (45.degree. C., 120 rpm, 2% CO.sub.2) for 4 hours. 4 ml of fresh B-HEPES liquid media were added and the culture was incubated with light (45.degree. C., 120 rpm, 2% CO.sub.2) overnight. 500 .mu.l of the resulting culture were plated in 3 ml of B-HEPES soft agar on B-HEPES plates containing 60 .mu.g/ml kanamycin and placed in an illuminated incubator (45.degree. C., ambient CO.sub.2) until colonies appeared (1 week), then moved into a 2% CO.sub.2 illuminated incubator for an additional week.

[0194] Four randomly selected colonies (samples A-D) were independently grown in 5 ml B-HEPES liquid media with 60 .mu.g/ml kanamycin in glass test tubes (45.degree. C., 120 rpm, 2% CO.sub.2) for one week. Replacement of the aas gene was confirmed by PCR of whole cell genomic DNA using a culture PCR protocol as follows. Briefly, 100 .mu.l of each culture was resuspended in 50 .mu.l lysis buffer (96.8% diH.sub.2O, 1% Triton X-100, 2% 1M Tris pH 8.5, 0.2% 1M EDTA). 10 .mu.l of each suspension were heated 10 min at 98.degree. C. to lyse cells. 1 .mu.l of lysate was used in 15 .mu.l standard PCR reactions using Quick-Load Taq 2.times. Master Mix (New England BioLabs). The PCR product showed correct bands for an unsegregated knockout.
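The lysis buffer recipe above is given as volume percentages of stock solutions. A minimal sketch of the implied arithmetic follows: the percentages are taken from the text, while the function names and the 1000 .mu.l batch size are hypothetical. Under this reading, 2% of a 1 M Tris stock corresponds to 20 mM Tris and 0.2% of a 1 M EDTA stock corresponds to 2 mM EDTA in the final buffer.

```python
# Recipe from the text, as % (vol/vol) of the final lysis buffer.
RECIPE_PERCENT = {
    "diH2O": 96.8,
    "Triton X-100": 1.0,
    "1 M Tris pH 8.5": 2.0,
    "1 M EDTA": 0.2,
}


def component_volumes_ul(total_ul: float) -> dict:
    """Volume of each component (microlitres) for a batch of total_ul microlitres."""
    return {name: total_ul * pct / 100.0 for name, pct in RECIPE_PERCENT.items()}


def final_millimolar(stock_molar: float, percent_vol: float) -> float:
    """Final concentration (mM) of a stock (M) added at percent_vol % (vol/vol)."""
    return stock_molar * 1000.0 * percent_vol / 100.0


print(component_volumes_ul(1000.0))  # hypothetical 1 ml batch
print(final_millimolar(1.0, 2.0), final_millimolar(1.0, 0.2))
```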

[0195] All cultures were maintained in fresh B-HEPES liquid media with 60 .mu.g/ml kanamycin for an additional week. The PCR reaction described above was repeated, again showing correct bands for an unsegregated knockout. Cultures were maintained in liquid culture, and one representative culture was saved as JCC1862.

[0196] Detection and Quantification of Free Fatty Acids in Strains.

[0197] Each of the four independently inoculated cultures described above (samples A-D), as well as BP-1, was analyzed for secretion of free fatty acids. OD.sub.730 was measured, and the volume in each culture tube was recorded. Fresh B-HEPES liquid media was added to each tube to bring the total volume to 5 ml and free fatty acids were extracted as follows:

[0198] Samples were acidified with 50 .mu.l 1N HCl. 500 .mu.l of 250 g/L methyl-.beta.-cyclodextrin solution was added and samples were transferred to 15-ml conical tubes after pulse-vortexing. 1 ml of 50 mg/L butylated hydroxytoluene in isooctane was added to each tube. Samples were vortexed 20 s, then centrifuged 5 min at 6000 RCF to fractionate. 500 .mu.l of the isooctane layer were placed into a new tube and submitted for GC analysis.

[0199] Concentrations of octanoic acid, decanoic acid, lauric acid, myristic acid, palmitoleic acid, palmitic acid, oleic acid, stearic acid, and 1-nonadecene extractants were quantitated by gas chromatography/flame ionization detection (GC/FID). Unknown peak areas in biological samples were converted to concentrations via linear calibration relationships determined between known authentic standard concentrations and their corresponding GC-FID peak areas. Standards were obtained from Sigma. GC-FID conditions were as follows. An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used. 1 .mu.l of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 ml/min) and an inlet temperature of 280.degree. C. The column was a HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 ml/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree. C./min increase to 280.degree. C.; hold ten minutes.
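The conversion of peak areas to concentrations via a linear calibration relationship can be sketched as an ordinary least-squares fit of standard peak area against standard concentration, followed by inversion of the fitted line. The standard levels and areas below are hypothetical, and the disclosure does not specify the fitting procedure or whether an intercept was used, so this is one plausible reading only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def area_to_concentration(area, slope, intercept):
    """Invert the calibration line to convert a peak area to a concentration."""
    return (area - intercept) / slope


# Hypothetical authentic-standard series: concentration (mg/L) vs. GC-FID peak area.
std_conc = [5.0, 10.0, 25.0, 50.0]
std_area = [10.0, 20.0, 50.0, 100.0]

slope, intercept = fit_line(std_conc, std_area)
print(slope, intercept, area_to_concentration(30.0, slope, intercept))
```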

[0200] GC results showed that the unsegregated aas knockout increased fatty acid production relative to BP-1 (Table 17), with myristic and oleic acid making up the majority of the increase (Table 18).

TABLE-US-00020 TABLE 17 Fatty Acid Production by Sample
Sample	OD.sub.730	Fatty acids (% DCW in media)	Fatty acids (mg/L)
A	6.25	0.20	3.66
B	5.20	0.11	1.71
C	5.60	0.24	3.85
D	5.80	0.23	3.83
BP-1	6.90	0.04	0.88

TABLE-US-00021 TABLE 18 Fatty Acid Production by Type
Sample	Myristic (mg/L)	Palmitic (mg/L)	Oleic (mg/L)
A	0.119	0.051	0.032
B	0.000	0.072	0.042
C	0.134	0.063	0.040
D	0.130	0.060	0.038
BP-1	0.000	0.044	0.000
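For illustration, the increase in secreted fatty acids relative to BP-1 can be expressed as a fold change computed from the mg/L column of Table 17. The values are copied from Table 17; the function name is hypothetical, and the fold-change metric itself is not reported in the disclosure.

```python
# Total secreted fatty acids in mg/L, from Table 17.
TABLE_17_MG_PER_L = {"A": 3.66, "B": 1.71, "C": 3.85, "D": 3.83, "BP-1": 0.88}


def fold_over_control(values: dict, control: str = "BP-1") -> dict:
    """Ratio of each sample to the control sample."""
    base = values[control]
    return {name: v / base for name, v in values.items() if name != control}


folds = fold_over_control(TABLE_17_MG_PER_L)
for sample, fold in folds.items():
    print(sample, round(fold, 2))
```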

Example 8

Increased Production of Fatty Acids and Fatty Esters in Thermosynechococcus elongatus BP-1 (.DELTA.aas)

[0201] Transformation of BP-1.

[0202] As disclosed in PCT/US2010/042667, filed Jul. 20, 2010, Thermosynechococcus elongatus BP-1 is transformed with integration or expression plasmids using the following protocol. 400 ml Thermosynechococcus elongatus BP-1 in B-HEPES medium is grown in a 2.8 l Fernbach flask to an OD.sub.730 of 1.0 in an Infors Multitron II shaking photoincubator (55.degree. C.; 3.5% CO.sub.2; 150 rpm). For each transformation, 50 ml cell culture is pelleted by centrifugation for 20 min (22.degree. C.; 6000 rpm). After removing the supernatant, the cell pellet is resuspended in 500 .mu.l B-HEPES and transferred to a 15 ml Falcon tube. To each 500 .mu.l BP-1 cell suspension (OD.sub.730 of .about.100), 25 .mu.g undigested plasmid (or no DNA) is added. The cell-DNA suspension is incubated in a New Brunswick shaking incubator (45.degree. C.; 250 rpm) in low light (.about.3 .mu.mol photons m.sup.-2 s.sup.-1). Following this incubation, the cell-DNA suspension is made up to 1 ml by addition of B-HEPES, mixed by gentle vortexing with 2.5 ml of molten B-HEPES 0.82% top agar solution equilibrated at 55.degree. C., and spread out on the surface of a B-HEPES 1.5% agar plate (50 ml volume). Plates are left to sit at room temperature for 10 min to allow solidification of the top agar, after which time plates are placed in an inverted position in a Percival photoincubator and left to incubate for 24 hr (45.degree. C.; 1% CO.sub.2; 95% relative humidity) in low light (7-12 .mu.mol photons m.sup.-2 s.sup.-1). After 24 hr, the plates are underlaid with 300 .mu.l of 10 mg/ml kanamycin so as to obtain a final kanamycin concentration of 60 .mu.g/ml following complete diffusion in the agar. Underlaid plates are placed back in the Percival incubator and left to incubate (45.degree. C.; 1% CO.sub.2; 95% relative humidity; 7-12 .mu.mol photons m.sup.-2 s.sup.-1) for twelve days.
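The underlay arithmetic above (300 .mu.l of 10 mg/ml kanamycin diffusing into a 50 ml plate to give 60 .mu.g/ml) can be sketched as follows. The function names are hypothetical, and the calculation assumes, as the stated figures imply, that the plate volume is taken as 50 ml and that the top agar and underlay volumes are neglected.

```python
def final_ug_per_ml(stock_mg_per_ml: float, underlay_ul: float, plate_ml: float) -> float:
    """Final antibiotic concentration (ug/ml) after complete diffusion into the plate."""
    mass_ug = stock_mg_per_ml * underlay_ul  # mg/ml equals ug/ul, so this product is in ug
    return mass_ug / plate_ml


def underlay_volume_ul(target_ug_per_ml: float, plate_ml: float, stock_mg_per_ml: float) -> float:
    """Stock volume (ul) needed to reach a target concentration in the plate."""
    return target_ug_per_ml * plate_ml / stock_mg_per_ml


print(final_ug_per_ml(stock_mg_per_ml=10.0, underlay_ul=300.0, plate_ml=50.0))
print(underlay_volume_ul(target_ug_per_ml=60.0, plate_ml=50.0, stock_mg_per_ml=10.0))
```

The same helper could be applied to other underlay steps, provided a stock concentration is known; where the text gives only a target concentration and plate volume, any stock concentration used would be an assumption.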

[0203] Increased Fatty Acids in BP-1.

[0204] Thermosynechococcus elongatus BP-1 (.DELTA.aas) is first constructed as described in the above Example. BP-1(.DELTA.aas) is shown to have elevated intracellular and extracellular levels of free fatty acids relative to wild-type. Mechanistic analysis suggests that cells lacking an acyl-ACP synthetase are unable to recycle exogenous or extracellular fatty acids: the extracellular fatty acid chains are diverted away from transport into the inner cellular membrane, while other transport systems are thought to continue to export fatty acids. Therefore, to up-regulate fatty acid production, BP-1(.DELTA.aas) is transformed with a plasmid (e.g., pJB1349) carrying a thioesterase gene (see Table 3A). The increased cellular level of fatty acid production may be attributed to the combination of the aas deletion, which decreases import of extracellular fatty acids, and the addition of the thioesterase gene and/or thioesterase gene homologues.

[0205] Fatty Acid Esters.

[0206] The thioesterase gene (tesA) with or without the leader sequence removed (Genbank # NC_000913, ref: Cho and Cronan, 1993), the E. coli acyl-CoA synthetase fadD (Genbank # NC_000913, ref: Kameda and Nunn, 1981) and the wax synthase (wxs) from Acinetobacter baylyi strain ADP1 (Genbank # AF529086.1, ref: Stoveken et al. 2005) genes are designed for codon optimization, checking for secondary structure effects, and removal of any unwanted restriction sites (NdeI, XhoI, BamHI, NgoMIV, NcoI, SacI, BsrGI, AvrII, BmtI, MluI, EcoRI, SbfI, NotI, SpeI, XbaI, PacI, AscI, FseI). These genes are engineered into plasmid or integration vectors (e.g., pJB1349) and assembled into a two gene operon (fadD-wxs) or a three gene operon (tesA-fadD-wxs) with flanking sites on the integration vector corresponding to integration sites for transformation into Thermosynechococcus elongatus BP-1. Integration sites include TS1, TS2, TS3 and TS4. A preferred integration site is the site of the aas gene. For the production of fatty acid esters, host cells are cultured in growth media containing small amounts of ethanol (1-10%), with the operon expressed under an appropriate promoter such as Pnir.
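The check for unwanted restriction sites described above can be sketched as a simple string scan. The following Python example is an illustration only, not the design procedure actually used. The recognition sequences are not given in this disclosure; they are the commonly published sites for the listed enzymes and should be verified against the enzyme supplier's documentation. The function name and the test sequence are hypothetical.

```python
# Recognition sequences (5'->3') for the enzymes named in the text.
# ASSUMPTION: these are standard published sites, not taken from this disclosure.
UNWANTED_SITES = {
    "NdeI": "CATATG", "XhoI": "CTCGAG", "BamHI": "GGATCC", "NgoMIV": "GCCGGC",
    "NcoI": "CCATGG", "SacI": "GAGCTC", "BsrGI": "TGTACA", "AvrII": "CCTAGG",
    "BmtI": "GCTAGC", "MluI": "ACGCGT", "EcoRI": "GAATTC", "SbfI": "CCTGCAGG",
    "NotI": "GCGGCCGC", "SpeI": "ACTAGT", "XbaI": "TCTAGA", "PacI": "TTAATTAA",
    "AscI": "GGCGCGCC", "FseI": "GGCCGGCC",
}


def find_unwanted_sites(sequence):
    """Return (enzyme, 0-based position) for every unwanted site in the sequence.

    All listed sites are palindromic, so scanning one strand is sufficient.
    """
    seq = sequence.upper()
    hits = []
    for enzyme, site in UNWANTED_SITES.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)  # allow overlapping matches
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical test fragment containing one NdeI and one EcoRI site.
print(find_unwanted_sites("AAACATATGCCCGAATTCTTT"))
```

A designed coding sequence would be expected to return an empty list; any hit marks a position where a synonymous codon change could be considered.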

[0207] In another embodiment, a Thermosynechococcus elongatus BP-1 host cell with a two gene operon (fadD-wxs) or a three gene operon (tesA-fadD-wxs) is engineered to have ethanol producing genes (PCT/US2009/035937, filed Mar. 3, 2009; PCT/US2009/055949, filed Sep. 3, 2009; PCT/US2009/057694, filed Sep. 21, 2009) conferring the ability to produce fatty acid esters. In one plasmid construct, genes for ethanol production, including pyruvate decarboxylase from Zymomonas mobilis (pdc.sub.Zm) and alcohol dehydrogenase from Moorella sp. HUC22-1 (adhA.sub.M), are engineered into a plasmid and transformed into BP-1. In an alternate plasmid construct, the pyruvate decarboxylase from Zymobacter palmae (pdc.sub.Zp) and alcohol dehydrogenase from Moorella sp. HUC22-1 (adhA.sub.M) are engineered into a plasmid and transformed into BP-1. These genes are engineered into plasmid or integration vectors (e.g., pJB1349) with flanking sites on the integration vector corresponding to integration sites for transformation into Thermosynechococcus elongatus BP-1. Integration sites include TS1, TS2, TS3 and TS4. A preferred integration site is the site of the aas gene. In one configuration, expression of pdc.sub.Zm and adhA.sub.M is driven by .lamda. phage cI ("PcI") and pEM7, and in another expression strain by PcI and PtRNA.sup.Glu. In one embodiment, a single promoter is used to control the expression of both genes. In another embodiment, expression of each gene is controlled by a separate promoter, with PaphII or Pcpcb controlling one and PcI controlling the other.

Example 9

Synechococcus Sp. PCC 7002 (.DELTA.aas) with Various Thioesterases

[0208] Strain Construction.

[0209] DNA sequences for thioesterase genes tesA, fatB, fatB1, and fatB2 were obtained from Genbank and were purchased from DNA 2.0 following codon optimization, checking for secondary structure effects, and removal of any unwanted restriction sites. Thioesterase gene fatB_mat is a modified form of fatB with its leader sequence removed.

TABLE-US-00022 TABLE 19 Thioesterase sources
Gene name	Organism origin	GenBank protein seq
tesA	Escherichia coli	AAC73596
fatB	Umbellularia californica (California bay)	Q41635
fatB1	Cinnamomum camphora (camphor tree)	Q39473
fatB2	Cuphea hookeriana	AAC49269

[0210] The thioesterase genes were cloned into a pJB5 expression vector containing upstream and downstream regions of homology to aquI (SYNPCC7002_A1189), pAQ3, and pAQ4 by digesting the inserts and vectors individually with AscI and NotI restriction endonucleases (New England BioLabs) following known laboratory techniques. Digestions were isolated on 1% TAE agarose gel, purified using a Gel Extraction Kit (Qiagen), and ligated with T4 DNA Ligase (New England BioLabs) incubated at room temperature for one hour. The ligated product was transformed into NEB 5-alpha chemically competent E. coli cells (New England BioLabs) using standard techniques. Purified plasmid was extracted using the QIAprep Spin Miniprep kit (Qiagen) and constructs were confirmed by PCR.

[0211] Synechococcus sp. PCC 7002 (.DELTA.aas) was grown in 5 ml A+ liquid media with 25 .mu.g/ml gentamicin in a glass test tube (37.degree. C., 120 rpm, 2% CO.sub.2) to OD.sub.730 of 0.98-1.1. 500 .mu.l of culture was combined with 1 .mu.g purified plasmid in 1.5 ml microcentrifuge tubes and incubated in darkness for 3-4 hours. Samples were then plated on A+ agar plates with 3 or 6 mM urea and incubated overnight at 37.degree. C. in the light. Selective antibiotics were introduced to the plates by placing spectinomycin stock solution under the agar at a final concentration of 10 .mu.g/mL, and incubating to allow diffusion of the antibiotic. Plates were incubated at 37.degree. C. with light until plates cleared and individual colonies formed. Plates were then moved to an illuminated incubator at 2% CO.sub.2. Cultures were maintained on liquid or agar A+ media containing 3-6 mM urea with 25 .mu.g/ml gentamicin and 100-200 .mu.g/ml spectinomycin to promote plasmid segregation.

[0212] Thioesterase integration and attenuation were confirmed by PCR of whole-cell genomic DNA by a "culture PCR" protocol. Briefly, 100 .mu.l of each culture was resuspended in 50 .mu.l water or lysis buffer (96.8% diH.sub.2O, 1% Triton X-100, 2% 1M tris pH 8.5, 0.2% 1M EDTA). 10 .mu.l of each suspension was heated for 10 min at 98.degree. C. to lyse cells. 1 .mu.l of lysate was used in 10 .mu.l standard PCR reactions using Quick-Load Taq 2.times. Master Mix (New England BioLabs) or Platinum PCR Supermix HiFi (Invitrogen). PCR products showed correct bands for segregated (aquI, pAQ4) and unsegregated (pAQ3) integrants.

[0213] Detection and Quantification of Free Fatty Acids in Strains.

[0214] Individual colonies were grown in A+ liquid media with 3 mM urea, 50 .mu.g/ml gentamicin, and 200 .mu.g/ml spectinomycin in glass test tubes (see Table 20). Cultures were maintained in liquid culture to promote segregation (37.degree. C., 120 rpm, 2% CO.sub.2). Liquid cultures were diluted to OD.sub.730=0.2 in 5 ml A+ liquid media with 3 mM urea and no antibiotics in glass test tubes and incubated for seven days (37.degree. C., 120 rpm, 2% CO.sub.2). After one week, OD.sub.730 was recorded and free fatty acids were extracted as follows:

[0215] Samples were acidified with 50 .mu.l 1N HCl. 500 .mu.l of 250 g/L methyl-.beta.-cyclodextrin solution was added, and samples were transferred to 15-ml conical tubes after pulse-vortexing. 1 ml of 50 mg/L butylated hydroxytoluene in isooctane was added to each tube. Samples were vortexed 20 s and immediately centrifuged 5 min at 6000 RCF to fractionate. 500 .mu.l of the isooctane layer were sub-sampled into a new tube and submitted for GC analysis.

[0216] Concentrations of octanoic acid, decanoic acid, lauric acid, myristic acid, palmitoleic acid, palmitic acid, oleic acid, stearic acid, and 1-nonadecene extractants were quantitated by gas chromatography/flame ionization detection (GC/FID). Unknown peak areas in biological samples were converted to concentrations via linear calibration relationships determined between known authentic standard concentrations and their corresponding GC-FID peak areas. Standards were obtained from Sigma. GC-FID conditions were as follows. An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used. 1 .mu.l of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 ml/min) and an inlet temperature of 280.degree. C. The column was a HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 ml/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree. C./min increase to 280.degree. C.; hold ten minutes.
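The conversion of unknown peak areas to concentrations via a linear calibration can be sketched as an ordinary least-squares fit of peak area against standard concentration, followed by inversion of the fitted line. The Python example below is a minimal sketch under that assumption; the fitting method actually used is not specified above, the function names are hypothetical, and the standard values are invented purely for illustration and are not measured data.

```python
def fit_calibration(concentrations, peak_areas):
    """Least-squares fit of peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def area_to_concentration(peak_area, slope, intercept):
    """Invert the calibration line to estimate concentration from a peak area."""
    return (peak_area - intercept) / slope


# ILLUSTRATIVE standards only (not data from this disclosure): mg/L vs. area units.
standard_conc = [10.0, 25.0, 50.0, 100.0]
standard_area = [205.0, 505.0, 1005.0, 2005.0]

slope, intercept = fit_calibration(standard_conc, standard_area)
print(area_to_concentration(805.0, slope, intercept))
```

One calibration line would be fitted per analyte (for example, one for lauric acid and one for myristic acid), since detector response differs between compounds.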

[0217] GC results showed increased fatty acid secretion in the thioesterase strains relative to Synechococcus sp. PCC 7002 JCC138 (Table 20). The specific enrichment profile of each culture was thioesterase dependent (Table 21).

TABLE-US-00023 TABLE 20 Fatty acid secretion in tesA, fatB_mat strains
Sample	Location	Promoter	Thioesterase	.DELTA.aas	OD.sub.730	Fatty Acids (% DCW in media)	Fatty acids (mg/ml)
JCC 138	--	--	--	--	11.80	0.11	3.81
JCC 1648	pAQ4	P(nir07)	tesA	yes	5.56	2.76	44.45
JCC 1751	pAQ3	P(nir07)	tesA	yes	7.68	2.29	51.10
JCC 1755	pAQ3	P(nir07)	fatB_mat	yes	3.92	1.79	20.38

TABLE-US-00024 TABLE 21 Fatty acids by type (% DCW of compounds)
Sample	Lauric	Myristic	Palmitoleic	Palmitic	Oleic	Stearic
JCC 138	0.000	0.061	0.000	0.000	0.000	0.050
JCC 1648	0.342	1.557	0.238	0.000	0.260	0.360
JCC 1751	0.146	0.539	0.165	1.145	0.158	0.143
JCC 1755	0.940	0.224	0.289	0.143	0.197	0.000
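As a consistency check, the per-species values of Table 21 can be summed and compared with the total % DCW values of Table 20. The Python sketch below uses only the values given in those two tables; the 0.01 tolerance and the variable names are assumptions chosen for illustration.

```python
# Columns: lauric, myristic, palmitoleic, palmitic, oleic, stearic (% DCW), Table 21.
TABLE_21 = {
    "JCC138": [0.000, 0.061, 0.000, 0.000, 0.000, 0.050],
    "JCC1648": [0.342, 1.557, 0.238, 0.000, 0.260, 0.360],
    "JCC1751": [0.146, 0.539, 0.165, 1.145, 0.158, 0.143],
    "JCC1755": [0.940, 0.224, 0.289, 0.143, 0.197, 0.000],
}

# Total fatty acids (% DCW in media) from Table 20.
TABLE_20_TOTAL = {"JCC138": 0.11, "JCC1648": 2.76, "JCC1751": 2.29, "JCC1755": 1.79}


def totals_agree(by_type, totals, tolerance=0.01):
    """Map each strain to True if its summed species match the reported total."""
    return {
        strain: abs(sum(values) - totals[strain]) <= tolerance
        for strain, values in by_type.items()
    }


for strain, values in TABLE_21.items():
    print(strain, round(sum(values), 3), TABLE_20_TOTAL[strain])
print(totals_agree(TABLE_21, TABLE_20_TOTAL))
```

Under this tolerance, all four strains agree between the two tables, which suggests that the Table 20 totals are the sums of the six species listed in Table 21.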

[0218] Individual colonies of JCC1704, JCC1705, and JCC1706 were grown for three days in A+ liquid media with 3 mM urea, 25 .mu.g/ml gentamicin, and 100 .mu.g/ml spectinomycin in glass test tubes (37.degree. C., 120 rpm, 2% CO.sub.2). Cultures were diluted to OD.sub.730=0.2 in 5 ml A+ liquid media with 3 mM urea and no antibiotics in glass test tubes and incubated at 37.degree. C., 120 rpm, 2% CO.sub.2. After 11 days, OD.sub.730 was recorded and free fatty acids were extracted as follows:

[0219] Samples were acidified with 50 .mu.l 1N HCl. 500 .mu.l of 250 g/L methyl-.beta.-cyclodextrin solution was added and samples were transferred to 15-ml conical tubes after pulse-vortexing. 1 ml of 50 mg/L butylated hydroxytoluene in isooctane was added to each tube. Samples were vortexed 20 s and immediately centrifuged 5 min at 6000 RCF to fractionate. 500 .mu.l of the isooctane layer were sub-sampled into a new tube and submitted for GC analysis.

[0220] Concentrations of octanoic acid, decanoic acid, lauric acid, myristic acid, palmitoleic acid, palmitic acid, oleic acid, stearic acid, and 1-nonadecene extractants were quantitated by gas chromatography/flame ionization detection (GC/FID). Unknown peak areas in biological samples were converted to concentrations via linear calibration relationships determined between known authentic standard concentrations and their corresponding GC-FID peak areas. Standards were obtained from Sigma. GC-FID conditions were as follows. An Agilent 7890A GC/FID equipped with a 7683 series autosampler was used. 1 .mu.l of each sample was injected into the GC inlet (split 5:1, pressure: 20 psi, pulse time: 0.3 min, purge time: 0.2 min, purge flow: 15 ml/min) and an inlet temperature of 280.degree. C. The column was a HP-5MS (Agilent, 30 m.times.0.25 mm.times.0.25 .mu.m) and the carrier gas was helium at a flow of 1.0 ml/min. The GC oven temperature program was 50.degree. C., hold one minute; 10.degree. C./min increase to 280.degree. C.; hold ten minutes.
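The duration of the GC oven temperature program stated above follows directly from its three segments. The Python sketch below is illustrative; the function name is hypothetical, and the calculation covers only the oven program, not injection, cool-down, or equilibration time.

```python
def oven_program_minutes(initial_c, initial_hold_min, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp_min = (final_c - initial_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 50 C, hold 1 min; 10 C/min to 280 C; hold 10 min.
print(oven_program_minutes(50, 1, 10, 280, 10))  # 34.0
```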

[0221] GC results showed increased fatty acid secretion relative to JCC138 but to a lesser degree than tesA or fatB_mat (Table 22). The specific enrichment profile of each culture was thioesterase dependent (Table 23).

TABLE-US-00025 TABLE 22 Fatty acid secretion in fatB, fatB1, fatB2 strains
Sample	Location	Promoter	Thioesterase	.DELTA.aas	OD.sub.730	Fatty Acids (% DCW in media)	Fatty acids (mg/ml)
JCC 1648	pAQ4	P(nir07)	tesA	yes	11.2	6.66	216.283
JCC 1648	pAQ4	P(nir07)	tesA	yes	11.6	5.74	193.236
JCC 1704	aquI	P(nir07)	fatB	yes	15.80	0.39	17.72
JCC 1704	aquI	P(nir07)	fatB	yes	16.80	0.40	19.56
JCC 1705	aquI	P(nir07)	fatB1	yes	15.6	0.42	19.19
JCC 1705	aquI	P(nir07)	fatB1	yes	16.3	0.43	20.44
JCC 1706	aquI	P(nir07)	fatB2	yes	17.5	0.40	20.25
JCC 1706	aquI	P(nir07)	fatB2	yes	16.5	0.41	19.86

TABLE-US-00026 TABLE 23 Fatty acids by type (% DCW of compounds)
Sample	Lauric	Myristic	Palmitoleic	Palmitic	Oleic	Stearic
JCC 1648	0.233	1.408	0.264	3.919	0.223	0.611
JCC 1648	0.201	1.196	0.183	3.564	0.131	0.470
JCC 1704	0.000	0.057	0.107	0.073	0.087	0.063
JCC 1704	0.000	0.062	0.113	0.073	0.094	0.060
JCC 1705	0.000	0.058	0.110	0.089	0.099	0.068
JCC 1705	0.000	0.058	0.107	0.092	0.101	0.074
JCC 1706	0.000	0.054	0.098	0.090	0.085	0.071
JCC 1706	0.000	0.056	0.106	0.086	0.100	0.068

Example 10

Fatty Acid Production Under Inducible or Repressible System

[0222] Construction of the Promoter-uidA Expression Plasmid.

[0223] The E. coli uidA gene (Genbank AAB30197) was synthesized by DNA 2.0 (Menlo Park, Calif.), and was subcloned into pJB5. The DNA sequences of the ammonia-repressible nitrate reductase promoters P(nirA) (SEQ ID NO:17), P(nir07) (SEQ ID NO:18), and P(nir09) (SEQ ID NO:19) were obtained from Genbank. The nickel-inducible P(nrsB) promoter (SEQ ID NO:20), nrsS and nrsR were amplified from Synechocystis sp. PCC 6803. The promoters were cloned between NotI and NdeI sites immediately upstream of uidA, which is flanked by NdeI and EcoRI.

[0224] In addition, plasmids containing two 750-bp regions of homology designed to remove the native aquI (A1189) or the ldh (G0164) gene from Synechococcus sp. PCC 7002 were obtained by contract synthesis from DNA 2.0 (Menlo Park, Calif.). Using these vectors, 4 constructs were engineered and tested for GUS activity. Final transformation constructs are listed in Table 24. All restriction and ligation enzymes were obtained from New England Biolabs (Ipswich, Mass.). Ligated constructs were transformed into NEB 5-.alpha. competent E. coli (High Efficiency) (New England Biolabs: Ipswich, Mass.).

TABLE-US-00027 TABLE 24 Genotypes of JCC138 transformants
Insert location	Promoter	Marker
ldh	P(nirA)	kanamycin
aquI	P(nir07)	spectinomycin
aquI	P(nir09)	spectinomycin
ldh	P(nrsB)	kanamycin

[0225] Plasmid Transformation into JCC138.

[0226] The constructs as described above were integrated onto either the genome or pAQ7 of JCC138, both of which are maintained at approximately 7 copies per cell. The following protocol was used for integrating the DNA cassettes. JCC138 was grown in an incubated shaker flask at 37.degree. C. at 1% CO.sub.2 to an OD.sub.730 of 0.8 in A.sup.+ medium. 500 .mu.l of culture was added to a microcentrifuge tube with 1 .mu.g of DNA. DNA was prepared using a Qiagen Qiaprep Spin Miniprep Kit (Valencia, Calif.) for each construct. Cells were incubated in the dark for one hour at 37.degree. C. The entire volume of cells was plated on A.sup.+ plates with 1.5% agar supplemented with 3 mM urea when necessary and grown at 37.degree. C. in an illuminated incubator (40-60 .mu.E/m2/s PAR, measured with a LI-250A light meter (LI-COR)) for approximately 24 hours. 25 .mu.g/mL of spectinomycin or 50 .mu.g/mL of kanamycin was introduced to the plates by placing the stock solution of antibiotic under the agar, and allowing it to diffuse up through the agar. After further incubation, resistant colonies became visible in 6 days. One colony from each plate was restreaked onto A.sup.+ plates with 1.5% agar supplemented with 6 mM urea when necessary and 200 .mu.g/mL spectinomycin or 50 .mu.g/mL of kanamycin.

[0227] Measurement of GUS Activity.

[0228] The GUS (beta-glucuronidase) reporter system was used to test the inducibility or repressibility of several promoters. This system measures the activity of beta-glucuronidase, an enzyme from E. coli that transforms colorless or non-fluorescent substrates into colored or fluorescent products. In this case, MUG (4-methylumbelliferyl .beta.-D-glucuronide) is the substrate, and is hydrolyzed by beta-glucuronidase to produce the fluorescent product MU (4-methylumbelliferone), which is subsequently detected and quantified with a fluorescence spectrophotometer.

[0229] Strains containing uidA constructs under urea repression were incubated to OD.sub.730 between 1.8 and 4. These cells were subcultured to OD.sub.730 0.2 in 5 mL A+ media supplemented with 0, 3, 6, or 12 mM urea plus either 100 .mu.g/mL spectinomycin or 50 .mu.g/ml kanamycin and incubated for 24 hours. JCC138 was cultured in 5 mL A+ media for 24 hours. The strain containing gus under nickel-inducible expression was cultured for 3 days, then subcultured to OD.sub.730 0.2 in 5 mL A+ supplemented with 0, 2, 4, or 8 .mu.M NiSO.sub.4. These cells were incubated for 6 hours. To harvest cells, cultures were spun for 5 minutes at 6000 rpm. Pellets were resuspended in 1 mL 1.times.GUS extraction buffer (1 mM EDTA, 5.6 mM 2-mercaptoethanol, 0.1 M sodium phosphate, pH 7) and lysed with microtip sonication pulsing 0.5 seconds on and 0.5 seconds off for 2 min. Total protein was analyzed with Bio-Rad (Hercules, Calif.) Quick Start Bradford assay, and extracts were subsequently analyzed for GUS activity using a Sigma (St Louis, Mo.) .beta.-Glucuronidase Fluorescent Activity Detection Kit. Relative activities of the 4 promoters are found in Table 25.
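The subculturing step to OD.sub.730 0.2 in 5 mL can be expressed as a simple dilution calculation. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes that OD.sub.730 scales linearly with cell density over the range used, which may require reading a diluted sample at the higher starting densities.

```python
def subculture_volumes(start_od, target_od, final_volume_ml):
    """Return (culture_ml, fresh_medium_ml) needed to reach target_od.

    Uses C1 * V1 = C2 * V2 with optical density as the concentration measure.
    """
    if start_od <= target_od:
        raise ValueError("starting OD must exceed the target OD")
    culture_ml = target_od * final_volume_ml / start_od
    return culture_ml, final_volume_ml - culture_ml


# The two ends of the starting range given above (OD730 of 1.8 and 4).
for od in (1.8, 4.0):
    culture, medium = subculture_volumes(od, 0.2, 5.0)
    print(f"OD {od}: {culture:.2f} mL culture + {medium:.2f} mL medium")
```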

TABLE-US-00028 TABLE 25 GUS activities of inducible/repressible promoters
promoter	mM urea	uM NiSO.sub.4	Activity (ABS/mg .times. 10.sup.6)
P(nirA)	0	--	121.9
P(nirA)	3	--	8
P(nirA)	6	--	11.62
P(nirA)	12	--	7.81
P(nir07)	0	--	396.39
P(nir07)	3	--	23.61
P(nir07)	6	--	30.89
P(nir07)	12	--	33.13
P(nir09)	0	--	97.77
P(nir09)	3	--	12.47
P(nir09)	6	--	12.35
P(nir09)	12	--	12.1
P(nrsB)	--	0	24.97
P(nrsB)	--	2	286.96
P(nrsB)	--	4	257.26
P(nrsB)	--	8	423.77
no uidA gene	--	--	6.4
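The degree of repression or induction implied by Table 25 can be summarized as a ratio of activities. The Python sketch below uses only the Table 25 values; the ratio is a raw fold change without subtraction of the "no uidA gene" background, which is an assumption made here for simplicity and is not a calculation reported in this disclosure. The names are hypothetical.

```python
# GUS activity (ABS/mg x 10^6) from Table 25, keyed by effector concentration.
UREA_REPRESSIBLE = {  # keys are mM urea
    "P(nirA)": {0: 121.9, 3: 8.0, 6: 11.62, 12: 7.81},
    "P(nir07)": {0: 396.39, 3: 23.61, 6: 30.89, 12: 33.13},
    "P(nir09)": {0: 97.77, 3: 12.47, 6: 12.35, 12: 12.1},
}
NICKEL_INDUCIBLE = {  # keys are uM NiSO4
    "P(nrsB)": {0: 24.97, 2: 286.96, 4: 257.26, 8: 423.77},
}


def fold_change(high_activity, low_activity):
    """Raw ratio of two activities (no background subtraction)."""
    return high_activity / low_activity


# Fold repression: activity without urea divided by activity with 3 mM urea.
for promoter, series in UREA_REPRESSIBLE.items():
    print(promoter, "fold repression at 3 mM urea:",
          round(fold_change(series[0], series[3]), 1))

# Fold induction: activity with 8 uM NiSO4 divided by activity without nickel.
for promoter, series in NICKEL_INDUCIBLE.items():
    print(promoter, "fold induction at 8 uM NiSO4:",
          round(fold_change(series[8], series[0]), 1))
```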

[0230] A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. All publications, patents and other references mentioned herein are hereby incorporated by reference in their entirety.

REFERENCES

[0231] Cho, H. and Cronan, J. E. (1993) The Journal of Biological Chemistry 268: 9238-9245.
[0232] Chollet, R. et al. (2004) Antimicrobial Agents and Chemotherapy 48: 3621-3624.
[0233] Kalscheuer, R. et al. (2006a) Microbiology 152: 2529-2536.
[0234] Kalscheuer, R. et al. (2006b) Applied and Environmental Microbiology 72: 1373-1379.
[0235] Kameda, K. and Nunn, W. D. (1981) The Journal of Biological Chemistry 256: 5702-5707.
[0236] Lopez-Mauy et al. (2002) Cell 43: 247-256.
[0237] Nielsen, D. R. et al. (2009) Metabolic Engineering 11: 262-273.
[0238] Qi et al. (2005) Applied and Environmental Microbiology 71: 5678-5684.
[0239] Stoveken, T. et al. (2005) Journal of Bacteriology 187: 1369-1376.
[0240] Tsukagoshi, N. and Aono, R. (2000) Journal of Bacteriology 182: 4803-4810.

INFORMAL SEQUENCE LISTING

TABLE-US-00029 [0241] SEQ ID NO: 1 E. coli TesA amino acid sequence (leader sequence removed) MADTLLILGDSLSAGYRMSASAAWPALLNDKWQSKTSVVNASISGDTSQQGLARLPAL LKQHQPRWVLVELGGNDGLRGFQPQQTEQTLRQILQDVKAANAEPLLMQIRLPANYGR RYNEAFSAIYPKLAKEFDVPLLPFFMEEVYLKPQWMQDDGIHPNRDAQPFIADWMAKQ LQPLVNHDS SEQ ID NO: 2 E. coli FadD amino acid sequence MKKVWLNRYPADVPTEINPDRYQSLVDMFEQSVARYADQPAFVNMGEVMTFRKLEER SRAFAAYLQQGLGLKKGDRVALMMPNLLQYPVALFGILRAGMIVVNVNPLYTPRELEH QLNDSGASAIVIVSNFAHTLEKVVDKTAVQHVILTRMGDQLSTAKGTVVNFVVKYIKRL VPKYHLPDAISFRSALHNGYRMQYVKPELVPEDLAFLQYTGGTTGVAKGAMLTHRNM LANLEQVNATYGPLLHPGKELVVTALPLYHIFALTINCLLFIELGGQNLLITNPRDIPGLV KELAKYPFTAITGVNTLFNALLNNKEFQQLDFSSLHLSAGGGMPVQQVVAERWVKLTG QYLLEGYGLTECAPLVSVNPYDIDYHSGSIGLPVPSTEAKLVDDDDNEVPPGQPGELCV KGPQVMLGYWQRPDATDEIIKNGWLHTGDIAVMDEEGFLRIVDRKKDMILVSGFNVYP NEIEDVVMQHPGVQEVAAVGVPSGSSGEAVKIFVVKKDPSLTEESLVTFCRRQLTGYKV PKLVEFRDELPKSNVGKILRRELRDEARGKVDNKA SEQ ID NO: 3 A. baylyi ADP1 wax synthase amino acids sequence MRPLHPIDFIFLSLEKRQQPMHVGGLFLFQIPDNAPDTFIQDLVNDIRISKSIPVPPFNNKL NGLFWDEDEEFDLDHHFRHIALPHPGRIRELLIYISQEHSTLLDRAKPLWTCNIIEGIEGNR FAMYFKIHHAMVDGVAGMRLIEKSLSHDVTEKSIVPPWCVEGKRAKRLREPKTGKIKKI MSGIKSQLQATPTVIQELSQTVFKDIGRNPDHVSSFQAPCSILNQRVSSSRRFAAQSFDLD RFRNIAKSLNVTINDVVLAVCSGALRAYLMSHNSLPSKPLIAMVPASIRNDDSDVSNRIT MILANLATHKDDPLQRLEIIRRSVQNSKQRFKRMTSDQILNYSAVVYGPAGLNIISGMMP KRQAFNLVISNVPGPREPLYWNGAKLDALYPASIVLDGQALNITMTSYLDKLEVGLIAC RNALPRMQNLLTHLEEEIQLFEGVIAKQEDIKTAN SEQ ID NO: 4 E. 
coli tesA optimized nucleic acid sequence ATGGCGGATACTCTGCTGATTCTGGGTGATTCTCTGTCTGCAGGCTACCGTATGTCCG CCTCCGCGGCCTGGCCAGCTCTGCTGAATGATAAGTGGCAGTCTAAGACGTCCGTTG TGAACGCATCCATCTCTGGCGACACGAGCCAGCAGGGCCTGGCCCGTCTGCCTGCAC TGCTGAAACAGCACCAACCGCGCTGGGTCCTGGTGGAGCTGGGCGGTAACGACGGT CTGCGCGGCTTCCAGCCGCAGCAGACCGAACAGACTCTGCGTCAGATTCTGCAGGA CGTGAAAGCTGCTAACGCGGAACCGCTGCTGATGCAGATTCGTCTGCCAGCGAACT ATGGCCGCCGTTACAACGAAGCGTTCTCTGCAATCTACCCAAAACTGGCGAAAGAG TTTGACGTCCCGCTGCTGCCGTTCTTCATGGAGGAAGTATACCTGAAACCGCAGTGG ATGCAAGATGACGGCATCCACCCGAACCGTGATGCGCAGCCGTTCATCGCTGACTG GATGGCGAAGCAACTGCAGCCGCTGGTAAACCACGATTCCTAA SEQ ID NO: 5 E. coli fadD optimized nucleic acid sequence ATGAAGAAAGTTTGGCTGAACCGTTATCCGGCAGATGTACCGACTGAAATTAACCC AGATCGTTACCAGTCCCTGGTTGACATGTTCGAACAGTCCGTGGCTCGCTACGCCGA TCAGCCTGCTTTCGTCAACATGGGTGAGGTAATGACCTTTCGCAAACTGGAGGAGCG TTCCCGTGCTTTCGCGGCATACCTGCAGCAGGGTCTGGGCCTGAAGAAAGGCGACC GCGTGGCCCTGATGATGCCGAACCTGCTGCAATATCCTGTGGCGCTGTTCGGTATCC TGCGTGCTGGTATGATCGTTGTCAATGTTAACCCTCTGTATACCCCTCGTGAACTGGA GCACCAGCTGAATGACTCTGGTGCGTCTGCTATCGTTATCGTTTCCAATTTCGCACAT ACGCTGGAGAAAGTGGTTGATAAAACCGCAGTGCAGCATGTCATTCTGACTCGCAT GGGTGACCAGCTGTCCACCGCTAAAGGTACTGTAGTCAACTTCGTTGTGAAATACAT TAAGCGCCTGGTTCCGAAATACCACCTGCCAGATGCAATTAGCTTTCGCTCTGCACT GCATAACGGTTACCGTATGCAGTACGTAAAACCAGAGCTGGTGCCGGAAGACCTGG CCTTTCTGCAGTATACCGGCGGCACCACCGGCGTGGCAAAGGGCGCGATGCTGACC CATCGTAACATGCTGGCGAACCTGGAGCAGGTTAACGCAACGTACGGCCCGCTGCT GCACCCGGGTAAAGAACTGGTAGTTACGGCACTGCCTCTGTATCACATCTTTGCACT GACGATCAACTGTCTGCTGTTCATTGAACTGGGTGGTCAGAACCTGCTGATCACCAA CCCGCGTGACATTCCGGGCCTGGTAAAAGAGCTGGCTAAGTACCCGTTCACCGCCAT TACTGGCGTAAACACTCTGTTTAACGCGCTGCTGAACAACAAAGAGTTTCAGCAGCT GGACTTCTCTAGCCTGCACCTGAGCGCTGGCGGTGGCATGCCGGTTCAGCAGGTTGT GGCAGAGCGTTGGGTGAAACTGACCGGCCAGTATCTGCTGGAGGGTTATGGTCTGA CCGAGTGTGCACCGCTGGTCAGCGTTAACCCGTATGATATTGATTACCACTCTGGTT CTATTGGTCTGCCGGTTCCGTCCACGGAAGCCAAACTGGTGGACGATGACGACAAC GAAGTACCTCCGGGCCAGCCGGGTGAGCTGTGTGTCAAGGGTCCGCAGGTTATGCT GGGCTACTGGCAGCGCCCGGACGCCACCGACGAAATCATTAAAAACGGTTGGCTGC 
ATACCGGTGATATCGCTGTAATGGACGAAGAAGGTTTCCTGCGTATCGTGGACCGTA AGAAAGATATGATTCTGGTGAGCGGTTTCAACGTGTACCCGAACGAAATTGAGGAC GTAGTTATGCAACACCCTGGCGTGCAGGAGGTGGCAGCCGTGGGCGTGCCGTCCGG TTCTTCTGGTGAGGCTGTGAAAATCTTTGTCGTTAAAAAGGACCCGTCCCTGACCGA AGAATCTCTGGTGACGTTTTGCCGCCGTCAACTGACTGGCTACAAAGTGCCGAAACT GGTCGAGTTCCGCGATGAGCTGCCAAAATCTAACGTGGGTAAGATCCTGCGCCGCG AGCTGCGTGACGAGGCACGTGGCAAAGTTGACAATAAAGCATAA SEQ ID NO: 6 A. baylyi wsadpl optimized nucleic acid sequence ATGCGCCCACTTCATCCGATCGATTTCATTTTCCTGTCCCTGGAGAAACGCCAGCAG CCGATGCACGTAGGTGGTCTGTTCCTGTTCCAGATCCCGGATAACGCTCCGGACACC TTTATTCAGGACCTGGTGAACGATATCCGTATCTCCAAGTCTATTCCGGTTCCGCCGT TCAACAACAAGCTGAACGGTCTGTTCTGGGACGAAGACGAGGAGTTCGATCTGGAT CACCATTTCCGTCATATTGCGCTGCCGCACCCGGGTCGCATCCGTGAGCTGCTGATT TACATCTCTCAGGAACACAGCACTCTCCTCGATCGCGCTAAACCTCTGTGGACTTGC AACATCATTGAAGGTATCGAGGGTAACCGTTTCGCCATGTACTTCAAGATTCATCAT GCGATGGTGGATGGTGTGGCGGGTATGCGTCTGATTGAGAAAAGCCTGTCCCATGAT GTTACTGAAAAGAGCATCGTACCGCCGTGGTGCGTTGAGGGCAAACGTGCTAAACG CCTGCGTGAACCGAAGACCGGCAAAATTAAGAAAATCATGTCTGGTATTAAATCTC AGCTCCAGGCCACCCCGACCGTTATTCAAGAACTGTCTCAGACGGTCTTCAAAGACA TCGGCCGTAATCCGGACCACGTTTCCTCTTTCCAGGCGCCGTGCTCCATCCTCAACC AGCGTGTGTCTTCTTCTCGTCGTTTCGCAGCACAGAGCTTTGACCTGGACCGTTTCCG CAACATCGCCAAATCTCTGAACGTGACCATTAACGACGTTGTCCTGGCTGTGTGTAG CGGTGCTCTGCGCGCTTATCTGATGTCTCATAACTCTCTGCCATCCAAACCGCTGATC GCTATGGTCCCAGCAAGCATCCGCAACGATGATTCTGATGTGTCCAACCGTATTACT ATGATTCTGGCCAACCTCGCTACTCACAAAGACGACCCTCTGCAGCGTCTGGAAATC ATCCGCCGCTCCGTCCAGAACTCTAAACAGCGTTTTAAACGCATGACTTCCGACCAG ATTCTGAACTATTCTGCGGTTGTATACGGCCCGGCTGGTCTGAACATTATCAGCGGT ATGATGCCGAAACGTCAGGCTTTTAACCTGGTAATCAGCAACGTTCCTGGCCCGCGT GAGCCGCTGTACTGGAACGGCGCAAAACTGGACGCACTGTACCCGGCTTCCATCGTT CTGGATGGCCAGGCTCTGAACATCACTATGACCTCTTACCTGGACAAACTGGAAGTA GGTCTGATCGCGTGTCGCAATGCACTGCCGCGCATGCAGAACCTGCTGACCCACCTG GAGGAGGAAATCCAGCTGTTTGAGGGCGTTATCGCCAAACAGGAAGATATCAAAAC GGCGAACTAA SEQ ID NO: 7 E. 
coli TolC amino acid sequence MKKLLPILIGLSLSGFSSLSQAENLMQVYQQARLSNPELRKSAADRDAAFEKINEARSPL LPQLGLGADYTYSNGYRDANGINSNATSASLQLTQSIFDMSKWRALTLQEKAAGIQDVT YQTDQQTLILNTATAYFNVLNAIDVLSYTQAQKEAIYRQLDQTTQRFNVGLVAITDVQN ARAQYDTVLANEVTARNNLDNAVEQLRQITGNYYPELAALNVENFKTDKPQPVNALLK EAEKRNLSLLQARLSQDLAREQIRQAQDGHLPTLDLTASTGISDTSYSGSKTRGAAGTQ YDDSNMGQNKVGLSFSLPIYQGGMVNSQVKQAQYNFVGASEQLESAHRSVVQTVRSSF NNINASISSINAYKQAVVSAQSSLDAMEAGYSVGTRTIVDVLDATTTLYNAKQELANAR YNYLINQLNIKSALGTLNEQDLLALNNALSKPVSTNPENVAPQTPEQNAIADGYAPDSPA PVVQQTSARTTTSNGHNPFRN SEQ ID NO: 8 E. coli AcrA amino acid sequence MNKNRGFTPLAVVLMLSGSLALTGCDDKQAQQGGQQMPAVGVVTVKTEPLQITTELP GRTSAYRIAEVRPQVSGIILKRNFKEGSDIEAGVSLYQIDPATYQATYDSAKGDLAKAQA AANIAQLTVNRYQKLLGTQYISKQEYDQALADAQQANAAVTAAKAAVETARINLAYT KVTSPISGRIGKSNVTEGALVQNGQATALATVQQLDPIYVDVTQSSNDFLRLKQELANG TLKQENGKAKVSLITSDGIKFPQDGTLEFSDVTVDQTTGSITLRAIFPNPDHTLLPGMFVR ARLEEGLNPNAILVPQQGVTRTPRGDATVLVVGADDKVETRPIVASQAIGDKWLVTEGL KAGDRVVISGLQKVRPGVQVKAQEVTADNNQQAASGAQPEQSKS SEQ ID NO: 9 E. coli AcrB amino acid sequence MPNFFIDRPIFAWVIAIIIMLAGGLAILKLPVAQYPTIAPPAVTISASYPGADAKTVQDTVT QVIEQNMNGIDNLMYMSSNSDSTGTVQITLTFESGTDADIAQVQVQNKLQLAMPLLPQE VQQQGVSVEKSSSSFLMVVGVINTDGTMTQEDISDYVAANMKDAISRTSGVGDVQLFG SQYAMRIWMNPNELNKFQLTPVDVITAIKAQNAQVAAGQLGGTPPVKGQQLNASIIAQT RLTSTEEFGKILLKVNQDGSRVLLRDVAKIELGGENYDIIAEFNGQPASGLGIKLATGAN ALDTAAAIRAELAKMEPFFPSGLKIVYPYDTTPFVKISIHEVVKTLVEAIILVFLVMYLFL QNFRATLIPTIAVPVVLLGTFAVLAAFGFSINTLTMFGMVLAIGLLVDDAIVVVENVERV MAEEGLPPKEATRKSMGQIQGALVGIAMVLSAVFVPMAFFGGSTGAIYRQFSITIVSAM ALSVLVALILTPALCATMLKPIAKGDHGEGKKGFFGWFNRMFEKSTHHYTDSVGGILRS TGRYLVLYLIIVVGMAYLFVRLPSSFLPDEDQGVFMTMVQLPAGATQERTQKVLNEVT HYYLTKEKNNVESVFAVNGFGFAGRGQNTGIAFVSLKDWADRPGEENKVEAITMRATR AFSQIKDAMVFAFNLPAIVELGTATGFDFELIDQAGLGHEKLTQARNQLLAEAAKHPDM LTSVRPNGLEDTPQFKIDIDQEKAQALGVSINDINTTLGAAWGGSYVNDFIDRGRVKKV

YVMSEAKYRMLPDDIGDWYVRAADGQMVPFSAFSSSRWEYGSPRLERYNGLPSMEILG QAAPGKSTGEAMELMEQLASKLPTGVGYDWTGMSYQERLSGNQAPSLYAISLIVVFLC LAALYESWSIPFSVMLVVPLGVIGALLAATFRGLTNDVYFQVGLLTTIGLSAKNAILIVEF AKDLMDKEGKGLIEATLDAVRMRLRPILMTSLAFILGVMPLVISTGAGSGAQNAVGTGV MGGMVTATVLAIFFVPVFFVVVRRRFSRKNEDIEHSHTVDHH SEQ ID NO: 10 PaphII underlined; tesA, fadD and wsadpl are in bold and follow the promoter in order GCGGCCGCGGGGGGGGGGGGGAAAGCCACGTTGTGTCTCAAAATCTCTGATGTTACATTGCACAAGAT AAAAATATATCATCATGAACAATAAAACTGTCTGCTTACATAAACAGTAATACAAGGGGTCATATGG CGGATACTCTGCTGATTCTGGGTGATTCTCTGTCTGCAGGCTACCGTATGTCCGCCTCCGCGGC CTGGCCAGCTCTGCTGAATGATAAGTGGCAGTCTAAGACGTCCGTTGTGAACGCATCCATCTCT GGCGACACGAGCCAGCAGGGCCTGGCCCGTCTGCCTGCACTGCTGAAACAGCACCAACCGCGC TGGGTCCTGGTGGAGCTGGGCGGTAACGACGGTCTGCGCGGCTTCCAGCCGCAGCAGACCGAA CAGACTCTGCGTCAGATTCTGCAGGACGTGAAAGCTGCTAACGCGGAACCGCTGCTGATGCAGA TTCGTCTGCCAGCGAACTATGGCCGCCGTTACAACGAAGCGTTCTCTGCAATCTACCCAAAACT GGCGAAAGAGTTTGACGTCCCGCTGCTGCCGTTCTTCATGGAGGAAGTATACCTGAAACCGCAG TGGATGCAAGATGACGGCATCCACCCGAACCGTGATGCGCAGCCGTTCATCGCTGACTGGATGG CGAAGCAACTGCAGCCGCTGGTAAACCACGATTCCTAATTAAAGATCTGTAGTAGGATCCATGTAG GGTGAGGTTATAGCTATGAAGAAAGTTTGGCTGAACCGTTATCCGGCAGATGTACCGACTGAAAT TAACCCAGATCGTTACCAGTCCCTGGTTGACATGTTCGAACAGTCCGTGGCTCGCTACGCCGAT CAGCCTGCTTTCGTCAACATGGGTGAGGTAATGACCTTTCGCAAACTGGAGGAGCGTTCCCGTG CTTTCGCGGCATACCTGCAGCAGGGTCTGGGCCTGAAGAAAGGCGACCGCGTGGCCCTGATGAT GCCGAACCTGCTGCAATATCCTGTGGCGCTGTTCGGTATCCTGCGTGCTGGTATGATCGTTGTC AATGTTAACCCTCTGTATACCCCTCGTGAACTGGAGCACCAGCTGAATGACTCTGGTGCGTCTG CTATCGTTATCGTTTCCAATTTCGCACATACGCTGGAGAAAGTGGTTGATAAAACCGCAGTGCAG CATGTCATTCTGACTCGCATGGGTGACCAGCTGTCCACCGCTAAAGGTACTGTAGTCAACTTCGT TGTGAAATACATTAAGCGCCTGGTTCCGAAATACCACCTGCCAGATGCAATTAGCTTTCGCTCTG CACTGCATAACGGTTACCGTATGCAGTACGTAAAACCAGAGCTGGTGCCGGAAGACCTGGCCTT TCTGCAGTATACCGGCGGCACCACCGGCGTGGCAAAGGGCGCGATGCTGACCCATCGTAACATG CTGGCGAACCTGGAGCAGGTTAACGCAACGTACGGCCCGCTGCTGCACCCGGGTAAAGAACTG GTAGTTACGGCACTGCCTCTGTATCACATCTTTGCACTGACGATCAACTGTCTGCTGTTCATTGA 
ACTGGGTGGTCAGAACCTGCTGATCACCAACCCGCGTGACATTCCGGGCCTGGTAAAAGAGCTG GCTAAGTACCCGTTCACCGCCATTACTGGCGTAAACACTCTGTTTAACGCGCTGCTGAACAACAA AGAGTTTCAGCAGCTGGACTTCTCTAGCCTGCACCTGAGCGCTGGCGGTGGCATGCCGGTTCAG CAGGTTGTGGCAGAGCGTTGGGTGAAACTGACCGGCCAGTATCTGCTGGAGGGTTATGGTCTGA CCGAGTGTGCACCGCTGGTCAGCGTTAACCCGTATGATATTGATTACCACTCTGGTTCTATTGGT CTGCCGGTTCCGTCCACGGAAGCCAAACTGGTGGACGATGACGACAACGAAGTACCTCCGGGCC AGCCGGGTGAGCTGTGTGTCAAGGGTCCGCAGGTTATGCTGGGCTACTGGCAGCGCCCGGACG CCACCGACGAAATCATTAAAAACGGTTGGCTGCATACCGGTGATATCGCTGTAATGGACGAAGA AGGTTTCCTGCGTATCGTGGACCGTAAGAAAGATATGATTCTGGTGAGCGGTTTCAACGTGTAC CCGAACGAAATTGAGGACGTAGTTATGCAACACCCTGGCGTGCAGGAGGTGGCAGCCGTGGGC GTGCCGTCCGGTTCTTCTGGTGAGGCTGTGAAAATCTTTGTCGTTAAAAAGGACCCGTCCCTGA CCGAAGAATCTCTGGTGACGTTTTGCCGCCGTCAACTGACTGGCTACAAAGTGCCGAAACTGGT CGAGTTCCGCGATGAGCTGCCAAAATCTAACGTGGGTAAGATCCTGCGCCGCGAGCTGCGTGAC GAGGCACGTGGCAAAGTTGACAATAAAGCATAACCGCGTAGGAGGACAGCTATGCGCCCACTTCA TCCGATCGATTTCATTTTCCTGTCCCTGGAGAAACGCCAGCAGCCGATGCACGTAGGTGGTCTG TTCCTGTTCCAGATCCCGGATAACGCTCCGGACACCTTTATTCAGGACCTGGTGAACGATATCCG TATCTCCAAGTCTATTCCGGTTCCGCCGTTCAACAACAAGCTGAACGGTCTGTTCTGGGACGAA GACGAGGAGTTCGATCTGGATCACCATTTCCGTCATATTGCGCTGCCGCACCCGGGTCGCATCC GTGAGCTGCTGATTTACATCTCTCAGGAACACAGCACTCTCCTCGATCGCGCTAAACCTCTGTGG ACTTGCAACATCATTGAAGGTATCGAGGGTAACCGTTTCGCCATGTACTTCAAGATTCATCATGC GATGGTGGATGGTGTGGCGGGTATGCGTCTGATTGAGAAAAGCCTGTCCCATGATGTTACTGAA AAGAGCATCGTACCGCCGTGGTGCGTTGAGGGCAAACGTGCTAAACGCCTGCGTGAACCGAAG ACCGGCAAAATTAAGAAAATCATGTCTGGTATTAAATCTCAGCTCCAGGCCACCCCGACCGTTAT TCAAGAACTGTCTCAGACGGTCTTCAAAGACATCGGCCGTAATCCGGACCACGTTTCCTCTTTCC AGGCGCCGTGCTCCATCCTCAACCAGCGTGTGTCTTCTTCTCGTCGTTTCGCAGCACAGAGCTTT GACCTGGACCGTTTCCGCAACATCGCCAAATCTCTGAACGTGACCATTAACGACGTTGTCCTGG CTGTGTGTAGCGGTGCTCTGCGCGCTTATCTGATGTCTCATAACTCTCTGCCATCCAAACCGCTG ATCGCTATGGTCCCAGCAAGCATCCGCAACGATGATTCTGATGTGTCCAACCGTATTACTATGAT TCTGGCCAACCTCGCTACTCACAAAGACGACCCTCTGCAGCGTCTGGAAATCATCCGCCGCTCC GTCCAGAACTCTAAACAGCGTTTTAAACGCATGACTTCCGACCAGATTCTGAACTATTCTGCGGT 
TGTATACGGCCCGGCTGGTCTGAACATTATCAGCGGTATGATGCCGAAACGTCAGGCTTTTAAC CTGGTAATCAGCAACGTTCCTGGCCCGCGTGAGCCGCTGTACTGGAACGGCGCAAAACTGGACG CACTGTACCCGGCTTCCATCGTTCTGGATGGCCAGGCTCTGAACATCACTATGACCTCTTACCTG GACAAACTGGAAGTAGGTCTGATCGCGTGTCGCAATGCACTGCCGCGCATGCAGAACCTGCTGA CCCACCTGGAGGAGGAAATCCAGCTGTTTGAGGGCGTTATCGCCAAACAGGAAGATATCAAAAC GGCGAACTAACCATGGTTGAATTC SEQ ID NO: 11 pJB532 (UHR and DHR are lowercase; lacIq with promoter and P.sub.trc underlined; tesA, fadD and wsadpl are in bold and underlined and follow the promoter in order; aadA marker is italicized and underlined) CCTGCAGGGtcagcaagctctggaatttcccgattctctgatgggagatccaaaaattctcgcagtccctcaat- cacgatatcggtcttggatcgcc ctgtagcttccgacaactgctcaattttttcgagcatctctaccgggcatcggaatgaaattaacggtgtttta- gccatgtgttatacagtgtttac aacttgactaacaaatacctgctagtgtatacatattgtattgcaatgtatacgctattttcactgctgtcttt- aatggggattatcgcaagcaagt aaaaaagcctgaaaaccccaataggtaagggattccgagcttactcgataattatcacctttgagcgcccctag- gaggaggcgaaaagctatgtctg acaaggggtttgacccctgaagtcgttgcgcgagcattaaggtctgcggatagcccataacatacttttgttga- acttgtgcgcttttatcaacccc ttaagggcttgggagcgttttatGCG GCCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGC CAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTG AGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTC CACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTGACGGCGGGATATAAC ATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCG GACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGT GGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGT CGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCA GACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACC CAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGT TGATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTC CACAGCAATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGCTGCG CGAGAAGATTGTGCACCGCCGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACC ACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGGCGC 
GTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTT GTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTT TCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGC ATACTCTGCGACATCGTATAACGTTACTGGTTTCATATTCACCACCCTGAATTGACTCTCTTC CGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCACCATTCGATGGTGTCAACGTAAATGC ATGCCGCTTCGCCTTCCAATTGGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCA TCGGAAGCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCGTGTCGCTCAAGGC GCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCTGGCAAATATTCTGA AATGAGCTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACA ATTTCACACAGGAAACAGCATGGCCAAGGAGGCCCATATGGCGGATACTCTGCTGATTCT GGGTGATTCTCTGTCTGCAGGCTACCGTATGTCCGCCTCCGCGGCCTGGCCAGCTCTG CTGAATGATAAGTGGCAGTCTAAGACGTCCGTTGTGAACGCATCCATCTCTGGCGACA CGAGCCAGCAGGGCCTGGCCCGTCTGCCTGCACTGCTGAAACAGCACCAACCGCGCTG GGTCCTGGTGGAGCTGGGCGGTAACGACGGTCTGCGCGGCTTCCAGCCGCAGCAGAC CGAACAGACTCTGCGTCAGATTCTGCAGGACGTGAAAGCTGCTAACGCGGAACCGCTG CTGATGCAGATTCGTCTGCCAGCGAACTATGGCCGCCGTTACAACGAAGCGTTCTCTG CAATCTACCCAAAACTGGCGAAAGAGTTTGACGTCCCGCTGCTGCCGTTCTTCATGGA GGAAGTATACCTGAAACCGCAGTGGATGCAAGATGACGGCATCCACCCGAACCGTGAT GCGCAGCCGTTCATCGCTGACTGGATGGCGAAGCAACTGCAGCCGCTGGTAAACCACG ATTCCTAATTAAAGATCTGTAGTAGGATCCATGTAGGGTGAGGTTATAGCTATGAAGAAAG TTTGGCTGAACCGTTATCCGGCAGATGTACCGACTGAAATTAACCCAGATCGTTACCAG TCCCTGGTTGACATGTTCGAACAGTCCGTGGCTCGCTACGCCGATCAGCCTGCTTTCGT CAACATGGGTGAGGTAATGACCTTTCGCAAACTGGAGGAGCGTTCCCGTGCTTTCGCG GCATACCTGCAGCAGGGTCTGGGCCTGAAGAAAGGCGACCGCGTGGCCCTGATGATG CCGAACCTGCTGCAATATCCTGTGGCGCTGTTCGGTATCCTGCGTGCTGGTATGATCG TTGTCAATGTTAACCCTCTGTATACCCCTCGTGAACTGGAGCACCAGCTGAATGACTCT GGTGCGTCTGCTATCGTTATCGTTTCCAATTTCGCACATACGCTGGAGAAAGTGGTTGA TAAAACCGCAGTGCAGCATGTCATTCTGACTCGCATGGGTGACCAGCTGTCCACCGCT AAAGGTACTGTAGTCAACTTCGTTGTGAAATACATTAAGCGCCTGGTTCCGAAATACCA CCTGCCAGATGCAATTAGCTTTCGCTCTGCACTGCATAACGGTTACCGTATGCAGTACG TAAAACCAGAGCTGGTGCCGGAAGACCTGGCCTTTCTGCAGTATACCGGCGGCACCAC CGGCGTGGCAAAGGGCGCGATGCTGACCCATCGTAACATGCTGGCGAACCTGGAGCA GGTTAACGCAACGTACGGCCCGCTGCTGCACCCGGGTAAAGAACTGGTAGTTACGGCA

CTGCCTCTGTATCACATCTTTGCACTGACGATCAACTGTCTGCTGTTCATTGAACTGGG TGGTCAGAACCTGCTGATCACCAACCCGCGTGACATTCCGGGCCTGGTAAAAGAGCTG GCTAAGTACCCGTTCACCGCCATTACTGGCGTAAACACTCTGTTTAACGCGCTGCTGAA CAACAAAGAGTTTCAGCAGCTGGACTTCTCTAGCCTGCACCTGAGCGCTGGCGGTGGC ATGCCGGTTCAGCAGGTTGTGGCAGAGCGTTGGGTGAAACTGACCGGCCAGTATCTGC TGGAGGGTTATGGTCTGACCGAGTGTGCACCGCTGGTCAGCGTTAACCCGTATGATAT TGATTACCACTCTGGTTCTATTGGTCTGCCGGTTCCGTCCACGGAAGCCAAACTGGTG GACGATGACGACAACGAAGTACCTCCGGGCCAGCCGGGTGAGCTGTGTGTCAAGGGT CCGCAGGTTATGCTGGGCTACTGGCAGCGCCCGGACGCCACCGACGAAATCATTAAAA ACGGTTGGCTGCATACCGGTGATATCGCTGTAATGGACGAAGAAGGTTTCCTGCGTAT CGTGGACCGTAAGAAAGATATGATTCTGGTGAGCGGTTTCAACGTGTACCCGAACGAA ATTGAGGACGTAGTTATGCAACACCCTGGCGTGCAGGAGGTGGCAGCCGTGGGCGTG CCGTCCGGTTCTTCTGGTGAGGCTGTGAAAATCTTTGTCGTTAAAAAGGACCCGTCCCT GACCGAAGAATCTCTGGTGACGTTTTGCCGCCGTCAACTGACTGGCTACAAAGTGCCG AAACTGGTCGAGTTCCGCGATGAGCTGCCAAAATCTAACGTGGGTAAGATCCTGCGCC GCGAGCTGCGTGACGAGGCACGTGGCAAAGTTGACAATAAAGCATAACCGCGTAGGAG GACAGCTATGCGCCCACTTCATCCGATCGATTTCATTTTCCTGTCCCTGGAGAAACGCC AGCAGCCGATGCACGTAGGTGGTCTGTTCCTGTTCCAGATCCCGGATAACGCTCCGGA CACCTTTATTCAGGACCTGGTGAACGATATCCGTATCTCCAAGTCTATTCCGGTTCCGC CGTTCAACAACAAGCTGAACGGTCTGTTCTGGGACGAAGACGAGGAGTTCGATCTGGA TCACCATTTCCGTCATATTGCGCTGCCGCACCCGGGTCGCATCCGTGAGCTGCTGATTT ACATCTCTCAGGAACACAGCACTCTCCTCGATCGCGCTAAACCTCTGTGGACTTGCAAC ATCATTGAAGGTATCGAGGGTAACCGTTTCGCCATGTACTTCAAGATTCATCATGCGAT GGTGGATGGTGTGGCGGGTATGCGTCTGATTGAGAAAAGCCTGTCCCATGATGTTACT GAAAAGAGCATCGTACCGCCGTGGTGCGTTGAGGGCAAACGTGCTAAACGCCTGCGTG AACCGAAGACCGGCAAAATTAAGAAAATCATGTCTGGTATTAAATCTCAGCTCCAGGC CACCCCGACCGTTATTCAAGAACTGTCTCAGACGGTCTTCAAAGACATCGGCCGTAATC CGGACCACGTTTCCTCTTTCCAGGCGCCGTGCTCCATCCTCAACCAGCGTGTGTCTTCT TCTCGTCGTTTCGCAGCACAGAGCTTTGACCTGGACCGTTTCCGCAACATCGCCAAATC TCTGAACGTGACCATTAACGACGTTGTCCTGGCTGTGTGTAGCGGTGCTCTGCGCGCT TATCTGATGTCTCATAACTCTCTGCCATCCAAACCGCTGATCGCTATGGTCCCAGCAAG CATCCGCAACGATGATTCTGATGTGTCCAACCGTATTACTATGATTCTGGCCAACCTCG CTACTCACAAAGACGACCCTCTGCAGCGTCTGGAAATCATCCGCCGCTCCGTCCAGAA 
CTCTAAACAGCGTTTTAAACGCATGACTTCCGACCAGATTCTGAACTATTCTGCGGTTG TATACGGCCCGGCTGGTCTGAACATTATCAGCGGTATGATGCCGAAACGTCAGGCTTT TAACCTGGTAATCAGCAACGTTCCTGGCCCGCGTGAGCCGCTGTACTGGAACGGCGCA AAACTGGACGCACTGTACCCGGCTTCCATCGTTCTGGATGGCCAGGCTCTGAACATCA CTATGACCTCTTACCTGGACAAACTGGAAGTAGGTCTGATCGCGTGTCGCAATGCACT GCCGCGCATGCAGAACCTGCTGACCCACCTGGAGGAGGAAATCCAGCTGTTTGAGGGC GTTATCGCCAAACAGGAAGATATCAAAACGGCGAACTAACCATGGTTGAATTCGGTTTTC CGTCCTGTCTTGATTTTCAAGCAAACAATGCCTCCGATTTCTAATCGGAGGCATTTGTTTTTG TTTATTGCAAAAACAAAAAATATTGTTACAAATTTTTACAGGCTATTAAGCCTACCGTCATA AATAATTTGCCATTTACTAGTTTTTAATTAACCAGAACCTTGACCGAACGCAGCGGTGGTAACG GCGCAGTGGCGGTTTTCATGGCTTGTTATGACTGTTTTTTTGGGGTACAGTCTATGCCTCGGGCAT CCAAGCAGCAAGCGCGTTACGCCGTGGGTCGATGTTTGATGTTATGGAGCAGCAACGATGTTACG CAGCAGGGCAGTCGCCCTAAAACAAAGTTAAACATCATGAGGGAAGCGGTGATCGCCGAAGTATC GACTCAACTATCAGAGGTAGTTGGCGTCATCGAGCGCCATCTCGAACCGACGTTGCTGGCCGTAC ATTTGTACGGCTCCGCAGTGGATGGCGGCCTGAAGCCACACAGTGATATTGATTTGCTGGTTACGG TGACCGTAAGGCTTGATGAAACAACGCGGCGAGCTTTGATCAACGACCTTTTGGAAACTTCGGCTT CCCCTGGAGAGAGCGAGATTCTCCGCGCTGTAGAAGTCACCATTGTTGTGCACGACGACATCATTC CGTGGCGTTATCCAGCTAAGCGCGAACTGCAATTTGGAGAATGGCAGCGCAATGACATTCTTGCAG GTATCTTCGAGCCAGCCACGATCGACATTGATCTGGCTATCTTGCTGACAAAAGCAAGAGAACATA GCGTTGCCTTGGTAGGTCCAGCGGCGGAGGAACTCTTTGATCCGGTTCCTGAACAGGATCTATTTG AGGCGCTAAATGAAACCTTAACGCTATGGAACTCGCCGCCCGACTGGGCTGGCGATGAGCGAAAT GTAGTGCTTACGTTGTCCCGCATTTGGTACAGCGCAGTAACCGGCAAAATCGCGCCGAAGGATGT CGCTGCCGACTGGGCAATGGAGCGCCTGCCGGCCCAGTATCAGCCCGTCATACTTGAAGCTAGAC AGGCTTATCTTGGACAAGAAGAAGATCGCTTGGCCTCGCGCGCAGATCAGTTGGAAGAATTTGTCC ACTACGTGAAAGGCGAGATCACCAAGGTAGTCGGCAAATAATGTCTAACAATTCGTTCAAGCCGAC GCCGCTTCGCGGCGCGGCTTAACTCAAGCGTTAGATGCACTAAGCACATAATTGCTCACAGCCAAA CTATCAGGTCAAGTCTGCTTTTATTATTTTTAAGCGTGCATAATAAGCCCTACACAAATTGGGAGATA TATCATGAGGCGCGCCacgagtgcggggaaatttcgggggcgatcgcccctatatcgcaaaaaggagttacccc- atcagagctatagtcg agaagaaaaccatcattcactcaacaaggctatgtcagaagagaaactagaccggatcgaagcagccctagagc- aattggataaggatgtgcaaac 
gctccaaacagagcttcagcaatcccaaaaatggcaggacaggacatgggatgttgtgaagtgggtaggcggaa- tctcagcgggcctagcggtgag cgcttccattgccctgttcgggttggtctttagattttctgtttccctgccataaaagcacattcttataagtc- atacttgtttacatcaaggaac aaaaacggcattgtgccttgcaaggcacaatgtctttctcttatgcacagatggggactggaaaccacacgcac- aattcccttaaaaagcaaccgc aaaaaataaccatcaaaataaaactggacaaattctcatgtgGGCCGGCC SEQ ID NO: 12 Ptrc promoter and lacIq repressor TCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACG CGCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACG GGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCT GGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTGACGGCGGGATATAACATGAGC TGTCTTCGGTATCGTCGTATCCCACTACCGAGATATCCGCACCAACGCGCAGCCCGGACTCG GTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAAC GATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTC CCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCAGACGCA GACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAATGC GACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGG GTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGC AATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGCTGCGCGAGAA GATTGTGCACCGCCGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACG CTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGGCGCGTGCAG GGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCA CGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAG AAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTC TGCGACATCGTATAACGTTACTGGTTTCATATTCACCACCCTGAATTGACTCTCTTCCGGGCG CTATCATGCCATACCGCGAAAGGTTTTGCACCATTCGATGGTGTCAACGTAAATGCATGCCG CTTCGCCTTCCAATTGGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCATCGGAA GCTGTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCGTGTCGCTCAAGGCGCACTC CCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCTGGCAAATATTCTGAAATGAG CTGTTGACAATTAATCATCCGGCTCGTATAATGTGTGGAATTGTGAGCGGATAACAATTTCA CACAGGAAACAGCAT SEQ ID NO: 13 (UHR and DHR in lowercase; P.sub.aphII underlined; fadD and wsadpl are in bold and underlined and follow the promoter in 
order; aadA marker is italicized and underlined) CCTGCAGGgtcagcaagctctggaatttcccgattctctgatgggagatccaaaaattctcgcagtccctcaat- cacgatatcggtcttggatcgc cctgtagcttccgacaactgctcaattttttcgagcatctctaccgggcatcggaatgaaattaacggtgtttt- agccatgtgttatacagtgttt acaacttgactaacaaatacctgctagtgtatacatattgtattgcaatgtatacgctattttcactgctgtct- ttaatggggattatcgcaagca agtaaaaaagcctgaaaaccccaataggtaagggattccgagcttactcgataattatcacctttgagcgcccc- taggaggaggcgaaaagctatg tctgacaaggggtttgacccctgaagtcgttgcgcgagcattaaggtctgcggatagcccataacatacttttg- ttgaacttgtgcgcttttatca accccttaagggcttgggagcgttttatGCGGCCGCGGGGGGGGGGGGGAAAGCCACGTTGTGTCTCAAAATCT- CTGATGTTACATTGCACA AGATAAAAATATATCATCATGAACAATAAAACTGTCTGCTTACATAAACAGTAATACAAGG GGTCATAGATCTGTAGTAGGATCCATGTAGGGTGAGGTTATAGCTATGAAGAAAGTTTGGC TGAACCGTTATCCGGCAGATGTACCGACTGAAATTAACCCAGATCGTTACCAGTCCCTG GTTGACATGTTCGAACAGTCCGTGGCTCGCTACGCCGATCAGCCTGCTTTCGTCAACAT GGGTGAGGTAATGACCTTTCGCAAACTGGAGGAGCGTTCCCGTGCTTTCGCGGCATAC CTGCAGCAGGGTCTGGGCCTGAAGAAAGGCGACCGCGTGGCCCTGATGATGCCGAAC CTGCTGCAATATCCTGTGGCGCTGTTCGGTATCCTGCGTGCTGGTATGATCGTTGTCAA TGTTAACCCTCTGTATACCCCTCGTGAACTGGAGCACCAGCTGAATGACTCTGGTGCGT CTGCTATCGTTATCGTTTCCAATTTCGCACATACGCTGGAGAAAGTGGTTGATAAAACC GCAGTGCAGCATGTCATTCTGACTCGCATGGGTGACCAGCTGTCCACCGCTAAAGGTA CTGTAGTCAACTTCGTTGTGAAATACATTAAGCGCCTGGTTCCGAAATACCACCTGCCA GATGCAATTAGCTTTCGCTCTGCACTGCATAACGGTTACCGTATGCAGTACGTAAAACC AGAGCTGGTGCCGGAAGACCTGGCCTTTCTGCAGTATACCGGCGGCACCACCGGCGTG GCAAAGGGCGCGATGCTGACCCATCGTAACATGCTGGCGAACCTGGAGCAGGTTAACG CAACGTACGGCCCGCTGCTGCACCCGGGTAAAGAACTGGTAGTTACGGCACTGCCTCT GTATCACATCTTTGCACTGACGATCAACTGTCTGCTGTTCATTGAACTGGGTGGTCAGA ACCTGCTGATCACCAACCCGCGTGACATTCCGGGCCTGGTAAAAGAGCTGGCTAAGTA CCCGTTCACCGCCATTACTGGCGTAAACACTCTGTTTAACGCGCTGCTGAACAACAAAG AGTTTCAGCAGCTGGACTTCTCTAGCCTGCACCTGAGCGCTGGCGGTGGCATGCCGGT TCAGCAGGTTGTGGCAGAGCGTTGGGTGAAACTGACCGGCCAGTATCTGCTGGAGGGT TATGGTCTGACCGAGTGTGCACCGCTGGTCAGCGTTAACCCGTATGATATTGATTACCA

CTCTGGTTCTATTGGTCTGCCGGTTCCGTCCACGGAAGCCAAACTGGTGGACGATGAC GACAACGAAGTACCTCCGGGCCAGCCGGGTGAGCTGTGTGTCAAGGGTCCGCAGGTTA TGCTGGGCTACTGGCAGCGCCCGGACGCCACCGACGAAATCATTAAAAACGGTTGGCT GCATACCGGTGATATCGCTGTAATGGACGAAGAAGGTTTCCTGCGTATCGTGGACCGT AAGAAAGATATGATTCTGGTGAGCGGTTTCAACGTGTACCCGAACGAAATTGAGGACG TAGTTATGCAACACCCTGGCGTGCAGGAGGTGGCAGCCGTGGGCGTGCCGTCCGGTTC TTCTGGTGAGGCTGTGAAAATCTTTGTCGTTAAAAAGGACCCGTCCCTGACCGAAGAA TCTCTGGTGACGTTTTGCCGCCGTCAACTGACTGGCTACAAAGTGCCGAAACTGGTCG AGTTCCGCGATGAGCTGCCAAAATCTAACGTGGGTAAGATCCTGCGCCGCGAGCTGCG TGACGAGGCACGTGGCAAAGTTGACAATAAAGCATAACTCGACGCGTAGGAGGACAGCT ATGCGCCCACTTCATCCGATCGATTTCATTTTCCTGTCCCTGGAGAAACGCCAGCAGCC GATGCACGTAGGTGGTCTGTTCCTGTTCCAGATCCCGGATAACGCTCCGGACACCTTT ATTCAGGACCTGGTGAACGATATCCGTATCTCCAAGTCTATTCCGGTTCCGCCGTTCAA CAACAAGCTGAACGGTCTGTTCTGGGACGAAGACGAGGAGTTCGATCTGGATCACCAT TTCCGTCATATTGCGCTGCCGCACCCGGGTCGCATCCGTGAGCTGCTGATTTACATCTC TCAGGAACACAGCACTCTCCTCGATCGCGCTAAACCTCTGTGGACTTGCAACATCATTG AAGGTATCGAGGGTAACCGTTTCGCCATGTACTTCAAGATTCATCATGCGATGGTGGA TGGTGTGGCGGGTATGCGTCTGATTGAGAAAAGCCTGTCCCATGATGTTACTGAAAAG AGCATCGTACCGCCGTGGTGCGTTGAGGGCAAACGTGCTAAACGCCTGCGTGAACCGA AGACCGGCAAAATTAAGAAAATCATGTCTGGTATTAAATCTCAGCTCCAGGCCACCCC GACCGTTATTCAAGAACTGTCTCAGACGGTCTTCAAAGACATCGGCCGTAATCCGGAC CACGTTTCCTCTTTCCAGGCGCCGTGCTCCATCCTCAACCAGCGTGTGTCTTCTTCTCG TCGTTTCGCAGCACAGAGCTTTGACCTGGACCGTTTCCGCAACATCGCCAAATCTCTGA ACGTGACCATTAACGACGTTGTCCTGGCTGTGTGTAGCGGTGCTCTGCGCGCTTATCT GATGTCTCATAACTCTCTGCCATCCAAACCGCTGATCGCTATGGTCCCAGCAAGCATCC GCAACGATGATTCTGATGTGTCCAACCGTATTACTATGATTCTGGCCAACCTCGCTACT CACAAAGACGACCCTCTGCAGCGTCTGGAAATCATCCGCCGCTCCGTCCAGAACTCTA AACAGCGTTTTAAACGCATGACTTCCGACCAGATTCTGAACTATTCTGCGGTTGTATAC GGCCCGGCTGGTCTGAACATTATCAGCGGTATGATGCCGAAACGTCAGGCTTTTAACC TGGTAATCAGCAACGTTCCTGGCCCGCGTGAGCCGCTGTACTGGAACGGCGCAAAACT GGACGCACTGTACCCGGCTTCCATCGTTCTGGATGGCCAGGCTCTGAACATCACTATG ACCTCTTACCTGGACAAACTGGAAGTAGGTCTGATCGCGTGTCGCAATGCACTGCCGC GCATGCAGAACCTGCTGACCCACCTGGAGGAGGAAATCCAGCTGTTTGAGGGCGTTAT 
CGCCAAACAGGAAGATATCAAAACGGCGAACTAACCATGGTTGAATTCGGTTTTCCGTCC TGTCTTGATTTTCAAGCAAACAATGCCTCCGATTTCTAATCGGAGGCATTTGTTTTTGTTTATT GCAAAAACAAAAAATATTGTTACAAATTTTTACAGGCTATTAAGCCTACCGTCATAAATAAT TTGCCATTTACTAGTTTTTAATTAACCAGAACCTTGACCGAACGCAGCGGTGGTAACGGCGCAG TGGCGGTTTTCATGGCTTGTTATGACTGTTTTTTTGGGGTACAGTCTATGCCTCGGGCATCCAAGCA GCAAGCGCGTTACGCCGTGGGTCGATGTTTGATGTTATGGAGCAGCAACGATGTTACGCAGCAGG GCAGTCGCCCTAAAACAAAGTTAAACATCATGAGGGAAGCGGTGATCGCCGAAGTATCGACTCAAC TATCAGAGGTAGTTGGCGTCATCGAGCGCCATCTCGAACCGACGTTGCTGGCCGTACATTTGTACG GCTCCGCAGTGGATGGCGGCCTGAAGCCACACAGTGATATTGATTTGCTGGTTACGGTGACCGTA AGGCTTGATGAAACAACGCGGCGAGCTTTGATCAACGACCTTTTGGAAACTTCGGCTTCCCCTGGA GAGAGCGAGATTCTCCGCGCTGTAGAAGTCACCATTGTTGTGCACGACGACATCATTCCGTGGCGT TATCCAGCTAAGCGCGAACTGCAATTTGGAGAATGGCAGCGCAATGACATTCTTGCAGGTATCTTC GAGCCAGCCACGATCGACATTGATCTGGCTATCTTGCTGACAAAAGCAAGAGAACATAGCGTTGCC TTGGTAGGTCCAGCGGCGGAGGAACTCTTTGATCCGGTTCCTGAACAGGATCTATTTGAGGCGCTA AATGAAACCTTAACGCTATGGAACTCGCCGCCCGACTGGGCTGGCGATGAGCGAAATGTAGTGCT TACGTTGTCCCGCATTTGGTACAGCGCAGTAACCGGCAAAATCGCGCCGAAGGATGTCGCTGCCG ACTGGGCAATGGAGCGCCTGCCGGCCCAGTATCAGCCCGTCATACTTGAAGCTAGACAGGCTTAT CTTGGACAAGAAGAAGATCGCTTGGCCTCGCGCGCAGATCAGTTGGAAGAATTTGTCCACTACGTG AAAGGCGAGATCACCAAGGTAGTCGGCAAATAATGTCTAACAATTCGTTCAAGCCGACGCCGCTTC GCGGCGCGGCTTAACTCAAGCGTTAGATGCACTAAGCACATAATTGCTCACAGCCAAACTATCAGG TCAAGTCTGCTTTTATTATTTTTAAGCGTGCATAATAAGCCCTACACAAATTGGGAGATATATCATGA GGCGCGCCacgagtgcggggaaatttcgggggcgatcgcccctatatcgcaaaaaggagttaccccatcagagc- tatagtcgagaagaaaacc atcattcactcaacaaggctatgtcagaagagaaactagaccggatcgaagcagccctagagcaattggataag- gatgtgcaaacgctccaaacag agcttcagcaatcccaaaaatggcaggacaggacatgggatgttgtgaagtgggtaggcggaatctcagcgggc- ctagcggtgagcgcttccattg ccctgttcgggttggtctttagattttctgtttccctgccataaaagcacattcttataagtcatacttgttta- catcaaggaacaaaaacggcat tgtgccttgcaaggcacaatgtctttctcttatgcacagatggggactggaaaccacacgcacaattcccttaa- aaagcaaccgcaaaaaataacc atcaaaataaaactggacaaattctcatgtgGGCCGGCC SEQ ID NO: 14 (UHR and DHR in lowercase; P.sub.aphII underlined; tesA and 
fadD are in bold and underlined and follow the promoter in order; aadA marker is italicized and underlined) CCTGCAGGGtcagcaagctctggaatttcccgattctctgatgggagatccaaaaattctcgcagtccctcaat- cacgatatcggtcttggatcgc cctgtagcttccgacaactgctcaattttttcgagcatctctaccgggcatcggaatgaaattaacggtgtttt- agccatgtgttatacagtgttt acaacttgactaacaaatacctgctagtgtatacatattgtattgcaatgtatacgctattttcactgctgtct- ttaatggggattatcgcaagca agtaaaaaagcctgaaaaccccaataggtaagggattccgagcttactcgataattatcacctttgagcgcccc- taggaggaggcgaaaagctatg tctgacaaggggtttgacccctgaagtcgttgcgcgagcattaaggtctgcggatagcccataacatacttttg- ttgaacttgtgcgcttttatca accccttaagggcttgggagcgttttatGCGGCCGCGGGGGGGGGGGGGAAAGCCACGTTGTGTCTCAAAATCT- CTGATGTTACATTGCACA AGATAAAAATATATCATCATGAACAATAAAACTGTCTGCTTACATAAACAGTAATACAAGG GGTCATATGGCGGATACTCTGCTGATTCTGGGTGATTCTCTGTCTGCAGGCTACCGTAT GTCCGCCTCCGCGGCCTGGCCAGCTCTGCTGAATGATAAGTGGCAGTCTAAGACGTCC GTTGTGAACGCATCCATCTCTGGCGACACGAGCCAGCAGGGCCTGGCCCGTCTGCCTG CACTGCTGAAACAGCACCAACCGCGCTGGGTCCTGGTGGAGCTGGGCGGTAACGACG GTCTGCGCGGCTTCCAGCCGCAGCAGACCGAACAGACTCTGCGTCAGATTCTGCAGGA CGTGAAAGCTGCTAACGCGGAACCGCTGCTGATGCAGATTCGTCTGCCAGCGAACTAT GGCCGCCGTTACAACGAAGCGTTCTCTGCAATCTACCCAAAACTGGCGAAAGAGTTTG ACGTCCCGCTGCTGCCGTTCTTCATGGAGGAAGTATACCTGAAACCGCAGTGGATGCA AGATGACGGCATCCACCCGAACCGTGATGCGCAGCCGTTCATCGCTGACTGGATGGCG AAGCAACTGCAGCCGCTGGTAAACCACGATTCCTAATTAAAGATCTGTAGTAGGATCCAT GTAGGGTGAGGTTATAGCTATGAAGAAAGTTTGGCTGAACCGTTATCCGGCAGATGTAC CGACTGAAATTAACCCAGATCGTTACCAGTCCCTGGTTGACATGTTCGAACAGTCCGTG GCTCGCTACGCCGATCAGCCTGCTTTCGTCAACATGGGTGAGGTAATGACCTTTCGCA AACTGGAGGAGCGTTCCCGTGCTTTCGCGGCATACCTGCAGCAGGGTCTGGGCCTGAA GAAAGGCGACCGCGTGGCCCTGATGATGCCGAACCTGCTGCAATATCCTGTGGCGCTG TTCGGTATCCTGCGTGCTGGTATGATCGTTGTCAATGTTAACCCTCTGTATACCCCTCG TGAACTGGAGCACCAGCTGAATGACTCTGGTGCGTCTGCTATCGTTATCGTTTCCAATT TCGCACATACGCTGGAGAAAGTGGTTGATAAAACCGCAGTGCAGCATGTCATTCTGAC TCGCATGGGTGACCAGCTGTCCACCGCTAAAGGTACTGTAGTCAACTTCGTTGTGAAA TACATTAAGCGCCTGGTTCCGAAATACCACCTGCCAGATGCAATTAGCTTTCGCTCTGC 
ACTGCATAACGGTTACCGTATGCAGTACGTAAAACCAGAGCTGGTGCCGGAAGACCTG GCCTTTCTGCAGTATACCGGCGGCACCACCGGCGTGGCAAAGGGCGCGATGCTGACCC ATCGTAACATGCTGGCGAACCTGGAGCAGGTTAACGCAACGTACGGCCCGCTGCTGCA CCCGGGTAAAGAACTGGTAGTTACGGCACTGCCTCTGTATCACATCTTTGCACTGACG ATCAACTGTCTGCTGTTCATTGAACTGGGTGGTCAGAACCTGCTGATCACCAACCCGC GTGACATTCCGGGCCTGGTAAAAGAGCTGGCTAAGTACCCGTTCACCGCCATTACTGG CGTAAACACTCTGTTTAACGCGCTGCTGAACAACAAAGAGTTTCAGCAGCTGGACTTCT CTAGCCTGCACCTGAGCGCTGGCGGTGGCATGCCGGTTCAGCAGGTTGTGGCAGAGC GTTGGGTGAAACTGACCGGCCAGTATCTGCTGGAGGGTTATGGTCTGACCGAGTGTGC ACCGCTGGTCAGCGTTAACCCGTATGATATTGATTACCACTCTGGTTCTATTGGTCTGC CGGTTCCGTCCACGGAAGCCAAACTGGTGGACGATGACGACAACGAAGTACCTCCGGG CCAGCCGGGTGAGCTGTGTGTCAAGGGTCCGCAGGTTATGCTGGGCTACTGGCAGCGC CCGGACGCCACCGACGAAATCATTAAAAACGGTTGGCTGCATACCGGTGATATCGCTG TAATGGACGAAGAAGGTTTCCTGCGTATCGTGGACCGTAAGAAAGATATGATTCTGGT GAGCGGTTTCAACGTGTACCCGAACGAAATTGAGGACGTAGTTATGCAACACCCTGGC GTGCAGGAGGTGGCAGCCGTGGGCGTGCCGTCCGGTTCTTCTGGTGAGGCTGTGAAA ATCTTTGTCGTTAAAAAGGACCCGTCCCTGACCGAAGAATCTCTGGTGACGTTTTGCCG CCGTCAACTGACTGGCTACAAAGTGCCGAAACTGGTCGAGTTCCGCGATGAGCTGCCA AAATCTAACGTGGGTAAGATCCTGCGCCGCGAGCTGCGTGACGAGGCACGTGGCAAAG TTGACAATAAAGCATAACAATTCGGTTTTCCGTCCTGTCTTGATTTTCAAGCAAACAATGCC TCCGATTTCTAATCGGAGGCATTTGTTTTTGTTTATTGCAAAAACAAAAAATATTGTTACAAA TTTTTACAGGCTATTAAGCCTACCGTCATAAATAATTTGCCATTTACTAGTTTTTAATTAACC AGAACCTTGACCGAACGCAGCGGTGGTAACGGCGCAGTGGCGGTTTTCATGGCTTGTTATGACTG TTTTTTTGGGGTACAGTCTATGCCTCGGGCATCCAAGCAGCAAGCGCGTTACGCCGTGGGTCGAT GTTTGATGTTATGGAGCAGCAACGATGTTACGCAGCAGGGCAGTCGCCCTAAAACAAAGTTAAACA TCATGAGGGAAGCGGTGATCGCCGAAGTATCGACTCAACTATCAGAGGTAGTTGGCGTCATCGAG CGCCATCTCGAACCGACGTTGCTGGCCGTACATTTGTACGGCTCCGCAGTGGATGGCGGCCTGAA GCCACACAGTGATATTGATTTGCTGGTTACGGTGACCGTAAGGCTTGATGAAACAACGCGGCGAGC TTTGATCAACGACCTTTTGGAAACTTCGGCTTCCCCTGGAGAGAGCGAGATTCTCCGCGCTGTAGA AGTCACCATTGTTGTGCACGACGACATCATTCCGTGGCGTTATCCAGCTAAGCGCGAACTGCAATT TGGAGAATGGCAGCGCAATGACATTCTTGCAGGTATCTTCGAGCCAGCCACGATCGACATTGATCT

GGCTATCTTGCTGACAAAAGCAAGAGAACATAGCGTTGCCTTGGTAGGTCCAGCGGCGGAGGAAC TCTTTGATCCGGTTCCTGAACAGGATCTATTTGAGGCGCTAAATGAAACCTTAACGCTATGGAACTC GCCGCCCGACTGGGCTGGCGATGAGCGAAATGTAGTGCTTACGTTGTCCCGCATTTGGTACAGCG CAGTAACCGGCAAAATCGCGCCGAAGGATGTCGCTGCCGACTGGGCAATGGAGCGCCTGCCGGC CCAGTATCAGCCCGTCATACTTGAAGCTAGACAGGCTTATCTTGGACAAGAAGAAGATCGCTTGGC CTCGCGCGCAGATCAGTTGGAAGAATTTGTCCACTACGTGAAAGGCGAGATCACCAAGGTAGTCG GCAAATAATGTCTAACAATTCGTTCAAGCCGACGCCGCTTCGCGGCGCGGCTTAACTCAAGCGTTA GATGCACTAAGCACATAATTGCTCACAGCCAAACTATCAGGTCAAGTCTGCTTTTATTATTTTTAAGC GTGCATAATAAGCCCTACACAAATTGGGAGATATATCATGAGGCGCGCCacgagtgcggggaaatttcgggggc gatcgcccctatatcgcaaaaaggagttaccccatcagagctatagtcgagaagaaaaccatcattcactcaac- aaggctatgtcagaagagaaac tagaccggatcgaagcagccctagagcaattggataaggatgtgcaaacgctccaaacagagcttcagcaatcc- caaaaatggcaggacaggacat gggatgttgtgaagtgggtaggcggaatctcagcgggcctagcggtgagcgcttccattgccctgttcgggttg- gtctttagattttctgtttccc tgccataaaagcacattcttataagtcatacttgtttacatcaaggaacaaaaacggcattgtgccttgcaagg- cacaatgtctttctcttatgca cagatggggactggaaaccacacgcacaattcccttaaaaagcaaccgcaaaaaataaccatcaaaataaaact- ggacaaattctcatgtgGGCCG GCC SEQ ID NO: 15 pJB161 (vector contains bla cassette, pUC ori and transcription terminators flanking the homology regions; UHR and DHR are lowercase; P.sub.aphII promoter is underlined; adhII terminator is in bold; kan.sup.R marker is italicized and underlined) ACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTG CCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGCGCT GCGATGATACCGCGAGAACCACGCTCACCGGCTCCGGATTTATCAGCAATAAACCAGCCAG CCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAAT TGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCAT CGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCA ACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTC CTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTG CATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACC AAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGA 
TAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGC GAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCC AACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCA AAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATATTCTTCCTTT TTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTA TTTAGAAAAATAAACAAATAGGGGTCAGTGTTACAACCAATTAACCAATTCTGAACATTATC GCGAGCCCATTTATACCTGAATATGGCTCATAACACCCCTTGTTTGCCTGGCGGCAGTAGCG CGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGACTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCA GTCGAAAGACTGGGCCTTTCGCCCGGGCTAATTATGGGGTGTCGCCCTTATTCGACTCTATA GTGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCTGAAGTGGGGCCTGCAGGgccaccacagcc aaattcatcgttaatgtggacttgccgacgcccccttttcgactaacaatcgcaatttttttcatagacatttc- ccacagaccacatcaaattaca gcaattgatctagctgaaagtttaacccacttccccccagacccagaagaccagaggcgcttaagcttccccga- acaaactcaactgaccgagggg gagggagccgtagcggcgttggtgttggcgtaaatgacaggccgagcaaagagcgatgagattttcccgacgat- tgtcttcggggatgtaattttt taaaacagcccgcaggtgacgatcaatgcctttgaccttcacatccgacggaatacaaaccaagccacagagtt- cacagcgccagtctgcatcctctttta gtggtggacgcttaaggtcttgtaaggcgatcgcctgccaatcatcagaatatcgagaagaatgtttcatctaa- acctagcgccgcaagataatcctgaaa tcgctacagtattaaaaaattctggccaacatcacagccaatactGCGGCCGCGGGGGGGGGGGGGAAAGCCAC- GTTGTGTCTCAAAATC TCTGATGTTACATTGCACAAGATAAAAATATATCATCATGAACAATAAAACTGTCTGCTTAC ATAAACAGTAATACAAGGGGTCATATGTAACAGGAATTCGGTTTTCCGTCCTGTCTTGATT TTCAAGCAAACAATGCCTCCGATTTCTAATCGGAGGCATTTGTTTTTGTTTATTGCAAA AACAAAAAATATTGTTACAAATTTTTACAGGCTATTAAGCCTACCGTCATAAATAATTT GCCATTTACTAGTTTTTAATTAAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCC GCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGATTGAACAA GATGGCCTGCATGCTGGTTCTCCGGCTGCTTGGGTGGAACGCCTGTTTGGTTACGACTGGGCTCA GCTGACTATTGGCTGTAGCGATGCAGCGGTTTTCCGTCTGTCTGCACAGGGTCGTCCGGTTCTGTT TGTGAAAACCGACCTGTCCGGCGCACTGAACGAACTGCAGGACGAAGCGGCCCGTCTGTCCTGG CTCGCGACGACTGGTGTTCCGTGCGCGGCAGTTCTGGACGTAGTTACTGAAGCCGGTCGCGATTG 
GCTGCTGCTGGGTGAAGTTCCGGGTCAGGATCTGCTGAGCAGCCACCTCGCTCCGGCAGAAAAAG TTTCCATCATGGCGGACGCGATGCGCCGTCTGCACACCCTGGACCCGGCAACTTGCCCGTTTGAC CATCAGGCTAAACACCGTATTGAACGTGCACGCACTCGTATGGAAGCGGGTCTGGTTGATCAGGA CGACCTGGATGAAGAGCACCAGGGCCTCGCACCGGCGGAACTGTTTGCACGTCTGAAAGCCCGC ATGCCGGACGGCGAAGACCTGGTGGTAACGCATGGCGACGCTTGTCTGCCAAACATTATGGTGGA AAACGGCCGCTTCTCTGGTTTTATTGACTGTGGCCGTCTGGGTGTAGCTGATCGCTATCAGGATAT CGCCCTCGCTACCCGCGATATTGCAGAAGAACTGGGTGGTGAATGGGCTGACCGTTTCCTGGTGC TGTACGGTATCGCAGCGCCGGATTCTCAGCGCATTGCCTTCTACCGTCTGCTGGATGAGTTCTTCT AAGGCGCGCCgaaactgcgccaagaatagctcacttcaaatcagtcacggttttgtttagggcttgtctggcga- ttttggtgacatagacagtcaca gcaacagtagccacaaaaccaagaatccggatcgaccactgggcaatggggttggcgctggtgctttctgtgcc- gagggtcgcaagatttccggccag ggagccaatgtagacatacatgatggtgccagggatcatccccacagagccgaggacatagtcttttagggaaa- cgcccgtgaccccataggcatagtt aagcagattaaagggaaatacaggtgagagacgcgtcaggagaacaatcttcaggccttccttgcccacagctt- cgtcgatggcgcgaaatttcgggttg tcggcgattttttggctcacccattggcgggccagataacgacccactaggaaagcagcgatcgctcctagggt- tgcgccaacaaagacgtaaattgatc ctaaagcgacaccaaaaacaaccccggctcccaaggtcagaatcgaccccggtagaaaagccaccgtcgccacc- acataaagcaccataaaggcga tGGCCGGCCAAAATGAAGTGAAGTTCCTATACTTTCTAGAGAATAGGAACTTCTATAGTGAG TCGAATAAGGGCGACACAAAATTTATTCTAAATGCATAATAAATACTGATAACATCTTATAG TTTGTATTATATTTTGTATTATCGTTGACATGTATAATTTTGATATCAAAAACTGATTTTCCCT TTATTATTTTCGAGATTTATTTTCTTAATTCTCTTTAACAAACTAGAAATATTGTATATACAAA AAATCATAAATAATAGATGAATAGTTTAATTATAGGTGTTCATCAATCGAAAAAGCAACGTA TCTTATTTAAAGTGCGTTGCTTTTTTCTCATTTATAAGGTTAAATAATTCTCATATATCAAGCA AAGTGACAGGCGCCCTTAAATATTCTGACAAATGCTCTTTCCCTAAACTCCCCCCATAAAAA AACCCGCCGAAGCGGGTTTTTACGTTATTTGCGGATTAACGATTACTCGTTATCAGAACCGC CCAGGGGGCCCGAGCTTAAGACTGGCCGTCGTTTTACAACACAGAAAGAGTTTGTAGAAAC GCAAAAAGGCCATCCGTCAGGGGCCTTCTGCTTAGTTTGATGCCTGGCAGTTCCCTACTCTC GCCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATC AGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAAC ATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTT 
TCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCG AAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTC CTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGC TTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCT GTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAG TCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCA GAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGGCTAACTACGGCTACACT AGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGG TAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGC AGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGAC GCTCAGTGGAACGACGCGCGCGTAACTCACGTTAAGGGATTTTGGTCATGAGCTTGCGCCGT CCCGTCAAGTCAGCGTAATGCTCTGCTTTT SEQ ID NO: 16 P.sub.psaA-tolC-P.sub.tsr2142-acrAB insert with flanking homology regions This sequence inserted into pJB161 to create PJB1074 (UHR and DHR in lowercase and not underlined; P.sub.psaA and P.sub.tsr2142 are underlined and capitalized; tolC, acrA and acrB are in bold, lowercase, and underlined and follow the promoter in order; kan.sup.R marker is italicized and underlined) CCTGCAGGgccaccacagccaaattcatcgttaatgtggacttgccgacgcccccttttcgactaacaatcgca- atttttttcatagacatttcccaca gaccacatcaaattacagcaattgatctagctgaaagtttaacccacttccccccagacccagaagaccagagg- cgcttaagcttccccgaacaaactca actgaccgagggggagggagccgtagcggcgttggtgttggcgtaaatgacaggccgagcaaagagcgatgaga- ttttcccgacgattgtcttcgggg atgtaatttttgtggtggacgcttaaggttaaaacagcccgcaggtgacgatcaatgcctttgaccttcacatc- cgacggaatacaaaccaagccacagag ttcacagcgccagtctgcatcctcttttacttgtaaggcgatcgcctgccaatcatcagaatatcgagaagaat- gtttcatctaaacctagcgccgcaaga taatcctgaaatcgctacagtattaaaaaattctggccaacatcacagccaatactGCGGCCGCGCCCCTATAT- TATGCATTTATA CCCCCACAATCATGTCAAGAATTCAAGCATCTTAAATAATGTTAATTATCGGCAAAGTCTGT GCTCCCCTTCTATAATGCTGAATTGAGCATTCGCCTCCTGAACGGTCTTTATTCTTCCATTGT GGGTCTTTAGATTCACGATTCTTCACAATCATTGATCTAAAGATCTTTCTAGATTCTCGAGGC ATatgaagaaattgctccccattcttatcggcctgagcctttctgggttcagttcgttgagccaggccgagaac- 
ctgatgcaagtttatcagcaa gcacgccttagtaacccggaattgcgtaagtctgccgccgatcgtgatgctgcctttgaaaaaattaatgaagc- gcgcagtccattactgccaca gctaggtttaggtgcagattacacctatagcaacggctaccgcgacgcgaacggcatcaactctaacgcgacca- gtgcgtccttgcagttaact caatccatttttgatatgtcgaaatggcgtgcgttaacgctgcaggaaaaagcagcagggattcaggacgtcac-

gtatcagaccgatcagcaaa ccttgatcctcaacaccgcgaccgcttatttcaacgtgttgaatgctattgacgttctttcctatacacaggca- caaaaagaagcgatctaccgtc aattagatcaaaccacccaacgttttaacgtgggcctggtagcgatcaccgacgtgcagaacgcccgcgcacag- tacgataccgtgctggcga acgaagtgaccgcacgtaataaccttgataacgcggtagagcagctgcgccagatcaccggtaactactatccg- gaactggctgcgctgaatg tcgaaaactttaaaaccgacaaaccacagccggttaacgcgctgctgaaagaagccgaaaaacgcaacctgtcg- ctgttacaggcacgcttga gccaggacctggcgcgcgagcaaattcgccaggcgcaggatggtcacttaccgactctggatttaacggcttct- accgggatttctgacacctct tatagcggttcgaaaacccgtggtgccgctggtacccagtatgacgatagcaatatgggccagaacaaagttgg- cctgagcttctcgctgccga tttatcagggcggaatggttaactcgcaggtgaaacaggcacagtacaactttgtcggtgccagcgagcaactg- gaaagtgcccatcgtagcgt cgtgcagaccgtgcgttcctccttcaacaacattaatgcatctatcagtagcattaacgcctacaaacaagccg- tagtttccgctcaaagctcatt agacgcgatggaagcgggctactcggtcggtacgcgtaccattgttgatgtgttggatgcgaccaccacgttgt- acaacgccaagcaagagctg gcgaatgcgcgttataactacctgattaatcagctgaatattaagtcagctctgggtacgttgaacgagcagga- tctgctggcactgaacaatgc gctgagcaaaccggtttccactaatccggaaaacgttgcaccgcaaacgccggaacagaatgctattgctgatg- gttatgcgcctgatagcccg gcaccagtcgttcagcaaacatccgcacgcactaccaccagtaacggtcataaccctttccgtaactgaGGATC- CAAGGTGGCTA CTTCAACGATAGCTTAAACTTCGCTGCTCCAGCGAGGGGATTTCACTGGTTTGAATGCTTCA ATGCTTGCCAAAAGAGTGCTACTGGAACTTACAAGAGTGACCCTGCGTCAGGGGAGCTAGC ACTCAAAAAAGACTCCTCCAATTCCGTCCatgaacaaaaacagagggtttacgcctctggcggtcgttctgatg- ctctca ggcagcttagccctaacaggatgtgacgacaaacaggcccaacaaggtggccagcagatgcccgccgttggcgt- agtaacagtcaaaactga acctctgcagatcacaaccgagcttccgggtcgcaccagtgcctaccggatcgcagaagttcgtcctcaagtta- gcgggattatcctgaagcgta atttcaaagaaggtagcgacatcgaagcaggtgtctctctctatcagattgatcctgcgacctatcaggcgaca- tacgacagtgcgaaaggtga tctggcgaaagcccaggctgcagccaatatcgcgcaattgacggtgaatcgttatcagaaactgctcggtactc- agtacatcagtaagcaagag tacgatcaggctctggctgatgcgcaacaggcgaatgctgcggtaactgcggcgaaagctgccgttgaaactgc- gcggatcaatctggcttaca ccaaagtcacctctccgattagcggtcgcattggtaagtcgaacgtgacggaaggcgcattggtacagaacggt- caggcgactgcgctggcaa 
ccgtgcagcaacttgatccgatctacgttgatgtgacccagtccagcaacgacttcctgcgcctgaaacaggaa- ctggcgaatggcacgctgaa acaagagaacggcaaagccaaagtgtcactgatcaccagtgacggcattaagttcccgcaggacggtacgctgg- aattctctgacgttaccgtt gatcagaccactgggtctatcaccctacgcgctatcttcccgaacccggatcacactctgctgccgggtatgtt- cgtgcgcgcacgtctggaaga agggcttaatccaaacgctattttagtcccgcaacagggcgtaacccgtacgccgcgtggcgatgccaccgtac- tggtagttggcgcggatgac aaagtggaaacccgtccgatcgttgcaagccaggctattggcgataagtggctggtgacagaaggtctgaaagc- aggcgatcgcgtagtaata agtgggctgcagaaagtgcgtcctggtgtccaggtaaaagcacaagaagttaccgctgataataaccagcaagc- cgcaagcggtgctcagcct gaacagtccaagtcttaacttaaacaggagccgttaagacatgcctaatttctttatcgatcgcccgatttttg- cgtgggtgatcgccattatcatcat gttggcaggggggctggcgatcctcaaactgccggtggcgcaatatcctacgattgcaccgccggcagtaacga- tctccgcctcctaccccggc gctgatgcgaaaacagtgcaggacacggtgacacaggttatcgaacagaatatgaacggtatcgataacctgat- gtacatgtcctctaacagt gactccacgggtaccgtgcagatcaccctgacctttgagtctggtactgatgcggatatcgcgcaggttcaggt- gcagaacaaactgcagctgg cgatgccgttgctgccgcaagaagttcagcagcaaggggtgagcgttgagaaatcatccagcagcttcctgatg- gttgtcggcgttatcaacac cgatggcaccatgacgcaggaggatatctccgactacgtggcggcgaatatgaaagatgccatcagccgtacgt- cgggcgtgggtgatgttca gttgttcggttcacagtacgcgatgcgtatctggatgaacccgaatgagctgaacaaattccagctaacgccgg- ttgatgtcattaccgccatca aagcgcagaacgcccaggttgcggcgggtcagctcggtggtacgccgccggtgaaaggccaacagcttaacgcc- tctattattgctcagacgc gtctgacctctactgaagagttcggcaaaatcctgctgaaagtgaatcaggatggttcccgcgtgctgctgcgt- gacgtcgcgaagattgagctg ggtggtgagaactacgacatcatcgcagagtttaacggccaaccggcttccggtctggggatcaagctggcgac- cggtgcaaacgcgctggat accgctgcggcaatccgtgctgaactggcgaagatggaaccgttcttcccgtcgggtctgaaaattgtttaccc- atacgacaccacgccgttcgt gaaaatctctattcacgaagtggttaaaacgctggtcgaagcgatcatcctcgtgttcctggttatgtatctgt- tcctgcagaacttccgcgcgacg ttgattccgaccattgccgtaccggtggtattgctcgggacctttgccgtccttgccgcctttggcttctcgat- aaacacgctaacaatgttcgggat ggtgctcgccatcggcctgttggtggatgacgccatcgttgtggtagaaaacgttgagcgtgttatggcggaag- aaggtttgccgccaaaagaa 
gctacccgtaagtcgatggggcagattcagggcgctctggtcggtatcgcgatggtactgtcggcggtattcgt- accgatggccttctttggcggt tctactggtgctatctatcgtcagttctctattaccattgtttcagcaatggcgctgtcggtactggtggcgtt- gatcctgactccagctctttgtgcca ccatgctgaaaccgattgccaaaggcgatcacggggaaggtaaaaaaggcttcttcggctggtttaaccgcatg- ttcgagaagagcacgcacc actacaccgacagcgtaggcggtattctgcgcagtacggggcgttacctggtgctgtatctgatcatcgtggtc- ggcatggcctatctgttcgtgc gtctgccaagctccttcttgccagatgaggaccagggcgtgtttatgaccatggttcagctgccagcaggtgca- acgcaggaacgtacacagaa agtgctcaatgaggtaacgcattactatctgaccaaagaaaagaacaacgttgagtcggtgttcgccgttaacg- gcttcggctttgcgggacgtg gtcagaataccggtattgcgttcgtttccttgaaggactgggccgatcgtccgggcgaagaaaacaaagttgaa- gcgattaccatgcgtgcaac acgcgctttctcgcaaatcaaagatgcgatggttttcgcctttaacctgcccgcaatcgtggaactgggtactg- caaccggctttgactttgagctg attgaccaggctggccttggtcacgaaaaactgactcaggcgcgtaaccagttgcttgcagaagcagcgaagca- ccctgatatgttgaccagc gtacgtccaaacggtctggaagataccccgcagtttaagattgatatcgaccaggaaaaagcgcaggcgctggg- tgtttctatcaacgacatta acaccactctgggcgctgcatggggcggcagctatgtgaacgactttatcgaccgcggtcgtgtgaagaaagtt- tatgtcatgtcagaagcgaa ataccgtatgctgccggatgatatcggcgactggtatgttcgtgctgctgatggtcagatggtgccattctcgg- cgttctcctcttctcgttgggagt acggttcgccgcgtctggaacgttacaacggcctgccatccatggaaatcttaggccaggcggcaccgggtaaa- agtaccggtgaagcaatgg agctgatggaacaactggcgagcaaactgcctaccggtgttggctatgactggacggggatgtcctatcaggaa- cgtctctccggcaaccagg caccttcactgtacgcgatttcgttgattgtcgtgttcctgtgtctggcggcgctgtacgagagctggtcgatt- ccgttctccgttatgctggtcgttc cgctgggggttatcggtgcgttgctggctgccaccttccgtggcctgaccaatgacgtttacttccaggtaggc- ctgctcacaaccattgggttgtc ggcgaagaacgcgatccttatcgtcgaattcgccaaagacttgatggataaagaaggtaaaggtctgattgaag- cgacgcttgatgcggtgcg gatgcgtttacgtccgatcctgatgacctcgctggcgtttatcctcggcgttatgccgctggttatcagtactg- gtgctggttccggcgcgcagaac gcagtaggtaccggtgtaatgggcgggatggtgaccgcaacggtactggcaatcttcttcgttccggtattctt- tgtggtggttcgccgccgcttta gccgcaagaatgaagatatcgagcacagccatactgtcgatcatcattgaGAGCTCttGAATTCGGTTTTCCGT- CCTGTCT 
TGATTTTCAAGCAAACAATGCCTCCGATTTCTAATCGGAGGCATTTGTTTTTGTTTATTGCAA AAACAAAAAATATTGTTACAAATTTTTACAGGCTATTAAGCCTACCGTCATAAATAATTTGC CATTTACTAGTTTTTAATTAAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCT CATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGATTGAACAAGAT GGCCTGCATGCTGGTTCTCCGGCTGCTTGGGTGGAACGCCTGTTTGGTTACGACTGGGCTCAGCT GACTATTGGCTGTAGCGATGCAGCGGTTTTCCGTCTGTCTGCACAGGGTCGTCCGGTTCTGTTTGT GAAAACCGACCTGTCCGGCGCACTGAACGAACTGCAGGACGAAGCGGCCCGTCTGTCCTGGCTC GCGACGACTGGTGTTCCGTGCGCGGCAGTTCTGGACGTAGTTACTGAAGCCGGTCGCGATTGGCT GCTGCTGGGTGAAGTTCCGGGTCAGGATCTGCTGAGCAGCCACCTCGCTCCGGCAGAAAAAGTTT CCATCATGGCGGACGCGATGCGCCGTCTGCACACCCTGGACCCGGCAACTTGCCCGTTTGACCAT CAGGCTAAACACCGTATTGAACGTGCACGCACTCGTATGGAAGCGGGTCTGGTTGATCAGGACGA CCTGGATGAAGAGCACCAGGGCCTCGCACCGGCGGAACTGTTTGCACGTCTGAAAGCCCGCATG CCGGACGGCGAAGACCTGGTGGTAACGCATGGCGACGCTTGTCTGCCAAACATTATGGTGGAAAA CGGCCGCTTCTCTGGTTTTATTGACTGTGGCCGTCTGGGTGTAGCTGATCGCTATCAGGATATCGC CCTCGCTACCCGCGATATTGCAGAAGAACTGGGTGGTGAATGGGCTGACCGTTTCCTGGTGCTGT ACGGTATCGCAGCGCCGGATTCTCAGCGCATTGCCTTCTACCGTCTGCTGGATGAGTTCTTCTAAG GCGCGCCgaaactgcgccaagaatagctcacttcaaatcagtcacggttttgtttagggcttgtctggcgattt- tggtgacatagacagtcacagcaa cagtagccacaaaaccaagaatccggatcgaccactgggcaatggggttggcgctggtgctttctgtgccgagg- gtcgcaagatttccggccagggag ccaatgtagacatacatgatggtgccagggatcatccccacagagccgaggacatagtcttttagggaaacgcc- cgtgaccccataggcatagttaagc agattaaagggaaatacaggtgagagacgcgtcaggagaacaatcttcaggccttccttgcccacagcttcgtc- gatggcgcgaaatttcgggttgtcgg cgattttttggctcacccattggcgggccagataacgacccactaggaaagcagcgatcgctcctagggttgcg- ccaacaaagacgtaaattgatcctaa agcgacaccaaaaacaaccccggctcccaaggtcagaatcgaccccggtagaaaagccaccgtcgccaccacat- aaagcaccataaaggcgatGG CCGGCC SEQ ID NO: 17 P(nirA): S. elongatus PCC 7942 TCCCTCTCAGCTCAAAAAGTATCAATGATTACTTAATGTTTGTTCTGCGCAAACTTCT TGCAGAACATGCATGATTTACAAAAAGTTGTAGTTTCTGTTACCAATTGCGAATCGA GAACTGCCTAATCTGCCGAGTATGCAAGCTGCTTTGTAGGCAGATGAATCCCAT SEQ ID NO: 18 P(nir07): S. elongatus PCC 7942 + Synechococcus sp. 
PCC 7002 rbcL altered ribosome binding site (RBS) GCTTGTAGCAATTGCTACTAAAAACTGCGATCGCTGCTGAAATGAGCTGGAATTTTG TCCCTCTCAGCTCAAAAAGTATCAATGATTACTTAATGTTTGTTCTGCGCAAACTTCT

TGCAGAACATGCATGATTTACAAAAAGTTGTAGTTTCTGTTACCAATTGCGAATCGA GAACTGCCTAATCTGCCGAGTATGCGATCCTTTAGCAGGAGGAAAACCAT SEQ ID NO: 19 P(nir09): Anabaena sp. PCC 7120 + Synechococcus sp. PCC 7002 rbcL RBS GCTACTCATTAGTTAAGTGTAATGCAGAAAACGCATATTCTCTATTAAACTTACGCA TTAATACGAGAATTTTGTAGCTACTTATACTATTTTACCTGAGATCCCGACATAACCT TAGAAGTATCGAAATCGTTACATAAACATTCACACAAACCACTTGACAAATTTAGCC AATGTAAAAGACTACAGTTTCTCCCCGGTTTAGTTCTAGAGTTACCTTCAGTGAAAC ATCGGCGGCGTGTCAGTCATTGAAGTAGCATAAATCAATTCAAAATACCCTGCGGG AAGGCTGCGCCAACAAAATTAAATATTTGGTTTTTCACTATTAGAGCATCGATTCAT TAATCAAAAACCTTACCCCCCAGCCCCCTTCCCTTGTAGGGAAGTGGGAGCCAAACT CCCCTCTCCGCGTCGGAGCGAAAAGTCTGAGCGGAGGTTTCCTCCGAACAGAACTTT TAAAGAGAGAGGGGTTGGGGGAGAGGTTCTTTCAAGATTACTAAATTGCTATCACT AGACCTCGTAGAACTAGCAAAGACTACGGGTGGATTGATCTTGAGCAAAAAAACTT TATGAGAACTTTAGCAGGAGGAAAACCAT SEQ ID NO: 20 nrsS-nrsR-P(nrsB): Synechocystis sp. PCC 6803 s110798-s110797 Pslr0793 + Synechococcus sp. PCC 7002 rbcL RBS GATTACCCTATATCGGGCTTTTCTCAATAAAATCTTTATTTTTTGAGGTGCTTTTTAG CCATAAATAATCACTTTAGTATAAAATTTTGACGGCGTAAAGTTGATAAAATAGAAT TAAGAATGGACTATCGGTACAGAAAAAATGGGTAACTGGATGGTGAATAAACTTCC CTTACCCAATGCACTCTCCACCGTTAAAGACCCCCTATGCTTAACGGTGATCACCTG GGCAATGGCGAGTCCCAACCCTGTCCCCCCCGTTTTGCGCGAACGATCTCGATTAAC TCGGTAAAAACGCTCAAAAATGTGTTCCTGTTGGTCGGGGGCAATGCCGATGCCGGT ATCTTGCACGGTGATGATAGCCATCTGTTCATGGGATGTCAGGGTAATATCAACACG TCCCCCAGCAGTTGTGTATTGAATGGCGTTGGCAATTAGGTTTGAGACCAGTCGATA GAGTTGGGATTCATTACCCCAGGCGTAAACTTCCCCTGAACTCAGATCACTGCTGAG ATCAATGTGGGCGGCGATCGCTAATTCTAAAAACTCTTCGGTGAGGTCACTGACTAA ATCATTTAAACAACAAAGCCGCCAATCTTCGGCGGTGGTTTCCTGCTCTAAGCGACT TAGTAGCAATAAATCCGTAATCAATTGGCTTAATCGCCTTCCCTGTCGTTCAACGGT ATGTAGCATGGTGTTAATTTCTGGGGAATGGCTTGAGTCGATGCGTAATACCGCTTC CACCGTGGCCAACAGACTAGCCAATGGCGATCGTAATTCATGGGCTGCATTCGCGGT GAATTGTTGTTGTTGTTGGTAGGACTGGTAAATGGGACGCATGGCTAACCCCGCTAA GCCCCAACTGGAGAAGGCGACCAAACCCAGGGCAATGGGAAAACTAAGCCCTAAA ATCCAAAGAATACGTTTATTTTCGGCATCAAAGGCTGCCAGGCTCCGGCCAATTTGT AGATAGCCCCAGGAAGATTTGTCTGTATTACCGGCGCTATGCAAAATGGTGGTGAAT 
TGTCGATACCGATCGCCGGTTGGGGGGTGAATAGTCTGCCAAGTTTCCTGGTTAAAA ATGGAGGATAGGGAAGCCGGTTGATTAGGCGAAAAAGCCAGCAGGTTGCCTTGATA ATCAAATAAACGAATGTAATATAAACTGCGATCACTAATGCCCAACGTGTGACGTTC AATCAGGGTGGGGTTGACCTGGCAGGGTTGGTTGACCAAACACAGATCGGGCAACA TTTTTTGTAATACTCCGGTGGGACTAGCATTACTCGGCAACATCGGCTCTAAACTGTC ATGCAACGTCCCGGCGATCGACTCCACTTCTCGCTCCAACGCCATCCAGTTGGCCTG CACAATGGCACGATAAACCCCCAACCCCAACAGGGTAAGAATTCCCCCCATTACTA GGGCATACCAGAAAGCCAATTGCAGACGACTACGGGCAAAGAGGCGACGGGTATTC ATGGCGATAGGGTGAACCGATAGCCTTGACCGGGAACTGTTTTAATTGGGCAAGGA CAATTTTGTTGAGCTAGCTTGCGTCGTATCAAACGCATTTGGGCCGCCACCACATTA CTCATGGGCTCCTCATCAAGATCCCACAGTTGTTGCCGGATCTTGCTACCGGAAATG ATCCGCTCTGGGTTTTGCATCAGATATTGAAAAATTTGAAATTCTCTTACGGTTAAA GCAATTTCCTGTCTTTCTAGGTTTAGTGGCTCCGAGATAGTTACCGATAACAGATTAT TACTGGGATCAAGGCTGAAGTTGCCCAAAGTTAAAATTTGCGGTTGGAATTGTGGCG ATCGCCGTTGTAGTGCCCGCAGTCTTGCTAATAGCTCTGCCATCACAAACGGTTTTGT TAGATAGTCATCTGCCCCGGCATCTAGTCCTTCGACACGGTTTTCCGGTTCTCCTAAC GCTGTTAACATCAACACCGGCAAGGAATTACCCTGGGTTCTCAGTTTTTGACAGAGT TCCAAACCCGATAATCCCGGCAGTAACCAATCCACAATGGCAAGGGTGTATTCCGTC CATTGATTTTCCAAATAATCCCAAGCTTGGGAGCCATCCGTCACCCAATCCACCACA TACTTTTCACTAACTAGCACTTTCTTAATAGCCATTCCCAAATCCGTCTCATCTTCCA CCAGCAAAATTCGCATCGCCTCTGCCTTTTTTATAACGGTCTGATCTTAGCGGGGGA AGGAGATTTTCACCTGAATTTCATACCCCCTTTGGCAGACTGGGAAAATCTTGGACA AATTAGGAGGAAAACCAT
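
The three promoter constructs listed above are each described as carrying a Synechococcus sp. PCC 7002 rbcL ribosome binding site, and each sequence ends in a similar run of bases. The following is a minimal Python sketch, not part of the original disclosure, that compares only the final bases of each construct (copied from the listing above) and reports the tail they share. The function name `longest_common_suffix` and the dictionary labels are illustrative assumptions.

```python
# Final bases of the three promoter constructs, copied from the listing above.
# Only the last ~20 nt of each sequence are reproduced; the labels are
# illustrative and not taken from the source.
PROMOTER_TAILS = {
    "rbcL altered RBS construct": "TTTAGCAGGAGGAAAACCAT",
    "SEQ ID NO: 19, P(nir09)": "TTTAGCAGGAGGAAAACCAT",
    "SEQ ID NO: 20, nrsS-nrsR-P(nrsB)": "AATTAGGAGGAAAACCAT",
}


def longest_common_suffix(seqs):
    """Return the longest string that every sequence in `seqs` ends with."""
    seqs = [s.upper() for s in seqs]
    shortest = min(len(s) for s in seqs)
    n = 0
    # Walk backwards from the 3' end while all sequences agree.
    while n < shortest and len({s[-1 - n] for s in seqs}) == 1:
        n += 1
    first = seqs[0]
    return first[len(first) - n:]


shared = longest_common_suffix(PROMOTER_TAILS.values())
print(shared, len(shared))  # prints: AGGAGGAAAACCAT 14
```

The shared 14-nt tail contains the purine-rich AGGAGG motif, which is consistent with the "rbcL RBS" label on each construct; the sketch only compares the listed text and makes no claim about expression strength.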

Sequence CWU 1


251183PRTEscherichia coli 1Met Ala Asp Thr Leu Leu Ile Leu Gly Asp Ser Leu Ser Ala Gly Tyr 1 5 10 15 Arg Met Ser Ala Ser Ala Ala Trp Pro Ala Leu Leu Asn Asp Lys Trp 20 25 30 Gln Ser Lys Thr Ser Val Val Asn Ala Ser Ile Ser Gly Asp Thr Ser 35 40 45 Gln Gln Gly Leu Ala Arg Leu Pro Ala Leu Leu Lys Gln His Gln Pro 50 55 60 Arg Trp Val Leu Val Glu Leu Gly Gly Asn Asp Gly Leu Arg Gly Phe 65 70 75 80 Gln Pro Gln Gln Thr Glu Gln Thr Leu Arg Gln Ile Leu Gln Asp Val 85 90 95 Lys Ala Ala Asn Ala Glu Pro Leu Leu Met Gln Ile Arg Leu Pro Ala 100 105 110 Asn Tyr Gly Arg Arg Tyr Asn Glu Ala Phe Ser Ala Ile Tyr Pro Lys 115 120 125 Leu Ala Lys Glu Phe Asp Val Pro Leu Leu Pro Phe Phe Met Glu Glu 130 135 140 Val Tyr Leu Lys Pro Gln Trp Met Gln Asp Asp Gly Ile His Pro Asn 145 150 155 160 Arg Asp Ala Gln Pro Phe Ile Ala Asp Trp Met Ala Lys Gln Leu Gln 165 170 175 Pro Leu Val Asn His Asp Ser 180 2561PRTEscherichia coli 2Met Lys Lys Val Trp Leu Asn Arg Tyr Pro Ala Asp Val Pro Thr Glu 1 5 10 15 Ile Asn Pro Asp Arg Tyr Gln Ser Leu Val Asp Met Phe Glu Gln Ser 20 25 30 Val Ala Arg Tyr Ala Asp Gln Pro Ala Phe Val Asn Met Gly Glu Val 35 40 45 Met Thr Phe Arg Lys Leu Glu Glu Arg Ser Arg Ala Phe Ala Ala Tyr 50 55 60 Leu Gln Gln Gly Leu Gly Leu Lys Lys Gly Asp Arg Val Ala Leu Met 65 70 75 80 Met Pro Asn Leu Leu Gln Tyr Pro Val Ala Leu Phe Gly Ile Leu Arg 85 90 95 Ala Gly Met Ile Val Val Asn Val Asn Pro Leu Tyr Thr Pro Arg Glu 100 105 110 Leu Glu His Gln Leu Asn Asp Ser Gly Ala Ser Ala Ile Val Ile Val 115 120 125 Ser Asn Phe Ala His Thr Leu Glu Lys Val Val Asp Lys Thr Ala Val 130 135 140 Gln His Val Ile Leu Thr Arg Met Gly Asp Gln Leu Ser Thr Ala Lys 145 150 155 160 Gly Thr Val Val Asn Phe Val Val Lys Tyr Ile Lys Arg Leu Val Pro 165 170 175 Lys Tyr His Leu Pro Asp Ala Ile Ser Phe Arg Ser Ala Leu His Asn 180 185 190 Gly Tyr Arg Met Gln Tyr Val Lys Pro Glu Leu Val Pro Glu Asp Leu 195 200 205 Ala Phe Leu Gln Tyr Thr Gly Gly Thr Thr Gly Val Ala Lys Gly Ala 210 215 220 Met Leu Thr His Arg Asn Met Leu Ala 
Asn Leu Glu Gln Val Asn Ala 225 230 235 240 Thr Tyr Gly Pro Leu Leu His Pro Gly Lys Glu Leu Val Val Thr Ala 245 250 255 Leu Pro Leu Tyr His Ile Phe Ala Leu Thr Ile Asn Cys Leu Leu Phe 260 265 270 Ile Glu Leu Gly Gly Gln Asn Leu Leu Ile Thr Asn Pro Arg Asp Ile 275 280 285 Pro Gly Leu Val Lys Glu Leu Ala Lys Tyr Pro Phe Thr Ala Ile Thr 290 295 300 Gly Val Asn Thr Leu Phe Asn Ala Leu Leu Asn Asn Lys Glu Phe Gln 305 310 315 320 Gln Leu Asp Phe Ser Ser Leu His Leu Ser Ala Gly Gly Gly Met Pro 325 330 335 Val Gln Gln Val Val Ala Glu Arg Trp Val Lys Leu Thr Gly Gln Tyr 340 345 350 Leu Leu Glu Gly Tyr Gly Leu Thr Glu Cys Ala Pro Leu Val Ser Val 355 360 365 Asn Pro Tyr Asp Ile Asp Tyr His Ser Gly Ser Ile Gly Leu Pro Val 370 375 380 Pro Ser Thr Glu Ala Lys Leu Val Asp Asp Asp Asp Asn Glu Val Pro 385 390 395 400 Pro Gly Gln Pro Gly Glu Leu Cys Val Lys Gly Pro Gln Val Met Leu 405 410 415 Gly Tyr Trp Gln Arg Pro Asp Ala Thr Asp Glu Ile Ile Lys Asn Gly 420 425 430 Trp Leu His Thr Gly Asp Ile Ala Val Met Asp Glu Glu Gly Phe Leu 435 440 445 Arg Ile Val Asp Arg Lys Lys Asp Met Ile Leu Val Ser Gly Phe Asn 450 455 460 Val Tyr Pro Asn Glu Ile Glu Asp Val Val Met Gln His Pro Gly Val 465 470 475 480 Gln Glu Val Ala Ala Val Gly Val Pro Ser Gly Ser Ser Gly Glu Ala 485 490 495 Val Lys Ile Phe Val Val Lys Lys Asp Pro Ser Leu Thr Glu Glu Ser 500 505 510 Leu Val Thr Phe Cys Arg Arg Gln Leu Thr Gly Tyr Lys Val Pro Lys 515 520 525 Leu Val Glu Phe Arg Asp Glu Leu Pro Lys Ser Asn Val Gly Lys Ile 530 535 540 Leu Arg Arg Glu Leu Arg Asp Glu Ala Arg Gly Lys Val Asp Asn Lys 545 550 555 560 Ala 3458PRTAcinetobacter baylyi 3Met Arg Pro Leu His Pro Ile Asp Phe Ile Phe Leu Ser Leu Glu Lys 1 5 10 15 Arg Gln Gln Pro Met His Val Gly Gly Leu Phe Leu Phe Gln Ile Pro 20 25 30 Asp Asn Ala Pro Asp Thr Phe Ile Gln Asp Leu Val Asn Asp Ile Arg 35 40 45 Ile Ser Lys Ser Ile Pro Val Pro Pro Phe Asn Asn Lys Leu Asn Gly 50 55 60 Leu Phe Trp Asp Glu Asp Glu Glu Phe Asp Leu Asp His His Phe Arg 65 70 75 80 His Ile Ala Leu 
Pro His Pro Gly Arg Ile Arg Glu Leu Leu Ile Tyr 85 90 95 Ile Ser Gln Glu His Ser Thr Leu Leu Asp Arg Ala Lys Pro Leu Trp 100 105 110 Thr Cys Asn Ile Ile Glu Gly Ile Glu Gly Asn Arg Phe Ala Met Tyr 115 120 125 Phe Lys Ile His His Ala Met Val Asp Gly Val Ala Gly Met Arg Leu 130 135 140 Ile Glu Lys Ser Leu Ser His Asp Val Thr Glu Lys Ser Ile Val Pro 145 150 155 160 Pro Trp Cys Val Glu Gly Lys Arg Ala Lys Arg Leu Arg Glu Pro Lys 165 170 175 Thr Gly Lys Ile Lys Lys Ile Met Ser Gly Ile Lys Ser Gln Leu Gln 180 185 190 Ala Thr Pro Thr Val Ile Gln Glu Leu Ser Gln Thr Val Phe Lys Asp 195 200 205 Ile Gly Arg Asn Pro Asp His Val Ser Ser Phe Gln Ala Pro Cys Ser 210 215 220 Ile Leu Asn Gln Arg Val Ser Ser Ser Arg Arg Phe Ala Ala Gln Ser 225 230 235 240 Phe Asp Leu Asp Arg Phe Arg Asn Ile Ala Lys Ser Leu Asn Val Thr 245 250 255 Ile Asn Asp Val Val Leu Ala Val Cys Ser Gly Ala Leu Arg Ala Tyr 260 265 270 Leu Met Ser His Asn Ser Leu Pro Ser Lys Pro Leu Ile Ala Met Val 275 280 285 Pro Ala Ser Ile Arg Asn Asp Asp Ser Asp Val Ser Asn Arg Ile Thr 290 295 300 Met Ile Leu Ala Asn Leu Ala Thr His Lys Asp Asp Pro Leu Gln Arg 305 310 315 320 Leu Glu Ile Ile Arg Arg Ser Val Gln Asn Ser Lys Gln Arg Phe Lys 325 330 335 Arg Met Thr Ser Asp Gln Ile Leu Asn Tyr Ser Ala Val Val Tyr Gly 340 345 350 Pro Ala Gly Leu Asn Ile Ile Ser Gly Met Met Pro Lys Arg Gln Ala 355 360 365 Phe Asn Leu Val Ile Ser Asn Val Pro Gly Pro Arg Glu Pro Leu Tyr 370 375 380 Trp Asn Gly Ala Lys Leu Asp Ala Leu Tyr Pro Ala Ser Ile Val Leu 385 390 395 400 Asp Gly Gln Ala Leu Asn Ile Thr Met Thr Ser Tyr Leu Asp Lys Leu 405 410 415 Glu Val Gly Leu Ile Ala Cys Arg Asn Ala Leu Pro Arg Met Gln Asn 420 425 430 Leu Leu Thr His Leu Glu Glu Glu Ile Gln Leu Phe Glu Gly Val Ile 435 440 445 Ala Lys Gln Glu Asp Ile Lys Thr Ala Asn 450 455 4552DNAEscherichia coli 4atggcggata ctctgctgat tctgggtgat tctctgtctg caggctaccg tatgtccgcc 60tccgcggcct ggccagctct gctgaatgat aagtggcagt ctaagacgtc cgttgtgaac 120gcatccatct ctggcgacac gagccagcag ggcctggccc 
gtctgcctgc actgctgaaa 180cagcaccaac cgcgctgggt cctggtggag ctgggcggta acgacggtct gcgcggcttc 240cagccgcagc agaccgaaca gactctgcgt cagattctgc aggacgtgaa agctgctaac 300gcggaaccgc tgctgatgca gattcgtctg ccagcgaact atggccgccg ttacaacgaa 360gcgttctctg caatctaccc aaaactggcg aaagagtttg acgtcccgct gctgccgttc 420ttcatggagg aagtatacct gaaaccgcag tggatgcaag atgacggcat ccacccgaac 480cgtgatgcgc agccgttcat cgctgactgg atggcgaagc aactgcagcc gctggtaaac 540cacgattcct aa 55251686DNAEscherichia coli 5atgaagaaag tttggctgaa ccgttatccg gcagatgtac cgactgaaat taacccagat 60cgttaccagt ccctggttga catgttcgaa cagtccgtgg ctcgctacgc cgatcagcct 120gctttcgtca acatgggtga ggtaatgacc tttcgcaaac tggaggagcg ttcccgtgct 180ttcgcggcat acctgcagca gggtctgggc ctgaagaaag gcgaccgcgt ggccctgatg 240atgccgaacc tgctgcaata tcctgtggcg ctgttcggta tcctgcgtgc tggtatgatc 300gttgtcaatg ttaaccctct gtatacccct cgtgaactgg agcaccagct gaatgactct 360ggtgcgtctg ctatcgttat cgtttccaat ttcgcacata cgctggagaa agtggttgat 420aaaaccgcag tgcagcatgt cattctgact cgcatgggtg accagctgtc caccgctaaa 480ggtactgtag tcaacttcgt tgtgaaatac attaagcgcc tggttccgaa ataccacctg 540ccagatgcaa ttagctttcg ctctgcactg cataacggtt accgtatgca gtacgtaaaa 600ccagagctgg tgccggaaga cctggccttt ctgcagtata ccggcggcac caccggcgtg 660gcaaagggcg cgatgctgac ccatcgtaac atgctggcga acctggagca ggttaacgca 720acgtacggcc cgctgctgca cccgggtaaa gaactggtag ttacggcact gcctctgtat 780cacatctttg cactgacgat caactgtctg ctgttcattg aactgggtgg tcagaacctg 840ctgatcacca acccgcgtga cattccgggc ctggtaaaag agctggctaa gtacccgttc 900accgccatta ctggcgtaaa cactctgttt aacgcgctgc tgaacaacaa agagtttcag 960cagctggact tctctagcct gcacctgagc gctggcggtg gcatgccggt tcagcaggtt 1020gtggcagagc gttgggtgaa actgaccggc cagtatctgc tggagggtta tggtctgacc 1080gagtgtgcac cgctggtcag cgttaacccg tatgatattg attaccactc tggttctatt 1140ggtctgccgg ttccgtccac ggaagccaaa ctggtggacg atgacgacaa cgaagtacct 1200ccgggccagc cgggtgagct gtgtgtcaag ggtccgcagg ttatgctggg ctactggcag 1260cgcccggacg ccaccgacga aatcattaaa aacggttggc tgcataccgg 
tgatatcgct 1320gtaatggacg aagaaggttt cctgcgtatc gtggaccgta agaaagatat gattctggtg 1380agcggtttca acgtgtaccc gaacgaaatt gaggacgtag ttatgcaaca ccctggcgtg 1440caggaggtgg cagccgtggg cgtgccgtcc ggttcttctg gtgaggctgt gaaaatcttt 1500gtcgttaaaa aggacccgtc cctgaccgaa gaatctctgg tgacgttttg ccgccgtcaa 1560ctgactggct acaaagtgcc gaaactggtc gagttccgcg atgagctgcc aaaatctaac 1620gtgggtaaga tcctgcgccg cgagctgcgt gacgaggcac gtggcaaagt tgacaataaa 1680gcataa 168661377DNAAcinetobacter baylyi 6atgcgcccac ttcatccgat cgatttcatt ttcctgtccc tggagaaacg ccagcagccg 60atgcacgtag gtggtctgtt cctgttccag atcccggata acgctccgga cacctttatt 120caggacctgg tgaacgatat ccgtatctcc aagtctattc cggttccgcc gttcaacaac 180aagctgaacg gtctgttctg ggacgaagac gaggagttcg atctggatca ccatttccgt 240catattgcgc tgccgcaccc gggtcgcatc cgtgagctgc tgatttacat ctctcaggaa 300cacagcactc tcctcgatcg cgctaaacct ctgtggactt gcaacatcat tgaaggtatc 360gagggtaacc gtttcgccat gtacttcaag attcatcatg cgatggtgga tggtgtggcg 420ggtatgcgtc tgattgagaa aagcctgtcc catgatgtta ctgaaaagag catcgtaccg 480ccgtggtgcg ttgagggcaa acgtgctaaa cgcctgcgtg aaccgaagac cggcaaaatt 540aagaaaatca tgtctggtat taaatctcag ctccaggcca ccccgaccgt tattcaagaa 600ctgtctcaga cggtcttcaa agacatcggc cgtaatccgg accacgtttc ctctttccag 660gcgccgtgct ccatcctcaa ccagcgtgtg tcttcttctc gtcgtttcgc agcacagagc 720tttgacctgg accgtttccg caacatcgcc aaatctctga acgtgaccat taacgacgtt 780gtcctggctg tgtgtagcgg tgctctgcgc gcttatctga tgtctcataa ctctctgcca 840tccaaaccgc tgatcgctat ggtcccagca agcatccgca acgatgattc tgatgtgtcc 900aaccgtatta ctatgattct ggccaacctc gctactcaca aagacgaccc tctgcagcgt 960ctggaaatca tccgccgctc cgtccagaac tctaaacagc gttttaaacg catgacttcc 1020gaccagattc tgaactattc tgcggttgta tacggcccgg ctggtctgaa cattatcagc 1080ggtatgatgc cgaaacgtca ggcttttaac ctggtaatca gcaacgttcc tggcccgcgt 1140gagccgctgt actggaacgg cgcaaaactg gacgcactgt acccggcttc catcgttctg 1200gatggccagg ctctgaacat cactatgacc tcttacctgg acaaactgga agtaggtctg 1260atcgcgtgtc gcaatgcact gccgcgcatg cagaacctgc tgacccacct ggaggaggaa 
1320atccagctgt ttgagggcgt tatcgccaaa caggaagata tcaaaacggc gaactaa 13777493PRTEscherichia coli 7Met Lys Lys Leu Leu Pro Ile Leu Ile Gly Leu Ser Leu Ser Gly Phe 1 5 10 15 Ser Ser Leu Ser Gln Ala Glu Asn Leu Met Gln Val Tyr Gln Gln Ala 20 25 30 Arg Leu Ser Asn Pro Glu Leu Arg Lys Ser Ala Ala Asp Arg Asp Ala 35 40 45 Ala Phe Glu Lys Ile Asn Glu Ala Arg Ser Pro Leu Leu Pro Gln Leu 50 55 60 Gly Leu Gly Ala Asp Tyr Thr Tyr Ser Asn Gly Tyr Arg Asp Ala Asn 65 70 75 80 Gly Ile Asn Ser Asn Ala Thr Ser Ala Ser Leu Gln Leu Thr Gln Ser 85 90 95 Ile Phe Asp Met Ser Lys Trp Arg Ala Leu Thr Leu Gln Glu Lys Ala 100 105 110 Ala Gly Ile Gln Asp Val Thr Tyr Gln Thr Asp Gln Gln Thr Leu Ile 115 120 125 Leu Asn Thr Ala Thr Ala Tyr Phe Asn Val Leu Asn Ala Ile Asp Val 130 135 140 Leu Ser Tyr Thr Gln Ala Gln Lys Glu Ala Ile Tyr Arg Gln Leu Asp 145 150 155 160 Gln Thr Thr Gln Arg Phe Asn Val Gly Leu Val Ala Ile Thr Asp Val 165 170 175 Gln Asn Ala Arg Ala Gln Tyr Asp Thr Val Leu Ala Asn Glu Val Thr 180 185 190 Ala Arg Asn Asn Leu Asp Asn Ala Val Glu Gln Leu Arg Gln Ile Thr 195 200 205 Gly Asn Tyr Tyr Pro Glu Leu Ala Ala Leu Asn Val Glu Asn Phe Lys 210 215 220 Thr Asp Lys Pro Gln Pro Val Asn Ala Leu Leu Lys Glu Ala Glu Lys 225 230 235 240 Arg Asn Leu Ser Leu Leu Gln Ala Arg Leu Ser Gln Asp Leu Ala Arg 245 250 255 Glu Gln Ile Arg Gln Ala Gln Asp Gly His Leu Pro Thr Leu Asp Leu 260 265 270 Thr Ala Ser Thr Gly Ile Ser Asp Thr Ser Tyr Ser Gly Ser Lys Thr 275 280 285 Arg Gly Ala Ala Gly Thr Gln Tyr Asp Asp Ser Asn Met Gly Gln Asn 290 295 300 Lys Val Gly Leu Ser Phe Ser Leu Pro Ile Tyr Gln Gly Gly Met Val 305 310 315 320 Asn Ser Gln Val Lys Gln Ala Gln Tyr Asn Phe Val Gly Ala Ser Glu 325 330 335 Gln Leu Glu Ser Ala His Arg Ser Val Val Gln Thr Val Arg Ser Ser 340 345 350 Phe Asn Asn Ile Asn Ala Ser Ile Ser Ser Ile Asn Ala Tyr Lys Gln 355 360 365 Ala Val Val Ser Ala Gln Ser Ser Leu Asp Ala Met Glu Ala Gly Tyr 370 375 380 Ser Val Gly Thr Arg Thr Ile Val Asp Val Leu Asp Ala Thr Thr Thr 385 390 395 400 
Leu Tyr Asn Ala Lys Gln Glu Leu Ala Asn Ala Arg Tyr Asn Tyr Leu 405 410 415 Ile Asn Gln Leu Asn Ile Lys Ser Ala Leu Gly Thr Leu Asn Glu Gln 420 425 430 Asp Leu Leu Ala Leu Asn Asn Ala Leu Ser Lys Pro Val Ser Thr Asn 435 440 445 Pro Glu Asn Val Ala Pro Gln Thr Pro Glu Gln Asn Ala Ile Ala Asp 450 455 460 Gly Tyr Ala Pro Asp Ser Pro Ala Pro Val Val Gln Gln Thr Ser Ala 465 470 475 480 Arg Thr Thr Thr Ser Asn Gly His Asn Pro Phe Arg Asn 485 490 8397PRTEscherichia coli 8Met Asn Lys Asn Arg Gly Phe Thr Pro Leu Ala Val Val Leu Met Leu 1 5 10

15 Ser Gly Ser Leu Ala Leu Thr Gly Cys Asp Asp Lys Gln Ala Gln Gln 20 25 30 Gly Gly Gln Gln Met Pro Ala Val Gly Val Val Thr Val Lys Thr Glu 35 40 45 Pro Leu Gln Ile Thr Thr Glu Leu Pro Gly Arg Thr Ser Ala Tyr Arg 50 55 60 Ile Ala Glu Val Arg Pro Gln Val Ser Gly Ile Ile Leu Lys Arg Asn 65 70 75 80 Phe Lys Glu Gly Ser Asp Ile Glu Ala Gly Val Ser Leu Tyr Gln Ile 85 90 95 Asp Pro Ala Thr Tyr Gln Ala Thr Tyr Asp Ser Ala Lys Gly Asp Leu 100 105 110 Ala Lys Ala Gln Ala Ala Ala Asn Ile Ala Gln Leu Thr Val Asn Arg 115 120 125 Tyr Gln Lys Leu Leu Gly Thr Gln Tyr Ile Ser Lys Gln Glu Tyr Asp 130 135 140 Gln Ala Leu Ala Asp Ala Gln Gln Ala Asn Ala Ala Val Thr Ala Ala 145 150 155 160 Lys Ala Ala Val Glu Thr Ala Arg Ile Asn Leu Ala Tyr Thr Lys Val 165 170 175 Thr Ser Pro Ile Ser Gly Arg Ile Gly Lys Ser Asn Val Thr Glu Gly 180 185 190 Ala Leu Val Gln Asn Gly Gln Ala Thr Ala Leu Ala Thr Val Gln Gln 195 200 205 Leu Asp Pro Ile Tyr Val Asp Val Thr Gln Ser Ser Asn Asp Phe Leu 210 215 220 Arg Leu Lys Gln Glu Leu Ala Asn Gly Thr Leu Lys Gln Glu Asn Gly 225 230 235 240 Lys Ala Lys Val Ser Leu Ile Thr Ser Asp Gly Ile Lys Phe Pro Gln 245 250 255 Asp Gly Thr Leu Glu Phe Ser Asp Val Thr Val Asp Gln Thr Thr Gly 260 265 270 Ser Ile Thr Leu Arg Ala Ile Phe Pro Asn Pro Asp His Thr Leu Leu 275 280 285 Pro Gly Met Phe Val Arg Ala Arg Leu Glu Glu Gly Leu Asn Pro Asn 290 295 300 Ala Ile Leu Val Pro Gln Gln Gly Val Thr Arg Thr Pro Arg Gly Asp 305 310 315 320 Ala Thr Val Leu Val Val Gly Ala Asp Asp Lys Val Glu Thr Arg Pro 325 330 335 Ile Val Ala Ser Gln Ala Ile Gly Asp Lys Trp Leu Val Thr Glu Gly 340 345 350 Leu Lys Ala Gly Asp Arg Val Val Ile Ser Gly Leu Gln Lys Val Arg 355 360 365 Pro Gly Val Gln Val Lys Ala Gln Glu Val Thr Ala Asp Asn Asn Gln 370 375 380 Gln Ala Ala Ser Gly Ala Gln Pro Glu Gln Ser Lys Ser 385 390 395 91049PRTEscherichia coli 9Met Pro Asn Phe Phe Ile Asp Arg Pro Ile Phe Ala Trp Val Ile Ala 1 5 10 15 Ile Ile Ile Met Leu Ala Gly Gly Leu Ala Ile Leu Lys Leu Pro Val 20 25 30 Ala Gln Tyr 
Pro Thr Ile Ala Pro Pro Ala Val Thr Ile Ser Ala Ser 35 40 45 Tyr Pro Gly Ala Asp Ala Lys Thr Val Gln Asp Thr Val Thr Gln Val 50 55 60 Ile Glu Gln Asn Met Asn Gly Ile Asp Asn Leu Met Tyr Met Ser Ser 65 70 75 80 Asn Ser Asp Ser Thr Gly Thr Val Gln Ile Thr Leu Thr Phe Glu Ser 85 90 95 Gly Thr Asp Ala Asp Ile Ala Gln Val Gln Val Gln Asn Lys Leu Gln 100 105 110 Leu Ala Met Pro Leu Leu Pro Gln Glu Val Gln Gln Gln Gly Val Ser 115 120 125 Val Glu Lys Ser Ser Ser Ser Phe Leu Met Val Val Gly Val Ile Asn 130 135 140 Thr Asp Gly Thr Met Thr Gln Glu Asp Ile Ser Asp Tyr Val Ala Ala 145 150 155 160 Asn Met Lys Asp Ala Ile Ser Arg Thr Ser Gly Val Gly Asp Val Gln 165 170 175 Leu Phe Gly Ser Gln Tyr Ala Met Arg Ile Trp Met Asn Pro Asn Glu 180 185 190 Leu Asn Lys Phe Gln Leu Thr Pro Val Asp Val Ile Thr Ala Ile Lys 195 200 205 Ala Gln Asn Ala Gln Val Ala Ala Gly Gln Leu Gly Gly Thr Pro Pro 210 215 220 Val Lys Gly Gln Gln Leu Asn Ala Ser Ile Ile Ala Gln Thr Arg Leu 225 230 235 240 Thr Ser Thr Glu Glu Phe Gly Lys Ile Leu Leu Lys Val Asn Gln Asp 245 250 255 Gly Ser Arg Val Leu Leu Arg Asp Val Ala Lys Ile Glu Leu Gly Gly 260 265 270 Glu Asn Tyr Asp Ile Ile Ala Glu Phe Asn Gly Gln Pro Ala Ser Gly 275 280 285 Leu Gly Ile Lys Leu Ala Thr Gly Ala Asn Ala Leu Asp Thr Ala Ala 290 295 300 Ala Ile Arg Ala Glu Leu Ala Lys Met Glu Pro Phe Phe Pro Ser Gly 305 310 315 320 Leu Lys Ile Val Tyr Pro Tyr Asp Thr Thr Pro Phe Val Lys Ile Ser 325 330 335 Ile His Glu Val Val Lys Thr Leu Val Glu Ala Ile Ile Leu Val Phe 340 345 350 Leu Val Met Tyr Leu Phe Leu Gln Asn Phe Arg Ala Thr Leu Ile Pro 355 360 365 Thr Ile Ala Val Pro Val Val Leu Leu Gly Thr Phe Ala Val Leu Ala 370 375 380 Ala Phe Gly Phe Ser Ile Asn Thr Leu Thr Met Phe Gly Met Val Leu 385 390 395 400 Ala Ile Gly Leu Leu Val Asp Asp Ala Ile Val Val Val Glu Asn Val 405 410 415 Glu Arg Val Met Ala Glu Glu Gly Leu Pro Pro Lys Glu Ala Thr Arg 420 425 430 Lys Ser Met Gly Gln Ile Gln Gly Ala Leu Val Gly Ile Ala Met Val 435 440 445 Leu Ser Ala Val Phe Val Pro 
Met Ala Phe Phe Gly Gly Ser Thr Gly 450 455 460 Ala Ile Tyr Arg Gln Phe Ser Ile Thr Ile Val Ser Ala Met Ala Leu 465 470 475 480 Ser Val Leu Val Ala Leu Ile Leu Thr Pro Ala Leu Cys Ala Thr Met 485 490 495 Leu Lys Pro Ile Ala Lys Gly Asp His Gly Glu Gly Lys Lys Gly Phe 500 505 510 Phe Gly Trp Phe Asn Arg Met Phe Glu Lys Ser Thr His His Tyr Thr 515 520 525 Asp Ser Val Gly Gly Ile Leu Arg Ser Thr Gly Arg Tyr Leu Val Leu 530 535 540 Tyr Leu Ile Ile Val Val Gly Met Ala Tyr Leu Phe Val Arg Leu Pro 545 550 555 560 Ser Ser Phe Leu Pro Asp Glu Asp Gln Gly Val Phe Met Thr Met Val 565 570 575 Gln Leu Pro Ala Gly Ala Thr Gln Glu Arg Thr Gln Lys Val Leu Asn 580 585 590 Glu Val Thr His Tyr Tyr Leu Thr Lys Glu Lys Asn Asn Val Glu Ser 595 600 605 Val Phe Ala Val Asn Gly Phe Gly Phe Ala Gly Arg Gly Gln Asn Thr 610 615 620 Gly Ile Ala Phe Val Ser Leu Lys Asp Trp Ala Asp Arg Pro Gly Glu 625 630 635 640 Glu Asn Lys Val Glu Ala Ile Thr Met Arg Ala Thr Arg Ala Phe Ser 645 650 655 Gln Ile Lys Asp Ala Met Val Phe Ala Phe Asn Leu Pro Ala Ile Val 660 665 670 Glu Leu Gly Thr Ala Thr Gly Phe Asp Phe Glu Leu Ile Asp Gln Ala 675 680 685 Gly Leu Gly His Glu Lys Leu Thr Gln Ala Arg Asn Gln Leu Leu Ala 690 695 700 Glu Ala Ala Lys His Pro Asp Met Leu Thr Ser Val Arg Pro Asn Gly 705 710 715 720 Leu Glu Asp Thr Pro Gln Phe Lys Ile Asp Ile Asp Gln Glu Lys Ala 725 730 735 Gln Ala Leu Gly Val Ser Ile Asn Asp Ile Asn Thr Thr Leu Gly Ala 740 745 750 Ala Trp Gly Gly Ser Tyr Val Asn Asp Phe Ile Asp Arg Gly Arg Val 755 760 765 Lys Lys Val Tyr Val Met Ser Glu Ala Lys Tyr Arg Met Leu Pro Asp 770 775 780 Asp Ile Gly Asp Trp Tyr Val Arg Ala Ala Asp Gly Gln Met Val Pro 785 790 795 800 Phe Ser Ala Phe Ser Ser Ser Arg Trp Glu Tyr Gly Ser Pro Arg Leu 805 810 815 Glu Arg Tyr Asn Gly Leu Pro Ser Met Glu Ile Leu Gly Gln Ala Ala 820 825 830 Pro Gly Lys Ser Thr Gly Glu Ala Met Glu Leu Met Glu Gln Leu Ala 835 840 845 Ser Lys Leu Pro Thr Gly Val Gly Tyr Asp Trp Thr Gly Met Ser Tyr 850 855 860 Gln Glu Arg Leu Ser Gly Asn Gln 
Ala Pro Ser Leu Tyr Ala Ile Ser 865 870 875 880 Leu Ile Val Val Phe Leu Cys Leu Ala Ala Leu Tyr Glu Ser Trp Ser 885 890 895 Ile Pro Phe Ser Val Met Leu Val Val Pro Leu Gly Val Ile Gly Ala 900 905 910 Leu Leu Ala Ala Thr Phe Arg Gly Leu Thr Asn Asp Val Tyr Phe Gln 915 920 925 Val Gly Leu Leu Thr Thr Ile Gly Leu Ser Ala Lys Asn Ala Ile Leu 930 935 940 Ile Val Glu Phe Ala Lys Asp Leu Met Asp Lys Glu Gly Lys Gly Leu 945 950 955 960 Ile Glu Ala Thr Leu Asp Ala Val Arg Met Arg Leu Arg Pro Ile Leu 965 970 975 Met Thr Ser Leu Ala Phe Ile Leu Gly Val Met Pro Leu Val Ile Ser 980 985 990 Thr Gly Ala Gly Ser Gly Ala Gln Asn Ala Val Gly Thr Gly Val Met 995 1000 1005 Gly Gly Met Val Thr Ala Thr Val Leu Ala Ile Phe Phe Val Pro 1010 1015 1020 Val Phe Phe Val Val Val Arg Arg Arg Phe Ser Arg Lys Asn Glu 1025 1030 1035 Asp Ile Glu His Ser His Thr Val Asp His His 1040 1045 103821DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 10gcggccgcgg gggggggggg gaaagccacg ttgtgtctca aaatctctga tgttacattg 60cacaagataa aaatatatca tcatgaacaa taaaactgtc tgcttacata aacagtaata 120caaggggtca tatggcggat actctgctga ttctgggtga ttctctgtct gcaggctacc 180gtatgtccgc ctccgcggcc tggccagctc tgctgaatga taagtggcag tctaagacgt 240ccgttgtgaa cgcatccatc tctggcgaca cgagccagca gggcctggcc cgtctgcctg 300cactgctgaa acagcaccaa ccgcgctggg tcctggtgga gctgggcggt aacgacggtc 360tgcgcggctt ccagccgcag cagaccgaac agactctgcg tcagattctg caggacgtga 420aagctgctaa cgcggaaccg ctgctgatgc agattcgtct gccagcgaac tatggccgcc 480gttacaacga agcgttctct gcaatctacc caaaactggc gaaagagttt gacgtcccgc 540tgctgccgtt cttcatggag gaagtatacc tgaaaccgca gtggatgcaa gatgacggca 600tccacccgaa ccgtgatgcg cagccgttca tcgctgactg gatggcgaag caactgcagc 660cgctggtaaa ccacgattcc taattaaaga tctgtagtag gatccatgta gggtgaggtt 720atagctatga agaaagtttg gctgaaccgt tatccggcag atgtaccgac tgaaattaac 780ccagatcgtt accagtccct ggttgacatg ttcgaacagt ccgtggctcg ctacgccgat 840cagcctgctt tcgtcaacat gggtgaggta atgacctttc gcaaactgga ggagcgttcc 
900cgtgctttcg cggcatacct gcagcagggt ctgggcctga agaaaggcga ccgcgtggcc 960ctgatgatgc cgaacctgct gcaatatcct gtggcgctgt tcggtatcct gcgtgctggt 1020atgatcgttg tcaatgttaa ccctctgtat acccctcgtg aactggagca ccagctgaat 1080gactctggtg cgtctgctat cgttatcgtt tccaatttcg cacatacgct ggagaaagtg 1140gttgataaaa ccgcagtgca gcatgtcatt ctgactcgca tgggtgacca gctgtccacc 1200gctaaaggta ctgtagtcaa cttcgttgtg aaatacatta agcgcctggt tccgaaatac 1260cacctgccag atgcaattag ctttcgctct gcactgcata acggttaccg tatgcagtac 1320gtaaaaccag agctggtgcc ggaagacctg gcctttctgc agtataccgg cggcaccacc 1380ggcgtggcaa agggcgcgat gctgacccat cgtaacatgc tggcgaacct ggagcaggtt 1440aacgcaacgt acggcccgct gctgcacccg ggtaaagaac tggtagttac ggcactgcct 1500ctgtatcaca tctttgcact gacgatcaac tgtctgctgt tcattgaact gggtggtcag 1560aacctgctga tcaccaaccc gcgtgacatt ccgggcctgg taaaagagct ggctaagtac 1620ccgttcaccg ccattactgg cgtaaacact ctgtttaacg cgctgctgaa caacaaagag 1680tttcagcagc tggacttctc tagcctgcac ctgagcgctg gcggtggcat gccggttcag 1740caggttgtgg cagagcgttg ggtgaaactg accggccagt atctgctgga gggttatggt 1800ctgaccgagt gtgcaccgct ggtcagcgtt aacccgtatg atattgatta ccactctggt 1860tctattggtc tgccggttcc gtccacggaa gccaaactgg tggacgatga cgacaacgaa 1920gtacctccgg gccagccggg tgagctgtgt gtcaagggtc cgcaggttat gctgggctac 1980tggcagcgcc cggacgccac cgacgaaatc attaaaaacg gttggctgca taccggtgat 2040atcgctgtaa tggacgaaga aggtttcctg cgtatcgtgg accgtaagaa agatatgatt 2100ctggtgagcg gtttcaacgt gtacccgaac gaaattgagg acgtagttat gcaacaccct 2160ggcgtgcagg aggtggcagc cgtgggcgtg ccgtccggtt cttctggtga ggctgtgaaa 2220atctttgtcg ttaaaaagga cccgtccctg accgaagaat ctctggtgac gttttgccgc 2280cgtcaactga ctggctacaa agtgccgaaa ctggtcgagt tccgcgatga gctgccaaaa 2340tctaacgtgg gtaagatcct gcgccgcgag ctgcgtgacg aggcacgtgg caaagttgac 2400aataaagcat aaccgcgtag gaggacagct atgcgcccac ttcatccgat cgatttcatt 2460ttcctgtccc tggagaaacg ccagcagccg atgcacgtag gtggtctgtt cctgttccag 2520atcccggata acgctccgga cacctttatt caggacctgg tgaacgatat ccgtatctcc 2580aagtctattc cggttccgcc gttcaacaac 
aagctgaacg gtctgttctg ggacgaagac 2640gaggagttcg atctggatca ccatttccgt catattgcgc tgccgcaccc gggtcgcatc 2700cgtgagctgc tgatttacat ctctcaggaa cacagcactc tcctcgatcg cgctaaacct 2760ctgtggactt gcaacatcat tgaaggtatc gagggtaacc gtttcgccat gtacttcaag 2820attcatcatg cgatggtgga tggtgtggcg ggtatgcgtc tgattgagaa aagcctgtcc 2880catgatgtta ctgaaaagag catcgtaccg ccgtggtgcg ttgagggcaa acgtgctaaa 2940cgcctgcgtg aaccgaagac cggcaaaatt aagaaaatca tgtctggtat taaatctcag 3000ctccaggcca ccccgaccgt tattcaagaa ctgtctcaga cggtcttcaa agacatcggc 3060cgtaatccgg accacgtttc ctctttccag gcgccgtgct ccatcctcaa ccagcgtgtg 3120tcttcttctc gtcgtttcgc agcacagagc tttgacctgg accgtttccg caacatcgcc 3180aaatctctga acgtgaccat taacgacgtt gtcctggctg tgtgtagcgg tgctctgcgc 3240gcttatctga tgtctcataa ctctctgcca tccaaaccgc tgatcgctat ggtcccagca 3300agcatccgca acgatgattc tgatgtgtcc aaccgtatta ctatgattct ggccaacctc 3360gctactcaca aagacgaccc tctgcagcgt ctggaaatca tccgccgctc cgtccagaac 3420tctaaacagc gttttaaacg catgacttcc gaccagattc tgaactattc tgcggttgta 3480tacggcccgg ctggtctgaa cattatcagc ggtatgatgc cgaaacgtca ggcttttaac 3540ctggtaatca gcaacgttcc tggcccgcgt gagccgctgt actggaacgg cgcaaaactg 3600gacgcactgt acccggcttc catcgttctg gatggccagg ctctgaacat cactatgacc 3660tcttacctgg acaaactgga agtaggtctg atcgcgtgtc gcaatgcact gccgcgcatg 3720cagaacctgc tgacccacct ggaggaggaa atccagctgt ttgagggcgt tatcgccaaa 3780caggaagata tcaaaacggc gaactaacca tggttgaatt c 3821117502DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 11cctgcagggt cagcaagctc tggaatttcc cgattctctg atgggagatc caaaaattct 60cgcagtccct caatcacgat atcggtcttg gatcgccctg tagcttccga caactgctca 120attttttcga gcatctctac cgggcatcgg aatgaaatta acggtgtttt agccatgtgt 180tatacagtgt ttacaacttg actaacaaat acctgctagt gtatacatat tgtattgcaa 240tgtatacgct attttcactg ctgtctttaa tggggattat cgcaagcaag taaaaaagcc 300tgaaaacccc aataggtaag ggattccgag cttactcgat aattatcacc tttgagcgcc 360cctaggagga ggcgaaaagc tatgtctgac aaggggtttg acccctgaag tcgttgcgcg 420agcattaagg 
tctgcggata gcccataaca tacttttgtt gaacttgtgc gcttttatca 480accccttaag ggcttgggag cgttttatgc ggccgctcac tgcccgcttt ccagtcggga 540aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt 600attgggcgcc agggtggttt ttcttttcac cagtgagacg ggcaacagct gattgccctt 660caccgcctgg ccctgagaga gttgcagcaa gcggtccacg ctggtttgcc ccagcaggcg 720aaaatcctgt ttgatggtgg ttgacggcgg gatataacat gagctgtctt cggtatcgtc 780gtatcccact accgagatat ccgcaccaac gcgcagcccg gactcggtaa tggcgcgcat 840tgcgcccagc gccatctgat cgttggcaac cagcatcgca gtgggaacga tgccctcatt 900cagcatttgc atggtttgtt gaaaaccgga catggcactc cagtcgcctt cccgttccgc 960tatcggctga atttgattgc gagtgagata tttatgccag ccagccagac gcagacgcgc 1020cgagacagaa cttaatgggc ccgctaacag cgcgatttgc tggtgaccca atgcgaccag 1080atgctccacg cccagtcgcg taccgtcttc atgggagaaa ataatactgt tgatgggtgt 1140ctggtcagag acatcaagaa ataacgccgg aacattagtg caggcagctt ccacagcaat 1200ggcatcctgg tcatccagcg gatagttaat gatcagccca ctgacgcgct gcgcgagaag 1260attgtgcacc gccgctttac aggcttcgac gccgcttcgt tctaccatcg acaccaccac 1320gctggcaccc agttgatcgg cgcgagattt aatcgccgcg acaatttgcg acggcgcgtg 1380cagggccaga ctggaggtgg caacgccaat cagcaacgac tgtttgcccg ccagttgttg 1440tgccacgcgg ttgggaatgt aattcagctc cgccatcgcc gcttccactt tttcccgcgt 1500tttcgcagaa acgtggctgg cctggttcac cacgcgggaa acggtctgat aagagacacc 1560ggcatactct gcgacatcgt ataacgttac tggtttcata ttcaccaccc tgaattgact 1620ctcttccggg cgctatcatg ccataccgcg aaaggttttg caccattcga tggtgtcaac

1680gtaaatgcat gccgcttcgc cttccaattg gactgcacgg tgcaccaatg cttctggcgt 1740caggcagcca tcggaagctg tggtatggct gtgcaggtcg taaatcactg cataattcgt 1800gtcgctcaag gcgcactccc gttctggata atgttttttg cgccgacatc ataacggttc 1860tggcaaatat tctgaaatga gctgttgaca attaatcatc cggctcgtat aatgtgtgga 1920attgtgagcg gataacaatt tcacacagga aacagcatgg ccaaggaggc ccatatggcg 1980gatactctgc tgattctggg tgattctctg tctgcaggct accgtatgtc cgcctccgcg 2040gcctggccag ctctgctgaa tgataagtgg cagtctaaga cgtccgttgt gaacgcatcc 2100atctctggcg acacgagcca gcagggcctg gcccgtctgc ctgcactgct gaaacagcac 2160caaccgcgct gggtcctggt ggagctgggc ggtaacgacg gtctgcgcgg cttccagccg 2220cagcagaccg aacagactct gcgtcagatt ctgcaggacg tgaaagctgc taacgcggaa 2280ccgctgctga tgcagattcg tctgccagcg aactatggcc gccgttacaa cgaagcgttc 2340tctgcaatct acccaaaact ggcgaaagag tttgacgtcc cgctgctgcc gttcttcatg 2400gaggaagtat acctgaaacc gcagtggatg caagatgacg gcatccaccc gaaccgtgat 2460gcgcagccgt tcatcgctga ctggatggcg aagcaactgc agccgctggt aaaccacgat 2520tcctaattaa agatctgtag taggatccat gtagggtgag gttatagcta tgaagaaagt 2580ttggctgaac cgttatccgg cagatgtacc gactgaaatt aacccagatc gttaccagtc 2640cctggttgac atgttcgaac agtccgtggc tcgctacgcc gatcagcctg ctttcgtcaa 2700catgggtgag gtaatgacct ttcgcaaact ggaggagcgt tcccgtgctt tcgcggcata 2760cctgcagcag ggtctgggcc tgaagaaagg cgaccgcgtg gccctgatga tgccgaacct 2820gctgcaatat cctgtggcgc tgttcggtat cctgcgtgct ggtatgatcg ttgtcaatgt 2880taaccctctg tatacccctc gtgaactgga gcaccagctg aatgactctg gtgcgtctgc 2940tatcgttatc gtttccaatt tcgcacatac gctggagaaa gtggttgata aaaccgcagt 3000gcagcatgtc attctgactc gcatgggtga ccagctgtcc accgctaaag gtactgtagt 3060caacttcgtt gtgaaataca ttaagcgcct ggttccgaaa taccacctgc cagatgcaat 3120tagctttcgc tctgcactgc ataacggtta ccgtatgcag tacgtaaaac cagagctggt 3180gccggaagac ctggcctttc tgcagtatac cggcggcacc accggcgtgg caaagggcgc 3240gatgctgacc catcgtaaca tgctggcgaa cctggagcag gttaacgcaa cgtacggccc 3300gctgctgcac ccgggtaaag aactggtagt tacggcactg cctctgtatc acatctttgc 3360actgacgatc aactgtctgc tgttcattga 
actgggtggt cagaacctgc tgatcaccaa 3420cccgcgtgac attccgggcc tggtaaaaga gctggctaag tacccgttca ccgccattac 3480tggcgtaaac actctgttta acgcgctgct gaacaacaaa gagtttcagc agctggactt 3540ctctagcctg cacctgagcg ctggcggtgg catgccggtt cagcaggttg tggcagagcg 3600ttgggtgaaa ctgaccggcc agtatctgct ggagggttat ggtctgaccg agtgtgcacc 3660gctggtcagc gttaacccgt atgatattga ttaccactct ggttctattg gtctgccggt 3720tccgtccacg gaagccaaac tggtggacga tgacgacaac gaagtacctc cgggccagcc 3780gggtgagctg tgtgtcaagg gtccgcaggt tatgctgggc tactggcagc gcccggacgc 3840caccgacgaa atcattaaaa acggttggct gcataccggt gatatcgctg taatggacga 3900agaaggtttc ctgcgtatcg tggaccgtaa gaaagatatg attctggtga gcggtttcaa 3960cgtgtacccg aacgaaattg aggacgtagt tatgcaacac cctggcgtgc aggaggtggc 4020agccgtgggc gtgccgtccg gttcttctgg tgaggctgtg aaaatctttg tcgttaaaaa 4080ggacccgtcc ctgaccgaag aatctctggt gacgttttgc cgccgtcaac tgactggcta 4140caaagtgccg aaactggtcg agttccgcga tgagctgcca aaatctaacg tgggtaagat 4200cctgcgccgc gagctgcgtg acgaggcacg tggcaaagtt gacaataaag cataaccgcg 4260taggaggaca gctatgcgcc cacttcatcc gatcgatttc attttcctgt ccctggagaa 4320acgccagcag ccgatgcacg taggtggtct gttcctgttc cagatcccgg ataacgctcc 4380ggacaccttt attcaggacc tggtgaacga tatccgtatc tccaagtcta ttccggttcc 4440gccgttcaac aacaagctga acggtctgtt ctgggacgaa gacgaggagt tcgatctgga 4500tcaccatttc cgtcatattg cgctgccgca cccgggtcgc atccgtgagc tgctgattta 4560catctctcag gaacacagca ctctcctcga tcgcgctaaa cctctgtgga cttgcaacat 4620cattgaaggt atcgagggta accgtttcgc catgtacttc aagattcatc atgcgatggt 4680ggatggtgtg gcgggtatgc gtctgattga gaaaagcctg tcccatgatg ttactgaaaa 4740gagcatcgta ccgccgtggt gcgttgaggg caaacgtgct aaacgcctgc gtgaaccgaa 4800gaccggcaaa attaagaaaa tcatgtctgg tattaaatct cagctccagg ccaccccgac 4860cgttattcaa gaactgtctc agacggtctt caaagacatc ggccgtaatc cggaccacgt 4920ttcctctttc caggcgccgt gctccatcct caaccagcgt gtgtcttctt ctcgtcgttt 4980cgcagcacag agctttgacc tggaccgttt ccgcaacatc gccaaatctc tgaacgtgac 5040cattaacgac gttgtcctgg ctgtgtgtag cggtgctctg cgcgcttatc tgatgtctca 
5100taactctctg ccatccaaac cgctgatcgc tatggtccca gcaagcatcc gcaacgatga 5160ttctgatgtg tccaaccgta ttactatgat tctggccaac ctcgctactc acaaagacga 5220ccctctgcag cgtctggaaa tcatccgccg ctccgtccag aactctaaac agcgttttaa 5280acgcatgact tccgaccaga ttctgaacta ttctgcggtt gtatacggcc cggctggtct 5340gaacattatc agcggtatga tgccgaaacg tcaggctttt aacctggtaa tcagcaacgt 5400tcctggcccg cgtgagccgc tgtactggaa cggcgcaaaa ctggacgcac tgtacccggc 5460ttccatcgtt ctggatggcc aggctctgaa catcactatg acctcttacc tggacaaact 5520ggaagtaggt ctgatcgcgt gtcgcaatgc actgccgcgc atgcagaacc tgctgaccca 5580cctggaggag gaaatccagc tgtttgaggg cgttatcgcc aaacaggaag atatcaaaac 5640ggcgaactaa ccatggttga attcggtttt ccgtcctgtc ttgattttca agcaaacaat 5700gcctccgatt tctaatcgga ggcatttgtt tttgtttatt gcaaaaacaa aaaatattgt 5760tacaaatttt tacaggctat taagcctacc gtcataaata atttgccatt tactagtttt 5820taattaacca gaaccttgac cgaacgcagc ggtggtaacg gcgcagtggc ggttttcatg 5880gcttgttatg actgtttttt tggggtacag tctatgcctc gggcatccaa gcagcaagcg 5940cgttacgccg tgggtcgatg tttgatgtta tggagcagca acgatgttac gcagcagggc 6000agtcgcccta aaacaaagtt aaacatcatg agggaagcgg tgatcgccga agtatcgact 6060caactatcag aggtagttgg cgtcatcgag cgccatctcg aaccgacgtt gctggccgta 6120catttgtacg gctccgcagt ggatggcggc ctgaagccac acagtgatat tgatttgctg 6180gttacggtga ccgtaaggct tgatgaaaca acgcggcgag ctttgatcaa cgaccttttg 6240gaaacttcgg cttcccctgg agagagcgag attctccgcg ctgtagaagt caccattgtt 6300gtgcacgacg acatcattcc gtggcgttat ccagctaagc gcgaactgca atttggagaa 6360tggcagcgca atgacattct tgcaggtatc ttcgagccag ccacgatcga cattgatctg 6420gctatcttgc tgacaaaagc aagagaacat agcgttgcct tggtaggtcc agcggcggag 6480gaactctttg atccggttcc tgaacaggat ctatttgagg cgctaaatga aaccttaacg 6540ctatggaact cgccgcccga ctgggctggc gatgagcgaa atgtagtgct tacgttgtcc 6600cgcatttggt acagcgcagt aaccggcaaa atcgcgccga aggatgtcgc tgccgactgg 6660gcaatggagc gcctgccggc ccagtatcag cccgtcatac ttgaagctag acaggcttat 6720cttggacaag aagaagatcg cttggcctcg cgcgcagatc agttggaaga atttgtccac 6780tacgtgaaag gcgagatcac caaggtagtc 
ggcaaataat gtctaacaat tcgttcaagc 6840cgacgccgct tcgcggcgcg gcttaactca agcgttagat gcactaagca cataattgct 6900cacagccaaa ctatcaggtc aagtctgctt ttattatttt taagcgtgca taataagccc 6960tacacaaatt gggagatata tcatgaggcg cgccacgagt gcggggaaat ttcgggggcg 7020atcgccccta tatcgcaaaa aggagttacc ccatcagagc tatagtcgag aagaaaacca 7080tcattcactc aacaaggcta tgtcagaaga gaaactagac cggatcgaag cagccctaga 7140gcaattggat aaggatgtgc aaacgctcca aacagagctt cagcaatccc aaaaatggca 7200ggacaggaca tgggatgttg tgaagtgggt aggcggaatc tcagcgggcc tagcggtgag 7260cgcttccatt gccctgttcg ggttggtctt tagattttct gtttccctgc cataaaagca 7320cattcttata agtcatactt gtttacatca aggaacaaaa acggcattgt gccttgcaag 7380gcacaatgtc tttctcttat gcacagatgg ggactggaaa ccacacgcac aattccctta 7440aaaagcaacc gcaaaaaata accatcaaaa taaaactgga caaattctca tgtgggccgg 7500cc 7502121442DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 12tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 60cgcgcgggga gaggcggttt gcgtattggg cgccagggtg gtttttcttt tcaccagtga 120gacgggcaac agctgattgc ccttcaccgc ctggccctga gagagttgca gcaagcggtc 180cacgctggtt tgccccagca ggcgaaaatc ctgtttgatg gtggttgacg gcgggatata 240acatgagctg tcttcggtat cgtcgtatcc cactaccgag atatccgcac caacgcgcag 300cccggactcg gtaatggcgc gcattgcgcc cagcgccatc tgatcgttgg caaccagcat 360cgcagtggga acgatgccct cattcagcat ttgcatggtt tgttgaaaac cggacatggc 420actccagtcg ccttcccgtt ccgctatcgg ctgaatttga ttgcgagtga gatatttatg 480ccagccagcc agacgcagac gcgccgagac agaacttaat gggcccgcta acagcgcgat 540ttgctggtga cccaatgcga ccagatgctc cacgcccagt cgcgtaccgt cttcatggga 600gaaaataata ctgttgatgg gtgtctggtc agagacatca agaaataacg ccggaacatt 660agtgcaggca gcttccacag caatggcatc ctggtcatcc agcggatagt taatgatcag 720cccactgacg cgctgcgcga gaagattgtg caccgccgct ttacaggctt cgacgccgct 780tcgttctacc atcgacacca ccacgctggc acccagttga tcggcgcgag atttaatcgc 840cgcgacaatt tgcgacggcg cgtgcagggc cagactggag gtggcaacgc caatcagcaa 900cgactgtttg cccgccagtt gttgtgccac gcggttggga atgtaattca 
gctccgccat 960cgccgcttcc actttttccc gcgttttcgc agaaacgtgg ctggcctggt tcaccacgcg 1020ggaaacggtc tgataagaga caccggcata ctctgcgaca tcgtataacg ttactggttt 1080catattcacc accctgaatt gactctcttc cgggcgctat catgccatac cgcgaaaggt 1140tttgcaccat tcgatggtgt caacgtaaat gcatgccgct tcgccttcca attggactgc 1200acggtgcacc aatgcttctg gcgtcaggca gccatcggaa gctgtggtat ggctgtgcag 1260gtcgtaaatc actgcataat tcgtgtcgct caaggcgcac tcccgttctg gataatgttt 1320tttgcgccga catcataacg gttctggcaa atattctgaa atgagctgtt gacaattaat 1380catccggctc gtataatgtg tggaattgtg agcggataac aatttcacac aggaaacagc 1440at 1442135615DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 13cctgcagggt cagcaagctc tggaatttcc cgattctctg atgggagatc caaaaattct 60cgcagtccct caatcacgat atcggtcttg gatcgccctg tagcttccga caactgctca 120attttttcga gcatctctac cgggcatcgg aatgaaatta acggtgtttt agccatgtgt 180tatacagtgt ttacaacttg actaacaaat acctgctagt gtatacatat tgtattgcaa 240tgtatacgct attttcactg ctgtctttaa tggggattat cgcaagcaag taaaaaagcc 300tgaaaacccc aataggtaag ggattccgag cttactcgat aattatcacc tttgagcgcc 360cctaggagga ggcgaaaagc tatgtctgac aaggggtttg acccctgaag tcgttgcgcg 420agcattaagg tctgcggata gcccataaca tacttttgtt gaacttgtgc gcttttatca 480accccttaag ggcttgggag cgttttatgc ggccgcgggg ggggggggga aagccacgtt 540gtgtctcaaa atctctgatg ttacattgca caagataaaa atatatcatc atgaacaata 600aaactgtctg cttacataaa cagtaataca aggggtcata gatctgtagt aggatccatg 660tagggtgagg ttatagctat gaagaaagtt tggctgaacc gttatccggc agatgtaccg 720actgaaatta acccagatcg ttaccagtcc ctggttgaca tgttcgaaca gtccgtggct 780cgctacgccg atcagcctgc tttcgtcaac atgggtgagg taatgacctt tcgcaaactg 840gaggagcgtt cccgtgcttt cgcggcatac ctgcagcagg gtctgggcct gaagaaaggc 900gaccgcgtgg ccctgatgat gccgaacctg ctgcaatatc ctgtggcgct gttcggtatc 960ctgcgtgctg gtatgatcgt tgtcaatgtt aaccctctgt atacccctcg tgaactggag 1020caccagctga atgactctgg tgcgtctgct atcgttatcg tttccaattt cgcacatacg 1080ctggagaaag tggttgataa aaccgcagtg cagcatgtca ttctgactcg catgggtgac 1140cagctgtcca 
ccgctaaagg tactgtagtc aacttcgttg tgaaatacat taagcgcctg 1200gttccgaaat accacctgcc agatgcaatt agctttcgct ctgcactgca taacggttac 1260cgtatgcagt acgtaaaacc agagctggtg ccggaagacc tggcctttct gcagtatacc 1320ggcggcacca ccggcgtggc aaagggcgcg atgctgaccc atcgtaacat gctggcgaac 1380ctggagcagg ttaacgcaac gtacggcccg ctgctgcacc cgggtaaaga actggtagtt 1440acggcactgc ctctgtatca catctttgca ctgacgatca actgtctgct gttcattgaa 1500ctgggtggtc agaacctgct gatcaccaac ccgcgtgaca ttccgggcct ggtaaaagag 1560ctggctaagt acccgttcac cgccattact ggcgtaaaca ctctgtttaa cgcgctgctg 1620aacaacaaag agtttcagca gctggacttc tctagcctgc acctgagcgc tggcggtggc 1680atgccggttc agcaggttgt ggcagagcgt tgggtgaaac tgaccggcca gtatctgctg 1740gagggttatg gtctgaccga gtgtgcaccg ctggtcagcg ttaacccgta tgatattgat 1800taccactctg gttctattgg tctgccggtt ccgtccacgg aagccaaact ggtggacgat 1860gacgacaacg aagtacctcc gggccagccg ggtgagctgt gtgtcaaggg tccgcaggtt 1920atgctgggct actggcagcg cccggacgcc accgacgaaa tcattaaaaa cggttggctg 1980cataccggtg atatcgctgt aatggacgaa gaaggtttcc tgcgtatcgt ggaccgtaag 2040aaagatatga ttctggtgag cggtttcaac gtgtacccga acgaaattga ggacgtagtt 2100atgcaacacc ctggcgtgca ggaggtggca gccgtgggcg tgccgtccgg ttcttctggt 2160gaggctgtga aaatctttgt cgttaaaaag gacccgtccc tgaccgaaga atctctggtg 2220acgttttgcc gccgtcaact gactggctac aaagtgccga aactggtcga gttccgcgat 2280gagctgccaa aatctaacgt gggtaagatc ctgcgccgcg agctgcgtga cgaggcacgt 2340ggcaaagttg acaataaagc ataactcgac gcgtaggagg acagctatgc gcccacttca 2400tccgatcgat ttcattttcc tgtccctgga gaaacgccag cagccgatgc acgtaggtgg 2460tctgttcctg ttccagatcc cggataacgc tccggacacc tttattcagg acctggtgaa 2520cgatatccgt atctccaagt ctattccggt tccgccgttc aacaacaagc tgaacggtct 2580gttctgggac gaagacgagg agttcgatct ggatcaccat ttccgtcata ttgcgctgcc 2640gcacccgggt cgcatccgtg agctgctgat ttacatctct caggaacaca gcactctcct 2700cgatcgcgct aaacctctgt ggacttgcaa catcattgaa ggtatcgagg gtaaccgttt 2760cgccatgtac ttcaagattc atcatgcgat ggtggatggt gtggcgggta tgcgtctgat 2820tgagaaaagc ctgtcccatg atgttactga aaagagcatc 
gtaccgccgt ggtgcgttga 2880gggcaaacgt gctaaacgcc tgcgtgaacc gaagaccggc aaaattaaga aaatcatgtc 2940tggtattaaa tctcagctcc aggccacccc gaccgttatt caagaactgt ctcagacggt 3000cttcaaagac atcggccgta atccggacca cgtttcctct ttccaggcgc cgtgctccat 3060cctcaaccag cgtgtgtctt cttctcgtcg tttcgcagca cagagctttg acctggaccg 3120tttccgcaac atcgccaaat ctctgaacgt gaccattaac gacgttgtcc tggctgtgtg 3180tagcggtgct ctgcgcgctt atctgatgtc tcataactct ctgccatcca aaccgctgat 3240cgctatggtc ccagcaagca tccgcaacga tgattctgat gtgtccaacc gtattactat 3300gattctggcc aacctcgcta ctcacaaaga cgaccctctg cagcgtctgg aaatcatccg 3360ccgctccgtc cagaactcta aacagcgttt taaacgcatg acttccgacc agattctgaa 3420ctattctgcg gttgtatacg gcccggctgg tctgaacatt atcagcggta tgatgccgaa 3480acgtcaggct tttaacctgg taatcagcaa cgttcctggc ccgcgtgagc cgctgtactg 3540gaacggcgca aaactggacg cactgtaccc ggcttccatc gttctggatg gccaggctct 3600gaacatcact atgacctctt acctggacaa actggaagta ggtctgatcg cgtgtcgcaa 3660tgcactgccg cgcatgcaga acctgctgac ccacctggag gaggaaatcc agctgtttga 3720gggcgttatc gccaaacagg aagatatcaa aacggcgaac taaccatggt tgaattcggt 3780tttccgtcct gtcttgattt tcaagcaaac aatgcctccg atttctaatc ggaggcattt 3840gtttttgttt attgcaaaaa caaaaaatat tgttacaaat ttttacaggc tattaagcct 3900accgtcataa ataatttgcc atttactagt ttttaattaa ccagaacctt gaccgaacgc 3960agcggtggta acggcgcagt ggcggttttc atggcttgtt atgactgttt ttttggggta 4020cagtctatgc ctcgggcatc caagcagcaa gcgcgttacg ccgtgggtcg atgtttgatg 4080ttatggagca gcaacgatgt tacgcagcag ggcagtcgcc ctaaaacaaa gttaaacatc 4140atgagggaag cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 4200gagcgccatc tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 4260ggcctgaagc cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 4320acaacgcggc gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 4380gagattctcc gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 4440tatccagcta agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 4500atcttcgagc cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 4560catagcgttg 
ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 4620gatctatttg aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 4680ggcgatgagc gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 4740aaaatcgcgc cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 4800cagcccgtca tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 4860tcgcgcgcag atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 4920gtcggcaaat aatgtctaac aattcgttca agccgacgcc gcttcgcggc gcggcttaac 4980tcaagcgtta gatgcactaa gcacataatt gctcacagcc aaactatcag gtcaagtctg 5040cttttattat ttttaagcgt gcataataag ccctacacaa attgggagat atatcatgag 5100gcgcgccacg agtgcgggga aatttcgggg gcgatcgccc ctatatcgca aaaaggagtt 5160accccatcag agctatagtc gagaagaaaa ccatcattca ctcaacaagg ctatgtcaga 5220agagaaacta gaccggatcg aagcagccct agagcaattg gataaggatg tgcaaacgct 5280ccaaacagag cttcagcaat cccaaaaatg gcaggacagg acatgggatg ttgtgaagtg 5340ggtaggcgga atctcagcgg gcctagcggt gagcgcttcc attgccctgt tcgggttggt 5400ctttagattt tctgtttccc tgccataaaa gcacattctt ataagtcata cttgtttaca 5460tcaaggaaca aaaacggcat tgtgccttgc aaggcacaat gtctttctct tatgcacaga 5520tggggactgg aaaccacacg cacaattccc ttaaaaagca accgcaaaaa ataaccatca 5580aaataaaact ggacaaattc tcatgtgggc cggcc 5615144764DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 14cctgcagggt cagcaagctc tggaatttcc cgattctctg atgggagatc caaaaattct 60cgcagtccct caatcacgat atcggtcttg gatcgccctg tagcttccga caactgctca 120attttttcga gcatctctac cgggcatcgg aatgaaatta acggtgtttt agccatgtgt 180tatacagtgt ttacaacttg actaacaaat acctgctagt gtatacatat tgtattgcaa 240tgtatacgct attttcactg ctgtctttaa tggggattat cgcaagcaag taaaaaagcc 300tgaaaacccc aataggtaag ggattccgag cttactcgat aattatcacc tttgagcgcc 360cctaggagga ggcgaaaagc tatgtctgac aaggggtttg acccctgaag tcgttgcgcg 420agcattaagg tctgcggata gcccataaca tacttttgtt gaacttgtgc gcttttatca 480accccttaag ggcttgggag cgttttatgc ggccgcgggg ggggggggga aagccacgtt 540gtgtctcaaa atctctgatg ttacattgca caagataaaa atatatcatc atgaacaata 
600aaactgtctg cttacataaa cagtaataca aggggtcata tggcggatac tctgctgatt 660ctgggtgatt ctctgtctgc aggctaccgt atgtccgcct ccgcggcctg gccagctctg 720ctgaatgata agtggcagtc taagacgtcc gttgtgaacg catccatctc tggcgacacg 780agccagcagg gcctggcccg tctgcctgca ctgctgaaac agcaccaacc gcgctgggtc 840ctggtggagc tgggcggtaa cgacggtctg cgcggcttcc agccgcagca gaccgaacag 900actctgcgtc agattctgca ggacgtgaaa gctgctaacg cggaaccgct gctgatgcag 960attcgtctgc cagcgaacta tggccgccgt tacaacgaag cgttctctgc aatctaccca 1020aaactggcga aagagtttga cgtcccgctg ctgccgttct tcatggagga agtatacctg 1080aaaccgcagt ggatgcaaga tgacggcatc cacccgaacc gtgatgcgca gccgttcatc 1140gctgactgga tggcgaagca actgcagccg ctggtaaacc acgattccta attaaagatc 1200tgtagtagga tccatgtagg gtgaggttat agctatgaag aaagtttggc tgaaccgtta 1260tccggcagat gtaccgactg aaattaaccc agatcgttac cagtccctgg ttgacatgtt 1320cgaacagtcc gtggctcgct acgccgatca gcctgctttc gtcaacatgg gtgaggtaat 1380gacctttcgc aaactggagg agcgttcccg tgctttcgcg gcatacctgc agcagggtct 1440gggcctgaag aaaggcgacc gcgtggccct gatgatgccg aacctgctgc aatatcctgt 1500ggcgctgttc ggtatcctgc gtgctggtat gatcgttgtc aatgttaacc ctctgtatac 1560ccctcgtgaa ctggagcacc agctgaatga ctctggtgcg tctgctatcg ttatcgtttc 1620caatttcgca catacgctgg agaaagtggt tgataaaacc gcagtgcagc atgtcattct 1680gactcgcatg ggtgaccagc tgtccaccgc taaaggtact gtagtcaact tcgttgtgaa 1740atacattaag cgcctggttc cgaaatacca
cctgccagat gcaattagct ttcgctctgc 1800actgcataac ggttaccgta tgcagtacgt aaaaccagag ctggtgccgg aagacctggc 1860ctttctgcag tataccggcg gcaccaccgg cgtggcaaag ggcgcgatgc tgacccatcg 1920taacatgctg gcgaacctgg agcaggttaa cgcaacgtac ggcccgctgc tgcacccggg 1980taaagaactg gtagttacgg cactgcctct gtatcacatc tttgcactga cgatcaactg 2040tctgctgttc attgaactgg gtggtcagaa cctgctgatc accaacccgc gtgacattcc 2100gggcctggta aaagagctgg ctaagtaccc gttcaccgcc attactggcg taaacactct 2160gtttaacgcg ctgctgaaca acaaagagtt tcagcagctg gacttctcta gcctgcacct 2220gagcgctggc ggtggcatgc cggttcagca ggttgtggca gagcgttggg tgaaactgac 2280cggccagtat ctgctggagg gttatggtct gaccgagtgt gcaccgctgg tcagcgttaa 2340cccgtatgat attgattacc actctggttc tattggtctg ccggttccgt ccacggaagc 2400caaactggtg gacgatgacg acaacgaagt acctccgggc cagccgggtg agctgtgtgt 2460caagggtccg caggttatgc tgggctactg gcagcgcccg gacgccaccg acgaaatcat 2520taaaaacggt tggctgcata ccggtgatat cgctgtaatg gacgaagaag gtttcctgcg 2580tatcgtggac cgtaagaaag atatgattct ggtgagcggt ttcaacgtgt acccgaacga 2640aattgaggac gtagttatgc aacaccctgg cgtgcaggag gtggcagccg tgggcgtgcc 2700gtccggttct tctggtgagg ctgtgaaaat ctttgtcgtt aaaaaggacc cgtccctgac 2760cgaagaatct ctggtgacgt tttgccgccg tcaactgact ggctacaaag tgccgaaact 2820ggtcgagttc cgcgatgagc tgccaaaatc taacgtgggt aagatcctgc gccgcgagct 2880gcgtgacgag gcacgtggca aagttgacaa taaagcataa caattcggtt ttccgtcctg 2940tcttgatttt caagcaaaca atgcctccga tttctaatcg gaggcatttg tttttgttta 3000ttgcaaaaac aaaaaatatt gttacaaatt tttacaggct attaagccta ccgtcataaa 3060taatttgcca tttactagtt tttaattaac cagaaccttg accgaacgca gcggtggtaa 3120cggcgcagtg gcggttttca tggcttgtta tgactgtttt tttggggtac agtctatgcc 3180tcgggcatcc aagcagcaag cgcgttacgc cgtgggtcga tgtttgatgt tatggagcag 3240caacgatgtt acgcagcagg gcagtcgccc taaaacaaag ttaaacatca tgagggaagc 3300ggtgatcgcc gaagtatcga ctcaactatc agaggtagtt ggcgtcatcg agcgccatct 3360cgaaccgacg ttgctggccg tacatttgta cggctccgca gtggatggcg gcctgaagcc 3420acacagtgat attgatttgc tggttacggt gaccgtaagg cttgatgaaa caacgcggcg 
3480agctttgatc aacgaccttt tggaaacttc ggcttcccct ggagagagcg agattctccg 3540cgctgtagaa gtcaccattg ttgtgcacga cgacatcatt ccgtggcgtt atccagctaa 3600gcgcgaactg caatttggag aatggcagcg caatgacatt cttgcaggta tcttcgagcc 3660agccacgatc gacattgatc tggctatctt gctgacaaaa gcaagagaac atagcgttgc 3720cttggtaggt ccagcggcgg aggaactctt tgatccggtt cctgaacagg atctatttga 3780ggcgctaaat gaaaccttaa cgctatggaa ctcgccgccc gactgggctg gcgatgagcg 3840aaatgtagtg cttacgttgt cccgcatttg gtacagcgca gtaaccggca aaatcgcgcc 3900gaaggatgtc gctgccgact gggcaatgga gcgcctgccg gcccagtatc agcccgtcat 3960acttgaagct agacaggctt atcttggaca agaagaagat cgcttggcct cgcgcgcaga 4020tcagttggaa gaatttgtcc actacgtgaa aggcgagatc accaaggtag tcggcaaata 4080atgtctaaca attcgttcaa gccgacgccg cttcgcggcg cggcttaact caagcgttag 4140atgcactaag cacataattg ctcacagcca aactatcagg tcaagtctgc ttttattatt 4200tttaagcgtg cataataagc cctacacaaa ttgggagata tatcatgagg cgcgccacga 4260gtgcggggaa atttcggggg cgatcgcccc tatatcgcaa aaaggagtta ccccatcaga 4320gctatagtcg agaagaaaac catcattcac tcaacaaggc tatgtcagaa gagaaactag 4380accggatcga agcagcccta gagcaattgg ataaggatgt gcaaacgctc caaacagagc 4440ttcagcaatc ccaaaaatgg caggacagga catgggatgt tgtgaagtgg gtaggcggaa 4500tctcagcggg cctagcggtg agcgcttcca ttgccctgtt cgggttggtc tttagatttt 4560ctgtttccct gccataaaag cacattctta taagtcatac ttgtttacat caaggaacaa 4620aaacggcatt gtgccttgca aggcacaatg tctttctctt atgcacagat ggggactgga 4680aaccacacgc acaattccct taaaaagcaa ccgcaaaaaa taaccatcaa aataaaactg 4740gacaaattct catgtgggcc ggcc 4764155155DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 15accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag 60ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca 120gcgctgcgat gataccgcga gaaccacgct caccggctcc ggatttatca gcaataaacc 180agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 240ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 300ttgttgccat cgctacaggc atcgtggtgt cacgctcgtc gtttggtatg 
gcttcattca 360gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 420ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 480tggttatggc agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg 540tgactggtga gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct 600cttgcccggc gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca 660tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 720gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg 780tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 840ggaaatgttg aatactcata ttcttccttt ttcaatatta ttgaagcatt tatcagggtt 900attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggtca 960gtgttacaac caattaacca attctgaaca ttatcgcgag cccatttata cctgaatatg 1020gctcataaca ccccttgttt gcctggcggc agtagcgcgg tggtcccacc tgaccccatg 1080ccgaactcag aagtgaaacg ccgtagcgcc gatggtagtg tggggactcc ccatgcgaga 1140gtagggaact gccaggcatc aaataaaacg aaaggctcag tcgaaagact gggcctttcg 1200cccgggctaa ttatggggtg tcgcccttat tcgactctat agtgaagttc ctattctcta 1260gaaagtatag gaacttctga agtggggcct gcagggccac cacagccaaa ttcatcgtta 1320atgtggactt gccgacgccc ccttttcgac taacaatcgc aatttttttc atagacattt 1380cccacagacc acatcaaatt acagcaattg atctagctga aagtttaacc cacttccccc 1440cagacccaga agaccagagg cgcttaagct tccccgaaca aactcaactg accgaggggg 1500agggagccgt agcggcgttg gtgttggcgt aaatgacagg ccgagcaaag agcgatgaga 1560ttttcccgac gattgtcttc ggggatgtaa tttttgtggt ggacgcttaa ggttaaaaca 1620gcccgcaggt gacgatcaat gcctttgacc ttcacatccg acggaataca aaccaagcca 1680cagagttcac agcgccagtc tgcatcctct tttacttgta aggcgatcgc ctgccaatca 1740tcagaatatc gagaagaatg tttcatctaa acctagcgcc gcaagataat cctgaaatcg 1800ctacagtatt aaaaaattct ggccaacatc acagccaata ctgcggccgc gggggggggg 1860gggaaagcca cgttgtgtct caaaatctct gatgttacat tgcacaagat aaaaatatat 1920catcatgaac aataaaactg tctgcttaca taaacagtaa tacaaggggt catatgtaac 1980aggaattcgg ttttccgtcc tgtcttgatt ttcaagcaaa caatgcctcc gatttctaat 2040cggaggcatt tgtttttgtt tattgcaaaa 
acaaaaaata ttgttacaaa tttttacagg 2100ctattaagcc taccgtcata aataatttgc catttactag tttttaatta aacccctatt 2160tgtttatttt tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa 2220atgcttcaat aatattgaaa aaggaagagt atgattgaac aagatggcct gcatgctggt 2280tctccggctg cttgggtgga acgcctgttt ggttacgact gggctcagct gactattggc 2340tgtagcgatg cagcggtttt ccgtctgtct gcacagggtc gtccggttct gtttgtgaaa 2400accgacctgt ccggcgcact gaacgaactg caggacgaag cggcccgtct gtcctggctc 2460gcgacgactg gtgttccgtg cgcggcagtt ctggacgtag ttactgaagc cggtcgcgat 2520tggctgctgc tgggtgaagt tccgggtcag gatctgctga gcagccacct cgctccggca 2580gaaaaagttt ccatcatggc ggacgcgatg cgccgtctgc acaccctgga cccggcaact 2640tgcccgtttg accatcaggc taaacaccgt attgaacgtg cacgcactcg tatggaagcg 2700ggtctggttg atcaggacga cctggatgaa gagcaccagg gcctcgcacc ggcggaactg 2760tttgcacgtc tgaaagcccg catgccggac ggcgaagacc tggtggtaac gcatggcgac 2820gcttgtctgc caaacattat ggtggaaaac ggccgcttct ctggttttat tgactgtggc 2880cgtctgggtg tagctgatcg ctatcaggat atcgccctcg ctacccgcga tattgcagaa 2940gaactgggtg gtgaatgggc tgaccgtttc ctggtgctgt acggtatcgc agcgccggat 3000tctcagcgca ttgccttcta ccgtctgctg gatgagttct tctaaggcgc gccgaaactg 3060cgccaagaat agctcacttc aaatcagtca cggttttgtt tagggcttgt ctggcgattt 3120tggtgacata gacagtcaca gcaacagtag ccacaaaacc aagaatccgg atcgaccact 3180gggcaatggg gttggcgctg gtgctttctg tgccgagggt cgcaagattt ccggccaggg 3240agccaatgta gacatacatg atggtgccag ggatcatccc cacagagccg aggacatagt 3300cttttaggga aacgcccgtg accccatagg catagttaag cagattaaag ggaaatacag 3360gtgagagacg cgtcaggaga acaatcttca ggccttcctt gcccacagct tcgtcgatgg 3420cgcgaaattt cgggttgtcg gcgatttttt ggctcaccca ttggcgggcc agataacgac 3480ccactaggaa agcagcgatc gctcctaggg ttgcgccaac aaagacgtaa attgatccta 3540aagcgacacc aaaaacaacc ccggctccca aggtcagaat cgaccccggt agaaaagcca 3600ccgtcgccac cacataaagc accataaagg cgatggccgg ccaaaatgaa gtgaagttcc 3660tatactttct agagaatagg aacttctata gtgagtcgaa taagggcgac acaaaattta 3720ttctaaatgc ataataaata ctgataacat cttatagttt gtattatatt ttgtattatc 
3780gttgacatgt ataattttga tatcaaaaac tgattttccc tttattattt tcgagattta 3840ttttcttaat tctctttaac aaactagaaa tattgtatat acaaaaaatc ataaataata 3900gatgaatagt ttaattatag gtgttcatca atcgaaaaag caacgtatct tatttaaagt 3960gcgttgcttt tttctcattt ataaggttaa ataattctca tatatcaagc aaagtgacag 4020gcgcccttaa atattctgac aaatgctctt tccctaaact ccccccataa aaaaacccgc 4080cgaagcgggt ttttacgtta tttgcggatt aacgattact cgttatcaga accgcccagg 4140gggcccgagc ttaagactgg ccgtcgtttt acaacacaga aagagtttgt agaaacgcaa 4200aaaggccatc cgtcaggggc cttctgctta gtttgatgcc tggcagttcc ctactctcgc 4260cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 4320cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 4380acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 4440ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 4500ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 4560gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 4620gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 4680ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 4740actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 4800gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggg 4860ctaactacgg ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta 4920ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 4980gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 5040tgatcttttc tacggggtct gacgctcagt ggaacgacgc gcgcgtaact cacgttaagg 5100gattttggtc atgagcttgc gccgtcccgt caagtcagcg taatgctctg ctttt 5155168459DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 16cctgcagggc caccacagcc aaattcatcg ttaatgtgga cttgccgacg cccccttttc 60gactaacaat cgcaattttt ttcatagaca tttcccacag accacatcaa attacagcaa 120ttgatctagc tgaaagttta acccacttcc ccccagaccc agaagaccag aggcgcttaa 180gcttccccga acaaactcaa ctgaccgagg gggagggagc cgtagcggcg ttggtgttgg 240cgtaaatgac aggccgagca 
aagagcgatg agattttccc gacgattgtc ttcggggatg 300taatttttgt ggtggacgct taaggttaaa acagcccgca ggtgacgatc aatgcctttg 360accttcacat ccgacggaat acaaaccaag ccacagagtt cacagcgcca gtctgcatcc 420tcttttactt gtaaggcgat cgcctgccaa tcatcagaat atcgagaaga atgtttcatc 480taaacctagc gccgcaagat aatcctgaaa tcgctacagt attaaaaaat tctggccaac 540atcacagcca atactgcggc cgcgccccta tattatgcat ttataccccc acaatcatgt 600caagaattca agcatcttaa ataatgttaa ttatcggcaa agtctgtgct ccccttctat 660aatgctgaat tgagcattcg cctcctgaac ggtctttatt cttccattgt gggtctttag 720attcacgatt cttcacaatc attgatctaa agatctttct agattctcga ggcatatgaa 780gaaattgctc cccattctta tcggcctgag cctttctggg ttcagttcgt tgagccaggc 840cgagaacctg atgcaagttt atcagcaagc acgccttagt aacccggaat tgcgtaagtc 900tgccgccgat cgtgatgctg cctttgaaaa aattaatgaa gcgcgcagtc cattactgcc 960acagctaggt ttaggtgcag attacaccta tagcaacggc taccgcgacg cgaacggcat 1020caactctaac gcgaccagtg cgtccttgca gttaactcaa tccatttttg atatgtcgaa 1080atggcgtgcg ttaacgctgc aggaaaaagc agcagggatt caggacgtca cgtatcagac 1140cgatcagcaa accttgatcc tcaacaccgc gaccgcttat ttcaacgtgt tgaatgctat 1200tgacgttctt tcctatacac aggcacaaaa agaagcgatc taccgtcaat tagatcaaac 1260cacccaacgt tttaacgtgg gcctggtagc gatcaccgac gtgcagaacg cccgcgcaca 1320gtacgatacc gtgctggcga acgaagtgac cgcacgtaat aaccttgata acgcggtaga 1380gcagctgcgc cagatcaccg gtaactacta tccggaactg gctgcgctga atgtcgaaaa 1440ctttaaaacc gacaaaccac agccggttaa cgcgctgctg aaagaagccg aaaaacgcaa 1500cctgtcgctg ttacaggcac gcttgagcca ggacctggcg cgcgagcaaa ttcgccaggc 1560gcaggatggt cacttaccga ctctggattt aacggcttct accgggattt ctgacacctc 1620ttatagcggt tcgaaaaccc gtggtgccgc tggtacccag tatgacgata gcaatatggg 1680ccagaacaaa gttggcctga gcttctcgct gccgatttat cagggcggaa tggttaactc 1740gcaggtgaaa caggcacagt acaactttgt cggtgccagc gagcaactgg aaagtgccca 1800tcgtagcgtc gtgcagaccg tgcgttcctc cttcaacaac attaatgcat ctatcagtag 1860cattaacgcc tacaaacaag ccgtagtttc cgctcaaagc tcattagacg cgatggaagc 1920gggctactcg gtcggtacgc gtaccattgt tgatgtgttg gatgcgacca ccacgttgta 
1980caacgccaag caagagctgg cgaatgcgcg ttataactac ctgattaatc agctgaatat 2040taagtcagct ctgggtacgt tgaacgagca ggatctgctg gcactgaaca atgcgctgag 2100caaaccggtt tccactaatc cggaaaacgt tgcaccgcaa acgccggaac agaatgctat 2160tgctgatggt tatgcgcctg atagcccggc accagtcgtt cagcaaacat ccgcacgcac 2220taccaccagt aacggtcata accctttccg taactgagga tccaaggtgg ctacttcaac 2280gatagcttaa acttcgctgc tccagcgagg ggatttcact ggtttgaatg cttcaatgct 2340tgccaaaaga gtgctactgg aacttacaag agtgaccctg cgtcagggga gctagcactc 2400aaaaaagact cctccaattc cgtccatgaa caaaaacaga gggtttacgc ctctggcggt 2460cgttctgatg ctctcaggca gcttagccct aacaggatgt gacgacaaac aggcccaaca 2520aggtggccag cagatgcccg ccgttggcgt agtaacagtc aaaactgaac ctctgcagat 2580cacaaccgag cttccgggtc gcaccagtgc ctaccggatc gcagaagttc gtcctcaagt 2640tagcgggatt atcctgaagc gtaatttcaa agaaggtagc gacatcgaag caggtgtctc 2700tctctatcag attgatcctg cgacctatca ggcgacatac gacagtgcga aaggtgatct 2760ggcgaaagcc caggctgcag ccaatatcgc gcaattgacg gtgaatcgtt atcagaaact 2820gctcggtact cagtacatca gtaagcaaga gtacgatcag gctctggctg atgcgcaaca 2880ggcgaatgct gcggtaactg cggcgaaagc tgccgttgaa actgcgcgga tcaatctggc 2940ttacaccaaa gtcacctctc cgattagcgg tcgcattggt aagtcgaacg tgacggaagg 3000cgcattggta cagaacggtc aggcgactgc gctggcaacc gtgcagcaac ttgatccgat 3060ctacgttgat gtgacccagt ccagcaacga cttcctgcgc ctgaaacagg aactggcgaa 3120tggcacgctg aaacaagaga acggcaaagc caaagtgtca ctgatcacca gtgacggcat 3180taagttcccg caggacggta cgctggaatt ctctgacgtt accgttgatc agaccactgg 3240gtctatcacc ctacgcgcta tcttcccgaa cccggatcac actctgctgc cgggtatgtt 3300cgtgcgcgca cgtctggaag aagggcttaa tccaaacgct attttagtcc cgcaacaggg 3360cgtaacccgt acgccgcgtg gcgatgccac cgtactggta gttggcgcgg atgacaaagt 3420ggaaacccgt ccgatcgttg caagccaggc tattggcgat aagtggctgg tgacagaagg 3480tctgaaagca ggcgatcgcg tagtaataag tgggctgcag aaagtgcgtc ctggtgtcca 3540ggtaaaagca caagaagtta ccgctgataa taaccagcaa gccgcaagcg gtgctcagcc 3600tgaacagtcc aagtcttaac ttaaacagga gccgttaaga catgcctaat ttctttatcg 3660atcgcccgat ttttgcgtgg gtgatcgcca 
ttatcatcat gttggcaggg gggctggcga 3720tcctcaaact gccggtggcg caatatccta cgattgcacc gccggcagta acgatctccg 3780cctcctaccc cggcgctgat gcgaaaacag tgcaggacac ggtgacacag gttatcgaac 3840agaatatgaa cggtatcgat aacctgatgt acatgtcctc taacagtgac tccacgggta 3900ccgtgcagat caccctgacc tttgagtctg gtactgatgc ggatatcgcg caggttcagg 3960tgcagaacaa actgcagctg gcgatgccgt tgctgccgca agaagttcag cagcaagggg 4020tgagcgttga gaaatcatcc agcagcttcc tgatggttgt cggcgttatc aacaccgatg 4080gcaccatgac gcaggaggat atctccgact acgtggcggc gaatatgaaa gatgccatca 4140gccgtacgtc gggcgtgggt gatgttcagt tgttcggttc acagtacgcg atgcgtatct 4200ggatgaaccc gaatgagctg aacaaattcc agctaacgcc ggttgatgtc attaccgcca 4260tcaaagcgca gaacgcccag gttgcggcgg gtcagctcgg tggtacgccg ccggtgaaag 4320gccaacagct taacgcctct attattgctc agacgcgtct gacctctact gaagagttcg 4380gcaaaatcct gctgaaagtg aatcaggatg gttcccgcgt gctgctgcgt gacgtcgcga 4440agattgagct gggtggtgag aactacgaca tcatcgcaga gtttaacggc caaccggctt 4500ccggtctggg gatcaagctg gcgaccggtg caaacgcgct ggataccgct gcggcaatcc 4560gtgctgaact ggcgaagatg gaaccgttct tcccgtcggg tctgaaaatt gtttacccat 4620acgacaccac gccgttcgtg aaaatctcta ttcacgaagt ggttaaaacg ctggtcgaag 4680cgatcatcct cgtgttcctg gttatgtatc tgttcctgca gaacttccgc gcgacgttga 4740ttccgaccat tgccgtaccg gtggtattgc tcgggacctt tgccgtcctt gccgcctttg 4800gcttctcgat aaacacgcta acaatgttcg ggatggtgct cgccatcggc ctgttggtgg 4860atgacgccat cgttgtggta gaaaacgttg agcgtgttat ggcggaagaa ggtttgccgc 4920caaaagaagc tacccgtaag tcgatggggc agattcaggg cgctctggtc ggtatcgcga 4980tggtactgtc ggcggtattc gtaccgatgg ccttctttgg cggttctact ggtgctatct 5040atcgtcagtt ctctattacc attgtttcag caatggcgct gtcggtactg gtggcgttga 5100tcctgactcc agctctttgt gccaccatgc tgaaaccgat tgccaaaggc gatcacgggg 5160aaggtaaaaa aggcttcttc ggctggttta accgcatgtt cgagaagagc acgcaccact 5220acaccgacag cgtaggcggt attctgcgca gtacggggcg ttacctggtg ctgtatctga 5280tcatcgtggt cggcatggcc tatctgttcg tgcgtctgcc aagctccttc ttgccagatg 5340aggaccaggg cgtgtttatg accatggttc agctgccagc aggtgcaacg caggaacgta 
5400cacagaaagt gctcaatgag gtaacgcatt actatctgac caaagaaaag aacaacgttg 5460agtcggtgtt cgccgttaac ggcttcggct ttgcgggacg tggtcagaat accggtattg 5520cgttcgtttc cttgaaggac tgggccgatc gtccgggcga agaaaacaaa gttgaagcga 5580ttaccatgcg tgcaacacgc gctttctcgc aaatcaaaga tgcgatggtt ttcgccttta 5640acctgcccgc aatcgtggaa ctgggtactg caaccggctt tgactttgag ctgattgacc 5700aggctggcct tggtcacgaa aaactgactc aggcgcgtaa ccagttgctt gcagaagcag 5760cgaagcaccc tgatatgttg accagcgtac gtccaaacgg tctggaagat accccgcagt 5820ttaagattga tatcgaccag gaaaaagcgc aggcgctggg tgtttctatc aacgacatta 5880acaccactct gggcgctgca tggggcggca gctatgtgaa cgactttatc gaccgcggtc 5940gtgtgaagaa agtttatgtc atgtcagaag cgaaataccg tatgctgccg gatgatatcg 6000gcgactggta tgttcgtgct gctgatggtc agatggtgcc attctcggcg ttctcctctt 6060ctcgttggga gtacggttcg ccgcgtctgg aacgttacaa cggcctgcca tccatggaaa 6120tcttaggcca ggcggcaccg ggtaaaagta ccggtgaagc aatggagctg atggaacaac 6180tggcgagcaa actgcctacc ggtgttggct atgactggac ggggatgtcc tatcaggaac 6240gtctctccgg caaccaggca ccttcactgt acgcgatttc gttgattgtc gtgttcctgt 6300gtctggcggc gctgtacgag agctggtcga ttccgttctc cgttatgctg gtcgttccgc 6360tgggggttat cggtgcgttg ctggctgcca ccttccgtgg cctgaccaat gacgtttact 6420tccaggtagg cctgctcaca accattgggt tgtcggcgaa gaacgcgatc cttatcgtcg 6480aattcgccaa agacttgatg gataaagaag gtaaaggtct gattgaagcg acgcttgatg 6540cggtgcggat gcgtttacgt ccgatcctga tgacctcgct ggcgtttatc ctcggcgtta 6600tgccgctggt tatcagtact ggtgctggtt ccggcgcgca gaacgcagta ggtaccggtg 6660taatgggcgg gatggtgacc gcaacggtac
tggcaatctt cttcgttccg gtattctttg 6720tggtggttcg ccgccgcttt agccgcaaga atgaagatat cgagcacagc catactgtcg 6780atcatcattg agagctcttg aattcggttt tccgtcctgt cttgattttc aagcaaacaa 6840tgcctccgat ttctaatcgg aggcatttgt ttttgtttat tgcaaaaaca aaaaatattg 6900ttacaaattt ttacaggcta ttaagcctac cgtcataaat aatttgccat ttactagttt 6960ttaattaaac ccctatttgt ttatttttct aaatacattc aaatatgtat ccgctcatga 7020gacaataacc ctgataaatg cttcaataat attgaaaaag gaagagtatg attgaacaag 7080atggcctgca tgctggttct ccggctgctt gggtggaacg cctgtttggt tacgactggg 7140ctcagctgac tattggctgt agcgatgcag cggttttccg tctgtctgca cagggtcgtc 7200cggttctgtt tgtgaaaacc gacctgtccg gcgcactgaa cgaactgcag gacgaagcgg 7260cccgtctgtc ctggctcgcg acgactggtg ttccgtgcgc ggcagttctg gacgtagtta 7320ctgaagccgg tcgcgattgg ctgctgctgg gtgaagttcc gggtcaggat ctgctgagca 7380gccacctcgc tccggcagaa aaagtttcca tcatggcgga cgcgatgcgc cgtctgcaca 7440ccctggaccc ggcaacttgc ccgtttgacc atcaggctaa acaccgtatt gaacgtgcac 7500gcactcgtat ggaagcgggt ctggttgatc aggacgacct ggatgaagag caccagggcc 7560tcgcaccggc ggaactgttt gcacgtctga aagcccgcat gccggacggc gaagacctgg 7620tggtaacgca tggcgacgct tgtctgccaa acattatggt ggaaaacggc cgcttctctg 7680gttttattga ctgtggccgt ctgggtgtag ctgatcgcta tcaggatatc gccctcgcta 7740cccgcgatat tgcagaagaa ctgggtggtg aatgggctga ccgtttcctg gtgctgtacg 7800gtatcgcagc gccggattct cagcgcattg ccttctaccg tctgctggat gagttcttct 7860aaggcgcgcc gaaactgcgc caagaatagc tcacttcaaa tcagtcacgg ttttgtttag 7920ggcttgtctg gcgattttgg tgacatagac agtcacagca acagtagcca caaaaccaag 7980aatccggatc gaccactggg caatggggtt ggcgctggtg ctttctgtgc cgagggtcgc 8040aagatttccg gccagggagc caatgtagac atacatgatg gtgccaggga tcatccccac 8100agagccgagg acatagtctt ttagggaaac gcccgtgacc ccataggcat agttaagcag 8160attaaaggga aatacaggtg agagacgcgt caggagaaca atcttcaggc cttccttgcc 8220cacagcttcg tcgatggcgc gaaatttcgg gttgtcggcg attttttggc tcacccattg 8280gcgggccaga taacgaccca ctaggaaagc agcgatcgct cctagggttg cgccaacaaa 8340gacgtaaatt gatcctaaag cgacaccaaa aacaaccccg gctcccaagg tcagaatcga 
8400ccccggtaga aaagccaccg tcgccaccac ataaagcacc ataaaggcga tggccggcc 845917169DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 17tccctctcag ctcaaaaagt atcaatgatt acttaatgtt tgttctgcgc aaacttcttg 60cagaacatgc atgatttaca aaaagttgta gtttctgtta ccaattgcga atcgagaact 120gcctaatctg ccgagtatgc aagctgcttt gtaggcagat gaatcccat 16918222DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 18gcttgtagca attgctacta aaaactgcga tcgctgctga aatgagctgg aattttgtcc 60ctctcagctc aaaaagtatc aatgattact taatgtttgt tctgcgcaaa cttcttgcag 120aacatgcatg atttacaaaa agttgtagtt tctgttacca attgcgaatc gagaactgcc 180taatctgccg agtatgcgat cctttagcag gaggaaaacc at 22219597DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 19gctactcatt agttaagtgt aatgcagaaa acgcatattc tctattaaac ttacgcatta 60atacgagaat tttgtagcta cttatactat tttacctgag atcccgacat aaccttagaa 120gtatcgaaat cgttacataa acattcacac aaaccacttg acaaatttag ccaatgtaaa 180agactacagt ttctccccgg tttagttcta gagttacctt cagtgaaaca tcggcggcgt 240gtcagtcatt gaagtagcat aaatcaattc aaaataccct gcgggaaggc tgcgccaaca 300aaattaaata tttggttttt cactattaga gcatcgattc attaatcaaa aaccttaccc 360cccagccccc ttcccttgta gggaagtggg agccaaactc ccctctccgc gtcggagcga 420aaagtctgag cggaggtttc ctccgaacag aacttttaaa gagagagggg ttgggggaga 480ggttctttca agattactaa attgctatca ctagacctcg tagaactagc aaagactacg 540ggtggattga tcttgagcaa aaaaacttta tgagaacttt agcaggagga aaaccat 597202296DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 20gattacccta tatcgggctt ttctcaataa aatctttatt ttttgaggtg ctttttagcc 60ataaataatc actttagtat aaaattttga cggcgtaaag ttgataaaat agaattaaga 120atggactatc ggtacagaaa aaatgggtaa ctggatggtg aataaacttc ccttacccaa 180tgcactctcc accgttaaag accccctatg cttaacggtg atcacctggg caatggcgag 240tcccaaccct gtcccccccg ttttgcgcga acgatctcga ttaactcggt aaaaacgctc 300aaaaatgtgt tcctgttggt cgggggcaat gccgatgccg gtatcttgca cggtgatgat 360agccatctgt 
tcatgggatg tcagggtaat atcaacacgt cccccagcag ttgtgtattg 420aatggcgttg gcaattaggt ttgagaccag tcgatagagt tgggattcat taccccaggc 480gtaaacttcc cctgaactca gatcactgct gagatcaatg tgggcggcga tcgctaattc 540taaaaactct tcggtgaggt cactgactaa atcatttaaa caacaaagcc gccaatcttc 600ggcggtggtt tcctgctcta agcgacttag tagcaataaa tccgtaatca attggcttaa 660tcgccttccc tgtcgttcaa cggtatgtag catggtgtta atttctgggg aatggcttga 720gtcgatgcgt aataccgctt ccaccgtggc caacagacta gccaatggcg atcgtaattc 780atgggctgca ttcgcggtga attgttgttg ttgttggtag gactggtaaa tgggacgcat 840ggctaacccc gctaagcccc aactggagaa ggcgaccaaa cccagggcaa tgggaaaact 900aagccctaaa atccaaagaa tacgtttatt ttcggcatca aaggctgcca ggctccggcc 960aatttgtaga tagccccagg aagatttgtc tgtattaccg gcgctatgca aaatggtggt 1020gaattgtcga taccgatcgc cggttggggg gtgaatagtc tgccaagttt cctggttaaa 1080aatggaggat agggaagccg gttgattagg cgaaaaagcc agcaggttgc cttgataatc 1140aaataaacga atgtaatata aactgcgatc actaatgccc aacgtgtgac gttcaatcag 1200ggtggggttg acctggcagg gttggttgac caaacacaga tcgggcaaca ttttttgtaa 1260tactccggtg ggactagcat tactcggcaa catcggctct aaactgtcat gcaacgtccc 1320ggcgatcgac tccacttctc gctccaacgc catccagttg gcctgcacaa tggcacgata 1380aacccccaac cccaacaggg taagaattcc ccccattact agggcatacc agaaagccaa 1440ttgcagacga ctacgggcaa agaggcgacg ggtattcatg gcgatagggt gaaccgatag 1500ccttgaccgg gaactgtttt aattgggcaa ggacaatttt gttgagctag cttgcgtcgt 1560atcaaacgca tttgggccgc caccacatta ctcatgggct cctcatcaag atcccacagt 1620tgttgccgga tcttgctacc ggaaatgatc cgctctgggt tttgcatcag atattgaaaa 1680atttgaaatt ctcttacggt taaagcaatt tcctgtcttt ctaggtttag tggctccgag 1740atagttaccg ataacagatt attactggga tcaaggctga agttgcccaa agttaaaatt 1800tgcggttgga attgtggcga tcgccgttgt agtgcccgca gtcttgctaa tagctctgcc 1860atcacaaacg gttttgttag atagtcatct gccccggcat ctagtccttc gacacggttt 1920tccggttctc ctaacgctgt taacatcaac accggcaagg aattaccctg ggttctcagt 1980ttttgacaga gttccaaacc cgataatccc ggcagtaacc aatccacaat ggcaagggtg 2040tattccgtcc attgattttc caaataatcc caagcttggg agccatccgt 
cacccaatcc 2100accacatact tttcactaac tagcactttc ttaatagcca ttcccaaatc cgtctcatct 2160tccaccagca aaattcgcat cgcctctgcc ttttttataa cggtctgatc ttagcggggg 2220aaggagattt tcacctgaat ttcatacccc ctttggcaga ctgggaaaat cttggacaaa 2280ttaggaggaa aaccat 22962135DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 21gctatgcctg caggggcctt ttatgaggag cggta 352241DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 22gctatggcgg ccgctcttca tgacagaccc tatggatact a 412336DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 23gctatgggcg cgccttatct gactccagac gcaaca 362436DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 24gctatgggcc ggccgatcct tggatcaact caccct 3625658PRTThermosynechococcus elongatus 25Met Met Ser Ala Ala Tyr Thr Tyr Thr Pro Pro Gly Gly Leu Pro Gln 1 5 10 15 Asp Ala Ser Leu Pro Asp His Phe Leu Ala Tyr Lys Arg Leu Gln Ser 20 25 30 Leu Pro Glu Met Trp Pro Leu Leu Ala Gln Arg His Gly Asp Val Val 35 40 45 Ala Leu Asp Ala Pro Tyr Glu Asp Pro Pro Thr Arg Ile Thr Tyr Ser 50 55 60 Glu Leu Tyr Gln Arg Ile Gln Arg Phe Ala Ala Gly Leu Gln Ala Leu 65 70 75 80 Gly Val Ala Ala Gly Asp Arg Val Ala Leu Phe Pro Asp Asn Ser Pro 85 90 95 Arg Trp Leu Ile Ala Asp Gln Gly Ser Met Met Ala Gly Ala Ile Asn 100 105 110 Val Val Arg Ser Gly Thr Ala Asp Ala Gln Glu Leu Leu Tyr Ile Leu 115 120 125 Arg Asp Ser Gly Ala Thr Leu Leu Leu Ile Glu Asn Leu Ala Thr Leu 130 135 140 Gly Lys Leu Gln Glu Pro Leu Val Asp Thr Gly Val Lys Thr Val Val 145 150 155 160 Leu Leu Ser Gly Glu Ser Pro Glu Leu Ala Gly Phe Pro Leu Arg Leu 165 170 175 Leu Asn Phe Gly Gln Val Phe Thr Glu Gly Gln Tyr Gly Thr Val Arg 180 185 190 Ala Val Ala Ile Thr Pro Asp Asn Leu Ala Thr Leu Met Tyr Thr Ser 195 200 205 Gly Thr Thr Gly Gln Pro Lys Gly Val Met Val Thr His Gly Gly Leu 210 215 220 Leu Ser Gln Ile Val Asn Leu Trp Ala Ile Val Gln Pro Gln Val Gly 225 230 235 240 Asp Arg Val Leu Ser Ile Leu Pro Ile Trp His Ala Tyr Glu Arg Val 245 250 255 Ala Glu 
Tyr Phe Leu Phe Ala Cys Gly Cys Ser Gln Thr Tyr Thr Asn 260 265 270 Leu Arg His Phe Lys Asn Asp Leu Lys Arg Cys Lys Pro His Tyr Met 275 280 285 Ile Ala Val Pro Arg Ile Trp Glu Ser Phe Tyr Glu Gly Val Gln Lys 290 295 300 Gln Leu Arg Asp Ser Pro Ala Thr Lys Arg Arg Leu Ala Gln Phe Phe 305 310 315 320 Leu Ser Val Gly Gln Gln Tyr Ile Leu Gln Arg Arg Leu Leu Thr Gly 325 330 335 Leu Ser Leu Thr Asn Pro His Pro Arg Gly Trp Gln Lys Trp Leu Ala 340 345 350 Arg Val Gln Thr Leu Leu Leu Lys Pro Leu Tyr Glu Leu Gly Glu Lys 355 360 365 Arg Leu Tyr Ser Lys Ile Arg Glu Ala Thr Gly Gly Glu Ile Lys Gln 370 375 380 Val Ile Ser Gly Gly Gly Ala Leu Ala Pro His Leu Asp Thr Phe Tyr 385 390 395 400 Glu Val Ile Asn Leu Glu Val Leu Val Gly Tyr Gly Leu Thr Glu Thr 405 410 415 Ala Val Val Leu Thr Ala Arg Arg Ser Trp Ala Asn Leu Arg Gly Ser 420 425 430 Ala Gly Arg Pro Ile Pro Asp Thr Ala Ile Lys Ile Val Asp Pro Glu 435 440 445 Thr Lys Ala Pro Leu Glu Phe Gly Gln Lys Gly Leu Val Met Ala Lys 450 455 460 Gly Pro Gln Val Met Arg Gly Tyr Tyr Asn Gln Pro Glu Ala Thr Ala 465 470 475 480 Lys Val Leu Asp Ala Glu Gly Trp Phe Asp Thr Gly Asp Leu Gly Tyr 485 490 495 Leu Thr Pro Asn Gly Asp Leu Val Leu Thr Gly Arg Gln Lys Asp Thr 500 505 510 Ile Val Leu Ser Asn Gly Glu Asn Ile Glu Pro Gln Pro Ile Glu Asp 515 520 525 Ala Cys Val Arg Ser Pro Tyr Ile Asp Gln Ile Met Leu Val Gly Gln 530 535 540 Asp Gln Lys Ala Leu Gly Ala Leu Ile Val Pro Asn Leu Glu Ala Leu 545 550 555 560 Glu Ala Trp Val Val Ala Lys Gly Tyr Arg Leu Glu Leu Pro Asn Arg 565 570 575 Pro Ala Gln Ala Gly Ser Gly Glu Val Val Thr Leu Glu Ser Lys Val 580 585 590 Ile Ile Asp Leu Tyr Arg Gln Glu Leu Leu Arg Glu Val Gln Asn Arg 595 600 605 Pro Gly Tyr Arg Pro Asp Asp Arg Ile Ala Thr Phe Arg Phe Val Leu 610 615 620 Glu Pro Phe Thr Ile Glu Asn Gly Leu Leu Thr Gln Thr Leu Lys Ile 625 630 635 640 Arg Arg His Val Val Ser Asp Arg Tyr Arg Asp Met Ile Asn Ala Met 645 650 655 Phe Glu

* * * * *