Secretion Of Fatty Acids By Photosynthetic Microorganisms

ROESSLER; Paul Gordon ;   et al.

Patent Application Summary

U.S. patent application number 12/333280 was filed with the patent office on 2008-12-11 and published on 2009-12-03 for secretion of fatty acids by photosynthetic microorganisms. Invention is credited to You Chen, Corey Neal Dodge, Bo Liu, Paul Gordon ROESSLER.

Application Number: 12/333280
Publication Number: 20090298143
Family ID: 40755894
Filed Date: 2008-12-11
Publication Date: 2009-12-03

United States Patent Application 20090298143
Kind Code A1
ROESSLER; Paul Gordon ;   et al. December 3, 2009

SECRETION OF FATTY ACIDS BY PHOTOSYNTHETIC MICROORGANISMS

Abstract

Recombinant photosynthetic microorganisms that convert inorganic carbon to secreted fatty acids are described. Methods to recover the secreted fatty acids from the culture medium without the need for cell harvesting are also described.


Inventors: ROESSLER; Paul Gordon; (San Diego, CA) ; Chen; You; (San Diego, CA) ; Liu; Bo; (San Diego, CA) ; Dodge; Corey Neal; (Cardiff, CA)
Correspondence Address:
    Synthetic Genomics c/o MoFo
    12531 High Bluff Drive, Suite 100
    San Diego
    CA
    92130
    US
Family ID: 40755894
Appl. No.: 12/333280
Filed: December 11, 2008

Related U.S. Patent Documents

Application Number Filing Date Patent Number
61007333 Dec 11, 2007

Current U.S. Class: 435/134 ; 435/252.3; 435/257.2; 44/308; 554/1
Current CPC Class: C12P 7/649 20130101; C12P 7/6409 20130101; Y02E 50/10 20130101; Y02E 50/13 20130101; C12N 9/16 20130101; Y02P 20/582 20151101; Y02T 50/678 20130101
Class at Publication: 435/134 ; 435/257.2; 435/252.3; 554/1; 44/308
International Class: C12P 7/64 20060101 C12P007/64; C12N 1/13 20060101 C12N001/13; C12N 1/21 20060101 C12N001/21; C07C 57/00 20060101 C07C057/00; C10L 1/18 20060101 C10L001/18

Claims



1. A cell culture of a recombinant photosynthetic microorganism, said microorganism modified to contain a nucleic acid molecule comprising at least one recombinant expression system that produces at least one exogenous acyl-ACP thioesterase, wherein said acyl-ACP thioesterase preferentially liberates a fatty acid chain that contains 6-20 carbons, and wherein the culture medium provides inorganic carbon as substantially the sole carbon source and wherein said microorganism secretes the fatty acid liberated by the acyl-ACP thioesterase into the culture medium.

2. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is a Fat B thioesterase.

3. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is a Fat B thioesterase derived from the genus Cuphea.

4. The culture of claim 1, wherein the at least one exogenous acyl-ACP thioesterase is ChFatB2.

5. The culture of claim 1, wherein the recombinant photosynthetic microorganism has further been modified to produce an exogenous β-ketoacyl synthase (KAS).

6. The culture of claim 5, wherein the exogenous KAS preferentially produces acyl-ACPs having the chain length for which the thioesterase has preferred activity.

7. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding beta-oxidation pathway enzymes are inactivated or downregulated, or said enzymes are inhibited.

8. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding acyl-ACP synthetases are inactivated or downregulated, or said synthetases are inhibited.

9. The culture of claim 1, wherein the recombinant photosynthetic microorganism is further modified so that one or more genes encoding an enzyme involved in carbohydrate biosynthesis are inactivated or downregulated, or said enzymes are inhibited.

10. The culture of claim 9, wherein the enzyme involved in carbohydrate biosynthesis is a branching enzyme.

11. A method to convert inorganic carbon to fatty acids, said method comprising: incubating the culture of claim 1 such that the recombinant photosynthetic microorganism therein secretes the fatty acid into the culture medium; and recovering the secreted fatty acids from the culture medium.

12. The method of claim 11, wherein the fatty acids are recovered from the culture by contacting the medium with particulate adsorbents.

13. The method of claim 12, wherein the particulate adsorbents circulate in the medium.

14. The method of claim 12, wherein the particulate adsorbents are contained in a fixed bed column.

15. The method of claim 14, wherein the pH of the medium is lowered during said contacting.

16. The method of claim 15, wherein said pH lowering process comprises adding CO2.

17. The method of claim 16, wherein the medium is recirculated to the culture.

18. The method of claim 12, wherein the particulate adsorbents are lipophilic.

19. The method of claim 12, wherein the particulate adsorbents are ion exchange resins.

20. A composition comprising a fatty acid produced by the culture of claim 1.

21. The composition of claim 20, wherein the composition is used to produce another compound.

22. The composition of claim 20, wherein the composition is a biocrude.

23. A composition comprising a derivative of a fatty acid produced by the culture of claim 1.

24. The composition of claim 23, wherein the composition is a finished fuel or fuel additive.

25. The composition of claim 23, wherein the composition is a biological substitute for a petrochemical product.

26. The composition of claim 23, wherein the derivative is an alcohol, an alkane, or an alkene.
Description



CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit of provisional application 61/007,333 filed 11 Dec. 2007. The contents of this application are incorporated herein by reference.

TECHNICAL FIELD

[0002] This invention relates to photosynthetic microorganisms that convert inorganic carbon to fatty acids and secrete them into the culture medium, methods of production of fatty acids using such organisms, and uses thereof. The fatty acids may be used directly or may be further modified to alternate forms such as esters, reduced forms such as alcohols, or hydrocarbons, for applications in different industries, including fuels and chemicals.

BACKGROUND ART

[0003] Photosynthetic microorganisms, including eukaryotic algae and cyanobacteria, contain various lipids, including polar lipids and neutral lipids. Polar lipids (e.g., phospholipids, glycolipids, sulfolipids) are typically present in structural membranes whereas neutral lipids (e.g., triacylglycerols, wax esters) accumulate in cytoplasmic oil bodies or oil globules. A substantial research effort has been devoted to the development of methods to produce lipid-based fuels and chemicals from photosynthetic microorganisms. Typically, eukaryotic microalgae are grown under nutrient-replete conditions until a certain cell density is achieved, after which the cells are subjected to growth under nutrient-deficient conditions, which often leads to the accumulation of neutral lipids. The cells are then harvested by various means (e.g., settling, which can be facilitated by the addition of flocculants, followed by centrifugation), dried, and then the lipids are extracted from the cells by the use of various non-polar solvents. Harvesting of the cells and extraction of the lipids are cost-intensive steps. It would be desirable to obtain lipids from photosynthetic microorganisms without the requirement for cell harvesting and extraction.

[0004] PCT publication numbers WO2007/136762 and WO2008/119082 describe the production of biofuel components using microorganisms. These documents disclose the production by these organisms of fatty acid derivatives which are, apparently, short and long chain alcohols, hydrocarbons, fatty alcohols and esters including waxes, fatty acid esters or fatty esters. To the extent that fatty acid production is described, it is proposed as an intermediate to these derivatives, and the fatty acids are therefore not secreted. Further, there is no disclosure of converting inorganic carbon directly to secreted fatty acids using a photosynthetic organism grown in a culture medium containing inorganic carbon as the primary carbon source. The present invention takes advantage of the efficiency of photosynthetic organisms in secreting fatty acids into the medium in order to recover these valuable compounds.

[0005] The invention includes the expression of heterologous acyl-ACP thioesterase (TE) genes in photosynthetic microbes. Many of these genes, along with their use to alter lipid metabolism in oilseeds, have been described previously. Genes encoding the proteins that catalyze various steps in the synthesis and further metabolism of fatty acids have also been extensively described.

[0006] The two functional classes of plant acyl-ACP thioesterases (unsaturated fatty acid-recognizing FatA versus saturated fatty acid-recognizing FatB) can be clustered based on amino acid sequence alignments as well as function. FatAs show marked preference for 18:1-ACP with minor activity towards 18:0- and 16:0-ACPs, and FatBs hydrolyze primarily saturated acyl-ACPs with chain lengths that vary between 8 and 16 carbons. Several studies have focused on engineering plant thioesterases with perfected or altered substrate specificities as a strategy for tailoring specialty seed oils.

[0007] As shown in FIG. 1, fatty acid synthetase catalyzes a repeating cycle wherein malonyl-acyl carrier protein (ACP) is condensed with a substrate, initially acetyl-CoA, to form acetoacetyl-ACP, liberating CO2. The acetoacetyl-ACP is then reduced, dehydrated, and reduced further to butyryl-ACP which can then itself be condensed with malonyl-ACP, and the cycle repeated, adding a 2-carbon unit at each turn. The production of free fatty acids would therefore be enhanced by a thioesterase that would liberate the fatty acid itself from ACP, breaking the cycle. That is, the acyl-ACP is prevented from reentering the cycle. Production of the fatty acid would also be encouraged by enhancing the levels of fatty acid synthetase and inhibiting any enzymes which result in degradation or further metabolism of the fatty acid.
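The cycle described in paragraph [0007] can be illustrated with a minimal sketch. It is a toy, all-or-nothing model, not a kinetic description of the enzymes: the function name `elongate`, the parameter `release_lengths`, and the 18-carbon cap `max_carbons` are assumptions introduced here for illustration only.

```python
def elongate(release_lengths, max_carbons=18):
    """Walk the elongation cycle and return the chain length at release.

    release_lengths: chain lengths (in carbons) that the modeled
    thioesterase cleaves from ACP. Hypothetical all-or-nothing behavior.
    max_carbons: assumed upper limit of the modeled cycle.
    """
    carbons = 2  # acetyl primer supplied by acetyl-CoA
    while carbons < max_carbons:
        # Condensation with malonyl-ACP adds a net 2 carbons (CO2 is liberated).
        carbons += 2
        if carbons in release_lengths:
            # The thioesterase frees the fatty acid, so the acyl-ACP
            # does not reenter the cycle.
            return carbons
    # No matching thioesterase: the chain runs to the modeled maximum.
    return carbons


if __name__ == "__main__":
    print(elongate({8}))    # thioesterase acting on 8-carbon chains
    print(elongate(set()))  # no exogenous thioesterase in the model
```

In this sketch, supplying a set such as `{8, 10}` models a thioesterase with more than one preferred chain length; the first matching length reached by the cycle is the one released.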

[0008] FIG. 2 presents a more detailed description of the sequential formation of acyl-ACPs of longer and longer chains. As shown, the thioesterase enzymes listed in FIG. 2 liberate the fatty acid from the ACP thioester.

[0009] Taking advantage of this principle, Dehesh, K., et al., The Plant Journal (1996) 9:167-172, describe "Production of high levels of octanoic (8:0) and decanoic (10:0) fatty acids in transgenic canola by overexpression of ChFatB2, a thioesterase cDNA from Cuphea hookeriana." Dehesh, K., et al., Plant Physiology (1996) 110:203-210, report "Two novel thioesterases are key determinants of the bimodal distribution of acyl chain length of Cuphea palustris seed oil."

[0010] Voelker, T., et al., Science (1992) 257:72-74, describe "Fatty acid biosynthesis redirected to medium chains in transgenic oilseed plants." Voelker, T., and Davies, M., Journal of Bacteriology (1994) 176:7320-7327, describe "Alteration of the specificity and regulation of fatty acid synthesis of Escherichia coli by expression of a plant medium-chain acyl-acyl carrier protein thioesterase."

DISCLOSURE OF THE INVENTION

[0011] The present invention is directed to the production of recombinant photosynthetic microorganisms that are able to secrete fatty acids derived from inorganic carbon into the culture medium. Methods to remove the secreted fatty acids from the culture medium without the need for cell harvesting are also provided. It is anticipated that these improvements will lead to lower costs for producing lipid-based fuels and chemicals from photosynthetic microorganisms. In addition, this invention enables the production of fatty acids of defined chain length, thus allowing their use in the formulation of a variety of different products, including fuels and chemicals.

[0012] Carbon dioxide (which, along with carbonic acid, bicarbonate and/or carbonate, defines the term "inorganic carbon") is converted in the photosynthetic process to organic compounds. The inorganic carbon source includes any way of delivering inorganic carbon, optionally in admixture with any other combination of compounds which do not serve as the primary carbon feedstock, but only as a mixture or carrier (for example, emissions from biofuel (e.g., ethanol) plants, power plants, petroleum-based refineries, as well as atmospheric and subterranean sources).

[0013] One embodiment of the invention relates to a culture of recombinant photosynthetic microorganisms, said organisms comprising at least one recombinant expression vector encoding at least one exogenous acyl-ACP thioesterase, wherein the at least one exogenous acyl-ACP thioesterase preferentially liberates fatty acid chains containing 6 to 20 carbons from these ACP thioesters. The fatty acids are formed from inorganic carbon as their carbon source and the culture contains substantially only inorganic carbon as a carbon source. The presence of the exogenous thioesterase will increase the secretion levels of desired fatty acids by at least 2-4 fold.

[0014] Specifically, in one embodiment, the invention is directed to a cell culture of a recombinant photosynthetic microorganism where the microorganism has been modified to contain a nucleic acid molecule comprising at least one recombinant expression system that produces at least one exogenous acyl-ACP thioesterase, wherein said acyl-ACP thioesterase preferentially liberates a fatty acid chain that contains 6-20 carbons, and wherein the culture medium provides inorganic carbon as substantially the sole carbon source and wherein said microorganism secretes the fatty acid liberated by the acyl-ACP thioesterase into the medium. In alternative embodiments, the thioesterase preferentially liberates a fatty acid chain that contains 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 carbons.

[0015] In other aspects, the invention is directed to a method to produce fatty acids of desired chain lengths by incubating these cultures and recovering these secreted fatty acids from the cultures. In one embodiment, the recovery employs solid particulate adsorbents to harvest the secreted fatty acids. The fatty acids thus recovered can be further modified synthetically or used directly as components of biofuels or chemicals.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] FIG. 1 is a diagram of the pathway of fatty acid synthesis as is known in the art.

[0017] FIG. 2 is a more detailed diagram of the synthesis of fatty acids of multiple chain lengths as is known in the art.

[0018] FIG. 3 is an enzymatic overview of fatty acid biosynthesis identifying enzymatic classes for the production of various chain length fatty acids.

[0019] FIG. 4 is a schematic diagram of a recovery system for fatty acids from the medium.

[0020] FIG. 5 shows an experimental system based on the principles in FIG. 4.

[0021] FIG. 6 shows representative acyl-ACP thioesterase from a variety of organisms.

MODES OF CARRYING OUT THE INVENTION

[0022] The present invention provides photosynthetic microorganisms that secrete fatty acids into the culture medium, along with methods to adsorb the fatty acids from the culture medium and collect them for processing into fuels and chemicals. The invention thereby eliminates or greatly reduces the need to harvest and extract the cells, resulting in substantially reduced production costs.

[0023] FIG. 2 is an overview of one aspect of the invention. As shown in FIG. 2, carbon dioxide is converted to acetyl-CoA using the multiple steps in the photosynthetic process. The acetyl-CoA is then converted to malonyl-CoA by the action of acetyl-CoA carboxylase. The malonyl-CoA is then converted to malonyl-ACP by the action of malonyl-CoA:ACP transacylase which, upon progressive action of fatty acid synthetase, results in successive additions of two carbon units. In one embodiment of the invention, the process is essentially halted at carbon chain lengths of 6 or 8 or 10 or 12 or 14 or 16 or 18 carbons by supplying the appropriate thioesterase (shown in FIG. 2 as FatB). To the extent that further conversions to longer chain fatty acids occur in this embodiment, the cell biomass can be harvested as well. The secreted fatty acids can be converted to various other forms including, for example, methyl esters, alkanes, alkenes, alpha-olefins and fatty alcohols.

[0024] Thioesterases (Acyl-ACP TEs)

[0025] In order to effect secretion of the free fatty acids, the organism is provided at least one expression system for at least one thioesterase that operates preferentially to liberate fatty acids of the desired length. Many genes encoding such thioesterases are available in the art. Some of these are subjects of U.S. patents as follows:

[0026] Examples include U.S. Pat. No. 5,298,421, entitled "Plant medium-chain-preferring acyl-ACP thioesterases and related methods," which describes the isolation of an acyl-ACP thioesterase and the gene that encodes it from the immature seeds of Umbellularia californica. Other sources for such thioesterases and their encoding genes include U.S. Pat. No. 5,304,481, entitled "Plant thioesterase having preferential hydrolase activity toward C12 acyl-ACP substrate," U.S. Pat. No. 5,344,771, entitled "Plant thioesterases," U.S. Pat. No. 5,455,167, entitled "Medium-chain thioesterases in plants," U.S. Pat. No. 5,512,482, entitled "Plant thioesterases," U.S. Pat. No. 5,530,186, entitled "Nucleotide sequences of soybean acyl-ACP thioesterase genes," U.S. Pat. No. 5,639,790, entitled "Plant medium-chain thioesterases," U.S. Pat. No. 5,667,997, entitled "C8 and C10 medium-chain thioesterases in plants," U.S. Pat. No. 5,723,761, entitled "Plant acyl-ACP thioesterase sequences," U.S. Pat. No. 5,807,893, entitled "Plant thioesterases and use for modification of fatty acid composition in plant seed oils," U.S. Pat. No. 5,850,022, entitled "Production of myristate in plant cells," U.S. Pat. No. 5,910,631, entitled "Middle chain-specific thioesterase genes from Cuphea lanceolata," U.S. Pat. No. 5,945,585, entitled "Specific for palmitoyl, stearoyl and oleoyl-alp thioesters nucleic acid fragments encoding acyl-ACP thioesterase enzymes and the use of these fragments in altering plant oil composition," U.S. Pat. No. 5,955,329, entitled "Engineering plant thioesterases for altered substrate specificity," U.S. Pat. No. 5,955,650, entitled "Nucleotide sequences of canola and soybean palmitoyl-ACP thioesterase genes and their use in the regulation of fatty acid content of the oils of soybean and canola plants," and U.S. Pat. No. 6,331,664, entitled "Acyl-ACP thioesterase nucleic acids from maize and methods of altering palmitic acid levels in transgenic plants therewith."

[0027] Others are described in the open literature as follows:

[0028] Dormann, P., et al., Planta (1993) 189:425-432, describe "Characterization of two acyl-acyl carrier protein thioesterases from developing Cuphea seeds specific for medium-chain and oleoyl-acyl carrier protein." Dormann, P., et al., Biochimica et Biophysica Acta (1994) 1212:134-136, describe "Cloning and expression in Escherichia coli of a cDNA coding for the oleoyl-acyl carrier protein thioesterase from coriander (Coriandrum sativum L.)." Filichkin, S., et al., European Journal of Lipid Science and Technology (2006) 108:979-990, describe "New FATB thioesterases from a high-laurate Cuphea species: Functional and complementation analyses." Jones, A., et al., Plant Cell (1995) 7:359-371, describe "Palmitoyl-acyl carrier protein (ACP) thioesterase and the evolutionary origin of plant acyl-ACP thioesterases." Knutzon, D. S., et al., Plant Physiology (1992) 100:1751-1758, describe "Isolation and characterization of two safflower oleoyl-acyl carrier protein thioesterase cDNA clones." Slabaugh, M., et al., The Plant Journal (1998) 13:611-620, describe "Condensing enzymes from Cuphea wrightii associated with medium chain fatty acid biosynthesis."

[0029] Additional genes, not previously isolated, that encode these acyl-ACP TEs can be isolated from plants that naturally contain large amounts of medium-chain fatty acids in their seed oil, including certain plants in the Lauraceae, Lythraceae, Rutaceae, Ulmaceae, and Vochysiaceae families. Typically, the fatty acids produced by the seeds of these plants are esterified to glycerol and retained inside the cells. The seeds containing the products can then be harvested and processed to isolate the fatty acids. Other sources of these enzymes, such as bacteria, may also be used.

[0030] The known acyl-ACP TEs from plants can be divided into two main classes, based on their amino acid sequences and their specificity for acyl-ACPs of differing chain lengths and degrees of unsaturation. The "FatA" type of plant acyl-ACP TE has preferential activity on oleoyl-ACP, thereby releasing oleic acid, an 18-carbon fatty acid with a single double bond nine carbons distal to the carboxyl group. The "FatB" type of plant acyl-ACP TE has preferential activity on saturated acyl-ACPs, and can have broad or narrow chain length specificities. For example, FatB enzymes from different species of Cuphea have been shown to release fatty acids ranging from eight carbons in length to sixteen carbons in length from the corresponding acyl-ACPs. Listed below in Table 1 are several plant acyl-ACP TEs along with their substrate preferences. (Fatty acids are designated by standard shorthand notation, wherein the number preceding the colon represents the acyl chain length and the number after the colon represents the number of double bonds in the acyl chain.)

TABLE 1

Plant                      Acyl-ACP Thioesterase   Substrate Preference
Garcinia mangostana        FatA                    18:1 and 18:0
Carthamus tinctorius       FatA                    18:1
Coriandrum sativum         FatA                    18:1
Cuphea hookeriana          FatB1                   16:0
Cuphea hookeriana          FatB2                   8:0 and 10:0
Cuphea wrightii            FatB1                   12:0 to 16:0
Cuphea palustris           FatB1                   8:0 and 10:0
Cuphea palustris           FatB2                   14:0 and 16:0
Cuphea calophylla          FatB1                   12:0 to 16:0
Umbellularia californica   FatB1                   12:0
Ulmus americana            FatB1                   8:0 and 10:0
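The shorthand notation and the entries of Table 1 lend themselves to a simple lookup, sketched below. The data are transcribed from Table 1; the function names (`parse_shorthand`, `prefers`, `candidates`) are hypothetical, and treating "12:0 to 16:0" as an inclusive range of chain lengths is a simplifying assumption made here, not a statement about measured enzyme activity.

```python
# (plant, thioesterase, substrate preference), transcribed from Table 1.
TABLE_1 = [
    ("Garcinia mangostana", "FatA", "18:1 and 18:0"),
    ("Carthamus tinctorius", "FatA", "18:1"),
    ("Coriandrum sativum", "FatA", "18:1"),
    ("Cuphea hookeriana", "FatB1", "16:0"),
    ("Cuphea hookeriana", "FatB2", "8:0 and 10:0"),
    ("Cuphea wrightii", "FatB1", "12:0 to 16:0"),
    ("Cuphea palustris", "FatB1", "8:0 and 10:0"),
    ("Cuphea palustris", "FatB2", "14:0 and 16:0"),
    ("Cuphea calophylla", "FatB1", "12:0 to 16:0"),
    ("Umbellularia californica", "FatB1", "12:0"),
    ("Ulmus americana", "FatB1", "8:0 and 10:0"),
]


def parse_shorthand(code):
    """Split shorthand such as '18:1' into (chain_length, double_bonds)."""
    length, bonds = code.strip().split(":")
    return int(length), int(bonds)


def prefers(preference, fatty_acid):
    """Return True if a Table 1 preference string covers the fatty acid."""
    target = parse_shorthand(fatty_acid)
    if " to " in preference:
        # Assumption: 'X to Y' is read as an inclusive chain-length range
        # with the same number of double bonds at both ends.
        low, high = (parse_shorthand(p) for p in preference.split(" to "))
        return low[1] == target[1] == high[1] and low[0] <= target[0] <= high[0]
    return target in [parse_shorthand(p) for p in preference.split(" and ")]


def candidates(fatty_acid):
    """List (plant, thioesterase) pairs whose preference covers the fatty acid."""
    return [(plant, te) for plant, te, pref in TABLE_1 if prefers(pref, fatty_acid)]


if __name__ == "__main__":
    for plant, te in candidates("8:0"):
        print(plant, te)
```

Calling `candidates("8:0")` returns the Table 1 entries whose listed preference includes octanoic acid, in table order.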

[0031] The enzymes listed in Table 1 are exemplary and many additional genes encoding acyl-ACP TEs can be isolated and used in this invention, including but not limited to genes such as those that encode the following acyl-ACP TEs (referred to by GenPept Accession Numbers):

[0032] CAA52069.1, CAA52070.1, CAA54060.1, CAA85387.1, CAA85388.1, CAB60830.1, CAC19933.1, CAC19934.1, CAC39106.1, CAC80370.1, CAC80371.1, CAD32683.1, CAL50570.1, CAN60643.1, CAN81819.1, CAO17726.1, CAO42218.1, CAO65585.1, CAO68322.1, AAA33019.1, AAA33020.1, AAB51523.1, AAB51524.1, AAB51525.1, AAB71729.1, AAB71730.1, AAB71731.1, AAB88824.1, AAC49001.1, AAC49002.1, AAC49179.1, AAC49180.1, AAC49269.1, AAC49783.1, AAC49784.1, AAC72881.1, AAC72882.1, AAC72883.1, AAD01982.1, AAD28187.1, AAD33870.1, AAD42220.2, AAG35064.1, AAG43857.1, AAG43858.1, AAG43859.1, AAG43860.1, AAG43861.1, AAL15645.1, AAL77443.1, AAL77445.1, AAL79361.1, AAM09524.1, AAN17328.1, AAQ08202.1, AAQ08223.1, AAX51636.1, AAX51637.1, ABB71579.1, ABB71581.1, ABC47311.1, ABD83939.1, ABE01139.1, ABH11710.1, ABI18986.1, ABI20759.1, ABI20760.1, ABL85052.1, ABU96744.1, EAY74210.1, EAY86874.1, EAY86877.1, EAY86884.1, EAY99617.1, EAZ01545.1, EAZ09668.1, EAZ12044.1, EAZ23982.1, EAZ37535.1, EAZ45287.1, NP_001047567.1, NP_001056776.1, NP_001057985.1, NP_001063601.1, NP_001068400.1, NP_172327.1, NP_189147.1, NP_193041.1, XP_001415703.1, Q39473, Q39513, Q41635, Q42712, Q9SQI3, NP_189147.1, AAC49002, CAA52070.1, CAA52069.1, 193041.1, CAC39106, CAO17726, AAC72883, AAA33020, AAL79361, AAQ08223.1, AAB51523, AAL77443, AAA33019, AAG35064, and AAL77445.

Additional sources of acyl-ACP TEs that are useful in the present invention include: Arabidopsis thaliana (At); Bradyrhizobium japonicum (Bj); Brassica napus (Bn); Cinnamomum camphora (Cc); Capsicum chinense (Cch); Cuphea hookeriana (Ch); Cuphea lanceolata (Cl); Cuphea palustris (Cp); Coriandrum sativum (Cs); Carthamus tinctorius (Ct); Cuphea wrightii (Cw); Elaeis guineensis (Eg); Gossypium hirsutum (Gh); Garcinia mangostana (Gm); Helianthus annuus (Ha); Iris germanica (Ig); Iris tectorum (It); Myristica fragrans (Mf); Triticum aestivum (Ta); Ulmus americana (Ua); and Umbellularia californica (Uc). Exemplary TEs are shown in FIG. 6 with corresponding NCBI accession numbers.

[0033] In one embodiment, the present invention contemplates the specific production of an individual length of medium-chain fatty acid, for example, predominantly producing C8 fatty acids in one culture of recombinant photosynthetic microorganisms. In another embodiment, the present invention contemplates the production of a combination of two or more different length fatty acids, for example, both C8 and C10 fatty acids in one culture of recombinant photosynthetic microorganisms.

[0034] Illustrated below are manipulations of these art-known genes to construct suitable expression systems that result in production of effective amounts of the thioesterases in selected recombinant photosynthetic organisms. In such constructions, it may be desirable to remove the portion of the gene that encodes the plastid transit peptide region, as this region is inappropriate in prokaryotes. Alternatively, if expression is to take place in eukaryotic cells, a plastid transit peptide-encoding region appropriate to the host organism may be substituted. Preferred codons may also be employed, depending on the host.
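The two sequence manipulations mentioned in paragraph [0034], removal of the transit peptide-encoding region and use of preferred codons, can be sketched at the string level as follows. Everything specific in this sketch is an assumption made for illustration: the function names, the toy coding sequence, the transit peptide length of three codons, and the two-entry codon preference map are hypothetical and do not describe any actual thioesterase gene or host.

```python
def strip_transit_peptide(cds, transit_codons):
    """Drop the first `transit_codons` codons and restore an ATG start codon."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    mature = cds[3 * transit_codons:]
    if not mature:
        raise ValueError("nothing left after removing the transit peptide")
    # The mature region normally lacks its own start codon, so add one.
    return mature if mature.startswith("ATG") else "ATG" + mature


def recode(cds, preferred):
    """Replace each codon by the host-preferred synonym when one is listed."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(preferred.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    # Toy sequence: ATG GCT TCT | AAA GGT TAA (hypothetical, not a real gene).
    toy_cds = "ATGGCTTCTAAAGGTTAA"
    # Hypothetical preferences; AAA/AAG both encode Lys, GGT/GGC both encode Gly.
    toy_preferences = {"AAA": "AAG", "GGT": "GGC"}

    trimmed = strip_transit_peptide(toy_cds, transit_codons=3)
    print(trimmed)
    print(recode(trimmed, toy_preferences))
```

In practice the boundary of the transit peptide and the codon preferences would have to be determined for the particular gene and host; the sketch only shows the order of operations.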

[0035] Other Modifications

[0036] In addition to providing an expression system for one or more appropriate acyl-ACP TE genes, further alterations in the photosynthetic host may be made. For example, the host may be modified to include an expression system for a heterologous gene that encodes a β-ketoacyl synthase (KAS) that preferentially produces acyl-ACPs having medium chain lengths. Such KAS enzymes have been described from several plants, including various species of Cuphea (see Dehesh, K., et al., The Plant Journal (1998) 15:383-390, "KAS IV: a 3-ketoacyl-ACP synthase from Cuphea sp. is a medium chain specific condensing enzyme"; Slabaugh, M., et al., The Plant Journal (1998) 13:611-620), and would serve to increase the availability of acyl-ACP molecules of the proper length for recognition and cleavage by the heterologous medium-chain acyl-ACP TE. Another example is that the photosynthetic host cell containing a heterologous acyl-ACP TE gene may be further modified to include an expression system for a heterologous gene that encodes a multifunctional acetyl-CoA carboxylase or a set of heterologous genes that encode the various subunits of a multi-subunit type of acetyl-CoA carboxylase. Other heterologous genes that encode additional enzymes or components of the fatty acid biosynthesis pathway could also be introduced and expressed in acyl-ACP TE-containing host cells.

[0037] The photosynthetic microorganism may also be modified such that one or more genes that encode beta-oxidation pathway enzymes have been inactivated or downregulated, or the enzymes themselves may be inhibited. This would prevent the degradation of fatty acids released from acyl-ACPs, thus enhancing the yield of secreted fatty acids. In cases where the desired products are medium-chain fatty acids, the inactivation or downregulation of genes that encode acyl-CoA synthetase and/or acyl-CoA oxidase enzymes that preferentially use these chain lengths as substrates would be beneficial. Mutations in the genes encoding medium-chain-specific acyl-CoA synthetase and/or medium-chain-specific acyl-CoA oxidase enzymes such that the activity of the enzymes is diminished would also be effective in increasing the yield of secreted fatty acids. An additional modification inactivates or downregulates the acyl-ACP synthetase gene, or inactivates the encoded protein. Mutations in the genes can be introduced either by recombinant or non-recombinant methods. These enzymes and their genes are well known, and may be targeted specifically by disruption, deletion, generation of antisense sequences, generation of ribozymes or other recombinant approaches known to the practitioner. Inactivation of the genes can also be accomplished by random mutagenesis techniques such as UV irradiation, with the resulting cells screened for successful mutants. The proteins themselves can be inhibited by intracellular generation of appropriate antibodies or intracellular generation of peptide inhibitors.

[0038] The photosynthetic microorganism may also be modified such that one or more genes that encode storage carbohydrate or polyhydroxyalkanoate (PHA) biosynthesis pathway enzymes have been inactivated or down-regulated, or the enzymes themselves may be inhibited. Examples include enzymes involved in glycogen, starch, or chrysolaminarin synthesis, including glucan synthases and branching enzymes. Other examples include enzymes involved in PHA biosynthesis such as acetoacetyl-CoA synthase and PHA synthase.

[0039] Expression Systems

[0040] Expression of heterologous genes in cyanobacteria and eukaryotic algae is enabled by the introduction of appropriate expression vectors. For transformation of cyanobacteria, a variety of promoters that function in cyanobacteria can be utilized, including, but not limited to the lac, tac, and trc promoters and derivatives that are inducible by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG), promoters that are naturally associated with transposon- or bacterial chromosome-borne antibiotic resistance genes (neomycin phosphotransferase, chloramphenicol acetyltransferase, spectinomycin adenyltransferase, etc.), promoters associated with various heterologous bacterial and native cyanobacterial genes, promoters from viruses and phages, and synthetic promoters. Promoters isolated from cyanobacteria that have been used successfully include the following:

[0041] secA (secretion; controlled by the redox state of the cell)

[0042] rbc (Rubisco operon)

[0043] psaAB (PS I reaction center proteins; light regulated)

[0044] psbA (D1 protein of PSII; light-inducible)

[0045] Likewise, a wide variety of transcriptional terminators can be used for expression vector construction. Examples of possible terminators include, but are not limited to, psbA, psaAB, rbc, secA, and T7 coat protein.

[0046] Expression vectors are introduced into the cyanobacterial strains by standard methods, including, but not limited to, natural DNA uptake, conjugation, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be: 1) targeted for integration into the cyanobacterial chromosome by including flanking sequences that enable homologous recombination into the chromosome, 2) targeted for integration into endogenous cyanobacterial plasmids by including flanking sequences that enable homologous recombination into the endogenous plasmids, or 3) designed such that the expression vectors replicate within the chosen host.

[0047] For transformation of green algae, a variety of gene promoters and terminators that function in green algae can be utilized, including, but not limited to promoters and terminators from Chlamydomonas and other algae, promoters and terminators from viruses, and synthetic promoters and terminators.

[0048] Expression vectors are introduced into the green algal strains by standard methods, including, but not limited to, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be 1) targeted for site-specific integration into the green algal chloroplast chromosome by including flanking sequences that enable homologous recombination into the chromosome, or 2) targeted for integration into the cellular (nucleus-localized) chromosome.

[0049] For transformation of diatoms, a variety of gene promoters that function in diatoms can be utilized in these expression vectors, including, but not limited to, promoters from Thalassiosira and other heterokont algae, promoters from viruses, and synthetic promoters. Promoters from Thalassiosira pseudonana that would be suitable for use in expression vectors include an alpha-tubulin promoter (SEQ ID NO:1), a beta-tubulin promoter (SEQ ID NO:2), and an actin promoter (SEQ ID NO:3). Promoters from Phaeodactylum tricornutum that would be suitable for use in expression vectors include an alpha-tubulin promoter (SEQ ID NO:4), a beta-tubulin promoter (SEQ ID NO:5), and an actin promoter (SEQ ID NO:6). These sequences are deduced from the genomic sequences of the relevant organisms available in public databases and are merely exemplary of the wide variety of promoters that can be used. The terminators associated with these and other genes, or particular heterologous genes, can be used to stop transcription and provide the appropriate signal for polyadenylation, and can be derived in a similar manner or are known in the art.

[0050] Expression vectors are introduced into the diatom strains by standard methods, including, but not limited to, electroporation, particle bombardment, and abrasion with glass beads, SiC fibers, or other particles. The vectors can be 1) targeted for site-specific integration into the diatom chloroplast chromosome by including flanking sequences that enable homologous recombination into the chromosome, or 2) targeted for integration into the cellular (nucleus-localized) chromosome.

[0051] Host Organisms

[0052] The host cells used to prepare the cultures of the invention include any photosynthetic organism which is able to convert inorganic carbon into a substrate that is in turn converted to fatty acid derivatives. These organisms include prokaryotes as well as eukaryotic organisms such as algae and diatoms.

[0053] Host organisms include eukaryotic algae and cyanobacteria (blue-green algae). Representative algae include green algae (chlorophytes), red algae, diatoms, prasinophytes, glaucophytes, chlorarachniophytes, euglenophytes, chromophytes, and dinoflagellates. A number of cyanobacterial species are known and have been manipulated using molecular biological techniques, including the unicellular cyanobacteria Synechocystis sp. PCC 6803 and Synechococcus elongatus PCC 7942, whose genomes have been completely sequenced.

[0054] The following genera of cyanobacteria may be used: one group includes

TABLE-US-00002 Chamaesiphon Chroococcus Cyanobacterium Cyanobium Cyanothece Dactylococcopsis Gloeobacter Gloeocapsa Gloeothece Microcystis Prochlorococcus Prochloron Synechococcus Synechocystis

[0055] Another group includes

TABLE-US-00003 Cyanocystis Dermocarpella Stanieria Xenococcus Chroococcidiopsis Myxosarcina Pleurocapsa

[0056] Still another group includes

TABLE-US-00004 Arthrospira Borzia Crinalium Geitlerinema Halospirulina Leptolyngbya Limnothrix Lyngbya Microcoleus Oscillatoria Planktothrix Prochlorothrix Pseudanabaena Spirulina Starria Symploca Trichodesmium Tychonema

[0057] Still another group includes

TABLE-US-00005 Anabaena Anabaenopsis Aphanizomenon Calothrix Cyanospira Cylindrospermopsis Cylindrospermum Nodularia Nostoc Rivularia Scytonema Tolypothrix

[0058] And another group includes

TABLE-US-00006 Chlorogloeopsis Fischerella Geitleria Iyengariella Nostochopsis Stigonema

[0059] In addition, various algae, including diatoms and green algae can be employed.

[0060] Desirable qualities of the host strain include high potential growth rate and lipid productivity at 25-50 °C, high light intensity tolerance, growth in brackish or saline water (i.e., in a wide range of water types), resistance to growth inhibition by high O₂ concentrations, filamentous morphology to aid harvesting by screens, resistance to predation, ability to be flocculated (by chemicals or "on-demand autoflocculation"), excellent inorganic carbon uptake characteristics, virus or cyanophage resistance, tolerance to free fatty acids or other compounds associated with the invention method, and ability to undergo metabolic engineering.

[0061] Metabolic engineering is facilitated by the ability to take up DNA by electroporation or conjugation, lack of a restriction system and efficient homologous recombination in the event gene replacement or gene knockouts are required.

[0062] Fatty Acid Adsorption, Removal, and Recovery

[0063] The fatty acids secreted into the culture medium by the recombinant photosynthetic microorganisms described above can be recovered in a variety of ways. A straightforward isolation method by partition using immiscible solvents may be employed. In one embodiment, particulate adsorbents can be employed. These may be lipophilic particulates or ion exchange resins, depending on the design of the recovery method. They may be circulating in the separated medium and then collected, or the medium may be passed over a fixed bed column, for example, a chromatographic column containing these particulates. The fatty acids are then eluted from the particulate adsorbents by the use of an appropriate solvent. Evaporation of the solvent, followed by further processing of the isolated fatty acids and lipids can then be carried out to yield chemicals and fuels that can be used for a variety of commercial purposes.

[0064] The particulate adsorbents may have average diameters ranging from 0.5 mm to 30 mm and can be manufactured from various materials including, but not limited to, polyethylene and derivatives, polystyrene and derivatives, polyamide and derivatives, polyester and derivatives, polyurethane and derivatives, polyacrylates and derivatives, silicone and derivatives, and polysaccharide and derivatives. Certain glass and ceramic materials can also be used as the solid support component of the fat adsorbing objects. The surfaces of the particulate adsorbents may be modified so that they are better able to bind fatty acids and lipids. An example of such modification is the introduction of ether-linked alkyl groups having various chain lengths, preferably 8-30 carbons. In another example, acyl chains of various lengths can be attached to the surface of the fat adsorbing objects via ester, thioester, or amide linkages.

[0065] In one embodiment, the particulate adsorbents are coated with inorganic compounds known to bind fatty acids and lipids. Examples of such compounds include but are not limited to aluminum hydroxide, graphite, anthracite, and silica.

[0066] The particles used may also be magnetized or otherwise derivatized to facilitate recovery. For instance, the particles may be coupled to one member of a binding pair and then adsorbed to a substrate containing the relevant binding partner.

[0067] The fatty acids may be eluted from the particulate adsorbents by the use of an appropriate solvent such as hexane or ethanol. The particulate adsorbents may be reused by returning them to the culture medium or used in a regenerated column. The solvent containing the dissolved fatty acids is then evaporated, leaving the fatty acids in a purified state for further conversion to chemicals and fuels. The particulate adsorbents can be designed to be neutrally buoyant or positively buoyant to enhance circulation in the culture medium. A continuous cycle of fatty acid removal and recovery can be implemented by utilizing the steps outlined above. The recovered fatty acids may be converted to alternative organic compounds, used directly, or mixed with other components. Chemical methods for such conversions are well understood in the art, and developments of biological methods for such conversions are also contemplated.

[0068] The present invention further contemplates a variety of compositions comprising the fatty acids produced by the recombinant photosynthetic microorganisms described herein, and uses thereof. The composition may comprise the fatty acids themselves, or further derivatives of the fatty acids, such as alcohols, alkanes, and alkenes, which can be generated from the fatty acids produced by the microorganisms by any methods that are known in the art, as well as by development of biological methods of conversion. For example, fatty acids may be converted to alkenes by catalytic hydrogenation and catalytic dehydration.

[0069] The composition may serve, for example, as a biocrude. The biocrude can be processed through refineries that will convert the composition compounds to various petroleum and petrochemical replacements, including alkanes, olefins and aromatics through processes including hydrotreatment, decarboxylation, isomerization and catalytic cracking and reforming. The biocrude can be also converted to ester-based fuels, such as fatty acid methyl ester (commercially known as biodiesel), through established chemical processes including transesterification and esterification.

[0070] In addition, one of skill in the art could contemplate a variety of other uses for the fatty acids of the present invention, and derivatives thereof, that are well known in the art, for example, the production of chemicals, soaps, surfactants, detergents, lubricants, nutraceuticals, pharmaceuticals, cosmetics, etc. For example, derivatives of the fatty acids of the present invention include C8 chemicals, such as octanol, used in the manufacture of esters for cosmetics and flavors as well as for various medical applications, and octene, used primarily as a co-monomer in production of polyethylene. Derivatives of the fatty acids of the present invention may also include C10 chemicals, such as decanol, used in the manufacture of plasticizers, surfactants and solvents, and decene, used in the manufacture of lubricants.

[0071] Biocrudes are biologically produced compounds or a mix of different biologically produced compounds that are used as a feedstock for refineries in replacement of, or in complement to, crude oil or other forms of petroleum. In general, but not necessarily, these feedstocks have been pre-processed through biological, chemical, mechanical or thermal processes in order to be in a liquid state that is adequate for introduction in a petroleum refinery.

[0072] The fatty acids of the present invention can be a biocrude, and further processed to a biofuel composition. The biofuel can then perform as a finished fuel or a fuel additive.

[0073] "Finished fuel" is defined as a chemical compound or a mix of chemical compounds (produced through chemical, thermochemical or biological routes) that is in an adequate chemical and physical state to be used directly as a neat fuel or fuel additive in an engine. In many cases, but not always, the suitability of a finished fuel for use in an engine application is determined by a specification which describes the necessary physical and chemical properties that need to be met. Some examples of engines are: internal combustion engine, gas turbine, steam turbine, external combustion engine, and steam boiler. Some examples of finished fuels include: diesel fuel to be used in a compression-ignited (diesel) internal combustion engine, jet fuel to be used in an aviation turbine, fuel oil to be used in a boiler to generate steam or in an external combustion engine, and ethanol to be used in a flex-fuel engine. Examples of fuel specifications are the ASTM standards, mainly used in the US, and the EN standards, mainly used in Europe.

[0074] "Fuel additive" refers to a compound or composition that is used in combination with another fuel for a variety of reasons, which include but are not limited to complying with mandates on the use of biofuels, reducing the consumption of fossil fuel-derived products or enhancing the performance of a fuel or engine. For example, fuel additives can be used to alter the freezing/gelling point, cloud point, lubricity, viscosity, oxidative stability, ignition quality, octane level, and flash point. Additives can further function as antioxidants, demulsifiers, oxygenates, thermal stability improvers, cetane improvers, stabilizers, cold flow improvers, combustion improvers, anti-foams, anti-haze additives, icing inhibitors, injector cleanliness additives, smoke suppressants, drag reducing additives, metal deactivators, dispersants, detergents, dyes, markers, static dissipaters, biocides, and/or corrosion inhibitors.

[0075] The following examples are offered to illustrate but not to limit the invention.

Example 1

Secretion of Fatty Acids by Strains Derived from the Unicellular Photoautotrophic Cyanobacterium Synechococcus elongatus PCC 7942

[0076] The Cuphea hookeriana FatB2 gene encoding an acyl-ACP thioesterase (ChFatB2) enzyme was modified for optimized expression in Synechococcus elongatus PCC 7942. First, the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein was removed. The remainder of the coding region was then codon-optimized using the "Gene Designer" software program (version 1.1.4.1) provided by DNA2.0, Inc. The nucleotide sequence of this derivative of the ChFatB2 gene (hereafter ChFatB2-7942) is provided as SEQ ID NO:7. The protein sequence encoded by this gene is provided in SEQ ID NO:8.

[0077] Two different versions of the trc promoter, trc (Egon, A., et al., Gene (1983) 25:167-178) and "enhanced trc" (hereafter trcE, from pTrcHis A, Invitrogen) were used to drive the expression of ChFatB2-7942 in S. elongatus PCC 7942. The trc promoter is repressed by the Lac repressor protein encoded by the lacIq gene and can be induced by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG). The trcE promoter is a derivative of trc designed to facilitate expression of eukaryotic proteins in E. coli and is also inducible by IPTG.

[0078] The fusion fragments of ChFatB2-7942 operably linked to trc or trcE, together with the lacIq gene, were cloned into the shuttle vector pAM2314 (Mackey, S. R., et al., Methods Mol. Biol. (2007) 362:115-129), which enables transformation of S. elongatus PCC 7942 via double homologous recombination-mediated integration into the "NS1" site of the chromosome. The constructed plasmid containing the trcE::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC01. SEQ ID NO:9 represents the sequence between and including the NS1 recombination sites of pSGI-YC01. The constructed plasmid containing the trc::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC09. SEQ ID NO: 10 represents the sequence between and including the NS1 recombination sites of pSGI-YC09.

[0079] Each of the plasmids pSGI-YC01 and pSGI-YC09, along with the control vector pAM2314, was introduced into wild-type S. elongatus PCC 7942 cells as described by Golden and Sherman (J. Bacteriol. (1984) 158:36-42). Both recombinant and control strains were pre-cultivated in 100 mL of BG-11 medium supplied with spectinomycin (5 mg/L) to late-log phase (OD₇₃₀ = 1.0) on a rotary shaker (150 rpm) at 30 °C with constant illumination (60 µE m⁻² sec⁻¹). Cultures were then subcultured at initial OD₇₃₀ = 0.4-0.5 in BG-11 and cultivated overnight to OD₇₃₀ = 0.7-0.9. For the time-course study, 60-mL aliquots of the culture were transferred into 250-mL flasks and induced by adding IPTG (final conc. = 1 mM) if applicable. Cultures were sampled 0, 48, 96, and 168 hours after IPTG induction and then filtered through Whatman® GF/F filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis.
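By way of illustration only, the volume of IPTG stock needed to reach the stated 1 mM final concentration in a 60-mL aliquot can be estimated as sketched below. The 1 M stock concentration and the function name `stock_volume_ul` are assumptions introduced for this example; the text does not specify how the IPTG was prepared.

```python
def stock_volume_ul(culture_ml: float, final_mm: float, stock_mm: float) -> float:
    """Return the stock volume (in microliters) to add to a culture.

    Solves stock_mm * v = final_mm * (culture_ml + v) for v, so the small
    dilution caused by the added stock itself is taken into account.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    v_ml = final_mm * culture_ml / (stock_mm - final_mm)
    return v_ml * 1000.0  # mL -> microliters


# 60 mL culture and 1 mM final IPTG come from the text;
# the 1 M (1000 mM) stock is an assumed, hypothetical value.
iptg_ul = stock_volume_ul(culture_ml=60.0, final_mm=1.0, stock_mm=1000.0)
print(f"IPTG stock to add: {iptg_ul:.2f} microliters")
```

With a different stock concentration, only the `stock_mm` argument changes.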

[0080] Free fatty acids (FFAs) were separated from filtered cell cultures using liquid-liquid extraction. Five mL of the filtrate were mixed with 125 µL of 1 M H₃PO₄ and 0.25 mL of 5 M NaCl, followed by addition of 2 mL of hexane and thorough mixing. For GC-FID analysis, a 0.2 µL sample of the hexane was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m × 250 µm × 0.25 µm), with a temperature profile starting at 150 °C for 0.5 min, then heating at 15 °C/min to 230 °C and holding for 7.1 min (1.1 mL/min He).
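As a worked illustration, the total GC run time implied by the temperature profile above (initial hold, linear ramp, final hold) can be computed as follows. The function name `gc_run_time_min` is hypothetical, and the sketch assumes an ideal linear ramp with no equilibration or cool-down time.

```python
def gc_run_time_min(initial_hold_min: float, start_c: float,
                    ramp_c_per_min: float, final_c: float,
                    final_hold_min: float) -> float:
    """Total oven program time: initial hold + linear ramp + final hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from the temperature profile described in the text.
total_min = gc_run_time_min(initial_hold_min=0.5, start_c=150.0,
                            ramp_c_per_min=15.0, final_c=230.0,
                            final_hold_min=7.1)
print(f"Oven program length: {total_min:.2f} min")
```

The ramp from 150 °C to 230 °C at 15 °C/min takes about 5.3 min, so the program is a little under 13 min per injection.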

[0081] GC analysis results indicating the levels of medium-chain FFAs (8:0 and 10:0) in cultures containing various Synechococcus elongatus strains 168 hours after IPTG induction are shown in Table 1-1.

TABLE-US-00007 TABLE 1-1 Medium-chain fatty acid secretion in various strains of S. elongatus

                                                                Fatty Acids (mg/L)
Strain       Parent Strain   Plasmid Added   Transgenes           8:0     10:0
SGC-YC2-5    PCC 7942        pAM2314         none                 ND      ND
SGC-YC1-2    PCC 7942        pSGI-YC01       trcE::ChFatB2-7942   1.5     3.5
SGC-YC14-4   PCC 7942        pSGI-YC09       trc::ChFatB2-7942    5.1     10.1

Note: ND represents "not detected" (<1 mg/L).

Example 2

Secretion of Fatty Acids by Strains Derived from the Unicellular Photoheterotrophic Cyanobacterium Synechocystis sp. PCC 6803

[0082] The trcE::ChFatB2-7942 and trc::ChFatB2-7942 fusion fragments, together with the lacIq gene, were cloned into the shuttle vector pSGI-YC03 (SEQ ID NO:11), which enables transformation of Synechocystis sp. PCC 6803 via double homologous recombination-mediated integration into the "RS1" site of the chromosome (Williams, Methods Enzymol. (1988) 167:766-778). The constructed plasmid containing the trcE::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC08. SEQ ID NO:12 represents the sequence between and including the RS1 recombination sites of pSGI-YC08. The constructed plasmid containing the trc::ChFatB2-7942 expression cassette and lacIq gene is designated pSGI-YC14. SEQ ID NO:13 represents the sequence between and including the RS1 recombination sites of pSGI-YC14.

[0083] Each of the plasmids pSGI-YC08 and pSGI-YC14, and the control vector pSGI-YC03, was introduced into wild-type Synechocystis PCC 6803 cells, as described by Zang, X. et al., J. Microbiol. (2007) 45:241-245. Both recombinant and control strains were pre-cultivated in 100 mL of BG-11 medium supplied with kanamycin (10 mg/L) to late-log phase (OD₇₃₀ = 1.0) on a rotary shaker (150 rpm) at 30 °C with constant illumination (60 µE m⁻² sec⁻¹). Cultures were then subcultured at initial OD₇₃₀ = 0.4-0.5 in BG-11 and cultivated overnight to OD₇₃₀ = 0.7-0.9. For time-course studies, 60-mL aliquots of the culture were transferred into 250-mL flasks and induced by adding IPTG (final conc. = 1 mM) when applicable. Cultures were sampled 0, 72, and 144 hours after IPTG induction and then filtered through Whatman® GF/B filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis. Free fatty acids (FFA) were separated from the filtered culture supernatant solutions by liquid-liquid extraction. For each sample, 2 mL filtered culture was extracted with a mixture of 50 µL phosphoric acid (1 M), 100 µL NaCl (5 M) and 2 mL hexane. A 0.2 µL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m × 250 µm × 0.25 µm), with a temperature profile starting at 150 °C for 0.5 min, then heating at 15 °C/min to 230 °C and holding for 7.1 min (1.1 mL/min He).
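For illustration, a concentration measured in the hexane phase can be related back to the concentration in the filtered culture by the ratio of the two volumes. The sketch below assumes, as a simplification not stated in the text, that the fatty acids partition completely into the hexane and that the hexane volume is unchanged by mixing; the 10 mg/L hexane-phase reading is a hypothetical placeholder, not a measured value.

```python
def culture_conc_mg_per_l(hexane_conc_mg_per_l: float,
                          hexane_ml: float, filtrate_ml: float) -> float:
    """Back-calculate the filtrate concentration from the hexane-phase value.

    Assumes complete transfer of the analyte into the hexane layer.
    """
    if hexane_ml <= 0 or filtrate_ml <= 0:
        raise ValueError("volumes must be positive")
    return hexane_conc_mg_per_l * hexane_ml / filtrate_ml


HYPOTHETICAL_HEXANE_READING = 10.0  # mg/L in hexane, illustrative only

# Example 1 extracts 5 mL of filtrate into 2 mL of hexane (2.5-fold concentration).
example_1 = culture_conc_mg_per_l(HYPOTHETICAL_HEXANE_READING, 2.0, 5.0)
# Example 2 extracts 2 mL of filtrate into 2 mL of hexane (no concentration).
example_2 = culture_conc_mg_per_l(HYPOTHETICAL_HEXANE_READING, 2.0, 2.0)
print(example_1, example_2)
```

The same hexane-phase reading therefore corresponds to a lower culture concentration under the Example 1 volumes than under the Example 2 volumes.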

[0084] GC analysis results indicating the levels of medium-chain FFAs (8:0 and 10:0) in cultures 144 hours after IPTG induction are shown in Table 2-1.

TABLE-US-00008 TABLE 2-1 Medium-chain fatty acid secretion in various strains of Synechocystis.

                                                                Fatty Acids (mg/L)
Strain       Parent Strain   Plasmid Added   Transgenes           8:0     10:0
SGC-YC9-8    PCC 6803        pSGI-YC03       none                 ND      ND
SGC-YC10-5   PCC 6803        pSGI-YC08       trcE::ChFatB2-7942   61.3    52.7
SGC-YC16-2   PCC 6803        pSGI-YC14       trc::ChFatB2-7942    2.7     5.8

Note: ND represents "not detected" (<1 mg/L).

Example 3

Secretion of Fatty Acids by Strains Derived from the Filamentous Cyanobacterium Anabaena variabilis ATCC 29413

[0085] The trc::ChFatB2-7942 and trcE::ChFatB2-7942 fusion fragments, together with the lacIq gene, were PCR amplified using primers RS3-3F (SEQ ID NO:14) and 4YC-rrnBter-3 (SEQ ID NO:15) from pSGI-YC14 and pSGI-YC08, respectively, and then cloned into the shuttle vector pEL17, which enables transformation of A. variabilis ATCC 29413 via double homologous recombination-mediated integration into the nifU1 locus of the chromosome (Lyons and Thiel, J. Bacteriol. (1995) 177:1570-1575). The constructed plasmids are designated pSGI-YC69 and pSGI-YC70 for trc::ChFatB2-7942 and trcE::ChFatB2-7942, respectively.

[0086] Each of the plasmids pSGI-YC69 and pSGI-YC70, along with the control vector pEL17, is introduced into wild-type A. variabilis ATCC 29413 cells via tri-parental conjugation, as described by Elhai and Wolk (Methods Enzymol. (1988) 167:747-754). Both recombinant and control strains are pre-cultivated in 100 mL of BG-11 medium supplied with 5 mM NH₄Cl and spectinomycin (3 mg/L) to late-log phase (OD₇₃₀ = 1.0) on a rotary shaker (150 rpm) at 30 °C with constant illumination (60 µE m⁻² sec⁻¹). Cultures are then subcultured at initial OD₇₃₀ = 0.4-0.5 in BG-11 and cultivated overnight to OD₇₃₀ = 0.7-0.9. For time-course studies, 60-mL aliquots of the culture are transferred into 250 mL flasks and induced by adding IPTG (final conc. = 1 mM) if applicable. Cultures are sampled every 72 hours and then filtered through Whatman® GF/F filters using a Millipore vacuum filter manifold. Filtrates are collected in screw top culture tubes for gas chromatographic (GC) analysis as described in Example 1.

Example 4

Secretion of Fatty Acids in Strains Derived from Synechococcus elongatus PCC 7942 Containing an Inactivated Acyl-ACP Synthetase Gene

[0087] A putative acyl-ACP synthetase gene in S. elongatus PCC 7942, synpcc7942_0918 (Cyanobase gene designation), was disrupted by replacing an internal 422-bp portion of its coding region with a 1,741-bp DNA sequence carrying the chloramphenicol resistance marker gene, cat (which encodes chloramphenicol acetyltransferase). Primer pairs 918-15 (SEQ ID NO:16)/918-13 (SEQ ID NO:17) and 918-25 (SEQ ID NO:18)/918-23 (SEQ ID NO:19) were used to amplify two DNA fragments corresponding to a 5' portion (1-480 bp) and a 3' portion (903-1521 bp) of the coding region of synpcc7942_0918, respectively. The cat fragment was amplified from plasmid pAM1573 (Mackey et al., Methods Mol. Biol. 362:115-29) using PCR with primers NS21-3 Cm (SEQ ID NO:20) and ter-3 Cm (SEQ ID NO:21), which overlap primers 918-13 and 918-25, respectively. The recombinant chimeric PCR technique was then used to amplify the complete disruption cassette from the three aforementioned PCR fragments using primers 918-15 and 918-23. The resulting 2,840-bp blunt-end PCR fragment (SEQ ID NO:22) was then ligated into pUC19 (Yanisch-Perron et al., Gene 33:103-119), which had been digested with both HindIII and EcoRI to remove the multiple cloning sites and subsequently blunted with T4 DNA polymerase, to yield plasmid pSGI-YC04.
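The fragment sizes given above are mutually consistent, as the following short check illustrates. It assumes that the three PCR fragments are joined end to end with no additional bases contributed by the primers; the helper `span_len` is a hypothetical name introduced only for this sketch.

```python
def span_len(start: int, end: int) -> int:
    """Length in bp of an inclusive, 1-based coordinate range."""
    return end - start + 1


# Coordinates and sizes as stated in the text.
five_prime_bp = span_len(1, 480)       # 5' portion of the coding region
three_prime_bp = span_len(903, 1521)   # 3' portion of the coding region
deleted_bp = span_len(481, 902)        # internal portion that is replaced
cat_fragment_bp = 1741                 # chloramphenicol resistance fragment

cassette_bp = five_prime_bp + cat_fragment_bp + three_prime_bp
print(f"deleted: {deleted_bp} bp, cassette: {cassette_bp} bp")
```

The computed values match the 422-bp replaced portion and the 2,840-bp disruption cassette reported in the text.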

[0088] Plasmid pSGI-YC04 was introduced into S. elongatus strain SGC-YC1-2, which harbors a copy of trcE::ChFatB2-7942 integrated into NS1 (see Example 1). The resulting strain was designated SGC-YC4-7. Fatty acid production assays and GC analyses were performed as described in Example 1. The results of GC analyses indicating the levels of FFAs in cultures of various S. elongatus strains 168 hours after IPTG induction are shown in Table 4-1. It is possible that inactivation of the acyl-ACP synthetase gene has a larger impact on secretion of long-chain fatty acids than on secretion of medium-chain fatty acids.

TABLE-US-00009 TABLE 4-1 Medium-chain fatty acid secretion in various strains of S. elongatus.

                                                                                  Fatty Acids (mg/L)
Strain      Parent Strain   Plasmid Added   Transgenes           Deletions         8:0   10:0   16:0   16:1
SGC-YC2-5   PCC 7942        pAM2314         none                 none              ND    ND     ND     1.4
SGC-YC1-2   PCC 7942        pSGI-YC01       trcE::ChFatB2-7942   none              1.4   4.2    ND     1.6
SGC-YC4-7   SGC-YC1-2       pSGI-YC04       trcE::ChFatB2-7942   synpcc7942_0918   1.0   3.1    1.1    3.9

Note: ND represents "not detected" (<1 mg/L).
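To illustrate the comparison suggested by Table 4-1, the ratio of each fatty acid level in the disrupted strain (SGC-YC4-7) to its parent (SGC-YC1-2) can be computed as sketched below. Values are copied from the table; "ND" entries are represented as `None` and no ratio is reported for them, which is a choice made for this example rather than a method described in the text.

```python
from typing import Dict, Optional

ND = None  # "not detected" (<1 mg/L); no numeric value is assumed

# Secreted fatty acids (mg/L) from Table 4-1.
table_4_1: Dict[str, Dict[str, Optional[float]]] = {
    "SGC-YC1-2": {"8:0": 1.4, "10:0": 4.2, "16:0": ND, "16:1": 1.6},
    "SGC-YC4-7": {"8:0": 1.0, "10:0": 3.1, "16:0": 1.1, "16:1": 3.9},
}


def fold_change(after: Optional[float], before: Optional[float]) -> Optional[float]:
    """Ratio after/before, or None when either value was not detected."""
    if after is None or before is None:
        return None
    return after / before


changes = {
    fa: fold_change(table_4_1["SGC-YC4-7"][fa], table_4_1["SGC-YC1-2"][fa])
    for fa in ("8:0", "10:0", "16:0", "16:1")
}
for fa, ratio in changes.items():
    print(fa, "n/a (ND in one strain)" if ratio is None else f"{ratio:.2f}x")
```

On these single measurements the 8:0 and 10:0 ratios fall below 1 while the 16:1 ratio is above 2, and 16:0 goes from not detected to 1.1 mg/L.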

Example 5

Secretion of Fatty Acids in Strains Derived from Synechocystis sp. PCC6803 Containing an Inactivated Acyl-ACP Synthetase Gene

[0089] A ~1.7-kbp DNA fragment spanning an area upstream and into the coding region of the acyl-ACP synthetase-encoding gene, slr1609 (Cyanobase gene designation), from Synechocystis sp. PCC 6803 was amplified from genomic DNA using PCR with primers NB001 (SEQ ID NO:23) and NB002 (SEQ ID NO:24). This fragment was cloned into the pCR2.1 vector (Invitrogen) to yield plasmid pSGI-NB3 and subsequently cut with the restriction enzyme MfeI. A chloramphenicol resistance marker cassette containing the cat gene and associated regulatory control sequences was amplified from plasmid pAM1573 (Andersson, et al., Methods Enzymol. (2000) 305:527-542) to contain flanking MfeI restriction sites using PCR with primers NB010 (SEQ ID NO:25) and NB011 (SEQ ID NO:26). The cat gene expression cassette was then inserted into the MfeI site of pSGI-NB3 to yield pSGI-NB5 (SEQ ID NO:27).

[0090] The pSGI-NB5 vector was transformed into trcE::ChFatB2-7942-containing Synechocystis strain SGC-YC10-5 (see Example 2) according to Zang et al., J. Microbiol. (2007) 45:241-245. Insertion of the chloramphenicol resistance marker into the slr1609 gene through homologous recombination was verified by PCR screening of insert and insertion site. The resulting strain was designated SGC-NB10-4, which was tested in liquid BG-11 medium for fatty acid secretion. All liquid medium growth conditions used a rotary shaker (150 rpm) at 30 °C with constant illumination (60 µE m⁻² sec⁻¹). Cultures were inoculated in 25 mL of BG-11 medium containing chloramphenicol and/or kanamycin (5 µg/mL) as appropriate and grown to a sufficient density (minimal OD₇₃₀ = 1.6-2). Cultures were then used to inoculate 100 mL BG-11 medium in 250 mL polycarbonate flasks to OD₇₃₀ = 0.4-0.5 and incubated overnight. Aliquots (45 mL) of overnight culture at OD₇₃₀ = 0.7-0.9 were transferred to new 250 mL flasks and either induced with 1 mM IPTG or kept as uninduced controls. Samples (5 mL) were taken at 0, 72 and 144 hours post induction and processed as described in Example 2.

[0091] Free fatty acids (FFA) were separated from the filtered culture supernatant solutions by liquid-liquid extraction for GC/FID (flame ionization detector) analysis. For each sample, 2 mL filtered culture was extracted with a mixture of 50 µL phosphoric acid (1 M), 100 µL NaCl (5 M) and 2 mL hexane. A 0.2 µL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m × 250 µm × 0.25 µm), with a temperature profile starting at 150 °C for 0.5 min, then heating at 15 °C/min to 230 °C and holding for 7.1 min (1.1 mL/min He).

[0092] GC results indicating secreted levels of free fatty acids after 144 hours are shown in Table 5-1.

TABLE-US-00010 TABLE 5-1 Medium-chain fatty acid secretion in various strains of Synechocystis.

                                                                          Fatty Acids (mg/L)
Strain       Parent Strain   Plasmid Added   Transgenes           Deletions   8:0     10:0
SGC-YC10-5   PCC 6803        pSGI-YC08       trcE::ChFatB2-7942   none        58.3    67.7
SGC-NB10-4   SGC-YC10-5      pSGI-NB5        trcE::ChFatB2-7942   slr1609     57.7    73.7

Note: ND represents "not detected" (<1 mg/L).

Example 6

Expression of Cuphea lanceolata Kas-IV and Helianthus annuus Kas-III genes in Synechocystis sp.

[0093] A DNA fragment comprising a functional operon was synthesized such that it contained the following elements in the given order: the trc promoter, the Cuphea lanceolata 3-ketoacyl-acyl carrier protein synthase IV gene (ClKas-IV, GenBank Accession No. CAC59946) codon-optimized for expression in Synechococcus elongatus PCC 7942, and the rps14 terminator (SEQ ID NO:28) from Synechococcus sp. WH8102. The nucleotide sequence of this entire functional operon, along with various flanking restriction enzyme recognition sites, is provided in SEQ ID NO:29.

[0094] Another DNA fragment comprising a functional operon was synthesized such that it contained the following elements in the given order: the trc promoter, the Helianthus annuus 3-ketoacyl-acyl carrier protein synthase III gene (HaKas-III, GenBank Accession No. ABP93352) codon-optimized for expression in both Synechococcus elongatus PCC 7942 and Synechocystis sp. PCC 6803, and rps14 terminator from Synechococcus sp. WH8102. The nucleotide sequence of this functional operon, along with various flanking restriction enzyme recognition sites, is provided in SEQ ID NO:30.

[0095] Codon optimization was performed by the use of the "Gene Designer" (version 1.1.4.1) software program provided by DNA2.0, Inc. The functional operon (expression cassette) containing the codon-modified ClKas-IV gene as represented in SEQ ID NO:29 was digested by the restriction enzymes SpeI and XbaI and inserted into plasmid pSGI-YC39 between the restriction sites SpeI and XbaI to form plasmid pSGI-BL26, which enables integration of the functional operon into the Synechocystis sp. PCC 6803 chromosome at the "RS2" recombination site (Aoki, et al., J. Bacteriol (1995) 177:5606-5611). The plasmid pSGI-BL27 containing the DNA fragment represented in SEQ ID NO:30 was constructed in the same way.

[0096] Plasmid pSGI-BL43 contains the trcE promoter, the codon-optimized ClKas-IV gene, and the rps14 terminator as represented in SEQ ID NO:31 and was made by inserting a SpeI/NcoI trcE fragment from pTrcHis A (Invitrogen) into SpeI/NcoI-digested pSGI-BL26. An additional plasmid, pSGI-BL44, contains the trcE promoter, the optimized ClKas-IV gene, the S. elongatus PCC 7942 kaiBC intergenic region, the optimized HaKas-III gene, and the rps14 terminator as represented in SEQ ID NO:32 and was made by inserting a BamHI/SacI fragment (containing the S. elongatus kaiBC intergenic region, the HaKas-III gene, and the rps14 terminator) generated via PCR amplification into BglII/SacI-digested pSGI-BL43. The PCR primers used to generate the DNA fragment containing the kaiBC region, HaKas-III, and rps14 terminator are provided as SEQ ID NO:33 and SEQ ID NO:34.

[0097] Wild-type Synechocystis PCC 6803 cells and transgenic Synechocystis strain SGC-YC10-5, which contains the ChFatB2-7942 gene, were transformed with plasmids pSGI-BL26, pSGI-BL27, pSGI-BL43 and pSGI-BL44 as described by Zang, X. et al., J. Microbiol. (2007) 45:241-245. Both recombinant and wild-type control strains were pre-cultivated in 20 mL of BG-11 medium to mid-log phase (OD₇₃₀ = 0.7-0.9) on a rotary shaker (150 rpm) at 30 °C with constant illumination (60 µE m⁻² sec⁻¹). Kanamycin (5 µg/mL) and/or spectinomycin (10 µg/mL) were included in recombinant cultures as appropriate. Cultures were then subcultured at initial OD₇₃₀ = 0.4-0.5 in BG-11 and cultivated overnight to OD₇₃₀ = 0.7-0.9. For a time-course study, 45-mL aliquots of the culture were transferred into 250 mL flasks and induced by adding IPTG (final conc. = 1 mM) when applicable. Cultures were sampled 0, 72, and 144 hours after IPTG induction and then filtered through Whatman® GF/B filters using a Millipore vacuum filter manifold. Filtrates were collected in screw top culture tubes for gas chromatographic (GC) analysis as described in Example 2.

[0098] Results indicating the levels of secreted octanoic acid and decanoic acid in culture supernatants 144 hours after culture inoculation are shown in Table 6-1. The ClKas-IV and HaKas-III genes present in the indicated strains were under the control of the trc promoter.

TABLE-US-00011 TABLE 6-1 Medium-chain fatty acid secretion (in mg/L) in various Synechocystis sp. strains

                                                                             Fatty Acids (mg/L)
Strain       Parent Strain   Plasmid Added   Transgenes                        8:0     10:0
PCC 6803     n/a             n/a             None                              ND      ND
SGC-YC10-5   PCC 6803        pSGI-YC08       trcE-ChFatB2-7942                 69.8    68.4
SGC-BL26-3   PCC 6803        pSGI-BL26       trc-ClKas-IV                      ND      ND
SGC-BL26-5   SGC-YC10-5      pSGI-BL26       trcE-ChFatB2-7942, trc-ClKas-IV   69.5    71.9
SGC-BL27-1   PCC 6803        pSGI-BL27       trc-HaKas-III                     ND      ND
SGC-BL27-2   SGC-YC10-5      pSGI-BL27       trcE-ChFatB2-7942, trc-HaKas-III  65.7    66.6

Note: ND represents "not detected" (<1 mg/L).

[0099] To allow comparison between strains independent of culture density, the fatty acid secretion data shown in Table 6-1 were normalized to cell culture density, measured as optical density at 730 nm (OD₇₃₀); these data are presented in Table 6-2. Other experiments described in this application could be normalized in a similar fashion.

TABLE 6-2
Normalized medium-chain fatty acid secretion (mg/L/OD₇₃₀) in various Synechocystis sp. strains

Strain | Parent Strain | Plasmid Added | Transgenes | 8:0 | 10:0
PCC 6803 | n/a | n/a | None | ND | ND
SGC-YC10-5 | PCC 6803 | pSGI-YC08 | trcE-ChFatB2-7942 | 11.7 | 11.4
SGC-BL26-3 | PCC 6803 | pSGI-BL26 | trc-ClKas-IV | ND | ND
SGC-BL26-5 | SGC-YC10-5 | pSGI-BL26 | trcE-ChFatB2-7942, trc-ClKas-IV | 11.7 | 12.1
SGC-BL27-1 | PCC 6803 | pSGI-BL27 | trc-HaKas-III | ND | ND
SGC-BL27-2 | SGC-YC10-5 | pSGI-BL27 | trcE-ChFatB2-7942, trc-HaKas-III | 12.2 | 12.3

Note: ND represents "not detected" (<1 mg/L).
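The normalization relating Table 6-1 to Table 6-2 can be illustrated with a minimal Python sketch. The function names are hypothetical and introduced only for illustration. The culture densities themselves are not reported in this example, so the sketch back-calculates the OD₇₃₀ implied by a raw/normalized pair rather than using a measured value.

```python
def normalize_by_od(titer_mg_per_l, od730):
    """Return secretion normalized to culture density (mg/L/OD730)."""
    if od730 <= 0:
        raise ValueError("OD730 must be positive")
    return titer_mg_per_l / od730


def implied_od(titer_mg_per_l, normalized_mg_per_l_per_od):
    """Back-calculate the culture density implied by a raw/normalized pair."""
    if normalized_mg_per_l_per_od <= 0:
        raise ValueError("normalized value must be positive")
    return titer_mg_per_l / normalized_mg_per_l_per_od


# Strain SGC-YC10-5, 8:0 fatty acid:
# 69.8 mg/L (Table 6-1) and 11.7 mg/L/OD730 (Table 6-2).
od = implied_od(69.8, 11.7)
print(f"implied OD730: {od:.2f}")
print(f"normalized secretion: {normalize_by_od(69.8, od):.1f} mg/L/OD730")
```

The implied density is only as precise as the rounded table entries, so it should be read as an approximation.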

[0100] Results indicating the levels of secreted octanoic acid and decanoic acid in culture supernatants of additional strains 120 hours after culture inoculation are shown in Table 6-3. The ClKas-IV and HaKas-III genes present in the indicated strains were under the control of the trcE promoter.

TABLE 6-3
Medium-chain fatty acid secretion (in mg/L) in various Synechocystis sp. strains

Strain | Parent Strain | Plasmid Added | Transgenes | 8:0 (mg/L) | 10:0 (mg/L)
SGC-YC10-5 | PCC 6803 | pSGI-YC08 | trcE-ChFatB2-7942 | 34.8 | 43.5
SGC-BL44 | PCC 6803 | pSGI-BL44 | trcE-ClKAS-IV + HaKAS-III | ND | ND
SGC-YC10-5-BL43 | SGC-YC10-5 | pSGI-BL43 | trcE-ChFatB2-7942, trcE-ClKas-IV | 40.0 | 48.1
SGC-YC10-5-BL44 | SGC-YC10-5 | pSGI-BL44 | trcE-ChFatB2-7942, trcE-ClKAS-IV + HaKAS-III | 38.5 | 47.1

Note: ND represents "not detected" (<1 mg/L).

[0101] To allow comparison between strains independent of culture density, the fatty acid secretion data shown in Table 6-3 were normalized to cell culture density, measured as optical density at 730 nm (OD₇₃₀); these data are presented in Table 6-4.

TABLE 6-4
Normalized medium-chain fatty acid secretion (mg/L/OD₇₃₀) in various Synechocystis sp. strains

Strain | Parent Strain | Plasmid Added | Transgenes | 8:0 | 10:0
SGC-YC10-5 | PCC 6803 | pSGI-YC08 | trcE-ChFatB2-7942 | 6.8 | 8.5
SGC-BL44 | PCC 6803 | pSGI-BL44 | trcE-ClKAS-IV + HaKAS-III | ND | ND
SGC-YC10-5-BL43 | SGC-YC10-5 | pSGI-BL43 | trcE-ChFatB2-7942, trcE-ClKas-IV | 7.4 | 8.9
SGC-YC10-5-BL44 | SGC-YC10-5 | pSGI-BL44 | trcE-ChFatB2-7942, trcE-ClKAS-IV + HaKAS-III | 8.3 | 10.2

Example 7

Introduction of a Heterologous Acyl-ACP Thioesterase Gene into a Diatom

[0102] A synthetic gene that encodes a derivative of the ChFatB2 enzyme with specificity for medium-chain (8:0-10:0) acyl-ACPs is expressed in various diatoms (Bacillariophyceae) by constructing and utilizing expression vectors comprising the ChFatB2 gene operably linked to gene regulatory regions (promoters and terminators) that function in diatoms. In a preferred embodiment, the gene is optimized for expression in specific diatom species and the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein is replaced with a plastid transit peptide that functions optimally in diatoms. The nucleotide sequence provided as SEQ ID NO:35 represents a synthetic derivative of the ChFatB2 gene that has been optimized for expression in Thalassiosira pseudonana and in which the native plastid transit peptide-encoding region of the gene has been replaced with the plastid transit peptide (including coupled signal sequence) associated with the gamma subunit of the coupling factor portion (CF1) of the chloroplast ATP synthase from T. pseudonana (JGI Identifier=jgi/Thaps3/40156/est Ext_gwp_gwl.C_chr_40019). The protein encoded by this gene (referred to hereafter as ChFatB2-Thal) is provided in SEQ ID NO:36.

[0103] To produce an expression vector for T. pseudonana, the ChFatB2-Thal gene was placed between the T. pseudonana alpha-tubulin promoter and terminator regulatory sequences. The alpha-tubulin promoter was amplified from genomic DNA isolated from T. pseudonana CCMP 1335 by use of primers PR1 (SEQ ID NO:37) and PR3 (SEQ ID NO:38), whereas the alpha-tubulin terminator was amplified by use of primers PR4 (SEQ ID NO:39) and PR8 (SEQ ID NO:40). The KpnI/BamHI fragment from the alpha-tubulin promoter amplicon, the BamHI/XbaI fragment from the alpha-tubulin terminator and the large fragment from KpnI/XbaI-cut pUC118 (Vieira and Messing, Meth. Enzymol. (1987) 153:3-11) were then combined to form pSGI-PR5. The NcoI/BamHI fragment from the ChFatB2-Thal gene was then inserted into NcoI/BamHI-digested pSGI-PR5 to form pSGI-PR16. In addition, a codon-optimized gene that encodes the nourseothricin acetyltransferase (NAT) enzyme from Streptomyces noursei (SEQ ID NO:41) (Krugel, et al., Gene (1993) 127:127-131) was synthesized and the NcoI/BamHI fragment from this NAT-encoding DNA molecule was inserted into the large NcoI/BamHI fragment from pSGI-PR5 to form pSGI-PR7, which upon introduction into T. pseudonana and other diatoms can provide resistance to the antibiotic nourseothricin.

[0104] pSGI-PR16 and pSGI-PR7 were co-transformed into T. pseudonana CCMP 1335 by means of particle bombardment essentially as described by Poulsen, et al. (J. Phycol. (2006) 42:1059-1065). Transformed cells were selected on agar plates in the presence of 100 mg/L nourseothricin (ClonNAT, obtained from Werner BioAgents, Germany). The presence of the ChFatB2-Thal gene in cells was confirmed by the use of PCR. Transformants were grown in ASW liquid medium (Darley and Volcani, Exp. Cell Res. (1964) 58:334) on a rotary shaker (150 rpm) at 18°C with constant illumination (60 µE m⁻² sec⁻¹). Samples were removed seven days after inoculation and the culture medium was tested for the presence of FFAs as described in Example 1.

[0105] Although no fatty acid secretion was detected under these particular experimental conditions, optimization of the ChFatB2-Thal gene and diatom host strain can be performed to achieve fatty acid secretion in diatoms, which are known to have relatively impervious cell walls.

Example 8

Secretion of Fatty Acids by Green Algae

[0106] A synthetic gene that encodes a derivative of the ChFatB2 enzyme with specificity for medium-chain (8:0-10:0) acyl-ACPs is expressed in green algae (Chlorophyceae) by constructing and utilizing expression vectors comprising the ChFatB2 gene operably linked to gene regulatory regions (promoters and terminators) that function in green algae. The gene is optimized for expression in specific green algal species and the portion of the gene that encodes the plastid transit peptide region of the native ChFatB2 protein is replaced with a plastid transit peptide that functions optimally in green algae. The nucleotide sequence provided as SEQ ID NO:42 represents a derivative of the ChFatB2 gene optimized for expression in Chlamydomonas reinhardtii and in which the native plastid transit peptide-encoding region of the gene has been replaced with the plastid transit peptide associated with the gamma subunit of the coupling factor portion (CF1) of the chloroplast ATP synthase from C. reinhardtii (GenPept Accession No. XP_001696335). The protein encoded by this gene is provided in SEQ ID NO:43.

Example 9

Secretion of Fatty Acids in Strains of Synechocystis sp. Containing a Disrupted 1,4-alpha-Glucan Branching Enzyme Gene

[0107] A 1.4-kbp DNA fragment spanning an area upstream and into the coding region of the 1,4-alpha-glucan branching enzyme gene (glgB, Cyanobase gene designation = sll0158) from Synechocystis sp. PCC 6803 was amplified from genomic DNA using PCR with primers glgB-5 (SEQ ID NO:44) and glgB-3 (SEQ ID NO:45). This fragment was cloned into the pCR4-Topo vector (Invitrogen) to yield plasmid pSGI-BL32, which was subsequently cut with the restriction enzyme AvaI. A spectinomycin resistance marker cassette containing the aadA gene and associated regulatory control sequences was excised from plasmid pSGI-BL27 by digestion with HindIII. Both of the linear fragments were treated with the Quick Blunting™ Kit (New England Biolabs). The aadA gene expression cassette was then inserted into the AvaI site of pSGI-BL32 to yield pSGI-BL33. The portion of pSGI-BL33 that inserts into and inactivates the glgB gene is provided as SEQ ID NO:46.

[0108] The pSGI-BL33 vector was transformed into wild-type Synechocystis PCC 6803 and into trcE::ChFatB2-7942-containing Synechocystis strain SGC-YC10-5 (see Example 1) according to Zang, et al., J. Microbiology (2007) 45:241-245. Insertion of the spectinomycin resistance marker into the sll0158 (glgB) gene via homologous recombination was verified by PCR screening of the insert and insertion site. Verified knockout strains were tested in liquid BG-11 medium for secretion of fatty acids. All liquid medium growth conditions used a rotary shaker (150 rpm) at 30°C with constant illumination (60 µE m⁻² sec⁻¹). Cultures were inoculated in 25 mL of BG-11 medium containing spectinomycin (10 µg/mL) and/or kanamycin (5 µg/mL) as appropriate and grown to a sufficient density (minimal OD₇₃₀ = 1.6-2). Cultures were then used to inoculate 100 mL of BG-11 medium in 250-mL polycarbonate flasks to OD₇₃₀ = 0.4-0.5 and incubated overnight. Forty-five mL of overnight culture at OD₇₃₀ = 0.5 were added to new 250-mL flasks; some cultures were induced with 1 mM IPTG and the others served as uninduced controls. Samples (0.5 mL) were taken at 0, 72, 144, and 216 hours post induction and processed as described in Example 2.

[0109] Free fatty acids (FFA) were separated from the filtered culture supernatant solutions by liquid-liquid extraction for GC/FID analysis. For each sample, 2 mL of filtered culture were extracted with a mixture of 50 µL phosphoric acid (1 M), 100 µL NaCl (5 M) and 2 mL hexane. A 0.2 µL sample was injected using a 40:1 split ratio onto a DB-FFAP column (J&W Scientific, 15 m × 250 µm × 0.25 µm), with a temperature profile starting at 150°C for 0.5 min, then heating at 15°C/min to 230°C and holding for 7.1 min (1.1 mL/min He).
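As a worked illustration of the oven temperature profile described above, the following Python sketch adds the initial hold, the ramp and the final hold to estimate the length of one GC run. The function name is hypothetical, and the calculation ignores any oven equilibration or cool-down time between injections, which the text does not specify.

```python
def gc_run_time_min(start_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Profile from the text: 150 C for 0.5 min, 15 C/min to 230 C, hold 7.1 min.
total = gc_run_time_min(150.0, 0.5, 15.0, 230.0, 7.1)
print(f"oven program length: {total:.2f} min")
```

Under these values the ramp itself takes 80/15, or roughly 5.3 minutes, so the full program is a little under 13 minutes per injection.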

[0110] GC results indicating secreted levels of free fatty acids after 216 hours are shown in Table 9-1.

TABLE 9-1
Medium-chain Fatty Acid Secretion (in mg/L) in Various Synechocystis sp. Strains

Strain | Parent Strain | Plasmid Added | Deletion | Transgenes | 8:0 | 10:0
PCC 6803 | n/a | n/a | None | None | ND | ND
SGC-BL33-1 | PCC 6803 | pSGI-BL33 | Sll0158 (glgB) | None | ND | ND
SGC-YC10-5 | PCC 6803 | pSGI-YC08 | None | trcE-ChFatB2-7942 | 70.0 | 68.7
SGC-BL33-2 | SGC-YC10-5 | pSGI-BL33 | Sll0158 (glgB) | trcE-ChFatB2-7942 | 66.2 | 68.1

Note: ND represents "not detected" (<1 mg/L).

[0111] To allow comparison between strains independent of culture density, the fatty acid secretion data shown in Table 9-1 were normalized to cell culture density, measured as optical density at 730 nm (OD₇₃₀); these data are presented in Table 9-2. Other experiments described in this application could be normalized in a similar fashion.

TABLE 9-2
Normalized Medium-chain Fatty Acid Secretion (mg/L/OD₇₃₀) in Various Synechocystis sp. Strains

Strain | Parent Strain | Plasmid Added | Deletion | Transgenes | 8:0 | 10:0
PCC 6803 | n/a | n/a | None | None | ND | ND
SGC-BL33-1 | PCC 6803 | pSGI-BL33 | Sll0158 (glgB) | None | ND | ND
SGC-YC10-5 | PCC 6803 | pSGI-YC08 | None | trcE-ChFatB2-7942 | 9.8 | 9.7
SGC-BL33-2 | SGC-YC10-5 | pSGI-BL33 | Sll0158 (glgB) | trcE-ChFatB2-7942 | 10.4 | 10.7

Note: ND represents "not detected" (<1 mg/L).

Example 10

Capture of Free Fatty Acids from Model Solutions with Hydrophobic Adsorbent Resins

[0112] A spike solution was formulated by dissolving 75 mg/L octanoic acid and 75 mg/L decanoic acid in BG-11 medium supplemented with 300 mM NaCl and adjusting the pH to 5.8. 50 mg of each of the resins listed in Table 10-1 were weighed into a 50 mL centrifuge tube, combined with 1.0 mL of methanol and shaken gently. The excess methanol was decanted and the resins were dried overnight at room temperature under a 25 in Hg vacuum. 50 mL of the spike solution was then added to each of the resins and incubated with gentle shaking at 31°C for 24 hours. Following incubation, the resins were removed by filtering over a Whatman® GF/F glass fiber filter and the filtrates were analyzed for octanoic acid and decanoic acid content by gas chromatography as described in Example 2. The capacity of each resin for octanoic and decanoic acid was then determined from the difference in the concentration of each fatty acid before and after incubation with the resin. The results are shown in Table 10-1 below.

TABLE 10-1
Adsorption capacities (mg/g) of several commercially-available adsorbents

Description | Resin type | Octanoic acid | Decanoic acid | Total free fatty acids
Dowex Optipore® V503 (Dow Chemical) | Post cross-linked macroporous polystyrene divinyl benzene | 26.3 | 69.8 | 96.0
Lewatit 1064 MD (LanXess) | Macroporous polystyrene divinyl benzene | 1.1 | 46.7 | 47.8
Zeolyst CBV 28014 (Zeolyst) | Very low-alumina zeolite | 17.4 | 74.7 | 92.0
Zeolyst CBV 901 (Zeolyst) | Low-alumina zeolite | 5.4 | 64.8 | 70.1
Hisiv 3000 Silicalite (UOP Honeywell) | Hydrophobic silicalite | 15.3 | 23.7 | 39.1
Lipidex 5000 (Packard Instrument Co.) | Alkylated sephadex gel | 0.00 | 18.6 | 18.6
Norit ROW 0.8 (Fluka) | Extruded activated charcoal | 40.2 | 71.8 | 112.1
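The capacity calculation described in paragraph [0112] (difference in fatty acid concentration before and after incubation, scaled by liquid volume and resin mass) can be sketched in Python as follows. The function name is hypothetical. The post-incubation concentration used in the example (48.7 mg/L) is not a reported measurement; it is back-calculated here from the tabulated octanoic acid capacity of Dowex Optipore® V503 purely to show the arithmetic.

```python
def adsorption_capacity_mg_per_g(c_before_mg_l, c_after_mg_l, volume_ml, resin_mg):
    """Capacity = mass of fatty acid removed from solution / mass of resin."""
    if resin_mg <= 0 or volume_ml <= 0:
        raise ValueError("volume and resin mass must be positive")
    adsorbed_mg = (c_before_mg_l - c_after_mg_l) * (volume_ml / 1000.0)
    resin_g = resin_mg / 1000.0
    return adsorbed_mg / resin_g


# 50 mL of spike solution (75 mg/L octanoic acid) contacted with 50 mg resin.
# 48.7 mg/L is an illustrative, back-calculated filtrate concentration.
cap = adsorption_capacity_mg_per_g(75.0, 48.7, volume_ml=50.0, resin_mg=50.0)
print(f"octanoic acid capacity: {cap:.1f} mg/g")
```

Because 50 mL of solution was used per 50 mg of resin (1 L per g), the capacity in mg/g is numerically equal to the drop in concentration in mg/L. It also follows that no resin in this experiment could register more than 75 mg/g for either fatty acid, which is worth keeping in mind when comparing the best-performing adsorbents.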

[0113] Elution of free fatty acids from the hydrophobic adsorbents was also investigated. Dowex® Optipore® V503, Zeolyst CBV 28014, Zeolyst CBV 901, and Norit® ROW were incubated with 1.0 mL of spike solution per mg of adsorbent as described above. After the incubation period, the adsorbents were rinsed and combined with 0.1, 0.5, or 1.0 mL methanol per mg of adsorbent and shaken gently at room temperature for 4 hours. The methanol eluates and post-adsorption spikes were analyzed for free fatty acid concentration by gas chromatography. The results are listed in Table 10-2 below.

TABLE 10-2
Desorption of free fatty acids in methanol (% desorption at each methanol-to-resin ratio)

Resin | 0.1 mL MeOH/mg | 0.5 mL MeOH/mg | 1.0 mL MeOH/mg
Dowex Optipore® V503 | 92% | 84% | 100%
CBV 28014 | 53% | 76% | 84%
CBV 901 | 78% | 76% | 57%
Norit® ROW | 44% | 85% | 77%

[0114] The effect of pH on adsorbent capacity was studied utilizing Dowex® Optipore® V503. 40 mg of the resin were combined with 40 mL of BG-11 medium spiked with 150 mg/L of octanoic and decanoic acid and adjusted to a pH of 10.0, 7.5, 4.8, or 2.8. The pH 10 spike was buffered with 5 mM CAPS. The pH 7.5 and 2.8 spikes were buffered with 5 mM phosphate, and the pH 4.8 spike was buffered naturally by the dissolved fatty acids, with 5 mM NaCl added to maintain consistent conductivity. The spikes were incubated with resin as described above. Free fatty acid concentrations were measured with an enzymatic assay purchased from Zen-Bio. The results are displayed in Table 10-3 below. From these results, it is clear that hydrophobic adsorption of free fatty acids is possible over a wide range of pH, although capacity decreases markedly as the pH increases.

TABLE 10-3
Adsorption capacity of Dowex® Optipore® V503 at various pH values

pH | Adsorption Capacity (mg FFA/g resin)
10 | 42 ± 13
7.5 | 64 ± 4
4.8 | 172 ± 4
2.8 | 259 ± 1

[0115] Reported values are the mean of two experimental replicates ± one standard deviation.

Example 11

In Vivo Capture of Free Fatty Acids from Cultures of Synechocystis Strain SGC-YC10-5

[0116] Synechocystis sp. strain SGC-YC10-5, which contains the ChFatB2-7942 gene as described in Example 1, was cultured in BG-11 with and without Dowex® Optipore® V503 resin. 400 mL of fresh culture was induced with 5 mM IPTG and incubated at room temperature for 1 hour to allow for uptake of the inducer. The culture was then divided into four 1,000 mL baffled Erlenmeyer flasks with PTFE vent caps. To two of the flasks, approximately 400 mg of Dowex® Optipore® V503 were added. The adsorbent resin in the test flasks was recovered and exchanged for fresh resin daily for 10 days. The recovered resin was washed liberally with deionized water and eluted with 2 mL of methanol. Samples of culture medium from the test flasks and control flasks were also taken daily. The samples were measured for OD₇₃₀, filtered over a Whatman® GF/B glass fiber filter and analyzed for octanoic acid and decanoic acid content by gas chromatography as previously described in Example 2. The results are presented in Table 11-1.

TABLE 11-1
In vivo capture of free fatty acids from Synechocystis SGC-YC10-5 cultures

Condition | Avg. Specific Growth Rate (d⁻¹) | Average Free Fatty Acid Productivity (mg L⁻¹ d⁻¹)
Without Dowex | 0.090 ± 0.005 | 16 ± 0.8
With Dowex | 0.090 ± 0.010 | 31 ± 3

[0117] Reported values are the mean of two biological duplicates ± one standard deviation.
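To put the figures in Table 11-1 in more familiar terms, the short Python sketch below converts the average specific growth rate into a doubling time and expresses the productivity gain with resin as a fold change. It assumes simple exponential growth, which is a simplification not stated in the text, and the function name is hypothetical.

```python
import math


def doubling_time_days(specific_growth_rate_per_day):
    """Doubling time for exponential growth: ln(2) / mu."""
    if specific_growth_rate_per_day <= 0:
        raise ValueError("specific growth rate must be positive")
    return math.log(2) / specific_growth_rate_per_day


# Mean values from Table 11-1.
mu = 0.090                      # d^-1, same with and without resin
productivity_without = 16.0     # mg/L/d
productivity_with = 31.0        # mg/L/d

t_double = doubling_time_days(mu)
fold = productivity_with / productivity_without
print(f"doubling time: {t_double:.1f} d")
print(f"productivity with resin: {fold:.2f}-fold of control")
```

Read this way, the table indicates that in situ resin capture roughly doubled free fatty acid productivity while leaving the mean growth rate unchanged.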

Example 12

Integration of CO₂ Delivery and Product Recovery as a Means for Enhancing the Efficiency and Economy of Both

[0118] Table 10-3 above reveals a clear relationship between free fatty acid adsorption capacity and pH. This relationship results from the inefficiency of extraction of the ionized form of the free fatty acids. Many potential production hosts require a pH significantly higher than the pKa of free fatty acids in order to survive and reproduce. An extreme example of this would be the alkalophilic cyanobacteria such as those belonging to the genera Synechococcus, Synechocystis, Spirulina, and many others, which prefer a pH between 9 and 11 for optimum growth. FIG. 5 outlines an embodiment of the invention wherein this problem is solved by recycling a portion of the culture first through a vessel where it is contacted with concentrated CO₂ gas to lower the pH, then through a stationary adsorbent column wherein the protonated free fatty acids are captured.

[0119] The CO₂-enriched, free fatty acid-depleted suspension is then returned to the bulk culture. The pressure inside the gas-liquid contactor can be controlled independently to provide a constant pH in the stream exiting the adsorption column. Further, the pressure of the post-column flash vessel can be controlled so as to provide a supply of CO₂ which is titrated to the CO₂ consumption rate of the bulk culture through PID control of pH, dissolved CO₂, off-gas CO₂, or any combination of the three. The excess CO₂ can then be recycled.
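The dependence of capture on pH described above follows from the acid-base equilibrium of the fatty acid, which can be sketched with the Henderson-Hasselbalch relation. In the Python example below, the pKa of 4.9 is an assumed, typical value for a medium-chain carboxylic acid in dilute aqueous solution; it is not a figure taken from this application, and the function name is hypothetical.

```python
def fraction_protonated(ph, pka=4.9):
    """Fraction of a monoprotic acid in its neutral (protonated) form.

    Henderson-Hasselbalch: [HA] / ([HA] + [A-]) = 1 / (1 + 10**(pH - pKa)).
    The default pKa of 4.9 is an assumption, not a value from the source.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# pH values used in Table 10-3 plus the pH 11.1 load of the column experiment.
for ph in (2.8, 4.8, 7.5, 10.0, 11.1):
    print(f"pH {ph:>4}: {fraction_protonated(ph):.2e} protonated")
```

Under this assumption essentially none of the fatty acid is protonated at pH 9 to 11, whereas a substantial fraction is protonated once the pH is brought down to between 5 and 6, which is consistent with acidifying the recycle stream ahead of the adsorbent column.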

[0120] In order to demonstrate proof of concept for the invention described above, an experimental system was constructed as displayed in FIG. 5.

[0121] Vessel E-1 was filled with 4 L of a spike solution containing 700 mg/L octanoic acid dissolved in 100 mM NaCl, pH 11.1. Column C1 was filled with 45.2 g of Dowex® Optipore® V503 polymeric resin. The resin was activated with two column volumes of methanol, followed by a wash of three column volumes of 100 mM NaCl, pH 11.1. Liquid-gas contact vessel E-2 was then filled with 200 mL of spike solution and pressurized with CO₂ to 34.7 psia. When the pH of the spike solution inside E-2 had decreased to between 5 and 6 (as determined by a slip of pH paper contained within E-2), peristaltic pumps P-1 and P-2 were set to the same flow rate and column loading was initiated. Valve V-2 was adjusted as needed to increase the column pressure and prevent the formation of gas bubbles.

[0122] Fractions of the flow-through were taken at periodic intervals of 70-100 mL and assayed for octanoic acid by a commercially-available free fatty acid assay purchased from Zen-Bio. Two superficial linear flow rates were evaluated: 16.3 cm/min and 6.1 cm/min. For both flow rates, a control run was performed whereby vessel E-2 was bypassed and the column was loaded directly at a pH of 11.1. Table 12-1 below displays the results of this experiment. For both flow rates, column dynamic binding capacity was approximately 4-fold greater when CO₂ was used to lower the pH of the load.

TABLE 12-1
Dynamic binding capacity (mg/g) with and without CO₂-mediated load acidification

Flow velocity (cm/min) | +34.7 psia CO₂ | Control (pH 11.1), 0 psia CO₂
6.1 | 43.5 | 10.5
16.3 | 7.2 | 1.9
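The "approximately 4-fold" improvement stated for Table 12-1 can be checked directly from the tabulated values. The Python sketch below does so; the dictionary layout and function name are illustrative only.

```python
# Dynamic binding capacity (mg/g) from Table 12-1, keyed by flow velocity (cm/min).
# Each entry is (with 34.7 psia CO2, control at pH 11.1).
dbc_mg_per_g = {
    6.1: (43.5, 10.5),
    16.3: (7.2, 1.9),
}


def fold_improvement(with_co2, control):
    """Ratio of binding capacity with CO2 acidification to the control."""
    if control <= 0:
        raise ValueError("control capacity must be positive")
    return with_co2 / control


folds = {v: fold_improvement(*pair) for v, pair in dbc_mg_per_g.items()}
for velocity, fold in folds.items():
    print(f"{velocity} cm/min: {fold:.2f}-fold")
```

The same table also shows that capacity fell by a factor of about six at the higher flow velocity in both the acidified and control runs, so flow rate and load pH both have a large effect on capture in this system.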

Example 13

Secretion of Oleic Acid by Photosynthetic Microorganisms

[0123] A synthetic gene that encodes a derivative of a FatA-type plant acyl-ACP TE enzyme with specificity for oleoyl-ACP is expressed in various photosynthetic microorganisms by constructing and utilizing expression vectors comprising a FatA gene operably linked to gene regulatory regions (promoters and terminators) that function in the host photosynthetic microorganism. The gene is optimized for expression in the host photosynthetic microorganism and the portion of the gene that encodes the plastid transit peptide region of the native FatA protein is removed for expression in cyanobacteria or replaced with a plastid transit peptide that functions effectively in the host eukaryotic photosynthetic microorganisms.

[0124] Genes that could be used for this purpose include, but are not limited to, those that encode the following acyl-ACP TEs (referred to by GenPept Accession Numbers): NP_189147.1, AAC49002, CAA52070.1, CAA52069.1, 193041.1, CAC39106, CAO17726, AAC72883, AAA33020, AAL79361, AAQ08223.1, AAB51523, AAL77443, AAA33019, AAG35064, and AAL77445.

[0125] The following is a sequence listing of all sequences referred to above. SEQ ID NO:1

Sequence CWU 1

1

461723DNAThalassiosira pseudonana 1atatcgtgga gtatatcaat ggtggggagg tgtggtgtag tagttgcgag caaagatgac 60acttggtaaa ctgatgcgac gtggatactg cgacgaagat tggccgtaca cacgtcggat 120ttgaatgaac atatgtgttt tattcaaacc aatttgacta gtttgaggaa ccttcacgtg 180tttcgctctc aaactttgag acaacagcct ccgaatccaa atgaatgact tttaaacaca 240agctaggagc tggtgatata taatatgctg gttgtatgaa agagactaat cgtgtgaaat 300aaatgatggc tcgccctagt gaatgctcct cagagacgct cattcgtcca agtgttcgtc 360acttctgtca ttgtttcctc cgaggccaag gtggtcgagt aggtagatac cagctattct 420cttgcttctt ttactttatc tccctctacc aaaaacagca cgttattatc tcctttccat 480tccacgcaat aacaagaggc aatcggtaaa gaggcacaaa caagagaaca aagaccccgg 540ctgcttctct cgtccgtccg ccgcccctaa acttcaagtt ttacttcaag ttcaatctgt 600tttttggcgc aaaaagcgcc gttgctccgc cgtcctccgc acttttcagt tctctgtcgt 660cgaggactgt tatcaacttc caagatctcc atctcttctc ctatcctccc ctaacaaagt 720acg 7232700DNAThalassiosira pseudonana 2gttcaatgcc tttggtgttg tcgtcaatag gcacttcgac tttgctcttg gttccgttat 60cccaaacttg aacgagcgcc acggtcctct cggtttcggt ggtatcccag gacctctcgt 120agttgatgca gggttcagaa tcgagataac tcatgttgtc gtttgttgtt ttgttgattt 180taccttgctt ccagctttcg gtctgtaatt acagtgacac gctgtactag aaatgatgta 240cgtttgatgg aatctctaaa attatgagct atttatgaac acaggagttc tcatcaactt 300tccatcgaaa tccgtaggag aattctaatg tcctcttcgg acgagagaca gacgtatcag 360gagtcacttg aaggttccaa gattctatct tcatgaggtc tggatatgac agtcctgcct 420tcgaggcaag ccctgtcact gtgacctttt cgcgtcgtca ataattttag gaacgcaagg 480atagggattc tccatagtaa ggactattgt ttgacccctg aaacttcaac ctttacccca 540agaatggggc attcataagt gaaaaacgtt tgttatgtat gccccaattc ctacacagga 600ataggtattg aatcacgtag aaaatgatcg ttgcgccgca agcaaacaca ccggctctct 660tccgccgcac tctcttccaa tccaacaaac aaacgcaacc 7003700DNAThalassiosira pseudonana 3acgcagatag tgtatatttg cgtcacagtc tcttgtcgtc ataggagagg agaactagag 60aacaaaaagc gtcatgtaat aaatgttgga tgttggcatg tcgtcccagc cagtatccaa 120aacaccgaat tgtcgaggtt cgtgagcttg cagcactcat ggcaacggct aatttcatat 180ctatgttatc aatgttatct gtaacactaa tgctaagtaa 
tgcgtcaaca acttatctcc 240tccggctctt cactccactt cgctgacgtc gtttgatatt ttatctgctc tattattcaa 300gttgaatctg cagttgaggc attctctaac ttagccgaga aatcaagacg gtgactttga 360atttacaagt acagttacgc ttacacaaga tacctttctc acaaaaaaga ttccgttggc 420tcccactgcg cattgctact tggtactatt cccatgtgga actggatttg ggggaaagag 480ggagtctgag tttgtaaatg tacatttgtt attcccttca ttatcgacaa catcactaac 540tcatcgtgca tacagagaaa aacaatctcc actttctcaa caaaagtggc cacaatgtgc 600ctccgacaca gcctcaagag ccgaccgatc gttgcatttt tcactctcga acacacacac 660acacacacac ccacacacca ccacctctct ttatccaacc 7004779DNAPhaeodactylum tricornutum 4agtcggattg aaaacagcga atgtacgcca ttccaaaggc gctcagcaaa aggagacata 60tgcacacatc cagcggaagt aagtacgaca cttgaacaag agcatgacct gtcaaagcat 120gctgccatcg tcgcttcgct tctattccca atgacacttt ggtcaccacg acttgaaaaa 180cggcaatcag caaaataagc gatagaccct gaccaacggc agctttcatc ttttatgaac 240ggcagatatt cgcatcctct tttatcgata cagcaaacac gcagaatttc tgttctcttt 300caagacgaca agcacgaatt tcggtacgct gtcataattt attgactatg ttagataaca 360caactctcat gcgctttgaa aatctgctta cttcacagta aagagacaag ctctttgcac 420tgactgcgac agagatggaa aaaaggaatt ctaccggcaa ttgacagact gatgtgaaaa 480cagagagtaa ccgtaaacaa gtaccggtaa gtatgcgcgc aacctttact tgttccgttg 540gcgtctgtca tttgatgtca cgcagacttg aaaagtcgtt cgctccattg tgaaaaatat 600catgcgacaa cgttcagaaa ggccggcgtg caatcggttt gccttgtttc tgatccgctg 660ctttttgagc aacgacctgc ggaggaccac aatgatcttt ctcttgtcgt gagagctagt 720tctattacct gttcaattac ctgctttctt gtattactcg aagctctcgt tcttctatc 7795807DNAPhaeodactylum tricornutum 5ttttgtaatt cgccactacc tttacgcaag taagaacgtt tcatgctgga gtcgtggacc 60aatcgtaagg tatacgttag tcataccgcg cctgtactat ttacgacacg agagaaagcc 120actgcagttc tgggatggga tcagatgctt gctcctttca ctgcgctggc aaactgtatg 180ctagacacga ctcggatcgg atatcgaaat caaacggcgg agaatgggtt cggatgactg 240tccggagcta cctaggaaaa gcttcttttt cgtttcggac caccaagagg gaagcgctgc 300ctgtactcgt gcgataggaa gcatcagacg tatttgttcg gatgagatca caccagaact 360agccaggcag ccagctagct attgtcatct acagatttcg aaccaaacgt ggatactaga 
420aagcatggga ttgactgtga ctgtgatttg tgttgcacac tttataccta ccctcgacct 480cgtactttgt gtagtagcaa aatgtggatt gtgcgttgaa atgtagaagg gtttggggtt 540gacacgggtt cattcatatc cgggtactcg aaaatgaccg caacgatact catcgatcga 600gatacggtgt acacgtagac tacgtagaaa acctacgagg aagcagatat gattttccgg 660tccgcagcat ccacccagcc aacgtcggca aacaaccaaa caacctcgtc gccccttgtt 720gttcaagatc tgcattccat tgacagcctt ttcaacgaaa ccgttcgctc gtttgattcc 780atacgtcttt gaataccaac agaaaat 8076791DNAPhaeodactylum tricornutum 6aaagtatcaa tagcttattc cagatttttg tgatgttagc ctacttgtaa agcagcggag 60gtctgtcatg acggtgtagt ggctggtttc gctccggaaa ttaagttctg gttttatatc 120tcaacataac tagagataaa gttacaggca cgttactgta agtccgcaga ttgctaatgc 180tttgcttcgg tgtccgtaaa gcttatgtta ctgttctaga ttagagtggt atccacgatt 240ttcaaacgaa agtgacatat tgcgaattgt gcagtatcag aaaatctcca aagcaggagc 300atacattagt ttggccgtat tgcaacgagt agctctcctg aagatgcaag taatagaggc 360tgtgagcgtg aataatgaat ttgcctgttt agaagctggg gatcacatct cgtgctcccc 420aaaagtctct cagtaaatca agaatgttcc tattttcgaa aacattgcta tttatttagt 480taaccggctt cgtcctccca tttaaataaa gattttcaaa aatgacacca ccaacgtccg 540caagatcacg attcgagagg attcttcttt gtcccaacca tggatgacct ctcctattaa 600cacgtatatg aagtaccgct gctggtaccc ggaaaagaga ggacattcct tgtgggagag 660tcatcgatgc gctgccaatc gaaaaaaatg ccaaggcgag aaaagcgcag ttcgttctta 720taatccaatt ttgagtttca agacatactc gttgctacct tcccaccttc ccaaccaaac 780cactcgcaac c 79171093DNAArtificial SequenceSynthetic construct 7ccatggcgaa tggttctgca gtctctttga aatctggaag cttgaatacg caggaggata 60ctagttccag tccccctcct cggacgtttt tgcatcagct gcccgactgg agtcgcttgc 120tgaccgccat cacaacagtg tttgtcaaat ctaaacgacc ggacatgcat gatcggaaaa 180gcaagcgccc agatatgctc gtcgatagtt tcggactcga gtctactgtg caggacggcc 240tggtgttccg tcaatccttc agcatccgaa gctacgagat tggtacggac cgtaccgcta 300gcattgaaac gttgatgaac catctccaag aaaccagttt gaaccactgc aagagcacgg 360gcatcctgct ggatggtttt ggccgcacat tggaaatgtg caagcgagac ttgatctggg 420tggtcattaa aatgcagatc aaagttaatc gatacccggc ctggggagat accgttgaga 
480tcaatacacg cttttcccgt ttgggcaaaa ttggcatggg tcgcgattgg ctgatctccg 540actgcaacac cggtgagatc ttggtccgtg caacgtctgc gtacgcgatg atgaatcaaa 600agacgcgtcg gttgagtaag ctgccgtatg aagttcacca agaaattgtt ccattgttcg 660ttgatagtcc cgttatcgag gattctgacc tcaaagtcca caagtttaaa gtcaagactg 720gcgattccat ccagaagggc ctgacgccag gttggaacga tctggatgtg aaccaacacg 780ttagcaacgt taagtatatc ggctggatct tggaaagtat gcctacggaa gtcctggaga 840cgcaggaact ctgcagtctc gctctggagt accgccgtga gtgtggccgt gattccgtgc 900tcgagtccgt cactgcgatg gaccctagca aagtgggtgt tcgcagtcaa taccaacacc 960tcttgcggct cgaagatggg accgccattg tgaacggcgc gaccgaatgg cgccccaaaa 1020atgccggcgc taacggggca attagtaccg ggaaaacctc caatggaaac agcgtcagct 1080aatgatagga tcc 10938359PRTArtificial SequenceSynthetic construct 8Met Ala Asn Gly Ser Ala Val Ser Leu Lys Ser Gly Ser Leu Asn Thr1 5 10 15Gln Glu Asp Thr Ser Ser Ser Pro Pro Pro Arg Thr Phe Leu His Gln 20 25 30Leu Pro Asp Trp Ser Arg Leu Leu Thr Ala Ile Thr Thr Val Phe Val 35 40 45Lys Ser Lys Arg Pro Asp Met His Asp Arg Lys Ser Lys Arg Pro Asp 50 55 60Met Leu Val Asp Ser Phe Gly Leu Glu Ser Thr Val Gln Asp Gly Leu65 70 75 80Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr Glu Ile Gly Thr Asp 85 90 95Arg Thr Ala Ser Ile Glu Thr Leu Met Asn His Leu Gln Glu Thr Ser 100 105 110Leu Asn His Cys Lys Ser Thr Gly Ile Leu Leu Asp Gly Phe Gly Arg 115 120 125Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp Val Val Ile Lys Met 130 135 140Gln Ile Lys Val Asn Arg Tyr Pro Ala Trp Gly Asp Thr Val Glu Ile145 150 155 160Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile Gly Met Gly Arg Asp Trp 165 170 175Leu Ile Ser Asp Cys Asn Thr Gly Glu Ile Leu Val Arg Ala Thr Ser 180 185 190Ala Tyr Ala Met Met Asn Gln Lys Thr Arg Arg Leu Ser Lys Leu Pro 195 200 205Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe Val Asp Ser Pro Val 210 215 220Ile Glu Asp Ser Asp Leu Lys Val His Lys Phe Lys Val Lys Thr Gly225 230 235 240Asp Ser Ile Gln Lys Gly Leu Thr Pro Gly Trp Asn Asp Leu Asp Val 245 250 255Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly Trp 
Ile Leu Glu Ser 260 265 270Met Pro Thr Glu Val Leu Glu Thr Gln Glu Leu Cys Ser Leu Ala Leu 275 280 285Glu Tyr Arg Arg Glu Cys Gly Arg Asp Ser Val Leu Glu Ser Val Thr 290 295 300Ala Met Asp Pro Ser Lys Val Gly Val Arg Ser Gln Tyr Gln His Leu305 310 315 320Leu Arg Leu Glu Asp Gly Thr Ala Ile Val Asn Gly Ala Thr Glu Trp 325 330 335Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala Ile Ser Thr Gly Lys Thr 340 345 350Ser Asn Gly Asn Ser Val Ser 35597259DNAArtificial SequenceSynthetic construct 9cgccggggct ggcagcttag tcctgcgcaa tctctactac atctgccaac ccagtgaaat 60tttgatcttt gctggcagta gtcgccgcag tagtgatggc cgccgagttg gctatcgctt 120ggtcaagggc ggcagcagcc tgcgggtacc tctgctggaa aaagcgctcc gcatggatct 180gaccaacatg atcattgagt tgcgcgtttc caatgccttc tccaagggcg gcattcccct 240gactgttgaa ggcgttgcca atatcaagat tgctggggaa gaaccgacca tccacaacgc 300gatcgagcgg ctgcttggca aaaaccgtaa ggaaatcgag caaattgcca aggagaccct 360cgaaggcaac ttgcgtggtg ttttagccag cctcacgccg gagcagatca acgaggacaa 420aattgccttt gccaaaagtc tgctggaaga ggcggaggat gaccttgagc agctgggtct 480agtcctcgat acgctgcaag tccagaacat ttccgatgag gtcggttatc tctcggctag 540tggacgcaag cagcgggctg atctgcagcg agatgcccga attgctgaag ccgatgccca 600ggctgcctct gcgatccaaa cggccgaaaa tgacaagatc acggccctgc gtcggatcga 660tcgcgatgta gcgatcgccc aagccgaggc cgagcgccgg attcaggatg cgttgacgcg 720gcgcgaagcg gtggtggccg aagctgaagc ggacattgct accgaagtcg ctcgtagcca 780agcagaactc cctgtgcagc aggagcggat caaacaggtg cagcagcaac ttcaagccga 840tgtgatcgcc ccagctgagg cagcttgtaa acgggcgatc gcggaagcgc ggggggccgc 900cgcccgtatc gtcgaagatg gaaaagctca agcggaaggg acccaacggc tggcggaggc 960ttggcagacc gctggtgcta atgcccgcga catcttcctg ctccagaagc tcgaaattcg 1020agctcggtac catttacgtt gacaccatcg aatggtgcaa aacctttcgc ggtatggcat 1080gatagcgccc ggaagagagt caattcaggg tggtgaatgt gaaaccagta acgttatacg 1140atgtcgcaga gtatgccggt gtctcttatc agaccgtttc ccgcgtggtg aaccaggcca 1200gccacgtttc tgcgaaaacg cgggaaaaag tggaagcggc gatggcggag ctgaattaca 1260ttcccaaccg cgtggcacaa caactggcgg gcaaacagtc gttgctgatt 
ggcgttgcca 1320cctccagtct ggccctgcac gcgccgtcgc aaattgtcgc ggcgattaaa tctcgcgccg 1380atcaactggg tgccagcgtg gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta 1440aagcggcggt gcacaatctt ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc 1500tggatgacca ggatgccatt gctgtggaag ctgcctgcac taatgttccg gcgttatttc 1560ttgatgtctc tgaccagaca cccatcaaca gtattatttt ctcccatgaa gacggtacgc 1620gactgggcgt ggagcatctg gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc 1680cattaagttc tgtctcggcg cgtctgcgtc tggctggctg gcataaatat ctcactcgca 1740atcaaattca gccgatagcg gaacgggaag gcgactggag tgccatgtcc ggttttcaac 1800aaaccatgca aatgctgaat gagggcatcg ttcccactgc gatgctggtt gccaacgatc 1860agatggcgct gggcgcaatg cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata 1920tctcggtagt gggatacgac gataccgaag acagctcatg ttatatcccg ccgttaacca 1980ccatcaaaca ggattttcgc ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct 2040ctcagggcca ggcggtgaag ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa 2100ccaccctggc gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc 2160agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgta 2220agttagcgcg aattgatctg gtttgacagc ttatcatcga ctgcacggtg caccaatgct 2280tctggcgtca ggcagccatc ggaagctgtg gtatggctgt gcaggtcgta aatcactgca 2340taattcgtgt cgctcaaggc gcactcccgt tctggataat gttttttgcg ccgacatcat 2400aacggttctg gcaaatattc tgaaatgagc tgttgacaat taatcatccg gctcgtataa 2460tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagcgccgct gagaaaaagc 2520gaagcggcac tgctctttaa caatttatca gacaatctgt gtgggcactc gaccggaatt 2580atcgattaac tttattatta aaaattaaag aggtatatat taatgtatcg attaaataag 2640gaggaataaa ccatggcgaa tggttctgca gtctctttga aatctggaag cttgaatacg 2700caggaggata ctagttccag tccccctcct cggacgtttt tgcatcagct gcccgactgg 2760agtcgcttgc tgaccgccat cacaacagtg tttgtcaaat ctaaacgacc ggacatgcat 2820gatcggaaaa gcaagcgccc agatatgctc gtcgatagtt tcggactcga gtctactgtg 2880caggacggcc tggtgttccg tcaatccttc agcatccgaa gctacgagat tggtacggac 2940cgtaccgcta gcattgaaac gttgatgaac catctccaag aaaccagttt gaaccactgc 3000aagagcacgg gcatcctgct 
ggatggtttt ggccgcacat tggaaatgtg caagcgagac 3060ttgatctggg tggtcattaa aatgcagatc aaagttaatc gatacccggc ctggggagat 3120accgttgaga tcaatacacg cttttcccgt ttgggcaaaa ttggcatggg tcgcgattgg 3180ctgatctccg actgcaacac cggtgagatc ttggtccgtg caacgtctgc gtacgcgatg 3240atgaatcaaa agacgcgtcg gttgagtaag ctgccgtatg aagttcacca agaaattgtt 3300ccattgttcg ttgatagtcc cgttatcgag gattctgacc tcaaagtcca caagtttaaa 3360gtcaagactg gcgattccat ccagaagggc ctgacgccag gttggaacga tctggatgtg 3420aaccaacacg ttagcaacgt taagtatatc ggctggatct tggaaagtat gcctacggaa 3480gtcctggaga cgcaggaact ctgcagtctc gctctggagt accgccgtga gtgtggccgt 3540gattccgtgc tcgagtccgt cactgcgatg gaccctagca aagtgggtgt tcgcagtcaa 3600taccaacacc tcttgcggct cgaagatggg accgccattg tgaacggcgc gaccgaatgg 3660cgccccaaaa atgccggcgc taacggggca attagtaccg ggaaaacctc caatggaaac 3720agcgtcagct aatgatagga tccgagctcg agatctgcag ctggtaccat atgggaattc 3780gaagcttggc tgttttggcg gatgagagaa gattttcagc ctgatacaga ttaaatcaga 3840acgcagaagc ggtctgataa aacagaattt gcctggcggc agtagcgcgg tggtcccacc 3900tgaccccatg ccgaactcag aagtgaaacg ccgtagcgcc gatggtagtg tggggtctcc 3960ccatgcgaga gtagggaact gccaggcatc aaataaaacg aaaggctcag tcgaaagact 4020gggcctttcg ttttatctgt tgtttgtcgg tgaacgctct cctgagtagg acaaatccgc 4080cgggagcgga tttgaacgtt gcgaagcaac ggcccggagg gtggcgggca ggacgcccgc 4140cataaactgc caggcatcaa attaagcaga aggccatcct gacggatggc ctttttgcgt 4200ttctacaaac tcttttgttt atttttctaa atacattcaa atatgtatcc gctcatgggg 4260atccgactag taggcctcga ggaattcacg cgtacgtaga tctccgcggc cgccgatcct 4320ctagtatgct tgtaaaccgt tttgtgaaaa aatttttaaa ataaaaaagg ggacctctag 4380ggtccccaat taattagtaa tataatctat taaaggtcat tcaaaaggtc atccaccgga 4440tcagcttagt aaagccctcg ctagatttta atgcggatgt tgcgattact tcgccaacta 4500ttgcgataac aagaaaaagc cagcctttca tgatatatct cccaatttgt gtagggctta 4560ttatgcacgc ttaaaaataa taaaagcaga cttgacctga tagtttggct gtgagcaatt 4620atgtgcttag tgcatctaac gcttgagtta agccgcgccg cgaagcggcg tcggcttgaa 4680cgaattgtta gacattattt gccgactacc ttggtgatct cgcctttcac 
gtagtggaca 4740aattcttcca actgatctgc gcgcgaggcc aagcgatctt cttcttgtcc aagataagcc 4800tgtctagctt caagtatgac gggctgatac tgggccggca ggcgctccat tgcccagtcg 4860gcagcgacat ccttcggcgc gattttgccg gttactgcgc tgtaccaaat gcgggacaac 4920gtaagcacta catttcgctc atcgccagcc cagtcgggcg gcgagttcca tagcgttaag 4980gtttcattta gcgcctcaaa tagatcctgt tcaggaaccg gatcaaagag ttcctccgcc 5040gctggaccta ccaaggcaac gctatgttct cttgcttttg tcagcaagat agccagatca 5100atgtcgatcg tggctggctc gaagatacct gcaagaatgt cattgcgctg ccattctcca 5160aattgcagtt cgcgcttagc tggataacgc cacggaatga tgtcgtcgtg cacaacaatg 5220gtgacttcta cagcgcggag aatctcgctc tctccagggg aagccgaagt ttccaaaagg 5280tcgttgatca aagctcgccg cgttgtttca tcaagcctta cggtcaccgt aaccagcaaa 5340tcaatatcac tgtgtggctt caggccgcca tccactgcgg agccgtacaa atgtacggcc 5400agcaacgtcg gttcgagatg gcgctcgatg acgccaacta cctctgatag ttgagtcgat 5460acttcggcga tcaccgcttc cctcatgatg tttaactttg ttttagggcg actgccctgc 5520tgcgtaacat cgttgctgct ccataacatc aaacatcgac ccacggcgta acgcgcttgc 5580tgcttggatg cccgaggcat agactgtacc ccaaaaaaac agtcataaca agccatgaaa 5640accgccactg cgccgttacc accgctgcgt tcggtcaagg ttctggacca gttgcgtgag 5700cgcatacgct acttgcatta cagcttacga accgaacagg cttatgtcca ctgggttcgt 5760gccttcatcc gtttccacgg tgtgcgtcac ccggcaacct tgggcagcag cgaagtcgag 5820gcatttctgt cctggctggc gaacgagcgc aaggtttcgg tctccacgca tcgtcaggca 5880ttggcggcct tgctgttctt ctacggcaag gtgctgtgca cggatctgcc ctggcttcag 5940gagatcggaa gacctcggcc gtcgcggcgc ttgccggtgg tgctgacccc ggatgaagtg 6000gttcgcatcc tcggttttct ggaaggcgag catcgtttgt tcgcccagct tctgtatgga 6060acgggcatgc ggatcagtga gggtttgcaa ctgcgggtca aggatctgga tttcgatcac 6120ggcacgatca tcgtgcggga gggcaagggc tccaaggatc gggccttgat gttacccgag 6180agcttggcac ccagcctgcg cgagcagggg aattgatccg gtggatgacc ttttgaatga 6240cctttaatag attatattac taattaattg gggaccctag aggtcccctt ttttatttta 6300aaaatttttt cacaaaacgg tttacaagca taaagctcta gagtcgacct gcaggcatgc 6360aagcttcgag tccctgctcg tcacgctttc aggcaccgtg ccagatatcg acgtggagtc 6420gatcactgtg attggcgaag 
gggaaggcag cgctacccaa atcgctagct tgctggagaa 6480gctgaaacaa accacgggca ttgatctggc gaaatcccta ccgggtcaat ccgactcgcc 6540cgctgcgaag tcctaagaga tagcgatgtg accgcgatcg cttgtcaaga atcccagtga 6600tcccgaacca taggaaggca agctcaatgc ttgcctcgtc ttgaggacta tctagatgtc 6660tgtggaacgc acatttattg ccatcaagcc cgatggcgtt cagcggggtt tggtcggtac 6720gatcatcggc cgctttgagc aaaaaggctt caaactggtg
ggcctaaagc agctgaagcc 6780cagtcgcgag ctggccgaac agcactatgc tgtccaccgc gagcgcccct tcttcaatgg 6840cctcgtcgag ttcatcacct ctgggccgat cgtggcgatc gtcttggaag gcgaaggcgt 6900tgtggcggct gctcgcaagt tgatcggcgc taccaatccg ctgacggcag aaccgggcac 6960catccgtggt gattttggtg tcaatattgg ccgcaacatc atccatggct cggatgcaat 7020cgaaacagca caacaggaaa ttgctctctg gtttagccca gcagagctaa gtgattggac 7080ccccacgatt caaccctggc tgtacgaata aggtctgcat tccttcagag agacattgcc 7140atgcccgtgc tgcgatcgcc cttccaagct gccttgcccc gctgtttcgg gctggcagcc 7200ctggcgttgg ggctggcgac cgcttgccaa gaaagcagcg ctccgccggc tgccggatc 7259107113DNAArtificial SequenceSynthetic construct 10cgccggggct ggcagcttag tcctgcgcaa tctctactac atctgccaac ccagtgaaat 60tttgatcttt gctggcagta gtcgccgcag tagtgatggc cgccgagttg gctatcgctt 120ggtcaagggc ggcagcagcc tgcgggtacc tctgctggaa aaagcgctcc gcatggatct 180gaccaacatg atcattgagt tgcgcgtttc caatgccttc tccaagggcg gcattcccct 240gactgttgaa ggcgttgcca atatcaagat tgctggggaa gaaccgacca tccacaacgc 300gatcgagcgg ctgcttggca aaaaccgtaa ggaaatcgag caaattgcca aggagaccct 360cgaaggcaac ttgcgtggtg ttttagccag cctcacgccg gagcagatca acgaggacaa 420aattgccttt gccaaaagtc tgctggaaga ggcggaggat gaccttgagc agctgggtct 480agtcctcgat acgctgcaag tccagaacat ttccgatgag gtcggttatc tctcggctag 540tggacgcaag cagcgggctg atctgcagcg agatgcccga attgctgaag ccgatgccca 600ggctgcctct gcgatccaaa cggccgaaaa tgacaagatc acggccctgc gtcggatcga 660tcgcgatgta gcgatcgccc aagccgaggc cgagcgccgg attcaggatg cgttgacgcg 720gcgcgaagcg gtggtggccg aagctgaagc ggacattgct accgaagtcg ctcgtagcca 780agcagaactc cctgtgcagc aggagcggat caaacaggtg cagcagcaac ttcaagccga 840tgtgatcgcc ccagctgagg cagcttgtaa acgggcgatc gcggaagcgc ggggggccgc 900cgcccgtatc gtcgaagatg gaaaagctca agcggaaggg acccaacggc tggcggaggc 960ttggcagacc gctggtgcta atgcccgcga catcttcctg ctccagaagc tcgaaattcg 1020agctcggtac catttacgtt gacaccatcg aatggtgcaa aacctttcgc ggtatggcat 1080gatagcgccc ggaagagagt caattcaggg tggtgaatgt gaaaccagta acgttatacg 1140atgtcgcaga gtatgccggt gtctcttatc agaccgtttc 
ccgcgtggtg aaccaggcca 1200gccacgtttc tgcgaaaacg cgggaaaaag tggaagcggc gatggcggag ctgaattaca 1260ttcccaaccg cgtggcacaa caactggcgg gcaaacagtc gttgctgatt ggcgttgcca 1320cctccagtct ggccctgcac gcgccgtcgc aaattgtcgc ggcgattaaa tctcgcgccg 1380atcaactggg tgccagcgtg gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta 1440aagcggcggt gcacaatctt ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc 1500tggatgacca ggatgccatt gctgtggaag ctgcctgcac taatgttccg gcgttatttc 1560ttgatgtctc tgaccagaca cccatcaaca gtattatttt ctcccatgaa gacggtacgc 1620gactgggcgt ggagcatctg gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc 1680cattaagttc tgtctcggcg cgtctgcgtc tggctggctg gcataaatat ctcactcgca 1740atcaaattca gccgatagcg gaacgggaag gcgactggag tgccatgtcc ggttttcaac 1800aaaccatgca aatgctgaat gagggcatcg ttcccactgc gatgctggtt gccaacgatc 1860agatggcgct gggcgcaatg cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata 1920tctcggtagt gggatacgac gataccgaag acagctcatg ttatatcccg ccgttaacca 1980ccatcaaaca ggattttcgc ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct 2040ctcagggcca ggcggtgaag ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa 2100ccaccctggc gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc 2160agctggcacg acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgta 2220agttagcgcg aattgatctg gtttgacagc ttatcatcga ctgcacggtg caccaatgct 2280tctggcgtca ggcagccatc ggaagctgtg gtatggctgt gcaggtcgta aatcactgca 2340taattcgtgt cgctcaaggc gcactcccgt tctggataat gttttttgcg ccgacatcat 2400aacggttctg gcaaatattc tgaaatgagc tgttgacaat taatcatccg gctcgtataa 2460tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagaccatgg cgaatggttc 2520tgcagtctct ttgaaatctg gaagcttgaa tacgcaggag gatactagtt ccagtccccc 2580tcctcggacg tttttgcatc agctgcccga ctggagtcgc ttgctgaccg ccatcacaac 2640agtgtttgtc aaatctaaac gaccggacat gcatgatcgg aaaagcaagc gcccagatat 2700gctcgtcgat agtttcggac tcgagtctac tgtgcaggac ggcctggtgt tccgtcaatc 2760cttcagcatc cgaagctacg agattggtac ggaccgtacc gctagcattg aaacgttgat 2820gaaccatctc caagaaacca gtttgaacca ctgcaagagc acgggcatcc tgctggatgg 2880ttttggccgc 
acattggaaa tgtgcaagcg agacttgatc tgggtggtca ttaaaatgca 2940gatcaaagtt aatcgatacc cggcctgggg agataccgtt gagatcaata cacgcttttc 3000ccgtttgggc aaaattggca tgggtcgcga ttggctgatc tccgactgca acaccggtga 3060gatcttggtc cgtgcaacgt ctgcgtacgc gatgatgaat caaaagacgc gtcggttgag 3120taagctgccg tatgaagttc accaagaaat tgttccattg ttcgttgata gtcccgttat 3180cgaggattct gacctcaaag tccacaagtt taaagtcaag actggcgatt ccatccagaa 3240gggcctgacg ccaggttgga acgatctgga tgtgaaccaa cacgttagca acgttaagta 3300tatcggctgg atcttggaaa gtatgcctac ggaagtcctg gagacgcagg aactctgcag 3360tctcgctctg gagtaccgcc gtgagtgtgg ccgtgattcc gtgctcgagt ccgtcactgc 3420gatggaccct agcaaagtgg gtgttcgcag tcaataccaa cacctcttgc ggctcgaaga 3480tgggaccgcc attgtgaacg gcgcgaccga atggcgcccc aaaaatgccg gcgctaacgg 3540ggcaattagt accgggaaaa cctccaatgg aaacagcgtc agctaatgat aggatccgag 3600ctcgagatct gcagctggta ccatatggga attcgaagct tggctgtttt ggcggatgag 3660agaagatttt cagcctgata cagattaaat cagaacgcag aagcggtctg ataaaacaga 3720atttgcctgg cggcagtagc gcggtggtcc cacctgaccc catgccgaac tcagaagtga 3780aacgccgtag cgccgatggt agtgtggggt ctccccatgc gagagtaggg aactgccagg 3840catcaaataa aacgaaaggc tcagtcgaaa gactgggcct ttcgttttat ctgttgtttg 3900tcggtgaacg ctctcctgag taggacaaat ccgccgggag cggatttgaa cgttgcgaag 3960caacggcccg gagggtggcg ggcaggacgc ccgccataaa ctgccaggca tcaaattaag 4020cagaaggcca tcctgacgga tggccttttt gcgtttctac aaactctttt gtttattttt 4080ctaaatacat tcaaatatgt atccgctcat ggggatccga ctagtaggcc tcgaggaatt 4140cacgcgtacg tagatctccg cggccgccga tcctctagta tgcttgtaaa ccgttttgtg 4200aaaaaatttt taaaataaaa aaggggacct ctagggtccc caattaatta gtaatataat 4260ctattaaagg tcattcaaaa ggtcatccac cggatcagct tagtaaagcc ctcgctagat 4320tttaatgcgg atgttgcgat tacttcgcca actattgcga taacaagaaa aagccagcct 4380ttcatgatat atctcccaat ttgtgtaggg cttattatgc acgcttaaaa ataataaaag 4440cagacttgac ctgatagttt ggctgtgagc aattatgtgc ttagtgcatc taacgcttga 4500gttaagccgc gccgcgaagc ggcgtcggct tgaacgaatt gttagacatt atttgccgac 4560taccttggtg atctcgcctt tcacgtagtg gacaaattct 
tccaactgat ctgcgcgcga 4620ggccaagcga tcttcttctt gtccaagata agcctgtcta gcttcaagta tgacgggctg 4680atactgggcc ggcaggcgct ccattgccca gtcggcagcg acatccttcg gcgcgatttt 4740gccggttact gcgctgtacc aaatgcggga caacgtaagc actacatttc gctcatcgcc 4800agcccagtcg ggcggcgagt tccatagcgt taaggtttca tttagcgcct caaatagatc 4860ctgttcagga accggatcaa agagttcctc cgccgctgga cctaccaagg caacgctatg 4920ttctcttgct tttgtcagca agatagccag atcaatgtcg atcgtggctg gctcgaagat 4980acctgcaaga atgtcattgc gctgccattc tccaaattgc agttcgcgct tagctggata 5040acgccacgga atgatgtcgt cgtgcacaac aatggtgact tctacagcgc ggagaatctc 5100gctctctcca ggggaagccg aagtttccaa aaggtcgttg atcaaagctc gccgcgttgt 5160ttcatcaagc cttacggtca ccgtaaccag caaatcaata tcactgtgtg gcttcaggcc 5220gccatccact gcggagccgt acaaatgtac ggccagcaac gtcggttcga gatggcgctc 5280gatgacgcca actacctctg atagttgagt cgatacttcg gcgatcaccg cttccctcat 5340gatgtttaac tttgttttag ggcgactgcc ctgctgcgta acatcgttgc tgctccataa 5400catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg gatgcccgag gcatagactg 5460taccccaaaa aaacagtcat aacaagccat gaaaaccgcc actgcgccgt taccaccgct 5520gcgttcggtc aaggttctgg accagttgcg tgagcgcata cgctacttgc attacagctt 5580acgaaccgaa caggcttatg tccactgggt tcgtgccttc atccgtttcc acggtgtgcg 5640tcacccggca accttgggca gcagcgaagt cgaggcattt ctgtcctggc tggcgaacga 5700gcgcaaggtt tcggtctcca cgcatcgtca ggcattggcg gccttgctgt tcttctacgg 5760caaggtgctg tgcacggatc tgccctggct tcaggagatc ggaagacctc ggccgtcgcg 5820gcgcttgccg gtggtgctga ccccggatga agtggttcgc atcctcggtt ttctggaagg 5880cgagcatcgt ttgttcgccc agcttctgta tggaacgggc atgcggatca gtgagggttt 5940gcaactgcgg gtcaaggatc tggatttcga tcacggcacg atcatcgtgc gggagggcaa 6000gggctccaag gatcgggcct tgatgttacc cgagagcttg gcacccagcc tgcgcgagca 6060ggggaattga tccggtggat gaccttttga atgaccttta atagattata ttactaatta 6120attggggacc ctagaggtcc ccttttttat tttaaaaatt ttttcacaaa acggtttaca 6180agcataaagc tctagagtcg acctgcaggc atgcaagctt cgagtccctg ctcgtcacgc 6240tttcaggcac cgtgccagat atcgacgtgg agtcgatcac tgtgattggc gaaggggaag 6300gcagcgctac 
ccaaatcgct agcttgctgg agaagctgaa acaaaccacg ggcattgatc 6360tggcgaaatc cctaccgggt caatccgact cgcccgctgc gaagtcctaa gagatagcga 6420tgtgaccgcg atcgcttgtc aagaatccca gtgatcccga accataggaa ggcaagctca 6480atgcttgcct cgtcttgagg actatctaga tgtctgtgga acgcacattt attgccatca 6540agcccgatgg cgttcagcgg ggtttggtcg gtacgatcat cggccgcttt gagcaaaaag 6600gcttcaaact ggtgggccta aagcagctga agcccagtcg cgagctggcc gaacagcact 6660atgctgtcca ccgcgagcgc cccttcttca atggcctcgt cgagttcatc acctctgggc 6720cgatcgtggc gatcgtcttg gaaggcgaag gcgttgtggc ggctgctcgc aagttgatcg 6780gcgctaccaa tccgctgacg gcagaaccgg gcaccatccg tggtgatttt ggtgtcaata 6840ttggccgcaa catcatccat ggctcggatg caatcgaaac agcacaacag gaaattgctc 6900tctggtttag cccagcagag ctaagtgatt ggacccccac gattcaaccc tggctgtacg 6960aataaggtct gcattccttc agagagacat tgccatgccc gtgctgcgat cgcccttcca 7020agctgccttg ccccgctgtt tcgggctggc agccctggcg ttggggctgg cgaccgcttg 7080ccaagaaagc agcgctccgc cggctgccgg atc 7113117173DNAArtificial SequenceSynthetic construct 11cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat 60cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga 120acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt 180ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt 240ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 300gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa 360gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct 420ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta 480actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg 540gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc 600ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg aagccagtta 660ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg 720gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt 780tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg 840tcatgagatt atcaaaaagg atcttcacct agatcctttt 
aaattaaaaa tgaagtttta 900aatcaatcta aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg 960aggcacctat ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg 1020tgtagataac tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc 1080gagacccacg ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg 1140agcgcagaag tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg 1200aagctagagt aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag 1260gcatcgtggt gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat 1320caaggcgagt tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc 1380cgatcgttgt cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc 1440ataattctct tactgtcatg ccatccgtaa gatgcttttc tgtgactggt gagtactcaa 1500ccaagtcatt ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac 1560gggataatac cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt 1620cggggcgaaa actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc 1680gtgcacccaa ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa 1740caggaaggca aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca 1800tactcttcct ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat 1860acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 1920aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat aaaaataggc 1980gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac ctctgacaca 2040tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc agacaagccc 2100gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat gcggcatcag 2160agcagattgt actgagagtg caccataaaa ttgtaaacgt taatattttg ttaaaattcg 2220cgttaaattt ttgttaaatc agctcatttt ttaaccaata ggccgaaatc ggcaaaatcc 2280cttataaatc aaaagaatag cccgagatag ggttgagtgt tgttccagtt tggaacaaga 2340gtccactatt aaagaacgtg gactccaacg tcaaagggcg aaaaaccgtc tatcagggcg 2400atggcccact acgtgaacca tcacccaaat caagtttttt ggggtcgagg tgccgtaaag 2460cactaaatcg gaaccctaaa gggagccccc gatttagagc ttgacgggga aagccggcga 2520acgtggcgag aaaggaaggg aagaaagcga aaggagcggg cgctagggcg ctggcaagtg 2580tagcggtcac 
gctgcgcgta accaccacac ccgccgcgct taatgcgccg ctacagggcg 2640cgtactatgg ttgctttgac gtatgcggtg tgaaataccg cacagatgcg taaggagaaa 2700ataccgcatc aggcgccatt cgccattcag gctgcgcaac tgttgggaag ggcgatcggt 2760gcgggcctct tcgctattac gccagctggc gaaaggggga tgtgctgcaa ggcgattaag 2820ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa acgacggcca gtgccaagct 2880attgctgaag cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg 2940gggcacattg aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa 3000gagttattgg cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc 3060tccgtggtat tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg 3120gaagtggtta aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat 3180gtcaccgcta cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc 3240acccatcgta aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat 3300tttctcaata acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat 3360ggccaggtat tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa 3420cctctccacg ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca 3480gtggggggag aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc 3540ctggacagtc aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg 3600gccctcagtg cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg 3660ataactttgt gaaatattac tgttgaatta atctatgact attcaataca cccccctagc 3720cgatcgcctg ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc 3780cctcaacacc agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac 3840agccaactat ggctttgatg gttatatggt accatatgca tgcgagctca gatctaccag 3900gttgtccttg gcgcagcgct tcccacgctg agagggtgta gcccgtcacg ggtaaccgat 3960atcgtcgaca ggcctctaga cccgggctcg agctagcaag cttggccgga tccggccgga 4020tccggagttt gtagaaacgc aaaaaggcca tccgtcagga tggccttctg cttaatttga 4080tgcctggcag tttatggcgg gcgtcctgcc cgccaccctc cgggccgttg ctccgcaacg 4140ttcaaatccg ctcccggcgg atttgtccta ctcaggagag cgttcaccga caaacaacag 4200ataaaacgaa aggcccagtc tttcgactga gcctttcgtt ttatttgatg cctggcagtt 4260ccctactctc gcatggggag accccacact accatcggcg 
ctacggcgtt tcacttctga 4320gttcggcatg gggtcaggtg ggaccaccgc gctactgccg ccaggcaaat tctgttttat 4380tgagccgtta ccccacctac tagctaatcc catctgggca catccgatgg caagaggccc 4440gaaggtcccc ctctttggtc ttgcgacgtt atgcggtatt agctaccgtt tccagtagtt 4500atccccctcc atcaggcagt ttcccagaca ttactcaccc gtccgccact cgtcagcaaa 4560gaagcaagct tagatcgacc tgcagggggg ggggggaaag ccacgttgtg tctcaaaatc 4620tctgatgtta cattgcacaa gataaaaata tatcatcatg aacaataaaa ctgtctgctt 4680acataaacag taatacaagg ggtgttatga gccatattca acgggaaacg tcttgctcga 4740ggccgcgatt aaattccaac atggatgctg atttatatgg gtataaatgg gctcgcgata 4800atgtcgggca atcaggtgcg acaatctatc gattgtatgg gaagcccgat gcgccagagt 4860tgtttctgaa acatggcaaa ggtagcgttg ccaatgatgt tacagatgag atggtcagac 4920taaactggct gacggaattt atgcctcttc cgaccatcaa gcattttatc cgtactcctg 4980atgatgcatg gttactcacc actgcgatcc ccgggaaaac agcattccag gtattagaag 5040aatatcctga ttcaggtgaa aatattgttg atgcgctggc agtgttcctg cgccggttgc 5100attcgattcc tgtttgtaat tgtcctttta acagcgatcg cgtatttcgt ctcgctcagg 5160cgcaatcacg aatgaataac ggtttggttg atgcgagtga ttttgatgac gagcgtaatg 5220gctggcctgt tgaacaagtc tggaaagaaa tgcataagct tttgccattc tcaccggatt 5280cagtcgtcac tcatggtgat ttctcacttg ataaccttat ttttgacgag gggaaattaa 5340taggttgtat tgatgttgga cgagtcggaa tcgcagaccg ataccaggat cttgccatcc 5400tatggaactg cctcggtgag ttttctcctt cattacagaa acggcttttt caaaaatatg 5460gtattgataa tcctgatatg aataaattgc agtttcattt gatgctcgat gagtttttct 5520aatcagaatt ggttaattgg ttgtaacact ggcagagcat tacgctgact tgacgggacg 5580gcggctttgt tgaataaatc gaacttttgc tgagttgaag gatcagatca cgcatcttcc 5640cgacaacgca gaccgttccg tggcaaagca aaagttcaaa atcaccaact ggtccaccta 5700caacaaagct ctcatcaacc gtggctccct cactttctgg ctggatgatg gggcgattca 5760ggcctggtat gagtcagcaa caccttcttc acgaggcaga cctcagcgcc cccccccccc 5820tgcaggtcga tctggtaacc ccagcgcggt tgctaccaag tagtgacccg cttcgtgatg 5880caaaatccgc tgacgatatt cgggcgatcg ctgctgaatg ccatcgagca gtaacgtggc 5940gaattcggta ccggtatgga tggcaccgat gcggaatccc aacagattgc ctttgacaac 6000aatgtggcct 
ggaataacct gggggatttg tccaccacca cccaacgggc ctacacttcg 6060gctattagca cagacacagt gcagagtgtt tatggcgtta atctggaaaa aaacgataac 6120attcccattg tttttgcgtg gcccattttt cccaccaccc ttaatcccac agattttcag 6180gtaatgctta acacggggga aattgtcacc ccggtgatcg cctctttgat tcccaacagt 6240gaatacaacg aacggcaaac ggtagtaatt acgggcaatt ttggtaatcg tttaacccca 6300ggcacggagg gagcgattta tcccgtttcc gtaggcacag tgttggacag tactcctttg 6360gaaatggtgg gacccaacgg cccggtcagt gcggtgggta ttaccattga tagtctcaac 6420ccctacgtgg ccggcaatgg tcccaaaatt gtcgccgcta agttagaccg cttcagtgac 6480ctgggggaag gggctcccct ctggttagcc accaatcaaa ataacagtgg cggggattta 6540tatggagacc aagcccaatt tcgtttgcga atttacacca gcgccggttt ttcccccgat 6600ggcattgcca gtttactacc cacagaattt gaacggtatt ttcaactcca agcggaagat 6660attacgggac ggacagttat cctaacccaa actggtgttg attatgaaat tcccggcttt 6720ggtctggtgc aggtgttggg gctggcggat ttggccgggg ttcaggacag ctatgacctg 6780acttacatcg aagatcatga caactattac gacattatcc tcaaagggga cgaagccgca 6840gttcgccaaa ttaagagggt tgctttgccc tccgaagggg attattcggc ggtttataat 6900cccggtggcc ccggcaatga tccagagaat ggtcccccaa attcgtaatc atgtcatagc 6960tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca 7020taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct 7080cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac 7140gcgcggggag aggcggtttg cgtattgggc gct 7173127029DNAArtificial SequenceSynthetic construct 12attgctgaag cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg 60gggcacattg
aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa 120gagttattgg cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc 180tccgtggtat tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg 240gaagtggtta aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat 300gtcaccgcta cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc 360acccatcgta aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat 420tttctcaata acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat 480ggccaggtat tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa 540cctctccacg ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca 600gtggggggag aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc 660ctggacagtc aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg 720gccctcagtg cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg 780ataactttgt gaaatattac tgttgaatta atctatgact attcaataca cccccctagc 840cgatcgcctg ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc 900cctcaacacc agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac 960agccaactat ggctttgatg gttatatggt accatatggt gcactctcag tacaatctgc 1020tctgatgccg catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 1080ctgcgccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 1140catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 1200cgtcatcacc gaaacgcgcg aggcagcaga tcaattcgcg cgcgaaggcg aagcggcatg 1260catttacgtt gacaccatcg aatggtgcaa aacctttcgc ggtatggcat gatagcgccc 1320ggaagagagt caattcaggg tggtgaatgt gaaaccagta acgttatacg atgtcgcaga 1380gtatgccggt gtctcttatc agaccgtttc ccgcgtggtg aaccaggcca gccacgtttc 1440tgcgaaaacg cgggaaaaag tggaagcggc gatggcggag ctgaattaca ttcccaaccg 1500cgtggcacaa caactggcgg gcaaacagtc gttgctgatt ggcgttgcca cctccagtct 1560ggccctgcac gcgccgtcgc aaattgtcgc ggcgattaaa tctcgcgccg atcaactggg 1620tgccagcgtg gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta aagcggcggt 1680gcacaatctt ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc tggatgacca 1740ggatgccatt gctgtggaag ctgcctgcac taatgttccg gcgttatttc ttgatgtctc 
1800tgaccagaca cccatcaaca gtattatttt ctcccatgaa gacggtacgc gactgggcgt 1860ggagcatctg gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc cattaagttc 1920tgtctcggcg cgtctgcgtc tggctggctg gcataaatat ctcactcgca atcaaattca 1980gccgatagcg gaacgggaag gcgactggag tgccatgtcc ggttttcaac aaaccatgca 2040aatgctgaat gagggcatcg ttcccactgc gatgctggtt gccaacgatc agatggcgct 2100gggcgcaatg cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata tctcggtagt 2160gggatacgac gataccgaag acagctcatg ttatatcccg ccgttaacca ccatcaaaca 2220ggattttcgc ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct ctcagggcca 2280ggcggtgaag ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa ccaccctggc 2340gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 2400acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgta agttagcgcg 2460aattgatctg gtttgacagc ttatcatcga ctgcacggtg caccaatgct tctggcgtca 2520ggcagccatc ggaagctgtg gtatggctgt gcaggtcgta aatcactgca taattcgtgt 2580cgctcaaggc gcactcccgt tctggataat gttttttgcg ccgacatcat aacggttctg 2640gcaaatattc tgaaatgagc tgttgacaat taatcatccg gctcgtataa tgtgtggaat 2700tgtgagcgga taacaatttc acacaggaaa cagcgccgct gagaaaaagc gaagcggcac 2760tgctctttaa caatttatca gacaatctgt gtgggcactc gaccggaatt atcgattaac 2820tttattatta aaaattaaag aggtatatat taatgtatcg attaaataag gaggaataaa 2880ccatggcgaa tggttctgca gtctctttga aatctggaag cttgaatacg caggaggata 2940ctagttccag tccccctcct cggacgtttt tgcatcagct gcccgactgg agtcgcttgc 3000tgaccgccat cacaacagtg tttgtcaaat ctaaacgacc ggacatgcat gatcggaaaa 3060gcaagcgccc agatatgctc gtcgatagtt tcggactcga gtctactgtg caggacggcc 3120tggtgttccg tcaatccttc agcatccgaa gctacgagat tggtacggac cgtaccgcta 3180gcattgaaac gttgatgaac catctccaag aaaccagttt gaaccactgc aagagcacgg 3240gcatcctgct ggatggtttt ggccgcacat tggaaatgtg caagcgagac ttgatctggg 3300tggtcattaa aatgcagatc aaagttaatc gatacccggc ctggggagat accgttgaga 3360tcaatacacg cttttcccgt ttgggcaaaa ttggcatggg tcgcgattgg ctgatctccg 3420actgcaacac cggtgagatc ttggtccgtg caacgtctgc gtacgcgatg atgaatcaaa 3480agacgcgtcg gttgagtaag ctgccgtatg 
aagttcacca agaaattgtt ccattgttcg 3540ttgatagtcc cgttatcgag gattctgacc tcaaagtcca caagtttaaa gtcaagactg 3600gcgattccat ccagaagggc ctgacgccag gttggaacga tctggatgtg aaccaacacg 3660ttagcaacgt taagtatatc ggctggatct tggaaagtat gcctacggaa gtcctggaga 3720cgcaggaact ctgcagtctc gctctggagt accgccgtga gtgtggccgt gattccgtgc 3780tcgagtccgt cactgcgatg gaccctagca aagtgggtgt tcgcagtcaa taccaacacc 3840tcttgcggct cgaagatggg accgccattg tgaacggcgc gaccgaatgg cgccccaaaa 3900atgccggcgc taacggggca attagtaccg ggaaaacctc caatggaaac agcgtcagct 3960aatgatagga tccgagctca gatctaccag gttgtccttg gcgcagcgct tcccacgctg 4020agagggtgta gcccgtcacg ggtaaccgat atcgtcgaca ggcctctaga cccgggctcg 4080agctagcaag cttggccgga tccggccgga tccggagttt gtagaaacgc aaaaaggcca 4140tccgtcagga tggccttctg cttaatttga tgcctggcag tttatggcgg gcgtcctgcc 4200cgccaccctc cgggccgttg cttcgcaacg ttcaaatccg ctcccggcgg atttgtccta 4260ctcaggagag cgttcaccga caaacaacag ataaaacgaa aggcccagtc tttcgactga 4320gcctttcgtt ttatttgatg cctggcagtt ccctactctc gcatggggag accccacact 4380accatcggcg ctacggcgtt tcacttctga gttcggcatg gggtcaggtg ggaccaccgc 4440gctactgccg ccaggcaaat tctgttttat tgagccgtta ccccacctac tagctaatcc 4500catctgggca catccgatgg caagaggccc gaaggtcccc ctctttggtc ttgcgacgtt 4560atgcggtatt agctaccgtt tccagtagtt atccccctcc atcaggcagt ttcccagaca 4620ttactcaccc gtccgccact cgtcagcaaa gaagcaagct tagatcgacc tgcagggggg 4680ggggggaaag ccacgttgtg tctcaaaatc tctgatgtta cattgcacaa gataaaaata 4740tatcatcatg aacaataaaa ctgtctgctt acataaacag taatacaagg ggtgttatga 4800gccatattca acgggaaacg tcttgctcga ggccgcgatt aaattccaac atggatgctg 4860atttatatgg gtataaatgg gctcgcgata atgtcgggca atcaggtgcg acaatctatc 4920gattgtatgg gaagcccgat gcgccagagt tgtttctgaa acatggcaaa ggtagcgttg 4980ccaatgatgt tacagatgag atggtcagac taaactggct gacggaattt atgcctcttc 5040cgaccatcaa gcattttatc cgtactcctg atgatgcatg gttactcacc actgcgatcc 5100ccgggaaaac agcattccag gtattagaag aatatcctga ttcaggtgaa aatattgttg 5160atgcgctggc agtgttcctg cgccggttgc attcgattcc tgtttgtaat tgtcctttta 
5220acagcgatcg cgtatttcgt ctcgctcagg cgcaatcacg aatgaataac ggtttggttg 5280atgcgagtga ttttgatgac gagcgtaatg gctggcctgt tgaacaagtc tggaaagaaa 5340tgcataagct tttgccattc tcaccggatt cagtcgtcac tcatggtgat ttctcacttg 5400ataaccttat ttttgacgag gggaaattaa taggttgtat tgatgttgga cgagtcggaa 5460tcgcagaccg ataccaggat cttgccatcc tatggaactg cctcggtgag ttttctcctt 5520cattacagaa acggcttttt caaaaatatg gtattgataa tcctgatatg aataaattgc 5580agtttcattt gatgctcgat gagtttttct aatcagaatt ggttaattgg ttgtaacact 5640ggcagagcat tacgctgact tgacgggacg gcggctttgt tgaataaatc gaacttttgc 5700tgagttgaag gatcagatca cgcatcttcc cgacaacgca gaccgttccg tggcaaagca 5760aaagttcaaa atcaccaact ggtccaccta caacaaagct ctcatcaacc gtggctccct 5820cactttctgg ctggatgatg gggcgattca ggcctggtat gagtcagcaa caccttcttc 5880acgaggcaga cctcagcgcc cccccccccc tgcaggtcga tctggtaacc ccagcgcggt 5940tgctaccaag tagtgacccg cttcgtgatg caaaatccgc tgacgatatt cgggcgatcg 6000ctgctgaatg ccatcgagca gtaacgtggc gaattcggta ccggtatgga tggcaccgat 6060gcggaatccc aacagattgc ctttgacaac aatgtggcct ggaataacct gggggatttg 6120tccaccacca cccaacgggc ctacacttcg gctattagca cagacacagt gcagagtgtt 6180tatggcgtta atctggaaaa aaacgataac attcccattg tttttgcgtg gcccattttt 6240cccaccaccc ttaatcccac agattttcag gtaatgctta acacggggga aattgtcacc 6300ccggtgatcg cctctttgat tcccaacagt gaatacaacg aacggcaaac ggtagtaatt 6360acgggcaatt ttggtaatcg tttaacccca ggcacggagg gagcgattta tcccgtttcc 6420gtaggcacag tgttggacag tactcctttg gaaatggtgg gacccaacgg cccggtcagt 6480gcggtgggta ttaccattga tagtctcaac ccctacgtgg ccggcaatgg tcccaaaatt 6540gtcgccgcta agttagaccg cttcagtgac ctgggggaag gggctcccct ctggttagcc 6600accaatcaaa ataacagtgg cggggattta tatggagacc aagcccaatt tcgtttgcga 6660atttacacca gcgccggttt ttcccccgat ggcattgcca gtttactacc cacagaattt 6720gaacggtatt ttcaactcca agcggaagat attacgggac ggacagttat cctaacccaa 6780actggtgttg attatgaaat tcccggcttt ggtctggtgc aggtgttggg gctggcggat 6840ttggccgggg ttcaggacag ctatgacctg acttacatcg aagatcatga caactattac 6900gacattatcc tcaaagggga cgaagccgca 
gttcgccaaa ttaagagggt tgctttgccc 6960tccgaagggg attattcggc ggtttataat cccggtggcc ccggcaatga tccagagaat 7020ggtccccca 7029136883DNAArtificial SequenceSynthetic construct 13attgctgaag cggaatccct ggttaatgcc gccgccgatg ccaattgcat tctccaagtg 60gggcacattg aacgcttcaa cccggcattt ttagagctaa ccaaaattct caaaacggaa 120gagttattgg cgatcgaagc ccatcgcatg agtccctatt cccagcgggc caatgatgtc 180tccgtggtat tggatttgat gatccatgac attgacctgt tgctggaatt ggtgggttcg 240gaagtggtta aactgtccgc cagtggcagt cgggcttctg ggtcaggata tttggattat 300gtcaccgcta cgttaggctt ctcctccggc attgtggcca ccctcaccgc cagtaaggtc 360acccatcgta aaattcgttc catcgccgcc cactgcaaaa attccctcac cgaagcggat 420tttctcaata acgaaatttt gatccatcgc caaaccaccg ctgattggag cgcggactat 480ggccaggtat tgtatcgcca ggatggtcta atcgaaaagg tttacaccag taatattgaa 540cctctccacg ctgaattaga acattttatt cattgtgtta ggggaggtga tcaaccctca 600gtggggggag aacaggccct caaggccctg aagttagcca gtttaattga agaaatggcc 660ctggacagtc aggaatggca tgggggggaa gttgtgacag aatatcaaga tgccaccctg 720gccctcagtg cgagtgttta aatcaactta attaatgcaa ttattgcgag ttcaaactcg 780ataactttgt gaaatattac tgttgaatta atctatgact attcaataca cccccctagc 840cgatcgcctg ttggcctacc tcgccgccga tcgcctaaat ctcagcgcca agagtagttc 900cctcaacacc agtattctgc tcagcagtga cctattcaat caggaagggg gaattgtaac 960agccaactat ggctttgatg gttatatggt accatatggt gcactctcag tacaatctgc 1020tctgatgccg catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 1080ctgcgccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 1140catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 1200cgtcatcacc gaaacgcgcg aggcagcaga tcaattcgcg cgcgaaggcg aagcggcatg 1260catttacgtt gacaccatcg aatggtgcaa aacctttcgc ggtatggcat gatagcgccc 1320ggaagagagt caattcaggg tggtgaatgt gaaaccagta acgttatacg atgtcgcaga 1380gtatgccggt gtctcttatc agaccgtttc ccgcgtggtg aaccaggcca gccacgtttc 1440tgcgaaaacg cgggaaaaag tggaagcggc gatggcggag ctgaattaca ttcccaaccg 1500cgtggcacaa caactggcgg gcaaacagtc gttgctgatt ggcgttgcca cctccagtct 1560ggccctgcac gcgccgtcgc 
aaattgtcgc ggcgattaaa tctcgcgccg atcaactggg 1620tgccagcgtg gtggtgtcga tggtagaacg aagcggcgtc gaagcctgta aagcggcggt 1680gcacaatctt ctcgcgcaac gcgtcagtgg gctgatcatt aactatccgc tggatgacca 1740ggatgccatt gctgtggaag ctgcctgcac taatgttccg gcgttatttc ttgatgtctc 1800tgaccagaca cccatcaaca gtattatttt ctcccatgaa gacggtacgc gactgggcgt 1860ggagcatctg gtcgcattgg gtcaccagca aatcgcgctg ttagcgggcc cattaagttc 1920tgtctcggcg cgtctgcgtc tggctggctg gcataaatat ctcactcgca atcaaattca 1980gccgatagcg gaacgggaag gcgactggag tgccatgtcc ggttttcaac aaaccatgca 2040aatgctgaat gagggcatcg ttcccactgc gatgctggtt gccaacgatc agatggcgct 2100gggcgcaatg cgcgccatta ccgagtccgg gctgcgcgtt ggtgcggata tctcggtagt 2160gggatacgac gataccgaag acagctcatg ttatatcccg ccgttaacca ccatcaaaca 2220ggattttcgc ctgctggggc aaaccagcgt ggaccgcttg ctgcaactct ctcagggcca 2280ggcggtgaag ggcaatcagc tgttgcccgt ctcactggtg aaaagaaaaa ccaccctggc 2340gcccaatacg caaaccgcct ctccccgcgc gttggccgat tcattaatgc agctggcacg 2400acaggtttcc cgactggaaa gcgggcagtg agcgcaacgc aattaatgta agttagcgcg 2460aattgatctg gtttgacagc ttatcatcga ctgcacggtg caccaatgct tctggcgtca 2520ggcagccatc ggaagctgtg gtatggctgt gcaggtcgta aatcactgca taattcgtgt 2580cgctcaaggc gcactcccgt tctggataat gttttttgcg ccgacatcat aacggttctg 2640gcaaatattc tgaaatgagc tgttgacaat taatcatccg gctcgtataa tgtgtggaat 2700tgtgagcgga taacaatttc acacaggaaa cagaccatgg cgaatggttc tgcagtctct 2760ttgaaatctg gaagcttgaa tacgcaggag gatactagtt ccagtccccc tcctcggacg 2820tttttgcatc agctgcccga ctggagtcgc ttgctgaccg ccatcacaac agtgtttgtc 2880aaatctaaac gaccggacat gcatgatcgg aaaagcaagc gcccagatat gctcgtcgat 2940agtttcggac tcgagtctac tgtgcaggac ggcctggtgt tccgtcaatc cttcagcatc 3000cgaagctacg agattggtac ggaccgtacc gctagcattg aaacgttgat gaaccatctc 3060caagaaacca gtttgaacca ctgcaagagc acgggcatcc tgctggatgg ttttggccgc 3120acattggaaa tgtgcaagcg agacttgatc tgggtggtca ttaaaatgca gatcaaagtt 3180aatcgatacc cggcctgggg agataccgtt gagatcaata cacgcttttc ccgtttgggc 3240aaaattggca tgggtcgcga ttggctgatc tccgactgca acaccggtga 
gatcttggtc 3300cgtgcaacgt ctgcgtacgc gatgatgaat caaaagacgc gtcggttgag taagctgccg 3360tatgaagttc accaagaaat tgttccattg ttcgttgata gtcccgttat cgaggattct 3420gacctcaaag tccacaagtt taaagtcaag actggcgatt ccatccagaa gggcctgacg 3480ccaggttgga acgatctgga tgtgaaccaa cacgttagca acgttaagta tatcggctgg 3540atcttggaaa gtatgcctac ggaagtcctg gagacgcagg aactctgcag tctcgctctg 3600gagtaccgcc gtgagtgtgg ccgtgattcc gtgctcgagt ccgtcactgc gatggaccct 3660agcaaagtgg gtgttcgcag tcaataccaa cacctcttgc ggctcgaaga tgggaccgcc 3720attgtgaacg gcgcgaccga atggcgcccc aaaaatgccg gcgctaacgg ggcaattagt 3780accgggaaaa cctccaatgg aaacagcgtc agctaatgat aggatccgag ctcagatcta 3840ccaggttgtc cttggcgcag cgcttcccac gctgagaggg tgtagcccgt cacgggtaac 3900cgatatcgtc gacaggcctc tagacccggg ctcgagctag caagcttggc cggatccggc 3960cggatccgga gtttgtagaa acgcaaaaag gccatccgtc aggatggcct tctgcttaat 4020ttgatgcctg gcagtttatg gcgggcgtcc tgcccgccac cctccgggcc gttgcttcgc 4080aacgttcaaa tccgctcccg gcggatttgt cctactcagg agagcgttca ccgacaaaca 4140acagataaaa cgaaaggccc agtctttcga ctgagccttt cgttttattt gatgcctggc 4200agttccctac tctcgcatgg ggagacccca cactaccatc ggcgctacgg cgtttcactt 4260ctgagttcgg catggggtca ggtgggacca ccgcgctact gccgccaggc aaattctgtt 4320ttattgagcc gttaccccac ctactagcta atcccatctg ggcacatccg atggcaagag 4380gcccgaaggt ccccctcttt ggtcttgcga cgttatgcgg tattagctac cgtttccagt 4440agttatcccc ctccatcagg cagtttccca gacattactc acccgtccgc cactcgtcag 4500caaagaagca agcttagatc gacctgcagg gggggggggg aaagccacgt tgtgtctcaa 4560aatctctgat gttacattgc acaagataaa aatatatcat catgaacaat aaaactgtct 4620gcttacataa acagtaatac aaggggtgtt atgagccata ttcaacggga aacgtcttgc 4680tcgaggccgc gattaaattc caacatggat gctgatttat atgggtataa atgggctcgc 4740gataatgtcg ggcaatcagg tgcgacaatc tatcgattgt atgggaagcc cgatgcgcca 4800gagttgtttc tgaaacatgg caaaggtagc gttgccaatg atgttacaga tgagatggtc 4860agactaaact ggctgacgga atttatgcct cttccgacca tcaagcattt tatccgtact 4920cctgatgatg catggttact caccactgcg atccccggga aaacagcatt ccaggtatta 4980gaagaatatc ctgattcagg 
tgaaaatatt gttgatgcgc tggcagtgtt cctgcgccgg 5040ttgcattcga ttcctgtttg taattgtcct tttaacagcg atcgcgtatt tcgtctcgct 5100caggcgcaat cacgaatgaa taacggtttg gttgatgcga gtgattttga tgacgagcgt 5160aatggctggc ctgttgaaca agtctggaaa gaaatgcata agcttttgcc attctcaccg 5220gattcagtcg tcactcatgg tgatttctca cttgataacc ttatttttga cgaggggaaa 5280ttaataggtt gtattgatgt tggacgagtc ggaatcgcag accgatacca ggatcttgcc 5340atcctatgga actgcctcgg tgagttttct ccttcattac agaaacggct ttttcaaaaa 5400tatggtattg ataatcctga tatgaataaa ttgcagtttc atttgatgct cgatgagttt 5460ttctaatcag aattggttaa ttggttgtaa cactggcaga gcattacgct gacttgacgg 5520gacggcggct ttgttgaata aatcgaactt ttgctgagtt gaaggatcag atcacgcatc 5580ttcccgacaa cgcagaccgt tccgtggcaa agcaaaagtt caaaatcacc aactggtcca 5640cctacaacaa agctctcatc aaccgtggct ccctcacttt ctggctggat gatggggcga 5700ttcaggcctg gtatgagtca gcaacacctt cttcacgagg cagacctcag cgcccccccc 5760cccctgcagg tcgatctggt aaccccagcg cggttgctac caagtagtga cccgcttcgt 5820gatgcaaaat ccgctgacga tattcgggcg atcgctgctg aatgccatcg agcagtaacg 5880tggcgaattc ggtaccggta tggatggcac cgatgcggaa tcccaacaga ttgcctttga 5940caacaatgtg gcctggaata acctggggga tttgtccacc accacccaac gggcctacac 6000ttcggctatt agcacagaca cagtgcagag tgtttatggc gttaatctgg aaaaaaacga 6060taacattccc attgtttttg cgtggcccat ttttcccacc acccttaatc ccacagattt 6120tcaggtaatg cttaacacgg gggaaattgt caccccggtg atcgcctctt tgattcccaa 6180cagtgaatac aacgaacggc aaacggtagt aattacgggc aattttggta atcgtttaac 6240cccaggcacg gagggagcga tttatcccgt ttccgtaggc acagtgttgg acagtactcc 6300tttggaaatg gtgggaccca acggcccggt cagtgcggtg ggtattacca ttgatagtct 6360caacccctac gtggccggca atggtcccaa aattgtcgcc gctaagttag accgcttcag 6420tgacctgggg gaaggggctc ccctctggtt agccaccaat caaaataaca gtggcgggga 6480tttatatgga gaccaagccc aatttcgttt gcgaatttac accagcgccg gtttttcccc 6540cgatggcatt gccagtttac tacccacaga atttgaacgg tattttcaac tccaagcgga 6600agatattacg ggacggacag ttatcctaac ccaaactggt gttgattatg aaattcccgg 6660ctttggtctg gtgcaggtgt tggggctggc ggatttggcc ggggttcagg 
acagctatga 6720cctgacttac atcgaagatc atgacaacta ttacgacatt atcctcaaag gggacgaagc 6780cgcagttcgc caaattaaga gggttgcttt gccctccgaa ggggattatt cggcggttta 6840taatcccggt ggccccggca atgatccaga gaatggtccc cca 68831420DNAArtificial SequenceSynthetic construct 14accctggccc tcagtgcgag 201521DNAArtificial SequenceSynthetic construct 15tgcttctttg ctgacgagtg g 211619DNAArtificial SequencePrimer 16gtgactggaa ccgccctcg 191744DNAArtificial SequencePrimer 17ccatcgagca gtaacgtggc cgatagtgac gctaaaccag gctg 441840DNAArtificial SequencePrimer 18cgagtggcgg acgggtgagt ctacgagggc gtgcagaagc 401921DNAArtificial SequencePrimer 19caccaagttg ccttcaccga c 212044DNAArtificial SequencePrimer 20cagcctggtt tagcgtcact atcggccacg ttactgctcg atgg 442140DNAArtificial SequencePrimer 21gcttctgcac gccctcgtag actcacccgt ccgccactcg 40222840DNAArtificial SequenceSynthetic construct 22gtgactggaa ccgccctcgc gcaaccccgc gccattacgc cccacgaaca gcagcttttg 60gccaaactga aaagctatcg cgatatccaa agcttgtcgc aaatttgggg acgtgctgcc 120agtcaatttg gatcgatgcc ggctttggtt gcaccccatg ccaaaccagc gatcaccctc 180agttatcaag aattggcgat tcagatccaa gcgtttgcag ccggactgct cgcgctggga 240gtgcctacct ccacagccga tgactttccg cctcgcttgg
cgcagtttgc ggataacagc 300ccccgctggt tgattgctga ccaaggcacg ttgctggcag gggctgccaa tgcggtgcgc 360ggcgcccaag ctgaagtatc ggagctgctc tacgtcttag aggacagcgg ttcgatcggc 420ttgattgtcg aagacgcggc gctgctgaag aaactacagc ctggtttagc gtcactatcg 480gccacgttac tgctcgatgg cattcagcag cgatcgcccg aatatcgtca gcggattttg 540catcacgaag cgggtcacta cttggtagca accgcgctgg ggttaccaga tccgtcgatc 600atatcgtcaa ttattacctc cacggggaga gcctgagcaa actggcctca ggcatttgag 660aagcacacgg tcacactgct tccggtagtc aataaaccgg taaaccagca atagacataa 720gcggctattt aacgaccctg ccctgaaccg acgaccgggt cgaatttgct ttcgaatttc 780tgccattcat ccgcttatta tcacttattc aggcgtagca ccaggcgttt aagggcacca 840ataactgcct taaaaaaatt acgccccgcc ctgccactca tcgcagtact gttgtaattc 900attaagcatt ctgccgacat ggaagccatc acaaacggca tgatgaacct gaatcgccag 960cggcatcagc accttgtcgc cttgcgtata atatttgccc atggtgaaaa cgggggcgaa 1020gaagttgtcc atattggcca cgtttaaatc aaaactggtg aaactcaccc agggattggc 1080tgagacgaaa aacatattct caataaaccc tttagggaaa taggccaggt tttcaccgta 1140acacgccaca tcttgcgaat atatgtgtag aaactgccgg aaatcgtcgt ggtattcact 1200ccagagcgat gaaaacgttt cagtttgctc atggaaaacg gtgtaacaag ggtgaacact 1260atcccatatc accagctcac cgtctttcat tgccatacgg aattccggat gagcattcat 1320caggcgggca agaatgtgaa taaaggccgg ataaaacttg tgcttatttt tctttacggt 1380ctttaaaaag gccgtaatat ccagctgaac ggtctggtta taggtacatt gagcaactga 1440ctgaaatgcc tcaaaatgtt ctttacgatg ccattgggat atatcaacgg tggtatatcc 1500agtgattttt ttctccattt tagcttcctt agctcctgaa aatctcgata actcaaaaaa 1560tacgcccggt agtgatctta tttcattatg gtgaaagttg gaacctctta cgtgccgatc 1620aacgtctcat tttcgccaaa agttggccca gggcttcccg gtatcaacag ggacaccagg 1680atttatttat tctgcgaagt gatcttccgt cacaggtatt tattcgaaga cgaaagggcc 1740tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct tagacgtcag 1800gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt 1860caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa 1920ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt 1980gccttcctgt ttttgctcac 
ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt 2040tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt 2100ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg 2160tattatcccg tgtgacggat ctaagcttgc ttctttgctg acgagtggcg gacgggtgag 2220tctacgaggg cgtgcagaag cagtttcgcg agcaaccggc gaagaaacgt cgcttgatcg 2280ataccttctt tggcttgagt caacgctatg ttttggcacg gcgccgctgg caaggactgg 2340atttgctggc actgaaccaa tccccagccc agcgcctcgc tgagggtgtc cggatgttgg 2400cgctagcacc gttgcataag ctgggcgatc gcctcgtcta cggcaaagta cgagaagcca 2460cgggtggccg aattcggcag gtgatcagtg gcggtggctc actggcactg cacctcgata 2520ccttcttcga aattgttggt gttgatttgc tggtgggtta tggcttgaca gaaacctcac 2580cagtgctgac ggggcgacgg ccttggcaca acctacgggg ttcggccggt cagccgattc 2640caggtacggc gattcggatc gtcgatcctg aaacgaagga aaaccgaccc agtggcgatc 2700gcggcttggt gctggcgaaa gggccgcaaa tcatgcaggg ctacttcaat aaacccgagg 2760cgaccgcgaa agcgatcgat gccgaaggtt ggtttgacac cggcgactta ggctacatcg 2820tcggtgaagg caacttggtg 28402325DNAArtificial SequencePrimer 23ctcgagcccc cgtgctatga ctagc 252428DNAArtificial SequencePrimer 24ctcgagcccg gaacgttttt tgtacccc 282529DNAArtificial SequencePrimer 25caattggtca cacgggataa taccgcgcc 292636DNAArtificial SequencePrimer 26caattggtcg atcatatcgt caattattac ctccac 36277224DNAArtificial SequenceSynthetic construct 27cccccgtgct atgactagcg gcgatcgcca taccggccac gaccatttgc attggatccc 60caacggcggc cacaacttcc atggcattga gatgcgggga atgatgttct agactctgac 120gcaccaaagc caatttttgt tgatggttgc aatggggatg actactgttc actttgcccc 180cagcgtcaat gcctagacct agcagtaccc ccagggctgt ggtagtgccc cccaccacgc 240attcgcttag cactaagtaa ctttcggcat gttcctgggc taactgtgcg ccccactgca 300aaccctgctg aaaaagatgc tccaccaggg ccaacggtaa cgcttgccct gtggaaagac 360agcgggcggg ttgtccgtct agattgatga ctggcaccgc tgggggaatg ggtaaaccag 420agttaaataa ataaaccgga gtatggaggg catccaccaa cgctttggtg atgaacactg 480gggaaacccc agaaatgagg ggaggtaagg gataggttgc ccctgccgta gttcccttga 540ttaaaaattc cgcatcggcg atcgccgtca attttcgatc agcgggggtt ttacccgccg 
600cagaaatgcc cggaattaaa ccagtttccg taaagcccaa cacacagaca aacaccggtg 660gacagtggcc atggcgctca atccaggata aagcttggtc agactgggta taaactgtca 720acatatttct gcaagagtgg gcccaattgg gaaaatcaac ctcaaatcca ttggaatagc 780cttttttcaa ccgtaaaaat ccaactttct ctcttccctt cttccttcca tctgattatg 840gttacgccaa ttaactacca ttccatccat tgcctggcgg atatctgggc tatcaccgga 900gaaaattttg ccgatattgt ggccctcaac gatcgccata gtcatccccc cgtaacttta 960acctatgccc aattggtcac acgggataat accgcgccac atagcagaac tttaaaagtg 1020ctcatcattg gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga 1080tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc 1140agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg 1200acacggaaat gttgaatact catactcttc ctttttcaat attattgaag catttatcag 1260ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg 1320gttccgcgca catttccccg aaaagtgcca cctgacgtct aagaaaccat tattatcatg 1380acattaacct ataaaaatag gcgtatcacg aggccctttc gtcttcgaat aaatacctgt 1440gacggaagat cacttcgcag aataaataaa tcctggtgtc cctgttgata ccgggaagcc 1500ctgggccaac ttttggcgaa aatgagacgt tgatcggcac gtaagaggtt ccaactttca 1560ccataatgaa ataagatcac taccgggcgt attttttgag ttatcgagat tttcaggagc 1620taaggaagct aaaatggaga aaaaaatcac tggatatacc accgttgata tatcccaatg 1680gcatcgtaaa gaacattttg aggcatttca gtcagttgct caatgtacct ataaccagac 1740cgttcagctg gatattacgg cctttttaaa gaccgtaaag aaaaataagc acaagtttta 1800tccggccttt attcacattc ttgcccgcct gatgaatgct catccggaat tccgtatggc 1860aatgaaagac ggtgagctgg tgatatggga tagtgttcac ccttgttaca ccgttttcca 1920tgagcaaact gaaacgtttt catcgctctg gagtgaatac cacgacgatt tccggcagtt 1980tctacacata tattcgcaag atgtggcgtg ttacggtgaa aacctggcct atttccctaa 2040agggtttatt gagaatatgt ttttcgtctc agccaatccc tgggtgagtt tcaccagttt 2100tgatttaaac gtggccaata tggacaactt cttcgccccc gttttcacca tgggcaaata 2160ttatacgcaa ggcgacaagg tgctgatgcc gctggcgatt caggttcatc atgccgtttg 2220tgatggcttc catgtcggca gaatgcttaa tgaattacaa cagtactgcg atgagtggca 2280gggcggggcg taattttttt aaggcagtta 
ttggtgccct taaacgcctg gtgctacgcc 2340tgaataagtg ataataagcg gatgaatggc agaaattcga aagcaaattc gacccggtcg 2400tcggttcagg gcagggtcgt taaatagccg cttatgtcta ttgctggttt accggtttat 2460tgactaccgg aagcagtgtg accgtgtgct tctcaaatgc ctgaggccag tttgctcagg 2520ctctccccgt ggaggtaata attgacgata tgatcgacca attgcgggaa gaaattacag 2580cttttgccgc tggcctacag agtttaggag ttacccccca tcaacacctg gccattttcg 2640ccgacaacag cccccggtgg tttatcgccg atcaaggcag tatgttggct ggagccgtca 2700acgccgtccg ttctgcccaa gcagagcgcc aggaattact ctacatccta gaagacagca 2760acagccgtac tttaatcgca gaaaatcggc aaaccctaag caaattggcc ctagatggcg 2820aaaccattga cctgaaacta atcatcctcc tcaccgatga agaagtggca gaggacagcg 2880ccattcccca atataacttt gcccaggtca tggccctagg ggccggcaaa atccccactc 2940ccgttccccg ccaggaagaa gatttagcca ccctgatcta cacctccggc accacaggac 3000aacccaaagg ggtgatgctc agccacggta atttattgca ccaagtacgg gaattggatt 3060cggtgattat tccccgcccc ggcgatcagg tgttgagcat tttgccctgt tggcactccc 3120tagaaagaag cgccgaatat tttcttcttt cccggggctg cacgatgaac tacaccagca 3180ttcgccattt caagggggat gtgaaggaca ttaaacccca tcacattgtc ggtgtgcccc 3240ggctgtggga atccctctac gaaggggtac aaaaaacgtt ccgggctaag ggcgaattct 3300gcagatatcc atcacactgg cggccgctcg agcatgcatc tagagggccc aattcgccct 3360atagtgagtc gtattacaat tcactggccg tcgttttaca acgtcgtgac tgggaaaacc 3420ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 3480gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgga 3540cgcgccctgt agcggcgcat taagcgcggc gggtgtggtg gttacgcgca gcgtgaccgc 3600tacacttgcc agcgccctag cgcccgctcc tttcgctttc ttcccttcct ttctcgccac 3660gttcgccggc tttccccgtc aagctctaaa tcgggggctc cctttagggt tccgatttag 3720tgctttacgg cacctcgacc ccaaaaaact tgattagggt gatggttcac gtagtgggcc 3780atcgccctga tagacggttt ttcgcccttt gacgttggag tccacgttct ttaatagtgg 3840actcttgttc caaactggaa caacactcaa ccctatctcg gtctattctt ttgatttata 3900agggattttg ccgatttcgg cctattggtt aaaaaatgag ctgatttaac aaaaatttaa 3960cgcgaatttt aacaaaattc agggcgcaag ggctgctaaa ggaagcggaa cacgtagaaa 
4020gccagtccgc agaaacggtg ctgaccccgg atgaatgtca gctactgggc tatctggaca 4080agggaaaacg caagcgcaaa gagaaagcag gtagcttgca gtgggcttac atggcgatag 4140ctagactggg cggttttatg gacagcaagc gaaccggaat tgccagctgg ggcgccctct 4200ggtaaggttg ggaagccctg caaagtaaac tggatggctt tcttgccgcc aaggatctga 4260tggcgcaggg gatcaagatc tgatcaagag acaggatgag gatcgtttcg catgattgaa 4320caagatggat tgcacgcagg ttctccggcc gcttgggtgg agaggctatt cggctatgac 4380tgggcacaac agacaatcgg ctgctctgat gccgccgtgt tccggctgtc agcgcagggg 4440cgcccggttc tttttgtcaa gaccgacctg tccggtgccc tgaatgaact gcaggacgag 4500gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt gctcgacgtt 4560gtcactgaag cgggaaggga ctggctgcta ttgggcgaag tgccggggca ggatctcctg 4620tcatcccacc ttgctcctgc cgagaaagta tccatcatgg ctgatgcaat gcggcggctg 4680catacgcttg atccggctac ctgcccattc gaccaccaag cgaaacatcg catcgagcga 4740gcacgtactc ggatggaagc cggtcttgtc gatcaggatg atctggacga agagcatcag 4800gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc gcatgcccga cggcgaggat 4860ctcgtcgtga cccatggcga tgcctgcttg ccgaatatca tggtggaaaa tggccgcttt 4920tctggattca tcgactgtgg ccggctgggt gtggcggacc gctatcagga catagcgttg 4980gctacccgtg atattgctga agagcttggc ggcgaatggg ctgaccgctt cctcgtgctt 5040tacggtatcg ccgctcccga ttcgcagcgc atcgccttct atcgccttct tgacgagttc 5100ttctgaattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 5160tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 5220atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 5280agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 5340tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 5400tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 5460atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 5520ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 5580tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 5640acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 5700ctggcgaact acttactcta gcttcccggc 
aacaattaat agactggatg gaggcggata 5760aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 5820ctggagccgg tgagcgtggg tctcgcggta tcattgcagc actggggcca gatggtaagc 5880cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 5940gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 6000actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 6060agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 6120cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 6180tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 6240agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 6300ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 6360acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 6420ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 6480gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 6540gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa 6600gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 6660tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 6720caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 6780tttgctggcc ttttgctcac atgttctttc ctgcgttatc ccctgattct gtggataacc 6840gtattaccgc ctttgagtga gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg 6900agtcagtgag cgaggaagcg gaagagcgcc caatacgcaa accgcctctc cccgcgcgtt 6960ggccgattca ttaatgcagc tggcacgaca ggtttcccga ctggaaagcg ggcagtgagc 7020gcaacgcaat taatgtgagt tagctcactc attaggcacc ccaggcttta cactttatgc 7080ttccggctcg tatgttgtgt ggaattgtga gcggataaca atttcacaca ggaaacagct 7140atgaccatga ttacgccaag cttggtaccg agctcggatc cactagtaac ggccgccagt 7200gtgctggaat tcgcccttct cgag 72242879DNAArtificial SequenceSynthetic construct 28gatccgctgt tgacccaaca gcatgagtcg ttatccaagg ggagcttcgg ctcccttttt 60tcatgcgcgg atgcggtga 79291503DNAArtificial SequenceSynthetic construct 29ggatccacta gtcctgaggt gttgacaatt aatcatccgg ctcgtataat 
gtgtggaatt 60gtgagcggat aacaatttca cacaggaaac agaccatggc cgtcgcactg caaccagctc 120aagaagtcgc aactaagaaa aagcctgcaa tcaaacagcg gcgcgtggtg gttaccggca 180tgggtgtggt gactcccctc gggcatgaac cggatgtgtt ttacaacaat ctcctggatg 240gcgtgagcgg cattagtgag atcgagaatt ttgactcgac gcagtttccc actcgcattg 300ccggcgaaat caagagtttc agcaccgacg gctgggtcgc gcccaaattg agcaaacgga 360tggataaatt gatgctgtat ctgctcaccg caggcaagaa agcgctggcc gatgcgggca 420tcacggatga tgtgatgaaa gagctggata aacgcaaatg tggagttctg attggcagtg 480gcatgggcgg catgaagctg ttctacgatg cgctcgaagc cctgaagatt tcgtatcgaa 540agatgaaccc attctgtgtg ccttttgcga ccacgaatat gggtagcgcc atgctggcta 600tggatttggg gtggatgggg ccgaattata gtatttccac cgcgtgcgca acctcgaact 660tctgcatctt gaacgcggct aaccacatta tccgtggtga agcagacatg atgctctgcg 720gcggctccga tgcggtcatt atccctatcg gtttgggcgg ctttgttgct tgccgcgcct 780tgagccaacg caataacgac ccaaccaagg catcgcgccc gtgggacagc aatcgcgatg 840gcttcgtcat gggcgaggga gccggggtgc tgctgttgga ggagctggaa cacgcgaaaa 900agcgaggcgc gacaatctat gctgagttct tgggagggtc ctttacatgc gatgcctacc 960acatgacgga gcctcaccca gagggcgcag gcgtgatctt gtgtatcgag aaggcaatgg 1020ctcaggcagg agtctctcgc gaggatgtta actacattaa tgctcacgca acgtccacgc 1080cggctggtga catcaaggaa taccaagctc tcgcccattg tttcggccag aactcggagc 1140tgcgggtcaa tagtacaaag tccatgatcg gtcatctgct gggtgctgcc ggtggcgtcg 1200aagctgtgac agtcattcaa gccatccgca ccggctggat tcaccctaat ctgaacctgg 1260aagacccgga caaggccgtt gacgcaaaat tcctcgtcgg accggagaaa gaacgtctca 1320acgttaaagt cggattgagc aatagtttcg gttttggtgg ccataactct agtatcctgt 1380ttgcacccta taattgataa tagatctgat ccgctgttga cccaacagca tgagtcgtta 1440tccaagggga gcttcggctc ccttttttca tgcgcggatg cggtgagagc tcacgtgtct 1500aga 1503301224DNAArtificial SequenceSynthetic construct 30ggatccacta gtcctgaggt gttgacaatt aatcatccgg ctcgtataat gtgtggaatt 60gtgagcggat aacaatttca cacaggaaac agaccatggc aagccgtgtt gttggtaaag 120gttgtaaact cgttggatgt ggtagtgccg tcccgaagtt ggaggtgagt aacgacgacc 180tcagtaagat cgtggatact tccgatgaat ggatttctgt tcggacggga 
atccgcaacc 240ggcgggtgat tactggtaag gataagatga cggggctggc ggtcgaggca gcccagaaag 300ccctggaaat ggctgaagtc gatgctgacg atgtggactt gctcctgttg tgcacctcca 360ccccagatga tctctttgga agtgcgccgc aaatccaggc ggcactcggc tgcaaaggaa 420accctctggc atttgatatt acagccgctt gtagcggctt cgttctgggt ctggtgagtg 480cttcctgcta tatccgcggc ggcgggttca agaacgtcct ggttatcggc gcggacgcac 540tgagccgcta cgtcgattgg actgaccgcg gcacatgcat tctctttggt gacgccgctg 600gcgctgtgtt ggtccaggcg tgtgagagcg aggacgacgg cgtcttcggg tttgatctgc 660atagcgatgg agagggttat cgccacctgc atactgggat caaggcgaac gaggagttcg 720ggacgaacgg ttccgttgtg gattttccgc ccaagcgcag cagctactct tccatccaaa 780tgaatgggaa agaagtgttc cgtttcgcct gccgcgtcgt gccccagtct attgagatcg 840cactcgagaa cgcgggcctc acacgttcta gcattgattg gctgctgctc caccaagcaa 900accaacgaat cttggatgcc gtcgcaacgc gtctggaaat tcccgcagac cgcgtgatta 960gtaacttggc taattacggc aatacttctg ccgccagcat tccgttggca ctggatgaag 1020ccgtgcgcag cggtaaggtc aaacccggtc agactatcgc aacttcgggg tttggagcag 1080gcttgacatg gggcagcgcg atcattcgct ggaattaatg atagatctga tccgctgttg 1140acccaacagc atgagtcgtt atccaagggg agcttcggct cccttttttc atgcgcggat 1200gcggtgagag ctcacgtgtc taga 1224311613DNAArtificial SequenceSynthetic construct 31tgttgacaat taatcatccg gctcgtataa tgtgtggaat tgtgagcgga taacaatttc 60acacaggaaa cagcgccgct gagaaaaagc gaagcggcac tgctctttaa caatttatca 120gacaatctgt gtgggcactc gaccggaatt atcgattaac tttattatta aaaattaaag 180aggtatatat taatgtatcg attaaataag gaggaataaa ccatggccgt cgcactgcaa 240ccagctcaag aagtcgcaac taagaaaaag cctgcaatca aacagcggcg cgtggtggtt 300accggcatgg gtgtggtgac tcccctcggg catgaaccgg atgtgtttta caacaatctc 360ctggatggcg tgagcggcat tagtgagatc gagaattttg actcgacgca gtttcccact 420cgcattgccg gcgaaatcaa gagtttcagc accgacggct gggtcgcgcc caaattgagc 480aaacggatgg ataaattgat gctgtatctg ctcaccgcag gcaagaaagc gctggccgat 540gcgggcatca cggatgatgt gatgaaagag ctggataaac gcaaatgtgg agttctgatt 600ggcagtggca tgggcggcat gaagctgttc tacgatgcgc tcgaagccct gaagatttcg 660tatcgaaaga tgaacccatt ctgtgtgcct 
tttgcgacca cgaatatggg tagcgccatg 720ctggctatgg atttggggtg gatggggccg aattatagta tttccaccgc gtgcgcaacc 780tcgaacttct gcatcttgaa cgcggctaac cacattatcc gtggtgaagc agacatgatg 840ctctgcggcg gctccgatgc ggtcattatc cctatcggtt tgggcggctt tgttgcttgc 900cgcgccttga gccaacgcaa taacgaccca accaaggcat cgcgcccgtg ggacagcaat 960cgcgatggct tcgtcatggg cgagggagcc ggggtgctgc tgttggagga gctggaacac 1020gcgaaaaagc gaggcgcgac aatctatgct gagttcttgg gagggtcctt tacatgcgat 1080gcctaccaca tgacggagcc tcacccagag ggcgcaggcg tgatcttgtg tatcgagaag 1140gcaatggctc aggcaggagt ctctcgcgag gatgttaact acattaatgc tcacgcaacg 1200tccacgccgg ctggtgacat caaggaatac caagctctcg cccattgttt cggccagaac 1260tcggagctgc gggtcaatag tacaaagtcc atgatcggtc atctgctggg tgctgccggt 1320ggcgtcgaag ctgtgacagt cattcaagcc atccgcaccg gctggattca ccctaatctg 1380aacctggaag acccggacaa ggccgttgac gcaaaattcc tcgtcggacc ggagaaagaa 1440cgtctcaacg ttaaagtcgg attgagcaat agtttcggtt ttggtggcca taactctagt 1500atcctgtttg caccctataa ttgataatag atctgatccg ctgttgaccc aacagcatga 1560gtcgttatcc aaggggagct tcggctccct tttttcatgc gcggatgcgg tga 1613322698DNAArtificial SequenceSynthetic construct 32cctgaggtgt
tgacaattaa tcatccggct cgtataatgt gtggaattgt gagcggataa 60caatttcaca caggaaacag cgccgctgag aaaaagcgaa gcggcactgc tctttaacaa 120tttatcagac aatctgtgtg ggcactcgac cggaattatc gattaacttt attattaaaa 180attaaagagg tatatattaa tgtatcgatt aaataaggag gaataaacca tggccgtcgc 240actgcaacca gctcaagaag tcgcaactaa gaaaaagcct gcaatcaaac agcggcgcgt 300ggtggttacc ggcatgggtg tggtgactcc cctcgggcat gaaccggatg tgttttacaa 360caatctcctg gatggcgtga gcggcattag tgagatcgag aattttgact cgacgcagtt 420tcccactcgc attgccggcg aaatcaagag tttcagcacc gacggctggg tcgcgcccaa 480attgagcaaa cggatggata aattgatgct gtatctgctc accgcaggca agaaagcgct 540ggccgatgcg ggcatcacgg atgatgtgat gaaagagctg gataaacgca aatgtggagt 600tctgattggc agtggcatgg gcggcatgaa gctgttctac gatgcgctcg aagccctgaa 660gatttcgtat cgaaagatga acccattctg tgtgcctttt gcgaccacga atatgggtag 720cgccatgctg gctatggatt tggggtggat ggggccgaat tatagtattt ccaccgcgtg 780cgcaacctcg aacttctgca tcttgaacgc ggctaaccac attatccgtg gtgaagcaga 840catgatgctc tgcggcggct ccgatgcggt cattatccct atcggtttgg gcggctttgt 900tgcttgccgc gccttgagcc aacgcaataa cgacccaacc aaggcatcgc gcccgtggga 960cagcaatcgc gatggcttcg tcatgggcga gggagccggg gtgctgctgt tggaggagct 1020ggaacacgcg aaaaagcgag gcgcgacaat ctatgctgag ttcttgggag ggtcctttac 1080atgcgatgcc taccacatga cggagcctca cccagagggc gcaggcgtga tcttgtgtat 1140cgagaaggca atggctcagg caggagtctc tcgcgaggat gttaactaca ttaatgctca 1200cgcaacgtcc acgccggctg gtgacatcaa ggaataccaa gctctcgccc attgtttcgg 1260ccagaactcg gagctgcggg tcaatagtac aaagtccatg atcggtcatc tgctgggtgc 1320tgccggtggc gtcgaagctg tgacagtcat tcaagccatc cgcaccggct ggattcaccc 1380taatctgaac ctggaagacc cggacaaggc cgttgacgca aaattcctcg tcggaccgga 1440gaaagaacgt ctcaacgtta aagtcggatt gagcaatagt ttcggttttg gtggccataa 1500ctctagtatc ctgtttgcac cctataattg ataatagatc ctgtcgttaa ctgctttgtt 1560ggtactacct gacttcaccc tcttttaaga tggcaagccg tgttgttggt aaaggttgta 1620aactcgttgg atgtggtagt gccgtcccga agttggaggt gagtaacgac gacctcagta 1680agatcgtgga tacttccgat gaatggattt ctgttcggac gggaatccgc aaccggcggg 
1740tgattactgg taaggataag atgacggggc tggcggtcga ggcagcccag aaagccctgg 1800aaatggctga agtcgatgct gacgatgtgg acttgctcct gttgtgcacc tccaccccag 1860atgatctctt tggaagtgcg ccgcaaatcc aggcggcact cggctgcaaa ggaaaccctc 1920tggcatttga tattacagcc gcttgtagcg gcttcgttct gggtctggtg agtgcttcct 1980gctatatccg cggcggcggg ttcaagaacg tcctggttat cggcgcggac gcactgagcc 2040gctacgtcga ttggactgac cgcggcacat gcattctctt tggtgacgcc gctggcgctg 2100tgttggtcca ggcgtgtgag agcgaggacg acggcgtctt cgggtttgat ctgcatagcg 2160atggagaggg ttatcgccac ctgcatactg ggatcaaggc gaacgaggag ttcgggacga 2220acggttccgt tgtggatttt ccgcccaagc gcagcagcta ctcttccatc caaatgaatg 2280ggaaagaagt gttccgtttc gcctgccgcg tcgtgcccca gtctattgag atcgcactcg 2340agaacgcggg cctcacacgt tctagcattg attggctgct gctccaccaa gcaaaccaac 2400gaatcttgga tgccgtcgca acgcgtctgg aaattcccgc agaccgcgtg attagtaact 2460tggctaatta cggcaatact tctgccgcca gcattccgtt ggcactggat gaagccgtgc 2520gcagcggtaa ggtcaaaccc ggtcagacta tcgcaacttc ggggtttgga gcaggcttga 2580catggggcag cgcgatcatt cgctggaatt aatgatagat ctgatccgct gttgacccaa 2640cagcatgagt cgttatccaa ggggagcttc ggctcccttt tttcatgcgc ggatgcgg 26983389DNAArtificial SequenceSynthetic construct 33gtacgggatc cctgtcgtta actgctttgt tggtactacc tgacttcacc ctcttttaag 60atggcaagcc gtgttgttgg taaaggttg 893429DNAArtificial SequenceSynthetic construct 34cacgtgagct ctcaccgcat ccgcgcatg 29351252DNAArtificial SequenceSynthetic construct 35tcatgaagtt ccttgtcgtc gccgtctcag cacttgcaac tgcatctgct ttcacaacca 60gtcctgcctc tttcaccact gtcagcagtc cttcggtgaa caatgtgttc ggacaggagg 120gaaatgctca caggaacagg agagctacca ttgtcatgga tggagctaac ggaagtgcag 180tcagtttgaa aagtgggtca ttgaatacgc aggaggacac aagttcgtcg ccaccgcccc 240gtacattcct tcaccaactc cctgattgga gcagattgct cactgccatc acaaccgttt 300ttgttaaaag taagcgtccg gatatgcatg atcgtaagtc gaaaaggccg gacatgctcg 360tggatagttt cgggttggag agtaccgttc aggatggact cgtgttccgt caaagctttt 420cgatccgttc atatgagatt ggaactgatc gtacggcttc cattgagact ttgatgaacc 480atcttcagga gacttccctc aaccattgta agagtacagg 
aattttgttg gatggattcg 540gacgcacact cgaaatgtgt aagcgcgatt tgatttgggt cgtcattaaa atgcagatca 600aggttaatag atacccggcc tggggcgata cagtagaaat caatactagg ttcagcagac 660ttggtaagat cggcatgggt cgagattggc tcattagcga ctgcaatacc ggtgagatcc 720tcgtcagggc aaccagcgcc tacgccatga tgaatcagaa gacccgaaga ctctcgaagc 780ttccgtacga ggtccaccaa gagattgtcc ccctttttgt cgactccccc gtaattgaag 840attcggatct caaggtccac aaattcaaag ttaaaacggg tgacagcatc cagaagggac 900ttactcctgg ttggaacgac ctcgatgtga accaacatgt ttcgaacgtg aaatatatcg 960gctggattct tgagagtatg ccaaccgagg tacttgagac gcaggaattg tgctcgttgg 1020cattggagta tcgtcgtgag tgtgggcgag actcagtcct cgaaagtgta acagcaatgg 1080acccaagcaa agttggtgtt cgttcacagt atcaacacct cctccgtctc gaggatggaa 1140cagccattgt gaacggggcc acagagtgga ggccaaagaa cgctggcgct aacggagcta 1200tctccacagg aaagaccagc aatggtaact ctgtgagtta atgataggat cc 125236412PRTArtificial SequenceSynthetic construct 36Met Lys Phe Leu Val Val Ala Val Ser Ala Leu Ala Thr Ala Ser Ala1 5 10 15Phe Thr Thr Ser Pro Ala Ser Phe Thr Thr Val Ser Ser Pro Ser Val 20 25 30Asn Asn Val Phe Gly Gln Glu Gly Asn Ala His Arg Asn Arg Arg Ala 35 40 45Thr Ile Val Met Asp Gly Ala Asn Gly Ser Ala Val Ser Leu Lys Ser 50 55 60Gly Ser Leu Asn Thr Gln Glu Asp Thr Ser Ser Ser Pro Pro Pro Arg65 70 75 80Thr Phe Leu His Gln Leu Pro Asp Trp Ser Arg Leu Leu Thr Ala Ile 85 90 95Thr Thr Val Phe Val Lys Ser Lys Arg Pro Asp Met His Asp Arg Lys 100 105 110Ser Lys Arg Pro Asp Met Leu Val Asp Ser Phe Gly Leu Glu Ser Thr 115 120 125Val Gln Asp Gly Leu Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr 130 135 140Glu Ile Gly Thr Asp Arg Thr Ala Ser Ile Glu Thr Leu Met Asn His145 150 155 160Leu Gln Glu Thr Ser Leu Asn His Cys Lys Ser Thr Gly Ile Leu Leu 165 170 175Asp Gly Phe Gly Arg Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp 180 185 190Val Val Ile Lys Met Gln Ile Lys Val Asn Arg Tyr Pro Ala Trp Gly 195 200 205Asp Thr Val Glu Ile Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile Gly 210 215 220Met Gly Arg Asp Trp Leu Ile Ser Asp Cys Asn Thr Gly Glu Ile 
Leu225 230 235 240Val Arg Ala Thr Ser Ala Tyr Ala Met Met Asn Gln Lys Thr Arg Arg 245 250 255Leu Ser Lys Leu Pro Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe 260 265 270Val Asp Ser Pro Val Ile Glu Asp Ser Asp Leu Lys Val His Lys Phe 275 280 285Lys Val Lys Thr Gly Asp Ser Ile Gln Lys Gly Leu Thr Pro Gly Trp 290 295 300Asn Asp Leu Asp Val Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly305 310 315 320Trp Ile Leu Glu Ser Met Pro Thr Glu Val Leu Glu Thr Gln Glu Leu 325 330 335Cys Ser Leu Ala Leu Glu Tyr Arg Arg Glu Cys Gly Arg Asp Ser Val 340 345 350Leu Glu Ser Val Thr Ala Met Asp Pro Ser Lys Val Gly Val Arg Ser 355 360 365Gln Tyr Gln His Leu Leu Arg Leu Glu Asp Gly Thr Ala Ile Val Asn 370 375 380Gly Ala Thr Glu Trp Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala Ile385 390 395 400Ser Thr Gly Lys Thr Ser Asn Gly Asn Ser Val Ser 405 4103726DNAArtificial SequencePrimer 37caggatccgg ggaggtgtgg tgtagt 263843DNAArtificial SequencePrimer 38taggatccag tggtgcccat ggtactttgt taggggagga tag 433927DNAArtificial SequencePrimer 39caggatcctc actctgtcgc gctgttg 274027DNAArtificial SequencePrimer 40catctagaga ggattgattt ccgagtc 2741573DNAArtificial SequenceSynthetic construct 41atgggcacca ctctcgacga cacggcttac cgctaccgca ccagtgtgcc gggggacgcc 60gaggccatcg aggcactgga tgggtccttc accaccgaca ccgtcttccg cgtcaccgcc 120accggggacg gcttcaccct gcgggaggtg ccggtggacc cgcccctgac caaggtgttc 180cccgacgacg agtcggacga cgagtcggac gacggggagg acggcgaccc ggactcccgg 240acgttcgtcg cgtacgggga cgacggcgac ctggcgggct tcgtggtcgt ctcgtactcc 300ggctggaacc gccggctgac cgtcgaggac atcgaggtcg ccccggagca ccgggggcac 360ggggtcgggc gcgcgctgat ggggctcgcg acggagttcg cccgcgagcg gggtgccggg 420cacctctggc tggaggtcac caacgtcaac gcaccggcga tccacgcgta ccggcggatg 480gggttcaccc tctgcggcct ggacaccgcc ctgtacgacg gcaccgcctc ggacggcgag 540caggcgctct acatgtccat gccctgcccc taa 573421198DNAArtificial SequenceSynthetic construct 42ccatggccgc tatgctcgcc tctaagcagg gcgccttcat gggccgcagc tcctttgccc 60ccgcccccaa gggcgtcgcc agccgcggct ccctgcaggt ggtggccggc 
gccaacggca 120gcgcggtgag cctgaagtcg ggttccctca acactcagga ggacacctcg tcctcgcccc 180cgccgcgcac gttcctgcac cagctgccgg actggtcccg cctgctgacg gctattacga 240ccgtgttcgt gaagtcgaag cgccccgaca tgcacgaccg caagagcaag cggcctgata 300tgctggtgga cagctttggc ctggagtcca cggtgcagga cggcctcgtg ttccggcaaa 360gcttcagcat ccgcagctac gagatcggca cggaccgcac cgcgtcgatc gagacgctca 420tgaaccacct ccaggagacg tcgctcaacc actgcaagtc caccggtatc ctgctggacg 480gctttggccg caccctggag atgtgcaagc gggatctgat ctgggtggtg atcaagatgc 540agatcaaggt gaaccgctat cccgcctggg gtgacaccgt cgagattaac acccgcttct 600cgcgcctggg caagatcggc atggggcgcg actggctgat ctcggactgc aacactggcg 660agatcctggt ccgggccacg tcggcctacg ccatgatgaa ccagaagact cggcggctga 720gcaagctgcc ttacgaggtg catcaggaga tcgtgccgct cttcgtggac agccccgtga 780tcgaggacag cgatctgaag gtgcacaagt tcaaggtcaa gaccggcgac agcatccaga 840agggcctgac tcccggctgg aacgacctgg acgtgaacca gcacgtctcg aacgtgaagt 900acatcggctg gattctggag tcgatgccca ccgaggtgct ggagacgcag gagctgtgct 960ccctggcgct ggagtatcgc cgcgagtgcg gccgcgactc cgtgctggag tccgtcaccg 1020cgatggaccc gtcgaaggtg ggtgtccgca gccagtacca acacctgctg cgcctcgagg 1080acggcaccgc cattgtgaac ggcgcgacgg agtggcggcc gaagaacgcg ggcgctaacg 1140gcgccatctc cacgggcaag acctccaacg gcaactcggt gagctaatga taggatcc 119843394PRTArtificial SequenceSynthetic construct 43Met Ala Ala Met Leu Ala Ser Lys Gln Gly Ala Phe Met Gly Arg Ser1 5 10 15Ser Phe Ala Pro Ala Pro Lys Gly Val Ala Ser Arg Gly Ser Leu Gln 20 25 30Val Val Ala Gly Ala Asn Gly Ser Ala Val Ser Leu Lys Ser Gly Ser 35 40 45Leu Asn Thr Gln Glu Asp Thr Ser Ser Ser Pro Pro Pro Arg Thr Phe 50 55 60Leu His Gln Leu Pro Asp Trp Ser Arg Leu Leu Thr Ala Ile Thr Thr65 70 75 80Val Phe Val Lys Ser Lys Arg Pro Asp Met His Asp Arg Lys Ser Lys 85 90 95Arg Pro Asp Met Leu Val Asp Ser Phe Gly Leu Glu Ser Thr Val Gln 100 105 110Asp Gly Leu Val Phe Arg Gln Ser Phe Ser Ile Arg Ser Tyr Glu Ile 115 120 125Gly Thr Asp Arg Thr Ala Ser Ile Glu Thr Leu Met Asn His Leu Gln 130 135 140Glu Thr Ser Leu Asn His Cys Lys 
Ser Thr Gly Ile Leu Leu Asp Gly145 150 155 160Phe Gly Arg Thr Leu Glu Met Cys Lys Arg Asp Leu Ile Trp Val Val 165 170 175Ile Lys Met Gln Ile Lys Val Asn Arg Tyr Pro Ala Trp Gly Asp Thr 180 185 190Val Glu Ile Asn Thr Arg Phe Ser Arg Leu Gly Lys Ile Gly Met Gly 195 200 205Arg Asp Trp Leu Ile Ser Asp Cys Asn Thr Gly Glu Ile Leu Val Arg 210 215 220Ala Thr Ser Ala Tyr Ala Met Met Asn Gln Lys Thr Arg Arg Leu Ser225 230 235 240Lys Leu Pro Tyr Glu Val His Gln Glu Ile Val Pro Leu Phe Val Asp 245 250 255Ser Pro Val Ile Glu Asp Ser Asp Leu Lys Val His Lys Phe Lys Val 260 265 270Lys Thr Gly Asp Ser Ile Gln Lys Gly Leu Thr Pro Gly Trp Asn Asp 275 280 285Leu Asp Val Asn Gln His Val Ser Asn Val Lys Tyr Ile Gly Trp Ile 290 295 300Leu Glu Ser Met Pro Thr Glu Val Leu Glu Thr Gln Glu Leu Cys Ser305 310 315 320Leu Ala Leu Glu Tyr Arg Arg Glu Cys Gly Arg Asp Ser Val Leu Glu 325 330 335Ser Val Thr Ala Met Asp Pro Ser Lys Val Gly Val Arg Ser Gln Tyr 340 345 350Gln His Leu Leu Arg Leu Glu Asp Gly Thr Ala Ile Val Asn Gly Ala 355 360 365Thr Glu Trp Arg Pro Lys Asn Ala Gly Ala Asn Gly Ala Ile Ser Thr 370 375 380Gly Lys Thr Ser Asn Gly Asn Ser Val Ser385 3904425DNAArtificial SequencePrimer 44ggtggaaaat gcctatgtgt taacg 254525DNAArtificial SequencePrimer 45cgtaggcagt gtgcaaccag gagcc 25463418DNAArtificial SequenceSynthetic construct 46ggtggaaaat gcctatgtgt taacggatct acaaaccagc accaaactct attacgaacc 60ccacggtttc cactctcccc aactgcaaga cttggggccc attgatgtgg ttttaacccc 120cgtcattggc atcaatatcc tcggattcct gccggtgctc aatggccaga aaaccaccct 180ggagctttgt cgcactgtcc atccccaggc gatcgtcccc acctctggag ccgcagaatt 240gaactatagc ggtttactaa ctaaagtttt acgtttagac ggcgatctca gtcaatttcg 300ccagtcccta attgacgaag ggatacaagc ttccctatgg gaaccccagg tgggagtgcc 360cctcaatgtg ccccaatcca ccgttggcta ggttggaatg ttcaaatcac tgtgcggtgt 420gatgcttgat aaatacagtg agccagggaa aactgcaaaa aagtgtataa agtaggttta 480acttgaatca aaatcctttc tccgcagtca tagccaggag taggaagatt accagcgaag 540caagttgtct tcccctagct ttgggcgggc aaaccccttg 
cagtattgcc aacgtcaaaa 600aatcaccata gccgaatgac ctacaccatc aacgctgacc aagtccatca gattgtccat 660aatcttcacc acgatccctt tgaagtgttg ggctgccatc ccctcggagc tttatgcttg 720taaaccgttt tgtgaaaaaa tttttaaaat aaaaaagggg acctctaggg tccccaatta 780attagtaata taatctatta aaggtcattc aaaaggtcat ccaccggatc agcttagtaa 840agccctcgct agattttaat gcggatgttg cgattacttc gccaactatt gcgataacaa 900gaaaaagcca gcctttcatg atatatctcc caatttgtgt agggcttatt atgcacgctt 960aaaaataata aaagcagact tgacctgata gtttggctgt gagcaattat gtgcttagtg 1020catctaacgc ttgagttaag ccgcgccgcg aagcggcgtc ggcttgaacg aattgttaga 1080cattatttgc cgactacctt ggtgatctcg cctttcacgt agtggacaaa ttcttccaac 1140tgatctgcgc gcgaggccaa gcgatcttct tcttgtccaa gataagcctg tctagcttca 1200agtatgacgg gctgatactg ggccggcagg cgctccattg cccagtcggc agcgacatcc 1260ttcggcgcga ttttgccggt tactgcgctg taccaaatgc gggacaacgt aagcactaca 1320tttcgctcat cgccagccca gtcgggcggc gagttccata gcgttaaggt ttcatttagc 1380gcctcaaata gatcctgttc aggaaccgga tcaaagagtt cctccgccgc tggacctacc 1440aaggcaacgc tatgttctct tgcttttgtc agcaagatag ccagatcaat gtcgatcgtg 1500gctggctcga agatacctgc aagaatgtca ttgcgctgcc attctccaaa ttgcagttcg 1560cgcttagctg gataacgcca cggaatgatg tcgtcgtgca caacaatggt gacttctaca 1620gcgcggagaa tctcgctctc tccaggggaa gccgaagttt ccaaaaggtc gttgatcaaa 1680gctcgccgcg ttgtttcatc aagccttacg gtcaccgtaa ccagcaaatc aatatcactg 1740tgtggcttca ggccgccatc cactgcggag ccgtacaaat gtacggccag caacgtcggt 1800tcgagatggc gctcgatgac gccaactacc tctgatagtt gagtcgatac ttcggcgatc 1860accgcttccc tcatgatgtt taactttgtt ttagggcgac tgccctgctg cgtaacatcg 1920ttgctgctcc ataacatcaa acatcgaccc acggcgtaac gcgcttgctg cttggatgcc 1980cgaccgaggc atagactgta ccccaaaaaa acagtcataa caagccatga aaaccgccac 2040tgcgccgtta ccaccgctgc gttcggtcaa ggttctggac cagttgcgtg agcgcatacg 2100ctacttgcat tacagcttac gaaccgaaca ggcttatgtc cactgggttc gtgccttcat 2160ccgtttccac ggtgtgcgtc acccggcaac cttgggcagc agcgaagtcg aggcatttct 2220gtcctggctg gcgaacgagc gcaaggtttc ggtctccacg catcgtcagg cattggcggc 2280cttgctgttc 
ttctacggca aggtgctgtg cacggatctg ccctggcttc aggagatcgg 2340aagacctcgg ccgtcgcggc gcttgccggt ggtgctgacc ccggatgaag tggttcgcat 2400cctcggtttt ctggaaggcg agcatcgttt gttcgcccag cttctgtatg gaacgggcat 2460gcggatcagt gagggtttgc aactgcgggt caaggatctg gatttcgatc acggcacgat 2520catcgtgcgg gagggcaagg gctccaagga tcgggcctgg cacccagcct gcgcgagcag 2580gggaattgat ccggtggatg accttttgaa tgacctttaa tagattatat tactaattaa 2640ttggggaccc tagaggtccc cttttttatt ttaaaaattt tttcacaaaa cggtttacaa 2700gcataaagct tcggggacca cggcaaggtc aatcaatggg tcattcgtgc ctatttaccc 2760acggctgaag cggtaacggt gttgcttccc accgatcgcc gggaagtgat tatgaccacg 2820gtccaccatc ccaacttttt tgaatgcgtg ttggagttgg aagaaccgaa gaattatcaa 2880ttaagaatta ccgaaaatgg ccacgaaagg gtaatttatg acccctatgg ttttaaaact 2940cccaaactga cggattttga cctccatgtg tttggggaag gcaaccacca ccgtatttac 3000gaaaaactcg gtgctcacct gatgacggtg gatggagtta aaggggttta ttttgctgtg 3060tgggccccca atgcccgcaa cgtttccatt ttgggggatt tcaacaactg ggacggcaga 3120ttgcaccaaa tgcggaaacg caacaacatg gtgtgggaat tatttatccc tgagttgggg 3180gtgggcactt cttataagta tgagattaaa aactgggaag ggcacatcta cgaaaagact 3240gacccctacg gtttttacca agaagtacgc cccaaaaccg cttccattgt ggcagacttg 3300gacggttacc aatggcacga cgaagattgg ttggaagcta ggcgcaccag
cgatcccctg 3360agcaaacccg tttccgttta cgaactccat ttaggctcct ggttgcacac tgcctacg 3418

* * * * *

