Peroxisome Biogenesis Factor Protein (pex) Disruptions For Altering Polyunsaturated Fatty Acids And Total Lipid Content In Oleaginous Eukaryotic Organisms

HONG; SEUNG-PYO ;   et al.

Patent Application Summary

U.S. patent application number 12/244950, filed on October 3, 2008, was published on 2009-05-07 as publication number 20090117253 for peroxisome biogenesis factor protein (pex) disruptions for altering polyunsaturated fatty acids and total lipid content in oleaginous eukaryotic organisms. This patent application is currently assigned to E. I. DU PONT DE NEMOURS AND COMPANY. Invention is credited to SEUNG-PYO HONG, PAMELA L. SHARPE, ZHIXIONG XUE, NARENDRA S. YADAV, QUINN QUN ZHU.

Publication Number: 20090117253
Application Number: 12/244950
Family ID: 40084411
Publication Date: 2009-05-07

United States Patent Application 20090117253
Kind Code A1
HONG; SEUNG-PYO ;   et al. May 7, 2009

PEROXISOME BIOGENESIS FACTOR PROTEIN (PEX) DISRUPTIONS FOR ALTERING POLYUNSATURATED FATTY ACIDS AND TOTAL LIPID CONTENT IN OLEAGINOUS EUKARYOTIC ORGANISMS

Abstract

Methods are described for increasing the amount of polyunsaturated fatty acids (PUFAs) in the total lipid fraction and in the oil fraction of PUFA-producing, oleaginous eukaryotes by modifying the activity of peroxisome biogenesis factor (Pex) proteins. Disruption of a chromosomal Pex3, Pex10 or Pex16 gene in a PUFA-producing, oleaginous eukaryotic strain resulted in an increased amount of PUFAs, as a percent of total fatty acids and as a percent of dry cell weight, in the total lipid fraction and in the oil fraction of the strain, as compared to the parental strain whose native Pex protein was not disrupted.


Inventors: HONG; SEUNG-PYO; (HOCKESSIN, DE) ; SHARPE; PAMELA L.; (WILMINGTON, DE) ; XUE; ZHIXIONG; (CHADDS FORD, PA) ; YADAV; NARENDRA S.; (WILMINGTON, DE) ; ZHU; QUINN QUN; (WEST CHESTER, PA)
Correspondence Address:
    E I DU PONT DE NEMOURS AND COMPANY;LEGAL PATENT RECORDS CENTER
    BARLEY MILL PLAZA 25/1122B, 4417 LANCASTER PIKE
    WILMINGTON
    DE
    19805
    US
Assignee: E. I. DU PONT DE NEMOURS AND COMPANY
WILMINGTON
DE

Family ID: 40084411
Appl. No.: 12/244950
Filed: October 3, 2008

Related U.S. Patent Documents

Application Number Filing Date Patent Number
60977174 Oct 3, 2007
60977177 Oct 3, 2007

Current U.S. Class: 426/601 ; 435/254.2; 435/471
Current CPC Class: C12P 7/6472 20130101; A61P 35/00 20180101; A61P 19/02 20180101; A61P 25/28 20180101; A61P 5/24 20180101; C12N 9/1029 20130101; A61P 9/00 20180101; C12P 7/6427 20130101; A61P 29/00 20180101; A61P 25/18 20180101; A61P 9/10 20180101; A61P 9/12 20180101; C12N 9/0083 20130101; A61P 19/10 20180101; A61P 25/24 20180101; C12N 15/815 20130101; A61P 3/10 20180101; A61P 17/02 20180101; A61P 3/06 20180101; A61P 1/16 20180101; A61P 25/00 20180101; A61P 1/00 20180101; A61P 3/00 20180101
Class at Publication: 426/601 ; 435/471; 435/254.2
International Class: A23K 1/16 20060101 A23K001/16; C12N 15/63 20060101 C12N015/63; C12N 1/19 20060101 C12N001/19

Claims



1. A method of increasing the weight percent of at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in an oleaginous eukaryotic organism having a total lipid content, a total lipid fraction and an oil fraction, comprising: a) providing an oleaginous eukaryotic organism comprising: 1) genes encoding a functional polyunsaturated fatty acid biosynthetic pathway; and 2) a disruption in a native gene encoding a peroxisome biogenesis factor protein, thereby providing a PEX-disrupted organism, and b) growing the PEX-disrupted organism under conditions as to increase the weight percent of at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction, when compared to the weight percent of the at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction in the oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

2. The method of claim 1, wherein the increase in the weight percent of the at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids is at least 1.3 fold, when compared to the weight percent of polyunsaturated fatty acids relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction in an oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

3. The method of claim 1, wherein the at least one polyunsaturated fatty acid is selected from the group consisting of: linoleic acid, conjugated linoleic acid, γ-linolenic acid, dihomo-γ-linolenic acid, arachidonic acid, docosatetraenoic acid, ω-6 docosapentaenoic acid, α-linolenic acid, stearidonic acid, eicosatetraenoic acid, eicosapentaenoic acid, ω-3 docosapentaenoic acid, eicosadienoic acid, eicosatrienoic acid, docosahexaenoic acid, hydroxylated or epoxy fatty acids of these, C18 polyunsaturated fatty acids, C20 polyunsaturated fatty acids, and C22 polyunsaturated fatty acids.

4. The method of claim 1, wherein the at least one polyunsaturated fatty acid consists of a combination of polyunsaturated fatty acids and wherein the weight percent of the combination is increased relative to the weight percent of total fatty acids.

5. The method of claim 4, wherein the combination of polyunsaturated fatty acids consists of any combination of two or more polyunsaturated fatty acids selected from the group consisting of: linoleic acid, conjugated linoleic acid, γ-linolenic acid, dihomo-γ-linolenic acid, arachidonic acid, docosatetraenoic acid, ω-6 docosapentaenoic acid, α-linolenic acid, stearidonic acid, eicosatetraenoic acid, eicosapentaenoic acid, ω-3 docosapentaenoic acid, eicosadienoic acid, eicosatrienoic acid, docosahexaenoic acid, hydroxylated or epoxy fatty acids of these, a combination of C20 polyunsaturated fatty acids, a combination of C20-22 polyunsaturated fatty acids, and a combination of C22 polyunsaturated fatty acids.

6. The method of claim 1, wherein the total lipid content in the PEX-disrupted organism is increased, when compared with the total lipid content in an oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

7. The method of claim 1, wherein the total lipid content in the PEX-disrupted organism is decreased, when compared with the total lipid content in an oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

8. The method of claim 1, wherein the PEX-disrupted organism is selected from the group consisting of: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon, Lipomyces, Mortierella, Thraustochytrium, Schizochytrium, and Saccharomyces having the property of oleaginy.

9. The method of claim 1, wherein the polyunsaturated fatty acid biosynthetic pathway comprises genes encoding enzymes selected from the group consisting of: Δ9 desaturase, Δ12 desaturase, Δ6 desaturase, Δ5 desaturase, Δ17 desaturase, Δ8 desaturase, Δ15 desaturase, Δ4 desaturase, C14/16 elongase, C16/18 elongase, C18/20 elongase, C20/22 elongase and Δ9 elongase.

10. The method of claim 1, wherein the disruption in the native gene encoding a peroxisome biogenesis factor protein comprises a deletion selected from the group consisting of: a deletion in a portion of the gene encoding the C-terminal portion of the protein and a gene knockout.

11. The method of claim 1, wherein the peroxisome biogenesis factor protein is selected from the group consisting of: Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex5Bp, Pex5Cp, Pex5/20p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex15p, Pex16p, Pex17p, Pex14/17p, Pex18p, Pex19p, Pex20p, Pex21p, Pex21Bp, Pex22p, Pex22p-like and Pex26p.

12. The method of claim 1, wherein the peroxisome biogenesis factor protein is selected from the group consisting of: peroxisome biogenesis factor 3 protein (Pex3p), peroxisome biogenesis factor 10 protein (Pex10p) and peroxisome biogenesis factor 16 protein (Pex16p), and wherein the disruption is a gene knockout.

13. The method of claim 1, wherein the peroxisome biogenesis factor protein is selected from the group consisting of: peroxisome biogenesis factor 2 protein (Pex2p), peroxisome biogenesis factor 10 protein (Pex10p) and peroxisome biogenesis factor 12 protein (Pex12p), and wherein the disruption is a deletion in a portion of the gene encoding the C-terminal portion of the C3HC4 zinc ring finger motif of the protein.

14. The oil fraction or the total lipid fraction in a PEX-disrupted organism having an increase in the weight percent of at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids, wherein the increase was obtained by the method of claim 1.

15. Use as food, feed or in an industrial application of the at least one polyunsaturated fatty acid of a PEX-disrupted organism having been increased in weight percent relative to the weight percent of total fatty acids by the method of claim 1.

16. A PEX-disrupted Yarrowia lipolytica, wherein the disruption occurs in the native gene encoding a peroxisome biogenesis factor protein selected from the group consisting of Pex3p, Pex10p and Pex16p.

17. The Yarrowia lipolytica of claim 16 having ATCC designation ATCC PTA-8614 (strain Y4128).

18. A method of increasing the percent of at least one polyunsaturated fatty acid relative to the dry cell weight in an oleaginous eukaryotic organism, comprising: a) providing an oleaginous eukaryotic organism comprising: 1) genes encoding a functional polyunsaturated fatty acid biosynthetic pathway; and 2) a disruption in a native gene encoding a peroxisome biogenesis factor protein, thereby providing a PEX-disrupted organism, and b) growing the PEX-disrupted organism under conditions as to increase the percent of at least one polyunsaturated fatty acid relative to the dry cell weight, when compared to the percent of the at least one polyunsaturated fatty acid relative to the dry cell weight in the oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

19. The method of claim 18, wherein the increase in the percent of the at least one polyunsaturated fatty acid relative to the dry cell weight is at least 1.3 fold, when compared to the percent of polyunsaturated fatty acids relative to the dry cell weight of an oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

20. The method of claim 19, wherein the at least one polyunsaturated fatty acid is selected from the group consisting of: linoleic acid, conjugated linoleic acid, γ-linolenic acid, dihomo-γ-linolenic acid, arachidonic acid, docosatetraenoic acid, ω-6 docosapentaenoic acid, α-linolenic acid, stearidonic acid, eicosatetraenoic acid, eicosapentaenoic acid, ω-3 docosapentaenoic acid, eicosadienoic acid, eicosatrienoic acid, docosahexaenoic acid, hydroxylated or epoxy fatty acids of these, C18 polyunsaturated fatty acids, C20 polyunsaturated fatty acids, and C22 polyunsaturated fatty acids.

21. The method of claim 19, wherein the total lipid content in the PEX-disrupted organism is altered, when compared with the total lipid content in an oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

22. The method of claim 19, wherein the disruption in the native gene encoding a peroxisome biogenesis factor protein comprises a deletion selected from the group consisting of: a deletion in a portion of the gene encoding the C-terminal portion of the protein, and a gene knockout; and wherein the peroxisome biogenesis factor protein is selected from the group consisting of: Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex5Bp, Pex5Cp, Pex5/20p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex15p, Pex16p, Pex17p, Pex14/17p, Pex18p, Pex19p, Pex20p, Pex21p, Pex21Bp, Pex22p, Pex22p-like and Pex26p.
Description



[0001] This application claims the benefit of U.S. Provisional Applications No. 60/977,174 and No. 60/977,177, both filed Oct. 3, 2007 and both hereby incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[0002] This invention is in the field of biotechnology. More specifically, this invention pertains to methods useful for manipulating the polyunsaturated fatty acid (PUFA) composition and lipid content of eukaryotic organisms, based on disruption of peroxisome biogenesis factor (Pex) proteins.

BACKGROUND OF THE INVENTION

[0003] The health benefits associated with polyunsaturated fatty acids ["PUFAs"], especially ω-3 and ω-6 PUFAs, have been well documented. In order to find ways to produce large-scale quantities of ω-3 and ω-6 PUFAs, researchers have directed their work toward the discovery of genes and the understanding of the encoded biosynthetic pathways that result in lipids and fatty acids.

[0004] One effort to produce these PUFAs has introduced ω-3/ω-6 PUFA biosynthetic pathways into organisms that do not natively produce ω-3/ω-6 PUFAs. One such organism that has been extensively manipulated is the non-oleaginous yeast, Saccharomyces cerevisiae. However, none of the preliminary results demonstrating limited production of linoleic acid ["LA"], γ-linolenic acid ["GLA"], α-linolenic acid ["ALA"], stearidonic acid ["STA"] and/or eicosapentaenoic acid ["EPA"] are suitable for commercial exploitation.

[0005] Other efforts to produce large-scale quantities of ω-3/ω-6 PUFAs have cultivated microbial organisms that natively produce the fatty acid of choice, e.g., heterotrophic diatoms Cyclotella sp. and Nitzschia sp., Pseudomonas, Alteromonas or Shewanella species, filamentous fungi of the genus Pythium, or Mortierella elongata, M. exigua or M. hygrophila.

[0006] All these efforts suffer from an inability to substantially improve the yield of oil or to control the characteristics of the oil composition produced, since the fermentations rely on the natural abilities of the microbes themselves.

[0007] Commonly owned U.S. Pat. No. 7,238,482 describes the use of the oleaginous yeast Yarrowia lipolytica as a production host for the production of PUFAs. Oleaginous yeast are defined as those yeast that are naturally capable of oil synthesis and accumulation, where oil accumulation greater than 25% of the cellular dry weight is typical. Optimization of the production host has been described in the art (see for example Int'l. App. Pub. No. WO 2006/033723, U.S. Pat. App. Pub. No. 2006-0094092, U.S. Pat. App. Pub. No. 2006-0115881, and U.S. Pat. App. Pub. No. 2006-0110806). The recombinant strains described therein comprise various chimeric genes expressing multiple copies of heterologous desaturases, elongases and acyltransferases and optionally comprise various native desaturase and acyltransferase knockouts to enable PUFA synthesis and accumulation. Further optimization of the host cell is needed for commercial production of PUFAs.

[0008] Lin, Y. et al. suggest that peroxisomes are required for both catabolic and anabolic lipid metabolism (Plant Physiology, 135:814-827 (2004)). However, this hypothesis was based on studies with a homolog of Pex16p in Arabidopsis mutants that had both abnormal peroxisome biogenesis and fatty acid synthesis (i.e., a reduction of oil to approximately 10-16% of wild type in sse1 seeds was reported). Binns, D. et al. (J. Cell Biol., 173(5):719-731 (2006)) also document an intimate collaboration between peroxisomes and lipid bodies in Saccharomyces cerevisiae. However, Pex knockouts have not previously been studied in a PUFA-producing organism.

[0009] Applicants have solved the stated problem of optimizing host cells for commercial production of PUFAs by the unpredictable mechanism of disruption of peroxisome biogenesis factor proteins in a PUFA-producing organism, which leads to the unpredictable result of an increase in the amount of PUFAs, as a percent of total fatty acids, in a recombinant PUFA-producing strain of Y. lipolytica. Novel strains containing disruptions in peroxisome biogenesis factor proteins are described herein.

SUMMARY OF THE INVENTION

[0010] Described herein are methods of increasing the weight percent of at least one polyunsaturated fatty acid ["PUFA"] relative to the weight percent of total fatty acids ["TFAs"] in an oleaginous eukaryotic organism having a total lipid content, a total lipid fraction and an oil fraction, comprising:

a) providing an oleaginous eukaryotic organism comprising:

[0011] 1) genes encoding a functional polyunsaturated fatty acid biosynthetic pathway; and

[0012] 2) a disruption in a native gene encoding a peroxisome biogenesis factor protein, thereby providing a PEX-disrupted organism, and

b) growing the PEX-disrupted organism under conditions as to increase the weight percent of at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction, when compared to the weight percent of the at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction in the oleaginous eukaryotic organism in which no native gene encoding a peroxisome biogenesis factor protein has been disrupted.

[0013] This method may also be used to increase the percent of at least one polyunsaturated fatty acid ["PUFA"] relative to the dry cell weight ["DCW"] by applying the same steps (a) and (b).

[0014] In some of the methods described here, the weight percent of the PUFA relative to the weight percent of the TFAs is increased at least 1.3 fold.
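The 1.3-fold criterion can be illustrated with a short calculation. The sketch below computes the weight percent of one PUFA relative to TFAs for a parental strain and for a PEX-disrupted strain, then checks the fold increase. The function names and every numeric value are hypothetical placeholders chosen only to show the arithmetic; they are not data from this application.

```python
def weight_percent_of_tfas(pufa_mass_mg, tfa_mass_mg):
    """Weight percent of one PUFA relative to total fatty acids (TFAs)."""
    if tfa_mass_mg <= 0:
        raise ValueError("TFA mass must be positive")
    return 100.0 * pufa_mass_mg / tfa_mass_mg


def fold_increase(disrupted_pct, parental_pct):
    """Ratio of the PEX-disrupted value to the parental (undisrupted) value."""
    if parental_pct <= 0:
        raise ValueError("parental percent must be positive")
    return disrupted_pct / parental_pct


# Hypothetical masses (mg) for illustration only; not measured data.
parental_pct = weight_percent_of_tfas(pufa_mass_mg=10.0, tfa_mass_mg=100.0)
disrupted_pct = weight_percent_of_tfas(pufa_mass_mg=15.0, tfa_mass_mg=100.0)

fold = fold_increase(disrupted_pct, parental_pct)
meets_threshold = fold >= 1.3  # the "at least 1.3 fold" criterion
print(f"fold increase = {fold:.2f}, meets 1.3-fold threshold: {meets_threshold}")
```

The same ratio applies when the percentages are expressed relative to dry cell weight rather than TFAs, provided both strains are measured on the same basis.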

[0015] In some of the described methods, the total lipid content in the PEX-disrupted organism may be increased or decreased compared with that of an oleaginous eukaryote having no disruption in a native PEX gene.

[0016] In any of these methods, the increased PUFA may be a single PUFA or a combination of PUFAs. In either case, the increased PUFA or increased combination of PUFAs can include linoleic acid, conjugated linoleic acid, γ-linolenic acid, dihomo-γ-linolenic acid, arachidonic acid, docosatetraenoic acid, ω-6 docosapentaenoic acid, α-linolenic acid, stearidonic acid, eicosatetraenoic acid, eicosapentaenoic acid, ω-3 docosapentaenoic acid, eicosadienoic acid, eicosatrienoic acid, docosahexaenoic acid, hydroxylated or epoxy fatty acids of these, a C18 polyunsaturated fatty acid or a combination of these, a C20 polyunsaturated fatty acid or a combination of these, a combination of C20-22 polyunsaturated fatty acids and a C22 polyunsaturated fatty acid or a combination of these.

[0017] In any of these methods, the PEX-disrupted organism may be a member of the following: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon, Lipomyces, Mortierella, Thraustochytrium, Schizochytrium, and Saccharomyces having the property of oleaginy. And, in any of the described methods, the PUFA biosynthetic pathway includes genes that encode any one or a combination of the following enzymes: Δ9 desaturase, Δ12 desaturase, Δ6 desaturase, Δ5 desaturase, Δ17 desaturase, Δ8 desaturase, Δ15 desaturase, Δ4 desaturase, C14/16 elongase, C16/18 elongase, C18/20 elongase, C20/22 elongase and Δ9 elongase.

[0018] The disruption may occur in a PEX gene that encodes a peroxisome biogenesis factor protein that includes the following: Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex5Bp, Pex5Cp, Pex5/20p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex15p, Pex16p, Pex17p, Pex14/17p, Pex18p, Pex19p, Pex20p, Pex21p, Pex21Bp, Pex22p, Pex22p-like and Pex26p. And in any of these methods, the disruption may be a gene knockout or a deletion in a portion of the gene that encodes the C-terminal portion of the protein. In some of these methods, the deletion is in the portion of the gene encoding the C-terminal portion of the C3HC4 zinc ring finger motif of the protein.

[0019] Also described herein is the oil fraction or the total lipid fraction in a PEX-disrupted organism, which has experienced an increase in the weight percent of at least one PUFA accomplished by the method of Claim 1. Described herein is also a PEX-disrupted Yarrowia lipolytica, having a disruption in a native gene encoding Pex3p or Pex10p or Pex16p. This Y. lipolytica may have ATCC designation ATCC PTA-8614 (strain Y4128).

Biological Deposits

[0020] The following biological materials have been deposited with the American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, and bear the following designations, accession numbers and dates of deposit.

Biological Material | Accession No. | Date of Deposit
Yarrowia lipolytica Y2047 | ATCC PTA-7186 | Oct. 26, 2005
Yarrowia lipolytica Y2201 | ATCC PTA-7185 | Oct. 26, 2005
Yarrowia lipolytica Y2096 | ATCC PTA-7184 | Oct. 26, 2005
Yarrowia lipolytica Y3000 | ATCC PTA-7187 | Oct. 26, 2005
Yarrowia lipolytica Y4128 | ATCC PTA-8614 | Aug. 23, 2007
Yarrowia lipolytica Y4127 | ATCC PTA-8802 | Nov. 29, 2007

The biological materials listed above were deposited under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. The listed deposit will be maintained in the indicated international depository for at least 30 years and will be made available to the public upon the grant of a patent disclosing it. The availability of a deposit does not constitute a license to practice the subject invention in derogation of patent rights granted by government action.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE LISTINGS

[0021] FIG. 1 consists of FIG. 1A and FIG. 1B, which together illustrate the ω-3/ω-6 fatty acid biosynthetic pathway, and should be viewed together when considering the description of this pathway below.

[0022] FIG. 2A provides an alignment of the C3HC4 zinc ring finger motifs of the Yarrowia lipolytica Pex10p (i.e., amino acids 327-364 of SEQ ID NO:10 [GenBank Accession No. CAG81606]), the Yarrowia lipolytica Pex2p (i.e., amino acids 266-323 of SEQ ID NO:2 [GenBank Accession No. CAG77647]) and the Yarrowia lipolytica Pex12p (i.e., amino acids 342-391 of SEQ ID NO:11 [GenBank Accession No. CAG81532]), with cysteine and histidine residues of the conserved C3HC4 zinc ring finger motif indicated by asterisks.

[0023] FIG. 2B schematically illustrates the proposed interaction between various amino acid residues of the Y. lipolytica Pex10p C3HC4 finger motif and the two zinc ions to which they bind.

[0024] FIG. 3A diagrams the development of Yarrowia lipolytica strain Y4128, producing 37.6% EPA in the total lipid fraction.

[0025] FIG. 3B provides a plasmid map for pZP3-Pa777U.

[0026] FIG. 4 provides plasmid maps for the following: (A) pY117; and, (B) pZP2-2988.

[0027] FIG. 5 provides plasmid maps for the following: (A) pZKUE3S; and, (B) pFBAIN-MOD-1.

[0028] FIG. 6 provides plasmid maps for the following: (A) pFBAIN-PEX10; and, (B) pEXP-MOD-1.

[0029] FIG. 7A provides a plasmid map for pPEX10-1. FIG. 7B diagrams the development of Yarrowia lipolytica strain Y4184U.

[0030] FIG. 8 provides plasmid maps for the following: (A) pZKL1-2SP98C; and, (B) pZKL2-5U89GC.

[0031] FIG. 9 provides plasmid maps for the following: (A) pYPS161; and, (B) pYRH13.

[0032] FIG. 10 diagrams the development of Yarrowia lipolytica strain Y4305U3.

[0033] FIG. 11 provides plasmid maps for the following: (A) pZKUM; and, (B) pZKD2-5U89A2.

[0034] FIG. 12 provides plasmid maps for the following: (A) pY87; and, (B) pY157.

[0035] The invention can be more fully understood from the following detailed description and the accompanying sequence descriptions, which form a part of this application.

[0036] The following sequences comply with 37 C.F.R. §§ 1.821-1.825 ("Requirements for Patent Applications Containing Nucleotide Sequences and/or Amino Acid Sequence Disclosures--the Sequence Rules") and are consistent with World Intellectual Property Organization (WIPO) Standard ST.25 (1998) and the sequence listing requirements of the EPO and PCT (Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the Administrative Instructions). The symbols and format used for nucleotide and amino acid sequence data comply with the rules set forth in 37 C.F.R. § 1.822.

[0037] SEQ ID NOs:1-86 are primers, ORFs encoding genes or proteins (or portions thereof), or plasmids, as identified in Table 1.

TABLE 1 Summary Of Nucleic Acid And Protein SEQ ID Numbers

Description and Abbreviation | Nucleic acid SEQ ID NO. | Protein SEQ ID NO.
Yarrowia lipolytica Pex1p (GenBank Accession No. CAG82178) | -- | 1 (1024 AA)
Yarrowia lipolytica Pex2p (GenBank Accession No. CAG77647) | -- | 2 (381 AA)
Yarrowia lipolytica Pex3p (GenBank Accession No. CAG78565) | -- | 3 (431 AA)
Yarrowia lipolytica Pex3Bp (GenBank Accession No. CAG83356) | -- | 4 (395 AA)
Yarrowia lipolytica Pex4p (GenBank Accession No. CAG79130) | -- | 5 (153 AA)
Yarrowia lipolytica Pex5p (GenBank Accession No. CAG78803) | -- | 6 (598 AA)
Yarrowia lipolytica Pex6p (GenBank Accession No. CAG82306) | -- | 7 (1024 AA)
Yarrowia lipolytica Pex7p (GenBank Accession No. CAG78389) | -- | 8 (356 AA)
Yarrowia lipolytica Pex8p (GenBank Accession No. CAG80447) | -- | 9 (671 AA)
Yarrowia lipolytica Pex10p (GenBank Accession No. CAG81606) | -- | 10 (377 AA)
Yarrowia lipolytica Pex12p (GenBank Accession No. CAG81532) | -- | 11 (408 AA)
Yarrowia lipolytica Pex13p (GenBank Accession No. CAG81789) | -- | 12 (412 AA)
Yarrowia lipolytica Pex14p (GenBank Accession No. CAG79323) | -- | 13 (380 AA)
Yarrowia lipolytica Pex16p (GenBank Accession No. CAG79622) | -- | 14 (391 AA)
Yarrowia lipolytica Pex17p (GenBank Accession No. CAG84025) | -- | 15 (225 AA)
Yarrowia lipolytica Pex19p (GenBank Accession No. AAK84827) | -- | 16 (324 AA)
Yarrowia lipolytica Pex20p (GenBank Accession No. CAG79226) | -- | 17 (417 AA)
Yarrowia lipolytica Pex22p (GenBank Accession No. CAG77876) | -- | 18 (195 AA)
Yarrowia lipolytica Pex26p (GenBank Accession No. NC_006072, antisense translation of nucleotides 117230-118387) | -- | 19 (386 AA)
Contig comprising Yarrowia lipolytica Pex10 gene encoding peroxisomal biogenesis factor protein (Pex10p) (GenBank Accession No. AB036770) | 20 (3387 bp) | --
Yarrowia lipolytica Pex10 (GenBank Accession No. AB036770, nucleotides 1038-2171) (the protein sequence is 100% identical to SEQ ID NO: 10) | 21 (1134 bp) | 22 (377 AA)
Yarrowia lipolytica Pex10 (GenBank Accession No. AJ012084, which corresponds to nucleotides 1107-2171 of GenBank Accession No. AB036770) (the first 23 amino acids are truncated with respect to the protein sequences of SEQ ID NOs: 10 and 22) | 23 (1065 bp) | 24 (354 AA)
Yarrowia lipolytica Pex10p C3HC4 zinc ring finger motif (i.e., amino acids 327-364 of SEQ ID NO: 10) | -- | 25 (38 AA)
Yarrowia lipolytica truncated Pex10p (GenBank Accession No. CAG81606 [SEQ ID NO: 10], with C-terminal 32 amino acid deletion) | -- | 26 (345 AA)
Yarrowia lipolytica mutant acetohydroxyacid synthase (AHAS) gene comprising a W497L mutation | 27 (2987 bp) | --
Plasmid pZP3-Pa777U | 28 (13,066 bp) | --
Plasmid pY117 | 29 (9570 bp) | --
Plasmid pZP2-2988 | 30 (15,743 bp) | --
Plasmid pZKUE3S | 31 (6303 bp) | --
Primer pZP-GW-5-1 | 32 | --
Primer pZP-GW-5-2 | 33 | --
Primer pZP-GW-5-3 | 34 | --
Primer pZP-GW-5-4 | 35 | --
Primer pZP-GW-3-1 | 36 | --
Primer pZP-GW-3-2 | 37 | --
Primer pZP-GW-3-3 | 38 | --
Primer pZP-GW-3-4 | 39 | --
Genome Walker adaptor [top strand] | 40 | --
Genome Walker adaptor [bottom strand] | 41 | --
Nested adaptor primer | 42 | --
Primer Per10 F1 | 43 | --
Primer ZPGW-5-5 | 44 | --
Primer Per10 R | 45 | --
Plasmid pFBAIN-MOD-1 | 46 (7222 bp) | --
Plasmid pFBAIn-PEX10 | 47 (8133 bp) | --
Primer PEX10-R-BsiWI | 48 | --
Primer PEX10-F1-Sall | 49 | --
Primer PEX10-F2-Sall | 50 | --
Plasmid pEXP-MOD1 | 51 (7277 bp) | --
Plasmid pPEX10-1 | 52 (7559 bp) | --
Plasmid pPEX10-2 | 53 (8051 bp) | --
Plasmid pZKL1-2SP98C | 54 (15,877 bp) | --
Plasmid pZKL2-5U89GC | 55 (15,812 bp) | --
Plasmid pYPS161 | 56 (7966 bp) | --
Primer Pex-10del1 3'.Forward | 57 | --
Primer Pex-10del2 5'.Reverse | 58 | --
Plasmid pYRH13 | 59 (8673 bp) | --
Primer PEX16Fii | 60 | --
Primer PEX16Rii | 61 | --
Primer 3UTR-URA3 | 62 | --
Primer Pex16-conf | 63 | --
Real time PCR primer ef-324F | 64 | --
Real time PCR primer ef-392R | 65 | --
Real time PCR primer Pex16-741F | 66 | --
Real time PCR primer Pex16-802R | 67 | --
Nucleotide portion of TaqMan probe ef-345T | 68 | --
Nucleotide portion of TaqMan probe PEX16-760T | 69 | --
Plasmid pZKUM | 70 (4313 bp) | --
Plasmid pZKD2-5U89A2 | 71 (15,966 bp) | --
Yarrowia lipolytica diacylglycerol acyltransferase (DGAT2) (U.S. Pat. No. 7,267,976) | 72 (2119 bp) | 73 (514 AA)
Synthetic Δ12 desaturase derived from Fusarium moniliforme, codon-optimized for expression in Yarrowia lipolytica ("FmD12S") | 74 (1434 bp) | 75 (477 AA)
Synthetic mutant Δ8 desaturase ("EgD8M"), derived from Euglena gracilis ("EgD8S"; U.S. Pat. No. 7,256,033) | 76 (1272 bp) | 77 (422 AA)
Synthetic Δ9 elongase derived from Eutreptiella sp. CCMP389 codon-optimized for expression in Yarrowia lipolytica ("E389D9eS") | 78 (792 bp) | 79 (263 AA)
Synthetic Δ5 desaturase derived from Euglena gracilis, codon-optimized for expression in Yarrowia lipolytica ("EgD5S") | 80 (1350 bp) | 81 (449 AA)
Plasmid pY157 | 82 (6356 bp) | --
Plasmid pY87 | 83 (5910 bp) | --
Escherichia coli LoxP recombination site, recognized by a Cre recombinase enzyme | 84 (34 bp) | --
Primer UP 768 | 85 | --
Primer LP 769 | 86 | --
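The coordinates and sizes quoted for Pex10 in Table 1 and in the description of FIG. 2A can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper name `span_length` is hypothetical, coordinates are treated as 1-based and inclusive, and it is an assumption (not stated in the text) that each listed ORF length includes one stop codon.

```python
def span_length(start, end):
    """Number of positions in a 1-based, inclusive range start..end."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted from the FIG. 2A description and Table 1.
ring_pex10p = span_length(327, 364)      # Pex10p C3HC4 motif, listed as 38 AA
ring_pex2p = span_length(266, 323)       # Pex2p C3HC4 motif
ring_pex12p = span_length(342, 391)      # Pex12p C3HC4 motif
pex10_orf_bp = span_length(1038, 2171)   # SEQ ID NO:21, listed as 1134 bp
pex10_short_bp = span_length(1107, 2171) # SEQ ID NO:23, listed as 1065 bp

# Assumption: each ORF length counts one stop codon, so the number of
# encoded amino acids is (bp / 3) - 1.
full_aa = pex10_orf_bp // 3 - 1          # compare with 377 AA (SEQ ID NO:22)
short_aa = pex10_short_bp // 3 - 1       # compare with 354 AA (SEQ ID NO:24)
n_terminal_difference = full_aa - short_aa  # compare with "first 23 amino acids"

# C-terminal deletion of 32 residues from the 377 AA Pex10p (SEQ ID NO:26).
truncated_pex10p_aa = 377 - 32

print(ring_pex10p, pex10_orf_bp, pex10_short_bp,
      full_aa, short_aa, n_terminal_difference, truncated_pex10p_aa)
```

Under these assumptions the computed values agree with the sizes listed in Table 1 (38 AA, 1134 bp, 1065 bp, 377 AA, 354 AA and 345 AA).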

DETAILED DESCRIPTION OF THE INVENTION

[0038] Described herein are generalized methods to manipulate the concentration (as a percent of total fatty acids) and content (as a percent of the dry cell weight) of long-chain polyunsaturated fatty acids ["LC-PUFAs"] in PUFA-producing eukaryotic organisms. These methods rely on disruption of a native peroxisome biogenesis factor ["Pex"] protein within the host and will have widespread applicability to a variety of eukaryotic organisms having native or genetically engineered ability to produce PUFAs, including algae, fungi, oomycetes, yeast, euglenoids, stramenopiles, plants and some mammalian systems.
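The two measures distinguished above, concentration (percent of TFAs) and content (percent of DCW), are related by straightforward arithmetic: PUFA as a percent of DCW equals PUFA as a percent of TFAs multiplied by TFAs as a percent of DCW, divided by 100. The sketch below illustrates this; the function name and the strain values are hypothetical and are not results from this application.

```python
def pufa_percent_dcw(pufa_pct_tfas, tfas_pct_dcw):
    """PUFA content (% DCW) from concentration (% TFAs) and TFAs (% DCW)."""
    for name, value in (("pufa_pct_tfas", pufa_pct_tfas),
                        ("tfas_pct_dcw", tfas_pct_dcw)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100")
    return pufa_pct_tfas * tfas_pct_dcw / 100.0


# Hypothetical strains, for illustration only.
# Strain A: higher PUFA concentration but lower total lipid content.
strain_a = pufa_percent_dcw(pufa_pct_tfas=40.0, tfas_pct_dcw=20.0)
# Strain B: lower PUFA concentration but higher total lipid content.
strain_b = pufa_percent_dcw(pufa_pct_tfas=30.0, tfas_pct_dcw=30.0)

print(f"strain A: {strain_a:.1f}% DCW, strain B: {strain_b:.1f}% DCW")
```

The example shows why both measures are reported: a strain can have the higher PUFA concentration in its fatty acids while still having the lower PUFA content per unit of dry cell weight.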

[0039] PUFAs, or derivatives thereof, are used as dietary substitutes or supplements, particularly in infant formulas, for patients undergoing intravenous feeding, or for preventing or treating malnutrition. For example, PUFAs may be incorporated into cooking oils, fats or margarines and ingested as part of a consumer's typical diet, thereby giving the consumer desired dietary supplementation. Further, PUFAs may also be incorporated into infant formulas, nutritional supplements or other food products and may find use as anti-inflammatory or cholesterol-lowering agents. Optionally, the compositions may be used for pharmaceutical use, either human or veterinary.

DEFINITIONS

[0040] In this disclosure, a number of terms and abbreviations are used.

[0041] The following definitions are provided.

[0042] "Open reading frame" is abbreviated as "ORF".

[0043] "Polymerase chain reaction" is abbreviated as "PCR".

[0044] "American Type Culture Collection" is abbreviated as "ATCC".

[0045] "Polyunsaturated fatty acid(s)" is abbreviated as "PUFA(s)".

[0046] "Triacylglycerols" are abbreviated as "TAGs".

[0047] "Total fatty acids" are abbreviated as "TFAs".

[0048] "Fatty acid methyl esters" are abbreviated as "FAMEs".

[0049] "Dry cell weight" is abbreviated as "DCW".

[0050] The term "invention" or "present invention" as used herein is not meant to be limiting but applies generally to any of the inventions defined in the claims or described herein.

[0051] The term "peroxisomes" refers to ubiquitous organelles found in all eukaryotic cells. They have a single lipid bilayer membrane that separates their contents from the cytosol and that contains various membrane proteins essential to the functions described below. Peroxisomes selectively import proteins via an "extended shuttle mechanism". More specifically, there are at least 32 known peroxisomal proteins, also known as peroxins, which participate in the process of importing proteins by means of ATP hydrolysis through the peroxisomal membrane. Some peroxins comprise a specific protein signal, i.e., a peroxisomal targeting signal or "PTS", at either the N-terminus or C-terminus to signal that importation through the peroxisomal membrane should occur. Once cellular proteins are imported into the peroxisome, they are typically subjected to some means of degradation. For example, peroxisomes contain oxidative enzymes, such as catalase, D-amino acid oxidase and uric acid oxidase, that enable degradation of substances that are toxic to the cell. Alternatively, peroxisomes break down fatty acid molecules to produce free molecules of acetyl-CoA, which are exported back to the cytosol, in a process called .beta.-oxidation.

[0052] The terms "peroxisome biogenesis factor protein", "peroxin" and "Pex protein" are interchangeable and refer to proteins involved in peroxisome biogenesis and/or that participate in the process of importing cellular proteins by means of ATP hydrolysis through the peroxisomal membrane. The acronym of a gene that encodes any of these proteins is "Pex gene". A system for nomenclature of Pex genes is described by Distel et al., J. Cell Biol., 135:1-3 (1996). At least 32 different Pex genes have been identified so far in various eukaryotic organisms. Many Pex genes have been isolated from the analysis of mutants that demonstrated abnormal peroxisomal functions or structures. Based on a review by Kiel, J. A. K. W., et al. (Traffic, 7:1291-1303 (2006)), wherein in silico analysis of the genomic sequences of 17 different fungal species was performed, the following Pex proteins were identified: Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex5Bp, Pex5Cp, Pex5/20p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex15p, Pex16p, Pex17p, Pex14/17p, Pex18p, Pex19p, Pex20p, Pex21p, Pex21Bp, Pex22p, Pex22p-like and Pex26p. Thus, each of these proteins is referred to herein as a "Pex protein", a "peroxin" or a "peroxisome biogenesis factor protein", and is encoded by at least one "Pex gene".

[0053] The term "conserved domain" or "motif" refers to a set of amino acids conserved at specific positions along an aligned sequence of evolutionarily related proteins. While amino acids at other positions can vary between homologous proteins, amino acids that are highly conserved at specific positions indicate amino acids that are essential in the structure, the stability, or the activity of a protein. Because they are identified by their high degree of conservation in aligned sequences of a family of protein homologues, they can be used as identifiers, or "signatures", to determine if a protein with a newly determined sequence belongs to a previously identified protein family. Of relevance herein, Pex2p, Pex10p and Pex12p all share a cysteine-rich motif near their carboxyl termini, known as a C.sub.3HC.sub.4 zinc ring finger motif. This motif appears to be required for their activities, involved in protein docking and translocation into the peroxisome (Kiel, J. A. K. W., et al., Traffic, 7:1291-1303 (2006)).

[0054] The term "C.sub.3HC.sub.4 zinc ring finger motif" or "C.sub.3HC.sub.4 motif" generically refers to a conserved cysteine-rich motif that binds two zinc ions, identified by the presence of a sequence of amino acids as set forth in Formula I:

CX.sub.2CX.sub.9-27CX.sub.1-3HX.sub.2CX.sub.2CX.sub.4-48CX.sub.2C Formula I

The C.sub.3HC.sub.4 zinc ring finger motif within the Yarrowia lipolytica gene encoding the peroxisome biogenesis factor 10 protein, i.e., YIPex10p, is located between amino acids 327-364 of SEQ ID NO:10 and is defined by a CX.sub.2CX.sub.11CX.sub.1HX.sub.2CX.sub.2CX.sub.10CX.sub.2C motif (SEQ ID NO:25). The C.sub.3HC.sub.4 zinc ring finger motif within the Y. lipolytica gene encoding the peroxisome biogenesis factor 2 protein, i.e., YIPex2p, is located between amino acids 266-323 of SEQ ID NO:2. The Y. lipolytica peroxisome biogenesis factor 12 protein, i.e., YIPex12p, contains an imperfect C.sub.3HC.sub.4 ring-finger motif located between amino acids 342-391 of SEQ ID NO:11. The protein sequences corresponding to the C.sub.3HC.sub.4 zinc ring finger motif of YIPex10, YIPex2 and YIPex12 are aligned in FIG. 2A; asterisks denote the conserved cysteine or histidine residues of the motif.
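Formula I can be read as a pattern over a protein sequence, where each "X" is any residue and the subscripts give the allowed spacing. The following is a minimal sketch of that reading as a regular expression, not the method used to identify the motifs described herein. The function name and the toy sequence are hypothetical; the toy sequence is synthetic filler built to the YIPex10p spacing and is not SEQ ID NO:10.

```python
import re

# Formula I as a regular expression: "X" (any residue) becomes ".",
# and each subscript becomes a repeat count or range.
C3HC4_FORMULA_I = re.compile(
    r"C.{2}C.{9,27}C.{1,3}H.{2}C.{2}C.{4,48}C.{2}C"
)

# The narrower spacing stated for YIPex10p: CX2CX11CX1HX2CX2CX10CX2C.
C3HC4_PEX10_SPACING = re.compile(
    r"C.{2}C.{11}C.{1}H.{2}C.{2}C.{10}C.{2}C"
)


def find_c3hc4(protein_seq, pattern=C3HC4_FORMULA_I):
    """Return 1-based inclusive (start, end) of the first match, or None."""
    match = pattern.search(protein_seq)
    if match is None:
        return None
    return (match.start() + 1, match.end())


# Hypothetical toy sequence: 5 filler residues, a synthetic motif with the
# YIPex10p spacing (alanine as filler), then 3 more filler residues.
toy_motif = (
    "C" + "AA" + "C" + "A" * 11 + "C" + "A" + "H" + "AA"
    + "C" + "AA" + "C" + "A" * 10 + "C" + "AA" + "C"
)
toy_seq = "MKLSG" + toy_motif + "GGS"

hit = find_c3hc4(toy_seq)
print(hit, "length:", hit[1] - hit[0] + 1)
```

A motif with the YIPex10p spacing is always 38 residues long, which agrees with the stated location at amino acids 327-364.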

[0055] YIPex10, YIPex2 and YIPex12 are thought to form a ring finger complex by protein-protein interaction. The proposed interaction of the cysteine and histidine residues of the YIPex10p C.sub.3HC.sub.4 finger motif with two zinc ions is schematically diagrammed in FIG. 2B.

[0056] The term "Pex10" refers to the gene encoding the peroxisome biogenesis factor 10 protein or peroxisomal assembly protein Peroxin 10, wherein the peroxin protein is hereinafter referred to as "Pex10p". The function of Pex10p has not been clearly elucidated, although studies in other organisms have revealed that Pex10 products are localized in the peroxisomal membrane and are essential to the normal functioning of the organelle. A C.sub.3HC.sub.4 zinc ring finger motif appears to be conserved in the C-terminal region of Pex10p (Kalish, J. E. et al., Mol. Cell. Biol., 15:6406-6419 (1995); Tan, X. et al., J. Cell Biol., 128:307-319 (1995); Warren, D. S., et al., Am. J. Hum. Genet., 63:347-359 (1998)) and is required for enzymatic activity.

[0057] The term "YIPex10" refers to the Yarrowia lipolytica gene encoding the peroxisome biogenesis factor 10 protein, wherein the protein is hereinafter referred to as "YIPex10p". This particular peroxin was recently studied by Sumita et al. (FEMS Microbiol. Lett., 214:31-38 (2002)). The nucleotide sequence of YIPex10 was registered in GenBank under multiple accession numbers, including GenBank Accession No. CAG81606 (SEQ ID NO:10), No. AB036770 (SEQ ID NOs:20, 21 and 22) and No. AJ012084 (SEQ ID NOs:23 and 24). The YIPex10p sequence set forth in SEQ ID NO:24 is 354 amino acids in length. In contrast, the YIPex10p sequences set forth in SEQ ID NO:10 and SEQ ID NO:22 are each 377 amino acids in length, as the 100% identical sequences possess an additional 23 amino acids at the N-terminus of the protein (corresponding to a different start codon than that identified in GenBank Accession No. AJ012084 (SEQ ID NO:24)).

[0058] The term "Pex3" refers to the gene encoding the peroxisome biogenesis factor 3 protein or peroxisomal assembly protein Peroxin 3, wherein the peroxin protein is hereinafter referred to as "Pex3p". Although mechanistic details concerning the function of Pex3p have not been clearly resolved, it is clear that Pex3p is a peroxisomal integral membrane protein required early in peroxisome biogenesis for formation of the peroxisomal membrane (see, e.g., Baerends, R. J. et al., J. Biol. Chem., 271:8887-8894 (1996); Bascom, R. A. et al., Mol. Biol. Cell, 14:939-957 (2003)).

[0059] The term "YIPex3" refers to the Yarrowia lipolytica gene encoding the peroxisome biogenesis factor 3 protein, wherein the protein is hereinafter referred to as "YIPex3p". The nucleotide sequence of YIPex3 was registered in GenBank as Accession No. CAG78565 (SEQ ID NO:3).

[0060] The term "Pex16" refers to the gene encoding the peroxisome biogenesis factor 16 protein or peroxisomal assembly protein Peroxin 16, wherein the peroxin protein is hereinafter referred to as "Pex16p". The function of Pex16p has not been clearly elucidated, although studies in various organisms have revealed that Pex16 products play a role in the formation of the peroxisomal membrane and regulation of peroxisomal proliferation (Platta, H. W. and R. Erdmann, Trends Cell Biol., 17(10):474-484 (2007)).

[0061] The term "YIPex16" refers to the Yarrowia lipolytica gene encoding the peroxisome biogenesis factor 16 protein, wherein the protein is hereinafter referred to as "YIPex16p". This particular peroxin was described by Eitzen, G. A., et al. (J. Cell Biol., 137:1265-1278 (1997)) and Titorenko, V. I. et al. (Mol. Cell. Biol., 17:5210-5226 (1997)). The nucleotide sequence of YIPex16 was registered in GenBank as Accession No. CAG79622 (SEQ ID NO:14).

[0062] The term "disruption" in or in connection with a native Pex gene refers to an insertion, deletion, or targeted mutation within a portion of that gene, that results in either a complete gene knockout such that the gene is deleted from the genome and no protein is translated or a translated Pex protein having an insertion, deletion, amino acid substitution or other targeted mutation. The location of the disruption in the protein may be, for example, within the N-terminal portion of the protein or within the C-terminal portion of the protein. The disrupted Pex protein will have impaired activity with respect to the Pex protein that was not disrupted, and can be non-functional. A disruption in a native gene encoding a Pex protein also includes alternate means that result in low or lack of expression of the Pex protein, such as could result via manipulating the regulatory sequences, transcription and translation factors and/or signal transduction pathways or by use of sense, antisense or RNAi technology, etc.

[0063] As used herein, the term "Pex-disrupted organism" refers to any oleaginous eukaryotic organism comprising genes that encode a functional polyunsaturated fatty acid biosynthetic pathway and having a disruption, as defined above, in a native gene that encodes a peroxisome biogenesis factor protein.

[0064] The term "lipids" refers to any fat-soluble (i.e., lipophilic), naturally-occurring molecule. Lipids are a diverse group of compounds that have many key biological functions, such as structural components of cell membranes, energy storage sources and intermediates in signaling pathways. Lipids may be broadly defined as hydrophobic or amphiphilic small molecules that originate entirely or in part from either ketoacyl or isoprene groups. A general overview of lipids, based on the Lipid Metabolites and Pathways Strategy (LIPID MAPS) classification system (National Institute of General Medical Sciences, Bethesda, Md.), is shown below in Table 2.

Table 2

Overview of Lipid Classes

[0065]
Structural Building Block | Lipid Category | Examples Of Lipid Classes
Derived from condensation of ketoacyl subunits | Fatty Acyls | Includes fatty acids, eicosanoids, fatty esters and fatty amides
Derived from condensation of ketoacyl subunits | Glycerolipids | Includes mainly mono-, di- and tri-substituted glycerols, the most well-known being the fatty acid esters of glycerol ["triacylglycerols"]
Derived from condensation of ketoacyl subunits | Glycerophospholipids or Phospholipids | Includes phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositols and phosphatidic acids
Derived from condensation of ketoacyl subunits | Sphingolipids | Includes ceramides, phosphosphingolipids (e.g., sphingomyelins), glycosphingolipids (e.g., gangliosides), sphingosine, cerebrosides
Derived from condensation of ketoacyl subunits | Saccharolipids | Includes acylaminosugars, acylaminosugar glycans, acyltrehaloses, acyltrehalose glycans
Derived from condensation of ketoacyl subunits | Polyketides | Includes halogenated acetogenins, polyenes, linear tetracyclines, polyether antibiotics, flavonoids, aromatic polyketides
Derived from condensation of isoprene subunits | Sterol Lipids | Includes sterols (e.g., cholesterol), C18 steroids (e.g., estrogens), C19 steroids (e.g., androgens), C21 steroids (e.g., progestogens, glucocorticoids and mineralocorticoids), secosteroids, bile acids
Derived from condensation of isoprene subunits | Prenol Lipids | Includes isoprenoids, carotenoids, quinones, hydroquinones, polyprenols, hopanoids

[0066] The term "total lipid fraction" of cells herein refers to all esterified fatty acids of the cell. Various subfractions within the total lipid fraction can be isolated, including the triacylglycerol ["oil"] fraction, phosphatidylcholine fraction and the phosphatidylethanolamine fraction, although this is by no means inclusive of all sub-fractions.

[0067] "Lipid bodies" refer to lipid droplets that are bound by a monolayer of phospholipid and, usually, by specific proteins. These organelles are sites where most organisms transport/store neutral lipids. Lipid bodies are thought to arise from microdomains of the endoplasmic reticulum that contain TAG biosynthesis enzymes. Their synthesis and size appear to be controlled by specific protein components.

[0068] "Neutral lipids" refer to those lipids commonly found in cells in lipid bodies as storage fats and oils and are so called because at cellular pH, the lipids bear no charged groups. Generally, they are completely non-polar with no affinity for water. Neutral lipids generally refer to mono-, di-, and/or triesters of glycerol with fatty acids, also called monoacylglycerol, diacylglycerol or triacylglycerol, respectively, or collectively, acylglycerols. A hydrolysis reaction must occur to release free fatty acids from acylglycerols.

[0069] The terms "triacylglycerols" ["TAGs"] and "oil" are interchangeable and refer to neutral lipids composed of three fatty acyl residues esterified to a glycerol molecule. TAGs can contain long chain PUFAs, as well as shorter saturated and unsaturated fatty acids and longer chain saturated fatty acids. The TAG fraction of cells is also referred to as the "oil fraction", and "oil biosynthesis" generically refers to the synthesis of TAGs in the cell. The oil or TAG fraction is a sub-fraction of the total lipid fraction, although it also constitutes a major part of the total lipid content, measured as the weight of total fatty acids in the cell as a percent of the dry cell weight [see below], in oleaginous organisms. The fatty acid composition in the oil ["TAG"] fraction and the fatty acid composition of the total lipid fraction are generally similar. Thus, an increase or decrease in the concentration of PUFAs in the total lipid fraction will correspond with an increase or decrease in the concentration of PUFAs in the oil ["TAG"] fraction, and vice versa.

[0070] The term "total fatty acids" ["TFAs"] herein refers to the sum of all cellular fatty acids that can be derivatized to fatty acid methyl esters ["FAMEs"] by the base transesterification method (as known in the art) in a given sample, which may be the total lipid fraction or the oil fraction, for example. Thus, total fatty acids include fatty acids from neutral and polar lipid fractions, including the phosphatidylcholine fraction, the phosphatidylethanolamine fraction and the diacylglycerol, monoacylglycerol and triacylglycerol ["TAG or oil"] fractions, but not free fatty acids.

[0071] The term "total lipid content" of cells is a measure of TFAs as a percent of the dry cell weight ["DCW"]. Thus, total lipid content ["TFAs % DCW"] is equivalent to, e.g., milligrams of total fatty acids per 100 milligrams of DCW.

[0072] Generally, the concentration of a fatty acid is expressed herein as a weight percent of TFAs ["% TFAs"], e.g., milligrams of the given fatty acid per 100 milligrams of TFAs. Unless otherwise specifically stated in the disclosure herein, reference to the percent of a given fatty acid with respect to total lipids is equivalent to concentration of the fatty acid as % TFAs (e.g., % EPA of total lipids is equivalent to EPA % TFAs).

[0073] In some cases, it is useful to express the content of a given fatty acid(s) in a cell as its percent of the dry cell weight ["% DCW"]. Thus, for example, eicosapentaenoic acid % DCW would be determined according to the following formula: [(eicosapentaenoic acid % TFAs)*(TFAs % DCW)]/100.
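The relationship among total lipid content (TFAs % DCW), fatty acid concentration (% TFAs) and fatty acid content (% DCW) can be illustrated with a short calculation. This is a sketch only; the function names are hypothetical and the numeric values are invented for illustration and are not data from this disclosure.

```python
def total_lipid_content(tfa_mg, dcw_mg):
    """TFAs % DCW: milligrams of total fatty acids per 100 mg of dry cell weight."""
    return 100.0 * tfa_mg / dcw_mg


def fatty_acid_pct_dcw(fa_pct_tfa, tfa_pct_dcw):
    """Content of one fatty acid as % DCW: (FA % TFAs) * (TFAs % DCW) / 100."""
    return fa_pct_tfa * tfa_pct_dcw / 100.0


# Illustrative values only (assumptions, not measured results):
# 25 mg of TFAs recovered from 100 mg of dry cells, with EPA at 40 % TFAs.
tfa_pct_dcw = total_lipid_content(tfa_mg=25.0, dcw_mg=100.0)
epa_pct_dcw = fatty_acid_pct_dcw(fa_pct_tfa=40.0, tfa_pct_dcw=tfa_pct_dcw)
print(f"TFAs % DCW = {tfa_pct_dcw:.1f}; EPA % DCW = {epa_pct_dcw:.1f}")
```

The calculation makes explicit that a fatty acid's content (% DCW) can change either because its concentration (% TFAs) changes or because the total lipid content changes.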

[0074] The terms "lipid profile" and "lipid composition" are interchangeable and refer to the amount of an individual fatty acid contained in a particular lipid fraction, such as in the total lipid fraction or the oil ["TAG"] fraction, wherein the amount is expressed as a percent of TFAs. The sum of each individual fatty acid present in the mixture should be 100.

[0075] As used herein, the term "fold increase" refers to an increase obtained by multiplying by a number. For example, multiplying by 1.3 a quantity, an amount, a concentration, a weight percent, etc. provides a 1.3 fold increase.

[0076] The term "fatty acids" refers to long chain aliphatic acids (alkanoic acids) of varying chain lengths, from about C.sub.12 to C.sub.22, although both longer and shorter chain-length acids are known. The predominant chain lengths are between C.sub.16 and C.sub.22. The structure of a fatty acid is represented by a simple notation system of "X:Y", where X is the total number of carbon ["C"] atoms in the particular fatty acid and Y is the number of double bonds. Additional details concerning the differentiation between "saturated fatty acids" versus "unsaturated fatty acids", "monounsaturated fatty acids" versus "polyunsaturated fatty acids" ["PUFAs"], and "omega-6 fatty acids" [".omega.-6" or "n-6"] versus "omega-3 fatty acids" [".omega.-3" or "n-3"] are provided in U.S. Pat. No. 7,238,482, which is hereby incorporated herein by reference.

[0077] Nomenclature used to describe PUFAs herein is given in Table 3. In the column titled "Shorthand Notation", the omega-reference system is used to indicate the number of carbons, the number of double bonds and the position of the double bond closest to the omega carbon, counting from the omega carbon, which is numbered 1 for this purpose. The remainder of the Table summarizes the common names of .omega.-3 and .omega.-6 fatty acids and their precursors, the abbreviations that are used throughout the specification and the chemical name of each compound.

TABLE 3

Nomenclature of Polyunsaturated Fatty Acids And Precursors

Common Name | Abbreviation | Chemical Name | Shorthand Notation
Myristic | -- | Tetradecanoic | 14:0
Palmitic | Palmitate | Hexadecanoic | 16:0
Palmitoleic | -- | 9-hexadecenoic | 16:1
Stearic | -- | Octadecanoic | 18:0
Oleic | -- | cis-9-octadecenoic | 18:1
Linoleic | LA | cis-9,12-octadecadienoic | 18:2 .omega.-6
.gamma.-Linolenic | GLA | cis-6,9,12-octadecatrienoic | 18:3 .omega.-6
Eicosadienoic | EDA | cis-11,14-eicosadienoic | 20:2 .omega.-6
Dihomo-.gamma.-Linolenic | DGLA | cis-8,11,14-eicosatrienoic | 20:3 .omega.-6
Arachidonic | ARA | cis-5,8,11,14-eicosatetraenoic | 20:4 .omega.-6
.alpha.-Linolenic | ALA | cis-9,12,15-octadecatrienoic | 18:3 .omega.-3
Stearidonic | STA | cis-6,9,12,15-octadecatetraenoic | 18:4 .omega.-3
Eicosatrienoic | ETrA | cis-11,14,17-eicosatrienoic | 20:3 .omega.-3
Sciadonic | SCI | cis-5,11,14-eicosatrienoic | 20:3b .omega.-6
Juniperonic | JUP | cis-5,11,14,17-eicosatetraenoic | 20:4b .omega.-3
Eicosatetraenoic | ETA | cis-8,11,14,17-eicosatetraenoic | 20:4 .omega.-3
Eicosapentaenoic | EPA | cis-5,8,11,14,17-eicosapentaenoic | 20:5 .omega.-3
Docosatrienoic | DRA | cis-10,13,16-docosatrienoic | 22:3 .omega.-6
Docosatetraenoic | DTA | cis-7,10,13,16-docosatetraenoic | 22:4 .omega.-6
Docosapentaenoic | DPAn-6 | cis-4,7,10,13,16-docosapentaenoic | 22:5 .omega.-6
Docosapentaenoic | DPA | cis-7,10,13,16,19-docosapentaenoic | 22:5 .omega.-3
Docosahexaenoic | DHA | cis-4,7,10,13,16,19-docosahexaenoic | 22:6 .omega.-3

Although the .omega.-3/.omega.-6 PUFAs listed in Table 3 are the most likely to be accumulated in the oil fractions of oleaginous yeast using the methods described herein, this list should not be construed as limiting or as complete.
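The shorthand notation can be derived from the chemical name. The chemical name gives double bond positions counted from the carboxyl end, so for a chain of n carbons whose last double bond begins at carbon d, the double bond closest to the omega carbon is at omega position n - d. The following sketch illustrates that arithmetic; the function name is hypothetical, and the omega class is reported only for fatty acids with two or more double bonds, which is an assumption made to match the layout of Table 3.

```python
def shorthand(carbons, delta_positions):
    """Build an "X:Y omega-Z" label from chain length and carboxyl-end positions.

    carbons: total number of carbon atoms (X).
    delta_positions: double bond positions counted from the carboxyl end,
        as they appear in the chemical name (e.g., 5, 8, 11, 14, 17 for EPA).
    """
    n_double = len(delta_positions)
    if n_double < 2:
        # Saturated and monounsaturated entries carry no omega class here.
        return f"{carbons}:{n_double}"
    omega = carbons - max(delta_positions)
    return f"{carbons}:{n_double} omega-{omega}"


print(shorthand(18, [9, 12]))              # linoleic acid (LA)
print(shorthand(20, [5, 8, 11, 14]))       # arachidonic acid (ARA)
print(shorthand(20, [5, 8, 11, 14, 17]))   # eicosapentaenoic acid (EPA)
```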

[0078] As used herein, the terms "a combination of polyunsaturated fatty acids" or "any combination of polyunsaturated fatty acids" refer to a mixture of any two or more of the polyunsaturated fatty acids listed above in Table 3. Such combination has the attributes of a concentration and of a weight percent that can be measured relative to a variety of concentrations or weight percents in the cell, including relative to the weight percent of the total fatty acids in the cell.

[0079] A metabolic pathway, or biosynthetic pathway, in a biochemical sense, can be regarded as a series of chemical reactions occurring in order within a cell, catalyzed by enzymes, to achieve either the formation of a metabolic product to be used or stored by the cell, or the initiation of another metabolic pathway, which is termed "flux generating step". Many of these pathways are elaborate, and involve a step by step modification of the initial substance to shape it into a product having the exact chemical structure desired.

[0080] The term "PUFA biosynthetic pathway" refers to a metabolic process that converts oleic acid to .omega.-6 fatty acids such as LA, EDA, GLA, DGLA, ARA, DRA, DTA and DPAn-6 and .omega.-3 fatty acids such as ALA, STA, ETrA, ETA, EPA, DPA and DHA. This process is well described in the literature. See, e.g., Int'l App. Pub. No. WO 2006/052870. Briefly, this process involves elongation of the carbon chain through the addition of carbon atoms and desaturation of the elongated molecule through the addition of double bonds, via a series of special elongation and desaturation enzymes termed "PUFA biosynthetic pathway enzymes" that are present in the endoplasmic reticulum membrane. More specifically, "PUFA biosynthetic pathway enzymes" refer to any of the following enzymes (and genes which encode them) associated with the biosynthesis of a PUFA, including: a .DELTA.4 desaturase, a .DELTA.5 desaturase, a .DELTA.6 desaturase, a .DELTA.12 desaturase, a .DELTA.15 desaturase, a .DELTA.17 desaturase, a .DELTA.9 desaturase, a .DELTA.8 desaturase, a .DELTA.9 elongase, a C.sub.14/16 elongase, a C.sub.16/18 elongase, a C.sub.18/20 elongase and/or a C.sub.20/22 elongase.

[0081] The term ".omega.-3/.omega.-6 fatty acid biosynthetic pathway" refers to a set of genes which, when expressed under the appropriate conditions, encode enzymes that catalyze the production of either or both .omega.-3 and .omega.-6 fatty acids. Typically the genes involved in the .omega.-3/.omega.-6 fatty acid biosynthetic pathway encode PUFA biosynthetic pathway enzymes. A representative pathway is illustrated in FIG. 1, providing for the conversion of myristic acid through various intermediates to DHA, which demonstrates how both .omega.-3 and .omega.-6 fatty acids may be produced from a common source. The pathway is naturally divided into two portions, such that one portion generates only .omega.-3 fatty acids and the other portion, only .omega.-6 fatty acids. That portion that generates only .omega.-3 fatty acids is referred to herein as the .omega.-3 fatty acid biosynthetic pathway, whereas that portion that generates only .omega.-6 fatty acids is referred to herein as the .omega.-6 fatty acid biosynthetic pathway.

[0082] The term "functional" as used herein relating to the .omega.-3/.omega.-6 fatty acid biosynthetic pathway, means that some (or all) of the genes in the pathway express active enzymes, resulting in in vivo catalysis or substrate conversion. It should be understood that ".omega.-3/.omega.-6 fatty acid biosynthetic pathway" or "functional .omega.-3/.omega.-6 fatty acid biosynthetic pathway" does not imply that all of the genes listed in the above paragraph are required, as a number of fatty acid products require only the expression of a subset of the genes of this pathway.

[0083] The term ".DELTA.6 desaturase/.DELTA.6 elongase pathway" refers to a PUFA biosynthetic pathway that minimally includes at least one .DELTA.6 desaturase and at least one C.sub.18/20 elongase, thereby enabling biosynthesis of DGLA and/or ETA from LA and ALA, respectively, with GLA and/or STA as intermediate fatty acids. With expression of other desaturases and elongases, ARA, EPA, DPA and DHA may also be synthesized.

[0084] The term ".DELTA.9 elongase/.DELTA.8 desaturase pathway" refers to a PUFA biosynthetic pathway that minimally includes at least one .DELTA.9 elongase and at least one .DELTA.8 desaturase, thereby enabling biosynthesis of DGLA and/or ETA from LA and ALA, respectively, with EDA and/or ETrA as intermediate fatty acids. With expression of other desaturases and elongases, ARA, EPA, DPA and DHA may also be synthesized.

[0085] The term "desaturase" refers to a polypeptide that can desaturate adjoining carbons in a fatty acid by removing a hydrogen from one of the adjoining carbons and thereby introducing a double bond between them. Desaturation produces a fatty acid or precursor of interest. Despite use of the omega-reference system throughout the specification to refer to specific fatty acids, it is more convenient to indicate the activity of a desaturase by counting from the carboxyl end of the substrate using the delta-system. Of particular interest herein are: 1) .DELTA.5 desaturases that catalyze the conversion of the substrate fatty acid, DGLA, to ARA and/or of the substrate fatty acid, ETA, to EPA; 2) .DELTA.17 desaturases that desaturate a fatty acid between the 17.sup.th and 18.sup.th carbon atom numbered from the carboxyl-terminal end of the molecule and which, for example, catalyze the conversion of the substrate fatty acid, ARA, to EPA and/or the conversion of the substrate fatty acid, DGLA, to ETA; 3) .DELTA.6 desaturases that catalyze the conversion of the substrate fatty acid, LA, to GLA and/or the conversion of the substrate fatty acid, ALA, to STA; 4) .DELTA.12 desaturases that catalyze the conversion of the substrate fatty acid, oleic acid, to LA; 5) .DELTA.15 desaturases that catalyze the conversion of the substrate fatty acid, LA, to ALA and/or the conversion of the substrate fatty acid, GLA, to STA; 6) .DELTA.4 desaturases that catalyze the conversion of the substrate fatty acid, DPA, to DHA and/or the conversion of the substrate fatty acid, DTA, to DPAn-6; 7) .DELTA.8 desaturases that catalyze the conversion of the substrate fatty acid, EDA, to DGLA and/or the conversion of the substrate fatty acid, ETrA, to ETA; and, 8) .DELTA.9 desaturases that catalyze the conversion of the substrate fatty acid, palmitate, to palmitoleic acid (16:1) and/or the conversion of the substrate fatty acid, stearic acid, to oleic acid. 
.DELTA.15 and .DELTA.17 desaturases are also occasionally referred to as "omega-3 desaturases", "w-3 desaturases", and/or ".omega.-3 desaturases", based on their ability to convert .omega.-6 fatty acids into their .omega.-3 counterparts (e.g., conversion of LA into ALA and ARA into EPA, respectively). It may be desirable to empirically determine the specificity of a particular fatty acid desaturase by transforming a suitable host with the gene for the fatty acid desaturase and determining its effect on the fatty acid profile of the host.

[0086] The term "elongase" refers to a polypeptide that can elongate a fatty acid carbon chain to produce an acid 2 carbons longer than the fatty acid substrate that the elongase acts upon. This process of elongation occurs in a multi-step mechanism in association with fatty acid synthase, as described in U.S. Pat. App. Pub. No. 2005/0132442 and Int'l App. Pub. No. WO 2005/047480. Examples of reactions catalyzed by elongase systems are the conversion of GLA to DGLA, STA to ETA and EPA to DPA. In general, the substrate selectivity of elongases is somewhat broad but segregated by both chain length and the degree and type of unsaturation. For example, a C.sub.14/16 elongase utilizes a C.sub.14 substrate e.g., myristic acid, a C.sub.16/18 elongase utilizes a C.sub.16 substrate e.g., palmitate, a C.sub.18/20 elongase [also known as a .DELTA.6 elongase as the terms can be used interchangeably] utilizes a C.sub.18 substrate e.g., GLA or STA, and a C.sub.20/22 elongase utilizes a C.sub.20 substrate e.g., EPA. In like manner, a .DELTA.9 elongase is able to catalyze the conversion of LA and ALA to EDA and ETrA, respectively. It is important to note that some elongases have broad specificity and thus a single enzyme may be capable of catalyzing several elongase reactions. For example a single enzyme may thus act as both a C.sub.16/18 elongase and a C.sub.18/20 elongase.
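The desaturase and elongase conversions defined above can be collected into a lookup table, which makes it easy to see that the .DELTA.6 desaturase/.DELTA.6 elongase pathway and the .DELTA.9 elongase/.DELTA.8 desaturase pathway reach the same products through different intermediates. The sketch below is illustrative only: the dictionary holds just the conversions named in the definitions above, the ASCII enzyme labels (e.g., "delta-6 desaturase") and the function name are hypothetical, and no claim is made about enzyme specificity beyond what is stated in the text.

```python
# Substrate -> product conversions as stated in the desaturase and
# elongase definitions. Enzyme labels are ASCII stand-ins chosen for
# this sketch.
REACTIONS = {
    "delta-12 desaturase": {"oleic acid": "LA"},
    "delta-15 desaturase": {"LA": "ALA", "GLA": "STA"},
    "delta-6 desaturase": {"LA": "GLA", "ALA": "STA"},
    "C18/20 elongase": {"GLA": "DGLA", "STA": "ETA"},
    "delta-9 elongase": {"LA": "EDA", "ALA": "ETrA"},
    "delta-8 desaturase": {"EDA": "DGLA", "ETrA": "ETA"},
    "delta-5 desaturase": {"DGLA": "ARA", "ETA": "EPA"},
    "delta-17 desaturase": {"ARA": "EPA", "DGLA": "ETA"},
    "C20/22 elongase": {"EPA": "DPA"},
    "delta-4 desaturase": {"DPA": "DHA", "DTA": "DPAn-6"},
}


def trace(substrate, enzymes):
    """Apply enzymes in order and return the list of fatty acids visited."""
    path = [substrate]
    for enzyme in enzymes:
        conversions = REACTIONS[enzyme]
        current = path[-1]
        if current not in conversions:
            raise ValueError(f"{enzyme} has no listed conversion for {current}")
        path.append(conversions[current])
    return path


# Two routes from LA to DGLA, differing only in the intermediate.
print(trace("LA", ["delta-6 desaturase", "C18/20 elongase"]))
print(trace("LA", ["delta-9 elongase", "delta-8 desaturase"]))
```

Extending either enzyme list with a .DELTA.5 desaturase, a C.sub.20/22 elongase and a .DELTA.4 desaturase, starting from ALA, traces a route through EPA and DPA to DHA.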

[0087] The terms "conversion efficiency" and "percent substrate conversion" refer to the efficiency by which a particular enzyme, such as a desaturase, can convert substrate to product. The conversion efficiency is measured according to the following formula: ([product]/[substrate+product])*100, where `product` includes the immediate product and all products in the pathway derived from it.
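The conversion efficiency formula can be written as a short function. In this sketch the function name and the input values are hypothetical; the values are invented % TFAs figures for a .DELTA.6 desaturase acting on LA, where the "product" term sums GLA together with the downstream fatty acids derived from it.

```python
def conversion_efficiency(substrate, products):
    """Percent substrate conversion: ([product] / [substrate + product]) * 100.

    substrate: amount of remaining substrate (e.g., as % TFAs).
    products: amounts of the immediate product and of every downstream
        product derived from it, in the same units as the substrate.
    """
    total_product = sum(products)
    return 100.0 * total_product / (substrate + total_product)


# Illustrative values only (assumptions, not measured results):
# LA remaining = 20 % TFAs; GLA = 5, DGLA = 10 and ARA = 5 % TFAs.
efficiency = conversion_efficiency(substrate=20.0, products=[5.0, 10.0, 5.0])
print(f"delta-6 desaturase conversion efficiency: {efficiency:.1f}%")
```

Counting downstream products matters: using GLA alone in this example would understate how much LA the desaturase actually converted.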

[0088] The term "oleaginous" refers to those organisms that tend to store their energy source in the form of oil (Weete, In: Fungal Lipid Biochemistry, 2.sup.nd Ed., Plenum, 1980).

[0089] The term "oleaginous yeast" refers to those microorganisms classified as yeasts that can make oil, that is, TAGs. Generally, the cellular oil or TAG content of oleaginous microorganisms follows a sigmoid curve, wherein the concentration of lipid increases until it reaches a maximum at the late logarithmic or early stationary growth phase and then gradually decreases during the late stationary and death phases (Yongmanitchai and Ward, Appl. Environ. Microbiol., 57:419-25 (1991)). Oleaginous microorganisms as referred to herein typically accumulate in excess of about 25% of their dry cell weight as oil or TAGs. Examples of oleaginous yeast include, but are not limited to, the following genera: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces.

[0090] As used herein, the terms "isolated nucleic acid fragment" and "isolated nucleic acid molecule" are used interchangeably and refer to a polymer of RNA or DNA that is single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or more segments of cDNA, genomic DNA or synthetic DNA.

[0091] A nucleic acid fragment is "hybridizable" to another nucleic acid fragment, such as a cDNA, genomic DNA, or RNA molecule, when a single-stranded form of the nucleic acid fragment can anneal to the other nucleic acid fragment under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, 2.sup.nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989), which is hereby incorporated herein by reference, particularly Chapter 11 and Table 11.1.

[0092] A "substantial portion" of an amino acid or nucleotide sequence is that portion comprising enough of the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to putatively identify that polypeptide or gene, either by manual evaluation of the sequence by one skilled in the art, or by computer-automated sequence comparison and identification using algorithms such as BLAST (Basic Local Alignment Search Tool; Altschul, S. F., et al., J. Mol. Biol., 215:403-410 (1990)). In general, a sequence of ten or more contiguous amino acids or of thirty or more contiguous nucleotides is necessary in order to putatively identify a polypeptide or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous nucleotides may be used in sequence-dependent methods of gene identification (e.g., Southern hybridization) and isolation, such as in situ hybridization of microbial colonies or bacteriophage plaques. In addition, short oligonucleotides of 12-15 bases may be used as amplification primers in PCR in order to obtain a particular nucleic acid fragment comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence comprises enough of the sequence to specifically identify and/or isolate a nucleic acid fragment comprising the sequence.

[0093] The term "complementary" is used to describe the relationship between nucleotide bases that are capable of hybridizing to one another. For example, with respect to DNA, adenine is complementary to thymine and cytosine is complementary to guanine.
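The base-pairing rule for DNA can be expressed as a simple translation table. The following sketch, with hypothetical function names and an arbitrary example sequence, computes the reverse complement of a strand and checks whether two strands are fully complementary when aligned antiparallel.

```python
# DNA base pairing: A with T, C with G.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(DNA_COMPLEMENT)[::-1]


def is_complementary(strand_a, strand_b):
    """True if strand_b is the exact reverse complement of strand_a."""
    return reverse_complement(strand_a) == strand_b.upper()


print(reverse_complement("ATGC"))
print(is_complementary("ATGC", "GCAT"))
```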

[0094] The terms "homology" and "homologous" are used interchangeably herein. They refer to nucleic acid fragments wherein changes in one or more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate gene expression or produce a certain phenotype. These terms also refer to modifications of the Pex nucleic acid fragments described herein, such as deletion or insertion of one or more nucleotides that do not substantially alter the functional properties of the resulting nucleic acid fragment relative to the initial, unmodified fragment.

[0095] Moreover, the skilled artisan recognizes that homologous nucleic acid sequences are also defined by their ability to hybridize, under moderately stringent conditions, such as 0.5.times.SSC, 0.1% SDS, 60.degree. C., with the sequences exemplified herein, or to any portion of the nucleotide sequences disclosed herein and which are functionally equivalent thereto.

[0096] "Codon degeneracy" refers to the nature in the genetic code permitting variation of the nucleotide sequence without affecting the amino acid sequence of an encoded polypeptide. The skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for improved expression in a host cell, it is desirable to design the gene such that its frequency of codon usage approaches the frequency of preferred codon usage of the host cell.

[0097] "Synthetic genes" can be assembled from oligonucleotide building blocks that are chemically synthesized using procedures known to those skilled in the art. These oligonucleotide building blocks are annealed and then ligated to form gene segments that are then enzymatically assembled to construct the entire gene. Accordingly, the genes can be tailored for optimal gene expression based on optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled artisan appreciates the likelihood of successful gene expression if codon usage is biased towards those codons favored by the host. Determination of preferred codons can be based on a survey of genes derived from the host cell, where sequence information is available.
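A survey of host genes to determine preferred codons, as described above, can be sketched as follows. This is an illustrative sketch only, not a procedure taken from this disclosure: the function names are hypothetical, and the input is assumed to be a list of in-frame coding sequences from the host.

```python
from collections import Counter


def codon_usage(coding_sequences):
    """Count codons across in-frame coding sequences; return relative frequencies."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        # Step through the sequence in frame, ignoring any trailing partial codon.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def preferred_codon(usage, synonymous_codons):
    """Pick the most frequently used codon among a set of synonymous codons."""
    return max(synonymous_codons, key=lambda codon: usage.get(codon, 0.0))
```

With the toy input `["ATGGCCGCCGCTTAA"]`, the alanine codon GCC accounts for two of five codons, so `preferred_codon(usage, ["GCT", "GCC", "GCA", "GCG"])` selects `"GCC"`. A real survey would draw on many host genes rather than a single short sequence.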

[0098] "Gene" refers to a nucleic acid fragment that expresses a specific protein, and which may refer to the coding region alone or may include regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene, comprising regulatory and coding sequences that are not found together in nature. Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source, but arranged in a manner different than that found in nature. "Endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene that is introduced into the host organism by gene transfer. Foreign genes can comprise native genes inserted into a non-native organism, native genes introduced into a new location within the native host, or chimeric genes. A "transgene" is a gene that has been introduced into the genome by a transformation procedure. A "codon-optimized gene" is a gene having its frequency of codon usage designed to mimic the frequency of preferred codon usage of the host cell.

[0099] "Coding sequence" refers to a DNA sequence that codes for a specific amino acid sequence. "Suitable regulatory sequences" refer to nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Regulatory sequences may include promoters, enhancers, silencers, 5' untranslated leader sequence (e.g., between the transcription start site and the translation initiation codon), introns, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures.

[0100] "Promoter" refers to a DNA sequence capable of controlling the expression of a coding sequence or functional RNA. In general, a coding sequence is located 3' to a promoter sequence. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even comprise synthetic DNA segments. It is understood by those skilled in the art that different promoters may direct the expression of a gene in different tissues or cell types, or at different stages of development, or in response to different environmental or physiological conditions. Promoters that cause a gene to be expressed in most cell types at most times are commonly referred to as "constitutive promoters". It is further recognized that since in most cases the exact boundaries of regulatory sequences have not been completely defined, DNA fragments of different lengths may have identical promoter activity.

[0101] The terms "3' non-coding sequences" and "transcription terminator" refer to DNA sequences located downstream of a coding sequence. This includes polyadenylation recognition sequences and other sequences encoding regulatory signals capable of affecting mRNA processing or gene expression. The polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. The 3' region can influence the transcription, RNA processing or stability, or translation of the associated coding sequence.

[0102] "RNA transcript" refers to the product resulting from RNA polymerase-catalyzed transcription of a DNA sequence. When the RNA transcript is a perfect complementary copy of the DNA sequence, it is referred to as the primary transcript; when it is an RNA sequence derived from post-transcriptional processing of the primary transcript, it is referred to as the mature RNA. "Messenger RNA" or "mRNA" refers to the RNA that is without introns and which can be translated into protein by the cell. "cDNA" refers to a double-stranded DNA that is complementary to, and derived from, mRNA. "Sense" RNA refers to RNA transcript that includes the mRNA and so can be translated into protein by the cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat. No. 5,107,065; Int'l. App. Pub. No. WO 99/28508). The complementarity of an antisense RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence, 3' non-coding sequence, or the coding sequence. "Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA that is not translated and yet has an effect on cellular processes.

[0103] The term "operably linked" refers to the association of nucleic acid sequences on a single nucleic acid fragment so that the function of one is affected by the other. For example, a promoter is operably linked with a coding sequence when it is capable of affecting the expression of that coding sequence. That is, the coding sequence is under the transcriptional control of the promoter. Coding sequences can be operably linked to regulatory sequences in sense or antisense orientation.

[0104] The term "expression", as used herein, refers to the transcription and stable accumulation of sense (mRNA) or antisense RNA derived from nucleic acid fragments. Expression may also refer to translation of mRNA into a polypeptide.

[0105] "Mature" protein refers to a post-translationally processed polypeptide, i.e., one from which any pre- or pro-peptides present in the primary translation product have been removed. "Precursor" protein refers to the primary product of translation of mRNA, i.e., with pre- and pro-peptides still present. Pre- and pro-peptides may be, but are not limited to, intracellular localization signals.

[0106] "Transformation" refers to the transfer of a nucleic acid molecule into a host organism, resulting in genetically stable inheritance. The nucleic acid molecule may be a plasmid that replicates autonomously, for example, or, it may integrate into the genome of the host organism. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" or "recombinant" or "transformed" organisms.

[0107] "Stable transformation" refers to the transfer of a nucleic acid fragment into a genome of a host organism, including both nuclear and organellar genomes, resulting in genetically stable inheritance. In contrast, "transient transformation" refers to the transfer of a nucleic acid fragment into the nucleus, or DNA-containing organelle, of a host organism resulting in gene expression without integration or stable inheritance. Host organisms containing the transformed nucleic acid fragments are referred to as "transgenic" organisms.

[0108] The terms "plasmid" and "vector" refer to an extrachromosomal element often carrying genes that are not part of the central metabolism of the cell, and usually in the form of circular double-stranded DNA fragments. Such elements may be autonomously replicating sequences, genome integrating sequences, phage or nucleotide sequences, linear or circular, of a single- or double-stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction that is capable of introducing an expression cassette(s) into a cell.

[0109] The term "expression cassette" refers to a fragment of DNA comprising the coding sequence of a selected gene and regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding sequences) the coding sequence that are required for expression of the selected gene product. Thus, an expression cassette is typically composed of: 1) a promoter sequence; 2) a coding sequence, i.e., open reading frame ["ORF"] and, 3) a 3' untranslated region, i.e., a terminator that in eukaryotes usually contains a polyadenylation site. The expression cassette(s) is usually included within a vector, to facilitate cloning and transformation. Different expression cassettes can be transformed into different organisms including bacteria, yeast, plants and mammalian cells, as long as the correct regulatory sequences are used for each host.

[0110] The term "percent identity" refers to a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, as determined by comparing the sequences. "Identity" also means the degree of sequence relatedness between polypeptide or polynucleotide sequences, as the case may be, as determined by the percentage of match between compared sequences. "Percent identity" and "percent similarity" can be readily calculated by known methods, including but not limited to those described in: 1) Computational Molecular Biology (Lesk, A. M., Ed.) Oxford University: NY (1988); 2) Biocomputing: Informatics and Genome Projects (Smith, D. W., Ed.) Academic: NY (1993); 3) Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., Eds.) Humana: NJ (1994); 4) Sequence Analysis in Molecular Biology (von Heijne, G., Ed.) Academic (1987); and, 5) Sequence Analysis Primer (Gribskov, M. and Devereux, J., Eds.) Stockton: NY (1991).

[0111] Preferred methods to determine percent identity are designed to give the best match between the sequences tested. Methods to determine percent identity and percent similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the MegAlign.TM. program of the LASERGENE bioinformatics computing suite (DNASTAR Inc., Madison, Wis.). Multiple alignment of the sequences is performed using the "Clustal method of alignment" which encompasses several varieties of the algorithm including the "Clustal V method of alignment" and the "Clustal W method of alignment" (described by Higgins and Sharp, CABIOS, 5:151-153 (1989); Higgins, D. G. et al., Comput. Appl. Biosci., 8:189-191 (1992)) and found in the MegAlign.TM. v6.1 program of the LASERGENE bioinformatics computing suite (DNASTAR Inc.). After alignment of the sequences using either Clustal program, it is possible to obtain a "percent identity" by viewing the "sequence distances" table in the program.
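As a rough illustration of what a percent identity figure expresses, the following sketch computes it for two sequences that have already been aligned. It is not the MegAlign.TM. calculation, whose internals are not described here. The function name is hypothetical, and the denominator (all alignment columns except those that are gaps in both sequences) is one common convention among several, so results may differ from those reported by a given program.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    # A column counts as a match only if the residues agree and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

For the toy alignment `"MKT-LV"` against `"MKTALV"`, five of six columns match, giving roughly 83.3% identity.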

[0112] It is well understood by one skilled in the art that various measures of sequence percent identity are useful in identifying polypeptides, from other species, wherein such polypeptides have the same or similar function or activity. Useful examples of percent identities include, but are not limited to, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or any integer percentage from 50% to 100%. Indeed, any integer amino acid identity from 50% to 100% may be useful in describing suitable nucleic acid fragments (isolated polynucleotides) encoding polypeptides in methods and host cells described herein, such as 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%. In some cases, suitable nucleic acid fragments (isolated polynucleotides) encode polypeptides that are at least about 70% identical, preferably at least about 75% identical, and more preferably at least about 80% identical to the amino acid sequences reported herein. Preferred nucleic acid fragments encode amino acid sequences that are at least about 85% identical to the amino acid sequences reported herein. More preferred nucleic acid fragments encode amino acid sequences that are at least about 90% identical to the amino acid sequences reported herein. Most preferred are nucleic acid fragments that encode amino acid sequences that are at least about 95% identical to the amino acid sequences reported herein.

[0113] Suitable nucleic acid fragments not only have the above homologies but typically encode a polypeptide having at least 50 amino acids, preferably at least 100 amino acids, more preferably at least 150 amino acids, still more preferably at least 200 amino acids, and most preferably at least 250 amino acids.

[0114] The term "sequence analysis software" refers to any computer algorithm or software program that is useful for the analysis of nucleotide or amino acid sequences. "Sequence analysis software" may be commercially available or independently developed. Typical sequence analysis software includes, but is not limited to: 1) the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.); 2) BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol. Biol., 215:403-410 (1990)); 3) DNASTAR (DNASTAR, Inc. Madison, Wis.); 4) Sequencher (Gene Codes Corporation, Ann Arbor, Mich.); and, 5) the FASTA program incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Plenum: New York, N.Y.). Within this description, whenever sequence analysis software is used for analysis, the analytical results are based on the "default values" of the program referenced, unless otherwise specified. As used herein "default values" means any set of values or parameters that originally load with the software when first initialized.

[0115] Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described by Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2.sup.nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989) (hereinafter "Maniatis"); by Silhavy, T. J., Berman, M. L. and Enquist, L. W., Experiments with Gene Fusions, Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1984); and by Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience, Hoboken, N.J. (1987).

An Overview: Biosynthesis of Fatty Acids and Triacylglycerols

[0116] In general, lipid accumulation in oleaginous microorganisms is triggered in response to the overall carbon to nitrogen ratio present in the growth medium. This process, leading to the de novo synthesis of free palmitate (16:0) in oleaginous microorganisms, is described in detail in U.S. Pat. No. 7,238,482. Palmitate is the precursor of longer-chain saturated and unsaturated fatty acid derivatives, which are formed through the action of elongases and desaturases (FIG. 1).

[0117] TAGs, the primary storage unit for fatty acids, are formed by a series of reactions that involve: 1) esterification of one molecule of acyl-CoA to glycerol-3-phosphate via an acyltransferase to produce lysophosphatidic acid; 2) esterification of a second molecule of acyl-CoA via an acyltransferase to yield 1,2-diacylglycerol phosphate, commonly identified as phosphatidic acid; 3) removal of a phosphate by phosphatidic acid phosphatase to yield 1,2-diacylglycerol ["DAG"]; and, 4) addition of a third fatty acid by the action of an acyltransferase to form the TAG.

[0118] A wide spectrum of fatty acids can be incorporated into TAGs, including saturated and unsaturated fatty acids and short-chain and long-chain fatty acids. Some non-limiting examples of fatty acids that can be incorporated into TAGs by acyltransferases include: capric (10:0), lauric (12:0), myristic (14:0), palmitic (16:0), palmitoleic (16:1), stearic (18:0), oleic (18:1), vaccenic (18:1), LA (18:2), eleostearic (18:3), GLA (18:3), ALA (18:3), STA (18:4), arachidic (20:0), EDA (20:2), DGLA (20:3), ETrA (20:3), ARA (20:4), ETA (20:4), EPA (20:5), behenic (22:0), DPA (22:5), DHA (22:6), lignoceric (24:0), nervonic (24:1), cerotic (26:0) and montanic (28:0) fatty acids. In the methods and host cells described herein, incorporation of "long-chain" PUFAs into TAGs may be most desirable, wherein long-chain PUFAs include any fatty acid derived from an 18:1 substrate having at least 18 carbons in length, i.e., C.sub.18 or greater. This also includes hydroxylated fatty acids, epoxy fatty acids and conjugated linoleic acid.

[0119] Although most PUFAs are incorporated into TAGs as neutral lipids and are stored in lipid bodies, it is important to note that a measurement of the total PUFAs within an oleaginous organism should include those PUFAs that are located in the phosphatidylcholine fraction, the phosphatidylethanolamine fraction, and the triacylglycerol fraction, also known as the TAG or oil fraction.

Biosynthesis of Omega Fatty Acids

[0120] The metabolic process wherein oleic acid is converted to .omega.-3/.omega.-6 fatty acids involves elongation of the carbon chain through the addition of carbon atoms and desaturation of the molecule through the addition of double bonds. This requires a series of special desaturation and elongation enzymes present in the endoplasmic reticulum membrane. However, as seen in FIG. 1 and as described below, there are often multiple alternate pathways for production of a specific .omega.-3/.omega.-6 fatty acid.

[0121] Specifically, FIG. 1 depicts the pathways described below. All pathways require the initial conversion of oleic acid to linoleic acid ["LA"], the first of the .omega.-6 fatty acids, by a .DELTA.12 desaturase. Then, using the ".DELTA.6 desaturase/.DELTA.6 elongase pathway" and LA as substrate, long-chain .omega.-6 fatty acids are formed as follows: 1) LA is converted to .gamma.-linolenic acid ["GLA"] by a .DELTA.6 desaturase; 2) GLA is converted to dihomo-.gamma.-linolenic acid ["DGLA"] by a C.sub.18/20 elongase; 3) DGLA is converted to arachidonic acid ["ARA"] by a .DELTA.5 desaturase; 4) ARA is converted to docosatetraenoic acid ["DTA"] by a C.sub.20/22 elongase; and, 5) DTA is converted to docosapentaenoic acid ["DPAn-6"] by a .DELTA.4 desaturase.

[0122] Alternatively, the ".DELTA.6 desaturase/.DELTA.6 elongase pathway" can use .alpha.-linolenic acid ["ALA"] as substrate to produce long-chain .omega.-3 fatty acids as follows: 1) LA is converted to ALA, the first of the .omega.-3 fatty acids, by a .DELTA.15 desaturase; 2) ALA is converted to stearidonic acid ["STA"] by a .DELTA.6 desaturase; 3) STA is converted to eicosatetraenoic acid ["ETA"] by a C.sub.18/20 elongase; 4) ETA is converted to eicosapentaenoic acid ["EPA"] by a .DELTA.5 desaturase; 5) EPA is converted to docosapentaenoic acid ["DPA"] by a C.sub.20/22 elongase; and, 6) DPA is converted to docosahexaenoic acid ["DHA"] by a .DELTA.4 desaturase. Optionally, .omega.-6 fatty acids may be converted to .omega.-3 fatty acids. For example, ETA and EPA are produced from DGLA and ARA, respectively, by .DELTA.17 desaturase activity.

[0123] Alternate pathways for the biosynthesis of .omega.-3/.omega.-6 fatty acids utilize .DELTA.9 elongase and .DELTA.8 desaturase, that is, the ".DELTA.9 elongase/.DELTA.8 desaturase pathway". More specifically, LA and ALA may be converted to EDA and ETrA, respectively, by a .DELTA.9 elongase. A .DELTA.8 desaturase then converts EDA to DGLA and/or ETrA to ETA. Downstream PUFAs are subsequently formed as described above.
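The alternate pathways described in the preceding paragraphs can be summarized as a small directed graph, which makes it easy to enumerate every enzyme sequence leading from one fatty acid to another. The sketch below is illustrative only: the dictionary `PATHWAY` and the function `routes` are hypothetical names, the edges are transcribed from the text above rather than from FIG. 1 itself, and enzyme names are written in plain ASCII (for example, "delta-6 desaturase").

```python
# Each substrate maps to (product, enzyme) pairs, as described in the text.
PATHWAY = {
    "oleic acid": [("LA", "delta-12 desaturase")],
    "LA": [
        ("GLA", "delta-6 desaturase"),
        ("ALA", "delta-15 desaturase"),
        ("EDA", "delta-9 elongase"),
    ],
    "GLA": [("DGLA", "C18/20 elongase")],
    "EDA": [("DGLA", "delta-8 desaturase")],
    "DGLA": [("ARA", "delta-5 desaturase"), ("ETA", "delta-17 desaturase")],
    "ARA": [("DTA", "C20/22 elongase"), ("EPA", "delta-17 desaturase")],
    "DTA": [("DPAn-6", "delta-4 desaturase")],
    "ALA": [("STA", "delta-6 desaturase"), ("ETrA", "delta-9 elongase")],
    "STA": [("ETA", "C18/20 elongase")],
    "ETrA": [("ETA", "delta-8 desaturase")],
    "ETA": [("EPA", "delta-5 desaturase")],
    "EPA": [("DPA", "C20/22 elongase")],
    "DPA": [("DHA", "delta-4 desaturase")],
}


def routes(substrate, product, graph=PATHWAY):
    """Enumerate every ordered list of enzymes converting substrate to product."""
    if substrate == product:
        return [[]]
    found = []
    # The graph has no cycles, so this depth-first search always terminates.
    for next_fatty_acid, enzyme in graph.get(substrate, []):
        for tail in routes(next_fatty_acid, product, graph):
            found.append([enzyme] + tail)
    return found
```

Calling `routes("LA", "EPA")` lists the enzyme combinations by which LA can be converted to EPA under these edges, including routes through the .DELTA.6 desaturase/.DELTA.6 elongase pathway, the .DELTA.9 elongase/.DELTA.8 desaturase pathway, and the .DELTA.17 desaturase conversions.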

[0124] The host organism herein must possess the ability to produce PUFAs, either naturally or via techniques of genetic engineering. Although many microorganisms can synthesize PUFAs (including .omega.-3/.omega.-6 fatty acids) in the ordinary course of cellular metabolism, some of which could be commercially cultured, few to none of these organisms produce oils having a desired oil content and composition for use in pharmaceuticals, dietary substitutes, medical foods, nutritional supplements, other food products, industrial oleochemicals or other end-use applications. Thus, there is increasing emphasis on the ability to engineer microorganisms for production of "designer" lipids and oils, wherein the fatty acid content and composition are carefully specified by genetic engineering. On this basis, it is expected that the host likely, but not necessarily, comprises heterologous genes encoding a functional PUFA biosynthetic pathway.

[0125] If the host organism does not natively produce the desired PUFAs or possess the desired lipid profile, one skilled in the art is familiar with the considerations and techniques necessary to introduce one or more expression cassettes encoding appropriate enzymes for PUFA biosynthesis into the host organism of choice. Numerous teachings are provided in the literature to one of skill for so introducing such expression cassettes into various host organisms. Some references using the host organism Yarrowia lipolytica are provided as follows: U.S. Pat. No. 7,238,482; Int'l. App. Pub. No. WO 2006/033723, Pat. Appl. Pub. No. US-2006-0094092, Pat. Appl. Pub. No. US-2006-0115881-A1 and Pat. Appl. Pub. No. US-2006-0110806-A1. This list is not exhaustive and should not be construed as limiting.

[0126] Briefly, a variety of .omega.-3/.omega.-6 PUFA products can be produced prior to their transfer to TAGs, depending on the fatty acid substrate and the particular genes of the .omega.-3/.omega.-6 fatty acid biosynthetic pathway that are present in or transformed into the host cell. As such, production of the desired fatty acid product can occur directly or indirectly. Direct production occurs when the fatty acid substrate is converted directly into the desired fatty acid product without any intermediate steps or pathway intermediates. Indirect production occurs when multiple genes encoding the PUFA biosynthetic pathway are used in combination such that a series of reactions occurs to produce a desired PUFA. Specifically, it may be desirable to transform an oleaginous yeast with an expression cassette comprising a .DELTA.12 desaturase, .DELTA.6 desaturase, a C.sub.18/20 elongase, a .DELTA.5 desaturase and a .DELTA.17 desaturase for the overproduction of EPA. See U.S. Pat. No. 7,238,482 and Int'l. App. Pub. No. WO 2006/052870. As is well known to one skilled in the art, various other combinations of genes encoding enzymes of the PUFA biosynthetic pathway may be useful to express in an oleaginous organism (see FIG. 1). The particular genes included within a particular expression cassette depend on the host organism, its PUFA profile and/or desaturase/elongase profile, the availability of substrate and the desired end product(s).

[0127] A number of candidate genes having the desired desaturase and/or elongase activities can be identified according to publicly available literature, such as GenBank, the patent literature, and experimental analysis of organisms having the ability to produce PUFAs. Useful desaturase and elongase sequences may be derived from any source, e.g., isolated from a natural source such as from bacteria, algae, fungi, oomycete, yeast, plants, animals, etc., produced via a semi-synthetic route or synthesized de novo. Following the identification of these candidate genes, considerations for choosing a specific polypeptide having desaturase or elongase activity include: 1) the substrate specificity of the polypeptide; 2) whether the polypeptide or a component thereof is a rate-limiting enzyme; 3) whether the desaturase or elongase is essential for synthesis of a desired PUFA; 4) co-factors required by the polypeptide; and/or, 5) whether the polypeptide is modified after its production, such as by a kinase or a prenyltransferase.

[0128] The expressed polypeptide preferably has parameters compatible with the biochemical environment of its location in the host cell. See U.S. Pat. No. 7,238,482. It may also be useful to consider the conversion efficiency of each particular desaturase and/or elongase. More specifically, since each enzyme rarely functions with 100% efficiency to convert substrate to product, the final lipid profile of un-purified oils produced in a host cell is typically a mixture of various PUFAs consisting of the desired .omega.-3/.omega.-6 fatty acid, as well as various upstream intermediary PUFAs. Thus, the conversion efficiency of each enzyme is also a variable to consider when optimizing biosynthesis of a desired fatty acid.
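The effect of imperfect conversion efficiency on the final lipid profile can be illustrated with a simple calculation. The sketch below assumes a strictly linear pathway in which each enzyme converts a fixed fraction of the substrate reaching it and the unconverted remainder accumulates as that intermediate. This is a simplifying assumption for illustration, not a model taken from this disclosure; the function name and the efficiency values in the example are hypothetical.

```python
def pathway_profile(start, steps, start_amount=100.0):
    """Distribute a starting substrate across intermediates and final product.

    `steps` is an ordered list of (product_name, efficiency) pairs, where
    efficiency is the fraction (0 to 1) of incoming substrate converted.
    """
    profile = {}
    flux = start_amount
    substrate = start
    for product, efficiency in steps:
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("efficiency must be between 0 and 1")
        converted = flux * efficiency
        # Whatever is not converted remains as the upstream intermediate.
        profile[substrate] = flux - converted
        flux = converted
        substrate = product
    profile[substrate] = flux
    return profile
```

For instance, `pathway_profile("LA", [("GLA", 0.5), ("DGLA", 0.5)])` leaves half of the starting LA unconverted and splits the remainder equally between GLA and DGLA, showing how upstream intermediates persist alongside the desired product when no step reaches 100% efficiency.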

Peroxisome Biogenesis and Pex Genes

[0129] As previously described, peroxisomes are ubiquitous organelles found in all eukaryotic cells. Their primary role is the degradation of various substances within a localized organelle of the cell, such as toxic compounds, fatty acids, etc. For example, the process of .beta.-oxidation, wherein fatty acid molecules are broken down to ultimately produce free molecules of acetyl-CoA (which are exported back to the cytosol), can occur in peroxisomes. Although the process of .beta.-oxidation in mitochondria results in ATP synthesis, .beta.-oxidation in peroxisomes causes the transfer of high-potential electrons to O.sub.2 and results in the formation of H.sub.2O.sub.2, which is subsequently converted to water and O.sub.2 by peroxisome catalases. Very long chain, such as C.sub.18 to C.sub.22, fatty acids undergo initial .beta.-oxidation in peroxisomes, followed by mitochondrial .beta.-oxidation.
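The stoichiometry of the .beta.-oxidation process described above can be sketched with a short calculation. The sketch assumes complete oxidation of a saturated, even-chain fatty acid, with one molecule of H.sub.2O.sub.2 formed per peroxisomal cycle; unsaturated fatty acids require additional enzymatic steps that are not modeled here. The function name is hypothetical.

```python
def beta_oxidation_tally(carbons: int) -> dict:
    """Tally cycles and products for complete beta-oxidation of a saturated,
    even-chain fatty acid with the given number of carbons."""
    if carbons < 4 or carbons % 2:
        raise ValueError("expected an even chain length of at least 4 carbons")
    # Each cycle removes two carbons as acetyl-CoA; the final cycle
    # splits a four-carbon unit into two acetyl-CoA molecules.
    cycles = carbons // 2 - 1
    return {"cycles": cycles, "acetyl_coa": carbons // 2, "h2o2": cycles}
```

For palmitate (16:0), `beta_oxidation_tally(16)` gives 7 cycles and 8 molecules of acetyl-CoA.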

[0130] The proteins responsible for importing proteins through the peroxisomal membrane by means of ATP hydrolysis are known as peroxisome biogenesis factor proteins, or "peroxins". These peroxisome biogenesis factor proteins also include those proteins involved in peroxisome biogenesis/assembly. The gene acronym for peroxisome biogenesis factor proteins is Pex; and, a system for nomenclature is described by Distel et al., J. Cell Biol., 135:1-3 (1996). At least 32 different Pex genes have been identified so far in various eukaryotic organisms. In fungi, however, the recent review of Kiel et al. (Traffic, 7:1291-1303 (2006)) suggests that the minimal requirement for peroxisome biogenesis/matrix protein import is 17 proteins, namely Pex1p, Pex2p, Pex3p, Pex4p, Pex5p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex17p, Pex19p, Pex20p, Pex22p and Pex26p. These proteins act in a coordinated fashion to proliferate (duplicate) peroxisomes and import proteins via translocation into peroxisomes (reviewed in Waterham, H. R. and J. M. Cregg. BioEssays. 19(1):57-66 (1996)).

[0131] Many Pex genes were initially isolated from the analysis of mutants that demonstrated abnormal peroxisomal functions or structures. With the availability of complete genome sequences, however, it is becoming increasingly easy to identify Pex genes via computer sequence searches based on homology. Kiel et al. (Traffic, 7:1291-1303 (2006)) cite strong conservation of the peroxisome biogenesis machinery, despite occasional low sequence similarity. More specifically, within the yeast and filamentous fungi, their data indicate that almost all Pex proteins identified thus far are conserved. Table 4, below, shows peroxisome biogenesis factor proteins identified by Kiel et al. (supra) in Saccharomyces cerevisiae, Candida glabrata, Ashbya gossypii, Kluyveromyces lactis, Candida albicans, Debaryomyces hansenii, Pichia pastoris, Hansenula polymorpha, Yarrowia lipolytica, Aspergillus fumigatus, Aspergillus nidulans, Penicillium chrysogenum, Magnaporthe grisea, Neurospora crassa, Gibberella zeae, Ustilago maydis, Cryptococcus neoformans var. neoformans and Schizosaccharomyces pombe.

TABLE 4
GenBank Accession Numbers Of Fungal Peroxisome Biogenesis Factor Proteins [Recreated From Table 2 of Kiel et al. (Traffic, 7:1291-1303 (2006))]

| Protein | Saccharomyces cerevisiae | Candida glabrata | Ashbya gossypii | Kluyveromyces lactis | Candida albicans | Debaryomyces hansenii | Pichia pastoris | Hansenula polymorpha | Yarrowia lipolytica |
|---|---|---|---|---|---|---|---|---|---|
| Pex1p | CAA82041 | CAG60131 | AAS53742 | CAH02218 | EAL02496 | CAG89689 | CAA85450 | AAD52811 | CAG82178 |
| Pex2p | CAA89508 | CAG60461 | AAS50677 | CAH00186 | EAK95929 | CAG85956 | CAA65646 | AAT97412 | CAG77647 |
| Pex3p | AAB64764 | CAG62379 | AAS52217 | CAG99801 | EAK94771 | CAG89890 | CAA96530 | AAC49471 | CAG78565 |
| Pex3Bp | -- | -- | -- | -- | -- | -- | na | -- | CAG83356 |
| Pex4p | CAA97146 | CAG60639 | AAS53685 | CAG99212 | EAL03336 | CAG87262 | AAA53634 | AAC16238 | CAG79130 |
| Pex5p | CAA89730 | CAG61665 | AAS53824 | CAH01742 | EAK94251 | CAG89098 | AAB40613 | AAC49040 | CAG78803 |
| Pex5Bp | -- | CAG61076 | -- | -- | -- | -- | na | -- | -- |
| Pex5Cp (Ymr018wp) | CAA89120 | -- | -- | -- | -- | -- | na | -- | -- |
| Pex5/20p | -- | -- | -- | -- | -- | -- | na | -- | -- |
| Pex5Rp | -- | -- | -- | -- | -- | -- | na | -- | -- |
| Pex6p | AAA16574 | CAG58438 | AAS54884 | CAG99125 | EAK95956 | CAG87108 | CAA80278 | AAD52812 | CAG82306 |
| Pex7p | CAA57183 | CAG57936 | AAS54301 | CAG99215 | EAK95226 | CAG87150 | AAC08303 | ABA64462 | CAG78389 |
| Pex8p | CAA97079 | CAG61238 | AAS52889 | CAH01253 | EAK91777, EAK91778* | CAG89446 | AAC41653 | CAA82928 | CAG80447 |
| Pex9p | ORF wrongly identified | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex10p | AAB64453 | CAG62699 | AAS53069 | CAG99788 | Translation of AACQ01000128, nucleotides 37281-36306 (contains intron) | CAG89101 | AAB09086 | CAA86101 | CAG81606 |
| Pex12p | CAA89129 | CAG62649 | AAS50837 | CAG99378 | EAL00707 | CAG84342 | AAC49402 | AAM66157 | CAG81532 |
| Pex13p | AAB46885 | CAG57840 | AAS51456 | CAG99931 | EAK97421 | CAG86337 | AAB09087 | DQ345349 | CAG81789 |
| Pex14p | AAS56829 | CAG58828 | AAS54871 | CAG99440 | EAK90926 | CAG91028 | AAG28574 | AAB40596 | CAG79323 |
| Pex15p | CAA99046 | CAG58938 | AAS51506 | CAG98135 | -- | -- | na | -- | -- |
| Pex16p | -- | -- | -- | -- | -- | -- | na | -- | CAG79622 |
| Pex17p | CAA96116 | CAG61398 | AAS50595 | CAH01010 | EAK95385 | CAG86168 | AAF19606 | DQ345350 | CAG84025 |
| Pex14/17p | -- | -- | -- | -- | -- | -- | na | -- | -- |
| Pex18p | AAB68992 | -- | -- | -- | -- | -- | na | -- | -- |
| Pex19p | CAA98630 | CAG58359 | AAS52741 | CAG99258 | EAK97275 | CAG84799 | AAD43507 | AAK84070 | AAK84827 |
| Pex20p | -- | -- | -- | -- | EAK91603, EAK94766* | CAG87898 | AAX11696 | AAX14715 | CAG79226 |
| Pex21p | CAA97267 | CAG59241 | AAS51769 | CAG99735 | -- | -- | na | -- | -- |
| Pex21Bp | -- | CAG60281 | -- | -- | -- | -- | na | -- | -- |
| Pex22p | AAC04978 | CAG60970 | AAS52329 | CAG97800 | EAK91040 | CAG88727 | AAD45664 | DQ384616 | CAG77876 |
| Pex22p-like | -- | -- | -- | -- | -- | na | -- | -- | EAL90994 |
| Pex26p | -- | -- | -- | -- | EAK91093 | CAG88929 | na | DQ645588 | Antisense translation of NC_006072, nucleotides 117230-118387 |

| Protein | Aspergillus fumigatus | Aspergillus nidulans | Penicillium chrysogenum | Magnaporthe grisea | Neurospora crassa | Gibberella zeae | Ustilago maydis | Cryptococcus neoformans var. neoformans | Schizosaccharomyces pombe |
|---|---|---|---|---|---|---|---|---|---|
| Pex1p | EAL93310 | EAA57740 | AAG09748 | XP_364454 | EAA34641 | EAA76787 | EAK85195 | AAW43248 | CAA19256 |
| Pex2p | EAL88068 | EAA58944 | DQ793192 | XP_368589 | EAA35361 | EAA70670 | EAK81310 | AAW40683 | CAA16981 |
| Pex3p | EAL91965 | EAA64392 | DQ793193 | XP_369909 | EAA33751 | EAA76989 | EAK87104 | AAW42444 | CAB10141 |
| Pex3Bp | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex4p | EAL87211 | Translation of AACD01000130, nucleotides 150195-150738 (contains intron) | DQ793194 | XP_369064 | EAA34737 | EAA76379 | Translation of AACP01000006, nucleotides 97041-96550 (contains intron) | -- | CAB91184 |
| Pex5p | EAL85289 | EAA63772 | AAR12222 | XP_360528 | EAA36111 | EAA68640 | EAK83659 | AAW46349 | CAA22179 |
| Pex5Bp | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex5Cp | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex5/20p | -- | -- | -- | -- | -- | -- | EAK82973 | AAW41849 | -- |
| Pex5Rp | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex6p | EAL92776 | EAA63496 | AAG09749 | XP_368715 | EAA36040 | EAA73732 | EAK83459 | AAW45333 | CAB11501 |
| Pex7p | EAL90870 | EAA65909 | DQ793195 | XP_363555 | AAN39560 | EAA74171 | EAK84499 | AAW41119 | P78798 |
| Pex8p | EAL93137 | EAA57947 | DQ793196 | XP_359449 | EAA27783 | EAA77627 | EAK83936 | AAW43468 | CAB53406 |
| Pex9p | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex10p | EAL87045 | EAA62774 | DQ793197 | XP_369099 | EAA34967 | EAA76761 | EAK83811 | AAW45079 | CAB51769 |
| Pex12p | EAL93972 | EAA61357 | DQ793198 | XP_363845 | EAA32773 | EAA76413 | EAK81282 | AAW46724 | CAD27496 |
| Pex13p | EAL85282 | EAA63824 | DQ793199 | XP_369087 | EAA35785 | EAA68396 | EAK84395 | AAW42381 | CAB16740 |
| Pex14p | EAL92562 | EAA61046 | DQ793200 | XP_368216 | EAA28304 | EAA76904 | EAK83123 | AAW46857 | CAA18656 |
| Pex15p | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex16p | EAL88469 | EAA62294 | DQ793201 | XP_364166 | EAA34648 | EAA71849 | EAK82801 | AAW43797 | CAA22819 |
| Pex17p | See Pex14/17p | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex14/17p | EAL93590 | EAA58642 | DQ793202 | XP_368163 | EAA27748 | EAA73655 | EAK81127 | -- | -- |
| Pex18p | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex19p | EAL92487 | EAA60977 | DQ793203 | XP_368273 | EAA31855 | EAA70162 | EAK86072 | AAW42876 | CAA97344 |
| Pex20p | EAL90176 | EAA60479 | DQ793204 | XP_368606 | AAN39561 | EAA76911 | -- | -- | -- |
| Pex21p | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex21Bp | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex22p | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Pex22p-like | EAL90994 | EAA66006 | DQ793205 | XP_365689 | EAA26537 | Translation of AACM01000080, nucleotides 4362-3039 (contains intron) | -- | -- | -- |
| Pex26p | EAL93994 | EAA61336 | DQ793206 | XP_359606 | EAA28582 | EAA76391 | -- | -- | -- |

*Partial ORFs encoded on non-overlapping contigs.

[0132] Mutations of Pex genes leading to impaired peroxisome biogenesis result in severe metabolic and developmental disturbances in yeasts, humans and plants (Eckert, J. H. and R. Erdmann, Rev. Physiol. Biochem Pharmacol., 147:75-121 (2003); Weller, S. et al., Annual Review of Genomics and Human Genetics, 4:165-211 (2003); Wanders, R. J., Am. J. Med. Genet., 126A:355-375 (2004); Mano, S. and M. Nishimura, Vitam Horm., 72:111-154 (2005); Wanders, J. A., and H. R. Waterham, Annu. Rev. Biochem., 75:295-332 (2006); Fujiki, Yukio. Peroxisome Biogenesis Disorders. In, Encyclopedia of Life Sciences. John Wiley & Sons, 2006). For example, X-linked adrenoleukodystrophy ["X-ALD"] and Zellweger syndrome, as well as several less severe forms of the disease, can result from single enzyme deficiencies and/or peroxisomal biogenesis disorders.

[0133] Within the yeast Yarrowia lipolytica, a variety of different Pex genes have been isolated and characterized, as identified in Table 4 above. More specifically, Bascom, R. A. et al. (Mol. Biol. Cell, 14:939-957 (2003)) describe YlPex3p; Szilard, R. K. et al. (J. Cell Biol., 131:1453-1469 (1995)) describe YlPex5p; Nuttley, W. M. et al. (J. Biol. Chem., 269:556-566 (1994)) describe YlPex6p; Eitzen, G. A. et al. (J. Biol. Chem., 270:1429-1436 (1995)) describe YlPex9p; Eitzen, G. A. et al. (J. Cell Biol., 137:1265-1278 (1997)) and Titorenko, V. I. et al. (Mol. Cell. Biol., 17:5210-5226 (1997)) describe YlPex16p; Lambkin, G. R. and R. A. Rachubinski (Mol. Biol. Cell, 12(11):3353-3364 (2001)) describe YlPex19p; and Titorenko, V. I. et al. (J. Cell Biol., 142:403-420 (1998)) and Smith, J. J. and R. A. Rachubinski (J. Biol. Chem., 276:1618-1625 (2001)) describe YlPex20p.

[0134] Of initial interest herein was YlPex10p (GenBank Accession No. CAG81606, No. AB036770 and No. AJ012084). It was demonstrated in Sumita et al. (FEMS Microbiol. Lett., 214:31-38 (2002)) that: 1) YlPex10p functions as a component of the peroxisome; and, 2) the C.sub.3HC.sub.4 zinc ring finger motif of YlPex10p was essential for the protein's function, as determined via creation of C341S, C346S and H343W point mutations, followed by analysis of growth.

[0135] Studies of the C.sub.3HC.sub.4 zinc ring finger motif of Pex10 have been done in other organisms with similar results. For example, point mutations that alter conserved residues in the Pex10p C.sub.3HC.sub.4 motif of Pichia pastoris were found to abolish function of the protein (Kalish, J. E. et al., Mol. Cell. Biol., 15:6406-6419 (1995)). Similarly, after functional complementation assays in fibroblast cell lines, Warren, D. S. et al. (Hum. Mutat., 15(6):509-521 (2000)) concluded that the C.sub.3HC.sub.4 motif was critical for Pex10p function. Several studies show that loss of function of Pex10p in Arabidopsis causes embryo lethality at the heart stage (Hu, J., et al., Science, 297:405-409 (2002); Schumann, U. et al., Proc. Natl. Acad. Sci. U.S.A., 100:9626-9631 (2003); Sparkes, I. A., et al., Plant Physiol., 133:1809-1819 (2003); Fan, J. et al., Plant Physiol., 139:231-239 (2005)). In follow-up research, Schumann, U. et al. (Proc. Natl. Acad. Sci. U.S.A., 104:1069-1074 (2007)) investigated the function of Pex10p in nonlethal partial loss-of-function Arabidopsis mutants. Specifically, four T-DNA insertion lines expressing Pex10p with a dysfunctional C.sub.3HC.sub.4 motif were created in an Arabidopsis wildtype background. Mutant plants demonstrated impaired leaf peroxisomes, and the authors suggest that inactivation of the ring finger motif in Pex10p eliminated a protein interaction required for attachment of peroxisomes to chloroplasts and for movement of metabolites between peroxisomes and chloroplasts.

[0136] Although studies have not identified essential domains in other Pex proteins, research has looked at the effect of various Pex mutants to learn the strategies and the molecular mechanisms evolutionarily diverse organisms use for assembling, maintaining, propagating and inheriting the peroxisome, an organelle known for its role in lipid metabolism. For example, Bascom, R. A. et al. have performed knockout and overexpression of the Yarrowia lipolytica Pex3p (Mol. Biol. Cell, 14:939-957 (2003)). The knockout cells did not contain wildtype peroxisomes but instead had numerous small vesicles; overexpression resulted in cells with fewer, larger and clustered peroxisomes. They hypothesized that Pex3p is involved in the initiation of peroxisome assembly by sequestering components of peroxisome biogenesis, i.e., peroxisome targeting signal (PTS) 1 and 2 import machineries. Similarly, Guo, T. et al. found that knockout of the Y. lipolytica Pex16p resulted in excessive proliferation of immature peroxisomal vesicles and significantly decreased the rate and efficiency of their conversion to mature peroxisomes (J. Cell Biol., 162:1255-1266 (2003)), while overexpression resulted in few but enlarged peroxisomes (Eitzen et al., J. Cell Biol., 137:1265-1278 (1997)). Guo et al. concluded that Pex16p negatively regulated the membrane scission event required for division of early peroxisomal precursors.

[0137] Despite the advances summarized above, many details concerning the roles of various Pex proteins, their interaction with one another and the biogenesis/assembly mechanism in peroxisomes remain to be elucidated. As such, the data described in the Application, wherein mutation within the C.sub.3HC.sub.4 motif of YlPex10p or knockout of YlPex3p, YlPex10p or YlPex16p results in creation of a Yarrowia lipolytica mutant that has an increased capacity to incorporate PUFAs, especially long-chain PUFAs such as C.sub.20 to C.sub.22 molecules, into the total lipid fraction and in the oil fraction in the cell, is a novel observation that does not yet find validation in studies with other plants or animals.

[0138] It has been suggested that peroxisomes are required for both catabolic and anabolic lipid metabolism (Lin, Y. et al., Plant Physiology, 135:814-827 (2004)); however, this hypothesis was based on studies with a homolog of Pex16p. More specifically, Lin, Y. et al. (supra) reported that Arabidopsis Shrunken Seed 1 (sse1) mutants had both abnormal peroxisome biogenesis and fatty acid synthesis, based on a reduction of oil to approximately 10-16% of wild type in sse1 seeds. Binns, D. et al. (J. Cell Biol., 173(5):719-731 (2006)) examined the peroxisome-lipid body interactions in Saccharomyces cerevisiae and determined that extensive physical contact between the two organelles promotes coupling of lipolysis within lipid bodies with peroxisomal fatty acid oxidation. More specifically, ratios of free fatty acids to TAGs were examined in various Pex knockouts and found to be increased relative to the wildtype. Clearly, further investigation will be necessary to understand the metabolic roles of peroxisomes and in particular of Pex3p, Pex10p and Pex16p proteins.

[0139] Without wishing to be held to any particular explanation or theory, it is hypothesized that disruption or knockout of a Pex gene within an oleaginous yeast cell affects both the catabolic and anabolic lipid metabolism that naturally occurs in peroxisomes or is affected by peroxisomes. Disruption or knockout results in an increase in the amount of PUFAs in the total lipid fraction and in the oil fraction, as a percent of total fatty acids, as compared with an oleaginous yeast whose native peroxisome biogenesis factor protein has not been disrupted. In some cases, an increase in the amount of PUFAs in the total lipid fraction and in the oil fraction as a percent of dry cell weight, and/or an increase in the total lipid content as a percent of dry cell weight, is also observed. It is hypothesized that this generalized mechanism is applicable within all eukaryotic organisms, such as algae, fungi, oomycetes, yeast, euglenoids, stramenopiles, plants and some mammalian systems, since all comprise peroxisomes.

[0140] Identification and Isolation of Pex Homologs

[0141] When the sequence of a particular Pex gene or protein within a preferred host organism is not known, one skilled in the art recognizes that it will be most desirable to identify and isolate these genes, or portions of them, prior to regulating the activity of the encoded proteins, which regulation in turn facilitates altering the amount, as a percent of total fatty acids, of PUFAs incorporated into the total lipid fraction and in the oil fraction of the eukaryote. Sequence knowledge of the preferred host's Pex genes also facilitates disruption of the homologous chromosomal genes by targeted disruption.

[0142] The Pex sequences in Table 4, or portions of them, may be used to search for Pex homologs in the same or other algal, fungal, oomycete, euglenoid, stramenopile, yeast or plant species using sequence analysis software. In general, such computer software matches similar sequences by assigning degrees of homology to various substitutions, deletions, and other modifications. Use of software algorithms, such as the BLASTP method of alignment with a low complexity filter and the following parameters: Expect value=10, matrix=Blosum 62 (Altschul, et al., Nucleic Acids Res. 25:3389-3402 (1997)), is well-known for comparing any Pex protein in Table 4 against a database of nucleic acid or protein sequences and thereby identifying similar known sequences within a preferred host organism.
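As an illustration only, the search parameters named above can be expressed as a command line for the NCBI BLAST+ `blastp` program. This is a minimal sketch under stated assumptions: BLAST+ is a later toolkit than the one cited in the reference, and the file name `YlPex10p.fasta`, the database name `host_proteome` and the output file name are hypothetical placeholders. The code only assembles and prints the command; it does not run a search. A protein database is assumed, since `blastp` compares protein queries against protein subjects.

```python
import shlex


def build_blastp_command(query_fasta, database, out_path):
    """Assemble a BLAST+ blastp command with the parameters named in the text."""
    return [
        "blastp",
        "-query", query_fasta,    # protein query, e.g. a Pex protein from Table 4
        "-db", database,          # pre-formatted protein database (hypothetical name)
        "-evalue", "10",          # Expect value = 10
        "-matrix", "BLOSUM62",    # scoring matrix = Blosum 62
        "-seg", "yes",            # low complexity filter (SEG)
        "-outfmt", "6",           # tabular output, one hit per line
        "-out", out_path,
    ]


# Hypothetical file and database names, for illustration only.
cmd = build_blastp_command("YlPex10p.fasta", "host_proteome", "pex10_hits.tsv")
print(shlex.join(cmd))
```

Returning the command as a list rather than a single string keeps each argument separate, so it could be passed unchanged to `subprocess.run` on a machine where BLAST+ is installed.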

[0143] Use of a software algorithm to comb through databases of known sequences is particularly suitable for the isolation of homologs having a relatively low percent identity to publicly available Pex sequences, such as those described in Table 4. It is predictable that isolation would be relatively easier for Pex homologs of at least about 70%-85% identity to publicly available Pex sequences. Further, those sequences that are at least about 85%-90% identical would be particularly suitable for isolation and those sequences that are at least about 90%-95% identical would be the most facilely isolated.
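The identity ranges above can be made concrete with a short calculation. The following is a sketch, not a prescribed method: it assumes the two sequences have already been aligned (gaps written as `-`), it counts identity over aligned columns that contain no gap (other conventions for the denominator exist), and the two short sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_tier(pid):
    """Map a percent identity onto the ranges discussed in the text."""
    if pid >= 90:
        return "at least about 90%"
    if pid >= 85:
        return "about 85%-90%"
    if pid >= 70:
        return "about 70%-85%"
    return "below about 70%"


# Invented toy alignment; not real Pex sequences.
pid = percent_identity("MKT-AILV", "MKTQAILI")
print(round(pid, 1), identity_tier(pid))
```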

[0144] Some Pex homologs have also been isolated by the use of motifs unique to the Pex proteins. For example, it is well known that Pex2p, Pex10p and Pex12p all share a cysteine-rich motif near their carboxyl termini, known as a C.sub.3HC.sub.4 zinc ring finger motif (FIG. 2A). This "conserved domain" region corresponds to a set of amino acids that are highly conserved at specific positions and likely represents a region of the Pex protein that is essential to the structure, stability or activity of the protein. Motifs are identified by their high degree of conservation in aligned sequences of a family of protein homologs. As unique "signatures", they can determine if a protein with a newly determined sequence belongs to a previously identified protein family. These motifs are useful as diagnostic tools for the rapid identification of novel Pex2, Pex10 and/or Pex12 genes, respectively.
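A motif of this kind can be used as a simple sequence "signature" in a pattern search. The sketch below assumes the commonly cited general consensus spacing for C.sub.3HC.sub.4 ring fingers (C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-C-X2-C-X(4-48)-C-X2-C); that spacing is not taken from this document or from FIG. 2A, and the test sequence is synthetic rather than a real Pex10p sequence. The His-to-Trp change loosely mirrors the kind of point mutation described for YlPex10p.

```python
import re

# Assumed general C3HC4 consensus; X stands for any residue.
RING_PATTERN = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C"
)


def find_ring_finger(protein_seq):
    """Return the first match of the assumed C3HC4 pattern, or None."""
    return RING_PATTERN.search(protein_seq.upper())


# Synthetic toy sequence built to satisfy the spacing; not a real protein.
toy = (
    "MSTP"
    + "C" + "PI" + "C" + "LEKAQDPSVT"
    + "C" + "G" + "H" + "LF"
    + "C" + "WS" + "C" + "IMEWAGRK"
    + "C" + "PL" + "C"
    + "RQEF"
)
mutant = toy.replace("H", "W")  # removes the single conserved histidine

print(find_ring_finger(toy) is not None)
print(find_ring_finger(mutant) is not None)
```

A regular expression only checks ligand spacing; profile-based methods would be a more sensitive choice for divergent homologs.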

[0145] Alternatively, the publicly available Pex sequences or their motifs may be used as hybridization reagents for the identification of homologs. The basic components of a nucleic acid hybridization test include a probe, a sample suspected of containing the gene or gene fragment of interest, and a specific hybridization method. Probes are typically single-stranded nucleic acid sequences that are complementary to, and therefore hybridizable to, the nucleic acid sequences to be detected. Although probe length can vary from 5 bases to tens of thousands of bases, typically a probe length of about 15 bases to about 30 bases is suitable. Only part of the probe molecule need be complementary to the nucleic acid sequence to be detected. In addition, the complementarity between the probe and the target sequence need not be perfect. Hybridization does occur between imperfectly complementary molecules, with the result that a certain fraction of the bases in the hybridized region are not paired with the proper complementary base.

[0146] Hybridization methods are well known. Typically the probe and the sample must be mixed under conditions that permit nucleic acid hybridization. This involves contacting the probe and sample in the presence of an inorganic or organic salt under the proper concentration and temperature conditions. The probe and sample nucleic acids must be in contact for a long enough time that any possible hybridization between the probe and the sample nucleic acid occurs. The concentration of probe or target in the mixture determines the time necessary for hybridization to occur. The higher the concentration of the probe or target, the shorter the hybridization incubation time needed. Optionally, a chaotropic agent may be added, such as guanidinium chloride, guanidinium thiocyanate, sodium thiocyanate, lithium tetrachloroacetate, sodium perchlorate, rubidium tetrachloroacetate, potassium iodide or cesium trifluoroacetate. If desired, one can add formamide to the hybridization mixture, typically 30-50% (v/v) ["by volume"].

[0147] Various hybridization solutions can be employed. Typically, these comprise from about 20 to 60% volume, preferably 30%, of a polar organic solvent. A common hybridization solution employs about 30-50% v/v formamide, about 0.15 to 1 M sodium chloride, about 0.05 to 0.1 M buffers (e.g., sodium citrate, Tris-HCl, PIPES or HEPES (pH range about 6-9)), about 0.05 to 0.2% detergent (e.g., sodium dodecylsulfate), or between 0.5-20 mM EDTA, FICOLL (Pharmacia Inc.) (about 300-500 kdal), polyvinylpyrrolidone (about 250-500 kdal), and serum albumin. Also included in the typical hybridization solution are unlabeled carrier nucleic acids from about 0.1 to 5 mg/mL, fragmented nucleic DNA such as calf thymus or salmon sperm DNA or yeast RNA, and optionally from about 0.5 to 2% wt/vol ["weight by volume"] glycine. Other additives may be included, such as volume exclusion agents that include polar water-soluble or swellable agents (e.g., polyethylene glycol), anionic polymers (e.g., polyacrylate or polymethylacrylate) and anionic saccharinic polymers, such as dextran sulfate.
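For bench use, concentration ranges such as these are usually turned into pipetting volumes with the dilution relation C1 x V1 = C2 x V2. The sketch below is illustrative only: the stock concentrations (100% formamide, 5 M sodium chloride, 1 M Tris-HCl, 10% sodium dodecylsulfate, 0.5 M EDTA) and the chosen target values are assumptions picked to fall inside the ranges quoted above, not a recipe given in this document.

```python
def stock_volumes(final_volume_ml, recipe):
    """Volume of each stock (mL) needed to reach its target concentration.

    `recipe` maps a component name to (stock_conc, target_conc); both values
    for a given component must be in the same unit (%, M, ...).
    """
    volumes = {}
    for name, (stock_conc, target_conc) in recipe.items():
        if target_conc > stock_conc:
            raise ValueError(f"target exceeds stock concentration for {name}")
        volumes[name] = final_volume_ml * target_conc / stock_conc  # C1*V1 = C2*V2
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested final volume")
    volumes["water"] = water
    return volumes


# Assumed stocks and targets (illustrative values within the quoted ranges).
recipe = {
    "formamide (% v/v)": (100.0, 40.0),
    "NaCl (M)": (5.0, 0.5),
    "Tris-HCl (M)": (1.0, 0.05),
    "SDS (%)": (10.0, 0.1),
    "EDTA (M)": (0.5, 0.001),
}
volumes = stock_volumes(10.0, recipe)
for name, ml in volumes.items():
    print(f"{name}: {ml:.2f} mL")
```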

[0148] Nucleic acid hybridization is adaptable to a variety of assay formats. One of the most suitable is the sandwich assay format. The sandwich assay is particularly adaptable to hybridization under non-denaturing conditions. A primary component of a sandwich-type assay is a solid support. The solid support has adsorbed or covalently coupled to it immobilized nucleic acid probe that is unlabeled and complementary to one portion of the sequence.

[0149] Any of the Pex nucleic acid fragments or any identified homologs may be used to isolate genes encoding homologous proteins from the same or other algal, fungal, oomycete, euglenoid, stramenopile, yeast or plant species. Isolation of homologous genes using sequence-dependent protocols is well known in the art. Examples of sequence-dependent protocols include, but are not limited to: 1) methods of nucleic acid hybridization; 2) methods of DNA and RNA amplification, as exemplified by various uses of nucleic acid amplification technologies, such as polymerase chain reaction ["PCR"] (U.S. Pat. No. 4,683,202); ligase chain reaction ["LCR"] (Tabor, S. et al., Proc. Natl. Acad. Sci. U.S.A., 82:1074 (1985)); or strand displacement amplification ["SDA"] (Walker, et al., Proc. Natl. Acad. Sci. U.S.A., 89:392 (1992)); and, 3) methods of library construction and screening by complementation.

[0150] For example, genes encoding proteins or polypeptides similar to publicly available Pex genes or their motifs could be isolated directly by using all or a portion of those publicly available nucleic acid fragments as DNA hybridization probes to screen libraries from any desired organism using well known methods. Specific oligonucleotide probes based upon the publicly available nucleic acid sequences can be designed and synthesized by methods known in the art (Maniatis, supra). Moreover, the entire sequences can be used directly to synthesize DNA probes by methods known to the skilled artisan, such as random primers DNA labeling, nick translation or end-labeling techniques, or RNA probes using available in vitro transcription systems. In addition, specific primers can be designed and used to amplify a part or the full length of the publicly available sequences or their motifs. The resulting amplification products can be labeled directly during amplification reactions or labeled after amplification reactions, and used as probes to isolate full-length DNA fragments under conditions of appropriate stringency.

[0151] Typically, in PCR-type amplification techniques, the primers have different sequences and are not complementary to each other. Depending on the desired test conditions, the sequences of the primers should be designed to provide for both efficient and faithful replication of the target nucleic acid. Methods of PCR primer design are common and well known (Thein and Wallace, "The use of oligonucleotides as specific hybridization probes in the Diagnosis of Genetic Disorders", in Human Genetic Diseases: A Practical Approach, K. E. Davis Ed., (1986) pp 33-50, IRL: Herndon, Va.; Rychlik, W., In Methods in Molecular Biology, White, B. A. Ed., (1993) Vol. 15, pp 31-39, PCR Protocols: Current Methods and Applications. Humana: Totowa, N.J.).
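Two of the checks implied above, a melting temperature estimate and a test that the primers are not complementary to each other, can be sketched in a few lines. This is a rough illustration under stated assumptions: the melting temperature uses the simple rule of thumb Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is only an approximation for short oligonucleotides, the complementarity check looks only at a short window at the 3' ends, and all primer sequences are invented.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rule-of-thumb melting temperature: 2 C per A/T plus 4 C per G/C."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def three_prime_complementary(primer_1, primer_2, window=4):
    """True if the last `window` bases of the two primers can pair with each
    other, which would favor primer-dimer formation."""
    return primer_1.upper()[-window:] == reverse_complement(primer_2[-window:])


# Invented primer sequences, for illustration only.
forward = "ATGACCGATCTGAAGCTT"
reverse = "GGTCAACCTGAGTTCGAT"
print(wallace_tm(forward), wallace_tm(reverse))
print(three_prime_complementary(forward, reverse))
```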

[0152] Generally two short segments of available Pex sequences may be used in PCR protocols to amplify longer nucleic acid fragments encoding homologous genes from DNA or RNA. PCR may also be performed on a library of cloned nucleic acid fragments wherein the sequence of one primer is derived from the available nucleic acid fragments or their motifs. The sequence of the other primer takes advantage of the presence of the polyadenylic acid tracts to the 3' end of the mRNA precursor encoding genes.

[0153] Alternatively, the second primer sequence may be based upon sequences derived from the cloning vector. For example, the skilled artisan can follow the RACE protocol (Frohman et al., Proc. Natl. Acad. Sci. U.S.A., 85:8998 (1988)) to generate cDNAs by using PCR to amplify copies of the region between a single point in the transcript and the 3' or 5' end. Primers oriented in the 3' and 5' directions can be designed from the available sequences. Using commercially available 3' RACE or 5' RACE systems (e.g., BRL, Gaithersburg, Md.), specific 3' or 5' cDNA fragments can be isolated (Ohara et al., Proc. Natl. Acad. Sci. U.S.A., 86:5673 (1989); Loh et al., Science, 243:217 (1989)).

[0154] Based on any of these well-known methods just discussed, it would be possible to identify and/or isolate Pex gene homologs in any preferred eukaryotic organism of choice. The activity of any putative Pex gene can readily be confirmed by targeted disruption of the endogenous gene within the PUFA-producing host organism, since the lipid profiles of the total lipid fraction and of the oil fraction are modified relative to those within an organism lacking the targeted Pex gene disruption.

Increasing the Amount of PUFAs in the Total Lipid Fraction and in the Oil Fraction Via Disruption of a Native Peroxisome Biogenesis Factor Protein

[0155] As noted above, the present disclosure relates to the following described methods for increasing the weight percent of one PUFA or a combination of PUFAs in an oleaginous eukaryotic organism, comprising:

[0156] a) providing an oleaginous eukaryotic organism comprising a disruption in a native gene encoding a peroxisome biogenesis factor protein, which creates a PEX-disruption organism; and genes encoding a functional PUFA biosynthetic pathway; and,

[0157] b) growing the eukaryotic organism of (a) under conditions wherein the weight percent of one PUFA or a combination of PUFAs is increased in the total lipid fraction and in the oil fraction relative to the weight percent of the total fatty acids, when compared with those weight percents in an oleaginous eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted.

The amount of PUFAs that increases as a percent of total fatty acids can be: 1) the PUFA that is the desired end product of a functional PUFA biosynthetic pathway, as opposed to PUFA intermediates or by-products; 2) C.sub.20 to C.sub.22 PUFAs; and/or, 3) total PUFAs.

[0158] In addition to the increase in the weight percent of one or a combination of PUFAs relative to the weight percent of the total fatty acids, in some cases, the total lipid content (TFA % DCW) of the cell may be increased or decreased. What this means is that regardless of whether the disruption in the PEX gene causes the amount of total lipids in the PEX-disrupted cell to increase or decrease, the disruption always causes the weight percent of a PUFA or of a combination of PUFAs to increase.

[0159] Another method provided herein relates to a disruption in a native gene encoding a peroxisome biogenesis factor protein, wherein said disruption can result in an increase in the percent of one PUFA or a combination of PUFAs relative to the dry cell weight when compared to that percent in a parental strain whose native Pex protein had not been disrupted or that was expressing a "replacement" copy of the disrupted native Pex protein.
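The relationship between the two measures used above follows from their definitions: the amount of a PUFA as a percent of dry cell weight equals its percent of total fatty acids multiplied by the total lipid content (TFA % DCW), divided by 100. The sketch below works through that arithmetic with hypothetical numbers that are not data from this application; they are chosen only to show that a PUFA can increase as a percent of dry cell weight even when total lipid content decreases.

```python
def pufa_pct_dcw(pufa_pct_tfa, tfa_pct_dcw):
    """PUFA as a percent of dry cell weight, from PUFA % TFA and TFA % DCW."""
    return pufa_pct_tfa * tfa_pct_dcw / 100.0


# Hypothetical values for illustration; not measurements from the application.
parent = pufa_pct_dcw(pufa_pct_tfa=10.0, tfa_pct_dcw=30.0)
disrupted = pufa_pct_dcw(pufa_pct_tfa=20.0, tfa_pct_dcw=25.0)

print(f"parent strain:    {parent:.1f}% DCW")
print(f"disrupted strain: {disrupted:.1f}% DCW")
print("PUFA % DCW increased:", disrupted > parent)
```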

[0160] In preferred aspects of the method above, the disruption in a native gene encoding a peroxisome biogenesis factor protein results in an increase in the amount of the PUFA that is the desired end product of a functional PUFA biosynthetic pathway, as opposed to PUFA intermediates or by-products, as a percent of dry cell weight relative to the parental strain whose native Pex protein had not been disrupted or the parental strain that was expressing a "replacement" copy of the disrupted native Pex protein. In some cases, the increase in the percent of a combination of PUFAs relative to the dry cell weight is a combination of C.sub.20 to C.sub.22 PUFAs or the total PUFAs.

[0161] Also described herein are organisms produced by these methods, comprising a disruption of at least one peroxisome biogenesis factor protein. Lipids and oils obtained from these organisms, products obtained from the processing of the lipids and oil, use of these lipids and oil in foods, animal feeds or industrial applications and/or use of the by-products in foods or animal feeds are also described.

[0162] Preferred eukaryotic organisms in the methods described above include algae, fungi, oomycetes, yeast, euglenoids, stramenopiles, plants and some mammalian systems.

[0163] The peroxisome biogenesis factor protein for any of these methods may be selected from the group consisting of: Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex5Bp, Pex5Cp, Pex5/20p, Pex6p, Pex7p, Pex8p, Pex10p, Pex12p, Pex13p, Pex14p, Pex15p, Pex16p, Pex17p, Pex14/17p, Pex18p, Pex19p, Pex20p, Pex21p, Pex21Bp, Pex22p, Pex22p-like and Pex26p (and protein homologs thereof). In some preferred methods described herein, the disrupted peroxisome biogenesis factor protein is selected from the group consisting of: Pex2p, Pex3p, Pex10p, Pex12p and/or Pex16p. In some more preferred methods, however, the disrupted peroxisome biogenesis factor protein is selected from the group consisting of: Pex3p, Pex10p and/or Pex16p.

[0164] The disruption in the native gene encoding a peroxisome biogenesis factor protein can be an insertion, deletion, or targeted mutation within a portion of the gene, such as within the N-terminal portion of the protein or within the C-terminal portion of the protein. Alternatively, the disruption can result in a complete gene knockout such that the gene is eliminated from the host cell genome. Or, the disruption could be a targeted mutation that results in a non-functional protein.

Disruption Methodologies

[0165] The invention includes disruption in a native gene encoding a peroxisome biogenesis factor protein within a preferred host cell. Although numerous techniques are available to one of skill in the art to achieve disruption, generally the endogenous activity of a particular gene can be reduced or eliminated by the following techniques, for example: 1) disrupting the gene through insertion, substitution and/or deletion of all or part of the target gene; or 2) manipulating the regulatory sequences controlling the expression of the protein. Both of these techniques are discussed below. However, one skilled in the art appreciates that these are well described in the existing literature and are not limiting to the methods, host cells, and products described herein. One skilled in the art also appreciates the most appropriate technique for use with any particular oleaginous yeast.

[0166] Disruption Via Insertion, Substitution And/Or Deletion: For gene disruption, a foreign DNA fragment, typically a selectable marker gene, is inserted into the structural gene. This interrupts the coding sequence of the structural gene and causes inactivation of that gene. Transformation of the disruption cassette into the host cell results in replacement of the functional native gene by homologous recombination with the non-functional disrupted gene. See, for example: Hamilton et al., J. Bacteriol., 171:4617-4622 (1989); Balbas et al., Gene, 136:211-213 (1993); Gueldener et al., Nucleic Acids Res., 24:2519-2524 (1996); and Smith et al., Methods Mol. Cell. Biol., 5:270-277 (1996). One skilled in the art appreciates the many variations of the general method of gene targeting, which admits of positive or negative selection, creation of gene knockouts, and insertion of exogenous DNA sequences into specific genome sites in mammalian systems, plant cells, filamentous fungi, algae, oomycetes, euglenoids, stramenopiles, yeast and/or microbial systems.
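The logic of an insertional disruption can be modeled with plain string operations. The sketch below is a toy illustration, not a cloning protocol: the "gene" and "marker" sequences are invented placeholders, and the homology arms are only nine bases long so the example stays readable, whereas arms used in practice are far longer. It builds a cassette consisting of a marker flanked by two arms copied from the target gene, then shows the disrupted allele expected after homologous recombination.

```python
def build_disruption_cassette(gene, marker, insert_at, arm_len):
    """Marker flanked by homology arms copied from either side of insert_at."""
    if not (arm_len <= insert_at <= len(gene) - arm_len):
        raise ValueError("insertion site too close to the end of the gene")
    left_arm = gene[insert_at - arm_len:insert_at]
    right_arm = gene[insert_at:insert_at + arm_len]
    return left_arm + marker + right_arm


def disrupted_allele(gene, marker, insert_at):
    """Gene sequence with the marker inserted, interrupting the coding region."""
    return gene[:insert_at] + marker + gene[insert_at:]


# Invented placeholder sequences, for illustration only.
gene = "ATGGCTGACCTGAAGGTTCCAGAATCGTAA"
marker = "GGGCCCGGGCCC"

cassette = build_disruption_cassette(gene, marker, insert_at=15, arm_len=9)
allele = disrupted_allele(gene, marker, insert_at=15)

print(cassette)
print(allele)
print("intact gene still present:", gene in allele)
```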

[0167] In contrast, a non-specific method of gene disruption is the use of transposable elements or transposons. Transposons are genetic elements that insert randomly into DNA but can be later retrieved on the basis of sequence to determine the locus of insertion. Both in vivo and in vitro transposition techniques are known and involve the use of a transposable element in combination with a transposase enzyme. When the transposable element or transposon is contacted with a nucleic acid fragment in the presence of the transposase, the transposable element randomly inserts into the nucleic acid fragment. The technique is useful for random mutagenesis and for gene isolation, since the disrupted gene may be identified on the basis of the sequence of the transposable element. Kits for in vitro transposition are commercially available and include: the Primer Island Transposition Kit, available from Perkin Elmer Applied Biosystems, Branchburg, N.J., based upon the yeast Ty1 element; the Genome Priming System, available from New England Biolabs, Beverly, Mass., based upon the bacterial transposon Tn7; and EZ::TN Transposon Insertion Systems, available from Epicentre Technologies, Madison, Wis., based upon the Tn5 bacterial transposable element.

[0168] Manipulation Of Pex Regulatory Sequences: As is well known in the art, the regulatory sequences associated with a coding sequence include transcriptional and translational "control" nucleotide sequences located upstream (5' non-coding sequences), within, or downstream (3' non-coding sequences) of the coding sequence, and which influence the transcription, RNA processing or stability, or translation of the associated coding sequence. Thus, manipulation of a Pex gene's regulatory sequences may refer to manipulation of the promoters, silencers, 5' untranslated leader sequences (between the transcription start site and the translation initiation codon), introns, enhancers, initiation control regions, polyadenylation recognition sequences, RNA processing sites, effector binding sites and stem-loop structures of the particular Pex gene. In all cases, however, the result of the manipulation is down-regulation of the Pex gene's expression, which promotes increased amount of PUFAs in the total lipid fraction and in the oil fraction, as a percent of total fatty acids, as compared with an oleaginous yeast whose native peroxisome biogenesis factor protein has not been disrupted.

[0169] For example, the promoter of a Pex10 gene could be deleted or disrupted. Alternatively, the native promoter driving expression of a Pex10 gene may be substituted with a heterologous promoter having diminished promoter activity with respect to that of the native promoter. Methods useful for manipulating regulatory sequences are well known.

[0170] The skilled person is able to use these and other well known techniques to disrupt a native peroxisome biogenesis factor protein within the preferred host cells described herein, such as mammalian systems, plant cells, filamentous fungi, algae, oomycetes, euglenoids, stramenopiles and yeast.

[0171] One skilled in the art is able to discern the optimum means to disrupt the native Pex gene to achieve an increased amount of PUFAs that accumulate in the total lipid fraction and in the oil fraction, as a percent of total fatty acids, as compared with a eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted.

Metabolic Engineering of .omega.-3 and/or .omega.-6 Fatty Acid Biosynthesis

[0172] In addition to the methods described herein for disruption of a native peroxisome biogenesis factor protein, it may also be useful to manipulate .omega.-3 and/or .omega.-6 fatty acid biosynthesis. This may require metabolic engineering directly within the PUFA biosynthetic pathway or additional manipulation of pathways that contribute carbon to the PUFA biosynthetic pathway.

[0173] Techniques useful for up-regulating desirable biochemical pathways and down-regulating undesirable biochemical pathways are well known in the art. For example, biochemical pathways competing with the .omega.-3 and/or .omega.-6 fatty acid biosynthetic pathways for energy or carbon, or native PUFA biosynthetic pathway enzymes that interfere with production of a particular PUFA end-product, may be eliminated by gene disruption or down-regulated by other means, such as antisense mRNA and zinc-finger targeting technologies.

[0174] The following discuss altering the PUFA biosynthetic pathway as a means to increase GLA, ARA, EPA or DHA, respectively, and desirable manipulations in the TAG biosynthetic pathway and in the TAG degradation pathway: Int'l. App. Pub. No. WO 2006/033723, Int'l. App. Pub. No. WO 2006/055322 [U.S. Pat. Appl. Pub. No. 2006-0094092-A1], Int'l. App. Pub. No. WO 2006/052870 [U.S. Pat. Appl. Pub. No. 2006-0115881-A1] and Int'l. App. Pub. No. WO 2006/052871 [U.S. Pat. Appl. Pub. No. 2006-0110806-A1], respectively.

Expression Systems, Cassettes, Vectors and Transformation of Host Cells

[0175] It may be necessary to create and introduce a recombinant construct into the preferred eukaryotic host, such as mammalian systems, plant cells, filamentous fungi, algae, oomycetes, euglenoids, stramenopiles and yeast, to result in disruption of a native peroxisome biogenesis factor protein and/or introduction of genes encoding a PUFA biosynthetic pathway. One of skill in the art appreciates standard resource materials that describe: 1) specific conditions and procedures for construction, manipulation and isolation of macromolecules, such as DNA molecules, plasmids, etc.; 2) generation of recombinant DNA fragments and recombinant expression constructs; and 3) screening and isolating of clones. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2.sup.nd ed., Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989); Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor, N.Y. (1995); Birren et al., Genome Analysis: Detecting Genes, v. 1, Cold Spring Harbor, N.Y. (1998); Birren et al., Genome Analysis: Analyzing DNA, v. 2, Cold Spring Harbor, N.Y. (1998); Plant Molecular Biology: A Laboratory Manual, Clark, ed. Springer: NY (1997).

[0176] In general, the choice of sequences included in the construct depends on the desired expression products, the nature of the host cell and the proposed means of separating transformed cells versus non-transformed cells. The skilled artisan is well aware of the genetic elements that must be present on the plasmid vector to successfully transform, select and propagate host cells containing the chimeric gene. Typically, however, the vector or cassette contains sequences directing transcription and translation of the relevant gene(s), a selectable marker and sequences allowing autonomous replication or chromosomal integration. Suitable vectors comprise a region 5' of the gene that controls transcriptional initiation, i.e., a promoter, and a region 3' of the DNA fragment that controls transcriptional termination, i.e., a terminator. It is most preferred when both control regions are derived from genes from the transformed host cell.

[0177] Initiation control regions or promoters useful for driving expression of heterologous genes or portions of them in the desired host cell are numerous and well known. These control regions may comprise a promoter, enhancer, silencer, intron sequences, 3' UTR and/or 5' UTR regions, and protein and/or RNA stabilizing elements. Such elements may vary in their strength and specificity. Virtually any promoter (i.e., native, synthetic, or chimeric) capable of directing expression of these genes in the selected host cell is suitable. Expression in a host cell can occur in an induced or constitutive fashion. Induced expression occurs by inducing the activity of a regulatable promoter operably linked to the Pex gene of interest. Constitutive expression occurs by the use of a constitutive promoter operably linked to the gene of interest.

[0178] When the host cell is, for example, yeast, transcriptional and translational regions functional in yeast cells are provided, particularly from the host species. See Int'l. App. Pub. No. WO 2006/052870 for preferred transcriptional initiation regulatory regions for use in Yarrowia lipolytica. Any of a number of regulatory sequences may be used, depending on whether constitutive or induced transcription is desired, the efficiency of the promoter in expressing the ORF of interest, the ease of construction, etc.

[0179] 3' non-coding sequences encoding transcription termination signals, i.e., a "termination region", must be provided in a recombinant construct and may be from the 3' region of the gene from which the initiation region was obtained or from a different gene. A large number of termination regions are known and function satisfactorily in a variety of hosts when utilized in both the same and different genera and species from which they were derived. The termination region is selected more for convenience than for any particular property. Termination regions may also be derived from various genes native to the preferred hosts.

[0180] Particularly useful termination regions for use in yeast are those derived from a yeast gene, particularly Saccharomyces, Schizosaccharomyces, Candida, Yarrowia or Kluyveromyces. The 3'-regions of mammalian genes encoding .gamma.-interferon and .alpha.-2 interferon are also known to function in yeast. The 3'-region can also be synthetic, as one of skill in the art can utilize available information to design and synthesize a 3'-region sequence that functions as a transcription terminator. A termination region may be unnecessary, but is highly preferred.

[0181] The vector may comprise a selectable and/or scorable marker, in addition to the regulatory elements described above. Preferably, the marker gene is an antibiotic resistance gene such that treating cells with the antibiotic causes inhibition of growth, or death, of untransformed cells and uninhibited growth of transformed cells. For selection of yeast transformants, any marker that functions in yeast is useful, with resistance to kanamycin, hygromycin and the aminoglycoside G418 and the ability to grow on media lacking uracil, lysine, histidine or leucine being particularly useful.

[0182] Merely inserting a gene into a cloning vector does not ensure its expression at the desired rate, concentration, amount, etc. In response to the need for a high expression rate, many specialized expression vectors have been created by manipulating a number of different genetic elements that control transcription, RNA stability, translation, protein stability and location, oxygen limitation, and secretion from the host cell. Some of the manipulated features include: the nature of the relevant transcriptional promoter and terminator sequences, the number of copies of the cloned gene and whether the gene is plasmid-borne or integrated into the genome of the host cell, the final cellular location of the synthesized foreign protein, the efficiency of translation and correct folding of the protein in the host organism, the intrinsic stability of the mRNA and protein of the cloned gene within the host cell and the codon usage within the cloned gene, such that its frequency approaches the frequency of preferred codon usage of the host cell. Each of these may be used in the methods and host cells described herein to further optimize expression of PUFA biosynthetic pathway genes and to diminish expression of a native Pex gene.

[0183] After a recombinant construct is created, e.g., comprising a chimeric gene comprising a promoter, ORF and terminator, suitable for disruption or knock out of a native peroxisome biogenesis factor protein and/or expression of genes encoding a PUFA biosynthetic pathway activity, it is placed in a plasmid vector capable of autonomous replication in the host cell or is directly integrated into the genome of the host cell. Integration of expression cassettes can occur randomly within the host genome or can be targeted through the use of constructs containing regions of homology with the host genome sufficient to target recombination with the host locus. Where constructs are targeted to an endogenous locus, all or some of the transcriptional and translational regulatory regions can be provided by the endogenous locus.

[0184] When two or more genes are expressed from separate replicating vectors, each vector may have a different means of selection and should lack homology to the other construct(s) to maintain stable expression and prevent reassortment of elements among constructs. Judicious choice of regulatory regions, selection means and method of propagation of the introduced construct(s) can be experimentally determined so that all introduced genes are expressed at the necessary levels to provide for synthesis of the desired products.

[0185] Constructs comprising the gene of interest may be introduced into a host cell by any standard technique. These techniques include transformation, e.g., lithium acetate transformation (Methods in Enzymology, 194:186-187 (1991)), protoplast fusion, biolistic impact, electroporation, microinjection, vacuum filtration or any other method that introduces the gene of interest into the host cell.

[0186] For convenience, a host cell that has been manipulated by any method to take up a DNA sequence, for example, in an expression cassette, is referred to herein as "transformed" or "recombinant". The transformed host will have at least one copy of the expression construct and may have two or more, depending upon whether the gene is integrated into the genome, amplified, or is present on an extrachromosomal element having multiple copy numbers.

[0187] The transformed host cell can be identified by selection for a marker contained on the introduced construct. Alternatively, a separate marker construct may be co-transformed with the desired construct, as many transformation techniques introduce many DNA molecules into host cells. Typically, transformed hosts are selected for their ability to grow on selective media. Selective media may incorporate an antibiotic or lack a factor necessary for growth of the untransformed host, such as a nutrient or growth factor. An introduced marker gene may confer antibiotic resistance, or encode an essential growth factor or enzyme, thereby permitting growth on selective media when expressed in the transformed host. Selection of a transformed host can also occur when the expressed marker protein can be detected, either directly or indirectly. The marker protein may be expressed alone or as a fusion to another protein. The marker protein can be detected by its enzymatic activity (e.g., .beta.-galactosidase can convert the substrate X-gal ["5-bromo-4-chloro-3-indolyl-.beta.-D-galactopyranoside"] to a colored product; luciferase can convert luciferin to a light-emitting product) or its light-producing or modifying characteristics (e.g., the green fluorescent protein of Aequorea victoria fluoresces when illuminated with blue light). Alternatively, antibodies can be used to detect the marker protein or a molecular tag on, for example, a protein of interest. Cells expressing the marker protein or tag can be selected, for example, visually, or by techniques such as fluorescence-activated cell sorting or panning using antibodies.

[0188] Regardless of the selected host or expression construct, multiple transformants must be screened to obtain a strain or plant line displaying the desired expression level, regulation and pattern, as different independent transformation events result in different levels and patterns of expression (Jones et al., EMBO J., 4:2411-2418 (1985); De Almeida et al., Mol. Gen. Genetics, 218:78-86 (1989)). Such screening may be accomplished by Southern analysis of DNA blots (Southern, J. Mol. Biol., 98:503 (1975)), Northern analysis of mRNA expression (Kroczek, J. Chromatogr. Biomed. Appl., 618(1-2):133-145 (1993)), Western and/or ELISA analyses of protein expression, phenotypic analysis or GC analysis of the PUFA products.

Preferred Eukaryotic Host Organisms

[0189] A variety of eukaryotic organisms are suitable as hosts herein, to thereby yield a transformant host organism comprising a disruption in a native peroxisome biogenesis factor protein and genes encoding a PUFA biosynthetic pathway, wherein the transformed eukaryotic host organism has an increased amount of PUFAs incorporated into the total lipid fraction and in the oil fraction, as a percent of total fatty acids, as compared to a eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted. Various mammalian systems, plant cells, fungi, algae, oomycetes, yeasts, stramenopiles and/or euglenoids may be useful hosts. Although oleaginous organisms are preferred, non-oleaginous organisms also have utility herein such that, when one of their native PEX genes is disrupted, an increase in the weight percent of at least one polyunsaturated fatty acid relative to the weight percent of total fatty acids in the total lipid fraction or in the oil fraction will be experienced and may lead to a 1.3-fold increase in the PUFA. Additionally, the percent of the PUFA may be increased relative to the dry cell weight in the non-oleaginous organism. In alternate embodiments, a non-oleaginous organism can be genetically modified to become oleaginous, e.g., yeast such as Saccharomyces cerevisiae.

[0190] Oleaginous organisms are naturally capable of oil synthesis and accumulation, wherein the total oil content typically comprises greater than about 25% of the cellular dry weight. Various algae, moss, fungi, yeast, stramenopiles and plants are naturally classified as oleaginous.

[0191] Preferred oleaginous microbes include those algal, stramenopile and fungal organisms that naturally produce .omega.-3/.omega.-6 PUFAs. For example, ARA, EPA and/or DHA are produced by Cyclotella sp., Nitzschia sp., Pythium, Thraustochytrium sp., Schizochytrium sp. and Mortierella. The method of transformation of M. alpina is described by Mackenzie et al. (Appl. Environ. Microbiol., 66:4655 (2000)). Similarly, methods for transformation of Thraustochytriales microorganisms (e.g., Thraustochytrium, Schizochytrium) are disclosed in U.S. Pat. No. 7,001,772.

[0192] More preferred are oleaginous yeast, including those that naturally produce and those genetically engineered to produce .omega.-3/.omega.-6 PUFAs. Genera typically identified as oleaginous yeast include, but are not limited to: Yarrowia, Candida, Rhodotorula, Rhodosporidium, Cryptococcus, Trichosporon and Lipomyces. More specifically, illustrative oil-synthesizing yeasts include: Rhodosporidium toruloides, Lipomyces starkeyi, L. lipoferus, Candida revkaufi, C. pulcherrima, C. tropicalis, C. utilis, Trichosporon pullulans, T. cutaneum, Rhodotorula glutinis, R. graminis and Yarrowia lipolytica (formerly classified as Candida lipolytica).

[0193] Most preferred is the oleaginous yeast Yarrowia lipolytica; and, in a further embodiment, most preferred are the Y. lipolytica strains designated as ATCC #76982, ATCC #20362, ATCC #8862, ATCC #18944 and/or LGAM S(7)1 (Papanikolaou S., and Aggelis G., Bioresour. Technol., 82(1):43-9 (2002)).

[0194] Specific teachings relating to transformation of Yarrowia lipolytica include U.S. Pat. No. 4,880,741 and U.S. Pat. No. 5,071,764 and Chen, D. C. et al. (Appl. Microbiol. Biotechnol., 48(2):232-235 (1997)), while suitable selection techniques are described in U.S. Pat. No. 7,238,482 and Int'l. App. Pub. Nos. WO 2005/003310 and WO 2006/052870.

[0195] The preferred method of expressing genes in Yarrowia lipolytica is by integration of linear DNA into the genome of the host. Integration into multiple locations within the genome can be particularly useful when high-level expression of genes is desired, such as in the Ura3 locus (GenBank Accession No. AJ306421), the Leu2 gene locus (GenBank Accession No. AF260230), the Lys5 gene locus (GenBank Accession No. M34929), the Aco2 gene locus (GenBank Accession No. AJ001300), the Pox3 gene locus (Pox3: GenBank Accession No. XP_503244 or Aco3: GenBank Accession No. AJ001301), the .DELTA.12 desaturase gene locus (U.S. Pat. No. 7,214,491), the Lip1 gene locus (GenBank Accession No. Z50020), the Lip2 gene locus (GenBank Accession No. AJ012632), the SCP2 gene locus (GenBank Accession No. AJ431362), the Pex3 gene locus (GenBank Accession No. CAG78565), the Pex16 gene locus (GenBank Accession No. CAG79622) and/or the Pex10 gene locus (GenBank Accession No. CAG81606).

[0196] Preferred selection methods for use in Yarrowia lipolytica are resistance to kanamycin, hygromycin and the aminoglycoside G418, as well as ability to grow on media lacking uracil, leucine, lysine, tryptophan or histidine. 5-fluoroorotic acid [5-fluorouracil-6-carboxylic acid monohydrate or "5-FOA"] may also be used for selection of yeast Ura.sup.- mutants. This compound is toxic to yeast cells that possess a functioning URA3 gene encoding orotidine 5'-monophosphate decarboxylase [OMP decarboxylase]; thus, based on this toxicity, 5-FOA is especially useful for the selection and identification of Ura.sup.- mutant yeast strains (Bartel, P. L. and Fields, S., Yeast 2-Hybrid System, Oxford University: New York, v. 7, pp 109-147, 1997; see also Int'l. App. Pub. No. WO 2006/052870 for 5-FOA use in Yarrowia).

[0197] An alternate preferred selection method for use in Yarrowia relies on a dominant, non-antibiotic marker for Yarrowia lipolytica based on sulfonylurea (chlorimuron ethyl; E. I. duPont de Nemours & Co., Inc., Wilmington, Del.) resistance. More specifically, the marker gene is a native acetohydroxyacid synthase ("AHAS" or acetolactate synthase; E.C. 4.1.3.18) that has a single amino acid change, i.e., W497L, that confers sulfonylurea herbicide resistance (Int'l. App. Pub. No. WO 2006/052870). AHAS is the first common enzyme in the pathway for the biosynthesis of branched-chain amino acids, i.e., valine, leucine, isoleucine, and it is the target of the sulfonylurea and imidazolinone herbicides.

Fermentation Processes for Polyunsaturated Fatty Acid Production

[0198] The transformed host cell is grown under conditions that optimize expression of PUFA biosynthetic genes and produce the greatest and most economical yield of desired PUFAs. In general, media conditions may be optimized by modifying the type and amount of carbon source, the type and amount of nitrogen source, the carbon-to-nitrogen ratio, the amount of different mineral ions, the oxygen level, growth temperature, pH, length of the biomass production phase, length of the oil accumulation phase and the time and method of cell harvest. Oleaginous yeast of interest, such as Yarrowia lipolytica, are generally grown in a complex medium such as yeast extract-peptone-dextrose broth (YPD) or a defined minimal medium that lacks a component necessary for growth and forces selection of the desired expression cassettes (e.g., Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.)).

[0199] Fermentation media for the methods and host cells described herein must contain a suitable carbon source such as are taught in U.S. Pat. No. 7,238,482. Suitable sources of carbon encompass a wide variety of sources, with sugars, glycerol and/or fatty acids being preferred. Most preferred are glucose and/or fatty acids containing between 10 and 22 carbons.

[0200] Nitrogen may be supplied from an inorganic (e.g., (NH.sub.4).sub.2SO.sub.4) or organic (e.g., urea or glutamate) source. In addition to appropriate carbon and nitrogen sources, the fermentation media must also contain suitable minerals, salts, cofactors, buffers, vitamins and other components known to those skilled in the art suitable for the growth of the oleaginous yeast and the promotion of the enzymatic pathways of PUFA production. Particular attention is given to several metal ions, such as Fe.sup.+2, Cu.sup.+2, Mn.sup.+2, Co.sup.+2, Zn.sup.+2 and Mg.sup.+2, that promote synthesis of lipids and PUFAs (Nakahara, T. et al., Ind. Appl. Single Cell Oils, D. J. Kyle and R. Colin, eds. pp 61-97 (1992)).

[0201] Preferred growth media for the methods and host cells described herein are common commercially prepared media, such as Yeast Nitrogen Base (DIFCO Laboratories, Detroit, Mich.). Other defined or synthetic growth media may also be used and the appropriate medium for growth of the transformant host cells is well known in microbiology or fermentation science. A suitable pH range for the fermentation is typically between about pH 4.0 and pH 8.0, wherein pH 5.5 to pH 7.5 is preferred as the range for the initial growth conditions. The fermentation may be conducted under aerobic or anaerobic conditions, wherein microaerobic conditions are preferred.

[0202] Typically, accumulation of increased amounts of PUFAs and TAGs in oleaginous yeast cells requires a two-stage process, since the metabolic state must be "balanced" between growth and synthesis/storage of fats. Thus, a two-stage fermentation process is most preferred for the production of oils in oleaginous yeast. This approach is described in U.S. Pat. No. 7,238,482, as are various suitable fermentation process designs (i.e., batch, fed-batch and continuous) and considerations during growth.

Purification and Processing of PUFA Oils

[0203] Fatty acids, including PUFAs, may be found in the host organisms as free fatty acids or in esterified forms such as acylglycerols, phospholipids, sulfolipids or glycolipids. These fatty acids may be extracted from the host cells through a variety of means well-known in the art. One review of extraction techniques, quality analysis and acceptability standards for yeast lipids is that of Z. Jacobs (Critical Reviews in Biotechnology, 12(5/6):463-491 (1992)). A brief review of downstream processing is also available by A. Singh and O. Ward (Adv. Appl. Microbiol., 45:271-312 (1997)).

[0204] In general, means for the purification of fatty acids (including PUFAs) may include extraction (e.g., U.S. Pat. No. 6,797,303 and U.S. Pat. No. 5,648,564) with organic solvents, sonication, supercritical fluid extraction (e.g., using carbon dioxide), saponification and physical means such as presses, or combinations thereof. See U.S. Pat. No. 7,238,482.

Oils for Use in Foodstuffs, Health Food Products, Pharmaceuticals and Animal Feeds

[0205] The marketplace contains many food and feed products incorporating .omega.-3 and/or .omega.-6 fatty acids, particularly ALA, GLA, ARA, EPA, DPA and DHA. It is contemplated that the microbial biomass comprising long-chain PUFAs, partially purified microbial biomass comprising PUFAs, purified microbial oil comprising PUFAs, and/or purified PUFAs made by the methods and host cells described herein impart health benefits upon ingestion of foods or feed improved by their addition. These oils can be added to food analogs, drinks, meat products, cereal products, baked foods, snack foods and dairy products, to name a few. See U.S. Pat. App. Pub. No. 2006/0094092, hereby incorporated herein by reference.

[0206] These compositions may impart health benefits by being added to medical foods including medical nutritionals, dietary supplements, infant formula and pharmaceuticals. The skilled artisan will appreciate the amount of the oils to be added to food, feed, dietary supplements, nutriceuticals, pharmaceuticals, and other ingestible products as to impart health benefits. Health benefits from ingestion of these oils are described in the art, known to the skilled artisan and continuously investigated. Such an amount is referred to herein as an "effective" amount and depends on, among other things, the nature of the ingested products containing these oils and the physical conditions they are intended to address.

DESCRIPTION OF PREFERRED EMBODIMENTS

[0207] As demonstrated in the Examples and summarized in Table 5, infra, disruptions in the C-terminal portion of the C.sub.3HC.sub.4 zinc ring finger motif of YlPex10p, deletion of the entire chromosomal YlPex10 gene or of the entire chromosomal YlPex16 gene, deletion of both the entire chromosomal YlPex10 and the YlPex16 gene, and deletion of the entire chromosomal YlPex3 gene all resulted in an engineered PUFA-producing strain of Yarrowia lipolytica that had an increased weight percent of PUFAs as a percent of total fatty acids, relative to the parental strain whose native Pex protein had no disruption. Expression of an extrachromosomal YlPex10p in an engineered EPA-producing strain of Yarrowia lipolytica that possessed a disruption in the genomic Pex10p and an increased amount of PUFAs in the total lipid fraction and in the oil fraction reversed the effect.

[0208] Table 5 compiles data from Examples 3, 4, 5, 7, 9, 11 and 12, such that trends concerning total lipid content ["TFAs % DCW"], concentration of a given fatty acid(s) expressed as a weight percent of total fatty acids ["% TFAs"], and content of a given fatty acid(s) as its percent of the dry cell weight ["% DCW"] can be deduced, based on the presence/absence of a Pex disruption or knockout. "Desired PUFA % TFAs" and "Desired PUFA % DCW" quantify the particular concentration or content, respectively, of the desired PUFA product (i.e., DGLA or EPA) that the engineered PUFA biosynthetic pathway was designed to produce. "All PUFAs" includes LA, ALA, EDA, DGLA, ETrA, ETA and EPA, while "C20 PUFAs" is limited to EDA, DGLA, ETrA, ETA and EPA.

TABLE 5
PUFA % TFAs and % DCW in Yarrowia lipolytica Strains With Mutant Pex Genes

Example | Strain | Genomic Pex Gene | TFA % DCW | Desired PUFA (% TFAs) | All PUFAs (% TFAs) | C20 PUFAs (% TFAs) | Desired PUFA (% DCW) | All PUFAs (% DCW) | C20 PUFAs (% DCW)
----------------------------------------------------------------------
3, 4 | Y4086 | Wildtype Pex10 | 28.6 | 9.8 [EPA] | 60.1 | 25.2 | 2.8 [EPA] | 17.2 | 7.2
3, 4 | Y4128 | Mutant* Pex10 | 11.2 | 42.8 [EPA] | 79.3 | 57.9 | 4.8 [EPA] | 8.9 | 6.4
5 | Y4128U1 + pFBAIn-PEX10 | Mutant* Pex10 + Plasmid Wildtype Pex10 within chimeric FBAINm::Pex10::Pex20 gene | 29.2 | 10.8 [EPA] | 60 | 27.3 | 3.1 [EPA] | 17.5 | 8.0
5 | Y4128U1 + pPEX10-1 | Mutant* Pex10 + Plasmid Wildtype Pex10 within Pex10-5' (500 bp)::Pex10::Pex10-3' gene | 27.1 | 10.7 [EPA] | 60.1 | 26.7 | 2.9 [EPA] | 16.2 | 7.2
5 | Y4128U1 + pPEX10-2 | Mutant* Pex10 + Plasmid Wildtype Pex10 within Pex10-5' (991 bp)::Pex10::Pex10-3' gene | 28.5 | 10.8 [EPA] | 59 | 26.9 | 3.1 [EPA] | 16.8 | 7.7
5 | Y4128U1 + control | Mutant* Pex10 | 22.8 | 27.7 [EPA] | 62.6 | 42.3 | 6.3 [EPA] | 14.2 | 9.6
7 | Y4184U | Wildtype Pex10 | 11.8 | 20.6 [EPA] | nq | nq | 2.4 [EPA] | nq | nq
7 | | | 8.8 | 23.2 [EPA] | nq | nq | 2.0 [EPA] | nq | nq
7 | Y4184U .DELTA.Pex10 | Mutant Pex10 | 17.6 | 43.2 [EPA] | nq | nq | 7.6 [EPA] | nq | nq
7 | | | 13.2 | 46.1 [EPA] | nq | nq | 6.1 [EPA] | nq | nq
9 | Y4036 (avg) | Wildtype Pex16 | nq | 23.4 [DGLA] | 61.5 | 33.7 | nq | nq | nq
9 | Y4036 (.DELTA.Pex16) (avg) | Mutant Pex16 | nq | 43.4 [DGLA] | 69.1 | 49.1 | nq | nq | nq
11 | Y4305U (.DELTA.pex10) (avg) | Mutant Pex10 and Wildtype Pex16 | 30 | 44.7 [EPA] | 76.6 | 55.4 | 13.4 [EPA] | 23.0 | 16.6
11 | Y4305 (.DELTA.Pex10, .DELTA.Pex16) (avg) | Mutant Pex10, Mutant Pex16 | 30 | 48.3 [EPA] | 79.0 | 57.7 | 14.5 [EPA] | 23.7 | 17.3
12 | Y4036 | Wildtype Pex3 | 4.7 | 19 [DGLA] | 57 | 27 | 0.9 [DGLA] | 2.7 | 1.3
12 | Y4036 (.DELTA.Pex3) | Mutant Pex3 | 6.1 | 46 [DGLA] | 68 | 56 | 2.8 [DGLA] | 4.4 | 3.4
12 | | | 5.9 | 46 [DGLA] | 68 | 56 | 2.7 [DGLA] | 4.0 | 3.3

*Pex10 disruption in Y4128 results in a truncated protein, wherein the last 32 amino acids of the C-terminus (corresponding to the C-terminal portion of the C.sub.3HC.sub.4 zinc ring finger motif) are not present.
nq = not quantified
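The three measures reported in Table 5 appear to be linked arithmetically: multiplying the total lipid content (TFA % DCW) by the concentration of a fatty acid (% TFAs) and dividing by 100 reproduces the tabulated content (% DCW) to within rounding. The Python sketch below checks this for three rows of the table. The function name `pufa_pct_dcw` and the row labels are illustrative assumptions and are not part of the disclosure.

```python
def pufa_pct_dcw(tfa_pct_dcw, pufa_pct_tfa):
    """Return fatty acid content as % DCW, given total lipid content
    (TFA % DCW) and the fatty acid concentration (% TFAs)."""
    return tfa_pct_dcw * pufa_pct_tfa / 100.0


# (strain label, TFA % DCW, desired PUFA % TFAs, reported desired PUFA % DCW)
# Values are copied from Table 5; the labels are shortened for illustration.
rows = [
    ("Y4086, wildtype Pex10", 28.6, 9.8, 2.8),
    ("Y4128, mutant Pex10", 11.2, 42.8, 4.8),
    ("Y4036, Pex3 deleted", 6.1, 46.0, 2.8),
]

for strain, tfa, pct_tfa, reported in rows:
    computed = pufa_pct_dcw(tfa, pct_tfa)
    print(f"{strain}: computed {computed:.2f} % DCW, reported {reported} % DCW")
```

Under this relationship, a strain can show a higher desired PUFA % DCW even when its total lipid content falls, provided the % TFAs value rises enough to compensate, as in Y4128 relative to Y4086.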

[0209] Although data cannot be directly compared between Examples, as a result of different Yarrowia strains and growth conditions, the following conclusions can be drawn (relative to the parental strain whose native Pex protein had not been disrupted or the parental strain that was expressing a "replacement" copy of the disrupted native Pex protein):
[0210] 1) Pex disruption in a PUFA-producing Yarrowia results in an increase in the weight percent of a single PUFA, for example EPA or DGLA, relative to the weight percent of total fatty acids (% TFAs) in the total lipid fraction and in the oil fraction;
[0211] 2) Pex disruption in a PUFA-producing Yarrowia results in an increase in the weight percent of C.sub.20 PUFAs relative to the weight percent of total fatty acids in the total lipid fraction and in the oil fraction;
[0212] 3) By the extension of point 1), Pex disruption in a PUFA-producing Yarrowia results in an increase in the amount of any and all combinations of PUFAs relative to the weight percent of total fatty acids in the total lipid fraction and in the oil fraction; and
[0213] 4) Pex disruption in a PUFA-producing Yarrowia results in an increase in the percent of a single PUFA, for example EPA or DGLA, relative to the dry cell weight.

[0214] Variable results are observed when comparing the effects of Pex disruptions in "All PUFAs % DCW", "C20 PUFAs % DCW" and TFA % DCW. Specifically, in some cases, the Pex disruption in the PUFA-producing Yarrowia results in an increased amount of C.sub.20 PUFAs or All PUFAs, as a percent of dry cell weight, in the total lipid fraction and in the oil fraction (relative to the parental strain whose native Pex protein had not been disrupted). In other cases, there is a diminished amount of C.sub.20 PUFAs or All PUFAs, as a percent of dry cell weight, in the total lipid fraction and in the oil fraction (relative to the parental strain whose native Pex protein had not been disrupted). Similar results are observed with respect to the total lipid content (TFA % DCW), in that the effect of the Pex disruption can either result in an increase in total lipid content or a decrease.

[0215] Although each of the above generalizations is of interest, it is particularly useful to examine the effect of the Pex disruptions on the ratio of the desired PUFA which the organism was engineered to produce relative to the amount of total PUFAs.

[0216] For example, 54% of the PUFAs (as a % TFAs) were EPA in strain Y4128 containing the Pex10 disruption that resulted in truncation of the last 32 amino acids of the C-terminus, while only 16.3% of the PUFAs (as a % TFAs) were EPA in the parent strain, Y4086. Thus, the disruption was responsible for a 3.3-fold increase in the amount of the desired PUFA (as % TFAs) (Examples 3, 4). In a similar manner, 62.8% of the PUFAs (as a % TFAs) were DGLA in strain Y4036 (.DELTA.Pex16), while only 38.1% of the PUFAs (as a % TFAs) were DGLA in Y4036, a 1.65-fold increase (Example 9). And, 67.7% of the PUFAs (as a % TFAs) were DGLA in strain Y4036 (.DELTA.Pex3), while only 33.3% of the PUFAs (as a % TFAs) were DGLA in Y4036, a 2.0-fold increase (Example 12). These results support the hypothesis that the Pex disruption results in a selective increase in the amount, as a % TFAs, of the desired PUFA which the organism was engineered to produce in the total lipid and oil fractions.

[0217] Less significant selectivity is observed when examining the effect of Pex disruptions on the ratio of C20 PUFAs relative to the amount of total PUFAs. For example, 73% of the PUFAs (as a % TFAs) were C20 PUFAs in strain Y4128 containing the Pex10 disruption, while only 42% of the PUFAs (as a % TFAs) were C20 PUFAs in strain Y4086. Thus, the disruption was responsible for a 1.7-fold increase in the amount of C20 PUFAs that accumulated in the total lipid and oil fractions, relative to the total PUFAs (Examples 3, 4). In a similar manner, 71% of the PUFAs (as a % TFAs) were C20 PUFAs in strain Y4036 (.DELTA.Pex16), while only 54.8% of the PUFAs (as a % TFAs) were C20 PUFAs in Y4036, a 1.3-fold increase (Example 9). And, 82.4% of the PUFAs (as a % TFAs) were C20 PUFAs in strain Y4036 (.DELTA.Pex3), while only 47.4% of the PUFAs (as a % TFAs) were C20 PUFAs in Y4036, a 1.7-fold increase (Example 12).
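The selectivity figures quoted in the two preceding paragraphs can be reproduced from the % TFAs columns of Table 5. The Python sketch below does so for the Y4086 and Y4128 pair of Examples 3 and 4: each share is the desired PUFA (or C20 PUFA) concentration divided by the All PUFAs concentration, and the fold increase is the ratio of the mutant share to the parent share. The function and dictionary names are illustrative assumptions only.

```python
def share_of_pufas(fraction_pct_tfa, all_pufas_pct_tfa):
    """Percent of all PUFAs accounted for by a given PUFA fraction,
    with both inputs expressed as % TFAs."""
    return 100.0 * fraction_pct_tfa / all_pufas_pct_tfa


def fold_increase(mutant_share, parent_share):
    """Ratio of the mutant share to the parent share."""
    return mutant_share / parent_share


# % TFAs values copied from Table 5 (Examples 3, 4).
y4086 = {"desired": 9.8, "c20": 25.2, "all": 60.1}   # wildtype Pex10
y4128 = {"desired": 42.8, "c20": 57.9, "all": 79.3}  # mutant Pex10

epa_parent = share_of_pufas(y4086["desired"], y4086["all"])
epa_mutant = share_of_pufas(y4128["desired"], y4128["all"])
c20_parent = share_of_pufas(y4086["c20"], y4086["all"])
c20_mutant = share_of_pufas(y4128["c20"], y4128["all"])

print(f"EPA share: parent {epa_parent:.1f}%, mutant {epa_mutant:.1f}%, "
      f"fold {fold_increase(epa_mutant, epa_parent):.1f}")
print(f"C20 share: parent {c20_parent:.1f}%, mutant {c20_mutant:.1f}%, "
      f"fold {fold_increase(c20_mutant, c20_parent):.1f}")
```

The same two functions apply to the Y4036 rows of Examples 9 and 12 by substituting the DGLA, C20 and All PUFAs values from the table.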

[0218] On the basis of the teachings and results described herein, it is expected that the feasibility and commercial utility of utilizing various disruptions in native genes encoding peroxisome biogenesis factor proteins as a means to increase the amount of PUFAs produced in a PUFA-producing eukaryotic organism will be appreciated. The PUFA-producing eukaryotic organism can synthesize a variety of .omega.-3 and/or .omega.-6 PUFAs, using either the .DELTA.9 elongase/.DELTA.8 desaturase pathway or the .DELTA.6 desaturase/.DELTA.6 elongase pathway.

EXAMPLES

[0219] The present invention is further described in the following Examples, which illustrate reductions to practice of the invention but do not completely define all of its possible variations.

General Methods

[0220] Standard recombinant DNA and molecular cloning techniques used in the Examples are well known in the art and are described by: 1) Sambrook, J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1989) (Maniatis); 2) T. J. Silhavy, M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y. (1984); and, 3) Ausubel, F. M. et al., Current Protocols in Molecular Biology, published by Greene Publishing Assoc. and Wiley-Interscience, Hoboken, N.J. (1987).

[0221] Materials and methods suitable for the maintenance and growth of microbial cultures are well known in the art. Techniques suitable for use in the following examples may be found as set out in Manual of Methods for General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips, Eds.), American Society for Microbiology: Washington, D.C. (1994); or by Thomas D. Brock in Biotechnology: A Textbook of Industrial Microbiology, 2.sup.nd ed., Sinauer Associates: Sunderland, Mass. (1989). All reagents, restriction enzymes and materials used for the growth and maintenance of microbial cells were obtained from Aldrich Chemicals (Milwaukee, Wis.), DIFCO Laboratories (Detroit, Mich.), New England Biolabs, Inc. (Beverly, Mass.), GIBCO/BRL (Gaithersburg, Md.), or Sigma Chemical Company (St. Louis, Mo.), unless otherwise specified. E. coli strains were typically grown at 37.degree. C. on Luria Bertani (LB) plates.

[0222] General molecular cloning was performed according to standard methods (Sambrook et al., supra). DNA sequence was generated on an ABI Automatic sequencer using dye terminator technology (U.S. Pat. No. 5,366,860; EP 272,007) using a combination of vector and insert-specific primers. Sequence editing was performed in Sequencher (Gene Codes Corporation, Ann Arbor, Mich.). All sequences represent coverage at least two times in both directions. Unless otherwise indicated herein comparisons of genetic sequences were accomplished using DNASTAR software (DNASTAR Inc., Madison, Wis.).

[0223] The meaning of abbreviations is as follows: "sec" means second(s), "min" means minute(s), "h" means hour(s), "d" means day(s), ".mu.L" means microliter(s), "mL" means milliliter(s), "L" means liter(s), ".mu.M" means micromolar, "mM" means millimolar, "M" means molar, "mmol" means millimole(s), ".mu.mole" means micromole(s), "g" means gram(s), ".mu.g" means microgram(s), "ng" means nanogram(s), "U" means unit(s), "bp" means base pair(s) and "kB" means kilobase(s).

Nomenclature for Expression Cassettes:

[0224] The structure of an expression cassette is represented by a simple notation system of "X::Y::Z", wherein X describes the promoter fragment, Y describes the gene fragment, and Z describes the terminator fragment, which are all operably linked to one another.
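The "X::Y::Z" notation can be split mechanically into its three parts. The following is a minimal sketch only; the names `Cassette` and `parse_cassette` are hypothetical helpers introduced here for illustration and are not part of the disclosure.

```python
from typing import NamedTuple


class Cassette(NamedTuple):
    """One expression cassette written as promoter::gene::terminator."""
    promoter: str
    gene: str
    terminator: str


def parse_cassette(notation: str) -> Cassette:
    """Split an "X::Y::Z" string into promoter, gene and terminator names."""
    parts = notation.strip().split("::")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected promoter::gene::terminator, got {notation!r}")
    return Cassette(*parts)


# Example taken from the genotype lists in the Examples.
example = parse_cassette("FBAINm::EgD8M::Pex20")
print(example.promoter, example.gene, example.terminator)
```

A genotype list such as those given in the Examples can then be processed by splitting on ", " and passing each entry that contains "::" to the parser.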

Transformation and Cultivation of Yarrowia lipolytica

[0225] Yarrowia lipolytica strain ATCC #20362 was purchased from the American Type Culture Collection (Rockville, Md.). Yarrowia lipolytica strains were routinely grown at 28-30.degree. C. in several media, according to the recipes shown below. Agar plates were prepared as required by addition of 20 g/L agar to each liquid medium, according to standard methodology.

[0226] YPD agar medium (per liter): 10 g of yeast extract [Difco], 20 g of Bacto peptone [Difco], and 20 g of glucose.

[0227] Basic Minimal Media (MM) (per liter): 20 g glucose, 1.7 g yeast nitrogen base without amino acids, and 1.0 g proline; pH 6.1 (not adjusted).

[0228] Minimal Media+Uracil (MM+uracil or MMU) (per liter): Prepare MM media as above and add 0.1 g uracil and 0.1 g uridine.

[0229] Minimal Media+Uracil+Sulfonylurea (MMU+SU) (per liter): Prepare MMU media as above and add 280 mg sulfonylurea.

[0230] Minimal Media+Leucine+Lysine (MMLeuLys) (per liter): Prepare MM media as above and add 0.1 g leucine and 0.1 g lysine.

[0231] Minimal Media+5-Fluoroorotic Acid (MM+5-FOA) (per liter): 20 g glucose, 6.7 g Yeast Nitrogen base, 75 mg uracil, 75 mg uridine and an appropriate amount of FOA (Zymo Research Corp., Orange, Calif.), based on FOA activity testing against a range of concentrations from 100 mg/L to 1000 mg/L (since variation occurs within each batch received from the supplier).

[0232] High Glucose Media (HGM) (per liter): 80 g glucose, 2.58 g KH.sub.2PO.sub.4 and 5.36 g K.sub.2HPO.sub.4; pH 7.5 (no adjustment needed).

[0233] Fermentation medium without Yeast Extract (FM without YE) (per liter): 6.70 g Yeast Nitrogen base, 6.00 g KH.sub.2PO.sub.4, 2.00 g K.sub.2HPO.sub.4, 1.50 g MgSO.sub.4*7H.sub.2O and 20 g Glucose.

[0234] Fermentation medium (FM) (per liter): Prepare FM without YE media as above and add 5.00 g Yeast extract (BBL).

[0235] Synthetic Dextrose Media (SD) (per liter): 6.7 g Yeast Nitrogen base with ammonium sulfate and without amino acids; and 20 g glucose.

[0236] Complete Minimal Glucose Broth Minus Uracil (CSM-Ura): Catalog No. C8140, Teknova, Hollister, Calif. (0.13% amino acid dropout powder minus uracil, 0.17% yeast nitrogen base, 0.5% (NH.sub.4).sub.2SO.sub.4, 2.0% glucose).
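Because the recipes above are stated per liter, amounts for any other batch volume follow by simple proportion. The sketch below scales the MMU recipe, including the 20 g/L agar used for plates, to a 0.5 L batch; the helper `scale_recipe` and the 0.5 L volume are illustrative assumptions, not values from the disclosure.

```python
# Per-liter amounts (grams) taken from the MM and MMU recipes above.
MM_PER_LITER_G = {
    "glucose": 20.0,
    "yeast nitrogen base without amino acids": 1.7,
    "proline": 1.0,
}
MMU_PER_LITER_G = {**MM_PER_LITER_G, "uracil": 0.1, "uridine": 0.1}
AGAR_G_PER_L = 20.0  # added only when plates are poured


def scale_recipe(per_liter: dict, volume_l: float, agar: bool = False) -> dict:
    """Scale a per-liter recipe to volume_l liters, optionally adding agar."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    scaled = {name: grams * volume_l for name, grams in per_liter.items()}
    if agar:
        scaled["agar"] = AGAR_G_PER_L * volume_l
    return scaled


# Hypothetical 0.5 L batch of MMU agar plates.
mmu_plates = scale_recipe(MMU_PER_LITER_G, 0.5, agar=True)
for component, grams in mmu_plates.items():
    print(f"{component}: {grams:.2f} g")
```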

[0237] Transformation of Y. lipolytica was performed according to the method of Chen, D. C. et al. (Appl. Microbiol. Biotechnol., 48(2):232-235 (1997)), unless otherwise noted. Briefly, Yarrowia was streaked onto a YPD plate and grown at 30.degree. C. for approximately 18 hr. Several large loopfuls of cells were scraped from the plate and resuspended in 1 mL of transformation buffer containing: 2.25 mL of 50% PEG, average MW 3350; 0.125 mL of 2 M Li acetate, pH 6.0; and 0.125 mL of 2 M DTT. Then, approximately 500 ng of linearized plasmid DNA was incubated in 100 .mu.l of resuspended cells, and maintained at 39.degree. C. for 1 hr with vortex mixing at 15 min intervals. The cells were plated onto selection media plates and maintained at 30.degree. C. for 2 to 3 days.

Fatty Acid Analysis Of Yarrowia lipolytica

[0238] For fatty acid analysis, cells were collected by centrifugation and lipids were extracted as described in Bligh, E. G. & Dyer, W. J. (Can. J. Biochem. Physiol., 37:911-917 (1959)). Fatty acid methyl esters ["FAMEs"] were prepared by transesterification of the lipid extract with sodium methoxide (Roughan, G., and Nishida I., Arch Biochem Biophys., 276(1):38-46 (1990)) and subsequently analyzed with a Hewlett-Packard 6890 GC fitted with a 30-m.times.0.25 mm (i.d.) HP-INNOWAX (Hewlett-Packard) column. The oven temperature was from 170.degree. C. (25 min hold) to 185.degree. C. at 3.5.degree. C./min.
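The length of the oven program follows from the stated hold and ramp. The sketch below assumes there is no additional hold at 185.degree. C., which the text does not specify; the function name is illustrative.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to ramp linearly from start_c to end_c."""
    if rate_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    return (end_c - start_c) / rate_c_per_min


hold_min = 25.0                              # initial hold at 170 C
ramp_min = ramp_minutes(170.0, 185.0, 3.5)   # 15 C at 3.5 C/min
total_min = hold_min + ramp_min              # assumes no final hold

print(f"ramp: {ramp_min:.1f} min, total program: {total_min:.1f} min")
```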

[0239] For direct base transesterification, Yarrowia culture (3 mL) was harvested, washed once in distilled water, and dried under vacuum in a Speed-Vac for 5-10 min. Sodium methoxide (100 .mu.l of 1%) was added to the sample, and then the sample was vortexed and rocked for 20 min. After adding 3 drops of 1 M NaCl and 400 .mu.l hexane, the sample was vortexed and spun. The upper layer was removed and analyzed by GC as described above.

Example 1

Generation of Yarrowia lipolytica Strain Y4086 to Produce about 14% EPA of Total Lipids Via the .DELTA.9 Elongase/.DELTA.8 Desaturase Pathway

[0240] The present Example describes the construction of strain Y4086, derived from Yarrowia lipolytica ATCC #20362, capable of producing about 14% EPA relative to the total lipids via expression of a .DELTA.9 elongase/.DELTA.8 desaturase pathway (FIG. 3A).

[0241] The development of strain Y4086 required the construction of strain Y2224 (a FOA resistant mutant from an autonomous mutation of the Ura3 gene of wildtype Yarrowia strain ATCC #20362), strain Y4001 (producing 17% EDA with a Leu- phenotype), strain Y4001U (Leu- and Ura- phenotype), strain Y4036 (producing 18% DGLA with a Leu- phenotype), strain Y4036U (Leu- and Ura- phenotype) and strain Y4070 (producing 12% ARA with a Ura- phenotype). Further details regarding the construction of strains Y2224, Y4001, Y4001U, Y4036, Y4036U and Y4070 are described in Example 7 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference.

[0242] The final genotype of strain Y4070 with respect to wildtype Yarrowia lipolytica ATCC #20362 was Ura3-, unknown 1-, unknown 3-, Leu+, Lys+, GPD::FmD12::Pex20, YAT1::FmD12::OCT, YAT1::ME3S::Pex16, GPAT::EgD9e::Lip2, EXP1::EgD9eS::Lip1, FBAINm::EgD9eS::Lip2, FBAINm::EgD8M::Pex20, EXP1::EgD8M::Pex16, FBAIN::EgD5::Aco, EXP1::EgD5S::Pex20, YAT1::RD5S::OCT (wherein FmD12 is a Fusarium moniliforme .DELTA.12 desaturase gene [Int'l. App. Pub. No. WO 2005/047485]; ME3S is a codon-optimized C.sub.16/18 elongase gene, derived from Mortierella alpina [Int'l. App. Pub. No. WO 2007/046817]; EgD9e is a Euglena gracilis .DELTA.9 elongase gene [Int'l. App. Pub. No. WO 2007/061742]; EgD9eS is a codon-optimized .DELTA.9 elongase gene, derived from Euglena gracilis [Int'l. App. Pub. No. WO 2007/061742]; EgD8M is a synthetic mutant .DELTA.8 desaturase [Int'l. App. Pub. No. WO 2008/073271], derived from Euglena gracilis [U.S. Pat. No. 7,256,033]; EgD5 is a Euglena gracilis .DELTA.5 desaturase [U.S. Pat. App. Pub. US 2007-0292924-A1]; EgD5S is a codon-optimized .DELTA.5 desaturase gene, derived from Euglena gracilis [U.S. Pat. App. Pub. No. 2007-0292924]; and RD5S is a codon-optimized .DELTA.5 desaturase, derived from Peridinium sp. CCMP626 [U.S. Pat. App. Pub. No. 2007-0271632]).

Generation of Y4086 Strain to Produce about 14% EPA of Total Lipids

[0243] Construct pZP3-Pa777U (FIG. 3B; SEQ ID NO:28), described in Table 19 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference, was generated to integrate three .DELTA.17 desaturase genes into the Pox3 loci (GenBank Accession No. AJ001301) of strain Y4070, to thereby enable production of EPA. The .DELTA.17 desaturase genes were PaD17, a Pythium aphanidermatum .DELTA.17 desaturase (Int'l. App. Pub. No. WO 2008/054565), and PaD17S, a codon-optimized .DELTA.17 desaturase derived from Pythium aphanidermatum (Int'l. App. Pub. No. WO 2008/054565).

[0244] The pZP3-Pa777U plasmid was digested with AscI/SphI, and then used for transformation of strain Y4070 according to the General Methods. The transformant cells were plated onto MM plates and maintained at 30.degree. C. for 2 to 3 days. Single colonies were re-streaked onto MM plates, and then inoculated into liquid MMLeuLys at 30.degree. C. and shaken at 250 rpm/min for 2 days. The cells were collected by centrifugation, lipids were extracted, and FAMEs were prepared by trans-esterification, and subsequently analyzed with a Hewlett-Packard 6890 GC.

[0245] GC analyses showed the presence of EPA in the transformants containing the 3 chimeric genes of pZP3-Pa777U, but not in the parent Y4070 strain. Most of the selected 96 strains produced 10-13% EPA of total lipids. There were 2 strains (i.e., #58 and #79) that produced about 14.2% and 13.8% EPA of total lipids. These two strains were designated as Y4085 and Y4086, respectively.

[0246] The final genotype of strain Y4086 with respect to wildtype Yarrowia lipolytica ATCC #20362 was Ura3+, Leu+, Lys+, unknown 1-, unknown 2-, YALI0F24167g-, GPD::FmD12::Pex20, YAT1::FmD12::OCT, YAT1::ME3S::Pex16, GPAT::EgD9e::Lip2, EXP1::EgD9eS::Lip1, FBAINm::EgD9eS::Lip2, FBAINm::EgD8M::Pex20, EXP1::EgD8M::Pex16, FBAIN::EgD5::Aco, EXP1::EgD5S::Pex20, YAT1::RD5S::OCT, YAT1::PaD17S::Lip1, EXP1::PaD17::Pex16, FBAINm::PaD17::Aco.

Example 2

Generation of Yarrowia lipolytica Strain Y4128 to Produce about 37% EPA of Total Lipids Via the .DELTA.9 Elongase/.DELTA.8 Desaturase Pathway

[0247] The present Example describes the construction of strain Y4128, derived from Yarrowia lipolytica ATCC #20362, capable of producing about 37.6% EPA relative to the total lipids (i.e., greater than a 2-fold increase in EPA concentration as percent of total fatty acids with respect to Y4086; FIG. 3A).

[0248] The development of strain Y4128 required the construction of strains Y2224, Y4001, Y4001U, Y4036, Y4036U, Y4070 and Y4086 (described in Example 1), as well as construction of strain Y4086U1 (Ura-).

Generation Of Strain Y4086U1 (Ura-)

[0249] Strain Y4086U1 was created via temporary expression of the Cre recombinase enzyme in construct pY117 (FIG. 4A; SEQ ID NO:29; described in Table 20 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) within strain Y4086 to produce a Ura- phenotype. This released the LoxP sandwiched Ura3 gene from the genome. The mutated Yarrowia acetohydroxyacid synthase ["AHAS"; E.C. 4.1.3.18] enzyme (i.e., GenBank Accession No. XP.sub.--501277, comprising a W497L mutation as set forth in SEQ ID NO:27; see Int'l. App. Pub. No. WO 2006/052870) in plasmid pY117 conferred sulfonyl urea herbicide resistance (SU.sup.R), which was used as a positive screening marker.

[0250] Plasmid pY117 was used to transform strain Y4086 according to the General Methods. Following transformation, the cells were plated onto MMU+SU (280 .mu.g/mL sulfonylurea; also known as chlorimuron ethyl, E. I. duPont de Nemours & Co., Inc., Wilmington, Del.) plates and maintained at 30.degree. C. for 2 to 3 days. The individual SU.sup.R colonies grown on MMU+SU plates were picked and inoculated into YPD liquid media at 30.degree. C. and shaken at 250 rpm/min for 1 day to cure the pY117 plasmid. The grown cultures were streaked onto MMU plates. After two days at 30.degree. C., the individual colonies were re-streaked onto MM and MMU plates. Those colonies that could grow on MMU plates, but not on MM plates, were selected. Two of these strains with Ura- phenotypes were designated as Y4086U1 and Y4086U2.

Generation of Y4128 Strain to Produce about 37% EPA of Total Lipids

[0251] Construct pZP2-2988 (FIG. 4B; SEQ ID NO:30; described in Table 21 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was generated to integrate one .DELTA.12 desaturase gene (i.e., FmD12S, a codon-optimized .DELTA.12 desaturase gene derived from Fusarium moniliforme [Int'l. App. Pub. No. WO 2005/047485]), two .DELTA.8 desaturase genes (i.e., EgD8M) and one .DELTA.9 elongase gene (i.e., EgD9eS) into the Pox2 loci (GenBank Accession No. AJ001300) of strain Y4086U1, to thereby enable higher level production of EPA. The pZP2-2988 plasmid was digested with AscI/SphI, and then used for transformation of strain Y4086U1 according to the General Methods. The transformant cells were plated onto MM plates and maintained at 30.degree. C. for 2 to 3 days. Single colonies were re-streaked onto MM plates, and then inoculated into liquid MMLeuLys at 30.degree. C. and shaken at 250 rpm/min for 2 days. The cells were collected by centrifugation, resuspended in HGM and then shaken at 250 rpm/min for 5 days. The cells were collected by centrifugation, lipids were extracted, and FAMEs were prepared by trans-esterification, and subsequently analyzed with a Hewlett-Packard 6890 GC.

[0252] GC analyses showed that most of the selected 96 strains produced 12-15.6% EPA of total lipids. There were 2 strains (i.e., #37 within Group I and #33 within Group II) that produced about 37.6% and 16.3% EPA of total lipids. These two strains were designated as Y4128 and Y4129, respectively.

[0253] The final genotype of strain Y4128 with respect to wildtype Yarrowia lipolytica ATCC #20362 was: YALI0F24167g-, Pex10-, unknown 1-, unknown 2-, GPD::FmD12::Pex20, YAT1::FmD12::OCT, GPM/FBAIN::FmD12S::OCT, YAT1::ME3S::Pex16, GPAT::EgD9e::Lip2, EXP1::EgD9eS::Lip1, FBAINm::EgD9eS::Lip2, FBA::EgD9eS::Pex20, FBAINm::EgD8M::Pex20, EXP1::EgD8M::Pex16, GPDIN::EgD8M::Lip1, YAT1::EgD8M::Aco, FBAIN::EgD5::Aco, EXP1::EgD5S::Pex20, YAT1::RD5S::OCT, YAT1::PaD17S::Lip1, EXP1::PaD17::Pex16, FBAINm::PaD17::Aco.

[0254] Yarrowia lipolytica strain Y4128 was deposited with the American Type Culture Collection on Aug. 23, 2007 and bears the designation ATCC PTA-8614.

Generation of Y4128U Strains With A Ura- Phenotype

[0255] In order to disrupt the Ura3 gene in strain Y4128, construct pZKUE3S (FIG. 5A; SEQ ID NO:31; described in Table 22 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was created to integrate an EXP1::ME3S::Pex20 chimeric gene into the Ura3 gene of strain Y4128. Plasmid pZKUE3S was digested with SphI/PacI, and then used to transform strain Y4128 according to the General Methods. Following transformation, cells were plated onto MM+5-FOA selection plates and maintained at 30.degree. C. for 2 to 3 days.

[0256] A total of 24 transformants grown on MM+5-FOA selection plates were picked and re-streaked onto fresh MM+5-FOA plates. The cells were stripped from the plates, lipids were extracted, and FAMEs were prepared by trans-esterification, and subsequently analyzed with a Hewlett-Packard 6890 GC.

[0257] GC analyses showed the presence of between 10-15% EPA in all of the transformants with pZKUE3S from plates. The strains identified as #3, #4, #10, #12, #19 and #21 that produced 12.9%, 14.4%, 15.2%, 15.4%, 14% and 10.9% EPA of total lipids were designated as Y4128U1, Y4128U2, Y4128U3, Y4128U4, Y4128U5 and Y4128U6, respectively (collectively, Y4128U).

[0258] The discrepancy in the % EPA quantified in Y4128 (37.6%) versus Y4128U (average 13.8%) is based on differing growth conditions. Specifically, the former culture was analyzed following two days of growth in liquid culture, while the latter culture was analyzed after growth on an agar plate. The Applicants have observed a 2-3 fold increase in % EPA, when comparing results from agar plates to those in liquid culture. Thus, although results are not directly comparable, both Y4128 and Y4128U strains demonstrate production of EPA.
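The average quoted for the Y4128U isolates, and its relation to the liquid-culture value for Y4128, can be reproduced from the figures given above. This is an arithmetic sketch only; the variable names are illustrative.

```python
# % EPA of total lipids for the six Y4128U isolates (grown on agar plates).
plate_epa = {
    "Y4128U1": 12.9,
    "Y4128U2": 14.4,
    "Y4128U3": 15.2,
    "Y4128U4": 15.4,
    "Y4128U5": 14.0,
    "Y4128U6": 10.9,
}
liquid_epa_y4128 = 37.6  # % EPA of total lipids for Y4128 in liquid culture

mean_plate = sum(plate_epa.values()) / len(plate_epa)
fold = liquid_epa_y4128 / mean_plate

print(f"mean plate EPA: {mean_plate:.1f}% of total lipids")
print(f"liquid/plate ratio: {fold:.1f}-fold")
```

The ratio falls within the 2-3 fold range that the Applicants report between plate and liquid-culture measurements.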

Example 3

Determination of Total Lipid Content of Yarrowia lipolytica Strain Y4128

[0259] The total amount of lipid produced by strain Y4128 and the percentage of each fatty acid species in the lipid were measured by GC analysis. Specifically, total lipids were extracted, and FAMEs were prepared by trans-esterification, and subsequently analyzed with a Hewlett-Packard 6890 GC, as described in the General Methods.

[0260] Dry cell weight was determined by collecting cells from 10 mL of culture via centrifugation, washing the cells with water once to remove residual medium, drying the cells in a vacuum oven at 80.degree. C. overnight, and weighing the dried cells. The total amount of FAMEs in a sample was determined by comparing the areas of all peaks in the GC profile with the peak area of an added known amount of internal standard C15:0 fatty acid.
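The internal-standard calculation described above can be sketched as follows. The sketch assumes an equal detector response per unit mass for every FAME, which is a simplification not stated in the text, and all peak areas and masses in the example are hypothetical; the function names are illustrative.

```python
def total_fame_mg(peak_areas: dict, istd_area: float, istd_mg: float) -> float:
    """Estimate total FAME mass from GC peak areas.

    peak_areas holds the sample peaks only (the C15:0 internal standard
    peak is passed separately). Assumes equal response per unit mass.
    """
    if istd_area <= 0:
        raise ValueError("internal standard peak area must be positive")
    return sum(peak_areas.values()) / istd_area * istd_mg


def tfa_percent_dcw(fame_mg: float, dcw_mg: float) -> float:
    """Total fatty acids as a percentage of dry cell weight."""
    if dcw_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    return 100.0 * fame_mg / dcw_mg


# Hypothetical numbers for illustration only.
areas = {"18:0": 120.0, "18:1": 480.0, "18:2": 600.0, "20:5": 800.0}
fame = total_fame_mg(areas, istd_area=200.0, istd_mg=0.5)
print(f"total FAMEs: {fame:.2f} mg")
print(f"TFAs % DCW: {tfa_percent_dcw(fame, dcw_mg=25.0):.1f}")
```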

[0261] Based on the above analyses, lipid content as a percentage of dry cell weight (DCW) and lipid composition were determined for strains Y4086 and Y4128. Strain Y4128 had decreased lipid content with respect to strain Y4086 (11.2 TFAs % DCW versus 28.6 TFAs % DCW). In contrast, strain Y4128 had elevated EPA concentrations among lipids with respect to strain Y4086, as shown below in Table 6. Fatty acids are identified as 18:0 (stearic acid), 18:1 (oleic acid), LA, ALA, EDA, DGLA, ETrA, ETA and EPA; fatty acid compositions were expressed as the weight percent (wt. %) of total fatty acids (TFAs).

TABLE-US-00007 TABLE 6 Lipid Composition in Yarrowia lipolytica Strains Y4086 And Y4128

Sample  18:0  18:1  18:2   18:3     20:2   20:3     20:3     20:4     20:5
                           (n-3)           (n-6)    (n-3)    (n-3)    (n-3)
                    [LA]   [ALA]    [EDA]  [DGLA]   [ETrA]   [ETA]    [EPA]
Y4086   4.6   26.8  28.0   6.9      7.6    0.9      4.9      2.0      9.8
Y4128   1.8   6.7   19.6   1.8      4.2    3.4      1.5      6.0      42.8

EPA content in the cell, expressed as mg EPA/g dry cell and calculated according to the following formula: (% of EPA/Lipid) * (% of Lipid/dry cell weight) * 0.1, increased from 28 mg EPA/g DCW in strain Y4086 to 47.9 mg EPA/g DCW in strain Y4128.

[0262] Thus, the results in Table 6 showed that compared to the parent strain Y4086, strain Y4128 had a lower total lipid content (TFAs % DCW) (11.2% versus 28.6%), higher EPA % TFAs (42.8% versus 9.8%), and higher EPA % DCW (4.8% versus 2.8%). Additionally, strain Y4128 had a 3.3-fold increase in the amount of EPA relative to the total PUFAs (54% of the PUFAs [as a % TFAs] versus 16.3% of the PUFAs [as a % TFAs]) and a 1.7-fold increase in the amount of C20 PUFAs relative to the total PUFAs (73% of the PUFAs [as a % TFAs] versus 42% of the PUFAs [as a % TFAs]).
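The derived figures quoted for Table 6 can be recomputed from the tabulated values and the stated formula, (% of EPA/Lipid) * (% of Lipid/dry cell weight) * 0.1. The sketch below is an arithmetic check only; the dictionary layout and function names are illustrative.

```python
# Values from Table 6 (wt. % of TFAs) plus TFAs % DCW from the text.
TABLE_6 = {
    "Y4086": {"TFA_pct_DCW": 28.6, "LA": 28.0, "ALA": 6.9, "EDA": 7.6,
              "DGLA": 0.9, "ETrA": 4.9, "ETA": 2.0, "EPA": 9.8},
    "Y4128": {"TFA_pct_DCW": 11.2, "LA": 19.6, "ALA": 1.8, "EDA": 4.2,
              "DGLA": 3.4, "ETrA": 1.5, "ETA": 6.0, "EPA": 42.8},
}
PUFAS = ("LA", "ALA", "EDA", "DGLA", "ETrA", "ETA", "EPA")
C20_PUFAS = ("EDA", "DGLA", "ETrA", "ETA", "EPA")


def epa_mg_per_g_dcw(row: dict) -> float:
    """(% EPA of lipid) * (% lipid of dry cell weight) * 0.1."""
    return row["EPA"] * row["TFA_pct_DCW"] * 0.1


def share_of_pufas(row: dict, acids: tuple) -> float:
    """Percentage of the total PUFAs contributed by the given fatty acids."""
    return 100.0 * sum(row[a] for a in acids) / sum(row[a] for a in PUFAS)


for strain, row in TABLE_6.items():
    print(strain,
          f"EPA {epa_mg_per_g_dcw(row):.1f} mg/g DCW,",
          f"EPA {share_of_pufas(row, ('EPA',)):.1f}% of PUFAs,",
          f"C20 {share_of_pufas(row, C20_PUFAS):.1f}% of PUFAs")
```

Dividing the Y4128 shares by the Y4086 shares gives the approximately 3.3-fold and 1.7-fold increases stated in the text.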

Example 4

Determination of the Integration Site of pZP2-2988 in Yarrowia lipolytica Strain Y4128 as a Pex10 Integration

[0263] The genomic integration site of pZP2-2988 in strain Y4128 was determined by genome walking using the Universal GenomeWalker.TM. Kit from Clontech (Palo Alto, Calif.), following the manufacturer's recommended protocol. Based on the sequence of the plasmid, the following primers were designed for genome walking: pZP-GW-5-1 (SEQ ID NO:32), pZP-GW-5-2 (SEQ ID NO:33), pZP-GW-5-3 (SEQ ID NO:34), pZP-GW-5-4 (SEQ ID NO:35), pZP-GW-3-1 (SEQ ID NO:36), pZP-GW-3-2 (SEQ ID NO:37), pZP-GW-3-3 (SEQ ID NO:38) and pZP-GW-3-4 (SEQ ID NO:39).

[0264] Genomic DNA was prepared from strain Y4128 using the Qiagen Miniprep kit with a modified protocol. Cells were scraped off a YPD medium plate into a 1.5 mL microfuge tube. Cell pellet (100 .mu.l) was resuspended with 250 .mu.l of buffer P1 containing 0.125 M .beta.-mercaptoethanol and 1 mg/mL zymolyase 20 T (MP Biomedicals, Inc., Solon, Ohio). The cell suspension was incubated at 37.degree. C. for 30 min. Buffer P2 (250 .mu.l) was then added to the tube. After mixing by inverting the tube for several times, 350 .mu.l of buffer N3 was added. The mixture was then centrifuged at 14,000 rpm for 5 min in a microfuge. Supernatant was poured into a Qiagen miniprep spin column, and centrifuged for 1 min. The column was washed once by adding 0.75 mL of buffer PE, followed by centrifugation at 14,000 rpm for 1 min. The column was dried by further centrifugation at 14,000 rpm for 1 min. Genomic DNA was eluted by adding 50 .mu.l of buffer EB to the column, allowed to sit for 1 min and centrifuged at 14,000 rpm for 1 min.

[0265] Purified genomic DNA was used for genome walking. The DNA was digested with restriction enzymes DraI, EcoRV, PvuII and StuI separately, according to the protocol of the GenomeWalker kit. For each digestion, the reaction mixture contained 10 .mu.l of 10.times. restriction buffer, 10 .mu.l of the appropriate restriction enzyme and 8 .mu.g of genomic DNA in a total volume of 100 .mu.l. The reaction mixtures were incubated at 37.degree. C. for 4 hrs. The digested DNA samples were then purified using Qiagen PCR purification kit following the manufacturer's protocol exactly. DNA samples were eluted in 16 .mu.l water. Purified, digested genomic DNA samples were then ligated to the genome walker adaptor (infra). Each ligation mixture contained 1.9 .mu.l of the genome walker adaptor, 1.6 .mu.l of 10.times. ligation buffer, 0.5 .mu.l T4 DNA ligase and 4 .mu.l of the digested DNA. The reaction mixtures were incubated at 16.degree. C. overnight. Then, 72 .mu.l of 50 mM Tris HCl, 1 mM EDTA, pH 7.5 were added to each ligation mixture.
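The ligation volumes above sum to a small reaction that is then diluted with Tris-EDTA buffer. The following sketch tallies the stated volumes and the resulting dilution; it is an arithmetic illustration and the variable names are hypothetical.

```python
# Volumes in microliters, as stated for each ligation mixture.
ligation_ul = {
    "genome walker adaptor": 1.9,
    "10x ligation buffer": 1.6,
    "T4 DNA ligase": 0.5,
    "digested genomic DNA": 4.0,
}
diluent_ul = 72.0  # 50 mM Tris HCl, 1 mM EDTA, pH 7.5, added after ligation

ligation_total = sum(ligation_ul.values())
final_total = ligation_total + diluent_ul
dilution_factor = final_total / ligation_total

print(f"ligation volume: {ligation_total:.1f} ul")
print(f"volume after dilution: {final_total:.1f} ul ({dilution_factor:.0f}-fold)")
```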

[0266] For 5'-end genome walking, four PCR reactions were carried out using 1 .mu.l of each ligation mixture individually as template. In addition, each reaction mixture contained 1 .mu.l of 10 .mu.M primer pZP-GW-5-1 (SEQ ID NO:32), 1 .mu.l of 10 .mu.M kit-supplied Genome Walker adaptor, 41 .mu.l water, 5 .mu.l 10.times. cDNA PCR reaction buffer and 1 .mu.l Advantage cDNA polymerase mix from Clontech. The sequence of the Genome Walker adaptor (SEQ ID NOs:40 [top strand] and 41 [bottom strand]), is shown below:

TABLE-US-00008
5'-GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT-3'
                                    3'-H2N-CCCGACCA-5'

The PCR conditions were as follows: 95.degree. C. for 1 min, followed by 30 cycles at 95.degree. C. for 20 sec and 68.degree. C. for 3 min, followed by a final extension at 68.degree. C. for 7 min. The PCR products were each diluted 1:100 and 1 .mu.l of the diluted PCR product used as template for a second round of PCR. The conditions were exactly the same except that pZP-GW-5-2 (SEQ ID NO:33) replaced pZP-GW-5-1 (SEQ ID NO:32).
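The adaptor listed above is partially double-stranded: the short bottom strand pairs with the 3' end of the top strand. The sketch below checks that complementarity, with the top strand entered as one continuous string; the helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Top strand written 5'->3'; bottom strand written 3'->5', as in the listing.
top_5_to_3 = "GTAATACGACTCACTATAGGGCACGCGTGGTCGACGGCCCGGGCTGGT"
bottom_3_to_5 = "CCCGACCA"


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the sequence."""
    return seq.translate(COMPLEMENT)


# The bottom strand should pair with the last bases of the top strand.
paired_region = top_5_to_3[-len(bottom_3_to_5):]
is_complementary = complement(paired_region) == bottom_3_to_5
overhang_nt = len(top_5_to_3) - len(bottom_3_to_5)

print(f"top strand length: {len(top_5_to_3)} nt")
print(f"bottom strand pairs with top 3' end: {is_complementary}")
print(f"single-stranded 5' portion of top strand: {overhang_nt} nt")
```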

[0267] For 3'-end genome walking, four PCR reactions were carried out as above, except primer pZP-GW-3-1 (SEQ ID NO:36) and nested adaptor primer (SEQ ID NO:42) were used. The PCR products were similarly diluted and used as template for a second round of PCR, using pZP-GW-3-2 (SEQ ID NO:37) to replace pZP-GW-3-1 (SEQ ID NO:36).
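The first-round PCR mixture described for the 5'-end walk can be tallied to confirm the total reaction volume and the final primer and buffer concentrations. This is an arithmetic sketch using only the volumes and stock concentrations stated above; the dictionary keys are descriptive labels chosen here.

```python
# Volumes in microliters for one first-round reaction.
pcr_mix_ul = {
    "ligation mixture (template)": 1.0,
    "gene-specific primer, 10 uM": 1.0,
    "adaptor component, 10 uM": 1.0,
    "water": 41.0,
    "10x cDNA PCR reaction buffer": 5.0,
    "Advantage cDNA polymerase mix": 1.0,
}

total_ul = sum(pcr_mix_ul.values())
primer_final_um = 10.0 * pcr_mix_ul["gene-specific primer, 10 uM"] / total_ul
buffer_final_x = 10.0 * pcr_mix_ul["10x cDNA PCR reaction buffer"] / total_ul

print(f"total volume: {total_ul:.0f} ul")
print(f"final primer concentration: {primer_final_um:.1f} uM")
print(f"final buffer strength: {buffer_final_x:.0f}x")
```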

[0268] PCR products were analyzed by gel electrophoresis. One reaction product, using EcoRV digested genomic DNA as template and the primers pZP-GW-3-2 and nested adaptor primer, generated a .about.1.6 kB fragment. This fragment was isolated, purified with a Qiagen gel purification kit and cloned into pCR2.1-TOPO. Sequence analysis showed that the fragment included both part of plasmid pZP2-2988 and the Yarrowia genomic DNA from chromosome C. The junction between them was at nucleotide position 139826 of chromosome C. This was inside the coding region of the Pex10 gene (GenBank Accession No. CAG81606; SEQ ID NO:10).

[0269] To determine the 5' end of the junction, PCR amplification was performed using genomic DNA from strain Y4128 as the template and primers Per10 F1 (SEQ ID NO:43) and ZPGW-5-5 (SEQ ID NO:44). The reaction mixture included 1 .mu.l each of 20 .mu.M primer, 1 .mu.l genomic DNA, 22 .mu.l water and 25 .mu.l TaKaRa ExTaq 2.times. premix (TaKaRa Bio Inc., Otsu, Shiga, Japan). The thermocycler conditions were: 94.degree. C. for 1 min, followed by 30 cycles of 94.degree. C. for 20 sec, 55.degree. C. for 20 sec and 72.degree. C. for 2 min, followed by a final extension at 72.degree. C. for 7 min. A 1.6 kB DNA fragment was amplified and cloned into pCR2.1-TOPO. Sequence analysis showed that it was a chimeric fragment between Yarrowia genomic DNA from chromosome C and pZP2-2988. The junction was at nucleotide position 139817 of chromosome C. Thus, a 10 nucleotide segment of chromosome C was replaced by the AscI/SphI fragment from pZP2-2988 (FIG. 4B) in strain Y4128. As a result, Pex10 in strain Y4128 was lacking the last 32 amino acids of the encoded protein.

[0270] Based on the above conclusions, the Y4128U strains isolated in Example 2 (supra) are referred to subsequently as .DELTA.pex10 strains. For clarity, strain Y4128U1 is equivalent to strain Y4128U1 (.DELTA.pex10).

Example 5

Plasmid Expression of Pex10 In Yarrowia lipolytica Strain Y4128U1 (.DELTA.pex10)

[0271] Three plasmids that carried the Y. lipolytica Pex10 gene were constructed: 1) pFBAIn-PEX10 allowed the expression of the Pex10 ORF under the control of the FBAINm promoter; and, 2) pPEX10-1 and pPEX10-2 allowed the expression of Pex10 under control of the native Pex10 promoter, although pPEX10-1 used a shorter version (.about.500 bp) while pPEX10-2 used a longer version (.about.900 bp) of the promoter. Following construction of these expression plasmids and transformation, the effect of Pex10 plasmid expression on total oil and on EPA level in the Y. lipolytica strain Y4128U1 (.DELTA.pex10) was determined. Deletion of Pex10 resulted in an increased amount of EPA as a percent of TFAs, but a reduced amount of total lipid, as a percent of DCW, in the cell.

Construction of pFBAIn-PEX10, pPEX10-1 and pPEX10-2

[0272] To construct pFBAIn-PEX10, the primers Per10 F1 (SEQ ID NO:43) and Pe10 R (SEQ ID NO:45) were used to amplify the coding region of the Pex10 gene using Y. lipolytica genomic DNA as template. The PCR reaction mixture contained 1 .mu.l each of 20 .mu.M primers, 1 .mu.l of Y. lipolytica genomic DNA (.about.100 ng), 25 .mu.l ExTaq 2.times. premix and 22 .mu.l water. The reaction was carried out as follows: 94.degree. C. for 1 min, followed by 30 cycles of 94.degree. C. for 20 sec, 55.degree. C. for 20 sec and 72.degree. C. for 90 sec, followed by a final extension of 72.degree. C. for 7 min. The PCR product, a 1168 bp DNA fragment, was purified with a Qiagen PCR purification kit, digested with NcoI and NotI, and cloned into pFBAIn-MOD-1 (SEQ ID NO:46; FIG. 5B) digested with the same two restriction enzymes.

[0273] Of the 8 individual clones subjected to sequence analysis, 2 had the correct sequence of Pex10 with no errors. The components of pFBAIn-PEX10 (SEQ ID NO:47; FIG. 6A) are listed below in Table 7.

TABLE-US-00009 TABLE 7 Components Of Plasmid pFBAIn-PEX10 (SEQ ID NO: 47)

RE Sites And Nucleotides     Description Of Fragment And
Within SEQ ID NO: 47         Chimeric Gene Components

BglII-BsiWI (6040-318)       FBAINm::Pex10::Pex20, comprising:
                             FBAINm: Yarrowia lipolytica FBAINm promoter (U.S. Pat. No. 7,202,356);
                             Pex10: Y. lipolytica Pex10 ORF (GenBank Accession No. AB036770, nucleotides 1038-2171; SEQ ID NO: 21);
                             Pex20: Pex20 terminator sequence from Yarrowia Pex20 gene (GenBank Accession No. AF054613)
PacI-BglII (4530-6040)       Yarrowia URA3 (GenBank Accession No. AJ306421)
(3123-4487)                  Yarrowia autonomous replicating sequence 18 (ARS18; GenBank Accession No. A17608)
(2464-2864)                  E. coli f1 origin of replication
(1424-2284)                  Ampicillin-resistance gene (Amp.sup.R) for selection in E. coli
(474-1354)                   ColE1 plasmid origin of replication

[0274] To construct pPEX10-1 and pPEX10-2, primers PEX10-R-BsiWI (SEQ ID NO:48), PEX10-F1-SalI (SEQ ID NO:49) and PEX10-F2-SalI (SEQ ID NO:50) were designed and synthesized. PCR amplification using genomic Yarrowia lipolytica DNA and primers PEX10-R-BsiWI and PEX10-F1-SalI generated a 1873 bp fragment containing the Pex10 ORF, 500 bp of the 5' upstream region and 215 bp of the 3' downstream region of the Pex10 gene, flanked by SalI and BsiWI restriction sites at either end. This fragment was purified with the Qiagen PCR purification kit, digested with SalI and BsiWI, and cloned into pEXP-MOD-1 (SEQ ID NO:51; FIG. 6B) digested with the same two enzymes to generate pPEX10-1 (SEQ ID NO:52; FIG. 7A). Plasmid pEXP-MOD-1 is similar to pFBAIn-MOD-1 (SEQ ID NO:46; FIG. 5B) except that the FBAINm promoter in the latter was replaced with the EXP1 promoter. Table 8 lists the components of pPEX10-1.

TABLE-US-00010 TABLE 8 Components Of Plasmid pPEX10-1 (SEQ ID NO: 52)

RE Sites And Nucleotides     Description Of Fragment And
Within SEQ ID NO: 52         Chimeric Gene Components

SalI-BsiWI (5705-1)          Pex10-5'::Pex10::Pex10-3', comprising:
                             Pex10-5': 500 bp of the 5' promoter region of Yarrowia lipolytica Pex10 gene;
                             Pex10: Yarrowia lipolytica Pex10 ORF (GenBank Accession No. AB036770, nucleotides 1038-2171; SEQ ID NO: 21);
                             Pex10-3': 215 bp of Pex10 terminator sequence from Yarrowia Pex10 gene (GenBank Accession No. AB036770)
                             [Note the entire Pex10-5'::Pex10::Pex10-3' expression cassette is labeled collectively as "PEX10" in the Figure]
PacI-SalI (4216-5703)        Yarrowia URA3 gene (GenBank Accession No. AJ306421)
(2806-4170)                  Yarrowia autonomous replicating sequence 18 (ARS18; GenBank Accession No. A17608)
(2147-2547)                  E. coli f1 origin of replication
(1107-1967)                  Ampicillin-resistance gene (Amp.sup.R) for selection in E. coli
(157-1037)                   ColE1 plasmid origin of replication

[0275] PCR amplification of Yarrowia lipolytica genomic DNA using PEX10-R-BsiWI (SEQ ID NO:48) and PEX10-F2-SalI (SEQ ID NO:50) generated a 2365 bp fragment containing the Pex10 ORF, 991 bp of the 5' upstream region and 215 bp of the 3' downstream region of the Pex10 gene, flanked by SalI and BsiWI restriction sites at either end. This fragment was purified with a Qiagen PCR purification kit, digested with SalI and BsiWI, and cloned into similarly digested pEXP-MOD-1. This yielded pPEX10-2 (SEQ ID NO:53), whose construction is analogous to that of plasmid pPEX10-1 (Table 8, supra), with the exception of the longer Pex10-5' promoter in the chimeric Pex10-5'::Pex10::Pex10-3' gene.
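The stated sizes of the two PCR fragments can be compared with the sum of their named parts, taking the ORF length from the nucleotide range 1038-2171 given for GenBank Accession No. AB036770. The sketch below is an arithmetic check; the text does not itemize the few remaining base pairs, and attributing them to primer-added flanking sequence (such as the restriction sites) is an assumption made here.

```python
ORF_START, ORF_END = 1038, 2171          # nucleotides within AB036770
orf_bp = ORF_END - ORF_START + 1         # inclusive range

# Fragment sizes and flanking lengths as stated in the text (bp).
fragments = {
    "pPEX10-1 insert": {"total": 1873, "upstream": 500, "downstream": 215},
    "pPEX10-2 insert": {"total": 2365, "upstream": 991, "downstream": 215},
}

remainders = {}
for name, f in fragments.items():
    accounted = f["upstream"] + orf_bp + f["downstream"]
    # Base pairs not itemized in the text; presumably primer-added sequence.
    remainders[name] = f["total"] - accounted
    print(f"{name}: {accounted} bp itemized, {remainders[name]} bp not itemized")

print(f"ORF length: {orf_bp} bp ({orf_bp // 3} codons including the stop codon)")
```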

Expression of Pex10 in Strain Y4128U1 (.DELTA.pex10)

[0276] Plasmids pFBAIN-MOD-1 (control; SEQ ID NO:46), pFBAIn-PEX10 (SEQ ID NO:47), pPEX10-1 (SEQ ID NO:52) and pPEX10-2 (SEQ ID NO:53) were transformed into Y4128U1 (.DELTA.pex10) according to the protocol in the General Methods. Transformants were plated on MM plates. The total lipid content and fatty acid composition of transformants carrying the above plasmids were analyzed as described in Example 3.

[0277] Lipid content as a percentage of dry cell weight (DCW) and lipid composition are shown below in Table 9. Specifically, fatty acids are identified as 18:0 (stearic acid), 18:1 (oleic acid), LA, ALA, EDA, DGLA, ETrA, ETA and EPA; fatty acid compositions were expressed as the weight percent (wt. %) of total fatty acids.

TABLE-US-00011 TABLE 9 Lipid Composition in Yarrowia lipolytica Strain Y4128U1 (.DELTA.pex10) Transformed With Various Pex10 Plasmids

Plasmid       TFA    18:0  18:1  18:2   18:3        20:2   20:3        20:3        20:4        20:5
              % DCW                     (.omega.3)         (.omega.6)  (.omega.3)  (.omega.3)  (.omega.3)
                                 [LA]   [ALA]       [EDA]  [DGLA]      [ETrA]      [ETA]       [EPA]
pFBAIN-MOD-1  22.8   1.9   9.6   18.3   2.0         4.3    2.3         2.1         5.9         27.7
pFBAIN-PEX10  29.2   4.0   24.9  25.1   7.6         6.6    1.0         5.3         3.6         10.8
pPEX10-1      27.1   3.9   25.0  25.2   8.2         6.4    0.9         5.2         3.5         10.7
pPEX10-2      28.5   4.3   25.4  24.5   7.6         6.4    1.0         5.3         3.4         10.8

[0278] The results in Table 9 showed that expression of Pex10 in Y4128U1 (.DELTA.pex10), either from the native Y. lipolytica Pex10 promoter or from the Y. lipolytica FBAINm promoter, reduced the percent of EPA back to the level of Y4086 while increasing the total lipid content (TFA % DCW) up to the level of Y4086 (see data of Table 6 for comparison). EPA content per gram of dry cell changed from 63.2 mg in the case of the control sample (i.e., cells carrying pFBAIn-MOD-1) to 31.5 mg in cells carrying pFBAIn-PEX10, 29 mg in cells carrying pPEX10-1 and 30.8 mg in cells carrying pPEX10-2. These results demonstrated that disruption of the ring-finger domain of Pex10 increased the amount of EPA but reduced the total lipid content in the cell.

[0279] Thus, the results in Table 9 showed that, compared to the Y4128U1 (.DELTA.pex10) transformant carrying the control plasmid, all transformants carrying Pex10-expressing plasmids had higher lipid content (TFAs % DCW) (>27% versus 22.8%), lower EPA % TFAs (ca. 10.8% versus 27.7%), and lower EPA % DCW (<3.2% versus 6.3%). Additionally, the Y4128U1 (.DELTA.pex10) transformant carrying the control plasmid, as compared to the transformants carrying Pex10-expressing plasmids, had a 2.5-fold increase in the amount of EPA relative to the total PUFAs (44% of the PUFAs [as a % TFAs] versus 17.5% (avg) of the PUFAs [as a % TFAs]) and a 1.5-fold increase in the amount of C20 PUFAs relative to the total PUFAs (67% of the PUFAs [as a % TFAs] versus 44% (avg) of the PUFAs [as a % TFAs]).
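The shares of EPA and of C20 PUFAs within the total PUFAs can be recomputed from Table 9. The sketch below assumes that "total PUFAs" means the sum of the seven PUFAs listed in the table (LA, ALA, EDA, DGLA, ETrA, ETA and EPA); that assumption, and all names in the code, are illustrative. Because the table entries are rounded, recomputed shares may differ by about a percentage point from the figures quoted in the text.

```python
PUFA_NAMES = ["LA", "ALA", "EDA", "DGLA", "ETrA", "ETA", "EPA"]
C20_NAMES = ["EDA", "DGLA", "ETrA", "ETA", "EPA"]

# PUFA columns of Table 9, each as % of TFAs.
TABLE_9_PUFAS = {
    "pFBAIN-MOD-1": dict(zip(PUFA_NAMES, [18.3, 2.0, 4.3, 2.3, 2.1, 5.9, 27.7])),
    "pFBAIN-PEX10": dict(zip(PUFA_NAMES, [25.1, 7.6, 6.6, 1.0, 5.3, 3.6, 10.8])),
    "pPEX10-1": dict(zip(PUFA_NAMES, [25.2, 8.2, 6.4, 0.9, 5.2, 3.5, 10.7])),
    "pPEX10-2": dict(zip(PUFA_NAMES, [24.5, 7.6, 6.4, 1.0, 5.3, 3.4, 10.8])),
}


def share_of_pufas(profile: dict, names: list) -> float:
    """Percent of the summed PUFAs contributed by the named fatty acids."""
    return 100.0 * sum(profile[n] for n in names) / sum(profile.values())


for plasmid, profile in TABLE_9_PUFAS.items():
    epa_share = share_of_pufas(profile, ["EPA"])
    c20_share = share_of_pufas(profile, C20_NAMES)
    print(f"{plasmid}: EPA {epa_share:.1f}% of PUFAs, C20 {c20_share:.1f}% of PUFAs")
```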

Example 6

Generation of Y4184U Strain to Produce EPA

[0280] Y. lipolytica strain Y4184U was used as the host in Example 7, infra. Strain Y4184U was derived from Y. lipolytica ATCC #20362, and is capable of producing EPA via expression of a .DELTA.9 elongase/.DELTA.8 desaturase pathway. The strain has a Ura- phenotype and its construction is described in Example 7 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference.

[0281] In summary, however, the development of strain Y4184U required the construction of strain Y2224, strain Y4001, strain Y4001U, strain Y4036, strain Y4036U and strain Y4069 (supra, Example 1). Further development of strain Y4184U (diagrammed in FIG. 7B) required generation of strain Y4084, strain Y4084U1, strain Y4127 (deposited with the American Type Culture Collection on Nov. 29, 2007, under accession number ATCC PTA-8802), strain Y4127U2, strain Y4158, strain Y4158U1 and strain Y4184. The plasmid construct pZKL1-2SP98C, used for transformation of strain Y4127U2, is diagrammed in FIG. 8A (SEQ ID NO:54; described in Table 23 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference). Plasmid pZKL2-5U89GC, used for transformation of strain Y4158U1, is shown in FIG. 8B (SEQ ID NO:55; described in Table 24 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference).

[0282] The final genotype of strain Y4184 (producing 31% EPA of total lipids) with respect to wildtype Yarrowia lipolytica ATCC #20362 was unknown 1-, unknown 2-, unknown 4-, unknown 5-, unknown 6-, unknown 7-, YAT1::ME3S::Pex16, EXP1::ME3S::Pex20 (2 copies), GPAT::EgD9e::Lip2, FBAINm::EgD9eS::Lip2, EXP1::EgD9eS::Lip1, FBA::EgD9eS::Pex20, YAT1::EgD9eS::Lip2, GPD::EgD9eS::Lip2, GPDIN::EgD8M::Lip1, YAT1::EgD8M::Aco, EXP1::EgD8M::Pex16, FBAINm::EgD8M::Pex20, FBAIN::EgD8M::Lip1 (2 copies), GPM/FBAIN::FmD12S::Oct, EXP1::FmD12S::Aco, YAT1::FmD12::Oct, GPD::FmD12::Pex20, EXP1::EgD5S::Pex20, YAT1::EgD5S::Aco, YAT1::Rd5S::Oct, FBAIN::EgD5::Aco, FBAINm::PaD17::Aco, EXP1::PaD17::Pex16, YAT1::PaD17S::Lip1, YAT1::YICPT1::Aco, GPD::YICPT1::Aco (wherein FmD12 is a Fusarium moniliforme .DELTA.12 desaturase gene [Int'l. App. Pub. No. WO 2005/047485]; FmD12S is a codon-optimized .DELTA.12 desaturase gene, derived from Fusarium moniliforme [Int'l. App. Pub. No. WO 2005/047485]; ME3S is a codon-optimized C.sub.16/18 elongase gene, derived from Mortierella alpina [Int'l. App. Pub. No. WO 2007/046817]; EgD9e is a Euglena gracilis .DELTA.9 elongase gene [Int'l. App. Pub. No. WO 2007/061742]; EgD9eS is a codon-optimized .DELTA.9 elongase gene, derived from Euglena gracilis [Int'l. App. Pub. No. WO 2007/061742]; EgD8M is a synthetic mutant .DELTA.8 desaturase [Int'l. App. Pub. No. WO 2008/073271], derived from Euglena gracilis [U.S. Pat. No. 7,256,033]; EgD5 is a Euglena gracilis .DELTA.5 desaturase [U.S. Pat. App. Pub. US 2007-0292924-A1]; EgD5S is a codon-optimized .DELTA.5 desaturase gene, derived from Euglena gracilis [U.S. Pat. App. Pub. No. 2007-0292924]; RD5S is a codon-optimized .DELTA.5 desaturase, derived from Peridinium sp. CCMP626 [U.S. Pat. App. Pub. No. 2007-0271632]; PaD17 is a Pythium aphanidermatum .DELTA.17 desaturase [Int'l. App. Pub. No. WO 2008/054565]; PaD17S is a codon-optimized .DELTA.17 desaturase, derived from Pythium aphanidermatum [Int'l. App. Pub. No. WO 2008/054565]; and, YICPT1 is a Yarrowia lipolytica diacylglycerol cholinephosphotransferase gene [Int'l. App. Pub. No. WO 2006/052870]).

[0283] In order to disrupt the Ura3 gene in strain Y4184, construct pZKUE3S (FIG. 5A; SEQ ID NO:31; described in Table 22 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was used to integrate an EXP1::ME3S::Pex20 chimeric gene into the Ura3 gene of strain Y4184, resulting in strains Y4184U1 (11.2% EPA of total lipids), Y4184U2 (10.6% EPA of total lipids) and Y4184U4 (15.5% EPA of total lipids) (collectively, Y4184U).

Example 7

Chromosomal Deletion of Pex10 in Yarrowia lipolytica Strain Y4184U4 Increases Accumulation of EPA and Total Lipid Content

[0284] Construct pYPS161 (FIG. 9A, SEQ ID NO:56) was used to knock out the chromosomal Pex10 gene from the EPA-producing Yarrowia strain Y4184U4 (Example 6). Transformation of Y. lipolytica strain Y4184U4 with the Pex10 knockout construct resulted in creation of strain Y4184 (.DELTA.pex10). The effect of the Pex10 knockout on total oil and on EPA level was determined and compared with the parent strain. Specifically, knockout of Pex10 resulted in an increased percentage of EPA (as % TFAs and % DCW) and an increased total lipid content in the cell.

Construct pYPS161

[0285] The construct pYPS161 contained the following components:

TABLE 10
Description of Plasmid pYPS161 (SEQ ID NO: 56)

RE Sites And Nucleotides Within SEQ ID NO: 56   Description Of Fragment And Chimeric Gene Components
AscI/BsiWI (1521-157)                           1364 bp Pex10 knockout fragment #1 of Yarrowia Pex10 gene (GenBank Accession No. AB036770)
PacI/SphI (5519-4229)                           1290 bp Pex10 knockout fragment #2 of Yarrowia Pex10 gene (GenBank Accession No. AB036770)
SalI/EcoRI (7170-5551)                          Yarrowia URA3 gene (GenBank Accession No. AJ306421)
2451-1571                                       ColE1 plasmid origin of replication
3369-2509                                       ampicillin-resistance gene (Amp.sup.R) for selection in E. coli
3977-3577                                       E. coli f1 origin of replication
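As a consistency check on Table 10, the stated sizes of the two Pex10 knockout fragments equal the difference between their nucleotide coordinates within SEQ ID NO:56. The sketch below illustrates this; the dictionary layout and function name are hypothetical.

```python
# (first coordinate, second coordinate, stated size in bp) from Table 10.
PEX10_FRAGMENTS = {
    "Pex10 knockout fragment #1 (AscI/BsiWI)": (1521, 157, 1364),
    "Pex10 knockout fragment #2 (PacI/SphI)": (5519, 4229, 1290),
}


def span_bp(first: int, second: int) -> int:
    """Distance in bp between two coordinates, regardless of orientation."""
    return abs(first - second)


for name, (first, second, stated) in PEX10_FRAGMENTS.items():
    computed = span_bp(first, second)
    status = "matches" if computed == stated else "differs from"
    print(f"{name}: computed {computed} bp {status} stated {stated} bp")
```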

Generation of Yarrowia lipolytica Knockout Strain Y4184 (.DELTA.Pex10)

[0286] Standard protocols were used to transform Yarrowia lipolytica strain Y4184U4 (Example 6) with the purified 5.3 kB AscI/SphI fragment of Pex10 knockout construct pYPS161 (supra); a cells-alone control was also prepared. About 200 to 250 colonies were present for each of the experimental transformations, while no colonies were present on the cells-alone plates (per expectations).

[0287] Colony PCR was used to screen for cells having the Pex10 deletion. Specifically, the PCR reaction was performed using MasterAmp Taq polymerase (Epicentre Technologies, Madison, Wis.) following standard protocols, using PCR primers Pex-10 del1 3'.Forward (SEQ ID NO:57) and Pex-10 del2 5'.Reverse (SEQ ID NO:58). The PCR reaction conditions were 94.degree. C. for 5 min, followed by 30 cycles at 94.degree. C. for 30 sec, 60.degree. C. for 30 sec and 72.degree. C. for 2 min, followed by a final extension at 72.degree. C. for 6 min. The reaction was then held at 4.degree. C. If the Pex10 knockout construct integrated within the Pex10 region, a single PCR product 2.8 kB in size was expected to be produced. In contrast, if the strain integrated the Pex10 knockout construct in a chromosomal region other than the Pex10 region, then two PCR fragments, i.e., 2.8 kB and 1.1 kB, would be generated. Of the 288 colonies screened, the majority had the Pex10 knockout construct integrated at a random site. Only one of the 288 colonies contained the Pex10 knockout. This strain was designated Y4184 (.DELTA.pex10).
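The band-pattern logic of the colony PCR screen can be summarized in a short sketch. The expected band sizes (2.8 kB and 1.1 kB) are taken from the paragraph above; the size tolerance and the "inconclusive" category are assumptions added for illustration and are not part of the described protocol.

```python
def classify_pex10_colony(band_sizes_kb, tolerance_kb=0.15):
    """Interpret colony PCR bands obtained with the Pex10 deletion primers.

    tolerance_kb is a hypothetical allowance for gel sizing error.
    """
    def has_band(expected_kb):
        return any(abs(b - expected_kb) <= tolerance_kb for b in band_sizes_kb)

    if has_band(2.8) and has_band(1.1):
        # Both products: construct integrated outside the Pex10 region.
        return "random integration"
    if has_band(2.8):
        # Single 2.8 kB product: construct integrated within the Pex10 region.
        return "pex10 knockout"
    return "inconclusive"


print(classify_pex10_colony([2.8]))
print(classify_pex10_colony([2.8, 1.1]))
```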

Evaluation of Yarrowia lipolytica Strains Y4184 And Y4184 (.DELTA.Pex10) for Total Oil and EPA Production

[0288] To evaluate the effect of the Pex10 knockout on the percent of PUFAs in the total lipid fraction and the total lipid content in the cells, strains Y4184 and Y4184 (.DELTA.pex10) were grown under comparable oleaginous conditions. Specifically, cultures were grown at a starting OD.sub.600 of .about.0.1 in 25 mL of either fermentation media (FM) or FM medium without Yeast Extract (FM without YE) in a 250 mL flask for 48 hrs. The cells were harvested by centrifugation for 10 min at 8000 rpm in a 50 mL conical tube. The supernatant was discarded and the cells were re-suspended in 25 mL of HGM and transferred to a new 250 mL flask. The cells were incubated with aeration for an additional 120 hrs at 30.degree. C.

[0289] To determine the dry cell weight (DCW), the cells from 5 mL of the FM-grown cultures and 10 mL of the FM without YE-grown cultures were processed. The cultured cells were centrifuged for 10 min at 4300 rpm. The pellet was re-suspended using 10 mL of saline and was centrifuged under the same conditions for a second time. The pellet was then re-suspended using 1 mL of sterile H.sub.2O (three times) and was transferred to a pre-weighed aluminum pan. The cells were dried overnight in a vacuum oven at 80.degree. C. The weight of the cells was determined.

[0290] The total lipid content and fatty acid composition of strains Y4184 and Y4184 (.DELTA.pex10) were analyzed as described in Example 3. DCW, total lipid content (TFAs % DCW), total EPA % TFAs, and EPA % DCW are shown below in Table 11.

TABLE 11
Lipid Composition in Y. lipolytica Strains Y4184 And Y4184 (.DELTA.Pex10)

Media          Strain                  DCW   TFAs % DCW  EPA % TFAs  EPA % DCW
FM             Y4184                   11.5  11.8        20.6        2.4
FM             Y4184 (.DELTA.Pex10)    11.5  17.6        43.2        7.6
FM without YE  Y4184                   4.6   8.8         23.2        2.0
FM without YE  Y4184 (.DELTA.Pex10)    4.0   13.2        46.1        6.1

[0291] The results in Table 11 showed that knockout of the chromosomal Pex10 gene in Y4184 (.DELTA.Pex10) increased the percent of EPA (as % TFAs and as % DCW) and increased the total oil content, as compared to the percent of EPA and total oil content in strain Y4184 whose native Pex10p had not been knocked out. More specifically, in FM media, there was about 109% increase in EPA (% TFAs), about 216% increase in EPA productivity (% DCW) and about 49% increase in total oil (TFAs % DCW). In FM without YE media, there was about 100% increase in EPA (% TFAs), about 205% increase in EPA productivity (% DCW) and about 50% increase in total oil (TFAs % DCW).
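The percent increases stated above can be reproduced from Table 11 as (mutant - parent) / parent x 100. The following sketch is illustrative only; the data layout and names are not part of the disclosure.

```python
# Table 11 values per medium: (TFAs % DCW, EPA % TFAs, EPA % DCW).
TABLE_11 = {
    "FM": {"parent": (11.8, 20.6, 2.4), "mutant": (17.6, 43.2, 7.6)},
    "FM without YE": {"parent": (8.8, 23.2, 2.0), "mutant": (13.2, 46.1, 6.1)},
}
METRICS = ("TFAs % DCW", "EPA % TFAs", "EPA % DCW")


def percent_increase(parent: float, mutant: float) -> float:
    """Relative increase of the mutant value over the parent value, in percent."""
    return (mutant - parent) / parent * 100.0


for medium, rows in TABLE_11.items():
    for metric, parent, mutant in zip(METRICS, rows["parent"], rows["mutant"]):
        change = percent_increase(parent, mutant)
        print(f"{medium}, {metric}: {change:.1f}% increase")
```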

[0292] Thus, the results in Table 11 showed that in FM medium, compared to the parent strain Y4184, Y4184 (.DELTA.Pex10) strain had higher lipid content (TFAs % DCW) (17.6% versus 11.8%), higher EPA % TFAs (43.2% versus 20.6%), and higher EPA % DCW (7.6% versus 2.4%). Similarly, in FM medium without YE, compared to the parent strain Y4184, Y4184 (.DELTA.Pex10) strain had higher lipid content (TFAs % DCW) (13.2% versus 8.8%), higher EPA % TFAs (46.1% versus 23.2%), and higher EPA % DCW (6.1% versus 2.0%).

Example 8

Prophetic

Chromosomal Knockout of Alternate Pex Genes in PUFA-Producing Strains Of Yarrowia lipolytica

[0293] The present Example describes various strains of Yarrowia lipolytica that have been engineered to produce .omega.-3/.omega.-6 PUFAs. It is contemplated that any of these Y. lipolytica host strains could be engineered to produce an increased amount of .omega.-3/.omega.-6 PUFAs in the total lipid fraction and in the oil fraction, if the chromosomal gene encoding Pex1p, Pex2p, Pex3p, Pex3Bp, Pex4p, Pex5p, Pex6p, Pex7p, Pex8p, Pex12p, Pex13p, Pex14p, Pex16p, Pex17p, Pex19p, Pex20p, Pex22p or Pex26p was disrupted using the methodology of Example 7, supra.

[0294] More specifically, a variety of Yarrowia lipolytica strains have been engineered by the Applicant's Assignee to produce high concentrations of various .omega.-3/.omega.-6 PUFAs via expression of a heterologous .DELTA.6 desaturase/.DELTA.6 elongase PUFA pathway or a heterologous .DELTA.9 elongase/.DELTA.8 desaturase PUFA pathway.

Summary of Representative Yarrowia lipolytica Strains Producing .omega.-3/.omega.-6 PUFAs

[0295] Although some representative strains are summarized in the Table below, the disclosure of Yarrowia lipolytica strains producing .omega.-3/.omega.-6 PUFAs is not limited in any way to the strains therein. Instead, all of the teachings provided in the present Application, in addition to the following commonly owned and co-pending applications, are useful for development of a suitable Yarrowia lipolytica strain engineered to produce .omega.-3/.omega.-6 PUFAs. These specifically include the following Applicants' Assignee's co-pending patents and applications: U.S. Pat. No. 7,125,672, U.S. Pat. No. 7,189,559, U.S. Pat. No. 7,192,762, U.S. Pat. No. 7,198,937, U.S. Pat. No. 7,202,356, U.S. Pat. No. 7,214,491, U.S. Pat. No. 7,238,482, U.S. Pat. No. 7,256,033, U.S. Pat. No. 7,259,255, U.S. Pat. No. 7,264,949, U.S. Pat. No. 7,267,976, U.S. Pat. No. 7,273,746, U.S. patent application Ser. No. 10/985,254 and No. 10/985,691 (filed Nov. 10, 2004), U.S. patent application Ser. No. 11/183,664 (filed Jul. 18, 2005), U.S. patent application Ser. No. 11/185,301 (filed Jul. 20, 2005), U.S. patent application Ser. No. 11/190,750 (filed Jul. 27, 2005), U.S. patent application Ser. No. 11/198,975 (filed Aug. 8, 2005), U.S. patent application Ser. No. 11/253,882 (filed Oct. 19, 2005), U.S. patent application Ser. No. 11/264,784 and No. 11/264,737 (filed Nov. 1, 2005), U.S. patent application Ser. No. 11/265,761 (filed Nov. 2, 2005), U.S. patent application Ser. No. 11/601,563 and No. 11/601,564 (filed Nov. 16, 2006), U.S. patent application Ser. No. 11/635,258 (filed Dec. 7, 2006), U.S. patent application Ser. No. 11/613,420 (filed Dec. 20, 2006), U.S. patent application Ser. No. 11/787,772 (filed Apr. 18, 2007), U.S. patent application Ser. No. 11/737,772 (filed Apr. 20, 2007), U.S. patent application Ser. No. 11/740,298 (filed Apr. 26, 2007), U.S. patent application Ser. No. 12/111,237 (filed Apr. 29, 2008), U.S. patent application Ser. No. 11/748,629 and No. 11/748,637 (filed May 15, 2007), U.S. patent application Ser. No. 11/779,915 (filed Jul. 19, 2007), U.S. Pat. App. No. 60/991,266 (filed Nov. 30, 2007), U.S. patent application Ser. No. 11/952,243 (filed Dec. 7, 2007), U.S. Pat. App. No. 61/041,716 (filed Apr. 2, 2008), U.S. patent application Ser. No. 12/061,738 (filed Apr. 3, 2008), U.S. patent application Ser. No. 12/099,811 (filed Apr. 9, 2008), U.S. patent application Ser. No. 12/102,879 (filed Apr. 15, 2008), U.S. patent application Ser. No. 12/111,237 (filed Apr. 29, 2008), U.S. Pat. App. No. 61/055,511 (filed May 23, 2008) and U.S. Pat. App. No. 61/093,007 (filed Aug. 29, 2008).

TABLE 12
Lipid Profile Of Representative Yarrowia lipolytica Strains Engineered To Produce .omega.-3/.omega.-6 PUFAs

Fatty Acid Content (As A Percent [%] of Total Fatty Acids)

Strain      Reference                          ATCC Deposit No.  16:0  16:1  18:0  18:1  18:2  18:3 (ALA)  GLA
Wildtype    US 2006-0035351-A1; WO2006/033723  #76982            14    11    3.5   34.8  31    --          0
pDMW208                                        --                11.9  8.6   1.5   24.4  17.8  --          25.9
pDMW208D62                                     --                16.2  1.5   0.1   17.8  22.2  --          34
M4          US 2006-0115881-A1; WO2006/052870  --                15    4     2     5     27    --          35
Y2034       US 2006-0094092-A1; WO2006/055322  --                13.1  8.1   1.7   7.4   14.8  --          25.2
Y2047                                          PTA-7186          15.9  6.6   0.7   8.9   16.6  --          29.7
Y2214                                          --                7.9   15.3  0     13.7  37.5  --          0
EU          US 2006-0115881-A1; WO2006/052870  --                19    10.3  2.3   15.8  12    --          18.7
Y2072                                          --                7.6   4.1   2.2   16.8  13.9  --          27.8
Y2102                                          --                9     3     3.5   5.6   18.6  --          29.6
Y2088                                          --                17    4.5   3     2.5   10    --          20
Y2089                                          --                7.9   3.4   2.5   9.9   14.3  --          37.5
Y2095                                          --                13    0     2.6   5.1   16    --          29.1
Y2090                                          --                6     1     6.1   7.7   12.6  --          26.4
Y2096                                          PTA-7184          8.1   1     6.3   8.5   11.5  --          25
Y2201                                          PTA-7185          11    16.1  0.7   18.4  27    --          --
Y3000       US 2006-0110806-A1; WO2006/052871  PTA-7187          5.9   1.2   5.5   7.7   11.7  --          30.1
Y4001       WO2008/073367                      --                4.3   4.4   3.9   35.9  23    0           --
Y4036                                          --                7.7   3.6   1.1   14.2  32.6  0           --
Y4070                                          --                8     5.3   3.5   14.6  42.1  0           --
Y4158                                          --                3.2   1.2   2.7   14.5  30.4  5.3         --
Y4184                                          --                3.1   1.5   1.8   8.7   31.5  4.9         --

Fatty Acid Content (As A Percent [%] of Total Fatty Acids), continued

Strain      20:2  DGLA  ARA   ETA  EPA   DPA   DHA  Lipid % dcw
Wildtype    --    --    --    --   --    --    --   --
pDMW208     --    --    --    --   --    --    --   --
pDMW208D62  --    --    --    --   --    --    --   --
M4          --    8     0     0    0     --    --   --
Y2034       --    8.3   11.2  --   --    --    --   --
Y2047       --    0     10.9  --   --    --    --   --
Y2214       --    7.9   14    --   --    --    --   --
EU          --    5.7   0.2   3    10.3  --    --   36
Y2072       --    3.7   1.7   22   15    --    --   --
Y2102       --    3.8   2.8   2.3  18.4  --    --   --
Y2088       --    3     2.8   1.7  20    --    --   --
Y2089       --    2.5   1.8   1.6  17.6  --    --   --
Y2095       --    3.1   1.9   2.7  19.3  --    --   --
Y2090       --    6.7   2.4   3.6  26.6  --    --   22.9
Y2096       --    5.8   2.1   2.5  28.1  --    --   20.8
Y2201       3.3   3.3   1     3.8  9     --    --   --
Y3000       --    2.6   1.2   1.2  4.7   18.3  5.6  --
Y4001       23.8  0     0     0    --    --    --
Y4036       15.6  18.2  0     0    --    --    --
Y4070       6.7   2.4   11.9  --   --    --    --
Y4158       6.2   3.1   0.3   3.4  20.5  --    --   27.3
Y4184       5.6   2.9   0.6   2.4  28.9  --    --   23.9

Chromosomal Knockout of Pex Genes

[0296] Following selection of a preferred Yarrowia lipolytica strain producing the desired .omega.-3/.omega.-6 PUFA (or combination of PUFAs thereof), one of skill in the art could readily engineer a suitable knockout construct, similar to pYPS161 in Example 7, to result in knockout of a chromosomal Pex gene upon transformation into the parental Y. lipolytica strain. Preferred Pex genes would include: YIPex1p (GenBank Accession No. CAG82178; SEQ ID NO:1), YIPex2p (GenBank Accession No. CAG77647; SEQ ID NO:2), YIPex3p (GenBank Accession No. CAG78565; SEQ ID NO:3), YIPex3Bp (GenBank Accession No. CAG83356; SEQ ID NO:4), YIPex4p (GenBank Accession No. CAG79130; SEQ ID NO:5), YIPex5p (GenBank Accession No. CAG78803; SEQ ID NO:6), YIPex6p (GenBank Accession No. CAG82306; SEQ ID NO:7), YIPex7p (GenBank Accession No. CAG78389; SEQ ID NO:8), YIPex8p (GenBank Accession No. CAG80447; SEQ ID NO:9), YIPex12p (GenBank Accession No. CAG81532; SEQ ID NO:11), YIPex13p (GenBank Accession No. CAG81789; SEQ ID NO:12), YIPex14p (GenBank Accession No. CAG79323; SEQ ID NO:13), YIPex16p (GenBank Accession No. CAG79622; SEQ ID NO:14), YIPex17p (GenBank Accession No. CAG84025; SEQ ID NO:15), YIPex19p (GenBank Accession No. AAK84827; SEQ ID NO:16), YIPex20p (GenBank Accession No. CAG79226; SEQ ID NO:17), YIPex22p (GenBank Accession No. CAG77876; SEQ ID NO:18) and YIPex26p (GenBank Accession No. NC_006072, antisense translation of nucleotides 117230-118387; SEQ ID NO:19).

[0297] It would be expected that the chromosomal disruption of Pex would result in an increased amount of PUFAs in the total lipid fraction and in the oil fraction, as a percent of total fatty acids, as compared with a eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted, wherein the amount of PUFAs can be: 1) the PUFA that is the desired end product of a functional PUFA biosynthetic pathway, as opposed to PUFA intermediates or by-products, 2) C.sub.20 and C.sub.22 PUFAs, and/or 3) total PUFAs. Preferred results not only achieve an increase in the amount of PUFAs as a percent of total fatty acids but also result in an increased amount of PUFAs as a percent of dry cell weight, as compared with a eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted. Again, the amount of PUFAs can be: 1) the PUFA that is the desired end product of a functional PUFA biosynthetic pathway, as opposed to PUFA intermediates or by-products, 2) the C.sub.20 and C.sub.22 PUFAs, and/or 3) the total PUFAs. In some cases, the total lipid content also increases, relative to that of a eukaryotic organism whose native peroxisome biogenesis factor protein has not been disrupted.

Example 9

Chromosomal Deletion of Pex16 In Yarrowia lipolytica Strain Y4036U Increases Percent DGLA Accumulated

[0298] The present Example describes use of construct pYRH13 (FIG. 9B; SEQ ID NO:59) to knock out the chromosomal Pex16 gene in the DGLA-producing Yarrowia strain Y4036U (Example 1). Transformation of Y. lipolytica strain Y4036U with the Pex16 knockout construct resulted in creation of strain Y4036U (.DELTA.pex16). The effect of the Pex16 knockout on DGLA level was determined and compared. Specifically, knockout of Pex16 resulted in an increased percentage of DGLA as a percent of total fatty acids in the cell.

Construct pYRH13

[0299] Plasmid pYRH13 was derived from plasmid pYPS161 (FIG. 9A, SEQ ID NO:56; Example 7). Specifically, a 1982 bp 5' promoter region of the Yarrowia lipolytica Pex16 gene (GenBank Accession No. CAG79622) replaced the AscI/BsiWI fragment of pYPS161 and a 448 bp 3' terminator region of the Yarrowia lipolytica Pex16 gene (GenBank Accession No. CAG79622) replaced the PacI/SphI fragment of pYPS161 to produce pYRH13 (SEQ ID NO:59; FIG. 9B).

Generation of Yarrowia lipolytica Knockout Strain Y4036U (.DELTA.Pex16)

[0300] Standard protocols were used to transform Yarrowia lipolytica strain Y4036U (Example 1) with the purified 6.0 kB AscI/SphI fragment of Pex16 knockout construct pYRH13.

[0301] To screen for cells having the Pex16 deletion, colony PCR was performed using Taq polymerase (Invitrogen; Carlsbad, Calif.) and the PCR primers PEX16Fii (SEQ ID NO:60) and PEX16Rii (SEQ ID NO:61). This set of primers was designed to amplify a 1.1 kB region of the intact Pex16 gene, and therefore the Pex16 deleted mutant (i.e., .DELTA.pex16) would not produce the band. A second set of primers was designed to produce a band only when the Pex16 gene was deleted. Specifically, one primer (i.e., 3UTR-URA3; SEQ ID NO:62) binds to a region in the vector sequences of the introduced 6.0 kB AscI/SphI disruption fragment, and the other primer (i.e., PEX16-conf; SEQ ID NO:63) binds to the Pex16 terminator sequences of the chromosome, outside of the homologous region of the disruption fragment.

[0302] More specifically, the colony PCR was performed using a reaction mixture that contained: 20 mM Tris-HCl (pH 8.4), 50 mM KCl, 1.5 mM MgCl.sub.2, 400 .mu.M each of dGTP, dCTP, dATP, and dTTP, 2 .mu.M of each primer, 20 .mu.l water and 2 U Taq polymerase. Amplification was carried out as follows: initial denaturation at 94.degree. C. for 120 sec, followed by 35 cycles of denaturation at 94.degree. C. for 60 sec, annealing at 55.degree. C. for 60 sec, and elongation at 72.degree. C. for 120 sec. A final elongation cycle at 72.degree. C. for 5 min was carried out, followed by reaction termination at 4.degree. C.
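For planning purposes, the nominal run time of a cycling program such as the one above can be estimated by summing the hold times. The sketch below does this for the Pex16 colony PCR program and, for comparison, the Pex10 colony PCR program of Example 7. It is an approximation only: temperature ramp times and the final 4.degree. C. hold are ignored, and the function name is illustrative.

```python
def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Nominal hold time of a PCR program, ignoring ramp times and the 4 C hold."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Example 9 (Pex16): 94 C 120 s; 35 x (94 C 60 s, 55 C 60 s, 72 C 120 s); 72 C 5 min.
pex16_s = program_seconds(120, 35, (60, 60, 120), 5 * 60)

# Example 7 (Pex10): 94 C 5 min; 30 x (94 C 30 s, 60 C 30 s, 72 C 2 min); 72 C 6 min.
pex10_s = program_seconds(5 * 60, 30, (30, 30, 120), 6 * 60)

print(f"Pex16 colony PCR: {pex16_s} s ({pex16_s / 60:.0f} min)")
print(f"Pex10 colony PCR: {pex10_s} s ({pex10_s / 60:.0f} min)")
```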

[0303] Of 205 colonies screened, 195 had the Pex16 knockout fragment integrated at a random site in the chromosome and thus were not .DELTA.pex16 mutants (however, the cells could grow on ura- plates, due to the presence of pYRH13). Three of these random integrants, designated as Y4036U-17, Y4036U-19 and Y4036U-33, were used as controls in lipid production experiments (infra).

[0304] The remaining 10 colonies screened (i.e., of the total 205) contained the Pex16 knockout. These ten .DELTA.pex16 mutants within the Y4036U strain background were designated RHY25 through RHY34.
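The colony counts reported in Examples 7 and 9 can be expressed as targeted-integration frequencies. The sketch below is simple arithmetic on those counts; the function name is illustrative, and no conclusion about the cause of the difference between the two loci is implied.

```python
def targeting_frequency(knockouts: int, screened: int) -> float:
    """Percent of screened colonies carrying the targeted knockout."""
    return 100.0 * knockouts / screened


pex16_pct = targeting_frequency(10, 205)  # Example 9: 10 of 205 colonies
pex10_pct = targeting_frequency(1, 288)   # Example 7: 1 of 288 colonies

print(f"Pex16 knockout frequency: {pex16_pct:.2f}%")
print(f"Pex10 knockout frequency: {pex10_pct:.2f}%")
```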

Confirmation of Yarrowia lipolytica Knockout Strain Y4036U (.DELTA.Pex16) by Quantitative Real Time PCR

[0305] Further confirmation of the Pex16 knockout in strains RHY25 through RHY34 was performed by quantitative real time PCR, with the Yarrowia translation elongation factor (tef-1) gene (GenBank Accession No. AF054510) used as the control.

[0306] First, real time PCR primers and TaqMan probes targeting the Pex16 gene and the tef-1 gene, respectively, were designed with Primer Express software v 2.0 (Applied Biosystems, Foster City, Calif.). Specifically, real time PCR primers ef-324F (SEQ ID NO:64), ef-392R (SEQ ID NO:65), PEX16-741F (SEQ ID NO:66) and PEX16-802R (SEQ ID NO:67) were designed, as well as the TaqMan probes ef-345T (i.e., 5'-6-FAM.TM.-TGCTGGTGGTGTTGGTGAGTT-TAMRA.TM., wherein the nucleotide sequence is set forth as SEQ ID NO:68) and PEX16-760T (i.e., 5'-6-FAM.TM.-CTGTCCATTCTGCGACCCCTC-TAMRA.TM., wherein the nucleotide sequence is set forth as SEQ ID NO:69). The 5' end of each TaqMan fluorogenic probe has the 6-FAM.TM. fluorescent reporter dye bound, while the 3' end comprises the TAMRA.TM. quencher. All primers and probes were obtained from Sigma-Genosys (Woodlands, Tex.).

[0307] Knockout candidate DNA was prepared by suspending 1 colony in 50 .mu.l of water. Reactions for tef-1 and PEX16 were run separately, in triplicate for each sample. Real time PCR reactions included 20 pmoles each of forward and reverse primers (i.e., ef-324F, ef-392R, PEX16-741F and PEX16-802R, supra), 5 pmoles TaqMan probe (i.e., ef-345T and PEX16-760T), 10 .mu.l TaqMan Universal PCR Master Mix--No AmpErase.RTM. Uracil-N-Glycosylase (UNG) (Catalog No. PN 4326614, Applied Biosystems), 1 .mu.l colony suspension and 8.5 .mu.l RNase/DNase free water for a total volume of 20 .mu.l per reaction. Reactions were run on the ABI PRISM.RTM. 7900 Sequence Detection System under the following conditions: initial denaturation at 95.degree. C. for 10 min, followed by 40 cycles of denaturation at 95.degree. C. for 15 sec and annealing at 60.degree. C. for 1 min. Real time data was collected automatically during each cycle by monitoring 6-FAM.TM. fluorescence. Data analysis was performed using tef-1 gene threshold cycle (CT) values for data normalization as per the ABI PRISM.RTM. 7900 Sequence Detection System instruction manual.
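One common way to normalize a target gene against a reference gene is the comparative CT calculation, in which the reference CT is subtracted from the target CT and samples are compared as 2 raised to the negative difference. The sketch below illustrates that calculation for PEX16 normalized to tef-1. It is an assumption that this matches the analysis actually performed, which the text describes only as following the instrument manual. All CT values and names in the code are hypothetical and are shown for illustration only.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target CT minus mean reference CT for one sample (triplicates)."""
    return mean(target_cts) - mean(reference_cts)


def relative_signal(sample_dct, control_dct):
    """Target level in a sample relative to a control (comparative CT method)."""
    return 2.0 ** (-(sample_dct - control_dct))


# Hypothetical triplicate CT values (not data from this disclosure).
control_pex16, control_tef1 = [24.1, 24.0, 24.2], [22.0, 22.1, 21.9]
candidate_pex16, candidate_tef1 = [38.9, 39.2, 39.0], [22.1, 22.0, 22.2]

control_dct = delta_ct(control_pex16, control_tef1)
candidate_dct = delta_ct(candidate_pex16, candidate_tef1)
ratio = relative_signal(candidate_dct, control_dct)

# A ratio near zero is what would be expected if PEX16 were deleted.
print(f"PEX16 signal relative to control: {ratio:.6f}")
```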

[0308] Based on this analysis, it was concluded that all ten of the Y4036U (.DELTA.pex16) colonies (i.e., RHY25 through RHY34) were valid Pex16 knockouts, wherein the pYRH13 construct had integrated into the chromosomal YIPex16.

Evaluation of Yarrowia lipolytica Strains Y4036U and Y4036U (.DELTA.Pex16) for DGLA Production

[0309] To evaluate the effect of the Pex16 knockout on the percent of PUFAs in the total lipid fraction and the total lipid content in the cells, the Y4036U and Y4036U (.DELTA.pex16) strains were grown under comparable oleaginous conditions. More specifically, strains Y4036U-17, Y4036U-19 and Y4036U-33 having the Pex16 knockout fragment integrated at a random site in the chromosome were considered as Pex16 wild type (i.e., Y4036U) and strains RHY25 through RHY34 were the Pex16 mutant strains (i.e., Y4036U (.DELTA.pex16)). Cultures of each strain were grown at a starting OD.sub.600 of .about.0.1 in 25 mL of MM containing 90 mg/L L-leucine in a 125 mL flask for 48 hrs. The cells were harvested by centrifugation for 5 min at 4300 rpm in a 50 mL conical tube. The supernatant was discarded and the cells were re-suspended in 25 mL of HGM and transferred to a new 125 mL flask. The cells were incubated with aeration for an additional 120 hrs at 30.degree. C.

[0310] The fatty acid composition (i.e., LA (18:2), ALA, EDA and DGLA) for each of the strains is shown below in Table 13; fatty acid composition is expressed as the weight percent (wt. %) of total fatty acids. The average fatty acid compositions of strains Y4036U and Y4036U (.DELTA.pex16) are highlighted in gray and indicated with "Ave". None of the strains tested provided sufficient cell mass in MM+L-leucine media, and thus total lipid content was not analyzed.

TABLE 13
Lipid Composition In Y. lipolytica Strains Y4036U And Y4036U (.DELTA.pex16)
[Table 13 appears as an image in the original publication and is not reproduced in this text.]

The results in Table 13 showed that knockout of the chromosomal Pex16 gene in Y4036U (.DELTA.pex16) increased the DGLA % TFAs approximately 85%, as compared to the DGLA % TFAs in strain Y4036U whose native Pex16p had not been knocked out. However, Y4036U (.DELTA.pex16) also had an approximately 40% decrease in LA (18:2) accumulation.

[0311] Thus, the results in Table 13 showed that, compared to the parent strain Y4036U, strain Y4036U (.DELTA.pex16) had higher average DGLA % TFAs (43.4% versus 23.4%). Additionally, strain Y4036U (.DELTA.pex16) had a 1.65-fold increase in the amount of DGLA relative to the total PUFAs (62.8% of the PUFAs [as a % TFAs] versus 38.1% of the PUFAs [as a % TFAs]) and a 1.3-fold increase in the amount of C20 PUFAs relative to the total PUFAs (71% of the PUFAs [as a % TFAs] versus 54.8% of the PUFAs [as a % TFAs]).
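The fold increases and the approximate 85% increase in DGLA can be checked from the averages quoted in the text. The sketch below uses only those quoted figures; the function name is illustrative.

```python
def fold_change(mutant: float, parent: float) -> float:
    """Ratio of the mutant value to the parent value."""
    return mutant / parent


dgla_tfa_fold = fold_change(43.4, 23.4)    # average DGLA % TFAs
dgla_pufa_fold = fold_change(62.8, 38.1)   # DGLA as % of total PUFAs
c20_pufa_fold = fold_change(71.0, 54.8)    # C20 PUFAs as % of total PUFAs

print(f"DGLA % TFAs: {dgla_tfa_fold:.2f}-fold "
      f"({(dgla_tfa_fold - 1) * 100:.0f}% increase)")
print(f"DGLA share of PUFAs: {dgla_pufa_fold:.2f}-fold")
print(f"C20 share of PUFAs: {c20_pufa_fold:.2f}-fold")
```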

Example 10

Generation of Y4305 Strain to Produce about 53.2% EPA of Total Lipids

[0312] Y. lipolytica strain Y4305U, having a Ura- phenotype, was used as the host in Example 11, infra. Strain Y4305 (a Ura+ strain that was parent to Y4305U) was derived from Y. lipolytica ATCC #20362, and is capable of producing about 53.2% EPA relative to the total lipids via expression of a .DELTA.9 elongase/.DELTA.8 desaturase pathway.

[0313] The development of strain Y4305U required the construction of strain Y2224, strain Y4001, strain Y4001U, strain Y4036, strain Y4036U, strain Y4070 and strain Y4086 (supra, Example 1). Further development of strain Y4305U required construction of strain Y4086U1, strain Y4128 and strain Y4128U3 (supra, Example 2). Subsequently, development of strain Y4305U (diagrammed in FIG. 10) required construction of strain Y4217 (producing 42% EPA), strain Y4217U2 (Ura-), strain Y4259 (producing 46.5% EPA), strain Y4259U2 (Ura-) and strain Y4305 (producing 53.2% EPA).

[0314] Although the details concerning transformation and selection of the EPA-producing strains developed after strain Y4128U3 are not elaborated herein, the methodology used for isolation of strain Y4217, strain Y4217U2, strain Y4259, strain Y4259U2, strain Y4305 and strain Y4305U was as described in Examples 1 and 2.

[0315] Briefly, construct pZKL2-5U89GC (FIG. 8B; SEQ ID NO:55; described in Table 24 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was generated to integrate one .DELTA.9 elongase gene (i.e., EgD9eS), one .DELTA.8 desaturase gene (i.e., EgD8M), one .DELTA.5 desaturase gene (i.e., EgD5S), and one Yarrowia lipolytica diacylglycerol cholinephosphotransferase (CPT1) gene into the Lip2 loci (GenBank Accession No. AJ012632) of strain Y4128U3 to thereby enable higher level production of EPA. Six strains, designated as Y4215, Y4216, Y4217, Y4218, Y4219 and Y4220, produced about 41.1%, 41.8%, 41.7%, 41.1%, 41% and 41.1% EPA of total lipids, respectively.

[0316] Strains Y4217U1 and Y4217U2 were created by disrupting the Ura3 gene in strain Y4217 via construct pZKUE3S (FIG. 5A; SEQ ID NO:31; described in Table 22 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference), comprising a chimeric EXP1::ME3S::Pex20 gene targeted for the Ura3 gene. Construct pZKL1-2SP98C (FIG. 8A; SEQ ID NO:54; described in Table 23 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was utilized to integrate one .DELTA.9 elongase gene (i.e., EgD9eS), one .DELTA.8 desaturase gene (i.e., EgD8M), one .DELTA.12 desaturase gene (i.e., FmD12S), and one Yarrowia lipolytica CPT1 gene into the Lip1 loci (GenBank Accession No. Z50020) of strain Y4217U2, thereby resulting in isolation of strains Y4259, Y4260, Y4261, Y4262, Y4263 and Y4264, producing about 46.5%, 44.5%, 44.5%, 44.8%, 44.5% and 44.3% EPA of total lipids, respectively.

[0317] A Ura- derivative (i.e., strain Y4259U2) was then created, via transformation with construct pZKUM (FIG. 11A; SEQ ID NO:70; described in Table 33 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference), which integrated a Ura3 mutant gene into the Ura3 gene of strain Y4259, thereby resulting in isolation of strains Y4259U1, Y4259U2 and Y4259U3, respectively (collectively, Y4259U) (producing 31.4%, 31% and 31.3% EPA of total lipids, respectively).

[0318] Finally, construct pZKD2-5U89A2 (FIG. 11B; SEQ ID NO:71) was generated to integrate one .DELTA.9 elongase gene, one .DELTA.5 desaturase gene, one .DELTA.8 desaturase gene, and one .DELTA.12 desaturase gene into the diacylglycerol acyltransferase (DGAT2) loci of strain Y4259U2, to thereby enable increased production of EPA. The pZKD2-5U89A2 plasmid contained the following components:

TABLE 14
Description of Plasmid pZKD2-5U89A2 (SEQ ID NO: 71)

RE Sites And Nucleotides Within SEQ ID NO: 71   Description Of Fragment And Chimeric Gene Components
AscI/BsiWI (1-736)                              728 bp 5' portion of Yarrowia DGAT2 gene (SEQ ID NO: 72) (labeled as "YLDGAT5'" in Figure; U.S. Pat. No. 7,267,976)
PacI/SphI (4164-3444)                           714 bp 3' portion of Yarrowia DGAT2 gene (SEQ ID NO: 72) (labeled as "YLDGAT3'" in Figure; U.S. Pat. No. 7,267,976)
SwaI/BsiWI (13377-1)                            YAT1::FmD12S::Lip2, comprising:
                                                  YAT1: Yarrowia lipolytica YAT1 promoter (labeled as "YAT" in Figure; Pat. Appl. Pub. No. US 2006/0094102-A1);
                                                  FmD12S: codon-optimized .DELTA.12 desaturase (SEQ ID NO: 74), derived from Fusarium moniliforme (labeled as "F.D12S" in Figure; Int'l. App. Pub. No. WO 2005/047485);
                                                  Lip2: Lip2 terminator sequence from Yarrowia Lip2 gene (GenBank Accession No. AJ012632)
PmeI/SwaI (10740-13377)                         FBAIN::EgD8M::Lip1, comprising:
                                                  FBAIN: Yarrowia lipolytica FBAIN promoter (U.S. Pat. No. 7,202,356);
                                                  EgD8M: Synthetic mutant .DELTA.8 desaturase (SEQ ID NO: 76; Pat. Appl. Pub. No. US 2008-0138868 A1), derived from Euglena gracilis ("EgD8S"; U.S. Pat. No. 7,256,033);
                                                  Lip1: Lip1 terminator sequence from Yarrowia Lip1 gene (GenBank Accession No. Z50020)
ClaI/PmeI (8846-10740)                          YAT1::E389D9eS::OCT, comprising:
                                                  YAT1: Yarrowia lipolytica YAT1 promoter (labeled as "YAT" in Figure; Pat. Appl. Pub. No. US 2006/0094102-A1);
                                                  E389D9eS: codon-optimized .DELTA.9 elongase (SEQ ID NO: 78), derived from Eutreptiella sp. CCMP389 (labeled as "D9ES-389" in Figure; Int'l. App. Pub. No. WO 2007/061742);
                                                  OCT: OCT terminator sequence from Yarrowia OCT gene (GenBank Accession No. X69988)
ClaI/EcoRI (8846-6777)                          Yarrowia Ura3 gene (GenBank Accession No. AJ306421)
EcoRI/PacI (6777-4164)                          EXP1::EgD5S::ACO, comprising:
                                                  EXP1: Yarrowia lipolytica export protein (EXP1) promoter (labeled as "Exp" in Figure; Int'l. App. Pub. No. WO 2006/052870);
                                                  EgD5S: codon-optimized .DELTA.5 desaturase (SEQ ID NO: 80), derived from Euglena gracilis (Pat. Appl. Pub. No. US 2007-0292924-A1);
                                                  Aco: Aco terminator sequence from Yarrowia Aco gene (GenBank Accession No. AJ001300)

[0319] The pZKD2-5U89A2 plasmid was digested with AscI/SphI and then used for transformation of strain Y4259U2 according to the General Methods. The transformed cells were plated onto MM plates, and plates were maintained at 30.degree. C. for 3 to 4 days. Single colonies were re-streaked onto MM plates, and the resulting colonies were used to inoculate liquid MM. Liquid cultures were shaken at 250 rpm for 2 days at 30.degree. C. The cells were collected by centrifugation, resuspended in HGM, and then shaken at 250 rpm for 5 days. The cells were collected by centrifugation, and lipids were extracted. FAMEs were prepared by trans-esterification and subsequently analyzed with a Hewlett-Packard 6890 GC.

[0320] GC analyses showed that most of the selected 96 strains produced 40-46% EPA of total lipids. Four strains, designated as Y4305, Y4306, Y4307 and Y4308, produced about 53.2%, 46.4%, 46.8% and 47.8% EPA of total lipids, respectively. The complete lipid profile of Y4305 is as follows: 16:0 (2.8%), 16:1 (0.7%), 18:0 (1.3%), 18:1 (4.9%), 18:2 (17.6%), ALA (2.3%), EDA (3.4%), DGLA (2.0%), ARA (0.6%), ETA (1.7%) and EPA (53.2%). The total lipid % dry cell weight was 27.5.

[0321] The final genotype of strain Y4305 with respect to wild type Yarrowia lipolytica ATCC #20362 was SCP2- (YALI0E01298g), YALI0C18711g-, Pex10-, YALI0F24167g-, unknown 1-, unknown 3-, unknown 8-, GPD::FmD12::Pex20, YAT1::FmD12::OCT, GPM/FBAIN::FmD12S::OCT, EXP1::FmD12S::Aco, YAT1::FmD12S::Lip2, YAT1::ME3S::Pex16, EXP1::ME3S::Pex20 (3 copies), GPAT::EgD9e::Lip2, EXP1::EgD9eS::Lip1, FBAINm::EgD9eS::Lip2, FBA::EgD9eS::Pex20, GPD::EgD9eS::Lip2, YAT1::EgD9eS::Lip2, YAT1::E389D9eS::OCT, FBAINm::EgD8M::Pex20, FBAIN::EgD8M::Lip1 (2 copies), EXP1::EgD8M::Pex16, GPDIN::EgD8M::Lip1, YAT1::EgD8M::Aco, FBAIN::EgD5::Aco, EXP1::EgD5S::Pex20, YAT1::EgD5S::Aco, EXP1::EgD5S::ACO, YAT1::RD5S::OCT, YAT1::PaD17S::Lip1, EXP1::PaD17::Pex16, FBAINm::PaD17::Aco, YAT1::YlCPT1::ACO, GPD::YlCPT1::ACO.

[0322] In order to disrupt the Ura3 gene in strain Y4305, construct pZKUM (FIG. 11A; SEQ ID NO:70; described in Table 33 of Int'l. App. Pub. No. WO 2008/073367, hereby incorporated herein by reference) was used to integrate a Ura3 mutant gene into the Ura3 gene of strain Y4305. A total of 8 transformants grown on MM+5-FOA plates were picked and re-streaked onto MM plates and MM+5-FOA plates, separately. All 8 strains had a Ura- phenotype (i.e., cells could grow on MM+5-FOA plates, but not on MM plates). The cells were scraped from the MM+5-FOA plates, and lipids were extracted. FAMEs were prepared by trans-esterification and subsequently analyzed with a Hewlett-Packard 6890 GC.

[0323] GC analyses showed the presence of 37.6%, 37.3% and 36.5% EPA of total lipids in pZKUM transformants #1, #6 and #7 grown on MM+5-FOA plates. These three strains were designated as strains Y4305U1, Y4305U2 and Y4305U3, respectively (collectively, Y4305U). For clarity in Example 11, strain Y4305U is referred to as strain Y4305U (.DELTA.pex10).

Example 11

Chromosomal Deletion of Pex16 in Yarrowia lipolytica Strain Y4305U (.DELTA.pex10) Further Increased Percent EPA Accumulated

The Double Pex10-Pex16 Knockout

[0324] The present Example describes use of construct pYRH13 (FIG. 9B; SEQ ID NO:59) to knock out the chromosomal Pex16 gene in Yarrowia strain Y4305U (.DELTA.pex10) (Example 10), to thereby result in a Pex10-Pex16 double mutant. The effect of the Pex10-Pex16 double knockout on total oil and EPA level was determined and compared to that of the single Pex10 knockout. Specifically, the Pex10-Pex16 double mutation in strain Y4305U (.DELTA.pex10) (.DELTA.pex16) resulted in an increased amount of EPA in the cell (EPA % TFAs and EPA % DCW), as compared to the single mutant (i.e., strain Y4305U (.DELTA.pex10)).

Generation of Yarrowia lipolytica Knockout Strain Y4305U (.DELTA.Pex10) (.DELTA.pex16)

[0325] Standard protocols were used to transform Yarrowia lipolytica strain Y4305U (.DELTA.pex10) (Example 10) with the purified 6.0 kb AscI/SphI fragment of Pex16 knockout construct pYRH13 (Example 9; SEQ ID NO:59). Screening and identification of cells having the Pex16 deletion was conducted by colony PCR, as described in Example 9.

[0326] Of 93 colonies screened, 88 had the Pex16 knockout fragment integrated at a random site in the chromosome and thus were not .DELTA.pex16 mutants (however, the cells could grow on Ura- plates, due to the presence of pYRH13). Two of these random integrants, designated as Y4305U-22 and Y4305U-25, were used as controls in lipid production experiments (infra).

[0327] The remaining 5 colonies screened (i.e., of the total 93) contained the Pex16 knockout. These five .DELTA.pex16 mutants within the Y4305U strain background were designated RHY20, RHY21, RHY22, RHY23 and RHY24. Further confirmation of the YlPex16 knockout was performed by quantitative real time PCR, as described in Example 9.
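By way of illustration only, the targeted-integration frequency implied by the colony counts above can be worked out with a short Python sketch. The helper name `percent` is hypothetical and is not part of the methods described; only the counts (93 screened, 88 random integrants) are taken from the text.

```python
# Colony counts reported above: 93 screened, 88 random integrants.
screened = 93
random_integrants = 88
knockouts = screened - random_integrants  # remaining colonies carry the Pex16 knockout


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage (hypothetical helper)."""
    return 100.0 * part / whole


targeting_frequency = percent(knockouts, screened)
print(f"{knockouts}/{screened} colonies = {targeting_frequency:.1f}% targeted integration")
```

Under these counts, roughly one colony in nineteen carried the targeted deletion, which is why a PCR screen of this size was needed to recover five mutants.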

Evaluation of Yarrowia lipolytica Strains Y4305U (.DELTA.Pex10) and Y4305U (.DELTA.Pex10) (.DELTA.pex16) for EPA Production

[0328] To evaluate the effect of mutation in multiple Pex genes on the percent of PUFAs in the total lipid fraction and the total lipid content in the cells, Y4305U (.DELTA.pex10) and Y4305U (.DELTA.pex10) (.DELTA.pex16) strains were grown under comparable oleaginous conditions. More specifically, strains Y4305U-22 and Y4305U-25 having the Pex16 knockout fragment integrated at a random site in the chromosome were considered as Pex16 wild type, Pex10 knockouts (i.e., Y4305U (.DELTA.pex10)). Strains RHY22, RHY23 and RHY24 were the double knockout mutant strains (i.e., Y4305U (.DELTA.pex10) (.DELTA.pex16)). Cultures of each strain were grown in duplicate under comparable oleaginous conditions.

[0329] Specifically, cultures were grown at a starting OD.sub.600 of .about.0.1 in 25 mL of synthetic dextrose media (SD) in a 125 mL flask for 48 hrs. The cells were harvested by centrifugation for 5 min at 4300 rpm in a 50 mL conical tube. The supernatant was discarded and the cells were re-suspended in 25 mL of HGM and transferred to a new 125 mL flask. The cells were incubated with aeration for an additional 120 hrs at 30.degree. C.

[0330] To determine the dry cell weight (DCW), the cells from 5 mL of the HGM-grown cultures were processed. The cultured cells were centrifuged for 5 min at 4300 rpm. The pellet was re-suspended using 10 mL of sterile water and was centrifuged under the same conditions for a second time. The pellet was then re-suspended using 1 mL of sterile H.sub.2O (three times) and was transferred to a pre-weighed aluminum pan. The cell suspension was dried overnight in a vacuum oven at 80.degree. C. The weight of the cells was determined.
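As a minimal sketch of the arithmetic behind the DCW determination above, the dried cell mass can be converted to g/L of culture as follows. The function name and the example pan weights are hypothetical and are not measurements from this Example; only the 5 mL culture volume is taken from the text.

```python
def dry_cell_weight_g_per_l(pan_mg: float, pan_plus_cells_mg: float,
                            culture_ml: float = 5.0) -> float:
    """Convert pan weighings (mg) into dry cell weight in g/L of culture.

    culture_ml defaults to the 5 mL of HGM-grown culture processed above.
    """
    cells_mg = pan_plus_cells_mg - pan_mg
    if cells_mg < 0:
        raise ValueError("pan with dried cells weighs less than the empty pan")
    # mg per mL is numerically equal to g per L
    return cells_mg / culture_ml


# Hypothetical weighings, for illustration only.
example_dcw = dry_cell_weight_g_per_l(pan_mg=1000.0, pan_plus_cells_mg=1025.0)
print(f"DCW = {example_dcw:.2f} g/L")
```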

[0331] To determine the total lipid content, cells from 1 mL of the HGM culture were collected by centrifugation for 1 min at 13,000 rpm. Total lipids were extracted, and FAMEs were prepared by trans-esterification and subsequently analyzed with a Hewlett-Packard 6890 GC (General Methods).

[0332] The fatty acid composition (i.e., 16:0 (palmitate), 16:1 (palmitoleic acid), 18:0, 18:1 (oleic acid), 18:2 (LA), 18:3 (ALA), EDA, DGLA, ARA, ETrA, ETA and EPA) for each of the strains is shown below in Table 15 (expressed as the weight percent (wt. %) of total fatty acids (TFA)), as well as the DCW (g/L) and total lipid content (TFAs % DCW). The average fatty acid compositions of strains Y4305U (.DELTA.pex10) and Y4305U (.DELTA.pex10) (.DELTA.pex16) are highlighted in gray and indicated with "Ave".

TABLE 15 Lipid Composition In Y. lipolytica Strains Y4305U (.DELTA.Pex10) And Y4305U (.DELTA.Pex10) (.DELTA.Pex16) [table image not reproduced]

[0333] The results in Table 15 showed that knockout of the chromosomal Pex16 gene in Y4305U (.DELTA.pex10) (.DELTA.pex16) increased the EPA % TFAs by approximately 8%, as compared to the EPA % TFAs in strain Y4305U (.DELTA.pex10) whose native Pex16p had not been knocked out. Additionally, the EPA % DCW was increased in the double mutant as compared to the single mutant strain, while the TFAs % DCW remained the same.

[0334] Thus, the results in Table 15 showed that compared to the control Y4305 (.DELTA.Pex10) strains, Y4305 (.DELTA.Pex10, .DELTA.Pex16) strains on average had higher EPA % TFAs (48.3% versus 44.7%) and higher EPA % DCW (14.57% versus 13.23%). Compared to strain Y4305 (.DELTA.Pex10), strain Y4305 (.DELTA.Pex10, .DELTA.Pex16) had only a 1.05-fold increase in the amount of EPA relative to the total PUFAs (61% of the PUFAs [as a % TFAs] versus 58.3% of the PUFAs [as a % TFAs]), while the amount of C20 PUFAs relative to the total PUFAs was effectively unchanged (73% of the PUFAs [as a % TFAs] versus 72% of the PUFAs [as a % TFAs]).
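The relative changes quoted for the double mutant can be checked with a brief Python sketch. The averages are those stated in the text; the function names `fold_change` and `percent_increase` are hypothetical and are offered only to make the arithmetic explicit.

```python
def fold_change(treated: float, control: float) -> float:
    """Ratio of the mutant value to the control value."""
    return treated / control


def percent_increase(treated: float, control: float) -> float:
    """Relative increase of the mutant over the control, in percent."""
    return 100.0 * (treated - control) / control


# Averages stated in the text (double mutant versus single mutant).
epa_tfa = percent_increase(48.3, 44.7)   # EPA % TFAs
epa_dcw = percent_increase(14.57, 13.23) # EPA % DCW
epa_share = fold_change(61.0, 58.3)      # EPA as a share of total PUFAs

print(f"EPA % TFAs increase: {epa_tfa:.1f}%")
print(f"EPA % DCW increase:  {epa_dcw:.1f}%")
print(f"EPA share of PUFAs:  {epa_share:.2f}-fold")
```

Note that the "approximately 8%" figure is a relative increase (48.3 versus 44.7), not a difference of 8 percentage points.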

Example 12

Chromosomal Deletion of Pex3 in Yarrowia lipolytica Strain Y4036U Increases Percent DGLA Accumulated

[0335] The present Example describes use of construct pY157 (FIG. 12B; SEQ ID NO:82) to knock out the chromosomal Pex3 gene (SEQ ID NO:3) in the Ura-, DGLA-producing Yarrowia strain Y4036U (Example 1). Transformation of Y. lipolytica strain Y4036U with the Pex3 knockout construct resulted in creation of strain Y4036 (.DELTA.pex3). The effect of the Pex3 knockout on DGLA level was determined and compared to the control strain Y4036 (a Ura.sup.+ strain that was parent to strain Y4036U). Specifically, knockout of Pex3 increased DGLA as a percentage of total fatty acids and improved DGLA % DCW ca. 3-fold, compared to the control.

Construct pY157

[0336] Plasmid pY87 (FIG. 12A) contained a cassette to knock out the Yarrowia lipolytica diacylglycerol acyltransferase (DGAT2) gene, as described below in Table 16:

TABLE 16 Description of Plasmid pY87 (SEQ ID NO: 83)
Columns: RE Sites And Nucleotides Within SEQ ID NO: 83; Description Of Fragment And Chimeric Gene Components
SphI/PacI (1-721): 5' portion of Yarrowia DGAT2 gene (bases 1-720 of SEQ ID NO: 72) (U.S. Pat. No. 7,267,976)
PacI/BglII (721-2459): LoxP::Ura3::LoxP, comprising:
    LoxP sequence (SEQ ID NO: 84);
    Yarrowia Ura3 gene (GenBank Accession No. AJ306421);
    LoxP sequence (SEQ ID NO: 84)
BglII/AscI (2459-3203): 3' portion of Yarrowia DGAT2 gene (bases 2468-3202 of SEQ ID NO: 72) (U.S. Pat. No. 7,267,976)
AscI/SphI (3203-5910): Vector backbone including:
    ColE1 plasmid origin of replication;
    ampicillin-resistance gene (Amp.sup.R) for selection in E. coli (4191-5051);
    E. coli f1 origin of replication

[0337] Plasmid pY157 was derived from plasmid pY87. Specifically, a 704 bp 5' promoter region of the Yarrowia lipolytica Pex3 gene replaced the SphI/PacI fragment of pY87 and a 448 bp 3' terminator region of the Yarrowia lipolytica Pex3 gene replaced the BglII/AscI fragment of pY87 to produce pY157 (SEQ ID NO:82; FIG. 12B).

Generation of Yarrowia lipolytica Knockout Strain Y4036 (.DELTA.Pex3)

[0338] Standard protocols were used to transform Yarrowia lipolytica strain Y4036U (Example 1) with the purified 3648 bp AscI/SphI fragment of Pex3 knockout construct pY157 (supra).

[0339] To screen for cells having the Pex3 deletion, colony PCR was performed using Taq polymerase (Invitrogen; Carlsbad, Calif.) and the PCR primers UP 768 (SEQ ID NO:85) and LP 769 (SEQ ID NO:86). This set of primers was designed to amplify a 2039 bp wild type band from the intact Pex3 gene and a 3719 bp knockout-specific band when the Pex3 gene was disrupted by targeted knockout.

[0340] More specifically, the colony PCR was performed using a MasterAmp Taq kit (Epicentre Technologies, Madison, Wis.; Catalog No. 82250) and the manufacturer's instructions in a 25 .mu.l reaction comprising: 2.5 .mu.l of 10.times. MasterAmp Taq buffer, 2.0 .mu.l of 25 mM MgCl.sub.2, 7.5 .mu.l of 16.times. MasterAmp Enhancer, 2.5 .mu.l of 2.5 mM dNTPs (TaKaRa Bio Inc., Otsu Shiga, Japan), 1.0 .mu.l of 10 .mu.M Upper primer, 1.0 .mu.l of 10 .mu.M Lower primer, 0.25 .mu.l of MasterAmp Taq DNA polymerase and 19.75 .mu.l of water. Amplification was carried out as follows: initial denaturation at 95.degree. C. for 5 min, followed by 40 cycles of denaturation at 95.degree. C. for 30 sec, annealing at 56.degree. C. for 60 sec, and elongation at 72.degree. C. for 4 min. A final elongation cycle at 72.degree. C. for 10 min was carried out, followed by reaction termination at 4.degree. C.
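The reaction set-up and cycling program above can be tallied with a short Python sketch, offered for illustration only. The volumes and hold times are copied from the paragraph as written; the variable names are hypothetical. Summing the listed volumes gives 36.5 .mu.l rather than the stated 25 .mu.l reaction volume, and the text does not indicate which figure is intended, so the sketch reports the sum without correcting either value.

```python
# Volumes (in microliters) exactly as listed for the colony PCR reaction.
components_ul = {
    "10x MasterAmp Taq buffer": 2.5,
    "25 mM MgCl2": 2.0,
    "MasterAmp Enhancer": 7.5,
    "2.5 mM dNTPs": 2.5,
    "10 uM upper primer": 1.0,
    "10 uM lower primer": 1.0,
    "MasterAmp Taq DNA polymerase": 0.25,
    "water": 19.75,
}
total_ul = sum(components_ul.values())

# Programmed hold times in minutes; ramp times between temperatures are ignored.
initial_denaturation_min = 5.0
cycles = 40
per_cycle_min = 0.5 + 1.0 + 4.0  # denaturation + annealing + elongation
final_elongation_min = 10.0
hold_minutes = initial_denaturation_min + cycles * per_cycle_min + final_elongation_min

print(f"Sum of listed volumes: {total_ul} uL")
print(f"Programmed hold time:  {hold_minutes:.0f} min")
```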

[0341] Of 48 colonies screened, 46 had the 2039 bp band expected from the wild type (i.e., undisrupted) Pex3 gene and thus were not .DELTA.pex3 mutants. The remaining 2 colonies showed only a faint band of 2039 bp, suggesting that they were .DELTA.pex3 mutants with some contaminating untransformed cells present in the background. This was confirmed by streaking the 2 putative knockout colonies on selection plates to isolate single colonies. Then, genomic DNA was isolated from 3 single colonies from each putative knockout strain and screened with the same primer pair, i.e., UP 768 and LP 769 (SEQ ID NOs:85 and 86). This method was considered more sensitive than colony PCR. All three single colonies from each of the two primary transformants lacked the 2039 bp wild type band and instead possessed the 3719 bp knockout-specific band. The two .DELTA.pex3 mutants within the Y4036U strain background were designated L134 and L135.
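The band-based interpretation of the PCR screen can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name `classify_colony`, the category labels and the 100 bp size tolerance are assumptions and are not specified in the text; only the expected band sizes (2039 bp and 3719 bp) are taken from the Example.

```python
WILD_TYPE_BP = 2039   # band expected from the intact Pex3 gene
KNOCKOUT_BP = 3719    # band expected from the disrupted Pex3 locus


def classify_colony(bands_bp, tolerance_bp=100):
    """Classify a colony from its observed PCR band sizes (in bp).

    tolerance_bp is an assumed allowance for gel sizing error.
    """
    has_wt = any(abs(b - WILD_TYPE_BP) <= tolerance_bp for b in bands_bp)
    has_ko = any(abs(b - KNOCKOUT_BP) <= tolerance_bp for b in bands_bp)
    if has_wt and has_ko:
        return "mixed"        # e.g., knockout with contaminating untransformed cells
    if has_ko:
        return "knockout"
    if has_wt:
        return "wild type"
    return "inconclusive"


print(classify_colony([3700]))        # a single band near 3719 bp
print(classify_colony([2050, 3719]))  # both bands present
```

A colony scored as "mixed" would correspond to the situation described above, in which re-streaking to single colonies was used to separate knockout cells from untransformed background.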

Evaluation of Yarrowia lipolytica Strains Y4036 And Y4036 (.DELTA.Pex3) for DGLA Production

[0342] To evaluate the effect of the Pex3 knockout on the percent of PUFAs in the total lipid fraction and the total lipid content in the cells, the Y4036 and Y4036 (.DELTA.pex3) strains were grown under comparable oleaginous conditions. Strains Y4036, L134 (i.e., Y4036 (.DELTA.pex3)) and L135 (i.e., Y4036 (.DELTA.pex3)) were inoculated into 25 mL of CSM-Ura and grown at 30.degree. C. overnight in a shaker. The preculture was aliquoted into fresh 25 mL CSM-Ura flasks at a final OD.sub.600 of 0.4. Cultures were grown at 30.degree. C. in a shaker. After 48 hrs, the cells (which barely grew) were spun down, resuspended in fresh 25 mL CSM-Ura and grown for a further 72 hrs. Cells were then spun down, re-suspended in 25 mL of HGM, and grown as above for 72 hrs. Cells were harvested by centrifugation, washed once in distilled water and resuspended in 25 mL water to give a final volume of 20.5 mL. An aliquot (1.5 mL) was used to determine lipid content, following extraction of the total lipids, preparation of FAMEs by base trans-esterification, and analysis by a Hewlett-Packard 6890 GC (General Methods). The remaining aliquot was dried down to measure dry cell weight (DCW), as described in Example 11.

[0343] The fatty acid composition (i.e., 16:0 (palmitate), 16:1 (palmitoleic acid), 18:0, 18:1 (oleic acid), 18:2 (LA), EDA and DGLA) for each of the strains is shown below in Table 17 (expressed as the weight percent (wt. %) of total fatty acids (TFA)), as well as the total lipid content (TFA % DCW). The conversion efficiency ("CE") was measured according to the following formula: ([product]/[substrate+product])*100, where `product` includes the immediate product and all products in the pathway derived from it. Thus, the .DELTA.12 desaturase conversion efficiency (.DELTA.12% CE) was calculated as: ([LA+EDA+DGLA]/[18:1+LA+EDA+DGLA])*100; the .DELTA.9 elongase conversion efficiency (.DELTA.9 elo % CE) was calculated as: ([EDA+DGLA]/[LA+EDA+DGLA])*100; and, the .DELTA.8 desaturase conversion efficiency (.DELTA.8% CE) was calculated as: ([DGLA]/[EDA+DGLA])*100. The average fatty acid compositions of strains Y4036, L134 and L135 are highlighted in gray and indicated with "Ave", while "S.D." indicates the Standard Deviation. As expected, the .DELTA.pex3 strains did not grow on plates with oleate as a sole source of carbon.
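The three conversion efficiency formulas above follow one pattern, which the following Python sketch makes explicit. The function names are hypothetical, and the fatty acid percentages in the example are invented round numbers chosen for easy checking; they are not values from Table 17.

```python
def conversion_efficiency(substrate: float, products: float) -> float:
    """([product] / [substrate + product]) * 100, with all downstream products pooled."""
    return 100.0 * products / (substrate + products)


def pathway_efficiencies(oleic: float, la: float, eda: float, dgla: float) -> dict:
    """Conversion efficiencies for the 18:1 -> LA -> EDA -> DGLA pathway (inputs in % TFAs)."""
    return {
        "delta12 % CE": conversion_efficiency(oleic, la + eda + dgla),
        "delta9 elo % CE": conversion_efficiency(la, eda + dgla),
        "delta8 % CE": conversion_efficiency(eda, dgla),
    }


# Hypothetical composition (% TFAs), for illustration only.
example = pathway_efficiencies(oleic=20.0, la=30.0, eda=10.0, dgla=20.0)
for name, value in example.items():
    print(f"{name}: {value:.1f}")
```

Pooling all downstream products in the numerator means that a substrate counts as converted even when it has been carried further along the pathway, so the efficiency of an early step is not understated by the activity of later steps.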

TABLE 17 Lipid Content And Composition In Y. lipolytica Strains Y4036 And Y4036 (.DELTA.Pex3) [table image not reproduced]

[0344] The results in Table 17 showed that knockout of the chromosomal Pex3 gene in Y4036 (.DELTA.pex3) increased the DGLA % TFAs approximately 142%, as compared to the DGLA % TFAs in strain Y4036 whose native Pex3p had not been knocked out. Specifically, the Pex3 knockout increased DGLA levels from ca. 19% in Y4036 to 46% in Y4036 (.DELTA.pex3) strains, L134 and L135. Additionally, the .DELTA.9 elongase percent conversion efficiency increased from ca. 48% in Y4036 to 83% in Y4036 (.DELTA.pex3) strains, L134 and L135; and, TFA % DCW increased from 4.7% to 6% in the strains L134 and L135. The LA % TFAs decreased from 30% to 12%. Pex3 deletion indeed increases the flux of fatty acids and thus the substrate availability for .DELTA.9 elongation.

[0345] Thus, the results in Table 17 showed that compared to the parent strain Y4036, the Y4036 (.DELTA.Pex3) strains had, on average, higher lipid content (TFAs % DCW) (ca. 6.0% versus 4.7%), higher DGLA % TFAs (46% versus 19%), and higher DGLA % DCW (ca. 2.8% versus 0.9%). Additionally, strain Y4036 (.DELTA.Pex3) had a 2-fold increase in the amount of DGLA relative to the total PUFAs (67.7% of the PUFAs [as a % TFAs] versus 33.3% of the PUFAs [as a % TFAs]) and a 1.7-fold increase in the amount of C20 PUFAs relative to the total PUFAs (82% of the PUFAs [as a % TFAs] versus 47% of the PUFAs [as a % TFAs]).
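The DGLA % DCW values quoted above are consistent with multiplying DGLA % TFAs by TFAs % DCW, as the following Python sketch shows. The function name `pct_of_dcw` is hypothetical; the inputs are the approximate averages stated in the text, so the outputs are approximate as well.

```python
def pct_of_dcw(fatty_acid_pct_tfa: float, tfa_pct_dcw: float) -> float:
    """Fatty acid as % of dry cell weight = (% of TFAs) x (TFAs % DCW) / 100."""
    return fatty_acid_pct_tfa * tfa_pct_dcw / 100.0


# Approximate averages stated in the text.
parent_dgla_dcw = pct_of_dcw(19.0, 4.7)   # strain Y4036
mutant_dgla_dcw = pct_of_dcw(46.0, 6.0)   # strains L134 and L135
fold = mutant_dgla_dcw / parent_dgla_dcw

print(f"Y4036:           {parent_dgla_dcw:.1f}% DGLA of DCW")
print(f"Y4036 (dpex3):   {mutant_dgla_dcw:.1f}% DGLA of DCW")
print(f"Fold improvement: {fold:.1f}")
```

The ca. 3-fold gain in DGLA % DCW therefore reflects two effects acting together: a larger DGLA share of the fatty acids and a somewhat higher total lipid content.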

[0346] It is hypothesized that the improved DGLA productivity would also result in improved EPA productivity in Yarrowia lipolytica strains engineered for EPA production (e.g., Y. lipolytica strain Y4305U, as described in Example 10, and derivatives therefrom).

Sequence CWU 1

1

8611024PRTYarrowia lipolyticaMISC_FEATURE(1)..(1024)YlPex1p; GenBank Accession No. CAG82178 1Met Thr Ser Lys Ser Asp Tyr Ser Gly Lys Asp Lys Ile Glu Leu Asp1 5 10 15Pro Val Phe Ala Lys Ser Ile Asp Leu Leu Pro Asn Thr Gln Val Val 20 25 30Ile Asp Ile Gln Leu Asn Pro Lys Ile Ala His Thr Ile His Leu Glu 35 40 45Pro Val Thr Val Ala Asp Trp Glu Ile Val Glu Leu His Ala Ala Tyr 50 55 60Leu Glu Ser Arg Met Ile Asn Gln Val Arg Ala Val Ser Pro Asn Gln65 70 75 80Pro Val Thr Val Tyr Pro Ser Ser Thr Thr Ser Ala Thr Leu Lys Val 85 90 95Ile Arg Ile Glu Pro Asp Leu Gly Ala Ala Gly Phe Ala Lys Leu Ser 100 105 110Pro Asp Ser Glu Val Val Val Ala Pro Lys Gln Arg Lys Lys Glu Glu 115 120 125Lys Gln Val Lys Lys Arg Ser Gly Ser Ala Arg Ser Thr Gly Ser Gln 130 135 140Lys Arg Lys Gly Gly Arg Gly Pro His Ala Leu Arg Arg Ala Ile Ser145 150 155 160Glu Asp Phe Asp Gly His Leu Arg Leu Glu Val Ser Leu Asp Val Ser 165 170 175Gln Leu Pro Pro Glu Phe His Gln Leu Lys Asn Val Ser Ile Lys Val 180 185 190Ile Thr Pro Pro Asn Leu Ala Ser Pro Gln Gln Ala Ala Ser Ile Ala 195 200 205Val Glu Glu Lys Ser Glu Glu Ser Leu Ser Gln Asn Lys Pro Pro Ser 210 215 220Ser Glu Pro Lys Val Glu Val Pro Pro Asp Ile Ile Asn Pro Ala Ser225 230 235 240Glu Ile Val Ala Thr Leu Val Asn Asp Thr Thr Ser Pro Thr Gly His 245 250 255Ala Lys Leu Ser Tyr Ala Leu Ala Asp Ala Leu Gly Ile Pro Ser Ser 260 265 270Val Gly His Val Ile Arg Phe Glu Ser Ala Ser Lys Pro Leu Ser Gln 275 280 285Lys Pro Gly Ala Leu Val Ile His Arg Phe Ile Thr Lys Thr Val Gly 290 295 300Ala Ala Glu Gln Lys Ser Leu Arg Leu Lys Gly Glu Lys Asn Ala Asp305 310 315 320Asp Gly Val Ser Ala Asp Asp Gln Phe Ser Leu Leu Glu Glu Leu Lys 325 330 335Lys Leu Gln Met Leu Glu Gly Pro Ile Thr Asn Phe Gln Arg Leu Pro 340 345 350Pro Ile Pro Glu Leu Leu Pro Leu Gly Gly Val Ile Gly Leu Gln Asn 355 360 365Ser Glu Gly Trp Ile Gln Gly Gly Tyr Leu Gly Glu Glu Pro Ile Pro 370 375 380Phe Val Ser Gly Ser Glu Ile Leu Arg Ser Glu Ser Ser Leu Ser Pro385 390 395 400Ser Asn Ile Glu Ser Glu Asp Lys Arg 
Val Val Gly Leu Asp Asn Met 405 410 415Leu Asn Lys Ile Asn Glu Val Leu Ser Arg Asp Ser Ile Gly Cys Leu 420 425 430Val Tyr Gly Ser Arg Gly Ser Gly Lys Ser Ala Val Leu Asn His Ile 435 440 445Lys Lys Glu Cys Lys Val Ser His Thr His Thr Val Ser Ile Ala Cys 450 455 460Gly Leu Ile Ala Gln Asp Arg Val Gln Ala Val Arg Glu Ile Leu Thr465 470 475 480Lys Ala Phe Leu Glu Ala Ser Trp Phe Ser Pro Ser Val Leu Phe Leu 485 490 495Asp Asp Ile Asp Ala Leu Met Pro Ala Glu Val Glu His Ala Asp Ser 500 505 510Ser Arg Thr Arg Gln Leu Thr Gln Leu Phe Leu Glu Leu Ala Leu Pro 515 520 525Ile Met Lys Ser Arg His Val Ser Val Val Ala Ser Ala Gln Ala Lys 530 535 540Glu Ser Leu His Met Asn Leu Val Thr Gly His Val Phe Glu Glu Leu545 550 555 560Phe His Leu Lys Ser Pro Asp Lys Glu Ala Arg Leu Ala Ile Leu Ser 565 570 575Glu Ala Val Lys Leu Met Asp Gln Asn Val Ser Phe Ser Gln Asn Asp 580 585 590Val Leu Glu Ile Ala Ser Gln Val Asp Gly Tyr Leu Pro Gly Asp Leu 595 600 605Trp Thr Leu Ser Glu Arg Ala Gln His Glu Met Ala Leu Arg Gln Ile 610 615 620Glu Ile Gly Leu Glu Asn Pro Ser Ile Gln Leu Ala Asp Phe Met Lys625 630 635 640Ala Leu Glu Asp Phe Val Pro Ser Ser Leu Arg Gly Val Lys Leu Gln 645 650 655Lys Ser Asn Val Lys Trp Asn Asp Ile Gly Gly Leu Lys Glu Thr Lys 660 665 670Ala Val Leu Leu Glu Thr Leu Glu Trp Pro Thr Lys Tyr Ala Pro Ile 675 680 685Phe Ala Ser Cys Pro Leu Arg Leu Arg Ser Gly Leu Leu Leu Tyr Gly 690 695 700Tyr Pro Gly Cys Gly Lys Thr Tyr Leu Ala Ser Ala Val Ala Ala Gln705 710 715 720Cys Gly Leu Asn Phe Ile Ser Ile Lys Gly Pro Glu Ile Leu Asn Lys 725 730 735Tyr Ile Gly Ala Ser Glu Gln Ser Val Arg Glu Leu Phe Glu Arg Ala 740 745 750Gln Ala Ala Lys Pro Cys Ile Leu Phe Phe Asp Glu Phe Asp Ser Ile 755 760 765Ala Pro Lys Arg Gly His Asp Ser Thr Gly Val Thr Asp Arg Val Val 770 775 780Asn Gln Met Leu Thr Gln Met Asp Gly Ala Glu Gly Leu Asp Gly Val785 790 795 800Tyr Val Leu Ala Ala Thr Ser Arg Pro Asp Leu Ile Asp Pro Ala Leu 805 810 815Leu Arg Pro Gly Arg Leu Asp Lys Met Leu Ile Cys Asp Leu Pro Ser 820 
825 830Tyr Glu Asp Arg Leu Asp Ile Leu Arg Ala Ile Val Asp Gly Lys Met 835 840 845His Leu Asp Gly Glu Val Glu Leu Glu Tyr Val Ala Ser Arg Thr Asp 850 855 860Gly Phe Ser Gly Ala Asp Leu Gln Ala Val Met Phe Asn Ala Tyr Leu865 870 875 880Glu Ala Ile His Glu Val Val Asp Val Ala Asp Asp Thr Ala Ala Asp 885 890 895Thr Pro Ala Leu Glu Asp Lys Arg Leu Glu Phe Phe Gln Thr Thr Leu 900 905 910Gly Asp Ala Lys Lys Asp Pro Ala Ala Val Gln Asn Glu Val Met Asn 915 920 925Ala Arg Ala Ala Val Ala Glu Lys Ala Arg Val Thr Ala Lys Leu Glu 930 935 940Ala Leu Phe Lys Gly Met Ser Val Gly Val Asp Asn Asp Asp Asp Lys945 950 955 960Pro Arg Lys Lys Ala Val Val Val Ile Lys Pro Gln His Met Asn Lys 965 970 975Ser Leu Asp Glu Thr Ser Pro Ser Ile Ser Lys Lys Glu Leu Leu Lys 980 985 990Leu Lys Gly Ile Tyr Ser Gln Phe Val Ser Gly Arg Ser Gly Asp Met 995 1000 1005Pro Pro Gly Thr Ala Ser Thr Asp Val Gly Gly Arg Ala Thr Leu 1010 1015 1020Ala2381PRTYarrowia lipolyticaMISC_FEATURE(1)..(381)YlPex2p; GenBank Accession No. CAG77647 2Met Ser Ser Val Leu Arg Leu Phe Lys Ile Gly Ala Pro Val Pro Asn1 5 10 15Val Arg Val His Gln Leu Asp Ala Ser Leu Leu Asp Ala Glu Leu Val 20 25 30Asp Leu Leu Lys Asn Gln Leu Phe Lys Gly Phe Thr Asn Phe His Pro 35 40 45Glu Phe Arg Asp Lys Tyr Glu Ser Glu Leu Val Leu Ala Leu Lys Leu 50 55 60Ile Leu Phe Lys Leu Thr Val Trp Asp His Ala Ile Thr Tyr Gly Gly65 70 75 80Lys Leu Gln Asn Leu Lys Phe Ile Asp Ser Arg His Ser Ser Lys Leu 85 90 95Gln Ile Gln Pro Ser Val Ile Gln Lys Leu Gly Tyr Gly Ile Leu Val 100 105 110Val Gly Gly Gly Tyr Leu Trp Ser Lys Ile Glu Gly Tyr Leu Leu Ala 115 120 125Arg Ser Glu Asp Asp Val Ala Thr Asp Gly Thr Ser Val Arg Gly Ala 130 135 140Ser Ala Ala Arg Gly Ala Leu Lys Val Ala Asn Phe Ala Ser Leu Leu145 150 155 160Tyr Ser Ala Ala Thr Leu Gly Asn Phe Val Ala Phe Leu Tyr Thr Gly 165 170 175Arg Tyr Ala Thr Val Ile Met Arg Leu Leu Arg Ile Arg Leu Val Pro 180 185 190Ser Gln Arg Thr Ser Ser Arg Gln Val Ser Tyr Glu Phe Gln Asn Arg 195 200 205Gln Leu Val Trp Asn Ala Phe Thr 
Glu Phe Leu Ile Phe Ile Leu Pro 210 215 220Leu Leu Gln Leu Pro Lys Leu Lys Arg Arg Ile Glu Arg Lys Leu Gln225 230 235 240Ser Leu Asn Val Thr Arg Val Gly Asn Val Glu Glu Ala Ser Glu Gly 245 250 255Glu Leu Ala His Leu Pro Gln Lys Thr Cys Ala Ile Cys Phe Arg Asp 260 265 270Glu Glu Glu Gln Glu Gly Gly Gly Gly Ala Ser His Tyr Ser Thr Asp 275 280 285Val Thr Asn Pro Tyr Gln Ala Asp Cys Gly His Val Tyr Cys Tyr Val 290 295 300Cys Leu Val Thr Lys Leu Ala Gln Gly Asp Gly Asp Gly Trp Asn Cys305 310 315 320Tyr Arg Cys Ala Lys Gln Val Gln Lys Met Lys Pro Trp Val Asp Val 325 330 335Asp Glu Ala Ala Val Val Gly Ala Ala Glu Met His Glu Lys Val Asp 340 345 350Val Ile Glu His Ala Glu Asp Asn Glu Gln Glu Glu Glu Glu Phe Asp 355 360 365Asp Asp Asp Glu Asp Ser Asn Phe Gln Leu Met Lys Asp 370 375 3803431PRTYarrowia lipolyticaMISC_FEATURE(1)..(431)YlPex3p; GenBank Accession No. CAG78565 3Met Asp Phe Phe Arg Arg His Gln Lys Lys Val Leu Ala Leu Val Gly1 5 10 15Val Ala Leu Ser Ser Tyr Leu Phe Ile Asp Tyr Val Lys Lys Lys Phe 20 25 30Phe Glu Ile Gln Gly Arg Leu Ser Ser Glu Arg Thr Ala Lys Gln Asn 35 40 45Leu Arg Arg Arg Phe Glu Gln Asn Gln Gln Asp Ala Asp Phe Thr Ile 50 55 60Met Ala Leu Leu Ser Ser Leu Thr Thr Pro Val Met Glu Arg Tyr Pro65 70 75 80Val Asp Gln Ile Lys Ala Glu Leu Gln Ser Lys Arg Arg Pro Thr Asp 85 90 95Arg Val Leu Ala Leu Glu Ser Ser Thr Ser Ser Ser Ala Thr Ala Gln 100 105 110Thr Val Pro Thr Met Thr Ser Gly Ala Thr Glu Glu Gly Glu Lys Ser 115 120 125Lys Thr Gln Leu Trp Gln Asp Leu Lys Arg Thr Thr Ile Ser Arg Ala 130 135 140Phe Ser Leu Val Tyr Ala Asp Ala Leu Leu Ile Phe Phe Thr Arg Leu145 150 155 160Gln Leu Asn Ile Leu Gly Arg Arg Asn Tyr Val Asn Ser Val Val Ala 165 170 175Leu Ala Gln Gln Gly Arg Glu Gly Asn Ala Glu Gly Arg Val Ala Pro 180 185 190Ser Phe Gly Asp Leu Ala Asp Met Gly Tyr Phe Gly Asp Leu Ser Gly 195 200 205Ser Ser Ser Phe Gly Glu Thr Ile Val Asp Pro Asp Leu Asp Glu Gln 210 215 220Tyr Leu Thr Phe Ser Trp Trp Leu Leu Asn Glu Gly Trp Val Ser Leu225 230 235 240Ser Glu 
Arg Val Glu Glu Ala Val Arg Arg Val Trp Asp Pro Val Ser 245 250 255Pro Lys Ala Glu Leu Gly Phe Asp Glu Leu Ser Glu Leu Ile Gly Arg 260 265 270Thr Gln Met Leu Ile Asp Arg Pro Leu Asn Pro Ser Ser Pro Leu Asn 275 280 285Phe Leu Ser Gln Leu Leu Pro Pro Arg Glu Gln Glu Glu Tyr Val Leu 290 295 300Ala Gln Asn Pro Ser Asp Thr Ala Ala Pro Ile Val Gly Pro Thr Leu305 310 315 320Arg Arg Leu Leu Asp Glu Thr Ala Asp Phe Ile Glu Ser Pro Asn Ala 325 330 335Ala Glu Val Ile Glu Arg Leu Val His Ser Gly Leu Ser Val Phe Met 340 345 350Asp Lys Leu Ala Val Thr Phe Gly Ala Thr Pro Ala Asp Ser Gly Ser 355 360 365Pro Tyr Pro Val Val Leu Pro Thr Ala Lys Val Lys Leu Pro Ser Ile 370 375 380Leu Ala Asn Met Ala Arg Gln Ala Gly Gly Met Ala Gln Gly Ser Pro385 390 395 400Gly Val Glu Asn Glu Tyr Ile Asp Val Met Asn Gln Val Gln Glu Leu 405 410 415Thr Ser Phe Ser Ala Val Val Tyr Ser Ser Phe Asp Trp Ala Leu 420 425 4304395PRTYarrowia lipolyticaMISC_FEATURE(1)..(395)YlPex3Bp; GenBank Accession No. CAG83356 4Met Leu Gln Ser Leu Asn Arg Asn Lys Lys Arg Leu Ala Val Ser Thr1 5 10 15Gly Leu Ile Ala Val Ala Tyr Val Val Ile Ser Tyr Thr Thr Lys Arg 20 25 30Leu Ile Glu Lys Gln Glu Gln Lys Leu Glu Glu Glu Arg Ala Lys Glu 35 40 45Arg Leu Lys Gln Leu Phe Ala Gln Thr Gln Asn Glu Ala Ala Phe His 50 55 60Thr Ala Ser Val Leu Pro Gln Leu Cys Glu Gln Ile Met Glu Phe Val65 70 75 80Ala Val Glu Lys Ile Ala Glu Gln Leu Gln Asn Met Arg Ala Glu Lys 85 90 95Arg Lys Lys Gln Asn Met Asp Asp Asp Lys His Ser Val Leu Ser Leu 100 105 110Gly Thr Glu Thr Thr Ala Ser Met Ala Asp Gly Gln Lys Met Ser Lys 115 120 125Ile Gln Leu Trp Asp Glu Leu Lys Ile Glu Ser Leu Thr Arg Ile Val 130 135 140Thr Leu Ile Tyr Cys Val Ser Leu Leu Asn Tyr Leu Ile Arg Leu Gln145 150 155 160Thr Asn Ile Val Gly Arg Lys Arg Tyr Gln Asn Glu Ala Gly Pro Ala 165 170 175Gly Ala Thr Tyr Asp Met Ser Leu Glu Gln Cys Tyr Thr Trp Leu Leu 180 185 190Thr Arg Gly Trp Lys Ser Val Val Asp Asn Val Arg Arg Ser Val Gln 195 200 205Gln Val Phe Thr Gly Val Asn Pro Arg Gln Asn Leu Ser 
Leu Asp Glu 210 215 220Phe Ala Thr Leu Leu Lys Arg Val Gln Thr Leu Val Asn Ser Pro Pro225 230 235 240Tyr Ser Thr Thr Pro Asn Thr Phe Leu Thr Ser Leu Leu Pro Pro Arg 245 250 255Glu Leu Glu Gln Leu Arg Leu Glu Lys Glu Lys Gln Ser Leu Ser Pro 260 265 270Asn Tyr Thr Tyr Gly Ser Pro Leu Lys Asp Leu Val Phe Glu Ser Ala 275 280 285Gln His Ile Gln Ser Pro Gln Gly Met Ser Ser Phe Arg Ala Ile Ile 290 295 300Asp Gln Ser Phe Lys Val Phe Leu Glu Lys Val Asn Glu Ser Gln Tyr305 310 315 320Val Asn Pro Pro Ser Thr Gly Gly Lys Arg Ile Ala Val Gly Ala Leu 325 330 335Gln Pro Pro Ile Ile Ser Gly Gly Pro Lys Lys Val Lys Leu Ala Ser 340 345 350Leu Leu Ser Val Ala Thr Arg Gln Ser Ser Val Ile Ser His Ala Gln 355 360 365Pro Asn Pro Tyr Val Asp Ala Ile Asn Ser Val Ala Glu Tyr Asn Gly 370 375 380Leu Cys Ala Val Ile Tyr Ser Ser Phe Glu Gln385 390 3955153PRTYarrowia lipolyticaMISC_FEATURE(1)..(153)YlPex4p; GenBank Accession No. CAG79130 5Met Ala Ser Gln Lys Arg Leu Ile Lys Glu Leu Ala Ala Tyr Lys Lys1 5 10 15Asp Pro Asn Pro Cys Leu Ala Ser Leu Thr Ala Asp Gly Asp Ser Leu 20 25 30Tyr Lys Trp Thr Ala Val Met Arg Gly Thr Glu Gly Thr Ala Tyr Glu 35 40 45Asn Gly Leu Trp Gln Val Glu Ile Asn Ile Pro Glu Asn Tyr Pro Leu 50 55 60Gln Pro Pro Thr Met Phe Phe Arg Thr Lys Ile Cys His Pro Asn Ile65 70 75 80His Phe Glu Thr Gly Glu Val Cys Ile Asp Val Leu Lys Thr Gln Trp 85 90 95Ser Pro Ala Trp Thr Ile Ser Ser Ala Cys Thr Ala Val Ser Ala Met 100 105 110Leu Ser Leu Pro Glu Pro Asp Ser Pro Leu Asn Ile Asp Ala Ala Asn 115 120 125Leu Val Arg Cys Gly Asp Glu Ser Ala Met Glu Gly Leu Val Arg Tyr 130 135 140Tyr Val Asn Lys Tyr Ala Ser Gly Asn145 1506598PRTYarrowia lipolyticaMISC_FEATURE(1)..(598)YlPex5p; GenBank Accession No. CAG78803 6Met Ser Phe Met Arg Gly Gly Ser Glu Cys Ser Thr Gly Arg Asn Pro1 5 10

15Leu Ser Gln Phe Thr Lys His Thr Ala Glu Asp Arg Ser Leu Gln His 20 25 30Asp Arg Val Ala Gly Pro Ser Gly Gly Arg Val Gly Gly Met Arg Ser 35 40 45Asn Thr Gly Glu Met Ser Gln Gln Asp Arg Glu Met Met Ala Arg Phe 50 55 60Gly Ala Ala Gly Pro Glu Gln Ser Ser Phe Asn Tyr Glu Gln Met Arg65 70 75 80His Glu Leu His Asn Met Gly Ala Gln Gly Gly Gln Ile Pro Gln Val 85 90 95Pro Ser Gln Gln Gly Ala Ala Asn Gly Gly Gln Trp Ala Arg Asp Phe 100 105 110Gly Gly Gln Gln Thr Ala Pro Gly Ala Ala Pro Gln Asp Ala Lys Asn 115 120 125Trp Asn Ala Glu Phe Gln Arg Gly Gly Ser Pro Ala Glu Ala Met Gln 130 135 140Gln Gln Gly Pro Gly Pro Met Gln Gly Gly Met Gly Met Gly Gly Met145 150 155 160Pro Met Tyr Gly Met Ala Arg Pro Met Tyr Ser Gly Met Ser Ala Asn 165 170 175Met Ala Pro Gln Phe Gln Pro Gln Gln Ala Asn Ala Arg Val Val Glu 180 185 190Leu Asp Glu Gln Asn Trp Glu Glu Gln Phe Lys Gln Met Asp Ser Ala 195 200 205Val Gly Lys Gly Lys Glu Val Glu Glu Gln Thr Ala Glu Thr Ala Thr 210 215 220Ala Thr Glu Thr Val Thr Glu Thr Glu Thr Thr Thr Glu Asp Lys Pro225 230 235 240Met Asp Ile Lys Asn Met Asp Phe Glu Asn Ile Trp Lys Asn Leu Gln 245 250 255Val Asn Val Leu Asp Asn Met Asp Glu Trp Leu Glu Glu Thr Asn Ser 260 265 270Pro Ala Trp Glu Arg Asp Phe His Glu Tyr Thr His Asn Arg Pro Glu 275 280 285Phe Ala Asp Tyr Gln Phe Glu Glu Asn Asn Gln Phe Met Glu His Pro 290 295 300Asp Pro Phe Lys Ile Gly Val Glu Leu Met Glu Thr Gly Gly Arg Leu305 310 315 320Ser Glu Ala Ala Leu Ala Phe Glu Ala Ala Val Gln Lys Asn Thr Glu 325 330 335His Ala Glu Ala Trp Gly Arg Leu Gly Ala Cys Gln Ala Gln Asn Glu 340 345 350Lys Glu Asp Pro Ala Ile Arg Ala Leu Glu Arg Cys Ile Lys Leu Glu 355 360 365Pro Gly Asn Leu Ser Ala Leu Met Asn Leu Ser Val Ser Tyr Thr Asn 370 375 380Glu Gly Tyr Glu Asn Ala Ala Tyr Ala Thr Leu Glu Arg Trp Leu Ala385 390 395 400Thr Lys Tyr Pro Glu Val Val Asp Gln Ala Arg Asn Gln Glu Pro Arg 405 410 415Leu Gly Asn Glu Asp Lys Phe Gln Leu His Ser Arg Val Thr Glu Leu 420 425 430Phe Ile Arg Ala Ala Gln Leu Ser Pro Asp Gly Ala 
Asn Ile Asp Ala 435 440 445Asp Val Gln Val Gly Leu Gly Val Leu Phe Tyr Gly Asn Glu Glu Tyr 450 455 460Asp Lys Ala Ile Asp Cys Phe Asn Ala Ala Ile Ala Val Arg Pro Asp465 470 475 480Asp Ala Leu Leu Trp Asn Arg Leu Gly Ala Thr Leu Ala Asn Ser His 485 490 495Arg Ser Glu Glu Ala Ile Asp Ala Tyr Tyr Lys Ala Leu Glu Leu Arg 500 505 510Pro Ser Phe Val Arg Ala Arg Tyr Asn Leu Gly Val Ser Cys Ile Asn 515 520 525Ile Gly Cys Tyr Lys Glu Ala Ala Gln Tyr Leu Leu Gly Ala Leu Ser 530 535 540Met His Lys Val Glu Gly Val Gln Asp Asp Val Leu Ala Asn Gln Ser545 550 555 560Thr Asn Leu Tyr Asp Thr Leu Lys Arg Val Phe Leu Gly Met Asp Arg 565 570 575Arg Asp Leu Val Ala Lys Val Gly Asn Gly Met Asp Val Asn Gln Phe 580 585 590Arg Asn Glu Phe Glu Phe 59571024PRTYarrowia lipolyticaMISC_FEATURE(1)..(1024)YlPex6p; GenBank Accession No. CAG82306 7Met Pro Ser Ile Ser His Lys Pro Ile Thr Ala Lys Leu Val Ala Ala1 5 10 15Pro Asp Ala Thr Lys Leu Glu Leu Ser Ser Tyr Leu Tyr Gln Gln Leu 20 25 30Phe Ser Asp Lys Pro Ala Glu Pro Tyr Val Ala Phe Glu Ala Pro Gly 35 40 45Ile Lys Trp Ala Leu Tyr Pro Ala Ser Glu Asp Arg Ser Leu Pro Gln 50 55 60Tyr Thr Cys Lys Ala Asp Ile Arg His Val Ala Gly Ser Leu Lys Lys65 70 75 80Phe Met Pro Val Val Leu Lys Arg Val Asn Pro Val Thr Ile Glu His 85 90 95Ala Ile Val Thr Val Pro Ala Ser Gln Tyr Glu Thr Leu Asn Thr Pro 100 105 110Glu Gln Val Leu Lys Ala Leu Glu Pro Gln Leu Asp Lys Asp Arg Pro 115 120 125Val Ile Arg Gln Gly Asp Val Leu Leu Asn Gly Cys Arg Val Arg Leu 130 135 140Cys Glu Pro Val Asn Gln Gly Lys Val Val Lys Gly Thr Thr Lys Leu145 150 155 160Thr Val Ala Lys Glu Gln Glu Thr Ile Gln Pro Ala Asp Glu Ala Ala 165 170 175Asp Val Ala Phe Asp Ile Ala Glu Phe Leu Asp Phe Asp Thr Ser Val 180 185 190Ala Lys Thr Arg Glu Ser Thr Asn Leu Gln Val Ala Pro Leu Glu Gly 195 200 205Ala Ile Pro Thr Pro Leu Ser Asp Arg Phe Asp Asp Cys Glu Ser Arg 210 215 220Gly Phe Val Lys Ser Glu Thr Met Ser Lys Leu Gly Val Phe Ser Gly225 230 235 240Asp Ile Val Ser Ile Lys Thr Lys Asn Gly Ala Glu Arg Val 
Leu Arg 245 250 255Leu Phe Ala Tyr Pro Glu Pro Asn Thr Val Lys Tyr Asp Val Val Tyr 260 265 270Val Ser Pro Ile Leu Tyr His Asn Ile Gly Asp Lys Glu Ile Glu Val 275 280 285Thr Pro Asn Gly Glu Thr His Lys Ser Val Gly Glu Ala Leu Asp Ser 290 295 300Val Leu Glu Ala Ala Glu Glu Val Lys Leu Ala Arg Val Leu Gly Pro305 310 315 320Thr Thr Thr Asp Arg Thr Phe Gln Thr Ala Tyr His Ala Gly Leu Gln 325 330 335Ala Tyr Phe Lys Pro Val Lys Arg Ala Val Arg Val Gly Asp Leu Ile 340 345 350Pro Ile Pro Phe Asp Ser Ile Leu Ala Arg Thr Ile Gly Glu Asp Pro 355 360 365Glu Met Ser His Ile Pro Leu Glu Ala Leu Ala Val Lys Pro Asp Ser 370 375 380Val Ala Trp Phe Gln Val Thr Ser Leu Asn Gly Ser Glu Asp Pro Ala385 390 395 400Ser Lys Gln Tyr Leu Val Asp Ser Ser Gln Thr Lys Leu Ile Glu Gly 405 410 415Gly Thr Thr Ser Ser Ala Val Ile Pro Thr Ser Val Pro Trp Arg Glu 420 425 430Tyr Leu Gly Leu Asp Thr Leu Pro Lys Phe Gly Ser Glu Phe Ala Tyr 435 440 445Ala Asp Lys Ile Arg Asn Leu Val Gln Ile Ser Thr Ser Ala Leu Ser 450 455 460His Ala Lys Leu Asn Thr Ser Val Leu Leu His Ser Ala Lys Arg Gly465 470 475 480Val Gly Lys Ser Thr Val Leu Arg Ser Val Ala Ala Gln Cys Gly Ile 485 490 495Ser Val Phe Glu Ile Ser Cys Phe Gly Leu Ile Gly Asp Asn Glu Ala 500 505 510Gln Thr Leu Gly Thr Leu Arg Ala Lys Leu Asp Arg Ala Tyr Gly Cys 515 520 525Ser Pro Cys Val Val Val Leu Gln His Leu Glu Ser Ile Ala Lys Lys 530 535 540Ser Asp Gln Asp Gly Lys Asp Glu Gly Ile Val Ser Lys Leu Val Asp545 550 555 560Val Leu Ala Asp Tyr Ser Gly His Gly Val Leu Leu Ala Ala Thr Ser 565 570 575Asn Asp Pro Asp Lys Ile Ser Glu Ala Ile Arg Ser Arg Phe Gln Phe 580 585 590Glu Ile Glu Ile Gly Val Pro Ser Glu Pro Gln Arg Arg Gln Ile Phe 595 600 605Ser His Leu Thr Lys Ser Gly Pro Gly Gly Asp Ser Ile Arg Asn Ala 610 615 620Pro Ile Ser Leu Arg Ser Asp Val Ser Val Glu Asn Leu Ala Leu Gln625 630 635 640Ser Ala Gly Leu Thr Pro Pro Asp Leu Thr Ala Ile Val Gln Thr Thr 645 650 655Arg Leu Arg Ala Ile Asp Arg Leu Asn Lys Leu Thr Lys Asp Ser Asp 660 665 670Thr Thr Leu Asp 
Asp Leu Leu Thr Leu Ser His Gly Thr Leu Gln Leu 675 680 685Thr Pro Ser Asp Phe Asp Asp Ala Ile Ala Asp Ala Arg Gln Lys Tyr 690 695 700Ser Asp Ser Ile Gly Ala Pro Arg Ile Pro Asn Val Gly Trp Asp Asp705 710 715 720Val Gly Gly Met Glu Gly Val Lys Lys Asp Ile Leu Asp Thr Ile Glu 725 730 735Thr Pro Leu Lys Tyr Pro His Trp Phe Ser Asp Gly Val Lys Lys Arg 740 745 750Ser Gly Ile Leu Phe Tyr Gly Pro Pro Gly Thr Gly Lys Thr Leu Leu 755 760 765Ala Lys Ala Ile Ala Thr Thr Phe Ser Leu Asn Phe Phe Ser Val Lys 770 775 780Gly Pro Glu Leu Leu Asn Met Tyr Ile Gly Glu Ser Glu Ala Asn Val785 790 795 800Arg Arg Val Phe Gln Lys Ala Arg Asp Ala Lys Pro Cys Val Val Phe 805 810 815Phe Asp Glu Leu Asp Ser Val Ala Pro Gln Arg Gly Asn Gln Gly Asp 820 825 830Ser Gly Gly Val Met Asp Arg Ile Val Ser Gln Leu Leu Ala Glu Leu 835 840 845Asp Gly Met Ser Thr Ala Gly Gly Glu Gly Val Phe Val Val Gly Ala 850 855 860Thr Asn Arg Pro Asp Leu Leu Asp Glu Ala Leu Leu Arg Pro Gly Arg865 870 875 880Phe Asp Lys Met Leu Tyr Leu Gly Ile Ser Asp Thr His Glu Lys Gln 885 890 895Gln Thr Ile Met Glu Ala Leu Thr Arg Lys Phe Arg Leu Ala Ala Asp 900 905 910Val Ser Leu Glu Ala Ile Ser Lys Arg Cys Pro Phe Thr Phe Thr Gly 915 920 925Ala Asp Phe Tyr Ala Leu Cys Ser Asp Ala Met Leu Asn Ala Met Thr 930 935 940Arg Thr Ala Asn Glu Val Asp Ala Lys Ile Lys Leu Leu Asn Lys Asn945 950 955 960Arg Glu Glu Ala Gly Glu Glu Pro Val Ser Ile Arg Trp Trp Phe Asp 965 970 975His Glu Ala Thr Lys Ser Asp Ile Glu Val Glu Val Ala Gln Gln Asp 980 985 990Phe Glu Lys Ala Lys Asp Glu Leu Ser Pro Ser Val Ser Ala Glu Glu 995 1000 1005Leu Gln His Tyr Leu Lys Leu Arg Gln Gln Phe Glu Gly Gly Lys 1010 1015 1020Lys8356PRTYarrowia lipolyticaMISC_FEATURE(1)..(356)YlPex7p; GenBank Accession No. 
CAG78389 8Met Leu Gly Phe Lys Thr Gln Gly Phe Asn Gly Tyr Ala Ala Asn Tyr1 5 10 15Ser Pro Phe Phe Asn Asp Lys Ile Ala Val Gly Thr Ala Ala Asn Tyr 20 25 30Gly Leu Val Gly Asn Gly Lys Leu Phe Ile Leu Gly Ile Ser Pro Glu 35 40 45Gly Arg Met Val Cys Glu Gly Gln Phe Asp Thr Gln Asp Gly Ile Phe 50 55 60Asp Val Ala Trp Ser Glu Gln His Glu Asn His Val Ala Thr Ala Cys65 70 75 80Gly Asp Gly Ser Val Lys Leu Phe Asp Ile Lys Ala Gly Ala Phe Pro 85 90 95Leu Val Ser Phe Lys Glu His Thr Arg Glu Val Phe Ser Val Asn Trp 100 105 110Asn Met Ala Asn Lys Ala Leu Phe Cys Thr Ser Ser Trp Asp Ser Thr 115 120 125Ile Lys Ile Trp Thr Pro Glu Arg Thr Asn Ser Ile Met Thr Leu Gly 130 135 140Gln Pro Ala Pro Ala Gln Gly Thr Asn Ala Ser Ala His Ile Gly Arg145 150 155 160Gln Thr Ala Pro Asn Gln Ala Ala Ala Gln Glu Cys Ile Tyr Ser Ala 165 170 175Lys Phe Ser Pro His Thr Asp Ser Ile Ile Ala Ser Ala His Ser Thr 180 185 190Gly Met Val Lys Val Trp Asp Thr Arg Ala Pro Gln Pro Leu Gln Gln 195 200 205Gln Phe Ser Thr Gln Gln Thr Glu Ser Gly Gly Pro Pro Glu Val Leu 210 215 220Ser Leu Asp Trp Asn Lys Tyr Arg Pro Thr Val Ile Ala Thr Gly Gly225 230 235 240Val Asp Arg Ser Val Gln Val Tyr Asp Ile Arg Met Thr Gln Pro Ala 245 250 255Ala Asn Gln Pro Val Gln Pro Leu Ser Leu Ile Leu Gly His Arg Leu 260 265 270Pro Val Arg Gly Val Ser Trp Ser Pro His His Ala Asp Leu Leu Leu 275 280 285Ser Cys Ser Tyr Asp Met Thr Ala Arg Val Trp Arg Asp Ala Ser Thr 290 295 300Gly Gly Asn Tyr Leu Ala Arg Gln Arg Gly Gly Thr Glu Val Lys Cys305 310 315 320Met Asp Arg His Thr Glu Phe Val Ile Gly Gly Asp Trp Ser Leu Trp 325 330 335Gly Asp Pro Gly Trp Ile Thr Thr Val Gly Trp Asp Gln Met Val Tyr 340 345 350Val Trp His Ala 3559671PRTYarrowia lipolyticaMISC_FEATURE(1)..(671)YlPex8p; GenBank Accession No. 
CAG80447 9Met Asn Lys Tyr Leu Val Pro Pro Pro Gln Ala Asn Arg Thr Val Thr1 5 10 15Asn Leu Asp Leu Leu Ile Asn Asn Leu Arg Gly Ser Ser Thr Pro Gly 20 25 30Ala Ala Glu Val Asp Thr Arg Asp Ile Leu Gln Arg Ile Val Phe Ile 35 40 45Leu Pro Thr Ile Lys Asn Pro Leu Asn Leu Asp Leu Val Ile Lys Glu 50 55 60Ile Ile Asn Ser Pro Arg Leu Leu Pro Pro Leu Ile Asp Leu His Asp65 70 75 80Tyr Gln Gln Leu Thr Asp Ala Phe Arg Ala Thr Ile Lys Arg Lys Ala 85 90 95Leu Val Thr Asp Pro Thr Ile Ser Phe Glu Ala Trp Leu Glu Thr Cys 100 105 110Phe Gln Val Ile Thr Arg Phe Ala Gly Pro Gly Trp Lys Lys Leu Pro 115 120 125Leu Leu Ala Gly Leu Ile Leu Ala Asp Tyr Asp Ile Ser Ala Asp Gly 130 135 140Pro Thr Leu Glu Arg Lys Pro Gly Phe Pro Ser Lys Leu Lys His Leu145 150 155 160Leu Lys Arg Glu Phe Val Thr Thr Phe Asp Gln Cys Leu Ser Ile Asp 165 170 175Thr Arg Asn Arg Ser Asp Ala Thr Lys Trp Val Pro Val Leu Ala Cys 180 185 190Ile Ser Ile Ala Gln Val Tyr Ser Leu Leu Gly Asp Val Ala Ile Asn 195 200 205Tyr Arg Arg Phe Leu Gln Val Gly Leu Asp Leu Ile Phe Ser Asn Tyr 210 215 220Gly Leu Glu Met Gly Thr Ala Leu Ala Arg Leu His Ala Glu Ser Gly225 230 235 240Gly Asp Ala Thr Thr Ala Gly Gly Leu Ile Gly Lys Lys Leu Lys Glu 245 250 255Pro Val Val Ala Leu Leu Asn Thr Phe Ala His Ile Ala Ser Ser Cys 260 265 270Ile Val His Val Asp Ile Asp Tyr Ile Asp Arg Ile Gln Asn Lys Ile 275 280 285Ile Leu Val Cys Glu Asn Gln Ala Glu Thr Trp Arg Ile Leu Thr Ile 290 295 300Glu Ser Pro Thr Val Met His His Gln Glu Ser Val Gln Tyr Leu Lys305 310 315 320Trp Glu Leu Phe Thr Leu Cys Ile Ile Met Gln Gly Ile Ala Asn Met 325 330 335Leu Leu Thr Gln Lys Met Asn Gln Phe Met Tyr Leu Gln Leu Ala Tyr 340 345 350Lys Gln Leu Gln Ala Leu His Ser Ile Tyr Phe Ile Val Asp Gln Met 355 360 365Gly Ser Gln Phe Ala Ala Tyr Asp Tyr Val Phe Phe Ser Ala Ile Asp 370 375 380Val Leu Leu Ser Glu Tyr Ala Pro Tyr Ile Lys Asn Arg Gly Thr Ile385 390 395 400Pro Pro Asn Lys Glu Phe Val Ala Glu Arg Leu Ala Ala Asn Leu Ala 405 410 415Gly Thr Ser Asn Val Gly Ser His Leu Pro Ile 
Asp Arg Ser Arg Val 420 425 430Leu Phe Ala Leu Asn Tyr Tyr Glu Gln Leu Val Thr Val Cys His Asp 435 440 445Ser Cys Val Glu Thr Ile Ile Tyr Pro Met Ala Arg Ser Phe Leu Tyr 450 455 460Pro Thr Ser Asp Ile Gln Gln Leu Lys Pro Leu Val Glu Ala Ala His465
470 475 480Ser Val Ile Leu Ala Gly Leu Ala Val Pro Thr Asn Ala Val Val Asn 485 490 495Ala Lys Leu Ile Pro Glu Tyr Met Gly Gly Val Leu Pro Leu Phe Pro 500 505 510Gly Val Phe Ser Trp Asn Gln Phe Val Leu Ala Ile Gln Ser Ile Val 515 520 525Asn Thr Val Ser Pro Pro Ser Glu Val Phe Lys Thr Asn Gln Lys Leu 530 535 540Phe Arg Leu Val Leu Asp Ser Leu Met Lys Lys Cys Arg Asp Thr Pro545 550 555 560Val Gly Ile Pro Val Pro His Ser Val Thr Val Ser Gln Glu Gln Glu 565 570 575Asp Ile Pro Pro Thr Gln Arg Ala Val Val Met Leu Ala Leu Ile Asn 580 585 590Ser Leu Pro Tyr Val Asp Ile Arg Ser Phe Glu Leu Trp Leu Gln Glu 595 600 605Thr Trp Asn Met Ile Glu Ala Thr Pro Met Leu Ala Glu Asn Ala Pro 610 615 620Asn Lys Glu Leu Ala His Ala Glu His Glu Phe Leu Val Leu Glu Met625 630 635 640Trp Lys Met Ile Ser Gly Asn Ile Asp Gln Arg Leu Asn Asp Val Ala 645 650 655Ile Arg Trp Trp Tyr Lys Lys Asn Ala Arg Val His Gly Thr Leu 660 665 67010377PRTYarrowia lipolyticaMISC_FEATURE(1)..(377)YlPex10p; GenBank Accession No. CAG81606 10Met Trp Gly Ser Ser His Ala Phe Ala Gly Glu Ser Asp Leu Thr Leu1 5 10 15Gln Leu His Thr Arg Ser Asn Met Ser Asp Asn Thr Thr Ile Lys Lys 20 25 30Pro Ile Arg Pro Lys Pro Ile Arg Thr Glu Arg Leu Pro Tyr Ala Gly 35 40 45Ala Ala Glu Ile Ile Arg Ala Asn Gln Lys Asp His Tyr Phe Glu Ser 50 55 60Val Leu Glu Gln His Leu Val Thr Phe Leu Gln Lys Trp Lys Gly Val65 70 75 80Arg Phe Ile His Gln Tyr Lys Glu Glu Leu Glu Thr Ala Ser Lys Phe 85 90 95Ala Tyr Leu Gly Leu Cys Thr Leu Val Gly Ser Lys Thr Leu Gly Glu 100 105 110Glu Tyr Thr Asn Leu Met Tyr Thr Ile Arg Asp Arg Thr Ala Leu Pro 115 120 125Gly Val Val Arg Arg Phe Gly Tyr Val Leu Ser Asn Thr Leu Phe Pro 130 135 140Tyr Leu Phe Val Arg Tyr Met Gly Lys Leu Arg Ala Lys Leu Met Arg145 150 155 160Glu Tyr Pro His Leu Val Glu Tyr Asp Glu Asp Glu Pro Val Pro Ser 165 170 175Pro Glu Thr Trp Lys Glu Arg Val Ile Lys Thr Phe Val Asn Lys Phe 180 185 190Asp Lys Phe Thr Ala Leu Glu Gly Phe Thr Ala Ile His Leu Ala Ile 195 200 205Phe Tyr Val Tyr Gly Ser Tyr Tyr 
Gln Leu Ser Lys Arg Ile Trp Gly 210 215 220Met Arg Tyr Val Phe Gly His Arg Leu Asp Lys Asn Glu Pro Arg Ile225 230 235 240Gly Tyr Glu Met Leu Gly Leu Leu Ile Phe Ala Arg Phe Ala Thr Ser 245 250 255Phe Val Gln Thr Gly Arg Glu Tyr Leu Gly Ala Leu Leu Glu Lys Ser 260 265 270Val Glu Lys Glu Ala Gly Glu Lys Glu Asp Glu Lys Glu Ala Val Val 275 280 285Pro Lys Lys Lys Ser Ser Ile Pro Phe Ile Glu Asp Thr Glu Gly Glu 290 295 300Thr Glu Asp Lys Ile Asp Leu Glu Asp Pro Arg Gln Leu Lys Phe Ile305 310 315 320Pro Glu Ala Ser Arg Ala Cys Thr Leu Cys Leu Ser Tyr Ile Ser Ala 325 330 335Pro Ala Cys Thr Pro Cys Gly His Phe Phe Cys Trp Asp Cys Ile Ser 340 345 350Glu Trp Val Arg Glu Lys Pro Glu Cys Pro Leu Cys Arg Gln Gly Val 355 360 365Arg Glu Gln Asn Leu Leu Pro Ile Arg 370 37511408PRTYarrowia lipolyticaMISC_FEATURE(1)..(408)YlPex12p; GenBank Accession No. CAG81532 11Met Asp Tyr Phe Ser Ser Leu Asn Ala Ser Gln Leu Asp Pro Asp Val1 5 10 15Pro Thr Leu Phe Glu Leu Leu Ser Ala Lys Gln Leu Glu Gly Leu Ile 20 25 30Ala Pro Ser Val Arg Tyr Ile Leu Ala Phe Tyr Ala Gln Arg His Pro 35 40 45Arg Tyr Leu Leu Arg Ile Val Asn Arg Tyr Asp Glu Leu Tyr Ala Leu 50 55 60Phe Met Gly Leu Val Glu Tyr Tyr Asn Leu Lys Thr Trp Asn Ala Ser65 70 75 80Phe Thr Glu Lys Phe Tyr Gly Leu Lys Arg Thr Gln Ile Leu Thr Asn 85 90 95Pro Ala Leu Arg Thr Arg Gln Ala Val Pro Asp Leu Val Glu Ala Glu 100 105 110Lys Arg Leu Ser Lys Lys Lys Ile Trp Gly Ser Leu Phe Phe Leu Ile 115 120 125Val Val Pro Tyr Val Lys Glu Lys Leu Asp Ala Arg Tyr Glu Arg Leu 130 135 140Lys Gly Arg Tyr Leu Ala Arg Asp Ile Asn Glu Glu Arg Ile Glu Ile145 150 155 160Lys Arg Thr Gly Thr Ala Gln Gln Ile Ala Val Phe Glu Phe Asp Tyr 165 170 175Trp Leu Leu Lys Leu Tyr Pro Ile Val Thr Met Gly Cys Thr Thr Ala 180 185 190Thr Leu Ala Phe His Met Leu Phe Leu Phe Ser Val Thr Arg Ala Tyr 195 200 205Ser Ile Asp Asp Phe Leu Leu Asn Ile Gln Phe Ser Arg Met Thr Arg 210 215 220Tyr Asp Tyr Gln Met Glu Thr Gln Arg Asp Ser Arg Asn Ala Ala Asn225 230 235 240Val Ala His Thr Met Lys 
Ser Ile Ser Glu Tyr Pro Val Ala Glu Arg 245 250 255Val Met Leu Leu Leu Thr Thr Lys Ala Gly Ala Asn Ala Met Arg Ser 260 265 270Ala Ala Leu Ser Gly Leu Ser Tyr Val Leu Pro Thr Ser Ile Phe Ala 275 280 285Leu Lys Phe Leu Glu Trp Trp Tyr Ala Ser Asp Phe Ala Arg Gln Leu 290 295 300Asn Gln Lys Arg Arg Gly Asp Leu Glu Asp Asn Leu Pro Val Pro Asp305 310 315 320Lys Val Lys Gly Ala Asp Lys Leu Ala Glu Ser Val Ala Lys Trp Lys 325 330 335Glu Asp Thr Ser Lys Cys Pro Leu Cys Ser Lys Glu Leu Val Asn Pro 340 345 350Thr Val Ile Glu Ser Gly Tyr Val Phe Cys Tyr Thr Cys Ile Tyr Arg 355 360 365His Leu Glu Asp Gly Asp Glu Glu Thr Gly Gly Arg Cys Pro Val Thr 370 375 380Gly Gln Lys Leu Leu Gly Cys Arg Trp Gln Asp Asp Val Trp Gln Val385 390 395 400Thr Gly Leu Arg Arg Leu Met Val 40512412PRTYarrowia lipolyticaMISC_FEATURE(1)..(412)YlPex13p; GenBank Accession No. CAG81789 12Met Ser Val Pro Arg Pro Lys Pro Trp Glu Gly Ala Ser Gly Ser Ser1 5 10 15Ala Ala Thr Ala Thr Pro Ala Ala Thr Ala Thr Pro Ala Ser Thr Asp 20 25 30Ala Val Ser Ser Ser Ala Gly Ser Ala Thr Gly Ala Pro Glu Leu Pro 35 40 45Ser Arg Pro Ser Ala Met Gly Ser Thr Ser Asn Ala Leu Ser Ser Pro 50 55 60Met Gly Ser Ser Met Asn Ser Gly Tyr Gly Gly Met Asn Ser Gly Tyr65 70 75 80Gly Gly Met Gly Ser Ser Tyr Gly Ser Gly Tyr Gly Ser Ser Tyr Gly 85 90 95Met Gly Ser Ser Tyr Gly Ser Gly Tyr Gly Ser Gly Leu Gly Gly Tyr 100 105 110Gly Ser Tyr Gly Gly Met Gly Gly Met Gly Gly Met Tyr Gly Ser Arg 115 120 125Tyr Gly Gly Tyr Gly Ser Tyr Gly Gly Met Gly Gly Tyr Gly Gly Tyr 130 135 140Gly Gly Met Gly Gly Gly Pro Met Gly Gln Asn Gly Leu Ala Gly Gly145 150 155 160Thr Gln Ala Thr Phe Gln Leu Ile Glu Ser Ile Val Gly Ala Val Gly 165 170 175Gly Phe Ala Gln Met Leu Glu Ser Thr Tyr Met Ala Thr Gln Ser Ser 180 185 190Phe Phe Ala Met Val Ser Val Ala Glu Gln Phe Gly Asn Leu Lys Asn 195 200 205Thr Leu Gly Ser Leu Leu Gly Ile Tyr Ala Ile Met Arg Trp Ala Arg 210 215 220Arg Leu Val Ala Lys Leu Ser Gly Gln Pro Val Thr Gly Ala Asn Gly225 230 235 240Ile Thr Pro Ala Gly Phe 
Ala Lys Phe Glu Ala Thr Gly Gly Ala Ala 245 250 255Gly Pro Gly Arg Gly Pro Arg Pro Ser Tyr Lys Pro Leu Leu Phe Phe 260 265 270Leu Thr Ala Val Phe Gly Leu Pro Tyr Leu Leu Gly Arg Leu Ile Lys 275 280 285Ala Leu Ala Ala Lys Gln Glu Gly Met Tyr Asp Glu His Gly Asn Leu 290 295 300Leu Pro Gly Ala Gln Met Gly Met Gly Gly Pro Gly Met Glu Gly Gly305 310 315 320Ala Glu Ile Asp Pro Ser Lys Leu Glu Phe Cys Arg Ala Asn Phe Asp 325 330 335Phe Val Pro Glu Asn Pro Gln Leu Glu Leu Glu Leu Arg Lys Gly Asp 340 345 350Leu Val Ala Val Leu Ala Lys Thr Asp Pro Met Gly Asn Pro Ser Gln 355 360 365Trp Trp Arg Val Arg Thr Arg Asp Gly Arg Ser Gly Tyr Val Pro Ala 370 375 380Asn Tyr Leu Glu Val Ile Pro Arg Pro Ala Val Glu Ala Pro Lys Lys385 390 395 400Val Glu Glu Ile Gly Ala Ser Ala Val Pro Val Asn 405 41013380PRTYarrowia lipolyticaMISC_FEATURE(1)..(380)YlPex14p; GenBank Accession No. CAG79323 13Met Ile Pro Ser Cys Leu Ser Thr Gln His Met Ala Pro Arg Glu Asp1 5 10 15Leu Val Gln Ser Ala Val Ala Phe Leu Asn Asp Pro Gln Ala Ala Thr 20 25 30Ala Pro Leu Ala Lys Arg Ile Glu Phe Leu Glu Ser Lys Asp Met Thr 35 40 45Pro Glu Glu Ile Glu Glu Ala Leu Lys Arg Ala Gly Ser Gly Ser Ala 50 55 60Gln Ser His Pro Gly Ser Val Val Ser His Gly Gly Ala Ala Pro Thr65 70 75 80Val Pro Ala Ser Tyr Ala Phe Gln Ser Ala Pro Pro Leu Pro Glu Arg 85 90 95Asp Trp Lys Asp Val Phe Ile Met Ala Thr Val Thr Val Gly Val Gly 100 105 110Phe Gly Leu Tyr Thr Val Ala Lys Arg Tyr Leu Met Pro Leu Ile Leu 115 120 125Pro Pro Thr Pro Pro Ser Leu Glu Ala Asp Lys Glu Ala Leu Glu Ala 130 135 140Glu Phe Ala Arg Val Gln Gly Leu Leu Asp Gln Val Gln Gln Asp Thr145 150 155 160Glu Glu Val Lys Asn Ser Gln Val Glu Val Ala Lys Arg Val Thr Asp 165 170 175Ala Leu Lys Gly Val Glu Glu Thr Ile Asp Gln Leu Lys Ser Gln Thr 180 185 190Lys Lys Arg Asp Asp Glu Met Lys Leu Val Thr Ala Glu Val Glu Arg 195 200 205Ile Arg Asp Arg Leu Pro Lys Asn Ile Asp Lys Leu Lys Asp Ser Gln 210 215 220Glu Gln Gly Leu Ala Asp Ile Gln Ser Glu Leu Lys Ser Leu Lys Gln225 230 235 240Leu 
Leu Ser Thr Arg Thr Ala Ala Ser Ser Gly Pro Lys Leu Pro Pro 245 250 255Ile Pro Pro Pro Ser Ser Tyr Leu Thr Arg Lys Ala Ser Pro Ala Val 260 265 270Pro Ala Ala Ala Pro Ala Pro Val Thr Pro Gly Ser Pro Val His Asn 275 280 285Val Ser Ser Ser Ser Thr Val Pro Ala Asp Arg Asp Asp Phe Ile Pro 290 295 300Thr Pro Ala Gly Ala Val Pro Met Ile Pro Gln Pro Ala Ser Met Ser305 310 315 320Ser Ser Ser Thr Ser Thr Val Pro Asn Ser Ala Ile Ser Ser Ala Pro 325 330 335Ser Pro Ile Gln Glu Pro Glu Pro Phe Val Pro Glu Pro Gly Asn Ser 340 345 350Ala Val Lys Lys Pro Ala Pro Lys Ala Ser Ile Pro Ala Trp Gln Leu 355 360 365Ala Ala Leu Glu Lys Glu Lys Glu Lys Glu Lys Glu 370 375 38014391PRTYarrowia lipolyticaMISC_FEATURE(1)..(391)YlPex16p; GenBank Accession No. CAG79622 14Met Thr Asp Lys Leu Val Lys Val Met Gln Lys Lys Lys Ser Ala Pro1 5 10 15Gln Thr Trp Leu Asp Ser Tyr Asp Lys Phe Leu Val Arg Asn Ala Ala 20 25 30Ser Ile Gly Ser Ile Glu Ser Thr Leu Arg Thr Val Ser Tyr Val Leu 35 40 45Pro Gly Arg Phe Asn Asp Val Glu Ile Ala Thr Glu Thr Leu Tyr Ala 50 55 60Val Leu Asn Val Leu Gly Leu Tyr His Asp Thr Ile Ile Ala Arg Ala65 70 75 80Val Ala Ala Ser Pro Asn Ala Ala Ala Val Tyr Arg Pro Ser Pro His 85 90 95Asn Arg Tyr Thr Asp Trp Phe Ile Lys Asn Arg Lys Gly Tyr Lys Tyr 100 105 110Ala Ser Arg Ala Val Thr Phe Val Lys Phe Gly Glu Leu Val Ala Glu 115 120 125Met Val Ala Lys Lys Asn Gly Gly Glu Met Ala Arg Trp Lys Cys Ile 130 135 140Ile Gly Ile Glu Gly Ile Lys Ala Gly Leu Arg Ile Tyr Met Leu Gly145 150 155 160Ser Thr Leu Tyr Gln Pro Leu Cys Thr Thr Pro Tyr Pro Asp Arg Glu 165 170 175Val Thr Gly Glu Leu Leu Glu Thr Ile Cys Arg Asp Glu Gly Glu Leu 180 185 190Asp Ile Glu Lys Gly Leu Met Asp Pro Gln Trp Lys Met Pro Arg Thr 195 200 205Gly Arg Thr Ile Pro Glu Ile Ala Pro Thr Asn Val Glu Gly Tyr Leu 210 215 220Leu Thr Lys Val Leu Arg Ser Glu Asp Val Asp Arg Pro Tyr Asn Leu225 230 235 240Leu Ser Arg Leu Asp Asn Trp Gly Val Val Ala Glu Leu Leu Ser Ile 245 250 255Leu Arg Pro Leu Ile Tyr Ala Cys Leu Leu Phe Arg Gln His Val 
Asn 260 265 270Lys Thr Val Pro Ala Ser Thr Lys Ser Lys Phe Pro Phe Leu Asn Ser 275 280 285Pro Trp Ala Pro Trp Ile Ile Gly Leu Val Ile Glu Ala Leu Ser Arg 290 295 300Lys Met Met Gly Ser Trp Leu Leu Arg Gln Arg Gln Ser Gly Lys Thr305 310 315 320Pro Thr Ala Leu Asp Gln Met Glu Val Lys Gly Arg Thr Asn Leu Leu 325 330 335Gly Trp Trp Leu Phe Arg Gly Glu Phe Tyr Gln Ala Tyr Thr Arg Pro 340 345 350Leu Leu Tyr Ser Ile Val Ala Arg Leu Glu Lys Ile Pro Gly Leu Gly 355 360 365Leu Phe Gly Ala Leu Ile Ser Asp Tyr Leu Tyr Leu Phe Asp Arg Tyr 370 375 380Tyr Phe Thr Ala Ser Thr Leu385 39015225PRTYarrowia lipolyticaMISC_FEATURE(1)..(225)YlPex17p; GenBank Accession No. CAG84025 15Met Ser Ala Phe Pro Glu Pro Ser Ser Phe Glu Ile Glu Phe Ala Lys1 5 10 15Gln Met Asn Arg Pro Arg Thr Val Gln Phe Lys Gln Leu Val Ala Val 20 25 30Leu Tyr Ile Phe Gly Gly Thr Ser Ala Leu Ile Tyr Ile Ile Ser Lys 35 40 45Thr Ile Leu Asn Pro Leu Phe Glu Glu Leu Thr Phe Ala Arg Ser Glu 50 55 60Tyr Ala Ile His Ala Arg Arg Leu Met Glu Gln Leu Asn Ala Lys Leu65 70 75 80Ser Ser Met Ala Ser Tyr Ile Pro Pro Val Arg Ala Leu Gln Gly Gln 85 90 95Arg Phe Val Asp Ala Gln Thr Gln Thr Glu Asp Glu Glu Gly Glu Asp 100 105 110Ile Pro Asn Pro Ser Leu Gly Lys Ser Ser His Val Ser Phe Gly Glu 115 120 125Ser Pro Met Gln Leu Lys Leu Ala Glu Lys Glu Lys Gln Gln Lys Leu 130 135 140Ile Asp Asp Ser Val Asp Asn Leu Glu Arg Leu Ala Asp Ser Leu Lys145 150 155 160His Ala Gly Glu Val Ser Asp Leu Ser Ala Leu Ser Gly Phe Lys Tyr 165 170 175Gln Val Glu Glu Leu Thr Asn Tyr Ser Asp Gln Leu Ala Met Ser Gly 180 185 190Tyr Ser Met Met Lys Ser Gly Leu Pro Gly His Glu Thr Ala Met Ser 195 200 205Glu Thr Lys Lys Glu Ile Arg Ser Leu Lys Gly Ser Val Leu Ser Val 210 215 220Arg22516324PRTYarrowia
lipolyticaMISC_FEATURE(1)..(324)YlPex19p; GenBank Accession No. AAK84827 16Met Ser His Glu Glu Asp Leu Asp Asp Leu Asp Asp Phe Leu Asp Glu1 5 10 15Phe Asp Glu Gln Val Leu Ser Lys Pro Pro Gly Ala Gln Lys Asp Ala 20 25 30Thr Pro Thr Thr Ser Thr Ala Pro Thr Thr Ala Glu Ala Lys Pro Asp 35 40 45Ala Thr Lys Lys Ser Thr Glu Thr Ser Gly Thr Asp Ser Lys Thr Glu 50 55 60Gly Ala Asp Thr Ala Asp Lys Asn Ala Ala Thr Asp Ser Ala Glu Ala65 70 75 80Gly Ala Glu Lys Val Ser Leu Pro Asn Leu Glu Asp Gln Leu Ala Gly 85 90 95Leu Lys Met Asp Asp Phe Leu Lys Asp Ile Glu Ala Asp Pro Glu Ser 100 105 110Lys Ala Gln Phe Glu Ser Leu Leu Lys Glu Ile Asn Asn Val Thr Ser 115 120 125Ala Thr Ala Ser Glu Lys Ala Gln Gln Pro Lys Ser Phe Lys Glu Thr 130 135 140Ile Ser Ala Thr Ala Asp Arg Leu Asn Gln Ser Asn Gln Glu Met Gly145 150 155 160Asp Met Pro Leu Gly Asp Asp Met Leu Ala Gly Leu Met Glu Gln Leu 165 170 175Ser Gly Ala Gly Gly Phe Gly Glu Gly Gly Glu Gly Asp Phe Gly Asp 180 185 190Met Leu Gly Gly Ile Met Arg Gln Leu Ala Ser Lys Glu Val Leu Tyr 195 200 205Gln Pro Leu Lys Glu Met His Asp Asn Tyr Pro Lys Trp Trp Asp Glu 210 215 220His Gly Ser Lys Val Thr Glu Glu Lys Glu Arg Asp Arg Leu Lys Leu225 230 235 240Gln Gln Asp Ile Val Gly Lys Ile Cys Ala Lys Phe Glu Asp Pro Ser 245 250 255Tyr Ser Asp Asp Ser Glu Ala Asp Arg Ala Val Ile Thr Gln Leu Met 260 265 270Asp Glu Met Gln Glu Thr Gly Ala Pro Pro Asp Glu Ile Met Ser Asn 275 280 285Val Ala Asp Gly Ser Ile Pro Gly Gly Leu Asp Gly Leu Gly Leu Gly 290 295 300Gly Leu Gly Gly Gly Lys Met Pro Glu Met Pro Glu Asn Met Pro Glu305 310 315 320Cys Asn Gln Gln17417PRTYarrowia lipolyticaMISC_FEATURE(1)..(417)YlPex20p; GenBank Accession No. 
CAG79226 17Met Ala Ser Cys Gly Pro Ser Asn Ala Leu Gln Asn Leu Ser Lys His1 5 10 15Ala Ser Ala Asp Arg Ser Leu Gln His Asp Arg Met Ala Pro Gly Gly 20 25 30Ala Pro Gly Ala Gln Arg Gln Gln Phe Arg Ser Gln Thr Gln Gly Gly 35 40 45Gln Leu Asn Asn Glu Phe Gln Gln Phe Ala Gln Ala Gly Pro Ala His 50 55 60Asn Ser Phe Glu Gln Ser Gln Met Gly Pro His Phe Gly Gln Gln His65 70 75 80Phe Gly Gln Pro His Gln Pro Gln Met Gly Gln His Ala Pro Met Ala 85 90 95His Gly Gln Gln Ser Asp Trp Ala Gln Ser Phe Ser Gln Leu Asn Leu 100 105 110Gly Pro Gln Thr Gly Pro Gln His Thr Gln Gln Ser Asn Trp Gly Gln 115 120 125Asp Phe Met Arg Gln Ser Pro Gln Ser His Gln Val Gln Pro Gln Met 130 135 140Ala Asn Gly Val Met Gly Ser Met Ser Gly Met Ser Ser Phe Gly Pro145 150 155 160Met Tyr Ser Asn Ser Gln Leu Met Asn Ser Thr Tyr Gly Leu Gln Thr 165 170 175Glu His Gln Gln Thr His Lys Thr Glu Thr Lys Ser Ser Gln Asp Ala 180 185 190Ala Phe Glu Ala Ala Phe Gly Ala Val Glu Glu Ser Ile Thr Lys Thr 195 200 205Ser Asp Lys Gly Lys Glu Val Glu Lys Asp Pro Met Glu Gln Thr Tyr 210 215 220Arg Tyr Asp Gln Ala Asp Ala Leu Asn Arg Gln Ala Glu His Ile Ser225 230 235 240Asp Asn Ile Ser Arg Glu Glu Val Asp Ile Lys Thr Asp Glu Asn Gly 245 250 255Glu Phe Ala Ser Ile Ala Arg Gln Ile Ala Ser Ser Leu Glu Glu Ala 260 265 270Asp Lys Ser Lys Phe Glu Lys Ser Thr Phe Met Asn Leu Met Arg Arg 275 280 285Ile Gly Asn His Glu Val Thr Leu Asp Gly Asp Lys Leu Val Asn Lys 290 295 300Glu Gly Glu Asp Ile Arg Glu Glu Val Arg Asp Glu Leu Leu Arg Glu305 310 315 320Gly Ala Ser Gln Glu Asn Gly Phe Gln Ser Glu Ala Gln Gln Thr Ala 325 330 335Pro Leu Pro Val His His Glu Ala Pro Pro Pro Glu Gln Ile His Pro 340 345 350His Thr Glu Thr Gly Asp Lys Gln Leu Glu Asp Pro Met Val Tyr Ile 355 360 365Glu Gln Glu Ala Ala Arg Arg Ala Ala Glu Ser Gly Arg Thr Val Glu 370 375 380Glu Glu Lys Leu Asn Phe Tyr Ser Pro Phe Glu Tyr Ala Gln Lys Leu385 390 395 400Gly Pro Gln Gly Val Ala Lys Gln Ser Asn Trp Glu Glu Asp Tyr Asp 405 410 415Phe18195PRTYarrowia 
lipolyticaMISC_FEATURE(1)..(195)YlPex22p; GenBank Accession No. CAG77876 18Val Pro Arg Cys Thr Ser His Pro Cys Asn Leu Thr Leu His Leu Pro1 5 10 15Val Thr Thr Met Ala Pro Arg Lys Thr Arg Leu Pro Ala Val Ile Gly 20 25 30Ala Ala Ala Ala Ala Ala Ala Val Ala Tyr Leu Val Tyr Ser Phe Val 35 40 45Ala Lys Ser Asn Ser Asp Gln Asp Thr Phe Asp Ser Ser Val Gln Ser 50 55 60Ser Ser Lys Ser Ser Thr Lys Ser Pro Lys Ser Thr Ala Thr Asn Ser65 70 75 80Lys Ile Thr Val Val Val Ser Gln Glu Leu Val Gln Ser Gln Leu Val 85 90 95Asp Phe Lys His Leu Met Ser Val His Pro Asn Leu Val Val Ile Val 100 105 110Pro Pro Met Val Ala Asn Lys Phe His Arg Ala Leu Lys Ser Ser Val 115 120 125Gly His Asp His Gly Val Lys Val Ile Arg Cys Asp Thr Asp Val Gly 130 135 140Val Ile His Val Ile Lys His Ile Arg Pro Asp Leu Ala Leu Ile Ala145 150 155 160Asp Gly Val Gly Asp Asn Ile Gln Gly Glu Ile Lys Arg Phe Val Gly 165 170 175Ser Ser Glu Ala Leu Ser Gly Asp Val Asn Leu Ala Ala Glu Arg Leu 180 185 190Thr Gly Leu 19519386PRTYarrowia lipolyticaMISC_FEATURE(1)..(386)YlPex26p; GenBank Accession No. 
NC_006072, antisense translation of nucleotides 117230-118387 19Met Pro Pro Ala Met Pro Gln Met Thr Thr Ser Thr Leu Leu Thr Asp1 5 10 15Ser Val Thr Ser Ala Val Asn Gln Ala Ala Thr Pro Lys Val Asp Gln 20 25 30Met Tyr Gln Thr Phe Gly Glu Ser Ala Arg Glu Phe Val Asn Lys Asn 35 40 45Phe Tyr Asn Ser Tyr Glu Leu Ile Arg Pro Phe Phe Asp Glu Ile Thr 50 55 60Ala Lys Gly Ala Gln Gln Asn Gly Ser Thr Val Leu Asp Ala Glu Asn65 70 75 80Pro His Asn Ile Pro Leu Ser Leu Trp Ile Lys Val Trp Ser Leu Tyr 85 90 95Leu Ala Ile Leu Asp Ala Ser Cys Lys Gln Ala Gly Glu Ala Leu Leu 100 105 110Asn Ser Thr Gly Asp Leu Ser Gly Ser Asp Ser Gly Glu Trp Asn Gln 115 120 125Thr Arg Lys Leu Leu Ala Arg Lys Leu Thr Ser Gly Ser Val Trp Asp 130 135 140Glu Leu Val Thr Ala Ser Gly Gly Thr Gly Asn Ile His Pro Thr Ile145 150 155 160Leu Ala Leu Leu Ala Ser Leu Ser Ile Arg His Asp Thr Asp Ala Lys 165 170 175Leu Met Ala Asp Asn Leu Glu Lys Phe Ile Val Thr Tyr Asn Asp Asn 180 185 190Gly Ser Asp Asp Val Lys Thr Lys Thr Ala Phe Tyr Lys Val Leu Asp 195 200 205Leu Tyr Leu Leu Arg Val Leu Pro Asp Leu Gly Gln Trp Asp Val Ala 210 215 220His Ser Phe Val Asn Asn Thr Asn Leu Phe Ser His Glu Gln Lys Lys225 230 235 240Glu Met Thr His Lys Leu Asp Gln Ser Gln Lys His Ala Glu Gln Glu 245 250 255His Lys Arg Leu Leu Glu Glu Ala Gln Glu Lys Glu Lys Ser Asp Ala 260 265 270Lys Glu Lys Glu Arg Glu Glu Arg Val Ser Arg Asp Thr Gln Ser Arg 275 280 285Glu Ile Lys Ser Pro Ile Val Asp Ser Ser Thr Ser Ser Arg Asp Val 290 295 300Thr Arg Asp Thr Thr Arg Glu Leu Ser Lys Ser Ser Arg Gln Pro Arg305 310 315 320Thr Leu Ser Gln Ile Ile Ser Thr Ser Leu Lys Ser Gln Phe Asp Gly 325 330 335Asn Ala Ile Phe Arg Thr Leu Ala Leu Ile Val Ile Val Ser Leu Ser 340 345 350Ala Ala Asn Pro Leu Ile Arg Lys Arg Val Val Asp Thr Leu Lys Met 355 360 365Leu Trp Ile Lys Ile Leu Gln Thr Leu Ser Met Gly Phe Lys Val Ser 370 375 380Tyr Leu385203387DNAYarrowia lipolyticamisc_feature(1)..(3387)GenBank Accession No. 
AB036770 20ggtaccatca agggtaaaat caaggctatc atcaagggcc atatatcgca agtttggggg 60aagataatat gttcatagtg aatcgggttg tggatttcct catctaacgg cattataact 120agtcctggag ggtctttttt atggataacc tccatgtacg atgtatccaa gatctccacg 180tactgtgttc tgtttcctaa gtaataccca acaacctctc caacaaacac ttgggaagat 240gcacttgtgc tgagatgtca agatgttaga gagtagagac agtagcaagc gtaaaaggcg 300gccgaggcca ccgagagaac agcgtagcag ggcgcgtagt caccacaggg gacgcagaac 360caaacaaatg acgaagaaga accacaagga gacgttttca aaggcaatgc aaacgaagag 420ggcaatggaa ggattgagat tagagaactg gagactggag tggcgttttc ccgatgaacg 480aacaaacacg cgaagctatg tggaccaaca tacaacacgg actgaaccag gtttttttat 540gattttttta ctggaaatag gtacgtgcca agttggacca tgacactaaa cgtgtttaat 600tagtaatatt cgtgtaagcg tacattcatt tcaaaggtta ttctttcacg gcaaagttat 660aattaaatga atgtatatgc agaaaaaaaa aaaaaaagta ctgtactgga tggagagaat 720attaataaat aattgttacc caactacatc ttgtcgattg aaagagaccc ctaagacaga 780taggatatct gcaacccgag gaatgaaccc cccagcaccg gcaccctttc tattaacaaa 840atgccaactg aaatttgaaa agttcaacta aacttatttg acccacaaaa actcgtcaaa 900agtggcggcg aaagctggca aatgatgaca tccccttgga accatgatat cctctcggaa 960tcttcgtccc catttgccac atctacttgc aacgccacat ctgcttacta agcaacccaa 1020atctgcctcg gctcaaaatg tggggaagtt cacatgcatt cgctggtgaa tctgatctga 1080cactacaact acacaccagg tccaacatga gcgacaatac gacaatcaaa aagccgatcc 1140gacccaaacc gatccggacg gaacgcctgc cttacgctgg ggccgcagaa atcatccgag 1200ccaaccagaa agaccactac tttgagtccg tgcttgaaca gcatctcgtc acgtttctgc 1260agaaatggaa gggagtacga tttatccacc agtacaagga ggagctggag acggcgtcca 1320agtttgcata tctcggtttg tgtacgcttg tgggctccaa gactctcgga gaagagtaca 1380ccaatctcat gtacactatc agagaccgaa cagctctacc gggggtggtg agacggtttg 1440gctacgtgct ttccaacact ctgtttccat acctgtttgt gcgctacatg ggcaagttgc 1500gcgccaaact gatgcgcgag tatccccatc tggtggagta cgacgaagat gagcctgtgc 1560ccagcccgga aacatggaag gagcgggtca tcaagacgtt tgtgaacaag tttgacaagt 1620tcacggcgct ggaggggttt accgcgatcc acttggcgat tttctacgtc tacggctcgt 1680actaccagct cagtaagcgg atctggggca tgcgttatgt 
atttggacac cgactggaca 1740agaatgagcc tcgaatcggt tacgagatgc tcggtctgct gattttcgcc cggtttgcca 1800cgtcatttgt gcagacggga agagagtacc tcggagcgct gctggaaaag agcgtggaga 1860aagaggcagg ggagaaggaa gatgaaaagg aagcggttgt gccgaaaaag aagtcgtcaa 1920ttccgttcat tgaggataca gaaggggaga cggaagacaa gatcgatctg gaggaccctc 1980gacagctcaa gttcattcct gaggcgtcca gagcgtgcac tctgtgtctg tcatacatta 2040gtgcgccggc atgtacgcca tgtggacact ttttctgttg ggactgtatt tccgaatggg 2100tgagagagaa gcccgagtgt cccttgtgtc ggcagggtgt gagagagcag aacttgttgc 2160ctatcagata atgacgaggt ctggatggaa ggactagtca gcgagacaca gagcatcagg 2220gaccagacac gaccaattca atcgacaaca ctgtgctgca tagcagtgca cagaggtcct 2280gggcatgaat atattttagc attggagata tgagtggtag agcgtataca gtattaattg 2340tggaggtatc tcgtcgcatt gatagagcaa tacagttact gctgaaggga atgataccga 2400gtatttcggc ccgattcagt tcttgatatc gtcattttgt ctctattgtc tacttttcag 2460ataacctcaa caaatcttca acaaatctcc cagtaaacag tcagagatca tatccgagat 2520catatcagat atgtcacgat ccgagtacaa taatggatat taatctgctt gattttgaat 2580tctgttgcga ttatgatttc tttgatttcg atatgaacac atacggcgac tcccagacct 2640ttagaagctc cagtttggat tcttagcaat ggttacactc aactatatcc caagtaatac 2700ttggtaacaa tatgccaagt tagtcattca ttcgttatag gagttagcaa gtgtttgtca 2760gctaaaaatg gttagtcggt cgattaccac ttagatcttt tcagcgtgga acttgatggt 2820acgcttgaac cgacacttgg agtagtcggg gctgttgatg acgtagatga cgtttcgctc 2880agggtgagga gtgcaatagt agtactcctt ggggccgtct ctcagctcaa aggttccatc 2940ggcggcaatg tcaaagaccg agccctggag cttgtagccg tagtcgccgg tccagaacaa 3000agcctgcagc tccagatagg cgatgggcat gtcgttaaca gagaaggtgt tgccctcgcc 3060ctcggtgatg gtgatgggtt cgccgtcggt ggaggcggtg atcaggtcat cttggtaggt 3120gacgggcaga gattcgaccg attgggcgtc tgatctggta taggtcagct tgtacttgtc 3180tccgacagcc gccagagcgg tggtagcgac ggtgatgagg gagatgagtt tcatattggc 3240ggcaagttta gcaaaagatg gcagtgggat tgagggacaa gagtgtttat atagatatag 3300atacaacaca acgagtctga atgagacaac cgagacaacc actcccgaag cctcactaat 3360agttactaac ggcatatttc aggtacc 3387211134DNAYarrowia lipolyticaCDS(1)..(1134)Pex10; 
GenBank Accession No. AB036770, nucleotides 1038-2171 21atg tgg gga agt tca cat gca ttc gct ggt gaa tct gat ctg aca cta 48Met Trp Gly Ser Ser His Ala Phe Ala Gly Glu Ser Asp Leu Thr Leu1 5 10 15caa cta cac acc agg tcc aac atg agc gac aat acg aca atc aaa aag 96Gln Leu His Thr Arg Ser Asn Met Ser Asp Asn Thr Thr Ile Lys Lys 20 25 30ccg atc cga ccc aaa ccg atc cgg acg gaa cgc ctg cct tac gct ggg 144Pro Ile Arg Pro Lys Pro Ile Arg Thr Glu Arg Leu Pro Tyr Ala Gly 35 40 45gcc gca gaa atc atc cga gcc aac cag aaa gac cac tac ttt gag tcc 192Ala Ala Glu Ile Ile Arg Ala Asn Gln Lys Asp His Tyr Phe Glu Ser 50 55 60gtg ctt gaa cag cat ctc gtc acg ttt ctg cag aaa tgg aag gga gta 240Val Leu Glu Gln His Leu Val Thr Phe Leu Gln Lys Trp Lys Gly Val65 70 75 80cga ttt atc cac cag tac aag gag gag ctg gag acg gcg tcc aag ttt 288Arg Phe Ile His Gln Tyr Lys Glu Glu Leu Glu Thr Ala Ser Lys Phe 85 90 95gca tat ctc ggt ttg tgt acg ctt gtg ggc tcc aag act ctc gga gaa 336Ala Tyr Leu Gly Leu Cys Thr Leu Val Gly Ser Lys Thr Leu Gly Glu 100 105 110gag tac acc aat ctc atg tac act atc aga gac cga aca gct cta ccg 384Glu Tyr Thr Asn Leu Met Tyr Thr Ile Arg Asp Arg Thr Ala Leu Pro 115 120 125ggg gtg gtg aga cgg ttt ggc tac gtg ctt tcc aac act ctg ttt cca 432Gly Val Val Arg Arg Phe Gly Tyr Val Leu Ser Asn Thr Leu Phe Pro 130 135 140tac ctg ttt gtg cgc tac atg ggc aag ttg cgc gcc aaa ctg atg cgc 480Tyr Leu Phe Val Arg Tyr Met Gly Lys Leu Arg Ala Lys Leu Met Arg145 150 155 160gag tat ccc cat ctg gtg gag tac gac gaa gat gag cct gtg ccc agc 528Glu Tyr Pro His Leu Val Glu Tyr Asp Glu Asp Glu Pro Val Pro Ser 165 170 175ccg gaa aca tgg aag gag cgg gtc atc aag acg ttt gtg aac aag ttt 576Pro Glu Thr Trp Lys Glu Arg Val Ile Lys Thr Phe Val Asn Lys Phe 180 185 190gac aag ttc acg gcg ctg gag ggg ttt acc gcg atc cac ttg gcg att 624Asp Lys Phe Thr Ala Leu Glu Gly Phe Thr Ala Ile His Leu Ala Ile 195 200 205ttc tac gtc tac ggc tcg tac tac cag ctc agt aag cgg atc tgg ggc 672Phe Tyr Val Tyr Gly Ser Tyr Tyr Gln Leu Ser 
Lys Arg Ile Trp Gly 210 215 220atg cgt tat gta ttt gga cac cga ctg gac aag aat gag cct cga atc 720Met Arg Tyr Val Phe Gly His Arg Leu Asp Lys Asn Glu Pro Arg Ile225 230 235 240ggt tac gag atg ctc ggt ctg ctg att ttc gcc cgg ttt gcc acg tca 768Gly Tyr Glu Met Leu Gly Leu Leu Ile Phe Ala Arg Phe Ala Thr Ser 245 250 255ttt gtg cag acg gga aga gag tac ctc gga gcg ctg ctg gaa aag agc 816Phe Val Gln Thr Gly Arg Glu Tyr Leu Gly Ala Leu Leu Glu Lys Ser 260 265 270gtg gag aaa gag gca ggg gag aag gaa gat gaa aag gaa gcg gtt gtg 864Val Glu Lys Glu Ala Gly Glu Lys Glu Asp Glu Lys Glu Ala Val Val 275 280 285ccg aaa aag aag tcg tca att ccg ttc att gag gat aca gaa ggg gag 912Pro Lys Lys Lys Ser Ser Ile Pro Phe Ile Glu Asp Thr Glu Gly Glu 290 295 300acg gaa gac aag atc gat ctg gag gac cct cga cag ctc aag ttc att 960Thr Glu Asp Lys Ile
Asp Leu Glu Asp Pro Arg Gln Leu Lys Phe Ile305 310 315 320cct gag gcg tcc aga gcg tgc act ctg tgt ctg tca tac att agt gcg 1008Pro Glu Ala Ser Arg Ala Cys Thr Leu Cys Leu Ser Tyr Ile Ser Ala 325 330 335ccg gca tgt acg cca tgt gga cac ttt ttc tgt tgg gac tgt att tcc 1056Pro Ala Cys Thr Pro Cys Gly His Phe Phe Cys Trp Asp Cys Ile Ser 340 345 350gaa tgg gtg aga gag aag ccc gag tgt ccc ttg tgt cgg cag ggt gtg 1104Glu Trp Val Arg Glu Lys Pro Glu Cys Pro Leu Cys Arg Gln Gly Val 355 360 365aga gag cag aac ttg ttg cct atc aga taa 1134Arg Glu Gln Asn Leu Leu Pro Ile Arg 370 37522377PRTYarrowia lipolytica 22Met Trp Gly Ser Ser His Ala Phe Ala Gly Glu Ser Asp Leu Thr Leu1 5 10 15Gln Leu His Thr Arg Ser Asn Met Ser Asp Asn Thr Thr Ile Lys Lys 20 25 30Pro Ile Arg Pro Lys Pro Ile Arg Thr Glu Arg Leu Pro Tyr Ala Gly 35 40 45Ala Ala Glu Ile Ile Arg Ala Asn Gln Lys Asp His Tyr Phe Glu Ser 50 55 60Val Leu Glu Gln His Leu Val Thr Phe Leu Gln Lys Trp Lys Gly Val65 70 75 80Arg Phe Ile His Gln Tyr Lys Glu Glu Leu Glu Thr Ala Ser Lys Phe 85 90 95Ala Tyr Leu Gly Leu Cys Thr Leu Val Gly Ser Lys Thr Leu Gly Glu 100 105 110Glu Tyr Thr Asn Leu Met Tyr Thr Ile Arg Asp Arg Thr Ala Leu Pro 115 120 125Gly Val Val Arg Arg Phe Gly Tyr Val Leu Ser Asn Thr Leu Phe Pro 130 135 140Tyr Leu Phe Val Arg Tyr Met Gly Lys Leu Arg Ala Lys Leu Met Arg145 150 155 160Glu Tyr Pro His Leu Val Glu Tyr Asp Glu Asp Glu Pro Val Pro Ser 165 170 175Pro Glu Thr Trp Lys Glu Arg Val Ile Lys Thr Phe Val Asn Lys Phe 180 185 190Asp Lys Phe Thr Ala Leu Glu Gly Phe Thr Ala Ile His Leu Ala Ile 195 200 205Phe Tyr Val Tyr Gly Ser Tyr Tyr Gln Leu Ser Lys Arg Ile Trp Gly 210 215 220Met Arg Tyr Val Phe Gly His Arg Leu Asp Lys Asn Glu Pro Arg Ile225 230 235 240Gly Tyr Glu Met Leu Gly Leu Leu Ile Phe Ala Arg Phe Ala Thr Ser 245 250 255Phe Val Gln Thr Gly Arg Glu Tyr Leu Gly Ala Leu Leu Glu Lys Ser 260 265 270Val Glu Lys Glu Ala Gly Glu Lys Glu Asp Glu Lys Glu Ala Val Val 275 280 285Pro Lys Lys Lys Ser Ser Ile Pro Phe Ile Glu Asp Thr Glu 
Gly Glu 290 295 300Thr Glu Asp Lys Ile Asp Leu Glu Asp Pro Arg Gln Leu Lys Phe Ile305 310 315 320Pro Glu Ala Ser Arg Ala Cys Thr Leu Cys Leu Ser Tyr Ile Ser Ala 325 330 335Pro Ala Cys Thr Pro Cys Gly His Phe Phe Cys Trp Asp Cys Ile Ser 340 345 350Glu Trp Val Arg Glu Lys Pro Glu Cys Pro Leu Cys Arg Gln Gly Val 355 360 365Arg Glu Gln Asn Leu Leu Pro Ile Arg 370 375231065DNAYarrowia lipolyticaCDS(1)..(1065)YlPEX10; GenBank Accession No. AJ012084, which corresponds to nucleotides 1107-2171 of GenBank Accession No. AB036770 23atg agc gac aat acg aca atc aaa aag ccg atc cga ccc aaa ccg atc 48Met Ser Asp Asn Thr Thr Ile Lys Lys Pro Ile Arg Pro Lys Pro Ile1 5 10 15cgg acg gaa cgc ctg cct tac gct ggg gcc gca gaa atc atc cga gcc 96Arg Thr Glu Arg Leu Pro Tyr Ala Gly Ala Ala Glu Ile Ile Arg Ala 20 25 30aac cag aaa gac cac tac ttt gag tcc gtg ctt gaa cag cat ctc gtc 144Asn Gln Lys Asp His Tyr Phe Glu Ser Val Leu Glu Gln His Leu Val 35 40 45acg ttt ctg cag aaa tgg aag gga gta cga ttt atc cac cag tac aag 192Thr Phe Leu Gln Lys Trp Lys Gly Val Arg Phe Ile His Gln Tyr Lys 50 55 60gag gag ctg gag acg gcg tcc aag ttt gca tat ctc ggt ttg tgt acg 240Glu Glu Leu Glu Thr Ala Ser Lys Phe Ala Tyr Leu Gly Leu Cys Thr65 70 75 80ctt gtg ggc tcc aag act ctc gga gaa gag tac acc aat ctc atg tac 288Leu Val Gly Ser Lys Thr Leu Gly Glu Glu Tyr Thr Asn Leu Met Tyr 85 90 95act atc aga gac cga aca gct cta ccg ggg gtg gtg aga cgg ttt ggc 336Thr Ile Arg Asp Arg Thr Ala Leu Pro Gly Val Val Arg Arg Phe Gly 100 105 110tac gtg ctt tcc aac act ctg ttt cca tac ctg ttt gtg cgc tac atg 384Tyr Val Leu Ser Asn Thr Leu Phe Pro Tyr Leu Phe Val Arg Tyr Met 115 120 125ggc aag ttg cgc gcc aaa ctg atg cgc gag tat ccc cat ctg gtg gag 432Gly Lys Leu Arg Ala Lys Leu Met Arg Glu Tyr Pro His Leu Val Glu 130 135 140tac gac gaa gat gag cct gtg ccc agc ccg gaa aca tgg aag gag cgg 480Tyr Asp Glu Asp Glu Pro Val Pro Ser Pro Glu Thr Trp Lys Glu Arg145 150 155 160gtc atc aag acg ttt gtg aac aag ttt gac aag ttc acg gcg ctg gag 
528Val Ile Lys Thr Phe Val Asn Lys Phe Asp Lys Phe Thr Ala Leu Glu 165 170 175ggg ttt acc gcg atc cac ttg gcg att ttc tac gtc tac ggc tcg tac 576Gly Phe Thr Ala Ile His Leu Ala Ile Phe Tyr Val Tyr Gly Ser Tyr 180 185 190tac cag ctc agt aag cgg atc tgg ggc atg cgt tat gta ttt gga cac 624Tyr Gln Leu Ser Lys Arg Ile Trp Gly Met Arg Tyr Val Phe Gly His 195 200 205cga ctg gac aag aat gag cct cga atc ggt tac gag atg ctc ggt ctg 672Arg Leu Asp Lys Asn Glu Pro Arg Ile Gly Tyr Glu Met Leu Gly Leu 210 215 220ctg att ttc gcc cgg ttt gcc acg tca ttt gtg cag acg gga aga gag 720Leu Ile Phe Ala Arg Phe Ala Thr Ser Phe Val Gln Thr Gly Arg Glu225 230 235 240tac ctc gga gcg ctg ctg gaa aag agc gtg gag aaa gag gca ggg gag 768Tyr Leu Gly Ala Leu Leu Glu Lys Ser Val Glu Lys Glu Ala Gly Glu 245 250 255aag gaa gat gaa aag gaa gcg gtt gtg ccg aaa aag aag tcg tca att 816Lys Glu Asp Glu Lys Glu Ala Val Val Pro Lys Lys Lys Ser Ser Ile 260 265 270ccg ttc att gag gat aca gaa ggg gag acg gaa gac aag atc gat ctg 864Pro Phe Ile Glu Asp Thr Glu Gly Glu Thr Glu Asp Lys Ile Asp Leu 275 280 285gag gac cct cga cag ctc aag ttc att cct gag gcg tcc aga gcg tgc 912Glu Asp Pro Arg Gln Leu Lys Phe Ile Pro Glu Ala Ser Arg Ala Cys 290 295 300act ctg tgt ctg tca tac att agt gcg ccg gca tgt acg cca tgt gga 960Thr Leu Cys Leu Ser Tyr Ile Ser Ala Pro Ala Cys Thr Pro Cys Gly305 310 315 320cac ttt ttc tgt tgg gac tgt att tcc gaa tgg gtg aga gag aag ccc 1008His Phe Phe Cys Trp Asp Cys Ile Ser Glu Trp Val Arg Glu Lys Pro 325 330 335gag tgt ccc ttg tgt cgg cag ggt gtg aga gag cag aac ttg ttg cct 1056Glu Cys Pro Leu Cys Arg Gln Gly Val Arg Glu Gln Asn Leu Leu Pro 340 345 350atc aga taa 1065Ile Arg24354PRTYarrowia lipolytica 24Met Ser Asp Asn Thr Thr Ile Lys Lys Pro Ile Arg Pro Lys Pro Ile1 5 10 15Arg Thr Glu Arg Leu Pro Tyr Ala Gly Ala Ala Glu Ile Ile Arg Ala 20 25 30Asn Gln Lys Asp His Tyr Phe Glu Ser Val Leu Glu Gln His Leu Val 35 40 45Thr Phe Leu Gln Lys Trp Lys Gly Val Arg Phe Ile His Gln Tyr Lys 50 55 60Glu Glu 
Leu Glu Thr Ala Ser Lys Phe Ala Tyr Leu Gly Leu Cys Thr65 70 75 80Leu Val Gly Ser Lys Thr Leu Gly Glu Glu Tyr Thr Asn Leu Met Tyr 85 90 95Thr Ile Arg Asp Arg Thr Ala Leu Pro Gly Val Val Arg Arg Phe Gly 100 105 110Tyr Val Leu Ser Asn Thr Leu Phe Pro Tyr Leu Phe Val Arg Tyr Met 115 120 125Gly Lys Leu Arg Ala Lys Leu Met Arg Glu Tyr Pro His Leu Val Glu 130 135 140Tyr Asp Glu Asp Glu Pro Val Pro Ser Pro Glu Thr Trp Lys Glu Arg145 150 155 160Val Ile Lys Thr Phe Val Asn Lys Phe Asp Lys Phe Thr Ala Leu Glu 165 170 175Gly Phe Thr Ala Ile His Leu Ala Ile Phe Tyr Val Tyr Gly Ser Tyr 180 185 190Tyr Gln Leu Ser Lys Arg Ile Trp Gly Met Arg Tyr Val Phe Gly His 195 200 205Arg Leu Asp Lys Asn Glu Pro Arg Ile Gly Tyr Glu Met Leu Gly Leu 210 215 220Leu Ile Phe Ala Arg Phe Ala Thr Ser Phe Val Gln Thr Gly Arg Glu225 230 235 240Tyr Leu Gly Ala Leu Leu Glu Lys Ser Val Glu Lys Glu Ala Gly Glu 245 250 255Lys Glu Asp Glu Lys Glu Ala Val Val Pro Lys Lys Lys Ser Ser Ile 260 265 270Pro Phe Ile Glu Asp Thr Glu Gly Glu Thr Glu Asp Lys Ile Asp Leu 275 280 285Glu Asp Pro Arg Gln Leu Lys Phe Ile Pro Glu Ala Ser Arg Ala Cys 290 295 300Thr Leu Cys Leu Ser Tyr Ile Ser Ala Pro Ala Cys Thr Pro Cys Gly305 310 315 320His Phe Phe Cys Trp Asp Cys Ile Ser Glu Trp Val Arg Glu Lys Pro 325 330 335Glu Cys Pro Leu Cys Arg Gln Gly Val Arg Glu Gln Asn Leu Leu Pro 340 345 350Ile Arg 2538PRTYarrowia lipolyticamisc_feature(2)..(3)Xaa can be any naturally occurring amino acid 25Cys Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys1 5 10 15Xaa His Xaa Xaa Cys Xaa Xaa Cys Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 20 25 30Xaa Xaa Cys Xaa Xaa Cys 3526345PRTYarrowia lipolytica 26Met Trp Gly Ser Ser His Ala Phe Ala Gly Glu Ser Asp Leu Thr Leu1 5 10 15Gln Leu His Thr Arg Ser Asn Met Ser Asp Asn Thr Thr Ile Lys Lys 20 25 30Pro Ile Arg Pro Lys Pro Ile Arg Thr Glu Arg Leu Pro Tyr Ala Gly 35 40 45Ala Ala Glu Ile Ile Arg Ala Asn Gln Lys Asp His Tyr Phe Glu Ser 50 55 60Val Leu Glu Gln His Leu Val Thr Phe Leu Gln Lys Trp Lys Gly 
Val65 70 75 80Arg Phe Ile His Gln Tyr Lys Glu Glu Leu Glu Thr Ala Ser Lys Phe 85 90 95Ala Tyr Leu Gly Leu Cys Thr Leu Val Gly Ser Lys Thr Leu Gly Glu 100 105 110Glu Tyr Thr Asn Leu Met Tyr Thr Ile Arg Asp Arg Thr Ala Leu Pro 115 120 125Gly Val Val Arg Arg Phe Gly Tyr Val Leu Ser Asn Thr Leu Phe Pro 130 135 140Tyr Leu Phe Val Arg Tyr Met Gly Lys Leu Arg Ala Lys Leu Met Arg145 150 155 160Glu Tyr Pro His Leu Val Glu Tyr Asp Glu Asp Glu Pro Val Pro Ser 165 170 175Pro Glu Thr Trp Lys Glu Arg Val Ile Lys Thr Phe Val Asn Lys Phe 180 185 190Asp Lys Phe Thr Ala Leu Glu Gly Phe Thr Ala Ile His Leu Ala Ile 195 200 205Phe Tyr Val Tyr Gly Ser Tyr Tyr Gln Leu Ser Lys Arg Ile Trp Gly 210 215 220Met Arg Tyr Val Phe Gly His Arg Leu Asp Lys Asn Glu Pro Arg Ile225 230 235 240Gly Tyr Glu Met Leu Gly Leu Leu Ile Phe Ala Arg Phe Ala Thr Ser 245 250 255Phe Val Gln Thr Gly Arg Glu Tyr Leu Gly Ala Leu Leu Glu Lys Ser 260 265 270Val Glu Lys Glu Ala Gly Glu Lys Glu Asp Glu Lys Glu Ala Val Val 275 280 285Pro Lys Lys Lys Ser Ser Ile Pro Phe Ile Glu Asp Thr Glu Gly Glu 290 295 300Thr Glu Asp Lys Ile Asp Leu Glu Asp Pro Arg Gln Leu Lys Phe Ile305 310 315 320Pro Glu Ala Ser Arg Ala Cys Thr Leu Cys Leu Ser Tyr Ile Ser Ala 325 330 335Pro Ala Cys Thr Pro Cys Gly His Phe 340 345272987DNAYarrowia lipolyticamisc_featuremutant acetohydroxyacid synthase (AHAS) with W497L mutation 27ttccctagtc ccagtgtaca cccgccgata tcgcttaccc tgcagccgga ttaaggttgg 60caatttttca cgtccttgtc tccgcaatta ctcaccgggt ggtttataag attgcaagcg 120tcttgatttg tctctgtata ctaacatgca atcgcgactc gcccgacggg ccactaacct 180ggccagaatc tccagatcca agtattctct tggtctgcga tatgtttcca acacaaaagc 240ccctgctgcc cagccggcaa ctgctgagtg agtattcctt gccataaacg acccagaacc 300actgtatagt gtttggaagc actagtcaga agaccagcga aaacaggtgg aaaaaactga 360gacgaaaagc aacgaccaga aatgtaatgt gtggaaaagc gacacacaca gagcagataa 420agaggtgaca aataacgaca aatgaaatat cagtatcttc ccacaatcac tacctctcag 480ctgtctgaag gtgcggctga tatatccatc ccacgtctaa cgtatggagt gtgatagaat 540atgacgacac 
aagcatgaga actcgctctc tatccaacca ccgaaacact gtcactacag 600ccgttcttgt tgctccattc gcttttgtga ttccatgcct tctctggtga ctgacaacat 660tccttccttt tctccagccc tgttgttatc tgctcatgac ctacggccac tctctatcgc 720atactaacat agacgatccc agcccgctcc ccacttccag ggcaccgttg gcaagcctcc 780tatcctcaag aaggctgagg ctgccaacgc tgacatggac gagtccttca tcggaatgtc 840tggaggagag atcttccacg agatgatgct gcgacacaac gtcgacactg tcttcggtta 900ccccggtgga gccattctcc ccgtctttga cgccattcac aactctgagt acttcaactt 960tgtgctccct cgacacgagc agggtgccgg ccacatggcc gagggctacg ctcgagcctc 1020tggtaagccc ggtgtcgttc tcgtcacctc tggccccggt gccaccaacg tcatcacccc 1080catgcaggac gctctttccg atggtacccc catggttgtc ttcaccggtc aggtcctgac 1140ctccgttatc ggcactgacg ccttccagga ggccgatgtt gtcggcatct cccgatcttg 1200caccaagtgg aacgtcatgg tcaagaacgt tgctgagctc ccccgacgaa tcaacgaggc 1260ctttgagatt gctacttccg gccgacccgg tcccgttctc gtcgatctgc ccaaggatgt 1320tactgctgcc atcctgcgag agcccatccc caccaagtcc accattccct cgcattctct 1380gaccaacctc acctctgccg ccgccaccga gttccagaag caggctatcc agcgagccgc 1440caacctcatc aaccagtcca agaagcccgt cctttacgtc ggacagggta tccttggctc 1500cgaggagggt cctaagctgc ttaaggagct ggctgagaag gccgagattc ccgtcaccac 1560tactctgcag ggtcttggtg cctttgacga gcgagacccc aagtctctgc acatgctcgg 1620tatgcacggt tccggctacg ccaacatggc catgcagaac gctgactgta tcattgctct 1680cggcgcccga tttgatgacc gagttaccgg ctccatcccc aagtttgccc ccgaggctcg 1740agccgctgcc cttgagggtc gaggtggtat tgttcacttt gagatccagg ccaagaacat 1800caacaaggtt gttcaggcca ccgaagccgt tgagggagac gttaccgagt ctgtccgaca 1860gctcatcccc ctcatcaaca aggtctctgc cgctgagcga gctccctgga ctgagactat 1920ccagtcctgg aagcagcagt tccccttcct cttcgaggct gaaggtgagg atggtgttat 1980caagccccag tccgtcattg ctctgctctc tgacctgaca gagaacaaca aggacaagac 2040catcatcacc accggtgttg gtcagcatca gatgtggact gcccagcatt tccgatggcg 2100acaccctcga accatgatca cttctggtgg tcttggaact atgggttacg gcctgcccgc 2160cgctatcggc gccaaggttg cccgacctga ctgcgacgtc attgacatcg atggtgacgc 2220ttctttcaac atgactctga ccgagctgtc caccgccgtt cagttcaaca 
ttggcgtcaa 2280ggctattgtc ctcaacaacg aggaacaggg tatggtcacc cagctgcagt ctctcttcta 2340cgagaaccga tactgccaca ctcatcagaa gaaccccgac ttcatgaagc tggccgagtc 2400catgggcatg aagggtatcc gaatcactca cattgaccag ctggaggccg gtctcaagga 2460gatgctcgca tacaagggcc ctgtgctcgt tgaggttgtt gtcgacaaga agatccccgt 2520tcttcccatg gttcccgctg gtaaggcttt gcatgagttc cttgtctacg acgctgacgc 2580cgaggctgct tctcgacccg atcgactgaa gaatgccccc gcccctcacg tccaccagac 2640cacctttgag aactaagtgg aaaggaacac aagcaatccg aaccaaaaat aattggggtc 2700ccgtgcccac agagtctagt gcagacctaa aatgaccaca gtaaattata gctgttatta 2760aacatgagat tttgaccaac aagagcgtag gaatgttatt agctactact tgtacataca 2820cagcatttgt tttaaataat gttgcctcca ggggcagtga gatcaggacc cagatccgtg 2880gccagctctc tgacttcaga ccgcttgtac ttaagcagct cgcaacactg ttgtcgagga 2940ttgaacttgc catattcgat tttgtggtca tgaatccagc acacctc 29872813066DNAArtificial SequencePlasmid pZP3-Pa777U 28tctcggtcta ttcttttgat ttataaggga ttttgccgat ttcggcctat tggttaaaaa 60atgagctgat ttaacaaaaa tttaacgcga attttaacaa aatattaacg cttacaattt 120cctgatgcgg tattttctcc ttacgcatct gtgcggtatt tcacaccgca tcaggtggca 180cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata cattcaaata 240tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga aaaaggaaga 300gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca ttttgccttc 360ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat cagttgggtg 420cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag agttttcgcc 480ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc gcggtattat 540cccgtattga cgccgggcaa gagcaactcg gtcgccgcat acactattct cagaatgact 600tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca gtaagagaat
660tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt ctgacaacga 720tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat gtaactcgcc 780ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt gacaccacga 840tgcctgtagc aatggcaaca acgttgcgca aactattaac tggcgaacta cttactctag 900cttcccggca acaattaata gactggatgg aggcggataa agttgcagga ccacttctgc 960gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt gagcgtgggt 1020ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc gtagttatct 1080acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct gagataggtg 1140cctcactgat taagcattgg taactgtcag accaagttta ctcatatata ctttagattg 1200atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 1260tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc gtagaaaaga 1320tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa 1380aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact ctttttccga 1440aggtaactgg cttcagcaga gcgcagatac caaatactgt tcttctagtg tagccgtagt 1500taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg ctaatcctgt 1560taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac tcaagacgat 1620agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca cagcccagct 1680tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga gaaagcgcca 1740cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc ggaacaggag 1800agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct gtcgggtttc 1860gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg agcctatgga 1920aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct tttgctcaca 1980tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc tttgagtgag 2040ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg 2100aagagcgccc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat taatgcagct 2160ggcgcgccac caatcacaat tctgaaaagc acatcttgat ctcctcattg cggggagtcc 2220aacggtggtc ttattccccc gaatttcccg ctcaatctcg ttccagaccg acccggacac 2280agtgcttaac gccgttccga aactctaccg cagatatgct ccaacggact gggctgcata 2340gatgtgatcc tcggcttgga gaaatggata 
aaagccggcc aaaaaaaaag cggaaaaaag 2400cggaaaaaaa gagaaaaaaa atcgcaaaat ttgaaaaata gggggaaaag acgcaaaaac 2460gcaaggaggg gggagtatat gacactgata agcaagctca caacggttcc tcttattttt 2520ttcctcatct tctgcctagg ttcccaaaat cccagatgct tctctccagt gccaaaagta 2580agtaccccac aggttttcgg ccgaaaattc cacgtgcagc aacgtcgtgt ggggtgttaa 2640aatgtggggg gggggaacca ggacaagagg ctcttgtggg agccgaatga gagcacaaag 2700cgggcgggtg tgataagggc atttttgccc attttccctt ctcctgtctc tccgacggtg 2760atggcgttgt gcgtcctcta tttcttttta tttctttttg ttttatttct ctgactaccg 2820atttggtttg atttcctcaa ccccacacaa ataagctcgg gccgaggaat atatatatac 2880acggacacag tcgccctgtg gacaacacgt cactacctct acgatacaca ccgtacgttg 2940tgtggaagct tgtgagcgga taacaatttc acacaggaaa cagctatgac catgattacg 3000ccaagctcga aattaaccct cactaaaggg aacaaaagct ggagctccac cgcggacaca 3060atatctggtc aaatttcagt ttcgttacat ttaaattcct tcacttcaag ttcattcttc 3120atctgcttct gttttacttt gacaggcaaa tgaagacatg gtacgacttg atggaggcca 3180agaacgccat ttcaccccga gacaccgaag tgcctgaaat cctggctgcc cccattgata 3240acatcggaaa ctacggtatt ccggaaagtg tatatagaac ctttccccag cttgtgtctg 3300tggatatgga tggtgtaatc ccctttgagt actcgtcttg gcttctctcc gagcagtatg 3360aggctctcta atctagcgca tttaatatct caatgtattt atatatttat cttctcatgc 3420ggccgcttag ttggctttgg tcttggcagc cttggcctcc ttgagggtaa acatcttggc 3480atccttgtcg accacgccgt acttggcgta cataagacca attcggatga aggtgggaat 3540gatgggagaa gccgactttc gcaccagttc gggaaaggcc tgagcgaagg cagcagtggc 3600ctcgttgagc ttgtagtgag gaatgatggg aaacagatgg tggatctgat gtgtaccaat 3660gttgtgggac aggttgtcga tgagggctcc gtagcttcgg tccacagagg acaagttgcc 3720cttgacatag gtccactccg aatcggcgta ccagggagtt tcctcgtcgt tgtgatggag 3780gaaggtagtg acaaccagca tggtggcgaa tccaaagaga ggtgcgaagt aatacagagc 3840catggtcttg aggccgtaga cgtaggtaag gtaggcgtac agaccagcaa aggccacgag 3900agagccgagg gaaatgatga cggcagacat tcttcgcagg tagagaggct cccagggatt 3960gaagtggttg acctttcggg gaggaaatcc agcaacgagg taggcaaacc aagccgaacc 4020aagggagatg accatgtgtc gggacagggg atgagagtcg gcttctcgct gagggtagaa 
4080gatctcatcc ttgtcgatgt tgccggtgtt cttgtgatgg tgtcgatggc tgatcttcca 4140cgactcgtag ggagtcagaa tgatggagtg aatgagtgtg ccaacagaga agttgagcag 4200gtgggatcgc gagaaggcac catgtccaca gtcgtgaccg atggtaaaga atccccagaa 4260cacgataccc tggagcagaa tgtagccagt gcaaaggacg gcatcgagca gtgcaaactc 4320ctgcacgata gcaagggctc gagcatagta cagtccgaga gcaagggaac cggcaatgcc 4380cagagctcgc acggtatagt agagggacca gggaacagag gcttcgaagc agtgggcagg 4440cagggatcgc ttgatctcgg tgagagtagg gaactcgtag ggagcggcaa cggtagagga 4500agccatggtt gtgaattagg gtggtgagaa tggttggttg tagggaagaa tcaaaggccg 4560gtctcgggat ccgtgggtat atatatatat atatatatat acgatccttc gttacctccc 4620tgttctcaaa actgtggttt ttcgtttttc gttttttgct ttttttgatt tttttagggc 4680caactaagct tccagatttc gctaatcacc tttgtactaa ttacaagaaa ggaagaagct 4740gattagagtt gggcttttta tgcaactgtg ctactcctta tctctgatat gaaagtgtag 4800acccaatcac atcatgtcat ttagagttgg taatactggg aggatagata aggcacgaaa 4860acgagccata gcagacatgc tgggtgtagc caagcagaag aaagtagatg ggagccaatt 4920gacgagcgag ggagctacgc caatccgaca tacgacacgc tgagatcgtc ttggccgggg 4980ggtacctaca gatgtccaag ggtaagtgct tgactgtaat tgtatgtctg aggacaaata 5040tgtagtcagc cgtataaagt cataccaggc accagtgcca tcatcgaacc actaactctc 5100tatgatacat gcctccggta ttattgtacc atgcgtcgct ttgttacata cgtatcttgc 5160ctttttctct cagaaactcc agactttggc tattggtcga gataagcccg gaccatagtg 5220agtctttcac actctacatt tctcccttgc tccaactatc gattgttgtc tactaactat 5280cgtacgataa cttcgtatag catacattat acgaagttat cgcgtcgacg agtatctgtc 5340tgactcgtca ttgccgcctt tggagtacga ctccaactat gagtgtgctt ggatcacttt 5400gacgatacat tcttcgttgg aggctgtggg tctgacagct gcgttttcgg cgcggttggc 5460cgacaacaat atcagctgca acgtcattgc tggctttcat catgatcaca tttttgtcgg 5520caaaggcgac gcccagagag ccattgacgt tctttctaat ttggaccgat agccgtatag 5580tccagtctat ctataagttc aactaactcg taactattac cataacatat acttcactgc 5640cccagataag gttccgataa aaagttctgc agactaaatt tatttcagtc tcctcttcac 5700caccaaaatg ccctcctacg aagctcgagc taacgtccac aagtccgcct ttgccgctcg 5760agtgctcaag ctcgtggcag ccaagaaaac 
caacctgtgt gcttctctgg atgttaccac 5820caccaaggag ctcattgagc ttgccgataa ggtcggacct tatgtgtgca tgatcaaaac 5880ccatatcgac atcattgacg acttcaccta cgccggcact gtgctccccc tcaaggaact 5940tgctcttaag cacggtttct tcctgttcga ggacagaaag ttcgcagata ttggcaacac 6000tgtcaagcac cagtaccggt gtcaccgaat cgccgagtgg tccgatatca ccaacgccca 6060cggtgtaccc ggaaccggaa tcattgctgg cctgcgagct ggtgccgagg aaactgtctc 6120tgaacagaag aaggaggacg tctctgacta cgagaactcc cagtacaagg agttcctagt 6180cccctctccc aacgagaagc tggccagagg tctgctcatg ctggccgagc tgtcttgcaa 6240gggctctctg gccactggcg agtactccaa gcagaccatt gagcttgccc gatccgaccc 6300cgagtttgtg gttggcttca ttgcccagaa ccgacctaag ggcgactctg aggactggct 6360tattctgacc cccggggtgg gtcttgacga caagggagac gctctcggac agcagtaccg 6420aactgttgag gatgtcatgt ctaccggaac ggatatcata attgtcggcc gaggtctgta 6480cggccagaac cgagatccta ttgaggaggc caagcgatac cagaaggctg gctgggaggc 6540ttaccagaag attaactgtt agaggttaga ctatggatat gtaatttaac tgtgtatata 6600gagagcgtgc aagtatggag cgcttgttca gcttgtatga tggtcagacg acctgtctga 6660tcgagtatgt atgatactgc acaacctgtg tatccgcatg atctgtccaa tggggcatgt 6720tgttgtgttt ctcgatacgg agatgctggg tacagtgcta atacgttgaa ctacttatac 6780ttatatgagg ctcgaagaaa gctgacttgt gtatgactta ttctcaacta catccccagt 6840cacaatacca ccactgcact accactacac caaaaccatg atcaaaccac ccatggactt 6900cctggaggca gaagaacttg ttatggaaaa gctcaagaga gagatcataa cttcgtatag 6960catacattat acgaagttat cctgcaggta aaggaattca tgctgttcat cgtggttaat 7020gctgctgtgt gctgtgtgtg tgtgttgttt ggcgctcatt gttgcgttat gcagcgtaca 7080ccacaatatt ggaagcttat tagcctttct attttttcgt ttgcaaggct taacaacatt 7140gctgtggaga gggatgggga tatggaggcc gctggaggga gtcggagagg cgttttggag 7200cggcttggcc tggcgcccag ctcgcgaaac gcacctagga ccctttggca cgccgaaatg 7260tgccactttt cagtctagta acgccttacc tacgtcattc catgcgtgca tgtttgcgcc 7320ttttttccct tgcccttgat cgccacacag tacagtgcac tgtacagtgg aggttttggg 7380ggggtcttag atgggagcta aaagcggcct agcggtacac tagtgggatt gtatggagtg 7440gcatggagcc taggtggagc ctgacaggac gcacgaccgg ctagcccgtg acagacgatg 
7500ggtggctcct gttgtccacc gcgtacaaat gtttgggcca aagtcttgtc agccttgctt 7560gcgaacctaa ttcccaattt tgtcacttcg cacccccatt gatcgagccc taacccctgc 7620ccatcaggca atccaattaa gctcgcattg tctgccttgt ttagtttggc tcctgcccgt 7680ttcggcgtcc acttgcacaa acacaaacaa gcattatata taaggctcgt ctctccctcc 7740caaccacact cacttttttg cccgtcttcc cttgctaaca caaaagtcaa gaacacaaac 7800aaccacccca acccccttac acacaagaca tatctacagc aatggccatg gcttcttcca 7860ctgttgctgc gccgtacgag ttcccgacgc tgacggagat caagcgctcg ctgccagcgc 7920actgctttga ggcctcggtc ccgtggtcgc tctactacac cgtgcgcgcg ctgggcatcg 7980ccggctcgct cgcgctcggc ctctactacg cgcgcgcgct cgcgatcgtg caggagtttg 8040ccctgctgga tgcggtgctc tgcacggggt acattctgct gcagggcatc gtattctggg 8100ggttcttcac catcggccat gactgcggcc acggcgcgtt ctcgcgttcg cacctgctca 8160acttcagcgt cggcacgctc attcactcga tcatcctcac gccgtacgag tcatggaaga 8220tctcgcaccg ccaccaccac aagaacacgg gcaacatcga caaggacgag attttctacc 8280cgcagcgcga ggccgactcg cacccactgt cccgacacat ggtgatctcg ctcggctcgg 8340cctggttcgc gtacctcgtt gcgggcttcc ctcctcgcaa ggtgaaccac ttcaaccctt 8400gggaaccgtt gtacctgcgc cgcatgtctg ccgtcatcat ctcactcggc tcgctcgtgg 8460cgttcgcggg cttgtatgcg tatctcacct acgtctatgg ccttaagacc atggcgctgt 8520actacttcgc ccctctcttt gggttcgcca cgatgctcgt ggtcactacc tttttgcacc 8580acaatgacga ggaaacgcca tggtacgccg actcggagtg gacgtacgtc aagggcaacc 8640tctcgtccgt ggaccgctcg tacggcgcgc tcatcgacaa cctgagccac aacatcggca 8700cgcaccagat ccaccacctg tttccgatca tcccgcacta caagctgaac gaggcgacgg 8760cagcgttcgc gcaggcgttc ccggagctcg tgcgcaagag cgcgtcgccg atcatcccga 8820cgttcatccg catcgggctc atgtacgcca agtacggcgt cgtggacaag gacgccaaga 8880tgtttacgct caaggaggcc aaggccgcca agaccaaggc caactaggcg gccgcattga 8940tgattggaaa cacacacatg ggttatatct aggtgagagt tagttggaca gttatatatt 9000aaatcagcta tgccaacggt aacttcattc atgtcaacga ggaaccagtg actgcaagta 9060atatagaatt tgaccacctt gccattctct tgcactcctt tactatatct catttatttc 9120ttatatacaa atcacttctt cttcccagca tcgagctcgg aaacctcatg agcaataaca 9180tcgtggatct cgtcaataga gggctttttg 
gactccttgc tgttggccac cttgtccttg 9240ctgtttaaac agtgtacgca gatctactat agaggaacat ttaaattgcc ccggagaaga 9300cggccaggcc gcctagatga caaattcaac aactcacagc tgactttctg ccattgccac 9360tagggggggg cctttttata tggccaagcc aagctctcca cgtcggttgg gctgcaccca 9420acaataaatg ggtagggttg caccaacaaa gggatgggat ggggggtaga agatacgagg 9480ataacggggc tcaatggcac aaataagaac gaatactgcc attaagactc gtgatccagc 9540gactgacacc attgcatcat ctaagggcct caaaactacc tcggaactgc tgcgctgatc 9600tggacaccac agaggttccg agcactttag gttgcaccaa atgtcccacc aggtgcaggc 9660agaaaacgct ggaacagcgt gtacagtttg tcttaacaaa aagtgagggc gctgaggtcg 9720agcagggtgg tgtgacttgt tatagccttt agagctgcga aagcgcgtat ggatttggct 9780catcaggcca gattgagggt ctgtggacac atgtcatgtt agtgtacttc aatcgccccc 9840tggatatagc cccgacaata ggccgtggcc tcattttttt gccttccgca catttccatt 9900gctcggtacc cacaccttgc ttctcctgca cttgccaacc ttaatactgg tttacattga 9960ccaacatctt acaagcgggg ggcttgtcta gggtatatat aaacagtggc tctcccaatc 10020ggttgccagt ctcttttttc ctttctttcc ccacagattc gaaatctaaa ctacacatca 10080cagaattccg agccgtgagt atccacgaca agatcagtgt cgagacgacg cgttttgtgt 10140aatgacacaa tccgaaagtc gctagcaaca cacactctct acacaaacta acccagctct 10200ggtaccatgg cttcttccac tgttgctgcg ccgtacgagt tcccgacgct gacggagatc 10260aagcgctcgc tgccagcgca ctgctttgag gcctcggtcc cgtggtcgct ctactacacc 10320gtgcgcgcgc tgggcatcgc cggctcgctc gcgctcggcc tctactacgc gcgcgcgctc 10380gcgatcgtgc aggagtttgc cctgctggat gcggtgctct gcacggggta cattctgctg 10440cagggcatcg tattctgggg gttcttcacc atcggccatg actgcggcca cggcgcgttc 10500tcgcgttcgc acctgctcaa cttcagcgtc ggcacgctca ttcactcgat catcctcacg 10560ccgtacgagt catggaagat ctcgcaccgc caccaccaca agaacacggg caacatcgac 10620aaggacgaga ttttctaccc gcagcgcgag gccgactcgc acccactgtc ccgacacatg 10680gtgatctcgc tcggctcggc ctggttcgcg tacctcgttg cgggcttccc tcctcgcaag 10740gtgaaccact tcaacccttg ggaaccgttg tacctgcgcc gcatgtctgc cgtcatcatc 10800tcactcggct cgctcgtggc gttcgcgggc ttgtatgcgt atctcaccta cgtctatggc 10860cttaagacca tggcgctgta ctacttcgcc cctctctttg ggttcgccac 
gatgctcgtg 10920gtcactacct ttttgcacca caatgacgag gaaacgccat ggtacgccga ctcggagtgg 10980acgtacgtca agggcaacct ctcgtccgtg gaccgctcgt acggcgcgct catcgacaac 11040ctgagccaca acatcggcac gcaccagatc caccacctgt ttccgatcat cccgcactac 11100aagctgaacg aggcgacggc agcgttcgcg caggcgttcc cggagctcgt gcgcaagagc 11160gcgtcgccga tcatcccgac gttcatccgc atcgggctca tgtacgccaa gtacggcgtc 11220gtggacaagg acgccaagat gtttacgctc aaggaggcca aggccgccaa gaccaaggcc 11280aactaggcgg ccgcatggag cgtgtgttct gagtcgatgt tttctatgga gttgtgagtg 11340ttagtagaca tgatgggttt atatatgatg aatgaataga tgtgattttg atttgcacga 11400tggaattgag aactttgtaa acgtacatgg gaatgtatga atgtgggggt tttgtgactg 11460gataactgac ggtcagtgga cgccgttgtt caaatatcca agagatgcga gaaactttgg 11520gtcaagtgaa catgtcctct ctgttcaagt aaaccatcaa ctatgggtag tatatttagt 11580aaggacaaga gttgagattc tttggagtcc tagaaacgta ttttcgcgtt ccaagatcaa 11640attagtagag taatacgggc acgggaatcc attcatagtc tcaatcctgc aggtgagtta 11700attaagatga cgacatttgc gagctggacg aggaatagat ggagcgtgtg ttctgagtcg 11760atgttttcta tggagttgtg agtgttagta gacatgatgg gtttatatat gatgaatgaa 11820tagatgtgat tttgatttgc acgatggaat tgagaacttt gtaaacgtac atgggaatgt 11880atgaatgtgg gggttttgtg actggataac tgacggtcag tggacgccgt tgttcaaata 11940tccaagagat gcgagaaact ttgggtcaag tgaacatgtc ctctctgttc aagtaaacca 12000tcaactatgg gtagtatatt tagtaaggac aagagttgag attctttgga gtcctagaaa 12060cgtattttcg cgttccaaga tcaaattagt agagtaatac gggcacggga atccattcat 12120agtctcaatt ttcccatagg tgtgctacaa ggtgttgaga tgtggtacag taccaccatg 12180attcgaggta aagagcccag aagtcattga tgaggtcaag aaatacacag atctacagct 12240caatacaatg aatatcttct ttcatattct tcaggtgaca ccaagggtgt ctattttccc 12300cagaaatgcg tgaaaaggcg cgtgtgtagc gtggagtatg ggttcggttg gcgtatcctt 12360catatatcga cgaaatagta gggcaagaga tgacaaaaag tatctatatg tagacagcgt 12420agaatatgga tttgattggt ataaattcat ttattgcgtg tctcacaaat actctcgata 12480agttggggtt aaactggaga tggaacaatg tcgatatctc gacgcatgcg acgtcgggcc 12540caattcgccc tatagtgagt cgtattacaa ttcactggcc gtcgttttac aacgtcgtga 
12600ctgggaaaac cctggcgtta cccaacttaa tcgccttgca gcacatcccc ctttcgccag 12660ctggcgtaat agcgaagagg cccgcaccga tcgcccttcc caacagttgc gcagcctgaa 12720tggcgaatgg acgcgccctg tagcggcgca ttaagcgcgg cgggtgtggt ggttacgcgc 12780agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc ctttcgcttt cttcccttcc 12840tttctcgcca cgttcgccgg ctttccccgt caagctctaa atcgggggct ccctttaggg 12900ttccgattta gtgctttacg gcacctcgac cccaaaaaac ttgattaggg tgatggttca 12960cgtagtgggc catcgccctg atagacggtt tttcgccctt tgacgttgga gtccacgttc 13020tttaatagtg gactcttgtt ccaaactgga acaacactca acccta 13066299570DNAArtificial SequencePlasmid pY117 29ggccgccacc gcggcccgag attccggcct cttcggccgc caagcgaccc gggtggacgt 60ctagaggtac ctagcaatta acagatagtt tgccggtgat aattctctta acctcccaca 120ctcctttgac ataacgattt atgtaacgaa actgaaattt gaccagatat tgtgtccgcg 180gtggagctcc agcttttgtt ccctttagtg agggtttaaa cgagcttggc gtaatcatgg 240tcatagctgt ttcctgtgtg aaattgttat ccgctcacaa ttccacacaa cgtacgagcc 300ggaagcataa agtgtaaagc ctggggtgcc taatgagtga gctaactcac attaattgcg 360ttgcgctcac tgcccgcttt ccagtcggga aacctgtcgt gccagctgca ttaatgaatc 420ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc ctcgctcact 480gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc aaaggcggta 540atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc aaaaggccag 600caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag gctccgcccc 660cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc gacaggacta 720taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt tccgaccctg 780ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct ttctcatagc 840tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg ctgtgtgcac 900gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct tgagtccaac 960ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat tagcagagcg 1020aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg ctacactaga 1080aggacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa aagagttggt 1140agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt ttgcaagcag 1200cagattacgc 
gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc tacggggtct 1260gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt atcaaaaagg 1320atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta aagtatatat 1380gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat ctcagcgatc 1440tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac tacgatacgg 1500gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg ctcaccggct 1560ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag tggtcctgca 1620actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt aagtagttcg 1680ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt gtcacgctcg 1740tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt tacatgatcc 1800cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt cagaagtaag 1860ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct tactgtcatg 1920ccatccgtaa gatgcttttc tgtgactggt gagtactcaa ccaagtcatt ctgagaatag 1980tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac cgcgccacat 2040agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa actctcaagg 2100atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca 2160gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca 2220aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat 2280tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag 2340aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgcgccc 2400tgtagcggcg cattaagcgc ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt 2460gccagcgccc tagcgcccgc tcctttcgct ttcttccctt cctttctcgc cacgttcgcc 2520ggctttcccc gtcaagctct aaatcggggg ctccctttag ggttccgatt tagtgcttta 2580cggcacctcg accccaaaaa
acttgattag ggtgatggtt cacgtagtgg gccatcgccc 2640tgatagacgg tttttcgccc tttgacgttg gagtccacgt tctttaatag tggactcttg 2700ttccaaactg gaacaacact caaccctatc tcggtctatt cttttgattt ataagggatt 2760ttgccgattt cggcctattg gttaaaaaat gagctgattt aacaaaaatt taacgcgaat 2820tttaacaaaa tattaacgct tacaatttcc attcgccatt caggctgcgc aactgttggg 2880aagggcgatc ggtgcgggcc tcttcgctat tacgccagct ggcgaaaggg ggatgtgctg 2940caaggcgatt aagttgggta acgccagggt tttcccagtc acgacgttgt aaaacgacgg 3000ccagtgaatt gtaatacgac tcactatagg gcgaattggg taccgggccc cccctcgagg 3060tcgatggtgt cgataagctt gatatcgaat tcatgtcaca caaaccgatc ttcgcctcaa 3120ggaaacctaa ttctacatcc gagagactgc cgagatccag tctacactga ttaattttcg 3180ggccaataat ttaaaaaaat cgtgttatat aatattatat gtattatata tatacatcat 3240gatgatactg acagtcatgt cccattgcta aatagacaga ctccatctgc cgcctccaac 3300tgatgttctc aatatttaag gggtcatctc gcattgttta ataataaaca gactccatct 3360accgcctcca aatgatgttc tcaaaatata ttgtatgaac ttatttttat tacttagtat 3420tattagacaa cttacttgct ttatgaaaaa cacttcctat ttaggaaaca atttataatg 3480gcagttcgtt catttaacaa tttatgtaga ataaatgtta taaatgcgta tgggaaatct 3540taaatatgga tagcataaat gatatctgca ttgcctaatt cgaaatcaac agcaacgaaa 3600aaaatccctt gtacaacata aatagtcatc gagaaatatc aactatcaaa gaacagctat 3660tcacacgtta ctattgagat tattattgga cgagaatcac acactcaact gtctttctct 3720cttctagaaa tacaggtaca agtatgtact attctcattg ttcatacttc tagtcatttc 3780atcccacata ttccttggat ttctctccaa tgaatgacat tctatcttgc aaattcaaca 3840attataataa gatataccaa agtagcggta tagtggcaat caaaaagctt ctctggtgtg 3900cttctcgtat ttatttttat tctaatgatc cattaaaggt atatatttat ttcttgttat 3960ataatccttt tgtttattac atgggctgga tacataaagg tattttgatt taattttttg 4020cttaaattca atcccccctc gttcagtgtc aactgtaatg gtaggaaatt accatacttt 4080tgaagaagca aaaaaaatga aagaaaaaaa aaatcgtatt tccaggttag acgttccgca 4140gaatctagaa tgcggtatgc ggtacattgt tcttcgaacg taaaagttgc gctccctgag 4200atattgtaca tttttgcttt tacaagtaca agtacatcgt acaactatgt actactgttg 4260atgcatccac aacagtttgt tttgtttttt tttgtttttt ttttttctaa 
tgattcatta 4320ccgctatgta tacctacttg tacttgtagt aagccgggtt attggcgttc aattaatcat 4380agacttatga atctgcacgg tgtgcgctgc gagttacttt tagcttatgc atgctacttg 4440ggtgtaatat tgggatctgt tcggaaatca acggatgctc aaccgatttc gacagtaatt 4500aattaattcc ctagtcccag tgtacacccg ccgatatcgc ttaccctgca gccggattaa 4560ggttggcaat ttttcacgtc cttgtctccg caattactca ccgggtggtt tataagattg 4620caagcgtctt gatttgtctc tgtatactaa catgcaatcg cgactcgccc gacgggccac 4680taacctggcc agaatctcca gatccaagta ttctcttggt ctgcgatatg tttccaacac 4740aaaagcccct gctgcccagc cggcaactgc tgagtgagta ttccttgcca taaacgaccc 4800agaaccactg tatagtgttt ggaagcacta gtcagaagac cagcgaaaac aggtggaaaa 4860aactgagacg aaaagcaacg accagaaatg taatgtgtgg aaaagcgaca cacacagagc 4920agataaagag gtgacaaata acgacaaatg aaatatcagt atcttcccac aatcactacc 4980tctcagctgt ctgaaggtgc ggctgatata tccatcccac gtctaacgta tggagtgtga 5040tagaatatga cgacacaagc atgagaactc gctctctatc caaccaccga aacactgtca 5100ctacagccgt tcttgttgct ccattcgctt ttgtgattcc atgccttctc tggtgactga 5160caacattcct tccttttctc cagccctgtt gttatctgct catgacctac ggccactctc 5220tatcgcatac taacatagac gatcccagcc cgctccccac ttccagggca ccgttggcaa 5280gcctcctatc ctcaagaagg ctgaggctgc caacgctgac atggacgagt ccttcatcgg 5340aatgtctgga ggagagatct tccacgagat gatgctgcga cacaacgtcg acactgtctt 5400cggttacccc ggtggagcca ttctccccgt ctttgacgcc attcacaact ctgagtactt 5460caactttgtg ctccctcgac acgagcaggg tgccggccac atggccgagg gctacgctcg 5520agcctctggt aagcccggtg tcgttctcgt cacctctggc cccggtgcca ccaacgtcat 5580cacccccatg caggacgctc tttccgatgg tacccccatg gttgtcttca ccggtcaggt 5640cctgacctcc gttatcggca ctgacgcctt ccaggaggcc gatgttgtcg gcatctcccg 5700atcttgcacc aagtggaacg tcatggtcaa gaacgttgct gagctccccc gacgaatcaa 5760cgaggccttt gagattgcta cttccggccg acccggtccc gttctcgtcg atctgcccaa 5820ggatgttact gctgccatcc tgcgagagcc catccccacc aagtccacca ttccctcgca 5880ttctctgacc aacctcacct ctgccgccgc caccgagttc cagaagcagg ctatccagcg 5940agccgccaac ctcatcaacc agtccaagaa gcccgtcctt tacgtcggac agggtatcct 6000tggctccgag gagggtccta 
agctgcttaa ggagctggct gagaaggccg agattcccgt 6060caccactact ctgcagggtc ttggtgcctt tgacgagcga gaccccaagt ctctgcacat 6120gctcggtatg cacggttccg gctacgccaa catggccatg cagaacgctg actgtatcat 6180tgctctcggc gcccgatttg atgaccgagt taccggctcc atccccaagt ttgcccccga 6240ggctcgagcc gctgcccttg agggtcgagg tggtattgtt cactttgaga tccaggccaa 6300gaacatcaac aaggttgttc aggccaccga agccgttgag ggagacgtta ccgagtctgt 6360ccgacagctc atccccctca tcaacaaggt ctctgccgct gagcgagctc cctggactga 6420gactatccag tcctggaagc agcagttccc cttcctcttc gaggctgaag gtgaggatgg 6480tgttatcaag ccccagtccg tcattgctct gctctctgac ctgacagaga acaacaagga 6540caagaccatc atcaccaccg gtgttggtca gcatcagatg tggactgccc agcatttccg 6600atggcgacac cctcgaacca tgatcacttc tggtggtctt ggaactatgg gttacggcct 6660gcccgccgct atcggcgcca aggttgcccg acctgactgc gacgtcattg acatcgatgg 6720tgacgcttct ttcaacatga ctctgaccga gctgtccacc gccgttcagt tcaacattgg 6780cgtcaaggct attgtcctca acaacgagga acagggtatg gtcacccagc tgcagtctct 6840cttctacgag aaccgatact gccacactca tcagaagaac cccgacttca tgaagctggc 6900cgagtccatg ggcatgaagg gtatccgaat cactcacatt gaccagctgg aggccggtct 6960caaggagatg ctcgcataca agggccctgt gctcgttgag gttgttgtcg acaagaagat 7020ccccgttctt cccatggttc ccgctggtaa ggctttgcat gagttccttg tctacgacgc 7080tgacgccgag gctgcttctc gacccgatcg actgaagaat gcccccgccc ctcacgtcca 7140ccagaccacc tttgagaact aagtggaaag gaacacaagc aatccgaacc aaaaataatt 7200ggggtcccgt gcccacagag tctagtgcag acctaaaatg accacagtaa attatagctg 7260ttattaaaca tgagattttg accaacaaga gcgtaggaat gttattagct actacttgta 7320catacacagc atttgtttta aataatgttg cctccagggg cagtgagatc aggacccaga 7380tccgtggcca gctctctgac ttcagaccgc ttgtacttaa gcagctcgca acactgttgt 7440cgaggattga acttgccata ttcgattttg tggtcatgaa tccagcacac ctcatttaaa 7500tgtagctaac ggtagcaggc gaactactgg tacatacctc ccccggaata tgtacaggca 7560taatgcgtat ctgtgggaca tgtggtcgtt gcgccattat gtaagcagcg tgtactcctc 7620tgactgtcca tatggtttgc tccatctcac cctcatcgtt ttcattgttc acaggcggcc 7680acaaaaaaac tgtcttctct ccttctctct tcgccttagt ctactcggac 
cagttttagt 7740ttagcttggc gccactggat aaatgagacc tcaggccttg tgatgaggag gtcacttatg 7800aagcatgtta ggaggtgctt gtatggatag agaagcaccc aaaataataa gaataataat 7860aaaacagggg gcgttgtcat ttcatatcgt gttttcacca tcaatacacc tccaaacaat 7920gcccttcatg tggccagccc caatattgtc ctgtagttca actctatgca gctcgtatct 7980tattgagcaa gtaaaactct gtcagccgat attgcccgac ccgcgacaag ggtcaacaag 8040gtggtgtaag gccttcgcag aagtcaaaac tgtgccaaac aaacatctag agtctctttg 8100gtgtttctcg catatatttw atcggctgtc ttacgtattt gcgcctcggt accggactaa 8160tttcggatca tccccaatac gctttttctt cgcagctgtc aacagtgtcc atgatctatc 8220cacctaaatg ggtcatatga ggcgtataat ttcgtggtgc tgataataat tcccatatat 8280ttgacacaaa acttcccccc ctagacatac atctcacaat ctcacttctt gtgcttctgt 8340cacacatctc ctccagctga cttcaactca cacctctgcc ccagttggtc tacagcggta 8400taaggtttct ccgcatagag gtgcaccact cctcccgata cttgtttgtg tgacttgtgg 8460gtcacgacat atatatctac acacattgcg ccaccctttg gttcttccag cacaacaaaa 8520acacgacacg ctaaccatgg ccaatttact gaccgtacac caaaatttgc ctgcattacc 8580ggtcgatgca acgagtgatg aggttcgcaa gaacctgatg gacatgttca gggatcgcca 8640ggcgttttct gagcatacct ggaaaatgct tctgtccgtt tgccggtcgt gggcggcatg 8700gtgcaagttg aataaccgga aatggtttcc cgcagaacct gaagatgttc gcgattatct 8760tctatatctt caggcgcgcg gtctggcagt aaaaactatc cagcaacatt tgggccagct 8820aaacatgctt catcgtcggt ccgggctgcc acgaccaagt gacagcaatg ctgtttcact 8880ggttatgcgg cggatccgaa aagaaaacgt tgatgccggt gaacgtgcaa aacaggctct 8940agcgttcgaa cgcactgatt tcgaccaggt tcgttcactc atggaaaata gcgatcgctg 9000ccaggatata cgtaatctgg catttctggg gattgcttat aacaccctgt tacgtatagc 9060cgaaattgcc aggatcaggg ttaaagatat ctcacgtact gacggtggga gaatgttaat 9120ccatattggc agaacgaaaa cgctggttag caccgcaggt gtagagaagg cacttagcct 9180gggggtaact aaactggtcg agcgatggat ttccgtctct ggtgtagctg atgatccgaa 9240taactacctg ttttgccggg tcagaaaaaa tggtgttgcc gcgccatctg ccaccagcca 9300gctatcaact cgcgccctgg aagggatttt tgaagcaact catcgattga tttacggcgc 9360taaggatgac tctggtcaga gatacctggc ctggtctgga cacagtgccc gtgtcggagc 9420cgcgcgagat atggcccgcg 
ctggagtttc aataccggag atcatgcaag ctggtggctg 9480gaccaatgta aatattgtca tgaactatat ccgtaacctg gatagtgaaa caggggcaat 9540ggtgcgcctg ctggaagatg gcgattaagc 95703015743DNAArtificial SequencePlasmid pZP2-2988 30ggccgcatgt acatacaaga ttatttatag aaatgaatcg cgatcgaaca aagagtacga 60gtgtacgagt aggggatgat gataaaagtg gaagaagttc cgcatctttg gatttatcaa 120cgtgtaggac gatacttcct gtaaaaatgc aatgtcttta ccataggttc tgctgtagat 180gttattaact accattaaca tgtctacttg tacagttgca gaccagttgg agtatagaat 240ggtacactta ccaaaaagtg ttgatggttg taactacgat atataaaact gttgacggga 300tctgtatatt cggtaagata tattttgtgg ggttttagtg gtgtttaaac agtgtacgca 360gtactataga ggaacaattg ccccggagaa gacggccagg ccgcctagat gacaaattca 420acaactcaca gctgactttc tgccattgcc actagggggg ggccttttta tatggccaag 480ccaagctctc cacgtcggtt gggctgcacc caacaataaa tgggtagggt tgcaccaaca 540aagggatggg atggggggta gaagatacga ggataacggg gctcaatggc acaaataaga 600acgaatactg ccattaagac tcgtgatcca gcgactgaca ccattgcatc atctaagggc 660ctcaaaacta cctcggaact gctgcgctga tctggacacc acagaggttc cgagcacttt 720aggttgcacc aaatgtccca ccaggtgcag gcagaaaacg ctggaacagc gtgtacagtt 780tgtcttaaca aaaagtgagg gcgctgaggt cgagcagggt ggtgtgactt gttatagcct 840ttagagctgc gaaagcgcgt atggatttgg ctcatcaggc cagattgagg gtctgtggac 900acatgtcatg ttagtgtact tcaatcgccc cctggatata gccccgacaa taggccgtgg 960cctcattttt ttgccttccg cacatttcca ttgctcggta cccacacctt gcttctcctg 1020cacttgccaa ccttaatact ggtttacatt gaccaacatc ttacaagcgg ggggcttgtc 1080tagggtatat ataaacagtg gctctcccaa tcggttgcca gtctcttttt tcctttcttt 1140ccccacagat tcgaaatcta aactacacat cacaccatgg aggtcgtgaa cgaaatcgtc 1200tccattggcc aggaggttct tcccaaggtc gactatgctc agctctggtc tgatgcctcg 1260cactgcgagg tgctgtacct ctccatcgcc ttcgtcatcc tgaagttcac ccttggtcct 1320ctcggaccca agggtcagtc tcgaatgaag tttgtgttca ccaactacaa cctgctcatg 1380tccatctact cgctgggctc cttcctctct atggcctacg ccatgtacac cattggtgtc 1440atgtccgaca actgcgagaa ggctttcgac aacaatgtct tccgaatcac cactcagctg 1500ttctacctca gcaagttcct cgagtacatt gactccttct atctgcccct 
catgggcaag 1560cctctgacct ggttgcagtt ctttcaccat ctcggagctc ctatggacat gtggctgttc 1620tacaactacc gaaacgaagc cgtttggatc tttgtgctgc tcaacggctt cattcactgg 1680atcatgtacg gctactattg gacccgactg atcaagctca agttccctat gcccaagtcc 1740ctgattactt ctatgcagat cattcagttc aacgttggct tctacatcgt ctggaagtac 1800cggaacattc cctgctaccg acaagatgga atgagaatgt ttggctggtt tttcaactac 1860ttctacgttg gtactgtcct gtgtctgttc ctcaacttct acgtgcagac ctacatcgtc 1920cgaaagcaca agggagccaa aaagattcag tgagcggccg caagtgtgga tggggaagtg 1980agtgcccggt tctgtgtgca caattggcaa tccaagatgg atggattcaa cacagggata 2040tagcgagcta cgtggtggtg cgaggatata gcaacggata tttatgtttg acacttgaga 2100atgtacgata caagcactgt ccaagtacaa tactaaacat actgtacata ctcatactcg 2160tacccgggca acggtttcac ttgagtgcag tggctagtgc tcttactcgt acagtgtgca 2220atactgcgta tcatagtctt tgatgtatat cgtattcatt catgttagtt gcgtacgggc 2280gtcgttgctt gtgtgatttt tgaggaccca tccctttggt atataagtat actctggggt 2340taaggttgcc cgtgtagtct aggttatagt tttcatgtga aataccgaga gccgagggag 2400aataaacggg ggtatttgga cttgtttttt tcgcggaaaa gcgtcgaatc aaccctgcgg 2460gccttgcacc atgtccacga cgtgtttctc gccccaattc gccccttgca cgtcaaaatt 2520aggcctccat ctagacccct ccataacatg tgactgtggg gaaaagtata agggaaacca 2580tgcaaccata gacgacgtga aagacgggga ggaaccaatg gaggccaaag aaatggggta 2640gcaacagtcc aggagacaga caaggagaca aggagagggc gcccgaaaga tcggaaaaac 2700aaacatgtcc aattggggca gtgacggaaa cgacacggac acttcagtac aatggaccga 2760ccatctccaa gccagggtta ttccggtatc accttggccg taacctcccg ctggtacctg 2820atattgtaca cgttcacatt caatatactt tcagctacaa taagagaggc tgtttgtcgg 2880gcatgtgtgt ccgtcgtatg gggtgatgtc cgagggcgaa attcgctaca agcttaactc 2940tggcgcttgt ccagtatgaa tagacaagtc aagaccagtg gtgccatgat tgacagggag 3000gtacaagact tcgatactcg agcattactc ggacttgtgg cgattgaaca gacgggcgat 3060cgcttctccc ccgtattgcc ggcgcgccag ctgcattaat gaatcggcca acgcgcgggg 3120agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 3180gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 3240gaatcagggg ataacgcagg 
aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 3300cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 3360aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 3420tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 3480ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 3540ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 3600cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 3660ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 3720gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac agtatttggt 3780atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 3840aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 3900aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 3960gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 4020cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 4080gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 4140tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 4200ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 4260ataaaccagc cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 4320atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 4380cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 4440tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 4500aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 4560tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 4620ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 4680agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 4740gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 4800agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 4860accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 4920gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg 
aagcatttat 4980cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 5040ggggttccgc gcacatttcc ccgaaaagtg ccacctgatg cggtgtgaaa taccgcacag 5100atgcgtaagg agaaaatacc gcatcaggaa attgtaagcg ttaatatttt gttaaaattc 5160gcgttaaatt tttgttaaat cagctcattt tttaaccaat aggccgaaat cggcaaaatc 5220ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt ttggaacaag 5280agtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc 5340gatggcccac tacgtgaacc atcaccctaa tcaagttttt tggggtcgag gtgccgtaaa 5400gcactaaatc ggaaccctaa agggagcccc cgatttagag cttgacgggg aaagccggcg 5460aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt 5520gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc 5580gcgtccattc gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt 5640cgctattacg ccagctggcg aaagggggat gtgctgcaag gcgattaagt tgggtaacgc 5700cagggttttc ccagtcacga cgttgtaaaa cgacggccag tgaattgtaa tacgactcac 5760tatagggcga attgggcccg acgtcgcatg cgctgatgac actttggtct gaaagagatg 5820cattttgaat cccaaacttg cagtgcccaa gtgacataca tctccgcgtt ttggaaaatg 5880ttcagaaaca gttgattgtg ttggaatggg gaatggggaa tggaaaaatg actcaagtat 5940caattccaaa aacttctctg gctggcagta cctactgtcc atactactgc attttctcca 6000gtcaggccac tctatactcg acgacacagt agtaaaaccc agataatttc gacataaaca 6060agaaaacaga cccaataata tttatatata gtcagccgtt tgtccagttc agactgtaat 6120agccgaaaaa aaatccaaag tttctattct aggaaaatat attccaatat ttttaattct 6180taatctcatt tattttattc tagcgaaata catttcagct acttgagaca tgtgataccc 6240acaaatcgga ttcggactcg gttgttcaga agagcatatg gcattcgtgc tcgcttgttc 6300acgtattctt cctgttccat ctcttggccg acaatcacac aaaaatgggg tttttttttt 6360aattctaatg attcattaca gcaaaattga gatatagcag accacgtatt ccataatcac 6420caaggaagtt cttgggcgtc ttaattaact cacctgcagg attgagacta tgaatggatt 6480cccgtgcccg tattactcta ctaatttgat cttggaacgc gaaaatacgt ttctaggact 6540ccaaagaatc tcaactcttg tccttactaa atatactacc catagttgat ggtttacttg 6600aacagagagg acatgttcac ttgacccaaa gtttctcgca tctcttggat atttgaacaa 6660cggcgtccac tgaccgtcag 
ttatccagtc acaaaacccc cacattcata cattcccatg 6720tacgtttaca aagttctcaa ttccatcgtg caaatcaaaa tcacatctat tcattcatca 6780tatataaacc catcatgtct actaacactc acaactccat agaaaacatc gactcagaac 6840acacgctcca tgcggccgct tactgagcct tggcaccggg ctgcttctcg gccattcgag 6900cgaactggga caggtatcgg agcaggatga cgagaccttc atggggcaga gggtttcggt 6960aggggaggtt gtgcttctgg cacagctgtt ccacctggta ggaaacggca gtgaggttgt 7020gtcgaggcag ggtgggccag agatggtgct cgatctggta gttcaggcct ccaaagaacc 7080agtcagtaat gatgcctcgt cgaatgttca tggtctcatg gatctgaccc acagagaagc 7140catgtccgtc ccagacggaa tcaccgatct tctccagagg gtagtggttc atgaagacca 7200cgatggcaat tccgaagcca ccgacgagct cggaaacaaa gaacaccagc atcgaggtca 7260ggatggaggg cataaagaag aggtggaaca gggtcttgag agtccagtgc agagcgagtc 7320caatggcctc tttcttgtac tgagatcggt agaactggtt gtctcggtcc ttgagggatc 7380gaacggtcag cacagactgg aaacaccaga tgaatcgcag gagaatacag atgaccagga 7440aatagtactg ttggaactga atgagctttc gggagatggg agaagctcga gtgacatcgt 7500cctcggacca ggcgagcaga ggcaggttat caatgtcggg atcgtgaccc tgaacgttgg 7560tagcagaatg atgggcgttg tgtctgtcct tccaccaggt cacggagaag ccctggagtc 7620cgttgccaaa gaccagaccc aggacgttat tccagtttcg gttcttgaag gtctggtggt 7680ggcagatgtc atgagacagc catcccattt gctggtagtg cataccgagc acgagagcac 7740caatgaagta caggtggtac tggaccagca tgaagaaggc aagcacgcca agacccaggg 7800tggtcaagat cttgtacgag taccagaggg gagaggcgtc aaacatgcca gtggcgatca 7860gctcttctcg gagctttcgg aaatcctcct gagcttcgtt gacggcagcc tggggaggca 7920gctcggaagc ctggttgatc ttgggcattc gcttgagctt gtcgaaggct tcctgagagt 7980gcataaccat gaaggcgtca gtagcatctc gtccctggta
gttctcaatg atttcagctc 8040caccagggtg gaagttcacc caagcggaga cgtcgtacac ctttccgtcg atgacgaggg 8100gcagagcctg tcgagaagcc ttcaccatgg ttgtgaatta gggtggtgag aatggttggt 8160tgtagggaag aatcaaaggc cggtctcggg atccgtgggt atatatatat atatatatat 8220atacgatcct tcgttacctc cctgttctca aaactgtggt ttttcgtttt tcgttttttg 8280ctttttttga tttttttagg gccaactaag cttccagatt tcgctaatca cctttgtact 8340aattacaaga aaggaagaag ctgattagag ttgggctttt tatgcaactg tgctactcct 8400tatctctgat atgaaagtgt agacccaatc acatcatgtc atttagagtt ggtaatactg 8460ggaggataga taaggcacga aaacgagcca tagcagacat gctgggtgta gccaagcaga 8520agaaagtaga tgggagccaa ttgacgagcg agggagctac gccaatccga catacgacac 8580gctgagatcg tcttggccgg ggggtaccta cagatgtcca agggtaagtg cttgactgta 8640attgtatgtc tgaggacaaa tatgtagtca gccgtataaa gtcataccag gcaccagtgc 8700catcatcgaa ccactaactc tctatgatac atgcctccgg tattattgta ccatgcgtcg 8760ctttgttaca tacgtatctt gcctttttct ctcagaaact ccagactttg gctattggtc 8820gagataagcc cggaccatag tgagtctttc acactctaca tttctccctt gctccaacta 8880tttaaattcc ttcacttcaa gttcattctt catctgcttc tgttttactt tgacaggcaa 8940atgaagacat ggtacgactt gatggaggcc aagaacgcca tttcaccccg agacaccgaa 9000gtgcctgaaa tcctggctgc ccccattgat aacatcggaa actacggtat tccggaaagt 9060gtatatagaa cctttcccca gcttgtgtct gtggatatgg atggtgtaat cccctttgag 9120tactcgtctt ggcttctctc cgagcagtat gaggctctct aatctagcgc atttaatatc 9180tcaatgtatt tatatattta tcttctcatg cggccgctta ctgagccttg gcaccgggct 9240gcttctcggc cattcgagcg aactgggaca ggtatcggag caggatgacg agaccttcat 9300ggggcagagg gtttcggtag gggaggttgt gcttctggca cagctgttcc acctggtagg 9360aaacggcagt gaggttgtgt cgaggcaggg tgggccagag atggtgctcg atctggtagt 9420tcaggcctcc aaagaaccag tcagtaatga tgcctcgtcg aatgttcatg gtctcatgga 9480tctgacccac agagaagcca tgtccgtccc agacggaatc accgatcttc tccagagggt 9540agtggttcat gaagaccacg atggcaattc cgaagccacc gacgagctcg gaaacaaaga 9600acaccagcat cgaggtcagg atggagggca taaagaagag gtggaacagg gtcttgagag 9660tccagtgcag agcgagtcca atggcctctt tcttgtactg agatcggtag aactggttgt 9720ctcggtcctt 
gagggatcga acggtcagca cagactggaa acaccagatg aatcgcagga 9780gaatacagat gaccaggaaa tagtactgtt ggaactgaat gagctttcgg gagatgggag 9840aagctcgagt gacatcgtcc tcggaccagg cgagcagagg caggttatca atgtcgggat 9900cgtgaccctg aacgttggta gcagaatgat gggcgttgtg tctgtccttc caccaggtca 9960cggagaagcc ctggagtccg ttgccaaaga ccagacccag gacgttattc cagtttcggt 10020tcttgaaggt ctggtggtgg cagatgtcat gagacagcca tcccatttgc tggtagtgca 10080taccgagcac gagagcacca atgaagtaca ggtggtactg gaccagcatg aagaaggcaa 10140gcacgccaag acccagggtg gtcaagatct tgtacgagta ccagagggga gaggcgtcaa 10200acatgccagt ggcgatcagc tcttctcgga gctttcggaa atcctcctga gcttcgttga 10260cggcagcctg gggaggcagc tcggaagcct ggttgatctt gggcattcgc ttgagcttgt 10320cgaaggcttc ctgagagtgc ataaccatga aggcgtcagt agcatctcgt ccctggtagt 10380tctcaatgat ttcagctcca ccagggtgga agttcaccca agcggagacg tcgtacacct 10440ttccgtcgat gacgaggggc agagcctgtc gagaagcctt caccatgggc aggacctgtg 10500ttagtacatt gtcggggagt catcaattgg ttcgacaggt tgtcgactgt tagtatgagc 10560tcaattgggc tctggtgggt cgatgacact tgtcatctgt ttctgttggg tcatgtttcc 10620atcaccttct atggtactca caattcgtcc gattcgcccg aatccgttaa taccgacttt 10680gatggccatg ttgatgtgtg tttaattcaa gaatgaatat agagaagaga agaagaaaaa 10740agattcaatt gagccggcga tgcagaccct tatataaatg ttgccttgga cagacggagc 10800aagcccgccc aaacctacgt tcggtataat atgttaagct ttttaacaca aaggtttggc 10860ttggggtaac ctgatgtggt gcaaaagacc gggcgttggc gagccattgc gcgggcgaat 10920ggggccgtga ctcgtctcaa attcgagggc gtgcctcaat tcgtgccccc gtggcttttt 10980cccgccgttt ccgccccgtt tgcaccactg cagccgcttc tttggttcgg acaccttgct 11040gcgagctagg tgccttgtgc tacttaaaaa gtggcctccc aacaccaaca tgacatgagt 11100gcgtgggcca agacacgttg gcggggtcgc agtcggctca atggcccgga aaaaacgctg 11160ctggagctgg ttcggacgca gtccgccgcg gcgtatggat atccgcaagg ttccatagcg 11220ccattgccct ccgtcggcgt ctatcccgca acctctaaat agagcgggaa tataacccaa 11280gcttcttttt tttcctttaa cacgcacacc cccaactatc atgttgctgc tgctgtttga 11340ctctactctg tggaggggtg ctcccaccca acccaaccta caggtggatc cggcgctgtg 11400attggctgat aagtctccta 
tccggactaa ttctgaccaa tgggacatgc gcgcaggacc 11460caaatgccgc aattacgtaa ccccaacgaa atgcctaccc ctctttggag cccagcggcc 11520ccaaatcccc ccaagcagcc cggttctacc ggcttccatc tccaagcaca agcagcccgg 11580aattccttta cctgcaggat aacttcgtat aatgtatgct atacgaagtt atgatctctc 11640tcttgagctt ttccataaca agttcttctg cctccaggaa gtccatgggt ggtttgatca 11700tggttttggt gtagtggtag tgcagtggtg gtattgtgac tggggatgta gttgagaata 11760agtcatacac aagtcagctt tcttcgagcc tcatataagt ataagtagtt caacgtatta 11820gcactgtacc cagcatctcc gtatcgagaa acacaacaac atgccccatt ggacagatca 11880tgcggataca caggttgtgc agtatcatac atactcgatc agacaggtcg tctgaccatc 11940atacaagctg aacaagcgct ccatacttgc acgctctcta tatacacagt taaattacat 12000atccatagtc taacctctaa cagttaatct tctggtaagc ctcccagcca gccttctggt 12060atcgcttggc ctcctcaata ggatctcggt tctggccgta cagacctcgg ccgacaatta 12120tgatatccgt tccggtagac atgacatcct caacagttcg gtactgctgt ccgagagcgt 12180ctcccttgtc gtcaagaccc accccggggg tcagaataag ccagtcctca gagtcgccct 12240taggtcggtt ctgggcaatg aagccaacca caaactcggg gtcggatcgg gcaagctcaa 12300tggtctgctt ggagtactcg ccagtggcca gagagccctt gcaagacagc tcggccagca 12360tgagcagacc tctggccagc ttctcgttgg gagaggggac taggaactcc ttgtactggg 12420agttctcgta gtcagagacg tcctccttct tctgttcaga gacagtttcc tcggcaccag 12480ctcgcaggcc agcaatgatt ccggttccgg gtacaccgtg ggcgttggtg atatcggacc 12540actcggcgat tcggtgacac cggtactggt gcttgacagt gttgccaata tctgcgaact 12600ttctgtcctc gaacaggaag aaaccgtgct taagagcaag ttccttgagg gggagcacag 12660tgccggcgta ggtgaagtcg tcaatgatgt cgatatgggt tttgatcatg cacacataag 12720gtccgacctt atcggcaagc tcaatgagct ccttggtggt ggtaacatcc agagaagcac 12780acaggttggt tttcttggct gccacgagct tgagcactcg agcggcaaag gcggacttgt 12840ggacgttagc tcgagcttcg taggagggca ttttggtggt gaagaggaga ctgaaataaa 12900tttagtctgc agaacttttt atcggaacct tatctggggc agtgaagtat atgttatggt 12960aatagttacg agttagttga acttatagat agactggact atacggctat cggtccaaat 13020tagaaagaac gtcaatggct ctctgggcgt cgcctttgcc gacaaaaatg tgatcatgat 13080gaaagccagc aatgacgttg cagctgatat 
tgttgtcggc caaccgcgcc gaaaacgcag 13140ctgtcagacc cacagcctcc aacgaagaat gtatcgtcaa agtgatccaa gcacactcat 13200agttggagtc gtactccaaa ggcggcaatg acgagtcaga cagatactcg tcgacgcgat 13260aacttcgtat aatgtatgct atacgaagtt atcgtacgat agttagtaga caacaatcga 13320taacgtctcg taccaaccac agattacgac ccattcgcag tcacagttca ctagggtttg 13380ggttgcatcc gttgagagcg gtttgttttt aaccttctcc atgtgctcac tcaggttttg 13440ggttcagatc aaatcaaggc gtgaaccact ttgtttgagg acaaatgtga cacaaccaac 13500cagtgtcagg ggcaagtccg tgacaaaggg gaagatacaa tgcaattact gacagttaca 13560gactgcctcg atgccctaac cttgccccaa aataagacaa ctgtcctcgt ttaagcgcaa 13620ccctattcag cgtcacgtca taatagcgtt tggatagcac tagtctatga ggagcgtttt 13680atgttgcggt gagggcgatt ggtgctcata tgggttcaat tgaggtggcg gaacgagctt 13740agtcttcaat tgaggtgcga gcgacacaat tgggtgtcac gtggcctaat tgacctcggg 13800tcgtggagtc cccagttata cagcaaccac gaggtgcatg ggtaggagac gtcaccagac 13860aatagggttt tttttggact ggagagggtt gggcaaaagc gctcaacggg ctgtttgggg 13920agctgtgggg gaggaattgg cgatatttgt gaggttaacg gctccgattt gcgtgttttg 13980tcgctcctgc atctccccat acccatatct tccctcccca cctctttcca cgataatttt 14040acggatcagc aataaggttc cttctcctag tttccacgtc catatatatc tatgctgcgt 14100cgtccttttc gtgacatcac caaaacacat acaacaatgg ctgttactga cgtccttaag 14160cgaaagtccg gtgtcatcgt cggcgacgat gtccgagccg tgagtatcca cgacaagatc 14220agtgtcgaga cgacgcgttt tgtgtaatga cacaatccga aagtcgctag caacacacac 14280tctctacaca aactaaccca gctctccatg gcctccacct cggctctgcc caagcagaac 14340cctgccctcc gacgaaccgt cacttccacc actgtgaccg actcggagtc tgctgccgtc 14400tctccctccg attctcccag acactcggcc tcctctacat cgctgtcttc catgtccgag 14460gtggacattg ccaagcccaa gtccgagtac ggtgtcatgc tggataccta cggcaaccag 14520ttcgaagttc ccgacttcac catcaaggac atctacaacg ctattcccaa gcactgcttc 14580aagcgatctg ctctcaaggg atacggctac attcttcgag acattgtcct cctgactacc 14640actttcagca tctggtacaa ctttgtgaca cccgagtaca ttccctccac tcctgctcga 14700gccggtctgt gggctgtgta caccgttctt cagggactct tcggtactgg actgtgggtc 14760attgcccacg agtgtggaca tggtgctttc tccgattccc 
gaatcatcaa cgacattact 14820ggctgggtgc ttcactcttc cctgcttgtt ccctacttca gctggcaaat ctcccaccgg 14880aagcatcaca aggccactgg aaacatggag cgagacatgg tcttcgttcc tcgaacccga 14940gagcagcaag ctactcgact cggcaagatg acccacgaac tcgcccatct taccgaggaa 15000actcctgctt tcaccctgct catgcttgtg cttcagcaac tggtcggttg gcccaactat 15060ctcattacca acgttactgg acacaactac catgagcggc agcgagaggg tcgaggcaag 15120ggaaagcaca acggtcttgg cggtggagtt aaccatttcg atccccgatc tcctctgtac 15180gagaacagcg acgccaagct catcgtgctc tccgacattg gcattggtct tatggccacc 15240gctctgtact ttctcgttca gaagttcgga ttctacaaca tggccatctg gtacttcgtt 15300ccctacttgt gggttaacca ctggctcgtc gccattacct ttctgcagca cacagatcct 15360actcttcccc actacaccaa cgacgagtgg aactttgtgc gaggtgccgc tgcaaccatc 15420gaccgagaga tgggcttcat tggacgtcat ctgctccacg gcattatcga gactcacgtc 15480ctgcatcact acgtctcttc cattcccttc tacaatgcgg acgaagctac cgaggccatc 15540aaacctatca tgggcaagca ctatcgagct gatgtccagg acggtcctcg aggattcatt 15600cgagccatgt accgatctgc acgaatgtgc cagtgggttg aaccctccgc tggtgccgag 15660ggagctggca agggtgtcct gttctttcga aaccgaaaca atgtgggcac tcctcccgct 15720gtcatcaagc ccgttgccta agc 15743316303DNAArtificial SequencePlasmid pZKUE3S 31ggccgcaagt gtggatgggg aagtgagtgc ccggttctgt gtgcacaatt ggcaatccaa 60gatggatgga ttcaacacag ggatatagcg agctacgtgg tggtgcgagg atatagcaac 120ggatatttat gtttgacact tgagaatgta cgatacaagc actgtccaag tacaatacta 180aacatactgt acatactcat actcgtaccc gggcaacggt ttcacttgag tgcagtggct 240agtgctctta ctcgtacagt gtgcaatact gcgtatcata gtctttgatg tatatcgtat 300tcattcatgt tagttgcgta cgaggaaact gtctctgaac agaagaagga ggacgtctct 360gactacgaga actcccagta caaggagttc ctagtcccct ctcccaacga gaagctggcc 420agaggtctgc tcatgctggc cgagctgtct tgcaagggct ctctggccac tggcgagtac 480tccaagcaga ccattgagct tgcccgatcc gaccccgagt ttgtggttgg cttcattgcc 540cagaaccgac ctaagggcga ctctgaggac tggcttattc tgacccccgg ggtgggtctt 600gacgacaagg gagacgctct cggacagcag taccgaactg ttgaggatgt catgtctacc 660ggaacggata tcataattgt cggccgaggt ctgtacggcc agaaccgaga tcctattgag 
720gaggccaagc gataccagaa ggctggctgg gaggcttacc agaagattaa ctgttagagg 780ttagactatg gatatgtaat ttaactgtgt atatagagag cgtgcaagta tggagcgctt 840gttcagcttg tatgatggtc agacgacctg tctgatcgag tatgtatgat actgcacaac 900ctgtgtatcc gcatgatctg tccaatgggg catgttgttg tgtttctcga tacggagatg 960ctgggtacag tgctaatacg ttgaactact tatacttata tgaggctcga agaaagctga 1020cttgtgtatg acttaattaa tcgagcttgg cgtaatcatg gtcatagctg tttcctgtgt 1080gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 1140cctggggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 1200tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 1260gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 1320ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 1380caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 1440aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 1500atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 1560cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 1620ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 1680gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 1740accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 1800cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 1860cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta tttggtatct 1920gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 1980aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 2040aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 2100actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 2160taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 2220gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 2280tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 2340ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 2400accagccagc cggaagggcc gagcgcagaa 
gtggtcctgc aactttatcc gcctccatcc 2460agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 2520acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 2580tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 2640cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 2700tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 2760ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 2820gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 2880tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 2940ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 3000gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga 3060cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 3120gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 3180ttccgcgcac atttccccga aaagtgccac ctgacgcgcc ctgtagcggc gcattaagcg 3240cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg 3300ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc cgtcaagctc 3360taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc gaccccaaaa 3420aacttgatta gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc 3480ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact ggaacaacac 3540tcaaccctat ctcggtctat tcttttgatt tataagggat tttgccgatt tcggcctatt 3600ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgc 3660ttacaatttc cattcgccat tcaggctgcg caactgttgg gaagggcgat cggtgcgggc 3720ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct gcaaggcgat taagttgggt 3780aacgccaggg ttttcccagt cacgacgttg taaaacgacg gccagtgaat tgtaatacga 3840ctcactatag ggcgaattgg gtaccgggcc ccccctcgag gtcgacgagt atctgtctga 3900ctcgtcattg catgcctttg gagtacgact ccaactatga gtgtgcttgg atcactttga 3960cgatacattc ttcgttggag gctgtgggtc tgacagctgc gttttcggcg cggttggccg 4020acaacaatat cagctgcaac gtcattgctg gctttcatca tgatcacatt tttgtcggca 4080aaggcgacgc ccagagagcc attgacgttc tttctaattt ggaccgatag ccgtatagtc 
4140cagtctatct ataagttcaa ctaactcgta actattacca taacatatac ttcactgccc 4200cagataaggt tccgataaaa agttctgcag actaaattta tttcagtctc ctcttcacca 4260ccaaaatgcc ctcctacgaa gctcgagtgc tcaagctcgt ggcagccaag aaaaccaacc 4320tgtgtgcttc tctggatgtt accaccacca aggagctcat tgagcttgcc gataaggtcg 4380gaccttatgt gtgcatgatc aaaacccata tcgacatcat tgacgacttc acctacgccg 4440gcactgtgct ccccctcaag gaacttgctc ttaagcacgg tttcttcctg ttcgaggaca 4500gaaagttcgc agatattggc aacactgtca agcaccagta ccggtgtcac cgaatcgccg 4560agtggtccga tatcaccaac gcccacggtg tttaaacccg gaaccggaat cgataagctt 4620gatatcgaat tcatgctgtt catcgtggtt aatgctgctg tgtgctgtgt gtgtgtgttg 4680tttggcgctc attgttgcgt tatgcagcgt acaccacaat attggaagct tattagcctt 4740tctatttttt cgtttgcaag gcttaacaac attgctgtgg agagggatgg ggatatggag 4800gccgctggag ggagtcggag aggcgttttg gagcggcttg gcctggcgcc cagctcgcga 4860aacgcaccta ggaccctttg gcacgccgaa atgtgccact tttcagtcta gtaacgcctt 4920acctacgtca ttccatgcgt gcatgtttgc gccttttttc ccttgccctt gatcgccaca 4980cagtacagtg cactgtacag tggaggtttt gggggggtct tagatgggag ctaaaagcgg 5040cctagcggta cactagtggg attgtatgga gtggcatgga gcctaggtgg agcctgacag 5100gacgcacgac cggctagccc gtgacagacg atgggtggct cctgttgtcc accgcgtaca 5160aatgtttggg ccaaagtctt gtcagccttg cttgcgaacc taattcccaa ttttgtcact 5220tcgcaccccc attgatcgag ccctaacccc tgcccatcag gcaatccaat taagctcgca 5280ttgtctgcct tgtttagttt ggctcctgcc cgtttcggcg tccacttgca caaacacaaa 5340caagcattat atataaggct cgtctctccc tcccaaccac actcactttt ttgcccgtct 5400tcccttgcta acacaaaagt caagaacaca aacaaccacc ccaaccccct tacacacaag 5460acatatctac accatggagt ctggacccat gcctgctggc attcccttcc ctgagtacta 5520tgacttcttt atggactgga agactcccct ggccatcgct gccacctaca ctgctgccgt 5580cggtctcttc aaccccaagg ttggcaaggt ctcccgagtg gttgccaagt cggctaacgc 5640aaagcctgcc gagcgaaccc agtccggagc tgccatgact gccttcgtct ttgtgcacaa 5700cctcattctg tgtgtctact ctggcatcac cttctactac atgtttcctg ctatggtcaa 5760gaacttccga acccacacac tgcacgaagc ctactgcgac acggatcagt ccctctggaa 5820caacgcactt ggctactggg gttacctctt 
ctacctgtcc aagttctacg aggtcattga 5880caccatcatc atcatcctga agggacgacg gtcctcgctg cttcagacct accaccatgc 5940tggagccatg attaccatgt ggtctggcat caactaccaa gccactccca tttggatctt 6000tgtggtcttc aactccttca ttcacaccat catgtactgt tactatgcct tcacctctat 6060cggattccat cctcctggca aaaagtacct gacttcgatg cagattactc agtttctggt 6120cggtatcacc attgccgtgt cctacctctt cgttcctggc tgcatccgaa cacccggtgc 6180tcagatggct gtctggatca acgtcggcta cctgtttccc ttgacctatc tgttcgtgga 6240ctttgccaag cgaacctact ccaagcgatc tgccattgcc gctcagaaaa aggctcagta 6300agc 6303
32 21 DNA Artificial Sequence Primer pZP-GW-5-1 32 cgacaagatg gaatgagaat g 21
33 22 DNA Artificial Sequence Primer pZP-GW-5-2 33 ctggtttttc aactacttct ac 22
34 21 DNA Artificial Sequence Primer pZP-GW-5-3 34 gtactgtcct gtgtctgttc c 21
35 22 DNA Artificial Sequence Primer pZP-GW-5-4 35 ctacatcgtc cgaaagcaca ag 22
36 24 DNA Artificial Sequence Primer pZP-GW-3-1 36 ctaccagatc gagcaccatc tctg 24
37 21 DNA Artificial Sequence Primer pZP-GW-3-2 37 ctaccaggtg gaacagctgt g 21
38 22 DNA Artificial Sequence Primer pZP-GW-3-3 38 tctgccccat gaaggtctcg tc 22
39 22 DNA Artificial Sequence Primer pZP-GW-3-4 39 cctgtcccag ttcgctcgaa tg 22
40 44 DNA Artificial Sequence Genome Walker adaptor-1 40 gtaatacgac tatagggcac gcgtggtcga cggcccgggc tggt 44
41 8 DNA Artificial Sequence Genome Walker adaptor-2 41 accagccc 8
42 22 DNA Artificial Sequence Nested adaptor primer 42 gtaatacgac tcactatagg gc 22
43 36 DNA Artificial Sequence Primer Per10F1 43 gatcaaccat ggggggaagt tcacatgcat tcgctg 36
44 29 DNA Artificial Sequence Primer ZPGW-5-5 44 gttatagttt tcatgtgaaa taccgagag 29
45 37 DNA Artificial Sequence Primer Per10R 45 gatcaagcgg ccgccagacc tcgtcattat ctgatag 37
46 7222 DNA Artificial Sequence Plasmid pFBAIn-MOD-1 46 catggatcca ggcctgttaa cggccattac ggcctgcagg atccgaaaaa acctcccaca 60cctccccctg aacctgaaac ataaaatgaa tgcaattgtt gttgttaact tgtttattgc 120agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt 180ttcactgcat tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgcgg 240ccgcaagtgt ggatggggaa gtgagtgccc ggttctgtgt gcacaattgg caatccaaga 300tggatggatt caacacaggg atatagcgag ctacgtggtg gtgcgaggat atagcaacgg 360atatttatgt ttgacacttg agaatgtacg atacaagcac tgtccaagta caatactaaa 420catactgtac atactcatac tcgtacccgg gcaacggttt cacttgagtg cagtggctag 480tgctcttact cgtacagtgt gcaatactgc gtatcatagt ctttgatgta tatcgtattc 540attcatgtta gttgcgtacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 600gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 660tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 720cgctcttccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 780gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 840aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 900gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 960aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 1020gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 1080ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 1140cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 1200ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 1260actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 1320tggcctaact acggctacac tagaaggaca
gtatttggta tctgcgctct gctgaagcca 1380gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 1440ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 1500cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 1560ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 1620tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 1680agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 1740gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 1800ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 1860gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 1920cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 1980acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 2040cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 2100cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 2160ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 2220tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 2280atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 2340tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 2400actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 2460aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 2520ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc 2580ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 2640cgaaaagtgc cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt 2700acgcgcagcg tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc 2760ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct 2820ttagggttcc gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat 2880ggttcacgta gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc 2940acgttcttta atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc 3000tattcttttg atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg 
3060atttaacaaa aatttaacgc gaattttaac aaaatattaa cgcttacaat ttccattcgc 3120cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc 3180agctggcgaa agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc 3240agtcacgacg ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat 3300tgggtaccgg gccccccctc gaggtcgatg gtgtcgataa gcttgatatc gaattcatgt 3360cacacaaacc gatcttcgcc tcaaggaaac ctaattctac atccgagaga ctgccgagat 3420ccagtctaca ctgattaatt ttcgggccaa taatttaaaa aaatcgtgtt atataatatt 3480atatgtatta tatatataca tcatgatgat actgacagtc atgtcccatt gctaaataga 3540cagactccat ctgccgcctc caactgatgt tctcaatatt taaggggtca tctcgcattg 3600tttaataata aacagactcc atctaccgcc tccaaatgat gttctcaaaa tatattgtat 3660gaacttattt ttattactta gtattattag acaacttact tgctttatga aaaacacttc 3720ctatttagga aacaatttat aatggcagtt cgttcattta acaatttatg tagaataaat 3780gttataaatg cgtatgggaa atcttaaata tggatagcat aaatgatatc tgcattgcct 3840aattcgaaat caacagcaac gaaaaaaatc ccttgtacaa cataaatagt catcgagaaa 3900tatcaactat caaagaacag ctattcacac gttactattg agattattat tggacgagaa 3960tcacacactc aactgtcttt ctctcttcta gaaatacagg tacaagtatg tactattctc 4020attgttcata cttctagtca tttcatccca catattcctt ggatttctct ccaatgaatg 4080acattctatc ttgcaaattc aacaattata ataagatata ccaaagtagc ggtatagtgg 4140caatcaaaaa gcttctctgg tgtgcttctc gtatttattt ttattctaat gatccattaa 4200aggtatatat ttatttcttg ttatataatc cttttgttta ttacatgggc tggatacata 4260aaggtatttt gatttaattt tttgcttaaa ttcaatcccc cctcgttcag tgtcaactgt 4320aatggtagga aattaccata cttttgaaga agcaaaaaaa atgaaagaaa aaaaaaatcg 4380tatttccagg ttagacgttc cgcagaatct agaatgcggt atgcggtaca ttgttcttcg 4440aacgtaaaag ttgcgctccc tgagatattg tacatttttg cttttacaag tacaagtaca 4500tcgtacaact atgtactact gttgatgcat ccacaacagt ttgttttgtt tttttttgtt 4560tttttttttt ctaatgattc attaccgcta tgtataccta cttgtacttg tagtaagccg 4620ggttattggc gttcaattaa tcatagactt atgaatctgc acggtgtgcg ctgcgagtta 4680cttttagctt atgcatgcta cttgggtgta atattgggat ctgttcggaa atcaacggat 4740gctcaatcga tttcgacagt aattaattaa 
gtcatacaca agtcagcttt cttcgagcct 4800catataagta taagtagttc aacgtattag cactgtaccc agcatctccg tatcgagaaa 4860cacaacaaca tgccccattg gacagatcat gcggatacac aggttgtgca gtatcataca 4920tactcgatca gacaggtcgt ctgaccatca tacaagctga acaagcgctc catacttgca 4980cgctctctat atacacagtt aaattacata tccatagtct aacctctaac agttaatctt 5040ctggtaagcc tcccagccag ccttctggta tcgcttggcc tcctcaatag gatctcggtt 5100ctggccgtac agacctcggc cgacaattat gatatccgtt ccggtagaca tgacatcctc 5160aacagttcgg tactgctgtc cgagagcgtc tcccttgtcg tcaagaccca ccccgggggt 5220cagaataagc cagtcctcag agtcgccctt aggtcggttc tgggcaatga agccaaccac 5280aaactcgggg tcggatcggg caagctcaat ggtctgcttg gagtactcgc cagtggccag 5340agagcccttg caagacagct cggccagcat gagcagacct ctggccagct tctcgttggg 5400agaggggact aggaactcct tgtactggga gttctcgtag tcagagacgt cctccttctt 5460ctgttcagag acagtttcct cggcaccagc tcgcaggcca gcaatgattc cggttccggg 5520tacaccgtgg gcgttggtga tatcggacca ctcggcgatt cggtgacacc ggtactggtg 5580cttgacagtg ttgccaatat ctgcgaactt tctgtcctcg aacaggaaga aaccgtgctt 5640aagagcaagt tccttgaggg ggagcacagt gccggcgtag gtgaagtcgt caatgatgtc 5700gatatgggtt ttgatcatgc acacataagg tccgacctta tcggcaagct caatgagctc 5760cttggtggtg gtaacatcca gagaagcaca caggttggtt ttcttggctg ccacgagctt 5820gagcactcga gcggcaaagg cggacttgtg gacgttagct cgagcttcgt aggagggcat 5880tttggtggtg aagaggagac tgaaataaat ttagtctgca gaacttttta tcggaacctt 5940atctggggca gtgaagtata tgttatggta atagttacga gttagttgaa cttatagata 6000gactggacta tacggctatc ggtccaaatt agaaagaacg tcaatggctc tctgggcgtc 6060gcctttgccg acaaaaatgt gatcatgatg aaagccagca atgacgttgc agctgatatt 6120gttgtcggcc aaccgcgccg aaaacgcagc tgtcagaccc acagcctcca acgaagaatg 6180tatcgtcaaa gtgatccaag cacactcata gttggagtcg tactccaaag gcggcaatga 6240cgagtcagac agatactcgt cgaaaacagt gtacgcagat ctactataga ggaacattta 6300aattgccccg gagaagacgg ccaggccgcc tagatgacaa attcaacaac tcacagctga 6360ctttctgcca ttgccactag gggggggcct ttttatatgg ccaagccaag ctctccacgt 6420cggttgggct gcacccaaca ataaatgggt agggttgcac caacaaaggg atgggatggg 
6480gggtagaaga tacgaggata acggggctca atggcacaaa taagaacgaa tactgccatt 6540aagactcgtg atccagcgac tgacaccatt gcatcatcta agggcctcaa aactacctcg 6600gaactgctgc gctgatctgg acaccacaga ggttccgagc actttaggtt gcaccaaatg 6660tcccaccagg tgcaggcaga aaacgctgga acagcgtgta cagtttgtct taacaaaaag 6720tgagggcgct gaggtcgagc agggtggtgt gacttgttat agcctttaga gctgcgaaag 6780cgcgtatgga tttggctcat caggccagat tgagggtctg tggacacatg tcatgttagt 6840gtacttcaat cgccccctgg atatagcccc gacaataggc cgtggcctca tttttttgcc 6900ttccgcacat ttccattgct cggtacccac accttgcttc tcctgcactt gccaacctta 6960atactggttt acattgacca acatcttaca agcggggggc ttgtctaggg tatatataaa 7020cagtggctct cccaatcggt tgccagtctc ttttttcctt tctttcccca cagattcgaa 7080atctaaacta cacatcacag aattccgagc cgtgagtatc cacgacaaga tcagtgtcga 7140gacgacgcgt tttgtgtaat gacacaatcc gaaagtcgct agcaacacac actctctaca 7200caaactaacc cagctctggt ac 7222
47 8133 DNA Artificial Sequence Plasmid pFBAIN-Pex10 47 ggccgcaagt gtggatgggg aagtgagtgc ccggttctgt gtgcacaatt ggcaatccaa 60gatggatgga ttcaacacag ggatatagcg agctacgtgg tggtgcgagg atatagcaac 120ggatatttat gtttgacact tgagaatgta cgatacaagc actgtccaag tacaatacta 180aacatactgt acatactcat actcgtaccc gggcaacggt ttcacttgag tgcagtggct 240agtgctctta ctcgtacagt gtgcaatact gcgtatcata gtctttgatg tatatcgtat 300tcattcatgt tagttgcgta cgagccggaa gcataaagtg taaagcctgg ggtgcctaat 360gagtgagcta actcacatta attgcgttgc gctcactgcc cgctttccag tcgggaaacc 420tgtcgtgcca gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg 480ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 540cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 600gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 660tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 720agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 780tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 840cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 900ttcgctccaa gctgggctgt gtgcacgaac
cccccgttca gcccgaccgc tgcgccttat 960ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1020ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1080ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 1140cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 1200gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 1260atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 1320ttttggtcat gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa 1380gttttaaatc aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa 1440tcagtgaggc acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc 1500ccgtcgtgta gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga 1560taccgcgaga cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa 1620gggccgagcg cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt 1680gccgggaagc tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg 1740ctacaggcat cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc 1800aacgatcaag gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg 1860gtcctccgat cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag 1920cactgcataa ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt 1980actcaaccaa gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt 2040caatacggga taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac 2100gttcttcggg gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac 2160ccactcgtgc acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag 2220caaaaacagg aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa 2280tactcatact cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga 2340gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc 2400cccgaaaagt gccacctgac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg 2460ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct 2520tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc 2580ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg 
2640atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt 2700ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg 2760tctattcttt tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc 2820tgatttaaca aaaatttaac gcgaatttta acaaaatatt aacgcttaca atttccattc 2880gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt cgctattacg 2940ccagctggcg aaagggggat gtgctgcaag gcgattaagt tgggtaacgc cagggttttc 3000ccagtcacga cgttgtaaaa cgacggccag tgaattgtaa tacgactcac tatagggcga 3060attgggtacc gggccccccc tcgaggtcga tggtgtcgat aagcttgata tcgaattcat 3120gtcacacaaa ccgatcttcg cctcaaggaa acctaattct acatccgaga gactgccgag 3180atccagtcta cactgattaa ttttcgggcc aataatttaa aaaaatcgtg ttatataata 3240ttatatgtat tatatatata catcatgatg atactgacag tcatgtccca ttgctaaata 3300gacagactcc atctgccgcc tccaactgat gttctcaata tttaaggggt catctcgcat 3360tgtttaataa taaacagact ccatctaccg cctccaaatg atgttctcaa aatatattgt 3420atgaacttat ttttattact tagtattatt agacaactta cttgctttat gaaaaacact 3480tcctatttag gaaacaattt ataatggcag ttcgttcatt taacaattta tgtagaataa 3540atgttataaa tgcgtatggg aaatcttaaa tatggatagc ataaatgata tctgcattgc 3600ctaattcgaa atcaacagca acgaaaaaaa tcccttgtac aacataaata gtcatcgaga 3660aatatcaact atcaaagaac agctattcac acgttactat tgagattatt attggacgag 3720aatcacacac tcaactgtct ttctctcttc tagaaataca ggtacaagta tgtactattc 3780tcattgttca tacttctagt catttcatcc cacatattcc ttggatttct ctccaatgaa 3840tgacattcta tcttgcaaat tcaacaatta taataagata taccaaagta gcggtatagt 3900ggcaatcaaa aagcttctct ggtgtgcttc tcgtatttat ttttattcta atgatccatt 3960aaaggtatat atttatttct tgttatataa tccttttgtt tattacatgg gctggataca 4020taaaggtatt ttgatttaat tttttgctta aattcaatcc cccctcgttc agtgtcaact 4080gtaatggtag gaaattacca tacttttgaa gaagcaaaaa aaatgaaaga aaaaaaaaat 4140cgtatttcca ggttagacgt tccgcagaat ctagaatgcg gtatgcggta cattgttctt 4200cgaacgtaaa agttgcgctc cctgagatat tgtacatttt tgcttttaca agtacaagta 4260catcgtacaa ctatgtacta ctgttgatgc atccacaaca gtttgttttg tttttttttg 4320tttttttttt ttctaatgat tcattaccgc 
tatgtatacc tacttgtact tgtagtaagc 4380cgggttattg gcgttcaatt aatcatagac ttatgaatct gcacggtgtg cgctgcgagt 4440tacttttagc ttatgcatgc tacttgggtg taatattggg atctgttcgg aaatcaacgg 4500atgctcaatc gatttcgaca gtaattaatt aagtcataca caagtcagct ttcttcgagc 4560ctcatataag tataagtagt tcaacgtatt agcactgtac ccagcatctc cgtatcgaga 4620aacacaacaa catgccccat tggacagatc atgcggatac acaggttgtg cagtatcata 4680catactcgat cagacaggtc gtctgaccat catacaagct gaacaagcgc tccatacttg 4740cacgctctct atatacacag ttaaattaca tatccatagt ctaacctcta acagttaatc 4800ttctggtaag cctcccagcc agccttctgg tatcgcttgg cctcctcaat aggatctcgg 4860ttctggccgt acagacctcg gccgacaatt atgatatccg ttccggtaga catgacatcc 4920tcaacagttc ggtactgctg tccgagagcg tctcccttgt cgtcaagacc caccccgggg 4980gtcagaataa gccagtcctc agagtcgccc ttaggtcggt tctgggcaat gaagccaacc 5040acaaactcgg ggtcggatcg ggcaagctca atggtctgct tggagtactc gccagtggcc 5100agagagccct tgcaagacag ctcggccagc atgagcagac ctctggccag cttctcgttg 5160ggagagggga ctaggaactc cttgtactgg gagttctcgt agtcagagac gtcctccttc 5220ttctgttcag agacagtttc ctcggcacca gctcgcaggc cagcaatgat tccggttccg 5280ggtacaccgt gggcgttggt gatatcggac cactcggcga ttcggtgaca ccggtactgg 5340tgcttgacag tgttgccaat atctgcgaac tttctgtcct cgaacaggaa gaaaccgtgc 5400ttaagagcaa gttccttgag ggggagcaca gtgccggcgt aggtgaagtc gtcaatgatg 5460tcgatatggg ttttgatcat gcacacataa ggtccgacct tatcggcaag ctcaatgagc 5520tccttggtgg tggtaacatc cagagaagca cacaggttgg ttttcttggc tgccacgagc 5580ttgagcactc gagcggcaaa ggcggacttg tggacgttag ctcgagcttc gtaggagggc 5640attttggtgg tgaagaggag actgaaataa atttagtctg cagaactttt tatcggaacc 5700ttatctgggg cagtgaagta tatgttatgg taatagttac gagttagttg aacttataga 5760tagactggac tatacggcta tcggtccaaa ttagaaagaa cgtcaatggc tctctgggcg 5820tcgcctttgc cgacaaaaat gtgatcatga tgaaagccag caatgacgtt gcagctgata 5880ttgttgtcgg ccaaccgcgc cgaaaacgca gctgtcagac ccacagcctc caacgaagaa 5940tgtatcgtca aagtgatcca agcacactca tagttggagt cgtactccaa aggcggcaat 6000gacgagtcag acagatactc gtcgaaaaca gtgtacgcag atctactata gaggaacatt 
6060taaattgccc cggagaagac ggccaggccg cctagatgac aaattcaaca actcacagct 6120gactttctgc cattgccact aggggggggc ctttttatat ggccaagcca agctctccac 6180gtcggttggg ctgcacccaa caataaatgg gtagggttgc accaacaaag ggatgggatg 6240gggggtagaa gatacgagga taacggggct caatggcaca aataagaacg aatactgcca 6300ttaagactcg tgatccagcg actgacacca ttgcatcatc taagggcctc aaaactacct 6360cggaactgct gcgctgatct ggacaccaca gaggttccga gcactttagg ttgcaccaaa 6420tgtcccacca ggtgcaggca gaaaacgctg gaacagcgtg tacagtttgt cttaacaaaa 6480agtgagggcg ctgaggtcga gcagggtggt gtgacttgtt atagccttta gagctgcgaa 6540agcgcgtatg gatttggctc atcaggccag attgagggtc tgtggacaca tgtcatgtta 6600gtgtacttca atcgccccct ggatatagcc ccgacaatag gccgtggcct catttttttg 6660ccttccgcac atttccattg ctcggtaccc acaccttgct tctcctgcac ttgccaacct 6720taatactggt ttacattgac caacatctta caagcggggg gcttgtctag ggtatatata 6780aacagtggct ctcccaatcg gttgccagtc tcttttttcc tttctttccc cacagattcg 6840aaatctaaac tacacatcac agaattccga gccgtgagta tccacgacaa gatcagtgtc 6900gagacgacgc gttttgtgta atgacacaat ccgaaagtcg ctagcaacac acactctcta 6960cacaaactaa cccagctctg gtaccatggg gggaagttca catgcattcg ctggtgaatc 7020tgatctgaca ctacaactac acaccaggtc caacatgagc gacaatacga caatcaaaaa 7080gccgatccga cccaaaccga tccggacgga acgcctgcct tacgctgggg ccgcagaaat 7140catccgagcc aaccagaaag accactactt tgagtccgtg cttgaacagc
atctcgtcac 7200gtttctgcag aaatggaagg gagtacgatt tatccaccag tacaaggagg agctggagac 7260ggcgtccaag tttgcatatc tcggtttgtg tacgcttgtg ggctccaaga ctctcggaga 7320agagtacacc aatctcatgt acactatcag agaccgaaca gctctaccgg gggtggtgag 7380acggtttggc tacgtgcttt ccaacactct gtttccatac ctgtttgtgc gctacatggg 7440caagttgcgc gccaaactga tgcgcgagta tccccatctg gtggagtacg acgaagatga 7500gcctgtgccc agcccggaaa catggaagga gcgggtcatc aagacgtttg tgaacaagtt 7560tgacaagttc acggcgctgg aggggtttac cgcgatccac ttggcgattt tctacgtcta 7620cggctcgtac taccagctca gtaagcggat ctggggcatg cgttatgtat ttggacaccg 7680actggacaag aatgagcctc gaatcggtta cgagatgctc ggtctgctga ttttcgcccg 7740gtttgccacg tcatttgtgc agacgggaag agagtacctc ggagcgctgc tggaaaagag 7800cgtggagaaa gaggcagggg agaaggaaga tgaaaaggaa gcggttgtgc cgaaaaagaa 7860gtcgtcaatt ccgttcattg aggatacaga aggggagacg gaagacaaga tcgatctgga 7920ggaccctcga cagctcaagt tcattcctga ggcgtccaga gcgtgcactc tgtgtctgtc 7980atacattagt gcgccggcat gtacgccatg tggacacttt ttctgttggg actgtatttc 8040cgaatgggtg agagagaagc ccgagtgtcc cttgtgtcgg cagggtgtga gagagcagaa 8100cttgttgcct atcagataat gacgaggtct ggc 8133
48 35 DNA Artificial Sequence Primer PEX10-R-BsiWI 48 gatcaacgta cgcttcagca gtaactgtat tgctc 35
49 35 DNA Artificial Sequence Primer PEX10-F1-SalI 49 gatcaagtcg acattgtaac tagtcctgga gggtc 35
50 36 DNA Artificial Sequence Primer PEX10-F2-SalI 50 gatcaagtcg acgtcttagc gtcatgtatt ctcaag 36
51 7277 DNA Artificial Sequence Plasmid pEXP-MOD1 51 catggatcca ggcctgttaa cggccattac ggcctgcagg atccgaaaaa acctcccaca 60cctccccctg aacctgaaac ataaaatgaa tgcaattgtt gttgttaact tgtttattgc 120agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt 180ttcactgcat tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgcgg 240ccgcaagtgt ggatggggaa gtgagtgccc ggttctgtgt gcacaattgg caatccaaga 300tggatggatt caacacaggg atatagcgag ctacgtggtg gtgcgaggat atagcaacgg 360atatttatgt ttgacacttg agaatgtacg atacaagcac tgtccaagta caatactaaa 420catactgtac atactcatac tcgtacccgg gcaacggttt cacttgagtg cagtggctag 480tgctcttact cgtacagtgt
gcaatactgc gtatcatagt ctttgatgta tatcgtattc 540attcatgtta gttgcgtacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 600gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 660tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 720cgctcttccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 780gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 840aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 900gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 960aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 1020gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 1080ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 1140cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 1200ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 1260actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 1320tggcctaact acggctacac tagaaggaca gtatttggta tctgcgctct gctgaagcca 1380gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc 1440ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 1500cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 1560ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 1620tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 1680agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 1740gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 1800ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 1860gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 1920cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 1980acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 2040cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 2100cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 2160ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 
2220tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 2280atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 2340tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 2400actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 2460aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 2520ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc 2580ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 2640cgaaaagtgc cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt 2700acgcgcagcg tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc 2760ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct 2820ttagggttcc gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat 2880ggttcacgta gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc 2940acgttcttta atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc 3000tattcttttg atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg 3060atttaacaaa aatttaacgc gaattttaac aaaatattaa cgcttacaat ttccattcgc 3120cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc 3180agctggcgaa agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc 3240agtcacgacg ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat 3300tgggtaccgg gccccccctc gaggtcgatg gtgtcgataa gcttgatatc gaattcatgt 3360cacacaaacc gatcttcgcc tcaaggaaac ctaattctac atccgagaga ctgccgagat 3420ccagtctaca ctgattaatt ttcgggccaa taatttaaaa aaatcgtgtt atataatatt 3480atatgtatta tatatataca tcatgatgat actgacagtc atgtcccatt gctaaataga 3540cagactccat ctgccgcctc caactgatgt tctcaatatt taaggggtca tctcgcattg 3600tttaataata aacagactcc atctaccgcc tccaaatgat gttctcaaaa tatattgtat 3660gaacttattt ttattactta gtattattag acaacttact tgctttatga aaaacacttc 3720ctatttagga aacaatttat aatggcagtt cgttcattta acaatttatg tagaataaat 3780gttataaatg cgtatgggaa atcttaaata tggatagcat aaatgatatc tgcattgcct 3840aattcgaaat caacagcaac gaaaaaaatc ccttgtacaa cataaatagt catcgagaaa 3900tatcaactat caaagaacag ctattcacac 
gttactattg agattattat tggacgagaa 3960tcacacactc aactgtcttt ctctcttcta gaaatacagg tacaagtatg tactattctc 4020attgttcata cttctagtca tttcatccca catattcctt ggatttctct ccaatgaatg 4080acattctatc ttgcaaattc aacaattata ataagatata ccaaagtagc ggtatagtgg 4140caatcaaaaa gcttctctgg tgtgcttctc gtatttattt ttattctaat gatccattaa 4200aggtatatat ttatttcttg ttatataatc cttttgttta ttacatgggc tggatacata 4260aaggtatttt gatttaattt tttgcttaaa ttcaatcccc cctcgttcag tgtcaactgt 4320aatggtagga aattaccata cttttgaaga agcaaaaaaa atgaaagaaa aaaaaaatcg 4380tatttccagg ttagacgttc cgcagaatct agaatgcggt atgcggtaca ttgttcttcg 4440aacgtaaaag ttgcgctccc tgagatattg tacatttttg cttttacaag tacaagtaca 4500tcgtacaact atgtactact gttgatgcat ccacaacagt ttgttttgtt tttttttgtt 4560tttttttttt ctaatgattc attaccgcta tgtataccta cttgtacttg tagtaagccg 4620ggttattggc gttcaattaa tcatagactt atgaatctgc acggtgtgcg ctgcgagtta 4680cttttagctt atgcatgcta cttgggtgta atattgggat ctgttcggaa atcaacggat 4740gctcaatcga tttcgacagt aattaattaa gtcatacaca agtcagcttt cttcgagcct 4800catataagta taagtagttc aacgtattag cactgtaccc agcatctccg tatcgagaaa 4860cacaacaaca tgccccattg gacagatcat gcggatacac aggttgtgca gtatcataca 4920tactcgatca gacaggtcgt ctgaccatca tacaagctga acaagcgctc catacttgca 4980cgctctctat atacacagtt aaattacata tccatagtct aacctctaac agttaatctt 5040ctggtaagcc tcccagccag ccttctggta tcgcttggcc tcctcaatag gatctcggtt 5100ctggccgtac agacctcggc cgacaattat gatatccgtt ccggtagaca tgacatcctc 5160aacagttcgg tactgctgtc cgagagcgtc tcccttgtcg tcaagaccca ccccgggggt 5220cagaataagc cagtcctcag agtcgccctt aggtcggttc tgggcaatga agccaaccac 5280aaactcgggg tcggatcggg caagctcaat ggtctgcttg gagtactcgc cagtggccag 5340agagcccttg caagacagct cggccagcat gagcagacct ctggccagct tctcgttggg 5400agaggggact aggaactcct tgtactggga gttctcgtag tcagagacgt cctccttctt 5460ctgttcagag acagtttcct cggcaccagc tcgcaggcca gcaatgattc cggttccggg 5520tacaccgtgg gcgttggtga tatcggacca ctcggcgatt cggtgacacc ggtactggtg 5580cttgacagtg ttgccaatat ctgcgaactt tctgtcctcg aacaggaaga aaccgtgctt 
5640aagagcaagt tccttgaggg ggagcacagt gccggcgtag gtgaagtcgt caatgatgtc 5700gatatgggtt ttgatcatgc acacataagg tccgacctta tcggcaagct caatgagctc 5760cttggtggtg gtaacatcca gagaagcaca caggttggtt ttcttggctg ccacgagctt 5820gagcactcga gcggcaaagg cggacttgtg gacgttagct cgagcttcgt aggagggcat 5880tttggtggtg aagaggagac tgaaataaat ttagtctgca gaacttttta tcggaacctt 5940atctggggca gtgaagtata tgttatggta atagttacga gttagttgaa cttatagata 6000gactggacta tacggctatc ggtccaaatt agaaagaacg tcaatggctc tctgggcgtc 6060gcctttgccg acaaaaatgt gatcatgatg aaagccagca atgacgttgc agctgatatt 6120gttgtcggcc aaccgcgccg aaaacgcagc tgtcagaccc acagcctcca acgaagaatg 6180tatcgtcaaa gtgatccaag cacactcata gttggagtcg tactccaaag gcggcaatga 6240cgagtcagac agatactcgt cgaccgtacg gggagtttgg cgcccgtttt ttcgagcccc 6300acacgtttcg gtgagtatga gcggcggcag attcgagcgt ttccggtttc cgcggctgga 6360cgagagccca tgatgggggc tcccaccacc agcaatcagg gccctgatta cacacccacc 6420tgtaatgtca tgctgttcat cgatggttaa tgctgctgtg tgctgtgtgt gtgtgttgtt 6480tggcgctcat tgttgcgtta tgcagcgtac accacaatat tggaagctta ttagcctttc 6540tattttttcg tttgcaaggc ttaacaacat tgctgtggag agggatgggg atatggaggc 6600cgctggaggg agtcggagag gcgttttgga gcggcttggc ctggcgccca gctcgcgaaa 6660cgcacctagg accctttggc acgccgaaat gtgccacttt tcagtctagt aacgccttac 6720ctacgtcatt ccatgcgtgc atgtttgcgc cttttttccc ttgcccttga tcgccacaca 6780gtacagtgca ctgtacagtg gaggttttgg gggggtctta gatgggagct aaaagcggcc 6840tagcggtaca ctagtgggat tgtatggagt ggcatggagc ctaggtggag cctgacagga 6900cgcacgaccg gctagcccgt gacagacgat gggtggctcc tgttgtccac cgcgtacaaa 6960tgtttgggcc aaagtcttgt cagccttgct tgcgaaccta attcccaatt ttgtcacttc 7020gcacccccat tgatcgagcc ctaacccctg cccatcaggc aatccaatta agctcgcatt 7080gtctgccttg tttagtttgg ctcctgcccg tttcggcgtc cacttgcaca aacacaaaca 7140agcattatat ataaggctcg tctctccctc ccaaccacac tcactttttt gcccgtcttc 7200ccttgctaac acaaaagtca agaacacaaa caaccacccc aaccccctta cacacaagac 7260atatctacag caatggc 7277
52 7559 DNA Artificial Sequence Plasmid pPEX10-1 52 gtacgagccg gaagcataaa gtgtaaagcc
tggggtgcct aatgagtgag ctaactcaca 60ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat 120taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc 180tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 240aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 300aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 360ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 420acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 480ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 540tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 600tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 660gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 720agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 780tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 840agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 900tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 960acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 1020tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 1080agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 1140tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gtagataact 1200acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 1260tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 1320ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga agctagagta 1380agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg catcgtggtg 1440tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 1500acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc 1560agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 1620actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac caagtcattc 1680tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg ggataatacc 1740gcgccacata 
gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 1800ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 1860tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 1920aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 1980tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 2040tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 2100gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc 2160gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc ctttctcgcc 2220acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt 2280agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg 2340ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt 2400ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta 2460taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt 2520aacgcgaatt ttaacaaaat attaacgctt acaatttcca ttcgccattc aggctgcgca 2580actgttggga agggcgatcg gtgcgggcct cttcgctatt acgccagctg gcgaaagggg 2640gatgtgctgc aaggcgatta agttgggtaa cgccagggtt ttcccagtca cgacgttgta 2700aaacgacggc cagtgaattg taatacgact cactataggg cgaattgggt accgggcccc 2760ccctcgaggt cgatggtgtc gataagcttg atatcgaatt catgtcacac aaaccgatct 2820tcgcctcaag gaaacctaat tctacatccg agagactgcc gagatccagt ctacactgat 2880taattttcgg gccaataatt taaaaaaatc gtgttatata atattatatg tattatatat 2940atacatcatg atgatactga cagtcatgtc ccattgctaa atagacagac tccatctgcc 3000gcctccaact gatgttctca atatttaagg ggtcatctcg cattgtttaa taataaacag 3060actccatcta ccgcctccaa atgatgttct caaaatatat tgtatgaact tatttttatt 3120acttagtatt attagacaac ttacttgctt tatgaaaaac acttcctatt taggaaacaa 3180tttataatgg cagttcgttc atttaacaat ttatgtagaa taaatgttat aaatgcgtat 3240gggaaatctt aaatatggat agcataaatg atatctgcat tgcctaattc gaaatcaaca 3300gcaacgaaaa aaatcccttg tacaacataa atagtcatcg agaaatatca actatcaaag 3360aacagctatt cacacgttac tattgagatt attattggac gagaatcaca cactcaactg 3420tctttctctc ttctagaaat acaggtacaa gtatgtacta 
ttctcattgt tcatacttct 3480agtcatttca tcccacatat tccttggatt tctctccaat gaatgacatt ctatcttgca 3540aattcaacaa ttataataag atataccaaa gtagcggtat agtggcaatc aaaaagcttc 3600tctggtgtgc ttctcgtatt tatttttatt ctaatgatcc attaaaggta tatatttatt 3660tcttgttata taatcctttt gtttattaca tgggctggat acataaaggt attttgattt 3720aattttttgc ttaaattcaa tcccccctcg ttcagtgtca actgtaatgg taggaaatta 3780ccatactttt gaagaagcaa aaaaaatgaa agaaaaaaaa aatcgtattt ccaggttaga 3840cgttccgcag aatctagaat gcggtatgcg gtacattgtt cttcgaacgt aaaagttgcg 3900ctccctgaga tattgtacat ttttgctttt acaagtacaa gtacatcgta caactatgta 3960ctactgttga tgcatccaca acagtttgtt ttgttttttt ttgttttttt tttttctaat 4020gattcattac cgctatgtat acctacttgt acttgtagta agccgggtta ttggcgttca 4080attaatcata gacttatgaa tctgcacggt gtgcgctgcg agttactttt agcttatgca 4140tgctacttgg gtgtaatatt gggatctgtt cggaaatcaa cggatgctca atcgatttcg 4200acagtaatta attaagtcat acacaagtca gctttcttcg agcctcatat aagtataagt 4260agttcaacgt attagcactg tacccagcat ctccgtatcg agaaacacaa caacatgccc 4320cattggacag atcatgcgga tacacaggtt gtgcagtatc atacatactc gatcagacag 4380gtcgtctgac catcatacaa gctgaacaag cgctccatac ttgcacgctc tctatataca 4440cagttaaatt acatatccat agtctaacct ctaacagtta atcttctggt aagcctccca 4500gccagccttc tggtatcgct tggcctcctc aataggatct cggttctggc cgtacagacc 4560tcggccgaca attatgatat ccgttccggt agacatgaca tcctcaacag ttcggtactg 4620ctgtccgaga gcgtctccct tgtcgtcaag acccaccccg ggggtcagaa taagccagtc 4680ctcagagtcg cccttaggtc ggttctgggc aatgaagcca accacaaact cggggtcgga 4740tcgggcaagc tcaatggtct gcttggagta ctcgccagtg gccagagagc ccttgcaaga 4800cagctcggcc agcatgagca gacctctggc cagcttctcg ttgggagagg ggactaggaa 4860ctccttgtac tgggagttct cgtagtcaga gacgtcctcc ttcttctgtt cagagacagt 4920ttcctcggca ccagctcgca ggccagcaat gattccggtt ccgggtacac cgtgggcgtt 4980ggtgatatcg gaccactcgg cgattcggtg acaccggtac tggtgcttga cagtgttgcc 5040aatatctgcg aactttctgt cctcgaacag gaagaaaccg tgcttaagag caagttcctt 5100gagggggagc acagtgccgg cgtaggtgaa gtcgtcaatg atgtcgatat gggttttgat 5160catgcacaca 
taaggtccga ccttatcggc aagctcaatg agctccttgg tggtggtaac 5220atccagagaa gcacacaggt tggttttctt ggctgccacg agcttgagca ctcgagcggc 5280aaaggcggac ttgtggacgt tagctcgagc ttcgtaggag ggcattttgg tggtgaagag 5340gagactgaaa taaatttagt ctgcagaact ttttatcgga accttatctg gggcagtgaa 5400gtatatgtta tggtaatagt tacgagttag ttgaacttat agatagactg gactatacgg 5460ctatcggtcc aaattagaaa gaacgtcaat ggctctctgg gcgtcgcctt tgccgacaaa 5520aatgtgatca tgatgaaagc cagcaatgac gttgcagctg atattgttgt cggccaaccg 5580cgccgaaaac gcagctgtca gacccacagc ctccaacgaa gaatgtatcg tcaaagtgat 5640ccaagcacac tcatagttgg agtcgtactc caaaggcggc aatgacgagt cagacagata 5700ctcgtcgaca ttgtaactag tcctggaggg tcttttttat ggataacctc catgtacgat 5760gtatccaaga tctccacgta ctgtgttctg tttcctaagt aatacccaac aacctctcca 5820acaaacactt gggaagatgc acttgtgctg agatgtcaag atgttagtac tgtactggat 5880ggagagaata ttaataaata attgttaccc aactacatct tgtcgattga aagagatacc 5940cctaagacag ataggatatc tgcaacccga ggaatgaacc ccccagcacc ggcacccttt 6000ctattaacaa aatgccaact gaaatttgaa aagttcaact aaacttattt gacccacaaa 6060aactcgtcaa aagtggcggc gaaagctggc aaatgatgac atccccttgg aactatgata 6120tcccctcgga atcttcgtcc ccatttgcca catctacttg caacgccacg tctgcttact 6180aagcaaccca aatctgcctc ggctcaaaat gtggggaagt tcacatgcat tcgctggtga 6240atctgatctg acactacaac tacacaccag gtccaacatg agcgacaata cgacaatcaa 6300aaagccgatc cgacccaaac cgatccggac ggaacgcctg ccttacgctg gggccgcaga 6360aatcatccga

gccaaccaga aagaccacta ctttgagtcc gtgcttgaac agcatctcgt 6420cacgtttctg cagaaatgga agggagtacg atttatccac cagtacaagg aggagctgga 6480gacggcgtcc aagtttgcat atctcggttt gtgtacgctt gtgggctcca agactctcgg 6540agaagagtac accaatctca tgtacactat cagagaccga acagctctac cgggggtggt 6600gagacggttt ggctacgtgc tttccaacac tctgtttcca tacctgtttg tgcgctacat 6660gggcaagttg cgcgccaaac tgatgcgcga gtatccccat ctggtggagt acgacgaaga 6720tgagcctgtg cccagcccgg aaacatggaa ggagcgggtc atcaagacgt ttgtgaacaa 6780gtttgacaag ttcacggcgc tggaggggtt taccgcgatc cacttggcga ttttctacgt 6840ctacggctcg tactaccagc tcagtaagcg gatctggggc atgcgttatg tatttggaca 6900ccgactggac aagaatgagc ctcgaatcgg ttacgagatg ctcggtctgc tgattttcgc 6960ccggtttgcc acgtcatttg tgcagacggg aagagagtac ctcggagcgc tgctggaaaa 7020gagcgtggag aaagaggcag gggagaagga agatgaaaag gaagcggttg tgccgaaaaa 7080gaagtcgtca attccgttca ttgaggatac agaaggggag acggaagaca agatcgatct 7140ggaggaccct cgacagctca agttcattcc tgaggcgtcc agagcgtgca ctctgtgtct 7200gtcatacatt agtgcgccgg catgtacgcc atgtggacac tttttctgtt gggactgtat 7260ttccgaatgg gtgagagaga agcccgagtg tcccttgtgt cggcagggtg tgagagagca 7320gaacttgttg cctatcagat aatgacgagg tctggatgga aggactagtc agcgagacac 7380agagcatcag ggaccagaca cgaccaattc aatcgacaac actgtgctgc atagcagtgc 7440acagaggtcc tgggcatgaa tatattttag cattggagat atgagtggta gagcgtatac 7500agtattaatt gtggaggtat ctcgtcgcat tgatagagca atacagttac tgctgaagc 7559
SEQ ID NO: 53; Length: 8051; Type: DNA; Organism: Artificial Sequence; Other information: Plasmid pPEX10-2
gtacgagccg gaagcataaa gtgtaaagcc tggggtgcct aatgagtgag ctaactcaca 60ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat 120taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc 180tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 240aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 300aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 360ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 420acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 480ccgaccctgc 
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 540tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 600tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 660gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 720agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 780tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 840agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 900tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 960acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 1020tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 1080agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 1140tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gtagataact 1200acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 1260tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 1320ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga agctagagta 1380agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg catcgtggtg 1440tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 1500acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc 1560agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 1620actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac caagtcattc 1680tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg ggataatacc 1740gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 1800ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 1860tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 1920aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 1980tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 2040tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 2100gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc 2160gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 
ctttctcgcc 2220acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt 2280agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg 2340ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt 2400ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta 2460taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt 2520aacgcgaatt ttaacaaaat attaacgctt acaatttcca ttcgccattc aggctgcgca 2580actgttggga agggcgatcg gtgcgggcct cttcgctatt acgccagctg gcgaaagggg 2640gatgtgctgc aaggcgatta agttgggtaa cgccagggtt ttcccagtca cgacgttgta 2700aaacgacggc cagtgaattg taatacgact cactataggg cgaattgggt accgggcccc 2760ccctcgaggt cgatggtgtc gataagcttg atatcgaatt catgtcacac aaaccgatct 2820tcgcctcaag gaaacctaat tctacatccg agagactgcc gagatccagt ctacactgat 2880taattttcgg gccaataatt taaaaaaatc gtgttatata atattatatg tattatatat 2940atacatcatg atgatactga cagtcatgtc ccattgctaa atagacagac tccatctgcc 3000gcctccaact gatgttctca atatttaagg ggtcatctcg cattgtttaa taataaacag 3060actccatcta ccgcctccaa atgatgttct caaaatatat tgtatgaact tatttttatt 3120acttagtatt attagacaac ttacttgctt tatgaaaaac acttcctatt taggaaacaa 3180tttataatgg cagttcgttc atttaacaat ttatgtagaa taaatgttat aaatgcgtat 3240gggaaatctt aaatatggat agcataaatg atatctgcat tgcctaattc gaaatcaaca 3300gcaacgaaaa aaatcccttg tacaacataa atagtcatcg agaaatatca actatcaaag 3360aacagctatt cacacgttac tattgagatt attattggac gagaatcaca cactcaactg 3420tctttctctc ttctagaaat acaggtacaa gtatgtacta ttctcattgt tcatacttct 3480agtcatttca tcccacatat tccttggatt tctctccaat gaatgacatt ctatcttgca 3540aattcaacaa ttataataag atataccaaa gtagcggtat agtggcaatc aaaaagcttc 3600tctggtgtgc ttctcgtatt tatttttatt ctaatgatcc attaaaggta tatatttatt 3660tcttgttata taatcctttt gtttattaca tgggctggat acataaaggt attttgattt 3720aattttttgc ttaaattcaa tcccccctcg ttcagtgtca actgtaatgg taggaaatta 3780ccatactttt gaagaagcaa aaaaaatgaa agaaaaaaaa aatcgtattt ccaggttaga 3840cgttccgcag aatctagaat gcggtatgcg gtacattgtt cttcgaacgt aaaagttgcg 3900ctccctgaga tattgtacat 
ttttgctttt acaagtacaa gtacatcgta caactatgta 3960ctactgttga tgcatccaca acagtttgtt ttgttttttt ttgttttttt tttttctaat 4020gattcattac cgctatgtat acctacttgt acttgtagta agccgggtta ttggcgttca 4080attaatcata gacttatgaa tctgcacggt gtgcgctgcg agttactttt agcttatgca 4140tgctacttgg gtgtaatatt gggatctgtt cggaaatcaa cggatgctca atcgatttcg 4200acagtaatta attaagtcat acacaagtca gctttcttcg agcctcatat aagtataagt 4260agttcaacgt attagcactg tacccagcat ctccgtatcg agaaacacaa caacatgccc 4320cattggacag atcatgcgga tacacaggtt gtgcagtatc atacatactc gatcagacag 4380gtcgtctgac catcatacaa gctgaacaag cgctccatac ttgcacgctc tctatataca 4440cagttaaatt acatatccat agtctaacct ctaacagtta atcttctggt aagcctccca 4500gccagccttc tggtatcgct tggcctcctc aataggatct cggttctggc cgtacagacc 4560tcggccgaca attatgatat ccgttccggt agacatgaca tcctcaacag ttcggtactg 4620ctgtccgaga gcgtctccct tgtcgtcaag acccaccccg ggggtcagaa taagccagtc 4680ctcagagtcg cccttaggtc ggttctgggc aatgaagcca accacaaact cggggtcgga 4740tcgggcaagc tcaatggtct gcttggagta ctcgccagtg gccagagagc ccttgcaaga 4800cagctcggcc agcatgagca gacctctggc cagcttctcg ttgggagagg ggactaggaa 4860ctccttgtac tgggagttct cgtagtcaga gacgtcctcc ttcttctgtt cagagacagt 4920ttcctcggca ccagctcgca ggccagcaat gattccggtt ccgggtacac cgtgggcgtt 4980ggtgatatcg gaccactcgg cgattcggtg acaccggtac tggtgcttga cagtgttgcc 5040aatatctgcg aactttctgt cctcgaacag gaagaaaccg tgcttaagag caagttcctt 5100gagggggagc acagtgccgg cgtaggtgaa gtcgtcaatg atgtcgatat gggttttgat 5160catgcacaca taaggtccga ccttatcggc aagctcaatg agctccttgg tggtggtaac 5220atccagagaa gcacacaggt tggttttctt ggctgccacg agcttgagca ctcgagcggc 5280aaaggcggac ttgtggacgt tagctcgagc ttcgtaggag ggcattttgg tggtgaagag 5340gagactgaaa taaatttagt ctgcagaact ttttatcgga accttatctg gggcagtgaa 5400gtatatgtta tggtaatagt tacgagttag ttgaacttat agatagactg gactatacgg 5460ctatcggtcc aaattagaaa gaacgtcaat ggctctctgg gcgtcgcctt tgccgacaaa 5520aatgtgatca tgatgaaagc cagcaatgac gttgcagctg atattgttgt cggccaaccg 5580cgccgaaaac gcagctgtca gacccacagc ctccaacgaa gaatgtatcg 
tcaaagtgat 5640ccaagcacac tcatagttgg agtcgtactc caaaggcggc aatgacgagt cagacagata 5700ctcgtcgacg tcttagcgtc atgtattctc aagcttagtc agagagaagg actatggagg 5760agaaggggag aattgagaag ggtatttgaa gggactttga aggtcgcgtg gaagaggtac 5820ttgaagaggt atttgaaggt cacgtggaag aggtatttga agatcacgtg gaagaagtac 5880ttgttttaca gagaatatcg gggtgatttt gacagtggga ttgtctccca agtcctaatc 5940gtttgacatg ggagcagtga aaagtcgggc taaaaaaggg aatatcggaa atcggaaaga 6000cggaaagaat tactggactc atgtttagta gatctgagca cttcaaattt gaaaatatct 6060cttcaaacag cagatcggtt ggtcgtggag gtaccatcaa gggtaaaatc aaggctatca 6120tcaagggcca tatatcgcaa gtttggggga agataatatg ttcatagtga atcagggttg 6180tggatttcct catctaacgg cattgtaact agtcctggag ggtctttttt atggataacc 6240tccatgtacg atgtatccaa gatctccacg tactgtgttc tgtttcctaa gtaataccca 6300acaacctctc caacaaacac ttgggaagat gcacttgtgc tgagatgtca agatgttagt 6360actgtactgg atggagagaa tattaataaa taattgttac ccaactacat cttgtcgatt 6420gaaagagata cccctaagac agataggata tctgcaaccc gaggaatgaa ccccccagca 6480ccggcaccct ttctattaac aaaatgccaa ctgaaatttg aaaagttcaa ctaaacttat 6540ttgacccaca aaaactcgtc aaaagtggcg gcgaaagctg gcaaatgatg acatcccctt 6600ggaactatga tatcccctcg gaatcttcgt ccccatttgc cacatctact tgcaacgcca 6660cgtctgctta ctaagcaacc caaatctgcc tcggctcaaa atgtggggaa gttcacatgc 6720attcgctggt gaatctgatc tgacactaca actacacacc aggtccaaca tgagcgacaa 6780tacgacaatc aaaaagccga tccgacccaa accgatccgg acggaacgcc tgccttacgc 6840tggggccgca gaaatcatcc gagccaacca gaaagaccac tactttgagt ccgtgcttga 6900acagcatctc gtcacgtttc tgcagaaatg gaagggagta cgatttatcc accagtacaa 6960ggaggagctg gagacggcgt ccaagtttgc atatctcggt ttgtgtacgc ttgtgggctc 7020caagactctc ggagaagagt acaccaatct catgtacact atcagagacc gaacagctct 7080accgggggtg gtgagacggt ttggctacgt gctttccaac actctgtttc catacctgtt 7140tgtgcgctac atgggcaagt tgcgcgccaa actgatgcgc gagtatcccc atctggtgga 7200gtacgacgaa gatgagcctg tgcccagccc ggaaacatgg aaggagcggg tcatcaagac 7260gtttgtgaac aagtttgaca agttcacggc gctggagggg tttaccgcga tccacttggc 7320gattttctac gtctacggct 
cgtactacca gctcagtaag cggatctggg gcatgcgtta 7380tgtatttgga caccgactgg acaagaatga gcctcgaatc ggttacgaga tgctcggtct 7440gctgattttc gcccggtttg ccacgtcatt tgtgcagacg ggaagagagt acctcggagc 7500gctgctggaa aagagcgtgg agaaagaggc aggggagaag gaagatgaaa aggaagcggt 7560tgtgccgaaa aagaagtcgt caattccgtt cattgaggat acagaagggg agacggaaga 7620caagatcgat ctggaggacc ctcgacagct caagttcatt cctgaggcgt ccagagcgtg 7680cactctgtgt ctgtcataca ttagtgcgcc ggcatgtacg ccatgtggac actttttctg 7740ttgggactgt atttccgaat gggtgagaga gaagcccgag tgtcccttgt gtcggcaggg 7800tgtgagagag cagaacttgt tgcctatcag ataatgacga ggtctggatg gaaggactag 7860tcagcgagac acagagcatc agggaccaga cacgaccaat tcaatcgaca acactgtgct 7920gcatagcagt gcacagaggt cctgggcatg aatatatttt agcattggag atatgagtgg 7980tagagcgtat acagtattaa ttgtggaggt atctcgtcgc attgatagag caatacagtt 8040actgctgaag c 8051
SEQ ID NO: 54; Length: 15877; Type: DNA; Organism: Artificial Sequence; Other information: Plasmid pZKL1-2SP98C
aaatgatgtc gacgcagtag gatgtcctgc acgggtcttt ttgtggggtg tggagaaagg 60ggtgcttgga tcgatggaag ccggtagaac cgggctgctt gtgcttggag atggaagccg 120gtagaaccgg gctgcttggg gggatttggg gccgctgggc tccaaagagg ggtaggcatt 180tcgttggggt tacgtaattg cggcatttgg gtcctgcgcg catgtcccat tggtcagaat 240tagtccggat aggagactta tcagccaatc acagcgccgg atccacctgt aggttgggtt 300gggtgggagc acccctccac agagtagagt caaacagcag cagcaacatg atagttgggg 360gtgtgcgtgt taaaggaaaa aaaagaagct tgggttatat tcccgctcta tttagaggtt 420gcgggataga cgccgacgga gggcaatggc gctatggaac cttgcggata tccatacgcc 480gcggcggact gcgtccgaac cagctccagc agcgtttttt ccgggccatt gagccgactg 540cgaccccgcc aacgtgtctt ggcccacgca ctcatgtcat gttggtgttg ggaggccact 600ttttaagtag cacaaggcac ctagctcgca gcaaggtgtc cgaaccaaag aagcggctgc 660agtggtgcaa acggggcgga aacggcggga aaaagccacg ggggcacgaa ttgaggcacg 720ccctcgaatt tgagacgagt cacggcccca ttcgcccgcg caatggctcg ccaacgcccg 780gtcttttgca ccacatcagg ttaccccaag ccaaaccttt gtgttaaaaa gcttaacata 840ttataccgaa cgtaggtttg ggcgggcttg ctccgtctgt ccaaggcaac atttatataa 900gggtctgcat cgccggctca attgaatctt ttttcttctt ctcttctcta tattcattct 
960tgaattaaac acacatcaac catgggcgta ttcattaaac aggagcagct tccggctctc 1020aagaagtaca agtactccgc cgaggatcac tcgttcatct ccaacaacat tctgcgcccc 1080ttctggcgac agtttgtcaa aatcttccct ctgtggatgg cccccaacat ggtgactctg 1140ctgggcttct tctttgtcat tgtgaacttc atcaccatgc tcattgttga tcccacccac 1200gaccgcgagc ctcccagatg ggtctacctc acctacgctc tgggtctgtt cctttaccag 1260acatttgatg cctgtgacgg atcccatgcc cgacgaactg gccagagtgg accccttgga 1320gagctgtttg accactgtgt cgacgccatg aatacctctc tgattctcac ggtggtggtg 1380tccaccaccc atatgggata taacatgaag ctactgattg tgcagattgc cgctctcgga 1440aacttctacc tgtcgacctg ggagacctac cataccggaa ctctgtacct ttctggcttc 1500tctggtcctg ttgaaggtat cttgattctg gtggctcttt tcgtcctcac cttcttcact 1560ggtcccaacg tgtacgctct gaccgtctac gaggctcttc ccgagtccat cacttcgctg 1620ctgcctgcca gcttcctgga cgtcaccatc acccagatct acattggatt cggagtgctg 1680ggcatggtgt tcaacatcta cggcgcctgc ggaaacgtga tcaagtacta caacaacaag 1740ggcaagagcg ctctccccgc cattctcgga atcgccccct ttggcatctt ctacgtcggc 1800gtctttgcct gggcccatgt tgctcctctg cttctctcca agtacgccat cgtctatctg 1860tttgccattg gggctgcctt tgccatgcaa gtcggccaga tgattcttgc ccatctcgtg 1920cttgctccct ttccccactg gaacgtgctg ctcttcttcc cctttgtggg actggcagtg 1980cactacattg cacccgtgtt tggctgggac gccgatatcg tgtcggttaa cactctcttc 2040acctgttttg gcgccaccct ctccatttac gccttctttg tgcttgagat catcgacgag 2100atcaccaact acctcgatat ctggtgtctg cgaatcaagt accctcagga gaagaagacc 2160gaataagcgg ccgcatggag cgtgtgttct gagtcgatgt tttctatgga gttgtgagtg 2220ttagtagaca tgatgggttt atatatgatg aatgaataga tgtgattttg atttgcacga 2280tggaattgag aactttgtaa acgtacatgg gaatgtatga atgtgggggt tttgtgactg 2340gataactgac ggtcagtgga cgccgttgtt caaatatcca agagatgcga gaaactttgg 2400gtcaagtgaa catgtcctct ctgttcaagt aaaccatcaa ctatgggtag tatatttagt 2460aaggacaaga gttgagattc tttggagtcc tagaaacgta ttttcgcgtt ccaagatcaa 2520attagtagag taatacgggc acgggaatcc attcatagtc tcaatcctgc aggtgagtta 2580attaatcgag cttggcgtaa tcatggtcat agctgtttcc tgtgtgaaat tgttatccgc 2640tcacaattcc acacaacgta cgatagttag 
tagacaacaa tcagaacatc tccctcctta 2700tataatcaca caggccagaa cgcgctaaac taaagcgctt tggacactat gttacattgg 2760cattgattga actgaaacca cagtctccct cgcctgaatc gagcaatgga tgttgtcgga 2820agtcaacttc actagaagag cggttctatg ccttgtcaag atcatatcat aaactcactc 2880tgtattaccc catctataga acacttgtta tgaatgggcg gaaacattcc gctatatgca 2940cctttccaca ctaatgcaaa gatgtgcatc ttcaacgggt agtaagactg gttccgactt 3000ccgttgcatg gagagcaatg acctcgataa tgcgaacatc ccccacatat acactcttac 3060acaggccaat ataatctgtg catttactaa atatttaagt ctatgcacct gcttgatgaa 3120aagcggcacg gatggtatca tctagtttcc gccaatccaa gaaccaactg tgttggcagt 3180ggtgtagccc atggcacaca gaccaaagat gaaaatacag acatcggcgg ttcgagccgt 3240ggtgcctcga gcaacaccct tgtaatgcaa aagaggaggg taaatgtaca ccagaggcac 3300acatgcaaac gatccggtga gagcgacgaa ccgatcgaga tcgtcggcac ctccccatgc 3360aacaaaggcg gtgacaaaca caaggaagaa ccggaaaatg ttcttctgcc acttgatggt 3420agagttgtac ttgcctgatc gggtgaagag accattctcg atgattcgga tggcgcgcca 3480gctgcattaa tgaatcggcc aacgcgcggg gagaggcggt ttgcgtattg ggcgctcttc 3540cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag cggtatcagc 3600tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag gaaagaacat 3660gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc tggcgttttt 3720ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc agaggtggcg 3780aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc tcgtgcgctc 3840tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt cgggaagcgt 3900ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg ttcgctccaa 3960gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat ccggtaacta 4020tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag ccactggtaa 4080caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt ggtggcctaa 4140ctacggctac actagaagaa cagtatttgg tatctgcgct ctgctgaagc cagttacctt 4200cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta gcggtggttt 4260ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag atcctttgat 4320cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga ttttggtcat 
4380gagattatca aaaaggatct tcacctagat ccttttaaat taaaaatgaa gttttaaatc 4440aatctaaagt atatatgagt aaacttggtc tgacagttac caatgcttaa tcagtgaggc 4500acctatctca gcgatctgtc tatttcgttc atccatagtt gcctgactcc ccgtcgtgta 4560gataactacg atacgggagg gcttaccatc tggccccagt gctgcaatga taccgcgaga 4620cccacgctca ccggctccag atttatcagc aataaaccag ccagccggaa gggccgagcg 4680cagaagtggt cctgcaactt tatccgcctc catccagtct attaattgtt gccgggaagc 4740tagagtaagt agttcgccag ttaatagttt gcgcaacgtt gttgccattg ctacaggcat 4800cgtggtgtca cgctcgtcgt ttggtatggc ttcattcagc tccggttccc aacgatcaag 4860gcgagttaca tgatccccca tgttgtgcaa aaaagcggtt agctccttcg gtcctccgat 4920cgttgtcaga agtaagttgg ccgcagtgtt atcactcatg gttatggcag cactgcataa 4980ttctcttact gtcatgccat ccgtaagatg cttttctgtg actggtgagt actcaaccaa 5040gtcattctga gaatagtgta tgcggcgacc gagttgctct tgcccggcgt caatacggga 5100taataccgcg ccacatagca gaactttaaa agtgctcatc attggaaaac gttcttcggg 5160gcgaaaactc tcaaggatct taccgctgtt gagatccagt tcgatgtaac ccactcgtgc 5220acccaactga tcttcagcat cttttacttt caccagcgtt tctgggtgag caaaaacagg 5280aaggcaaaat gccgcaaaaa agggaataag ggcgacacgg aaatgttgaa tactcatact 5340cttccttttt caatattatt gaagcattta tcagggttat tgtctcatga gcggatacat 5400atttgaatgt atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt 5460gccacctgat gcggtgtgaa ataccgcaca gatgcgtaag gagaaaatac cgcatcagga 5520aattgtaagc gttaatattt tgttaaaatt cgcgttaaat ttttgttaaa tcagctcatt 5580ttttaaccaa taggccgaaa tcggcaaaat cccttataaa tcaaaagaat agaccgagat 5640agggttgagt gttgttccag tttggaacaa

gagtccacta ttaaagaacg tggactccaa 5700cgtcaaaggg cgaaaaaccg tctatcaggg cgatggccca ctacgtgaac catcacccta 5760atcaagtttt ttggggtcga ggtgccgtaa agcactaaat cggaacccta aagggagccc 5820ccgatttaga gcttgacggg gaaagccggc gaacgtggcg agaaaggaag ggaagaaagc 5880gaaaggagcg ggcgctaggg cgctggcaag tgtagcggtc acgctgcgcg taaccaccac 5940acccgccgcg cttaatgcgc cgctacaggg cgcgtccatt cgccattcag gctgcgcaac 6000tgttgggaag ggcgatcggt gcgggcctct tcgctattac gccagctggc gaaaggggga 6060tgtgctgcaa ggcgattaag ttgggtaacg ccagggtttt cccagtcacg acgttgtaaa 6120acgacggcca gtgaattgta atacgactca ctatagggcg aattgggccc gacgtcgcat 6180gcttagaagt gaggattaca agaagcctct ggatatcaat gatgaacgta ctcagcggct 6240ggtcaagcat ttcgaccgtc gaatcgacga ggtgttcacc tttgacaagc gagggttccc 6300aattgatcac gttctcgagt tgttcaaatc ttctctcaac atctctctgc atgaactatc 6360tctgttgacg aacgtgtcac ccactgttcc tcgaacgccc ttctccgagt ttggtctgaa 6420catcttcgat ctcaaactga cccccgcagt gatcaatagt gccatgccac tgccgatgcg 6480gtgcgaacat ccctggaggg attctcggag ctctacacaa tgcagattct gtcgtcgagt 6540actctctacc ttgctcgaat gacttattgt gctactactg cactcatgct tcgatcatgt 6600gccctactgc accccaaatt tggtgatctg attgagacag agtaccctct tcagctgatt 6660cagaagatca tcagcaacat gaatgatgtg gttgaccagg caggctgttg tagtcacgtc 6720cttcacttca agttcattct tcatctgctt ctgttttact ttgacaggca aatgaagaca 6780tggtacgact tgatggaggc caagaacgcc atttcacccc gagacaccga agtgcctgaa 6840atcctggctg cccccattga taacatcgga aactacggta ttccggaaag tgtatataga 6900acctttcccc agcttgtgtc tgtggatatg gatggtgtaa tccccttaat taactcacct 6960gcaggattga gactatgaat ggattcccgt gcccgtatta ctctactaat ttgatcttgg 7020aacgcgaaaa tacgtttcta ggactccaaa gaatctcaac tcttgtcctt actaaatata 7080ctacccatag ttgatggttt acttgaacag agaggacatg ttcacttgac ccaaagtttc 7140tcgcatctct tggatatttg aacaacggcg tccactgacc gtcagttatc cagtcacaaa 7200acccccacat tcatacattc ccatgtacgt ttacaaagtt ctcaattcca tcgtgcaaat 7260caaaatcaca tctattcatt catcatatat aaacccatca tgtctactaa cactcacaac 7320tccatagaaa acatcgactc agaacacacg ctccatgcgg ccgcttaggc aacgggcttg 
7380atgacagcgg gaggagtgcc cacattgttt cggtttcgaa agaacaggac acccttgcca 7440gctccctcgg caccagcgga gggttcaacc cactggcaca ttcgtgcaga tcggtacatg 7500gctcgaatga atcctcgagg accgtcctgg acatcagctc gatagtgctt gcccatgata 7560ggtttgatgg cctcggtagc ttcgtccgca ttgtagaagg gaatggaaga gacgtagtga 7620tgcaggacgt gagtctcgat aatgccgtgg agcagatgac gtccaatgaa gcccatctct 7680cggtcgatgg ttgcagcggc acctcgcaca aagttccact cgtcgttggt gtagtgggga 7740agagtaggat ctgtgtgctg cagaaaggta atggcgacga gccagtggtt aacccacaag 7800tagggaacga agtaccagat ggccatgttg tagaatccga acttctgaac gagaaagtac 7860agagcggtgg ccataagacc aatgccaatg tcggagagca cgatgagctt ggcgtcgctg 7920ttctcgtaca gaggagatcg gggatcgaaa tggttaactc caccgccaag accgttgtgc 7980tttcccttgc ctcgaccctc tcgctgccgc tcatggtagt tgtgtccagt aacgttggta 8040atgagatagt tgggccaacc gaccagttgc tgaagcacaa gcatgagcag ggtgaaagca 8100ggagtttcct cggtaagatg ggcgagttcg tgggtcatct tgccgagtcg agtagcttgc 8160tgctctcggg ttcgaggaac gaagaccatg tctcgctcca tgtttccagt ggccttgtga 8220tgcttccggt gggagatttg ccagctgaag tagggaacaa gcagggaaga gtgaagcacc 8280cagccagtaa tgtcgttgat gattcgggaa tcggagaaag caccatgtcc acactcgtgg 8340gcaatgaccc acagtccagt accgaagagt ccctgaagaa cggtgtacac agcccacaga 8400ccggctcgag caggagtgga gggaatgtac tcgggtgtca caaagttgta ccagatgctg 8460aaagtggtag tcaggaggac aatgtctcga agaatgtagc cgtatccctt gagagcagat 8520cgcttgaagc agtgcttggg aatagcgttg tagatgtcct tgatggtgaa gtcgggaact 8580tcgaactggt tgccgtaggt atccagcatg acaccgtact cggacttggg cttggcaatg 8640tccacctcgg acatggaaga cagcgatgta gaggaggccg agtgtctggg agaatcggag 8700ggagagacgg cagcagactc cgagtcggtc acagtggtgg aagtgacggt tcgtcggagg 8760gcagggttct gcttgggcag agccgaggtg gaggccatgg ccattgctgt agatatgtct 8820tgtgtgtaag ggggttgggg tggttgtttg tgttcttgac ttttgtgtta gcaagggaag 8880acgggcaaaa aagtgagtgt ggttgggagg gagagacgag ccttatatat aatgcttgtt 8940tgtgtttgtg caagtggacg ccgaaacggg caggagccaa actaaacaag gcagacaatg 9000cgagcttaat tggattgcct gatgggcagg ggttagggct cgatcaatgg gggtgcgaag 9060tgacaaaatt gggaattagg ttcgcaagca 
aggctgacaa gactttggcc caaacatttg 9120tacgcggtgg acaacaggag ccacccatcg tctgtcacgg gctagccggt cgtgcgtcct 9180gtcaggctcc acctaggctc catgccactc catacaatcc cactagtgta ccgctaggcc 9240gcttttagct cccatctaag acccccccaa aacctccact gtacagtgca ctgtactgtg 9300tggcgatcaa gggcaaggga aaaaaggcgc aaacatgcac gcatggaatg acgtaggtaa 9360ggcgttacta gactgaaaag tggcacattt cggcgtgcca aagggtccta ggtgcgtttc 9420gcgagctggg cgccaggcca agccgctcca aaacgcctct ccgactccct ccagcggcct 9480ccatatcccc atccctctcc acagcaatgt tgttaagcct tgcaaacgaa aaaatagaaa 9540ggctaataag cttccaatat tgtggtgtac gctgcataac gcaacaatga gcgccaaaca 9600acacacacac acagcacaca gcagcattaa ccacgatgaa cagcatgaat tcctttacct 9660gcaggataac ttcgtataat gtatgctata cgaagttatg atctctctct tgagcttttc 9720cataacaagt tcttctgcct ccaggaagtc catgggtggt ttgatcatgg ttttggtgta 9780gtggtagtgc agtggtggta ttgtgactgg ggatgtagtt gagaataagt catacacaag 9840tcagctttct tcgagcctca tataagtata agtagttcaa cgtattagca ctgtacccag 9900catctccgta tcgagaaaca caacaacatg ccccattgga cagatcatgc ggatacacag 9960gttgtgcagt atcatacata ctcgatcaga caggtcgtct gaccatcata caagctgaac 10020aagcgctcca tacttgcacg ctctctatat acacagttaa attacatatc catagtctaa 10080cctctaacag ttaatcttct ggtaagcctc ccagccagcc ttctggtatc gcttggcctc 10140ctcaatagga tctcggttct ggccgtacag acctcggccg acaattatga tatccgttcc 10200ggtagacatg acatcctcaa cagttcggta ctgctgtccg agagcgtctc ccttgtcgtc 10260aagacccacc ccgggggtca gaataagcca gtcctcagag tcgcccttag gtcggttctg 10320ggcaatgaag ccaaccacaa actcggggtc ggatcgggca agctcaatgg tctgcttgga 10380gtactcgcca gtggccagag agcccttgca agacagctcg gccagcatga gcagacctct 10440ggccagcttc tcgttgggag aggggactag gaactccttg tactgggagt tctcgtagtc 10500agagacgtcc tccttcttct gttcagagac agtttcctcg gcaccagctc gcaggccagc 10560aatgattccg gttccgggta caccgtgggc gttggtgata tcggaccact cggcgattcg 10620gtgacaccgg tactggtgct tgacagtgtt gccaatatct gcgaactttc tgtcctcgaa 10680caggaagaaa ccgtgcttaa gagcaagttc cttgaggggg agcacagtgc cggcgtaggt 10740gaagtcgtca atgatgtcga tatgggtttt gatcatgcac acataaggtc 
cgaccttatc 10800ggcaagctca atgagctcct tggtggtggt aacatccaga gaagcacaca ggttggtttt 10860cttggctgcc acgagcttga gcactcgagc ggcaaaggcg gacttgtgga cgttagctcg 10920agcttcgtag gagggcattt tggtggtgaa gaggagactg aaataaattt agtctgcaga 10980actttttatc ggaaccttat ctggggcagt gaagtatatg ttatggtaat agttacgagt 11040tagttgaact tatagataga ctggactata cggctatcgg tccaaattag aaagaacgtc 11100aatggctctc tgggcgtcgc ctttgccgac aaaaatgtga tcatgatgaa agccagcaat 11160gacgttgcag ctgatattgt tgtcggccaa ccgcgccgaa aacgcagctg tcagacccac 11220agcctccaac gaagaatgta tcgtcaaagt gatccaagca cactcatagt tggagtcgta 11280ctccaaaggc ggcaatgacg agtcagacag atactcgtcg acgcgataac ttcgtataat 11340gtatgctata cgaagttatc gtacgatagt tagtagacaa caatcgatcg aggaagagga 11400caagcggctg cttcttaagt ttgtgacatc agtatccaag gcaccattgc aaggattcaa 11460ggctttgaac ccgtcatttg ccattcgtaa cgctggtaga caggttgatc ggttccctac 11520ggcctccacc tgtgtcaatc ttctcaagct gcctgactat caggacattg atcaacttcg 11580gaagaaactt ttgtatgcca ttcgatcaca tgctggtttc gatttgtctt agaggaacgc 11640atatacagta atcatagaga ataaacgata ttcatttatt aaagtagata gttgaggtag 11700aagttgtaaa gagtgataaa tagcggccgc tcactgaatc tttttggctc ccttgtgctt 11760tcggacgatg taggtctgca cgtagaagtt gaggaacaga cacaggacag taccaacgta 11820gaagtagttg aaaaaccagc caaacattct cattccatct tgtcggtagc agggaatgtt 11880ccggtacttc cagacgatgt agaagccaac gttgaactga atgatctgca tagaagtaat 11940cagggacttg ggcataggga acttgagctt gatcagtcgg gtccaatagt agccgtacat 12000gatccagtga atgaagccgt tgagcagcac aaagatccaa acggcttcgt ttcggtagtt 12060gtagaacagc cacatgtcca taggagctcc gagatggtga aagaactgca accaggtcag 12120aggcttgccc atgaggggca gatagaagga gtcaatgtac tcgaggaact tgctgaggta 12180gaacagctga gtggtgattc ggaagacatt gttgtcgaaa gccttctcgc agttgtcgga 12240catgacacca atggtgtaca tggcgtaggc catagagagg aaggagccca gcgagtagat 12300ggacatgagc aggttgtagt tggtgaacac aaacttcatt cgagactgac ccttgggtcc 12360gagaggacca agggtgaact tcaggatgac gaaggcgatg gagaggtaca gcacctcgca 12420gtgcgaggca tcagaccaga gctgagcata gtcgaccttg ggaagaacct cctggccaat 
12480ggagacgatt tcgttcacga cctccatggt tgtgaattag ggtggtgaga atggttggtt 12540gtagggaaga atcaaaggcc ggtctcggga tccgtgggta tatatatata tatatatata 12600tacgatcctt cgttacctcc ctgttctcaa aactgtggtt tttcgttttt cgttttttgc 12660tttttttgat ttttttaggg ccaactaagc ttccagattt cgctaatcac ctttgtacta 12720attacaagaa aggaagaagc tgattagagt tgggcttttt atgcaactgt gctactcctt 12780atctctgata tgaaagtgta gacccaatca catcatgtca tttagagttg gtaatactgg 12840gaggatagat aaggcacgaa aacgagccat agcagacatg ctgggtgtag ccaagcagaa 12900gaaagtagat gggagccaat tgacgagcga gggagctacg ccaatccgac atacgacacg 12960ctgagatcgt cttggccggg gggtacctac agatgtccaa gggtaagtgc ttgactgtaa 13020ttgtatgtct gaggacaaat atgtagtcag ccgtataaag tcataccagg caccagtgcc 13080atcatcgaac cactaactct ctatgataca tgcctccggt attattgtac catgcgtcgc 13140tttgttacat acgtatcttg cctttttctc tcagaaactc cagactttgg ctattggtcg 13200agataagccc ggaccatagt gagtctttca cactctgttt aaacaccact aaaaccccac 13260aaaatatatc ttaccgaata tacagatcta ctatagagga acaattgccc cggagaagac 13320ggccaggccg cctagatgac aaattcaaca actcacagct gactttctgc cattgccact 13380aggggggggc ctttttatat ggccaagcca agctctccac gtcggttggg ctgcacccaa 13440caataaatgg gtagggttgc accaacaaag ggatgggatg gggggtagaa gatacgagga 13500taacggggct caatggcaca aataagaacg aatactgcca ttaagactcg tgatccagcg 13560actgacacca ttgcatcatc taagggcctc aaaactacct cggaactgct gcgctgatct 13620ggacaccaca gaggttccga gcactttagg ttgcaccaaa tgtcccacca ggtgcaggca 13680gaaaacgctg gaacagcgtg tacagtttgt cttaacaaaa agtgagggcg ctgaggtcga 13740gcagggtggt gtgacttgtt atagccttta gagctgcgaa agcgcgtatg gatttggctc 13800atcaggccag attgagggtc tgtggacaca tgtcatgtta gtgtacttca atcgccccct 13860ggatatagcc ccgacaatag gccgtggcct catttttttg ccttccgcac atttccattg 13920ctcggtaccc acaccttgct tctcctgcac ttgccaacct taatactggt ttacattgac 13980caacatctta caagcggggg gcttgtctag ggtatatata aacagtggct ctcccaatcg 14040gttgccagtc tcttttttcc tttctttccc cacagattcg aaatctaaac tacacatcac 14100acaatgcctg ttactgacgt ccttaagcga aagtccggtg tcatcgtcgg cgacgatgtc 
14160cgagccgtga gtatccacga caagatcagt gtcgagacga cgcgttttgt gtaatgacac 14220aatccgaaag tcgctagcaa cacacactct ctacacaaac taacccagct ctccatggtg 14280aaggcttctc gacaggctct gcccctcgtc atcgacggaa aggtgtacga cgtctccgct 14340tgggtgaact tccaccctgg tggagctgaa atcattgaga actaccaggg acgagatgct 14400actgacgcct tcatggttat gcactctcag gaagccttcg acaagctcaa gcgaatgccc 14460aagatcaacc aggcttccga gctgcctccc caggctgccg tcaacgaagc tcaggaggat 14520ttccgaaagc tccgagaaga gctgatcgcc actggcatgt ttgacgcctc tcccctctgg 14580tactcgtaca agatcttgac caccctgggt cttggcgtgc ttgccttctt catgctggtc 14640cagtaccacc tgtacttcat tggtgctctc gtgctcggta tgcactacca gcaaatggga 14700tggctgtctc atgacatctg ccaccaccag accttcaaga accgaaactg gaataacgtc 14760ctgggtctgg tctttggcaa cggactccag ggcttctccg tgacctggtg gaaggacaga 14820cacaacgccc atcattctgc taccaacgtt cagggtcacg atcccgacat tgataacctg 14880cctctgctcg cctggtccga ggacgatgtc actcgagctt ctcccatctc ccgaaagctc 14940attcagttcc aacagtacta tttcctggtc atctgtattc tcctgcgatt catctggtgt 15000ttccagtctg tgctgaccgt tcgatccctc aaggaccgag acaaccagtt ctaccgatct 15060cagtacaaga aagaggccat tggactcgct ctgcactgga ctctcaagac cctgttccac 15120ctcttcttta tgccctccat cctgacctcg atgctggtgt tctttgtttc cgagctcgtc 15180ggtggcttcg gaattgccat cgtggtcttc atgaaccact accctctgga gaagatcggt 15240gattccgtct gggacggaca tggcttctct gtgggtcaga tccatgagac catgaacatt 15300cgacgaggca tcattactga ctggttcttt ggaggcctga actaccagat cgagcaccat 15360ctctggccca ccctgcctcg acacaacctc actgccgttt cctaccaggt ggaacagctg 15420tgccagaagc acaacctccc ctaccgaaac cctctgcccc atgaaggtct cgtcatcctg 15480ctccgatacc tgtcccagtt cgctcgaatg gccgagaagc agcccggtgc caaggctcag 15540taagcggccg catgagaaga taaatatata aatacattga gatattaaat gcgctagatt 15600agagagcctc atactgctcg gagagaagcc aagacgagta ctcaaagggg attacaccat 15660ccatatccac agacacaagc tggggaaagg ttctatatac actttccgga ataccgtagt 15720ttccgatgtt atcaatgggg gcagccagga tttcaggcac ttcggtgtct cggggtgaaa 15780tggcgttctt ggcctccatc aagtcgtacc atgtcttcat ttgcctgtca aagtaaaaca 
15840gaagcagatg aagaatgaac ttgaagtgaa ggaattt 15877
55 15812 DNA Artificial Sequence Plasmid pZKL2-5U89GC
55 gtacgttatc atttgaacag tgaaaggcta cagtaacaga agcagttgta aacttcattc 60cgttgattct gtactacagt accccactac gccgcttccg ctgacactgt tcaacccaaa 120aactacatct gcgtgcgctg tgtaaggcta tcatcagata catactgtag attctgtaga 180tgcgaacctg cttgtatcat atacatcccc ctccccctga cctgcacaag caagcaatgt 240gacattgata ttgctgctta tctagtgccg aggatgtgaa agccgagact caaacatttc 300ttttactctc ttgttcctga ccagacctgg cggagattac gccagtatga ttcttgcagg 360tctgagacaa gcctggaaca gccaacattt atttttcgaa gcgagaaaca tgccacaccc 420cggcacgttc agagatgcat atgatttgtt tttcgagtaa cagtaccccc cccccccccc 480ccaatgaaac cagtattact cacaccatcc tcattcaaag cgttacactg attacgcgcc 540catcaacgac agcatgaggg gactgctgat ctgatctaat caaatgacta caaaaatcgc 600aataatgaag agcaaacgac aaaaaagaaa caggttaacc aatcccgctt caatgtctca 660ccacaatcca gcactgtttc tcattacctc ctccctctaa tttcagagtt gcatcagggt 720ccttgatggc gcgccagctg cattaatgaa tcggccaacg cgcggggaga ggcggtttgc 780gtattgggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc 840ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata 900acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg 960cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 1020caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 1080gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 1140tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 1200aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 1260ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 1320cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 1380tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 1440tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 1500ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 1560aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 1620aagggatttt ggtcatgaga 
ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 1680aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 1740gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct 1800gactccccgt cgtgtagata actacgatac gggagggctt accatctggc cccagtgctg 1860caatgatacc gcgagaccca cgctcaccgg ctccagattt atcagcaata aaccagccag 1920ccggaagggc cgagcgcaga agtggtcctg caactttatc cgcctccatc cagtctatta 1980attgttgccg ggaagctaga gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg 2040ccattgctac aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg 2100gttcccaacg atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa gcggttagct 2160ccttcggtcc tccgatcgtt gtcagaagta agttggccgc agtgttatca ctcatggtta 2220tggcagcact gcataattct cttactgtca tgccatccgt aagatgcttt tctgtgactg 2280gtgagtactc aaccaagtca ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc 2340cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 2400gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 2460tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 2520ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 2580gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 2640tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca 2700catttccccg aaaagtgcca cctgatgcgg tgtgaaatac cgcacagatg cgtaaggaga 2760aaataccgca tcaggaaatt gtaagcgtta atattttgtt aaaattcgcg ttaaattttt 2820gttaaatcag ctcatttttt aaccaatagg ccgaaatcgg caaaatccct tataaatcaa 2880aagaatagac cgagataggg ttgagtgttg ttccagtttg gaacaagagt ccactattaa 2940agaacgtgga ctccaacgtc aaagggcgaa aaaccgtcta tcagggcgat ggcccactac 3000gtgaaccatc accctaatca agttttttgg ggtcgaggtg ccgtaaagca ctaaatcgga 3060accctaaagg gagcccccga tttagagctt gacggggaaa gccggcgaac gtggcgagaa 3120aggaagggaa gaaagcgaaa ggagcgggcg ctagggcgct ggcaagtgta gcggtcacgc 3180tgcgcgtaac caccacaccc gccgcgctta atgcgccgct acagggcgcg tccattcgcc 3240attcaggctg cgcaactgtt gggaagggcg atcggtgcgg gcctcttcgc tattacgcca 3300gctggcgaaa gggggatgtg ctgcaaggcg attaagttgg gtaacgccag 
ggttttccca 3360gtcacgacgt tgtaaaacga cggccagtga attgtaatac gactcactat agggcgaatt 3420gggcccgacg tcgcatgctg gtttcgattt gtcttagagg aacgcatata cagtaatcat 3480agagaataaa cgatattcat ttattaaagt agatagttga ggtagaagtt gtaaagagtg 3540ataaatagct tagataccac agacaccctc ggtgacgaag tactgcagat ggtttccaat 3600cacattgacc tgctggagca gagtgttacc ggcagagcac tgtttattgc tctggccctg 3660gcacatgaca acgttggaga gaggagggtg gatcaggggc cagtcaataa agacctcacc 3720agagcagtgc tggtaaccgt cccagaaggg cacttgaggg acgatatctc ctcggtgggt 3780gattcggtag agctttcggt ctttggacac cttggagaca tcggggttct cctggccaaa 3840gaagagttta tcgacccagt tagcaaagcc agcgttaccg acaatgggct gaccaagagt 3900aacaacgagg ggatcgtggc cgttaacctt gaggttgatt ccgaacagaa gggctgcagc 3960tcctccgaga gagtgaccgg tgacagcaat ctggtagtcg ggatactgct caatcacaga 4020gtcgagcttg gggccgatct gattgtaggt gttgttgtag gactggatga agccattgtg 4080gacaagacag tcatcacaag tagcagtaga agagatgtta gcagcaagat caaagttaat 4140taactcacct gcaggattga gactatgaat ggattcccgt gcccgtatta ctctactaat 4200ttgatcttgg aacgcgaaaa tacgtttcta ggactccaaa gaatctcaac tcttgtcctt 4260actaaatata ctacccatag ttgatggttt acttgaacag agaggacatg ttcacttgac 4320ccaaagtttc tcgcatctct tggatatttg aacaacggcg tccactgacc gtcagttatc 4380cagtcacaaa acccccacat tcatacattc ccatgtacgt ttacaaagtt ctcaattcca 4440tcgtgcaaat caaaatcaca tctattcatt catcatatat aaacccatca tgtctactaa 4500cactcacaac tccatagaaa acatcgactc agaacacacg ctccatgcgg ccgcttagga 4560atcctgagcg tccttgacac agtgaaccac accgactttg tgcatgtact tgagggtgga 4620aatgatgttg cccacaatgg tagggtagaa gacgtaccga actccgtgtc gttcgcaaca 4680ctctcggaca gcttgctgca cgaagggata gtgccaagac gacattcgag gaaagaggtg 4740atgctcgatc tggaagttga gaccgccagt
aaagaacatg gcaatgggtc caccgtaggt 4800ggaagaggtc tccacctgag ctctgtacca gtcgatctga tcggcttcaa cgtccttctc 4860ggagctcttg accttgcagt tcttgtcggg gattcgctcc gagccatcga agttgtgaga 4920caagatgaaa aagaaggtga ggaaggcacc ggtagcagtg ggcaccagag gaatggtgat 4980gagcagggag gttccagtga gataccaggg caagaaggcg gttcgaaaga tgaagaaagc 5040tcgcataacg aatgcaaggg ttcggtaccg tcgcagaaag ccgttctctc gcatggctgt 5100gacagactcg ggaatggtgt cgttgtgctg cattcggaag atgtagagag ggttgtacac 5160cagcgaaacg ccgtaggctc caagcacgag gtacatgtac caggcctgga atcggtgaaa 5220ccactttcga gcagtgttgg cagcagggta gttgtggaac acaaggaatg gttctgcgga 5280ctcggcatcc aggtcgagac catgctgatt ggtgtaggtg tgatgtcgca tgatgtgaga 5340ctgcagccag atccatctgg acgatccaat gacgtcgatg ccgtaggcaa agagagcgtt 5400gacccagggc tttttgctga tggcaccatg agaggcatcg tgctgaatgg acaggccgat 5460ctgcatgtgc atgaatccag tcaagagacc ccacagcacc attccggtag tagcccagtg 5520ccactcgcaa aaggcggtga cagcaatgat gccaacggtt cgcagccaga atccaggtgt 5580ggcataccag ttccgacctt tcatgacctc tcgcatagtt cgcttgacgt cctgtgcaaa 5640gggagagtcg taggtgtaga caatgtcctt ggaggttcgg tcgtgcttgc ctcgcacgaa 5700ctgttgaagc agcttcgagt tctcgggctt gacgtaaggg tgcatggagt agaacagagg 5760agaagcatcg gaggcaccag aagcgaggat caagtcgcct ccgggatgga ccttggcaag 5820accttccaga tcgtagagaa tgccgtcgat ggcaaccagg tcgggtcgct cgagcagctg 5880ctcggtagta agggagagag ccatggttgt gaattagggt ggtgagaatg gttggttgta 5940gggaagaatc aaaggccggt ctcgggatcc gtgggtatat atatatatat atatatatac 6000gatccttcgt tacctccctg ttctcaaaac tgtggttttt cgtttttcgt tttttgcttt 6060ttttgatttt tttagggcca actaagcttc cagatttcgc taatcacctt tgtactaatt 6120acaagaaagg aagaagctga ttagagttgg gctttttatg caactgtgct actccttatc 6180tctgatatga aagtgtagac ccaatcacat catgtcattt agagttggta atactgggag 6240gatagataag gcacgaaaac gagccatagc agacatgctg ggtgtagcca agcagaagaa 6300agtagatggg agccaattga cgagcgaggg agctacgcca atccgacata cgacacgctg 6360agatcgtctt ggccgggggg tacctacaga tgtccaaggg taagtgcttg actgtaattg 6420tatgtctgag gacaaatatg tagtcagccg tataaagtca taccaggcac cagtgccatc 
6480atcgaaccac taactctcta tgatacatgc ctccggtatt attgtaccat gcgtcgcttt 6540gttacatacg tatcttgcct ttttctctca gaaactccag aattctctct cttgagcttt 6600tccataacaa gttcttctgc ctccaggaag tccatgggtg gtttgatcat ggttttggtg 6660tagtggtagt gcagtggtgg tattgtgact ggggatgtag ttgagaataa gtcatacaca 6720agtcagcttt cttcgagcct catataagta taagtagttc aacgtattag cactgtaccc 6780agcatctccg tatcgagaaa cacaacaaca tgccccattg gacagatcat gcggatacac 6840aggttgtgca gtatcataca tactcgatca gacaggtcgt ctgaccatca tacaagctga 6900acaagcgctc catacttgca cgctctctat atacacagtt aaattacata tccatagtct 6960aacctctaac agttaatctt ctggtaagcc tcccagccag ccttctggta tcgcttggcc 7020tcctcaatag gatctcggtt ctggccgtac agacctcggc cgacaattat gatatccgtt 7080ccggtagaca tgacatcctc aacagttcgg tactgctgtc cgagagcgtc tcccttgtcg 7140tcaagaccca ccccgggggt cagaataagc cagtcctcag agtcgccctt aggtcggttc 7200tgggcaatga agccaaccac aaactcgggg tcggatcggg caagctcaat ggtctgcttg 7260gagtactcgc cagtggccag agagcccttg caagacagct cggccagcat gagcagacct 7320ctggccagct tctcgttggg agaggggact aggaactcct tgtactggga gttctcgtag 7380tcagagacgt cctccttctt ctgttcagag acagtttcct cggcaccagc tcgcaggcca 7440gcaatgattc cggttccggg tacaccgtgg gcgttggtga tatcggacca ctcggcgatt 7500cggtgacacc ggtactggtg cttgacagtg ttgccaatat ctgcgaactt tctgtcctcg 7560aacaggaaga aaccgtgctt aagagcaagt tccttgaggg ggagcacagt gccggcgtag 7620gtgaagtcgt caatgatgtc gatatgggtt ttgatcatgc acacataagg tccgacctta 7680tcggcaagct caatgagctc cttggtggtg gtaacatcca gagaagcaca caggttggtt 7740ttcttggctg ccacgagctt gagcactcga gcggcaaagg cggacttgtg gacgttagct 7800cgagcttcgt aggagggcat tttggtggtg aagaggagac tgaaataaat ttagtctgca 7860gaacttttta tcggaacctt atctggggca gtgaagtata tgttatggta atagttacga 7920gttagttgaa cttatagata gactggacta tacggctatc ggtccaaatt agaaagaacg 7980tcaatggctc tctgggcgtc gcctttgccg acaaaaatgt gatcatgatg aaagccagca 8040atgacgttgc agctgatatt gttgtcggcc aaccgcgccg aaaacgcagc tgtcagaccc 8100acagcctcca acgaagaatg tatcgtcaaa gtgatccaag cacactcata gttggagtcg 8160tactccaaag gcggcaatga cgagtcagac 
agatactcgt cgaccttttc cttgggaacc 8220accaccgtca gcccttctga ctcacgtatt gtagccaccg acacaggcaa cagtccgtgg 8280atagcagaat atgtcttgtc ggtccatttc tcaccaactt taggcgtcaa gtgaatgttg 8340cagaagaagt atgtgccttc attgagaatc ggtgttgctg atttcaataa agtcttgaga 8400tcagtttggc cagtcatgtt gtggggggta attggattga gttatcgcct acagtctgta 8460caggtatact cgctgcccac tttatacttt ttgattccgc tgcacttgaa gcaatgtcgt 8520ttaccaaaag tgagaatgct ccacagaaca caccccaggg tatggttgag caaaaaataa 8580acactccgat acggggaatc gaaccccggt ctccacggtt ctcaagaagt attcttgatg 8640agagcgtatc gatcgaggaa gaggacaagc ggctgcttct taagtttgtg acatcagtat 8700ccaaggcacc attgcaagga ttcaaggctt tgaacccgtc atttgccatt cgtaacgctg 8760gtagacaggt tgatcggttc cctacggcct ccacctgtgt caatcttctc aagctgcctg 8820actatcagga cattgatcaa cttcggaaga aacttttgta tgccattcga tcacatgctg 8880gtttcgattt gtcttagagg aacgcatata cagtaatcat agagaataaa cgatattcat 8940ttattaaagt agatagttga ggtagaagtt gtaaagagtg ataaatagcg gccgctcact 9000gaatcttttt ggctcccttg tgctttcgga cgatgtaggt ctgcacgtag aagttgagga 9060acagacacag gacagtacca acgtagaagt agttgaaaaa ccagccaaac attctcattc 9120catcttgtcg gtagcaggga atgttccggt acttccagac gatgtagaag ccaacgttga 9180actgaatgat ctgcatagaa gtaatcaggg acttgggcat agggaacttg agcttgatca 9240gtcgggtcca atagtagccg tacatgatcc agtgaatgaa gccgttgagc agcacaaaga 9300tccaaacggc ttcgtttcgg tagttgtaga acagccacat gtccatagga gctccgagat 9360ggtgaaagaa ctgcaaccag gtcagaggct tgcccatgag gggcagatag aaggagtcaa 9420tgtactcgag gaacttgctg aggtagaaca gctgagtggt gattcggaag acattgttgt 9480cgaaagcctt ctcgcagttg tcggacatga caccaatggt gtacatggcg taggccatag 9540agaggaagga gcccagcgag tagatggaca tgagcaggtt gtagttggtg aacacaaact 9600tcattcgaga ctgacccttg ggtccgagag gaccaagggt gaacttcagg atgacgaagg 9660cgatggagag gtacagcacc tcgcagtgcg aggcatcaga ccagagctga gcatagtcga 9720ccttgggaag aacctcctgg ccaatggaga cgatttcgtt cacgacctcc atggttgatg 9780tgtgtttaat tcaagaatga atatagagaa gagaagaaga aaaaagattc aattgagccg 9840gcgatgcaga cccttatata aatgttgcct tggacagacg gagcaagccc gcccaaacct 
9900acgttcggta taatatgtta agctttttaa cacaaaggtt tggcttgggg taacctgatg 9960tggtgcaaaa gaccgggcgt tggcgagcca ttgcgcgggc gaatggggcc gtgactcgtc 10020tcaaattcga gggcgtgcct caattcgtgc ccccgtggct ttttcccgcc gtttccgccc 10080cgtttgcacc actgcagccg cttctttggt tcggacacct tgctgcgagc taggtgcctt 10140gtgctactta aaaagtggcc tcccaacacc aacatgacat gagtgcgtgg gccaagacac 10200gttggcgggg tcgcagtcgg ctcaatggcc cggaaaaaac gctgctggag ctggttcgga 10260cgcagtccgc cgcggcgtat ggatatccgc aaggttccat agcgccattg ccctccgtcg 10320gcgtctatcc cgcaacctct aaatagagcg ggaatataac ccaagcttct tttttttcct 10380ttaacacgca cacccccaac tatcatgttg ctgctgctgt ttgactctac tctgtggagg 10440ggtgctccca cccaacccaa cctacaggtg gatccggcgc tgtgattggc tgataagtct 10500cctatccgga ctaattctga ccaatgggac atgcgcgcag gacccaaatg ccgcaattac 10560gtaaccccaa cgaaatgcct acccctcttt ggagcccagc ggccccaaat ccccccaagc 10620agcccggttc taccggcttc catctccaag cacaagcagc ccggttctac cggcttccat 10680ctccaagcac ccctttctcc acaccccaca aaaagacccg tgcaggacat cctactgcgt 10740gtttaaacac cactaaaacc ccacaaaata tatcttaccg aatatacaga tctactatag 10800aggaacaatt gccccggaga agacggccag gccgcctaga tgacaaattc aacaactcac 10860agctgacttt ctgccattgc cactaggggg gggccttttt atatggccaa gccaagctct 10920ccacgtcggt tgggctgcac ccaacaataa atgggtaggg ttgcaccaac aaagggatgg 10980gatggggggt agaagatacg aggataacgg ggctcaatgg cacaaataag aacgaatact 11040gccattaaga ctcgtgatcc agcgactgac accattgcat catctaaggg cctcaaaact 11100acctcggaac tgctgcgctg atctggacac cacagaggtt ccgagcactt taggttgcac 11160caaatgtccc accaggtgca ggcagaaaac gctggaacag cgtgtacagt ttgtcttaac 11220aaaaagtgag ggcgctgagg tcgagcaggg tggtgtgact tgttatagcc tttagagctg 11280cgaaagcgcg tatggatttg gctcatcagg ccagattgag ggtctgtgga cacatgtcat 11340gttagtgtac ttcaatcgcc ccctggatat agccccgaca ataggccgtg gcctcatttt 11400tttgccttcc gcacatttcc attgctcggt acccacacct tgcttctcct gcacttgcca 11460accttaatac tggtttacat tgaccaacat cttacaagcg gggggcttgt ctagggtata 11520tataaacagt ggctctccca atcggttgcc agtctctttt ttcctttctt tccccacaga 
11580ttcgaaatct aaactacaca tcacacaatg cctgttactg acgtccttaa gcgaaagtcc 11640ggtgtcatcg tcggcgacga tgtccgagcc gtgagtatcc acgacaagat cagtgtcgag 11700acgacgcgtt ttgtgtaatg acacaatccg aaagtcgcta gcaacacaca ctctctacac 11760aaactaaccc agctctccat ggtgaaggct tctcgacagg ctctgcccct cgtcatcgac 11820ggaaaggtgt acgacgtctc cgcttgggtg aacttccacc ctggtggagc tgaaatcatt 11880gagaactacc agggacgaga tgctactgac gccttcatgg ttatgcactc tcaggaagcc 11940ttcgacaagc tcaagcgaat gcccaagatc aaccaggctt ccgagctgcc tccccaggct 12000gccgtcaacg aagctcagga ggatttccga aagctccgag aagagctgat cgccactggc 12060atgtttgacg cctctcccct ctggtactcg tacaagatct tgaccaccct gggtcttggc 12120gtgcttgcct tcttcatgct ggtccagtac cacctgtact tcattggtgc tctcgtgctc 12180ggtatgcact accagcaaat gggatggctg tctcatgaca tctgccacca ccagaccttc 12240aagaaccgaa actggaataa cgtcctgggt ctggtctttg gcaacggact ccagggcttc 12300tccgtgacct ggtggaagga cagacacaac gcccatcatt ctgctaccaa cgttcagggt 12360cacgatcccg acattgataa cctgcctctg ctcgcctggt ccgaggacga tgtcactcga 12420gcttctccca tctcccgaaa gctcattcag ttccaacagt actatttcct ggtcatctgt 12480attctcctgc gattcatctg gtgtttccag tctgtgctga ccgttcgatc cctcaaggac 12540cgagacaacc agttctaccg atctcagtac aagaaagagg ccattggact cgctctgcac 12600tggactctca agaccctgtt ccacctcttc tttatgccct ccatcctgac ctcgatgctg 12660gtgttctttg tttccgagct cgtcggtggc ttcggaattg ccatcgtggt cttcatgaac 12720cactaccctc tggagaagat cggtgattcc gtctgggacg gacatggctt ctctgtgggt 12780cagatccatg agaccatgaa cattcgacga ggcatcatta ctgactggtt ctttggaggc 12840ctgaactacc agatcgagca ccatctctgg cccaccctgc ctcgacacaa cctcactgcc 12900gtttcctacc aggtggaaca gctgtgccag aagcacaacc tcccctaccg aaaccctctg 12960ccccatgaag gtctcgtcat cctgctccga tacctgtccc agttcgctcg aatggccgag 13020aagcagcccg gtgccaaggc tcagtaagcg gccgcatgag aagataaata tataaataca 13080ttgagatatt aaatgcgcta gattagagag cctcatactg ctcggagaga agccaagacg 13140agtactcaaa ggggattaca ccatccatat ccacagacac aagctgggga aaggttctat 13200atacactttc cggaataccg tagtttccga tgttatcaat gggggcagcc aggatttcag 
13260gcacttcggt gtctcggggt gaaatggcgt tcttggcctc catcaagtcg taccatgtct 13320tcatttgcct gtcaaagtaa aacagaagca gatgaagaat gaacttgaag tgaaggaatt 13380taaatagttg gagcaaggga gaaatgtaga gtgtgaaaga ctcactatgg tccgggctta 13440tctcgaccaa tagccaaagt ctggagtttc tgagagaaaa aggcaagata cgtatgtaac 13500aaagcgacgc atggtacaat aataccggag gcatgtatca tagagagtta gtggttcgat 13560gatggcactg gtgcctggta tgactttata cggctgacta catatttgtc ctcagacata 13620caattacagt caagcactta cccttggaca tctgtaggta ccccccggcc aagacgatct 13680cagcgtgtcg tatgtcggat tggcgtagct ccctcgctcg tcaattggct cccatctact 13740ttcttctgct tggctacacc cagcatgtct gctatggctc gttttcgtgc cttatctatc 13800ctcccagtat taccaactct aaatgacatg atgtgattgg gtctacactt tcatatcaga 13860gataaggagt agcacagttg cataaaaagc ccaactctaa tcagcttctt cctttcttgt 13920aattagtaca aaggtgatta gcgaaatctg gaagcttagt tggccctaaa aaaatcaaaa 13980aaagcaaaaa acgaaaaacg aaaaaccaca gttttgagaa cagggaggta acgaaggatc 14040gtatatatat atatatatat atatacccac ggatcccgag accggccttt gattcttccc 14100tacaaccaac cattctcacc accctaattc acaaccatgg gcgtattcat taaacaggag 14160cagcttccgg ctctcaagaa gtacaagtac tccgccgagg atcactcgtt catctccaac 14220aacattctgc gccccttctg gcgacagttt gtcaaaatct tccctctgtg gatggccccc 14280aacatggtga ctctgctggg cttcttcttt gtcattgtga acttcatcac catgctcatt 14340gttgatccca cccacgaccg cgagcctccc agatgggtct acctcaccta cgctctgggt 14400ctgttccttt accagacatt tgatgcctgt gacggatccc atgcccgacg aactggccag 14460agtggacccc ttggagagct gtttgaccac tgtgtcgacg ccatgaatac ctctctgatt 14520ctcacggtgg tggtgtccac cacccatatg ggatataaca tgaagctact gattgtgcag 14580attgccgctc tcggaaactt ctacctgtcg acctgggaga cctaccatac cggaactctg 14640tacctttctg gcttctctgg tcctgttgaa ggtatcttga ttctggtggc tcttttcgtc 14700ctcaccttct tcactggtcc caacgtgtac gctctgaccg tctacgaggc tcttcccgag 14760tccatcactt cgctgctgcc tgccagcttc ctggacgtca ccatcaccca gatctacatt 14820ggattcggag tgctgggcat ggtgttcaac atctacggcg cctgcggaaa cgtgatcaag 14880tactacaaca acaagggcaa gagcgctctc cccgccattc tcggaatcgc cccctttggc 
14940atcttctacg tcggcgtctt tgcctgggcc catgttgctc ctctgcttct ctccaagtac 15000gccatcgtct atctgtttgc cattggggct gcctttgcca tgcaagtcgg ccagatgatt 15060cttgcccatc tcgtgcttgc tccctttccc cactggaacg tgctgctctt cttccccttt 15120gtgggactgg cagtgcacta cattgcaccc gtgtttggct gggacgccga tatcgtgtcg 15180gttaacactc tcttcacctg ttttggcgcc accctctcca tttacgcctt ctttgtgctt 15240gagatcatcg acgagatcac caactacctc gatatctggt gtctgcgaat caagtaccct 15300caggagaaga agaccgaata agcggccgca tggagcgtgt gttctgagtc gatgttttct 15360atggagttgt gagtgttagt agacatgatg ggtttatata tgatgaatga atagatgtga 15420ttttgatttg cacgatggaa ttgagaactt tgtaaacgta catgggaatg tatgaatgtg 15480ggggttttgt gactggataa ctgacggtca gtggacgccg ttgttcaaat atccaagaga 15540tgcgagaaac tttgggtcaa gtgaacatgt cctctctgtt caagtaaacc atcaactatg 15600ggtagtatat ttagtaagga caagagttga gattctttgg agtcctagaa acgtattttc 15660gcgttccaag atcaaattag tagagtaata cgggcacggg aatccattca tagtctcaat 15720cctgcaggtg agttaattaa tcgagcttgg cgtaatcatg gtcatagctg tttcctgtgt 15780gaaattgtta tccgctcaca attccacaca ac 15812
56 7966 DNA Artificial Sequence Plasmid pYPS161
56 aaatgtaacg aaactgaaat ttgaccagat attgtgtccg cggtggagct ccagcttttg 60ttccctttag tgagggttaa tttcgagctt ggcgtaatca tggtcatagc tgtttcctgt 120gtgaaattgt tatccgctca caagcttcca cacaacgtac gttctggttg gctcggatga 180tttctgcggc cccagcgtaa ggcaggcgtt ccgtccggat cggtttgggt cggatcggct 240ttttgattgt cgtattgtcg ctcatgttgg acctggtgtg tagttgtagt gtcagatcag 300attcaccagc gaatgcatgt gaacttcccc acattttgag ccgaggcaga tttgggttgc 360ttagtaagca gacgtggcgt tgcaagtaga tgtggcaaat ggggacgaag attccgaggg 420gatatcatag ttccaagggg atgtcatcat ttgccagctt tcgccgccac ttttgacgag 480tttttgtggg tcaaataagt ttagttgaac ttttcaaatt tcagttggca ttttgttaat 540agaaagggtg ccggtgctgg ggggttcatt cctcgggttg cagatatcct atctgtctta 600ggggtatctc tttcaatcga caagatgtag ttgggtaaca attatttatt aatattctct 660ccatccagta cagtactaac atcttgacat ctcagcacaa gtgcatcttc ccaagtgttt 720gttggagagg ttgttgggta ttacttagga aacagaacac agtacgtgga gatcttggat 780acatcgtaca 
tggaggttat ccataaaaaa gaccctccag gactagttac aatgccgtta 840gatgaggaaa tccacaaccc tgattcacta tgaacatatt atcttccccc aaacttgcga 900tatatggccc ttgatgatag ccttgatttt acccttgatg gtacctccac gaccaaccga 960tctgctgttt gaagagatat tttcaaattt gaagtgctca gatctactaa acatgagtcc 1020agtaattctt tccgtctttc cgatttccga tattcccttt tttagcccga cttttcactg 1080ctcccatgtc aaacgattag gacttgggag acaatcccac tgtcaaaatc accccgatat 1140tctctgtaaa acaagtactt cttccacgtg atcttcaaat acctcttcca cgtgaccttc 1200aaatacctct tcaagtacct cttccacgcg accttcaaag tcccttcaaa tacccttctc 1260aattctcccc ttctcctcca tagtccttct ctctgactaa gcttgagaat acatgacgct 1320aagacgaaaa cacactagag accctgagag cctgaacatg catccactct gcagttgcgc 1380acgtgcctac agcaactatc gggtccagtg ctggatctga cactgcgtct ccctatgaag 1440aaactgataa acagatctgc actcataaca atgatctgag cgatgaaaac gtgacctcca 1500cagccacaag tcataatcgg cgcgccagct gcattaatga atcggccaac gcgcggggag 1560aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 1620cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 1680atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 1740taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 1800aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 1860tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 1920gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 1980cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 2040cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 2100atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 2160tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat 2220ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 2280acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 2340aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 2400aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 2460tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 
tatgagtaaa cttggtctga 2520cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 2580catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 2640ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 2700aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 2760ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 2820caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 2880attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 2940agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 3000actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 3060ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 3120ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 3180gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 3240atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 3300cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 3360gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 3420gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 3480ggttccgcgc acatttcccc gaaaagtgcc acctgatgcg gtgtgaaata ccgcacagat 3540gcgtaaggag aaaataccgc atcaggaaat tgtaagcgtt aatattttgt taaaattcgc 3600gttaaatttt tgttaaatca gctcattttt taaccaatag gccgaaatcg gcaaaatccc 3660ttataaatca aaagaataga ccgagatagg gttgagtgtt gttccagttt ggaacaagag 3720tccactatta aagaacgtgg actccaacgt caaagggcga aaaaccgtct atcagggcga 3780tggcccacta cgtgaaccat caccctaatc aagttttttg gggtcgaggt gccgtaaagc 3840actaaatcgg aaccctaaag ggagcccccg atttagagct tgacggggaa agccggcgaa 3900cgtggcgaga aaggaaggga agaaagcgaa aggagcgggc
gctagggcgc tggcaagtgt 3960agcggtcacg ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc tacagggcgc 4020gtccattcgc cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg 4080ctattacgcc agctggcgaa agggggatgt gctgcaaggc gattaagttg ggtaacgcca 4140gggttttccc agtcacgacg ttgtaaaacg acggccagtg aattgtaata cgactcacta 4200tagggcgaat tgggcccgac gtcgcatgca actattagtg aggcttcggg agtggttgtc 4260tcggttgtct cattcagact cgttgtgttg tatctatatc tatataaaca ctcttgtccc 4320tcaatcccac tgccatcttt tgctaaactt gccgccaata tgaaactcat ctccctcatc 4380accgtcgcta ccaccgctct ggcggctgtc ggagacaagt acaagctgac ctataccaga 4440tcagacgccc aatcggtcga atctctgccc gtcacctacc aagatgacct gatcaccgcc 4500tccaccgacg gcgaacccat caccatcacc gagggcgagg gcaacacctt ctctgttaac 4560gacatgccca tcgcctatct ggagctgcag gctttgttct ggaccggcga ctacggctac 4620aagctccagg gctcggtctt tgacattgcc gccgatggaa cctttgagct gagagacggc 4680cccaaggagt actactattg cactcctcac cctgagcgaa acgtcatcta cgtcatcaac 4740agccccgact actccaagtg tcggttcaag cgtaccatca agttccacgc tgaaaagatc 4800taagtggtaa tcgaccgact aaccattttt agctgacaaa cacttgctaa ctcctataac 4860gaatgaatga ctaacttggc atattgttac caagtattac ttgggatata gttgagtgta 4920accattgcta agaatccaaa ctggagcttc taaaggtctg ggagtcgccg tatgtgttca 4980tatcgaaatc aaagaaatca taatcgcaac agaattcaaa atcaagcaga ttaatatcca 5040ttattgtact cggatcgtga catatctgat atgatctcgg atatgatctc tgactgttta 5100ctgggagatt tgttgaagat ttgttgaggt tatctgaaaa gtagacaata gagacaaaat 5160gacgatatca agaactgaat cgggccgaaa tactcggtat cattcccttc agcagtaact 5220gtattgctct atcaatgcga cgagatacct ccacaattaa tactgtatac gctctaccac 5280tcatatctcc aatgctaaaa tatattcatg cccaggacct ctgtgcactg ctatgcagca 5340cagtgttgtc gattgaattg gtcgtgtctg gtccctgatg ctctgtgtct cgctgactag 5400tccttccatc cagacctcgt cattatctga taggcaacaa gttctgctct ctcacaccct 5460gccgacacaa gggacactcg ggcttctctc tcacccattc ggaaatacag tccttaatta 5520agttgcgaca catgtcttga tagtatcttg aattctctct cttgagcttt tccataacaa 5580gttcttctgc ctccaggaag tccatgggtg gtttgatcat ggttttggtg tagtggtagt 5640gcagtggtgg 
tattgtgact ggggatgtag ttgagaataa gtcatacaca agtcagcttt 5700cttcgagcct catataagta taagtagttc aacgtattag cactgtaccc agcatctccg 5760tatcgagaaa cacaacaaca tgccccattg gacagatcat gcggatacac aggttgtgca 5820gtatcataca tactcgatca gacaggtcgt ctgaccatca tacaagctga acaagcgctc 5880catacttgca cgctctctat atacacagtt aaattacata tccatagtct aacctctaac 5940agttaatctt ctggtaagcc tcccagccag ccttctggta tcgcttggcc tcctcaatag 6000gatctcggtt ctggccgtac agacctcggc cgacaattat gatatccgtt ccggtagaca 6060tgacatcctc aacagttcgg tactgctgtc cgagagcgtc tcccttgtcg tcaagaccca 6120ccccgggggt cagaataagc cagtcctcag agtcgccctt aggtcggttc tgggcaatga 6180agccaaccac aaactcgggg tcggatcggg caagctcaat ggtctgcttg gagtactcgc 6240cagtggccag agagcccttg caagacagct cggccagcat gagcagacct ctggccagct 6300tctcgttggg agaggggact aggaactcct tgtactggga gttctcgtag tcagagacgt 6360cctccttctt ctgttcagag acagtttcct cggcaccagc tcgcaggcca gcaatgattc 6420cggttccggg tacaccgtgg gcgttggtga tatcggacca ctcggcgatt cggtgacacc 6480ggtactggtg cttgacagtg ttgccaatat ctgcgaactt tctgtcctcg aacaggaaga 6540aaccgtgctt aagagcaagt tccttgaggg ggagcacagt gccggcgtag gtgaagtcgt 6600caatgatgtc gatatgggtt ttgatcatgc acacataagg tccgacctta tcggcaagct 6660caatgagctc cttggtggtg gtaacatcca gagaagcaca caggttggtt ttcttggctg 6720ccacgagctt gagcactcga gcggcaaagg cggacttgtg gacgttagct cgagcttcgt 6780aggagggcat tttggtggtg aagaggagac tgaaataaat ttagtctgca gaacttttta 6840tcggaacctt atctggggca gtgaagtata tgttatggta atagttacga gttagttgaa 6900cttatagata gactggacta tacggctatc ggtccaaatt agaaagaacg tcaatggctc 6960tctgggcgtc gcctttgccg acaaaaatgt gatcatgatg aaagccagca atgacgttgc 7020agctgatatt gttgtcggcc aaccgcgccg aaaacgcagc tgtcagaccc acagcctcca 7080acgaagaatg tatcgtcaaa gtgatccaag cacactcata gttggagtcg tactccaaag 7140gcggcaatga cgagtcagac agatactcgt cgaccttttc cttgggaacc accaccgtca 7200gcccttctga ctcacgtatt gtagccaccg acacaggcaa cagtccgtgg atagcagaat 7260atgtcttgtc ggtccatttc tcaccaactt taggcgtcaa gtgaatgttg cagaagaagt 7320atgtgccttc attgagaatc ggtgttgctg atttcaataa 
agtcttgaga tcagtttggc 7380cagtcatgtt gtggggggta attggattga gttatcgcct acagtctgta caggtatact 7440cgctgcccac tttatacttt ttgattccgc tgcacttgaa gcaatgtcgt ttaccaaaag 7500tgagaatgct ccacagaaca caccccaggg tatggttgag caaaaaataa acactccgat 7560acggggaatc gaaccccggt ctccacggtt ctcaagaagt attcttgatg agagcgtatc 7620gatgagccta aaatgaaccc gagtatatct cataaaattc tcggtgagag gtctgtgact 7680gtcagtacaa ggtgccttca ttatgccctc aaccttacca tacctcactg aatgtagtgt 7740acctctaaaa atgaaataca gtgccaaaag ccaaggcact gagctcgtct aacggacttg 7800atatacaacc aattaaaaca aatgaaaaga aatacagttc tttgtatcat ttgtaacaat 7860taccctgtac aaactaaggt attgaaatcc cacaatattc ccaaagtcca cccctttcca 7920aattgtcatg cctacaactc atataccaag cactaaccta ccgttt 7966
57 20 DNA Artificial Sequence Primer Pex-10del1 3'.Forward
57 ccaacatgag cgacaatacg 20
58 20 DNA Artificial Sequence Primer Pex-10del2 5'.Reverse
58 caagttctgc tctctcacac 20
59 8673 DNA Artificial Sequence Plasmid pYRH13
59 taagcgattg atgattggaa acacacacat gggttatatc taggtgagag ttagttggac 60agttatatat taaatcagct atgccaacgg taacttcatt catgtcaacg aggaaccagt 120gactgcaagt aatatagaat ttgaccacct tgccattctc ttgcactcct ttactatatc 180tcatttattt cttatataca aatcacttct tcttcccagc atcgagctcg gaaacctcat 240gagcaataac atcgtggatc tcgtcaatag agggcttttt ggactccttg ctgttggcca 300ccttgtcctt gctgtctggc tcattctgtt tcaacgcctt tcgcgccaga ccatcaacct 360tgttgagctc tccgtcagca gcctcgacca gatcatcaaa accagaaccc ttggctcgag 420ttcgggcttc tcgaagcttg tctttagcct cttcataatc gcccttcttg atagcaatca 480caccgactcc atatgtgcat agagcctggg cctcctcgac ttccttggtc cgtcggacat 540cgggctcaag agaaggaatg gccttgagaa cacgcttgta acatgactcg gatcgagcca 600gggcgttatt actgctcgtc ttcattgtgt ccagaggaat ctcgccgcct gtgtcagctt 660tgatggtggt gccctcgttc ttttcggcag tgtgaacaat cacctccagc tgttcagaca 720tgaggtagaa catggaggct aggttggctt gggctaacaa cagatctccc actccacatc 780cggaagcaag catgatctga taagtgattt gcttctctct gagagcaacg ttggcgaggg 840cgtcagagag gttgtgagtt gtgagcacat cacgagcagc aataagctcg tctctgaagg 900gcatccaggc gtcgtaattg ccggaagcac gcagcagacg 
agcatgagac gcacttttag 960tcagctgggt catgaactcc cgctcgctct gtgtcggggg cgtgctggcg agtttcagca 1020gatctgtggc ctcggggcac cgtcgacaga cctcttcttg agccagcagg atctgcagca 1080gtagcgctcg tgataccaca tcatttttct cggttccaga aatgtgagcg agcttgagag 1140cgatccgcag acctctctgg atcacctggg gccggacatc ctgggcgatt ttgttattct 1200ggaaggcgtc aacgtaggca gcacaaatct ccatgtacac gtcgtgggca gcgtccgggt 1260agttgagcat ctcgtagatc tctgccagtt tgagctggat gcctgtgtat tcgtccgaca 1320agggagacag gccttgggcc tcggcctcca taagtgcctc aatgtaatac ttgacggcat 1380gcgacgtcgg gcccaattcg ccctatagtg agtcgtatta caattcactg gccgtcgttt 1440tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 1500cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 1560tgcgcagcct gaatggcgaa tggacgcgcc ctgtagcggc gcattaagcg cggcgggtgt 1620ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ctagcgcccg ctcctttcgc 1680tttcttccct tcctttctcg ccacgttcgc cggctttccc cgtcaagctc taaatcgggg 1740gctcccttta gggttccgat ttagtgcttt acggcacctc gaccccaaaa aacttgatta 1800gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc ctttgacgtt 1860ggagtccacg ttctttaata gtggactctt gttccaaact ggaacaacac tcaaccctat 1920ctcggtctat tcttttgatt tataagggat tttgccgatt tcggcctatt ggttaaaaaa 1980tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgc ttacaatttc 2040ctgatgcggt attttctcct tacgcatctg tgcggtattt cacaccgcat caggtggcac 2100ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac attcaaatat 2160gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa aaaggaagag 2220tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat tttgccttcc 2280tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc agttgggtgc 2340acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga gttttcgccc 2400cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg cggtattatc 2460ccgtattgac gccgggcaag agcaactcgg tcgccgcata cactattctc agaatgactt 2520ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag taagagaatt 2580atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc tgacaacgat 2640cggaggaccg 
aaggagctaa ccgctttttt gcacaacatg ggggatcatg taactcgcct 2700tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg acaccacgat 2760gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac ttactctagc 2820ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac cacttctgcg 2880ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg agcgtgggtc 2940tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg tagttatcta 3000cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg agataggtgc 3060ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac tttagattga 3120tttaaaactt catttttaat ttaaaaggat ctaggtgaag atcctttttg ataatctcat 3180gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg tcagaccccg tagaaaagat 3240caaaggatct tcttgagatc ctttttttct gcgcgtaatc tgctgcttgc aaacaaaaaa 3300accaccgcta ccagcggtgg tttgtttgcc ggatcaagag ctaccaactc tttttccgaa 3360ggtaactggc ttcagcagag cgcagatacc aaatactgtt cttctagtgt agccgtagtt 3420aggccaccac ttcaagaact ctgtagcacc gcctacatac ctcgctctgc taatcctgtt 3480accagtggct gctgccagtg gcgataagtc gtgtcttacc gggttggact caagacgata 3540gttaccggat aaggcgcagc ggtcgggctg aacggggggt tcgtgcacac agcccagctt 3600ggagcgaacg acctacaccg aactgagata cctacagcgt gagctatgag aaagcgccac 3660gcttcccgaa gggagaaagg cggacaggta tccggtaagc ggcagggtcg gaacaggaga 3720gcgcacgagg gagcttccag ggggaaacgc ctggtatctt tatagtcctg tcgggtttcg 3780ccacctctga cttgagcgtc gatttttgtg atgctcgtca ggggggcgga gcctatggaa 3840aaacgccagc aacgcggcct ttttacggtt cctggccttt tgctggcctt ttgctcacat 3900gttctttcct gcgttatccc ctgattctgt ggataaccgt attaccgcct ttgagtgagc 3960tgataccgct cgccgcagcc gaacgaccga gcgcagcgag tcagtgagcg aggaagcgga 4020agagcgccca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 4080gcgcgccggt ttctgtctct cgtcgtgtca cagatggtgt tgttgttgat gagttcctgg 4140ttgccctgtt tcgcacaagg tggtgcgtga ggttgtgtgg agaggggctt gaaggagggg 4200ggtcgaggtg caggagcgtc ccccgagggg ccctaggccg tcacatgacc ggcataatgg 4260tgtggagtcg ggttttggtt ttcctggcgg gttccacact tgtcaagtct cgtttttcag 4320gctttttttc actcgctctt tttgcacttt ggcatctttt 
tacctttggt gcttaccacc 4380tttgtatgca ggaaatctat tgggtttggt gtataggtga aaaaaaaaaa gccaaaggtg 4440actgtttttt tccgactcgg tcatgttgca ttttgtgcga tattataagt ggggaacgaa 4500tggaggcgag ctggtgtgat acgggagctg ctgtttctca cgattctgcc cagccattta 4560tcacgcgcac gctgacatct tgcacttagt catcaagagc tacagtacga cgagtacata 4620ctagagccaa ccactcctga agtgcttcca tgagttcagt tgagtgctga accaactctc 4680gacactctcg acagcctgtg aaaaggaatg agtgtgtgga aagggattca atactggaga 4740agagagggga gagatcgaga gggtgatgtt acatccccaa gcgtcgtagt ctcgcgttga 4800tgactggaac ggactgttga acgacgatca acatggtgtg caagctgatg gacagttggg 4860ccaatggttc agaagcgtta gttgagcttc taacgaccta ctactcgcct gtcaagtgag 4920gtgtgtactt gttcatactc ctactcgtct cactggcgtc tagggttgtg agcaccgtcg 4980cttatgaaag acgccgtcgc ctatgaaaga caccgtcgct cattgaagac tagatccata 5040atataaacaa aagagtattt ctctgaatgg cgacggattg gccagcccca tcgttacaca 5100atttgtccaa aaacaccatc tctgccgtcc atcgatatct ttcgaaatca tccggaccag 5160acagtagagc tttgagaacc ccgaaggagg aatactgcag tgaagtgttc tttgaaactc 5220tgactggagt atctccattt ctatatctcc attagtaatc actccaaaca gatgtcttcc 5280agcttgagtc agccgagacc acggtcacgt atggtgattc cttcaaacat ataactccat 5340tgacctaaca agacactggc agttgtaaat acgtaaatac attcttgatg taagttttaa 5400tctgattgga gactcttctg agtaacacac tctcttccaa gcagtcattt tggccttttt 5460ttcttccaaa cccgtctcga ttactcatca ggttttatct gagaaccaaa acgtctcaat 5520cattgacata ttgtaccatc aactctgtaa aaacttgaca gatgtgctac ttgtgtcatt 5580atgaatcgat tttccaaata tccattatca ttatcccatt tcttccccga tatcacctcc 5640ccatctacca cctccattta ccaaccacca tgctcagtaa tcagaaactc ctcttcacag 5700accacaattg ccaataattg accaccaaaa gtcgtaccat gtgtttctcc ggtgaccagg 5760tctcgctttc acccatttat tccctcaaaa acacccctac agtaatttca gcgcctttcc 5820atcaaactcc atacttgcaa caaaatcaca atggccccct gcctaaacta cgcccgccca 5880taattgagta tatttgtatg acaatcccgc tcgaaatttg gcccacttgt tccccgagct 5940ccaaatattc actattcacc ttcacctcgt gcccaccctg gccccccaat gccccccgtg 6000ctcgtaacgt ctccctcccc cacaccccac acacgtgaca taaagtgtaa agtgcgagta 6060cccgtacgtt 
gtgtggaagc ttgtgagcgg ataacaattt cacacaggaa acagctatga 6120ccatgattac gccaagctcg aaattaaccc tcactaaagg gaacaaaagc tggagctcca 6180ccgcggacac aatatctggt caaatttcag tttcgttaca tttaaacggt aggttagtgc 6240ttggtatatg agttgtaggc atgacaattt ggaaaggggt ggactttggg aatattgtgg 6300gatttcaata ccttagtttg tacagggtaa ttgttacaaa tgatacaaag aactgtattt 6360cttttcattt gttttaattg gttgtatatc aagtccgtta gacgagctca gtgccttggc 6420ttttggcact gtatttcatt tttagaggta cactacattc agtgaggtat ggtaaggttg 6480agggcataat gaaggcacct tgtactgaca gtcacagacc tctcaccgag aattttatga 6540gatatactcg ggttcatttt aggctcatcg atacgctctc atcaagaata cttcttgaga 6600accgtggaga ccggggttcg attccccgta tcggagtgtt tattttttgc tcaaccatac 6660cctggggtgt gttctgtgga gcattctcac ttttggtaaa cgacattgct tcaagtgcag 6720cggaatcaaa aagtataaag tgggcagcga gtatacctgt acagactgta ggcgataact 6780caatccaatt accccccaca acatgactgg ccaaactgat ctcaagactt tattgaaatc 6840agcaacaccg attctcaatg aaggcacata cttcttctgc aacattcact tgacgcctaa 6900agttggtgag aaatggaccg acaagacata ttctgctatc cacggactgt tgcctgtgtc 6960ggtggctaca atacgtgagt cagaagggct gacggtggtg gttcccaagg aaaaggtcga 7020cgagtatctg tctgactcgt cattgccgcc tttggagtac gactccaact atgagtgtgc 7080ttggatcact ttgacgatac attcttcgtt ggaggctgtg ggtctgacag ctgcgttttc 7140ggcgcggttg gccgacaaca atatcagctg caacgtcatt gctggctttc atcatgatca 7200catttttgtc ggcaaaggcg acgcccagag agccattgac gttctttcta atttggaccg 7260atagccgtat agtccagtct atctataagt tcaactaact cgtaactatt accataacat 7320atacttcact gccccagata aggttccgat aaaaagttct gcagactaaa tttatttcag 7380tctcctcttc accaccaaaa tgccctccta cgaagctcga gctaacgtcc acaagtccgc 7440ctttgccgct cgagtgctca agctcgtggc agccaagaaa accaacctgt gtgcttctct 7500ggatgttacc accaccaagg agctcattga gcttgccgat aaggtcggac cttatgtgtg 7560catgatcaaa acccatatcg acatcattga cgacttcacc tacgccggca ctgtgctccc 7620cctcaaggaa cttgctctta agcacggttt cttcctgttc gaggacagaa agttcgcaga 7680tattggcaac actgtcaagc accagtaccg gtgtcaccga atcgccgagt ggtccgatat 7740caccaacgcc cacggtgtac ccggaaccgg aatcattgct 
ggcctgcgag ctggtgccga 7800ggaaactgtc tctgaacaga agaaggagga cgtctctgac tacgagaact cccagtacaa 7860ggagttccta gtcccctctc ccaacgagaa gctggccaga ggtctgctca tgctggccga 7920gctgtcttgc aagggctctc tggccactgg cgagtactcc aagcagacca ttgagcttgc 7980ccgatccgac cccgagtttg tggttggctt cattgcccag aaccgaccta agggcgactc 8040tgaggactgg cttattctga cccccggggt gggtcttgac gacaagggag acgctctcgg 8100acagcagtac cgaactgttg aggatgtcat gtctaccgga acggatatca taattgtcgg 8160ccgaggtctg tacggccaga accgagatcc tattgaggag gccaagcgat accagaaggc 8220tggctgggag gcttaccaga agattaactg ttagaggtta gactatggat atgtaattta 8280actgtgtata tagagagcgt gcaagtatgg agcgcttgtt cagcttgtat gatggtcaga 8340cgacctgtct gatcgagtat gtatgatact gcacaacctg tgtatccgca tgatctgtcc 8400aatggggcat gttgttgtgt ttctcgatac ggagatgctg ggtacagtgc taatacgttg 8460aactacttat acttatatga ggctcgaaga aagctgactt gtgtatgact tattctcaac 8520tacatcccca gtcacaatac caccactgca ctaccactac accaaaacca tgatcaaacc 8580acccatggac ttcctggagg cagaagaact tgttatggaa aagctcaaga gagagaattc 8640aagatactat caagacatgt gtcgcaactt aat 86736038DNAArtificial SequencePrimer PEX16Fii 60ccaaccagat caccacccac tacaccttcc aggaaccc 386134DNAArtificial SequencePrimer PEX16Rii 61ctggtagaac tcgcctcgga acaaccacca tccc 346234DNAArtificial SequencePrimer 3UTR-URA3 62gagagaattc aagatactat caagacatgt gtcg 346333DNAArtificial SequencePrimer Pex16-conf 63cacaccttca ccccggaagt cgccaccatt ctg 336420DNAArtificial SequenceReal time PCR primer ef-324F 64cgactgtgcc atcctcatca 206521DNAArtificial SequenceReal time PCR primer ef-392R 65tgaccgtcct tggagatacc a 216618DNAArtificial SequenceReal time PCR primer Pex16-741F 66gggagtggtg gccgagtt 186721DNAArtificial SequenceReal time PCR primer Pex16-802R 67ggaaaagcaa gcatgcgtag a 216821DNAArtificial SequenceNucleotide portion of primer ef-345T 68tgctggtggt gttggtgagt t 216921DNAArtificial SequenceNucleotide portion of TaqMan probe Pex16-760T 69ctgtccattc tgcgacccct c 21704313DNAArtificial SequencePlasmid pZKUM 70taatcgagct tggcgtaatc atggtcatag ctgtttcctg 
tgtgaaattg ttatccgctc 60acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 120gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 180tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg 240cgctcttccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg 300gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga 360aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg 420gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag 480aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc 540gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg 600ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt 660cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc 720ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc 780actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg 840tggcctaact acggctacac tagaaggaca gtatttggta tctgcgctct gctgaagcca 900gttaccttcg gaaaaagagt
tggtagctct tgatccggca aacaaaccac cgctggtagc 960ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat 1020cctttgatct tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt 1080ttggtcatga gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt 1140tttaaatcaa tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc 1200agtgaggcac ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc 1260gtcgtgtaga taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata 1320ccgcgagacc cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg 1380gccgagcgca gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc 1440cgggaagcta gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct 1500acaggcatcg tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa 1560cgatcaaggc gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt 1620cctccgatcg ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca 1680ctgcataatt ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac 1740tcaaccaagt cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca 1800atacgggata ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt 1860tcttcggggc gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc 1920actcgtgcac ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca 1980aaaacaggaa ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata 2040ctcatactct tcctttttca atattattga agcatttatc agggttattg tctcatgagc 2100ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 2160cgaaaagtgc cacctgacgc gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt 2220acgcgcagcg tgaccgctac acttgccagc gccctagcgc ccgctccttt cgctttcttc 2280ccttcctttc tcgccacgtt cgccggcttt ccccgtcaag ctctaaatcg ggggctccct 2340ttagggttcc gatttagtgc tttacggcac ctcgacccca aaaaacttga ttagggtgat 2400ggttcacgta gtgggccatc gccctgatag acggtttttc gccctttgac gttggagtcc 2460acgttcttta atagtggact cttgttccaa actggaacaa cactcaaccc tatctcggtc 2520tattcttttg atttataagg gattttgccg atttcggcct attggttaaa aaatgagctg 2580atttaacaaa aatttaacgc gaattttaac aaaatattaa cgcttacaat 
ttccattcgc 2640cattcaggct gcgcaactgt tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc 2700agctggcgaa agggggatgt gctgcaaggc gattaagttg ggtaacgcca gggttttccc 2760agtcacgacg ttgtaaaacg acggccagtg aattgtaata cgactcacta tagggcgaat 2820tgggtaccgg gccccccctc gaggtcgacg agtatctgtc tgactcgtca ttgccgcctt 2880tggagtacga ctccaactat gagtgtgctt ggatcacttt gacgatacat tcttcgttgg 2940aggctgtggg tctgacagct gcgttttcgg cgcggttggc cgacaacaat atcagctgca 3000acgtcattgc tggctttcat catgatcaca tttttgtcgg caaaggcgac gcccagagag 3060ccattgacgt tctttctaat ttggaccgat agccgtatag tccagtctat ctataagttc 3120aactaactcg taactattac cataacatat acttcactgc cccagataag gttccgataa 3180aaagttctgc agactaaatt tatttcagtc tcctcttcac caccaaaatg ccctcctacg 3240aagctcgagt gctcaagctc gtggcagcca agaaaaccaa cctgtgtgct tctctggatg 3300ttaccaccac caaggagctc attgagcttg ccgataaggt cggaccttat gtgtgcatga 3360tcaaaaccca tatcgacatc attgacgact tcacctacgc cggcactgtg ctccccctca 3420aggaacttgc tcttaagcac ggtttcttcc tgttcgagga cagaaagttc gcagatattg 3480gcaacactgt caagcaccag taccggtgtc accgaatcgc cgagtggtcc gatatcacca 3540acgcccacgg tgtacccgga accggaatcg attgctggcc tgcgagctgg tgcgtacgag 3600gaaactgtct ctgaacagaa gaaggaggac gtctctgact acgagaactc ccagtacaag 3660gagttcctag tcccctctcc caacgagaag ctggccagag gtctgctcat gctggccgag 3720ctgtcttgca agggctctct ggccactggc gagtactcca agcagaccat tgagcttgcc 3780cgatccgacc ccgagtttgt ggttggcttc attgcccaga accgacctaa gggcgactct 3840gaggactggc ttattctgac ccccggggtg ggtcttgacg acaagggaga cgctctcgga 3900cagcagtacc gaactgttga ggatgtcatg tctaccggaa cggatatcat aattgtcggc 3960cgaggtctgt acggccagaa ccgagatcct attgaggagg ccaagcgata ccagaaggct 4020ggctgggagg cttaccagaa gattaactgt tagaggttag actatggata tgtaatttaa 4080ctgtgtatat agagagcgtg caagtatgga gcgcttgttc agcttgtatg atggtcagac 4140gacctgtctg atcgagtatg tatgatactg cacaacctgt gtatccgcat gatctgtcca 4200atggggcatg ttgttgtgtt tctcgatacg gagatgctgg gtacagtgct aatacgttga 4260actacttata cttatatgag gctcgaagaa agctgacttg tgtatgactt aat 43137115966DNAArtificial 
SequencePlasmid pZKD2-5U89A2 71gtacgtttca tgaaggcggg cagaaagtac tcgatggtgg agatgattgc tcggaggtac 60ttgttctgcg gccagtatct ctcagcaatc aggtgatact cctggacgtc cagagggtag 120tatgtgtgcg tgggctccag atccaccgtc ttgtgcagag ttatggggaa gtagcggcca 180aagagcttcc agatgaagaa gtttcttgaa ataggcgagt atcgcttgac cactcctccg 240ttggacgggg agtcgtcttt aacagcgtac actacatacg caatcacaaa tggccagagc 300agtggaattg cgcagcatag catgaaaatt gtgaggaaag tgggaatgct gaaaatgtgc 360cagaccagag agaaggtctc acatcggttg agtaatggtg tcgatagcgg ggcatatcgg 420attcccgcga ttttgggtgc cgtgtcgttt ttgtctcgcg acttgtagta ttgtgagtcg 480atagtcatag cttttgtttt gtgtgacttg tctgttgcct gttgttagaa gaaaaagtgg 540gagcttatca gtcacggtcc acgaacgatt tcgtacttgt acgtaattgg tcgtgagaac 600tgttgcagag ccggtgcttt tttttgtggc caagtcgaca ggtcgatttc ggcgctgtgc 660gaggttgctg ggatgtgctg gtttggctgc caaatgtggg gaagatttca acctcggatt 720tgacgtgtgt agaggcgcgc cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg 780gtttgcgtat tgggcgctct tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc 840ggctgcggcg agcggtatca gctcactcaa aggcggtaat acggttatcc acagaatcag 900gggataacgc aggaaagaac atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa 960aggccgcgtt gctggcgttt ttccataggc tccgcccccc tgacgagcat cacaaaaatc 1020gacgctcaag tcagaggtgg cgaaacccga caggactata aagataccag gcgtttcccc 1080ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc gcttaccgga tacctgtccg 1140cctttctccc ttcgggaagc gtggcgcttt ctcatagctc acgctgtagg tatctcagtt 1200cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga accccccgtt cagcccgacc 1260gctgcgcctt atccggtaac tatcgtcttg agtccaaccc ggtaagacac gacttatcgc 1320cactggcagc agccactggt aacaggatta gcagagcgag gtatgtaggc ggtgctacag 1380agttcttgaa gtggtggcct aactacggct acactagaag aacagtattt ggtatctgcg 1440ctctgctgaa gccagttacc ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa 1500ccaccgctgg tagcggtggt ttttttgttt gcaagcagca gattacgcgc agaaaaaaag 1560gatctcaaga agatcctttg atcttttcta cggggtctga cgctcagtgg aacgaaaact 1620cacgttaagg gattttggtc atgagattat caaaaaggat cttcacctag atccttttaa 1680attaaaaatg aagttttaaa 
tcaatctaaa gtatatatga gtaaacttgg tctgacagtt 1740accaatgctt aatcagtgag gcacctatct cagcgatctg tctatttcgt tcatccatag 1800ttgcctgact ccccgtcgtg tagataacta cgatacggga gggcttacca tctggcccca 1860gtgctgcaat gataccgcga gacccacgct caccggctcc agatttatca gcaataaacc 1920agccagccgg aagggccgag cgcagaagtg gtcctgcaac tttatccgcc tccatccagt 1980ctattaattg ttgccgggaa gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg 2040ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc gtttggtatg gcttcattca 2100gctccggttc ccaacgatca aggcgagtta catgatcccc catgttgtgc aaaaaagcgg 2160ttagctcctt cggtcctccg atcgttgtca gaagtaagtt ggccgcagtg ttatcactca 2220tggttatggc agcactgcat aattctctta ctgtcatgcc atccgtaaga tgcttttctg 2280tgactggtga gtactcaacc aagtcattct gagaatagtg tatgcggcga ccgagttgct 2340cttgcccggc gtcaatacgg gataataccg cgccacatag cagaacttta aaagtgctca 2400tcattggaaa acgttcttcg gggcgaaaac tctcaaggat cttaccgctg ttgagatcca 2460gttcgatgta acccactcgt gcacccaact gatcttcagc atcttttact ttcaccagcg 2520tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa aaagggaata agggcgacac 2580ggaaatgttg aatactcata ctcttccttt ttcaatatta ttgaagcatt tatcagggtt 2640attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc 2700cgcgcacatt tccccgaaaa gtgccacctg atgcggtgtg aaataccgca cagatgcgta 2760aggagaaaat accgcatcag gaaattgtaa gcgttaatat tttgttaaaa ttcgcgttaa 2820atttttgtta aatcagctca ttttttaacc aataggccga aatcggcaaa atcccttata 2880aatcaaaaga atagaccgag atagggttga gtgttgttcc agtttggaac aagagtccac 2940tattaaagaa cgtggactcc aacgtcaaag ggcgaaaaac cgtctatcag ggcgatggcc 3000cactacgtga accatcaccc taatcaagtt ttttggggtc gaggtgccgt aaagcactaa 3060atcggaaccc taaagggagc ccccgattta gagcttgacg gggaaagccg gcgaacgtgg 3120cgagaaagga agggaagaaa gcgaaaggag cgggcgctag ggcgctggca agtgtagcgg 3180tcacgctgcg cgtaaccacc acacccgccg cgcttaatgc gccgctacag ggcgcgtcca 3240ttcgccattc aggctgcgca actgttggga agggcgatcg gtgcgggcct cttcgctatt 3300acgccagctg gcgaaagggg gatgtgctgc aaggcgatta agttgggtaa cgccagggtt 3360ttcccagtca cgacgttgta aaacgacggc cagtgaattg taatacgact 
cactataggg 3420cgaattgggc ccgacgtcgc atgcatcaaa ggaagggtga atccaaggaa gttcttgaca 3480aactgctgga atcggtacag cttggacgac ttgtcgttgc taacctggtc atagaggtcg 3540ttctcaccaa aggccatgat gggaacaagg gcgacatttc cgacctccat accaagtcga 3600acaaaaccct ttcgcttgag tagcaccagg tccatgacac cgggtctggc cagaagactt 3660tcctgtgctc caccaacgac aatgcagata gactggtttc gcttgaggag ggccttgcag 3720gacttcttgg agacagaagc gactcccaga ctcatgaggt actctctgta gagaggcact 3780cggaagttgt tggtgagagt cataagagaa acagggatgc ccggaaagag cttggaccat 3840ccagctccct cggtggcaat tccaccaaag gctcccatgc cgataatgcc gtgggggtgg 3900tagccgaaga tgtattttct gccagtgggc ttgagttttg tgggcgacag ctgtgggtcg 3960ttttcgccaa tgatctggtt ggcgtaggag ttgagggacc cgttaagaag cgtggaatca 4020gatgcagtgg agccagcaga ggcggacgac aaaggtcgtc ggttagtggt gccattgttg 4080ccgttgccgt taagttcgga gcccgaggcg tggccgttgg agccagatga ttctccacgg 4140ctatatctgc tgtcgtggtt aattaactca cctgcaggat tgagactatg aatggattcc 4200cgtgcccgta ttactctact aatttgatct tggaacgcga aaatacgttt ctaggactcc 4260aaagaatctc aactcttgtc cttactaaat atactaccca tagttgatgg tttacttgaa 4320cagagaggac atgttcactt gacccaaagt ttctcgcatc tcttggatat ttgaacaacg 4380gcgtccactg accgtcagtt atccagtcac aaaaccccca cattcataca ttcccatgta 4440cgtttacaaa gttctcaatt ccatcgtgca aatcaaaatc acatctattc attcatcata 4500tataaaccca tcatgtctac taacactcac aactccatag aaaacatcga ctcagaacac 4560acgctccatg cggccgctta ggaatcctga gcgtccttga cacagtgaac cacaccgact 4620ttgtgcatgt acttgagggt ggaaatgatg ttgcccacaa tggtagggta gaagacgtac 4680cgaactccgt gtcgttcgca acactctcgg acagcttgct gcacgaaggg atagtgccaa 4740gacgacattc gaggaaagag gtgatgctcg atctggaagt tgagaccgcc agtaaagaac 4800atggcaatgg gtccaccgta ggtggaagag gtctccacct gagctctgta ccagtcgatc 4860tgatcggctt caacgtcctt ctcggagctc ttgaccttgc agttcttgtc ggggattcgc 4920tccgagccat cgaagttgtg agacaagatg aaaaagaagg tgaggaaggc accggtagca 4980gtgggcacca gaggaatggt gatgagcagg gaggttccag tgagatacca gggcaagaag 5040gcggttcgaa agatgaagaa agctcgcata acgaatgcaa gggttcggta ccgtcgcaga 5100aagccgttct ctcgcatggc 
tgtgacagac tcgggaatgg tgtcgttgtg ctgcattcgg 5160aagatgtaga gagggttgta caccagcgaa acgccgtagg ctccaagcac gaggtacatg 5220taccaggcct ggaatcggtg aaaccacttt cgagcagtgt tggcagcagg gtagttgtgg 5280aacacaagga atggttctgc ggactcggca tccaggtcga gaccatgctg attggtgtag 5340gtgtgatgtc gcatgatgtg agactgcagc cagatccatc tggacgatcc aatgacgtcg 5400atgccgtagg caaagagagc gttgacccag ggctttttgc tgatggcacc atgagaggca 5460tcgtgctgaa tggacaggcc gatctgcatg tgcatgaatc cagtcaagag accccacagc 5520accattccgg tagtagccca gtgccactcg caaaaggcgg tgacagcaat gatgccaacg 5580gttcgcagcc agaatccagg tgtggcatac cagttccgac ctttcatgac ctctcgcata 5640gttcgcttga cgtcctgtgc aaagggagag tcgtaggtgt agacaatgtc cttggaggtt 5700cggtcgtgct tgcctcgcac gaactgttga agcagcttcg agttctcggg cttgacgtaa 5760gggtgcatgg agtagaacag aggagaagca tcggaggcac cagaagcgag gatcaagtcg 5820cctccgggat ggaccttggc aagaccttcc agatcgtaga gaatgccgtc gatggcaacc 5880aggtcgggtc gctcgagcag ctgctcggta gtaagggaga gagccatggc cattgctgta 5940gatatgtctt gtgtgtaagg gggttggggt ggttgtttgt gttcttgact tttgtgttag 6000caagggaaga cgggcaaaaa agtgagtgtg gttgggaggg agagacgagc cttatatata 6060atgcttgttt gtgtttgtgc aagtggacgc cgaaacgggc aggagccaaa ctaaacaagg 6120cagacaatgc gagcttaatt ggattgcctg atgggcaggg gttagggctc gatcaatggg 6180ggtgcgaagt gacaaaattg ggaattaggt tcgcaagcaa ggctgacaag actttggccc 6240aaacatttgt acgcggtgga caacaggagc cacccatcgt ctgtcacggg ctagccggtc 6300gtgcgtcctg tcaggctcca cctaggctcc atgccactcc atacaatccc actagtgtac 6360cgctaggccg cttttagctc ccatctaaga cccccccaaa acctccactg tacagtgcac 6420tgtactgtgt ggcgatcaag ggcaagggaa aaaaggcgca aacatgcacg catggaatga 6480cgtaggtaag gcgttactag actgaaaagt ggcacatttc ggcgtgccaa agggtcctag 6540gtgcgtttcg cgagctgggc gccaggccaa gccgctccaa aacgcctctc cgactccctc 6600cagcggcctc catatcccca tccctctcca cagcaatgtt gttaagcctt gcaaacgaaa 6660aaatagaaag gctaataagc ttccaatatt gtggtgtacg ctgcataacg caacaatgag 6720cgccaaacaa cacacacaca cagcacacag cagcattaac cacgatgaac agcatgaatt 6780ctctctcttg agcttttcca taacaagttc ttctgcctcc aggaagtcca 
tgggtggttt 6840gatcatggtt ttggtgtagt ggtagtgcag tggtggtatt gtgactgggg atgtagttga 6900gaataagtca tacacaagtc agctttcttc gagcctcata taagtataag tagttcaacg 6960tattagcact gtacccagca tctccgtatc gagaaacaca acaacatgcc ccattggaca 7020gatcatgcgg atacacaggt tgtgcagtat catacatact cgatcagaca ggtcgtctga 7080ccatcataca agctgaacaa gcgctccata cttgcacgct ctctatatac acagttaaat 7140tacatatcca tagtctaacc tctaacagtt aatcttctgg taagcctccc agccagcctt 7200ctggtatcgc ttggcctcct caataggatc tcggttctgg ccgtacagac ctcggccgac 7260aattatgata tccgttccgg tagacatgac atcctcaaca gttcggtact gctgtccgag 7320agcgtctccc ttgtcgtcaa gacccacccc gggggtcaga ataagccagt cctcagagtc 7380gcccttaggt cggttctggg caatgaagcc aaccacaaac tcggggtcgg atcgggcaag 7440ctcaatggtc tgcttggagt actcgccagt ggccagagag cccttgcaag acagctcggc 7500cagcatgagc agacctctgg ccagcttctc gttgggagag gggactagga actccttgta 7560ctgggagttc tcgtagtcag agacgtcctc cttcttctgt tcagagacag tttcctcggc 7620accagctcgc aggccagcaa tgattccggt tccgggtaca ccgtgggcgt tggtgatatc 7680ggaccactcg gcgattcggt gacaccggta ctggtgcttg acagtgttgc caatatctgc 7740gaactttctg tcctcgaaca ggaagaaacc gtgcttaaga gcaagttcct tgagggggag 7800cacagtgccg gcgtaggtga agtcgtcaat gatgtcgata tgggttttga tcatgcacac 7860ataaggtccg accttatcgg caagctcaat gagctccttg gtggtggtaa catccagaga 7920agcacacagg ttggttttct tggctgccac gagcttgagc actcgagcgg caaaggcgga 7980cttgtggacg ttagctcgag cttcgtagga gggcattttg gtggtgaaga ggagactgaa 8040ataaatttag tctgcagaac tttttatcgg aaccttatct ggggcagtga agtatatgtt 8100atggtaatag ttacgagtta gttgaactta tagatagact ggactatacg gctatcggtc 8160caaattagaa agaacgtcaa tggctctctg ggcgtcgcct ttgccgacaa aaatgtgatc 8220atgatgaaag ccagcaatga cgttgcagct gatattgttg tcggccaacc gcgccgaaaa 8280cgcagctgtc agacccacag cctccaacga agaatgtatc gtcaaagtga tccaagcaca 8340ctcatagttg gagtcgtact ccaaaggcgg caatgacgag tcagacagat actcgtcgac 8400cttttccttg ggaaccacca ccgtcagccc ttctgactca cgtattgtag ccaccgacac 8460aggcaacagt ccgtggatag cagaatatgt cttgtcggtc catttctcac caactttagg 8520cgtcaagtga atgttgcaga 
agaagtatgt gccttcattg agaatcggtg ttgctgattt 8580caataaagtc ttgagatcag tttggccagt catgttgtgg ggggtaattg gattgagtta 8640tcgcctacag tctgtacagg tatactcgct gcccacttta tactttttga ttccgctgca 8700cttgaagcaa tgtcgtttac caaaagtgag aatgctccac agaacacacc ccagggtatg 8760gttgagcaaa aaataaacac tccgatacgg ggaatcgaac cccggtctcc acggttctca 8820agaagtattc ttgatgagag cgtatcgata gttggagcaa gggagaaatg tagagtgtga 8880aagactcact atggtccggg cttatctcga ccaatagcca aagtctggag tttctgagag 8940aaaaaggcaa gatacgtatg taacaaagcg acgcatggta caataatacc ggaggcatgt 9000atcatagaga gttagtggtt cgatgatggc actggtgcct ggtatgactt tatacggctg 9060actacatatt tgtcctcaga catacaatta cagtcaagca cttacccttg gacatctgta 9120ggtacccccc ggccaagacg atctcagcgt gtcgtatgtc ggattggcgt agctccctcg 9180ctcgtcaatt ggctcccatc tactttcttc tgcttggcta cacccagcat gtctgctatg 9240gctcgttttc gtgccttatc tatcctccca gtattaccaa ctctaaatga catgatgtga 9300ttgggtctac actttcatat cagagataag gagtagcaca gttgcataaa aagcccaact 9360ctaatcagct tcttcctttc ttgtaattag tacaaaggtg attagcgaaa tctggaagct 9420tagttggccc taaaaaaatc aaaaaaagca aaaaacgaaa aacgaaaaac cacagttttg 9480agaacaggga ggtaacgaag gatcgtatat atatatatat atatatatac ccacggatcc 9540cgagaccggc ctttgattct tccctacaac caaccattct caccacccta attcacaacc 9600atggctgccg tcatcgaggt ggccaacgag ttcgtcgcta tcactgccga gacccttccc 9660aaggtggact atcagcgact ctggcgagac atctactcct gcgagctcct gtacttctcc 9720attgctttcg tcatcctcaa gtttaccctt ggcgagctct cggattctgg caaaaagatt 9780ctgcgagtgc tgttcaagtg gtacaacctc ttcatgtccg tcttttcgct ggtgtccttc 9840ctctgtatgg gttacgccat ctacaccgtt ggactgtact ccaacgaatg cgacagagct 9900ttcgacaaca gcttgttccg atttgccacc aaggtcttct actattccaa gtttctggag 9960tacatcgact ctttctacct tcccctcatg gccaagcctc tgtcctttct gcagttcttt 10020catcacttgg gagctcctat ggacatgtgg ctcttcgtgc agtactctgg cgaatccatt 10080tggatctttg tgttcctgaa cggattcatt cactttgtca tgtacggcta ctattggaca 10140cggctgatga agttcaactt tcccatgccc aagcagctca ttaccgcaat gcagatcacc 10200cagttcaacg ttggcttcta cctcgtgtgg tggtacaagg acattccctg 
ttaccgaaag 10260gatcccatgc gaatgctggc ctggatcttc aactactggt acgtcggtac cgttcttctg 10320ctcttcatca acttctttgt caagtcctac gtgtttccca agcctaagac tgccgacaaa 10380aaggtccagt agcggccgca tgtacataca agattattta tagaaatgaa tcgcgatcga 10440acaaagagta cgagtgtacg agtaggggat gatgataaaa gtggaagaag ttccgcatct 10500ttggatttat caacgtgtag gacgatactt cctgtaaaaa tgcaatgtct ttaccatagg 10560ttctgctgta gatgttatta actaccatta acatgtctac ttgtacagtt gcagaccagt 10620tggagtatag aatggtacac ttaccaaaaa gtgttgatgg ttgtaactac gatatataaa 10680actgttgacg ggatctgtat attcggtaag atatattttg tggggtttta gtggtgttta 10740aacaccacta aaaccccaca aaatatatct taccgaatat acagatctac tatagaggaa 10800caattgcccc ggagaagacg gccaggccgc ctagatgaca aattcaacaa ctcacagctg 10860actttctgcc attgccacta ggggggggcc tttttatatg gccaagccaa gctctccacg 10920tcggttgggc tgcacccaac aataaatggg tagggttgca ccaacaaagg gatgggatgg 10980ggggtagaag atacgaggat aacggggctc aatggcacaa ataagaacga atactgccat 11040taagactcgt gatccagcga ctgacaccat tgcatcatct aagggcctca aaactacctc 11100ggaactgctg cgctgatctg gacaccacag aggttccgag cactttaggt tgcaccaaat 11160gtcccaccag gtgcaggcag aaaacgctgg aacagcgtgt acagtttgtc ttaacaaaaa 11220gtgagggcgc tgaggtcgag cagggtggtg tgacttgtta tagcctttag agctgcgaaa 11280gcgcgtatgg atttggctca tcaggccaga ttgagggtct gtggacacat gtcatgttag 11340tgtacttcaa tcgccccctg gatatagccc cgacaatagg ccgtggcctc atttttttgc 11400cttccgcaca tttccattgc tcggtaccca caccttgctt ctcctgcact tgccaacctt 11460aatactggtt tacattgacc aacatcttac aagcgggggg cttgtctagg gtatatataa 11520acagtggctc tcccaatcgg ttgccagtct cttttttcct ttctttcccc acagattcga 11580aatctaaact acacatcaca caatgcctgt
tactgacgtc cttaagcgaa agtccggtgt 11640catcgtcggc gacgatgtcc gagccgtgag tatccacgac aagatcagtg tcgagacgac 11700gcgttttgtg taatgacaca atccgaaagt cgctagcaac acacactctc tacacaaact 11760aacccagctc tccatggtga aggcttctcg acaggctctg cccctcgtca tcgacggaaa 11820ggtgtacgac gtctccgctt gggtgaactt ccaccctggt ggagctgaaa tcattgagaa 11880ctaccaggga cgagatgcta ctgacgcctt catggttatg cactctcagg aagccttcga 11940caagctcaag cgaatgccca agatcaacca ggcttccgag ctgcctcccc aggctgccgt 12000caacgaagct caggaggatt tccgaaagct ccgagaagag ctgatcgcca ctggcatgtt 12060tgacgcctct cccctctggt actcgtacaa gatcttgacc accctgggtc ttggcgtgct 12120tgccttcttc atgctggtcc agtaccacct gtacttcatt ggtgctctcg tgctcggtat 12180gcactaccag caaatgggat ggctgtctca tgacatctgc caccaccaga ccttcaagaa 12240ccgaaactgg aataacgtcc tgggtctggt ctttggcaac ggactccagg gcttctccgt 12300gacctggtgg aaggacagac acaacgccca tcattctgct accaacgttc agggtcacga 12360tcccgacatt gataacctgc ctctgctcgc ctggtccgag gacgatgtca ctcgagcttc 12420tcccatctcc cgaaagctca ttcagttcca acagtactat ttcctggtca tctgtattct 12480cctgcgattc atctggtgtt tccagtctgt gctgaccgtt cgatccctca aggaccgaga 12540caaccagttc taccgatctc agtacaagaa agaggccatt ggactcgctc tgcactggac 12600tctcaagacc ctgttccacc tcttctttat gccctccatc ctgacctcga tgctggtgtt 12660ctttgtttcc gagctcgtcg gtggcttcgg aattgccatc gtggtcttca tgaaccacta 12720ccctctggag aagatcggtg attccgtctg ggacggacat ggcttctctg tgggtcagat 12780ccatgagacc atgaacattc gacgaggcat cattactgac tggttctttg gaggcctgaa 12840ctaccagatc gagcaccatc tctggcccac cctgcctcga cacaacctca ctgccgtttc 12900ctaccaggtg gaacagctgt gccagaagca caacctcccc taccgaaacc ctctgcccca 12960tgaaggtctc gtcatcctgc tccgatacct gtcccagttc gctcgaatgg ccgagaagca 13020gcccggtgcc aaggctcagt aagcggccgc atgagaagat aaatatataa atacattgag 13080atattaaatg cgctagatta gagagcctca tactgctcgg agagaagcca agacgagtac 13140tcaaagggga ttacaccatc catatccaca gacacaagct ggggaaaggt tctatataca 13200ctttccggaa taccgtagtt tccgatgtta tcaatggggg cagccaggat ttcaggcact 13260tcggtgtctc ggggtgaaat ggcgttcttg gcctccatca 
agtcgtacca tgtcttcatt 13320tgcctgtcaa agtaaaacag aagcagatga agaatgaact tgaagtgaag gaatttaaat 13380agttggagca agggagaaat gtagagtgtg aaagactcac tatggtccgg gcttatctcg 13440accaatagcc aaagtctgga gtttctgaga gaaaaaggca agatacgtat gtaacaaagc 13500gacgcatggt acaataatac cggaggcatg tatcatagag agttagtggt tcgatgatgg 13560cactggtgcc tggtatgact ttatacggct gactacatat ttgtcctcag acatacaatt 13620acagtcaagc acttaccctt ggacatctgt aggtaccccc cggccaagac gatctcagcg 13680tgtcgtatgt cggattggcg tagctccctc gctcgtcaat tggctcccat ctactttctt 13740ctgcttggct acacccagca tgtctgctat ggctcgtttt cgtgccttat ctatcctccc 13800agtattacca actctaaatg acatgatgtg attgggtcta cactttcata tcagagataa 13860ggagtagcac agttgcataa aaagcccaac tctaatcagc ttcttccttt cttgtaatta 13920gtacaaaggt gattagcgaa atctggaagc ttagttggcc ctaaaaaaat caaaaaaagc 13980aaaaaacgaa aaacgaaaaa ccacagtttt gagaacaggg aggtaacgaa ggatcgtata 14040tatatatata tatatatata cccacggatc ccgagaccgg cctttgattc ttccctacaa 14100ccaaccattc tcaccaccct aattcacaac catggcctcc acctcggctc tgcccaagca 14160gaaccctgcc ctccgacgaa ccgtcacttc caccactgtg accgactcgg agtctgctgc 14220cgtctctccc tccgattctc ccagacactc ggcctcctct acatcgctgt cttccatgtc 14280cgaggtggac attgccaagc ccaagtccga gtacggtgtc atgctggata cctacggcaa 14340ccagttcgaa gttcccgact tcaccatcaa ggacatctac aacgctattc ccaagcactg 14400cttcaagcga tctgctctca agggatacgg ctacattctt cgagacattg tcctcctgac 14460taccactttc agcatctggt acaactttgt gacacccgag tacattccct ccactcctgc 14520tcgagccggt ctgtgggctg tgtacaccgt tcttcaggga ctcttcggta ctggactgtg 14580ggtcattgcc cacgagtgtg gacatggtgc tttctccgat tcccgaatca tcaacgacat 14640tactggctgg gtgcttcact cttccctgct tgttccctac ttcagctggc aaatctccca 14700ccggaagcat cacaaggcca ctggaaacat ggagcgagac atggtcttcg ttcctcgaac 14760ccgagagcag caagctactc gactcggcaa gatgacccac gaactcgccc atcttaccga 14820ggaaactcct gctttcaccc tgctcatgct tgtgcttcag caactggtcg gttggcccaa 14880ctatctcatt accaacgtta ctggacacaa ctaccatgag cggcagcgag agggtcgagg 14940caagggaaag cacaacggtc ttggcggtgg agttaaccat ttcgatcccc 
gatctcctct 15000gtacgagaac agcgacgcca agctcatcgt gctctccgac attggcattg gtcttatggc 15060caccgctctg tactttctcg ttcagaagtt cggattctac aacatggcca tctggtactt 15120cgttccctac ttgtgggtta accactggct cgtcgccatt acctttctgc agcacacaga 15180tcctactctt ccccactaca ccaacgacga gtggaacttt gtgcgaggtg ccgctgcaac 15240catcgaccga gagatgggct tcattggacg tcatctgctc cacggcatta tcgagactca 15300cgtcctgcat cactacgtct cttccattcc cttctacaat gcggacgaag ctaccgaggc 15360catcaaacct atcatgggca agcactatcg agctgatgtc caggacggtc ctcgaggatt 15420cattcgagcc atgtaccgat ctgcacgaat gtgccagtgg gttgaaccct ccgctggtgc 15480cgagggagct ggcaagggtg tcctgttctt tcgaaaccga aacaatgtgg gcactcctcc 15540cgctgtcatc aagcccgttg cctaagcggc cgctatttat cactctttac aacttctacc 15600tcaactatct actttaataa atgaatatcg tttattctct atgattactg tatatgcgtt 15660cctctaagac aaatcgaaac cagcatgtga tcgaatggca tacaaaagtt tcttccgaag 15720ttgatcaatg tcctgatagt caggcagctt gagaagattg acacaggtgg aggccgtagg 15780gaaccgatca acctgtctac cagcgttacg aatggcaaat gacgggttca aagccttgaa 15840tccttgcaat ggtgccttgg atactgatgt cacaaactta agaagcagcc gcttgtcctc 15900ttcctcgatc gatggtcata gctgtttcct gtgtgaaatt gttatccgct cacaattcca 15960cacaac 15966722119DNAYarrowia lipolyticaCDS(291)..(1835)DGAT2 open reading frame, comprising 2 smaller internal open reading frames 72aaacgcaccc actgctcgtc ctccttgctc ctcgaaaccg actcctctac acacgtcaaa 60tccgaggttg aaatcttccc cacatttggc agccaaacca gcacatccca gcaacctcgc 120acagcgccga aatcgacctg tcgacttggc cacaaaaaaa agcaccggct ctgcaacagt 180tctcacgacc aattacgtac aagtacgaaa tcgttcgtgg accgtgactg ataagctccc 240actttttctt ctaacaacag gcaacagaca agtcacacaa aacaaaagct atg act 296 Met Thr 1atc gac tca caa tac tac aag tcg cga gac aaa aac gac acg gca ccc 344Ile Asp Ser Gln Tyr Tyr Lys Ser Arg Asp Lys Asn Asp Thr Ala Pro 5 10 15aaa atc gcg gga atc cga tat gcc ccg cta tcg aca cca tta ctc aac 392Lys Ile Ala Gly Ile Arg Tyr Ala Pro Leu Ser Thr Pro Leu Leu Asn 20 25 30cga tgt gag acc ttc tct ctg gtc tgg cac att ttc agc att ccc act 440Arg Cys Glu 
Thr Phe Ser Leu Val Trp His Ile Phe Ser Ile Pro Thr35 40 45 50ttc ctc aca att ttc atg cta tgc tgc gca att cca ctg ctc tgg cca 488Phe Leu Thr Ile Phe Met Leu Cys Cys Ala Ile Pro Leu Leu Trp Pro 55 60 65ttt gtg att gcg tat gta gtg tac gct gtt aaa gac gac tcc ccg tcc 536Phe Val Ile Ala Tyr Val Val Tyr Ala Val Lys Asp Asp Ser Pro Ser 70 75 80aac gga gga gtg gtc aag cga tac tcg cct att tca aga aac ttc ttc 584Asn Gly Gly Val Val Lys Arg Tyr Ser Pro Ile Ser Arg Asn Phe Phe 85 90 95atc tgg aag ctc ttt ggc cgc tac ttc ccc ata act ctg cac aag acg 632Ile Trp Lys Leu Phe Gly Arg Tyr Phe Pro Ile Thr Leu His Lys Thr 100 105 110gtg gat ctg gag ccc acg cac aca tac tac cct ctg gac gtc cag gag 680Val Asp Leu Glu Pro Thr His Thr Tyr Tyr Pro Leu Asp Val Gln Glu115 120 125 130tat cac ctg att gct gag aga tac tgg ccg cag aac aag tac ctc cga 728Tyr His Leu Ile Ala Glu Arg Tyr Trp Pro Gln Asn Lys Tyr Leu Arg 135 140 145gca atc atc tcc acc atc gag tac ttt ctg ccc gcc ttc atg aaa cgg 776Ala Ile Ile Ser Thr Ile Glu Tyr Phe Leu Pro Ala Phe Met Lys Arg 150 155 160tct ctt tct atc aac gag cag gag cag cct gcc gag cga gat cct ctc 824Ser Leu Ser Ile Asn Glu Gln Glu Gln Pro Ala Glu Arg Asp Pro Leu 165 170 175ctg tct ccc gtt tct ccc agc tct ccg ggt tct caa cct gac aag tgg 872Leu Ser Pro Val Ser Pro Ser Ser Pro Gly Ser Gln Pro Asp Lys Trp 180 185 190att aac cac gac agc aga tat agc cgt gga gaa tca tct ggc tcc aac 920Ile Asn His Asp Ser Arg Tyr Ser Arg Gly Glu Ser Ser Gly Ser Asn195 200 205 210ggc cac gcc tcg ggc tcc gaa ctt aac ggc aac ggc aac aat ggc acc 968Gly His Ala Ser Gly Ser Glu Leu Asn Gly Asn Gly Asn Asn Gly Thr 215 220 225act aac cga cga cct ttg tcg tcc gcc tct gct ggc tcc act gca tct 1016Thr Asn Arg Arg Pro Leu Ser Ser Ala Ser Ala Gly Ser Thr Ala Ser 230 235 240gat tcc acg ctt ctt aac ggg tcc ctc aac tcc tac gcc aac cag atc 1064Asp Ser Thr Leu Leu Asn Gly Ser Leu Asn Ser Tyr Ala Asn Gln Ile 245 250 255att ggc gaa aac gac cca cag ctg tcg ccc aca aaa ctc aag ccc act 1112Ile Gly Glu Asn Asp Pro 
Gln Leu Ser Pro Thr Lys Leu Lys Pro Thr 260 265 270ggc aga aaa tac atc ttc ggc tac cac ccc cac ggc att atc ggc atg 1160Gly Arg Lys Tyr Ile Phe Gly Tyr His Pro His Gly Ile Ile Gly Met275 280 285 290gga gcc ttt ggt gga att gcc acc gag gga gct gga tgg tcc aag ctc 1208Gly Ala Phe Gly Gly Ile Ala Thr Glu Gly Ala Gly Trp Ser Lys Leu 295 300 305ttt ccg ggc atc cct gtt tct ctt atg act ctc acc aac aac ttc cga 1256Phe Pro Gly Ile Pro Val Ser Leu Met Thr Leu Thr Asn Asn Phe Arg 310 315 320gtg cct ctc tac aga gag tac ctc atg agt ctg gga gtc gct tct gtc 1304Val Pro Leu Tyr Arg Glu Tyr Leu Met Ser Leu Gly Val Ala Ser Val 325 330 335tcc aag aag tcc tgc aag gcc ctc ctc aag cga aac cag tct atc tgc 1352Ser Lys Lys Ser Cys Lys Ala Leu Leu Lys Arg Asn Gln Ser Ile Cys 340 345 350att gtc gtt ggt gga gca cag gaa agt ctt ctg gcc aga ccc ggt gtc 1400Ile Val Val Gly Gly Ala Gln Glu Ser Leu Leu Ala Arg Pro Gly Val355 360 365 370atg gac ctg gtg cta ctc aag cga aag ggt ttt gtt cga ctt ggt atg 1448Met Asp Leu Val Leu Leu Lys Arg Lys Gly Phe Val Arg Leu Gly Met 375 380 385gag gtc gga aat gtc gcc ctt gtt ccc atc atg gcc ttt ggt gag aac 1496Glu Val Gly Asn Val Ala Leu Val Pro Ile Met Ala Phe Gly Glu Asn 390 395 400gac ctc tat gac cag gtt agc aac gac aag tcg tcc aag ctg tac cga 1544Asp Leu Tyr Asp Gln Val Ser Asn Asp Lys Ser Ser Lys Leu Tyr Arg 405 410 415ttc cag cag ttt gtc aag aac ttc ctt gga ttc acc ctt cct ttg atg 1592Phe Gln Gln Phe Val Lys Asn Phe Leu Gly Phe Thr Leu Pro Leu Met 420 425 430cat gcc cga ggc gtc ttc aac tac gat gtc ggt ctt gtc ccc tac agg 1640His Ala Arg Gly Val Phe Asn Tyr Asp Val Gly Leu Val Pro Tyr Arg435 440 445 450cga ccc gtc aac att gtg gtt ggt tcc ccc att gac ttg cct tat ctc 1688Arg Pro Val Asn Ile Val Val Gly Ser Pro Ile Asp Leu Pro Tyr Leu 455 460 465cca cac ccc acc gac gaa gaa gtg tcc gaa tac cac gac cga tac atc 1736Pro His Pro Thr Asp Glu Glu Val Ser Glu Tyr His Asp Arg Tyr Ile 470 475 480gcc gag ctg cag cga atc tac aac gag cac aag gat gaa tat ttc atc 1784Ala Glu Leu 
Gln Arg Ile Tyr Asn Glu His Lys Asp Glu Tyr Phe Ile 485 490 495gat tgg acc gag gag ggc aaa gga gcc cca gag ttc cga atg att gag 1832Asp Trp Thr Glu Glu Gly Lys Gly Ala Pro Glu Phe Arg Met Ile Glu 500 505 510taa ggaaaactgc ctgggttagg caaatagcta atgagtattt ttttgatggc 1885aaccaaatgt agaaagaaaa aaaaaaaaaa agaaaaaaaa aagagaatat tatatctatg 1945taattctatt aaaagctctg ttgagtgagc ggaataaata ctgttgaaga ggggattgtg 2005tagagatctg tttactcaat ggcaaactca tctgggggag atccttccac tgtgggaagc 2065tcctggatag cctttgcatc ggggttcaag aagaccattg tgaacagccc ttga 211973514PRTYarrowia lipolytica 73Met Thr Ile Asp Ser Gln Tyr Tyr Lys Ser Arg Asp Lys Asn Asp Thr1 5 10 15Ala Pro Lys Ile Ala Gly Ile Arg Tyr Ala Pro Leu Ser Thr Pro Leu 20 25 30Leu Asn Arg Cys Glu Thr Phe Ser Leu Val Trp His Ile Phe Ser Ile 35 40 45Pro Thr Phe Leu Thr Ile Phe Met Leu Cys Cys Ala Ile Pro Leu Leu 50 55 60Trp Pro Phe Val Ile Ala Tyr Val Val Tyr Ala Val Lys Asp Asp Ser65 70 75 80Pro Ser Asn Gly Gly Val Val Lys Arg Tyr Ser Pro Ile Ser Arg Asn 85 90 95Phe Phe Ile Trp Lys Leu Phe Gly Arg Tyr Phe Pro Ile Thr Leu His 100 105 110Lys Thr Val Asp Leu Glu Pro Thr His Thr Tyr Tyr Pro Leu Asp Val 115 120 125Gln Glu Tyr His Leu Ile Ala Glu Arg Tyr Trp Pro Gln Asn Lys Tyr 130 135 140Leu Arg Ala Ile Ile Ser Thr Ile Glu Tyr Phe Leu Pro Ala Phe Met145 150 155 160Lys Arg Ser Leu Ser Ile Asn Glu Gln Glu Gln Pro Ala Glu Arg Asp 165 170 175Pro Leu Leu Ser Pro Val Ser Pro Ser Ser Pro Gly Ser Gln Pro Asp 180 185 190Lys Trp Ile Asn His Asp Ser Arg Tyr Ser Arg Gly Glu Ser Ser Gly 195 200 205Ser Asn Gly His Ala Ser Gly Ser Glu Leu Asn Gly Asn Gly Asn Asn 210 215 220Gly Thr Thr Asn Arg Arg Pro Leu Ser Ser Ala Ser Ala Gly Ser Thr225 230 235 240Ala Ser Asp Ser Thr Leu Leu Asn Gly Ser Leu Asn Ser Tyr Ala Asn 245 250 255Gln Ile Ile Gly Glu Asn Asp Pro Gln Leu Ser Pro Thr Lys Leu Lys 260 265 270Pro Thr Gly Arg Lys Tyr Ile Phe Gly Tyr His Pro His Gly Ile Ile 275 280 285Gly Met Gly Ala Phe Gly Gly Ile Ala Thr Glu Gly Ala Gly Trp Ser 290 295 300Lys Leu Phe 
Pro Gly Ile Pro Val Ser Leu Met Thr Leu Thr Asn Asn305 310 315 320Phe Arg Val Pro Leu Tyr Arg Glu Tyr Leu Met Ser Leu Gly Val Ala 325 330 335Ser Val Ser Lys Lys Ser Cys Lys Ala Leu Leu Lys Arg Asn Gln Ser 340 345 350Ile Cys Ile Val Val Gly Gly Ala Gln Glu Ser Leu Leu Ala Arg Pro 355 360 365Gly Val Met Asp Leu Val Leu Leu Lys Arg Lys Gly Phe Val Arg Leu 370 375 380Gly Met Glu Val Gly Asn Val Ala Leu Val Pro Ile Met Ala Phe Gly385 390 395 400Glu Asn Asp Leu Tyr Asp Gln Val Ser Asn Asp Lys Ser Ser Lys Leu 405 410 415Tyr Arg Phe Gln Gln Phe Val Lys Asn Phe Leu Gly Phe Thr Leu Pro 420 425 430Leu Met His Ala Arg Gly Val Phe Asn Tyr Asp Val Gly Leu Val Pro 435 440 445Tyr Arg Arg Pro Val Asn Ile Val Val Gly Ser Pro Ile Asp Leu Pro 450 455 460Tyr Leu Pro His Pro Thr Asp Glu Glu Val Ser Glu Tyr His Asp Arg465 470 475 480Tyr Ile Ala Glu Leu Gln Arg Ile Tyr Asn Glu His Lys Asp Glu Tyr 485 490 495Phe Ile Asp Trp Thr Glu Glu Gly Lys Gly Ala Pro Glu Phe Arg Met 500 505 510Ile Glu741434DNAFusarium moniliformeCDS(1)..(1434)synthetic delta-12 desaturase (codon-optimized for Yarrowia lipolytica) 74atg gcc tcc acc tcg gct ctg ccc aag cag aac cct gcc ctc cga cga 48Met Ala Ser Thr Ser Ala Leu Pro Lys Gln Asn Pro Ala Leu Arg Arg1 5 10 15acc gtc act tcc acc act gtg acc gac tcg gag tct gct gcc gtc tct 96Thr Val Thr Ser Thr Thr Val Thr Asp Ser Glu Ser Ala Ala Val Ser 20 25 30ccc tcc gat tct ccc aga cac tcg gcc tcc tct aca tcg ctg tct tcc 144Pro Ser Asp Ser Pro Arg His Ser Ala Ser Ser Thr Ser Leu Ser Ser 35 40 45atg tcc gag gtg gac att gcc aag ccc aag tcc gag tac ggt gtc atg 192Met Ser Glu Val Asp Ile Ala Lys Pro Lys Ser Glu Tyr Gly Val Met 50 55 60ctg gat acc tac ggc aac cag ttc gaa gtt ccc gac ttc acc atc aag 240Leu Asp Thr Tyr Gly Asn Gln Phe Glu Val Pro Asp Phe Thr Ile Lys65 70 75 80gac atc tac aac gct att ccc aag cac tgc ttc aag cga tct gct ctc 288Asp Ile Tyr Asn Ala Ile Pro Lys His Cys Phe Lys Arg Ser Ala Leu 85 90 95aag gga tac ggc tac att ctt cga gac att gtc ctc ctg act acc act 
336Lys Gly Tyr Gly Tyr Ile Leu Arg Asp Ile Val Leu Leu Thr Thr Thr 100 105 110ttc agc atc tgg tac aac ttt gtg aca ccc gag tac att ccc tcc act 384Phe Ser Ile Trp Tyr Asn Phe Val Thr Pro Glu Tyr Ile Pro Ser Thr 115 120 125cct gct cga gcc ggt ctg tgg gct gtg tac acc gtt ctt cag gga ctc 432Pro Ala Arg Ala Gly Leu Trp Ala Val Tyr Thr Val Leu Gln Gly Leu 130 135 140ttc ggt act gga ctg tgg gtc att gcc cac gag tgt gga cat ggt gct 480Phe Gly Thr Gly Leu Trp Val Ile Ala His Glu Cys Gly His Gly Ala145 150 155 160ttc tcc gat tcc
cga atc atc aac gac att act ggc tgg gtg ctt cac 528Phe Ser Asp Ser Arg Ile Ile Asn Asp Ile Thr Gly Trp Val Leu His 165 170 175tct tcc ctg ctt gtt ccc tac ttc agc tgg caa atc tcc cac cgg aag 576Ser Ser Leu Leu Val Pro Tyr Phe Ser Trp Gln Ile Ser His Arg Lys 180 185 190cat cac aag gcc act gga aac atg gag cga gac atg gtc ttc gtt cct 624His His Lys Ala Thr Gly Asn Met Glu Arg Asp Met Val Phe Val Pro 195 200 205cga acc cga gag cag caa gct act cga ctc ggc aag atg acc cac gaa 672Arg Thr Arg Glu Gln Gln Ala Thr Arg Leu Gly Lys Met Thr His Glu 210 215 220ctc gcc cat ctt acc gag gaa act cct gct ttc acc ctg ctc atg ctt 720Leu Ala His Leu Thr Glu Glu Thr Pro Ala Phe Thr Leu Leu Met Leu225 230 235 240gtg ctt cag caa ctg gtc ggt tgg ccc aac tat ctc att acc aac gtt 768Val Leu Gln Gln Leu Val Gly Trp Pro Asn Tyr Leu Ile Thr Asn Val 245 250 255act gga cac aac tac cat gag cgg cag cga gag ggt cga ggc aag gga 816Thr Gly His Asn Tyr His Glu Arg Gln Arg Glu Gly Arg Gly Lys Gly 260 265 270aag cac aac ggt ctt ggc ggt gga gtt aac cat ttc gat ccc cga tct 864Lys His Asn Gly Leu Gly Gly Gly Val Asn His Phe Asp Pro Arg Ser 275 280 285cct ctg tac gag aac agc gac gcc aag ctc atc gtg ctc tcc gac att 912Pro Leu Tyr Glu Asn Ser Asp Ala Lys Leu Ile Val Leu Ser Asp Ile 290 295 300ggc att ggt ctt atg gcc acc gct ctg tac ttt ctc gtt cag aag ttc 960Gly Ile Gly Leu Met Ala Thr Ala Leu Tyr Phe Leu Val Gln Lys Phe305 310 315 320gga ttc tac aac atg gcc atc tgg tac ttc gtt ccc tac ttg tgg gtt 1008Gly Phe Tyr Asn Met Ala Ile Trp Tyr Phe Val Pro Tyr Leu Trp Val 325 330 335aac cac tgg ctc gtc gcc att acc ttt ctg cag cac aca gat cct act 1056Asn His Trp Leu Val Ala Ile Thr Phe Leu Gln His Thr Asp Pro Thr 340 345 350ctt ccc cac tac acc aac gac gag tgg aac ttt gtg cga ggt gcc gct 1104Leu Pro His Tyr Thr Asn Asp Glu Trp Asn Phe Val Arg Gly Ala Ala 355 360 365gca acc atc gac cga gag atg ggc ttc att gga cgt cat ctg ctc cac 1152Ala Thr Ile Asp Arg Glu Met Gly Phe Ile Gly Arg His Leu Leu His 370 375 380ggc att atc gag 
act cac gtc ctg cat cac tac gtc tct tcc att ccc 1200Gly Ile Ile Glu Thr His Val Leu His His Tyr Val Ser Ser Ile Pro385 390 395 400ttc tac aat gcg gac gaa gct acc gag gcc atc aaa cct atc atg ggc 1248Phe Tyr Asn Ala Asp Glu Ala Thr Glu Ala Ile Lys Pro Ile Met Gly 405 410 415aag cac tat cga gct gat gtc cag gac ggt cct cga gga ttc att cga 1296Lys His Tyr Arg Ala Asp Val Gln Asp Gly Pro Arg Gly Phe Ile Arg 420 425 430gcc atg tac cga tct gca cga atg tgc cag tgg gtt gaa ccc tcc gct 1344Ala Met Tyr Arg Ser Ala Arg Met Cys Gln Trp Val Glu Pro Ser Ala 435 440 445ggt gcc gag gga gct ggc aag ggt gtc ctg ttc ttt cga aac cga aac 1392Gly Ala Glu Gly Ala Gly Lys Gly Val Leu Phe Phe Arg Asn Arg Asn 450 455 460aat gtg ggc act cct ccc gct gtc atc aag ccc gtt gcc taa 1434Asn Val Gly Thr Pro Pro Ala Val Ile Lys Pro Val Ala465 470 47575477PRTFusarium moniliforme 75Met Ala Ser Thr Ser Ala Leu Pro Lys Gln Asn Pro Ala Leu Arg Arg1 5 10 15Thr Val Thr Ser Thr Thr Val Thr Asp Ser Glu Ser Ala Ala Val Ser 20 25 30Pro Ser Asp Ser Pro Arg His Ser Ala Ser Ser Thr Ser Leu Ser Ser 35 40 45Met Ser Glu Val Asp Ile Ala Lys Pro Lys Ser Glu Tyr Gly Val Met 50 55 60Leu Asp Thr Tyr Gly Asn Gln Phe Glu Val Pro Asp Phe Thr Ile Lys65 70 75 80Asp Ile Tyr Asn Ala Ile Pro Lys His Cys Phe Lys Arg Ser Ala Leu 85 90 95Lys Gly Tyr Gly Tyr Ile Leu Arg Asp Ile Val Leu Leu Thr Thr Thr 100 105 110Phe Ser Ile Trp Tyr Asn Phe Val Thr Pro Glu Tyr Ile Pro Ser Thr 115 120 125Pro Ala Arg Ala Gly Leu Trp Ala Val Tyr Thr Val Leu Gln Gly Leu 130 135 140Phe Gly Thr Gly Leu Trp Val Ile Ala His Glu Cys Gly His Gly Ala145 150 155 160Phe Ser Asp Ser Arg Ile Ile Asn Asp Ile Thr Gly Trp Val Leu His 165 170 175Ser Ser Leu Leu Val Pro Tyr Phe Ser Trp Gln Ile Ser His Arg Lys 180 185 190His His Lys Ala Thr Gly Asn Met Glu Arg Asp Met Val Phe Val Pro 195 200 205Arg Thr Arg Glu Gln Gln Ala Thr Arg Leu Gly Lys Met Thr His Glu 210 215 220Leu Ala His Leu Thr Glu Glu Thr Pro Ala Phe Thr Leu Leu Met Leu225 230 235 240Val Leu Gln Gln Leu Val Gly 
Trp Pro Asn Tyr Leu Ile Thr Asn Val 245 250 255Thr Gly His Asn Tyr His Glu Arg Gln Arg Glu Gly Arg Gly Lys Gly 260 265 270Lys His Asn Gly Leu Gly Gly Gly Val Asn His Phe Asp Pro Arg Ser 275 280 285Pro Leu Tyr Glu Asn Ser Asp Ala Lys Leu Ile Val Leu Ser Asp Ile 290 295 300Gly Ile Gly Leu Met Ala Thr Ala Leu Tyr Phe Leu Val Gln Lys Phe305 310 315 320Gly Phe Tyr Asn Met Ala Ile Trp Tyr Phe Val Pro Tyr Leu Trp Val 325 330 335Asn His Trp Leu Val Ala Ile Thr Phe Leu Gln His Thr Asp Pro Thr 340 345 350Leu Pro His Tyr Thr Asn Asp Glu Trp Asn Phe Val Arg Gly Ala Ala 355 360 365Ala Thr Ile Asp Arg Glu Met Gly Phe Ile Gly Arg His Leu Leu His 370 375 380Gly Ile Ile Glu Thr His Val Leu His His Tyr Val Ser Ser Ile Pro385 390 395 400Phe Tyr Asn Ala Asp Glu Ala Thr Glu Ala Ile Lys Pro Ile Met Gly 405 410 415Lys His Tyr Arg Ala Asp Val Gln Asp Gly Pro Arg Gly Phe Ile Arg 420 425 430Ala Met Tyr Arg Ser Ala Arg Met Cys Gln Trp Val Glu Pro Ser Ala 435 440 445Gly Ala Glu Gly Ala Gly Lys Gly Val Leu Phe Phe Arg Asn Arg Asn 450 455 460Asn Val Gly Thr Pro Pro Ala Val Ile Lys Pro Val Ala465 470 475761272DNAArtificial Sequencemutant EgD8M delta-8 desaturase (also designated as "EgD8S-23") 76c atg gtg aag gct tct cga cag gct ctg ccc ctc gtc atc gac gga aag 49 Met Val Lys Ala Ser Arg Gln Ala Leu Pro Leu Val Ile Asp Gly Lys 1 5 10 15gtg tac gac gtc tcc gct tgg gtg aac ttc cac cct ggt gga gct gaa 97Val Tyr Asp Val Ser Ala Trp Val Asn Phe His Pro Gly Gly Ala Glu 20 25 30atc att gag aac tac cag gga cga gat gct act gac gcc ttc atg gtt 145Ile Ile Glu Asn Tyr Gln Gly Arg Asp Ala Thr Asp Ala Phe Met Val 35 40 45atg cac tct cag gaa gcc ttc gac aag ctc aag cga atg ccc aag atc 193Met His Ser Gln Glu Ala Phe Asp Lys Leu Lys Arg Met Pro Lys Ile 50 55 60aac cag gct tcc gag ctg cct ccc cag gct gcc gtc aac gaa gct cag 241Asn Gln Ala Ser Glu Leu Pro Pro Gln Ala Ala Val Asn Glu Ala Gln65 70 75 80gag gat ttc cga aag ctc cga gaa gag ctg atc gcc act ggc atg ttt 289Glu Asp Phe Arg Lys Leu Arg Glu Glu Leu Ile Ala 
Thr Gly Met Phe 85 90 95gac gcc tct ccc ctc tgg tac tcg tac aag atc ttg acc acc ctg ggt 337Asp Ala Ser Pro Leu Trp Tyr Ser Tyr Lys Ile Leu Thr Thr Leu Gly 100 105 110ctt ggc gtg ctt gcc ttc ttc atg ctg gtc cag tac cac ctg tac ttc 385Leu Gly Val Leu Ala Phe Phe Met Leu Val Gln Tyr His Leu Tyr Phe 115 120 125att ggt gct ctc gtg ctc ggt atg cac tac cag caa atg gga tgg ctg 433Ile Gly Ala Leu Val Leu Gly Met His Tyr Gln Gln Met Gly Trp Leu 130 135 140tct cat gac atc tgc cac cac cag acc ttc aag aac cga aac tgg aat 481Ser His Asp Ile Cys His His Gln Thr Phe Lys Asn Arg Asn Trp Asn145 150 155 160aac gtc ctg ggt ctg gtc ttt ggc aac gga ctc cag ggc ttc tcc gtg 529Asn Val Leu Gly Leu Val Phe Gly Asn Gly Leu Gln Gly Phe Ser Val 165 170 175acc tgg tgg aag gac aga cac aac gcc cat cat tct gct acc aac gtt 577Thr Trp Trp Lys Asp Arg His Asn Ala His His Ser Ala Thr Asn Val 180 185 190cag ggt cac gat ccc gac att gat aac ctg cct ctg ctc gcc tgg tcc 625Gln Gly His Asp Pro Asp Ile Asp Asn Leu Pro Leu Leu Ala Trp Ser 195 200 205gag gac gat gtc act cga gct tct ccc atc tcc cga aag ctc att cag 673Glu Asp Asp Val Thr Arg Ala Ser Pro Ile Ser Arg Lys Leu Ile Gln 210 215 220ttc caa cag tac tat ttc ctg gtc atc tgt att ctc ctg cga ttc atc 721Phe Gln Gln Tyr Tyr Phe Leu Val Ile Cys Ile Leu Leu Arg Phe Ile225 230 235 240tgg tgt ttc cag tct gtg ctg acc gtt cga tcc ctc aag gac cga gac 769Trp Cys Phe Gln Ser Val Leu Thr Val Arg Ser Leu Lys Asp Arg Asp 245 250 255aac cag ttc tac cga tct cag tac aag aaa gag gcc att gga ctc gct 817Asn Gln Phe Tyr Arg Ser Gln Tyr Lys Lys Glu Ala Ile Gly Leu Ala 260 265 270ctg cac tgg act ctc aag acc ctg ttc cac ctc ttc ttt atg ccc tcc 865Leu His Trp Thr Leu Lys Thr Leu Phe His Leu Phe Phe Met Pro Ser 275 280 285atc ctg acc tcg atg ctg gtg ttc ttt gtt tcc gag ctc gtc ggt ggc 913Ile Leu Thr Ser Met Leu Val Phe Phe Val Ser Glu Leu Val Gly Gly 290 295 300ttc gga att gcc atc gtg gtc ttc atg aac cac tac cct ctg gag aag 961Phe Gly Ile Ala Ile Val Val Phe Met Asn His Tyr Pro Leu 
Glu Lys305 310 315 320atc ggt gat tcc gtc tgg gac gga cat ggc ttc tct gtg ggt cag atc 1009Ile Gly Asp Ser Val Trp Asp Gly His Gly Phe Ser Val Gly Gln Ile 325 330 335cat gag acc atg aac att cga cga ggc atc att act gac tgg ttc ttt 1057His Glu Thr Met Asn Ile Arg Arg Gly Ile Ile Thr Asp Trp Phe Phe 340 345 350gga ggc ctg aac tac cag atc gag cac cat ctc tgg ccc acc ctg cct 1105Gly Gly Leu Asn Tyr Gln Ile Glu His His Leu Trp Pro Thr Leu Pro 355 360 365cga cac aac ctc act gcc gtt tcc tac cag gtg gaa cag ctg tgc cag 1153Arg His Asn Leu Thr Ala Val Ser Tyr Gln Val Glu Gln Leu Cys Gln 370 375 380aag cac aac ctc ccc tac cga aac cct ctg ccc cat gaa ggt ctc gtc 1201Lys His Asn Leu Pro Tyr Arg Asn Pro Leu Pro His Glu Gly Leu Val385 390 395 400atc ctg ctc cga tac ctg tcc cag ttc gct cga atg gcc gag aag cag 1249Ile Leu Leu Arg Tyr Leu Ser Gln Phe Ala Arg Met Ala Glu Lys Gln 405 410 415ccc ggt gcc aag gct cag taa gc 1272Pro Gly Ala Lys Ala Gln 42077422PRTArtificial SequenceSynthetic Construct 77Met Val Lys Ala Ser Arg Gln Ala Leu Pro Leu Val Ile Asp Gly Lys1 5 10 15Val Tyr Asp Val Ser Ala Trp Val Asn Phe His Pro Gly Gly Ala Glu 20 25 30Ile Ile Glu Asn Tyr Gln Gly Arg Asp Ala Thr Asp Ala Phe Met Val 35 40 45Met His Ser Gln Glu Ala Phe Asp Lys Leu Lys Arg Met Pro Lys Ile 50 55 60Asn Gln Ala Ser Glu Leu Pro Pro Gln Ala Ala Val Asn Glu Ala Gln65 70 75 80Glu Asp Phe Arg Lys Leu Arg Glu Glu Leu Ile Ala Thr Gly Met Phe 85 90 95Asp Ala Ser Pro Leu Trp Tyr Ser Tyr Lys Ile Leu Thr Thr Leu Gly 100 105 110Leu Gly Val Leu Ala Phe Phe Met Leu Val Gln Tyr His Leu Tyr Phe 115 120 125Ile Gly Ala Leu Val Leu Gly Met His Tyr Gln Gln Met Gly Trp Leu 130 135 140Ser His Asp Ile Cys His His Gln Thr Phe Lys Asn Arg Asn Trp Asn145 150 155 160Asn Val Leu Gly Leu Val Phe Gly Asn Gly Leu Gln Gly Phe Ser Val 165 170 175Thr Trp Trp Lys Asp Arg His Asn Ala His His Ser Ala Thr Asn Val 180 185 190Gln Gly His Asp Pro Asp Ile Asp Asn Leu Pro Leu Leu Ala Trp Ser 195 200 205Glu Asp Asp Val Thr Arg Ala Ser Pro Ile Ser 
Arg Lys Leu Ile Gln 210 215 220Phe Gln Gln Tyr Tyr Phe Leu Val Ile Cys Ile Leu Leu Arg Phe Ile225 230 235 240Trp Cys Phe Gln Ser Val Leu Thr Val Arg Ser Leu Lys Asp Arg Asp 245 250 255Asn Gln Phe Tyr Arg Ser Gln Tyr Lys Lys Glu Ala Ile Gly Leu Ala 260 265 270Leu His Trp Thr Leu Lys Thr Leu Phe His Leu Phe Phe Met Pro Ser 275 280 285Ile Leu Thr Ser Met Leu Val Phe Phe Val Ser Glu Leu Val Gly Gly 290 295 300Phe Gly Ile Ala Ile Val Val Phe Met Asn His Tyr Pro Leu Glu Lys305 310 315 320Ile Gly Asp Ser Val Trp Asp Gly His Gly Phe Ser Val Gly Gln Ile 325 330 335His Glu Thr Met Asn Ile Arg Arg Gly Ile Ile Thr Asp Trp Phe Phe 340 345 350Gly Gly Leu Asn Tyr Gln Ile Glu His His Leu Trp Pro Thr Leu Pro 355 360 365Arg His Asn Leu Thr Ala Val Ser Tyr Gln Val Glu Gln Leu Cys Gln 370 375 380Lys His Asn Leu Pro Tyr Arg Asn Pro Leu Pro His Glu Gly Leu Val385 390 395 400Ile Leu Leu Arg Tyr Leu Ser Gln Phe Ala Arg Met Ala Glu Lys Gln 405 410 415Pro Gly Ala Lys Ala Gln 42078792DNAEutreptiella sp. CCMP389CDS(1)..(792)synthetic delta-9 elongase (codon-optimized for Yarrowia lipolytica) 78atg gct gcc gtc atc gag gtg gcc aac gag ttc gtc gct atc act gcc 48Met Ala Ala Val Ile Glu Val Ala Asn Glu Phe Val Ala Ile Thr Ala1 5 10 15gag acc ctt ccc aag gtg gac tat cag cga ctc tgg cga gac atc tac 96Glu Thr Leu Pro Lys Val Asp Tyr Gln Arg Leu Trp Arg Asp Ile Tyr 20 25 30tcc tgc gag ctc ctg tac ttc tcc att gct ttc gtc atc ctc aag ttt 144Ser Cys Glu Leu Leu Tyr Phe Ser Ile Ala Phe Val Ile Leu Lys Phe 35 40 45acc ctt ggc gag ctc tcg gat tct ggc aaa aag att ctg cga gtg ctg 192Thr Leu Gly Glu Leu Ser Asp Ser Gly Lys Lys Ile Leu Arg Val Leu 50 55 60ttc aag tgg tac aac ctc ttc atg tcc gtc ttt tcg ctg gtg tcc ttc 240Phe Lys Trp Tyr Asn Leu Phe Met Ser Val Phe Ser Leu Val Ser Phe65 70 75 80ctc tgt atg ggt tac gcc atc tac acc gtt gga ctg tac tcc aac gaa 288Leu Cys Met Gly Tyr Ala Ile Tyr Thr Val Gly Leu Tyr Ser Asn Glu 85 90 95tgc gac aga gct ttc gac aac agc ttg ttc cga ttt gcc acc aag gtc 336Cys Asp Arg Ala 
Phe Asp Asn Ser Leu Phe Arg Phe Ala Thr Lys Val 100 105 110ttc tac tat tcc aag ttt ctg gag tac atc gac tct ttc tac ctt ccc 384Phe Tyr Tyr Ser Lys Phe Leu Glu Tyr Ile Asp Ser Phe Tyr Leu Pro 115 120 125ctc atg gcc aag cct ctg tcc ttt ctg cag ttc ttt cat cac ttg gga 432Leu Met Ala Lys Pro Leu Ser Phe Leu Gln Phe Phe His His Leu Gly 130 135 140gct cct atg gac atg tgg ctc ttc gtg cag tac tct ggc gaa tcc att 480Ala Pro Met Asp Met Trp Leu Phe Val Gln Tyr Ser Gly Glu Ser Ile145 150 155 160tgg atc ttt gtg ttc ctg aac gga ttc att cac ttt gtc atg tac ggc 528Trp Ile Phe Val Phe Leu Asn Gly Phe Ile His Phe Val Met Tyr Gly 165 170 175tac tat tgg aca cgg ctg atg aag ttc aac ttt ccc atg ccc aag cag 576Tyr Tyr Trp Thr Arg Leu Met Lys Phe Asn Phe Pro Met Pro Lys Gln 180 185 190ctc att acc gca atg cag atc acc cag ttc aac gtt ggc ttc tac ctc 624Leu Ile Thr Ala Met Gln Ile Thr Gln Phe Asn Val Gly Phe Tyr Leu 195 200 205gtg tgg tgg tac aag gac att ccc tgt tac cga aag gat ccc atg cga 672Val Trp Trp Tyr Lys Asp Ile
Pro Cys Tyr Arg Lys Asp Pro Met Arg 210 215 220atg ctg gcc tgg atc ttc aac tac tgg tac gtc ggt acc gtt ctt ctg 720Met Leu Ala Trp Ile Phe Asn Tyr Trp Tyr Val Gly Thr Val Leu Leu225 230 235 240ctc ttc atc aac ttc ttt gtc aag tcc tac gtg ttt ccc aag cct aag 768Leu Phe Ile Asn Phe Phe Val Lys Ser Tyr Val Phe Pro Lys Pro Lys 245 250 255act gcc gac aaa aag gtc cag tag 792Thr Ala Asp Lys Lys Val Gln 26079263PRTEutreptiella sp. CCMP389 79Met Ala Ala Val Ile Glu Val Ala Asn Glu Phe Val Ala Ile Thr Ala1 5 10 15Glu Thr Leu Pro Lys Val Asp Tyr Gln Arg Leu Trp Arg Asp Ile Tyr 20 25 30Ser Cys Glu Leu Leu Tyr Phe Ser Ile Ala Phe Val Ile Leu Lys Phe 35 40 45Thr Leu Gly Glu Leu Ser Asp Ser Gly Lys Lys Ile Leu Arg Val Leu 50 55 60Phe Lys Trp Tyr Asn Leu Phe Met Ser Val Phe Ser Leu Val Ser Phe65 70 75 80Leu Cys Met Gly Tyr Ala Ile Tyr Thr Val Gly Leu Tyr Ser Asn Glu 85 90 95Cys Asp Arg Ala Phe Asp Asn Ser Leu Phe Arg Phe Ala Thr Lys Val 100 105 110Phe Tyr Tyr Ser Lys Phe Leu Glu Tyr Ile Asp Ser Phe Tyr Leu Pro 115 120 125Leu Met Ala Lys Pro Leu Ser Phe Leu Gln Phe Phe His His Leu Gly 130 135 140Ala Pro Met Asp Met Trp Leu Phe Val Gln Tyr Ser Gly Glu Ser Ile145 150 155 160Trp Ile Phe Val Phe Leu Asn Gly Phe Ile His Phe Val Met Tyr Gly 165 170 175Tyr Tyr Trp Thr Arg Leu Met Lys Phe Asn Phe Pro Met Pro Lys Gln 180 185 190Leu Ile Thr Ala Met Gln Ile Thr Gln Phe Asn Val Gly Phe Tyr Leu 195 200 205Val Trp Trp Tyr Lys Asp Ile Pro Cys Tyr Arg Lys Asp Pro Met Arg 210 215 220Met Leu Ala Trp Ile Phe Asn Tyr Trp Tyr Val Gly Thr Val Leu Leu225 230 235 240Leu Phe Ile Asn Phe Phe Val Lys Ser Tyr Val Phe Pro Lys Pro Lys 245 250 255Thr Ala Asp Lys Lys Val Gln 260801350DNAEuglena gracilisCDS(1)..(1350)synthetic delta-5 desaturase (codon-optimized for Yarrowia lipolytica) 80atg gct ctc tcc ctt act acc gag cag ctg ctc gag cga ccc gac ctg 48Met Ala Leu Ser Leu Thr Thr Glu Gln Leu Leu Glu Arg Pro Asp Leu1 5 10 15gtt gcc atc gac ggc att ctc tac gat ctg gaa ggt ctt gcc aag gtc 96Val Ala Ile Asp Gly Ile Leu 
Tyr Asp Leu Glu Gly Leu Ala Lys Val 20 25 30cat ccc gga ggc gac ttg atc ctc gct tct ggt gcc tcc gat gct tct 144His Pro Gly Gly Asp Leu Ile Leu Ala Ser Gly Ala Ser Asp Ala Ser 35 40 45cct ctg ttc tac tcc atg cac cct tac gtc aag ccc gag aac tcg aag 192Pro Leu Phe Tyr Ser Met His Pro Tyr Val Lys Pro Glu Asn Ser Lys 50 55 60ctg ctt caa cag ttc gtg cga ggc aag cac gac cga acc tcc aag gac 240Leu Leu Gln Gln Phe Val Arg Gly Lys His Asp Arg Thr Ser Lys Asp65 70 75 80att gtc tac acc tac gac tct ccc ttt gca cag gac gtc aag cga act 288Ile Val Tyr Thr Tyr Asp Ser Pro Phe Ala Gln Asp Val Lys Arg Thr 85 90 95atg cga gag gtc atg aaa ggt cgg aac tgg tat gcc aca cct gga ttc 336Met Arg Glu Val Met Lys Gly Arg Asn Trp Tyr Ala Thr Pro Gly Phe 100 105 110tgg ctg cga acc gtt ggc atc att gct gtc acc gcc ttt tgc gag tgg 384Trp Leu Arg Thr Val Gly Ile Ile Ala Val Thr Ala Phe Cys Glu Trp 115 120 125cac tgg gct act acc gga atg gtg ctg tgg ggt ctc ttg act gga ttc 432His Trp Ala Thr Thr Gly Met Val Leu Trp Gly Leu Leu Thr Gly Phe 130 135 140atg cac atg cag atc ggc ctg tcc att cag cac gat gcc tct cat ggt 480Met His Met Gln Ile Gly Leu Ser Ile Gln His Asp Ala Ser His Gly145 150 155 160gcc atc agc aaa aag ccc tgg gtc aac gct ctc ttt gcc tac ggc atc 528Ala Ile Ser Lys Lys Pro Trp Val Asn Ala Leu Phe Ala Tyr Gly Ile 165 170 175gac gtc att gga tcg tcc aga tgg atc tgg ctg cag tct cac atc atg 576Asp Val Ile Gly Ser Ser Arg Trp Ile Trp Leu Gln Ser His Ile Met 180 185 190cga cat cac acc tac acc aat cag cat ggt ctc gac ctg gat gcc gag 624Arg His His Thr Tyr Thr Asn Gln His Gly Leu Asp Leu Asp Ala Glu 195 200 205tcc gca gaa cca ttc ctt gtg ttc cac aac tac cct gct gcc aac act 672Ser Ala Glu Pro Phe Leu Val Phe His Asn Tyr Pro Ala Ala Asn Thr 210 215 220gct cga aag tgg ttt cac cga ttc cag gcc tgg tac atg tac ctc gtg 720Ala Arg Lys Trp Phe His Arg Phe Gln Ala Trp Tyr Met Tyr Leu Val225 230 235 240ctt gga gcc tac ggc gtt tcg ctg gtg tac aac cct ctc tac atc ttc 768Leu Gly Ala Tyr Gly Val Ser Leu Val Tyr Asn 
Pro Leu Tyr Ile Phe 245 250 255cga atg cag cac aac gac acc att ccc gag tct gtc aca gcc atg cga 816Arg Met Gln His Asn Asp Thr Ile Pro Glu Ser Val Thr Ala Met Arg 260 265 270gag aac ggc ttt ctg cga cgg tac cga acc ctt gca ttc gtt atg cga 864Glu Asn Gly Phe Leu Arg Arg Tyr Arg Thr Leu Ala Phe Val Met Arg 275 280 285gct ttc ttc atc ttt cga acc gcc ttc ttg ccc tgg tat ctc act gga 912Ala Phe Phe Ile Phe Arg Thr Ala Phe Leu Pro Trp Tyr Leu Thr Gly 290 295 300acc tcc ctg ctc atc acc att cct ctg gtg ccc act gct acc ggt gcc 960Thr Ser Leu Leu Ile Thr Ile Pro Leu Val Pro Thr Ala Thr Gly Ala305 310 315 320ttc ctc acc ttc ttt ttc atc ttg tct cac aac ttc gat ggc tcg gag 1008Phe Leu Thr Phe Phe Phe Ile Leu Ser His Asn Phe Asp Gly Ser Glu 325 330 335cga atc ccc gac aag aac tgc aag gtc aag agc tcc gag aag gac gtt 1056Arg Ile Pro Asp Lys Asn Cys Lys Val Lys Ser Ser Glu Lys Asp Val 340 345 350gaa gcc gat cag atc gac tgg tac aga gct cag gtg gag acc tct tcc 1104Glu Ala Asp Gln Ile Asp Trp Tyr Arg Ala Gln Val Glu Thr Ser Ser 355 360 365acc tac ggt gga ccc att gcc atg ttc ttt act ggc ggt ctc aac ttc 1152Thr Tyr Gly Gly Pro Ile Ala Met Phe Phe Thr Gly Gly Leu Asn Phe 370 375 380cag atc gag cat cac ctc ttt cct cga atg tcg tct tgg cac tat ccc 1200Gln Ile Glu His His Leu Phe Pro Arg Met Ser Ser Trp His Tyr Pro385 390 395 400ttc gtg cag caa gct gtc cga gag tgt tgc gaa cga cac gga gtt cgg 1248Phe Val Gln Gln Ala Val Arg Glu Cys Cys Glu Arg His Gly Val Arg 405 410 415tac gtc ttc tac cct acc att gtg ggc aac atc att tcc acc ctc aag 1296Tyr Val Phe Tyr Pro Thr Ile Val Gly Asn Ile Ile Ser Thr Leu Lys 420 425 430tac atg cac aaa gtc ggt gtg gtt cac tgt gtc aag gac gct cag gat 1344Tyr Met His Lys Val Gly Val Val His Cys Val Lys Asp Ala Gln Asp 435 440 445tcc taa 1350Ser 81449PRTEuglena gracilis 81Met Ala Leu Ser Leu Thr Thr Glu Gln Leu Leu Glu Arg Pro Asp Leu1 5 10 15Val Ala Ile Asp Gly Ile Leu Tyr Asp Leu Glu Gly Leu Ala Lys Val 20 25 30His Pro Gly Gly Asp Leu Ile Leu Ala Ser Gly Ala Ser Asp Ala Ser 
35 40 45Pro Leu Phe Tyr Ser Met His Pro Tyr Val Lys Pro Glu Asn Ser Lys 50 55 60Leu Leu Gln Gln Phe Val Arg Gly Lys His Asp Arg Thr Ser Lys Asp65 70 75 80Ile Val Tyr Thr Tyr Asp Ser Pro Phe Ala Gln Asp Val Lys Arg Thr 85 90 95Met Arg Glu Val Met Lys Gly Arg Asn Trp Tyr Ala Thr Pro Gly Phe 100 105 110Trp Leu Arg Thr Val Gly Ile Ile Ala Val Thr Ala Phe Cys Glu Trp 115 120 125His Trp Ala Thr Thr Gly Met Val Leu Trp Gly Leu Leu Thr Gly Phe 130 135 140Met His Met Gln Ile Gly Leu Ser Ile Gln His Asp Ala Ser His Gly145 150 155 160Ala Ile Ser Lys Lys Pro Trp Val Asn Ala Leu Phe Ala Tyr Gly Ile 165 170 175Asp Val Ile Gly Ser Ser Arg Trp Ile Trp Leu Gln Ser His Ile Met 180 185 190Arg His His Thr Tyr Thr Asn Gln His Gly Leu Asp Leu Asp Ala Glu 195 200 205Ser Ala Glu Pro Phe Leu Val Phe His Asn Tyr Pro Ala Ala Asn Thr 210 215 220Ala Arg Lys Trp Phe His Arg Phe Gln Ala Trp Tyr Met Tyr Leu Val225 230 235 240Leu Gly Ala Tyr Gly Val Ser Leu Val Tyr Asn Pro Leu Tyr Ile Phe 245 250 255Arg Met Gln His Asn Asp Thr Ile Pro Glu Ser Val Thr Ala Met Arg 260 265 270Glu Asn Gly Phe Leu Arg Arg Tyr Arg Thr Leu Ala Phe Val Met Arg 275 280 285Ala Phe Phe Ile Phe Arg Thr Ala Phe Leu Pro Trp Tyr Leu Thr Gly 290 295 300Thr Ser Leu Leu Ile Thr Ile Pro Leu Val Pro Thr Ala Thr Gly Ala305 310 315 320Phe Leu Thr Phe Phe Phe Ile Leu Ser His Asn Phe Asp Gly Ser Glu 325 330 335Arg Ile Pro Asp Lys Asn Cys Lys Val Lys Ser Ser Glu Lys Asp Val 340 345 350Glu Ala Asp Gln Ile Asp Trp Tyr Arg Ala Gln Val Glu Thr Ser Ser 355 360 365Thr Tyr Gly Gly Pro Ile Ala Met Phe Phe Thr Gly Gly Leu Asn Phe 370 375 380Gln Ile Glu His His Leu Phe Pro Arg Met Ser Ser Trp His Tyr Pro385 390 395 400Phe Val Gln Gln Ala Val Arg Glu Cys Cys Glu Arg His Gly Val Arg 405 410 415Tyr Val Phe Tyr Pro Thr Ile Val Gly Asn Ile Ile Ser Thr Leu Lys 420 425 430Tyr Met His Lys Val Gly Val Val His Cys Val Lys Asp Ala Gln Asp 435 440 445Ser 826356DNAArtificial SequencePlasmid pY157 82ttgagaagcc cattgtatat tattaggatc gtagcattat tgtggcaaaa 
aatattcaag 60tgctcatgtg aattgacacg atcacgtaaa tacctggtga aattgctagt attcgtgatg 120ttctaataca actctgttca atatttccgg cgctctcttg tatacaagag cacaagacat 180gcaccccaca ttaaccgagg tcaagtgttt atgtatgaaa agtgacataa atcgtccaaa 240aaaaagtagc acatagttgt atggctgtaa gttatgtgat tgtcagttct tcggccttcc 300aactcctatg caccgtcttc aatcatctac ccccgtgccc cacaccccgc actattagag 360tttatcacag tcagctaaac tgcttgcaca tctacacctc tgactacacc accatggatt 420tcttcagacg gcaccagaaa aaggtgctgg cactggtagg tgtggcgctg agttcctacc 480tgtttatcga ctatgtgaag aaaaagttct tcgagatcca gggtcgtttg agctcggagc 540gaaccgctaa acagaatctc cggcgccgat ttgaacagaa ccagcaggat gcagatttta 600caatcatggc tctgctatcc agcttgacga caccggtaat ggagcgttac cccgtcgacc 660agatcaaggc agagttacag agcaagagac gccccacaga ccgggttttg gctctcgaga 720gctccacctc gtcctcagct accgcacaaa ccgtgcccac catgacaagt ggcgccacag 780aggagggcga gaagttaatt aactttggcc ggcctttacc tgcaggataa cttcgtataa 840tgtatgctat acgaagttat gaattctctc tcttgagctt ttccataaca agttcttctg 900cctccaggaa gtccatgggt ggtttgatca tggttttggt gtagtggtag tgcagtggtg 960gtattgtgac tggggatgta gttgagaata agtcatacac aagtcagctt tcttcgagcc 1020tcatataagt ataagtagtt caacgtatta gcactgtacc cagcatctcc gtatcgagaa 1080acacaacaac atgccccatt ggacagatca tgcggataca caggttgtgc agtatcatac 1140atactcgatc agacaggtcg tctgaccatc atacaagctg aacaagcgct ccatacttgc 1200acgctctcta tatacacagt taaattacat atccatagtc taacctctaa cagttaatct 1260tctggtaagc ctcccagcca gccttctggt atcgcttggc ctcctcaata ggatctcggt 1320tctggccgta cagacctcgg ccgacaatta tgatatccgt tccggtagac atgacatcct 1380caacagttcg gtactgctgt ccgagagcgt ctcccttgtc gtcaagaccc accccggggg 1440tcagaataag ccagtcctca gagtcgccct taggtcggtt ctgggcaatg aagccaacca 1500caaactcggg gtcggatcgg gcaagctcaa tggtctgctt ggagtactcg ccagtggcca 1560gagagccctt gcaagacagc tcggccagca tgagcagacc tctggccagc ttctcgttgg 1620gagaggggac taggaactcc ttgtactggg agttctcgta gtcagagacg tcctccttct 1680tctgttcaga gacagtttcc tcggcaccag ctcgcaggcc agcaatgatt ccggttccgg 1740gtacaccgtg ggcgttggtg atatcggacc 
actcggcgat tcggtgacac cggtactggt 1800gcttgacagt gttgccaata tctgcgaact ttctgtcctc gaacaggaag aaaccgtgct 1860taagagcaag ttccttgagg gggagcacag tgccggcgta ggtgaagtcg tcaatgatgt 1920cgatatgggt tttgatcatg cacacataag gtccgacctt atcggcaagc tcaatgagct 1980ccttggtggt ggtaacatcc agagaagcac acaggttggt tttcttggct gccacgagct 2040tgagcactcg agcggcaaag gcggacttgt ggacgttagc tcgagcttcg taggagggca 2100ttttggtggt gaagaggaga ctgaaataaa tttagtctgc agaacttttt atcggaacct 2160tatctggggc agtgaagtat atgttatggt aatagttacg agttagttga acttatagat 2220agactggact atacggctat cggtccaaat tagaaagaac gtcaatggct ctctgggcgt 2280cgcctttgcc gacaaaaatg tgatcatgat gaaagccagc aatgacgttg cagctgatat 2340tgttgtcggc caaccgcgcc gaaaacgcag ctgtcagacc cacagcctcc aacgaagaat 2400gtatcgtcaa agtgatccaa gcacactcat agttggagtc gtactccaaa ggcggcaatg 2460acgagtcaga cagatactcg tcgactcatc gatataactt cgtataatgt atgctatacg 2520aagttatcct aggtatagat cttgcacttc ttattttctt cacgcgtttg cagctcaaca 2580ttctaggacg acgaaactac gtcaacagtg ttgtcgctct ggcgcagcag ggccgagagg 2640gtaatgccga gggtcgagtg gcgccctcgt ttggtgatct tgcagatatg ggctatttcg 2700gcgacctttc aggctcgtcc agcttcggag aaactattgt cgatcccgat ctggacgaac 2760agtaccttac cttttcgtgg tggctgctga acgagggatg ggtgtcgctg agcgagcgag 2820tggaggaagc ggttcgtcga gtgtgggacc ccgtgtcacc caaggccgaa cttggatttg 2880acgagttgtc ggaactcatt ggacgaacac agatgctcat tgatcgacct ctcaatccct 2940cgtcgccact caactttctg agccagctgc tgccaccacg ggagcaggag gagtacgtgc 3000ttgcccagaa ccccagcgat actgctgccc ccattgtagg acctaccctc cgacggcttc 3060tggacgagac tgccgacttc atcgagtccc ctaatgccgc agaggtgatt gagcgacttg 3120ttcactccgg tctctctgtg ttcatggaca agctggctgt cacgtttgga gccacacctg 3180ctgattcggg ttcgccttat cctgtggtgc tgcctactgc aaaggtcaag ctgccctcca 3240ttcttgccaa catggctcga caggctggag gcatggccca gggatcgccg ggcgtggaaa 3300acgagtacat tgacgtgatg aaccaagtgc aggagctgac ctcctttagt gctgtggtct 3360attcatcttt tgattgggct ctctagaggc tcattcacga aagacacgaa gaacgaagat 3420ggggactgaa tacagcgctc tcatttgtac acaaatgatt tatgacagag taacttgtac 
3480atcatgtaga gcatacatac tgaaggtgtg atctcacggg atatcttgaa gaccactcgt 3540agctggaggc ataggtagtg ctagtacgga tacttgcacc gtatccaaca taagtagagg 3600agcctcctag tggctattgg tacaccgata aagatacaca tacatggcgc gccagctgca 3660ttaatgaatc ggccaacgcg cggggagagg cggtttgcgt attgggcgct cttccgcttc 3720ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc 3780aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc 3840aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag 3900gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc 3960gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt 4020tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct 4080ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg 4140ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 4200tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat 4260tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg 4320ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa 4380aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt 4440ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 4500tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt 4560atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta 4620aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat 4680ctcagcgatc tgtctatttc gttcatccat agttgcctga ctccccgtcg tgtagataac 4740tacgatacgg gagggcttac catctggccc cagtgctgca atgataccgc gagacccacg 4800ctcaccggct ccagatttat cagcaataaa ccagccagcc ggaagggccg agcgcagaag 4860tggtcctgca actttatccg cctccatcca gtctattaat tgttgccggg aagctagagt 4920aagtagttcg ccagttaata gtttgcgcaa cgttgttgcc attgctacag gcatcgtggt 4980gtcacgctcg tcgtttggta tggcttcatt cagctccggt tcccaacgat caaggcgagt 5040tacatgatcc cccatgttgt gcaaaaaagc ggttagctcc ttcggtcctc cgatcgttgt 5100cagaagtaag ttggccgcag tgttatcact catggttatg gcagcactgc ataattctct 5160tactgtcatg ccatccgtaa gatgcttttc 
tgtgactggt gagtactcaa ccaagtcatt 5220ctgagaatag tgtatgcggc gaccgagttg ctcttgcccg gcgtcaatac gggataatac 5280cgcgccacat agcagaactt taaaagtgct catcattgga aaacgttctt cggggcgaaa 5340actctcaagg atcttaccgc tgttgagatc cagttcgatg taacccactc gtgcacccaa 5400ctgatcttca gcatctttta ctttcaccag cgtttctggg tgagcaaaaa caggaaggca 5460aaatgccgca aaaaagggaa taagggcgac acggaaatgt tgaatactca tactcttcct 5520ttttcaatat tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga 5580atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc

5640tgatgcggtg tgaaataccg cacagatgcg taaggagaaa ataccgcatc aggaaattgt 5700aagcgttaat attttgttaa aattcgcgtt aaatttttgt taaatcagct cattttttaa 5760ccaataggcc gaaatcggca aaatccctta taaatcaaaa gaatagaccg agatagggtt 5820gagtgttgtt ccagtttgga acaagagtcc actattaaag aacgtggact ccaacgtcaa 5880agggcgaaaa accgtctatc agggcgatgg cccactacgt gaaccatcac cctaatcaag 5940ttttttgggg tcgaggtgcc gtaaagcact aaatcggaac cctaaaggga gcccccgatt 6000tagagcttga cggggaaagc cggcgaacgt ggcgagaaag gaagggaaga aagcgaaagg 6060agcgggcgct agggcgctgg caagtgtagc ggtcacgctg cgcgtaacca ccacacccgc 6120cgcgcttaat gcgccgctac agggcgcgtc cattcgccat tcaggctgcg caactgttgg 6180gaagggcgat cggtgcgggc ctcttcgcta ttacgccagc tggcgaaagg gggatgtgct 6240gcaaggcgat taagttgggt aacgccaggg ttttcccagt cacgacgttg taaaacgacg 6300gccagtgaat tgtaatacga ctcactatag ggcgaattgg gcccgacgtc gcatgc 6356835910DNAArtificial SequencePlasmid pY87 83catcaaagga agggtgaatc caaggaagtt cttgacaaac tgctggaatc ggtacagctt 60ggacgacttg tcgttgctaa cctggtcata gaggtcgttc tcaccaaagg ccatgatggg 120aacaagggcg acatttccga cctccatacc aagtcgaaca aaaccctttc gcttgagtag 180caccaggtcc atgacaccgg gtctggccag aagactttcc tgtgctccac caacgacaat 240gcagatagac tggtttcgct tgaggagggc cttgcaggac ttcttggaga cagaagcgac 300tcccagactc atgaggtact ctctgtagag aggcactcgg aagttgttgg tgagagtcat 360aagagaaaca gggatgcccg gaaagagctt ggaccatcca gctccctcgg tggcaattcc 420accaaaggct cccatgccga taatgccgtg ggggtggtag ccgaagatgt attttctgcc 480agtgggcttg agttttgtgg gcgacagctg tgggtcgttt tcgccaatga tctggttggc 540gtaggagttg agggacccgt taagaagcgt ggaatcagat gcagtggagc cagcagaggc 600ggacgacaaa ggtcgtcggt tagtggtgcc attgttgccg ttgccgttaa gttcggagcc 660cgaggcgtgg ccgttggagc cagatgattc tccacggcta tatctgctgt cgtggttaat 720taactttggc cggcctttac ctgcaggata acttcgtata atgtatgcta tacgaagtta 780tgaattctct ctcttgagct tttccataac aagttcttct gcctccagga agtccatggg 840tggtttgatc atggttttgg tgtagtggta gtgcagtggt ggtattgtga ctggggatgt 900agttgagaat aagtcataca caagtcagct ttcttcgagc ctcatataag tataagtagt 960tcaacgtatt 
agcactgtac ccagcatctc cgtatcgaga aacacaacaa catgccccat 1020tggacagatc atgcggatac acaggttgtg cagtatcata catactcgat cagacaggtc 1080gtctgaccat catacaagct gaacaagcgc tccatacttg cacgctctct atatacacag 1140ttaaattaca tatccatagt ctaacctcta acagttaatc ttctggtaag cctcccagcc 1200agccttctgg tatcgcttgg cctcctcaat aggatctcgg ttctggccgt acagacctcg 1260gccgacaatt atgatatccg ttccggtaga catgacatcc tcaacagttc ggtactgctg 1320tccgagagcg tctcccttgt cgtcaagacc caccccgggg gtcagaataa gccagtcctc 1380agagtcgccc ttaggtcggt tctgggcaat gaagccaacc acaaactcgg ggtcggatcg 1440ggcaagctca atggtctgct tggagtactc gccagtggcc agagagccct tgcaagacag 1500ctcggccagc atgagcagac ctctggccag cttctcgttg ggagagggga ctaggaactc 1560cttgtactgg gagttctcgt agtcagagac gtcctccttc ttctgttcag agacagtttc 1620ctcggcacca gctcgcaggc cagcaatgat tccggttccg ggtacaccgt gggcgttggt 1680gatatcggac cactcggcga ttcggtgaca ccggtactgg tgcttgacag tgttgccaat 1740atctgcgaac tttctgtcct cgaacaggaa gaaaccgtgc ttaagagcaa gttccttgag 1800ggggagcaca gtgccggcgt aggtgaagtc gtcaatgatg tcgatatggg ttttgatcat 1860gcacacataa ggtccgacct tatcggcaag ctcaatgagc tccttggtgg tggtaacatc 1920cagagaagca cacaggttgg ttttcttggc tgccacgagc ttgagcactc gagcggcaaa 1980ggcggacttg tggacgttag ctcgagcttc gtaggagggc attttggtgg tgaagaggag 2040actgaaataa atttagtctg cagaactttt tatcggaacc ttatctgggg cagtgaagta 2100tatgttatgg taatagttac gagttagttg aacttataga tagactggac tatacggcta 2160tcggtccaaa ttagaaagaa cgtcaatggc tctctgggcg tcgcctttgc cgacaaaaat 2220gtgatcatga tgaaagccag caatgacgtt gcagctgata ttgttgtcgg ccaaccgcgc 2280cgaaaacgca gctgtcagac ccacagcctc caacgaagaa tgtatcgtca aagtgatcca 2340agcacactca tagttggagt cgtactccaa aggcggcaat gacgagtcag acagatactc 2400gtcgactcat cgatataact tcgtataatg tatgctatac gaagttatcc taggtataga 2460tctcaccgta cgtttcatga aggcgggcag aaagtactcg atggtggaga tgattgctcg 2520gaggtacttg ttctgcggcc agtatctctc agcaatcagg tgatactcct ggacgtccag 2580agggtagtat gtgtgcgtgg gctccagatc caccgtcttg tgcagagtta tggggaagta 2640gcggccaaag agcttccaga tgaagaagtt tcttgaaata 
ggcgagtatc gcttgaccac 2700tcctccgttg gacggggagt cgtctttaac agcgtacact acatacgcaa tcacaaatgg 2760ccagagcagt ggaattgcgc agcatagcat gaaaattgtg aggaaagtgg gaatgctgaa 2820aatgtgccag accagagaga aggtctcaca tcggttgagt aatggtgtcg atagcggggc 2880atatcggatt cccgcgattt tgggtgccgt gtcgtttttg tctcgcgact tgtagtattg 2940tgagtcgata gtcatagctt ttgttttgtg tgacttgtct gttgcctgtt gttagaagaa 3000aaagtgggag cttatcagtc acggtccacg aacgatttcg tacttgtacg taattggtcg 3060tgagaactgt tgcagagccg gtgctttttt ttgtggccaa gtcgacaggt cgatttcggc 3120gctgtgcgag gttgctggga tgtgctggtt tggctgccaa atgtggggaa gatttcaacc 3180tcggatttga cgtgtgtaga ggcgcgccag ctgcattaat gaatcggcca acgcgcgggg 3240agaggcggtt tgcgtattgg gcgctcttcc gcttcctcgc tcactgactc gctgcgctcg 3300gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg gttatccaca 3360gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac 3420cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga cgagcatcac 3480aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag ataccaggcg 3540tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct taccggatac 3600ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg ctgtaggtat 3660ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag 3720cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt aagacacgac 3780ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta tgtaggcggt 3840gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac agtatttggt 3900atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc ttgatccggc 3960aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat tacgcgcaga 4020aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc tcagtggaac 4080gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt cacctagatc 4140cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta aacttggtct 4200gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct atttcgttca 4260tccatagttg cctgactccc cgtcgtgtag ataactacga tacgggaggg cttaccatct 4320ggccccagtg ctgcaatgat accgcgagac ccacgctcac cggctccaga tttatcagca 4380ataaaccagc 
cagccggaag ggccgagcgc agaagtggtc ctgcaacttt atccgcctcc 4440atccagtcta ttaattgttg ccgggaagct agagtaagta gttcgccagt taatagtttg 4500cgcaacgttg ttgccattgc tacaggcatc gtggtgtcac gctcgtcgtt tggtatggct 4560tcattcagct ccggttccca acgatcaagg cgagttacat gatcccccat gttgtgcaaa 4620aaagcggtta gctccttcgg tcctccgatc gttgtcagaa gtaagttggc cgcagtgtta 4680tcactcatgg ttatggcagc actgcataat tctcttactg tcatgccatc cgtaagatgc 4740ttttctgtga ctggtgagta ctcaaccaag tcattctgag aatagtgtat gcggcgaccg 4800agttgctctt gcccggcgtc aatacgggat aataccgcgc cacatagcag aactttaaaa 4860gtgctcatca ttggaaaacg ttcttcgggg cgaaaactct caaggatctt accgctgttg 4920agatccagtt cgatgtaacc cactcgtgca cccaactgat cttcagcatc ttttactttc 4980accagcgttt ctgggtgagc aaaaacagga aggcaaaatg ccgcaaaaaa gggaataagg 5040gcgacacgga aatgttgaat actcatactc ttcctttttc aatattattg aagcatttat 5100cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 5160ggggttccgc gcacatttcc ccgaaaagtg ccacctgatg cggtgtgaaa taccgcacag 5220atgcgtaagg agaaaatacc gcatcaggaa attgtaagcg ttaatatttt gttaaaattc 5280gcgttaaatt tttgttaaat cagctcattt tttaaccaat aggccgaaat cggcaaaatc 5340ccttataaat caaaagaata gaccgagata gggttgagtg ttgttccagt ttggaacaag 5400agtccactat taaagaacgt ggactccaac gtcaaagggc gaaaaaccgt ctatcagggc 5460gatggcccac tacgtgaacc atcaccctaa tcaagttttt tggggtcgag gtgccgtaaa 5520gcactaaatc ggaaccctaa agggagcccc cgatttagag cttgacgggg aaagccggcg 5580aacgtggcga gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt 5640gtagcggtca cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc 5700gcgtccattc gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt 5760cgctattacg ccagctggcg aaagggggat gtgctgcaag gcgattaagt tgggtaacgc 5820cagggttttc ccagtcacga cgttgtaaaa cgacggccag tgaattgtaa tacgactcac 5880tatagggcga attgggcccg acgtcgcatg 59108434DNAEscherichia coli 84ataacttcgt ataatgtatg ctatacgaag ttat 348520DNAArtificial SequencePrimer UP 768 85acccgtgttt cgtctaaaag 208622DNAArtificial SequencePrimer LP 769 86ggtagataca agtggcaata ac 22

* * * * *

