U.S. patent application number 09/976800 was published by the patent office on 2003-04-24 for cytochrome p450 monooxygenase and nadph cytochrome p450 oxidoreductase genes and proteins related to the omega hydroxylase complex of candida tropicals and methods relating thereto.
Invention is credited to Brenner, Alfred A., Cornett, Cathy A., Craft, David L., Eirich, L. Dudley, Eshoo, Mark, Gleeson, Martin, Loper, John C., Madduri, Krishna M., Tang, Maria, Wilson, C. Ron.
Application Number | 20030077795 09/976800 |
Family ID | 26821671 |
Publication Date | 2003-04-24 |
United States Patent
Application |
20030077795 |
Kind Code |
A1 |
Wilson, C. Ron; et al. |
April 24, 2003 |
Cytochrome P450 monooxygenase and NADPH Cytochrome P450
oxidoreductase genes and proteins related to the omega hydroxylase
complex of candida tropicals and methods relating thereto
Abstract
Novel genes have been isolated which encode cytochrome P450 and
NADPH reductase enzymes of the .omega.-hydroxylase complex of C.
tropicalis 20336. Vectors including these genes, transfected host
cells and transformed host cells are provided. Methods of producing
cytochrome P450 and NADPH reductase enzymes are also provided
which involve transforming a host cell with a gene encoding these
enzymes and culturing the cells. Methods of increasing the
production of a dicarboxylic acid and methods of increasing
production of the aforementioned enzymes are also provided which
involve increasing in the host cell the number of genes encoding
these enzymes. A method for discriminating members of a gene family
by quantifying the expression of genes is also provided.
Inventors: |
Wilson, C. Ron; (Loveland,
OH) ; Craft, David L.; (Fort Thomas, KY) ;
Eirich, L. Dudley; (Cincinnati, OH) ; Eshoo,
Mark; (Fairfax, CA) ; Madduri, Krishna M.;
(Westfield, IN) ; Cornett, Cathy A.; (Crescent
Springs, KY) ; Brenner, Alfred A.; (Santa Rosa,
CA) ; Tang, Maria; (Fairfield, CA) ; Loper,
John C.; (Cincinnati, OH) ; Gleeson, Martin;
(San Diego, CA) |
Correspondence
Address: |
Jeffrey S. Steen
DILWORTH & BARRESE, LLP
333 Earle Ovington Blvd.
Uniondale
NY
11553
US
|
Family ID: |
26821671 |
Appl. No.: |
09/976800 |
Filed: |
October 12, 2001 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60123555 | Mar 10, 1999 | |
Current U.S.
Class: |
435/189 ;
435/254.22; 435/320.1; 435/69.1; 536/23.2 |
Current CPC
Class: |
C12P 7/44 20130101; C12N
9/0042 20130101; C12N 9/0071 20130101; C12N 9/0077 20130101; C07K
14/40 20130101 |
Class at
Publication: |
435/189 ;
435/320.1; 435/254.22; 435/69.1; 536/23.2 |
International
Class: |
C07H 021/04; C12N
009/02; C12N 001/18; C12P 021/02 |
Claims
What is claimed is:
1. Isolated nucleic acid encoding a CYP52A2B protein having the
amino acid sequence set forth in SEQ ID NO: 97.
2. Isolated nucleic acid comprising a coding region defined by
nucleotides 1072-2640 as set forth in SEQ ID NO: 87.
3. Isolated nucleic acid according to claim 2 comprising the
nucleotide sequence as set forth in SEQ ID NO: 87.
4. A vector comprising a nucleotide sequence encoding CYP52A2B
protein including an amino acid sequence as set forth in SEQ ID NO:
97.
5. A vector according to claim 4 wherein the nucleotide sequence is
set forth in nucleotides 1072-2640 of SEQ ID NO: 87.
6. A vector according to claim 4 wherein the vector is selected
from the group consisting of plasmid, phagemid, phage and
cosmid.
7. A host cell transfected or transformed with the nucleic acid of
claim 1.
8. A host cell according to claim 7 wherein the host cell is a
yeast cell.
9. A host cell according to claim 8 wherein the yeast cell is a
Candida sp.
10. A host cell according to claim 9 wherein the Candida sp. is
Candida tropicalis.
11. A host cell according to claim 10 wherein the Candida
tropicalis is Candida tropicalis 20336.
12. A host cell according to claim 11 wherein the Candida
tropicalis is H5343 ura-.
13. A method of producing a CYP52A2B protein including an amino
acid sequence as set forth in SEQ ID NO: 97 comprising: a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 97; and b) culturing the cell under conditions and in media
favoring the expression of the protein.
14. The method according to claim 13 wherein the step of culturing
the cell comprises adding an organic substrate to the media
containing the cell.
15. A method for increasing production of a dicarboxylic acid
comprising: a) providing a host cell having a naturally occurring
number of CYP52A2B genes; b) increasing, in the host cell, the
number of CYP52A2B genes which encode a CYP52A2B protein having the
amino acid sequence as set forth in SEQ ID NO: 97; c) culturing the
host cell in media containing an organic substrate which
upregulates the CYP52A2B gene, to effect increased production of
dicarboxylic acid.
16. A method for increasing the production of a CYP52A2B protein
having an amino acid sequence as set forth in SEQ ID NO: 97
comprising: a) transforming a host cell having a naturally
occurring amount of CYP52A2B protein to increase the copy number of
CYP52A2B genes that encode the CYP52A2B protein having the amino
acid sequence as set forth in SEQ ID NO: 97; and b) culturing the
cell and thereby increasing expression of the protein as compared
to an untransformed host cell.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional
Application Ser. No. 60/123,555 filed Mar. 10, 1999, U.S.
Provisional Application Ser. No. 60/103,099 filed Oct. 5, 1998, and
U.S. Provisional Application Ser. No. 60/083,798 filed May 1,
1998.
BACKGROUND
[0002] 1. Field of the Invention
[0003] The present invention relates to novel genes which encode
enzymes of the .omega.-hydroxylase complex in yeast Candida tropicalis
strains. In particular, the invention relates to novel genes
encoding the cytochrome P450 and NADPH reductase enzymes of the
.omega.-hydroxylase complex in yeast Candida tropicalis, and to a method
of quantitating the expression of genes.
[0004] 2. Description of the Related Art
[0005] Aliphatic dioic acids are versatile chemical intermediates
useful as raw materials for the preparation of perfumes, polymers,
adhesives and macrolide antibiotics. While several chemical routes
to the synthesis of long-chain .alpha., .omega.-dicarboxylic acids are
available, the synthesis is not easy and most methods result in
mixtures containing shorter chain lengths. As a result, extensive
purification steps are necessary. While it is known that long-chain
dioic acids can also be produced by microbial transformation of
alkanes, fatty acids or esters thereof, chemical synthesis has
remained the most commercially viable route, due to limitations
with the current biological approaches.
[0006] Several strains of yeast are known to excrete .alpha.,
.omega.-dicarboxylic acids as a byproduct when cultured on alkanes or
fatty acids as the carbon source. In particular, yeast belonging to
the Genus Candida, such as C. albicans, C. cloacae, C.
guilliermondii, C. intermedia, C. lipolytica, C. maltosa, C.
parapsilosis and C. zeylanoides are known to produce such
dicarboxylic acids (Agr. Biol. Chem. 35: 2033-2042 (1971)). Also,
various strains of C. tropicalis are known to produce dicarboxylic
acids ranging in chain lengths from C.sub.11 through C.sub.18
(Okino et al., B. M. Lawrence, B. D. Mookherjee and B. J. Willis
(eds), in Flavors and Fragrances: A World Perspective. Proceedings
of the 10.sup.th International Conference of Essential Oils,
Flavors and Fragrances, Elsevier Science Publishers BV Amsterdam
(1988)), and are the basis of several patents as reviewed by
Buhler and Schindler, in Aliphatic Hydrocarbons in Biotechnology,
H. J. Rehm and G. Reed (eds), Vol. 169, Verlag Chemie, Weinheim
(1984).
[0007] Studies of the biochemical processes by which yeasts
metabolize alkanes and fatty acids have revealed three types of
oxidation reactions: .alpha.-oxidation of alkanes to alcohols,
.omega.-oxidation of fatty acids to alpha, .omega.-dicarboxylic
acids and the degradative .beta.-oxidation of fatty acids to
CO.sub.2 and water. The first two types of oxidations are catalyzed
by microsomal enzymes while the last type takes place in the
peroxisomes. In C. tropicalis, the first step in the
.omega.-oxidation pathway is catalyzed by a membrane-bound enzyme
complex (.omega.-hydroxylase complex) including a cytochrome P450
monooxygenase and a NADPH cytochrome reductase. This hydroxylase
complex is responsible for the primary oxidation of the terminal
methyl group in alkanes and fatty acids (Gilewicz et al., Can. J.
Microbiol. 25:201 (1979)). The genes which encode the cytochrome
P450 and NADPH reductase components of the complex have previously
been identified as P450ALK and P450RED respectively, and have also
been cloned and sequenced (Sanglard et al., Gene 76:121-136
(1989)). P450ALK has also been designated P450ALK1. More recently,
ALK genes have been designated by the symbol CYP and RED genes have
been designated by the symbol CPR. See, e.g., Nelson,
Pharmacogenetics 6(1):1-42 (1996), which is incorporated herein by
reference. See also Ohkuma et al., DNA and Cell Biology 14:163-173
(1995), Seghezzi et al., DNA and Cell Biology, 11:767-780 (1992) and
Kargel et al., Yeast 12:333-348 (1996), each incorporated herein by
reference. For example, P450ALK is also designated CYP52 according
to the nomenclature of Nelson, supra. Fatty acids are ultimately
formed from alkanes after two additional oxidation steps, catalyzed
by alcohol oxidase (Kemp et al., Appl. Microbiol. and Biotechnol.
28: 370-374 (1988)) and aldehyde dehydrogenase. The fatty acids can
be further oxidized through the same or similar pathway to the
corresponding dicarboxylic acid. The .omega.-oxidation of fatty acids
proceeds via the .omega.-hydroxy fatty acid and its aldehyde
derivative, to the corresponding dicarboxylic acid without the
requirement for CoA activation. However, both fatty acids and
dicarboxylic acids can be degraded, after activation to the
corresponding acyl-CoA ester through the .beta.-oxidation pathway
in the peroxisomes, leading to chain shortening. In mammalian
systems, both fatty acid and dicarboxylic acid products of
.omega.-oxidation are activated to their CoA-esters at equal rates
and are substrates for both mitochondrial and peroxisomal
.beta.-oxidation (J. Biochem., 102:225-234 (1987)). In yeast,
.beta.-oxidation takes place solely in the peroxisomes
(Agr. Biol. Chem. 49:1821-1828 (1985)).
[0008] It has recently been determined that certain eukaryotes,
e.g., certain yeast, do not adhere, in some respects, to the
"universal" genetic code which provides that particular codons
(triplets of nucleic acids) code for specific amino acids. Indeed,
the genetic code is "universal" because it is virtually the same in
all living organisms. Certain Candida sp. are now known to
translate the CTG codon (which, according to the "universal" code
designates leucine) as serine. See, e.g., Ueda et al., Biochimie
(1994) 76, 1217-1222, where C. tropicalis, C. cylindracea, C.
guilliermondii and C. lusitaniae are shown to adhere to the
"non-universal" code with respect to the CTG codon. Accordingly,
nucleic acid sequences may code for one amino acid sequence in
"universal" code organisms and a variant of that amino acid
sequence in "non-universal" code organisms depending on the number
of CTG codons present in the nucleic acid coding sequence. The
difference may become evident when, in the course of genetic
engineering, nucleic acid encoding a protein is transferred from a
"non-universal" code organism to a "universal" code organism or
vice versa. Obviously, there will be a different amino acid
sequence depending on which organism is used to express the
protein.
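By way of illustration only, the consequence of the CTG reassignment described above can be sketched in Python. The sketch assumes the standard genetic code written in the conventional T, C, A, G base order. The names STANDARD_CODE, CTG_SER_CODE, translate and count_ctg_codons, and the short test sequence, are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard ("universal") genetic code, written in the conventional
# T, C, A, G order for each codon position (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Variant code for hosts that read CTG as serine; every other codon
# is left as in the standard table (an assumption of this sketch).
CTG_SER_CODE = dict(STANDARD_CODE, CTG="S")


def translate(dna, code):
    """Translate an in-frame DNA coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(code[dna[i:i + 3]] for i in range(0, usable, 3))


def count_ctg_codons(dna):
    """Count in-frame CTG codons, i.e. positions where the two codes differ."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return sum(dna[i:i + 3] == "CTG" for i in range(0, usable, 3))


# Hypothetical sequence, not from the sequence listing: ATG CTG AAA CTG TAA
example = "ATGCTGAAACTGTAA"
print(translate(example, STANDARD_CODE))  # MLKL*
print(translate(example, CTG_SER_CODE))   # MSKS*
print(count_ctg_codons(example))          # 2
```

Under these assumptions the two translations of one coding sequence differ at exactly as many positions as there are in-frame CTG codons, which is why the host organism matters when such a gene is moved between the two kinds of organism.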
[0009] The production of dicarboxylic acids by fermentation of
unsaturated C.sub.14-C.sub.16 monocarboxylic acids using a strain
of the species C. tropicalis is disclosed in U.S. Pat. No.
4,474,882. The unsaturated dicarboxylic acids correspond to the
starting materials in the number and position of the double bonds.
Similar processes in which other special microorganisms are used
are described in U.S. Pat. Nos. 3,975,234 and 4,339,536, in British
Patent Specification 1,405,026 and in German Patent Publications 21
64 626, 28 53 847, 29 37 292, 29 51 177, and 21 40 133.
[0010] Cytochromes P450 (P450s) are terminal monooxidases of a
multicomponent enzyme system as described above. They comprise a
superfamily of proteins which exist widely in nature having been
isolated from a variety of organisms as described e.g., in Nelson,
supra. These organisms include various mammals, fish,
invertebrates, plants, mollusks, crustaceans, lower eukaryotes and
bacteria (Nelson, supra). First discovered in rodent liver
microsomes as a carbon-monoxide binding pigment as described, e.g.,
in Garfinkel, Arch. Biochem. Biophys. 77:493-509 (1958), which is
incorporated herein by reference, P450s were later named based on
their absorption at 450 nm in a reduced-CO coupled difference
spectrum as described, e.g., in Omura et al., J. Biol. Chem.
239:2370-2378 (1964), which is incorporated herein by
reference.
[0011] P450s catalyze the metabolism of a variety of endogenous and
exogenous compounds (Nelson, supra). Endogenous compounds include
steroids, prostanoids, eicosanoids, fat-soluble vitamins, fatty
acids, mammalian alkaloids, leukotrienes, biogenic amines and
phytoalexins (Nelson, supra). P450 metabolism involves such
reactions as epoxidation, hydroxylation, dealkylation,
N-hydroxylation, sulfoxidation, desulfuration and reductive
dehalogenation. These reactions generally make the compound more
water soluble, which is conducive to excretion, and more
electrophilic. These electrophilic products can have detrimental
effects if they react with DNA or other cellular constituents.
However, they can react through conjugation with low molecular
weight hydrophilic substances resulting in glucuronidation,
sulfation, acetylation, amino acid conjugation or glutathione
conjugation typically leading to inactivation and elimination as
described, e.g., in Klaassen et al., Toxicology, 3.sup.rd ed,
Macmillan, N.Y., 1986, incorporated herein by reference.
[0012] P450s are heme thiolate proteins consisting of a heme moiety
bound to a single polypeptide chain of 45,000 to 55,000 Da. The
iron of the heme prosthetic group is located at the center of a
protoporphyrin ring. Four ligands of the heme iron can be
attributed to the porphyrin ring. The fifth ligand is a thiolate
anion from a cysteinyl residue of the polypeptide. The sixth ligand
is probably a hydroxyl group from an amino acid residue, or a
moiety with a similar field strength such as a water molecule as
described, e.g., in Goeptar et al., Critical Reviews in
Toxicology 25(1):25-65 (1995), incorporated herein by reference.
[0013] Monooxygenation reactions catalyzed by cytochromes P450 in a
eukaryotic membrane-bound system require the transfer of electrons
from NADPH to P450 via NADPH-cytochrome P450 reductase (CPR) as
described, e.g., in Taniguchi et al., Arch. Biochem. Biophys.
232:585 (1984), incorporated herein by reference. CPR genes are now
also referred to as NCP genes. See, e.g., Debacker et al.,
Antimicrobial Agents and Chemotherapy 45:1660 (2001). CPR is a
flavoprotein of approximately 78,000 Da containing 1 mol of flavin
adenine dinucleotide (FAD) and 1 mol of flavin mononucleotide (FMN)
per mole of enzyme as described, e.g., in Potter et al., J. Biol.
Chem. 258:6906 (1983), incorporated herein by reference. The FAD
moiety of CPR is the site of electron entry into the enzyme,
whereas FMN is the electron-donating site to P450 as described,
e.g., in Vermilion et al., J. Biol. Chem. 253:8812 (1978),
incorporated herein by reference. The overall reaction is as
follows:
H.sup.+ + RH + NADPH + O.sub.2 .fwdarw. ROH + NADP.sup.+ + H.sub.2O
[0014] Binding of a substrate to the catalytic site of P450
apparently results in a conformational change initiating electron
transfer from CPR to P450. Subsequent to the transfer of the first
electron, O.sub.2 binds to the Fe.sup.2+-P450 substrate complex
to form the Fe.sup.3+-P450-substrate complex. This complex is then
reduced by a second electron from CPR, or, in some cases, NADH via
cytochrome b5 and NADH-cytochrome b5 reductase as described, e.g.,
in Guengerich et al., Arch. Biochem. Biophys. 205:365 (1980),
incorporated herein by reference. One atom of this reactive oxygen
is introduced into the substrate, while the other is reduced to
water. The oxygenated substrate then dissociates, regenerating the
oxidized form of the cytochrome P450 as described, e.g., in
Klaassen, Amdur and Doull, Casarett and Doull's Toxicology,
Macmillan, N.Y. (1986), incorporated herein by reference.
[0015] The P450 reaction cycle can be short-circuited in such a way
that O.sub.2 is reduced to O.sub.2.sup.- and/or
H.sub.2O.sub.2 instead of being utilized for substrate oxygenation.
This side reaction is often referred to as the "uncoupling" of
cytochrome P450 as described, e.g., in Kuthan et al., Eur. J.
Biochem. 126:583 (1982) and Poulos et al., FASEB J. 6:674 (1992),
both of which are incorporated herein by reference. The formation
of these oxygen radicals may lead to oxidative cell damage as
described, e.g., in Mukhopadhyay, J. Biol. Chem. 269(18):13390-13397
(1994) and Ross et al., Biochem. Pharm. 49(7):979-989 (1995), both
of which are incorporated herein by reference. It has been proposed
that cytochrome b5's effect on P450 binding to the CPR results in a
more stable complex which is less likely to become "uncoupled" as
described, e.g., in Yamazaki et al., Arch. Biochem. Biophys.
325(2):174-182 (1996), incorporated herein by reference.
[0016] P450 families are assigned based upon protein sequence
comparisons. Notwithstanding a certain amount of heterogeneity, a
practical classification of P450s into families can be obtained
based on deduced amino acid sequence similarity. P450s with amino
acid sequence similarity of between about 40-80% are considered to
be in the same family, with sequences of about >55% belonging to
the same subfamily. Those with sequence similarity of about <40%
are generally listed as members of different P450 gene families
(Nelson, supra). A value of about >97% is taken to indicate
allelic variants of the same gene, unless proven otherwise based on
catalytic activity, sequence divergence in non-translated regions
of the gene sequence, or chromosomal mapping.
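The classification rule just described can be sketched, for illustration only, as a small Python function. The thresholds are the approximate values given above ("about" 40%, 55% and 97%); the function name classify_p450_pair, the returned labels, and the handling of values falling exactly on a boundary are assumptions of this sketch rather than part of the nomenclature.

```python
def classify_p450_pair(similarity_percent):
    """Rough classification of two P450s from amino acid sequence similarity.

    Thresholds are approximate; boundary handling is an assumption.
    """
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity must be a percentage between 0 and 100")
    if similarity_percent > 97:
        # Taken as allelic variants unless catalytic activity, non-translated
        # region divergence or chromosomal mapping shows otherwise.
        return "probable allelic variants"
    if similarity_percent > 55:
        return "same subfamily"
    if similarity_percent >= 40:
        return "same family"
    return "different families"


# Hypothetical similarity values, chosen only to exercise each branch.
for value in (35.0, 48.0, 70.0, 98.5):
    print(value, classify_p450_pair(value))
```

A similarity figure alone is only a first pass; as noted above, an assignment of allelic variants in particular remains provisional until confirmed by other evidence.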
[0017] The most highly conserved region is the HR2 consensus
containing the invariant cysteine residue near the carboxyl
terminus which is required for heme binding as described, e.g., in
Gotoh et al., J. Biochem. 93:807-817 (1983) and Motohashi et al., J.
Biochem. 101:879-997 (1987), both of which are incorporated herein
by reference. Additional consensus regions, including the central
region of helix I and the transmembrane region, have also been
identified, as described, e.g., in Goeptar et al., supra and Kalb et
al., PNAS. 85:7221-7225 (1988), incorporated herein by reference,
although the HR2 cysteine is the only invariant amino acid among
P450s.
[0018] Short chain (.ltoreq.C12) aliphatic dicarboxylic acids
(diacids) are important industrial intermediates in the manufacture
of diesters and polymers, and find application as thermoplastics,
plasticizing agents, lubricants, hydraulic fluids, agricultural
chemicals, pharmaceuticals, dyes, surfactants, and adhesives. The
high price and limited availability of short chain diacids are due
to constraints imposed by the existing chemical synthesis.
[0019] Long-chain diacids (aliphatic .alpha., .omega.-dicarboxylic
acids with carbon numbers of 12 or greater, hereafter also referred
to as diacids) (HOOC--(CH.sub.2).sub.n--COOH) are a versatile
family of chemicals with demonstrated and potential utility in a
variety of chemical products including plastics, adhesives, and
fragrances. Unfortunately, the full market potential of diacids has
not been realized because chemical processes produce only a limited
range of these materials at a relatively high price. In addition,
chemical processes for the production of diacids have a number of
limitations and disadvantages. All the chemical processes are
restricted to the production of diacids of specific carbon chain
lengths. For example, the dodecanedioic acid process starts with
butadiene. The resulting product diacids are limited to multiples
of four-carbon lengths and, in practice, only dodecanedioic acid is
made. The dodecanedioic process is based on nonrenewable
petrochemical feedstocks. The multireaction conversion process
produces unwanted byproducts, which result in yield losses,
NO.sub.x pollution and heavy metal wastes.
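As a worked illustration of the general formula HOOC--(CH.sub.2).sub.n--COOH, the following Python sketch derives the molecular formula and an approximate molar mass of a saturated, unbranched diacid from its total carbon number. The function names and the rounded atomic masses are assumptions of this sketch and do not appear in the specification.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def diacid_composition(total_carbons):
    """Element counts for HOOC-(CH2)n-COOH, where n = total_carbons - 2."""
    if total_carbons < 2:
        raise ValueError("a dicarboxylic acid has at least two carbons")
    n = total_carbons - 2  # number of methylene groups
    # Each CH2 contributes two hydrogens; the two COOH groups contribute
    # one hydrogen and two oxygens each.
    return {"C": total_carbons, "H": 2 * n + 2, "O": 4}


def diacid_formula(total_carbons):
    """Molecular formula as a plain string, e.g. 'C12H22O4'."""
    composition = diacid_composition(total_carbons)
    return "".join(f"{element}{count}" for element, count in composition.items())


def diacid_molar_mass(total_carbons):
    """Approximate molar mass in g/mol."""
    composition = diacid_composition(total_carbons)
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


# Dodecanedioic acid, the C12 diacid mentioned above (n = 10).
print(diacid_formula(12))                # C12H22O4
print(round(diacid_molar_mass(12), 2))   # 230.3
```

The sketch covers only saturated straight-chain diacids; unsaturated products would carry two fewer hydrogens per double bond.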
[0020] Long-chain diacids offer potential advantages over shorter
chain diacids, but their high selling price and limited commercial
availability prevent widespread growth in many of these
applications. Biocatalysis offers an innovative way to overcome
these limitations with a process that produces a wide range of
diacid products from renewable feedstocks. However, there is no
commercially viable bioprocess to produce long chain diacids from
renewable resources.
SUMMARY OF THE INVENTION
[0021] An isolated nucleic acid is provided which encodes a CPRA
protein having the amino acid sequence set forth in SEQ ID NO: 83
or SEQ ID NO: 117. An isolated nucleic acid is also provided which
includes a coding region defined by nucleotides 1006-3042 as set
forth in SEQ ID NO: 81. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 83 or
SEQ ID NO: 117. A vector is provided which includes a nucleotide
sequence encoding CPRA protein including an amino acid sequence as
set forth in SEQ ID NO: 83 or SEQ ID NO: 117. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CPRA protein having an amino acid sequence as set forth in
SEQ ID NO: 83 or SEQ ID NO: 117. A method of producing a CPRA
protein including an amino acid sequence as set forth in SEQ ID NO:
83 or SEQ ID NO: 117 is also provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 83 or SEQ ID NO: 117; and b) culturing the cell under
conditions favoring the expression of the protein.
[0022] An isolated nucleic acid is provided which encodes a CPRB
protein having the amino acid sequence set forth in SEQ ID NO: 84
or SEQ ID NO: 118. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 1033-3069 as set
forth in SEQ ID NO: 82. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 84 or
SEQ ID NO: 118. A vector is provided which includes a nucleotide
sequence encoding CPRB protein including an amino acid sequence as
set forth in SEQ ID NO: 84 or SEQ ID NO: 118. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CPRB protein having an amino acid sequence as set forth in
SEQ ID NO: 84 or SEQ ID NO: 118. A method of producing a CPRB
protein including an amino acid sequence as set forth in SEQ ID NO:
84 or SEQ ID NO: 118 is provided which includes a) transforming a
suitable host cell with a DNA sequence that encodes the protein
having the amino acid sequence as set forth in SEQ ID NO: 84 or SEQ
ID NO: 118; and b) culturing the cell under conditions favoring the
expression of the protein.
[0023] An isolated nucleic acid is provided which encodes a
CYP52A1A protein having the amino acid sequence set forth in SEQ ID
NO: 95 or SEQ ID NO: 110. An isolated nucleic acid is provided
which includes a coding region defined by nucleotides 1177-2748 as
set forth in SEQ ID NO: 85. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 95 or
SEQ ID NO: 110. A vector is provided which includes a nucleotide
sequence encoding CYP52A1A protein including an amino acid sequence
as set forth in SEQ ID NO: 95 or SEQ ID NO: 110. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CYP52A1A protein having an amino acid sequence as set
forth in SEQ ID NO: 95 or SEQ ID NO: 110. A method of producing a
CYP52A1A protein including an amino acid sequence as set forth in
SEQ ID NO: 95 or SEQ ID NO: 110 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 95 or SEQ ID NO: 110; and b) culturing the cell under
conditions favoring the expression of the protein.
[0024] An isolated nucleic acid encoding a CYP52A2A protein is
provided which has the amino acid sequence set forth in SEQ ID NO:
96. An isolated nucleic acid is provided which includes a coding
region defined by nucleotides 1199-2767 as set forth in SEQ ID NO:
86. An isolated protein is provided which includes an amino acid
sequence as set forth in SEQ ID NO: 96. A vector is provided which
includes a nucleotide sequence encoding CYP52A2A protein including
an amino acid sequence as set forth in SEQ ID NO: 96. A host cell
is provided which is transfected or transformed with the nucleic
acid encoding CYP52A2A protein having an amino acid sequence as set
forth in SEQ ID NO: 96. A method of producing a CYP52A2A protein
including an amino acid sequence as set forth in SEQ ID NO: 96 is
provided which includes a) transforming a suitable host cell with a
DNA sequence that encodes the protein having the amino acid
sequence as set forth in SEQ ID NO: 96; and b) culturing the cell
under conditions favoring the expression of the protein.
[0025] An isolated nucleic acid encoding a CYP52A2B protein is
provided which has the amino acid sequence set forth in SEQ ID NO:
97. An isolated nucleic acid is provided which includes a coding
region defined by nucleotides 1072-2640 as set forth in SEQ ID NO:
87. An isolated protein is provided which includes an amino acid
sequence as set forth in SEQ ID NO: 97. A vector is provided which
includes a nucleotide sequence encoding CYP52A2B protein including
an amino acid sequence as set forth in SEQ ID NO: 97. A host cell
is provided which is transfected or transformed with the nucleic
acid encoding CYP52A2B protein having an amino acid sequence as set
forth in SEQ ID NO: 97. A method of producing a CYP52A2B protein
including an amino acid sequence as set forth in SEQ ID NO: 97 is
provided which includes a) transforming a suitable host cell with a
DNA sequence that encodes the protein having the amino acid
sequence as set forth in SEQ ID NO: 97; and b) culturing the cell
under conditions favoring the expression of the protein.
[0026] An isolated nucleic acid encoding a CYP52A3A protein is
provided which has the amino acid sequence set forth in SEQ ID NO:
98. An isolated nucleic acid is provided which includes a coding
region defined by nucleotides 1126-2748 as set forth in SEQ ID NO:
88. An isolated protein is provided which includes an amino acid
sequence as set forth in SEQ ID NO: 98. A vector is provided which
includes a nucleotide sequence encoding CYP52A3A protein including
an amino acid sequence as set forth in SEQ ID NO: 98. A host cell
is provided which is transfected or transformed with the nucleic
acid encoding CYP52A3A protein having an amino acid sequence as set
forth in SEQ ID NO: 98. A method of producing a CYP52A3A protein
including an amino acid sequence as set forth in SEQ ID NO: 98 is
provided which includes a) transforming a suitable host cell with a
DNA sequence that encodes the protein having the amino acid
sequence as set forth in SEQ ID NO: 98; and b) culturing the cell
under conditions favoring the expression of the protein.
[0027] An isolated nucleic acid encoding a CYP52A3B protein is
provided having the amino acid sequence as set forth in SEQ ID NO:
99 or SEQ ID NO: 111. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 913-2535 as set
forth in SEQ ID NO: 89. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 99 or
SEQ ID NO: 111. A vector is provided which includes a nucleotide
sequence encoding CYP52A3B protein including an amino acid sequence
as set forth in SEQ ID NO: 99 or SEQ ID NO: 111. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CYP52A3B protein having an amino acid sequence as set
forth in SEQ ID NO: 99 or SEQ ID NO: 111. A method of producing a
CYP52A3B protein including an amino acid sequence as set forth in
SEQ ID NO: 99 or SEQ ID NO: 111 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 99 or SEQ ID NO: 111; and b) culturing the cell under
conditions favoring the expression of the protein.
[0028] An isolated nucleic acid encoding a CYP52A5A protein is
provided having the amino acid sequence set forth in SEQ ID NO: 100
or SEQ ID NO: 112. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 1103-2656 as set
forth in SEQ ID NO: 90. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 100 or
SEQ ID NO: 112. A vector is provided which includes a nucleotide
sequence encoding CYP52A5A protein including an amino acid sequence
as set forth in SEQ ID NO: 100 or SEQ ID NO: 112. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CYP52A5A protein having an amino acid sequence as set
forth in SEQ ID NO: 100 or SEQ ID NO: 112. A method of producing a
CYP52A5A protein including an amino acid sequence as set forth in
SEQ ID NO: 100 or SEQ ID NO: 112 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 100 or SEQ ID NO: 112; and b) culturing the cell under
conditions favoring the expression of the protein.
[0029] An isolated nucleic acid encoding a CYP52A5B protein is
provided having the amino acid sequence as set forth in SEQ ID NO:
101 or SEQ ID NO: 113. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 1142-2695 as set
forth in SEQ ID NO: 91. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 101 or
SEQ ID NO: 113. A vector is provided which includes a nucleotide
sequence encoding CYP52A5B protein including the amino acid
sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113. A host
cell is provided which is transfected or transformed with the
nucleic acid encoding CYP52A5B protein having the amino acid
sequence as set forth in SEQ ID NO: 101 or SEQ ID NO: 113. A method
of producing a CYP52A5B protein including an amino acid sequence as
set forth in SEQ ID NO: 101 or SEQ ID NO: 113 is provided which
includes a) transforming a suitable host cell with a DNA sequence
that encodes the protein having the amino acid sequence as set
forth in SEQ ID NO: 101 or SEQ ID NO: 113; and b) culturing the
cell under conditions favoring the expression of the protein.
[0030] An isolated nucleic acid encoding a CYP52A8A protein is
provided having the amino acid sequence set forth in SEQ ID NO: 102
or SEQ ID NO: 114. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 464-2002 as set
forth in SEQ ID NO: 92. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 102 or
SEQ ID NO: 114. A vector is provided which includes a nucleotide
sequence encoding CYP52A8A protein including an amino acid sequence
as set forth in SEQ ID NO: 102 or SEQ ID NO: 114. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CYP52A8A protein having an amino acid sequence as set
forth in SEQ ID NO: 102 or SEQ ID NO: 114. A method of producing a
CYP52A8A protein including an amino acid sequence as set forth in
SEQ ID NO: 102 or SEQ ID NO: 114 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 102 or SEQ ID NO: 114; and b) culturing the cell under
conditions favoring the expression of the protein.
[0031] An isolated nucleic acid encoding a CYP52A8B protein is
provided having the amino acid sequence set forth in SEQ ID NO: 103
or SEQ ID NO: 115. An isolated nucleic acid is provided which
includes a coding region defined by nucleotides 1017-2555 as set
forth in SEQ ID NO: 93. An isolated protein is provided which
includes an amino acid sequence as set forth in SEQ ID NO: 103 or
SEQ ID NO: 115. A vector is provided which includes a nucleotide
sequence encoding CYP52A8B protein including an amino acid sequence
as set forth in SEQ ID NO: 103 or SEQ ID NO: 115. A host cell is
provided which is transfected or transformed with the nucleic acid
encoding CYP52A8B protein having an amino acid sequence as set
forth in SEQ ID NO: 103 or SEQ ID NO: 115. A method of producing a
CYP52A8B protein including an amino acid sequence as set forth in
SEQ ID NO: 103 or SEQ ID NO: 115 is provided which includes a)
transforming a suitable host cell with a DNA sequence that encodes
the protein having the amino acid sequence as set forth in SEQ ID
NO: 103 or SEQ ID NO: 115; and b) culturing the cell under
conditions favoring the expression of the protein.
[0032] An isolated nucleic acid encoding a CYP52D4A protein is
provided having the amino acid sequence set forth in SEQ ID NO: 104
or SEQ ID NO: 116. An isolated nucleic acid is provided including a
coding region defined by nucleotides 767-2266 as set forth in SEQ
ID NO: 94. An isolated protein is provided which includes an amino
acid sequence as set forth in SEQ ID NO: 104 or SEQ ID NO: 116. A
vector is provided which includes a nucleotide sequence encoding
CYP52D4A protein including an amino acid sequence as set forth in
SEQ ID NO: 104 or SEQ ID NO: 116. A host cell is provided which is
transfected or transformed with the nucleic acid encoding CYP52D4A
protein having an amino acid sequence as set forth in SEQ ID NO:
104 or SEQ ID NO: 116. A method of producing a CYP52D4A protein
including an amino acid sequence as set forth in SEQ ID NO: 104 or
SEQ ID NO: 116 is provided which includes a) transforming a
suitable host cell with a DNA sequence that encodes the protein
having the amino acid sequence as set forth in SEQ ID NO: 104 or
SEQ ID NO: 116; and b) culturing the cell under conditions favoring
the expression of the protein.
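The coding-region coordinates quoted in the preceding paragraphs (for example, nucleotides 767-2266 of SEQ ID NO: 94) can be converted into sequence slices programmatically. The following is a minimal Python sketch, assuming the coordinates are 1-based and inclusive, as is conventional in sequence listings. The function name and the placeholder sequence are hypothetical and are not part of the disclosure. Whether the quoted span includes the stop codon is not stated here, so the codon count below makes no claim about protein length.

```python
def coding_region(sequence: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive span start..end of a nucleotide sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Placeholder sequence only; the actual SEQ ID NO: 94 is not reproduced here.
placeholder = "N" * 3000
region = coding_region(placeholder, 767, 2266)

# A coding region should contain a whole number of codons.
codons, remainder = divmod(len(region), 3)
print(len(region), codons, remainder)
```

A remainder of zero is a quick consistency check that a quoted coordinate pair spans whole codons.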
[0033] A method for discriminating members of a gene family by
quantifying the amount of target mRNA in a sample is provided which
includes a) providing an organism containing a target gene; b)
culturing the organism with an organic substrate which causes
upregulation in the activity of the target gene; c) obtaining a
sample of total RNA from the organism at a first point in time; d)
combining at least a portion of the sample of the total RNA with a
known amount of competitor RNA to form an RNA mixture, wherein the
competitor RNA is substantially similar to the target mRNA but has
a lesser number of nucleotides compared to the target mRNA; e)
adding reverse transcriptase to the RNA mixture in a quantity
sufficient to form corresponding target DNA and competitor DNA; (f)
conducting a polymerase chain reaction in the presence of at least
one primer specific for at least one substantially non-homologous
region of the target DNA within the gene family, the primer also
specific for the competitor DNA; g) repeating steps (c-f) using
increasing amounts of the competitor RNA while maintaining a
substantially constant amount of target RNA; h) determining the
point at which the amount of target DNA is substantially equal to
the amount of competitor DNA; i) quantifying the results by
comparing the ratio of the concentration of unknown target to the
known concentration of competitor; and j) obtaining a sample of
total RNA from the organism at another point in time and repeating
steps (d-i).
[0034] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CPRA genes; b) increasing, in the host cell,
the number of CPRA genes which encode a CPRA protein having the
amino acid sequence as set forth in SEQ ID NO: 83 or SEQ ID NO:
117; c) culturing the host cell in media containing an organic
substrate which upregulates the CPRA gene, to effect increased
production of dicarboxylic acid.
[0035] A method for increasing the production of a CPRA protein
having an amino acid sequence as set forth in SEQ ID NO: 83 or SEQ
ID NO: 117 is provided which includes a) transforming a host cell
having a naturally occurring amount of CPRA protein with an
increased copy number of a CPRA gene that encodes the CPRA protein
having the amino acid sequence as set forth in SEQ ID NO: 83 or SEQ
ID NO: 117; and b) culturing the cell and thereby increasing
expression of the protein compared with that of a host cell
containing a naturally occurring copy number of the CPRA gene.
[0036] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CPRB genes; b) increasing, in the host cell,
the number of CPRB genes which encode a CPRB protein having the
amino acid sequence as set forth in SEQ ID NO: 84 or SEQ ID NO:
118; c) culturing the host cell in media containing an organic
substrate which upregulates the CPRB gene, to effect increased
production of dicarboxylic acid.
[0037] A method for increasing the production of a CPRB protein
having an amino acid sequence as set forth in SEQ ID NO: 84 or SEQ
ID NO: 118 is provided which includes a) transforming a host cell
having a naturally occurring amount of CPRB protein with an
increased copy number of a CPRB gene that encodes the CPRB protein
having the amino acid sequence as set forth in SEQ ID NO: 84 or SEQ
ID NO: 118; and b) culturing the cell and thereby increasing
expression of the protein compared with that of a host cell
containing a naturally occurring copy number of the CPRB gene.
[0038] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A1A genes; b) increasing, in the host
cell, the number of CYP52A1A genes which encode a CYP52A1A protein
having the amino acid sequence as set forth in SEQ ID NO: 95 or SEQ
ID NO: 110; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A1A gene, to effect
increased production of dicarboxylic acid.
[0039] A method for increasing the production of a CYP52A1A protein
having an amino acid sequence as set forth in SEQ ID NO: 95 or SEQ
ID NO: 110 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A1A protein with an
increased copy number of a CYP52A1A gene that encodes the CYP52A1A
protein having the amino acid sequence as set forth in SEQ ID NO:
95 or SEQ ID NO: 110; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A1A
gene.
[0040] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A2A genes; b) increasing, in the host
cell, the number of CYP52A2A genes which encode a CYP52A2A protein
having the amino acid sequence as set forth in SEQ ID NO: 96; c)
culturing the host cell in media containing an organic substrate
which upregulates the CYP52A2A gene, to effect increased production
of dicarboxylic acid.
[0041] A method for increasing the production of a CYP52A2A protein
having an amino acid sequence as set forth in SEQ ID NO: 96 is
provided which includes a) transforming a host cell having a
naturally occurring amount of CYP52A2A protein with an increased
copy number of a CYP52A2A gene that encodes the CYP52A2A protein
having the amino acid sequence as set forth in SEQ ID NO: 96; and
b) culturing the cell and thereby increasing expression of the
protein compared with that of a host cell containing a naturally
occurring copy number of the CYP52A2A gene.
[0042] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A2B genes; b) increasing, in the host
cell, the number of CYP52A2B genes which encode a CYP52A2B protein
having the amino acid sequence as set forth in SEQ ID NO: 97; c)
culturing the host cell in media containing an organic substrate
which upregulates the CYP52A2B gene, to effect increased production
of dicarboxylic acid.
[0043] A method for increasing the production of a CYP52A2B protein
having an amino acid sequence as set forth in SEQ ID NO: 97 is
provided which includes a) transforming a host cell having a
naturally occurring amount of CYP52A2B protein with an increased
copy number of a CYP52A2B gene that encodes the CYP52A2B protein
having the amino acid sequence as set forth in SEQ ID NO: 97; and
b) culturing the cell and thereby increasing expression of the
protein compared with that of a host cell containing a naturally
occurring copy number of the CYP52A2B gene.
[0044] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A3A genes; b) increasing, in the host
cell, the number of CYP52A3A genes which encode a CYP52A3A protein
having the amino acid sequence as set forth in SEQ ID NO: 98; c)
culturing the host cell in media containing an organic substrate
which upregulates CYP52A3A gene, to effect increased production of
dicarboxylic acid.
[0045] A method for increasing the production of a CYP52A3A protein
having an amino acid sequence as set forth in SEQ ID NO: 98 is
provided which includes a) transforming a host cell having a
naturally occurring amount of CYP52A3A protein with an increased
copy number of a CYP52A3A gene that encodes the CYP52A3A protein
having the amino acid sequence as set forth in SEQ ID NO: 98; and
b) culturing the cell and thereby increasing expression of the
protein compared with that of a host cell containing a naturally
occurring copy number of the CYP52A3A gene.
[0046] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A3B genes; b) increasing, in the host
cell, the number of CYP52A3B genes which encode a CYP52A3B protein
having the amino acid sequence as set forth in SEQ ID NO: 99 or SEQ
ID NO: 111; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A3B gene, to effect
increased production of dicarboxylic acid.
[0047] A method for increasing the production of a CYP52A3B protein
having an amino acid sequence as set forth in SEQ ID NO: 99 or SEQ
ID NO: 111 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A3B protein with an
increased copy number of a CYP52A3B gene that encodes the CYP52A3B
protein having the amino acid sequence as set forth in SEQ ID NO:
99 or SEQ ID NO: 111; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A3B
gene.
[0048] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A5A genes; b) increasing, in the host
cell, the number of CYP52A5A genes which encode a CYP52A5A protein
having the amino acid sequence as set forth in SEQ ID NO: 100 or
SEQ ID NO: 112; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A5A gene, to effect
increased production of dicarboxylic acid.
[0049] A method for increasing the production of a CYP52A5A protein
having an amino acid sequence as set forth in SEQ ID NO: 100 or SEQ
ID NO: 112 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A5A protein with an
increased copy number of a CYP52A5A gene that encodes the CYP52A5A
protein having the amino acid sequence as set forth in SEQ ID NO:
100 or SEQ ID NO: 112; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A5A
gene.
[0050] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A5B genes; b) increasing, in the host
cell, the number of CYP52A5B genes which encode a CYP52A5B protein
having the amino acid sequence as set forth in SEQ ID NO: 101 or
SEQ ID NO: 113; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A5B gene, to effect
increased production of dicarboxylic acid.
[0051] A method for increasing the production of a CYP52A5B protein
having an amino acid sequence as set forth in SEQ ID NO: 101 or SEQ
ID NO: 113 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A5B protein with an
increased copy number of a CYP52A5B gene that encodes the CYP52A5B
protein having the amino acid sequence as set forth in SEQ ID NO:
101 or SEQ ID NO: 113; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A5B
gene.
[0052] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A8A genes; b) increasing, in the host
cell, the number of CYP52A8A genes which encode a CYP52A8A protein
having the amino acid sequence as set forth in SEQ ID NO: 102 or
SEQ ID NO: 114; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A8A gene, to effect
increased production of dicarboxylic acid.
[0053] A method for increasing the production of a CYP52A8A protein
having an amino acid sequence as set forth in SEQ ID NO: 102 or SEQ
ID NO: 114 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A8A protein with an
increased copy number of a CYP52A8A gene that encodes the CYP52A8A
protein having the amino acid sequence as set forth in SEQ ID NO:
102 or SEQ ID NO: 114; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A8A
gene.
[0054] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52A8B genes; b) increasing, in the host
cell, the number of CYP52A8B genes which encode a CYP52A8B protein
having the amino acid sequence as set forth in SEQ ID NO: 103 or
SEQ ID NO: 115; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52A8B gene, to effect
increased production of dicarboxylic acid.
[0055] A method for increasing the production of a CYP52A8B protein
having an amino acid sequence as set forth in SEQ ID NO: 103 or SEQ
ID NO: 115 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52A8B protein with an
increased copy number of a CYP52A8B gene that encodes the CYP52A8B
protein having the amino acid sequence as set forth in SEQ ID NO:
103 or SEQ ID NO: 115; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52A8B
gene.
[0056] A method for increasing production of a dicarboxylic acid is
provided which includes a) providing a host cell having a naturally
occurring number of CYP52D4A genes; b) increasing, in the host
cell, the number of CYP52D4A genes which encode a CYP52D4A protein
having the amino acid sequence as set forth in SEQ ID NO: 104 or
SEQ ID NO: 116; c) culturing the host cell in media containing an
organic substrate which upregulates the CYP52D4A gene, to effect
increased production of dicarboxylic acid.
[0057] A method for increasing the production of a CYP52D4A protein
having an amino acid sequence as set forth in SEQ ID NO: 104 or SEQ
ID NO: 116 is provided which includes a) transforming a host cell
having a naturally occurring amount of CYP52D4A protein with an
increased copy number of a CYP52D4A gene that encodes the CYP52D4A
protein having the amino acid sequence as set forth in SEQ ID NO:
104 or SEQ ID NO: 116; and b) culturing the cell and thereby
increasing expression of the protein compared with that of a host
cell containing a naturally occurring copy number of the CYP52D4A
gene.
BRIEF DESCRIPTION OF THE DRAWINGS
[0058] FIG. 1 is a schematic representation of cloning vector
pTriplEx from Clontech.TM. Laboratories, Inc. Selected restriction
sites within the multiple cloning site are shown.
[0059] FIG. 2A is a map of the ZAP Express.TM. vector.
[0060] FIG. 2B is a schematic representation of cloning phagemid
vector pBK-CMV.
[0061] FIG. 3 is a double stranded DNA sequence of a portion of the
5 prime coding region of the CYP52A5A gene (SEQ ID NO: 36), the
non-coding or antisense sequence (SEQ ID NO: 108), primer 7581-97F
(SEQ ID NO: 47) and primer 7581-97M (SEQ ID NO: 48).
[0062] FIG. 4 is a diagrammatic representation of highly conserved
regions of CYP and CPR gene protein sequences. Helix I represents
the putative substrate binding site and HR2 represents the heme
binding region. The FMN, FAD and NADPH binding regions are
indicated below the CPR gene.
[0063] FIG. 5 is a diagrammatic representation of the plasmid pHKM1
containing the truncated CPRA gene present in the pTriplEx vector.
A detailed restriction map of only the sequenced region is shown at
the top. The bar indicates the open reading frame. The direction of
transcription is indicated by an arrow under the open reading
frame.
[0064] FIG. 6 is a diagrammatic representation of the plasmid pHKM4
containing the truncated CPRA gene present in the pTriplEx vector.
A detailed restriction map of only the sequenced region is shown at
the top. The bar indicates the open reading frame. The direction of
transcription is indicated by an arrow under the open reading
frame.
[0065] FIG. 7 is a diagrammatic representation of the plasmid pHKM9
containing the CPRB gene (SEQ ID NO: 82) present in the pBK-CMV
vector. A detailed restriction map of only the sequenced region is
shown at the top. The bar indicates the open reading frame. The
direction of transcription is indicated by an arrow under the open
reading frame.
[0066] FIG. 8 is a diagrammatic representation of the plasmid
pHKM11 containing the CYP52A1A gene (SEQ ID NO: 85) present in the
pBK-CMV vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0067] FIG. 9 is a diagrammatic representation of the plasmid
pHKM12 containing the CYP52A8A gene (SEQ ID NO: 92) present in the
pBK-CMV vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0068] FIG. 10 is a diagrammatic representation of the plasmid
pHKM13 containing the CYP52D4A gene (SEQ ID NO: 94) present in the
pBK-CMV vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0069] FIG. 11 is a diagrammatic representation of the plasmid
pHKM14 containing the CYP52A2B gene (SEQ ID NO: 87) present in the
pBK-CMV vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0070] FIG. 12 is a diagrammatic representation of the plasmid
pHKM15 containing the CYP52A8B gene (SEQ ID NO: 93) present in the
pBK-CMV vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0071] FIGS. 13A-13D show the complete DNA sequences including
regulatory and coding regions for the CPRA gene (SEQ ID NO: 81) and
CPRB gene (SEQ ID NO: 82) from C. tropicalis ATCC 20336. FIGS.
13A-13D show regulatory and coding region alignment of these
sequences. Asterisks indicate conserved nucleotides. The start
codons are underlined and the last amino acid coding codons
immediately before the stop codon are underlined.
[0072] FIG. 14 shows the amino acid sequence of the CPRA (SEQ ID
NO: 83) and CPRB (SEQ ID NO: 84) proteins from C. tropicalis ATCC
20336 and alignment of these amino acid sequences. Asterisks
indicate residues which are not conserved.
[0073] FIGS. 15A-15M show the complete DNA sequences including
regulatory and coding regions for the following genes from C.
tropicalis ATCC 20336: CYP52A1A (SEQ ID NO: 85), CYP52A2A (SEQ ID
NO: 86), CYP52A2B (SEQ ID NO: 87), CYP52A3A (SEQ ID NO: 88),
CYP52A3B (SEQ ID NO: 89), CYP52A5A (SEQ ID NO: 90), CYP52A5B (SEQ
ID NO: 91), CYP52A8A (SEQ ID NO: 92), CYP52A8B (SEQ ID NO: 93), and
CYP52D4A (SEQ ID NO: 94). FIGS. 15A-15M show regulatory and coding
region alignment of these sequences. Asterisks indicate conserved
nucleotides. The start codons are underlined and the last amino
acid coding codons immediately before the stop codon are
underlined.
[0074] FIGS. 16A-16C show the amino acid sequences encoding the
CYP52A1A (SEQ ID NO: 95), CYP52A2A (SEQ ID NO: 96), CYP52A2B (SEQ
ID NO: 97), CYP52A3A (SEQ ID NO: 98), CYP52A3B (SEQ ID NO: 99),
CYP52A5A (SEQ ID NO: 100), CYP52A5B (SEQ ID NO: 101), CYP52A8A (SEQ
ID NO: 102), CYP52A8B (SEQ ID NO: 103) and CYP52D4A (SEQ ID NO:
104) proteins from C. tropicalis ATCC 20336. Asterisks indicate
identical residues and dots indicate conserved residues.
[0075] FIG. 17 is a diagrammatic representation of the pTAg PCR
product cloning vector (commercially available from R&D
Systems, Minneapolis, Minn.).
[0076] FIG. 18 is a plot of the log ratio (U/C) of unknown target
DNA product to competitor DNA product versus the concentration of
competitor mRNA. The plot is used to calculate the target messenger
RNA concentration in a quantitative competitive reverse
transcription polymerase chain reaction (QC-RT-PCR).
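The calculation that a plot such as FIG. 18 supports can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used to generate the figure. It assumes that both axes are log10-transformed, that the relationship is approximately linear over the titration range, and that the target amount is read off where log(U/C) = 0, that is, where target and competitor products are equal. The function name, the units (pg) and the titration values are hypothetical.

```python
import math


def equivalence_point(competitor_amounts, ratios):
    """Estimate the competitor amount at which U/C = 1 (log ratio = 0).

    Fits log10(U/C) = slope * log10(competitor) + intercept by ordinary
    least squares and solves for the zero crossing of the fitted line.
    """
    xs = [math.log10(c) for c in competitor_amounts]
    ys = [math.log10(r) for r in ratios]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two titration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy == 0:
        raise ValueError("titration points do not define a sloped line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log10(U/C) = 0  =>  log10(competitor) = -intercept / slope
    return 10 ** (-intercept / slope)


# Hypothetical titration: competitor amounts (pg) and measured U/C ratios.
competitor_pg = [1.0, 10.0, 100.0, 1000.0]
u_over_c = [50.0, 5.0, 0.5, 0.05]
target_pg = equivalence_point(competitor_pg, u_over_c)
print(f"estimated target mRNA: {target_pg:.1f} pg")
```

Fitting all titration points, rather than interpolating between the two that bracket the crossing, reduces the influence of any single band-intensity measurement.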
[0077] FIG. 19 is a graph showing the relative induction of C.
tropicalis ATCC 20962 CYP52A5A (SEQ ID NO: 90) by the addition of
the fatty acid substrate Emersol.RTM. 267 to the growth medium.
[0078] FIG. 20 is a graph showing the induction of C. tropicalis
ATCC 20962 CYP52 and CPR genes by Emersol.RTM. 267. P450 genes
CYP52A3A (SEQ ID NO: 88), CYP52A3B (SEQ ID NO: 89), and CYP52D4A
(SEQ ID NO: 94) are expressed at levels below the detection level
of the QC-RT-PCR assay.
[0079] FIG. 21 is a scheme to integrate selected genes into the
genome of Candida tropicalis strains and recovery of URA3A
selectable marker.
[0080] FIG. 22 is a schematic representation of the transformation
of C. tropicalis H5343 ura3' with CYP and/or CPR genes. Only one
URA3 locus needs to be functional. There are a total of 6 possible
ura3 targets (5 ura3A loci: 2 pox4 disruptions, 2 pox5 disruptions,
1 ura3A locus; and 1 ura3B locus).
[0081] FIG. 23 is the complete DNA sequence (SEQ ID NO: 105)
encoding URA3A from C. tropicalis ATCC 20336 and the amino acid
sequence of the encoded protein (SEQ ID NO: 106).
[0082] FIG. 24 is a schematic representation of the plasmid pURAin,
the base vector for integrating selected genes into the genome of
C. tropicalis. The detailed construction of pURAin is described in
the text.
[0083] FIG. 25 is a schematic representation of the plasmid pNEB
193 cloning vector (commercially available from New England
Biolabs, Beverly, Mass.).
[0084] FIG. 26 is a diagrammatic representation of the plasmid
pPA15 containing the truncated CYP52A2A gene present in the
pTriplEx vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0085] FIG. 27 is a schematic representation of pURA2in. The base
vector is constructed in pNEB 193, which contains the 8 bp
recognition sequences for Asc I, Pac I and Pme I. URA3A (SEQ ID NO:
105) and CYP52A2A (SEQ ID NO: 86) do not contain these 8 bp
recognition sites. URA3A is inverted so that the transforming
fragment will attempt to recircularize prior to integration. An Asc
I/Pme I fragment was used to transform H5343 ura.
[0086] FIG. 28 shows a scheme to detect integration of CYP52A2A
gene (SEQ ID NO: 86) into the genome of H5343 ura. In all cases,
hybridization band intensity could reflect the number of
integrations.
[0087] FIG. 29 is a diagrammatic representation of the plasmid
pPA57 containing the truncated CYP52A3A gene present in the
pTriplEx vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0088] FIG. 30 is a diagrammatic representation of the plasmid
pPA62 containing the truncated CYP52A3B gene present in the
pTriplEx vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0089] FIG. 31 is a diagrammatic representation of the plasmid
pPAL3 containing the truncated CYP52A1A gene present in the
pTriplEx vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0090] FIG. 32 is a diagrammatic representation of the plasmid pPA5
containing the truncated CYP52A5A gene present in the pTriplEx
vector. A detailed restriction map of only the sequenced region is
shown at the top. The bar indicates the open reading frame. The
direction of transcription is indicated by an arrow under the open
reading frame.
[0091] FIG. 33 is a diagrammatic representation of the plasmid
pPA18 containing the truncated CYP52D4A gene present in the
pTriplEx vector. A detailed restriction map of only the sequenced
region is shown at the top. The bar indicates the open reading
frame. The direction of transcription is indicated by an arrow
under the open reading frame.
[0092] FIG. 34 is a graph showing the expression of CYP52A1 (SEQ ID
NO: 85), CYP52A2 (SEQ ID NO: 86) and CYP52A5 genes (SEQ ID NOS: 90
and 91) from C. tropicalis 20962 in a fermentor run upon the
addition of amounts of the substrate oleic acid or tridecane in a
spiking experiment.
[0093] FIG. 35 depicts a scheme used for the extraction and
analysis of diacids and monoacids from fermentation broths.
[0094] FIG. 36 is a graph showing the induction of expression of
CYP52A1A, CYP52A2A and CYP52A5A in a fermentor run upon addition of
the substrate octadecane. No induction of CYP52A3A or CYP52A3B was
observed under these conditions.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0095] Diacid productivity is improved according to the present
invention by selectively increasing enzymes which are known to be
important to the oxidation of organic substrates such as fatty
acids composing the desired feed. According to the present
invention, ten CYP genes and two CPR genes of C. tropicalis have
been identified and characterized that relate to participation in
the .omega.-hydroxylase complex catalyzing the first step in the
.omega.-oxidation pathway. In addition, a novel quantitative competitive
reverse transcription polymerase chain reaction (QC-RT-PCR) assay
is used to measure gene expression in the fermentor under
conditions of induction by one or more organic substrates as defined
herein. Based upon QC-RT-PCR results, three CYP genes, CYP52A1,
CYP52A2 and CYP52A5, have been identified as being of greater
importance for the .omega.-oxidation of long chain fatty acids.
Amplification of the CPR gene copy number improves productivity.
The QC-RT-PCR assay indicates that both CYP and CPR genes appear to
be under tight regulatory control.
[0096] In accordance with the present invention, a method for
discriminating members of a gene family by quantifying the amount
of target mRNA in a sample is provided which includes a) providing
an organism containing a target gene; b) culturing the organism
with an organic substrate which causes upregulation in the activity
of the target gene; c) obtaining a sample of total RNA from the
organism at a first point in time; d) combining at least a portion
of the sample of the total RNA with a known amount of competitor
RNA to form an RNA mixture, wherein the competitor RNA is
substantially similar to the target mRNA but has a lesser number of
nucleotides compared to the target mRNA; e) adding reverse
transcriptase to the RNA mixture in a quantity sufficient to form
corresponding target DNA and competitor DNA; (f) conducting a
polymerase chain reaction in the presence of at least one primer
specific for at least one substantially non-homologous region of
the target DNA within the gene family, the primer also specific for
the competitor DNA; g) repeating steps (c-f) using increasing
amounts of the competitor RNA while maintaining a substantially
constant amount of target RNA; h) determining the point at which
the amount of target DNA is substantially equal to the amount of
competitor DNA; i) quantifying the results by comparing the ratio
of the concentration of unknown target to the known concentration
of competitor; and j) obtaining a sample of total RNA from the
organism at another point in time and repeating steps (d-i).
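Step (j) repeats the quantification at a later time point, which allows expression to be compared across the course of a culture. The following minimal Python sketch shows one way such per-time-point estimates might be expressed as induction relative to the earliest sample. It assumes that a target mRNA amount has already been estimated for each time point by steps (d) through (i). The function name, the time units (hours) and all values are hypothetical illustrations, not measured data.

```python
def relative_induction(target_by_time):
    """Express each time point's target mRNA relative to the earliest sample."""
    if not target_by_time:
        raise ValueError("no time points supplied")
    times = sorted(target_by_time)
    baseline = target_by_time[times[0]]
    if baseline <= 0:
        raise ValueError("baseline amount must be positive")
    return {t: target_by_time[t] / baseline for t in times}


# Hypothetical estimates: hours after substrate addition -> target mRNA (pg).
estimates = {0: 2.0, 1: 8.0, 2: 30.0}
induction = relative_induction(estimates)
for hours, fold in induction.items():
    print(f"{hours} h: {fold:.1f}-fold")
```

Normalizing to the pre-induction sample keeps the comparison independent of the absolute units chosen for the competitor RNA.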
[0097] In addition, modification of existing promoters and/or the
isolation of alternative promoters provides increased expression of
CYP and CPR genes. Strong promoters are obtained from at least four
sources: random or specific modifications of the CYP52A2 promoter,
CYP52A5 promoter, CYP52A1 promoter, the selection of a strong
promoter from available Candida .beta.-oxidation genes such as POX4
and POX5, or screening to select another suitable Candida
promoter.
[0098] Promoter strength can be directly measured using QC-RT-PCR
to measure CYP and CPR gene expression in Candida cells isolated
from fermentors. Enzymatic assays and antibodies specific for
CYP and CPR proteins are used to verify that increased promoter
strength is reflected by increased synthesis of the corresponding
enzymes. Once a suitable promoter is identified, it is fused to the
selected CYP and CPR genes and introduced into Candida for
construction of a new improved production strain. It is
contemplated that the coding region of the CYP and CPR genes can be
fused to suitable promoters or other regulatory sequences which are
well known to those skilled in the art.
[0099] In accordance with the present invention, studies on C.
tropicalis ATCC 20336 have identified six unique CYP genes and four
potential alleles. QC-RT-PCR analyses of cells isolated during the
course of the fermentation bioconversions indicate that at least
three of the CYP genes are induced by fatty acids and at least two
of the CYP genes are induced by alkanes. See FIG. 34. Two of the
CYP genes are highly induced indicating participation in the
.omega.-hydroxylase complex which catalyzes the rate limiting step
in the oxidation of fatty acids to the corresponding diacids.
[0100] The biochemical characterization of each P450 enzyme herein
is used to tailor the C. tropicalis host for optimal diacid
productivity and is used to select P450 enzymes to be amplified
based upon the fatty acid content of the feedstream. CYP gene(s)
encoding P450 enzymes that have a low specific activity for the
fatty acid or alkane substrate of choice are targeted for
inactivation, thereby reducing the physiological load on the
cell.
[0101] Since it has been demonstrated that CPR can be limiting in
yeast systems, the removal of non-essential P450s from the system
can free electrons that are being used by non-essential P450s and
make them available to the P450s important for diacid productivity.
Moreover, the removal of non-essential P450s can make available
other necessary but potentially limiting components of the P450
system (i.e., available membrane space, heme and/or NADPH).
[0102] Diacid productivity is thus improved by selective
integration, amplification, and over expression of CYP and CPR
genes in the C. tropicalis production host.
[0103] It should be understood that host cells into which one or
more copies of desired CYP and/or CPR genes have been introduced
can be made to include such genes by any technique known to those
skilled in the art. For example, suitable host cells include
procaryotes such as Bacillus sp., Pseudomonas sp., Actinomycetes
sp., Escherichia sp., Mycobacterium sp., and eukaryotes such as
yeast, algae, insect cells, plant cells and filamentous fungi.
Suitable host cells are preferably yeast cells such as Yarrowia,
Debaryomyces, Saccharomyces, Schizosaccharomyces, and Pichia and
more preferably those of the Candida genus. Preferred species of
Candida are tropicalis, maltosa, apicola, paratropicalis, albicans,
cloacae, guilliermondii, intermedia, lipolytica, parapsilosis and
zeylanoides. Certain preferred strains of Candida tropicalis are
listed in U.S. Pat. No. 5,254,466, incorporated herein by
reference.
[0104] Vectors such as plasmids, phagemids, phages or cosmids can
be used to transform or transfect suitable host cells. Host cells
may also be transformed by introducing into a cell a linear DNA
vector(s) containing the desired gene sequence. Such linear DNA may
be advantageous when it is desirable to avoid introduction of
non-native (foreign) DNA into the cell. For example, DNA consisting
of a desired target gene(s) flanked by DNA sequences which are
native to the cell can be introduced into the cell by
electroporation, lithium acetate transformation, spheroplasting and
the like. Flanking DNA sequences can include selectable markers
and/or other tools for genetic engineering.
[0105] It should be understood that, depending on whether a
transformed organism utilizes the universal genetic code or the
non-universal genetic code known, e.g., in connection with C.
tropicalis, slight differences can be manifest in the amino acid
sequences of protein-products. Thus, nucleotide sequences
containing a CTG codon produce proteins containing a CTG encoded
leucine in prokaryotes such as E. coli and a CTG encoded serine in
non-universal coding eukaryotes such as C. tropicalis. For example,
the CYP52A1A gene contains one CTG codon starting at position 1354
which is translated as a leucine in E. coli and a serine in C.
tropicalis, leading to two versions of the CYP52A1A protein (SEQ.
ID. NO: 95 and SEQ. ID. NO: 110); the CYP52A3B gene contains one
CTG codon starting at position 2449 which is translated as a
leucine in E. coli and a serine in C. tropicalis, leading to two
versions of the CYP52A3B protein (SEQ. ID. NO: 99 and SEQ. ID. NO:
111); the CYP52A5A gene contains two CTG codons starting,
respectively, at positions 1883 and 2570, which are translated as
leucine in E. coli and serine in C. tropicalis, leading to two
versions of the CYP52A5A protein (SEQ. ID. NO: 100 and SEQ. ID. NO:
112); the CYP52A5B gene contains two CTG codons starting,
respectively, at positions 1922 and 2609, which are translated as
leucine in E. coli and serine in C. tropicalis, leading to two
versions of the CYP52A5B protein (SEQ. ID. NO: 101 and SEQ. ID. NO:
113); the CYP52A8A gene contains one CTG codon starting at position
659, which is translated as a leucine in E. coli and a serine in C.
tropicalis, leading to two versions of the CYP52A8B protein (SEQ.
ID. NO: 103 and SEQ. ID. NO: 115); the CYP52D4A gene contains three
CTG codons starting, respectively, at positions 1247, 1412 and
1757, which are translated as leucine in E. coli and as serine in C.
tropicalis, leading to two versions of the CYP52D4A protein (SEQ.
ID. NO: 104 and SEQ. ID. NO: 116); the CPRA (NCPIA) gene contains
one CTG codon starting at position 1153 which is translated as a
leucine in E. coli and as a serine in C. tropicalis, leading to two
versions of the CPRA (NCPIA) protein (SEQ. ID. NO: 83 and SEQ. ID.
NO: 117); the CPRB (NCPIB) gene contains one CTG codon starting at
position 1180 which is translated as a leucine in E. coli and as a
serine in C. tropicalis, leading to two versions of the CPRB
(NCPIB) protein (SEQ. ID. NO: 84 and SEQ. ID. NO: 118).
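The effect of the two genetic codes on a CTG codon can be illustrated with a short Python sketch. It builds the universal codon table and, optionally, reassigns CTG from leucine to serine. The toy open reading frame and the function names are hypothetical. The positions reported by the sketch are counted from the first base of the toy reading frame and do not correspond to the sequence positions recited above.

```python
BASES = "TCAG"
# Universal genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def codon_table(ctg_as_serine=False):
    """Return a codon -> amino acid map; optionally read CTG as serine."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AA))
    if ctg_as_serine:
        table["CTG"] = "S"  # non-universal code, as in C. tropicalis
    return table

def translate(orf, ctg_as_serine=False):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    table = codon_table(ctg_as_serine)
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    return "".join(table[orf[i:i + 3]] for i in range(0, usable, 3))

def ctg_codon_positions(orf):
    """1-based start positions of in-frame CTG codons."""
    orf = orf.upper()
    return [i + 1 for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == "CTG"]

toy_orf = "ATGCTGGCTTAA"  # hypothetical: Met, CTG, Ala, stop
in_e_coli = translate(toy_orf)                             # universal code
in_c_tropicalis = translate(toy_orf, ctg_as_serine=True)   # CTG read as Ser
```

Translating the same reading frame under both settings and comparing the results gives the two protein versions for each gene, which differ only at the CTG encoded residues.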
[0106] A suitable organic substrate herein can be any organic
compound that is biooxidizable to a mono- or polycarboxylic acid.
Such a compound can be any saturated or unsaturated aliphatic
compound or any carbocyclic or heterocyclic aromatic compound
having at least one terminal methyl group, a terminal carboxyl
group and/or a terminal functional group which is oxidizable to a
carboxyl group by biooxidation. A terminal functional group which
is a derivative of a carboxyl group may be present in the substrate
molecule and may be converted to a carboxyl group by a reaction
other than biooxidation. For example, if the terminal group is an
ester and neither the wild-type C. tropicalis nor the genetic
modifications described herein allow hydrolysis of the ester
functionality to a carboxyl group, then a lipase can be added
during the fermentation step to liberate free fatty acids. Suitable
organic substrates include, but are not limited to, saturated fatty
acids, unsaturated fatty acids, alkanes, alkenes, alkynes and
combinations thereof.
[0107] Alkanes are a type of saturated organic substrate which is
useful herein. The alkanes can be linear or cyclic, branched or
straight chain, substituted or unsubstituted. Particularly
preferred alkanes are those having from about 4 to about 25 carbon
atoms, examples of which include but are not limited to butane,
hexane, octane, nonane, dodecane, tridecane, tetradecane,
octadecane and the like.
[0108] Examples of unsaturated organic substrates which can be used
herein include but are not limited to internal olefins such as
2-pentene, 2-hexene, 3-hexene, 9-octadecene and the like;
unsaturated carboxylic acids such as 2-hexenoic acid and esters
thereof, oleic acid and esters thereof including triglyceryl esters
having a relatively high oleic acid content, erucic acid and esters
thereof including triglyceryl esters having a relatively high
erucic acid content, ricinoleic acid and esters thereof including
triglyceryl esters having a relatively high ricinoleic acid
content, linoleic acid and esters thereof including triglyceryl
esters having a relatively high linoleic acid content; unsaturated
alcohols such as 3-hexen-1-ol, 9-octadecen-1-ol and the like;
unsaturated aldehydes such as 3-hexen-1-al, 9-octadecen-1-al and
the like. In addition to the above, organic substrates which can
be used herein include alicyclic compounds having at least one
internal carbon-carbon double bond and at least one terminal methyl
group, a terminal carboxyl group and/or a terminal functional group
which is oxidizable to a carboxyl group by biooxidation. Examples
of such compounds include but are not limited to
3,6-dimethyl-1,4-cyclohexadiene; 3-methylcyclohexene;
3-methyl-1,4-cyclohexadiene and the like.
[0109] Examples of the aromatic compounds that can be used herein
include but are not limited to arenes such as o-, m-, p-xylene; o-,
m-, p-methyl benzoic acid; dimethyl pyridine, and the like. The
organic substrate can also contain other functional groups that are
biooxidizable to carboxyl groups such as an aldehyde or alcohol
group. The organic substrate can also contain other functional
groups that are not biooxidizable to carboxyl groups and do not
interfere with the biooxidation such as halogens, ethers, and the
like.
[0110] Examples of saturated fatty acids which may be applied to
cells incorporating the present CYP and CPR genes include caproic,
enanthic, caprylic, pelargonic, capric, undecylic, lauric,
myristic, pentadecanoic, palmitic, margaric, stearic, arachidic,
behenic acids and combinations thereof. Examples of unsaturated
fatty acids which may be applied to cells incorporating the present
CYP and CPR genes include palmitoleic, oleic, erucic, linoleic,
linolenic acids and combinations thereof. Alkanes and fractions of
alkanes may be applied which include chain lengths from C12 to C24 in
any combination. Examples of preferred fatty acid mixtures are
Emersol.RTM. 267 and Tallow, both commercially available from
Henkel Chemicals Group, Cincinnati, Ohio. The typical fatty acid
composition of Emersol.RTM. 267 and Tallow is as follows:
         TALLOW   E267
C14:0     3.5%    2.4%
C14:1     1.0%    0.7%
C15:0     0.5%    --
C16:0    25.5%    4.6%
C16:1     4.0%    5.7%
C17:0     2.5%    --
C17:1     --      5.7%
C18:0    19.5%    1.0%
C18:1    41.0%   69.9%
C18:2     2.5%    8.8%
C18:3     --      0.3%
C20:0     0.5%    --
C20:1     --      0.9%
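Because the choice of P450 enzymes to amplify depends on the fatty acid content of the feedstream, it can be useful to summarize a composition by saturation class and chain length. The following Python sketch does so for the two typical compositions listed above. The summary categories and the function name are illustrative assumptions and are not part of the disclosed method.

```python
# Typical compositions (percent) as listed above; "--" entries are omitted.
TALLOW = {"C14:0": 3.5, "C14:1": 1.0, "C15:0": 0.5, "C16:0": 25.5,
          "C16:1": 4.0, "C17:0": 2.5, "C18:0": 19.5, "C18:1": 41.0,
          "C18:2": 2.5, "C20:0": 0.5}
E267 = {"C14:0": 2.4, "C14:1": 0.7, "C16:0": 4.6, "C16:1": 5.7,
        "C17:1": 5.7, "C18:0": 1.0, "C18:1": 69.9, "C18:2": 8.8,
        "C18:3": 0.3, "C20:1": 0.9}

def summarize(composition):
    """Sum percentages by saturation class and for the C18 chain length."""
    saturated = sum(p for name, p in composition.items() if name.endswith(":0"))
    unsaturated = sum(p for name, p in composition.items() if not name.endswith(":0"))
    c18 = sum(p for name, p in composition.items() if name.startswith("C18:"))
    return {"saturated": round(saturated, 1),
            "unsaturated": round(unsaturated, 1),
            "C18": round(c18, 1)}

tallow_summary = summarize(TALLOW)
e267_summary = summarize(E267)
```

On these figures the Emersol.RTM. 267 feed is predominantly unsaturated C18, whereas Tallow is roughly half saturated, which is the kind of difference that would guide the selection of P450 enzymes with suitable substrate preference.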
[0111] The following examples are meant to illustrate but not to
limit the invention. All relevant microbial strains and plasmids
are described in Table 1 and Table 2, respectively.
TABLE 1. List of Escherichia coli and Candida tropicalis strains

STRAIN | GENOTYPE | SOURCE

E. coli
XL1Blue-MRF' | endA1, gyrA96, hsdR17, lac, recA1, relA1, supE44, thi-1, [F' lacI.sup.RZ M15, proAB, Tn10] | Stratagene, La Jolla, CA
BM25.8 | SupE44, thi (lac-proAB) [F' traD36, proAB, lacI.sup.RZ M15] .lambda.imm434 (kan.sup.R)P1 (cam.sup.R) hsdR (r.sub.1-12-m.sub.1-12-) | Clontech, Palo Alto, CA
XLOLR | (mcrA)183 (mcrCB-hsdSMR-mrr)173 endA1 thi-1 recA1 gyrA96 relA1 lac [F' proAB lacI.sup.RZ M15 Tn10 (Tet') Su (nonsuppressing .lambda.'(lambda resistant) | Stratagene, La Jolla, CA

C. tropicalis
ATCC20336 | Wild-type | American Type Culture Collection, Rockville, MD
ATCC750 | Wild-type | American Type Culture Collection, Rockville, MD
ATCC 20962 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A | Henkel
H5343 ura- | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3- | Henkel
HDC1 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CYP52A2A | Henkel
HDC5 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CYP52A3A | Henkel
HDC10 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CPRB | Henkel
HDC15 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CYP52A5A | Henkel
HDC20 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CYP52A2A + CPR B (CYP and CPR have opposite 5' to 3' orientation with respect to each other) | Henkel
HDC23 | ura3A/ura3B, pox4A::ura3A/pox4B::ura3A, pox5::ura3A/pox5::URA3A, ura3::URA3A-CYP52A2A + CPR B (CYP and CPR have same 5' to 3' orientation with respect to each other) | Henkel
[0112]
TABLE 2. List of plasmids isolated from genomic libraries and
constructed for use in gene integrations.

Plasmid | Base vector | Insert | Insert size | Plasmid size | Description
pURAin | pNEB193 | URA3A | 1706 bp | 4399 bp | pNEB193 with the URA3A gene inserted in the AscI-PmeI site, generating a PacI site
pURA2in | pURAin | CYP52A2A | 2230 bp | 6629 bp | pURAin containing a PCR CYP52A2A allele containing PacI restriction sites
pURAREDBin | pURAin | CPRB | 3266 bp | 7665 bp | pURAin containing a PCR CPRB allele containing PacI restriction sites
pHKM1 | pTriplEx | Truncated CPRA gene | Approx. 3.8 kb | Approx. 7.4 kb | A truncated CPRA gene obtained by first screening library containing the 5' untranslated region and 1.2 kb open reading frame
pHKM4 | pTriplEx | Truncated CPRA gene | Approx. 5 kb | Approx. 8.6 kb | A truncated CPRA gene obtained by screening second library containing the 3' untranslated region end sequence
pHKM9 | pBC-CMV | CPRB gene | Approx. 5.3 kb | Approx. 9.8 kb | CPRB allele isolated from the third library
pHKM11 | pBC-CMV | CYP52A1A | Approx. 5 kb | Approx. 9.5 kb | CYP52A1A isolated from the third library
pHKM12 | pBC-CMV | CYP52A8A | Approx. 7.5 kb | Approx. 12 kb | CYP52A8A isolated from the third library
pHKM13 | pBC-CMV | CYP52D4A | Approx. 7.3 kb | Approx. 11.8 kb | CYP52D4A isolated from the third library
pHKM14 | pBC-CMV | CYP52A2B | Approx. 6 kb | Approx. 10.5 kb | CYP52A2B isolated from the third library
pHKM15 | pBC-CMV | CYP52A8B | Approx. 6.6 kb | Approx. 11.1 kb | CYP52A8B isolated from the third library
pPAL3 | pTriplEx | CYP52A5A | 4.4 kb | Approx. 8.1 kb | CYP52A5A isolated from the 1st library
pPA5 | pTriplEx | CYP52A5B | 4.1 kb | Approx. 7.8 kb | CYP52A5B isolated from the 2nd library
pPA15 | pTriplEx | CYP52A2A | 6.0 kb | Approx. 9.7 kb | CYP52A2A isolated from the 2nd library
pPA57 | pTriplEx | CYP52A3A | 5.5 kb | Approx. 9.2 kb | CYP52A3A isolated from the 2nd library
pPA62 | pTriplEx | CYP52A3B | 6.0 kb | Approx. 9.7 kb | CYP52A3B isolated from the 2nd library
EXAMPLE 1
Purification of Genomic DNA from Candida tropicalis ATCC 20336
[0113] A. Construction of Genomic Libraries
[0114] 50 ml of YEPD broth (see Table 9) was inoculated with a
single colony of C. tropicalis 20336 from YEPD agar plate and grown
overnight at 30.degree. C. 5 ml of the overnight culture was
inoculated into 100 ml of fresh YEPD broth and incubated at
30.degree. C. for 4 to 5 hr with shaking. Cells were harvested by
centrifugation, washed twice with sterile distilled water and
resuspended in 4 ml of spheroplasting buffer (1 M Sorbitol, 50 mM
EDTA, 14 mM mercaptoethanol) and incubated for 30 min at 37.degree.
C. with gentle shaking. 0.5 ml of 2 mg/ml zymolyase (ICN
Pharmaceuticals, Inc., Irvine, CA) was added and incubated at
37.degree. C. with gentle shaking for 30 to 60 min. Spheroplast
formation was monitored by SDS lysis. Spheroplasts were harvested
by brief centrifugation (4,000 rpm, 3 min) and were washed once
with the spheroplast buffer without mercaptoethanol. Harvested
spheroplasts were then suspended in 4 ml of lysis buffer (0.2 M
Tris/pH 8.0, 50 mM EDTA, 1% SDS) containing 100 .mu.g/ml RNase
(Qiagen Inc., Chatsworth, Calif.) and incubated at 37.degree. C.
for 30 to 60 min.
[0115] Proteins were denatured and extracted twice with an equal
volume of chloroform/isoamyl alcohol (24:1) by gently mixing the
two phases by hand inversions. The two phases were separated by
centrifugation at 10,000 rpm for 10 min and the aqueous phase
containing the high-molecular weight DNA was recovered. To the
aqueous layer NaCl was added to a final concentration of 0.2 M and
the DNA was precipitated by adding 2 vol of ethanol. Precipitated
DNA was spooled with a clean glass rod and resuspended in TE buffer
(10 mM Tris/pH 8.0, 1 mM EDTA) and allowed to dissolve overnight at
4.degree. C. To the dissolved DNA, RNase free of any DNase activity
(Qiagen Inc., Chatsworth, Calif.) was added to a final
concentration of 50 .mu.g/ml and incubated at 37.degree. C. for 30
min. Then protease (Qiagen Inc., Chatsworth, Calif.) was added to a
final concentration of 100 .mu.g/ml and incubated at 55 to
60.degree. C. for 30 min. The solution was extracted once with an
equal volume of phenol/chloroform/isoamyl alcohol (25:24:1) and
once with an equal volume of chloroform/isoamyl alcohol (24:1). To the
aqueous phase 0.1 vol of 3 M sodium acetate and 2 volumes of ice
cold ethanol (200 proof) were added and the high molecular weight
DNA was spooled with a glass rod and dissolved in 1 to 2 ml of TE
buffer.
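The salt additions in the preceding extraction are specified either as a final concentration (0.2 M NaCl) or as a volume fraction of a stock (0.1 vol of 3 M sodium acetate). A minimal Python sketch of the underlying dilution arithmetic follows. It assumes additive volumes and a sample that initially contains none of the added salt. The 5 M NaCl stock concentration and the 4 ml sample volume in the example are assumptions for illustration, since no stock concentration is stated above.

```python
def stock_volume_needed(sample_ml, stock_molar, final_molar):
    """Volume of stock (ml) to add so that the mixture reaches final_molar.

    Derived from stock_molar * v = final_molar * (sample_ml + v).
    """
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock concentration")
    return final_molar * sample_ml / (stock_molar - final_molar)

def final_molar_after_adding(sample_ml, stock_ml, stock_molar):
    """Final concentration after adding stock_ml of stock to sample_ml."""
    return stock_molar * stock_ml / (sample_ml + stock_ml)

# Assumed 5 M NaCl stock, 4 ml aqueous phase, 0.2 M final
nacl_ml = stock_volume_needed(4.0, 5.0, 0.2)
# 0.1 vol of 3 M sodium acetate added to 1 ml of aqueous phase
acetate_final = final_molar_after_adding(1.0, 0.1, 3.0)
```

Under these assumptions roughly 0.17 ml of stock brings 4 ml to 0.2 M NaCl, and 0.1 vol of 3 M sodium acetate gives a final concentration of about 0.27 M before the ethanol is added.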
[0116] B. Genomic DNA Preparation for PCR Amplification of CYP and
CPR Genes
[0117] 5 ml of YPD medium was inoculated with a single colony
and grown at 30.degree. C. overnight. The culture was centrifuged
for 5 min at 1200.times.g. The supernatant was removed by
aspiration and 0.5 ml of a sorbitol solution (0.9 M sorbitol, 0.1 M
Tris-Cl pH 8.0, 0.1 M EDTA) was added to the pellet. The pellet was
resuspended by vortexing and 1 .mu.l of 2-mercaptoethanol and 50
.mu.l of a 10 .mu.g/ml zymolyase solution were added to the
mixture. The tube was incubated at 37.degree. C. for 1 hr on a
rotary shaker (200 rpm). The tube was then centrifuged for 5 min at
1200.times.g and the supernatant was removed by aspiration. The
protoplast pellet was resuspended in 0.5 ml 1.times. TE (10 mM
Tris-Cl pH 8.0, 1 mM EDTA) and transferred to a 1.5 ml
microcentrifuge tube. The protoplasts were lysed by the addition of
50 .mu.l 10% SDS followed by incubation at 65.degree. C. for 20
min. Next, 200 .mu.l of 5M potassium acetate was added and after
mixing, the tube was incubated on ice for at least 30 min. Cellular
debris was removed by centrifugation at 13,000.times.g for 5 min.
The supernatant was carefully removed and transferred to a new
microfuge tube. The DNA was precipitated by the addition of 1 ml
100% (200 proof) ethanol followed by centrifugation for 5 min at
13,000.times.g. The DNA pellet was washed with 1 ml 70% ethanol
followed by centrifugation for 5 min at 13,000.times.g. After
partially drying the DNA under a vacuum, it was resuspended in 200
.mu.l of 1.times. TE. The DNA concentration was determined by the
ratio of the absorbance at 260 nm/280 nm (A.sub.260/280).
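For reference, a short Python sketch of the usual spectrophotometric calculation is given here. It relies on the widely used convention that an A.sub.260 of 1.0 in a 1 cm path corresponds to approximately 50 .mu.g/ml of double-stranded DNA, and it treats the A.sub.260/280 ratio as an indicator of protein contamination. The conversion factor, the function names and the example readings are assumptions and are not recited in the procedure above.

```python
DSDNA_UG_PER_ML_PER_A260 = 50.0  # conventional factor for double-stranded DNA

def dsdna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_cm=1.0):
    """Estimate dsDNA concentration of the undiluted sample from A260."""
    return a260 * DSDNA_UG_PER_ML_PER_A260 * dilution_factor / path_cm

def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 1.8 are commonly taken as clean DNA."""
    return a260 / a280

# Hypothetical reading of a 1:20 dilution
concentration = dsdna_concentration_ug_per_ml(0.25, dilution_factor=20)
ratio = purity_ratio(0.25, 0.14)
```

With these hypothetical readings the undiluted sample would contain about 250 .mu.g/ml DNA.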
EXAMPLE 2
Construction of Candida tropicalis 20336 Genomic Libraries
[0118] Three genomic libraries of C. tropicalis were constructed,
two at Clontech Laboratories, Inc., (Palo Alto, Calif.) and one at
Henkel Corporation (Cincinnati, Ohio).
[0119] A. Clontech Libraries
[0120] The first Clontech library was made as follows: Genomic DNA
was prepared from C. tropicalis 20336 as described above, partially
digested with EcoRI and size fractionated by gel electrophoresis to
eliminate fragments smaller than 0.6 kb. Following size
fractionation, several ligations of the EcoRI genomic DNA fragments
and lambda (.lambda.) TriplEx.TM. vector (FIG. 1) arms with EcoRI
sticky ends were packaged into .lambda. phage heads under
conditions designed to obtain one million independent clones. The
second genomic library was constructed as follows: Genomic DNA was
digested partially with Sau3A1 and size fractionated by gel
electrophoresis. The DNA fragments were blunt ended using standard
protocols as described, e.g., in Sambrook et al, Molecular Cloning:
A Laboratory Manual, 2ed. Cold Spring Harbor Press, USA (1989),
incorporated herein by reference. The strategy was to fill in the
Sau3A1 overhangs with Klenow polymerase (Life Technologies, Grand
Island, N.Y.) followed by digestion with S1 nuclease (Life
Technologies, Grand Island, N.Y.). After S1 nuclease digestion the
fragments were end filled one more time with Klenow polymerase to
obtain the final blunt-ended DNA fragments. EcoRI linkers were
ligated to these blunt-ended DNA fragments followed by ligation
into the .lambda. TriplEx vector. The resultant library contained
approximately 2.times.10.sup.6 independent clones with an average
insert size of 4.5 kb.
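Whether a library of a given size is likely to contain any particular gene can be estimated with the standard relation P = 1 - (1 - f).sup.N, where f is the insert size divided by the genome size and N is the number of independent clones. The Python sketch below applies it. The genome size used is a placeholder assumption for illustration only and is not a value stated herein.

```python
import math

ASSUMED_GENOME_BP = 15_000_000  # hypothetical haploid genome size, illustration only

def fold_coverage(n_clones, insert_bp, genome_bp):
    """Total cloned DNA expressed as multiples of the genome size."""
    return n_clones * insert_bp / genome_bp

def clones_needed(probability, insert_bp, genome_bp):
    """Independent clones needed so that a given sequence is present
    with the requested probability, from P = 1 - (1 - f)**N."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))

# Second Clontech library: about 2 x 10^6 clones, 4.5 kb average insert
coverage = fold_coverage(2e6, 4500, ASSUMED_GENOME_BP)
needed_99 = clones_needed(0.99, 4500, ASSUMED_GENOME_BP)
```

Under the assumed genome size, the number of clones in the second library exceeds the number needed for 99% representation by more than a hundredfold.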
[0121] B. Henkel Library
[0122] The third genomic library was constructed at Henkel
Corporation using .lambda.ZAP Express.TM. vector (Stratagene, La
Jolla, Calif.) (FIG. 2). Genomic DNA was partially digested with
Sau3A1 and fragments in the range of 6 to 12 kb were purified from
an agarose gel after electrophoresis of the digested DNA. These DNA
fragments were then ligated to BamHI digested .lambda.ZAP
Express.TM. vector arms according to manufacturer's protocols. Three
ligations were set up to obtain approximately 9.8.times.10.sup.5
independent clones. All three libraries were pooled and amplified
according to manufacturer instructions to obtain high-titre
(>10.sup.9 plaque forming units/ml) stock for long-term storage.
The titre of packaged phage library was ascertained after infection
of E. coli XL1Blue-MRF' cells. E. coli XL1Blue-MRF' were grown
overnight in either LB medium or NZCYM (Table 9) containing 10
mM MgSO.sub.4 and 0.2% maltose at 37.degree. C. or 30.degree. C.,
respectively with shaking. Cells were then centrifuged and
resuspended in 0.5 to 1 volume of 10 mM MgSO.sub.4. 200 .mu.l of
this E. coli culture was mixed with several dilutions of packaged
phage library and incubated at 37.degree. C. for 15 min. To this
mixture 2.5 ml of LB top agarose or NZCYM top agarose (maintained
at 60.degree. C.) (see Table 9) was added and plated on LB agar or
NZCYM agar (see Table 9) present in 82 mm petri dishes. Phage were
allowed to propagate overnight at 37.degree. C. to obtain discrete
plaques and the phage titre was determined.
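The titre follows from the plaque count, the dilution plated and the volume of diluted phage added to the cells. A minimal Python sketch is given below. The function name and the example count are hypothetical.

```python
def titre_pfu_per_ml(plaque_count, dilution, plated_ml):
    """Plaque forming units per ml of the undiluted phage stock.

    dilution: fraction of the original stock in the plated sample,
              e.g. 1e-6 for a one-in-a-million dilution.
    plated_ml: volume of the diluted phage mixed with the cells.
    """
    return plaque_count / (dilution * plated_ml)

# Hypothetical plate: 150 plaques from 0.1 ml of a 10^-6 dilution
example_titre = titre_pfu_per_ml(150, 1e-6, 0.1)
```

In this hypothetical case the stock would be about 1.5.times.10.sup.9 pfu/ml, which is above the 10.sup.9 pfu/ml threshold mentioned for long-term storage.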
EXAMPLE 3
Screening of Genomic Libraries
[0123] Both .lambda.TriplEx.TM. and .lambda.ZAP Express.TM. vectors
are phagemid vectors that can be propagated either as phage or
plasmid DNA (after conversion of phage to plasmid). Therefore, the
genomic libraries constructed in these vectors can be screened
either by plaque hybridization (screening of lambda form of
library) or by colony hybridization (screening plasmid form of
library after phage to plasmid conversion). Both vectors are
capable of expressing the cloned genes and the main difference is
the mechanism of excision of plasmid from the phage DNA. The
cloning site in .lambda.TriplEx.TM. is located within a plasmid
which is present in the phage and is flanked by loxP sites (FIG. 1).
When .lambda.TriplEx.TM. is introduced into E. coli strain BM25.8
(supplied by Clontech), the Cre recombinase present in BM25.8
promotes the excision and circularization of plasmid pTriplEx from
the phage .lambda.TriplEx.TM. at the loxP sites. The mechanism of
excision of plasmid pBK-CMV from phage .lambda.ZAP Express.TM. is
different. It requires the assistance of a helper phage such as
ExAssist.TM. (Stratagene) and an E. coli strain such as XLOLR
(Stratagene). Both pTriplEx and pBK-CMV can replicate autonomously
in E. coli.
[0124] A. Screening Genomic Libraries (Plasmid Form)
[0125] 1) Colony Lifts
[0126] A single colony of E. coli BM25.8 was inoculated into 5 ml
of LB containing 50 .mu.g/ml kanamycin, 10 mM MgSO.sub.4 and 0.1%
maltose and grown overnight at 31.degree. C., 250 rpm. To 200
.mu.l of this overnight culture (.about.4.times.10.sup.8 cells) 1
.mu.l of phage library (2-5.times.10.sup.6 plaque forming units)
and 150 .mu.l LB broth were added and incubated at 31.degree. C.
for 30 min after which 400 .mu.l of LB broth was added and
incubated at 31.degree. C., 225 rpm for 1 h. This bacterial
culture was diluted and plated on LB agar containing 50 .mu.g/ml
ampicillin (Sigma Chemical Company, St. Louis, Mo.) and kanamycin
(Sigma Chemical Company) to obtain 500 to 600 colonies/plate. The
plates were incubated at 37.degree. C. for 6 to 7 hrs until the
colonies became visible. The plates were then stored at 4.degree.
C. for 1.5 h before placing a Colony/Plaque Screen.TM.
Hybridization Transfer Membrane disc (DuPont NEN Research Products,
Boston, Mass.) on the plate in contact with bacterial colonies. The
transfer of colonies to the membrane was allowed to proceed for 3
to 5 min. The membrane was then lifted and placed on a fresh LB
agar (see Table 9) plate containing 200 .mu.g/ml of chloramphenicol
with the side exposed to the bacterial colonies facing up. The
plates containing the membranes were then incubated at 37.degree.
C. overnight in order to allow full development of the bacterial
colonies. The LB agar plates from which colonies were initially
lifted were incubated at 37.degree. C. overnight and stored at
4.degree. C. for future use. The following morning the membranes
containing bacterial colonies were lifted and placed on two sheets
of Whatman 3MM (Whatman, Hillsboro, Oreg.) paper saturated with 0.5
N NaOH and left at room temperature (RT) for 3 to 6 min to lyse the
cells. Additional treatment of membranes was as described in the
protocol provided by NEN Research Products.
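The cell and phage numbers given for the infection step imply a low multiplicity of infection. The Python sketch below computes it and, under the common assumption that phage adsorb to cells according to a Poisson distribution, the fraction of cells receiving at least one phage. The Poisson model and the function names are assumptions for illustration.

```python
import math

def multiplicity_of_infection(pfu, cells):
    """Average number of phage particles per bacterial cell."""
    return pfu / cells

def fraction_infected(moi):
    """Fraction of cells with at least one phage, assuming Poisson adsorption."""
    return 1.0 - math.exp(-moi)

CELLS = 4e8  # approximate cells in 200 microliters of overnight culture
moi_low = multiplicity_of_infection(2e6, CELLS)
moi_high = multiplicity_of_infection(5e6, CELLS)
infected_low = fraction_infected(moi_low)
```

At multiplicities of this order nearly every infected cell carries a single phage, so each resulting colony is expected to derive from one library clone.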
[0127] 2) DNA Hybridizations
[0128] Membranes were dried overnight before hybridizing to
oligonucleotide probes prepared using a non-radioactive ECL.TM.
3'-oligolabelling and detection system from Amersham Life Sciences
(Arlington Heights, Ill.). DNA labeling, prehybridization and
hybridizations were performed according to manufacturer's
protocols. After hybridization, membranes were washed twice at room
temperature in 5.times.SSC, 0.1% SDS (in a volume equivalent to 2
ml/cm.sup.2 of membrane) for 5 min each followed by two washes at
50.degree. C. in 1.times.SSC, 0.1% SDS (in a volume equivalent to 2
ml/cm.sup.2 of membrane) for 15 min each. The hybridization signal
was then generated and detected with Hyperfilm ECL.TM. (Amersham)
according to manufacturer's protocols. Membranes were aligned to
plates containing bacterial colonies from which colony lifts were
performed and colonies corresponding to positive signals on X-ray
film were then isolated and propagated in LB broth. Plasmid DNAs were
isolated from these cultures and analyzed by restriction enzyme
digestions and by DNA sequencing.
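Wash volumes are specified per unit membrane area, so the volume required depends on the membrane size and the number of membranes washed together. A minimal Python sketch for circular membrane discs follows. The 82 mm disc diameter in the example is an assumption chosen to match the petri dish size mentioned herein, and the function names are hypothetical.

```python
import math

def disc_area_cm2(diameter_mm):
    """Area of a circular membrane disc in square centimetres."""
    radius_cm = diameter_mm / 20.0  # mm diameter -> cm radius
    return math.pi * radius_cm ** 2

def wash_volume_ml(diameter_mm, ml_per_cm2=2.0, n_membranes=1):
    """Wash buffer volume for n discs at the stated ml per square centimetre."""
    return disc_area_cm2(diameter_mm) * ml_per_cm2 * n_membranes

single_disc_ml = wash_volume_ml(82)                 # assumed 82 mm disc
ten_discs_ml = wash_volume_ml(82, n_membranes=10)
```

For a single assumed 82 mm disc this comes to a little over 100 ml per wash.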
[0129] B. Screening Genomic Libraries (Plaque Form)
[0130] 1) .lambda. Library Plating
[0131] E. coli XL1Blue-MRF' cells were grown overnight in LB medium
(25 ml) containing 10 mM MgSO.sub.4 and 0.2% maltose at 37.degree.
C., 250 rpm. Cells were then centrifuged (2,200.times.g for 10 min)
and resuspended in 0.5 volumes of 10 mM MgSO.sub.4. 500 .mu.l of
this E. coli culture was mixed with a phage suspension containing
25,000 amplified lambda phage particles and incubated at 37.degree.
C. for 15 min. To this mixture 6.5 ml of NZCYM top agarose
(maintained at 60.degree. C.) (see Chart) was added and plated on
80-100 ml NZCYM agar (see Chart) present in a 150 mm petri dish.
Phage were allowed to propagate overnight at 37.degree. C. to
obtain discrete plaques. After overnight growth plates were stored
in a refrigerator for 1-2 hr before plaque lifts were
performed.
[0132] 2) Plaque Lift and DNA Hybridizations
[0133] Magna Lift.TM. nylon membranes (Micron Separations, Inc.,
Westborough, Mass.) were placed on the agar surface in complete
contact with .lambda. plaques and transfer of plaques to nylon
membranes was allowed to proceed for 5 min at RT. After plaque
transfer the membrane was placed on 2 sheets of Whatman 3MM.TM.
(Whatman, Hillsboro, Oreg.) filter paper saturated with a 0.5 N
NaOH, 1.0 M NaCl solution and left for 10 min at RT to denature
DNA. Excess denaturing solution was removed by blotting briefly on
dry Whatman 3MM paper. Membranes were then transferred to 2 sheets
of Whatman 3MM.TM. paper saturated with 0.5 M Tris-HCl (pH 8.0),
1.5 M NaCl and left for 5 min to neutralize. Membranes were then
briefly washed in 200-500 ml of 2.times.SSC, dried by air and baked
for 30-40 min at 80.degree. C. The membranes were then probed with
labelled DNA.
[0134] Membranes were prewashed with a 200-500 ml solution of
5.times.SSC, 0.5% SDS, 1 mM EDTA (pH 8.0) for 1-2 hr at 42.degree.
C. with shaking (60 rpm) to get rid of bacterial debris from the
membranes. The membranes were prehybridized for 1-2 hr at
42.degree. C. with ECL Gold.TM. buffer (Amersham) containing 0.5 M
NaCl and 5% blocking reagent (in a volume equivalent to 0.125-0.25
ml/cm.sup.2 of membrane). DNA fragments that were used as
probes were purified from agarose gel using a QIAEX II.TM. gel
extraction kit (Qiagen Inc., Chatsworth, Calif.) according to
manufacturer's protocol and labeled using an Amersham ECL.TM. direct
nucleic acid labeling kit (Amersham). Labeled DNA (5-10 ng/ml
hybridization solution) was added to the prehybridized membranes
and the hybridization was allowed to proceed overnight. The
following day membranes were washed with shaking (60 rpm) twice at
42.degree. C. for 20 min each time in a buffer containing either
0.1.times. (high stringency) or 0.5.times. (low stringency) SSC,
0.4% SDS and 360 g/l urea (in a volume equivalent to 2 ml/cm.sup.2
of membrane). This was followed by two 5 min washes at room
temperature in 2.times.SSC (in a volume equivalent to 2 ml/cm.sup.2
of membrane).
Hybridization signals were generated using the ECL.TM. nucleic acid
detection reagent and detected using Hyperfilm ECL.TM.
(Amersham).
[0135] Agar plugs which contained plaques corresponding to positive
signals on the X-ray film were taken from the master plates using
the broad end of a Pasteur pipet. Plaques were selected by aligning
the plates with the X-ray film. At this stage, multiple plaques
were generally taken. Phage particles were eluted from the agar
plugs by soaking in 1 ml SM buffer (Sambrook et al., supra)
overnight. The phage eluate was then diluted and plated with
freshly grown E. coli XL1Blue-MRF' cells to obtain 100-500 plaques
per 85 mm NZCYM agar plate. Plaques were transferred to Magna Lift
nylon membranes as before and probed again using the same probe.
Single well-isolated plaques corresponding to signals on X-ray film
were picked by removing agar plugs and eluting the phage by soaking
overnight in 0.5 ml SM buffer.
[0136] C. Conversion of .lambda. Clones to Plasmid Form
[0137] The lambda clones isolated were converted to plasmid form
for further analysis. Conversion from the plaque to the plasmid
form was accomplished by infecting E. coli strain BM25.8 with the
eluted phage. The E. coli strain was grown overnight at 31.degree. C.,
250 rpm in LB broth containing 10 mM MgSO.sub.4 and 0.2% maltose
until the OD.sub.600 reached 1.1-1.4. Ten milliliters of the
overnight culture was removed and mixed with 100 .mu.l of 1 M
MgCl.sub.2. A 200 .mu.l volume of cells was removed, mixed with 150
.mu.l of eluted phage suspension and incubated at 31.degree. C.
for 30 min. LB broth (400 .mu.l) was added to the tube and
incubation was continued at 31.degree. C. for 1 hr with shaking,
250 rpm. 1-10 .mu.l of the infected cell suspension was plated on
LB agar containing 100 .mu.g/ml ampicillin (Sigma, St. Louis, Mo.).
Well-isolated colonies were picked and grown overnight in 5 ml LB
broth containing 100 .mu.g/ml ampicillin at 37.degree. C., 250 rpm.
Plasmid DNA was isolated from these cultures and analyzed. To
convert the .lambda.ZAP Express.TM. vector to plasmid form E. coli
strains XL1Blue-MRF' and XLOLR were used. The conversion was
performed according to the manufacturer's (Stratagene) protocols
for single-plaque excision.
EXAMPLE 4
Transformation of C. tropicalis H5343 ura-
[0138] A. Transformation of C. tropicalis H5343 by
Electroporation
[0139] 5 ml of YEPD was inoculated with C. tropicalis H5343 ura-
from a frozen stock and incubated overnight on a New Brunswick
shaker at 30.degree. C. and 170 rpm. The next day, 10 .mu.l of the
overnight culture was inoculated into 100 ml YEPD and growth was
continued at 30.degree. C., 170 rpm. The following day the cells
were harvested at an OD.sub.600 of 1.0 and the cell pellet was
washed one time with sterile ice-cold water. The cells were
resuspended in ice-cold sterile 35% polyethylene glycol (4,000 MW)
to a density of 5.times.10.sup.6 cells/ml. A 0.1 ml volume of cells was
utilized for each electroporation. The following electroporation
protocol was followed: 1.0 .mu.g of transforming DNA was added to
0.1 ml cells, along with 5 .mu.g denatured, sheared calf thymus DNA
and the mixture was allowed to incubate on ice for 15 min. The cell
solution was then transferred to an ice-cold 0.2 cm electroporation
cuvette, tapped to make sure the solution was on the bottom of the
cuvette and electroporated. The cells were electroporated using an
Invitrogen electroporator (Carlsbad, Calif.) at 450 Volts, 200 Ohms
and 250 .mu.F. Following electroporation, 0.9 ml SOS media (1M
Sorbitol, 30% YEPD, 10 mM CaCl.sub.2) was added to the suspension.
The resulting culture was grown for 1 hr at 30.degree. C., 170 rpm.
Following the incubation, the cells were pelleted by centrifugation
at 1500.times.g for 5 min. The electroporated cells were
resuspended in 0.2 ml of 1M sorbitol and plated on synthetic
complete media minus uracil (SC--uracil) (Nelson, supra). In some
cases the electroporated cells were plated directly onto SC-uracil.
Growth of transformants was monitored for 5 days. After three days,
several transformants were picked and transferred to SC-uracil
plates for genomic DNA preparation and screening.
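For orientation only, the pulse settings recited above (450 Volts, 200 Ohms, 250 .mu.F, 0.2 cm cuvette) imply a nominal field strength and RC time constant. The following is a minimal sketch of that arithmetic, assuming an ideal RC discharge; the function names are illustrative and are not part of the described protocol, and the actual pulse delivered depends on the conductivity of the sample.

```python
# Nominal pulse parameters implied by the electroporation settings above.
# Assumption: ideal RC discharge; function names are illustrative only.

def field_strength_kv_per_cm(volts, gap_cm):
    """Nominal field strength across the cuvette gap, in kV/cm."""
    return volts / gap_cm / 1000.0


def time_constant_ms(resistance_ohm, capacitance_uf):
    """RC time constant in ms (ohms x microfarads gives microseconds)."""
    return resistance_ohm * capacitance_uf / 1000.0


field = field_strength_kv_per_cm(450, 0.2)  # 450 V across a 0.2 cm gap
tau = time_constant_ms(200, 250)            # 200 Ohms with 250 uF
print(f"{field:.2f} kV/cm, time constant {tau:.0f} ms")
```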
[0140] B. Transformation of C. tropicalis Using Lithium Acetate
[0141] The following protocol was used to transform C. tropicalis
in accordance with the procedures described in Current Protocols in
Molecular Biology, Supplement 5, 13.7.1 (1989), incorporated herein
by reference.
[0142] 5 ml of YEPD was inoculated with C. tropicalis H5343 ura-
from a frozen stock and incubated overnight on a New Brunswick
shaker at 30.degree. C. and 170 rpm. The next day, 10 .mu.l of the
overnight culture was inoculated into 50 ml YEPD and growth was
continued at 30.degree. C., 170 rpm. The following day the cells
were harvested at an OD.sub.600 of 1.0. The culture was transferred
to a 50 ml polypropylene tube and centrifuged at 1000.times.g for
10 min. The cell pellet was resuspended in 10 ml sterile TE (10 mM
Tris-Cl and 1 mM EDTA, pH 8.0). The cells were again centrifuged at
1000.times.g for 10 min and the cell pellet was resuspended in 10
ml of a sterile lithium acetate solution [LiAc (0.1 M lithium
acetate, 10 mM Tris-Cl, pH 8.0, 1 mM EDTA)]. Following
centrifugation at 1000.times.g for 10 min., the pellet was
resuspended in 0.5 ml LiAc. This solution was incubated for one
hour at 30.degree. C. while shaking gently at 50 rpm. A 0.1 ml
aliquot of this suspension was incubated with 5 .mu.g of
transforming DNA at 30.degree. C. with no shaking for 30 min. A 0.7
ml PEG solution (40% wt/vol polyethylene glycol 3340, 0.1 M lithium
acetate, 10 mM Tris-Cl, pH 8.0, 1 mM EDTA) was added and incubated
at 30.degree. C. for 45 min. The tubes were then placed at
42.degree. C. for 5 min. A 0.2 ml aliquot was plated on synthetic
complete media minus uracil (SC-uracil) (Kaiser et al. Methods in
Yeast Genetics, Cold Spring Harbor Laboratory Press, USA, 1994,
incorporated herein by reference). Growth of transformants was
monitored for 5 days. After three days, several transformants were
picked and transferred to SC-uracil plates for genomic DNA
preparation and screening.
EXAMPLE 5
Plasmid DNA Isolation
[0143] Plasmid DNA was isolated from E. coli cultures using a Qiagen
plasmid isolation kit (Qiagen Inc., Chatsworth, Calif.) according
to manufacturer's instructions.
EXAMPLE 6
DNA Sequencing and Analysis
[0144] DNA sequencing was performed at Sequetech Corporation
(Mountain View, Calif.) using Applied Biosystems automated
sequencer (Perkin Elmer, Foster City, Calif.). DNA sequences were
analyzed with MacVector and GeneWorks software packages (Oxford
Molecular Group, Campbell, Calif.).
EXAMPLE 7
PCR Protocols
[0145] PCR amplification was carried out in a Perkin Elmer
Thermocycler using the AmpliTaqGold enzyme (Perkin Elmer Cetus,
Foster City, Calif.) kit according to manufacturer's
specifications. Following successful amplification, in some cases,
the products were digested with the appropriate enzymes and gel
purified using QiaexII (Qiagen, Chatsworth, Calif.) as per
manufacturer instructions. In specific cases the Ultma Taq
polymerase (Perkin Elmer Cetus, Foster City, Calif.) or the Expand
Hi-Fi Taq polymerase (Boehringer Mannheim, Indianapolis, Ind.) were
used per manufacturer's recommendations or as defined in Table
3.
TABLE 3 PCR amplification conditions used with different primer
combinations.
Template primer combination | Taq | Denaturing condition | Annealing temp/time | Extension temp/time | Cycle number
3674-41-1/41-2/41-4 + 3674414 | Ampli-Taq Gold | 94 C./30 sec | 55 C./30 sec | 72 C./1 min | 30
URA Primer 1a, URA Primer 1b | Ampli-Taq Gold | 95 C./1 min | 70 C./1 min | 72 C./2 min | 35
URA Primer 2a, URA Primer 2b | Ampli-Taq Gold | 95 C./1 min | 70 C./1 min | 72 C./2 min | 35
CYP2A#1, CYP2A#2 | Ampli-Taq Gold | 95 C./1 min | 70 C./1 min | 72 C./2 min | 35
CYP3A#1, CYP3A#2 | Ultma Taq | 95 C./1 min | 70 C./1 min | 72 C./1 min | 30
CPRB#1, CPRB#2 | Expand Hi-Fi Taq | 94 C./15 sec | 50 C./30 sec | 68 C./3 min | 10
 | | 94 C./15 sec | 50 C./30 sec | 68 C./3 min +20 sec/cycle | 15
CYP5A#1, CYP5A#2 | Expand Hi-Fi Taq | 94 C./15 sec | 50 C./30 sec | 68 C./3 min | 10
 | | 94 C./15 sec | 50 C./30 sec | 68 C./3 min +20 sec/cycle | 15
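The two-stage Expand Hi-Fi programs in Table 3 can be read as a short schedule. The sketch below totals the hold times for the CPRB and CYP5A programs. It rests on an assumption that is not stated in the table: that "+20 sec/cycle" lengthens the extension step by a further 20 seconds on each successive cycle of the second stage, beginning with its first cycle. Ramp times between steps are ignored, and the function name is illustrative.

```python
# Sketch of the two-stage Expand Hi-Fi program from Table 3.
# Assumption: "+20 sec/cycle" adds 20 s to the extension step on each
# successive cycle of the second stage, starting with its first cycle.

def stage_seconds(denature_s, anneal_s, extend_s, cycles, extend_increment_s=0):
    """Total hold time for one stage, ignoring ramp times between steps."""
    total = 0
    for cycle in range(1, cycles + 1):
        extension = extend_s + extend_increment_s * cycle
        total += denature_s + anneal_s + extension
    return total


# Stage 1: 10 cycles of 94 C./15 sec, 50 C./30 sec, 68 C./3 min
stage_one = stage_seconds(15, 30, 180, cycles=10)
# Stage 2: 15 cycles with the extension growing by 20 s per cycle
stage_two = stage_seconds(15, 30, 180, cycles=15, extend_increment_s=20)

total_minutes = (stage_one + stage_two) / 60
print(f"total hold time: {total_minutes:.2f} min")
```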
[0146] Table 4 below contains a list of primers (SEQ ID NOS: 1-35)
used for PCR amplification to construct gene integration vectors or
to generate probes for gene detection and isolation.
TABLE 4 Primer table for PCR amplification to construct gene
integration vectors, to generate probes for gene isolation and
detection and to obtain DNA sequence of constructs.
(A = deoxyadenosine triphosphate [dATP], G = deoxyguanosine
triphosphate [dGTP], C = deoxycytosine triphosphate [dCTP],
T = deoxythymidine triphosphate [dTTP], Y = dCTP or dTTP, R = dATP or
dGTP, W = dATP or dTTP, M = dATP or dCTP, N = dATP or dCTP or dGTP or
dTTP).
Target gene(s) | Patent Primer Name | Lab Primer Name | Sequence (5' to 3') | PCR Product Size
CYP52A2A | CYP2A#1 | 3659-72M | CCTTAATTAAATGCACGAAGCGGAGATAAAAG (SEQ ID NO:1) | 2230 bp
 | CYP2A#2 | 3659-72N | CCTTAATTAAGCATAAGCTTGCTCGAGTCT (SEQ ID NO:2) |
CYP52A3A | CYP3A#1 | 3659-72O | CCTTAATTAAACGCAATGGGAACATGGAGTG (SEQ ID NO:3) | 2154 bp
 | CYP3A#2 | 3659-72P | CCTTAATTAATCGCACTACGGTTATTGGTATCAG (SEQ ID NO:4) |
CYP52A5A | CYP5A#1 | 3659-72K | CCTTAATTAATCAAAGTACGTTCAGGCGG (SEQ ID NO:5) | 3298 bp
 | CYP5A#2 | 3659-72L | CCTTAATTAAGGCAGACAACAACTTGGCAAAGTC (SEQ ID NO:6) |
CPRB | CPRB#1 | 3698-20A | CCTTAATTAAGAGGTCGTTGGTTGAGTTTTC (SEQ ID NO:7) | 3266 bp
 | CPRB#2 | 3698-20B | CCTTAATTAATTGATAATGACGTTGCGGG (SEQ ID NO:8) |
URA3A | URA Primer 1a | 3698-7C | AGGCGCGCCGGAGTCCAAAAAGACCAACCTCTG (SEQ ID NO:9) | 956 bp
 | URA Primer 1b | 3698-7D | CCTTAATTAATACGTGGATACCTTCAAGCAAGTG (SEQ ID NO:10) |
URA3A | URA Primer 2a | 3698-7A | CCTTAATTAAGCTCACGAGTTTTGGGATTTTCGAG (SEQ ID NO:11) | 750 bp
 | URA Primer 2b | 3698-7B | GGGTTTAAACCGCAGAGGTTGGTCTTTTTGGACTC (SEQ ID NO:12) |
GGGTTTAAAC = PmeI restriction site (SEQ ID NO:13)
AGGCGCGCC = AscI restriction site (SEQ ID NO:14)
CCTTAATTAA = PacI restriction site (SEQ ID NO:15)
CPR | FMN1 | 3674-41-1 | TCYCAAACWGGTACWGCWGAA (SEQ ID NO:16) |
CPR | FMN2 | 3674-41-2 | GGTTTGGGTAAYTCWACTTTAT (SEQ ID NO:17) |
CPR | FAD | 3674-41-3 | CGTTATTAYTCYATTTCTTC (SEQ ID NO:18) |
CPR | NADPH | 3674-41-4 | GCMACACCRGTACCTGGACC (SEQ ID NO:19) |
CPR | PRK1.F3 | PRK1.F3 | ATCCCAATCGTAATCAGC (SEQ ID NO:20) |
CPR | PRK1.F5 | PRK1.F.5 | ACTTGTCTTCGTTTAGCA (SEQ ID NO:21) |
CPR | PRK4.R20 | PRK4.R20 | CTACGTCTGTGGTGATGC (SEQ ID NO:22) |
CYP | UCup1 | UCup1 | CGNGAYACNACNGCNGG (SEQ ID NO:23) |
CYP | UCup2 | UCup2 | AGRGAYACNACNGCNGG (SEQ ID NO:24) |
CYP | UCdown1 | UCdown1 | AGNGCRAAYTGYTGNCC (SEQ ID NO:25) |
CYP | UCdown2 | UCdown2 | YAANGCRAAYTGYTGNCC (SEQ ID NO:26) |
CYP | HemeB1 | HemeB1 | ATTCAACGGTGGTCCAAGAATCTGTTTGG (SEQ ID NO:27) |
CYP | 2, 3, 5P | 2, 3, 5P | GAGCTATGTTGAGACCACAGTTTGC (SEQ ID NO:28) |
CYP | 2, 3, 5M | 2, 3, 5M | CTTCAGTTAAAGCAAATTGTTTGGCC (SEQ ID NO:29) |
pTriplEx vector | Triplex5' | Triplex5' | CTCGGGAAGCGCGCCATTGTGTTGG (SEQ ID NO:30) |
pTriplEx vector | Triplex3' | Triplex3' | TAATACGACTCACTATAGGGCGAATTGGC (SEQ ID NO:31) |
CYP | Cyp52a | CYP52a | TGRYTCAAACCATCTYTCTGG (SEQ ID NO:32) |
CYP | Cyp52b | Cyp52b | GGACCGGCGTTAAAGGG (SEQ ID NO:33) |
CYP | Cyp52c | Cyp52c | CATAGTCGWATYATGCTTAGACC (SEQ ID NO:34) |
CYP | Cyp52d | Cyp52d | GGACCACCATTGAATGG (SEQ ID NO:35) |
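Several primers in Table 4 contain the degenerate symbols defined in the table legend (Y, R, W, M, N), so each such entry stands for a pool of distinct oligonucleotides. As an illustration only, the sketch below counts and enumerates the members of such a pool using just the symbols listed in the legend; the helper names are illustrative and not taken from the specification.

```python
from itertools import product
from typing import List

# Base symbols from the Table 4 legend (only the codes listed there).
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "R": "AG", "W": "AT", "M": "AC", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for symbol in primer:
        count *= len(CODES[symbol])
    return count


def expand(primer: str) -> List[str]:
    """Enumerate every concrete sequence in the degenerate pool."""
    return ["".join(bases) for bases in product(*(CODES[s] for s in primer))]


fmn1 = "TCYCAAACWGGTACWGCWGAA"   # FMN1, SEQ ID NO:16
ucup1 = "CGNGAYACNACNGCNGG"      # UCup1, SEQ ID NO:23
print(degeneracy(fmn1), degeneracy(ucup1))
```

Enumerating the pool with `expand` is practical for lightly degenerate primers such as FMN1; for primers with several N positions, `degeneracy` alone is usually sufficient.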
EXAMPLE 8
Yeast Colony PCR Procedure for Confirmation of Gene Integration
into the Genome of C. tropicalis
[0147] Single yeast colonies were removed from the surface of
transformation plates, suspended in 50 .mu.l of spheroplasting
buffer (50 mM KCl, 10 mM Tris-HCl, pH 8.3, 1.0 mg/ml Zymolyase, 5%
glycerol) and incubated at 37.degree. C. for 30 min. Following
incubation, the solution was heated for 10 min at 95.degree. C. to
lyse the cells. Five .mu.l of this solution was used as a template
in PCR. Expand Hi-Fi Taq polymerase (Boehringer Mannheim,
Indianapolis, Ind.) was used in PCR coupled with a gene-specific
primer (gene to be integrated) and a URA3 primer. If integration
did occur, amplification would yield a PCR product of predicted
size confirming the presence of an integrated gene.
EXAMPLE 9
Fermentation Method for Gene Induction Studies
[0148] A fermentor was charged with a semi-synthetic growth medium
having the composition 75 g/l glucose (anhydrous), 6.7 g/l Yeast
Nitrogen Base (Difco Laboratories), 3 g/l yeast extract, 3 g/l
ammonium sulfate, 2 g/l monopotassium phosphate, 0.5 g/l sodium
chloride. Components were made as concentrated solutions for
autoclaving then added to the fermentor upon cooling: final pH
approximately 5.2. This charge was inoculated with 5-10% of an
overnight culture of C. tropicalis ATCC 20962 prepared in YM medium
(Difco Laboratories) as described in the methods of Examples 17 and
20 of U.S. Pat. No. 5,254,466, which is incorporated herein by
reference. C. tropicalis ATCC 20962 is a POX 4 and POX 5 disrupted
C. tropicalis ATCC 20336. Air and agitation were supplied to
maintain the dissolved oxygen at greater than about 40% of
saturation versus air. The pH was maintained at about 5.0 to 8.5 by
the addition of 5N caustic soda on pH control. Both a fatty acid
feedstream (commercial oleic acid in this example) having a typical
composition: 2.4% C.sub.14; 0.7% C.sub.14:1; 4.6% C.sub.16; 5.7%
C.sub.16:1; 5.7% C.sub.17:1; 1.0% C.sub.18; 69.9% C.sub.18:1; 8.8%
C.sub.18:2; 0.30% C.sub.18:3; 0.90% C.sub.20:1 and a glucose
co-substrate feed were added in a fed-batch mode beginning near the
end of exponential growth. Caustic was added on pH control during
the bioconversion of fatty acids to diacids to maintain the pH in
the desired range. Typically, samples for gene induction studies
were collected just prior to starting the fatty acid feed and over
the first 10 hours of bioconversion. Fatty acid
and diacid content was determined by a standard methyl ester
protocol using gas liquid chromatography (GLC). Gene induction was
measured using the QC-RT-PCR protocol described in this
application.
EXAMPLE 10
RNA Preparation
[0149] The first step of this protocol involves the isolation of
total cellular RNA from cultures of C. tropicalis. The cellular RNA
was isolated using the Qiagen RNeasy Mini Kit (Qiagen Inc.,
Chatsworth, Calif.) as follows: 2 ml samples of C. tropicalis
cultures were collected from the fermentor in standard 2 ml screw
capped Eppendorf style tubes at various times before and after the
addition of the fatty acid or alkane substrate. Cell samples were
immediately frozen in liquid nitrogen or a dry-ice/alcohol bath
after their harvesting from the fermentor. To isolate total RNA
from the samples, the tubes were allowed to thaw on ice and the
cells pelleted by centrifugation in a microfuge for 5 minutes
(min) at 4.degree. C. and the supernatant was discarded while
keeping the pellet ice-cold. The microfuge tubes were filled 2/3
full with ice-cold Zirconia/Silica beads (0.5 mm diameter, Biospec
Products, Bartlesville, Okla.) and the tube filled to the top with
ice-cold RLT* lysis buffer (* buffer included with the Qiagen
RNeasy Mini Kit). Cell rupture was achieved by placing the samples
in a mini bead beater (Biospec Products, Bartlesville, Okla.) and
immediately homogenizing at full speed for 2.5 min. The samples were
allowed to cool in an ice water bath for 1 minute and the
homogenization/cool process was repeated two more times for a total of
7.5 min homogenization time in the bead beater. The homogenized
cell samples were microfuged at full speed for 10 min and 700
.mu.l of the RNA-containing supernatant was removed and transferred to
a new eppendorf tube. 700 .mu.l of 70% ethanol was added to each
sample followed by mixing by inversion. This and all subsequent
steps were performed at room temperature. Seven hundred microliters
of each ethanol treated sample were transferred to a Qiagen RNeasy
spin column, followed by centrifugation at 8,000.times.g for 15
sec. The flow through was discarded and the column reloaded with
the remaining sample (700 .mu.l) and recentrifuged at 8,000.times.g
for 15 sec. The column was washed once with 700 .mu.l of buffer
RW1*, and centrifuged at 8,000.times.g for 15 sec and the flow
through discarded. The column was placed in a new 2 ml collection
tube and washed with 500 .mu.l of RPE* buffer and the flow through
discarded. The RPE* wash was repeated with centrifugation at
8,000.times.g for 2 min and the flow through discarded. The spin
column was transferred to a new 1.5 ml collection tube and 100
.mu.l of RNase free water added to the column followed by
centrifugation at 8,000.times.g for 15 seconds. An additional 75 .mu.l
of RNase free water was added to the column followed by
centrifugation at 8,000.times.g for 2 min. RNA eluted in the water
flow through was collected for further purification.
[0150] The RNA eluate was then treated to remove contaminating DNA.
Twenty microliters of 10X DNase I buffer (0.5 M tris (pH 7.5), 50
mM CaCl.sub.2, 100 mM MgCl.sub.2), 10 .mu.l of RNase-free DNase I
(2 Units/.mu.l, Ambion Inc., Austin, Tex.) and 40 units Rnasin
(Promega Corporation, Madison, Wis.) were added to the RNA sample.
The mixture was then incubated at 37.degree. C. for 15 to 30 min.
Samples were placed on ice and 250 .mu.l Lysis buffer RLT* and 250
.mu.l ethanol (200 proof) added. The samples were then mixed by
inversion. The samples were transferred to Qiagen RNeasy spin
columns and centrifuged at 8,000.times.g for 15 sec and the flow
through discarded. Columns were placed in new 2 ml collection tubes
and washed twice with 500 .mu.l of RPE* wash buffer and the flow
through discarded. Columns were transferred to new 1.5 ml eppendorf
tubes and RNA was eluted by the addition of 100 .mu.l of DEPC
treated water followed by centrifugation at 8,000.times.g for 15
sec. Residual RNA was collected by adding an additional 50 .mu.l of
RNase free water to the spin column followed by centrifugation at
full speed for 2 min. 10 .mu.l of the RNA preparation was removed
and quantified by the (A.sub.260/280) method. RNA was stored at
-70.degree. C. Yields were found to be 30-100 .mu.g total RNA per
2.0 ml of fermentation broth.
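The absorbance-based quantification mentioned above can be illustrated with a short calculation. This sketch assumes the commonly used conversion of one A.sub.260 unit to approximately 40 .mu.g/ml of RNA in a 1 cm path, which is a general laboratory convention and is not stated in this specification. The absorbance readings and the 1:50 dilution are hypothetical; the 150 .mu.l volume is the sum of the two elutions (100 .mu.l and 50 .mu.l) recited above.

```python
# Spectrophotometric RNA quantification (illustrative values only).
# Assumption: 1.0 A260 unit corresponds to about 40 ug/ml RNA (1 cm path).
RNA_UG_PER_ML_PER_A260 = 40.0


def rna_conc_ug_per_ml(a260, dilution_factor):
    """RNA concentration of the undiluted sample, in ug/ml."""
    return a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor


def purity_ratio(a260, a280):
    """A260/A280 ratio; values near 2.0 are typical of clean RNA."""
    return a260 / a280


def total_yield_ug(conc_ug_per_ml, volume_ul):
    """Total RNA in the eluate, in ug."""
    return conc_ug_per_ml * volume_ul / 1000.0


# Hypothetical readings for a 1:50 dilution of a 150 ul eluate
conc = rna_conc_ug_per_ml(a260=0.25, dilution_factor=50)
yield_ug = total_yield_ug(conc, volume_ul=150)
ratio = purity_ratio(a260=0.25, a280=0.125)
print(f"{conc:.0f} ug/ml, {yield_ug:.0f} ug total, A260/280 = {ratio:.1f}")
```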
EXAMPLE 11
Quantitative Competitive Reverse Transcription Polymerase Chain
Reaction (QC-RT-PCR) Protocol
[0151] QC-RT-PCR is a technique used to quantitate the amount of a
specific RNA in a RNA sample. This technique employs the synthesis
of a specific DNA molecule that is complementary to an RNA molecule
in the original sample by reverse transcription and its subsequent
amplification by polymerase chain reaction. By the addition of
various amounts of a competitor RNA molecule to the sample one can
determine the concentration of the RNA molecule of interest (in
this case the mRNA transcripts of the CYP and CPR genes). The
levels of specific mRNA transcripts were assayed over time in
response to the addition of fatty acid and/or alkane substrates to
the growth medium of fermentation grown C. tropicalis cultures for
the identification and characterization of the genes involved in
the oxidation of these substrates. This approach can be used to
identify the CYP and CPR genes involved in the oxidation of any
given substrate based upon their transcriptional regulation.
[0152] A. Primer Design
[0153] The first requirement for QC-RT-PCR is the design of the
primer pairs to be used in the reverse transcription and subsequent
PCR reactions. These primers need to be unique and specific to the
gene of interest. As there is a family of genetically similar CYP
genes present in C. tropicalis 20336, care had to be taken to
design primer pairs that would be discriminating and only amplify
the gene of interest, in this example the CYP52A5 gene. In this
manner, unique primers directed to substantially non-homologous
(aka variable) regions within target members of a gene family are
constructed. What constitutes substantially non-homologous regions
is determined on a case by case basis. Such unique primers should
be specific enough to anneal the non-homologous region of the
target gene without annealing to other non-target members of the
gene family. By comparing the known sequences of the members of a
gene family, non-homologous regions are identified and unique
primers are constructed which will anneal to those regions. It is
contemplated that non-homologous regions herein would typically
exhibit less than about 85% homology but can be more homologous
depending on the positions which are conserved and stringency of
the reaction. After conducting PCR, it may be helpful to check the
reaction product to assure it represents the unique target gene
product. If not, the reaction conditions can be altered in terms of
stringency to focus the reaction to the desired target.
Alternatively a new primer or new non-homologous region can be
chosen. Due to the high level of homology between the genes of the
CYP52A family, the most variable 5 prime region of the CYP52A5
coding sequence was targeted for the design of the primer pairs. In
FIG. 3, a portion of the 5 prime coding region for the CYP52A5A
(SEQ ID NO: 36) allele of C. tropicalis 20336 is shown. The boxed
sequences in FIG. 3 are the sequences of the forward and backward
primers (SEQ ID NOS: 47 and 48) used to quantitate expression of
both alleles of this gene. The actual reverse primer (SEQ ID NO:
48) contains one less adenine than that shown in FIG. 3. Primers
used to measure the expression of specific C. tropicalis 20336
genes using the QC-RT-PCR protocol are listed in Table 5 (SEQ ID
NOS: 37-58).
TABLE 5 Primers used to measure C. tropicalis gene expression in
the QC-RT-PCR reactions.
Primer Name | Direction | Target | Sequence
3737-89F | F | CYP52A1A | CCGATGAAGTTTTCGACGAGTACCC (SEQ ID NO:37)
3737-89B | B | CYP52A1A | AAGGCTTTAACGTGTCCAATCTGGTC (SEQ ID NO:38)
alk2aF1 | F | CYP52A2A | ATTATCGCCACATACTTCACCAAATGG (SEQ ID NO:39)
alk2aB5 | B | CYP52A2A | CGAGATCGTGGATACGCTGGAGTG (SEQ ID NO:40)
7581-178-3 | F | CYP52A3A | GCCACTCGGTAACTTTGTCAGGGAC (SEQ ID NO:41)
7581-178-4 | B | CYP52A3A | CATTGAACTGAGTAGCCAAAACAGCC (SEQ ID NO:42)
3737-50F | F | CYP52A3A & CYP52A3B | CCTACGTTTGGTATCGCTACTCCGTTG (SEQ ID NO:43)
3737-50B | B | CYP52A3A & CYP52A3B | TTTCCAGCCAGCACCGTCCAAG (SEQ ID NO:44)
3737-175F | F | CYP52D4A | GCAGAGCCGATCTATGTTGCGTCC (SEQ ID NO:45)
3737-175B | B | CYP52D4A | TCATTGAATGCTTCCAGGAACCTCG (SEQ ID NO:46)
7581-97-F | F | CYP52A5A & CYP52A5B | AAGAGGGCAGGGCTCAAGAG (SEQ ID NO:47)
7581-97-M | B | CYP52A5A & CYP52A5B | TCCATGTGAAGATCCCATCAC (SEQ ID NO:48)
4P-2 | F | CYP52A8A | CTTGAAGGCCGTGTTGAACG (SEQ ID NO:49)
4M-1 | B | CYP52A8A | CAGGATTTGTCTGAGTTGCCG (SEQ ID NO:50)
3737-52F | F | POX4A & POX4B | CCATTGCCTTGAGATACGCCATTGGTAG (SEQ ID NO:51)
3737-52B | B | POX4A & POX4B | AGCCTTGGTGTCGTTCTTTTCAACGG (SEQ ID NO:52)
3737-53F | F | POX5A | TTGGGTTTGTTTGTTTCCTGTGTCCG (SEQ ID NO:53)
3737-53B | B | POX5A | CCTTTGACCTTCAATCTGGCGTAGACG (SEQ ID NO:54)
F33 | F | CPRA | GGTTTGCTGAATACGCTGAAGGTGATG (SEQ ID NO:55)
B63 | B | CPRA | TGGAGCTGAACAACTCTCTCGTCTCGG (SEQ ID NO:56)
3737-133F | F | CPRA & CPRB | TTCCTCAACACGGACAGCGG (SEQ ID NO:57)
3737-133B | B | CPRA & CPRB | AGTCAACCAGGTGTGGAACTCGTC (SEQ ID NO:58)
F = Forward, B = Backward
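The primer design criterion discussed in the Primer Design section (a target region showing less than about 85% homology to the other family members) can be screened with a simple position-by-position comparison. The sketch below uses percent identity over pre-aligned, equal-length sites as a stand-in for homology, which is a simplifying assumption. The forward primer is SEQ ID NO:47; the two "paralog" sites are hypothetical sequences invented for illustration and are not sequences from C. tropicalis.

```python
def percent_identity(site_a, site_b):
    """Percent of identical positions between two pre-aligned sites."""
    if len(site_a) != len(site_b):
        raise ValueError("sites must be aligned to the same length")
    matches = sum(a == b for a, b in zip(site_a, site_b))
    return 100.0 * matches / len(site_a)


def is_discriminating(primer_site, other_sites, threshold=85.0):
    """True if every non-target site falls below the identity threshold."""
    return all(percent_identity(primer_site, s) < threshold for s in other_sites)


primer = "AAGAGGGCAGGGCTCAAGAG"           # 7581-97-F, SEQ ID NO:47
# Hypothetical aligned sites in two non-target family members
divergent_site = "AAGTGGTCAGCGCTGAAGAC"   # 5 mismatches out of 20
similar_site = "AAGAGGGCAGGGCTCAAGTC"     # 2 mismatches out of 20

print(percent_identity(primer, divergent_site))
print(is_discriminating(primer, [divergent_site, similar_site]))
```

A site that fails this screen would, as the text notes, call for either more stringent reaction conditions or a different non-homologous region.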
[0154] B. Design and Synthesis of the Competitor DNA Template
[0155] The competitor RNA is synthesized in vitro from a competitor
DNA template that has the T7 polymerase promoter and preferably
carries a small deletion of e.g., about 10 to 25 nucleotides
relative to the native target RNA sequence. The DNA template for
the in-vitro synthesis of the competitor RNA is synthesized using
PCR primers that are between 46 and 60 nucleotides in length. In
this example, the primer pairs for the synthesis of the
CYP52A5 competitor DNA are shown in Tables 6 and 7 (SEQ ID NOS: 59
and 60).
TABLE 6 Forward and Reverse primers used to synthesize the
competitor RNA template for the QC-RT-PCR measurement of CYP52A5A
gene expression.
Forward Primer | CYP52A5A | GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG (SEQ ID NO:59)
Reverse Primer | CYP52A5A | TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG (SEQ ID NO:60)
[0156]
TABLE 7 Primers for the synthesis of the QC-RT-PCR competitor RNA
templates
Primer Name | Direction | Target | Sequence 5'-3'
3737-89C | F | CYP52A1A | GGATCCTAATACGACTCACTATAGGGAGGCCGATGAAGTTTTCGACGAGTACCC (SEQ ID NO:61)
3737-89D | B | CYP52A1A | AAGGCTTTAACGTGTCCAATCTGGTCAACATAGCTCTGGAGTGCTTCCAACC (SEQ ID NO:62)
7581-137-A | F | CYP52A2A | GGATCCTAATACGACTCACTATAGGGAGGATTATCGCCACATACTTCACCAAATGG (SEQ ID NO:63)
7581-137-B | B | CYP52A2A | CGAGATCGTGGATACGCTGGAGTGCGTCGCTCTTCTTCTTCAACAATTCAAG (SEQ ID NO:64)
7581-137-D | B | CYP52A3A | CATTGAACTGAGTAGCCAAAACAGCCCATGGTTTCAATCAATGGGAGGC (SEQ ID NO:65)
7581-137-C | F | CYP52A3A | GGATCCTAATACGACTCACTATAGGGAGGGCCACTCGGTAACTTTGTCAGGGAC (SEQ ID NO:66)
3737-50-D | F | CYP52A3A & CYP52A3B | GGATCCTAATACGACTCACTATAGGGAGGCCTACGTTTGGTATCGCTACTCCGTTG (SEQ ID NO:67)
3737-50-C | B | CYP52A3A & CYP52A3B | TTTCCAGCCAGCACCGTCCAAGCAACAAGGAGTACAAGAAATCGTGTC (SEQ ID NO:68)
3737-175C | F | CYP52D4A | GGATCCTAATACGACTCACTATAGGGAGGGCAGAGCCGATCTATGTTGCGTCC (SEQ ID NO:69)
3737-175D | B | CYP52D4A | TCATTGAATGCTTCCAGGAACCTCGCCACATCCATCGAGAACCGG (SEQ ID NO:70)
7581-97-A | F | CYP52A5A & CYP52A5B | GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG (SEQ ID NO:59)
7581-97-B | B | CYP52A5A & CYP52A5B | TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG (SEQ ID NO:60)
4P-2/T7 | F | CYP52A8A | GGATCCTAATACGACTCACTATAGGGAGGCTTGAAGGCCGTGTTGAACG (SEQ ID NO:71)
4M-3/4M-1 | B | CYP52A8A | CAGGATTTGTCTGAGTTGCCGCCTGATCAAGATAGGATCCTTGCCG (SEQ ID NO:72)
3737-26-D | F | CPRA | GGATCCTAATACGACTCACTATAGGGAGGGGTTTGCTGAATACGCTGAAGGTGATG (SEQ ID NO:73)
3737-26-C | B | CPRA | TGGAGCTGAACAACTCTCTCGTCTCGGGTGGTCGAATGGACCCTTGGTCAAG (SEQ ID NO:74)
3737-133C | F | CPRA & CPRB | GGATCCTAATACGACTCACTATAGGGAGGTTCCTCAACACGGACAGCGG (SEQ ID NO:75)
3737-133D | B | CPRA & CPRB | AGTCAACCAGGTGTGGAACTCGTCGGTGGCAACAATGAAAAACACCAAG (SEQ ID NO:76)
3737-52-C | F | POX4A & POX4B | GGATCCTAATACGACTCACTATAGGGAGGCCATTGCCTTGAGATACGCCATTGGTAG (SEQ ID NO:77)
3737-52-D | B | POX4A & POX4B | AGCCTTGGTGTCGTTCTTTTCAACGGAAGGTGGTCTCGATGGTGTGTTCAACC (SEQ ID NO:78)
3737-53-C | F | POX5A | GGATCCTAATACGACTCACTATAGGGAGGTTGGGTTTGTTTGTTTCCTGTGTCCG (SEQ ID NO:79)
3737-53-D | B | POX5A | CCTTTGACCTTCAATCTGGCGTAGACGCAGCACCACCGATCCACCACTTG (SEQ ID NO:80)
F = Forward, B = Backward
[0157] The forward primer (SEQ ID NO: 59) contains the T7 promoter
consensus sequence "GGATCCTAATACGA CTCACTATAGGG AGG" (SEQ ID NO:
109) fused to the primer 7581-97-F sequence (SEQ ID NO: 47). The
Reverse Primer (SEQ ID NO: 60) contains the sequence of primer
7581-97M (SEQ ID NO: 48) followed by the 20 bases of upstream
sequence with a 18 base pair deletion between the two blocks of the
CYP52A5 sequence. The forward primer was used with the corresponding
reverse primer to synthesize the competitor DNA template. The
primer pairs were combined in a standard Taq Gold polymerase PCR
reaction according to the manufacturer's recommended conditions
(Perkin-Elmer/Applied Biosystems, Foster City, Calif.). The PCR
reaction mix contained a final concentration of 250 nM each primer
and 10 ng C. tropicalis chromosomal DNA for template. The reaction
mixture was placed in a thermocycler for 25 to 35 cycles using the
highest annealing temperature possible during the PCR reactions to
assure a homogeneous PCR product (in this case 62.degree. C.). The
PCR products were either gel purified or filter purified to
remove unincorporated nucleotides and primers. The competitor
template DNA was then quantified using the (A.sub.260/280) method.
Primers used in QC-RT-PCR experiments for the synthesis of various
competitive DNA templates are listed in Table 7 (SEQ ID NOS:
61-80).
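The construction of the competitor primers described above can be checked against the listed sequences. The sketch below is illustrative only: it joins the T7 promoter leader (SEQ ID NO: 109, written without the spaces shown in the text) to the gene-specific forward primer and compares the result with SEQ ID NO: 59, and it splits SEQ ID NO: 60 into the gene-specific reverse primer and the appended block of upstream sequence. The function names are illustrative and are not part of the specification.

```python
# Sequences as printed in this specification (spaces removed).
T7_LEADER = "GGATCCTAATACGACTCACTATAGGGAGG"   # SEQ ID NO:109
FWD_PRIMER = "AAGAGGGCAGGGCTCAAGAG"           # 7581-97-F, SEQ ID NO:47
REV_PRIMER = "TCCATGTGAAGATCCCATCAC"          # 7581-97-M, SEQ ID NO:48
SEQ_59 = "GGATCCTAATACGACTCACTATAGGGAGGAAGAGGGCAGGGCTCAAGAG"
SEQ_60 = "TCCATGTGAAGATCCCATCACGAGTGTGCCTCTTGCCCAAAG"


def competitor_forward(gene_primer, leader=T7_LEADER):
    """T7 promoter leader fused 5' of the gene-specific forward primer."""
    return leader + gene_primer


def split_competitor_reverse(competitor_primer, gene_primer):
    """Return the block appended 3' of the gene-specific reverse primer."""
    if not competitor_primer.startswith(gene_primer):
        raise ValueError("competitor primer does not begin with gene primer")
    return competitor_primer[len(gene_primer):]


built_forward = competitor_forward(FWD_PRIMER)
appended_block = split_competitor_reverse(SEQ_60, REV_PRIMER)
print(built_forward == SEQ_59, appended_block)
```

Because the appended block lies on the far side of the deleted segment, priming with the reverse competitor primer yields a template shorter than the native sequence, which is what later allows the two products to be resolved.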
[0158] C. Synthesis of the Competitor RNA
[0159] Competitor template DNA was transcribed in vitro to make the
competitor RNA using the Megascript T7 kit from Ambion Biosciences
(Ambion Inc., Austin, Tex.). 250 nanograms (ng) of competitor DNA
template and the in vitro transcription reagents were mixed
according to the directions provided by the manufacturer. The
reaction mixture was incubated for 4 hours at 37.degree. C. The
resulting RNA preparations were then checked by gel electrophoresis
for the conditions giving the highest yields and quality of
competitor RNA. This often required optimization according to the
manufacturer's specifications. The DNA template was then removed
using DNase I as described in the Ambion kit. The RNA competitor
was then quantified by the (A.sub.260/280) method. Serial
dilutions of the RNA (1 ng/.mu.l to 1 femtogram (fg)/.mu.l) were
made for use in the QC-RT-PCR reactions and the original stocks
stored at -70.degree. C.
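A dilution series spanning 1 ng/.mu.l to 1 fg/.mu.l can be laid out as follows. The tenfold step is an assumption made for illustration, since the dilution factor is not specified above. Concentrations are tracked as whole femtograms per microliter to avoid rounding artifacts, and the helper names are illustrative.

```python
def serial_dilution_fg_per_ul(start_fg=1_000_000, stop_fg=1, factor=10):
    """Concentrations (fg/ul) from start down to stop in fixed-fold steps."""
    if factor < 2 or stop_fg < 1:
        raise ValueError("factor must be >= 2 and stop_fg must be >= 1")
    series = []
    conc = start_fg
    while conc >= stop_fg:
        series.append(conc)
        conc //= factor
    return series


def label(conc_fg):
    """Format a concentration in the largest unit giving a whole number."""
    for unit, scale in (("ng", 1_000_000), ("pg", 1_000), ("fg", 1)):
        if conc_fg >= scale:
            return f"{conc_fg // scale} {unit}/ul"
    return f"{conc_fg} fg/ul"


# Assumed tenfold series from 1 ng/ul (1,000,000 fg/ul) to 1 fg/ul
series = serial_dilution_fg_per_ul()
print([label(c) for c in series])
```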
[0160] D. QC-RT-PCR Reactions
[0161] QC-RT-PCR reactions were performed using rTth polymerase
from Perkin-Elmer (Perkin-Elmer/Applied Biosystems, Foster City,
Calif.) according to the manufacturer's recommended conditions. The
reverse transcription reaction was performed in a 10 .mu.l volume
with final concentrations of 200 .mu.M for each dNTP, 1.25 units
rTth polymerase, 1.0 mM MnCl.sub.2, 1X of the 10X buffer supplied
with the Enzyme from the manufacturer, 100 ng of total RNA isolated
from a fermentor grown culture of C. tropicalis and 1.25 .mu.M of
the appropriate reverse primer. To quantitate CYP52A5 expression in
C. tropicalis an appropriate reverse primer was 7581-97M (SEQ ID
NO: 48). Several reaction mixes were prepared for each RNA sample
characterized. To quantitate CYP52A5 expression a series of 8 to 12
of the previously described QC-RT-PCR reaction mixes were aliquoted
to different reaction tubes. To each tube 1 .mu.l of a serial
dilution containing from 100 pg to 100 fg CYP52A5 competitor RNA
per .mu.l was added, bringing the final reaction mixtures up to the
final volume of 10 .mu.l. The QC-RT-PCR reaction mixtures were
mixed and incubated at 70.degree. C. for 15 min according to the
manufacturer's recommended times for reverse transcription to
occur. At the completion of the 15 minute incubation, the sample
temperature was reduced to 4.degree. C. to stop the reaction and 40
.mu.l of the PCR reaction mix added to the reaction to bring the
total volume up to 50 .mu.l. The PCR reaction mix consists of an
aqueous solution containing 0.3125 .mu.M of the forward primer
7581-97F (SEQ ID NO: 47), 3.125 mM MgCl.sub.2 and 1X chelating buffer
supplied with the enzyme from Perkin-Elmer. The reaction mixtures
were placed in a thermocycler (Perkin-Elmer GeneAmp PCR System
2400, Perkin-Elmer/Applied Biosystems, Foster City, Calif. ) and
the following PCR cycle performed: 94.degree. C. for 1 min.
followed by 94.degree. C. for 10 seconds followed by 58.degree. C.
for 40 seconds for 17 to 22 cycles. The PCR reaction was completed
with a final incubation at 58.degree. C. for 2 min followed by
4.degree. C. In some reactions where no detectable PCR products
were produced the samples were returned to the thermocycler for
additional cycles; this process was repeated until enough PCR
products were produced to quantify using HPLC. The number of cycles
necessary to produce enough PCR product is a function of the amount
of the target mRNA in the 100 ng of total cellular RNA. In cultures
where the CYP52A5 gene is highly expressed there is sufficient
CYP52A5 mRNA message present and fewer PCR cycles (.ltoreq.17) are
required to produce a quantifiable amount of PCR product. The lower
the concentration of the target mRNA present, the more PCR cycles
are required to produce a detectable amount of product. These
QC-RT-PCR procedures were applied to all the target genes listed in
Table 5 using the respective primers indicated therein.
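The concentrations recited for the 10 .mu.l reverse transcription reaction and the 40 .mu.l PCR reaction mix can be converted to nominal concentrations in the combined 50 .mu.l reaction by simple dilution. The sketch below performs that arithmetic only; it assumes the volumes are additive and that no component is consumed during reverse transcription, and the function name is illustrative.

```python
def final_concentration(stock_conc, added_volume_ul, final_volume_ul):
    """Concentration after dilution into the final volume (same units)."""
    return stock_conc * added_volume_ul / final_volume_ul


RT_VOLUME_UL = 10    # reverse transcription reaction
PCR_MIX_UL = 40      # PCR reaction mix added afterwards
TOTAL_UL = RT_VOLUME_UL + PCR_MIX_UL

# Components supplied in the 10 ul reverse transcription reaction
reverse_primer_um = final_concentration(1.25, RT_VOLUME_UL, TOTAL_UL)
dntp_um = final_concentration(200, RT_VOLUME_UL, TOTAL_UL)
mncl2_mm = final_concentration(1.0, RT_VOLUME_UL, TOTAL_UL)

# Components supplied in the 40 ul PCR reaction mix
forward_primer_um = final_concentration(0.3125, PCR_MIX_UL, TOTAL_UL)
mgcl2_mm = final_concentration(3.125, PCR_MIX_UL, TOTAL_UL)

print(reverse_primer_um, forward_primer_um, mgcl2_mm)
```

Under these assumptions both primers come out at the same nominal concentration in the 50 .mu.l reaction, which is consistent with the 250 nM per primer used for the competitor template PCR.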
[0162] E. HPLC Quantification
[0163] Upon completion of the QC-RT-PCR reactions the samples were
analyzed and quantitated by HPLC. Five to fifteen microliters of
the QC-RT-PCR reaction mix was injected into a Waters
Bio-Compatible 625 HPLC with an attached Waters 484 tunable
detector. The detector was set to measure a wavelength of 254 nm.
The HPLC contained a Sarasep brand DNASep.TM. column (Sarasep,
Inc., San Jose, Calif.) which was placed within the oven and the
temperature set for 52 .degree. C. The column was installed
according to the manufacturer's recommendation of having 30 cm. of
heated PEEK tubing installed between the injector and the column.
The system was configured with a Sarasep brand Guard column
positioned before the injector. In addition, there was a 0.22 .mu.m
filter disk just before the column, within the oven. Two Buffers
were used to create an elution gradient to resolve and quantitate
the PCR products from the QC-RT-PCR reactions. Buffer-A consists of
0.1 M tri-ethyl ammonium acetate (TEAA) and 5% acetonitrile (volume
to volume). Buffer-B consists of 0.1 M TEAA and 25% acetonitrile
(volume to volume). The QC-RT-PCR samples were injected into the
HPLC and the linear gradient of 75% buffer-A/25% buffer-B to 45%
buffer-A/55% B was run over 6 min at a flow rate of 0.85 ml per
minute. The QC-RT-PCR product of the competitor RNA being 18 base
pairs smaller is eluted from the HPLC column before the QC-RT-PCR
product from the CYP52A5 mRNA (U). The amounts of the QC-RT-PCR
products were plotted and quantitated with an attached Waters
Corporation 745 data module. The log ratios of the amount of
CYP52A5 mRNA QC-RT-PCR product (U) to competitor QC-RT-PCR product
(C), as measured by peak areas, were plotted and the amount of
competitor RNA required to equal the amount of CYP52A5 mRNA product
determined. In the case of each of the target genes listed in Table
5, the competitor RNA contained fewer base pairs as compared to the
native target mRNA and eluted before the native mRNA in a manner
similar to that demonstrated by CYP52A5. HPLC quantification of the
genes was conducted as above.
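The equivalence-point determination described above can be sketched numerically. In this illustration the log of the peak-area ratio (U/C) is fitted by ordinary least squares against the log of the competitor RNA added, and the equivalence point is taken where the fitted line crosses log(U/C) = 0. The peak-area ratios below are synthetic values constructed so that the target corresponds to 5 pg of competitor; they are not measured data. No correction is applied for the 18 base pair size difference between the two products, which is a simplifying assumption.

```python
import math


def equivalence_point(competitor_amounts, area_ratios):
    """Competitor amount at which the fitted log10(U/C) equals zero."""
    xs = [math.log10(c) for c in competitor_amounts]
    ys = [math.log10(r) for r in area_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log10(U/C) = intercept + slope * log10(amount) = 0 at equivalence
    return 10 ** (-intercept / slope)


# Synthetic example: competitor added per tube (pg) and U/C peak-area ratios
competitor_pg = [100.0, 10.0, 1.0, 0.1]
ratio_u_to_c = [0.05, 0.5, 5.0, 50.0]

target_pg = equivalence_point(competitor_pg, ratio_u_to_c)
print(f"target mRNA is equivalent to about {target_pg:.2f} pg competitor")
```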
EXAMPLE 12
Evaluation of New Strains in Shake Flasks
[0164] The CYP and CPR amplified strains such as strains HDC10,
HDC15, HDC20 and HDC23 (Table 1) and H5343 were evaluated for
diacid production in shake flasks. A single colony for each strain
was transferred from a YPD agar plate into 5 ml of YPD broth and
grown overnight at 30.degree. C., 250 rpm. An inoculum was then
transferred into 50 ml of DCA2 medium (Table 9) and grown for 24 h
at 30.degree. C., 300 rpm. The cells were centrifuged at 5000 rpm
for 5 min and resuspended in 50 ml of DCA3 medium (Table 9) and
grown for 24 h at 30.degree. C., 300 rpm. 3% oleic acid w/v was
added after 24 h growth in DCA3 medium and the cultures were
allowed to bioconvert oleic acid for 48 h. Samples were harvested
and the diacid and monoacid concentrations were analyzed as per the
scheme given in FIG. 35. Each strain was tested in duplicate and
the results shown in Table 8 represent the average value from two
flasks.
TABLE 8 Bioconversion of oleic acid by different recombinant
strains of Candida tropicalis
Strain | Conversion to oleic diacid (%) | Specific conversion (g diacid/g biomass)
H5343 | 41.9 | 0.53
HDC 10-2 | 50.5 | 0.85
HDC 15 | 54.4 | 0.85
HDC 20-1 | 45.1 | 0.72
HDC 20-2 | 45.3 | 0.58
HDC 23-2 | 55.2 | 0.84
HDC 23-3 | 58.8 | 0.89
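To make the comparison in Table 8 easier to read, the conversion of each amplified strain can be expressed relative to the base strain H5343. The sketch below uses only the percent conversion values listed in Table 8; the relative-gain figure is a derived convenience and is not a value reported in this specification.

```python
# Percent conversion of oleic acid to diacid, copied from Table 8.
CONVERSION_PERCENT = {
    "H5343": 41.9,
    "HDC 10-2": 50.5,
    "HDC 15": 54.4,
    "HDC 20-1": 45.1,
    "HDC 20-2": 45.3,
    "HDC 23-2": 55.2,
    "HDC 23-3": 58.8,
}


def relative_gain_percent(strain, base="H5343", table=CONVERSION_PERCENT):
    """Increase in conversion relative to the base strain, in percent."""
    return 100.0 * (table[strain] - table[base]) / table[base]


best_strain = max(CONVERSION_PERCENT, key=CONVERSION_PERCENT.get)
best_gain = relative_gain_percent(best_strain)
print(f"{best_strain}: {best_gain:.1f}% higher conversion than H5343")
```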
EXAMPLE 13
Cloning and Characterization of C. tropicalis 20336 Cytochrome P450
Monooxygenase (CYP) and Cytochrome P450 NADPH Oxidoreductase (CPR)
Genes
[0165] To clone CYP and CPR genes several different strategies were
employed. Available CYP amino acid sequences were aligned and
regions of similarity were observed (FIG. 4). These regions
corresponded to described conserved regions seen in other
cytochrome P450 families (Goeptar et al., supra and Kalb et al.
supra). Proteins from eight eukaryotic cytochrome P450 families
share a segmented region of sequence similarity. One region
corresponded to the HR2 domain containing the invariant cysteine
residue near the carboxyl terminus which is required for heme
binding while the other region corresponded to the central region
of the I helix thought to be involved in substrate recognition
(FIG. 4). Degenerate oligonucleotide primers corresponding to these
highly conserved regions of the CYP52 gene family present in
Candida maltosa and Candida tropicalis ATCC 750 were designed and
used to amplify DNA fragments of CYP genes from C. tropicalis 20336
genomic DNA. These discrete PCR fragments were then used as probes
to isolate full-length CYP genes from the C. tropicalis 20336
genomic libraries. In a few instances oligonucleotide primers
corresponding to highly conserved regions were directly used as
probes to isolate full-length CYP genes from genomic libraries. In
the case of CPR a heterologous probe based upon the known DNA
sequence for the CPR gene from C. tropicalis 750 was used to
isolate the C. tropicalis 20336 CPR gene.
[0166] A. Cloning of the CPR Gene from C. tropicalis 20336
[0167] 1) Cloning of the CPRA Allele
[0168] Approximately 25,000 phage particles from the first genomic
library of C. tropicalis 20336 were screened with a 1.9 kb
BamHI-NdeI fragment from plasmid pCU3RED (See Picataggio et al.,
Bio/Technology 10:894-898 (1992), incorporated herein by reference)
containing most of the C. tropicalis 750 CPR gene. Five clones that
hybridized to the probe were isolated and the plasmid DNA from
these lambda clones was rescued and characterized by restriction
enzyme analysis. The restriction enzyme analysis suggested that all
five clones were identical but it was not clear that a complete CPR
gene was present.
[0169] PCR analysis was used to determine if a complete CPR gene
was present in any of the five clones. Degenerate primers were
prepared for highly conserved regions of known CPR genes (See
Sutter et al., J. Biol. Chem. 265:16428-16436 (1990), incorporated
herein by reference) (FIG. 4). Two primers were synthesized for
the FMN binding region (FMN1, SEQ ID NO: 16 and FMN2, SEQ ID NO:
17). One primer was synthesized for the FAD binding region (FAD,
SEQ ID NO: 18), and one primer for the NADPH binding region (NADPH,
SEQ ID NO: 19) (Table 4). These four primers were used in PCR
amplification experiments using as a template plasmid DNA isolated
from four of the five clones described above. The FMN (SEQ ID NOS:
16 and 17) and FAD (SEQ ID NO: 18) primers served as forward
primers and the NADPH primer (SEQ ID NO: 19) as the reverse primer
in the PCR reactions. When different combinations of forward and
reverse primers were used, no PCR products were obtained from any
of the plasmids. However, all primer combinations amplified
expected size products with a plasmid containing the C. tropicalis
750 CPR gene (positive control). The most likely reason for the
failure of the primer pairs to amplify a product was that all four
clones contained a truncated CPR gene. One of the four clones
(pHKM1) was sequenced using the Triplex 5' (SEQ ID NO: 30) and the
Triplex 3' (SEQ ID NO: 31) primers (Table 4) which flank the insert
and the multiple cloning site on the cloning vector, and with the
degenerate primer based upon the NADPH binding site described
above. The NADPH primer (SEQ ID NO: 19) failed to yield any
sequence data and this is consistent with the PCR analysis.
Sequences obtained with Triplex primers were compared with C.
tropicalis 750 CPR sequence using the MacVector.TM. program (Oxford
Molecular Group, Campbell, Calif.). Sequence obtained with the
Triplex 3' primer (SEQ ID NO: 31) showed similarity to an internal
sequence of the C. tropicalis 750 CPR gene, confirming that pHKM1
contained a truncated version of a 20336 CPR gene. pHKM1 had a 3.8
kb insert which included a 1.2 kb coding region of the CPR gene
accompanied by 2.5 kb of upstream DNA (FIG. 5). Approximately 0.85
kb of the 20336 CPR gene encoding the C-terminal portion of the
CPR protein is missing from this clone.
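The reasoning behind the PCR test for truncation can be sketched as a small model: a primer pair yields a product only when the template carries the binding regions for both the forward and the reverse primer. The sketch below is illustrative only. The site contents assigned to each example clone are hypothetical, and the function name amplifying_pairs is not taken from the source.

```python
# Primer roles as described in the text: FMN and FAD primers are forward
# primers, and the NADPH primer is the reverse primer.
FORWARD_PRIMERS = ["FMN1", "FMN2", "FAD"]
REVERSE_PRIMERS = ["NADPH"]


def amplifying_pairs(sites_in_clone):
    """Primer pairs expected to give a product for a given clone.

    A pair can only amplify when the binding sites for both the forward
    and the reverse primer are present in the template.
    """
    return [(fwd, rev)
            for fwd in FORWARD_PRIMERS
            for rev in REVERSE_PRIMERS
            if fwd in sites_in_clone and rev in sites_in_clone]


# Hypothetical site contents, chosen only to illustrate the reasoning.
three_prime_truncated = {"FMN1", "FMN2", "FAD"}   # NADPH region missing
full_length = {"FMN1", "FMN2", "FAD", "NADPH"}

print(amplifying_pairs(three_prime_truncated))
print(amplifying_pairs(full_length))
```

Under this model a clone lacking the NADPH binding region gives no product with any combination, which matches the pattern reported for the truncated clones, while a full-length control amplifies with every forward primer.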
[0170] Since the first Clontech library yielded only a truncated
CPR gene, the second library prepared by Clontech was screened to
isolate a full-length CPR gene. Three putative CPR clones were
obtained. The three clones, having inserts in the range of 5-7 kb,
were designated pHKM2, pHKM3 and pHKM4. All three were
characterized by PCR using the degenerate primers described above.
Both pHKM2 and pHKM4 gave PCR products with two sets of internal
primers. pHKM3 gave a PCR product only with the FAD (SEQ ID NO: 18)
and NADPH (SEQ ID NO: 19) primers suggesting that this clone likely
contained a truncated CPR gene. All three plasmids were partially
sequenced using the two Triplex primers and a third primer whose
sequence was selected from the DNA sequence near the truncated end
of the CPR gene present in pHKM1. This analysis confirmed that both
pHKM2 and pHKM4 have sequences that overlap pHKM1 and that both
contained the 3' region of the CPR gene that is missing from pHKM1.
Portions of inserts from pHKM1 and pHKM4 were sequenced and a
full-length CPR gene was identified. Based on the DNA sequence and
PCR analysis, it was concluded that pHKM1 contained the putative
promoter region and 1.2 kb of sequence encoding a portion (5' end)
of a CPR gene. pHKM4 had 1.1 kb of DNA that overlapped pHKM1 and
contained the remainder (3' end) of a CPR gene along with a
downstream untranslated region (FIG. 6). Together these two
plasmids contained a complete CPRA gene with an upstream promoter
region. CPRA is 4206 nucleotides in length (SEQ ID NO: 81) and
includes a regulatory region and a protein coding region (defined
by nucleotides 1006-3042) which is 2037 base pairs in length and
codes for a putative protein of 679 amino acids (SEQ ID NO: 83)
(FIGS. 13 and 14). In FIG. 13, the asterisks denote conserved
nucleotides between CPRA and CPRB, bold denotes protein coding
nucleotides, and the start and stop codons are underlined. The CPRA
protein, when analyzed by the protein alignment program of the
GeneWorks.TM. software package (Oxford Molecular Group, Campbell,
Calif.), showed extensive homology to CPR proteins from C.
tropicalis 750 and C. maltosa.
[0171] 2) Cloning of the CPRB Allele
[0172] To clone the second allele, CPRB, the third genomic library,
prepared by Henkel, was screened using DNA fragments from pHKM1 and
pHKM4 as probes. Five clones were obtained and these were sequenced
with the three internal primers used to sequence CPRA, designated
PRK1.F3 (SEQ ID NO: 20), PRK1.F5 (SEQ ID NO: 21) and PRK4.R20 (SEQ
ID NO: 22) (Table 4), and with the two outside primers (M13-20 and
T3 [Stratagene]) for the polylinker region present in the pBK-CMV
cloning vector. Sequence analysis suggested
that four of these clones, designated pHKM5 to 8, contained inserts
which were identical to the CPRA allele isolated earlier. All four
seemed to contain a full length CPR gene. The fifth clone was very
similar to the CPRA allele, especially in the open reading frame
region where the identity was very high. However, there were
significant differences in the 5' and 3' untranslated regions. This
suggested that the fifth clone was the allele to CPRA. The plasmid
was designated pHKM9 (FIG. 7) and a 4.14 kb region of this plasmid
was sequenced and the analysis of this sequence confirmed the
presence of the CPRB allele (SEQ ID NO: 82), which includes a
regulatory region and a protein coding region (defined by
nucleotides 1033-3069) (FIG. 13). The amino acid sequence of the
CPRB protein is set forth in SEQ ID NO: 84 (FIG. 14).
[0173] B. Cloning of C. tropicalis 20336 (CYP) Genes
[0174] 1) Cloning of CYP52A2A, CYP52A3A and 3B, and CYP52A5A and
5B
[0175] Clones carrying CYP52A2A, A3A, A3B, A5A and A5B genes were
isolated from the first and second Clontech genomic libraries using
an oligonucleotide probe (HemeB1, SEQ ID NO: 27) whose sequence was
based upon the amino acid sequence for the highly conserved heme
binding region present throughout the CYP52 family. The first and
second libraries were converted to the plasmid form and screened by
colony hybridizations using the HemeB1 probe (SEQ ID NO: 27) (Table
4). Several potential clones were isolated and the plasmid DNA was
isolated from these clones and sequenced using the HemeB1
oligonucleotide (SEQ ID NO: 27) as a primer. This approach
succeeded in identifying five CYP52 genes. Three of the CYP genes
appeared unique, while the remaining two were classified as
alleles. Based upon an arbitrary choice of homology to CYP52 genes
from Candida maltosa, these five genes and corresponding plasmids
were designated CYP52A2A (pPA15 [FIG. 26]), CYP52A3A (pPA57 [FIG.
29]), CYP52A3B (pPA62 [FIG. 30]), CYP52A5A (pPAL3 [FIG. 31]) and
CYP52A5B (pPA5 [FIG. 32]). The complete DNA sequence including
regulatory and protein coding regions of these five genes was
obtained and confirmed that all five were CYP52 genes (FIG. 15). In
FIG. 15, the asterisks denote conserved nucleotides among the CYP
genes. Bold indicates the protein coding nucleotides of the CYP
genes, and the start and stop codons are underlined. The CYP52A2A
gene as represented by SEQ ID NO: 86 has a protein coding region
defined by nucleotides 1199-2767 and the encoded protein has an
amino acid sequence as set forth in SEQ ID NO: 96. The CYP52A3A
gene as represented by SEQ ID NO: 88 has a protein encoding region
defined by nucleotides 1126-2748 and the encoded protein has an
amino acid sequence as set forth in SEQ ID NO: 98. The CYP52A3B
gene as represented by SEQ ID NO: 89 has a protein coding region defined
by nucleotides 913-2535 and the encoded protein has an amino acid
sequence as set forth in SEQ ID NO: 99. The CYP52A5A gene as
represented by SEQ ID NO: 90 has a protein coding region defined by
nucleotides 1103-2656 and the encoded protein has an amino acid
sequence as set forth in SEQ ID NO: 100. The CYP52A5B gene as
represented by SEQ ID NO: 91 has a protein coding region defined by
nucleotides 1142-2695 and the encoded protein has an amino acid
sequence as set forth in SEQ ID NO: 101.
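The coding-region coordinates quoted above can be checked with simple interval arithmetic. The sketch below assumes the coordinates are 1-based and inclusive, which is an assumption rather than something stated in this passage. It reports codons rather than amino acid residues because the text here does not say whether each quoted range includes the stop codon. The helper name region_length is hypothetical.

```python
# Coding-region coordinates as quoted in the text (assumed 1-based, inclusive).
CODING_REGIONS = {
    "CYP52A2A": (1199, 2767),
    "CYP52A3A": (1126, 2748),
    "CYP52A3B": (913, 2535),
    "CYP52A5A": (1103, 2656),
    "CYP52A5B": (1142, 2695),
}


def region_length(start, end):
    """Length in nucleotides of an inclusive 1-based interval."""
    return end - start + 1


for gene, (start, end) in CODING_REGIONS.items():
    n = region_length(start, end)
    # divmod gives whole codons and leftover bases (0 for an intact frame)
    codons, remainder = divmod(n, 3)
    print(f"{gene}: {n} nt, {codons} codons, remainder {remainder}")
```

On these coordinates each pair of alleles (A3A and A3B, A5A and A5B) spans a coding region of identical length, and every region is a whole number of codons.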
[0176] 2) Cloning of CYP52A1A and CYP52A8A
[0177] CYP52A1A and CYP52A8A genes were isolated from the third
genomic library using PCR fragments as probes. The PCR fragment
probe for CYP52A1 was generated after PCR amplification of 20336
genomic DNA with oligonucleotide primers that were designed to
amplify a region from the Helix I region to the HR2 region using
all available CYP52 genes from National Center for Biotechnology
Information. Degenerate forward primers UCup1 (SEQ ID NO: 23) and
UCup2 (SEQ ID NO: 24) were designed based upon an amino acid
sequence (-RDTTAG-) from the Helix I region (Table 4). Degenerate
primers UCdown1 (SEQ ID NO: 25) and UCdown2 (SEQ ID NO: 26) were
designed based upon an amino acid sequence (-GQQFAL-) from the HR2
region (Table 4). For the reverse primers, the DNA sequence
represents the reverse complement of the corresponding amino acid
sequence. These primers were used in pairwise combinations in a PCR
reaction with Stoffel Taq DNA polymerase (Perkin-Elmer Cetus,
Foster City, Calif.) according to the manufacturer's recommended
procedure. A PCR product of approximately 450 bp was obtained. This
product was purified from agarose gel using Gene-clean.TM. (Bio
101, La Jolla, Calif.) and ligated to the pTAG.TM. vector (FIG. 17)
(R&D Systems, Minneapolis, Minn.) according to the
recommendations of the manufacturer. No treatment was necessary to
clone into pTAG because it uses the TA cloning
technique. Plasmids from several transformants were isolated and
their inserts were characterized. One plasmid contained the PCR
clone intact. The DNA sequence of the PCR fragment (designated
44CYP3, SEQ ID NO: 107) shared homology with the DNA sequences for
the CYP52A1 gene of C. maltosa and the CYP52A3 gene of C.
tropicalis 750. This fragment was used as a probe in isolating the
C. tropicalis 20336 CYP52A1 homolog. The third genomic library was
screened using the 44CYP3 PCR probe (SEQ ID NO: 107) and a clone
(pHKM11) that contained a full-length CYP52 gene was obtained (FIG.
8). The clone contained a gene having regulatory and protein coding
regions. An open reading frame of 1572 nucleotides encoded a CYP52
protein of 523 amino acids (FIGS. 15 and 16). This CYP52 gene was
designated CYP52A1A (SEQ ID NO: 85) since its putative amino acid
sequence (SEQ ID NO: 95) was most similar to the CYP52A1 protein of
C. maltosa. The protein coding region of the CYP52A1A gene is
defined by nucleotides 1177-2748 of SEQ ID NO: 85.
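The design of degenerate primers from a conserved amino acid motif, with the reverse primer taken as the reverse complement, can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the primers listed in Table 4: it assumes the standard genetic code, it uses one loose IUPAC degenerate codon per residue (MGN for arginine and YTN for leucine each also match a few codons for other amino acids), and all names in the code are hypothetical.

```python
# IUPAC degenerate codons for the residues in the two motifs, assuming the
# standard genetic code. MGN (Arg) and YTN (Leu) are deliberately loose.
DEGENERATE_CODON = {
    "A": "GCN", "D": "GAY", "F": "TTY", "G": "GGN",
    "L": "YTN", "Q": "CAR", "R": "MGN", "T": "ACN",
}

# Number of bases each IUPAC symbol stands for.
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1,
              "R": 2, "Y": 2, "M": 2, "K": 2, "N": 4}

# Complement of each IUPAC symbol used here.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "R": "Y", "Y": "R", "M": "K", "K": "M", "N": "N"}


def back_translate(peptide):
    """Degenerate DNA sequence covering the codons for each residue."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide)


def reverse_complement(seq):
    """Reverse complement of a sequence written with IUPAC symbols."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def degeneracy(seq):
    """Number of distinct oligonucleotides in the degenerate pool."""
    total = 1
    for base in seq:
        total *= IUPAC_SIZE[base]
    return total


forward = back_translate("RDTTAG")                      # Helix I motif
reverse = reverse_complement(back_translate("GQQFAL"))  # HR2 motif
print(forward, degeneracy(forward))
print(reverse, degeneracy(reverse))
```

A high degeneracy count is one practical reason to split a design into more than one primer per motif, as was done with the two forward and two reverse primers described above, although the source does not state why two of each were used.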
[0178] A similar approach was taken to clone CYP52A8A. A PCR
fragment probe for CYP52A8 was generated using primers for highly
conserved sequences of CYP52A3, CYP52A2 and CYP52A5 genes of C.
tropicalis 750. The reverse primer (primer 2,3,5,M) (SEQ ID NO: 29)
was designed based on the highly conserved heme binding region
(Table 4). The design of the forward primer (primer 2,3,5,P) (SEQ
ID NO: 28) was based upon a sequence conserved near the N-terminus
of the CYP52A3, CYP52A2 and CYP52A5 genes from C. tropicalis 750
(Table 4). Amplification of 20336 genomic DNA with these two
primers gave a mixed PCR product. One amplified PCR fragment was
1006 bp long (designated DCA1002). The DNA sequence for this
fragment was determined and was found to have 85% identity to the
DNA sequence for the CYP52D4 gene of C. tropicalis 750.
When this PCR product was used to screen the third genomic library
one clone (pHKM12) was identified that contained a full-length
CYP52 gene along with 5' and 3' flanking sequences (FIG. 9). The
CYP52 gene included regulatory and protein coding regions with an
open reading frame of 1539 nucleotides long which encoded a
putative CYP52 protein of 512 amino acids (FIGS. 15 and 16). This
gene was designated as CYP52A8A (SEQ ID NO: 92) since its amino
acid sequence (SEQ ID NO: 102) was most similar to the CYP52A8
protein of C. maltosa. The protein coding region of the CYP52A8A
gene is defined by nucleotides 464-2002 of SEQ ID NO: 92. The amino
acid sequence of the CYP52A8A protein is set forth in SEQ ID NO:
102.
[0179] 3) Cloning of CYP52D4A
[0180] The screening of the second genomic library with the HemeB1
(SEQ ID NO: 27) primer (Table 4) yielded a clone carrying a plasmid
(pPA18) that contained a truncated gene having homology with the
CYP52D4 gene of C. maltosa (FIG. 33). A 1.3 to 1.5-kb EcoRI-SstI
fragment from pPA18 containing part of the truncated CYP gene was
isolated and used as a probe to screen the third genomic library
for a full length CYP52 gene. One clone (pHKM13) was isolated and
found to contain a full-length CYP gene with extensive 5' and 3'
flanking sequences (FIG. 10). This gene has been designated as
CYP52D4A (SEQ ID NO: 94) and the complete DNA including regulatory
and protein coding regions (coding region defined by nucleotides
767-2266) and putative amino acid sequence (SEQ ID NO: 104) of this
gene is shown in FIGS. 15 and 16. CYP52D4A (SEQ ID NO: 94) shares
the greatest homology with the CYP52D4 gene of C. maltosa.
[0181] 4) Cloning of CYP52A2B and CYP52A8B
[0182] A mixed probe containing CYP52A1A, A2A, A3A, D4A, A5A and
A8A genes was used to screen the third genomic library and several
putative positive clones were identified. Seven of these were
sequenced with the degenerate primers Cyp52a (SEQ ID NO: 32),
Cyp52b (SEQ ID NO: 33), Cyp52c (SEQ ID NO: 34) and Cyp52d (SEQ ID
NO: 35) shown in Table 4. These primers were designed from highly
conserved regions of the four CYP52 subfamilies, namely CYP52A, B,
C and D. Sequences from two clones, pHKM14 and pHKM15 (FIGS. 11
and 12), shared considerable homology with DNA sequence of the C.
tropicalis 20336 CYP52A2 and CYP52A8 genes, respectively. The
complete DNA (SEQ ID NO: 87) including regulatory and protein
coding regions (coding region defined by nucleotides 1072-2640) and
putative amino acid sequence (SEQ ID NO: 97) of the CYP52 gene
present in pHKM14 suggested that it is CYP52A2B (FIGS. 15 and 16).
The complete DNA (SEQ ID NO: 93) including regulatory and protein
coding regions (coding region defined by nucleotides 1017-2555) and
putative amino acid sequence (SEQ ID NO: 103) of the CYP52 gene
present in pHKM15 suggested that it is CYP52A8B (FIGS. 15 and
16).
EXAMPLE 14
Identification of CYP and CPR Genes Induced by Selected Fatty Acid
and Alkane Substrates
[0183] Genes whose transcription is turned on by the presence of
selected fatty acid or alkane substrates have been identified using
the QC-RT-PCR assay. This assay was used to measure CYP and CPR
gene expression in fermentor-grown cultures of C. tropicalis ATCC
20962. This method involves the isolation of total cellular RNA from
cultures of C. tropicalis and the quantification of a specific mRNA
within that sample through the design and use of sequence specific
QC-RT-PCR primers and an RNA competitor. Quantification is achieved
through the use of known concentrations of highly homologous
competitor RNA in the QC-RT-PCR reactions. The resulting QC-RT-PCR
amplified cDNAs are separated and quantitated through the use of
ion pairing reverse phase HPLC. This assay was used to characterize
the expression of CYP52 genes of C. tropicalis ATCC 20962 in
response to various fatty acid and alkane substrates. Genes which
were induced were identified by the calculation of their mRNA
concentration at various times before and after induction. FIG. 18
provides an example of how the concentration of mRNA for CYP52A5 can
be calculated using the QC-RT-PCR assay. The log ratio of unknown
(U) to competitor product (C) is plotted versus the concentration
of competitor RNA present in the QC-RT-PCR reactions. The
concentration of competitor which results in a log ratio of U/C of
zero represents the point where the unknown messenger RNA
concentration is equal to the concentration of the competitor. FIG.
18 allows for the calculation of the amount of CYP52A5 message
present in 100 ng of total RNA isolated from cell samples taken at
0, 1, and 2 hours after the addition of Emersol.RTM. 267 in a
fermentor run. From this analysis, it is possible to determine the
concentration of the CYP52A5 mRNA present in 100 ng of total cellular
RNA. In the plot contained in FIG. 18 it takes 0.46 pg of
competitor to equal the number of mRNAs of CYP52A5 in 100 ng of
RNA isolated from cells just prior (time 0) to the addition of the
substrate, Emersol.RTM. 267. In cell samples taken at one and two
hours after the addition of Emersol.RTM. 267 it takes 5.5 and 8.5
pg of competitor RNA, respectively. This result demonstrates that
CYP52A5 (SEQ ID NOS: 90 and 91) is induced more than 18 fold within
two hours after the addition of Emersol.RTM. 267. This type of
analysis was used to demonstrate that CYP52A5 (SEQ ID NOS: 90 and
91) is induced by Emersol.RTM. 267. FIG. 19 shows the relative
amounts of CYP52A5 (SEQ ID NOS: 90 and 91) expression in fermentor
runs with and without Emersol.RTM. 267 as a substrate. The
differences in the CYP52A5 (SEQ ID NOS: 90 and 91) expression
patterns are due to the addition of Emersol.RTM. 267 to the
fermentation medium.
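The equivalence-point calculation described for FIG. 18 can be sketched numerically. The example below is a minimal illustration under assumptions: it fits log10(U/C) against log10 of the competitor amount by ordinary least squares (the choice of a logarithmic x-axis is an assumption), the titration data are synthetic, and the function names are hypothetical. Only the 0.46, 5.5 and 8.5 pg values come from the text.

```python
import math


def equivalence_point(competitor_pg, log_ratio_u_over_c):
    """Competitor amount at which log10(U/C) = 0.

    Fits log10(U/C) = slope * log10(competitor) + intercept by ordinary
    least squares and solves for the zero crossing.
    """
    xs = [math.log10(c) for c in competitor_pg]
    ys = list(log_ratio_u_over_c)
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return 10 ** (-intercept / slope)


def fold_induction(baseline_pg, induced_pg):
    """Ratio of two equivalence points measured on equal amounts of RNA."""
    return induced_pg / baseline_pg


# Synthetic titration: ideal data for a sample equivalent to 0.46 pg.
competitor = [0.1, 0.3, 1.0, 3.0]
log_ratios = [math.log10(0.46 / c) for c in competitor]
print(round(equivalence_point(competitor, log_ratios), 2))

# Equivalence points quoted in the text (pg competitor per 100 ng total RNA).
for hours, pg in [(1, 5.5), (2, 8.5)]:
    print(hours, "h:", round(fold_induction(0.46, pg), 1), "fold")
```

The two-hour ratio computed this way is consistent with the statement that induction exceeds 18 fold.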
[0184] This analysis clearly demonstrates that expression of
CYP52A5 (SEQ ID NOS: 90 and 91) in C. tropicalis 20962 is inducible
by the addition of Emersol.RTM. 267 to the growth medium. This
analysis was performed to characterize the expression of CYP52A2A
(SEQ ID NO: 86), CYP52A3AB (SEQ ID NOS: 88 and 89), CYP52A8A (SEQ
ID NO: 92), CYP52A1A (SEQ ID NO: 85), CYP52D4A (SEQ ID NO: 94) and
CPRB (SEQ ID NO: 82) in response to the presence of Emersol.RTM.
267 in the fermentation medium (FIG. 20). The results of these
analyses indicate that, like the CYP52A5 gene (SEQ ID NOS: 90 and
91) of C. tropicalis 20962, the CYP52A2A gene (SEQ ID NO: 86) is
inducible by Emersol.RTM. 267. A small induction is observed for
CYP52A1A (SEQ ID NO: 85) and CYP52A8A (SEQ ID NO: 92). In contrast,
any induction for CYP52D4A (SEQ ID NO: 94), CYP52A3A (SEQ ID NO:
88) and CYP52A3B (SEQ ID NO: 89) is below the level of detection of
the assay. CPRB (SEQ ID NO: 82) is moderately induced by
Emersol.RTM. 267, four to five fold. The results of these analyses
are summarized in FIG. 20. FIG. 34 provides an example of selective
induction of CYP52A genes. When pure fatty acids or alkanes were
spiked into a fermentor containing C. tropicalis 20962 or a
derivative thereof, the transcriptional activation of CYP52A genes
was detected using the QC-RT-PCR assay. FIG. 34 shows that pure
oleic acid (C18:1) strongly induces CYP52A2A (SEQ ID NO: 86) while
also inducing CYP52A5 (SEQ ID NOS: 90 and 91). In the same fermentor,
addition of pure alkane (tridecane) shows strong induction of both
CYP52A2A (SEQ ID NO: 86) and CYP52A1A (SEQ ID NO: 85). However,
tridecane did not induce CYP52A5 (SEQ ID NOS: 90 and 91). In a
separate fermentation using ATCC 20962, containing pure octadecane
as the substrate, induction of CYP52A2A, CYP52A5A and CYP52A1A is
detected (see FIG. 36). The foregoing demonstrates selective
induction of particular CYP genes by specific substrates, thus
providing techniques for selective metabolic engineering of cell
strains. For example, if tridecane modification is desired,
organisms engineered for high levels of CYP52A2A (SEQ ID NO: 86)
and CYP52A1A (SEQ ID NO: 85) activity are indicated. If oleic acid
modification is desired, organisms engineered for high levels of
CYP52A2A (SEQ ID NO: 86) activity are indicated.
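The substrate-to-gene reasoning in this example can be captured as a simple lookup. The sketch below only restates the induction calls made in the text for the three pure substrates. The data structure and function names are hypothetical, and a gene missing from a set may be either not induced or simply not reported in this passage.

```python
# Induction calls transcribed from the text (FIG. 34 and FIG. 36 discussion).
# Gene names are kept as written there.
INDUCED_BY = {
    "oleic acid": {"CYP52A2A", "CYP52A5"},
    "tridecane": {"CYP52A2A", "CYP52A1A"},
    "octadecane": {"CYP52A2A", "CYP52A5A", "CYP52A1A"},
}


def candidate_genes(substrate):
    """Genes the text reports as induced by the given pure substrate."""
    return sorted(INDUCED_BY[substrate])


def induced_by_all(substrates):
    """Genes reported as induced by every substrate in the list."""
    sets = [INDUCED_BY[s] for s in substrates]
    return sorted(set.intersection(*sets))


print(candidate_genes("tridecane"))
print(induced_by_all(["oleic acid", "tridecane", "octadecane"]))
```

On these data CYP52A2A is the only gene reported as induced by all three substrates, which agrees with its appearing in both engineering suggestions given above.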
EXAMPLE 15
Integration of Selected CYP and CPR Genes into the Genome of
Candida tropicalis
[0185] In order to integrate selected genes into the chromosome of
C. tropicalis 20336 or its descendants, there has to be a target
DNA sequence, which may or may not be an intact gene, into which
the genes can be inserted. There must also be a method to select
for the integration event. In some cases the target DNA sequence
and the selectable marker are the same and, if so, then there must
also be a method to regain use of the target gene as a selectable
marker following the integration event. In C. tropicalis and its
descendants, one gene which fits these criteria is URA3A, encoding
orotidine-5'-phosphate decarboxylase. Using it as a target for
integration, ura.sup.- variants of C. tropicalis can be transformed
in such a way as to regenerate a URA.sup.+ genotype via homologous
recombination (FIG. 21). Depending upon the design of the
integration vector, one or more genes can be integrated into the
genome at the same time. Using a split URA3A gene oriented as shown
in FIG. 22, homologous integration would yield at least one copy of
the gene(s) of interest which are inserted between the split
portions of the URA3A gene. Moreover, because of the high sequence
similarity between URA3A and URA3B genes, integration of the
construct can occur at both the URA3A and URA3B loci. Subsequently,
an oligonucleotide designed with a deletion in a portion of the URA
gene based on the identical sequence across both the URA3A and
URA3B genes, can be utilized to yield C. tropicalis transformants
which are once again ura.sup.- but which still carry one or more newly
integrated genes of choice (FIG. 21). ura.sup.- variants of C.
tropicalis can also be isolated via other methods such as classical
mutagenesis or by spontaneous mutation. Using well established
protocols, selection of ura.sup.- strains can be facilitated by the
use of 5-fluoroorotic acid (5-FOA) as described, e.g., in Boeke et
al., Mol. Gen. Genet. 197:345-346, (1984), incorporated herein by
reference. The utility of this approach for the manipulation of C.
tropicalis has been well documented as described, e.g., in
Picataggio et al., Mol. and Cell. Biol. 11:4333-4339 (1991); Rohrer
et al., Appl. Microbiol. Biotechnol. 36:650-654 (1992); Picataggio
et al., Bio/Technology 10:894-898 (1992); U.S. Pat. No. 5,648,247;
U.S. Pat. No. 5,620,878; U.S. Pat. No. 5,204,252; U.S. Pat. No.
5,254,466, all of which are incorporated herein by reference.
[0186] A. Construction of a URA Integration Vector, pURAin.
[0187] Primers were designed and synthesized based on the 1712 bp
sequence of the URA3A gene of C. tropicalis 20336 (see FIG. 23).
The nucleotide sequence of the URA3A gene of C. tropicalis 20336
is set forth in SEQ ID NO: 105 and the amino acid sequence of the
encoded protein is set forth in SEQ ID NO: 106. URA3A Primer Set
#1a (SEQ ID NO: 9) and #1b (SEQ ID NO: 10) (Table 4) was used in
PCR with C. tropicalis 20336 genomic DNA to amplify URA3A
sequences between nucleotide 733 and 1688 as shown in FIG. 23. The
primers were designed to introduce unique 5' AscI and 3' PacI
restriction sites into the resulting amplified URA3A fragment. AscI
and PacI sites were chosen because these sites are not present
within CYP or CPR genes identified to date. URA3A Primer Set #2 was
used in PCR with C. tropicalis 20336 genomic DNA as a template to
amplify URA3A sequences between nucleotide 9 and 758 as shown in
FIG. 23. URA3A Primer Set #2a (SEQ ID NO: 11) and #2b (SEQ ID NO:
12) (Table 4) was designed to introduce unique 5' PacI and 3' PmeI
restriction sites into the resulting amplified URA3A fragment. The
PmeI site is also not present within CYP and CPR genes identified to
date. PCR fragments of the URA3A gene were purified, restricted
with AscI, PacI and PmeI restriction enzymes and ligated to a gel
purified, Qiaex II cleaned AscI-PmeI digest of plasmid pNEB193 (FIG.
25) purchased from New England Biolabs (Beverly, Mass.). The ligation
was performed with an equimolar number of DNA termini at
16.degree. C. for 16 hr using T4 DNA ligase (New England Biolabs).
Ligations were transformed into E. coli XL1-Blue cells (Stratagene,
La Jolla, Calif.) according to the manufacturer's recommendations.
White colonies were isolated and grown, and plasmid DNA was isolated
and digested with AscI-PmeI to confirm insertion of the modified
URA3A into
pNEB193. The resulting base integration vector was named pURAin
(FIG. 24).
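The criterion that the cloning sites be absent from the genes to be inserted can be checked by scanning a sequence for the recognition sequences of the three enzymes. The following is a minimal sketch. The recognition sequences are the published ones for AscI, PacI and PmeI, the toy sequences are hypothetical and not taken from any SEQ ID NO, and the function names are illustrative.

```python
# Recognition sequences of the three enzymes named in the text.
# All three are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
    "PmeI": "GTTTAAAC",
}


def site_positions(sequence, site):
    """0-based start positions of every (possibly overlapping) occurrence."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def internal_sites(sequence):
    """Map enzyme name -> positions, keeping only enzymes that cut."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        found = site_positions(sequence, site)
        if found:
            hits[enzyme] = found
    return hits


# Hypothetical toy inserts, not sequences from the patent.
print(internal_sites("atgaccgttTTAATTAAggcatcgatcc"))  # contains a PacI site
print(internal_sites("ATGACCGTTGGCATCG"))              # no internal sites found
```

An empty result for a candidate gene indicates that flanking it with these sites and digesting would not cut within the gene itself.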
[0188] B. Amplification of CYP52A2A, CYP52A3A, CYP52A5A and CPRB
from C. tropicalis 20336 Genomic DNA
[0189] The genes encoding CYP52A2A (SEQ ID NO: 86) and CYP52A3A
(SEQ ID NO: 88) from C. tropicalis 20336 were amplified from
genomic clones (pPA15 and pPA57, respectively) (FIGS. 26 and 29)
via PCR using primers (Primer CYP2A#1, SEQ ID NO: 1 and Primer
CYP2A#2, SEQ ID NO: 2 for CYP52A2A) (Primer CYP3A#1, SEQ ID NO: 3
and Primer CYP3A#2, SEQ ID NO: 4 for CYP52A3A) to introduce PacI
cloning sites. These PCR primers were designed based upon the DNA
sequence determined for CYP52A2A (SEQ ID NO: 86) (FIG. 15). The
AmpliTaq Gold PCR kit (Perkin Elmer Cetus, Foster City, Calif.) was
used according to the manufacturer's specifications. The CYP52A2A PCR
amplification product was 2,230 base pairs in length, yielding 496
bp of DNA upstream of the CYP52A2A start codon and 168 bp
downstream of the stop codon for the CYP52A2A ORF. The CYP52A3A PCR
amplification product was 2154 base pairs in length, yielding 437
bp of DNA upstream of the CYP52A3A start codon and 97 bp downstream
of the stop codon for the CYP52A3A ORF.
[0190] The gene encoding CYP52A5A (SEQ ID NO: 90) from C.
tropicalis 20336 was amplified from genomic DNA via PCR using
primers (Primer CYP 5A#1, SEQ ID NO: 5 and Primer CYP 5A#2, SEQ ID
NO: 6) to introduce PacI cloning sites. These PCR primers were
designed based upon the DNA sequence determined for CYP52A5A (SEQ
ID NO: 90). The Expand Hi-Fi Taq PCR kit (Boehringer Mannheim,
Indianapolis, Ind.) was used according to the manufacturer's
specifications. The CYP52A5A PCR amplification product was 3,298
base pairs in length.
[0191] The gene encoding CPRB (SEQ ID NO: 82) from C. tropicalis
20336 was amplified from genomic DNA via PCR using primers (CPR
B#1, SEQ ID NO: 7 and CPR B#2, SEQ ID NO: 8) based upon the DNA
sequence determined for CPRB (SEQ ID NO: 82) (FIG. 13). These
primers were designed to introduce unique PacI cloning sites. The
Expand Hi-Fi Taq PCR kit (Boehringer Mannheim, Indianapolis, Ind.)
was used according to the manufacturer's specifications. The CPRB PCR
product was 3266 bp in length, yielding 747 bp of DNA upstream of
the CPRB start codon and 493 bp downstream of the stop codon for
the CPRB ORF. The resulting PCR products were isolated via agarose
gel electrophoresis, purified using Qiaex II and digested with PacI.
The PCR fragments were purified, desalted and concentrated using a
Microcon 100 (Amicon, Beverly, Mass.).
[0192] The above described amplification procedures are applicable
to the other genes listed in Table 5 using the respectively
indicated primers.
[0193] C. Cloning of CYP and CPR Genes into pURAin.
[0194] The next step was to clone the selected CYP and CPR genes
into the pURAin integration vector. In a preferred aspect of the
present invention, no foreign DNA other than that specifically
provided by synthetic restriction site sequences is incorporated
into the DNA which was cloned into the genome of C. tropicalis,
i.e., with the exception of restriction site DNA only native C.
tropicalis DNA sequences are incorporated into the genome. pURAin
was digested with PacI, Qiaex II cleaned, and dephosphorylated with
Shrimp Alkaline Phosphatase (SAP) (United States Biochemical,
Cleveland, Ohio) according to the manufacturer's recommendations.
Approximately 500 ng of PacI-linearized pURAin was dephosphorylated
for 1 hr at 37.degree. C. using SAP at a concentration of 0.2 Units
of enzyme per 1 pmol of DNA termini. The reaction was stopped by
heat inactivation at 65.degree. C. for 20 min.
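The enzyme dosing stated per pmol of DNA termini requires converting a mass of linear DNA into a number of ends. A minimal sketch of that conversion follows. It assumes an average mass of 650 g/mol per base pair, a common approximation, and uses a placeholder vector length because the size of pURAin is not given in this passage. The function names are hypothetical.

```python
AVG_BP_MASS_G_PER_MOL = 650.0  # common approximation for one base pair of dsDNA


def pmol_of_molecules(mass_ng, length_bp):
    """Picomoles of double-stranded DNA molecules in a given mass."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * AVG_BP_MASS_G_PER_MOL)
    return moles * 1e12


def pmol_of_termini(mass_ng, length_bp):
    """A linear fragment has two ends, so termini = 2 x molecules."""
    return 2 * pmol_of_molecules(mass_ng, length_bp)


def sap_units(mass_ng, length_bp, units_per_pmol_termini=0.2):
    """Enzyme units at the dosing quoted in the text (0.2 U per pmol termini)."""
    return units_per_pmol_termini * pmol_of_termini(mass_ng, length_bp)


# Placeholder length for illustration; the size of pURAin is not given here.
ASSUMED_VECTOR_BP = 4400
print(round(pmol_of_termini(500, ASSUMED_VECTOR_BP), 3), "pmol termini")
print(round(sap_units(500, ASSUMED_VECTOR_BP), 3), "units SAP")
```

The same conversion applies when matching vector and insert for a ligation performed with an equimolar number of DNA termini.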
[0195] The CYP52A2A PacI fragment derived using the primers shown in
Table 4 was ligated to plasmid pURAin which had also been digested
with PacI. PacI-digested pURAin was dephosphorylated, and ligated to
the CYP52A2A ULTMA PCR product as described previously. The
ligation mixture was transformed into E. coli XL1 Blue MRF'
(Stratagene) and two resistant colonies were selected and screened
for correct constructs which should contain vector sequence, the
inverted URA3A gene, and the amplified CYP52A2A gene (SEQ ID NO:
86) of 20336. AscI-PmeI digestion identified one of the two
constructs, plasmid pURA2in, as being correct (FIG. 27). This
plasmid was sequenced and compared to CYP52A2A (SEQ ID NO: 86) to
confirm that PCR did not introduce DNA base changes that would
result in an amino acid change.
[0196] Prior to its use, the CPRB PacI fragment derived using the
primers shown in Table 4 was sequenced and compared to CPRB (SEQ
ID NO: 82) to confirm that PCR did not introduce DNA base pair
changes that would result in an amino acid change. Following
confirmation, CPRB (SEQ ID NO: 82) was ligated to plasmid pURAin
which had also been digested with PacI. PacI-digested pURAin was
dephosphorylated, and ligated to the CPR Expand Hi-Fi PCR product
as described previously. The ligation mixture was transformed into
E. coli XL1 Blue MRF' (Stratagene) and several resistant colonies
were selected and screened for correct constructs which should
contain vector sequence, the inverted URA3A gene, and the amplified
CPRB gene (SEQ ID NO: 82) of 20336. AscI-PmeI digestion confirmed a
successful construct, pURAREDBin.
[0197] In a manner similar to the above, each of the other CYP and
CPR genes disclosed herein is cloned into pURAin. PacI fragments of
these genes, whose sequences are given in FIGS. 13 and 15, are
derivable by methods known to those skilled in the art.
[0198] 1) Construction of Vectors Used to Generate HDC 20 and HDC
23
[0199] A previously constructed integration vector containing CPRB
(SEQ ID NO: 82), pURAREDBin, was chosen as the starting vector.
This vector was partially digested with PacI and the linearized
fragment was gel-isolated. The active PacI site was destroyed by
treatment with T4 DNA polymerase and the vector was re-ligated.
Subsequent isolation and complete digestion of this new plasmid
yielded a vector now containing only one active PacI site. This
fragment was gel-isolated, dephosphorylated and ligated to the
CYP52A2A PacI fragment. Vectors that contain the CYP52A2A (SEQ ID
NO: 86) and CPRB (SEQ ID NO: 82) genes oriented in the same
direction, pURAin CPR 2A S, as well as opposite directions (5' ends
connected), pURAin CPR 2A O, were generated.
[0200] D. Confirmation of CYP Integration (FIG. 21 for Integration
Scheme) into the Genome of C. tropicalis
[0201] Based on the construct, pURA2in, used to transform H5343
ura.sup.-, a scheme to detect integration was devised. Genomic DNA
from transformants was digested with DraIII and SpeI, which are
enzymes that cut within the URA3A and URA3B genes but not within
the integrated CYP52A2A gene. Digestion of genomic DNA where an
integration had occurred at the URA3A or URA3B loci would be
expected to result in a 3.5 kb or a 3.3 kb fragment, respectively
(FIG. 28). Moreover, digestion of the same genomic DNA with PacI
would yield a 2.2 kb fragment characteristic of the integrated
CYP52A2A gene (FIG. 28). Southern hybridizations of these digests
with fragments of the CYP52A2A gene were used to screen for these
integration events. Intensity of the band signal from the Southern
using PacI digestion was used as a measure of the number of
integration events (i.e., the more copies of the CYP52A2A gene
(SEQ ID NO: 86) that are present, the stronger the hybridization
signal).
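The screening logic described above can be sketched in a few lines.
The fragment sizes (3.5 kb and 3.3 kb) are taken from the text; the
sizing tolerance, the function names, and the assumption that band
intensity scales linearly with copy number are hypothetical and are
introduced only for illustration.

```python
# Diagnostic fragment sizes (kb) stated in the text for the SpeI/DraIII digest.
LOCUS_FRAGMENT_KB = {"URA3A": 3.5, "URA3B": 3.3}


def assign_locus(band_kb, tolerance_kb=0.05):
    """Return the URA3 locus whose expected fragment matches band_kb, else None.

    tolerance_kb is a hypothetical gel-sizing tolerance, not a value from the text.
    """
    for locus, expected_kb in LOCUS_FRAGMENT_KB.items():
        if abs(band_kb - expected_kb) <= tolerance_kb:
            return locus
    return None


def relative_copy_number(band_intensity, single_copy_intensity):
    """Estimate copies as a ratio to a single-copy band (assumes a linear signal)."""
    if single_copy_intensity <= 0:
        raise ValueError("single_copy_intensity must be positive")
    return band_intensity / single_copy_intensity


if __name__ == "__main__":
    print(assign_locus(3.5))                   # matches the URA3A fragment
    print(relative_copy_number(300.0, 100.0))  # ratio to a single-copy band
```

In practice the intensity ratio is only a rough guide, since
hybridization signal can saturate at high copy number.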
[0202] C. tropicalis H5343 transformed URA prototrophs were grown at
30.degree. C., 170 rpm, in 10 ml SC-uracil media for preparation
of genomic DNA. Genomic DNA was isolated by the method described
previously. Genomic DNA was digested with SpeI and DraIII. A 0.95%
agarose gel was used to prepare a Southern hybridization blot. The
DNA from the gel was transferred to a MagnaCharge nylon filter
membrane (MSI Technologies, Westboro, Mass.) according to the
alkaline transfer method of Sambrook et al., supra. For the
Southern hybridization, a 2.2 kb CYP52A2A DNA fragment was used as a
hybridization probe. 300 ng of CYP52A2A DNA was labeled using an ECL
Direct labeling and detection system (Amersham) and the Southern
was processed according to the ECL kit specifications. The blot was
processed in a volume of 30 ml of hybridization fluid corresponding
to 0.125 ml/cm.sup.2. Following a prehybridization at 42.degree. C.
for 1 hr, 300 ng of CYP52A2A probe was added and the hybridization
continued for 16 hr at 42.degree. C. Following hybridization, the
blots were washed two times for 20 min each at 42.degree. C. in
primary wash containing urea. Two 5 min secondary washes at RT were
conducted, followed by detection according to directions. The blots
were exposed for 16 hr as recommended.
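The hybridization figures quoted above imply a membrane area and a
probe concentration that can be worked out directly. The short
sketch below performs that arithmetic; the function names are
illustrative assumptions, and only the input values (30 ml, 0.125
ml/cm.sup.2 and 300 ng) come from the text.

```python
def membrane_area_cm2(volume_ml, ml_per_cm2):
    """Membrane area covered when volume_ml is applied at ml_per_cm2."""
    return volume_ml / ml_per_cm2


def probe_concentration_ng_per_ml(probe_ng, volume_ml):
    """Final probe concentration in the hybridization fluid."""
    return probe_ng / volume_ml


if __name__ == "__main__":
    # Values from the text: 30 ml of fluid at 0.125 ml/cm^2, 300 ng of probe.
    print(membrane_area_cm2(30, 0.125))            # area in cm^2
    print(probe_concentration_ng_per_ml(300, 30))  # ng of probe per ml
```

With the stated values this gives a membrane area of 240 cm.sup.2
and a probe concentration of 10 ng/ml.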
[0203] Integration was confirmed by the detection of a SpeI-DraIII
3.5 kb fragment from the genomic DNA of the transformants but not
with the C. tropicalis 20336 control. Subsequently, a PacI digestion
of the genomic DNA of the positive transformants, followed by a
Southern hybridization using a CYP52A2A gene probe, confirmed
integration by the detection of a 2.2 kb fragment. The resulting
CYP52A2A integrated strain was named HDC1 (see Table 1).
[0204] In a manner similar to the above, each of the genes
contained in the PacI fragments which are described in Section 3c
above was confirmed for integration into the genome of C.
tropicalis.
[0205] Transformants generated by transformation with the vectors,
pURAin CPR 2A S or pURAin CPR 2A O, were analyzed by Southern
hybridization for integration of both the CYP52A2A (SEQ ID NO: 86)
and CPRB (SEQ ID NO: 82) genes tandemly. Three strains were
generated in which the CYP52A2A (SEQ ID NO: 86) and CPRB (SEQ ID
NO: 82) genes are integrated in the opposite orientation (HDC 20-1,
HDC 20-2 and HDC 20-3) and three were generated with the CYP52A2A
(SEQ ID NO: 86) and CPRB (SEQ ID NO: 82) genes integrated in the
same orientation (HDC 23-1, HDC 23-2 and HDC 23-3) (see Table 1).
[0206] E. Confirmation of CPRB Integration into H5343 ura.sup.-
[0207] Seven transformants were screened by colony PCR using CPRB
primer #2 (SEQ ID NO: 8) and a URA3A-specific primer. In five of
the transformants, successful integration was detected by the
presence of a 3899 bp PCR product. This 3899 bp PCR product
represents the CPRB gene adjacent to the URA3A gene in the genome
of H5343, thereby confirming integration. The resulting
CPRB integrated strains were named HDC10-1 and HDC10-2 (see Table
1).
[0208] F. Strain Evaluation.
[0209] As determined by quantitative PCR, when compared to parent
H5343, HDC10-1 contained three additional copies of the reductase
gene and HDC10-2 contained four additional copies of the reductase
gene. Evaluations of HDC20-1, HDC20-2 and HDC20-3 based on Southern
hybridization data indicate that HDC20-1 contained multiple
integrations, i.e., 2 to 3 times that of HDC20-2 or HDC20-3.
Evaluations of HDC23-1, HDC23-2, and HDC23-3 based on Southern
hybridization data indicate that HDC23-3 contained multiple
integrations, i.e., 2 to 3 times that of HDC23-1 or HDC23-2. The
data in Table 8 indicate that the integration of components of the
.omega.-hydroxylase complex has a positive effect on the
improvement of Candida tropicalis ATCC 20962 as a biocatalyst. The
results indicate that CYP52A5A (SEQ ID NO: 90) is an important gene
for the conversion of oleic acid to diacid. Surprisingly, tandem
integrations of CYP and CPR genes oriented in the opposite
direction (HDC 20 strains) seem to be less productive than tandem
integrations oriented in the same direction (HDC 23 strains) (see
Tables 1 and 8).
TABLE 9
Media Composition

LB Broth
  Bacto Tryptone                                  10 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 10 g
  Distilled Water                                 1,000 ml

LB Agar
  Bacto Tryptone                                  10 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 10 g
  Agar                                            15 g
  Distilled Water                                 1,000 ml

LB Top Agarose
  Bacto Tryptone                                  10 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 10 g
  Agarose                                         7 g
  Distilled Water                                 1,000 ml

NZCYM Broth
  Bacto Casein Digest                             10 g
  Bacto Casamino Acids                            1 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 5 g
  Magnesium Sulfate (anhydrous)                   0.98 g
  Distilled Water                                 1,000 ml

NZCYM Agar
  Bacto Casein Digest                             10 g
  Bacto Casamino Acids                            1 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 5 g
  Magnesium Sulfate (anhydrous)                   0.98 g
  Agar                                            15 g
  Distilled Water                                 1,000 ml

NZCYM Top Agarose
  Bacto Casein Digest                             10 g
  Bacto Casamino Acids                            1 g
  Bacto Yeast Extract                             5 g
  Sodium Chloride                                 5 g
  Magnesium Sulfate (anhydrous)                   0.98 g
  Agarose                                         7 g
  Distilled Water                                 1,000 ml

YEPD Broth
  Bacto Yeast Extract                             10 g
  Bacto Peptone                                   20 g
  Glucose                                         20 g
  Distilled Water                                 1,000 ml

YEPD Agar*
  Bacto Yeast Extract                             10 g
  Bacto Peptone                                   20 g
  Glucose                                         20 g
  Agar                                            20 g
  Distilled Water                                 1,000 ml

SC-uracil*
  Bacto-yeast nitrogen base without amino acids   6.7 g
  Glucose                                         20 g
  Bacto-agar                                      20 g
  Drop-out mix                                    2 g
  Distilled water                                 1,000 ml

DCA2 medium                                       g/l
  Peptone                                         3.0
  Yeast Extract                                   6.0
  Sodium Acetate                                  3.0
  Yeast Nitrogen Base (Difco)                     6.7
  Glucose (anhydrous)                             50.0
  Potassium Phosphate (dibasic, trihydrate)       7.2
  Potassium Phosphate (monobasic, anhydrous)      9.3

DCA3 medium                                       g/l
  0.3 M Phosphate buffer, pH 7.5, containing:
  Glycerol                                        50
  Yeast Nitrogen Base (Difco)                     6.7

Drop-out mix
  Adenine                                         0.5 g
  Alanine                                         2 g
  Arginine                                        2 g
  Asparagine                                      2 g
  Aspartic acid                                   2 g
  Cysteine                                        2 g
  Glutamine                                       2 g
  Glutamic acid                                   2 g
  Glycine                                         2 g
  Histidine                                       2 g
  Inositol                                        2 g
  Isoleucine                                      2 g
  Leucine                                         10 g
  Lysine                                          2 g
  Methionine                                      2 g
  para-Aminobenzoic acid                          0.2 g
  Phenylalanine                                   2 g
  Proline                                         2 g
  Serine                                          2 g
  Threonine                                       2 g
  Tryptophan                                      2 g
  Tyrosine                                        2 g
  Valine                                          2 g

*See Kaiser et al., Methods in Yeast Genetics, Cold Spring Harbor
Laboratory Press, USA (1994), incorporated herein by reference.
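The per-liter recipes of Table 9 can be scaled to other batch
volumes by simple proportion. The following is a minimal sketch
using the LB Broth entry; the function name scale_recipe and the
dictionary representation are assumptions made for illustration,
while the quantities are those listed in the table.

```python
# LB Broth from Table 9, in grams per 1,000 ml of distilled water.
LB_BROTH_G_PER_L = {
    "Bacto Tryptone": 10.0,
    "Bacto Yeast Extract": 5.0,
    "Sodium Chloride": 10.0,
}


def scale_recipe(recipe_g_per_l, volume_ml):
    """Return grams of each component needed for a batch of volume_ml."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    factor = volume_ml / 1000.0
    return {name: grams * factor for name, grams in recipe_g_per_l.items()}


if __name__ == "__main__":
    for name, grams in scale_recipe(LB_BROTH_G_PER_L, 250).items():
        print(f"{name}: {grams} g")
```

The same helper applies to any of the per-liter entries in the
table, provided each is first entered as a dictionary of grams per
liter.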
[0210] It will be understood that various modifications may be made
to the embodiments and/or examples disclosed herein. Thus, the
above description should not be construed as limiting, but merely
as exemplifications of preferred embodiments. Those skilled in the
art will envision other modifications within the scope and spirit
of the claims appended hereto.
Sequence CWU 1
1
118 1 32 DNA Artificial Sequence Description of Artificial Sequence
Primer 1 ccttaattaa atgcacgaag cggagataaa ag 32 2 30 DNA Artificial
Sequence Description of Artificial Sequence Primer 2 ccttaattaa
gcataagctt gctcgagtct 30 3 31 DNA Artificial Sequence Description
of Artificial Sequence Primer 3 ccttaattaa acgcaatggg aacatggagt g
31 4 34 DNA Artificial Sequence Description of Artificial Sequence
Primer 4 ccttaattaa tcgcactacg gttattggta tcag 34 5 29 DNA
Artificial Sequence Description of Artificial Sequence Primer 5
ccttaattaa tcaaagtacg ttcaggcgg 29 6 34 DNA Artificial Sequence
Description of Artificial Sequence Primer 6 ccttaattaa ggcagacaac
aacttggcaa agtc 34 7 31 DNA Artificial Sequence Description of
Artificial Sequence Primer 7 ccttaattaa gaggtcgttg gttgagtttt c 31
8 29 DNA Artificial Sequence Description of Artificial Sequence
Primer 8 ccttaattaa ttgataatga cgttgcggg 29 9 33 DNA Artificial
Sequence Description of Artificial Sequence Primer 9 aggcgcgccg
gagtccaaaa agaccaacct ctg 33 10 34 DNA Artificial Sequence
Description of Artificial Sequence Primer 10 ccttaattaa tacgtggata
ccttcaagca agtg 34 11 35 DNA Artificial Sequence Description of
Artificial Sequence Primer 11 ccttaattaa gctcacgagt tttgggattt
tcgag 35 12 35 DNA Artificial Sequence Description of Artificial
Sequence Primer 12 gggtttaaac cgcagaggtt ggtctttttg gactc 35 13 10
DNA Artificial Sequence Description of Artificial Sequence Primer
13 gggtttaaac 10 14 9 DNA Artificial Sequence Description of
Artificial Sequence Primer 14 aggcgcgcc 9 15 10 DNA Artificial
Sequence Description of Artificial Sequence Primer 15 ccttaattaa 10
16 21 DNA Artificial Sequence Description of Artificial Sequence
Primer 16 tcycaaacwg gtacwgcwga a 21 17 21 DNA Artificial Sequence
Description of Artificial Sequence Primer 17 ggtttgggta aytcwactta
t 21 18 18 DNA Artificial Sequence Description of Artificial
Sequence Primer 18 cgttattatc atttcttc 18 19 20 DNA Artificial
Sequence Description of Artificial Sequence Primer 19 gcmacaccrg
tacctggacc 20 20 18 DNA Artificial Sequence Description of
Artificial Sequence Primer 20 atcccaatcg taatcagc 18 21 18 DNA
Artificial Sequence Description of Artificial Sequence Primer 21
acttgtcttc gtttagca 18 22 18 DNA Artificial Sequence Description of
Artificial Sequence Primer 22 ctacgtctgt ggtgatgc 18 23 17 DNA
Artificial Sequence Description of Artificial Sequence Primer 23
cgngayacna cngcngg 17 24 17 DNA Artificial Sequence Description of
Artificial Sequence Primer 24 agrgayacna cngcngg 17 25 17 DNA
Artificial Sequence Description of Artificial Sequence Primer 25
agngcraayt gytgncc 17 26 18 DNA Artificial Sequence Description of
Artificial Sequence Primer 26 yaangcraay tgytgncc 18 27 29 DNA
Artificial Sequence Description of Artificial Sequence Primer 27
attcaacggt ggtccaagaa tctgtttgg 29 28 25 DNA Artificial Sequence
Description of Artificial Sequence Primer 28 gagctatgtt gagaccacag
tttgc 25 29 26 DNA Artificial Sequence Description of Artificial
Sequence Primer 29 cttcagttaa agcaaattgt ttggcc 26 30 25 DNA
Artificial Sequence Description of Artificial Sequence Primer 30
ctcgggaagc gcgccattgt gttgg 25 31 29 DNA Artificial Sequence
Description of Artificial Sequence Primer 31 taatacgact cactataggg
cgaattggc 29 32 21 DNA Artificial Sequence Description of
Artificial Sequence Primer 32 tgrytcaaac catctytctg g 21 33 17 DNA
Artificial Sequence Description of Artificial Sequence Primer 33
ggaccggcgt taaaggg 17 34 23 DNA Artificial Sequence Description of
Artificial Sequence Primer 34 catagtcgwa tyatgcttag acc 23 35 17
DNA Artificial Sequence Description of Artificial Sequence Primer
35 ggaccaccat tgaatgg 17 36 540 DNA Candida tropicalis 36
atgattgaac aactcctaga atattggtat gtcgttgtgc cagtgttgta catcatcaaa
60 caactccttg catacacaaa gactcgcgtc ttgatgaaaa agttgggtgc
tgctccagtc 120 acaaacaagt tgtacgacaa cgctttcggt atcgtcaatg
gatggaaggc tctccagttc 180 aagaaagagg gcagggctca agagtacaac
gattacaagt ttgaccactc caagaaccca 240 agcgtgggca cctacgtcag
tattcttttc ggcaccagga tcgtcgtgac caaagatcca 300 gagaatatca
aagctatttt ggcaacccag tttggtgatt tttctttggg caagaggcac 360
actcttttta agcctttgtt aggtgatggg atcttcacat tggacggcga aggctggaag
420 cacagcagag ccatgttgag accacagttt gccagagaac aagttgctca
tgtgacgtcg 480 ttggaaccac acttccagtt gttgaagaag catattctta
agcacaaggg tgaatacttt 540 37 25 DNA Artificial Sequence Description
of Artificial Sequence Primer 37 ccgatgaagt tttcgacgag taccc 25 38
26 DNA Artificial Sequence Description of Artificial Sequence
Primer 38 aaggctttaa cgtgtccaat ctggtc 26 39 27 DNA Artificial
Sequence Description of Artificial Sequence Primer 39 attatcgcca
catacttcac caaatgg 27 40 24 DNA Artificial Sequence Description of
Artificial Sequence Primer 40 cgagatcgtg gatacgctgg agtg 24 41 25
DNA Artificial Sequence Description of Artificial Sequence Primer
41 gccactcggt aactttgtca gggac 25 42 26 DNA Artificial Sequence
Description of Artificial Sequence Primer 42 cattgaactg agtagccaaa
acagcc 26 43 27 DNA Artificial Sequence Description of Artificial
Sequence Primer 43 cctacgtttg gtatcgctac tccgttg 27 44 22 DNA
Artificial Sequence Description of Artificial Sequence Primer 44
tttccagcca gcaccgtcca ag 22 45 24 DNA Artificial Sequence
Description of Artificial Sequence Primer 45 gcagagccga tctatgttgc
gtcc 24 46 25 DNA Artificial Sequence Description of Artificial
Sequence Primer 46 tcattgaatg cttccaggaa cctcg 25 47 20 DNA
Artificial Sequence Description of Artificial Sequence Primer 47
aagagggcag ggctcaagag 20 48 21 DNA Artificial Sequence Description
of Artificial Sequence Primer 48 tccatgtgaa gatcccatca c 21 49 20
DNA Artificial Sequence Description of Artificial Sequence Primer
49 cttgaaggcc gtgttgaacg 20 50 21 DNA Artificial Sequence
Description of Artificial Sequence Primer 50 caggatttgt ctgagttgcc
g 21 51 28 DNA Artificial Sequence Description of Artificial
Sequence Primer 51 ccattgcctt gagatacgcc attggtag 28 52 26 DNA
Artificial Sequence Description of Artificial Sequence Primer 52
agccttggtg tcgttctttt caacgg 26 53 26 DNA Artificial Sequence
Description of Artificial Sequence Primer 53 ttgggtttgt ttgtttcctg
tgtccg 26 54 27 DNA Artificial Sequence Description of Artificial
Sequence Primer 54 cctttgacct tcaatctggc gtagacg 27 55 26 DNA
Artificial Sequence Description of Artificial Sequence Primer 55
gtttgctgaa tacgctgaag gtgatg 26 56 27 DNA Artificial Sequence
Description of Artificial Sequence Primer 56 tggagctgaa caactctctc
gtctcgg 27 57 20 DNA Artificial Sequence Description of Artificial
Sequence Primer 57 ttcctcaaca cggacagcgg 20 58 24 DNA Artificial
Sequence Description of Artificial Sequence Primer 58 agtcaaccag
gtgtggaact cgtc 24 59 49 DNA Artificial Sequence Description of
Artificial Sequence Primer 59 ggatcctaat acgactcact atagggagga
agagggcagg gctcaagag 49 60 42 DNA Artificial Sequence Description
of Artificial Sequence Primer 60 tccatgtgaa gatcccatca cgagtgtgcc
tcttgcccaa ag 42 61 54 DNA Artificial Sequence Description of
Artificial Sequence Primer 61 ggatcctaat acgactcact atagggaggc
cgatgaagtt ttcgacgagt accc 54 62 52 DNA Artificial Sequence
Description of Artificial Sequence Primer 62 aaggctttaa cgtgtccaat
ctggtcaaca tagctctgga gtgcttccaa cc 52 63 56 DNA Artificial
Sequence Description of Artificial Sequence Primer 63 ggatcctaat
acgactcact atagggagga ttatcgccac atacttcacc aaatgg 56 64 52 DNA
Artificial Sequence Description of Artificial Sequence Primer 64
cgagatcgtg gatacgctgg agtgcgtcgc tcttcttctt caacaattca ag 52 65 49
DNA Artificial Sequence Description of Artificial Sequence Primer
65 cattgaactg agtagccaaa acagcccatg gtttcaatca atgggaggc 49 66 54
DNA Artificial Sequence Description of Artificial Sequence Primer
66 ggatcctaat acgactcact atagggaggg ccactcggta actttgtcag ggac 54
67 56 DNA Artificial Sequence Description of Artificial Sequence
Primer 67 ggatcctaat acgactcact atagggaggc ctacgtttgg tatcgctact
ccgttg 56 68 48 DNA Artificial Sequence Description of Artificial
Sequence Primer 68 tttccagcca gcaccgtcca agcaacaagg agtacaagaa
atcgtgtc 48 69 53 DNA Artificial Sequence Description of Artificial
Sequence Primer 69 ggatcctaat acgactcact atagggaggg cagagccgat
ctatgttgcg tcc 53 70 45 DNA Artificial Sequence Description of
Artificial Sequence Primer 70 tcattgaatg cttccaggaa cctcgccaca
tccatcgaga accgg 45 71 49 DNA Artificial Sequence Description of
Artificial Sequence Primer 71 ggatcctaat acgactcact atagggaggc
ttgaaggccg tgttgaacg 49 72 46 DNA Artificial Sequence Description
of Artificial Sequence Primer 72 caggatttgt ctgagttgcc gcctgatcaa
gataggatcc ttgccg 46 73 56 DNA Artificial Sequence Description of
Artificial Sequence Primer 73 ggatcctaat acgactcact atagggaggg
gtttgctgaa tacgctgaag gtgatg 56 74 52 DNA Artificial Sequence
Description of Artificial Sequence Primer 74 tggagctgaa caactctctc
gtctcgggtg gtcgaatgga cccttggtca ag 52 75 49 DNA Artificial
Sequence Description of Artificial Sequence Primer 75 ggatcctaat
acgactcact atagggaggt tcctcaacac ggacagcgg 49 76 49 DNA Artificial
Sequence Description of Artificial Sequence Primer 76 agtcaaccag
gtgtggaact cgtcggtggc aacaatgaaa aacaccaag 49 77 57 DNA Artificial
Sequence Description of Artificial Sequence Primer 77 ggatcctaat
acgactcact atagggaggc cattgccttg agatacgcca ttggtag 57 78 53 DNA
Artificial Sequence Description of Artificial Sequence Primer 78
agccttggtg tcgttctttt caacggaagg tggtctcgat ggtgtgttca acc 53 79 55
DNA Artificial Sequence Description of Artificial Sequence Primer
79 ggatcctaat acgactcact atagggaggt tgggtttgtt tgtttcctgt gtccg 55
80 50 DNA Artificial Sequence Description of Artificial Sequence
Primer 80 cctttgacct tcaatctggc gtagacgcag caccaccgat ccaccacttg 50
81 4206 DNA Candida tropicalis 81 catcaagatc atctatgggg ataattacga
cagcaacatt gcagaaagag cgttggtcac 60 aatcgaaaga gcctatggcg
ttgccgtcgt tgaggcaaat gacagcacca acaataacga 120 tggtcccagt
gaagagcctt cagaacagtc cattgttgac gcttaaggca cggataatta 180
cgtggggcaa aggaacgcgg aattagttat ggggggatca aaagcggaag atttgtgttg
240 cttgtgggtt ttttccttta tttttcatat gatttctttg cgcaagtaac
atgtgccaat 300 ttagtttgtg attagcgtgc cccacaattg gcatcgtgga
cgggcgtgtt ttgtcatacc 360 ccaagtctta actagctcca cagtctcgac
ggtgtctcga cgatgtcttc ttccacccct 420 cccatgaatc attcaaagtt
gttgggggat ctccaccaag ggcaccggag ttaatgctta 480 tgtttctccc
actttggttg tgattggggt agtctagtga gttggagatt ttcttttttt 540
cgcaggtgtc tccgatatcg aaatttgatg aatatagaga gaagccagat cagcacagta
600 gattgccttt gtagttagag atgttgaaca gcaactagtt gaattacacg
ccaccacttg 660 acagcaagtg cagtgagctg taaacgatgc agccagagtg
tcaccaccaa ctgacgttgg 720 gtggagttgt tgttgttgtt gttggcaggg
ccatattgct aaacgaagac aagtagcaca 780 aaacccaagc ttaagaacaa
aaataaaaaa aattcatacg acaattccaa agccattgat 840 ttacataatc
aacagtaaga cagaaaaaac tttcaacatt tcaaagttcc ctttttccta 900
ttacttcttt tttttcttct ttccttcttt ccttctgttt ttcttacttt atcagtcttt
960 tacttgtttt tgcaattcct catcctcctc ctactcctcc tcaccatggc
tttagacaag 1020 ttagatttgt atgtcatcat aacattggtg gtcgctgtag
ccgcctattt tgctaagaac 1080 cagttccttg atcagcccca ggacaccggg
ttcctcaaca cggacagcgg aagcaactcc 1140 agagacgtct tgctgacatt
gaagaagaat aataaaaaca cgttgttgtt gtttgggtcc 1200 cagacgggta
cggcagaaga ttacgccaac aaattgtcca gagaattgca ctccagattt 1260
ggcttgaaaa cgatggttgc agatttcgct gattacgatt gggataactt cggagatatc
1320 accgaagaca tcttggtgtt tttcattgtt gccacctatg gtgagggtga
acctaccgat 1380 aatgccgacg agttccacac ctggttgact gaagaagctg
acactttgag taccttgaaa 1440 tacaccgtgt tcgggttggg taactccacg
tacgagttct tcaatgccat tggtagaaag 1500 tttgacagat tgttgagcga
gaaaggtggt gacaggtttg ctgaatacgc tgaaggtgat 1560 gacggtactg
gcaccttgga cgaagatttc atggcctgga aggacaatgt ctttgacgcc 1620
ttgaagaatg atttgaactt tgaagaaaag gaattgaagt acgaaccaaa cgtgaaattg
1680 actgagagag acgacttgtc tgctgctgac tcccaagttt ccttgggtga
gccaaacaag 1740 aagtacatca actccgaggg catcgacttg accaagggtc
cattcgacca cacccaccca 1800 tacttggcca gaatcaccga gacgagagag
ttgttcagct ccaaggacag acactgtatc 1860 cacgttgaat ttgacatttc
tgaatcgaac ttgaaataca ccaccggtga ccatctagct 1920 atctggccat
ccaactccga cgaaaacatt aagcaatttg ccaagtgttt cggattggaa 1980
gataaactcg acactgttat tgaattgaag gcgttggact ccacttacac catcccattc
2040 ccaaccccaa ttacctacgg tgctgtcatt agacaccatt tagaaatctc
cggtccagtc 2100 tcgagacaat tctttttgtc aattgctggg tttgctcctg
atgaagaaac aaagaaggct 2160 tttaccagac ttggtggtga caagcaagaa
ttcgccgcca aggtcacccg cagaaagttc 2220 aacattgccg atgccttgtt
atattcctcc aacaacgctc catggtccga tgttcctttt 2280 gaattcctta
ttgaaaacgt tccacacttg actccacgtt actactccat ttcgtcttcg 2340
tcattgagtg aaaagcaact catcaacgtt actgcagttg ttgaagccga agaagaagct
2400 gatggcagac cagtcactgg tgttgtcacc aacttgttga agaacgttga
aattgtgcaa 2460 aacaagactg gcgaaaagcc acttgtccac tacgatttga
gcggcccaag aggcaagttc 2520 aacaagttca agttgccagt gcatgtgaga
agatccaact ttaagttgcc aaagaactcc 2580 accaccccag ttatcttgat
tggtccaggt actggtgttg ccccattgag aggttttgtc 2640 agagaaagag
ttcaacaagt caagaatggt gtcaatgttg gcaagacttt gttgttttat 2700
ggttgcagaa actccaacga ggactttttg tacaagcaag aatgggccga gtacgcttct
2760 gttttgggtg aaaactttga gatgttcaat gccttctcca gacaagaccc
atccaagaag 2820 gtttacgtcc aggataagat tttagaaaac agccaacttg
tgcacgagtt gttgactgaa 2880 ggtgccatta tctacgtctg tggtgatgcc
agtagaatgg ctagagacgt gcagaccaca 2940 atttccaaga ttgttgctaa
aagcagagaa attagtgaag acaaggctgc tgaattggtc 3000 aagtcctgga
aggtccaaaa tagataccaa gaagatgttt ggtagactca aacgaatctc 3060
tctttctccc aacgcattta tgaatcttta ttctcattga agctttacat atgttctaca
3120 ctttattttt tttttttttt ttattattat attacgaaac ataggtcaac
tatatatact 3180 tgattaaatg ttatagaaac aataactatt atctactcgt
ctacttcttt ggcattgaca 3240 tcaacattac cgttcccatt accgttgccg
ttggcaatgc cgggatattt agtacagtat 3300 ctccaatccg gatttgagct
attgtagatc agctgcaagt cattctccac cttcaaccag 3360 tacttatact
tcatctttga cttcaagtcc aagtcataaa tattacaagt tagcaagaac 3420
ttctggccat ccacgatata gacgttattc acgttattat gcgacgtatg gatgtggtta
3480 tccttattga acttctcaaa cttcaaaaac aaccccacgt cccgcaacgt
cattatcaac 3540 gacaagttct ggctcacgtc gtcggagctc gtcaagttct
caattagatc gttcttgtta 3600 ttgatcttct ggtactttct caattgctgg
aacacattgt cctcgttgtt caaatagatc 3660 ttgaacaact ttttcaacgg
gatcaacttc tcaatctggg ccaagatctc cgccgggatc 3720 ttcagaaaca
agtcctgcaa cccctggtcg atggtctccg ggtacaacaa gtccaagggg 3780
cagaagtgtc taggcacgtg tttcaactgg ttcaacgaac atgttcgaca gtagttcgag
3840 ttatagttat cgtacaacca ttttggtttg atttcgaaaa tgacggagct
gatgccatca 3900 ttctcctggt tcctctcata gtacaactgg cacttcttcg
agaggctcaa ttcctcgtag 3960 ttcccgtcca agatattcgg caacaagagc
ccgtaccgct cacggagcat caagtcgtgg 4020 ccctggttgt tcaacttgtt
gatgaagtcc gaggtcaaga caatcaactg gatgtcgatg 4080 atctggtgcg
ggaacaagtt cttgcatttt agctcgatga agtcgtacaa ctcacacgtc 4140
gagatatact cctgttcctc cttcaagagc cggatccgca agagcttgtg cttcaagtag
4200 tcgttg 4206 82 4145 DNA Candida tropicalis 82 tatatgatat
atgatatatc ttcctgtgta attattattc gtattcgtta atacttacta 60
catttttttt tctttattta tgaagaaaag gagagttcgt aagttgagtt gagtagaata
120 ggctgttgtg catacgggga gcagaggaga
gtatccgacg aggaggaact gggtgaaatt 180 tcatctatgc tgttgcgtcc
tgtactgtac tgtaaatctt agatttccta gaggttgttc 240 tagcaaataa
agtgtttcaa gatacaattt tacaggcaag ggtaaaggat caactgatta 300
gcggaagatt ggtgttgcct gtggggttct tttatttttc atatgatttc tttgcgcgag
360 taacatgtgc caatctagtt tatgattagc gtacctccac aattggcatc
ttggacgggc 420 gtgttttgtc ttaccccaag ccttatttag ttccacagtc
tcgacggtgt ctcgccgatg 480 tcttctccca cccctcgcag gaatcattcg
aagttgttgg gggatctcct ccgcagttta 540 tgttcatgtc tttcccactt
tggttgtgat tggggtagcg tagtgagttg gtgattttct 600 tttttcgcag
gtgtctccga tatcgaagtt tgatgaatat aggagccaga tcagcatggt 660
atattgcctt tgtagataga gatgttgaac aacaactagc tgaattacac accaccgcta
720 aacgatgcgc acagggtgtc accgccaact gacgttgggt ggagttgttg
ttggcagggc 780 catattgcta aacgaagaga agtagcacaa aacccaaggt
taagaacaat taaaaaaatt 840 catacgacaa ttccacagcc atttacataa
tcaacagcga caaatgagac agaaaaaact 900 ttcaacattt caaagttccc
tttttcctat tacttctttt tttctttcct tcctttcatt 960 tcctttcctt
ctgcttttat tactttacca gtcttttgct tgtttttgca attcctcatc 1020
ctcctcctca ccatggcttt agacaagtta gatttgtatg tcatcataac attggtggtc
1080 gctgtggccg cctattttgc taagaaccag ttccttgatc agccccagga
caccgggttc 1140 ctcaacacgg acagcggaag caactccaga gacgtcttgc
tgacattgaa gaagaataat 1200 aaaaacacgt tgttgttgtt tgggtcccag
accggtacgg cagaagatta cgccaacaaa 1260 ttgtcaagag aattgcactc
cagatttggc ttgaaaacca tggttgcaga tttcgctgat 1320 tacgattggg
ataacttcgg agatatcacc gaagatatct tggtgttttt catcgttgcc 1380
acctacggtg agggtgaacc taccgacaat gccgacgagt tccacacctg gttgactgaa
1440 gaagctgaca ctttgagtac tttgagatat accgtgttcg ggttgggtaa
ctccacctac 1500 gagttcttca atgctattgg tagaaagttt gacagattgt
tgagtgagaa aggtggtgac 1560 agatttgctg aatatgctga aggtgacgac
ggcactggca ccttggacga agatttcatg 1620 gcctggaagg ataatgtctt
tgacgccttg aagaatgact tgaactttga agaaaaggaa 1680 ttgaagtacg
aaccaaacgt gaaattgact gagagagatg acttgtctgc tgccgactcc 1740
caagtttcct tgggtgagcc aaacaagaag tacatcaact ccgagggcat cgacttgacc
1800 aagggtccat tcgaccacac ccacccatac ttggccagga tcaccgagac
cagagagttg 1860 ttcagctcca aggaaagaca ctgtattcac gttgaatttg
acatttctga atcgaacttg 1920 aaatacacca ccggtgacca tctagccatc
tggccatcca actccgacga aaacatcaag 1980 caatttgcca agtgtttcgg
attggaagat aaactcgaca ctgttattga attgaaggca 2040 ttggactcca
cttacaccat tccattccca actccaatta cttacggtgc tgtcattaga 2100
caccatttag aaatctccgg tccagtctcg agacaattct ttttgtcgat tgctgggttt
2160 gctcctgatg aagaaacaaa gaagactttc accagacttg gtggtgacaa
acaagaattc 2220 gccaccaagg ttacccgcag aaagttcaac attgccgatg
ccttgttata ttcctccaac 2280 aacactccat ggtccgatgt tccttttgag
ttccttattg aaaacatcca acacttgact 2340 ccacgttact actccatttc
ttcttcgtcg ttgagtgaaa aacaactcat caatgttact 2400 gcagtcgttg
aggccgaaga agaagccgat ggcagaccag tcactggtgt tgttaccaac 2460
ttgttgaaga acattgaaat tgcgcaaaac aagactggcg aaaagccact tgttcactac
2520 gatttgagcg gcccaagagg caagttcaac aagttcaagt tgccagtgca
cgtgagaaga 2580 tccaacttta agttgccaaa gaactccacc accccagtta
tcttgattgg tccaggtact 2640 ggtgttgccc cattgagagg tttcgttaga
gaaagagttc aacaagtcaa gaatggtgtc 2700 aatgttggca agactttgtt
gttttatggt tgcagaaact ccaacgagga ctttttgtac 2760 aagcaagaat
gggccgagta cgcttctgtt ttgggtgaaa actttgagat gttcaatgcc 2820
ttctctagac aagacccatc caagaaggtt tacgtccagg ataagatttt agaaaacagc
2880 caacttgtgc acgaattgtt gaccgaaggt gccattatct acgtctgtgg
tgacgccagt 2940 agaatggcca gagacgtcca gaccacgatc tccaagattg
ttgccaaaag cagagaaatc 3000 agtgaagaca aggccgctga attggtcaag
tcctggaaag tccaaaatag ataccaagaa 3060 gatgtttggt agactcaaac
gaatctctct ttctcccaac gcatttatga atattctcat 3120 tgaagtttta
catatgttct atatttcatt ttttttttat tatattacga aacataggtc 3180
aactatatat acttgattaa atgttataga aacaataatt attatctact cgtctacttc
3240 tttggcattg gcattggcat tggcattggc attgccgttg ccgttggtaa
tgccgggata 3300 tttagtacag tatctccaat ccggatttga gctattgtaa
atcagctgca agtcattctc 3360 caccttcaac cagtacttat acttcatctt
tgacttcaag tccaagtcat aaatattaca 3420 agttagcaag aacttctggc
catccacaat atagacgtta ttcacgttat tatgcgacgt 3480 atggatatgg
ttatccttat tgaacttctc aaacttcaaa aacaacccca cgtcccgcaa 3540
cgtcattatc aacgacaagt tctgactcac gtcgtcggag ctcgtcaagt tctcaattag
3600 atcgttcttg ttattgatct tctggtactt tctcaactgc tggaacacat
tgtcctcgtt 3660 gttcaaatag atcttgaaca acttcttcaa gggaatcaac
ttttcgatct gggccaagat 3720 ttccgccggg atcttcagaa acaagtcctg
caacccctgg tcgatggtct cggggtacaa 3780 caagtctaag gggcagaagt
gtctaggcac gtgtttcaac tggttcaagg aacatgttcg 3840 acagtagttc
gagttatagt tatcgtacaa ccactttggc ttgatttcga aaatgacgga 3900
gctgatccca tcattctcct ggttcctttc atagtacaac tggcatttct tcgagagact
3960 caactcctcg tagttcccgt ccaagatatt cggcaacaag agcccgtagc
gctcacggag 4020 catcaagtcg tggccctggt tgttcaactt gttgatgaag
tccgatgtca agacaatcaa 4080 ctggatgtcg atgatctggt gcggaaacaa
gttcttgcac tttagctcga tgaagtcgta 4140 caact 4145 83 679 PRT
Candida tropicalis 83 Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile
Ile Thr Leu Val Val 1 5 10 15 Ala Val Ala Ala Tyr Phe Ala Lys Asn
Gln Phe Leu Asp Gln Pro Gln 20 25 30 Asp Thr Gly Phe Leu Asn Thr
Asp Ser Gly Ser Asn Ser Arg Asp Val 35 40 45 Leu Leu Thr Leu Lys
Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly 50 55 60 Ser Gln Thr
Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg Glu 65 70 75 80 Leu
His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp Phe Ala Asp 85 90
95 Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe
100 105 110 Phe Ile Val Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn
Ala Asp 115 120 125 Glu Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr
Leu Ser Thr Leu 130 135 140 Lys Tyr Thr Val Phe Gly Leu Gly Asn Ser
Thr Tyr Glu Phe Phe Asn 145 150 155 160 Ala Ile Gly Arg Lys Phe Asp
Arg Leu Leu Ser Glu Lys Gly Gly Asp 165 170 175 Arg Phe Ala Glu Tyr
Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp 180 185 190 Glu Asp Phe
Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys Asn 195 200 205 Asp
Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro Asn Val Lys 210 215
220 Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu
225 230 235 240 Gly Glu Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile
Asp Leu Thr 245 250 255 Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu
Ala Arg Ile Thr Glu 260 265 270 Thr Arg Glu Leu Phe Ser Ser Lys Asp
Arg His Cys Ile His Val Glu 275 280 285 Phe Asp Ile Ser Glu Ser Asn
Leu Lys Tyr Thr Thr Gly Asp His Leu 290 295 300 Ala Ile Trp Pro Ser
Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys 305 310 315 320 Cys Phe
Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu Lys Ala 325 330 335
Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly 340
345 350 Ala Val Ile Arg His His Leu Glu Ile Ser Gly Pro Val Ser Arg
Gln 355 360 365 Phe Phe Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu
Thr Lys Lys 370 375 380 Ala Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu
Phe Ala Ala Lys Val 385 390 395 400 Thr Arg Arg Lys Phe Asn Ile Ala
Asp Ala Leu Leu Tyr Ser Ser Asn 405 410 415 Asn Ala Pro Trp Ser Asp
Val Pro Phe Glu Phe Leu Ile Glu Asn Val 420 425 430 Pro His Leu Thr
Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser 435 440 445 Glu Lys
Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu Glu Glu 450 455 460
Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn Leu Leu Lys Asn 465
470 475 480 Val Glu Ile Val Gln Asn Lys Thr Gly Glu Lys Pro Leu Val
His Tyr 485 490 495 Asp Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe
Lys Leu Pro Val 500 505 510 His Val Arg Arg Ser Asn Phe Lys Leu Pro
Lys Asn Ser Thr Thr Pro 515 520 525 Val Ile Leu Ile Gly Pro Gly Thr
Gly Val Ala Pro Leu Arg Gly Phe 530 535 540 Val Arg Glu Arg Val Gln
Gln Val Lys Asn Gly Val Asn Val Gly Lys 545 550 555 560 Thr Leu Leu
Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu Tyr 565 570 575 Lys
Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu Asn Phe Glu 580 585
590 Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val
595 600 605 Gln Asp Lys Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu
Leu Thr 610 615 620 Glu Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser
Arg Met Ala Arg 625 630 635 640 Asp Val Gln Thr Thr Ile Ser Lys Ile
Val Ala Lys Ser Arg Glu Ile 645 650 655 Ser Glu Asp Lys Ala Ala Glu
Leu Val Lys Ser Trp Lys Val Gln Asn 660 665 670 Arg Tyr Gln Glu Asp
Val Trp 675 84 679 PRT Candida tropicalis 84 Met Ala Leu Asp Lys Leu
Asp Leu Tyr Val Ile Ile Thr Leu Val Val 1 5 10 15 Ala Val Ala Ala
Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln 20 25 30 Asp Thr
Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val 35 40 45
Leu Leu Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly 50
55 60 Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg
Glu 65 70 75 80 Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp
Phe Ala Asp 85 90 95 Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu
Asp Ile Leu Val Phe 100 105 110 Phe Ile Val Ala Thr Tyr Gly Glu Gly
Glu Pro Thr Asp Asn Ala Asp 115 120 125 Glu Phe His Thr Trp Leu Thr
Glu Glu Ala Asp Thr Leu Ser Thr Leu 130 135 140 Arg Tyr Thr Val Phe
Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn 145 150 155 160 Ala Ile
Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp 165 170 175
Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp 180
185 190 Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys
Asn 195 200 205 Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro
Asn Val Lys 210 215 220 Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp
Ser Gln Val Ser Leu 225 230 235 240 Gly Glu Pro Asn Lys Lys Tyr Ile
Asn Ser Glu Gly Ile Asp Leu Thr 245 250 255 Lys Gly Pro Phe Asp His
Thr His Pro Tyr Leu Ala Arg Ile Thr Glu 260 265 270 Thr Arg Glu Leu
Phe Ser Ser Lys Glu Arg His Cys Ile His Val Glu 275 280 285 Phe Asp
Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu 290 295 300
Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys 305
310 315 320 Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu
Lys Ala 325 330 335 Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro
Ile Thr Tyr Gly 340 345 350 Ala Val Ile Arg His His Leu Glu Ile Ser
Gly Pro Val Ser Arg Gln 355 360 365 Phe Phe Leu Ser Ile Ala Gly Phe
Ala Pro Asp Glu Glu Thr Lys Lys 370 375 380 Thr Phe Thr Arg Leu Gly
Gly Asp Lys Gln Glu Phe Ala Thr Lys Val 385 390 395 400 Thr Arg Arg
Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn 405 410 415 Asn
Thr Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Ile 420 425
430 Gln His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445 Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu
Glu Glu 450 455 460 Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn
Leu Leu Lys Asn 465 470 475 480 Ile Glu Ile Ala Gln Asn Lys Thr Gly
Glu Lys Pro Leu Val His Tyr 485 490 495 Asp Leu Ser Gly Pro Arg Gly
Lys Phe Asn Lys Phe Lys Leu Pro Val 500 505 510 His Val Arg Arg Ser
Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro 515 520 525 Val Ile Leu
Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe 530 535 540 Val
Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys 545 550
555 560 Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu
Tyr 565 570 575 Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu
Asn Phe Glu 580 585 590 Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser
Lys Lys Val Tyr Val 595 600 605 Gln Asp Lys Ile Leu Glu Asn Ser Gln
Leu Val His Glu Leu Leu Thr 610 615 620 Glu Gly Ala Ile Ile Tyr Val
Cys Gly Asp Ala Ser Arg Met Ala Arg 625 630 635 640 Asp Val Gln Thr
Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile 645 650 655 Ser Glu
Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn 660 665 670
Arg Tyr Gln Glu Asp Val Trp 675 85 4115 DNA Candida tropicalis 85
catatgcgct aatcttcttt ttctttttat cacaggagaa actatcccac ccccacttcg
60 aaacacaatg acaactcctg cgtaacttgc aaattcttgt ctgactaatt
gaaaactccg 120 gacgagtcag acctccagtc aaacggacag acagacaaac
acttggtgcg atgttcatac 180 ctacagacat gtcaacgggt gttagacgac
ggtttcttgc aaagacaggt gttggcatct 240 cgtacgatgg caactgcagg
aggtgtcgac ttctccttta ggcaatagaa aaagactaag 300 agaacagcgt
ttttacaggt tgcattggtt aatgtagtat ttttttagtc ccagcattct 360
gtgggttgct ctgggtttct agaataggaa atcacaggag aatgcaaatt cagatggaag
420 aacaaagaga taaaaaacaa aaaaaaactg agttttgcac caatagaatg
tttgatgata 480 tcatccactc gctaaacgaa tcatgtgggt gatcttctct
ttagttttgg tctatcataa 540 aacacatgaa agtgaaatcc aaatacacta
cactccgggt attgtccttc gttttacaga 600 tgtctcattg tcttactttt
gaggtcatag gagttgcctg tgagagatca cagagattat 660 cacactcaca
tttatcgtag tttcctatct catgctgtgt gtctctggtt ggttcatgag 720
tttggattgt tgtacattaa aggaatcgct ggaaagcaaa gctaactaaa ttttctttgt
780 cacaggtaca ctaacctgta aaacttcact gccacgccag tctttcctga
ttgggcaagt 840 gcacaaacta caacctgcaa aacagcactc cgcttgtcac
aggttgtctc ctctcaacca 900 acaaaaaaat aagattaaac tttctttgct
catgcatcaa tcggagttat ctctgaaaga 960 gttgcctttg tgtaatgtgt
gccaaactca aactgcaaaa ctaaccacag aatgatttcc 1020 ctcacaatta
tataaactca cccacatttc cacagaccgt aatttcatgt ctcactttct 1080
cttttgctct tcttttactt agtcaggttt gataacttcc ttttttatta ccctatctta
1140 tttatttatt tattcattta taccaaccaa ccaaccatgg ccacacaaga
aatcatcgat 1200 tctgtacttc cgtacttgac caaatggtac actgtgatta
ctgcagcagt attagtcttc 1260 cttatctcca caaacatcaa gaactacgtc
aaggcaaaga aattgaaatg tgtcgatcca 1320 ccatacttga aggatgccgg
tctcactggt attctgtctt tgatcgccgc catcaaggcc 1380 aagaacgacg
gtagattggc taactttgcc gatgaagttt tcgacgagta cccaaaccac 1440
accttctact tgtctgttgc cggtgctttg aagattgtca tgactgttga cccagaaaac
1500 atcaaggctg tcttggccac ccaattcact gacttctcct tgggtaccag
acacgcccac 1560 tttgctcctt tgttgggtga cggtatcttc accttggacg
gagaaggttg gaagcactcc 1620 agagctatgt tgagaccaca gtttgctaga
gaccagattg gacacgttaa agccttggaa 1680 ccacacatcc aaatcatggc
taagcagatc aagttgaacc agggaaagac tttcgatatc 1740 caagaattgt
tctttagatt taccgtcgac accgctactg agttcttgtt tggtgaatcc 1800
gttcactcct tgtacgatga aaaattgggc atcccaactc caaacgaaat cccaggaaga
1860 gaaaactttg ccgctgcttt caacgtttcc caacactact tggccaccag
aagttactcc 1920 cagacttttt actttttgac caaccctaag gaattcagag
actgtaacgc caaggtccac 1980 cacttggcca agtactttgt caacaaggcc
ttgaacttta ctcctgaaga actcgaagag 2040 aaatccaagt ccggttacgt
tttcttgtac gaattggtta agcaaaccag agatccaaag 2100 gtcttgcaag
atcaattgtt gaacattatg gttgccggaa gagacaccac tgccggtttg 2160
ttgtcctttg ctttgtttga attggctaga cacccagaga tgtggtccaa gttgagagaa
2220 gaaatcgaag ttaactttgg tgttggtgaa gactcccgcg ttgaagaaat
taccttcgaa 2280 gccttgaaga gatgtgaata cttgaaggct atccttaacg
aaaccttgcg tatgtaccca 2340 tctgttcctg tcaactttag aaccgccacc
agagacacca ctttgccaag aggtggtggt 2400 gctaacggta ccgacccaat
ctacattcct aaaggctcca ctgttgctta cgttgtctac 2460 aagacccacc
gtttggaaga atactacggt aaggacgcta acgacttcag accagaaaga 2520
tggtttgaac catctactaa gaagttgggc tgggcttatg ttccattcaa cggtggtcca
2580 agagtctgct tgggtcaaca attcgccttg actgaagctt cttatgtgat
cactagattg 2640 gcccagatgt
ttgaaactgt ctcatctgat ccaggtctcg aataccctcc accaaagtgt 2700
attcacttga ccatgagtca caacgatggt gtctttgtca agatgtaaag tagtcgatgc
2760 tgggtattcg attacatgtg tataggaaga ttttggtttt ttattcgttc
ttttttttaa 2820 tttttgttaa attagtttag agatttcatt aatacataga
tgggtgctat ttccgaaact 2880 ttacttctat cccctgtatc ccttattatc
cctctcagtc acatgattgc tgtaattgtc 2940 gtgcaggaca caaactccct
aacggactta aaccataaac aagctcagaa ccataagccg 3000 acatcactcc
ttcttctctc ttctccaacc aatagcatgg acagacccac cctcctatcc 3060
gaatcgaaga cccttattga ctccataccc acctggaagc ccctcaagcc acacacgtca
3120 tccagcccac ccatcaccac atccctctac tcgacaacgt ccaaagacgg
cgagttctgg 3180 tgtgcccgga aatcagccat cccggccaca tacaagcagc
cgttgattgc gtgcatactc 3240 ggcgagccca caatgggagc cacgcattcg
gaccatgaag caaagtacat tcacgagatc 3300 acgggtgttt cagtgtcgca
gattgagaag ttcgacgatg gatggaagta cgatctcgtt 3360 gcggattacg
acttcggtgg gttgttatct aaacgaagat tctatgagac gcagcatgtg 3420
tttcggttcg aggattgtgc gtacgtcatg agtgtgcctt ttgatggacc caaggaggaa
3480 ggttacgtgg ttgggacgta cagatccatt gaaaggttga gctggggtaa
agacggggac 3540 gtggagtgga ccatggcgac gacgtcggat cctggtgggt
ttatcccgca atggataact 3600 cgattgagca tccctggagc aatcgcaaaa
gatgtgccta gtgtattaaa ctacatacag 3660 aaataaaaac gtgtcttgat
tcattggttt ggttcttgtt gggttccgag ccaatatttc 3720 acatcatctc
ctaaattctc caagaatccc aacgtagcgt agtccagcac gccctctgag 3780
atcttattta atatcgactt ctcaaccacc ggtggaatcc cgttcagacc attgttacct
3840 gtagtgtgtt tgctcttgtt cttgatgaca atgatgtatt tgtcacgata
cctgaaataa 3900 taaaacatcc agtcattgag cttattactc gtgaacttat
gaaagaactc attcaagccg 3960 ttcccaaaaa acccagaatt gaagatcttg
ctcaactggt catgcaagta gtagatcgcc 4020 atgatctgat actttaccaa
gctatcctct ccaagttctc ccacgtacgg caagtacggc 4080 aacgagctct
ggaagctttg ttgtttgggg tcata 4115 86 3948 DNA Candida tropicalis 86
gacctgtgac gcttccggtg tcttgccacc agtctccaag ttgaccgacg cccaagtcat
60 gtaccacttt atttccggtt acacttccaa gatggctggt actgaagaag
gtgtcacgga 120 accacaagct actttctccg cttgtttcgg tcaaccattc
ttggtgttgc acccaatgaa 180 gtacgctcaa caattgtctg acaagatctc
gcaacacaag gctaacgcct ggttgttgaa 240 caccggttgg gttggttctt
ctgctgctag aggtggtaag agatgctcat tgaagtacac 300 cagagccatt
ttggacgcta tccactctgg tgaattgtcc aaggttgaat acgaaacttt 360
cccagtcttc aacttgaatg tcccaacctc ctgtccaggt gtcccaagtg aaatcttgaa
420 cccaaccaag gcctggaccg gaaggtgttg actccttcaa caaggaaatc
aagtctttgg 480 ctggtaagtt tgctgaaaac ttcaagacct atgctgacca
agctaccgct gaagtgagag 540 ctgcaggtcc agaagcttaa agatatttat
tcattattta gtttgcctat ttatttctca 600 ttacccatca tcattcaaca
ctatatataa agttacttcg gatatcattg taatcgtgcg 660 tgtcgcaatt
ggatgatttg gaactgcgct tgaaacggat tcatgcacga agcggagata 720
aaagattacg taatttatct cctgagacaa ttttagccgt gttcacacgc ccttctttgt
780 tctgagcgaa ggataaataa ttagacttcc acagctcatt ctaatttccg
tcacgcgaat 840 attgaagggg ggtacatgtg gccgctgaat gtgggggcag
taaacgcagt ctctcctctc 900 ccaggaatag tgcaacggag gaaggataac
ggatagaaag cggaatgcga ggaaaatttt 960 gaacgcgcaa gaaaagcaat
atccgggcta ccaggttttg agccagggaa cacactccta 1020 tttctgctca
atgactgaac atagaaaaaa caccaagacg caatgaaacg cacatggaca 1080
tttagacctc cccacatgtg atagtttgtc ttaacagaaa agtataataa gaacccatgc
1140 cgtccctttt ctttcgccgc ttcaactttt ttttttttat cttacacaca
tcacgaccat 1200 gactgtacac gatattatcg ccacatactt caccaaatgg
tacgtgatag taccactcgc 1260 tttgattgct tatagagtcc tcgactactt
ctatggcaga tacttgatgt acaagcttgg 1320 tgctaaacca tttttccaga
aacagacaga cggctgtttc ggattcaaag ctccgcttga 1380 attgttgaag
aagaagagcg acggtaccct catagacttc acactccagc gtatccacga 1440
tctcgatcgt cccgatatcc caactttcac attcccggtc ttttccatca accttgtcaa
1500 tacccttgag ccggagaaca tcaaggccat cttggccact cagttcaacg
atttctcctt 1560 gggtaccaga cactcgcact ttgctccttt gttgggtgat
ggtatcttta cgttggatgg 1620 cgccggctgg aagcacagca gatctatgtt
gagaccacag tttgccagag aacagatttc 1680 ccacgtcaag ttgttggagc
cacacgttca ggtgttcttc aaacacgtca gaaaggcaca 1740 gggcaagact
tttgacatcc aggaattgtt tttcagattg accgtcgact ccgccaccga 1800
gtttttgttt ggtgaatccg ttgagtcctt gagagatgaa tctatcggca tgtccatcaa
1860 tgcgcttgac tttgacggca aggctggctt tgctgatgct tttaactatt
cgcagaatta 1920 tttggcttcg agagcggtta tgcaacaatt gtactgggtg
ttgaacggga aaaagtttaa 1980 ggagtgcaac gctaaagtgc acaagtttgc
tgactactac gtcaacaagg ctttggactt 2040 gacgcctgaa caattggaaa
agcaggatgg ttatgtgttt ttgtacgaat tggtcaagca 2100 aaccagagac
aagcaagtgt tgagagacca attgttgaac atcatggttg ctggtagaga 2160
caccaccgcc ggtttgttgt cgtttgtttt ctttgaattg gccagaaacc cagaagttac
2220 caacaagttg agagaagaaa ttgaggacaa gtttggactc ggtgagaatg
ctagtgttga 2280 agacatttcc tttgagtcgt tgaagtcctg tgaatacttg
aaggctgttc tcaacgaaac 2340 cttgagattg tacccatccg tgccacagaa
tttcagagtt gccaccaaga acactaccct 2400 cccaagaggt ggtggtaagg
acgggttgtc tcctgttttg gtgagaaagg gtcagaccgt 2460 tatttacggt
gtctacgcag cccacagaaa cccagctgtt tacggtaagg acgctcttga 2520
gtttagacca gagagatggt ttgagccaga gacaaagaag cttggctggg ccttcctccc
2580 attcaacggt ggtccaagaa tctgtttggg acagcagttt gccttgacag
aagcttcgta 2640 tgtcactgtc aggttgctcc aggagtttgc acacttgtct
atggacccag acaccgaata 2700 tccacctaag aaaatgtcgc atttgaccat
gtcgcttttc gacggtgcca atattgagat 2760 gtattagagg gtcatgtgtt
attttgattg tttagtttgt aattactgat taggttaatt 2820 catggattgt
tatttattga taggggtttg cgcgtgttgc attcacttgg gatcgttcca 2880
ggttgatgtt tccttccatc ctgtcgagtc aaaaggagtt ttgttttgta actccggacg
2940 atgttttaaa tagaaggtcg atctccatgt gattgttttg actgttactg
tgattatgta 3000 atctgcggac gttatacaag catgtgattg tggttttgca
gccttttgca cgacaaatga 3060 tcgtcagacg attacgtaat ctttgttaga
ggggtaaaaa aaaacaaaat ggcagccaga 3120 atttcaaaca ttctgcaaac
aatgcaaaaa atgggaaact ccaacagaca aaaaaaaaaa 3180 ctccgcagca
ctccgaaccc acagaacaat ggggcgccag aattattgac tattgtgact 3240
tttttacgct aacgctcatt gcagtgtagt gcgtcttaca cggggtattg ctttctacaa
3300 tgcaagggca cagttgaagg tttgcaccta acgttgcccc gtgtcaactc
aatttgacga 3360 gtaacttcct aagctcgaat tatgcagctc gtgcgtcaac
ctatgtgcag gaaagaaaaa 3420 atccaaaaaa atcgaaaatg cgactttcga
ttttgaataa accaaaaaga aaaatgtcgc 3480 acttttttct cgctctcgct
ctctcgaccc aaatcacaac aaatcctcgc gcgcagtatt 3540 tcgacgaaac
cacaacaaat aaaaaaaaca aattctacac cacttctttt tcttcaccag 3600
tcaacaaaaa acaacaaatt atacaccatt tcaacgattt ttgctcttat aaatgctata
3660 taatggttta attcaactca ggtatgttta ttttactgtt ttcagctcaa
gtatgttcaa 3720 atactaacta cttttgatgt ttgtcgcttt tctagaatca
aaacaacgcc cacaacacgc 3780 cgagcttgtc gaatagacgg tttgtttact
cattagatgg tcccagatta cttttcaagc 3840 caaagtctct cgagttttgt
ttgctgtttc cccaattcct aactatgaag ggtttttata 3900 aggtccaaag
accccaaggc atagtttttt tggttccttc ttgtcgtg 3948 87 3755 DNA Candida
tropicalis 87 gctcaacaat tgtctgacaa gatctcgcaa cacaaggcta
acgcctggtt gttgaacact 60 ggttgggttg gttcttctgc tgctagaggt
ggtaagagat gttcattgaa gtacaccaga 120 gccattttgg acgctatcca
ctctggtgaa ttgtccaagg ttgaatacga gactttccca 180 gtcttcaact
tgaatgtccc aacctcctgc ccaggtgtcc caagtgaaat cttgaaccca 240
accaaggcct ggaccgaagg tgttgactcc ttcaacaagg aaatcaagtc tttggctggt
300 aagtttgctg aaaacttcaa gacctatgct gaccaagcta ccgctgaagt
tagagctgca 360 ggtccagaag cttaaagata tttattcact atttagtttg
cctatttatt tctcatcacc 420 catcatcatt caacaatata tataaagtta
tttcggaact catatatcat tgtaatcgtg 480 cgtgttgcaa ttgggtaatt
tgaaactgta gttggaacgg attcatgcac gatgcggaga 540 taacacgaga
ttatctccta agacaatttt ggcctcattc acacgccctt cttctgagct 600
aaggataaat aattagactt cacaagttca ttaaaatatc cgtcacgcga aaactgcaac
660 aataaggaag gggggggtag acgtagccga tgaatgtggg gtgccagtaa
acgcagtctc 720 tctctccccc cccccccccc ccccctcagg aatagtacaa
cgggggaagg ataacggata 780 gcaagtggaa tgcgaggaaa attttgaatg
cgcaaggaaa gcaatatccg ggctatcagg 840 ttttgagcca ggggacacac
tcctcttctg cacaaaaact taacgtagac aaaaaaaaaa 900 aactccacca
agacacaatg aatcgcacat ggacatttag acctccccac atgtgaaagc 960
ttctctggcg aaagcaaaaa aagtataata aggacccatg ccttccctct tcctgggccg
1020 tttcaacttt ttctttttct ttgtctatca acacacacac acctcacgac
catgactgca 1080 caggatatta tcgccacata catcaccaaa tggtacgtga
tagtaccact cgctttgatt 1140 gcttataggg tcctcgacta cttttacggc
agatacttga tgtacaagct tggtgctaaa 1200 ccgtttttcc agaaacaaac
agacggttat ttcggattca aagctccact tgaattgtta 1260 aaaaagaaga
gtgacggtac cctcatagac ttcactctcg agcgtatcca agcgctcaat 1320
cgtccagata tcccaacttt tacattccca atcttttcca tcaaccttat cagcaccctt
1380 gagccggaga acatcaaggc tatcttggcc acccagttca acgatttctc
cttgggcacc 1440 agacactcgc actttgctcc tttgttgggc gatggtatct
ttaccttgga cggtgccggc 1500 tggaagcaca gcagatctat gttgagacca
cagtttgcca gagaacagat ttcccacgtc 1560 aagttgttgg agccacacat
gcaggtgttc ttcaagcacg tcagaaaggc acagggcaag 1620 acttttgaca
tccaagaatt gtttttcaga ttgaccgtcg actccgccac tgagtttttg 1680
tttggtgaat ccgttgagtc cttgagagat gaatctattg ggatgtccat caatgcactt
1740 gactttgacg gcaaggctgg ctttgctgat gcttttaact actcgcagaa
ctatttggct 1800 tcgagagcgg ttatgcaaca attgtactgg gtgttgaacg
ggaaaaagtt taaggagtgc 1860 aacgctaaag tgcacaagtt tgctgactat
tacgtcagca aggctttgga cttgacacct 1920 gaacaattgg aaaagcagga
tggttatgtg ttcttgtacg agttggtcaa gcaaaccaga 1980 gacaggcaag
tgttgagaga ccagttgttg aacatcatgg ttgccggtag agacaccacc 2040
gccggtttgt tgtcgtttgt tttctttgaa ttggccagaa acccagaggt gaccaacaag
2100 ttgagagaag aaatcgagga caagtttggt cttggtgaga atgctcgtgt
tgaagacatt 2160 tcctttgagt cgttgaagtc atgtgaatac ttgaaggctg
ttctcaacga aactttgaga 2220 ttgtacccat ccgtgccaca gaatttcaga
gttgccacca aaaacactac ccttccaagg 2280 ggaggtggta aggacgggtt
atctcctgtt ttggtcagaa agggtcaaac cgttatgtac 2340 ggtgtctacg
ctgcccacag aaacccagct gtctacggta aggacgccct tgagtttaga 2400
ccagagaggt ggtttgagcc agagacaaag aagcttggct gggccttcct tccattcaac
2460 ggtggtccaa gaatttgctt gggacagcag tttgccttga cagaagcttc
gtatgtcact 2520 gtcagattgc tccaagagtt tggacacttg tctatggacc
ccaacaccga atatccacct 2580 aggaaaatgt cgcatttgac catgtccctt
ttcgacggtg ccaacattga gatgtattag 2640 aggatcatgt gttatttttg
attggtttag tctgtttgta gctattgatt aggttaattc 2700 acggattgtt
atttattgat agggggtgcg tgtgtgtgtg tgtgttgcat tcacatggga 2760
tcgttccagg ttgttgtttc cttccatcct gttgagtcaa aaggagtttt gttttgtaac
2820 tccggacgat gtcttagata gaaggtcgat ctccatgtga ttgtttgact
gctactctga 2880 ttatgtaatc tgtaaagcct agacgttatg caagcatgtg
attgtggttt ttgcaacctg 2940 tttgcacgac aaatgatcga cagtcgatta
cgtaatccat attatttaga ggggtaataa 3000 aaaataaatg gcagccagaa
tttcaaacat tttgcaaaca atgcaaaaga tgagaaactc 3060 caacagaaaa
aataaaaaaa ctccgcagca ctccgaacca acaaaacaat ggggggcgcc 3120
agaattattg actattgtga ctttttttta ttttttccgt taactttcat tgcagtgaag
3180 tgtgttacac ggggtggtga tggtgttggt ttctacaatg caagggcaca
gttgaaggtt 3240 tccacataac gttgcaccat atcaactcaa tttatcctca
ttcatgtgat aaaagaagag 3300 ccaaaaggta attggcagac cccccaaggg
gaacacggag tagaaagcaa tggaaacacg 3360 cccatgacag tgccatttag
cccacaacac atctagtatt cttttttttt tttgtgcgca 3420 ggtgcacacc
tggactttag ttattgcccc ataaagttaa caatctcacc tttggctctc 3480
ccagtgtctc cgcctccaga tgctcgtttt acaccctcga gctaacgaca acacaacacc
3540 catgagggga atgggcaaag ttaaacactt ttggtttcaa tgattcctat
ttgctactct 3600 cttgttttgt gttttgattt gcaccatgtg aaataaacga
caattatata taccttttcg 3660 tctgtcctcc aatgtctctt tttgctgcca
ttttgctttt tgctttttgc ttttgcactc 3720 tctcccactc ccacaatcag
tgcagcaaca cacaa 3755 88 3900 DNA Candida tropicalis 88 gacatcataa
tgacccggtt atttcgccct caggttgctt atttgagccg taaagtgcag 60
tagaaacttt gccttgggtt caaactctag tataatggtg ataactggtt gcactcttgc
120 cataggcatg aaaataggcc gttatagtac tatatttaat aagcgtagga
gtataggatg 180 catatgaccg gtttttctat atttttaaga taatctctag
taaattttgt attctcagta 240 ggatttcatc aaatttcgca accaattctg
gcgaaaaaat gattctttta cgtcaaaagc 300 tgaatagtgc agtttaaagc
acctaaaatc acatatacag cctctagata cgacagagaa 360 gctctttatg
atctgaagaa gcattagaat agctactatg agccactatt ggtgtatata 420
ttagggattg gtgcaattaa gtacgtacta ataaacagaa gaaaatactt aaccaatttc
480 tggtgtatac ttagtggtga gggacctttt ctgaacattc gggtcaaact
tttttttgga 540 gtgcgacatc gatttttcgt ttgtgtaata atagtgaacc
tttgtgtaat aaatcttcat 600 gcaagacttg cataattcga gcttgggagt
tcacgccaat ttgacctcgt tcatgtgata 660 aaagaaaagc caaaaggtaa
ttagcagacg caatgggaac atggagtgga aagcaatgga 720 agcacgccca
ggacggagta atttagtcca cactacatct gggggttttt tttttgtgcg 780
caagtacaca cctggacttt agtttttgcc ccataaagtt aacaatctaa cctttggctc
840 tccaactctc tccgccccca aatattcgtt tttacaccct caagctagcg
acagcacaac 900 acccattaga ggaatggggc aaagttaaac acttttggct
tcaatgattc ctattcgcta 960 ctacattctt ctcttgtttt gtgctttgaa
ttgcaccatg tgaaataaac gacaattata 1020 tatacctttt catccctcct
cctatatctc tttttgctac attttgtttt ttacgtttct 1080 tgcttttgca
ctctcccact cccacaaaga aaaaaaaact acactatgtc gtcttctcca 1140
tcgtttgccc aagaggttct cgctaccact agtccttaca tcgagtactt tcttgacaac
1200 tacaccagat ggtactactt catacctttg gtgcttcttt cgttgaactt
tataagtttg 1260 ctccacacaa ggtacttgga acgcaggttc cacgccaagc
cactcggtaa ctttgtcagg 1320 gaccctacgt ttggtatcgc tactccgttg
cttttgatct acttgaagtc gaaaggtacg 1380 gtcatgaagt ttgcttgggg
cctctggaac aacaagtaca tcgtcagaga cccaaagtac 1440 aagacaactg
ggctcaggat tgttggcctc ccattgattg aaaccatgga cccagagaac 1500
atcaaggctg ttttggctac tcagttcaat gatttctctt tgggaaccag acacgatttc
1560 ttgtactcct tgttgggtga cggtattttc accttggacg gtgctggctg
gaaacatagt 1620 agaactatgt tgagaccaca gtttgctaga gaacaggttt
ctcacgtcaa gttgttggag 1680 ccacacgttc aggtgttctt caagcacgtt
agaaagcacc gcggtcaaac gttcgacatc 1740 caagaattgt tcttcaggtt
gaccgtcgac tccgccaccg agttcttgtt tggtgagtct 1800 gctgaatcct
tgagggacga atctattgga ttgaccccaa ccaccaagga tttcgatggc 1860
agaagagatt tcgctgacgc tttcaactat tcgcagactt accaggccta cagatttttg
1920 ttgcaacaaa tgtactggat cttgaatggc tcggaattca gaaagtcgat
tgctgtcgtg 1980 cacaagtttg ctgaccacta tgtgcaaaag gctttggagt
tgaccgacga tgacttgcag 2040 aaacaagacg gctatgtgtt cttgtacgag
ttggctaagc aaaccagaga cccaaaggtc 2100 ttgagagacc agttattgaa
cattttggtt gccggtagag acacgaccgc cggtttgttg 2160 tcatttgttt
tctacgagtt gtcaagaaac cctgaggtgt ttgctaagtt gagagaggag 2220
gtggaaaaca gatttggact cggtgaagaa gctcgtgttg aagagatctc gtttgagtcc
2280 ttgaagtctt gtgagtactt gaaggctgtc atcaatgaaa ccttgagatt
gtacccatcg 2340 gttccacaca actttagagt tgctaccaga aacactaccc
tcccaagagg tggtggtgaa 2400 gatggatact cgccaattgt cgtcaagaag
ggtcaagttg tcatgtacac tgttattgct 2460 acccacagag acccaagtat
ctacggtgcc gacgctgacg tcttcagacc agaaagatgg 2520 tttgaaccag
aaactagaaa gttgggctgg gcatacgttc cattcaatgg tggtccaaga 2580
atctgtttgg gtcaacagtt tgccttgacc gaagcttcat acgtcactgt cagattgctc
2640 caggagtttg cacacttgtc tatggaccca gacaccgaat atccaccaaa
attgcagaac 2700 accttgacct tgtcgctctt tgatggtgct gatgttagaa
tgtactaagg ttgcttttcc 2760 ttgctaattt tcttctgtat agcttgtgta
tttaaattga atcggcaatt gatttttctg 2820 ataccaataa ccgtagtgcg
atttgaccaa aaccgttcaa actttttgtt ctctcgttga 2880 cgtgctcgct
catcagcact gtttgaagac gaaagagaaa attttttgta aacaacactg 2940
tccaaattta cccaacgtga accattatgc aaatgagcgg ccctttcaac tggtcgctgg
3000 aagcattcgg ggatatctac aacgccctta agtttgaaac agacattgat
ttagacacca 3060 tagatttcag cggcatcaag aatgaccttg cccacatttt
gacgacccca acaccactgg 3120 aagaatcacg ccagaaacta ggcgatggat
ccaagcctgt gaccttgccc aatggagacg 3180 aagtggagtt gaaccaagcg
ttcctagaag ttaccacatt attgtcgaat gagtttgact 3240 tggaccaatt
gaacgcggca gagttgttat actacgctgg cgacatatcc tacaagaagg 3300
gcacatcaat cgcagacagt gccagattgt cttattattt gagagcaaac tacatcttga
3360 acatacttgg gtatttgatt tcgaagcagc gattggattt gatagtcacg
gacaacgacg 3420 cgttgtttga tagtattttg aaaagttttg aaaagatcta
caagttgata agcgtgttga 3480 acgatatgat tgacaagcaa aaggtgacaa
gcgacatcaa cagtctagca ttcatcaatt 3540 gcatcaacta ctcgagaggt
caactattct ccgcacacga acttttggga ctggttttgt 3600 ttggattggt
cgacatctat ttcaaccagt ttggcacatt agacaactac aagaaggtat 3660
tggcattgat actgaagaac atcagcgatg aagacatctt gatcatacac ttcctcccat
3720 cgacactaca attgtttaag ctggtgttgg acaagaaaga cgacgctgca
gttgaacagt 3780 tctacaagta catcacttca acagtgtcac gagactacaa
ctccaacatc ggctccacag 3840 ccaaagatga tatcgatttg tccaaaacca
aactcagtgg ctttgaggtg ttgacgagtt 3900 89 3668 DNA Candida
tropicalis 89 cctgcagaat tcgcggccgc gtcgacagag tagcagttat
gcaagcatgt gattgtggtt 60 tttgcaacct gtttgcacga caaatgatcg
acagtcgatt acgtaatcca tattatttag 120 aggggtaata aaaaataaat
ggcagccaga atttcaaaca ttttgcaaac aatgcaaaag 180 atgagaaact
ccaacagaaa aaataaaaaa actccgcagc actccgaacc aacaaaacaa 240
tggggggcgc cagaattatt gactattgtg actttttttt attttttccg ttaactttca
300 ttgcagtgaa gtgtgttaca cggggtggtg atggtgttgg tttctacaat
gcaagggcac 360 agttgaaggt ttccacataa cgttgcacca tatcaactca
atttatcctc attcatgtga 420 taaaagaaga gccaaaaggt aattggcaga
ccccccaagg ggaacacgga gtagaaagca 480 atggaaacac gcccatgaca
gtgccattta gcccacaaca catctagtat tctttttttt 540 ttttgtgcgc
aggtgcacac ctggacttta gttattgccc cataaagtta acaatctcac 600
ctttggctct cccagtgtct ccgcctccag atgctcgttt tacaccctcg agctaacgac
660 aacacaacac ccatgagggg aatgggcaaa gttaaacact tttggtttca
atgattccta 720 tttgctactc tcttgttttg tgttttgatt tgcaccatgt
gaaataaacg acaattatat 780 ataccttttc gtctgtcctc caatgtctct
ttttgctgcc attttgcttt ttgctttttg 840 cttttgcact ctctcccact
cccacaatca gtgcagcaac acacaaagaa gaaaaataaa 900 aaaacctaca
ctatgtcgtc ttctccatcg tttgctcagg aggttctcgc taccactagt 960
ccttacatcg agtactttct tgacaactac accagatggt actacttcat ccctttggtg
1020 cttctttcgt tgaacttcat cagcttgctc cacacaaagt acttggaacg
caggttccac 1080 gccaagccgc tcggtaacgt cgtgttggat cctacgtttg
gtatcgctac tccgttgatc 1140 ttgatctact taaagtcgaa aggtacagtc
atgaagtttg cctggagctt ctggaacaac 1200 aagtacattg tcaaagaccc
aaagtacaag accactggcc ttagaattgt cggcctccca 1260 ttgattgaaa
ccatagaccc agagaacatc aaagctgtgt tggctactca gttcaacgat 1320
ttctccttgg gaactagaca cgatttcttg tactccttgt tgggcgatgg tatttttacc
1380 ttggacggtg ctggctggaa acacagtaga actatgttga gaccacagtt
tgctagagaa 1440 caggtttccc acgtcaagtt gttggaacca cacgttcagg
tgttcttcaa gcacgttaga 1500 aaacaccgcg gtcagacttt tgacatccaa
gaattgttct tcagattgac cgtcgactcc 1560 gccaccgagt tcttgtttgg
tgagtctgct gaatccttga gagacgactc tgttggtttg 1620 accccaacca
ccaaggattt cgaaggcaga ggagatttcg ctgacgcttt caactactcg 1680
cagacttacc aggcctacag atttttgttg caacaaatgt actggatttt gaatggcgcg
1740 gaattcagaa agtcgattgc catcgtgcac aagtttgctg accactatgt
gcaaaaggct 1800
ttggagttga ccgacgatga cttgcagaaa caagacggct atgtgttctt gtacgagttg
1860 gctaagcaaa ctagagaccc aaaggtcttg agagaccagt tgttgaacat
tttggttgcc 1920 ggtagagaca cgaccgccgg tttgttgtcg tttgtgttct
acgagttgtc gagaaaccct 1980 gaagtgtttg ccaagttgag agaggaggtg
gaaaacagat ttggactcgg cgaagaggct 2040 cgtgttgaag agatctcttt
tgagtccttg aagtcctgtg agtacttgaa ggctgtcatc 2100 aatgaagcct
tgagattgta cccatctgtt ccacacaact tcagagttgc caccagaaac 2160
actacccttc caagaggcgg tggtaaagac ggatgctcgc caattgttgt caagaagggt
2220 caagttgtca tgtacactgt cattggtacc cacagagacc caagtatcta
cggtgccgac 2280 gccgacgtct tcagaccaga aagatggttc gagccagaaa
ctagaaagtt gggctgggca 2340 tatgttccat tcaatggtgg tccaagaatc
tgtttgggtc agcagtttgc cttgactgaa 2400 gcttcatacg tcactgtcag
attgctccaa gagtttggaa acttgtccct ggatccaaac 2460 gctgagtacc
caccaaaatt gcagaacacc ttgaccttgt cactctttga tggtgctgac 2520
gttagaatgt tctaaggttg cttatccttg ctagtgttat ttatagtttg tgtatttaaa
2580 ttgaatcggc gattgatttt tctggtacta ataactgtag tgggttttga
ccaaaaccgt 2640 tcaaactttt tttttttttt tcttccccct accttcgttg
ctcgctcatc agcactgttt 2700 gaaaacgaaa aaagaaaatt ttttgtaaac
aacattgccc aaacttaccc aacgtgaacc 2760 attataacca aatgagcggc
gctttcaact ggtcactgga ggcattcggg gatatctaca 2820 acacccttaa
gtttgaggaa gacattgatt tagacaccat agatttcagc ggcatcaaga 2880
atgaccttgt ccacattttg acaaccccaa caccactgga agaatcgcgc cagaaactag
2940 gcgatggatc caagcctgtg gccttgccca atggagacga agtggagttg
aaccaagcgt 3000 tcctagaagt taccacatta ttgtcgaacg agtttgactt
ggaccaattg aacgcggccg 3060 agttgttata ctacgccggc gacatatcct
acaagaaggg cacatcaatt gccgacagtg 3120 ccagattgtc ttactatttg
agagcaaact acatcttgaa catacttggg tactttattt 3180 cgaagcagcg
attggatgtg atagtcaccg acaacaacgc gttgtttgat aatattttga 3240
aaagttttga aaagatctac aagttgataa gcgcgttgaa cgatatgatt gacaagcaaa
3300 aggtgacaag cgacatcaac agtctagcat ttatcaactg catcaactac
tcgaggggtc 3360 aactattctc cgcacacgaa cttttgggac tggttttgtt
tggattggtt gacaactatt 3420 tcaaccagtt tggctcatta gacaactaca
agaaagtatt ggcattgata ctgaagaaca 3480 tcagtgatga agatatcttg
atcgtacgct tcctcccatc gacactacaa ttgtttaagc 3540 tggtgttgga
taagaaagac gacgccactg ttgaccagtt ctacaagtac atcacctcaa 3600
cagtgtcgca agactacaac tccaacatcg gagccacagc caaagatgat atcgatttgt
3660 ccaaagcc 3668 90 3826 DNA Candida tropicalis 90 tggagtcgcc
agacttgctc acttttgact cccttcgaaa ctcaaagtac gttcaggcgg 60
tgctcaacga aacgctccgt atctacccgg gggtaccacg aaacatgaag acagctacgt
120 gcaacacgac gttgccacgc ggaggaggca aagacggcaa ggaacctatc
ttggtgcaga 180 agggacagtc cgttgggttg attactattg ccacgcagac
ggacccagag tattttgggg 240 ccgacgctgg tgagtttaag ccggagagat
ggtttgattc aagcatgaag aacttggggt 300 gtaaatactt gccgttcaat
gctgggccac ggacttgctt ggggcagcag tacactttga 360 ttgaagcgag
ctacttgcta gtccggttgg cccagaccta ccgggcaata gatttgcagc 420
caggatcggc gtacccacca agaaagaagt cgttgatcaa catgagtgct gccgacgggg
480 tgtttgtaaa gctttataag gatgtaacgg tagatggata gttgtgtagg
aggagcggag 540 ataaattaga tttgattttg tgtaaggttt tggatgtcaa
cctactccgc acttcatgca 600 gtgtgtgtga cacaagggtg tactacgtgt
gcgtgtgcgc caagagacag cccaaggggg 660 tggtagtgtg tgttggcgga
agtgcatgtg acacaacgcg tgggttctgg ccaatggtgg 720 actaagtgca
ggtaagcagc gacctgaaac attcctcaac gcttaagaca ctggtggtag 780
agatgcggac caggctattc ttgtcgtgct acccggcgca tggaaaatca actgcgggaa
840 gaataaattt atccgtagaa tccacagagc ggataaattt gcccacctcc
atcatcaacc 900 acgccgccac taactacatc actcccctat tttctctctc
tctctttgtc ttactccgct 960 cccgtttcct tagccacaga tacacaccca
ctgcaaacag cagcaacaat tataaagata 1020 cgccaggccc accttctttc
tttttcttca cttttttgac tgcaactttc tacaatccac 1080 cacagccacc
accacagccg ctatgattga acaactccta gaatattggt atgtcgttgt 1140
gccagtgttg tacatcatca aacaactcct tgcatacaca aagactcgcg tcttgatgaa
1200 aaagttgggt gctgctccag tcacaaacaa gttgtacgac aacgctttcg
gtatcgtcaa 1260 tggatggaag gctctccagt tcaagaaaga gggcagggct
caagagtaca acgattacaa 1320 gtttgaccac tccaagaacc caagcgtggg
cacctacgtc agtattcttt tcggcaccag 1380 gatcgtcgtg accaaagatc
cagagaatat caaagctatt ttggcaaccc agtttggtga 1440 tttttctttg
ggcaagaggc acactctttt taagcctttg ttaggtgatg ggatcttcac 1500
attggacggc gaaggctgga agcacagcag agccatgttg agaccacagt ttgccagaga
1560 acaagttgct catgtgacgt cgttggaacc acacttccag ttgttgaaga
agcatattct 1620 taagcacaag ggtgaatact ttgatatcca ggaattgttc
tttagattta ccgttgattc 1680 ggccacggag ttcttatttg gtgagtccgt
gcactcctta aaggacgaat ctattggtat 1740 caaccaagac gatatagatt
ttgctggtag aaaggacttt gctgagtcgt tcaacaaagc 1800 ccaggaatac
ttggctatta gaaccttggt gcagacgttc tactggttgg tcaacaacaa 1860
ggagtttaga gactgtacca agctggtgca caagttcacc aactactatg ttcagaaagc
1920 tttggatgct agcccagaag agcttgaaaa gcaaagtggg tatgtgttct
tgtacgagct 1980 tgtcaagcag acaagagacc ccaatgtgtt gcgtgaccag
tctttgaaca tcttgttggc 2040 cggaagagac accactgctg ggttgttgtc
gtttgctgtc tttgagttgg ccagacaccc 2100 agagatctgg gccaagttga
gagaggaaat tgaacaacag tttggtcttg gagaagactc 2160 tcgtgttgaa
gagattacct ttgagagctt gaagagatgt gagtacttga aagcgttcct 2220
taatgaaacc ttgcgtattt acccaagtgt cccaagaaac ttcagaatcg ccaccaagaa
2280 cacgacattg ccaaggggcg gtggttcaga cggtacctcg ccaatcttga
tccaaaaggg 2340 agaagctgtg tcgtatggta tcaactctac tcatttggac
cctgtctatt acggccctga 2400 tgctgctgag ttcagaccag agagatggtt
tgagccatca accaaaaagc tcggctgggc 2460 ttacttgcca ttcaacggtg
gtccaagaat ctgtttgggt cagcagtttg ccttgacgga 2520 agctggctat
gtgttggtta gattggtgca agagttctcc cacgttaggc tggacccaga 2580
cgaggtgtac ccgccaaaga ggttgaccaa cttgaccatg tgtttgcagg atggtgctat
2640 tgtcaagttt gactagcggc gtggtgaatg cgtttgattt tgtagtttct
gtttgcagta 2700 atgagataac tattcagata aggcgagtgg atgtacgttt
tgtaagagtt tccttacaac 2760 cttggtgggg tgtgtgaggt tgaggttgca
tcttggggag attacacctt ttgcagctct 2820 ccgtatacac ttgtactctt
tgtaacctct atcaatcatg tggggggggg ggttcattgt 2880 ttggccatgg
tggtgcatgt taaatccgcc aactacccaa tctcacatga aactcaagca 2940
cactaaaaaa aaaaaagatg ttgggggaaa actttggttt cccttcttag taattaaaca
3000 ctctcactct cactctcact ctctccactc agacaaacca accacctggg
ctgcagacaa 3060 ccagaaaaaa aaagaacaaa atccagatag aaaaacaaag
ggctggacaa ccataaataa 3120 acaatctagg gtctactcca tcttccactg
tttcttcttc ttcagactta gctaacaaac 3180 aactcacttc accatggatt
acgcaggcat cacgcgtggc tccatcagag gcgaggcctt 3240 gaagaaactc
gcagaattga ccatccagaa ccagccatcc agcttgaaag aaatcaacac 3300
cggcatccag aaggacgact ttgccaagtt gttgtctgcc accccgaaaa tccccaccaa
3360 gcacaagttg aacggcaacc acgaattgtc tgaggtcgcc attgccaaaa
aggagtacga 3420 ggtgttgatt gccttgagcg acgccacaaa agacccaatc
aaagtgacct cccagatcaa 3480 gatcttgatt gacaagttca aggtgtactt
gtttgagttg cctgaccaga agttctccta 3540 ctccatcgtg tccaactccg
tcaacatcgc cccctggacc ttgctcgggg agaagttgac 3600 cacgggcttg
atcaacttgg ccttccagaa caacaagcag cacttggacg aggtcattga 3660
catcttcaac gagttcatcg acaagttctt tggcaacacg gagccgcaat tgaccaactt
3720 cttgaccttg tgcggtgtgt tggacgggtt gattgaccat gccaacttct
tgagcgtgtc 3780 ctcgcggacc ttcaagatct tcttgaactt ggactcgtat gtggac
3826 91 3910 DNA Candida tropicalis 91 ttacaatcat ggagctcgct
aggaacccag atgtctggga gaagctccgc gaagaggtca 60 acacgaactt
tggcatggag tcgccagact tgctcacttt tgactctctt agaagctcaa 120
agtacgttca ggcggtgctc aacgaaacgc ttcgtatcta cccgggggtg ccacgaaaca
180 tgaagacagc tacgtgcaac acgacgttgc cgcgtggagg aggcaaagac
ggtaaggaac 240 ctattttggt gcagaagggc cagtccgttg ggttgattac
tattgccacg cagacggacc 300 cagagtattt tggggcagat gctggtgagt
tcaaaccgga gagatggttt gattcaagca 360 tgaagaactt ggggtgtaag
tacttgccgt tcaatgctgg gccccggact tgtttggggc 420 agcagtacac
tttgattgaa gcgagctatt tgctagtcag gttggcgcag acctaccggg 480
taatcgattt gctgccaggg tcggcgtacc caccaagaaa gaagtcgttg atcaatatga
540 gtgctgccga tggggtggtt gtaaagtttc acaaggatct agatggatat
gtaaggtgtg 600 taggaggagc ggagataaat tagatttgat tttgtgtaag
gtttagcacg tcaagctact 660 ccgcactttg tgtgtaggga gcacatactc
cgtctgcgcc tgtgccaaga gacggcccag 720 gggtagtgtg tggtggtgga
agtgcatgtg acacaatacc ctggttctgg ccaattgggg 780 atttagtgta
ggtaagctgc gacctgaaac actcctcaac gcttgagaca ctggtgggta 840
gagatgcggg ccaggaggct attcttgtcg tgctacccgt gcacggaaaa tcgattgagg
900 gaagaacaaa tttatccgtg aaatccacag agcggataaa tttgtcacat
tgctgcgttg 960 cccacccaca gcattctctt ttctctctct ttgtcttact
ccgctcctgt ttccttatcc 1020 agaaatacac accaactcat ataaagatac
gctagcccag ctgtctttct ttttcttcac 1080 tttttttggt gtgttgcttt
tttggctgct actttctaca accaccacca ccaccaccac 1140 catgattgaa
caaatcctag aatattggta tattgttgtg cctgtgttgt acatcatcaa 1200
acaactcatt gcctacagca agactcgcgt cttgatgaaa cagttgggtg ctgctccaat
1260 cacaaaccag ttgtacgaca acgttttcgg tatcgtcaac ggatggaagg
ctctccagtt 1320 caagaaagag ggcagagctc aagagtacaa cgatcacaag
tttgacagct ccaagaaccc 1380 aagcgtcggc acctatgtca gtattctttt
tggcaccaag attgtcgtga ccaaggatcc 1440 agagaatatc aaagctattt
tggcaaccca gtttggcgat ttttctttgg gcaagagaca 1500 cgctcttttt
aaacctttgt taggtgatgg gatcttcacc ttggacggcg aaggctggaa 1560
gcatagcaga tccatgttaa gaccacagtt tgccagagaa caagttgctc atgtgacgtc
1620 gttggaacca cacttccagt tgttgaagaa gcatatcctt aaacacaagg
gtgagtactt 1680 tgatatccag gaattgttct ttagatttac tgtcgactcg
gccacggagt tcttatttgg 1740 tgagtccgtg cactccttaa aggacgaaac
tatcggtatc aaccaagacg atatagattt 1800 tgctggtaga aaggactttg
ctgagtcgtt caacaaagcc caggagtatt tgtctattag 1860 aattttggtg
cagaccttct actggttgat caacaacaag gagtttagag actgtaccaa 1920
gctggtgcac aagtttacca actactatgt tcagaaagct ttggatgcta ccccagagga
1980 acttgaaaag caaggcgggt atgtgttctt gtatgagctt gtcaagcaga
cgagagaccc 2040 caaggtgttg cgtgaccagt ctttgaacat cttgttggca
ggaagagaca ccactgctgg 2100 gttgttgtcc tttgctgtgt ttgagttggc
cagaaaccca cacatctggg ccaagttgag 2160 agaggaaatt gaacagcagt
ttggtcttgg agaagactct cgtgttgaag agattacctt 2220 tgagagcttg
aagagatgtg agtacttgaa agcgttcctt aacgaaacct tgcgtgttta 2280
cccaagtgtc ccaagaaact tcagaatcgc caccaagaat acaacattgc caaggggtgg
2340 tggtccagac ggtacccagc caatcttgat ccaaaaggga gaaggtgtgt
cgtatggtat 2400 caactctacc cacttagatc ctgtctatta tggccctgat
gctgctgagt tcagaccaga 2460 gagatggttt gagccatcaa ccagaaagct
cggctgggct tacttgccat tcaacggtgg 2520 gccacgaatc tgtttgggtc
agcagtttgc cttgaccgaa gctggttacg ttttggtcag 2580 attggtgcaa
gagttctccc acattaggct ggacccagat gaagtgtatc caccaaagag 2640
gttgaccaac ttgaccatgt gtttgcagga tggtgctatt gtcaagtttg actagtacgt
2700 atgagtgcgt ttgattttgt agtttctgtt tgcagtaatg agataactat
tcagataagg 2760 cgggtggatg tacgttttgt aagagtttcc ttacaaccct
ggtgggtgtg tgaggttgca 2820 tcttagggag agatagcacc ttttgcagct
ctccgtatac agttttactc tttgtaacct 2880 atgccaatca tgtggggatt
cattgtttgc ccatggtggt gcatgcaaaa tccccccaac 2940 tacccaatct
cacatgaaac tcaagcacac tagaaaaaaa agatgttgcg tgggttcttt 3000
tgatgttggg gaaaactttc gtttcctttc tcagtaatta aacgttctca ctcagacaaa
3060 ccacctgggc tgcagacaac cagaaaaaac aaaatccaga tagaagaaga
aagggctgga 3120 caaccataaa taaacaacct agggtccact ccatctttca
cttcttcttc ttcagactta 3180 tctaacaaac gactcacttc accatggatt
acgcaggtat cacgcgtggg tccatcagag 3240 gcgaagcctt gaagaaactc
gccgagttga ccatccagaa ccagccatcc agcttgaaag 3300 aaatcaacac
cggcatccag aaggacgact ttgccaagtt gttgtcttcc accccgaaaa 3360
tccacaccaa gcacaagttg aatggcaacc acgaattgtc cgaagtcgcc attgccaaaa
3420 aggagtacga ggtgttgatt gccttgagcg acgccacgaa agaaccaatc
aaagtcacct 3480 cccagatcaa gatcttgatt gacaagttca aggtgtactt
gtttgagttg cccgaccaga 3540 agttctccta ctccatcgtg tccaactccg
ttaacattgc cccctggacc ttgctcggtg 3600 agaagttgac cacgggcttg
atcaacttgg cgttccagaa caacaagcag cacttggacg 3660 aagtcatcga
catcttcaac gagttcatcg acaagttctt tggcaacaca gagccgcaat 3720
tgaccaactt cttgaccttg tccggtgtgt tggacgggtt gattgaccat gccaacttct
3780 tgagcgtgtc ctccaggacc ttcaagatct tcttgaactt ggactcgttt
gtggacaact 3840 cggacttctt gaacgacgtg gagaactact ccgacttttt
gtacgacgag ccgaacgagt 3900 accagaactt 3910 92 3150 DNA Candida
tropicalis 92 gaattctttg gatctaattc cagctgatct tgctaatcct
tatcaacgta gttgtgatca 60 ttgtttgtct gaattataca caccagtgga
agaatatggt ctaatttgca cgtcccactg 120 gcattgtgtg tttgtggggg
ggggggggtg cacacatttt tagtgccatt ctttgttgat 180 tacccctccc
ccctatcatt cattcccaca ggattagttt tttcctcact ggaattcgct 240
gtccacctgt caaccccccc cccccccccc cccactgccc taccctgccc tgccctgcac
300 gtcctgtgtt ttgtgctgtg tctttcccac gctataaaag ccctggcgtc
cggccaaggt 360 ttttccaccc agccaaaaaa acagtctaaa aaatttggtt
gatccttttt ggttgcaagg 420 ttttccacca ccacttccac cacctcaact
attcgaacaa aagatgctcg atcagatctt 480 acattactgg tacattgtct
tgccattgtt ggccattatc aaccagatcg tggctcatgt 540 caggaccaat
tatttgatga agaaattggg tgctaagcca ttcacacacg tccaacgtga 600
cgggtggttg ggcttcaaat tcggccgtga attcctcaaa gcaaaaagtg ctgggagact
660 ggttgattta atcatctccc gtttccacga taatgaggac actttctcca
gctatgcttt 720 tggcaaccat gtggtgttca ccagggaccc cgagaatatc
aaggcgcttt tggcaaccca 780 gtttggtgat ttttcattgg gcagcagggt
caagttcttc aaaccattat tggggtacgg 840 tatcttcaca ttggacgccg
aaggctggaa gcacagcaga gccatgttga gaccacagtt 900 tgccagagaa
caagttgctc atgtgacgtc gttggaacca cacttccagt tgttgaagaa 960
gcatatcctt aaacacaagg gtgagtactt tgatatccag gaattgttct ttagatttac
1020 tgtcgactcg gccacggagt tcttatttgg tgagtccgtg cactccttaa
aggacgagga 1080 aattggctac gacacgaaag acatgtctga agaaagacgc
agatttgccg acgcgttcaa 1140 caagtcgcaa gtctacgtgg ccaccagagt
tgctttacag aacttgtact ggttggtcaa 1200 caacaaagag ttcaaggagt
gcaatgacat tgtccacaag tttaccaact actatgttca 1260 gaaagccttg
gatgctaccc cagaggaact tgaaaagcaa ggcgggtatg tgttcttgta 1320
tgagcttgtc aagcagacga gagaccccaa ggtgttgcgt gaccagtctt tgaacatctt
1380 gttggcagga agagacacca ctgctgggtt gttgtccttt gctgtgtttg
agttggccag 1440 aaacccacac atctgggcca agttgagaga ggaaattgaa
cagcagtttg gtcttggaga 1500 agactctcgt gttgaagaga ttacctttga
gagcttgaag agatgtgagt acttgaaggc 1560 cgtgttgaac gaaactttga
gattacaccc aagtgtccca agaaacgcaa gatttgcgat 1620 taaagacacg
actttaccaa gaggcggtgg ccccaacggc aaggatccta tcttgatcag 1680
gaaggatgag gtggtgcagt actccatctc ggcaactcag acaaatcctg cttattatgg
1740 cgccgatgct gctgatttta gaccggaaag atggtttgaa ccatcaacta
gaaacttggg 1800 atgggctttc ttgccattca acggtggtcc aagaatctgt
ttgggacaac agtttgcttt 1860 gactgaagcc ggttacgttt tggttagact
tgttcaggag tttccaaact tgtcacaaga 1920 ccccgaaacc aagtacccac
cacctagatt ggcacacttg acgatgtgct tgtttgacgg 1980 tgcacacgtc
aagatgtcat aggtttcccc atacaagtag ttcagtaatt atacactgtt 2040
tttactttct cttcatacca aatggacaaa agttttaagc atgcctaaca acgtgaccgg
2100 acaattgtgt cgcactagta tgtaacaatt gtaaaaatag tgtacactaa
tttgtggtgg 2160 ccggagataa attacagttt ggttttgtgt aaactcgcgg
atatctctgg cagtttctct 2220 tctccgcagc agctttgcca cgggtttgct
ctggggccaa caaattcaaa agggggagaa 2280 acttaacacc ccttatctct
ccactctagg ttgtagctct tgtggggatg caattgtcgt 2340 acgtttttta
tgttttgtct agactttgat gattacgttg gatttcttat gtctgaggcg 2400
tgcttgaaag aagtgtcaaa atgtgacagg cgacgctatt cgacatgaac gcgaaagggt
2460 tatttgcatc aatacgaggg gctgactcta gtctaggatg gcagtcctag
gttgcaaaca 2520 tgttgcacca tatccctcct ggagttggtc gacctcgcct
acgccaccct cagcgatcgg 2580 cactttccgt tgttcaatat ttctccttcc
cattgttcca ggggttatca acaacgttgc 2640 cggcctcctc cccaaattac
aagaaaaata aattgtcgca cggcaccgat ctgtcaaaga 2700 tacagataaa
ccttaaatct gcaaaaacaa gacccctccc catagcctag aagcaccagc 2760
aagatgatgg agcaactcct ccagtactgg tacatcgcac tctctgtatg gttcatcctt
2820 cgctacttgg cttcccacgc acgagccgtc tacttgcgcc acaagctcgg
cgcggcgcca 2880 ttcacgcaca cccagtacga cggctggtat gggttcaagt
ttgggcggga gtttctcaag 2940 gcgaagaaga tcgggcggca gacggacttg
gtgcatgcgc ggttccgtgg cggcatggac 3000 accttctcga gctacacttt
cggcatccat atcatcctta cccgggaccc ggagaacatc 3060 aaggcggtct
tggcgacgca gttcgatgac ttctcgctcg gtggcaggat caggttcttg 3120
aagccgttgt tggggtatgg gatattcacg 3150 93 3579 DNA Candida
tropicalis 93 aaaaccgata caagaagaag acagtcaaca agaacgttaa
tgtcaaccag gcgccaagaa 60 gacggtttgg cggacttgga agaatgtggc
atttgcccat gatgtttatg ttctggagag 120 gtttttcaag gaatcgtcat
cctccgccac cacaagaacc accagttaac gagatccata 180 ttcacaaccc
accgcaaggt gacaatgctc aacaacaaca gcaacaacaa caacccccac 240
aagaacagtg gaataatgcc agtcaacaaa gagtggtgac agacgaggga gaaaacgcaa
300 gcaacagtgg ttctgatgca agatcagcta caccgcttca tcaggaaaag
caggagctcc 360 caccaccata tgcccatcac gagcaacacc agcaggttag
tgtatagtag tctgtagtta 420 agtcaatgca atgtaccaat aagactatcc
cttcttacaa ccaagttttc tgccgcgcct 480 gtctggcaac agatgctggc
cgacacactt tcaactgagt ttggtctaga attcttgcac 540 atgcacgaca
aggaaactct tacaaagaca acacttgtgc tctgatgcca cttgatcttg 600
ctaagcctta tcaacgtaat tgagatcatt gtttgtctga attatacaca ccagtggaag
660 aatctggtct aatctgcacg cctcatgggc attgtgtgtt ttgggggggg
gggggggggt 720 gcacacattt ttagtgcgaa tgtttgtttg ctggttcccc
ctcccccctc ccccctatca 780 tgcccacagg attagttttt tcctcactgg
aattcgctgt ccacctgtca accccctcac 840 tgccctgccc tgccctgcac
gccctgtgtt ttgtgctgtg gcactcccac gctataaaag 900 ccctggcgta
cggccaaggt ttttcctcac agccaaaaaa aaatttggct gatccttttg 960
ggctgcaagg tttttcacca ccaccaccac caccacctca actattcaaa caaaggatgc
1020 tcgaccagat cttccattac tggtacattg tcttgccatt gttggtcatt
atcaagcaga 1080 tcgtggctca tgccaggacc aattatttga tgaagaagtt
gggcgctaag ccattcacac 1140 atgtccaact agacgggtgg tttggcttca
aatttggccg tgaattcctc aaagctaaaa 1200 gtgctgggag gcaggttgat
ttaatcatct cccgtttcca cgataatgag gacactttct 1260 ccagctatgc
ttttggcaac catgtggtgt tcaccaggga ccccgagaat atcaaggcgc 1320
ttttggcaac ccagtttggt gatttttcat tgggaagcag ggtcaaattc ttcaaaccat
1380 tgttggggta cggtatcttc accttggacg gcgaaggctg gaagcacagc
agagccatgt 1440 tgagaccaca gtttgccaga gagcaagttg ctcatgtgac
gtcgttggaa ccacatttcc 1500 agttgttgaa gaagcatatt cttaagcaca
agggtgaata ctttgatatc caggaattgt 1560 tctttagatt taccgttgat
tcagcgacgg agttcttatt tggtgagtcc gtgcactcct 1620 taagggacga
ggaaattggc tacgatacga aggacatggc tgaagaaaga cgcaaatttg 1680
ccgacgcgtt caacaagtcg caagtctatt tgtccaccag agttgcttta cagacattgt
1740 actggttggt caacaacaaa gagttcaagg agtgcaacga cattgtccac
aagttcacca 1800 actactatgt tcagaaagcc ttggatgcta ccccagagga
acttgaaaaa caaggcgggt 1860 atgtgttctt gtacgagctt gccaagcaga
cgaaagaccc caatgtgttg cgtgaccagt 1920 ctttgaacat cttgttggct
ggaagggaca ccactgctgg gttgttgtcc tttgctgtgt 1980
ttgagttggc caggaaccca cacatctggg ccaagttgag agaggaaatt gaatcacact
2040 ttgggctggg tgaggactct cgtgttgaag agattacctt tgagagcttg
aagagatgtg 2100 agtacttgaa agccgtgttg aacgaaacgt tgagattaca
cccaagtgtc ccaagaaacg 2160 caagatttgc gattaaagac acgactttac
caagaggcgg tggccccaac ggcaaggatc 2220 ctatcttgat cagaaagaat
gaggtggtgc aatactccat ctcggcaact cagacaaatc 2280 ctgcttatta
tggcgccgat gctgctgatt ttagaccgga aagatggttt gagccatcaa 2340
ctagaaactt gggatgggct tacttgccat tcaacggtgg tccaagaatc tgcttgggac
2400 aacagtttgc tttgaccgaa gccggttacg ttttggttag acttgttcag
gaattcccta 2460 gcttgtcaca ggaccccgaa actgagtacc caccacctag
attggcacac ttgacgatgt 2520 gcttgtttga cggggcatac gtcaagatgc
aataggtttt ggtttgactt tgtttccata 2580 tgcaagtagt tcagtaatta
cacactaatt tgtggtggcc ggcgataaat taccgtttgg 2640 ttttgtgtaa
aaattcggac atctctggtg gtttcccttc tccgcagcag ctttgccacg 2700
ggtttgctct gcggccaaca aattcgaaag gggggggggg gggggagaaa gttaacaccc
2760 cctgttccca ccgtaggctg tagctcttgt ggggggatgt aattgtcgta
cgttttcatg 2820 tttggcccag actttgatga ttacgtaggc tttcttatgt
ctaaggcgtg cttgacacaa 2880 gtgtcaaaag gtgacaggcg acgttattcg
acatgaacgc aaaagggtaa tttgcatcga 2940 tacgaggggt tgcctctggt
ctaagaagga ccccccaggt tgcaaacatg ttgcactgca 3000 tcccactcag
agttggtcga ccacgcctac gcttaccctc agcgatcggc actttccgtt 3060
gctcaatatt tctctccccc ctgcttcccc ccattgttcc agggattatc aacaacgttg
3120 ccggtctcct ctcccccccc tccccccagt tatgtacaag aaaattaaat
tgtcgcacgg 3180 caccgatacg tcaaagatac agagaaacct taatccctcc
catagcctag aagcatcaaa 3240 aagatgattg agcaactcct ccagtactgg
tacattgcac tccctgtatg gttcattctc 3300 cgctacgtgg cttcccacgc
acgaaccatc tacttgcgcc acaagctcgg cgcggcgccg 3360 ttcacgcaca
cccagtacga cggatggtat gggttcaagt ttgggcggga gtttctcaag 3420
gcgaagaaga ttggaaggca gacggacttg gtgcatgcgc ggttccgtgg agggggcatg
3480 gatactttct cgagctatac tttcggcatc catatcattc ttactcggga
cccggagaac 3540 atcaaggcgg tcttggcgac gcagttcgat gacttttcg 3579 94
3348 DNA Candida tropicalis 94 gatgtggtgc ttgatttctc gagacacatc
cttgtgaggt gccatgaatc tgtacctgtc 60 tgtaagcaca gggaactgct
tcaacacctt attgcatatt ctgtctattg caagcgtgtg 120 ctgcaacgat
atctgccaag gtatatagca gaacgtgctg atggttcctc cggtcatatt 180
ctgttggtag ttctgcaggt aaatttggat gtcaggtagt ggagggaggt ttgtatcggt
240 tgtgttttct tcttcctctc tctctgattc aacctccacg tctccttcgg
gttctgtgtc 300 tgtgtctgag tcgtactgtt ggattaagtc catcgcatgt
gtgaaaaaaa gtagcgctta 360 tttagacaac cagttcgttg ggcgggtatc
agaaatagtc tgttgtgcac gaccatgagt 420 atgcaacttg acgagacgtc
gttaggaatc cacagaatga tagcaggaag cttactacgt 480 gagagattct
gcttagagga tgttctcttc ttgttgattc cattaggtgg gtatcatctc 540
cggtggtgac aacttgacac aagcagttcc gagaaccacc cacaacaatc accattccag
600 ctatcacttc tacatgtcaa cctacgatgt atctcatcac catctagttt
cttggcaatc 660 gtttatttgt tatgggtcaa catccaatac aactccacca
atgaagaaga aaaacggaaa 720 gcagaatacc agaatgacag tgtgagttcc
tgaccattgc taatctatgg ctatatctag 780 tttgctatcg tgggatgtga
tctgtgtcgt cttcatttgc gtttgtgttt atttcgggta 840 tgaatattgt
tatactaaat acttgatgca caaacatggc gctcgagaaa tcgagaatgt 900
gatcaacgat gggttctttg ggttccgctt acctttgcta ctcatgcgag ccagcaatga
960 gggccgactt atcgagttca gtgtcaagag attcgagtcg gcgccacatc
cacagaacaa 1020 gacattggtc aaccgggcat tgagcgttcc tgtgatactc
accaaggacc cagtgaatat 1080 caaagcgatg ctatcgaccc agtttgatga
cttttccctt gggttgagac tacaccagtt 1140 tgcgccgttg ttggggaaag
gcatctttac tttggacggc ccagagtgga agcagagccg 1200 atctatgttg
cgtccgcaat ttgccaaaga tcgggtttct catatcctgg atctagaacc 1260
gcattttgtg ttgcttcgga agcacattga tggccacaat ggagactact tcgacatcca
1320 ggagctctac ttccggttct cgatggatgt ggcgacgggg tttttgtttg
gcgagtctgt 1380 ggggtcgttg aaagacgaag atgcgaggtt cctggaagca
ttcaatgagt cgcagaagta 1440 tttggcaact agggcaacgt tgcacgagtt
gtactttctt tgtgacgggt ttaggtttcg 1500 ccagtacaac aaggttgtgc
gaaagttctg cagccagtgt gtccacaagg cgttagatgt 1560 tgcaccggaa
gacaccagcg agtacgtgtt tctccgcgag ttggtcaaac acactcgaga 1620
tcccgttgtt ttacaagacc aagcgttgaa cgtcttgctt gctggacgcg acaccaccgc
1680 gtcgttatta tcgtttgcaa catttgagct agcccggaat gaccacatgt
ggaggaagct 1740 acgagaggag gttatcctga cgatgggacc gtccagtgat
gaaataaccg tggccgggtt 1800 gaagagttgc cgttacctca aagcaatcct
aaacgaaact cttcgactat acccaagtgt 1860 gcctaggaac gcgagatttg
ctacgaggaa tacgacgctt cctcgtggcg gaggtccaga 1920 tggatcgttt
ccgattttga taagaaaggg ccagccagtg gggtatttca tttgtgctac 1980
acacttgaat gagaaggtat atgggaatga tagccatgtg tttcgaccgg agagatgggc
2040 tgcgttagag ggcaagagtt tgggctggtc gtatcttcca ttcaacggcg
gcccgagaag 2100 ctgccttggt cagcagtttg caatccttga agcttcgtat
gttttggctc gattgacaca 2160 gtgctacacg acgatacagc ttagaactac
cgagtaccca ccaaagaaac tcgttcatct 2220 cacgatgagt cttctcaacg
gggtgtacat ccgaactaga acttgattat gtgtttatgg 2280 ttaatcgggg
caaagcactg caagtcattg atgtttgtgg aagcccagca ttggtgttcc 2340
ggagcatcaa taaccaatgt cttgaagggt ttgattttct tgaccttctt cttcctgagc
2400 ttctttccgt caaacttgta cagaatggcc atcatttcag gaacaaccac
gtacgacggc 2460 cggtaccgca tctggagtat ctcgccgtcg ttcaagtagc
acgaaaacag caacgacgtc 2520 accatctgct tcccaatctt gacacccaca
gatacccctg cggcttcatg gatcaaaaac 2580 gtcggcaacc ccgcgtatat
gtccatgtaa ttctccatgg ccacctccat caacacactg 2640 atggagcgac
tgacggtgcc accactgccc tcggttgagt caaggcagta tgatgccggg 2700
atccagtact ccaatgggaa cctctgcacg gtgtcgctgc agtttttgag gcgtatttcg
2760 atccatgatc gttctttggt gctgtagtat aacgagctct tggtgtcctt
gaaatggaac 2820 aggttggatg tgttgttgag tttgtctgcg tgcttggttt
gcaagtcttc gatcgagcgt 2880 agtgagtaga cagttggcgg gggtggtggc
tcgggcttta ttctgtgttt gtgtttcctt 2940 cttagtcttg gaatgacgct
gttatcgacg gttcgtagta taagtagcgc caatatgaga 3000 atgtatatcc
gcatcaccca agactcttca gcctgttaca acgactgagg ctgttggccg 3060
tgtgaccaat tggtttcttt ggtgacctag attggtcccg cagggaaagc aagggctgct
3120 aggggggcat accaaacaag gtcgtgtaat cagtatctat ggtgctacca
tgtgtgtggt 3180 tggggggaaa ttcccgcatt tttgtgtaac gaaagttcta
gaaagttctc gtgggttctg 3240 agaatctgct ggaaccatcc acccgcattt
ccgttgccaa agtgggaaga gcaatcaacc 3300 caccctgctt tgcccaatca
gccattcccc tgggaatata aattcaac 3348 95 523 PRT Candida tropicalis 95
Met Ala Thr Gln Glu Ile Ile Asp Ser Val Leu Pro Tyr Leu Thr Lys 1 5
10 15 Trp Tyr Thr Val Ile Thr Ala Ala Val Leu Val Phe Leu Ile Ser
Thr 20 25 30 Asn Ile Lys Asn Tyr Val Lys Ala Lys Lys Leu Lys Cys
Val Asp Pro 35 40 45 Pro Tyr Leu Lys Asp Ala Gly Leu Thr Gly Ile
Leu Ser Leu Ile Ala 50 55 60 Ala Ile Lys Ala Lys Asn Asp Gly Arg
Leu Ala Asn Phe Ala Asp Glu 65 70 75 80 Val Phe Asp Glu Tyr Pro Asn
His Thr Phe Tyr Leu Ser Val Ala Gly 85 90 95 Ala Leu Lys Ile Val
Met Thr Val Asp Pro Glu Asn Ile Lys Ala Val 100 105 110 Leu Ala Thr
Gln Phe Thr Asp Phe Ser Leu Gly Thr Arg His Ala His 115 120 125 Phe
Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly 130 135
140 Trp Lys His Ser Arg Ala Met Leu Arg Pro Gln Phe Ala Arg Asp Gln
145 150 155 160 Ile Gly His Val Lys Ala Leu Glu Pro His Ile Gln Ile
Met Ala Lys 165 170 175 Gln Ile Lys Leu Asn Gln Gly Lys Thr Phe Asp
Ile Gln Glu Leu Phe 180 185 190 Phe Arg Phe Thr Val Asp Thr Ala Thr
Glu Phe Leu Phe Gly Glu Ser 195 200 205 Val His Ser Leu Tyr Asp Glu
Lys Leu Gly Ile Pro Thr Pro Asn Glu 210 215 220 Ile Pro Gly Arg Glu
Asn Phe Ala Ala Ala Phe Asn Val Ser Gln His 225 230 235 240 Tyr Leu
Ala Thr Arg Ser Tyr Ser Gln Thr Phe Tyr Phe Leu Thr Asn 245 250 255
Pro Lys Glu Phe Arg Asp Cys Asn Ala Lys Val His His Leu Ala Lys 260
265 270 Tyr Phe Val Asn Lys Ala Leu Asn Phe Thr Pro Glu Glu Leu Glu
Glu 275 280 285 Lys Ser Lys Ser Gly Tyr Val Phe Leu Tyr Glu Leu Val
Lys Gln Thr 290 295 300 Arg Asp Pro Lys Val Leu Gln Asp Gln Leu Leu
Asn Ile Met Val Ala 305 310 315 320 Gly Arg Asp Thr Thr Ala Gly Leu
Leu Ser Phe Ala Leu Phe Glu Leu 325 330 335 Ala Arg His Pro Glu Met
Trp Ser Lys Leu Arg Glu Glu Ile Glu Val 340 345 350 Asn Phe Gly Val
Gly Glu Asp Ser Arg Val Glu Glu Ile Thr Phe Glu 355 360 365 Ala Leu
Lys Arg Cys Glu Tyr Leu Lys Ala Ile Leu Asn Glu Thr Leu 370 375 380
Arg Met Tyr Pro Ser Val Pro Val Asn Phe Arg Thr Ala Thr Arg Asp 385
390 395 400 Thr Thr Leu Pro Arg Gly Gly Gly Ala Asn Gly Thr Asp Pro
Ile Tyr 405 410 415 Ile Pro Lys Gly Ser Thr Val Ala Tyr Val Val Tyr
Lys Thr His Arg 420 425 430 Leu Glu Glu Tyr Tyr Gly Lys Asp Ala Asn
Asp Phe Arg Pro Glu Arg 435 440 445 Trp Phe Glu Pro Ser Thr Lys Lys
Leu Gly Trp Ala Tyr Val Pro Phe 450 455 460 Asn Gly Gly Pro Arg Val
Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu 465 470 475 480 Ala Ser Tyr
Val Ile Thr Arg Leu Ala Gln Met Phe Glu Thr Val Ser 485 490 495 Ser
Asp Pro Gly Leu Glu Tyr Pro Pro Pro Lys Cys Ile His Leu Thr 500 505
510 Met Ser His Asn Asp Gly Val Phe Val Lys Met 515 520 96 522 PRT
Candida tropicalis 96 Met Thr Val His Asp Ile Ile Ala Thr Tyr Phe
Thr Lys Trp Tyr Val 1 5 10 15 Ile Val Pro Leu Ala Leu Ile Ala Tyr
Arg Val Leu Asp Tyr Phe Tyr 20 25 30 Gly Arg Tyr Leu Met Tyr Lys
Leu Gly Ala Lys Pro Phe Phe Gln Lys 35 40 45 Gln Thr Asp Gly Cys
Phe Gly Phe Lys Ala Pro Leu Glu Leu Leu Lys 50 55 60 Lys Lys Ser
Asp Gly Thr Leu Ile Asp Phe Thr Leu Gln Arg Ile His 65 70 75 80 Asp
Leu Asp Arg Pro Asp Ile Pro Thr Phe Thr Phe Pro Val Phe Ser 85 90
95 Ile Asn Leu Val Asn Thr Leu Glu Pro Glu Asn Ile Lys Ala Ile Leu
100 105 110 Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg His Ser
His Phe 115 120 125 Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp
Gly Ala Gly Trp 130 135 140 Lys His Ser Arg Ser Met Leu Arg Pro Gln
Phe Ala Arg Glu Gln Ile 145 150 155 160 Ser His Val Lys Leu Leu Glu
Pro His Val Gln Val Phe Phe Lys His 165 170 175 Val Arg Lys Ala Gln
Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe Phe 180 185 190 Arg Leu Thr
Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val 195 200 205 Glu
Ser Leu Arg Asp Glu Ser Ile Gly Met Ser Ile Asn Ala Leu Asp 210 215
220 Phe Asp Gly Lys Ala Gly Phe Ala Asp Ala Phe Asn Tyr Ser Gln Asn
225 230 235 240 Tyr Leu Ala Ser Arg Ala Val Met Gln Gln Leu Tyr Trp
Val Leu Asn 245 250 255 Gly Lys Lys Phe Lys Glu Cys Asn Ala Lys Val
His Lys Phe Ala Asp 260 265 270 Tyr Tyr Val Asn Lys Ala Leu Asp Leu
Thr Pro Glu Gln Leu Glu Lys 275 280 285 Gln Asp Gly Tyr Val Phe Leu
Tyr Glu Leu Val Lys Gln Thr Arg Asp 290 295 300 Lys Gln Val Leu Arg
Asp Gln Leu Leu Asn Ile Met Val Ala Gly Arg 305 310 315 320 Asp Thr
Thr Ala Gly Leu Leu Ser Phe Val Phe Phe Glu Leu Ala Arg 325 330 335
Asn Pro Glu Val Thr Asn Lys Leu Arg Glu Glu Ile Glu Asp Lys Phe 340
345 350 Gly Leu Gly Glu Asn Ala Ser Val Glu Asp Ile Ser Phe Glu Ser
Leu 355 360 365 Lys Ser Cys Glu Tyr Leu Lys Ala Val Leu Asn Glu Thr
Leu Arg Leu 370 375 380 Tyr Pro Ser Val Pro Gln Asn Phe Arg Val Ala
Thr Lys Asn Thr Thr 385 390 395 400 Leu Pro Arg Gly Gly Gly Lys Asp
Gly Leu Ser Pro Val Leu Val Arg 405 410 415 Lys Gly Gln Thr Val Ile
Tyr Gly Val Tyr Ala Ala His Arg Asn Pro 420 425 430 Ala Val Tyr Gly
Lys Asp Ala Leu Glu Phe Arg Pro Glu Arg Trp Phe 435 440 445 Glu Pro
Glu Thr Lys Lys Leu Gly Trp Ala Phe Leu Pro Phe Asn Gly 450 455 460
Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Ser 465
470 475 480 Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Ala His Leu Ser
Met Asp 485 490 495 Pro Asp Thr Glu Tyr Pro Pro Lys Lys Met Ser His
Leu Thr Met Ser 500 505 510 Leu Phe Asp Gly Ala Asn Ile Glu Met Tyr
515 520 97 522 PRT Candida tropicalis 97 Met Thr Ala Gln Asp Ile Ile
Ala Thr Tyr Ile Thr Lys Trp Tyr Val 1 5 10 15 Ile Val Pro Leu Ala
Leu Ile Ala Tyr Arg Val Leu Asp Tyr Phe Tyr 20 25 30 Gly Arg Tyr
Leu Met Tyr Lys Leu Gly Ala Lys Pro Phe Phe Gln Lys 35 40 45 Gln
Thr Asp Gly Tyr Phe Gly Phe Lys Ala Pro Leu Glu Leu Leu Lys 50 55
60 Lys Lys Ser Asp Gly Thr Leu Ile Asp Phe Thr Leu Glu Arg Ile Gln
65 70 75 80 Ala Leu Asn Arg Pro Asp Ile Pro Thr Phe Thr Phe Pro Ile
Phe Ser 85 90 95 Ile Asn Leu Ile Ser Thr Leu Glu Pro Glu Asn Ile
Lys Ala Ile Leu 100 105 110 Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly
Thr Arg His Ser His Phe 115 120 125 Ala Pro Leu Leu Gly Asp Gly Ile
Phe Thr Leu Asp Gly Ala Gly Trp 130 135 140 Lys His Ser Arg Ser Met
Leu Arg Pro Gln Phe Ala Arg Glu Gln Ile 145 150 155 160 Ser His Val
Lys Leu Leu Glu Pro His Met Gln Val Phe Phe Lys His 165 170 175 Val
Arg Lys Ala Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe Phe 180 185
190 Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu Ser Val
195 200 205 Glu Ser Leu Arg Asp Glu Ser Ile Gly Met Ser Ile Asn Ala
Leu Asp 210 215 220 Phe Asp Gly Lys Ala Gly Phe Ala Asp Ala Phe Asn
Tyr Ser Gln Asn 225 230 235 240 Tyr Leu Ala Ser Arg Ala Val Met Gln
Gln Leu Tyr Trp Val Leu Asn 245 250 255 Gly Lys Lys Phe Lys Glu Cys
Asn Ala Lys Val His Lys Phe Ala Asp 260 265 270 Tyr Tyr Val Ser Lys
Ala Leu Asp Leu Thr Pro Glu Gln Leu Glu Lys 275 280 285 Gln Asp Gly
Tyr Val Phe Leu Tyr Glu Leu Val Lys Gln Thr Arg Asp 290 295 300 Arg
Gln Val Leu Arg Asp Gln Leu Leu Asn Ile Met Val Ala Gly Arg 305 310
315 320 Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Phe Glu Leu Ala
Arg 325 330 335 Asn Pro Glu Val Thr Asn Lys Leu Arg Glu Glu Ile Glu
Asp Lys Phe 340 345 350 Gly Leu Gly Glu Asn Ala Arg Val Glu Asp Ile
Ser Phe Glu Ser Leu 355 360 365 Lys Ser Cys Glu Tyr Leu Lys Ala Val
Leu Asn Glu Thr Leu Arg Leu 370 375 380 Tyr Pro Ser Val Pro Gln Asn
Phe Arg Val Ala Thr Lys Asn Thr Thr 385 390 395 400 Leu Pro Arg Gly
Gly Gly Lys Asp Gly Leu Ser Pro Val Leu Val Arg 405 410 415 Lys Gly
Gln Thr Val Met Tyr Gly Val Tyr Ala Ala His Arg Asn Pro 420 425 430
Ala Val Tyr Gly Lys Asp Ala Leu Glu Phe Arg Pro Glu Arg Trp Phe 435
440 445 Glu Pro Glu Thr Lys Lys Leu Gly Trp Ala Phe Leu Pro Phe Asn
Gly 450 455 460 Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr
Glu Ala Ser 465 470 475 480 Tyr Val Thr Val Arg Leu Leu Gln Glu Phe
Gly His Leu Ser Met Asp 485 490 495 Pro Asn Thr Glu Tyr Pro Pro Arg
Lys Met Ser His Leu Thr Met Ser 500 505 510 Leu Phe Asp Gly Ala Asn
Ile Glu Met Tyr 515 520 98 540 PRT Candida tropicalis 98 Met Ser Ser
Ser Pro Ser Phe Ala Gln Glu Val Leu Ala Thr Thr Ser 1 5 10 15 Pro
Tyr Ile Glu Tyr Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe 20 25
30 Ile Pro Leu Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr
35 40 45 Arg Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Phe
Val 50 55 60 Arg Asp Pro Thr Phe
Gly Ile Ala Thr Pro Leu Leu Leu Ile Tyr Leu 65 70 75 80 Lys Ser Lys
Gly Thr Val Met Lys Phe Ala Trp Gly Leu Trp Asn Asn 85 90 95 Lys
Tyr Ile Val Arg Asp Pro Lys Tyr Lys Thr Thr Gly Leu Arg Ile 100 105
110 Val Gly Leu Pro Leu Ile Glu Thr Met Asp Pro Glu Asn Ile Lys Ala
115 120 125 Val Leu Ala Thr Gln Phe Asn Asp Phe Ser Leu Gly Thr Arg
His Asp 130 135 140 Phe Leu Tyr Ser Leu Leu Gly Asp Gly Ile Phe Thr
Leu Asp Gly Ala 145 150 155 160 Gly Trp Lys His Ser Arg Thr Met Leu
Arg Pro Gln Phe Ala Arg Glu 165 170 175 Gln Val Ser His Val Lys Leu
Leu Glu Pro His Val Gln Val Phe Phe 180 185 190 Lys His Val Arg Lys
His Arg Gly Gln Thr Phe Asp Ile Gln Glu Leu 195 200 205 Phe Phe Arg
Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe Gly Glu 210 215 220 Ser
Ala Glu Ser Leu Arg Asp Glu Ser Ile Gly Leu Thr Pro Thr Thr 225 230
235 240 Lys Asp Phe Asp Gly Arg Arg Asp Phe Ala Asp Ala Phe Asn Tyr
Ser 245 250 255 Gln Thr Tyr Gln Ala Tyr Arg Phe Leu Leu Gln Gln Met
Tyr Trp Ile 260 265 270 Leu Asn Gly Ser Glu Phe Arg Lys Ser Ile Ala
Val Val His Lys Phe 275 280 285 Ala Asp His Tyr Val Gln Lys Ala Leu
Glu Leu Thr Asp Asp Asp Leu 290 295 300 Gln Lys Gln Asp Gly Tyr Val
Phe Leu Tyr Glu Leu Ala Lys Gln Thr 305 310 315 320 Arg Asp Pro Lys
Val Leu Arg Asp Gln Leu Leu Asn Ile Leu Val Ala 325 330 335 Gly Arg
Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe Tyr Glu Leu 340 345 350
Ser Arg Asn Pro Glu Val Phe Ala Lys Leu Arg Glu Glu Val Glu Asn 355
360 365 Arg Phe Gly Leu Gly Glu Glu Ala Arg Val Glu Glu Ile Ser Phe
Glu 370 375 380 Ser Leu Lys Ser Cys Glu Tyr Leu Lys Ala Val Ile Asn
Glu Thr Leu 385 390 395 400 Arg Leu Tyr Pro Ser Val Pro His Asn Phe
Arg Val Ala Thr Arg Asn 405 410 415 Thr Thr Leu Pro Arg Gly Gly Gly
Glu Asp Gly Tyr Ser Pro Ile Val 420 425 430 Val Lys Lys Gly Gln Val
Val Met Tyr Thr Val Ile Ala Thr His Arg 435 440 445 Asp Pro Ser Ile
Tyr Gly Ala Asp Ala Asp Val Phe Arg Pro Glu Arg 450 455 460 Trp Phe
Glu Pro Glu Thr Arg Lys Leu Gly Trp Ala Tyr Val Pro Phe 465 470 475
480 Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
485 490 495 Ala Ser Tyr Val Thr Val Arg Leu Leu Gln Glu Phe Ala His
Leu Ser 500 505 510 Met Asp Pro Asp Thr Glu Tyr Pro Pro Lys Leu Gln
Asn Thr Leu Thr 515 520 525 Leu Ser Leu Phe Asp Gly Ala Asp Val Arg
Met Tyr 530 535 540 99 540 PRT Candida tropicalis 99 Met Ser Ser Ser
Pro Ser Phe Ala Gln Glu Val Leu Ala Thr Thr Ser 1 5 10 15 Pro Tyr
Ile Glu Tyr Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe 20 25 30
Ile Pro Leu Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr 35
40 45 Lys Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Val
Val 50 55 60 Leu Asp Pro Thr Phe Gly Ile Ala Thr Pro Leu Ile Leu
Ile Tyr Leu 65 70 75 80 Lys Ser Lys Gly Thr Val Met Lys Phe Ala Trp
Ser Phe Trp Asn Asn 85 90 95 Lys Tyr Ile Val Lys Asp Pro Lys Tyr
Lys Thr Thr Gly Leu Arg Ile 100 105 110 Val Gly Leu Pro Leu Ile Glu
Thr Ile Asp Pro Glu Asn Ile Lys Ala 115 120 125 Val Leu Ala Thr Gln
Phe Asn Asp Phe Ser Leu Gly Thr Arg His Asp 130 135 140 Phe Leu Tyr
Ser Leu Leu Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala 145 150 155 160
Gly Trp Lys His Ser Arg Thr Met Leu Arg Pro Gln Phe Ala Arg Glu 165
170 175 Gln Val Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe
Phe 180 185 190 Lys His Val Arg Lys His Arg Gly Gln Thr Phe Asp Ile
Gln Glu Leu 195 200 205 Phe Phe Arg Leu Thr Val Asp Ser Ala Thr Glu
Phe Leu Phe Gly Glu 210 215 220 Ser Ala Glu Ser Leu Arg Asp Asp Ser
Val Gly Leu Thr Pro Thr Thr 225 230 235 240 Lys Asp Phe Glu Gly Arg
Gly Asp Phe Ala Asp Ala Phe Asn Tyr Ser 245 250 255 Gln Thr Tyr Gln
Ala Tyr Arg Phe Leu Leu Gln Gln Met Tyr Trp Ile 260 265 270 Leu Asn
Gly Ala Glu Phe Arg Lys Ser Ile Ala Ile Val His Lys Phe 275 280 285
Ala Asp His Tyr Val Gln Lys Ala Leu Glu Leu Thr Asp Asp Asp Leu 290
295 300 Gln Lys Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Ala Lys Gln
Thr 305 310 315 320 Arg Asp Pro Lys Val Leu Arg Asp Gln Leu Leu Asn
Ile Leu Val Ala 325 330 335 Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser
Phe Val Phe Tyr Glu Leu 340 345 350 Ser Arg Asn Pro Glu Val Phe Ala
Lys Leu Arg Glu Glu Val Glu Asn 355 360 365 Arg Phe Gly Leu Gly Glu
Glu Ala Arg Val Glu Glu Ile Ser Phe Glu 370 375 380 Ser Leu Lys Ser
Cys Glu Tyr Leu Lys Ala Val Ile Asn Glu Ala Leu 385 390 395 400 Arg
Leu Tyr Pro Ser Val Pro His Asn Phe Arg Val Ala Thr Arg Asn 405 410
415 Thr Thr Leu Pro Arg Gly Gly Gly Lys Asp Gly Cys Ser Pro Ile Val
420 425 430 Val Lys Lys Gly Gln Val Val Met Tyr Thr Val Ile Gly Thr
His Arg 435 440 445 Asp Pro Ser Ile Tyr Gly Ala Asp Ala Asp Val Phe
Arg Pro Glu Arg 450 455 460 Trp Phe Glu Pro Glu Thr Arg Lys Leu Gly
Trp Ala Tyr Val Pro Phe 465 470 475 480 Asn Gly Gly Pro Arg Ile Cys
Leu Gly Gln Gln Phe Ala Leu Thr Glu 485 490 495 Ala Ser Tyr Val Thr
Val Arg Leu Leu Gln Glu Phe Gly Asn Leu Ser 500 505 510 Leu Asp Pro
Asn Ala Glu Tyr Pro Pro Lys Leu Gln Asn Thr Leu Thr 515 520 525 Leu
Ser Leu Phe Asp Gly Ala Asp Val Arg Met Phe 530 535 540 100 517 PRT
Candida tropicalis 100 Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val
Val Val Pro Val Leu 1 5 10 15 Tyr Ile Ile Lys Gln Leu Leu Ala Tyr
Thr Lys Thr Arg Val Leu Met 20 25 30 Lys Lys Leu Gly Ala Ala Pro
Val Thr Asn Lys Leu Tyr Asp Asn Ala 35 40 45 Phe Gly Ile Val Asn
Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly 50 55 60 Arg Ala Gln
Glu Tyr Asn Asp Tyr Lys Phe Asp His Ser Lys Asn Pro 65 70 75 80 Ser
Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Arg Ile Val Val 85 90
95 Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110 Asp Phe Ser Leu Gly Lys Arg His Thr Leu Phe Lys Pro Leu
Leu Gly 115 120 125 Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys
His Ser Arg Ala 130 135 140 Met Leu Arg Pro Gln Phe Ala Arg Glu Gln
Val Ala His Val Thr Ser 145 150 155 160 Leu Glu Pro His Phe Gln Leu
Leu Lys Lys His Ile Leu Lys His Lys 165 170 175 Gly Glu Tyr Phe Asp
Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp 180 185 190 Ser Ala Thr
Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp 195 200 205 Glu
Ser Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys 210 215
220 Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ala Ile Arg
225 230 235 240 Thr Leu Val Gln Thr Phe Tyr Trp Leu Val Asn Asn Lys
Glu Phe Arg 245 250 255 Asp Cys Thr Lys Leu Val His Lys Phe Thr Asn
Tyr Tyr Val Gln Lys 260 265 270 Ala Leu Asp Ala Ser Pro Glu Glu Leu
Glu Lys Gln Ser Gly Tyr Val 275 280 285 Phe Leu Tyr Glu Leu Val Lys
Gln Thr Arg Asp Pro Asn Val Leu Arg 290 295 300 Asp Gln Ser Leu Asn
Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly 305 310 315 320 Leu Leu
Ser Phe Ala Val Phe Glu Leu Ala Arg His Pro Glu Ile Trp 325 330 335
Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp 340
345 350 Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu
Tyr 355 360 365 Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Ile Tyr Pro
Ser Val Pro 370 375 380 Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr
Leu Pro Arg Gly Gly 385 390 395 400 Gly Ser Asp Gly Thr Ser Pro Ile
Leu Ile Gln Lys Gly Glu Ala Val 405 410 415 Ser Tyr Gly Ile Asn Ser
Thr His Leu Asp Pro Val Tyr Tyr Gly Pro 420 425 430 Asp Ala Ala Glu
Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Lys 435 440 445 Lys Leu
Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys 450 455 460
Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg 465
470 475 480 Leu Val Gln Glu Phe Ser His Val Arg Leu Asp Pro Asp Glu
Val Tyr 485 490 495 Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu
Gln Asp Gly Ala 500 505 510 Ile Val Lys Phe Asp 515 101 517 PRT
Candida tropicalis 101 Met Ile Glu Gln Ile Leu Glu Tyr Trp Tyr Ile
Val Val Pro Val Leu 1 5 10 15 Tyr Ile Ile Lys Gln Leu Ile Ala Tyr
Ser Lys Thr Arg Val Leu Met 20 25 30 Lys Gln Leu Gly Ala Ala Pro
Ile Thr Asn Gln Leu Tyr Asp Asn Val 35 40 45 Phe Gly Ile Val Asn
Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly 50 55 60 Arg Ala Gln
Glu Tyr Asn Asp His Lys Phe Asp Ser Ser Lys Asn Pro 65 70 75 80 Ser
Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Lys Ile Val Val 85 90
95 Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110 Asp Phe Ser Leu Gly Lys Arg His Ala Leu Phe Lys Pro Leu
Leu Gly 115 120 125 Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys
His Ser Arg Ser 130 135 140 Met Leu Arg Pro Gln Phe Ala Arg Glu Gln
Val Ala His Val Thr Ser 145 150 155 160 Leu Glu Pro His Phe Gln Leu
Leu Lys Lys His Ile Leu Lys His Lys 165 170 175 Gly Glu Tyr Phe Asp
Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp 180 185 190 Ser Ala Thr
Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp 195 200 205 Glu
Thr Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys 210 215
220 Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ser Ile Arg
225 230 235 240 Ile Leu Val Gln Thr Phe Tyr Trp Leu Ile Asn Asn Lys
Glu Phe Arg 245 250 255 Asp Cys Thr Lys Leu Val His Lys Phe Thr Asn
Tyr Tyr Val Gln Lys 260 265 270 Ala Leu Asp Ala Thr Pro Glu Glu Leu
Glu Lys Gln Gly Gly Tyr Val 275 280 285 Phe Leu Tyr Glu Leu Val Lys
Gln Thr Arg Asp Pro Lys Val Leu Arg 290 295 300 Asp Gln Ser Leu Asn
Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly 305 310 315 320 Leu Leu
Ser Phe Ala Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp 325 330 335
Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp 340
345 350 Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu
Tyr 355 360 365 Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Val Tyr Pro
Ser Val Pro 370 375 380 Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr
Leu Pro Arg Gly Gly 385 390 395 400 Gly Pro Asp Gly Thr Gln Pro Ile
Leu Ile Gln Lys Gly Glu Gly Val 405 410 415 Ser Tyr Gly Ile Asn Ser
Thr His Leu Asp Pro Val Tyr Tyr Gly Pro 420 425 430 Asp Ala Ala Glu
Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg 435 440 445 Lys Leu
Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys 450 455 460
Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg 465
470 475 480 Leu Val Gln Glu Phe Ser His Ile Arg Leu Asp Pro Asp Glu
Val Tyr 485 490 495 Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu
Gln Asp Gly Ala 500 505 510 Ile Val Lys Phe Asp 515 102 512 PRT
Candida tropicalis 102 Met Leu Asp Gln Ile Leu His Tyr Trp Tyr Ile
Val Leu Pro Leu Leu 1 5 10 15 Ala Ile Ile Asn Gln Ile Val Ala His
Val Arg Thr Asn Tyr Leu Met 20 25 30 Lys Lys Leu Gly Ala Lys Pro
Phe Thr His Val Gln Arg Asp Gly Trp 35 40 45 Leu Gly Phe Lys Phe
Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly 50 55 60 Arg Leu Val
Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr 65 70 75 80 Phe
Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro 85 90
95 Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110 Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly
Ile Phe 115 120 125 Thr Leu Asp Ala Glu Gly Trp Lys His Ser Arg Ala
Met Leu Arg Pro 130 135 140 Gln Phe Ala Arg Glu Gln Val Ala His Val
Thr Ser Leu Glu Pro His 145 150 155 160 Phe Gln Leu Leu Lys Lys His
Ile Leu Lys His Lys Gly Glu Tyr Phe 165 170 175 Asp Ile Gln Glu Leu
Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu 180 185 190 Phe Leu Phe
Gly Glu Ser Val His Ser Leu Lys Asp Glu Glu Ile Gly 195 200 205 Tyr
Asp Thr Lys Asp Met Ser Glu Glu Arg Arg Arg Phe Ala Asp Ala 210 215
220 Phe Asn Lys Ser Gln Val Tyr Val Ala Thr Arg Val Ala Leu Gln Asn
225 230 235 240 Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys
Asn Asp Ile 245 250 255 Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
Ala Leu Asp Ala Thr 260 265 270 Pro Glu Glu Leu Glu Lys Gln Gly Gly
Tyr Val Phe Leu Tyr Glu Leu 275 280 285 Val Lys Gln Thr Arg Asp Pro
Lys Val Leu Arg Asp Gln Ser Leu Asn 290 295 300 Ile Leu Leu Ala Gly
Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala 305 310 315 320 Val Phe
Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu 325 330 335
Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp Ser Arg Val Glu Glu 340
345 350 Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val
Leu 355 360 365 Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn Ala Arg
Phe 370 375 380 Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro
Asn Gly Lys 385 390 395 400 Asp Pro Ile Leu Ile Arg Lys Asp Glu Val
Val Gln Tyr Ser Ile Ser 405 410 415 Ala Thr Gln Thr Asn Pro Ala Tyr
Tyr Gly Ala Asp Ala Ala Asp Phe 420 425 430 Arg Pro Glu Arg Trp Phe
Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala 435 440 445 Phe Leu Pro Phe
Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe 450 455 460 Ala Leu
Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe 465 470 475
480 Pro Asn Leu Ser Gln Asp Pro Glu Thr Lys Tyr Pro Pro Pro Arg Leu
485 490 495 Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala His Val Lys
Met Ser 500 505 510
103 512 PRT Candida tropicalis 103
Met Leu Asp
Gln Ile Phe His Tyr Trp Tyr Ile Val Leu Pro Leu Leu 1 5 10 15 Val
Ile Ile Lys Gln Ile Val Ala His Ala Arg Thr Asn Tyr Leu Met 20 25
30 Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Leu Asp Gly Trp
35 40 45 Phe Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala Lys Ser
Ala Gly 50 55 60 Arg Gln Val Asp Leu Ile Ile Ser Arg Phe His Asp
Asn Glu Asp Thr 65 70 75 80 Phe Ser Ser Tyr Ala Phe Gly Asn His Val
Val Phe Thr Arg Asp Pro 85 90 95 Glu Asn Ile Lys Ala Leu Leu Ala
Thr Gln Phe Gly Asp Phe Ser Leu 100 105 110 Gly Ser Arg Val Lys Phe
Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe 115 120 125 Thr Leu Asp Gly
Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro 130 135 140 Gln Phe
Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His 145 150 155
160 Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu Tyr Phe
165 170 175 Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp Ser Ala
Thr Glu 180 185 190 Phe Leu Phe Gly Glu Ser Val His Ser Leu Arg Asp
Glu Glu Ile Gly 195 200 205 Tyr Asp Thr Lys Asp Met Ala Glu Glu Arg
Arg Lys Phe Ala Asp Ala 210 215 220 Phe Asn Lys Ser Gln Val Tyr Leu
Ser Thr Arg Val Ala Leu Gln Thr 225 230 235 240 Leu Tyr Trp Leu Val
Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile 245 250 255 Val His Lys
Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr 260 265 270 Pro
Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu 275 280
285 Ala Lys Gln Thr Lys Asp Pro Asn Val Leu Arg Asp Gln Ser Leu Asn
290 295 300 Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser
Phe Ala 305 310 315 320 Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp
Ala Lys Leu Arg Glu 325 330 335 Glu Ile Glu Ser His Phe Gly Leu Gly
Glu Asp Ser Arg Val Glu Glu 340 345 350 Ile Thr Phe Glu Ser Leu Lys
Arg Cys Glu Tyr Leu Lys Ala Val Leu 355 360 365 Asn Glu Thr Leu Arg
Leu His Pro Ser Val Pro Arg Asn Ala Arg Phe 370 375 380 Ala Ile Lys
Asp Thr Thr Leu Pro Arg Gly Gly Gly Pro Asn Gly Lys 385 390 395 400
Asp Pro Ile Leu Ile Arg Lys Asn Glu Val Val Gln Tyr Ser Ile Ser 405
410 415 Ala Thr Gln Thr Asn Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp
Phe 420 425 430 Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu
Gly Trp Ala 435 440 445 Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys
Leu Gly Gln Gln Phe 450 455 460 Ala Leu Thr Glu Ala Gly Tyr Val Leu
Val Arg Leu Val Gln Glu Phe 465 470 475 480 Pro Ser Leu Ser Gln Asp
Pro Glu Thr Glu Tyr Pro Pro Pro Arg Leu 485 490 495 Ala His Leu Thr
Met Cys Leu Phe Asp Gly Ala Tyr Val Lys Met Gln 500 505 510
104 499 PRT Candida tropicalis 104
Met Ala Ile Ser Ser Leu Leu Ser Trp Asp
Val Ile Cys Val Val Phe 1 5 10 15 Ile Cys Val Cys Val Tyr Phe Gly
Tyr Glu Tyr Cys Tyr Thr Lys Tyr 20 25 30 Leu Met His Lys His Gly
Ala Arg Glu Ile Glu Asn Val Ile Asn Asp 35 40 45 Gly Phe Phe Gly
Phe Arg Leu Pro Leu Leu Leu Met Arg Ala Ser Asn 50 55 60 Glu Gly
Arg Leu Ile Glu Phe Ser Val Lys Arg Phe Glu Ser Ala Pro 65 70 75 80
His Pro Gln Asn Lys Thr Leu Val Asn Arg Ala Leu Ser Val Pro Val 85
90 95 Ile Leu Thr Lys Asp Pro Val Asn Ile Lys Ala Met Leu Ser Thr
Gln 100 105 110 Phe Asp Asp Phe Ser Leu Gly Leu Arg Leu His Gln Phe
Ala Pro Leu 115 120 125 Leu Gly Lys Gly Ile Phe Thr Leu Asp Gly Pro
Glu Trp Lys Gln Ser 130 135 140 Arg Ser Met Leu Arg Pro Gln Phe Ala
Lys Asp Arg Val Ser His Ile 145 150 155 160 Leu Asp Leu Glu Pro His
Phe Val Leu Leu Arg Lys His Ile Asp Gly 165 170 175 His Asn Gly Asp
Tyr Phe Asp Ile Gln Glu Leu Tyr Phe Arg Phe Ser 180 185 190 Met Asp
Val Ala Thr Gly Phe Leu Phe Gly Glu Ser Val Gly Ser Leu 195 200 205
Lys Asp Glu Asp Ala Arg Phe Leu Glu Ala Phe Asn Glu Ser Gln Lys 210
215 220 Tyr Leu Ala Thr Arg Ala Thr Leu His Glu Leu Tyr Phe Leu Cys
Asp 225 230 235 240 Gly Phe Arg Phe Arg Gln Tyr Asn Lys Val Val Arg
Lys Phe Cys Ser 245 250 255 Gln Cys Val His Lys Ala Leu Asp Val Ala
Pro Glu Asp Thr Ser Glu 260 265 270 Tyr Val Phe Leu Arg Glu Leu Val
Lys His Thr Arg Asp Pro Val Val 275 280 285 Leu Gln Asp Gln Ala Leu
Asn Val Leu Leu Ala Gly Arg Asp Thr Thr 290 295 300 Ala Ser Leu Leu
Ser Phe Ala Thr Phe Glu Leu Ala Arg Asn Asp His 305 310 315 320 Met
Trp Arg Lys Leu Arg Glu Glu Val Ile Leu Thr Met Gly Pro Ser 325 330
335 Ser Asp Glu Ile Thr Val Ala Gly Leu Lys Ser Cys Arg Tyr Leu Lys
340 345 350 Ala Ile Leu Asn Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro
Arg Asn 355 360 365 Ala Arg Phe Ala Thr Arg Asn Thr Thr Leu Pro Arg
Gly Gly Gly Pro 370 375 380 Asp Gly Ser Phe Pro Ile Leu Ile Arg Lys
Gly Gln Pro Val Gly Tyr 385 390 395 400 Phe Ile Cys Ala Thr His Leu
Asn Glu Lys Val Tyr Gly Asn Asp Ser 405 410 415 His Val Phe Arg Pro
Glu Arg Trp Ala Ala Leu Glu Gly Lys Ser Leu 420 425 430 Gly Trp Ser
Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ser Cys Leu Gly 435 440 445 Gln
Gln Phe Ala Ile Leu Glu Ala Ser Tyr Val Leu Ala Arg Leu Thr 450 455
460 Gln Cys Tyr Thr Thr Ile Gln Leu Arg Thr Thr Glu Tyr Pro Pro Lys
465 470 475 480 Lys Leu Val His Leu Thr Met Ser Leu Leu Asn Gly Val
Tyr Ile Arg 485 490 495 Thr Arg Thr
105 1712 DNA Candida tropicalis 105
ggtaccgagc tcacgagttt tgggattttc gagtttggat tgtttccttt
gttgattgaa 60 ttgacgaaac cagaggtttt caagacagat aagattgggt
ttatcaaaac gcagtttgaa 120 atattccagt tggtttccaa gatatcttga
agaagattga cgatttgaaa tttgaagaag 180 tggagaagat ctggtttgga
ttgttggaga atttcaagaa tctcaagatt tactctaacg 240 acgggtacaa
cgagaattgt attgaattga tcaagaacat gatcttggtg ttacagaaca 300
tcaagttctt ggaccagact gagaatgcca cagatataca aggcgtcatg tgataaaatg
360 gatgagattt atcccacaat tgaagaaaga gtttatggaa agtggtcaac
cagaagctaa 420 acaggaagaa gcaaacgaag aggtgaaaca agaagaagaa
ggtaaataag tattttgtat 480 tatataacaa acaaagtaag gaatacagat
ttatacaata aattgccata ctagtcacgt 540 gagatatctc atccattccc
caactcccaa gaaaaaaaaa aagtgaaaaa aaaaatcaaa 600 cccaaagatc
aacctcccca tcatcatcgt catcaaaccc ccagctcaat tcgcaatggt 660
tagcacaaaa acatacacag aaagggcatc agcacacccc tccaaggttg cccaacgttt
720 attccgctta atggagtcca aaaagaccaa cctctgcgcc tcgatcgacg
tgaccacaac 780 cgccgagttc ctttcgctca tcgacaagct cggtccccac
atctgtctcg tgaagacgca 840 catcgatatc atctcagact tcagctacga
gggcacgatt gagccgttgc ttgtgcttgc 900 agagcgccac gggttcttga
tattcgagga caggaagttt gctgatatcg gaaacaccgt 960 gatgttgcag
tacacctcgg gggtataccg gatcgcggcg tggagtgaca tcacgaacgc 1020
gcacggagtg actgggaagg gcgtcgttga agggttgaaa cgcggtgcgg agggggtaga
1080 aaaggaaagg ggcgtgttga tgttggcgga gttgtcgagt aaaggctcgt
tggcgcatgg 1140 tgaatatacc cgtgagacga tcgagattgc gaagagtgat
cgggagttcg tgattgggtt 1200 catcgcgcag cgggacatgg ggggtagaga
agaagggttt gattggatca tcatgacgcc 1260 tggtgtgggg ttggatgata
aaggcgatgc gttgggccag cagtatagga ctgttgatga 1320 ggtggttctg
actggtaccg atgtgattat tgtcgggaga gggttgtttg gaaaaggaag 1380
agaccctgag gtggagggaa agagatacag ggatgctgga tggaaggcat acttgaagag
1440 aactggtcag ttagaataaa tattgtaata aataggtcta tatacataca
ctaagcttct 1500 aggacgtcat tgtagtcttc gaagttgtct gctagtttag
ttctcatgat ttcgaaaacc 1560 aataacgcaa tggatgtagc agggatggtg
gttagtgcgt tcctgacaaa cccagagtac 1620 gccgcctcaa accacgtcac
attcgccctt tgcttcatcc gcatcacttg cttgaaggta 1680 tccacgtacg
agttgtaata caccttgaag aa 1712
106 267 PRT Candida tropicalis 106
Met
Val Ser Thr Lys Thr Tyr Thr Glu Arg Ala Ser Ala His Pro Ser 1 5 10
15 Lys Val Ala Gln Arg Leu Phe Arg Leu Met Glu Ser Lys Lys Thr Asn
20 25 30 Leu Cys Ala Ser Ile Asp Val Thr Thr Thr Ala Glu Phe Leu
Ser Leu 35 40 45 Ile Asp Lys Leu Gly Pro His Ile Cys Leu Val Lys
Thr His Ile Asp 50 55 60 Ile Ile Ser Asp Phe Ser Tyr Glu Gly Thr
Ile Glu Pro Leu Leu Val 65 70 75 80 Leu Ala Glu Arg His Gly Phe Leu
Ile Phe Glu Asp Arg Lys Phe Ala 85 90 95 Asp Ile Gly Asn Thr Val
Met Leu Gln Tyr Thr Ser Gly Val Tyr Arg 100 105 110 Ile Ala Ala Trp
Ser Asp Ile Thr Asn Ala His Gly Val Thr Gly Lys 115 120 125 Gly Val
Val Glu Gly Leu Lys Arg Gly Ala Glu Gly Val Glu Lys Glu 130 135 140
Arg Gly Val Leu Met Leu Ala Glu Leu Ser Ser Lys Gly Ser Leu Ala 145
150 155 160 His Gly Glu Tyr Thr Arg Glu Thr Ile Glu Ile Ala Lys Ser
Asp Arg 165 170 175 Glu Phe Val Ile Gly Phe Ile Ala Gln Arg Asp Met
Gly Gly Arg Glu 180 185 190 Glu Gly Phe Asp Trp Ile Ile Met Thr Pro
Gly Val Gly Leu Asp Asp 195 200 205 Lys Gly Asp Ala Leu Gly Gln Gln
Tyr Arg Thr Val Asp Glu Val Val 210 215 220 Leu Thr Gly Thr Asp Val
Ile Ile Val Gly Arg Gly Leu Phe Gly Lys 225 230 235 240 Gly Arg Asp
Pro Glu Val Glu Gly Lys Arg Tyr Arg Asp Ala Gly Trp 245 250 255 Lys
Ala Tyr Leu Lys Arg Thr Gly Gln Leu Glu 260 265
107 473 DNA Artificial Sequence Description of Artificial Sequence Primer 107
gtcaaagcaa attgttggcc caagcagact cttggaccac cgttgaatgg aacataagcc
60 cagcccaact tcttagtaga tggttcaaac catctttctg gtctgaagtc
gttagcgtcc 120 ttaccgtagt attcttccaa acggtgggtc ttgtagacaa
cgtaagcaac agtggagcct 180 ttaggaatgt agattgggtc ggtaccgtta
gcaccaccac ctcttggcaa agtggtgtct 240 ctggtggcgg ttctaaagtt
gacaggaaca gatgggtaca tacgcaaggt ttcgttaagg 300 atagccttca
agtattcaca tctcttcaag gcttcgaaag taatttcttc aacgcgggag 360
tcttcaccaa caccaaagtt aacttcgatt tcttctctca acttggacca catctctggg
420 tgtctagcca attcaaacaa agcaaaggac aacaaacccg cggtggtgtc tct 473
108 540 DNA Candida tropicalis 108
tactaacttg ttgaggatct tataaccata
cagcaacacg gtcacaacat gtagtagttt 60 gttgaggaac gtatgtgttt
ctgagcgcag aactactttt tcaacccacg acgaggtcag 120 tgtttgttca
acatgctgtt gcgaaagcca tagcagttac ctaccttccg agaggtcaag 180
ttctttctcc cgtcccgagt tctcatgttg ctaatgttca aactggtgag gttcttgggt
240 tcgcacccgt ggatgcagtc ataagaaaag ccgtggtcct agcagcactg
gtttctaggt 300 ctcttatagt ttcgataaaa ccgttgggtc aaaccactaa
aaagaaaccc gttctccgtg 360 tgagaaaaat tcggaaacaa tccactaccc
tagaagtgta acctgccgct tccgaccttc 420 gtgtcgtctc ggtacaactc
tggtgtcaaa cggtctcttg ttcaacgagt acactgcagc 480 aaccttggtg
tgaaggtcaa caacttcttc gtataagaat tcgtgttccc acttatgaaa 540
109 29 DNA Bacteriophage T7 109
ggatcctaat acgactcact atagggagg 29
110 523 PRT Candida tropicalis 110
Met Ala Thr Gln Glu Ile Ile Asp Ser Val
Leu Pro Tyr Leu Thr Lys 1 5 10 15 Trp Tyr Thr Val Ile Thr Ala Ala
Val Leu Val Phe Leu Ile Ser Thr 20 25 30 Asn Ile Lys Asn Tyr Val
Lys Ala Lys Lys Leu Lys Cys Val Asp Pro 35 40 45 Pro Tyr Leu Lys
Asp Ala Gly Leu Thr Gly Ile Ser Ser Leu Ile Ala 50 55 60 Ala Ile
Lys Ala Lys Asn Asp Gly Arg Leu Ala Asn Phe Ala Asp Glu 65 70 75 80
Val Phe Asp Glu Tyr Pro Asn His Thr Phe Tyr Leu Ser Val Ala Gly 85
90 95 Ala Leu Lys Ile Val Met Thr Val Asp Pro Glu Asn Ile Lys Ala
Val 100 105 110 Leu Ala Thr Gln Phe Thr Asp Phe Ser Leu Gly Thr Arg
His Ala His 115 120 125 Phe Ala Pro Leu Leu Gly Asp Gly Ile Phe Thr
Leu Asp Gly Glu Gly 130 135 140 Trp Lys His Ser Arg Ala Met Leu Arg
Pro Gln Phe Ala Arg Asp Gln 145 150 155 160 Ile Gly His Val Lys Ala
Leu Glu Pro His Ile Gln Ile Met Ala Lys 165 170 175 Gln Ile Lys Leu
Asn Gln Gly Lys Thr Phe Asp Ile Gln Glu Leu Phe 180 185 190 Phe Arg
Phe Thr Val Asp Thr Ala Thr Glu Phe Leu Phe Gly Glu Ser 195 200 205
Val His Ser Leu Tyr Asp Glu Lys Leu Gly Ile Pro Thr Pro Asn Glu 210
215 220 Ile Pro Gly Arg Glu Asn Phe Ala Ala Ala Phe Asn Val Ser Gln
His 225 230 235 240 Tyr Leu Ala Thr Arg Ser Tyr Ser Gln Thr Phe Tyr
Phe Leu Thr Asn 245 250 255 Pro Lys Glu Phe Arg Asp Cys Asn Ala Lys
Val His His Leu Ala Lys 260 265 270 Tyr Phe Val Asn Lys Ala Leu Asn
Phe Thr Pro Glu Glu Leu Glu Glu 275 280 285 Lys Ser Lys Ser Gly Tyr
Val Phe Leu Tyr Glu Leu Val Lys Gln Thr 290 295 300 Arg Asp Pro Lys
Val Leu Gln Asp Gln Leu Leu Asn Ile Met Val Ala 305 310 315 320 Gly
Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala Leu Phe Glu Leu 325 330
335 Ala Arg His Pro Glu Met Trp Ser Lys Leu Arg Glu Glu Ile Glu Val
340 345 350 Asn Phe Gly Val Gly Glu Asp Ser Arg Val Glu Glu Ile Thr
Phe Glu 355 360 365 Ala Leu Lys Arg Cys Glu Tyr Leu Lys Ala Ile Leu
Asn Glu Thr Leu 370 375 380 Arg Met Tyr Pro Ser Val Pro Val Asn Phe
Arg Thr Ala Thr Arg Asp 385 390 395 400 Thr Thr Leu Pro Arg Gly Gly
Gly Ala Asn Gly Thr Asp Pro Ile Tyr 405 410 415 Ile Pro Lys Gly Ser
Thr Val Ala Tyr Val Val Tyr Lys Thr His Arg 420 425 430 Leu Glu Glu
Tyr Tyr Gly Lys Asp Ala Asn Asp Phe Arg Pro Glu Arg 435 440 445 Trp
Phe Glu Pro Ser Thr Lys Lys Leu Gly Trp Ala Tyr Val Pro Phe 450 455
460 Asn Gly Gly Pro Arg Val Cys Leu Gly Gln Gln Phe Ala Leu Thr Glu
465 470 475 480 Ala Ser Tyr Val Ile Thr Arg Leu Ala Gln Met Phe Glu
Thr Val Ser 485 490 495 Ser Asp Pro Gly Leu Glu Tyr Pro Pro Pro Lys
Cys Ile His Leu Thr 500 505 510 Met Ser His Asn Asp Gly Val Phe Val
Lys Met 515 520
111 540 PRT Candida tropicalis 111
Met Ser Ser Ser Pro Ser Phe
Ala Gln Glu Val Leu Ala Thr Thr Ser 1 5 10 15 Pro Tyr Ile Glu Tyr
Phe Leu Asp Asn Tyr Thr Arg Trp Tyr Tyr Phe 20 25 30 Ile Pro Leu
Val Leu Leu Ser Leu Asn Phe Ile Ser Leu Leu His Thr 35 40 45 Lys
Tyr Leu Glu Arg Arg Phe His Ala Lys Pro Leu Gly Asn Val Val 50 55
60 Leu Asp Pro Thr Phe Gly Ile Ala Thr Pro Leu Ile Leu Ile Tyr Leu
65 70 75 80 Lys Ser Lys Gly Thr Val Met Lys Phe Ala Trp Ser Phe Trp
Asn Asn 85 90 95 Lys Tyr Ile Val Lys Asp Pro Lys Tyr Lys Thr Thr
Gly Leu Arg Ile 100 105 110 Val Gly Leu Pro Leu Ile Glu Thr Ile Asp
Pro Glu Asn Ile Lys Ala 115 120 125 Val Leu Ala Thr Gln Phe Asn Asp
Phe Ser Leu Gly Thr Arg His Asp 130 135 140 Phe Leu Tyr Ser Leu Leu
Gly Asp Gly Ile Phe Thr Leu Asp Gly Ala 145 150 155 160 Gly Trp Lys
His Ser Arg Thr Met Leu Arg Pro Gln Phe Ala Arg Glu 165 170 175 Gln
Val Ser His Val Lys Leu Leu Glu Pro His Val Gln Val Phe Phe 180 185
190 Lys His Val Arg Lys His Arg Gly Gln Thr Phe Asp Ile Gln Glu Leu
195 200 205 Phe Phe Arg Leu Thr Val Asp Ser Ala Thr Glu Phe Leu Phe
Gly Glu 210 215 220 Ser Ala Glu Ser Leu Arg Asp Asp Ser Val Gly Leu
Thr Pro Thr Thr 225 230 235 240 Lys Asp Phe Glu Gly Arg Gly Asp Phe
Ala Asp Ala Phe Asn Tyr Ser 245 250 255 Gln Thr Tyr Gln Ala Tyr Arg
Phe Leu Leu Gln Gln Met Tyr Trp Ile 260 265 270 Leu Asn Gly Ala Glu
Phe Arg Lys Ser Ile Ala Ile Val His Lys Phe 275 280 285 Ala Asp His
Tyr Val Gln Lys Ala Leu Glu Leu Thr Asp Asp Asp Leu 290 295 300 Gln
Lys Gln Asp Gly Tyr Val Phe Leu Tyr Glu Leu Ala Lys Gln Thr 305 310
315 320 Arg Asp Pro Lys Val Leu Arg Asp Gln Leu Leu Asn Ile Leu Val
Ala 325 330 335 Gly Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Val Phe
Tyr Glu Leu 340 345 350 Ser Arg Asn Pro Glu Val Phe Ala Lys Leu Arg
Glu Glu Val Glu Asn 355 360 365 Arg Phe Gly Leu Gly Glu Glu Ala Arg
Val Glu Glu Ile Ser Phe Glu 370 375 380 Ser Leu Lys Ser Cys Glu Tyr
Leu Lys Ala Val Ile Asn Glu Ala Leu 385 390 395 400 Arg Leu Tyr Pro
Ser Val Pro His Asn Phe Arg Val Ala Thr Arg Asn 405 410 415 Thr Thr
Leu Pro Arg Gly Gly Gly Lys Asp Gly Cys Ser Pro Ile Val 420 425 430
Val Lys Lys Gly Gln Val Val Met Tyr Thr Val Ile Gly Thr His Arg 435
440 445 Asp Pro Ser Ile Tyr Gly Ala Asp Ala Asp Val Phe Arg Pro Glu
Arg 450 455 460 Trp Phe Glu Pro Glu Thr Arg Lys Leu Gly Trp Ala Tyr
Val Pro Phe 465 470 475 480 Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln
Gln Phe Ala Leu Thr Glu 485 490 495 Ala Ser Tyr Val Thr Val Arg Leu
Leu Gln Glu Phe Gly Asn Leu Ser 500 505 510 Ser Asp Pro Asn Ala Glu
Tyr Pro Pro Lys Leu Gln Asn Thr Leu Thr 515 520 525 Leu Ser Leu Phe
Asp Gly Ala Asp Val Arg Met Phe 530 535 540
112 517 PRT Candida tropicalis 112
Met Ile Glu Gln Leu Leu Glu Tyr Trp Tyr Val
Val Val Pro Val Leu 1 5 10 15 Tyr Ile Ile Lys Gln Leu Leu Ala Tyr
Thr Lys Thr Arg Val Leu Met 20 25 30 Lys Lys Leu Gly Ala Ala Pro
Val Thr Asn Lys Leu Tyr Asp Asn Ala 35 40 45 Phe Gly Ile Val Asn
Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly 50 55 60 Arg Ala Gln
Glu Tyr Asn Asp Tyr Lys Phe Asp His Ser Lys Asn Pro 65 70 75 80 Ser
Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Arg Ile Val Val 85 90
95 Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110 Asp Phe Ser Leu Gly Lys Arg His Thr Leu Phe Lys Pro Leu
Leu Gly 115 120 125 Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys
His Ser Arg Ala 130 135 140 Met Leu Arg Pro Gln Phe Ala Arg Glu Gln
Val Ala His Val Thr Ser 145 150 155 160 Leu Glu Pro His Phe Gln Leu
Leu Lys Lys His Ile Leu Lys His Lys 165 170 175 Gly Glu Tyr Phe Asp
Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp 180 185 190 Ser Ala Thr
Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp 195 200 205 Glu
Ser Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys 210 215
220 Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ala Ile Arg
225 230 235 240 Thr Leu Val Gln Thr Phe Tyr Trp Leu Val Asn Asn Lys
Glu Phe Arg 245 250 255 Asp Cys Thr Lys Ser Val His Lys Phe Thr Asn
Tyr Tyr Val Gln Lys 260 265 270 Ala Leu Asp Ala Ser Pro Glu Glu Leu
Glu Lys Gln Ser Gly Tyr Val 275 280 285 Phe Leu Tyr Glu Leu Val Lys
Gln Thr Arg Asp Pro Asn Val Leu Arg 290 295 300 Asp Gln Ser Leu Asn
Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly 305 310 315 320 Leu Leu
Ser Phe Ala Val Phe Glu Leu Ala Arg His Pro Glu Ile Trp 325 330 335
Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp 340
345 350 Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu
Tyr 355 360 365 Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Ile Tyr Pro
Ser Val Pro 370 375 380 Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr
Leu Pro Arg Gly Gly 385 390 395 400 Gly Ser Asp Gly Thr Ser Pro Ile
Leu Ile Gln Lys Gly Glu Ala Val 405 410 415 Ser Tyr Gly Ile Asn Ser
Thr His Leu Asp Pro Val Tyr Tyr Gly Pro 420 425 430 Asp Ala Ala Glu
Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Lys 435 440 445 Lys Leu
Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys 450 455 460
Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg 465
470 475 480 Leu Val Gln Glu Phe Ser His Val Arg Ser Asp Pro Asp Glu
Val Tyr 485 490 495 Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu
Gln Asp Gly Ala 500 505 510 Ile Val Lys Phe Asp 515
113 517 PRT Candida tropicalis 113
Met Ile Glu Gln Ile Leu Glu Tyr Trp Tyr Ile
Val Val Pro Val Leu 1 5 10 15 Tyr Ile Ile Lys Gln Leu Ile Ala Tyr
Ser Lys Thr Arg Val Leu Met 20 25 30 Lys Gln Leu Gly Ala Ala Pro
Ile Thr Asn Gln Leu Tyr Asp Asn Val 35 40 45 Phe Gly Ile Val Asn
Gly Trp Lys Ala Leu Gln Phe Lys Lys Glu Gly 50 55 60 Arg Ala Gln
Glu Tyr Asn Asp His Lys Phe Asp Ser Ser Lys Asn Pro 65 70 75 80 Ser
Val Gly Thr Tyr Val Ser Ile Leu Phe Gly Thr Lys Ile Val Val 85 90
95 Thr Lys Asp Pro Glu Asn Ile Lys Ala Ile Leu Ala Thr Gln Phe Gly
100 105 110 Asp Phe Ser Leu Gly Lys Arg His Ala Leu Phe Lys Pro Leu
Leu Gly 115 120 125 Asp Gly Ile Phe Thr Leu Asp Gly Glu Gly Trp Lys
His Ser Arg Ser 130 135 140 Met Leu Arg Pro Gln Phe Ala Arg Glu Gln
Val Ala His Val Thr Ser 145 150 155 160 Leu Glu Pro His Phe Gln Leu
Leu Lys Lys His Ile Leu Lys His Lys 165 170 175 Gly Glu Tyr Phe Asp
Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp 180 185 190 Ser Ala Thr
Glu Phe Leu Phe Gly Glu Ser Val His Ser Leu Lys Asp 195 200 205 Glu
Thr Ile Gly Ile Asn Gln Asp Asp Ile Asp Phe Ala Gly Arg Lys 210 215
220 Asp Phe Ala Glu Ser Phe Asn Lys Ala Gln Glu Tyr Leu Ser Ile Arg
225 230 235 240 Ile Leu Val Gln Thr Phe Tyr Trp Leu Ile Asn Asn Lys
Glu Phe Arg 245 250 255 Asp Cys Thr Lys Ser Val His Lys Phe Thr Asn
Tyr Tyr Val Gln Lys 260 265 270 Ala Leu Asp Ala Thr Pro Glu Glu Leu
Glu Lys Gln Gly Gly Tyr Val 275 280 285 Phe Leu Tyr Glu Leu Val Lys
Gln Thr Arg Asp Pro Lys Val Leu Arg 290 295 300 Asp Gln Ser Leu Asn
Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly 305 310 315 320 Leu Leu
Ser Phe Ala Val Phe Glu Leu Ala Arg Asn Pro His Ile Trp 325 330 335
Ala Lys Leu Arg Glu Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp 340
345 350 Ser Arg Val Glu Glu Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu
Tyr 355 360 365 Leu Lys Ala Phe Leu Asn Glu Thr Leu Arg Val Tyr Pro
Ser Val Pro 370 375 380 Arg Asn Phe Arg Ile Ala Thr Lys Asn Thr Thr
Leu Pro Arg Gly Gly 385 390 395 400 Gly Pro Asp Gly Thr Gln Pro Ile
Leu Ile Gln Lys Gly Glu Gly Val 405 410 415 Ser Tyr Gly Ile Asn Ser
Thr His Leu Asp Pro Val Tyr Tyr Gly Pro 420 425 430 Asp Ala Ala Glu
Phe Arg Pro Glu Arg Trp Phe Glu Pro Ser Thr Arg 435 440 445 Lys Leu
Gly Trp Ala Tyr Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys 450 455 460
Leu Gly Gln Gln Phe Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg 465
470 475 480 Leu Val Gln Glu Phe Ser His Ile Arg Ser Asp Pro Asp Glu
Val Tyr 485 490 495 Pro Pro Lys Arg Leu Thr Asn Leu Thr Met Cys Leu
Gln Asp Gly Ala 500 505 510 Ile Val Lys Phe Asp 515
114 512 PRT Candida tropicalis 114
Met Leu Asp Gln Ile Leu His Tyr Trp Tyr Ile
Val Leu Pro Leu Leu 1 5 10 15 Ala Ile Ile Asn Gln Ile Val Ala His
Val Arg Thr Asn Tyr Leu Met 20 25 30 Lys Lys Leu Gly Ala Lys Pro
Phe Thr His Val Gln Arg Asp Gly Trp 35 40 45 Leu Gly Phe Lys Phe
Gly Arg Glu Phe Leu Lys Ala Lys Ser Ala Gly 50 55 60 Arg Ser Val
Asp Leu Ile Ile Ser Arg Phe His Asp Asn Glu Asp Thr 65 70 75 80 Phe
Ser Ser Tyr Ala Phe Gly Asn His Val Val Phe Thr Arg Asp Pro 85 90
95 Glu Asn Ile Lys Ala Leu Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu
100 105 110 Gly Ser Arg Val Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly
Ile Phe 115 120 125 Thr Leu Asp Ala Glu Gly Trp Lys His Ser Arg Ala
Met Leu Arg Pro 130 135 140 Gln Phe Ala Arg Glu Gln Val Ala His Val
Thr Ser Leu Glu Pro His 145 150 155 160 Phe Gln Leu Leu Lys Lys His
Ile Leu Lys His Lys Gly Glu Tyr Phe 165 170 175 Asp Ile Gln Glu Leu
Phe Phe Arg Phe Thr Val Asp Ser Ala Thr Glu 180 185 190 Phe Leu Phe
Gly Glu Ser Val His Ser Leu Lys Asp Glu Glu Ile Gly 195 200 205 Tyr
Asp Thr Lys Asp Met Ser Glu Glu Arg Arg Arg Phe Ala Asp Ala 210 215
220 Phe Asn Lys Ser Gln Val Tyr Val Ala Thr Arg Val Ala Leu Gln Asn
225 230 235 240 Leu Tyr Trp Leu Val Asn Asn Lys Glu Phe Lys Glu Cys
Asn Asp Ile 245 250 255 Val His Lys Phe Thr Asn Tyr Tyr Val Gln Lys
Ala Leu Asp Ala Thr 260 265 270 Pro Glu Glu Leu Glu Lys Gln Gly Gly
Tyr Val Phe Leu Tyr Glu Leu 275 280 285 Val Lys Gln Thr Arg Asp Pro
Lys Val Leu Arg Asp Gln Ser Leu Asn 290 295 300 Ile Leu Leu Ala Gly
Arg Asp Thr Thr Ala Gly Leu Leu Ser Phe Ala 305 310 315 320 Val Phe
Glu Leu Ala Arg Asn Pro His Ile Trp Ala Lys Leu Arg Glu 325 330 335
Glu Ile Glu Gln Gln Phe Gly Leu Gly Glu Asp Ser Arg Val Glu Glu 340
345 350 Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala Val
Leu 355 360 365 Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg Asn
Ala Arg Phe 370 375 380 Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly Gly
Gly Pro Asn Gly Lys 385 390 395 400 Asp Pro Ile Leu Ile Arg Lys Asp
Glu Val Val Gln Tyr Ser Ile Ser 405 410 415 Ala Thr Gln Thr Asn Pro
Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe 420 425 430 Arg Pro Glu Arg
Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala 435 440 445 Phe Leu
Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe 450 455 460
Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe 465
470 475 480 Pro Asn Leu Ser Gln Asp Pro Glu Thr Lys Tyr Pro Pro Pro
Arg Leu 485 490 495 Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala His
Val Lys Met Ser 500 505 510
115 512 PRT Candida tropicalis 115
Met
Leu Asp Gln Ile Phe His Tyr Trp Tyr Ile Val Leu Pro Leu Leu 1 5 10
15 Val Ile Ile Lys Gln Ile Val Ala His Ala Arg Thr Asn Tyr Leu Met
20 25 30 Lys Lys Leu Gly Ala Lys Pro Phe Thr His Val Gln Leu Asp
Gly Trp 35 40 45 Phe Gly Phe Lys Phe Gly Arg Glu Phe Leu Lys Ala
Lys Ser Ala Gly 50 55 60 Arg Gln Val Asp Leu Ile Ile Ser Arg Phe
His Asp Asn Glu Asp Thr 65 70 75 80 Phe Ser Ser Tyr Ala Phe Gly Asn
His Val Val Phe Thr Arg Asp Pro 85 90 95 Glu Asn Ile Lys Ala Leu
Leu Ala Thr Gln Phe Gly Asp Phe Ser Leu 100 105 110 Gly Ser Arg Val
Lys Phe Phe Lys Pro Leu Leu Gly Tyr Gly Ile Phe 115 120 125 Thr Leu
Asp Gly Glu Gly Trp Lys His Ser Arg Ala Met Leu Arg Pro 130 135 140
Gln Phe Ala Arg Glu Gln Val Ala His Val Thr Ser Leu Glu Pro His 145
150 155 160 Phe Gln Leu Leu Lys Lys His Ile Leu Lys His Lys Gly Glu
Tyr Phe 165 170 175 Asp Ile Gln Glu Leu Phe Phe Arg Phe Thr Val Asp
Ser Ala Thr Glu 180 185 190 Phe Leu Phe Gly Glu Ser Val His Ser Leu
Arg Asp Glu Glu Ile Gly 195 200 205 Tyr Asp Thr Lys Asp Met Ala Glu
Glu Arg Arg Lys Phe Ala Asp Ala 210 215 220 Phe Asn Lys Ser Gln Val
Tyr Leu Ser Thr Arg Val Ala Leu Gln Thr 225 230 235 240 Leu Tyr Trp
Leu Val Asn Asn Lys Glu Phe Lys Glu Cys Asn Asp Ile 245 250 255 Val
His Lys Phe Thr Asn Tyr Tyr Val Gln Lys Ala Leu Asp Ala Thr 260 265
270 Pro Glu Glu Leu Glu Lys Gln Gly Gly Tyr Val Phe Leu Tyr Glu Leu
275 280 285 Ala Lys Gln Thr Lys Asp Pro Asn Val Leu Arg Asp Gln Ser
Leu Asn 290 295 300 Ile Leu Leu Ala Gly Arg Asp Thr Thr Ala Gly Leu
Leu Ser Phe Ala 305 310 315 320 Val Phe Glu Leu Ala Arg Asn Pro His
Ile Trp Ala Lys Leu Arg Glu 325 330 335
Glu Ile Glu Ser His Phe Gly Ser Gly Glu Asp Ser Arg Val Glu Glu
340 345 350 Ile Thr Phe Glu Ser Leu Lys Arg Cys Glu Tyr Leu Lys Ala
Val Leu 355 360 365 Asn Glu Thr Leu Arg Leu His Pro Ser Val Pro Arg
Asn Ala Arg Phe 370 375 380 Ala Ile Lys Asp Thr Thr Leu Pro Arg Gly
Gly Gly Pro Asn Gly Lys 385 390 395 400 Asp Pro Ile Leu Ile Arg Lys
Asn Glu Val Val Gln Tyr Ser Ile Ser 405 410 415 Ala Thr Gln Thr Asn
Pro Ala Tyr Tyr Gly Ala Asp Ala Ala Asp Phe 420 425 430 Arg Pro Glu
Arg Trp Phe Glu Pro Ser Thr Arg Asn Leu Gly Trp Ala 435 440 445 Tyr
Leu Pro Phe Asn Gly Gly Pro Arg Ile Cys Leu Gly Gln Gln Phe 450 455
460 Ala Leu Thr Glu Ala Gly Tyr Val Leu Val Arg Leu Val Gln Glu Phe
465 470 475 480 Pro Ser Leu Ser Gln Asp Pro Glu Thr Glu Tyr Pro Pro
Pro Arg Leu 485 490 495 Ala His Leu Thr Met Cys Leu Phe Asp Gly Ala
Tyr Val Lys Met Gln 500 505 510
116 499 PRT Candida tropicalis 116
Met Ala Ile Ser Ser Leu Leu Ser Trp Asp Val Ile Cys Val Val Phe 1 5
10 15 Ile Cys Val Cys Val Tyr Phe Gly Tyr Glu Tyr Cys Tyr Thr Lys
Tyr 20 25 30 Leu Met His Lys His Gly Ala Arg Glu Ile Glu Asn Val
Ile Asn Asp 35 40 45 Gly Phe Phe Gly Phe Arg Leu Pro Leu Leu Leu
Met Arg Ala Ser Asn 50 55 60 Glu Gly Arg Leu Ile Glu Phe Ser Val
Lys Arg Phe Glu Ser Ala Pro 65 70 75 80 His Pro Gln Asn Lys Thr Leu
Val Asn Arg Ala Leu Ser Val Pro Val 85 90 95 Ile Leu Thr Lys Asp
Pro Val Asn Ile Lys Ala Met Leu Ser Thr Gln 100 105 110 Phe Asp Asp
Phe Ser Leu Gly Leu Arg Leu His Gln Phe Ala Pro Leu 115 120 125 Leu
Gly Lys Gly Ile Phe Thr Leu Asp Gly Pro Glu Trp Lys Gln Ser 130 135
140 Arg Ser Met Leu Arg Pro Gln Phe Ala Lys Asp Arg Val Ser His Ile
145 150 155 160 Ser Asp Leu Glu Pro His Phe Val Leu Leu Arg Lys His
Ile Asp Gly 165 170 175 His Asn Gly Asp Tyr Phe Asp Ile Gln Glu Leu
Tyr Phe Arg Phe Ser 180 185 190 Met Asp Val Ala Thr Gly Phe Leu Phe
Gly Glu Ser Val Gly Ser Leu 195 200 205 Lys Asp Glu Asp Ala Arg Phe
Ser Glu Ala Phe Asn Glu Ser Gln Lys 210 215 220 Tyr Leu Ala Thr Arg
Ala Thr Leu His Glu Leu Tyr Phe Leu Cys Asp 225 230 235 240 Gly Phe
Arg Phe Arg Gln Tyr Asn Lys Val Val Arg Lys Phe Cys Ser 245 250 255
Gln Cys Val His Lys Ala Leu Asp Val Ala Pro Glu Asp Thr Ser Glu 260
265 270 Tyr Val Phe Leu Arg Glu Leu Val Lys His Thr Arg Asp Pro Val
Val 275 280 285 Leu Gln Asp Gln Ala Leu Asn Val Leu Leu Ala Gly Arg
Asp Thr Thr 290 295 300 Ala Ser Leu Leu Ser Phe Ala Thr Phe Glu Leu
Ala Arg Asn Asp His 305 310 315 320 Met Trp Arg Lys Leu Arg Glu Glu
Val Ile Ser Thr Met Gly Pro Ser 325 330 335 Ser Asp Glu Ile Thr Val
Ala Gly Leu Lys Ser Cys Arg Tyr Leu Lys 340 345 350 Ala Ile Leu Asn
Glu Thr Leu Arg Leu Tyr Pro Ser Val Pro Arg Asn 355 360 365 Ala Arg
Phe Ala Thr Arg Asn Thr Thr Leu Pro Arg Gly Gly Gly Pro 370 375 380
Asp Gly Ser Phe Pro Ile Leu Ile Arg Lys Gly Gln Pro Val Gly Tyr 385
390 395 400 Phe Ile Cys Ala Thr His Leu Asn Glu Lys Val Tyr Gly Asn
Asp Ser 405 410 415 His Val Phe Arg Pro Glu Arg Trp Ala Ala Leu Glu
Gly Lys Ser Leu 420 425 430 Gly Trp Ser Tyr Leu Pro Phe Asn Gly Gly
Pro Arg Ser Cys Leu Gly 435 440 445 Gln Gln Phe Ala Ile Leu Glu Ala
Ser Tyr Val Leu Ala Arg Leu Thr 450 455 460 Gln Cys Tyr Thr Thr Ile
Gln Leu Arg Thr Thr Glu Tyr Pro Pro Lys 465 470 475 480 Lys Leu Val
His Leu Thr Met Ser Leu Leu Asn Gly Val Tyr Ile Arg 485 490 495 Thr
Arg Thr
117 679 PRT Candida tropicalis 117
Met Ala Leu Asp Lys Leu
Asp Leu Tyr Val Ile Ile Thr Leu Val Val 1 5 10 15 Ala Val Ala Ala
Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro Gln 20 25 30 Asp Thr
Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser Arg Asp Val 35 40 45
Leu Ser Thr Leu Lys Lys Asn Asn Lys Asn Thr Leu Leu Leu Phe Gly 50
55 60 Ser Gln Thr Gly Thr Ala Glu Asp Tyr Ala Asn Lys Leu Ser Arg
Glu 65 70 75 80 Leu His Ser Arg Phe Gly Leu Lys Thr Met Val Ala Asp
Phe Ala Asp 85 90 95 Tyr Asp Trp Asp Asn Phe Gly Asp Ile Thr Glu
Asp Ile Leu Val Phe 100 105 110 Phe Ile Val Ala Thr Tyr Gly Glu Gly
Glu Pro Thr Asp Asn Ala Asp 115 120 125 Glu Phe His Thr Trp Leu Thr
Glu Glu Ala Asp Thr Leu Ser Thr Leu 130 135 140 Lys Tyr Thr Val Phe
Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn 145 150 155 160 Ala Ile
Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys Gly Gly Asp 165 170 175
Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly Thr Gly Thr Leu Asp 180
185 190 Glu Asp Phe Met Ala Trp Lys Asp Asn Val Phe Asp Ala Leu Lys
Asn 195 200 205 Asp Leu Asn Phe Glu Glu Lys Glu Leu Lys Tyr Glu Pro
Asn Val Lys 210 215 220 Leu Thr Glu Arg Asp Asp Leu Ser Ala Ala Asp
Ser Gln Val Ser Leu 225 230 235 240 Gly Glu Pro Asn Lys Lys Tyr Ile
Asn Ser Glu Gly Ile Asp Leu Thr 245 250 255 Lys Gly Pro Phe Asp His
Thr His Pro Tyr Leu Ala Arg Ile Thr Glu 260 265 270 Thr Arg Glu Leu
Phe Ser Ser Lys Asp Arg His Cys Ile His Val Glu 275 280 285 Phe Asp
Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly Asp His Leu 290 295 300
Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile Lys Gln Phe Ala Lys 305
310 315 320 Cys Phe Gly Leu Glu Asp Lys Leu Asp Thr Val Ile Glu Leu
Lys Ala 325 330 335 Leu Asp Ser Thr Tyr Thr Ile Pro Phe Pro Thr Pro
Ile Thr Tyr Gly 340 345 350 Ala Val Ile Arg His His Leu Glu Ile Ser
Gly Pro Val Ser Arg Gln 355 360 365 Phe Phe Leu Ser Ile Ala Gly Phe
Ala Pro Asp Glu Glu Thr Lys Lys 370 375 380 Ala Phe Thr Arg Leu Gly
Gly Asp Lys Gln Glu Phe Ala Ala Lys Val 385 390 395 400 Thr Arg Arg
Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser Ser Asn 405 410 415 Asn
Ala Pro Trp Ser Asp Val Pro Phe Glu Phe Leu Ile Glu Asn Val 420 425
430 Pro His Leu Thr Pro Arg Tyr Tyr Ser Ile Ser Ser Ser Ser Leu Ser
435 440 445 Glu Lys Gln Leu Ile Asn Val Thr Ala Val Val Glu Ala Glu
Glu Glu 450 455 460 Ala Asp Gly Arg Pro Val Thr Gly Val Val Thr Asn
Leu Leu Lys Asn 465 470 475 480 Val Glu Ile Val Gln Asn Lys Thr Gly
Glu Lys Pro Leu Val His Tyr 485 490 495 Asp Leu Ser Gly Pro Arg Gly
Lys Phe Asn Lys Phe Lys Leu Pro Val 500 505 510 His Val Arg Arg Ser
Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro 515 520 525 Val Ile Leu
Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg Gly Phe 530 535 540 Val
Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val Asn Val Gly Lys 545 550
555 560 Thr Leu Leu Phe Tyr Gly Cys Arg Asn Ser Asn Glu Asp Phe Leu
Tyr 565 570 575 Lys Gln Glu Trp Ala Glu Tyr Ala Ser Val Leu Gly Glu
Asn Phe Glu 580 585 590 Met Phe Asn Ala Phe Ser Arg Gln Asp Pro Ser
Lys Lys Val Tyr Val 595 600 605 Gln Asp Lys Ile Leu Glu Asn Ser Gln
Leu Val His Glu Leu Leu Thr 610 615 620 Glu Gly Ala Ile Ile Tyr Val
Cys Gly Asp Ala Ser Arg Met Ala Arg 625 630 635 640 Asp Val Gln Thr
Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu Ile 645 650 655 Ser Glu
Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys Val Gln Asn 660 665 670
Arg Tyr Gln Glu Asp Val Trp 675
118 679 PRT Candida tropicalis 118
Met Ala Leu Asp Lys Leu Asp Leu Tyr Val Ile Ile Thr Leu Val Val 1 5
10 15 Ala Val Ala Ala Tyr Phe Ala Lys Asn Gln Phe Leu Asp Gln Pro
Gln 20 25 30 Asp Thr Gly Phe Leu Asn Thr Asp Ser Gly Ser Asn Ser
Arg Asp Val 35 40 45 Leu Ser Thr Leu Lys Lys Asn Asn Lys Asn Thr
Leu Leu Leu Phe Gly 50 55 60 Ser Gln Thr Gly Thr Ala Glu Asp Tyr
Ala Asn Lys Leu Ser Arg Glu 65 70 75 80 Leu His Ser Arg Phe Gly Leu
Lys Thr Met Val Ala Asp Phe Ala Asp 85 90 95 Tyr Asp Trp Asp Asn
Phe Gly Asp Ile Thr Glu Asp Ile Leu Val Phe 100 105 110 Phe Ile Val
Ala Thr Tyr Gly Glu Gly Glu Pro Thr Asp Asn Ala Asp 115 120 125 Glu
Phe His Thr Trp Leu Thr Glu Glu Ala Asp Thr Leu Ser Thr Leu 130 135
140 Arg Tyr Thr Val Phe Gly Leu Gly Asn Ser Thr Tyr Glu Phe Phe Asn
145 150 155 160 Ala Ile Gly Arg Lys Phe Asp Arg Leu Leu Ser Glu Lys
Gly Gly Asp 165 170 175 Arg Phe Ala Glu Tyr Ala Glu Gly Asp Asp Gly
Thr Gly Thr Leu Asp 180 185 190 Glu Asp Phe Met Ala Trp Lys Asp Asn
Val Phe Asp Ala Leu Lys Asn 195 200 205 Asp Leu Asn Phe Glu Glu Lys
Glu Leu Lys Tyr Glu Pro Asn Val Lys 210 215 220 Leu Thr Glu Arg Asp
Asp Leu Ser Ala Ala Asp Ser Gln Val Ser Leu 225 230 235 240 Gly Glu
Pro Asn Lys Lys Tyr Ile Asn Ser Glu Gly Ile Asp Leu Thr 245 250 255
Lys Gly Pro Phe Asp His Thr His Pro Tyr Leu Ala Arg Ile Thr Glu 260
265 270 Thr Arg Glu Leu Phe Ser Ser Lys Glu Arg His Cys Ile His Val
Glu 275 280 285 Phe Asp Ile Ser Glu Ser Asn Leu Lys Tyr Thr Thr Gly
Asp His Leu 290 295 300 Ala Ile Trp Pro Ser Asn Ser Asp Glu Asn Ile
Lys Gln Phe Ala Lys 305 310 315 320 Cys Phe Gly Leu Glu Asp Lys Leu
Asp Thr Val Ile Glu Leu Lys Ala 325 330 335 Leu Asp Ser Thr Tyr Thr
Ile Pro Phe Pro Thr Pro Ile Thr Tyr Gly 340 345 350 Ala Val Ile Arg
His His Leu Glu Ile Ser Gly Pro Val Ser Arg Gln 355 360 365 Phe Phe
Leu Ser Ile Ala Gly Phe Ala Pro Asp Glu Glu Thr Lys Lys 370 375 380
Thr Phe Thr Arg Leu Gly Gly Asp Lys Gln Glu Phe Ala Thr Lys Val 385
390 395 400 Thr Arg Arg Lys Phe Asn Ile Ala Asp Ala Leu Leu Tyr Ser
Ser Asn 405 410 415 Asn Thr Pro Trp Ser Asp Val Pro Phe Glu Phe Leu
Ile Glu Asn Ile 420 425 430 Gln His Leu Thr Pro Arg Tyr Tyr Ser Ile
Ser Ser Ser Ser Leu Ser 435 440 445 Glu Lys Gln Leu Ile Asn Val Thr
Ala Val Val Glu Ala Glu Glu Glu 450 455 460 Ala Asp Gly Arg Pro Val
Thr Gly Val Val Thr Asn Leu Leu Lys Asn 465 470 475 480 Ile Glu Ile
Ala Gln Asn Lys Thr Gly Glu Lys Pro Leu Val His Tyr 485 490 495 Asp
Leu Ser Gly Pro Arg Gly Lys Phe Asn Lys Phe Lys Leu Pro Val 500 505
510 His Val Arg Arg Ser Asn Phe Lys Leu Pro Lys Asn Ser Thr Thr Pro
515 520 525 Val Ile Leu Ile Gly Pro Gly Thr Gly Val Ala Pro Leu Arg
Gly Phe 530 535 540 Val Arg Glu Arg Val Gln Gln Val Lys Asn Gly Val
Asn Val Gly Lys 545 550 555 560 Thr Leu Leu Phe Tyr Gly Cys Arg Asn
Ser Asn Glu Asp Phe Leu Tyr 565 570 575 Lys Gln Glu Trp Ala Glu Tyr
Ala Ser Val Leu Gly Glu Asn Phe Glu 580 585 590 Met Phe Asn Ala Phe
Ser Arg Gln Asp Pro Ser Lys Lys Val Tyr Val 595 600 605 Gln Asp Lys
Ile Leu Glu Asn Ser Gln Leu Val His Glu Leu Leu Thr 610 615 620 Glu
Gly Ala Ile Ile Tyr Val Cys Gly Asp Ala Ser Arg Met Ala Arg 625 630
635 640 Asp Val Gln Thr Thr Ile Ser Lys Ile Val Ala Lys Ser Arg Glu
Ile 645 650 655 Ser Glu Asp Lys Ala Ala Glu Leu Val Lys Ser Trp Lys
Val Gln Asn 660 665 670 Arg Tyr Gln Glu Asp Val Trp 675
* * * * *