U.S. patent application number 13/176452 was filed with the patent office on 2011-07-05 and published on 2011-11-24 for cytosolic isobutanol pathway localization for the production of isobutanol.
This patent application is currently assigned to GEVO, INC. Invention is credited to Aristos Aristidou, Ruth Berry, Thomas Buelter, Catherine Asleson Dundon, Reid M. Renny Feldman, Andrew Hawkins, Ishmeet Kalra, Doug Lies, Peter Meinhold, Matthew Peters, Stephanie Porter-Scheinman, Jun URANO.
Application Number | 13/176452 |
Publication Number | 20110287500 |
Family ID | 43586479 |
Filed Date | 2011-07-05 |
United States Patent
Application |
20110287500 |
Kind Code |
A1 |
URANO; Jun ; et al. |
November 24, 2011 |
CYTOSOLIC ISOBUTANOL PATHWAY LOCALIZATION FOR THE PRODUCTION OF
ISOBUTANOL
Abstract
The present invention provides recombinant microorganisms
comprising an isobutanol producing metabolic pathway with at least one
isobutanol pathway enzyme localized in the cytosol, wherein said
recombinant microorganism is selected to produce isobutanol from a
carbon source. Methods of using said recombinant microorganisms to
produce isobutanol are also provided. In various aspects of the
invention, the recombinant microorganisms may comprise
cytosolically active isobutanol pathway enzymes. In some
embodiments, the invention provides mutated, modified, and/or
chimeric isobutanol pathway enzymes with cytosolic activity. In
various embodiments described herein, the recombinant
microorganisms may be microorganisms of the Saccharomyces clade,
Crabtree-negative yeast microorganisms, Crabtree-positive yeast
microorganisms, post-WGD (whole genome duplication) yeast
microorganisms, pre-WGD (whole genome duplication) yeast
microorganisms, and non-fermenting yeast microorganisms.
Inventors: |
URANO; Jun; (Irvine, CA)
Dundon; Catherine Asleson; (Englewood, CO)
Meinhold; Peter; (Denver, CO)
Feldman; Reid M. Renny; (San Francisco, CA)
Aristidou; Aristos; (Highlands Ranch, CO)
Hawkins; Andrew; (Parker, CO)
Buelter; Thomas; (Denver, CO)
Peters; Matthew; (Highlands Ranch, CO)
Lies; Doug; (Parker, CO)
Porter-Scheinman; Stephanie; (Conifer, CO)
Berry; Ruth; (Englewood, CO)
Kalra; Ishmeet; (Englewood, CO) |
Assignee: | GEVO, INC., Englewood, CO |
Family ID: | 43586479 |
Appl. No.: | 13/176452 |
Filed: | July 5, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12855276 | Aug 12, 2010 | |
13176452 | | |
61272058 | Aug 12, 2009 | |
61272059 | Aug 12, 2009 | |
Current U.S. Class: | 435/160; 435/254.2; 435/254.21 |
Current CPC Class: | C12P 7/16 20130101; C12N 9/88 20130101; C12N 15/81 20130101; C12N 9/0006 20130101; Y02E 50/10 20130101; C12N 9/1022 20130101 |
Class at Publication: | 435/160; 435/254.2; 435/254.21 |
International Class: | C12P 7/16 20060101 C12P007/16; C12N 1/19 20060101 C12N001/19 |
Government Interests
ACKNOWLEDGMENT OF GOVERNMENTAL SUPPORT
[0002] This invention was made with government support under
Contract No. IIP-0823122, awarded by the National Science
Foundation, and under Contract No. EP-D-09-023, awarded by the
Environmental Protection Agency. The government has certain rights
in the invention.
Claims
1. A recombinant yeast microorganism comprising an isobutanol
producing metabolic pathway, wherein said isobutanol producing
metabolic pathway comprises the following substrate to product
conversions: (i) pyruvate to acetolactate; (ii) acetolactate to
2,3-dihydroxyisovalerate; (iii) 2,3-dihydroxyisovalerate to
α-ketoisovalerate; (iv) α-ketoisovalerate to
isobutyraldehyde; and (v) isobutyraldehyde to isobutanol; wherein
a) the enzyme that catalyzes a substrate to product conversion of
pyruvate to acetolactate is an acetolactate synthase; b) the enzyme
that catalyzes a substrate to product conversion of acetolactate to
2,3-dihydroxyisovalerate is a ketol-acid reductoisomerase derived
from Lactococcus lactis; c) the enzyme that catalyzes a substrate
to product conversion of 2,3-dihydroxyisovalerate to
α-ketoisovalerate is a dihydroxy acid dehydratase; d) the
enzyme that catalyzes a substrate to product conversion of
α-ketoisovalerate to isobutyraldehyde is an α-ketoisovalerate
decarboxylase; and e) the enzyme that catalyzes a substrate to
product conversion of isobutyraldehyde to isobutanol is an alcohol
dehydrogenase.
2. The recombinant yeast microorganism of claim 1, wherein said
acetolactate synthase is derived from a bacterial organism.
3. The recombinant yeast microorganism of claim 2, wherein said
bacterial organism is Bacillus subtilis.
4. The recombinant yeast microorganism of claim 1, wherein said
ketol-acid reductoisomerase is an NADH-dependent ketol-acid
reductoisomerase.
5. The recombinant yeast microorganism of claim 1, wherein said
dihydroxy acid dehydratase comprises the amino acid sequence
P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), wherein X is any natural or
non-natural amino acid.
6. The recombinant yeast microorganism of claim 5, wherein said
dihydroxy acid dehydratase enzyme is derived from a bacterial
organism.
7. The recombinant yeast microorganism of claim 6, wherein said
bacterial organism is Lactococcus lactis.
8. The recombinant yeast microorganism of claim 1, wherein said
α-ketoisovalerate decarboxylase is derived from a bacterial
organism.
9. The recombinant yeast microorganism of claim 8, wherein said
bacterial organism is Lactococcus lactis.
10. The recombinant yeast microorganism of claim 1, wherein said
alcohol dehydrogenase is derived from a bacterial organism.
11. The recombinant yeast microorganism of claim 10, wherein said
bacterial organism is Lactococcus lactis.
12. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism has been engineered to inactivate
one or more endogenous pyruvate decarboxylase (PDC) genes.
13. The recombinant yeast microorganism of claim 12, wherein said
one or more endogenous PDC genes is selected from PDC1, PDC5, and
PDC6.
14. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism has been engineered to inactivate
one or more endogenous glycerol-3-phosphate dehydrogenase (GPD)
genes.
15. The recombinant yeast microorganism of claim 14, wherein said
one or more endogenous GPD genes is selected from GPD1 and
GPD2.
16. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism is a yeast microorganism of the
Saccharomyces clade.
17. The recombinant yeast microorganism of claim 16, wherein said
yeast microorganism of the Saccharomyces clade is S.
cerevisiae.
18. A method of producing isobutanol, comprising: (a) providing a
recombinant yeast microorganism according to claim 1; and (b)
cultivating said recombinant yeast microorganism in a culture
medium containing a feedstock providing the carbon source, until a
recoverable quantity of the isobutanol is produced.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a divisional of U.S. application Ser.
No. 12/855,276, filed Aug. 12, 2010, which claims the benefit of
U.S. Provisional Application Ser. No. 61/272,058, filed Aug. 12,
2009, and U.S. Provisional Application Ser. No. 61/272,059, filed
Aug. 12, 2009, each of which is herein incorporated by reference
in their entireties for all purposes.
TECHNICAL FIELD
[0003] Recombinant microorganisms and methods of producing such
organisms are provided. Also provided are methods of producing
metabolites that are biofuels by contacting a suitable substrate
with recombinant microorganisms and enzymatic preparations
therefrom.
DESCRIPTION OF THE TEXT FILE SUBMITTED ELECTRONICALLY
[0004] The contents of the text file submitted electronically
herewith are incorporated herein by reference in their entirety: A
computer readable format copy of the Sequence Listing (filename:
GEVO_041_11US_SeqList_ST25.txt, date recorded: Jul. 2,
2011, file size: 337 kilobytes).
BACKGROUND
[0005] Biofuels have a long history ranging back to the beginning
of the 20th century. As early as 1900, Rudolf Diesel demonstrated
at the World Exhibition in Paris, France, an engine running on
peanut oil. Soon thereafter, Henry Ford demonstrated his Model T
running on ethanol derived from corn. Petroleum-derived fuels
displaced biofuels in the 1930s and 1940s due to increased supply
and efficiency at a lower cost.
[0006] Market fluctuations in the 1970s coupled to the decrease in
US oil production led to an increase in crude oil prices and a
renewed interest in biofuels. Today, many interest groups,
including policy makers, industry planners, aware citizens, and the
financial community, are interested in substituting
petroleum-derived fuels with biomass-derived biofuels. The leading
motivations for developing biofuels are economic, political, and
environmental in nature.
[0007] One is the threat of `peak oil`, the point at which the
consumption rate of crude oil exceeds the supply rate, leading to
significantly increased fuel costs and, in turn, an increased demand
for alternative fuels. In addition, instability in the Middle East
and other oil-rich regions has increased the demand for
domestically produced biofuels. Also, environmental concerns
relating to the possibility of carbon dioxide related climate
change are an important social and ethical driving force that is
starting to result in government regulations and policies such as
caps on carbon dioxide emissions from automobiles, taxes on carbon
dioxide emissions, and tax incentives for the use of biofuels.
[0008] Ethanol is the most abundant fermentatively produced fuel
today but has several drawbacks when compared to gasoline. Butanol,
in comparison, has several advantages over ethanol as a fuel: it
can be made from the same feedstocks as ethanol but, unlike
ethanol, it is compatible with gasoline at any ratio and can also
be used as a pure fuel in existing combustion engines without
modifications. Unlike ethanol, butanol does not absorb water and
can thus be stored and distributed in the existing petrochemical
infrastructure. Due to its higher energy content which is close to
that of gasoline, the fuel economy (miles per gallon) is better
than that of ethanol. Also, butanol-gasoline blends have lower
vapor pressure than ethanol-gasoline blends, which is important in
reducing evaporative hydrocarbon emissions.
[0009] Isobutanol has the same advantages as butanol with the
additional advantage of having a higher octane number due to its
branched carbon chain. Isobutanol is also useful as a commodity
chemical and is a precursor to methyl tert-butyl ether (MTBE).
[0010] Isobutanol has been produced in recombinant microorganisms
expressing a heterologous, five-step metabolic pathway (See, e.g.,
WO/2007/050671 to Donaldson et al., WO/2008/098227 to Liao et al.,
and WO/2009/103533 to Festel et al.). However, the microorganisms
produced have fallen short of commercial relevance due to their low
performance characteristics, including, for example, low
productivity, low titer, low yield, and the requirement for oxygen
during the fermentation process. Thus, recombinant microorganisms
exhibiting increased isobutanol productivity, titer, and/or yield
are desirable.
SUMMARY OF THE INVENTION
[0011] The present invention provides cytosolically active
dihydroxyacid dehydratase (DHAD) enzymes and recombinant
microorganisms comprising said cytosolically active DHAD enzymes.
In some embodiments, said recombinant microorganisms may further
comprise one or more additional enzymes catalyzing a reaction in an
isobutanol producing metabolic pathway. As described herein, the
recombinant microorganisms of the present invention are useful for
the production of several beneficial metabolites, including, but
not limited to isobutanol.
[0012] In a first aspect, the invention provides cytosolically
active dihydroxyacid dehydratase (DHAD) enzymes. These
cytosolically active DHAD enzymes generally exhibit the ability to
convert 2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol.
The cytosolically active DHAD enzymes of the present invention, as
described herein, can include native (i.e. parental) DHAD enzymes
that exhibit cytosolic activity, as well as DHAD enzymes that have
been modified or mutated to increase their cytosolic localization
and/or activity as compared to native (i.e. parental) DHAD
enzymes.
[0013] In various embodiments described herein, the DHAD enzymes
may be derived from a prokaryotic organism. In one embodiment, the
prokaryotic organism is a bacterial organism. In another
embodiment, the bacterial organism is Lactococcus lactis. In a
specific embodiment, the DHAD enzyme from L. lactis comprises the
amino acid sequence of SEQ ID NO: 18. In another embodiment, the
bacterial organism is Francisella tularensis. In a specific
embodiment, the DHAD enzyme from F. tularensis comprises the amino
acid sequence of SEQ ID NO: 14. In another embodiment, the
bacterial organism is Gramella forsetii. In a specific embodiment,
the DHAD enzyme from G. forsetii comprises the amino acid sequence
of SEQ ID NO: 17.
[0014] In alternative embodiments described herein, the DHAD enzyme
may be derived from a eukaryotic organism. In one embodiment, the
eukaryotic organism is a fungal organism. In an exemplary
embodiment, the fungal organism is Neurospora crassa. In a specific
embodiment, the DHAD enzyme from N. crassa comprises the amino acid
sequence of SEQ ID NO: 165.
[0015] In some embodiments, the invention provides modified or
mutated DHAD enzymes, wherein said DHAD enzymes exhibit increased
cytosolic activity as compared to their parental DHAD enzymes. In
another embodiment, the invention provides modified or mutated DHAD
enzymes, wherein said DHAD enzymes exhibit increased cytosolic
activity as compared to the DHAD enzyme comprised by the amino acid
sequence of SEQ ID NO: 11.
[0016] In further embodiments, the invention provides DHAD enzymes
comprising the amino acid sequence P(I/L)XXXGX(I/L)XIL (SEQ ID NO:
27), wherein X is any natural or non-natural amino acid, and
wherein said DHAD enzymes exhibit the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol.
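
The motif of SEQ ID NO: 27 can be read as a simple pattern over one-letter amino acid codes. The following is a minimal sketch of how a candidate sequence might be screened for it, not a method taken from the application. It assumes that X is limited to the 20 standard residues (the text also allows non-natural amino acids), and the function name and the two toy peptides are hypothetical.

```python
import re

# SEQ ID NO: 27 motif from the text: P(I/L)XXXGX(I/L)XIL.
# Assumption: X is restricted here to the 20 standard one-letter codes;
# non-natural residues, which the text also permits, are ignored.
STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"
X = f"[{STANDARD_AA}]"
DHAD_MOTIF = re.compile(f"P[IL]{X}{{3}}G{X}[IL]{X}IL")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each motif occurrence."""
    return [(m.start(), m.group()) for m in DHAD_MOTIF.finditer(sequence.upper())]


# Hypothetical toy peptides, not real DHAD sequences.
with_motif = "MKTPLAAAGAIAILQE"
without_motif = "MKTPLAAAGAIAIQE"
print(find_motif(with_motif))
print(find_motif(without_motif))
```

A match only shows that the eleven-residue pattern is present; whether the enzyme is active in the cytosol remains an experimental question.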
[0017] In some embodiments, the DHAD enzymes of the present
invention exhibit a properly folded iron-sulfur cluster domain
and/or redox active domain in the cytosol. In one embodiment, the
DHAD enzymes comprise a mutated or modified iron-sulfur cluster
domain and/or redox active domain.
[0018] In another aspect, the present invention provides
recombinant microorganisms comprising a cytosolically active DHAD
enzyme. In one embodiment, the invention provides recombinant
microorganisms comprising a DHAD enzyme derived from a prokaryotic
organism, wherein said DHAD enzyme exhibits activity in the
cytosol. In one embodiment, the DHAD enzyme is derived from a
bacterial organism. In a specific embodiment, the DHAD enzyme is
derived from L. lactis and comprises the amino acid sequence of SEQ
ID NO: 18. In another embodiment, the invention provides
recombinant microorganisms comprising a DHAD enzyme derived from a
eukaryotic organism, wherein said DHAD enzyme exhibits activity in
the cytosol. In one embodiment, the DHAD enzyme is derived from a
fungal organism. In an alternative embodiment, the DHAD enzyme is
derived from a yeast organism.
[0019] In one embodiment, the invention provides recombinant
microorganisms comprising a modified or mutated DHAD enzyme,
wherein said DHAD enzyme exhibits increased cytosolic activity as
compared to the parental DHAD enzyme. In another embodiment, the
invention provides recombinant microorganisms comprising a modified
or mutated DHAD enzyme, wherein said DHAD enzyme exhibits increased
cytosolic activity as compared to the DHAD enzyme comprised by the
amino acid sequence of SEQ ID NO: 11.
[0020] In another embodiment, the invention provides recombinant
microorganisms comprising a DHAD enzyme comprising the amino acid
sequence P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), wherein X is any
natural or non-natural amino acid, and wherein said DHAD enzymes
exhibit the ability to convert 2,3-dihydroxyisovalerate to
ketoisovalerate in the cytosol.
[0021] In some embodiments, the invention provides recombinant
microorganisms comprising a DHAD enzyme fused to a peptide tag,
whereby said DHAD enzyme exhibits increased cytosolic localization
and/or cytosolic DHAD activity as compared to the parental
microorganism. In one embodiment, the peptide tag is non-cleavable.
In another embodiment, the peptide tag is fused at the N-terminus
of the DHAD enzyme. In another embodiment, the peptide tag is fused
at the C-terminus of the DHAD enzyme. In certain embodiments, the
peptide tag may be selected from the group consisting of ubiquitin,
ubiquitin-like (UBL) proteins, myc, HA-tag, green fluorescent
protein (GFP), and the maltose binding protein (MBP).
[0022] In certain embodiments described herein, it may be desirable
to further overexpress an additional enzyme that converts
2,3-dihydroxyisovalerate (DHIV) to ketoisovalerate (KIV) in the
cytosol. In a specific embodiment, the enzyme may be selected from
the group consisting of 3-isopropylmalate isomerase (Leu1p) and
imidazoleglycerol-phosphate dehydratase (His3p).
[0023] In various embodiments described herein, the recombinant
microorganisms may be further engineered to express an isobutanol
producing metabolic pathway comprising at least one exogenous gene
that catalyzes a step in the conversion of pyruvate to isobutanol.
In one embodiment, the recombinant microorganism may be engineered
to express an isobutanol producing metabolic pathway comprising at
least two exogenous genes. In another embodiment, the recombinant
microorganism may be engineered to express an isobutanol producing
metabolic pathway comprising at least three exogenous genes. In
another embodiment, the recombinant microorganism may be engineered
to express an isobutanol producing metabolic pathway comprising at
least four exogenous genes. In another embodiment, the recombinant
microorganism may be engineered to express an isobutanol producing
metabolic pathway comprising five exogenous genes. Thus, the
present invention further provides recombinant microorganisms that
comprise an isobutanol producing metabolic pathway and methods of
using said recombinant microorganisms to produce isobutanol.
[0024] In one embodiment, the recombinant microorganisms comprise
an isobutanol producing metabolic pathway with at least one
isobutanol pathway enzyme localized in the cytosol. In another
embodiment, the recombinant microorganisms comprise an isobutanol
producing metabolic pathway with at least two isobutanol pathway
enzymes localized in the cytosol. In another embodiment, the
recombinant microorganisms comprise an isobutanol producing
metabolic pathway with at least three isobutanol pathway enzymes
localized in the cytosol. In another embodiment, the recombinant
microorganisms comprise an isobutanol producing metabolic pathway
with at least four isobutanol pathway enzymes localized in the
cytosol. In an exemplary embodiment, the recombinant microorganisms
comprise an isobutanol producing metabolic pathway with five
isobutanol pathway enzymes localized in the cytosol. In a further
exemplary embodiment, at least one of the pathway enzymes localized
to the cytosol is a cytosolically active DHAD enzyme as disclosed
herein.
[0025] In various embodiments described herein, the isobutanol
pathway enzyme(s) is/are selected from the group consisting of
acetolactate synthase (ALS), ketol-acid reductoisomerase (KARI),
dihydroxyacid dehydratase (DHAD), 2-keto-acid decarboxylase (KIVD),
and alcohol dehydrogenase (ADH).
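
The five enzymes listed above correspond, in order, to the five substrate to product conversions recited in claim 1. As an illustrative sketch only, the pathway can be written as an ordered list of steps and checked for continuity. The `Step` type, the helper names, and the example set of cytosolic enzymes are assumptions introduced here for illustration.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str  # abbreviation as used in the text


# The five conversions of claim 1, paired with the enzyme classes named above.
ISOBUTANOL_PATHWAY = [
    Step("pyruvate", "acetolactate", "ALS"),
    Step("acetolactate", "2,3-dihydroxyisovalerate", "KARI"),
    Step("2,3-dihydroxyisovalerate", "alpha-ketoisovalerate", "DHAD"),
    Step("alpha-ketoisovalerate", "isobutyraldehyde", "KIVD"),
    Step("isobutyraldehyde", "isobutanol", "ADH"),
]


def is_connected(pathway: list[Step]) -> bool:
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def cytosolic_count(pathway: list[Step], cytosolic: set[str]) -> int:
    """Count pathway enzymes flagged as localized in the cytosol."""
    return sum(step.enzyme in cytosolic for step in pathway)


# Hypothetical strain in which only DHAD and ADH are cytosolic.
print(is_connected(ISOBUTANOL_PATHWAY))
print(cytosolic_count(ISOBUTANOL_PATHWAY, {"DHAD", "ADH"}))
```

Counting localized enzymes this way mirrors the graded embodiments of paragraph [0024], which range from at least one to all five pathway enzymes in the cytosol.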
[0026] As described herein, the cytosolically active isobutanol
pathway enzymes of the present invention can include native (i.e.
parental) enzymes that exhibit cytosolic activity, as well as
isobutanol pathway enzymes that have been modified or mutated to
increase their cytosolic localization and/or activity as compared
to native (i.e. parental) pathway enzymes.
[0027] In various embodiments described herein, the isobutanol
pathway enzymes may be derived from a prokaryotic organism. In
alternative embodiments described herein, the isobutanol pathway
enzymes may be derived from a eukaryotic organism.
[0028] In some embodiments, the invention provides modified or
mutated isobutanol pathway enzymes, wherein said isobutanol pathway
enzymes exhibit increased cytosolic activity as compared to their
parental isobutanol pathway enzymes. In another embodiment, the
invention provides modified or mutated isobutanol pathway enzymes,
wherein said isobutanol pathway enzymes exhibit increased cytosolic
activity as compared to the homologous isobutanol pathway enzyme
from S. cerevisiae.
[0029] In various embodiments described herein, at least one of the
isobutanol pathway enzymes exhibiting cytosolic activity is ALS. In
one embodiment, the ALS is derived from a prokaryotic organism,
including, but not limited to Bacillus subtilis or L. lactis. In
another embodiment, the ALS is derived from a eukaryotic organism,
including, but not limited to Magnaporthe grisea, Phaeosphaeria
nodorum, Talaromyces stipitatus, and Trichoderma atroviride.
[0030] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is KARI. In one
embodiment, the KARI is derived from a prokaryotic organism,
including, but not limited to Escherichia coli, B. subtilis or L.
lactis. In another embodiment, the KARI is derived from a
eukaryotic organism, including, but not limited to Piromyces sp.
E2, S. cerevisiae, and Arabidopsis. In certain specific
embodiments, the KARI comprises an amino acid sequence selected
from an organism selected from the group consisting of E. coli, S.
cerevisiae, B. subtilis, Piromyces sp. E2, Buchnera aphidicola,
Spinacia oleracea, Oryza sativa, Chlamydomonas reinhardtii, N.
crassa, Schizosaccharomyces pombe, Laccaria bicolor, Ignicoccus
hospitalis, Picrophilus torridus, Acidiphilium cryptum,
Cyanobacteria/Synechococcus sp., Zymomonas mobilis, Bacteroides
thetaiotaomicron, Methanococcus maripaludis, Vibrio fischeri,
Shewanella sp., G. forsetii, Psychromonas ingrahamii, and Cytophaga
hutchinsonii. In additional embodiments, the KARI may be an
NADH-dependent KARI.
[0031] In various embodiments described herein, the isobutanol
pathway enzyme may be mutated or modified to remove an N-terminal
mitochondrial targeting sequence (MTS). Removal of the MTS can
increase cytosolic localization of the isobutanol pathway enzyme
and/or increase the cytosolic activity of the isobutanol pathway
enzyme as compared to the parental isobutanol pathway enzyme.
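
At the sequence level, removing an N-terminal MTS amounts to truncating the encoded protein and keeping an initiator methionine. The sketch below illustrates that idea under stated assumptions: the MTS length is supplied by the caller (in practice it would come from experiment or a prediction tool), and the function name and toy sequence are hypothetical.

```python
def remove_mts(protein: str, mts_length: int) -> str:
    """Drop the first `mts_length` residues and restore an initiator Met.

    Assumption: `mts_length` is already known; this function does not
    predict where the targeting sequence ends.
    """
    if not 0 <= mts_length < len(protein):
        raise ValueError("mts_length must be within the protein length")
    mature = protein[mts_length:]
    # Translation still needs a start codon, so prepend Met if it was removed.
    return mature if mature.startswith("M") else "M" + mature


# Hypothetical toy sequence with an assumed 8-residue targeting peptide.
print(remove_mts("MLRRALSTAKQLEV", 8))
```

Whether the truncated enzyme folds and remains active in the cytosol still has to be tested, which is why the text speaks of increased localization and/or activity rather than guaranteeing either.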
[0032] In some embodiments, the MTS may be modified or mutated to
reduce or eliminate its ability to target the isobutanol pathway
enzyme to the mitochondria. Selected modification of the MTS can
increase cytosolic localization of the isobutanol pathway enzyme
and/or increase the cytosolic activity of the isobutanol pathway
enzyme as compared to the parental isobutanol pathway enzyme.
[0033] In additional embodiments, the invention provides
recombinant microorganisms comprising an isobutanol pathway enzyme
fused to a peptide tag, whereby said isobutanol pathway enzyme
exhibits increased cytosolic localization and/or cytosolic activity
as compared to the parental enzyme. As a result, the recombinant
microorganism comprising the tagged isobutanol pathway enzyme will
generally exhibit an increased ability to perform a step involved
in the conversion of pyruvate to isobutanol in the cytosol. In one
embodiment, the peptide tag is non-cleavable. In another
embodiment, the peptide tag is fused at the N-terminus of the
isobutanol pathway enzyme. In another embodiment, the peptide tag
is fused at the C-terminus of the isobutanol pathway enzyme. In
certain embodiments, the peptide tag may be selected from the group
consisting of ubiquitin, ubiquitin-like (UBL) proteins, myc,
HA-tag, green fluorescent protein (GFP), and the maltose binding
protein (MBP).
[0034] In various embodiments described herein, the recombinant
microorganisms may further comprise a nucleic acid encoding a
chaperone protein, wherein said chaperone protein assists the
folding of a protein exhibiting cytosolic activity. In a preferred
embodiment, the protein exhibiting cytosolic activity is an
isobutanol pathway enzyme. In one embodiment, the chaperone may be
a native protein. In another embodiment, the chaperone protein may
be an exogenous protein. In some embodiments, the chaperone protein
may be selected from the group consisting of: endoplasmic reticulum
oxidoreductin 1 (Ero1), including variants of Ero1 that have been
suitably altered to reduce or prevent its normal localization to
the endoplasmic reticulum; thioredoxins (including, but not limited
to, Trx1 and Trx2), thioredoxin reductase (Trr1), glutaredoxins
(including, but not limited to, Grx1, Grx2, Grx3, Grx4, Grx5, Grx6,
Grx7, and Grx8), glutathione reductase (Glr1), and Jac1, including
variants of Jac1 that have been suitably altered to reduce or
prevent its normal mitochondrial localization; and homologs or
variants thereof.
[0035] In some embodiments, the recombinant microorganisms may
further comprise one or more genes encoding an iron-sulfur cluster
assembly protein. In one embodiment, the iron-sulfur cluster
assembly protein encoding genes may be derived from prokaryotic
organisms. In one embodiment, the iron-sulfur cluster assembly
protein encoding genes are derived from a bacterial organism,
including, but not limited to E. coli, L. lactis, Helicobacter
pylori, and Entamoeba histolytica. In specific embodiments, the
bacterially derived iron-sulfur cluster assembly protein encoding
genes are selected from the group consisting of cyaY, iscS, iscU,
iscA, hscB, hscA, fdx, isuX, sufA, sufB, sufC, sufD, sufS, sufE,
apbC, and homologs or variants thereof.
[0036] In another embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from eukaryotic organisms,
including, but not limited to yeasts and plants. In one embodiment,
the iron-sulfur cluster protein encoding genes are derived from a
yeast organism, including, but not limited to S. cerevisiae. In
specific embodiments, the yeast derived genes encoding iron-sulfur
cluster assembly proteins are selected from the group consisting of
Cfd1, Nbp35, Nar1, Cia1, and homologs or variants thereof. In a
further embodiment, the iron-sulfur cluster assembly protein
encoding genes may be derived from plant nuclear genes which encode
proteins translocated to chloroplast or plant genes found in the
chloroplast genome itself.
[0037] In some embodiments, one or more genes encoding an
iron-sulfur cluster assembly protein may be mutated or modified to
remove a signal peptide, whereby localization of the product of
said one or more genes to the mitochondria or other subcellular
compartment is prevented. In certain embodiments, it may be
preferable to overexpress one or more genes encoding an iron-sulfur
cluster assembly protein.
[0038] In certain embodiments described herein, it may be desirable
to reduce or eliminate the activity and/or protein levels of one
or more iron-sulfur cluster containing cytosolic proteins. In a
specific embodiment, the iron-sulfur cluster containing cytosolic
protein is 3-isopropylmalate dehydratase (Leu1p). In one
embodiment, the recombinant microorganism comprises a mutation in
the LEU1 gene resulting in the reduction of Leu1p protein levels.
In another embodiment, the recombinant microorganism comprises a
partial deletion in the LEU1 gene resulting in the reduction of
Leu1p protein levels. In another embodiment, the recombinant
microorganism comprises a complete deletion in the LEU1 gene
resulting in the reduction of Leu1p protein levels. In another
embodiment, the recombinant microorganism comprises a modification
of the regulatory region associated with the LEU1 gene resulting in
the reduction of Leu1p protein levels. In yet another embodiment,
the recombinant microorganism comprises a modification of a
transcriptional regulator for the LEU1 gene resulting in the
reduction of Leu1p protein levels.
[0039] In additional embodiments, the present invention provides
recombinant microorganisms comprising chimeric proteins consisting
of isobutanol pathway enzymes. In one embodiment, the chimeric
proteins consist of ALS and at least one additional protein. In a
specific embodiment, the additional protein is KARI. In a preferred
embodiment, the chimeric protein exhibits the biocatalytic
properties of both ALS and KARI. Such a chimeric protein allows for
an increase in the concentration of 2-acetolactate at the active
site of KARI as compared to the parental microorganism, giving the
recombinant microorganism an enhanced ability to convert
2-acetolactate to 2,3-dihydroxyisovalerate. In another embodiment,
the chimeric proteins consist of KARI and at least one additional
protein. In a specific embodiment, the additional protein is DHAD.
In a preferred embodiment, the chimeric protein exhibits the
biocatalytic properties of both KARI and DHAD. In each of the
various embodiments described herein, the proteins may be connected
via a flexible linker.
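
A chimeric protein of this kind can be pictured as two enzyme sequences joined end to end through a linker. The sketch below assembles such a fusion at the amino acid level. The text specifies only "a flexible linker"; the glycine-serine repeat (GGGGS) used here is a commonly used flexible linker chosen as an assumption, and the function name and toy sequences are hypothetical.

```python
def fuse(n_terminal: str, c_terminal: str, linker_repeats: int = 3) -> str:
    """Join two protein sequences through a (GGGGS)n flexible linker.

    Assumption: the linker type and repeat count are illustrative; the
    source text does not specify a linker sequence.
    """
    if linker_repeats < 1:
        raise ValueError("linker_repeats must be at least 1")
    linker = "GGGGS" * linker_repeats
    # The downstream partner no longer needs its own initiator methionine.
    downstream = c_terminal[1:] if c_terminal.startswith("M") else c_terminal
    return n_terminal + linker + downstream


# Hypothetical toy sequences standing in for ALS and KARI.
print(fuse("MAAA", "MCCC", linker_repeats=2))
```

The order of the partners and the linker length would normally be varied empirically, since either can affect whether both catalytic activities are retained.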
[0040] In various embodiments described herein, the recombinant
microorganisms may be engineered to express native genes that
catalyze a step in the conversion of pyruvate to isobutanol. In one
embodiment, the recombinant microorganism is engineered to increase
the activity of a native metabolic pathway gene for conversion of
pyruvate to isobutanol. In another embodiment, the recombinant
microorganism is further engineered to include at least one enzyme
encoded by an exogenous gene and at least one enzyme encoded by a
native gene. In yet another embodiment, the recombinant
microorganism comprises a reduction in the activity of a native
metabolic pathway as compared to a parental microorganism.
[0041] In another embodiment, the present invention provides
recombinant microorganisms comprising a scaffold system tethered to
one or more isobutanol pathway enzymes. In a specific embodiment,
the scaffold system is the MAP kinase scaffold (Ste5) system. In a
further embodiment, one or more of the isobutanol pathway enzymes
may be modified or mutated to comprise a protein domain allowing
for binding to the scaffold system.
[0042] In various embodiments described herein, the recombinant
microorganisms may be microorganisms of the Saccharomyces clade,
Saccharomyces sensu stricto microorganisms, Crabtree-negative yeast
microorganisms, Crabtree-positive yeast microorganisms, post-WGD
(whole genome duplication) yeast microorganisms, pre-WGD (whole
genome duplication) yeast microorganisms, and non-fermenting yeast
microorganisms.
[0043] In some embodiments, the recombinant microorganisms may be
yeast recombinant microorganisms of the Saccharomyces clade.
[0044] In some embodiments, the recombinant microorganisms may be
Saccharomyces sensu stricto microorganisms. In one embodiment, the
Saccharomyces sensu stricto is selected from the group consisting
of S. cerevisiae, S. kudriavzevii, S. mikatae, S. bayanus, S.
uvarum, S. carocanis, and hybrids thereof.
[0045] In some embodiments, the recombinant microorganisms may be
Crabtree-negative recombinant yeast microorganisms. In one
embodiment, the Crabtree-negative yeast microorganism is classified
into a genus selected from the group consisting of Kluyveromyces,
Pichia, Hansenula, Issatchenkia, and Candida. In additional
embodiments, the Crabtree-negative yeast microorganism is selected
from Kluyveromyces lactis, Kluyveromyces marxianus, Pichia anomala,
Pichia stipitis, Hansenula anomala, Issatchenkia orientalis,
Candida utilis and Kluyveromyces waltii.
[0046] In some embodiments, the recombinant microorganisms may be
Crabtree-positive recombinant yeast microorganisms. In one
embodiment, the Crabtree-positive yeast microorganism is classified
into a genus selected from the group consisting of Saccharomyces,
Kluyveromyces, Zygosaccharomyces, Debaryomyces, Candida, Pichia and
Schizosaccharomyces. In additional embodiments, the
Crabtree-positive yeast microorganism is selected from the group
consisting of Saccharomyces cerevisiae, Saccharomyces uvarum,
Saccharomyces bayanus, Saccharomyces paradoxus, Saccharomyces
castellii, Saccharomyces kluyveri, Kluyveromyces thermotolerans,
Candida glabrata, Z. bailii, Z. rouxii, Debaryomyces hansenii,
Pichia pastoris, and Schizosaccharomyces pombe.
[0047] In some embodiments, the recombinant microorganisms may be
post-WGD (whole genome duplication) yeast recombinant
microorganisms. In one embodiment, the post-WGD yeast recombinant
microorganism is classified into a genus selected from the group
consisting of Saccharomyces and Candida. In additional embodiments,
the post-WGD yeast is selected from the group consisting of
Saccharomyces cerevisiae, Saccharomyces uvarum, Saccharomyces
bayanus, Saccharomyces paradoxus, Saccharomyces castellii, and
Candida glabrata.
[0048] In some embodiments, the recombinant microorganisms may be
pre-WGD (whole genome duplication) yeast recombinant
microorganisms. In one embodiment, the pre-WGD yeast recombinant
microorganism is classified into a genus selected from the group
consisting of Saccharomyces, Kluyveromyces, Candida, Pichia,
Debaryomyces, Hansenula, Pachysolen, Yarrowia, Issatchenkia, and
Schizosaccharomyces. In additional embodiments, the pre-WGD yeast
is selected from the group consisting of Saccharomyces kluyveri,
Kluyveromyces thermotolerans, Kluyveromyces marxianus,
Kluyveromyces waltii, Kluyveromyces lactis, Candida tropicalis,
Pichia pastoris, Pichia anomala, Pichia stipitis, Debaryomyces
hansenii, Hansenula anomala, Pachysolen tannophilus, Yarrowia
lipolytica, Issatchenkia orientalis, and Schizosaccharomyces
pombe.
[0049] In some embodiments, the recombinant microorganisms may be
microorganisms that are non-fermenting yeast microorganisms,
including, but not limited to, those classified into a genus
selected from the group consisting of Trichosporon, Rhodotorula, and
Myxozyma.
[0050] In another aspect, the present invention provides methods of
producing isobutanol using one or more recombinant microorganisms
of the invention. In one embodiment, the method includes
cultivating one or more recombinant microorganisms in a culture
medium containing a feedstock providing the carbon source until a
recoverable quantity of the isobutanol is produced and optionally,
recovering the isobutanol. In one embodiment, the microorganism is
selected to produce isobutanol from a carbon source at a yield of
at least about 5 percent theoretical. In another embodiment, the
microorganism is selected to produce isobutanol at a yield of at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent theoretical, at least about 85 percent
theoretical, or at least about 90 percent theoretical.
[0051] In one embodiment, the microorganism produces isobutanol
from a carbon source at a specific productivity of at least about
0.7 mg/L/hr per OD. In another embodiment, the microorganism
produces isobutanol from a carbon source at a specific productivity
of at least about 1 mg/L/hr per OD, at least about 10 mg/L/hr per
OD, at least about 50 mg/L/hr per OD, at least about 100 mg/L/hr
per OD, at least about 250 mg/L/hr per OD, or at least about 500
mg/L/hr per OD.
BRIEF DESCRIPTION OF DRAWINGS
[0052] Illustrative embodiments of the invention are illustrated in
the drawings, in which:
[0053] FIG. 1 illustrates an exemplary embodiment of an isobutanol
pathway.
[0054] FIG. 2 illustrates acetoin produced from GEVO 1187 (no ALS),
2280 (B. subtilis AlsS not codon optimized), GEVO 2618 (B. subtilis
AlsS), GEVO 2621 (T. atroviride ALS) and GEVO 2622 (T. stipitatus
ALS). All acetoin values are normalized to OD.sub.600 and reported
as mM/OD.
[0055] FIG. 3 illustrates the specific activity at pH 7.5 of KARI
enzyme in whole cell lysates for GEVO1803 containing empty vector
(pGV1102), ilv5.DELTA.N47(pGV1831), ilv5.DELTA.N46(pGV1901), Full
length ILV5 (pGV1833) and E. coli ilvC codon optimized for S.
cerevisiae (pGV1824).
[0056] FIG. 4 illustrates the results from fermentations of
GEVO2107 transformed with plasmids for expression of KARI and
different DHAD homologs (shown in legend).
[0057] FIG. 5 illustrates a phylogenetic tree of 53 representative
DHAD homologs following pairwise global alignments and progressive
assembly of alignments using Neighbor-Joining phylogeny.
DETAILED DESCRIPTION
[0058] As used herein and in the appended claims, the singular
forms "a," "an," and "the" include plural referents unless the
context clearly dictates otherwise. Thus, for example, reference to
"a polynucleotide" includes a plurality of such polynucleotides and
reference to "the microorganism" includes reference to one or more
microorganisms, and so forth.
[0059] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood to one of
ordinary skill in the art to which this disclosure belongs.
Although methods and materials similar or equivalent to those
described herein can be used in the practice of the disclosed
methods and compositions, the exemplary methods, devices and
materials are described herein.
[0060] Any publications discussed above and throughout the text are
provided solely for their disclosure prior to the filing date of
the present application. Nothing herein is to be construed as an
admission that the inventors are not entitled to antedate such
disclosure by virtue of prior disclosure.
[0061] The term "microorganism" includes prokaryotic and eukaryotic
microbial species from the Domains Archaea, Bacteria and Eucarya,
the latter including yeast and filamentous fungi, protozoa, algae,
or higher Protista. The terms "microbial cells" and "microbes" are
used interchangeably with the term microorganism.
[0062] The term "genus" is defined as a taxonomic group of related
species according to the Taxonomic Outline of Bacteria and Archaea
(Garrity et al., 2007, TOBA Release 7.7, Michigan State University
Board of Trustees).
[0063] The term "species" is defined as a collection of closely
related organisms with greater than 97% 16S ribosomal RNA sequence
homology and greater than 70% genomic hybridization and
sufficiently different from all other organisms so as to be
recognized as a distinct unit.
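By way of illustration only, the two numerical thresholds in the
definition above can be written as a simple check. The following
Python sketch is not part of the disclosed methods: the function name
and its percentage inputs are hypothetical, and it assumes that both
values have already been measured for the pair of organisms being
compared. The further requirement of being "sufficiently different
from all other organisms" is not modeled.

```python
def meets_species_thresholds(rrna_16s_homology_pct, genomic_hybridization_pct):
    """Return True when both thresholds of the definition are exceeded.

    Hypothetical helper: inputs are percentages (0-100) measured elsewhere.
    """
    for value in (rrna_16s_homology_pct, genomic_hybridization_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must lie between 0 and 100")
    # "greater than 97%" and "greater than 70%" are strict inequalities.
    return rrna_16s_homology_pct > 97.0 and genomic_hybridization_pct > 70.0


# Hypothetical pairs of measurements.
close_pair = meets_species_thresholds(98.5, 75.0)
borderline_pair = meets_species_thresholds(97.0, 80.0)
```

Because the definition uses "greater than", a value of exactly 97% or
exactly 70% does not satisfy the check in this sketch.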
[0064] The terms "recombinant microorganism," "modified
microorganism" and "recombinant host cell" are used interchangeably
herein and refer to microorganisms that have been genetically
modified to express or over-express endogenous polynucleotides, or
to express heterologous polynucleotides, such as those included in
a vector, or which have an alteration in expression of an
endogenous gene. By "alteration" it is meant that the expression of
the gene, or level of an RNA molecule or equivalent RNA molecules
encoding one or more polypeptides or polypeptide subunits, or
activity of one or more polypeptides or polypeptide subunits is up
regulated or down regulated, such that expression, level, or
activity is greater than or less than that observed in the absence
of the alteration. For example, the term "alter" can mean
"inhibit," but the use of the word "alter" is not limited to this
definition.
[0065] The term "expression" with respect to a gene sequence refers
to transcription of the gene and, as appropriate, translation of
the resulting mRNA transcript to a protein. Thus, as will be clear
from the context, expression of a protein results from
transcription and translation of the open reading frame sequence.
The level of expression of a desired product in a host cell may be
determined on the basis of either the amount of corresponding mRNA
that is present in the cell, or the amount of the desired product
encoded by the selected sequence. For example, mRNA transcribed
from a selected sequence can be quantitated by qRT-PCR or by
Northern hybridization (Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (1989)).
Protein encoded by a selected sequence can be quantitated by
various methods, e.g., by ELISA, by assaying for the biological
activity of the protein, or by employing assays that are
independent of such activity, such as western blotting or
radioimmunoassay, using antibodies that recognize and bind the
protein. The polynucleotide generally encodes a target enzyme
involved in a metabolic pathway for producing a desired metabolite.
It is understood that the terms "recombinant microorganism" and
"recombinant host cell" refer not only to the particular
recombinant microorganism but to the progeny or potential progeny
of such a microorganism. Because certain modifications may occur in
succeeding generations due to either mutation or environmental
influences, such progeny may not, in fact, be identical to the
parent cell, but are still included within the scope of the term as
used herein.
[0066] The term "wild-type microorganism" describes a cell that
occurs in nature, i.e. a cell that has not been genetically
modified. A wild-type microorganism can be genetically modified to
express or overexpress a first target enzyme. This microorganism
can act as a parental microorganism in the generation of a
microorganism modified to express or overexpress a second target
enzyme. In turn, the microorganism modified to express or
overexpress a first and a second target enzyme can be modified to
express or overexpress a third target enzyme.
[0067] Accordingly, a "parental microorganism" functions as a
reference cell for successive genetic modification events. Each
modification event can be accomplished by introducing a nucleic
acid molecule into the reference cell. The introduction
facilitates the expression or overexpression of a target enzyme. It
is understood that the term "facilitates" encompasses the
activation of endogenous polynucleotides encoding a target enzyme
through genetic modification of e.g., a promoter sequence in a
parental microorganism. It is further understood that the term
"facilitates" encompasses the introduction of heterologous
polynucleotides encoding a target enzyme into a parental
microorganism.
[0068] The term "engineer" refers to any manipulation of a
microorganism that results in a detectable change in the
microorganism, wherein the manipulation includes but is not limited
to inserting a polynucleotide and/or polypeptide heterologous to
the microorganism and mutating a polynucleotide and/or polypeptide
native to the microorganism.
[0069] The term "mutation" as used herein indicates any
modification of a nucleic acid and/or polypeptide which results in
an altered nucleic acid or polypeptide. Mutations include, for
example, point mutations, deletions, or insertions of single or
multiple residues in a polynucleotide, which includes alterations
arising within a protein-encoding region of a gene as well as
alterations in regions outside of a protein-encoding sequence, such
as, but not limited to, regulatory or promoter sequences. A genetic
alteration may be a mutation of any type. For instance, the
mutation may constitute a point mutation, a frame-shift mutation,
an insertion, or a deletion of part or all of a gene. In addition,
in some embodiments of the modified microorganism, a portion of the
microorganism genome has been replaced with a heterologous
polynucleotide. In some embodiments, the mutations are
naturally-occurring. In other embodiments, the mutations are the
results of artificial selection pressure. In still other
embodiments, the mutations in the microorganism genome are the
result of genetic engineering.
[0070] The term "biosynthetic pathway", also referred to as
"metabolic pathway", refers to a set of anabolic or catabolic
biochemical reactions for converting one chemical species into
another. Gene products belong to the same "metabolic pathway" if
they, in parallel or in series, act on the same substrate, produce
the same product, or act on or produce a metabolic intermediate
(i.e., metabolite) between the same substrate and metabolite end
product.
[0071] The term "heterologous" as used herein with reference to
molecules and in particular enzymes and polynucleotides, indicates
molecules that are expressed in an organism other than the organism
from which they originated or are found in nature, independently of
the level of expression that can be lower, equal or higher than the
level of expression of the molecule in the native
microorganism.
[0072] On the other hand, the term "native" or "endogenous" as used
herein with reference to molecules, and in particular enzymes and
polynucleotides, indicates molecules that are expressed in the
organism in which they originated or are found in nature,
independently of the level of expression that can be lower, equal or
higher than the level of expression of the molecule in the native
microorganism. It is understood that expression of native enzymes
or polynucleotides may be modified in recombinant
microorganisms.
[0073] The term "feedstock" is defined as a raw material or mixture
of raw materials supplied to a microorganism or fermentation
process from which other products can be made. For example, a
carbon source, such as biomass or the carbon compounds derived from
biomass, is a feedstock for a microorganism that produces a biofuel
in a fermentation process. However, a feedstock may contain
nutrients other than a carbon source.
[0074] The term "substrate" or "suitable substrate" refers to any
substance or compound that is converted or meant to be converted
into another compound by the action of an enzyme. The term includes
not only a single compound, but also combinations of compounds,
such as solutions, mixtures and other materials which contain at
least one substrate, or derivatives thereof. Further, the term
"substrate" encompasses not only compounds that provide a carbon
source suitable for use as a starting material, such as any biomass
derived sugar, but also intermediate and end product metabolites
used in a pathway associated with a recombinant microorganism as
described herein.
[0075] The term "C2-compound" as used as a carbon source for
engineered yeast microorganisms with mutations in all pyruvate
decarboxylase (PDC) genes resulting in a reduction of pyruvate
decarboxylase activity of said genes refers to organic compounds
comprised of two carbon atoms, including but not limited to ethanol
and acetate.
[0076] The term "fermentation" or "fermentation process" is defined
as a process in which a microorganism is cultivated in a culture
medium containing raw materials, such as feedstock and nutrients,
wherein the microorganism converts raw materials, such as a
feedstock, into products.
[0077] The term "volumetric productivity" or "production rate" is
defined as the amount of product formed per volume of medium per
unit of time. Volumetric productivity is reported in gram per liter
per hour (g/L/h).
[0078] The term "specific productivity" or "specific production
rate" is defined as the amount of product formed per volume of
medium per unit of time per amount of cells. Specific
productivity is reported in gram or milligram per liter per hour
per OD (g/L/h/OD or mg/L/h/OD).
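The relationship between the two productivity terms defined above can
be sketched numerically. The following Python example is illustrative
only and rests on several assumptions that are not taken from the
text: the product concentration starts at zero, the rate is averaged
over a single time interval, and a single OD value represents the
amount of cells for that interval. All function names and input
values are hypothetical.

```python
def volumetric_productivity(product_g_per_l, hours):
    """Average product formed per liter of medium per hour (g/L/h)."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return product_g_per_l / hours


def specific_productivity(product_g_per_l, hours, od):
    """Volumetric productivity divided by cell density (g/L/h/OD)."""
    if od <= 0:
        raise ValueError("od must be positive")
    return volumetric_productivity(product_g_per_l, hours) / od


# Hypothetical run: 2.4 g/L of product formed in 24 h at an OD of 5.
vol = volumetric_productivity(2.4, 24.0)       # roughly 0.1 g/L/h
spec = specific_productivity(2.4, 24.0, 5.0)   # roughly 0.02 g/L/h/OD
spec_mg = spec * 1000.0                        # roughly 20 mg/L/h/OD
```

Multiplying by 1000 converts g/L/h/OD to mg/L/h/OD, the unit in which
specific productivity thresholds are typically stated for low-titer
cultures.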
[0079] The term "yield" is defined as the amount of product
obtained per unit weight of raw material and may be expressed as g
product per g substrate (g/g). Yield may be expressed as a
percentage of the theoretical yield. "Theoretical yield" is defined
as the maximum amount of product that can be generated per a given
amount of substrate as dictated by the stoichiometry of the
metabolic pathway used to make the product. For example, the
theoretical yield for one typical conversion of glucose to
isobutanol is 0.41 g/g. As such, a yield of isobutanol from glucose
of 0.39 g/g would be expressed as 95% of theoretical or 95%
theoretical yield.
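The arithmetic behind the yield example above can be reproduced as
follows. This Python sketch is illustrative only. It assumes an
overall stoichiometry of one mole of isobutanol (C4H10O) per mole of
glucose (C6H12O6), with two moles of CO2 and one mole of water as the
balance; that equation is an assumption introduced here for the
calculation and is consistent with the 0.41 g/g figure. The atomic
masses are rounded standard values and the function names are
hypothetical.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

GLUCOSE = {"C": 6, "H": 12, "O": 6}      # C6H12O6
ISOBUTANOL = {"C": 4, "H": 10, "O": 1}   # C4H10O


def molar_mass(formula):
    """Molar mass (g/mol) of a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def theoretical_yield(mol_product_per_mol_substrate=1.0):
    """Maximum g isobutanol per g glucose for the assumed stoichiometry."""
    return mol_product_per_mol_substrate * molar_mass(ISOBUTANOL) / molar_mass(GLUCOSE)


def percent_theoretical(observed_g_per_g, theoretical_g_per_g):
    """Observed yield expressed as a percentage of the theoretical yield."""
    if theoretical_g_per_g <= 0:
        raise ValueError("theoretical yield must be positive")
    return 100.0 * observed_g_per_g / theoretical_g_per_g


max_yield = theoretical_yield()          # about 0.41 g/g
pct = percent_theoretical(0.39, 0.41)    # about 95 percent
```

A different pathway stoichiometry would change the theoretical yield,
which is why the percentage is always stated relative to a specified
conversion.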
[0080] The term "titer" is defined as the strength of a solution or
the concentration of a substance in solution. For example, the
titer of a biofuel in a fermentation broth is described as g of
biofuel in solution per liter of fermentation broth (g/L).
[0081] "Aerobic conditions" are defined as conditions under which
the oxygen concentration in the fermentation medium is sufficiently
high for an aerobic or facultative anaerobic microorganism to use
oxygen as a terminal electron acceptor.
[0082] In contrast, "anaerobic conditions" are defined as
conditions under which the oxygen concentration in the fermentation
medium is too low for the microorganism to use oxygen as a terminal
electron acceptor. Anaerobic conditions may be achieved by sparging
a fermentation medium with an inert gas such as nitrogen until
oxygen is no longer available to the microorganism as a terminal
electron acceptor. Alternatively, anaerobic conditions may be
achieved by the microorganism consuming the available oxygen of the
fermentation until oxygen is unavailable to the microorganism as a
terminal electron acceptor.
[0083] "Aerobic metabolism" refers to a biochemical process in
which oxygen is used as a terminal electron acceptor to make
energy, typically in the form of ATP, from carbohydrates. Aerobic
metabolism occurs e.g. via glycolysis and the TCA cycle, wherein a
single glucose molecule is metabolized completely into carbon
dioxide in the presence of oxygen.
[0084] In contrast, "anaerobic metabolism" refers to a biochemical
process in which oxygen is not the final acceptor of electrons
contained in NADH. Anaerobic metabolism can be divided into
anaerobic respiration, in which compounds other than oxygen serve
as the terminal electron acceptor, and substrate level
phosphorylation, in which the electrons from NADH are utilized to
generate a reduced product via a "fermentative pathway."
[0085] In "fermentative pathways", NAD(P)H donates its electrons to
a molecule produced by the same metabolic pathway that produced the
electrons carried in NAD(P)H. For example, in one of the
fermentative pathways of certain yeast strains, NAD(P)H generated
through glycolysis transfers its electrons to pyruvate, yielding
ethanol. Fermentative pathways are usually active under anaerobic
conditions but may also occur under aerobic conditions, under
conditions where NADH is not fully oxidized via the respiratory
chain. For example, above certain glucose concentrations, Crabtree
positive yeasts produce large amounts of ethanol under aerobic
conditions.
[0086] The term "byproduct" means an undesired product related to
the production of a biofuel or biofuel precursor. Byproducts are
generally disposed of as waste, adding cost to a production
process.
[0087] The term "non-fermenting yeast" is a yeast species that
fails to demonstrate an anaerobic metabolism in which the electrons
from NADH are utilized to generate a reduced product via a
fermentative pathway such as the production of ethanol and CO.sub.2
from glucose. Non-fermentative yeast can be identified by the
"Durham Tube Test" (J. A. Barnett, R. W. Payne, and D. Yarrow.
2000. Yeasts Characteristics and Identification. 3.sup.rd edition.
p. 28-29. Cambridge University Press, Cambridge, UK) or by
monitoring the production of fermentation products such as
ethanol and CO.sub.2.
[0088] The term "polynucleotide" is used herein interchangeably
with the term "nucleic acid" and refers to an organic polymer
composed of two or more monomers including nucleotides, nucleosides
or analogs thereof, including but not limited to single stranded or
double stranded, sense or antisense deoxyribonucleic acid (DNA) of
any length and, where appropriate, single stranded or double
stranded, sense or antisense ribonucleic acid (RNA) of any length,
including siRNA. The term "nucleotide" refers to any of several
compounds that consist of a ribose or deoxyribose sugar joined to a
purine or a pyrimidine base and to a phosphate group, and that are
the basic structural units of nucleic acids. The term "nucleoside"
refers to a compound (as guanosine or adenosine) that consists of a
purine or pyrimidine base combined with deoxyribose or ribose and
is found especially in nucleic acids. The term "nucleotide analog"
or "nucleoside analog" refers, respectively, to a nucleotide or
nucleoside in which one or more individual atoms have been replaced
with a different atom or with a different functional group.
Accordingly, the term polynucleotide includes nucleic acids of any
length, DNA, RNA, analogs and fragments thereof. A polynucleotide
of three or more nucleotides is also called a nucleotidic oligomer or
oligonucleotide.
[0089] It is understood that the polynucleotides described herein
include "genes" and that the nucleic acid molecules described
herein include "vectors" or "plasmids." Accordingly, the term
"gene", also called a "structural gene" refers to a polynucleotide
that codes for a particular sequence of amino acids, which comprise
all or part of one or more proteins or enzymes, and may include
regulatory (non-transcribed) DNA sequences, such as promoter
sequences, which determine for example the conditions under which
the gene is expressed. The transcribed region of the gene may
include untranslated regions, including introns, 5'-untranslated
region (UTR), and 3'-UTR, as well as the coding sequence.
[0090] The term "operon" refers to two or more genes which are
transcribed as a single transcriptional unit from a common
promoter. In some embodiments, the genes comprising the operon are
contiguous genes. It is understood that transcription of an entire
operon can be modified (i.e., increased, decreased, or eliminated)
by modifying the common promoter. Alternatively, any gene or
combination of genes in an operon can be modified to alter the
function or activity of the encoded polypeptide. The modification
can result in an increase in the activity of the encoded
polypeptide. Further, the modification can impart new activities on
the encoded polypeptide. Exemplary new activities include the use
of alternative substrates and/or the ability to function in
alternative environmental conditions.
[0091] A "vector" is any means by which a nucleic acid can be
propagated and/or transferred between organisms, cells, or cellular
components. Vectors include viruses, bacteriophage, pro-viruses,
plasmids, phagemids, transposons, and artificial chromosomes such
as YACs (yeast artificial chromosomes), BACs (bacterial artificial
chromosomes), and PLACs (plant artificial chromosomes), and the
like, that are "episomes," that is, that replicate autonomously or
can integrate into a chromosome of a host cell. A vector can also
be a naked RNA polynucleotide, a naked DNA polynucleotide, a
polynucleotide composed of both DNA and RNA within the same strand,
a poly-lysine-conjugated DNA or RNA, a peptide-conjugated DNA or
RNA, a liposome-conjugated DNA, or the like, that are not episomal
in nature, or it can be an organism which comprises one or more of
the above polynucleotide constructs such as an agrobacterium or a
bacterium.
[0092] "Transformation" refers to the process by which a vector is
introduced into a host cell. Transformation (or transduction, or
transfection), can be achieved by any one of a number of means
including chemical transformation (e.g. lithium acetate
transformation), electroporation, microinjection, biolistics (or
particle bombardment-mediated delivery), or agrobacterium mediated
transformation.
[0093] The term "enzyme" as used herein refers to any substance
that catalyzes or promotes one or more chemical or biochemical
reactions, which usually includes enzymes totally or partially
composed of a polypeptide, but can include enzymes composed of a
different molecule including polynucleotides.
[0094] The term "protein", "peptide" or "polypeptide" as used
herein indicates an organic polymer composed of two or more amino
acidic monomers and/or analogs thereof. As used herein, the term
"amino acid" or "amino acidic monomer" refers to any natural and/or
synthetic amino acids including glycine and both D and L optical
isomers. The term "amino acid analog" refers to an amino acid in
which one or more individual atoms have been replaced, either with
a different atom, or with a different functional group.
Accordingly, the term polypeptide includes amino acidic polymers of
any length including full length proteins, and peptides as well as
analogs and fragments thereof. A polypeptide of three or more amino
acids is also called a protein oligomer or oligopeptide.
[0095] The term "homolog", used with respect to an original enzyme
or gene of a first family or species, refers to distinct enzymes or
genes of a second family or species which are determined by
functional, structural or genomic analyses to be an enzyme or gene
of the second family or species which corresponds to the original
enzyme or gene of the first family or species. Most often, homologs
will have functional, structural or genomic similarities.
Techniques are known by which homologs of an enzyme or gene can
readily be cloned using genetic probes and PCR. Identity of cloned
sequences as homologs can be confirmed using functional assays
and/or by genomic mapping of the genes.
[0096] A protein has "homology" or is "homologous" to a second
protein if the amino acid sequence encoded by a gene has a similar
amino acid sequence to that of the second gene. Alternatively, a
protein has homology to a second protein if the two proteins have
"similar" amino acid sequences. (Thus, the term "homologous
proteins" is defined to mean that the two proteins have similar
amino acid sequences).
[0097] The term "analog" or "analogous" refers to nucleic acid or
protein sequences or protein structures that are related to one
another in function only and are not from common descent or do not
share a common ancestral sequence. Analogs may differ in sequence
but may share a similar structure, due to convergent evolution. For
example, two enzymes are analogs or analogous if the enzymes
catalyze the same reaction of conversion of a substrate to a
product and are unrelated in sequence, irrespective of whether the
two enzymes are related in structure.
Cytosolically Localized Isobutanol Pathway Enzymes and Recombinant
Microorganisms Comprising the Same
[0098] Biosynthetic pathways for the production of isobutanol and
2-methyl-1-butanol by recombinant microorganisms are described by
Atsumi et al. (Atsumi et al., 2008, Nature 451: 86-89). One
strategy described herein for improving isobutanol production by
recombinant microorganisms is the localization of the enzymes
catalyzing the biosynthetic isobutanol pathway to the yeast
cytosol. Cytosolic localization of isobutanol pathway enzyme
activity is desirable, especially for the production of isobutanol,
since the ideal biocatalyst (e.g. recombinant microorganism) will
have the entire isobutanol pathway functionally expressed in the
same compartment (e.g. preferably in the cytosol). In addition,
this localization allows the pathway to utilize pyruvate and
NAD(P)H that are generated in the cytosol by glycolysis and/or the
pentose phosphate pathway without the need for transfer of these
metabolites to an alternative compartment (i.e. the mitochondria).
However, such a strategy of compartmental localization in yeast is
not feasible unless the pathway enzymes exhibit catalytic activity
in that compartment. Thus, if one or more of the cytosolically
localized pathway enzymes lacks catalytic activity in the cytosol,
high level isobutanol production will not occur. As the present
application shows in the Examples below, inefficient cytosolic
activity of one or more isobutanol pathway enzymes (e.g. DHAD or
ALS) can limit isobutanol production.
[0099] The present inventors describe herein cytosolically active
isobutanol pathway enzymes and their use in the production of
various beneficial metabolites, such as isobutanol and
2-methyl-1-butanol. Using a combination of genetic selection and
biochemical analyses, the present inventors have identified a
number of isobutanol pathway enzymes, including DHAD enzymes, that
have activity in the cytosol. Accordingly, in one aspect, the
present application describes the discovery of DHADs with enhanced
cytosolic activity and shows that these newly identified,
cytosolically active DHADs facilitate improved isobutanol
production when co-expressed in the cytosol with the remaining four
isobutanol pathway enzymes.
[0100] As shown in Example 3 below, the native DHAD of yeast is
localized to the mitochondria. Therefore, for economically viable
production of isobutanol to occur in the yeast cytosol, the
identification of heterologous DHAD enzymes that are "cytosolically
active" in yeast (i.e. "active in the cytosol" of the yeast) is
important. In addition, the present application shows that in the
absence of ALS, KARI, KIVD, and ADH enzymes which are "cytosolically
active" (i.e. "active in the cytosol" of yeast),
economically viable isobutanol production will not occur, thus
making identification of native and/or heterologous ALS, KARI,
KIVD, and ADH enzymes additionally and/or independently important
to cytosolic isobutanol production.
[0101] As used herein, the term "cytosolically active" or "active
in the cytosol" means the enzyme exhibits enzymatic activity in the
cytosol of a eukaryotic organism. Cytosolically active enzymes may
further be additionally and/or independently characterized as
enzymes that generally exhibit a specific cytosolic activity which
is greater than the specific mitochondrial activity. In certain
respects, the "cytosolically active" enzymes of the present invention
exhibit a ratio of the specific activity of the mitochondrial
fraction over the specific activity of the whole cell fraction of
less than 1, as determined by the method disclosed in Example 3
herein. Cytosolically active enzymes may further be additionally
and/or independently characterized as enzymes that, when
overexpressed, result in increased activity in the whole cell
fraction and do not result in increased activity in the
mitochondrial fraction, as determined by the method disclosed in
Example 20. Cytosolically active enzymes may further be
additionally and/or independently characterized as enzymes that,
when overexpressed as one of the five enzymes that together
comprise the five-step biosynthetic pathway for the conversion of
pyruvate to isobutanol, result in increased isobutanol production
compared to enzymes that are not cytosolically active or that are
less cytosolically active.
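The first two characterizations above reduce to simple comparisons of
specific activities, which can be sketched as follows. This Python
example is illustrative only and does not reproduce the assay methods
of the Examples. It assumes that specific activities of the
mitochondrial and whole cell fractions have already been measured in
the same units; the function names, the tolerance parameter, and all
numeric values are hypothetical.

```python
def mito_to_whole_cell_ratio(mito_specific_activity, whole_cell_specific_activity):
    """Specific activity of the mitochondrial fraction over the whole cell fraction."""
    if whole_cell_specific_activity <= 0:
        raise ValueError("whole cell specific activity must be positive")
    return mito_specific_activity / whole_cell_specific_activity


def is_cytosolically_active(mito_specific_activity, whole_cell_specific_activity):
    """Ratio criterion: the enzyme qualifies when the ratio is less than 1."""
    return mito_to_whole_cell_ratio(
        mito_specific_activity, whole_cell_specific_activity
    ) < 1.0


def overexpression_is_cytosolic(whole_before, whole_after,
                                mito_before, mito_after, tolerance=0.0):
    """Overexpression criterion: whole cell activity rises, mitochondrial does not.

    `tolerance` is a hypothetical allowance for measurement noise.
    """
    return whole_after > whole_before and mito_after <= mito_before + tolerance


# Hypothetical specific activities in arbitrary but consistent units.
ratio_ok = is_cytosolically_active(0.2, 0.8)
ratio_fail = is_cytosolically_active(1.5, 0.5)
overexpression_ok = overexpression_is_cytosolic(0.5, 1.2, 0.1, 0.1)
```

The third characterization, increased isobutanol production when the
enzyme is overexpressed as part of the five-enzyme pathway, depends on
fermentation data and is not modeled in this sketch.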
[0102] As used herein, the term "cytosolically localized" or
"cytosolic localization" means the enzyme is localized in the
cytosol of a eukaryotic organism. Cytosolically localized enzymes
may further be additionally and/or independently characterized as
enzymes that exhibit a cytosolic protein level which is greater
than the mitochondrial protein level.
Identification of Cytosolically Active Isobutanol Pathway
Enzymes
[0103] In one aspect, the present invention encompasses a number of
strategies for identifying isobutanol pathway enzymes that exhibit
cytosolic activity and/or cytosolic localization, as well as
methods for modifying said isobutanol pathway enzymes to increase
their ability to exhibit cytosolic activity and/or cytosolic
localization.
[0104] In various embodiments described herein, the isobutanol
pathway enzymes may be derived from a prokaryotic organism. In
alternative embodiments described herein, the isobutanol pathway
enzyme may be derived from a eukaryotic organism. In one
embodiment, the eukaryotic organism is a fungal organism. As
described herein, the present inventors have found that in general,
an enzyme from a fungal source is more likely to show activity in
yeast than a bacterial enzyme expressed in yeast. In addition,
homologs that are normally expressed in the cytosol are desired, as
a normally cytoplasmic enzyme is likely to show higher activity in
the cytosol as compared to an enzyme that is relocalized to the
cytosol from other organelles, such as the mitochondria. Fungal
homologs of various isobutanol pathway enzymes are often localized
to the mitochondria. The present inventors have found that fungal
homologs of isobutanol pathway enzymes that are cytosolically
localized will generally be expected to exhibit higher activity in
the cytosol of yeast than those of wild-type yeast strains. Thus,
in one embodiment, the present invention provides fungal isobutanol
pathway enzyme homologs that are cytosolically active and/or
cytosolically localized.
Dihydroxyacid Dehydratase (DHAD)
[0105] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is a dihydroxyacid
dehydratase (DHAD). In accordance with this embodiment, the present
invention provides cytosolically active dihydroxyacid dehydratases
(DHADs) and further describes methods for their use in the
production of various beneficial metabolites, such as isobutanol
and 2-methyl-1-butanol. As noted above, biosynthetic pathways for
the production of isobutanol and 2-methyl-1-butanol have been
described (Atsumi et al., 2008, Nature 451: 86-89). In these
biosynthetic pathways, DHAD catalyzes the conversion of
2,3-dihydroxyisovalerate to 2-ketoisovalerate, and
2,3-dihydroxy-3-methylvalerate to 2-keto-3-methylvalerate,
respectively. Using a combination of genetic selection and
biochemical analyses, the present inventors have identified a
number of DHAD homologs that have activity in the cytosol.
[0106] Among the many strategies for identifying cytosolically
active DHADs, the present inventors performed multiway-protein
alignments between several DHAD homologs. Using this analysis, the
present inventors identified a protein motif that was surprisingly
unique to a subset of DHAD homologs exhibiting cytosolic
activity. This protein motif, P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27),
was found in DHAD homologs demonstrating cytosolic activity in
yeast. Therefore, in one embodiment, the present invention provides
DHAD enzymes comprising the amino acid sequence P(I/L)XXXGX(I/L)XIL
(SEQ ID NO: 27), wherein X is any natural or non-natural amino
acid, and wherein said DHAD enzyme exhibits the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. DHAD
enzymes harboring this sequence include those derived from L.
lactis (SEQ ID NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria
bacterium Ellin345 (SEQ ID NO: 16), Saccharopolyspora erythraea
(SEQ ID NO: 19), Yarrowia lipolytica (SEQ ID NO: 13), Francisella
tularensis (SEQ ID NO: 14), Arabidopsis thaliana (SEQ ID NO: 15),
Thermotoga petrophila (SEQ ID NO: 10), and Victivallis vadensis
(SEQ ID NO: 11). Also encompassed herein are DHAD enzymes that
comprise a motif that is at least about 70% similar, at least about
80% similar, or at least about 90% similar to the motif shown in
SEQ ID NO: 27.
[0107] As described herein, an even more specific version of this
motif has been identified by the present inventors. Thus, in a
further embodiment, the present invention provides DHAD enzymes
comprising the amino acid sequence PIKXXGX(I/L)XIL (SEQ ID NO: 28),
wherein X is any natural or non-natural amino acid, and wherein
said DHAD enzyme exhibits the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. DHAD
enzymes harboring this sequence include those derived from L.
lactis (SEQ ID NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria
bacterium Ellin345 (SEQ ID NO: 16), Y. lipolytica (SEQ ID NO: 13),
F. tularensis (SEQ ID NO: 14), A. thaliana (SEQ ID NO: 15), T.
petrophila (SEQ ID NO: 10), and V. vadensis (SEQ ID NO: 11). Also
encompassed herein are DHAD enzymes that comprise a motif that is
at least about 70% similar, at least about 80% similar, or at least
about 90% similar to the motif shown in SEQ ID NO: 28.
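By way of illustration, the two motifs recited above can be written as regular expressions and used to scan a candidate DHAD sequence. This is a sketch under stated assumptions: X is approximated by the regex wildcard (any single character), the peptide strings in the usage note are hypothetical and are not SEQ ID NOs, and the "percent similar" variants are not modeled.

```python
import re

# P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27); X is approximated by "." (any character).
MOTIF_SEQ_ID_27 = re.compile(r"P[IL].{3}G.[IL].IL")
# PIKXXGX(I/L)XIL (SEQ ID NO: 28), the more specific version of the motif.
MOTIF_SEQ_ID_28 = re.compile(r"PIK.{2}G.[IL].IL")


def find_motif(sequence, pattern):
    """Return the 0-based start index of the first motif match, or None."""
    match = pattern.search(sequence.upper())
    return match.start() if match else None
```

For example, the hypothetical peptide `MAAPIKSSGAIAILGG` matches both patterns, whereas `MAAPLQSSGALAILGG` matches only the more general motif of SEQ ID NO: 27.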
[0108] As noted above, one such cytosolically active DHAD
identified herein is exemplified by the L. lactis DHAD amino acid
sequence of SEQ ID NO: 18, which is encoded by the L. lactis ilvD
gene. As described herein, the present inventors have discovered
that yeast strains expressing the cytosolically active L. lactis
ilvD (DHAD) exhibit higher isobutanol production than yeast strains
expressing the S. cerevisiae ILV3 (DHAD), even when the ILV3 from
S. cerevisiae is truncated at its N-terminus to remove a putative
mitochondrial targeting sequence. In addition to the use and
identification of the cytosolically active DHAD homolog from L.
lactis, the present invention encompasses a number of different
strategies for identifying DHAD enzymes that exhibit cytosolic
activity and/or cytosolic localization, as well as methods for
modifying DHADs to increase their ability to exhibit cytosolic
activity and/or cytosolic localization.
[0109] In various embodiments described herein, the DHAD enzymes
may be derived from a prokaryotic organism. In one embodiment, the
prokaryotic organism is a bacterial organism. In another
embodiment, the bacterial organism is L. lactis. In a specific
embodiment, the DHAD enzyme from L. lactis comprises the amino acid
sequence of SEQ ID NO: 18. In other embodiments, the bacterial
organisms are of the genus Lactococcus, Gramella, Acidobacteria,
Francisella, Thermotoga and Victivallis.
[0110] In alternative embodiments, the DHAD enzyme may be derived
from a eukaryotic organism. In one embodiment, the eukaryotic
organism is a fungal organism. In an exemplary embodiment, the
fungal organism is Neurospora crassa. In a specific embodiment, the
DHAD enzyme from N. crassa comprises the amino acid sequence of SEQ
ID NO: 165.
[0111] As described herein, the present inventors have found that
in general, an enzyme from a fungal source is more likely to show
activity in yeast than a bacterial enzyme expressed in yeast. In
addition, homologs that are normally expressed in the cytosol are
desired, as a normally cytoplasmic enzyme is likely to show higher
activity in the cytosol as compared to an enzyme that is
relocalized to the cytosol from other organelles, such as the
mitochondria. Fungal homologs of various isobutanol pathway
enzymes, including DHAD, are often localized to the mitochondria.
The present inventors have found that fungal homologs of DHAD that
are cytosolically localized will generally be expected to exhibit
higher activity in the cytosol of yeast than those of wild-type
yeast strains. Thus, in one embodiment, the present invention
provides fungal DHAD homologs that are cytosolically active and/or
cytosolically localized.
[0112] In another embodiment, the eukaryotic organism is a yeast
organism. In another embodiment, the eukaryotic organism is
selected from the group consisting of the genera Entamoeba and
Giardia.
[0113] In various embodiments described herein, the recombinant
microorganism may exhibit at least about 5 percent greater
dihydroxyacid dehydratase (DHAD) activity in the cytosol as
compared to the parental microorganism. In another embodiment, the
recombinant microorganism may exhibit at least about 10 percent, at
least about 15 percent, at least about 20 percent, at least
about 25 percent, at least about 30 percent, at least about 35
percent, at least about 40 percent, at least about 45 percent, at
least about 50 percent, at least about 55 percent, at least about
60 percent, at least about 65 percent, at least about 70 percent,
at least about 75 percent, at least about 80 percent, at least
about 100 percent, at least about 200 percent, or at least about
500 percent greater dihydroxyacid dehydratase (DHAD) activity in
the cytosol as compared to the parental microorganism.
[0114] In another embodiment, the present invention provides DHAD
enzymes that, when overexpressed in yeast, result in increased
activity in the whole cell fraction and do not result in increased
activity in the mitochondrial fraction. In one embodiment, the DHAD
activity in the whole cell fraction is increased by at least about
2-fold. In another embodiment, DHAD activity in the whole cell
fraction is increased by at least about 5-fold. In yet another
embodiment, DHAD activity in the whole cell fraction is increased
by at least about 7-fold. In yet another embodiment, DHAD activity
in the whole cell fraction is increased by at least about 10-fold.
In yet another embodiment, DHAD activity in the whole cell fraction
is increased by at least about 50-fold. In yet another embodiment,
DHAD activity in the whole cell fraction is increased by at least
about 100-fold.
Acetolactate Synthase (ALS)
[0115] As described herein, the isobutanol pathway enzymes in
addition to DHAD should preferably be active in the cytosol. These
cytosolically active isobutanol pathway enzymes will generally
exhibit enzymatic activity in the cytosol. For instance, a
cytosolically active ALS should generally exhibit the ability to
convert two pyruvate molecules to acetolactate in the cytosol. Thus, in various
embodiments described herein, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is acetolactate
synthase (ALS). In yeasts such as S. cerevisiae, the native
acetolactate synthase, encoded in S. cerevisiae by the ILV2 gene,
is naturally expressed in the yeast mitochondria. Unlike the
endogenous acetolactate synthase of yeast, expression of
heterologous acetolactate synthases such as the B. subtilis alsS
and the L. lactis alsS in yeast occurs in the yeast cytosol (i.e.
cytosolically-localized). Thus, cytosolic expression of
acetolactate synthase is achieved by transforming a yeast with a
gene encoding an acetolactate synthase protein (EC 2.2.1.6).
[0116] ALS homologs that could be cytosolically expressed and
localized in yeast are predicted to lack a mitochondrial targeting
sequence as analyzed using MitoProt (Claros et al., 1996, Eur. J.
Biochem 241: 779-86). Such cytosolically localized ALS proteins can
be used as the first step in the isobutanol pathway. ALS homologs
include, but are not limited to, the following: the Serratia
marcescens ALS (GenBank Accession No. ADH43113.1) (probability of
mitochondrial localization 0.07), the Enterococcus faecalis ALS
(GenBank Accession No. NP_814940) (probability of
mitochondrial localization 0.21), the Leuconostoc mesenteroides ALS
(GenBank Accession No. YP_818010.1) (probability of
mitochondrial localization 0.21), the Staphylococcus aureus ALS
(GenBank Accession No. YP_417545) (probability of
mitochondrial localization 0.13), the Burkholderia cenocepacia ALS
(GenBank Accession No. YP_624435) (probability of
mitochondrial localization 0.15), the T. atroviride ALS (SEQ ID NO:
71) (probability of mitochondrial localization 0.19), the T.
stipitatus ALS (SEQ ID NO: 72) (probability of mitochondrial
localization 0.19), and the Magnaporthe grisea ALS (GenBank
Accession No. EDJ99221) (probability of mitochondrial localization
0.02), a homolog or variant of any of the foregoing, and a
polypeptide having at least 60% identity to any one of the foregoing
and exhibiting cytosolic ALS activity.
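The predicted probabilities of mitochondrial localization listed above lend themselves to a simple ranking. The sketch below uses only the values recited in this paragraph; the cutoff of 0.5 is an assumption chosen for illustration, not a threshold stated herein, and the function name is hypothetical.

```python
# Probability of mitochondrial localization for each ALS homolog, as listed above.
MITOPROT_PROBABILITY = {
    "Serratia marcescens": 0.07,
    "Enterococcus faecalis": 0.21,
    "Leuconostoc mesenteroides": 0.21,
    "Staphylococcus aureus": 0.13,
    "Burkholderia cenocepacia": 0.15,
    "T. atroviride": 0.19,
    "T. stipitatus": 0.19,
    "Magnaporthe grisea": 0.02,
}


def predicted_cytosolic(probabilities, cutoff=0.5):
    """Return homolog names below the (assumed) cutoff, lowest probability first."""
    names = [name for name, p in probabilities.items() if p < cutoff]
    return sorted(names, key=lambda name: probabilities[name])
```

With the assumed cutoff, all eight listed homologs are retained, and the M. grisea ALS ranks first as the least likely to be mitochondrially targeted.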
[0117] In one embodiment, the cytosolically active ALS is derived
from a prokaryotic organism, including, but not limited to B.
subtilis or L. lactis, which exhibit cytosolic activity. In another
embodiment, the ALS may be derived from a eukaryotic organism,
including, but not limited to M. grisea, P. nodorum, T. stipitatus,
and T. atroviride.
[0118] In some embodiments, an ALS enzyme that is predicted to be
mitochondrially localized may be mutated or modified to remove or
modify an N-terminal mitochondrial targeting sequence (MTS) to
reduce or eliminate its ability to target the ALS enzyme to the
mitochondria. Removal of the MTS can increase cytosolic
localization of the ALS and/or increase the cytosolic activity of
the ALS as compared to the parental ALS.
[0119] The conversion of two pyruvate molecules to acetolactate can
be carried out by either an acetohydroxyacid synthase (AHAS) or an
acetolactate synthase (ALS). AHASs are involved in biosynthesis of
branched chain amino acids in the mitochondria of yeasts. They are
FAD-dependent and are feedback inhibited by branched chain amino
acids. ALSs are catabolic and are involved in the conversion of
pyruvate to acetoin. ALSs are FAD-independent and not feedback
inhibited by branched chain amino acids. In addition, ALSs are
specific for the conversion of two pyruvates to acetolactate.
Therefore, ALSs are favored over AHASs. In addition, in the case of
yeast, AHASs are normally mitochondrial; therefore, a fungal ALS
that is cytoplasmic is favored. Sequence analysis has shown that
there is a conserved sequence `RFDDR` found in AHASs that is not
conserved among ALSs (Le et al., 2005, Bull. Korean Chem Soc 26:
916-20). This sequence is likely involved in FAD-binding by AHASs
and thus could be used to distinguish between the FAD-dependent
AHASs and the FAD-independent ALSs. Using this region to
distinguish between AHASs and ALSs, BLAST searches of fungal
sequence databases were performed and resulted in the
identification of ALS homologs from several fungal species (M.
grisea, P. nodorum, T. atroviride, T. stipitatus, P. marneffei, and
Glomerella graminicola). Of these sequences, the ALS homologs from
M. grisea, P. nodorum, T. stipitatus, and T. atroviride will
generally be expected to be cytosolically localized.
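The sequence-based distinction described above can be sketched as a simple screen for the conserved `RFDDR` sequence. This is a deliberately simplified illustration: it tests only for an exact occurrence anywhere in the sequence, whereas an actual analysis would examine the aligned region, and the function name and labels are hypothetical.

```python
AHAS_CONSERVED_SEQUENCE = "RFDDR"


def classify_by_rfddr(sequence):
    """Label a candidate as 'AHAS-like' if it contains RFDDR, else 'ALS-like'.

    Exact-substring matching is an assumption made for brevity; it does not
    check the position of the sequence or tolerate substitutions.
    """
    if AHAS_CONSERVED_SEQUENCE in sequence.upper():
        return "AHAS-like"
    return "ALS-like"
```

Candidates labeled ALS-like by such a screen would still need to be checked for a predicted mitochondrial targeting sequence before being treated as cytosolic.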
[0120] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater acetolactate synthase (ALS)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater acetolactate
synthase (ALS) activity in the cytosol as compared to the parental
microorganism.
Ketol-Acid Reductoisomerase (KARI)
[0121] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is a ketol-acid
reductoisomerase (KARI). A cytosolically active KARI should
generally exhibit the ability to convert acetolactate to
2,3-dihydroxyisovalerate in the cytosol.
[0122] In one embodiment, the KARI is derived from a prokaryotic
organism, including, but not limited to Escherichia coli, B.
subtilis or L. lactis.
[0123] In another embodiment, the KARI is derived from a eukaryotic
organism, including, but not limited to Piromyces sp. E2, S.
cerevisiae, and Arabidopsis. Fungal homologs of KARI are generally
mitochondrially localized. The present inventors have identified a
fungal homolog from the anaerobic rumen fungus Piromyces sp. E2
that is cytosolically localized.
[0124] In certain specific embodiments, the KARI comprises an amino
acid sequence selected from the group consisting of E. coli
(GenBank No: NP_418222, SEQ ID NO: 1), S. cerevisiae (GenBank
No: NP_013459, SEQ ID NO: 2), and B. subtilis (GenBank No:
CAB14789) and the KARI enzymes from Piromyces sp. E2 (GenBank No:
CAA76356), B. aphidicola (GenBank No: AAF13807), S. oleracea
(GenBank No: CAA40356), O. sativa (GenBank No: NP_001056384,
SEQ ID NO: 3), C. reinhardtii (GenBank No: XP_001702649, SEQ
ID NO: 6), N. crassa (GenBank No: XP_961335), S. pombe
(GenBank No: NP_001018845), L. bicolor (GenBank No:
XP_001880867), I. hospitalis (GenBank No:
YP_001435197), P. torridus (GenBank No: YP_023851, SEQ
ID NO: 7), A. cryptum (GenBank No: YP_001235669, SEQ ID NO:
5), Cyanobacteria/Synechococcus sp. (GenBank No: YP_473733),
Z. mobilis (GenBank No: YP_162876, SEQ ID NO: 8), B.
thetaiotaomicron (GenBank No: NP_810987), M. maripaludis
(GenBank No: YP_001097443, SEQ ID NO: 4), V. fischeri
(GenBank No: YP_205911), Shewanella sp. (GenBank No:
YP_732498.1), G. forsetii (GenBank No: YP_862142), P.
ingrahamii (GenBank No: YP_942294), and C. hutchinsonii
(GenBank No: YP_677763), a homolog or variant of any of the
foregoing, and a polypeptide having at least 60% identity to any one
of the foregoing and exhibiting cytosolic KARI activity.
[0125] In additional embodiments, the KARI may be an NADH-dependent
KARI. Thus, in one embodiment, the present invention provides
recombinant microorganisms in which the NADPH-dependent enzyme
KARI is replaced with an enzyme that preferentially depends on NADH
(i.e. a KARI that is NADH-dependent). In one embodiment, such
enzymes may be identified in nature. In an alternative embodiment,
such enzymes may be generated by protein engineering techniques
including but not limited to directed evolution or site-directed
mutagenesis. NADH-dependent KARIs useful in various methods of the
present invention are described in commonly owned and co-pending
applications U.S. Ser. No. 12/610,784 and PCT/US09/62952 (published
as WO/2010/051527), which are herein incorporated by reference in
their entireties for all purposes.
[0126] In one embodiment, a microorganism is provided in which
cofactor usage is balanced during the production of a fermentation
product and the microorganism produces the fermentation product at
a higher yield compared to a modified microorganism in which the
cofactor usage is not balanced. In another embodiment of the
present invention, a microorganism is provided in which the
cofactor usage is balanced during the production of isobutanol and
the microorganism produces isobutanol at a higher yield compared to
a modified microorganism in which the cofactor usage is not
balanced. Methods for achieving co-factor balance are described in
commonly owned and co-pending applications U.S. Ser. No. 12/610,784
and PCT/US09/62952 (published as WO/2010/051527), which are herein
incorporated by reference in their entireties for all purposes.
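The notion of cofactor balance can be illustrated with simple bookkeeping per molecule of glucose. The sketch assumes standard stoichiometry (glycolysis yields two pyruvate and two NADH per glucose, and the pathway consumes one reduced cofactor at the KARI step and one at the ADH step per isobutanol); the cofactor preference of each enzyme is passed in as an input, and the function names are hypothetical.

```python
from collections import Counter

# Reduced cofactors generated by glycolysis per glucose (2 pyruvate + 2 NADH).
GLYCOLYSIS_SUPPLY = Counter({"NADH": 2})


def pathway_demand(kari_cofactor, adh_cofactor):
    """Reduced cofactors consumed per isobutanol (one at KARI, one at ADH)."""
    demand = Counter()
    demand[kari_cofactor] += 1
    demand[adh_cofactor] += 1
    return demand


def net_cofactor_balance(kari_cofactor, adh_cofactor):
    """Supply minus demand for each cofactor; all zeros means balanced."""
    demand = pathway_demand(kari_cofactor, adh_cofactor)
    cofactors = sorted(set(GLYCOLYSIS_SUPPLY) | set(demand))
    return {c: GLYCOLYSIS_SUPPLY[c] - demand[c] for c in cofactors}
```

Under these assumptions, NADPH-dependent KARI and ADH leave a surplus of two NADH and a deficit of two NADPH per glucose, whereas NADH-dependent enzymes at both steps give a net balance of zero.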
[0127] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater ketol-acid reductoisomerase (KARI)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater ketol-acid
reductoisomerase (KARI) activity in the cytosol as compared to the
parental microorganism.
Keto-Acid Decarboxylase (KIVD)
[0128] A cytosolically active KIVD should generally exhibit the
ability to convert ketoisovalerate to isobutyraldehyde in the
cytosol. In one embodiment, the cytosolically active KIVD is
derived from a prokaryotic organism, including, but not limited to
L. lactis, which exhibits cytosolic activity. In a specific
embodiment, the KIVD enzyme from L. lactis comprises the amino acid
sequence of SEQ ID NO: 173. In additional embodiments, the
cytosolically active KIVD is derived from, for example,
Enterobacter cloacae (Accession No. P23234.1), Mycobacterium
smegmatis (Accession No. A0R480.1), Mycobacterium tuberculosis
(Accession No. O53865.1), Mycobacterium avium (Accession No.
Q742Q2.1), Azospirillum brasilense (Accession No. P51852.1), B.
subtilis (see Oku et al., 1988, J. Biol. Chem. 263: 18386-96), a
homolog or variant of any of the foregoing, and a polypeptide
having at least 60% identity to any one of the foregoing and
exhibiting cytosolic KIVD activity.
[0129] In an alternative embodiment, the KIVD may be derived from
a eukaryotic organism.
[0130] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater 2-keto-acid decarboxylase (KIVD)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater 2-keto-acid
decarboxylase (KIVD) activity in the cytosol as compared to the
parental microorganism.
Alcohol Dehydrogenase (ADH)
[0131] A cytosolically active ADH (used interchangeably herein with
isobutanol dehydrogenase, "IDH") should generally exhibit the
ability to convert isobutyraldehyde to isobutanol in the cytosol.
In one embodiment, the cytosolically active ADH is derived from a
prokaryotic organism, including, but not limited to L. lactis. In a
specific embodiment, the ADH enzyme from L. lactis comprises the
amino acid sequence of SEQ ID NO: 175. In additional embodiments,
the ADH is derived from, for example, Lactobacillus brevis
(Accession No. YP_794451.1), Pediococcus acidilactici
(Accession No. ZP_06197454.1), Bacillus cereus (Accession No.
YP_001374103.1), Bacillus thuringiensis (Accession No.
ZP_04101989.1), Leptotrichia goodfellowii (Accession No.
ZP_06011170.1), Actinobacillus pleuropneumoniae (Accession
No. ZP_00134308.2), Streptococcus sanguinis (Accession No.
YP_001035842.1), Eikenella corrodens (Accession No.
ZP_03713785.1), Exiguobacterium sp. (Accession No.
YP_002886170.1), Neisseria elongata (Accession No.
ZP_06736067.1), E. coli (Accession No. ZP_06937530.1),
Neisseria meningitidis (Accession No. CBA03965.1), Erwinia
pyrifoliae (Accession No. CAY75147.1), and Colwellia
psychrerythraea (Accession No. YP_270515.1), a homolog or
variant of any of the foregoing, and a polypeptide having at least
60% identity to any one of the foregoing and having cytosolic ADH
activity.
[0132] In an alternative embodiment, the ADH may be derived from a
eukaryotic organism, including, but not limited to S. cerevisiae
and D. melanogaster. In a specific embodiment, the ADH enzyme from
S. cerevisiae is Adh7. In another specific embodiment, the ADH
enzyme from D. melanogaster comprises the amino acid sequence of
SEQ ID NO: 176.
[0133] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater alcohol dehydrogenase (ADH)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater alcohol
dehydrogenase (ADH) activity in the cytosol as compared to the
parental microorganism.
Chimeric Isobutanol Pathway Enzymes
[0134] In another aspect, the present invention provides
recombinant microorganisms comprising chimeric proteins consisting
of isobutanol pathway enzymes. In one embodiment, the chimeric
proteins consist of ALS and at least one additional protein. In a
specific embodiment, the additional protein is KARI. In a preferred
embodiment, the chimeric protein exhibits the biocatalytic
properties of both ALS and KARI. Creating a chimeric protein
that incorporates the activities of both ALS and KARI will
generally be expected to reduce the effect of diffusion and
decrease the time available for spontaneous decomposition to occur.
Using a flexible linker and/or structural and sequence information
to create a protein with the biocatalytic properties of both ALS
and KARI will generally increase the concentration of
2-acetolactate at the active site of KARI, causing 2-acetolactate
to be converted to 2,3-dihydroxyisovalerate at a rate near its
theoretical maximum (very little effect of diffusion), and thus, the
total concentration of 2-acetolactate should remain low, correspondingly
decreasing its spontaneous decomposition. This will generally have
the effect of increasing the rate of conversion of 2-acetolactate
to 2,3-dihydroxyisovalerate.
[0135] In another embodiment, the chimeric proteins consist of KARI
and at least one additional protein. In a specific embodiment, the
additional protein is DHAD. In a preferred embodiment, the chimeric
protein exhibits the biocatalytic properties of both KARI and DHAD.
In each of the various embodiments described herein, the proteins
may be connected via a flexible linker.
Isobutanol Pathway Enzymes Attached to a Protein Scaffold
[0136] In another aspect, the present invention provides
recombinant microorganisms comprising a scaffold system tethered to
one or more isobutanol pathway enzymes. In a specific embodiment,
the scaffold system is the MAP kinase scaffold (Ste5) system. In a
further embodiment, one or more of the isobutanol pathway enzymes
may be modified or mutated to comprise a protein domain allowing
for binding to the scaffold system.
[0137] The present inventors have found that via the use of a
protein scaffold, the isobutanol pathway enzymes that act in
concert as part of a single pathway can be co-localized. In some
embodiments, the scaffold systems are adapted for binding to the
isobutanol pathway enzymes. By tethering the enzymes that work
together in the pathway to a scaffold protein, they are brought
into close physical proximity with each other, thus increasing the
efficiency of the isobutanol production.
[0138] There are several advantages to keeping pathway enzymes
together on a scaffold system. One is that proteins that normally
would localize to an intracellular compartment, like the
mitochondria, are partitioned onto the scaffold, thus keeping a
sizeable portion of the protein population in the cytosol. Another
is that the chemical products of each enzyme are physically close to
the next enzyme in the pathway, which speeds reaction time and
decreases the possibility that the product would be used in a
competing pathway. Finally, unstable products of the enzymes would
be used more quickly, since the next enzyme in the pathway would be
adjacent to use it as a substrate, thus decreasing nonproductive
degradation of the product.
[0139] In a preferred embodiment, the isobutanol pathway enzymes
are arranged in the sequence in which they are needed to function
(i.e. ALS followed by KARI followed by DHAD followed by KIVD
followed by ADH). In another embodiment, the scaffolded protein
complex is targeted to the cytosol by adding localization signals
to the scaffold. In yet another embodiment, the scaffolded protein
complex is targeted to the cell wall by adding localization signals
to the scaffold. As would be understood by one of skill in the art,
the scaffold system allows for co-localization of proteins or
enzymes in addition to the isobutanol pathway enzymes. Such
proteins may include chaperone proteins, proteins for the
conversion of xylose to xylulose-5P, cellulases, etc.
Removal and/or Modification of N-Terminal Mitochondrial Targeting
Sequences
[0140] The localization of the enzymes involved in production of
isobutanol is desired to be cytosolic. Cytosolic localization
allows for the pathway to utilize pyruvate and NAD(P)H that is
generated in the cytosol by glycolysis and/or the pentose phosphate
pathway without the need for the transfer of these metabolites to
an alternative compartment (i.e. mitochondria). However, the yeast
enzymes acetohydroxyacid synthase (AHAS; Ilv2+Ilv6), ketol-acid
reductoisomerase (KARI; Ilv5), and dihydroxyacid dehydratase (DHAD;
Ilv3) that carry out the first three steps of isobutanol production
are physiologically localized to the mitochondria. Mitochondrial
matrix proteins are typically targeted to the mitochondria by an
N-terminal mitochondrial targeting sequence (MTS), which is then
cleaved off in the mitochondria resulting in the `mature` form of
the enzyme (Paschen et al., 2001, IUBMB Life 52: 101-112). Indeed,
the N-terminal targeting sequence for Ilv6 has been defined (Pang
et al., 1999, Biochemistry 38: 5222-31). N-terminal deletions of
Ilv5 have also been shown to re-localize this enzyme to the cytosol
(Omura, 2008, Appl. Microbiol. Biotechnol. 78: 503-513; See also
Omura, WO/2009/078108 A1, hereby incorporated by reference in its
entirety).
[0141] One mechanism identified by the present inventors for the
cytosolic localization of isobutanol pathway enzymes involves the
removal and/or modification of N-terminal mitochondrial targeting
sequences (MTS). Nuclear genome-encoded proteins destined to reside
in the mitochondria often contain an N-terminal Mitochondrial
Targeting Sequence (MTS) that is recognized by a set of proteins
collectively known as mitochondrial import machinery. Following
recognition and import, the MTS is then physically cleaved off of
the imported protein. In eukaryotes, homologs of two of the
isobutanol pathway enzymes, ketol-acid reductoisomerase (KARI, e.g.
S. cerevisiae Ilv5) and dihydroxy acid dehydratase (DHAD, e.g. S.
cerevisiae Ilv3), are predicted to be mitochondrial, based upon the
presence of an N-terminal MTS as well as several in vivo functional
and mutational studies (See e.g., Omura, F., 2008, Appl Gen &
Mol Biot 78: 503-513). As described herein, the present inventors
have designed isobutanol pathway enzymes, whereby the predicted MTS
is removed or modified. In some instances, there exists
experimental evidence for the length of the MTS. Specifically, the
MTS of Ilv6 has been experimentally defined to be the N-terminal 61
amino acids (Pang et al., 1999, Biochemistry 38: 5222-31). The MTS
of Ilv5 has been reported to be the N-terminal 47 residues (Kassow
A., 1992, "Metabolic effects of deleting the region encoding the
transit peptide in Saccharomyces cerevisiae ILV5" PhD thesis,
University of Copenhagen). In addition, the deletion of the
N-terminal 46 amino acids of Ilv5 has been shown to result in an
active enzyme that is localized in the cytosol (Omura, F., 2008,
Appl Gen & Mol Biot 78: 503-513).
[0142] As described herein, the present inventors utilize deletions
and/or modifications of the N-terminal MTS to localize the enzymes
of the isobutanol pathway to the cytosol. In various embodiments,
the MTS can be entirely or partly deleted or its sequence can be
modified to eliminate its ability to target the protein to the
mitochondria. A benefit of removing the entire MTS is that the
resulting protein would essentially be the `mature` form of the
enzyme. The use of deletion of the N-terminal MTS can also be
expanded to all enzymes/homologs to be used for isobutanol
production. This is especially true for homologs from eukaryotic
organisms other than S. cerevisiae where the enzymes are localized
to the mitochondria. In addition, some bacterial homologs may have
a putative MTS. As bacterial enzymes do not undergo an N-terminal
cleavage, N-terminal deletions may be deleterious to these enzymes.
In such cases, modifications of the sequence to block the MTS
function of the N-terminal sequence may be preferable as such
alterations would likely be less deleterious to the enzyme's
activity. N-terminal MTS can be predicted by MitoProt II (See,
e.g., Claros et al., 1996, Eur. J. Biochem. 241: 779-786). Using
this program, the lengths of the MTS for Ilv2 and Ilv3 were
predicted to be the N-terminal 55 and 20 amino acids, respectively.
Modification of the MTS as contemplated herein includes the
introduction of one or multiple mutations to inhibit MTS function.
It is thought that the mitochondrial import machinery recognizes
the amphipathic alpha helix that is formed by the MTS. Thus,
modifications that may inhibit MTS function would be amino acid
changes that disrupt this amphipathic character, such as mutating
the charged residues. Such modification(s) prevent its recognition
by the mitochondrial import machinery and subsequent cleavage of
the MTS and import into the mitochondria.
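The N-terminal deletion strategy described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construct design used by the inventors: the helper name delete_mts, the toy sequence, and the choice to re-add an initiator methionine are hypothetical, while the MTS lengths are the values quoted in the text.

```python
# Illustrative sketch only: trims a predicted N-terminal MTS from a protein
# sequence. The MTS lengths are those quoted in the text; the function name,
# the toy sequence, and the re-added start methionine are assumptions.

MTS_LENGTHS = {
    "Ilv6": 61,  # experimentally defined (Pang et al., 1999)
    "Ilv5": 47,  # reported (Kassow, 1992)
    "Ilv2": 55,  # predicted with MitoProt II
    "Ilv3": 20,  # predicted with MitoProt II
}


def delete_mts(protein_seq: str, mts_length: int, add_start_met: bool = True) -> str:
    """Return protein_seq with its first mts_length residues removed."""
    if not 0 <= mts_length < len(protein_seq):
        raise ValueError("MTS length must be shorter than the protein")
    mature = protein_seq[mts_length:]
    # Assumption: the truncated open reading frame still needs a start codon,
    # so a methionine is prepended unless the mature form already begins with one.
    if add_start_met and not mature.startswith("M"):
        mature = "M" + mature
    return mature


if __name__ == "__main__":
    toy_protein = "M" + "A" * 19 + "GSTKDE"  # placeholder, not a real Ilv3 sequence
    print(delete_mts(toy_protein, MTS_LENGTHS["Ilv3"]))
```

The same helper could be applied to any homolog once an MTS length has been predicted or measured for it; whether the truncated enzyme is active still has to be determined experimentally.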
Peptide Tags to Augment Cytosolic Localization of Isobutanol
Pathway Enzymes
[0143] In additional embodiments, the mitochondrially imported
isobutanol pathway enzymes can be expressed as a chimeric fusion
protein to augment cytosolic localization. In one embodiment, the
isobutanol pathway enzyme is fused to a peptide tag, whereby said
isobutanol pathway enzyme exhibits increased cytosolic localization
and/or cytosolic activity in yeast as compared to the parental
isobutanol pathway enzyme. In one embodiment, the isobutanol
pathway enzyme is fused to a peptide tag following removal of the
N-terminal Mitochondrial Targeting Sequence (MTS). In one
embodiment, the peptide tag is non-cleavable. In a preferred
embodiment, the peptide tag is fused at the N-terminus of the
isobutanol pathway enzyme. Peptide tags useful in the present
invention preferably have the following properties: (1) they do not
significantly hinder the normal enzymatic function of the
isobutanol pathway enzyme; (2) they fold in such a way as to
block recognition of an N-terminal MTS by the normal mitochondrial
import machinery; (3) they promote the stable expression and/or
folding of the isobutanol pathway enzyme they precede; and (4) they
can be detected, for example, by Western blotting or SDS-PAGE plus
Coomassie staining to facilitate analysis of the overexpressed
chimeric protein.
[0144] Suitable peptide tags for use in the present invention
include, but are not limited to, ubiquitin, ubiquitin-like (UBL)
proteins, myc, HA-tag, green fluorescent protein (GFP), and the
maltose binding protein (MBP). Ubiquitin and the ubiquitin-like
proteins (Ubls) offer several advantages. For instance, the use of
ubiquitin or similar Ubls (e.g., SUMO) as a solubility- and
expression-enhancing fusion partner has been well documented (Ecker
et al., 1989, J Biol Chem 264: 7715-9; Marblestone et al., 2006,
Protein Science 15: 182-9). In fact, in S. cerevisiae, several
ribosomal proteins are expressed as C-terminal fusions to
ubiquitin. Following translation and protein folding, ubiquitin is
cleaved from its co-expressed partner by a highly specific
ubiquitin hydrolase, which recognizes and requires the extreme
C-terminal Gly-Gly motif present in ubiquitin and cleaves
immediately following this sequence; a similar pathway removes Ubl
proteins from their fusion partners.
[0145] The invention described here provides a method to
re-localize a normally mitochondrial protein or enzyme by
expressing it as a fusion with an N-terminal, non-cleavable ubiquitin
or ubiquitin-like molecule. In doing so, the re-targeted enzyme
enjoys enhanced expression, solubility, and function in the
cytosol. In another embodiment, the sequence encoding the MTS can
be replaced with a sequence encoding one or more copies of the
c-myc epitope tag (amino acids EQKLISEEDL, SEQ ID NO: 9), which
will generally not target a protein into the mitochondria and can
easily be detected by commercially available antibodies.
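Replacing the MTS with c-myc epitope copies, as described in the preceding paragraph, can be sketched as follows. This is a hedged illustration: the function name replace_mts_with_myc and the leading initiator methionine are assumptions, and only the epitope sequence itself is taken from the text.

```python
# Illustrative sketch: swap a predicted N-terminal MTS for one or more copies
# of the c-myc epitope. Only the epitope sequence comes from the text; the
# function name and the leading methionine are assumptions.

MYC_TAG = "EQKLISEEDL"  # c-myc epitope (SEQ ID NO: 9 in the text)


def replace_mts_with_myc(protein_seq: str, mts_length: int, copies: int = 1) -> str:
    """Return a sequence in which the first mts_length residues are replaced by myc tags."""
    if copies < 1:
        raise ValueError("at least one tag copy is required")
    if not 0 <= mts_length < len(protein_seq):
        raise ValueError("MTS length must be shorter than the protein")
    # Assumption: an initiator methionine precedes the tag(s).
    return "M" + MYC_TAG * copies + protein_seq[mts_length:]


if __name__ == "__main__":
    toy_protein = "M" + "A" * 19 + "GSTKDE"  # placeholder sequence
    print(replace_mts_with_myc(toy_protein, 20, copies=2))
```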
Altering the Iron-Sulfur Cluster Domain and/or Redox Active
Domain
[0146] In general, the yeast cytosol demonstrates a different redox
potential than a bacterial cell, as well as the yeast mitochondria.
As a result, isobutanol pathway enzymes which exhibit an iron
sulfur (FeS) domain and/or redox active domain, may require the
redox potential of the native environments to be folded or
expressed in a functional form. Expressing some isobutanol pathway
enzymes in the yeast cytosol, which can harbor unfavorable redox
potential, has the propensity to result in inactive proteins, even
if the proteins are expressed. The present inventors have
identified a number of different strategies to overcome this
problem, which can arise when an isobutanol pathway enzyme which is
suited to a particular environment with a specific redox potential
is expressed in the yeast cytosol.
[0147] In one embodiment, the present invention provides isobutanol
pathway enzymes that exhibit a properly folded iron-sulfur cluster
domain and/or redox active domain in the cytosol. Such isobutanol
pathway enzymes will generally comprise a mutated or modified
iron-sulfur cluster domain and/or redox active domain, allowing for
a non-native isobutanol pathway enzyme to be expressed in the yeast
cytosol in a functional form.
[0148] In various embodiments described herein, the recombinant
microorganisms may further comprise a nucleic acid encoding a
chaperone protein, wherein said chaperone protein assists the
folding of a protein exhibiting cytosolic activity. In a preferred
embodiment, the protein exhibiting cytosolic activity is DHAD. In
one embodiment, the chaperone may be a native protein. In another
embodiment, the chaperone protein may be an exogenous protein. In
some embodiments, the chaperone protein may be selected from the
group consisting of: endoplasmic reticulum oxidoreductin 1 (Ero1,
Accession No. NP_013576.1), including variants of Ero1 that
have been suitably altered to reduce or prevent its normal
localization to the endoplasmic reticulum; thioredoxins (which
include Trx1, Accession No. NP_013144.1; and Trx2, Accession
No. NP_011725.1), thioredoxin reductase (Trr1, Accession No.
NP_010640.1); glutaredoxins (which include Grx1, Accession
No. NP_009895.1; Grx2, Accession No. NP_010801.1; Grx3,
Accession No. NP_010383.1; Grx4, Accession No.
NP_01101.1; Grx5, Accession No. NP_015266.1; Grx6,
Accession No. NP_010274.1; Grx7, Accession No.
NP_009570.1; Grx8, Accession No. NP_013468.1);
glutathione reductase Glr1 (Accession No. NP_015234.1); and
Jac1 (Accession No. NP_011497.1), including variants of Jac1
that have been suitably altered to reduce or prevent its normal
mitochondrial localization; and homologs or variants thereof.
[0149] As described herein, iron-sulfur cluster assembly for
insertion into yeast apo-iron-sulfur proteins begins in yeast
mitochondria. To assemble in yeast the active iron-sulfur proteins
containing the cluster, either the apo-iron-sulfur protein is
imported into the mitochondria from the cytosol and the iron-sulfur
cluster is inserted into the protein and the active protein remains
localized in the mitochondria; or the iron-sulfur clusters or
precursors thereof are exported from the mitochondria to the
cytosol and the active protein is assembled in the cytosol or other
cellular compartments.
[0150] Targeting of yeast mitochondrial iron-sulfur proteins or
non-yeast iron-sulfur proteins to the yeast cytosol can result in
such proteins not being properly assembled with their iron-sulfur
clusters. This present invention overcomes this problem by
co-expression and cytosolic targeting in yeast of proteins for
iron-sulfur cluster assembly and cluster insertion into
apo-iron-sulfur proteins, including iron-sulfur cluster assembly
and insertion proteins from organisms other than yeast, together
with the apo-iron-sulfur protein to provide assembly of active
iron-sulfur proteins in the yeast cytosol.
[0151] Therefore, in one embodiment of this invention, the
apo-iron-sulfur protein DHAD enzyme encoded by the E. coli ilvD
gene is expressed in yeast together with E. coli iron-sulfur
cluster assembly and insertion genes comprising either the cyaY,
iscS, iscU, iscA, hscB, hscA, fdx and iscX genes or the sufA, sufB,
sufC, sufD, sufS and sufE genes. This strategy allows for both the
apo-iron-sulfur protein (DHAD) and the iron-sulfur cluster assembly
and insertion components (the products of the isc or suf genes) to
come from the same organism, causing assembly of the active DHAD
iron-sulfur protein in the yeast cytosol. As a modification of this
embodiment, for those E. coli iron-sulfur cluster assembly and
insertion components that localize to or are predicted to localize
to the yeast mitochondria upon expression in yeast, the genes for
these components are engineered to eliminate such targeting signals
to ensure localization of the components in the yeast cytoplasm.
Thus, in some embodiments, one or more genes encoding an
iron-sulfur cluster assembly protein may be mutated or modified to
remove a signal peptide, whereby localization of the product of
said one or more genes to the mitochondria is prevented. In certain
embodiments, it may be preferable to overexpress one or more genes
encoding an iron-sulfur cluster assembly protein.
[0152] In additional embodiments, iron-sulfur cluster assembly and
insertion components from other than E. coli can be co-expressed
with the E. coli DHAD protein to provide assembly of the active
DHAD iron-sulfur cluster protein. Such iron-sulfur cluster assembly
and insertion components from other organisms can consist of the
products of the Helicobacter pylori nifS and nifU genes or the
Entamoeba histolytica nifS and nifU genes. As a modification of
this embodiment, for those non-E. coli iron-sulfur cluster assembly
and insertion components that localize to or are predicted to
localize to the yeast mitochondria upon expression in yeast, the
genes for these components can be engineered to eliminate such
targeting signals to ensure localization of the components in the
yeast cytoplasm.
[0153] As a further modification of this embodiment, in addition to
co-expression of these proteins in aerobically-grown yeast, these
proteins may be co-expressed in anaerobically-grown yeast to lower
the redox state of the yeast cytoplasm to improve assembly of the
active iron-sulfur protein.
[0154] In another embodiment, the above iron-sulfur cluster
assembly and insertion components can be co-expressed with DHAD
apo-iron-sulfur enzymes other than the E. coli IlvD gene product to
generate active DHAD enzymes in the yeast cytoplasm. As a
modification of this embodiment, for those DHAD enzymes that
localize to or are predicted to localize to the yeast mitochondria
upon expression in yeast, then the genes for these enzymes can be
engineered to eliminate such targeting signals to ensure
localization of the enzymes in the yeast cytoplasm.
[0155] In additional embodiments, the above methods used to
generate active DHAD enzymes localized to yeast cytoplasm may be
combined with methods to generate active acetolactate synthase,
KARI, KIVD and ADH enzymes in the same yeast for the production of
isobutanol by yeast.
[0156] In another embodiment, production of active iron-sulfur
proteins other than DHAD enzymes in yeast cytoplasm can be
accomplished by co-expression with iron-sulfur cluster assembly and
insertion proteins from organisms other than yeast, with proper
targeting of the proteins to the yeast cytoplasm if necessary and
expression in anaerobically growing yeast if needed to improve
assembly of the active proteins.
[0157] In another embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from eukaryotic organisms,
including, but not limited to yeasts and plants. In one embodiment,
the iron-sulfur cluster protein encoding genes are derived from a
yeast organism, including, but not limited to S. cerevisiae. In
specific embodiments, the yeast derived genes encoding iron-sulfur
cluster assembly proteins are selected from the group consisting of
Cfd1 (Accession No. NP_012263.1), Nbp35 (Accession No.
NP_011424.1), Nar1 (Accession No. NP_014159.1), Cia1
(Accession No. NP_010553.1), and homologs or variants
thereof. In a further embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from plant nuclear genes
which encode proteins translocated to chloroplast or plant genes
found in the chloroplast genome itself.
[0158] As noted above, the iron-sulfur cluster assembly genes may
be derived from eukaryotic organisms, including, but not limited to
yeasts and plants. In one embodiment, the iron-sulfur cluster genes
are derived from a yeast organism, including, but not limited to S.
cerevisiae. In specific embodiments, the yeast derived iron-sulfur
cluster assembly genes are selected from the group consisting of
CFD1, NBP35, NAR1, CIA1, and homologs or variants thereof. In a
further embodiment, the iron-sulfur cluster assembly genes may be
derived from a plant chloroplast.
[0159] In certain embodiments described herein, it may be desirable
to reduce or eliminate the activity and/or protein levels of one
or more iron-sulfur cluster containing cytosolic proteins. This
modification increases the capacity of a yeast to incorporate
[Fe-S] clusters into cytosolically expressed proteins, wherein said
proteins can be native proteins that are expressed in a non-native
compartment or heterologous proteins. This is achieved by deletion
of a highly expressed native cytoplasmic [Fe-S]-dependent protein.
More specifically, the gene LEU1 is deleted. LEU1 codes for the
3-isopropylmalate dehydratase, which catalyzes the conversion of
2-isopropylmalate into 3-isopropylmalate as part of the leucine
biosynthetic pathway in yeast. Leu1p contains a 4Fe-4S cluster
which takes part in the catalysis of the dehydratase. DHAD also
contains a 4Fe-4S cluster involved in its dehydratase activity.
Therefore, although the two enzymes have different substrate
preferences, the process of incorporation of the Fe-S cluster is
generally similar for the two proteins. Given that Leu1p is present
in yeast at 10000 molecules per cell (Ghaemmaghami et al., 2003,
Nature 425: 737), deletion of LEU1 ensures that the cell
has enough spare capacity to incorporate [Fe-S] clusters into at
least 10000 molecules of cytosolically expressed DHAD. Taking into
account the specific activity of DHAD (E. coli DHAD is reported to
have a specific activity of 63 U/mg) (Flint et al., 1993, J
Biological Chem 268: 14732), the LEU1 deletion yeast strain would
generally exhibit an increased capacity for DHAD activity in the
cytosol as measured in cell lysate.
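The capacity argument in the preceding paragraph can be turned into a rough order-of-magnitude estimate. The sketch below uses the two figures quoted in the text (10000 molecules per cell and 63 U/mg). Several inputs are assumptions and not values from the source: the molar mass of about 66,000 g/mol for a DHAD subunit is a placeholder, 1 U is taken as 1 micromole of substrate converted per minute, and every molecule is treated as fully active.

```python
# Rough estimate of DHAD activity per cell if the Fe-S capacity freed by a
# LEU1 deletion were fully used by cytosolic DHAD.
# From the text: 10000 molecules per cell, 63 U/mg specific activity.
# Assumptions (not from the source): subunit molar mass, 1 U = 1 umol/min,
# and every DHAD molecule being fully active.

AVOGADRO = 6.02214076e23  # molecules per mole


def dhad_units_per_cell(molecules_per_cell: float,
                        specific_activity_u_per_mg: float,
                        molar_mass_g_per_mol: float) -> float:
    """Return enzyme units (U) per cell for the given copy number."""
    mass_g = molecules_per_cell * molar_mass_g_per_mol / AVOGADRO
    mass_mg = mass_g * 1e3
    return mass_mg * specific_activity_u_per_mg


if __name__ == "__main__":
    units = dhad_units_per_cell(
        molecules_per_cell=10_000,        # from the text
        specific_activity_u_per_mg=63.0,  # from the text (E. coli DHAD)
        molar_mass_g_per_mol=66_000.0,    # ASSUMED placeholder value
    )
    print(f"{units:.2e} U per cell")
```

The result scales linearly with each input, so substituting a measured molar mass or an in vivo specific activity changes the estimate proportionally.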
[0160] In alternative embodiments, it may be desirable to further
overexpress an additional enzyme that converts
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. In a
specific embodiment, the enzyme may be selected from the group
consisting of 3-isopropylmalate dehydratase (Leu1p) and
imidazoleglycerol-phosphate dehydratase (His3p). Because DHAD
activity is limited in the cytosol, alternative dehydratases that
convert dihydroxyisovalerate (DHIV) to 2-ketoisovalerate (KIV) and
are physiologically localized to the yeast cytosol may be utilized.
Leu1p and His3p are dehydratases that potentially may exhibit
affinity for DHIV. Leu1p is an Fe-S binding protein that is
involved in leucine biosynthesis and is also normally localized to
the cytosol. His3p is involved in histidine biosynthesis and,
similar to Leu1p, is generally localized to the cytosol or
predicted to be localized to the cytosol. This modification
overcomes the problem of a DHAD that is limiting isobutanol
production in the cytosol of yeast. An alternative
dehydratase that has activity in the cytosol with a low activity
towards DHIV may thus be used in place of the DHAD in the
isobutanol pathway. As described herein, such enzyme may be further
engineered to increase activity with DHIV.
The Microorganism in General
[0161] Native producers of 1-butanol, such as Clostridium
acetobutylicum, are known, but these organisms also generate
byproducts such as acetone, ethanol, and butyrate during
fermentations. Furthermore, these microorganisms are relatively
difficult to manipulate, with significantly fewer tools available
than in more commonly used production hosts such as S. cerevisiae
or E. coli. Additionally, the physiology and metabolic regulation
of these native producers are much less well understood, impeding
rapid progress towards high-efficiency production. Furthermore, no
native microorganisms have been identified that can metabolize
glucose into isobutanol in industrially relevant quantities.
[0162] The production of isobutanol and other fusel alcohols by
various yeast species, including Saccharomyces cerevisiae, is of
special interest to the distillers of alcoholic beverages, for whom
fusel alcohols often constitute undesirable off-notes. Production
of isobutanol in wild-type yeasts has been documented on various
growth media, ranging from grape must from winemaking (Romano et
al., 2003, World J. of Microbiol Biot. 19: 311-5), in which 12-219
mg/L isobutanol were produced, to supplemented minimal media
(Oliviera et al., 2005, World J. of Microbiol Biot. 21: 1569-76),
producing 16-34 mg/L isobutanol. Work from Dickinson et al. (J
Biol. Chem. 272: 26871-8, 1997) has identified the enzymatic steps
utilized in an endogenous S. cerevisiae pathway converting
branched-chain amino acids (e.g., valine or leucine) to
isobutanol.
[0163] Recombinant microorganisms provided herein can express a
plurality of heterologous and/or native target enzymes involved in
pathways for the production of isobutanol from a suitable carbon
source.
[0164] Accordingly, "engineered" or "modified" microorganisms are
produced via the introduction of genetic material into a host or
parental microorganism of choice and/or by modification of the
expression of native genes, thereby modifying or altering the
cellular physiology and biochemistry of the microorganism. Through
the introduction of genetic material and/or the modification of the
expression of native genes the parental microorganism acquires new
properties, e.g. the ability to produce a new, or greater
quantities of, an intracellular metabolite. As described herein,
the introduction of genetic material into and/or the modification
of the expression of native genes in a parental microorganism
results in a new or modified ability to produce isobutanol. The
genetic material introduced into and/or the genes modified for
expression in the parental microorganism contains gene(s), or parts
of genes, coding for one or more of the enzymes involved in a
biosynthetic pathway for the production of isobutanol and may also
include additional elements for the expression and/or regulation of
expression of these genes, e.g. promoter sequences.
[0165] In addition to the introduction of a genetic material into a
host or parental microorganism, an engineered or modified
microorganism can also include alteration, disruption, deletion or
knocking-out of a gene or polynucleotide to alter the cellular
physiology and biochemistry of the microorganism. Through the
alteration, disruption, deletion or knocking-out of a gene or
polynucleotide the microorganism acquires new or improved
properties (e.g., the ability to produce a new metabolite or
greater quantities of an intracellular metabolite, improve the flux
of a metabolite down a desired pathway, and/or reduce the
production of byproducts).
[0166] Recombinant microorganisms provided herein may also produce
metabolites in quantities not available in the parental
microorganism. A "metabolite" refers to any substance produced by
metabolism or a substance necessary for or taking part in a
particular metabolic process. A metabolite can be an organic
compound that is a starting material (e.g., glucose or pyruvate),
an intermediate (e.g., 2-ketoisovalerate), or an end product (e.g.,
isobutanol) of metabolism. Metabolites can be used to construct
more complex molecules, or they can be broken down into simpler
ones. Intermediate metabolites may be synthesized from other
metabolites, perhaps used to make more complex substances, or
broken down into simpler compounds, often with the release of
chemical energy.
[0167] Exemplary metabolites include glucose, pyruvate, and
isobutanol. The metabolite isobutanol can be produced by a
recombinant microorganism which expresses or over-expresses a
metabolic pathway that converts pyruvate to isobutanol. An
exemplary metabolic pathway that converts pyruvate to isobutanol
may be comprised of an acetohydroxy acid synthase (ALS), a
ketol-acid reductoisomerase (KARI), a dihydroxy-acid dehydratase
(DHAD), a 2-keto-acid decarboxylase (KIVD), and an alcohol
dehydrogenase (ADH).
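The five-enzyme pathway listed above can be written down as a small data structure, which makes it easy to check that the steps chain together and to see which enzymes a given strain still lacks. This is a sketch under stated assumptions: the intermediate names acetolactate and isobutyraldehyde are the commonly cited ones and are not named in this section, and the helper names are hypothetical.

```python
# Sketch of the pyruvate-to-isobutanol pathway as an ordered list of steps.
# Enzyme abbreviations follow the text; the intermediates acetolactate and
# isobutyraldehyde are assumed from the commonly cited pathway.
from typing import Iterable, List, NamedTuple


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str


ISOBUTANOL_PATHWAY: List[Step] = [
    Step("ALS", "pyruvate", "acetolactate"),
    Step("KARI", "acetolactate", "2,3-dihydroxyisovalerate"),
    Step("DHAD", "2,3-dihydroxyisovalerate", "2-ketoisovalerate"),
    Step("KIVD", "2-ketoisovalerate", "isobutyraldehyde"),
    Step("ADH", "isobutyraldehyde", "isobutanol"),
]


def is_connected(pathway: List[Step]) -> bool:
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def missing_enzymes(expressed: Iterable[str]) -> List[str]:
    """Return pathway enzymes, in pathway order, that are not in `expressed`."""
    have = set(expressed)
    return [s.enzyme for s in ISOBUTANOL_PATHWAY if s.enzyme not in have]


if __name__ == "__main__":
    print(is_connected(ISOBUTANOL_PATHWAY))
    print(missing_enzymes(["ALS", "KARI", "KIVD", "ADH"]))
```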
[0168] Accordingly, provided herein are recombinant microorganisms
that produce isobutanol and in some aspects may include the
elevated expression of target enzymes such as ALS, KARI, DHAD,
KIVD, and ADH.
[0169] The disclosure identifies specific genes useful in the
methods, compositions and organisms of the disclosure; however it
will be recognized that absolute identity to such genes is not
necessary. For example, changes in a particular gene or
polynucleotide comprising a sequence encoding a polypeptide or
enzyme can be performed and screened for activity. Typically such
changes comprise conservative mutations and silent mutations. Such
modified or mutated polynucleotides and polypeptides can be
screened for expression of a functional enzyme using methods known
in the art.
[0170] Due to the inherent degeneracy of the genetic code, other
polynucleotides which encode substantially the same or functionally
equivalent polypeptides can also be used to clone and express the
polynucleotides encoding such enzymes.
[0171] As will be understood by those of skill in the art, it can
be advantageous to modify a coding sequence to enhance its
expression in a particular host. The genetic code is redundant with
64 possible codons, but most organisms typically use a subset of
these codons. The codons that are utilized most often in a species
are called optimal codons, and those not utilized very often are
classified as rare or low-usage codons. Codons can be substituted
to reflect the preferred codon usage of the host, a process
sometimes called "codon optimization" or "controlling for species
codon bias."
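Codon substitution of the kind described above can be illustrated with a short Python sketch that swaps each codon for a host-preferred synonym and confirms that the encoded protein is unchanged. The preferred-codon map below is hypothetical and for illustration only; it is not measured usage data for any organism, and real optimization would use a full usage table for the chosen host. The function names are likewise assumptions.

```python
# Sketch: replace codons with host-preferred synonyms without changing the
# encoded protein. The PREFERRED map is a hypothetical example, not real
# codon-usage data.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical host preferences (amino acid -> preferred codon).
PREFERRED = {"L": "TTG", "R": "AGA", "G": "GGT"}


def translate(dna: str) -> str:
    """Translate an in-frame, uppercase DNA coding sequence ('*' marks stop)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonym, if one is listed."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


if __name__ == "__main__":
    original = "ATGCTCCGCGGATAA"
    optimized = recode(original, PREFERRED)
    print(optimized, translate(optimized) == translate(original))
```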
[0172] Optimized coding sequences containing codons preferred by a
particular prokaryotic or eukaryotic host (Murray et al., 1989,
Nucl Acids Res. 17: 477-508) can be prepared, for example, to
increase the rate of translation or to produce recombinant RNA
transcripts having desirable properties, such as a longer
half-life, as compared with transcripts produced from a
non-optimized sequence. Translation stop codons can also be
modified to reflect host preference. For example, typical stop
codons for S. cerevisiae and mammals are UAA and UGA, respectively.
The typical stop codon for monocotyledonous plants is UGA, whereas
insects and E. coli commonly use UAA as the stop codon (Dalphin et
al., 1996, Nucl Acids Res. 24: 216-8). Methodology for optimizing a
nucleotide sequence for expression in a plant is provided, for
example, in U.S. Pat. No. 6,015,891, and the references cited
therein.
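The stop codon preferences quoted in the preceding paragraph can be captured in a small lookup table. In this sketch the host labels and preferred codons are taken from the text, while the function name and the input checks are assumptions.

```python
# Sketch: set the terminal stop codon of an mRNA to the host-preferred one.
# Preferences are those quoted in the text; the helper itself is hypothetical.

PREFERRED_STOP = {
    "S. cerevisiae": "UAA",
    "mammals": "UGA",
    "monocotyledonous plants": "UGA",
    "insects": "UAA",
    "E. coli": "UAA",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def set_stop_codon(mrna: str, host: str) -> str:
    """Return mrna with its final stop codon replaced by the host's preferred stop."""
    if len(mrna) % 3 or mrna[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame sequence ending in a stop codon")
    return mrna[:-3] + PREFERRED_STOP[host]


if __name__ == "__main__":
    print(set_stop_codon("AUGGCUUAG", "S. cerevisiae"))
```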
[0173] Those of skill in the art will recognize that, due to the
degenerate nature of the genetic code, a variety of DNA compounds
differing in their nucleotide sequences can be used to encode a
given enzyme of the disclosure. The native DNA sequences encoding
the biosynthetic enzymes described above are referenced herein
merely to illustrate an embodiment of the disclosure, and the
disclosure includes DNA compounds of any sequence that encode the
amino acid sequences of the polypeptides and proteins of the
enzymes utilized in the methods of the disclosure. In similar
fashion, a polypeptide can typically tolerate one or more amino
acid substitutions, deletions, and insertions in its amino acid
sequence without loss or significant loss of a desired activity.
The disclosure includes such polypeptides with different amino acid
sequences than the specific proteins described herein so long as
the modified or variant polypeptides have the enzymatic anabolic
or catabolic activity of the reference polypeptide. Furthermore,
the amino acid sequences encoded by the DNA sequences shown herein
merely illustrate embodiments of the disclosure.
[0174] In addition, homologs of enzymes useful for generating
metabolites are encompassed by the microorganisms and methods
provided herein.
[0175] As used herein, two proteins (or a region of the proteins)
are substantially homologous when the amino acid sequences have at
least about 30%, 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity. To determine
the percent identity of two amino acid sequences, or of two nucleic
acid sequences, the sequences are aligned for optimal comparison
purposes (e.g., gaps can be introduced in one or both of a first
and a second amino acid or nucleic acid sequence for optimal
alignment and non-homologous sequences can be disregarded for
comparison purposes). In one embodiment, the length of a reference
sequence aligned for comparison purposes is at least 30%, typically
at least 40%, more typically at least 50%, even more typically at
least 60%, and even more typically at least 70%, 80%, 90%, or 100% of
the length of the reference sequence. The amino acid residues or
nucleotides at corresponding amino acid positions or nucleotide
positions are then compared. When a position in the first sequence
is occupied by the same amino acid residue or nucleotide as the
corresponding position in the second sequence, then the molecules
are identical at that position (as used herein amino acid or
nucleic acid "identity" is equivalent to amino acid or nucleic acid
"homology"). The percent identity between the two sequences is a
function of the number of identical positions shared by the
sequences, taking into account the number of gaps, and the length
of each gap, which need to be introduced for optimal alignment of
the two sequences.
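The percent identity calculation described above can be sketched for two sequences that have already been aligned. The convention used here, identical positions divided by the number of aligned columns with gap columns counted in the denominator, is one common choice and is an assumption; the text does not fix a specific formula, and alignment tools may report identity differently.

```python
# Sketch: percent identity between two pre-aligned sequences of equal length.
# Convention (an assumption): identical columns / all aligned columns, with
# gap-containing columns counted in the denominator.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    print(round(percent_identity("MKT-LV", "MKSALV"), 1))
```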
[0176] When "homologous" is used in reference to proteins or
peptides, it is recognized that residue positions that are not
identical often differ by conservative amino acid substitutions. A
"conservative amino acid substitution" is one in which an amino
acid residue is substituted by another amino acid residue having a
side chain (R group) with similar chemical properties (e.g., charge
or hydrophobicity). In general, a conservative amino acid
substitution will not substantially change the functional
properties of a protein. In cases where two or more amino acid
sequences differ from each other by conservative substitutions, the
percent sequence identity or degree of homology may be adjusted
upwards to correct for the conservative nature of the substitution.
Means for making this adjustment are well known to those of skill
in the art (See, e.g., Pearson W. R., 1994, Methods in Mol Biol 25:
365-89).
[0177] The following six groups each contain amino acids that are
conservative substitutions for one another: 1) Serine (S),
Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3)
Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine
(V), and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
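The six groups listed above translate directly into a lookup that classifies a substitution as conservative or not. The group membership is taken from the text; the function name is hypothetical, and treating an unchanged residue as "not a substitution" is a design choice made here.

```python
# Sketch: classify an amino acid substitution using the six groups in the text.

CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic acid, Glutamic acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues fall in the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residue: not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


if __name__ == "__main__":
    print(is_conservative("S", "T"), is_conservative("S", "D"))
```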
[0178] Sequence homology for polypeptides, which is also referred
to as percent sequence identity, is typically measured using
sequence analysis software. See commonly owned and co-pending
application US 2009/0226991. A typical algorithm used for comparing a
molecule sequence to a database containing a large number of
sequences from different organisms is the computer program BLAST.
When searching a database containing sequences from a large number
of different organisms, it is typical to compare amino acid
sequences. Database searching using amino acid sequences can be
performed using algorithms described in commonly owned and co-pending
application US 2009/0226991.
[0179] The disclosure provides recombinant microorganisms
comprising a biochemical pathway for the production of isobutanol
from a suitable substrate at a high yield. A recombinant
microorganism of the disclosure comprises one or more recombinant
polynucleotides within the genome of the organism or external to
the genome within the organism. The microorganism can comprise a
reduction in expression, disruption or knockout of a gene found in
the wild-type organism and/or introduction of a heterologous
polynucleotide and/or expression or overexpression of an endogenous
polynucleotide.
[0180] In one aspect, the disclosure provides a recombinant
microorganism comprising elevated expression of at least one target
enzyme as compared to a parental microorganism, or encoding an enzyme
not found in the parental organism. In another or further aspect,
the microorganism comprises a reduction in expression, disruption
or knockout of at least one gene encoding an enzyme that competes
for a metabolite necessary for the production of isobutanol. The
recombinant microorganism produces at least one metabolite involved
in a biosynthetic pathway for the production of isobutanol. In
general, the recombinant microorganism comprises at least one
recombinant metabolic pathway that comprises a target enzyme and
may further include a reduction in activity or expression of an
enzyme in a competitive biosynthetic pathway. The pathway acts to
modify a substrate or metabolic intermediate in the production of
isobutanol. The target enzyme is encoded by, and expressed from, a
polynucleotide derived from a suitable biological source. In some
embodiments, the polynucleotide comprises a gene derived from a
prokaryotic or eukaryotic source and recombinantly engineered into
the microorganism of the disclosure. In other embodiments, the
polynucleotide comprises a gene that is native to the host
organism.
[0181] It is understood that a range of microorganisms can be
modified to include a recombinant metabolic pathway suitable for
the production of isobutanol. In various embodiments,
microorganisms may be selected from yeast microorganisms. Yeast
microorganisms for the production of isobutanol may be selected
based on certain characteristics:
[0182] One characteristic may include the property that the
microorganism is selected to convert various carbon sources into
isobutanol. The term "carbon source" generally refers to a
substance suitable to be used as a source of carbon for prokaryotic
or eukaryotic cell growth. Examples of suitable carbon sources are
described in commonly owned and co-pending application US
2009/0226991. Accordingly, in one embodiment, the recombinant
microorganism herein disclosed can convert a variety of carbon
sources to products, including but not limited to glucose,
galactose, mannose, xylose, arabinose, lactose, sucrose, and
mixtures thereof.
[0183] The recombinant microorganism may thus further include a
pathway for the fermentation of isobutanol from five-carbon
(pentose) sugars including xylose. Most yeast species metabolize
xylose via a complex route, in which xylose is first reduced to
xylitol via a xylose reductase (XR) enzyme. The xylitol is then
oxidized to xylulose via a xylitol dehydrogenase (XDH) enzyme. The
xylulose is then phosphorylated via a xylulokinase (XK) enzyme.
This pathway operates inefficiently in yeast species because it
introduces a redox imbalance in the cell. The xylose-to-xylitol
step mainly uses NADPH as a cofactor, whereas the
xylitol-to-xylulose step uses NAD+ as a cofactor. Other processes
must operate to restore the redox balance within the cell. This
often means that the organism cannot grow anaerobically on xylose
or other pentose sugars. A yeast species that can efficiently ferment
xylose and other pentose sugars into a desired fermentation product
is therefore very desirable.
[0184] Thus, in one aspect, the recombinant microorganism is engineered to
express a functional exogenous xylose isomerase. Exogenous xylose
isomerases functional in yeast are known in the art. See, e.g.,
Rajgarhia et al, US20060234364, which is herein incorporated by
reference in its entirety. In an embodiment according to this
aspect, the exogenous xylose isomerase gene is operatively linked
to promoter and terminator sequences that are functional in the
yeast cell. In a preferred embodiment, the recombinant
microorganism further has a deletion or disruption of a native gene
that encodes for an enzyme (e.g. XR and/or XDH) that catalyzes the
conversion of xylose to xylitol. In a further preferred embodiment,
the recombinant microorganism also contains a functional, exogenous
xylulokinase (XK) gene operatively linked to promoter and
terminator sequences that are functional in the yeast cell. In one
embodiment, the xylulokinase (XK) gene is overexpressed.
[0185] In one embodiment, the microorganism has reduced or no
pyruvate decarboxylase (PDC) activity. PDC catalyzes the
decarboxylation of pyruvate to acetaldehyde, which is then reduced
to ethanol by ADH via the oxidation of NADH to NAD+. Ethanol
production is the main pathway to oxidize the NADH from glycolysis.
Deletion of this pathway increases the pyruvate and the reducing
equivalents (NADH) available for the isobutanol pathway.
Accordingly, deletion of PDC genes can further increase the yield
of isobutanol.
[0186] In another embodiment, the microorganism has reduced or no
glycerol-3-phosphate dehydrogenase (GPD) activity. GPD catalyzes
the reduction of dihydroxyacetone phosphate (DHAP) to
glycerol-3-phosphate (G3P) via the oxidation of NADH to NAD+.
Glycerol is then produced from G3P by Glycerol-3-phosphatase (GPP).
Glycerol production is a secondary pathway to oxidize excess NADH
from glycolysis. Reduction or elimination of this pathway would
increase the pyruvate and reducing equivalents (NADH) available for
the isobutanol pathway. Thus, deletion of GPD genes can further
increase the yield of isobutanol.
[0187] In yet another embodiment, the microorganism has reduced or
no PDC activity and reduced or no GPD activity.
[0188] In one embodiment, the yeast microorganisms may be selected
from the "Saccharomyces Yeast Clade", as described in commonly
owned and co-pending application US 2009/0226991.
[0189] The "Saccharomyces sensu stricto" taxonomy group is a
cluster of yeast species that are highly related to S. cerevisiae
(Rainieri et al., 2003, J. Biosci Bioengin 96: 1-9). Saccharomyces
sensu stricto yeast species include but are not limited to S.
cerevisiae, S. kudriavzevii, S. mikatae, S. bayanus,
S. uvarum, S. cariocanus and hybrids derived from these species
(Masneuf et al., 1998, Yeast 7: 61-72).
[0190] An ancient whole genome duplication (WGD) event occurred
during the evolution of the hemiascomycete yeast and was discovered
using comparative genomic tools (Kellis et al., 2004, Nature 428:
617-24; Dujon et al., 2004, Nature 430:35-44; Langkjaer et al.,
2003, Nature 428: 848-52; Wolfe et al., 1997, Nature 387: 708-13).
Using this major evolutionary event, yeast can be divided into
species that diverged from a common ancestor following the WGD
event (termed "post-WGD yeast" herein) and species that diverged
from the yeast lineage prior to the WGD event (termed "pre-WGD
yeast" herein).
[0191] Accordingly, in one embodiment, the yeast microorganism may
be selected from a post-WGD yeast genus, including but not limited
to Saccharomyces and Candida. The favored post-WGD yeast species
include: S. cerevisiae, S. uvarum, S. bayanus, S. paradoxus, S.
castellii, and C. glabrata.
[0192] In another embodiment, the yeast microorganism may be
selected from a pre-whole genome duplication (pre-WGD) yeast genus
including but not limited to Saccharomyces, Kluyveromyces, Candida,
Pichia, Issatchenkia, Debaryomyces, Hansenula, Yarrowia, and
Schizosaccharomyces. Representative pre-WGD yeast species include:
S. kluyveri, K. thermotolerans, K. marxianus, K. waltii, K. lactis,
C. tropicalis, P. pastoris, P. anomala, P. stipitis, I. orientalis,
I. occidentalis, I. scutulata, D. hansenii, H. anomala, Y.
lipolytica, and S. pombe.
[0193] A yeast microorganism may be either Crabtree-negative or
Crabtree-positive as described in commonly owned and
co-pending application US 2009/0226991. In one embodiment the yeast
microorganism may be selected from yeast with a Crabtree-negative
phenotype including but not limited to the following genera:
Kluyveromyces, Pichia, Issatchenkia, Hansenula, and Candida.
Crabtree-negative species include but are not limited to: K.
lactis, K. marxianus, P. anomala, P. stipitis, I. orientalis, I.
occidentalis, I. scutulata, H. anomala, and C. utilis. In another
embodiment, the yeast microorganism may be selected from a yeast
with a Crabtree-positive phenotype, including but not limited to
Saccharomyces, Kluyveromyces, Zygosaccharomyces, Debaryomyces,
Pichia and Schizosaccharomyces. Crabtree-positive yeast species
include but are not limited to: S. cerevisiae, S. uvarum, S.
bayanus, S. paradoxus, S. castellii, S. kluyveri, K. thermotolerans,
C. glabrata, Z. bailii, Z. rouxii, D. hansenii, P. pastoris, and
S. pombe.
[0194] Another characteristic may include the property that the
microorganism is non-fermenting. In other words, the yeast
cannot metabolize a carbon source anaerobically, although it is
able to metabolize a carbon source in the presence of oxygen.
Non-fermenting yeast refers to both naturally occurring yeasts and
genetically modified yeasts. During anaerobic fermentation
with fermentative yeast, the main pathway to oxidize the NADH from
glycolysis is through the production of ethanol. Ethanol is
produced by alcohol dehydrogenase (ADH) via the reduction of
acetaldehyde, which is generated from pyruvate by pyruvate
decarboxylase (PDC). In one embodiment, a fermentative yeast can be
engineered to be non-fermentative by the reduction or elimination
of the native PDC activity. Thus, most of the pyruvate produced by
glycolysis is not consumed by PDC and is available for the
isobutanol pathway. Deletion of this pathway increases the pyruvate
and the reducing equivalents available for the isobutanol pathway.
Fermentative pathways contribute to low yield and low productivity
of isobutanol. Accordingly, deletion of PDC may increase yield and
productivity of isobutanol.
[0195] In some embodiments, the recombinant microorganisms may be
microorganisms that are non-fermenting yeast microorganisms,
including, but not limited to, those classified into a genus
selected from the group consisting of Trichosporon, Rhodotorula, and
Myxozyma.
[0196] In one embodiment, a yeast microorganism is engineered to
convert a carbon source, such as glucose, to pyruvate by glycolysis
and the pyruvate is converted to isobutanol via an engineered
isobutanol pathway (See, e.g., WO/2007/050671, WO/2008/098227, and
Atsumi et al., 2008, Nature 451: 86-9). Alternative pathways for the
production of isobutanol have been described in WO/2007/050671 and
in Dickinson et al., 1998, J Biol Chem 273:25751-6.
[0197] Accordingly, the engineered isobutanol pathway to convert
pyruvate to isobutanol can be comprised of the following
reactions:
[0198] 1. 2 pyruvate → acetolactate + CO2
[0199] 2. acetolactate + NAD(P)H → 2,3-dihydroxyisovalerate + NAD(P)+
[0200] 3. 2,3-dihydroxyisovalerate → alpha-ketoisovalerate
[0201] 4. alpha-ketoisovalerate → isobutyraldehyde + CO2
[0202] 5. isobutyraldehyde + NAD(P)H → isobutanol + NAD(P)+
[0203] These reactions are carried out by the enzymes 1)
Acetolactate Synthase (ALS), 2) Ketol-acid Reducto-Isomerase (KARI),
3) Dihydroxy-acid dehydratase (DHAD), 4) Keto-isovalerate
decarboxylase (KIVD), and 5) an Alcohol dehydrogenase (ADH) (FIG.
1). In another embodiment, the yeast microorganism is engineered to
overexpress these enzymes. For example, these enzymes can be
encoded by native genes. Alternatively, these enzymes can be
encoded by heterologous genes. For example, ALS can be encoded by
the alsS gene of B. subtilis, alsS of L. lactis, or the ilvK gene
of K. pneumoniae. For example, KARI can be encoded by the ilvC genes
of E. coli, C. glutamicum, M. maripaludis, or Piromyces sp E2. For
example, DHAD can be encoded by the ilvD genes of E. coli, C.
glutamicum, or L. lactis. For example, KIVD can be encoded by the
kivD gene of L. lactis. ADH can be encoded by ADH2, ADH6, or ADH7
of S. cerevisiae.
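As an illustration of reactions 1 to 5 above, the following Python sketch sums the five steps into a net conversion of pyruvate to isobutanol and derives a maximum theoretical mass yield on glucose. It assumes, beyond what this passage states, that glycolysis supplies two pyruvate per glucose, and it uses standard molar masses. The abbreviations DHIV and KIV and all variable names are illustrative only, and water released in step 3 is omitted, as in the reactions listed above.

```python
from collections import Counter

# Each step is (substrates, products) with stoichiometric coefficients,
# transcribed from reactions 1-5. "NAD(P)H" stands for either cofactor.
# DHIV = 2,3-dihydroxyisovalerate, KIV = alpha-ketoisovalerate.
STEPS = [
    ({"pyruvate": 2}, {"acetolactate": 1, "CO2": 1}),                          # ALS
    ({"acetolactate": 1, "NAD(P)H": 1}, {"DHIV": 1, "NAD(P)+": 1}),            # KARI
    ({"DHIV": 1}, {"KIV": 1}),                                                  # DHAD
    ({"KIV": 1}, {"isobutyraldehyde": 1, "CO2": 1}),                            # KIVD
    ({"isobutyraldehyde": 1, "NAD(P)H": 1}, {"isobutanol": 1, "NAD(P)+": 1}),  # ADH
]


def net_reaction(steps):
    """Sum all steps; negative counts are consumed, positive are produced."""
    net = Counter()
    for substrates, products in steps:
        for name, n in substrates.items():
            net[name] -= n
        for name, n in products.items():
            net[name] += n
    # Intermediates cancel to zero and are dropped from the result.
    return {name: n for name, n in net.items() if n != 0}


NET = net_reaction(STEPS)

# Assumption (not from the text): glycolysis yields 2 pyruvate per glucose.
PYRUVATE_PER_GLUCOSE = 2
M_GLUCOSE = 180.16     # g/mol, C6H12O6
M_ISOBUTANOL = 74.12   # g/mol, C4H10O

pyruvate_per_isobutanol = -NET["pyruvate"] / NET["isobutanol"]
isobutanol_per_glucose = PYRUVATE_PER_GLUCOSE / pyruvate_per_isobutanol
MAX_YIELD_G_PER_G = isobutanol_per_glucose * M_ISOBUTANOL / M_GLUCOSE

print("Net reaction:", NET)
print(f"Maximum theoretical yield: {MAX_YIELD_G_PER_G:.3f} g isobutanol per g glucose")
```

Under these assumptions the net reaction consumes two pyruvate and two NAD(P)H per isobutanol, which corresponds to roughly 0.41 g of isobutanol per g of glucose.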
[0204] In one embodiment, pathway steps 2 and 5 may be carried out
by KARI and ADH enzymes that utilize NADH (rather than NADPH) as a
co-factor. Such enzymes are described in commonly owned and
co-pending applications U.S. Ser. No. 12/610,784 and PCT/US09/62952
(published as WO/2010/051527), which are herein incorporated by
reference in their entireties for all purposes. The present
inventors have found that utilization of NADH-dependent KARI and
ADH enzymes to catalyze pathway steps 2 and 5, respectively,
surprisingly enables production of isobutanol under anaerobic
conditions. Thus, in one embodiment, the recombinant microorganisms
of the present invention may use an NADH-dependent KARI to catalyze
the conversion of acetolactate (+NADH) to produce
2,3-dihydroxyisovalerate. In another embodiment, the recombinant
microorganisms of the present invention may use an NADH-dependent
ADH to catalyze the conversion of isobutyraldehyde (+NADH) to
produce isobutanol. In yet another embodiment, the recombinant
microorganisms of the present invention may use both an
NADH-dependent KARI to catalyze the conversion of acetolactate
(+NADH) to produce 2,3-dihydroxyisovalerate, and an NADH-dependent
ADH to catalyze the conversion of isobutyraldehyde (+NADH) to
produce isobutanol.
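The cofactor reasoning behind this embodiment can be sketched as simple bookkeeping. The Python sketch below assumes, in addition to the text, that glycolysis yields two NADH per glucose and that one glucose feeds one pass through pathway steps 2 and 5. The function name and its arguments are hypothetical and serve only to illustrate the balance.

```python
def nadh_surplus_per_glucose(kari_uses_nadh: bool, adh_uses_nadh: bool) -> int:
    """Net NADH left over per glucose for glycolysis plus the isobutanol pathway.

    Assumptions (illustrative, not from the text): glycolysis produces
    2 NADH per glucose, and steps 2 (KARI) and 5 (ADH) each consume one
    reduced cofactor per glucose. A result of 0 means the NADH formed in
    glycolysis is fully reoxidized by the pathway itself.
    """
    nadh_from_glycolysis = 2
    nadh_used_by_pathway = int(kari_uses_nadh) + int(adh_uses_nadh)
    return nadh_from_glycolysis - nadh_used_by_pathway


for kari, adh in [(False, False), (True, False), (True, True)]:
    surplus = nadh_surplus_per_glucose(kari, adh)
    print(f"NADH-dependent KARI={kari}, NADH-dependent ADH={adh}: "
          f"surplus NADH per glucose = {surplus}")
```

In this simplified accounting, only the case in which both KARI and ADH use NADH leaves no surplus NADH, which is consistent with the observation above that NADH-dependent enzymes at steps 2 and 5 enable anaerobic production.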
[0205] The yeast microorganism of the invention may be engineered
to have increased ability to convert pyruvate to isobutanol. In one
embodiment, the yeast microorganism may be engineered to have
increased ability to convert pyruvate to isobutyraldehyde. In
another embodiment, the yeast microorganism may be engineered to
have increased ability to convert pyruvate to keto-isovalerate. In
another embodiment, the yeast microorganism may be engineered to
have increased ability to convert pyruvate to
2,3-dihydroxyisovalerate. In another embodiment, the yeast
microorganism may be engineered to have increased ability to
convert pyruvate to acetolactate.
[0206] Furthermore, any of the genes encoding the foregoing enzymes
(or any others mentioned herein (or any of the regulatory elements
that control or modulate expression thereof)) may be optimized by
genetic/protein engineering techniques, such as directed evolution
or rational mutagenesis, which are known to those of ordinary skill
in the art. Such action allows those of ordinary skill in the art
to optimize the enzymes for expression and activity in yeast.
[0207] In addition, genes encoding these enzymes can be identified
from other fungal and bacterial species and can be expressed for
the modulation of this pathway. A variety of organisms could serve
as sources for these enzymes, including, but not limited to,
Saccharomyces spp., including S. cerevisiae and S. uvarum,
Kluyveromyces spp., including K. thermotolerans, K. lactis, and K.
marxianus, Pichia spp., Hansenula spp., including H. polymorpha,
Candida spp., Trichosporon spp., Yamadazyma spp., including Y.
stipitis, Torulaspora pretoriensis, Schizosaccharomyces spp.,
including S. pombe, Cryptococcus spp., Aspergillus spp., Neurospora
spp., or Ustilago spp. Sources of genes from anaerobic fungi
include, but are not limited to, Piromyces spp., Orpinomyces spp., or
Neocallimastix spp. Sources of prokaryotic enzymes that are useful
include, but are not limited to, Escherichia coli, Zymomonas mobilis,
Staphylococcus aureus, Bacillus spp., Clostridium spp.,
Corynebacterium spp., Pseudomonas spp., Lactococcus spp.,
Enterobacter spp., and Salmonella spp.
Methods in General
Identification of PDC and GPD in a Yeast Microorganism
[0208] Any method can be used to identify genes that encode for
enzymes with pyruvate decarboxylase (PDC) activity or
glycerol-3-phosphate dehydrogenase (GPD) activity. Suitable methods
for the identification of PDC and GPD are described in co-pending
applications U.S. Ser. No. 12/343,375 (published as US
2009/0226991), U.S. Ser. No. 12/696,645, and U.S. Ser. No.
12/820,505, which claim priority to U.S. Provisional Application
61/016,483, all of which are herein incorporated by reference in
their entireties for all purposes.
Genetic Insertions and Deletions
[0209] Any method can be used to introduce a nucleic acid molecule
into yeast and many such methods are well known. For example,
transformation and electroporation are common methods for
introducing nucleic acid into yeast cells. See, e.g., Gietz et al.,
1992, Nuc Acids Res. 27: 69-74; Ito et al., 1983, J. Bacteriol.
153: 163-8; and Becker et al., 1991, Methods in Enzymology 194:
182-7.
[0210] In an embodiment, the integration of a gene of interest into
a DNA fragment or target gene of a yeast microorganism occurs
according to the principle of homologous recombination. According
to this embodiment, an integration cassette containing a module
comprising at least one yeast marker gene and/or the gene to be
integrated (internal module) is flanked on either side by DNA
fragments homologous to those of the ends of the targeted
integration site (recombinogenic sequences). After transforming the
yeast with the cassette by appropriate methods, a homologous
recombination between the recombinogenic sequences may result in
the internal module replacing the chromosomal region in between the
two sites of the genome corresponding to the recombinogenic
sequences of the integration cassette. (Orr-Weaver et al., 1981,
PNAS USA 78: 6354-58).
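The outcome of such an integration can be sketched in code. The following Python sketch models the chromosome and the cassette as plain strings and replaces the chromosomal region between the two recombinogenic sequences with the internal module. All sequences are short, made-up placeholders, and the function is a toy model of the result of the replacement, not of the recombination mechanism itself.

```python
def integrate(chromosome: str, left_arm: str, right_arm: str,
              internal_module: str) -> str:
    """Replace the chromosomal region between two homology arms.

    Toy model: each arm must match exactly one site in the chromosome,
    and the left arm must lie upstream of the right arm.
    """
    if chromosome.count(left_arm) != 1 or chromosome.count(right_arm) != 1:
        raise ValueError("each recombinogenic sequence must match exactly one site")
    left_end = chromosome.index(left_arm) + len(left_arm)
    right_start = chromosome.index(right_arm)
    if right_start < left_end:
        raise ValueError("left arm must lie upstream of the right arm")
    # Keep everything up to and including the left arm, insert the internal
    # module, then keep everything from the right arm onward.
    return chromosome[:left_end] + internal_module + chromosome[right_start:]


# Hypothetical placeholder sequences, for illustration only.
LEFT_ARM = "GATTACA"
RIGHT_ARM = "TGCATGC"
chromosome = "AAAA" + LEFT_ARM + "cccccc" + RIGHT_ARM + "TTTT"
internal_module = "[URA3][geneX]"   # marker gene plus the gene to be integrated

result = integrate(chromosome, LEFT_ARM, RIGHT_ARM, internal_module)
print(result)
```

The lowercase stretch between the two arms stands for the chromosomal region that is replaced by the internal module.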
[0211] In an embodiment, the integration cassette for integration
of a gene of interest into a yeast microorganism includes the
heterologous gene under the control of an appropriate promoter and
terminator together with the selectable marker flanked by
recombinogenic sequences for integration of a heterologous gene
into the yeast chromosome. In an embodiment, the heterologous gene
includes an appropriate native gene, introduced to increase the copy
number of that native gene. The selectable marker gene can be any
marker gene used in yeast, including but not limited to, HIS3,
TRP1, LEU2, URA3, bar, ble, hph, and kan. The recombinogenic
sequences can be chosen at will, depending on the desired
integration site suitable for the desired application.
[0212] In another embodiment, integration of a gene into the
chromosome of the yeast microorganism may occur via random
integration (Kooistra et al., 2004, Yeast 21: 781-792).
[0213] Additionally, in an embodiment, certain introduced marker
genes are removed from the genome using techniques well known to
those skilled in the art. For example, URA3 marker loss can be
obtained by plating URA3 containing cells in FOA (5-fluoro-orotic
acid) containing medium and selecting for FOA resistant colonies
(Boeke et al., 1984, Mol. Gen. Genet. 197: 345-47).
[0214] The exogenous nucleic acid molecule contained within a yeast
cell of the disclosure can be maintained within that cell in any
form. For example, exogenous nucleic acid molecules can be
integrated into the genome of the cell or maintained in an episomal
state that can stably be passed on ("inherited") to daughter cells.
Such extra-chromosomal genetic elements (such as plasmids, etc.)
can additionally contain selection markers that ensure the presence
of such genetic elements in daughter cells. Moreover, the yeast
cells can be stably or transiently transformed. In addition, the
yeast cells described herein can contain a single copy, or multiple
copies of a particular exogenous nucleic acid molecule as described
above.
Reduction of Enzymatic Activity
[0215] Yeast microorganisms within the scope of the invention may
have reduced enzymatic activity such as reduced
glycerol-3-phosphate dehydrogenase activity. The term "reduced" as
used herein with respect to a particular enzymatic activity refers
to a lower level of enzymatic activity than that measured in a
comparable yeast cell of the same species. The term "reduced" also
refers to the elimination of enzymatic activity relative to that measured
in a comparable yeast cell of the same species. Thus, yeast cells
lacking glycerol-3-phosphate dehydrogenase activity are considered
to have reduced glycerol-3-phosphate dehydrogenase activity since
most, if not all, comparable yeast strains have at least some
glycerol-3-phosphate dehydrogenase activity. Such reduced enzymatic
activities can be the result of lower enzyme concentration, lower
specific activity of an enzyme, or a combination thereof. Many
different methods can be used to make yeast having reduced
enzymatic activity. For example, a yeast cell can be engineered to
have a disrupted enzyme-encoding locus using common mutagenesis or
knock-out technology. In addition, certain point mutation(s) can be
introduced that result in an enzyme with reduced activity.
[0216] Alternatively, antisense technology can be used to reduce
enzymatic activity. For example, yeast can be engineered to contain
a cDNA that encodes an antisense molecule that prevents an enzyme
from being made. The term "antisense molecule" as used herein
encompasses any nucleic acid molecule that contains sequences that
correspond to the coding strand of an endogenous polypeptide. An
antisense molecule also can have flanking sequences (e.g.,
regulatory sequences). Thus antisense molecules can be ribozymes or
antisense oligonucleotides. A ribozyme can have any general
structure including, without limitation, hairpin, hammerhead, or
axhead structures, provided the molecule cleaves RNA.
[0217] Yeast having a reduced enzymatic activity can be identified
using many methods. For example, yeast having reduced
glycerol-3-phosphate dehydrogenase activity can be easily
identified using common methods, which may include, for example,
measuring glycerol formation via liquid chromatography.
Overexpression of Heterologous Genes
[0218] Methods for overexpressing a polypeptide from a native or
heterologous nucleic acid molecule are well known. Such methods
include, without limitation, constructing a nucleic acid sequence
such that a regulatory element promotes the expression of a nucleic
acid sequence that encodes the desired polypeptide. Typically,
regulatory elements are DNA sequences that regulate the expression
of other DNA sequences at the level of transcription. Thus,
regulatory elements include, without limitation, promoters,
enhancers, and the like. For example, the exogenous genes can be
under the control of an inducible promoter or a constitutive
promoter. Moreover, methods for expressing a polypeptide from an
exogenous nucleic acid molecule in yeast are well known. For
example, nucleic acid constructs that are used for the expression
of exogenous polypeptides within Kluyveromyces and Saccharomyces
are well known (see, e.g., U.S. Pat. Nos. 4,859,596 and 4,943,529,
for Kluyveromyces and, e.g., Gellissen et al., Gene 190(1):87-97
(1997) for Saccharomyces). Yeast plasmids have a selectable marker
and an origin of replication. In addition, certain plasmids may also
contain a centromeric sequence. These centromeric plasmids are
generally single or low copy plasmids. Plasmids without a
centromeric sequence and utilizing either a 2 micron (S.
cerevisiae) or 1.6 micron (K. lactis) replication origin are high
copy plasmids. The selectable marker can be either prototrophic,
such as HIS3, TRP1, LEU2, URA3 or ADE2, or antibiotic resistance,
such as, bar, ble, hph, or kan.
[0219] In another embodiment, heterologous control elements can be
used to activate or repress expression of endogenous genes.
Additionally, when expression is to be repressed or eliminated, the
gene for the relevant enzyme, protein or RNA can be eliminated by
known deletion techniques.
[0220] As described herein, any yeast within the scope of the
disclosure can be identified by selection techniques specific to
the particular enzyme being expressed, over-expressed or repressed.
Methods of identifying the strains with the desired phenotype are
well known to those skilled in the art. Such methods include,
without limitation, PCR, RT-PCR, and nucleic acid hybridization
techniques such as Northern and Southern analysis, altered growth
capabilities on a particular substrate or in the presence of a
particular substrate, a chemical compound, a selection agent and
the like. In some cases, immunohistochemistry and biochemical
techniques can be used to determine if a cell contains a particular
nucleic acid by detecting the expression of the encoded
polypeptide. For example, an antibody having specificity for an
encoded enzyme can be used to determine whether or not a particular
yeast cell contains that encoded enzyme. Further, biochemical
techniques can be used to determine if a cell contains a particular
nucleic acid molecule encoding an enzymatic polypeptide by
detecting a product produced as a result of the expression of the
enzymatic polypeptide. For example, transforming a cell with a
vector encoding acetolactate synthase and detecting increased
acetolactate concentrations compared to a cell without the vector
indicates both that the vector is present and that the gene product
is active. Methods for detecting specific enzymatic activities or
the presence of particular products are well known to those skilled
in the art. For example, the presence of acetolactate can be
determined as described by Hugenholtz and Starrenburg, 1992, Appl.
Micro. Biot. 38:17-22.
Increase of Enzymatic Activity
[0221] Yeast microorganisms of the invention may be further
engineered to have increased activity of enzymes. The term
"increased" as used herein with respect to a particular enzymatic
activity refers to a higher level of enzymatic activity than that
measured in a comparable yeast cell of the same species. For
example, overexpression of a specific enzyme can lead to an
increased level of activity in the cells for that enzyme. Increased
activities for enzymes involved in glycolysis or the isobutanol
pathway would result in increased productivity and yield of
isobutanol.
[0222] Methods to increase enzymatic activity are known to those
skilled in the art. Such techniques may include increasing the
expression of the enzyme by increased copy number and/or use of a
strong promoter, introduction of mutations to relieve negative
regulation of the enzyme, introduction of specific mutations to
increase specific activity and/or decrease the Km for the
substrate, or by directed evolution. See, e.g., Methods in
Molecular Biology (vol. 231), ed. Arnold and Georgiou, Humana Press
(2003).
Microorganism Characterized by Producing Isobutanol at High
Yield
[0223] For a biocatalyst to produce isobutanol most economically,
it is desired to produce a high yield. Preferably, the only product
produced is isobutanol. Extra products lead to a reduction in
product yield and an increase in capital and operating costs,
particularly if the extra products have little or no value. Extra
products also require additional capital and operating costs to
separate these products from isobutanol.
[0224] The microorganism may convert one or more carbon sources
derived from biomass into isobutanol with a yield of greater than
5% of theoretical. In one embodiment, the yield is greater than
10% of theoretical. In one embodiment, the yield is greater than 50% of
theoretical. In one embodiment, the yield is greater than 60% of
theoretical. In another embodiment, the yield is greater than 70%
of theoretical. In yet another embodiment, the yield is greater
than 80% of theoretical. In yet another embodiment, the yield is
greater than 85% of theoretical. In yet another embodiment, the
yield is greater than 90% of theoretical. In yet another
embodiment, the yield is greater than 95% of theoretical. In still
another embodiment, the yield is greater than 97.5% of
theoretical.
[0225] More specifically, the microorganism converts glucose, which
can be derived from biomass, into isobutanol with a yield of greater
than 5% of theoretical. In one embodiment, the yield is greater
than 10% of theoretical. In one embodiment, the yield is greater
than 50% of theoretical. In one embodiment, the yield is greater
than 60% of theoretical. In another embodiment, the yield is
greater than 70% of theoretical. In yet another embodiment, the
yield is greater than 80% of theoretical. In yet another
embodiment, the yield is greater than 85% of theoretical. In yet
another embodiment, the yield is greater than 90% of theoretical. In
yet another embodiment, the yield is greater than 95% of
theoretical. In still another embodiment, the yield is greater than
97.5% of theoretical.
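To make the yield thresholds above concrete, the following Python sketch expresses a measured yield on glucose as a percentage of theoretical. It assumes a theoretical maximum of one mole of isobutanol per mole of glucose (about 0.41 g/g), which is not stated in this passage. The function names are hypothetical and the example quantities are invented for illustration, not taken from any experiment.

```python
# Molar masses in g/mol (standard values; not taken from the text).
M_GLUCOSE = 180.16      # C6H12O6
M_ISOBUTANOL = 74.12    # C4H10O

# Assumption: at most 1 mol of isobutanol can be formed per mol of glucose.
THEORETICAL_G_PER_G = M_ISOBUTANOL / M_GLUCOSE

# Yield thresholds (percent of theoretical) named in the embodiments above.
THRESHOLDS = [5, 10, 50, 60, 70, 80, 85, 90, 95, 97.5]


def percent_of_theoretical(isobutanol_g: float, glucose_consumed_g: float) -> float:
    """Measured mass yield as a percentage of the assumed theoretical maximum."""
    if glucose_consumed_g <= 0:
        raise ValueError("glucose consumed must be positive")
    measured_g_per_g = isobutanol_g / glucose_consumed_g
    return 100.0 * measured_g_per_g / THEORETICAL_G_PER_G


def thresholds_exceeded(percent: float) -> list:
    """Return the listed thresholds that the given percentage is greater than."""
    return [t for t in THRESHOLDS if percent > t]


# Invented example figures, for illustration only.
pct = percent_of_theoretical(isobutanol_g=30.0, glucose_consumed_g=100.0)
print(f"{pct:.1f}% of theoretical; exceeds thresholds {thresholds_exceeded(pct)}")
```

The comparison is strict ("greater than"), matching the wording of the embodiments, so a yield of exactly 97.5% of theoretical would not satisfy the last threshold.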
Microorganism Characterized by Production of Isobutanol from
Pyruvate Via an Overexpressed Isobutanol Pathway and a Pdc-Minus
and Gpd-Minus Phenotype
[0226] In yeast, the conversion of pyruvate to acetaldehyde is a
major drain on the pyruvate pool, and, hence, a major source of
competition with the isobutanol pathway. This reaction is catalyzed
by the pyruvate decarboxylase (PDC) enzyme. Reduction of this
enzymatic activity in the yeast microorganism results in an
increased availability of pyruvate and reducing equivalents to the
isobutanol pathway and may improve isobutanol production and yield
in a yeast microorganism that expresses a pyruvate-dependent
isobutanol pathway.
[0227] Reduction of PDC activity can be accomplished by 1) mutation
or deletion of a positive transcriptional regulator for the
structural genes encoding for PDC or 2) mutation or deletion of all
PDC genes in a given organism. The term "transcriptional regulator"
can specify a protein or nucleic acid that works in trans to
increase or to decrease the transcription of a different locus in
the genome. For example, in S. cerevisiae, the PDC2 gene, which
encodes a positive transcriptional regulator of the PDC1, PDC5, and
PDC6 genes, can be deleted; an S. cerevisiae strain in which the PDC2
gene is deleted is reported to have only ~10% of wild-type PDC activity
(Hohmann, 1993, Mol Gen Genet. 241:657-66). Alternatively, for
example, all structural genes for PDC (e.g. in S. cerevisiae, PDC1,
PDC5, and PDC6, or in K. lactis, PDC1) are deleted.
[0228] A Crabtree-positive yeast strain, such as an S. cerevisiae
strain, that contains disruptions in all three of the PDC alleles no
longer produces ethanol by fermentation. However, a downstream product of
the reaction catalyzed by PDC, acetyl-CoA, is needed for anabolic
production of necessary molecules. Therefore, the Pdc- mutant is
unable to grow solely on glucose, and requires a two-carbon carbon
source, either ethanol or acetate, to synthesize acetyl-CoA
(Flikweert et al., 1999, FEMS Microbiol Lett. 174: 73-9; and van
Maris et al., 2004, Appl Environ Microbiol. 70: 159-66).
[0229] Thus, in an embodiment, such a Crabtree-positive yeast
strain may be evolved to generate variants of the PDC mutant yeast
that do not require a two-carbon molecule and that have
a growth rate similar to wild type on glucose. Any method,
including chemostat evolution or serial dilution may be utilized to
generate variants of strains with deletion of three PDC alleles
that can grow on glucose as the sole carbon source at a rate
similar to wild type (van Maris et al., 2004, Appl Envir Micro 70:
159-66).
[0230] Another byproduct that would decrease yield of isobutanol is
glycerol. Glycerol is produced by 1) the reduction of the
glycolysis intermediate, dihydroxyacetone phosphate (DHAP), to
glycerol-3-phosphate (G3P) via the oxidation of NADH to NAD+
by Glycerol-3-phosphate dehydrogenase (GPD) followed by 2) the
dephosphorylation of glycerol-3-phosphate to glycerol by
glycerol-3-phosphatase (GPP). Production of glycerol results in
loss of carbons as well as reducing equivalents. Reduction of GPD
activity would increase yield of isobutanol. Reduction of GPD
activity in addition to PDC activity would further increase yield
of isobutanol. Reduction of glycerol production has been reported
to increase yield of ethanol production (Nissen et al., 2000, Yeast
16, 463-74; Nevoigt et al., Method of modifying a yeast cell for
the production of ethanol, WO/2009/056984). Disruption of this
pathway has also been reported to increase yield of lactate in a
yeast engineered to produce lactate instead of ethanol (Dundon et
al., Yeast cells having disrupted pathway from dihydroxyacetone
phosphate to glycerol, US 2009/0053782).
[0231] In one embodiment, the microorganism is a Crabtree-positive
yeast with reduced or no GPD activity. In another embodiment, the
microorganism is a Crabtree-positive yeast with reduced or no GPD
activity, and expresses an isobutanol biosynthetic pathway and
produces isobutanol. In yet another embodiment, the microorganism
is a Crabtree-positive yeast with reduced or no GPD activity and
with reduced or no PDC activity. In another embodiment, the
microorganism is a Crabtree-positive yeast with reduced or no GPD
activity, with reduced or no PDC activity, and expresses an
isobutanol biosynthetic pathway and produces isobutanol.
[0232] In another embodiment, the microorganism is a
Crabtree-negative yeast with reduced or no GPD activity. In another
embodiment, the microorganism is a Crabtree-negative yeast with
reduced or no GPD activity, expresses the isobutanol biosynthetic
pathway and produces isobutanol. In yet another embodiment, the
microorganism is a Crabtree-negative yeast with reduced or no GPD
activity and with reduced or no PDC activity. In another
embodiment, the microorganism is a Crabtree-negative yeast with
reduced or no GPD activity, with reduced or no PDC activity,
expresses an isobutanol biosynthetic pathway and produces
isobutanol.
[0233] PDC-minus/GPD-minus yeast production strains are described
in co-pending applications U.S. Ser. No. 12/343,375 (published as
US 2009/0226991), U.S. Ser. No. 12/696,645, and U.S. Ser. No.
12/820,505, which claim priority to U.S. Provisional Application
61/016,483, all of which are herein incorporated by reference in
their entireties for all purposes.
Method of Using Microorganism for High-Yield Isobutanol
Fermentation
[0234] In a method to produce isobutanol from a carbon source at
high yield, the yeast microorganism is cultured in an appropriate
culture medium containing a carbon source.
[0235] Another exemplary embodiment provides a method for producing
isobutanol comprising culturing a recombinant yeast microorganism of the
invention in a suitable culture medium containing a carbon source
that can be converted to isobutanol by the yeast microorganism of
the invention.
[0236] In certain embodiments, the method further includes
isolating isobutanol from the culture medium. For example,
isobutanol may be isolated from the culture medium by any method
known to those skilled in the art, such as distillation,
pervaporation, or liquid-liquid extraction, including methods
disclosed in co-pending applications U.S. Ser. No. 12/342,992
(published as US 2009/0171129) and PCT/US08/88187 (published as
WO/2009/086391), which are herein incorporated by reference in
their entireties for all purposes.
[0237] This invention is further illustrated by the following
examples that should not be construed as limiting. The contents of
all references, patents, and published patent applications cited
throughout this application, as well as the Figure and the Sequence
Listing, are incorporated herein by reference for all purposes.
Examples
General Methods
[0238] TABLE 1 Amino acid sequences disclosed herein.
SEQ ID NO | Protein, Accession No.
1 | E. coli IlvC, NP_418222
2 | S. cerevisiae Ilv5, NP_013459
3 | Oryza sativa KARI, NP_001056384
4 | Methanococcus maripaludis KARI, YP_001097443
5 | Acidiphilium cryptum KARI, YP_001235669
6 | Chlamydomonas reinhardtii KARI, XP_001702649
7 | Picrophilus torridus KARI, YP_023851
8 | Zymomonas mobilis KARI, YP_162876
9 | c-myc epitope tag
10 | Thermotoga petrophila RKU-1 dihydroxyacid dehydratase (DHAD), YP_001243973.1
11 | Victivallis vadensis ATCC BAA-548 dihydroxyacid dehydratase (DHAD), ZP_01924101.1
12 | Termite group 1 bacterium phylotype Rs-D17 dihydroxyacid dehydratase (DHAD), YP_001956631.1
13 | Yarrowia lipolytica dihydroxyacid dehydratase (DHAD), XP_502180.2
14 | Francisella tularensis subsp. tularensis WY96-3418 dihydroxyacid dehydratase (DHAD), YP_001122023.1
15 | Arabidopsis thaliana dihydroxyacid dehydratase (DHAD), AAK64025.1
16 | Candidatus Koribacter versatilis Ellin345 dihydroxyacid dehydratase (DHAD), YP_592184.1 (Acidobacter)
17 | Gramella forsetii KT0803 dihydroxyacid dehydratase (DHAD), YP_862145.1
18 | Lactococcus lactis subsp. lactis Il1403 dihydroxyacid dehydratase (DHAD), NP_267379.1
19 | Saccharopolyspora erythraea NRRL 2338 dihydroxyacid dehydratase (DHAD), YP_001103528.2
20 | Saccharomyces cerevisiae Ilv3, NP_012550.1
21 | Piromyces sp E2 ilvD
22 | Ralstonia eutropha JMP134 ilvD, YP_298150.1
23 | Chromohalobacter salexigens ilvD, YP_573197.1
24 | Picrophilus torridus DSM9790 ilvD, YP_024215.1
25 | Sulfolobus tokodaii str. 7 dihydroxyacid dehydratase (DHAD), NP_378168.1
26 | Saccharomyces cerevisiae Ilv3ΔN
27 | P(I/L)XXXGX(I/L)XIL (conserved motif described in Example 17)
28 | PIKXXGX(I/L)XIL (conserved motif described in Example 17)
TABLE-US-00002 TABLE 2 Nucleic acid sequences disclosed herein.
SEQ ID NO | Gene, Accession No.
87 | Lactococcus lactis subsp. lactis Il1403 (Ll_ilvD)
88 | Saccharomyces cerevisiae ILV3 (ScILV3(FL))
89 | Saccharomyces cerevisiae ILV3.DELTA.N (ScILV3.DELTA.N)
90 | Gramella forsetii KT0803 (Gf_ilvD)
91 | Saccharopolyspora erythraea NRRL 2338 (Se_ilvD)
92 | Candidatus Koribacter versatilis Ellin345 ilvD (Acidobacter)
93 | Piromyces sp E2 ilvD (Piromyces ilvD)
94 | Ralstonia eutropha JMP134 ilvD, (Re_ilvD)
95 | Chromohalobacter salexigens ilvD, (Cs_ilvD)
96 | Picrophilus torridus DSM9790 ilvD, (Pt_ilvD)
97 | Sulfolobus tokodaii str. 7 ilvD, (St_ilvD)
98 | E. coli ilvC.sup.Q110V, (Ec_ilvC(Q110V))
99 | Lactococcus lactis kivD, (Ll_kivD)
100 | S. cerevisiae ILV5, (ScILV5)
[0239] Determination of optical density. The optical density of the
yeast cultures is determined at 600 nm using a DU 800
spectrophotometer (Beckman-Coulter, Fullerton, Calif., USA).
Samples are diluted as necessary to yield an optical density of
between 0.1 and 0.8.
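The dilution bookkeeping for these readings can be sketched in Python as shown below. This is an illustrative sketch only and is not part of the disclosed method: the function name and the error handling are assumptions introduced here, and only the 0.1 to 0.8 reading window is taken from the text.

```python
def culture_od600(measured_od, dilution_factor):
    """Return the OD600 of the undiluted culture from a diluted reading.

    measured_od: absorbance read on the diluted sample (hypothetical input).
    dilution_factor: e.g. 10 for a 1:10 dilution.
    """
    # The text asks for readings between 0.1 and 0.8; reject anything else.
    if not 0.1 <= measured_od <= 0.8:
        raise ValueError("reading outside 0.1-0.8; adjust the dilution")
    return measured_od * dilution_factor


# Hypothetical reading: a 1:10 dilution that reads 0.45.
undiluted = culture_od600(0.45, 10)
```

A reading outside the window raises an error rather than returning a value, which mirrors the instruction to dilute "as necessary" before recording the measurement.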
[0240] Gas Chromatography. Analysis of volatile organic compounds,
including ethanol and isobutanol, was performed on an HP 5890 gas
chromatograph fitted with an HP 7673 Autosampler, a DB-FFAP column
(J&W; 30 m length, 0.32 mm ID, 0.25 .mu.m film thickness) or
equivalent connected to a flame ionization detector (FID). The
temperature program was as follows: 200.degree. C. for the
injector, 300.degree. C. for the detector, 50.degree. C. oven for 1
minute, 31.degree. C./minute gradient to 140.degree. C., and then
hold for 2.5 min. Analysis was performed using authentic standards
(>99%, obtained from Sigma-Aldrich), and a 5-point calibration
curve with 1-pentanol as the internal standard.
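The oven program above implies a fixed run time per injection, which can be checked with a short calculation. The following Python sketch is illustrative only; the function and parameter names are assumptions, while the default values (50.degree. C. start, 1 minute hold, 31.degree. C./minute ramp, 140.degree. C. end, 2.5 minute hold) are those stated in the text.

```python
def oven_program_minutes(start_c=50.0, initial_hold_min=1.0,
                         ramp_c_per_min=31.0, end_c=140.0,
                         final_hold_min=2.5):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Roughly 6.4 minutes for the program described above.
run_time = oven_program_minutes()
```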
[0241] High Performance Liquid Chromatography for quantitative
analysis of glucose and organic acids. Analysis of glucose and
organic acids was performed on a HP-1100 High Performance Liquid
Chromatography system equipped with an Aminex HPX-87H Ion Exclusion
column (Bio-Rad, 300.times.7.8 mm) or equivalent and an H.sup.+
cation guard column (Bio-Rad) or equivalent. Organic acids were
detected using an HP-1100 UV detector (210 nm, 8 nm 360 nm
reference) while glucose was detected using an HP-1100 refractive
index detector. The column temperature was 60.degree. C. This
method was isocratic with 0.008 N sulfuric acid in water as the
mobile phase. Flow was set at 1 mL/min. Injection volume was 20
.mu.L and the run time was 30 minutes.
[0242] High Performance Liquid Chromatography for quantitative
analysis of ketoisovalerate and isobutyraldehyde. Analysis of the
DNPH derivatives of ketoisovalerate and isobutyraldehyde was
performed on an HP-1100 High Performance Liquid Chromatography
system equipped with a Hewlett Packard 1200 HPLC stack column
(Agilent Eclipse XDB-18, 150.times.4.0 mm; 5 .mu.m particles [P/N
#993967-902] and C18 Guard cartridge). The analytes were detected
using an HP-1100 UV detector at 360 nm. The column temperature was
50.degree. C. This method was isocratic with 0.1% H.sub.3PO.sub.4
and 70% acetonitrile in water as mobile phase. Flow was set at 3
mL/min. Injection size was 10 .mu.L and the run time was 2
minutes.
[0243] Molecular biology and bacterial cell culture. Standard
molecular biology methods for cloning and plasmid construction are
generally used, unless otherwise noted (Sambrook, J., Russell, D. W.
Molecular Cloning, A Laboratory Manual. 3 ed. 2001, Cold Spring
Harbor, N.Y.: Cold Spring Harbor Laboratory Press).
[0244] Standard recombinant DNA and molecular biology techniques
used in the Examples are well known in the art and are described by
Sambrook, J., Russell, D. W. Molecular Cloning, A Laboratory Manual.
3 ed. 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory
Press; and by T. J. Silhavy, M. L. Berman, and L. W. Enquist,
Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al.,
Current Protocols in Molecular Biology, pub. by Greene Publishing
Assoc. and Wiley-Interscience (1987).
[0245] General materials and methods suitable for the routine
maintenance and growth of bacterial cultures are well known in the
art. Techniques suitable for use in the following examples may be
found as set out in Manual of Methods for General Bacteriology
(Philipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W.
Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips,
eds.), American Society for Microbiology, Washington, D.C. (1994)
or by Thomas D. Brock in Biotechnology: A Textbook of Industrial
Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland,
Mass. (1989).
[0246] Yeast transformations--S. cerevisiae. S. cerevisiae strains
were transformed by the Lithium Acetate method (Gietz et al.,
Nucleic Acids Res. 27:69-74 (1992)). Cells from 50 mL YPD cultures
(YPGal for valine auxotrophs) were collected by centrifugation
(2700 rcf, 2 minutes, 25.degree. C.) once the cultures reached an
OD.sub.600 of 1.0. The cells were washed with 50 mL sterile
water and collected by centrifugation at 2700 rcf for 2 minutes at
25.degree. C. The cells were washed again with 25 mL sterile water
and collected by centrifugation at 2700 rcf for 2 minutes at
25.degree. C. The cells were resuspended in 1 mL of 100 mM lithium
acetate and transferred to a 1.5 mL eppendorf tube. The cells were
collected by centrifugation for 20 sec at 18,000 rcf,
25.degree. C. The cells were resuspended in a volume of 100
mM lithium acetate that was approximately 4.times. the volume of
the cell pellet. A mixture of DNA (final volume of 15 .mu.l with
sterile water), 72 .mu.l 50% PEG, 10 .mu.l 1 M lithium acetate, and
3 .mu.l denatured salmon sperm DNA was prepared for each
transformation. In a 1.5 mL tube, 15 .mu.l of the cell suspension
was added to the DNA mixture (85 .mu.l), and the transformation
suspension was vortexed with 5 short pulses. The transformation was
incubated for 30 minutes at 30.degree. C., followed by incubation
for 22 minutes at 42.degree. C. The cells were collected by
centrifugation for 20 sec at 18,000 rcf, 25.degree. C. The cells
were resuspended in 100 .mu.l SOS (1 M sorbitol, 34% (v/v) YP (1%
yeast extract, 2% peptone), 6.5 mM CaCl.sub.2) or 100 .mu.l YP (1%
yeast extract, 2% peptone) and spread over an appropriate selective
plate.
[0247] Yeast transformations--K. lactis. K. lactis cells were
transformed according to a slightly modified version of the
protocol as described by Kooistra et al., Yeast 21: 781-792 (2004).
Saturated overnight-grown cultures of K. lactis cells were diluted
1:50 into 100 mL YPD and were placed in 30.degree. C. shaker (250
rpm) and grown for 4-5 hours until the culture reached an
OD.sub.600 of 0.3-0.5. Cells were collected by centrifugation (2
min, 3000.times.g) and washed with 50 ml cold sterile EB
(electroporation buffer; 10 mM Tris-HCl, pH 7.5, 270 mM sucrose, 1
mM MgCl.sub.2) at 4.degree. C. Cells were resuspended in 50 mL YPD
that contained 25 mM DTT and 20 mM HEPES, pH 8.0. Cells were
transferred back into flasks used to grow cells and incubated in
30.degree. C. incubator (without shaking) for 30 minutes. Cells
were then collected by centrifugation (2 minutes, 3000.times.g) and
washed with 10 mL ice-cold sterile EB, as above. Cells were then
resuspended using one cell pellet volume of ice-cold sterile EB.
Sixty microliters of cells were mixed with plasmid DNA and
incubated on ice for 15 minutes. For targeted integrations, or
transformation of linear DNA, approximately 200-400 ng of
non-specific, short (50-500 bp) linear DNA fragments were added to
300-400 ng of the linearized integrating DNA construct. This DNA
was either provided by gel-purified AluI-digested salmon sperm DNA,
or a mixture of annealed primers 35+36 (yielding a .about.85 bp
linear duplex fragment). Cells were transferred to a chilled
electroporation (2 mm) cuvette and pulsed using a BioRad Gene
Pulser at 1 kV, 400 .OMEGA., and 25 .mu.F. The cell suspension was
immediately transferred to a 14 mL round-bottom Falcon tube with 1
mL room temperature YPD and allowed to incubate vertically at
30.degree. C., 225 RPM for 6-18 h. Cells were collected in a
1.7 mL tube by centrifugation for 10 seconds at maximum speed, and
resuspended with 150 .mu.L YPD before being spread onto appropriate
selection plates.
[0248] Yeast colony PCR with FailSafe.TM. PCR System
(EPICENTRE.RTM. Biotechnologies, Madison, Wis.; Catalog #FS99250):
Cells from each colony were added to 20 .mu.l of colony PCR mix
(per reaction mix contains 6.8 .mu.l water, 1.5 .mu.l of each
primer, 0.2 .mu.l of FailSafe PCR Enzyme Mix and 10 .mu.l 2.times.
FailSafe Master Mix). Unless otherwise noted, 2.times. FailSafe
Master Mix E was used. The PCR reactions were incubated in a
thermocycler using the following touchdown PCR conditions: 1 cycle
of 94.degree. C..times.2 min, 10 cycles of 94.degree. C..times.20
s, 63.degree.-54.degree. C..times.20 s (decrease 1.degree. C. per
cycle), 72.degree. C..times.60 s, 40 cycles of 94.degree.
C..times.20 s, 53.degree. C..times.20 s, 72.degree. C..times.60 s
and 1 cycle of 72.degree. C..times.5 min.
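The touchdown schedule and the reaction mix above can be written out explicitly as a consistency check. The Python sketch below is illustrative only: the function names and the dictionary keys are assumptions introduced here, and the volumes and temperatures are those given in the text.

```python
def touchdown_annealing_temps(start_c=63, end_c=54, step_c=1):
    """Annealing temperature for each touchdown cycle, highest first."""
    return list(range(start_c, end_c - 1, -step_c))


def colony_pcr_mix_volume_ul():
    """Sum the per-reaction component volumes (microliters) from the text."""
    components = {
        "water": 6.8,
        "primer_1": 1.5,
        "primer_2": 1.5,
        "enzyme_mix": 0.2,
        "master_mix_2x": 10.0,
    }
    return sum(components.values())


temps = touchdown_annealing_temps()   # one entry per touchdown cycle
mix_ul = colony_pcr_mix_volume_ul()   # should match the 20 microliter mix
```

Stepping from 63.degree. C. to 54.degree. C. in 1.degree. C. decrements gives ten annealing temperatures, which agrees with the 10 touchdown cycles stated above, and the component volumes add up to the stated 20 .mu.l.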
[0249] Zymoclean Gel DNA Recovery Kit (Zymo Research, Orange,
Calif.; Catalog #D4002) Protocol: DNA fragments were recovered from
agarose gels according to manufacturer's protocol.
[0250] Zymo Research DNA Clean and Concentrator Kit (Zymo Research,
Orange, Calif.; Catalog #D4004) Protocol: DNA fragments were
purified according to manufacturer's protocol.
[0251] Preparation of cell lysates for in vitro enzyme assays. To
grow cultures for cell lysates, triplicate independent cultures of
the desired strain were grown overnight in 3 mL of the appropriate
medium at 30.degree. C., 250 rpm. The following day, the overnight
cultures were diluted into 50 mL fresh medium in 250 mL
baffle-bottomed Erlenmeyer flasks and incubated at 30.degree. C. at
250 rpm. Cells were grown for at least 4 generations and the
cultures were harvested in mid log phase (OD.sub.600 of 1-3). The
cells of each culture were collected by centrifugation
(2700.times.g, 5 min, 4.degree. C.). The cell pellets were washed
by resuspending in 20 mL of ice cold water. The cells were
centrifuged at 2700.times.g, 4.degree. C. for 5 min. All
supernatant was removed from each tube and the tubes were frozen at
-80.degree. C. until use.
[0252] Lysates were prepared by thawing each cell pellet on ice and
preparing a 20% (w/v) cell suspension in lysis buffer. The lysis
buffer was varied for each enzyme assay and consisted of: 0.1 M
Tris-HCl pH 8.0, 5 mM MgSO.sub.4 for DHAD assays; 50 mM potassium
phosphate buffer pH 6.0, 1 mM MgSO.sub.4 for ALS assays; 250 mM
KPO.sub.4 pH 7.5, 10 mM MgCl.sub.2 for KARI assays; and 50 mM
NaHPO.sub.4, 5 mM MgCl.sub.2 for KIVD assays. 10 .mu.L of
Yeast/Fungal Protease Arrest solution (G Biosciences, catalog
#788-333) per 1 mL of lysis buffer was used. 800 microliters of
cell suspension were added to 1 mL of 0.5 mm glass beads that had
been placed in a chilled 1.5 mL tube. Cells were lysed by bead
beating (6 rounds, 1 minute per round, 30 beats per second) with 2
minutes chilling on ice in between rounds. The tubes were then
centrifuged (20,000.times.g, 15 min) to pellet debris and the
supernatants (cell lysates) were retained in fresh tubes on ice.
The protein concentration of each lysate was measured using the
BioRad Bradford protein assay reagent (BioRad, Hercules, Calif.)
according to manufacturer's instructions.
[0253] Preparation of fractionated lysates from S. cerevisiae
strains for in vitro enzyme assays. To grow cultures for
fractionated cell lysates, triplicate independent cultures of the
desired strain were grown overnight in 3 mL of the appropriate
medium at 30.degree. C., 250 rpm. The following day, the overnight
cultures were used to inoculate 1 L cultures of each strain which
were grown in the appropriate medium at 30.degree. C. at 250 rpm
until they reached an OD.sub.600 of approximately 2. The cells were
collected by centrifugation (1600.times.g, 2 min) and the culture
medium was decanted. The cell pellets were resuspended in 50 mL
sterile deionized water, collected by centrifugation (1600.times.g,
2 min), and the supernatant was discarded.
[0254] To obtain spheroplasts, the cell pellets were resuspended in
0.1 M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells were then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer (final
concentrations: 1.2 M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate pH 7.4) and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(approximately 16 hrs) at 30.degree. C. with gentle agitation (60
rev/min) on an orbital shaker. The efficacy of spheroplasting was
ascertained by diluting an aliquot of each cell suspension 1:10 in
either sterile water or in spheroplasting buffer, and comparing the
aliquots microscopically (under 40.times. magnification). In all
cases, >90% of the water-diluted cells lysed, indicating
efficient spheroplasting. The spheroplasts were centrifuged
(3000.times.g, 10 min, 20.degree. C.), and the supernatant was
discarded. Each cell pellet was resuspended in 50 mL spheroplast
buffer without Zymolyase, and cells were collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
[0255] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) were saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant was transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000.times.g, 20 min,
4.degree. C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant was transferred to a clean tube on ice, being careful
to avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets were resuspended in MIB
containing Protease Arrest solution, and were labelled the "P"
("pellet") fractions. The BioRad Protein Assay reagent (BioRad,
Hercules, Calif.) was used according to manufacturer's instructions
to determine the protein concentration of each fraction.
[0256] Preparation of fractionated lysates from K. lactis strains
for in vitro enzyme assays. Cultures (20 mL YPD) were inoculated
with yeast cells (GEVO1742 and GEVO1829) and incubated at
30.degree. C. while shaking at 250 RPM until they reached late-log
to stationary phase (OD.sub.600 of approximately 10). Cells from
the 20 mL cultures were used to inoculate a 250 mL YPD culture at an
OD.sub.600 of approximately 0.2. The cultures were incubated at
30.degree. C. while shaking at 250 RPM until they reached
mid-log (OD.sub.600 .about.2).
[0257] To prepare spheroplasts, the cells were collected in 500 mL
bottles at 5000.times.g for 5 minutes at room temperature. The
pellets were resuspended with 8 mL Spheroplasting Buffer A (25 mM
potassium phosphate (pH 7.5), 1 mM MgCl.sub.2, 1 mM EDTA, 1.25 mM
TPP, 1 mM DTT) without sorbitol and transferred to pre-weighed 50
mL tubes. The cells were collected at 1600.times.g for 5 minutes at
room temperature. The cells were resuspended with 8 mL of
Spheroplasting Buffer A with 2.5 M Sorbitol (Amresco Code#0691) and
protease inhibitor (G Biosciences Yeast/Fungal ProteaseArrest.TM.
(Catalog #788-333)). Approximately 5 mg of Zymolyase 20T
(Seikagaku Biobusiness Code# 120491) was added to each cell
suspension. The suspensions were incubated at 30.degree. C. with
gentle agitation (e.g. 50 RPM), with the tube on its side for good
mixing, for 1-2 hours. The efficiency of formation of spheroplasts
was verified by dilution of the spheroplast suspension 1:10 into
Spheroplasting Buffer A with 2.5 M sorbitol and 1:10 in water.
Spheroplasts should remain intact when diluted into the buffer but
appear fuzzy or completely disappear when diluted into water. The
spheroplasts were collected at 1600.times.g for 7 minutes at
4.degree. C. The spheroplasts were gently washed with 2 mL of
Spheroplasting Buffer A with 2.5 M sorbitol and protease inhibitor,
and collected at 1600.times.g for 7 minutes at 4.degree. C. The
spheroplasts were resuspended in 2 mL of Spheroplasting Buffer A
with 2.5 M sorbitol and protease inhibitor.
[0258] To fractionate the spheroplasts, 8 mL of Spheroplasting
Buffer A with 0.2 M sorbitol and protease inhibitor was slowly
added to the cell suspension, bringing the final concentration of
Sorbitol to 0.66 M. The spheroplasts were broken with 10 strokes
using a B (tight fitting) pestle in a 15 mL Dounce homogenizer
(Bellco Glass, Inc. Cat#1984-10015) on ice. The homogenate was
transferred to a 50 mL tube, and the cell debris was collected by
centrifugation at 4.degree. C. for 10 minutes at 1600.times.g. The
supernatant was transferred to a 15 mL tube with a pipette. This
supernatant is the "W" fraction. 5 mL of this "W" fraction was
transferred to a 35 mL Oakridge tube and centrifuged at
48,000.times.g for 20 minutes at 4.degree. C. The resulting
supernatant was transferred to a 15 mL tube and labeled "S." The
pellet was resuspended in 5 mL of Spheroplasting Buffer A with 0.66
M Sorbitol and protease inhibitor and labeled "P." All fractions
were stored on ice at 4.degree. C. while in use. The BioRad Protein
Assay reagent (BioRad, Hercules, Calif.) was used according to
manufacturer's instructions to determine the protein concentration
of each fraction.
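The final sorbitol concentration quoted for the K. lactis fractionation step follows from a volume-weighted average of the two buffers that are combined (2 mL at 2.5 M plus 8 mL at 0.2 M). The Python sketch below is illustrative only; the function name and the tuple layout are assumptions introduced here.

```python
def mixed_concentration(parts):
    """Volume-weighted concentration of a mixture.

    parts: iterable of (volume, concentration) pairs in consistent units.
    """
    parts = list(parts)
    total_volume = sum(volume for volume, _ in parts)
    total_amount = sum(volume * conc for volume, conc in parts)
    return total_amount / total_volume


# 2 mL of spheroplasts in 2.5 M sorbitol plus 8 mL of buffer with 0.2 M sorbitol.
final_sorbitol_molar = mixed_concentration([(2.0, 2.5), (8.0, 0.2)])
```

The result, (2 x 2.5 + 8 x 0.2) / 10 = 0.66 M, matches the value stated in the text. The calculation ignores the volume contributed by the cell pellet itself.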
[0259] ALS Assay. Cell lysates were prepared and protein
concentrations were determined as described above. The colorimetric
ALS Assay (FAD-independent) performed here was based on the assay
described in Hugenholtz, J. and Starrenburg, J. C., Appl.
Microbiol. Biotechnol. (1992) 38:17-22. Reaction buffer was
prepared by mixing 900 .mu.l 1M potassium phosphate buffer pH 6.0,
180 .mu.l 100 mM MgSO.sub.4, 180 .mu.l 100 mM TPP, 3.96 ml 500 mM
pyruvate and 12.78 ml water. For the no substrate control, the
volume of pyruvate was replaced with water. Lysates were prepared
at a final protein concentration of 2 .mu.g/.mu.l in Spheroplasting
Buffer A with 0.66 M sorbitol. To 900 .mu.L ALS Buffer, 100 .mu.L
of lysate was added and incubated at 30.degree. C. for 30 min.
Acetoin standards were also prepared at concentrations of 2 mM, 1
mM, 0.5 mM, and 0 mM. From each sample and standard, 175 .mu.L was
transferred to a fresh 1.5 mL tube. To each sample and standard was
added 25 .mu.L 35% (v/v) H.sub.2SO.sub.4, and all were incubated at
37.degree. C. for 30 mins. After the incubation, the following were
added in order, to each standard and sample, with the solutions
being mixed by vortexing in between each addition: 50 .mu.L 50%
(w/v) NaOH, 50 .mu.L 0.5% creatine, and 50 .mu.L 5% 1-naphthol (in
2.5N NaOH). The samples and standards were incubated at room
temperature for 1 hour, being mixed by vortexing every 15 minutes.
To a 96 well, half-area, UV-Star, transparent, flat-bottom plate
(Catalog #675801, Greiner Bio One, Frickenhausen, Germany), 100 uL
of each sample or standard was transferred, and the samples were
analyzed by a plate reader by measuring absorbance at 530 nm.
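The final component concentrations of the ALS reaction buffer are not stated directly, but they follow from the stock concentrations and volumes listed above. The Python sketch below is illustrative only; the names `ALS_BUFFER` and `final_concentrations_mm` are assumptions introduced here, and the stocks and volumes are those given in the text.

```python
# (component, stock concentration in mM, volume in microliters)
ALS_BUFFER = [
    ("potassium phosphate pH 6.0", 1000.0, 900.0),
    ("MgSO4", 100.0, 180.0),
    ("TPP", 100.0, 180.0),
    ("pyruvate", 500.0, 3960.0),
    ("water", 0.0, 12780.0),
]


def final_concentrations_mm(components):
    """Apply C1*V1 = C2*V2 to each component of a mixed buffer."""
    total_ul = sum(volume for _, _, volume in components)
    return {
        name: stock * volume / total_ul
        for name, stock, volume in components
        if stock > 0
    }


als_final = final_concentrations_mm(ALS_BUFFER)
```

With a total volume of 18 mL, this gives 50 mM potassium phosphate, 1 mM MgSO.sub.4, 1 mM TPP, and 110 mM pyruvate in the buffer before lysate is added. Adding 100 .mu.L of lysate to 900 .mu.L of buffer reduces each value by a further factor of 0.9.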
[0260] KARI Assay. Cell lysates were prepared and protein
concentrations were determined as described above. Acetolactate
substrate was made by mixing 50 .mu.l of
ethyl-2-acetoxy-2-methyl-acetoacetate with 990 .mu.l of water. Then 10
.mu.l of 2 N NaOH was sequentially added, with vortex mixing
between additions, until 260 .mu.l of NaOH was added. The
acetolactate was agitated at room temperature for 20 min and then
held on ice. NADPH was prepared in 0.01 N NaOH (to improve
stability) to a concentration of 50 mM. The concentration was
determined by reading the OD of a diluted sample at 340 nm in a
spectrophotometer and using the molar extinction coefficient of
6.22 mM.sup.-1 cm.sup.-1 to calculate the actual concentration (the
OD.sub.340 of a 100 .mu.M solution of NAD(P)H should be 0.622).
Three buffers were prepared and held on ice. Reaction buffer
contained 250 mM KPO.sub.4 pH 7.5, 10 mM MgCl.sub.2, 1 mM DTT, 10
mM acetolactate, and 0.2 mM NADPH. No substrate buffer contained
everything except the acetolactate. No NAD(P)H buffer contained
everything except the NADPH. Reactions were performed in triplicate
using 10 .mu.l of cell extract with 90 .mu.l of reaction buffer in
a 96-well plate in a SpectraMax 340PC multi-plate reader (Molecular
Devices, Sunnyvale, Calif.). The reaction was followed at 340 nm by
measuring a kinetic curve for 5 minutes, with OD readings taken
every 10 seconds. The reactions were performed at 30.degree. C. The
reactions were performed in complete, no substrate, and no NAD(P)H
buffers. The V.sub.max for each extract was determined after
subtracting the background reading of the no substrate control from
the reading in complete buffer.
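The spectrophotometric check on the NADPH stock and the background subtraction can be sketched with the Beer-Lambert law. The Python sketch below is illustrative only. It assumes an extinction coefficient of 6220 M.sup.-1 cm.sup.-1 at 340 nm (equivalently 6.22 mM.sup.-1 cm.sup.-1), which is the value consistent with the stated check that a 100 .mu.M solution reads 0.622, and a 1 cm path length; the function names are assumptions introduced here.

```python
EPSILON_NADPH_340 = 6220.0  # M^-1 cm^-1, assumed; consistent with 100 uM -> 0.622


def nadph_concentration_molar(a340, path_cm=1.0, dilution_factor=1.0):
    """Beer-Lambert: c = A / (epsilon * l), corrected for sample dilution."""
    return a340 / (EPSILON_NADPH_340 * path_cm) * dilution_factor


def background_corrected_rate(complete_rate, no_substrate_rate):
    """Subtract the no-substrate control rate from the complete-buffer rate."""
    return complete_rate - no_substrate_rate


# A reading of 0.622 in a 1 cm cuvette corresponds to 100 micromolar NADPH.
check_molar = nadph_concentration_molar(0.622)
```

In a 96-well plate the optical path length depends on the fill volume, so converting plate-reader slopes to absolute rates would need a path-length correction that the text does not specify.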
[0261] DHAD Assay. Cell lysates were prepared and protein
concentrations were determined as described above. The DHAD
activity of each lysate was ascertained as follows. In a fresh 1.5
mL centrifuge tube, 50 .mu.L of each lysate was mixed with 50 .mu.L
of 0.1 M 2,3-dihydroxyisovalerate (DHIV), 25 .mu.L of 0.1 M
MgSO.sub.4, and 375 .mu.L of 0.05M Tris-HCl pH 8.0, and the mixture
was incubated for 30 min at 35.degree. C. Each tube was then heated
to 95.degree. C. for 5 min to inactivate any enzymatic activity,
and the solution was centrifuged (16,000.times.g for 5 min) to
pellet insoluble debris. To prepare samples for analysis, 100 .mu.L
of each reaction were mixed with 100 .mu.L of a solution consisting
of 4 parts 15 mM dinitrophenyl hydrazine (DNPH) in acetonitrile
with 1 part 50 mM citric acid, pH 3.0, and the mixture was heated
to 70.degree. C. for 30 min in a thermocycler. The solution was
then analyzed by HPLC as described above in General Methods to
quantitate the concentration of ketoisovalerate (KIV) present in
the sample.
[0262] KIVD Assay. Cell lysates were prepared and protein
concentrations were determined as described above. KIVD Assay
buffer, containing 1 Roche Protease Inhibitor tablet per 5 mL
buffer, was added to each cell pellet to create a 20% (w/v) cell
suspension. The KIVD assay buffer was prepared at a final
concentration of 0.05 M NaHPO.sub.4*H.sub.2O, 5 mM
MgCl.sub.2*8H.sub.2O, and 1.5 mM Thiamin pyrophosphate chloride.
The reaction substrate, .alpha.-keto-isovalerate
(3-methyl-2-oxobutanoic acid, Acros Organics), was added where
appropriate at 30 mM. Lysates were diluted in reaction buffer at a
final protein concentration of 0.1 .mu.g/.mu.L. To 1.5 mL tubes, 50
.mu.L of lysate (5 .mu.g of protein) was mixed with 200 .mu.L of
reaction buffer with or without substrate. The reactions were
incubated at 37.degree. C. for 20 minutes, and the reactions were
immediately filtered through a 2 .mu.m filter plate. The filtered
samples were diluted 1:10 in water, and 100 .mu.L of the 1:10
dilution was mixed with 100 .mu.L of derivatization reagent in
0.2 ml thin-wall PCR tubes. Derivatization reagent was prepared by
mixing 4 ml of 15 mM 2,4-dinitrophenyl hydrazine (DNPH) in
HPLC-grade acetonitrile with 1 ml of 50 mM citric acid buffer, pH 3.
The samples were incubated at 70.degree. C. for 30 minutes. The
samples were analyzed by HPLC.
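An HPLC reading from this assay has to be corrected for the dilutions applied after the reaction before it can be expressed per unit of protein. The Python sketch below shows one way this back-calculation might be done; it is illustrative only and is not the applicants' stated calculation. It assumes that "1:10" means a 10-fold dilution and that the 1:1 mix with derivatization reagent adds a further 2-fold dilution (20-fold in total), and it uses the 250 .mu.L reaction volume, 20 minute incubation, and 5 .mu.g of protein given in the text. The function name and the example reading are hypothetical.

```python
def kivd_specific_activity(hplc_um, control_um=0.0,
                           post_reaction_dilution=20.0,
                           reaction_volume_ml=0.250,
                           minutes=20.0, protein_mg=0.005):
    """Return product formed in nmol per minute per mg protein.

    hplc_um: product concentration measured by HPLC (micromolar).
    control_um: reading from the matching no-substrate reaction.
    """
    # Undo the 1:10 water dilution and the 1:1 mix with derivatization reagent.
    reaction_um = (hplc_um - control_um) * post_reaction_dilution
    # micromolar x milliliters = nanomoles
    product_nmol = reaction_um * reaction_volume_ml
    return product_nmol / minutes / protein_mg


# Hypothetical reading of 10 micromolar with no background.
activity = kivd_specific_activity(10.0)
```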
[0263] ADH Assay. Cell lysates were prepared and protein
concentrations were determined as described above. Assays (set up
in triplicate for each lysate) contained 10 .mu.L of each lysate
(or an appropriate dilution of each lysate) plus 90 .mu.L of
reaction buffer, which consisted of (final concentrations present
in 1.times. reaction buffer): 0.1M Tris-HCl pH 7.5, 10 mM
MgCl.sub.2, 1 mM DTT, 0.2 mM NADH (or NADPH, where indicated; each
diluted from a 4.4 mM spectrophotometrically-confirmed stock), and
11 mM isobutyraldehyde. Where indicated, as controls a parallel set
of assay reactions were set up using reaction buffer lacking
isobutyraldehyde and/or NAD(P)H, as indicated. For experiments
measuring the acetaldehyde-dependent oxidation of NAD(P)H, reaction
buffer was prepared in which acetaldehyde was substituted for
isobutyraldehyde. In these cases, the reaction buffer contained at
least 11 mM acetaldehyde, although the exact amount present is an
estimate due to the inherent difficulties of pipetting acetaldehyde
solution. Finally, in some cases a parallel set of reactions
lacking yeast cell lysate was included as a negative control. After
being added (using a multi-channel pipet) to the wells of a 96-well
plate, the reactions were immediately placed into a plate reader
that had been pre-warmed to 30.degree. C., and the absorbance at
340 nm was measured every 12 seconds over a period of 300 seconds.
Kinetic parameters were computed from assays with linear slopes
(where necessary, assays were repeated with appropriate dilutions
to obtain linear NAD(P)H consumption curves).
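The requirement that kinetic parameters be taken only from linear traces can be checked numerically with an ordinary least-squares fit of absorbance against time. The Python sketch below is illustrative only: the function name is an assumption, the trace is synthetic data generated for the example rather than a measured result, and only the read schedule (every 12 seconds over 300 seconds) is taken from the text.

```python
def linear_fit(times_s, absorbances):
    """Least-squares slope (absorbance units per second) and R^2 of a trace."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_s, absorbances))
    syy = sum((a - mean_a) ** 2 for a in absorbances)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, r_squared


# Read schedule from the text: every 12 s over 300 s (26 reads including t = 0).
read_times = list(range(0, 301, 12))
# Synthetic, perfectly linear trace used only to exercise the function.
synthetic_trace = [1.0 - 0.001 * t for t in read_times]
slope, r_squared = linear_fit(read_times, synthetic_trace)
```

A trace whose R.sup.2 falls well below 1 would be a candidate for repeating the assay at a higher lysate dilution, as described above; the acceptance threshold is left to the user because the text does not specify one.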
Composition of Culture Media
[0264] Drugs: When indicated, G418 (Calbiochem, Gibbstown, N.J.)
was added at 0.2 g/L, Phleomycin (InvivoGen, San Diego, Calif.) was
added at 7.5 mg/L, Hygromycin (InvivoGen, San Diego, Calif.) was
added at 0.2 g/L, and 5-fluoro-orotic acid (FOA; Toronto Research
Chemicals, North York, Ontario, Canada) was added at 1 g/L.
[0265] YP: 1% (w/v) yeast extract, 2% (w/v) peptone.
[0266] YPD: YP containing 2% (w/v) glucose unless otherwise
noted.
[0267] YPGal: YP containing 2% (w/v) galactose.
[0268] YPE: YP containing 2% (w/v) ethanol.
[0269] SC media: 6.7 g/L Difco.TM. Yeast Nitrogen Base, 14 g/L
Sigma.TM. Synthetic Dropout Media supplement (includes amino acids
and nutrients excluding histidine, tryptophan, uracil, and leucine;
Sigma-Aldrich, St. Louis, Mo.), 0.076 g/L histidine, 0.076 g/L
tryptophan, 0.380 g/L leucine, and 0.076 g/L uracil. Drop-out
versions of SC media are made by omitting one or more of histidine
(H), tryptophan (W), leucine (L), or uracil (U or Ura). When
indicated, SC media are supplemented with additional isoleucine
(9.times.I; 0.684 g/L), valine (9.times.V; 0.684 g/L) or both
isoleucine and valine (9.times.IV). SCD is SC containing 2% (w/v)
glucose unless otherwise noted, SCGal is SC containing 2% (w/v)
galactose and SCE is SC containing 2% (w/v) ethanol. For example,
SCD-Ura+9.times.IV would be composed of 6.7 g/L Difco.TM. Yeast
Nitrogen Base, 14 g/L Sigma.TM. Synthetic Dropout Media supplement
(includes amino acids and nutrients excluding histidine,
tryptophan, uracil, and leucine), 0.076 g/L histidine, 0.076 g/L
tryptophan, 0.380 g/L leucine, 0.684 g/L isoleucine, 0.684 g/L
valine, and 20 g/L glucose.
[0270] SCD-V+9.times.I: 6.7 g/L Difco.TM. Yeast Nitrogen Base,
0.076 g/L Adenine hemisulfate, 0.076 g/L Alanine, 0.076 g/L
Arginine hydrochloride, 0.076 g/L Asparagine monohydrate, 0.076 g/L
Aspartic acid, 0.076 g/L Cysteine hydrochloride monohydrate, 0.076
g/L Glutamic acid monosodium salt, 0.076 g/L Glutamine, 0.076 g/L
Glycine, 0.076 g/L myo-Inositol, 0.76 g/L Isoleucine, 0.076 g/L
Lysine monohydrochloride, 0.076 g/L Methionine, 0.008 g/L
p-Aminobenzoic acid potassium salt, 0.076 g/L Phenylalanine, 0.076
g/L Proline, 0.076 g/L Serine, 0.076 g/L Threonine, 0.076 g/L
Tyrosine disodium salt, and 20 g/L glucose.
[0271] YNB: 6.7 g/L Difco.TM. Yeast Nitrogen Base supplemented with
indicated nutrients as follows: histidine (H; 0.076 g/L),
tryptophan (W; 0.076 g/L), leucine (L; 0.380 g/L), uracil (U or
Ura; 0.076 g/L), isoleucine (I; 0.076 g/L), valine (V; 0.076 g/L),
and casamino acids (CAA; 10 g/L). When indicated, YNB media are
supplemented with higher amounts of isoleucine (10.times.I=0.76
g/L), valine (10.times.V=0.76 g/L) or both isoleucine and valine
(10.times.IV). YNBD is YNB containing 2% (w/v) glucose unless
otherwise noted, YNBGal is YNB containing 2% (w/v) galactose and
YNBE is YNB containing 2% (w/v) ethanol. For example,
YNBGal+HWLU+10.times.I+G418 would be composed of 6.7 g/L Difco.TM.
Yeast Nitrogen Base, 0.076 g/L histidine, 0.076 g/L tryptophan,
0.380 g/L leucine, 0.076 g/L uracil, 0.76 g/L isoleucine, 0.2 g/L
G418, and 20 g/L galactose.
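The YNB naming convention described above is regular enough to be expanded mechanically into a recipe. The Python sketch below is illustrative only and covers just the tokens needed for the worked example plus a few closely related ones; the function name, the dictionary names, and the use of "10x" in place of the multiplication sign are assumptions introduced here, while all concentrations (in g/L) are those given in the text.

```python
# Single-letter supplements and their concentrations in g/L, from the text.
SUPPLEMENTS = {
    "H": ("histidine", 0.076),
    "W": ("tryptophan", 0.076),
    "L": ("leucine", 0.380),
    "U": ("uracil", 0.076),
}
# Carbon source implied by the medium prefix; 2% (w/v) equals 20 g/L.
CARBON_SOURCE = {"YNBD": "glucose", "YNBGal": "galactose", "YNBE": "ethanol"}


def ynb_recipe(name):
    """Expand a name such as 'YNBGal+HWLU+10xI+G418' into g/L amounts."""
    parts = name.split("+")
    recipe = {"Yeast Nitrogen Base": 6.7, CARBON_SOURCE[parts[0]]: 20.0}
    for part in parts[1:]:
        if part == "G418":
            recipe["G418"] = 0.2
        elif part == "CAA":
            recipe["casamino acids"] = 10.0
        elif part in ("10xI", "10xIV"):
            recipe["isoleucine"] = 0.76
            if part == "10xIV":
                recipe["valine"] = 0.76
        elif part == "10xV":
            recipe["valine"] = 0.76
        else:
            # A run of single-letter codes, e.g. 'HWLU'.
            for letter in part:
                nutrient, grams_per_l = SUPPLEMENTS[letter]
                recipe[nutrient] = grams_per_l
    return recipe


example = ynb_recipe("YNBGal+HWLU+10xI+G418")
```

For the worked example in the text, the function returns 6.7 g/L Yeast Nitrogen Base, 0.076 g/L each of histidine, tryptophan, and uracil, 0.380 g/L leucine, 0.76 g/L isoleucine, 0.2 g/L G418, and 20 g/L galactose. Tokens not listed here, such as other drugs or the "Ura" spelling of uracil, are not handled by this sketch.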
[0272] Plates: Solid versions of the above described media contain
2% (w/v) agar.
Example 1
Isobutanol Pathway is Partially Cytosolic when Expressed in
Yeast
[0273] The purpose of this example is to illustrate that three
enzymes in the isobutanol biosynthetic pathway (acetolactate
synthase, ketoisovalerate decarboxylase, and isobutanol
dehydrogenase) are localized to the cytosol when expressed in
yeast.
TABLE-US-00003 TABLE 3 Genotype of strains disclosed in Example 1.
GEVO No. | Genotype/Source
1287 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1]
1742 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] pdc1::Kan.sup.R
1829 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] pdc1::kan.sup.R {P.sub.TDH3:Ec_ilvC-.DELTA.N; P.sub.TEF1:Ec_ilvD-.DELTA.N(codon optimized for K. lactis): ScLEU2 integrated} {P.sub.TEF1:Ll_kivD; P.sub.TDH3:ScADH7: KmURA3 integrated} {P.sub.CUP1-1:Bs_alsS, TRP1 random integrated}
TABLE-US-00004 TABLE 4 Plasmids disclosed in Example 1.
pGV No. | Genotype
pGV1503 | ScTEF1promoter-kanR bla, pUC ori (GEVO)
pGV1537 | KlPDC1 promoter region + Klpdc1 3'UTR sequence, ScTEF1promoter-kanR bla, pUC ori (GEVO)
pGV1590 | TEF1 promoter:Ll-kivd (codon optimized for E. coli): TDH3 promoter:ADH7:CYC1 terminator, Km-URA3, 1.6 micron ori, bla, pUC ori (GEVO)
pGV1726 | CUP1 promoter:Bs-alsS:CYC1 terminator, TRP1, bla, pUC-ori
pGV1727 | TEF1 promoter:Ec-ilvD.DELTA.N (codon optimized for K. lactis):TDH3 promoter:Ec-ilvC.DELTA.N:CYC1 terminator, LEU2, bla, pUC ori (GEVO)
Plasmids
[0274] pGV1503 contains an S. cerevisiae TEF1 promoter region
driving a G418-resistance gene (kanR).
[0275] pGV1537 was constructed by inserting an (AatII plus
MfeI)-digested PCR product, containing approximately 500 bp each of
the KlPDC1 5' and 3' untranslated regions, into (AatII plus
EcoRI)-digested pGV1503. The insert was generated by SOE-PCR.
First, the KlPDC1 5' and 3' untranslated regions were amplified
from K. lactis genomic DNA by primer pairs 1006+1016 and 1017+1009,
respectively. Primers 1016 and 1017 were designed to have
overlapping sequences. The two fragments were then joined by PCR
using primers 1006+1009.
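By way of illustration only, the joining step of SOE-PCR can be modeled in silico as merging two fragments at their shared terminal overlap. The Python sketch below is a simplified model, not the disclosed procedure: the function name join_by_overlap, the 15-nucleotide minimum overlap, and both toy sequences are hypothetical and do not correspond to primers 1016 and 1017 or to any KlPDC1 sequence.

```python
# Illustrative in silico model of the SOE-PCR joining step: two fragments
# whose ends share an overlap are merged into one product.
# The sequences below are invented placeholders, not KlPDC1 sequences.

FIVE_PRIME_ARM = "ATGCGTACCGGATTACGCTAGCTAGGATCC"
THREE_PRIME_ARM = "CGCTAGCTAGGATCCTTGACCGTAAGCTTA"


def join_by_overlap(upstream, downstream, min_overlap=15):
    """Merge two fragments at the longest suffix/prefix overlap.

    Raises ValueError if no overlap of at least min_overlap exists.
    """
    longest = min(len(upstream), len(downstream))
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("fragments do not share a sufficient terminal overlap")


if __name__ == "__main__":
    product = join_by_overlap(FIVE_PRIME_ARM, THREE_PRIME_ARM)
    print(len(product), product)
```

The model ignores strand orientation and primer design; it only shows why the overlapping ends designed into the inner primers allow the outer primers to amplify a single fused product.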
[0276] pGV1590 is a K. lactis plasmid for expression of the L.
lactis kivD and the S. cerevisiae ADH7. Expression of the L. lactis
kivD is driven by the S. cerevisiae TEF1 promoter and expression of
the S. cerevisiae ADH7 is driven by the S. cerevisiae TDH3
promoter. pGV1590 was generated by cloning a SalI-NotI fragment
carrying the S. cerevisiae ADH7 gene into the XhoI-NotI sites of
pGV1585. The S. cerevisiae ADH7 gene fragment originated as a PCR
product from S. cerevisiae genomic DNA using primers 410 and
411.
[0277] pGV1726 is a yeast integration plasmid (utilizing the S.
cerevisiae TRP1 gene as selection marker) for random integration
(i.e. for K. lactis). This plasmid does not carry a yeast
replication origin, thus is unable to replicate episomally. This
plasmid also carries the B. subtilis alsS gene, whose expression is
under the control of the S. cerevisiae CUP1 promoter. pGV1726 was
generated by cloning a SacI-NgoMIV fragment carrying the S.
cerevisiae CUP1 promoter, Bs-alsS ORF and the CYC1 terminator into
the same sites of pGV1645. The vector, pGV1645, is a K. lactis
expression plasmid that was used for expression of the B. subtilis
alsS under the control of the K. lactis PDC1 promoter. This plasmid
also carries the S. cerevisiae TRP1 gene as a selection marker and
the 1.6 micron replication origin. Digestion of pGV1645 with SacI
and NgoMIV removes the K. lactis PDC1 promoter, B. subtilis alsS,
CYC1 terminator and the 1.6 micron origin of replication. The
insert fragment carrying the S. cerevisiae CUP1 promoter, B.
subtilis alsS ORF and the CYC1 terminator was obtained from pGV1649
via digestion with SacI and NgoMIV. The CUP1 promoter originated as
a PCR product from S. cerevisiae genomic DNA using primers 637 and
638. The B. subtilis alsS originated as a PCR product from B.
subtilis genomic DNA using primers 767 and 697.
[0278] pGV1727 is a yeast integration plasmid (utilizing the S.
cerevisiae LEU2 gene as selection marker) for random integration
(i.e. for K. lactis). This plasmid does not carry a yeast
replication origin, thus is unable to replicate episomally. This
plasmid carries the E. coli ilvDΔN and ilvCΔN genes,
whose expression is under the control of the S. cerevisiae TEF1
and TDH3 promoters, respectively. The E. coli ilvDΔN is a
shortened version of E. coli ilvD from which the sequence coding for
the first 24 amino acids, which encodes a putative mitochondrial
targeting sequence, was removed. Likewise, the E. coli ilvCΔN
is a shortened version of E. coli ilvC from which the sequence coding
for the first 22 amino acids, which is predicted to function as a
mitochondrial targeting sequence, was removed. pGV1727 was generated
by cloning an XhoI-NgoMIV fragment carrying the E. coli ilvCΔN
gene and the CYC1 terminator into the same sites of pGV1635. The
vector, pGV1635, is a K. lactis expression plasmid that was used
for expression of the E. coli ilvDΔN gene under the control
of the S. cerevisiae TEF1 promoter. The ilvDΔN gene is
followed by the TDH3 promoter, a short MCS (includes an XhoI site),
the CYC1 terminator and the 1.6 micron replication origin. This
plasmid carries the S. cerevisiae LEU2 gene as a selection marker.
Digestion of pGV1635 with XhoI and NgoMIV removes the CYC1
terminator and the 1.6 micron replication origin. This sequence was
replaced by the insert fragment carrying the E. coli ilvCΔN
and the CYC1 terminator, which was obtained from pGV1677 digested
with XhoI and NgoMIV. The E. coli ilvDΔN originated as a PCR
product from pGV1578 (plasmid carrying E. coli ilvD codon optimized
for K. lactis from DNA2.0, Menlo Park, Calif.) using primers 1151
and 1152. The E. coli ilvCΔN originated as a PCR product from
pGV1160 (plasmid carrying the full length E. coli ilvC gene) using
primers 1149 and 1150. The E. coli ilvC in pGV1160 originated as a
PCR product from E. coli genomic DNA using primers 387 and 388.
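By way of illustration only, an N-terminal truncation of the kind described above (removal of the sequence coding for the first 24 or 22 amino acids) can be expressed as a simple operation on a coding sequence. The Python sketch below is a minimal model under stated assumptions: the function name truncate_n_terminus is hypothetical, the toy coding sequence is invented, and the re-addition of an ATG start codon is an assumption of this sketch rather than a statement about how primers 1149 to 1152 were designed.

```python
# Illustrative sketch: drop the codons for the first N residues of a coding
# sequence and ensure the shortened ORF still begins with a start codon.
# The example sequence is an invented placeholder, not ilvC or ilvD.

START_CODON = "ATG"


def truncate_n_terminus(cds, residues_removed):
    """Return cds without its first `residues_removed` codons.

    An ATG is prepended unless the remaining sequence already starts
    with one (an assumption made for this example).
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if residues_removed < 0 or 3 * residues_removed >= len(cds):
        raise ValueError("cannot remove that many residues")
    remainder = cds[3 * residues_removed:]
    if remainder.startswith(START_CODON):
        return remainder
    return START_CODON + remainder


if __name__ == "__main__":
    toy_cds = "ATGAAACCCGGGTTTTAA"  # Met-Lys-Pro-Gly-Phe-stop
    print(truncate_n_terminus(toy_cds, 2))
```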
[0279] GEVO1287 was transformed with PmlI-digested pGV1537,
yielding GEVO1742. GEVO1829 was constructed by sequentially
transforming GEVO1742 with gene fragments from pGV1590, pGV1727,
and pGV1726 following the standard lithium acetate protocol. First,
a 7.8 kb fragment of pGV1590 generated by digestion with NgoMIV and
MfeI was transformed into GEVO1742. Next, this transformant strain
was transformed with pGV1727 (FIG. 4) that had been linearized by
digestion with BcgI. Finally, this transformant strain was
transformed with pGV1726 that had been linearized by digestion with
AhdI. The final transformant was GEVO1829.
[0280] Cellular fractions were prepared from GEVO1742 and GEVO1829
as described above. The protein concentration used to calculate
specific activities from all three fractions ("W," "S," and "P")
was measured for the "W" fraction. Below are the results for the
assays measuring isobutanol dehydrogenase, acetolactate synthase,
and ketoisovalerate decarboxylase activities.
Alcohol Dehydrogenase (ADH) Assay
[0281] The results from the assay are summarized in Table 5. The
"W" fraction and the "S" fraction of the pathway carrying strain
(GEVO1829) contained at least three times the NADPH dependent
alcohol dehydrogenase activity found in the same fractions of
GEVO1742. The "W" and "S" fractions of GEVO1829 contained more than
four times the activity present in the "P" fraction. These data
indicated that S. cerevisiae Adh7 activity was predominantly
localized to the cytosol.
TABLE 5 Alcohol Dehydrogenase Activity.
Sample | Specific Alcohol Dehydrogenase Activity (U/mg protein)
1742 W | 0.08 ± 0.00
1742 S | 0.07 ± 0.02
1742 P | 0.03 ± 0.012
1829 W | 0.26 ± 0.00
1829 S | 0.25 ± 0.02
1829 P | 0.04 ± 0.02
Acetolactate Synthase (ALS) Assay
[0282] The results from the assay are summarized in Table 6. The
"W" and "S" fractions of the isobutanol pathway carrying strain
(GEVO1829) contained ALS activity, while no activity was detected
in the same fractions of GEVO1742. The "W" and "S" fractions
contained three times higher ALS activity than the "P" fraction.
These data indicated that B. subtilis ALS activity was
predominantly localized to the cytosol.
TABLE 6 Acetolactate Synthase Activity.
Sample | Specific Acetolactate Synthase Activity (U/mg protein)
1742 W | 0.00 ± 0.00
1742 S | 0.00 ± 0.00
1742 P | 0.00 ± 0.00
1829 W | 0.10 ± 0.01
1829 S | 0.10 ± 0.00
1829 P | 0.03 ± 0.00
Ketoisovalerate Decarboxylase (KIVD) Assay
[0283] The results from the assay are summarized in Table 7. The
"W" and "S" fractions of the isobutanol pathway carrying strain
(GEVO1829) contained 8-10 times greater activity than in the same
fractions of GEVO1742. Furthermore, the activity in the "S" fraction
was 45× higher than that detected in the "P" fraction. These
data indicated that L. lactis KIVD activity was predominantly
localized in the cytosol.
TABLE 7 Ketoisovalerate decarboxylase (KIVD) Assay.
Sample | Specific Ketoisovalerate Decarboxylase Activity (U/mg protein)
1742 W | 0.05 ± 0.00
1742 S | 0.05 ± 0.04
1742 P | 0.03 ± 0.00
1829 W | 0.38 ± 0.02
1829 S | 0.45 ± 0.04
1829 P | 0.01 ± 0.00
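By way of illustration only, the fold differences quoted in the text of this Example can be recomputed from the mean specific activities in Tables 5 to 7. The Python sketch below uses only those tabulated means; the dictionary layout and the function names are hypothetical, and a ratio against an activity of zero is reported as None rather than as a number.

```python
# Illustrative recomputation of fold differences from the mean specific
# activities (U/mg protein) in Tables 5-7. Names are hypothetical.

SPECIFIC_ACTIVITY = {
    "ADH": {
        "1742": {"W": 0.08, "S": 0.07, "P": 0.03},
        "1829": {"W": 0.26, "S": 0.25, "P": 0.04},
    },
    "ALS": {
        "1742": {"W": 0.00, "S": 0.00, "P": 0.00},
        "1829": {"W": 0.10, "S": 0.10, "P": 0.03},
    },
    "KIVD": {
        "1742": {"W": 0.05, "S": 0.05, "P": 0.03},
        "1829": {"W": 0.38, "S": 0.45, "P": 0.01},
    },
}


def fold(numerator, denominator):
    """Ratio of two activities, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def soluble_to_pellet(assay, strain="1829"):
    """Fold difference between the "S" and "P" fractions of one strain."""
    fractions = SPECIFIC_ACTIVITY[assay][strain]
    return fold(fractions["S"], fractions["P"])


def pathway_to_control(assay, fraction):
    """Fold difference between GEVO1829 and GEVO1742 for one fraction."""
    return fold(
        SPECIFIC_ACTIVITY[assay]["1829"][fraction],
        SPECIFIC_ACTIVITY[assay]["1742"][fraction],
    )


if __name__ == "__main__":
    for assay in SPECIFIC_ACTIVITY:
        print(assay, soluble_to_pellet(assay), pathway_to_control(assay, "S"))
```

The sketch does not propagate the reported standard deviations, so the ratios should be read as point estimates only.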
Example 2
Construction of an ILV3 Deletion Mutant
[0284] The purpose of this example is to describe the construction
of an ILV3 deletion mutant of S. cerevisiae, GEVO2244.
TABLE 8 Genotype of strains disclosed in Example 2.
GEVO No. | Genotype/Source
GEVO1147 | K. lactis, NRRL Y-1140, (obtained from USDA)
GEVO1188 | S. cerevisiae, CEN.PK, (obtained from Euroscarf); MATα ura3 leu2 his3 trp1
GEVO2145 | S. cerevisiae, CEN.PK; MATα ura3 leu2 his3 trp1 ilv3::Kl_URA3
GEVO2244 | S. cerevisiae, CEN.PK; MATα ura3 leu2 his3 trp1 ilv3Δ
TABLE 9 Plasmids disclosed in Example 2.
Plasmid name | Genotype
pUC19 | bla, pUC-ori (obtained from Invitrogen)
pGV1299 | K. lactis URA3, bla, pUC-ori (GEVO)
[0285] Plasmid pGV1299 was constructed by cloning the K. lactis
URA3 gene into pUC19. The K. lactis URA3 was obtained by PCR using
primers 575 and 576 from K. lactis genomic DNA. The PCR product was
digested with EcoRI and BamHI and cloned into pUC19 which was
similarly digested. The K. lactis URA3 insert was sequenced
(Laragen Inc) to confirm correct sequence.
[0286] The ilv3::Kl_URA3 integration cassette contained, from 5' to
3', the following: 1) an 80 bp homology to ILV3 (position +158 to
237) that functions as the 5' targeting sequence for the
integration, 2) the K. lactis URA3 marker gene, 3) a 60 bp homology
to a region of ILV3 (position -21 to +39) that is further upstream of
the 5' targeting sequence to facilitate loop-out of the K. lactis
URA3 marker, and 4) a 221 bp homology to the 3' region of ILV3
(position +1759 to 1979) that functions as the 3' targeting
sequence for the integration. This cassette was generated by
SOE-PCR. The K. lactis URA3 gene was amplified from pGV1299 using
primers 1887 and 1888. Only the 3' region of ILV3 was initially
amplified using primers 1623 and 1892 from genomic DNA and this
product was used as template to amplify the 3' region of ILV3 using
primers 1889 and 1890. The K. lactis URA3 and the 3' region of ILV3
were combined by SOE-PCR using primers 1886 and 1890.
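By way of illustration only, the stated lengths of the three ILV3 homology regions can be checked against their coordinates. The Python sketch below assumes the common gene-coordinate convention in which +1 is the first nucleotide of the start codon and there is no position 0; that convention, and the function name span_length, are assumptions of this example, although they are consistent with the lengths given above.

```python
# Illustrative check of homology-arm lengths from gene coordinates.
# Assumed convention: +1 is the A of ATG, -1 is the base just upstream,
# and position 0 does not exist.


def span_length(start, end):
    """Length in bp of the inclusive interval [start, end]."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this convention")
    if end < start:
        raise ValueError("end must not precede start")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the interval skips the nonexistent position 0
    return length


if __name__ == "__main__":
    regions = {
        "5' targeting sequence": (158, 237),
        "loop-out repeat": (-21, 39),
        "3' targeting sequence": (1759, 1979),
    }
    for name, (start, end) in regions.items():
        print(f"{name}: {span_length(start, end)} bp")
```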
[0287] GEVO1188 was transformed with the ilv3::Kl_URA3 cassette
described above and plated onto YNBD+W+CAA (-Ura) plates.
Initially, eight colonies (#1-8) were patched onto YNBD+HUWLIV
plates and then replica plated onto YNBD+HUWLI (-V) plates to test
for valine auxotrophy. As none of these exhibited valine
auxotrophy, an additional eight colonies (#9-16) were streaked out
for single colonies and 3 or 4 isolates (A through C or D) from
each streak were tested for valine auxotrophy. Isolates A-C from
clone #12 exhibited valine auxotrophy.
[0288] These isolates were tested for the correct integrations by
colony PCR using primer pairs 1916+1920 and 1917+1921 for the 5'
and 3' junctions, respectively. Correct sized bands were observed
with clones #12A through C with primer pair 1916+1920. Correct
sized bands were observed with clone 12A when FailSafe Master Mix A
or C was used with primer pair 1917+1921. Clone #12A was designated
as GEVO2145. The valine auxotrophy of GEVO2145 was reconfirmed
by streaking it onto SCD+9×IV and SCD-V+9×I plates.
GEVO2145 exhibited no growth on the medium lacking valine
(SCD-V+9×I) while it grew on medium supplemented with valine
(SCD+9×IV). The parent strain, GEVO1188, grew on both
media.
[0289] GEVO2145 was streaked onto YNBE+W+CAA+FOA to isolate strains
in which the K. lactis URA3 had been excised through homologous
recombination, i.e. "looped out". Five FOA resistant clones (A-E)
were tested for auxotrophies for valine and uracil. All five clones
exhibited auxotrophy for both nutrients. Clone A was designated
GEVO2244. Colony PCR using primers 1891 and 1892 with FailSafe
Buffer C was performed and the loss of the Kl_URA3 cassette was
confirmed.
Example 3
DHAD Activity is Localized to Mitochondria
[0290] The purpose of this Example is to demonstrate that the DHAD
activity encoded by ScILV3 is localized to the mitochondria.
TABLE 9 Genotype of strains disclosed in Example 3.
GEVO No. | Genotype/Source
GEVO2244 | S. cerevisiae, CEN.PK; MATα ura3 leu2 his3 trp1 ilv3Δ
TABLE 10 Plasmids disclosed in Example 3.
pGV No. | Genotype
pGV1106 | pUC ori, bla (AmpR), 2 micron ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1900 | pUC ori, bla (AmpR), 2 micron ori, URA3, TEF1 promoter-ScILV3(FL)
[0291] Plasmid pGV1106 is a variant of p426GPD (described in
Mumberg et al., 1995, Gene 156: 119-122). To obtain pGV1106, annealed
oligos 271 and 272 were ligated into p426GPD that had been digested
with SpeI and XhoI, and the inserted DNA was confirmed by
sequencing.
[0292] Plasmid pGV1900 was generated by amplifying the full-length,
native ScILV3 nucleotide sequence from S. cerevisiae strain CEN.PK
genomic DNA using primers 1617 and 1618. The resulting 1.76 kb
fragment, which contained the complete ScILV3 coding sequence (SEQ
ID NO: 88) flanked by 5' SalI and 3' BamHI restriction site
sequences, was digested with SalI and BamHI and ligated into pGV1662
(described in Example 6) which had been digested with SalI and
BamHI.
[0293] To measure DHAD activities present in fractionated cell
extracts, GEVO2244 was transformed singly with either pGV1106,
which served as an empty vector control, or with pGV1900, which is
an expression plasmid for ScILV3.
[0294] An independent clonal transformant of each plasmid was
isolated, and a 1 L culture of each strain was grown in
SCGal-Ura+9×IV at 30° C. at 250 rpm. The OD600
was noted, the cells were collected by centrifugation
(1600×g, 2 min) and the culture medium was decanted. The cell
pellets were resuspended in 50 mL sterile deionized water,
collected by centrifugation (1600×g, 2 min), and the
supernatant was discarded. The OD600 and total wet cell pellet
weight of each culture are listed in Table 11, below:
TABLE 11 OD600 and pellet mass (g) of strain GEVO2244 transformed with the indicated plasmids.
Plasmid | OD600 | Pellet mass (g)
pGV1106 | 2.2 | 7.6
pGV1900 | 1.3 | 3.8
[0295] To obtain spheroplasts, the cell pellets were resuspended in
0.1 M Tris-SO4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30° C., and the cells were then collected by
centrifugation (1600×g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer (final
concentrations: 1.2 M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate, pH 7.4) and then collected by
centrifugation (1600×g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(approximately 16 hrs) at 30° C. with gentle agitation (60 rev/min)
on an orbital shaker. The efficacy of spheroplasting was ascertained by
diluting an aliquot of each cell suspension 1:10 in either sterile
water or in spheroplasting buffer, and comparing the aliquots
microscopically (under 40× magnification). In all cases,
>90% of the water-diluted cells lysed, indicating efficient
spheroplasting. The spheroplasts were centrifuged (3000×g, 10
min, 20° C.), and the supernatant was discarded. Each cell
pellet was resuspended in 50 mL spheroplasting buffer without
Zymolyase, and cells were collected by centrifugation
(3000×g, 10 min, 20° C.).
[0296] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6 M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500×g, 10 min, 4° C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) were saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant was transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000×g, 20 min,
4° C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant was transferred to a clean tube on ice, being careful
to avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets were resuspended in MIB
containing Protease Arrest solution, and were labelled the "P"
("pellet") fractions. Protein from the "P" fraction was released
after dilution 1:5 in DHAD assay buffer (see above) by rapid mixing
in a 1.5 mL tube with a Retsch Ball Mill MM301 in the presence of
0.1 mm glass beads. The mixing was performed 4 times for 1
minute.
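By way of illustration only, the resuspension volumes implied by the cell densities stated above can be computed from the wet pellet masses in Table 11. The Python sketch below rests on several assumptions that are not in the disclosure: the wet pellet mass from Table 11 is used for every step, the volume contributed by the cells and by added stock solutions is ignored, and DTT is taken from a hypothetical 1 M stock. The function names are likewise hypothetical.

```python
# Illustrative buffer-volume arithmetic for the spheroplasting and
# fractionation steps. Pellet masses are from Table 11; the 1 M DTT stock
# and all function names are assumptions made for this sketch.

PELLET_MASS_G = {"pGV1106": 7.6, "pGV1900": 3.8}
DTT_STOCK_MM = 1000.0  # hypothetical 1 M stock


def resuspension_volume_ml(pellet_g, target_g_per_ml):
    """Buffer volume needed to reach the stated cell density."""
    return pellet_g / target_g_per_ml


def stock_volume_ml(final_mm, total_ml, stock_mm):
    """Stock volume for a final concentration (simple C1*V1 = C2*V2)."""
    return final_mm * total_ml / stock_mm


if __name__ == "__main__":
    for plasmid, grams in PELLET_MASS_G.items():
        tris_ml = resuspension_volume_ml(grams, 0.1)   # 0.1 g/mL in Tris-SO4
        dtt_ml = stock_volume_ml(10.0, tris_ml, DTT_STOCK_MM)
        mib_ml = resuspension_volume_ml(grams, 0.5)    # 0.5 g/mL in MIB
        protease_arrest_ml = 0.01 * mib_ml             # 0.01 mL per 1 mL
        print(
            f"{plasmid}: Tris-SO4 {tris_ml:.1f} mL, DTT {dtt_ml:.2f} mL, "
            f"MIB {mib_ml:.1f} mL, Protease Arrest {protease_arrest_ml:.3f} mL"
        )
```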
[0297] The BioRad Protein Assay reagent (BioRad, Hercules, Calif.)
was used according to manufacturer's instructions to determine the
protein concentration of each fraction.
[0298] The DHAD activity of each fraction was ascertained as
described in the methods above.
TABLE 12 Specific activities (KIV generation) and ratios of specific activities from fractionated lysates of S. cerevisiae strain GEVO2244 carrying plasmids to overexpress the indicated DHAD homolog.
Lysate (pGV# and fraction*) | DHAD | Sp. Activity [U/mg protein in fraction] | Std. Dev.
1106 W | -- | n.d. |
1106 S | -- | n.d. |
1106 P | -- | n.d. |
1900 W | ScILV3(FL) | 0.0096 | 0.0018
1900 S | ScILV3(FL) | 0.0052 | 0.0004
1900 P | ScILV3(FL) | 0.0340 | 0.0029
Each data point is the result of triplicate samples.
[0299] Cells overexpressing the full-length, native S. cerevisiae
Ilv3 contained a greater proportion of the specific DHAD
activity in the mitochondrial fraction (P) than in the cytosolic
fraction (S).
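By way of illustration only, the ratio of specific activities between the "P" and "S" fractions of the pGV1900 lysate can be computed from the Table 12 values, together with an approximate uncertainty. The Python sketch below uses first-order error propagation for a quotient, which assumes that the two measurements are independent; that assumption and the function name ratio_with_sd are not part of the disclosure.

```python
import math

# Illustrative P/S ratio for the pGV1900 lysate (values from Table 12),
# with first-order error propagation assuming independent measurements.


def ratio_with_sd(num, num_sd, den, den_sd):
    """Return (num/den, propagated standard deviation of the ratio)."""
    ratio = num / den
    relative_sd = math.sqrt((num_sd / num) ** 2 + (den_sd / den) ** 2)
    return ratio, ratio * relative_sd


if __name__ == "__main__":
    ratio, sd = ratio_with_sd(0.0340, 0.0029, 0.0052, 0.0004)
    print(f"P/S specific activity ratio: {ratio:.2f} +/- {sd:.2f}")
```

Under these assumptions the pellet fraction carries roughly 6.5 times the specific DHAD activity of the soluble fraction.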
Example 4
Replacing Current Mitochondrially Targeted Isobutanol Pathway
Enzymes with Fungal Homologs or Functional Analogs that are
Targeted to the Cytosol
[0300] The purpose of this example is to illustrate that fungal
homologs of isobutanol pathway enzymes exhibit cytosolic
activity.
TABLE 13 Genotype of strains disclosed in Example 4.
GEVO No. | Genotype/Source
1187 | MATa ura3-52 leu2-3_112 his3Δ1 trp1-289 ADE2 CEN.PK2-1C
2280 | MATa ura3-52 leu2-3_112 his3Δ1 trp1-289 ADE2 CEN.PK2-1C integrated pGV1730 at PDC1 locus
2618 | MATa ura3-52 leu2-3_112 his3Δ1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2114 at PDC1 locus
2621 | MATa ura3-52 leu2-3_112 his3Δ1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2117 at PDC1 locus
2622 | MATa ura3-52 leu2-3_112 his3Δ1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2118 at PDC1 locus
TABLE 14 Plasmids disclosed in Example 4.
pGV No. | Genotype
1730 | P_Cup1-11:Bs_alsS, pUC ORI, AmpR, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2114 | P_Cup1-11:Bs_alsScoSc, pUC ORI, AmpR, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2117 | P_Cup1-11:Ta_alsS, pUC ORI, AmpR, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2118 | P_Cup1-11:Ts_alsS, pUC ORI, AmpR, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
[0301] Yeast AHASs are normally mitochondrial, thus favoring fungal
ALS enzymes for use as cytosolically functional isobutanol pathway
enzymes. Sequence analysis by Le and Choi (Bull. Korean Chem. Soc.
(2005) 26:916-920) showed that there is a conserved sequence
`RFDDR` found in AHASs that is not conserved among ALSs. This
sequence is likely involved in FAD-binding by AHASs and thus could
be used to distinguish between the FAD-dependent AHASs and the
FAD-independent ALSs. Using this region to distinguish between
AHASs and ALSs, BLAST searches of fungal sequence databases were
performed and resulted in the identification of ALS homologs from
several fungal species (Magnaporthe grisea, Phaeosphaeria nodorum,
Trichoderma atroviride (SEQ ID NO: 71), Talaromyces stipitatus (SEQ
ID NO: 72), Penicillium marneffei, and Glomerella graminicola). Of
these sequences, the ALS homologs from M. grisea, P. nodorum, T.
atroviride and T. stipitatus are predicted to be cytoplasmic by
Mitoprot II v.1.101 as described in the paper M. G. Claros, P.
Vincens. Computational method to predict mitochondrially imported
proteins and their targeting sequences. Eur. J. Biochem. 241,
779-786 (1996).
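By way of illustration only, the motif-based distinction described above can be written as a simple sequence filter. The Python sketch below covers only the final motif check, not the BLAST searches or the MitoProt II prediction; the function name classify_by_motif, the output labels, and both toy protein sequences are hypothetical and are not SEQ ID NO: 71 or 72.

```python
# Illustrative motif filter: the 'RFDDR' sequence is described as conserved
# in FAD-dependent AHASs and not conserved among ALSs. The sequences below
# are invented placeholders used only to exercise the function.

AHAS_MOTIF = "RFDDR"


def classify_by_motif(protein_sequence):
    """Label a candidate by the presence or absence of the AHAS motif."""
    if AHAS_MOTIF in protein_sequence.upper():
        return "AHAS-like"
    return "ALS-like"


if __name__ == "__main__":
    candidates = {
        "toy_candidate_1": "MKTLRFDDRVTGA",
        "toy_candidate_2": "MKTLRYEDKVTGA",
    }
    for name, sequence in candidates.items():
        print(name, classify_by_motif(sequence))
```

An exact substring match is the simplest possible criterion; a real screen would more likely inspect the aligned region, since a single substitution would defeat this check.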
[0302] Fungal ALS genes were synthesized by DNA 2.0 with codon
optimization biased for S. cerevisiae. The following ALS constructs
were made and tested for ALS activity by assaying acetoin in the
media during a growth timecourse. All ALS genes were cloned into
the integration vector pGV1730 (SEQ ID NO: 69) as described
herein.
[0303] Plasmid pGV1730 is a yeast integration plasmid used to
replace the PDC1 gene in S. cerevisiae with the B. subtilis alsS
gene (SEQ ID NO: 70) (not codon optimized for S. cerevisiae)
expressed using the S. cerevisiae CUP1 promoter. This plasmid
carries the S. cerevisiae TRP1 gene as a selection marker.
[0304] Construction of pGV2114: pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The B. subtilis AlsS (codon-optimized
for expression in S. cerevisiae) gene was ligated to the pGV1730
vector fragment as a BamHI and SalI 1722 bp fragment using standard
methods with an approximately 5:1 insert:vector molar ratio and
transformed into TOP10 chemically competent E. coli cells. Plasmid
DNA was isolated and correct clones were confirmed using
restriction enzyme analysis.
[0305] Construction of pGV2117. pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The T. atroviride ALS gene was
ligated to the pGV1730 vector fragment as a BamHI and SalI 1686 bp
fragment using standard methods with an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis.
[0306] Construction of pGV2118. pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The T. stipitatus ALS gene was
ligated to the pGV1730 vector fragment as a BamHI and SalI 1707 bp
fragment using standard methods with an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis.
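By way of illustration only, the approximately 5:1 insert:vector molar ratio used in the three ligations above can be converted into DNA masses from the stated fragment lengths. The Python sketch below uses the standard proportionality between mass and length for double-stranded DNA; the 50 ng vector input and the function name insert_mass_ng are assumptions of this example and are not stated in the disclosure.

```python
# Illustrative ligation arithmetic: mass of insert needed for a given
# insert:vector molar ratio. Fragment lengths are those stated above;
# the 50 ng vector amount is a hypothetical input.

VECTOR_BP = 4932
INSERT_BP = {"pGV2114": 1722, "pGV2117": 1686, "pGV2118": 1707}


def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Insert mass giving `molar_ratio` moles of insert per mole of vector."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    for plasmid, length in INSERT_BP.items():
        mass = insert_mass_ng(50.0, VECTOR_BP, length, 5.0)
        print(f"{plasmid}: {mass:.1f} ng insert per 50 ng vector")
```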
[0307] All yeast strains were constructed by treating the plasmid
to be integrated with NruI and then transforming the plasmid
according to the standard yeast transformation protocol as
described herein. Transformants were selected by plating
transformed cells onto SCD-W media and growing at 30° C. for
2 days. Primary transformants were single colony purified and then
tested for correct integration using colony PCR. Colony PCR to
check for proper integration of the integrative plasmids was
performed using the FailSafe™ PCR System (EPICENTRE®
Biotechnologies, Madison, Wis.; Catalog #FS99250) according to the
manufacturer's protocol. The PCR reactions
were incubated in a thermocycler using the following conditions: 1
cycle of 94° C. for 2 min, 40 cycles of 94° C. for 30
s, 53° C. for 30 s, 72° C. for 60 s and 1 cycle of
72° C. for 10 min. Presence of the positive PCR product was
assessed using agarose gel electrophoresis. Primer pairs for the
5'-end and 3'-end integration sites contained one primer on the
plasmid and one primer in the genome.
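By way of illustration only, the thermocycler conditions stated above can be represented as a small data structure and the total programmed hold time computed from it. The Python sketch below is not an instrument API: the class names Step and Stage and the function name hold_time_seconds are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass

# Illustrative representation of the colony PCR program described above.
# Class and function names are hypothetical; ramp times are not modeled.


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


@dataclass(frozen=True)
class Stage:
    cycles: int
    steps: tuple


PROGRAM = (
    Stage(1, (Step(94, 120),)),                              # initial denaturation
    Stage(40, (Step(94, 30), Step(53, 30), Step(72, 60))),   # cycling
    Stage(1, (Step(72, 600),)),                              # final extension
)


def hold_time_seconds(program):
    """Sum of all programmed hold times, in seconds."""
    return sum(
        stage.cycles * sum(step.seconds for step in stage.steps)
        for stage in program
    )


if __name__ == "__main__":
    total = hold_time_seconds(PROGRAM)
    print(f"{total} s ({total / 60:.0f} min) excluding ramp times")
```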
[0308] Yeast strains GEVO1187, 2280, 2618, 2621 and 2622 were grown
in YPD overnight at 30° C. A 100 mL culture was inoculated
to 1 OD/mL and split into two 50 mL cultures. This was time zero.
One of the 50 mL cultures received 500 μM CuSO4 at time 2
hours and the other did not. Timepoints consisted of removing 1 mL
at times 0, 2, 2.5, 3, 4, 7.5, and 23 hours. At each timepoint the
OD600 was determined and acetoin concentrations were
determined using GC as described in the General Methods. Before GC,
samples were treated with H2SO4 to convert intermediates
to acetoin. The graph shows the acetoin concentrations in the media
of the strains in which transcription of the ALS genes was induced
by CuSO4. The acetoin values were normalized to cell OD. Both
the T. stipitatus ALS and the T. atroviride ALS showed increased
levels of acetoin as compared to the no-ALS control (FIG. 2).
[0309] ALS activity in whole cell lysates is determined as
described in General Methods. Activity in mitochondrial/organellar
(P) and cytosolic (S) fractions and whole cell (W) lysates is
assayed as described in General Methods.
Example 5
Replacing Current Mitochondrially Targeted Isobutanol Pathway
Enzymes with Homologs or Functional Analogs from Anaerobic
Fungi
[0310] The purpose of this example is to illustrate that homologues
of isobutanol pathway enzymes from anaerobic fungi exhibit
cytosolic activity.
TABLE 15 Genotype of strains disclosed in Example 5.
GEVO No. | Genotype
GEVO2244 | S. cerevisiae, CEN.PK; MATα ura3 leu2 his3 trp1 ilv3Δ
TABLE 16 Plasmids disclosed in Example 5.
Plasmid name | Genotype
pGV1106 | pUC ori, bla (AmpR), 2 μm ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1662 | pUC ori, bla (AmpR), 2 μm ori, URA3, TEF1 promoter-(kivD)
pGV1855 | pUC ori, bla (AmpR), 2 μm ori, URA3, TEF1 promoter-Ll_ilvD
[0311] Plasmid pGV1106 is described in Example 3, above.
[0312] Plasmid pGV1662 (SEQ ID NO: 81) served as the parental
plasmid of pGV1855, pGV1900, and pGV2019. The salient features of
pGV1662 include the yeast 2 micron origin of replication, the URA3
selectable marker, and the ScTEF1 promoter sequence followed by
restriction sites into which an ORF can be cloned to permit its
expression under the regulation of the TEF1 promoter.
[0313] Plasmid pGV1855 contains the L. lactis ilvD. The L. lactis
ilvD sequence was synthesized (DNA2.0, Menlo Park, Calif.) and
included a unique SalI and a NotI site at the 5' and 3' end of the
coding sequence, respectively. The synthesized DNA was digested
with SalI and NotI and ligated into vector pGV1662 that had been
digested with SalI plus NotI, yielding pGV1855.
[0314] The DHAD homolog (ilvD) from the anaerobic fungus Piromyces
sp. E2 has a predicted MTS of 49 amino acids at the N-terminus.
Thus, a nucleotide sequence encoding the Piromyces ilvD lacking the
N-terminal 49 amino acids and with a start codon placed at the
N-terminus was synthesized (SEQ ID NO: 73). In addition, a SalI
site and a BamHI site were introduced at the 5' and 3' ends of this
ORF. This fragment was cloned into the SalI and BamHI sites of
pGV1662. The resulting plasmid was transformed into GEVO2244. An
empty vector, pGV1106, is used as a negative control. Plasmid
pGV1855, expressing L. lactis ilvD, is used as a positive
control.
[0315] An independent clonal transformant of each plasmid is
isolated, and a 1 L culture of each strain is grown in
SCGal-Ura+9×IV at 30° C. at 250 rpm. The OD600 is
noted, the cells are collected by centrifugation (1600×g, 2
min) and the culture medium is decanted. The cell pellets are
resuspended in 50 mL sterile deionized water, collected by
centrifugation (1600×g, 2 min), and the supernatant is
discarded.
[0316] To obtain spheroplasts, the cell pellets are resuspended in
0.1 M Tris-SO4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT is added to a final concentration of 10 mM. Cells are
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30° C., and the cells are then collected by
centrifugation (1600×g, 2 min) and the supernatant discarded.
Each cell pellet is resuspended in spheroplasting buffer (final
concentrations: 1.2 M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate, pH 7.4) and then collected by
centrifugation (1600×g, 10 min). Each cell pellet is
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle and 50 mg of Zymolyase
20T (Seikagaku Biobusiness, Code#120491) is added to each cell
suspension. The suspensions are incubated overnight (approximately
16 hrs) at 30° C. with gentle agitation (60 rev/min) on an
orbital shaker. The efficacy of spheroplasting is ascertained by
diluting an aliquot of each cell suspension 1:10 in either sterile
water or in spheroplasting buffer, and comparing the aliquots
microscopically (under 40× magnification). The spheroplasts
are centrifuged (3000×g, 10 min, 20° C.), and the
supernatant is discarded. Each cell pellet is resuspended in 50 mL
spheroplasting buffer without Zymolyase and cells are collected by
centrifugation (3000×g, 10 min, 20° C.).
[0317] To fractionate spheroplasts, the cells are resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6 M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) is added. The cell
suspension is subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension is
centrifuged (2500×g, 10 min, 4° C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) are saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant is transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000×g, 20 min,
4° C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant is transferred to a clean tube on ice, being careful to
avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets are resuspended in MIB
containing Protease Arrest solution, and are labelled the "P"
("pellet") fractions. The protein concentration of each fraction is
determined using the BioRad Protein Assay reagent (BioRad,
Hercules, Calif.) according to manufacturer's instructions.
[0318] The DHAD activity of each fraction is ascertained using the
DHAD assays as described above in the General Methods.
Example 6
Modification of the N-Terminal Mitochondrial Targeting Sequence of
an Isobutanol Pathway Enzyme
[0319] The purpose of this example is to illustrate that removal or
modification of N-terminal mitochondrial targeting sequences allows
for cytosolic activity of isobutanol pathway enzymes.
TABLE 17 Genotype of strains disclosed in Example 6.
GEVO No. | Genotype
1803 | MATa/alpha ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1
TABLE 18 Plasmids disclosed in Example 6.
Plasmid name | Relevant Genes/Usage | Genotype
pGV1354 | Plasmid that contains the ILV5ΔN47 gene. | P_TDH3:ILV5ΔN47:CYC1 term, bla, ColE1 ORI, URA3, 2μ ori.
pGV1662 | Parent vector that has Ampicillin resistance, the 2μ origin, a URA3 gene, the TEF1 promoter, CYC1 terminator region and an E. coli origin. It also has the L. lactis KivD gene that is removed by cutting the plasmid with SalI and NotI, and then gel purifying the vector portion. SalI and NotI were used for cloning genes to be expressed from the TEF1 promoter. | pTEF1::L. lactis kivD::CYC1 term, bla, ColE1 ORI, URA3, 2μ ori.
pGV1810 | Plasmid that contains the full length ILV5 gene. This was used as a PCR template to generate the Δ46-ilv5 mutant. | pTEF1::ILV5::CYC1 term, bla, ColE1 ORI, URA3, 2μ ori.
pGV1831 | Plasmid that contains the Ilv5ΔN47 gene under control of the TEF1 promoter. | pTEF1::Sc Ilv5 N47::CYC1 term, bla, ColE1 ORI, URA3, 2μ ori.
pGV1833 | Plasmid that contains the full length ILV5 gene under control of the TEF1 promoter. | pTEF1::Sc ILV5:CYC1 term, bla, ColE1 ORI, URA3, 2μ ori
pGV1901 | The S. cerevisiae KARI with the N-terminal 46 amino acid deleted (Δ46) cloned into pGV1662 at the SalI-NotI sites of the vector. The S. cerevisiae Δ46 KARI was a SalI-NotI fragment that was PCR amplified from pGV1810 using primers 1809 and 1615. | pTEF1::Δ46ilv5 KARI::CYC1 term, bla, ColE1 ORI, URA3, 2μ ori
pGV1824 | The E. coli coSc KARI cloned into pGV1662 at SalI-BamHI sites of the vector. | pTEF1::E. coli coSc KARI:CYC1 term, bla, ColE1 ORI, URA3, 2μ ori
[0320] The yeast enzymes acetohydroxyacid synthase (AHAS;
ILV2+ILV6), ketol-acid reductoisomerase (KARI; ILV5), and
dihydroxyacid dehydratase (DHAD; ILV3) that carry out the first
three steps of isobutanol production are physiologically localized
to the mitochondria. Mitochondrial matrix proteins are typically
targeted to the mitochondria by an N-terminal mitochondrial
targeting sequence (MTS), which is then cleaved off in the
mitochondria resulting in the `mature` form of the enzyme.
N-terminal deletions of ILV5 have been shown to re-localize this
enzyme to the cytosol (Omura, 2008, Appl. Microbiol. Biotechnol.
78: 503-513; Omura, WO/2009/078108 A1, hereby incorporated by
reference in its entirety).
[0321] N-terminal mitochondrial targeting sequences (MTS) are
predicted by MitoProt II software (Claros et al., 1996, Eur. J.
Biochem. 241: 779-786). Two N-terminal deletions of the ILV5 gene
were constructed, one missing the first 46 amino acids and one
missing the first 47 amino acids.
[0322] pGV1831 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. The Ilv5.DELTA.N47
gene was excised from plasmid pGV1354 (SEQ ID NO: 80) using SalI
and NotI. The ilv5.DELTA.N47 gene fragment (1.06 Kb) was purified
away from the larger vector fragment by agarose gel
electrophoresis. The pGV1662 vector and ilv5.DELTA.N47 insert were
ligated using standard methods in an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis, namely
generation of the correct insert size by digesting clones with SalI
and NotI enzymes. The clones were verified by sequencing with the
primers 351, 1625, and 1626. Purified plasmid DNA was transformed
into S. cerevisiae strain GEVO1803 using a standard yeast
transformation protocol.
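The approximately 5:1 insert:vector molar ratio used in the ligation above can be converted into DNA masses from the fragment lengths (6.3 Kb backbone, 1.06 Kb insert). The following is a minimal sketch only: the function name and the 50 ng vector amount are hypothetical and are not taken from this disclosure, and the calculation assumes that DNA mass scales linearly with fragment length.

```python
def insert_mass_ng(vector_ng, vector_kb, insert_kb, molar_ratio):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    Assumes mass per mole is proportional to fragment length in kb.
    """
    return vector_ng * (insert_kb / vector_kb) * molar_ratio


# 50 ng of vector is a hypothetical amount chosen for illustration.
mass = insert_mass_ng(vector_ng=50.0, vector_kb=6.3, insert_kb=1.06, molar_ratio=5.0)
print(f"{mass:.1f} ng of 1.06 Kb insert per 50 ng of 6.3 Kb vector")
```

Because the insert is roughly one sixth the length of the backbone, a 5:1 molar excess corresponds to slightly less insert than vector by mass.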
[0323] pGV1833 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. Primers 1615 and
1616 were used to amplify the S. cerevisiae ILV5 gene from the
plasmid template pGV1810 by PCR. The correct fragment size was
verified with DNA gel electrophoresis (1.2 Kb). The PCR product was
purified after PCR using the Qiagen QIAquick PCR Purification Kit.
The PCR product was then digested with XhoI and NotI to generate
ends compatible with the pGV1662 backbone (the XhoI end of the PCR
product is compatible with the SalI end of the vector, although the
ligated DNA fragment can't be recut with either enzyme). After
digestion, the PCR product was purified with a Qiagen QIAquick PCR
Purification Kit. The two fragments were ligated using standard
methods in an approximately 5:1 insert:vector molar ratio and
transformed into TOP10 chemically competent E. coli cells. Plasmid
DNA was isolated and correct clones were confirmed using
restriction enzyme analysis. In this case, SacI plus NotI digestion
yielded a fragment of the predicted size (1.6 Kb). The clones were
verified by sequencing with the primers 351, 1625, and 1626.
Purified plasmid DNA was transformed into S. cerevisiae strain
GEVO1803.
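The remark above that an XhoI end is compatible with a SalI end, but that the ligated junction cannot be recut by either enzyme, can be checked from the recognition sequences (SalI cuts G^TCGAC, XhoI cuts C^TCGAG). The sketch below is illustrative only; the helper names are hypothetical, and it assumes the SalI-cut vector end lies upstream of the XhoI-cut insert end.

```python
SALI_SITE = "GTCGAC"  # SalI cuts G^TCGAC
XHOI_SITE = "CTCGAG"  # XhoI cuts C^TCGAG


def five_prime_overhang(site):
    """4-nt 5' overhang left when the enzyme cuts after the first base."""
    return site[1:5]


def ligated_junction(upstream_site, downstream_site):
    """Top-strand junction formed by joining an upstream end cut at
    upstream_site to a downstream end cut at downstream_site."""
    return upstream_site[0] + downstream_site[1:]


junction = ligated_junction(SALI_SITE, XHOI_SITE)
compatible = five_prime_overhang(SALI_SITE) == five_prime_overhang(XHOI_SITE)
recuttable = junction in (SALI_SITE, XHOI_SITE)
print(junction, compatible, recuttable)
```

Both enzymes leave the same TCGA overhang, so the ends anneal, while the hybrid junction matches neither recognition sequence.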
[0324] pGV1901 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. The ILV5 gene was
amplified from pGV1810 (SEQ ID NO: 82) using primers 1809 (which
removes the first 46 amino acids from the N-terminus while adding a
methionine codon) and 1615. The PCR product was digested with SalI
and NotI. After digestion, the PCR product was purified on an
agarose gel and the proper fragment (1.07 Kb) was recovered using
the Zymoclean Gel DNA Recovery Kit. The pGV1662 vector and
Ilv5-.DELTA.46 PCR products were ligated using standard methods in
an approximately 5:1 insert:vector molar ratio and transformed into
TOP10 chemically competent E. coli cells. Plasmid DNA was isolated
and correct clones were confirmed with PCR screening of colonies
using primers 351 and 1577. The predicted correct PCR product was
580 bp. The clones were sequenced using primers 351, 1625, and
1626. Purified plasmid DNA was transformed into S. cerevisiae
strain GEVO1803 using the standard yeast transformation
protocol.
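The effect of primer 1809, which removes the first 46 amino acids while adding a methionine codon, can be illustrated at the sequence level. This is a sketch under stated assumptions: the function name is hypothetical, the toy open reading frame below is invented for illustration and is not the ILV5 sequence, and the primer sequence itself is not reproduced here.

```python
def delete_n_terminal_codons(orf, n_codons):
    """Return the ORF lacking its first n_codons codons, with an ATG
    start codon added in front of the remaining coding sequence."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if n_codons * 3 >= len(orf):
        raise ValueError("deletion would remove the entire ORF")
    return "ATG" + orf[3 * n_codons:]


# Toy ORF (hypothetical): ATG, 50 alanine codons, stop codon.
toy_orf = "ATG" + "GCT" * 50 + "TAA"
truncated = delete_n_terminal_codons(toy_orf, 46)
print(truncated)
```

The same function with n_codons=47 would model the second deletion described in this example.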
[0325] pGV1824 contains the E. coli ilvC gene, codon optimized
for S. cerevisiae, cloned into the SalI and BamHI sites of
pGV1662 as described above. The sequence of the codon optimized E.
coli ilvC is found as SEQ ID NO: 83.
[0326] Plasmids were transformed into the yeast strain GEVO1803 and
an individual colony was purified from each transformation. KARI
assays of whole cell lysates were performed at pH 7.5 as described
in General Methods. Results are shown in FIG. 3.
[0327] KARI activity in mitochondrial/organellar (P) and cytosolic
(S) fractions and whole cell (W) lysates is assayed as described in
General Methods.
Example 7
Scaffolding Two or More Isobutanol Pathway Enzymes
[0328] The purpose of this example is to illustrate how isobutanol
pathway enzymes can be scaffolded in order to localize them to the
cytosol.
[0329] Cellulolytic microorganisms utilize a scaffolded enzyme
complex called a cellulosome. In such a complex, numerous enzymes
are docked to a single scaffold protein, called a scaffoldin, which
contains multiple binding domains called cohesin domains. Each
cohesin domain interacts with a dockerin domain. In a cellulosome
complex, each cellulolytic enzyme also has a dockerin domain that
allows it to bind to the scaffoldin.
[0330] The cohesin domains of a scaffoldin protein, for example,
CipA from Clostridium thermocellum, can be expressed in yeast. The
dockerin domains from the cellulolytic enzymes from the same
organism, for example Xyn10B, can be fused to the isobutanol
enzymes and the fusion proteins expressed in yeast.
[0331] The activity of each pathway enzyme in whole cell lysates is
determined as described in General Methods. Activity in
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates is assayed as described in General Methods.
Example 8
Adding of Tags, e.g. Ubiquitin Tags, to the N-Terminus of an
Isobutanol Pathway Enzyme
[0332] The purpose of this example is to demonstrate that
isobutanol pathway enzymes can be targeted to the yeast cytosol.
For instance, this example illustrates how a DHAD enzyme can be
targeted to the yeast cytosol.
TABLE-US-00020 TABLE 18 Genotype of strains disclosed in Example 8.
GEVO No.  Genotype/Source
Gevo2242  S. cerevisiae, CEN.PK; MAT-alpha ura3 leu2 his3 trp1 ilv5.sup.D255E pdc1::Bs-alsS, TRP1
Gevo2244  S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3.DELTA.
TABLE-US-00021 TABLE 19 Plasmids disclosed in Example 8.
pGV No.  Genotype
pGV1106  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1662  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-(kivD)
pGV1784  pUC ori, kanR, Mm_ubiquitin coding sequence
pGV1855  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ll_ilvD
pGV1897  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)
pGV1900  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3(FL)
pGV2019  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3.DELTA.N
pGV2052  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-ScIlv3(FL)
pGV2053  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-ScIlv3.DELTA.N
pGV2054  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Ll_ilvD
pGV2055  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Gf_ilvD
pGV2056  pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Se_ilvD
[0333] To develop the constructs required to express DHAD as a
fusion with an N-terminal ubiquitin, plasmid pGV1784 was
synthesized by DNA2.0. This plasmid contained the synthesized
sequence for the Mus musculus ubiquitin gene, codon-optimized for
expression in S. cerevisiae (SEQ ID NO: 86). Using this plasmid as
the template, the M. musculus ubiquitin gene was amplified via PCR
using primers 1792 and 1794 to generate a PCR product containing
the M. musculus ubiquitin gene codon sequence flanked by
restriction sites XhoI and NotI at its 5' and 3' ends,
respectively, and altered so as to lack the codon for its
endogenous C-terminal most glycine residue (denoted as Gly-X). This
PCR product was cloned into pGV1662 (described in Example 6),
yielding pGV1897.
[0334] Plasmid pGV1897 was then used as a recipient cloning vector
for sequences encoding S. cerevisiae ILV3 (ScIlv3(FL), SEQ ID NO:
88), S. cerevisiae Ilv3.DELTA.N (ScIlv3.DELTA.N, SEQ ID NO: 89), L.
lactis ilvD (Ll_ilvD, SEQ ID NO: 87), G. forsetti ilvD (Gf_ilvD,
SEQ ID NO: 90), and S. erythraea ilvD (Se_ilvD, SEQ ID NO: 91),
yielding plasmids pGV2052-2056, respectively.
[0335] The DHAD activity exhibited by cells transformed with each
of the resulting constructs is ascertained by in vitro assay.
GEVO2244 is transformed (singly) with pGV2052-2056, pGV1106 (empty
control vector), pGV1855 (expressing native, unfused Ll_ilvD) or
pGV1900 (expressing native, full-length Sc_ILV3(FL)). Lysates of
transformants are prepared and DHAD activity in
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates is assayed as described in Example 3.
[0336] In an analogous manner, a desired ALS (e.g., the B. subtilis
alsS) or KARI gene whose product is known or predicted to be
mitochondrial can be re-targeted to the cytosol by means of the
methods detailed in this example. The nucleotide sequence encoding
a full-length, or variant, ALS or KARI is amplified by PCR
using primers that introduce restriction sites convenient for
cloning the final product as an in-frame fusion with the M. musculus
ubiquitin gene. The resulting construct is transformed into a host
S. cerevisiae cell suitable for assaying the in vitro activity of
the expressed M. musculus ubiquitin-gene chimeric fusion protein,
using methods described in Example 3.
Example 9
Dihydroxy Acid Dehydratase Limits Isobutanol Production in
Yeast
[0337] This example illustrates the specific activity of various
DHAD homologs in yeast. The example also illustrates that high
specific activity of the Lactococcus lactis IlvD enzyme (SEQ ID NO:
18) correlates with an increase in isobutanol production.
[0338] Plasmid pGV1106 was used as a control and is described in
Example 3. Plasmid pGV1662 (described in Example 6) served as the
parental plasmid of pGV1855, pGV1900, and pGV2019 (see Example 5).
Plasmids pGV1851-1855 and pGV1904-1907 are all variants of pGV1662
(See Table 20), in which the kivD ORF sequence present in pGV1662
was excised and replaced with a sequence encoding a DHAD homolog,
as indicated below.
TABLE-US-00022 TABLE 20 Plasmids disclosed in Example 9.
pGV No.  Genotype
pGV1851  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Gramella forsetti ilvD
pGV1852  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Chromohalobacter salexigens ilvD
pGV1853  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ralstonia eutropha ilvD
pGV1854  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Saccharopolyspora erythraea ilvD
pGV1855  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ll_ilvD
pGV1900  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3(FL)
pGV1904  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Acidobacteria bacterium Ellin345 ilvD
pGV1905  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Picrophilus torridus DSM 9790 ilvD
pGV1906  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Piromyces species E2 ilvD
pGV1907  pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Sulfolobus tokodaii strain 7 ilvD
[0339] Plasmid pGV1851 contains the G. forsetti ilvD gene (SEQ ID
NO: 90). Plasmid pGV1852 contains the C. salexigens gene (SEQ ID
NO: 95). Plasmid pGV1853 contains the R. eutropha gene (SEQ ID NO:
94). Plasmid pGV1854 contains the S. erythraea ilvD (SEQ ID NO:
91). Plasmid pGV1855 contains the L. lactis ilvD (SEQ ID NO: 87).
Plasmid pGV1900 contains the S. cerevisiae ILV3 (SEQ ID NO: 88).
Plasmid pGV1904 contains the A. bacterium Ellin345 ilvD (SEQ ID NO:
92). Plasmid pGV1905 contains the P. torridus DSM 9790 ilvD (SEQ ID
NO: 96). Plasmid pGV1906 contains the Piromyces sp. E2 ilvD (SEQ ID
NO: 93). Plasmid pGV1907 contains the S. tokodaii ilvD (SEQ ID NO:
97). All sequences (except that of the S. cerevisiae ILV3 (full
length)) were synthesized with 5' SalI and 3' NotI sites by DNA2.0
(Menlo Park, Calif.), digested with SalI and NotI, and ligated into
pGV1662 which had also been digested with SalI and NotI. For
plasmid pGV1900, the sequence containing the open reading frame of
the S. cerevisiae ILV3 (full length) was amplified from S.
cerevisiae genomic DNA using primers 1617 and 1618, and the
resulting 1.8 kb fragment was digested with SalI plus BamHI and
cloned into pGV1662. Various DHADs were tested for in vitro
activity using whole cell lysates. In this case, the DHADs were
expressed in a yeast deficient for DHAD activity (GEVO2244;
ilv3.DELTA.) (see Example 2) to minimize endogenous background
activity.
[0340] To grow cultures for cell lysates, triplicate independent
cultures of each desired strain were grown overnight in 3 mL
SCD-Ura+9.times.IV at 30.degree. C., 250 rpm. The following day,
the overnight cultures were diluted 1:50 into 50 mL fresh SCD-Ura
in a 250 mL baffle-bottomed Erlenmeyer flask and incubated at
30.degree. C. at 250 rpm. After approximately 10 hours, the
OD.sub.600 of each culture was measured, and the cells of each
culture were collected by centrifugation (2700.times.g, 5 min). The
cell pellets were washed by resuspending in 1 mL of water, and the
suspension was placed in a 1.5 mL tube and the cells were collected
by centrifugation (16,000.times.g, 30 seconds). All supernatant was
removed from each tube and the tubes were frozen at -80.degree. C.
until use.
[0341] Lysates were prepared by resuspending each cell pellet in
0.7 mL of lysis buffer. Lysate lysis buffer consisted of: 0.1M
Tris-HCl pH 8.0, 5 mM MgSO.sub.4, with 10 .mu.L of Yeast/Fungal
Protease Arrest solution (G Biosciences, catalog #788-333) per 1 mL
of lysis buffer. Eight hundred microliters of cell suspension were
added to 1 mL of 0.5 mm glass beads that had been placed in a
chilled 1.5 mL tube. Cells were lysed by bead beating (6 rounds, 1
minute per round, 30 beats per second) with 2 minutes chilling on
ice in between rounds. The tubes were then centrifuged
(20,000.times.g, 15 min) to pellet debris and the supernatants (cell
lysates) were retained in fresh tubes on ice. The protein
concentration of each lysate was measured using the BioRad Bradford
protein assay reagent (BioRad, Hercules, Calif.) according to
manufacturer's instructions.
[0342] The DHAD activity of each lysate was ascertained as follows.
In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each lysate was
mixed with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate (DHIV), 25
.mu.L of 0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M Tris-HCl pH 8.0,
and the mixture was incubated for 30 min at 35.degree. C. Each tube
was then heated to 95.degree. C. for 5 min to inactivate any
enzymatic activity, and the solution was centrifuged
(16,000.times.g for 5 min) to pellet insoluble debris. To prepare
samples for analysis, 100 .mu.L of each reaction were mixed with
100 .mu.L of a solution consisting of 4 parts 15 mM dinitrophenyl
hydrazine (DNPH) in acetonitrile with 1 part 50 mM citric acid, pH
3.0, and the mixture was heated to 70.degree. C. for 30 min in a
thermocycler. The solution was then analyzed by HPLC as described
above in General Methods to quantitate the concentration of
ketoisovalerate (KIV) present in the sample. The results are shown
in Table 21.
TABLE-US-00023 TABLE 21 Specific activities (KIV generation) from
lysates of S. cerevisiae strain GEVO2244 carrying plasmids to
overexpress the indicated DHAD homolog.
Plasmid  Gene                                           Specific activity (U/mg total protein)
pGV1106  Control (i.e. no DHAD)                         n.d.
pGV1851  Gramella forsetti ilvD                         0.012
pGV1852  Chromohalobacter salexigens (SEQ ID NO: 95)    n.d.
pGV1853  Ralstonia eutropha (SEQ ID NO: 94)             n.d.
pGV1854  Saccharopolyspora erythraea ilvD               0.002
pGV1855  Lactococcus lactis ilvD                        0.027
pGV1900  Saccharomyces cerevisiae ILV3(FL)              0.148
pGV1904  Acidobacteria bacterium Ellin345 DHAD          0.004
pGV1905  Picrophilus torridus DSM 9790 DHAD             n.d.
pGV1906  Piromyces Sp E2 DHAD                           0.016
pGV1907  Sulfolobus tokodaii str. 7 DHAD                0.001
Each data point is the result of triplicate samples.
* n.d., not detectable
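The specific activities reported for this assay follow from the reaction set-up described above (50 .mu.L lysate in a 500 .mu.L reaction, 30 min incubation). The following is a minimal sketch of that arithmetic, not the calculation actually used: it assumes that 1 U corresponds to 1 .mu.mol of KIV formed per minute (the unit definition is given in General Methods, not here), the function name is hypothetical, and the example inputs are invented.

```python
def dhad_specific_activity(kiv_mM, protein_ug_per_ul,
                           lysate_ul=50.0, reaction_ul=500.0, minutes=30.0):
    """Specific activity in U/mg, assuming 1 U = 1 umol KIV per minute.

    kiv_mM is the KIV concentration in the 500 uL reaction, i.e. after
    correcting for the 1:1 dilution with the DNPH derivatization solution.
    """
    kiv_umol = kiv_mM * reaction_ul / 1000.0                # mM x mL = umol
    protein_mg = protein_ug_per_ul * lysate_ul / 1000.0     # ug -> mg
    return kiv_umol / minutes / protein_mg


# Hypothetical inputs: 1.2 mM KIV formed, lysate at 10 ug/uL protein.
example = dhad_specific_activity(kiv_mM=1.2, protein_ug_per_ul=10.0)
print(f"{example:.3f} U/mg")
```

The reaction volume of 500 .mu.L is the sum of the four components listed in the assay (50 + 50 + 25 + 375 .mu.L).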
Example 10
Dihydroxy Acid Dehydratase Limits Isobutanol Production in
Yeast
[0343] This example illustrates that high specific DHAD activity,
and in particular the high specific activity of the L. lactis IlvD
enzyme (SEQ ID NO: 18) correlates with an increase in isobutanol
production.
TABLE-US-00024 TABLE 22 Genotype of strains disclosed in Example 10.
GEVO No.  Genotype/Source
GEVO1186  S. cerevisiae, CEN.PK; MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1
GEVO1188  S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1
GEVO1803  MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1
GEVO2107  MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1 pdc6::{ScTEF1p-Ll_kivd ScTDH3p-Dm_ADH URA3}/PDC6
TABLE-US-00025 TABLE 23 Plasmids disclosed in Example 10.
pGV No.  Genotype
p423GPD  P.sub.TDH3:MCS:T.sub.CYC1, HIS3, 2-micron, bla, pUC ori (Mumberg, D. et al. (1995) Gene 156: 119-122; obtained from ATCC)
pGV1103  P.sub.TDH3:myc-tag:MCS:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
pGV1730  P.sub.CUP1:Bs-alsS:T.sub.PDC1/PDC1-3' region:PDC1-5' region, TRP1, bla, pUC ori
pGV1914  P.sub.TEF1:Ll_kivD P.sub.TDH3:Dm_ADH PDC6 5',3' targeting homology URA3 pUC ori bla(ampR)
pGV1974  P.sub.TEF1:Sc_ILV3.DELTA.N:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori bla(ampR)
pGV1981  P.sub.TEF1:Lactococcus lactis ilvD-coSc:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
pGV2001  P.sub.TEF1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
[0344] Plasmid pGV1103 was generated by inserting a linker (primers
271 annealed to primer 272) containing a myc-tag and a new MCS
(SalI-EcoRI-SmaI-BamHI-NotI) into the SpeI and XhoI sites of
p423GPD. The construction of plasmid pGV1730 is described in
Example 4.
[0345] pGV1914 (SEQ ID NO: 117) is a yeast integrating vector that
includes the S. cerevisiae URA3 gene as a selection marker and
contains homologous sequence for targeting the HpaI-digested,
linearized plasmid for integration at the PDC6 locus of S.
cerevisiae. pGV1914 carries the D. melanogaster adh (Dm_ADH) (SEQ
ID NO: 116) and L. lactis kivd (Ll_kivD) genes, expressed under the
control of the S. cerevisiae TDH3 and TEF1 promoters, respectively.
The open reading frame sequence of DmADH was originally amplified
by PCR from clone RH54514 (available from the Drosophila Genome
Resource Center).
[0346] Plasmid pGV1974 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V (SEQ ID NO: 98)
and S. cerevisiae ILV3.DELTA.N (SEQ ID NO: 89). pGV1974 was
generated by cloning a SacI-NotI fragment (4.9 kb, SEQ ID NO: 118)
carrying the S. cerevisiae TEF1 promoter:S. cerevisiae
ilv3.DELTA.N:S. cerevisiae TDH3 promoter:E. coli ilvC.sup.Q110V
into the SacI-NotI sites of pGV1103 (5.4 kb), a yeast expression
plasmid carrying the HIS3 marker.
[0347] Plasmid pGV1981 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V and L. lactis
ilvD. pGV1981 was generated by cloning a SalI-BamHI fragment (1.7
kb) carrying the L. lactis ilvD ORF (SEQ ID NO: 87 with SalI and
BamHI sites introduced at the 5' and 3' ends, respectively) into
the SalI-BamHI sites of pGV1974 (8.5 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF.
[0348] Plasmid pGV2001 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V. pGV2001 was
generated by digesting pGV1974 with SalI-BamHI to remove the S.
cerevisiae Ilv3.DELTA.N ORF. The digest was treated with Klenow to
fill-in the 5' overhangs, the larger 8.5 kb fragment was isolated
and self-ligated.
[0349] GEVO1803 was made by transforming GEVO1186 with the 6.7 kb
pGV1730 (contains S. cerevisiae TRP1 marker and the CUP1
promoter-driven B. subtilis alsS) that had been linearized by
digestion with NruI. Completion of the digest was confirmed by
running a small sample on a gel. The digested DNA was then purified
using Zymo Research DNA Clean and Concentrator and used in the
transformation. Trp+ clones were confirmed for the correct
integration into the PDC1 locus by colony PCR using primer pairs
1440+1441 and 1442+1443 for the 5' and 3' junctions, respectively.
Expression of B. subtilis alsS was confirmed by qRT-PCR using
primer pairs 1323+1324.
[0350] GEVO2107 was made by transforming GEVO1803 with linearized,
HpaI-digested pGV1914. Correct integration of pGV1914 at the PDC6
locus was confirmed by analyzing candidate Ura+ colonies by colony
PCR using primers 1440 plus 1441, or 1443 plus 1633, to detect the
5' and 3' junctions of the integrated construct, respectively.
Expression of all transgenes was confirmed by qRT-PCR using primer
pairs 1321 plus 1322, 1587 plus 1588, and 1633 plus 1634 to examine
Bs_alsS, Ll_kivD, and Dm_ADH transcript levels, respectively.
[0351] GEVO2107 was transformed with plasmids that contained
either a KARI alone (pGV2001 with E. coli ilvC.sup.Q110V) or the
same KARI with a DHAD (pGV1974 with the S. cerevisiae Ilv3.DELTA.N
or pGV1981 with the L. lactis ilvD). Fermentations were carried out
with three independent transformants for each DHAD homolog being
tested, as well as the no DHAD control plasmid. Seed cultures were
grown in SCD-H medium to mid-log phase. The fermentations were
initiated by collecting cells and resuspending in 25 mL of SCD-H
(5% glucose) medium to an OD.sub.600 of 1. Fermentations were
performed aerobically in 125 mL unbaffled flasks shaken at 250 rpm
at 30.degree. C. At 0, 24, 48 and 72 hours, the OD.sub.600 was
checked and 2 mL samples were taken. These samples were centrifuged
at 18,000.times.g in a microcentrifuge and 1.5 mL of the clarified
media was transferred to a 1.5 mL Eppendorf tube. The clarified
media was stored at 4.degree. C. until analyzed by GC and HPLC as
described in General Methods. At 24 and 48 hours, 2.5 mL of glucose
from a 400 g/L stock solution was added to the cultures. FIG. 4
shows the production of isobutanol in these fermentations. All
values were adjusted for the dilution caused by the volume change
from adding glucose. An increased amount of isobutanol was produced
from the cells expressing the L. lactis ilvD.
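The dilution adjustment mentioned above can be approximated by tracking the culture volume through sampling and glucose additions. The sketch below is one possible way to do this and is not necessarily the correction that was applied: it assumes that each 2 mL sample is withdrawn before glucose is added at the same time point, that evaporation is negligible, and all names are hypothetical.

```python
def dilution_factors(start_ml, sample_ml, feed_ml, sample_times, feed_times):
    """Cumulative dilution factor applying to the sample taken at each time.

    Multiplying a measured concentration by the factor expresses it on the
    basis of an undiluted culture.
    """
    volume = start_ml
    factor = 1.0
    factors = {}
    for t in sample_times:
        factors[t] = factor            # sample is taken before any feed at t
        volume -= sample_ml
        if t in feed_times:
            factor *= (volume + feed_ml) / volume
            volume += feed_ml
    return factors


# 25 mL cultures, 2 mL samples, 2.5 mL glucose stock at 24 h and 48 h.
factors = dilution_factors(25.0, 2.0, 2.5, [0, 24, 48, 72], {24, 48})
for hour, f in factors.items():
    print(f"{hour} h: x{f:.3f}")
```

Each 2.5 mL addition of the 400 g/L stock delivers 1 g of glucose, so the volume change per addition is roughly 10% of the culture.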
Example 11
Assaying DHAD Activity in Fractionated Cell Extracts
[0352] The purpose of this Example is to describe how DHAD activity
can be measured in fractionated cellular extracts that are enriched
for either mitochondrial or soluble cytosolic components.
[0353] Plasmids pGV1106, pGV1662, pGV1855, and pGV1900 are described in
Example 9 above. To measure the DHAD activities present in
fractionated cell extracts, the strain GEVO2244 was transformed
singly with either pGV1106, which served as an empty vector
control, or with one of: pGV1855, pGV1900, or pGV2019, which are
expression plasmids for L. lactis ilvD, S. cerevisiae ILV3 (full
length), and S. cerevisiae ILV3.DELTA.N, respectively.
[0354] An independent clonal transformant of each plasmid was
isolated, and a 1 L culture of each strain was grown in
SCGal-Ura+9.times.IV at 30.degree. C. at 250 rpm. The OD.sub.600
was noted, the cells were collected by centrifugation
(1600.times.g, 2 min) and the culture medium was decanted. The cell
pellets were resuspended in 50 mL sterile deionized water,
collected by centrifugation (1600.times.g, 2 min), and the
supernatant was discarded. The OD.sub.600 and total wet cell pellet
weight of each culture are listed in Table 24, below:
TABLE-US-00026 TABLE 24 OD.sub.600 and pellet mass (g) of strain
GEVO2244 transformed with the indicated plasmids.
Plasmid  OD.sub.600  Pellet mass (g)
pGV1106  2.2         7.6
pGV1855  2.3         7.7
pGV1900  1.3         3.8
pGV2019  2.6         8.4
[0355] To obtain spheroplasts, the cell pellets were resuspended in
0.1 M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells were then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer, which
consists of (final concentrations) 1.2M sorbitol (Amresco, catalog
#0691) and 20 mM potassium phosphate pH 7.4, and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(approximately 16 hrs) at 30.degree. C. with gentle agitation (60
rev/min) on an orbital shaker. The efficacy of spheroplasting was
ascertained by diluting an aliquot of each cell suspension 1:10 in
either sterile water or in spheroplasting buffer, and comparing the
aliquots microscopically (under 40.times. magnification). In all
cases, >90% of the water-diluted cells lysed, indicating
efficient spheroplasting. The spheroplasts were centrifuged
(3000.times.g, 10 min, 20.degree. C.), and the supernatant was
discarded. Each cell pellet was resuspended in 50 mL spheroplast
buffer without Zymolyase, and cells were collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
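The buffer volumes implied by the 0.1 g/mL resuspension steps above can be estimated from the wet pellet masses in Table 24. This is a rough sketch only: it assumes the Table 24 masses are the masses being resuspended (the pellet mass may change after the intervening washes), and the function name is hypothetical.

```python
# Wet pellet masses (g) from Table 24.
pellet_mass_g = {"pGV1106": 7.6, "pGV1855": 7.7, "pGV1900": 3.8, "pGV2019": 8.4}


def resuspension_volume_ml(mass_g, target_g_per_ml=0.1):
    """Final suspension volume (mL) for a pellet at the target cell density."""
    return mass_g / target_g_per_ml


volumes = {plasmid: resuspension_volume_ml(mass)
           for plasmid, mass in pellet_mass_g.items()}
for plasmid, volume in volumes.items():
    print(f"{plasmid}: {volume:.0f} mL")
```

Under this assumption every suspension is below 100 mL, which is consistent with the use of 500 mL centrifuge bottles.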
[0356] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) were saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant was transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000.times.g, 20 min,
4.degree. C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant was transferred to a clean tube on ice, being careful
to avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets were resuspended in MIB
containing Protease Arrest solution, and were labelled the "P"
("pellet") fractions. Protein from the "P" fraction was released
after dilution 1:5 in DHAD assay buffer (see above) by rapid mixing
in a 1.5 mL tube with a Retsch Ball Mill MM301 in the presence of
0.1 mm glass beads. The bead-beating was performed 4 times for 1
minute, 30 beats per second, after which insoluble debris was
removed by centrifugation (20,000.times.g, 10 min, 4.degree. C.)
and the soluble portion retained for use.
[0357] The BioRad Protein Assay reagent (BioRad, Hercules, Calif.)
was used according to manufacturer's instructions to determine the
protein concentration of each fraction; the data are summarized in
Table 25, below:
TABLE-US-00027 TABLE 25 Protein concentrations of
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates, prepared as described in the text.
plasmid/fraction  protein [.mu.g/.mu.L]
1106 P            20.3
1855 P            17.7
1900 P            9.2
2019 P            19.7
1106 S            12.3
1855 S            12.9
1900 S            7.9
2019 S            12.4
1106 W            14.0
1855 W            15.0
1900 W            7.9
2019 W            14.7
[0358] The DHAD activity of each fraction was ascertained as
follows. In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each
fraction was mixed with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate
(DHIV), 25 .mu.L of 0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M
Tris-HCl pH 8.0, and the mixture was incubated for 30 min at
35.degree. C. Each reaction was carried out in triplicate. Each
tube was then heated to 95.degree. C. for 5 min to inactivate any
enzymatic activity, and the solution was centrifuged
(16,000.times.g for 5 min) to pellet insoluble debris. To prepare
samples for analysis, 100 .mu.L of each reaction were mixed with
100 .mu.L of a solution consisting of 4 parts 15 mM dinitrophenyl
hydrazine (DNPH) in acetonitrile with 1 part 50 mM citric acid, pH
3.0, and the mixture was heated to 70.degree. C. for 30 min in a
thermocycler. Analysis of ketoisovalerate via HPLC was carried out
as described in General Methods. Data from the experiment are
summarized below in Table 26.
TABLE-US-00028 TABLE 26 Specific activities (KIV generation) and
ratios of specific activities from fractionated lysates of S.
cerevisiae strain GEVO2244 carrying plasmids to overexpress the
indicated DHAD homolog.
Lysate (pGV# and fraction*)  DHAD        Sp. Activity [U/mg protein in fraction]  Std. Dev.  Ratio of Sp. Activities (Cyto or Mito to Whole-Cell)
1106 WCL                     --          n.d.
1106 cyto                    --          n.d.
1106 mito                    --          n.d.
1855 WCL                     Ll_ilvD     0.0006                                   4.7E-05
1855 cyto                    Ll_ilvD     0.0011                                   0.0001     1.76
1855 mito                    Ll_ilvD     2E-05                                    3.5E-05    0.03
1900 WCL                     ScILV3(FL)  0.0096                                   0.0018
1900 cyto                    ScILV3(FL)  0.0052                                   0.0004     0.54
1900 mito                    ScILV3(FL)  0.0340                                   0.0029     3.53
Each data point is the result of triplicate samples.
*WCL, whole cell lysate; cyto, cytosolic-enriched fraction; mito,
mitochondrial (organellar)-enriched fraction
[0359] Cells overexpressing the L. lactis ilvD generated a
significantly greater proportion of DHAD activity in the cytosolic
fraction versus the mitochondrial fraction, whereas cells
overexpressing the full-length, native (mitochondrial) S.
cerevisiae ILV3 resulted in a greater proportion of the specific
activity residing in the mitochondrial fraction.
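The ratios in Table 26 are the specific activity of each enriched fraction divided by that of the corresponding whole cell lysate. The sketch below recomputes them from the rounded specific activities printed in the table; the variable and function names are hypothetical. Ratios obtained this way differ slightly from the tabulated values (for example about 1.83 rather than 1.76 for the 1855 cytosolic fraction), presumably because the tabulated ratios were derived from unrounded activities.

```python
# Rounded specific activities (U/mg protein in fraction) from Table 26.
specific_activity = {
    "1855": {"WCL": 0.0006, "cyto": 0.0011, "mito": 2e-05},
    "1900": {"WCL": 0.0096, "cyto": 0.0052, "mito": 0.0340},
}


def ratios_to_whole_cell(fractions):
    """Ratio of each enriched fraction's activity to the whole cell lysate."""
    wcl = fractions["WCL"]
    return {name: value / wcl
            for name, value in fractions.items() if name != "WCL"}


ratios = {plasmid: ratios_to_whole_cell(fractions)
          for plasmid, fractions in specific_activity.items()}
for plasmid, r in ratios.items():
    print(plasmid, {name: round(value, 2) for name, value in r.items()})
```

A cytosolic ratio above 1 together with a mitochondrial ratio near 0 indicates enrichment in the soluble fraction, which is the pattern seen for the L. lactis ilvD construct.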
Example 12
Alternative, Native Dehydratases with DHAD Activity
[0360] This example describes how the overexpression of native
dehydratases in S. cerevisiae for the conversion of
2,3-dihydroxyisovalerate to ketoisovalerate is measured.
TABLE-US-00029 TABLE 27 Plasmids disclosed in Example 12.
pGV No.  Genotype
p426TEF  P.sub.TEF1:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori (Mumberg, D. et al. (1995) Gene 156: 119-122; obtained from ATCC)
1102     P.sub.TEF1:HA-tag:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1106     P.sub.TDH3:myc-tag:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1662     P.sub.TEF1:Ll_kivd:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1894     P.sub.TEF1:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2000     P.sub.TEF1:Sc_ILV3.DELTA.N:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2111     P.sub.TEF1:Ll_ilvD:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2112     P.sub.TEF1:Sc_LEU1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2113     P.sub.TEF1:Sc_HIS3:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
[0361] Plasmid pGV1102 was generated by inserting a linker (primers
269 annealed to primer 270) containing a HA-tag and a new MCS
(SalI-EcoRI-SmaI-BamHI-NotI) into the SpeI and XhoI sites of
p426TEF. Plasmids pGV1106 and pGV1662 are described in Examples 3
and 5, respectively. Plasmid pGV1894 is a yeast high copy plasmid
with URA3 as a marker for the expression of E. coli ilvC.sup.Q110V
and was generated by cloning a XhoI-NotI fragment (1.5 kb) carrying
the E. coli ilvC.sup.Q110V ORF (SEQ ID NO: 98) into the SalI-NotI
of pGV1662 (6.3 kb), replacing the L. lactis kivD ORF. Plasmids
pGV2000, pGV2111, pGV2112, and pGV2113 are yeast high copy plasmids
with URA3 as a marker for the expression of E. coli ilvC.sup.Q110V
and a DHAD. pGV2000 is generated by cloning a SacI-NotI fragment
(4.9 kb) from pGV1974 (described in Example 10) carrying the S.
cerevisiae TEF1 promoter:S. cerevisiae Ilv3.DELTA.N:S. cerevisiae
TDH3 promoter:E. coli ilvC.sup.Q110V into the SacI-NotI sites of
pGV1106 (6.6 kb), a yeast expression plasmid carrying the URA3
marker. pGV2111 is generated by cloning a SalI-BamHI fragment (1.7
kb) carrying the L. lactis ilvD ORF (SEQ ID NO: 97 with SalI and
BamHI sites introduced at the 5' and 3' ends, respectively) into
the SalI-BamHI of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF. pGV2112 is generated by cloning the S. cerevisiae
LEU1 gene as a SalI-BamHI fragment (2.3 kb), generated by PCR using
primers 2163 and 1842 using genomic DNA as template, into the
SalI-BamHI of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF. pGV2113 is generated by cloning the S. cerevisiae
HIS3 gene as a SalI-BamHI fragment (0.7 kb), generated by PCR using
primers 2183 and 2184 using genomic DNA as template, into the
SalI-BamHI of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF.
[0362] DHADs are tested for in vitro activity using whole cell
lysates. The DHADs as well as LEU1 and HIS3 are expressed from
pGV2000, pGV2112, and pGV2113 in GEVO2244 to minimize endogenous
DHAD background activity. A plasmid that does not express DHAD,
pGV1894, and a plasmid that expresses the L. lactis ilvD, pGV2111,
are used as negative and positive controls, respectively.
[0363] To grow cultures for cell lysates, triplicate independent
cultures of each desired strain are grown overnight in 3 mL
YNBD+HLW+10.times.IV at 30.degree. C., 250 rpm. The following day,
the overnight cultures are diluted 1:50 into 50 mL fresh
YNBD+HLW+10.times.IV in a 250 mL baffle-bottomed Erlenmeyer flask
and incubated at 30.degree. C. at 250 rpm. After approximately 10
hours, the OD.sub.600 of each culture is measured, and the cells
of each culture are collected by centrifugation (2700.times.g, 5
min). The cell pellets are washed by resuspending in 1 mL of water,
and the suspension is placed in a 1.5 mL tube and the cells are
collected by centrifugation (16,000.times.g, 30 seconds). All
supernatant is removed from each tube and the tubes are frozen at
-80.degree. C. until use.
[0364] Lysates are prepared by resuspending each cell pellet in 0.7
mL of lysis buffer. The lysis buffer consists of: 0.1M Tris-HCl
pH 8.0, 5 mM MgSO.sub.4, with 10 .mu.L of Yeast/Fungal Protease
Arrest solution (G Biosciences, catalog #788-333) per 1 mL of lysis
buffer. Eight hundred microliters of cell suspension are added to 1
mL of 0.5 mm glass beads that had been placed in a chilled 1.5 mL
tube. Cells are lysed by bead beating (6 rounds, 1 minute per
round, 30 beats per second) with 2 minutes chilling on ice in
between rounds. The tubes are then centrifuged (20,000.times.g, 15
min) to pellet debris and the supernatants (cell lysates) are
retained in fresh tubes on ice. The protein concentration of each
lysate is measured using the BioRad Bradford protein assay reagent
(BioRad, Hercules, Calif.) according to manufacturer's
instructions.
[0365] The DHAD activity of each lysate is ascertained as follows.
In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each lysate is mixed
with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate (DHIV), 25 .mu.L of
0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M Tris-HCl pH 8.0, and the
mixture is incubated for 30 min at 35.degree. C. Each tube is then
heated to 95.degree. C. for 5 min to inactivate any enzymatic
activity, and the solution is centrifuged (16,000.times.g for 5
min) to pellet insoluble debris. To prepare samples for analysis,
100 .mu.L of each reaction is mixed with 100 .mu.L of a solution
consisting of 4 parts 15 mM dinitrophenyl hydrazine (DNPH) in
acetonitrile with 1 part 50 mM citric acid, pH 3.0, and the mixture
is heated to 70.degree. C. for 30 min in a thermocycler. The
solution is then analyzed by HPLC as described above in General
Methods to quantitate the concentration of ketoisovalerate (KIV)
present in the sample.
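The reaction volumes given above can be checked with a short calculation. The Python sketch below computes the final concentration of each component in the 500 .mu.L reaction and converts a measured KIV concentration into a specific activity. It assumes that one unit (U) is 1 .mu.mol of KIV formed per minute, ignores the buffer carried over with the lysate, and uses hypothetical function names; the KIV concentration passed in must be the concentration in the reaction itself, before the 1:1 dilution with DNPH reagent.

```python
# Volumes (uL) and stock concentrations (mM) of the assay components.
LYSATE_UL = 50.0
STOCKS = {
    "DHIV": (50.0, 100.0),      # 0.1 M 2,3-dihydroxyisovalerate
    "MgSO4": (25.0, 100.0),     # 0.1 M MgSO4
    "Tris-HCl": (375.0, 50.0),  # 0.05 M Tris-HCl pH 8.0
}
TOTAL_UL = LYSATE_UL + sum(vol for vol, _ in STOCKS.values())


def final_mM(name):
    """Final concentration (mM) of a component in the assembled reaction."""
    vol, stock = STOCKS[name]
    return stock * vol / TOTAL_UL


def specific_activity(kiv_mM, protein_mg_per_mL, minutes=30.0):
    """umol KIV per minute per mg lysate protein (assumed definition of U/mg)."""
    kiv_umol = kiv_mM * TOTAL_UL / 1000.0                # mM x mL = umol
    protein_mg = protein_mg_per_mL * LYSATE_UL / 1000.0  # mg/mL x mL = mg
    return kiv_umol / minutes / protein_mg


for component in STOCKS:
    print(component, final_mM(component), "mM")
```

With these volumes the substrate DHIV is present at 10 mM and MgSO.sub.4 at 5 mM in the reaction.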
[0366] DHADs are tested for in vitro activity using whole cell
lysates. The DHADs are expressed in a yeast deficient for DHAD
activity (GEVO2244; ilv3.DELTA.) to minimize endogenous background
activity.
Example 13
Cloning of Low-Abundance, Endogenous Cytosolic Iron-Sulfur Cluster
Assembly Machinery for Overexpression in S. cerevisiae
[0367] The purpose of this example is to describe how three known
components of the S. cerevisiae cytosolic iron-sulfur assembly
machinery were cloned to permit their overexpression in S.
cerevisiae, to increase cytosolic DHAD activity.
[0368] In the yeast S. cerevisiae, at least four genes--CIA1, CFD1,
NAR1, and NBP35--encode activities that contribute to the proper
assembly and/or transfer of iron-sulfur [Fe--S] clusters of
cytosolic proteins. Of these four genes, three--CFD1, NAR1, and
NBP35--have been shown to be expressed at very low levels during
aerobic growth on glucose (Ghaemmaghami et al., 2003, Nature, 425:
737-741). These three genes thus represent attractive candidates
for overexpression to increase the cellular capacity for proper
cytosolic [Fe--S] cluster protein assembly.
TABLE-US-00030 TABLE 27 Plasmids disclosed in Example 13.
pGV No. | Genotype
pGV2074 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter, TDH3 promoter
pGV2127 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter, TDH3 promoter-CFD1
pGV2138 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter-NAR1, TDH3 promoter-CFD1
pGV2144 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter-NBP35, TEF1 promoter, TDH3 promoter
pGV2147 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter-NBP35, TEF1 promoter-NAR1, TDH3 promoter-CFD1
[0369] To clone the sequences for CFD1, NAR1, and NBP35 into an
appropriate S. cerevisiae expression vector, the following steps
were carried out: Vector pGV2074 (SEQ ID NO: 133) was used as a
parental plasmid for subsequent cloning steps described below. The
salient features of pGV2074 include a bacterial origin of
replication (pUC) and selectable marker (bla), an S. cerevisiae 2
.mu.m origin of replication and selectable marker (the hph gene,
conferring resistance to hygromycin, operably linked to the TPI1
promoter region), and sequences containing the S. cerevisiae
promoters for the PGK1, TDH3 and TEF1 genes, each followed by one
or more unique restriction sites to facilitate the introduction of
coding sequences.
[0370] First, the CFD1 coding sequence was amplified from S.
cerevisiae genomic DNA by PCR, using primers 2195 and 2196, which
also added 5' XhoI and 3' NotI sites, respectively. The resulting
.about.890 bp product was digested with XhoI plus NotI and ligated
into pGV2074 that had been digested with XhoI plus NotI, yielding
the plasmid pGV2127. All sequences amplified by PCR were confirmed
by DNA sequencing. Next, the NAR1 coding sequence was amplified
from S. cerevisiae genomic DNA by PCR, using primers 2197 and 2198,
which added 5' SalI and 3' BamHI sites, respectively. The resulting
.about.1485 bp product was digested with SalI plus BamHI and cloned
into pGV2127 which had also been digested with SalI plus BamHI,
thereby yielding pGV2138. Next, the NBP35 coding sequence was
amplified from S. cerevisiae genomic DNA by PCR, using primers 2259 and
2260, which added 5' BglII and 3' KpnI and XhoI (from 5' to 3')
sites, respectively. The resulting .about.995 bp product was
digested with BglII plus XhoI and ligated into pGV2074 that had
been digested with BglII plus SalI, yielding pGV2144. Finally,
pGV2144 was digested with AvrII plus BamHI, and the resulting 1.78
kb fragment (which contained the PGK1 promoter and the NBP35 ORF
sequence) was gel purified and ligated into the vector pGV2138 that
had been digested with AvrII plus BglII, yielding pGV2147.
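Two of the ligations described above join ends cut by different enzymes: the BglII/XhoI-digested NBP35 product into BglII/SalI-digested pGV2074, and the AvrII/BamHI fragment into AvrII/BglII-digested pGV2138. This works because XhoI and SalI, and likewise BamHI and BglII, leave identical cohesive ends. The Python sketch below derives the overhang of each enzyme from its standard recognition site and cut position; the function names are illustrative, and the simple equality test is valid only for palindromic overhangs such as these.

```python
# Recognition sites (5'->3', top strand) and the number of bases
# preceding the top-strand cut, e.g. XhoI cuts C^TCGAG.
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "AvrII": ("CCTAGG", 1),
    "NotI": ("GCGGCCGC", 2),
    "KpnI": ("GGTACC", 5),
}


def overhang(enzyme):
    """Return (kind, sequence) of the overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # bottom-strand cut, in top-strand coordinates
    kind = "5'" if cut < mirror else "3'"
    return kind, site[min(cut, mirror):max(cut, mirror)]


def compatible(a, b):
    """True if both enzymes leave the same (palindromic) cohesive end."""
    return overhang(a) == overhang(b)


for pair in (("XhoI", "SalI"), ("BamHI", "BglII"), ("XhoI", "BamHI")):
    print(pair, compatible(*pair))
```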
Example 14
Cloning of Heterologous Cytosolic Iron-Sulfur Cluster Assembly
Machinery for Overexpression in S. cerevisiae
[0371] The purpose of this example is to describe how one or more
cytosolic iron-sulfur assembly machinery components, from various
species, can be cloned to permit their overexpression in S.
cerevisiae, thereby increasing cytosolic DHAD activity.
[0372] In addition to the endogenous cytosolic iron-sulfur assembly
machinery found in S. cerevisiae, homologous sequences and
activities have been identified in other microbial and eukaryotic
species. In one example, the ApbC protein of Salmonella enterica
serovar Typhimurium has been shown, in vitro, to bind and
effectively transfer iron-sulfur clusters to a known cytosolic
[Fe--S] cluster-containing S. cerevisiae substrate, Leu1 (Boyd et
al., 2008, Biochemistry, 47: 8195-202). Thus, a number of other
useful homologs of the known S. cerevisiae cytosolic iron-sulfur
assembly machinery components exist and present attractive
candidates for overexpression in S. cerevisiae. Table 28 lists
several exemplary homologs and their GenBank accession numbers, as
identified by previous homology searches (Boyd et al., 2009, J.
Biol Chem 284: 110-118). Also included in the table are two closely
related S. cerevisiae homologs, Nbp35 and Cfd1. Of note, Ind1 is
reported to be localized to and functional in the mitochondria
(Bych et al., 2008, EMBO J. 27: 1736-46), whereas Hcf101 is
reported to participate in iron-sulfur cluster assembly in
Arabidopsis chloroplasts (Lezhneva et al., 2004, Plant J. Cell Mol
Biol 37: 174-185).
TABLE-US-00031 TABLE 28 Functionally homologous proteins involved
in iron-sulfur cluster formation.
Gene | Source, Accession Number
ApbC | Salmonella enterica serovar Typhimurium LT2, NP_461098
Ind1 | Yarrowia lipolytica, YALI0B18590g
Hcf101 | Arabidopsis thaliana, AAR97892.1
Nubp1 | Homo sapiens, NP_002475.2
Nbp35 | S. cerevisiae, CAA96797.1
Cfd1 | S. cerevisiae, AAS56623
[0373] The cloning of one or more of these genes is carried out
using techniques well known to one skilled in the art.
Oligonucleotide primers are designed that are homologous to the 5'
and 3' ends of each desired reading frame, and which furthermore
incorporate a restriction site sequence convenient for the cloning
of each reading frame into vector pGV2074. A standard PCR reaction
is used to amplify each gene, either from the genome of each host
organism, or from an in vitro synthesized DNA fragment, and the
resulting PCR product is cloned into an expression vector
(pGV2074). In the case of a protein known to be targeted to the
mitochondria, such as Yarrowia lipolytica Ind1, PCR primers are
designed to amplify the majority of the coding sequence while
excluding the known N-terminal mitochondrial targeting sequence
(Bych et al., 2008, EMBO J. 27: 1736-46).
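The primer design outlined above can be sketched in Python as follows. This is an illustration under stated assumptions, not the primers actually used: the SalI and BamHI sites, the two-base 5' clamp, the 20-nucleotide annealing length, and the toy open reading frame are all hypothetical choices made for the example. Setting skip_codons to a positive number mimics removal of an N-terminal targeting sequence, with a start codon re-added in the forward primer.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(orf, site_5p, site_3p, anneal=20, clamp="GC", skip_codons=0):
    """Return (forward, reverse) primers that add restriction sites to an ORF."""
    body = orf[3 * skip_codons:]           # drop N-terminal codons if requested
    start = "ATG" if skip_codons else ""   # restore a start codon after truncation
    forward = clamp + site_5p + start + body[:anneal]
    reverse = clamp + site_3p + reverse_complement(body[-anneal:])
    return forward, reverse


# Hypothetical 20-codon ORF used only to exercise the functions.
TOY_ORF = "ATGTCTGCTAAACCAGTTGGTCAAGCTTTGACTGAAAGAGCTATTGAATTGTTGAAGTAA"

fwd, rev = design_primers(TOY_ORF, "GTCGAC", "GGATCC")
print(fwd)
print(rev)
```

In practice the chosen sites must be absent from the coding sequence itself, which this sketch does not check.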
Example 15
Overexpression of S. cerevisiae Cytosolic Iron-Sulfur Assembly
Machinery to Increase Cytosolic DHAD Activity
[0374] The purpose of this example is to describe how a plasmid
expressing one or more iron-sulfur assembly machinery components is
co-expressed with a DHAD, thereby increasing the cytosolic activity
of the DHAD.
[0375] Strain GEVO2244 is simultaneously co-transformed with one
of: pGV1851, pGV1852, pGV1853, pGV1854, pGV1855, pGV1904, pGV1905,
pGV1906, or pGV1907 (pGV1851-55 and pGV1904-07 are described in
Table 20); plus, one of either: pGV2074 (Table 27) (which serves as
an empty-vector control) or pGV2147 (Table 27) (which serves as the
cytosolic Fe--S cluster machinery overexpression plasmid), and
doubly-transformed cells are selected by plating onto
SCD-Ura+9.times.IV containing 0.1 g/L Hygromycin B.
[0376] Three independent isolates from each transformation are
cultured in SCD-Ura+9.times.IV containing 0.1 g/L Hygromycin B to
obtain a cell mass suitable for preparation of a lysate, as
described in Example 3. Lysates are prepared from each culture, and
the resulting lysates are assayed for DHAD activity as described in
Example 3. To further confirm that the increased DHAD activity is
due specifically to increased cytosolic activity, cultures of
GEVO2244 containing pGV1855 plus either pGV2074 or pGV2147 are
grown in SCD-Ura+9.times.IV containing 0.1 g/L Hygromycin B as
otherwise described in Example 11. Fractionated lysates are
prepared and in vitro assays to measure DHAD activity are further
carried out as described in Example 11.
Example 16
Deletion of LEU1
[0377] The purpose of this example is to describe the deletion of
LEU1 to increase the iron-sulfur cluster availability in the yeast
cytosol.
TABLE-US-00032 TABLE 29 Plasmids disclosed in Example 16.
pGV No. | Genotype
pGV1299 | K. lactis URA3, bla, pUC-ori (GEVO)
pGV1981 | P.sub.TEF1:Lactococcus lactis ilvD-coSc:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2-micron, bla, pUC-ori
pGV2001 | P.sub.TEF1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2-micron, bla, pUC-ori
[0378] The LEU1 gene was deleted by transforming cells with a
leu1::K. lactis URA3 deletion cassette that was generated by two
rounds of PCR. Initially, the K. lactis URA3 gene was amplified
with primers 2171 and 2172 from pGV1299 (described in Example 2).
These primers add 40 bp of the LEU1 promoter and terminator
sequences to the 5' and 3' ends of the K. lactis URA3 gene. This
PCR product was then used as a template for a PCR using primers
2170 and 2173. Primer 2170 adds an additional 36 bp of the LEU1
promoter sequence at the 5' end and primer 2173 adds an additional
38 bp of the LEU1 terminator sequence at the 3' end. This PCR
product was transformed into GEVO2244 (described in Example 2) to
generate GEVO2570. The 5' junction of the integration was
confirmed by colony PCR using primers 2226 and 587. The 3' junction
of the integration was confirmed by colony PCR using primers 588
and 2175. The loss of the LEU1 gene was confirmed by a lack of PCR
product using primers 2167 and 2227.
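The two PCR rounds extend the LEU1-homologous flanks of the cassette stepwise. The short Python sketch below totals the flank lengths; it assumes that the first round adds 40 bp to each end and that the second-round additions lie immediately outside the first-round sequence, which is how the description above is read here. The names are illustrative only.

```python
# LEU1-flanking sequence (bp) added in each PCR round.
ROUND_1 = {"5'": 40, "3'": 40}  # primers 2171 and 2172 (40 bp per end assumed)
ROUND_2 = {"5'": 36, "3'": 38}  # primers 2170 and 2173


def homology_arms():
    """Total LEU1 homology (bp) on each side of the K. lactis URA3 marker."""
    return {end: ROUND_1[end] + ROUND_2[end] for end in ROUND_1}


print(homology_arms())
```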
[0379] GEVO2570 has a deletion in ILV3. GEVO2570 is used to measure
DHAD activity in the presence of L. lactis ilvD overexpressed as
described in Examples 2 and 4. A plasmid (pGV2001) with no DHAD is
used as a negative control.
Example 17
Conserved Motif Amongst Cytosolically Active DHAD Enzymes
[0380] This example illustrates that DHAD enzymes with a specific
amino acid sequence motif are more likely to be functional when
expressed in the yeast cytosol.
[0381] Based on the data from biochemical assays (see Example 10),
several DHAD homologs were identified that exhibit at least some
cytosolic activity. A total of ten different homologs were tested
using biochemical assays. The DHADs were expressed from 2 micron
yeast vectors and transformed into GEVO2244. The homologs were then
ranked based on their measured specific activity in both whole cell
lysates and in cytosolic fractions.
[0382] Based on these data, four DHAD homologs: L. lactis (SEQ ID
NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria (SEQ ID NO:
16), and S. erythraea (SEQ ID NO: 19) exhibit cytosolic DHAD
activity. Four DHAD homologs exhibit no cytosolic DHAD activity: R.
eutropha (SEQ ID NO: 22), C. salexigens (SEQ ID NO: 23), P.
torridus (SEQ ID NO: 24), and S. tokodaii (SEQ ID NO: 25). One
motif-containing homolog was inconclusive: Piromyces sp. E2 (SEQ ID
NO: 21), which did not complement the GEVO2242 valine auxotrophy
but had detectable biochemical DHAD activity. Since this homolog
has a putative organellar targeting sequence, the protein is likely
to be mitochondrially located, explaining its inability to
complement the GEVO2242 auxotrophy despite containing the
motif.
[0383] A multiple sequence alignment (MSA) was created using the
Align Multiple Sequences tool of Clone Manager 9 Professional
Edition software using the "Multi-Way" function. This function
performs exhaustive pairwise global alignments of all sequences and
progressive assembly of alignments using Neighbor-Joining
phylogeny. A total of 53 representative DHAD homologs (FIG. 5) were
aligned using the BLOSUM62 scoring matrix setting. This alignment
generated the tree in FIG. 5.
[0384] Many of the DHAD homologs exhibiting cytosolic activity are
related by overall homology (>40%) when compared to the
S. cerevisiae DHAD encoded by S. cerevisiae ILV3 (e.g. L. lactis,
G. forsetii, Acidobacteria, and S. erythraea). However, the 40%
homology cut-off still includes several DHAD homologs that do not
exhibit cytosolic DHAD activity (e.g. R. eutropha, C. salexigens,
P. torridus, and S. tokodaii). The Piromyces sp. E2 DHAD failed to
complement in the genetic/biochemistry assay but this result is
still consistent with our motif hypothesis since the protein still
retained its mitochondrial localization signal. Therefore, a common
sequence motif, unique to DHAD homologs that are cytosolically
active, was identified: P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), where
(I/L) indicates an isoleucine or leucine at that position, and X
indicates any natural or non-natural amino acid. This motif can be
found in all DHAD homologs exhibiting cytosolic activity.
Furthermore, an even more specific version of this motif was
identified that is conserved in all DHAD homologs exhibiting
cytosolic activity except for the S. erythraea DHAD:
PIKXXGX(I/L)XIL (SEQ ID NO: 28). This motif is conserved amongst
the majority if not all eukaryotic homologs of DHAD.
[0385] Six additional DHAD homologs were identified: SEQ ID NOs:
10-15 as specified in Table 1. These DHAD homologs (SEQ ID NOs:
10-15) contain the motifs PYHKEGGLGIL (SEQ ID NO: 145), PYSEKGGLAIL
(SEQ ID NO: 146), PYKPEGGIAIL (SEQ ID NO: 147), PLKPSGHLQIL (SEQ ID
NO: 148), PIKKTGHLQIL (SEQ ID NO: 149), and PIKETGHIQIL (SEQ ID NO:
150), respectively.
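The two motifs can be written as regular expressions for screening candidate sequences. The Python sketch below is one possible literal encoding, not a method described in this application: X is modelled as any single character, (I/L) as the character class [IL], and the classify function is a hypothetical helper. The six 11-residue segments listed above are used as test input.

```python
import re

# P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27) and PIKXXGX(I/L)XIL (SEQ ID NO: 28).
GENERAL_MOTIF = re.compile(r"P[IL].{3}G.[IL].IL")
SPECIFIC_MOTIF = re.compile(r"PIK.{2}G.[IL].IL")


def classify(protein):
    """Report which motif, if any, occurs literally in a protein sequence."""
    if SPECIFIC_MOTIF.search(protein):
        return "specific"
    if GENERAL_MOTIF.search(protein):
        return "general"
    return "none"


# Segments listed for SEQ ID NOs: 10-15 (SEQ ID NOs: 145-150).
SEGMENTS = [
    "PYHKEGGLGIL", "PYSEKGGLAIL", "PYKPEGGIAIL",
    "PLKPSGHLQIL", "PIKKTGHLQIL", "PIKETGHIQIL",
]

for segment in SEGMENTS:
    print(segment, classify(segment))
```

The test is purely literal: a segment whose second residue is neither isoleucine nor leucine is reported as "none" even if it aligns with the motif region.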
Example 18
Use of Cytosolically Localized DHADs for the Production of
Isobutanol
[0386] The following example illustrates that DHADs that have
cytosolic activity in yeast, when expressed in the context of an
isobutanol biosynthetic pathway, lead to isobutanol production.
[0387] A yeast strain that contains one integrated copy of the B.
subtilis alsS gene codon-optimized for expression in S. cerevisiae
(SEQ ID NO: 144), two integrated copies of the L. lactis kivD gene
(SEQ ID NOs: 99 and 151), one integrated copy of L. lactis
adhA.sup.RE1 gene (SEQ ID NO: 152), and one integrated copy of the
S. cerevisiae AFT1 gene (SEQ ID NO: 153) was transformed with high
copy three-component isobutanol pathway plasmids containing a KARI
(Ec_ilvC_coSc.sup.P2D1-A1-his6, SEQ ID NO: 154), an ADH (L. lactis
adhA.sup.RE1, SEQ ID NO: 152) and a DHAD which was expressed from
the S. cerevisiae PDC1-286 promoter. The DHAD varied according to
Table 31. Isobutanol titer and DHAD activity of each strain were
compared to those of a control strain that did not express a DHAD in
the plasmid. Strains, plasmids, and DHADs are listed in Tables 30,
31, and 32, respectively.
TABLE-US-00033 TABLE 30 Genotype of strains disclosed in Example
18.
GEVO No. | Genotype
GEVO3868 | S. cerevisiae, CEN.PK2, MATa ura3 leu2 his3 trp1 gpd1::T.sub.Kl_URA3 gpd2::T.sub.Kl_URA3 tma29::T.sub.Kl_URA3 pdc1::P.sub.PDC1-Ll_kivD2_coSc5-P.sub.FBA1-LEU2-T.sub.LEU2-P.sub.ADH1-Bs_alsS1_coSc-T.sub.CYC1-P.sub.PGK1-Ll_kivD2_coEc-P.sub.ENO2-Sp_HIS5 pdc5::T.sub.Kl_URA3 pdc6::P.sub.TDH3-Sc_AFT1-P.sub.ENO2-Ll_adhA.sup.RE1-T.sub.Kl_URA3_short-P.sub.FBA1-Kl_URA3-T.sub.Kl_URA3 {evolved for C2 supplement-independence, glucose tolerance and faster growth}
TABLE-US-00034 TABLE 31 Plasmids disclosed in Example 18.
Plasmid Name | DHAD | Genotype
pGV2663 | none | P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2635 | L. lactis | P.sub.PDC1-286-Ll_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2671 | S. cerevisiae | P.sub.PDC1-286-Sc_ilv3_.DELTA.N20, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2672 | G. forsetii | P.sub.PDC1-286-Gf_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2673 | S. erythraea | P.sub.PDC1-286-Se_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2674 | F. tularensis | P.sub.PDC1-286-Ft_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2675 | S. cerevisiae ilv3.DELTA.N19 | P.sub.PDC1-286-Sc_ilv3_.DELTA.N19, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2676 | S. cerevisiae ilv3.DELTA.N23 | P.sub.PDC1-286-Sc_ilv3_.DELTA.N23, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2677 | N. crassa ilvD2 | P.sub.PDC1-286-Nc_ilvD2_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2678 | Acidobacteria bacterium | P.sub.PDC1-286-Ab_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2679 | Acaryochloris marina | P.sub.PDC1-286-Am_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2680 | Lyngbya spp. | P.sub.PDC1-286-Lsp_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
pGV2681 | E. coli | P.sub.PDC1-286-Ec_ilvD_coKl, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2.mu.-ori, pUC ori, bla, G418r
TABLE-US-00035 TABLE 32 DHAD sequences disclosed in Example 18.
DHAD | Abbreviation | SEQ ID NO (DNA) | SEQ ID NO (protein)
L. lactis | Ll_ilvD_coSc | 155 | 18
S. cerevisiae ilv3.DELTA.N20 | Sc_ilv3_.DELTA.N20 | 89 | 26
G. forsetii | Gf_ilvD_coSc | 90 | 17
S. erythraea | Se_ilvD_coSc | 91 | 19
F. tularensis | Ft_ilvD_coSc | 156 | 14
S. cerevisiae ilv3.DELTA.N19 | Sc_ilv3_.DELTA.N19 | 157 | 163
S. cerevisiae ilv3.DELTA.N23 | Sc_ilv3_.DELTA.N23 | 158 | 164
N. crassa ilvD2 | Nc_ilvD2_coSc | 159 | 165
A. bacterium | Ab_ilvD_coSc | 92 | 16
A. marina | Am_ilvD_coSc | 160 | 166
Lyngbya spp. | Lsp_ilvD_coSc | 161 | 167
E. coli | Ec_ilvD_coKl | 162 | 168
[0388] Cloning techniques included digestion with restriction
enzymes, gel purification of DNA fragments (using the Zymoclean Gel
DNA Recovery Kit, Cat# D4002, Zymo Research Corp, Orange, Calif.),
ligation of two DNA fragments using the DNA Ligation Kit (Mighty
Mix Cat# TAK 6023, Clontech Laboratories, Madison, Wis.), and
bacterial transformations into competent E. coli cells (Xtreme
Efficiency DH5.alpha. Competent Cells, Cat# ABP-CE-CC02096P, Allele
Biotechnology, San Diego, Calif.). Plasmid DNA was purified from E.
coli cells using the Qiagen QIAprep Spin Miniprep Kit (Cat# 27106,
Qiagen, Valencia, Calif.).
[0389] Yeast media used for this example include YP medium (1%
(w/v) yeast extract, 2% (w/v) peptone), YPD medium (YP medium
containing 2% (w/v) glucose), and YPD supplemented with glycerol and
ethanol (YPD medium containing 1% (v/v) 80% glycerol and 1% (v/v)
ethanol). The antibiotic G418 was added to agar plates to a final
concentration of 0.2 g/L. Precultures were grown in YP medium
supplemented with 5% glucose, 1% ethanol, and 0.2 g/L G418.
Fermentations were carried out in YP medium containing 8% glucose,
1% v/v of ergosterol and Tween-80 in 100% ethanol, 200 mM MES (pH
6.5), and 0.2 g/L G418.
[0390] A large patch of S. cerevisiae strain GEVO3868 was grown on
a YPD plate. Cells from the patch were scraped from the plate,
resuspended in 2 mL YPD containing 1% v/v ethanol and 1% v/v
80% glycerol, and placed in the 30.degree. C. orbital shaker
overnight. The following morning, 1 mL of the overnight culture was
used to inoculate 50 mL YPD containing 1% ethanol and 1% v/v
80% glycerol and returned to the 30.degree. C. orbital shaker.
After 6 hours, the cells were at an OD.sub.600 of 0.55. They were
diluted to an OD.sub.600 of 0.1 in the same media and grown
overnight at 30.degree. C. In the morning the cells were diluted to
an OD.sub.600 of 0.6, grown for 3 hours at 30.degree. C. until the
OD.sub.600 was 1.1, and the cells were collected by centrifugation
at 2700 rcf for 2 min at room temperature. The medium was removed,
50 mL sterile milliQ water was used to wash the cells, and the
cells were centrifuged for 2 min at 2700 rcf at room temperature.
After removing the supernatant, the cells were washed with 25 mL
sterile milliQ water and centrifuged at 2700 rcf for 2 min at room
temperature. The supernatant was removed and the cells were
resuspended in 1 mL 100 mM lithium acetate. The cells were
centrifuged for 10 sec, the supernatant removed, and the cells
resuspended in 400 .mu.L 100 mM lithium acetate. The cells were
transformed as follows. First, a mixture of plasmid DNA (final
volume of 15 .mu.l with sterile water), 72 .mu.l 50% PEG, 10 .mu.l
1M lithium acetate, and 3 .mu.l of denatured salmon sperm DNA (10
mg/mL) was prepared for each transformation. In a sterile 1.5 mL
tube, 15 .mu.l of the cell suspension was added to the DNA mixture
(100 .mu.l), and the transformation suspension was vortexed for 5
short pulses. The transformation was incubated for 30 min at
30.degree. C., followed by incubation for 22 min at 42.degree. C.
The cells were collected by centrifugation (18,000.times.g, 10
seconds, 25.degree. C.). After removing the supernatant, the cells
were resuspended in 400 .mu.l YPD. After an overnight recovery
shaking at 30.degree. C. and 250 RPM, the cells were spread over
selective plates, YPD containing 0.2 g/L G418. Transformants were
then single colony purified onto selective plates.
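The composition of the transformation mix can be verified with a short calculation. The Python sketch below confirms that the four components total 100 .mu.l and gives the concentration of each reagent in that mix, before the 15 .mu.l of cell suspension is added. The variable and function names are illustrative assumptions.

```python
# Volume (ul) of plasmid DNA brought up with sterile water.
DNA_UL = 15.0
# Reagent: (volume in ul, stock concentration, unit).
REAGENTS = {
    "PEG": (72.0, 50.0, "%"),
    "lithium acetate": (10.0, 1000.0, "mM"),
    "salmon sperm DNA": (3.0, 10.0, "mg/mL"),
}


def mix_volume():
    """Total volume (ul) of the DNA mixture prepared per transformation."""
    return DNA_UL + sum(vol for vol, _, _ in REAGENTS.values())


def final_concentration(name):
    """Concentration of a reagent in the DNA mixture, with its unit."""
    vol, stock, unit = REAGENTS[name]
    return stock * vol / mix_volume(), unit


print(mix_volume(), "ul total")
for reagent in REAGENTS:
    print(reagent, *final_concentration(reagent))
```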
[0391] For fermentations, 3 mL cultures of GEVO3868 transformed
with each 2.mu. plasmid were started in YPD containing 1% ethanol
and 0.2 g/L G418 and incubated overnight at 30.degree. C.
and 250 RPM. There were three biological replicates of each strain
for 39 cultures total. After the OD.sub.600 of these cultures was
measured the next day, the appropriate amount of culture was used to
inoculate 50 mL of YP with 5% glucose, 1% ethanol,
and 0.2 g/L G418 (baffled flask) to an OD.sub.600 of
approximately 0.1. These cultures were incubated at 30.degree. C.
and 250 RPM overnight. The next day, the cultures containing the S.
cerevisiae ilv3.DELTA.N20, the S. cerevisiae ilv3.DELTA.N19, and
the S. cerevisiae ilv3.DELTA.N23 did not reach an OD.sub.600 of 5
(0.6-2.4) so incubation continued for another 24 h at 30.degree. C.
and 250 RPM. The remaining 30 cultures had reached an OD.sub.600 of
approximately 5 and were centrifuged in 50 mL Falcon tubes at 2700
rcf for 5 min at 25.degree. C. The cells from the cultures were
resuspended in 50 mL YP with 8% glucose, 1% (v/v) ethanol,
ergosterol, Tween-80, 200 mM MES (pH 6.5), and 0.2 g/L G418. The
cultures were transferred to 250 mL unbaffled flasks with closed
screw caps and incubated at 30.degree. C. and 75 RPM. The next day,
the remaining 9 cultures were at a higher OD.sub.600 (3-5) and
prepared for the fermentation as described above. At 24 and 48 h
after transfer to 250 mL unbaffled flasks with closed screw caps,
samples of each of the 39 flasks were taken to determine OD.sub.600
and prepared for gas chromatography as follows. 2 mL of sample (per
flask) was removed and OD.sub.600 was determined. The remaining
sample was centrifuged for 10 min at maximum speed. 1 mL of the
supernatant was analyzed by gas chromatography as described. For
the final 72 h timepoint, the same procedures were used for
measuring OD.sub.600 and analysis by gas chromatography. In
addition samples were analyzed by high performance liquid
chromatography. Cells were also prepared for enzyme assays. After
3.times.15 mL Falcon tubes per flask were weighed (total of 117),
14 mL of the appropriate sample was transferred into the Falcon
tubes. After centrifugation at 3000.times.g for 5 min at 4.degree.
C., the supernatant was removed and the cells washed in 3 mL cold,
sterile water. The tubes were centrifuged as per above for 2 min,
the supernatant removed, and the tubes reweighed to determine total
cell weight. The Falcon tubes were stored at -80.degree. C.
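The culture and tube counts in this paragraph follow from the 13 plasmids of Table 31 grown in triplicate. The Python sketch below reproduces that bookkeeping and adds two helper calculations: the inoculum volume needed to reach a target OD.sub.600, and the wet cell mass obtained from the tube weights. The helper names and the example OD value are assumptions for illustration.

```python
# The 13 plasmids of Table 31: pGV2663, pGV2635, and pGV2671 to pGV2681.
PLASMIDS = ["pGV2663", "pGV2635"] + [f"pGV{n}" for n in range(2671, 2682)]
REPLICATES = 3
# Plasmids carrying ilv3 deltaN20, deltaN19 and deltaN23 (slow-growing cultures).
SLOW = {"pGV2671", "pGV2675", "pGV2676"}

cultures = [(p, r) for p in PLASMIDS for r in range(1, REPLICATES + 1)]
delayed = [c for c in cultures if c[0] in SLOW]


def inoculum_ml(current_od, target_od=0.1, final_ml=50.0):
    """Preculture volume giving target_od in final_ml (C1 x V1 = C2 x V2)."""
    return target_od * final_ml / current_od


def wet_cell_mass_g(tube_with_pellet_g, empty_tube_g):
    """Wet cell mass from the weight of a tube before and after pelleting."""
    return tube_with_pellet_g - empty_tube_g


print(len(cultures), "cultures;", len(delayed), "delayed;",
      len(cultures) - len(delayed), "on schedule;",
      3 * len(cultures), "Falcon tubes")
print(inoculum_ml(current_od=5.0), "mL of an OD 5 preculture per 50 mL")
```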
[0392] Analysis of organic acid metabolites was performed on an
HP-1200 HPLC system equipped with two Restek RFQ 150.times.7.8 mm
columns in series. Organic acid metabolites were detected using an
HP-1100 UV detector (210 nm) and refractive index. The column
temperature was 60.degree. C. This method was isocratic with 0.0180
N H.sub.2SO.sub.4 (in Milli-Q water) as mobile phase. Flow was set
to 1.1 mL/min. Injection volume was 20 .mu.L and run time was 16
min. Analysis was performed using authentic standards (>99%,
obtained from Sigma-Aldrich, with the exception of
2,3-dihydroxyisovalerate (DHIV), which was custom synthesized
according to Cioffi et al., 1980, Anal Biochem 104: 485) and a
5-point calibration curve.
[0393] Analysis of volatile organic compounds, including ethanol
and isobutanol was performed on a HP 5890, 6890 or 7890 gas
chromatograph fitted with an HP 7673 Autosampler, a DB-FFAP column
(J&W; 30 m length, 0.32 mm ID, 0.25 .mu.m film thickness) or
equivalent connected to a flame ionization detector (FID). The
temperature program was as follows: 230.degree. C. for the
injector, 300.degree. C. for the detector, 100.degree. C. oven for
1 minute, 70.degree. C./minute gradient to 230.degree. C., and then
hold for 2.5 min. Analysis was performed using authentic standards
(>99%, obtained from Sigma-Aldrich) and a 5-point calibration
curve with 1-pentanol as the internal standard.
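A 5-point calibration with an internal standard is commonly evaluated by regressing the analyte-to-internal-standard peak area ratio against the known analyte concentration. The Python sketch below shows one such evaluation as an illustration; it is not taken from this application, the peak areas are made-up numbers, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(std_conc, analyte_areas, istd_areas):
    """Fit area ratio (analyte / internal standard) against concentration."""
    ratios = [a / i for a, i in zip(analyte_areas, istd_areas)]
    return fit_line(std_conc, ratios)


def quantify(analyte_area, istd_area, slope, intercept):
    """Concentration of a sample from its area ratio and the fitted line."""
    return (analyte_area / istd_area - intercept) / slope


# Made-up example: five isobutanol standards (g/L) with 1-pentanol at a fixed level.
slope, intercept = calibrate(
    [0.5, 1.0, 2.0, 4.0, 8.0],
    [50.0, 100.0, 200.0, 400.0, 800.0],
    [50.0] * 5,
)
print(quantify(300.0, 50.0, slope, intercept), "g/L")
```

Normalizing to the internal standard compensates for run-to-run variation in injection volume.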
[0394] For DHAD activity assays cells were thawed on ice and
resuspended in lysis buffer (50 mM Tris pH 8.0 and 5 mM MgSO.sub.4)
for a 20% cell suspension by mass. 1000 .mu.l of glass beads (0.5
mm diameter) were added to a 1.5 ml Eppendorf tube and 875 .mu.l of
cell suspension was added. Yeast cells were lysed using a Retsch
MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing 6.times.1 min
each at full speed with 1 min incubations on ice between each
bead-beating step. The tubes were centrifuged for 10 min at
23,500.times.g at 4.degree. C. and the supernatant was removed for
use. These lysates were held on ice until assayed. Yeast lysate
protein concentration was determined using the BioRad Bradford
Protein Assay Reagent Kit (Cat# 500-0006, BioRad Laboratories,
Hercules, Calif.) and using BSA for the standard curve. Briefly 10
.mu.L standard or lysate were added into a microcentrifuge tube.
The samples were diluted to fit in the linear range of the standard
curve (1:40). 500 .mu.L of diluted and filtered Bio-Rad protein
assay dye was added to the blank and samples and then vortexed.
Samples were incubated at room temperature for 6 min, transferred
into cuvettes and the OD.sub.595 was determined in a
spectrophotometer. The linear regression of the standards was then
used to calculate the protein concentration in each sample. For
DHAD assays technical triplicates were performed for each sample.
In addition, a no lysate control with lysis buffer was performed.
To assay each sample, 10 .mu.L of an appropriate dilution of lysate
in assay buffer was mixed with 90 .mu.L of assay buffer (5 .mu.L of
0.1 M MgSO.sub.4, 10 .mu.L of 0.1 M DHIV, and 75 .mu.L 50 mM Tris
pH 8.0), and incubated in a thermocycler for 30 minutes at
35.degree. C., then at 95.degree. C. for 5 minutes. Cell debris and
precipitate were removed from the samples by centrifugation at
3000.times.g for 5 min.
[0395] Finally, 75 .mu.L of supernatant was transferred to new PCR
tubes and analyzed by Liquid Chromatography for the
2-keto-isovalerate (KIV) product. DNPH reagent (12 mM
2,4-dinitrophenylhydrazine, 20 mM citric acid pH 3.0, 80%
acetonitrile, 20% MilliQ H.sub.2O) was added to each sample in a 1:1
ratio. Samples were incubated for 30 min at 70.degree. C. in a
thermo-cycler (Eppendorf, Mastercycler). Analysis of KIV was
performed on an HP-1200 High Performance Liquid Chromatography
system equipped with an Eclipse XDB C-18 reverse phase column
(Agilent) and a C-18 reverse phase column guard (Phenomenex).
Ketoisovalerate was detected using an HP-1100 UV detector (360 nm).
The column temperature was 50.degree. C. The method was isocratic,
with 70% acetonitrile, 2.5% phosphoric acid (4%), and 27.5% water as
the mobile phase. Flow was set to 3 mL/min. Injection size was 10 .mu.L
and run time was 2 min.
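A specific activity in U/mg can be derived from the measured KIV concentration along the following lines. This is a sketch only: it assumes that 1 U corresponds to 1 .mu.mol KIV formed per minute (the unit is not defined in this passage), uses the 100 .mu.L reaction volume, 10 .mu.L of diluted lysate, and 30 minute incubation described above, and takes hypothetical values for the KIV readings, protein concentration, and lysate dilution.

```python
# Hypothetical sketch: DHAD specific activity from an HPLC-derived KIV reading.
# Assumption (not stated in the source): 1 U = 1 umol KIV formed per minute.

def dhad_specific_activity(kiv_mM, blank_kiv_mM, protein_mg_per_mL,
                           lysate_dilution, reaction_uL=100.0,
                           lysate_uL=10.0, minutes=30.0):
    """Return specific activity in U/mg lysate protein."""
    # mM equals umol/mL, so umol = mM * reaction volume in mL.
    kiv_umol = (kiv_mM - blank_kiv_mM) * reaction_uL / 1000.0
    # Protein added to the reaction, accounting for the lysate dilution.
    protein_mg = protein_mg_per_mL / lysate_dilution * lysate_uL / 1000.0
    return kiv_umol / minutes / protein_mg

# Illustrative inputs: 0.65 mM KIV in the sample, 0.05 mM in the no-lysate
# control, 5 mg/mL lysate protein, lysate diluted 1:10 before the assay.
activity = dhad_specific_activity(kiv_mM=0.65, blank_kiv_mM=0.05,
                                  protein_mg_per_mL=5.0, lysate_dilution=10.0)
print(f"specific activity = {activity:.2f} U/mg")
```

Subtracting the no-lysate control removes any KIV signal that does not depend on the enzyme.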
[0396] The data at 72 hours are summarized in Table 33. The data
demonstrate that the DHADs contained in plasmids pGV2635, 2677,
2674, 2672, 2673 and 2676 led to isobutanol titers of at least
2.5 g/L and are considered to be significantly active in
the cytosolic isobutanol pathway. The DHADs contained in plasmids
pGV2675, 2681, 2680, 2678, 2679, and 2671 led to isobutanol titers
below 2.5 g/L and are considered to be
inactive or poorly active in the cytosolic isobutanol pathway.
TABLE-US-00036 TABLE 33 Isobutanol production with selected DHADs.
Plasmid (DHAD Gene) | OD.sub.600 | Isobutanol produced [g/L] | DHAD activity (U/mg)
pGV2635 (L. lactis) | 8.6 .+-. 0.6 | 9.02 .+-. 0.28 | 0.62 .+-. 0.01
pGV2677 (N. crassa) | 9.4 .+-. 0.6 | 6.30 .+-. 0.85 | 0.42 .+-. 0.02
pGV2674 (F. tularensis) | 7.5 .+-. 0.7 | 6.22 .+-. 0.31 | 0.30 .+-. 0.00
pGV2672 (G. forsetii) | 8.1 .+-. 0.6 | 6.10 .+-. 0.26 | 0.20 .+-. 0.00
pGV2673 (S. erythraea) | 8.0 .+-. 1.1 | 3.23 .+-. 0.12 | 0.03 .+-. 0.00
pGV2676 (S. cerevisiae ilv3.DELTA.N23) | 5.2 .+-. 0.2 | 2.67 .+-. 0.06 | 0.02 .+-. 0.00
pGV2675 (S. cerevisiae ilv3.DELTA.N19) | 5.0 .+-. 0.2 | 2.27 .+-. 0.16 | 0.09 .+-. 0.00
pGV2681 (E. coli) | 6.9 .+-. 0.6 | 2.21 .+-. 0.09 | 0.03 .+-. 0.00
pGV2680 (Lyngbya spp.) | 6.9 .+-. 1.3 | 2.13 .+-. 0.09 | 0.02 .+-. 0.00
pGV2678 (Acidobacteria) | 7.5 .+-. 0.2 | 2.06 .+-. 0.17 | 0.03 .+-. 0.00
pGV2679 (A. marina) | 7.5 .+-. 0.6 | 2.05 .+-. 0.06 | 0.03 .+-. 0.00
pGV2671 (S. cerevisiae) | 5.5 .+-. 0.0 | 1.92 .+-. 0.03 | 0.44 .+-. 0.01
pGV2663 (none) | 6.7 .+-. 0.2 | 1.53 .+-. 0.18 | 0.01 .+-. 0.01
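The 2.5 g/L cutoff applied to Table 33 can be checked with a short sketch. The titers are the mean values from Table 33; the variable names are illustrative, and pGV2663 is the no-DHAD control rather than a DHAD candidate.

```python
# Mean isobutanol titers (g/L) at 72 hours, copied from Table 33.
titers = {
    "pGV2635": 9.02, "pGV2677": 6.30, "pGV2674": 6.22, "pGV2672": 6.10,
    "pGV2673": 3.23, "pGV2676": 2.67, "pGV2675": 2.27, "pGV2681": 2.21,
    "pGV2680": 2.13, "pGV2678": 2.06, "pGV2679": 2.05, "pGV2671": 1.92,
    "pGV2663": 1.53,  # control plasmid without a DHAD gene
}
THRESHOLD_G_PER_L = 2.5  # cutoff used in the text

active = [p for p, t in titers.items() if t >= THRESHOLD_G_PER_L]
below_cutoff = [p for p, t in titers.items() if t < THRESHOLD_G_PER_L]

print("at or above cutoff:", ", ".join(active))
print("below cutoff:", ", ".join(below_cutoff))
```

With these values, pGV2676 (2.67 g/L) falls on the active side of the cutoff.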
Example 19
Overexpression of the L. lactis ilvD in K. lactis and K.
marxianus
[0397] The purpose of this example is to demonstrate activity of L.
lactis DHAD in K. lactis and in K. marxianus.
[0398] Strains, plasmids, and sequences disclosed herein are listed
in Tables 34, 35, and 36, respectively.
TABLE-US-00037 TABLE 34 Genotype of strains disclosed in Example 19.
GEVO Number | Genotype
K. marxianus strain GEVO2504 | K. marxianus NRRL-Y-7571 ura3-delta2 pdc1.DELTA.::Ll.kivd2 coSc. P.sub.TDH3:Dm_ADH:P.sub.FBA1:URA3:P.sub.Sc.sub.--.sub.FBA1:31COX4_MTS:Bs_alsS1_coSc
K. marxianus strain GEVO2543 | ura3-delta2 pdc1.DELTA.::.DELTA.::{Ll_kivd2 co:P.sub.Sc.sub.--.sub.TDH3:Ec_ilvC.sup.Q110V coSC:P.sub.Sc.sub.--.sub.TPI1:G418.sup.R:P.sub.Sc.sub.--.sub.CUP1:Bs_alsS1_coSc}
K. marxianus strain GEVO2598 | ura3-delta2 pdc1.DELTA.::{Ll_kivd2 co:P.sub.Sc.sub.--.sub.TDH3:Ec_ilvC.sup.Q110V coSC:P.sub.Sc.sub.--.sub.TPI1:G418.sup.R:P.sub.Sc.sub.--.sub.CUP1:Bs_alsS1_coSc} + random integration of {P.sub.Sc.sub.--.sub.TEF1:Ll_ilvD_coSc URA3}
K. lactis strain GEVO1287 | MATalpha uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] ATCC 200826
TABLE-US-00038 TABLE 35 Plasmids disclosed in Example 19.
Plasmid Name | Relevant Genes/Usage | Genotype
pGV2271 | Empty 1.6 micron vector that can be maintained in K. lactis. Encodes hygromycin resistance. | 1.6.mu. ori, bla, hygroR
pGV2273 | 1.6 micron vector for expression of KARI, KIVD, DHAD and ADH in K. lactis | P.sub.TDH3:Ec_ilvC_P2D1-A1, P.sub.TEF1:Ll_ilvD_coSc, P.sub.PGK1:Ll_kivD2_coEc, P.sub.ENO2:Ll_adhA, 1.6.mu. ori, bla, HygroR
pGV2069 | 2 micron plasmid for expression of KIVD, DHAD, KARI, and ALS in K. marxianus | P.sub.TDH3:Ec_ilvC_coSc.sup.Q110V, P.sub.TEF1:Ll_ilvD_coSc, P.sub.PGK1:Ll_kivD2_coEc, P.sub.CUP1:Bs_alsS1_coSc, P.sub.ENO2:Dm_adhA, 2.mu. ori, bla, G418
pGV1855 | 2 micron plasmid for expression of DHAD in K. marxianus | P.sub.TEF1:Ll_ilvD, 2.mu. ori, bla, URA
TABLE-US-00039 TABLE 36 Amino acid and nucleotide sequences of enzymes and genes disclosed in Example 19.
Enz. | Source | Gene (SEQ ID NO) | Corresponding Protein (SEQ ID NO)
ALS | B. subtilis | Bs_alsS1_coSc (SEQ ID NO: 144) | Bs_AlsS1_coSc (SEQ ID NO: 169)
KARI | E. coli | Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98) | Ec_IlvC_coSc.sup.Q110V (SEQ ID NO: 170)
KARI | E. coli | Ec_ilvC_coSc.sup.P2D1-A1 (SEQ ID NO: 171) | Ec_ilvC_coSc.sup.P2D1-A1 (SEQ ID NO: 172)
KIVD | L. lactis | Ll_kivd2_coEc (SEQ ID NO: 99) | Ll_Kivd2_coEc (SEQ ID NO: 173)
DHAD | L. lactis | Ll_ilvD_coSc (SEQ ID NO: 155) | Ll_IlvD_coSc (SEQ ID NO: 18)
ADH | L. lactis | Ll_adhA (SEQ ID NO: 174) | Ll_adhA (SEQ ID NO: 175)
ADH | D. melanogaster | Dm_adh (SEQ ID NO: 116) | Dm_adh (SEQ ID NO: 176)
[0399] To generate GEVO2543, GEVO2504 was transformed with pGV2069
to integrate into the genome three genes: Bs_alsS1_coSc (SEQ ID NO:
144), Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98), and Ll_kivd2_coEc
(SEQ ID NO: 99). To generate GEVO2598, GEVO2543 was transformed
with pGV1855 to integrate the L. lactis ilvD gene which was codon
optimized for S. cerevisiae (gene sequence SEQ ID NO: 155, also
referred to as Ll_ilvD_coSc; protein sequence SEQ ID NO: 18) into
the chromosome. GEVO1287 was transformed with either pGV2271
(control plasmid) or pGV2273, which contains Ll_ilvD_coSc.
[0400] GEVO2543, GEVO2598 and GEVO1287 transformed with pGV2271 or
pGV2273 were inoculated into 3 mL of YPD (for GEVO2543 and
GEVO2598) or YPD supplemented with 0.1 g/L hygromycin (for
GEVO1287) for an overnight culture. After approximately 18 hours, a
50 ml YPD culture in a baffled 250 ml shake flask was inoculated to
an OD.sub.600 of 0.15 and shaken at 250 rpm for approximately 9 hours.
Next, DHAD activity and protein concentrations were measured.
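The inoculation step is a simple dilution, which can be sketched as below. The overnight culture density is a hypothetical value, and the calculation ignores the small change in total volume caused by adding the inoculum.

```python
def inoculum_volume_ml(target_od, culture_volume_ml, overnight_od):
    """Volume of overnight culture needed to reach target_od in the flask."""
    return target_od * culture_volume_ml / overnight_od

# 50 ml culture inoculated to an OD600 of 0.15, as in the text; the
# overnight OD600 of 7.5 is an illustrative assumption.
volume = inoculum_volume_ml(target_od=0.15, culture_volume_ml=50.0,
                            overnight_od=7.5)
print(f"add {volume:.2f} ml of overnight culture")
```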
[0401] Over-expression of the L. lactis ilvD gene resulted in an
increase in DHAD activity (U/mg total cell lysate protein). Table
37 shows the DHAD activity (U/mg total cell lysate protein)
averages from technical triplicates comparing strains expressing
the L. lactis DHAD to strains not expressing the L. lactis DHAD
gene.
TABLE-US-00040 TABLE 37 DHAD activity in whole cell yeast lysates.
Strain | Activity [mU/mg]
K. marxianus strain GEVO2543 (no DHAD) | 0.010 .+-. 0.002
K. marxianus strain GEVO2598 (DHAD) | 0.016 .+-. 0.001
K. lactis strain GEVO1287 + pGV2271 (No DHAD) | 0.052 .+-. 0.003
K. lactis strain GEVO1287 + pGV2273 (DHAD) | 0.122 .+-. 0.011
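The size of the increase in Table 37 can be expressed as a fold change with an approximate uncertainty. The sketch below assumes that the reported .+-. values are standard deviations and that the two measurements are independent; neither is stated in the source.

```python
import math

def fold_change(value, value_sd, reference, reference_sd):
    """Ratio of two means with first-order error propagation.

    Assumes independent errors, so relative errors add in quadrature.
    """
    ratio = value / reference
    relative_error = math.sqrt((value_sd / value) ** 2 +
                               (reference_sd / reference) ** 2)
    return ratio, ratio * relative_error

# Mean activities and spreads copied from Table 37.
k_marxianus = fold_change(0.016, 0.001, 0.010, 0.002)  # GEVO2598 vs GEVO2543
k_lactis = fold_change(0.122, 0.011, 0.052, 0.003)     # pGV2273 vs pGV2271

print(f"K. marxianus: {k_marxianus[0]:.2f}-fold (+/- {k_marxianus[1]:.2f})")
print(f"K. lactis:    {k_lactis[0]:.2f}-fold (+/- {k_lactis[1]:.2f})")
```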
Example 20
L. lactis ilvD Activity is Localized to the Yeast Cytosol
[0402] The purpose of this example is to demonstrate that the
Lactococcus lactis ilvD protein localizes to the cytosol when
expressed in a yeast strain.
[0403] The S. cerevisiae strain GEVO1187 (S. cerevisiae CEN.PK2,
MATa ura3 leu2 his3 trp1 ADE2) was transformed with plasmid
pGV2484, a 2 micron plasmid expressing the L. lactis ilvD gene
which was codon optimized for S. cerevisiae (gene sequence SEQ ID
NO: 155, also referred to as Ll_ilvD_coSc; protein sequence SEQ ID
NO: 18) under the S. cerevisiae TEF1 promoter
(P.sub.TEF1:Ll_ilvD_coSc, 2.mu. ori, bla, G418R). Briefly, the
strain was grown in YPD to an OD.sub.600 of 0.6-0.8. Cells were
washed in H.sub.2O, and then resuspended in 100 mM Lithium acetate.
In a 1.5 mL tube, 15 .mu.L of the cell suspension was added to a
mixture of DNA (final volume of 15 .mu.l with sterile water), 72
.mu.l 50% PEG, 10 .mu.l 1M lithium acetate, and 3 .mu.l of
denatured salmon sperm DNA (10 mg/mL). The transformation
suspension was vortexed for 5 short pulses. The mixture was
incubated at 30.degree. C. for 30 minutes, followed by incubation
for 22 minutes at 42.degree. C. The cells were collected by
centrifugation (18,000.times.g, 10 seconds, 25.degree. C.). The
cells were resuspended in 1 ml YPD medium (1% (w/v) yeast extract,
2% (w/v) peptone, 2% (w/v) glucose, pH 5) and after an overnight
recovery shaking at 30.degree. C. and 250 rpm, the cells were
spread over YPD agar plates supplemented with 0.2 g/L G418.
Transformants were then single colony purified onto G418 selective
plates.
[0404] All isolations of crude mitochondrial fractions were
performed in duplicate. GEVO1187 and GEVO1187 transformed with
pGV2484 were each grown in 100 mL of YPG medium (1% (w/v) yeast
extract, 2% (w/v) peptone, 3% (v/v) glycerol, pH5) overnight at
30.degree. C. and 250 rpm. This overnight culture was used to
inoculate 840 mL of YPG in a 2800 mL baffled flask at an OD.sub.600
of 0.03, and cells were grown at 30.degree. C. and 250 rpm for
20-28 h. At an OD.sub.600 of about 2.0, cells were harvested by
centrifugation at 3000.times.g for 5 minutes, resuspended in 100 mL
H.sub.2O followed by centrifugation at 3000.times.g for 5 minutes.
Cells were incubated in 2 mL/g CWW (cell wet weight) of DTT buffer
(100 mM Tris-H.sub.2SO.sub.4 pH 9.4, 10 mM DTT) for 20 minutes at
30.degree. C. Cells were resuspended in 7 mL/g CWW Zymolyase buffer
(1.2 M sorbitol, 20 mM Potassium phosphate pH 7.4) and then
centrifuged at 3000.times.g for 5 minutes. Cells were spheroplasted
by incubating in Zymolyase buffer with Zymolyase (Seikagaku
Biobusiness Corporation #120491-1; 3 mg/g CWW) for 45 minutes at
30.degree. C. on a rocking platform. 100 OD of spheroplasts were
set aside for whole cell lysate preparation (see below).
Spheroplasts were resuspended in Zymolyase buffer and centrifuged
at 3000.times.g for 5 minutes before resuspension in 6.5 mL/g CWW
homogenization buffer (chilled to 4.degree. C.; 6.5 mL/g 0.6 M
sorbitol, 10 mM Tris-HCl pH 7.4, 1 mM EDTA, 1 mM PMSF, 0.2% (w/v)
BSA). Spheroplasts were homogenized on ice with 15 strokes of a
pre-chilled glass-Teflon homogenizer (40 mL capacity), and the
sample was diluted 2-fold with homogenization buffer. Cell debris
and nuclei were pelleted by serial supernatant centrifugations of
1500.times.g for 5 minutes, and 4000.times.g for 5 minutes. The
mitochondrial fraction was isolated by centrifugation at
12,000.times.g for 15 minutes. The crude mitochondrial pellet was
resuspended in 10 mL SEM buffer (250 mM sucrose, 1 mM EDTA, 10 mM
MOPS-KOH pH 7.2), centrifuged at 4000.times.g for 5 minutes to
further remove cellular debris and nuclei before recovering the
mitochondrial fraction by centrifugation at 12,000.times.g for 15
minutes. The mitochondrial fraction may contain markers of the
plasma membrane, the endoplasmic reticulum, and vacuoles in
addition to markers of the mitochondria. Mitochondrial pellet was
resuspended in 750 .mu.L SEM Buffer+Protease Arrest (GBiosciences
#786-108).
[0405] Preparation of whole cell yeast lysates was performed using
the 100 ODs of yeast cells set aside after spheroplasting (see
above) by resuspending cells in 20% (w/v) SEM Buffer+1.times.
Protease Arrest (GBiosciences #786-108). 1000 .mu.l of glass beads
(0.5 mm diameter) were added to a 1.5 ml eppendorf tube, and 875
.mu.l of cell suspension was added. Yeast cells were lysed using a
Retsch MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing
6.times.1 min each at full speed with 1 min incubations on ice
between each bead-beating step. The tubes were centrifuged for 10
min at 23,500.times.g at 4.degree. C., the supernatant was removed,
aliquoted, flash frozen in liquid nitrogen, and stored at
-80.degree. C.
[0406] The resuspended mitochondrial fraction (see above) was added
to 1000 .mu.l of glass beads (0.1 mm diameter) in a 1.5 ml
Eppendorf tube. Additional buffer was added if necessary to fill
the tube completely. The mitochondrial fraction was lysed using a
Retsch MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing
3.times.1 minute each at full speed with 1 minute incubations on
ice between each bead-beating step. The tubes were centrifuged for
10 min at 23,500.times.g at 4.degree. C., the supernatant was
removed, aliquoted, flash frozen in liquid nitrogen, and stored at
-80.degree. C.
[0407] Whole cell yeast lysate and mitochondrial fraction lysate
protein concentration was determined using the BioRad Bradford
Protein Assay Reagent Kit (Cat# 500-0006, BioRad Laboratories,
Hercules, Calif.) and using BSA for the standard curve. Briefly, 10
.mu.L standard or lysate were added into a microcentrifuge tube.
The samples were diluted to fit in the linear range of the standard
curve (1:10-1:40). 500 .mu.L of diluted and filtered Bio-Rad
protein assay dye was added to the blank and samples and then
vortexed. Samples were incubated at room temperature for 6 min,
transferred into cuvettes and the OD.sub.595 was determined in a
spectrophotometer. The linear regression of the standards was then
used to calculate the protein concentration in each sample.
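The Bradford calculation described above amounts to fitting a line through the BSA standards and reading each diluted sample off that line. The following sketch uses hypothetical absorbance readings and BSA concentrations; only the 1:40 dilution comes from the text.

```python
# Hypothetical sketch: protein concentration from a Bradford standard curve.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def protein_concentration(od595, slope, intercept, dilution_factor,
                          od_min, od_max):
    """Undiluted protein concentration (mg/mL) from a diluted sample reading."""
    if not od_min <= od595 <= od_max:
        # Readings outside the standards are not covered by the linear fit.
        raise ValueError("OD595 outside the standard curve; adjust the dilution")
    return (od595 - intercept) / slope * dilution_factor

# Illustrative BSA standards (mg/mL) and their OD595 readings.
bsa_mg_per_ml = [0.1, 0.2, 0.4, 0.6, 0.8]
bsa_od595 = [0.14, 0.23, 0.41, 0.59, 0.77]

slope, intercept = fit_line(bsa_mg_per_ml, bsa_od595)
lysate = protein_concentration(0.50, slope, intercept, dilution_factor=40,
                               od_min=min(bsa_od595), od_max=max(bsa_od595))
print(f"lysate protein = {lysate:.1f} mg/mL")
```

The range check mirrors the instruction to dilute samples until they fall within the linear range of the standard curve.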
[0408] Three samples of each of the mitochondrial and whole cell
yeast lysates were assayed for DHAD activity, along with no lysate
controls. Table 38 shows the DHAD activity (U/mg protein) averages
from duplicate cultures comparing strains GEVO1187 (no DHAD
expression) to GEVO1187 transformed with pGV2484 (L. lactis DHAD
expressed from pGV2484). DHAD activity was measured in the whole
cell yeast lysate and the mitochondrial fraction lysate. Expression
of DHAD from pGV2484 resulted in about a 7-fold increase in DHAD
activity in the whole cell yeast lysate. Expression of DHAD from
pGV2484 did not affect DHAD activity localized to the mitochondrial
fraction. Subtracting the background activity in the GEVO1187 whole
cell yeast lysate (0.27 mU/mg) from the activity in the whole cell
yeast lysate of GEVO1187 transformed with pGV2484 (1.87 mU/mg)
gives an increase of 1.60 mU/mg. These data suggest that L. lactis
DHAD activity does not localize to the organellar structures
harvested in the mitochondrial fraction, and is therefore cytosolic
when expressed in a yeast strain.
TABLE-US-00041 TABLE 38 DHAD activity in whole cell yeast lysates and mitochondrial fraction lysates.
Strain | Lysate | Activity [mU/mg]
GEVO1187 | Whole cell | 0.27 .+-. 0.07
GEVO1187 transformed with pGV2484 | Whole cell | 1.87 .+-. 0.14
GEVO1187 | Mitochondrial | 3.76 .+-. 0.01
GEVO1187 transformed with pGV2484 | Mitochondrial | 3.85 .+-. 0.13
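The background subtraction and fold change quoted for Table 38 can be reproduced with a few lines. The values are the means from Table 38; the dictionary keys and variable names are illustrative.

```python
# Mean DHAD activities (mU/mg) copied from Table 38.
whole_cell = {"GEVO1187": 0.27, "GEVO1187 + pGV2484": 1.87}
mitochondrial = {"GEVO1187": 3.76, "GEVO1187 + pGV2484": 3.85}

def net_increase(fraction):
    """Activity of the transformed strain minus the untransformed background."""
    return fraction["GEVO1187 + pGV2484"] - fraction["GEVO1187"]

whole_cell_increase = net_increase(whole_cell)
whole_cell_fold = whole_cell["GEVO1187 + pGV2484"] / whole_cell["GEVO1187"]
mitochondrial_increase = net_increase(mitochondrial)

print(f"whole cell:    +{whole_cell_increase:.2f} mU/mg "
      f"({whole_cell_fold:.1f}-fold)")
print(f"mitochondrial: +{mitochondrial_increase:.2f} mU/mg")
```

The mitochondrial difference of 0.09 mU/mg is smaller than the .+-. 0.13 reported for the transformed strain, which is consistent with the statement that expression from pGV2484 did not affect activity in that fraction.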
Example 21
Overexpression of the L. lactis ilvD in Issatchenkia orientalis
[0409] The purpose of this example is to demonstrate cytosolic
activity of L. lactis DHAD in I. orientalis.
[0410] An engineered strain derived from the wild-type I.
orientalis strain ATCC PTA-6658 was further modified to contain
copies of all five isobutanol pathway genes integrated into the
chromosome. First, both alleles of the PDC1 locus were deleted in
series (See e.g. WO/2007/106524, which is herein incorporated by
reference in its entirety). The deletion event also simultaneously
integrated a copy of B. subtilis alsS gene and a copy of the L.
lactis kivD gene which encode SEQ ID NOs: 169 and 173,
respectively. This resulted in a Pdc- strain with two integrated
copies of the B. subtilis alsS gene and two integrated copies of
the L. lactis kivD gene (pdc1.DELTA.::Ll_kivD:Bs_alsS/
pdc1.DELTA.::Ll_kivD:Bs_alsS). This strain was further engineered to
delete a single allele of the GPD1 locus (See e.g. WO/2007/106524).
The deletion event also simultaneously integrated a single copy of
the L. lactis adhA.sup.RE1, the E. coli ilvC.sup.P2D1-A1, and L.
lactis ilvD which encode the proteins shown in SEQ ID NOs: 177,
172, and 18, respectively. This resulted in a Pdc- Gpd+ strain with
one integrated copy of the Ll_adhA.sup.RE1, Ec_ilvC.sup.P2D1-A1 and
Ll_ilvD genes (GPD1/gpd1.DELTA.::[Ll_adhA.sup.RE1:
Ec_ilvC.sup.P2D1-A1:URA3:Ll_ilvD]). This strain is GEVO4306 (Table
39).
[0411] To generate a control strain which does not express the
pathway genes, both alleles of the PDC1 locus were deleted in
series but with no simultaneous integration of heterologous genes.
Next one of the two GPD1 alleles was deleted with no simultaneous
integration of heterologous genes. The resulting control strain is
GEVO4308 (pdc1.DELTA.::loxP/pdc1.DELTA.::loxP
GPD1/gpd1.DELTA.::loxP:URA3:loxP) (Table 39).
TABLE-US-00042 TABLE 39 Genotype of strains disclosed in Example 21.
GEVO Number | Genotype
4306 | pdc1.DELTA.::[Ll_kivD:Bs_alsS]/pdc1.DELTA.::[Ll_kivD:Bs_alsS] GPD1/gpd1.DELTA.::[Ll_adhA.sup.RE1:Ec_ilvC.sup.P2D1-A1:URA3:Ll_ilvD]
4308 | pdc1.DELTA.::loxP/pdc1.DELTA.::loxP GPD1/gpd1.DELTA.::loxP:URA3:loxP
[0412] Over-expression of the L. lactis ilvD gene resulted in an
increase in DHAD activity (U/mg total cell lysate protein). Table
40 shows the DHAD activity (U/mg total cell lysate protein)
averages from technical triplicates comparing the strain expressing
the L. lactis DHAD gene to the strain not expressing the L. lactis
DHAD gene. Expression of the L. lactis ilvD gene, when expressed
with the remainder of the isobutanol pathway, resulted in
isobutanol production as seen in Table 40.
TABLE-US-00043 TABLE 40 DHAD activity in whole cell yeast lysates and isobutanol titer after 72 hr fermentation.
Strain | Activity [mU/mg] | Isobutanol titer g/L
GEVO4306 | 0.041 .+-. 0.009 | 0.56 .+-. 0.01
GEVO4308 | 0.012 .+-. 0.002 | 0.00 .+-. 0.00
Example 22
Cytosolic ALS Homologs that Support Isobutanol Production
[0413] This example demonstrates isobutanol production using
expression of cytosolically localized ALS genes in the presence of
the rest of the isobutanol pathway. The ALS genes were integrated
into the PDC1 locus of S. cerevisiae strain GEVO1187 and isobutanol
production was achieved by expression from plasmid of the other
genes in the isobutanol pathway. Isobutanol production in strains
carrying the ALS genes from T. atroviride (Ta_ALS) and T.
stipitatus (Ts_ALS) was compared to isobutanol production in
strains carrying the ALS gene from B. subtilis. Plasmids described
in this example are listed in Table 41.
TABLE-US-00044 TABLE 41 Plasmids disclosed in Example 22.
Plasmid name | Relevant Genes/Usage | Genotype
pGV1730 | Integration plasmid that will integrate P.sub.CUP1-1:Bs_alsS2 into PDC1 using digestion with NruI for targeting. This was the parent vector for cloning the ALS homologs. | See Table 14.
pGV1773 | Vector with Bacillus subtilis AlsS codon optimized for S. cerevisiae. | P.sub.PDC1:Bs_AlsS1_coSc, P.sub.TDH3:Ll_kivD, P.sub.ADH1:Sc_ADH7_coSc, URA3 5'-end, pUC ORI, kan.sup.R.
pGV1802 | DNA2.0 plasmid carrying the Trichoderma atroviride ALS. | Ta_ALS_coSc in DNA 2.0 vector
pGV1803 | DNA2.0 plasmid carrying the Talaromyces stipitatus ALS. | Ts_ALS_coSc in DNA 2.0 vector
pGV2082 | High copy 2.mu. plasmid with 4 isobutanol pathway genes without an ALS gene. | Ec_ilvC_coSc.sup.Q110V, Ll_ilvD_coSc, Ll_kivD2_coEc, and Dm_ADH, 2.mu. ori, bla, G418R.
pGV2114 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Bacillus subtilis AlsS gene codon optimized for S. cerevisiae. | See Table 14.
pGV2117 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Trichoderma atroviride ALS gene codon optimized for S. cerevisiae. | See Table 14.
pGV2118 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Talaromyces stipitatus ALS gene codon optimized for S. cerevisiae. | See Table 14.
[0414] Strains with integrated ALS genes expressed from the CUP1
promoter were transformed with pGV2082, which carries the other 4
isobutanol pathway genes: Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98), Ll_ilvD
(SEQ ID NO: 155), Ll_kivd2_coEc (SEQ ID NO: 99), and Dm_ADH (SEQ ID
NO: 116).
[0415] GEVO2618, GEVO2621, and GEVO2622 (see Table 13) were each
transformed with pGV2082. Control strains GEVO2280 (B. subtilis
alsS2) (Table 13) and GEVO1187 (no ALS) (Table 13) were also
transformed with pGV2082.
[0416] Fermentations of the transformed strains GEVO1187, GEVO2280,
GEVO2618, GEVO2621, GEVO2622 were performed. Strains encoding the
ALS from T. atroviride (SEQ ID NO: 71) and T. stipitatus (SEQ ID
NO: 72) produced more isobutanol than the strain containing the B.
subtilis alsS2. The strain containing Bs_AlsS1_coSc produced the most
isobutanol. Table 42 shows the final OD, glucose consumption, and
isobutanol titer for each of the strains. The integration of the
cytosolic genes Ta_ALS_coSc and Ts_ALS_coSc led to production of
isobutanol that was in each case 6-fold above that of a strain
without an integrated ALS gene, demonstrating that these strains
are producing isobutanol using a cytosolic pathway.
TABLE-US-00045 TABLE 42 Results of fermentations with cytosolic ALS homologs at 72 hrs.
Strain | OD.sub.600 | Glucose consumed g/L | Isobutanol produced g/L
GEVO1187 | 10.9 .+-. 0.3 | 233 .+-. 36 | 0.3 .+-. 0.0
GEVO2280 | 9.9 .+-. 0.3 | 274 .+-. 26 | 1.3 .+-. 0.11
GEVO2618 | 9.4 .+-. 0.2 | 138 .+-. 9 | 2.6 .+-. 0.09
GEVO2621 | 9.9 .+-. 0.3 | 161 .+-. 52 | 1.9 .+-. 0.18
GEVO2622 | 10.8 .+-. 0.6 | 182 .+-. 47 | 1.8 .+-. 0.15
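The fold increases over the strain without an integrated ALS gene can be recomputed from Table 42, together with the isobutanol produced per gram of glucose consumed. The sketch uses the mean values from Table 42; the per-glucose yield is an added illustration and is not reported in the source.

```python
# (glucose consumed g/L, isobutanol produced g/L), means copied from Table 42.
results = {
    "GEVO1187": (233.0, 0.3),  # no integrated ALS gene
    "GEVO2280": (274.0, 1.3),
    "GEVO2618": (138.0, 2.6),
    "GEVO2621": (161.0, 1.9),
    "GEVO2622": (182.0, 1.8),
}

baseline_titer = results["GEVO1187"][1]

# Titer relative to the no-ALS strain.
fold_over_no_als = {strain: isobutanol / baseline_titer
                    for strain, (_, isobutanol) in results.items()}

# Grams of isobutanol per gram of glucose consumed (illustrative metric).
yield_per_glucose = {strain: isobutanol / glucose
                     for strain, (glucose, isobutanol) in results.items()}

for strain in results:
    print(f"{strain}: {fold_over_no_als[strain]:.1f}-fold, "
          f"{yield_per_glucose[strain]:.4f} g/g")
```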
[0417] The foregoing detailed description has been given for
clearness of understanding only and no unnecessary limitations
should be understood therefrom, as modifications will be obvious to
those skilled in the art.
[0418] While the invention has been described in connection with
specific embodiments thereof, it will be understood that it is
capable of further modifications and this application is intended
to cover any variations, uses, or adaptations of the invention
following, in general, the principles of the invention and
including such departures from the present disclosure as come
within known or customary practice within the art to which the
invention pertains and as may be applied to the essential features
hereinbefore set forth and as follows in the scope of the appended
claims.
[0419] The disclosures, including the claims, figures and/or
drawings, of each and every patent, patent application, and
publication cited herein are hereby incorporated herein by
reference in their entireties.
Sequence CWU 1
1
1771491PRTEscherichia coli 1Met Ala Asn Tyr Phe Asn Thr Leu Asn Leu
Arg Gln Gln Leu Ala Gln1 5 10 15Leu Gly Lys Cys Arg Phe Met Gly Arg
Asp Glu Phe Ala Asp Gly Ala 20 25 30Ser Tyr Leu Gln Gly Lys Lys Val
Val Ile Val Gly Cys Gly Ala Gln 35 40 45Gly Leu Asn Gln Gly Leu Asn
Met Arg Asp Ser Gly Leu Asp Ile Ser 50 55 60Tyr Ala Leu Arg Lys Glu
Ala Ile Ala Glu Lys Arg Ala Ser Trp Arg65 70 75 80Lys Ala Thr Glu
Asn Gly Phe Lys Val Gly Thr Tyr Glu Glu Leu Ile 85 90 95Pro Gln Ala
Asp Leu Val Ile Asn Leu Thr Pro Asp Lys Gln His Ser 100 105 110Asp
Val Val Arg Thr Val Gln Pro Leu Met Lys Asp Gly Ala Ala Leu 115 120
125Gly Tyr Ser His Gly Phe Asn Ile Val Glu Val Gly Glu Gln Ile Arg
130 135 140Lys Asp Ile Thr Val Val Met Val Ala Pro Lys Cys Pro Gly
Thr Glu145 150 155 160Val Arg Glu Glu Tyr Lys Arg Gly Phe Gly Val
Pro Thr Leu Ile Ala 165 170 175Val His Pro Glu Asn Asp Pro Lys Gly
Glu Gly Met Ala Ile Ala Lys 180 185 190Ala Trp Ala Ala Ala Thr Gly
Gly His Arg Ala Gly Val Leu Glu Ser 195 200 205Ser Phe Val Ala Glu
Val Lys Ser Asp Leu Met Gly Glu Gln Thr Ile 210 215 220Leu Cys Gly
Met Leu Gln Ala Gly Ser Leu Leu Cys Phe Asp Lys Leu225 230 235
240Val Glu Glu Gly Thr Asp Pro Ala Tyr Ala Glu Lys Leu Ile Gln Phe
245 250 255Gly Trp Glu Thr Ile Thr Glu Ala Leu Lys Gln Gly Gly Ile
Thr Leu 260 265 270Met Met Asp Arg Leu Ser Asn Pro Ala Lys Leu Arg
Ala Tyr Ala Leu 275 280 285Ser Glu Gln Leu Lys Glu Ile Met Ala Pro
Leu Phe Gln Lys His Met 290 295 300Asp Asp Ile Ile Ser Gly Glu Phe
Ser Ser Gly Met Met Ala Asp Trp305 310 315 320Ala Asn Asp Asp Lys
Lys Leu Leu Thr Trp Arg Glu Glu Thr Gly Lys 325 330 335Thr Ala Phe
Glu Thr Ala Pro Gln Tyr Glu Gly Lys Ile Gly Glu Gln 340 345 350Glu
Tyr Phe Asp Lys Gly Val Leu Met Ile Ala Met Val Lys Ala Gly 355 360
365Val Glu Leu Ala Phe Glu Thr Met Val Asp Ser Gly Ile Ile Glu Glu
370 375 380Ser Ala Tyr Tyr Glu Ser Leu His Glu Leu Pro Leu Ile Ala
Asn Thr385 390 395 400Ile Ala Arg Lys Arg Leu Tyr Glu Met Asn Val
Val Ile Ser Asp Thr 405 410 415Ala Glu Tyr Gly Asn Tyr Leu Phe Ser
Tyr Ala Cys Val Pro Leu Leu 420 425 430Lys Pro Phe Met Ala Glu Leu
Gln Pro Gly Asp Leu Gly Lys Ala Ile 435 440 445Pro Glu Gly Ala Val
Asp Asn Gly Gln Leu Arg Asp Val Asn Glu Ala 450 455 460Ile Arg Ser
His Ala Ile Glu Gln Val Gly Lys Lys Leu Arg Gly Tyr465 470 475
480Met Thr Asp Met Lys Arg Ile Ala Val Ala Gly 485
4902395PRTSaccharomyces cerevisiae 2Met Leu Arg Thr Gln Ala Ala Arg
Leu Ile Cys Asn Ser Arg Val Ile1 5 10 15Thr Ala Lys Arg Thr Phe Ala
Leu Ala Thr Arg Ala Ala Ala Tyr Ser 20 25 30Arg Pro Ala Ala Arg Phe
Val Lys Pro Met Ile Thr Thr Arg Gly Leu 35 40 45Lys Gln Ile Asn Phe
Gly Gly Thr Val Glu Thr Val Tyr Glu Arg Ala 50 55 60Asp Trp Pro Arg
Glu Lys Leu Leu Asp Tyr Phe Lys Asn Asp Thr Phe65 70 75 80Ala Leu
Ile Gly Tyr Gly Ser Gln Gly Tyr Gly Gln Gly Leu Asn Leu 85 90 95Arg
Asp Asn Gly Leu Asn Val Ile Ile Gly Val Arg Lys Asp Gly Ala 100 105
110Ser Trp Lys Ala Ala Ile Glu Asp Gly Trp Val Pro Gly Lys Asn Leu
115 120 125Phe Thr Val Glu Asp Ala Ile Lys Arg Gly Ser Tyr Val Met
Asn Leu 130 135 140Leu Ser Asp Ala Ala Gln Ser Glu Thr Trp Pro Ala
Ile Lys Pro Leu145 150 155 160Leu Thr Lys Gly Lys Thr Leu Tyr Phe
Ser His Gly Phe Ser Pro Val 165 170 175Phe Lys Asp Leu Thr His Val
Glu Pro Pro Lys Asp Leu Asp Val Ile 180 185 190Leu Val Ala Pro Lys
Gly Ser Gly Arg Thr Val Arg Ser Leu Phe Lys 195 200 205Glu Gly Arg
Gly Ile Asn Ser Ser Tyr Ala Val Trp Asn Asp Val Thr 210 215 220Gly
Lys Ala His Glu Lys Ala Gln Ala Leu Ala Val Ala Ile Gly Ser225 230
235 240Gly Tyr Val Tyr Gln Thr Thr Phe Glu Arg Glu Val Asn Ser Asp
Leu 245 250 255Tyr Gly Glu Arg Gly Cys Leu Met Gly Gly Ile His Gly
Met Phe Leu 260 265 270Ala Gln Tyr Asp Val Leu Arg Glu Asn Gly His
Ser Pro Ser Glu Ala 275 280 285Phe Asn Glu Thr Val Glu Glu Ala Thr
Gln Ser Leu Tyr Pro Leu Ile 290 295 300Gly Lys Tyr Gly Met Asp Tyr
Met Tyr Asp Ala Cys Ser Thr Thr Ala305 310 315 320Arg Arg Gly Ala
Leu Asp Trp Tyr Pro Ile Phe Lys Asn Ala Leu Lys 325 330 335Pro Val
Phe Gln Asp Leu Tyr Glu Ser Thr Lys Asn Gly Thr Glu Thr 340 345
350Lys Arg Ser Leu Glu Phe Asn Ser Gln Pro Asp Tyr Arg Glu Lys Leu
355 360 365Glu Lys Glu Leu Asp Thr Ile Arg Asn Met Glu Ile Trp Lys
Val Gly 370 375 380Lys Glu Val Arg Lys Leu Arg Pro Glu Asn Gln385
390 3953578PRTOryza sativa 3Met Ala Ala Ser Thr Thr Leu Ala Leu Ser
His Pro Lys Thr Leu Ala1 5 10 15Ala Ala Ala Ala Ala Ala Pro Lys Ala
Pro Thr Ala Pro Ala Ala Val 20 25 30Ser Phe Pro Val Ser His Ala Ala
Cys Ala Pro Leu Ala Ala Arg Arg 35 40 45Arg Ala Val Thr Ala Met Val
Ala Ala Pro Pro Ala Val Gly Ala Ala 50 55 60Met Pro Ser Leu Asp Phe
Asp Thr Ser Val Phe Asn Lys Glu Lys Val65 70 75 80Ser Leu Ala Gly
His Glu Glu Tyr Ile Val Arg Gly Gly Arg Asn Leu 85 90 95Phe Pro Leu
Leu Pro Glu Ala Phe Lys Gly Ile Lys Gln Ile Gly Val 100 105 110Ile
Gly Trp Gly Ser Gln Gly Pro Ala Gln Ala Gln Asn Leu Arg Asp 115 120
125Ser Leu Ala Glu Ala Lys Ser Asp Ile Val Val Lys Ile Gly Leu Arg
130 135 140Lys Gly Ser Lys Ser Phe Asp Glu Ala Arg Ala Ala Gly Phe
Thr Glu145 150 155 160Glu Ser Gly Thr Leu Gly Asp Ile Trp Glu Thr
Val Ser Gly Ser Asp 165 170 175Leu Val Leu Leu Leu Ile Ser Asp Ala
Ala Gln Ala Asp Asn Tyr Glu 180 185 190Lys Ile Phe Ser His Met Lys
Pro Asn Ser Ile Leu Gly Leu Ser His 195 200 205Gly Phe Leu Leu Gly
His Leu Gln Ser Ala Gly Leu Asp Phe Pro Lys 210 215 220Asn Ile Ser
Val Ile Ala Val Cys Pro Lys Gly Met Gly Pro Ser Val225 230 235
240Arg Arg Leu Tyr Val Gln Gly Lys Glu Ile Asn Gly Ala Gly Ile Asn
245 250 255Ser Ser Phe Ala Val His Gln Asp Val Asp Gly Arg Ala Thr
Asp Val 260 265 270Ala Leu Gly Trp Ser Val Ala Leu Gly Ser Pro Phe
Thr Phe Ala Thr 275 280 285Thr Leu Glu Gln Glu Tyr Lys Ser Asp Ile
Phe Gly Glu Arg Gly Ile 290 295 300Leu Leu Gly Ala Val His Gly Ile
Val Glu Ala Leu Phe Arg Arg Tyr305 310 315 320Thr Glu Gln Gly Met
Asp Glu Glu Met Ala Tyr Lys Asn Thr Val Glu 325 330 335Gly Ile Thr
Gly Ile Ile Ser Lys Thr Ile Ser Lys Lys Gly Met Leu 340 345 350Glu
Val Tyr Asn Ser Leu Thr Glu Glu Gly Lys Lys Glu Phe Asn Lys 355 360
365Ala Tyr Ser Ala Ser Phe Tyr Pro Cys Met Asp Ile Leu Tyr Glu Cys
370 375 380Tyr Glu Asp Val Ala Ser Gly Ser Glu Ile Arg Ser Val Val
Leu Ala385 390 395 400Gly Arg Arg Phe Tyr Glu Lys Glu Gly Leu Pro
Ala Phe Pro Met Gly 405 410 415Asn Ile Asp Gln Thr Arg Met Trp Lys
Val Gly Glu Lys Val Arg Ser 420 425 430Thr Arg Pro Glu Asn Asp Leu
Gly Pro Leu His Pro Phe Thr Ala Gly 435 440 445Val Tyr Val Ala Leu
Met Met Ala Gln Ile Glu Val Leu Arg Lys Lys 450 455 460Gly His Ser
Tyr Ser Glu Ile Ile Asn Glu Ser Val Ile Glu Ser Val465 470 475
480Asp Ser Leu Asn Pro Phe Met His Ala Arg Gly Val Ala Phe Met Val
485 490 495Asp Asn Cys Ser Thr Thr Ala Arg Leu Gly Ser Arg Lys Trp
Ala Pro 500 505 510Arg Phe Asp Tyr Ile Leu Thr Gln Gln Ala Phe Val
Thr Val Asp Lys 515 520 525Asp Ala Pro Ile Asn Gln Asp Leu Ile Ser
Asn Phe Met Ser Asp Pro 530 535 540Val His Gly Ala Ile Glu Val Cys
Ala Glu Leu Arg Pro Thr Val Asp545 550 555 560Ile Ser Val Pro Ala
Asn Ala Asp Phe Val Arg Pro Glu Leu Arg Gln 565 570 575Ser
Ser4329PRTMethanococcus maripaludis 4Met Lys Val Phe Tyr Asp Ser
Asp Phe Lys Leu Asp Ala Leu Lys Glu1 5 10 15Lys Thr Ile Ala Val Ile
Gly Tyr Gly Ser Gln Gly Arg Ala Gln Ser 20 25 30Leu Asn Met Lys Asp
Ser Gly Leu Asn Val Val Val Gly Leu Arg Lys 35 40 45Asn Gly Ala Ser
Trp Glu Asn Ala Lys Ala Asp Gly His Asn Val Met 50 55 60Thr Ile Glu
Glu Ala Ala Glu Lys Ala Asp Ile Ile His Ile Leu Ile65 70 75 80Pro
Asp Glu Leu Gln Ala Glu Val Tyr Glu Ser Gln Ile Lys Pro Tyr 85 90
95Leu Lys Glu Gly Lys Thr Leu Ser Phe Ser His Gly Phe Asn Ile His
100 105 110Tyr Gly Phe Ile Val Pro Pro Lys Gly Val Asn Val Val Leu
Val Ala 115 120 125Pro Lys Ser Pro Gly Lys Met Val Arg Arg Thr Tyr
Glu Glu Gly Phe 130 135 140Gly Val Pro Gly Leu Ile Cys Ile Glu Ile
Asp Ala Thr Asn Asn Ala145 150 155 160Phe Asp Ile Val Ser Ala Met
Ala Lys Gly Ile Gly Leu Ser Arg Ala 165 170 175Gly Val Ile Gln Thr
Thr Phe Lys Glu Glu Thr Glu Thr Asp Leu Phe 180 185 190Gly Glu Gln
Ala Val Leu Cys Gly Gly Val Thr Glu Leu Ile Lys Ala 195 200 205Gly
Phe Glu Thr Leu Val Glu Ala Gly Tyr Ala Pro Glu Met Ala Tyr 210 215
220Phe Glu Thr Cys His Glu Leu Lys Leu Ile Val Asp Leu Ile Tyr
Gln225 230 235 240Lys Gly Phe Lys Asn Met Trp Asn Asp Val Ser Asn
Thr Ala Glu Tyr 245 250 255Gly Gly Leu Thr Arg Arg Ser Arg Ile Val
Thr Ala Asp Ser Lys Ala 260 265 270Ala Met Lys Glu Ile Leu Lys Glu
Ile Gln Asp Gly Arg Phe Thr Lys 275 280 285Glu Phe Val Leu Glu Lys
Gln Val Asn His Ala His Leu Lys Ala Met 290 295 300Arg Arg Ile Glu
Gly Asp Leu Gln Ile Glu Glu Val Gly Ala Lys Leu305 310 315 320Arg
Lys Met Cys Gly Leu Glu Lys Glu 3255339PRTAcidiphilium cryptum 5Met
Arg Val Tyr Tyr Asp Ser Asp Ala Asp Val Asn Leu Ile Lys Ala1 5 10
15Lys Lys Val Ala Val Val Gly Tyr Gly Ser Gln Gly His Ala His Ala
20 25 30Leu Asn Leu Lys Glu Ser Gly Val Lys Glu Leu Val Val Ala Leu
Arg 35 40 45Lys Gly Ser Ala Ala Val Ala Lys Ala Glu Ala Ala Gly Leu
Arg Val 50 55 60Met Thr Pro Glu Glu Ala Ala Ala Trp Ala Asp Val Val
Met Ile Leu65 70 75 80Thr Pro Asp Glu Gly Gln Gly Asp Leu Tyr Arg
Asp Ser Leu Ala Ala 85 90 95Asn Leu Lys Pro Gly Ala Ala Ile Ala Phe
Ala His Gly Leu Asn Ile 100 105 110His Phe Asn Leu Ile Glu Pro Arg
Ala Asp Ile Asp Val Phe Met Ile 115 120 125Ala Pro Lys Gly Pro Gly
His Thr Val Arg Ser Glu Tyr Gln Arg Gly 130 135 140Gly Gly Val Pro
Cys Leu Val Ala Val Ala Gln Asn Pro Ser Gly Asn145 150 155 160Ala
Leu Asp Ile Ala Leu Ser Tyr Ala Ser Ala Ile Gly Gly Gly Arg 165 170
175Ala Gly Ile Ile Glu Thr Thr Phe Lys Glu Glu Cys Glu Thr Asp Leu
180 185 190Phe Gly Glu Gln Thr Val Leu Cys Gly Gly Leu Val Glu Leu
Ile Lys 195 200 205Ala Gly Phe Glu Thr Leu Val Glu Ala Gly Tyr Ala
Pro Glu Met Ala 210 215 220Tyr Phe Glu Cys Leu His Glu Val Lys Leu
Ile Val Asp Leu Ile Tyr225 230 235 240Glu Gly Gly Ile Ala Asn Met
Asn Tyr Ser Ile Ser Asn Thr Ala Glu 245 250 255Tyr Gly Glu Tyr Val
Thr Gly Pro Arg Met Ile Thr Pro Glu Thr Lys 260 265 270Ala Glu Met
Lys Arg Val Leu Asp Asp Ile Gln Lys Gly Arg Phe Thr 275 280 285Arg
Asp Trp Met Leu Glu Asn Lys Val Asn Gln Thr Asn Phe Lys Ala 290 295
300Met Arg Arg Ala Asn Ala Ala His Pro Ile Glu Glu Val Gly Glu
Lys305 310 315 320Leu Arg Ala Met Met Pro Trp Ile Lys Lys Gly Ala
Leu Val Asp Lys 325 330 335Thr Arg Asn6555PRTChlamydomonas
reinhardtii 6Met Gln Leu Leu Asn Ser Lys Ser Arg Val Leu Ser Gly
Ser Arg Gln1 5 10 15Gln Ala Ala Ala Lys Ala Val Arg Val Ala Pro Ser
Gly Arg Arg Ser 20 25 30Ala Val Arg Val Ser Ala Ala Val His Leu Asp
Phe Asn Thr Lys Val 35 40 45Phe Gln Lys Glu His Ala Lys Phe Gly Pro
Thr Glu Glu Tyr Ile Val 50 55 60Arg Gly Gly Arg Asp Lys Tyr Pro Leu
Leu Lys Glu Ala Phe Lys Gly65 70 75 80Ile Lys Lys Val Ser Val Ile
Gly Trp Gly Ser Gln Ala Pro Ala Gln 85 90 95Ala Gln Asn Leu Arg Asp
Ser Ile Ala Glu Ala Gly Met Asp Ile Lys 100 105 110Val Ala Ile Gly
Leu Arg Pro Asp Ser Pro Ser Trp Ala Glu Ala Glu 115 120 125Ala Cys
Gly Phe Ser Lys Thr Asp Gly Thr Leu Gly Glu Val Phe Glu 130 135
140Gln Ile Ser Ser Ser Asp Phe Val Ile Leu Leu Ile Ser Asp Ala
Ala145 150 155 160Gln Ala Lys Leu Tyr Pro Arg Ile Leu Ala Ala Met
Lys Pro Gly Ala 165 170 175Thr Leu Gly Leu Ser His Gly Phe Leu Leu
Gly Val Met Arg Asn Asp 180 185 190Gly Val Asp Phe Arg Lys Asp Ile
Asn Val Val Leu Val Ala Pro Lys 195 200 205Gly Met Gly Pro Ser Val
Arg Arg Leu Tyr Glu Gln Gly Lys Ser Val 210 215 220Asn Gly Ala Gly
Ile Asn Cys Ser Phe Ala Ile Gln Gln Asp Ala Thr225 230 235 240Gly
Gln Ala Ala Asp Ile Ala Ile Gly Trp Ala Ile Gly Val Gly Ala 245 250
255Pro Phe Ala Phe Pro Thr Thr Leu Glu Ser Glu Tyr Lys Ser Asp Ile
260 265 270Tyr Gly Glu Arg Cys Val Leu Leu Gly Ala Val His Gly Ile
Val Glu 275 280 285Ala Leu Phe Arg Arg Tyr Thr Arg Gln Gly Met Ser
Asp Glu Glu Ala 290 295 300Phe Lys Gln Ser Val Glu Ser Ile Thr Gly
Pro Ile Ser Arg Thr Ile305 310 315 320Ser Thr Lys Gly Met Leu Ser Val
Tyr Asn Ser Phe Asn Glu Ala Asp 325 330
335Lys Lys Ile Phe Glu Gln Ala Tyr Ser Ala Ser Tyr Lys Pro Ala Leu
340 345 350Asp Ile Cys Phe Glu Ile Tyr Glu Asp Val Ala Ser Gly Asn
Glu Ile 355 360 365Lys Ser Val Val Gln Ala Val Gln Arg Phe Asp Arg
Phe Pro Met Gly 370 375 380Lys Ile Asp Gln Thr Tyr Met Trp Lys Val
Gly Gln Lys Val Arg Ala385 390 395 400Glu Arg Asp Glu Ser Lys Ile
Pro Val Asn Pro Phe Thr Ala Gly Val 405 410 415Tyr Val Ala Val Met
Met Ala Thr Val Glu Val Leu Arg Glu Lys Gly 420 425 430His Pro Phe
Ser Glu Ile Cys Asn Glu Ser Ile Ile Glu Ala Val Asp 435 440 445Ser
Leu Asn Pro Tyr Met His Ala Arg Gly Val Ala Phe Met Val Asp 450 455
460Asn Cys Ser Tyr Thr Ala Arg Leu Gly Ser Arg Lys Trp Ala Pro
Arg465 470 475 480Phe Asp Tyr Ile Ile Glu Gln Gln Ala Phe Val Asp
Ile Asp Ser Gly 485 490 495Lys Ala Ala Asp Lys Glu Val Met Ala Glu
Phe Leu Ala His Pro Val 500 505 510His Ser Ala Leu Ala Thr Cys Ser
Ser Met Arg Pro Ser Val Asp Ile 515 520 525Ser Val Gly Gly Glu Asn
Ser Ser Val Gly Val Gly Ala Gly Ala Ala 530 535 540Arg Thr Glu Phe
Arg Ser Thr Ala Ala Lys Val545 550 555
<210> 7 <211> 329 <212> PRT <213> Picrophilus torridus <400> 7
Met Glu Lys Val Tyr Thr Glu Asn Asp Leu Lys Glu Asn Leu Met Arg1 5
10 15Asn Lys Lys Ile Ala Val Leu Gly Tyr Gly Ser Gln Gly Arg Ala
Trp 20 25 30Ala Leu Asn Met Arg Asp Ser Gly Leu Asn Val Thr Val Gly
Leu Glu 35 40 45Arg Gln Gly Lys Ser Trp Glu Lys Ala Val Ala Asp Gly
Phe Lys Pro 50 55 60Leu Lys Ser Arg Asp Ala Val Arg Asp Ala Asp Ala
Val Ile Phe Leu65 70 75 80Val Pro Asp Met Ala Gln Arg Glu Leu Tyr
Lys Asn Ile Met Asn Asp 85 90 95Ile Lys Asp Asp Ala Asp Ile Val Phe
Ala His Gly Phe Asn Val His 100 105 110Tyr Gly Leu Ile Asn Pro Lys
Asn His Asp Val Tyr Met Val Ala Pro 115 120 125Lys Ala Pro Gly Pro
Ser Val Arg Glu Phe Tyr Glu Arg Gly Gly Gly 130 135 140Val Pro Val
Leu Ile Ala Val Ala Asn Asp Val Ser Gly Arg Ser Lys145 150 155
160Glu Lys Ala Leu Ser Ile Ala Tyr Ser Leu Gly Ala Leu Arg Ala Gly
165 170 175Ala Ile Glu Thr Thr Phe Lys Glu Glu Thr Glu Thr Asp Leu
Ile Gly 180 185 190Glu Gln Leu Asp Leu Val Gly Gly Ile Thr Glu Leu
Leu Arg Ser Thr 195 200 205Phe Asn Ile Met Val Glu Met Gly Tyr Lys
Pro Glu Met Ala Tyr Phe 210 215 220Glu Ala Ile Asn Glu Met Lys Leu
Ile Val Asp Gln Val Phe Glu Lys225 230 235 240Gly Ile Ser Gly Met
Leu Arg Ala Val Ser Asp Thr Ala Lys Tyr Gly 245 250 255Gly Leu Thr
Thr Gly Lys Tyr Ile Ile Asn Asp Asp Val Arg Lys Arg 260 265 270Met
Arg Glu Arg Ala Glu Tyr Ile Val Ser Gly Lys Phe Ala Glu Glu 275 280
285Trp Ile Glu Glu Tyr Gly Glu Gly Ser Lys Asn Leu Glu Ser Met Met
290 295 300Leu Asp Ile Asp Asn Ser Leu Glu Glu Gln Val Gly Lys Gln
Leu Arg305 310 315 320Glu Ile Val Leu Arg Gly Arg Pro Lys
325
<210> 8 <211> 339 <212> PRT <213> Zymomonas mobilis <400> 8
Met Lys Val Tyr Tyr Asp Ser Asp Ala
Asp Leu Gly Leu Ile Lys Ser1 5 10 15Lys Lys Ile Ala Ile Leu Gly Tyr
Gly Ser Gln Gly His Ala His Ala 20 25 30Gln Asn Leu Arg Asp Ser Gly
Val Ala Glu Val Ala Ile Ala Leu Arg 35 40 45Pro Asp Ser Ala Ser Val
Lys Lys Ala Gln Asp Ala Gly Phe Lys Val 50 55 60Leu Thr Asn Ala Glu
Ala Ala Lys Trp Ala Asp Ile Leu Met Ile Leu65 70 75 80Ala Pro Asp
Glu His Gln Ala Ala Ile Tyr Ala Glu Asp Leu Lys Asp 85 90 95Asn Leu
Arg Pro Gly Ser Ala Ile Ala Phe Ala His Gly Leu Asn Ile 100 105
110His Phe Gly Leu Ile Glu Pro Arg Lys Asp Ile Asp Val Phe Met Ile
115 120 125Ala Pro Lys Gly Pro Gly His Thr Val Arg Ser Glu Tyr Val
Arg Gly 130 135 140Gly Gly Val Pro Cys Leu Val Ala Val Asp Gln Asp
Ala Ser Gly Asn145 150 155 160Ala His Asp Ile Ala Leu Ala Tyr Ala
Ser Gly Ile Gly Gly Gly Arg 165 170 175Ser Gly Val Ile Glu Thr Thr
Phe Arg Glu Glu Val Glu Thr Asp Leu 180 185 190Phe Gly Glu Gln Ala
Val Leu Cys Gly Gly Leu Thr Ala Leu Ile Thr 195 200 205Ala Gly Phe
Glu Thr Leu Thr Glu Ala Gly Tyr Ala Pro Glu Met Ala 210 215 220Phe
Phe Glu Cys Met His Glu Met Lys Leu Ile Val Asp Leu Ile Tyr225 230
235 240Glu Ala Gly Ile Ala Asn Met Arg Tyr Ser Ile Ser Asn Thr Ala
Glu 245 250 255Tyr Gly Asp Ile Val Ser Gly Pro Arg Val Ile Asn Glu
Glu Ser Lys 260 265 270Lys Ala Met Lys Ala Ile Leu Asp Asp Ile Gln
Ser Gly Arg Phe Val 275 280 285Ser Lys Phe Val Leu Asp Asn Arg Ala
Gly Gln Pro Glu Leu Lys Ala 290 295 300Ala Arg Lys Arg Met Ala Ala
His Pro Ile Glu Gln Val Gly Ala Arg305 310 315 320Leu Arg Lys Met
Met Pro Trp Ile Ala Ser Asn Lys Leu Val Asp Lys 325 330 335Ala Arg
Asn
<210> 9 <211> 10 <212> PRT <213> Artificial Sequence <220> <223> c-myc epitope tag <400> 9
Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu1 5 10
<210> 10 <211> 554 <212> PRT <213> Thermotoga petrophila <400> 10
Met Arg
Ser Asp Val Ile Lys Lys Gly Leu Glu Arg Ala Pro His Arg1 5 10 15Ser
Leu Leu Lys Ala Leu Gly Ile Thr Asp Asp Glu Met Arg Arg Pro 20 25
30Phe Ile Gly Ile Val Ser Ser Trp Asn Glu Ile Ile Pro Gly His Val
35 40 45His Leu Asp Lys Val Val Glu Ala Val Lys Ala Gly Val Arg Met
Ala 50 55 60Gly Gly Val Pro Phe Val Phe Pro Thr Ile Gly Ile Cys Asp
Gly Ile65 70 75 80Ala Met Asp His Arg Gly Met Lys Phe Ser Leu Pro
Ser Arg Glu Leu 85 90 95Ile Ala Asp Ser Ile Glu Ile Val Ala Ser Gly
Phe Pro Phe Asp Gly 100 105 110Leu Val Phe Val Pro Asn Cys Asp Lys
Ile Thr Pro Gly Met Met Met 115 120 125Ala Met Gly Arg Leu Asn Ile
Pro Ser Val Leu Ile Ser Gly Gly Pro 130 135 140Met Leu Ala Gly Arg
Tyr Asn Gly Arg Asp Ile Asp Leu Ile Thr Val145 150 155 160Phe Glu
Ala Val Gly Gly Tyr Lys Val Gly Lys Val Asp Glu Glu Thr 165 170
175Leu Lys Ala Ile Glu Asp Leu Ala Cys Pro Gly Ala Gly Ser Cys Ala
180 185 190Gly Leu Phe Thr Ala Asn Thr Met Asn Ser Leu Ala Glu Ala
Leu Gly 195 200 205Ile Ala Pro Arg Gly Asn Gly Thr Val Pro Ala Val
His Ala Lys Arg 210 215 220Leu Arg Met Ala Lys Glu Ala Gly Met Leu
Val Val Glu Leu Val Lys225 230 235 240Arg Asp Val Lys Pro Arg Asp
Ile Val Thr Leu Asp Ser Phe Met Asn 245 250 255Ala Val Met Val Asp
Leu Ala Thr Gly Gly Ser Thr Asn Thr Val Leu 260 265 270His Leu Lys
Ala Ile Ala Glu Ser Phe Gly Ile Asp Phe Asp Ile Lys 275 280 285Leu
Phe Asp Glu Leu Ser Arg Lys Ile Pro His Ile Cys Asn Ile Ser 290 295
300Pro Val Gly Pro Tyr His Ile Gln Asp Leu Asp Asp Ala Gly Gly
Ile305 310 315 320Tyr Ala Val Met Lys Arg Leu Gln Glu Asn Gly Leu
Leu Lys Glu Asp 325 330 335Ala Met Thr Ile Tyr Leu Arg Lys Ile Gly
Asp Leu Val Arg Glu Ala 340 345 350Lys Ile Leu Asn Glu Asp Val Ile
Arg Pro Phe Asp Asn Pro Tyr His 355 360 365Lys Glu Gly Gly Leu Gly
Ile Leu Phe Gly Asn Leu Ala Pro Glu Gly 370 375 380Ala Val Ala Lys
Leu Ser Gly Val Pro Glu Lys Met Met His His Val385 390 395 400Gly
Pro Ala Val Val Phe Glu Asp Gly Glu Glu Ala Thr Lys Ala Ile 405 410
415Leu Ser Gly Lys Ile Lys Lys Gly Asp Val Val Val Ile Arg Tyr Glu
420 425 430Gly Pro Lys Gly Gly Pro Gly Met Arg Glu Met Leu Ser Pro
Thr Ser 435 440 445Ala Ile Val Gly Met Gly Leu Ala Glu Asp Val Ala
Leu Ile Thr Asp 450 455 460Gly Arg Phe Ser Gly Gly Ser His Gly Ala
Val Ile Gly His Val Ser465 470 475 480Pro Glu Ala Ala Glu Gly Gly
Pro Ile Gly Ile Val Lys Asp Gly Asp 485 490 495Leu Ile Glu Ile Asp
Phe Glu Lys Arg Thr Leu Asn Leu Leu Ile Ser 500 505 510Asp Glu Glu
Phe Glu Arg Arg Met Lys Glu Phe Thr Pro Leu Val Lys 515 520 525Glu
Val Asp Ser Asp Tyr Leu Arg Arg Tyr Ala Phe Phe Val Gln Ser 530 535
Ala Ser Lys Gly Ala Ile Phe Arg Lys Pro545 550
<210> 11 <211> 561 <212> PRT <213> Victivallis vadensis <400> 11
Met Arg Ser Asp Thr Met Lys Lys
Gly Pro Glu Arg Ala Pro His Arg1 5 10 15Gly Leu Met Arg Ala Thr Gly
Leu Lys Lys Glu Asp Phe Asp Lys Pro 20 25 30Phe Ile Gly Val Cys Asn
Ser Tyr Thr Asn Ile Val Pro Gly His Cys 35 40 45His Leu Lys Lys Val
Gly Glu Ile Ile Cys Asp Ala Ile Arg Glu Ala 50 55 60Gly Gly Val Pro
Tyr Glu Phe Asn Thr Ile Ala Val Cys Asp Gly Ile65 70 75 80Ala Met
Gly His Lys Gly Met Lys Tyr Ser Leu Ala Ser Arg Glu Ile 85 90 95Ile
Ala Asp Ser Val Glu Thr Met Gly Thr Ala His Pro Phe Asp Ala 100 105
110Met Ile Cys Ile Pro Asn Cys Asp Lys Val Val Pro Gly Met Leu Met
115 120 125Gly Ala Met Arg Leu Asn Ile Pro Thr Ile Phe Ala Ser Gly
Gly Pro 130 135 140Met Arg Ala Gly Lys Pro Gln Ala Glu Gly Gly Pro
Asp Thr Asp Leu145 150 155 160Ile Ser Ile Phe Glu Gly Val Ala Ala
Asn Arg Ile Gly Lys Leu Ser 165 170 175Asp Glu Gly Leu Glu Ala Leu
Glu Cys Ser Ala Cys Pro Gly Pro Gly 180 185 190Ser Cys Ser Gly Met
Phe Thr Ala Asn Ser Met Asn Cys Leu Cys Glu 195 200 205Ala Leu Gly
Ile Ala Leu Pro Gly Asn Gly Thr Ile Ala Ala Asp Ser 210 215 220Pro
Glu Arg Val Glu Leu Trp Lys Arg Ala Ala Arg Arg Ala Val Glu225 230
235 240Leu Ala Arg Met Glu Asn Pro Pro Thr Ala Lys Asp Phe Ala Thr
Pro 245 250 255Ala Ala Phe Gln Asn Ala Leu Val Leu Asp Met Ala Met
Gly Gly Ser 260 265 270Ser Asn Thr Val Leu His Thr Leu Ala Val Ala
Thr Glu Ala Gly Thr 275 280 285Lys Leu Asp Leu Lys Lys Leu Asp Glu
Ile Ser Ala Arg Thr Pro Asn 290 295 300Ile Cys Lys Leu Ser Pro Ser
Val Gln Tyr His Ile Val Glu Asp Gly305 310 315 320Asn Arg Val Gly
Gly Ile Met Ala Ile Leu Lys Glu Ile Ser Lys Val 325 330 335Pro Gly
Leu Ile Asp Gly Ser Ala Pro Thr Val Ser Gly Lys Thr Leu 340 345
350Ala Glu Glu Phe Asn Gly Ala Pro Asp Pro Asp Gly Thr Ile Ile Arg
355 360 365Pro Leu Ser Asn Pro Tyr Ser Glu Lys Gly Gly Leu Ala Ile
Leu Phe 370 375 380Gly Asn Leu Ala Glu Lys Gly Cys Val Val Lys Ala
Ala Gly Val Ala385 390 395 400Lys Ala Met Leu Thr His Lys Gly Pro
Ala Val Ile Phe Asp Ser Glu 405 410 415Glu Glu Ala Gly Glu Gly Ile
Leu Ala Gly Lys Val Lys Ala Gly Asp 420 425 430Val Val Val Ile Arg
Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met Gln 435 440 445Glu Met Leu
Ala Pro Thr Ser Tyr Ile Met Gly Arg Gly Leu Gly Glu 450 455 460Ser
Val Ala Leu Val Thr Asp Gly Arg Phe Ser Gly Gly Thr Arg Gly465 470
475 480Ala Cys Ile Gly His Val Ser Pro Glu Ala Ala Ala Gly Gly Leu
Ile 485 490 495Gly Leu Val Glu Pro Gly Asp Ile Ile Glu Ile Asp Ile
Pro Asn Arg 500 505 510Ser Ile Lys Leu Asp Val Pro Asp Glu Val Ile
Ala Glu Arg Arg Lys 515 520 525Asn Trp Lys Pro Arg Glu Pro Lys Ile
Lys Thr Gly Tyr Leu Ala Lys 530 535 540Tyr Ala Ser Leu Ala Thr Ser
Ala Asp Thr Gly Gly Val Leu Lys Val545 550 555 560Asn
<210> 12 <211> 549 <212> PRT <213> Unknown <220> <223> Termite Group 1 Bacterium Phylotype Rs-D17 <400> 12
Met Arg Ser Asp Gln Ile Lys Arg Gly Ala Val Arg Ala Pro Asn Arg1
5 10 15Cys Leu Leu Tyr Ser Thr Gly Ile Ser Pro Gly Asp Leu Asp Lys
Pro 20 25 30Phe Ile Gly Ile Ala Ser Ser Phe Thr Asp Leu Val Pro Gly
His Val 35 40 45Ala Met Arg Asp Leu Glu Arg Tyr Val Glu Arg Gly Ile
Ala Ala Gly 50 55 60Gly Gly Val Pro Phe Ile Phe Gly Ala Pro Ala Val
Cys Asp Gly Ile65 70 75 80Ala Met Gly His Ser Gly Met His Tyr Ser
Leu Gly Ser Arg Glu Ile 85 90 95Ile Ala Asp Leu Val Glu Thr Val Ala
Asn Ala His Met Leu Asp Gly 100 105 110Leu Ile Leu Leu Ser Asn Cys
Asp Lys Val Thr Pro Gly Met Leu Met 115 120 125Ala Ala Ala Arg Leu
Asn Ile Pro Ala Ile Val Val Thr Ala Gly Ala 130 135 140Met Met Thr
Gly Met Tyr Asp Lys Lys Arg Arg Ser Met Val Arg Asp145 150 155
160Thr Phe Glu Ala Val Gly Gln Phe Gln Ala Gly Lys Ile Thr Glu Lys
165 170 175Gln Leu Ser Glu Leu Glu Met Ala Ala Cys Pro Gly Ala Gly
Ala Cys 180 185 190Gln Gly Met Tyr Thr Ala Asn Thr Met Ala Cys Leu
Thr Glu Thr Met 195 200 205Gly Met Ser Met Arg Gly Cys Ala Thr Thr
Leu Ala Val Ser Ala Lys 210 215 220Lys Lys Arg Ile Ala Tyr Glu Ser
Gly Ile Arg Val Val Ala Leu Val225 230 235 240Lys Lys Asp Val Lys
Pro Arg Asp Ile Leu Thr Leu Ala Ala Phe Lys 245 250 255Asn Ala Ile
Val Ala Asp Met Ala Leu Gly Gly Ser Thr Asn Thr Val 260 265 270Leu
His Leu Pro Ala Ile Ala Asn Glu Ala Gly Ile Glu Leu Pro Leu 275 280
285Glu Leu Phe Asp Glu Ile Ser Lys Lys Thr Pro Gln Ile Ala Cys Leu
290 295 300Glu Pro Ala Gly Asp His Tyr Met Glu Asp Leu Asp Asn Ala
Gly Gly305 310 315 320Ile Pro Ala Val Leu Phe Ala Ile Gln Lys Asn
Leu Ala His Ser Lys 325 330 335Thr Val Ser Gly Phe Asp Ile Ile Glu
Ile Ala Asn Ser Ala Glu Ile 340 345 350Leu Asp Glu Tyr Val Ile Arg
Ala Lys Asn Pro Tyr Lys Pro Glu Gly 355 360 365Gly Ile Ala Ile Leu
Arg Gly Asn Ile Ala Pro Arg Gly Cys Val Val 370 375 380Lys Gln Ala
Ala Val Ser Glu Lys Met Lys Val Phe Ser Gly Arg Ala385 390 395
400Arg Val Phe Asn Ser Glu Asp Asn Ala Met Lys Ala Ile Leu Asp Asn
405 410 415Lys Ile Val Pro Gly Asp Ile Val Val Ile Arg Tyr Glu Gly
Pro Ala 420 425 430Gly Gly Pro Gly Met Arg Glu Met Leu Ser Pro Thr Ser Ala
Leu His 435 440 445Gly Met Gly Leu Ser Asp Ser Val Ala Leu Leu Thr
Asp Gly Arg Phe 450 455 460Ser Gly Gly Thr Arg Gly Pro Cys Ile Gly
His Ile Ser Pro Glu Ala465 470 475 480Ala Ala Asp Gly Ala Ile Val
Ala Ile Asn Glu Gly Asp Thr Ile Asn 485 490 495Ile Asn Ile Pro Glu
Arg Thr Leu Asn Val Glu Leu Thr Asp Asp Glu 500 505 510Ile Lys Ala
Arg Ile Gly Lys Val Ile Lys Pro Glu Pro Lys Ile Lys 515 520 525Thr
Gly Tyr Met Ala Arg Tyr Ala Lys Leu Val Gln Ser Ala Asp Thr 530 535
Gly Ala Val Leu Lys545
<210> 13 <211> 573 <212> PRT <213> Yarrowia lipolytica <400> 13
Met Ile Arg
Ala Arg Asn Tyr Ala Thr Lys Ala His Thr Leu Asn Lys1 5 10 15Phe Ser
Lys Ile Ile Thr Glu Pro Lys Ser Gln Gly Ala Ser Gln Ala 20 25 30Met
Leu Tyr Ala Cys Gly Phe Asn Glu Ala Asp Leu Gly Lys Pro Gln 35 40
45Val Gly Val Ala Ser Val Trp Trp Ser Gly Asn Pro Cys Asn Met His
50 55 60Leu Leu Asp Leu Asn Phe Lys Val Lys Glu Gly Ile Glu Lys His
Asn65 70 75 80Leu Lys Ala Met Gln Phe Asn Thr Ile Gly Val Ser Asp
Gly Ile Ser 85 90 95Met Gly Thr Lys Gly Met Arg Tyr Ser Leu Gln Ser
Arg Asp Met Ile 100 105 110Ala Asp Ser Ile Glu Thr Leu Met Met Ala
Gln His Tyr Asp Ala Asn 115 120 125Ile Ser Ile Pro Gly Cys Asp Lys
Asn Met Pro Gly Val Leu Met Ala 130 135 140Met Gly Arg Val Asn Arg
Pro Ser Ile Met Leu Tyr Gly Gly Thr Ile145 150 155 160His Pro Gly
Lys Ala Glu Thr Arg Lys Gly Glu Asp Ile Asp Ile Val 165 170 175Ser
Ala Phe Gln Ala Tyr Gly Gln Tyr Ile Ala Gly Gly Ile Ser Glu 180 185
190Thr Glu Arg Ala Asp Val Ile Arg His Ala Cys Pro Gly Gln Gly Ala
195 200 205Cys Gly Gly Met Tyr Thr Ala Asn Thr Met Ala Ser Ala Ala
Glu Val 210 215 220Leu Gly Met Thr Leu Pro Gly Ser Ser Ser Ala Pro
Ala Ile Ser Lys225 230 235 240Glu Lys Met Ala Glu Cys Glu Ala Leu
Gly Pro Ala Ile Asn Lys Leu 245 250 255Leu Glu Met Asp Leu Lys Pro
Lys Asp Ile Met Thr Arg Gln Ala Phe 260 265 270Glu Asn Ala Ile Ala
Tyr Ile Ile Ala Thr Gly Gly Ser Thr Asn Ala 275 280 285Val Leu His
Leu Leu Ala Ile Ala His Thr Val Asp Val Pro Leu Thr 290 295 300Ile
Asp Asp Phe Gln Arg Ile Ser Asp Asn Thr Pro Leu Leu Ala Asp305 310
315 320Phe Lys Pro Ser Gly Ala His Val Met Ala Asp Leu Gln Lys Trp
Gly 325 330 335Gly Thr Pro Ala Val Ile Lys Met Leu Ile Glu Gln Gly
Phe Ile Asp 340 345 350Gly Ser Pro Met Thr Cys Ser Gly Glu Ser Leu
Lys Asp Thr Val Ala 355 360 365Lys Tyr Pro Ser Leu Pro Lys Glu Gln
Asp Ile Phe Ala Ser Val Asp 370 375 380Ala Pro Leu Lys Pro Ser Gly
His Leu Gln Ile Leu Lys Gly Ser Leu385 390 395 400Ala Pro Gly Gly
Ser Val Gly Lys Ile Thr Gly Lys Glu Gly Thr Phe 405 410 415Phe Lys
Gly Thr Ala Arg Cys Phe Asp Glu Glu Asp Leu Phe Ile Glu 420 425
430Ala Leu Glu Lys Gly Glu Ile Lys Lys Gly Glu Lys Thr Cys Val Ile
435 440 445Ile Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met Pro Glu
Met Leu 450 455 460Lys Pro Ser Ser Ala Leu Met Gly Tyr Gly Leu Gly
Lys Asp Val Ala465 470 475 480Leu Leu Thr Asp Gly Arg Phe Ser Gly
Gly Ser His Gly Phe Leu Ile 485 490 495Gly His Ile Val Pro Glu Ala
Tyr Glu Gly Gly Pro Ile Gly Leu Val 500 505 510Glu Asp Gly Asp Glu
Ile Ile Ile Asp Ala Asp Asn Asn Ile Ile Asp 515 520 525Leu Leu Val
Asp Glu Lys Thr Met Ala Glu Arg Lys Ala Lys Trp Thr 530 535 540Pro
Pro Ala Pro Arg Tyr Thr Ser Gly Thr Leu His Lys Tyr Ser Lys545 550
555 560Leu Val Ser Asp Ala Ser Thr Gly Cys Ile Thr Asp Ala 565 570
<210> 14 <211> 560 <212> PRT <213> Francisella tularensis <400> 14
Met Lys Lys Val Leu Asn Lys Tyr
Ser Arg Arg Leu Thr Glu Asp Lys1 5 10 15Ser Gln Gly Ala Ser Gln Ala
Met Leu Tyr Gly Thr Glu Met Asn Asp 20 25 30Ala Asp Met His Lys Pro
Gln Ile Gly Ile Gly Ser Val Trp Tyr Glu 35 40 45Gly Asn Thr Cys Asn
Met His Leu Asn Gln Leu Ala Gln Phe Val Lys 50 55 60Asp Ser Val Glu
Lys Glu Asn Leu Lys Gly Met Arg Phe Asn Thr Ile65 70 75 80Gly Val
Ser Asp Gly Ile Ser Met Gly Thr Asp Gly Met Ser Tyr Ser 85 90 95Leu
Gln Ser Arg Asp Leu Ile Ala Asp Ser Ile Glu Thr Val Met Ser 100 105
110Ala His Trp Tyr Asp Gly Leu Val Ser Ile Pro Gly Cys Asp Lys Asn
115 120 125Met Pro Gly Cys Met Met Ala Leu Gly Arg Leu Asn Arg Pro
Gly Phe 130 135 140Val Ile Tyr Gly Gly Thr Ile Gln Ala Gly Val Met
Arg Gly Lys Pro145 150 155 160Ile Asp Ile Val Thr Ala Phe Gln Ser
Tyr Gly Ala Cys Leu Ser Gly 165 170 175Gln Ile Thr Glu Gln Glu Arg
Gln Glu Thr Ile Lys Lys Ala Cys Pro 180 185 190Gly Ala Gly Ala Cys
Gly Gly Met Tyr Thr Ala Asn Thr Met Ala Cys 195 200 205Ala Ile Glu
Ala Leu Gly Met Ser Leu Pro Phe Ser Ser Ser Thr Ser 210 215 220Ala
Thr Ser Val Glu Lys Val Gln Glu Cys Asp Lys Ala Gly Glu Thr225 230
235 240Ile Lys Asn Leu Leu Glu Leu Asp Ile Lys Pro Arg Asp Ile Met
Thr 245 250 255Arg Lys Ala Phe Glu Asn Ala Met Val Leu Ile Thr Val
Met Gly Gly 260 265 270Ser Thr Asn Ala Val Leu His Leu Leu Ala Met
Ala Ser Ser Val Asp 275 280 285Val Asp Leu Ser Ile Asp Asp Phe Gln
Glu Ile Ala Asn Lys Thr Pro 290 295 300Val Leu Ala Asp Phe Lys Pro
Ser Gly Lys Tyr Val Met Ala Asn Leu305 310 315 320His Ala Ile Gly
Gly Thr Pro Ala Val Met Lys Met Leu Leu Lys Ala 325 330 335Gly Met
Leu His Gly Asp Cys Leu Thr Val Thr Gly Lys Thr Leu Ala 340 345
350Glu Asn Leu Glu Asn Val Ala Asp Leu Pro Glu Asp Asn Thr Ile Ile
355 360 365His Lys Leu Asp Asn Pro Ile Lys Lys Thr Gly His Leu Gln
Ile Leu 370 375 380Lys Gly Asn Val Ala Pro Glu Gly Ser Val Ala Lys
Ile Thr Gly Lys385 390 395 400Glu Gly Glu Ile Phe Glu Gly Val Ala
Asn Val Phe Asp Ser Glu Glu 405 410 415Glu Met Val Ala Ala Val Glu
Thr Gly Lys Val Lys Lys Gly Asp Val 420 425 430Ile Val Ile Arg Tyr
Glu Gly Pro Lys Gly Gly Pro Gly Met Pro Glu 435 440 445Met Leu Lys
Pro Thr Ser Leu Ile Met Gly Ala Gly Leu Gly Gln Asp 450 455 460Val
Ala Leu Ile Thr Asp Gly Arg Phe Ser Gly Gly Ser His Gly Phe465 470
475 480Ile Val Gly His Ile Thr Pro Glu Ala Tyr Glu Gly Gly Met Ile
Ala 485 490 495Leu Leu Glu Asn Gly Asp Lys Ile Thr Ile Asp Ala Ile
Asn Asn Val 500 505 510Ile Asn Val Asp Leu Ser Asp Gln Glu Ile Ala
Gln Arg Lys Ser Lys 515 520 525Trp Arg Ala Ser Lys Gln Lys Ala Ser
Arg Gly Thr Leu Lys Lys Tyr 530 535 540Ile Lys Thr Val Ser Ser Ala
Ser Thr Gly Cys Val Thr Asp Leu Asp545 550 555 560
<210> 15 <211> 581 <212> PRT <213> Arabidopsis thaliana <400> 15
Met Pro Ser Ile Ile Ser Cys Ser
Ala Gln Ser Val Thr Ala Asp Pro1 5 10 15Ser Pro Pro Ile Thr Asp Thr
Asn Lys Leu Asn Lys Tyr Ser Ser Arg 20 25 30Ile Thr Glu Pro Lys Ser
Gln Gly Gly Ser Gln Ala Ile Leu His Gly 35 40 45Val Gly Leu Ser Asp
Asp Asp Leu Leu Lys Pro Gln Ile Gly Ile Ser 50 55 60Ser Val Trp Tyr
Glu Gly Asn Thr Cys Asn Met His Leu Leu Lys Leu65 70 75 80Ser Glu
Ala Val Lys Glu Gly Val Glu Asn Ala Gly Met Val Gly Phe 85 90 95Arg
Phe Asn Thr Ile Gly Val Ser Asp Ala Ile Ser Met Gly Thr Arg 100 105
110Gly Met Cys Phe Ser Leu Gln Ser Arg Asp Leu Ile Ala Asp Ser Ile
115 120 125Glu Thr Val Met Ser Ala Gln Trp Tyr Asp Gly Asn Ile Ser
Ile Pro 130 135 140Gly Cys Asp Lys Asn Met Pro Gly Thr Ile Met Ala
Met Gly Arg Leu145 150 155 160Asn Arg Pro Gly Ile Met Val Tyr Gly
Gly Thr Ile Lys Pro Gly His 165 170 175Phe Gln Asp Lys Thr Tyr Asp
Ile Val Ser Ala Phe Gln Ser Tyr Gly 180 185 190Glu Phe Val Ser Gly
Ser Ile Ser Asp Glu Gln Arg Lys Thr Val Leu 195 200 205His His Ser
Cys Pro Gly Ala Gly Ala Cys Gly Gly Met Tyr Thr Ala 210 215 220Asn
Thr Met Ala Ser Ala Ile Gly Ala Met Gly Met Ser Leu Pro Tyr225 230
235 240Ser Ser Ser Ile Pro Ala Glu Asp Pro Leu Lys Leu Asp Glu Cys
Arg 245 250 255Leu Ala Gly Lys Tyr Leu Leu Glu Leu Leu Lys Met Asp
Leu Lys Pro 260 265 270Arg Asp Ile Ile Thr Pro Lys Ser Leu Arg Asn
Ala Met Val Ser Val 275 280 285Met Ala Leu Gly Gly Ser Thr Asn Ala
Val Leu His Leu Ile Ala Ile 290 295 300Ala Arg Ser Val Gly Leu Glu
Leu Thr Leu Asp Asp Phe Gln Lys Val305 310 315 320Ser Asp Ala Val
Pro Phe Leu Ala Asp Leu Lys Pro Ser Gly Lys Tyr 325 330 335Val Met
Glu Asp Ile His Lys Ile Gly Gly Thr Pro Ala Val Leu Arg 340 345
350Tyr Leu Leu Glu Leu Gly Leu Met Asp Gly Asp Cys Met Thr Val Thr
355 360 365Gly Gln Thr Leu Ala Gln Asn Leu Glu Asn Val Pro Ser Leu
Thr Glu 370 375 380Gly Gln Glu Ile Ile Arg Pro Leu Ser Asn Pro Ile
Lys Glu Thr Gly385 390 395 400His Ile Gln Ile Leu Arg Gly Asp Leu
Ala Pro Asp Gly Ser Val Ala 405 410 415Lys Ile Thr Gly Lys Glu Gly
Leu Tyr Phe Ser Gly Pro Ala Leu Val 420 425 430Phe Glu Gly Glu Glu
Ser Met Leu Ala Ala Ile Ser Ala Asp Pro Met 435 440 445Ser Phe Lys
Gly Thr Val Val Val Ile Arg Gly Glu Gly Pro Lys Gly 450 455 460Gly
Pro Gly Met Pro Glu Met Leu Thr Pro Thr Ser Ala Ile Met Gly465 470
475 480Ala Gly Leu Gly Lys Glu Cys Ala Leu Leu Thr Asp Gly Arg Phe
Ser 485 490 495Gly Gly Ser His Gly Phe Val Val Gly His Ile Cys Pro
Glu Ala Gln 500 505 510Glu Gly Gly Pro Ile Gly Leu Ile Lys Asn Gly
Asp Ile Ile Thr Ile 515 520 525Asp Ile Gly Lys Lys Arg Ile Asp Thr
Gln Val Ser Pro Glu Glu Met 530 535 540Asn Asp Arg Arg Lys Lys Trp
Thr Ala Pro Ala Tyr Lys Val Asn Arg545 550 555 560Gly Val Leu Tyr
Lys Tyr Ile Lys Asn Val Gln Ser Ala Ser Asp Gly 565 570 575Cys Val
Thr Asp Glu 580
<210> 16 <211> 573 <212> PRT <213> Candidatus Koribacter versatilis <400> 16
Met Thr
Glu Lys Ser Pro Lys Pro His Lys Arg Ser Asp Ala Ile Thr1 5 10 15Glu
Gly Pro Asn Arg Ala Pro Ala Arg Ala Met Leu Arg Ala Ala Gly 20 25
30Phe Thr Pro Glu Asp Leu Arg Lys Pro Ile Ile Gly Ile Ala Asn Thr
35 40 45Trp Ile Glu Ile Gly Pro Cys Asn Leu His Leu Arg Glu Leu Ala
Glu 50 55 60His Ile Lys Gln Gly Val Arg Glu Ala Gly Gly Thr Pro Met
Glu Phe65 70 75 80Asn Thr Val Ser Ile Ser Asp Gly Ile Thr Met Gly
Ser Glu Gly Met 85 90 95Lys Ala Ser Leu Val Ser Arg Glu Val Ile Ala
Asp Ser Ile Glu Leu 100 105 110Val Ala Arg Gly Asn Leu Phe Asp Gly
Leu Ile Ala Leu Ser Gly Cys 115 120 125Asp Lys Thr Ile Pro Gly Thr
Ile Met Ala Leu Glu Arg Leu Asp Ile 130 135 140Pro Gly Leu Met Leu
Tyr Gly Gly Ser Ile Ala Pro Gly Lys Phe His145 150 155 160Ala Gln
Lys Val Thr Ile Gln Asp Val Phe Glu Ala Val Gly Thr His 165 170
175Ala Arg Gly Lys Met Ser Asp Ala Asp Leu Glu Glu Leu Glu His Asn
180 185 190Ala Cys Pro Gly Ala Gly Ala Cys Gly Gly Gln Phe Thr Ala
Asn Thr 195 200 205Met Ser Met Cys Gly Glu Phe Leu Gly Ile Ser Pro
Met Gly Ala Asn 210 215 220Ser Val Pro Ala Met Thr Val Glu Lys Gln
Gln Val Ala Arg Arg Cys225 230 235 240Gly His Leu Val Met Glu Leu
Val Arg Arg Asp Ile Arg Pro Ser Gln 245 250 255Ile Ile Thr Arg Lys
Ala Ile Glu Asn Ala Ile Ala Ser Val Ala Ala 260 265 270Ser Gly Gly
Ser Thr Asn Ala Val Leu His Leu Leu Ala Ile Ala His 275 280 285Glu
Met Asp Val Glu Leu Asn Ile Glu Asp Phe Asp Lys Ile Ser Ser 290 295
300Arg Thr Pro Leu Leu Cys Glu Leu Lys Pro Ala Gly Arg Phe Thr
Ala305 310 315 320Thr Asp Leu His Asp Ala Gly Gly Ile Pro Leu Val
Ala Gln Arg Leu 325 330 335Leu Glu Ala Asn Leu Leu His Ala Asp Ala
Leu Thr Val Thr Gly Lys 340 345 350Thr Ile Ala Glu Glu Ala Lys Gln
Ala Lys Glu Thr Pro Gly Gln Glu 355 360 365Val Val Arg Pro Leu Thr
Asp Pro Ile Lys Ala Thr Gly Gly Leu Met 370 375 380Ile Leu Lys Gly
Asn Leu Ala Ser Glu Gly Cys Val Val Lys Leu Val385 390 395 400Gly
His Lys Lys Leu Phe Phe Glu Gly Pro Ala Arg Val Phe Glu Ser 405 410
415Glu Glu Glu Ala Phe Ala Gly Val Glu Asp Arg Thr Ile Gln Ala Gly
420 425 430Glu Val Val Val Val Arg Tyr Glu Gly Pro Lys Gly Gly Pro
Gly Met 435 440 445Arg Glu Met Leu Gly Val Thr Ala Ala Ile Ala Gly
Thr Glu Leu Ala 450 455 460Glu Thr Val Ala Leu Ile Thr Asp Gly Arg
Phe Ser Gly Ala Thr Arg465 470 475 480Gly Leu Ser Val Gly His Val
Ala Pro Glu Ala Ala Asn Gly Gly Ala 485 490 495Ile Ala Val Val Arg
Asn Gly Asp Ile Ile Thr Leu Asp Val Glu Arg 500 505 510Arg Glu Leu
Arg Val His Leu Thr Asp Ala Glu Leu Glu Ala Arg Leu 515 520 525Arg
Asn Trp Arg Ala Pro Glu Pro Arg Tyr Lys Arg Gly Val Phe Ala 530 535
540Lys Tyr Ala Ser Thr Val Ser Ser Ala Ser Phe Gly Ala Val Thr
Gly545 550 555 560Ser Thr Ile Glu Asn Lys Thr Leu Ala Gly Ser Thr
Lys 565 570
<210> 17 <211> 562 <212> PRT <213> Gramella forsetii <400> 17
Met Asp Lys Thr Ala Met Asn
Asn Lys Tyr Ser Ser Thr Ile Thr Gln1 5 10 15Ser Asp Ser Gln Pro Ala
Ser Gln Ala Met Leu His Ala Ile Gly Leu 20 25 30Asn Lys Glu Asp Leu
Lys Lys Pro Phe Val Gly Ile Gly Ser Thr Gly 35 40 45Tyr Glu Gly
Asn Pro Cys Asn Met His Leu Asn Asp Leu Ala Lys Glu 50 55 60Val Lys Lys
Gly Thr Gln Asn Ala Asp Leu Asn Gly Leu Ile Phe Asn65 70 75 80Thr
Ile Gly Val Ser Asp Gly Ile Ser Met Gly Thr Pro Gly Met Arg 85 90
95Phe Ser Leu Pro Ser Arg Asp Leu Ile Ala Asp Ser Met Glu Thr Val
100 105 110Val Gly Gly Met Ser Tyr Asp Gly Leu Val Thr Val Val Gly
Cys Asp 115 120 125Lys Asn Met Pro Gly Ala Leu Met Ala Met Leu Arg
Leu Asn Arg Pro 130 135 140Ser Val Leu Val Tyr Gly Gly Thr Ile Ala
Ser Gly Cys His Asn Gly145 150 155 160Lys Lys Leu Asp Val Val Ser
Ala Phe Glu Ala Trp Gly Ser Lys Val 165 170 175Ser Gly Asp Met Gln
Glu Glu Glu Tyr Gln Gln Val Ile Glu Lys Ala 180 185 190Cys Pro Gly
Ala Gly Ala Cys Gly Gly Met Tyr Thr Ala Asn Thr Met 195 200 205Ala
Ser Ser Ile Glu Ala Leu Gly Met Ser Leu Pro Phe Asn Ser Ser 210 215
220Asn Pro Ala Thr Gly Pro Glu Lys Thr Gln Glu Ser Val Lys Ala
Gly225 230 235 240Glu Ala Met Lys Tyr Leu Leu Glu Asn Asp Leu Lys
Pro Lys Asp Ile 245 250 255Val Thr Ala Lys Ser Leu Glu Asn Ala Ile
Arg Leu Leu Thr Val Leu 260 265 270Gly Gly Ser Thr Asn Ala Val Leu
His Phe Leu Ala Ile Ala Lys Ala 275 280 285Ala Glu Ile Asn Phe Gly
Leu Lys Asp Phe Thr Arg Ile Cys Glu Glu 290 295 300Thr Pro Phe Leu
Ala Asp Leu Lys Pro Ser Gly Lys Tyr Leu Met Glu305 310 315 320Asp
Ile His Arg Ile Gly Gly Ile Pro Ala Val Met Lys Tyr Met Leu 325 330
335Glu Lys Gly Leu Leu His Gly Glu Cys Met Thr Val Thr Gly Lys Thr
340 345 350Ile Ala Glu Asn Leu Glu Asn Val Lys Pro Leu Pro Asp Asp
Gln Asp 355 360 365Val Ile His Pro Val Glu Lys Pro Ile Lys Ala Thr
Gly His Ile Arg 370 375 380Ile Leu Tyr Gly Asn Leu Ala Ser Glu Gly
Ser Val Ala Lys Ile Thr385 390 395 400Gly Lys Glu Gly Leu Glu Phe
Gln Gly Lys Ala Arg Val Phe Asn Gly 405 410 415Glu Phe Glu Ala Asn
Glu Gly Ile Ser Ser Gly Lys Val Gln Lys Gly 420 425 430Asp Val Val
Val Ile Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met 435 440 445Pro
Glu Met Leu Lys Pro Thr Ser Ala Ile Met Gly Ala Gly Leu Gly 450 455
460Lys Ser Val Ala Leu Ile Thr Asp Gly Arg Phe Ser Gly Gly Thr
His465 470 475 480Gly Phe Val Val Gly His Ile Thr Pro Glu Ala Gln
Gln Gly Gly Leu 485 490 495Ile Gly Leu Leu Lys Asp Gly Asp Glu Ile
Ser Ile Asn Ala Glu Lys 500 505 510Asn Thr Ile Glu Ala His Leu Ser
Ala Glu Glu Ile Asn Arg Arg Lys 515 520 525Glu Ala Trp Lys Ala Pro
Ala Leu Lys Val Asn Gly Gly Val Leu Tyr 530 535 540Lys Tyr Ala Lys
Thr Val Ala Ser Ala Ser Glu Gly Cys Val Thr Asp545 550 555 560Glu
Phe
<210> 18 <211> 570 <212> PRT <213> Lactococcus lactis <400> 18
Met Glu Phe Lys Tyr Asn Gly Lys Val
Glu Ser Val Glu Leu Asn Lys1 5 10 15Tyr Ser Lys Thr Leu Thr Gln Asp
Pro Thr Gln Pro Ala Thr Gln Ala 20 25 30Met Tyr Tyr Gly Ile Gly Phe
Lys Asp Glu Asp Phe Lys Lys Ala Gln 35 40 45Val Gly Ile Val Ser Met
Asp Trp Asp Gly Asn Pro Cys Asn Met His 50 55 60Leu Gly Thr Leu Gly
Ser Lys Ile Lys Ser Ser Val Asn Gln Thr Asp65 70 75 80Gly Leu Ile
Gly Leu Gln Phe His Thr Ile Gly Val Ser Asp Gly Ile 85 90 95Ala Asn
Gly Lys Leu Gly Met Arg Tyr Ser Leu Val Ser Arg Glu Val 100 105
110Ile Ala Asp Ser Ile Glu Thr Asn Ala Gly Ala Glu Tyr Tyr Asp Ala
115 120 125Ile Val Ala Ile Pro Gly Cys Asp Lys Asn Met Pro Gly Ser
Ile Ile 130 135 140Gly Met Ala Arg Leu Asn Arg Pro Ser Ile Met Val
Tyr Gly Gly Thr145 150 155 160Ile Glu His Gly Glu Tyr Lys Gly Glu
Lys Leu Asn Ile Val Ser Ala 165 170 175Phe Glu Ser Leu Gly Gln Lys
Ile Thr Gly Asn Ile Ser Asp Glu Asp 180 185 190Tyr His Gly Val Ile
Cys Asn Ala Ile Pro Gly Gln Gly Ala Cys Gly 195 200 205Gly Met Tyr
Thr Ala Asn Thr Leu Ala Ala Ala Ile Glu Thr Leu Gly 210 215 220Met
Ser Leu Pro Tyr Ser Ser Ser Asn Pro Ala Val Ser Gln Glu Lys225 230
235 240Gln Glu Glu Cys Asp Glu Ile Gly Leu Ala Ile Lys Asn Leu Leu
Glu 245 250 255Lys Asp Ile Lys Pro Ser Asp Ile Met Thr Lys Glu Ala
Phe Glu Asn 260 265 270Ala Ile Thr Ile Val Met Val Leu Gly Gly Ser
Thr Asn Ala Val Leu 275 280 285His Ile Ile Ala Met Ala Asn Ala Ile
Gly Val Glu Ile Thr Gln Asp 290 295 300Asp Phe Gln Arg Ile Ser Asp
Ile Thr Pro Val Leu Gly Asp Phe Lys305 310 315 320Pro Ser Gly Lys
Tyr Met Met Glu Asp Leu His Lys Ile Gly Gly Leu 325 330 335Pro Ala
Val Leu Lys Tyr Leu Leu Lys Glu Gly Lys Leu His Gly Asp 340 345
350Cys Leu Thr Val Thr Gly Lys Thr Leu Ala Glu Asn Val Glu Thr Ala
355 360 365Leu Asp Leu Asp Phe Asp Ser Gln Asp Ile Met Arg Pro Leu
Lys Asn 370 375 380Pro Ile Lys Ala Thr Gly His Leu Gln Ile Leu Tyr
Gly Asn Leu Ala385 390 395 400Gln Gly Gly Ser Val Ala Lys Ile Ser
Gly Lys Glu Gly Glu Phe Phe 405 410 415Lys Gly Thr Ala Arg Val Phe
Asp Gly Glu Gln His Phe Ile Asp Gly 420 425 430Ile Glu Ser Gly Arg
Leu His Ala Gly Asp Val Ala Val Ile Arg Asn 435 440 445Ile Gly Pro
Val Gly Gly Pro Gly Met Pro Glu Met Leu Lys Pro Thr 450 455 460Ser
Ala Leu Ile Gly Ala Gly Leu Gly Lys Ser Cys Ala Leu Ile Thr465 470
475 480Asp Gly Arg Phe Ser Gly Gly Thr His Gly Phe Val Val Gly His
Ile 485 490 495Val Pro Glu Ala Val Glu Gly Gly Leu Ile Gly Leu Val
Glu Asp Asp 500 505 510Asp Ile Ile Glu Ile Asp Ala Val Asn Asn Ser
Ile Ser Leu Lys Val 515 520 525Ser Asp Glu Glu Ile Ala Lys Arg Arg
Ala Asn Tyr Gln Lys Pro Thr 530 535 540Pro Lys Ala Thr Arg Gly Val
Leu Ala Lys Phe Ala Lys Leu Thr Arg545 550 555 560Pro Ala Ser Glu
Gly Cys Val Thr Asp Leu 565 570
<210> 19 <211> 568 <212> PRT <213> Saccharopolyspora erythraea <400> 19
Met Ser Thr Ser Thr Asp Gly Thr Gly Gln Ser Gly Arg Gly Leu Lys1
5 10 15Pro Arg Ser Gly Asp Val Thr Glu Gly Ile Glu Arg Ala Ala Ala
Arg 20 25 30Gly Met Leu Arg Ala Val Gly Met Gln Asp Ala Asp Phe Ala
Lys Pro 35 40 45Gln Ile Gly Val Ala Ser Ser Trp Asn Glu Ile Thr Pro
Cys Asn Leu 50 55 60Ser Leu Gln Arg Leu Ala Gln Ala Ser Lys Glu Gly
Val His Ala Ala65 70 75 80Gly Gly Phe Pro Met Glu Phe Gly Thr Ile
Ser Val Ser Asp Gly Ile 85 90 95Ser Met Gly His Val Gly Met His Tyr
Ser Leu Val Ser Arg Glu Val 100 105 110Ile Ala Asp Ser Val Glu Thr
Val Met Glu Ala Glu Arg Leu Asp Gly 115 120 125Ser Val Leu Leu Ala
Gly Cys Asp Lys Ser Leu Pro Gly Met Leu Met 130 135 140Ala Ala Ala
Arg Leu Asp Val Ala Ala Val Phe Val Tyr Ala Gly Ser145 150 155
160Ile Leu Pro Gly Arg Val Asp Asp Arg Glu Val Thr Ile Ile Asp Ala
165 170 175Phe Glu Ala Val Gly Ala Cys Ala Arg Gly Leu Ile Ser Glu
Ala Glu 180 185 190Val Asp Arg Ile Glu Arg Ala Ile Cys Pro Gly Glu
Gly Ala Cys Gly 195 200 205Gly Met Tyr Thr Ala Asn Thr Met Ala Cys
Ala Ala Glu Ala Met Gly 210 215 220Met Ser Leu Pro Gly Ser Ala Ser
Pro Pro Ser Val Asp Arg Arg Arg225 230 235 240Asp Ala Gly Ala Arg
Glu Ala Gly Arg Ala Val Val Gly Met Ile Glu 245 250 255Arg Gly Leu
Thr Ala Arg Gln Ile Leu Thr Lys Glu Ala Phe Glu Asn 260 265 270Ala
Ile Ala Val Val Met Ala Phe Gly Gly Ser Thr Asn Ala Val Leu 275 280
285His Leu Leu Ala Ile Ala Arg Glu Ala Glu Val Asp Leu Thr Leu Asp
290 295 300Asp Phe Asn Arg Ile Gly Asp Arg Val Pro His Leu Ala Asp
Val Lys305 310 315 320Pro Phe Gly Arg His Val Met Thr Ala Val Asp
Arg Ile Gly Gly Val 325 330 335Pro Val Val Met Lys Ala Leu Leu Asp
Ala Gly Leu Leu His Gly Asp 340 345 350Cys Met Thr Val Thr Gly Lys
Thr Val Ala Glu Asn Leu Ala Glu Leu 355 360 365Asp Pro Pro Glu Leu
Asp Gly Glu Val Leu His Lys Leu Ser Asn Pro 370 375 380Leu His Pro
Thr Gly Gly Leu Thr Ile Leu Arg Gly Ser Leu Ala Pro385 390 395
400Glu Gly Ala Val Val Lys Ser Ala Gly Phe Asp Ser Ala Thr Phe Glu
405 410 415Gly Thr Ala Arg Val Phe Asp Gly Glu Gln Gly Ala Met Asp
Ala Val 420 425 430Glu Asp Gly Ser Leu Lys Ala Gly Asp Val Val Val
Ile Arg Tyr Glu 435 440 445Gly Pro Arg Gly Gly Pro Gly Met Arg Glu
Met Leu Ala Val Thr Gly 450 455 460Ala Ile Lys Gly Ala Gly Leu Gly
Lys Asp Val Leu Leu Leu Thr Asp465 470 475 480Gly Arg Phe Ser Gly
Gly Thr Thr Gly Leu Cys Ile Gly His Val Ala 485 490 495Pro Glu Ala
Thr Asp Gly Gly Pro Ile Ala Phe Val Arg Asp Gly Asp 500 505 510Pro
Ile Arg Leu Asp Leu Ala Gly Arg Thr Leu Asp Leu Leu Val Asp 515 520
525Glu Ala Glu Leu Ala Arg Arg Lys Glu Gly Trp Val Pro Arg Glu Pro
530 535 540Lys Phe Arg Gln Gly Val Leu Gly Lys Tyr Ala Arg Leu Val
Arg Ser545 550 555 560Ala Ala Val Gly Ala Val Cys Ser
56520585PRTSaccharomyces cerevisiae 20Met Gly Leu Leu Thr Lys Val
Ala Thr Ser Arg Gln Phe Ser Thr Thr1 5 10 15Arg Cys Val Ala Lys Lys
Leu Asn Lys Tyr Ser Tyr Ile Ile Thr Glu 20 25 30Pro Lys Gly Gln Gly
Ala Ser Gln Ala Met Leu Tyr Ala Thr Gly Phe 35 40 45Lys Lys Glu Asp
Phe Lys Lys Pro Gln Val Gly Val Gly Ser Cys Trp 50 55 60Trp Ser Gly
Asn Pro Cys Asn Met His Leu Leu Asp Leu Asn Asn Arg65 70 75 80Cys
Ser Gln Ser Ile Glu Lys Ala Gly Leu Lys Ala Met Gln Phe Asn 85 90
95Thr Ile Gly Val Ser Asp Gly Ile Ser Met Gly Thr Lys Gly Met Arg
100 105 110Tyr Ser Leu Gln Ser Arg Glu Ile Ile Ala Asp Ser Phe Glu
Thr Ile 115 120 125Met Met Ala Gln His Tyr Asp Ala Asn Ile Ala Ile
Pro Ser Cys Asp 130 135 140Lys Asn Met Pro Gly Val Met Met Ala Met
Gly Arg His Asn Arg Pro145 150 155 160Ser Ile Met Val Tyr Gly Gly
Thr Ile Leu Pro Gly His Pro Thr Cys 165 170 175Gly Ser Ser Lys Ile
Ser Lys Asn Ile Asp Ile Val Ser Ala Phe Gln 180 185 190Ser Tyr Gly
Glu Tyr Ile Ser Lys Gln Phe Thr Glu Glu Glu Arg Glu 195 200 205Asp
Val Val Glu His Ala Cys Pro Gly Pro Gly Ser Cys Gly Gly Met 210 215
220Tyr Thr Ala Asn Thr Met Ala Ser Ala Ala Glu Val Leu Gly Leu
Thr225 230 235 240Ile Pro Asn Ser Ser Ser Phe Pro Ala Val Ser Lys
Glu Lys Leu Ala 245 250 255Glu Cys Asp Asn Ile Gly Glu Tyr Ile Lys
Lys Thr Met Glu Leu Gly 260 265 270Ile Leu Pro Arg Asp Ile Leu Thr
Lys Glu Ala Phe Glu Asn Ala Ile 275 280 285Thr Tyr Val Val Ala Thr
Gly Gly Ser Thr Asn Ala Val Leu His Leu 290 295 300Val Ala Val Ala
His Ser Ala Gly Val Lys Leu Ser Pro Asp Asp Phe305 310 315 320Gln
Arg Ile Ser Asp Thr Thr Pro Leu Ile Gly Asp Phe Lys Pro Ser 325 330
335Gly Lys Tyr Val Met Ala Asp Leu Ile Asn Val Gly Gly Thr Gln Ser
340 345 350Val Ile Lys Tyr Leu Tyr Glu Asn Asn Met Leu His Gly Asn
Thr Met 355 360 365Thr Val Thr Gly Asp Thr Leu Ala Glu Arg Ala Lys
Lys Ala Pro Ser 370 375 380Leu Pro Glu Gly Gln Glu Ile Ile Lys Pro
Leu Ser His Pro Ile Lys385 390 395 400Ala Asn Gly His Leu Gln Ile
Leu Tyr Gly Ser Leu Ala Pro Gly Gly 405 410 415Ala Val Gly Lys Ile
Thr Gly Lys Glu Gly Thr Tyr Phe Lys Gly Arg 420 425 430Ala Arg Val
Phe Glu Glu Glu Gly Ala Phe Ile Glu Ala Leu Glu Arg 435 440 445Gly
Glu Ile Lys Lys Gly Glu Lys Thr Val Val Val Ile Arg Tyr Glu 450 455
460Gly Pro Arg Gly Ala Pro Gly Met Pro Glu Met Leu Lys Pro Ser
Ser465 470 475 480Ala Leu Met Gly Tyr Gly Leu Gly Lys Asp Val Ala
Leu Leu Thr Asp 485 490 495Gly Arg Phe Ser Gly Gly Ser His Gly Phe
Leu Ile Gly His Ile Val 500 505 510Pro Glu Ala Ala Glu Gly Gly Pro
Ile Gly Leu Val Arg Asp Gly Asp 515 520 525Glu Ile Ile Ile Asp Ala
Asp Asn Asn Lys Ile Asp Leu Leu Val Ser 530 535 540Asp Lys Glu Met
Ala Gln Arg Lys Gln Ser Trp Val Ala Pro Pro Pro545 550 555 560Arg
Tyr Thr Arg Gly Thr Leu Ser Lys Tyr Ala Lys Leu Val Ser Asn 565 570
575Ala Ser Asn Gly Cys Val Leu Asp Ala 580 58521592PRTPiromyces sp
21Met Ser Phe Ser Leu Ala Asn Leu Ala Ala Lys Gly Ser Asn Leu Phe1
5 10 15Lys Phe Thr Pro Ala Leu Leu Ser Ala Lys Arg Phe Gly Ser Ser
Gly 20 25 30Lys Pro Ile Asn Lys Phe Ser Lys Ile Ile Thr Glu Pro Lys
Ser Arg 35 40 45Gly Gly Ser Gln Ala Met Leu Ile Ala Thr Gly Ile Lys
Pro Glu Asp 50 55 60Leu Lys Lys Pro Gln Ile Gly Ile Gly Ser Val Trp
Tyr Asp Gly Asn65 70 75 80Pro Cys Asn Met His Leu Leu Asp Leu Gly
Ser Val Val Lys Lys Ala 85 90 95Val Gln Lys Gln Asn Met Asn Gly Met
Arg Phe Asn Met Ile Gly Val 100 105 110Ser Asp Gly Ile Ser Asn Gly
Thr Asp Gly Met Ser Phe Ser Leu Gln 115 120 125Ser Arg Glu Ile Ile
Ala Asp Ser Ile Glu Thr Ile Met Ser Ala Gln 130 135 140Tyr Tyr Asp
Ala Asn Ile Ser Leu Pro Gly Cys Asp Lys Asn Met Pro145 150 155
160Gly Cys Leu Ile Ala Ala Ala Arg Leu Asn Arg Pro Thr Ile Ile Ile
165 170 175Tyr Gly Gly Thr Ile Lys Pro Gly His Thr Lys Lys Gly Glu
Thr Ile 180 185 190Asp Leu Val Ser Ala Phe Gln Cys Tyr Gly Gln Tyr
Leu Ala Gly Glu 195 200 205Ile Thr Glu Glu Gln Arg Glu Glu Ile Val
Asn Asn Ala Cys Pro Gly 210 215 220Ala Gly Ala Cys Gly Gly Met Tyr
Thr Ala Asn Thr Met Ala Ser Ile225 230 235
240Ile Glu Ser Met Gly Met Ser Leu Pro Tyr Ser Ala Ser Thr Pro Ala
245 250 255Glu Asp Pro Leu Lys Glu Leu Glu Cys Ile Asn Ala Ala Ala
Ala Ile 260 265 270Lys Asn Leu Met Glu Lys Asp Ile Lys Pro Leu Asp
Ile Met Thr Arg 275 280 285Lys Ala Phe Glu Asn Ala Ile Thr Ile Thr
Leu Ile Leu Gly Gly Ser 290 295 300Thr Asn Ser Val Leu His Leu Leu
Ala Ile Ala Arg Ala Cys Lys Val305 310 315 320Pro Leu Thr Ile Asp
Asp Phe Gln Glu Phe Ser Asn Arg Ile Pro Val 325 330 335Leu Ala Asp
Leu Lys Pro Ser Gly Lys Tyr Val Met Glu Asp Leu Gln 340 345 350Leu
Ile Gly Gly Leu Pro Ala Ile Gln Lys Tyr Leu Leu Asn Glu Gly 355 360
365Leu Leu His Gly Asp Ile Met Thr Val Thr Gly Lys Thr Leu Ala Glu
370 375 380Asn Leu Lys Asp Val Ala Pro Ile Asp Phe Glu Thr Gln Asp
Ile Ile385 390 395 400Arg Pro Leu Ser Asn Pro Ile Lys Lys Asn Gly
His Ile Ile Ile Met 405 410 415Lys Gly Asn Val Ser Pro Asp Gly Gly
Val Ala Lys Ile Thr Gly Lys 420 425 430Gln Gly Leu Phe Phe Glu Gly
Val Ala Asn Cys Phe Asp Cys Glu Glu 435 440 445Asp Met Leu Ala Ala
Leu Glu Arg Gly Glu Ile Lys Lys Gly Gln Val 450 455 460Ile Ile Ile
Arg Tyr Glu Gly Pro Thr Gly Gly Pro Gly Met Pro Glu465 470 475
480Met Leu Thr Pro Thr Ser Ala Ile Met Gly Ala Gly Leu Gly Lys Asp
485 490 495Val Ala Leu Leu Thr Asp Gly Arg Phe Ser Gly Gly Ser His
Gly Phe 500 505 510Ile Ile Gly His Ile Thr Pro Glu Ala Gln Val Gly
Gly Pro Ile Ala 515 520 525Leu Ile Lys Asn Gly Asp Lys Ile Thr Ile
Asp Ala Asn Lys Arg Thr 530 535 540Ile His Ala His Val Ser Glu Glu
Glu Phe Ala Lys Arg Arg Ala Glu545 550 555 560Trp Lys Ala Pro Pro
Tyr Arg Ala Thr Gln Gly Thr Leu Lys Lys Tyr 565 570 575Ile Lys Leu
Val Lys Pro Ala Asn Phe Gly Cys Val Thr Asp Glu Trp 580 585
59022587PRTRalstonia eutropha 22Met Pro Tyr Ala Asp Asp Pro Lys Leu
Pro Gln Asp Gly Ala Ala Pro1 5 10 15Thr Glu Gly Leu Ala Lys Gly Leu
Thr Asn Tyr Gly Asp Thr Gly Phe 20 25 30Ser Leu Phe Leu Arg Lys Ala
Phe Ile Lys Gly Ala Gly Phe Thr Asp 35 40 45Asp Ala Leu Ser Arg Pro
Val Ile Gly Ile Val Asn Thr Gly Ser Ser 50 55 60Tyr Asn Pro Cys His
Gly Asn Ala Pro Gln Leu Val Glu Ala Val Lys65 70 75 80Arg Gly Val
Met Leu Ala Gly Gly Leu Pro Val Asp Phe Pro Thr Ile 85 90 95Ser Val
His Glu Ser Phe Ser Ala Pro Thr Ser Met Tyr Leu Arg Asn 100 105
110Leu Met Ser Met Asp Thr Glu Glu Met Ile Arg Ala Gln Pro Met Asp
115 120 125Ala Val Val Leu Ile Gly Gly Cys Asp Lys Thr Val Pro Ala
Gln Leu 130 135 140Met Gly Ala Ala Ser Ala Gly Val Pro Ala Ile Gln
Leu Val Thr Gly145 150 155 160Ser Met Leu Thr Gly Ser His Arg Ser
Glu Arg Val Gly Ala Cys Thr 165 170 175Asp Cys Arg Arg Tyr Trp Gly
Arg Tyr Arg Ala Glu Glu Ile Asp Ser 180 185 190Ala Glu Ile Ala Asp
Val Asn Asn Gln Leu Val Ala Ser Val Gly Thr 195 200 205Cys Ser Val
Met Gly Thr Ala Ser Thr Met Ala Cys Val Ala Glu Ala 210 215 220Leu
Gly Met Met Val Ser Gly Gly Ala Ser Ala Pro Ala Val Thr Ala225 230
235 240Asp Arg Val Arg Val Ala Glu Arg Thr Gly Thr Thr Ala Val Gly
Met 245 250 255Ala Ala Ala Arg Leu Thr Pro Asp Arg Ile Leu Thr Gly
Lys Ala Phe 260 265 270Glu Asn Ala Leu Arg Val Leu Leu Ala Ile Gly
Gly Ser Thr Asn Gly 275 280 285Ile Val His Leu Thr Ala Ile Ala Gly
Arg Leu Gly Ile Asp Ile Asp 290 295 300Leu Ala Gly Leu Asp Arg Met
Ser Arg Glu Thr Pro Val Leu Val Asp305 310 315 320Leu Lys Pro Ser
Gly Gln His Tyr Met Glu Asp Phe His Lys Ala Gly 325 330 335Gly Met
Leu Thr Leu Leu Arg Glu Leu Arg Pro Leu Leu His Leu Asp 340 345
350Thr Leu Thr Val Ser Gly Arg Thr Leu Gly Glu Glu Leu Asp Ala Ala
355 360 365Pro Pro Leu Phe Pro Gln Asp Val Ile Arg Ser Ala Gly Asn
Pro Ile 370 375 380Tyr Pro Ala Gly Gly Leu Ala Val Leu Arg Gly Asn
Leu Ala Pro Gly385 390 395 400Gly Ala Ile Ile Lys Gln Ser Ala Ala
Asn Pro Ala Leu Met Glu His 405 410 415Glu Gly Arg Ala Val Val Phe
Glu Asn Ala Glu Asp Met Ala Gln Arg 420 425 430Ile Asp Asp Glu Ser
Leu Asp Val Lys Ala Asp Asp Ile Leu Val Leu 435 440 445Lys Arg Ile
Gly Pro Thr Gly Ala Pro Gly Met Pro Glu Ala Gly Tyr 450 455 460Met
Pro Ile Pro Lys Lys Leu Ala Arg Ala Gly Val Lys Asp Met Val465 470
475 480Arg Val Ser Asp Gly Arg Met Ser Gly Thr Ala Ala Gly Thr Ile
Val 485 490 495Leu His Val Thr Pro Glu Ala Ala Ile Gly Gly Pro Leu
Ala Leu Val 500 505 510Gln Ser Gly Asp Arg Ile Arg Leu Ser Val Ala
Asn Arg Glu Ile Ala 515 520 525Leu Leu Val Asp Asp Ala Glu Leu Ala
Arg Arg Ala Ala Ala Gln Pro 530 535 540Val Glu Arg Pro Arg Ala Glu
Arg Gly Tyr Arg Lys Leu Phe Leu Glu545 550 555 560Thr Val Thr Gln
Ala Asp Gln Gly Val Asp Phe Asp Phe Leu Arg Ala 565 570 575Ala Gln
Thr Val Asp Thr Val Pro Lys Gln Gly 580 58523581PRTChromohalobacter
salexigens 23Met Thr His Lys Lys Arg Pro Leu Arg Ser Ala Glu Trp
Phe Gly Asn1 5 10 15Asp Asp Lys Asn Gly Phe Met Tyr Arg Ser Trp Met
Lys Asn Gln Gly 20 25 30Ile Pro Asp His Glu Phe Arg Gly Lys Pro Ile
Ile Gly Ile Cys Asn 35 40 45Thr Phe Ser Glu Leu Thr Pro Cys Asn Ala
His Phe Arg Lys Leu Ala 50 55 60Glu His Val Lys Lys Gly Val Leu Glu
Ala Gly Gly Tyr Pro Val Glu65 70 75 80Phe Pro Val Phe Ser Asn Gly
Glu Ser Asn Leu Arg Pro Thr Ala Met 85 90 95Phe Thr Arg Asn Leu Ala
Ser Met Asp Val Glu Glu Ala Ile Arg Gly 100 105 110Asn Pro Leu Asp
Ala Val Val Leu Leu Val Gly Cys Asp Lys Thr Thr 115 120 125Pro Ala
Leu Leu Met Gly Ala Ala Ser Cys Asp Ile Pro Thr Ile Val 130 135
140Val Thr Gly Gly Pro Met Leu Asn Gly Lys His Lys Gly Arg Asp
Ile145 150 155 160Gly Ser Gly Thr Val Val Trp Gln Leu Ser Glu Glu
Val Lys Ala Gly 165 170 175Lys Ile Ser Leu His Asp Phe Met Ala Ala
Glu Ala Gly Met Ser Arg 180 185 190Ser Ala Gly Thr Cys Asn Thr Met
Gly Thr Ala Ser Thr Met Ala Cys 195 200 205Met Ala Glu Ser Leu Gly
Thr Ser Leu Pro His Asn Ala Ala Ile Pro 210 215 220Ala Val Asp Ser
Arg Arg Tyr Val Leu Ala His Leu Ser Gly Asn Arg225 230 235 240Ile
Val Glu Met Val Asp Glu Asp Leu Thr Leu Ser Lys Val Leu Thr 245 250
255Lys Ser Ala Phe Glu Asn Ala Ile Arg Thr Asn Ala Ala Ile Gly Gly
260 265 270Ser Thr Asn Ala Val Ile His Leu Gln Ala Ile Ala Gly Arg
Met Gly 275 280 285Val Asp Leu Thr Leu Asp Asp Trp Thr Arg Val Gly
Arg Gly Thr Pro 290 295 300Thr Ile Val Asp Leu Gln Pro Ser Gly Arg
Tyr Leu Met Glu Glu Phe305 310 315 320Tyr Tyr Ala Gly Gly Leu Pro
Ala Val Leu Arg Arg Leu Gly Glu Ala 325 330 335Asp Arg Leu Pro His
Lys Asp Ala Leu Thr Val Asn Gly Lys Thr Leu 340 345 350Trp Glu Asn
Val Gln Asp Ala Pro Leu Tyr Asn Asp Ala Val Ile Leu 355 360 365Pro
Leu Asp Ala Pro Leu Arg Glu Asp Gly Gly Met Cys Val Met Arg 370 375
380Gly Asn Leu Ala Pro Asn Gly Ala Val Leu Lys Pro Ser Ala Ala
Thr385 390 395 400Pro Ala Leu Met Gln His Arg Gly Arg Ala Val Val
Phe Glu Asn Phe 405 410 415Asp Asp Tyr Lys Ala Arg Ile Asn Asp Pro
Asp Leu Asp Val Thr Ala 420 425 430Asp Asp Ile Leu Val Met Lys Asn
Cys Gly Pro Arg Gly Tyr His Gly 435 440 445Met Ala Glu Val Gly Asn
Met Gly Leu Pro Ala Lys Leu Leu Glu Gln 450 455 460Gly Val Thr Asp
Met Val Arg Ile Ser Asp Ala Arg Met Ser Gly Thr465 470 475 480Ala
Tyr Gly Thr Val Val Leu His Val Ala Pro Glu Ala Ala Ala Gly 485 490
495Gly Pro Leu Ala Ala Val Arg Asn Gly Asp Trp Ile Ala Leu Asp Ala
500 505 510Tyr Ser Gly Lys Leu His Leu Glu Val Asp Asp Ala Glu Ile
Ala Ser 515 520 525Arg Leu Ala Glu Ala Asp Pro Thr Ala Glu Ser Thr
Arg Ile Ala Ser 530 535 540Thr Gly Gly Tyr Arg Gln Leu Tyr Ile Glu
His Val Leu Gln Ala Asp545 550 555 560Gln Gly Cys Asp Phe Asp Phe
Leu Val Gly Cys Arg Gly Ala Glu Val 565 570 575Pro Arg His Ser
His58024329PRTPicrophilus torridus 24Met Glu Lys Val Tyr Thr Glu
Asn Asp Leu Lys Glu Asn Leu Met Arg1 5 10 15Asn Lys Lys Ile Ala Val
Leu Gly Tyr Gly Ser Gln Gly Arg Ala Trp 20 25 30Ala Leu Asn Met Arg
Asp Ser Gly Leu Asn Val Thr Val Gly Leu Glu 35 40 45Arg Gln Gly Lys
Ser Trp Glu Lys Ala Val Ala Asp Gly Phe Lys Pro 50 55 60Leu Lys Ser
Arg Asp Ala Val Arg Asp Ala Asp Ala Val Ile Phe Leu65 70 75 80Val
Pro Asp Met Ala Gln Arg Glu Leu Tyr Lys Asn Ile Met Asn Asp 85 90
95Ile Lys Asp Asp Ala Asp Ile Val Phe Ala His Gly Phe Asn Val His
100 105 110Tyr Gly Leu Ile Asn Pro Lys Asn His Asp Val Tyr Met Val
Ala Pro 115 120 125Lys Ala Pro Gly Pro Ser Val Arg Glu Phe Tyr Glu
Arg Gly Gly Gly 130 135 140Val Pro Val Leu Ile Ala Val Ala Asn Asp
Val Ser Gly Arg Ser Lys145 150 155 160Glu Lys Ala Leu Ser Ile Ala
Tyr Ser Leu Gly Ala Leu Arg Ala Gly 165 170 175Ala Ile Glu Thr Thr
Phe Lys Glu Glu Thr Glu Thr Asp Leu Ile Gly 180 185 190Glu Gln Leu
Asp Leu Val Gly Gly Ile Thr Glu Leu Leu Arg Ser Thr 195 200 205Phe
Asn Ile Met Val Glu Met Gly Tyr Lys Pro Glu Met Ala Tyr Phe 210 215
220Glu Ala Ile Asn Glu Met Lys Leu Ile Val Asp Gln Val Phe Glu
Lys225 230 235 240Gly Ile Ser Gly Met Leu Arg Ala Val Ser Asp Thr
Ala Lys Tyr Gly 245 250 255Gly Leu Thr Thr Gly Lys Tyr Ile Ile Asn
Asp Asp Val Arg Lys Arg 260 265 270Met Arg Glu Arg Ala Glu Tyr Ile
Val Ser Gly Lys Phe Ala Glu Glu 275 280 285Trp Ile Glu Glu Tyr Gly
Glu Gly Ser Lys Asn Leu Glu Ser Met Met 290 295 300Leu Asp Ile Asp
Asn Ser Leu Glu Glu Gln Val Gly Lys Gln Leu Arg305 310 315 320Glu
Ile Val Leu Arg Gly Arg Pro Lys 32525560PRTPicrophilus torridus
25Met Asn Pro Asp Lys Lys Lys Arg Ser Asn Leu Ile Tyr Gly Gly Tyr1
5 10 15Glu Lys Ala Pro Asn Arg Ala Phe Leu Lys Ala Met Gly Leu Thr
Asp 20 25 30Asp Asp Ile Ala Lys Pro Ile Val Gly Val Ala Val Ala Trp
Asn Glu 35 40 45Ala Gly Pro Cys Asn Ile His Leu Leu Gly Leu Ser Asn
Ile Val Lys 50 55 60Glu Gly Val Arg Ser Gly Gly Gly Thr Pro Arg Val
Phe Thr Ala Pro65 70 75 80Val Val Ile Asp Gly Ile Ala Met Gly Ser
Glu Gly Met Lys Tyr Ser 85 90 95Leu Val Ser Arg Glu Ile Val Ala Asn
Thr Val Glu Leu Val Val Asn 100 105 110Ala His Gly Tyr Asp Gly Phe
Val Ala Leu Ala Gly Cys Asp Lys Thr 115 120 125Pro Pro Gly Met Met
Met Ala Met Ala Arg Leu Asn Ile Pro Ser Ile 130 135 140Ile Met Tyr
Gly Gly Thr Thr Leu Pro Gly Asn Phe Lys Gly Lys Pro145 150 155
160Ile Thr Ile Gln Asp Val Tyr Glu Ala Val Gly Ala Tyr Ser Lys Gly
165 170 175Lys Ile Thr Ala Glu Asp Leu Arg Leu Met Glu Asp Asn Ala
Ile Pro 180 185 190Gly Pro Gly Thr Cys Gly Gly Leu Tyr Thr Ala Asn
Thr Met Gly Leu 195 200 205Met Thr Glu Ala Leu Gly Leu Ala Leu Pro
Gly Ser Ala Ser Pro Pro 210 215 220Ala Val Asp Ser Ala Arg Val Lys
Tyr Ala Tyr Glu Thr Gly Lys Ala225 230 235 240Leu Met Asn Leu Ile
Glu Ile Gly Leu Lys Pro Arg Asp Ile Leu Thr 245 250 255Phe Glu Ala
Phe Glu Asn Ala Ile Thr Val Leu Met Ala Ser Gly Gly 260 265 270Ser
Thr Asn Ala Val Leu His Leu Leu Ala Ile Ala Tyr Glu Ala Gly 275 280
285Val Lys Leu Thr Leu Asp Asp Phe Asp Arg Ile Ser Gln Arg Thr Pro
290 295 300Glu Ile Val Asn Met Lys Pro Gly Gly Glu Tyr Ala Met Tyr
Asp Leu305 310 315 320His Arg Val Gly Gly Ala Pro Leu Ile Met Lys
Lys Leu Leu Glu Ala 325 330 335Asp Leu Leu His Gly Asp Val Ile Thr
Val Thr Gly Lys Thr Val Lys 340 345 350Gln Asn Leu Glu Glu Tyr Lys
Leu Pro Asn Val Pro His Glu His Ile 355 360 365Val Arg Pro Ile Ser
Asn Pro Phe Asn Pro Thr Gly Gly Ile Arg Ile 370 375 380Leu Lys Gly
Ser Leu Ala Pro Glu Gly Ala Val Ile Lys Val Ser Ala385 390 395
400Thr Lys Val Arg Tyr His Lys Gly Pro Ala Arg Val Phe Asn Ser Glu
405 410 415Glu Glu Ala Phe Lys Ala Val Leu Glu Glu Lys Ile Gln Glu
Asn Asp 420 425 430Val Val Val Ile Arg Tyr Glu Gly Pro Lys Gly Gly
Pro Gly Met Arg 435 440 445Glu Met Leu Ala Val Thr Ser Ala Ile Val
Gly Gln Gly Leu Gly Glu 450 455 460Lys Val Ala Leu Ile Thr Asp Gly
Arg Phe Ser Gly Ala Thr Arg Gly465 470 475 480Ile Met Val Gly His
Val Ala Pro Glu Ala Ala Val Gly Gly Pro Ile 485 490 495Ala Leu Leu
Arg Asp Gly Asp Thr Ile Ile Ile Asp Ala Asn Asn Gly 500 505 510Arg
Leu Asp Val Asp Leu Pro Gln Glu Glu Leu Lys Lys Arg Ala Asp 515 520
525Glu Trp Thr Pro Pro Pro Pro Lys Tyr Lys Ser Gly Leu Leu Ala Gln
530 535 540Tyr Ala Arg Leu Val Ser Ser Ser Ser Leu Gly Ala Val Leu
Leu Thr545 550 555 56026566PRTArtificial SequenceSaccharomyces
cerevisiae ILV3deltaN 26Met Lys Lys Leu Asn Lys Tyr Ser Tyr Ile Ile
Thr Glu Pro Lys Gly1 5 10 15Gln Gly Ala Ser Gln Ala Met Leu Tyr Ala
Thr Gly Phe Lys Lys Glu 20 25 30Asp Phe Lys Lys Pro Gln Val Gly Val
Gly Ser Cys Trp Trp Ser
Gly 35 40 45Asn Pro Cys Asn Met His Leu Leu Asp Leu Asn Asn Arg Cys
Ser Gln 50 55 60Ser Ile Glu Lys Ala Gly Leu Lys Ala Met Gln Phe Asn
Thr Ile Gly65 70 75 80Val Ser Asp Gly Ile Ser Met Gly Thr Lys Gly
Met Arg Tyr Ser Leu 85 90 95Gln Ser Arg Glu Ile Ile Ala Asp Ser Phe
Glu Thr Ile Met Met Ala 100 105 110Gln His Tyr Asp Ala Asn Ile Ala
Ile Pro Ser Cys Asp Lys Asn Met 115 120 125Pro Gly Val Met Met Ala
Met Gly Arg His Asn Arg Pro Ser Ile Met 130 135 140Val Tyr Gly Gly
Thr Ile Leu Pro Gly His Pro Thr Cys Gly Ser Ser145 150 155 160Lys
Ile Ser Lys Asn Ile Asp Ile Val Ser Ala Phe Gln Ser Tyr Gly 165 170
175Glu Tyr Ile Ser Lys Gln Phe Thr Glu Glu Glu Arg Glu Asp Val Val
180 185 190Glu His Ala Cys Pro Gly Pro Gly Ser Cys Gly Gly Met Tyr
Thr Ala 195 200 205Asn Thr Met Ala Ser Ala Ala Glu Val Leu Gly Leu
Thr Ile Pro Asn 210 215 220Ser Ser Ser Phe Pro Ala Val Ser Lys Glu
Lys Leu Ala Glu Cys Asp225 230 235 240Asn Ile Gly Glu Tyr Ile Lys
Lys Thr Met Glu Leu Gly Ile Leu Pro 245 250 255Arg Asp Ile Leu Thr
Lys Glu Ala Phe Glu Asn Ala Ile Thr Tyr Val 260 265 270Val Ala Thr
Gly Gly Ser Thr Asn Ala Val Leu His Leu Val Ala Val 275 280 285Ala
His Ser Ala Gly Val Lys Leu Ser Pro Asp Asp Phe Gln Arg Ile 290 295
300Ser Asp Thr Thr Pro Leu Ile Gly Asp Phe Lys Pro Ser Gly Lys
Tyr305 310 315 320Val Met Ala Asp Leu Ile Asn Val Gly Gly Thr Gln
Ser Val Ile Lys 325 330 335Tyr Leu Tyr Glu Asn Asn Met Leu His Gly
Asn Thr Met Thr Val Thr 340 345 350Gly Asp Thr Leu Ala Glu Arg Ala
Lys Lys Ala Pro Ser Leu Pro Glu 355 360 365Gly Gln Glu Ile Ile Lys
Pro Leu Ser His Pro Ile Lys Ala Asn Gly 370 375 380His Leu Gln Ile
Leu Tyr Gly Ser Leu Ala Pro Gly Gly Ala Val Gly385 390 395 400Lys
Ile Thr Gly Lys Glu Gly Thr Tyr Phe Lys Gly Arg Ala Arg Val 405 410
415Phe Glu Glu Glu Gly Ala Phe Ile Glu Ala Leu Glu Arg Gly Glu Ile
420 425 430Lys Lys Gly Glu Lys Thr Val Val Val Ile Arg Tyr Glu Gly
Pro Arg 435 440 445Gly Ala Pro Gly Met Pro Glu Met Leu Lys Pro Ser
Ser Ala Leu Met 450 455 460Gly Tyr Gly Leu Gly Lys Asp Val Ala Leu
Leu Thr Asp Gly Arg Phe465 470 475 480Ser Gly Gly Ser His Gly Phe
Leu Ile Gly His Ile Val Pro Glu Ala 485 490 495Ala Glu Gly Gly Pro
Ile Gly Leu Val Arg Asp Gly Asp Glu Ile Ile 500 505 510Ile Asp Ala
Asp Asn Asn Lys Ile Asp Leu Leu Val Ser Asp Lys Glu 515 520 525Met
Ala Gln Arg Lys Gln Ser Trp Val Ala Pro Pro Pro Arg Tyr Thr 530 535
540Arg Gly Thr Leu Ser Lys Tyr Ala Lys Leu Val Ser Asn Ala Ser
Asn545 550 555 560Gly Cys Val Leu Asp Ala 5652711PRTArtificial
SequenceConserved DHAD motif 27Pro Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa
Ile Leu1 5 102811PRTArtificial SequenceConserved DHAD motif 28Pro
Ile Lys Xaa Xaa Gly Xaa Xaa Xaa Ile Leu1 5 102936DNAArtificial
SequencePrimer 387 29gtcacagtcg acatggctaa ctacttcaat acactg
363033DNAArtificial SequencePrimer 388 30gcataaggat ccttaacccg
caacagcaat acg 333136DNAArtificial SequencePrimer 410 31gactttgtcg
acatgcttta cccagaaaaa tttcag 363241DNAArtificial SequencePrimer 411
32ctaatagcgg ccgcctattt atggaatttc ttatcataat c 413335DNAArtificial
SequencePrimer 637 33ttttgagctc gccgatccca ttaccgacat ttggg
353496DNAArtificial SequencePrimer 638 34aaagtcgaca ccgatatacc
tgtatgtgtc accaccaatg tatctataag tatccatgct 60agccctaggt ttatgtgatg
attgattgat tgattg 963536DNAArtificial SequencePrimer 697
35gagtacggat ccctagagag ctttcgtttt catgag 363637DNAArtificial
SequencePrimer 767 36caagaagtcg acatgttgac aaaagcaaca aaagaac
373732DNAArtificial SequencePrimer 1149 37cgcttactcg agatgggccg
cgatgaattc gc 323833DNAArtificial SequencePrimer 1150 38gcataaagat
ctttaacccg caacagcaat acg 333935DNAArtificial SequencePrimer 1151
39agacgtgtcg acatgactgg catgactgat gcaga 354034DNAArtificial
SequencePrimer 1152 40gtttagggat cctcatccac ccaacttcga tttg
344131DNAArtificial SequencePrimer 1006 41gtagaagacg tcacctggta
gaccaaagat g 314232DNAArtificial SequencePrimer 1009 42catcgtgacg
tcgctcaatt gactgctgct ac 324332DNAArtificial SequencePrimer 1016
43actaagcgac acgtgcggtt tctgtggtat ag 324436DNAArtificial
SequencePrimer 1017 44gaaaccgcac gtgtcgctta gtttacattt ctttcc
36451647DNAArtificial SequenceL. lactis kivD (codon optimized for
E. coli) in pGV1590 45atgtatactg ttggtgatta tctgctggac cgtctgcatg
aactgggtat cgaagaaatc 60ttcggcgttc cgggtgatta caatctgcag ttcctggatc
agatcatctc tcataaagac 120atgaaatggg tgggtaacgc taacgaactg
aacgcaagct acatggcaga tggttatgca 180cgtaccaaga aagccgcggc
atttctgacc actttcggtg ttggcgaact gagcgccgtc 240aacggtctgg
cgggctccta cgccgaaaac ctgccggtgg tggagatcgt aggcagccca
300acgagcaaag ttcagaacga aggtaaattc gtccaccaca ctctggctga
cggcgatttc 360aaacacttca tgaaaatgca tgaacctgtg actgcggcac
gtacgctgct gactgcagag 420aacgctactg tggaaatcga ccgcgttctg
tctgcgctgc tgaaagaacg caaaccagtt 480tacatcaacc tgcctgtgga
tgttgcggca gctaaagcgg aaaaaccgag cctgccgctg 540aagaaagaaa
actccacttc taacactagc gaccaggaaa tcctgaacaa aatccaggag
600tctctgaaaa acgcaaagaa accaatcgtg atcaccggcc acgaaatcat
ttcttttggt 660ctggagaaga ccgtgaccca attcatcagc aaaaccaaac
tgccgattac caccctgaac 720ttcggcaagt cctctgttga cgaggctctg
ccgtctttcc tgggcatcta caacggtact 780ctgagcgaac cgaacctgaa
agaatttgtt gaatctgcgg acttcatcct gatgctgggc 840gttaaactga
ccgactcttc taccggtgca ttcactcacc atctgaacga aaacaaaatg
900attagcctga acatcgacga gggtaaaatc ttcaacgagc gtatccagaa
cttcgacttc 960gaaagcctga tcagctctct gctggacctg tccgaaatcg
agtataaagg caaatacatt 1020gacaaaaagc aagaagattt cgtaccatct
aacgcactgc tgtcccagga tcgcctgtgg 1080caggccgtgg agaacctgac
ccagagcaat gaaaccatcg tggcggaaca aggtacgagc 1140tttttcggcg
cgtcttctat ctttctgaaa tccaaaagcc attttatcgg tcagccgctg
1200tggggtagca ttggctatac tttcccggca gcgctgggct ctcagatcgc
tgataaagaa 1260tctcgtcatc tgctgttcat cggtgacggt tccctgcagc
tgaccgtaca ggaactgggt 1320ctggcaattc gtgaaaagat caacccgatt
tgcttcatta ttaacaatga cggctacacc 1380gttgagcgtg agatccacgg
tccgaaccag tcttacaacg atatccctat gtggaactac 1440tctaaactgc
cggagtcctt cggcgcaact gaggaccgtg ttgtgtctaa aattgtgcgt
1500accgaaaacg aatttgtgag cgtgatgaaa gaggcccagg ccgatccgaa
ccgtatgtac 1560tggatcgaac tgatcctggc gaaagaaggc gcaccgaagg
tactgaagaa aatgggcaag 1620ctgtttgctg aacagaataa atcctaa
1647461086DNAArtificial SequenceS. cerevisiae ADH7 in pGV1590
46atgctttacc cagaaaaatt tcagggcatc ggtatttcca acgcaaagga ttggaagcat
60cctaaattag tgagttttga cccaaaaccc tttggcgatc atgacgttga tgttgaaatt
120gaagcctgtg gtatctgcgg atctgatttt catatagccg ttggtaattg
gggtccagtc 180ccagaaaatc aaatccttgg acatgaaata attggccgcg
tggtgaaggt tggatccaag 240tgccacactg gggtaaaaat cggtgaccgt
gttggtgttg gtgcccaagc cttggcgtgt 300tttgagtgtg aacgttgcaa
aagtgacaac gagcaatact gtaccaatga ccacgttttg 360actatgtgga
ctccttacaa ggacggctac atttcacaag gaggctttgc ctcccacgtg
420aggcttcatg aacactttgc tattcaaata ccagaaaata ttccaagtcc
gctagccgct 480ccattattgt gtggtggtat tacagttttc tctccactac
taagaaatgg ctgtggtcca 540ggtaagaggg taggtattgt tggcatcggt
ggtattgggc atatggggat tctgttggct 600aaagctatgg gagccgaggt
ttatgcgttt tcgcgaggcc actccaagcg ggaggattct 660atgaaactcg
gtgctgatca ctatattgct atgttggagg ataaaggctg gacagaacaa
720tactctaacg ctttggacct tcttgtcgtt tgctcatcat ctttgtcgaa
agttaatttt 780gacagtatcg ttaagattat gaagattgga ggctccatcg
tttcaattgc tgctcctgaa 840gttaatgaaa agcttgtttt aaaaccgttg
ggcctaatgg gagtatcaat ctcaagcagt 900gctatcggat ctaggaagga
aatcgaacaa ctattgaaat tagtttccga aaagaatgtc 960aaaatatggg
tggaaaaact tccgatcagc gaagaaggcg tcagccatgc ctttacaagg
1020atggaaagcg gagacgtcaa atacagattt actttggtcg attatgataa
gaaattccat 1080aaatag 1086471716DNAArtificial SequenceB. subtilis
alsS in pGV1726 47atgttgacaa aagcaacaaa agaacaaaaa tcccttgtga
aaagcagagg ggcggagctt 60gttgttgatt gcttagcgga gcaaggtgtc acacatgtat
ttggcattcc aggtgcaaaa 120attgatgcgg tatttgacgc tttacaagat
aaagggcctg aaattatcgt tgcccggcat 180gaacaaaatg cagcatttat
ggcgcaagca gtcggccgtt taactggaaa accgggagtc 240gtgttagtca
catcaggacc aggtgcttcg aacttggcaa caggactgct gacagcaaac
300actgaaggtg accctgtcgt tgcgcttgct gggaacgtga tccgtgcaga
tcgtttaaaa 360cggacacatc aatctttgga taatgcggcg ctattccagc
cgattacaaa atacagtgta 420gaagttcaag atgtaaaaaa tataccggaa
gctgttacaa atgcgtttag gatagcgtca 480gcagggcagg ctggggccgc
ttttgtgagt tttccgcaag atgttgtgaa tgaagtcaca 540aatacaaaaa
acgtacgtgc tgtcgcagcg ccaaaacttg gtcccgcagc agatgacgca
600atcagtatgg ccattgcaaa aattcaaaca gcaaaacttc ctgtcgtttt
agtcggcatg 660aagggcggaa gaccggaagc gattaaagcg gttcgcaagc
tattgaaaaa agtgcagctt 720ccattcgttg aaacatatca agctgccggt
actcttacga gagatttaga ggatcagtat 780tttggccgga tcggtttatt
ccgcaaccag cctggcgatc tgctgcttga gcaggctgat 840gttgttctga
caatcggcta tgacccaatt gaatatgatc cgaaattctg gaatgtcaat
900ggagaccgga cgatcatcca tttagacgag attctggctg acattgatca
tgcttaccag 960ccggatcttg aactgatcgg tgatattcca tctacgatca
atcatatcga acacgatgct 1020gtgaaagtag actttgcgga acgtgagcag
aagatccttt ctgatttaaa acaatatatg 1080catgagggtg agcaggtgcc
tgcagattgg aaatcagaca gagtgcatcc tcttgaaatc 1140gttaaagaat
tgcgaaacgc agtcgatgat catgttacag tgacttgcga tatcggttca
1200cacgcgattt ggatgtcacg ttatttccgc agctacgagc cgttaacatt
aatgattagt 1260aacggtatgc aaacactcgg cgttgcgctt ccttgggcaa
tcggcgcttc attggtgaaa 1320ccgggagaaa aagtagtatc agtctccggt
gatggcggtt tcttattctc agctatggaa 1380ttagagacag cagttcgttt
aaaagcacca attgtacaca ttgtatggaa cgacagcaca 1440tatgacatgg
ttgcattcca gcaattgaaa aaatataatc gtacatctgc ggtcgatttc
1500ggaaatatcg atatcgtgaa atacgcggaa agcttcggag caactggctt
acgcgtagaa 1560tcaccagacc agctggcaga tgttctgcgt caaggcatga
acgctgaggg gcctgtcatc 1620attgatgtcc cggttgacta cagtgataac
gttaatttag caagtgacaa gcttccgaaa 1680gaattcgggg aactcatgaa
aacgaaagct ctctag 1716481410DNAArtificial SequenceE. coli
ilvCdeltaN in pGV1727 48atgggccgcg atgaattcgc cgatggcgcg agctaccttc
agggtaaaaa agtagtcatc 60gtcggctgtg gcgcacaggg tctgaaccag ggcctgaaca
tgcgtgattc tggtctcgat 120atctcctacg ctctgcgtaa agaagcgatt
gccgagaagc gcgcgtcctg gcgtaaagcg 180accgaaaatg gttttaaagt
gggtacttac gaagaactga tcccacaggc ggatctggtg 240attaacctga
cgccggacaa gcagcactct gatgtagtgc gcaccgtaca gccactgatg
300aaagacggcg cggcgctggg ctactcgcac ggtttcaaca tcgtcgaagt
gggcgagcag 360atccgtaaag atatcaccgt agtgatggtt gcgccgaaat
gcccaggcac cgaagtgcgt 420gaagagtaca aacgtgggtt cggcgtaccg
acgctgattg ccgttcaccc ggaaaacgat 480ccgaaaggcg aaggcatggc
gattgccaaa gcctgggcgg ctgcaaccgg tggtcaccgt 540gcgggtgtgc
tggaatcgtc cttcgttgcg gaagtgaaat ctgacctgat gggcgagcaa
600accatcctgt gcggtatgtt gcaggctggc tctctgctgt gcttcgacaa
gctggtggaa 660gaaggtaccg atccagcata cgcagaaaaa ctgattcagt
tcggttggga aaccatcacc 720gaagcactga aacagggcgg catcaccctg
atgatggacc gtctctctaa cccggcgaaa 780ctgcgtgctt atgcgctttc
tgaacagctg aaagagatca tggcacccct gttccagaaa 840catatggacg
acatcatctc cggcgaattc tcttccggta tgatggcgga ctgggccaac
900gatgataaga aactgctgac ctggcgtgaa gagaccggca aaaccgcgtt
tgaaaccgcg 960ccgcagtatg aaggcaaaat cggcgagcag gagtacttcg
ataaaggcgt actgatgatt 1020gcgatggtga aagcgggcgt tgaactggcg
ttcgaaacca tggtcgattc cggcatcatt 1080gaagagtctg catattatga
atcactgcac gagctgccgc tgattgccaa caccatcgcc 1140cgtaagcgtc
tgtacgaaat gaacgtggtt atctctgata ccgctgagta cggtaactat
1200ctgttctctt acgcttgtgt gccgttgctg aaaccgttta tggcagagct
gcaaccgggc 1260gacctgggta aagctattcc ggaaggcgcg gtagataacg
ggcaactgcg tgatgtgaac 1320gaagcgattc gcagccatgc gattgagcag
gtaggtaaga aactgcgcgg ctatatgaca 1380gatatgaaac gtattgctgt
tgcgggttaa 1410491782DNAArtificial SequenceE. coli ilvDdeltaN
(codon optimized for K. lactis) in pGV1727 49atgactggca tgactgatgc
agatttcgga aagccaatca ttgccgtcgt caactctttt 60acacaattcg ttccgggtca
tgtccatttg cgtgatctag gtaagcttgt tgccgaacaa 120attgaagctg
caggtggtgt cgcaaaagag tttaatacta ttgctgtgga cgacggtata
180gctatggggc atggcggtat gttatactct ttaccatcga gagaattaat
tgcagactca 240gtcgaatata tggttaatgc tcattgtgcc gatgcaatgg
tttgtatctc taattgtgat 300aagataacgc ctggtatgtt gatggcgtcc
ttgagattga acatcccagt aatcttcgta 360tctggcggcc caatggaggc
tggtaaaact aagttaagtg atcagatcat caaacttgat 420cttgtggatg
caatgattca aggtgcagat ccaaaagttt cagactcgca gtcagaccaa
480gttgaaagaa gtgcatgtcc aacttgtggt tcttgcagtg gaatgttcac
ggctaactct 540atgaattgct tgactgaagc tctaggttta tctcaaccag
gaaatggttc attattagcg 600acccatgcag acagaaagca attgttctta
aatgccggaa aaagaattgt ggaactaacg 660aaaaggtatt acgaacaaaa
tgatgaatca gcattaccga ggaatatagc ttcaaaggct 720gcattcgaaa
atgccatgac attggatatt gcaatgggtg gtagtacaaa cacggtctta
780catcttctag ctgcagccca agaagctgag atagatttca ccatgtctga
tatcgacaag 840ctttcacgta aggttccaca gttatgtaag gttgcaccat
caactcaaaa gtatcacatg 900gaagacgttc atcgtgcagg aggggttatt
ggtattttag gggagttgga cagagccggt 960cttttaaaca gggatgtgaa
gaatgtattg ggtttaacac ttccacagac attagagcaa 1020tacgatgtca
tgttaactca agatgatgcc gtgaaaaaca tgttcagggc aggtccagca
1080gggatcagaa ccacccaagc attctcgcaa gactgtaggt gggacacttt
ggacgatgat 1140agagcaaatg gatgtataag atcgcttgag catgcttata
gtaaggatgg tggtttagca 1200gtattatatg gaaacttcgc tgaaaatggt
tgcattgtga aaactgctgg tgtagatgat 1260agtattttga aatttactgg
acccgctaaa gtttacgaaa gtcaagacga tgctgttgag 1320gctatacttg
gcggaaaggt ggtagcagga gacgtggtag tgataagata tgagggacca
1380aagggaggac caggtatgca ggaaatgctt tacccaactt catttttgaa
gtccatggga 1440ctaggaaaag cttgtgccct tatcactgac ggtagattct
ctggtggcac ttcgggttta 1500agtatcggtc acgtatcacc agaggcagct
tctggtggtt cgattggatt gattgaagat 1560ggagatttga tcgccataga
tatcccaaat agaggtatcc aattacaagt ctcagacgct 1620gaattggctg
caagaagaga agcacaagat gccagaggag ataaggcttg gactcctaaa
1680aatagagaac gtcaagtaag tttcgccctt agggcttatg cttcattggc
tacttcagcc 1740gataaggggg cagtaagaga caaatcgaag ttgggtggat ga
17825039DNAArtificial SequencePrimer 575 50ttttgaattc tggttctatc
gaggagaaaa agcgacaag 395135DNAArtificial SequencePrimer 576
51ttttggatcc ggatgtgaag tcgttgacac agtcg 355222DNAArtificial
SequencePrimer 1623 52gtctctgata aggaaatggc tc 225360DNAArtificial
SequencePrimer 1886 53tcaagaagcc tcaagtcggg gttggttcct gttggtggtc
cggtaaccca tgtaacatgc 605460DNAArtificial SequencePrimer 1887
54cggtaaccca tgtaacatgc atctattgga cttgaataac attctggttc tatcgaggag
605560DNAArtificial SequencePrimer 1888 55ctttcgttaa caagcccatc
tctacttttt tcttggctgt atccggatgt gaagtcgttg 605660DNAArtificial
SequencePrimer 1889 56gatgggcttg ttaacgaaag ttgctacatc tagacaattc
tgcattatag gccccaatcg 605720DNAArtificial SequencePrimer 1890
57ttagtggcag caaagcagag 205820DNAArtificial SequencePrimer 1892
58acatgatgcc cgttcacaac 205920DNAArtificial SequencePrimer 1916
59caggatgaca gttcgatgag 206020DNAArtificial SequencePrimer 1917
60tgtcaacgac ttcacatccg 206120DNAArtificial SequencePrimer 1920
61tgcagcctag ctttgaagac 206220DNAArtificial SequencePrimer 1921
62tacgttagga ccccagtatc 206367DNAArtificial SequencePrimer 271
63ctagcatgga acaaaaactc atctcagaag aagatggtgt cgacgaattc ccgggatccg
60cggccgc 676467DNAArtificial SequencePrimer 272 64tcgagcggcc
gcggatcccg ggaattcgtc gacaccatct tcttctgaga tgagtttttg 60ttccatg
676534DNAArtificial SequencePrimer 421 65gccaacggat cctcaagcat
ctaaaacaca accg 346636DNAArtificial SequencePrimer 551 66gctcatgtcg
acatgaagaa gctcaacaag tactcg 366735DNAArtificial SequencePrimer
1617 67cgttgagtcg acatgggctt gttaacgaaa gttgc
356834DNAArtificial SequencePrimer 1618 68gccaacggat cctcaagcat
ctaaaacaca accg 34696654DNAArtificial SequencepGV1730 69caggcaagtg
cacaaacaat acttaaataa atactactca gtaataacct atttcttagc 60atttttgacg
aaatttgcta ttttgttaga gtcttttaca ccatttgtct ccacacctcc
120gcttacatca acaccaataa cgccatttaa tctaagcgca tcaccaacat
tttctggcgt 180cagtccacca gctaacataa aatgtaagct ttcggggctc
tcttgccttc caacccagtc 240agaaatcgag ttccaatcca aaagttcacc
tgtcccacct gcttctgaat caaacaaggg 300aataaacgaa tgaggtttct
gtgaagctgc actgagtagt atgttgcagt cttttggaaa 360tacgagtctt
ttaataactg gcaaaccgag gaactcttgg tattcttgcc acgactcatc
420tccatgcagt tggacgatat caatgccgta atcattgacc agagccaaaa
catcctcctt 480aggttgatta cgaaacacgc caaccaagta tttcggagtg
cctgaactat ttttatatgc 540ttttacaaga cttgaaattt tccttgcaat
aaccgggtca attgttctct ttctattggg 600cacacatata atacccagca
agtcagcatc ggaatctaga gcacattctg cggcctctgt 660gctctgcaag
ccgcaaactt tcaccaatgg accagaacta cctgtgaaat taataacaga
720catactccaa gctgcctttg tgtgcttaat cacgtatact cacgtgctca
atagtcacca 780atgccctccc tcttggccct ctccttttct tttttcgacc
gaattaattc ttaatcggca 840aaaaaagaaa agctccggat caagattgta
cgtaaggtga caagctattt ttcaataaag 900aatatcttcc actactgcca
tctggcgtca taactgcaaa gtacacatat attacgatgc 960tgtctattaa
atgcttccta tattatatat atagtaatgt cgttgacgtc gccggcagga
1020gagtgaaaga gccttgttta tatatttttt tttcctatgt tcaacgagga
cagctaggtt 1080tatgcaaaaa tgtgccatca ccataagctg attcaaatga
gctaaaaaaa aaatagttag 1140aaaataaggt ggtgttgaac gatagcaagt
agatcaagac accgtctaac agaaaaaggg 1200gcagcggaca atattatgca
attatgaaga aaagtactca aagggtcgga aaaatattca 1260aacgatattt
gcattaaatc ctcaattgat tgattattcc atagtaaaat accgtaacaa
1320cacaaaattg ttctcaaatt cataaattat tcattttttc cacgagcctc
atcacacgaa 1380aagtcagaag agcatacata atcttttaaa tgcataggtt
atgcattttg caaatgccac 1440caggcaacaa aaatatgcgt ttagcgggcg
gaatcgggaa ggaagccgga accaccaaaa 1500actggaagct acgtttttaa
ggaaggtatg ggtgcagtgt gcttatctca agaaatatta 1560gttatgatat
aaggtgttga agtttagaga taggtaaata aacgcggggt gtgtttatta
1620catgaagaag aagttagttt ctgccttgct tgtttatctt gcacatcaca
tcagcggaac 1680atatgctcac ccagtcgcga catccaattt atagaaatca
gcttgtgggt attgttcaga 1740gaatttttca atcattggag caatcatttt
acatggaccg caccaagtgg cgtagaaatc 1800tacgacaact agcttgtctt
gagcaattgc agagtcgaat tcgctggcag ttttgaattg 1860agtaaccatt
atttgtatcg aggtgtctag tcttctatta cactaatgca gtttcagggt
1920tttggaaacc acactgttta aacagtgttc cttaatcaag gatacctctt
tttttttcct 1980tggttccact aattcatcgg tttttttttt ggaagacatc
ttttccaacg aaaagaatat 2040acatatcgtt taagagaaat tctccaaatt
tgtaaagaag cggacccaga cttaagccta 2100accaggccaa ttcaacagac
tgtcggcaac ttcttgtctg gtctttccat ggtaagtgac 2160agtgcagtaa
taatatgaac caatttattt ttcgttacat aaaaatgctt ataaaacttt
2220aactaataat tagagattaa atcgcggccg cggatcccta gagagctttc
gttttcatga 2280gttccccgaa ttctttcgga agcttgtcac ttgctaaatt
aacgttatca ctgtagtcaa 2340ccgggacatc aatgatgaca ggcccctcag
cgttcatgcc ttgacgcaga acatctgcca 2400gctggtctgg tgattctacg
cgtaagccag ttgctccgaa gctttccgcg tatttcacga 2460tatcgatatt
tccgaaatcg accgcagatg tacgattata ttttttcaat tgctggaatg
2520caaccatgtc atatgtgctg tcgttccata caatgtgtac aattggtgct
tttaaacgaa 2580ctgctgtctc taattccata gctgagaata agaaaccgcc
atcaccggag actgatacta 2640ctttttctcc cggtttcacc aatgaagcgc
cgattgccca aggaagcgca acgccgagtg 2700tttgcatacc gttactaatc
attaatgtta acggctcgta gctgcggaaa taacgtgaca 2760tccaaatcgc
gtgtgaaccg atatcgcaag tcactgtaac atgatcatcg actgcgtttc
2820gcaattcttt aacgatttca agaggatgca ctctgtctga tttccaatct
gcaggcacct 2880gctcaccctc atgcatatat tgttttaaat cagaaaggat
cttctgctca cgttccgcaa 2940agtctacttt cacagcatcg tgttcgatat
gattgatcgt agatggaata tcaccgatca 3000gttcaagatc cggctggtaa
gcatgatcaa tgtcagccag aatctcgtct aaatggatga 3060tcgtccggtc
tccattgaca ttccagaatt tcggatcata ttcaattggg tcatagccga
3120ttgtcagaac aacatcagcc tgctcaagca gcagatcgcc aggctggttg
cggaataaac 3180cgatccggcc aaaatactga tcctctaaat ctctcgtaag
agtaccggca gcttgatatg 3240tttcaacgaa tggaagctgc acttttttca
atagcttgcg aaccgcttta atcgcttccg 3300gtcttccgcc cttcatgccg
actaaaacga caggaagttt tgctgtttga atttttgcaa 3360tggccatact
gattgcgtca tctgctgcgg gaccaagttt tggcgctgcg acagcacgta
3420cgttttttgt atttgtgact tcattcacaa catcttgcgg aaaactcaca
aaagcggccc 3480cagcctgccc tgctgacgct atcctaaacg catttgtaac
agcttccggt atatttttta 3540catcttgaac ttctacactg tattttgtaa
tcggctggaa tagcgccgca ttatccaaag 3600attgatgtgt ccgttttaaa
cgatctgcac ggatcacgtt cccagcaagc gcaacgacag 3660ggtcaccttc
agtgtttgct gtcagcagtc ctgttgccaa gttcgaagca cctggtcctg
3720atgtgactaa cacgactccc ggttttccag ttaaacggcc gactgcttgc
gccataaatg 3780ctgcattttg ttcatgccgg gcaacgataa tttcaggccc
tttatcttgt aaagcgtcaa 3840ataccgcatc aatttttgca cctggaatgc
caaatacatg tgtgacacct tgctccgcta 3900agcaatcaac aacaagctcc
gcccctctgc ttttcacaag ggatttttgt tcttttgttg 3960cttttgtcaa
catgtcgact ttatgtgatg attgattgat tgattgtaca gtttgttttt
4020cttaatatct atttcgatga cttctatatg atattgcact aacaagaaga
tattataatg 4080caattgatac aagacaagga gttatttgct tctcttttat
atgattctga caatccatat 4140tgcgttggta gtcttttttg ctggaacggt
tcagcggaaa agacgcatcg ctctttttgc 4200ttctagaaga aatgccagca
aaagaatctc ttgacagtga ctgacagcaa aaatgtcttt 4260ttctaactag
taacaaggct aagatatcag cctgaaataa agggtggtga agtaataatt
4320aaatcatccg tataaaccta tacacatata tgaggaaaaa taatacaaaa
gtgttttaaa 4380tacagataca tacatgaaca tatgcacgta tagcgcccaa
atgtcggtaa tgggatcggc 4440gagctccagc ttttgttccc tttagtgagg
gttaattgcg cgcttggcgt aatcatggtc 4500atagctgttt cctgtgtgaa
attgttatcc gctcacaatt ccacacaaca taggagccgg 4560aagcataaag
tgtaaagcct ggggtgccta atgagtgagg taactcacat taattgcgtt
4620gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt
aatgaatcgg 4680ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct
tccgcttcct cgctcactga 4740ctcgctgcgc tcggtcgttc ggctgcggcg
agcggtatca gctcactcaa aggcggtaat 4800acggttatcc acagaatcag
gggataacgc aggaaagaac atgtgagcaa aaggccagca 4860aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc
4920tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga
caggactata 4980aagataccag gcgtttcccc ctggaagctc cctcgtgcgc
tctcctgttc cgaccctgcc 5040gcttaccgga tacctgtccg cctttctccc
ttcgggaagc gtggcgcttt ctcatagctc 5100acgctgtagg tatctcagtt
cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 5160accccccgtt
cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc
5220ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta
gcagagcgag 5280gtatgtaggc ggtgctacag agttcttgaa gtggtggcct
aactacggct acactagaag 5340gacagtattt ggtatctgcg ctctgctgaa
gccagttacc ttcggaaaaa gagttggtag 5400ctcttgatcc ggcaaacaaa
ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 5460gattacgcgc
agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga
5520cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat
caaaaaggat 5580cttcacctag atccttttaa attaaaaatg aagttttaaa
tcaatctaaa gtatatatga 5640gtaaacttgg tctgacagtt accaatgctt
aatcagtgag gcacctatct cagcgatctg 5700tctatttcgt tcatccatag
ttgcctgact ccccgtcgtg tagataacta cgatacggga 5760gggcttacca
tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc
5820agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg
gtcctgcaac 5880tttatccgcc tccatccagt ctattaattg ttgccgggaa
gctagagtaa gtagttcgcc 5940agttaatagt ttgcgcaacg ttgttgccat
tgctacaggc atcgtggtgt cacgctcgtc 6000gtttggtatg gcttcattca
gctccggttc ccaacgatca aggcgagtta catgatcccc 6060catgttgtgc
aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt
6120ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta
ctgtcatgcc 6180atccgtaaga tgcttttctg tgactggtga gtactcaacc
aagtcattct gagaatagtg 6240tatgcggcga ccgagttgct cttgcccggc
gtcaatacgg gataataccg cgccacatag 6300cagaacttta aaagtgctca
tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 6360cttaccgctg
ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc
6420atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa
atgccgcaaa 6480aaagggaata agggcgacac ggaaatgttg aatactcata
ctcttccttt ttcaatatta 6540ttgaagcatt tatcagggtt attgtctcat
gagcggatac atatttgaat gtatttagaa 6600aaataaacaa ataggggttc
cgcgcacatt tccccgaaaa gtgccacctg acgt 6654701728DNAArtificial
SequenceB. subtilis alsS in pGV1730 70gtcgacatgt tgactaaagc
tacaaaagag cagaaatcat tggtgaaaaa taggggtgca 60gaacttgttg tggactgttt
ggtagaacag ggcgtaacac atgtttttgg tatcccaggt 120gcaaaaatcg
acgccgtgtt tgatgcatta caagacaagg gtccagaaat tattgttgct
180agacatgagc aaaatgccgc atttatggcg caagctgtag gtaggcttac
aggtaaacct 240ggtgttgtcc tagttacgtc tggcccagga gcctccaatt
tagcaactgg tctattgaca 300gctaatactg agggagatcc tgtagttgcg
ttagccggta atgtaattag agctgatagg 360cttaagagaa ctcaccagtc
tctagacaac gctgctttat tccaaccgat caccaagtac 420tcagtagagg
tacaagacgt aaagaatata cctgaagctg tgacaaacgc atttcgtata
480gcttctgctg gtcaggctgg tgccgcgttt gtttcttttc ctcaagacgt
tgtcaatgaa 540gtgaccaata ctaaaaacgt tagagcggtt gcagccccta
aactaggtcc agccgcagac 600gacgcaatta gcgctgcaat tgctaaaatt
cagacggcga aactaccagt agtccttgtc 660ggtatgaagg gcggaagacc
agaagcaata aaagctgttc gtaagttatt gaagaaagtc 720caattacctt
tcgttgagac ttaccaagca gcaggtactt tatctagaga tttagaggat
780cagtattttg gaaggatagg tctatttaga aaccaaccag gagatttact
attagaacaa 840gctgatgttg tacttactat cggttatgat cctatagagt
atgacccaaa gttttggaac 900ataaatgggg atagaacaat tatacatcta
gacgagataa tcgccgacat cgatcacgct 960tatcaaccag atttagaact
aatcggagat atcccgtcaa caatcaatca tattgaacat 1020gatgctgtaa
aggttgagtt cgctgaacgt gagcagaaaa tcttatctga tctaaagcaa
1080tatatgcatg agggtgaaca agttccagca gactggaaat ctgaccgtgc
acatcctttg 1140gaaatcgtta aggaactaag aaatgcggtc gatgatcatg
tgactgttac atgtgatatc 1200ggttcacatg caatttggat gtcacgttat
tttaggagct acgaaccatt aactttaatg 1260atatctaacg ggatgcaaac
tctgggggtt gcacttcctt gggctattgg cgctagttta 1320gttaagcccg
gtgagaaggt ggtatcggta tcaggtgatg gtggctttct gttttcggct
1380atggaattag aaactgcagt ccgtttaaaa gctcccattg tgcatattgt
ctggaatgat 1440tctacttacg acatggttgc ttttcaacag ttgaagaaat
acaatagaac ttcggctgta 1500gactttggta acatcgatat tgtgaaatat
gctgagtctt ttggcgcaac aggcctgagg 1560gtggaaagtc cagatcagtt
agctgatgtg ttgagacaag ggatgaatgc cgagggaccg 1620gtaatcatag
atgtgccagt tgactactca gacaatatta atttggcttc tgataaactt
1680cctaaagagt ttggcgagct aatgaagacc aaagccttat aaggatcc
1728711698DNATrichoderma atroviride 71gtcgacatga caaaggatac
cgttgacatt ttgattgatt ctttaaaagc agcaggtgta 60aaatatgttt tcggcgttcc
gggagcgaaa attgactccg tgtttaatgc cctaatcgat 120catccagaca
tcaagttagt tgtatgtaga cacgaacaaa acgccgcctt tatcgcagca
180gctatgggta aggttaccgg tagacctggt gtctgcatcg ctacaagtgg
gcctgggact 240tctaatttgg ttacaggcct ggttacagcg accgacgaag
gggcgccggt tgttgctata 300gtgggttcag ttaaacgtag tcaatcatta
caaagaactc atcagtcgct aaggggagcc 360gacctgttgg ctcccgttac
caagaaggtg gtaagtgccg ttgtcgaaga tcaagttgcc 420gaaatcatgt
tggatgcatt tcgtgttgca gctgcttccc ctccaggcgc taccgctgtg
480tctcttccca tcgatctgat gacgccagcc aaatctactt ctaccgttac
ggccttccca 540gctgaatgtt tcatacctcc aaaatacggc aaaagccctg
aaactacatt acaagccgca 600gccgatttga taagcgccgc caaagctcca
gttctattct tagggatgcg tgttagcgag 660tctgacgata caattagcgc
agtacacggt tttcttcgta agcatcctgt tccagttgtg 720gaaacctttc
aagctgcagg cgcgatttcc aaagagctag tgcacttatt ttatggtaga
780atcggtttat tttctaatca accgggtgat caattgctac aacatgcgga
cctagtaata 840gcgatcggct tagatcaagc tgagtatgac gctaatatgt
ggaacgccag aggcacaaca 900attttacatg tcgatataca accagcggac
tttgttgctc attataaacc taagatcgag 960ctggtcggtt cactagcaga
caacatgaca gatttgactt ctaggttgga tacggtcgct 1020aggctacaat
taacgaaacc tggtgaagcc attagaacca acatgtggga atggcaaaat
1080tccccggaag cctccggtag atcaacgggt cctgttcatc cattgcactt
tattagacta 1140tttcaatcca ttattgaccc gagcaccact gtaattagtg
atgtaggtag tgtgtatatc 1200tggttgtgca gatacttcta ctcttacgct
cgtagaactt tcctgatgag taacgtgcag 1260caaacacttg gagtcgctat
gccttgggcg ataggggtat ctttatctca gacgccacct 1320agtagtaaga
aagttgtatc cattagcggt gatggtggtt ttatgttctc ttcacaagag
1380ttggtgacag ctgttcaaca aggttgcaac atcactcatt ttatatggaa
cgatggaaaa 1440tataacatgg tggaatttca agaagttaat aagtatggta
ggtcatccgg cgtggatcta 1500ggtggagtgg attttgtaaa gttagctgat
agtatgggag ccaagggttt aagagtatca 1560agtgctggcg atcttgaagc
cgtaatgaag gaagcattag catacgacgg tgtatgtttg 1620gttgacatag
aaattgacta ctctcaaaac cataacttaa tgatggattt ggtaacatcc
1680gatgtatctt aaggatcc 1698721720DNATalaromyces stipitatus
72agtcgacatg tctaacagga acccttctca cgtgatagtg gagtcattat ctaatgccgg
60cgttaagata gttttcggga taccaggtgc aaaggtcgat ggtatctttg atgcattgtc
120agatcatcct actatcaagt tgattgtgtg tagacatgaa cagaacgctg
cctttatggc 180agccgcagtt ggacgtctta ctggcgcccc gggtgtctgc
ttagtaacga gtgggcctgg 240aacttctaat ttggtaaccg gtttagctac
tgccactaca gaaggtgatc ctgttttagc 300aatagctgga acagtctcta
gattgcaagc agctaggcat actcatcaaa gtttagatgt 360taacaaagta
ttagaagggg tctgtaagag tgtaatacaa gtcggggtgg aagatcaagt
420gagtgaagta atcgctaatg cttttagaca tgcgaggcaa ttcccacaag
gagccaccgc 480agttgcgctg ccaatggata taataaaatc tacttccgtg
ggtgtgccac cttttccatc 540tctatcattc gaggcaccag gttatggtag
ttccaatacg aaactttgta aagtagcggt 600cgataaacta attgcggcga
aatatccagt gatactgctg ggaatgagat cctcagaccc 660tgagattgta
gcttcagtcc gtcgtatgat aaaagatcat accttgcctg tagttgaaac
720ttttcaagct gcgggagcca tctcagaaga tttgcttcat agatactatg
ggagggtggg 780tttattccgt aatcaacctg gtgacaaagt actagcaaga
gcagacctga ttattgcagt 840tggctacgat ccatacgaat atgatgcaga
aacatggaat gtcaataatc cagcaaccat 900acacaacatt attcacattg
attacacaca ttccagggtg tcacaacact atatgcctca 960tgttgagcta
ctgggaaacc cagcggatat cgtcgatgaa ttgacggcca gtttacaggc
1020cctaaaacca aacttttggt ctggggctga agatacctta gaaaatatta
ggcaagaaat 1080agctcgttgt gaagccactg ccactcatac tgaatctttg
caagatggcg cggttcagcc 1140tactcacttc gtatatcaat tgaggcatct
gttaccaaag gaaactattg ttgctgttga 1200tgtaggaacc gtctatatct
acatgatgag atacttccaa acctattcac cgagacactt 1260gctgtgtagt
aatggacaac aaactttggg agttggtttg ccttgggcta tagctgcttc
1320actaattcaa gaacctcctt gcagtaggaa ggttgtctct atatctggtg
atggcgggtt 1380tatgtttagt agccaagaac tggctacggc agtcttgcaa
aagtgtaaca taacccattt 1440tatttggaat gacagcggct acaacatggt
tgaatttcaa gaggaggcta agtatggtcg 1500tagctctggt ataaaactag
gcggtattga tttcgtcaaa tttgcagagg ctttcgacgg 1560tgcgcgtgga
ttccgtataa acagcaccaa agaagttaag gaggtcatta aagaggcact
1620agcctttgaa ggcgttgcta tagttgatgt cagaatcgat tattctagga
gtcatgaatt 1680aatgaaagat attattccaa aggactacca ataaggatcc
1720731635DNAArtificial SequencePiromyces ilvD with predicted MTS
removed 73atgggtagtc aagcgatgtt aatcgcaact ggtataaaac cagaagattt
aaaaaagcca 60cagatcggca taggcagtgt ttggtatgat ggaaatccat gcaacatgca
tctattggat 120cttggctccg tggtaaaaaa ggccgttcaa aaacaaaata
tgaatggtat gagattcaat 180atgattggag tgtcagacgg gatctccaac
ggtacggatg gaatgtcctt ttctttgcag 240tcccgtgaaa ttattgcgga
ttctatcgaa acaatcatgt ctgcacaata ttatgatgct 300aacatcagct
tacctggctg cgacaagaac atgcctggtt gtttaatcgc cgctgccaga
360ttgaacagac cgactataat tatctacggt ggcacgatca agcccggaca
tacaaaaaag 420ggagagacga ttgatttagt ctcggccttc caatgttatg
ggcaatactt ggctggagaa 480attactgaag agcaaagaga agaaatagtg
aataatgcat gtcctggcgc aggtgcatgc 540ggtggaatgt atacagctaa
tacaatggct tccataatcg aatcaatggg tatgagttta 600ccttactccg
cctcgacccc ggcagaagac ccattgaaag agcttgaatg tataaacgcg
660gcagctgcaa ttaagaattt aatggaaaaa gacatcaagc cattagacat
aatgacaaga 720aaagcgtttg agaacgctat aactattact ttgattcttg
gagggagtac aaactccgtt 780ctgcaccttt tggctatcgc tagggcctgc
aaagtcccat taactattga cgatttccag 840gaattttcta ataggatacc
cgttttagcc gacttaaaac ctagtggtaa atatgtcatg 900gaagatttgc
agttgatcgg cggtcttcca gctattcaga aatatcttct gaatgaaggt
960ctacttcatg gtgatattat gactgttacc ggaaagaccc tagcagagaa
tttgaaagac 1020gttgctccaa tcgattttga aactcaagat ataattagac
ctttatcgaa tcccattaaa 1080aagaatggtc acattatcat tatgaaaggt
aacgtctctc cggacggtgg tgttgctaaa 1140attacaggta agcagggatt
gtttttcgaa ggcgtggcga attgctttga ttgtgaagaa 1200gacatgttag
ctgcactgga aagaggcgaa attaaaaaag gtcaagtgat tataataagg
1260tatgaaggcc ccactggagg gcctggtatg ccggagatgc taactccgac
cagtgctatt 1320atgggtgctg ggttaggaaa agatgtagca ctattaacag
atggcagatt ttcaggcggg 1380tcacacggct tcattattgg tcatattacg
cctgaggcac aagtaggtgg tccaattgcc 1440ctaatcaaaa acggtgataa
gataactata gacgcgaata aacgtaccat acatgcccat 1500gtcagcgaag
aagaatttgc taaaagacgt gccgagtgga aagcaccacc ttacagagct
1560actcaaggta ctttaaagaa atacattaag ctggttaaac ccgcaaactt
tggatgtgtt 1620accgatgagt ggtaa 16357424DNAArtificial
SequencePrimer 351 74cttcttgctc attagaaaga aagc 247521DNAArtificial
SequencePrimer 1625 75caaggttacg gtcaaggttt g 217622DNAArtificial
SequencePrimer 1626 76cattggttcc ggttacgttt ac 227746DNAArtificial
SequencePrimer 1615 77caactcgcgg ccgcggatcc taggttattg gttttctggt
ctcaac 467832DNAArtificial SequencePrimer 1616 78cgccgactcg
agatgttgag aactcaagcc gc 327944DNAArtificial SequencePrimer 1809
79cgccgactcg aggtcgacat gggtttgaag caaatcaact tcgg
44807564DNAArtificial SequencepGV1354 80tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg
tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accataaacg acattactat atatataata taggaagcat ttaatagaca
gcatcgtaat 240atatgtgtac tttgcagtta tgacgccaga tggcagtagt
ggaagatatt ctttattgaa 300aaatagcttg tcaccttacg tacaatcttg
atccggagct tttctttttt tgccgattaa 360gaattaattc ggtcgaaaaa
agaaaaggag agggccaaga gggagggcat tggtgactat 420tgagcacgtg
agtatacgtg attaagcaca caaaggcagc ttggagtatg tctgttatta
480atttcacagg tagttctggt ccattggtga aagtttgcgg cttgcagagc
acagaggccg 540cagaatgtgc tctagattcc gatgctgact tgctgggtat
tatatgtgtg cccaatagaa 600agagaacaat tgacccggtt attgcaagga aaatttcaag
tcttgtaaaa gcatataaaa 660atagttcagg cactccgaaa tacttggttg
gcgtgtttcg taatcaacct aaggaggatg 720ttttggctct ggtcaatgat
tacggcattg atatcgtcca actgcatgga gatgagtcgt 780ggcaagaata
ccaagagttc ctcggtttgc cagttattaa aagactcgta tttccaaaag
840actgcaacat actactcagt gcagcttcac agaaacctca ttcgtttatt
cccttgtttg 900attcagaagc aggtgggaca ggtgaacttt tggattggaa
ctcgatttct gactgggttg 960gaaggcaaga gagccccgaa agcttacatt
ttatgttagc tggtggactg acgccagaaa 1020atgttggtga tgcgcttaga
ttaaatggcg ttattggtgt tgatgtaagc ggaggtgtgg 1080agacaaatgg
tgtaaaagac tctaacaaaa tagcaaattt cgtcaaaaat gctaagaaat
1140aggttattac tgagtagtat ttatttaagt attgtttgtg cacttgccta
tgcggtgtga 1200aataccgcac agatgcgtaa ggagaaaata ccgcatcagg
aaattgtaaa cgttaatatt 1260ttgttaaaat tcgcgttaaa tttttgttaa
atcagctcat tttttaacca ataggccgaa 1320atcggcaaaa tcccttataa
atcaaaagaa tagaccgaga tagggttgag tgttgttcca 1380gtttggaaca
agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc
1440gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt
tttggggtcg 1500aggtgccgta aagcactaaa tcggaaccct aaagggagcc
cccgatttag agcttgacgg 1560ggaaagccgg cgaacgtggc gagaaaggaa
gggaagaaag cgaaaggagc gggcgctagg 1620gcgctggcaa gtgtagcggt
cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg 1680ccgctacagg
gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg ggaagggcga
1740tcggtgcggg cctcttcgct attacgccag ctggcgaaag ggggatgtgc
tgcaaggcga 1800ttaagttggg taacgccagg gttttcccag tcacgacgtt
gtaaaacgac ggccagtgag 1860cgcgcgtaat acgactcact atagggcgaa
ttgggtaccg gccgcaaatt aaagccttcg 1920agcgtcccaa aaccttctca
agcaaggttt tcagtataat gttacatgcg tacacgcgtc 1980tgtacagaaa
aaaaagaaaa atttgaaata taaataacgt tcttaatact aacataacta
2040taaaaaaata aatagggacc tagacttcag gttgtctaac tccttccttt
tcggttagag 2100cggatgtggg gggagggcgt gaatgtaagc gtgacataac
taattacatg actcgagcgg 2160ccgcggatcc ttattggttt tctggtctca
actttctgac ttccttacca accttccaga 2220tttccatgtt tctgatggtg
tctaattcct tttctagctt ttctctgtag tcaggttgag 2280agttgaattc
caaagatctc ttggtttcgg taccgttctt ggtagattcg tacaagtctt
2340ggaaaacagg cttcaaagca ttcttgaaga ttgggtacca gtccaaagca
cctcttctgg 2400cggtggtgga acaagcatcg tacatgtaat ccataccgta
cttaccgatc aatgggtata 2460gagattgggt agcttcttcg acggtttcgt
tgaaagcttc agatggggag tgaccgtttt 2520ctctcaagac gtcgtattga
gccaagaaca taccgtggat accacccatt aaacaacctc 2580tttcaccgta
caagtcagag ttgacttctc tttcgaaagt ggtttggtaa acgtaaccgg
2640aaccaatggc aacggccaaa gcttgggcct tttcgtgagc cttaccggtg
acatcgttcc 2700agacggcgta agaagagtta ataccacgac cttccttgaa
caaagatctg acagttctac 2760cggaaccctt tggagcaacc aagataacat
ctaagtcctt tggtggttca acgtgagtca 2820agtccttgaa gactggggag
aaaccgtggg agaagtacaa agtcttaccc ttggtcaaca 2880atggcttgat
agcaggccag gtttctgatt gagcggcatc ggacaacaag ttcataacgt
2940aactacctct cttgatagca tcttcaacag tgaacaagtt cttgcctgga
acccaaccgt 3000cttcgatggc agccttccaa gaagcaccat ctttacggac
accaatgata acgttcaaac 3060cgttgtctct caagttcaaa ccttgaccgt
aaccttggga accgtaaccg atcaaagcaa 3120aagtgtcgtt cttgaagtag
tccaacaact tttctcttgg ccagtcagct ctttcgtaga 3180cggtttcaac
agtaccaccg aagttgattt gcttcaacat gtcgacacca tcttcttctg
3240agatgagttt ttgttccatg ctagttctag aatccgtcga aactaagttc
tggtgtttta 3300aaactaaaaa aaagactaac tataaaagta gaatttaaga
agtttaagaa atagatttac 3360agaattacaa tcaataccta ccgtctttat
atacttatta gtcaagtagg ggaataattt 3420cagggaactg gtttcaacct
tttttttcag ctttttccaa atcagagaga gcagaaggta 3480atagaaggtg
taagaaaatg agatagatac atgcgtgggt caattgcctt gtgtcatcat
3540ttactccagg caggttgcat cactccattg aggttgtgcc cgttttttgc
ctgtttgtgc 3600ccctgttctc tgtagttgcg ctaagagaat ggacctatga
actgatggtt ggtgaagaaa 3660acaatatttt ggtgctggga ttcttttttt
ttctggatgc cagcttaaaa agcgggctcc 3720attatattta gtggatgcca
ggaataaact gttcacccag acacctacga tgttatatat 3780tctgtgtaac
ccgcccccta ttttgggcat gtacgggtta cagcagaatt aaaaggctaa
3840ttttttgact aaataaagtt aggaaaatca ctactattaa ttatttacgt
attctttgaa 3900atggcgagta ttgataatga taaactgagc tagatctggg
cccgagctcc agcttttgtt 3960ccctttagtg agggttaatt gcgcgcttgg
cgtaatcatg gtcatagctg tttcctgtgt 4020gaaattgtta tccgctcaca
attccacaca acataggagc cggaagcata aagtgtaaag 4080cctggggtgc
ctaatgagtg aggtaactca cattaattgc gttgcgctca ctgcccgctt
4140tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc
gcggggagag 4200gcggtttgcg tattgggcgc tcttccgctt cctcgctcac
tgactcgctg cgctcggtcg 4260ttcggctgcg gcgagcggta tcagctcact
caaaggcggt aatacggtta tccacagaat 4320caggggataa cgcaggaaag
aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 4380aaaaggccgc
gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa
4440atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac
caggcgtttc 4500cccctggaag ctccctcgtg cgctctcctg ttccgaccct
gccgcttacc ggatacctgt 4560ccgcctttct cccttcggga agcgtggcgc
tttctcatag ctcacgctgt aggtatctca 4620gttcggtgta ggtcgttcgc
tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 4680accgctgcgc
cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat
4740cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta
ggcggtgcta 4800cagagttctt gaagtggtgg cctaactacg gctacactag
aaggacagta tttggtatct 4860gcgctctgct gaagccagtt accttcggaa
aaagagttgg tagctcttga tccggcaaac 4920aaaccaccgc tggtagcggt
ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 4980aaggatctca
agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
5040actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc
tagatccttt 5100taaattaaaa atgaagtttt aaatcaatct aaagtatata
tgagtaaact tggtctgaca 5160gttaccaatg cttaatcagt gaggcaccta
tctcagcgat ctgtctattt cgttcatcca 5220tagttgcctg actccccgtc
gtgtagataa ctacgatacg ggagggctta ccatctggcc 5280ccagtgctgc
aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa
5340accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc
gcctccatcc 5400agtctattaa ttgttgccgg gaagctagag taagtagttc
gccagttaat agtttgcgca 5460acgttgttgc cattgctaca ggcatcgtgg
tgtcacgctc gtcgtttggt atggcttcat 5520tcagctccgg ttcccaacga
tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 5580cggttagctc
cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac
5640tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta
agatgctttt 5700ctgtgactgg tgagtactca accaagtcat tctgagaata
gtgtatgcgg cgaccgagtt 5760gctcttgccc ggcgtcaata cgggataata
ccgcgccaca tagcagaact ttaaaagtgc 5820tcatcattgg aaaacgttct
tcggggcgaa aactctcaag gatcttaccg ctgttgagat 5880ccagttcgat
gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca
5940gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga
ataagggcga 6000cacggaaatg ttgaatactc atactcttcc tttttcaata
ttattgaagc atttatcagg 6060gttattgtct catgagcgga tacatatttg
aatgtattta gaaaaataaa caaatagggg 6120ttccgcgcac atttccccga
aaagtgccac ctgaacgaag catctgtgct tcattttgta 6180gaacaaaaat
gcaacgcgag agcgctaatt tttcaaacaa agaatctgag ctgcattttt
6240acagaacaga aatgcaacgc gaaagcgcta ttttaccaac gaagaatctg
tgcttcattt 6300ttgtaaaaca aaaatgcaac gcgagagcgc taatttttca
aacaaagaat ctgagctgca 6360tttttacaga acagaaatgc aacgcgagag
cgctatttta ccaacaaaga atctatactt 6420cttttttgtt ctacaaaaat
gcatcccgag agcgctattt ttctaacaaa gcatcttaga 6480ttactttttt
tctcctttgt gcgctctata atgcagtctc ttgataactt tttgcactgt
6540aggtccgtta aggttagaag aaggctactt tggtgtctat tttctcttcc
ataaaaaaag 6600cctgactcca cttcccgcgt ttactgatta ctagcgaagc
tgcgggtgca ttttttcaag 6660ataaaggcat ccccgattat attctatacc
gatgtggatt gcgcatactt tgtgaacaga 6720aagtgatagc gttgatgatt
cttcattggt cagaaaatta tgaacggttt cttctatttt 6780gtctctatat
actacgtata ggaaatgttt acattttcgt attgttttcg attcactcta
6840tgaatagttc ttactacaat ttttttgtct aaagagtaat actagagata
aacataaaaa 6900atgtagaggt cgagtttaga tgcaagttca aggagcgaaa
ggtggatggg taggttatat 6960agggatatag cacagagata tatagcaaag
agatactttt gagcaatgtt tgtggaagcg 7020gtattcgcaa tattttagta
gctcgttaca gtccggtgcg tttttggttt tttgaaagtg 7080cgtcttcaga
gcgcttttgg ttttcaaaag cgctctgaag ttcctatact ttctagagaa
7140taggaacttc ggaataggaa cttcaaagcg tttccgaaaa cgagcgcttc
cgaaaatgca 7200acgcgagctg cgcacataca gctcactgtt cacgtcgcac
ctatatctgc gtgttgcctg 7260tatatatata tacatgagaa gaacggcata
gtgcgtgttt atgcttaaat gcgtacttat 7320atgcgtctat ttatgtagga
tgaaaggtag tctagtacct cctgtgatat tatcccattc 7380catgcggggt
atcgtatgct tccttcagca ctacccttta gctgttctat atgctgccac
7440tcctcaattg gattagtctc atccttcaat gctatcattt cctttgatat
tggatcatat 7500taagaaacca ttattatcat gacattaacc tataaaaata
ggcgtatcac gaggcccttt 7560cgtc 7564817955DNAArtificial
SequencepGV1662 81ttggatcata ctaagaaacc attattatca tgacattaac
ctataaaaat aggcgtatca 60cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga
aaacctctga cacatgcagc 120tcccggagac ggtcacagct tgtctgtaag
cggatgccgg gagcagacaa gcccgtcagg 180gcgcgtcagc gggtgttggc
gggtgtcggg gctggcttaa ctatgcggca tcagagcaga 240ttgtactgag
agtgcaccat accacagctt ttcaattcaa ttcatcattt tttttttatt
300cttttttttg atttcggttt ctttgaaatt tttttgattc ggtaatctcc
gaacagaagg 360aagaacgaag gaaggagcac agacttagat tggtatatat
acgcatatgt agtgttgaag 420aaacatgaaa ttgcccagta ttcttaaccc
aactgcacag aacaaaaacc tgcaggaaac 480gaagataaat catgtcgaaa
gctacatata aggaacgtgc tgctactcat cctagtcctg 540ttgctgccaa
gctatttaat atcatgcacg aaaagcaaac aaacttgtgt gcttcattgg
600atgttcgtac caccaaggaa ttactggagt tagttgaagc attaggtccc
aaaatttgtt 660tactaaaaac acatgtggat atcttgactg atttttccat
ggagggcaca gttaagccgc 720taaaggcatt atccgccaag tacaattttt
tactcttcga agacagaaaa tttgctgaca 780ttggtaatac agtcaaattg
cagtactctg cgggtgtata cagaatagca gaatgggcag 840acattacgaa
tgcacacggt gtggtgggcc caggtattgt tagcggtttg aagcaggcgg
900cagaagaagt aacaaaggaa cctagaggcc ttttgatgtt agcagaattg
tcatgcaagg 960gctccctatc tactggagaa tatactaagg gtactgttga
cattgcgaag agcgacaaag 1020attttgttat cggctttatt gctcaaagag
acatgggtgg aagagatgaa ggttacgatt 1080ggttgattat gacacccggt
gtgggtttag atgacaaggg agacgcattg ggtcaacagt 1140atagaaccgt
ggatgatgtg gtctctacag gatctgacat tattattgtt ggaagaggac
1200tatttgcaaa gggaagggat gctaaggtag agggtgaacg ttacagaaaa
gcaggctggg 1260aagcatattt gagaagatgc ggccagcaaa actaaaaaac
tgtattataa gtaaatgcat 1320gtatactaaa ctcacaaatt agagcttcaa
tttaattata tcagttatta ccctatgcgg 1380tgtgaaatac cgcacagatg
cgtaaggaga aaataccgca tcaggaaatt gtaaacgtta 1440atattttgtt
aaaattcgcg ttaaattttt gttaaatcag ctcatttttt aaccaatagg
1500ccgaaatcgg caaaatccct tataaatcaa aagaatagac cgagataggg
ttgagtgttg 1560ttccagtttg gaacaagagt ccactattaa agaacgtgga
ctccaacgtc aaagggcgaa 1620aaaccgtcta tcagggcgat ggcccactac
gtgaaccatc accctaatca agttttttgg 1680ggtcgaggtg ccgtaaagca
ctaaatcgga accctaaagg gagcccccga tttagagctt 1740gacggggaaa
gccggcgaac gtggcgagaa aggaagggaa gaaagcgaaa ggagcgggcg
1800ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac caccacaccc
gccgcgctta 1860atgcgccgct acagggcgcg tcgcgccatt cgccattcag
gctgcgcaac tgttgggaag 1920ggcgatcggt gcgggcctct tcgctattac
gccagctggc gaaaggggga tgtgctgcaa 1980ggcgattaag ttgggtaacg
ccagggtttt cccagtcacg acgttgtaaa acgacggcca 2040gtgagcgcgc
gtaatacgac tcactatagg gcgaattggg taccggccgc aaattaaagc
2100cttcgagcgt cccaaaacct tctcaagcaa ggttttcagt ataatgttac
atgcgtacac 2160gcgtctgtac agaaaaaaaa gaaaaatttg aaatataaat
aacgttctta atactaacat 2220aactataaaa aaataaatag ggacctagac
ttcaggttgt ctaactcctt ccttttcggt 2280tagagcggat gtggggggag
ggcgtgaatg taagcgtgac ataactaatt acatgactcg 2340agcggccgcg
gatccttagg atttattctg ttcagcaaac agcttgccca ttttcttcag
2400taccttcggt gcgccttctt tcgccaggat cagttcgatc cagtacatac
ggttcggatc 2460ggcctgggcc tctttcatca cgctcacaaa ttcgttttcg
gtacgcacaa ttttagacac 2520aacacggtcc tcagttgcgc cgaaggactc
cggcagttta gagtagttcc acatagggat 2580atcgttgtaa gactggttcg
gaccgtggat ctcacgctca acggtgtagc cgtcattgtt 2640aataatgaag
caaatcgggt tgatcttttc acgaattgcc agacccagtt cctgtacggt
2700cagctgcagg gaaccgtcac cgatgaacag cagatgacga gattctttat
cagcgatctg 2760agagcccagc gctgccggga aagtatagcc aatgctaccc
cacagcggct gaccgataaa 2820atggcttttg gatttcagaa agatagaaga
cgcgccgaaa aagctcgtac cttgttccgc 2880cacgatggtt tcattgctct
gggtcaggtt ctccacggcc tgccacaggc gatcctggga 2940cagcagtgcg
ttagatggta cgaaatcttc ttgctttttg tcaatgtatt tgcctttata
3000ctcgatttcg gacaggtcca gcagagagct gatcaggctt tcgaagtcga
agttctggat 3060acgctcgttg aagattttac cctcgtcgat gttcaggcta
atcattttgt tttcgttcag 3120atggtgagtg aatgcaccgg tagaagagtc
ggtcagttta acgcccagca tcaggatgaa 3180gtccgcagat tcaacaaatt
ctttcaggtt cggttcgctc agagtaccgt tgtagatgcc 3240caggaaagac
ggcagagcct cgtcaacaga ggacttgccg aagttcaggg tggtaatcgg
3300cagtttggtt ttgctgatga attgggtcac ggtcttctcc agaccaaaag
aaatgatttc 3360gtggccggtg atcacgattg gtttctttgc gtttttcaga
gactcctgga ttttgttcag 3420gatttcctgg tcgctagtgt tagaagtgga
gttttctttc ttcagcggca ggctcggttt 3480ttccgcttta gctgccgcaa
catccacagg caggttgatg taaactggtt tgcgttcttt 3540cagcagcgca
gacagaacgc ggtcgatttc cacagtagcg ttctctgcag tcagcagcgt
3600acgtgccgca gtcacaggtt catgcatttt catgaagtgt ttgaaatcgc
cgtcagccag 3660agtgtggtgg acgaatttac cttcgttctg aactttgctc
gttgggctgc ctacgatctc 3720caccaccggc aggttttcgg cgtaggagcc
cgccagaccg ttgacggcgc tcagttcgcc 3780aacaccgaaa gtggtcagaa
atgccgcggc tttcttggta cgtgcataac catctgccat 3840gtagcttgcg
ttcagttcgt tagcgttacc cacccatttc atgtctttat gagagatgat
3900ctgatccagg aactgcagat tgtaatcacc cggaacgccg aagatttctt
cgatacccag 3960ttcatgcaga cggtccagca gataatcacc aacagtatac
atgtcgacaa acttagatta 4020gattgctatg ctttctttct aatgagcaag
aagtaaaaaa agttgtaata gaacaagaaa 4080aatgaaactg aaacttgaga
aattgaagac cgtttattaa cttaaatatc aatgggaggt 4140catcgaaaga
gaaaaaaatc aaaaaaaaaa ttttcaagaa aaagaaacgt gataaaaatt
4200tttattgcct ttttcgacga agaaaaagaa acgaggcggt ctcttttttc
ttttccaaac 4260ctttagtacg ggtaattaac gacaccctag aggaagaaag
aggggaaatt tagtatgctg 4320tgcttgggtg ttttgaagtg gtacggcgat
gcgcggagtc cgagaaaatc tggaagagta 4380aaaaaggagt agaaacattt
tgaagctatg agctccagct tttgttccct ttagtgaggg 4440ttaattgcgc
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg
4500ctcacaattc cacacaacat aggagccgga agcataaagt gtaaagcctg
gggtgcctaa 4560tgagtgaggt aactcacatt aattgcgttg cgctcactgc
ccgctttcca gtcgggaaac 4620ctgtcgtgcc agctgcatta atgaatcggc
caacgcgcgg ggagaggcgg tttgcgtatt 4680gggcgctctt ccgcttcctc
gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 4740gcggtatcag
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca
4800ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa
ggccgcgttg 4860ctggcgtttt tccataggct ccgcccccct gacgagcatc
acaaaaatcg acgctcaagt 4920cagaggtggc gaaacccgac aggactataa
agataccagg cgtttccccc tggaagctcc 4980ctcgtgcgct ctcctgttcc
gaccctgccg cttaccggat acctgtccgc ctttctccct 5040tcgggaagcg
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc
5100gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg
ctgcgcctta 5160tccggtaact atcgtcttga gtccaacccg gtaagacacg
acttatcgcc actggcagca 5220gccactggta acaggattag cagagcgagg
tatgtaggcg gtgctacaga gttcttgaag 5280tggtggccta actacggcta
cactagaagg acagtatttg gtatctgcgc tctgctgaag 5340ccagttacct
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt
5400agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg
atctcaagaa 5460gatcctttga tcttttctac ggggtctgac gctcagtgga
acgaaaactc acgttaaggg 5520attttggtca tgagattatc aaaaaggatc
ttcacctaga tccttttaaa ttaaaaatga 5580agttttaaat caatctaaag
tatatatgag taaacttggt ctgacagtta ccaatgctta 5640atcagtgagg
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc
5700cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag
tgctgcaatg 5760ataccgcgag acccacgctc accggctcca gatttatcag
caataaacca gccagccgga 5820agggccgagc gcagaagtgg tcctgcaact
ttatccgcct ccatccagtc tattaattgt 5880tgccgggaag ctagagtaag
tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 5940gctacaggca
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc
6000caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt
tagctccttc 6060ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt
tatcactcat ggttatggca 6120gcactgcata attctcttac tgtcatgcca
tccgtaagat gcttttctgt gactggtgag 6180tactcaacca agtcattctg
agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 6240tcaatacggg
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa
6300cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag
ttcgatgtaa 6360cccactcgtg cacccaactg atcttcagca tcttttactt
tcaccagcgt ttctgggtga 6420gcaaaaacag gaaggcaaaa tgccgcaaaa
aagggaataa gggcgacacg gaaatgttga 6480atactcatac tcttcctttt
tcaatattat tgaagcattt atcagggtta ttgtctcatg 6540agcggataca
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt
6600ccccgaaaag tgccacctga acgaagcatc tgtgcttcat tttgtagaac
aaaaatgcaa 6660cgcgagagcg ctaatttttc aaacaaagaa tctgagctgc
atttttacag aacagaaatg 6720caacgcgaaa gcgctatttt accaacgaag
aatctgtgct tcatttttgt aaaacaaaaa 6780tgcaacgcga gagcgctaat
ttttcaaaca aagaatctga gctgcatttt tacagaacag 6840aaatgcaacg
cgagagcgct attttaccaa caaagaatct atacttcttt tttgttctac
6900aaaaatgcat cccgagagcg ctatttttct aacaaagcat cttagattac
tttttttctc 6960ctttgtgcgc tctataatgc agtctcttga taactttttg
cactgtaggt ccgttaaggt 7020tagaagaagg ctactttggt gtctattttc
tcttccataa aaaaagcctg actccacttc 7080ccgcgtttac tgattactag
cgaagctgcg ggtgcatttt ttcaagataa aggcatcccc 7140gattatattc
tataccgatg tggattgcgc atactttgtg aacagaaagt gatagcgttg
7200atgattcttc attggtcaga aaattatgaa cggtttcttc tattttgtct
ctatatacta 7260cgtataggaa atgtttacat tttcgtattg ttttcgattc
actctatgaa tagttcttac 7320tacaattttt ttgtctaaag agtaatacta
gagataaaca taaaaaatgt agaggtcgag 7380tttagatgca agttcaagga
gcgaaaggtg gatgggtagg ttatataggg atatagcaca 7440gagatatata
gcaaagagat acttttgagc aatgtttgtg gaagcggtat tcgcaatatt
7500ttagtagctc gttacagtcc ggtgcgtttt tggttttttg aaagtgcgtc
ttcagagcgc 7560ttttggtttt caaaagcgct ctgaagttcc tatactttct
agagaatagg aacttcggaa 7620taggaacttc aaagcgtttc cgaaaacgag
cgcttccgaa aatgcaacgc gagctgcgca 7680catacagctc actgttcacg
tcgcacctat atctgcgtgt tgcctgtata tatatataca 7740tgagaagaac
ggcatagtgc gtgtttatgc ttaaatgcgt acttatatgc gtctatttat
7800gtaggatgaa aggtagtcta gtacctcctg tgatattatc ccattccatg
cggggtatcg 7860tatgcttcct tcagcactac cctttagctg ttctatatgc
tgccactcct caattggatt 7920agtctcatcc ttcaatgcta tcatttcctt
tgata 7955
82 8572 DNA Artificial Sequence pGV1810
82 tagaaaaact catcgagcat
caaatgaaac tgcaatttat tcatatcagg attatcaata 60ccatattttt gaaaaagccg
tttctgtaat gaaggagaaa actcaccgag gcagttccat 120aggatggcaa
gatcctggta tcggtctgcg attccgactc gtccaacatc aatacaacct
180attaatttcc cctcgtcaaa aataaggtta tcaagtgaga aatcaccatg
agtgacgact 240gaatccggtg agaatggcaa aagtttatgc atttctttcc
agacttgttc aacaggccag 300ccattacgct cgtcatcaaa atcactcgca
tcaaccaaac cgttattcat tcgtgattgc 360gcctgagcga ggcgaaatac
gcgatcgctg ttaaaaggac aattacaaac aggaatcgag 420tgcaaccggc
gcaggaacac tgccagcgca tcaacaatat tttcacctga atcaggatat
480tcttctaata cctggaacgc tgtttttccg gggatcgcag tggtgagtaa
ccatgcatca 540tcaggagtac ggataaaatg cttgatggtc ggaagtggca
taaattccgt cagccagttt 600agtctgacca tctcatctgt aacatcattg
gcaacgctac ctttgccatg tttcagaaac 660aactctggcg catcgggctt
cccatacaag cgatagattg tcgcacctga ttgcccgaca 720ttatcgcgag
cccatttata cccatataaa tcagcatcca tgttggaatt taatcgcggc
780ctcgacgttt cccgttgaat atggctcata ttcttccttt ttcaatatta
ttgaagcatt 840tatcagggtt attgtctcat gagcggatac atatttgaat
gtatttagaa aaataaacaa 900ataggggtca gtgttacaac caattaacca
attctgaaca ttatcgcgag cccatttata 960cctgaatatg gctcataaca
ccccttgttt gcctggcggc agtagcgcgg tggtcccacc 1020tgaccccatg
ccgaactcag aagtgaaacg ccgtagcgcc gatggtagtg tggggactcc
1080ccatgcgaga gtagggaact gccaggcatc aaataaaacg aaaggctcag
tcgaaagact 1140gggcctttcg cccgggctaa ttagggggtg tcgcccttat
tcgactctat agtgaagttc 1200ctattctcta gaaagtatag gaacttctga
agtggggcta gccacgaaaa acaaactaac 1260ttatgcgcat cattagatgt
aagaactact aaagagctac tggagttggt tgaggcttta 1320ggtccaaaaa
tttgtttgtt gaagacacat gttgacatat taacagattt ttctatggag
1380ggtaccgtta agcctctgaa agcgttaagc gcgaaatata actttctttt
atttgaagac 1440cgtaagtttg ctgatattgg aaatactgtt aagttgcaat
atagcgcagg agtttataga 1500attgccgaat gggctgacat tacgaatgcc
cacggtgttg taggtcctgg cattgtgtct 1560ggattgaaac aagcggcaga
ggaagtgact aaggaaccaa gaggtttact aatgctggcg 1620gaattatctt
gcaaaggctc tctagccacc ggtgaatata caaaaggtac tgtggatatt
1680gcaaagtctg ataaggactt cgtaatcggt tttattgcac aaagagatat
gggaggtcgt 1740gacgagggct acgattggtt aattatgaca ccaggcgtag
gattagatga caaaggcgat 1800gcgttaggcc aacagtatcg tacagtcgat
gatgtcgtaa gtaccggttc tgatatcatt 1860attgtcggga gaggtttatt
tgccaagggc cgtgatgcga aagtggaggg ggaaagatat 1920aggaaggcag
gttgggaggc ttacttgaga agatgtggtc agcagaatta agcggccgca
1980taacaatact gacagtacta aataattgcc tacttggctt cacatacgtt
gcatacgtcg 2040atatagataa taatgataat gacagcagga ttatcgtaat
acgtaatagt tgaaaatctc 2100aaaaatgtgt gggtcattac gtaaataatg
ataggaatgg gattcttcta tttttccttt 2160ttccattcta gcagccgtcg
ggaaaacgtg gcatcctctc tttcgggctc aattggagtc 2220acgctgccgt
gagcatcctc tctttccata tctaacaact gagcacgtaa ccaatggaaa
2280agcatgagct tagcgttgct ccaaaaaagt attggatggt taataccatt
tgtctgttct 2340cttctgactt tgactcctca aaaaaaaaaa atctacaatc
aacagatcgc ttcaattacg 2400ccctcacaaa aacttttttc cttcttcttc
gcccacgtta aattttatcc ctcatgttgt 2460ctaacggatt tctgcacttg
atttattata aaaagacaaa gacataatac ttctctatca 2520atttcagtta
ttgttcttcc ttgcgttatt cttctgttct tctttttctt ttgtcatata
2580taaccataac caagtaatac atattcaaac tcgagatgtt gagaactcaa
gccgccagat 2640tgatctgcaa ctcccgtgtc atcactgcta agagaacctt
tgctttggcc acccgtgctg 2700ctgcttacag cagaccagct gcccgtttcg
ttaagccaat gatcactacc cgtggtttga 2760agcaaatcaa cttcggtggt
actgttgaaa ccgtctacga aagagctgac tggccaagag 2820aaaagttgtt
ggactacttc aagaacgaca cttttgcttt gatcggttac ggttcccaag
2880gttacggtca aggtttgaac ttgagagaca acggtttgaa cgttatcatt
ggtgtccgta 2940aagatggtgc ttcttggaag gctgccatcg aagacggttg
ggttccaggc aagaacttgt 3000tcactgttga agatgctatc aagagaggta
gttacgttat gaacttgttg tccgatgccg 3060ctcaatcaga aacctggcct
gctatcaagc cattgttgac caagggtaag actttgtact 3120tctcccacgg
tttctcccca gtcttcaagg acttgactca cgttgaacca ccaaaggact
3180tagatgttat cttggttgct ccaaagggtt ccggtagaac tgtcagatct
ttgttcaagg 3240aaggtcgtgg tattaactct tcttacgccg tctggaacga
tgtcaccggt aaggctcacg 3300aaaaggccca agctttggcc gttgccattg
gttccggtta cgtttaccaa accactttcg 3360aaagagaagt caactctgac
ttgtacggtg aaagaggttg tttaatgggt ggtatccacg 3420gtatgttctt
ggctcaatac gacgtcttga gagaaaacgg tcactcccca tctgaagctt
3480tcaacgaaac cgtcgaagaa gctacccaat ctctataccc attgatcggt
aagtacggta 3540tggattacat gtacgatgct tgttccacca ccgccagaag
aggtgctttg gactggtacc 3600caatcttcaa gaatgctttg aagcctgttt
tccaagactt gtacgaatct accaagaacg 3660gtaccgaaac caagagatct
ttggaattca actctcaacc tgactacaga gaaaagctag 3720aaaaggaatt
agacaccatc agaaacatgg aaatctggaa ggttggtaag gaagtcagaa
3780agttgagacc agaaaaccaa taacctagga tcttgtttaa agattacgga
tatttaactt 3840acttagaata atgccatttt tttgagttat aataatccta
cgttagtgtg agcgggattt 3900aaactgtgag gaccttaata cattcagaca
cttctgcggt atcaccctac ttattccctt 3960cgagattata tctaggaacc
catcaggttg gtggaagatt acccgttcta agacttttca 4020gcttcctcta
ttgatgttac acctggacac cccttttctg gcatccagtt tttaatcttc
4080agtggcatgt gagattctcc gaaattaatt aaagcaatca cacaattctc
tcggatacca 4140cctcggttga aactgacagg tggtttgtta cgcatgctaa
tgcaaaggag cctatatacc 4200tttggctcgg ctgctgtaac agggaatata
aagggcagca taatttagga gtttagtgaa 4260cttgcaacat ttactatttt
cccttcttac gtaaatattt ttctttttaa ttctaaatca 4320atctttttca
attttttgtt tgtattcttt tcttgcttaa atctataact acaaaaaaca
4380catacataaa ctaaaagtcg acatgtaccc atacgatgtt ccagattacg
caggtggtgg 4440tgtcgacatg cctaaataca gatcagctac gactacacac
ggtagaaata tggccggagc 4500cagggcccta tggagagcca ccggcatgac
agatgcagat tttggtaaac ctataattgc 4560tgtagttaac tcttttacac
agtttgttcc aggtcatgta catctaagag acttgggcaa 4620attggtggca
gaacaaatcg aggctgctgg tggtgttgca aaagaattta acactattgc
4680cgtagacgac ggcattgcga tgggtcatgg cggtatgctt tattcgctac
cctccagaga 4740attaattgca gacagcgttg aatatatggt aaatgcccac
tgcgcagatg ccatggtttg 4800catttccaat tgtgacaaaa tcacgccggg
catgttgatg gcgtcattga gactaaatat 4860tcctgtgatc ttcgttagcg
gaggtcccat ggaagccggg aaaactaaac tttccgatca 4920gataatcaag
ttagacttgg tcgatgccat gatccagggt gcggacccca aagtaagcga
4980ctctcaatcc gatcaagttg aaagatccgc atgtccaact tgcgggagtt
gctctgggat 5040gttcacggcg aactctatga attgcctaac agaggccctg
ggcctgtcac aacctggcaa 5100cggttcgctt ttagcaactc atgctgatag
aaagcaatta tttctaaatg ctggtaaaag 5160aatcgttgaa ttaacaaaaa
gatattacga acaaaacgat gaatctgcac tgccaaggaa 5220cattgcttca
aaggccgctt tcgaaaacgc tatgacattg gatattgcaa tgggtggaag
5280cacaaatact gtccttcatc tactggcggc tgctcaagaa gcagaaattg
atttcacaat 5340gagcgatatc gacaagctat cacgtaaggt cccgcagctg
tgtaaagtgg caccgtctac 5400tcaaaaatac cacatggaag atgtccatcg
tgctggaggc gttatcggaa tcttggggga 5460gttggacagg gccggtctat
taaacagaga tgttaagaac gtgctaggtc taactttgcc 5520tcaaacctta
gagcagtacg acgttatgtt aactcaagat gacgcagtca aaaacatgtt
5580cagagcgggg ccagctggaa taaggactac ccaagcgttc tcgcaagatt
gcagatggga 5640tactctggac gatgatagag ctaacggttg cataagatca
ctagagcatg cttactcgaa 5700agatggaggt ttagctgttt tatacggtaa
ttttgccgaa aacggatgta tagtgaagac 5760cgctggggtt gatgattcaa
ttctaaaatt cactgggcca gccaaggtat acgagtcaca 5820agatgatgct
gttgaagcca tcttaggtgg gaaagtggtg gcaggggacg tggtggtaat
5880aagatatgaa ggtccaaagg gtggtccagg tatgcaagaa atgctgtacc
ctacttcttt 5940ccttaaatct atgggtttag gcaaggcttg tgctcttata
accgatggta gattttctgg 6000aggtacatca ggcctttcca taggacatgt
tagccccgaa gctgcctcag gtggtagtat 6060tggcttaatc gaggatggtg
acttaattgc tattgacatt cctaacaggg gtattcaact 6120acaggttagc
gatgcagaat tagccgctag aagagaggca caagatgcga gaggcgataa
6180agcatggaca cctaagaaca gggagagaca agtgagcttt gccctgagag
cttatgcctc 6240gctggcgacg agcgcagaca aaggagccgt aagagataaa
tcaaaattgg gtggttaggg 6300atccgcgatt taatctctaa ttattagtta
aagttttata agcattttta tgtaacgaaa 6360aataaattgg ttcatattat
tactgcactg tcacttacca tggaaagacc agacaagaag 6420ttgccgacag
tctgttgaat tggcctggtt aggcttaagt ctgggtccgc ttctttacaa
6480atttggagaa tttctcttaa acgatatgta tattcttttc gttggaaaag
atgtcttcca 6540aaaaaaaaac cgatgaatta gtggaaccaa ggaaaaaaaa
agaggtatcc ttgattaagg 6600aacactgttt aaacagtgtg gtttccaaaa
ccctgaaact gcattagtgt aatagaagac 6660tagacacctc gatacaaata
atggttactc aattcaaaac tgccagcgaa ttcgactctg 6720caattgctca
agacaagcta gttgtcgtag atttctacgc cacttggtgc ggtccatgta
6780aaatgattgc tccaatgatt gaaaaattct ctgaacaata cccacaagct
gatttctata 6840aattggatgt cgatgaattg ggtgatgttg cacaaaagaa
tgaagtttcc gctatgccaa 6900ctttgcttct attcaagaac ggtaaggaag
ttgcaaaggt tgttggtgcc aacccagcgg 6960ctattaagca agccattgct
gctaatgctt aaactcaccc aatgaccgat atattgtgtt 7020tctatactgt
gtttgttata tatagtttac ctttaagctt aaaatgaagt gaagttccta
7080tactttctag agaataggaa cttctatagt gagtcgaata agggcgacac
aaaatttatt 7140ctaaatgcat aataaatact gataacatct tatagtttgt
attatatttt gtattatcgt 7200tgacatgtat aattttgata tcaaaaactg
attttccctt tattattttc gagatttatt 7260ttcttaattc tctttaacaa
actagaaata ttgtatatac aaaaaatcat aaataataga 7320tgaatagttt
aattataggt gttcatcaat cgaaaaagca acgtatctta tttaaagtgc
7380gttgcttttt tctcatttat aaggttaaat aattctcata tatcaagcaa
agtgacaggc 7440gcccttaaat attctgacaa atgctctttc cctaaactcc
ccccataaaa aaacccgccg 7500aagcgggttt ttacgttatt tgcggattaa
cgattactcg ttatcagaac cgcccagggg 7560gcccgagctt aagactggcc
gtcgttttac aacacagaaa gagtttgtag aaacgcaaaa 7620aggccatccg
tcaggggcct tctgcttagt ttgatgcctg gcagttccct actctcgcct
7680tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg
agcggtatca 7740gctcactcaa aggcggtaat acggttatcc acagaatcag
gggataacgc aggaaagaac 7800atgtgagcaa aaggccagca aaaggccagg
aaccgtaaaa aggccgcgtt gctggcgttt 7860ttccataggc tccgcccccc
tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 7920cgaaacccga
caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc
7980tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc
ttcgggaagc 8040gtggcgcttt ctcatagctc acgctgtagg tatctcagtt
cggtgtaggt cgttcgctcc 8100aagctgggct gtgtgcacga accccccgtt
cagcccgacc gctgcgcctt atccggtaac 8160tatcgtcttg agtccaaccc
ggtaagacac gacttatcgc cactggcagc agccactggt 8220aacaggatta
gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtgggct
8280aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa
gccagttacc 8340ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa
ccaccgctgg tagcggtggt 8400ttttttgttt gcaagcagca gattacgcgc
agaaaaaaag gatctcaaga agatcctttg 8460atcttttcta cggggtctga
cgctcagtgg aacgacgcgc gcgtaactca cgttaaggga 8520ttttggtcat
gagcttgcgc cgtcccgtca agtcagcgta atgctctgct tt
8572
83 1488 DNA Artificial Sequence E. coli codon optimized ilvC for expression in S. cerevisiae
83 ctcgagatgg ccaactattt taacacatta
aatttgagac aacaattggc tcaactgggt 60aagtgcagat ttatgggaag ggacgagttt
gctgatggtg cttcttatct gcaaggaaag 120aaagtagtaa ttgttggctg
cggtgctcag ggtctaaacc aaggtttaaa catgagagat 180tcaggtctgg
atatttcgta tgcattgagg aaagaggcaa ttgcagaaaa gagggcctcc
240tggcgtaaag cgacggaaaa tgggttcaaa gttggtactt acgaagaact
gatccctcag 300gcagatttag tgattaacct aacaccagat aagcaacact
cagacgtagt aagaacagtt 360caaccgctga tgaaggatgg ggcagcttta
ggttactctc atggctttaa tatcgttgaa 420gtgggcgagc agatcagaaa
agatataaca gtcgtaatgg ttgcaccaaa gtgcccaggt 480acggaagtca
gagaggagta caagaggggt tttggtgtac ctacattgat cgccgtacat
540cctgaaaatg accccaaagg tgaaggtatg gcaattgcga aggcatgggc
agccgcaacc 600ggaggtcata gagcgggtgt gttagagagt tctttcgtag
ctgaggtcaa gagtgactta 660atgggtgaac aaaccattct gtgcggaatg
ttgcaggcag ggtctttact atgctttgat 720aaattggtcg aagagggtac
agatcctgcc tatgctgaaa agttgataca atttggttgg 780gagacaatca
ccgaggcact taaacaaggt ggcataacat tgatgatgga tagactttca
840aatccggcca agctaagagc ctacgcctta tctgagcaac taaaagagat
catggcacca 900ttattccaaa agcacatgga cgatattatc tccggtgagt
tttcctcagg aatgatggca 960gattgggcaa acgatgataa aaagttattg
acgtggagag aagaaaccgg caagacggca 1020ttcgagacag ccccacaata
cgaaggtaaa attggtgaac aagaatactt tgataaggga 1080gtattgatga
tagctatggt gaaggcaggg gtagaacttg cattcgaaac tatggttgac
1140tccggtatca ttgaagaatc tgcatactat gagtctttgc atgaattgcc
tttgatagca 1200aatactattg caagaaaaag actttacgag atgaatgttg
tcatatcaga cactgcagaa 1260tatggtaatt acttatttag ctacgcatgt
gtcccgttgt taaagccctt catggccgag 1320ttacaacctg gtgatttggg
gaaggctatt ccggaaggag cggttgacaa tggccaactg 1380agagacgtaa
atgaagctat tcgttcacat gctatagaac aggtgggtaa aaagctgaga
1440ggatatatga ccgatatgaa aagaattgca gtggcaggat gaagatct
1488
84 38 DNA Artificial Sequence Primer 1792
84 ttttctcgag atgcagattt ttgtgaagac cctcactg 38
85 51 DNA Artificial Sequence Primer 1794
85 ttttgcggcc gcggatccgt cgacacctcg caggcgcaac accaggtgca g 51
86 228 DNA Artificial Sequence M. musculus ubiquitin gene codon-optimized for expression in S. cerevisiae
86 atgcagattt
ttgtgaagac cctcactggc aaaaccatca cccttgaggt cgagcccagt 60gacaccattg
agaatgtcaa agccaaaatt caagacaagg agggtatccc acctgaccag
120cagcgtctga tatttgccgg caaacagctg gaggatggcc gcactctctc
agactacaac 180atccagaaag agtccaccct gcacctggtg ttgcgcctgc gaggtgga
228
87 1713 DNA Lactococcus lactis
87 atggagttta agtataacgg caaagttgaa
tctgttgaac tgaataagta cagcaaaacg 60ttgacacaag atcccacaca acccgccaca
caggcaatgt attacggcat cgggtttaaa 120gacgaagatt tcaagaaagc
tcaagtgggt atagtgtcga tggactggga tggaaatcca 180tgcaacatgc
atttaggaac ccttggatca aagattaaaa gctcagtaaa tcagacagat
240ggtctgatcg gcttacaatt tcatacgata ggagtttctg atgggatagc
aaatggaaag 300ttgggaatga gatactccct tgtttccaga gaagttatag
ctgactctat tgaaaccaac 360gctggcgctg aatactatga tgcaattgta
gccatcccag gttgtgacaa aaatatgcca 420ggttctatta ttggtatggc
aagacttaat aggccaagca ttatggtgta tggaggaaca 480atagaacacg
gtgaatataa aggtgagaaa ttgaacatcg tatcggcttt tgaatctcta
540ggccagaaaa ttaccggcaa tatctctgat gaagattatc acggtgttat
ttgtaatgct 600attcctggtc aaggggcatg tggggggatg tacacagcta
ataccttagc tgccgctatc 660gaaacactag gtatgtcatt gccgtattct
tcttcgaacc ctgcagtatc tcaagaaaaa 720caagaagaat gtgatgagat
tggattagcc attaagaatc ttttggaaaa agacatcaag 780cctagtgata
taatgactaa ggaggcgttc gagaacgcta ttaccattgt gatggtcttg
840gggggtagta ctaatgctgt cttgcatatt attgcaatgg ctaacgcgat
aggtgtcgaa 900ataactcagg atgacttcca aagaattagt gacattactc
cagtactagg tgattttaaa 960ccttcaggta aatatatgat ggaagatttg
cataaaattg gaggcttgcc agcagtgctt 1020aagtaccttc taaaggaagg
aaaattgcat ggtgactgcc ttactgtgac gggtaaaaca 1080ttagccgaga
atgtcgagac tgccctagac ttggatttcg actcacaaga tatcatgagg
1140ccactaaaga atcctatcaa ggccaccggc cacttgcaga ttctgtacgg
taatttagct 1200caagggggtt ccgtagcaaa aattagcggt aaagaaggag
agttcttcaa aggcactgcc 1260agagtctttg atggtgaaca acattttatc
gacggcatag aatctggtcg tttgcatgct 1320ggagatgtag cggtaattag
gaatataggt cccgtcggcg gacctggtat gcccgaaatg 1380ctgaagccta
catcagcatt aattggtgcg ggtttaggga aaagttgcgc gttaattacg
1440gatggtagat tctccggtgg cactcacggt tttgttgtcg gccatattgt
gcctgaagcc 1500gttgagggtg gactaatcgg cttagttgaa gatgacgata
taatagagat agatgcagtc 1560aacaactcta tatccctgaa agtttccgat
gaagaaatcg caaagagaag agctaattat 1620cagaagccaa ctccgaaagc
caccagggga gttttggcaa aattcgctaa attaacccgt 1680cctgcatcgg
aagggtgtgt tactgatctg taa 1713
88 1758 DNA Saccharomyces cerevisiae
88 atgggcttgt taacgaaagt tgctacatct agacaattct ctacaacgag atgcgttgca
60aagaagctca acaagtactc gtatatcatc actgaaccta agggccaagg tgcgtcccag
120gccatgcttt atgccaccgg tttcaagaag gaagatttca agaagcctca
agtcggggtt 180ggttcctgtt ggtggtccgg taacccatgt aacatgcatc
tattggactt gaataacaga 240tgttctcaat ccattgaaaa agcgggtttg
aaagctatgc agttcaacac catcggtgtt 300tcagacggta tctctatggg
tactaaaggt atgagatact cgttacaaag tagagaaatc 360attgcagact
cctttgaaac catcatgatg gcacaacact acgatgctaa catcgccatc
420ccatcatgtg acaaaaacat gcccggtgtc atgatggcca tgggtagaca
taacagacct 480tccatcatgg tatatggtgg tactatcttg cccggtcatc
caacatgtgg ttcttcgaag 540atctctaaaa acatcgatat cgtctctgcg
ttccaatcct acggtgaata tatttccaag 600caattcactg aagaagaaag
agaagatgtt gtggaacatg catgcccagg tcctggttct 660tgtggtggta
tgtatactgc caacacaatg gcttctgccg ctgaagtgct aggtttgacc
720attccaaact cctcttcctt cccagccgtt tccaaggaga agttagctga
gtgtgacaac 780attggtgaat acatcaagaa gacaatggaa ttgggtattt
tacctcgtga tatcctcaca 840aaagaggctt ttgaaaacgc cattacttat
gtcgttgcaa ccggtgggtc cactaatgct 900gttttgcatt tggtggctgt
tgctcactct gcgggtgtca agttgtcacc agatgatttc 960caaagaatca
gtgatactac accattgatc ggtgacttca aaccttctgg taaatacgtc
1020atggccgatt tgattaacgt tggtggtacc caatctgtga ttaagtatct
atatgaaaac 1080aacatgttgc acggtaacac aatgactgtt accggtgaca
ctttggcaga acgtgcaaag 1140aaagcaccaa gcctacctga aggacaagag
attattaagc cactctccca cccaatcaag 1200gccaacggtc acttgcaaat
tctgtacggt tcattggcac caggtggagc tgtgggtaaa 1260attaccggta
aggaaggtac ttacttcaag ggtagagcac gtgtgttcga agaggaaggt
1320gcctttattg aagccttgga aagaggtgaa atcaagaagg gtgaaaaaac
cgttgttgtt 1380atcagatatg aaggtccaag aggtgcacca ggtatgcctg
aaatgctaaa gccttcctct 1440gctctgatgg gttacggttt gggtaaagat
gttgcattgt tgactgatgg tagattctct 1500ggtggttctc acgggttctt
aatcggccac attgttcccg aagccgctga aggtggtcct 1560atcgggttgg
tcagagacgg cgatgagatt atcattgatg ctgataataa caagattgac
1620ctattagtct ctgataagga aatggctcaa cgtaaacaaa gttgggttgc
acctccacct 1680cgttacacaa gaggtactct atccaagtat gctaagttgg
tttccaacgc ttccaacggt 1740tgtgttttag atgcttga
1758
89 1701 DNA Saccharomyces cerevisiae
89 atgaagaagc tcaacaagta
ctcgtatatc atcactgaac ctaagggcca aggtgcgtcc 60caggccatgc tttatgccac
cggtttcaag aaggaagatt tcaagaagcc tcaagtcggg 120gttggttcct
gttggtggtc cggtaaccca tgtaacatgc atctattgga cttgaataac
180agatgttctc aatccattga aaaagcgggt ttgaaagcta tgcagttcaa
caccatcggt 240gtttcagacg gtatctctat gggtactaaa ggtatgagat
actcgttaca aagtagagaa 300atcattgcag actcctttga aaccatcatg
atggcacaac actacgatgc taacatcgcc 360atcccatcat gtgacaaaaa
catgcccggt gtcatgatgg ccatgggtag acataacaga 420ccttccatca
tggtatatgg tggtactatc ttgcccggtc atccaacatg tggttcttcg
480aagatctcta aaaacatcga tatcgtctct gcgttccaat cctacggtga
atatatttcc 540aagcaattca ctgaagaaga aagagaagat gttgtggaac
atgcatgccc aggtcctggt 600tcttgtggtg gtatgtatac tgccaacaca
atggcttctg ccgctgaagt gctaggtttg 660accattccaa actcctcttc
cttcccagcc gtttccaagg agaagttagc tgagtgtgac 720aacattggtg aatacatcaa
gaagacaatg gaattgggta ttttacctcg tgatatcctc 780acaaaagagg
cttttgaaaa cgccattact tatgtcgttg caaccggtgg gtccactaat
840gctgttttgc atttggtggc tgttgctcac tctgcgggtg tcaagttgtc
accagatgat 900ttccaaagaa tcagtgatac tacaccattg atcggtgact
tcaaaccttc tggtaaatac 960gtcatggccg atttgattaa cgttggtggt
acccaatctg tgattaagta tctatatgaa 1020aacaacatgt tgcacggtaa
cacaatgact gttaccggtg acactttggc agaacgtgca 1080aagaaagcac
caagcctacc tgaaggacaa gagattatta agccactctc ccacccaatc
1140aaggccaacg gtcacttgca aattctgtac ggttcattgg caccaggtgg
agctgtgggt 1200aaaattaccg gtaaggaagg tacttacttc aagggtagag
cacgtgtgtt cgaagaggaa 1260ggtgccttta ttgaagcctt ggaaagaggt
gaaatcaaga agggtgaaaa aaccgttgtt 1320gttatcagat atgaaggtcc
aagaggtgca ccaggtatgc ctgaaatgct aaagccttcc 1380tctgctctga
tgggttacgg tttgggtaaa gatgttgcat tgttgactga tggtagattc
1440tctggtggtt ctcacgggtt cttaatcggc cacattgttc ccgaagccgc
tgaaggtggt 1500cctatcgggt tggtcagaga cggcgatgag attatcattg
atgctgataa taacaagatt 1560gacctattag tctctgataa ggaaatggct
caacgtaaac aaagttgggt tgcacctcca 1620cctcgttaca caagaggtac
tctatccaag tatgctaagt tggtttccaa cgcttccaac 1680ggttgtgttt
tagatgcttg a 1701
90 1689 DNA Gramella forsetii
90 atggataaaa cagccatgaa
taacaaatac tcttctacta ttacacaaag tgactcacaa 60ccagcgtcac aagcaatgct
tcacgccatc ggccttaata aggaagattt gaaaaagcct 120tttgtaggca
tcggcagtac cggatatgaa ggaaacccat gcaacatgca cctgaatgat
180ttggctaagg aagtgaaaaa aggcactcag aatgcagatt taaacggtct
gatctttaat 240acaattggcg tcagcgatgg aatatctatg ggtactccag
gtatgaggtt ctcattgcca 300tcccgtgact tgattgcaga tagcatggaa
acagtagttg gtggaatgtc gtatgatggt 360ttagttaccg tagttgggtg
tgataaaaac atgccaggag cattaatggc aatgttgagg 420ttaaatcgtc
cgtcggtttt agtgtatggg ggaacaattg ctagtggttg ccacaatgga
480aagaagttag atgttgtgtc tgctttcgag gcctggggtt ctaaagtttc
aggtgatatg 540caggaagaag aataccagca agtcattgaa aaggcatgtc
ctggtgcagg tgcttgtggg 600ggtatgtaca cagccaacac catggcttca
tctattgaag ccttggggat gtccttgcct 660tttaactcat ccaatcctgc
aactggtccg gaaaaaactc aagaatctgt caaagctggc 720gaggctatga
aatacttact agaaaatgat ctgaaaccca aagatattgt gacggccaag
780tcgctggaaa atgctattag attgctaacg gttttgggtg gtagtaccaa
tgccgtcttg 840cacttcttgg ctatagctaa ggcagccgaa ataaactttg
gtttgaaaga ttttacaaga 900atatgtgagg aaactccctt cttggccgac
ttaaaaccat ctggtaagta tctgatggaa 960gacattcata ggataggcgg
aatccccgcg gttatgaagt acatgttaga gaaaggatta 1020cttcatggtg
agtgcatgac ggtaactggc aagactatcg cagaaaacct tgaaaatgtg
1080aaacctctgc cagatgatca ggacgtgatt catccagtcg aaaaacctat
taaagctact 1140ggacatatca ggattttgta tggcaattta gccagcgaag
gctccgtagc caagattact 1200gggaaggaag gattagaatt tcaaggtaag
gccagagtct ttaatggcga atttgaggcc 1260aatgaaggga tcagtagcgg
aaaggtccaa aaaggcgacg tagtagtaat tagatatgag 1320ggtcccaagg
ggggtccggg tatgccggaa atgctaaaac ccacgtcagc aataatggga
1380gctggtcttg gtaagagtgt cgctttaata actgacggta gattcagcgg
cggtactcat 1440ggttttgtcg tgggtcatat aacccctgaa gcgcaacaag
gtggactaat agggctattg 1500aaagatggtg atgaaatttc gatcaacgcg
gagaaaaaca cgattgaagc acatttatcc 1560gcagaagaaa ttaatagaag
aaaggaggct tggaaggctc ctgctctaaa agttaacggt 1620ggggtacttt
acaaatatgc gaagacagtt gctagtgcat cagaggggtg tgttacagac
gagttctaa 1689
91 1707 DNA Saccharopolyspora erythraea
91 atgagtacga
gtacagatgg tacgggtcaa tcaggtagag gactaaaacc aaggtccgga 60gacgtaaccg
agggtataga aagagccgcc gcaagaggca tgttacgtgc agtcggtatg
120caagatgctg acttcgccaa gcctcaaatt ggtgtcgctt cgtcttggaa
cgagataact 180ccctgtaatc tttcccttca gcgtttagca caagcgtcta
aggaaggagt gcatgcagct 240ggtgggttcc caatggaatt tggcactatt
tcagtgagtg atgggatatc tatgggccat 300gttggaatgc attactctct
agtgagtagg gaggtgattg ctgattcggt tgagacggta 360atggaagctg
aaaggctgga cggttccgtt ttgttagccg gttgtgacaa gagcctaccg
420ggtatgctaa tggccgcagc acgtttagat gtcgccgctg tattcgtgta
tgcaggttcc 480atactgcctg gaagagtaga cgatagagaa gtaactatta
ttgacgcttt tgaagccgtc 540ggagcttgtg caaggggctt gatctcagaa
gccgaggtgg ataggattga aagggctata 600tgcccaggtg aaggcgcttg
tggaggaatg tatacggcga ataccatggc ttgtgcggct 660gaagcaatgg
gcatgtcgtt accaggatca gcctcccctc ctagcgtaga tcgtagaaga
720gacgcgggcg cacgtgaagc tggtagagct gtggtcggta tgattgaacg
tggtcttaca 780gccagacaaa tattgactaa agaggcgttc gaaaacgcta
tcgcggttgt tatggctttt 840ggcggcagta ctaatgctgt tctgcatttg
ctggcaattg cacgtgaggc agaagttgat 900ttaacattag atgattttaa
caggattggt gatagagtgc ctcatctggc tgatgttaag 960ccatttggaa
ggcacgtgat gaccgcagtc gataggatag gtggagtacc agtagtaatg
1020aaagccttgt tggatgctgg tttgcttcat ggagactgta tgacagttac
tgggaaaact 1080gtcgccgaga atctagctga attagaccca ccagaattag
acggggaagt tcttcacaaa 1140ctgtctaacc ccttacaccc taccggcggc
ttgaccatct tgagagggag cttggcccct 1200gagggagctg ttgtcaaaag
cgctggcttt gactccgcaa cattcgaggg tactgcacgt 1260gttttcgatg
gagagcaggg tgccatggat gctgttgagg atggttcatt gaaagcgggt
1320gacgtggtag tcatcagata tgaaggtcca agaggcggtc caggtatgag
ggaaatgctt 1380gctgtaacag gggctatcaa aggtgcaggg ttagggaagg
acgttctatt gttaactgat 1440ggtagatttt cgggtggaac cacaggttta
tgcatcggac acgtcgcgcc cgaagcaact 1500gacggcggtc cgattgcttt
tgttcgtgac ggtgatccta ttagactgga tttagcgggt 1560agaactttgg
atctattagt agatgaagcc gaacttgcaa gaagaaaaga aggctgggtt
1620ccgagagaac ccaagtttag acaaggtgtt ttgggcaaat acgctagact
ggttaggtct 1680gctgcagttg gagccgtctg ctcttga
1707
92 1722 DNA Candidatus Koribacter versatilis
92 atgactgaga
agtcaccaaa accccataag agatccgatg caatcacaga ggggccaaat 60cgtgctcctg
ctcgtgctat gttaagggct gcaggtttta ctcctgagga tttgagaaaa
120cccattatcg gtatagccaa cacatggatt gaaattggcc cttgcaactt
acatctaaga 180gaattggccg aacatatcaa gcaaggtgta agagaagctg
gagggacacc catggaattt 240aatacagttt ccatctccga cgggataacc
atgggatcag aaggtatgaa agctagtcta 300gtgagtcgtg aggtaatagc
cgattcaatt gagttagttg ccagaggaaa cttgtttgat 360ggactaatag
ctttatctgg atgtgataag acaatcccag gtacaattat ggcattggag
420agacttgata tcccaggcct tatgctttat ggtggttcaa ttgctccggg
caaattccac 480gcacagaagg ttacgatcca agatgtattc gaagccgttg
gtacccacgc taggggtaaa 540atgagcgatg cagacttaga agagcttgag
cacaatgctt gtcctggtgc tggggcgtgc 600ggaggacagt tcacagctaa
tactatgtct atgtgtggtg aatttctggg tatatctcct 660atgggagcga
atagcgttcc cgcaatgacg gtcgagaaac aacaagtcgc gcgtagatgt
720ggacatttag ttatggagtt ggtgagaaga gacatcaggc cgtctcaaat
cataacaaga 780aaagcaattg agaacgcaat agcatcagtt gcggctagtg
gaggtagtac taacgcggtc 840ctgcatctgt tagctattgc acacgagatg
gatgtcgaat tgaacattga agattttgat 900aagataagct ctcgtactcc
acttctttgt gaactgaaac cagccggtag gtttacggct 960acagatttgc
atgacgctgg tggtattcca ttagttgctc aaagactgtt ggaagcaaat
1020ttgttacacg ctgacgcttt gacagtaact ggcaagacta ttgcagaaga
agctaaacag 1080gccaaagaaa ccccgggcca agaagtagtc aggcccttga
ccgacccaat taaggctacc 1140ggcggattaa tgatcttaaa aggtaatcta
gcatcagaag ggtgcgtggt aaagttggtt 1200ggtcacaaga agttattctt
cgaaggtcct gcgagagttt ttgaatctga agaagaagca 1260tttgccggcg
tcgaggatag gacgattcaa gcgggtgaag ttgtagtggt cagatacgaa
1320gggccaaaag gcggacctgg aatgcgtgaa atgttaggcg ttactgctgc
gatagctggc 1380accgagttag ctgaaactgt ggccctaatc accgacggta
gattttcggg tgcaacaaga 1440ggtctatccg tggggcatgt cgcacctgaa
gccgcaaatg gtggtgccat tgccgtagtt 1500aggaatggtg acattattac
gctggatgtt gagagaagag aattaagggt tcatttgact 1560gatgctgaat
tggaggccag attgcgtaac tggagagcgc ctgaaccgag atacaaacgt
1620ggtgttttcg ctaaatatgc ttctacggtc tcatcagcat cgttcggagc
tgtaacaggt 1680tctaccatag aaaacaaaac actggcaggc tcgactaagt aa
1722
93 1779 DNA Piromyces sp
93 atgtctttct cactggctaa cctggccgct
aagggttcga acttgttcaa atttactcct 60gcgcttctaa gcgcaaagcg ttttggttca
tcaggaaagc caattaataa gttcagcaag 120attataacag agccaaagtc
tagagggggt agtcaagcga tgttaatcgc aactggtata 180aaaccagaag
atttaaaaaa gccacagatc ggcataggca gtgtttggta tgatggaaat
240ccatgcaaca tgcatctatt ggatcttggc tccgtggtaa aaaaggccgt
tcaaaaacaa 300aatatgaatg gtatgagatt caatatgatt ggagtgtcag
acgggatctc caacggtacg 360gatggaatgt ccttttcttt gcagtcccgt
gaaattattg cggattctat cgaaacaatc 420atgtctgcac aatattatga
tgctaacatc agcttacctg gctgcgacaa gaacatgcct 480ggttgtttaa
tcgccgctgc cagattgaac agaccgacta taattatcta cggtggcacg
540atcaagcccg gacatacaaa aaagggagag acgattgatt tagtctcggc
cttccaatgt 600tatgggcaat acttggctgg agaaattact gaagagcaaa
gagaagaaat agtgaataat 660gcatgtcctg gcgcaggtgc atgcggtgga
atgtatacag ctaatacaat ggcttccata 720atcgaatcaa tgggtatgag
tttaccttac tccgcctcga ccccggcaga agacccattg 780aaagagcttg
aatgtataaa cgcggcagct gcaattaaga atttaatgga aaaagacatc
840aagccattag acataatgac aagaaaagcg tttgagaacg ctataactat
tactttgatt 900cttggaggga gtacaaactc cgttctgcac cttttggcta
tcgctagggc ctgcaaagtc 960ccattaacta ttgacgattt ccaggaattt
tctaatagga tacccgtttt agccgactta 1020aaacctagtg gtaaatatgt
catggaagat ttgcagttga tcggcggtct tccagctatt 1080cagaaatatc
ttctgaatga aggtctactt catggtgata ttatgactgt taccggaaag
1140accctagcag agaatttgaa agacgttgct ccaatcgatt ttgaaactca
agatataatt 1200agacctttat cgaatcccat taaaaagaat ggtcacatta
tcattatgaa aggtaacgtc 1260tctccggacg gtggtgttgc taaaattaca
ggtaagcagg gattgttttt cgaaggcgtg 1320gcgaattgct ttgattgtga
agaagacatg ttagctgcac tggaaagagg cgaaattaaa 1380aaaggtcaag
tgattataat aaggtatgaa ggccccactg gagggcctgg tatgccggag
1440atgctaactc cgaccagtgc tattatgggt gctgggttag gaaaagatgt
agcactatta 1500acagatggca gattttcagg cgggtcacac ggcttcatta
ttggtcatat tacgcctgag 1560gcacaagtag gtggtccaat tgccctaatc
aaaaacggtg ataagataac tatagacgcg 1620aataaacgta ccatacatgc
ccatgtcagc gaagaagaat ttgctaaaag acgtgccgag 1680tggaaagcac
caccttacag agctactcaa ggtactttaa agaaatacat taagctggtt
1740aaacccgcaa actttggatg tgttaccgat gagtggtaa
1779
94 1764 DNA Ralstonia eutropha
94 atgccgtacg cagatgaccc aaaattacct
caagatgggg ctgcgcctac agaaggtttg 60gccaagggcc ttactaatta tggtgatact
ggtttctctt tattcctgag gaaggctttt 120atcaaaggtg caggttttac
cgatgatgca ctatcaaggc cggtgatagg aattgtaaat 180actggatctt
cttataaccc atgccacggc aacgcccctc aattagtgga ggcggtgaag
240agaggtgtca tgttggcagg tggtttaccc gtagacttcc ctactatatc
cgtccacgag 300tcatttagcg cacccactag tatgtattta aggaacttga
tgtccatgga taccgaagaa 360atgattcgtg ctcagccgat ggacgccgtc
gttctgatag ggggttgtga caaaacagtt 420ccagcccaac tgatgggtgc
cgcatcagct ggagtaccag ccatccaatt agtcacaggt 480tctatgctaa
ctggtagcca tagaagtgag agagtcggag cgtgtacgga ttgtcgtaga
540tactggggta gataccgtgc tgaggagatt gattcagccg agatcgcaga
tgttaataat 600cagttggttg cctcagttgg tacatgctcg gtcatgggga
cagcttcaac aatggcttgt 660gtagcagagg ccttgggtat gatggtttct
ggcggtgctt cggcacctgc tgtgaccgcg 720gatagagtta gggtcgcgga
acgtaccggg acgactgctg ttggaatggc ggcggccagg 780ttgacacctg
atagaatatt aacaggtaaa gcctttgaaa acgctttgag agttctactg
840gcaatcggcg gttcaacaaa tgggatagta catctaacgg ctattgctgg
tagactagga 900atcgacatcg acctagcagg gttggacaga atgtctcgtg
aaacgcctgt tctggttgac 960ttgaaaccta gcggtcaaca ttacatggaa
gattttcata aggccggagg aatgttaacg 1020ttgttacgtg aactgagacc
actattacac ttagatactt tgaccgttag tggaaggacc 1080cttggcgaag
aattagatgc agcaccccct ctgttcccac aagatgtcat tagaagtgca
1140ggtaatccta tttatcccgc aggtggatta gcggtccttc gtggtaattt
ggctccaggc 1200ggggctatca tcaaacaatc cgctgcgaac ccagctctta
tggagcatga aggaagagcc 1260gtagtttttg aaaatgcaga agacatggct
caaagaattg acgacgaatc cttagacgtg 1320aaagctgacg atattcttgt
acttaaaagg attggtccaa ctggcgcccc gggtatgcct 1380gaagctggct
atatgccgat accaaagaag ttagcaagag caggggttaa ggatatggta
1440agagttagtg atggtcgtat gtctggaacg gcagctggca caatagtttt
gcatgtgaca 1500ccagaagcag ccataggggg acccttagcc cttgttcagt
cgggagatag aattaggcta 1560tctgtggcca accgtgaaat tgcattgtta
gtagatgatg ccgaattagc aaggagggcc 1620gctgctcaac ccgtagaaag
accaagggct gagagaggtt atagaaaatt gtttctggag 1680acagtaactc
aggcggatca gggtgttgat ttcgactttt tgagagctgc tcaaactgtg
1740gatacagtcc caaagcaagg ctaa 1764
95 1746 DNA Chromohalobacter salexigens
95 atgactcata agaagagacc tttaagaagt gccgagtggt tcggtaatga
tgacaaaaat 60ggatttatgt atagatcgtg gatgaaaaac caaggtatac ccgatcacga
gtttagaggt 120aaaccgataa ttggtatctg caataccttt agtgaactaa
ctccatgcaa cgcccatttc 180agaaagttag cagaacatgt gaaaaaaggt
gtattagaag caggcggtta cccggttgaa 240tttccagtat tttctaacgg
ggaatctaat ttgagaccaa ctgctatgtt cacaaggaat 300ttggctagta
tggatgtcga ggaagccatt agaggcaatc cattagacgc agtcgtgttg
360cttgtgggtt gtgataaaac aacaccagcc ttacttatgg gtgctgcttc
ttgtgacatt 420ccgactatag ttgttacagg tgggccaatg cttaacggga
aacacaaggg aagagacatc 480ggatcaggta cggtcgtgtg gcagctttct
gaagaggtta aggccgggaa aatttcctta 540catgatttca tggcggctga
ggctggaatg agccgttccg ctggcacttg taacactatg 600ggaaccgcct
ctaccatggc atgcatggcc gaatctcttg gtacttcatt gccacacaat
660gccgctattc cggccgtgga tagccgtagg tatgtacttg cacatttgag
tggtaatagg 720attgtcgaaa tggtcgatga agacctaaca ctgagcaaag
tgctgaccaa gagcgctttt 780gaaaacgcta tcagaacgaa tgctgcgatt
ggcgggtcaa ccaatgcagt aatccatcta 840caggcaatcg caggtagaat
gggggtggac ttgacactag atgactggac aagagtaggt 900cgtggcacgc
ctactatcgt cgatttacaa ccctcgggta ggtacttgat ggaggaattt
960tattatgcgg gaggtctgcc tgcagtttta aggagattgg gggaagctga
tagactaccc 1020cataaagatg ccttaaccgt taatggcaag accctgtggg
aaaacgttca agatgcgcca 1080ttatacaacg acgccgttat tttgccattg
gatgctccct tacgtgagga cggaggcatg 1140tgtgtgatgc gtggtaatct
tgcgcctaac ggggctgtat taaaacctag cgcagcaact 1200cctgctctaa
tgcagcacag gggcagagcg gttgtttttg agaattttga tgattacaaa
1260gccaggataa atgatcctga cttggatgtt actgccgatg atatattagt
aatgaagaac 1320tgtggtccta gaggttatca tggtatggca gaagtaggca
acatgggact gcctgcaaaa 1380ctactggagc agggtgtcac ggacatggtc
cgtatttcag atgcaagaat gagtggaacc 1440gcttacggta ctgttgtatt
gcatgtagct cctgaagctg ctgccggtgg tcccttagct 1500gccgttcgta
atggcgattg gatcgcacta gacgcatatt caggaaaatt acacttggag
1560gtcgatgatg ctgaaatagc gtccagatta gcagaggcag acccaacagc
tgaatcaact 1620aggatagcgt caacaggagg ttacagacaa ctttacattg
aacatgtttt gcaagctgat 1680caaggctgtg atttcgattt cttagttgga
tgcaggggcg cagaagtccc aagacattcc 1740cactaa 174696990DNAPicrophilus
torridus 96atggaaaagg tttatacgga gaacgaccta aaggaaaact tgatgcgtaa
caaaaagata 60gcagttctag gttatggctc acaaggtaga gcttgggcat taaatatgag
agacagcgga 120ttaaatgtga cagtgggatt ggaaagacag gggaaatctt
gggaaaaagc cgttgctgat 180ggctttaagc cacttaagtc aagagatgct
gttagagacg ctgacgcagt cattttctta 240gtcccagaca tggcccagag
agaattatat aagaatatta tgaatgatat taaagatgac 300gcagacatcg
tttttgccca cggctttaac gttcattatg gtcttattaa tcctaaaaac
360catgatgttt acatggtggc tcctaaagca cccggcccat cggtaaggga
gttttacgaa 420agagggggag gggtcccggt tcttattgct gttgcaaatg
atgtctcggg ccgttctaaa 480gaaaaggcgt taagtatagc gtatagcttg
ggtgccttga gagcaggtgc gattgaaacc 540accttcaaag aggaaactga
aacagaccta atcggtgaac aattggatct ggttggaggt 600attactgaat
tactaagatc aacgtttaat attatggttg aaatgggtta taaaccagaa
660atggcttatt ttgaggccat caatgagatg aagttgatag tagaccaggt
attcgaaaaa 720ggtatttctg gtatgcttag agccgtaagt gataccgcta
aatatggagg tctgacaact 780ggtaagtaca taataaatga tgatgtaaga
aaaaggatga gggaaagggc agaatacatt 840gtgtcaggaa aattcgctga
ggagtggatt gaagaatacg gcgagggttc taagaatctg 900gaaagtatga
tgttggatat cgataactcc ctagaagagc aagttggaaa gcaattaaga
960gaaatcgtct taaggggacg tcctaagtaa 990971683DNASulfolobus tokodaii
97atgaacccag acaagaaaaa acgttcgaat ctgatatatg gtggatacga gaaggctcct
60aacagggcct tcttgaaagc catgggcttg acggatgatg acatcgctaa accaatagtc
120ggtgtcgctg ttgcttggaa tgaagctggc ccatgtaata ttcatttact
aggtttatct 180aatattgtta aagaaggagt gaggtcaggg ggtggcactc
cgagggtatt taccgcccct 240gttgtgattg acggtatcgc aatgggttct
gaagggatga agtattccct tgtttcaaga 300gaaattgtgg caaatacggt
cgagcttgtg gttaatgctc acgggtacga tggtttcgtt 360gcattagctg
ggtgtgacaa gactccacca ggaatgatga tggcaatggc tagattaaac
420attcccagca ttatcatgta tggaggcaca acactacctg gtaatttcaa
aggaaaaccc 480atcactatcc aggatgtata tgaggctgtt ggggcttatt
ctaaaggaaa gattacagca 540gaagatttaa gattgatgga agataatgct
attccaggtc cgggaacctg cggcggtcta 600tacacagcca atactatggg
cttaatgaca gaagcccttg gtcttgcgct accaggcagt 660gcttctcctc
cagcagtgga tagtgcaagg gtaaaatatg catacgaaac gggtaaagcc
720ctaatgaatt taatcgaaat cgggttaaaa cctcgtgaca ttcttacctt
tgaagccttt 780gaaaacgcaa taaccgtatt gatggcgtcg ggcggatcaa
ccaacgcagt gttgcattta 840ctggcgatag catacgaagc aggcgttaaa
ttaactttag atgattttga tcgtatatcc 900caaagaacac cagaaattgt
taacatgaag cctggaggtg aatacgctat gtacgatttg 960catagggtcg
gtggtgctcc cctgataatg aagaaattgc ttgaggccga cttattgcac
1020ggtgatgtaa taactgttac tggtaagacc gtcaaacaga atcttgagga
gtataagttg 1080ccaaatgttc cacacgaaca cattgtcagg cccatatcca
acccttttaa cccaacagga 1140gggataagaa ttttgaaggg ttcactggct
ccagagggcg cagtaattaa agtctccgcc 1200actaaggtga gataccataa
gggtccagcg agagtcttca attccgaaga ggaagccttt 1260aaggcagttc
tggaagaaaa aatccaagag aatgatgtag ttgttatcag atatgaagga
1320cctaagggcg gtcctggaat gcgtgaaatg ttggctgtca cgtcggctat
cgtgggtcaa 1380ggtttaggtg aaaaagttgc cttgattact gacggtagat
tttcaggagc cacgagaggt 1440attatggtcg gacatgtagc tcccgaggcg
gcagtaggtg gtccgatagc tttgctgagg 1500gacggtgaca caatcataat
tgatgcaaat aatggcagac tagacgtcga tctacctcaa 1560gaagaattaa
agaaaagagc tgatgagtgg acgcctcctc ccccgaaata taaaagtgga
1620ttattggctc aatacgctag actagttagc agttcttcac taggtgcggt
gctattgact 1680taa 1683981476DNAArtificial SequenceE. coli
ilvC(Q110V) 98atggccaact attttaacac attaaatttg agacaacaat
tggctcaact gggtaagtgc 60agatttatgg gaagggacga gtttgctgat ggtgcttctt
atctgcaagg aaagaaagta 120gtaattgttg gctgcggtgc tcagggtcta
aaccaaggtt taaacatgag agattcaggt 180ctggatattt cgtatgcatt
gaggaaagag gcaattgcag aaaagagggc ctcctggcgt 240aaagcgacgg
aaaatgggtt caaagttggt acttacgaag aactgatccc tcaggcagat
300ttagtgatta acctaacacc agataaggtt cactcagacg tagtaagaac agttcaaccg
360ctgatgaagg atggggcagc tttaggttac tctcatggct ttaatatcgt
tgaagtgggc 420gagcagatca gaaaagatat aacagtcgta atggttgcac
caaagtgccc aggtacggaa 480gtcagagagg agtacaagag gggttttggt
gtacctacat tgatcgccgt acatcctgaa 540aatgacccca aaggtgaagg
tatggcaatt gcgaaggcat gggcagccgc aaccggaggt 600catagagcgg
gtgtgttaga gagttctttc gtagctgagg tcaagagtga cttaatgggt
660gaacaaacca ttctgtgcgg aatgttgcag gcagggtctt tactatgctt
tgataaattg 720gtcgaagagg gtacagatcc tgcctatgct gaaaagttga
tacaatttgg ttgggagaca 780atcaccgagg cacttaaaca aggtggcata
acattgatga tggatagact ttcaaatccg 840gccaagctaa gagcctacgc
cttatctgag caactaaaag agatcatggc accattattc 900caaaagcaca
tggacgatat tatctccggt gagttttcct caggaatgat ggcagattgg
960gcaaacgatg ataaaaagtt attgacgtgg agagaagaaa ccggcaagac
ggcattcgag 1020acagccccac aatacgaagg taaaattggt gaacaagaat
actttgataa gggagtattg 1080atgatagcta tggtgaaggc aggggtagaa
cttgcattcg aaactatggt tgactccggt 1140atcattgaag aatctgcata
ctatgagtct ttgcatgaat tgcctttgat agcaaatact 1200attgcaagaa
aaagacttta cgagatgaat gttgtcatat cagacactgc agaatatggt
1260aattacttat ttagctacgc atgtgtcccg ttgttaaagc ccttcatggc
cgagttacaa 1320cctggtgatt tggggaaggc tattccggaa ggagcggttg
acaatggcca actgagagac 1380gtaaatgaag ctattcgttc acatgctata
gaacaggtgg gtaaaaagct gagaggatat 1440atgaccgata tgaaaagaat
tgcagtggca ggatga 1476991647DNALactococcus lactis 99atgtatactg
ttggtgatta tctgctggac cgtctgcatg aactgggtat cgaagaaatc 60ttcggcgttc
cgggtgatta caatctgcag ttcctggatc agatcatctc tcataaagac
120atgaaatggg tgggtaacgc taacgaactg aacgcaagct acatggcaga
tggttatgca 180cgtaccaaga aagccgcggc atttctgacc actttcggtg
ttggcgaact gagcgccgtc 240aacggtctgg cgggctccta cgccgaaaac
ctgccggtgg tggagatcgt aggcagccca 300acgagcaaag ttcagaacga
aggtaaattc gtccaccaca ctctggctga cggcgatttc 360aaacacttca
tgaaaatgca tgaacctgtg actgcggcac gtacgctgct gactgcagag
420aacgctactg tggaaatcga ccgcgttctg tctgcgctgc tgaaagaacg
caaaccagtt 480tacatcaacc tgcctgtgga tgttgcggca gctaaagcgg
aaaaaccgag cctgccgctg 540aagaaagaaa actccacttc taacactagc
gaccaggaaa tcctgaacaa aatccaggag 600tctctgaaaa acgcaaagaa
accaatcgtg atcaccggcc acgaaatcat ttcttttggt 660ctggagaaga
ccgtgaccca attcatcagc aaaaccaaac tgccgattac caccctgaac
720ttcggcaagt cctctgttga cgaggctctg ccgtctttcc tgggcatcta
caacggtact 780ctgagcgaac cgaacctgaa agaatttgtt gaatctgcgg
acttcatcct gatgctgggc 840gttaaactga ccgactcttc taccggtgca
ttcactcacc atctgaacga aaacaaaatg 900attagcctga acatcgacga
gggtaaaatc ttcaacgagc gtatccagaa cttcgacttc 960gaaagcctga
tcagctctct gctggacctg tccgaaatcg agtataaagg caaatacatt
1020gacaaaaagc aagaagattt cgtaccatct aacgcactgc tgtcccagga
tcgcctgtgg 1080caggccgtgg agaacctgac ccagagcaat gaaaccatcg
tggcggaaca aggtacgagc 1140tttttcggcg cgtcttctat ctttctgaaa
tccaaaagcc attttatcgg tcagccgctg 1200tggggtagca ttggctatac
tttcccggca gcgctgggct ctcagatcgc tgataaagaa 1260tctcgtcatc
tgctgttcat cggtgacggt tccctgcagc tgaccgtaca ggaactgggt
1320ctggcaattc gtgaaaagat caacccgatt tgcttcatta ttaacaatga
cggctacacc 1380gttgagcgtg agatccacgg tccgaaccag tcttacaacg
atatccctat gtggaactac 1440tctaaactgc cggagtcctt cggcgcaact
gaggaccgtg ttgtgtctaa aattgtgcgt 1500accgaaaacg aatttgtgag
cgtgatgaaa gaggcccagg ccgatccgaa ccgtatgtac 1560tggatcgaac
tgatcctggc gaaagaaggc gcaccgaagg tactgaagaa aatgggcaag
1620ctgtttgctg aacagaataa atcctaa 16471001188DNASaccharomyces
cerevisiae 100atgttgagaa ctcaagccgc cagattgatc tgcaactccc
gtgtcatcac tgctaagaga 60acctttgctt tggccacccg tgctgctgct tacagcagac
cagctgcccg tttcgttaag 120ccaatgatca ctacccgtgg tttgaagcaa
atcaacttcg gtggtactgt tgaaaccgtc 180tacgaaagag ctgactggcc
aagagaaaag ttgttggact acttcaagaa cgacactttt 240gctttgatcg
gttacggttc ccaaggttac ggtcaaggtt tgaacttgag agacaacggt
300ttgaacgtta tcattggtgt ccgtaaagat ggtgcttctt ggaaggctgc
catcgaagac 360ggttgggttc caggcaagaa cttgttcact gttgaagatg
ctatcaagag aggtagttac 420gttatgaact tgttgtccga tgccgctcaa
tcagaaacct ggcctgctat caagccattg 480ttgaccaagg gtaagacttt
gtacttctcc cacggtttct ccccagtctt caaggacttg 540actcacgttg
aaccaccaaa ggacttagat gttatcttgg ttgctccaaa gggttccggt
600agaactgtca gatctttgtt caaggaaggt cgtggtatta actcttctta
cgccgtctgg 660aacgatgtca ccggtaaggc tcacgaaaag gcccaagctt
tggccgttgc cattggttcc 720ggttacgttt accaaaccac tttcgaaaga
gaagtcaact ctgacttgta cggtgaaaga 780ggttgtttaa tgggtggtat
ccacggtatg ttcttggctc aatacgacgt cttgagagaa 840aacggtcact
ccccatctga agctttcaac gaaaccgtcg aagaagctac ccaatctcta
900tacccattga tcggtaagta cggtatggat tacatgtacg atgcttgttc
caccaccgcc 960agaagaggtg ctttggactg gtacccaatc ttcaagaatg
ctttgaagcc tgttttccaa 1020gacttgtacg aatctaccaa gaacggtacc
gaaaccaaga gatctttgga attcaactct 1080caacctgact acagagaaaa
gctagaaaag gaattagaca ccatcagaaa catggaaatc 1140tggaaggttg
gtaaggaagt cagaaagttg agaccagaaa accaataa 118810120DNAArtificial
SequencePrimer 1321 101aatcatatcg aacacgatgc 2010220DNAArtificial
SequencePrimer 1322 102tcagaaagga tcttctgctc 2010320DNAArtificial
SequencePrimer 1323 103atcgatatcg tgaaatacgc 2010420DNAArtificial
SequencePrimer 1324 104agctggtctg gtgattctac 2010538DNAArtificial
SequencePrimer 1409 105attgatgcgg ccgcgattta atctctaatt attagtta
3810634DNAArtificial SequencePrimer 1410 106cacccagtcg cgacatccaa
tttatagaaa tcag 3410732DNAArtificial SequencePrimer 1411
107attggatgtc gcgactgggt gagcatatgt tc 3210832DNAArtificial
SequencePrimer 1412 108gagaaagccg gcaggagagt gaaagagcct tg
3210921DNAArtificial SequencePrimer 1440 109atcgtacatc ttccaagcat c
2111020DNAArtificial SequencePrimer 1441 110aatcggaacc ctaaagggag
2011120DNAArtificial SequencePrimer 1443 111tgcagatgca gatgtgagac
2011224DNAArtificial SequencePrimer 1587 112cggctgccag aactctacta
actg 2411323DNAArtificial SequencePrimer 1588 113gcgacgtcta
ctggcaggtt aat 2311424DNAArtificial SequencePrimer 1633
114tccgtcactg gattcaatgc catc 2411520DNAArtificial SequencePrimer
1634 115ttcgccaggg agctggtgaa 20116771DNADrosophila melanogaster
116atgtcgttta ctttgaccaa caagaacgtg attttcgttg ccggtctggg
aggcattggt 60ctggacacca gcaaggagct gctcaagcgc gatctgaaga acctggtgat
cctcgaccgc 120attgagaacc cggctgccat tgccgagctg aaggcaatca
atccaaaggt gaccgtcacc 180ttctacccct atgatgtgac cgtgcccatt
gccgagacca ccaagctgct gaagaccatc 240ttcgcccagc tgaagaccgt
cgatgtcctg atcaacggag ctggtatcct ggacgatcac 300cagatcgagc
gcaccattgc cgtcaactac actggcctgg tcaacaccac gacggccatt
360ctggacttct gggacaagcg caagggcggt cccggtggta tcatctgcaa
cattggatcc 420gtcactggat tcaatgccat ctaccaggtg cccgtctact
ccggcaccaa ggccgccgtg 480gtcaacttca ccagctccct ggcgaaactg
gcccccatta ccggcgtgac ggcttacact 540gtgaaccccg gcatcacccg
caccaccctg gtgcacacgt tcaactcctg gttggatgtt 600gagcctcagg
ttgccgagaa gctcctggct catcccaccc agccctcgtt ggcctgcgcc
660gagaacttcg tcaaggctat cgagctgaac cagaacggag ccatctggaa
actggacttg 720ggcaccctgg aggccatcca gtggaccaag cactgggact
ccggcatcta a 7711178870DNAArtificial SequencepGV1914 117tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct
gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag
aaaataccgc atcaggcgcc 240attcgccatt caggctgcgc aactgttggg
aagggcgatc ggtgcgggcc tcttcgctat 300tacgccagct ggcgaaaggg
ggatgtgctg caaggcgatt aagttgggta acgccagggt 360tttcccagtc
acgacgttgt aaaacgacgg ccagtgaatt cataccacag cttttcaatt
420caattcatca tttttttttt attctttttt ttgatttcgg tttccttgaa
atttttttga 480ttcggtaatc tccgaacaga aggaagaacg aaggaaggag
cacagactta gattggtata 540tatacgcata tgtagtgttg aagaaacatg
aaattgccca gtattcttaa cccaactgca 600cagaacaaaa acctgcagga
aacgaagata aatcatgtcg aaagctacat ataaggaacg 660tgctgctact
catcctagtc ctgttgctgc caagctattt aatatcatgc acgaaaagca
720aacaaacttg tgtgcttcat tggatgttcg taccaccaag gaattactgg
agttagttga 780agcattaggt cccaaaattt gtttactaaa aacacatgtg
gatatcttga ctgatttttc 840catggagggc acagttaagc cgctaaaggc
attatccgcc aagtacaatt ttttactctt 900cgaagacaga aaatttgctg
acattggtaa tacagtcaaa ttgcagtact ctgcgggtgt 960atacagaata
gcagaatggg cagacattac gaatgcacac ggtgtggtgg gcccaggtat
1020tgttagcggt ttgaagcagg cggcagaaga agtaacaaag gaacctagag
gccttttgat 1080gttagcagaa ttgtcatgca agggctccct atctactgga
gaatatacta agggtactgt 1140tgacattgcg aagagcgaca aagattttgt
tatcggcttt attgctcaaa gagacatggg 1200tggaagagat gaaggttacg
attggttgat tatgacaccc ggtgtgggtt tagatgacaa 1260gggagacgca
ttgggtcaac agtatagaac cgtggatgat gtggtctcta caggatctga
1320cattattatt gttggaagag gactatttgc aaagggaagg gatgctaagg
tagagggtga 1380acgttacaga aaagcaggct gggaagcata tttgagaaga
tgcggccagc aaaactaaaa 1440aactgtatta taagtaaatg catgtatact
aaactcacaa attagagctt caatttaatt 1500atatcagtta ttaccctatg
cggtgtgaaa taccgcacag atgcgtaagg agaaaatacc 1560gcatcaggaa
attgtaaacg ttaatatttt gttaaaattc gcgttaaatt tttgttaaat
1620cagctcattt tttaaccaat aggccgaaat cggcaaaatc ccttataaat
caaaagaata 1680gaccgagata gggttgagtg ttgttccagt ttggaacaag
agtccactat taaagaacgt 1740ggactccaac gtcaaagggc gaaaaaccgt
ctatcagggc gatggcccac tacgtgaacc 1800atcaccctaa tcaagttttt
tggggtcgag gtgccgtaaa gcactaaatc ggaaccctaa 1860agggagcccc
cgatttagag cttgacgggg aaagccggcg aggactgcaa tagcacaaga
1920ttaagataga atggcttcaa acagccgcct tttatacata ttggtaaaag
ctcgcgaatc 1980gcaccatatc ccttatcctg taatcaaatc gatctaggtg
cagatacaga tcaattcata 2040aaaagaaatt gaagcaccag tttatcacta
ctacactatc tttttctttt tttttttttt 2100ttgcgcagtt tcgccctttg
ttcaatatca cttgataagt tgtgggcttt ttctgtcact 2160cattcggctt
aaaaagtatt cgttcttttg tgttttatga aaagggaacg tgatataaaa
2220aaacatcctt tggtgtggga catgggcttt tgtttagaga atggttatca
ctaccgcccc 2280cacccttgaa agccacagaa aatgaaaaag tatgtgaata
aggtgtgaac tctataacat 2340tttggccaaa tgccacagcc gatctgcata
ttccaatgga catgatgcaa caacaattga 2400tgtcacattc tcttacacac
ttcgattggt ccgtacgtag tactttttac ataactgact 2460caggcgtttc
cttcattgaa atgctcatct attgccaagt acatagaatc cacagtgcat
2520aggttaacgc attgtaccca aacgacggga aacaaggaag gatgcagaat
gagcacttgt 2580tatttataaa aagacacggg agggggaatc ccgtctttcg
tccgtcggag ccaaagagat 2640gagccaaagc agaaaaacag gggacgccgc
ccttcttccg tcccgtgcgt gaggggggcg 2700cggccattcg gtttttgcaa
tatgacctgt gggccaaaaa tcgaaaaaaa aaaaaaaaat 2760aagaggcggc
tgcggaattt tataagacaa gcgcagggcc aaagaaaaaa taataattga
2820cgtggctgaa caacagtctc tccccacccc tttccaaaaa ggggaatgaa
atacgagttc 2880tttttcccaa ttggtagata ttcaacaaga gacgcgcagt
acgtaacatg cgaattgcgt 2940aattcacggc gataacgtag tatttagatt
tagtataatt tgaaccgatg tatttatttg 3000tctgattgat ttatgtattc
aaactgtgta agtttattta tttgcaacaa taattcgttt 3060gagtacacta
ctaatggcgg ccgcttagat gccggagtcc cagtgcttgg tccactggat
3120ggcctccagg gtgcccaagt ccagtttcca gatggctccg ttctggttca
gctcgatagc 3180cttgacgaag ttctcggcgc aggccaacga gggctgggtg
ggatgagcca ggagcttctc 3240ggcaacctga ggctcaacat ccaaccagga
gttgaacgtg tgcaccaggg tggtgcgggt 3300gatgccgggg ttcacagtgt
aagccgtcac gccggtaatg ggggccagtt tcgccaggga 3360gctggtgaag
ttgaccacgg cggccttggt gccggagtag acgggcacct ggtagatggc
3420attgaatcca gtgacggatc caatgttgca gatgatacca ccgggaccgc
ccttgcgctt 3480gtcccagaag tccagaatgg ccgtcgtggt gttgaccagg
ccagtgtagt tgacggcaat 3540ggtgcgctcg atctggtgat cgtccaggat
accagctccg ttgatcagga catcgacggt 3600cttcagctgg gcgaagatgg
tcttcagcag cttggtggtc tcggcaatgg gcacggtcac 3660atcatagggg
tagaaggtga cggtcacctt tggattgatt gccttcagct cggcaatggc
3720agccgggttc tcaatgcggt cgaggatcac caggttcttc agatcgcgct
tgagcagctc 3780cttgctggtg tccagaccaa tgcctcccag accggcaacg
aaaatcacgt tcttgttggt 3840caaagtaaac gacataccgg tatctcctag
atccgtcgaa gtcgaaacta agttctggtg 3900ttttaaaact aaaaaaaaga
ctaactataa aagtagaatt taagaagttt aagaaataga 3960tttacagaat
tacaatcaat acctaccgtc tttatatact tattagtcaa gtaggggaat
4020aatttcaggg aactggtttc aacctttttt ttcagctttt tccaaatcag
agagagcaga 4080aggtaataga aggtgtaaga aaatgagata gatacatgcg
tgggtcaatt gccttgtgtc 4140atcatttact ccaggcaggt tgcatcactc
cattgaggtt gtgcccgttt tttgcctgtt 4200tgtgcccctg ttctctgtag
ttgcgctaag agaatggacc tatgaactga tggttggtga 4260agaaaacaat
attttggtgc tgggattctt tttttttctg gatgccagct taaaaagcgg
4320gctccattat atttagtgga tgccaggaat aaactgttca cccagacacc
tacgatgtta 4380tatattctgt gtaacccgcc ccctattttg ggcatgtacg
ggttacagca gaattaaaag 4440gctaattttt tgactaaata aagttaggaa
aatcactact attaattatt tacgtattct 4500ttgaaatggc gagtattgat
aatgataaac tggatcctta ggatttattc tgttcagcaa 4560acagcttgcc
cattttcttc agtaccttcg gtgcgccttc tttcgccagg atcagttcga
4620tccagtacat acggttcgga tcggcctggg cctctttcat cacgctcaca
aattcgtttt 4680cggtacgcac aattttagac acaacacggt cctcagttgc
gccgaaggac tccggcagtt 4740tagagtagtt ccacataggg atatcgttgt
aagactggtt cggaccgtgg atctcacgct 4800caacggtgta gccgtcattg
ttaataatga agcaaatcgg gttgatcttt tcacgaattg 4860ccagacccag
ttcctgtacg gtcagctgca gggaaccgtc accgatgaac agcagatgac
4920gagattcttt atcagcgatc tgagagccca gcgctgccgg gaaagtatag
ccaatgctac 4980cccacagcgg ctgaccgata aaatggcttt tggatttcag
aaagatagaa gacgcgccga 5040aaaagctcgt accttgttcc gccacgatgg
tttcattgct ctgggtcagg ttctccacgg 5100cctgccacag gcgatcctgg
gacagcagtg cgttagatgg tacgaaatct tcttgctttt 5160tgtcaatgta
tttgccttta tactcgattt cggacaggtc cagcagagag ctgatcaggc
5220tttcgaagtc gaagttctgg atacgctcgt tgaagatttt accctcgtcg
atgttcaggc 5280taatcatttt gttttcgttc agatggtgag tgaatgcacc
ggtagaagag tcggtcagtt 5340taacgcccag catcaggatg aagtccgcag
attcaacaaa ttctttcagg ttcggttcgc 5400tcagagtacc gttgtagatg
cccaggaaag acggcagagc ctcgtcaaca gaggacttgc 5460cgaagttcag
ggtggtaatc ggcagtttgg ttttgctgat gaattgggtc acggtcttct
5520ccagaccaaa agaaatgatt tcgtggccgg tgatcacgat tggtttcttt
gcgtttttca 5580gagactcctg gattttgttc aggatttcct ggtcgctagt
gttagaagtg gagttttctt 5640tcttcagcgg caggctcggt ttttccgctt
tagctgccgc aacatccaca ggcaggttga 5700tgtaaactgg tttgcgttct
ttcagcagcg cagacagaac gcggtcgatt tccacagtag 5760cgttctctgc
agtcagcagc gtacgtgccg cagtcacagg ttcatgcatt ttcatgaagt
5820gtttgaaatc gccgtcagcc agagtgtggt ggacgaattt accttcgttc
tgaactttgc 5880tcgttgggct gcctacgatc tccaccaccg gcaggttttc
ggcgtaggag cccgccagac 5940cgttgacggc gctcagttcg ccaacaccga
aagtggtcag aaatgccgcg gctttcttgg 6000tacgtgcata accatctgcc
atgtagcttg cgttcagttc gttagcgtta cccacccatt 6060tcatgtcttt
atgagagatg atctgatcca ggaactgcag attgtaatca cccggaacgc
6120cgaagatttc ttcgataccc agttcatgca gacggtccag cagataatca
ccaacagtat 6180acatgtcgac aaacttagat tagattgcta tgctttcttt
ctaatgagca agaagtaaaa 6240aaagttgtaa tagaacaaga aaaatgaaac
tgaaacttga gaaattgaag accgtttatt 6300aacttaaata tcaatgggag
gtcatcgaaa gagaaaaaaa tcaaaaaaaa aattttcaag 6360aaaaagaaac
gtgataaaaa tttttattgc ctttttcgac gaagaaaaag aaacgaggcg
6420gtctcttttt tcttttccaa acctttagta cgggtaatta acgacaccct
agaggaagaa 6480agaggggaaa tttagtatgc tgtgcttggg tgttttgaag
tggtacggcg atgcgcggag 6540tccgagaaaa tctggaagag taaaaaagga
gtagaaacat tttgaagcta tgagctccag 6600cttttgttcc ctttagtgag
ggttaattgc gcgcttggcg taatcatggt catagctgtt 6660tcctgtgtga
aattgttatc cgctcacaat tccacacaac ataggagccg gaagcataaa
6720gtgtaaagcc tggggtgcct aatgagtgag gtaactcaca ttaattgcgt
tgcgctcact 6780gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat
taatgaatcg gccaacgcgc 6840ggggagaggc ggtttgcgta ttgggcgctc
ttccgcttcc tcgctcactg actcgctgcg 6900ctcggtcgtt cggctgcggc
gagcggtatc agctcactca aaggcggtaa tacggttatc 6960cacagaatca
ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag
7020gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc
ctgacgagca 7080tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg
acaggactat aaagatacca 7140ggcgtttccc cctggaagct ccctcgtgcg
ctctcctgtt ccgaccctgc cgcttaccgg 7200atacctgtcc gcctttctcc
cttcgggaag cgtggcgctt tctcatagct cacgctgtag 7260gtatctcagt
tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt
7320tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc
cggtaagaca 7380cgacttatcg ccactggcag cagccactgg taacaggatt
agcagagcga ggtatgtagg 7440cggtgctaca gagttcttga agtggtggcc
taactacggc tacactagaa ggacagtatt 7500tggtatctgc gctctgctga
agccagttac cttcggaaaa agagttggta gctcttgatc 7560cggcaaacaa
accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg
7620cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg
acgctcagtg 7680gaacgaaaac tcacgttaag ggattttggt catgagatta
tcaaaaagga tcttcaccta 7740gatcctttta aattaaaaat gaagttttaa
atcaatctaa agtatatatg agtaaacttg 7800gtctgacagt taccaatgct
taatcagtga ggcacctatc tcagcgatct gtctatttcg 7860ttcatccata
gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc
7920atctggcccc agtgctgcaa tgataccgcg agacccacgc tcaccggctc
cagatttatc 7980agcaataaac cagccagccg gaagggccga gcgcagaagt
ggtcctgcaa ctttatccgc 8040ctccatccag tctattaatt gttgccggga
agctagagta agtagttcgc cagttaatag 8100tttgcgcaac gttgttgcca
ttgctacagg catcgtggtg tcacgctcgt cgtttggtat 8160ggcttcattc
agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg
8220caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt
tggccgcagt 8280gttatcactc atggttatgg cagcactgca taattctctt
actgtcatgc catccgtaag 8340atgcttttct gtgactggtg agtactcaac
caagtcattc tgagaatagt gtatgcggcg 8400accgagttgc tcttgcccgg
cgtcaatacg ggataatacc gcgccacata gcagaacttt 8460aaaagtgctc
atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct
8520gttgagatcc agttcgatgt aacccactcg tgcacccaac tgatcttcag
catcttttac 8580tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa
aatgccgcaa aaaagggaat 8640aagggcgaca cggaaatgtt gaatactcat
actcttcctt tttcaatatt attgaagcat 8700ttatcagggt tattgtctca
tgagcggata catatttgaa tgtatttaga aaaataaaca 8760aataggggtt
ccgcgcacat ttccccgaaa agtgccacct gacgtctaag aaaccattat
8820tatcatgaca ttaacctata aaaataggcg tatcacgagg ccctttcgtc
88701184857DNAArtificial SequenceSacI-NotI fragment 118gagctcatag
cttcaaaatg tttctactcc ttttttactc ttccagattt tctcggactc 60cgcgcatcgc
cgtaccactt caaaacaccc aagcacagca tactaaattt cccctctttc
120ttcctctagg gtgtcgttaa ttacccgtac taaaggtttg gaaaagaaaa
aagagaccgc 180ctcgtttctt tttcttcgtc gaaaaaggca ataaaaattt
ttatcacgtt tctttttctt 240gaaaattttt tttttgattt ttttctcttt
cgatgacctc ccattgatat ttaagttaat 300aaacggtctt caatttctca
agtttcagtt tcatttttct tgttctatta caactttttt 360tacttcttgc
tcattagaaa gaaagcatag caatctaatc taagtttgtc gacatgaaga
420agctcaacaa gtactcgtat atcatcactg aacctaaggg ccaaggtgcg
tcccaggcca 480tgctttatgc caccggtttc aagaaggaag atttcaagaa
gcctcaagtc ggggttggtt 540cctgttggtg gtccggtaac ccatgtaaca
tgcatctatt ggacttgaat aacagatgtt 600ctcaatccat tgaaaaagcg
ggtttgaaag ctatgcagtt caacaccatc ggtgtttcag 660acggtatctc
tatgggtact aaaggtatga gatactcgtt acaaagtaga gaaatcattg
720cagactcctt tgaaaccatc atgatggcac aacactacga tgctaacatc
gccatcccat 780catgtgacaa aaacatgccc ggtgtcatga tggccatggg
tagacataac agaccttcca 840tcatggtata tggtggtact atcttgcccg
gtcatccaac atgtggttct tcgaagatct 900ctaaaaacat cgatatcgtc
tctgcgttcc aatcctacgg tgaatatatt tccaagcaat 960tcactgaaga
agaaagagaa gatgttgtgg aacatgcatg cccaggtcct ggttcttgtg
1020gtggtatgta tactgccaac acaatggctt ctgccgctga agtgctaggt
ttgaccattc 1080caaactcctc ttccttccca gccgtttcca aggagaagtt
agctgagtgt gacaacattg 1140gtgaatacat caagaagaca atggaattgg
gtattttacc tcgtgatatc ctcacaaaag 1200aggcttttga aaacgccatt
acttatgtcg ttgcaaccgg tgggtccact aatgctgttt 1260tgcatttggt
ggctgttgct cactctgcgg gtgtcaagtt gtcaccagat gatttccaaa
1320gaatcagtga tactacacca ttgatcggtg acttcaaacc ttctggtaaa
tacgtcatgg 1380ccgatttgat taacgttggt ggtacccaat ctgtgattaa
gtatctatat gaaaacaaca 1440tgttgcacgg taacacaatg actgttaccg
gtgacacttt ggcagaacgt gcaaagaaag 1500caccaagcct acctgaagga
caagagatta ttaagccact ctcccaccca atcaaggcca 1560acggtcactt
gcaaattctg tacggttcat tggcaccagg tggagctgtg ggtaaaatta
1620ccggtaagga aggtacttac ttcaagggta gagcacgtgt gttcgaagag
gaaggtgcct 1680ttattgaagc cttggaaaga ggtgaaatca agaagggtga
aaaaaccgtt gttgttatca 1740gatatgaagg tccaagaggt gcaccaggta
tgcctgaaat gctaaagcct tcctctgctc 1800tgatgggtta cggtttgggt
aaagatgttg cattgttgac tgatggtaga ttctctggtg 1860gttctcacgg
gttcttaatc ggccacattg ttcccgaagc cgctgaaggt ggtcctatcg
1920ggttggtcag agacggcgat gagattatca ttgatgctga taataacaag
attgacctat 1980tagtctctga taaggaaatg gctcaacgta aacaaagttg
ggttgcacct ccacctcgtt 2040acacaagagg tactctatcc aagtatgcta
agttggtttc caacgcttcc aacggttgtg 2100ttttagatgc ttgaggatcc
agtttatcat tatcaatact cgccatttca aagaatacgt 2160aaataattaa
tagtagtgat tttcctaact ttatttagtc aaaaaattag ccttttaatt
2220ctgctgtaac ccgtacatgc ccaaaatagg gggcgggtta cacagaatat
ataacatcgt 2280aggtgtctgg gtgaacagtt tattcctggc atccactaaa
tataatggag cccgcttttt 2340aagctggcat ccagaaaaaa aaagaatccc
agcaccaaaa tattgttttc ttcaccaacc 2400atcagttcat aggtccattc
tcttagcgca actacagaga acaggggcac aaacaggcaa 2460aaaacgggca
caacctcaat ggagtgatgc aacctgcctg gagtaaatga tgacacaagg
2520caattgaccc acgcatgtat ctatctcatt ttcttacacc ttctattacc
ttctgctctc 2580tctgatttgg aaaaagctga aaaaaaaggt tgaaaccagt
tccctgaaat tattccccta 2640cttgactaat aagtatataa agacggtagg
tattgattgt aattctgtaa atctatttct 2700taaacttctt aaattctact
tttatagtta gtcttttttt tagttttaaa acaccagaac 2760ttagtttcga
ctcgagatgg ccaactattt taacacatta aatttgagac aacaattggc
2820tcaactgggt aagtgcagat ttatgggaag ggacgagttt gctgatggtg
cttcttatct 2880gcaaggaaag aaagtagtaa ttgttggctg cggtgctcag
ggtctaaacc aaggtttaaa 2940catgagagat tcaggtctgg atatttcgta
tgcattgagg aaagaggcaa ttgcagaaaa 3000gagggcctcc tggcgtaaag
cgacggaaaa tgggttcaaa gttggtactt acgaagaact 3060gatccctcag
gcagatttag tgattaacct aacaccagat aaggttcact cagacgtagt
3120aagaacagtt caaccgctga tgaaggatgg ggcagcttta ggttactctc
atggctttaa 3180tatcgttgaa gtgggcgagc agatcagaaa agatataaca
gtcgtaatgg ttgcaccaaa 3240gtgcccaggt acggaagtca gagaggagta
caagaggggt tttggtgtac ctacattgat 3300cgccgtacat cctgaaaatg
accccaaagg tgaaggtatg gcaattgcga aggcatgggc 3360agccgcaacc
ggaggtcata gagcgggtgt gttagagagt tctttcgtag ctgaggtcaa
3420gagtgactta atgggtgaac aaaccattct gtgcggaatg ttgcaggcag
ggtctttact 3480atgctttgat aaattggtcg aagagggtac agatcctgcc
tatgctgaaa agttgataca 3540atttggttgg gagacaatca ccgaggcact
taaacaaggt ggcataacat tgatgatgga 3600tagactttca aatccggcca
agctaagagc ctacgcctta tctgagcaac taaaagagat 3660catggcacca
ttattccaaa agcacatgga cgatattatc tccggtgagt tttcctcagg
3720aatgatggca gattgggcaa acgatgataa aaagttattg acgtggagag
aagaaaccgg 3780caagacggca ttcgagacag ccccacaata cgaaggtaaa
attggtgaac aagaatactt 3840tgataaggga gtattgatga tagctatggt
gaaggcaggg gtagaacttg cattcgaaac 3900tatggttgac tccggtatca
ttgaagaatc tgcatactat gagtctttgc atgaattgcc 3960tttgatagca
aatactattg caagaaaaag actttacgag atgaatgttg tcatatcaga
4020cactgcagaa tatggtaatt acttatttag ctacgcatgt gtcccgttgt
taaagccctt 4080catggccgag ttacaacctg gtgatttggg gaaggctatt
ccggaaggag cggttgacaa 4140tggccaactg agagacgtaa atgaagctat
tcgttcacat gctatagaac aggtgggtaa 4200aaagctgaga ggatatatga
ccgatatgaa aagaattgca gtggcaggat gaagatccgc 4260ggccgctcga
gtcatgtaat tagttatgtc acgcttacat tcacgccctc cccccacatc
4320cgctctaacc gaaaaggaag gagttagaca acctgaagtc taggtcccta
tttatttttt 4380tatagttatg ttagtattaa gaacgttatt tatatttcaa
atttttcttt tttttctgta 4440cagacgcgtg tacgcatgta acattatact
gaaaaccttg cttgagaagg ttttgggacg 4500ctcgaaggct ttaatttgcg
gccggtaccc aattcgccct atagtgagtc gtattacgcg 4560cgctcactgg
ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt tacccaactt
4620aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga
ggcccgcacc 4680gatcgccctt cccaacagtt gcgcagcctg aatggcgaat
ggcgcgacgc gccctgtagc 4740ggcgcattaa gcgcggcggg tgtggtggtt
acgcgcagcg tgaccgctac acttgccagc 4800gccctagcgc ccgctccttt
cgctttcttc ccttcctttc tcgccacgtt cgccggc 485711934DNAArtificial
SequencePrimer 421 119gccaacggat cctcaagcat ctaaaacaca accg
3412036DNAArtificial SequencePrimer 551 120gctcatgtcg acatgaagaa
gctcaacaag tactcg 3612167DNAArtificial SequencePrimer 269
121ctagcatgta cccatacgat gttcctgact atgcgggtgt cgacgaattc
ccgggatccg 60cggccgc 6712267DNAArtificial SequencePrimer 270
122tcgagcggcc gcggatcccg ggaattcgtc gacacccgca tagtcaggaa
catcgtatgg 60gtacatg 6712335DNAArtificial SequencePrimer 1842
123ttttggatcc ctaccaatcc tggtggactt tatcg 3512437DNAArtificial
SequencePrimer 2163 124ttggtagtcg acatggttta cactccatcc aagggtc
3712532DNAArtificial SequencePrimer 2183 125acagtagtcg acatgacaga
gcagaaagcc ct 3212634DNAArtificial SequencePrimer 2184
126tacatcggat ccctacataa gaacaccttt ggtg 3412741DNAArtificial
SequencePrimer 2195 127ttgttcctcg agatggagga acaggagata ggcgttcctg
c 4112842DNAArtificial SequencePrimer 2196 128gttcttgcgg ccgcttattt
tggagattct atctggggtt gc 4212940DNAArtificial SequencePrimer 2197
129ttcttggtcg acatgagtgc tctactgtcc gagtctgacc 4013043DNAArtificial
SequencePrimer 2198 130ttgttcggat ccttaccagg tgctcccaac agagacgaga
tcc 4313142DNAArtificial SequencePrimer 2259 131tcagtaagat
ctatgactga gatactacca catgtaaacg ac 4213244DNAArtificial
SequencePrimer 2260 132catatcctcg aggtacccta tacatccccc acagcatctc
gcag 441337685DNAArtificial SequencepGV2074 133ttggatcata
ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca 60cgaggccctt
tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc
120tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa
gcccgtcagg 180gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa
ctatgcggca tcagagcaga 240ttgtactgag agtgcaccat accacagctt
ttcaattcaa ttcatcattt tttttttatt 300cttttttttg atttcggttt
ctttgaaatt tttttgattc ggtaatctcc gaacagaagg 360aagaacgaag
gaaggagcac agacttagat tggtatatat acgcatatgg caaattaaag
420ccttcgagcg tcccaaaacc ttctcaagca aggttttcag tataatgtta
catgcgtaca 480cgcgtctgta cagaaaaaaa agaaaaattt gaaatataaa
taacgttctt aatactaaca 540taactataaa aaaataaata gggacctaga
cttcaggttg tctaactcct tccttttcgg 600ttagagcgga tgtgggggga
gggcgtgaat gtaagcgtga cataagaatt cttattcctt 660tgccctcgga
cgagtgctgg ggcgtcggtt tccactatcg gcgagtactt ctacacagcc
720atcggtccag acggccgcgc ttctgcgggc gatttgtgta cgcccgacag
tcccggctcc 780ggatcggacg attgcgtcgc atcgaccctg cgcccaagct
gcatcatcga aattgccgtc 840aaccaagctc tgatagagtt ggtcaagacc
aatgcggagc atatacgccc ggaggcgcgg 900cgatcctgca agctccggat
gcctccgctc gaagtagcgc gtctgctgct ccatacaagc 960caaccacggc
ctccagaaga ggatgttggc gacctcgtat tgggaatccc cgaacatcgc
1020ctcgctccag tcaatgaccg ctgttatgcg gccattgtcc gtcaggacat
tgttggagcc 1080gaaatccgca tgcacgaggt gccggacttc ggggcagtcc
tcggcccaaa gcatcagctc 1140atcgagagcc tgcgcgacgg acgcactgac
ggtgtcgtcc atcacagttt gccagtgata 1200cacatgggga tcagcaatcg
cgcatatgaa atcacgccat gtagtgtatt gaccgattcc 1260ttgcggtccg
aatgggccga acccgctcgt ctggctaaga tcggccgcag cgatcgcatc
1320catggcctcc gcgaccggct ggagaacagc gggcagttcg gtttcaggca
ggtcttgcaa 1380cgtgacaccc tgtgcacggc gggagatgca ataggtcagg
ctctcgctga actccccaat 1440gtcaagcact tccggaatcg ggagcgcggc
cgatgcaaag tgccgataaa cataacgatc 1500tttgtagaaa ccatcggcgc
agctatttac ccgcaggaca tatccacgcc ctcctacatc 1560gaagctgaaa
gcacgagatt cttcgccctc cgagagctgc atcaggtcgg agacgctgtc
1620gaacttttcg atcagaaact tctcgacaga cgtcgcggtg agttcaggct
ttttacccat 1680actagttttt agtttatgta tgtgtttttt gtagttatag
atttaagcaa gaaaagaata 1740caaacaaaaa attgaaaaag attgatttag
aattaaaaag aaaaatattt acgtaagaag 1800ggaaaatagt aaatgttgca
agttcactaa actcctaaat tatgctgccc tttatattcc 1860ctgttacagc
agccgagcca aaggtatata ggctcctttg cattagcatg cgtaacaaac
1920cacctgtcag tttcaaccga ggtggtatcc gagagaattg tgtgattgct
ttaattaatt 1980tcggagaatc tcacatgcca ctgaagatta aaaactggat
gccagaaaag gggtgtccag 2040gtgtaacatc aatagaggaa gctgaaaagt
cttagaacgg gtaatcttcc accaacctga 2100tgggttccta gatataatct
cgaagggaat aagtagggtg ataccgcaga agtgtctgaa 2160tgtattaagg
tcctcacagt ttaaatcccg ctcacactaa cgtaggatta ttataactca
2220aaaaaatggc attattctaa gtaagttaaa tatccgtaat ctttaaacag
cggccgcaga 2280tctctcgagt cgaaactaag ttctggtgtt ttaaaactaa
aaaaaagact aactataaaa 2340gtagaattta agaagtttaa gaaatagatt
tacagaatta caatcaatac ctaccgtctt 2400tatatactta ttagtcaagt
aggggaataa tttcagggaa ctggtttcaa cctttttttt 2460cagctttttc
caaatcagag agagcagaag gtaatagaag gtgtaagaaa atgagataga
2520tacatgcgtg ggtcaattgc cttgtgtcat catttactcc aggcaggttg
catcactcca 2580ttgaggttgt gcccgttttt tgcctgtttg tgcccctgtt
ctctgtagtt gcgctaagag 2640aatggaccta tgaactgatg gttggtgaag
aaaacaatat tttggtgctg ggattctttt 2700tttttctgga tgccagctta
aaaagcgggc tccattatat ttagtggatg ccaggaataa 2760actgttcacc
cagacaccta cgatgttata tattctgtgt aacccgcccc ctattttggg
2820catgtacggg ttacagcaga attaaaaggc taattttttg actaaataaa
gttaggaaaa 2880tcactactat taattattta cgtattcttt gaaatggcga
gtattgataa tgataaactg 2940gatccgtcga caaacttaga ttagattgct
atgctttctt tctaatgagc aagaagtaaa 3000aaaagttgta atagaacaag
aaaaatgaaa ctgaaacttg agaaattgaa gaccgtttat 3060taacttaaat
atcaatggga ggtcatcgaa agagaaaaaa atcaaaaaaa aaattttcaa
3120gaaaaagaaa cgtgataaaa atttttattg cctttttcga cgaagaaaaa
gaaacgaggc 3180ggtctctttt ttcttttcca aacctttagt acgggtaatt
aacgacaccc tagaggaaga 3240aagaggggaa atttagtatg ctgtgcttgg
gtgttttgaa gtggtacggc gatgcgcgga 3300gtccgagaaa atctggaaga
gtaaaaaagg agtagaaaca ttttgaagct atgagctcag 3360atctgttaac
cttgttttat atttgttgta aaaagtagat aattacttcc ttgatgatct
3420gtaaaaaaga gaaaaagaaa gcatctaaga acttgaaaaa ctacgaatta
gaaaagacca 3480aatatgtatt tcttgcattg accaatttat gcaagtttat
atatatgtaa atgtaagttt 3540cacgaggttc tactaaacta aaccaccccc
ttggttagaa gaaaagagtg tgtgagaaca 3600ggctgttgtt gtcacacgat
tcggacaatt ctgtttgaaa gagagagagt aacagtacga 3660tcgaacgaac
tttgctctgg agatcacagt gggcatcata gcatgtggta ctaaaccctt
3720tcccgccatt ccagaacctt cgattgcttg ttacaaaacc tgtgagccgt
cgctaggacc 3780ttgttgtgtg acgaaattgg aagctgcaat caataggaag
acaggaagtc gagcgtgtct 3840gggttttttc agttttgttc tttttgcaaa
caaatcacga gcgacggtaa tttctttctc 3900gataagaggc cacgtgcttt
atgagggtaa catcaattca agaaggaggg aaacacttcc 3960tttttctggc
cctgataata gtatgagggt gaagccaaaa taaaggattc gcgcccaaat
4020cggcatcttt aaatgcaggt atgcgatagt tcctcactct ttccttactc
acgagtaatt 4080cttgcaaatg cctattatgc agatgttata atatctgtgc
gtcttgagtt gagcctaggg 4140agctccagct tttgttccct ttagtgaggg
ttaattgcgc gcttggcgta atcatggtca 4200tagctgtttc ctgtgtgaaa
ttgttatccg ctcacaattc cacacaacat aggagccgga 4260agcataaagt
gtaaagcctg gggtgcctaa tgagtgaggt aactcacatt aattgcgttg
4320cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta
atgaatcggc 4380caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt
ccgcttcctc gctcactgac 4440tcgctgcgct cggtcgttcg gctgcggcga
gcggtatcag ctcactcaaa ggcggtaata 4500cggttatcca cagaatcagg
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 4560aaggccagga
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct
4620gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac
aggactataa 4680agataccagg cgtttccccc tggaagctcc ctcgtgcgct
ctcctgttcc gaccctgccg 4740cttaccggat acctgtccgc ctttctccct
tcgggaagcg tggcgctttc tcatagctca 4800cgctgtaggt atctcagttc
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 4860ccccccgttc
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg
4920gtaagacacg acttatcgcc actggcagca gccactggta acaggattag
cagagcgagg 4980tatgtaggcg gtgctacaga gttcttgaag tggtggccta
actacggcta cactagaagg 5040acagtatttg gtatctgcgc tctgctgaag
ccagttacct tcggaaaaag agttggtagc 5100tcttgatccg gcaaacaaac
caccgctggt agcggtggtt tttttgtttg caagcagcag 5160attacgcgca
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac
5220gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc
aaaaaggatc 5280ttcacctaga tccttttaaa ttaaaaatga agttttaaat
caatctaaag tatatatgag 5340taaacttggt ctgacagtta ccaatgctta
atcagtgagg cacctatctc agcgatctgt 5400ctatttcgtt catccatagt
tgcctgactc cccgtcgtgt agataactac gatacgggag 5460ggcttaccat
ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca
5520gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg
tcctgcaact 5580ttatccgcct ccatccagtc tattaattgt tgccgggaag
ctagagtaag tagttcgcca 5640gttaatagtt tgcgcaacgt tgttgccatt
gctacaggca tcgtggtgtc acgctcgtcg 5700tttggtatgg cttcattcag
ctccggttcc caacgatcaa ggcgagttac atgatccccc 5760atgttgtgca
aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg
5820gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac
tgtcatgcca 5880tccgtaagat gcttttctgt gactggtgag tactcaacca
agtcattctg agaatagtgt 5940atgcggcgac cgagttgctc ttgcccggcg
tcaatacggg ataataccgc gccacatagc 6000agaactttaa aagtgctcat
cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 6060ttaccgctgt
tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca
6120tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa
tgccgcaaaa 6180aagggaataa gggcgacacg gaaatgttga atactcatac
tcttcctttt tcaatattat 6240tgaagcattt atcagggtta ttgtctcatg
agcggataca tatttgaatg tatttagaaa 6300aataaacaaa taggggttcc
gcgcacattt ccccgaaaag tgccacctga acgaagcatc 6360tgtgcttcat
tttgtagaac aaaaatgcaa cgcgagagcg ctaatttttc aaacaaagaa
6420tctgagctgc atttttacag aacagaaatg caacgcgaaa gcgctatttt
accaacgaag 6480aatctgtgct tcatttttgt aaaacaaaaa tgcaacgcga
gagcgctaat ttttcaaaca 6540aagaatctga gctgcatttt tacagaacag
aaatgcaacg cgagagcgct attttaccaa 6600caaagaatct atacttcttt
tttgttctac aaaaatgcat cccgagagcg ctatttttct 6660aacaaagcat
cttagattac tttttttctc ctttgtgcgc tctataatgc agtctcttga
6720taactttttg cactgtaggt ccgttaaggt tagaagaagg ctactttggt
gtctattttc 6780tcttccataa aaaaagcctg actccacttc ccgcgtttac
tgattactag cgaagctgcg 6840ggtgcatttt ttcaagataa aggcatcccc
gattatattc tataccgatg tggattgcgc 6900atactttgtg aacagaaagt
gatagcgttg atgattcttc attggtcaga aaattatgaa 6960cggtttcttc
tattttgtct ctatatacta cgtataggaa atgtttacat tttcgtattg
7020ttttcgattc actctatgaa tagttcttac tacaattttt ttgtctaaag
agtaatacta 7080gagataaaca taaaaaatgt agaggtcgag tttagatgca
agttcaagga gcgaaaggtg 7140gatgggtagg ttatataggg atatagcaca
gagatatata gcaaagagat acttttgagc 7200aatgtttgtg gaagcggtat
tcgcaatatt ttagtagctc gttacagtcc ggtgcgtttt 7260tggttttttg
aaagtgcgtc ttcagagcgc ttttggtttt caaaagcgct ctgaagttcc
7320tatactttct agagaatagg aacttcggaa taggaacttc aaagcgtttc
cgaaaacgag 7380cgcttccgaa aatgcaacgc gagctgcgca catacagctc
actgttcacg tcgcacctat 7440atctgcgtgt tgcctgtata tatatataca
tgagaagaac ggcatagtgc gtgtttatgc 7500ttaaatgcgt acttatatgc
gtctatttat gtaggatgaa aggtagtcta gtacctcctg 7560tgatattatc
ccattccatg cggggtatcg tatgcttcct tcagcactac cctttagctg
7620ttctatatgc tgccactcct caattggatt agtctcatcc ttcaatgcta
tcatttcctt 7680tgata 768513424DNAArtificial SequencePrimer 587
134ccaatgcaga ccgatcttct accc 2413522DNAArtificial SequencePrimer
588 135gatcacgtga tctgttgtat tg 2213620DNAArtificial SequencePrimer
2167 136tacatggggt acttctcctc 2013760DNAArtificial SequencePrimer
2170 137cagtcaacaa atataaagaa tattgaaatt gacagttttt gtcgctatcg
atttttatta 6013860DNAArtificial SequencePrimer 2171 138ttttgtcgct
atcgattttt attatttgct gttttaaatc attctggttc tatcgaggag
6013960DNAArtificial SequencePrimer 2172
139catgttattg acgccaggtt tggacgttgt ttttcactgt atccggatgt
gaagtcgttg 6014060DNAArtificial SequencePrimer 2173 140tggttttaga
aaaggatggt gtgcttgtcg ctgagacaca tgttattgac gccaggtttg
6014120DNAArtificial SequencePrimer 2175 141tctagttcag agcttggtgc
2014220DNAArtificial SequencePrimer 2226 142tgctccattt ggaagtctcg
2014320DNAArtificial SequencePrimer 2227 143tatctacgaa gtgacctgcg
201441716DNAArtificial SequenceB. subtilis alsS codon-optimized for
expression in S. cerevisiae 144atgttgacta aagctacaaa agagcagaaa
tcattggtga aaaatagggg tgcagaactt 60gttgtggact gtttggtaga acagggcgta
acacatgttt ttggtatccc aggtgcaaaa 120atcgacgccg tgtttgatgc
attacaagac aagggtccag aaattattgt tgctagacat 180gagcaaaatg
ccgcatttat ggcgcaagct gtaggtaggc ttacaggtaa acctggtgtt
240gtcctagtta cgtctggccc aggagcctcc aatttagcaa ctggtctatt
gacagctaat 300actgagggag atcctgtagt tgcgttagcc ggtaatgtaa
ttagagctga taggcttaag 360agaactcacc agtctctaga caacgctgct
ttattccaac cgatcaccaa gtactcagta 420gaggtacaag acgtaaagaa
tatacctgaa gctgtgacaa acgcatttcg tatagcttct 480gctggtcagg
ctggtgccgc gtttgtttct tttcctcaag acgttgtcaa tgaagtgacc
540aatactaaaa acgttagagc ggttgcagcc cctaaactag gtccagccgc
agacgacgca 600attagcgctg caattgctaa aattcagacg gcgaaactac
cagtagtcct tgtcggtatg 660aagggcggaa gaccagaagc aataaaagct
gttcgtaagt tattgaagaa agtccaatta 720cctttcgttg agacttacca
agcagcaggt actttatcta gagatttaga ggatcagtat 780tttggaagga
taggtctatt tagaaaccaa ccaggagatt tactattaga acaagctgat
840gttgtactta ctatcggtta tgatcctata gagtatgacc caaagttttg
gaacataaat 900ggggatagaa caattataca tctagacgag ataatcgccg
acatcgatca cgcttatcaa 960ccagatttag aactaatcgg agatatcccg
tcaacaatca atcatattga acatgatgct 1020gtaaaggttg agttcgctga
acgtgagcag aaaatcttat ctgatctaaa gcaatatatg 1080catgagggtg
aacaagttcc agcagactgg aaatctgacc gtgcacatcc tttggaaatc
1140gttaaggaac taagaaatgc ggtcgatgat catgtgactg ttacatgtga
tatcggttca 1200catgcaattt ggatgtcacg ttattttagg agctacgaac
cattaacttt aatgatatct 1260aacgggatgc aaactctggg ggttgcactt
ccttgggcta ttggcgctag tttagttaag 1320cccggtgaga aggtggtatc
ggtatcaggt gatggtggct ttctgttttc ggctatggaa 1380ttagaaactg
cagtccgttt aaaagctccc attgtgcata ttgtctggaa tgattctact
1440tacgacatgg ttgcttttca acagttgaag aaatacaata gaacttcggc
tgtagacttt 1500ggtaacatcg atattgtgaa atatgctgag tcttttggcg
caacaggcct gagggtggaa 1560agtccagatc agttagctga tgtgttgaga
caagggatga atgccgaggg accggtaatc 1620atagatgtgc cagttgacta
ctcagacaat attaatttgg cttctgataa acttcctaaa 1680gagtttggcg
agctaatgaa gaccaaagcc ttataa 1716

145 11 PRT Thermotoga petrophila
145 Pro Tyr His Lys Glu Gly Gly Leu Gly Ile Leu
1 5 10

146 11 PRT Victivallis vadensis
146 Pro Tyr Ser Glu Lys Gly Gly Leu Ala Ile Leu
1 5 10

147 11 PRT Unknown Termite Group 1 Bacterium Phylotype Rs-D17
147 Pro Tyr Lys Pro Glu Gly Gly Ile Ala Ile Leu
1 5 10

148 11 PRT Yarrowia lipolytica
148 Pro Leu Lys Pro Ser Gly His Leu Gln Ile Leu
1 5 10

149 11 PRT Francisella tularensis
149 Pro Ile Lys Lys Thr Gly His Leu Gln Ile Leu
1 5 10

150 11 PRT Arabidopsis thaliana
150 Pro Ile Lys Glu Thr Gly His Ile Gln Ile Leu
1 5 10

151 1647 DNA Artificial Sequence L. lactis kivD codon-optimized for expression in S. cerevisiae
151 atgtatacag tgggtgatta cttgctagac
cgtttacatg aattaggcat agaagagatt 60tttggagtac caggtgatta caatttgcaa
ttcttggatc agattatctc acataaggat 120atgaaatggg tcgggaatgc
gaacgagtta aacgcttcct atatggcaga tggttatgcg 180agaaccaaaa
aggccgctgc ttttcttaca actttcggtg taggtgaact ttcagccgtt
240aatggattag ccggatctta cgctgaaaac ttaccagtcg ttgaaattgt
tggttctcct 300acttctaagg tacaaaacga gggaaagttt gttcaccata
ctttagcgga tggtgatttc 360aaacatttca tgaagatgca tgaacctgtc
acagcagcga gaacactttt gaccgcagag 420aacgctactg ttgaaatcga
tagagtatta agtgctttgt taaaagagag aaagccagtg 480tatatcaact
tgcctgttga tgtcgctgcc gcaaaagcag aaaaaccatc tttaccattg
540aaaaaggaaa actcaaccag taacacatct gatcaagaga ttctaaacaa
aatccaagag 600tcattgaaaa acgccaaaaa gccaatcgtt ataacaggcc
atgaaatcat ctcctttggt 660ttagaaaaga ccgttacaca attcatctct
aagacaaaat tgccaatcac tactttaaac 720tttggcaaat cttctgtaga
tgaagcttta ccatcttttc taggtatcta caatggtact 780ctttctgaac
caaacctaaa agagtttgtc gaatccgctg atttcatact tatgttgggt
840gttaagctaa ctgatagttc aactggtgca ttcactcacc acttgaacga
aaacaagatg 900atatcattga atatcgatga gggcaagata ttcaacgaac
gtattcaaaa cttcgatttt 960gaatcactaa tttcttctct acttgatttg
tcagaaatag aatacaaagg aaaatacatc 1020gacaaaaagc aagaggattt
cgtgccatca aacgcattgt tgtcacaaga tagactttgg 1080caagctgtgg
aaaacctaac tcaatctaac gaaacaattg tggctgagca aggcacatca
1140ttcttcggcg catctagtat tttcttgaag agtaagtccc acttcatcgg
tcaacctctt 1200tggggatcta ttgggtacac cttccctgcc gcattgggat
cacagatcgc agacaaggag 1260tccagacatc tactattcat tggggacgga
tcattgcaac ttaccgttca ggaattaggg 1320ttggctataa gagaaaagat
caatcctatt tgttttatca tcaacaatga tggctataca 1380gtagaacgtg
agattcacgg tccaaatcaa agttacaatg acatcccaat gtggaattac
1440tctaagttgc ctgaatcttt cggtgcaaca gaagatagag ttgtttccaa
aatagtcaga 1500acagaaaatg agtttgtttc tgtcatgaag gaagctcaag
ccgatcctaa tagaatgtac 1560tggattgaac taatcttggc caaggaagga
gccccaaaag tattgaaaaa gatgggaaaa 1620cttttcgctg aacagaataa gtcctga
1647

152 1023 DNA Artificial Sequence L. lactis alcohol dehydrogenase (adhA) RE1
152 atgaaagcag cagtagtaag acacaatcca gatggttatg
cggaccttgt tgaaaaggaa 60cttcgagcaa tcaaacctaa tgaagctttg cttgacatgg
agtattgtgg agtctgtcat 120accgatttgc acgttgcagc aggtgatttt
ggcaacaaag cagggactgt tcttggtcat 180gaaggaattg gaattgtcaa
agaaattgga gctgatgtaa gctcgcttca agttggtgat 240cgggtttcag
tggcttggtt ctttgaagga tgtggtcact gtgaatactg tgtatctggt
300aatgaaactt tttgtcgaga agttaaaaat gcaggatatt cagttgatgg
cggaatggct 360gaagaagcaa ttgttgttgc cgattatgct gtcaaagttc
ctgacggact tgacccaatt 420gaagctagct caattacttg tgctggagta
acaacttaca aagcaatcaa agtatcagga 480gtaaaacctg gtgattggca
agtaattttt ggtgctggag gacttggaaa tttagcaatt 540caatatgcta
aaaatgtttt tggagcaaaa gtaattgctg ttgatattaa tcaagataaa
600ttaaatttag ctaaaaaaat tggagctgat gtgactatca attctggtga
tgtaaatcca 660gttgatgaaa ttaaaaaaat aactggcggc ttaggggtgc
aaagtgcaat agtttgtgct 720gttgcaagga ttgcttttga acaagcggtt
gcttctttga aacctatggg caaaatggtt 780gctgtggcag ttcccaatac
tgagatgact ttatcagttc caacagttgt ttttgacgga 840gtggaggttg
caggttcact tgtcggaaca agacttgact tggcagaagc ttttcaattt
900ggagcagaag gtaaggtaaa accaattgtt gcgacacgca aactggaaga
aatcaatgat 960attattgatg aaatgaaggc aggaaaaatt gaaggccgaa
tggtcattga ttttactaaa 1020taa 1023

153 2073 DNA Saccharomyces cerevisiae
153 atggaaggct tcaatccggc tgacatagaa catgcgtcac
cgattaattc atctgacagc 60cattcatcct cctttgtata tgctctaccc aaaagtgcta
gtgaatatgt agtcaaccat 120aatgagggtc gtgcaagtgc aagtggaaat
ccagccgcag tgccgtctcc cataatgaca 180ctgaatctca aaagcacaca
ttccctcaat attgatcagc atgttcatac ctcaacatcg 240ccgacggaaa
ctattgggca tattcatcat gtggaaaagc tgaatcaaaa caatttgatt
300catctggatc cagtacccaa ctttgaagat aagtccgata ttaagccttg
gttgcaaaag 360attttttatc ctcaaggaat agaacttgtg atagaaaggt
cggacgcatt taaagttgtc 420ttcaagtgta aagctgctaa aaggggaagg
aacgcgagaa ggaaaagaaa agataagccc 480aaaggacagg accacgaaga
cgagaaatcc aagatcaatg atgacgaatt agaatatgcg 540agtccttcta
atgccacagt aaccaatggg cctcaaacat cgcccgatca aacatcctcc
600ataaagccaa agaaaaaaag atgtgtatcg aggtttaata actgtccgtt
tagagtacga 660gctacttatt cgttaaagag gaaaagatgg agcattgttg
taatggacaa taaccattca 720catcagctaa agtttaaccc tgattccgaa
gagtacaaaa aattcaaaga aaaattaaga 780aaggataatg acgtagatgc
aatcaagaaa ttcgacgaat tggaatacag aactttggcc 840aatttgccca
ttccaacagc tacaatcccc tgtgattgtg gtttaacaaa tgaaatacaa
900agtttcaatg tcgtattgcc cactaacagt aatgttactt catcagcatc
ctcttcaact 960gtatcgtcca tatcccttga ttcatcgaat gcatctaaaa
ggccatgctt accctctgta 1020aataacaccg gtagtatcaa taccaataac
gtaaggaaac cgaaaagcca gtgtaagaat 1080aaagacacac tcttaaaaag
aaccaccatg cagaactttc tcacaactaa atcaaggctg 1140cgtaagaccg
gtacgccaac atcttcgcaa cactcatcta cagcattttc aggatatatt
1200gatgatcctt tcaatttgaa tgaaatcttg ccactgccgg catccgattt
caagctaaac 1260actgtaacaa atttgaacga aattgacttt acgaacattt
ttaccaaatc gccgcatcca 1320catagcgggt ctacccatcc aagacaagtc
ttcgaccaat tggacgattg ttcctctata 1380ctcttctctc cattaactac
aaacacgaat aatgaatttg aaggagagtc agatgatttt 1440gttcattctc
catatttgaa ctcagaggca gatttcagcc aaattcttag tagtgctccc
1500ccagtccatc atgacccaaa tgaaacacat caggaaaacc aggatattat
tgatagattt 1560gctaatagtt cccaagaaca taatgagtat attctacaat
atttgacgca ctccgatgct 1620gctaaccaca ataacatcgg cgttccaaac
aacaattcac attcgctaaa tactcagcat 1680aacgtttctg atctgggcaa
ctcactttta agacaagaag ctttagttgg cagctcttca 1740acaaaaatct
tcgacgaatt gaaatttgta caaaatggcc cacacggttc tcaacatcct
1800atagattttc aacatgttga ccatcgtcat ctcagctcta atgaacctca
agtacgatca 1860catcaatatg gtccgcaaca gcagccaccg cagcaattgc
aatatcacca aaatcagccc 1920cacgacggcc ataaccacga acagcaccaa
acagtacaaa aggatatgca aacgcatgaa 1980tcgctagaaa taatgggaaa
cacattattg gaagagttca aagacattaa aatggtgaac 2040ggcgagttga
agtatgtgaa gccagaagat tag 2073

154 1494 DNA Artificial Sequence E. coli ketolacid reductoisomerase P2D1-A1
154 atggccaact attttaacac
attaaatttg agacaacaat tggctcaact gggtaagtgc 60agatttatgg gaagggacga
gtttgctgat ggtgcttctt atctgcaagg aaagaaagta 120gtaattgttg
gctgcggtgc tcagggtcta aaccaaggtt taaacatgag agattcaggt
180ctggatattt cgtatgcatt gaggaaagag tctattgcag aaaaggatgc
cgattggcgt 240aaagcgacgg aaaatgggtt caaagttggt acttacgaag
aactgatccc tcaggcagat 300ttagtgatta acctaacacc agataaggtt
cactcagacg tagtaagaac agttcaaccg 360ctgatgaagg atggggcagc
tttaggttac tctcatggct ttaatatcgt tgaagtgggc 420gagcagatca
gaaaaggtat aacagtcgta atggttgcgc caaagtgccc aggtacggaa
480gtcagagagg agtacaagag gggttttggt gtacctacat tgatcgccgt
acatcctgaa 540aatgacccca aacgtgaagg tatggcaata gcgaaggcat
gggcagccgc aaccggaggt 600catagagcgg gtgtgttaga gagttctttc
gtagctgagg tcaagagtga cttaatgggt 660gaacaaacca ttctgtgcgg
aatgttgcag gcagggtctt tactatgctt tgataaattg 720gtcgaagagg
gtacagatcc tgcctatgct gaaaagttga tacaatttgg ttgggagaca
780atcaccgagg cacttaaaca aggtggcata acattgatga tggatagact
ttcaaatccg 840gccaagctaa gagcctacgc cttatctgag caactaaaag
agatcatggc accattattc 900caaaagcaca tggacgatat tatctccggt
gagttttcct caggaatgat ggcagattgg 960gcaaacgatg ataaaaagtt
attgacgtgg agagaagaaa ccggcaagac ggcattcgag 1020acagccccac
aatacgaagg taaaattggt gaacaagaat actttgataa gggagtattg
1080atgatagcta tggtgaaggc aggggtagaa cttgcattcg aaactatggt
tgactccggt 1140atcattgaag aatctgcata ctatgagtct ttgcatgaat
tgcctttgat agcaaatact 1200attgcaagaa aaagacttta cgagatgaat
gttgtcatat cagacactgc agaatatggt 1260aattacttat ttagctacgc
gtgtgtcccg ttgttagagc ccttcatggc cgagttacaa 1320cctggtgatt
tggggaaggc tattccggaa ggagcggttg acaatggcca actgagagac
1380gtaaatgaag ctattcgttc gcatgctata gaacaggtgg gtaaaaagct
gagaggatat 1440atgaccgata tgaaaagaat tgcagtggca ggacaccacc
accaccacca ctaa 1494

155 1713 DNA Artificial Sequence L. lactis ilvD codon-optimized for expression in S. cerevisiae
155 atggagttta
agtataacgg caaagttgaa tctgttgaac tgaataagta cagcaaaacg 60ttgacacaag
atcccacaca acccgccaca caggcaatgt attacggcat cgggtttaaa
120gacgaagatt tcaagaaagc tcaagtgggt atagtgtcga tggactggga
tggaaatcca 180tgcaacatgc atttaggaac ccttggatca aagattaaaa
gctcagtaaa tcagacagat 240ggtctgatcg gcttacaatt tcatacgata
ggagtttctg atgggatagc aaatggaaag 300ttgggaatga gatactccct
tgtttccaga gaagttatag ctgactctat tgaaaccaac 360gctggcgctg
aatactatga tgcaattgta gccatcccag gttgtgacaa aaatatgcca
420ggttctatta ttggtatggc aagacttaat aggccaagca ttatggtgta
tggaggaaca 480atagaacacg gtgaatataa aggtgagaaa ttgaacatcg
tatcggcttt tgaatctcta 540ggccagaaaa ttaccggcaa tatctctgat
gaagattatc acggtgttat ttgtaatgct 600attcctggtc aaggggcatg
tggggggatg tacacagcta ataccttagc tgccgctatc 660gaaacactag
gtatgtcatt gccgtattct tcttcgaacc ctgcagtatc tcaagaaaaa
720caagaagaat gtgatgagat tggattagcc attaagaatc ttttggaaaa
agacatcaag 780cctagtgata taatgactaa ggaggcgttc gagaacgcta
ttaccattgt gatggtcttg 840gggggtagta ctaatgctgt cttgcatatt
attgcaatgg ctaacgcgat aggtgtcgaa 900ataactcagg atgacttcca
aagaattagt gacattactc cagtactagg tgattttaaa 960ccttcaggta
aatatatgat ggaagatttg cataaaattg gaggcttgcc agcagtgctt
1020aagtaccttc taaaggaagg aaaattgcat ggtgactgcc ttactgtgac
gggtaaaaca 1080ttagccgaga atgtcgagac tgccctagac ttggatttcg
actcacaaga tatcatgagg 1140ccactaaaga atcctatcaa ggccaccggc
cacttgcaga ttctgtacgg taatttagct 1200caagggggtt ccgtagcaaa
aattagcggt aaagaaggag agttcttcaa aggcactgcc 1260agagtctttg
atggtgaaca acattttatc gacggcatag aatctggtcg tttgcatgct
1320ggagatgtag cggtaattag gaatataggt cccgtcggcg gacctggtat
gcccgaaatg 1380ctgaagccta catcagcatt aattggtgcg ggtttaggga
aaagttgcgc gttaattacg 1440gatggtagat tctccggtgg cactcacggt
tttgttgtcg gccatattgt gcctgaagcc 1500gttgagggtg gactaatcgg
cttagttgaa gatgacgata taatagagat agatgcagtc 1560aacaactcta
tatccctgaa agtttccgat gaagaaatcg caaagagaag agctaattat
1620cagaagccaa ctccgaaagc caccagggga gttttggcaa aattcgctaa
attaacccgt 1680cctgcatcgg aagggtgtgt tactgatctg taa
1713

156 1683 DNA Artificial Sequence F. tularensis ilvD codon-optimized for expression in S. cerevisiae
156 atgaaaaagg tgctgaataa gtactcaaga
cgtcttaccg aagataagtc tcaaggtgct 60tctcaggcta tgctatacgg aacagagatg
aatgatgcag atatgcacaa gcctcaaatc 120ggtatcggtt ccgtttggta
tgaaggaaat acttgtaata tgcatttgaa tcaattagca 180caatttgtca
aggattctgt tgaaaaggaa aacttgaaag gcatgagatt caacacaatt
240ggagtttctg atggtatctc catgggtact gatggcatgt cctactctct
acaatcacgt 300gatctaatcg ctgattcaat cgaaacagtt atgagtgcac
actggtatga tggcctagtt 360tcaatcccag gttgtgacaa aaacatgcca
ggttgcatga tggcccttgg tagattaaac 420agaccaggtt tcgtgatcta
cggtggaacc atacaagctg gcgttatgag aggcaaacct 480attgatattg
tcacagcttt ccaatcatat ggagcatgct tatctgggca aataactgaa
540caggaaagac aagagactat caaaaaggct tgtccaggtg caggagcctg
tggcggcatg 600tacacagcta acacaatggc ctgtgccatt gaggcccttg
gaatgagttt gcctttttcc 660tcttctactt ctgcaacttc agttgaaaag
gtacaagagt gtgataaggc aggcgaaaca 720atcaaaaact tgttagaatt
ggacattaaa ccaagagaca tcatgactag aaaagctttc 780gaaaacgcta
tggtactaat tacagtaatg ggaggttcaa caaatgccgt gttacatctg
840ttagcaatgg cttcatccgt cgatgtagat ttgagtatcg atgactttca
ggaaatagct 900aacaaaactc cagtgctggc tgatttcaag ccatccggga
aatatgtcat ggcaaacttg 960catgcaattg gcgggactcc tgcagttatg
aaaatgttgc tgaaggccgg aatgcttcat 1020ggcgattgtt tgactgtaac
tgggaaaacc ttagccgaaa acttggaaaa tgtggccgac 1080ctgccagaag
ataacacaat catacacaaa ctagataacc caatcaaaaa gactggtcat
1140ttgcaaatct tgaaggggaa tgttgcccca gaaggttctg ttgctaagat
aacagggaag 1200gaaggtgaga tattcgaggg cgtagccaat gtctttgatt
cagaggaaga gatggttgcc 1260gcagtcgaaa ctggaaaagt caaaaagggc
gatgttattg ttattagata cgaaggtcct 1320aaaggtggcc ctggcatgcc
tgaaatgctt aagccaacct ctttgataat gggtgctgga 1380ctaggccagg
atgttgcatt aatcacagat ggcagatttt caggtggtag tcatggtttc
1440attgtaggtc acattacacc agaagcatac gaaggcggta tgatcgcctt
attagaaaac 1500ggtgataaga taacaatcga tgctatcaac aatgtgataa
atgtagactt aagtgatcaa 1560gagattgctc aacgtaaatc taagtggaga
gcatcaaagc aaaaagcttc cagaggtaca 1620ctgaaaaagt acattaagac
cgtctcttct gcttctaccg ggtgcgtgac tgatttggat 1680tga
1683

157 1704 DNA Saccharomyces cerevisiae
157 atggcaaaga agctcaacaa
gtactcgtat atcatcactg aacctaaggg ccaaggtgcg 60tcccaggcca tgctttatgc
caccggtttc aagaaggaag atttcaagaa gcctcaagtc 120ggggttggtt
cctgttggtg gtccggtaac ccatgtaaca tgcatctatt ggacttgaat
180aacagatgtt ctcaatccat tgaaaaagcg ggtttgaaag ctatgcagtt
caacaccatc 240ggtgtttcag acggtatctc tatgggtact aaaggtatga
gatactcgtt acaaagtaga 300gaaatcattg cagactcctt tgaaaccatc
atgatggcac aacactacga tgctaacatc 360gccatcccat catgtgacaa
aaacatgccc ggtgtcatga tggccatggg tagacataac 420agaccttcca
tcatggtata tggtggtact atcttgcccg gtcatccaac atgtggttct
480tcgaagatct ctaaaaacat cgatatcgtc tctgcgttcc aatcctacgg
tgaatatatt 540tccaagcaat tcactgaaga agaaagagaa gatgttgtgg
aacatgcatg cccaggtcct 600ggttcttgtg gtggtatgta tactgccaac
acaatggctt ctgccgctga agtgctaggt 660ttgaccattc caaactcctc
ttccttccca gccgtttcca aggagaagtt agctgagtgt 720gacaacattg
gtgaatacat caagaagaca atggaattgg gtattttacc tcgtgatatc
780ctcacaaaag aggcttttga aaacgccatt acttatgtcg ttgcaaccgg
tgggtccact 840aatgctgttt tgcatttggt ggctgttgct cactctgcgg
gtgtcaagtt gtcaccagat 900gatttccaaa gaatcagtga tactacacca
ttgatcggtg acttcaaacc ttctggtaaa 960tacgtcatgg ccgatttgat
taacgttggt ggtacccaat ctgtgattaa gtatctatat 1020gaaaacaaca
tgttgcacgg taacacaatg actgttaccg gtgacacttt ggcagaacgt
1080gcaaagaaag caccaagcct acctgaagga caagagatta ttaagccact
ctcccaccca 1140atcaaggcca acggtcactt gcaaattctg tacggttcat
tggcaccagg tggagctgtg 1200ggtaaaatta ccggtaagga aggtacttac
ttcaagggta gagcacgtgt gttcgaagag 1260gaaggtgcct ttattgaagc
cttggaaaga ggtgaaatca agaagggtga aaaaaccgtt 1320gttgttatca
gatatgaagg tccaagaggt gcaccaggta tgcctgaaat gctaaagcct
1380tcctctgctc tgatgggtta cggtttgggt aaagatgttg cattgttgac
tgatggtaga 1440ttctctggtg gttctcacgg gttcttaatc ggccacattg
ttcccgaagc cgctgaaggt 1500ggtcctatcg ggttggtcag agacggcgat
gagattatca ttgatgctga taataacaag 1560attgacctat tagtctctga
taaggaaatg gctcaacgta aacaaagttg ggttgcacct 1620ccacctcgtt
acacaagagg tactctatcc aagtatgcta agttggtttc caacgcttcc
1680aacggttgtg ttttagatgc ttga 1704

158 1692 DNA Saccharomyces cerevisiae
158 atgaacaagt actcgtatat catcactgaa cctaagggcc
aaggtgcgtc ccaggccatg 60ctttatgcca ccggtttcaa
gaaggaagat ttcaagaagc ctcaagtcgg ggttggttcc 120tgttggtggt
ccggtaaccc atgtaacatg catctattgg acttgaataa cagatgttct
180caatccattg aaaaagcggg tttgaaagct atgcagttca acaccatcgg
tgtttcagac 240ggtatctcta tgggtactaa aggtatgaga tactcgttac
aaagtagaga aatcattgca 300gactcctttg aaaccatcat gatggcacaa
cactacgatg ctaacatcgc catcccatca 360tgtgacaaaa acatgcccgg
tgtcatgatg gccatgggta gacataacag accttccatc 420atggtatatg
gtggtactat cttgcccggt catccaacat gtggttcttc gaagatctct
480aaaaacatcg atatcgtctc tgcgttccaa tcctacggtg aatatatttc
caagcaattc 540actgaagaag aaagagaaga tgttgtggaa catgcatgcc
caggtcctgg ttcttgtggt 600ggtatgtata ctgccaacac aatggcttct
gccgctgaag tgctaggttt gaccattcca 660aactcctctt ccttcccagc
cgtttccaag gagaagttag ctgagtgtga caacattggt 720gaatacatca
agaagacaat ggaattgggt attttacctc gtgatatcct cacaaaagag
780gcttttgaaa acgccattac ttatgtcgtt gcaaccggtg ggtccactaa
tgctgttttg 840catttggtgg ctgttgctca ctctgcgggt gtcaagttgt
caccagatga tttccaaaga 900atcagtgata ctacaccatt gatcggtgac
ttcaaacctt ctggtaaata cgtcatggcc 960gatttgatta acgttggtgg
tacccaatct gtgattaagt atctatatga aaacaacatg 1020ttgcacggta
acacaatgac tgttaccggt gacactttgg cagaacgtgc aaagaaagca
1080ccaagcctac ctgaaggaca agagattatt aagccactct cccacccaat
caaggccaac 1140ggtcacttgc aaattctgta cggttcattg gcaccaggtg
gagctgtggg taaaattacc 1200ggtaaggaag gtacttactt caagggtaga
gcacgtgtgt tcgaagagga aggtgccttt 1260attgaagcct tggaaagagg
tgaaatcaag aagggtgaaa aaaccgttgt tgttatcaga 1320tatgaaggtc
caagaggtgc accaggtatg cctgaaatgc taaagccttc ctctgctctg
1380atgggttacg gtttgggtaa agatgttgca ttgttgactg atggtagatt
ctctggtggt 1440tctcacgggt tcttaatcgg ccacattgtt cccgaagccg
ctgaaggtgg tcctatcggg 1500ttggtcagag acggcgatga gattatcatt
gatgctgata ataacaagat tgacctatta 1560gtctctgata aggaaatggc
tcaacgtaaa caaagttggg ttgcacctcc acctcgttac 1620acaagaggta
ctctatccaa gtatgctaag ttggtttcca acgcttccaa cggttgtgtt
1680ttagatgctt ga 1692

159 1923 DNA Neurospora crassa
159 atggcttcta
atcaagataa caaggcagtt gctccagacg ctgctgcacc agcgggtcag 60tcaacaacca
ccacaactac aaatgataac agtgaaagga atctaccaaa ggaaggcgaa
120tacattcaat ggaggacact tccagcgggc aatccagatc agttgaacag
atggagtcat 180ttcctgactc gtgagcatga gtttccaggc gctcaggcaa
tgttgtacgg tgcgggtgta 240cctaacaaag atatgatgaa aaaggctcct
catgttggga tcgctactgt ttggtgggaa 300ggtaacccat gtaatactca
tctgcttgat ctaggtcaaa aagtcaaaaa ggctgttgaa 360agagagaaga
tgttagcttg gcaattcaac acaattggcg ttagtgacgg aataacaatg
420ggtggtgaag gcatgaggta ctctttgcag agcagagaga tcatagcaga
ttctatagag 480actgtgacat gtgcacaaca ccatgatgcc aatatctcaa
ttccagggtg cgacaaaaac 540atgccaggcg tcatcatggc agctgcaaga
cacaacagac cattcgttat gatctacgga 600ggtacaatga gaggcggtca
ttccgaatta cttgatagac ctatcaatat cgtaacttgt 660tacgaggcct
caggggccta tacttatggt agacttaagc cagcctgtcc aaactccact
720gctaccccat ctgacgtgat ggacgatata gaacaacacg cctgtccagg
ggctggagct 780tgtggaggga tgtacaccgc gaatactatg gcaaccgcca
tagaagctat gggtctgaca 840gcaccagggt catcctcctt tccagccagc
tcaccagaaa agttcagaga gtgcgaaaaa 900gccgcggaat acattaagat
atgcatggaa aaagatattc gtccaagaga cttactaaca 960aaggcttcct
tcgagaatgc tctcgtcttg acaatgattc taggtggttc aaccaacggt
1020gttttacatt acttagccat ggccaactcc gccgatgtcg atctaactct
tgatgatatc 1080aatagagtca gtgctaagac tcctttcctc gctgatatgg
ctccatctgg tagatactat 1140atggaggatt tgtacaaggt aggtggtact
ccagccgtac tcaagatgtt gatagctgcc 1200ggctatatcg atggaacaat
tccaacaata acaggaaaat ctttggctga aaacgtgtca 1260gattggccat
ctttagaccc tgatcaaaag attatccgtc ctttggataa tcctatcaaa
1320tcacaaggtc acattagagt gctgtatggt aacttctctc ctggtggggc
tgttgccaag 1380atcacgggta aggaaggtct tagttttact ggtaaggcaa
gatgctttaa caaagagttt 1440gaattggatg ctgcgctgaa aaactctgaa
atcacgctcg aacaaggaaa tcaagttcta 1500attgtaaggt atgaaggccc
taagggcgga ccaggcatgc cagaacaatt gaaagcatct 1560gccgctatca
tgggcgctgg tttgacgaac gtagctttag tcacggatgg gcgatactct
1620ggcgcttctc acggtttcat cgtcggtcac gtcgtgcctg aggcggcaac
tggcggacct 1680attgctttag taaaggatgg agatttgatc acaattgatg
cagtcagaaa tagaattgat 1740gttgtcaaaa ccgtagaagg agtggagggc
gaggaggaaa ttgcaaaggt tttagaagag 1800aggaaaaagg gatggaaagc
acctaagatg aagccaacaa gaggagccct ggccaaatac 1860gcaagacttg
ttggtgacgc atcacatgga gcagttacag acttaggagg agatgcttac 1920taa
1923

160 1686 DNA Acaryochloris marina
160 atgtcagata atcgtaattc
tcaagtagtc acacaaggtg ttcaaagagc acctaataga 60gctatgttaa gagctgtagg
attcggagat gatgatttca cgaaaccaat agttggattg 120gctaatggtt
tctctactat tactccttgt aacatgggaa ttgatagttt ggccacaaga
180gctgaagcat ctattaggac ggctggtgca atgccacaaa agtttggaac
cattacaata 240agcgatggga tatcaatggg tacagaaggt atgaagtatt
ctctcgtttc aagagaagtg 300attgccgatt ccattgaaac agcttgcatg
ggccagagta tggatggcgt attagcaatt 360ggtggctgcg acaaaaacat
gcctggcgcg atgttagcaa tggctcgtat gaacatacca 420gccatcttcg
tatatggtgg cactatcaag ccaggccacc tcaatggtga agatttgact
480gtcgtatcag ctttcgaagc tgtggggcaa cattccgccg gtagaatatc
cgaagccgaa 540cttacagcag tcgaaaagca tgcatgtcca ggcgctggat
catgtggtgg catgtacacg 600gccaacacaa tgtctagtgc ttttgaggct
atgggcatgt ccttgatgta ctcatccact 660atggctgcag aagatgagga
gaaggctgtt tctgccgaac aatctgcggc tgtgctagtt 720gaggcaatcc
acaaacagat tctaccaaga gatattctaa ccagaaaggc gtttgagaac
780gcaatagcag tcataatggc tgttgggggt tccacaaatg cagttctcca
cttgttagcg 840atttcaagag cagcaggaga ctctttaact ttagatgatt
tcgaaactat cagggctcaa 900gttccagtga tttgtgattt gaagccttct
ggtcgatatg tcgctacaga ccttcataaa 960gctggcggaa tcccattagt
tatgaaaatg ctattagagc atgggctatt acatggggat 1020gcattgacta
ttaccggcaa gacaattgca gagcaattgg ccgatgtgcc atctgaacct
1080tctcctgatc aagacgtaat ccgtccttgg gataatccaa tgtacaagca
aggtcacctt 1140gccatcttga gaggtaactt ggccacagaa ggtgcagtag
ccaagatcac agggatcaaa 1200aaccctcaaa tcactggacc agctagagtt
ttcgagtcag aggaagcctg tttagaagcg 1260atcctggccg gaaagatcca
accaaatgac gtgatagtcg ttcgatacga aggtccaaaa 1320ggaggaccag
gtatgaggga aatgctggct cctacttccg caatcatagg tgcgggtcta
1380ggagactcag ttggccttat cactgatggg agattttccg gtgggacata
cggtatggtt 1440gttggacatg tagcaccaga agcagctgtt ggtgggacca
ttgctctggt tcaagagggt 1500gaccaaatca ctatcgatgc tcacgctaga
aagttggagc tgcatgtctc tgaccaagag 1560cttaaagagc gtaaggaaaa
gtgggagcag ccaaaaccac tgtacaataa gggtgtgctt 1620gcgaagtacg
ccaaactcgt aagctctagt tcagtaggag cggttacaga tttggatttg 1680ttctaa
1686

161 1683 DNA Lyngbya spp.
161 atgtccgata acttccgttc tcaagccatt
acacagggca aaaagagaac tcctaataga 60gctatgctga gagcagttgg atttggagat
gaagatttca acaaaccaat tgttggtatt 120gccaatggct actccaccat
aactccttgc aacatcggtc ttaacgatct tgcacatagg 180gccgaaacag
ctctaaagca agcagacgcc atgccacaaa tgttcggaac tattactgta
240agtgatggaa ttgcaatggg aaccgaaggt atgaagtact ctcttgttag
cagagaagtt 300atagccgatg ctattgaaac tgcttgtaac ggacagtcta
tggatggggt cttagcaata 360ggaggttgtg acaaaaacat gcctggtgct
atgatcgcca tagcgcgtat gaatatccct 420gctatctttg tatacggcgg
tacaatcaag ccaggtaatc taaacggttg tgatctaaca 480gttgtctccg
cattcgaagc cgttggagag tattctgctg gcaaactaga tgacgataga
540ttactggaca tcgagagatt agcatgccct ggttctggct catgtggggg
aatgttcact 600gctaatacaa tgtctagtgc atttgaagca atgggtatga
gtctgatgta cagcagtaca 660atggcatccg aagatgctga aaaggctgat
tccaccgaaa agtccgcttt tgttttgaga 720gaggcaattt ctcagagaat
cctacctaag caaatcctga cgaggaaagc cttcgaaaac 780gcaattgcag
tcatcatggc ggtaggcggc tccacaaact ctgtattgca tctattggct
840attgcctatg ctgccgatgt agaattgacc atagatgatt tcgaaacaat
tcgtgggaga 900gtaccagttt tgtgtgatct taagccatca ggacgatttg
tcactaccga tttccataag 960gctggtggag tcccattgat catgaagatg
ttactcgaac aaggtttgat ccatggggat 1020gcccttacta taacgggtaa
aacagtcgca gagcaattag ctgatatccc atctcaacca 1080tctgccgacc
aagaggtgat aagaccatgg aataacccaa tgtacaagca aggtcacttg
1140gcgatcctta aggggaatct tgcaacagaa ggttcagtcg ccaagataac
aggtgtgaaa 1200aagcctcaga tgacaggtcc agcgcgagtt tttgaatcag
aagagcaatg cttagaagct 1260atactagccg gcaaaatcca agctggggac
gttttagtgg ttagatacga aggtccaaaa 1320gggggaccag gtatgagaga
aatgctggct ccaacatctg caatcattgg tgccggcttg 1380ggtgattctg
ttggactcat tacggatggc agattctctg gcggaacata tggtttggta
1440gtcggacacg ttgctccaga ggctgcagtg ggtggtaaca tcgctttagt
gcaagagggc 1500gattcaatta ctattgatgc ttcacagcgt ttgttacaag
taaacatctc tgaccaggtg 1560ttggagcaaa gacgacaaaa ctggcaacca
ccacaaccta gatacactaa aggcgtatta 1620gcgaagtacg caaagttggt
ttcaagtagt tcagttggcg cagttactga tctcgattgt 1680taa
1683

162 1851 DNA Artificial Sequence E. coli ilvD codon-optimized for expression in K. lactis
162 atgccgaaat acagatcagc aacaacaacc
catggtagaa atatggctgg tgcaagggct 60ctatggagag ctactggcat gactgatgca
gatttcggaa agccaatcat tgccgtcgtc 120aactctttta cacaattcgt
tccgggtcat gtccatttgc gtgatctagg taagcttgtt 180gccgaacaaa
ttgaagctgc aggtggtgtc gcaaaagagt ttaatactat tgctgtggac
240gacggtatag ctatggggca tggcggtatg ttatactctt taccatcgag
agaattaatt 300gcagactcag tcgaatatat ggttaatgct cattgtgccg
atgcaatggt ttgtatctct 360aattgtgata agataacgcc tggtatgttg
atggcgtcct tgagattgaa catcccagta 420atcttcgtat ctggcggccc
aatggaggct ggtaaaacta agttaagtga tcagatcatc 480aaacttgatc
ttgtggatgc aatgattcaa ggtgcagatc caaaagtttc agactcgcag
540tcagaccaag ttgaaagaag tgcatgtcca acttgtggtt cttgcagtgg
aatgttcacg 600gctaactcta tgaattgctt gactgaagct ctaggtttat
ctcaaccagg aaatggttca 660ttattagcga cccatgcaga cagaaagcaa
ttgttcttaa atgccggaaa aagaattgtg 720gaactaacga aaaggtatta
cgaacaaaat gatgaatcag cattaccgag gaatatagct 780tcaaaggctg
cattcgaaaa tgccatgaca ttggatattg caatgggtgg tagtacaaac
840acggtcttac atcttctagc tgcagcccaa gaagctgaga tagatttcac
catgtctgat 900atcgacaagc tttcacgtaa ggttccacag ttatgtaagg
ttgcaccatc aactcaaaag 960tatcacatgg aagacgttca tcgtgcagga
ggggttattg gtattttagg ggagttggac 1020agagccggtc ttttaaacag
ggatgtgaag aatgtattgg gtttaacact tccacagaca 1080ttagagcaat
acgatgtcat gttaactcaa gatgatgccg tgaaaaacat gttcagggca
1140ggtccagcag ggatcagaac cacccaagca ttctcgcaag actgtaggtg
ggacactttg 1200gacgatgata gagcaaatgg atgtataaga tcgcttgagc
atgcttatag taaggatggt 1260ggtttagcag tattatatgg aaacttcgct
gaaaatggtt gcattgtgaa aactgctggt 1320gtagatgata gtattttgaa
atttactgga cccgctaaag tttacgaaag tcaagacgat 1380gctgttgagg
ctatacttgg cggaaaggtg gtagcaggag acgtggtagt gataagatat
1440gagggaccaa agggaggacc aggtatgcag gaaatgcttt acccaacttc
atttttgaag 1500tccatgggac taggaaaagc ttgtgccctt atcactgacg
gtagattctc tggtggcact 1560tcgggtttaa gtatcggtca cgtatcacca
gaggcagctt ctggtggttc gattggattg 1620attgaagatg gagatttgat
cgccatagat atcccaaata gaggtatcca attacaagtc 1680tcagacgctg
aattggctgc aagaagagaa gcacaagatg ccagaggaga taaggcttgg
1740actcctaaaa atagagaacg tcaagtaagt ttcgccctta gggcttatgc
ttcattggct 1800acttcagccg ataagggggc agtaagagac aaatcgaagt
tgggtggatg a 1851
163 567 PRT Saccharomyces cerevisiae
163
Met Ala Lys
Lys Leu Asn Lys Tyr Ser Tyr Ile Ile Thr Glu Pro Lys1 5 10 15Gly Gln
Gly Ala Ser Gln Ala Met Leu Tyr Ala Thr Gly Phe Lys Lys 20 25 30Glu
Asp Phe Lys Lys Pro Gln Val Gly Val Gly Ser Cys Trp Trp Ser 35 40
45Gly Asn Pro Cys Asn Met His Leu Leu Asp Leu Asn Asn Arg Cys Ser
50 55 60Gln Ser Ile Glu Lys Ala Gly Leu Lys Ala Met Gln Phe Asn Thr
Ile65 70 75 80Gly Val Ser Asp Gly Ile Ser Met Gly Thr Lys Gly Met
Arg Tyr Ser 85 90 95Leu Gln Ser Arg Glu Ile Ile Ala Asp Ser Phe Glu
Thr Ile Met Met 100 105 110Ala Gln His Tyr Asp Ala Asn Ile Ala Ile
Pro Ser Cys Asp Lys Asn 115 120 125Met Pro Gly Val Met Met Ala Met
Gly Arg His Asn Arg Pro Ser Ile 130 135 140Met Val Tyr Gly Gly Thr
Ile Leu Pro Gly His Pro Thr Cys Gly Ser145 150 155 160Ser Lys Ile
Ser Lys Asn Ile Asp Ile Val Ser Ala Phe Gln Ser Tyr 165 170 175Gly
Glu Tyr Ile Ser Lys Gln Phe Thr Glu Glu Glu Arg Glu Asp Val 180 185
190Val Glu His Ala Cys Pro Gly Pro Gly Ser Cys Gly Gly Met Tyr Thr
195 200 205Ala Asn Thr Met Ala Ser Ala Ala Glu Val Leu Gly Leu Thr
Ile Pro 210 215 220Asn Ser Ser Ser Phe Pro Ala Val Ser Lys Glu Lys
Leu Ala Glu Cys225 230 235 240Asp Asn Ile Gly Glu Tyr Ile Lys Lys
Thr Met Glu Leu Gly Ile Leu 245 250 255Pro Arg Asp Ile Leu Thr Lys
Glu Ala Phe Glu Asn Ala Ile Thr Tyr 260 265 270Val Val Ala Thr Gly
Gly Ser Thr Asn Ala Val Leu His Leu Val Ala 275 280 285Val Ala His
Ser Ala Gly Val Lys Leu Ser Pro Asp Asp Phe Gln Arg 290 295 300Ile
Ser Asp Thr Thr Pro Leu Ile Gly Asp Phe Lys Pro Ser Gly Lys305 310
315 320Tyr Val Met Ala Asp Leu Ile Asn Val Gly Gly Thr Gln Ser Val
Ile 325 330 335Lys Tyr Leu Tyr Glu Asn Asn Met Leu His Gly Asn Thr
Met Thr Val 340 345 350Thr Gly Asp Thr Leu Ala Glu Arg Ala Lys Lys
Ala Pro Ser Leu Pro 355 360 365Glu Gly Gln Glu Ile Ile Lys Pro Leu
Ser His Pro Ile Lys Ala Asn 370 375 380Gly His Leu Gln Ile Leu Tyr
Gly Ser Leu Ala Pro Gly Gly Ala Val385 390 395 400Gly Lys Ile Thr
Gly Lys Glu Gly Thr Tyr Phe Lys Gly Arg Ala Arg 405 410 415Val Phe
Glu Glu Glu Gly Ala Phe Ile Glu Ala Leu Glu Arg Gly Glu 420 425
430Ile Lys Lys Gly Glu Lys Thr Val Val Val Ile Arg Tyr Glu Gly Pro
435 440 445Arg Gly Ala Pro Gly Met Pro Glu Met Leu Lys Pro Ser Ser
Ala Leu 450 455 460Met Gly Tyr Gly Leu Gly Lys Asp Val Ala Leu Leu
Thr Asp Gly Arg465 470 475 480Phe Ser Gly Gly Ser His Gly Phe Leu
Ile Gly His Ile Val Pro Glu 485 490 495Ala Ala Glu Gly Gly Pro Ile
Gly Leu Val Arg Asp Gly Asp Glu Ile 500 505 510Ile Ile Asp Ala Asp
Asn Asn Lys Ile Asp Leu Leu Val Ser Asp Lys 515 520 525Glu Met Ala
Gln Arg Lys Gln Ser Trp Val Ala Pro Pro Pro Arg Tyr 530 535 540Thr
Arg Gly Thr Leu Ser Lys Tyr Ala Lys Leu Val Ser Asn Ala Ser545 550
555 560Asn Gly Cys Val Leu Asp Ala 565
164 563 PRT Saccharomyces cerevisiae
164
Met Asn Lys Tyr Ser Tyr Ile Ile Thr Glu Pro Lys Gly
Gln Gly Ala1 5 10 15Ser Gln Ala Met Leu Tyr Ala Thr Gly Phe Lys Lys
Glu Asp Phe Lys 20 25 30Lys Pro Gln Val Gly Val Gly Ser Cys Trp Trp
Ser Gly Asn Pro Cys 35 40 45Asn Met His Leu Leu Asp Leu Asn Asn Arg
Cys Ser Gln Ser Ile Glu 50 55 60Lys Ala Gly Leu Lys Ala Met Gln Phe
Asn Thr Ile Gly Val Ser Asp65 70 75 80Gly Ile Ser Met Gly Thr Lys
Gly Met Arg Tyr Ser Leu Gln Ser Arg 85 90 95Glu Ile Ile Ala Asp Ser
Phe Glu Thr Ile Met Met Ala Gln His Tyr 100 105 110Asp Ala Asn Ile
Ala Ile Pro Ser Cys Asp Lys Asn Met Pro Gly Val 115 120 125Met Met
Ala Met Gly Arg His Asn Arg Pro Ser Ile Met Val Tyr Gly 130 135
140Gly Thr Ile Leu Pro Gly His Pro Thr Cys Gly Ser Ser Lys Ile
Ser145 150 155 160Lys Asn Ile Asp Ile Val Ser Ala Phe Gln Ser Tyr
Gly Glu Tyr Ile 165 170 175Ser Lys Gln Phe Thr Glu Glu Glu Arg Glu
Asp Val Val Glu His Ala 180 185 190Cys Pro Gly Pro Gly Ser Cys Gly
Gly Met Tyr Thr Ala Asn Thr Met 195 200 205Ala Ser Ala Ala Glu Val
Leu Gly Leu Thr Ile Pro Asn Ser Ser Ser 210 215 220Phe Pro Ala Val
Ser Lys Glu Lys Leu Ala Glu Cys Asp Asn Ile Gly225 230 235 240Glu
Tyr Ile Lys Lys Thr Met Glu Leu Gly Ile Leu Pro Arg Asp Ile 245 250
255Leu Thr Lys Glu Ala Phe Glu Asn Ala Ile Thr Tyr Val Val Ala Thr
260 265 270Gly Gly Ser Thr Asn Ala Val Leu His Leu Val Ala Val Ala
His Ser 275 280 285Ala Gly Val Lys Leu Ser Pro Asp Asp Phe Gln Arg
Ile Ser Asp Thr 290 295 300Thr Pro Leu Ile Gly Asp Phe Lys Pro Ser
Gly Lys Tyr Val Met Ala305 310 315 320Asp Leu Ile Asn Val Gly Gly
Thr Gln Ser Val Ile Lys Tyr Leu Tyr 325 330 335Glu Asn Asn Met Leu
His Gly Asn Thr Met Thr Val Thr Gly Asp Thr 340 345 350Leu Ala Glu
Arg Ala Lys Lys Ala Pro Ser Leu Pro Glu Gly Gln Glu 355 360 365Ile
Ile Lys Pro Leu Ser His Pro Ile Lys Ala Asn Gly His Leu Gln 370 375
380Ile Leu Tyr Gly Ser Leu Ala Pro Gly Gly Ala Val Gly Lys Ile
Thr385 390 395 400Gly Lys Glu Gly Thr Tyr Phe Lys Gly Arg Ala Arg Val Phe Glu
Glu 405 410 415Glu Gly Ala Phe Ile Glu Ala Leu Glu Arg Gly Glu Ile
Lys Lys Gly 420 425 430Glu Lys Thr Val Val Val Ile Arg Tyr Glu Gly
Pro Arg Gly Ala Pro 435 440 445Gly Met Pro Glu Met Leu Lys Pro Ser
Ser Ala Leu Met Gly Tyr Gly 450 455 460Leu Gly Lys Asp Val Ala Leu
Leu Thr Asp Gly Arg Phe Ser Gly Gly465 470 475 480Ser His Gly Phe
Leu Ile Gly His Ile Val Pro Glu Ala Ala Glu Gly 485 490 495Gly Pro
Ile Gly Leu Val Arg Asp Gly Asp Glu Ile Ile Ile Asp Ala 500 505
510Asp Asn Asn Lys Ile Asp Leu Leu Val Ser Asp Lys Glu Met Ala Gln
515 520 525Arg Lys Gln Ser Trp Val Ala Pro Pro Pro Arg Tyr Thr Arg
Gly Thr 530 535 540Leu Ser Lys Tyr Ala Lys Leu Val Ser Asn Ala Ser
Asn Gly Cys Val545 550 555 560Leu Asp Ala
165 640 PRT Neurospora crassa
165
Met Ala Ser Asn Gln Asp Asn Lys Ala Val Ala Pro Asp Ala Ala Ala1
5 10 15Pro Ala Gly Gln Ser Thr Thr Thr Thr Thr Thr Asn Asp Asn Ser
Glu 20 25 30Arg Asn Leu Pro Lys Glu Gly Glu Tyr Ile Gln Trp Arg Thr
Leu Pro 35 40 45Ala Gly Asn Pro Asp Gln Leu Asn Arg Trp Ser His Phe
Leu Thr Arg 50 55 60Glu His Glu Phe Pro Gly Ala Gln Ala Met Leu Tyr
Gly Ala Gly Val65 70 75 80Pro Asn Lys Asp Met Met Lys Lys Ala Pro
His Val Gly Ile Ala Thr 85 90 95Val Trp Trp Glu Gly Asn Pro Cys Asn
Thr His Leu Leu Asp Leu Gly 100 105 110Gln Lys Val Lys Lys Ala Val
Glu Arg Glu Lys Met Leu Ala Trp Gln 115 120 125Phe Asn Thr Ile Gly
Val Ser Asp Gly Ile Thr Met Gly Gly Glu Gly 130 135 140Met Arg Tyr
Ser Leu Gln Ser Arg Glu Ile Ile Ala Asp Ser Ile Glu145 150 155
160Thr Val Thr Cys Ala Gln His His Asp Ala Asn Ile Ser Ile Pro Gly
165 170 175Cys Asp Lys Asn Met Pro Gly Val Ile Met Ala Ala Ala Arg
His Asn 180 185 190Arg Pro Phe Val Met Ile Tyr Gly Gly Thr Met Arg
Gly Gly His Ser 195 200 205Glu Leu Leu Asp Arg Pro Ile Asn Ile Val
Thr Cys Tyr Glu Ala Ser 210 215 220Gly Ala Tyr Thr Tyr Gly Arg Leu
Lys Pro Ala Cys Pro Asn Ser Thr225 230 235 240Ala Thr Pro Ser Asp
Val Met Asp Asp Ile Glu Gln His Ala Cys Pro 245 250 255Gly Ala Gly
Ala Cys Gly Gly Met Tyr Thr Ala Asn Thr Met Ala Thr 260 265 270Ala
Ile Glu Ala Met Gly Leu Thr Ala Pro Gly Ser Ser Ser Phe Pro 275 280
285Ala Ser Ser Pro Glu Lys Phe Arg Glu Cys Glu Lys Ala Ala Glu Tyr
290 295 300Ile Lys Ile Cys Met Glu Lys Asp Ile Arg Pro Arg Asp Leu
Leu Thr305 310 315 320Lys Ala Ser Phe Glu Asn Ala Leu Val Leu Thr
Met Ile Leu Gly Gly 325 330 335Ser Thr Asn Gly Val Leu His Tyr Leu
Ala Met Ala Asn Ser Ala Asp 340 345 350Val Asp Leu Thr Leu Asp Asp
Ile Asn Arg Val Ser Ala Lys Thr Pro 355 360 365Phe Leu Ala Asp Met
Ala Pro Ser Gly Arg Tyr Tyr Met Glu Asp Leu 370 375 380Tyr Lys Val
Gly Gly Thr Pro Ala Val Leu Lys Met Leu Ile Ala Ala385 390 395
400Gly Tyr Ile Asp Gly Thr Ile Pro Thr Ile Thr Gly Lys Ser Leu Ala
405 410 415Glu Asn Val Ser Asp Trp Pro Ser Leu Asp Pro Asp Gln Lys
Ile Ile 420 425 430Arg Pro Leu Asp Asn Pro Ile Lys Ser Gln Gly His
Ile Arg Val Leu 435 440 445Tyr Gly Asn Phe Ser Pro Gly Gly Ala Val
Ala Lys Ile Thr Gly Lys 450 455 460Glu Gly Leu Ser Phe Thr Gly Lys
Ala Arg Cys Phe Asn Lys Glu Phe465 470 475 480Glu Leu Asp Ala Ala
Leu Lys Asn Ser Glu Ile Thr Leu Glu Gln Gly 485 490 495Asn Gln Val
Leu Ile Val Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly 500 505 510Met
Pro Glu Gln Leu Lys Ala Ser Ala Ala Ile Met Gly Ala Gly Leu 515 520
525Thr Asn Val Ala Leu Val Thr Asp Gly Arg Tyr Ser Gly Ala Ser His
530 535 540Gly Phe Ile Val Gly His Val Val Pro Glu Ala Ala Thr Gly
Gly Pro545 550 555 560Ile Ala Leu Val Lys Asp Gly Asp Leu Ile Thr
Ile Asp Ala Val Arg 565 570 575Asn Arg Ile Asp Val Val Lys Thr Val
Glu Gly Val Glu Gly Glu Glu 580 585 590Glu Ile Ala Lys Val Leu Glu
Glu Arg Lys Lys Gly Trp Lys Ala Pro 595 600 605Lys Met Lys Pro Thr
Arg Gly Ala Leu Ala Lys Tyr Ala Arg Leu Val 610 615 620Gly Asp Ala
Ser His Gly Ala Val Thr Asp Leu Gly Gly Asp Ala Tyr625 630 635
640
166 561 PRT Acaryochloris marina
166
Met Ser Asp Asn Arg Asn Ser Gln
Val Val Thr Gln Gly Val Gln Arg1 5 10 15Ala Pro Asn Arg Ala Met Leu
Arg Ala Val Gly Phe Gly Asp Asp Asp 20 25 30Phe Thr Lys Pro Ile Val
Gly Leu Ala Asn Gly Phe Ser Thr Ile Thr 35 40 45Pro Cys Asn Met Gly
Ile Asp Ser Leu Ala Thr Arg Ala Glu Ala Ser 50 55 60Ile Arg Thr Ala
Gly Ala Met Pro Gln Lys Phe Gly Thr Ile Thr Ile65 70 75 80Ser Asp
Gly Ile Ser Met Gly Thr Glu Gly Met Lys Tyr Ser Leu Val 85 90 95Ser
Arg Glu Val Ile Ala Asp Ser Ile Glu Thr Ala Cys Met Gly Gln 100 105
110Ser Met Asp Gly Val Leu Ala Ile Gly Gly Cys Asp Lys Asn Met Pro
115 120 125Gly Ala Met Leu Ala Met Ala Arg Met Asn Ile Pro Ala Ile
Phe Val 130 135 140Tyr Gly Gly Thr Ile Lys Pro Gly His Leu Asn Gly
Glu Asp Leu Thr145 150 155 160Val Val Ser Ala Phe Glu Ala Val Gly
Gln His Ser Ala Gly Arg Ile 165 170 175Ser Glu Ala Glu Leu Thr Ala
Val Glu Lys His Ala Cys Pro Gly Ala 180 185 190Gly Ser Cys Gly Gly
Met Tyr Thr Ala Asn Thr Met Ser Ser Ala Phe 195 200 205Glu Ala Met
Gly Met Ser Leu Met Tyr Ser Ser Thr Met Ala Ala Glu 210 215 220Asp
Glu Glu Lys Ala Val Ser Ala Glu Gln Ser Ala Ala Val Leu Val225 230
235 240Glu Ala Ile His Lys Gln Ile Leu Pro Arg Asp Ile Leu Thr Arg
Lys 245 250 255Ala Phe Glu Asn Ala Ile Ala Val Ile Met Ala Val Gly
Gly Ser Thr 260 265 270Asn Ala Val Leu His Leu Leu Ala Ile Ser Arg
Ala Ala Gly Asp Ser 275 280 285Leu Thr Leu Asp Asp Phe Glu Thr Ile
Arg Ala Gln Val Pro Val Ile 290 295 300Cys Asp Leu Lys Pro Ser Gly
Arg Tyr Val Ala Thr Asp Leu His Lys305 310 315 320Ala Gly Gly Ile
Pro Leu Val Met Lys Met Leu Leu Glu His Gly Leu 325 330 335Leu His
Gly Asp Ala Leu Thr Ile Thr Gly Lys Thr Ile Ala Glu Gln 340 345
350Leu Ala Asp Val Pro Ser Glu Pro Ser Pro Asp Gln Asp Val Ile Arg
355 360 365Pro Trp Asp Asn Pro Met Tyr Lys Gln Gly His Leu Ala Ile
Leu Arg 370 375 380Gly Asn Leu Ala Thr Glu Gly Ala Val Ala Lys Ile
Thr Gly Ile Lys385 390 395 400Asn Pro Gln Ile Thr Gly Pro Ala Arg
Val Phe Glu Ser Glu Glu Ala 405 410 415Cys Leu Glu Ala Ile Leu Ala
Gly Lys Ile Gln Pro Asn Asp Val Ile 420 425 430Val Val Arg Tyr Glu
Gly Pro Lys Gly Gly Pro Gly Met Arg Glu Met 435 440 445Leu Ala Pro
Thr Ser Ala Ile Ile Gly Ala Gly Leu Gly Asp Ser Val 450 455 460Gly
Leu Ile Thr Asp Gly Arg Phe Ser Gly Gly Thr Tyr Gly Met Val465 470
475 480Val Gly His Val Ala Pro Glu Ala Ala Val Gly Gly Thr Ile Ala
Leu 485 490 495Val Gln Glu Gly Asp Gln Ile Thr Ile Asp Ala His Ala
Arg Lys Leu 500 505 510Glu Leu His Val Ser Asp Gln Glu Leu Lys Glu
Arg Lys Glu Lys Trp 515 520 525Glu Gln Pro Lys Pro Leu Tyr Asn Lys
Gly Val Leu Ala Lys Tyr Ala 530 535 540Lys Leu Val Ser Ser Ser Ser
Val Gly Ala Val Thr Asp Leu Asp Leu545 550 555
560Phe
167 559 PRT Lyngbya spp.
167
Met Ser Asp Asn Phe Arg Ser Gln Ala
Ile Thr Gln Gly Lys Lys Arg1 5 10 15Thr Pro Asn Arg Ala Met Leu Arg
Ala Val Gly Phe Gly Asp Glu Asp 20 25 30Phe Asn Lys Pro Ile Val Gly
Ile Ala Asn Gly Tyr Ser Thr Ile Thr 35 40 45Pro Cys Asn Ile Gly Leu
Asn Asp Leu Ala His Arg Ala Glu Thr Ala 50 55 60Leu Lys Gln Ala Asp
Ala Met Pro Gln Met Phe Gly Thr Ile Thr Val65 70 75 80Ser Asp Gly
Ile Ala Met Gly Thr Glu Gly Met Lys Tyr Ser Leu Val 85 90 95Ser Arg
Glu Val Ile Ala Asp Ala Ile Glu Thr Ala Cys Asn Gly Gln 100 105
110Ser Met Asp Gly Val Leu Ala Ile Gly Gly Cys Asp Lys Asn Met Pro
115 120 125Gly Ala Met Ile Ala Ile Ala Arg Met Asn Ile Pro Ala Ile
Phe Val 130 135 140Tyr Gly Gly Thr Ile Lys Pro Gly Asn Leu Asn Gly
Cys Asp Leu Thr145 150 155 160Val Val Ser Ala Phe Glu Ala Val Gly
Glu Tyr Ser Ala Gly Lys Leu 165 170 175Asp Asp Asp Arg Leu Leu Asp
Ile Glu Arg Leu Ala Cys Pro Gly Ser 180 185 190Gly Ser Cys Gly Gly
Met Phe Thr Ala Asn Thr Met Ser Ser Ala Phe 195 200 205Glu Ala Met
Gly Met Ser Leu Met Tyr Ser Ser Thr Met Ala Ser Glu 210 215 220Asp
Ala Glu Lys Ala Asp Ser Thr Glu Lys Ser Ala Phe Val Leu Arg225 230
235 240Glu Ala Ile Ser Gln Arg Ile Leu Pro Lys Gln Ile Leu Thr Arg
Lys 245 250 255Ala Phe Glu Asn Ala Ile Ala Val Ile Met Ala Val Gly
Gly Ser Thr 260 265 270Asn Ser Val Leu His Leu Leu Ala Ile Ala Tyr
Ala Ala Asp Val Glu 275 280 285Leu Thr Ile Asp Asp Phe Glu Thr Ile
Arg Gly Arg Val Pro Val Leu 290 295 300Cys Asp Leu Lys Pro Ser Gly
Arg Phe Val Thr Thr Asp Phe His Lys305 310 315 320Ala Gly Gly Val
Pro Leu Ile Met Lys Met Leu Leu Glu Gln Gly Leu 325 330 335Ile His
Gly Asp Ala Leu Thr Ile Thr Gly Lys Thr Val Ala Glu Gln 340 345
350Leu Ala Asp Ile Pro Ser Gln Pro Ser Ala Asp Gln Glu Val Ile Arg
355 360 365Pro Trp Asn Asn Pro Met Tyr Lys Gln Gly His Leu Ala Ile
Leu Lys 370 375 380Gly Asn Leu Ala Thr Glu Gly Ser Val Ala Lys Ile
Thr Gly Val Lys385 390 395 400Lys Pro Gln Met Thr Gly Pro Ala Arg
Val Phe Glu Ser Glu Glu Gln 405 410 415Cys Leu Glu Ala Ile Leu Ala
Gly Lys Ile Gln Ala Gly Asp Val Leu 420 425 430Val Val Arg Tyr Glu
Gly Pro Lys Gly Gly Pro Gly Met Arg Glu Met 435 440 445Leu Ala Pro
Thr Ser Ala Ile Ile Gly Ala Gly Leu Gly Asp Ser Val 450 455 460Gly
Leu Ile Thr Asp Gly Arg Phe Ser Gly Gly Thr Tyr Gly Leu Val465 470
475 480Val Gly His Val Ala Pro Glu Ala Ala Val Gly Gly Asn Ile Ala
Leu 485 490 495Val Gln Glu Gly Asp Ser Ile Thr Ile Asp Ala Ser Gln
Arg Leu Leu 500 505 510Gln Val Asn Ile Ser Asp Gln Val Leu Glu Gln
Arg Arg Gln Asn Trp 515 520 525Gln Pro Pro Gln Pro Arg Tyr Thr Lys
Gly Val Leu Ala Lys Tyr Ala 530 535 540Lys Leu Val Ser Ser Ser Ser
Val Gly Ala Val Thr Asp Leu Asp545 550 555
168 616 PRT Escherichia coli
168
Met Pro Lys Tyr Arg Ser Ala Thr Thr Thr His Gly Arg Asn Met Ala1
5 10 15Gly Ala Arg Ala Leu Trp Arg Ala Thr Gly Met Thr Asp Ala Asp
Phe 20 25 30Gly Lys Pro Ile Ile Ala Val Val Asn Ser Phe Thr Gln Phe
Val Pro 35 40 45Gly His Val His Leu Arg Asp Leu Gly Lys Leu Val Ala
Glu Gln Ile 50 55 60Glu Ala Ala Gly Gly Val Ala Lys Glu Phe Asn Thr
Ile Ala Val Asp65 70 75 80Asp Gly Ile Ala Met Gly His Gly Gly Met
Leu Tyr Ser Leu Pro Ser 85 90 95Arg Glu Leu Ile Ala Asp Ser Val Glu
Tyr Met Val Asn Ala His Cys 100 105 110Ala Asp Ala Met Val Cys Ile
Ser Asn Cys Asp Lys Ile Thr Pro Gly 115 120 125Met Leu Met Ala Ser
Leu Arg Leu Asn Ile Pro Val Ile Phe Val Ser 130 135 140Gly Gly Pro
Met Glu Ala Gly Lys Thr Lys Leu Ser Asp Gln Ile Ile145 150 155
160Lys Leu Asp Leu Val Asp Ala Met Ile Gln Gly Ala Asp Pro Lys Val
165 170 175Ser Asp Ser Gln Ser Asp Gln Val Glu Arg Ser Ala Cys Pro
Thr Cys 180 185 190Gly Ser Cys Ser Gly Met Phe Thr Ala Asn Ser Met
Asn Cys Leu Thr 195 200 205Glu Ala Leu Gly Leu Ser Gln Pro Gly Asn
Gly Ser Leu Leu Ala Thr 210 215 220His Ala Asp Arg Lys Gln Leu Phe
Leu Asn Ala Gly Lys Arg Ile Val225 230 235 240Glu Leu Thr Lys Arg
Tyr Tyr Glu Gln Asn Asp Glu Ser Ala Leu Pro 245 250 255Arg Asn Ile
Ala Ser Lys Ala Ala Phe Glu Asn Ala Met Thr Leu Asp 260 265 270Ile
Ala Met Gly Gly Ser Thr Asn Thr Val Leu His Leu Leu Ala Ala 275 280
285Ala Gln Glu Ala Glu Ile Asp Phe Thr Met Ser Asp Ile Asp Lys Leu
290 295 300Ser Arg Lys Val Pro Gln Leu Cys Lys Val Ala Pro Ser Thr
Gln Lys305 310 315 320Tyr His Met Glu Asp Val His Arg Ala Gly Gly
Val Ile Gly Ile Leu 325 330 335Gly Glu Leu Asp Arg Ala Gly Leu Leu
Asn Arg Asp Val Lys Asn Val 340 345 350Leu Gly Leu Thr Leu Pro Gln
Thr Leu Glu Gln Tyr Asp Val Met Leu 355 360 365Thr Gln Asp Asp Ala
Val Lys Asn Met Phe Arg Ala Gly Pro Ala Gly 370 375 380Ile Arg Thr
Thr Gln Ala Phe Ser Gln Asp Cys Arg Trp Asp Thr Leu385 390 395
400Asp Asp Asp Arg Ala Asn Gly Cys Ile Arg Ser Leu Glu His Ala Tyr
405 410 415Ser Lys Asp Gly Gly Leu Ala Val Leu Tyr Gly Asn Phe Ala
Glu Asn 420 425 430Gly Cys Ile Val Lys Thr Ala Gly Val Asp Asp Ser
Ile Leu Lys Phe 435 440 445Thr Gly Pro Ala Lys Val Tyr Glu Ser Gln
Asp Asp Ala Val Glu Ala 450 455 460Ile Leu Gly Gly Lys Val Val Ala
Gly Asp Val Val Val Ile Arg Tyr465 470 475 480Glu Gly Pro Lys Gly
Gly Pro Gly Met Gln Glu Met Leu Tyr Pro Thr 485 490 495Ser Phe Leu
Lys Ser Met Gly Leu Gly Lys Ala Cys Ala Leu Ile Thr 500 505 510Asp
Gly Arg Phe Ser Gly Gly Thr Ser Gly Leu Ser Ile Gly His Val 515 520
525Ser Pro Glu Ala Ala Ser Gly Gly Ser Ile Gly Leu Ile Glu Asp Gly
530 535 540Asp Leu Ile Ala Ile Asp Ile Pro Asn Arg Gly Ile Gln Leu Gln
Val545 550 555 560Ser Asp Ala Glu Leu Ala Ala Arg Arg Glu Ala Gln
Asp Ala Arg Gly 565 570 575Asp Lys Ala Trp Thr Pro Lys Asn Arg Glu
Arg Gln Val Ser Phe Ala 580 585 590Leu Arg Ala Tyr Ala Ser Leu Ala
Thr Ser Ala Asp Lys Gly Ala Val 595 600 605Arg Asp Lys Ser Lys Leu
Gly Gly 610 615
169 571 PRT Bacillus subtilis
169
Met Leu Thr Lys Ala
Thr Lys Glu Gln Lys Ser Leu Val Lys Asn Arg1 5 10 15Gly Ala Glu Leu
Val Val Asp Cys Leu Val Glu Gln Gly Val Thr His 20 25 30Val Phe Gly
Ile Pro Gly Ala Lys Ile Asp Ala Val Phe Asp Ala Leu 35 40 45Gln Asp
Lys Gly Pro Glu Ile Ile Val Ala Arg His Glu Gln Asn Ala 50 55 60Ala
Phe Met Ala Gln Ala Val Gly Arg Leu Thr Gly Lys Pro Gly Val65 70 75
80Val Leu Val Thr Ser Gly Pro Gly Ala Ser Asn Leu Ala Thr Gly Leu
85 90 95Leu Thr Ala Asn Thr Glu Gly Asp Pro Val Val Ala Leu Ala Gly
Asn 100 105 110Val Ile Arg Ala Asp Arg Leu Lys Arg Thr His Gln Ser
Leu Asp Asn 115 120 125Ala Ala Leu Phe Gln Pro Ile Thr Lys Tyr Ser
Val Glu Val Gln Asp 130 135 140Val Lys Asn Ile Pro Glu Ala Val Thr
Asn Ala Phe Arg Ile Ala Ser145 150 155 160Ala Gly Gln Ala Gly Ala
Ala Phe Val Ser Phe Pro Gln Asp Val Val 165 170 175Asn Glu Val Thr
Asn Thr Lys Asn Val Arg Ala Val Ala Ala Pro Lys 180 185 190Leu Gly
Pro Ala Ala Asp Asp Ala Ile Ser Ala Ala Ile Ala Lys Ile 195 200
205Gln Thr Ala Lys Leu Pro Val Val Leu Val Gly Met Lys Gly Gly Arg
210 215 220Pro Glu Ala Ile Lys Ala Val Arg Lys Leu Leu Lys Lys Val
Gln Leu225 230 235 240Pro Phe Val Glu Thr Tyr Gln Ala Ala Gly Thr
Leu Ser Arg Asp Leu 245 250 255Glu Asp Gln Tyr Phe Gly Arg Ile Gly
Leu Phe Arg Asn Gln Pro Gly 260 265 270Asp Leu Leu Leu Glu Gln Ala
Asp Val Val Leu Thr Ile Gly Tyr Asp 275 280 285Pro Ile Glu Tyr Asp
Pro Lys Phe Trp Asn Ile Asn Gly Asp Arg Thr 290 295 300Ile Ile His
Leu Asp Glu Ile Ile Ala Asp Ile Asp His Ala Tyr Gln305 310 315
320Pro Asp Leu Glu Leu Ile Gly Asp Ile Pro Ser Thr Ile Asn His Ile
325 330 335Glu His Asp Ala Val Lys Val Glu Phe Ala Glu Arg Glu Gln
Lys Ile 340 345 350Leu Ser Asp Leu Lys Gln Tyr Met His Glu Gly Glu
Gln Val Pro Ala 355 360 365Asp Trp Lys Ser Asp Arg Ala His Pro Leu
Glu Ile Val Lys Glu Leu 370 375 380Arg Asn Ala Val Asp Asp His Val
Thr Val Thr Cys Asp Ile Gly Ser385 390 395 400His Ala Ile Trp Met
Ser Arg Tyr Phe Arg Ser Tyr Glu Pro Leu Thr 405 410 415Leu Met Ile
Ser Asn Gly Met Gln Thr Leu Gly Val Ala Leu Pro Trp 420 425 430Ala
Ile Gly Ala Ser Leu Val Lys Pro Gly Glu Lys Val Val Ser Val 435 440
445Ser Gly Asp Gly Gly Phe Leu Phe Ser Ala Met Glu Leu Glu Thr Ala
450 455 460Val Arg Leu Lys Ala Pro Ile Val His Ile Val Trp Asn Asp
Ser Thr465 470 475 480Tyr Asp Met Val Ala Phe Gln Gln Leu Lys Lys
Tyr Asn Arg Thr Ser 485 490 495Ala Val Asp Phe Gly Asn Ile Asp Ile
Val Lys Tyr Ala Glu Ser Phe 500 505 510Gly Ala Thr Gly Leu Arg Val
Glu Ser Pro Asp Gln Leu Ala Asp Val 515 520 525Leu Arg Gln Gly Met
Asn Ala Glu Gly Pro Val Ile Ile Asp Val Pro 530 535 540Val Asp Tyr
Ser Asp Asn Ile Asn Leu Ala Ser Asp Lys Leu Pro Lys545 550 555
560Glu Phe Gly Glu Leu Met Lys Thr Lys Ala Leu 565
570
170 491 PRT Artificial Sequence E. coli ilvC Q110V
170
Met Ala Asn
Tyr Phe Asn Thr Leu Asn Leu Arg Gln Gln Leu Ala Gln1 5 10 15Leu Gly
Lys Cys Arg Phe Met Gly Arg Asp Glu Phe Ala Asp Gly Ala 20 25 30Ser
Tyr Leu Gln Gly Lys Lys Val Val Ile Val Gly Cys Gly Ala Gln 35 40
45Gly Leu Asn Gln Gly Leu Asn Met Arg Asp Ser Gly Leu Asp Ile Ser
50 55 60Tyr Ala Leu Arg Lys Glu Ala Ile Ala Glu Lys Arg Ala Ser Trp
Arg65 70 75 80Lys Ala Thr Glu Asn Gly Phe Lys Val Gly Thr Tyr Glu
Glu Leu Ile 85 90 95Pro Gln Ala Asp Leu Val Ile Asn Leu Thr Pro Asp
Lys Val His Ser 100 105 110Asp Val Val Arg Thr Val Gln Pro Leu Met
Lys Asp Gly Ala Ala Leu 115 120 125Gly Tyr Ser His Gly Phe Asn Ile
Val Glu Val Gly Glu Gln Ile Arg 130 135 140Lys Asp Ile Thr Val Val
Met Val Ala Pro Lys Cys Pro Gly Thr Glu145 150 155 160Val Arg Glu
Glu Tyr Lys Arg Gly Phe Gly Val Pro Thr Leu Ile Ala 165 170 175Val
His Pro Glu Asn Asp Pro Lys Gly Glu Gly Met Ala Ile Ala Lys 180 185
190Ala Trp Ala Ala Ala Thr Gly Gly His Arg Ala Gly Val Leu Glu Ser
195 200 205Ser Phe Val Ala Glu Val Lys Ser Asp Leu Met Gly Glu Gln
Thr Ile 210 215 220Leu Cys Gly Met Leu Gln Ala Gly Ser Leu Leu Cys
Phe Asp Lys Leu225 230 235 240Val Glu Glu Gly Thr Asp Pro Ala Tyr
Ala Glu Lys Leu Ile Gln Phe 245 250 255Gly Trp Glu Thr Ile Thr Glu
Ala Leu Lys Gln Gly Gly Ile Thr Leu 260 265 270Met Met Asp Arg Leu
Ser Asn Pro Ala Lys Leu Arg Ala Tyr Ala Leu 275 280 285Ser Glu Gln
Leu Lys Glu Ile Met Ala Pro Leu Phe Gln Lys His Met 290 295 300Asp
Asp Ile Ile Ser Gly Glu Phe Ser Ser Gly Met Met Ala Asp Trp305 310
315 320Ala Asn Asp Asp Lys Lys Leu Leu Thr Trp Arg Glu Glu Thr Gly
Lys 325 330 335Thr Ala Phe Glu Thr Ala Pro Gln Tyr Glu Gly Lys Ile
Gly Glu Gln 340 345 350Glu Tyr Phe Asp Lys Gly Val Leu Met Ile Ala
Met Val Lys Ala Gly 355 360 365Val Glu Leu Ala Phe Glu Thr Met Val
Asp Ser Gly Ile Ile Glu Glu 370 375 380Ser Ala Tyr Tyr Glu Ser Leu
His Glu Leu Pro Leu Ile Ala Asn Thr385 390 395 400Ile Ala Arg Lys
Arg Leu Tyr Glu Met Asn Val Val Ile Ser Asp Thr 405 410 415Ala Glu
Tyr Gly Asn Tyr Leu Phe Ser Tyr Ala Cys Val Pro Leu Leu 420 425
430Lys Pro Phe Met Ala Glu Leu Gln Pro Gly Asp Leu Gly Lys Ala Ile
435 440 445Pro Glu Gly Ala Val Asp Asn Gly Gln Leu Arg Asp Val Asn
Glu Ala 450 455 460Ile Arg Ser His Ala Ile Glu Gln Val Gly Lys Lys
Leu Arg Gly Tyr465 470 475 480Met Thr Asp Met Lys Arg Ile Ala Val
Ala Gly 485 490
171 1476 DNA Artificial Sequence E. coli ilvC codon-optimized for expression in S. cerevisiae (P2D1-A1)
171
atggccaact attttaacac attaaatttg agacaacaat tggctcaact
gggtaagtgc 60agatttatgg gaagggacga gtttgctgat ggtgcttctt atctgcaagg
aaagaaagta 120gtaattgttg gctgcggtgc tcagggtcta aaccaaggtt
taaacatgag agattcaggt 180ctggatattt cgtatgcatt gaggaaagag
tctattgcag aaaaggatgc cgattggcgt 240aaagcgacgg aaaatgggtt
caaagttggt acttacgaag aactgatccc tcaggcagat 300ttagtgatta
acctaacacc agataaggtt cactcagacg tagtaagaac agttcaaccg
360ctgatgaagg atggggcagc tttaggttac tctcatggct ttaatatcgt
tgaagtgggc 420gagcagatca gaaaaggtat aacagtcgta atggttgcgc
caaagtgccc aggtacggaa 480gtcagagagg agtacaagag gggttttggt
gtacctacat tgatcgccgt acatcctgaa 540aatgacccca aacgtgaagg
tatggcaata gcgaaggcat gggcagccgc aaccggaggt 600catagagcgg
gtgtgttaga gagttctttc gtagctgagg tcaagagtga cttaatgggt
660gaacaaacca ttctgtgcgg aatgttgcag gcagggtctt tactatgctt
tgataaattg 720gtcgaagagg gtacagatcc tgcctatgct gaaaagttga
tacaatttgg ttgggagaca 780atcaccgagg cacttaaaca aggtggcata
acattgatga tggatagact ttcaaatccg 840gccaagctaa gagcctacgc
cttatctgag caactaaaag agatcatggc accattattc 900caaaagcaca
tggacgatat tatctccggt gagttttcct caggaatgat ggcagattgg
960gcaaacgatg ataaaaagtt attgacgtgg agagaagaaa ccggcaagac
ggcattcgag 1020acagccccac aatacgaagg taaaattggt gaacaagaat
actttgataa gggagtattg 1080atgatagcta tggtgaaggc aggggtagaa
cttgcattcg aaactatggt tgactccggt 1140atcattgaag aatctgcata
ctatgagtct ttgcatgaat tgcctttgat agcaaatact 1200attgcaagaa
aaagacttta cgagatgaat gttgtcatat cagacactgc agaatatggt
1260aattacttat ttagctacgc gtgtgtcccg ttgttagagc ccttcatggc
cgagttacaa 1320cctggtgatt tggggaaggc tattccggaa ggagcggttg
acaatggcca actgagagac 1380gtaaatgaag ctattcgttc acatgctata
gaacaggtgg gtaaaaagct gagaggatat 1440atgaccgata tgaaaagaat
tgcagtggca ggatga 1476
172 491 PRT Artificial Sequence E. coli ilvC codon-optimized for expression in S. cerevisiae (P2D1-A1)
172
Met
Ala Asn Tyr Phe Asn Thr Leu Asn Leu Arg Gln Gln Leu Ala Gln1 5 10
15Leu Gly Lys Cys Arg Phe Met Gly Arg Asp Glu Phe Ala Asp Gly Ala
20 25 30Ser Tyr Leu Gln Gly Lys Lys Val Val Ile Val Gly Cys Gly Ala
Gln 35 40 45Gly Leu Asn Gln Gly Leu Asn Met Arg Asp Ser Gly Leu Asp
Ile Ser 50 55 60Tyr Ala Leu Arg Lys Glu Ser Ile Ala Glu Lys Asp Ala
Asp Trp Arg65 70 75 80Lys Ala Thr Glu Asn Gly Phe Lys Val Gly Thr
Tyr Glu Glu Leu Ile 85 90 95Pro Gln Ala Asp Leu Val Ile Asn Leu Thr
Pro Asp Lys Val His Ser 100 105 110Asp Val Val Arg Thr Val Gln Pro
Leu Met Lys Asp Gly Ala Ala Leu 115 120 125Gly Tyr Ser His Gly Phe
Asn Ile Val Glu Val Gly Glu Gln Ile Arg 130 135 140Lys Gly Ile Thr
Val Val Met Val Ala Pro Lys Cys Pro Gly Thr Glu145 150 155 160Val
Arg Glu Glu Tyr Lys Arg Gly Phe Gly Val Pro Thr Leu Ile Ala 165 170
175Val His Pro Glu Asn Asp Pro Lys Arg Glu Gly Met Ala Ile Ala Lys
180 185 190Ala Trp Ala Ala Ala Thr Gly Gly His Arg Ala Gly Val Leu
Glu Ser 195 200 205Ser Phe Val Ala Glu Val Lys Ser Asp Leu Met Gly
Glu Gln Thr Ile 210 215 220Leu Cys Gly Met Leu Gln Ala Gly Ser Leu
Leu Cys Phe Asp Lys Leu225 230 235 240Val Glu Glu Gly Thr Asp Pro
Ala Tyr Ala Glu Lys Leu Ile Gln Phe 245 250 255Gly Trp Glu Thr Ile
Thr Glu Ala Leu Lys Gln Gly Gly Ile Thr Leu 260 265 270Met Met Asp
Arg Leu Ser Asn Pro Ala Lys Leu Arg Ala Tyr Ala Leu 275 280 285Ser
Glu Gln Leu Lys Glu Ile Met Ala Pro Leu Phe Gln Lys His Met 290 295
300Asp Asp Ile Ile Ser Gly Glu Phe Ser Ser Gly Met Met Ala Asp
Trp305 310 315 320Ala Asn Asp Asp Lys Lys Leu Leu Thr Trp Arg Glu
Glu Thr Gly Lys 325 330 335Thr Ala Phe Glu Thr Ala Pro Gln Tyr Glu
Gly Lys Ile Gly Glu Gln 340 345 350Glu Tyr Phe Asp Lys Gly Val Leu
Met Ile Ala Met Val Lys Ala Gly 355 360 365Val Glu Leu Ala Phe Glu
Thr Met Val Asp Ser Gly Ile Ile Glu Glu 370 375 380Ser Ala Tyr Tyr
Glu Ser Leu His Glu Leu Pro Leu Ile Ala Asn Thr385 390 395 400Ile
Ala Arg Lys Arg Leu Tyr Glu Met Asn Val Val Ile Ser Asp Thr 405 410
415Ala Glu Tyr Gly Asn Tyr Leu Phe Ser Tyr Ala Cys Val Pro Leu Leu
420 425 430Glu Pro Phe Met Ala Glu Leu Gln Pro Gly Asp Leu Gly Lys
Ala Ile 435 440 445Pro Glu Gly Ala Val Asp Asn Gly Gln Leu Arg Asp
Val Asn Glu Ala 450 455 460Ile Arg Ser His Ala Ile Glu Gln Val Gly
Lys Lys Leu Arg Gly Tyr465 470 475 480Met Thr Asp Met Lys Arg Ile
Ala Val Ala Gly 485 490
173 548 PRT Lactococcus lactis
173
Met Tyr Thr
Val Gly Asp Tyr Leu Leu Asp Arg Leu His Glu Leu Gly1 5 10 15Ile Glu
Glu Ile Phe Gly Val Pro Gly Asp Tyr Asn Leu Gln Phe Leu 20 25 30Asp
Gln Ile Ile Ser His Lys Asp Met Lys Trp Val Gly Asn Ala Asn 35 40
45Glu Leu Asn Ala Ser Tyr Met Ala Asp Gly Tyr Ala Arg Thr Lys Lys
50 55 60Ala Ala Ala Phe Leu Thr Thr Phe Gly Val Gly Glu Leu Ser Ala
Val65 70 75 80Asn Gly Leu Ala Gly Ser Tyr Ala Glu Asn Leu Pro Val
Val Glu Ile 85 90 95Val Gly Ser Pro Thr Ser Lys Val Gln Asn Glu Gly
Lys Phe Val His 100 105 110His Thr Leu Ala Asp Gly Asp Phe Lys His
Phe Met Lys Met His Glu 115 120 125Pro Val Thr Ala Ala Arg Thr Leu
Leu Thr Ala Glu Asn Ala Thr Val 130 135 140Glu Ile Asp Arg Val Leu
Ser Ala Leu Leu Lys Glu Arg Lys Pro Val145 150 155 160Tyr Ile Asn
Leu Pro Val Asp Val Ala Ala Ala Lys Ala Glu Lys Pro 165 170 175Ser
Leu Pro Leu Lys Lys Glu Asn Ser Thr Ser Asn Thr Ser Asp Gln 180 185
190Glu Ile Leu Asn Lys Ile Gln Glu Ser Leu Lys Asn Ala Lys Lys Pro
195 200 205Ile Val Ile Thr Gly His Glu Ile Ile Ser Phe Gly Leu Glu
Lys Thr 210 215 220Val Thr Gln Phe Ile Ser Lys Thr Lys Leu Pro Ile
Thr Thr Leu Asn225 230 235 240Phe Gly Lys Ser Ser Val Asp Glu Ala
Leu Pro Ser Phe Leu Gly Ile 245 250 255Tyr Asn Gly Thr Leu Ser Glu
Pro Asn Leu Lys Glu Phe Val Glu Ser 260 265 270Ala Asp Phe Ile Leu
Met Leu Gly Val Lys Leu Thr Asp Ser Ser Thr 275 280 285Gly Ala Phe
Thr His His Leu Asn Glu Asn Lys Met Ile Ser Leu Asn 290 295 300Ile
Asp Glu Gly Lys Ile Phe Asn Glu Arg Ile Gln Asn Phe Asp Phe305 310
315 320Glu Ser Leu Ile Ser Ser Leu Leu Asp Leu Ser Glu Ile Glu Tyr
Lys 325 330 335Gly Lys Tyr Ile Asp Lys Lys Gln Glu Asp Phe Val Pro
Ser Asn Ala 340 345 350Leu Leu Ser Gln Asp Arg Leu Trp Gln Ala Val
Glu Asn Leu Thr Gln 355 360 365Ser Asn Glu Thr Ile Val Ala Glu Gln
Gly Thr Ser Phe Phe Gly Ala 370 375 380Ser Ser Ile Phe Leu Lys Ser
Lys Ser His Phe Ile Gly Gln Pro Leu385 390 395 400Trp Gly Ser Ile
Gly Tyr Thr Phe Pro Ala Ala Leu Gly Ser Gln Ile 405 410 415Ala Asp
Lys Glu Ser Arg His Leu Leu Phe Ile Gly Asp Gly Ser Leu 420 425
430Gln Leu Thr Val Gln Glu Leu Gly Leu Ala Ile Arg Glu Lys Ile Asn
435 440 445Pro Ile Cys Phe Ile Ile Asn Asn Asp Gly Tyr Thr Val Glu
Arg Glu 450 455 460Ile His Gly Pro Asn Gln Ser Tyr Asn Asp Ile Pro
Met Trp Asn Tyr465 470 475 480Ser Lys Leu Pro Glu Ser Phe Gly Ala
Thr Glu Asp Arg Val Val Ser 485 490 495Lys Ile Val Arg Thr Glu Asn
Glu Phe Val Ser Val Met Lys Glu Ala 500 505 510Gln Ala Asp Pro Asn
Arg Met Tyr Trp Ile Glu Leu Ile Leu Ala Lys 515 520 525Glu Gly Ala
Pro Lys Val Leu Lys Lys Met Gly Lys Leu Phe Ala Glu 530 535 540Gln
Asn Lys Ser545
174 1023 DNA Lactococcus lactis
174
atgaaagcag cagtagtaag acacaatcca gatggttatg cggaccttgt
tgaaaaggaa 60cttcgagcaa tcaaacctaa tgaagctttg cttgacatgg agtattgtgg
agtctgtcat 120accgatttgc acgttgcagc aggtgattat ggcaacaaag
cagggactgt tcttggtcat 180gaaggaattg gaattgtcaa agaaattgga
gctgatgtaa gctcgcttca agttggtgat 240cgggtttcag tggcttggtt
ctttgaagga tgtggtcact gtgaatactg tgtatctggt 300aatgaaactt
tttgtcgaga agttaaaaat gcaggatatt cagttgatgg cggaatggct
360gaagaagcaa ttgttgttgc cgattatgct gtcaaagttc ctgacggact
tgacccaatt 420gaagctagct caattacttg tgctggagta acaacttaca
aagcaatcaa agtatcagga 480gtaaaacctg gtgattggca agtaattttt
ggtgctggag gacttggaaa tttagcaatt 540caatatgcta aaaatgtttt
tggagcaaaa gtaattgctg ttgatattaa tcaagataaa 600ttaaatttag
ctaaaaaaat tggagctgat gtgattatca attctggtga tgtaaatcca
660gttgatgaaa ttaaaaaaat aactggcggc ttaggggtgc aaagtgcaat
agtttgtgct 720gttgcaagga ttgcttttga acaagcggtt gcttctttga
aacctatggg caaaatggtt 780gctgtggcac ttcccaatac tgagatgact
ttatcagttc caacagttgt ttttgacgga 840gtggaggttg caggttcact
tgtcggaaca agacttgact tggcagaagc ttttcaattt 900ggagcagaag
gtaaggtaaa accaattgtt gcgacacgca aactggaaga aatcaatgat
960attattgatg aaatgaaggc aggaaaaatt gaaggccgaa tggtcattga
ttttactaaa 1020taa 1023
175 340 PRT Lactococcus lactis
175
Met Lys Ala
Ala Val Val Arg His Asn Pro Asp Gly Tyr Ala Asp Leu1 5 10 15Val Glu
Lys Glu Leu Arg Ala Ile Lys Pro Asn Glu Ala Leu Leu Asp 20 25 30Met
Glu Tyr Cys Gly Val Cys His Thr Asp Leu His Val Ala Ala Gly 35 40
45Asp Tyr Gly Asn Lys Ala Gly Thr Val Leu Gly His Glu Gly Ile Gly
50 55 60Ile Val Lys Glu Ile Gly Ala Asp Val Ser Ser Leu Gln Val Gly
Asp65 70 75 80Arg Val Ser Val Ala Trp Phe Phe Glu Gly Cys Gly His
Cys Glu Tyr 85 90 95Cys Val Ser Gly Asn Glu Thr Phe Cys Arg Glu Val
Lys Asn Ala Gly 100 105 110Tyr Ser Val Asp Gly Gly Met Ala Glu Glu
Ala Ile Val Val Ala Asp 115 120 125Tyr Ala Val Lys Val Pro Asp Gly
Leu Asp Pro Ile Glu Ala Ser Ser 130 135 140Ile Thr Cys Ala Gly Val
Thr Thr Tyr Lys Ala Ile Lys Val Ser Gly145 150 155 160Val Lys Pro
Gly Asp Trp Gln Val Ile Phe Gly Ala Gly Gly Leu Gly 165 170 175Asn
Leu Ala Ile Gln Tyr Ala Lys Asn Val Phe Gly Ala Lys Val Ile 180 185
190Ala Val Asp Ile Asn Gln Asp Lys Leu Asn Leu Ala Lys Lys Ile Gly
195 200 205Ala Asp Val Ile Ile Asn Ser Gly Asp Val Asn Pro Val Asp
Glu Ile 210 215 220Lys Lys Ile Thr Gly Gly Leu Gly Val Gln Ser Ala
Ile Val Cys Ala225 230 235 240Val Ala Arg Ile Ala Phe Glu Gln Ala
Val Ala Ser Leu Lys Pro Met 245 250 255Gly Lys Met Val Ala Val Ala
Leu Pro Asn Thr Glu Met Thr Leu Ser 260 265 270Val Pro Thr Val Val
Phe Asp Gly Val Glu Val Ala Gly Ser Leu Val 275 280 285Gly Thr Arg
Leu Asp Leu Ala Glu Ala Phe Gln Phe Gly Ala Glu Gly 290 295 300Lys
Val Lys Pro Ile Val Ala Thr Arg Lys Leu Glu Glu Ile Asn Asp305 310
315 320Ile Ile Asp Glu Met Lys Ala Gly Lys Ile Glu Gly Arg Met Val
Ile 325 330 335Asp Phe Thr Lys 340
<210> 176
<211> 256
<212> PRT
<213> Drosophila melanogaster
<400> 176
Met Ser Phe Thr Leu Thr Asn Lys Asn Val Ile Phe Val Ala Gly Leu1
5 10 15Gly Gly Ile Gly Leu Asp Thr Ser Lys Glu Leu Leu Lys Arg Asp
Leu 20 25 30Lys Asn Leu Val Ile Leu Asp Arg Ile Glu Asn Pro Ala Ala
Ile Ala 35 40 45Glu Leu Lys Ala Ile Asn Pro Lys Val Thr Val Thr Phe
Tyr Pro Tyr 50 55 60Asp Val Thr Val Pro Ile Ala Glu Thr Thr Lys Leu
Leu Lys Thr Ile65 70 75 80Phe Ala Gln Leu Lys Thr Val Asp Val Leu
Ile Asn Gly Ala Gly Ile 85 90 95Leu Asp Asp His Gln Ile Glu Arg Thr
Ile Ala Val Asn Tyr Thr Gly 100 105 110Leu Val Asn Thr Thr Thr Ala
Ile Leu Asp Phe Trp Asp Lys Arg Lys 115 120 125Gly Gly Pro Gly Gly
Ile Ile Cys Asn Ile Gly Ser Val Thr Gly Phe 130 135 140Asn Ala Ile
Tyr Gln Val Pro Val Tyr Ser Gly Thr Lys Ala Ala Val145 150 155
160Val Asn Phe Thr Ser Ser Leu Ala Lys Leu Ala Pro Ile Thr Gly Val
165 170 175Thr Ala Tyr Thr Val Asn Pro Gly Ile Thr Arg Thr Thr Leu
Val His 180 185 190Thr Phe Asn Ser Trp Leu Asp Val Glu Pro Gln Val
Ala Glu Lys Leu 195 200 205Leu Ala His Pro Thr Gln Pro Ser Leu Ala
Cys Ala Glu Asn Phe Val 210 215 220Lys Ala Ile Glu Leu Asn Gln Asn
Gly Ala Ile Trp Lys Leu Asp Leu225 230 235 240Gly Thr Leu Glu Ala
Ile Gln Trp Thr Lys His Trp Asp Ser Gly Ile 245 250 255
<210> 177
<211> 340
<212> PRT
<213> Artificial Sequence
<220>
<223> L. lactis AdhA RE1
<400> 177
Met Lys Ala
Ala Val Val Arg His Asn Pro Asp Gly Tyr Ala Asp Leu1 5 10 15Val Glu
Lys Glu Leu Arg Ala Ile Lys Pro Asn Glu Ala Leu Leu Asp 20 25 30Met
Glu Tyr Cys Gly Val Cys His Thr Asp Leu His Val Ala Ala Gly 35 40
45Asp Phe Gly Asn Lys Ala Gly Thr Val Leu Gly His Glu Gly Ile Gly
50 55 60Ile Val Lys Glu Ile Gly Ala Asp Val Ser Ser Leu Gln Val Gly
Asp65 70 75 80Arg Val Ser Val Ala Trp Phe Phe Glu Gly Cys Gly His
Cys Glu Tyr 85 90 95Cys Val Ser Gly Asn Glu Thr Phe Cys Arg Glu Val
Lys Asn Ala Gly 100 105 110Tyr Ser Val Asp Gly Gly Met Ala Glu Glu
Ala Ile Val Val Ala Asp 115 120 125Tyr Ala Val Lys Val Pro Asp Gly
Leu Asp Pro Ile Glu Ala Ser Ser 130 135 140Ile Thr Cys Ala Gly Val
Thr Thr Tyr Lys Ala Ile Lys Val Ser Gly145 150 155 160Val Lys Pro
Gly Asp Trp Gln Val Ile Phe Gly Ala Gly Gly Leu Gly 165 170 175Asn
Leu Ala Ile Gln Tyr Ala Lys Asn Val Phe Gly Ala Lys Val Ile 180 185
190Ala Val Asp Ile Asn Gln Asp Lys Leu Asn Leu Ala Lys Lys Ile Gly
195 200 205Ala Asp Val Thr Ile Asn Ser Gly Asp Val Asn Pro Val Asp
Glu Ile 210 215 220Lys Lys Ile Thr Gly Gly Leu Gly Val Gln Ser Ala
Ile Val Cys Ala225 230 235 240Val Ala Arg Ile Ala Phe Glu Gln Ala
Val Ala Ser Leu Lys Pro Met 245 250 255Gly Lys Met Val Ala Val Ala
Val Pro Asn Thr Glu Met Thr Leu Ser 260 265 270Val Pro Thr Val Val
Phe Asp Gly Val Glu Val Ala Gly Ser Leu Val 275 280 285Gly Thr Arg
Leu Asp Leu Ala Glu Ala Phe Gln Phe Gly Ala Glu Gly 290 295 300Lys
Val Lys Pro Ile Val Ala Thr Arg Lys Leu Glu Glu Ile Asn Asp305 310
315 320Ile Ile Asp Glu Met Lys Ala Gly Lys Ile Glu Gly Arg Met Val
Ile 325 330 335Asp Phe Thr Lys 340
* * * * *