U.S. patent application number 14/157799, filed on 2014-01-17, was published by the patent office on 2014-10-16 for cytosolic isobutanol pathway localization for the production of isobutanol.
This patent application is currently assigned to Gevo, Inc. The applicant listed for this patent is Gevo, Inc. Invention is credited to Aristos A. Aristidou, Ruth Berry, Thomas Buelter, Catherine Asleson Dundon, Reid M. Renny Feldman, Andrew Hawkins, Ishmeet Kalra, Doug Lies, Peter Meinhold, Matthew Peters, Stephanie Porter-Scheinman, Christopher Smith, Jun Urano.
Application Number | 20140308721 14/157799 |
Document ID | / |
Family ID | 43586479 |
Filed Date | 2014-01-17 |
United States Patent
Application |
20140308721 |
Kind Code |
A1 |
Urano; Jun ; et al. |
October 16, 2014 |
Cytosolic Isobutanol Pathway Localization for the Production of
Isobutanol
Abstract
The present invention provides recombinant microorganisms
comprising an isobutanol producing metabolic pathway with at least one
isobutanol pathway enzyme localized in the cytosol, wherein said
recombinant microorganism is selected to produce isobutanol from a
carbon source. Methods of using said recombinant microorganisms to
produce isobutanol are also provided. In various aspects of the
invention, the recombinant microorganisms may comprise
cytosolically active isobutanol pathway enzymes. In some
embodiments, the invention provides mutated, modified, and/or
chimeric isobutanol pathway enzymes with cytosolic activity. In
various embodiments described herein, the recombinant
microorganisms may be microorganisms of the Saccharomyces clade,
Crabtree-negative yeast microorganisms, Crabtree-positive yeast
microorganisms, post-WGD (whole genome duplication) yeast
microorganisms, pre-WGD (whole genome duplication) yeast
microorganisms, and non-fermenting yeast microorganisms.
Inventors: |
Urano; Jun; (Irvine, CA)
Dundon; Catherine Asleson; (Englewood, CO)
Meinhold; Peter; (Topanga, CA)
Feldman; Reid M. Renny; (San Francisco, CA)
Aristidou; Aristos A.; (Maple Grove, MN)
Hawkins; Andrew; (Parker, CO)
Buelter; Thomas; (Santa Monica, CA)
Peters; Matthew; (Highlands Ranch, CO)
Lies; Doug; (Parker, CO)
Porter-Scheinman; Stephanie; (Englewood, CO)
Smith; Christopher; (Parker, CO)
Berry; Ruth; (Englewood, CO)
Kalra; Ishmeet; (Englewood, CO) |
|
Applicant: |
Name | City | State | Country | Type |
Gevo, Inc. | Englewood | CO | US | |
Assignee: |
Gevo, Inc. (Englewood, CO) |
Family ID: |
43586479 |
Appl. No.: |
14/157799 |
Filed: |
January 17, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | Parent Of |
13176452 | Jul 5, 2011 | | 14157799 |
12855276 | Aug 12, 2010 | 8232089 | 13176452 |
61272058 | Aug 12, 2009 | | |
61272059 | Aug 12, 2009 | | |
Current U.S. Class: | 435/160 ; 435/254.21 |
Current CPC Class: | C12N 9/1022 20130101; C12N 15/81 20130101; C12N 9/88 20130101; Y02E 50/10 20130101; C12P 7/16 20130101; C12N 9/0006 20130101 |
Class at Publication: | 435/160 ; 435/254.21 |
International Class: | C12N 15/81 20060101 C12N015/81; C12P 7/16 20060101 C12P007/16 |
Government Interests
ACKNOWLEDGMENT OF GOVERNMENTAL SUPPORT
[0002] This invention was made with government support under
Contract No. IIP-0823122, awarded by the National Science
Foundation, and under Contract No. EP-D-09-023, awarded by the
Environmental Protection Agency. The government has certain rights
in the invention.
Claims
1. A recombinant yeast microorganism comprising an isobutanol
producing metabolic pathway, wherein said isobutanol producing
metabolic pathway comprises the following substrate to product
conversions: (i) pyruvate to acetolactate; (ii) acetolactate to
2,3-dihydroxyisovalerate; (iii) 2,3-dihydroxyisovalerate to
α-ketoisovalerate; (iv) α-ketoisovalerate to
isobutyraldehyde; and (v) isobutyraldehyde to isobutanol; wherein
a) the enzyme that catalyzes a substrate to product conversion of
pyruvate to acetolactate is an acetolactate synthase; b) the enzyme
that catalyzes a substrate to product conversion of acetolactate to
2,3-dihydroxyisovalerate is a ketol-acid reductoisomerase derived
from Lactococcus lactis; c) the enzyme that catalyzes a substrate
to product conversion of 2,3-dihydroxyisovalerate to
α-ketoisovalerate is a dihydroxy acid dehydratase; d) the
enzyme that catalyzes a substrate to product conversion of
α-ketoisovalerate to isobutyraldehyde is an
α-ketoisovalerate decarboxylase; and e) the enzyme that
catalyzes a substrate to product conversion of isobutyraldehyde to
isobutanol is an alcohol dehydrogenase.
2. The recombinant yeast microorganism of claim 1, wherein said
acetolactate synthase is derived from a bacterial organism.
3. The recombinant yeast microorganism of claim 2, wherein said
bacterial organism is Bacillus subtilis.
4. The recombinant yeast microorganism of claim 1, wherein said
ketol-acid reductoisomerase is an NADH-dependent ketol-acid
reductoisomerase.
5. The recombinant yeast microorganism of claim 1, wherein said
dihydroxy acid dehydratase comprises the amino acid sequence
P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), wherein X is any natural or
non-natural amino acid.
6. The recombinant yeast microorganism of claim 5, wherein said
dihydroxy acid dehydratase enzyme is derived from a bacterial
organism.
7. The recombinant yeast microorganism of claim 6, wherein said
bacterial organism is Lactococcus lactis.
8. The recombinant yeast microorganism of claim 1, wherein said
α-ketoisovalerate decarboxylase is derived from a bacterial
organism.
9. The recombinant yeast microorganism of claim 8, wherein said
bacterial organism is Lactococcus lactis.
10. The recombinant yeast microorganism of claim 1, wherein said
alcohol dehydrogenase is derived from a bacterial organism.
11. The recombinant yeast microorganism of claim 10, wherein said
bacterial organism is Lactococcus lactis.
12. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism has been engineered to inactivate
one or more endogenous pyruvate decarboxylase (PDC) genes.
13. The recombinant yeast microorganism of claim 12, wherein said
one or more endogenous PDC genes is selected from PDC1, PDC5, and
PDC6.
14. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism has been engineered to inactivate
one or more endogenous glycerol-3-phosphate dehydrogenase (GPD)
genes.
15. The recombinant yeast microorganism of claim 14, wherein said
one or more endogenous GPD genes is selected from GPD1 and
GPD2.
16. The recombinant yeast microorganism of claim 1, wherein the
recombinant yeast microorganism is a yeast microorganism of the
Saccharomyces clade.
17. The recombinant yeast microorganism of claim 16, wherein said
yeast microorganism of the Saccharomyces clade is S.
cerevisiae.
18. A method of producing isobutanol, comprising: (a) providing a
recombinant yeast microorganism according to claim 1; and (b)
cultivating said recombinant yeast microorganism in a culture
medium containing a feedstock providing the carbon source, until a
recoverable quantity of the isobutanol is produced.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser.
No. 13/176,452, filed Jul. 5, 2011, which is a divisional of U.S.
application Ser. No. 12/855,276, filed Aug. 12, 2010, which issued
as U.S. Pat. No. 8,232,089, which claims the benefit of U.S.
Provisional Application Ser. No. 61/272,058, filed Aug. 12, 2009,
and U.S. Provisional Application Ser. No. 61/272,059, filed Aug.
12, 2009, each of which is herein incorporated by reference in
its entirety for all purposes.
TECHNICAL FIELD
[0003] Recombinant microorganisms and methods of producing such
organisms are provided. Also provided are methods of producing
metabolites that are biofuels by contacting a suitable substrate
with recombinant microorganisms and enzymatic preparations
therefrom.
DESCRIPTION OF THE TEXT FILE SUBMITTED ELECTRONICALLY
[0004] The contents of the text file submitted electronically
herewith are incorporated herein by reference in their entirety: A
computer readable format copy of the Sequence Listing (filename:
GEVO_041_18US_SeqList_ST25.txt, date recorded: Jan. 16,
2014, file size: 343 kilobytes).
BACKGROUND
[0005] Biofuels have a long history ranging back to the beginning
of the 20th century. As early as 1900, Rudolf Diesel demonstrated
at the World Exhibition in Paris, France, an engine running on
peanut oil. Soon thereafter, Henry Ford demonstrated his Model T
running on ethanol derived from corn. Petroleum-derived fuels
displaced biofuels in the 1930s and 1940s due to increased supply
and greater efficiency at a lower cost.
[0006] Market fluctuations in the 1970s, coupled with the decrease in
US oil production, led to an increase in crude oil prices and a
renewed interest in biofuels. Today, many interest groups,
including policy makers, industry planners, aware citizens, and the
financial community, are interested in replacing
petroleum-derived fuels with biomass-derived biofuels. The leading
motivations for developing biofuels are economic, political,
and environmental in nature.
[0007] One is the threat of 'peak oil', the point at which the
consumption rate of crude oil exceeds the supply rate, leading to
significantly increased fuel costs and, in turn, an increased demand
for alternative fuels. In addition, instability in the Middle East
and other oil-rich regions has increased the demand for
domestically produced biofuels. Also, environmental concerns
relating to the possibility of carbon dioxide related climate
change are an important social and ethical driving force that is
starting to result in government regulations and policies such as
caps on carbon dioxide emissions from automobiles, taxes on carbon
dioxide emissions, and tax incentives for the use of biofuels.
[0008] Ethanol is the most abundant fermentatively produced fuel
today but has several drawbacks when compared to gasoline. Butanol,
in comparison, has several advantages over ethanol as a fuel: it
can be made from the same feedstocks as ethanol but, unlike
ethanol, it is compatible with gasoline at any ratio and can also
be used as a pure fuel in existing combustion engines without
modifications. Unlike ethanol, butanol does not absorb water and
can thus be stored and distributed in the existing petrochemical
infrastructure. Due to its higher energy content which is close to
that of gasoline, the fuel economy (miles per gallon) is better
than that of ethanol. Also, butanol-gasoline blends have lower
vapor pressure than ethanol-gasoline blends, which is important in
reducing evaporative hydrocarbon emissions.
[0009] Isobutanol has the same advantages as butanol with the
additional advantage of having a higher octane number due to its
branched carbon chain. Isobutanol is also useful as a commodity
chemical and is also a precursor to MTBE.
[0010] Isobutanol has been produced in recombinant microorganisms
expressing a heterologous, five-step metabolic pathway (See, e.g.,
WO/2007/050671 to Donaldson et al., WO/2008/098227 to Liao et al.,
and WO/2009/103533 to Festel et al.). However, the microorganisms
produced have fallen short of commercial relevance due to their low
performance characteristics, including, for example, low
productivity, low titer, low yield, and the requirement for oxygen
during the fermentation process. Thus, recombinant microorganisms
exhibiting increased isobutanol productivity, titer, and/or yield
are desirable.
SUMMARY OF THE INVENTION
[0011] The present invention provides cytosolically active
dihydroxyacid dehydratase (DHAD) enzymes and recombinant
microorganisms comprising said cytosolically active DHAD enzymes.
In some embodiments, said recombinant microorganisms may further
comprise one or more additional enzymes catalyzing a reaction in an
isobutanol producing metabolic pathway. As described herein, the
recombinant microorganisms of the present invention are useful for
the production of several beneficial metabolites, including, but
not limited to isobutanol.
[0012] In a first aspect, the invention provides cytosolically
active dihydroxyacid dehydratase (DHAD) enzymes. These
cytosolically active DHAD enzymes generally exhibit the ability to
convert 2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol.
The cytosolically active DHAD enzymes of the present invention, as
described herein, can include native (i.e. parental) DHAD enzymes
that exhibit cytosolic activity, as well as DHAD enzymes that have
been modified or mutated to increase their cytosolic localization
and/or activity as compared to native (i.e. parental) DHAD
enzymes.
[0013] In various embodiments described herein, the DHAD enzymes
may be derived from a prokaryotic organism. In one embodiment, the
prokaryotic organism is a bacterial organism. In another
embodiment, the bacterial organism is Lactococcus lactis. In a
specific embodiment, the DHAD enzyme from L. lactis comprises the
amino acid sequence of SEQ ID NO: 18. In another embodiment, the
bacterial organism is Francisella tularensis. In a specific
embodiment, the DHAD enzyme from F. tularensis comprises the amino
acid sequence of SEQ ID NO: 14. In another embodiment, the
bacterial organism is Gramella forsetii. In a specific embodiment,
the DHAD enzyme from G. forsetii comprises the amino acid sequence
of SEQ ID NO: 17.
[0014] In alternative embodiments described herein, the DHAD enzyme
may be derived from a eukaryotic organism. In one embodiment, the
eukaryotic organism is a fungal organism. In an exemplary
embodiment, the fungal organism is Neurospora crassa. In a specific
embodiment, the DHAD enzyme from N. crassa comprises the amino acid
sequence of SEQ ID NO: 165.
[0015] In some embodiments, the invention provides modified or
mutated DHAD enzymes, wherein said DHAD enzymes exhibit increased
cytosolic activity as compared to their parental DHAD enzymes. In
another embodiment, the invention provides modified or mutated DHAD
enzymes, wherein said DHAD enzymes exhibit increased cytosolic
activity as compared to the DHAD enzyme comprised by the amino acid
sequence of SEQ ID NO: 11.
[0016] In further embodiments, the invention provides DHAD enzymes
comprising the amino acid sequence P(I/L)XXXGX(I/L)XIL (SEQ ID NO:
27), wherein X is any natural or non-natural amino acid, and
wherein said DHAD enzymes exhibit the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol.
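
The motif P(I/L)XXXGX(I/L)XIL can be read as a simple pattern over one-letter amino acid codes. The following is a minimal sketch of how a sequence might be screened for it, assuming each X is represented by a single character (which would not hold for every non-natural amino acid). The names `DHAD_MOTIF`, `find_motif`, and the toy sequence are illustrative and do not come from the application; the toy sequence is not a real DHAD sequence.

```python
import re

# P(I/L)XXXGX(I/L)XIL: [IL] encodes the (I/L) alternatives and '.' stands in
# for X (any single residue character). This encoding is an assumption made
# for illustration.
DHAD_MOTIF = re.compile(r"P[IL].{3}G.[IL].IL")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in DHAD_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, NOT taken from the Sequence Listing.
toy = "MKTAPLAAAGAIAILQRS"
print(find_motif(toy))
```

A hit only indicates that the motif is present; it says nothing about whether the enzyme is active in the cytosol.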
[0017] In some embodiments, the DHAD enzymes of the present
invention exhibit a properly folded iron-sulfur cluster domain
and/or redox active domain in the cytosol. In one embodiment, the
DHAD enzymes comprise a mutated or modified iron-sulfur cluster
domain and/or redox active domain.
[0018] In another aspect, the present invention provides
recombinant microorganisms comprising a cytosolically active DHAD
enzyme. In one embodiment, the invention provides recombinant
microorganisms comprising a DHAD enzyme derived from a prokaryotic
organism, wherein said DHAD enzyme exhibits activity in the
cytosol. In one embodiment, the DHAD enzyme is derived from a
bacterial organism. In a specific embodiment, the DHAD enzyme is
derived from L. lactis and comprises the amino acid sequence of SEQ
ID NO: 18. In another embodiment, the invention provides
recombinant microorganisms comprising a DHAD enzyme derived from a
eukaryotic organism, wherein said DHAD enzyme exhibits activity in
the cytosol. In one embodiment, the DHAD enzyme is derived from a
fungal organism. In an alternative embodiment, the DHAD enzyme is
derived from a yeast organism.
[0019] In one embodiment, the invention provides recombinant
microorganisms comprising a modified or mutated DHAD enzyme,
wherein said DHAD enzyme exhibits increased cytosolic activity as
compared to the parental DHAD enzyme. In another embodiment, the
invention provides recombinant microorganisms comprising a modified
or mutated DHAD enzyme, wherein said DHAD enzyme exhibits increased
cytosolic activity as compared to the DHAD enzyme comprised by the
amino acid sequence of SEQ ID NO: 11.
[0020] In another embodiment, the invention provides recombinant
microorganisms comprising a DHAD enzyme comprising the amino acid
sequence P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), wherein X is any
natural or non-natural amino acid, and wherein said DHAD enzymes
exhibit the ability to convert 2,3-dihydroxyisovalerate to
ketoisovalerate in the cytosol.
[0021] In some embodiments, the invention provides recombinant
microorganisms comprising a DHAD enzyme fused to a peptide tag,
whereby said DHAD enzyme exhibits increased cytosolic localization
and/or cytosolic DHAD activity as compared to the parental
microorganism. In one embodiment, the peptide tag is non-cleavable.
In another embodiment, the peptide tag is fused at the N-terminus
of the DHAD enzyme. In another embodiment, the peptide tag is fused
at the C-terminus of the DHAD enzyme. In certain embodiments, the
peptide tag may be selected from the group consisting of ubiquitin,
ubiquitin-like (UBL) proteins, myc, HA-tag, green fluorescent
protein (GFP), and the maltose binding protein (MBP).
[0022] In certain embodiments described herein, it may be desirable
to further overexpress an additional enzyme that converts
2,3-dihydroxyisovalerate (DHIV) to ketoisovalerate (KIV) in the
cytosol. In a specific embodiment, the enzyme may be selected from
the group consisting of 3-isopropylmalate isomerase (Leu1p) and
imidazoleglycerol-phosphate dehydrogenase (His3p).
[0023] In various embodiments described herein, the recombinant
microorganisms may be further engineered to express an isobutanol
producing metabolic pathway comprising at least one exogenous gene
that catalyzes a step in the conversion of pyruvate to isobutanol.
In one embodiment, the recombinant microorganism may be engineered
to express an isobutanol producing metabolic pathway comprising at
least two exogenous genes. In another embodiment, the recombinant
microorganism may be engineered to express an isobutanol producing
metabolic pathway comprising at least three exogenous genes. In
another embodiment, the recombinant microorganism may be engineered
to express an isobutanol producing metabolic pathway comprising at
least four exogenous genes. In another embodiment, the recombinant
microorganism may be engineered to express an isobutanol producing
metabolic pathway comprising five exogenous genes. Thus, the
present invention further provides recombinant microorganisms that
comprise an isobutanol producing metabolic pathway and methods of
using said recombinant microorganisms to produce isobutanol.
[0024] In one embodiment, the recombinant microorganisms comprise
an isobutanol producing metabolic pathway with at least one
isobutanol pathway enzyme localized in the cytosol. In another
embodiment, the recombinant microorganisms comprise an isobutanol
producing metabolic pathway with at least two isobutanol pathway
enzymes localized in the cytosol. In another embodiment, the
recombinant microorganisms comprise an isobutanol producing
metabolic pathway with at least three isobutanol pathway enzymes
localized in the cytosol. In another embodiment, the recombinant
microorganisms comprise an isobutanol producing metabolic pathway
with at least four isobutanol pathway enzymes localized in the
cytosol. In an exemplary embodiment, the recombinant microorganisms
comprise an isobutanol producing metabolic pathway with five
isobutanol pathway enzymes localized in the cytosol. In a further
exemplary embodiment, at least one of the pathway enzymes localized
to the cytosol is a cytosolically active DHAD enzyme as disclosed
herein.
[0025] In various embodiments described herein, the isobutanol
pathway enzyme(s) is/are selected from the group consisting of
acetolactate synthase (ALS), ketol-acid reductoisomerase (KARI),
dihydroxyacid dehydratase (DHAD), 2-keto-acid decarboxylase (KIVD),
and alcohol dehydrogenase (ADH).
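
As an illustration only, the five enzymes listed above can be laid out as an ordered pathway, pairing each enzyme with the substrate to product conversion recited in claim 1. The sketch below is a bookkeeping aid, not a model of metabolism; the names `Step`, `PATHWAY`, `is_contiguous`, and `missing_enzymes` are hypothetical.

```python
from typing import NamedTuple


class Step(NamedTuple):
    substrate: str
    product: str
    enzyme: str  # abbreviation as used in the text


# Enzyme-to-conversion pairing follows the order given in claim 1.
PATHWAY = [
    Step("pyruvate", "acetolactate", "ALS"),
    Step("acetolactate", "2,3-dihydroxyisovalerate", "KARI"),
    Step("2,3-dihydroxyisovalerate", "alpha-ketoisovalerate", "DHAD"),
    Step("alpha-ketoisovalerate", "isobutyraldehyde", "KIVD"),
    Step("isobutyraldehyde", "isobutanol", "ADH"),
]


def is_contiguous(steps: list[Step]) -> bool:
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def missing_enzymes(expressed: set[str]) -> list[str]:
    """Pathway enzymes, in pathway order, absent from an assumed expressed set."""
    return [s.enzyme for s in PATHWAY if s.enzyme not in expressed]


print(is_contiguous(PATHWAY))
print(missing_enzymes({"ALS", "KARI", "ADH"}))
```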
[0026] As described herein, the cytosolically active isobutanol
pathway enzymes of the present invention can include native (i.e.
parental) enzymes that exhibit cytosolic activity, as well as
isobutanol pathway enzymes that have been modified or mutated to
increase their cytosolic localization and/or activity as compared
to native (i.e. parental) pathway enzymes.
[0027] In various embodiments described herein, the isobutanol
pathway enzymes may be derived from a prokaryotic organism. In
alternative embodiments described herein, the isobutanol pathway
enzymes may be derived from a eukaryotic organism.
[0028] In some embodiments, the invention provides modified or
mutated isobutanol pathway enzymes, wherein said isobutanol pathway
enzymes exhibit increased cytosolic activity as compared to their
parental isobutanol pathway enzymes. In another embodiment, the
invention provides modified or mutated isobutanol pathway enzymes,
wherein said isobutanol pathway enzymes exhibit increased cytosolic
activity as compared to the homologous isobutanol pathway enzyme
from S. cerevisiae.
[0029] In various embodiments described herein, at least one of the
isobutanol pathway enzymes exhibiting cytosolic activity is ALS. In
one embodiment, the ALS is derived from a prokaryotic organism,
including, but not limited to Bacillus subtilis or L. lactis. In
another embodiment, the ALS is derived from a eukaryotic organism,
including, but not limited to Magnaporthe grisea, Phaeosphaeria
nodorum, Talaromyces stipitatus, and Trichoderma atroviride.
[0030] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is KARI. In one
embodiment, the KARI is derived from a prokaryotic organism,
including, but not limited to Escherichia coli, B. subtilis or L.
lactis. In another embodiment, the KARI is derived from a
eukaryotic organism, including, but not limited to Piromyces sp.
E2, S. cerevisiae, and Arabidopsis. In certain specific
embodiments, the KARI comprises an amino acid sequence selected
from an organism selected from the group consisting of E. coli, S.
cerevisiae, B. subtilis, Piromyces sp. E2, Buchnera aphidicola,
Spinacia oleracea, Oryza sativa, Chlamydomonas reinhardtii, N.
crassa, Schizosaccharomyces pombe, Laccaria bicolor, Ignicoccus
hospitalis, Picrophilus torridus, Acidiphilium cryptum,
Cyanobacteria/Synechococcus sp., Zymomonas mobilis, Bacteroides
thetaiotaomicron, Methanococcus maripaludis, Vibrio fischeri,
Shewanella sp., G. forsetii, Psychromonas ingrahamii, and Cytophaga
hutchinsonii. In additional embodiments, the KARI may be an
NADH-dependent KARI.
[0031] In various embodiments described herein, the isobutanol
pathway enzyme may be mutated or modified to remove an N-terminal
mitochondrial targeting sequence (MTS). Removal of the MTS can
increase cytosolic localization of the isobutanol pathway enzyme
and/or increase the cytosolic activity of the isobutanol pathway
enzyme as compared to the parental isobutanol pathway enzyme.
[0032] In some embodiments, the MTS may be modified or mutated to
reduce or eliminate its ability to target the isobutanol pathway
enzyme to the mitochondria. Selected modification of the MTS can
increase cytosolic localization of the isobutanol pathway enzyme
and/or increase the cytosolic activity of the isobutanol pathway
enzyme as compared to the parental isobutanol pathway enzyme.
[0033] In additional embodiments, the invention provides
recombinant microorganisms comprising an isobutanol pathway enzyme
fused to a peptide tag, whereby said isobutanol pathway enzyme
exhibits increased cytosolic localization and/or cytosolic activity
as compared to the parental enzyme. As a result, the recombinant
microorganism comprising the tagged isobutanol pathway enzyme will
generally exhibit an increased ability to perform a step involved
in the conversion of pyruvate to isobutanol in the cytosol. In one
embodiment, the peptide tag is non-cleavable. In another
embodiment, the peptide tag is fused at the N-terminus of the
isobutanol pathway enzyme. In another embodiment, the peptide tag
is fused at the C-terminus of the isobutanol pathway enzyme. In
certain embodiments, the peptide tag may be selected from the group
consisting of ubiquitin, ubiquitin-like (UBL) proteins, myc,
HA-tag, green fluorescent protein (GFP), and the maltose binding
protein (MBP).
[0034] In various embodiments described herein, the recombinant
microorganisms may further comprise a nucleic acid encoding a
chaperone protein, wherein said chaperone protein assists the
folding of a protein exhibiting cytosolic activity. In a preferred
embodiment, the protein exhibiting cytosolic activity is an
isobutanol pathway enzyme. In one embodiment, the chaperone may be
a native protein. In another embodiment, the chaperone protein may
be an exogenous protein. In some embodiments, the chaperone protein
may be selected from the group consisting of: endoplasmic reticulum
oxidoreductin 1 (Ero1) including variants of Ero1 that have been
suitably altered to reduce or prevent its normal localization to
the endoplasmic reticulum; thioredoxins (including, but not limited
to, Trx1 and Trx2), thioredoxin reductase (Trr1), glutaredoxins
(including, but not limited to, Grx1, Grx2, Grx3, Grx4, Grx5, Grx6,
Grx7, and Grx8), glutathione reductase (Glr1), and Jac1, including
variants of Jac1 that have been suitably altered to reduce or
prevent its normal mitochondrial localization; and homologs or
variants thereof.
[0035] In some embodiments, the recombinant microorganisms may
further comprise one or more genes encoding an iron-sulfur cluster
assembly protein. In one embodiment, the iron-sulfur cluster
assembly protein encoding genes may be derived from prokaryotic
organisms. In one embodiment, the iron-sulfur cluster assembly
protein encoding genes are derived from a bacterial organism,
including, but not limited to E. coli, L. lactis, Helicobacter
pylori, and Entamoeba histolytica. In specific embodiments, the
bacterially derived iron-sulfur cluster assembly protein encoding
genes are selected from the group consisting of cyaY, iscS, iscU,
iscA, hscB, hscA, fdx, isuX, sufA, sufB, sufC, sufD, sufS, sufE,
apbC, and homologs or variants thereof.
[0036] In another embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from eukaryotic organisms,
including, but not limited to yeasts and plants. In one embodiment,
the iron-sulfur cluster protein encoding genes are derived from a
yeast organism, including, but not limited to S. cerevisiae. In
specific embodiments, the yeast derived genes encoding iron-sulfur
cluster assembly proteins are selected from the group consisting of
Cfd1, Nbp35, Nar1, Cia1, and homologs or variants thereof. In a
further embodiment, the iron-sulfur cluster assembly protein
encoding genes may be derived from plant nuclear genes which encode
proteins translocated to chloroplast or plant genes found in the
chloroplast genome itself.
[0037] In some embodiments, one or more genes encoding an
iron-sulfur cluster assembly protein may be mutated or modified to
remove a signal peptide, whereby localization of the product of
said one or more genes to the mitochondria or other subcellular
compartment is prevented. In certain embodiments, it may be
preferable to overexpress one or more genes encoding an iron-sulfur
cluster assembly protein.
[0038] In certain embodiments described herein, it may be desirable
to reduce or eliminate the activity and/or protein levels of one
or more iron-sulfur cluster containing cytosolic proteins. In a
specific embodiment, the iron-sulfur cluster containing cytosolic
protein is 3-isopropylmalate dehydratase (Leu1p). In one
embodiment, the recombinant microorganism comprises a mutation in
the LEU1 gene resulting in the reduction of Leu1p protein levels.
In another embodiment, the recombinant microorganism comprises a
partial deletion in the LEU1 gene resulting in the reduction of
Leu1p protein levels. In another embodiment, the recombinant
microorganism comprises a complete deletion in the LEU1 gene
resulting in the reduction of Leu1p protein levels. In another
embodiment, the recombinant microorganism comprises a modification
of the regulatory region associated with the LEU1 gene resulting in
the reduction of Leu1p protein levels. In yet another embodiment,
the recombinant microorganism comprises a modification of a
transcriptional regulator for the LEU1 gene resulting in the
reduction of Leu1p protein levels.
[0039] In additional embodiments, the present invention provides
recombinant microorganisms comprising chimeric proteins consisting
of isobutanol pathway enzymes. In one embodiment, the chimeric
proteins consist of ALS and at least one additional protein. In a
specific embodiment, the additional protein is KARI. In a preferred
embodiment, the chimeric protein exhibits the biocatalytic
properties of both ALS and KARI. Such a chimeric protein allows for
an increase in the concentration of 2-acetolactate at the active
site of KARI as compared to the parental microorganism, giving the
recombinant microorganism an enhanced ability to convert
2-acetolactate to 2,3-dihydroxyisovalerate. In another embodiment,
the chimeric proteins consist of KARI and at least one additional
protein. In a specific embodiment, the additional protein is DHAD.
In a preferred embodiment, the chimeric protein exhibits the
biocatalytic properties of both KARI and DHAD. In each of the
various embodiments described herein, the proteins may be connected
via a flexible linker.
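
A chimeric protein of this kind can be pictured as two sequences joined end to end through a linker. The sketch below assembles such a fusion and records where each part sits in the result. It rests on several assumptions that are not stated in this passage: the ALS-then-KARI order, the (GGGGS)x3 linker (a commonly used flexible linker, chosen here only as an example), and the toy partner sequences, which are placeholders rather than real enzyme sequences. The names `Fusion` and `fuse` are hypothetical.

```python
from typing import NamedTuple


class Fusion(NamedTuple):
    sequence: str
    first: tuple[int, int]   # [start, end) of the N-terminal partner
    linker: tuple[int, int]  # [start, end) of the linker
    second: tuple[int, int]  # [start, end) of the C-terminal partner


def fuse(first: str, second: str, linker: str = "GGGGS" * 3) -> Fusion:
    """Concatenate first + linker + second and report each part's span."""
    a = len(first)
    b = a + len(linker)
    sequence = first + linker + second
    return Fusion(sequence, (0, a), (a, b), (b, len(sequence)))


# Placeholder sequences for illustration only.
als_toy = "MAAAK"
kari_toy = "MCCCR"
chimera = fuse(als_toy, kari_toy)
print(chimera.sequence)
print(chimera.linker)
```

Keeping the spans alongside the sequence makes it easy to recover either partner from the fusion, for example `chimera.sequence[slice(*chimera.second)]`.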
[0040] In various embodiments described herein, the recombinant
microorganisms may be engineered to express native genes that
catalyze a step in the conversion of pyruvate to isobutanol. In one
embodiment, the recombinant microorganism is engineered to increase
the activity of a native metabolic pathway gene for conversion of
pyruvate to isobutanol. In another embodiment, the recombinant
microorganism is further engineered to include at least one enzyme
encoded by an exogenous gene and at least one enzyme encoded by a
native gene. In yet another embodiment, the recombinant
microorganism comprises a reduction in the activity of a native
metabolic pathway as compared to a parental microorganism.
[0041] In another embodiment, the present invention provides
recombinant microorganisms comprising a scaffold system tethered to
one or more isobutanol pathway enzymes. In a specific embodiment,
the scaffold system is the MAP kinase scaffold (Ste5) system. In a
further embodiment, one or more of the isobutanol pathway enzymes
may be modified or mutated to comprise a protein domain allowing
for binding to the scaffold system.
[0042] In various embodiments described herein, the recombinant
microorganisms may be microorganisms of the Saccharomyces clade,
Saccharomyces sensu stricto microorganisms, Crabtree-negative yeast
microorganisms, Crabtree-positive yeast microorganisms, post-WGD
(whole genome duplication) yeast microorganisms, pre-WGD (whole
genome duplication) yeast microorganisms, and non-fermenting yeast
microorganisms.
[0043] In some embodiments, the recombinant microorganisms may be
yeast recombinant microorganisms of the Saccharomyces clade.
[0044] In some embodiments, the recombinant microorganisms may be
Saccharomyces sensu stricto microorganisms. In one embodiment, the
Saccharomyces sensu stricto is selected from the group consisting
of S. cerevisiae, S. kudriavzevii, S. mikatae, S. bayanus, S.
uvarum, S. cariocanus, and hybrids thereof.
[0045] In some embodiments, the recombinant microorganisms may be
Crabtree-negative recombinant yeast microorganisms. In one
embodiment, the Crabtree-negative yeast microorganism is classified
into a genus selected from the group consisting of Kluyveromyces,
Pichia, Hansenula, Issatchenkia, and Candida. In additional
embodiments, the Crabtree-negative yeast microorganism is selected
from Kluyveromyces lactis, Kluyveromyces marxianus, Pichia anomala,
Pichia stipitis, Hansenula anomala, Issatchenkia orientalis,
Candida utilis and Kluyveromyces waltii.
[0046] In some embodiments, the recombinant microorganisms may be
Crabtree-positive recombinant yeast microorganisms. In one
embodiment, the Crabtree-positive yeast microorganism is classified
into a genus selected from the group consisting of Saccharomyces,
Kluyveromyces, Zygosaccharomyces, Debaryomyces, Candida, Pichia and
Schizosaccharomyces. In additional embodiments, the
Crabtree-positive yeast microorganism is selected from the group
consisting of Saccharomyces cerevisiae, Saccharomyces uvarum,
Saccharomyces bayanus, Saccharomyces paradoxus, Saccharomyces
castellii, Saccharomyces kluyveri, Kluyveromyces thermotolerans,
Candida glabrata, Z. bailii, Z. rouxii, Debaryomyces hansenii,
Pichia pastoris, and Schizosaccharomyces pombe.
[0047] In some embodiments, the recombinant microorganisms may be
post-WGD (whole genome duplication) yeast recombinant
microorganisms. In one embodiment, the post-WGD yeast recombinant
microorganism is classified into a genus selected from the group
consisting of Saccharomyces and Candida. In additional embodiments,
the post-WGD yeast is selected from the group consisting of
Saccharomyces cerevisiae, Saccharomyces uvarum, Saccharomyces
bayanus, Saccharomyces paradoxus, Saccharomyces castellii, and
Candida glabrata.
[0048] In some embodiments, the recombinant microorganisms may be
pre-WGD (whole genome duplication) yeast recombinant
microorganisms. In one embodiment, the pre-WGD yeast recombinant
microorganism is classified into a genus selected from the group
consisting of Saccharomyces, Kluyveromyces, Candida, Pichia,
Debaryomyces, Hansenula, Pachysolen, Yarrowia, Issatchenkia, and
Schizosaccharomyces. In additional embodiments, the pre-WGD yeast
is selected from the group consisting of Saccharomyces kluyveri,
Kluyveromyces thermotolerans, Kluyveromyces marxianus,
Kluyveromyces waltii, Kluyveromyces lactis, Candida tropicalis,
Pichia pastoris, Pichia anomala, Pichia stipitis, Debaryomyces
hansenii, Hansenula anomala, Pachysolen tannophilus, Yarrowia
lipolytica, Issatchenkia orientalis, and Schizosaccharomyces
pombe.
[0049] In some embodiments, the recombinant microorganisms may be
microorganisms that are non-fermenting yeast microorganisms,
including, but not limited to, those classified into a genus
selected from the group consisting of Trichosporon, Rhodotorula, and
Myxozyma.
[0050] In another aspect, the present invention provides methods of
producing isobutanol using one or more recombinant microorganisms
of the invention. In one embodiment, the method includes
cultivating one or more recombinant microorganisms in a culture
medium containing a feedstock providing the carbon source until a
recoverable quantity of the isobutanol is produced and optionally,
recovering the isobutanol. In one embodiment, the microorganism is
selected to produce isobutanol from a carbon source at a yield of
at least about 5 percent theoretical. In another embodiment, the
microorganism is selected to produce isobutanol at a yield of at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent theoretical, at least about 85 percent
theoretical, or at least about 90 percent theoretical.
[0051] In one embodiment, the microorganism produces isobutanol
from a carbon source at a specific productivity of at least about
0.7 mg/L/hr per OD. In another embodiment, the microorganism
produces isobutanol from a carbon source at a specific productivity
of at least about 1 mg/L/hr per OD, at least about 10 mg/L/hr per
OD, at least about 50 mg/L/hr per OD, at least about 100 mg/L/hr
per OD, at least about 250 mg/L/hr per OD, or at least about 500
mg/L/hr per OD.
BRIEF DESCRIPTION OF DRAWINGS
[0052] Illustrative embodiments of the invention are illustrated in
the drawings, in which:
[0053] FIG. 1 illustrates an exemplary embodiment of an isobutanol
pathway.
[0054] FIG. 2 illustrates acetoin produced from GEVO 1187 (no ALS),
GEVO 2280 (B. subtilis AlsS not codon optimized), GEVO 2618 (B. subtilis
AlsS), GEVO 2621 (T. atroviride ALS) and GEVO 2622 (T. stipitatus
ALS). All acetoin values are normalized to OD.sub.600 and reported
as mM/OD.
[0055] FIG. 3 illustrates the specific activity at pH 7.5 of KARI
enzyme in whole cell lysates for GEVO1803 containing empty vector
(pGV1102), ilv5.DELTA.N47(pGV1831), ilv5.DELTA.N46(pGV1901), Full
length ILV5 (pGV1833) and E. coli ilvC codon optimized for S.
cerevisiae (pGV1824).
[0056] FIG. 4 illustrates the results from fermentations of
GEVO2107 transformed with plasmids for expression of KARI and
different DHAD homologs (shown in legend).
[0057] FIG. 5 illustrates a phylogenetic tree of 53 representative
DHAD homologs following pairwise global alignments and progressive
assembly of alignments using Neighbor-Joining phylogeny.
DETAILED DESCRIPTION
[0058] As used herein and in the appended claims, the singular
forms "a," "an," and "the" include plural referents unless the
context clearly dictates otherwise. Thus, for example, reference to
"a polynucleotide" includes a plurality of such polynucleotides and
reference to "the microorganism" includes reference to one or more
microorganisms, and so forth.
[0059] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this disclosure belongs.
Although methods and materials similar or equivalent to those
described herein can be used in the practice of the disclosed
methods and compositions, the exemplary methods, devices and
materials are described herein.
[0060] Any publications discussed above and throughout the text are
provided solely for their disclosure prior to the filing date of
the present application. Nothing herein is to be construed as an
admission that the inventors are not entitled to antedate such
disclosure by virtue of prior disclosure.
[0061] The term "microorganism" includes prokaryotic and eukaryotic
microbial species from the Domains Archaea, Bacteria and Eucarya,
the latter including yeast and filamentous fungi, protozoa, algae,
or higher Protista. The terms "microbial cells" and "microbes" are
used interchangeably with the term microorganism.
[0062] The term "genus" is defined as a taxonomic group of related
species according to the Taxonomic Outline of Bacteria and Archaea
(Garrity et al., 2007, TOBA Release 7.7, Michigan State University
Board of Trustees).
[0063] The term "species" is defined as a collection of closely
related organisms with greater than 97% 16S ribosomal RNA sequence
homology and greater than 70% genomic hybridization and
sufficiently different from all other organisms so as to be
recognized as a distinct unit.
[0064] The terms "recombinant microorganism," "modified
microorganism" and "recombinant host cell" are used interchangeably
herein and refer to microorganisms that have been genetically
modified to express or over-express endogenous polynucleotides, or
to express heterologous polynucleotides, such as those included in
a vector, or which have an alteration in expression of an
endogenous gene. By "alteration" it is meant that the expression of
the gene, or level of a RNA molecule or equivalent RNA molecules
encoding one or more polypeptides or polypeptide subunits, or
activity of one or more polypeptides or polypeptide subunits is up
regulated or down regulated, such that expression, level, or
activity is greater than or less than that observed in the absence
of the alteration. For example, the term "alter" can mean
"inhibit," but the use of the word "alter" is not limited to this
definition.
[0065] The term "expression" with respect to a gene sequence refers
to transcription of the gene and, as appropriate, translation of
the resulting mRNA transcript to a protein. Thus, as will be clear
from the context, expression of a protein results from
transcription and translation of the open reading frame sequence.
The level of expression of a desired product in a host cell may be
determined on the basis of either the amount of corresponding mRNA
that is present in the cell, or the amount of the desired product
encoded by the selected sequence. For example, mRNA transcribed
from a selected sequence can be quantitated by qRT-PCR or by
Northern hybridization (Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory Press (1989)).
Protein encoded by a selected sequence can be quantitated by
various methods, e.g., by ELISA, by assaying for the biological
activity of the protein, or by employing assays that are
independent of such activity, such as western blotting or
radioimmunoassay, using antibodies that recognize and bind the
protein. The polynucleotide generally encodes a target enzyme
involved in a metabolic pathway for producing a desired metabolite.
It is understood that the terms "recombinant microorganism" and
"recombinant host cell" refer not only to the particular
recombinant microorganism but to the progeny or potential progeny
of such a microorganism. Because certain modifications may occur in
succeeding generations due to either mutation or environmental
influences, such progeny may not, in fact, be identical to the
parent cell, but are still included within the scope of the term as
used herein.
[0066] The term "wild-type microorganism" describes a cell that
occurs in nature, i.e. a cell that has not been genetically
modified. A wild-type microorganism can be genetically modified to
express or overexpress a first target enzyme. This microorganism
can act as a parental microorganism in the generation of a
microorganism modified to express or overexpress a second target
enzyme. In turn, the microorganism modified to express or
overexpress a first and a second target enzyme can be modified to
express or overexpress a third target enzyme.
[0067] Accordingly, a "parental microorganism" functions as a
reference cell for successive genetic modification events. Each
modification event can be accomplished by introducing a nucleic
acid molecule into the reference cell. The introduction
facilitates the expression or overexpression of a target enzyme. It
is understood that the term "facilitates" encompasses the
activation of endogenous polynucleotides encoding a target enzyme
through genetic modification of e.g., a promoter sequence in a
parental microorganism. It is further understood that the term
"facilitates" encompasses the introduction of heterologous
polynucleotides encoding a target enzyme into a parental
microorganism.
[0068] The term "engineer" refers to any manipulation of a
microorganism that results in a detectable change in the
microorganism, wherein the manipulation includes but is not limited
to inserting a polynucleotide and/or polypeptide heterologous to
the microorganism and mutating a polynucleotide and/or polypeptide
native to the microorganism.
[0069] The term "mutation" as used herein indicates any
modification of a nucleic acid and/or polypeptide which results in
an altered nucleic acid or polypeptide. Mutations include, for
example, point mutations, deletions, or insertions of single or
multiple residues in a polynucleotide, which includes alterations
arising within a protein-encoding region of a gene as well as
alterations in regions outside of a protein-encoding sequence, such
as, but not limited to, regulatory or promoter sequences. A genetic
alteration may be a mutation of any type. For instance, the
mutation may constitute a point mutation, a frame-shift mutation,
an insertion, or a deletion of part or all of a gene. In addition,
in some embodiments of the modified microorganism, a portion of the
microorganism genome has been replaced with a heterologous
polynucleotide. In some embodiments, the mutations are
naturally-occurring. In other embodiments, the mutations are the
results of artificial selection pressure. In still other
embodiments, the mutations in the microorganism genome are the
result of genetic engineering.
[0070] The term "biosynthetic pathway", also referred to as
"metabolic pathway", refers to a set of anabolic or catabolic
biochemical reactions for converting one chemical species into
another. Gene products belong to the same "metabolic pathway" if
they, in parallel or in series, act on the same substrate, produce
the same product, or act on or produce a metabolic intermediate
(i.e., metabolite) between the same substrate and metabolite end
product.
[0071] The term "heterologous" as used herein with reference to
molecules and in particular enzymes and polynucleotides, indicates
molecules that are expressed in an organism other than the organism
from which they originated or are found in nature, independently of
the level of expression that can be lower, equal or higher than the
level of expression of the molecule in the native
microorganism.
[0072] On the other hand, the term "native" or "endogenous" as used
herein with reference to molecules, and in particular enzymes and
polynucleotides, indicates molecules that are expressed in the
organism in which they originated or are found in nature,
independently of the level of expression that can be lower, equal or
higher than the level of expression of the molecule in the native
microorganism. It is understood that expression of native enzymes
or polynucleotides may be modified in recombinant
microorganisms.
[0073] The term "feedstock" is defined as a raw material or mixture
of raw materials supplied to a microorganism or fermentation
process from which other products can be made. For example, a
carbon source, such as biomass or the carbon compounds derived from
biomass, is a feedstock for a microorganism that produces a biofuel
in a fermentation process. However, a feedstock may contain
nutrients other than a carbon source.
[0074] The term "substrate" or "suitable substrate" refers to any
substance or compound that is converted or meant to be converted
into another compound by the action of an enzyme. The term includes
not only a single compound, but also combinations of compounds,
such as solutions, mixtures and other materials which contain at
least one substrate, or derivatives thereof. Further, the term
"substrate" encompasses not only compounds that provide a carbon
source suitable for use as a starting material, such as any biomass
derived sugar, but also intermediate and end product metabolites
used in a pathway associated with a recombinant microorganism as
described herein.
[0075] The term "C2-compound" as used as a carbon source for
engineered yeast microorganisms with mutations in all pyruvate
decarboxylase (PDC) genes resulting in a reduction of pyruvate
decarboxylase activity of said genes refers to organic compounds
comprised of two carbon atoms, including but not limited to ethanol
and acetate.
[0076] The term "fermentation" or "fermentation process" is defined
as a process in which a microorganism is cultivated in a culture
medium containing raw materials, such as feedstock and nutrients,
wherein the microorganism converts raw materials, such as a
feedstock, into products.
[0077] The term "volumetric productivity" or "production rate" is
defined as the amount of product formed per volume of medium per
unit of time. Volumetric productivity is reported in gram per liter
per hour (g/L/h).
[0078] The term "specific productivity" or "specific production
rate" is defined as the amount of product formed per volume of
medium per unit of time per amount of cells. Specific
productivity is reported in gram or milligram per liter per hour
per OD (g/L/h/OD).
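The two productivity definitions above can be illustrated with a short calculation. The following is a minimal sketch only: the function names, the use of a single representative OD value for the whole interval, and the sample numbers are assumptions made for illustration and are not taken from the Examples.

```python
def volumetric_productivity(titer_start_g_per_l: float,
                            titer_end_g_per_l: float,
                            hours: float) -> float:
    """Product formed per volume of medium per unit time, in g/L/h."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (titer_end_g_per_l - titer_start_g_per_l) / hours


def specific_productivity(volumetric_g_per_l_h: float, od: float) -> float:
    """Volumetric productivity normalized to the amount of cells, in g/L/h/OD.

    Assumption: one representative OD value is used for the interval.
    """
    if od <= 0:
        raise ValueError("od must be positive")
    return volumetric_g_per_l_h / od


if __name__ == "__main__":
    # Hypothetical run: titer rises from 0 to 3.0 g/L over 24 h at OD 8.
    vol = volumetric_productivity(0.0, 3.0, 24.0)
    spec = specific_productivity(vol, 8.0)
    print(f"volumetric: {vol} g/L/h")
    print(f"specific:   {spec * 1000.0} mg/L/h per OD")
```

Multiplying the specific productivity by 1000 converts g/L/h/OD to the mg/L/hr per OD units used for the thresholds recited elsewhere in this application.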
[0079] The term "yield" is defined as the amount of product
obtained per unit weight of raw material and may be expressed as g
product per g substrate (g/g). Yield may be expressed as a
percentage of the theoretical yield. "Theoretical yield" is defined
as the maximum amount of product that can be generated per a given
amount of substrate as dictated by the stoichiometry of the
metabolic pathway used to make the product. For example, the
theoretical yield for one typical conversion of glucose to
isobutanol is 0.41 g/g. As such, a yield of isobutanol from glucose
of 0.39 g/g would be expressed as 95% of theoretical or 95%
theoretical yield.
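The arithmetic behind the percent-of-theoretical figure can be sketched as follows. This is an illustrative calculation, not a prescribed method: it assumes a conversion of one mole of glucose to one mole of isobutanol (an assumption consistent with the 0.41 g/g figure above), and the function names and molar-mass constants are introduced here for illustration.

```python
GLUCOSE_G_PER_MOL = 180.16     # C6H12O6
ISOBUTANOL_G_PER_MOL = 74.12   # C4H10O


def theoretical_yield_g_per_g(mol_product_per_mol_substrate: float = 1.0) -> float:
    """Maximum g isobutanol per g glucose for the assumed stoichiometry."""
    return mol_product_per_mol_substrate * ISOBUTANOL_G_PER_MOL / GLUCOSE_G_PER_MOL


def percent_of_theoretical(observed_g_per_g: float,
                           theoretical_g_per_g: float) -> float:
    """Observed yield expressed as a percentage of the theoretical yield."""
    if theoretical_g_per_g <= 0:
        raise ValueError("theoretical yield must be positive")
    return 100.0 * observed_g_per_g / theoretical_g_per_g


if __name__ == "__main__":
    print(round(theoretical_yield_g_per_g(), 2))        # 0.41
    print(round(percent_of_theoretical(0.39, 0.41)))    # 95
```

The same helper can be used to check a measured yield against any of the percent-of-theoretical thresholds recited above, provided the theoretical yield for the pathway in question is known.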
[0080] The term "titer" is defined as the strength of a solution or
the concentration of a substance in solution. For example, the
titer of a biofuel in a fermentation broth is described as g of
biofuel in solution per liter of fermentation broth (g/L).
[0081] "Aerobic conditions" are defined as conditions under which
the oxygen concentration in the fermentation medium is sufficiently
high for an aerobic or facultative anaerobic microorganism to use
as a terminal electron acceptor.
[0082] In contrast, "anaerobic conditions" are defined as
conditions under which the oxygen concentration in the fermentation
medium is too low for the microorganism to use as a terminal
electron acceptor. Anaerobic conditions may be achieved by sparging
a fermentation medium with an inert gas such as nitrogen until
oxygen is no longer available to the microorganism as a terminal
electron acceptor. Alternatively, anaerobic conditions may be
achieved by the microorganism consuming the available oxygen of the
fermentation until oxygen is unavailable to the microorganism as a
terminal electron acceptor.
[0083] "Aerobic metabolism" refers to a biochemical process in
which oxygen is used as a terminal electron acceptor to make
energy, typically in the form of ATP, from carbohydrates. Aerobic
metabolism occurs e.g. via glycolysis and the TCA cycle, wherein a
single glucose molecule is metabolized completely into carbon
dioxide in the presence of oxygen.
[0084] In contrast, "anaerobic metabolism" refers to a biochemical
process in which oxygen is not the final acceptor of electrons
contained in NADH. Anaerobic metabolism can be divided into
anaerobic respiration, in which compounds other than oxygen serve
as the terminal electron acceptor, and substrate level
phosphorylation, in which the electrons from NADH are utilized to
generate a reduced product via a "fermentative pathway."
[0085] In "fermentative pathways", NAD(P)H donates its electrons to
a molecule produced by the same metabolic pathway that produced the
electrons carried in NAD(P)H. For example, in one of the
fermentative pathways of certain yeast strains, NAD(P)H generated
through glycolysis transfers its electrons to pyruvate, yielding
ethanol. Fermentative pathways are usually active under anaerobic
conditions but may also occur under aerobic conditions, under
conditions where NADH is not fully oxidized via the respiratory
chain. For example, above certain glucose concentrations, Crabtree
positive yeasts produce large amounts of ethanol under aerobic
conditions.
[0086] The term "byproduct" means an undesired product related to
the production of a biofuel or biofuel precursor. Byproducts are
generally disposed of as waste, adding cost to a production
process.
[0087] The term "non-fermenting yeast" refers to a yeast species that
fails to demonstrate an anaerobic metabolism in which the electrons
from NADH are utilized to generate a reduced product via a
fermentative pathway such as the production of ethanol and CO.sub.2
from glucose. Non-fermentative yeast can be identified by the
"Durham Tube Test" (J. A. Barnett, R. W. Payne, and D. Yarrow.
2000. Yeasts Characteristics and Identification. 3.sup.rd edition.
p. 28-29. Cambridge University Press, Cambridge, UK) or by
monitoring the production of fermentation products such as
ethanol and CO.sub.2.
[0088] The term "polynucleotide" is used herein interchangeably
with the term "nucleic acid" and refers to an organic polymer
composed of two or more monomers including nucleotides, nucleosides
or analogs thereof, including but not limited to single stranded or
double stranded, sense or antisense deoxyribonucleic acid (DNA) of
any length and, where appropriate, single stranded or double
stranded, sense or antisense ribonucleic acid (RNA) of any length,
including siRNA. The term "nucleotide" refers to any of several
compounds that consist of a ribose or deoxyribose sugar joined to a
purine or a pyrimidine base and to a phosphate group, and that are
the basic structural units of nucleic acids. The term "nucleoside"
refers to a compound (as guanosine or adenosine) that consists of a
purine or pyrimidine base combined with deoxyribose or ribose and
is found especially in nucleic acids. The term "nucleotide analog"
or "nucleoside analog" refers, respectively, to a nucleotide or
nucleoside in which one or more individual atoms have been replaced
with a different atom or with a different functional group.
Accordingly, the term polynucleotide includes nucleic acids of any
length, DNA, RNA, analogs and fragments thereof. A polynucleotide
of three or more nucleotides is also called nucleotidic oligomer or
oligonucleotide.
[0089] It is understood that the polynucleotides described herein
include "genes" and that the nucleic acid molecules described
herein include "vectors" or "plasmids." Accordingly, the term
"gene", also called a "structural gene" refers to a polynucleotide
that codes for a particular sequence of amino acids, which comprise
all or part of one or more proteins or enzymes, and may include
regulatory (non-transcribed) DNA sequences, such as promoter
sequences, which determine for example the conditions under which
the gene is expressed. The transcribed region of the gene may
include untranslated regions, including introns, 5'-untranslated
region (UTR), and 3'-UTR, as well as the coding sequence.
[0090] The term "operon" refers to two or more genes which are
transcribed as a single transcriptional unit from a common
promoter. In some embodiments, the genes comprising the operon are
contiguous genes. It is understood that transcription of an entire
operon can be modified (i.e., increased, decreased, or eliminated)
by modifying the common promoter. Alternatively, any gene or
combination of genes in an operon can be modified to alter the
function or activity of the encoded polypeptide. The modification
can result in an increase in the activity of the encoded
polypeptide. Further, the modification can impart new activities on
the encoded polypeptide. Exemplary new activities include the use
of alternative substrates and/or the ability to function in
alternative environmental conditions.
[0091] A "vector" is any means by which a nucleic acid can be
propagated and/or transferred between organisms, cells, or cellular
components. Vectors include viruses, bacteriophage, pro-viruses,
plasmids, phagemids, transposons, and artificial chromosomes such
as YACs (yeast artificial chromosomes), BACs (bacterial artificial
chromosomes), and PLACs (plant artificial chromosomes), and the
like, that are "episomes," that is, that replicate autonomously or
can integrate into a chromosome of a host cell. A vector can also
be a naked RNA polynucleotide, a naked DNA polynucleotide, a
polynucleotide composed of both DNA and RNA within the same strand,
a poly-lysine-conjugated DNA or RNA, a peptide-conjugated DNA or
RNA, a liposome-conjugated DNA, or the like, that are not episomal
in nature, or it can be an organism which comprises one or more of
the above polynucleotide constructs such as an agrobacterium or a
bacterium.
[0092] "Transformation" refers to the process by which a vector is
introduced into a host cell. Transformation (or transduction, or
transfection), can be achieved by any one of a number of means
including chemical transformation (e.g. lithium acetate
transformation), electroporation, microinjection, biolistics (or
particle bombardment-mediated delivery), or agrobacterium mediated
transformation.
[0093] The term "enzyme" as used herein refers to any substance
that catalyzes or promotes one or more chemical or biochemical
reactions, which usually includes enzymes totally or partially
composed of a polypeptide, but can include enzymes composed of a
different molecule including polynucleotides.
[0094] The term "protein", "peptide" or "polypeptide" as used
herein indicates an organic polymer composed of two or more amino
acidic monomers and/or analogs thereof. As used herein, the term
"amino acid" or "amino acidic monomer" refers to any natural and/or
synthetic amino acids including glycine and both D or L optical
isomers. The term "amino acid analog" refers to an amino acid in
which one or more individual atoms have been replaced, either with
a different atom, or with a different functional group.
Accordingly, the term polypeptide includes amino acidic polymer of
any length including full length proteins, and peptides as well as
analogs and fragments thereof. A polypeptide of three or more amino
acids is also called a protein oligomer or oligopeptide.
[0095] The term "homolog", used with respect to an original enzyme
or gene of a first family or species, refers to distinct enzymes or
genes of a second family or species which are determined by
functional, structural or genomic analyses to be an enzyme or gene
of the second family or species which corresponds to the original
enzyme or gene of the first family or species. Most often, homologs
will have functional, structural or genomic similarities.
Techniques are known by which homologs of an enzyme or gene can
readily be cloned using genetic probes and PCR. Identity of cloned
sequences as homologs can be confirmed using functional assays
and/or by genomic mapping of the genes.
[0096] A protein has "homology" or is "homologous" to a second
protein if the amino acid sequence encoded by a gene has a similar
amino acid sequence to that of the second gene. Alternatively, a
protein has homology to a second protein if the two proteins have
"similar" amino acid sequences. (Thus, the term "homologous
proteins" is defined to mean that the two proteins have similar
amino acid sequences).
[0097] The term "analog" or "analogous" refers to nucleic acid or
protein sequences or protein structures that are related to one
another in function only and are not from common descent or do not
share a common ancestral sequence. Analogs may differ in sequence
but may share a similar structure, due to convergent evolution. For
example, two enzymes are analogs or analogous if the enzymes
catalyze the same reaction of conversion of a substrate to a
product, are unrelated in sequence, and irrespective of whether the
two enzymes are related in structure.
Cytosolically Localized Isobutanol Pathway Enzymes and Recombinant
Microorganisms Comprising the Same
[0098] Biosynthetic pathways for the production of isobutanol and
2-methyl-1-butanol by recombinant microorganisms are described by
Atsumi et al. (Atsumi et al., 2008, Nature 451: 86-89). One
strategy described herein for improving isobutanol production by
recombinant microorganisms is the localization of the enzymes
catalyzing the biosynthetic isobutanol pathway to the yeast
cytosol. Cytosolic localization of isobutanol pathway enzyme
activity is desirable, especially for the production of isobutanol,
since the ideal biocatalyst (e.g. recombinant microorganism) will
have the entire isobutanol pathway functionally expressed in the
same compartment (e.g. preferably in the cytosol). In addition,
this localization allows the pathway to utilize pyruvate and
NAD(P)H that is generated in the cytosol by glycolysis and/or the
pentose phosphate pathway without the need for transfer of these
metabolites to an alternative compartment (i.e. the mitochondria).
However, such a strategy of compartmental localization in yeast is
not feasible unless the pathway enzymes exhibit activity
in that compartment. Thus, if one or more of the cytosolically
localized pathway enzymes lacks catalytic activity in the cytosol,
high level isobutanol production will not occur. As the present
application shows in the Examples below, inefficient cytosolic
activity of one or more isobutanol pathway enzymes (e.g. DHAD or
ALS) can limit isobutanol production.
[0099] The present inventors describe herein cytosolically active
isobutanol pathway enzymes and their use in the production of
various beneficial metabolites, such as isobutanol and
2-methyl-1-butanol. Using a combination of genetic selection and
biochemical analyses, the present inventors have identified a
number of isobutanol pathway enzymes, including DHAD enzymes, that
have activity in the cytosol. Accordingly, in one aspect, the
present application describes the discovery of DHADs with enhanced
cytosolic activity and shows that these newly identified,
cytosolically active DHADs facilitate improved isobutanol
production when co-expressed in the cytosol with the remaining four
isobutanol pathway enzymes.
[0100] As shown in Example 3 below, the native DHAD of yeast is
localized to the mitochondria. Therefore, for economically viable
production of isobutanol to occur in the yeast cytosol, the
identification of heterologous DHAD enzymes that are "cytosolically
active" in yeast (i.e. "active in the cytosol" of the yeast) is
important. In addition, the present application shows that in the
absence of ALS, KARI, KIVD, and ADH which are "cytosolically
active" (i.e. "active in the cytosol") in yeast,
economically viable isobutanol production will not occur, thus
making identification of native and/or heterologous ALS, KARI,
KIVD, and ADH enzymes additionally and/or independently important
to cytosolic isobutanol production.
[0101] As used herein, the term "cytosolically active" or "active
in the cytosol" means the enzyme exhibits enzymatic activity in the
cytosol of a eukaryotic organism. Cytosolically active enzymes may
further be additionally and/or independently characterized as
enzymes that generally exhibit a specific cytosolic activity which
is greater than the specific mitochondrial activity. In certain
respects, the "cytosolically active" enzymes of the present invention
exhibit a ratio of the specific activity of the mitochondrial
fraction over the specific activity of the whole cell fraction of
less than 1, as determined by the method disclosed in Example 3
herein. Cytosolically active enzymes may further be additionally
and/or independently characterized as enzymes that, when
overexpressed, result in increased activity in the whole cell
fraction and do not result in increased activity in the
mitochondrial fraction, as determined by the method disclosed in
Example 20. Cytosolically active enzymes may further be
additionally and/or independently characterized as enzymes that,
when overexpressed as one of the five enzymes that together
comprise the five-step biosynthetic pathway for the conversion of
pyruvate to isobutanol, result in increased isobutanol production
compared to enzymes that are not cytosolically active or that are
less cytosolically active.
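The ratio criterion in the definition above can be expressed as a simple check. The sketch below is illustrative only: it assumes that specific activities (e.g. units per mg protein) for the mitochondrial fraction and the whole cell fraction have already been measured, it does not reproduce the fractionation or assay method of Example 3, and the function names and sample values are hypothetical.

```python
def specific_activity(activity_units: float, protein_mg: float) -> float:
    """Enzyme activity normalized to total protein (units per mg protein)."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return activity_units / protein_mg


def mito_to_whole_cell_ratio(mito_specific_activity: float,
                             whole_cell_specific_activity: float) -> float:
    """Specific activity of the mitochondrial fraction over that of the whole cell fraction."""
    if whole_cell_specific_activity <= 0:
        raise ValueError("whole cell specific activity must be positive")
    return mito_specific_activity / whole_cell_specific_activity


def meets_ratio_criterion(mito_specific_activity: float,
                          whole_cell_specific_activity: float) -> bool:
    """True when the ratio is less than 1, per the definition above."""
    return mito_to_whole_cell_ratio(mito_specific_activity,
                                    whole_cell_specific_activity) < 1.0


if __name__ == "__main__":
    # Hypothetical measurements, not data from the Examples.
    whole = specific_activity(activity_units=4.0, protein_mg=8.0)
    mito = specific_activity(activity_units=1.0, protein_mg=8.0)
    print(mito_to_whole_cell_ratio(mito, whole))
    print(meets_ratio_criterion(mito, whole))
```

Because the definition lists several characterizations that may apply additionally and/or independently, a ratio below 1 is only one of the indicators of cytosolic activity described above.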
[0102] As used herein, the term "cytosolically localized" or
"cytosolic localization" means the enzyme is localized in the
cytosol of a eukaryotic organism. Cytosolically localized enzymes
may further be additionally and/or independently characterized as
enzymes that exhibit a cytosolic protein level which is greater
than the mitochondrial protein level.
Identification of Cytosolically Active Isobutanol Pathway
Enzymes
[0103] In one aspect, the present invention encompasses a number of
strategies for identifying isobutanol pathway enzymes that exhibit
cytosolic activity and/or cytosolic localization, as well as
methods for modifying said isobutanol pathway enzymes to increase
their ability to exhibit cytosolic activity and/or cytosolic
localization.
[0104] In various embodiments described herein, the isobutanol
pathway enzymes may be derived from a prokaryotic organism. In
alternative embodiments described herein, the isobutanol pathway
enzyme may be derived from a eukaryotic organism. In one
embodiment, the eukaryotic organism is a fungal organism. As
described herein, the present inventors have found that in general,
an enzyme from a fungal source is more likely to show activity in
yeast than a bacterial enzyme expressed in yeast. In addition,
homologs that are normally expressed in the cytosol are desired, as
a normally cytoplasmic enzyme is likely to show higher activity in
the cytosol as compared to an enzyme that is relocalized to the
cytosol from other organelles, such as the mitochondria. Fungal
homologs of various isobutanol pathway enzymes are often localized
to the mitochondria. The present inventors have found that fungal
homologs of isobutanol pathway enzymes that are cytosolically
localized will generally be expected to exhibit higher activity in
the cytosol of yeast than those of wild-type yeast strains. Thus,
in one embodiment, the present invention provides fungal isobutanol
pathway enzyme homologs that are cytosolically active and/or
cytosolically localized.
Dihydroxyacid Dehydratase (DHAD)
[0105] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is a dihydroxyacid
dehydratase (DHAD). In accordance with this embodiment, the present
invention provides cytosolically active dihydroxyacid dehydratases
(DHADs) and further describes methods for their use in the
production of various beneficial metabolites, such as isobutanol
and 2-methyl-1-butanol. As noted above, biosynthetic pathways for
the production of isobutanol and 2-methyl-1-butanol have been
described (Atsumi et al., 2008, Nature 451: 86-89). In these
biosynthetic pathways, DHAD catalyzes the conversion of
2,3-dihydroxyisovalerate to 2-ketoisovalerate, and
2,3-dihydroxy-3-methylvalerate to 2-keto-3-methylvalerate,
respectively. Using a combination of genetic selection and
biochemical analyses, the present inventors have identified a
number of DHAD homologs that have activity in the cytosol.
[0106] Among the many strategies for identifying cytosolically
active DHADs, the present inventors performed multiway-protein
alignments between several DHAD homologs. Using this analysis, the
present inventors identified a protein motif that was surprisingly
unique to a subset of DHAD homologs exhibiting cytosolic
activity. This protein motif, P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27),
was found in DHAD homologs demonstrating cytosolic activity in
yeast. Therefore, in one embodiment, the present invention provides
DHAD enzymes comprising the amino acid sequence P(I/L)XXXGX(I/L)XIL
(SEQ ID NO: 27), wherein X is any natural or non-natural amino
acid, and wherein said DHAD enzyme exhibits the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. DHAD
enzymes harboring this sequence include those derived from L.
lactis (SEQ ID NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria
bacterium Ellin345 (SEQ ID NO: 16), Saccharopolyspora erythraea
(SEQ ID NO: 19), Yarrowia lipolytica (SEQ ID NO: 13), Francisella
tularensis (SEQ ID NO: 14), Arabidopsis thaliana (SEQ ID NO: 15),
Thermotoga petrophila (SEQ ID NO: 10), and Victivallis vadensis
(SEQ ID NO: 11). Also encompassed herein are DHAD enzymes that
comprise a motif that is at least about 70% similar, at least about
80% similar, or at least about 90% similar to the motif shown in
SEQ ID NO: 27.
[0107] As described herein, an even more specific version of this
motif has been identified by the present inventors. Thus, in a
further embodiment, the present invention provides DHAD enzymes
comprising the amino acid sequence PIKXXGX(I/L)XIL (SEQ ID NO: 28),
wherein X is any natural or non-natural amino acid, and wherein
said DHAD enzyme exhibits the ability to convert
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. DHAD
enzymes harboring this sequence include those derived from L.
lactis (SEQ ID NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria
bacterium Ellin345 (SEQ ID NO: 16), Y. lipolytica (SEQ ID NO: 13),
F. tularensis (SEQ ID NO: 14), A. thaliana (SEQ ID NO: 15), T.
petrophila (SEQ ID NO: 10), and V. vadensis (SEQ ID NO: 11). Also
encompassed herein are DHAD enzymes that comprise a motif that is
at least about 70% similar, at least about 80% similar, or at least
about 90% similar to the motif shown in SEQ ID NO: 28.
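By way of illustration only, the two motifs recited above can be
written as regular expressions and used to scan a candidate amino
acid sequence. This is a minimal sketch, not the alignment method
used by the present inventors: X is rendered as "any character",
(I/L) as a two-residue class, and the peptide fragments below are
made up for the example and are not real DHAD sequences.

```python
import re
from typing import List

# X (any residue) is written as "."; (I/L) becomes the class [IL].
MOTIF_SEQ_ID_27 = re.compile(r"P[IL].{3}G.[IL].IL")  # P(I/L)XXXGX(I/L)XIL
MOTIF_SEQ_ID_28 = re.compile(r"PIK.{2}G.[IL].IL")    # PIKXXGX(I/L)XIL


def find_motif(sequence: str, pattern: re.Pattern) -> List[int]:
    """Return 0-based start positions of every match, including overlaps."""
    lookahead = re.compile(f"(?={pattern.pattern})")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Made-up peptide fragments, for illustration only.
toy_general = "MKTAPLAAAGSIDILRR"   # fits SEQ ID NO: 27 only
toy_specific = "MKTAPIKAAGSLDILRR"  # fits both motifs

print(find_motif(toy_general, MOTIF_SEQ_ID_27))
print(find_motif(toy_general, MOTIF_SEQ_ID_28))
print(find_motif(toy_specific, MOTIF_SEQ_ID_28))
```

Any sequence matching the more specific pattern necessarily matches
the general one, which mirrors the relationship between SEQ ID NO:
28 and SEQ ID NO: 27 described above.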
[0108] As noted above, one such cytosolically active DHAD
identified herein is exemplified by the L. lactis DHAD amino acid
sequence of SEQ ID NO: 18, which is encoded by the L. lactis ilvD
gene. As described herein, the present inventors have discovered
that yeast strains expressing the cytosolically active L. lactis
ilvD (DHAD) exhibit higher isobutanol production than yeast strains
expressing the S. cerevisiae ILV3 (DHAD), even when the ILV3 from
S. cerevisiae is truncated at its N-terminus to remove a putative
mitochondrial targeting sequence. In addition to the use and
identification of the cytosolically active DHAD homolog from L.
lactis, the present invention encompasses a number of different
strategies for identifying DHAD enzymes that exhibit cytosolic
activity and/or cytosolic localization, as well as methods for
modifying DHADs to increase their ability to exhibit cytosolic
activity and/or cytosolic localization.
[0109] In various embodiments described herein, the DHAD enzymes
may be derived from a prokaryotic organism. In one embodiment, the
prokaryotic organism is a bacterial organism. In another
embodiment, the bacterial organism is L. lactis. In a specific
embodiment, the DHAD enzyme from L. lactis comprises the amino acid
sequence of SEQ ID NO: 18. In other embodiments, the bacterial
organisms are of the genera Lactococcus, Gramella, Acidobacteria,
Francisella, Thermotoga and Victivallis.
[0110] In alternative embodiments, the DHAD enzyme may be derived
from a eukaryotic organism. In one embodiment, the eukaryotic
organism is a fungal organism. In an exemplary embodiment, the
fungal organism is Neurospora crassa. In a specific embodiment, the
DHAD enzyme from N. crassa comprises the amino acid sequence of SEQ
ID NO: 165.
[0111] As described herein, the present inventors have found that
in general, an enzyme from a fungal source is more likely to show
activity in yeast than a bacterial enzyme expressed in yeast. In
addition, homologs that are normally expressed in the cytosol are
desired, as a normally cytoplasmic enzyme is likely to show higher
activity in the cytosol as compared to an enzyme that is
relocalized to the cytosol from other organelles, such as the
mitochondria. Fungal homologs of various isobutanol pathway
enzymes, including DHAD, are often localized to the mitochondria.
The present inventors have found that fungal homologs of DHAD that
are cytosolically localized will generally be expected to exhibit
higher activity in the cytosol of yeast than those of wild-type
yeast strains. Thus, in one embodiment, the present invention
provides fungal DHAD homologs that are cytosolically active and/or
cytosolically localized.
[0112] In another embodiment, the eukaryotic organism is a yeast
organism. In another embodiment, the eukaryotic organism is
selected from the group consisting of the genera Entamoeba and
Giardia.
[0113] In various embodiments described herein, the recombinant
microorganism may exhibit at least about 5 percent greater
dihydroxyacid dehydratase (DHAD) activity in the cytosol as
compared to the parental microorganism. In another embodiment, the
recombinant microorganism may exhibit at least about 10 percent, at
least about 15 percent, at least about 20 percent, at least
about 25 percent, at least about 30 percent, at least about 35
percent, at least about 40 percent, at least about 45 percent, at
least about 50 percent, at least about 55 percent, at least about
60 percent, at least about 65 percent, at least about 70 percent,
at least about 75 percent, at least about 80 percent, at least
about 100 percent, at least about 200 percent, or at least about
500 percent greater dihydroxyacid dehydratase (DHAD) activity in
the cytosol as compared to the parental microorganism.
[0114] In another embodiment, the present invention provides DHAD
enzymes that, when overexpressed in yeast, result in increased
activity in the whole cell fraction and do not result in increased
activity in the mitochondrial fraction. In one embodiment, the DHAD
activity in the whole cell fraction is increased by at least about
2-fold. In another embodiment, DHAD activity in the whole cell
fraction is increased by at least about 5-fold. In yet another
embodiment, DHAD activity in the whole cell fraction is increased
by at least about 7-fold. In yet another embodiment, DHAD activity
in the whole cell fraction is increased by at least about 10-fold.
In yet another embodiment, DHAD activity in the whole cell fraction
is increased by at least about 50-fold. In yet another embodiment,
DHAD activity in the whole cell fraction is increased by at least
about 100-fold.
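By way of illustration only, the percent and fold thresholds
recited above reduce to simple arithmetic on measured activities.
The helper names and numerical values below are hypothetical. The
strict comparison used for the mitochondrial fraction is a
simplifying assumption; with real measurements, a tolerance
reflecting assay variability would be needed.

```python
def percent_increase(recombinant: float, parental: float) -> float:
    """Percent by which the recombinant activity exceeds the parental activity."""
    return (recombinant - parental) / parental * 100.0


def fold_increase(overexpressed: float, control: float) -> float:
    """Fold change of the overexpressing strain relative to the control."""
    return overexpressed / control


def whole_cell_only_increase(wc_control: float, wc_over: float,
                             mito_control: float, mito_over: float,
                             min_fold: float = 2.0) -> bool:
    """True if whole-cell activity rises by min_fold and mitochondrial does not rise."""
    rises_in_whole_cell = fold_increase(wc_over, wc_control) >= min_fold
    flat_in_mitochondria = mito_over <= mito_control  # assumption: no tolerance
    return rises_in_whole_cell and flat_in_mitochondria


# Hypothetical activities (arbitrary units), for illustration only.
print(percent_increase(1.5, 1.0))
print(whole_cell_only_increase(2.0, 10.0, 1.0, 1.0, min_fold=5.0))
```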
Acetolactate Synthase (ALS)
[0115] As described herein, the isobutanol pathway enzymes in
addition to DHAD should preferably be active in the cytosol. These
cytosolically active isobutanol pathway enzymes will generally
exhibit enzymatic activity in the cytosol. For instance, a
cytosolically active ALS should generally exhibit the ability to
convert two pyruvate molecules to acetolactate in the cytosol. Thus, in various
embodiments described herein, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is acetolactate
synthase (ALS). In yeasts such as S. cerevisiae, the native
acetolactate synthase, encoded in S. cerevisiae by the ILV2 gene,
is naturally expressed in the yeast mitochondria. Unlike the
endogenous acetolactate synthase of yeast, expression of
heterologous acetolactate synthases such as the B. subtilis alsS
and the L. lactis alsS in yeast occurs in the yeast cytosol (i.e.
cytosolically-localized). Thus, cytosolic expression of
acetolactate synthase is achieved by transforming a yeast with a
gene encoding an acetolactate synthase protein (EC 2.2.1.6).
[0116] ALS homologs that could be cytosolically expressed and
localized in yeast are predicted to lack a mitochondrial targeting
sequence as analyzed using mitoprot (Claros et al., 1996, Eur. J.
Biochem 241: 779-86). Such cytosolically localized ALS proteins can
be used as the first step in the isobutanol pathway. ALS homologs
include, but are not limited to, the following: the Serratia
marcescens ALS (GenBank Accession No. ADH43113.1) (probability of
mitochondrial localization 0.07), the Enterococcus faecalis ALS
(GenBank Accession No. NP_814940) (probability of
mitochondrial localization 0.21), the Leuconostoc mesenteroides ALS
(GenBank Accession No. YP_818010.1) (probability of
mitochondrial localization 0.21), the Staphylococcus aureus ALS
(GenBank Accession No. YP_417545) (probability of
mitochondrial localization 0.13), the Burkholderia cenocepacia ALS
(GenBank Accession No. YP_624435) (probability of
mitochondrial localization 0.15), the T. atroviride ALS (SEQ ID NO:
71) (probability of mitochondrial localization 0.19), the T.
stipitatus ALS (SEQ ID NO: 72) (probability of mitochondrial
localization 0.19), and the Magnaporthe grisea ALS (GenBank
Accession No. EDJ99221) (probability of mitochondrial localization
0.02), a homolog or variant of any of the foregoing, and a
polypeptide having at least 60% identity to any one of the foregoing
and exhibiting cytosolic ALS activity.
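By way of illustration only, the predicted probabilities of
mitochondrial localization listed above can be tabulated and ranked
so that the homologs least likely to be imported into the
mitochondria appear first. The probabilities are those recited in
this paragraph; the cutoff value is an assumption chosen for the
example and is not a threshold stated herein.

```python
# Probability of mitochondrial localization, as listed in the text above.
MITOPROT_PROBABILITY = {
    "Serratia marcescens": 0.07,
    "Enterococcus faecalis": 0.21,
    "Leuconostoc mesenteroides": 0.21,
    "Staphylococcus aureus": 0.13,
    "Burkholderia cenocepacia": 0.15,
    "T. atroviride": 0.19,
    "T. stipitatus": 0.19,
    "Magnaporthe grisea": 0.02,
}


def rank_by_cytosolic_likelihood(probabilities, cutoff=0.5):
    """Keep entries below the cutoff, lowest mitochondrial probability first."""
    kept = [(name, p) for name, p in probabilities.items() if p < cutoff]
    return sorted(kept, key=lambda item: item[1])


# The 0.5 default cutoff is an illustrative assumption only.
for name, probability in rank_by_cytosolic_likelihood(MITOPROT_PROBABILITY):
    print(f"{name}: {probability:.2f}")
```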
[0117] In one embodiment, the cytosolically active ALS is derived
from a prokaryotic organism, including, but not limited to B.
subtilis or L. lactis, which exhibit cytosolic activity. In another
embodiment, the ALS may be derived from a eukaryotic organism,
including, but not limited to M. grisea, P. nodorum, T. stipitatus,
and T. atroviride.
[0118] In some embodiments, an ALS enzyme that is predicted to be
mitochondrially localized may be mutated or modified to remove or
modify an N-terminal mitochondrial targeting sequence (MTS) to
reduce or eliminate its ability to target the ALS enzyme to the
mitochondria. Removal of the MTS can increase cytosolic
localization of the ALS and/or increase the cytosolic activity of
the ALS as compared to the parental ALS.
[0119] The conversion of two pyruvate molecules to acetolactate can
be carried out by either an acetohydroxyacid synthase (AHAS) or an
acetolactate synthase (ALS). AHASs are involved in biosynthesis of
branched chain amino acids in the mitochondria of yeasts. They are
FAD-dependent and are feedback inhibited by branched chain amino
acids. ALSs are catabolic and are involved in the conversion of
pyruvate to acetoin. ALSs are FAD-independent and not feedback
inhibited by branched chain amino acids. In addition, ALSs are
specific for the conversion of two pyruvates to acetolactate.
Therefore, ALSs are favored over AHASs. In addition, in the case of
yeast, AHASs are normally mitochondrial, therefore a fungal ALS
that is cytoplasmic is favored. Sequence analysis has shown that
there is a conserved sequence `RFDDR` found in AHASs that is not
conserved among ALSs (Le et al., 2005, Bull. Korean Chem Soc 26:
916-20). This sequence is likely involved in FAD-binding by AHASs
and thus could be used to distinguish between the FAD-dependent
AHASs and the FAD-independent ALSs. Using this region to
distinguish between AHASs and ALSs, BLAST searches of fungal
sequence databases were performed and resulted in the
identification of ALS homologs from several fungal species (M.
grisea, P. nodorum, T. atroviride, T. stipitatus, P. marneffei, and
Glomerella graminicola). Of these sequences, the ALS homologs from
M. grisea, P. nodorum, T. stipitatus, and T. atroviride will
generally be expected to be cytosolically localized.
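By way of illustration only, the use of the `RFDDR` sequence to
separate FAD-dependent AHASs from FAD-independent ALSs can be
sketched as a simple substring test. This is a simplification: it
looks for an exact match anywhere in the sequence, whereas an
actual analysis would examine the aligned region and tolerate
conservative substitutions. The sequences below are made up for the
example.

```python
FAD_BINDING_MOTIF = "RFDDR"  # conserved in AHASs per the text above


def classify_synthase(sequence: str) -> str:
    """Label a sequence 'AHAS-like' if it contains RFDDR, else 'ALS-like'."""
    if FAD_BINDING_MOTIF in sequence.upper():
        return "AHAS-like"
    return "ALS-like"


# Made-up fragments for illustration; these are not real enzyme sequences.
toy_sequences = {
    "toy_with_motif": "MSTGLVRFDDRVTGK",
    "toy_without_motif": "MSTGLVKYEDAVTGK",
}
for name, seq in toy_sequences.items():
    print(name, classify_synthase(seq))
```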
[0120] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater acetolactate synthase (ALS)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater acetolactate
synthase (ALS) activity in the cytosol as compared to the parental
microorganism.
Ketol-Acid Reductoisomerase (KARI)
[0121] In additional embodiments, at least one of the isobutanol
pathway enzymes exhibiting cytosolic activity is a ketol-acid
reductoisomerase (KARI). A cytosolically active KARI should
generally exhibit the ability to convert acetolactate to
2,3-dihydroxyisovalerate in the cytosol.
[0122] In one embodiment, the KARI is derived from a prokaryotic
organism, including, but not limited to Escherichia coli, B.
subtilis or L. lactis.
[0123] In another embodiment, the KARI is derived from a eukaryotic
organism, including, but not limited to Piromyces sp. E2, S.
cerevisiae, and Arabidopsis. Fungal homologs of KARI are generally
mitochondrially localized. The present inventors have identified a
fungal homolog from the anaerobic ruminal fungus, Piromyces sp. E2,
that is cytosolically localized.
[0124] In certain specific embodiments, the KARI comprises an amino
acid sequence selected from the group consisting of E. coli
(GenBank No: NP_418222, SEQ ID NO: 1), S. cerevisiae (GenBank
No: NP_013459, SEQ ID NO: 2), and B. subtilis (GenBank No:
CAB14789) and the KARI enzymes from Piromyces sp. E2 (GenBank No:
CAA76356), B. aphidicola (GenBank No: AAF13807), S. oleracea
(GenBank No: CAA40356), O. sativa (GenBank No: NP_001056384,
SEQ ID NO: 3), C. reinhardtii (GenBank No: XP_001702649, SEQ
ID NO: 6), N. crassa (GenBank No: XP_961335), S. pombe
(GenBank No: NP_001018845), L. bicolor (GenBank No:
XP_001880867), I. hospitalis (GenBank No:
YP_001435197), P. torridus (GenBank No: YP_023851, SEQ
ID NO: 7), A. cryptum (GenBank No: YP_001235669, SEQ ID NO:
5), Cyanobacteria/Synechococcus sp. (GenBank No: YP_473733),
Z. mobilis (GenBank No: YP_162876, SEQ ID NO: 8), B.
thetaiotaomicron (GenBank No: NP_810987), M. maripaludis
(GenBank No: YP_001097443, SEQ ID NO: 4), V. fischeri
(GenBank No: YP_205911), Shewanella sp. (GenBank No:
YP_732498.1), G. forsetii (GenBank No: YP_862142), P.
ingrahamii (GenBank No: YP_942294), and C. hutchinsonii
(GenBank No: YP_677763), a homolog or variant of any of the
foregoing, and a polypeptide having at least 60% identity to any one
of the foregoing and exhibiting cytosolic KARI activity.
[0125] In additional embodiments, the KARI may be an NADH-dependent
KARI. Thus, in one embodiment, the present invention provides
recombinant microorganisms in which the NADPH-dependent enzyme
KARI is replaced with an enzyme that preferentially depends on NADH
(i.e. a KARI that is NADH-dependent). In one embodiment, such
enzymes may be identified in nature. In an alternative embodiment,
such enzymes may be generated by protein engineering techniques
including but not limited to directed evolution or site-directed
mutagenesis. NADH-dependent KARIs useful in various methods of the
present invention are described in commonly owned and co-pending
applications U.S. Ser. No. 12/610,784 and PCT/US09/62952 (published
as WO/2010/051527), which are herein incorporated by reference in
their entireties for all purposes.
[0126] In one embodiment, a microorganism is provided in which
cofactor usage is balanced during the production of a fermentation
product and the microorganism produces the fermentation product at
a higher yield compared to a modified microorganism in which the
cofactor usage is not balanced. In another embodiment of the
present invention, a microorganism is provided in which the
cofactor usage is balanced during the production of isobutanol and
the microorganism produces isobutanol at a higher yield compared to
a modified microorganism in which the cofactor usage is not
balanced. Methods for achieving co-factor balance are described in
commonly owned and co-pending applications U.S. Ser. No. 12/610,784
and PCT/US09/62952 (published as WO/2010/051527), which are herein
incorporated by reference in their entireties for all purposes.
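By way of illustration only, cofactor balance can be expressed as
simple bookkeeping of reducing equivalents per glucose. The sketch
below rests on assumptions that are not recited in this paragraph:
that glycolysis yields two NADH per glucose, and that the KARI and
ADH steps each consume one reduced cofactor per isobutanol formed.
The cofactor preference of each enzyme is passed in as a parameter
rather than asserted for any particular enzyme.

```python
from collections import Counter

# Assumption: glycolysis yields 2 NADH per glucose (2 pyruvate formed).
GLYCOLYSIS_SUPPLY = Counter({"NADH": 2})


def pathway_demand(kari_cofactor: str, adh_cofactor: str) -> Counter:
    """Reduced cofactors consumed per isobutanol by the KARI and ADH steps."""
    demand = Counter()
    demand[kari_cofactor] += 1
    demand[adh_cofactor] += 1
    return demand


def net_balance(kari_cofactor: str, adh_cofactor: str) -> dict:
    """Supply minus demand for each cofactor; all zeros means balanced."""
    demand = pathway_demand(kari_cofactor, adh_cofactor)
    cofactors = sorted(set(GLYCOLYSIS_SUPPLY) | set(demand))
    return {c: GLYCOLYSIS_SUPPLY[c] - demand[c] for c in cofactors}


print(net_balance("NADPH", "NADPH"))  # surplus NADH, deficit NADPH
print(net_balance("NADH", "NADH"))    # balanced
```

Under these assumptions, replacing NADPH-dependent steps with
NADH-dependent ones brings every entry to zero, which is the sense
in which cofactor usage is described as balanced.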
[0127] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater ketol-acid reductoisomerase (KARI)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater ketol-acid
reductoisomerase (KARI) activity in the cytosol as compared to the
parental microorganism.
Keto-Acid Decarboxylase (KIVD)
[0128] A cytosolically active KIVD should generally exhibit the
ability to convert ketoisovalerate to isobutyraldehyde in the
cytosol. In one embodiment, the cytosolically active KIVD is
derived from a prokaryotic organism, including, but not limited to
L. lactis, which exhibits cytosolic activity. In a specific
embodiment, the KIVD enzyme from L. lactis comprises the amino acid
sequence of SEQ ID NO: 173. In additional embodiments, the
cytosolically active KIVD is derived from, for example,
Enterobacter cloacae (Accession No. P23234.1), Mycobacterium
smegmatis (Accession No. A0R480.1), Mycobacterium tuberculosis
(Accession No. O53865.1), Mycobacterium avium (Accession No.
Q742Q2.1), Azospirillum brasilense (Accession No. P51852.1), B.
subtilis (see Oku et al., 1988, J. Biol. Chem. 263: 18386-96), a
homolog or variant of any of the foregoing, and a polypeptide
having at least 60% identity to any one of the foregoing and
exhibiting cytosolic KIVD activity.
[0129] In an alternative embodiment, the KIVD may be derived from
a eukaryotic organism.
[0130] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater 2-keto-acid decarboxylase (KIVD)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater 2-keto-acid
decarboxylase (KIVD) activity in the cytosol as compared to the
parental microorganism.
Alcohol Dehydrogenase (ADH)
[0131] A cytosolically active ADH (used interchangeably herein with
isobutanol dehydrogenase, "IDH") should generally exhibit the
ability to convert isobutyraldehyde to isobutanol in the cytosol.
In one embodiment, the cytosolically active ADH is derived from a
prokaryotic organism, including, but not limited to L. lactis. In a
specific embodiment, the ADH enzyme from L. lactis comprises the
amino acid sequence of SEQ ID NO: 175. In additional embodiments,
the ADH is derived from, for example, Lactobacillus brevis
(Accession No. YP_794451.1), Pediococcus acidilactici
(Accession No. ZP_06197454.1), Bacillus cereus (Accession No.
YP_001374103.1), Bacillus thuringiensis (Accession No.
ZP_04101989.1), Leptotrichia goodfellowii (Accession No.
ZP_06011170.1), Actinobacillus pleuropneumoniae (Accession
No. ZP_00134308.2), Streptococcus sanguinis (Accession No.
YP_001035842.1), Eikenella corrodens (Accession No.
ZP_03713785.1), Exiguobacterium sp. (Accession No.
YP_002886170.1), Neisseria elongata (Accession No.
ZP_06736067.1), E. coli (Accession No. ZP_06937530.1),
Neisseria meningitidis (Accession No. CBA03965.1), Erwinia
pyrifoliae (Accession No. CAY75147.1), and Colwellia
psychrerythraea (Accession No. YP_270515.1), a homolog or
variant of any of the foregoing, and a polypeptide having at least
60% identity to any one of the foregoing and having cytosolic ADH
activity.
[0132] In an alternative embodiment, the ADH may be derived from a
eukaryotic organism, including, but not limited to S. cerevisiae
and D. melanogaster. In a specific embodiment, the ADH enzyme from
S. cerevisiae is Adh7. In another specific embodiment, the ADH
enzyme from D. melanogaster comprises the amino acid sequence of
SEQ ID NO: 176.
[0133] In one embodiment, the recombinant microorganism may exhibit
at least about 5 percent greater alcohol dehydrogenase (ADH)
activity in the cytosol as compared to the parental microorganism.
In another embodiment, the recombinant microorganism may exhibit at
least about 10 percent, at least about 15 percent, at least
about 20 percent, at least about 25 percent, at least about 30
percent, at least about 35 percent, at least about 40 percent, at
least about 45 percent, at least about 50 percent, at least about
55 percent, at least about 60 percent, at least about 65 percent,
at least about 70 percent, at least about 75 percent, at least
about 80 percent, at least about 100 percent, at least about 200
percent, or at least about 500 percent greater alcohol
dehydrogenase (ADH) activity in the cytosol as compared to the
parental microorganism.
Chimeric Isobutanol Pathway Enzymes
[0134] In another aspect, the present invention provides
recombinant microorganisms comprising chimeric proteins consisting
of isobutanol pathway enzymes. In one embodiment, the chimeric
proteins consist of ALS and at least one additional protein. In a
specific embodiment, the additional protein is KARI. In a preferred
embodiment, the chimeric protein exhibits the biocatalytic
properties of both ALS and KARI. A chimeric protein that
incorporates the activities of both ALS and KARI will generally be
expected to reduce the effect of diffusion and to decrease the time
available for spontaneous decomposition of 2-acetolactate to occur.
Using a flexible linker and/or structural and sequence information
to create a protein with the biocatalytic properties of both ALS
and KARI will generally increase the concentration of
2-acetolactate at the active site of KARI, causing 2-acetolactate
to be converted to 2,3-dihydroxyisovalerate near its theoretical
maximum (very little effect of diffusion). The total concentration
of 2-acetolactate should thus remain low, correspondingly
decreasing its spontaneous decomposition. This will generally have
the effect of increasing the rate of conversion of 2-acetolactate
to 2,3-dihydroxyisovalerate.
[0135] In another embodiment, the chimeric proteins consist of KARI
and at least one additional protein. In a specific embodiment, the
additional protein is DHAD. In a preferred embodiment, the chimeric
protein exhibits the biocatalytic properties of both KARI and DHAD.
In each of the various embodiments described herein, the proteins
may be connected via a flexible linker.
Isobutanol Pathway Enzymes Attached to a Protein Scaffold
[0136] In another aspect, the present invention provides
recombinant microorganisms comprising a scaffold system tethered to
one or more isobutanol pathway enzymes. In a specific embodiment,
the scaffold system is the MAP kinase scaffold (Ste5) system. In a
further embodiment, one or more of the isobutanol pathway enzymes
may be modified or mutated to comprise a protein domain allowing
for binding to the scaffold system.
[0137] The present inventors have found that via the use of a
protein scaffold, the isobutanol pathway enzymes that act in
concert as part of a single pathway can be co-localized. In some
embodiments, the scaffold systems are adapted for binding to the
isobutanol pathway enzymes. By tethering the enzymes that work
together in the pathway to a scaffold protein, they are brought
into close physical proximity with each other, thus increasing the
efficiency of the isobutanol production.
[0138] There are several advantages to keeping pathway enzymes
together on a scaffold system. One is that proteins that normally
would localize to an intracellular compartment, like the
mitochondria, are partitioned onto the scaffold, thus keeping a
sizeable portion of the protein population in the cytosol. Another
is that the chemical products of each enzyme are physically close to
the next enzyme in the pathway, which speeds reaction time and
decreases the possibility that the product would be used in a
competing pathway. Finally, unstable products of the enzymes would
be used more quickly, since the next enzyme in the pathway would be
adjacent to use it as a substrate, thus decreasing nonproductive
degradation of the product.
[0139] In a preferred embodiment, the isobutanol pathway enzymes
are arranged in the sequence in which they are needed to function
(i.e. ALS followed by KARI followed by DHAD followed by KIVD
followed by ADH). In another embodiment, the scaffolded protein
complex is targeted to the cytosol by adding localization signals
to the scaffold. In yet another embodiment, the scaffolded protein
complex is targeted to the cell wall by adding localization signals
to the scaffold. As would be understood by one of skill in the art,
the scaffold system allows for co-localization of proteins or
enzymes in addition to the isobutanol pathway enzymes. Such
proteins may include chaperone proteins, proteins for the
conversion of xylose to xylulose-5P, cellulases, etc.
Removal and/or Modification of N-Terminal Mitochondrial Targeting
Sequences
[0140] The localization of the enzymes involved in production of
isobutanol is desired to be cytosolic. Cytosolic localization
allows for the pathway to utilize pyruvate and NAD(P)H that are
generated in the cytosol by glycolysis and/or the pentose phosphate
pathway without the need for the transfer of these metabolites to
an alternative compartment (i.e. mitochondria). However, the yeast
enzymes acetohydroxyacid synthase (AHAS; Ilv2+Ilv6), ketol-acid
reductoisomerase (KARI; Ilv5), and dihydroxyacid dehydratase (DHAD;
Ilv3) that carry out the first three steps of isobutanol production
are physiologically localized to the mitochondria. Mitochondrial
matrix proteins are typically targeted to the mitochondria by an
N-terminal mitochondrial targeting sequence (MTS), which is then
cleaved off in the mitochondria resulting in the `mature` form of
the enzyme (Paschen et al., 2001, IUBMB Life 52: 101-112). Indeed,
the N-terminal targeting sequence for Ilv6 has been defined (Pang
et al., 1999, Biochemistry 38: 5222-31). N-terminal deletions of
Ilv5 have also been shown to re-localize this enzyme to the cytosol
(Omura, 2008, Appl. Microbiol. Biotechnol. 78: 503-513; See also
Omura, WO/2009/078108 A1, hereby incorporated by reference in its
entirety).
[0141] One mechanism identified by the present inventors for the
cytosolic localization of isobutanol pathway enzymes involves the
removal and/or modification of N-terminal mitochondrial targeting
sequences (MTS). Nuclear genome-encoded proteins destined to reside
in the mitochondria often contain an N-terminal Mitochondrial
Targeting Sequence (MTS) that is recognized by a set of proteins
collectively known as mitochondrial import machinery. Following
recognition and import, the MTS is then physically cleaved off of
the imported protein. In eukaryotes, homologs of two of the
isobutanol pathway enzymes, ketol-acid reductoisomerase (KARI, e.g.
S. cerevisiae Ilv5) and dihydroxy acid dehydratase (DHAD, e.g. S.
cerevisiae Ilv3), are predicted to be mitochondrial, based upon the
presence of an N-terminal MTS as well as several in vivo functional
and mutational studies (See e.g., Omura, F., 2008, Appl Gen &
Mol Biot 78: 503-513). As described herein, the present inventors
have designed isobutanol pathway enzymes, whereby the predicted MTS
is removed or modified. In some instances, there exists
experimental evidence for the length of the MTS. Specifically, the
MTS of Ilv6 has been experimentally defined to be the N-terminal 61
amino acids (Pang et al., 1999, Biochemistry 38: 5222-31). The MTS
of Ilv5 has been reported to be the N-terminal 47 residues (Kassow
A., 1992, "Metabolic effects of deleting the region encoding the
transit peptide in Saccharomyces cerevisiae ILV5" PhD thesis,
University of Copenhagen). In addition, the deletion of the
N-terminal 46 amino acids of Ilv5 has been shown to result in an
active enzyme that is localized in the cytosol (Omura, F., 2008,
Appl Gen & Mol Biot 78: 503-513).
[0142] As described herein, the present inventors utilize deletions
and/or modifications of the N-terminal MTS to localize the enzymes
of the isobutanol pathway to the cytosol. In various embodiments,
the MTS can be entirely or partly deleted or its sequence can be
modified to eliminate its ability to target the protein to the
mitochondria. A benefit of removing the entire MTS is that the
resulting protein would essentially be the `mature` form of the
enzyme. The use of deletion of the N-terminal MTS can also be
expanded to all enzymes/homologs to be used for isobutanol
production. This is especially true for homologs from eukaryotic
organisms other than S. cerevisiae where the enzymes are localized
to the mitochondria. In addition, some bacterial homologs may have
a putative MTS. As bacterial enzymes do not undergo an N-terminal
cleavage, N-terminal deletions may be deleterious to these enzymes.
In such cases, modifications of the sequence to block the MTS
function of the N-terminal sequence may be preferable as such
alterations would likely be less deleterious to the enzyme's
activity. N-terminal MTS can be predicted by MitoProt II (See,
e.g., Claros et al., 1996, Eur. J. Biochem. 241: 779-786). Using
this program, the lengths of the MTS for Ilv2 and Ilv3 were
predicted to be the N-terminal 55 and 20 amino acids, respectively.
Modification of the MTS as contemplated herein includes the
introduction of one or multiple mutations to inhibit MTS function.
It is thought that the mitochondrial import machinery recognizes
the amphipathic alpha helix that is formed by the MTS. Thus,
modifications that may inhibit MTS function would be amino acid
changes that alter the amphipathic character of the helix, such as
mutating the charged residues. Such modification(s) prevent
recognition of the MTS by the mitochondrial import machinery, and
hence its subsequent cleavage and import into the mitochondria.
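The N-terminal deletion strategy described above can be sketched in a few lines of Python. This is a minimal illustration, not the inventors' procedure: the MTS lengths are the values quoted in the text, while the function name `remove_mts`, the choice to prepend a start methionine, and the toy sequence are assumptions made for the example.

```python
# Illustrative sketch only. MTS lengths are those quoted in the text;
# the toy sequence below is made up and is NOT a real Ilv protein.
MTS_LENGTHS = {"Ilv2": 55, "Ilv3": 20, "Ilv5": 47, "Ilv6": 61}


def remove_mts(protein_seq: str, mts_length: int) -> str:
    """Return the protein sequence with its N-terminal MTS deleted.

    A start methionine is prepended when the truncated sequence does not
    already begin with one, so the construct can still be translated
    (an assumption of this sketch).
    """
    if not 0 <= mts_length < len(protein_seq):
        raise ValueError("mts_length must be shorter than the protein")
    mature = protein_seq[mts_length:]
    return mature if mature.startswith("M") else "M" + mature


# Hypothetical 15-residue 'MTS' followed by a short 'mature' body.
toy_protein = "MLRTQAARLICNSRV" + "KTFGGSEEDLKAGY"
print(remove_mts(toy_protein, 15))
```

For a real enzyme, the length would be taken from experimental data or from a predictor such as MitoProt II, as discussed above.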
Peptide Tags to Augment Cytosolic Localization of Isobutanol
Pathway Enzymes
[0143] In additional embodiments, the mitochondrially imported
isobutanol pathway enzymes can be expressed as a chimeric fusion
protein to augment cytosolic localization. In one embodiment, the
isobutanol pathway enzyme is fused to a peptide tag, whereby said
isobutanol pathway enzyme exhibits increased cytosolic localization
and/or cytosolic activity in yeast as compared to the parental
isobutanol pathway enzyme. In one embodiment, the isobutanol
pathway enzyme is fused to a peptide tag following removal of the
N-terminal Mitochondrial Targeting Sequence (MTS). In one
embodiment, the peptide tag is non-cleavable. In a preferred
embodiment, the peptide tag is fused at the N-terminus of the
isobutanol pathway enzyme. Peptide tags useful in the present
invention preferably have the following properties: (1) they do not
significantly hinder the normal enzymatic function of the
isobutanol pathway enzyme; (2) they fold in such a way as to
block recognition of an N-terminal MTS by the normal mitochondrial
import machinery; (3) they promote the stable expression and/or
folding of the isobutanol pathway enzyme they precede; and (4) they
can be detected, for example, by Western blotting or SDS-PAGE plus
Coomassie staining, to facilitate analysis of the overexpressed
chimeric protein.
[0144] Suitable peptide tags for use in the present invention
include, but are not limited to, ubiquitin, ubiquitin-like (UBL)
proteins, myc, HA-tag, green fluorescent protein (GFP), and the
maltose binding protein (MBP). Ubiquitin and ubiquitin-like
proteins (Ubls) offer several advantages. For instance, the use of
ubiquitin or similar Ubls (e.g., SUMO) as a solubility- and
expression-enhancing fusion partner has been well documented (Ecker
et al., 1989, J Biol Chem 264: 7715-9; Marblestone et al., 2006,
Protein Science 15: 182-9). In fact, in S. cerevisiae, several
ribosomal proteins are expressed as C-terminal fusions to
ubiquitin. Following translation and protein folding, ubiquitin is
cleaved from its co-expressed partner by a highly specific
ubiquitin hydrolase, which recognizes and requires the extreme
C-terminal Gly-Gly motif present in ubiquitin and cleaves
immediately following this sequence; a similar pathway removes Ubl
proteins from their fusion partners.
[0145] The invention described herein provides a method to
re-localize a normally mitochondrial protein or enzyme by
expressing it as a fusion with an N-terminal, non-cleavable ubiquitin
or ubiquitin-like molecule. In doing so, the re-targeted enzyme
enjoys enhanced expression, solubility, and function in the
cytosol. In another embodiment, the sequence encoding the MTS can
be replaced with a sequence encoding one or more copies of the
c-myc epitope tag (amino acids EQKLISEEDL, SEQ ID NO: 9), which
will generally not target a protein into the mitochondria and can
easily be detected by commercially available antibodies.
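The replacement of the MTS with one or more c-myc epitope copies can be illustrated as a simple sequence operation. The sketch below is hedged: the epitope sequence comes from the text (SEQ ID NO: 9), but the function name `replace_mts_with_myc`, the leading start methionine, and the toy input are assumptions for illustration only.

```python
C_MYC = "EQKLISEEDL"  # c-myc epitope tag as given in the text (SEQ ID NO: 9)


def replace_mts_with_myc(protein_seq: str, mts_length: int, copies: int = 1) -> str:
    """Swap the N-terminal MTS for `copies` tandem c-myc epitopes.

    The construct is built as: start Met + tag copies + residues that
    follow the MTS. Prepending Met is an assumption of this sketch.
    """
    if copies < 1:
        raise ValueError("at least one tag copy is required")
    if not 0 <= mts_length < len(protein_seq):
        raise ValueError("mts_length must be shorter than the protein")
    return "M" + C_MYC * copies + protein_seq[mts_length:]


# Hypothetical input: a 5-residue 'MTS' (MAAAA) followed by KTF.
print(replace_mts_with_myc("MAAAAKTF", mts_length=5, copies=2))
```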
Altering the Iron-Sulfur Cluster Domain and/or Redox Active
Domain
[0146] In general, the yeast cytosol has a different redox
potential than a bacterial cell or the yeast mitochondria.
As a result, isobutanol pathway enzymes that contain an iron-sulfur
(FeS) domain and/or redox active domain may require the
redox potential of their native environment to be folded or
expressed in a functional form. Expressing some isobutanol pathway
enzymes in the yeast cytosol, which can harbor unfavorable redox
potential, has the propensity to result in inactive proteins, even
if the proteins are expressed. The present inventors have
identified a number of different strategies to overcome this
problem, which can arise when an isobutanol pathway enzyme which is
suited to a particular environment with a specific redox potential
is expressed in the yeast cytosol.
[0147] In one embodiment, the present invention provides isobutanol
pathway enzymes that exhibit a properly folded iron-sulfur cluster
domain and/or redox active domain in the cytosol. Such isobutanol
pathway enzymes will generally comprise a mutated or modified
iron-sulfur cluster domain and/or redox active domain, allowing for
a non-native isobutanol pathway enzyme to be expressed in the yeast
cytosol in a functional form.
[0148] In various embodiments described herein, the recombinant
microorganisms may further comprise a nucleic acid encoding a
chaperone protein, wherein said chaperone protein assists the
folding of a protein exhibiting cytosolic activity. In a preferred
embodiment, the protein exhibiting cytosolic activity is DHAD. In
one embodiment, the chaperone may be a native protein. In another
embodiment, the chaperone protein may be an exogenous protein. In
some embodiments, the chaperone protein may be selected from the
group consisting of: endoplasmic reticulum oxidoreductin 1 (Ero1,
Accession No. NP_013576.1), including variants of Ero1 that
have been suitably altered to reduce or prevent its normal
localization to the endoplasmic reticulum; thioredoxins (which
include Trx1, Accession No. NP_013144.1; and Trx2, Accession
No. NP_011725.1); thioredoxin reductase (Trr1, Accession No.
NP_010640.1); glutaredoxins (which include Grx1, Accession
No. NP_009895.1; Grx2, Accession No. NP_010801.1; Grx3,
Accession No. NP_010383.1; Grx4, Accession No.
NP_01101.1; Grx5, Accession No. NP_015266.1; Grx6,
Accession No. NP_010274.1; Grx7, Accession No.
NP_009570.1; Grx8, Accession No. NP_013468.1);
glutathione reductase Glr1 (Accession No. NP_015234.1); and
Jac1 (Accession No. NP_011497.1), including variants of Jac1
that have been suitably altered to reduce or prevent its normal
mitochondrial localization; and homologs or variants thereof.
[0149] As described herein, iron-sulfur cluster assembly for
insertion into yeast apo-iron-sulfur proteins begins in yeast
mitochondria. To assemble in yeast the active iron-sulfur proteins
containing the cluster, either the apo-iron-sulfur protein is
imported into the mitochondria from the cytosol and the iron-sulfur
cluster is inserted into the protein and the active protein remains
localized in the mitochondria; or the iron-sulfur clusters or
precursors thereof are exported from the mitochondria to the
cytosol and the active protein is assembled in the cytosol or other
cellular compartments.
[0150] Targeting of yeast mitochondrial iron-sulfur proteins or
non-yeast iron-sulfur proteins to the yeast cytosol can result in
such proteins not being properly assembled with their iron-sulfur
clusters. This present invention overcomes this problem by
co-expression and cytosolic targeting in yeast of proteins for
iron-sulfur cluster assembly and cluster insertion into
apo-iron-sulfur proteins, including iron-sulfur cluster assembly
and insertion proteins from organisms other than yeast, together
with the apo-iron-sulfur protein to provide assembly of active
iron-sulfur proteins in the yeast cytosol.
[0151] Therefore, in one embodiment of this invention, the
apo-iron-sulfur protein DHAD enzyme encoded by the E. coli ilvD
gene is expressed in yeast together with E. coli iron-sulfur
cluster assembly and insertion genes comprising either the cyaY,
iscS, iscU, iscA, hscB, hscA, fdx and iscX genes or the sufA, sufB,
sufC, sufD, sufS and sufE genes. This strategy allows for both the
apo-iron-sulfur protein (DHAD) and the iron-sulfur cluster assembly
and insertion components (the products of the isc or suf genes) to
come from the same organism, causing assembly of the active DHAD
iron-sulfur protein in the yeast cytosol. As a modification of this
embodiment, for those E. coli iron-sulfur cluster assembly and
insertion components that localize to or are predicted to localize
to the yeast mitochondria upon expression in yeast, the genes for
these components are engineered to eliminate such targeting signals
to ensure localization of the components in the yeast cytoplasm.
Thus, in some embodiments, one or more genes encoding an
iron-sulfur cluster assembly protein may be mutated or modified to
remove a signal peptide, whereby localization of the product of
said one or more genes to the mitochondria is prevented. In certain
embodiments, it may be preferable to overexpress one or more genes
encoding an iron-sulfur cluster assembly protein.
[0152] In additional embodiments, iron-sulfur cluster assembly and
insertion components from organisms other than E. coli can be co-expressed
with the E. coli DHAD protein to provide assembly of the active
DHAD iron-sulfur cluster protein. Such iron-sulfur cluster assembly
and insertion components from other organisms can consist of the
products of the Helicobacter pylori nifS and nifU genes or the
Entamoeba histolytica nifS and nifU genes. As a modification of
this embodiment, for those non-E. coli iron-sulfur cluster assembly
and insertion components that localize to or are predicted to
localize to the yeast mitochondria upon expression in yeast, the
genes for these components can be engineered to eliminate such
targeting signals to ensure localization of the components in the
yeast cytoplasm.
[0153] As a further modification of this embodiment, in addition to
co-expression of these proteins in aerobically-grown yeast, these
proteins may be co-expressed in anaerobically-grown yeast to lower
the redox state of the yeast cytoplasm to improve assembly of the
active iron-sulfur protein.
[0154] In another embodiment, the above iron-sulfur cluster
assembly and insertion components can be co-expressed with DHAD
apo-iron-sulfur enzymes other than the E. coli IlvD gene product to
generate active DHAD enzymes in the yeast cytoplasm. As a
modification of this embodiment, for those DHAD enzymes that
localize to or are predicted to localize to the yeast mitochondria
upon expression in yeast, the genes for these enzymes can be
engineered to eliminate such targeting signals to ensure
localization of the enzymes in the yeast cytoplasm.
[0155] In additional embodiments, the above methods used to
generate active DHAD enzymes localized to yeast cytoplasm may be
combined with methods to generate active acetolactate synthase,
KARI, KIVD and ADH enzymes in the same yeast for the production of
isobutanol by yeast.
[0156] In another embodiment, production of active iron-sulfur
proteins other than DHAD enzymes in yeast cytoplasm can be
accomplished by co-expression with iron-sulfur cluster assembly and
insertion proteins from organisms other than yeast, with proper
targeting of the proteins to the yeast cytoplasm if necessary and
expression in anaerobically growing yeast if needed to improve
assembly of the active proteins.
[0157] In another embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from eukaryotic organisms,
including, but not limited to yeasts and plants. In one embodiment,
the iron-sulfur cluster protein encoding genes are derived from a
yeast organism, including, but not limited to S. cerevisiae. In
specific embodiments, the yeast derived genes encoding iron-sulfur
cluster assembly proteins are selected from the group consisting of
Cfd1 (Accession No. NP_012263.1), Nbp35 (Accession No.
NP_011424.1), Nar1 (Accession No. NP_014159.1), Cia1
(Accession No. NP_010553.1), and homologs or variants
thereof. In a further embodiment, the iron-sulfur cluster assembly
protein encoding genes may be derived from plant nuclear genes
which encode proteins translocated to chloroplast or plant genes
found in the chloroplast genome itself.
[0158] As noted above, the iron-sulfur cluster assembly genes may
be derived from eukaryotic organisms, including, but not limited to
yeasts and plants. In one embodiment, the iron-sulfur cluster genes
are derived from a yeast organism, including, but not limited to S.
cerevisiae. In specific embodiments, the yeast derived iron-sulfur
cluster assembly genes are selected from the group consisting of
CFD1, NBP35, NAR1, CIA1, and homologs or variants thereof. In a
further embodiment, the iron-sulfur cluster assembly genes may be
derived from a plant chloroplast.
[0159] In certain embodiments described herein, it may be desirable
to reduce or eliminate the activity and/or protein levels of one
or more iron-sulfur cluster containing cytosolic proteins. This
modification increases the capacity of a yeast to incorporate
[Fe-S] clusters into cytosolically expressed proteins, wherein said
proteins can be native proteins that are expressed in a non-native
compartment or heterologous proteins. This is achieved by deletion
of a highly expressed native cytoplasmic [Fe-S]-dependent protein.
More specifically, the gene LEU1 is deleted. LEU1 encodes the
3-isopropylmalate dehydratase, which catalyzes the conversion of
2-isopropylmalate into 3-isopropylmalate as part of the leucine
biosynthetic pathway in yeast. Leu1p contains a 4Fe-4S cluster
which takes part in the catalysis of the dehydratase. DHAD also
contains a 4Fe-4S cluster involved in its dehydratase activity.
Therefore, although the two enzymes have different substrate
preferences, the process of incorporation of the Fe-S cluster is
generally similar for the two proteins. Given that Leu1p is present
in yeast at 10000 molecules per cell (Ghaemmaghami et al., 2003,
Nature 425: 737), deletion of LEU1 ensures that the cell
has enough spare capacity to incorporate [Fe-S] clusters into at
least 10000 molecules of cytosolically expressed DHAD. Taking into
account the specific activity of DHAD (E. coli DHAD is reported to
have a specific activity of 63 U/mg) (Flint et al., 1993, J
Biological Chem 268: 14732), the LEU1 deletion yeast strain would
generally exhibit an increased capacity for DHAD activity in the
cytosol as measured in cell lysate.
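The capacity argument above can be made concrete with a short back-of-the-envelope calculation. The sketch below uses the two figures quoted in the text (10000 molecules per cell and 63 U/mg). The molecular mass of about 66 kDa per DHAD monomer is an assumption introduced here for illustration and is not stated in this document, and the calculation further assumes that every molecule is fully loaded with its cluster and fully active.

```python
AVOGADRO = 6.02214076e23            # molecules per mole

MOLECULES_PER_CELL = 10_000         # Leu1p abundance quoted in the text
SPECIFIC_ACTIVITY_U_PER_MG = 63.0   # E. coli DHAD, quoted in the text
ASSUMED_MW_G_PER_MOL = 66_000.0     # ASSUMPTION: approximate monomer mass


def dhad_activity_per_cell(molecules: float, mw_g_per_mol: float,
                           specific_activity_u_per_mg: float) -> float:
    """Upper-bound DHAD activity (U, i.e. umol/min) per cell.

    Converts a molecule count to milligrams of protein and multiplies by
    the specific activity, assuming every molecule is fully active.
    """
    mg_per_cell = molecules * mw_g_per_mol / AVOGADRO * 1000.0  # g -> mg
    return mg_per_cell * specific_activity_u_per_mg


activity = dhad_activity_per_cell(
    MOLECULES_PER_CELL, ASSUMED_MW_G_PER_MOL, SPECIFIC_ACTIVITY_U_PER_MG
)
print(f"{activity:.2e} U per cell")
```

Under these assumptions the result is on the order of 7e-11 U per cell. The value scales linearly with the assumed molecular mass and with the fraction of molecules that actually receive a cluster.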
[0160] In alternative embodiments, it may be desirable to further
overexpress an additional enzyme that converts
2,3-dihydroxyisovalerate to ketoisovalerate in the cytosol. In a
specific embodiment, the enzyme may be selected from the group
consisting of 3-isopropylmalate dehydratase (Leu1p) and
imidazoleglycerol-phosphate dehydratase (His3p). Because DHAD
activity is limited in the cytosol, alternative dehydratases that
convert dihydroxyisovalerate (DHIV) to 2-ketoisovalerate (KIV) and
are physiologically localized to the yeast cytosol may be utilized.
Leu1p and His3p are dehydratases that potentially may exhibit
affinity for DHIV. Leu1p is an Fe-S binding protein that is
involved in leucine biosynthesis and is also normally localized to
the cytosol. His3p is involved in histidine biosynthesis and,
like Leu1p, is generally localized to the cytosol or
predicted to be localized to the cytosol. This modification
overcomes the problem of a DHAD that is limiting isobutanol
production in the cytosol of yeast. An alternative dehydratase
that is active in the cytosol, even with a low activity
towards DHIV, may thus be used in place of the DHAD in the
isobutanol pathway. As described herein, such an enzyme may be
further engineered to increase activity with DHIV.
The Microorganism in General
[0161] Native producers of 1-butanol, such as Clostridium
acetobutylicum, are known, but these organisms also generate
byproducts such as acetone, ethanol, and butyrate during
fermentations. Furthermore, these microorganisms are relatively
difficult to manipulate, with significantly fewer tools available
than in more commonly used production hosts such as S. cerevisiae
or E. coli. Additionally, the physiology and metabolic regulation
of these native producers are much less well understood, impeding
rapid progress towards high-efficiency production. Furthermore, no
native microorganisms have been identified that can metabolize
glucose into isobutanol in industrially relevant quantities.
[0162] The production of isobutanol and other fusel alcohols by
various yeast species, including Saccharomyces cerevisiae, is of
special interest to the distillers of alcoholic beverages, for whom
fusel alcohols often constitute undesirable off-notes. Production
of isobutanol in wild-type yeasts has been documented on various
growth media, ranging from grape must from winemaking (Romano et
al., 2003, World J. of Microbiol Biot. 19: 311-5), in which 12-219
mg/L isobutanol were produced, to supplemented minimal media
(Oliviera et al., 2005, World J. of Microbiol Biot. 21: 1569-76),
producing 16-34 mg/L isobutanol. Work from Dickinson et al. (J Biol
Chem. 272: 26871-8, 1997) has identified the enzymatic steps
utilized in an endogenous S. cerevisiae pathway converting
branched-chain amino acids (e.g., valine or leucine) to
isobutanol.
[0163] Recombinant microorganisms provided herein can express a
plurality of heterologous and/or native target enzymes involved in
pathways for the production of isobutanol from a suitable carbon
source.
[0164] Accordingly, "engineered" or "modified" microorganisms are
produced via the introduction of genetic material into a host or
parental microorganism of choice and/or by modification of the
expression of native genes, thereby modifying or altering the
cellular physiology and biochemistry of the microorganism. Through
the introduction of genetic material and/or the modification of the
expression of native genes the parental microorganism acquires new
properties, e.g. the ability to produce a new, or greater
quantities of, an intracellular metabolite. As described herein,
the introduction of genetic material into and/or the modification
of the expression of native genes in a parental microorganism
results in a new or modified ability to produce isobutanol. The
genetic material introduced into and/or the genes modified for
expression in the parental microorganism contains gene(s), or parts
of genes, coding for one or more of the enzymes involved in a
biosynthetic pathway for the production of isobutanol and may also
include additional elements for the expression and/or regulation of
expression of these genes, e.g. promoter sequences.
[0165] In addition to the introduction of a genetic material into a
host or parental microorganism, an engineered or modified
microorganism can also include alteration, disruption, deletion or
knocking-out of a gene or polynucleotide to alter the cellular
physiology and biochemistry of the microorganism. Through the
alteration, disruption, deletion or knocking-out of a gene or
polynucleotide the microorganism acquires new or improved
properties (e.g., the ability to produce a new metabolite or
greater quantities of an intracellular metabolite, improve the flux
of a metabolite down a desired pathway, and/or reduce the
production of byproducts).
[0166] Recombinant microorganisms provided herein may also produce
metabolites in quantities not available in the parental
microorganism. A "metabolite" refers to any substance produced by
metabolism or a substance necessary for or taking part in a
particular metabolic process. A metabolite can be an organic
compound that is a starting material (e.g., glucose or pyruvate),
an intermediate (e.g., 2-ketoisovalerate), or an end product (e.g.,
isobutanol) of metabolism. Metabolites can be used to construct
more complex molecules, or they can be broken down into simpler
ones. Intermediate metabolites may be synthesized from other
metabolites, perhaps used to make more complex substances, or
broken down into simpler compounds, often with the release of
chemical energy.
[0167] Exemplary metabolites include glucose, pyruvate, and
isobutanol. The metabolite isobutanol can be produced by a
recombinant microorganism which expresses or over-expresses a
metabolic pathway that converts pyruvate to isobutanol. An
exemplary metabolic pathway that converts pyruvate to isobutanol
may be comprised of an acetohydroxy acid synthase (ALS), a
ketol-acid reductoisomerase (KARI), a dihydroxy-acid dehydratase
(DHAD), a 2-keto-acid decarboxylase (KIVD), and an alcohol
dehydrogenase (ADH).
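The five-enzyme route from pyruvate to isobutanol can be written down as an ordered list of steps, which makes it easy to check that each enzyme's product is the next enzyme's substrate. This is an illustrative sketch: the enzyme abbreviations and the metabolites pyruvate, 2,3-dihydroxyisovalerate, 2-ketoisovalerate and isobutanol appear in this document, whereas the intermediate names 2-acetolactate and isobutyraldehyde are filled in from general biochemistry and are an assumption of the example. Cofactors and released CO2 are omitted.

```python
from typing import NamedTuple


class Step(NamedTuple):
    enzyme: str
    substrate: str
    product: str


# Intermediates "2-acetolactate" and "isobutyraldehyde" are assumptions
# of this sketch; they are not named in this section of the document.
ISOBUTANOL_PATHWAY = [
    Step("ALS", "pyruvate", "2-acetolactate"),
    Step("KARI", "2-acetolactate", "2,3-dihydroxyisovalerate"),
    Step("DHAD", "2,3-dihydroxyisovalerate", "2-ketoisovalerate"),
    Step("KIVD", "2-ketoisovalerate", "isobutyraldehyde"),
    Step("ADH", "isobutyraldehyde", "isobutanol"),
]


def metabolite_chain(pathway):
    """Return the ordered metabolites, verifying that the steps connect."""
    chain = [pathway[0].substrate]
    for step in pathway:
        if step.substrate != chain[-1]:
            raise ValueError(f"{step.enzyme} does not consume {chain[-1]}")
        chain.append(step.product)
    return chain


print(" -> ".join(metabolite_chain(ISOBUTANOL_PATHWAY)))
```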
[0168] Accordingly, provided herein are recombinant microorganisms
that produce isobutanol and in some aspects may include the
elevated expression of target enzymes such as ALS, KARI, DHAD,
KIVD, and ADH.
[0169] The disclosure identifies specific genes useful in the
methods, compositions and organisms of the disclosure; however it
will be recognized that absolute identity to such genes is not
necessary. For example, changes in a particular gene or
polynucleotide comprising a sequence encoding a polypeptide or
enzyme can be performed and screened for activity. Typically such
changes comprise conservative mutations and silent mutations. Such
modified or mutated polynucleotides and polypeptides can be
screened for expression of a functional enzyme using methods known
in the art.
[0170] Due to the inherent degeneracy of the genetic code, other
polynucleotides which encode substantially the same or functionally
equivalent polypeptides can also be used to clone and express the
polynucleotides encoding such enzymes.
[0171] As will be understood by those of skill in the art, it can
be advantageous to modify a coding sequence to enhance its
expression in a particular host. The genetic code is redundant with
64 possible codons, but most organisms typically use a subset of
these codons. The codons that are utilized most often in a species
are called optimal codons, and those not utilized very often are
classified as rare or low-usage codons. Codons can be substituted
to reflect the preferred codon usage of the host, a process
sometimes called "codon optimization" or "controlling for species
codon bias."
[0172] Optimized coding sequences containing codons preferred by a
particular prokaryotic or eukaryotic host (Murray et al., 1989,
Nucl Acids Res. 17: 477-508) can be prepared, for example, to
increase the rate of translation or to produce recombinant RNA
transcripts having desirable properties, such as a longer
half-life, as compared with transcripts produced from a
non-optimized sequence. Translation stop codons can also be
modified to reflect host preference. For example, typical stop
codons for S. cerevisiae and mammals are UAA and UGA, respectively.
The typical stop codon for monocotyledonous plants is UGA, whereas
insects and E. coli commonly use UAA as the stop codon (Dalphin et
al., 1996, Nucl Acids Res. 24: 216-8). Methodology for optimizing a
nucleotide sequence for expression in a plant is provided, for
example, in U.S. Pat. No. 6,015,891, and the references cited
therein.
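A minimal sketch of codon substitution toward a host's preferred codons is given below. It back-translates a protein sequence using one codon per amino acid and appends a chosen stop codon. The codon table is an illustrative choice of codons often described as favored in S. cerevisiae; it is an assumption of this example rather than measured usage data from this document, and real optimization tools weigh additional factors. Codons are written as DNA (TAA corresponds to the UAA stop codon mentioned above).

```python
# ASSUMPTION: one illustrative "preferred" codon per amino acid for a
# yeast host. Replace with measured usage data for real work.
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATT",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(protein_seq: str, codon_table=PREFERRED_CODON,
                   stop_codon: str = "TAA") -> str:
    """Encode a protein with the host-preferred codon for each residue.

    `stop_codon` defaults to TAA (UAA in RNA), reflecting the host
    preference for S. cerevisiae noted in the text.
    """
    try:
        codons = [codon_table[aa] for aa in protein_seq.upper()]
    except KeyError as exc:
        raise ValueError(f"no codon defined for residue {exc.args[0]!r}") from None
    return "".join(codons) + stop_codon


print(back_translate("MKV"))
```

Because the encoded amino acid sequence is unchanged, the same function can be pointed at a different table or stop codon to target another host.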
[0173] Those of skill in the art will recognize that, due to the
degenerate nature of the genetic code, a variety of DNA compounds
differing in their nucleotide sequences can be used to encode a
given enzyme of the disclosure. The native DNA sequence encoding
the biosynthetic enzymes described above are referenced herein
merely to illustrate an embodiment of the disclosure, and the
disclosure includes DNA compounds of any sequence that encode the
amino acid sequences of the polypeptides and proteins of the
enzymes utilized in the methods of the disclosure. In similar
fashion, a polypeptide can typically tolerate one or more amino
acid substitutions, deletions, and insertions in its amino acid
sequence without loss or significant loss of a desired activity.
The disclosure includes such polypeptides with different amino acid
sequences than the specific proteins described herein so long as
the modified or variant polypeptides have the enzymatic anabolic
or catabolic activity of the reference polypeptide. Furthermore,
the amino acid sequences encoded by the DNA sequences shown herein
merely illustrate embodiments of the disclosure.
[0174] In addition, homologs of enzymes useful for generating
metabolites are encompassed by the microorganisms and methods
provided herein.
[0175] As used herein, two proteins (or a region of the proteins)
are substantially homologous when the amino acid sequences have at
least about 30%, 40%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity. To determine
the percent identity of two amino acid sequences, or of two nucleic
acid sequences, the sequences are aligned for optimal comparison
purposes (e.g., gaps can be introduced in one or both of a first
and a second amino acid or nucleic acid sequence for optimal
alignment and non-homologous sequences can be disregarded for
comparison purposes). In one embodiment, the length of a reference
sequence aligned for comparison purposes is at least 30%, typically
at least 40%, more typically at least 50%, even more typically at
least 60%, and even more typically at least 70%, 80%, 90%, or 100% of
the length of the reference sequence. The amino acid residues or
nucleotides at corresponding amino acid positions or nucleotide
positions are then compared. When a position in the first sequence
is occupied by the same amino acid residue or nucleotide as the
corresponding position in the second sequence, then the molecules
are identical at that position (as used herein amino acid or
nucleic acid "identity" is equivalent to amino acid or nucleic acid
"homology"). The percent identity between the two sequences is a
function of the number of identical positions shared by the
sequences, taking into account the number of gaps, and the length
of each gap, which need to be introduced for optimal alignment of
the two sequences.
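The percent identity calculation described above can be sketched for two sequences that have already been aligned. This example assumes one common convention, in which identical positions are divided by the number of alignment columns (so each gap position lowers the score); other conventions exist, and the text does not specify which is intended. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Identical residues are counted over all alignment columns; columns
    where both sequences carry a gap are ignored. Gap positions therefore
    count against identity (one convention among several).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            identical += 1
    if columns == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / columns


# Toy alignment: 4 identical positions out of 6 columns.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```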
[0176] When "homologous" is used in reference to proteins or
peptides, it is recognized that residue positions that are not
identical often differ by conservative amino acid substitutions. A
"conservative amino acid substitution" is one in which an amino
acid residue is substituted by another amino acid residue having a
side chain (R group) with similar chemical properties (e.g., charge
or hydrophobicity). In general, a conservative amino acid
substitution will not substantially change the functional
properties of a protein. In cases where two or more amino acid
sequences differ from each other by conservative substitutions, the
percent sequence identity or degree of homology may be adjusted
upwards to correct for the conservative nature of the substitution.
Means for making this adjustment are well known to those of skill
in the art (See, e.g., Pearson W. R., 1994, Methods in Mol Biol 25:
365-89).
[0177] The following six groups each contain amino acids that are
conservative substitutions for one another: 1) Serine (S),
Threonine (T); 2) Aspartic Acid (D), Glutamic Acid (E); 3)
Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5)
Isoleucine (I), Leucine (L), Methionine (M), Alanine (A), Valine
(V), and 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
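The six groups listed above translate directly into a lookup that classifies a substitution as conservative or not. The sketch below encodes exactly those groups with one-letter codes; the function name `is_conservative` is hypothetical, and residues outside the six groups (for example glycine, proline, cysteine and histidine) are treated as having no conservative partner.

```python
# The six conservative substitution groups listed in the text.
CONSERVATIVE_GROUPS = [
    set("ST"),     # Serine, Threonine
    set("DE"),     # Aspartic acid, Glutamic acid
    set("NQ"),     # Asparagine, Glutamine
    set("RK"),     # Arginine, Lysine
    set("ILMAV"),  # Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two different residues belong to the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # not a substitution at all
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("S", "T"), is_conservative("D", "K"))
```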
[0178] Sequence homology for polypeptides, which is also referred
to as percent sequence identity, is typically measured using
sequence analysis software. See commonly owned and co-pending
application US 2009/0226991. A typical algorithm used for comparing a
molecule sequence to a database containing a large number of
sequences from different organisms is the computer program BLAST.
When searching a database containing sequences from a large number
of different organisms, it is typical to compare amino acid
sequences. Database searching using amino acid sequences can be
performed using algorithms described in commonly owned and co-pending
application US 2009/0226991.
[0179] The disclosure provides recombinant microorganisms
comprising a biochemical pathway for the production of isobutanol
from a suitable substrate at a high yield. A recombinant
microorganism of the disclosure comprises one or more recombinant
polynucleotides within the genome of the organism or external to
the genome within the organism. The microorganism can comprise a
reduction in expression, disruption or knockout of a gene found in
the wild-type organism and/or introduction of a heterologous
polynucleotide and/or expression or overexpression of an endogenous
polynucleotide.
[0180] In one aspect, the disclosure provides a recombinant
microorganism that comprises elevated expression of at least one
target enzyme as compared to a parental microorganism or that encodes
an enzyme not found in the parental organism. In another or further
aspect, the microorganism comprises a reduction in expression,
disruption or knockout of at least one gene encoding an enzyme that
competes for a metabolite necessary for the production of isobutanol.
The recombinant microorganism produces at least one metabolite involved
in a biosynthetic pathway for the production of isobutanol. In
general, the recombinant microorganism comprises at least one
recombinant metabolic pathway that comprises a target enzyme and
may further include a reduction in activity or expression of an
enzyme in a competitive biosynthetic pathway. The pathway acts to
modify a substrate or metabolic intermediate in the production of
isobutanol. The target enzyme is encoded by, and expressed from, a
polynucleotide derived from a suitable biological source. In some
embodiments, the polynucleotide comprises a gene derived from a
prokaryotic or eukaryotic source and recombinantly engineered into
the microorganism of the disclosure. In other embodiments, the
polynucleotide comprises a gene that is native to the host
organism.
[0181] It is understood that a range of microorganisms can be
modified to include a recombinant metabolic pathway suitable for
the production of isobutanol. In various embodiments,
microorganisms may be selected from yeast microorganisms. Yeast
microorganisms for the production of isobutanol may be selected
based on certain characteristics:
[0182] One characteristic may include the property that the
microorganism is selected to convert various carbon sources into
isobutanol. The term "carbon source" generally refers to a
substance suitable to be used as a source of carbon for prokaryotic
or eukaryotic cell growth. Examples of suitable carbon sources are
described in commonly owned and co-pending application US
2009/0226991. Accordingly, in one embodiment, the recombinant
microorganism herein disclosed can convert a variety of carbon
sources to products, including but not limited to glucose,
galactose, mannose, xylose, arabinose, lactose, sucrose, and
mixtures thereof.
[0183] The recombinant microorganism may thus further include a
pathway for the fermentation of isobutanol from five-carbon
(pentose) sugars including xylose. Most yeast species metabolize
xylose via a complex route, in which xylose is first reduced to
xylitol via a xylose reductase (XR) enzyme. The xylitol is then
oxidized to xylulose via a xylitol dehydrogenase (XDH) enzyme. The
xylulose is then phosphorylated via a xylulokinase (XK) enzyme.
This pathway operates inefficiently in yeast species because it
introduces a redox imbalance in the cell. The xylose-to-xylitol
step typically uses NADPH as a cofactor, whereas the xylitol-to-xylulose
step uses NAD+ as a cofactor. Other processes must operate to restore
the redox balance within the cell. This often means that the
organism cannot grow anaerobically on xylose or other pentose
sugars. A yeast species that can efficiently ferment
xylose and other pentose sugars into a desired fermentation product
is therefore very desirable.
[0184] Thus, in one aspect, the recombinant microorganism is engineered to
express a functional exogenous xylose isomerase. Exogenous xylose
isomerases functional in yeast are known in the art. See, e.g.,
Rajgarhia et al, US20060234364, which is herein incorporated by
reference in its entirety. In an embodiment according to this
aspect, the exogenous xylose isomerase gene is operatively linked
to promoter and terminator sequences that are functional in the
yeast cell. In a preferred embodiment, the recombinant
microorganism further has a deletion or disruption of a native gene
that encodes for an enzyme (e.g. XR and/or XDH) that catalyzes the
conversion of xylose to xylitol. In a further preferred embodiment,
the recombinant microorganism also contains a functional, exogenous
xylulokinase (XK) gene operatively linked to promoter and
terminator sequences that are functional in the yeast cell. In one
embodiment, the xylulokinase (XK) gene is overexpressed.
[0185] In one embodiment, the microorganism has reduced or no
pyruvate decarboxylase (PDC) activity. PDC catalyzes the
decarboxylation of pyruvate to acetaldehyde, which is then reduced
to ethanol by ADH via the oxidation of NADH to NAD+. Ethanol
production is the main pathway to oxidize the NADH from glycolysis.
Deletion of this pathway increases the pyruvate and the reducing
equivalents (NADH) available for the isobutanol pathway.
Accordingly, deletion of PDC genes can further increase the yield
of isobutanol.
[0186] In another embodiment, the microorganism has reduced or no
glycerol-3-phosphate dehydrogenase (GPD) activity. GPD catalyzes
the reduction of dihydroxyacetone phosphate (DHAP) to
glycerol-3-phosphate (G3P) via the oxidation of NADH to NAD+.
Glycerol is then produced from G3P by Glycerol-3-phosphatase (GPP).
Glycerol production is a secondary pathway to oxidize excess NADH
from glycolysis. Reduction or elimination of this pathway would
increase the pyruvate and reducing equivalents (NADH) available for
the isobutanol pathway. Thus, deletion of GPD genes can further
increase the yield of isobutanol.
[0187] In yet another embodiment, the microorganism has reduced or
no PDC activity and reduced or no GPD activity.
[0188] In one embodiment, the yeast microorganisms may be selected
from the "Saccharomyces Yeast Clade", as described in commonly
owned and co-pending application US 2009/0226991.
[0189] The term "Saccharomyces sensu stricto" taxonomy group is a
cluster of yeast species that are highly related to S. cerevisiae
(Rainieri et al., 2003, J. Biosci Bioengin 96: 1-9). Saccharomyces
sensu stricto yeast species include but are not limited to S.
cerevisiae, S. kudriavzevii, S. mikatae, S. bayanus,
S. uvarum, S. cariocanus and hybrids derived from these species
(Masneuf et al., 1998, Yeast 7: 61-72).
[0190] An ancient whole genome duplication (WGD) event occurred
during the evolution of the hemiascomycete yeast and was discovered
using comparative genomic tools (Kellis et al., 2004, Nature 428:
617-24; Dujon et al., 2004, Nature 430:35-44; Langkjaer et al.,
2003, Nature 428: 848-52; Wolfe et al., 1997, Nature 387: 708-13).
Using this major evolutionary event, yeast can be divided into
species that diverged from a common ancestor following the WGD
event (termed "post-WGD yeast" herein) and species that diverged
from the yeast lineage prior to the WGD event (termed "pre-WGD
yeast" herein).
[0191] Accordingly, in one embodiment, the yeast microorganism may
be selected from a post-WGD yeast genus, including but not limited
to Saccharomyces and Candida. The favored post-WGD yeast species
include: S. cerevisiae, S. uvarum, S. bayanus, S. paradoxus, S.
castellii, and C. glabrata.
[0192] In another embodiment, the yeast microorganism may be
selected from a pre-whole genome duplication (pre-WGD) yeast genus
including but not limited to Saccharomyces, Kluyveromyces, Candida,
Pichia, Issatchenkia, Debaryomyces, Hansenula, Yarrowia, and
Schizosaccharomyces. Representative pre-WGD yeast species include:
S. kluyveri, K. thermotolerans, K. marxianus, K. waltii, K. lactis,
C. tropicalis, P. pastoris, P. anomala, P. stipitis, I. orientalis,
I. occidentalis, I. scutulata, D. hansenii, H. anomala, Y.
lipolytica, and S. pombe.
[0193] A yeast microorganism may be either Crabtree-negative or
Crabtree-positive as described in commonly owned and
co-pending application US 2009/0226991. In one embodiment the yeast
microorganism may be selected from yeast with a Crabtree-negative
phenotype including but not limited to the following genera:
Kluyveromyces, Pichia, Issatchenkia, Hansenula, and Candida.
Crabtree-negative species include but are not limited to: K.
lactis, K. marxianus, P. anomala, P. stipitis, I. orientalis, I.
occidentalis, I. scutulata, H. anomala, and C. utilis. In another
embodiment, the yeast microorganism may be selected from a yeast
with a Crabtree-positive phenotype, including but not limited to
Saccharomyces, Kluyveromyces, Zygosaccharomyces, Debaryomyces,
Pichia and Schizosaccharomyces. Crabtree-positive yeast species
include but are not limited to: S. cerevisiae, S. uvarum, S.
bayanus, S. paradoxus, S. castellii, S. kluyveri, K. thermotolerans,
C. glabrata, Z. bailii, Z. rouxii, D. hansenii, P. pastoris, and
S. pombe.
[0194] Another characteristic may include the property that the
microorganism is non-fermenting. In other words, it
cannot metabolize a carbon source anaerobically, although it is
able to metabolize a carbon source in the presence of oxygen.
Nonfermenting yeast refers to both naturally occurring yeasts as
well as genetically modified yeast. During anaerobic fermentation
with fermentative yeast, the main pathway to oxidize the NADH from
glycolysis is through the production of ethanol. Ethanol is
produced by alcohol dehydrogenase (ADH) via the reduction of
acetaldehyde, which is generated from pyruvate by pyruvate
decarboxylase (PDC). In one embodiment, a fermentative yeast can be
engineered to be non-fermentative by the reduction or elimination
of the native PDC activity. Thus, most of the pyruvate produced by
glycolysis is not consumed by PDC and is available for the
isobutanol pathway. Deletion of this pathway increases the pyruvate
and the reducing equivalents available for the isobutanol pathway.
Fermentative pathways contribute to low yield and low productivity
of isobutanol. Accordingly, deletion of PDC may increase yield and
productivity of isobutanol.
[0195] In some embodiments, the recombinant microorganisms may be
microorganisms that are non-fermenting yeast microorganisms,
including, but not limited to, those classified into a genus
selected from the group consisting of Trichosporon, Rhodotorula, and
Myxozyma.
[0196] In one embodiment, a yeast microorganism is engineered to
convert a carbon source, such as glucose, to pyruvate by glycolysis
and the pyruvate is converted to isobutanol via an engineered
isobutanol pathway (See, e.g., WO/2007/050671, WO/2008/098227, and
Atsumi et al., 2008, Nature 451: 86-9). Alternative pathways for the
production of isobutanol have been described in WO/2007/050671 and
in Dickinson et al., 1998, J Biol Chem 273:25751-6.
[0197] Accordingly, the engineered isobutanol pathway to convert
pyruvate to isobutanol can be comprised of the following
reactions:
[0198] 1. 2 pyruvate → acetolactate + CO2
[0199] 2. acetolactate + NAD(P)H → 2,3-dihydroxyisovalerate + NAD(P)+
[0200] 3. 2,3-dihydroxyisovalerate → alpha-ketoisovalerate
[0201] 4. alpha-ketoisovalerate → isobutyraldehyde + CO2
[0202] 5. isobutyraldehyde + NAD(P)H → isobutanol + NAD(P)+
[0203] These reactions are carried out by the enzymes 1)
Acetolactate Synthase (ALS), 2) Keto-acid Reducto-Isomerase (KARI),
3) Dihydroxy-acid dehydratase (DHAD), 4) Keto-isovalerate
decarboxylase (KIVD), and 5) an Alcohol dehydrogenase (ADH) (FIG.
1). In another embodiment, the yeast microorganism is engineered to
overexpress these enzymes. For example, these enzymes can be
encoded by native genes. Alternatively, these enzymes can be
encoded by heterologous genes. For example, ALS can be encoded by
the alsS gene of B. subtilis, alsS of L. lactis, or the ilvK gene
of K. pneumoniae. For example, KARI can be encoded by the ilvC genes
of E. coli, C. glutamicum, M. maripaludis, or Piromyces sp E2. For
example, DHAD can be encoded by the ilvD genes of E. coli, C.
glutamicum, or L. lactis. For example, KIVD can be encoded by the
kivD gene of L. lactis. ADH can be encoded by ADH2, ADH6, or ADH7
of S. cerevisiae.
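The five reactions above can be summed to give the overall conversion of pyruvate to isobutanol. The following is a minimal illustrative sketch, not part of the disclosed methods. The names PATHWAY and net_reaction are hypothetical, and water and protons are omitted, as they are in the reaction list above.

```python
from collections import Counter

# Each step maps species -> stoichiometric coefficient
# (negative = consumed, positive = produced), following reactions 1-5.
# "NAD(P)H" / "NAD(P)+" stand for either the NADH or the NADPH form.
# Water and protons are left out, as in the reaction list itself.
PATHWAY = [
    ("ALS", {"pyruvate": -2, "acetolactate": 1, "CO2": 1}),
    ("KARI", {"acetolactate": -1, "NAD(P)H": -1,
              "2,3-dihydroxyisovalerate": 1, "NAD(P)+": 1}),
    ("DHAD", {"2,3-dihydroxyisovalerate": -1, "alpha-ketoisovalerate": 1}),
    ("KIVD", {"alpha-ketoisovalerate": -1, "isobutyraldehyde": 1, "CO2": 1}),
    ("ADH", {"isobutyraldehyde": -1, "NAD(P)H": -1,
             "isobutanol": 1, "NAD(P)+": 1}),
]


def net_reaction(steps):
    """Sum the per-step coefficients and drop intermediates that cancel out."""
    total = Counter()
    for _enzyme, stoich in steps:
        for species, coeff in stoich.items():
            total[species] += coeff
    return {species: coeff for species, coeff in total.items() if coeff != 0}


net = net_reaction(PATHWAY)
for species, coeff in sorted(net.items(), key=lambda item: item[1]):
    print(f"{coeff:+d} {species}")
```

Under these assumptions the intermediates cancel, leaving two pyruvate and two NAD(P)H consumed per isobutanol formed, with two CO2 released.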
[0204] In one embodiment, pathway steps 2 and 5 may be carried out
by KARI and ADH enzymes that utilize NADH (rather than NADPH) as a
co-factor. Such enzymes are described in commonly owned and
co-pending applications U.S. Ser. No. 12/610,784 and PCT/US09/62952
(published as WO/2010/051527), which are herein incorporated by
reference in their entireties for all purposes. The present
inventors have found that utilization of NADH-dependent KARI and
ADH enzymes to catalyze pathway steps 2 and 5, respectively,
surprisingly enables production of isobutanol under anaerobic
conditions. Thus, in one embodiment, the recombinant microorganisms
of the present invention may use an NADH-dependent KARI to catalyze
the conversion of acetolactate (+NADH) to produce
2,3-dihydroxyisovalerate. In another embodiment, the recombinant
microorganisms of the present invention may use an NADH-dependent
ADH to catalyze the conversion of isobutyraldehyde (+NADH) to
produce isobutanol. In yet another embodiment, the recombinant
microorganisms of the present invention may use both an
NADH-dependent KARI to catalyze the conversion of acetolactate
(+NADH) to produce 2,3-dihydroxyisovalerate, and an NADH-dependent
ADH to catalyze the conversion of isobutyraldehyde (+NADH) to
produce isobutanol.
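One way to picture the cofactor argument is a simple per-glucose NADH balance. The sketch below is illustrative only. It assumes textbook glycolysis (one glucose yields two pyruvate and two NADH), which is not stated in the paragraph above, and the function names are hypothetical.

```python
# Assumption (textbook glycolysis, not taken from the text above):
# 1 glucose -> 2 pyruvate + 2 NADH, and 2 pyruvate -> 1 isobutanol.
GLYCOLYSIS_NADH_PER_GLUCOSE = 2


def cofactor_demand(kari_cofactor, adh_cofactor):
    """Reduced cofactor consumed per isobutanol by pathway steps 2 and 5."""
    demand = {"NADH": 0, "NADPH": 0}
    for cofactor in (kari_cofactor, adh_cofactor):
        if cofactor not in demand:
            raise ValueError(f"unknown cofactor: {cofactor!r}")
        demand[cofactor] += 1
    return demand


def nadh_surplus(kari_cofactor, adh_cofactor):
    """NADH per glucose that the isobutanol pathway leaves unoxidized."""
    demand = cofactor_demand(kari_cofactor, adh_cofactor)
    return GLYCOLYSIS_NADH_PER_GLUCOSE - demand["NADH"]


for kari, adh in [("NADPH", "NADPH"), ("NADH", "NADPH"), ("NADH", "NADH")]:
    print(f"KARI={kari:5s} ADH={adh:5s} surplus NADH per glucose: "
          f"{nadh_surplus(kari, adh)}")
```

In this simplified model, only the combination of an NADH-dependent KARI and an NADH-dependent ADH leaves no surplus NADH, which is consistent with the observation that such enzymes enable anaerobic production. The model ignores biomass formation and every other cofactor sink.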
[0205] The yeast microorganism of the invention may be engineered
to have increased ability to convert pyruvate to isobutanol. In one
embodiment, the yeast microorganism may be engineered to have
increased ability to convert pyruvate to isobutyraldehyde. In
another embodiment, the yeast microorganism may be engineered to
have increased ability to convert pyruvate to keto-isovalerate. In
another embodiment, the yeast microorganism may be engineered to
have increased ability to convert pyruvate to
2,3-dihydroxyisovalerate. In another embodiment, the yeast
microorganism may be engineered to have increased ability to
convert pyruvate to acetolactate.
[0206] Furthermore, any of the genes encoding the foregoing enzymes
(or any others mentioned herein (or any of the regulatory elements
that control or modulate expression thereof)) may be optimized by
genetic/protein engineering techniques, such as directed evolution
or rational mutagenesis, which are known to those of ordinary skill
in the art. Such action allows those of ordinary skill in the art
to optimize the enzymes for expression and activity in yeast.
[0207] In addition, genes encoding these enzymes can be identified
from other fungal and bacterial species and can be expressed for
the modulation of this pathway. A variety of organisms could serve
as sources for these enzymes, including, but not limited to,
Saccharomyces spp., including S. cerevisiae and S. uvarum,
Kluyveromyces spp., including K. thermotolerans, K. lactis, and K.
marxianus, Pichia spp., Hansenula spp., including H. polymorpha,
Candida spp., Trichosporon spp., Yamadazyma spp., including Y.
stipitis, Torulaspora pretoriensis, Schizosaccharomyces spp.,
including S. pombe, Cryptococcus spp., Aspergillus spp., Neurospora
spp., or Ustilago spp. Sources of genes from anaerobic fungi
include, but are not limited to, Piromyces spp., Orpinomyces spp., or
Neocallimastix spp. Sources of prokaryotic enzymes that are useful
include, but are not limited to, Escherichia coli, Zymomonas mobilis,
Staphylococcus aureus, Bacillus spp., Clostridium spp.,
Corynebacterium spp., Pseudomonas spp., Lactococcus spp.,
Enterobacter spp., and Salmonella spp.
Methods in General
Identification of PDC and GPD in a Yeast Microorganism
[0208] Any method can be used to identify genes that encode for
enzymes with pyruvate decarboxylase (PDC) activity or
glycerol-3-phosphate dehydrogenase (GPD) activity. Suitable methods
for the identification of PDC and GPD are described in co-pending
applications U.S. Ser. No. 12/343,375 (published as US
2009/0226991), U.S. Ser. No. 12/696,645, and U.S. Ser. No.
12/820,505, which claim priority to U.S. Provisional Application
61/016,483, all of which are herein incorporated by reference in
their entireties for all purposes.
Genetic Insertions and Deletions
[0209] Any method can be used to introduce a nucleic acid molecule
into yeast and many such methods are well known. For example,
transformation and electroporation are common methods for
introducing nucleic acid into yeast cells. See, e.g., Gietz et al.,
1992, Nuc Acids Res. 27: 69-74; Ito et al., 1983, J. Bacteriol.
153: 163-8; and Becker et al., 1991, Methods in Enzymology 194:
182-7.
[0210] In an embodiment, the integration of a gene of interest into
a DNA fragment or target gene of a yeast microorganism occurs
according to the principle of homologous recombination. According
to this embodiment, an integration cassette containing a module
comprising at least one yeast marker gene and/or the gene to be
integrated (internal module) is flanked on either side by DNA
fragments homologous to those of the ends of the targeted
integration site (recombinogenic sequences). After transforming the
yeast with the cassette by appropriate methods, a homologous
recombination between the recombinogenic sequences may result in
the internal module replacing the chromosomal region in between the
two sites of the genome corresponding to the recombinogenic
sequences of the integration cassette. (Orr-Weaver et al., 1981,
PNAS USA 78: 6354-58).
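The cassette layout described above (recombinogenic sequence, internal module, recombinogenic sequence) can be pictured with a toy string model. This is a hedged sketch for illustration only. The sequences are invented, the function name is hypothetical, and real recombination depends on factors such as homology length and efficiency that the model ignores.

```python
def integrate_by_homologous_recombination(chromosome, left_arm,
                                          internal_module, right_arm):
    """Replace the region between two homology arms with the internal module.

    Toy string model: the chromosome is returned unchanged when either arm
    is missing or the arms do not occur in left-then-right order.
    """
    left_start = chromosome.find(left_arm)
    if left_start == -1:
        return chromosome
    left_end = left_start + len(left_arm)
    right_start = chromosome.find(right_arm, left_end)
    if right_start == -1:
        return chromosome
    return chromosome[:left_end] + internal_module + chromosome[right_start:]


# Invented example sequences (not from the disclosure).
LEFT_ARM = "GATTACA"
RIGHT_ARM = "TTGACCA"
TARGET_REGION = "CCCCCCCC"      # chromosomal region to be replaced
INTERNAL_MODULE = "ATGCATGC"    # stands in for marker gene plus gene of interest

chromosome = "AAAA" + LEFT_ARM + TARGET_REGION + RIGHT_ARM + "GGGG"
edited = integrate_by_homologous_recombination(
    chromosome, LEFT_ARM, INTERNAL_MODULE, RIGHT_ARM)
print(chromosome)
print(edited)
```

The arms remain in place on both sides of the insertion, and only the chromosomal region between them is exchanged for the internal module.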
[0211] In an embodiment, the integration cassette for integration
of a gene of interest into a yeast microorganism includes the
heterologous gene under the control of an appropriate promoter and
terminator together with the selectable marker flanked by
recombinogenic sequences for integration of a heterologous gene
into the yeast chromosome. In an embodiment, the heterologous gene
includes an appropriate native gene when it is desired to increase the copy
number of that native gene. The selectable marker gene can be any
marker gene used in yeast, including but not limited to, HIS3,
TRP1, LEU2, URA3, bar, ble, hph, and kan. The recombinogenic
sequences can be chosen at will, depending on the desired
integration site suitable for the desired application.
[0212] In another embodiment, integration of a gene into the
chromosome of the yeast microorganism may occur via random
integration (Kooistra et al., 2004, Yeast 21: 781-792).
[0213] Additionally, in an embodiment, certain introduced marker
genes are removed from the genome using techniques well known to
those skilled in the art. For example, URA3 marker loss can be
obtained by plating URA3 containing cells in FOA (5-fluoro-orotic
acid) containing medium and selecting for FOA resistant colonies
(Boeke et al., 1984, Mol. Gen. Genet 197: 345-47).
[0214] The exogenous nucleic acid molecule contained within a yeast
cell of the disclosure can be maintained within that cell in any
form. For example, exogenous nucleic acid molecules can be
integrated into the genome of the cell or maintained in an episomal
state that can stably be passed on ("inherited") to daughter cells.
Such extra-chromosomal genetic elements (such as plasmids, etc.)
can additionally contain selection markers that ensure the presence
of such genetic elements in daughter cells. Moreover, the yeast
cells can be stably or transiently transformed. In addition, the
yeast cells described herein can contain a single copy, or multiple
copies of a particular exogenous nucleic acid molecule as described
above.
Reduction of Enzymatic Activity
[0215] Yeast microorganisms within the scope of the invention may
have reduced enzymatic activity such as reduced
glycerol-3-phosphate dehydrogenase activity. The term "reduced" as
used herein with respect to a particular enzymatic activity refers
to a lower level of enzymatic activity than that measured in a
comparable yeast cell of the same species. The term "reduced" also
refers to the elimination of the enzymatic activity that is measured
in a comparable yeast cell of the same species. Thus, yeast cells
lacking glycerol-3-phosphate dehydrogenase activity are considered
to have reduced glycerol-3-phosphate dehydrogenase activity since
most, if not all, comparable yeast strains have at least some
glycerol-3-phosphate dehydrogenase activity. Such reduced enzymatic
activities can be the result of lower enzyme concentration, lower
specific activity of an enzyme, or a combination thereof. Many
different methods can be used to make yeast having reduced
enzymatic activity. For example, a yeast cell can be engineered to
have a disrupted enzyme-encoding locus using common mutagenesis or
knock-out technology. In addition, certain point-mutation(s) can be
introduced which result in an enzyme with reduced activity.
[0216] Alternatively, antisense technology can be used to reduce
enzymatic activity. For example, yeast can be engineered to contain
a cDNA that encodes an antisense molecule that prevents an enzyme
from being made. The term "antisense molecule" as used herein
encompasses any nucleic acid molecule that contains sequences that
correspond to the coding strand of an endogenous polypeptide. An
antisense molecule also can have flanking sequences (e.g.,
regulatory sequences). Thus antisense molecules can be ribozymes or
antisense oligonucleotides. A ribozyme can have any general
structure including, without limitation, hairpin, hammerhead, or
axhead structures, provided the molecule cleaves RNA.
[0217] Yeast having a reduced enzymatic activity can be identified
using many methods. For example, yeast having reduced
glycerol-3-phosphate dehydrogenase activity can be easily
identified using common methods, which may include, for example,
measuring glycerol formation via liquid chromatography.
Overexpression of Heterologous Genes
[0218] Methods for overexpressing a polypeptide from a native or
heterologous nucleic acid molecule are well known. Such methods
include, without limitation, constructing a nucleic acid sequence
such that a regulatory element promotes the expression of a nucleic
acid sequence that encodes the desired polypeptide. Typically,
regulatory elements are DNA sequences that regulate the expression
of other DNA sequences at the level of transcription. Thus,
regulatory elements include, without limitation, promoters,
enhancers, and the like. For example, the exogenous genes can be
under the control of an inducible promoter or a constitutive
promoter. Moreover, methods for expressing a polypeptide from an
exogenous nucleic acid molecule in yeast are well known. For
example, nucleic acid constructs that are used for the expression
of exogenous polypeptides within Kluyveromyces and Saccharomyces
are well known (see, e.g., U.S. Pat. Nos. 4,859,596 and 4,943,529,
for Kluyveromyces and, e.g., Gellissen et al., Gene 190(1):87-97
(1997) for Saccharomyces). Yeast plasmids have a selectable marker
and an origin of replication. In addition certain plasmids may also
contain a centromeric sequence. These centromeric plasmids are
generally a single or low copy plasmid. Plasmids without a
centromeric sequence and utilizing either a 2 micron (S.
cerevisiae) or 1.6 micron (K. lactis) replication origin are high
copy plasmids. The selectable marker can be either prototrophic,
such as HIS3, TRP1, LEU2, URA3 or ADE2, or antibiotic resistance,
such as, bar, ble, hph, or kan.
[0219] In another embodiment, heterologous control elements can be
used to activate or repress expression of endogenous genes.
Additionally, when expression is to be repressed or eliminated, the
gene for the relevant enzyme, protein or RNA can be eliminated by
known deletion techniques.
[0220] As described herein, any yeast within the scope of the
disclosure can be identified by selection techniques specific to
the particular enzyme being expressed, over-expressed or repressed.
Methods of identifying the strains with the desired phenotype are
well known to those skilled in the art. Such methods include,
without limitation, PCR, RT-PCR, and nucleic acid hybridization
techniques such as Northern and Southern analysis, altered growth
capabilities on a particular substrate or in the presence of a
particular substrate, a chemical compound, a selection agent and
the like. In some cases, immunohistochemistry and biochemical
techniques can be used to determine if a cell contains a particular
nucleic acid by detecting the expression of the encoded
polypeptide. For example, an antibody having specificity for an
encoded enzyme can be used to determine whether or not a particular
yeast cell contains that encoded enzyme. Further, biochemical
techniques can be used to determine if a cell contains a particular
nucleic acid molecule encoding an enzymatic polypeptide by
detecting a product produced as a result of the expression of the
enzymatic polypeptide. For example, transforming a cell with a
vector encoding acetolactate synthase and detecting increased
acetolactate concentrations compared to a cell without the vector
indicates both that the vector is present and that the gene product
is active. Methods for detecting specific enzymatic activities or
the presence of particular products are well known to those skilled
in the art. For example, the presence of acetolactate can be
determined as described by Hugenholtz and Starrenburg, 1992, Appl.
Micro. Biot. 38:17-22.
Increase of Enzymatic Activity
[0221] Yeast microorganisms of the invention may be further
engineered to have increased activity of enzymes. The term
"increased" as used herein with respect to a particular enzymatic
activity refers to a higher level of enzymatic activity than that
measured in a comparable yeast cell of the same species. For
example, overexpression of a specific enzyme can lead to an
increased level of activity in the cells for that enzyme. Increased
activities for enzymes involved in glycolysis or the isobutanol
pathway would result in increased productivity and yield of
isobutanol.
[0222] Methods to increase enzymatic activity are known to those
skilled in the art. Such techniques may include increasing the
expression of the enzyme by increased copy number and/or use of a
strong promoter, introduction of mutations to relieve negative
regulation of the enzyme, introduction of specific mutations to
increase specific activity and/or decrease the Km for the
substrate, or by directed evolution. See, e.g., Methods in
Molecular Biology (vol. 231), ed. Arnold and Georgiou, Humana Press
(2003).
Microorganism Characterized by Producing Isobutanol at High
Yield
[0223] For a biocatalyst to produce isobutanol most economically,
it is desired to produce a high yield. Preferably, the only product
produced is isobutanol. Extra products lead to a reduction in
product yield and an increase in capital and operating costs,
particularly if the extra products have little or no value. Extra
products also require additional capital and operating costs to
separate these products from isobutanol.
[0224] The microorganism may convert one or more carbon sources
derived from biomass into isobutanol with a yield of greater than
5% of theoretical. In one embodiment, the yield is greater than
10% of theoretical. In one embodiment, the yield is greater than 50% of
theoretical. In one embodiment, the yield is greater than 60% of
theoretical. In another embodiment, the yield is greater than 70%
of theoretical. In yet another embodiment, the yield is greater
than 80% of theoretical. In yet another embodiment, the yield is
greater than 85% of theoretical. In yet another embodiment, the
yield is greater than 90% of theoretical. In yet another
embodiment, the yield is greater than 95% of theoretical. In still
another embodiment, the yield is greater than 97.5% of
theoretical.
[0225] More specifically, the microorganism converts glucose, which
can be derived from biomass, into isobutanol with a yield of greater
than 5% of theoretical. In one embodiment, the yield is greater
than 10% of theoretical. In one embodiment, the yield is greater
than 50% of theoretical. In one embodiment the yield is greater
than 60% of theoretical. In another embodiment, the yield is
greater than 70% of theoretical. In yet another embodiment, the
yield is greater than 80% of theoretical. In yet another
embodiment, the yield is greater than 85% of theoretical. In yet
another embodiment the yield is greater than 90% of theoretical. In
yet another embodiment, the yield is greater than 95% of
theoretical. In still another embodiment, the yield is greater than
97.5% of theoretical.
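As an illustration of how a yield may be expressed as a percentage of theoretical, the sketch below assumes a maximum of one mole of isobutanol per mole of glucose (one glucose to two pyruvate, two pyruvate to one isobutanol). The molar masses and the function name are assumptions introduced for this example and are not taken from the disclosure.

```python
# Assumed molar masses in g/mol (standard values, not given in the text).
MW_GLUCOSE = 180.16
MW_ISOBUTANOL = 74.12

# Assumed stoichiometric ceiling: 1 glucose -> 2 pyruvate -> 1 isobutanol.
MAX_MOL_PER_MOL = 1.0
MAX_G_PER_G = MAX_MOL_PER_MOL * MW_ISOBUTANOL / MW_GLUCOSE  # about 0.41 g/g


def percent_of_theoretical(isobutanol_g, glucose_consumed_g):
    """Observed mass yield as a percentage of the assumed theoretical maximum."""
    if glucose_consumed_g <= 0:
        raise ValueError("glucose consumed must be positive")
    observed_g_per_g = isobutanol_g / glucose_consumed_g
    return 100.0 * observed_g_per_g / MAX_G_PER_G


# Hypothetical fermentation: 30 g isobutanol from 100 g glucose consumed.
print(f"theoretical maximum: {MAX_G_PER_G:.3f} g/g")
print(f"yield: {percent_of_theoretical(30.0, 100.0):.1f}% of theoretical")
```

Under these assumptions, a yield of greater than 50% of theoretical corresponds to more than roughly 0.21 g of isobutanol per gram of glucose consumed.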
Microorganism Characterized by Production of Isobutanol from
Pyruvate Via an Overexpressed Isobutanol Pathway and a Pdc-Minus
and Gpd-Minus Phenotype
[0226] In yeast, the conversion of pyruvate to acetaldehyde is a
major drain on the pyruvate pool, and, hence, a major source of
competition with the isobutanol pathway. This reaction is catalyzed
by the pyruvate decarboxylase (PDC) enzyme. Reduction of this
enzymatic activity in the yeast microorganism results in an
increased availability of pyruvate and reducing equivalents to the
isobutanol pathway and may improve isobutanol production and yield
in a yeast microorganism that expresses a pyruvate-dependent
isobutanol pathway.
[0227] Reduction of PDC activity can be accomplished by 1) mutation
or deletion of a positive transcriptional regulator for the
structural genes encoding for PDC or 2) mutation or deletion of all
PDC genes in a given organism. The term "transcriptional regulator"
can specify a protein or nucleic acid that works in trans to
increase or to decrease the transcription of a different locus in
the genome. For example, in S. cerevisiae, the PDC2 gene, which
encodes for a positive transcriptional regulator of the PDC1, PDC5, and
PDC6 genes, can be deleted; an S. cerevisiae strain in which the PDC2 gene
is deleted is reported to have only approximately 10% of wildtype PDC activity
(Hohmann, 1993, Mol Gen Genet 241:657-66). Alternatively, for
example, all structural genes for PDC (e.g. in S. cerevisiae, PDC1,
PDC5, and PDC6, or in K. lactis, PDC1) are deleted.
[0228] Crabtree-positive yeast strains, such as an S. cerevisiae strain
that contains disruptions in all three of the PDC alleles, no longer
produce ethanol by fermentation. However, a downstream product of
the reaction catalyzed by PDC, acetyl-CoA, is needed for anabolic
production of necessary molecules. Therefore, the Pdc- mutant is
unable to grow solely on glucose, and requires a two-carbon carbon
source, either ethanol or acetate, to synthesize acetyl-CoA
(Flikweert et al., 1999, FEMS Microbiol Lett. 174: 73-9; and van
Maris et al., 2004, Appl Environ Microbiol. 70: 159-66).
[0229] Thus, in an embodiment, such a Crabtree-positive yeast
strain may be evolved to generate variants of the PDC mutant yeast
that do not require a two-carbon molecule and have
a growth rate similar to wild type on glucose. Any method,
including chemostat evolution or serial dilution may be utilized to
generate variants of strains with deletion of three PDC alleles
that can grow on glucose as the sole carbon source at a rate
similar to wild type (van Maris et al., 2004, Appl Envir Micro 70:
159-66).
[0230] Another byproduct that would decrease yield of isobutanol is
glycerol. Glycerol is produced by 1) the reduction of the
glycolysis intermediate, dihydroxyacetone phosphate (DHAP), to
glycerol-3-phosphate (G3P) via the oxidation of NADH to NAD+
by Glycerol-3-phosphate dehydrogenase (GPD) followed by 2) the
dephosphorylation of glycerol-3-phosphate to glycerol by
glycerol-3-phosphatase (GPP). Production of glycerol results in
loss of carbons as well as reducing equivalents. Reduction of GPD
activity would increase yield of isobutanol. Reduction of GPD
activity in addition to PDC activity would further increase yield
of isobutanol. Reduction of glycerol production has been reported
to increase yield of ethanol production (Nissen et al., 2000, Yeast
16, 463-74; Nevoigt et al., Method of modifying a yeast cell for
the production of ethanol, WO/2009/056984). Disruption of this
pathway has also been reported to increase yield of lactate in a
yeast engineered to produce lactate instead of ethanol (Dundon et
al., Yeast cells having disrupted pathway from dihydroxyacetone
phosphate to glycerol, US 2009/0053782).
[0231] In one embodiment, the microorganism is a Crabtree-positive
yeast with reduced or no GPD activity. In another embodiment, the
microorganism is a Crabtree-positive yeast with reduced or no GPD
activity, and expresses an isobutanol biosynthetic pathway and
produces isobutanol. In yet another embodiment, the microorganism
is a Crabtree-positive yeast with reduced or no GPD activity and
with reduced or no PDC activity. In another embodiment, the
microorganism is a Crabtree-positive yeast with reduced or no GPD
activity, with reduced or no PDC activity, and expresses an
isobutanol biosynthetic pathway and produces isobutanol.
[0232] In another embodiment, the microorganism is a
Crabtree-negative yeast with reduced or no GPD activity. In another
embodiment, the microorganism is a Crabtree-negative yeast with
reduced or no GPD activity, expresses the isobutanol biosynthetic
pathway and produces isobutanol. In yet another embodiment, the
microorganism is a Crabtree-negative yeast with reduced or no GPD
activity and with reduced or no PDC activity. In another
embodiment, the microorganism is a Crabtree-negative yeast with
reduced or no GPD activity, with reduced or no PDC activity,
expresses an isobutanol biosynthetic pathway and produces
isobutanol.
[0233] PDC-minus/GPD-minus yeast production strains are described
in co-pending applications U.S. Ser. No. 12/343,375 (published as
US 2009/0226991), U.S. Ser. No. 12/696,645, and U.S. Ser. No.
12/820,505, which claim priority to U.S. Provisional Application
61/016,483, all of which are herein incorporated by reference in
their entireties for all purposes.
Method of Using Microorganism for High-Yield Isobutanol
Fermentation
[0234] In a method to produce isobutanol from a carbon source at
high yield, the yeast microorganism is cultured in an appropriate
culture medium containing a carbon source.
[0235] Another exemplary embodiment provides a method for producing
isobutanol comprising culturing a recombinant yeast microorganism of the
invention in a suitable culture medium containing a carbon source
that can be converted to isobutanol by the yeast microorganism of
the invention.
[0236] In certain embodiments, the method further includes
isolating isobutanol from the culture medium. For example,
isobutanol may be isolated from the culture medium by any method
known to those skilled in the art, such as distillation,
pervaporation, or liquid-liquid extraction, including methods
disclosed in co-pending applications U.S. Ser. No. 12/342,992
(published as US 2009/0171129) and PCT/US08/88187 (published as
WO/2009/086391), which are herein incorporated by reference in
their entireties for all purposes.
[0237] This invention is further illustrated by the following
examples that should not be construed as limiting. The contents of
all references, patents, and published patent applications cited
throughout this application, as well as the Figure and the Sequence
Listing, are incorporated herein by reference for all purposes.
EXAMPLES
General Methods
TABLE-US-00001 [0238] TABLE 1 Amino acid sequences disclosed herein.
SEQ ID NO | Protein, Accession No.
1 | E. coli IlvC, NP_418222
2 | S. cerevisiae Ilv5, NP_013459
3 | Oryza sativa KARI, NP_001056384
4 | Methanococcus maripaludis KARI, YP_001097443
5 | Acidiphilium cryptum KARI, YP_001235669
6 | Chlamydomonas reinhardtii KARI, XP_001702649
7 | Picrophilus torridus KARI, YP_023851
8 | Zymomonas mobilis KARI, YP_162876
9 | c-myc epitope tag
10 | Thermotoga petrophila RKU-1 dihydroxyacid dehydratase (DHAD), YP_001243973.1
11 | Victivallis vadensis ATCC BAA-548 dihydroxyacid dehydratase (DHAD), ZP_01924101.1
12 | Termite group 1 bacterium phylotype Rs-D17 dihydroxyacid dehydratase (DHAD), YP_001956631.1
13 | Yarrowia lipolytica dihydroxyacid dehydratase (DHAD), XP_502180.2
14 | Francisella tularensis subsp. tularensis WY96-3418 dihydroxyacid dehydratase (DHAD), YP_001122023.1
15 | Arabidopsis thaliana dihydroxyacid dehydratase (DHAD), AAK64025.1
16 | Candidatus Koribacter versatilis Ellin345 dihydroxyacid dehydratase (DHAD), YP_592184.1 (Acidobacter)
17 | Gramella forsetii KT0803 dihydroxyacid dehydratase (DHAD), YP_862145.1
18 | Lactococcus lactis subsp. lactis Il1403 dihydroxyacid dehydratase (DHAD), NP_267379.1
19 | Saccharopolyspora erythraea NRRL 2338 dihydroxyacid dehydratase (DHAD), YP_001103528.2
20 | Saccharomyces cerevisiae Ilv3, NP_012550.1
21 | Piromyces sp E2 ilvD
22 | Ralstonia eutropha JMP134 ilvD, YP_298150.1
23 | Chromohalobacter salexigens ilvD, YP_573197.1
24 | Picrophilus torridus DSM9790 ilvD, YP_024215.1
25 | Sulfolobus tokodaii str. 7 dihydroxyacid dehydratase (DHAD), NP_378168.1
26 | Saccharomyces cerevisiae Ilv3.DELTA.N
27 | P(I/L)XXXGX(I/L)XIL (conserved motif described in Example 17)
28 | PIKXXGX(I/L)XIL (conserved motif described in Example 17)
TABLE-US-00002 TABLE 2 Nucleic acid sequences disclosed herein.
SEQ ID NO | Gene, Accession No.
87 | Lactococcus lactis subsp. lactis Il1403 (Ll_ilvD)
88 | Saccharomyces cerevisiae ILV3 (ScILV3(FL))
89 | Saccharomyces cerevisiae ILV3.DELTA.N (ScILV3.DELTA.N)
90 | Gramella forsetii KT0803 (Gf_ilvD)
91 | Saccharopolyspora erythraea NRRL 2338 (Se_ilvD)
92 | Candidatus Koribacter versatilis Ellin345 ilvD (Acidobacter)
93 | Piromyces sp E2 ilvD (Piromyces ilvD)
94 | Ralstonia eutropha JMP134 ilvD, (Re_ilvD)
95 | Chromohalobacter salexigens ilvD, (Cs_ilvD)
96 | Picrophilus torridus DSM9790 ilvD, (Pt_ilvD)
97 | Sulfolobus tokodaii str. 7 ilvD, (St_ilvD)
98 | E. coli ilvC.sup.Q110V, (Ec_ilvC(Q110V))
99 | Lactococcus lactis kivD, (Ll_kivD)
100 | S. cerevisiae ILV5, (ScILV5)
[0239] Determination of Optical Density.
[0240] The optical density of the yeast cultures is determined at
600 nm using a DU 800 spectrophotometer (Beckman-Coulter,
Fullerton, Calif., USA). Samples are diluted as necessary to yield
an optical density of between 0.1 and 0.8.
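The dilution step can be illustrated with a minimal Python sketch that back-calculates the culture density from a diluted reading and rejects readings outside the 0.1 to 0.8 window. The function name, its parameters, and the error behavior are assumptions made for this illustration and are not part of the described method.

```python
def culture_od600(diluted_reading, dilution_factor, low=0.1, high=0.8):
    """Return the OD600 of the undiluted culture.

    diluted_reading: OD600 measured on the diluted sample.
    dilution_factor: total volume / sample volume (1 for an undiluted sample).
    low, high: accepted window for the diluted reading.
    """
    if dilution_factor < 1:
        raise ValueError("dilution_factor must be >= 1")
    if not low <= diluted_reading <= high:
        raise ValueError("reading outside the accepted window; adjust the dilution")
    return diluted_reading * dilution_factor


# A 1:20 dilution that reads 0.25 corresponds to a culture OD600 of 5.0.
print(culture_od600(0.25, 20))
```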
[0241] Gas Chromatography.
[0242] Analysis of volatile organic compounds, including ethanol
and isobutanol, was performed on a HP 5890 gas chromatograph fitted
with an HP 7673 Autosampler, a DB-FFAP column (J&W; 30 m
length, 0.32 mm ID, 0.25 .mu.m film thickness) or equivalent
connected to a flame ionization detector (FID). The temperature
program was as follows: 200.degree. C. for the injector,
300.degree. C. for the detector, 50.degree. C. oven for 1 minute,
31.degree. C./minute gradient to 140.degree. C., and then hold for
2.5 min. Analysis was performed using authentic standards (>99%,
obtained from Sigma-Aldrich), and a 5-point calibration curve with
1-pentanol as the internal standard.
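One common way to use a multi-point calibration with an internal standard is to fit the analyte/internal-standard peak-area ratio against the known standard concentrations and then invert the fitted line for unknown samples. The sketch below assumes a linear response; the function names and all numeric values are hypothetical and serve only to show the arithmetic, not the actual calibration used here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, internal_std_area, slope, intercept):
    """Convert a peak-area ratio to a concentration using the fitted line."""
    ratio = analyte_area / internal_std_area
    return (ratio - intercept) / slope


# Hypothetical 5-point calibration: concentrations (g/L) and
# analyte/1-pentanol peak-area ratios. These numbers are made up.
standard_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
standard_ratio = [0.125, 0.25, 0.5, 1.0, 2.0]

slope, intercept = fit_line(standard_conc, standard_ratio)
print(quantify(300.0, 400.0, slope, intercept))
```

Normalizing to the internal standard peak compensates for injection-to-injection variation, which is the usual reason for including 1-pentanol in every sample.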
[0243] High Performance Liquid Chromatography for Quantitative
Analysis of Glucose and Organic Acids.
[0244] Analysis of glucose and organic acids was performed on a
HP-1100 High Performance Liquid Chromatography system equipped with
an Aminex HPX-87H Ion Exclusion column (Bio-Rad, 300.times.7.8 mm)
or equivalent and an H.sup.+ cation guard column (Bio-Rad) or
equivalent. Organic acids were detected using an HP-1100 UV
detector (210 nm, 8 nm 360 nm reference) while glucose was detected
using an HP-1100 refractive index detector. The column temperature
was 60.degree. C. This method was isocratic with 0.008 N sulfuric
acid in water as the mobile phase. Flow was set at 1 mL/min.
Injection volume was 20 .mu.L and the run time was 30 minutes.
[0245] High Performance Liquid Chromatography for Quantitative
Analysis of Ketoisovalerate and Isobutyraldehyde.
[0246] Analysis of the DNPH derivatives of ketoisovalerate and
isobutyraldehyde was performed on a HP-1100 High Performance Liquid
Chromatography system equipped with a Hewlett Packard 1200 HPLC
stack column (Agilent Eclipse XDB-18, 150.times.4.0 mm; 5 .mu.m
particles [P/N #993967-902] and C18 Guard cartridge). The analytes
were detected using an HP-1100 UV detector at 360 nm. The column
temperature was 50.degree. C. This method was isocratic with 0.1%
H.sub.3PO.sub.4 and 70% acetonitrile in water as mobile phase. Flow
was set at 3 mL/min. Injection size was 10 .mu.L and the run time
was 2 minutes.
[0247] Molecular Biology and Bacterial Cell Culture.
[0248] Standard molecular biology methods for cloning and plasmid
construction are generally used, unless otherwise noted (Sambrook,
J., Russell, D. W. Molecular Cloning, A Laboratory Manual. 3 ed.
2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory
Press).
[0249] Standard recombinant DNA and molecular biology techniques
used in the Examples are well known in the art and are described by
Sambrook, J., Russell, D. W. Molecular Cloning, A Laboratory Manual.
3 ed. 2001, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory
Press; and by T. J. Silhavy, M. L. Berman, and L. W. Enquist,
Experiments with Gene Fusions, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N.Y. (1984) and by Ausubel, F. M. et al.,
Current Protocols in Molecular Biology, pub. by Greene Publishing
Assoc. and Wiley-Interscience (1987).
[0250] General materials and methods suitable for the routine
maintenance and growth of bacterial cultures are well known in the
art. Techniques suitable for use in the following examples may be
found as set out in Manual of Methods for General Bacteriology
(Phillipp Gerhardt, R. G. E. Murray, Ralph N. Costilow, Eugene W.
Nester, Willis A. Wood, Noel R. Krieg and G. Briggs Phillips,
eds.), American Society for Microbiology, Washington, D.C. (1994)
or by Thomas D. Brock in Biotechnology: A Textbook of Industrial
Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland,
Mass. (1989).
[0251] Yeast Transformations--S. cerevisiae.
[0252] S. cerevisiae strains were transformed by the Lithium
Acetate method (Gietz et al., Nucleic Acids Res. 27:69-74 (1992)):
Cells from 50 mL YPD cultures (YPGal for valine auxotrophs) were
collected by centrifugation (2700 rcf, 2 minutes, 25.degree. C.)
once the cultures reached an OD.sub.600 of 1.0. The cells were
washed with 50 mL sterile water and collected by
centrifugation at 2700 rcf for 2 minutes at 25.degree. C. The cells
were washed again with 25 mL sterile water and collected by
centrifugation at 2700 rcf for 2 minutes at 25.degree. C. The cells
were resuspended in 1 mL of 100 mM lithium acetate and transferred
to a 1.5 mL eppendorf tube. The cells were collected by
centrifugation for 20 sec at 18,000 rcf, 25.degree. C. The cells
were resuspended in a volume of 100 mM lithium acetate that
was approximately 4.times. the volume of the cell pellet. A mixture
of DNA (final volume of 15 .mu.l with sterile water), 72 .mu.l 50%
PEG, 10 .mu.l 1 M lithium acetate, and 3 .mu.l denatured salmon
sperm DNA was prepared for each transformation. In a 1.5 mL tube,
15 .mu.l of the cell suspension was added to the DNA mixture (85
.mu.l), and the transformation suspension was vortexed with 5 short
pulses. The transformation was incubated for 30 minutes at
30.degree. C., followed by incubation for 22 minutes at 42.degree.
C. The cells were collected by centrifugation for 20 sec at 18,000
rcf, 25.degree. C. The cells were resuspended in 100 .mu.l SOS (1 M
sorbitol, 34% (v/v) YP (1% yeast extract, 2% peptone), 6.5 mM
CaCl.sub.2) or 100 .mu.l YP (1% yeast extract, 2% peptone) and
spread over an appropriate selective plate.
[0253] Yeast Transformations--K. lactis.
[0254] K. lactis cells were transformed according to a slightly
modified version of the protocol as described by Kooistra et al.,
Yeast 21: 781-792 (2004). Saturated overnight-grown cultures of K.
lactis cells were diluted 1:50 into 100 mL YPD and were placed in
30.degree. C. shaker (250 rpm) and grown for 4-5 hours until the
culture reached an OD.sub.600 of 0.3-0.5. Cells were collected by
centrifugation (2 min, 3000.times.g) and washed with 50 ml cold
sterile EB (electroporation buffer; 10 mM Tris-HCl, pH 7.5, 270 mM
sucrose, 1 mM MgCl.sub.2) at 4.degree. C. Cells were resuspended in
50 mL YPD that contained 25 mM DTT and 20 mM HEPES, pH 8.0. Cells
were transferred back into flasks used to grow cells and incubated
in 30.degree. C. incubator (without shaking) for 30 minutes. Cells
were then collected by centrifugation (2 minutes, 3000.times.g) and
washed with 10 mL ice-cold sterile EB, as above. Cells were then
resuspended using one cell pellet volume of ice-cold sterile EB.
Sixty microliters of cells were mixed with plasmid DNA and
incubated on ice for 15 minutes. For targeted integrations, or
transformation of linear DNA, approximately 200-400 ng of
non-specific, short (50-500 bp) linear DNA fragments were added to
300-400 ng of the linearized integrating DNA construct. This DNA
was either provided by gel-purified AluI-digested salmon sperm DNA,
or a mixture of annealed primers 35+36 (yielding a .about.85 bp
linear duplex fragment). Cells were transferred to a chilled
electroporation (2 mm) cuvette and pulsed using a BioRad Gene
Pulser at 1 kV, 400.OMEGA., and 25 .mu.F. The cell suspension was
immediately transferred to a 14 mL round-bottom Falcon tube with 1
mL room temperature YPD and allowed to incubate vertically at
30.degree. C., 225 RPM for 6-18 h. Cells were collected in a
1.7 mL tube by centrifugation for 10 seconds at maximum speed, and
resuspended with 150 .mu.L YPD before being spread onto appropriate
selection plates.
[0255] Yeast Colony PCR with FailSafe.TM. PCR System (EPICENTRE.RTM.
Biotechnologies, Madison, Wis.; Catalog #FS99250):
[0256] Cells from each colony
were added to 20 .mu.l of colony PCR mix (per reaction mix contains
6.8 .mu.l water, 1.5 .mu.l of each primer, 0.2 .mu.l of FailSafe
PCR Enzyme Mix and 10 .mu.l 2.times. FailSafe Master Mix). Unless
otherwise noted, 2.times. FailSafe Master Mix E was used. The PCR
reactions were incubated in a thermocycler using the following
touchdown PCR conditions: 1 cycle of 94.degree. C..times.2 min, 10
cycles of 94.degree. C..times.20 s, 63.degree.-54.degree.
C..times.20 s (decrease 1.degree. C. per cycle), 72.degree.
C..times.60 s, 40 cycles of 94.degree. C..times.20 s, 53.degree.
C..times.20 s, 72.degree. C..times.60 s and 1 cycle of 72.degree.
C..times.5 min.
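The reaction mix and the touchdown program described above can be written out as a short Python sketch. The per-reaction volumes and cycling steps are taken from the text; the function names, the labels "forward primer" and "reverse primer" (the text says only "each primer"), and the optional pipetting overage are assumptions added for illustration.

```python
# Per-reaction volumes in microliters, as listed in the text (total 20 uL).
PER_REACTION_UL = {
    "water": 6.8,
    "forward primer": 1.5,
    "reverse primer": 1.5,
    "FailSafe PCR Enzyme Mix": 0.2,
    "2x FailSafe Master Mix": 10.0,
}


def master_mix(n_reactions, overage=0.1):
    """Scale the per-reaction volumes; overage is an assumed pipetting allowance."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


def touchdown_program():
    """Return the cycling program as (label, temperature_C, seconds) steps."""
    steps = [("initial denaturation", 94, 120)]
    # 10 touchdown cycles: annealing at 63, 62, ..., 54 (1 degree lower per cycle)
    for anneal in range(63, 53, -1):
        steps += [("denature", 94, 20), ("anneal", anneal, 20), ("extend", 72, 60)]
    # 40 cycles at a fixed 53 degree annealing temperature
    for _ in range(40):
        steps += [("denature", 94, 20), ("anneal", 53, 20), ("extend", 72, 60)]
    steps.append(("final extension", 72, 300))
    return steps


program = touchdown_program()
hold_seconds = sum(seconds for _, _, seconds in program)
print(len(program), "steps,", hold_seconds, "s of hold time (ramping excluded)")
```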
[0257] Zymoclean Gel DNA Recovery Kit (Zymo Research, Orange,
Calif.; Catalog #D4002) Protocol:
[0258] DNA fragments were recovered from agarose gels according to
manufacturer's protocol.
[0259] Zymo Research DNA Clean and Concentrator Kit (Zymo Research,
Orange, Calif.; Catalog #D4004) Protocol:
[0260] DNA fragments were purified according to manufacturer's
protocol.
[0261] Preparation of Cell Lysates for In Vitro Enzyme Assays.
[0262] To grow cultures for cell lysates, triplicate independent
cultures of the desired strain were grown overnight in 3 mL of the
appropriate medium at 30.degree. C., 250 rpm. The following day,
the overnight cultures were diluted into 50 mL fresh medium in 250
mL baffle-bottomed Erlenmeyer flasks and incubated at 30.degree. C.
at 250 rpm. Cells were grown for at least 4 generations and the
cultures were harvested in mid log phase (OD.sub.600 of 1-3). The
cells of each culture were collected by centrifugation
(2700.times.g, 5 min, 4.degree. C.). The cell pellets were washed
by resuspending in 20 mL of ice cold water. The cells were
centrifuged at 2700.times.g, 4.degree. C. for 5 min. All
supernatant was removed from each tube and the tubes were frozen at
-80.degree. C. until use.
[0263] Lysates were prepared by thawing each cell pellet on ice and
preparing a 20% (w/v) cell suspension in lysis buffer. The lysis
buffer was varied for each enzyme assay and consisted of: 0.1 M
Tris-HCl pH 8.0, 5 mM MgSO.sub.4 for DHAD assays; 50 mM potassium
phosphate buffer pH 6.0, 1 mM MgSO.sub.4 for ALS assays; 250 mM
KPO.sub.4 pH 7.5, 10 mM MgCl.sub.2 for KARI assays; and 50 mM
NaHPO.sub.4, 5 mM MgCl.sub.2 for KIVD assays. 10 .mu.L of
Yeast/Fungal Protease Arrest solution (G Biosciences, catalog
#788-333) per 1 mL of lysis buffer was used. 800 microliters of
cell suspension were added to 1 mL of 0.5 mm glass beads that had
been placed in a chilled 1.5 mL tube. Cells were lysed by bead
beating (6 rounds, 1 minute per round, 30 beats per second) with 2
minutes chilling on ice in between rounds. The tubes were then
centrifuged (20,000.times.g, 15 min) to pellet debris and the
supernatants (cell lysates) were retained in fresh tubes on ice.
The protein concentration of each lysate was measured using the
BioRad Bradford protein assay reagent (BioRad, Hercules, Calif.)
according to manufacturer's instructions.
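The assay-specific lysis buffers and the volumes needed for a given pellet can be collected in a small lookup, sketched below in Python. The buffer compositions and the 10 .mu.L per mL inhibitor ratio come from the text; the function name is hypothetical, and the sketch assumes that a 20% (w/v) suspension means 0.2 g of wet cell pellet per mL of lysis buffer.

```python
# Lysis buffer per assay, as listed in the text.
LYSIS_BUFFERS = {
    "DHAD": "0.1 M Tris-HCl pH 8.0, 5 mM MgSO4",
    "ALS": "50 mM potassium phosphate buffer pH 6.0, 1 mM MgSO4",
    "KARI": "250 mM KPO4 pH 7.5, 10 mM MgCl2",
    "KIVD": "50 mM NaHPO4, 5 mM MgCl2",
}


def lysis_plan(assay, pellet_g, percent_w_v=20.0, inhibitor_ul_per_ml=10.0):
    """Return the buffer, buffer volume (mL), and protease inhibitor volume (uL).

    Assumption: percent (w/v) is grams of wet pellet per 100 mL of buffer.
    """
    buffer_ml = pellet_g * 100.0 / percent_w_v
    return {
        "buffer": LYSIS_BUFFERS[assay],
        "buffer_ml": buffer_ml,
        "protease_arrest_ul": buffer_ml * inhibitor_ul_per_ml,
    }


# A 0.5 g pellet for a KARI assay needs 2.5 mL buffer and 25 uL inhibitor.
print(lysis_plan("KARI", 0.5))
```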
[0264] Preparation of Fractionated Lysates from S. cerevisiae
Strains for In Vitro Enzyme Assays.
[0265] To grow cultures for fractionated cell lysates,
triplicate independent cultures of the desired strain were grown
overnight in 3 mL of the appropriate medium at 30.degree. C., 250
rpm. The following day, the overnight cultures were used to
inoculate 1 L cultures of each strain which were grown in the
appropriate medium at 30.degree. C. at 250 rpm until they reached
an OD.sub.600 of approximately 2. The cells were collected by
centrifugation (1600.times.g, 2 min) and the culture medium was
decanted. The cell pellets were resuspended in 50 mL sterile
deionized water, collected by centrifugation (1600.times.g, 2 min),
and the supernatant was discarded.
[0266] To obtain spheroplasts, the cell pellets were resuspended in
0.1 M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells were then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer (final
concentrations: 1.2 M sorbitol (Amresco, catalog
#0691), 20 mM potassium phosphate pH 7.4) and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(approximately 16 hrs) at 30.degree. C. with gentle agitation (60
rev/min) on an orbital shaker. The efficacy of spheroplasting was
ascertained by diluting an aliquot of each cell suspension 1:10 in
either sterile water or in spheroplasting buffer, and comparing the
aliquots microscopically (under 40.times. magnification). In all
cases, >90% of the water-diluted cells lysed, indicating
efficient spheroplasting. The spheroplasts were centrifuged
(3000.times.g, 10 min, 20.degree. C.), and the supernatant was
discarded. Each cell pellet was resuspended in 50 mL spheroplasting
buffer without Zymolyase, and cells were collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
[0267] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) were saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant was transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000.times.g, 20 min,
4.degree. C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant was transferred to a clean tube on ice, being careful
to avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets were resuspended in MIB
containing Protease Arrest solution, and were labelled the "P"
("pellet") fractions. The BioRad Protein Assay reagent (BioRad,
Hercules, Calif.) was used according to manufacturer's instructions
to determine the protein concentration of each fraction.
[0268] Preparation of Fractionated Lysates from K. lactis Strains
for In Vitro Enzyme Assays
[0269] Cultures (20 mL YPD) were inoculated with yeast cells
(GEVO1742 and GEVO1829) and incubated at 30.degree. C. while
shaking at 250 RPM until they reached late-log to stationary phase
(OD.sub.600 of approximately 10). Cells from the 20 mL cultures
were used to inoculate a 250 mL YPD culture at an OD.sub.600 of
approximately 0.2. The cultures were incubated at 30.degree. C.
while shaking at 250 RPM until they reached mid-log (OD.sub.600
.about.2).
[0270] To prepare spheroplasts, the cells were collected in 500 mL
bottles at 5000.times.g for 5 minutes at room temperature. The
pellets were resuspended with 8 mL Spheroplasting Buffer A (25 mM
potassium phosphate (pH 7.5), 1 mM MgCl.sub.2, 1 mM EDTA, 1.25 mM
TPP, 1 mM DTT) without sorbitol and transferred to pre-weighed 50
mL tubes. The cells were collected at 1600.times.g for 5 minutes at
room temperature. The cells were resuspended with 8 mL of
Spheroplasting Buffer A with 2.5 M Sorbitol (Amresco Code#0691) and
protease inhibitor (G Biosciences Yeast/Fungal ProteaseArrest.TM.
(Catalog #788-333)). Approximately 5 mg of Zymolyase 20T
(Seikagaku Biobusiness Code#120491) was added to each cell
suspension. The suspensions were incubated at 30.degree. C. with
gentle agitation (e.g. 50 RPM), with the tube on its side for good
mixing, for 1-2 hours. The efficiency of formation of spheroplasts
was verified by dilution of the spheroplast suspension 1:10 into
Spheroplasting Buffer A with 2.5 M sorbitol and 1:10 in water.
Spheroplasts should remain intact when diluted into the buffer but
appear fuzzy or completely disappear when diluted into water. The
spheroplasts were collected at 1600.times.g for 7 minutes at
4.degree. C. The spheroplasts were gently washed with 2 mL of
Spheroplasting Buffer A with 2.5 M sorbitol and protease inhibitor,
and collected at 1600.times.g for 7 minutes at 4.degree. C. The
spheroplasts were resuspended in 2 mL of Spheroplasting Buffer A
with 2.5 M sorbitol and protease inhibitor.
[0271] To fractionate the spheroplasts, 8 mL of Spheroplasting
Buffer A with 0.2 M sorbitol and protease inhibitor was slowly
added to the cell suspension, bringing the final concentration of
Sorbitol to 0.66 M. The spheroplasts were broken with 10 strokes
using a B (tight fitting) pestle in a 15 mL Dounce homogenizer
(Bellco Glass, Inc. Cat#1984-10015) on ice. The homogenate was
transferred to a 50 mL tube, and the cell debris was collected by
centrifugation at 4.degree. C. for 10 minutes at 1600.times.g. The
supernatant was transferred to a 15 mL tube with a pipette. This
supernatant is the "W" fraction. 5 mL of this "W" fraction was
transferred to a 35 mL Oakridge tube and centrifuged at
48,000.times.g for 20 minutes at 4.degree. C. The resulting
supernatant was transferred to a 15 mL tube and labeled "S." The
pellet was resuspended in 5 mL of Spheroplasting Buffer A with 0.66
M Sorbitol and protease inhibitor and labeled "P." All fractions
were stored on ice at 4.degree. C. while in use. The BioRad Protein
Assay reagent (BioRad, Hercules, Calif.) was used according to
manufacturer's instructions to determine the protein concentration
of each fraction.
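The 0.66 M sorbitol figure used in the fractionation step follows from a volume-weighted average of the two buffers that are combined. The following Python sketch reproduces that arithmetic; the function name is hypothetical, and the calculation assumes the volumes are additive and ignores the volume of the spheroplast pellet itself.

```python
def mixed_concentration(parts):
    """Volume-weighted concentration of a mixture.

    parts: iterable of (volume, concentration) pairs in consistent units.
    """
    parts = list(parts)
    total_volume = sum(volume for volume, _ in parts)
    total_amount = sum(volume * conc for volume, conc in parts)
    return total_amount / total_volume


# 2 mL of buffer with 2.5 M sorbitol plus 8 mL of buffer with 0.2 M sorbitol
sorbitol_M = mixed_concentration([(2.0, 2.5), (8.0, 0.2)])
print(round(sorbitol_M, 2))  # 0.66
```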
[0272] ALS Assay.
[0273] Cell lysates were prepared and protein concentrations were
determined as described above. The colorimetric ALS Assay
(FAD-independent) performed here was based on the assay described
in Hugenholtz, J. and Starrenburg, J. C., Appl. Microbiol.
Biotechnol. (1992) 38:17-22. Reaction buffer was prepared by mixing
900 .mu.l 1M potassium phosphate buffer pH 6.0, 180 .mu.l 100 mM
MgSO.sub.4, 180 .mu.l 100 mM TPP, 3.96 ml 500 mM pyruvate and 12.78
ml water. For the no substrate control, the volume of pyruvate was
replaced with water. Lysates were prepared at a final protein
concentration of 2 .mu.g/.mu.l in Spheroplasting Buffer A with 0.66
M sorbitol. To 900 .mu.L ALS Buffer, 100 .mu.L of lysate was added
and incubated at 30.degree. C. for 30 min. Acetoin standards were
also prepared at concentrations of 2 mM, 1 mM, 0.5 mM, and 0 mM.
From each sample and standard, 175 .mu.L was transferred to a fresh
1.5 mL tube. To each sample and standard was added 25 .mu.L 35%
(v/v) H.sub.2SO.sub.4, and all were incubated at 37.degree. C. for
30 mins. After the incubation, the following were added in order,
to each standard and sample, with the solutions being mixed by
vortexing in between each addition: 50 .mu.L 50% (w/v) NaOH, 50
.mu.L 0.5% creatine, and 50 .mu.L 5% 1-naphthol (in 2.5N NaOH). The
samples and standards were incubated at room temperature for 1
hour, being mixed by vortexing every 15 minutes. To a 96 well,
half-area, UV-Star, transparent, flat-bottom plate (Catalog
#675801, Greiner Bio One, Frickenhausen, Germany), 100 .mu.L of each
sample or standard was transferred, and the samples were analyzed
by a plate reader by measuring absorbance at 530 nm.
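The final component concentrations implied by the ALS reaction buffer recipe can be checked with a short calculation. The Python sketch below uses the stock concentrations and volumes given in the text; the function name is hypothetical, and the in-assay values assume that the only dilution is the 900 .mu.L buffer plus 100 .mu.L lysate mix, ignoring any contribution from the lysate buffer.

```python
# (component, stock concentration in mM, volume in mL), as listed in the text
ALS_BUFFER = [
    ("potassium phosphate pH 6.0", 1000.0, 0.90),
    ("MgSO4", 100.0, 0.18),
    ("TPP", 100.0, 0.18),
    ("pyruvate", 500.0, 3.96),
    ("water", 0.0, 12.78),
]


def final_concentrations(recipe, buffer_fraction=1.0):
    """Return final mM of each solute; buffer_fraction scales for the assay mix."""
    total_ml = sum(volume for _, _, volume in recipe)
    return {
        name: stock * volume / total_ml * buffer_fraction
        for name, stock, volume in recipe
        if stock > 0
    }


in_buffer = final_concentrations(ALS_BUFFER)
in_assay = final_concentrations(ALS_BUFFER, buffer_fraction=900.0 / 1000.0)
for name in in_buffer:
    print(name, round(in_buffer[name], 2), "mM in buffer,",
          round(in_assay[name], 2), "mM in assay")
```

Under these assumptions the buffer works out to 50 mM potassium phosphate, 1 mM MgSO4, 1 mM TPP, and 110 mM pyruvate, and the assay mix to about 99 mM pyruvate.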
[0274] KARI Assay.
[0275] Cell lysates were prepared and protein concentrations were
determined as described above. Acetolactate substrate was made by
mixing 50 .mu.l of ethyl-2-acetoxy-2-methyl-acetoacetate with 990
.mu.l of water. Then 10 .mu.l of 2 N NaOH was sequentially added,
with vortex mixing between additions, until 260 .mu.l of NaOH was
added. The acetolactate was agitated at room temperature for 20 min
and then held on ice. NADPH was prepared in 0.01N NaOH (to improve
stability) to a concentration of 50 mM. The concentration was
determined by reading the OD of a diluted sample at 340 nm in a
spectrophotometer and using the molar extinction coefficient of
6.22 mM.sup.-1 cm.sup.-1 to calculate the actual concentration (the
OD.sub.340 of a 100 .mu.M solution of NAD(P)H should be 0.622).
Three buffers were prepared and held on ice. Reaction buffer
contained 250 mM KPO.sub.4 pH 7.5, 10 mM MgCl.sub.2, 1 mM DTT, 10
mM acetolactate, and 0.2 mM NADPH. No substrate buffer contained
everything except the acetolactate. No NAD(P)H buffer contained
everything except the NADPH. Reactions were performed in triplicate
using 10 .mu.l of cell extract with 90 .mu.l of reaction buffer in
a 96-well plate in a SpectraMax 340PC multi-plate reader (Molecular
Devices, Sunnyvale, Calif.). The reaction was followed at 340 nm by
measuring a kinetic curve for 5 minutes, with OD readings taken
every 10 seconds. The reactions were performed at 30.degree. C. The
reactions were performed in complete, no substrate, and no NAD(P)H
buffers. The V.sub.max for each extract was determined after
subtracting the background reading of the no substrate control from
the reading in complete buffer.
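The NADPH concentration check and the background-subtracted rate can be sketched in Python using the Beer-Lambert law. The extinction coefficient is expressed as 6.22 per mM per cm, which is the value consistent with the stated check that a 100 .mu.M solution reads 0.622. The function names, the 1 cm default path length (a plate well holding 100 .mu.L will generally have a shorter path), and the example numbers are assumptions for illustration.

```python
EPSILON_NADPH_340 = 6.22  # mM^-1 cm^-1 at 340 nm


def nadph_mM(a340, dilution_factor=1.0, path_cm=1.0):
    """Beer-Lambert: concentration (mM) of the undiluted NAD(P)H stock."""
    return a340 / (EPSILON_NADPH_340 * path_cm) * dilution_factor


def specific_activity(slope_complete, slope_no_substrate, protein_mg,
                      reaction_ml=0.1, path_cm=1.0):
    """Background-corrected NADPH consumption in umol/min/mg protein.

    Slopes are magnitudes of the A340 decrease per minute.
    """
    corrected = slope_complete - slope_no_substrate
    rate_mM_per_min = corrected / (EPSILON_NADPH_340 * path_cm)
    return rate_mM_per_min * reaction_ml / protein_mg  # mM * mL = umol


# A 100 uM (0.1 mM) solution should read 0.622 in a 1 cm cuvette.
print(round(nadph_mM(0.622), 3))
# Hypothetical slopes: 0.0722 A/min complete, 0.0100 A/min without substrate.
print(round(specific_activity(0.0722, 0.0100, protein_mg=0.02), 4))
```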
[0276] DHAD Assay.
[0277] Cell lysates were prepared and protein concentrations were
determined as described above. The DHAD activity of each lysate was
ascertained as follows. In a fresh 1.5 mL centrifuge tube, 50 .mu.L
of each lysate was mixed with 50 .mu.L of 0.1 M
2,3-dihydroxyisovalerate (DHIV), 25 .mu.L of 0.1 M MgSO.sub.4, and
375 .mu.L of 0.05M Tris-HCl pH 8.0, and the mixture was incubated
for 30 min at 35.degree. C. Each tube was then heated to 95.degree.
C. for 5 min to inactivate any enzymatic activity, and the solution
was centrifuged (16,000.times.g for 5 min) to pellet insoluble
debris. To prepare samples for analysis, 100 .mu.L of each reaction
were mixed with 100 .mu.L of a solution consisting of 4 parts 15 mM
dinitrophenyl hydrazine (DNPH) in acetonitrile with 1 part 50 mM
citric acid, pH 3.0, and the mixture was heated to 70.degree. C.
for 30 min in a thermocycler. The solution was then analyzed by
HPLC as described above in General Methods to quantitate the
concentration of ketoisovalerate (KIV) present in the sample.
[0278] KIVD Assay.
[0279] Cell lysates were prepared and protein concentrations were
determined as described above. KIVD Assay buffer, containing 1
Roche Protease Inhibitor tablet per 5 mL buffer, was added to each
cell pellet to create a 20% (w/v) cell suspension. The KIVD assay
buffer was prepared at a final concentration of 0.05 M
NaHPO.sub.4*H.sub.2O, 5 mM MgCl.sub.2*8H.sub.2O, and 1.5 mM Thiamin
pyrophosphate chloride. The reaction substrate,
.alpha.-keto-isovalerate (3-methyl-2-oxobutanoic acid, Acros
Organics), was added where appropriate at 30 mM. Lysates were
diluted in reaction buffer at a final protein concentration of 0.1
.mu.g/.mu.L. To 1.5 mL tubes, 50 .mu.L of lysate (5 .mu.g of
protein) was mixed with 200 .mu.L of reaction buffer with or
without substrate. The reactions were incubated at 37.degree. C.
for 20 minutes, and the reactions were immediately filtered through
a 2 .mu.m filter plate. The filtered samples were diluted 1:10 in
water, and 100 .mu.L of the 1:10 dilution was mixed with 100 .mu.L
of derivatization reagent in 0.2 ml thin-wall PCR tubes.
Derivatization reagent was prepared by mixing 4 ml of 15 mM
2,4-dinitrophenyl hydrazine (DNPH) in HPLC-grade
acetonitrile with 1 ml 50 mM citric acid buffer, pH 3. The samples
were incubated at 70.degree. C. for 30 minutes. The samples were
analyzed by HPLC.
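Diluting each lysate to the 0.1 .mu.g/.mu.L working concentration is a C1V1 = C2V2 calculation. The Python sketch below shows it; the function name and the example stock concentration are hypothetical.

```python
def dilute_lysate(stock_ug_per_ul, final_volume_ul, target_ug_per_ul=0.1):
    """Return (lysate_ul, buffer_ul) needed to reach the target concentration."""
    if stock_ug_per_ul < target_ug_per_ul:
        raise ValueError("stock is already below the target concentration")
    lysate_ul = target_ug_per_ul * final_volume_ul / stock_ug_per_ul
    return lysate_ul, final_volume_ul - lysate_ul


# A hypothetical 2 ug/uL lysate diluted to 200 uL at 0.1 ug/uL:
lysate_ul, buffer_ul = dilute_lysate(2.0, 200.0)
print(round(lysate_ul, 1), round(buffer_ul, 1))

# 50 uL of the diluted lysate then carries 5 ug of protein into the reaction.
print(round(50 * 0.1, 1))
```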
[0280] ADH Assay.
[0281] Cell lysates were prepared and protein concentrations were
determined as described above. Assays (set up in triplicate for
each lysate) contained 10 .mu.L of each lysate (or an appropriate
dilution of each lysate) plus 90 .mu.L of reaction buffer, which
consisted of (final concentrations present in 1.times. reaction
buffer): 0.1M Tris-HCl pH 7.5, 10 mM MgCl.sub.2, 1 mM DTT, 0.2 mM
NADH (or NADPH, where indicated; each diluted from a 4.4 mM
spectrophotometrically-confirmed stock), and 11 mM
isobutyraldehyde. Where indicated, as controls a parallel set of
assay reactions were set up using reaction buffer lacking
isobutyraldehyde and/or NAD(P)H, as indicated. For experiments
measuring the acetaldehyde-dependent oxidation of NAD(P)H, reaction
buffer was prepared in which acetaldehyde was substituted for
isobutyraldehyde. In these cases, the reaction buffer contained at
least 11 mM acetaldehyde, although the exact amount present is an
estimate due to the inherent difficulties of pipetting acetaldehyde
solution. Finally, in some cases a parallel set of reactions
lacking yeast cell lysate was included as a negative control. After
being added (using a multi-channel pipet) to the wells of a 96-well
plate, the reactions were immediately placed into a plate reader
that had been pre-warmed to 30.degree. C., and the absorbance at
340 nm was measured every 12 seconds over a period of 300 seconds.
Kinetic parameters were computed from assays with linear slopes
(where necessary, assays were repeated with appropriate dilutions
to obtain linear NAD(P)H consumption curves).
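A kinetic trace read every 12 seconds for 300 seconds gives 26 points, and one simple way to judge whether it is linear enough to use is to fit a line and inspect the coefficient of determination. The Python sketch below does this; the function names, the 0.99 threshold, and the synthetic trace are assumptions for illustration and are not stated in the text.

```python
READ_TIMES_S = list(range(0, 301, 12))  # 0, 12, ..., 300 s: 26 readings


def linear_fit(times, values):
    """Least-squares slope, intercept, and R^2 for an absorbance trace."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = stv / stt
    intercept = mean_v - slope * mean_t
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    ss_tot = sum((v - mean_v) ** 2 for v in values)
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, intercept, r_squared


def is_linear(times, values, min_r_squared=0.99):
    """Assumed acceptance rule: keep traces whose fit has R^2 above a threshold."""
    return linear_fit(times, values)[2] >= min_r_squared


# Synthetic trace: A340 falling by 0.002 per second from 1.0 (made-up data).
trace = [1.0 - 0.002 * t for t in READ_TIMES_S]
slope, intercept, r_squared = linear_fit(READ_TIMES_S, trace)
print(round(slope * 60, 3), "A340 per minute; linear:", is_linear(READ_TIMES_S, trace))
```

A trace that flattens early because NAD(P)H runs out would fail this check, which matches the instruction to repeat such assays with a more dilute lysate.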
Composition of Culture Media
[0282] Drugs: When indicated, G418 (Calbiochem, Gibbstown, N.J.)
was added at 0.2 g/L, Phleomycin (InvivoGen, San Diego, Calif.) was
added at 7.5 mg/L, Hygromycin (InvivoGen, San Diego, Calif.) was
added at 0.2 g/L, and 5-fluoro-orotic acid (FOA; Toronto Research
Chemicals, North York, Ontario, Canada) was added at 1 g/L.
[0283] YP: 1% (w/v) yeast extract, 2% (w/v) peptone.
[0284] YPD: YP containing 2% (w/v) glucose unless otherwise
noted.
[0285] YPGal: YP containing 2% (w/v) galactose.
[0286] YPE: YP containing 2% (w/v) Ethanol.
[0287] SC media: 6.7 g/L Difco.TM. Yeast Nitrogen Base, 14 g/L
Sigma.TM. Synthetic Dropout Media supplement (includes amino acids
and nutrients excluding histidine, tryptophan, uracil, and leucine;
Sigma-Aldrich, St. Louis, Mo.), 0.076 g/L histidine, 0.076 g/L
tryptophan, 0.380 g/L leucine, and 0.076 g/L uracil. Drop-out
versions of SC media are made by omitting one or more of histidine
(H), tryptophan (W), leucine (L), or uracil (U or Ura). When
indicated, SC media are supplemented with additional isoleucine
(9xI; 0.684 g/L), valine (9xV; 0.684 g/L) or both isoleucine and
valine (9xIV). SCD is SC containing 2% (w/v) glucose unless
otherwise noted, SCGal is SC containing 2% (w/v) galactose and SCE
is SC containing 2% (w/v) ethanol. For example, SCD-Ura+9xIV would
be composed of 6.7 g/L Difco.TM. Yeast Nitrogen Base, 14 g/L
Sigma.TM. Synthetic Dropout Media supplement (includes amino acids
and nutrients excluding histidine, tryptophan, uracil, and
leucine), 0.076 g/L histidine, 0.076 g/L tryptophan, 0.380 g/L
leucine, 0.684 g/L isoleucine, 0.684 g/L valine, and 20 g/L
glucose.
[0288] SCD-V+9xI: 6.7 g/L Difco.TM. Yeast Nitrogen Base, 0.076 g/L
Adenine hemisulfate, 0.076 g/L Alanine, 0.076 g/L Arginine
hydrochloride, 0.076 g/L Asparagine monohydrate, 0.076 g/L Aspartic
acid, 0.076 g/L Cysteine hydrochloride monohydrate, 0.076 g/L
Glutamic acid monosodium salt, 0.076 g/L Glutamine, 0.076 g/L
Glycine, 0.076 g/L myo-Inositol, 0.76 g/L Isoleucine, 0.076 g/L
Lysine monohydrochloride, 0.076 g/L Methionine, 0.008 g/L
p-Aminobenzoic acid potassium salt, 0.076 g/L Phenylalanine, 0.076
g/L Proline, 0.076 g/L Serine, 0.076 g/L Threonine, 0.076 g/L
Tyrosine disodium salt, and 20 g/L glucose.
[0289] YNB: 6.7 g/L Difco.TM. Yeast Nitrogen Base supplemented with
indicated nutrients as follows: histidine (H; 0.076 g/L),
tryptophan (W; 0.076 g/L), leucine (L; 0.380 g/L), uracil (U or
Ura; 0.076 g/L), isoleucine (I; 0.076 g/L), valine (V; 0.076 g/L),
and casamino acids (CAA; 10 g/L). When indicated, YNB media are
supplemented with higher amounts of isoleucine (10xI=0.76 g/L),
valine (10xV=0.76 g/L) or both isoleucine and valine (10xIV). YNBD
is YNB containing 2% (w/v) glucose unless otherwise noted, YNBGal
is YNB containing 2% (w/v) galactose and YNBE is YNB containing 2%
(w/v) ethanol. For example, YNBGal+HWLU+10xI+G418 would be composed
of 6.7 g/L Difco.TM. Yeast Nitrogen Base, 0.076 g/L histidine,
0.076 g/L tryptophan, 0.380 g/L leucine, 0.076 g/L uracil, 0.76 g/L
isoleucine, 0.2 g/L G418, and 20 g/L galactose.
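The YNB naming convention can be expressed as a small recipe builder, sketched below in Python. The concentrations are those given in the text; the function name, its argument layout, and the dictionary names are hypothetical, and casamino acids (CAA) are left out of the sketch for brevity.

```python
YNB_BASE_G_PER_L = 6.7
SUPPLEMENTS = {  # letter: (name, 1x concentration in g/L)
    "H": ("histidine", 0.076),
    "W": ("tryptophan", 0.076),
    "L": ("leucine", 0.380),
    "U": ("uracil", 0.076),
    "I": ("isoleucine", 0.076),
    "V": ("valine", 0.076),
}
CARBON_SOURCES = {"D": "glucose", "Gal": "galactose", "E": "ethanol"}
DRUGS_G_PER_L = {"G418": 0.2, "Phleomycin": 0.0075, "Hygromycin": 0.2, "FOA": 1.0}


def ynb_recipe(carbon, supplements="", tenfold="", drugs=()):
    """Return a dict of component -> g/L for a YNB medium.

    carbon: "D", "Gal", or "E"; supplements: letters such as "HWLU";
    tenfold: letters from "IV" to add at the 10x level; drugs: drug names.
    """
    recipe = {"Yeast Nitrogen Base": YNB_BASE_G_PER_L}
    for letter in supplements:
        name, grams = SUPPLEMENTS[letter]
        recipe[name] = grams
    for letter in tenfold:
        if letter not in "IV":
            raise ValueError("10x levels are defined only for I and V")
        name, grams = SUPPLEMENTS[letter]
        recipe[name] = round(10 * grams, 3)
    for drug in drugs:
        recipe[drug] = DRUGS_G_PER_L[drug]
    recipe[CARBON_SOURCES[carbon]] = 20.0  # 2% (w/v)
    return recipe


# YNBGal+HWLU+10xI+G418, the example given in the text
print(ynb_recipe("Gal", supplements="HWLU", tenfold="I", drugs=["G418"]))
```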
[0290] Plates: Solid versions of the above described media contain
2% (w/v) agar.
Example 1
Isobutanol Pathway is Partially Cytosolic when Expressed in
Yeast
[0291] The purpose of this example is to illustrate that three
enzymes in the isobutanol biosynthetic pathway (acetolactate
synthase, ketoisovalerate decarboxylase, and isobutanol
dehydrogenase) are localized to the cytosol when expressed in
yeast.
TABLE-US-00003 TABLE 3 Genotype of strains disclosed in Example 1.
GEVO No. | Genotype/Source
1287 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1]
1742 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] pdc1::Kan.sup.R
1829 | K. lactis ATCC 200826 MAT .alpha. uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] pdc1::kan.sup.R {P.sub.TDH3:Ec_ilvC-.DELTA.N; P.sub.TEF1:Ec_ilvD-.DELTA.N(codon optimized for K. lactis):ScLEU2 integrated} {P.sub.TEF1:Ll_kivD; P.sub.TDH3:ScADH7:KmURA3 integrated} {P.sub.CUP1-1: Bs_alsS, TRP1 random integrated}
TABLE-US-00004 TABLE 4 Plasmids disclosed in Example 1.
pGV No. | Genotype
pGV1503 | ScTEF1promoter-kanR bla, pUC ori (GEVO)
pGV1537 | KlPDC1 promoter region + Klpdc1 3'UTR sequence, ScTEF1promoter-kanR bla, pUC ori (GEVO)
pGV1590 | TEF1 promoter:Ll-kivd (codon optimized for E. coli):TDH3 promoter:ADH7:CYC1 terminator, Km-URA3, 1.6 micron ori, bla, pUC ori (GEVO)
pGV1726 | CUP1 promoter:Bs-alsS:CYC1 terminator, TRP1, bla, pUC-ori
pGV1727 | TEF1 promoter:Ec-ilvD.DELTA.N (codon optimized for K. lactis):TDH3 promoter:Ec-ilvC.DELTA.N:CYC1 terminator, LEU2, bla, pUC ori (GEVO)
Plasmids
[0292] pGV1503 contains an S. cerevisiae TEF1 promoter region
driving a G418-resistance gene (kan.sup.R).
[0293] pGV1537 was constructed by inserting an (AatII plus
MfeI)-digested PCR product containing approximately 500 bp each of
KlPDC1 5' and 3' untranslated regions into (AatII plus
EcoRI)-digested pGV1503. The insert was generated by SOE-PCR.
First, the KlPDC1 5' and 3' untranslated regions were amplified
from K. lactis genomic DNA by primer pairs 1006+1016 and 1017+1009,
respectively. Primers 1016 and 1017 were designed to have
overlapping sequences. The two fragments were then joined by PCR
using primers 1006+1009.
[0294] pGV1590 is a K. lactis plasmid for expression of the L.
lactis kivD and the S. cerevisiae ADH7. Expression of the L. lactis
kivD is driven by the S. cerevisiae TEF1 promoter and expression of
the S. cerevisiae ADH7 is driven by the S. cerevisiae TDH3
promoter. pGV1590 was generated by cloning a SalI-NotI fragment
carrying the S. cerevisiae ADH7 gene into the XhoI-NotI sites of
pGV1585. The S. cerevisiae ADH7 gene fragment originated as a PCR
product from S. cerevisiae genomic DNA using primers 410 and
411.
[0295] pGV1726 is a yeast integration plasmid (utilizing the S.
cerevisiae TRP1 gene as selection marker) for random integration
(i.e. for K. lactis). This plasmid does not carry a yeast
replication origin, thus is unable to replicate episomally. This
plasmid also carries the B. subtilis alsS gene, whose expression is
under the control of the S. cerevisiae CUP1 promoter. pGV1726 was
generated by cloning a SacI-NgoMIV fragment carrying the S.
cerevisiae CUP1 promoter, Bs-alsS ORF and the CYC1 terminator into
the same sites of pGV1645. The vector, pGV1645, is a K. lactis
expression plasmid that was used for expression of the B. subtilis
alsS under the control of the K. lactis PDC1 promoter. This plasmid
also carries the S. cerevisiae TRP1 gene as a selection marker and
the 1.6 micron replication origin. Digestion of pGV1645 with SacI
and NgoMIV removes the K. lactis PDC1 promoter, B. subtilis alsS,
CYC1 terminator and the 1.6 micron origin of replication. The
insert fragment carrying the S. cerevisiae CUP1 promoter, B.
subtilis alsS ORF and the CYC1 terminator was obtained from pGV1649
via digestion with SacI and NgoMIV. The CUP1 promoter originated as
a PCR product from S. cerevisiae genomic DNA using primers 637 and
638. The B. subtilis alsS originated as a PCR product from B.
subtilis genomic DNA using primers 767 and 697.
[0296] pGV1727 is a yeast integration plasmid (utilizing the S.
cerevisiae LEU2 gene as selection marker) for random integration
(i.e. for K. lactis). This plasmid does not carry a yeast
replication origin, thus is unable to replicate episomally. This
plasmid carries the E. coli ilvD.DELTA.N and ilvC.DELTA.N genes,
whose expressions are under the control of the S. cerevisiae TEF1
and TDH3 promoters respectively. The E. coli ilvD.DELTA.N is a
shortened version of E. coli ilvD where the sequence coding for the
first 24 amino acids, which encodes a putative mitochondrial
targeting sequence, was removed. Likewise, the E. coli ilvC.DELTA.N
is a shortened version of E. coli ilvC where the sequence coding
for the first 22 amino acids, which is predicted to function as a
mitochondrial targeting sequence, was removed. pGV1727 was generated
by cloning a XhoI-NgoMIV fragment carrying the E. coli ilvC.DELTA.N
gene and the CYC1 terminator into the same sites of pGV1635. The
vector, pGV1635, is a K. lactis expression plasmid that was used
for expression of the E. coli ilvD.DELTA.N gene under the control
of the S. cerevisiae TEF1 promoter. The ilvD.DELTA.N gene is
followed by the TDH3 promoter, a short MCS (includes an XhoI site),
the CYC1 terminator and the 1.6 micron replication origin. This
plasmid carries the S. cerevisiae LEU2 gene as a selection marker.
Digestion of pGV1635 with XhoI and NgoMIV removes the CYC1
terminator and the 1.6 micron replication origin. This sequence was
replaced by the insert fragment carrying the E. coli ilvC.DELTA.N
and the CYC1 terminator which was obtained from pGV1677 digested
with XhoI and NgoMIV. The E. coli ilvD.DELTA.N originated as a PCR
product from pGV1578 (plasmid carrying E. coli ilvD codon optimized
for K. lactis from DNA2.0, Menlo Park, Calif.) using primers 1151
and 1152. The E. coli ilvC.DELTA.N originated as a PCR product from
pGV1160 (plasmid carrying the full length E. coli ilvC gene) using
primers 1149 and 1150. The E. coli ilvC in pGV1160 originated as a
PCR product from E. coli genomic DNA using primers 387 and 388.
[0297] GEVO1287 was transformed with PmlI-digested pGV1537,
yielding GEVO1742. GEVO1829 was constructed by sequentially
transforming GEVO1742 with gene fragments from pGV1590, pGV1727,
and pGV1726 following the standard lithium acetate protocol. First,
a 7.8 kb fragment of pGV1590 generated by digestion with NgoMIV and
MfeI was transformed into GEVO1742. Next, this transformant strain
was transformed with pGV1727 (FIG. 4) that had been linearized by
digestion with BcgI. Finally, this transformant strain was
transformed with pGV1726 that had been linearized by digestion with
AhdI. The final transformant was GEVO1829.
[0298] Cellular fractions were prepared from GEVO1742 and GEVO1829
as described above. The protein concentration used to calculate
specific activities from all three fractions ("W," "S," and "P")
was measured for the "W" fraction. Below are the results for the
assays measuring isobutanol dehydrogenase, acetolactate synthase,
and ketoisovalerate decarboxylase activities.
Alcohol Dehydrogenase (ADH) Assay
[0299] The results from the assay are summarized in Table 5. The
"W" fraction and the "S" fraction of the pathway carrying strain
(GEVO1829) contained at least three times the NADPH dependent
alcohol dehydrogenase activity found in the same fractions of
GEVO1742. The "W" and "S" fractions of GEVO1829 contained more than
four times the activity present in the "P" fraction. These data
indicated that S. cerevisiae Adh7 activity was predominantly
localized to the cytosol.
TABLE-US-00005 TABLE 5 Alcohol Dehydrogenase Activity.
Sample | Specific Alcohol Dehydrogenase Activity (U/mg protein)
1742 W | 0.08 .+-. 0.00
1742 S | 0.07 .+-. 0.02
1742 P | 0.03 .+-. 0.012
1829 W | 0.26 .+-. 0.00
1829 S | 0.25 .+-. 0.02
1829 P | 0.04 .+-. 0.02
Acetolactate Synthase (ALS) Assay
[0300] The results from the assay are summarized in Table 6. The
"W" and "S" fractions of the isobutanol pathway carrying strain
(GEVO1829) contained ALS activity, while no activity was detected
in the same fractions of GEVO1742. The "W" and "S" fractions
contained three times higher ALS activity than the "P" fraction.
These data indicated that B. subtilis ALS activity was
predominantly localized to the cytosol.
TABLE-US-00006 TABLE 6 Acetolactate Synthase Activity.
Sample | Specific Acetolactate Synthase Activity (U/mg protein)
1742 W | 0.00 .+-. 0.00
1742 S | 0.00 .+-. 0.00
1742 P | 0.00 .+-. 0.00
1829 W | 0.10 .+-. 0.01
1829 S | 0.10 .+-. 0.00
1829 P | 0.03 .+-. 0.00
Ketoisovalerate Decarboxylase (KIVD) Assay
[0301] The results from the assay are summarized in Table 7. The
"W" and "S" fractions of the isobutanol pathway carrying strain
(GEVO1829) contained 8-10 times greater activity than in the same
fractions of GEVO1742. Furthermore, the activity in "S" fraction
was 45.times. higher than what was detected in "P" fraction. These
data indicated that L. lactis KIVD activity was predominantly
localized in the cytosol.
TABLE-US-00007 TABLE 7 Ketoisovalerate decarboxylase (KIVD) Assay.
Sample | Specific Ketoisovalerate Decarboxylase Activity (U/mg protein)
1742 W | 0.05 .+-. 0.00
1742 S | 0.05 .+-. 0.04
1742 P | 0.03 .+-. 0.00
1829 W | 0.38 .+-. 0.02
1829 S | 0.45 .+-. 0.04
1829 P | 0.01 .+-. 0.00
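By way of non-limiting illustration, the fold differences discussed
for Tables 5 to 7 can be recomputed from the tabulated means. The
following is a minimal sketch: the function names fold and summarize
are hypothetical, only the mean values are used (the reported
deviations are ignored), and a ratio is reported as None where the
control activity is zero.

```python
# Mean specific activities (U/mg protein) copied from Tables 5-7.
ACTIVITY = {
    "ADH": {"1742": {"W": 0.08, "S": 0.07, "P": 0.03},
            "1829": {"W": 0.26, "S": 0.25, "P": 0.04}},
    "ALS": {"1742": {"W": 0.00, "S": 0.00, "P": 0.00},
            "1829": {"W": 0.10, "S": 0.10, "P": 0.03}},
    "KIVD": {"1742": {"W": 0.05, "S": 0.05, "P": 0.03},
             "1829": {"W": 0.38, "S": 0.45, "P": 0.01}},
}


def fold(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is 0."""
    return None if denominator == 0 else numerator / denominator


def summarize(assay, strain="1829", control="1742"):
    """Fold changes for one assay: strain vs. control, and S vs. P."""
    data = ACTIVITY[assay]
    return {
        "W_vs_control": fold(data[strain]["W"], data[control]["W"]),
        "S_vs_control": fold(data[strain]["S"], data[control]["S"]),
        "S_over_P": fold(data[strain]["S"], data[strain]["P"]),
    }


for assay in ACTIVITY:
    print(assay, summarize(assay))
```

A high S_over_P value corresponds to activity that stays in the
soluble fraction rather than pelleting with the organelles.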
Example 2
Construction of an ILV3 Deletion Mutant
[0302] The purpose of this example is to describe the construction
of an ILV3 deletion mutant of S. cerevisiae, GEVO2244.
TABLE-US-00008 TABLE 8 Genotype of strains disclosed in Example 2.
GEVO No. | Genotype/Source
GEVO1147 | K. lactis, NRRL Y-1140, (obtained from USDA)
GEVO1188 | S. cerevisiae, CEN.PK, (obtained from Euroscarf); MAT.alpha. ura3 leu2 his3 trp1
GEVO2145 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3::Kl_URA3
GEVO2244 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3.DELTA.
TABLE-US-00009 TABLE 9 Plasmids disclosed in Example 2.
Plasmid name | Genotype
pUC19 | bla, pUC-ori (obtained from Invitrogen)
pGV1299 | K. lactis URA3, bla, pUC-ori (GEVO)
[0303] Plasmid pGV1299 was constructed by cloning the K. lactis
URA3 gene into pUC19. The K. lactis URA3 was obtained by PCR using
primers 575 and 576 from K. lactis genomic DNA. The PCR product was
digested with EcoRI and BamHI and cloned into pUC19 which was
similarly digested. The K. lactis URA3 insert was sequenced
(Laragen Inc) to confirm correct sequence.
[0304] The ilv3::Kl_URA3 integration cassette contained, from 5' to
3', the following: 1) an 80 bp homology to ILV3 (position +158 to
237) that functions as the 5' targeting sequence for the
integration, 2) the K. lactis URA3 marker gene, 3) a 60 bp homology
to a region of ILV3 (position -21 to +39) that is further upstream of
the 5' targeting sequence to facilitate loop-out of the K. lactis
URA3 marker, and 4) a 221 bp homology to the 3' region of ILV3
(position +1759 to 1979) that functions as the 3' targeting
sequence for the integration. This cassette was generated by
SOE-PCR. The K. lactis URA3 gene was amplified from pGV1299 using
primers 1887 and 1888. Only the 3' region of ILV3 was initially
amplified using primers 1623 and 1892 from genomic DNA and this
product was used as template to amplify the 3' region of ILV3 using
primers 1889 and 1890. The K. lactis URA3 and the 3' region of ILV3
were combined by SOE-PCR using primers 1886 and 1890.
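By way of non-limiting illustration, the stated homology lengths can
be checked against the stated ILV3 coordinates. The following is a
minimal sketch that assumes the usual convention in which +1 is the
first nucleotide of the start codon, -1 is the nucleotide immediately
upstream, and there is no position 0; the function name span_length
is hypothetical.

```python
def span_length(start, end):
    """Length in bp of the inclusive interval [start, end].

    Assumes gene-relative numbering with no position 0, so an interval
    that crosses from negative to positive coordinates is one base
    shorter than plain subtraction would suggest.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


# Segments of the cassette as given in the text.
SEGMENTS = {
    "5' targeting homology": (158, 237),
    "loop-out homology": (-21, 39),
    "3' targeting homology": (1759, 1979),
}

for name, (start, end) in SEGMENTS.items():
    print(f"{name}: {span_length(start, end)} bp")
```

Under this convention the three intervals give 80 bp, 60 bp and
221 bp, in agreement with the lengths stated for the cassette.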
[0305] GEVO1188 was transformed with the ilv3::Kl_URA3 cassette
described above and plated onto YNBD+W+CAA (-Ura) plates.
Initially, eight colonies (#1-8) were patched onto YNBD+HUWLIV
plates and then replica plated onto YNBD+HUWLI (-V) plates to test
for valine auxotrophy. As none of these exhibited valine
auxotrophy, an additional eight colonies (#9-16) were streaked out
for single colonies and 3 or 4 isolates (A through C or D) from
each streak were tested for valine auxotrophy. Isolates A-C from
clone #12 exhibited valine auxotrophy.
[0306] These isolates were tested for the correct integrations by
colony PCR using primer pairs 1916+1920 and 1917+1921 for the 5'
and 3' junctions, respectively. Correct sized bands were observed
with clones #12A through C with primer pair 1916+1920. Correct
sized bands were observed with clone 12A when FailSafe Master Mix A
or C was used with primer pair 1917+1921. Clone #12A was designated
as GEVO2145. The valine auxotrophy of GEVO2145 was reconfirmed
by streaking it onto SCD+9xIV and SCD-V+9xI plates. GEVO2145
exhibited no growth on the medium lacking valine (SCD-V+9xI) while
it grew on medium supplemented with valine (SCD+9xIV). The parent
strain, GEVO1188, grew on both media.
[0307] GEVO2145 was streaked onto YNBE+W+CAA+FOA to isolate strains
in which the K. lactis URA3 had been excised through homologous
recombination, i.e. "looped out". Five FOA resistant clones (A-E)
were tested for auxotrophies for valine and uracil. All five clones
exhibited auxotrophies to both nutrients. Clone A was designated
GEVO2244. Colony PCR using primers 1891 and 1892 with FailSafe
Buffer C was performed and the loss of the Kl_URA3 cassette was
confirmed.
Example 3
DHAD Activity is Localized to Mitochondria
[0308] The purpose of this Example is to demonstrate that the DHAD
activity encoded by ScILV3 is localized to the mitochondria.
TABLE-US-00010 TABLE 9 Genotype of strains disclosed in Example 3.
GEVO No. | Genotype/Source
GEVO2244 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3.DELTA.
TABLE-US-00011 TABLE 10 Plasmids disclosed in Example 3.
pGV No. | Genotype
pGV1106 | pUC ori, bla (AmpR), 2micron ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1900 | pUC ori, bla (AmpR), 2micron ori, URA3, TEF1 promoter-ScILV3(FL)
[0309] Plasmid pGV1106 is a variant of p426GPD (described in
Mumberg et al, 1995, Gene 119-122). To obtain pGV1106, annealed
oligos 271 and 272 were ligated into p426GPD that had been digested
with SpeI and XhoI, and the inserted DNA was confirmed by
sequencing.
[0310] Plasmid pGV1900 was generated by amplifying the full-length,
native ScILV3 nucleotide sequence from S. cerevisiae strain CEN.PK
genomic DNA using primers 1617 and 1618. The resulting 1.76 kb
fragment, which contained the complete ScILV3 coding sequence (SEQ
ID NO: 88) flanked by 5' SalI and 3' BamHI restriction site
sequences was digested with SalI and BamHI and ligated into pGV1662
(described in Example 6) which had been digested with SalI and
BamHI.
[0311] To measure DHAD activities present in fractionated cell
extracts, GEVO2244 was transformed singly with either pGV1106,
which served as an empty vector control, or with pGV1900, which is
an expression plasmid for ScILV3.
[0312] An independent clonal transformant of each plasmid was
isolated, and a 1 L culture of each strain was grown in
SCGal-Ura+9xIV at 30.degree. C. at 250 rpm. The OD.sub.600 was
noted, the cells were collected by centrifugation (1600.times.g, 2
min) and the culture medium was decanted. The cell pellets were
resuspended in 50 mL sterile deionized water, collected by
centrifugation (1600.times.g, 2 min), and the supernatant was
discarded. The OD.sub.600 and total wet cell pellet weight of each
culture are listed in Table 11, below:
TABLE-US-00012 TABLE 11 OD.sub.600 and pellet mass (g) of strain GEVO2244 transformed with the indicated plasmids.
Plasmid | OD.sub.600 | Pellet mass (g)
pGV1106 | 2.2 | 7.6
pGV1900 | 1.3 | 3.8
[0313] To obtain spheroplasts, the cell pellets were resuspended in
0.1M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells were then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer (final
concentrations: 1.2M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate pH 7.4) and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(.about.16 hrs) at 30.degree. C. with gentle agitation (60 rev/min)
on an orbital shaker. The efficacy of spheroplasting was
ascertained by diluting an aliquot of each cell suspension 1:10 in
either sterile water or in spheroplasting buffer, and comparing the
aliquots microscopically (under 40.times. magnification). In all
cases, >90% of the water-diluted cells lysed, indicating
efficient spheroplasting. The spheroplasts were centrifuged
(3000.times.g, 10 min, 20.degree. C.), and the supernatant was
discarded. Each cell pellet was resuspended in 50 mL spheroplast
buffer without Zymolyase, and cells were collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
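By way of non-limiting illustration, the resuspension volumes implied
by the 0.1 g/mL target can be estimated from the pellet masses in
Table 11. The following is a minimal sketch under stated assumptions:
"0.1 g/mL" is read as wet pellet mass per mL of buffer added, and the
DTT is assumed to come from a hypothetical 1 M stock, which the text
does not specify. The function names are likewise hypothetical.

```python
def buffer_volume_ml(pellet_g, cells_g_per_ml):
    """Buffer volume (mL) that gives the target wet-cell concentration."""
    return pellet_g / cells_g_per_ml


def stock_volume_ml(buffer_ml, final_mm, stock_mm):
    """Stock volume (mL) to add so that the mixture reaches final_mm.

    Solves v * stock_mm = final_mm * (buffer_ml + v) for v, i.e. the
    added stock volume is counted in the final volume.
    """
    if stock_mm <= final_mm:
        raise ValueError("stock must be more concentrated than the target")
    return buffer_ml * final_mm / (stock_mm - final_mm)


PELLETS_G = {"pGV1106": 7.6, "pGV1900": 3.8}  # wet pellet masses, Table 11
DTT_STOCK_MM = 1000.0  # assumed 1 M DTT stock (not stated in the text)

for plasmid, grams in PELLETS_G.items():
    volume = buffer_volume_ml(grams, 0.1)
    dtt = stock_volume_ml(volume, 10.0, DTT_STOCK_MM)
    print(f"{plasmid}: {volume:.1f} mL buffer, {dtt:.2f} mL DTT stock")
```

The same buffer_volume_ml calculation applies to any later step that
specifies a target concentration in g/mL.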
[0314] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell debris
and unbroken cells and spheroplasts. Following centrifugation, 2 mL
of each sample (1 mL of the pGV1900 transformed cells) were saved
in a 2 mL centrifuge tube on ice and designated the "W" (for Whole
cell extract) fraction, while the remaining supernatant was
transferred to a clean, ice-cold 35 mL Oakridge screw-cap tube and
centrifuged (12,000.times.g, 20 min, 4.degree. C.) to pellet
mitochondria and other organellar structures. Following
centrifugation, 5 mL of each resulting supernatant was transferred
to a clean tube on ice, being careful to avoid the small, loose
pellet, and labelled the "S" (soluble cytosol) fraction. The
resulting pellets were resuspended in MIB containing Protease
Arrest solution, and were labelled the "P" ("pellet") fractions.
Protein from the "P" fraction was released after dilution 1:5 in
DHAD assay buffer (see above) by rapid mixing in a 1.5 mL tube with
a Retsch Ball Mill MM301 in the presence of 0.1 mm glass beads. The
mixing was performed 4 times for 1 minute.
[0315] The BioRad Protein Assay reagent (BioRad, Hercules, Calif.)
was used according to manufacturer's instructions to determine the
protein concentration of each fraction.
[0316] The DHAD activity of each fraction was ascertained as
described in the methods above.
TABLE-US-00013 TABLE 12 Specific activities (KIV generation) and ratios of specific activities from fractionated lysates of S. cerevisiae strain GEVO2244 carrying plasmids to overexpress the indicated DHAD homolog. Each data point is the result of triplicate samples.
Lysate (pGV# and fraction*) | DHAD | Sp. Activity [U/mg protein in fraction] | Std. Dev.
1106 W | -- | n.d.
1106 S | -- | n.d.
1106 P | -- | n.d.
1900 W | ScILV3(FL) | 0.0096 | 0.0018
1900 S | ScILV3(FL) | 0.0052 | 0.0004
1900 P | ScILV3(FL) | 0.0340 | 0.0029
[0317] Cells overexpressing the full-length, native S. cerevisiae
Ilv3 contained a greater proportion of the specific DHAD
activity in the mitochondrial fraction (P) than in the cytosolic
fraction (S).
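By way of non-limiting illustration, the P to S comparison for the
pGV1900 lysate in Table 12 can be expressed as a ratio with an
approximate uncertainty. The following is a minimal sketch: the
first-order error propagation shown here, which treats the two
standard deviations as independent, is an assumption of this
illustration and was not part of the reported analysis. The function
name ratio_with_sd is hypothetical.

```python
import math


def ratio_with_sd(a, sd_a, b, sd_b):
    """Return (a / b, approximate standard deviation of a / b).

    Uses first-order propagation for a quotient of independent
    quantities: relative variances add.
    """
    ratio = a / b
    relative_sd = math.sqrt((sd_a / a) ** 2 + (sd_b / b) ** 2)
    return ratio, ratio * relative_sd


# pGV1900 values from Table 12 (U/mg protein in fraction).
P_MEAN, P_SD = 0.0340, 0.0029
S_MEAN, S_SD = 0.0052, 0.0004

ratio, sd = ratio_with_sd(P_MEAN, P_SD, S_MEAN, S_SD)
print(f"P/S specific activity ratio: {ratio:.2f} +/- {sd:.2f}")
```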
Example 4
Replacing Current Mitochondrially Targeted Isobutanol Pathway
Enzymes with Fungal Homologs or Functional Analogs that are
Targeted to the Cytosol
[0318] The purpose of this example is to illustrate that fungal
homologs of isobutanol pathway enzymes exhibit cytosolic
activity.
TABLE-US-00014 TABLE 13 Genotype of strains disclosed in Example 4.
GEVO No. | Genotype/Source
1187 | MATa ura3-52 leu2-3_112 his3.DELTA.1 trp1-289 ADE2 CEN.PK2-1C
2280 | MATa ura3-52 leu2-3_112 his3.DELTA.1 trp1-289 ADE2 CEN.PK2-1C integrated pGV1730 at PDC1 locus
2618 | MATa ura3-52 leu2-3_112 his3.DELTA.1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2114 at PDC1 locus
2621 | MATa ura3-52 leu2-3_112 his3.DELTA.1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2117 at PDC1 locus
2622 | MATa ura3-52 leu2-3_112 his3.DELTA.1 trp1-289 ADE2 CEN.PK2-1C integrated pGV2118 at PDC1 locus
TABLE-US-00015 TABLE 14 Plasmids disclosed in Example 4.
pGV No. | Genotype
1730 | P.sub.Cup1-11:Bs_alsS, pUC ORI, Amp.sup.R, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2114 | P.sub.Cup1-11:Bs_alsScoSc, pUC ORI, Amp.sup.R, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2117 | P.sub.Cup1-11:Ta_alsS, pUC ORI, Amp.sup.R, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
2118 | P.sub.Cup1-11:Ts_alsS, pUC ORI, Amp.sup.R, TRP1, PDC1 3'-fragment-NruI-PDC1 5'-fragment.
[0319] Yeast AHASs are normally mitochondrial, thus favoring fungal
ALS enzymes for use as cytosolically functional isobutanol pathway
enzymes. Sequence analysis by Le and Choi (Bull. Korean Chem. Soc.
(2005) 26:916-920) showed that there is a conserved sequence
`RFDDR` found in AHASs that is not conserved among ALSs. This
sequence is likely involved in FAD-binding by AHASs and thus could
be used to distinguish between the FAD-dependent AHASs and the
FAD-independent ALSs. Using this region to distinguish between
AHASs and ALSs, BLAST searches of fungal sequence databases were
performed and resulted in the identification of ALS homologs from
several fungal species (Magnaporthe grisea, Phaeosphaeria nodorum,
Trichoderma atroviride (SEQ ID NO: 71), Talaromyces stipitatus (SEQ
ID NO: 72), Penicillium marneffei, and Glomerella graminicola). Of
these sequences, the ALS homologs from M. grisea, P. nodorum, T.
atroviride and T. stipitatus are predicted to be cytoplasmic by
Mitoprot II v.1.101 as described in the paper M. G. Claros, P.
Vincens. Computational method to predict mitochondrially imported
proteins and their targeting sequences. Eur. J. Biochem. 241,
779-786 (1996).
[0320] Fungal ALS genes were synthesized by DNA 2.0 with codon
optimization biased for S. cerevisiae. The following ALS constructs
were made and tested for ALS activity by assaying acetoin in the
media during a growth timecourse. All ALS genes were cloned into
the integration vector pGV1730 (SEQ ID NO: 69) as described
herein.
[0321] Plasmid pGV1730 is a yeast integration plasmid used to
replace the PDC1 gene in S. cerevisiae with the B. subtilis alsS
gene (SEQ ID NO: 70) (not codon optimized for S. cerevisiae)
expressed using the S. cerevisiae CUP1 promoter. This plasmid
carries the S. cerevisiae TRP1 gene as a selection marker.
[0322] Construction of pGV2114: pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The B. subtilis AlsS (codon-optimized
for expression in S. cerevisiae) gene was ligated to the pGV1730
vector fragment as a BamHI and SalI 1722 bp fragment using standard
methods with an approximately 5:1 insert:vector molar ratio and
transformed into TOP10 chemically competent E. coli cells. Plasmid
DNA was isolated and correct clones were confirmed using
restriction enzyme analysis.
[0323] Construction of pGV2117. pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The T. atroviride ALS gene was
ligated to the pGV1730 vector fragment as a BamHI and SalI 1686 bp
fragment using standard methods with an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis.
[0324] Construction of pGV2118. pGV1730 was treated with BamHI and
SalI and the 4932 bp vector fragment was purified by gel
electrophoresis as described. The T. stipitatus ALS gene was
ligated to the pGV1730 vector fragment as a BamHI and SalI 1707 bp
fragment using standard methods with an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis.
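By way of non-limiting illustration, the approximately 5:1
insert:vector molar ratio used for pGV2114, pGV2117 and pGV2118 can
be converted into a mass of insert DNA. The following is a minimal
sketch: for double-stranded DNA the mass per mole is proportional to
length, so the calculation reduces to a length ratio. The 50 ng
vector amount is a hypothetical value chosen for the example and is
not stated in the text; the function name insert_mass_ng is also
hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio=5.0):
    """Mass of insert (ng) giving molar_ratio moles of insert per mole
    of vector, assuming mass per mole scales with fragment length."""
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


VECTOR_BP = 4932   # BamHI/SalI vector fragment of pGV1730
VECTOR_NG = 50.0   # hypothetical amount of vector per ligation

# Insert fragment sizes (bp) as stated for each construct.
INSERTS_BP = {"pGV2114": 1722, "pGV2117": 1686, "pGV2118": 1707}

for plasmid, insert_bp in INSERTS_BP.items():
    mass = insert_mass_ng(VECTOR_NG, VECTOR_BP, insert_bp)
    print(f"{plasmid}: {mass:.1f} ng insert per {VECTOR_NG:.0f} ng vector")
```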
[0325] All yeast strains were constructed by treating the plasmid
to be integrated with NruI and then transforming the plasmid
according to the standard yeast transformation protocol as
described herein. Transformants were selected by plating
transformed cells onto SCD-W media and growing at 30.degree. C. for
2 days. Primary transformants were single colony purified and then
tested for correct integration using colony PCR. Yeast colony PCR
to check for proper integration of the integrative plasmids was
performed using the FailSafe.TM. PCR System (EPICENTRE.RTM.
Biotechnologies, Madison, Wis.; Catalog #FS99250) according to the
manufacturer's protocol. The PCR reactions
were incubated in a thermocycler using the following conditions: 1
cycle of 94.degree. C. for 2 min, 40 cycles of 94.degree. C. for 30
s, 53.degree. C. for 30 s, 72.degree. C. for 60 s and 1 cycle of
72.degree. C. for 10 min. Presence of the positive PCR product was
assessed using agarose gel electrophoresis. Primer pairs for the
5'-end and 3'-end integration sites contained one primer on the
plasmid and one primer in the genome.
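By way of non-limiting illustration, the thermocycler conditions
above can be written as a small data structure and the total
programmed hold time computed from it. The following is a minimal
sketch: the representation and the function name total_hold_seconds
are hypothetical, and ramp times between temperatures, which depend
on the instrument, are not included.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold in s), ...])
PROGRAM = [
    (1, [(94, 120)]),                       # initial denaturation
    (40, [(94, 30), (53, 30), (72, 60)]),   # denature, anneal, extend
    (1, [(72, 600)]),                       # final extension
]


def total_hold_seconds(program):
    """Sum of all programmed hold times, excluding ramping."""
    return sum(
        cycles * sum(seconds for _, seconds in steps)
        for cycles, steps in program
    )


seconds = total_hold_seconds(PROGRAM)
print(f"Programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```

The programmed holds total 5520 s (92 min); actual run time is longer
by the instrument's ramping time.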
[0326] Yeast strains GEVO1187, 2280, 2618, 2621 and 2622 were grown
in YPD overnight at 30.degree. C. A 100 mL culture was inoculated
to 1 OD/mL and split into two 50 mL cultures. This was time zero.
One of the 50 mL cultures received 500 .mu.M CuSO.sub.4 at time 2
hours and the other did not. Timepoints consisted of removing 1 mL
at times 0, 2, 2.5, 3, 4, 7.5, and 23 hours. At each timepoint the
OD.sub.600 was determined and acetoin concentrations were
determined using GC as described in the General Methods. Before GC,
samples were treated with H.sub.2SO.sub.4 to convert intermediates
to acetoin. The graph shows the acetoin concentrations in the media
of the strains in which transcription of the ALS genes was induced
by CuSO.sub.4. The acetoin values were normalized to cell OD. Both
the T. stipitatus ALS and the T. atroviride ALS showed increased
levels of acetoin as compared to the no ALS control (FIG. 2).
[0327] ALS activity in whole cell lysates is determined as
described in General Methods. Activity in mitochondrial/organellar
(P) and cytosolic (S) fractions and whole cell (W) lysates is
assayed as described in General Methods.
Example 5
Replacing Current Mitochondrially Targeted Isobutanol Pathway
Enzymes with Homologs or Functional Analogs from Anaerobic
Fungi
[0328] The purpose of this example is to illustrate that homologues
of isobutanol pathway enzymes from anaerobic fungi exhibit
cytosolic activity.
TABLE-US-00016 TABLE 15 Genotype of strains disclosed in Example 5.
GEVO No. | Genotype
GEVO2244 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3.DELTA.
TABLE-US-00017 TABLE 16 Plasmids disclosed in Example 5.
Plasmid name | Genotype
pGV1106 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1662 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-(kivD)
pGV1855 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ll_ilvD
[0329] Plasmid pGV1106 is described in Example 3, above.
[0330] Plasmid pGV1662 (SEQ ID NO: 81) served as the parental
plasmid of pGV1855, pGV1900, and pGV2019. The salient features of
pGV1662 include the yeast 2 micron origin of replication, the URA3
selectable marker, and the ScTEF1 promoter sequence followed by
restriction sites into which an ORF can be cloned to permit its
expression under the regulation of the TEF1 promoter.
[0331] Plasmid pGV1855 contains the L. lactis ilvD. The L. lactis
ilvD sequence was synthesized (DNA2.0, Menlo Park, Calif.) and
included a unique SalI and a NotI site at the 5' and 3' end of the
coding sequence, respectively. The synthesized DNA was digested
with SalI and NotI and ligated into vector pGV1662 that had been
digested with SalI plus NotI, yielding pGV1855.
[0332] The DHAD homolog (ilvD) from the anaerobic fungus Piromyces
sp. E2 has a predicted MTS of 49 amino acids at the N-terminus.
Thus, a nucleotide sequence encoding the Piromyces ilvD lacking the
N-terminal 49 amino acids and with a start codon placed at the
N-terminus was synthesized (SEQ ID NO: 73). In addition, a SalI
site and a BamHI site were introduced at the 5' and 3' ends of this
ORF. This fragment was cloned into the SalI and BamHI sites of
pGV1662. The resulting plasmid was transformed into GEVO2242. An
empty vector, pGV1106, is used as a negative control. Plasmid
pGV1855, expressing L. lactis ilvD, is used as a positive
control.
[0333] An independent clonal transformant of each plasmid is
isolated, and a 1 L culture of each strain is grown in
SCGal-Ura+9xIV at 30.degree. C. at 250 rpm. The OD.sub.600 is
noted, the cells are collected by centrifugation (1600.times.g, 2
min) and the culture medium is decanted. The cell pellets are
resuspended in 50 mL sterile deionized water, collected by
centrifugation (1600.times.g, 2 min), and the supernatant is
discarded.
[0334] To obtain spheroplasts, the cell pellets are resuspended in
0.1M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT is added to a final concentration of 10 mM. Cells are
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells are then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet is resuspended in spheroplasting buffer (final
concentrations: 1.2M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate pH 7.4) and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet is
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle and 50 mg of Zymolyase
20T (Seikagaku Biobusiness, Code#120491) is added to each cell
suspension. The suspensions are incubated overnight (approximately
16 hrs) at 30.degree. C. with gentle agitation (60 rev/min) on an
orbital shaker. The efficacy of spheroplasting is ascertained by
diluting an aliquot of each cell suspension 1:10 in either sterile
water or in spheroplasting buffer, and comparing the aliquots
microscopically (under 40.times. magnification). The spheroplasts
are centrifuged (3000.times.g, 10 min, 20.degree. C.), and the
supernatant is discarded. Each cell pellet is resuspended in 50 mL
spheroplast buffer without Zymolyase and cells are collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
[0335] To fractionate spheroplasts, the cells are resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) is added. The cell
suspension is subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension is
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) are saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant is transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000.times.g, 20 min,
4.degree. C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant is transferred to a clean tube on ice, being careful to
avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets are resuspended in MIB
containing Protease Arrest solution, and are labelled the "P"
("pellet") fractions. The protein concentration of each fraction is
determined using the BioRad Protein Assay reagent (BioRad,
Hercules, Calif.) according to manufacturer's instructions.
[0336] The DHAD activity of each fraction is ascertained using the
DHAD assays as described above in the General Methods.
Example 6
Modification of the N-Terminal Mitochondrial Targeting Sequence of
an Isobutanol Pathway Enzyme
[0337] The purpose of this example is to illustrate that removal or
modification of N-terminal mitochondrial targeting sequences allows
for cytosolic activity of isobutanol pathway enzymes.
TABLE-US-00018 TABLE 17 Genotype of strains disclosed in Example 6.
GEVO No. | Genotype
1803 | MATa/alpha ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1
TABLE-US-00019 TABLE 18 Plasmids disclosed in Example 6.
Plasmid name | Relevant Genes/Usage | Genotype
pGV1354 | Plasmid that contains the Ilv5.DELTA.N47 gene. | P.sub.TDH3:ILV.DELTA.N47:CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori.
pGV1662 | Parent vector that has Ampicillin resistance, the 2 .mu. origin, a URA3 gene, the TEF1 promoter, CYC1 terminator region and an E. coli origin. It also has the L. lactis KivD gene that is removed by cutting the plasmid with SalI and NotI, and then gel purifying the vector portion. SalI and NotI were used for cloning genes to be expressed from the TEF1 promoter. | pTEF1::L. lactis kivD::CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori.
pGV1810 | Plasmid that contains the full length ILV5 gene. This was used as a PCR template to generate the .DELTA.46-ilv5 mutant. | pTEF1::ILV5::CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori.
pGV1831 | Plasmid that contains the Ilv5.DELTA.N47 gene under control of the TEF1 promoter. | pTEF1::Sc Ilv5 N47::CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori.
pGV1833 | Plasmid that contains the full length ILV5 gene under control of the TEF1 promoter. | pTEF1::Sc ILV5:CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori
pGV1901 | The S. cerevisiae KARI with the N-terminal 46 amino acid deleted (.DELTA.46) cloned into pGV1662 at the SalI-NotI sites of the vector. The S. cerevisiae .DELTA.46 KARI was a SalI-NotI fragment that was PCR amplified from pGV1810 using primers 1809 and 1615. | pTEF1::.DELTA.46ilv5 KARI::CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori
pGV1824 | The E. coli coSc KARI cloned into pGV1662 at SalI-BamHI sites of the vector. | pTEF1::E. coli coSc KARI:CYC1 term, bla, ColE1 ORI, URA3, 2 .mu. ori
[0338] The yeast enzymes acetohydroxyacid synthase (AHAS;
ILV2+ILV6), ketol-acid reductoisomerase (KARI; ILV5), and
dihydroxyacid dehydratase (DHAD; ILV3) that carry out the first
three steps of isobutanol production are physiologically localized
to the mitochondria. Mitochondrial matrix proteins are typically
targeted to the mitochondria by an N-terminal mitochondrial
targeting sequence (MTS), which is then cleaved off in the
mitochondria resulting in the `mature` form of the enzyme.
N-terminal deletions of ILV5 have been shown to re-localize this
enzyme to the cytosol (Omura, 2008, Appl. Microbiol. Biotechnol.
78: 503-513; Omura, WO/2009/078108 A1, hereby incorporated by
reference in its entirety).
[0339] N-terminal mitochondrial targeting sequences (MTS) are
predicted by MitoProt II software; Claros et al., 1996, Eur. J.
Biochem. 241: 779-786. Two N-terminal deletions of the ILV5 gene
were constructed, one missing the first 46 amino acids and one
missing the first 47 amino acids.
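The kind of N-terminal deletion described above can be sketched in a few lines of Python. This is an illustration only: the function name delete_n_terminal and the toy open reading frame are hypothetical and are not the ILV5 sequence. The sketch assumes that the native start codon counts as the first amino acid and that a new ATG is placed in front of the retained codons, as is described for primer 1809 in this example.

```python
def delete_n_terminal(orf: str, n_residues: int) -> str:
    """Return an ORF lacking its first n_residues codons, with a new ATG start codon.

    Hypothetical helper for illustration; not part of the disclosed methods.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if n_residues * 3 >= len(orf) - 3:
        raise ValueError("deletion would remove the entire coding sequence")
    # Drop the first n_residues codons, then prepend a methionine codon.
    return "ATG" + orf[n_residues * 3:]


# Toy ORF (hypothetical): ATG + 5 codons + stop codon
toy_orf = "ATGGCTGCTAAAGGTCCGTAA"
truncated = delete_n_terminal(toy_orf, 2)
print(truncated)
```

Applied to a real coding sequence, calls with n_residues=46 and n_residues=47 would correspond to the two deletions named in the text.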
[0340] pGV1831 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. The Ilv5.DELTA.N47
gene was excised from plasmid pGV1354 (SEQ ID NO: 80) using SalI
and NotI. The ilv5.DELTA.N47 gene fragment (1.06 Kb) was purified
away from the larger vector fragment by agarose gel
electrophoresis. The pGV1662 vector and ilv5.DELTA.N47 insert were
ligated using standard methods in an approximately 5:1
insert:vector molar ratio and transformed into TOP10 chemically
competent E. coli cells. Plasmid DNA was isolated and correct
clones were confirmed using restriction enzyme analysis, namely
generation of the correct insert size by digesting clones with SalI
and NotI enzymes. The clones were verified by sequencing with the
primers 351, 1625, and 1626. Purified plasmid DNA was transformed
into S. cerevisiae strain GEVO1803 using a standard yeast
transformation protocol.
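The approximately 5:1 insert:vector molar ratio used in the ligation can be converted into DNA masses with the usual length-proportional relationship. The following Python sketch uses the fragment sizes given above (6.3 Kb vector, 1.06 Kb insert); the vector mass of 50 ng is an assumed value chosen only for illustration and is not stated in this example.

```python
def insert_mass_ng(vector_ng: float, vector_kb: float,
                   insert_kb: float, molar_ratio: float) -> float:
    """Mass of insert giving the requested insert:vector molar ratio.

    For double-stranded DNA, mass is proportional to length times moles,
    so the insert mass scales with (insert length / vector length).
    """
    return vector_ng * (insert_kb / vector_kb) * molar_ratio


# 50 ng of vector is an assumption for illustration only.
mass = insert_mass_ng(vector_ng=50.0, vector_kb=6.3, insert_kb=1.06, molar_ratio=5.0)
print(f"{mass:.1f} ng insert for 50 ng vector at 5:1")
```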
[0341] pGV1833 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. Primers 1615 and
1616 were used to amplify the S. cerevisiae ILV5 gene from the
plasmid template pGV1810 by PCR. The correct fragment size was
verified with DNA gel electrophoresis (1.2 Kb). The PCR product was
purified after PCR using the Qiagen QIAquick PCR Purification Kit.
The PCR product was then digested with XhoI and NotI to generate
ends compatible with the pGV1662 backbone (the XhoI end of the PCR
product is compatible with the SalI end of the vector, although the
ligated DNA fragment cannot be recut with either enzyme). After
digestion, the PCR product was purified with a Qiagen QIAquick PCR
Purification Kit. The two fragments were ligated using standard
methods in an approximately 5:1 insert:vector molar ratio and
transformed into TOP10 chemically competent E. coli cells. Plasmid
DNA was isolated and correct clones were confirmed using
restriction enzyme analysis. In this case, SacI plus NotI digestion
yielded a fragment of the predicted size (1.6 Kb). The clones were
verified by sequencing with the primers 351, 1625, and 1626.
Purified plasmid DNA was transformed into S. cerevisiae strain
GEVO1803.
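The compatibility of the XhoI and SalI ends noted above, and the loss of both sites after ligation, can be checked with a short Python sketch. It relies on the standard recognition sequences of the two enzymes (SalI, G^TCGAC; XhoI, C^TCGAG); the helper names overhang and ligate are hypothetical and serve only this illustration.

```python
# Recognition sites; both enzymes cut after the first base of the top strand,
# leaving a 4-nt 5' overhang.
SITES = {"SalI": "GTCGAC", "XhoI": "CTCGAG"}


def overhang(site: str) -> str:
    """Single-stranded 5' overhang left after cutting (bases 2-5 of the site)."""
    return site[1:5]


def ligate(left_site: str, right_site: str) -> str:
    """Top-strand junction formed by the left half of one cut site and the
    right half of another."""
    return left_site[:1] + right_site[1:]


# Vector end cut with SalI joined to insert end cut with XhoI.
junction = ligate(SITES["SalI"], SITES["XhoI"])
print(overhang(SITES["SalI"]), overhang(SITES["XhoI"]), junction)
print("recut by SalI or XhoI:", junction in SITES.values())
```

Because the junction matches neither recognition sequence, a SalI plus NotI digest cannot release the insert from such a clone, which is consistent with the use of a SacI plus NotI digest for screening in this paragraph.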
[0342] pGV1901 was constructed as follows. pGV1662 was digested
with SalI and NotI and the large fragment (6.3 Kb vector backbone)
was gel purified by agarose gel electrophoresis. The ILV5 gene was
amplified from pGV1810 (SEQ ID NO: 82) using primers 1809 (which
removes the first 46 amino acids from the N-terminus while adding a
methionine codon) and 1615. The PCR product was digested with SalI
and NotI. After digestion, the PCR product was purified on an
agarose gel and the proper fragment (1.07 Kb) was recovered using
the Zymoclean Gel DNA Recovery Kit. The pGV1662 vector and
Ilv5-.DELTA.46 PCR products were ligated using standard methods in
an approximately 5:1 insert:vector molar ratio and transformed into
TOP10 chemically competent E. coli cells. Plasmid DNA was isolated
and correct clones were confirmed with PCR screening of colonies
using primers 351 and 1577. The predicted correct PCR product was
580 bp. The clones were sequenced using primers 351, 1625, and
1626. Purified plasmid DNA was transformed into S. cerevisiae
strain GEVO1803 using the standard yeast transformation
protocol.
[0343] pGV1824 contains the E. coli ilvC gene that is codon
optimized for S. cerevisiae cloned into the SalI and BamHI sites of
pGV1662 as described above. The sequence of the codon optimized E.
coli ilvC is found as SEQ ID NO: 83.
[0344] Plasmids were transformed into the yeast strain GEVO1803 and
an individual colony was purified from each transformation. KARI
assays of whole cell lysates were performed at pH 7.5 as described
in General Methods. Results are shown in FIG. 3.
[0345] KARI activity in mitochondrial/organellar (P) and cytosolic
(S) fractions and whole cell (W) lysates is assayed as described in
General Methods.
Example 7
Scaffolding Two or More Isobutanol Pathway Enzymes
[0346] The purpose of this example is to illustrate how isobutanol
pathway enzymes can be scaffolded in order to localize them to the
cytosol.
[0347] Cellulolytic microorganisms utilize a scaffolded enzyme
complex called a cellulosome. In such a complex, numerous enzymes
are docked to a single scaffold protein, called a scaffoldin, which
contains multiple binding domains called cohesin domains. Each
cohesin domain interacts with a dockerin domain. In a cellulosome
complex, each cellulolytic enzyme also has a dockerin domain that
allows it to bind to the scaffoldin.
[0348] The cohesin domains of a scaffoldin protein, for example,
CipA from Clostridium thermocellum, can be expressed in yeast. The
dockerin domains from the cellulolytic enzymes from the same
organism, for example Xyn10B, can be fused to the isobutanol
enzymes and the fusion proteins expressed in yeast.
[0349] The activity of each pathway enzyme in whole cell lysates is
determined as described in General Methods. Activity in
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates is assayed as described in General Methods.
Example 8
Adding of Tags, e.g. Ubiquitin Tags, to the N-Terminus of an
Isobutanol Pathway Enzyme
[0350] The purpose of this example is to demonstrate that
isobutanol pathway enzymes can be targeted to the yeast cytosol.
For instance, this example illustrates how a DHAD enzyme can be
targeted to the yeast cytosol.
TABLE-US-00020 TABLE 18 Genotype of strains disclosed in Example 8.
GEVO No. | Genotype/Source
Gevo2242 | S. cerevisiae, CEN.PK; MAT-alpha ura3 leu2 his3 trp1 ilv5.sup.D255E pdc1::Bs-alsS,TRP1
Gevo2244 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1 ilv3.DELTA.
TABLE-US-00021 TABLE 19 Plasmids disclosed in Example 8.
pGV No. | Genotype
pGV1106 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TDH3 promoter-Myc tag-polylinker-CYC1 terminator
pGV1662 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-(kivD)
pGV1784 | pUC ori, kanR, Mm_ubiquitin coding sequence
pGV1855 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ll_ilvD
pGV1897 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)
pGV1900 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3(FL)
pGV2019 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3.DELTA.N
pGV2052 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-ScIlv3(FL)
pGV2053 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-ScIlv3.DELTA.N
pGV2054 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Ll_ilvD
pGV2055 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Gf_ilvD
pGV2056 | pMB1 ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Mm_ubiquitin(Gly-X)-Se_ilvD
[0351] To develop the constructs required to express DHAD as a
fusion with an N-terminal ubiquitin, plasmid pGV1784 was
synthesized by DNA2.0. This plasmid contained the synthesized
sequence for the Mus musculus ubiquitin gene, codon-optimized for
expression in S. cerevisiae (SEQ ID NO: 86). Using this plasmid as
the template, the M. musculus ubiquitin gene was amplified via PCR
using primers 1792 and 1794 to generate a PCR product containing
the M. musculus ubiquitin gene coding sequence flanked by
restriction sites XhoI and NotI at its 5' and 3' ends,
respectively, and altered so as to lack the codon for its
endogenous C-terminal-most glycine residue (denoted as Gly-X). This
PCR product was cloned into pGV1662 (described in Example 6),
yielding pGV1897.
[0352] Plasmid pGV1897 was then used as a recipient cloning vector
for sequences encoding S. cerevisiae ILV3 (ScIlv3(FL), SEQ ID NO:
88), S. cerevisiae Ilv3.DELTA.N (ScIlv3.DELTA.N, SEQ ID NO: 89), L.
lactis ilvD (Ll_ilvD, SEQ ID NO: 87), G. forsetti ilvD (Gf_ilvD,
SEQ ID NO: 90), and S. erythraea ilvD (Se_ilvD, SEQ ID NO: 91),
yielding plasmids pGV2052-2056, respectively.
[0353] The DHAD activity exhibited by cells transformed with each
of the resulting constructs is ascertained by in vitro assay.
GEVO2244 is transformed (singly) with pGV2052-2056, pGV1106 (empty
control vector), pGV1855 (expressing native, unfused Ll_ilvD) or
pGV1900 (expressing native, full-length Sc_ILV3(FL)). Lysates of
transformants are prepared and DHAD activity in
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates is assayed as described in Example 3.
[0354] In an analogous manner, a desired ALS (e.g., the B. subtilis
alsS) or KARI gene whose product is known or predicted to be
mitochondrial can be re-targeted to the cytosol by means of the
methods detailed in this example. The nucleotide sequence encoding
a full-length, or variant, ALS or KARI is amplified by PCR
using primers that introduce restriction sites convenient for
cloning the final product as an in-frame fusion with the M. musculus
ubiquitin gene. The resulting construct is transformed into a host
S. cerevisiae cell suitable for assaying the in vitro activity of
the expressed M. musculus ubiquitin-gene chimeric fusion protein,
using methods described in Example 3.
Example 9
Dihydroxy Acid Dehydratase Limits Isobutanol Production in
Yeast
[0355] This example illustrates the specific activity of various
DHAD homologs in yeast. The example also illustrates that high
specific activity of the Lactococcus lactis IlvD enzyme (SEQ ID NO:
18) correlates with an increase in isobutanol production.
[0356] Plasmid pGV1106 was used as a control and is described in
Example 3. Plasmid pGV1662 (described in Example 6) served as the
parental plasmid of pGV1855, pGV1900, and pGV2019 (see Example 5).
Plasmids pGV1851-1855 and pGV1904-1907 are all variants of pGV1662
(See Table 20), in which the kivD ORF sequence present in pGV1662
was excised and replaced with a sequence encoding a DHAD homolog,
as indicated below.
TABLE-US-00022 TABLE 20 Plasmids disclosed in Example 9.
pGV No. | Genotype
pGV1851 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Gramella forsetti ilvD
pGV1852 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Chromohalobacter salexigens ilvD
pGV1853 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ralstonia eutropha ilvD
pGV1854 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Saccharopolyspora erythraea ilvD
pGV1855 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Ll_ilvD
pGV1900 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-ScILV3(FL)
pGV1904 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Acidobacteria bacterium Ellin345 ilvD
pGV1905 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Picrophilus torridus DSM 9790 ilvD
pGV1906 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Piromyces species E2 ilvD
pGV1907 | pUC ori, bla (AmpR), 2 .mu.m ori, URA3, TEF1 promoter-Sulfolobus tokodaii strain 7 ilvD
[0357] Plasmid pGV1851 contains the G. forsetti ilvD gene (SEQ ID
NO: 90). Plasmid pGV1852 contains the C. salexigens gene (SEQ ID
NO: 95). Plasmid pGV1853 contains the R. eutropha gene (SEQ ID NO:
94). Plasmid pGV1854 contains the S. erythraea ilvD (SEQ ID NO:
91). Plasmid pGV1855 contains the L. lactis ilvD (SEQ ID NO: 87).
Plasmid pGV1900 contains the S. cerevisiae ILV3 (SEQ ID NO: 88).
Plasmid pGV1904 contains the A. bacterium Ellin345 ilvD (SEQ ID NO:
92). Plasmid pGV1905 contains the P. torridus DSM 9790 ilvD (SEQ ID
NO: 96). Plasmid pGV1906 contains the Piromyces sp. E2 ilvD (SEQ ID
NO: 93). Plasmid pGV1907 contains the S. tokodaii ilvD (SEQ ID NO:
97). All sequences (except that of the S. cerevisiae ILV3 (full
length)) were synthesized with 5' SalI and 3' NotI sites by DNA2.0
(Menlo Park, Calif.), digested with SalI and NotI, and ligated into
pGV1662 which had also been digested with SalI and NotI. For
plasmid pGV1900, the sequence containing the open reading frame of
the S. cerevisiae ILV3 (full length) was amplified from S.
cerevisiae genomic DNA using primers 1617 and 1618, and the
resulting 1.8 kb fragment was digested with SalI plus BamHI and
cloned into pGV1662. Various DHADs were tested for in vitro
activity using whole cell lysates. In this case, the DHADs were
expressed in a yeast deficient for DHAD activity (GEVO2244;
ilv3.DELTA.) (see Example 2) to minimize endogenous background
activity.
[0358] To grow cultures for cell lysates, triplicate independent
cultures of each desired strain were grown overnight in 3 mL
SCD-Ura+9xIV at 30.degree. C., 250 rpm. The following day, the
overnight cultures were diluted 1:50 into 50 mL fresh SCD-Ura in a
250 mL baffle-bottomed Erlenmeyer flask and incubated at 30.degree.
C. at 250 rpm. After approximately 10 hours, the OD.sub.600 of all
cultures was measured, and the cells of each culture were
collected by centrifugation (2700.times.g, 5 min). The cell pellets
were washed by resuspending in 1 mL of water, and the suspension
was placed in a 1.5 mL tube and the cells were collected by
centrifugation (16,000.times.g, 30 seconds). All supernatant was
removed from each tube and the tubes were frozen at -80.degree. C.
until use.
[0359] Lysates were prepared by resuspending each cell pellet in
0.7 mL of lysis buffer. The lysis buffer consisted of: 0.1M
Tris-HCl pH 8.0, 5 mM MgSO.sub.4, with 10 .mu.L of Yeast/Fungal
Protease Arrest solution (G Biosciences, catalog #788-333) per 1 mL
of lysis buffer. Eight hundred microliters of cell suspension were
added to 1 mL of 0.5 mm glass beads that had been placed in a
chilled 1.5 mL tube. Cells were lysed by bead beating (6 rounds, 1
minute per round, 30 beats per second) with 2 minutes chilling on
ice in between rounds. The tubes were then centrifuged
(20,000.times.g, 15 min) to pellet debris and the supernatants (cell
lysates) were retained in fresh tubes on ice. The protein
concentration of each lysate was measured using the BioRad Bradford
protein assay reagent (BioRad, Hercules, Calif.) according to
manufacturer's instructions.
[0360] The DHAD activity of each lysate was ascertained as follows.
In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each lysate was
mixed with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate (DHIV), 25
.mu.L of 0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M Tris-HCl pH 8.0,
and the mixture was incubated for 30 min at 35.degree. C. Each tube
was then heated to 95.degree. C. for 5 min to inactivate any
enzymatic activity, and the solution was centrifuged
(16,000.times.g for 5 min) to pellet insoluble debris. To prepare
samples for analysis, 100 .mu.L of each reaction were mixed with
100 .mu.L of a solution consisting of 4 parts 15 mM dinitrophenyl
hydrazine (DNPH) in acetonitrile with 1 part 50 mM citric acid, pH
3.0, and the mixture was heated to 70.degree. C. for 30 min in a
thermocycler. The solution was then analyzed by HPLC as described
above in General Methods to quantitate the concentration of
ketoisovalerate (KIV) present in the sample. The results are shown
in Table 21.
TABLE-US-00023 TABLE 21 Specific activities (KIV generation) from
lysates of S. cerevisiae strain GEVO2244 carrying plasmids to
overexpress the indicated DHAD homolog. Each data point is the
result of triplicate samples.
Plasmid | Gene | Specific activity (U/mg total protein)
pGV1106 | Control (i.e. no DHAD) | n.d.
pGV1851 | Gramella forsetti ilvD | 0.012
pGV1852 | Chromohalobacter salexigens (SEQ ID NO: 95) | n.d.
pGV1853 | Ralstonia eutropha (SEQ ID NO: 94) | n.d.
pGV1854 | Saccharopolyspora erythraea ilvD | 0.002
pGV1855 | Lactococcus lactis ilvD | 0.027
pGV1900 | Saccharomyces cerevisiae ILV3(FL) | 0.148
pGV1904 | Acidobacteria bacterium Ellin345 DHAD | 0.004
pGV1905 | Picrophilus torridus DSM 9790 DHAD | n.d.
pGV1906 | Piromyces Sp E2 DHAD | 0.016
pGV1907 | Sulfolobus tokodaii str. 7 DHAD | 0.001
* n.d., not detectable
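The arithmetic behind the DHAD assay of this example can be sketched as follows. The reaction volumes and stock concentrations are taken from the assay description; the measured KIV concentration and the lysate protein concentration are hypothetical inputs, and the sketch assumes the conventional definition of one unit (U) as 1 .mu.mol of KIV formed per minute, which is not restated in this example. The KIV value is taken to be the concentration in the 500 .mu.L reaction, i.e. after correcting for the 1:1 dilution with the DNPH solution.

```python
def final_conc_mM(stock_mM: float, added_uL: float, total_uL: float) -> float:
    """Final concentration of a component after dilution into the reaction."""
    return stock_mM * added_uL / total_uL


def specific_activity(kiv_mM: float, reaction_uL: float, minutes: float,
                      lysate_uL: float, protein_mg_per_mL: float) -> float:
    """Specific activity in U/mg, assuming 1 U = 1 umol KIV per minute."""
    kiv_umol = kiv_mM * reaction_uL / 1000.0              # mM x mL = umol
    protein_mg = protein_mg_per_mL * lysate_uL / 1000.0   # mg/mL x mL = mg
    return kiv_umol / minutes / protein_mg


# Volumes from the assay description: lysate, DHIV, MgSO4, Tris-HCl (uL)
total_uL = 50 + 50 + 25 + 375
dhiv_final = final_conc_mM(100.0, 50.0, total_uL)   # 0.1 M DHIV stock
mg_final = final_conc_mM(100.0, 25.0, total_uL)     # 0.1 M MgSO4 stock

# Hypothetical measurement: 1.0 mM KIV formed, lysate at 10 mg/mL protein
activity = specific_activity(kiv_mM=1.0, reaction_uL=total_uL, minutes=30.0,
                             lysate_uL=50.0, protein_mg_per_mL=10.0)
print(total_uL, dhiv_final, mg_final, round(activity, 4))
```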
Example 10
Dihydroxy Acid Dehydratase Limits Isobutanol Production in
Yeast
[0361] This example illustrates that high specific DHAD activity,
and in particular the high specific activity of the L. lactis IlvD
enzyme (SEQ ID NO: 18) correlates with an increase in isobutanol
production.
TABLE-US-00024 TABLE 22 Genotype of strains disclosed in Example 10.
GEVO No. | Genotype/Source
GEVO1186 | S. cerevisiae, CEN.PK; MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1
GEVO1188 | S. cerevisiae, CEN.PK; MAT.alpha. ura3 leu2 his3 trp1
GEVO1803 | MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1
GEVO2107 | MATa/.alpha. ura3/ura3 leu2/leu2 his3/his3 trp1/trp1 pdc1::Bs-alsS, TRP1/PDC1 pdc6::{ScTEF1p-Ll_kivd ScTDH3p-Dm_ADH URA3}/PDC6
TABLE-US-00025 TABLE 23 Plasmids disclosed in Example 10.
pGV No. | Genotype
p423GPD | P.sub.TDH3:MCS:T.sub.CYC1, HIS3, 2-micron, bla, pUC ori (Mumberg, D. et al. (1995) Gene 156: 119-122; obtained from ATCC)
pGV1103 | P.sub.TDH3:myc-tag:MCS:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
pGV1730 | P.sub.CUP1:Bs-alsS:T.sub.PDC1/PDC1-3' region:PDC1-5' region, TRP1, bla, pUC ori
pGV1914 | P.sub.TEF1:Ll_kivD P.sub.TDH3:Dm_ADH PDC6 5', 3' targeting homology URA3 pUC ori bla(ampR)
pGV1974 | P.sub.TEF1:Sc_ILV3.DELTA.N:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori bla(ampR)
pGV1981 | P.sub.TEF1:Lactococcus lactis ilvD-coSc:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
pGV2001 | P.sub.TEF1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2 micron, bla, pUC ori
[0362] Plasmid pGV1103 was generated by inserting a linker (primer
271 annealed to primer 272) containing a myc-tag and a new MCS
(SalI-EcoRI-SmaI-BamHI-NotI) into the SpeI and XhoI sites of
p423GPD. The construction of plasmid pGV1730 is described in
Example 4.
[0363] pGV1914 (SEQ ID NO: 117) is a yeast integrating vector that
includes the S. cerevisiae URA3 gene as a selection marker and
contains homologous sequence for targeting the HpaI-digested,
linearized plasmid for integration at the PDC6 locus of S.
cerevisiae. pGV1914 carries the D. melanogaster adh (Dm_ADH) (SEQ
ID NO: 116) and L. lactis kivd (Ll_kivD) genes, expressed under the
control of the S. cerevisiae TDH3 and TEF1 promoters, respectively.
The open reading frame sequence of Dm_ADH was originally amplified
by PCR from clone RH54514 (available from the Drosophila Genome
Resource Center).
[0364] Plasmid pGV1974 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V (SEQ ID NO: 98)
and S. cerevisiae ILV3.DELTA.N (SEQ ID NO: 89). pGV1974 was
generated by cloning a SacI-NotI fragment (4.9 kb, SEQ ID NO: 118)
carrying the S. cerevisiae TEF1 promoter:S. cerevisiae
ilv3.DELTA.N:S. cerevisiae TDH3 promoter:E. coli ilvC.sup.Q110V
into the SacI-NotI sites of pGV1103 (5.4 kb), a yeast expression
plasmid carrying the HIS3 marker.
[0365] Plasmid pGV1981 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V and L. lactis
ilvD. pGV1981 was generated by cloning a SalI-BamHI fragment (1.7
kb) carrying the L. lactis ilvD ORF (SEQ ID NO: 87 with SalI and
BamHI sites introduced at the 5' and 3' ends, respectively) into
the SalI-BamHI sites of pGV1974 (8.5 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF.
[0366] Plasmid pGV2001 is a yeast high copy plasmid with HIS3 as a
marker for the expression of E. coli ilvC.sup.Q110V. pGV2001 was
generated by digesting pGV1974 with SalI-BamHI to remove the S.
cerevisiae Ilv3.DELTA.N ORF. The digest was treated with Klenow to
fill in the 5' overhangs, and the larger 8.5 kb fragment was isolated
and self-ligated.
[0367] GEVO1803 was made by transforming GEVO1186 with the 6.7 kb
pGV1730 (contains S. cerevisiae TRP1 marker and the CUP1
promoter-driven B. subtilis alsS) that had been linearized by
digestion with NruI. Completion of the digest was confirmed by
running a small sample on a gel. The digested DNA was then purified
using Zymo Research DNA Clean and Concentrator and used in the
transformation. Trp+ clones were confirmed for the correct
integration into the PDC1 locus by colony PCR using primer pairs
1440+1441 and 1442+1443 for the 5' and 3' junctions, respectively.
Expression of B. subtilis alsS was confirmed by qRT-PCR using
primer pairs 1323+1324.
[0368] GEVO2107 was made by transforming GEVO1803 with linearized,
HpaI-digested pGV1914. Correct integration of pGV1914 at the PDC6
locus was confirmed by analyzing candidate Ura+ colonies by colony
PCR using primers 1440 plus 1441, or 1443 plus 1633, to detect the
5' and 3' junctions of the integrated construct, respectively.
Expression of all transgenes was confirmed by qRT-PCR using primer
pairs 1321 plus 1322, 1587 plus 1588, and 1633 plus 1634 to examine
Bs_alsS, Ll_kivD, and Dm_ADH transcript levels, respectively.
[0369] GEVO2107 was transformed with plasmids that contained
either a KARI alone (pGV2001 with E. coli ilvC.sup.Q110V) or the
same KARI with a DHAD (pGV1974 with the S. cerevisiae Ilv3.DELTA.N
or pGV1981 with the L. lactis ilvD). Fermentations were carried out
with three independent transformants for each DHAD homolog being
tested, as well as the no DHAD control plasmid. Seed cultures were
grown in SCD-H medium to mid-log phase. The fermentations were
initiated by collecting cells and resuspending in 25 mL of SCD-H
(5% glucose) medium to an OD.sub.600 of 1. Fermentations were
performed aerobically in 125 mL unbaffled flasks shaken at 250 rpm
at 30.degree. C. At 0, 24, 48 and 72 hours, the OD.sub.600 was
checked and 2 mL samples were taken. These samples were centrifuged
at 18,000.times.g in a microcentrifuge and 1.5 mL of the clarified
media was transferred to a 1.5 mL Eppendorf tube. The clarified
media was stored at 4.degree. C. until analyzed by GC and HPLC as
described in General Methods. At 24 and 48 hours, 2.5 mL of glucose
from a 400 g/L stock solution was added to the cultures. FIG. 4
shows the production of isobutanol in these fermentations. All
values were adjusted for the dilution caused by the volume change
from adding glucose. An increased amount of isobutanol was produced
from the cells expressing the L. lactis ilvD.
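The text states that values were adjusted for the dilution caused by the glucose additions but does not give the calculation. One possible way to do this is sketched below in Python. The sketch assumes that each 2 mL sample is withdrawn before the 2.5 mL glucose feed at the same time point and that evaporation is negligible; the function name dilution_factors is hypothetical, and the authors' actual adjustment may differ.

```python
def dilution_factors(start_mL, sample_mL, feed_mL, timepoints, feed_times):
    """Cumulative factor by which a measured concentration would be multiplied
    to undo dilution from feeds added before that time point's sample."""
    volume = start_mL
    factor = 1.0
    factors = {}
    for t in timepoints:
        factors[t] = factor          # sample is taken before any feed at time t
        volume -= sample_mL          # volume removed by sampling
        if t in feed_times:
            factor *= (volume + feed_mL) / volume
            volume += feed_mL        # volume added by the glucose feed
    return factors


# 25 mL cultures, 2 mL samples at 0, 24, 48 and 72 h, 2.5 mL feeds at 24 and 48 h
factors = dilution_factors(25.0, 2.0, 2.5, [0, 24, 48, 72], {24, 48})
for t, f in factors.items():
    print(f"{t} h: multiply measured titer by {f:.3f}")
```

Under these assumptions the 48 hour and 72 hour measurements would be scaled up by roughly 12% and 25%, respectively.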
Example 11
Assaying DHAD Activity in Fractionated Cell Extracts
[0370] The purpose of this Example is to describe how DHAD activity
can be measured in fractionated cellular extracts that are enriched
for either mitochondrial or soluble cytosolic components.
[0371] Plasmids pGV1106, pGV1662, pGV1855, pGV1900 are described in
Example 9 above. To measure the DHAD activities present in
fractionated cell extracts, the strain GEVO2244 was transformed
singly with either pGV1106, which served as an empty vector
control, or with one of: pGV1855, pGV1900, or pGV2019, which are
expression plasmids for L. lactis ilvD, S. cerevisiae ILV3 (full
length), and S. cerevisiae ILV3.DELTA.N, respectively.
[0372] An independent clonal transformant of each plasmid was
isolated, and a 1 L culture of each strain was grown in
SCGal-Ura+9xIV at 30.degree. C. at 250 rpm. The OD.sub.600 was
noted, the cells were collected by centrifugation (1600.times.g, 2
min) and the culture medium was decanted. The cell pellets were
resuspended in 50 mL sterile deionized water, collected by
centrifugation (1600.times.g, 2 min), and the supernatant was
discarded. The OD.sub.600 and total wet cell pellet weight of each
culture are listed in Table 24, below:
TABLE-US-00026 TABLE 24 OD.sub.600 and pellet mass (g) of strain
GEVO2244 transformed with the indicated plasmids.
Plasmid | OD.sub.600 | Pellet mass (g)
pGV1106 | 2.2 | 7.6
pGV1855 | 2.3 | 7.7
pGV1900 | 1.3 | 3.8
pGV2019 | 2.6 | 8.4
[0373] To obtain spheroplasts, the cell pellets were resuspended in
0.1 M Tris-SO.sub.4, pH 9.3, to a final concentration of 0.1 g/mL,
and DTT was added to a final concentration of 10 mM. Cells were
incubated with gentle (60 rev/min) agitation on an orbital shaker
for 20 min at 30.degree. C., and the cells were then collected by
centrifugation (1600.times.g, 2 min) and the supernatant discarded.
Each cell pellet was resuspended in spheroplasting buffer (final
concentrations: 1.2M sorbitol (Amresco, catalog #0691), 20 mM
potassium phosphate pH 7.4) and then collected by
centrifugation (1600.times.g, 10 min). Each cell pellet was
resuspended in spheroplasting buffer to a final concentration of
0.1 g cells/mL in a 500 mL centrifuge bottle, and 50 mg of
Zymolyase 20T (Seikagaku Biobusiness, Code#120491) was added to
each cell suspension. The suspensions were incubated overnight
(approximately 16 hrs) at 30.degree. C. with gentle agitation (60
rev/min) on an orbital shaker. The efficacy of spheroplasting was
ascertained by diluting an aliquot of each cell suspension 1:10 in
either sterile water or in spheroplasting buffer, and comparing the
aliquots microscopically (under 40.times. magnification). In all
cases, >90% of the water-diluted cells lysed, indicating
efficient spheroplasting. The spheroplasts were centrifuged
(3000.times.g, 10 min, 20.degree. C.), and the supernatant was
discarded. Each cell pellet was resuspended in 50 mL spheroplasting
buffer without Zymolyase, and cells were collected by
centrifugation (3000.times.g, 10 min, 20.degree. C.).
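The buffer volumes implied by resuspending cells to 0.1 g/mL can be worked out from the wet pellet masses in Table 24. The short Python sketch below does this; it assumes that the pellet mass recorded in Table 24 is the mass used for each resuspension step, which the text does not state explicitly, and the helper name resuspension_volume_mL is hypothetical.

```python
# Wet pellet masses (g) from Table 24
pellet_mass_g = {"pGV1106": 7.6, "pGV1855": 7.7, "pGV1900": 3.8, "pGV2019": 8.4}


def resuspension_volume_mL(mass_g: float, target_g_per_mL: float) -> float:
    """Final suspension volume needed to reach the target cell concentration."""
    return mass_g / target_g_per_mL


# 0.1 g cells/mL, as used for the Tris-SO4/DTT and spheroplasting steps
volumes = {p: resuspension_volume_mL(m, 0.1) for p, m in pellet_mass_g.items()}
for plasmid, vol in volumes.items():
    print(f"{plasmid}: {vol:.0f} mL")
```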
[0374] To fractionate spheroplasts, the cells were resuspended to a
final concentration of 0.5 g/mL in ice cold mitochondrial isolation
buffer (MIB), consisting of (final concentration): 0.6M D-mannitol
(BD Difco Cat#217020), 20 mM HEPES-KOH, pH 7.4. For each 1 mL of
resulting cell suspension, 0.01 mL of Yeast/Fungal Protease Arrest
solution (G Biosciences, catalog #788-333) was added. The cell
suspension was subjected to 35 strokes of a Dounce homogenizer with
the B (tight) pestle, and the resulting cell suspension was
centrifuged (2500.times.g, 10 min, 4.degree. C.) to collect cell
debris and unbroken cells and spheroplasts. Following
centrifugation, 2 mL of each sample (1 mL of the pGV1900
transformed cells) were saved in a 2 mL centrifuge tube on ice and
designated the "W" (for Whole cell extract) fraction, while the
remaining supernatant was transferred to a clean, ice-cold 35 mL
Oakridge screw-cap tube and centrifuged (12,000.times.g, 20 min,
4.degree. C.) to pellet mitochondria and other organellar
structures. Following centrifugation, 5 mL of each resulting
supernatant was transferred to a clean tube on ice, being careful
to avoid the small, loose pellet, and labelled the "S" (soluble
cytosol) fraction. The resulting pellets were resuspended in MIB
containing Protease Arrest solution, and were labelled the "P"
("pellet") fractions. Protein from the "P" fraction was released
after dilution 1:5 in DHAD assay buffer (see above) by rapid mixing
in a 1.5 mL tube with a Retsch Ball Mill MM301 in the presence of
0.1 mm glass beads. The bead-beating was performed 4 times for 1
minute, 30 beats per second, after which insoluble debris was
removed by centrifugation (20,000.times.g, 10 min, 4.degree. C.)
and the soluble portion retained for use.
[0375] The BioRad Protein Assay reagent (BioRad, Hercules, Calif.)
was used according to manufacturer's instructions to determine the
protein concentration of each fraction; the data are summarized in
Table 25, below:
TABLE-US-00027 TABLE 25 Protein concentrations of
mitochondrial/organellar (P) and cytosolic (S) fractions and whole
cell (W) lysates, prepared as described in the text.
plasmid/fraction | protein [.mu.g/.mu.L]
1106 P | 20.3
1855 P | 17.7
1900 P | 9.2
2019 P | 19.7
1106 S | 12.3
1855 S | 12.9
1900 S | 7.9
2019 S | 12.4
1106 W | 14.0
1855 W | 15.0
1900 W | 7.9
2019 W | 14.7
[0376] The DHAD activity of each fraction was ascertained as
follows. In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each
fraction was mixed with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate
(DHIV), 25 .mu.L of 0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M
Tris-HCl pH 8.0, and the mixture was incubated for 30 min at
35.degree. C. Each reaction was carried out in triplicate. Each
tube was then heated to 95.degree. C. for 5 min to inactivate any
enzymatic activity, and the solution was centrifuged
(16,000.times.g for 5 min) to pellet insoluble debris. To prepare
samples for analysis, 100 .mu.L of each reaction were mixed with
100 .mu.L of a solution consisting of 4 parts 15 mM dinitrophenyl
hydrazine (DNPH) in acetonitrile with 1 part 50 mM citric acid, pH
3.0, and the mixture was heated to 70.degree. C. for 30 min in a
thermocycler. Analysis of ketoisovalerate via HPLC was carried out
as described in General Methods. Data from the experiment are
summarized below in Table 26.
TABLE-US-00028 TABLE 26 Specific activities (KIV generation) and
ratios of specific activities from fractionated lysates of S.
cerevisiae strain GEVO2244 carrying plasmids to overexpress the
indicated DHAD homolog. Each data point is the result of triplicate
samples.
Lysate (pGV# and fraction*) | DHAD | Sp. Activity [U/mg protein in fraction] | Std. Dev. | Ratio of Sp. Activities (Cyto or Mito to Whole-Cell)
1106 WCL | -- | n.d. | |
1106 cyto | -- | n.d. | |
1106 mito | -- | n.d. | |
1855 WCL | Ll_ilvD | 0.0006 | 4.7E-05 |
1855 cyto | Ll_ilvD | 0.0011 | 0.0001 | 1.76
1855 mito | Ll_ilvD | 2E-05 | 3.5E-05 | 0.03
1900 WCL | ScILV3(FL) | 0.0096 | 0.0018 |
1900 cyto | ScILV3(FL) | 0.0052 | 0.0004 | 0.54
1900 mito | ScILV3(FL) | 0.0340 | 0.0029 | 3.53
*WCL, whole cell lysate; cyto, cytosolic-enriched fraction; mito,
mitochondrial (organellar)-enriched fraction
[0377] Cells overexpressing the L. lactis ilvD generated a
significantly greater proportion of DHAD activity in the cytosolic
fraction versus the mitochondrial fraction, whereas cells
overexpressing the full-length, native (mitochondrial) S.
cerevisiae ILV3 resulted in a greater proportion of the specific
activity residing in the mitochondrial fraction.
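The ratios of specific activities in Table 26 can be recomputed from the tabulated specific activities, as in the Python sketch below. Because the table entries are rounded, the recomputed ratios differ slightly from the tabulated ones (for example about 1.83 rather than 1.76 for the pGV1855 cytosolic fraction); this presumably reflects rounding of the underlying values and does not change the comparison between the two enzymes.

```python
# Specific activities (U/mg protein in fraction), as tabulated in Table 26
specific_activity = {
    "1855": {"WCL": 0.0006, "cyto": 0.0011, "mito": 2e-05},   # Ll_ilvD
    "1900": {"WCL": 0.0096, "cyto": 0.0052, "mito": 0.0340},  # ScILV3(FL)
}


def ratio_to_whole_cell(values: dict) -> dict:
    """Ratio of each fraction's specific activity to the whole-cell lysate."""
    return {frac: values[frac] / values["WCL"] for frac in ("cyto", "mito")}


ratios = {plasmid: ratio_to_whole_cell(v) for plasmid, v in specific_activity.items()}
for plasmid, r in ratios.items():
    print(plasmid, {frac: round(x, 2) for frac, x in r.items()})
```

A cytosolic ratio above 1 together with a mitochondrial ratio near 0 (pGV1855) indicates enrichment of activity in the cytosolic fraction, while the reverse pattern (pGV1900) indicates enrichment in the mitochondrial fraction.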
Example 12
Alternative, Native Dehydratases with DHAD Activity
[0378] This example describes how the overexpression of native
dehydratases in S. cerevisiae for the conversion of
2,3-dihydroxyisovalerate to ketoisovalerate is measured.
TABLE 27 Plasmids disclosed in Example 12.
pGV No. | Genotype
p426TEF | P.sub.TEF1:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori (Mumberg, D. et al. (1995) Gene 156: 119-122; obtained from ATCC)
1102 | P.sub.TEF1:HA-tag:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1106 | P.sub.TDH3:myc-tag:MCS:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1662 | P.sub.TEF1:Ll_kivd:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
1894 | P.sub.TEF1:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2000 | P.sub.TEF1:Sc_ILV3.DELTA.N:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2111 | P.sub.TEF1:Ll_ilvD:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2112 | P.sub.TEF1:Sc_LEU1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
2113 | P.sub.TEF1:Sc_HIS3:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, URA3, 2-micron, bla, pUC-ori
[0379] Plasmid pGV1102 was generated by inserting a linker (primer
269 annealed to primer 270) containing an HA-tag and a new MCS
(SalI-EcoRI-SmaI-BamHI-NotI) into the SpeI and XhoI sites of
p426TEF. Plasmids pGV1106 and pGV1662 are described in Examples 3
and 5, respectively. Plasmid pGV1894 is a yeast high copy plasmid
with URA3 as a marker for the expression of E. coli ilvC.sup.Q110V
and was generated by cloning a XhoI-NotI fragment (1.5 kb) carrying
the E. coli ilvC.sup.Q110V ORF (SEQ ID NO: 98) into the SalI-NotI
sites of pGV1662 (6.3 kb), replacing the L. lactis kivD ORF. Plasmids
pGV2000, pGV2111, pGV2112, and pGV2113 are yeast high copy plasmids
with URA3 as a marker for the expression of E. coli ilvC.sup.Q110V
and a DHAD. pGV2000 is generated by cloning a SacI-NotI fragment
(4.9 kb) from pGV1974 (described in Example 10) carrying the S.
cerevisiae TEF1 promoter:S. cerevisiae Ilv3.DELTA.N:S. cerevisiae
TDH3 promoter:E. coli ilvC.sup.Q110V into the SacI-NotI sites of
pGV1106 (6.6 kb), a yeast expression plasmid carrying the URA3
marker. pGV2111 is generated by cloning a SalI-BamHI fragment (1.7
kb) carrying the L. lactis ilvD ORF (SEQ ID NO: 97 with SalI and
BamHI sites introduced at the 5' and 3' ends, respectively) into
the SalI-BamHI sites of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF. pGV2112 is generated by cloning the S. cerevisiae
LEU1 gene as a SalI-BamHI fragment (2.3 kb), generated by PCR using
primers 2163 and 1842 with genomic DNA as template, into the
SalI-BamHI sites of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF. pGV2113 is generated by cloning the S. cerevisiae
HIS3 gene as a SalI-BamHI fragment (0.7 kb), generated by PCR using
primers 2183 and 2184 with genomic DNA as template, into the
SalI-BamHI sites of pGV2000 (8.4 kb), replacing the S. cerevisiae
Ilv3.DELTA.N ORF.
[0380] DHADs are tested for in vitro activity using whole cell
lysates. The DHADs as well as LEU1 and HIS3 are expressed from
pGV2000, pGV2112, and pGV2113 in GEVO2244 to minimize endogenous DHAD
background activity. A plasmid that does not express DHAD, pGV1894,
and a plasmid that expresses the L. lactis ilvD, pGV2111, are used
as negative and positive controls, respectively.
[0381] To grow cultures for cell lysates, triplicate independent
cultures of each desired strain are grown overnight in 3 mL
YNBD+HLW+10xIV at 30.degree. C., 250 rpm. The following day, the
overnight cultures are diluted 1:50 into 50 mL fresh YNBD+HLW+10xIV
in a 250 mL baffle-bottomed Erlenmeyer flask and incubated at
30.degree. C. at 250 rpm. After approximately 10 hours, the
OD.sub.600 of each culture is measured, and the cells of each
culture are collected by centrifugation (2700.times.g, 5 min). The
cell pellets are washed by resuspending in 1 mL of water, and the
suspension is placed in a 1.5 mL tube and the cells are collected
by centrifugation (16,000.times.g, 30 seconds). All supernatant is
removed from each tube and the tubes are frozen at -80.degree. C.
until use.
[0382] Lysates are prepared by resuspending each cell pellet in 0.7
mL of lysis buffer. The lysis buffer consists of 0.1M Tris-HCl
pH 8.0 and 5 mM MgSO.sub.4, with 10 .mu.L of Yeast/Fungal Protease
Arrest solution (G Biosciences, catalog #788-333) per 1 mL of lysis
buffer. Eight hundred microliters of cell suspension are added to 1
mL of 0.5 mm glass beads that have been placed in a chilled 1.5 mL
tube. Cells are lysed by bead beating (6 rounds, 1 minute per
round, 30 beats per second) with 2 minutes chilling on ice in
between rounds. The tubes are then centrifuged (20,000.times.g, 15
min) to pellet debris and the supernatants (cell lysates) are
retained in fresh tubes on ice. The protein concentration of each
lysate is measured using the BioRad Bradford protein assay reagent
(BioRad, Hercules, Calif.) according to manufacturer's
instructions.
[0383] The DHAD activity of each lysate is ascertained as follows.
In a fresh 1.5 mL centrifuge tube, 50 .mu.L of each lysate is mixed
with 50 .mu.L of 0.1M 2,3-dihydroxyisovalerate (DHIV), 25 .mu.L of
0.1 M MgSO.sub.4, and 375 .mu.L of 0.05M Tris-HCl pH 8.0, and the
mixture is incubated for 30 min at 35.degree. C. Each tube is then
heated to 95.degree. C. for 5 min to inactivate any enzymatic
activity, and the solution is centrifuged (16,000.times.g for 5
min) to pellet insoluble debris. To prepare samples for analysis,
100 .mu.L of each reaction are mixed with 100 .mu.L of a solution
consisting of 4 parts 15 mM dinitrophenyl hydrazine (DNPH) in
acetonitrile with 1 part 50 mM citric acid, pH 3.0, and the mixture
is heated to 70.degree. C. for 30 min in a thermocycler. The
solution is then analyzed by HPLC as described above in General
Methods to quantitate the concentration of ketoisovalerate (KIV)
present in the sample.
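The final reagent concentrations implied by the volumes above follow from the dilution relation C1*V1 = C2*V2. The sketch below is an illustration only; the helper name is hypothetical, and the Tris value counts only the added assay buffer, ignoring any Tris carried in with the lysate.

```python
def final_mM(stock_mM, added_ul, total_ul):
    """Final concentration after dilution (C1*V1 = C2*V2)."""
    return stock_mM * added_ul / total_ul


# Reaction volumes in microliters, taken from the paragraph above.
reaction_ul = {"lysate": 50, "DHIV": 50, "MgSO4": 25, "Tris-HCl": 375}
total_ul = sum(reaction_ul.values())

dhiv_mM = final_mM(100, reaction_ul["DHIV"], total_ul)    # 0.1 M stock
mgso4_mM = final_mM(100, reaction_ul["MgSO4"], total_ul)  # 0.1 M stock
# Tris from the added buffer only (0.05 M stock); lysate Tris is ignored.
tris_mM = final_mM(50, reaction_ul["Tris-HCl"], total_ul)

# Derivatization solution: 4 parts 15 mM DNPH plus 1 part 50 mM citric acid.
dnph_mM = final_mM(15, 4, 5)
citrate_mM = final_mM(50, 1, 5)

print(total_ul, dhiv_mM, mgso4_mM, tris_mM, dnph_mM, citrate_mM)
```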
[0384] DHADs are tested for in vitro activity using whole cell
lysates. The DHADs are expressed in a yeast deficient for DHAD
activity (GEVO2244; ilv3.DELTA.) to minimize endogenous background
activity.
Example 13
Cloning of Low-Abundance, Endogenous Cytosolic Iron-Sulfur Cluster
Assembly Machinery for Overexpression in S. cerevisiae
[0385] The purpose of this example is to describe how three known
components of the S. cerevisiae cytosolic iron-sulfur assembly
machinery were cloned to permit their overexpression in S.
cerevisiae, to increase cytosolic DHAD activity.
[0386] In the yeast S. cerevisiae, at least four genes--CIA1, CFD1,
NAR1, and NBP35--encode activities that contribute to the proper
assembly and/or transfer of iron-sulfur [Fe--S] clusters of
cytosolic proteins. Of these four genes, three--CFD1, NAR1, and
NBP35--have been shown to be expressed at very low levels during
aerobic growth on glucose (Ghaemmaghami et al., 2003, Nature, 425:
737-741). These three genes thus represent attractive candidates
for overexpression to increase the cellular capacity for proper
cytosolic [Fe--S] cluster protein assembly.
TABLE 27 Plasmids disclosed in Example 13.
pGV No. | Genotype
pGV2074 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter, TDH3 promoter
pGV2127 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter, TDH3 promoter-CFD1
pGV2138 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter, TEF1 promoter-NAR1, TDH3 promoter-CFD1
pGV2144 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter-NBP35, TEF1 promoter, TDH3 promoter
pGV2147 | pUC ori, bla (AmpR), 2 .mu.m ori, TPI1 promoter-hph (HygroR), PGK1 promoter-NBP35, TEF1 promoter-NAR1, TDH3 promoter-CFD1
[0387] To clone the sequences for CFD1, NAR1, and NBP35 into an
appropriate S. cerevisiae expression vector, the following steps
were carried out: Vector pGV2074 (SEQ ID NO: 133) was used as a
parental plasmid for subsequent cloning steps described below. The
salient features of pGV2074 include a bacterial origin of
replication (pUC) and selectable marker (bla), an S. cerevisiae 2
.mu.m origin of replication and selectable marker (the hph gene,
conferring resistance to hygromycin, operably linked to the TPI1
promoter region), and sequences containing the S. cerevisiae
promoters for the PGK1, TDH3 and TEF1 genes, each followed by one
or more unique restriction sites to facilitate the introduction of
coding sequences.
[0388] First, the CFD1 coding sequence was amplified from S.
cerevisiae genomic DNA by PCR, using primers 2195 and 2196, which
also added 5' XhoI and 3' NotI sites, respectively. The resulting
.about.890 bp product was digested with XhoI plus NotI and ligated
into pGV2074 that had been digested with XhoI plus NotI, yielding
the plasmid pGV2127. All sequences amplified by PCR were confirmed
by DNA sequencing. Next, the NAR1 coding sequence was amplified
from S. cerevisiae genomic DNA by PCR, using primers 2197 and 2198,
which added 5' SalI and 3' BamHI sites, respectively. The resulting
.about.1485 bp product was digested with SalI plus BamHI and cloned
into pGV2127 which had also been digested with SalI plus BamHI,
thereby yielding pGV2138. Next, the NBP35 coding sequence was
amplified from S. cerevisiae genomic DNA by PCR, using primers 2259 and
2260, which added 5' BglII and 3' KpnI and XhoI (from 5' to 3')
sites, respectively. The resulting .about.995 bp product was
digested with BglII plus XhoI and ligated into pGV2074 that had
been digested with BglII plus SalI, yielding pGV2144. Finally,
pGV2144 was digested with AvrII plus BamHI, and the resulting 1.78
kb fragment (which contained the PGK1 promoter and the NBP35 ORF
sequence) was gel purified and ligated into the vector pGV2138 that
had been digested with AvrII plus BglII, yielding pGV2147.
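Two of the ligations above join ends produced by different enzymes (XhoI to SalI, and BamHI to BglII), which works because those enzymes leave identical 5' overhangs. The sketch below checks this; the function names are hypothetical, and the table holds only the standard recognition sites and cut positions of the 5'-overhang enzymes used in this example (KpnI, which leaves a 3' overhang, is omitted).

```python
# Recognition site (top strand, 5'->3') and cut offset from its 5' end.
# All enzymes listed here cut symmetrically and leave 5' overhangs.
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
    "BamHI": ("GGATCC", 1),
    "BglII": ("AGATCT", 1),
    "AvrII": ("CCTAGG", 1),
    "NotI": ("GCGGCCGC", 2),
}


def overhang(enzyme):
    """Single-stranded 5' overhang left by a symmetric cut in a palindromic site."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """True if the two enzymes leave identical (ligatable) cohesive ends."""
    return overhang(enzyme_a) == overhang(enzyme_b)


for pair in (("XhoI", "SalI"), ("BamHI", "BglII"), ("XhoI", "NotI")):
    print(pair, compatible(*pair))
```

A junction formed between two different enzymes' ends (for example XhoI with SalI) generally regenerates neither recognition site, which is worth keeping in mind when planning later digests of the resulting plasmid.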
Example 14
Cloning of Heterologous Cytosolic Iron-Sulfur Cluster Assembly
Machinery for Overexpression in S. cerevisiae
[0389] The purpose of this example is to describe how one or more
cytosolic iron-sulfur assembly machinery components, from various
species, can be cloned to permit their overexpression in S.
cerevisiae, thereby increasing cytosolic DHAD activity.
[0390] In addition to the endogenous cytosolic iron-sulfur assembly
machinery found in S. cerevisiae, homologous sequences and
activities have been identified in other microbial and eukaryotic
species. In one example, the ApbC protein of Salmonella enterica
serovar Typhimurium has been shown, in vitro, to bind and
effectively transfer iron-sulfur clusters to a known cytosolic
[Fe--S] cluster-containing S. cerevisiae substrate, Leu1 (Boyd et
al., 2008, Biochemistry, 47: 8195-202). Thus, a number of other
useful homologs of the known S. cerevisiae cytosolic iron-sulfur
assembly machinery components exist and present attractive
candidates for overexpression in S. cerevisiae. Table 28 lists
several exemplary homologs and their GenBank accession numbers, as
identified by previous homology searches (Boyd et al., 2009, J.
Biol Chem 284: 110-118). Also included in the table are two closely
related S. cerevisiae homologs, Nbp35 and Cfd1. Of note, Ind1 is
reported to be localized to and functional in the mitochondria
(Bych et al., 2008, EMBO J. 27: 1736-46), whereas Hcf101 is
reported to participate in iron-sulfur cluster assembly in
Arabidopsis chloroplasts (Lezhneva et al., 2004, Plant J. Cell Mol
Biol 37: 174-185).
TABLE 28 Functionally homologous proteins involved in iron-sulfur cluster formation.
Gene | Source, Accession Number
ApbC | Salmonella enterica serovar Typhimurium LT2, NP_461098
Ind1 | Yarrowia lipolytica, YALI0B18590g
Hcf101 | Arabidopsis thaliana, AAR97892.1
Nubp1 | Homo sapiens, NP_002475.2
Nbp35 | S. cerevisiae, CAA96797.1
Cfd1 | S. cerevisiae, AAS56623
[0391] The cloning of one or more of these genes is carried out
using techniques well known to one skilled in the art.
Oligonucleotide primers are designed that are homologous to the 5'
and 3' ends of each desired reading frame, and which furthermore
incorporate a restriction site sequence convenient for the cloning
of each reading frame into vector pGV2074. A standard PCR reaction
is used to amplify each gene, either from the genome of each host
organism, or from an in vitro synthesized DNA fragment, and the
resulting PCR product is cloned into an expression vector
(pGV2074). In the case of a protein known to be targeted to the
mitochondria, such as Yarrowia lipolytica Ind1, PCR primers are
designed to amplify the majority of the coding sequence while
excluding the known N-terminal mitochondrial targeting sequence
(Bych et al., 2008, EMBO J. 27: 1736-46).
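Excluding an N-terminal targeting sequence amounts to dropping a whole number of codons from the 5' end of the reading frame and restoring a start codon. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the number of codons to remove must come from the published targeting-sequence boundary, and the example input is a toy sequence, not the Ind1 coding sequence.

```python
def strip_n_terminal_codons(cds, n_codons):
    """Remove the first n_codons codons and make sure the result starts with ATG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 0 <= n_codons < len(cds) // 3:
        raise ValueError("n_codons must leave at least one codon")
    trimmed = cds[3 * n_codons:]
    # Add a start codon only if the truncated frame no longer begins with one.
    return trimmed if trimmed.startswith("ATG") else "ATG" + trimmed


toy_cds = "ATGAAACCCGGGTTTTAA"  # hypothetical 6-codon reading frame
print(strip_n_terminal_codons(toy_cds, 3))
```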
Example 15
Overexpression of S. cerevisiae Cytosolic Iron-Sulfur Assembly
Machinery to Increase Cytosolic DHAD Activity
[0392] The purpose of this example is to describe how a plasmid
expressing one or more iron-sulfur assembly machinery components is
co-expressed with a DHAD, thereby increasing the cytosolic activity
of the DHAD.
[0393] Strain GEVO2244 is simultaneously co-transformed with one
of: pGV1851, pGV1852, pGV1853, pGV1854, pGV1855, pGV1904, pGV1905,
pGV1906, or pGV1907 (pGV1851-55 and pGV1904-07 are described in
Table 20); plus, one of either: pGV2074 (Table 27) (which serves as
an empty-vector control) or pGV2147 (Table 27) (which serves as the
cytosolic Fe--S cluster machinery overexpression plasmid), and
doubly-transformed cells are selected by plating onto SCD-Ura+9xIV
containing 0.1 g/L Hygromycin B.
[0394] Three independent isolates from each transformation are
cultured in SCD-Ura+9xIV containing 0.1 g/L Hygromycin B to obtain
a cell mass suitable for preparation of a lysate, as described in
Example 3. Lysates are prepared from each culture, and the
resulting lysates are assayed for DHAD activity as described in
Example 3. To further confirm that the increased DHAD activity is
due specifically to increased cytosolic activity, cultures of
GEVO2244 containing pGV1855 plus either pGV2074 or pGV2147 are
grown in SCD-Ura+9xIV containing 0.1 g/L Hygromycin B as otherwise
described in Example 11. Fractionated lysates are prepared and in
vitro assays to measure DHAD activity are further carried out as
described in Example 11.
Example 16
Deletion of LEU1
[0395] The purpose of this example is to describe the deletion of
LEU1 to increase the iron-sulfur cluster availability in the yeast
cytosol.
TABLE 29 Plasmids disclosed in Example 16.
pGV No. | Genotype
pGV1299 | K. lactis URA3, bla, pUC-ori (GEVO)
pGV1981 | P.sub.TEF1:Lactococcus lactis ilvD-coSc:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2-micron, bla, pUC-ori
pGV2001 | P.sub.TEF1:P.sub.TDH3:Ec_ilvC.sup.Q110V-coSc:T.sub.CYC1, HIS3, 2-micron, bla, pUC-ori
[0396] The LEU1 gene was deleted by transforming cells with a
leu1:K. lactis URA3 deletion cassette that was generated by two
rounds of PCR. Initially, the K. lactis URA3 gene was amplified
with primers 2171 and 2172 from pGV1299 (described in Example 2).
These primers add 40 bp of the LEU1 promoter and terminator
sequences to the 5' and 3' ends of the K. lactis URA3 gene. This
PCR product was then used as a template for a PCR using primers
2170 and 2173. Primer 2170 adds an additional 36 bp of the LEU1
promoter sequence at the 5' end and primer 2173 adds an additional
38 bp of the LEU1 terminator sequence at the 3' end. This PCR
product was transformed into GEVO2244 (described in Example 2) to
generate GEVO2570. The 5' junction of the integration was
confirmed by colony PCR using primers 2226 and 587. The 3' junction
of the integration was confirmed by colony PCR using primers 588
and 2175. The loss of the LEU1 gene was confirmed by a lack of PCR
product using primers 2167 and 2227.
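The two rounds of PCR build up the homology arms of the deletion cassette in stages. The sketch below tallies the resulting arm lengths with placeholder sequences; it assumes that the first-round primers add 40 bp on each side (the text states 40 bp for the promoter and terminator tails), and all sequences and names shown are hypothetical stand-ins rather than the actual LEU1 or URA3 sequences.

```python
def extend(template, tail_5, tail_3):
    """Model a PCR round whose primers add tails to both ends of the template."""
    return tail_5 + template + tail_3


marker = "N" * 20        # placeholder for the K. lactis URA3 marker
inner_5 = "A" * 40       # round 1: 40 bp of LEU1 promoter (assumed per side)
inner_3 = "T" * 40       # round 1: 40 bp of LEU1 terminator (assumed per side)
outer_5 = "C" * 36       # round 2: additional 36 bp of promoter (primer 2170)
outer_3 = "G" * 38       # round 2: additional 38 bp of terminator (primer 2173)

round1 = extend(marker, inner_5, inner_3)
round2 = extend(round1, outer_5, outer_3)

arm_5 = len(outer_5) + len(inner_5)  # total 5' homology to the LEU1 promoter
arm_3 = len(inner_3) + len(outer_3)  # total 3' homology to the LEU1 terminator
print(arm_5, arm_3, len(round2) - len(marker))
```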
[0397] GEVO2570 has a deletion in ILV3. GEVO2570 is used to measure
DHAD activity in the presence of L. lactis ilvD overexpressed as
described in Examples 2 and 4. A plasmid (pGV2001) with no DHAD is
used as a negative control.
Example 17
Conserved Motif Amongst Cytosolically Active DHAD Enzymes
[0398] This example illustrates that DHAD enzymes with a specific
amino acid sequence motif are more likely to be functional when
expressed in the yeast cytosol.
[0399] Based on the data from biochemical assays (see Example 10),
several DHAD homologs were identified that exhibit at least some
cytosolic activity. A total of ten different homologs were tested
using biochemical assays. The DHADs were expressed from 2 micron
yeast vectors and transformed into GEVO2244. The homologs were then
ranked based on their measured specific activity in both whole cell
lysates and in cytosolic fractions.
[0400] Based on these data, four DHAD homologs: L. lactis (SEQ ID
NO: 18), G. forsetii (SEQ ID NO: 17), Acidobacteria (SEQ ID NO:
16), and S. erythraea (SEQ ID NO: 19) exhibit cytosolic DHAD
activity. Four DHAD homologs exhibit no cytosolic DHAD activity: R.
eutropha (SEQ ID NO: 22), C. salexigens (SEQ ID NO: 23), P.
torridus (SEQ ID NO: 24), and S. tokodaii (SEQ ID NO: 25). One
motif-containing homolog was inconclusive: Piromyces sp. E2 (SEQ ID
NO: 21), which did not complement the GEVO2242 valine auxotrophy
but had detectable biochemical DHAD activity. Since this homolog
has a putative organellar targeting sequence, the protein is likely
to be mitochondrially located, which would explain its inability to
complement the GEVO2242 auxotrophy despite containing the
motif.
[0401] A multiple sequence alignment (MSA) was created using the
Align Multiple Sequences tool of Clone Manager 9 Professional
Edition software using the "MultiWay" function. This function
performs exhaustive pairwise global alignments of all sequences and
progressive assembly of alignments using Neighbor-Joining
phylogeny. A total of 53 representative DHAD homologs (FIG. 5) were
aligned using the BLOSUM62 scoring matrix
setting. This alignment generated the tree in FIG. 5.
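The "exhaustive pairwise global alignment" step can be illustrated with a bare-bones Needleman-Wunsch scorer. This is not the Clone Manager implementation: the sketch substitutes a simple match/mismatch/linear-gap scheme for the BLOSUM62 matrix, the function names are hypothetical, and the inputs are two of the short motif peptides listed in this example rather than full-length DHAD sequences.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs):
    """Score every unordered pair of named sequences (the exhaustive step)."""
    names = sorted(seqs)
    return {
        (x, y): global_alignment_score(seqs[x], seqs[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


peptides = {"SEQ149": "PIKKTGHLQIL", "SEQ150": "PIKETGHIQIL"}
print(pairwise_scores(peptides))
```

In a progressive aligner, a distance matrix derived from scores like these would feed the Neighbor-Joining guide tree; that step is not shown here.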
[0402] Many of the DHAD homologs exhibiting cytosolic activity are
related by overall homology (>40%) when compared to the
S. cerevisiae DHAD encoded by S. cerevisiae ILV3 (e.g. L. lactis,
G. forsetii, Acidobacteria, and S. erythraea). However, the 40%
homology cut-off still includes several DHAD homologs that do not
exhibit cytosolic DHAD activity (e.g. R. eutropha, C. salexigens,
P. torridus, and S. tokodaii). The Piromyces sp. E2 DHAD failed to
complement in the genetic/biochemistry assay but this result is
still consistent with our motif hypothesis since the protein still
retained its mitochondrial localization signal. Therefore, a common
sequence motif, unique to DHAD homologs that are cytosolically
active, was identified: P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), where
(I/L) indicates an isoleucine or leucine at that position, and X
indicates any natural or non-natural amino acid. This motif can be
found in all DHAD homologs exhibiting cytosolic activity.
Furthermore, an even more specific version of this motif was
identified that is conserved in all of the DHAD homologs exhibiting
cytosolic activity except for the S. erythraea DHAD:
PIKXXGX(I/L)XIL (SEQ ID NO: 28). This motif is conserved amongst
the majority if not all eukaryotic homologs of DHAD.
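The two motifs translate directly into regular expressions, which gives a quick way to screen candidate protein sequences. The sketch below is illustrative only: X is modeled as any single character, the function name is hypothetical, and the example sequences are invented toy peptides rather than actual DHAD sequences.

```python
import re

# P(I/L)XXXGX(I/L)XIL (SEQ ID NO: 27), with X as any single residue.
GENERAL_MOTIF = re.compile(r"P[IL].{3}G.[IL].IL")
# PIKXXGX(I/L)XIL (SEQ ID NO: 28), the more specific version.
SPECIFIC_MOTIF = re.compile(r"PIK.{2}G.[IL].IL")


def has_motif(protein_seq, pattern=GENERAL_MOTIF):
    """True if the one-letter protein sequence contains the motif anywhere."""
    return pattern.search(protein_seq.upper()) is not None


toy_a = "MSPIKATGALEILTT"  # hypothetical; carries both motif versions
toy_b = "MSPLRATGALEILTT"  # hypothetical; carries only the general motif
for seq in (toy_a, toy_b):
    print(seq, has_motif(seq), has_motif(seq, SPECIFIC_MOTIF))
```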
[0403] Six additional DHAD homologs were identified: SEQ ID NOs:
10-15 as specified in Table 1. These DHAD homologs (SEQ ID NOs:
10-15) contain the motifs PYHKEGGLGIL (SEQ ID NO: 145), PYSEKGGLAIL
(SEQ ID NO: 146), PYKPEGGIAIL (SEQ ID NO: 147), PLKPSGHLQIL (SEQ ID
NO: 148), PIKKTGHLQIL (SEQ ID NO: 149), and PIKETGHIQIL (SEQ ID NO:
150), respectively.
Example 18
Use of Cytosolically Localized DHADs for the Production of
Isobutanol
[0404] The following example illustrates the use of DHADs that have
cytosolic activity in yeast and that, when expressed in the context of
an isobutanol biosynthetic pathway, lead to isobutanol production.
[0405] A yeast strain that contains one integrated copy of the B.
subtilis alsS gene codon-optimized for expression in S. cerevisiae
(SEQ ID NO: 144), two integrated copies of the L. lactis kivD gene
(SEQ ID NOs: 99 and 151), one integrated copy of L. lactis
adhA.sup.RE1 gene (SEQ ID NO: 152), and one integrated copy of the
S. cerevisiae AFT1 gene (SEQ ID NO: 153) was transformed with high
copy three-component isobutanol pathway plasmids containing a KARI
(Ec_ilvC_coSc.sup.P2D1-A1-his6, SEQ ID NO: 154), an ADH (L. lactis
adhA.sup.RE1, SEQ ID NO: 152) and a DHAD which was expressed from
the S. cerevisiae PDC1-286 promoter. The DHAD varied according to
Table 31. Isobutanol titer and DHAD activity of each strain were
compared to those of a control strain that did not express a DHAD in
the plasmid. Strains, plasmids, and DHADs are listed in Tables 30,
31, and 32, respectively.
TABLE 30 Genotype of strains disclosed in Example 18.
GEVO No. | Genotype
GEVO3868 | S. cerevisiae, CEN.PK2, MATa ura3 leu2 his3 trp1 gpd1::T.sub.Kl_URA3 gpd2::T.sub.Kl_URA3 tma29::T.sub.Kl_URA3 pdc1::P.sub.PDC1-Ll_kivD2_coSc5-P.sub.FBA1-LEU2-T.sub.LEU2-P.sub.ADH1-Bs_alsS1_coSc-T.sub.CYC1-P.sub.PGK1-Ll_kivD2_coEc-P.sub.ENO2-Sp_HIS5 pdc5::T.sub.Kl_URA3 pdc6::P.sub.TDH3-Sc_AFT1-P.sub.ENO2-Ll_adhA.sup.RE1-T-.sub.Kl_URA3_short-P.sub.FBA1-Kl_URA3-T.sub.Kl_URA3 {evolved for C2 supplement-independence, glucose tolerance and faster growth}
TABLE 31 Plasmids disclosed in Example 18.
Plasmid Name | DHAD | Genotype
pGV2663 | none | P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2635 | L. lactis | P.sub.PDC1-286-Ll_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2671 | S. cerevisiae | P.sub.PDC1-286-Sc_ilv3_.DELTA.N20, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2672 | G. forsetii | P.sub.PDC1-286-Gf_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2673 | S. erythraea | P.sub.PDC1-286-Se_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2674 | F. tularensis | P.sub.PDC1-286-Ft_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2675 | S. cerevisiae ilv3.DELTA.N19 | P.sub.PDC1-286-Sc_ilv3_.DELTA.N19, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2676 | S. cerevisiae ilv3.DELTA.N23 | P.sub.PDC1-286-Sc_ilv3_.DELTA.N23, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2677 | N. crassa ilvD2 | P.sub.PDC1-286-Nc_ilvD2_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2678 | Acidobacteria bacterium | P.sub.PDC1-286-Ab_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2679 | Acaryochloris marina | P.sub.PDC1-286-Am_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2680 | Lyngbya spp. | P.sub.PDC1-286-Lsp_ilvD_coSc, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
pGV2681 | E. coli | P.sub.PDC1-286-Ec_ilvD_coKl, P.sub.TDH3-Ec_ilvC_coSc.sup.P2D1-A1-his6, P.sub.ENO2-Ll_adhA.sup.RE1, 2 .mu.-ori, pUC ori, bla, G418r
TABLE 32 DHAD sequences disclosed in Example 18.
DHAD | Abbreviation | SEQ ID NO (DNA) | SEQ ID NO (protein)
L. lactis | Ll_ilvD_coSc | 155 | 18
S. cerevisiae ilv3.DELTA.N20 | Sc_ilv3_.DELTA.N20 | 89 | 26
G. forsetii | Gf_ilvD_coSc | 90 | 17
S. erythraea | Se_ilvD_coSc | 91 | 19
F. tularensis | Ft_ilvD_coSc | 156 | 14
S. cerevisiae ilv3.DELTA.N19 | Sc_ilv3_.DELTA.N19 | 157 | 163
S. cerevisiae ilv3.DELTA.N23 | Sc_ilv3_.DELTA.N23 | 158 | 164
N. crassa ilvD2 | Nc_ilvD2_coSc | 159 | 165
A. bacterium | Ab_ilvD_coSc | 92 | 16
A. marina | Am_ilvD_coSc | 160 | 166
Lyngbya spp. | Lsp_ilvD_coSc | 161 | 167
E. coli | Ec_ilvD_coKl | 162 | 168
[0406] Cloning techniques included digestion with restriction
enzymes, gel purification of DNA fragments (using the Zymoclean Gel
DNA Recovery Kit, Cat# D4002, Zymo Research Corp, Orange, Calif.),
ligation of two DNA fragments using the DNA Ligation Kit (Mighty
Mix Cat# TAK 6023, Clontech Laboratories, Madison, Wis.), and
bacterial transformations into competent E. coli cells (Xtreme
Efficiency DH5.alpha. Competent Cells, Cat# ABP-CE-CCO2096P, Allele
Biotechnology, San Diego, Calif.). Plasmid DNA was purified from E.
coli cells using the Qiagen QIAprep Spin Miniprep Kit (Cat#27106,
Qiagen, Valencia, Calif.).
[0407] Yeast media used for this example include YP medium (1%
(w/v) yeast extract, 2% (w/v) peptone), YPD medium (YP medium
containing 2% (w/v) glucose), and YPD supplemented with glycerol and
ethanol (YPD medium containing 1% (v/v) 80% glycerol and 1% (v/v)
ethanol). The antibiotic G418 was added to agar plates to a final
concentration of 0.2 g/L. Precultures were grown in YP medium
supplemented with 5% glucose, 1% ethanol, and 0.2 g/L G418.
Fermentations were carried out in YP medium containing 8% glucose,
1% v/v of ergosterol and Tween-80 in 100% ethanol, 200 mM MES (pH
6.5), and 0.2 g/L G418.
[0408] A large patch of S. cerevisiae strain GEVO3868 was grown on
a YPD plate. Cells from the patch were scraped from the plate,
resuspended in 2 mL YPD containing 1% v/v ethanol and 1% v/v
80% glycerol and placed in the 30.degree. C. orbital shaker
overnight. The following morning, 1 mL of the overnight culture was
used to inoculate 50 mL YPD containing 1% ethanol and 1% v/v
80% glycerol and returned to the 30.degree. C. orbital shaker.
After 6 hours, the cells were at an OD.sub.600 of 0.55. They were
diluted to an OD.sub.600 of 0.1 in the same media and grown
overnight at 30.degree. C. In the morning the cells were diluted to
an OD.sub.600 of 0.6, grown for 3 hours at 30.degree. C. until the
OD.sub.600 was 1.1, and the cells were collected by centrifugation
at 2700 rcf for 2 min at room temperature. The medium was removed,
50 mL sterile milliQ water was used to wash the cells, and the
cells were centrifuged for 2 min at 2700 rcf at room temperature.
After removing the supernatant, the cells were washed with 25 mL
sterile milliQ water and centrifuged at 2700 rcf for 2 min at room
temperature. The supernatant was removed and the cells were
resuspended in 1 mL 100 mM lithium acetate. The cells were
centrifuged for 10 sec, the supernatant removed, and the cells
resuspended in 400 .mu.L 100 mM lithium acetate. The cells were
transformed as follows. First, a mixture of plasmid DNA (final
volume of 15 .mu.l with sterile water), 72 .mu.l 50% PEG, 10 .mu.l
1M lithium acetate, and 3 .mu.l of denatured salmon sperm DNA (10
mg/mL) was prepared for each transformation. In a sterile 1.5 mL
tube, 15 .mu.l of the cell suspension was added to the DNA mixture
(100 .mu.l), and the transformation suspension was vortexed for 5
short pulses. The transformation was incubated for 30 min at
30.degree. C., followed by incubation for 22 min at 42.degree. C.
The cells were collected by centrifugation (18,000.times.g, 10
seconds, 25.degree. C.). After removing the supernatant, the cells
were resuspended in 400 .mu.l YPD. After an overnight recovery
shaking at 30.degree. C. and 250 RPM, the cells were spread over
selective plates, YPD containing 0.2 g/L G418. Transformants were
then single colony purified onto selective plates.
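The volumes in the transformation mixture can be checked with a short calculation. The sketch below is only an illustration with hypothetical variable names; it assumes the 15 .mu.l of cells carry 100 mM lithium acetate from the final resuspension and that volumes are additive.

```python
# DNA mixture components in microliters, as listed in the paragraph above.
mix_ul = {
    "plasmid DNA in water": 15,
    "50% PEG": 72,
    "1 M lithium acetate": 10,
    "salmon sperm DNA (10 mg/mL)": 3,
}
cells_ul = 15  # cell suspension in 100 mM lithium acetate (assumed carry-over)

mix_total_ul = sum(mix_ul.values())
final_ul = mix_total_ul + cells_ul

# Final concentrations in the complete transformation suspension.
peg_percent = 50 * mix_ul["50% PEG"] / final_ul
liac_mM = (1000 * mix_ul["1 M lithium acetate"] + 100 * cells_ul) / final_ul
carrier_mg_per_ml = 10 * mix_ul["salmon sperm DNA (10 mg/mL)"] / final_ul

print(mix_total_ul, final_ul, round(peg_percent, 1), liac_mM,
      round(carrier_mg_per_ml, 2))
```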
[0409] For fermentations, 3 mL cultures of GEVO3868 transformed
with each 2 .mu. plasmid were started in YPD containing 1% ethanol
and 0.2 g/L G418 and incubated overnight at 30.degree. C.
and 250 RPM. There were three biological replicates of each strain
for 39 cultures total. After the OD.sub.600 of these cultures was
measured the next day, the appropriate amount of culture was used to
inoculate 50 mL of YP with 5% glucose, 1% ethanol, and
0.2 g/L G418 (baffled flask) to an OD.sub.600 of
approximately 0.1. These cultures were incubated at 30.degree. C.
and 250 RPM overnight. The next day, the cultures containing the S.
cerevisiae ilv3.DELTA.N20, the S. cerevisiae ilv3.DELTA.N19, and
the S. cerevisiae ilv3.DELTA.N23 did not reach an OD.sub.600 of 5
(0.6-2.4) so incubation continued for another 24 h at 30.degree. C.
and 250 RPM. The remaining 30 cultures had reached an OD.sub.600 of
approximately 5 and were centrifuged in 50 mL Falcon tubes at 2700
rcf for 5 min at 25.degree. C. The cells from the 30 cultures were
resuspended in 50 mL YP with 8% glucose, 1% (v/v) ethanol,
ergosterol, Tween-80, 200 mM MES (pH 6.5), and 0.2 g/L G418. The
cultures were transferred to 250 mL unbaffled flasks with closed
screw caps and incubated at 30.degree. C. and 75 RPM. The next day,
the remaining 9 cultures were at a higher OD.sub.600 (3-5) and
prepared for the fermentation as described above. At 24 and 48 h
after transfer to 250 mL unbaffled flasks with closed screw caps,
samples of each of the 39 flasks were taken to determine OD.sub.600
and prepared for gas chromatography as follows. 2 mL of sample (per
flask) was removed and OD.sub.600 was determined. The remaining
sample was centrifuged for 10 min at maximum speed. 1 mL of the
supernatant was analyzed by gas chromatography as described. For
the final 72 h timepoint, the same procedures were used for
measuring OD.sub.600 and analysis by gas chromatography. In
addition samples were analyzed by high performance liquid
chromatography. Cells were also prepared for enzyme assays. After
3.times.15 mL Falcon tubes per flask were weighed (total of 117),
14 mL of the appropriate sample was transferred into the Falcon
tubes. After centrifugation at 3000.times.g for 5 min at 4.degree.
C., the supernatant was removed and the cells washed in 3 mL cold,
sterile water. The tubes were centrifuged as per above for 2 min,
the supernatant removed, and the tubes reweighed to determine total
cell weight. The Falcon tubes were stored at -80.degree. C.
[0410] Analysis of organic acid metabolites was performed on an
HP-1200 HPLC system equipped with two Restek RFQ 150.times.7.8 mm
columns in series. Organic acid metabolites were detected using an
HP-1100 UV detector (210 nm) and a refractive index detector. The column
temperature was 60.degree. C. This method was isocratic with 0.0180
N H.sub.2SO.sub.4 (in Milli-Q water) as mobile phase. Flow was set
to 1.1 mL/min. Injection volume was 20 .mu.L and run time was 16
min. Analysis was performed using authentic standards (>99%,
obtained from Sigma-Aldrich, with the exception of
2,3-dihydroxyisovalerate (DHIV), which was custom synthesized
according to Cioffi et al., 1980, Anal Biochem 104: 485) and a
5-point calibration curve.
[0411] Analysis of volatile organic compounds, including ethanol
and isobutanol, was performed on an HP 5890, 6890 or 7890 gas
chromatograph fitted with an HP 7673 Autosampler, a DB-FFAP column
(J&W; 30 m length, 0.32 mm ID, 0.25 .mu.m film thickness) or
equivalent connected to a flame ionization detector (FID). The
temperature program was as follows: 230.degree. C. for the
injector, 300.degree. C. for the detector, 100.degree. C. oven for
1 minute, 70.degree. C./minute gradient to 230.degree. C., and then
hold for 2.5 min. Analysis was performed using authentic standards
(>99%, obtained from Sigma-Aldrich) and a 5-point calibration
curve with 1-pentanol as the internal standard.
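The oven program above fixes the length of each GC run. As a rough check, the run time can be computed from the initial hold, the ramp, and the final hold; the function name below is hypothetical, and the result ignores instrument equilibration and cool-down between injections.

```python
def oven_program_minutes(initial_hold_min, start_c, ramp_c_per_min,
                         final_c, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 100 C for 1 min, 70 C/min to 230 C, then hold for 2.5 min.
run_min = oven_program_minutes(1.0, 100, 70, 230, 2.5)
print(round(run_min, 2))
```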
[0412] For DHAD activity assays cells were thawed on ice and
resuspended in lysis buffer (50 mM Tris pH 8.0 and 5 mM MgSO.sub.4)
for a 20% cell suspension by mass. 1000 .mu.l of glass beads (0.5
mm diameter) were added to a 1.5 ml Eppendorf tube and 875 .mu.l of
cell suspension was added. Yeast cells were lysed using a Retsch
MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing 6.times.1 min
each at full speed with 1 min incubations on ice between each
bead-beating step. The tubes were centrifuged for 10 min at
23,500.times.g at 4.degree. C. and the supernatant was removed for
use. These lysates were held on ice until assayed. Yeast lysate
protein concentration was determined using the BioRad Bradford
Protein Assay Reagent Kit (Cat#500-0006, BioRad Laboratories,
Hercules, Calif.) and using BSA for the standard curve. Briefly, 10
.mu.L standard or lysate were added into a microcentrifuge tube.
The samples were diluted to fit in the linear range of the standard
curve (1:40). 500 .mu.L of diluted and filtered Bio-Rad protein
assay dye was added to the blank and samples and then vortexed.
Samples were incubated at room temperature for 6 min, transferred
into cuvettes and the OD.sub.595 was determined in a
spectrophotometer. The linear regression of the standards was then
used to calculate the protein concentration in each sample. For
DHAD assays, technical triplicates were performed for each sample.
In addition, a no-lysate control with lysis buffer was performed.
To assay each sample, 10 .mu.L of an appropriate dilution of lysate
in assay buffer was mixed with 90 .mu.L of assay buffer (5 .mu.L of
0.1 M MgSO.sub.4, 10 .mu.L of 0.1 M DHIV, and 75 .mu.L 50 mM Tris
pH 8.0), and incubated in a thermocycler for 30 minutes at
35.degree. C., then at 95.degree. C. for 5 minutes. Cell debris and
precipitate were removed from the samples by centrifugation at
3000.times.g for 5 min.
[0413] Finally, 75 .mu.L of supernatant was transferred to new PCR
tubes and analyzed by Liquid Chromatography for the
2-keto-isovalerate (KIV) product. DNPH reagent (12 mM
2,4-dinitrophenylhydrazine, 20 mM citric acid pH 3.0, 80%
acetonitrile, 20% MilliQ H.sub.2O) was added to each sample in a 1:1
ratio. Samples were incubated for 30 min at 70.degree. C. in a
thermo-cycler (Eppendorf, Mastercycler). Analysis of KIV was
performed on an HP-1200 High Performance Liquid Chromatography
system equipped with an Eclipse XDB C-18 reverse phase column
(Agilent) and a C-18 reverse phase column guard (Phenomenex).
Ketoisovalerate was detected using an HP-1100 UV detector (360 nm).
The column temperature was 50.degree. C. This method was isocratic
with 70% acetonitrile, 2.5% phosphoric acid (4%), and 27.5% water as
mobile phase. Flow was set to 3 mL/min. Injection size was 10 .mu.L
and run time was 2 min.
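The DHAD assay mix described above (10 .mu.L of lysate dilution plus 90 .mu.L of assay buffer) fixes the final reagent concentrations in the 100 .mu.L reaction. The sketch below works these out with C1*V1 = C2*V2. The volumes and stock concentrations are from the text; ignoring whatever Tris and MgSO.sub.4 the diluted lysate itself contributes is a simplifying assumption.

```python
# Volumes and stock concentrations are from the assay description above.
# Contributions from the 10 uL of diluted lysate are ignored (an assumption).

REACTION_UL = 10.0 + 90.0  # lysate dilution + assay buffer

components = {
    # name: (volume in uL, stock concentration in mM)
    "MgSO4": (5.0, 100.0),
    "DHIV": (10.0, 100.0),
    "Tris pH 8.0": (75.0, 50.0),
}

def final_mM(volume_ul, stock_mM, total_ul=REACTION_UL):
    """C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_mM * volume_ul / total_ul

final = {name: final_mM(v, c) for name, (v, c) in components.items()}
for name, conc in final.items():
    print(f"{name}: {conc:g} mM")
```

The three assay buffer components add up to the stated 90 .mu.L, which is a quick consistency check on the recipe.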
[0414] The data at 72 hours are summarized in Table 33. The data
demonstrate that the DHADs contained in plasmids pGV2635, pGV2677,
pGV2674, pGV2672, pGV2673, and pGV2676 led to production of
isobutanol titers of at least 2.5 g/L and are considered to be
significantly active in the cytosolic isobutanol pathway. The DHADs
contained in plasmids pGV2675, pGV2681, pGV2680, pGV2678, pGV2679,
and pGV2671 led to production of isobutanol titers below 2.5 g/L
and are considered to be inactive or poorly active in the cytosolic
isobutanol pathway.
TABLE-US-00036 TABLE 33 Isobutanol production with selected DHADs.
Plasmid (DHAD Gene) | OD.sub.600 | Isobutanol produced [g/L] | DHAD activity (U/mg)
pGV2635 (L. lactis) | 8.6 .+-. 0.6 | 9.02 .+-. 0.28 | 0.62 .+-. 0.01
pGV2677 (N. crassa) | 9.4 .+-. 0.6 | 6.30 .+-. 0.85 | 0.42 .+-. 0.02
pGV2674 (F. tularensis) | 7.5 .+-. 0.7 | 6.22 .+-. 0.31 | 0.30 .+-. 0.00
pGV2672 (G. forsetii) | 8.1 .+-. 0.6 | 6.10 .+-. 0.26 | 0.20 .+-. 0.00
pGV2673 (S. erythraea) | 8.0 .+-. 1.1 | 3.23 .+-. 0.12 | 0.03 .+-. 0.00
pGV2676 (S. cerevisiae ilv3.DELTA.N23) | 5.2 .+-. 0.2 | 2.67 .+-. 0.06 | 0.02 .+-. 0.00
pGV2675 (S. cerevisiae ilv3.DELTA.N19) | 5.0 .+-. 0.2 | 2.27 .+-. 0.16 | 0.09 .+-. 0.00
pGV2681 (E. coli) | 6.9 .+-. 0.6 | 2.21 .+-. 0.09 | 0.03 .+-. 0.00
pGV2680 (Lyngbya spp.) | 6.9 .+-. 1.3 | 2.13 .+-. 0.09 | 0.02 .+-. 0.00
pGV2678 (Acidobacteria) | 7.5 .+-. 0.2 | 2.06 .+-. 0.17 | 0.03 .+-. 0.00
pGV2679 (A. marina) | 7.5 .+-. 0.6 | 2.05 .+-. 0.06 | 0.03 .+-. 0.00
pGV2671 (S. cerevisiae) | 5.5 .+-. 0.0 | 1.92 .+-. 0.03 | 0.44 .+-. 0.01
pGV2663 (none) | 6.7 .+-. 0.2 | 1.53 .+-. 0.18 | 0.01 .+-. 0.01
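The 2.5 g/L cutoff used to sort the DHADs can be applied directly to the mean titers in Table 33. The following sketch does that; the titers are copied from the table, while the variable names and the choice to compare mean values only (ignoring the reported spread) are assumptions made for illustration.

```python
# Mean isobutanol titers (g/L) copied from Table 33; the spread is ignored here.
titers = {
    "pGV2635": 9.02, "pGV2677": 6.30, "pGV2674": 6.22, "pGV2672": 6.10,
    "pGV2673": 3.23, "pGV2676": 2.67, "pGV2675": 2.27, "pGV2681": 2.21,
    "pGV2680": 2.13, "pGV2678": 2.06, "pGV2679": 2.05, "pGV2671": 1.92,
}
CONTROL = ("pGV2663", 1.53)  # no DHAD gene
THRESHOLD_G_PER_L = 2.5

active = [p for p, t in titers.items() if t >= THRESHOLD_G_PER_L]
poor = [p for p, t in titers.items() if t < THRESHOLD_G_PER_L]
print("active:", active)
print("inactive or poorly active:", poor)
```

With the table values, six plasmids fall on each side of the cutoff, and every plasmid in the lower group still has a mean titer above the no-DHAD control.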
Example 19
Overexpression of the L. lactis ilvD in K. lactis and K.
marxianus
[0415] The purpose of this example is to demonstrate activity of L.
lactis DHAD in K. lactis and in K. marxianus.
[0416] Strains, plasmids, and sequences disclosed herein are listed
in Tables 34, 35, and 36, respectively.
TABLE-US-00037 TABLE 34 Genotype of strains disclosed in Example 19.
GEVO Number | Genotype
K. marxianus strain GEVO2504 | K. marxianus NRRL-Y-7571 ura3-delta2 pdc1.DELTA.::Ll.kivd2 coSc. P.sub.TDH3: Dm_ADH:P.sub.FBA1:URA3: P.sub.Sc_FBA1:31COX4_MTS:Bs_alsS1_coSc
K. marxianus strain GEVO2543 | ura3-delta2 pdc1.DELTA.::{Ll_kivd2 co:P.sub.Sc_TDH3:Ec_ilvC.sup.Q110V coSC: P.sub.Sc_TPI1:G418.sup.R:P.sub.Sc_CUP1:Bs_alsS1_coSc}
K. marxianus strain GEVO2598 | ura3-delta2 pdc1.DELTA.::{Ll_kivd2 co:P.sub.Sc_TDH3:Ec_ilvC.sup.Q110V coSC: P.sub.Sc_TPI1:G418.sup.R:P.sub.Sc_CUP1:Bs_alsS1_coSc} + random integration of {P.sub.Sc_TEF1:Ll_ilvD_coSc URA3}
K. lactis strain GEVO1287 | MATalpha uraA1 trp1 leu2 lysA1 ade1 lac4-8 [pKD1] ATCC 200826
TABLE-US-00038 TABLE 35 Plasmids disclosed in Example 19.
Plasmid Name | Relevant Genes/Usage | Genotype
pGV2271 | Empty 1.6 micron vector that can be maintained in K. lactis. Encodes hygromycin resistance. | 1.6 .mu. ori, bla, hygroR
pGV2273 | 1.6 micron vector for expression of KARI, KIVD, DHAD and ADH in K. lactis | P.sub.TDH3: Ec_ilvC_P2D1-A1, P.sub.TEF1: Ll_ilvD_coSc, P.sub.PGK1: Ll_kivD2_coEc, P.sub.ENO2: Ll_adhA, 1.6 .mu. ori, bla, HygroR
pGV2069 | 2 micron plasmid for expression of KIVD, DHAD, KARI, and ALS in K. marxianus | P.sub.TDH3: Ec_ilvC_coSc.sup.Q110V, P.sub.TEF1: Ll_ilvD_coSc, P.sub.PGK1: Ll_kivD2_coEc, P.sub.CUP1: Bs_alsS1_coSc, P.sub.ENO2: Dm_adhA, 2 .mu. ori, bla, G418
pGV1855 | 2 micron plasmid for expression of DHAD in K. marxianus | P.sub.TEF1: Ll_ilvD, 2 .mu. ori, bla, URA
TABLE-US-00039 TABLE 36 Amino acid and nucleotide sequences of enzymes and genes disclosed in Example 19.
Enz. | Source | Gene (SEQ ID NO) | Corresponding Protein (SEQ ID NO)
ALS | B. subtilis | Bs_alsS1_coSc (SEQ ID NO: 144) | Bs_AlsS1_coSc (SEQ ID NO: 169)
KARI | E. coli | Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98) | Ec_IlvC_coSc.sup.Q110V (SEQ ID NO: 170)
KARI | E. coli | Ec_ilvC_coSc.sup.P2D1-A1 (SEQ ID NO: 171) | Ec_ilvC_coSc.sup.P2D1-A1 (SEQ ID NO: 172)
KIVD | L. lactis | Ll_kivd2_coEc (SEQ ID NO: 99) | Ll_Kivd2_coEc (SEQ ID NO: 173)
DHAD | L. lactis | Ll_ilvD_coSc (SEQ ID NO: 155) | Ll_IlvD_coSc (SEQ ID NO: 18)
ADH | L. lactis | Ll_adhA (SEQ ID NO: 174) | Ll_adhA (SEQ ID NO: 175)
ADH | D. melanogaster | Dm_adh (SEQ ID NO: 116) | Dm_adh (SEQ ID NO: 176)
[0417] To generate GEVO2543, GEVO2504 was transformed with pGV2069
to integrate three genes into the genome: Bs_alsS1_coSc (SEQ ID NO:
144), Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98), and Ll_kivd2_coEc
(SEQ ID NO: 99). To generate GEVO2598, GEVO2543 was transformed
with pGV1855 to integrate the L. lactis ilvD gene, which was codon
optimized for S. cerevisiae (gene sequence SEQ ID NO: 155, also
referred to as Ll_ilvD_coSc; protein sequence SEQ ID NO: 18), into
the chromosome. GEVO1287 was transformed with either pGV2271
(control plasmid) or pGV2273, which contains Ll_ilvD_coSc.
[0418] GEVO2543, GEVO2598 and GEVO1287 transformed with pGV2271 or
pGV2273 were inoculated into 3 mL of YPD (for GEVO2543 and
GEVO2598) or YPD supplemented with 0.1 g/L hygromycin (for
GEVO1287) for an overnight culture. After approximately 18 hours, a
50 ml YPD culture in a baffled 250 ml shake flask was inoculated to
an OD.sub.600 of 0.15 and shaken at 250 rpm for approximately 9 hours.
Next, DHAD activity and protein concentrations were measured.
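Inoculating a shake flask to a target OD.sub.600, as described above, is a simple dilution calculation. The sketch below shows one way to compute the volume of overnight culture required; the target OD.sub.600 and flask volume are from the text, whereas the overnight OD.sub.600 reading is a hypothetical value and the function name is illustrative.

```python
def inoculum_volume_ml(target_od, culture_ml, overnight_od):
    """Volume of overnight culture so that culture_ml starts at target_od.

    Uses C1 * V1 = C2 * V2 and treats culture_ml as the final volume.
    """
    return target_od * culture_ml / overnight_od

# Target OD600 and flask volume are from the text; the overnight OD600 of 6.0
# is a hypothetical reading chosen only for illustration.
volume = inoculum_volume_ml(target_od=0.15, culture_ml=50.0, overnight_od=6.0)
print(f"add {volume:.2f} mL of overnight culture")
```

The same helper would apply to the 840 mL cultures inoculated at an OD.sub.600 of 0.03 in Example 20, given a measured overnight density.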
[0419] Over-expression of the L. lactis ilvD gene resulted in an
increase in DHAD activity (U/mg total cell lysate protein). Table
37 shows the DHAD activity (U/mg total cell lysate protein)
averages from technical triplicates comparing strains expressing
the L. lactis DHAD to strains not expressing the L. lactis DHAD
gene.
TABLE-US-00040 TABLE 37 DHAD activity in whole cell yeast lysates.
Strain | Activity [mU/mg]
K. marxianus strain GEVO2543 (no DHAD) | 0.010 .+-. 0.002
K. marxianus strain GEVO2598 (DHAD) | 0.016 .+-. 0.001
K. lactis strain GEVO1287 + pGV2271 (No DHAD) | 0.052 .+-. 0.003
K. lactis strain GEVO1287 + pGV2273 (DHAD) | 0.122 .+-. 0.011
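The size of the activity increase in Table 37 can be expressed as a fold change for each host. The sketch below computes that ratio together with a first-order propagated uncertainty. Treating the reported spreads as standard deviations of independent measurements is an assumption, since the text does not say how they were calculated, and the function name is illustrative.

```python
import math

def fold_change(test, test_sd, control, control_sd):
    """Ratio test/control with first-order propagated uncertainty.

    Assumes independent errors, so relative errors add in quadrature.
    """
    ratio = test / control
    rel = math.sqrt((test_sd / test) ** 2 + (control_sd / control) ** 2)
    return ratio, ratio * rel

# Values from Table 37 (mean and reported spread, mU/mg).
km_ratio, km_err = fold_change(0.016, 0.001, 0.010, 0.002)
kl_ratio, kl_err = fold_change(0.122, 0.011, 0.052, 0.003)
print(f"K. marxianus: {km_ratio:.2f} +/- {km_err:.2f} fold")
print(f"K. lactis:    {kl_ratio:.2f} +/- {kl_err:.2f} fold")
```

Under these assumptions both hosts show a fold change above 1, with the larger relative increase in K. lactis.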
Example 20
L. lactis ilvD Activity is Localized to the Yeast Cytosol
[0420] The purpose of this example is to demonstrate that the
Lactococcus lactis ilvD protein localizes to the cytosol when
expressed in a yeast strain.
[0421] The S. cerevisiae strain GEVO1187 (S. cerevisiae CEN.PK2,
MATa ura3 leu2 his3 trp1 ADE2) was transformed with plasmid
pGV2484, a 2 micron plasmid expressing the L. lactis ilvD gene
which was codon optimized for S. cerevisiae (gene sequence SEQ ID
NO: 155, also referred to as Ll_ilvD_coSc; protein sequence SEQ ID
NO: 18) under the S. cerevisiae TEF1 promoter
(P.sub.TEF1:Ll_ilvD_coSc, 2.mu. ori, bla, G418R). Briefly, the
strain was grown in YPD to an OD.sub.600 of 0.6-0.8. Cells were
washed in H.sub.2O, and then resuspended in 100 mM lithium acetate.
In a 1.5 mL tube, 15 .mu.L of the cell suspension was added to a
mixture of DNA (final volume of 15 .mu.l with sterile water), 72
.mu.l 50% PEG, 10 .mu.l 1M lithium acetate, and 3 .mu.l of
denatured salmon sperm DNA (10 mg/mL). The transformation
suspension was vortexed for 5 short pulses. The mixture was
incubated at 30.degree. C. for 30 minutes, followed by incubation
for 22 minutes at 42.degree. C. The cells were collected by
centrifugation (18,000.times.g, 10 seconds, 25.degree. C.). The
cells were resuspended in 1 ml YPD medium (1% (w/v) yeast extract,
2% (w/v) peptone, 2% (w/v) glucose, pH 5) and after an overnight
recovery shaking at 30.degree. C. and 250 rpm, the cells were
spread over YPD agar plates supplemented with 0.2 g/L G418.
Transformants were then single colony purified onto G418 selective
plates.
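The transformation mix above is specified by volume, so the final concentrations in the tube follow from the component volumes. The sketch below works them out. The volumes and stock concentrations are from the protocol; treating the cell suspension as a plain 100 mM lithium acetate solution (ignoring the volume occupied by cells) is an assumption.

```python
# Volumes (uL) and stock concentrations from the transformation protocol above.
mix_ul = {
    "cells in 100 mM lithium acetate": 15.0,
    "DNA in sterile water": 15.0,
    "50% PEG": 72.0,
    "1 M lithium acetate": 10.0,
    "salmon sperm DNA, 10 mg/mL": 3.0,
}
total_ul = sum(mix_ul.values())

peg_percent = 50.0 * mix_ul["50% PEG"] / total_ul
# Lithium acetate comes from both the 1 M stock and the 100 mM cell suspension.
liac_mM = (1000.0 * mix_ul["1 M lithium acetate"]
           + 100.0 * mix_ul["cells in 100 mM lithium acetate"]) / total_ul
carrier_mg_per_ml = 10.0 * mix_ul["salmon sperm DNA, 10 mg/mL"] / total_ul

print(f"total {total_ul:g} uL, PEG {peg_percent:.1f}%, "
      f"LiAc {liac_mM:.0f} mM, carrier DNA {carrier_mg_per_ml:.2f} mg/mL")
```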
[0422] All isolations of crude mitochondrial fractions were
performed in duplicate. GEVO1187 and GEVO1187 transformed with
pGV2484 were each grown in 100 mL of YPG medium (1% (w/v) yeast
extract, 2% (w/v) peptone, 3% (v/v) glycerol, pH 5) overnight at
30.degree. C. and 250 rpm. This overnight culture was used to
inoculate 840 mL of YPG in a 2800 mL baffled flask at an OD.sub.600
of 0.03, and cells were grown at 30.degree. C. and 250 rpm for
20-28 h. At an OD.sub.600 of about 2.0, cells were harvested by
centrifugation at 3000.times.g for 5 minutes, resuspended in 100 mL
H.sub.2O followed by centrifugation at 3000.times.g for 5 minutes.
Cells were incubated in 2 mL/g CWW (cell wet weight) of DTT buffer
(100 mM Tris-H.sub.2SO.sub.4 pH 9.4, 10 mM DTT) for 20 minutes at
30.degree. C. Cells were resuspended in 7 mL/g CWW Zymolyase buffer
(1.2 M sorbitol, 20 mM Potassium phosphate pH 7.4) and then
centrifuged at 3000.times.g for 5 minutes. Cells were spheroplasted
by incubating in Zymolyase buffer with Zymolyase (Seikagaku
Biobusiness Corporation #120491-1; 3 mg/g CWW) for 45 minutes at
30.degree. C. on a rocking platform. 100 OD of spheroplasts were
set aside for whole cell lysate preparation (see below).
Spheroplasts were resuspended in Zymolyase buffer and centrifuged
at 3000.times.g for 5 minutes before resuspension in 6.5 mL/g CWW
homogenization buffer (chilled to 4.degree. C.; 6.5 mL/g 0.6 M
sorbitol, 10 mM Tris-HCl pH 7.4, 1 mM EDTA, 1 mM PMSF, 0.2% (w/v)
BSA). Spheroplasts were homogenized on ice with 15 strokes of a
pre-chilled glass-Teflon homogenizer (40 mL capacity), and the
sample was diluted 2-fold with homogenization buffer. Cell debris
and nuclei were pelleted by serial supernatant centrifugations of
1500.times.g for 5 minutes, and 4000.times.g for 5 minutes. The
mitochondrial fraction was isolated by centrifugation at
12,000.times.g for 15 minutes. The crude mitochondrial pellet was
resuspended in 10 mL SEM buffer (250 mM sucrose, 1 mM EDTA, 10 mM
MOPS-KOH pH 7.2), centrifuged at 4000.times.g for 5 minutes to
further remove cellular debris and nuclei before recovering the
mitochondrial fraction by centrifugation at 12,000.times.g for 15
minutes. The mitochondrial fraction may contain markers of the
plasma membrane, the endoplasmic reticulum, and vacuoles in
addition to markers of the mitochondria. Mitochondrial pellet was
resuspended in 750 .mu.L SEM Buffer+Protease Arrest (GBiosciences
#786-108).
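Several reagents in the fractionation protocol above are dosed per gram of cell wet weight (CWW), so the amounts scale with the size of the harvested pellet. The sketch below tabulates that scaling. The per-gram amounts are from the text; the 4.0 g pellet weight and all names are hypothetical, chosen only to show the arithmetic.

```python
# Per-gram amounts are from the fractionation protocol above.
PER_GRAM_CWW = {
    "DTT buffer (mL)": 2.0,
    "Zymolyase buffer wash (mL)": 7.0,
    "Zymolyase (mg)": 3.0,
    "homogenization buffer (mL)": 6.5,
}

def reagent_amounts(cell_wet_weight_g):
    """Scale each per-gram amount by the cell wet weight (CWW)."""
    return {name: per_g * cell_wet_weight_g for name, per_g in PER_GRAM_CWW.items()}

# 4.0 g is a hypothetical pellet weight used only to show the scaling.
amounts = reagent_amounts(4.0)
for name, amount in amounts.items():
    print(f"{name}: {amount:g}")
```

The subsequent 2-fold dilution of the homogenate with homogenization buffer is not included in these amounts.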
[0423] Preparation of whole cell yeast lysates was performed using
the 100 ODs of yeast cells set aside after spheroplasting (see
above) by resuspending cells in 20% (w/v) SEM Buffer+1.times.
Protease Arrest (GBiosciences #786-108). 1000 .mu.l of glass beads
(0.5 mm diameter) were added to a 1.5 ml eppendorf tube, and 875
.mu.l of cell suspension was added. Yeast cells were lysed using a
Retsch MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing
6.times.1 min each at full speed with 1 min incubations on ice
between each bead-beating step. The tubes were centrifuged for 10
min at 23,500.times.g at 4.degree. C., the supernatant was removed,
aliquoted, flash frozen in liquid nitrogen, and stored at
-80.degree. C.
[0424] The resuspended mitochondrial fraction (see above) was added
to 1000 .mu.l of glass beads (0.1 mm diameter) in a 1.5 ml
Eppendorf tube. Additional buffer was added if necessary to fill
the tube completely. The mitochondrial fraction was lysed using a
Retsch MM301 mixer mill (Retsch Inc. Newtown, Pa.), mixing
3.times.1 minute each at full speed with 1 minute incubations on
ice between each bead-beating step. The tubes were centrifuged for
10 min at 23,500.times.g at 4.degree. C., the supernatant was
removed, aliquoted, flash frozen in liquid nitrogen, and stored at
-80.degree. C.
[0425] Whole cell yeast lysate and mitochondrial fraction lysate
protein concentration was determined using the BioRad Bradford
Protein Assay Reagent Kit (Cat#500-0006, BioRad Laboratories,
Hercules, Calif.) and using BSA for the standard curve. Briefly, 10
.mu.L standard or lysate were added into a microcentrifuge tube.
The samples were diluted to fit in the linear range of the standard
curve (1:10-1:40). 500 .mu.L of diluted and filtered Bio-Rad
protein assay dye was added to the blank and samples and then
vortexed. Samples were incubated at room temperature for 6 min,
transferred into cuvettes and the OD.sub.595 was determined in a
spectrophotometer. The linear regression of the standards was then
used to calculate the protein concentration in each sample.
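The standard-curve step described above amounts to fitting a line through the BSA standards and reading each diluted sample off that line. A minimal sketch follows. The BSA concentrations and OD.sub.595 readings are hypothetical placeholders, not data from this disclosure; the function names are illustrative, and a 1:40 dilution is interpreted as a 40-fold dilution factor.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def protein_mg_per_ml(od595, slope, intercept, dilution_factor):
    """Read a diluted sample off the standard curve, then undo the dilution."""
    return (od595 - intercept) / slope * dilution_factor

# Hypothetical BSA standards (mg/mL) and OD595 readings, for illustration only.
bsa_mg_per_ml = [0.0, 0.125, 0.25, 0.5, 1.0]
bsa_od595 = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(bsa_mg_per_ml, bsa_od595)
# A lysate diluted 1:40 (the upper end of the range in the text) reading 0.37.
lysate = protein_mg_per_ml(0.37, slope, intercept, dilution_factor=40)
print(f"lysate protein: {lysate:.1f} mg/mL")
```

Readings outside the range spanned by the standards should be re-measured at a different dilution rather than extrapolated, which is why the text dilutes samples to fit the linear range.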
[0426] Three samples of each of the mitochondrial and whole cell
yeast lysates were assayed for DHAD activity, along with no lysate
controls. Table 38 shows the DHAD activity (U/mg protein) averages
from duplicate cultures comparing strains GEVO1187 (no DHAD
expression) to GEVO1187 transformed with pGV2484 (L. lactis DHAD
expressed from pGV2484). DHAD activity was measured in the whole
cell yeast lysate and the mitochondrial fraction lysate. Expression
of DHAD from pGV2484 resulted in about a 7-fold increase in DHAD
activity in the whole cell yeast lysate. Expression of DHAD from
pGV2484 did not affect DHAD activity localized to the mitochondrial
fraction. Subtracting the background activity in the GEVO1187 whole
cell yeast lysate of 0.27 mU/mg from the activity in the whole cell
yeast lysate of GEVO1187 transformed with pGV2484 of 1.87 mU/mg
shows an increase of 1.60 mU/mg. These data suggest that L. lactis
DHAD activity does not localize to the organellar structures
harvested in the mitochondrial fraction, and is therefore cytosolic
when expressed in a yeast strain.
TABLE-US-00041 TABLE 38 DHAD activity in whole cell yeast lysates and mitochondrial fraction lysates.
Strain | Lysate | Activity [mU/mg]
GEVO1187 | Whole cell | 0.27 .+-. 0.07
GEVO1187 transformed with pGV2484 | Whole cell | 1.87 .+-. 0.14
GEVO1187 | Mitochondrial | 3.76 .+-. 0.01
GEVO1187 transformed with pGV2484 | Mitochondrial | 3.85 .+-. 0.13
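The localization argument rests on comparing the activity gained in each fraction when pGV2484 is present. The sketch below repeats the background subtraction for both fractions using the Table 38 values. Treating the reported spreads as independent standard deviations and combining them in quadrature is an assumption, not something stated in the text.

```python
import math

# Whole-cell and mitochondrial DHAD activities (mU/mg) from Table 38.
whole_ctrl, whole_ctrl_sd = 0.27, 0.07
whole_dhad, whole_dhad_sd = 1.87, 0.14
mito_ctrl, mito_ctrl_sd = 3.76, 0.01
mito_dhad, mito_dhad_sd = 3.85, 0.13

def difference(a, a_sd, b, b_sd):
    """a - b with uncertainties combined in quadrature (independent errors assumed)."""
    return a - b, math.sqrt(a_sd ** 2 + b_sd ** 2)

whole_gain, whole_gain_sd = difference(whole_dhad, whole_dhad_sd, whole_ctrl, whole_ctrl_sd)
mito_gain, mito_gain_sd = difference(mito_dhad, mito_dhad_sd, mito_ctrl, mito_ctrl_sd)
whole_fold = whole_dhad / whole_ctrl

print(f"whole cell: +{whole_gain:.2f} +/- {whole_gain_sd:.2f} mU/mg ({whole_fold:.1f}-fold)")
print(f"mitochondrial: +{mito_gain:.2f} +/- {mito_gain_sd:.2f} mU/mg")
```

Under these assumptions the whole-cell gain is many times its uncertainty, while the mitochondrial difference is smaller than its own uncertainty, which is consistent with the conclusion drawn in the text.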
Example 21
Overexpression of the L. lactis ilvD in Issatchenkia orientalis
[0427] The purpose of this example is to demonstrate cytosolic
activity of L. lactis DHAD in I. orientalis.
[0428] An engineered strain derived from the wild-type I.
orientalis strain ATCC PTA-6658 was further modified to contain
copies of all five isobutanol pathway genes integrated into the
chromosome. First, both alleles of the PDC1 locus were deleted in
series (See e.g. WO/2007/106524, which is herein incorporated by
reference in its entirety). The deletion event also simultaneously
integrated a copy of B. subtilis alsS gene and a copy of the L.
lactis kivD gene which encode SEQ ID NOs: 169 and 173,
respectively. This resulted in a Pdc- strain with two integrated
copies of the B. subtilis alsS gene and two integrated copies of
the L. lactis kivD gene (pdc1.DELTA.:Ll_kivD: Bs_alsS
pdc1.DELTA.:Ll_kivD: Bs_alsS). This strain was further engineered
to delete a single allele of the GPD1 locus (See e.g.
WO/2007/106524). The deletion event also simultaneously integrated
a single copy each of the L. lactis adhA.sup.RE1, the E. coli
ilvC.sup.P2D1-A1, and the L. lactis ilvD genes, which encode the
proteins shown in SEQ ID NOs: 177, 172, and 18, respectively. This
resulted in a Pdc- Gpd+ strain with one integrated copy of the
Ll_adhA.sup.RE1, Ec_ilvC.sup.P2D1-A1, and Ll_ilvD genes
(GPD1/gpd1.DELTA.:[Ll_adhA.sup.RE1: Ec_ilvC.sup.P2D1-A1:
URA3:Ll_ilvD]). This strain is GEVO4306 (Table 39).
[0429] To generate a control strain which does not express the
pathway genes, both alleles of the PDC1 locus were deleted in
series but with no simultaneous integration of heterologous genes.
Next one of the two GPD1 alleles was deleted with no simultaneous
integration of heterologous genes. The resulting control strain is
GEVO4308 (pdc1.DELTA.::loxP/pdc1.DELTA.::loxP
GPD1/gpd1.DELTA.::loxP:URA3:loxP) (Table 39).
TABLE-US-00042 TABLE 39 Genotype of strains disclosed in Example 21.
GEVO Number | Genotype
4306 | pdc1.DELTA.::[Ll_kivD: Bs_alsS pdc1.DELTA.::Ll_kivD: Bs_alsS] GPD1/gpd1.DELTA.::[Ll_adhA.sup.RE1: Ec_ilvC.sup.P2D1-A1: URA3:Ll_ilvD]
4308 | pdc1.DELTA.::loxP/pdc1.DELTA.::loxP GPD1/gpd1.DELTA.::loxP:URA3:loxP
[0430] Over-expression of the L. lactis ilvD gene resulted in an
increase in DHAD activity (U/mg total cell lysate protein). Table
40 shows the DHAD activity (U/mg total cell lysate protein)
averages from technical triplicates comparing the strain expressing
the L. lactis DHAD gene to the strain not expressing the L. lactis
DHAD gene. Expression of the L. lactis ilvD gene, when expressed
with the remainder of the isobutanol pathway, resulted in
isobutanol production as seen in Table 40.
TABLE-US-00043 TABLE 40 DHAD activity in whole cell yeast lysates and isobutanol titer after 72 hr fermentation.
Strain | Activity [mU/mg] | Isobutanol titer g/L
GEVO4306 | 0.041 .+-. 0.009 | 0.56 .+-. 0.01
GEVO4308 | 0.012 .+-. 0.002 | 0.00 .+-. 0.00
Example 22
Cytosolic ALS Homologs that Support Isobutanol Production
[0431] This example demonstrates isobutanol production using
expression of cytosolically localized ALS genes in the presence of
the rest of the isobutanol pathway. The ALS genes were integrated
into the PDC1 locus of S. cerevisiae strain GEVO1187 and isobutanol
production was achieved by expression from plasmid of the other
genes in the isobutanol pathway. Isobutanol production in strains
carrying the ALS genes from T. atroviride (Ta_ALS) and T.
stipitatus (Ts_ALS) was compared to isobutanol production in
strains carrying the ALS gene from B. subtilis. Plasmids described
in this example are listed in Table 41.
TABLE-US-00044 TABLE 41 Plasmids disclosed in Example 22.
Plasmid name | Relevant Genes/Usage | Genotype
pGV1730 | Integration plasmid that will integrate P.sub.CUP1-1:Bs_alsS2 into PDC1 using digestion with NruI for targeting. This was the parent vector for cloning the ALS homologs. | See Table 14.
pGV1773 | Vector with Bacillus subtilis AlsS codon optimized for S. cerevisiae. | P.sub.PDC1:Bs_AlsS1_coSc, P.sub.TDH3:Ll_kivD, P.sub.ADH1:Sc_ADH7_coSc, URA3 5'-end, pUC ORI, kan.sup.R.
pGV1802 | DNA2.0 plasmid carrying the Trichoderma atroviride ALS. | Ta_ALS_coSc in DNA 2.0 vector
pGV1803 | DNA2.0 plasmid carrying the Talaromyces stipitatus ALS. | Ts_ALS_coSc in DNA 2.0 vector
pGV2082 | High copy 2 .mu. plasmid with 4 isobutanol pathway genes without an ALS gene. | Ec_ilvC_coSc.sup.Q110V, Ll_ilvD_coSc, Ll_kivD2_coEc, and Dm_ADH, 2 .mu. ori, bla, G418R.
pGV2114 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Bacillus subtilis AlsS gene codon optimized for S. cerevisiae. | See Table 14.
pGV2117 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Trichoderma atroviride ALS gene codon optimized for S. cerevisiae. | See Table 14.
pGV2118 | Integration plasmid that will integrate into PDC1 using digestion with NruI for targeting. It carries the Talaromyces stipitatus ALS gene codon optimized for S. cerevisiae. | See Table 14.
[0432] Strains with integrated ALS genes expressed from the CUP1
promoter were transformed with pGV2082, which carries the other 4
isobutanol pathway genes: Ec_ilvC_coSc.sup.Q110V (SEQ ID NO: 98),
Ll_ilvD_coSc (SEQ ID NO: 155), Ll_kivd2_coEc (SEQ ID NO: 99), and
Dm_ADH (SEQ ID NO: 116).
[0433] GEVO2618, GEVO2621, and GEVO2622 (see Table 13) were each
transformed with pGV2082. Control strains GEVO2280 (B. subtilis
alsS2) (Table 13) and GEVO1187 (no ALS) (Table 13) were also
transformed with pGV2082.
[0434] Fermentations of the transformed strains GEVO1187, GEVO2280,
GEVO2618, GEVO2621, and GEVO2622 were performed. Strains encoding
the ALS from T. atroviride (SEQ ID NO: 71) and T. stipitatus (SEQ
ID NO: 72) produced more isobutanol than the strain containing the
B. subtilis alsS2. The strain containing Bs_AlsS1_coSc produced the
most isobutanol. Table 42 shows the final OD, glucose consumption, and
isobutanol titer for each of the strains. The integration of the
cytosolic genes Ta_ALS_coSc and Ts_ALS_coSc led to production of
isobutanol that was in each case 6-fold above that of a strain
without an integrated ALS gene, demonstrating that these strains
are producing isobutanol using a cytosolic pathway.
TABLE-US-00045 TABLE 42 Results of fermentations with cytosolic ALS homologs at 72 hrs.
Strain | OD.sub.600 | Glucose consumed g/L | Isobutanol produced g/L
GEVO1187 | 10.9 .+-. 0.3 | 233 .+-. 36 | 0.3 .+-. 0.0
GEVO2280 | 9.9 .+-. 0.3 | 274 .+-. 26 | 1.3 .+-. 0.11
GEVO2618 | 9.4 .+-. 0.2 | 138 .+-. 9 | 2.6 .+-. 0.09
GEVO2621 | 9.9 .+-. 0.3 | 161 .+-. 52 | 1.9 .+-. 0.18
GEVO2622 | 10.8 .+-. 0.6 | 182 .+-. 47 | 1.8 .+-. 0.15
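The fold increases quoted for Table 42 can be reproduced from the mean titers, and the glucose figures also allow a rough yield comparison. The sketch below computes both from the table's mean values. The yield metric (g isobutanol per g glucose consumed) is an added illustration rather than a quantity reported in the text, and the reported spreads are ignored.

```python
# Mean values from Table 42: (glucose consumed g/L, isobutanol produced g/L).
results = {
    "GEVO1187": (233.0, 0.3),   # no integrated ALS gene (control)
    "GEVO2280": (274.0, 1.3),   # B. subtilis alsS2
    "GEVO2618": (138.0, 2.6),
    "GEVO2621": (161.0, 1.9),
    "GEVO2622": (182.0, 1.8),
}

control_titer = results["GEVO1187"][1]
fold_over_control = {s: titer / control_titer for s, (_, titer) in results.items()}
# Yield in g isobutanol per g glucose consumed (illustrative metric).
yield_g_per_g = {s: titer / glucose for s, (glucose, titer) in results.items()}

for strain in results:
    print(f"{strain}: {fold_over_control[strain]:.1f}-fold, "
          f"{yield_g_per_g[strain]:.4f} g/g")
```

On the mean values, GEVO2621 and GEVO2622 come out at roughly 6-fold over the no-ALS control, in line with the figure quoted in the text.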
[0435] The foregoing detailed description has been given for
clearness of understanding only and no unnecessary limitations
should be understood therefrom as modifications will be obvious to
those skilled in the art.
[0436] While the invention has been described in connection with
specific embodiments thereof, it will be understood that it is
capable of further modifications and this application is intended
to cover any variations, uses, or adaptations of the invention
following, in general, the principles of the invention and
including such departures from the present disclosure as come
within known or customary practice within the art to which the
invention pertains and as may be applied to the essential features
hereinbefore set forth and as follows in the scope of the appended
claims.
[0437] The disclosures, including the claims, figures and/or
drawings, of each and every patent, patent application, and
publication cited herein are hereby incorporated herein by
reference in their entireties.
Sequence CWU 1
1
1771491PRTEscherichia coli 1Met Ala Asn Tyr Phe Asn Thr Leu Asn Leu
Arg Gln Gln Leu Ala Gln 1 5 10 15 Leu Gly Lys Cys Arg Phe Met Gly
Arg Asp Glu Phe Ala Asp Gly Ala 20 25 30 Ser Tyr Leu Gln Gly Lys
Lys Val Val Ile Val Gly Cys Gly Ala Gln 35 40 45 Gly Leu Asn Gln
Gly Leu Asn Met Arg Asp Ser Gly Leu Asp Ile Ser 50 55 60 Tyr Ala
Leu Arg Lys Glu Ala Ile Ala Glu Lys Arg Ala Ser Trp Arg 65 70 75 80
Lys Ala Thr Glu Asn Gly Phe Lys Val Gly Thr Tyr Glu Glu Leu Ile 85
90 95 Pro Gln Ala Asp Leu Val Ile Asn Leu Thr Pro Asp Lys Gln His
Ser 100 105 110 Asp Val Val Arg Thr Val Gln Pro Leu Met Lys Asp Gly
Ala Ala Leu 115 120 125 Gly Tyr Ser His Gly Phe Asn Ile Val Glu Val
Gly Glu Gln Ile Arg 130 135 140 Lys Asp Ile Thr Val Val Met Val Ala
Pro Lys Cys Pro Gly Thr Glu 145 150 155 160 Val Arg Glu Glu Tyr Lys
Arg Gly Phe Gly Val Pro Thr Leu Ile Ala 165 170 175 Val His Pro Glu
Asn Asp Pro Lys Gly Glu Gly Met Ala Ile Ala Lys 180 185 190 Ala Trp
Ala Ala Ala Thr Gly Gly His Arg Ala Gly Val Leu Glu Ser 195 200 205
Ser Phe Val Ala Glu Val Lys Ser Asp Leu Met Gly Glu Gln Thr Ile 210
215 220 Leu Cys Gly Met Leu Gln Ala Gly Ser Leu Leu Cys Phe Asp Lys
Leu 225 230 235 240 Val Glu Glu Gly Thr Asp Pro Ala Tyr Ala Glu Lys
Leu Ile Gln Phe 245 250 255 Gly Trp Glu Thr Ile Thr Glu Ala Leu Lys
Gln Gly Gly Ile Thr Leu 260 265 270 Met Met Asp Arg Leu Ser Asn Pro
Ala Lys Leu Arg Ala Tyr Ala Leu 275 280 285 Ser Glu Gln Leu Lys Glu
Ile Met Ala Pro Leu Phe Gln Lys His Met 290 295 300 Asp Asp Ile Ile
Ser Gly Glu Phe Ser Ser Gly Met Met Ala Asp Trp 305 310 315 320 Ala
Asn Asp Asp Lys Lys Leu Leu Thr Trp Arg Glu Glu Thr Gly Lys 325 330
335 Thr Ala Phe Glu Thr Ala Pro Gln Tyr Glu Gly Lys Ile Gly Glu Gln
340 345 350 Glu Tyr Phe Asp Lys Gly Val Leu Met Ile Ala Met Val Lys
Ala Gly 355 360 365 Val Glu Leu Ala Phe Glu Thr Met Val Asp Ser Gly
Ile Ile Glu Glu 370 375 380 Ser Ala Tyr Tyr Glu Ser Leu His Glu Leu
Pro Leu Ile Ala Asn Thr 385 390 395 400 Ile Ala Arg Lys Arg Leu Tyr
Glu Met Asn Val Val Ile Ser Asp Thr 405 410 415 Ala Glu Tyr Gly Asn
Tyr Leu Phe Ser Tyr Ala Cys Val Pro Leu Leu 420 425 430 Lys Pro Phe
Met Ala Glu Leu Gln Pro Gly Asp Leu Gly Lys Ala Ile 435 440 445 Pro
Glu Gly Ala Val Asp Asn Gly Gln Leu Arg Asp Val Asn Glu Ala 450 455
460 Ile Arg Ser His Ala Ile Glu Gln Val Gly Lys Lys Leu Arg Gly Tyr
465 470 475 480 Met Thr Asp Met Lys Arg Ile Ala Val Ala Gly 485 490
2395PRTSaccharomyces cerevisiae 2Met Leu Arg Thr Gln Ala Ala Arg
Leu Ile Cys Asn Ser Arg Val Ile 1 5 10 15 Thr Ala Lys Arg Thr Phe
Ala Leu Ala Thr Arg Ala Ala Ala Tyr Ser 20 25 30 Arg Pro Ala Ala
Arg Phe Val Lys Pro Met Ile Thr Thr Arg Gly Leu 35 40 45 Lys Gln
Ile Asn Phe Gly Gly Thr Val Glu Thr Val Tyr Glu Arg Ala 50 55 60
Asp Trp Pro Arg Glu Lys Leu Leu Asp Tyr Phe Lys Asn Asp Thr Phe 65
70 75 80 Ala Leu Ile Gly Tyr Gly Ser Gln Gly Tyr Gly Gln Gly Leu
Asn Leu 85 90 95 Arg Asp Asn Gly Leu Asn Val Ile Ile Gly Val Arg
Lys Asp Gly Ala 100 105 110 Ser Trp Lys Ala Ala Ile Glu Asp Gly Trp
Val Pro Gly Lys Asn Leu 115 120 125 Phe Thr Val Glu Asp Ala Ile Lys
Arg Gly Ser Tyr Val Met Asn Leu 130 135 140 Leu Ser Asp Ala Ala Gln
Ser Glu Thr Trp Pro Ala Ile Lys Pro Leu 145 150 155 160 Leu Thr Lys
Gly Lys Thr Leu Tyr Phe Ser His Gly Phe Ser Pro Val 165 170 175 Phe
Lys Asp Leu Thr His Val Glu Pro Pro Lys Asp Leu Asp Val Ile 180 185
190 Leu Val Ala Pro Lys Gly Ser Gly Arg Thr Val Arg Ser Leu Phe Lys
195 200 205 Glu Gly Arg Gly Ile Asn Ser Ser Tyr Ala Val Trp Asn Asp
Val Thr 210 215 220 Gly Lys Ala His Glu Lys Ala Gln Ala Leu Ala Val
Ala Ile Gly Ser 225 230 235 240 Gly Tyr Val Tyr Gln Thr Thr Phe Glu
Arg Glu Val Asn Ser Asp Leu 245 250 255 Tyr Gly Glu Arg Gly Cys Leu
Met Gly Gly Ile His Gly Met Phe Leu 260 265 270 Ala Gln Tyr Asp Val
Leu Arg Glu Asn Gly His Ser Pro Ser Glu Ala 275 280 285 Phe Asn Glu
Thr Val Glu Glu Ala Thr Gln Ser Leu Tyr Pro Leu Ile 290 295 300 Gly
Lys Tyr Gly Met Asp Tyr Met Tyr Asp Ala Cys Ser Thr Thr Ala 305 310
315 320 Arg Arg Gly Ala Leu Asp Trp Tyr Pro Ile Phe Lys Asn Ala Leu
Lys 325 330 335 Pro Val Phe Gln Asp Leu Tyr Glu Ser Thr Lys Asn Gly
Thr Glu Thr 340 345 350 Lys Arg Ser Leu Glu Phe Asn Ser Gln Pro Asp
Tyr Arg Glu Lys Leu 355 360 365 Glu Lys Glu Leu Asp Thr Ile Arg Asn
Met Glu Ile Trp Lys Val Gly 370 375 380 Lys Glu Val Arg Lys Leu Arg
Pro Glu Asn Gln 385 390 395 3578PRTOryza sativa 3Met Ala Ala Ser
Thr Thr Leu Ala Leu Ser His Pro Lys Thr Leu Ala 1 5 10 15 Ala Ala
Ala Ala Ala Ala Pro Lys Ala Pro Thr Ala Pro Ala Ala Val 20 25 30
Ser Phe Pro Val Ser His Ala Ala Cys Ala Pro Leu Ala Ala Arg Arg 35
40 45 Arg Ala Val Thr Ala Met Val Ala Ala Pro Pro Ala Val Gly Ala
Ala 50 55 60 Met Pro Ser Leu Asp Phe Asp Thr Ser Val Phe Asn Lys
Glu Lys Val 65 70 75 80 Ser Leu Ala Gly His Glu Glu Tyr Ile Val Arg
Gly Gly Arg Asn Leu 85 90 95 Phe Pro Leu Leu Pro Glu Ala Phe Lys
Gly Ile Lys Gln Ile Gly Val 100 105 110 Ile Gly Trp Gly Ser Gln Gly
Pro Ala Gln Ala Gln Asn Leu Arg Asp 115 120 125 Ser Leu Ala Glu Ala
Lys Ser Asp Ile Val Val Lys Ile Gly Leu Arg 130 135 140 Lys Gly Ser
Lys Ser Phe Asp Glu Ala Arg Ala Ala Gly Phe Thr Glu 145 150 155 160
Glu Ser Gly Thr Leu Gly Asp Ile Trp Glu Thr Val Ser Gly Ser Asp 165
170 175 Leu Val Leu Leu Leu Ile Ser Asp Ala Ala Gln Ala Asp Asn Tyr
Glu 180 185 190 Lys Ile Phe Ser His Met Lys Pro Asn Ser Ile Leu Gly
Leu Ser His 195 200 205 Gly Phe Leu Leu Gly His Leu Gln Ser Ala Gly
Leu Asp Phe Pro Lys 210 215 220 Asn Ile Ser Val Ile Ala Val Cys Pro
Lys Gly Met Gly Pro Ser Val 225 230 235 240 Arg Arg Leu Tyr Val Gln
Gly Lys Glu Ile Asn Gly Ala Gly Ile Asn 245 250 255 Ser Ser Phe Ala
Val His Gln Asp Val Asp Gly Arg Ala Thr Asp Val 260 265 270 Ala Leu
Gly Trp Ser Val Ala Leu Gly Ser Pro Phe Thr Phe Ala Thr 275 280 285
Thr Leu Glu Gln Glu Tyr Lys Ser Asp Ile Phe Gly Glu Arg Gly Ile 290
295 300 Leu Leu Gly Ala Val His Gly Ile Val Glu Ala Leu Phe Arg Arg
Tyr 305 310 315 320 Thr Glu Gln Gly Met Asp Glu Glu Met Ala Tyr Lys
Asn Thr Val Glu 325 330 335 Gly Ile Thr Gly Ile Ile Ser Lys Thr Ile
Ser Lys Lys Gly Met Leu 340 345 350 Glu Val Tyr Asn Ser Leu Thr Glu
Glu Gly Lys Lys Glu Phe Asn Lys 355 360 365 Ala Tyr Ser Ala Ser Phe
Tyr Pro Cys Met Asp Ile Leu Tyr Glu Cys 370 375 380 Tyr Glu Asp Val
Ala Ser Gly Ser Glu Ile Arg Ser Val Val Leu Ala 385 390 395 400 Gly
Arg Arg Phe Tyr Glu Lys Glu Gly Leu Pro Ala Phe Pro Met Gly 405 410
415 Asn Ile Asp Gln Thr Arg Met Trp Lys Val Gly Glu Lys Val Arg Ser
420 425 430 Thr Arg Pro Glu Asn Asp Leu Gly Pro Leu His Pro Phe Thr
Ala Gly 435 440 445 Val Tyr Val Ala Leu Met Met Ala Gln Ile Glu Val
Leu Arg Lys Lys 450 455 460 Gly His Ser Tyr Ser Glu Ile Ile Asn Glu
Ser Val Ile Glu Ser Val 465 470 475 480 Asp Ser Leu Asn Pro Phe Met
His Ala Arg Gly Val Ala Phe Met Val 485 490 495 Asp Asn Cys Ser Thr
Thr Ala Arg Leu Gly Ser Arg Lys Trp Ala Pro 500 505 510 Arg Phe Asp
Tyr Ile Leu Thr Gln Gln Ala Phe Val Thr Val Asp Lys 515 520 525 Asp
Ala Pro Ile Asn Gln Asp Leu Ile Ser Asn Phe Met Ser Asp Pro 530 535
540 Val His Gly Ala Ile Glu Val Cys Ala Glu Leu Arg Pro Thr Val Asp
545 550 555 560 Ile Ser Val Pro Ala Asn Ala Asp Phe Val Arg Pro Glu
Leu Arg Gln 565 570 575 Ser Ser 4329PRTMethanococcus maripaludis
4Met Lys Val Phe Tyr Asp Ser Asp Phe Lys Leu Asp Ala Leu Lys Glu 1
5 10 15 Lys Thr Ile Ala Val Ile Gly Tyr Gly Ser Gln Gly Arg Ala Gln
Ser 20 25 30 Leu Asn Met Lys Asp Ser Gly Leu Asn Val Val Val Gly
Leu Arg Lys 35 40 45 Asn Gly Ala Ser Trp Glu Asn Ala Lys Ala Asp
Gly His Asn Val Met 50 55 60 Thr Ile Glu Glu Ala Ala Glu Lys Ala
Asp Ile Ile His Ile Leu Ile 65 70 75 80 Pro Asp Glu Leu Gln Ala Glu
Val Tyr Glu Ser Gln Ile Lys Pro Tyr 85 90 95 Leu Lys Glu Gly Lys
Thr Leu Ser Phe Ser His Gly Phe Asn Ile His 100 105 110 Tyr Gly Phe
Ile Val Pro Pro Lys Gly Val Asn Val Val Leu Val Ala 115 120 125 Pro
Lys Ser Pro Gly Lys Met Val Arg Arg Thr Tyr Glu Glu Gly Phe 130 135
140 Gly Val Pro Gly Leu Ile Cys Ile Glu Ile Asp Ala Thr Asn Asn Ala
145 150 155 160 Phe Asp Ile Val Ser Ala Met Ala Lys Gly Ile Gly Leu
Ser Arg Ala 165 170 175 Gly Val Ile Gln Thr Thr Phe Lys Glu Glu Thr
Glu Thr Asp Leu Phe 180 185 190 Gly Glu Gln Ala Val Leu Cys Gly Gly
Val Thr Glu Leu Ile Lys Ala 195 200 205 Gly Phe Glu Thr Leu Val Glu
Ala Gly Tyr Ala Pro Glu Met Ala Tyr 210 215 220 Phe Glu Thr Cys His
Glu Leu Lys Leu Ile Val Asp Leu Ile Tyr Gln 225 230 235 240 Lys Gly
Phe Lys Asn Met Trp Asn Asp Val Ser Asn Thr Ala Glu Tyr 245 250 255
Gly Gly Leu Thr Arg Arg Ser Arg Ile Val Thr Ala Asp Ser Lys Ala 260
265 270 Ala Met Lys Glu Ile Leu Lys Glu Ile Gln Asp Gly Arg Phe Thr
Lys 275 280 285 Glu Phe Val Leu Glu Lys Gln Val Asn His Ala His Leu
Lys Ala Met 290 295 300 Arg Arg Ile Glu Gly Asp Leu Gln Ile Glu Glu
Val Gly Ala Lys Leu 305 310 315 320 Arg Lys Met Cys Gly Leu Glu Lys
Glu 325 5339PRTAcidiphilium cryptum 5Met Arg Val Tyr Tyr Asp Ser
Asp Ala Asp Val Asn Leu Ile Lys Ala 1 5 10 15 Lys Lys Val Ala Val
Val Gly Tyr Gly Ser Gln Gly His Ala His Ala 20 25 30 Leu Asn Leu
Lys Glu Ser Gly Val Lys Glu Leu Val Val Ala Leu Arg 35 40 45 Lys
Gly Ser Ala Ala Val Ala Lys Ala Glu Ala Ala Gly Leu Arg Val 50 55
60 Met Thr Pro Glu Glu Ala Ala Ala Trp Ala Asp Val Val Met Ile Leu
65 70 75 80 Thr Pro Asp Glu Gly Gln Gly Asp Leu Tyr Arg Asp Ser Leu
Ala Ala 85 90 95 Asn Leu Lys Pro Gly Ala Ala Ile Ala Phe Ala His
Gly Leu Asn Ile 100 105 110 His Phe Asn Leu Ile Glu Pro Arg Ala Asp
Ile Asp Val Phe Met Ile 115 120 125 Ala Pro Lys Gly Pro Gly His Thr
Val Arg Ser Glu Tyr Gln Arg Gly 130 135 140 Gly Gly Val Pro Cys Leu
Val Ala Val Ala Gln Asn Pro Ser Gly Asn 145 150 155 160 Ala Leu Asp
Ile Ala Leu Ser Tyr Ala Ser Ala Ile Gly Gly Gly Arg 165 170 175 Ala
Gly Ile Ile Glu Thr Thr Phe Lys Glu Glu Cys Glu Thr Asp Leu 180 185
190 Phe Gly Glu Gln Thr Val Leu Cys Gly Gly Leu Val Glu Leu Ile Lys
195 200 205 Ala Gly Phe Glu Thr Leu Val Glu Ala Gly Tyr Ala Pro Glu
Met Ala 210 215 220 Tyr Phe Glu Cys Leu His Glu Val Lys Leu Ile Val
Asp Leu Ile Tyr 225 230 235 240 Glu Gly Gly Ile Ala Asn Met Asn Tyr
Ser Ile Ser Asn Thr Ala Glu 245 250 255 Tyr Gly Glu Tyr Val Thr Gly
Pro Arg Met Ile Thr Pro Glu Thr Lys 260 265 270 Ala Glu Met Lys Arg
Val Leu Asp Asp Ile Gln Lys Gly Arg Phe Thr 275 280 285 Arg Asp Trp
Met Leu Glu Asn Lys Val Asn Gln Thr Asn Phe Lys Ala 290 295 300 Met
Arg Arg Ala Asn Ala Ala His Pro Ile Glu Glu Val Gly Glu Lys 305 310
315 320 Leu Arg Ala Met Met Pro Trp Ile Lys Lys Gly Ala Leu Val Asp
Lys 325 330 335 Thr Arg Asn
6 555 PRT Chlamydomonas reinhardtii
6 Met
Gln Leu Leu Asn Ser Lys Ser Arg Val Leu Ser Gly Ser Arg Gln 1 5 10
15 Gln Ala Ala Ala Lys Ala Val Arg Val Ala Pro Ser Gly Arg Arg Ser
20 25 30 Ala Val Arg Val Ser Ala Ala Val His Leu Asp Phe Asn Thr
Lys Val 35 40 45 Phe Gln Lys Glu His Ala Lys Phe Gly Pro Thr Glu
Glu Tyr Ile Val 50 55 60 Arg Gly Gly Arg Asp Lys Tyr Pro Leu Leu
Lys Glu Ala Phe Lys Gly 65 70 75 80 Ile Lys Lys Val Ser Val Ile Gly
Trp Gly Ser Gln Ala Pro Ala Gln 85 90 95 Ala Gln Asn Leu Arg Asp
Ser Ile Ala Glu Ala Gly Met Asp Ile Lys 100 105 110 Val Ala Ile Gly
Leu Arg Pro Asp Ser Pro Ser Trp Ala Glu Ala Glu 115 120 125 Ala Cys
Gly Phe Ser Lys Thr Asp Gly Thr Leu Gly Glu Val Phe Glu 130 135 140
Gln Ile Ser Ser Ser Asp Phe Val Ile Leu Leu Ile Ser Asp Ala Ala
145 150 155 160 Gln Ala Lys Leu Tyr Pro Arg Ile Leu Ala Ala Met Lys Pro
Gly Ala 165 170 175 Thr Leu Gly Leu Ser His Gly Phe Leu Leu Gly Val
Met Arg Asn Asp 180 185 190 Gly Val Asp Phe Arg Lys Asp Ile Asn Val
Val Leu Val Ala Pro Lys 195 200 205 Gly Met Gly Pro Ser Val Arg Arg
Leu Tyr Glu Gln Gly Lys Ser Val 210 215 220 Asn Gly Ala Gly Ile Asn
Cys Ser Phe Ala Ile Gln Gln Asp Ala Thr 225 230 235 240 Gly Gln Ala
Ala Asp Ile Ala Ile Gly Trp Ala Ile Gly Val Gly Ala 245 250 255 Pro
Phe Ala Phe Pro Thr Thr Leu Glu Ser Glu Tyr Lys Ser Asp Ile 260 265
270 Tyr Gly Glu Arg Cys Val Leu Leu Gly Ala Val His Gly Ile Val Glu
275 280 285 Ala Leu Phe Arg Arg Tyr Thr Arg Gln Gly Met Ser Asp Glu
Glu Ala 290 295 300 Phe Lys Gln Ser Val Glu Ser Ile Thr Gly Pro Ile
Ser Arg Thr Ile 305 310 315 320 Ser Thr Lys Gly Met Leu Ser Val Tyr
Asn Ser Phe Asn Glu Ala Asp 325 330 335 Lys Lys Ile Phe Glu Gln Ala
Tyr Ser Ala Ser Tyr Lys Pro Ala Leu 340 345 350 Asp Ile Cys Phe Glu
Ile Tyr Glu Asp Val Ala Ser Gly Asn Glu Ile 355 360 365 Lys Ser Val
Val Gln Ala Val Gln Arg Phe Asp Arg Phe Pro Met Gly 370 375 380 Lys
Ile Asp Gln Thr Tyr Met Trp Lys Val Gly Gln Lys Val Arg Ala 385 390
395 400 Glu Arg Asp Glu Ser Lys Ile Pro Val Asn Pro Phe Thr Ala Gly
Val 405 410 415 Tyr Val Ala Val Met Met Ala Thr Val Glu Val Leu Arg
Glu Lys Gly 420 425 430 His Pro Phe Ser Glu Ile Cys Asn Glu Ser Ile
Ile Glu Ala Val Asp 435 440 445 Ser Leu Asn Pro Tyr Met His Ala Arg
Gly Val Ala Phe Met Val Asp 450 455 460 Asn Cys Ser Tyr Thr Ala Arg
Leu Gly Ser Arg Lys Trp Ala Pro Arg 465 470 475 480 Phe Asp Tyr Ile
Ile Glu Gln Gln Ala Phe Val Asp Ile Asp Ser Gly 485 490 495 Lys Ala
Ala Asp Lys Glu Val Met Ala Glu Phe Leu Ala His Pro Val 500 505 510
His Ser Ala Leu Ala Thr Cys Ser Ser Met Arg Pro Ser Val Asp Ile 515
520 525 Ser Val Gly Gly Glu Asn Ser Ser Val Gly Val Gly Ala Gly Ala
Ala 530 535 540 Arg Thr Glu Phe Arg Ser Thr Ala Ala Lys Val 545 550
555
7 329 PRT Picrophilus torridus
7 Met Glu Lys Val Tyr Thr Glu Asn
Asp Leu Lys Glu Asn Leu Met Arg 1 5 10 15 Asn Lys Lys Ile Ala Val
Leu Gly Tyr Gly Ser Gln Gly Arg Ala Trp 20 25 30 Ala Leu Asn Met
Arg Asp Ser Gly Leu Asn Val Thr Val Gly Leu Glu 35 40 45 Arg Gln
Gly Lys Ser Trp Glu Lys Ala Val Ala Asp Gly Phe Lys Pro 50 55 60
Leu Lys Ser Arg Asp Ala Val Arg Asp Ala Asp Ala Val Ile Phe Leu 65
70 75 80 Val Pro Asp Met Ala Gln Arg Glu Leu Tyr Lys Asn Ile Met
Asn Asp 85 90 95 Ile Lys Asp Asp Ala Asp Ile Val Phe Ala His Gly
Phe Asn Val His 100 105 110 Tyr Gly Leu Ile Asn Pro Lys Asn His Asp
Val Tyr Met Val Ala Pro 115 120 125 Lys Ala Pro Gly Pro Ser Val Arg
Glu Phe Tyr Glu Arg Gly Gly Gly 130 135 140 Val Pro Val Leu Ile Ala
Val Ala Asn Asp Val Ser Gly Arg Ser Lys 145 150 155 160 Glu Lys Ala
Leu Ser Ile Ala Tyr Ser Leu Gly Ala Leu Arg Ala Gly 165 170 175 Ala
Ile Glu Thr Thr Phe Lys Glu Glu Thr Glu Thr Asp Leu Ile Gly 180 185
190 Glu Gln Leu Asp Leu Val Gly Gly Ile Thr Glu Leu Leu Arg Ser Thr
195 200 205 Phe Asn Ile Met Val Glu Met Gly Tyr Lys Pro Glu Met Ala
Tyr Phe 210 215 220 Glu Ala Ile Asn Glu Met Lys Leu Ile Val Asp Gln
Val Phe Glu Lys 225 230 235 240 Gly Ile Ser Gly Met Leu Arg Ala Val
Ser Asp Thr Ala Lys Tyr Gly 245 250 255 Gly Leu Thr Thr Gly Lys Tyr
Ile Ile Asn Asp Asp Val Arg Lys Arg 260 265 270 Met Arg Glu Arg Ala
Glu Tyr Ile Val Ser Gly Lys Phe Ala Glu Glu 275 280 285 Trp Ile Glu
Glu Tyr Gly Glu Gly Ser Lys Asn Leu Glu Ser Met Met 290 295 300 Leu
Asp Ile Asp Asn Ser Leu Glu Glu Gln Val Gly Lys Gln Leu Arg 305 310
315 320 Glu Ile Val Leu Arg Gly Arg Pro Lys 325
8 339 PRT Zymomonas mobilis
8 Met Lys Val Tyr Tyr Asp Ser Asp Ala Asp Leu Gly Leu Ile
Lys Ser 1 5 10 15 Lys Lys Ile Ala Ile Leu Gly Tyr Gly Ser Gln Gly
His Ala His Ala 20 25 30 Gln Asn Leu Arg Asp Ser Gly Val Ala Glu
Val Ala Ile Ala Leu Arg 35 40 45 Pro Asp Ser Ala Ser Val Lys Lys
Ala Gln Asp Ala Gly Phe Lys Val 50 55 60 Leu Thr Asn Ala Glu Ala
Ala Lys Trp Ala Asp Ile Leu Met Ile Leu 65 70 75 80 Ala Pro Asp Glu
His Gln Ala Ala Ile Tyr Ala Glu Asp Leu Lys Asp 85 90 95 Asn Leu
Arg Pro Gly Ser Ala Ile Ala Phe Ala His Gly Leu Asn Ile 100 105 110
His Phe Gly Leu Ile Glu Pro Arg Lys Asp Ile Asp Val Phe Met Ile 115
120 125 Ala Pro Lys Gly Pro Gly His Thr Val Arg Ser Glu Tyr Val Arg
Gly 130 135 140 Gly Gly Val Pro Cys Leu Val Ala Val Asp Gln Asp Ala
Ser Gly Asn 145 150 155 160 Ala His Asp Ile Ala Leu Ala Tyr Ala Ser
Gly Ile Gly Gly Gly Arg 165 170 175 Ser Gly Val Ile Glu Thr Thr Phe
Arg Glu Glu Val Glu Thr Asp Leu 180 185 190 Phe Gly Glu Gln Ala Val
Leu Cys Gly Gly Leu Thr Ala Leu Ile Thr 195 200 205 Ala Gly Phe Glu
Thr Leu Thr Glu Ala Gly Tyr Ala Pro Glu Met Ala 210 215 220 Phe Phe
Glu Cys Met His Glu Met Lys Leu Ile Val Asp Leu Ile Tyr 225 230 235
240 Glu Ala Gly Ile Ala Asn Met Arg Tyr Ser Ile Ser Asn Thr Ala Glu
245 250 255 Tyr Gly Asp Ile Val Ser Gly Pro Arg Val Ile Asn Glu Glu
Ser Lys 260 265 270 Lys Ala Met Lys Ala Ile Leu Asp Asp Ile Gln Ser
Gly Arg Phe Val 275 280 285 Ser Lys Phe Val Leu Asp Asn Arg Ala Gly
Gln Pro Glu Leu Lys Ala 290 295 300 Ala Arg Lys Arg Met Ala Ala His
Pro Ile Glu Gln Val Gly Ala Arg 305 310 315 320 Leu Arg Lys Met Met
Pro Trp Ile Ala Ser Asn Lys Leu Val Asp Lys 325 330 335 Ala Arg Asn
9 10 PRT Artificial Sequence c-myc epitope tag
9 Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu 1 5 10
10 554 PRT Thermotoga petrophila
10 Met Arg Ser
Asp Val Ile Lys Lys Gly Leu Glu Arg Ala Pro His Arg 1 5 10 15 Ser
Leu Leu Lys Ala Leu Gly Ile Thr Asp Asp Glu Met Arg Arg Pro 20 25
30 Phe Ile Gly Ile Val Ser Ser Trp Asn Glu Ile Ile Pro Gly His Val
35 40 45 His Leu Asp Lys Val Val Glu Ala Val Lys Ala Gly Val Arg
Met Ala 50 55 60 Gly Gly Val Pro Phe Val Phe Pro Thr Ile Gly Ile
Cys Asp Gly Ile 65 70 75 80 Ala Met Asp His Arg Gly Met Lys Phe Ser
Leu Pro Ser Arg Glu Leu 85 90 95 Ile Ala Asp Ser Ile Glu Ile Val
Ala Ser Gly Phe Pro Phe Asp Gly 100 105 110 Leu Val Phe Val Pro Asn
Cys Asp Lys Ile Thr Pro Gly Met Met Met 115 120 125 Ala Met Gly Arg
Leu Asn Ile Pro Ser Val Leu Ile Ser Gly Gly Pro 130 135 140 Met Leu
Ala Gly Arg Tyr Asn Gly Arg Asp Ile Asp Leu Ile Thr Val 145 150 155
160 Phe Glu Ala Val Gly Gly Tyr Lys Val Gly Lys Val Asp Glu Glu Thr
165 170 175 Leu Lys Ala Ile Glu Asp Leu Ala Cys Pro Gly Ala Gly Ser
Cys Ala 180 185 190 Gly Leu Phe Thr Ala Asn Thr Met Asn Ser Leu Ala
Glu Ala Leu Gly 195 200 205 Ile Ala Pro Arg Gly Asn Gly Thr Val Pro
Ala Val His Ala Lys Arg 210 215 220 Leu Arg Met Ala Lys Glu Ala Gly
Met Leu Val Val Glu Leu Val Lys 225 230 235 240 Arg Asp Val Lys Pro
Arg Asp Ile Val Thr Leu Asp Ser Phe Met Asn 245 250 255 Ala Val Met
Val Asp Leu Ala Thr Gly Gly Ser Thr Asn Thr Val Leu 260 265 270 His
Leu Lys Ala Ile Ala Glu Ser Phe Gly Ile Asp Phe Asp Ile Lys 275 280
285 Leu Phe Asp Glu Leu Ser Arg Lys Ile Pro His Ile Cys Asn Ile Ser
290 295 300 Pro Val Gly Pro Tyr His Ile Gln Asp Leu Asp Asp Ala Gly
Gly Ile 305 310 315 320 Tyr Ala Val Met Lys Arg Leu Gln Glu Asn Gly
Leu Leu Lys Glu Asp 325 330 335 Ala Met Thr Ile Tyr Leu Arg Lys Ile
Gly Asp Leu Val Arg Glu Ala 340 345 350 Lys Ile Leu Asn Glu Asp Val
Ile Arg Pro Phe Asp Asn Pro Tyr His 355 360 365 Lys Glu Gly Gly Leu
Gly Ile Leu Phe Gly Asn Leu Ala Pro Glu Gly 370 375 380 Ala Val Ala
Lys Leu Ser Gly Val Pro Glu Lys Met Met His His Val 385 390 395 400
Gly Pro Ala Val Val Phe Glu Asp Gly Glu Glu Ala Thr Lys Ala Ile 405
410 415 Leu Ser Gly Lys Ile Lys Lys Gly Asp Val Val Val Ile Arg Tyr
Glu 420 425 430 Gly Pro Lys Gly Gly Pro Gly Met Arg Glu Met Leu Ser
Pro Thr Ser 435 440 445 Ala Ile Val Gly Met Gly Leu Ala Glu Asp Val
Ala Leu Ile Thr Asp 450 455 460 Gly Arg Phe Ser Gly Gly Ser His Gly
Ala Val Ile Gly His Val Ser 465 470 475 480 Pro Glu Ala Ala Glu Gly
Gly Pro Ile Gly Ile Val Lys Asp Gly Asp 485 490 495 Leu Ile Glu Ile
Asp Phe Glu Lys Arg Thr Leu Asn Leu Leu Ile Ser 500 505 510 Asp Glu
Glu Phe Glu Arg Arg Met Lys Glu Phe Thr Pro Leu Val Lys 515 520 525
Glu Val Asp Ser Asp Tyr Leu Arg Arg Tyr Ala Phe Phe Val Gln Ser 530
535 540 Ala Ser Lys Gly Ala Ile Phe Arg Lys Pro 545 550
11 561 PRT Victivallis vadensis
11 Met Arg Ser Asp Thr Met Lys Lys Gly
Pro Glu Arg Ala Pro His Arg 1 5 10 15 Gly Leu Met Arg Ala Thr Gly
Leu Lys Lys Glu Asp Phe Asp Lys Pro 20 25 30 Phe Ile Gly Val Cys
Asn Ser Tyr Thr Asn Ile Val Pro Gly His Cys 35 40 45 His Leu Lys
Lys Val Gly Glu Ile Ile Cys Asp Ala Ile Arg Glu Ala 50 55 60 Gly
Gly Val Pro Tyr Glu Phe Asn Thr Ile Ala Val Cys Asp Gly Ile 65 70
75 80 Ala Met Gly His Lys Gly Met Lys Tyr Ser Leu Ala Ser Arg Glu
Ile 85 90 95 Ile Ala Asp Ser Val Glu Thr Met Gly Thr Ala His Pro
Phe Asp Ala 100 105 110 Met Ile Cys Ile Pro Asn Cys Asp Lys Val Val
Pro Gly Met Leu Met 115 120 125 Gly Ala Met Arg Leu Asn Ile Pro Thr
Ile Phe Ala Ser Gly Gly Pro 130 135 140 Met Arg Ala Gly Lys Pro Gln
Ala Glu Gly Gly Pro Asp Thr Asp Leu 145 150 155 160 Ile Ser Ile Phe
Glu Gly Val Ala Ala Asn Arg Ile Gly Lys Leu Ser 165 170 175 Asp Glu
Gly Leu Glu Ala Leu Glu Cys Ser Ala Cys Pro Gly Pro Gly 180 185 190
Ser Cys Ser Gly Met Phe Thr Ala Asn Ser Met Asn Cys Leu Cys Glu 195
200 205 Ala Leu Gly Ile Ala Leu Pro Gly Asn Gly Thr Ile Ala Ala Asp
Ser 210 215 220 Pro Glu Arg Val Glu Leu Trp Lys Arg Ala Ala Arg Arg
Ala Val Glu 225 230 235 240 Leu Ala Arg Met Glu Asn Pro Pro Thr Ala
Lys Asp Phe Ala Thr Pro 245 250 255 Ala Ala Phe Gln Asn Ala Leu Val
Leu Asp Met Ala Met Gly Gly Ser 260 265 270 Ser Asn Thr Val Leu His
Thr Leu Ala Val Ala Thr Glu Ala Gly Thr 275 280 285 Lys Leu Asp Leu
Lys Lys Leu Asp Glu Ile Ser Ala Arg Thr Pro Asn 290 295 300 Ile Cys
Lys Leu Ser Pro Ser Val Gln Tyr His Ile Val Glu Asp Gly 305 310 315
320 Asn Arg Val Gly Gly Ile Met Ala Ile Leu Lys Glu Ile Ser Lys Val
325 330 335 Pro Gly Leu Ile Asp Gly Ser Ala Pro Thr Val Ser Gly Lys
Thr Leu 340 345 350 Ala Glu Glu Phe Asn Gly Ala Pro Asp Pro Asp Gly
Thr Ile Ile Arg 355 360 365 Pro Leu Ser Asn Pro Tyr Ser Glu Lys Gly
Gly Leu Ala Ile Leu Phe 370 375 380 Gly Asn Leu Ala Glu Lys Gly Cys
Val Val Lys Ala Ala Gly Val Ala 385 390 395 400 Lys Ala Met Leu Thr
His Lys Gly Pro Ala Val Ile Phe Asp Ser Glu 405 410 415 Glu Glu Ala
Gly Glu Gly Ile Leu Ala Gly Lys Val Lys Ala Gly Asp 420 425 430 Val
Val Val Ile Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met Gln 435 440
445 Glu Met Leu Ala Pro Thr Ser Tyr Ile Met Gly Arg Gly Leu Gly Glu
450 455 460 Ser Val Ala Leu Val Thr Asp Gly Arg Phe Ser Gly Gly Thr
Arg Gly 465 470 475 480 Ala Cys Ile Gly His Val Ser Pro Glu Ala Ala
Ala Gly Gly Leu Ile 485 490 495 Gly Leu Val Glu Pro Gly Asp Ile Ile
Glu Ile Asp Ile Pro Asn Arg 500 505 510 Ser Ile Lys Leu Asp Val Pro
Asp Glu Val Ile Ala Glu Arg Arg Lys 515 520 525 Asn Trp Lys Pro Arg
Glu Pro Lys Ile Lys Thr Gly Tyr Leu Ala Lys 530 535 540 Tyr Ala Ser
Leu Ala Thr Ser Ala Asp Thr Gly Gly Val Leu Lys Val 545 550 555 560
Asn
12 549 PRT Unknown Termite Group 1 Bacterium Phylotype Rs-D17
12 Met
Arg Ser Asp Gln Ile Lys Arg Gly Ala Val Arg Ala Pro Asn Arg 1 5 10
15 Cys Leu Leu Tyr Ser Thr Gly Ile Ser Pro Gly Asp Leu Asp Lys Pro
20 25 30 Phe Ile Gly Ile Ala Ser Ser Phe Thr Asp Leu Val Pro Gly
His Val 35 40 45 Ala Met Arg Asp Leu Glu Arg Tyr Val Glu Arg Gly
Ile Ala Ala Gly 50 55 60 Gly Gly Val Pro Phe Ile Phe Gly Ala Pro
Ala Val Cys Asp Gly Ile 65 70 75 80 Ala Met Gly His Ser Gly Met His
Tyr Ser Leu Gly Ser Arg Glu Ile 85 90 95 Ile Ala Asp Leu Val Glu Thr
Val Ala Asn Ala His Met Leu Asp Gly 100 105 110 Leu Ile Leu Leu Ser
Asn Cys Asp Lys Val Thr Pro Gly Met Leu Met 115 120 125 Ala Ala Ala
Arg Leu Asn Ile Pro Ala Ile Val Val Thr Ala Gly Ala 130 135 140 Met
Met Thr Gly Met Tyr Asp Lys Lys Arg Arg Ser Met Val Arg Asp 145 150
155 160 Thr Phe Glu Ala Val Gly Gln Phe Gln Ala Gly Lys Ile Thr Glu
Lys 165 170 175 Gln Leu Ser Glu Leu Glu Met Ala Ala Cys Pro Gly Ala
Gly Ala Cys 180 185 190 Gln Gly Met Tyr Thr Ala Asn Thr Met Ala Cys
Leu Thr Glu Thr Met 195 200 205 Gly Met Ser Met Arg Gly Cys Ala Thr
Thr Leu Ala Val Ser Ala Lys 210 215 220 Lys Lys Arg Ile Ala Tyr Glu
Ser Gly Ile Arg Val Val Ala Leu Val 225 230 235 240 Lys Lys Asp Val
Lys Pro Arg Asp Ile Leu Thr Leu Ala Ala Phe Lys 245 250 255 Asn Ala
Ile Val Ala Asp Met Ala Leu Gly Gly Ser Thr Asn Thr Val 260 265 270
Leu His Leu Pro Ala Ile Ala Asn Glu Ala Gly Ile Glu Leu Pro Leu 275
280 285 Glu Leu Phe Asp Glu Ile Ser Lys Lys Thr Pro Gln Ile Ala Cys
Leu 290 295 300 Glu Pro Ala Gly Asp His Tyr Met Glu Asp Leu Asp Asn
Ala Gly Gly 305 310 315 320 Ile Pro Ala Val Leu Phe Ala Ile Gln Lys
Asn Leu Ala His Ser Lys 325 330 335 Thr Val Ser Gly Phe Asp Ile Ile
Glu Ile Ala Asn Ser Ala Glu Ile 340 345 350 Leu Asp Glu Tyr Val Ile
Arg Ala Lys Asn Pro Tyr Lys Pro Glu Gly 355 360 365 Gly Ile Ala Ile
Leu Arg Gly Asn Ile Ala Pro Arg Gly Cys Val Val 370 375 380 Lys Gln
Ala Ala Val Ser Glu Lys Met Lys Val Phe Ser Gly Arg Ala 385 390 395
400 Arg Val Phe Asn Ser Glu Asp Asn Ala Met Lys Ala Ile Leu Asp Asn
405 410 415 Lys Ile Val Pro Gly Asp Ile Val Val Ile Arg Tyr Glu Gly
Pro Ala 420 425 430 Gly Gly Pro Gly Met Arg Glu Met Leu Ser Pro Thr
Ser Ala Leu His 435 440 445 Gly Met Gly Leu Ser Asp Ser Val Ala Leu
Leu Thr Asp Gly Arg Phe 450 455 460 Ser Gly Gly Thr Arg Gly Pro Cys
Ile Gly His Ile Ser Pro Glu Ala 465 470 475 480 Ala Ala Asp Gly Ala
Ile Val Ala Ile Asn Glu Gly Asp Thr Ile Asn 485 490 495 Ile Asn Ile
Pro Glu Arg Thr Leu Asn Val Glu Leu Thr Asp Asp Glu 500 505 510 Ile
Lys Ala Arg Ile Gly Lys Val Ile Lys Pro Glu Pro Lys Ile Lys 515 520
525 Thr Gly Tyr Met Ala Arg Tyr Ala Lys Leu Val Gln Ser Ala Asp Thr
530 535 540 Gly Ala Val Leu Lys 545
13 573 PRT Yarrowia lipolytica
13 Met Ile Arg Ala Arg Asn Tyr Ala Thr Lys Ala His Thr Leu Asn Lys 1
5 10 15 Phe Ser Lys Ile Ile Thr Glu Pro Lys Ser Gln Gly Ala Ser Gln
Ala 20 25 30 Met Leu Tyr Ala Cys Gly Phe Asn Glu Ala Asp Leu Gly
Lys Pro Gln 35 40 45 Val Gly Val Ala Ser Val Trp Trp Ser Gly Asn
Pro Cys Asn Met His 50 55 60 Leu Leu Asp Leu Asn Phe Lys Val Lys
Glu Gly Ile Glu Lys His Asn 65 70 75 80 Leu Lys Ala Met Gln Phe Asn
Thr Ile Gly Val Ser Asp Gly Ile Ser 85 90 95 Met Gly Thr Lys Gly
Met Arg Tyr Ser Leu Gln Ser Arg Asp Met Ile 100 105 110 Ala Asp Ser
Ile Glu Thr Leu Met Met Ala Gln His Tyr Asp Ala Asn 115 120 125 Ile
Ser Ile Pro Gly Cys Asp Lys Asn Met Pro Gly Val Leu Met Ala 130 135
140 Met Gly Arg Val Asn Arg Pro Ser Ile Met Leu Tyr Gly Gly Thr Ile
145 150 155 160 His Pro Gly Lys Ala Glu Thr Arg Lys Gly Glu Asp Ile
Asp Ile Val 165 170 175 Ser Ala Phe Gln Ala Tyr Gly Gln Tyr Ile Ala
Gly Gly Ile Ser Glu 180 185 190 Thr Glu Arg Ala Asp Val Ile Arg His
Ala Cys Pro Gly Gln Gly Ala 195 200 205 Cys Gly Gly Met Tyr Thr Ala
Asn Thr Met Ala Ser Ala Ala Glu Val 210 215 220 Leu Gly Met Thr Leu
Pro Gly Ser Ser Ser Ala Pro Ala Ile Ser Lys 225 230 235 240 Glu Lys
Met Ala Glu Cys Glu Ala Leu Gly Pro Ala Ile Asn Lys Leu 245 250 255
Leu Glu Met Asp Leu Lys Pro Lys Asp Ile Met Thr Arg Gln Ala Phe 260
265 270 Glu Asn Ala Ile Ala Tyr Ile Ile Ala Thr Gly Gly Ser Thr Asn
Ala 275 280 285 Val Leu His Leu Leu Ala Ile Ala His Thr Val Asp Val
Pro Leu Thr 290 295 300 Ile Asp Asp Phe Gln Arg Ile Ser Asp Asn Thr
Pro Leu Leu Ala Asp 305 310 315 320 Phe Lys Pro Ser Gly Ala His Val
Met Ala Asp Leu Gln Lys Trp Gly 325 330 335 Gly Thr Pro Ala Val Ile
Lys Met Leu Ile Glu Gln Gly Phe Ile Asp 340 345 350 Gly Ser Pro Met
Thr Cys Ser Gly Glu Ser Leu Lys Asp Thr Val Ala 355 360 365 Lys Tyr
Pro Ser Leu Pro Lys Glu Gln Asp Ile Phe Ala Ser Val Asp 370 375 380
Ala Pro Leu Lys Pro Ser Gly His Leu Gln Ile Leu Lys Gly Ser Leu 385
390 395 400 Ala Pro Gly Gly Ser Val Gly Lys Ile Thr Gly Lys Glu Gly
Thr Phe 405 410 415 Phe Lys Gly Thr Ala Arg Cys Phe Asp Glu Glu Asp
Leu Phe Ile Glu 420 425 430 Ala Leu Glu Lys Gly Glu Ile Lys Lys Gly
Glu Lys Thr Cys Val Ile 435 440 445 Ile Arg Tyr Glu Gly Pro Lys Gly
Gly Pro Gly Met Pro Glu Met Leu 450 455 460 Lys Pro Ser Ser Ala Leu
Met Gly Tyr Gly Leu Gly Lys Asp Val Ala 465 470 475 480 Leu Leu Thr
Asp Gly Arg Phe Ser Gly Gly Ser His Gly Phe Leu Ile 485 490 495 Gly
His Ile Val Pro Glu Ala Tyr Glu Gly Gly Pro Ile Gly Leu Val 500 505
510 Glu Asp Gly Asp Glu Ile Ile Ile Asp Ala Asp Asn Asn Ile Ile Asp
515 520 525 Leu Leu Val Asp Glu Lys Thr Met Ala Glu Arg Lys Ala Lys
Trp Thr 530 535 540 Pro Pro Ala Pro Arg Tyr Thr Ser Gly Thr Leu His
Lys Tyr Ser Lys 545 550 555 560 Leu Val Ser Asp Ala Ser Thr Gly Cys
Ile Thr Asp Ala 565 570
14 560 PRT Francisella tularensis
14 Met Lys
Lys Val Leu Asn Lys Tyr Ser Arg Arg Leu Thr Glu Asp Lys 1 5 10 15
Ser Gln Gly Ala Ser Gln Ala Met Leu Tyr Gly Thr Glu Met Asn Asp 20
25 30 Ala Asp Met His Lys Pro Gln Ile Gly Ile Gly Ser Val Trp Tyr
Glu 35 40 45 Gly Asn Thr Cys Asn Met His Leu Asn Gln Leu Ala Gln
Phe Val Lys 50 55 60 Asp Ser Val Glu Lys Glu Asn Leu Lys Gly Met
Arg Phe Asn Thr Ile 65 70 75 80 Gly Val Ser Asp Gly Ile Ser Met Gly
Thr Asp Gly Met Ser Tyr Ser 85 90 95 Leu Gln Ser Arg Asp Leu Ile
Ala Asp Ser Ile Glu Thr Val Met Ser 100 105 110 Ala His Trp Tyr Asp
Gly Leu Val Ser Ile Pro Gly Cys Asp Lys Asn 115 120 125 Met Pro Gly
Cys Met Met Ala Leu Gly Arg Leu Asn Arg Pro Gly Phe 130 135 140 Val
Ile Tyr Gly Gly Thr Ile Gln Ala Gly Val Met Arg Gly Lys Pro 145 150
155 160 Ile Asp Ile Val Thr Ala Phe Gln Ser Tyr Gly Ala Cys Leu Ser
Gly 165 170 175 Gln Ile Thr Glu Gln Glu Arg Gln Glu Thr Ile Lys Lys
Ala Cys Pro 180 185 190 Gly Ala Gly Ala Cys Gly Gly Met Tyr Thr Ala
Asn Thr Met Ala Cys 195 200 205 Ala Ile Glu Ala Leu Gly Met Ser Leu
Pro Phe Ser Ser Ser Thr Ser 210 215 220 Ala Thr Ser Val Glu Lys Val
Gln Glu Cys Asp Lys Ala Gly Glu Thr 225 230 235 240 Ile Lys Asn Leu
Leu Glu Leu Asp Ile Lys Pro Arg Asp Ile Met Thr 245 250 255 Arg Lys
Ala Phe Glu Asn Ala Met Val Leu Ile Thr Val Met Gly Gly 260 265 270
Ser Thr Asn Ala Val Leu His Leu Leu Ala Met Ala Ser Ser Val Asp 275
280 285 Val Asp Leu Ser Ile Asp Asp Phe Gln Glu Ile Ala Asn Lys Thr
Pro 290 295 300 Val Leu Ala Asp Phe Lys Pro Ser Gly Lys Tyr Val Met
Ala Asn Leu 305 310 315 320 His Ala Ile Gly Gly Thr Pro Ala Val Met
Lys Met Leu Leu Lys Ala 325 330 335 Gly Met Leu His Gly Asp Cys Leu
Thr Val Thr Gly Lys Thr Leu Ala 340 345 350 Glu Asn Leu Glu Asn Val
Ala Asp Leu Pro Glu Asp Asn Thr Ile Ile 355 360 365 His Lys Leu Asp
Asn Pro Ile Lys Lys Thr Gly His Leu Gln Ile Leu 370 375 380 Lys Gly
Asn Val Ala Pro Glu Gly Ser Val Ala Lys Ile Thr Gly Lys 385 390 395
400 Glu Gly Glu Ile Phe Glu Gly Val Ala Asn Val Phe Asp Ser Glu Glu
405 410 415 Glu Met Val Ala Ala Val Glu Thr Gly Lys Val Lys Lys Gly
Asp Val 420 425 430 Ile Val Ile Arg Tyr Glu Gly Pro Lys Gly Gly Pro
Gly Met Pro Glu 435 440 445 Met Leu Lys Pro Thr Ser Leu Ile Met Gly
Ala Gly Leu Gly Gln Asp 450 455 460 Val Ala Leu Ile Thr Asp Gly Arg
Phe Ser Gly Gly Ser His Gly Phe 465 470 475 480 Ile Val Gly His Ile
Thr Pro Glu Ala Tyr Glu Gly Gly Met Ile Ala 485 490 495 Leu Leu Glu
Asn Gly Asp Lys Ile Thr Ile Asp Ala Ile Asn Asn Val 500 505 510 Ile
Asn Val Asp Leu Ser Asp Gln Glu Ile Ala Gln Arg Lys Ser Lys 515 520
525 Trp Arg Ala Ser Lys Gln Lys Ala Ser Arg Gly Thr Leu Lys Lys Tyr
530 535 540 Ile Lys Thr Val Ser Ser Ala Ser Thr Gly Cys Val Thr Asp
Leu Asp 545 550 555 560
15 581 PRT Arabidopsis thaliana
15 Met Pro Ser
Ile Ile Ser Cys Ser Ala Gln Ser Val Thr Ala Asp Pro 1 5 10 15 Ser
Pro Pro Ile Thr Asp Thr Asn Lys Leu Asn Lys Tyr Ser Ser Arg 20 25
30 Ile Thr Glu Pro Lys Ser Gln Gly Gly Ser Gln Ala Ile Leu His Gly
35 40 45 Val Gly Leu Ser Asp Asp Asp Leu Leu Lys Pro Gln Ile Gly
Ile Ser 50 55 60 Ser Val Trp Tyr Glu Gly Asn Thr Cys Asn Met His
Leu Leu Lys Leu 65 70 75 80 Ser Glu Ala Val Lys Glu Gly Val Glu Asn
Ala Gly Met Val Gly Phe 85 90 95 Arg Phe Asn Thr Ile Gly Val Ser
Asp Ala Ile Ser Met Gly Thr Arg 100 105 110 Gly Met Cys Phe Ser Leu
Gln Ser Arg Asp Leu Ile Ala Asp Ser Ile 115 120 125 Glu Thr Val Met
Ser Ala Gln Trp Tyr Asp Gly Asn Ile Ser Ile Pro 130 135 140 Gly Cys
Asp Lys Asn Met Pro Gly Thr Ile Met Ala Met Gly Arg Leu 145 150 155
160 Asn Arg Pro Gly Ile Met Val Tyr Gly Gly Thr Ile Lys Pro Gly His
165 170 175 Phe Gln Asp Lys Thr Tyr Asp Ile Val Ser Ala Phe Gln Ser
Tyr Gly 180 185 190 Glu Phe Val Ser Gly Ser Ile Ser Asp Glu Gln Arg
Lys Thr Val Leu 195 200 205 His His Ser Cys Pro Gly Ala Gly Ala Cys
Gly Gly Met Tyr Thr Ala 210 215 220 Asn Thr Met Ala Ser Ala Ile Gly
Ala Met Gly Met Ser Leu Pro Tyr 225 230 235 240 Ser Ser Ser Ile Pro
Ala Glu Asp Pro Leu Lys Leu Asp Glu Cys Arg 245 250 255 Leu Ala Gly
Lys Tyr Leu Leu Glu Leu Leu Lys Met Asp Leu Lys Pro 260 265 270 Arg
Asp Ile Ile Thr Pro Lys Ser Leu Arg Asn Ala Met Val Ser Val 275 280
285 Met Ala Leu Gly Gly Ser Thr Asn Ala Val Leu His Leu Ile Ala Ile
290 295 300 Ala Arg Ser Val Gly Leu Glu Leu Thr Leu Asp Asp Phe Gln
Lys Val 305 310 315 320 Ser Asp Ala Val Pro Phe Leu Ala Asp Leu Lys
Pro Ser Gly Lys Tyr 325 330 335 Val Met Glu Asp Ile His Lys Ile Gly
Gly Thr Pro Ala Val Leu Arg 340 345 350 Tyr Leu Leu Glu Leu Gly Leu
Met Asp Gly Asp Cys Met Thr Val Thr 355 360 365 Gly Gln Thr Leu Ala
Gln Asn Leu Glu Asn Val Pro Ser Leu Thr Glu 370 375 380 Gly Gln Glu
Ile Ile Arg Pro Leu Ser Asn Pro Ile Lys Glu Thr Gly 385 390 395 400
His Ile Gln Ile Leu Arg Gly Asp Leu Ala Pro Asp Gly Ser Val Ala 405
410 415 Lys Ile Thr Gly Lys Glu Gly Leu Tyr Phe Ser Gly Pro Ala Leu
Val 420 425 430 Phe Glu Gly Glu Glu Ser Met Leu Ala Ala Ile Ser Ala
Asp Pro Met 435 440 445 Ser Phe Lys Gly Thr Val Val Val Ile Arg Gly
Glu Gly Pro Lys Gly 450 455 460 Gly Pro Gly Met Pro Glu Met Leu Thr
Pro Thr Ser Ala Ile Met Gly 465 470 475 480 Ala Gly Leu Gly Lys Glu
Cys Ala Leu Leu Thr Asp Gly Arg Phe Ser 485 490 495 Gly Gly Ser His
Gly Phe Val Val Gly His Ile Cys Pro Glu Ala Gln 500 505 510 Glu Gly
Gly Pro Ile Gly Leu Ile Lys Asn Gly Asp Ile Ile Thr Ile 515 520 525
Asp Ile Gly Lys Lys Arg Ile Asp Thr Gln Val Ser Pro Glu Glu Met 530
535 540 Asn Asp Arg Arg Lys Lys Trp Thr Ala Pro Ala Tyr Lys Val Asn
Arg 545 550 555 560 Gly Val Leu Tyr Lys Tyr Ile Lys Asn Val Gln Ser
Ala Ser Asp Gly 565 570 575 Cys Val Thr Asp Glu 580
16 573 PRT Candidatus Koribacter versatilis
16 Met Thr Glu Lys Ser Pro
Lys Pro His Lys Arg Ser Asp Ala Ile Thr 1 5 10 15 Glu Gly Pro Asn
Arg Ala Pro Ala Arg Ala Met Leu Arg Ala Ala Gly 20 25 30 Phe Thr
Pro Glu Asp Leu Arg Lys Pro Ile Ile Gly Ile Ala Asn Thr 35 40 45
Trp Ile Glu Ile Gly Pro Cys Asn Leu His Leu Arg Glu Leu Ala Glu 50
55 60 His Ile Lys Gln Gly Val Arg Glu Ala Gly Gly Thr Pro Met Glu
Phe 65 70 75 80 Asn Thr Val Ser Ile Ser Asp Gly Ile Thr Met Gly Ser
Glu Gly Met 85 90 95 Lys Ala Ser Leu Val Ser Arg Glu Val Ile Ala
Asp Ser Ile Glu Leu 100 105 110 Val Ala Arg Gly Asn Leu Phe Asp Gly
Leu Ile Ala Leu Ser Gly Cys 115 120 125
Asp Lys Thr Ile Pro Gly Thr Ile Met Ala Leu Glu Arg Leu Asp Ile 130
135 140 Pro Gly Leu Met Leu Tyr Gly Gly Ser Ile Ala Pro Gly Lys Phe
His 145 150 155 160 Ala Gln Lys Val Thr Ile Gln Asp Val Phe Glu Ala
Val Gly Thr His 165 170 175 Ala Arg Gly Lys Met Ser Asp Ala Asp Leu
Glu Glu Leu Glu His Asn 180 185 190 Ala Cys Pro Gly Ala Gly Ala Cys
Gly Gly Gln Phe Thr Ala Asn Thr 195 200 205 Met Ser Met Cys Gly Glu
Phe Leu Gly Ile Ser Pro Met Gly Ala Asn 210 215 220 Ser Val Pro Ala
Met Thr Val Glu Lys Gln Gln Val Ala Arg Arg Cys 225 230 235 240 Gly
His Leu Val Met Glu Leu Val Arg Arg Asp Ile Arg Pro Ser Gln 245 250
255 Ile Ile Thr Arg Lys Ala Ile Glu Asn Ala Ile Ala Ser Val Ala Ala
260 265 270 Ser Gly Gly Ser Thr Asn Ala Val Leu His Leu Leu Ala Ile
Ala His 275 280 285 Glu Met Asp Val Glu Leu Asn Ile Glu Asp Phe Asp
Lys Ile Ser Ser 290 295 300 Arg Thr Pro Leu Leu Cys Glu Leu Lys Pro
Ala Gly Arg Phe Thr Ala 305 310 315 320 Thr Asp Leu His Asp Ala Gly
Gly Ile Pro Leu Val Ala Gln Arg Leu 325 330 335 Leu Glu Ala Asn Leu
Leu His Ala Asp Ala Leu Thr Val Thr Gly Lys 340 345 350 Thr Ile Ala
Glu Glu Ala Lys Gln Ala Lys Glu Thr Pro Gly Gln Glu 355 360 365 Val
Val Arg Pro Leu Thr Asp Pro Ile Lys Ala Thr Gly Gly Leu Met 370 375
380 Ile Leu Lys Gly Asn Leu Ala Ser Glu Gly Cys Val Val Lys Leu Val
385 390 395 400 Gly His Lys Lys Leu Phe Phe Glu Gly Pro Ala Arg Val
Phe Glu Ser 405 410 415 Glu Glu Glu Ala Phe Ala Gly Val Glu Asp Arg
Thr Ile Gln Ala Gly 420 425 430 Glu Val Val Val Val Arg Tyr Glu Gly
Pro Lys Gly Gly Pro Gly Met 435 440 445 Arg Glu Met Leu Gly Val Thr
Ala Ala Ile Ala Gly Thr Glu Leu Ala 450 455 460 Glu Thr Val Ala Leu
Ile Thr Asp Gly Arg Phe Ser Gly Ala Thr Arg 465 470 475 480 Gly Leu
Ser Val Gly His Val Ala Pro Glu Ala Ala Asn Gly Gly Ala 485 490 495
Ile Ala Val Val Arg Asn Gly Asp Ile Ile Thr Leu Asp Val Glu Arg 500
505 510 Arg Glu Leu Arg Val His Leu Thr Asp Ala Glu Leu Glu Ala Arg
Leu 515 520 525 Arg Asn Trp Arg Ala Pro Glu Pro Arg Tyr Lys Arg Gly
Val Phe Ala 530 535 540 Lys Tyr Ala Ser Thr Val Ser Ser Ala Ser Phe
Gly Ala Val Thr Gly 545 550 555 560 Ser Thr Ile Glu Asn Lys Thr Leu
Ala Gly Ser Thr Lys 565 570
17 562 PRT Gramella forsetii
17 Met Asp Lys
Thr Ala Met Asn Asn Lys Tyr Ser Ser Thr Ile Thr Gln 1 5 10 15 Ser
Asp Ser Gln Pro Ala Ser Gln Ala Met Leu His Ala Ile Gly Leu 20 25
30 Asn Lys Glu Asp Leu Lys Lys Pro Phe Val Gly Ile Gly Ser Thr Gly
35 40 45 Tyr Glu Gly Asn Pro Cys Asn Met His Leu Asn Asp Leu Ala
Lys Glu 50 55 60 Val Lys Lys Gly Thr Gln Asn Ala Asp Leu Asn Gly
Leu Ile Phe Asn 65 70 75 80 Thr Ile Gly Val Ser Asp Gly Ile Ser Met
Gly Thr Pro Gly Met Arg 85 90 95 Phe Ser Leu Pro Ser Arg Asp Leu
Ile Ala Asp Ser Met Glu Thr Val 100 105 110 Val Gly Gly Met Ser Tyr
Asp Gly Leu Val Thr Val Val Gly Cys Asp 115 120 125 Lys Asn Met Pro
Gly Ala Leu Met Ala Met Leu Arg Leu Asn Arg Pro 130 135 140 Ser Val
Leu Val Tyr Gly Gly Thr Ile Ala Ser Gly Cys His Asn Gly 145 150 155
160 Lys Lys Leu Asp Val Val Ser Ala Phe Glu Ala Trp Gly Ser Lys Val
165 170 175 Ser Gly Asp Met Gln Glu Glu Glu Tyr Gln Gln Val Ile Glu
Lys Ala 180 185 190 Cys Pro Gly Ala Gly Ala Cys Gly Gly Met Tyr Thr
Ala Asn Thr Met 195 200 205 Ala Ser Ser Ile Glu Ala Leu Gly Met Ser
Leu Pro Phe Asn Ser Ser 210 215 220 Asn Pro Ala Thr Gly Pro Glu Lys
Thr Gln Glu Ser Val Lys Ala Gly 225 230 235 240 Glu Ala Met Lys Tyr
Leu Leu Glu Asn Asp Leu Lys Pro Lys Asp Ile 245 250 255 Val Thr Ala
Lys Ser Leu Glu Asn Ala Ile Arg Leu Leu Thr Val Leu 260 265 270 Gly
Gly Ser Thr Asn Ala Val Leu His Phe Leu Ala Ile Ala Lys Ala 275 280
285 Ala Glu Ile Asn Phe Gly Leu Lys Asp Phe Thr Arg Ile Cys Glu Glu
290 295 300 Thr Pro Phe Leu Ala Asp Leu Lys Pro Ser Gly Lys Tyr Leu
Met Glu 305 310 315 320 Asp Ile His Arg Ile Gly Gly Ile Pro Ala Val
Met Lys Tyr Met Leu 325 330 335 Glu Lys Gly Leu Leu His Gly Glu Cys
Met Thr Val Thr Gly Lys Thr 340 345 350 Ile Ala Glu Asn Leu Glu Asn
Val Lys Pro Leu Pro Asp Asp Gln Asp 355 360 365 Val Ile His Pro Val
Glu Lys Pro Ile Lys Ala Thr Gly His Ile Arg 370 375 380 Ile Leu Tyr
Gly Asn Leu Ala Ser Glu Gly Ser Val Ala Lys Ile Thr 385 390 395 400
Gly Lys Glu Gly Leu Glu Phe Gln Gly Lys Ala Arg Val Phe Asn Gly 405
410 415 Glu Phe Glu Ala Asn Glu Gly Ile Ser Ser Gly Lys Val Gln Lys
Gly 420 425 430 Asp Val Val Val Ile Arg Tyr Glu Gly Pro Lys Gly Gly
Pro Gly Met 435 440 445 Pro Glu Met Leu Lys Pro Thr Ser Ala Ile Met
Gly Ala Gly Leu Gly 450 455 460 Lys Ser Val Ala Leu Ile Thr Asp Gly
Arg Phe Ser Gly Gly Thr His 465 470 475 480 Gly Phe Val Val Gly His
Ile Thr Pro Glu Ala Gln Gln Gly Gly Leu 485 490 495 Ile Gly Leu Leu
Lys Asp Gly Asp Glu Ile Ser Ile Asn Ala Glu Lys 500 505 510 Asn Thr
Ile Glu Ala His Leu Ser Ala Glu Glu Ile Asn Arg Arg Lys 515 520 525
Glu Ala Trp Lys Ala Pro Ala Leu Lys Val Asn Gly Gly Val Leu Tyr 530
535 540 Lys Tyr Ala Lys Thr Val Ala Ser Ala Ser Glu Gly Cys Val Thr
Asp 545 550 555 560 Glu Phe
18 570 PRT Lactococcus lactis
18 Met Glu
Phe Lys Tyr Asn Gly Lys Val Glu Ser Val Glu Leu Asn Lys 1 5 10 15
Tyr Ser Lys Thr Leu Thr Gln Asp Pro Thr Gln Pro Ala Thr Gln Ala 20
25 30 Met Tyr Tyr Gly Ile Gly Phe Lys Asp Glu Asp Phe Lys Lys Ala
Gln 35 40 45 Val Gly Ile Val Ser Met Asp Trp Asp Gly Asn Pro Cys
Asn Met His 50 55 60 Leu Gly Thr Leu Gly Ser Lys Ile Lys Ser Ser
Val Asn Gln Thr Asp 65 70 75 80 Gly Leu Ile Gly Leu Gln Phe His Thr
Ile Gly Val Ser Asp Gly Ile 85 90 95 Ala Asn Gly Lys Leu Gly Met
Arg Tyr Ser Leu Val Ser Arg Glu Val 100 105 110 Ile Ala Asp Ser Ile
Glu Thr Asn Ala Gly Ala Glu Tyr Tyr Asp Ala 115 120 125 Ile Val Ala
Ile Pro Gly Cys Asp Lys Asn Met Pro Gly Ser Ile Ile 130 135 140 Gly
Met Ala Arg Leu Asn Arg Pro Ser Ile Met Val Tyr Gly Gly Thr 145 150
155 160 Ile Glu His Gly Glu Tyr Lys Gly Glu Lys Leu Asn Ile Val Ser
Ala 165 170 175 Phe Glu Ser Leu Gly Gln Lys Ile Thr Gly Asn Ile Ser
Asp Glu Asp 180 185 190 Tyr His Gly Val Ile Cys Asn Ala Ile Pro Gly
Gln Gly Ala Cys Gly 195 200 205 Gly Met Tyr Thr Ala Asn Thr Leu Ala
Ala Ala Ile Glu Thr Leu Gly 210 215 220 Met Ser Leu Pro Tyr Ser Ser
Ser Asn Pro Ala Val Ser Gln Glu Lys 225 230 235 240 Gln Glu Glu Cys
Asp Glu Ile Gly Leu Ala Ile Lys Asn Leu Leu Glu 245 250 255 Lys Asp
Ile Lys Pro Ser Asp Ile Met Thr Lys Glu Ala Phe Glu Asn 260 265 270
Ala Ile Thr Ile Val Met Val Leu Gly Gly Ser Thr Asn Ala Val Leu 275
280 285 His Ile Ile Ala Met Ala Asn Ala Ile Gly Val Glu Ile Thr Gln
Asp 290 295 300 Asp Phe Gln Arg Ile Ser Asp Ile Thr Pro Val Leu Gly
Asp Phe Lys 305 310 315 320 Pro Ser Gly Lys Tyr Met Met Glu Asp Leu
His Lys Ile Gly Gly Leu 325 330 335 Pro Ala Val Leu Lys Tyr Leu Leu
Lys Glu Gly Lys Leu His Gly Asp 340 345 350 Cys Leu Thr Val Thr Gly
Lys Thr Leu Ala Glu Asn Val Glu Thr Ala 355 360 365 Leu Asp Leu Asp
Phe Asp Ser Gln Asp Ile Met Arg Pro Leu Lys Asn 370 375 380 Pro Ile
Lys Ala Thr Gly His Leu Gln Ile Leu Tyr Gly Asn Leu Ala 385 390 395
400 Gln Gly Gly Ser Val Ala Lys Ile Ser Gly Lys Glu Gly Glu Phe Phe
405 410 415 Lys Gly Thr Ala Arg Val Phe Asp Gly Glu Gln His Phe Ile
Asp Gly 420 425 430 Ile Glu Ser Gly Arg Leu His Ala Gly Asp Val Ala
Val Ile Arg Asn 435 440 445 Ile Gly Pro Val Gly Gly Pro Gly Met Pro
Glu Met Leu Lys Pro Thr 450 455 460 Ser Ala Leu Ile Gly Ala Gly Leu
Gly Lys Ser Cys Ala Leu Ile Thr 465 470 475 480 Asp Gly Arg Phe Ser
Gly Gly Thr His Gly Phe Val Val Gly His Ile 485 490 495 Val Pro Glu
Ala Val Glu Gly Gly Leu Ile Gly Leu Val Glu Asp Asp 500 505 510 Asp
Ile Ile Glu Ile Asp Ala Val Asn Asn Ser Ile Ser Leu Lys Val 515 520
525 Ser Asp Glu Glu Ile Ala Lys Arg Arg Ala Asn Tyr Gln Lys Pro Thr
530 535 540 Pro Lys Ala Thr Arg Gly Val Leu Ala Lys Phe Ala Lys Leu
Thr Arg 545 550 555 560 Pro Ala Ser Glu Gly Cys Val Thr Asp Leu 565
570
19 568 PRT Saccharopolyspora erythraea
19 Met Ser Thr Ser Thr Asp
Gly Thr Gly Gln Ser Gly Arg Gly Leu Lys 1 5 10 15 Pro Arg Ser Gly
Asp Val Thr Glu Gly Ile Glu Arg Ala Ala Ala Arg 20 25 30 Gly Met
Leu Arg Ala Val Gly Met Gln Asp Ala Asp Phe Ala Lys Pro 35 40 45
Gln Ile Gly Val Ala Ser Ser Trp Asn Glu Ile Thr Pro Cys Asn Leu 50
55 60 Ser Leu Gln Arg Leu Ala Gln Ala Ser Lys Glu Gly Val His Ala
Ala 65 70 75 80 Gly Gly Phe Pro Met Glu Phe Gly Thr Ile Ser Val Ser
Asp Gly Ile 85 90 95 Ser Met Gly His Val Gly Met His Tyr Ser Leu
Val Ser Arg Glu Val 100 105 110 Ile Ala Asp Ser Val Glu Thr Val Met
Glu Ala Glu Arg Leu Asp Gly 115 120 125 Ser Val Leu Leu Ala Gly Cys
Asp Lys Ser Leu Pro Gly Met Leu Met 130 135 140 Ala Ala Ala Arg Leu
Asp Val Ala Ala Val Phe Val Tyr Ala Gly Ser 145 150 155 160 Ile Leu
Pro Gly Arg Val Asp Asp Arg Glu Val Thr Ile Ile Asp Ala 165 170 175
Phe Glu Ala Val Gly Ala Cys Ala Arg Gly Leu Ile Ser Glu Ala Glu 180
185 190 Val Asp Arg Ile Glu Arg Ala Ile Cys Pro Gly Glu Gly Ala Cys
Gly 195 200 205 Gly Met Tyr Thr Ala Asn Thr Met Ala Cys Ala Ala Glu
Ala Met Gly 210 215 220 Met Ser Leu Pro Gly Ser Ala Ser Pro Pro Ser
Val Asp Arg Arg Arg 225 230 235 240 Asp Ala Gly Ala Arg Glu Ala Gly
Arg Ala Val Val Gly Met Ile Glu 245 250 255 Arg Gly Leu Thr Ala Arg
Gln Ile Leu Thr Lys Glu Ala Phe Glu Asn 260 265 270 Ala Ile Ala Val
Val Met Ala Phe Gly Gly Ser Thr Asn Ala Val Leu 275 280 285 His Leu
Leu Ala Ile Ala Arg Glu Ala Glu Val Asp Leu Thr Leu Asp 290 295 300
Asp Phe Asn Arg Ile Gly Asp Arg Val Pro His Leu Ala Asp Val Lys 305
310 315 320 Pro Phe Gly Arg His Val Met Thr Ala Val Asp Arg Ile Gly
Gly Val 325 330 335 Pro Val Val Met Lys Ala Leu Leu Asp Ala Gly Leu
Leu His Gly Asp 340 345 350 Cys Met Thr Val Thr Gly Lys Thr Val Ala
Glu Asn Leu Ala Glu Leu 355 360 365 Asp Pro Pro Glu Leu Asp Gly Glu
Val Leu His Lys Leu Ser Asn Pro 370 375 380 Leu His Pro Thr Gly Gly
Leu Thr Ile Leu Arg Gly Ser Leu Ala Pro 385 390 395 400 Glu Gly Ala
Val Val Lys Ser Ala Gly Phe Asp Ser Ala Thr Phe Glu 405 410 415 Gly
Thr Ala Arg Val Phe Asp Gly Glu Gln Gly Ala Met Asp Ala Val 420 425
430 Glu Asp Gly Ser Leu Lys Ala Gly Asp Val Val Val Ile Arg Tyr Glu
435 440 445 Gly Pro Arg Gly Gly Pro Gly Met Arg Glu Met Leu Ala Val
Thr Gly 450 455 460 Ala Ile Lys Gly Ala Gly Leu Gly Lys Asp Val Leu
Leu Leu Thr Asp 465 470 475 480 Gly Arg Phe Ser Gly Gly Thr Thr Gly
Leu Cys Ile Gly His Val Ala 485 490 495 Pro Glu Ala Thr Asp Gly Gly
Pro Ile Ala Phe Val Arg Asp Gly Asp 500 505 510 Pro Ile Arg Leu Asp
Leu Ala Gly Arg Thr Leu Asp Leu Leu Val Asp 515 520 525 Glu Ala Glu
Leu Ala Arg Arg Lys Glu Gly Trp Val Pro Arg Glu Pro 530 535 540 Lys
Phe Arg Gln Gly Val Leu Gly Lys Tyr Ala Arg Leu Val Arg Ser 545 550
555 560 Ala Ala Val Gly Ala Val Cys Ser 565 20585PRTSaccharomyces
cerevisiae 20Met Gly Leu Leu Thr Lys Val Ala Thr Ser Arg Gln Phe
Ser Thr Thr 1 5 10 15 Arg Cys Val Ala Lys Lys Leu Asn Lys Tyr Ser
Tyr Ile Ile Thr Glu 20 25 30 Pro Lys Gly Gln Gly Ala Ser Gln Ala
Met Leu Tyr Ala Thr Gly Phe 35 40 45 Lys Lys Glu Asp Phe Lys Lys
Pro Gln Val Gly Val Gly Ser Cys Trp 50 55 60 Trp Ser Gly Asn Pro
Cys Asn Met His Leu Leu Asp Leu Asn Asn Arg 65 70 75 80 Cys Ser Gln
Ser Ile Glu Lys Ala Gly Leu Lys Ala Met Gln Phe Asn 85 90 95 Thr
Ile Gly Val Ser Asp Gly Ile Ser Met Gly Thr Lys Gly Met Arg 100 105
110 Tyr Ser Leu Gln Ser Arg Glu Ile Ile Ala Asp Ser Phe Glu Thr Ile
115 120 125 Met Met Ala Gln
His Tyr Asp Ala Asn Ile Ala Ile Pro Ser Cys Asp 130 135 140 Lys Asn
Met Pro Gly Val Met Met Ala Met Gly Arg His Asn Arg Pro 145 150 155
160 Ser Ile Met Val Tyr Gly Gly Thr Ile Leu Pro Gly His Pro Thr Cys
165 170 175 Gly Ser Ser Lys Ile Ser Lys Asn Ile Asp Ile Val Ser Ala
Phe Gln 180 185 190 Ser Tyr Gly Glu Tyr Ile Ser Lys Gln Phe Thr Glu
Glu Glu Arg Glu 195 200 205 Asp Val Val Glu His Ala Cys Pro Gly Pro
Gly Ser Cys Gly Gly Met 210 215 220 Tyr Thr Ala Asn Thr Met Ala Ser
Ala Ala Glu Val Leu Gly Leu Thr 225 230 235 240 Ile Pro Asn Ser Ser
Ser Phe Pro Ala Val Ser Lys Glu Lys Leu Ala 245 250 255 Glu Cys Asp
Asn Ile Gly Glu Tyr Ile Lys Lys Thr Met Glu Leu Gly 260 265 270 Ile
Leu Pro Arg Asp Ile Leu Thr Lys Glu Ala Phe Glu Asn Ala Ile 275 280
285 Thr Tyr Val Val Ala Thr Gly Gly Ser Thr Asn Ala Val Leu His Leu
290 295 300 Val Ala Val Ala His Ser Ala Gly Val Lys Leu Ser Pro Asp
Asp Phe 305 310 315 320 Gln Arg Ile Ser Asp Thr Thr Pro Leu Ile Gly
Asp Phe Lys Pro Ser 325 330 335 Gly Lys Tyr Val Met Ala Asp Leu Ile
Asn Val Gly Gly Thr Gln Ser 340 345 350 Val Ile Lys Tyr Leu Tyr Glu
Asn Asn Met Leu His Gly Asn Thr Met 355 360 365 Thr Val Thr Gly Asp
Thr Leu Ala Glu Arg Ala Lys Lys Ala Pro Ser 370 375 380 Leu Pro Glu
Gly Gln Glu Ile Ile Lys Pro Leu Ser His Pro Ile Lys 385 390 395 400
Ala Asn Gly His Leu Gln Ile Leu Tyr Gly Ser Leu Ala Pro Gly Gly 405
410 415 Ala Val Gly Lys Ile Thr Gly Lys Glu Gly Thr Tyr Phe Lys Gly
Arg 420 425 430 Ala Arg Val Phe Glu Glu Glu Gly Ala Phe Ile Glu Ala
Leu Glu Arg 435 440 445 Gly Glu Ile Lys Lys Gly Glu Lys Thr Val Val
Val Ile Arg Tyr Glu 450 455 460 Gly Pro Arg Gly Ala Pro Gly Met Pro
Glu Met Leu Lys Pro Ser Ser 465 470 475 480 Ala Leu Met Gly Tyr Gly
Leu Gly Lys Asp Val Ala Leu Leu Thr Asp 485 490 495 Gly Arg Phe Ser
Gly Gly Ser His Gly Phe Leu Ile Gly His Ile Val 500 505 510 Pro Glu
Ala Ala Glu Gly Gly Pro Ile Gly Leu Val Arg Asp Gly Asp 515 520 525
Glu Ile Ile Ile Asp Ala Asp Asn Asn Lys Ile Asp Leu Leu Val Ser 530
535 540 Asp Lys Glu Met Ala Gln Arg Lys Gln Ser Trp Val Ala Pro Pro
Pro 545 550 555 560 Arg Tyr Thr Arg Gly Thr Leu Ser Lys Tyr Ala Lys
Leu Val Ser Asn 565 570 575 Ala Ser Asn Gly Cys Val Leu Asp Ala 580
585 21592PRTPiromyces sp 21Met Ser Phe Ser Leu Ala Asn Leu Ala Ala
Lys Gly Ser Asn Leu Phe 1 5 10 15 Lys Phe Thr Pro Ala Leu Leu Ser
Ala Lys Arg Phe Gly Ser Ser Gly 20 25 30 Lys Pro Ile Asn Lys Phe
Ser Lys Ile Ile Thr Glu Pro Lys Ser Arg 35 40 45 Gly Gly Ser Gln
Ala Met Leu Ile Ala Thr Gly Ile Lys Pro Glu Asp 50 55 60 Leu Lys
Lys Pro Gln Ile Gly Ile Gly Ser Val Trp Tyr Asp Gly Asn 65 70 75 80
Pro Cys Asn Met His Leu Leu Asp Leu Gly Ser Val Val Lys Lys Ala 85
90 95 Val Gln Lys Gln Asn Met Asn Gly Met Arg Phe Asn Met Ile Gly
Val 100 105 110 Ser Asp Gly Ile Ser Asn Gly Thr Asp Gly Met Ser Phe
Ser Leu Gln 115 120 125 Ser Arg Glu Ile Ile Ala Asp Ser Ile Glu Thr
Ile Met Ser Ala Gln 130 135 140 Tyr Tyr Asp Ala Asn Ile Ser Leu Pro
Gly Cys Asp Lys Asn Met Pro 145 150 155 160 Gly Cys Leu Ile Ala Ala
Ala Arg Leu Asn Arg Pro Thr Ile Ile Ile 165 170 175 Tyr Gly Gly Thr
Ile Lys Pro Gly His Thr Lys Lys Gly Glu Thr Ile 180 185 190 Asp Leu
Val Ser Ala Phe Gln Cys Tyr Gly Gln Tyr Leu Ala Gly Glu 195 200 205
Ile Thr Glu Glu Gln Arg Glu Glu Ile Val Asn Asn Ala Cys Pro Gly 210
215 220 Ala Gly Ala Cys Gly Gly Met Tyr Thr Ala Asn Thr Met Ala Ser
Ile 225 230 235 240 Ile Glu Ser Met Gly Met Ser Leu Pro Tyr Ser Ala
Ser Thr Pro Ala 245 250 255 Glu Asp Pro Leu Lys Glu Leu Glu Cys Ile
Asn Ala Ala Ala Ala Ile 260 265 270 Lys Asn Leu Met Glu Lys Asp Ile
Lys Pro Leu Asp Ile Met Thr Arg 275 280 285 Lys Ala Phe Glu Asn Ala
Ile Thr Ile Thr Leu Ile Leu Gly Gly Ser 290 295 300 Thr Asn Ser Val
Leu His Leu Leu Ala Ile Ala Arg Ala Cys Lys Val 305 310 315 320 Pro
Leu Thr Ile Asp Asp Phe Gln Glu Phe Ser Asn Arg Ile Pro Val 325 330
335 Leu Ala Asp Leu Lys Pro Ser Gly Lys Tyr Val Met Glu Asp Leu Gln
340 345 350 Leu Ile Gly Gly Leu Pro Ala Ile Gln Lys Tyr Leu Leu Asn
Glu Gly 355 360 365 Leu Leu His Gly Asp Ile Met Thr Val Thr Gly Lys
Thr Leu Ala Glu 370 375 380 Asn Leu Lys Asp Val Ala Pro Ile Asp Phe
Glu Thr Gln Asp Ile Ile 385 390 395 400 Arg Pro Leu Ser Asn Pro Ile
Lys Lys Asn Gly His Ile Ile Ile Met 405 410 415 Lys Gly Asn Val Ser
Pro Asp Gly Gly Val Ala Lys Ile Thr Gly Lys 420 425 430 Gln Gly Leu
Phe Phe Glu Gly Val Ala Asn Cys Phe Asp Cys Glu Glu 435 440 445 Asp
Met Leu Ala Ala Leu Glu Arg Gly Glu Ile Lys Lys Gly Gln Val 450 455
460 Ile Ile Ile Arg Tyr Glu Gly Pro Thr Gly Gly Pro Gly Met Pro Glu
465 470 475 480 Met Leu Thr Pro Thr Ser Ala Ile Met Gly Ala Gly Leu
Gly Lys Asp 485 490 495 Val Ala Leu Leu Thr Asp Gly Arg Phe Ser Gly
Gly Ser His Gly Phe 500 505 510 Ile Ile Gly His Ile Thr Pro Glu Ala
Gln Val Gly Gly Pro Ile Ala 515 520 525 Leu Ile Lys Asn Gly Asp Lys
Ile Thr Ile Asp Ala Asn Lys Arg Thr 530 535 540 Ile His Ala His Val
Ser Glu Glu Glu Phe Ala Lys Arg Arg Ala Glu 545 550 555 560 Trp Lys
Ala Pro Pro Tyr Arg Ala Thr Gln Gly Thr Leu Lys Lys Tyr 565 570 575
Ile Lys Leu Val Lys Pro Ala Asn Phe Gly Cys Val Thr Asp Glu Trp 580
585 590 22587PRTRalstonia eutropha 22Met Pro Tyr Ala Asp Asp Pro
Lys Leu Pro Gln Asp Gly Ala Ala Pro 1 5 10 15 Thr Glu Gly Leu Ala
Lys Gly Leu Thr Asn Tyr Gly Asp Thr Gly Phe 20 25 30 Ser Leu Phe
Leu Arg Lys Ala Phe Ile Lys Gly Ala Gly Phe Thr Asp 35 40 45 Asp
Ala Leu Ser Arg Pro Val Ile Gly Ile Val Asn Thr Gly Ser Ser 50 55
60 Tyr Asn Pro Cys His Gly Asn Ala Pro Gln Leu Val Glu Ala Val Lys
65 70 75 80 Arg Gly Val Met Leu Ala Gly Gly Leu Pro Val Asp Phe Pro
Thr Ile 85 90 95 Ser Val His Glu Ser Phe Ser Ala Pro Thr Ser Met
Tyr Leu Arg Asn 100 105 110 Leu Met Ser Met Asp Thr Glu Glu Met Ile
Arg Ala Gln Pro Met Asp 115 120 125 Ala Val Val Leu Ile Gly Gly Cys
Asp Lys Thr Val Pro Ala Gln Leu 130 135 140 Met Gly Ala Ala Ser Ala
Gly Val Pro Ala Ile Gln Leu Val Thr Gly 145 150 155 160 Ser Met Leu
Thr Gly Ser His Arg Ser Glu Arg Val Gly Ala Cys Thr 165 170 175 Asp
Cys Arg Arg Tyr Trp Gly Arg Tyr Arg Ala Glu Glu Ile Asp Ser 180 185
190 Ala Glu Ile Ala Asp Val Asn Asn Gln Leu Val Ala Ser Val Gly Thr
195 200 205 Cys Ser Val Met Gly Thr Ala Ser Thr Met Ala Cys Val Ala
Glu Ala 210 215 220 Leu Gly Met Met Val Ser Gly Gly Ala Ser Ala Pro
Ala Val Thr Ala 225 230 235 240 Asp Arg Val Arg Val Ala Glu Arg Thr
Gly Thr Thr Ala Val Gly Met 245 250 255 Ala Ala Ala Arg Leu Thr Pro
Asp Arg Ile Leu Thr Gly Lys Ala Phe 260 265 270 Glu Asn Ala Leu Arg
Val Leu Leu Ala Ile Gly Gly Ser Thr Asn Gly 275 280 285 Ile Val His
Leu Thr Ala Ile Ala Gly Arg Leu Gly Ile Asp Ile Asp 290 295 300 Leu
Ala Gly Leu Asp Arg Met Ser Arg Glu Thr Pro Val Leu Val Asp 305 310
315 320 Leu Lys Pro Ser Gly Gln His Tyr Met Glu Asp Phe His Lys Ala
Gly 325 330 335 Gly Met Leu Thr Leu Leu Arg Glu Leu Arg Pro Leu Leu
His Leu Asp 340 345 350 Thr Leu Thr Val Ser Gly Arg Thr Leu Gly Glu
Glu Leu Asp Ala Ala 355 360 365 Pro Pro Leu Phe Pro Gln Asp Val Ile
Arg Ser Ala Gly Asn Pro Ile 370 375 380 Tyr Pro Ala Gly Gly Leu Ala
Val Leu Arg Gly Asn Leu Ala Pro Gly 385 390 395 400 Gly Ala Ile Ile
Lys Gln Ser Ala Ala Asn Pro Ala Leu Met Glu His 405 410 415 Glu Gly
Arg Ala Val Val Phe Glu Asn Ala Glu Asp Met Ala Gln Arg 420 425 430
Ile Asp Asp Glu Ser Leu Asp Val Lys Ala Asp Asp Ile Leu Val Leu 435
440 445 Lys Arg Ile Gly Pro Thr Gly Ala Pro Gly Met Pro Glu Ala Gly
Tyr 450 455 460 Met Pro Ile Pro Lys Lys Leu Ala Arg Ala Gly Val Lys
Asp Met Val 465 470 475 480 Arg Val Ser Asp Gly Arg Met Ser Gly Thr
Ala Ala Gly Thr Ile Val 485 490 495 Leu His Val Thr Pro Glu Ala Ala
Ile Gly Gly Pro Leu Ala Leu Val 500 505 510 Gln Ser Gly Asp Arg Ile
Arg Leu Ser Val Ala Asn Arg Glu Ile Ala 515 520 525 Leu Leu Val Asp
Asp Ala Glu Leu Ala Arg Arg Ala Ala Ala Gln Pro 530 535 540 Val Glu
Arg Pro Arg Ala Glu Arg Gly Tyr Arg Lys Leu Phe Leu Glu 545 550 555
560 Thr Val Thr Gln Ala Asp Gln Gly Val Asp Phe Asp Phe Leu Arg Ala
565 570 575 Ala Gln Thr Val Asp Thr Val Pro Lys Gln Gly 580 585
23581PRTChromohalobacter salexigens 23Met Thr His Lys Lys Arg Pro
Leu Arg Ser Ala Glu Trp Phe Gly Asn 1 5 10 15 Asp Asp Lys Asn Gly
Phe Met Tyr Arg Ser Trp Met Lys Asn Gln Gly 20 25 30 Ile Pro Asp
His Glu Phe Arg Gly Lys Pro Ile Ile Gly Ile Cys Asn 35 40 45 Thr
Phe Ser Glu Leu Thr Pro Cys Asn Ala His Phe Arg Lys Leu Ala 50 55
60 Glu His Val Lys Lys Gly Val Leu Glu Ala Gly Gly Tyr Pro Val Glu
65 70 75 80 Phe Pro Val Phe Ser Asn Gly Glu Ser Asn Leu Arg Pro Thr
Ala Met 85 90 95 Phe Thr Arg Asn Leu Ala Ser Met Asp Val Glu Glu
Ala Ile Arg Gly 100 105 110 Asn Pro Leu Asp Ala Val Val Leu Leu Val
Gly Cys Asp Lys Thr Thr 115 120 125 Pro Ala Leu Leu Met Gly Ala Ala
Ser Cys Asp Ile Pro Thr Ile Val 130 135 140 Val Thr Gly Gly Pro Met
Leu Asn Gly Lys His Lys Gly Arg Asp Ile 145 150 155 160 Gly Ser Gly
Thr Val Val Trp Gln Leu Ser Glu Glu Val Lys Ala Gly 165 170 175 Lys
Ile Ser Leu His Asp Phe Met Ala Ala Glu Ala Gly Met Ser Arg 180 185
190 Ser Ala Gly Thr Cys Asn Thr Met Gly Thr Ala Ser Thr Met Ala Cys
195 200 205 Met Ala Glu Ser Leu Gly Thr Ser Leu Pro His Asn Ala Ala
Ile Pro 210 215 220 Ala Val Asp Ser Arg Arg Tyr Val Leu Ala His Leu
Ser Gly Asn Arg 225 230 235 240 Ile Val Glu Met Val Asp Glu Asp Leu
Thr Leu Ser Lys Val Leu Thr 245 250 255 Lys Ser Ala Phe Glu Asn Ala
Ile Arg Thr Asn Ala Ala Ile Gly Gly 260 265 270 Ser Thr Asn Ala Val
Ile His Leu Gln Ala Ile Ala Gly Arg Met Gly 275 280 285 Val Asp Leu
Thr Leu Asp Asp Trp Thr Arg Val Gly Arg Gly Thr Pro 290 295 300 Thr
Ile Val Asp Leu Gln Pro Ser Gly Arg Tyr Leu Met Glu Glu Phe 305 310
315 320 Tyr Tyr Ala Gly Gly Leu Pro Ala Val Leu Arg Arg Leu Gly Glu
Ala 325 330 335 Asp Arg Leu Pro His Lys Asp Ala Leu Thr Val Asn Gly
Lys Thr Leu 340 345 350 Trp Glu Asn Val Gln Asp Ala Pro Leu Tyr Asn
Asp Ala Val Ile Leu 355 360 365 Pro Leu Asp Ala Pro Leu Arg Glu Asp
Gly Gly Met Cys Val Met Arg 370 375 380 Gly Asn Leu Ala Pro Asn Gly
Ala Val Leu Lys Pro Ser Ala Ala Thr 385 390 395 400 Pro Ala Leu Met
Gln His Arg Gly Arg Ala Val Val Phe Glu Asn Phe 405 410 415 Asp Asp
Tyr Lys Ala Arg Ile Asn Asp Pro Asp Leu Asp Val Thr Ala 420 425 430
Asp Asp Ile Leu Val Met Lys Asn Cys Gly Pro Arg Gly Tyr His Gly 435
440 445 Met Ala Glu Val Gly Asn Met Gly Leu Pro Ala Lys Leu Leu Glu
Gln 450 455 460 Gly Val Thr Asp Met Val Arg Ile Ser Asp Ala Arg Met
Ser Gly Thr 465 470 475 480 Ala Tyr Gly Thr Val Val Leu His Val Ala
Pro Glu Ala Ala Ala Gly 485 490 495 Gly Pro Leu Ala Ala Val Arg Asn
Gly Asp Trp Ile Ala Leu Asp Ala 500 505 510 Tyr Ser Gly Lys Leu His
Leu Glu Val Asp Asp Ala Glu Ile Ala Ser 515 520 525 Arg Leu Ala Glu
Ala Asp Pro Thr Ala Glu Ser Thr Arg Ile Ala Ser 530 535 540 Thr Gly
Gly Tyr Arg Gln Leu Tyr Ile Glu His Val Leu Gln Ala Asp 545 550 555
560 Gln Gly Cys Asp Phe Asp Phe Leu Val Gly Cys Arg Gly Ala Glu Val
565 570 575 Pro Arg His Ser His 580 24329PRTPicrophilus torridus
24Met Glu Lys Val Tyr Thr Glu Asn Asp Leu Lys Glu Asn Leu Met Arg 1
5 10 15 Asn Lys Lys Ile Ala Val Leu Gly Tyr Gly Ser Gln Gly Arg Ala
Trp 20 25 30 Ala Leu Asn Met Arg Asp Ser Gly Leu Asn Val Thr Val
Gly Leu Glu 35 40 45 Arg Gln Gly Lys Ser Trp Glu Lys Ala Val Ala
Asp Gly Phe Lys Pro 50 55 60 Leu Lys Ser Arg Asp Ala Val Arg Asp
Ala Asp Ala Val Ile Phe Leu 65 70
75 80 Val Pro Asp Met Ala Gln Arg Glu Leu Tyr Lys Asn Ile Met Asn
Asp 85 90 95 Ile Lys Asp Asp Ala Asp Ile Val Phe Ala His Gly Phe
Asn Val His 100 105 110 Tyr Gly Leu Ile Asn Pro Lys Asn His Asp Val
Tyr Met Val Ala Pro 115 120 125 Lys Ala Pro Gly Pro Ser Val Arg Glu
Phe Tyr Glu Arg Gly Gly Gly 130 135 140 Val Pro Val Leu Ile Ala Val
Ala Asn Asp Val Ser Gly Arg Ser Lys 145 150 155 160 Glu Lys Ala Leu
Ser Ile Ala Tyr Ser Leu Gly Ala Leu Arg Ala Gly 165 170 175 Ala Ile
Glu Thr Thr Phe Lys Glu Glu Thr Glu Thr Asp Leu Ile Gly 180 185 190
Glu Gln Leu Asp Leu Val Gly Gly Ile Thr Glu Leu Leu Arg Ser Thr 195
200 205 Phe Asn Ile Met Val Glu Met Gly Tyr Lys Pro Glu Met Ala Tyr
Phe 210 215 220 Glu Ala Ile Asn Glu Met Lys Leu Ile Val Asp Gln Val
Phe Glu Lys 225 230 235 240 Gly Ile Ser Gly Met Leu Arg Ala Val Ser
Asp Thr Ala Lys Tyr Gly 245 250 255 Gly Leu Thr Thr Gly Lys Tyr Ile
Ile Asn Asp Asp Val Arg Lys Arg 260 265 270 Met Arg Glu Arg Ala Glu
Tyr Ile Val Ser Gly Lys Phe Ala Glu Glu 275 280 285 Trp Ile Glu Glu
Tyr Gly Glu Gly Ser Lys Asn Leu Glu Ser Met Met 290 295 300 Leu Asp
Ile Asp Asn Ser Leu Glu Glu Gln Val Gly Lys Gln Leu Arg 305 310 315
320 Glu Ile Val Leu Arg Gly Arg Pro Lys 325 25560PRTPicrophilus
torridus 25Met Asn Pro Asp Lys Lys Lys Arg Ser Asn Leu Ile Tyr Gly
Gly Tyr 1 5 10 15 Glu Lys Ala Pro Asn Arg Ala Phe Leu Lys Ala Met
Gly Leu Thr Asp 20 25 30 Asp Asp Ile Ala Lys Pro Ile Val Gly Val
Ala Val Ala Trp Asn Glu 35 40 45 Ala Gly Pro Cys Asn Ile His Leu
Leu Gly Leu Ser Asn Ile Val Lys 50 55 60 Glu Gly Val Arg Ser Gly
Gly Gly Thr Pro Arg Val Phe Thr Ala Pro 65 70 75 80 Val Val Ile Asp
Gly Ile Ala Met Gly Ser Glu Gly Met Lys Tyr Ser 85 90 95 Leu Val
Ser Arg Glu Ile Val Ala Asn Thr Val Glu Leu Val Val Asn 100 105 110
Ala His Gly Tyr Asp Gly Phe Val Ala Leu Ala Gly Cys Asp Lys Thr 115
120 125 Pro Pro Gly Met Met Met Ala Met Ala Arg Leu Asn Ile Pro Ser
Ile 130 135 140 Ile Met Tyr Gly Gly Thr Thr Leu Pro Gly Asn Phe Lys
Gly Lys Pro 145 150 155 160 Ile Thr Ile Gln Asp Val Tyr Glu Ala Val
Gly Ala Tyr Ser Lys Gly 165 170 175 Lys Ile Thr Ala Glu Asp Leu Arg
Leu Met Glu Asp Asn Ala Ile Pro 180 185 190 Gly Pro Gly Thr Cys Gly
Gly Leu Tyr Thr Ala Asn Thr Met Gly Leu 195 200 205 Met Thr Glu Ala
Leu Gly Leu Ala Leu Pro Gly Ser Ala Ser Pro Pro 210 215 220 Ala Val
Asp Ser Ala Arg Val Lys Tyr Ala Tyr Glu Thr Gly Lys Ala 225 230 235
240 Leu Met Asn Leu Ile Glu Ile Gly Leu Lys Pro Arg Asp Ile Leu Thr
245 250 255 Phe Glu Ala Phe Glu Asn Ala Ile Thr Val Leu Met Ala Ser
Gly Gly 260 265 270 Ser Thr Asn Ala Val Leu His Leu Leu Ala Ile Ala
Tyr Glu Ala Gly 275 280 285 Val Lys Leu Thr Leu Asp Asp Phe Asp Arg
Ile Ser Gln Arg Thr Pro 290 295 300 Glu Ile Val Asn Met Lys Pro Gly
Gly Glu Tyr Ala Met Tyr Asp Leu 305 310 315 320 His Arg Val Gly Gly
Ala Pro Leu Ile Met Lys Lys Leu Leu Glu Ala 325 330 335 Asp Leu Leu
His Gly Asp Val Ile Thr Val Thr Gly Lys Thr Val Lys 340 345 350 Gln
Asn Leu Glu Glu Tyr Lys Leu Pro Asn Val Pro His Glu His Ile 355 360
365 Val Arg Pro Ile Ser Asn Pro Phe Asn Pro Thr Gly Gly Ile Arg Ile
370 375 380 Leu Lys Gly Ser Leu Ala Pro Glu Gly Ala Val Ile Lys Val
Ser Ala 385 390 395 400 Thr Lys Val Arg Tyr His Lys Gly Pro Ala Arg
Val Phe Asn Ser Glu 405 410 415 Glu Glu Ala Phe Lys Ala Val Leu Glu
Glu Lys Ile Gln Glu Asn Asp 420 425 430 Val Val Val Ile Arg Tyr Glu
Gly Pro Lys Gly Gly Pro Gly Met Arg 435 440 445 Glu Met Leu Ala Val
Thr Ser Ala Ile Val Gly Gln Gly Leu Gly Glu 450 455 460 Lys Val Ala
Leu Ile Thr Asp Gly Arg Phe Ser Gly Ala Thr Arg Gly 465 470 475 480
Ile Met Val Gly His Val Ala Pro Glu Ala Ala Val Gly Gly Pro Ile 485
490 495 Ala Leu Leu Arg Asp Gly Asp Thr Ile Ile Ile Asp Ala Asn Asn
Gly 500 505 510 Arg Leu Asp Val Asp Leu Pro Gln Glu Glu Leu Lys Lys
Arg Ala Asp 515 520 525 Glu Trp Thr Pro Pro Pro Pro Lys Tyr Lys Ser
Gly Leu Leu Ala Gln 530 535 540 Tyr Ala Arg Leu Val Ser Ser Ser Ser
Leu Gly Ala Val Leu Leu Thr 545 550 555 560 26566PRTArtificial
SequenceSaccharomyces cerevisiae ILV3deltaN 26Met Lys Lys Leu Asn
Lys Tyr Ser Tyr Ile Ile Thr Glu Pro Lys Gly 1 5 10 15 Gln Gly Ala
Ser Gln Ala Met Leu Tyr Ala Thr Gly Phe Lys Lys Glu 20 25 30 Asp
Phe Lys Lys Pro Gln Val Gly Val Gly Ser Cys Trp Trp Ser Gly 35 40
45 Asn Pro Cys Asn Met His Leu Leu Asp Leu Asn Asn Arg Cys Ser Gln
50 55 60 Ser Ile Glu Lys Ala Gly Leu Lys Ala Met Gln Phe Asn Thr
Ile Gly 65 70 75 80 Val Ser Asp Gly Ile Ser Met Gly Thr Lys Gly Met
Arg Tyr Ser Leu 85 90 95 Gln Ser Arg Glu Ile Ile Ala Asp Ser Phe
Glu Thr Ile Met Met Ala 100 105 110 Gln His Tyr Asp Ala Asn Ile Ala
Ile Pro Ser Cys Asp Lys Asn Met 115 120 125 Pro Gly Val Met Met Ala
Met Gly Arg His Asn Arg Pro Ser Ile Met 130 135 140 Val Tyr Gly Gly
Thr Ile Leu Pro Gly His Pro Thr Cys Gly Ser Ser 145 150 155 160 Lys
Ile Ser Lys Asn Ile Asp Ile Val Ser Ala Phe Gln Ser Tyr Gly 165 170
175 Glu Tyr Ile Ser Lys Gln Phe Thr Glu Glu Glu Arg Glu Asp Val Val
180 185 190 Glu His Ala Cys Pro Gly Pro Gly Ser Cys Gly Gly Met Tyr
Thr Ala 195 200 205 Asn Thr Met Ala Ser Ala Ala Glu Val Leu Gly Leu
Thr Ile Pro Asn 210 215 220 Ser Ser Ser Phe Pro Ala Val Ser Lys Glu
Lys Leu Ala Glu Cys Asp 225 230 235 240 Asn Ile Gly Glu Tyr Ile Lys
Lys Thr Met Glu Leu Gly Ile Leu Pro 245 250 255 Arg Asp Ile Leu Thr
Lys Glu Ala Phe Glu Asn Ala Ile Thr Tyr Val 260 265 270 Val Ala Thr
Gly Gly Ser Thr Asn Ala Val Leu His Leu Val Ala Val 275 280 285 Ala
His Ser Ala Gly Val Lys Leu Ser Pro Asp Asp Phe Gln Arg Ile 290 295
300 Ser Asp Thr Thr Pro Leu Ile Gly Asp Phe Lys Pro Ser Gly Lys Tyr
305 310 315 320 Val Met Ala Asp Leu Ile Asn Val Gly Gly Thr Gln Ser
Val Ile Lys 325 330 335 Tyr Leu Tyr Glu Asn Asn Met Leu His Gly Asn
Thr Met Thr Val Thr 340 345 350 Gly Asp Thr Leu Ala Glu Arg Ala Lys
Lys Ala Pro Ser Leu Pro Glu 355 360 365 Gly Gln Glu Ile Ile Lys Pro
Leu Ser His Pro Ile Lys Ala Asn Gly 370 375 380 His Leu Gln Ile Leu
Tyr Gly Ser Leu Ala Pro Gly Gly Ala Val Gly 385 390 395 400 Lys Ile
Thr Gly Lys Glu Gly Thr Tyr Phe Lys Gly Arg Ala Arg Val 405 410 415
Phe Glu Glu Glu Gly Ala Phe Ile Glu Ala Leu Glu Arg Gly Glu Ile 420
425 430 Lys Lys Gly Glu Lys Thr Val Val Val Ile Arg Tyr Glu Gly Pro
Arg 435 440 445 Gly Ala Pro Gly Met Pro Glu Met Leu Lys Pro Ser Ser
Ala Leu Met 450 455 460 Gly Tyr Gly Leu Gly Lys Asp Val Ala Leu Leu
Thr Asp Gly Arg Phe 465 470 475 480 Ser Gly Gly Ser His Gly Phe Leu
Ile Gly His Ile Val Pro Glu Ala 485 490 495 Ala Glu Gly Gly Pro Ile
Gly Leu Val Arg Asp Gly Asp Glu Ile Ile 500 505 510 Ile Asp Ala Asp
Asn Asn Lys Ile Asp Leu Leu Val Ser Asp Lys Glu 515 520 525 Met Ala
Gln Arg Lys Gln Ser Trp Val Ala Pro Pro Pro Arg Tyr Thr 530 535 540
Arg Gly Thr Leu Ser Lys Tyr Ala Lys Leu Val Ser Asn Ala Ser Asn 545
550 555 560 Gly Cys Val Leu Asp Ala 565 2711PRTArtificial
SequenceConserved DHAD motif 27Pro Xaa Xaa Xaa Xaa Gly Xaa Xaa Xaa
Ile Leu 1 5 10 2811PRTArtificial SequenceConserved DHAD motif 28Pro
Ile Lys Xaa Xaa Gly Xaa Xaa Xaa Ile Leu 1 5 10 2936DNAArtificial
SequencePrimer 387 29gtcacagtcg acatggctaa ctacttcaat acactg
363033DNAArtificial SequencePrimer 388 30gcataaggat ccttaacccg
caacagcaat acg 333136DNAArtificial SequencePrimer 410 31gactttgtcg
acatgcttta cccagaaaaa tttcag 363241DNAArtificial SequencePrimer 411
32ctaatagcgg ccgcctattt atggaatttc ttatcataat c 413335DNAArtificial
SequencePrimer 637 33ttttgagctc gccgatccca ttaccgacat ttggg
353496DNAArtificial SequencePrimer 638 34aaagtcgaca ccgatatacc
tgtatgtgtc accaccaatg tatctataag tatccatgct 60agccctaggt ttatgtgatg
attgattgat tgattg 963536DNAArtificial SequencePrimer 697
35gagtacggat ccctagagag ctttcgtttt catgag 363637DNAArtificial
SequencePrimer 767 36caagaagtcg acatgttgac aaaagcaaca aaagaac
373732DNAArtificial SequencePrimer 1149 37cgcttactcg agatgggccg
cgatgaattc gc 323833DNAArtificial SequencePrimer 1150 38gcataaagat
ctttaacccg caacagcaat acg 333935DNAArtificial SequencePrimer 1151
39agacgtgtcg acatgactgg catgactgat gcaga 354034DNAArtificial
SequencePrimer 1152 40gtttagggat cctcatccac ccaacttcga tttg
344131DNAArtificial SequencePrimer 1006 41gtagaagacg tcacctggta
gaccaaagat g 314232DNAArtificial SequencePrimer 1009 42catcgtgacg
tcgctcaatt gactgctgct ac 324332DNAArtificial SequencePrimer 1016
43actaagcgac acgtgcggtt tctgtggtat ag 324436DNAArtificial
SequencePrimer 1017 44gaaaccgcac gtgtcgctta gtttacattt ctttcc
36451647DNAArtificial SequenceL. lactis kivD (codon optimized for
E. coli) in pGV1590 45atgtatactg ttggtgatta tctgctggac cgtctgcatg
aactgggtat cgaagaaatc 60ttcggcgttc cgggtgatta caatctgcag ttcctggatc
agatcatctc tcataaagac 120atgaaatggg tgggtaacgc taacgaactg
aacgcaagct acatggcaga tggttatgca 180cgtaccaaga aagccgcggc
atttctgacc actttcggtg ttggcgaact gagcgccgtc 240aacggtctgg
cgggctccta cgccgaaaac ctgccggtgg tggagatcgt aggcagccca
300acgagcaaag ttcagaacga aggtaaattc gtccaccaca ctctggctga
cggcgatttc 360aaacacttca tgaaaatgca tgaacctgtg actgcggcac
gtacgctgct gactgcagag 420aacgctactg tggaaatcga ccgcgttctg
tctgcgctgc tgaaagaacg caaaccagtt 480tacatcaacc tgcctgtgga
tgttgcggca gctaaagcgg aaaaaccgag cctgccgctg 540aagaaagaaa
actccacttc taacactagc gaccaggaaa tcctgaacaa aatccaggag
600tctctgaaaa acgcaaagaa accaatcgtg atcaccggcc acgaaatcat
ttcttttggt 660ctggagaaga ccgtgaccca attcatcagc aaaaccaaac
tgccgattac caccctgaac 720ttcggcaagt cctctgttga cgaggctctg
ccgtctttcc tgggcatcta caacggtact 780ctgagcgaac cgaacctgaa
agaatttgtt gaatctgcgg acttcatcct gatgctgggc 840gttaaactga
ccgactcttc taccggtgca ttcactcacc atctgaacga aaacaaaatg
900attagcctga acatcgacga gggtaaaatc ttcaacgagc gtatccagaa
cttcgacttc 960gaaagcctga tcagctctct gctggacctg tccgaaatcg
agtataaagg caaatacatt 1020gacaaaaagc aagaagattt cgtaccatct
aacgcactgc tgtcccagga tcgcctgtgg 1080caggccgtgg agaacctgac
ccagagcaat gaaaccatcg tggcggaaca aggtacgagc 1140tttttcggcg
cgtcttctat ctttctgaaa tccaaaagcc attttatcgg tcagccgctg
1200tggggtagca ttggctatac tttcccggca gcgctgggct ctcagatcgc
tgataaagaa 1260tctcgtcatc tgctgttcat cggtgacggt tccctgcagc
tgaccgtaca ggaactgggt 1320ctggcaattc gtgaaaagat caacccgatt
tgcttcatta ttaacaatga cggctacacc 1380gttgagcgtg agatccacgg
tccgaaccag tcttacaacg atatccctat gtggaactac 1440tctaaactgc
cggagtcctt cggcgcaact gaggaccgtg ttgtgtctaa aattgtgcgt
1500accgaaaacg aatttgtgag cgtgatgaaa gaggcccagg ccgatccgaa
ccgtatgtac 1560tggatcgaac tgatcctggc gaaagaaggc gcaccgaagg
tactgaagaa aatgggcaag 1620ctgtttgctg aacagaataa atcctaa
1647461086DNAArtificial SequenceS. cerevisiae ADH7 in pGV1590
46atgctttacc cagaaaaatt tcagggcatc ggtatttcca acgcaaagga ttggaagcat
60cctaaattag tgagttttga cccaaaaccc tttggcgatc atgacgttga tgttgaaatt
120gaagcctgtg gtatctgcgg atctgatttt catatagccg ttggtaattg
gggtccagtc 180ccagaaaatc aaatccttgg acatgaaata attggccgcg
tggtgaaggt tggatccaag 240tgccacactg gggtaaaaat cggtgaccgt
gttggtgttg gtgcccaagc cttggcgtgt 300tttgagtgtg aacgttgcaa
aagtgacaac gagcaatact gtaccaatga ccacgttttg 360actatgtgga
ctccttacaa ggacggctac atttcacaag gaggctttgc ctcccacgtg
420aggcttcatg aacactttgc tattcaaata ccagaaaata ttccaagtcc
gctagccgct 480ccattattgt gtggtggtat tacagttttc tctccactac
taagaaatgg ctgtggtcca 540ggtaagaggg taggtattgt tggcatcggt
ggtattgggc atatggggat tctgttggct 600aaagctatgg gagccgaggt
ttatgcgttt tcgcgaggcc actccaagcg ggaggattct 660atgaaactcg
gtgctgatca ctatattgct atgttggagg ataaaggctg gacagaacaa
720tactctaacg ctttggacct tcttgtcgtt tgctcatcat ctttgtcgaa
agttaatttt 780gacagtatcg ttaagattat gaagattgga ggctccatcg
tttcaattgc tgctcctgaa 840gttaatgaaa agcttgtttt aaaaccgttg
ggcctaatgg gagtatcaat ctcaagcagt 900gctatcggat ctaggaagga
aatcgaacaa ctattgaaat tagtttccga aaagaatgtc 960aaaatatggg
tggaaaaact tccgatcagc gaagaaggcg tcagccatgc ctttacaagg
1020atggaaagcg gagacgtcaa atacagattt actttggtcg attatgataa
gaaattccat 1080aaatag 1086471716DNAArtificial SequenceB. subtilis
alsS in pGV1726 47atgttgacaa aagcaacaaa agaacaaaaa tcccttgtga
aaagcagagg ggcggagctt 60gttgttgatt gcttagcgga gcaaggtgtc acacatgtat
ttggcattcc aggtgcaaaa 120attgatgcgg tatttgacgc tttacaagat
aaagggcctg aaattatcgt tgcccggcat 180gaacaaaatg cagcatttat
ggcgcaagca gtcggccgtt taactggaaa accgggagtc 240gtgttagtca
catcaggacc aggtgcttcg aacttggcaa caggactgct gacagcaaac
300actgaaggtg accctgtcgt tgcgcttgct gggaacgtga tccgtgcaga
tcgtttaaaa 360cggacacatc aatctttgga taatgcggcg ctattccagc
cgattacaaa atacagtgta 420gaagttcaag atgtaaaaaa tataccggaa
gctgttacaa atgcgtttag gatagcgtca 480gcagggcagg ctggggccgc
ttttgtgagt tttccgcaag atgttgtgaa tgaagtcaca 540aatacaaaaa
acgtacgtgc tgtcgcagcg ccaaaacttg gtcccgcagc agatgacgca
600atcagtatgg ccattgcaaa aattcaaaca gcaaaacttc ctgtcgtttt
agtcggcatg 660aagggcggaa gaccggaagc gattaaagcg gttcgcaagc
tattgaaaaa agtgcagctt 720ccattcgttg aaacatatca agctgccggt
actcttacga gagatttaga ggatcagtat 780tttggccgga tcggtttatt
ccgcaaccag cctggcgatc tgctgcttga gcaggctgat 840gttgttctga
caatcggcta tgacccaatt gaatatgatc cgaaattctg gaatgtcaat
900ggagaccgga cgatcatcca tttagacgag attctggctg acattgatca
tgcttaccag 960ccggatcttg aactgatcgg tgatattcca tctacgatca
atcatatcga acacgatgct 1020gtgaaagtag actttgcgga acgtgagcag
aagatccttt ctgatttaaa acaatatatg 1080catgagggtg agcaggtgcc
tgcagattgg aaatcagaca gagtgcatcc
tcttgaaatc 1140gttaaagaat tgcgaaacgc agtcgatgat catgttacag
tgacttgcga tatcggttca 1200cacgcgattt ggatgtcacg ttatttccgc
agctacgagc cgttaacatt aatgattagt 1260aacggtatgc aaacactcgg
cgttgcgctt ccttgggcaa tcggcgcttc attggtgaaa 1320ccgggagaaa
aagtagtatc agtctccggt gatggcggtt tcttattctc agctatggaa
1380ttagagacag cagttcgttt aaaagcacca attgtacaca ttgtatggaa
cgacagcaca 1440tatgacatgg ttgcattcca gcaattgaaa aaatataatc
gtacatctgc ggtcgatttc 1500ggaaatatcg atatcgtgaa atacgcggaa
agcttcggag caactggctt acgcgtagaa 1560tcaccagacc agctggcaga
tgttctgcgt caaggcatga acgctgaggg gcctgtcatc 1620attgatgtcc
cggttgacta cagtgataac gttaatttag caagtgacaa gcttccgaaa
1680gaattcgggg aactcatgaa aacgaaagct ctctag 1716481410DNAArtificial
SequenceE. coli ilvCdeltaN in pGV1727 48atgggccgcg atgaattcgc
cgatggcgcg agctaccttc agggtaaaaa agtagtcatc 60gtcggctgtg gcgcacaggg
tctgaaccag ggcctgaaca tgcgtgattc tggtctcgat 120atctcctacg
ctctgcgtaa agaagcgatt gccgagaagc gcgcgtcctg gcgtaaagcg
180accgaaaatg gttttaaagt gggtacttac gaagaactga tcccacaggc
ggatctggtg 240attaacctga cgccggacaa gcagcactct gatgtagtgc
gcaccgtaca gccactgatg 300aaagacggcg cggcgctggg ctactcgcac
ggtttcaaca tcgtcgaagt gggcgagcag 360atccgtaaag atatcaccgt
agtgatggtt gcgccgaaat gcccaggcac cgaagtgcgt 420gaagagtaca
aacgtgggtt cggcgtaccg acgctgattg ccgttcaccc ggaaaacgat
480ccgaaaggcg aaggcatggc gattgccaaa gcctgggcgg ctgcaaccgg
tggtcaccgt 540gcgggtgtgc tggaatcgtc cttcgttgcg gaagtgaaat
ctgacctgat gggcgagcaa 600accatcctgt gcggtatgtt gcaggctggc
tctctgctgt gcttcgacaa gctggtggaa 660gaaggtaccg atccagcata
cgcagaaaaa ctgattcagt tcggttggga aaccatcacc 720gaagcactga
aacagggcgg catcaccctg atgatggacc gtctctctaa cccggcgaaa
780ctgcgtgctt atgcgctttc tgaacagctg aaagagatca tggcacccct
gttccagaaa 840catatggacg acatcatctc cggcgaattc tcttccggta
tgatggcgga ctgggccaac 900gatgataaga aactgctgac ctggcgtgaa
gagaccggca aaaccgcgtt tgaaaccgcg 960ccgcagtatg aaggcaaaat
cggcgagcag gagtacttcg ataaaggcgt actgatgatt 1020gcgatggtga
aagcgggcgt tgaactggcg ttcgaaacca tggtcgattc cggcatcatt
1080gaagagtctg catattatga atcactgcac gagctgccgc tgattgccaa
caccatcgcc 1140cgtaagcgtc tgtacgaaat gaacgtggtt atctctgata
ccgctgagta cggtaactat 1200ctgttctctt acgcttgtgt gccgttgctg
aaaccgttta tggcagagct gcaaccgggc 1260gacctgggta aagctattcc
ggaaggcgcg gtagataacg ggcaactgcg tgatgtgaac 1320gaagcgattc
gcagccatgc gattgagcag gtaggtaaga aactgcgcgg ctatatgaca
1380gatatgaaac gtattgctgt tgcgggttaa 1410491782DNAArtificial
SequenceE. coli ilvDdeltaN (codon optimized for K. lactis) in
pGV1727 49atgactggca tgactgatgc agatttcgga aagccaatca ttgccgtcgt
caactctttt 60acacaattcg ttccgggtca tgtccatttg cgtgatctag gtaagcttgt
tgccgaacaa 120attgaagctg caggtggtgt cgcaaaagag tttaatacta
ttgctgtgga cgacggtata 180gctatggggc atggcggtat gttatactct
ttaccatcga gagaattaat tgcagactca 240gtcgaatata tggttaatgc
tcattgtgcc gatgcaatgg tttgtatctc taattgtgat 300aagataacgc
ctggtatgtt gatggcgtcc ttgagattga acatcccagt aatcttcgta
360tctggcggcc caatggaggc tggtaaaact aagttaagtg atcagatcat
caaacttgat 420cttgtggatg caatgattca aggtgcagat ccaaaagttt
cagactcgca gtcagaccaa 480gttgaaagaa gtgcatgtcc aacttgtggt
tcttgcagtg gaatgttcac ggctaactct 540atgaattgct tgactgaagc
tctaggttta tctcaaccag gaaatggttc attattagcg 600acccatgcag
acagaaagca attgttctta aatgccggaa aaagaattgt ggaactaacg
660aaaaggtatt acgaacaaaa tgatgaatca gcattaccga ggaatatagc
ttcaaaggct 720gcattcgaaa atgccatgac attggatatt gcaatgggtg
gtagtacaaa cacggtctta 780catcttctag ctgcagccca agaagctgag
atagatttca ccatgtctga tatcgacaag 840ctttcacgta aggttccaca
gttatgtaag gttgcaccat caactcaaaa gtatcacatg 900gaagacgttc
atcgtgcagg aggggttatt ggtattttag gggagttgga cagagccggt
960cttttaaaca gggatgtgaa gaatgtattg ggtttaacac ttccacagac
attagagcaa 1020tacgatgtca tgttaactca agatgatgcc gtgaaaaaca
tgttcagggc aggtccagca 1080gggatcagaa ccacccaagc attctcgcaa
gactgtaggt gggacacttt ggacgatgat 1140agagcaaatg gatgtataag
atcgcttgag catgcttata gtaaggatgg tggtttagca 1200gtattatatg
gaaacttcgc tgaaaatggt tgcattgtga aaactgctgg tgtagatgat
1260agtattttga aatttactgg acccgctaaa gtttacgaaa gtcaagacga
tgctgttgag 1320gctatacttg gcggaaaggt ggtagcagga gacgtggtag
tgataagata tgagggacca 1380aagggaggac caggtatgca ggaaatgctt
tacccaactt catttttgaa gtccatggga 1440ctaggaaaag cttgtgccct
tatcactgac ggtagattct ctggtggcac ttcgggttta 1500agtatcggtc
acgtatcacc agaggcagct tctggtggtt cgattggatt gattgaagat
1560ggagatttga tcgccataga tatcccaaat agaggtatcc aattacaagt
ctcagacgct 1620gaattggctg caagaagaga agcacaagat gccagaggag
ataaggcttg gactcctaaa 1680aatagagaac gtcaagtaag tttcgccctt
agggcttatg cttcattggc tacttcagcc 1740gataaggggg cagtaagaga
caaatcgaag ttgggtggat ga 17825039DNAArtificial SequencePrimer 575
50ttttgaattc tggttctatc gaggagaaaa agcgacaag 395135DNAArtificial
SequencePrimer 576 51ttttggatcc ggatgtgaag tcgttgacac agtcg
355222DNAArtificial SequencePrimer 1623 52gtctctgata aggaaatggc tc
225360DNAArtificial SequencePrimer 1886 53tcaagaagcc tcaagtcggg
gttggttcct gttggtggtc cggtaaccca tgtaacatgc 605460DNAArtificial
SequencePrimer 1887 54cggtaaccca tgtaacatgc atctattgga cttgaataac
attctggttc tatcgaggag 605560DNAArtificial SequencePrimer 1888
55ctttcgttaa caagcccatc tctacttttt tcttggctgt atccggatgt gaagtcgttg
605660DNAArtificial SequencePrimer 1889 56gatgggcttg ttaacgaaag
ttgctacatc tagacaattc tgcattatag gccccaatcg 605720DNAArtificial
SequencePrimer 1890 57ttagtggcag caaagcagag 205820DNAArtificial
SequencePrimer 1892 58acatgatgcc cgttcacaac 205920DNAArtificial
SequencePrimer 1916 59caggatgaca gttcgatgag 206020DNAArtificial
SequencePrimer 1917 60tgtcaacgac ttcacatccg 206120DNAArtificial
SequencePrimer 1920 61tgcagcctag ctttgaagac 206220DNAArtificial
SequencePrimer 1921 62tacgttagga ccccagtatc 206367DNAArtificial
SequencePrimer 271 63ctagcatgga acaaaaactc atctcagaag aagatggtgt
cgacgaattc ccgggatccg 60cggccgc 676467DNAArtificial SequencePrimer
272 64tcgagcggcc gcggatcccg ggaattcgtc gacaccatct tcttctgaga
tgagtttttg 60ttccatg 676534DNAArtificial SequencePrimer 421
65gccaacggat cctcaagcat ctaaaacaca accg 346636DNAArtificial
SequencePrimer 551 66gctcatgtcg acatgaagaa gctcaacaag tactcg
366735DNAArtificial SequencePrimer 1617 67cgttgagtcg acatgggctt
gttaacgaaa gttgc 356834DNAArtificial SequencePrimer 1618
68gccaacggat cctcaagcat ctaaaacaca accg 34696654DNAArtificial
SequencepGV1730 69caggcaagtg cacaaacaat acttaaataa atactactca
gtaataacct atttcttagc 60atttttgacg aaatttgcta ttttgttaga gtcttttaca
ccatttgtct ccacacctcc 120gcttacatca acaccaataa cgccatttaa
tctaagcgca tcaccaacat tttctggcgt 180cagtccacca gctaacataa
aatgtaagct ttcggggctc tcttgccttc caacccagtc 240agaaatcgag
ttccaatcca aaagttcacc tgtcccacct gcttctgaat caaacaaggg
300aataaacgaa tgaggtttct gtgaagctgc actgagtagt atgttgcagt
cttttggaaa 360tacgagtctt ttaataactg gcaaaccgag gaactcttgg
tattcttgcc acgactcatc 420tccatgcagt tggacgatat caatgccgta
atcattgacc agagccaaaa catcctcctt 480aggttgatta cgaaacacgc
caaccaagta tttcggagtg cctgaactat ttttatatgc 540ttttacaaga
cttgaaattt tccttgcaat aaccgggtca attgttctct ttctattggg
600cacacatata atacccagca agtcagcatc ggaatctaga gcacattctg
cggcctctgt 660gctctgcaag ccgcaaactt tcaccaatgg accagaacta
cctgtgaaat taataacaga 720catactccaa gctgcctttg tgtgcttaat
cacgtatact cacgtgctca atagtcacca 780atgccctccc tcttggccct
ctccttttct tttttcgacc gaattaattc ttaatcggca 840aaaaaagaaa
agctccggat caagattgta cgtaaggtga caagctattt ttcaataaag
900aatatcttcc actactgcca tctggcgtca taactgcaaa gtacacatat
attacgatgc 960tgtctattaa atgcttccta tattatatat atagtaatgt
cgttgacgtc gccggcagga 1020gagtgaaaga gccttgttta tatatttttt
tttcctatgt tcaacgagga cagctaggtt 1080tatgcaaaaa tgtgccatca
ccataagctg attcaaatga gctaaaaaaa aaatagttag 1140aaaataaggt
ggtgttgaac gatagcaagt agatcaagac accgtctaac agaaaaaggg
1200gcagcggaca atattatgca attatgaaga aaagtactca aagggtcgga
aaaatattca 1260aacgatattt gcattaaatc ctcaattgat tgattattcc
atagtaaaat accgtaacaa 1320cacaaaattg ttctcaaatt cataaattat
tcattttttc cacgagcctc atcacacgaa 1380aagtcagaag agcatacata
atcttttaaa tgcataggtt atgcattttg caaatgccac 1440caggcaacaa
aaatatgcgt ttagcgggcg gaatcgggaa ggaagccgga accaccaaaa
1500actggaagct acgtttttaa ggaaggtatg ggtgcagtgt gcttatctca
agaaatatta 1560gttatgatat aaggtgttga agtttagaga taggtaaata
aacgcggggt gtgtttatta 1620catgaagaag aagttagttt ctgccttgct
tgtttatctt gcacatcaca tcagcggaac 1680atatgctcac ccagtcgcga
catccaattt atagaaatca gcttgtgggt attgttcaga 1740gaatttttca
atcattggag caatcatttt acatggaccg caccaagtgg cgtagaaatc
1800tacgacaact agcttgtctt gagcaattgc agagtcgaat tcgctggcag
ttttgaattg 1860agtaaccatt atttgtatcg aggtgtctag tcttctatta
cactaatgca gtttcagggt 1920tttggaaacc acactgttta aacagtgttc
cttaatcaag gatacctctt tttttttcct 1980tggttccact aattcatcgg
tttttttttt ggaagacatc ttttccaacg aaaagaatat 2040acatatcgtt
taagagaaat tctccaaatt tgtaaagaag cggacccaga cttaagccta
2100accaggccaa ttcaacagac tgtcggcaac ttcttgtctg gtctttccat
ggtaagtgac 2160agtgcagtaa taatatgaac caatttattt ttcgttacat
aaaaatgctt ataaaacttt 2220aactaataat tagagattaa atcgcggccg
cggatcccta gagagctttc gttttcatga 2280gttccccgaa ttctttcgga
agcttgtcac ttgctaaatt aacgttatca ctgtagtcaa 2340ccgggacatc
aatgatgaca ggcccctcag cgttcatgcc ttgacgcaga acatctgcca
2400gctggtctgg tgattctacg cgtaagccag ttgctccgaa gctttccgcg
tatttcacga 2460tatcgatatt tccgaaatcg accgcagatg tacgattata
ttttttcaat tgctggaatg 2520caaccatgtc atatgtgctg tcgttccata
caatgtgtac aattggtgct tttaaacgaa 2580ctgctgtctc taattccata
gctgagaata agaaaccgcc atcaccggag actgatacta 2640ctttttctcc
cggtttcacc aatgaagcgc cgattgccca aggaagcgca acgccgagtg
2700tttgcatacc gttactaatc attaatgtta acggctcgta gctgcggaaa
taacgtgaca 2760tccaaatcgc gtgtgaaccg atatcgcaag tcactgtaac
atgatcatcg actgcgtttc 2820gcaattcttt aacgatttca agaggatgca
ctctgtctga tttccaatct gcaggcacct 2880gctcaccctc atgcatatat
tgttttaaat cagaaaggat cttctgctca cgttccgcaa 2940agtctacttt
cacagcatcg tgttcgatat gattgatcgt agatggaata tcaccgatca
3000gttcaagatc cggctggtaa gcatgatcaa tgtcagccag aatctcgtct
aaatggatga 3060tcgtccggtc tccattgaca ttccagaatt tcggatcata
ttcaattggg tcatagccga 3120ttgtcagaac aacatcagcc tgctcaagca
gcagatcgcc aggctggttg cggaataaac 3180cgatccggcc aaaatactga
tcctctaaat ctctcgtaag agtaccggca gcttgatatg 3240tttcaacgaa
tggaagctgc acttttttca atagcttgcg aaccgcttta atcgcttccg
3300gtcttccgcc cttcatgccg actaaaacga caggaagttt tgctgtttga
atttttgcaa 3360tggccatact gattgcgtca tctgctgcgg gaccaagttt
tggcgctgcg acagcacgta 3420cgttttttgt atttgtgact tcattcacaa
catcttgcgg aaaactcaca aaagcggccc 3480cagcctgccc tgctgacgct
atcctaaacg catttgtaac agcttccggt atatttttta 3540catcttgaac
ttctacactg tattttgtaa tcggctggaa tagcgccgca ttatccaaag
3600attgatgtgt ccgttttaaa cgatctgcac ggatcacgtt cccagcaagc
gcaacgacag 3660ggtcaccttc agtgtttgct gtcagcagtc ctgttgccaa
gttcgaagca cctggtcctg 3720atgtgactaa cacgactccc ggttttccag
ttaaacggcc gactgcttgc gccataaatg 3780ctgcattttg ttcatgccgg
gcaacgataa tttcaggccc tttatcttgt aaagcgtcaa 3840ataccgcatc
aatttttgca cctggaatgc caaatacatg tgtgacacct tgctccgcta
3900agcaatcaac aacaagctcc gcccctctgc ttttcacaag ggatttttgt
tcttttgttg 3960cttttgtcaa catgtcgact ttatgtgatg attgattgat
tgattgtaca gtttgttttt 4020cttaatatct atttcgatga cttctatatg
atattgcact aacaagaaga tattataatg 4080caattgatac aagacaagga
gttatttgct tctcttttat atgattctga caatccatat 4140tgcgttggta
gtcttttttg ctggaacggt tcagcggaaa agacgcatcg ctctttttgc
4200ttctagaaga aatgccagca aaagaatctc ttgacagtga ctgacagcaa
aaatgtcttt 4260ttctaactag taacaaggct aagatatcag cctgaaataa
agggtggtga agtaataatt 4320aaatcatccg tataaaccta tacacatata
tgaggaaaaa taatacaaaa gtgttttaaa 4380tacagataca tacatgaaca
tatgcacgta tagcgcccaa atgtcggtaa tgggatcggc 4440gagctccagc
ttttgttccc tttagtgagg gttaattgcg cgcttggcgt aatcatggtc
4500atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca
taggagccgg 4560aagcataaag tgtaaagcct ggggtgccta atgagtgagg
taactcacat taattgcgtt 4620gcgctcactg cccgctttcc agtcgggaaa
cctgtcgtgc cagctgcatt aatgaatcgg 4680ccaacgcgcg gggagaggcg
gtttgcgtat tgggcgctct tccgcttcct cgctcactga 4740ctcgctgcgc
tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat
4800acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa
aaggccagca 4860aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt
ttccataggc tccgcccccc 4920tgacgagcat cacaaaaatc gacgctcaag
tcagaggtgg cgaaacccga caggactata 4980aagataccag gcgtttcccc
ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 5040gcttaccgga
tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc
5100acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct
gtgtgcacga 5160accccccgtt cagcccgacc gctgcgcctt atccggtaac
tatcgtcttg agtccaaccc 5220ggtaagacac gacttatcgc cactggcagc
agccactggt aacaggatta gcagagcgag 5280gtatgtaggc ggtgctacag
agttcttgaa gtggtggcct aactacggct acactagaag 5340gacagtattt
ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag
5400ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt
gcaagcagca 5460gattacgcgc agaaaaaaag gatctcaaga agatcctttg
atcttttcta cggggtctga 5520cgctcagtgg aacgaaaact cacgttaagg
gattttggtc atgagattat caaaaaggat 5580cttcacctag atccttttaa
attaaaaatg aagttttaaa tcaatctaaa gtatatatga 5640gtaaacttgg
tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg
5700tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta
cgatacggga 5760gggcttacca tctggcccca gtgctgcaat gataccgcga
gacccacgct caccggctcc 5820agatttatca gcaataaacc agccagccgg
aagggccgag cgcagaagtg gtcctgcaac 5880tttatccgcc tccatccagt
ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 5940agttaatagt
ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc
6000gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta
catgatcccc 6060catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg
atcgttgtca gaagtaagtt 6120ggccgcagtg ttatcactca tggttatggc
agcactgcat aattctctta ctgtcatgcc 6180atccgtaaga tgcttttctg
tgactggtga gtactcaacc aagtcattct gagaatagtg 6240tatgcggcga
ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag
6300cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac
tctcaaggat 6360cttaccgctg ttgagatcca gttcgatgta acccactcgt
gcacccaact gatcttcagc 6420atcttttact ttcaccagcg tttctgggtg
agcaaaaaca ggaaggcaaa atgccgcaaa 6480aaagggaata agggcgacac
ggaaatgttg aatactcata ctcttccttt ttcaatatta 6540ttgaagcatt
tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa
6600aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgt
6654701728DNAArtificial SequenceB. subtilis alsS in pGV1730
70gtcgacatgt tgactaaagc tacaaaagag cagaaatcat tggtgaaaaa taggggtgca
60gaacttgttg tggactgttt ggtagaacag ggcgtaacac atgtttttgg tatcccaggt
120gcaaaaatcg acgccgtgtt tgatgcatta caagacaagg gtccagaaat
tattgttgct 180agacatgagc aaaatgccgc atttatggcg caagctgtag
gtaggcttac aggtaaacct 240ggtgttgtcc tagttacgtc tggcccagga
gcctccaatt tagcaactgg tctattgaca 300gctaatactg agggagatcc
tgtagttgcg ttagccggta atgtaattag agctgatagg 360cttaagagaa
ctcaccagtc tctagacaac gctgctttat tccaaccgat caccaagtac
420tcagtagagg tacaagacgt aaagaatata cctgaagctg tgacaaacgc
atttcgtata 480gcttctgctg gtcaggctgg tgccgcgttt gtttcttttc
ctcaagacgt tgtcaatgaa 540gtgaccaata ctaaaaacgt tagagcggtt
gcagccccta aactaggtcc agccgcagac 600gacgcaatta gcgctgcaat
tgctaaaatt cagacggcga aactaccagt agtccttgtc 660ggtatgaagg
gcggaagacc agaagcaata aaagctgttc gtaagttatt gaagaaagtc
720caattacctt tcgttgagac ttaccaagca gcaggtactt tatctagaga
tttagaggat 780cagtattttg gaaggatagg tctatttaga aaccaaccag
gagatttact attagaacaa 840gctgatgttg tacttactat cggttatgat
cctatagagt atgacccaaa gttttggaac 900ataaatgggg atagaacaat
tatacatcta gacgagataa tcgccgacat cgatcacgct 960tatcaaccag
atttagaact aatcggagat atcccgtcaa caatcaatca tattgaacat
1020gatgctgtaa aggttgagtt cgctgaacgt gagcagaaaa tcttatctga
tctaaagcaa 1080tatatgcatg agggtgaaca agttccagca gactggaaat
ctgaccgtgc acatcctttg 1140gaaatcgtta aggaactaag aaatgcggtc
gatgatcatg tgactgttac atgtgatatc 1200ggttcacatg caatttggat
gtcacgttat tttaggagct acgaaccatt aactttaatg 1260atatctaacg
ggatgcaaac tctgggggtt gcacttcctt gggctattgg cgctagttta
1320gttaagcccg gtgagaaggt ggtatcggta tcaggtgatg gtggctttct
gttttcggct 1380atggaattag aaactgcagt ccgtttaaaa gctcccattg
tgcatattgt ctggaatgat 1440tctacttacg acatggttgc ttttcaacag
ttgaagaaat acaatagaac ttcggctgta 1500gactttggta acatcgatat
tgtgaaatat gctgagtctt ttggcgcaac aggcctgagg 1560gtggaaagtc
cagatcagtt agctgatgtg ttgagacaag ggatgaatgc cgagggaccg
1620gtaatcatag atgtgccagt tgactactca gacaatatta atttggcttc
tgataaactt 1680cctaaagagt ttggcgagct aatgaagacc aaagccttat aaggatcc
1728711698DNATrichoderma atroviride 71gtcgacatga caaaggatac
cgttgacatt ttgattgatt ctttaaaagc agcaggtgta 60aaatatgttt tcggcgttcc
gggagcgaaa attgactccg tgtttaatgc cctaatcgat 120catccagaca
tcaagttagt tgtatgtaga cacgaacaaa acgccgcctt tatcgcagca
180gctatgggta aggttaccgg tagacctggt gtctgcatcg ctacaagtgg
gcctgggact 240tctaatttgg ttacaggcct ggttacagcg accgacgaag
gggcgccggt tgttgctata 300gtgggttcag ttaaacgtag tcaatcatta
caaagaactc atcagtcgct aaggggagcc 360gacctgttgg ctcccgttac
caagaaggtg gtaagtgccg ttgtcgaaga tcaagttgcc 420gaaatcatgt
tggatgcatt tcgtgttgca gctgcttccc ctccaggcgc taccgctgtg
480tctcttccca tcgatctgat gacgccagcc aaatctactt ctaccgttac
ggccttccca 540gctgaatgtt tcatacctcc aaaatacggc aaaagccctg
aaactacatt acaagccgca 600gccgatttga taagcgccgc caaagctcca
gttctattct tagggatgcg tgttagcgag 660tctgacgata caattagcgc
agtacacggt tttcttcgta agcatcctgt tccagttgtg 720gaaacctttc
aagctgcagg cgcgatttcc aaagagctag tgcacttatt ttatggtaga
780atcggtttat tttctaatca accgggtgat caattgctac aacatgcgga
cctagtaata 840gcgatcggct tagatcaagc tgagtatgac gctaatatgt
ggaacgccag aggcacaaca 900attttacatg tcgatataca accagcggac
tttgttgctc attataaacc taagatcgag 960ctggtcggtt cactagcaga
caacatgaca gatttgactt ctaggttgga tacggtcgct 1020aggctacaat
taacgaaacc tggtgaagcc attagaacca acatgtggga atggcaaaat
1080tccccggaag cctccggtag atcaacgggt cctgttcatc cattgcactt
tattagacta 1140tttcaatcca ttattgaccc gagcaccact gtaattagtg
atgtaggtag tgtgtatatc 1200tggttgtgca gatacttcta ctcttacgct
cgtagaactt tcctgatgag taacgtgcag 1260caaacacttg gagtcgctat
gccttgggcg ataggggtat ctttatctca gacgccacct 1320agtagtaaga
aagttgtatc cattagcggt gatggtggtt ttatgttctc ttcacaagag
1380ttggtgacag ctgttcaaca aggttgcaac atcactcatt ttatatggaa
cgatggaaaa 1440tataacatgg tggaatttca agaagttaat aagtatggta
ggtcatccgg cgtggatcta 1500ggtggagtgg attttgtaaa gttagctgat
agtatgggag ccaagggttt aagagtatca 1560agtgctggcg atcttgaagc
cgtaatgaag gaagcattag catacgacgg tgtatgtttg 1620gttgacatag
aaattgacta ctctcaaaac cataacttaa tgatggattt ggtaacatcc
1680gatgtatctt aaggatcc 1698721720DNATalaromyces stipitatus
72agtcgacatg tctaacagga acccttctca cgtgatagtg gagtcattat ctaatgccgg
60cgttaagata gttttcggga taccaggtgc aaaggtcgat ggtatctttg atgcattgtc
120agatcatcct actatcaagt tgattgtgtg tagacatgaa cagaacgctg
cctttatggc 180agccgcagtt ggacgtctta ctggcgcccc gggtgtctgc
ttagtaacga gtgggcctgg 240aacttctaat ttggtaaccg gtttagctac
tgccactaca gaaggtgatc ctgttttagc 300aatagctgga acagtctcta
gattgcaagc agctaggcat actcatcaaa gtttagatgt 360taacaaagta
ttagaagggg tctgtaagag tgtaatacaa gtcggggtgg aagatcaagt
420gagtgaagta atcgctaatg cttttagaca tgcgaggcaa ttcccacaag
gagccaccgc 480agttgcgctg ccaatggata taataaaatc tacttccgtg
ggtgtgccac cttttccatc 540tctatcattc gaggcaccag gttatggtag
ttccaatacg aaactttgta aagtagcggt 600cgataaacta attgcggcga
aatatccagt gatactgctg ggaatgagat cctcagaccc 660tgagattgta
gcttcagtcc gtcgtatgat aaaagatcat accttgcctg tagttgaaac
720ttttcaagct gcgggagcca tctcagaaga tttgcttcat agatactatg
ggagggtggg 780tttattccgt aatcaacctg gtgacaaagt actagcaaga
gcagacctga ttattgcagt 840tggctacgat ccatacgaat atgatgcaga
aacatggaat gtcaataatc cagcaaccat 900acacaacatt attcacattg
attacacaca ttccagggtg tcacaacact atatgcctca 960tgttgagcta
ctgggaaacc cagcggatat cgtcgatgaa ttgacggcca gtttacaggc
1020cctaaaacca aacttttggt ctggggctga agatacctta gaaaatatta
ggcaagaaat 1080agctcgttgt gaagccactg ccactcatac tgaatctttg
caagatggcg cggttcagcc 1140tactcacttc gtatatcaat tgaggcatct
gttaccaaag gaaactattg ttgctgttga 1200tgtaggaacc gtctatatct
acatgatgag atacttccaa acctattcac cgagacactt 1260gctgtgtagt
aatggacaac aaactttggg agttggtttg ccttgggcta tagctgcttc
1320actaattcaa gaacctcctt gcagtaggaa ggttgtctct atatctggtg
atggcgggtt 1380tatgtttagt agccaagaac tggctacggc agtcttgcaa
aagtgtaaca taacccattt 1440tatttggaat gacagcggct acaacatggt
tgaatttcaa gaggaggcta agtatggtcg 1500tagctctggt ataaaactag
gcggtattga tttcgtcaaa tttgcagagg ctttcgacgg 1560tgcgcgtgga
ttccgtataa acagcaccaa agaagttaag gaggtcatta aagaggcact
1620agcctttgaa ggcgttgcta tagttgatgt cagaatcgat tattctagga
gtcatgaatt 1680aatgaaagat attattccaa aggactacca ataaggatcc
1720731635DNAArtificial SequencePiromyces ilvD with predicted MTS
removed 73atgggtagtc aagcgatgtt aatcgcaact ggtataaaac cagaagattt
aaaaaagcca 60cagatcggca taggcagtgt ttggtatgat ggaaatccat gcaacatgca
tctattggat 120cttggctccg tggtaaaaaa ggccgttcaa aaacaaaata
tgaatggtat gagattcaat 180atgattggag tgtcagacgg gatctccaac
ggtacggatg gaatgtcctt ttctttgcag 240tcccgtgaaa ttattgcgga
ttctatcgaa acaatcatgt ctgcacaata ttatgatgct 300aacatcagct
tacctggctg cgacaagaac atgcctggtt gtttaatcgc cgctgccaga
360ttgaacagac cgactataat tatctacggt ggcacgatca agcccggaca
tacaaaaaag 420ggagagacga ttgatttagt ctcggccttc caatgttatg
ggcaatactt ggctggagaa 480attactgaag agcaaagaga agaaatagtg
aataatgcat gtcctggcgc aggtgcatgc 540ggtggaatgt atacagctaa
tacaatggct tccataatcg aatcaatggg tatgagttta 600ccttactccg
cctcgacccc ggcagaagac ccattgaaag agcttgaatg tataaacgcg
660gcagctgcaa ttaagaattt aatggaaaaa gacatcaagc cattagacat
aatgacaaga 720aaagcgtttg agaacgctat aactattact ttgattcttg
gagggagtac aaactccgtt 780ctgcaccttt tggctatcgc tagggcctgc
aaagtcccat taactattga cgatttccag 840gaattttcta ataggatacc
cgttttagcc gacttaaaac ctagtggtaa atatgtcatg 900gaagatttgc
agttgatcgg cggtcttcca gctattcaga aatatcttct gaatgaaggt
960ctacttcatg gtgatattat gactgttacc ggaaagaccc tagcagagaa
tttgaaagac 1020gttgctccaa tcgattttga aactcaagat ataattagac
ctttatcgaa tcccattaaa 1080aagaatggtc acattatcat tatgaaaggt
aacgtctctc cggacggtgg tgttgctaaa 1140attacaggta agcagggatt
gtttttcgaa ggcgtggcga attgctttga ttgtgaagaa 1200gacatgttag
ctgcactgga aagaggcgaa attaaaaaag gtcaagtgat tataataagg
1260tatgaaggcc ccactggagg gcctggtatg ccggagatgc taactccgac
cagtgctatt 1320atgggtgctg ggttaggaaa agatgtagca ctattaacag
atggcagatt ttcaggcggg 1380tcacacggct tcattattgg tcatattacg
cctgaggcac aagtaggtgg tccaattgcc 1440ctaatcaaaa acggtgataa
gataactata gacgcgaata aacgtaccat acatgcccat 1500gtcagcgaag
aagaatttgc taaaagacgt gccgagtgga aagcaccacc ttacagagct
1560actcaaggta ctttaaagaa atacattaag ctggttaaac ccgcaaactt
tggatgtgtt 1620accgatgagt ggtaa 16357424DNAArtificial
SequencePrimer 351 74cttcttgctc attagaaaga aagc 247521DNAArtificial
SequencePrimer 1625 75caaggttacg gtcaaggttt g 217622DNAArtificial
SequencePrimer 1626 76cattggttcc ggttacgttt ac 227746DNAArtificial
SequencePrimer 1615 77caactcgcgg ccgcggatcc taggttattg gttttctggt
ctcaac 467832DNAArtificial SequencePrimer 1616 78cgccgactcg
agatgttgag aactcaagcc gc 327944DNAArtificial SequencePrimer 1809
79cgccgactcg aggtcgacat gggtttgaag caaatcaact tcgg
44807564DNAArtificial SequencepGV1354 80tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg
tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accataaacg acattactat atatataata taggaagcat ttaatagaca
gcatcgtaat 240atatgtgtac tttgcagtta tgacgccaga tggcagtagt
ggaagatatt ctttattgaa 300aaatagcttg tcaccttacg tacaatcttg
atccggagct tttctttttt tgccgattaa 360gaattaattc ggtcgaaaaa
agaaaaggag agggccaaga gggagggcat tggtgactat 420tgagcacgtg
agtatacgtg attaagcaca caaaggcagc ttggagtatg tctgttatta
480atttcacagg tagttctggt ccattggtga aagtttgcgg cttgcagagc
acagaggccg 540cagaatgtgc tctagattcc gatgctgact tgctgggtat
tatatgtgtg cccaatagaa 600agagaacaat tgacccggtt attgcaagga
aaatttcaag tcttgtaaaa gcatataaaa 660atagttcagg cactccgaaa
tacttggttg gcgtgtttcg taatcaacct aaggaggatg 720ttttggctct
ggtcaatgat tacggcattg atatcgtcca actgcatgga gatgagtcgt
780ggcaagaata ccaagagttc ctcggtttgc cagttattaa aagactcgta
tttccaaaag 840actgcaacat actactcagt gcagcttcac agaaacctca
ttcgtttatt cccttgtttg 900attcagaagc aggtgggaca ggtgaacttt
tggattggaa ctcgatttct gactgggttg 960gaaggcaaga gagccccgaa
agcttacatt ttatgttagc tggtggactg acgccagaaa 1020atgttggtga
tgcgcttaga ttaaatggcg ttattggtgt tgatgtaagc ggaggtgtgg
1080agacaaatgg tgtaaaagac tctaacaaaa tagcaaattt cgtcaaaaat
gctaagaaat 1140aggttattac tgagtagtat ttatttaagt attgtttgtg
cacttgccta tgcggtgtga 1200aataccgcac agatgcgtaa ggagaaaata
ccgcatcagg aaattgtaaa cgttaatatt 1260ttgttaaaat tcgcgttaaa
tttttgttaa atcagctcat tttttaacca ataggccgaa 1320atcggcaaaa
tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca
1380gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg
gcgaaaaacc 1440gtctatcagg gcgatggccc actacgtgaa ccatcaccct
aatcaagttt tttggggtcg 1500aggtgccgta aagcactaaa tcggaaccct
aaagggagcc cccgatttag agcttgacgg 1560ggaaagccgg cgaacgtggc
gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg 1620gcgctggcaa
gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg
1680ccgctacagg gcgcgtcgcg ccattcgcca ttcaggctgc gcaactgttg
ggaagggcga 1740tcggtgcggg cctcttcgct attacgccag ctggcgaaag
ggggatgtgc tgcaaggcga 1800ttaagttggg taacgccagg gttttcccag
tcacgacgtt gtaaaacgac ggccagtgag 1860cgcgcgtaat acgactcact
atagggcgaa ttgggtaccg gccgcaaatt aaagccttcg 1920agcgtcccaa
aaccttctca agcaaggttt tcagtataat gttacatgcg tacacgcgtc
1980tgtacagaaa aaaaagaaaa atttgaaata taaataacgt tcttaatact
aacataacta 2040taaaaaaata aatagggacc tagacttcag gttgtctaac
tccttccttt tcggttagag 2100cggatgtggg gggagggcgt gaatgtaagc
gtgacataac taattacatg actcgagcgg 2160ccgcggatcc ttattggttt
tctggtctca actttctgac ttccttacca accttccaga 2220tttccatgtt
tctgatggtg tctaattcct tttctagctt ttctctgtag tcaggttgag
2280agttgaattc caaagatctc ttggtttcgg taccgttctt ggtagattcg
tacaagtctt 2340ggaaaacagg cttcaaagca ttcttgaaga ttgggtacca
gtccaaagca cctcttctgg 2400cggtggtgga acaagcatcg tacatgtaat
ccataccgta cttaccgatc aatgggtata 2460gagattgggt agcttcttcg
acggtttcgt tgaaagcttc agatggggag tgaccgtttt 2520ctctcaagac
gtcgtattga gccaagaaca taccgtggat accacccatt aaacaacctc
2580tttcaccgta caagtcagag ttgacttctc tttcgaaagt ggtttggtaa
acgtaaccgg 2640aaccaatggc aacggccaaa gcttgggcct tttcgtgagc
cttaccggtg acatcgttcc 2700agacggcgta agaagagtta ataccacgac
cttccttgaa caaagatctg acagttctac 2760cggaaccctt tggagcaacc
aagataacat ctaagtcctt tggtggttca acgtgagtca 2820agtccttgaa
gactggggag aaaccgtggg agaagtacaa agtcttaccc ttggtcaaca
2880atggcttgat agcaggccag gtttctgatt gagcggcatc ggacaacaag
ttcataacgt 2940aactacctct cttgatagca tcttcaacag tgaacaagtt
cttgcctgga acccaaccgt 3000cttcgatggc agccttccaa gaagcaccat
ctttacggac accaatgata acgttcaaac 3060cgttgtctct caagttcaaa
ccttgaccgt aaccttggga accgtaaccg atcaaagcaa 3120aagtgtcgtt
cttgaagtag tccaacaact tttctcttgg ccagtcagct ctttcgtaga
3180cggtttcaac agtaccaccg aagttgattt gcttcaacat gtcgacacca
tcttcttctg 3240agatgagttt ttgttccatg ctagttctag aatccgtcga
aactaagttc tggtgtttta 3300aaactaaaaa aaagactaac tataaaagta
gaatttaaga agtttaagaa atagatttac 3360agaattacaa tcaataccta
ccgtctttat atacttatta gtcaagtagg ggaataattt 3420cagggaactg
gtttcaacct tttttttcag ctttttccaa atcagagaga gcagaaggta
3480atagaaggtg taagaaaatg agatagatac atgcgtgggt caattgcctt
gtgtcatcat 3540ttactccagg caggttgcat cactccattg aggttgtgcc
cgttttttgc ctgtttgtgc 3600ccctgttctc tgtagttgcg ctaagagaat
ggacctatga actgatggtt ggtgaagaaa 3660acaatatttt ggtgctggga
ttcttttttt ttctggatgc cagcttaaaa agcgggctcc 3720attatattta
gtggatgcca ggaataaact gttcacccag acacctacga tgttatatat
3780tctgtgtaac ccgcccccta ttttgggcat gtacgggtta cagcagaatt
aaaaggctaa 3840ttttttgact aaataaagtt aggaaaatca ctactattaa
ttatttacgt attctttgaa 3900atggcgagta ttgataatga taaactgagc
tagatctggg cccgagctcc agcttttgtt 3960ccctttagtg agggttaatt
gcgcgcttgg cgtaatcatg gtcatagctg tttcctgtgt 4020gaaattgtta
tccgctcaca attccacaca acataggagc cggaagcata aagtgtaaag
4080cctggggtgc ctaatgagtg aggtaactca cattaattgc gttgcgctca
ctgcccgctt 4140tccagtcggg aaacctgtcg tgccagctgc attaatgaat
cggccaacgc gcggggagag 4200gcggtttgcg tattgggcgc tcttccgctt
cctcgctcac tgactcgctg cgctcggtcg 4260ttcggctgcg gcgagcggta
tcagctcact caaaggcggt aatacggtta tccacagaat 4320caggggataa
cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta
4380aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag
catcacaaaa 4440atcgacgctc aagtcagagg tggcgaaacc cgacaggact
ataaagatac caggcgtttc 4500cccctggaag ctccctcgtg cgctctcctg
ttccgaccct gccgcttacc ggatacctgt 4560ccgcctttct cccttcggga
agcgtggcgc tttctcatag ctcacgctgt aggtatctca 4620gttcggtgta
ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg
4680accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga
cacgacttat 4740cgccactggc agcagccact ggtaacagga ttagcagagc
gaggtatgta ggcggtgcta 4800cagagttctt gaagtggtgg cctaactacg
gctacactag aaggacagta tttggtatct 4860gcgctctgct gaagccagtt
accttcggaa aaagagttgg tagctcttga tccggcaaac 4920aaaccaccgc
tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa
4980aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag
tggaacgaaa 5040actcacgtta agggattttg gtcatgagat tatcaaaaag
gatcttcacc tagatccttt 5100taaattaaaa atgaagtttt aaatcaatct
aaagtatata tgagtaaact tggtctgaca 5160gttaccaatg cttaatcagt
gaggcaccta tctcagcgat ctgtctattt cgttcatcca 5220tagttgcctg
actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc
5280ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta
tcagcaataa 5340accagccagc cggaagggcc gagcgcagaa gtggtcctgc
aactttatcc gcctccatcc 5400agtctattaa ttgttgccgg gaagctagag
taagtagttc gccagttaat agtttgcgca 5460acgttgttgc cattgctaca
ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 5520tcagctccgg
ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag
5580cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca
gtgttatcac 5640tcatggttat ggcagcactg cataattctc ttactgtcat
gccatccgta agatgctttt 5700ctgtgactgg tgagtactca accaagtcat
tctgagaata gtgtatgcgg cgaccgagtt 5760gctcttgccc ggcgtcaata
cgggataata ccgcgccaca tagcagaact ttaaaagtgc 5820tcatcattgg
aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat
5880ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt
actttcacca 5940gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc
aaaaaaggga ataagggcga 6000cacggaaatg ttgaatactc atactcttcc
tttttcaata ttattgaagc atttatcagg 6060gttattgtct catgagcgga
tacatatttg aatgtattta gaaaaataaa caaatagggg 6120ttccgcgcac
atttccccga aaagtgccac ctgaacgaag catctgtgct tcattttgta
6180gaacaaaaat gcaacgcgag agcgctaatt tttcaaacaa agaatctgag
ctgcattttt 6240acagaacaga aatgcaacgc gaaagcgcta ttttaccaac
gaagaatctg tgcttcattt 6300ttgtaaaaca aaaatgcaac gcgagagcgc
taatttttca aacaaagaat ctgagctgca 6360tttttacaga acagaaatgc
aacgcgagag cgctatttta ccaacaaaga atctatactt 6420cttttttgtt
ctacaaaaat gcatcccgag agcgctattt ttctaacaaa gcatcttaga
6480ttactttttt tctcctttgt gcgctctata atgcagtctc ttgataactt
tttgcactgt 6540aggtccgtta aggttagaag aaggctactt tggtgtctat
tttctcttcc ataaaaaaag 6600cctgactcca cttcccgcgt ttactgatta
ctagcgaagc tgcgggtgca ttttttcaag 6660ataaaggcat ccccgattat
attctatacc gatgtggatt gcgcatactt tgtgaacaga 6720aagtgatagc
gttgatgatt cttcattggt cagaaaatta tgaacggttt cttctatttt
6780gtctctatat actacgtata ggaaatgttt acattttcgt attgttttcg
attcactcta 6840tgaatagttc ttactacaat ttttttgtct aaagagtaat
actagagata aacataaaaa 6900atgtagaggt cgagtttaga tgcaagttca
aggagcgaaa ggtggatggg taggttatat 6960agggatatag cacagagata
tatagcaaag agatactttt gagcaatgtt tgtggaagcg 7020gtattcgcaa
tattttagta gctcgttaca gtccggtgcg tttttggttt tttgaaagtg
7080cgtcttcaga gcgcttttgg ttttcaaaag cgctctgaag ttcctatact
ttctagagaa 7140taggaacttc ggaataggaa cttcaaagcg tttccgaaaa
cgagcgcttc cgaaaatgca 7200acgcgagctg cgcacataca gctcactgtt
cacgtcgcac ctatatctgc gtgttgcctg 7260tatatatata tacatgagaa
gaacggcata gtgcgtgttt atgcttaaat gcgtacttat 7320atgcgtctat
ttatgtagga tgaaaggtag tctagtacct cctgtgatat tatcccattc
7380catgcggggt atcgtatgct tccttcagca ctacccttta gctgttctat
atgctgccac 7440tcctcaattg gattagtctc atccttcaat gctatcattt
cctttgatat tggatcatat 7500taagaaacca ttattatcat gacattaacc
tataaaaata ggcgtatcac gaggcccttt 7560cgtc 7564817955DNAArtificial
SequencepGV1662 81ttggatcata ctaagaaacc attattatca tgacattaac
ctataaaaat aggcgtatca 60cgaggccctt tcgtctcgcg cgtttcggtg atgacggtga
aaacctctga cacatgcagc 120tcccggagac ggtcacagct tgtctgtaag
cggatgccgg gagcagacaa gcccgtcagg 180gcgcgtcagc gggtgttggc
gggtgtcggg gctggcttaa ctatgcggca tcagagcaga 240ttgtactgag
agtgcaccat accacagctt ttcaattcaa ttcatcattt tttttttatt
300cttttttttg atttcggttt ctttgaaatt tttttgattc ggtaatctcc
gaacagaagg 360aagaacgaag gaaggagcac agacttagat tggtatatat
acgcatatgt agtgttgaag 420aaacatgaaa ttgcccagta ttcttaaccc
aactgcacag aacaaaaacc tgcaggaaac 480gaagataaat catgtcgaaa
gctacatata aggaacgtgc tgctactcat cctagtcctg 540ttgctgccaa
gctatttaat atcatgcacg aaaagcaaac aaacttgtgt gcttcattgg
600atgttcgtac caccaaggaa ttactggagt tagttgaagc attaggtccc
aaaatttgtt 660tactaaaaac acatgtggat atcttgactg atttttccat
ggagggcaca gttaagccgc 720taaaggcatt atccgccaag tacaattttt
tactcttcga agacagaaaa tttgctgaca 780ttggtaatac agtcaaattg
cagtactctg cgggtgtata cagaatagca gaatgggcag 840acattacgaa
tgcacacggt gtggtgggcc caggtattgt tagcggtttg aagcaggcgg
900cagaagaagt aacaaaggaa cctagaggcc ttttgatgtt agcagaattg
tcatgcaagg 960gctccctatc tactggagaa tatactaagg gtactgttga
cattgcgaag agcgacaaag 1020attttgttat cggctttatt gctcaaagag
acatgggtgg aagagatgaa ggttacgatt 1080ggttgattat gacacccggt
gtgggtttag atgacaaggg agacgcattg ggtcaacagt 1140atagaaccgt
ggatgatgtg gtctctacag gatctgacat tattattgtt ggaagaggac
1200tatttgcaaa gggaagggat gctaaggtag agggtgaacg ttacagaaaa
gcaggctggg 1260aagcatattt gagaagatgc ggccagcaaa actaaaaaac
tgtattataa gtaaatgcat 1320gtatactaaa ctcacaaatt agagcttcaa
tttaattata tcagttatta ccctatgcgg 1380tgtgaaatac cgcacagatg
cgtaaggaga aaataccgca tcaggaaatt gtaaacgtta 1440atattttgtt
aaaattcgcg ttaaattttt gttaaatcag ctcatttttt aaccaatagg
1500ccgaaatcgg caaaatccct tataaatcaa aagaatagac cgagataggg
ttgagtgttg 1560ttccagtttg gaacaagagt ccactattaa agaacgtgga
ctccaacgtc aaagggcgaa 1620aaaccgtcta tcagggcgat ggcccactac
gtgaaccatc accctaatca agttttttgg 1680ggtcgaggtg ccgtaaagca
ctaaatcgga accctaaagg gagcccccga tttagagctt 1740gacggggaaa
gccggcgaac gtggcgagaa aggaagggaa gaaagcgaaa ggagcgggcg
1800ctagggcgct ggcaagtgta gcggtcacgc tgcgcgtaac caccacaccc
gccgcgctta 1860atgcgccgct acagggcgcg tcgcgccatt cgccattcag
gctgcgcaac tgttgggaag 1920ggcgatcggt gcgggcctct tcgctattac
gccagctggc gaaaggggga tgtgctgcaa 1980ggcgattaag ttgggtaacg
ccagggtttt cccagtcacg acgttgtaaa acgacggcca 2040gtgagcgcgc
gtaatacgac tcactatagg gcgaattggg taccggccgc aaattaaagc
2100cttcgagcgt cccaaaacct tctcaagcaa ggttttcagt ataatgttac
atgcgtacac 2160gcgtctgtac
agaaaaaaaa gaaaaatttg aaatataaat aacgttctta atactaacat
2220aactataaaa aaataaatag ggacctagac ttcaggttgt ctaactcctt
ccttttcggt 2280tagagcggat gtggggggag ggcgtgaatg taagcgtgac
ataactaatt acatgactcg 2340agcggccgcg gatccttagg atttattctg
ttcagcaaac agcttgccca ttttcttcag 2400taccttcggt gcgccttctt
tcgccaggat cagttcgatc cagtacatac ggttcggatc 2460ggcctgggcc
tctttcatca cgctcacaaa ttcgttttcg gtacgcacaa ttttagacac
2520aacacggtcc tcagttgcgc cgaaggactc cggcagttta gagtagttcc
acatagggat 2580atcgttgtaa gactggttcg gaccgtggat ctcacgctca
acggtgtagc cgtcattgtt 2640aataatgaag caaatcgggt tgatcttttc
acgaattgcc agacccagtt cctgtacggt 2700cagctgcagg gaaccgtcac
cgatgaacag cagatgacga gattctttat cagcgatctg 2760agagcccagc
gctgccggga aagtatagcc aatgctaccc cacagcggct gaccgataaa
2820atggcttttg gatttcagaa agatagaaga cgcgccgaaa aagctcgtac
cttgttccgc 2880cacgatggtt tcattgctct gggtcaggtt ctccacggcc
tgccacaggc gatcctggga 2940cagcagtgcg ttagatggta cgaaatcttc
ttgctttttg tcaatgtatt tgcctttata 3000ctcgatttcg gacaggtcca
gcagagagct gatcaggctt tcgaagtcga agttctggat 3060acgctcgttg
aagattttac cctcgtcgat gttcaggcta atcattttgt tttcgttcag
3120atggtgagtg aatgcaccgg tagaagagtc ggtcagttta acgcccagca
tcaggatgaa 3180gtccgcagat tcaacaaatt ctttcaggtt cggttcgctc
agagtaccgt tgtagatgcc 3240caggaaagac ggcagagcct cgtcaacaga
ggacttgccg aagttcaggg tggtaatcgg 3300cagtttggtt ttgctgatga
attgggtcac ggtcttctcc agaccaaaag aaatgatttc 3360gtggccggtg
atcacgattg gtttctttgc gtttttcaga gactcctgga ttttgttcag
3420gatttcctgg tcgctagtgt tagaagtgga gttttctttc ttcagcggca
ggctcggttt 3480ttccgcttta gctgccgcaa catccacagg caggttgatg
taaactggtt tgcgttcttt 3540cagcagcgca gacagaacgc ggtcgatttc
cacagtagcg ttctctgcag tcagcagcgt 3600acgtgccgca gtcacaggtt
catgcatttt catgaagtgt ttgaaatcgc cgtcagccag 3660agtgtggtgg
acgaatttac cttcgttctg aactttgctc gttgggctgc ctacgatctc
3720caccaccggc aggttttcgg cgtaggagcc cgccagaccg ttgacggcgc
tcagttcgcc 3780aacaccgaaa gtggtcagaa atgccgcggc tttcttggta
cgtgcataac catctgccat 3840gtagcttgcg ttcagttcgt tagcgttacc
cacccatttc atgtctttat gagagatgat 3900ctgatccagg aactgcagat
tgtaatcacc cggaacgccg aagatttctt cgatacccag 3960ttcatgcaga
cggtccagca gataatcacc aacagtatac atgtcgacaa acttagatta
4020gattgctatg ctttctttct aatgagcaag aagtaaaaaa agttgtaata
gaacaagaaa 4080aatgaaactg aaacttgaga aattgaagac cgtttattaa
cttaaatatc aatgggaggt 4140catcgaaaga gaaaaaaatc aaaaaaaaaa
ttttcaagaa aaagaaacgt gataaaaatt 4200tttattgcct ttttcgacga
agaaaaagaa acgaggcggt ctcttttttc ttttccaaac 4260ctttagtacg
ggtaattaac gacaccctag aggaagaaag aggggaaatt tagtatgctg
4320tgcttgggtg ttttgaagtg gtacggcgat gcgcggagtc cgagaaaatc
tggaagagta 4380aaaaaggagt agaaacattt tgaagctatg agctccagct
tttgttccct ttagtgaggg 4440ttaattgcgc gcttggcgta atcatggtca
tagctgtttc ctgtgtgaaa ttgttatccg 4500ctcacaattc cacacaacat
aggagccgga agcataaagt gtaaagcctg gggtgcctaa 4560tgagtgaggt
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac
4620ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg
tttgcgtatt 4680gggcgctctt ccgcttcctc gctcactgac tcgctgcgct
cggtcgttcg gctgcggcga 4740gcggtatcag ctcactcaaa ggcggtaata
cggttatcca cagaatcagg ggataacgca 4800ggaaagaaca tgtgagcaaa
aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 4860ctggcgtttt
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt
4920cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc
tggaagctcc 4980ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat
acctgtccgc ctttctccct 5040tcgggaagcg tggcgctttc tcatagctca
cgctgtaggt atctcagttc ggtgtaggtc 5100gttcgctcca agctgggctg
tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 5160tccggtaact
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca
5220gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga
gttcttgaag 5280tggtggccta actacggcta cactagaagg acagtatttg
gtatctgcgc tctgctgaag 5340ccagttacct tcggaaaaag agttggtagc
tcttgatccg gcaaacaaac caccgctggt 5400agcggtggtt tttttgtttg
caagcagcag attacgcgca gaaaaaaagg atctcaagaa 5460gatcctttga
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg
5520attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa
ttaaaaatga 5580agttttaaat caatctaaag tatatatgag taaacttggt
ctgacagtta ccaatgctta 5640atcagtgagg cacctatctc agcgatctgt
ctatttcgtt catccatagt tgcctgactc 5700cccgtcgtgt agataactac
gatacgggag ggcttaccat ctggccccag tgctgcaatg 5760ataccgcgag
acccacgctc accggctcca gatttatcag caataaacca gccagccgga
5820agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc
tattaattgt 5880tgccgggaag ctagagtaag tagttcgcca gttaatagtt
tgcgcaacgt tgttgccatt 5940gctacaggca tcgtggtgtc acgctcgtcg
tttggtatgg cttcattcag ctccggttcc 6000caacgatcaa ggcgagttac
atgatccccc atgttgtgca aaaaagcggt tagctccttc 6060ggtcctccga
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca
6120gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt
gactggtgag 6180tactcaacca agtcattctg agaatagtgt atgcggcgac
cgagttgctc ttgcccggcg 6240tcaatacggg ataataccgc gccacatagc
agaactttaa aagtgctcat cattggaaaa 6300cgttcttcgg ggcgaaaact
ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 6360cccactcgtg
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga
6420gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg
gaaatgttga 6480atactcatac tcttcctttt tcaatattat tgaagcattt
atcagggtta ttgtctcatg 6540agcggataca tatttgaatg tatttagaaa
aataaacaaa taggggttcc gcgcacattt 6600ccccgaaaag tgccacctga
acgaagcatc tgtgcttcat tttgtagaac aaaaatgcaa 6660cgcgagagcg
ctaatttttc aaacaaagaa tctgagctgc atttttacag aacagaaatg
6720caacgcgaaa gcgctatttt accaacgaag aatctgtgct tcatttttgt
aaaacaaaaa 6780tgcaacgcga gagcgctaat ttttcaaaca aagaatctga
gctgcatttt tacagaacag 6840aaatgcaacg cgagagcgct attttaccaa
caaagaatct atacttcttt tttgttctac 6900aaaaatgcat cccgagagcg
ctatttttct aacaaagcat cttagattac tttttttctc 6960ctttgtgcgc
tctataatgc agtctcttga taactttttg cactgtaggt ccgttaaggt
7020tagaagaagg ctactttggt gtctattttc tcttccataa aaaaagcctg
actccacttc 7080ccgcgtttac tgattactag cgaagctgcg ggtgcatttt
ttcaagataa aggcatcccc 7140gattatattc tataccgatg tggattgcgc
atactttgtg aacagaaagt gatagcgttg 7200atgattcttc attggtcaga
aaattatgaa cggtttcttc tattttgtct ctatatacta 7260cgtataggaa
atgtttacat tttcgtattg ttttcgattc actctatgaa tagttcttac
7320tacaattttt ttgtctaaag agtaatacta gagataaaca taaaaaatgt
agaggtcgag 7380tttagatgca agttcaagga gcgaaaggtg gatgggtagg
ttatataggg atatagcaca 7440gagatatata gcaaagagat acttttgagc
aatgtttgtg gaagcggtat tcgcaatatt 7500ttagtagctc gttacagtcc
ggtgcgtttt tggttttttg aaagtgcgtc ttcagagcgc 7560ttttggtttt
caaaagcgct ctgaagttcc tatactttct agagaatagg aacttcggaa
7620taggaacttc aaagcgtttc cgaaaacgag cgcttccgaa aatgcaacgc
gagctgcgca 7680catacagctc actgttcacg tcgcacctat atctgcgtgt
tgcctgtata tatatataca 7740tgagaagaac ggcatagtgc gtgtttatgc
ttaaatgcgt acttatatgc gtctatttat 7800gtaggatgaa aggtagtcta
gtacctcctg tgatattatc ccattccatg cggggtatcg 7860tatgcttcct
tcagcactac cctttagctg ttctatatgc tgccactcct caattggatt
7920agtctcatcc ttcaatgcta tcatttcctt tgata 7955
82 8572 DNA Artificial Sequence pGV1810
82 tagaaaaact catcgagcat caaatgaaac tgcaatttat
tcatatcagg attatcaata 60ccatattttt gaaaaagccg tttctgtaat gaaggagaaa
actcaccgag gcagttccat 120aggatggcaa gatcctggta tcggtctgcg
attccgactc gtccaacatc aatacaacct 180attaatttcc cctcgtcaaa
aataaggtta tcaagtgaga aatcaccatg agtgacgact 240gaatccggtg
agaatggcaa aagtttatgc atttctttcc agacttgttc aacaggccag
300ccattacgct cgtcatcaaa atcactcgca tcaaccaaac cgttattcat
tcgtgattgc 360gcctgagcga ggcgaaatac gcgatcgctg ttaaaaggac
aattacaaac aggaatcgag 420tgcaaccggc gcaggaacac tgccagcgca
tcaacaatat tttcacctga atcaggatat 480tcttctaata cctggaacgc
tgtttttccg gggatcgcag tggtgagtaa ccatgcatca 540tcaggagtac
ggataaaatg cttgatggtc ggaagtggca taaattccgt cagccagttt
600agtctgacca tctcatctgt aacatcattg gcaacgctac ctttgccatg
tttcagaaac 660aactctggcg catcgggctt cccatacaag cgatagattg
tcgcacctga ttgcccgaca 720ttatcgcgag cccatttata cccatataaa
tcagcatcca tgttggaatt taatcgcggc 780ctcgacgttt cccgttgaat
atggctcata ttcttccttt ttcaatatta ttgaagcatt 840tatcagggtt
attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa
900ataggggtca gtgttacaac caattaacca attctgaaca ttatcgcgag
cccatttata 960cctgaatatg gctcataaca ccccttgttt gcctggcggc
agtagcgcgg tggtcccacc 1020tgaccccatg ccgaactcag aagtgaaacg
ccgtagcgcc gatggtagtg tggggactcc 1080ccatgcgaga gtagggaact
gccaggcatc aaataaaacg aaaggctcag tcgaaagact 1140gggcctttcg
cccgggctaa ttagggggtg tcgcccttat tcgactctat agtgaagttc
1200ctattctcta gaaagtatag gaacttctga agtggggcta gccacgaaaa
acaaactaac 1260ttatgcgcat cattagatgt aagaactact aaagagctac
tggagttggt tgaggcttta 1320ggtccaaaaa tttgtttgtt gaagacacat
gttgacatat taacagattt ttctatggag 1380ggtaccgtta agcctctgaa
agcgttaagc gcgaaatata actttctttt atttgaagac 1440cgtaagtttg
ctgatattgg aaatactgtt aagttgcaat atagcgcagg agtttataga
1500attgccgaat gggctgacat tacgaatgcc cacggtgttg taggtcctgg
cattgtgtct 1560ggattgaaac aagcggcaga ggaagtgact aaggaaccaa
gaggtttact aatgctggcg 1620gaattatctt gcaaaggctc tctagccacc
ggtgaatata caaaaggtac tgtggatatt 1680gcaaagtctg ataaggactt
cgtaatcggt tttattgcac aaagagatat gggaggtcgt 1740gacgagggct
acgattggtt aattatgaca ccaggcgtag gattagatga caaaggcgat
1800gcgttaggcc aacagtatcg tacagtcgat gatgtcgtaa gtaccggttc
tgatatcatt 1860attgtcggga gaggtttatt tgccaagggc cgtgatgcga
aagtggaggg ggaaagatat 1920aggaaggcag gttgggaggc ttacttgaga
agatgtggtc agcagaatta agcggccgca 1980taacaatact gacagtacta
aataattgcc tacttggctt cacatacgtt gcatacgtcg 2040atatagataa
taatgataat gacagcagga ttatcgtaat acgtaatagt tgaaaatctc
2100aaaaatgtgt gggtcattac gtaaataatg ataggaatgg gattcttcta
tttttccttt 2160ttccattcta gcagccgtcg ggaaaacgtg gcatcctctc
tttcgggctc aattggagtc 2220acgctgccgt gagcatcctc tctttccata
tctaacaact gagcacgtaa ccaatggaaa 2280agcatgagct tagcgttgct
ccaaaaaagt attggatggt taataccatt tgtctgttct 2340cttctgactt
tgactcctca aaaaaaaaaa atctacaatc aacagatcgc ttcaattacg
2400ccctcacaaa aacttttttc cttcttcttc gcccacgtta aattttatcc
ctcatgttgt 2460ctaacggatt tctgcacttg atttattata aaaagacaaa
gacataatac ttctctatca 2520atttcagtta ttgttcttcc ttgcgttatt
cttctgttct tctttttctt ttgtcatata 2580taaccataac caagtaatac
atattcaaac tcgagatgtt gagaactcaa gccgccagat 2640tgatctgcaa
ctcccgtgtc atcactgcta agagaacctt tgctttggcc acccgtgctg
2700ctgcttacag cagaccagct gcccgtttcg ttaagccaat gatcactacc
cgtggtttga 2760agcaaatcaa cttcggtggt actgttgaaa ccgtctacga
aagagctgac tggccaagag 2820aaaagttgtt ggactacttc aagaacgaca
cttttgcttt gatcggttac ggttcccaag 2880gttacggtca aggtttgaac
ttgagagaca acggtttgaa cgttatcatt ggtgtccgta 2940aagatggtgc
ttcttggaag gctgccatcg aagacggttg ggttccaggc aagaacttgt
3000tcactgttga agatgctatc aagagaggta gttacgttat gaacttgttg
tccgatgccg 3060ctcaatcaga aacctggcct gctatcaagc cattgttgac
caagggtaag actttgtact 3120tctcccacgg tttctcccca gtcttcaagg
acttgactca cgttgaacca ccaaaggact 3180tagatgttat cttggttgct
ccaaagggtt ccggtagaac tgtcagatct ttgttcaagg 3240aaggtcgtgg
tattaactct tcttacgccg tctggaacga tgtcaccggt aaggctcacg
3300aaaaggccca agctttggcc gttgccattg gttccggtta cgtttaccaa
accactttcg 3360aaagagaagt caactctgac ttgtacggtg aaagaggttg
tttaatgggt ggtatccacg 3420gtatgttctt ggctcaatac gacgtcttga
gagaaaacgg tcactcccca tctgaagctt 3480tcaacgaaac cgtcgaagaa
gctacccaat ctctataccc attgatcggt aagtacggta 3540tggattacat
gtacgatgct tgttccacca ccgccagaag aggtgctttg gactggtacc
3600caatcttcaa gaatgctttg aagcctgttt tccaagactt gtacgaatct
accaagaacg 3660gtaccgaaac caagagatct ttggaattca actctcaacc
tgactacaga gaaaagctag 3720aaaaggaatt agacaccatc agaaacatgg
aaatctggaa ggttggtaag gaagtcagaa 3780agttgagacc agaaaaccaa
taacctagga tcttgtttaa agattacgga tatttaactt 3840acttagaata
atgccatttt tttgagttat aataatccta cgttagtgtg agcgggattt
3900aaactgtgag gaccttaata cattcagaca cttctgcggt atcaccctac
ttattccctt 3960cgagattata tctaggaacc catcaggttg gtggaagatt
acccgttcta agacttttca 4020gcttcctcta ttgatgttac acctggacac
cccttttctg gcatccagtt tttaatcttc 4080agtggcatgt gagattctcc
gaaattaatt aaagcaatca cacaattctc tcggatacca 4140cctcggttga
aactgacagg tggtttgtta cgcatgctaa tgcaaaggag cctatatacc
4200tttggctcgg ctgctgtaac agggaatata aagggcagca taatttagga
gtttagtgaa 4260cttgcaacat ttactatttt cccttcttac gtaaatattt
ttctttttaa ttctaaatca 4320atctttttca attttttgtt tgtattcttt
tcttgcttaa atctataact acaaaaaaca 4380catacataaa ctaaaagtcg
acatgtaccc atacgatgtt ccagattacg caggtggtgg 4440tgtcgacatg
cctaaataca gatcagctac gactacacac ggtagaaata tggccggagc
4500cagggcccta tggagagcca ccggcatgac agatgcagat tttggtaaac
ctataattgc 4560tgtagttaac tcttttacac agtttgttcc aggtcatgta
catctaagag acttgggcaa 4620attggtggca gaacaaatcg aggctgctgg
tggtgttgca aaagaattta acactattgc 4680cgtagacgac ggcattgcga
tgggtcatgg cggtatgctt tattcgctac cctccagaga 4740attaattgca
gacagcgttg aatatatggt aaatgcccac tgcgcagatg ccatggtttg
4800catttccaat tgtgacaaaa tcacgccggg catgttgatg gcgtcattga
gactaaatat 4860tcctgtgatc ttcgttagcg gaggtcccat ggaagccggg
aaaactaaac tttccgatca 4920gataatcaag ttagacttgg tcgatgccat
gatccagggt gcggacccca aagtaagcga 4980ctctcaatcc gatcaagttg
aaagatccgc atgtccaact tgcgggagtt gctctgggat 5040gttcacggcg
aactctatga attgcctaac agaggccctg ggcctgtcac aacctggcaa
5100cggttcgctt ttagcaactc atgctgatag aaagcaatta tttctaaatg
ctggtaaaag 5160aatcgttgaa ttaacaaaaa gatattacga acaaaacgat
gaatctgcac tgccaaggaa 5220cattgcttca aaggccgctt tcgaaaacgc
tatgacattg gatattgcaa tgggtggaag 5280cacaaatact gtccttcatc
tactggcggc tgctcaagaa gcagaaattg atttcacaat 5340gagcgatatc
gacaagctat cacgtaaggt cccgcagctg tgtaaagtgg caccgtctac
5400tcaaaaatac cacatggaag atgtccatcg tgctggaggc gttatcggaa
tcttggggga 5460gttggacagg gccggtctat taaacagaga tgttaagaac
gtgctaggtc taactttgcc 5520tcaaacctta gagcagtacg acgttatgtt
aactcaagat gacgcagtca aaaacatgtt 5580cagagcgggg ccagctggaa
taaggactac ccaagcgttc tcgcaagatt gcagatggga 5640tactctggac
gatgatagag ctaacggttg cataagatca ctagagcatg cttactcgaa
5700agatggaggt ttagctgttt tatacggtaa ttttgccgaa aacggatgta
tagtgaagac 5760cgctggggtt gatgattcaa ttctaaaatt cactgggcca
gccaaggtat acgagtcaca 5820agatgatgct gttgaagcca tcttaggtgg
gaaagtggtg gcaggggacg tggtggtaat 5880aagatatgaa ggtccaaagg
gtggtccagg tatgcaagaa atgctgtacc ctacttcttt 5940ccttaaatct
atgggtttag gcaaggcttg tgctcttata accgatggta gattttctgg
6000aggtacatca ggcctttcca taggacatgt tagccccgaa gctgcctcag
gtggtagtat 6060tggcttaatc gaggatggtg acttaattgc tattgacatt
cctaacaggg gtattcaact 6120acaggttagc gatgcagaat tagccgctag
aagagaggca caagatgcga gaggcgataa 6180agcatggaca cctaagaaca
gggagagaca agtgagcttt gccctgagag cttatgcctc 6240gctggcgacg
agcgcagaca aaggagccgt aagagataaa tcaaaattgg gtggttaggg
6300atccgcgatt taatctctaa ttattagtta aagttttata agcattttta
tgtaacgaaa 6360aataaattgg ttcatattat tactgcactg tcacttacca
tggaaagacc agacaagaag 6420ttgccgacag tctgttgaat tggcctggtt
aggcttaagt ctgggtccgc ttctttacaa 6480atttggagaa tttctcttaa
acgatatgta tattcttttc gttggaaaag atgtcttcca 6540aaaaaaaaac
cgatgaatta gtggaaccaa ggaaaaaaaa agaggtatcc ttgattaagg
6600aacactgttt aaacagtgtg gtttccaaaa ccctgaaact gcattagtgt
aatagaagac 6660tagacacctc gatacaaata atggttactc aattcaaaac
tgccagcgaa ttcgactctg 6720caattgctca agacaagcta gttgtcgtag
atttctacgc cacttggtgc ggtccatgta 6780aaatgattgc tccaatgatt
gaaaaattct ctgaacaata cccacaagct gatttctata 6840aattggatgt
cgatgaattg ggtgatgttg cacaaaagaa tgaagtttcc gctatgccaa
6900ctttgcttct attcaagaac ggtaaggaag ttgcaaaggt tgttggtgcc
aacccagcgg 6960ctattaagca agccattgct gctaatgctt aaactcaccc
aatgaccgat atattgtgtt 7020tctatactgt gtttgttata tatagtttac
ctttaagctt aaaatgaagt gaagttccta 7080tactttctag agaataggaa
cttctatagt gagtcgaata agggcgacac aaaatttatt 7140ctaaatgcat
aataaatact gataacatct tatagtttgt attatatttt gtattatcgt
7200tgacatgtat aattttgata tcaaaaactg attttccctt tattattttc
gagatttatt 7260ttcttaattc tctttaacaa actagaaata ttgtatatac
aaaaaatcat aaataataga 7320tgaatagttt aattataggt gttcatcaat
cgaaaaagca acgtatctta tttaaagtgc 7380gttgcttttt tctcatttat
aaggttaaat aattctcata tatcaagcaa agtgacaggc 7440gcccttaaat
attctgacaa atgctctttc cctaaactcc ccccataaaa aaacccgccg
7500aagcgggttt ttacgttatt tgcggattaa cgattactcg ttatcagaac
cgcccagggg 7560gcccgagctt aagactggcc gtcgttttac aacacagaaa
gagtttgtag aaacgcaaaa 7620aggccatccg tcaggggcct tctgcttagt
ttgatgcctg gcagttccct actctcgcct 7680tccgcttcct cgctcactga
ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 7740gctcactcaa
aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac
7800atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt
gctggcgttt 7860ttccataggc tccgcccccc tgacgagcat cacaaaaatc
gacgctcaag tcagaggtgg 7920cgaaacccga caggactata aagataccag
gcgtttcccc ctggaagctc cctcgtgcgc 7980tctcctgttc cgaccctgcc
gcttaccgga tacctgtccg cctttctccc ttcgggaagc 8040gtggcgcttt
ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc
8100aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt
atccggtaac 8160tatcgtcttg agtccaaccc ggtaagacac gacttatcgc
cactggcagc agccactggt 8220aacaggatta gcagagcgag gtatgtaggc
ggtgctacag agttcttgaa gtggtgggct 8280aactacggct acactagaag
aacagtattt ggtatctgcg ctctgctgaa gccagttacc 8340ttcggaaaaa
gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt
8400ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga
agatcctttg 8460atcttttcta cggggtctga cgctcagtgg aacgacgcgc
gcgtaactca cgttaaggga 8520ttttggtcat gagcttgcgc cgtcccgtca
agtcagcgta atgctctgct tt 8572
83 1488 DNA Artificial Sequence E. coli codon optimized ilvC for expression in S. cerevisiae
83 ctcgagatgg
ccaactattt taacacatta aatttgagac aacaattggc tcaactgggt 60aagtgcagat
ttatgggaag ggacgagttt gctgatggtg cttcttatct gcaaggaaag
120aaagtagtaa ttgttggctg cggtgctcag ggtctaaacc aaggtttaaa
catgagagat 180tcaggtctgg atatttcgta tgcattgagg aaagaggcaa
ttgcagaaaa gagggcctcc 240tggcgtaaag cgacggaaaa tgggttcaaa
gttggtactt acgaagaact gatccctcag 300gcagatttag tgattaacct
aacaccagat aagcaacact cagacgtagt aagaacagtt 360caaccgctga
tgaaggatgg ggcagcttta ggttactctc atggctttaa tatcgttgaa
420gtgggcgagc agatcagaaa agatataaca gtcgtaatgg ttgcaccaaa
gtgcccaggt 480acggaagtca gagaggagta caagaggggt tttggtgtac
ctacattgat cgccgtacat
540cctgaaaatg accccaaagg tgaaggtatg gcaattgcga aggcatgggc
agccgcaacc 600ggaggtcata gagcgggtgt gttagagagt tctttcgtag
ctgaggtcaa gagtgactta 660atgggtgaac aaaccattct gtgcggaatg
ttgcaggcag ggtctttact atgctttgat 720aaattggtcg aagagggtac
agatcctgcc tatgctgaaa agttgataca atttggttgg 780gagacaatca
ccgaggcact taaacaaggt ggcataacat tgatgatgga tagactttca
840aatccggcca agctaagagc ctacgcctta tctgagcaac taaaagagat
catggcacca 900ttattccaaa agcacatgga cgatattatc tccggtgagt
tttcctcagg aatgatggca 960gattgggcaa acgatgataa aaagttattg
acgtggagag aagaaaccgg caagacggca 1020ttcgagacag ccccacaata
cgaaggtaaa attggtgaac aagaatactt tgataaggga 1080gtattgatga
tagctatggt gaaggcaggg gtagaacttg cattcgaaac tatggttgac
1140tccggtatca ttgaagaatc tgcatactat gagtctttgc atgaattgcc
tttgatagca 1200aatactattg caagaaaaag actttacgag atgaatgttg
tcatatcaga cactgcagaa 1260tatggtaatt acttatttag ctacgcatgt
gtcccgttgt taaagccctt catggccgag 1320ttacaacctg gtgatttggg
gaaggctatt ccggaaggag cggttgacaa tggccaactg 1380agagacgtaa
atgaagctat tcgttcacat gctatagaac aggtgggtaa aaagctgaga
1440ggatatatga ccgatatgaa aagaattgca gtggcaggat gaagatct
1488
84 38 DNA Artificial Sequence Primer 1792
84 ttttctcgag atgcagattt ttgtgaagac cctcactg 38
85 51 DNA Artificial Sequence Primer 1794
85 ttttgcggcc gcggatccgt cgacacctcg caggcgcaac accaggtgca g 51
86 228 DNA Artificial Sequence M. musculus ubiquitin gene codon-optimized for expression in S. cerevisiae
86 atgcagattt
ttgtgaagac cctcactggc aaaaccatca cccttgaggt cgagcccagt 60gacaccattg
agaatgtcaa agccaaaatt caagacaagg agggtatccc acctgaccag
120cagcgtctga tatttgccgg caaacagctg gaggatggcc gcactctctc
agactacaac 180atccagaaag agtccaccct gcacctggtg ttgcgcctgc gaggtgga
228
87 1713 DNA Lactococcus lactis
87 atggagttta agtataacgg caaagttgaa
tctgttgaac tgaataagta cagcaaaacg 60ttgacacaag atcccacaca acccgccaca
caggcaatgt attacggcat cgggtttaaa 120gacgaagatt tcaagaaagc
tcaagtgggt atagtgtcga tggactggga tggaaatcca 180tgcaacatgc
atttaggaac ccttggatca aagattaaaa gctcagtaaa tcagacagat
240ggtctgatcg gcttacaatt tcatacgata ggagtttctg atgggatagc
aaatggaaag 300ttgggaatga gatactccct tgtttccaga gaagttatag
ctgactctat tgaaaccaac 360gctggcgctg aatactatga tgcaattgta
gccatcccag gttgtgacaa aaatatgcca 420ggttctatta ttggtatggc
aagacttaat aggccaagca ttatggtgta tggaggaaca 480atagaacacg
gtgaatataa aggtgagaaa ttgaacatcg tatcggcttt tgaatctcta
540ggccagaaaa ttaccggcaa tatctctgat gaagattatc acggtgttat
ttgtaatgct 600attcctggtc aaggggcatg tggggggatg tacacagcta
ataccttagc tgccgctatc 660gaaacactag gtatgtcatt gccgtattct
tcttcgaacc ctgcagtatc tcaagaaaaa 720caagaagaat gtgatgagat
tggattagcc attaagaatc ttttggaaaa agacatcaag 780cctagtgata
taatgactaa ggaggcgttc gagaacgcta ttaccattgt gatggtcttg
840gggggtagta ctaatgctgt cttgcatatt attgcaatgg ctaacgcgat
aggtgtcgaa 900ataactcagg atgacttcca aagaattagt gacattactc
cagtactagg tgattttaaa 960ccttcaggta aatatatgat ggaagatttg
cataaaattg gaggcttgcc agcagtgctt 1020aagtaccttc taaaggaagg
aaaattgcat ggtgactgcc ttactgtgac gggtaaaaca 1080ttagccgaga
atgtcgagac tgccctagac ttggatttcg actcacaaga tatcatgagg
1140ccactaaaga atcctatcaa ggccaccggc cacttgcaga ttctgtacgg
taatttagct 1200caagggggtt ccgtagcaaa aattagcggt aaagaaggag
agttcttcaa aggcactgcc 1260agagtctttg atggtgaaca acattttatc
gacggcatag aatctggtcg tttgcatgct 1320ggagatgtag cggtaattag
gaatataggt cccgtcggcg gacctggtat gcccgaaatg 1380ctgaagccta
catcagcatt aattggtgcg ggtttaggga aaagttgcgc gttaattacg
1440gatggtagat tctccggtgg cactcacggt tttgttgtcg gccatattgt
gcctgaagcc 1500gttgagggtg gactaatcgg cttagttgaa gatgacgata
taatagagat agatgcagtc 1560aacaactcta tatccctgaa agtttccgat
gaagaaatcg caaagagaag agctaattat 1620cagaagccaa ctccgaaagc
caccagggga gttttggcaa aattcgctaa attaacccgt 1680cctgcatcgg
aagggtgtgt tactgatctg taa 1713
88 1758 DNA Saccharomyces cerevisiae
88 atgggcttgt taacgaaagt tgctacatct agacaattct ctacaacgag atgcgttgca
60aagaagctca acaagtactc gtatatcatc actgaaccta agggccaagg tgcgtcccag
120gccatgcttt atgccaccgg tttcaagaag gaagatttca agaagcctca
agtcggggtt 180ggttcctgtt ggtggtccgg taacccatgt aacatgcatc
tattggactt gaataacaga 240tgttctcaat ccattgaaaa agcgggtttg
aaagctatgc agttcaacac catcggtgtt 300tcagacggta tctctatggg
tactaaaggt atgagatact cgttacaaag tagagaaatc 360attgcagact
cctttgaaac catcatgatg gcacaacact acgatgctaa catcgccatc
420ccatcatgtg acaaaaacat gcccggtgtc atgatggcca tgggtagaca
taacagacct 480tccatcatgg tatatggtgg tactatcttg cccggtcatc
caacatgtgg ttcttcgaag 540atctctaaaa acatcgatat cgtctctgcg
ttccaatcct acggtgaata tatttccaag 600caattcactg aagaagaaag
agaagatgtt gtggaacatg catgcccagg tcctggttct 660tgtggtggta
tgtatactgc caacacaatg gcttctgccg ctgaagtgct aggtttgacc
720attccaaact cctcttcctt cccagccgtt tccaaggaga agttagctga
gtgtgacaac 780attggtgaat acatcaagaa gacaatggaa ttgggtattt
tacctcgtga tatcctcaca 840aaagaggctt ttgaaaacgc cattacttat
gtcgttgcaa ccggtgggtc cactaatgct 900gttttgcatt tggtggctgt
tgctcactct gcgggtgtca agttgtcacc agatgatttc 960caaagaatca
gtgatactac accattgatc ggtgacttca aaccttctgg taaatacgtc
1020atggccgatt tgattaacgt tggtggtacc caatctgtga ttaagtatct
atatgaaaac 1080aacatgttgc acggtaacac aatgactgtt accggtgaca
ctttggcaga acgtgcaaag 1140aaagcaccaa gcctacctga aggacaagag
attattaagc cactctccca cccaatcaag 1200gccaacggtc acttgcaaat
tctgtacggt tcattggcac caggtggagc tgtgggtaaa 1260attaccggta
aggaaggtac ttacttcaag ggtagagcac gtgtgttcga agaggaaggt
1320gcctttattg aagccttgga aagaggtgaa atcaagaagg gtgaaaaaac
cgttgttgtt 1380atcagatatg aaggtccaag aggtgcacca ggtatgcctg
aaatgctaaa gccttcctct 1440gctctgatgg gttacggttt gggtaaagat
gttgcattgt tgactgatgg tagattctct 1500ggtggttctc acgggttctt
aatcggccac attgttcccg aagccgctga aggtggtcct 1560atcgggttgg
tcagagacgg cgatgagatt atcattgatg ctgataataa caagattgac
1620ctattagtct ctgataagga aatggctcaa cgtaaacaaa gttgggttgc
acctccacct 1680cgttacacaa gaggtactct atccaagtat gctaagttgg
tttccaacgc ttccaacggt 1740tgtgttttag atgcttga
1758
89 1701 DNA Saccharomyces cerevisiae
89 atgaagaagc tcaacaagta
ctcgtatatc atcactgaac ctaagggcca aggtgcgtcc 60caggccatgc tttatgccac
cggtttcaag aaggaagatt tcaagaagcc tcaagtcggg 120gttggttcct
gttggtggtc cggtaaccca tgtaacatgc atctattgga cttgaataac
180agatgttctc aatccattga aaaagcgggt ttgaaagcta tgcagttcaa
caccatcggt 240gtttcagacg gtatctctat gggtactaaa ggtatgagat
actcgttaca aagtagagaa 300atcattgcag actcctttga aaccatcatg
atggcacaac actacgatgc taacatcgcc 360atcccatcat gtgacaaaaa
catgcccggt gtcatgatgg ccatgggtag acataacaga 420ccttccatca
tggtatatgg tggtactatc ttgcccggtc atccaacatg tggttcttcg
480aagatctcta aaaacatcga tatcgtctct gcgttccaat cctacggtga
atatatttcc 540aagcaattca ctgaagaaga aagagaagat gttgtggaac
atgcatgccc aggtcctggt 600tcttgtggtg gtatgtatac tgccaacaca
atggcttctg ccgctgaagt gctaggtttg 660accattccaa actcctcttc
cttcccagcc gtttccaagg agaagttagc tgagtgtgac 720aacattggtg
aatacatcaa gaagacaatg gaattgggta ttttacctcg tgatatcctc
780acaaaagagg cttttgaaaa cgccattact tatgtcgttg caaccggtgg
gtccactaat 840gctgttttgc atttggtggc tgttgctcac tctgcgggtg
tcaagttgtc accagatgat 900ttccaaagaa tcagtgatac tacaccattg
atcggtgact tcaaaccttc tggtaaatac 960gtcatggccg atttgattaa
cgttggtggt acccaatctg tgattaagta tctatatgaa 1020aacaacatgt
tgcacggtaa cacaatgact gttaccggtg acactttggc agaacgtgca
1080aagaaagcac caagcctacc tgaaggacaa gagattatta agccactctc
ccacccaatc 1140aaggccaacg gtcacttgca aattctgtac ggttcattgg
caccaggtgg agctgtgggt 1200aaaattaccg gtaaggaagg tacttacttc
aagggtagag cacgtgtgtt cgaagaggaa 1260ggtgccttta ttgaagcctt
ggaaagaggt gaaatcaaga agggtgaaaa aaccgttgtt 1320gttatcagat
atgaaggtcc aagaggtgca ccaggtatgc ctgaaatgct aaagccttcc
1380tctgctctga tgggttacgg tttgggtaaa gatgttgcat tgttgactga
tggtagattc 1440tctggtggtt ctcacgggtt cttaatcggc cacattgttc
ccgaagccgc tgaaggtggt 1500cctatcgggt tggtcagaga cggcgatgag
attatcattg atgctgataa taacaagatt 1560gacctattag tctctgataa
ggaaatggct caacgtaaac aaagttgggt tgcacctcca 1620cctcgttaca
caagaggtac tctatccaag tatgctaagt tggtttccaa cgcttccaac
1680ggttgtgttt tagatgcttg a 1701
90 1689 DNA Gramella forsetii
90 atggataaaa cagccatgaa taacaaatac tcttctacta ttacacaaag tgactcacaa
60ccagcgtcac aagcaatgct tcacgccatc ggccttaata aggaagattt gaaaaagcct
120tttgtaggca tcggcagtac cggatatgaa ggaaacccat gcaacatgca
cctgaatgat 180ttggctaagg aagtgaaaaa aggcactcag aatgcagatt
taaacggtct gatctttaat 240acaattggcg tcagcgatgg aatatctatg
ggtactccag gtatgaggtt ctcattgcca 300tcccgtgact tgattgcaga
tagcatggaa acagtagttg gtggaatgtc gtatgatggt 360ttagttaccg
tagttgggtg tgataaaaac atgccaggag cattaatggc aatgttgagg
420ttaaatcgtc cgtcggtttt agtgtatggg ggaacaattg ctagtggttg
ccacaatgga 480aagaagttag atgttgtgtc tgctttcgag gcctggggtt
ctaaagtttc aggtgatatg 540caggaagaag aataccagca agtcattgaa
aaggcatgtc ctggtgcagg tgcttgtggg 600ggtatgtaca cagccaacac
catggcttca tctattgaag ccttggggat gtccttgcct 660tttaactcat
ccaatcctgc aactggtccg gaaaaaactc aagaatctgt caaagctggc
720gaggctatga aatacttact agaaaatgat ctgaaaccca aagatattgt
gacggccaag 780tcgctggaaa atgctattag attgctaacg gttttgggtg
gtagtaccaa tgccgtcttg 840cacttcttgg ctatagctaa ggcagccgaa
ataaactttg gtttgaaaga ttttacaaga 900atatgtgagg aaactccctt
cttggccgac ttaaaaccat ctggtaagta tctgatggaa 960gacattcata
ggataggcgg aatccccgcg gttatgaagt acatgttaga gaaaggatta
1020cttcatggtg agtgcatgac ggtaactggc aagactatcg cagaaaacct
tgaaaatgtg 1080aaacctctgc cagatgatca ggacgtgatt catccagtcg
aaaaacctat taaagctact 1140ggacatatca ggattttgta tggcaattta
gccagcgaag gctccgtagc caagattact 1200gggaaggaag gattagaatt
tcaaggtaag gccagagtct ttaatggcga atttgaggcc 1260aatgaaggga
tcagtagcgg aaaggtccaa aaaggcgacg tagtagtaat tagatatgag
1320ggtcccaagg ggggtccggg tatgccggaa atgctaaaac ccacgtcagc
aataatggga 1380gctggtcttg gtaagagtgt cgctttaata actgacggta
gattcagcgg cggtactcat 1440ggttttgtcg tgggtcatat aacccctgaa
gcgcaacaag gtggactaat agggctattg 1500aaagatggtg atgaaatttc
gatcaacgcg gagaaaaaca cgattgaagc acatttatcc 1560gcagaagaaa
ttaatagaag aaaggaggct tggaaggctc ctgctctaaa agttaacggt
1620ggggtacttt acaaatatgc gaagacagtt gctagtgcat cagaggggtg
tgttacagac 1680gagttctaa 1689911707DNASaccharopolyspora erythraea
91atgagtacga gtacagatgg tacgggtcaa tcaggtagag gactaaaacc aaggtccgga
60gacgtaaccg agggtataga aagagccgcc gcaagaggca tgttacgtgc agtcggtatg
120caagatgctg acttcgccaa gcctcaaatt ggtgtcgctt cgtcttggaa
cgagataact 180ccctgtaatc tttcccttca gcgtttagca caagcgtcta
aggaaggagt gcatgcagct 240ggtgggttcc caatggaatt tggcactatt
tcagtgagtg atgggatatc tatgggccat 300gttggaatgc attactctct
agtgagtagg gaggtgattg ctgattcggt tgagacggta 360atggaagctg
aaaggctgga cggttccgtt ttgttagccg gttgtgacaa gagcctaccg
420ggtatgctaa tggccgcagc acgtttagat gtcgccgctg tattcgtgta
tgcaggttcc 480atactgcctg gaagagtaga cgatagagaa gtaactatta
ttgacgcttt tgaagccgtc 540ggagcttgtg caaggggctt gatctcagaa
gccgaggtgg ataggattga aagggctata 600tgcccaggtg aaggcgcttg
tggaggaatg tatacggcga ataccatggc ttgtgcggct 660gaagcaatgg
gcatgtcgtt accaggatca gcctcccctc ctagcgtaga tcgtagaaga
720gacgcgggcg cacgtgaagc tggtagagct gtggtcggta tgattgaacg
tggtcttaca 780gccagacaaa tattgactaa agaggcgttc gaaaacgcta
tcgcggttgt tatggctttt 840ggcggcagta ctaatgctgt tctgcatttg
ctggcaattg cacgtgaggc agaagttgat 900ttaacattag atgattttaa
caggattggt gatagagtgc ctcatctggc tgatgttaag 960ccatttggaa
ggcacgtgat gaccgcagtc gataggatag gtggagtacc agtagtaatg
1020aaagccttgt tggatgctgg tttgcttcat ggagactgta tgacagttac
tgggaaaact 1080gtcgccgaga atctagctga attagaccca ccagaattag
acggggaagt tcttcacaaa 1140ctgtctaacc ccttacaccc taccggcggc
ttgaccatct tgagagggag cttggcccct 1200gagggagctg ttgtcaaaag
cgctggcttt gactccgcaa cattcgaggg tactgcacgt 1260gttttcgatg
gagagcaggg tgccatggat gctgttgagg atggttcatt gaaagcgggt
1320gacgtggtag tcatcagata tgaaggtcca agaggcggtc caggtatgag
ggaaatgctt 1380gctgtaacag gggctatcaa aggtgcaggg ttagggaagg
acgttctatt gttaactgat 1440ggtagatttt cgggtggaac cacaggttta
tgcatcggac acgtcgcgcc cgaagcaact 1500gacggcggtc cgattgcttt
tgttcgtgac ggtgatccta ttagactgga tttagcgggt 1560agaactttgg
atctattagt agatgaagcc gaacttgcaa gaagaaaaga aggctgggtt
1620ccgagagaac ccaagtttag acaaggtgtt ttgggcaaat acgctagact
ggttaggtct 1680gctgcagttg gagccgtctg ctcttga
1707
92 1722 DNA Candidatus Koribacter versatilis
92 atgactgaga
agtcaccaaa accccataag agatccgatg caatcacaga ggggccaaat 60cgtgctcctg
ctcgtgctat gttaagggct gcaggtttta ctcctgagga tttgagaaaa
120cccattatcg gtatagccaa cacatggatt gaaattggcc cttgcaactt
acatctaaga 180gaattggccg aacatatcaa gcaaggtgta agagaagctg
gagggacacc catggaattt 240aatacagttt ccatctccga cgggataacc
atgggatcag aaggtatgaa agctagtcta 300gtgagtcgtg aggtaatagc
cgattcaatt gagttagttg ccagaggaaa cttgtttgat 360ggactaatag
ctttatctgg atgtgataag acaatcccag gtacaattat ggcattggag
420agacttgata tcccaggcct tatgctttat ggtggttcaa ttgctccggg
caaattccac 480gcacagaagg ttacgatcca agatgtattc gaagccgttg
gtacccacgc taggggtaaa 540atgagcgatg cagacttaga agagcttgag
cacaatgctt gtcctggtgc tggggcgtgc 600ggaggacagt tcacagctaa
tactatgtct atgtgtggtg aatttctggg tatatctcct 660atgggagcga
atagcgttcc cgcaatgacg gtcgagaaac aacaagtcgc gcgtagatgt
720ggacatttag ttatggagtt ggtgagaaga gacatcaggc cgtctcaaat
cataacaaga 780aaagcaattg agaacgcaat agcatcagtt gcggctagtg
gaggtagtac taacgcggtc 840ctgcatctgt tagctattgc acacgagatg
gatgtcgaat tgaacattga agattttgat 900aagataagct ctcgtactcc
acttctttgt gaactgaaac cagccggtag gtttacggct 960acagatttgc
atgacgctgg tggtattcca ttagttgctc aaagactgtt ggaagcaaat
1020ttgttacacg ctgacgcttt gacagtaact ggcaagacta ttgcagaaga
agctaaacag 1080gccaaagaaa ccccgggcca agaagtagtc aggcccttga
ccgacccaat taaggctacc 1140ggcggattaa tgatcttaaa aggtaatcta
gcatcagaag ggtgcgtggt aaagttggtt 1200ggtcacaaga agttattctt
cgaaggtcct gcgagagttt ttgaatctga agaagaagca 1260tttgccggcg
tcgaggatag gacgattcaa gcgggtgaag ttgtagtggt cagatacgaa
1320gggccaaaag gcggacctgg aatgcgtgaa atgttaggcg ttactgctgc
gatagctggc 1380accgagttag ctgaaactgt ggccctaatc accgacggta
gattttcggg tgcaacaaga 1440ggtctatccg tggggcatgt cgcacctgaa
gccgcaaatg gtggtgccat tgccgtagtt 1500aggaatggtg acattattac
gctggatgtt gagagaagag aattaagggt tcatttgact 1560gatgctgaat
tggaggccag attgcgtaac tggagagcgc ctgaaccgag atacaaacgt
1620ggtgttttcg ctaaatatgc ttctacggtc tcatcagcat cgttcggagc
tgtaacaggt 1680tctaccatag aaaacaaaac actggcaggc tcgactaagt aa
1722
93 1779 DNA Piromyces sp
93 atgtctttct cactggctaa cctggccgct
aagggttcga acttgttcaa atttactcct 60gcgcttctaa gcgcaaagcg ttttggttca
tcaggaaagc caattaataa gttcagcaag 120attataacag agccaaagtc
tagagggggt agtcaagcga tgttaatcgc aactggtata 180aaaccagaag
atttaaaaaa gccacagatc ggcataggca gtgtttggta tgatggaaat
240ccatgcaaca tgcatctatt ggatcttggc tccgtggtaa aaaaggccgt
tcaaaaacaa 300aatatgaatg gtatgagatt caatatgatt ggagtgtcag
acgggatctc caacggtacg 360gatggaatgt ccttttcttt gcagtcccgt
gaaattattg cggattctat cgaaacaatc 420atgtctgcac aatattatga
tgctaacatc agcttacctg gctgcgacaa gaacatgcct 480ggttgtttaa
tcgccgctgc cagattgaac agaccgacta taattatcta cggtggcacg
540atcaagcccg gacatacaaa aaagggagag acgattgatt tagtctcggc
cttccaatgt 600tatgggcaat acttggctgg agaaattact gaagagcaaa
gagaagaaat agtgaataat 660gcatgtcctg gcgcaggtgc atgcggtgga
atgtatacag ctaatacaat ggcttccata 720atcgaatcaa tgggtatgag
tttaccttac tccgcctcga ccccggcaga agacccattg 780aaagagcttg
aatgtataaa cgcggcagct gcaattaaga atttaatgga aaaagacatc
840aagccattag acataatgac aagaaaagcg tttgagaacg ctataactat
tactttgatt 900cttggaggga gtacaaactc cgttctgcac cttttggcta
tcgctagggc ctgcaaagtc 960ccattaacta ttgacgattt ccaggaattt
tctaatagga tacccgtttt agccgactta 1020aaacctagtg gtaaatatgt
catggaagat ttgcagttga tcggcggtct tccagctatt 1080cagaaatatc
ttctgaatga aggtctactt catggtgata ttatgactgt taccggaaag
1140accctagcag agaatttgaa agacgttgct ccaatcgatt ttgaaactca
agatataatt 1200agacctttat cgaatcccat taaaaagaat ggtcacatta
tcattatgaa aggtaacgtc 1260tctccggacg gtggtgttgc taaaattaca
ggtaagcagg gattgttttt cgaaggcgtg 1320gcgaattgct ttgattgtga
agaagacatg ttagctgcac tggaaagagg cgaaattaaa 1380aaaggtcaag
tgattataat aaggtatgaa ggccccactg gagggcctgg tatgccggag
1440atgctaactc cgaccagtgc tattatgggt gctgggttag gaaaagatgt
agcactatta 1500acagatggca gattttcagg cgggtcacac ggcttcatta
ttggtcatat tacgcctgag 1560gcacaagtag gtggtccaat tgccctaatc
aaaaacggtg ataagataac tatagacgcg 1620aataaacgta ccatacatgc
ccatgtcagc gaagaagaat ttgctaaaag acgtgccgag 1680tggaaagcac
caccttacag agctactcaa ggtactttaa agaaatacat taagctggtt
1740aaacccgcaa actttggatg tgttaccgat gagtggtaa
1779
94 1764 DNA Ralstonia eutropha
94 atgccgtacg cagatgaccc aaaattacct
caagatgggg ctgcgcctac agaaggtttg 60gccaagggcc ttactaatta tggtgatact
ggtttctctt tattcctgag gaaggctttt 120atcaaaggtg caggttttac
cgatgatgca ctatcaaggc cggtgatagg aattgtaaat 180actggatctt
cttataaccc atgccacggc aacgcccctc aattagtgga ggcggtgaag
240agaggtgtca tgttggcagg tggtttaccc gtagacttcc ctactatatc
cgtccacgag 300tcatttagcg cacccactag tatgtattta aggaacttga
tgtccatgga taccgaagaa 360atgattcgtg ctcagccgat ggacgccgtc
gttctgatag ggggttgtga caaaacagtt 420ccagcccaac tgatgggtgc
cgcatcagct ggagtaccag ccatccaatt agtcacaggt 480tctatgctaa
ctggtagcca tagaagtgag agagtcggag cgtgtacgga ttgtcgtaga
540tactggggta gataccgtgc tgaggagatt gattcagccg agatcgcaga
tgttaataat 600cagttggttg cctcagttgg tacatgctcg gtcatgggga
cagcttcaac aatggcttgt 660gtagcagagg ccttgggtat gatggtttct
ggcggtgctt cggcacctgc tgtgaccgcg 720gatagagtta gggtcgcgga
acgtaccggg acgactgctg ttggaatggc ggcggccagg 780ttgacacctg
atagaatatt aacaggtaaa gcctttgaaa acgctttgag agttctactg
840gcaatcggcg gttcaacaaa tgggatagta catctaacgg ctattgctgg
tagactagga 900atcgacatcg acctagcagg gttggacaga atgtctcgtg
aaacgcctgt tctggttgac 960ttgaaaccta gcggtcaaca ttacatggaa
gattttcata aggccggagg
aatgttaacg 1020ttgttacgtg aactgagacc actattacac ttagatactt
tgaccgttag tggaaggacc 1080cttggcgaag aattagatgc agcaccccct
ctgttcccac aagatgtcat tagaagtgca 1140ggtaatccta tttatcccgc
aggtggatta gcggtccttc gtggtaattt ggctccaggc 1200ggggctatca
tcaaacaatc cgctgcgaac ccagctctta tggagcatga aggaagagcc
1260gtagtttttg aaaatgcaga agacatggct caaagaattg acgacgaatc
cttagacgtg 1320aaagctgacg atattcttgt acttaaaagg attggtccaa
ctggcgcccc gggtatgcct 1380gaagctggct atatgccgat accaaagaag
ttagcaagag caggggttaa ggatatggta 1440agagttagtg atggtcgtat
gtctggaacg gcagctggca caatagtttt gcatgtgaca 1500ccagaagcag
ccataggggg acccttagcc cttgttcagt cgggagatag aattaggcta
1560tctgtggcca accgtgaaat tgcattgtta gtagatgatg ccgaattagc
aaggagggcc 1620gctgctcaac ccgtagaaag accaagggct gagagaggtt
atagaaaatt gtttctggag 1680acagtaactc aggcggatca gggtgttgat
ttcgactttt tgagagctgc tcaaactgtg 1740gatacagtcc caaagcaagg ctaa 1764
95 1746 DNA Chromohalobacter salexigens
95 atgactcata agaagagacc
tttaagaagt gccgagtggt tcggtaatga tgacaaaaat 60ggatttatgt atagatcgtg
gatgaaaaac caaggtatac ccgatcacga gtttagaggt 120aaaccgataa
ttggtatctg caataccttt agtgaactaa ctccatgcaa cgcccatttc
180agaaagttag cagaacatgt gaaaaaaggt gtattagaag caggcggtta
cccggttgaa 240tttccagtat tttctaacgg ggaatctaat ttgagaccaa
ctgctatgtt cacaaggaat 300ttggctagta tggatgtcga ggaagccatt
agaggcaatc cattagacgc agtcgtgttg 360cttgtgggtt gtgataaaac
aacaccagcc ttacttatgg gtgctgcttc ttgtgacatt 420ccgactatag
ttgttacagg tgggccaatg cttaacggga aacacaaggg aagagacatc
480ggatcaggta cggtcgtgtg gcagctttct gaagaggtta aggccgggaa
aatttcctta 540catgatttca tggcggctga ggctggaatg agccgttccg
ctggcacttg taacactatg 600ggaaccgcct ctaccatggc atgcatggcc
gaatctcttg gtacttcatt gccacacaat 660gccgctattc cggccgtgga
tagccgtagg tatgtacttg cacatttgag tggtaatagg 720attgtcgaaa
tggtcgatga agacctaaca ctgagcaaag tgctgaccaa gagcgctttt
780gaaaacgcta tcagaacgaa tgctgcgatt ggcgggtcaa ccaatgcagt
aatccatcta 840caggcaatcg caggtagaat gggggtggac ttgacactag
atgactggac aagagtaggt 900cgtggcacgc ctactatcgt cgatttacaa
ccctcgggta ggtacttgat ggaggaattt 960tattatgcgg gaggtctgcc
tgcagtttta aggagattgg gggaagctga tagactaccc 1020cataaagatg
ccttaaccgt taatggcaag accctgtggg aaaacgttca agatgcgcca
1080ttatacaacg acgccgttat tttgccattg gatgctccct tacgtgagga
cggaggcatg 1140tgtgtgatgc gtggtaatct tgcgcctaac ggggctgtat
taaaacctag cgcagcaact 1200cctgctctaa tgcagcacag gggcagagcg
gttgtttttg agaattttga tgattacaaa 1260gccaggataa atgatcctga
cttggatgtt actgccgatg atatattagt aatgaagaac 1320tgtggtccta
gaggttatca tggtatggca gaagtaggca acatgggact gcctgcaaaa
1380ctactggagc agggtgtcac ggacatggtc cgtatttcag atgcaagaat
gagtggaacc 1440gcttacggta ctgttgtatt gcatgtagct cctgaagctg
ctgccggtgg tcccttagct 1500gccgttcgta atggcgattg gatcgcacta
gacgcatatt caggaaaatt acacttggag 1560gtcgatgatg ctgaaatagc
gtccagatta gcagaggcag acccaacagc tgaatcaact 1620aggatagcgt
caacaggagg ttacagacaa ctttacattg aacatgtttt gcaagctgat
1680caaggctgtg atttcgattt cttagttgga tgcaggggcg cagaagtccc
aagacattcc 1740cactaa 1746
96 990 DNA Picrophilus torridus
96 atggaaaagg
tttatacgga gaacgaccta aaggaaaact tgatgcgtaa caaaaagata 60gcagttctag
gttatggctc acaaggtaga gcttgggcat taaatatgag agacagcgga
120ttaaatgtga cagtgggatt ggaaagacag gggaaatctt gggaaaaagc
cgttgctgat 180ggctttaagc cacttaagtc aagagatgct gttagagacg
ctgacgcagt cattttctta 240gtcccagaca tggcccagag agaattatat
aagaatatta tgaatgatat taaagatgac 300gcagacatcg tttttgccca
cggctttaac gttcattatg gtcttattaa tcctaaaaac 360catgatgttt
acatggtggc tcctaaagca cccggcccat cggtaaggga gttttacgaa
420agagggggag gggtcccggt tcttattgct gttgcaaatg atgtctcggg
ccgttctaaa 480gaaaaggcgt taagtatagc gtatagcttg ggtgccttga
gagcaggtgc gattgaaacc 540accttcaaag aggaaactga aacagaccta
atcggtgaac aattggatct ggttggaggt 600attactgaat tactaagatc
aacgtttaat attatggttg aaatgggtta taaaccagaa 660atggcttatt
ttgaggccat caatgagatg aagttgatag tagaccaggt attcgaaaaa
720ggtatttctg gtatgcttag agccgtaagt gataccgcta aatatggagg
tctgacaact 780ggtaagtaca taataaatga tgatgtaaga aaaaggatga
gggaaagggc agaatacatt 840gtgtcaggaa aattcgctga ggagtggatt
gaagaatacg gcgagggttc taagaatctg 900gaaagtatga tgttggatat
cgataactcc ctagaagagc aagttggaaa gcaattaaga 960gaaatcgtct
taaggggacg tcctaagtaa 990
97 1683 DNA Sulfolobus tokodaii
97 atgaacccag
acaagaaaaa acgttcgaat ctgatatatg gtggatacga gaaggctcct 60aacagggcct
tcttgaaagc catgggcttg acggatgatg acatcgctaa accaatagtc
120ggtgtcgctg ttgcttggaa tgaagctggc ccatgtaata ttcatttact
aggtttatct 180aatattgtta aagaaggagt gaggtcaggg ggtggcactc
cgagggtatt taccgcccct 240gttgtgattg acggtatcgc aatgggttct
gaagggatga agtattccct tgtttcaaga 300gaaattgtgg caaatacggt
cgagcttgtg gttaatgctc acgggtacga tggtttcgtt 360gcattagctg
ggtgtgacaa gactccacca ggaatgatga tggcaatggc tagattaaac
420attcccagca ttatcatgta tggaggcaca acactacctg gtaatttcaa
aggaaaaccc 480atcactatcc aggatgtata tgaggctgtt ggggcttatt
ctaaaggaaa gattacagca 540gaagatttaa gattgatgga agataatgct
attccaggtc cgggaacctg cggcggtcta 600tacacagcca atactatggg
cttaatgaca gaagcccttg gtcttgcgct accaggcagt 660gcttctcctc
cagcagtgga tagtgcaagg gtaaaatatg catacgaaac gggtaaagcc
720ctaatgaatt taatcgaaat cgggttaaaa cctcgtgaca ttcttacctt
tgaagccttt 780gaaaacgcaa taaccgtatt gatggcgtcg ggcggatcaa
ccaacgcagt gttgcattta 840ctggcgatag catacgaagc aggcgttaaa
ttaactttag atgattttga tcgtatatcc 900caaagaacac cagaaattgt
taacatgaag cctggaggtg aatacgctat gtacgatttg 960catagggtcg
gtggtgctcc cctgataatg aagaaattgc ttgaggccga cttattgcac
1020ggtgatgtaa taactgttac tggtaagacc gtcaaacaga atcttgagga
gtataagttg 1080ccaaatgttc cacacgaaca cattgtcagg cccatatcca
acccttttaa cccaacagga 1140gggataagaa ttttgaaggg ttcactggct
ccagagggcg cagtaattaa agtctccgcc 1200actaaggtga gataccataa
gggtccagcg agagtcttca attccgaaga ggaagccttt 1260aaggcagttc
tggaagaaaa aatccaagag aatgatgtag ttgttatcag atatgaagga
1320cctaagggcg gtcctggaat gcgtgaaatg ttggctgtca cgtcggctat
cgtgggtcaa 1380ggtttaggtg aaaaagttgc cttgattact gacggtagat
tttcaggagc cacgagaggt 1440attatggtcg gacatgtagc tcccgaggcg
gcagtaggtg gtccgatagc tttgctgagg 1500gacggtgaca caatcataat
tgatgcaaat aatggcagac tagacgtcga tctacctcaa 1560gaagaattaa
agaaaagagc tgatgagtgg acgcctcctc ccccgaaata taaaagtgga
1620ttattggctc aatacgctag actagttagc agttcttcac taggtgcggt
gctattgact 1680taa 1683
98 1476 DNA Artificial Sequence E. coli ilvC(Q110V)
98 atggccaact attttaacac attaaatttg agacaacaat
tggctcaact gggtaagtgc 60agatttatgg gaagggacga gtttgctgat ggtgcttctt
atctgcaagg aaagaaagta 120gtaattgttg gctgcggtgc tcagggtcta
aaccaaggtt taaacatgag agattcaggt 180ctggatattt cgtatgcatt
gaggaaagag gcaattgcag aaaagagggc ctcctggcgt 240aaagcgacgg
aaaatgggtt caaagttggt acttacgaag aactgatccc tcaggcagat
300ttagtgatta acctaacacc agataaggtt cactcagacg tagtaagaac
agttcaaccg 360ctgatgaagg atggggcagc tttaggttac tctcatggct
ttaatatcgt tgaagtgggc 420gagcagatca gaaaagatat aacagtcgta
atggttgcac caaagtgccc aggtacggaa 480gtcagagagg agtacaagag
gggttttggt gtacctacat tgatcgccgt acatcctgaa 540aatgacccca
aaggtgaagg tatggcaatt gcgaaggcat gggcagccgc aaccggaggt
600catagagcgg gtgtgttaga gagttctttc gtagctgagg tcaagagtga
cttaatgggt 660gaacaaacca ttctgtgcgg aatgttgcag gcagggtctt
tactatgctt tgataaattg 720gtcgaagagg gtacagatcc tgcctatgct
gaaaagttga tacaatttgg ttgggagaca 780atcaccgagg cacttaaaca
aggtggcata acattgatga tggatagact ttcaaatccg 840gccaagctaa
gagcctacgc cttatctgag caactaaaag agatcatggc accattattc
900caaaagcaca tggacgatat tatctccggt gagttttcct caggaatgat
ggcagattgg 960gcaaacgatg ataaaaagtt attgacgtgg agagaagaaa
ccggcaagac ggcattcgag 1020acagccccac aatacgaagg taaaattggt
gaacaagaat actttgataa gggagtattg 1080atgatagcta tggtgaaggc
aggggtagaa cttgcattcg aaactatggt tgactccggt 1140atcattgaag
aatctgcata ctatgagtct ttgcatgaat tgcctttgat agcaaatact
1200attgcaagaa aaagacttta cgagatgaat gttgtcatat cagacactgc
agaatatggt 1260aattacttat ttagctacgc atgtgtcccg ttgttaaagc
ccttcatggc cgagttacaa 1320cctggtgatt tggggaaggc tattccggaa
ggagcggttg acaatggcca actgagagac 1380gtaaatgaag ctattcgttc
acatgctata gaacaggtgg gtaaaaagct gagaggatat 1440atgaccgata
tgaaaagaat tgcagtggca ggatga 1476
99 1647 DNA Lactococcus lactis
99 atgtatactg ttggtgatta tctgctggac cgtctgcatg aactgggtat cgaagaaatc
60ttcggcgttc cgggtgatta caatctgcag ttcctggatc agatcatctc tcataaagac
120atgaaatggg tgggtaacgc taacgaactg aacgcaagct acatggcaga
tggttatgca 180cgtaccaaga aagccgcggc atttctgacc actttcggtg
ttggcgaact gagcgccgtc 240aacggtctgg cgggctccta cgccgaaaac
ctgccggtgg tggagatcgt aggcagccca 300acgagcaaag ttcagaacga
aggtaaattc gtccaccaca ctctggctga cggcgatttc 360aaacacttca
tgaaaatgca tgaacctgtg actgcggcac gtacgctgct gactgcagag
420aacgctactg tggaaatcga ccgcgttctg tctgcgctgc tgaaagaacg
caaaccagtt 480tacatcaacc tgcctgtgga tgttgcggca gctaaagcgg
aaaaaccgag cctgccgctg 540aagaaagaaa actccacttc taacactagc
gaccaggaaa tcctgaacaa aatccaggag 600tctctgaaaa acgcaaagaa
accaatcgtg atcaccggcc acgaaatcat ttcttttggt 660ctggagaaga
ccgtgaccca attcatcagc aaaaccaaac tgccgattac caccctgaac
720ttcggcaagt cctctgttga cgaggctctg ccgtctttcc tgggcatcta
caacggtact 780ctgagcgaac cgaacctgaa agaatttgtt gaatctgcgg
acttcatcct gatgctgggc 840gttaaactga ccgactcttc taccggtgca
ttcactcacc atctgaacga aaacaaaatg 900attagcctga acatcgacga
gggtaaaatc ttcaacgagc gtatccagaa cttcgacttc 960gaaagcctga
tcagctctct gctggacctg tccgaaatcg agtataaagg caaatacatt
1020gacaaaaagc aagaagattt cgtaccatct aacgcactgc tgtcccagga
tcgcctgtgg 1080caggccgtgg agaacctgac ccagagcaat gaaaccatcg
tggcggaaca aggtacgagc 1140tttttcggcg cgtcttctat ctttctgaaa
tccaaaagcc attttatcgg tcagccgctg 1200tggggtagca ttggctatac
tttcccggca gcgctgggct ctcagatcgc tgataaagaa 1260tctcgtcatc
tgctgttcat cggtgacggt tccctgcagc tgaccgtaca ggaactgggt
1320ctggcaattc gtgaaaagat caacccgatt tgcttcatta ttaacaatga
cggctacacc 1380gttgagcgtg agatccacgg tccgaaccag tcttacaacg
atatccctat gtggaactac 1440tctaaactgc cggagtcctt cggcgcaact
gaggaccgtg ttgtgtctaa aattgtgcgt 1500accgaaaacg aatttgtgag
cgtgatgaaa gaggcccagg ccgatccgaa ccgtatgtac 1560tggatcgaac
tgatcctggc gaaagaaggc gcaccgaagg tactgaagaa aatgggcaag
1620ctgtttgctg aacagaataa atcctaa 1647
100 1188 DNA Saccharomyces cerevisiae
100 atgttgagaa ctcaagccgc cagattgatc tgcaactccc
gtgtcatcac tgctaagaga 60acctttgctt tggccacccg tgctgctgct tacagcagac
cagctgcccg tttcgttaag 120ccaatgatca ctacccgtgg tttgaagcaa
atcaacttcg gtggtactgt tgaaaccgtc 180tacgaaagag ctgactggcc
aagagaaaag ttgttggact acttcaagaa cgacactttt 240gctttgatcg
gttacggttc ccaaggttac ggtcaaggtt tgaacttgag agacaacggt
300ttgaacgtta tcattggtgt ccgtaaagat ggtgcttctt ggaaggctgc
catcgaagac 360ggttgggttc caggcaagaa cttgttcact gttgaagatg
ctatcaagag aggtagttac 420gttatgaact tgttgtccga tgccgctcaa
tcagaaacct ggcctgctat caagccattg 480ttgaccaagg gtaagacttt
gtacttctcc cacggtttct ccccagtctt caaggacttg 540actcacgttg
aaccaccaaa ggacttagat gttatcttgg ttgctccaaa gggttccggt
600agaactgtca gatctttgtt caaggaaggt cgtggtatta actcttctta
cgccgtctgg 660aacgatgtca ccggtaaggc tcacgaaaag gcccaagctt
tggccgttgc cattggttcc 720ggttacgttt accaaaccac tttcgaaaga
gaagtcaact ctgacttgta cggtgaaaga 780ggttgtttaa tgggtggtat
ccacggtatg ttcttggctc aatacgacgt cttgagagaa 840aacggtcact
ccccatctga agctttcaac gaaaccgtcg aagaagctac ccaatctcta
900tacccattga tcggtaagta cggtatggat tacatgtacg atgcttgttc
caccaccgcc 960agaagaggtg ctttggactg gtacccaatc ttcaagaatg
ctttgaagcc tgttttccaa 1020gacttgtacg aatctaccaa gaacggtacc
gaaaccaaga gatctttgga attcaactct 1080caacctgact acagagaaaa
gctagaaaag gaattagaca ccatcagaaa catggaaatc 1140tggaaggttg
gtaaggaagt cagaaagttg agaccagaaa accaataa 1188
101 20 DNA Artificial Sequence Primer 1321
101 aatcatatcg aacacgatgc 20
102 20 DNA Artificial Sequence Primer 1322
102 tcagaaagga tcttctgctc 20
103 20 DNA Artificial Sequence Primer 1323
103 atcgatatcg tgaaatacgc 20
104 20 DNA Artificial Sequence Primer 1324
104 agctggtctg gtgattctac 20
105 38 DNA Artificial Sequence Primer 1409
105 attgatgcgg ccgcgattta atctctaatt attagtta 38
106 34 DNA Artificial Sequence Primer 1410
106 cacccagtcg cgacatccaa tttatagaaa tcag 34
107 32 DNA Artificial Sequence Primer 1411
107 attggatgtc gcgactgggt gagcatatgt tc 32
108 32 DNA Artificial Sequence Primer 1412
108 gagaaagccg gcaggagagt gaaagagcct tg 32
109 21 DNA Artificial Sequence Primer 1440
109 atcgtacatc ttccaagcat c 21
110 20 DNA Artificial Sequence Primer 1441
110 aatcggaacc ctaaagggag 20
111 20 DNA Artificial Sequence Primer 1443
111 tgcagatgca gatgtgagac 20
112 24 DNA Artificial Sequence Primer 1587
112 cggctgccag aactctacta actg 24
113 23 DNA Artificial Sequence Primer 1588
113 gcgacgtcta ctggcaggtt aat 23
114 24 DNA Artificial Sequence Primer 1633
114 tccgtcactg gattcaatgc catc 24
115 20 DNA Artificial Sequence Primer 1634
115 ttcgccaggg agctggtgaa 20
116 771 DNA Drosophila melanogaster
116 atgtcgttta ctttgaccaa caagaacgtg attttcgttg ccggtctggg
aggcattggt 60ctggacacca gcaaggagct gctcaagcgc gatctgaaga acctggtgat
cctcgaccgc 120attgagaacc cggctgccat tgccgagctg aaggcaatca
atccaaaggt gaccgtcacc 180ttctacccct atgatgtgac cgtgcccatt
gccgagacca ccaagctgct gaagaccatc 240ttcgcccagc tgaagaccgt
cgatgtcctg atcaacggag ctggtatcct ggacgatcac 300cagatcgagc
gcaccattgc cgtcaactac actggcctgg tcaacaccac gacggccatt
360ctggacttct gggacaagcg caagggcggt cccggtggta tcatctgcaa
cattggatcc 420gtcactggat tcaatgccat ctaccaggtg cccgtctact
ccggcaccaa ggccgccgtg 480gtcaacttca ccagctccct ggcgaaactg
gcccccatta ccggcgtgac ggcttacact 540gtgaaccccg gcatcacccg
caccaccctg gtgcacacgt tcaactcctg gttggatgtt 600gagcctcagg
ttgccgagaa gctcctggct catcccaccc agccctcgtt ggcctgcgcc
660gagaacttcg tcaaggctat cgagctgaac cagaacggag ccatctggaa
actggacttg 720ggcaccctgg aggccatcca gtggaccaag cactgggact
ccggcatcta a 771
117 8870 DNA Artificial Sequence pGV1914
117 tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct
gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag
aaaataccgc atcaggcgcc 240attcgccatt caggctgcgc aactgttggg
aagggcgatc ggtgcgggcc tcttcgctat 300tacgccagct ggcgaaaggg
ggatgtgctg caaggcgatt aagttgggta acgccagggt 360tttcccagtc
acgacgttgt aaaacgacgg ccagtgaatt cataccacag cttttcaatt
420caattcatca tttttttttt attctttttt ttgatttcgg tttccttgaa
atttttttga 480ttcggtaatc tccgaacaga aggaagaacg aaggaaggag
cacagactta gattggtata 540tatacgcata tgtagtgttg aagaaacatg
aaattgccca gtattcttaa cccaactgca 600cagaacaaaa acctgcagga
aacgaagata aatcatgtcg aaagctacat ataaggaacg 660tgctgctact
catcctagtc ctgttgctgc caagctattt aatatcatgc acgaaaagca
720aacaaacttg tgtgcttcat tggatgttcg taccaccaag gaattactgg
agttagttga 780agcattaggt cccaaaattt gtttactaaa aacacatgtg
gatatcttga ctgatttttc 840catggagggc acagttaagc cgctaaaggc
attatccgcc aagtacaatt ttttactctt 900cgaagacaga aaatttgctg
acattggtaa tacagtcaaa ttgcagtact ctgcgggtgt 960atacagaata
gcagaatggg cagacattac gaatgcacac ggtgtggtgg gcccaggtat
1020tgttagcggt ttgaagcagg cggcagaaga agtaacaaag gaacctagag
gccttttgat 1080gttagcagaa ttgtcatgca agggctccct atctactgga
gaatatacta agggtactgt 1140tgacattgcg aagagcgaca aagattttgt
tatcggcttt attgctcaaa gagacatggg 1200tggaagagat gaaggttacg
attggttgat tatgacaccc ggtgtgggtt tagatgacaa 1260gggagacgca
ttgggtcaac agtatagaac cgtggatgat gtggtctcta caggatctga
1320cattattatt gttggaagag gactatttgc aaagggaagg gatgctaagg
tagagggtga 1380acgttacaga aaagcaggct gggaagcata tttgagaaga
tgcggccagc aaaactaaaa 1440aactgtatta taagtaaatg catgtatact
aaactcacaa attagagctt caatttaatt 1500atatcagtta ttaccctatg
cggtgtgaaa taccgcacag atgcgtaagg agaaaatacc 1560gcatcaggaa
attgtaaacg ttaatatttt gttaaaattc gcgttaaatt tttgttaaat
1620cagctcattt tttaaccaat aggccgaaat cggcaaaatc ccttataaat
caaaagaata 1680gaccgagata gggttgagtg ttgttccagt ttggaacaag
agtccactat taaagaacgt 1740ggactccaac gtcaaagggc gaaaaaccgt
ctatcagggc gatggcccac tacgtgaacc 1800atcaccctaa tcaagttttt
tggggtcgag gtgccgtaaa gcactaaatc ggaaccctaa 1860agggagcccc
cgatttagag cttgacgggg aaagccggcg aggactgcaa tagcacaaga
1920ttaagataga atggcttcaa acagccgcct tttatacata ttggtaaaag
ctcgcgaatc 1980gcaccatatc ccttatcctg taatcaaatc gatctaggtg
cagatacaga tcaattcata 2040aaaagaaatt gaagcaccag tttatcacta
ctacactatc tttttctttt tttttttttt 2100ttgcgcagtt tcgccctttg
ttcaatatca cttgataagt tgtgggcttt ttctgtcact 2160cattcggctt
aaaaagtatt cgttcttttg tgttttatga aaagggaacg tgatataaaa
2220aaacatcctt tggtgtggga catgggcttt tgtttagaga atggttatca
ctaccgcccc 2280cacccttgaa agccacagaa aatgaaaaag tatgtgaata
aggtgtgaac tctataacat 2340tttggccaaa tgccacagcc gatctgcata
ttccaatgga catgatgcaa caacaattga 2400tgtcacattc tcttacacac
ttcgattggt ccgtacgtag tactttttac ataactgact 2460caggcgtttc
cttcattgaa atgctcatct attgccaagt acatagaatc cacagtgcat
2520aggttaacgc attgtaccca aacgacggga aacaaggaag gatgcagaat
gagcacttgt 2580tatttataaa aagacacggg agggggaatc ccgtctttcg
tccgtcggag ccaaagagat 2640gagccaaagc agaaaaacag gggacgccgc
ccttcttccg tcccgtgcgt gaggggggcg 2700cggccattcg gtttttgcaa
tatgacctgt gggccaaaaa tcgaaaaaaa aaaaaaaaat 2760aagaggcggc tgcggaattt tataagacaa gcgcagggcc
aaagaaaaaa taataattga 2820cgtggctgaa caacagtctc tccccacccc
tttccaaaaa ggggaatgaa atacgagttc 2880tttttcccaa ttggtagata
ttcaacaaga gacgcgcagt acgtaacatg cgaattgcgt 2940aattcacggc
gataacgtag tatttagatt tagtataatt tgaaccgatg tatttatttg
3000tctgattgat ttatgtattc aaactgtgta agtttattta tttgcaacaa
taattcgttt 3060gagtacacta ctaatggcgg ccgcttagat gccggagtcc
cagtgcttgg tccactggat 3120ggcctccagg gtgcccaagt ccagtttcca
gatggctccg ttctggttca gctcgatagc 3180cttgacgaag ttctcggcgc
aggccaacga gggctgggtg ggatgagcca ggagcttctc 3240ggcaacctga
ggctcaacat ccaaccagga gttgaacgtg tgcaccaggg tggtgcgggt
3300gatgccgggg ttcacagtgt aagccgtcac gccggtaatg ggggccagtt
tcgccaggga 3360gctggtgaag ttgaccacgg cggccttggt gccggagtag
acgggcacct ggtagatggc 3420attgaatcca gtgacggatc caatgttgca
gatgatacca ccgggaccgc ccttgcgctt 3480gtcccagaag tccagaatgg
ccgtcgtggt gttgaccagg ccagtgtagt tgacggcaat 3540ggtgcgctcg
atctggtgat cgtccaggat accagctccg ttgatcagga catcgacggt
3600cttcagctgg gcgaagatgg tcttcagcag cttggtggtc tcggcaatgg
gcacggtcac 3660atcatagggg tagaaggtga cggtcacctt tggattgatt
gccttcagct cggcaatggc 3720agccgggttc tcaatgcggt cgaggatcac
caggttcttc agatcgcgct tgagcagctc 3780cttgctggtg tccagaccaa
tgcctcccag accggcaacg aaaatcacgt tcttgttggt 3840caaagtaaac
gacataccgg tatctcctag atccgtcgaa gtcgaaacta agttctggtg
3900ttttaaaact aaaaaaaaga ctaactataa aagtagaatt taagaagttt
aagaaataga 3960tttacagaat tacaatcaat acctaccgtc tttatatact
tattagtcaa gtaggggaat 4020aatttcaggg aactggtttc aacctttttt
ttcagctttt tccaaatcag agagagcaga 4080aggtaataga aggtgtaaga
aaatgagata gatacatgcg tgggtcaatt gccttgtgtc 4140atcatttact
ccaggcaggt tgcatcactc cattgaggtt gtgcccgttt tttgcctgtt
4200tgtgcccctg ttctctgtag ttgcgctaag agaatggacc tatgaactga
tggttggtga 4260agaaaacaat attttggtgc tgggattctt tttttttctg
gatgccagct taaaaagcgg 4320gctccattat atttagtgga tgccaggaat
aaactgttca cccagacacc tacgatgtta 4380tatattctgt gtaacccgcc
ccctattttg ggcatgtacg ggttacagca gaattaaaag 4440gctaattttt
tgactaaata aagttaggaa aatcactact attaattatt tacgtattct
4500ttgaaatggc gagtattgat aatgataaac tggatcctta ggatttattc
tgttcagcaa 4560acagcttgcc cattttcttc agtaccttcg gtgcgccttc
tttcgccagg atcagttcga 4620tccagtacat acggttcgga tcggcctggg
cctctttcat cacgctcaca aattcgtttt 4680cggtacgcac aattttagac
acaacacggt cctcagttgc gccgaaggac tccggcagtt 4740tagagtagtt
ccacataggg atatcgttgt aagactggtt cggaccgtgg atctcacgct
4800caacggtgta gccgtcattg ttaataatga agcaaatcgg gttgatcttt
tcacgaattg 4860ccagacccag ttcctgtacg gtcagctgca gggaaccgtc
accgatgaac agcagatgac 4920gagattcttt atcagcgatc tgagagccca
gcgctgccgg gaaagtatag ccaatgctac 4980cccacagcgg ctgaccgata
aaatggcttt tggatttcag aaagatagaa gacgcgccga 5040aaaagctcgt
accttgttcc gccacgatgg tttcattgct ctgggtcagg ttctccacgg
5100cctgccacag gcgatcctgg gacagcagtg cgttagatgg tacgaaatct
tcttgctttt 5160tgtcaatgta tttgccttta tactcgattt cggacaggtc
cagcagagag ctgatcaggc 5220tttcgaagtc gaagttctgg atacgctcgt
tgaagatttt accctcgtcg atgttcaggc 5280taatcatttt gttttcgttc
agatggtgag tgaatgcacc ggtagaagag tcggtcagtt 5340taacgcccag
catcaggatg aagtccgcag attcaacaaa ttctttcagg ttcggttcgc
5400tcagagtacc gttgtagatg cccaggaaag acggcagagc ctcgtcaaca
gaggacttgc 5460cgaagttcag ggtggtaatc ggcagtttgg ttttgctgat
gaattgggtc acggtcttct 5520ccagaccaaa agaaatgatt tcgtggccgg
tgatcacgat tggtttcttt gcgtttttca 5580gagactcctg gattttgttc
aggatttcct ggtcgctagt gttagaagtg gagttttctt 5640tcttcagcgg
caggctcggt ttttccgctt tagctgccgc aacatccaca ggcaggttga
5700tgtaaactgg tttgcgttct ttcagcagcg cagacagaac gcggtcgatt
tccacagtag 5760cgttctctgc agtcagcagc gtacgtgccg cagtcacagg
ttcatgcatt ttcatgaagt 5820gtttgaaatc gccgtcagcc agagtgtggt
ggacgaattt accttcgttc tgaactttgc 5880tcgttgggct gcctacgatc
tccaccaccg gcaggttttc ggcgtaggag cccgccagac 5940cgttgacggc
gctcagttcg ccaacaccga aagtggtcag aaatgccgcg gctttcttgg
6000tacgtgcata accatctgcc atgtagcttg cgttcagttc gttagcgtta
cccacccatt 6060tcatgtcttt atgagagatg atctgatcca ggaactgcag
attgtaatca cccggaacgc 6120cgaagatttc ttcgataccc agttcatgca
gacggtccag cagataatca ccaacagtat 6180acatgtcgac aaacttagat
tagattgcta tgctttcttt ctaatgagca agaagtaaaa 6240aaagttgtaa
tagaacaaga aaaatgaaac tgaaacttga gaaattgaag accgtttatt
6300aacttaaata tcaatgggag gtcatcgaaa gagaaaaaaa tcaaaaaaaa
aattttcaag 6360aaaaagaaac gtgataaaaa tttttattgc ctttttcgac
gaagaaaaag aaacgaggcg 6420gtctcttttt tcttttccaa acctttagta
cgggtaatta acgacaccct agaggaagaa 6480agaggggaaa tttagtatgc
tgtgcttggg tgttttgaag tggtacggcg atgcgcggag 6540tccgagaaaa
tctggaagag taaaaaagga gtagaaacat tttgaagcta tgagctccag
6600cttttgttcc ctttagtgag ggttaattgc gcgcttggcg taatcatggt
catagctgtt 6660tcctgtgtga aattgttatc cgctcacaat tccacacaac
ataggagccg gaagcataaa 6720gtgtaaagcc tggggtgcct aatgagtgag
gtaactcaca ttaattgcgt tgcgctcact 6780gcccgctttc cagtcgggaa
acctgtcgtg ccagctgcat taatgaatcg gccaacgcgc 6840ggggagaggc
ggtttgcgta ttgggcgctc ttccgcttcc tcgctcactg actcgctgcg
6900ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa
tacggttatc 6960cacagaatca ggggataacg caggaaagaa catgtgagca
aaaggccagc aaaaggccag 7020gaaccgtaaa aaggccgcgt tgctggcgtt
tttccatagg ctccgccccc ctgacgagca 7080tcacaaaaat cgacgctcaa
gtcagaggtg gcgaaacccg acaggactat aaagatacca 7140ggcgtttccc
cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg
7200atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct
cacgctgtag 7260gtatctcagt tcggtgtagg tcgttcgctc caagctgggc
tgtgtgcacg aaccccccgt 7320tcagcccgac cgctgcgcct tatccggtaa
ctatcgtctt gagtccaacc cggtaagaca 7380cgacttatcg ccactggcag
cagccactgg taacaggatt agcagagcga ggtatgtagg 7440cggtgctaca
gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt
7500tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta
gctcttgatc 7560cggcaaacaa accaccgctg gtagcggtgg tttttttgtt
tgcaagcagc agattacgcg 7620cagaaaaaaa ggatctcaag aagatccttt
gatcttttct acggggtctg acgctcagtg 7680gaacgaaaac tcacgttaag
ggattttggt catgagatta tcaaaaagga tcttcaccta 7740gatcctttta
aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg
7800gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct
gtctatttcg 7860ttcatccata gttgcctgac tccccgtcgt gtagataact
acgatacggg agggcttacc 7920atctggcccc agtgctgcaa tgataccgcg
agacccacgc tcaccggctc cagatttatc 7980agcaataaac cagccagccg
gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc 8040ctccatccag
tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag
8100tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt
cgtttggtat 8160ggcttcattc agctccggtt cccaacgatc aaggcgagtt
acatgatccc ccatgttgtg 8220caaaaaagcg gttagctcct tcggtcctcc
gatcgttgtc agaagtaagt tggccgcagt 8280gttatcactc atggttatgg
cagcactgca taattctctt actgtcatgc catccgtaag 8340atgcttttct
gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg
8400accgagttgc tcttgcccgg cgtcaatacg ggataatacc gcgccacata
gcagaacttt 8460aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa
ctctcaagga tcttaccgct 8520gttgagatcc agttcgatgt aacccactcg
tgcacccaac tgatcttcag catcttttac 8580tttcaccagc gtttctgggt
gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat 8640aagggcgaca
cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat
8700ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga
aaaataaaca 8760aataggggtt ccgcgcacat ttccccgaaa agtgccacct
gacgtctaag aaaccattat 8820tatcatgaca ttaacctata aaaataggcg
tatcacgagg ccctttcgtc 8870
118 4857 DNA Artificial Sequence SacI-NotI fragment
118 gagctcatag cttcaaaatg tttctactcc ttttttactc ttccagattt
tctcggactc 60cgcgcatcgc cgtaccactt caaaacaccc aagcacagca tactaaattt
cccctctttc 120ttcctctagg gtgtcgttaa ttacccgtac taaaggtttg
gaaaagaaaa aagagaccgc 180ctcgtttctt tttcttcgtc gaaaaaggca
ataaaaattt ttatcacgtt tctttttctt 240gaaaattttt tttttgattt
ttttctcttt cgatgacctc ccattgatat ttaagttaat 300aaacggtctt
caatttctca agtttcagtt tcatttttct tgttctatta caactttttt
360tacttcttgc tcattagaaa gaaagcatag caatctaatc taagtttgtc
gacatgaaga 420agctcaacaa gtactcgtat atcatcactg aacctaaggg
ccaaggtgcg tcccaggcca 480tgctttatgc caccggtttc aagaaggaag
atttcaagaa gcctcaagtc ggggttggtt 540cctgttggtg gtccggtaac
ccatgtaaca tgcatctatt ggacttgaat aacagatgtt 600ctcaatccat
tgaaaaagcg ggtttgaaag ctatgcagtt caacaccatc ggtgtttcag
660acggtatctc tatgggtact aaaggtatga gatactcgtt acaaagtaga
gaaatcattg 720cagactcctt tgaaaccatc atgatggcac aacactacga
tgctaacatc gccatcccat 780catgtgacaa aaacatgccc ggtgtcatga
tggccatggg tagacataac agaccttcca 840tcatggtata tggtggtact
atcttgcccg gtcatccaac atgtggttct tcgaagatct 900ctaaaaacat
cgatatcgtc tctgcgttcc aatcctacgg tgaatatatt tccaagcaat
960tcactgaaga agaaagagaa gatgttgtgg aacatgcatg cccaggtcct
ggttcttgtg 1020gtggtatgta tactgccaac acaatggctt ctgccgctga
agtgctaggt ttgaccattc 1080caaactcctc ttccttccca gccgtttcca
aggagaagtt agctgagtgt gacaacattg 1140gtgaatacat caagaagaca
atggaattgg gtattttacc tcgtgatatc ctcacaaaag 1200aggcttttga
aaacgccatt acttatgtcg ttgcaaccgg tgggtccact aatgctgttt
1260tgcatttggt ggctgttgct cactctgcgg gtgtcaagtt gtcaccagat
gatttccaaa 1320gaatcagtga tactacacca ttgatcggtg acttcaaacc
ttctggtaaa tacgtcatgg 1380ccgatttgat taacgttggt ggtacccaat
ctgtgattaa gtatctatat gaaaacaaca 1440tgttgcacgg taacacaatg
actgttaccg gtgacacttt ggcagaacgt gcaaagaaag 1500caccaagcct
acctgaagga caagagatta ttaagccact ctcccaccca atcaaggcca
1560acggtcactt gcaaattctg tacggttcat tggcaccagg tggagctgtg
ggtaaaatta 1620ccggtaagga aggtacttac ttcaagggta gagcacgtgt
gttcgaagag gaaggtgcct 1680ttattgaagc cttggaaaga ggtgaaatca
agaagggtga aaaaaccgtt gttgttatca 1740gatatgaagg tccaagaggt
gcaccaggta tgcctgaaat gctaaagcct tcctctgctc 1800tgatgggtta
cggtttgggt aaagatgttg cattgttgac tgatggtaga ttctctggtg
1860gttctcacgg gttcttaatc ggccacattg ttcccgaagc cgctgaaggt
ggtcctatcg 1920ggttggtcag agacggcgat gagattatca ttgatgctga
taataacaag attgacctat 1980tagtctctga taaggaaatg gctcaacgta
aacaaagttg ggttgcacct ccacctcgtt 2040acacaagagg tactctatcc
aagtatgcta agttggtttc caacgcttcc aacggttgtg 2100ttttagatgc
ttgaggatcc agtttatcat tatcaatact cgccatttca aagaatacgt
2160aaataattaa tagtagtgat tttcctaact ttatttagtc aaaaaattag
ccttttaatt 2220ctgctgtaac ccgtacatgc ccaaaatagg gggcgggtta
cacagaatat ataacatcgt 2280aggtgtctgg gtgaacagtt tattcctggc
atccactaaa tataatggag cccgcttttt 2340aagctggcat ccagaaaaaa
aaagaatccc agcaccaaaa tattgttttc ttcaccaacc 2400atcagttcat
aggtccattc tcttagcgca actacagaga acaggggcac aaacaggcaa
2460aaaacgggca caacctcaat ggagtgatgc aacctgcctg gagtaaatga
tgacacaagg 2520caattgaccc acgcatgtat ctatctcatt ttcttacacc
ttctattacc ttctgctctc 2580tctgatttgg aaaaagctga aaaaaaaggt
tgaaaccagt tccctgaaat tattccccta 2640cttgactaat aagtatataa
agacggtagg tattgattgt aattctgtaa atctatttct 2700taaacttctt
aaattctact tttatagtta gtcttttttt tagttttaaa acaccagaac
2760ttagtttcga ctcgagatgg ccaactattt taacacatta aatttgagac
aacaattggc 2820tcaactgggt aagtgcagat ttatgggaag ggacgagttt
gctgatggtg cttcttatct 2880gcaaggaaag aaagtagtaa ttgttggctg
cggtgctcag ggtctaaacc aaggtttaaa 2940catgagagat tcaggtctgg
atatttcgta tgcattgagg aaagaggcaa ttgcagaaaa 3000gagggcctcc
tggcgtaaag cgacggaaaa tgggttcaaa gttggtactt acgaagaact
3060gatccctcag gcagatttag tgattaacct aacaccagat aaggttcact
cagacgtagt 3120aagaacagtt caaccgctga tgaaggatgg ggcagcttta
ggttactctc atggctttaa 3180tatcgttgaa gtgggcgagc agatcagaaa
agatataaca gtcgtaatgg ttgcaccaaa 3240gtgcccaggt acggaagtca
gagaggagta caagaggggt tttggtgtac ctacattgat 3300cgccgtacat
cctgaaaatg accccaaagg tgaaggtatg gcaattgcga aggcatgggc
3360agccgcaacc ggaggtcata gagcgggtgt gttagagagt tctttcgtag
ctgaggtcaa 3420gagtgactta atgggtgaac aaaccattct gtgcggaatg
ttgcaggcag ggtctttact 3480atgctttgat aaattggtcg aagagggtac
agatcctgcc tatgctgaaa agttgataca 3540atttggttgg gagacaatca
ccgaggcact taaacaaggt ggcataacat tgatgatgga 3600tagactttca
aatccggcca agctaagagc ctacgcctta tctgagcaac taaaagagat
3660catggcacca ttattccaaa agcacatgga cgatattatc tccggtgagt
tttcctcagg 3720aatgatggca gattgggcaa acgatgataa aaagttattg
acgtggagag aagaaaccgg 3780caagacggca ttcgagacag ccccacaata
cgaaggtaaa attggtgaac aagaatactt 3840tgataaggga gtattgatga
tagctatggt gaaggcaggg gtagaacttg cattcgaaac 3900tatggttgac
tccggtatca ttgaagaatc tgcatactat gagtctttgc atgaattgcc
3960tttgatagca aatactattg caagaaaaag actttacgag atgaatgttg
tcatatcaga 4020cactgcagaa tatggtaatt acttatttag ctacgcatgt
gtcccgttgt taaagccctt 4080catggccgag ttacaacctg gtgatttggg
gaaggctatt ccggaaggag cggttgacaa 4140tggccaactg agagacgtaa
atgaagctat tcgttcacat gctatagaac aggtgggtaa 4200aaagctgaga
ggatatatga ccgatatgaa aagaattgca gtggcaggat gaagatccgc
4260ggccgctcga gtcatgtaat tagttatgtc acgcttacat tcacgccctc
cccccacatc 4320cgctctaacc gaaaaggaag gagttagaca acctgaagtc
taggtcccta tttatttttt 4380tatagttatg ttagtattaa gaacgttatt
tatatttcaa atttttcttt tttttctgta 4440cagacgcgtg tacgcatgta
acattatact gaaaaccttg cttgagaagg ttttgggacg 4500ctcgaaggct
ttaatttgcg gccggtaccc aattcgccct atagtgagtc gtattacgcg
4560cgctcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt
tacccaactt 4620aatcgccttg cagcacatcc ccctttcgcc agctggcgta
atagcgaaga ggcccgcacc 4680gatcgccctt cccaacagtt gcgcagcctg
aatggcgaat ggcgcgacgc gccctgtagc 4740ggcgcattaa gcgcggcggg
tgtggtggtt acgcgcagcg tgaccgctac acttgccagc 4800gccctagcgc
ccgctccttt cgctttcttc ccttcctttc tcgccacgtt cgccggc 4857
119 34 DNA Artificial Sequence Primer 421
119 gccaacggat cctcaagcat ctaaaacaca accg 34
120 36 DNA Artificial Sequence Primer 551
120 gctcatgtcg acatgaagaa gctcaacaag tactcg 36
121 67 DNA Artificial Sequence Primer 269
121 ctagcatgta cccatacgat gttcctgact atgcgggtgt cgacgaattc ccgggatccg 60cggccgc 67
122 67 DNA Artificial Sequence Primer 270
122 tcgagcggcc gcggatcccg ggaattcgtc gacacccgca tagtcaggaa catcgtatgg 60gtacatg 67
123 35 DNA Artificial Sequence Primer 1842
123 ttttggatcc ctaccaatcc tggtggactt tatcg 35
124 37 DNA Artificial Sequence Primer 2163
124 ttggtagtcg acatggttta cactccatcc aagggtc 37
125 32 DNA Artificial Sequence Primer 2183
125 acagtagtcg acatgacaga gcagaaagcc ct 32
126 34 DNA Artificial Sequence Primer 2184
126 tacatcggat ccctacataa gaacaccttt ggtg 34
127 41 DNA Artificial Sequence Primer 2195
127 ttgttcctcg agatggagga acaggagata ggcgttcctg c 41
128 42 DNA Artificial Sequence Primer 2196
128 gttcttgcgg ccgcttattt tggagattct atctggggtt gc 42
129 40 DNA Artificial Sequence Primer 2197
129 ttcttggtcg acatgagtgc tctactgtcc gagtctgacc 40
130 43 DNA Artificial Sequence Primer 2198
130 ttgttcggat ccttaccagg tgctcccaac agagacgaga tcc 43
131 42 DNA Artificial Sequence Primer 2259
131 tcagtaagat ctatgactga gatactacca catgtaaacg ac 42
132 44 DNA Artificial Sequence Primer 2260
132 catatcctcg aggtacccta tacatccccc acagcatctc gcag 44
133 7685 DNA Artificial Sequence pGV2074
133 ttggatcata
ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca 60cgaggccctt
tcgtctcgcg cgtttcggtg atgacggtga aaacctctga cacatgcagc
120tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa
gcccgtcagg 180gcgcgtcagc gggtgttggc gggtgtcggg gctggcttaa
ctatgcggca tcagagcaga 240ttgtactgag agtgcaccat accacagctt
ttcaattcaa ttcatcattt tttttttatt 300cttttttttg atttcggttt
ctttgaaatt tttttgattc ggtaatctcc gaacagaagg 360aagaacgaag
gaaggagcac agacttagat tggtatatat acgcatatgg caaattaaag
420ccttcgagcg tcccaaaacc ttctcaagca aggttttcag tataatgtta
catgcgtaca 480cgcgtctgta cagaaaaaaa agaaaaattt gaaatataaa
taacgttctt aatactaaca 540taactataaa aaaataaata gggacctaga
cttcaggttg tctaactcct tccttttcgg 600ttagagcgga tgtgggggga
gggcgtgaat gtaagcgtga cataagaatt cttattcctt 660tgccctcgga
cgagtgctgg ggcgtcggtt tccactatcg gcgagtactt ctacacagcc
720atcggtccag acggccgcgc ttctgcgggc gatttgtgta cgcccgacag
tcccggctcc 780ggatcggacg attgcgtcgc atcgaccctg cgcccaagct
gcatcatcga aattgccgtc 840aaccaagctc tgatagagtt ggtcaagacc
aatgcggagc atatacgccc ggaggcgcgg 900cgatcctgca agctccggat
gcctccgctc gaagtagcgc gtctgctgct ccatacaagc 960caaccacggc
ctccagaaga ggatgttggc gacctcgtat tgggaatccc cgaacatcgc
1020ctcgctccag tcaatgaccg ctgttatgcg gccattgtcc gtcaggacat
tgttggagcc 1080gaaatccgca tgcacgaggt gccggacttc ggggcagtcc
tcggcccaaa gcatcagctc 1140atcgagagcc tgcgcgacgg acgcactgac
ggtgtcgtcc atcacagttt gccagtgata 1200cacatgggga tcagcaatcg
cgcatatgaa atcacgccat gtagtgtatt gaccgattcc 1260ttgcggtccg
aatgggccga acccgctcgt ctggctaaga tcggccgcag cgatcgcatc
1320catggcctcc gcgaccggct ggagaacagc gggcagttcg gtttcaggca
ggtcttgcaa 1380cgtgacaccc tgtgcacggc gggagatgca ataggtcagg
ctctcgctga actccccaat 1440gtcaagcact tccggaatcg ggagcgcggc
cgatgcaaag tgccgataaa cataacgatc 1500tttgtagaaa ccatcggcgc
agctatttac ccgcaggaca tatccacgcc ctcctacatc 1560gaagctgaaa
gcacgagatt cttcgccctc cgagagctgc atcaggtcgg agacgctgtc
1620gaacttttcg atcagaaact tctcgacaga cgtcgcggtg agttcaggct
ttttacccat 1680actagttttt agtttatgta tgtgtttttt gtagttatag
atttaagcaa gaaaagaata 1740caaacaaaaa attgaaaaag attgatttag
aattaaaaag aaaaatattt acgtaagaag 1800ggaaaatagt aaatgttgca
agttcactaa actcctaaat tatgctgccc tttatattcc 1860ctgttacagc
agccgagcca aaggtatata ggctcctttg cattagcatg cgtaacaaac
1920cacctgtcag tttcaaccga ggtggtatcc gagagaattg tgtgattgct
ttaattaatt 1980tcggagaatc tcacatgcca ctgaagatta aaaactggat
gccagaaaag gggtgtccag 2040gtgtaacatc aatagaggaa gctgaaaagt
cttagaacgg gtaatcttcc accaacctga 2100tgggttccta gatataatct
cgaagggaat aagtagggtg ataccgcaga agtgtctgaa 2160tgtattaagg
tcctcacagt ttaaatcccg ctcacactaa cgtaggatta ttataactca
2220aaaaaatggc attattctaa gtaagttaaa tatccgtaat ctttaaacag
cggccgcaga 2280tctctcgagt cgaaactaag ttctggtgtt ttaaaactaa
aaaaaagact aactataaaa 2340gtagaattta agaagtttaa gaaatagatt
tacagaatta caatcaatac ctaccgtctt 2400tatatactta ttagtcaagt
aggggaataa tttcagggaa ctggtttcaa cctttttttt 2460cagctttttc
caaatcagag agagcagaag gtaatagaag gtgtaagaaa atgagataga
2520tacatgcgtg ggtcaattgc cttgtgtcat catttactcc aggcaggttg
catcactcca 2580ttgaggttgt gcccgttttt tgcctgtttg tgcccctgtt
ctctgtagtt gcgctaagag 2640aatggaccta tgaactgatg gttggtgaag
aaaacaatat tttggtgctg ggattctttt 2700tttttctgga tgccagctta
aaaagcgggc tccattatat ttagtggatg ccaggaataa 2760actgttcacc
cagacaccta cgatgttata tattctgtgt aacccgcccc ctattttggg
2820catgtacggg ttacagcaga attaaaaggc taattttttg actaaataaa
gttaggaaaa 2880tcactactat taattattta cgtattcttt gaaatggcga
gtattgataa tgataaactg 2940gatccgtcga caaacttaga ttagattgct
atgctttctt tctaatgagc aagaagtaaa 3000aaaagttgta atagaacaag
aaaaatgaaa ctgaaacttg agaaattgaa gaccgtttat 3060taacttaaat
atcaatggga ggtcatcgaa agagaaaaaa atcaaaaaaa aaattttcaa
3120gaaaaagaaa cgtgataaaa atttttattg cctttttcga cgaagaaaaa
gaaacgaggc 3180ggtctctttt ttcttttcca aacctttagt acgggtaatt
aacgacaccc tagaggaaga 3240aagaggggaa atttagtatg ctgtgcttgg
gtgttttgaa gtggtacggc gatgcgcgga 3300gtccgagaaa atctggaaga
gtaaaaaagg agtagaaaca ttttgaagct atgagctcag 3360atctgttaac
cttgttttat atttgttgta aaaagtagat aattacttcc ttgatgatct
3420gtaaaaaaga gaaaaagaaa gcatctaaga acttgaaaaa ctacgaatta
gaaaagacca 3480aatatgtatt tcttgcattg accaatttat gcaagtttat
atatatgtaa atgtaagttt 3540cacgaggttc tactaaacta aaccaccccc
ttggttagaa gaaaagagtg tgtgagaaca 3600ggctgttgtt gtcacacgat
tcggacaatt ctgtttgaaa gagagagagt aacagtacga 3660tcgaacgaac
tttgctctgg agatcacagt gggcatcata gcatgtggta ctaaaccctt
3720tcccgccatt ccagaacctt cgattgcttg ttacaaaacc tgtgagccgt
cgctaggacc 3780ttgttgtgtg acgaaattgg aagctgcaat caataggaag
acaggaagtc gagcgtgtct 3840gggttttttc agttttgttc tttttgcaaa
caaatcacga gcgacggtaa tttctttctc 3900gataagaggc cacgtgcttt
atgagggtaa catcaattca agaaggaggg aaacacttcc 3960tttttctggc
cctgataata gtatgagggt gaagccaaaa taaaggattc gcgcccaaat
4020cggcatcttt aaatgcaggt atgcgatagt tcctcactct ttccttactc
acgagtaatt 4080cttgcaaatg cctattatgc agatgttata atatctgtgc
gtcttgagtt gagcctaggg 4140agctccagct tttgttccct ttagtgaggg
ttaattgcgc gcttggcgta atcatggtca 4200tagctgtttc ctgtgtgaaa
ttgttatccg ctcacaattc cacacaacat aggagccgga 4260agcataaagt
gtaaagcctg gggtgcctaa tgagtgaggt aactcacatt aattgcgttg
4320cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta
atgaatcggc 4380caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt
ccgcttcctc gctcactgac 4440tcgctgcgct cggtcgttcg gctgcggcga
gcggtatcag ctcactcaaa ggcggtaata 4500cggttatcca cagaatcagg
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 4560aaggccagga
accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct
4620gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac
aggactataa 4680agataccagg cgtttccccc tggaagctcc ctcgtgcgct
ctcctgttcc gaccctgccg 4740cttaccggat acctgtccgc ctttctccct
tcgggaagcg tggcgctttc tcatagctca 4800cgctgtaggt atctcagttc
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 4860ccccccgttc
agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg
4920gtaagacacg acttatcgcc actggcagca gccactggta acaggattag
cagagcgagg 4980tatgtaggcg gtgctacaga gttcttgaag tggtggccta
actacggcta cactagaagg 5040acagtatttg gtatctgcgc tctgctgaag
ccagttacct tcggaaaaag agttggtagc 5100tcttgatccg gcaaacaaac
caccgctggt agcggtggtt tttttgtttg caagcagcag 5160attacgcgca
gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac
5220gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc
aaaaaggatc 5280ttcacctaga tccttttaaa ttaaaaatga agttttaaat
caatctaaag tatatatgag 5340taaacttggt ctgacagtta ccaatgctta
atcagtgagg cacctatctc agcgatctgt 5400ctatttcgtt catccatagt
tgcctgactc cccgtcgtgt agataactac gatacgggag 5460ggcttaccat
ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca
5520gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg
tcctgcaact 5580ttatccgcct ccatccagtc tattaattgt tgccgggaag
ctagagtaag tagttcgcca 5640gttaatagtt tgcgcaacgt tgttgccatt
gctacaggca tcgtggtgtc acgctcgtcg 5700tttggtatgg cttcattcag
ctccggttcc caacgatcaa ggcgagttac atgatccccc 5760atgttgtgca
aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg
5820gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac
tgtcatgcca 5880tccgtaagat gcttttctgt gactggtgag tactcaacca
agtcattctg agaatagtgt 5940atgcggcgac cgagttgctc ttgcccggcg
tcaatacggg ataataccgc gccacatagc 6000agaactttaa aagtgctcat
cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 6060ttaccgctgt
tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca
6120tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa
tgccgcaaaa 6180aagggaataa gggcgacacg gaaatgttga atactcatac
tcttcctttt tcaatattat 6240tgaagcattt atcagggtta ttgtctcatg
agcggataca tatttgaatg tatttagaaa 6300aataaacaaa taggggttcc
gcgcacattt ccccgaaaag tgccacctga acgaagcatc 6360tgtgcttcat
tttgtagaac aaaaatgcaa cgcgagagcg ctaatttttc aaacaaagaa
6420tctgagctgc atttttacag aacagaaatg caacgcgaaa gcgctatttt
accaacgaag 6480aatctgtgct tcatttttgt aaaacaaaaa tgcaacgcga
gagcgctaat ttttcaaaca 6540aagaatctga gctgcatttt tacagaacag
aaatgcaacg cgagagcgct attttaccaa 6600caaagaatct atacttcttt
tttgttctac aaaaatgcat cccgagagcg ctatttttct 6660aacaaagcat
cttagattac tttttttctc ctttgtgcgc tctataatgc agtctcttga
6720taactttttg cactgtaggt ccgttaaggt tagaagaagg ctactttggt
gtctattttc 6780tcttccataa aaaaagcctg actccacttc ccgcgtttac
tgattactag cgaagctgcg 6840ggtgcatttt ttcaagataa aggcatcccc
gattatattc tataccgatg tggattgcgc 6900atactttgtg aacagaaagt
gatagcgttg atgattcttc attggtcaga aaattatgaa 6960cggtttcttc
tattttgtct ctatatacta cgtataggaa atgtttacat tttcgtattg
7020ttttcgattc actctatgaa tagttcttac tacaattttt ttgtctaaag
agtaatacta 7080gagataaaca taaaaaatgt agaggtcgag tttagatgca
agttcaagga gcgaaaggtg 7140gatgggtagg ttatataggg atatagcaca
gagatatata gcaaagagat acttttgagc 7200aatgtttgtg gaagcggtat
tcgcaatatt ttagtagctc gttacagtcc ggtgcgtttt 7260tggttttttg
aaagtgcgtc ttcagagcgc ttttggtttt caaaagcgct ctgaagttcc
7320tatactttct agagaatagg aacttcggaa taggaacttc aaagcgtttc
cgaaaacgag 7380cgcttccgaa aatgcaacgc gagctgcgca catacagctc
actgttcacg tcgcacctat 7440atctgcgtgt tgcctgtata tatatataca
tgagaagaac ggcatagtgc gtgtttatgc 7500ttaaatgcgt acttatatgc
gtctatttat gtaggatgaa aggtagtcta gtacctcctg 7560tgatattatc
ccattccatg cggggtatcg tatgcttcct tcagcactac cctttagctg
7620ttctatatgc tgccactcct caattggatt agtctcatcc ttcaatgcta
tcatttcctt 7680tgata 7685

134 24 DNA Artificial Sequence Primer 587
134 ccaatgcaga ccgatcttct accc 24

135 22 DNA Artificial Sequence Primer 588
135 gatcacgtga tctgttgtat tg 22

136 20 DNA Artificial Sequence Primer 2167
136 tacatggggt acttctcctc 20

137 60 DNA Artificial Sequence Primer 2170
137 cagtcaacaa atataaagaa tattgaaatt gacagttttt gtcgctatcg atttttatta 60

138 60 DNA Artificial Sequence Primer 2171
138 ttttgtcgct atcgattttt attatttgct gttttaaatc attctggttc tatcgaggag 60

139 60 DNA Artificial Sequence Primer 2172
139 catgttattg acgccaggtt tggacgttgt ttttcactgt atccggatgt gaagtcgttg 60

140 60 DNA Artificial Sequence Primer 2173
140 tggttttaga aaaggatggt gtgcttgtcg ctgagacaca tgttattgac gccaggtttg 60

141 20 DNA Artificial Sequence Primer 2175
141 tctagttcag agcttggtgc 20

142 20 DNA Artificial Sequence Primer 2226
142 tgctccattt ggaagtctcg 20

143 20 DNA Artificial Sequence Primer 2227
143 tatctacgaa gtgacctgcg 20

144 1716 DNA Artificial Sequence B. subtilis alsS codon-optimized for expression in S. cerevisiae
144 atgttgacta
aagctacaaa agagcagaaa tcattggtga aaaatagggg tgcagaactt 60gttgtggact
gtttggtaga acagggcgta acacatgttt ttggtatccc aggtgcaaaa
120atcgacgccg tgtttgatgc attacaagac aagggtccag aaattattgt
tgctagacat 180gagcaaaatg ccgcatttat ggcgcaagct gtaggtaggc
ttacaggtaa acctggtgtt 240gtcctagtta cgtctggccc aggagcctcc
aatttagcaa ctggtctatt gacagctaat 300actgagggag atcctgtagt
tgcgttagcc ggtaatgtaa ttagagctga taggcttaag 360agaactcacc
agtctctaga caacgctgct ttattccaac cgatcaccaa gtactcagta
420gaggtacaag acgtaaagaa tatacctgaa gctgtgacaa acgcatttcg
tatagcttct 480gctggtcagg ctggtgccgc gtttgtttct tttcctcaag
acgttgtcaa tgaagtgacc 540aatactaaaa acgttagagc ggttgcagcc
cctaaactag gtccagccgc agacgacgca 600attagcgctg caattgctaa
aattcagacg gcgaaactac cagtagtcct tgtcggtatg 660aagggcggaa
gaccagaagc aataaaagct gttcgtaagt tattgaagaa agtccaatta
720cctttcgttg agacttacca agcagcaggt actttatcta gagatttaga
ggatcagtat 780tttggaagga taggtctatt tagaaaccaa ccaggagatt
tactattaga acaagctgat 840gttgtactta ctatcggtta tgatcctata
gagtatgacc caaagttttg gaacataaat 900ggggatagaa caattataca
tctagacgag ataatcgccg acatcgatca cgcttatcaa 960ccagatttag
aactaatcgg agatatcccg tcaacaatca atcatattga acatgatgct
1020gtaaaggttg agttcgctga acgtgagcag aaaatcttat ctgatctaaa
gcaatatatg 1080catgagggtg aacaagttcc agcagactgg aaatctgacc
gtgcacatcc tttggaaatc 1140gttaaggaac taagaaatgc ggtcgatgat
catgtgactg ttacatgtga tatcggttca 1200catgcaattt ggatgtcacg
ttattttagg agctacgaac cattaacttt aatgatatct 1260aacgggatgc
aaactctggg ggttgcactt ccttgggcta ttggcgctag tttagttaag
1320cccggtgaga aggtggtatc ggtatcaggt gatggtggct ttctgttttc
ggctatggaa 1380ttagaaactg cagtccgttt aaaagctccc attgtgcata
ttgtctggaa tgattctact 1440tacgacatgg ttgcttttca acagttgaag
aaatacaata gaacttcggc tgtagacttt 1500ggtaacatcg atattgtgaa
atatgctgag tcttttggcg caacaggcct gagggtggaa 1560agtccagatc
agttagctga tgtgttgaga caagggatga atgccgaggg accggtaatc
1620atagatgtgc cagttgacta ctcagacaat attaatttgg cttctgataa
acttcctaaa 1680gagtttggcg agctaatgaa gaccaaagcc ttataa
1716

145 11 PRT Thermotoga petrophila
145 Pro Tyr His Lys Glu Gly Gly Leu Gly Ile Leu
    1               5                   10

146 11 PRT Victivallis vadensis
146 Pro Tyr Ser Glu Lys Gly Gly Leu Ala Ile Leu
    1               5                   10

147 11 PRT Unknown Termite Group 1 Bacterium Phylotype Rs-D17
147 Pro Tyr Lys Pro Glu Gly Gly Ile Ala Ile Leu
    1               5                   10

148 11 PRT Yarrowia lipolytica
148 Pro Leu Lys Pro Ser Gly His Leu Gln Ile Leu
    1               5                   10

149 11 PRT Francisella tularensis
149 Pro Ile Lys Lys Thr Gly His Leu Gln Ile Leu
    1               5                   10

150 11 PRT Arabidopsis thaliana
150 Pro Ile Lys Glu Thr Gly His Ile Gln Ile Leu
    1               5                   10

151 1647 DNA Artificial Sequence L. lactis kivD codon-optimized for expression in S. cerevisiae
151 atgtatacag
tgggtgatta cttgctagac cgtttacatg aattaggcat agaagagatt 60tttggagtac
caggtgatta caatttgcaa ttcttggatc agattatctc acataaggat
120atgaaatggg tcgggaatgc gaacgagtta aacgcttcct atatggcaga
tggttatgcg 180agaaccaaaa aggccgctgc ttttcttaca actttcggtg
taggtgaact ttcagccgtt 240aatggattag ccggatctta cgctgaaaac
ttaccagtcg ttgaaattgt tggttctcct 300acttctaagg tacaaaacga
gggaaagttt gttcaccata ctttagcgga tggtgatttc 360aaacatttca
tgaagatgca tgaacctgtc acagcagcga gaacactttt gaccgcagag
420aacgctactg ttgaaatcga tagagtatta agtgctttgt taaaagagag
aaagccagtg 480tatatcaact tgcctgttga tgtcgctgcc gcaaaagcag
aaaaaccatc tttaccattg 540aaaaaggaaa actcaaccag taacacatct
gatcaagaga ttctaaacaa aatccaagag 600tcattgaaaa acgccaaaaa
gccaatcgtt ataacaggcc atgaaatcat ctcctttggt 660ttagaaaaga
ccgttacaca attcatctct aagacaaaat tgccaatcac tactttaaac
720tttggcaaat cttctgtaga tgaagcttta ccatcttttc taggtatcta
caatggtact 780ctttctgaac caaacctaaa agagtttgtc gaatccgctg
atttcatact tatgttgggt 840gttaagctaa ctgatagttc aactggtgca
ttcactcacc acttgaacga aaacaagatg 900atatcattga atatcgatga
gggcaagata ttcaacgaac gtattcaaaa cttcgatttt 960gaatcactaa
tttcttctct acttgatttg tcagaaatag aatacaaagg aaaatacatc
1020gacaaaaagc aagaggattt cgtgccatca aacgcattgt tgtcacaaga
tagactttgg 1080caagctgtgg aaaacctaac tcaatctaac gaaacaattg
tggctgagca aggcacatca 1140ttcttcggcg catctagtat tttcttgaag
agtaagtccc acttcatcgg tcaacctctt 1200tggggatcta ttgggtacac
cttccctgcc gcattgggat cacagatcgc agacaaggag 1260tccagacatc
tactattcat tggggacgga tcattgcaac ttaccgttca ggaattaggg
1320ttggctataa gagaaaagat caatcctatt tgttttatca tcaacaatga
tggctataca 1380gtagaacgtg agattcacgg tccaaatcaa agttacaatg
acatcccaat gtggaattac 1440tctaagttgc ctgaatcttt cggtgcaaca
gaagatagag ttgtttccaa aatagtcaga 1500acagaaaatg agtttgtttc
tgtcatgaag gaagctcaag ccgatcctaa tagaatgtac 1560tggattgaac
taatcttggc caaggaagga gccccaaaag tattgaaaaa gatgggaaaa
1620cttttcgctg aacagaataa gtcctga 1647

152 1023 DNA Artificial Sequence L. lactis alcohol dehydrogenase (adhA) RE1
152 atgaaagcag
cagtagtaag acacaatcca gatggttatg cggaccttgt tgaaaaggaa 60cttcgagcaa
tcaaacctaa tgaagctttg cttgacatgg agtattgtgg agtctgtcat
120accgatttgc acgttgcagc aggtgatttt ggcaacaaag cagggactgt
tcttggtcat 180gaaggaattg gaattgtcaa agaaattgga gctgatgtaa
gctcgcttca agttggtgat 240cgggtttcag tggcttggtt ctttgaagga
tgtggtcact gtgaatactg tgtatctggt 300aatgaaactt tttgtcgaga
agttaaaaat gcaggatatt cagttgatgg cggaatggct 360gaagaagcaa
ttgttgttgc cgattatgct gtcaaagttc ctgacggact tgacccaatt
420gaagctagct caattacttg tgctggagta acaacttaca aagcaatcaa
agtatcagga 480gtaaaacctg gtgattggca agtaattttt ggtgctggag
gacttggaaa tttagcaatt 540caatatgcta aaaatgtttt tggagcaaaa
gtaattgctg ttgatattaa tcaagataaa 600ttaaatttag ctaaaaaaat
tggagctgat gtgactatca attctggtga tgtaaatcca 660gttgatgaaa
ttaaaaaaat aactggcggc ttaggggtgc aaagtgcaat agtttgtgct
720gttgcaagga ttgcttttga acaagcggtt gcttctttga aacctatggg
caaaatggtt 780gctgtggcag ttcccaatac tgagatgact ttatcagttc
caacagttgt ttttgacgga 840gtggaggttg caggttcact tgtcggaaca
agacttgact tggcagaagc ttttcaattt 900ggagcagaag gtaaggtaaa
accaattgtt gcgacacgca aactggaaga aatcaatgat 960attattgatg
aaatgaaggc aggaaaaatt gaaggccgaa tggtcattga ttttactaaa 1020taa
1023

153 2073 DNA Saccharomyces cerevisiae
153 atggaaggct tcaatccggc
tgacatagaa catgcgtcac cgattaattc atctgacagc 60cattcatcct cctttgtata
tgctctaccc aaaagtgcta gtgaatatgt agtcaaccat 120aatgagggtc
gtgcaagtgc aagtggaaat ccagccgcag tgccgtctcc cataatgaca
180ctgaatctca aaagcacaca ttccctcaat attgatcagc atgttcatac
ctcaacatcg 240ccgacggaaa ctattgggca tattcatcat gtggaaaagc
tgaatcaaaa caatttgatt 300catctggatc cagtacccaa ctttgaagat
aagtccgata ttaagccttg gttgcaaaag 360attttttatc ctcaaggaat
agaacttgtg atagaaaggt cggacgcatt taaagttgtc 420ttcaagtgta
aagctgctaa aaggggaagg aacgcgagaa ggaaaagaaa agataagccc
480aaaggacagg accacgaaga cgagaaatcc aagatcaatg atgacgaatt
agaatatgcg 540agtccttcta atgccacagt aaccaatggg cctcaaacat
cgcccgatca aacatcctcc 600ataaagccaa agaaaaaaag atgtgtatcg
aggtttaata actgtccgtt tagagtacga 660gctacttatt cgttaaagag
gaaaagatgg agcattgttg taatggacaa taaccattca 720catcagctaa
agtttaaccc tgattccgaa gagtacaaaa aattcaaaga aaaattaaga
780aaggataatg acgtagatgc aatcaagaaa ttcgacgaat tggaatacag
aactttggcc 840aatttgccca ttccaacagc tacaatcccc tgtgattgtg
gtttaacaaa tgaaatacaa 900agtttcaatg tcgtattgcc cactaacagt
aatgttactt catcagcatc ctcttcaact 960gtatcgtcca tatcccttga
ttcatcgaat gcatctaaaa ggccatgctt accctctgta 1020aataacaccg
gtagtatcaa taccaataac gtaaggaaac cgaaaagcca gtgtaagaat
1080aaagacacac tcttaaaaag aaccaccatg cagaactttc tcacaactaa
atcaaggctg 1140cgtaagaccg gtacgccaac atcttcgcaa cactcatcta
cagcattttc aggatatatt 1200gatgatcctt tcaatttgaa tgaaatcttg
ccactgccgg catccgattt caagctaaac 1260actgtaacaa atttgaacga
aattgacttt acgaacattt ttaccaaatc gccgcatcca 1320catagcgggt
ctacccatcc aagacaagtc ttcgaccaat tggacgattg ttcctctata
1380ctcttctctc cattaactac aaacacgaat aatgaatttg aaggagagtc
agatgatttt 1440gttcattctc catatttgaa ctcagaggca gatttcagcc
aaattcttag tagtgctccc 1500ccagtccatc atgacccaaa tgaaacacat
caggaaaacc aggatattat tgatagattt 1560gctaatagtt cccaagaaca
taatgagtat attctacaat atttgacgca ctccgatgct 1620gctaaccaca
ataacatcgg cgttccaaac aacaattcac attcgctaaa tactcagcat
1680aacgtttctg atctgggcaa ctcactttta agacaagaag ctttagttgg
cagctcttca 1740acaaaaatct tcgacgaatt gaaatttgta caaaatggcc
cacacggttc tcaacatcct 1800atagattttc aacatgttga ccatcgtcat
ctcagctcta atgaacctca agtacgatca 1860catcaatatg gtccgcaaca
gcagccaccg cagcaattgc aatatcacca aaatcagccc 1920cacgacggcc
ataaccacga acagcaccaa acagtacaaa aggatatgca aacgcatgaa
1980tcgctagaaa taatgggaaa cacattattg gaagagttca aagacattaa
aatggtgaac 2040ggcgagttga agtatgtgaa gccagaagat tag
2073

154 1494 DNA Artificial Sequence E. coli ketolacid reductoisomerase P2D1-A1
154 atggccaact attttaacac attaaatttg agacaacaat tggctcaact
gggtaagtgc 60agatttatgg gaagggacga gtttgctgat ggtgcttctt atctgcaagg
aaagaaagta 120gtaattgttg gctgcggtgc tcagggtcta aaccaaggtt
taaacatgag agattcaggt 180ctggatattt cgtatgcatt gaggaaagag
tctattgcag aaaaggatgc cgattggcgt 240aaagcgacgg aaaatgggtt
caaagttggt acttacgaag aactgatccc tcaggcagat 300ttagtgatta
acctaacacc agataaggtt cactcagacg tagtaagaac agttcaaccg
360ctgatgaagg atggggcagc tttaggttac tctcatggct ttaatatcgt
tgaagtgggc 420gagcagatca gaaaaggtat aacagtcgta atggttgcgc
caaagtgccc aggtacggaa 480gtcagagagg agtacaagag gggttttggt
gtacctacat tgatcgccgt acatcctgaa 540aatgacccca aacgtgaagg
tatggcaata gcgaaggcat gggcagccgc aaccggaggt 600catagagcgg
gtgtgttaga gagttctttc gtagctgagg tcaagagtga cttaatgggt
660gaacaaacca ttctgtgcgg aatgttgcag gcagggtctt tactatgctt
tgataaattg 720gtcgaagagg gtacagatcc tgcctatgct gaaaagttga
tacaatttgg ttgggagaca 780atcaccgagg cacttaaaca aggtggcata
acattgatga tggatagact ttcaaatccg 840gccaagctaa gagcctacgc
cttatctgag caactaaaag agatcatggc accattattc 900caaaagcaca
tggacgatat tatctccggt gagttttcct caggaatgat ggcagattgg
960gcaaacgatg ataaaaagtt attgacgtgg agagaagaaa ccggcaagac
ggcattcgag 1020acagccccac aatacgaagg taaaattggt gaacaagaat
actttgataa gggagtattg 1080atgatagcta tggtgaaggc aggggtagaa
cttgcattcg aaactatggt tgactccggt 1140atcattgaag aatctgcata
ctatgagtct ttgcatgaat tgcctttgat
agcaaatact 1200attgcaagaa aaagacttta cgagatgaat gttgtcatat
cagacactgc agaatatggt 1260aattacttat ttagctacgc gtgtgtcccg
ttgttagagc ccttcatggc cgagttacaa 1320cctggtgatt tggggaaggc
tattccggaa ggagcggttg acaatggcca actgagagac 1380gtaaatgaag
ctattcgttc gcatgctata gaacaggtgg gtaaaaagct gagaggatat
1440atgaccgata tgaaaagaat tgcagtggca ggacaccacc accaccacca ctaa
1494

155 1713 DNA Artificial Sequence L. lactis ilvD codon-optimized for expression in S. cerevisiae
155 atggagttta agtataacgg caaagttgaa
tctgttgaac tgaataagta cagcaaaacg 60ttgacacaag atcccacaca acccgccaca
caggcaatgt attacggcat cgggtttaaa 120gacgaagatt tcaagaaagc
tcaagtgggt atagtgtcga tggactggga tggaaatcca 180tgcaacatgc
atttaggaac ccttggatca aagattaaaa gctcagtaaa tcagacagat
240ggtctgatcg gcttacaatt tcatacgata ggagtttctg atgggatagc
aaatggaaag 300ttgggaatga gatactccct tgtttccaga gaagttatag
ctgactctat tgaaaccaac 360gctggcgctg aatactatga tgcaattgta
gccatcccag gttgtgacaa aaatatgcca 420ggttctatta ttggtatggc
aagacttaat aggccaagca ttatggtgta tggaggaaca 480atagaacacg
gtgaatataa aggtgagaaa ttgaacatcg tatcggcttt tgaatctcta
540ggccagaaaa ttaccggcaa tatctctgat gaagattatc acggtgttat
ttgtaatgct 600attcctggtc aaggggcatg tggggggatg tacacagcta
ataccttagc tgccgctatc 660gaaacactag gtatgtcatt gccgtattct
tcttcgaacc ctgcagtatc tcaagaaaaa 720caagaagaat gtgatgagat
tggattagcc attaagaatc ttttggaaaa agacatcaag 780cctagtgata
taatgactaa ggaggcgttc gagaacgcta ttaccattgt gatggtcttg
840gggggtagta ctaatgctgt cttgcatatt attgcaatgg ctaacgcgat
aggtgtcgaa 900ataactcagg atgacttcca aagaattagt gacattactc
cagtactagg tgattttaaa 960ccttcaggta aatatatgat ggaagatttg
cataaaattg gaggcttgcc agcagtgctt 1020aagtaccttc taaaggaagg
aaaattgcat ggtgactgcc ttactgtgac gggtaaaaca 1080ttagccgaga
atgtcgagac tgccctagac ttggatttcg actcacaaga tatcatgagg
1140ccactaaaga atcctatcaa ggccaccggc cacttgcaga ttctgtacgg
taatttagct 1200caagggggtt ccgtagcaaa aattagcggt aaagaaggag
agttcttcaa aggcactgcc 1260agagtctttg atggtgaaca acattttatc
gacggcatag aatctggtcg tttgcatgct 1320ggagatgtag cggtaattag
gaatataggt cccgtcggcg gacctggtat gcccgaaatg 1380ctgaagccta
catcagcatt aattggtgcg ggtttaggga aaagttgcgc gttaattacg
1440gatggtagat tctccggtgg cactcacggt tttgttgtcg gccatattgt
gcctgaagcc 1500gttgagggtg gactaatcgg cttagttgaa gatgacgata
taatagagat agatgcagtc 1560aacaactcta tatccctgaa agtttccgat
gaagaaatcg caaagagaag agctaattat 1620cagaagccaa ctccgaaagc
caccagggga gttttggcaa aattcgctaa attaacccgt 1680cctgcatcgg
aagggtgtgt tactgatctg taa 1713

156 1683 DNA Artificial Sequence F. tularensis ilvD codon-optimized for expression in S. cerevisiae
156 atgaaaaagg tgctgaataa gtactcaaga cgtcttaccg aagataagtc
tcaaggtgct 60tctcaggcta tgctatacgg aacagagatg aatgatgcag atatgcacaa
gcctcaaatc 120ggtatcggtt ccgtttggta tgaaggaaat acttgtaata
tgcatttgaa tcaattagca 180caatttgtca aggattctgt tgaaaaggaa
aacttgaaag gcatgagatt caacacaatt 240ggagtttctg atggtatctc
catgggtact gatggcatgt cctactctct acaatcacgt 300gatctaatcg
ctgattcaat cgaaacagtt atgagtgcac actggtatga tggcctagtt
360tcaatcccag gttgtgacaa aaacatgcca ggttgcatga tggcccttgg
tagattaaac 420agaccaggtt tcgtgatcta cggtggaacc atacaagctg
gcgttatgag aggcaaacct 480attgatattg tcacagcttt ccaatcatat
ggagcatgct tatctgggca aataactgaa 540caggaaagac aagagactat
caaaaaggct tgtccaggtg caggagcctg tggcggcatg 600tacacagcta
acacaatggc ctgtgccatt gaggcccttg gaatgagttt gcctttttcc
660tcttctactt ctgcaacttc agttgaaaag gtacaagagt gtgataaggc
aggcgaaaca 720atcaaaaact tgttagaatt ggacattaaa ccaagagaca
tcatgactag aaaagctttc 780gaaaacgcta tggtactaat tacagtaatg
ggaggttcaa caaatgccgt gttacatctg 840ttagcaatgg cttcatccgt
cgatgtagat ttgagtatcg atgactttca ggaaatagct 900aacaaaactc
cagtgctggc tgatttcaag ccatccggga aatatgtcat ggcaaacttg
960catgcaattg gcgggactcc tgcagttatg aaaatgttgc tgaaggccgg
aatgcttcat 1020ggcgattgtt tgactgtaac tgggaaaacc ttagccgaaa
acttggaaaa tgtggccgac 1080ctgccagaag ataacacaat catacacaaa
ctagataacc caatcaaaaa gactggtcat 1140ttgcaaatct tgaaggggaa
tgttgcccca gaaggttctg ttgctaagat aacagggaag 1200gaaggtgaga
tattcgaggg cgtagccaat gtctttgatt cagaggaaga gatggttgcc
1260gcagtcgaaa ctggaaaagt caaaaagggc gatgttattg ttattagata
cgaaggtcct 1320aaaggtggcc ctggcatgcc tgaaatgctt aagccaacct
ctttgataat gggtgctgga 1380ctaggccagg atgttgcatt aatcacagat
ggcagatttt caggtggtag tcatggtttc 1440attgtaggtc acattacacc
agaagcatac gaaggcggta tgatcgcctt attagaaaac 1500ggtgataaga
taacaatcga tgctatcaac aatgtgataa atgtagactt aagtgatcaa
1560gagattgctc aacgtaaatc taagtggaga gcatcaaagc aaaaagcttc
cagaggtaca 1620ctgaaaaagt acattaagac cgtctcttct gcttctaccg
ggtgcgtgac tgatttggat 1680tga 1683

157 1704 DNA Saccharomyces cerevisiae
157 atggcaaaga agctcaacaa gtactcgtat atcatcactg
aacctaaggg ccaaggtgcg 60tcccaggcca tgctttatgc caccggtttc aagaaggaag
atttcaagaa gcctcaagtc 120ggggttggtt cctgttggtg gtccggtaac
ccatgtaaca tgcatctatt ggacttgaat 180aacagatgtt ctcaatccat
tgaaaaagcg ggtttgaaag ctatgcagtt caacaccatc 240ggtgtttcag
acggtatctc tatgggtact aaaggtatga gatactcgtt acaaagtaga
300gaaatcattg cagactcctt tgaaaccatc atgatggcac aacactacga
tgctaacatc 360gccatcccat catgtgacaa aaacatgccc ggtgtcatga
tggccatggg tagacataac 420agaccttcca tcatggtata tggtggtact
atcttgcccg gtcatccaac atgtggttct 480tcgaagatct ctaaaaacat
cgatatcgtc tctgcgttcc aatcctacgg tgaatatatt 540tccaagcaat
tcactgaaga agaaagagaa gatgttgtgg aacatgcatg cccaggtcct
600ggttcttgtg gtggtatgta tactgccaac acaatggctt ctgccgctga
agtgctaggt 660ttgaccattc caaactcctc ttccttccca gccgtttcca
aggagaagtt agctgagtgt 720gacaacattg gtgaatacat caagaagaca
atggaattgg gtattttacc tcgtgatatc 780ctcacaaaag aggcttttga
aaacgccatt acttatgtcg ttgcaaccgg tgggtccact 840aatgctgttt
tgcatttggt ggctgttgct cactctgcgg gtgtcaagtt gtcaccagat
900gatttccaaa gaatcagtga tactacacca ttgatcggtg acttcaaacc
ttctggtaaa 960tacgtcatgg ccgatttgat taacgttggt ggtacccaat
ctgtgattaa gtatctatat 1020gaaaacaaca tgttgcacgg taacacaatg
actgttaccg gtgacacttt ggcagaacgt 1080gcaaagaaag caccaagcct
acctgaagga caagagatta ttaagccact ctcccaccca 1140atcaaggcca
acggtcactt gcaaattctg tacggttcat tggcaccagg tggagctgtg
1200ggtaaaatta ccggtaagga aggtacttac ttcaagggta gagcacgtgt
gttcgaagag 1260gaaggtgcct ttattgaagc cttggaaaga ggtgaaatca
agaagggtga aaaaaccgtt 1320gttgttatca gatatgaagg tccaagaggt
gcaccaggta tgcctgaaat gctaaagcct 1380tcctctgctc tgatgggtta
cggtttgggt aaagatgttg cattgttgac tgatggtaga 1440ttctctggtg
gttctcacgg gttcttaatc ggccacattg ttcccgaagc cgctgaaggt
1500ggtcctatcg ggttggtcag agacggcgat gagattatca ttgatgctga
taataacaag 1560attgacctat tagtctctga taaggaaatg gctcaacgta
aacaaagttg ggttgcacct 1620ccacctcgtt acacaagagg tactctatcc
aagtatgcta agttggtttc caacgcttcc 1680aacggttgtg ttttagatgc ttga
1704

158 1692 DNA Saccharomyces cerevisiae
158 atgaacaagt actcgtatat
catcactgaa cctaagggcc aaggtgcgtc ccaggccatg 60ctttatgcca ccggtttcaa
gaaggaagat ttcaagaagc ctcaagtcgg ggttggttcc 120tgttggtggt
ccggtaaccc atgtaacatg catctattgg acttgaataa cagatgttct
180caatccattg aaaaagcggg tttgaaagct atgcagttca acaccatcgg
tgtttcagac 240ggtatctcta tgggtactaa aggtatgaga tactcgttac
aaagtagaga aatcattgca 300gactcctttg aaaccatcat gatggcacaa
cactacgatg ctaacatcgc catcccatca 360tgtgacaaaa acatgcccgg
tgtcatgatg gccatgggta gacataacag accttccatc 420atggtatatg
gtggtactat cttgcccggt catccaacat gtggttcttc gaagatctct
480aaaaacatcg atatcgtctc tgcgttccaa tcctacggtg aatatatttc
caagcaattc 540actgaagaag aaagagaaga tgttgtggaa catgcatgcc
caggtcctgg ttcttgtggt 600ggtatgtata ctgccaacac aatggcttct
gccgctgaag tgctaggttt gaccattcca 660aactcctctt ccttcccagc
cgtttccaag gagaagttag ctgagtgtga caacattggt 720gaatacatca
agaagacaat ggaattgggt attttacctc gtgatatcct cacaaaagag
780gcttttgaaa acgccattac ttatgtcgtt gcaaccggtg ggtccactaa
tgctgttttg 840catttggtgg ctgttgctca ctctgcgggt gtcaagttgt
caccagatga tttccaaaga 900atcagtgata ctacaccatt gatcggtgac
ttcaaacctt ctggtaaata cgtcatggcc 960gatttgatta acgttggtgg
tacccaatct gtgattaagt atctatatga aaacaacatg 1020ttgcacggta
acacaatgac tgttaccggt gacactttgg cagaacgtgc aaagaaagca
1080ccaagcctac ctgaaggaca agagattatt aagccactct cccacccaat
caaggccaac 1140ggtcacttgc aaattctgta cggttcattg gcaccaggtg
gagctgtggg taaaattacc 1200ggtaaggaag gtacttactt caagggtaga
gcacgtgtgt tcgaagagga aggtgccttt 1260attgaagcct tggaaagagg
tgaaatcaag aagggtgaaa aaaccgttgt tgttatcaga 1320tatgaaggtc
caagaggtgc accaggtatg cctgaaatgc taaagccttc ctctgctctg
1380atgggttacg gtttgggtaa agatgttgca ttgttgactg atggtagatt
ctctggtggt 1440tctcacgggt tcttaatcgg ccacattgtt cccgaagccg
ctgaaggtgg tcctatcggg 1500ttggtcagag acggcgatga gattatcatt
gatgctgata ataacaagat tgacctatta 1560gtctctgata aggaaatggc
tcaacgtaaa caaagttggg ttgcacctcc acctcgttac 1620acaagaggta
ctctatccaa gtatgctaag ttggtttcca acgcttccaa cggttgtgtt
1680ttagatgctt ga 1692

159 1923 DNA Neurospora crassa
159 atggcttcta
atcaagataa caaggcagtt gctccagacg ctgctgcacc agcgggtcag 60tcaacaacca
ccacaactac aaatgataac agtgaaagga atctaccaaa ggaaggcgaa
120tacattcaat ggaggacact tccagcgggc aatccagatc agttgaacag
atggagtcat 180ttcctgactc gtgagcatga gtttccaggc gctcaggcaa
tgttgtacgg tgcgggtgta 240cctaacaaag atatgatgaa aaaggctcct
catgttggga tcgctactgt ttggtgggaa 300ggtaacccat gtaatactca
tctgcttgat ctaggtcaaa aagtcaaaaa ggctgttgaa 360agagagaaga
tgttagcttg gcaattcaac acaattggcg ttagtgacgg aataacaatg
420ggtggtgaag gcatgaggta ctctttgcag agcagagaga tcatagcaga
ttctatagag 480actgtgacat gtgcacaaca ccatgatgcc aatatctcaa
ttccagggtg cgacaaaaac 540atgccaggcg tcatcatggc agctgcaaga
cacaacagac cattcgttat gatctacgga 600ggtacaatga gaggcggtca
ttccgaatta cttgatagac ctatcaatat cgtaacttgt 660tacgaggcct
caggggccta tacttatggt agacttaagc cagcctgtcc aaactccact
720gctaccccat ctgacgtgat ggacgatata gaacaacacg cctgtccagg
ggctggagct 780tgtggaggga tgtacaccgc gaatactatg gcaaccgcca
tagaagctat gggtctgaca 840gcaccagggt catcctcctt tccagccagc
tcaccagaaa agttcagaga gtgcgaaaaa 900gccgcggaat acattaagat
atgcatggaa aaagatattc gtccaagaga cttactaaca 960aaggcttcct
tcgagaatgc tctcgtcttg acaatgattc taggtggttc aaccaacggt
1020gttttacatt acttagccat ggccaactcc gccgatgtcg atctaactct
tgatgatatc 1080aatagagtca gtgctaagac tcctttcctc gctgatatgg
ctccatctgg tagatactat 1140atggaggatt tgtacaaggt aggtggtact
ccagccgtac tcaagatgtt gatagctgcc 1200ggctatatcg atggaacaat
tccaacaata acaggaaaat ctttggctga aaacgtgtca 1260gattggccat
ctttagaccc tgatcaaaag attatccgtc ctttggataa tcctatcaaa
1320tcacaaggtc acattagagt gctgtatggt aacttctctc ctggtggggc
tgttgccaag 1380atcacgggta aggaaggtct tagttttact ggtaaggcaa
gatgctttaa caaagagttt 1440gaattggatg ctgcgctgaa aaactctgaa
atcacgctcg aacaaggaaa tcaagttcta 1500attgtaaggt atgaaggccc
taagggcgga ccaggcatgc cagaacaatt gaaagcatct 1560gccgctatca
tgggcgctgg tttgacgaac gtagctttag tcacggatgg gcgatactct
1620ggcgcttctc acggtttcat cgtcggtcac gtcgtgcctg aggcggcaac
tggcggacct 1680attgctttag taaaggatgg agatttgatc acaattgatg
cagtcagaaa tagaattgat 1740gttgtcaaaa ccgtagaagg agtggagggc
gaggaggaaa ttgcaaaggt tttagaagag 1800aggaaaaagg gatggaaagc
acctaagatg aagccaacaa gaggagccct ggccaaatac 1860gcaagacttg
ttggtgacgc atcacatgga gcagttacag acttaggagg agatgcttac 1920taa
1923

160 1686 DNA Acaryochloris marina
160 atgtcagata atcgtaattc
tcaagtagtc acacaaggtg ttcaaagagc acctaataga 60gctatgttaa gagctgtagg
attcggagat gatgatttca cgaaaccaat agttggattg 120gctaatggtt
tctctactat tactccttgt aacatgggaa ttgatagttt ggccacaaga
180gctgaagcat ctattaggac ggctggtgca atgccacaaa agtttggaac
cattacaata 240agcgatggga tatcaatggg tacagaaggt atgaagtatt
ctctcgtttc aagagaagtg 300attgccgatt ccattgaaac agcttgcatg
ggccagagta tggatggcgt attagcaatt 360ggtggctgcg acaaaaacat
gcctggcgcg atgttagcaa tggctcgtat gaacatacca 420gccatcttcg
tatatggtgg cactatcaag ccaggccacc tcaatggtga agatttgact
480gtcgtatcag ctttcgaagc tgtggggcaa cattccgccg gtagaatatc
cgaagccgaa 540cttacagcag tcgaaaagca tgcatgtcca ggcgctggat
catgtggtgg catgtacacg 600gccaacacaa tgtctagtgc ttttgaggct
atgggcatgt ccttgatgta ctcatccact 660atggctgcag aagatgagga
gaaggctgtt tctgccgaac aatctgcggc tgtgctagtt 720gaggcaatcc
acaaacagat tctaccaaga gatattctaa ccagaaaggc gtttgagaac
780gcaatagcag tcataatggc tgttgggggt tccacaaatg cagttctcca
cttgttagcg 840atttcaagag cagcaggaga ctctttaact ttagatgatt
tcgaaactat cagggctcaa 900gttccagtga tttgtgattt gaagccttct
ggtcgatatg tcgctacaga ccttcataaa 960gctggcggaa tcccattagt
tatgaaaatg ctattagagc atgggctatt acatggggat 1020gcattgacta
ttaccggcaa gacaattgca gagcaattgg ccgatgtgcc atctgaacct
1080tctcctgatc aagacgtaat ccgtccttgg gataatccaa tgtacaagca
aggtcacctt 1140gccatcttga gaggtaactt ggccacagaa ggtgcagtag
ccaagatcac agggatcaaa 1200aaccctcaaa tcactggacc agctagagtt
ttcgagtcag aggaagcctg tttagaagcg 1260atcctggccg gaaagatcca
accaaatgac gtgatagtcg ttcgatacga aggtccaaaa 1320ggaggaccag
gtatgaggga aatgctggct cctacttccg caatcatagg tgcgggtcta
1380ggagactcag ttggccttat cactgatggg agattttccg gtgggacata
cggtatggtt 1440gttggacatg tagcaccaga agcagctgtt ggtgggacca
ttgctctggt tcaagagggt 1500gaccaaatca ctatcgatgc tcacgctaga
aagttggagc tgcatgtctc tgaccaagag 1560cttaaagagc gtaaggaaaa
gtgggagcag ccaaaaccac tgtacaataa gggtgtgctt 1620gcgaagtacg
ccaaactcgt aagctctagt tcagtaggag cggttacaga tttggatttg 1680ttctaa
1686

161 1683 DNA Lyngbya spp.
161 atgtccgata acttccgttc tcaagccatt
acacagggca aaaagagaac tcctaataga 60gctatgctga gagcagttgg atttggagat
gaagatttca acaaaccaat tgttggtatt 120gccaatggct actccaccat
aactccttgc aacatcggtc ttaacgatct tgcacatagg 180gccgaaacag
ctctaaagca agcagacgcc atgccacaaa tgttcggaac tattactgta
240agtgatggaa ttgcaatggg aaccgaaggt atgaagtact ctcttgttag
cagagaagtt 300atagccgatg ctattgaaac tgcttgtaac ggacagtcta
tggatggggt cttagcaata 360ggaggttgtg acaaaaacat gcctggtgct
atgatcgcca tagcgcgtat gaatatccct 420gctatctttg tatacggcgg
tacaatcaag ccaggtaatc taaacggttg tgatctaaca 480gttgtctccg
cattcgaagc cgttggagag tattctgctg gcaaactaga tgacgataga
540ttactggaca tcgagagatt agcatgccct ggttctggct catgtggggg
aatgttcact 600gctaatacaa tgtctagtgc atttgaagca atgggtatga
gtctgatgta cagcagtaca 660atggcatccg aagatgctga aaaggctgat
tccaccgaaa agtccgcttt tgttttgaga 720gaggcaattt ctcagagaat
cctacctaag caaatcctga cgaggaaagc cttcgaaaac 780gcaattgcag
tcatcatggc ggtaggcggc tccacaaact ctgtattgca tctattggct
840attgcctatg ctgccgatgt agaattgacc atagatgatt tcgaaacaat
tcgtgggaga 900gtaccagttt tgtgtgatct taagccatca ggacgatttg
tcactaccga tttccataag 960gctggtggag tcccattgat catgaagatg
ttactcgaac aaggtttgat ccatggggat 1020gcccttacta taacgggtaa
aacagtcgca gagcaattag ctgatatccc atctcaacca 1080tctgccgacc
aagaggtgat aagaccatgg aataacccaa tgtacaagca aggtcacttg
1140gcgatcctta aggggaatct tgcaacagaa ggttcagtcg ccaagataac
aggtgtgaaa 1200aagcctcaga tgacaggtcc agcgcgagtt tttgaatcag
aagagcaatg cttagaagct 1260atactagccg gcaaaatcca agctggggac
gttttagtgg ttagatacga aggtccaaaa 1320gggggaccag gtatgagaga
aatgctggct ccaacatctg caatcattgg tgccggcttg 1380ggtgattctg
ttggactcat tacggatggc agattctctg gcggaacata tggtttggta
1440gtcggacacg ttgctccaga ggctgcagtg ggtggtaaca tcgctttagt
gcaagagggc 1500gattcaatta ctattgatgc ttcacagcgt ttgttacaag
taaacatctc tgaccaggtg 1560ttggagcaaa gacgacaaaa ctggcaacca
ccacaaccta gatacactaa aggcgtatta 1620gcgaagtacg caaagttggt
ttcaagtagt tcagttggcg cagttactga tctcgattgt 1680taa
16831621851DNAArtificial SequenceE. coli ilvD codon-optimized for
expression in K. lactis 162atgccgaaat acagatcagc aacaacaacc
catggtagaa atatggctgg tgcaagggct 60ctatggagag ctactggcat gactgatgca
gatttcggaa agccaatcat tgccgtcgtc 120aactctttta cacaattcgt
tccgggtcat gtccatttgc gtgatctagg taagcttgtt 180gccgaacaaa
ttgaagctgc aggtggtgtc gcaaaagagt ttaatactat tgctgtggac
240gacggtatag ctatggggca tggcggtatg ttatactctt taccatcgag
agaattaatt 300gcagactcag tcgaatatat ggttaatgct cattgtgccg
atgcaatggt ttgtatctct 360aattgtgata agataacgcc tggtatgttg
atggcgtcct tgagattgaa catcccagta 420atcttcgtat ctggcggccc
aatggaggct ggtaaaacta agttaagtga tcagatcatc 480aaacttgatc
ttgtggatgc aatgattcaa ggtgcagatc caaaagtttc agactcgcag
540tcagaccaag ttgaaagaag tgcatgtcca acttgtggtt cttgcagtgg
aatgttcacg 600gctaactcta tgaattgctt gactgaagct ctaggtttat
ctcaaccagg aaatggttca 660ttattagcga cccatgcaga cagaaagcaa
ttgttcttaa atgccggaaa aagaattgtg 720gaactaacga aaaggtatta
cgaacaaaat gatgaatcag cattaccgag gaatatagct 780tcaaaggctg
cattcgaaaa tgccatgaca ttggatattg caatgggtgg tagtacaaac
840acggtcttac atcttctagc tgcagcccaa gaagctgaga tagatttcac
catgtctgat 900atcgacaagc tttcacgtaa ggttccacag ttatgtaagg
ttgcaccatc aactcaaaag 960tatcacatgg aagacgttca tcgtgcagga
ggggttattg gtattttagg ggagttggac 1020agagccggtc ttttaaacag
ggatgtgaag aatgtattgg gtttaacact tccacagaca 1080ttagagcaat
acgatgtcat gttaactcaa gatgatgccg tgaaaaacat gttcagggca
1140ggtccagcag ggatcagaac cacccaagca ttctcgcaag actgtaggtg
ggacactttg 1200gacgatgata gagcaaatgg atgtataaga tcgcttgagc
atgcttatag taaggatggt 1260ggtttagcag tattatatgg aaacttcgct
gaaaatggtt gcattgtgaa aactgctggt 1320gtagatgata gtattttgaa
atttactgga cccgctaaag tttacgaaag tcaagacgat 1380gctgttgagg
ctatacttgg cggaaaggtg gtagcaggag acgtggtagt gataagatat
1440gagggaccaa agggaggacc aggtatgcag gaaatgcttt acccaacttc
atttttgaag 1500tccatgggac taggaaaagc ttgtgccctt atcactgacg
gtagattctc tggtggcact 1560tcgggtttaa gtatcggtca cgtatcacca
gaggcagctt ctggtggttc gattggattg 1620attgaagatg gagatttgat
cgccatagat atcccaaata gaggtatcca attacaagtc 1680tcagacgctg
aattggctgc aagaagagaa gcacaagatg ccagaggaga taaggcttgg
1740actcctaaaa atagagaacg tcaagtaagt ttcgccctta gggcttatgc
ttcattggct 1800acttcagccg ataagggggc agtaagagac aaatcgaagt
tgggtggatg a 1851163567PRTSaccharomyces cerevisiae
163Met Ala Lys Lys Leu Asn Lys Tyr Ser Tyr Ile Ile Thr Glu Pro Lys
1 5 10 15 Gly Gln Gly Ala Ser Gln Ala Met Leu Tyr Ala Thr Gly Phe
Lys Lys 20 25 30 Glu Asp Phe Lys Lys Pro Gln Val Gly Val Gly Ser
Cys Trp Trp Ser 35 40 45 Gly Asn Pro Cys Asn Met His Leu Leu Asp
Leu Asn Asn Arg Cys Ser 50 55 60 Gln Ser Ile Glu Lys Ala Gly Leu
Lys Ala Met Gln Phe Asn Thr Ile 65 70 75 80 Gly Val Ser Asp Gly Ile
Ser Met Gly Thr Lys Gly Met Arg Tyr Ser 85 90 95 Leu Gln Ser Arg
Glu Ile Ile Ala Asp Ser Phe Glu Thr Ile Met Met 100 105 110 Ala Gln
His Tyr Asp Ala Asn Ile Ala Ile Pro Ser Cys Asp Lys Asn 115 120 125
Met Pro Gly Val Met Met Ala Met Gly Arg His Asn Arg Pro Ser Ile 130
135 140 Met Val Tyr Gly Gly Thr Ile Leu Pro Gly His Pro Thr Cys Gly
Ser 145 150 155 160 Ser Lys Ile Ser Lys Asn Ile Asp Ile Val Ser Ala
Phe Gln Ser Tyr 165 170 175 Gly Glu Tyr Ile Ser Lys Gln Phe Thr Glu
Glu Glu Arg Glu Asp Val 180 185 190 Val Glu His Ala Cys Pro Gly Pro
Gly Ser Cys Gly Gly Met Tyr Thr 195 200 205 Ala Asn Thr Met Ala Ser
Ala Ala Glu Val Leu Gly Leu Thr Ile Pro 210 215 220 Asn Ser Ser Ser
Phe Pro Ala Val Ser Lys Glu Lys Leu Ala Glu Cys 225 230 235 240 Asp
Asn Ile Gly Glu Tyr Ile Lys Lys Thr Met Glu Leu Gly Ile Leu 245 250
255 Pro Arg Asp Ile Leu Thr Lys Glu Ala Phe Glu Asn Ala Ile Thr Tyr
260 265 270 Val Val Ala Thr Gly Gly Ser Thr Asn Ala Val Leu His Leu
Val Ala 275 280 285 Val Ala His Ser Ala Gly Val Lys Leu Ser Pro Asp
Asp Phe Gln Arg 290 295 300 Ile Ser Asp Thr Thr Pro Leu Ile Gly Asp
Phe Lys Pro Ser Gly Lys 305 310 315 320 Tyr Val Met Ala Asp Leu Ile
Asn Val Gly Gly Thr Gln Ser Val Ile 325 330 335 Lys Tyr Leu Tyr Glu
Asn Asn Met Leu His Gly Asn Thr Met Thr Val 340 345 350 Thr Gly Asp
Thr Leu Ala Glu Arg Ala Lys Lys Ala Pro Ser Leu Pro 355 360 365 Glu
Gly Gln Glu Ile Ile Lys Pro Leu Ser His Pro Ile Lys Ala Asn 370 375
380 Gly His Leu Gln Ile Leu Tyr Gly Ser Leu Ala Pro Gly Gly Ala Val
385 390 395 400 Gly Lys Ile Thr Gly Lys Glu Gly Thr Tyr Phe Lys Gly
Arg Ala Arg 405 410 415 Val Phe Glu Glu Glu Gly Ala Phe Ile Glu Ala
Leu Glu Arg Gly Glu 420 425 430 Ile Lys Lys Gly Glu Lys Thr Val Val
Val Ile Arg Tyr Glu Gly Pro 435 440 445 Arg Gly Ala Pro Gly Met Pro
Glu Met Leu Lys Pro Ser Ser Ala Leu 450 455 460 Met Gly Tyr Gly Leu
Gly Lys Asp Val Ala Leu Leu Thr Asp Gly Arg 465 470 475 480 Phe Ser
Gly Gly Ser His Gly Phe Leu Ile Gly His Ile Val Pro Glu 485 490 495
Ala Ala Glu Gly Gly Pro Ile Gly Leu Val Arg Asp Gly Asp Glu Ile 500
505 510 Ile Ile Asp Ala Asp Asn Asn Lys Ile Asp Leu Leu Val Ser Asp
Lys 515 520 525 Glu Met Ala Gln Arg Lys Gln Ser Trp Val Ala Pro Pro
Pro Arg Tyr 530 535 540 Thr Arg Gly Thr Leu Ser Lys Tyr Ala Lys Leu
Val Ser Asn Ala Ser 545 550 555 560 Asn Gly Cys Val Leu Asp Ala 565
164563PRTSaccharomyces cerevisiae 164Met Asn Lys Tyr Ser Tyr Ile
Ile Thr Glu Pro Lys Gly Gln Gly Ala 1 5 10 15 Ser Gln Ala Met Leu
Tyr Ala Thr Gly Phe Lys Lys Glu Asp Phe Lys 20 25 30 Lys Pro Gln
Val Gly Val Gly Ser Cys Trp Trp Ser Gly Asn Pro Cys 35 40 45 Asn
Met His Leu Leu Asp Leu Asn Asn Arg Cys Ser Gln Ser Ile Glu 50 55
60 Lys Ala Gly Leu Lys Ala Met Gln Phe Asn Thr Ile Gly Val Ser Asp
65 70 75 80 Gly Ile Ser Met Gly Thr Lys Gly Met Arg Tyr Ser Leu Gln
Ser Arg 85 90 95 Glu Ile Ile Ala Asp Ser Phe Glu Thr Ile Met Met
Ala Gln His Tyr 100 105 110 Asp Ala Asn Ile Ala Ile Pro Ser Cys Asp
Lys Asn Met Pro Gly Val 115 120 125 Met Met Ala Met Gly Arg His Asn
Arg Pro Ser Ile Met Val Tyr Gly 130 135 140 Gly Thr Ile Leu Pro Gly
His Pro Thr Cys Gly Ser Ser Lys Ile Ser 145 150 155 160 Lys Asn Ile
Asp Ile Val Ser Ala Phe Gln Ser Tyr Gly Glu Tyr Ile 165 170 175 Ser
Lys Gln Phe Thr Glu Glu Glu Arg Glu Asp Val Val Glu His Ala 180 185
190 Cys Pro Gly Pro Gly Ser Cys Gly Gly Met Tyr Thr Ala Asn Thr Met
195 200 205 Ala Ser Ala Ala Glu Val Leu Gly Leu Thr Ile Pro Asn Ser
Ser Ser 210 215 220 Phe Pro Ala Val Ser Lys Glu Lys Leu Ala Glu Cys
Asp Asn Ile Gly 225 230 235 240 Glu Tyr Ile Lys Lys Thr Met Glu Leu
Gly Ile Leu Pro Arg Asp Ile 245 250 255 Leu Thr Lys Glu Ala Phe Glu
Asn Ala Ile Thr Tyr Val Val Ala Thr 260 265 270 Gly Gly Ser Thr Asn
Ala Val Leu His Leu Val Ala Val Ala His Ser 275 280 285 Ala Gly Val
Lys Leu Ser Pro Asp Asp Phe Gln Arg Ile Ser Asp Thr 290 295 300 Thr
Pro Leu Ile Gly Asp Phe Lys Pro Ser Gly Lys Tyr Val Met Ala 305 310
315 320 Asp Leu Ile Asn Val Gly Gly Thr Gln Ser Val Ile Lys Tyr Leu
Tyr 325 330 335 Glu Asn Asn Met Leu His Gly Asn Thr Met Thr Val Thr
Gly Asp Thr 340 345 350 Leu Ala Glu Arg Ala Lys Lys Ala Pro Ser Leu
Pro Glu Gly Gln Glu 355 360 365 Ile Ile Lys Pro Leu Ser His Pro Ile
Lys Ala Asn Gly His Leu Gln 370 375 380 Ile Leu Tyr Gly Ser Leu Ala
Pro Gly Gly Ala Val Gly Lys Ile Thr 385 390 395 400 Gly Lys Glu Gly
Thr Tyr Phe Lys Gly Arg Ala Arg Val Phe Glu Glu 405 410 415 Glu Gly
Ala Phe Ile Glu Ala Leu Glu Arg Gly Glu Ile Lys Lys Gly 420 425 430
Glu Lys Thr Val Val Val Ile Arg Tyr Glu Gly Pro Arg Gly Ala Pro 435
440 445 Gly Met Pro Glu Met Leu Lys Pro Ser Ser Ala Leu Met Gly Tyr
Gly 450 455 460 Leu Gly Lys Asp Val Ala Leu Leu Thr Asp Gly Arg Phe
Ser Gly Gly 465 470 475 480 Ser His Gly Phe Leu Ile Gly His Ile Val
Pro Glu Ala Ala Glu Gly 485 490 495 Gly Pro Ile Gly Leu Val Arg Asp
Gly Asp Glu Ile Ile Ile Asp Ala 500 505 510 Asp Asn Asn Lys Ile Asp
Leu Leu Val Ser Asp Lys Glu Met Ala Gln 515 520 525 Arg Lys Gln Ser
Trp Val Ala Pro Pro Pro Arg Tyr Thr Arg Gly Thr 530 535 540 Leu Ser
Lys Tyr Ala Lys Leu Val Ser Asn Ala Ser Asn Gly Cys Val 545 550 555
560 Leu Asp Ala 165640PRTNeurospora crassa 165Met Ala Ser Asn Gln
Asp Asn Lys Ala Val Ala Pro Asp Ala Ala Ala 1 5 10 15 Pro Ala Gly
Gln Ser Thr Thr Thr Thr Thr Thr Asn Asp Asn Ser Glu 20 25 30 Arg
Asn Leu Pro Lys Glu Gly Glu Tyr Ile Gln Trp Arg Thr Leu Pro 35 40
45 Ala Gly Asn Pro Asp Gln Leu Asn Arg Trp Ser His Phe Leu Thr Arg
50 55 60 Glu His Glu Phe Pro Gly Ala Gln Ala Met Leu Tyr Gly Ala
Gly Val 65 70 75 80 Pro Asn Lys Asp Met Met Lys Lys Ala Pro His Val
Gly Ile Ala Thr 85 90 95 Val Trp Trp Glu Gly Asn Pro Cys Asn Thr
His Leu Leu Asp Leu Gly 100 105 110 Gln Lys Val Lys Lys Ala Val Glu
Arg Glu Lys Met Leu Ala Trp Gln 115 120 125 Phe Asn Thr Ile Gly Val
Ser Asp Gly Ile Thr Met Gly Gly Glu Gly 130 135 140 Met Arg Tyr Ser
Leu Gln Ser Arg Glu Ile Ile Ala Asp Ser Ile Glu 145 150 155 160 Thr
Val Thr Cys Ala Gln His His Asp Ala Asn Ile Ser Ile Pro Gly 165 170
175 Cys Asp Lys Asn Met Pro Gly Val Ile Met Ala Ala Ala Arg His Asn
180 185 190 Arg Pro Phe Val Met Ile Tyr Gly Gly Thr Met Arg Gly Gly
His Ser 195 200 205 Glu Leu Leu Asp Arg Pro Ile Asn Ile Val Thr Cys
Tyr Glu Ala Ser 210 215 220 Gly Ala Tyr Thr Tyr Gly Arg Leu Lys Pro
Ala Cys Pro Asn Ser Thr 225 230 235 240 Ala Thr Pro Ser Asp Val Met
Asp Asp Ile Glu Gln His Ala Cys Pro 245 250 255 Gly Ala Gly Ala Cys
Gly Gly Met Tyr Thr Ala Asn Thr Met Ala Thr 260 265 270 Ala Ile Glu
Ala Met Gly Leu Thr Ala Pro Gly Ser Ser Ser Phe Pro 275 280 285 Ala
Ser Ser Pro Glu Lys Phe Arg Glu Cys Glu Lys Ala Ala Glu Tyr 290 295
300 Ile Lys Ile Cys Met Glu Lys Asp Ile Arg Pro Arg Asp Leu Leu Thr
305 310 315 320 Lys Ala Ser Phe Glu Asn Ala Leu Val Leu Thr Met Ile
Leu Gly Gly 325 330 335 Ser Thr Asn Gly Val Leu His Tyr Leu Ala Met
Ala Asn Ser Ala Asp 340 345 350 Val Asp Leu Thr Leu Asp Asp Ile Asn
Arg Val Ser Ala Lys Thr Pro 355 360 365 Phe Leu Ala Asp Met Ala Pro
Ser Gly Arg Tyr Tyr Met Glu Asp Leu 370 375 380 Tyr Lys Val Gly Gly
Thr Pro Ala Val Leu Lys Met Leu Ile Ala Ala 385 390 395 400 Gly Tyr
Ile Asp Gly Thr Ile Pro Thr Ile Thr Gly Lys Ser Leu Ala 405 410 415
Glu Asn Val Ser Asp Trp Pro Ser Leu Asp Pro Asp Gln Lys Ile Ile 420
425 430 Arg Pro Leu Asp Asn Pro Ile Lys Ser Gln Gly His Ile Arg Val
Leu 435 440 445 Tyr Gly Asn Phe Ser Pro Gly Gly Ala Val Ala Lys Ile
Thr Gly Lys 450 455 460 Glu Gly Leu Ser Phe Thr Gly Lys Ala Arg Cys
Phe Asn Lys Glu Phe 465 470 475 480 Glu Leu Asp Ala Ala Leu Lys Asn
Ser Glu Ile Thr Leu Glu Gln Gly 485 490 495 Asn Gln Val Leu Ile Val
Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly 500 505 510 Met Pro Glu Gln
Leu Lys Ala Ser Ala Ala Ile Met Gly Ala Gly Leu 515 520 525 Thr Asn
Val Ala Leu Val Thr Asp Gly Arg Tyr Ser Gly Ala Ser His 530 535 540
Gly Phe Ile Val Gly His Val Val Pro Glu Ala Ala Thr Gly Gly Pro 545
550 555 560 Ile Ala Leu Val Lys Asp Gly Asp Leu Ile Thr Ile Asp Ala
Val Arg 565 570 575 Asn Arg Ile Asp Val Val Lys Thr Val Glu Gly Val
Glu Gly Glu Glu 580 585 590 Glu Ile Ala Lys Val Leu Glu Glu Arg Lys
Lys Gly Trp Lys Ala Pro 595 600 605 Lys Met Lys Pro Thr Arg Gly Ala
Leu Ala Lys Tyr Ala Arg Leu Val 610 615 620 Gly Asp Ala Ser His Gly
Ala Val Thr Asp Leu Gly Gly Asp Ala Tyr 625 630 635 640
166561PRTAcaryochloris marina 166Met Ser Asp Asn Arg Asn Ser Gln
Val Val Thr Gln Gly Val Gln Arg 1 5 10 15 Ala Pro Asn Arg Ala Met
Leu Arg Ala Val Gly Phe Gly Asp Asp Asp 20 25 30 Phe Thr Lys Pro
Ile Val Gly Leu Ala Asn Gly Phe Ser Thr Ile Thr 35 40 45 Pro Cys
Asn Met Gly Ile Asp Ser Leu Ala Thr Arg Ala Glu Ala Ser 50 55 60
Ile Arg Thr Ala Gly Ala Met Pro Gln Lys Phe Gly Thr Ile Thr Ile 65
70 75 80 Ser Asp Gly Ile Ser Met Gly Thr Glu Gly Met Lys Tyr Ser
Leu Val 85 90 95 Ser Arg Glu Val Ile Ala Asp Ser Ile Glu Thr Ala
Cys Met Gly Gln 100 105 110 Ser Met Asp Gly Val Leu Ala Ile Gly Gly
Cys Asp Lys Asn Met Pro 115 120 125 Gly Ala Met Leu Ala Met Ala Arg
Met Asn Ile Pro Ala Ile Phe Val 130 135 140 Tyr Gly Gly Thr Ile Lys
Pro Gly His Leu Asn Gly Glu Asp Leu Thr 145 150 155 160 Val Val Ser
Ala Phe Glu Ala Val Gly Gln His Ser Ala Gly Arg Ile 165 170 175 Ser
Glu Ala Glu Leu Thr Ala Val Glu Lys His Ala Cys Pro Gly Ala 180 185
190 Gly Ser Cys Gly Gly Met Tyr Thr Ala Asn Thr Met Ser Ser Ala Phe
195 200 205 Glu Ala Met Gly Met Ser Leu Met Tyr Ser Ser Thr Met Ala
Ala Glu 210 215 220 Asp Glu Glu Lys Ala Val Ser Ala Glu Gln Ser Ala
Ala Val Leu Val 225 230 235 240 Glu Ala Ile His Lys Gln Ile Leu Pro
Arg Asp Ile Leu Thr Arg Lys 245 250 255 Ala Phe Glu Asn Ala Ile Ala
Val Ile Met Ala Val Gly Gly Ser Thr 260 265 270 Asn Ala Val Leu His
Leu Leu Ala Ile Ser Arg Ala Ala Gly Asp Ser 275 280 285 Leu Thr Leu
Asp Asp Phe Glu Thr Ile Arg Ala Gln Val Pro Val Ile 290 295 300 Cys
Asp Leu Lys Pro Ser Gly Arg Tyr Val Ala Thr Asp Leu His Lys 305 310
315 320 Ala Gly Gly Ile Pro Leu Val Met Lys Met Leu Leu Glu His Gly
Leu 325 330 335 Leu His Gly Asp Ala Leu Thr Ile Thr Gly Lys Thr Ile
Ala Glu Gln 340 345 350 Leu Ala Asp Val Pro Ser Glu Pro Ser Pro Asp
Gln Asp Val Ile Arg 355 360 365 Pro Trp Asp Asn Pro Met Tyr Lys Gln
Gly His Leu Ala Ile Leu Arg 370 375 380 Gly Asn Leu Ala Thr Glu Gly
Ala Val Ala Lys Ile Thr Gly Ile Lys 385 390 395 400 Asn Pro Gln Ile
Thr Gly Pro Ala Arg Val Phe Glu Ser Glu Glu Ala 405 410 415 Cys Leu
Glu Ala Ile Leu Ala Gly Lys Ile Gln Pro Asn Asp Val Ile 420 425 430
Val Val Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met Arg Glu Met 435
440 445 Leu Ala Pro Thr Ser Ala Ile Ile Gly Ala Gly Leu Gly Asp Ser
Val 450 455 460 Gly Leu Ile Thr Asp Gly Arg Phe Ser Gly Gly Thr Tyr
Gly Met Val 465 470 475 480 Val Gly His Val Ala Pro Glu Ala Ala Val
Gly Gly Thr Ile Ala Leu 485 490 495 Val Gln Glu Gly Asp Gln Ile Thr
Ile Asp Ala His Ala Arg Lys Leu 500 505 510 Glu Leu His Val Ser Asp
Gln Glu Leu Lys Glu Arg Lys Glu Lys Trp 515 520
525 Glu Gln Pro Lys Pro Leu Tyr Asn Lys Gly Val Leu Ala Lys Tyr Ala
530 535 540 Lys Leu Val Ser Ser Ser Ser Val Gly Ala Val Thr Asp Leu
Asp Leu 545 550 555 560 Phe 167559PRTLyngbya spp. 167Met Ser Asp
Asn Phe Arg Ser Gln Ala Ile Thr Gln Gly Lys Lys Arg 1 5 10 15 Thr
Pro Asn Arg Ala Met Leu Arg Ala Val Gly Phe Gly Asp Glu Asp 20 25
30 Phe Asn Lys Pro Ile Val Gly Ile Ala Asn Gly Tyr Ser Thr Ile Thr
35 40 45 Pro Cys Asn Ile Gly Leu Asn Asp Leu Ala His Arg Ala Glu
Thr Ala 50 55 60 Leu Lys Gln Ala Asp Ala Met Pro Gln Met Phe Gly
Thr Ile Thr Val 65 70 75 80 Ser Asp Gly Ile Ala Met Gly Thr Glu Gly
Met Lys Tyr Ser Leu Val 85 90 95 Ser Arg Glu Val Ile Ala Asp Ala
Ile Glu Thr Ala Cys Asn Gly Gln 100 105 110 Ser Met Asp Gly Val Leu
Ala Ile Gly Gly Cys Asp Lys Asn Met Pro 115 120 125 Gly Ala Met Ile
Ala Ile Ala Arg Met Asn Ile Pro Ala Ile Phe Val 130 135 140 Tyr Gly
Gly Thr Ile Lys Pro Gly Asn Leu Asn Gly Cys Asp Leu Thr 145 150 155
160 Val Val Ser Ala Phe Glu Ala Val Gly Glu Tyr Ser Ala Gly Lys Leu
165 170 175 Asp Asp Asp Arg Leu Leu Asp Ile Glu Arg Leu Ala Cys Pro
Gly Ser 180 185 190 Gly Ser Cys Gly Gly Met Phe Thr Ala Asn Thr Met
Ser Ser Ala Phe 195 200 205 Glu Ala Met Gly Met Ser Leu Met Tyr Ser
Ser Thr Met Ala Ser Glu 210 215 220 Asp Ala Glu Lys Ala Asp Ser Thr
Glu Lys Ser Ala Phe Val Leu Arg 225 230 235 240 Glu Ala Ile Ser Gln
Arg Ile Leu Pro Lys Gln Ile Leu Thr Arg Lys 245 250 255 Ala Phe Glu
Asn Ala Ile Ala Val Ile Met Ala Val Gly Gly Ser Thr 260 265 270 Asn
Ser Val Leu His Leu Leu Ala Ile Ala Tyr Ala Ala Asp Val Glu 275 280
285 Leu Thr Ile Asp Asp Phe Glu Thr Ile Arg Gly Arg Val Pro Val Leu
290 295 300 Cys Asp Leu Lys Pro Ser Gly Arg Phe Val Thr Thr Asp Phe
His Lys 305 310 315 320 Ala Gly Gly Val Pro Leu Ile Met Lys Met Leu
Leu Glu Gln Gly Leu 325 330 335 Ile His Gly Asp Ala Leu Thr Ile Thr
Gly Lys Thr Val Ala Glu Gln 340 345 350 Leu Ala Asp Ile Pro Ser Gln
Pro Ser Ala Asp Gln Glu Val Ile Arg 355 360 365 Pro Trp Asn Asn Pro
Met Tyr Lys Gln Gly His Leu Ala Ile Leu Lys 370 375 380 Gly Asn Leu
Ala Thr Glu Gly Ser Val Ala Lys Ile Thr Gly Val Lys 385 390 395 400
Lys Pro Gln Met Thr Gly Pro Ala Arg Val Phe Glu Ser Glu Glu Gln 405
410 415 Cys Leu Glu Ala Ile Leu Ala Gly Lys Ile Gln Ala Gly Asp Val
Leu 420 425 430 Val Val Arg Tyr Glu Gly Pro Lys Gly Gly Pro Gly Met
Arg Glu Met 435 440 445 Leu Ala Pro Thr Ser Ala Ile Ile Gly Ala Gly
Leu Gly Asp Ser Val 450 455 460 Gly Leu Ile Thr Asp Gly Arg Phe Ser
Gly Gly Thr Tyr Gly Leu Val 465 470 475 480 Val Gly His Val Ala Pro
Glu Ala Ala Val Gly Gly Asn Ile Ala Leu 485 490 495 Val Gln Glu Gly
Asp Ser Ile Thr Ile Asp Ala Ser Gln Arg Leu Leu 500 505 510 Gln Val
Asn Ile Ser Asp Gln Val Leu Glu Gln Arg Arg Gln Asn Trp 515 520 525
Gln Pro Pro Gln Pro Arg Tyr Thr Lys Gly Val Leu Ala Lys Tyr Ala 530
535 540 Lys Leu Val Ser Ser Ser Ser Val Gly Ala Val Thr Asp Leu Asp
545 550 555 168616PRTEscherichia coli 168Met Pro Lys Tyr Arg Ser
Ala Thr Thr Thr His Gly Arg Asn Met Ala 1 5 10 15 Gly Ala Arg Ala
Leu Trp Arg Ala Thr Gly Met Thr Asp Ala Asp Phe 20 25 30 Gly Lys
Pro Ile Ile Ala Val Val Asn Ser Phe Thr Gln Phe Val Pro 35 40 45
Gly His Val His Leu Arg Asp Leu Gly Lys Leu Val Ala Glu Gln Ile 50
55 60 Glu Ala Ala Gly Gly Val Ala Lys Glu Phe Asn Thr Ile Ala Val
Asp 65 70 75 80 Asp Gly Ile Ala Met Gly His Gly Gly Met Leu Tyr Ser
Leu Pro Ser 85 90 95 Arg Glu Leu Ile Ala Asp Ser Val Glu Tyr Met
Val Asn Ala His Cys 100 105 110 Ala Asp Ala Met Val Cys Ile Ser Asn
Cys Asp Lys Ile Thr Pro Gly 115 120 125 Met Leu Met Ala Ser Leu Arg
Leu Asn Ile Pro Val Ile Phe Val Ser 130 135 140 Gly Gly Pro Met Glu
Ala Gly Lys Thr Lys Leu Ser Asp Gln Ile Ile 145 150 155 160 Lys Leu
Asp Leu Val Asp Ala Met Ile Gln Gly Ala Asp Pro Lys Val 165 170 175
Ser Asp Ser Gln Ser Asp Gln Val Glu Arg Ser Ala Cys Pro Thr Cys 180
185 190 Gly Ser Cys Ser Gly Met Phe Thr Ala Asn Ser Met Asn Cys Leu
Thr 195 200 205 Glu Ala Leu Gly Leu Ser Gln Pro Gly Asn Gly Ser Leu
Leu Ala Thr 210 215 220 His Ala Asp Arg Lys Gln Leu Phe Leu Asn Ala
Gly Lys Arg Ile Val 225 230 235 240 Glu Leu Thr Lys Arg Tyr Tyr Glu
Gln Asn Asp Glu Ser Ala Leu Pro 245 250 255 Arg Asn Ile Ala Ser Lys
Ala Ala Phe Glu Asn Ala Met Thr Leu Asp 260 265 270 Ile Ala Met Gly
Gly Ser Thr Asn Thr Val Leu His Leu Leu Ala Ala 275 280 285 Ala Gln
Glu Ala Glu Ile Asp Phe Thr Met Ser Asp Ile Asp Lys Leu 290 295 300
Ser Arg Lys Val Pro Gln Leu Cys Lys Val Ala Pro Ser Thr Gln Lys 305
310 315 320 Tyr His Met Glu Asp Val His Arg Ala Gly Gly Val Ile Gly
Ile Leu 325 330 335 Gly Glu Leu Asp Arg Ala Gly Leu Leu Asn Arg Asp
Val Lys Asn Val 340 345 350 Leu Gly Leu Thr Leu Pro Gln Thr Leu Glu
Gln Tyr Asp Val Met Leu 355 360 365 Thr Gln Asp Asp Ala Val Lys Asn
Met Phe Arg Ala Gly Pro Ala Gly 370 375 380 Ile Arg Thr Thr Gln Ala
Phe Ser Gln Asp Cys Arg Trp Asp Thr Leu 385 390 395 400 Asp Asp Asp
Arg Ala Asn Gly Cys Ile Arg Ser Leu Glu His Ala Tyr 405 410 415 Ser
Lys Asp Gly Gly Leu Ala Val Leu Tyr Gly Asn Phe Ala Glu Asn 420 425
430 Gly Cys Ile Val Lys Thr Ala Gly Val Asp Asp Ser Ile Leu Lys Phe
435 440 445 Thr Gly Pro Ala Lys Val Tyr Glu Ser Gln Asp Asp Ala Val
Glu Ala 450 455 460 Ile Leu Gly Gly Lys Val Val Ala Gly Asp Val Val
Val Ile Arg Tyr 465 470 475 480 Glu Gly Pro Lys Gly Gly Pro Gly Met
Gln Glu Met Leu Tyr Pro Thr 485 490 495 Ser Phe Leu Lys Ser Met Gly
Leu Gly Lys Ala Cys Ala Leu Ile Thr 500 505 510 Asp Gly Arg Phe Ser
Gly Gly Thr Ser Gly Leu Ser Ile Gly His Val 515 520 525 Ser Pro Glu
Ala Ala Ser Gly Gly Ser Ile Gly Leu Ile Glu Asp Gly 530 535 540 Asp
Leu Ile Ala Ile Asp Ile Pro Asn Arg Gly Ile Gln Leu Gln Val 545 550
555 560 Ser Asp Ala Glu Leu Ala Ala Arg Arg Glu Ala Gln Asp Ala Arg
Gly 565 570 575 Asp Lys Ala Trp Thr Pro Lys Asn Arg Glu Arg Gln Val
Ser Phe Ala 580 585 590 Leu Arg Ala Tyr Ala Ser Leu Ala Thr Ser Ala
Asp Lys Gly Ala Val 595 600 605 Arg Asp Lys Ser Lys Leu Gly Gly 610
615 169571PRTBacillus subtilis 169Met Leu Thr Lys Ala Thr Lys Glu
Gln Lys Ser Leu Val Lys Asn Arg 1 5 10 15 Gly Ala Glu Leu Val Val
Asp Cys Leu Val Glu Gln Gly Val Thr His 20 25 30 Val Phe Gly Ile
Pro Gly Ala Lys Ile Asp Ala Val Phe Asp Ala Leu 35 40 45 Gln Asp
Lys Gly Pro Glu Ile Ile Val Ala Arg His Glu Gln Asn Ala 50 55 60
Ala Phe Met Ala Gln Ala Val Gly Arg Leu Thr Gly Lys Pro Gly Val 65
70 75 80 Val Leu Val Thr Ser Gly Pro Gly Ala Ser Asn Leu Ala Thr
Gly Leu 85 90 95 Leu Thr Ala Asn Thr Glu Gly Asp Pro Val Val Ala
Leu Ala Gly Asn 100 105 110 Val Ile Arg Ala Asp Arg Leu Lys Arg Thr
His Gln Ser Leu Asp Asn 115 120 125 Ala Ala Leu Phe Gln Pro Ile Thr
Lys Tyr Ser Val Glu Val Gln Asp 130 135 140 Val Lys Asn Ile Pro Glu
Ala Val Thr Asn Ala Phe Arg Ile Ala Ser 145 150 155 160 Ala Gly Gln
Ala Gly Ala Ala Phe Val Ser Phe Pro Gln Asp Val Val 165 170 175 Asn
Glu Val Thr Asn Thr Lys Asn Val Arg Ala Val Ala Ala Pro Lys 180 185
190 Leu Gly Pro Ala Ala Asp Asp Ala Ile Ser Ala Ala Ile Ala Lys Ile
195 200 205 Gln Thr Ala Lys Leu Pro Val Val Leu Val Gly Met Lys Gly
Gly Arg 210 215 220 Pro Glu Ala Ile Lys Ala Val Arg Lys Leu Leu Lys
Lys Val Gln Leu 225 230 235 240 Pro Phe Val Glu Thr Tyr Gln Ala Ala
Gly Thr Leu Ser Arg Asp Leu 245 250 255 Glu Asp Gln Tyr Phe Gly Arg
Ile Gly Leu Phe Arg Asn Gln Pro Gly 260 265 270 Asp Leu Leu Leu Glu
Gln Ala Asp Val Val Leu Thr Ile Gly Tyr Asp 275 280 285 Pro Ile Glu
Tyr Asp Pro Lys Phe Trp Asn Ile Asn Gly Asp Arg Thr 290 295 300 Ile
Ile His Leu Asp Glu Ile Ile Ala Asp Ile Asp His Ala Tyr Gln 305 310
315 320 Pro Asp Leu Glu Leu Ile Gly Asp Ile Pro Ser Thr Ile Asn His
Ile 325 330 335 Glu His Asp Ala Val Lys Val Glu Phe Ala Glu Arg Glu
Gln Lys Ile 340 345 350 Leu Ser Asp Leu Lys Gln Tyr Met His Glu Gly
Glu Gln Val Pro Ala 355 360 365 Asp Trp Lys Ser Asp Arg Ala His Pro
Leu Glu Ile Val Lys Glu Leu 370 375 380 Arg Asn Ala Val Asp Asp His
Val Thr Val Thr Cys Asp Ile Gly Ser 385 390 395 400 His Ala Ile Trp
Met Ser Arg Tyr Phe Arg Ser Tyr Glu Pro Leu Thr 405 410 415 Leu Met
Ile Ser Asn Gly Met Gln Thr Leu Gly Val Ala Leu Pro Trp 420 425 430
Ala Ile Gly Ala Ser Leu Val Lys Pro Gly Glu Lys Val Val Ser Val 435
440 445 Ser Gly Asp Gly Gly Phe Leu Phe Ser Ala Met Glu Leu Glu Thr
Ala 450 455 460 Val Arg Leu Lys Ala Pro Ile Val His Ile Val Trp Asn
Asp Ser Thr 465 470 475 480 Tyr Asp Met Val Ala Phe Gln Gln Leu Lys
Lys Tyr Asn Arg Thr Ser 485 490 495 Ala Val Asp Phe Gly Asn Ile Asp
Ile Val Lys Tyr Ala Glu Ser Phe 500 505 510 Gly Ala Thr Gly Leu Arg
Val Glu Ser Pro Asp Gln Leu Ala Asp Val 515 520 525 Leu Arg Gln Gly
Met Asn Ala Glu Gly Pro Val Ile Ile Asp Val Pro 530 535 540 Val Asp
Tyr Ser Asp Asn Ile Asn Leu Ala Ser Asp Lys Leu Pro Lys 545 550 555
560 Glu Phe Gly Glu Leu Met Lys Thr Lys Ala Leu 565 570
170491PRTArtificial SequenceE. coli ilvC Q110V 170Met Ala Asn Tyr
Phe Asn Thr Leu Asn Leu Arg Gln Gln Leu Ala Gln 1 5 10 15 Leu Gly
Lys Cys Arg Phe Met Gly Arg Asp Glu Phe Ala Asp Gly Ala 20 25 30
Ser Tyr Leu Gln Gly Lys Lys Val Val Ile Val Gly Cys Gly Ala Gln 35
40 45 Gly Leu Asn Gln Gly Leu Asn Met Arg Asp Ser Gly Leu Asp Ile
Ser 50 55 60 Tyr Ala Leu Arg Lys Glu Ala Ile Ala Glu Lys Arg Ala
Ser Trp Arg 65 70 75 80 Lys Ala Thr Glu Asn Gly Phe Lys Val Gly Thr
Tyr Glu Glu Leu Ile 85 90 95 Pro Gln Ala Asp Leu Val Ile Asn Leu
Thr Pro Asp Lys Val His Ser 100 105 110 Asp Val Val Arg Thr Val Gln
Pro Leu Met Lys Asp Gly Ala Ala Leu 115 120 125 Gly Tyr Ser His Gly
Phe Asn Ile Val Glu Val Gly Glu Gln Ile Arg 130 135 140 Lys Asp Ile
Thr Val Val Met Val Ala Pro Lys Cys Pro Gly Thr Glu 145 150 155 160
Val Arg Glu Glu Tyr Lys Arg Gly Phe Gly Val Pro Thr Leu Ile Ala 165
170 175 Val His Pro Glu Asn Asp Pro Lys Gly Glu Gly Met Ala Ile Ala
Lys 180 185 190 Ala Trp Ala Ala Ala Thr Gly Gly His Arg Ala Gly Val
Leu Glu Ser 195 200 205 Ser Phe Val Ala Glu Val Lys Ser Asp Leu Met
Gly Glu Gln Thr Ile 210 215 220 Leu Cys Gly Met Leu Gln Ala Gly Ser
Leu Leu Cys Phe Asp Lys Leu 225 230 235 240 Val Glu Glu Gly Thr Asp
Pro Ala Tyr Ala Glu Lys Leu Ile Gln Phe 245 250 255 Gly Trp Glu Thr
Ile Thr Glu Ala Leu Lys Gln Gly Gly Ile Thr Leu 260 265 270 Met Met
Asp Arg Leu Ser Asn Pro Ala Lys Leu Arg Ala Tyr Ala Leu 275 280 285
Ser Glu Gln Leu Lys Glu Ile Met Ala Pro Leu Phe Gln Lys His Met 290
295 300 Asp Asp Ile Ile Ser Gly Glu Phe Ser Ser Gly Met Met Ala Asp
Trp 305 310 315 320 Ala Asn Asp Asp Lys Lys Leu Leu Thr Trp Arg Glu
Glu Thr Gly Lys 325 330 335 Thr Ala Phe Glu Thr Ala Pro Gln Tyr Glu
Gly Lys Ile Gly Glu Gln 340 345 350 Glu Tyr Phe Asp Lys Gly Val Leu
Met Ile Ala Met Val Lys Ala Gly 355 360 365 Val Glu Leu Ala Phe Glu
Thr Met Val Asp Ser Gly Ile Ile Glu Glu 370 375 380 Ser Ala Tyr Tyr
Glu Ser Leu His Glu Leu Pro Leu Ile Ala Asn Thr 385 390 395 400 Ile
Ala Arg Lys Arg Leu Tyr Glu Met Asn Val Val Ile Ser Asp Thr 405 410
415 Ala Glu Tyr Gly Asn Tyr Leu Phe Ser Tyr Ala Cys Val Pro Leu Leu
420 425 430 Lys Pro Phe Met Ala Glu Leu Gln Pro Gly Asp Leu Gly Lys
Ala Ile 435 440 445 Pro Glu Gly Ala Val Asp Asn Gly Gln Leu Arg Asp
Val Asn Glu Ala 450 455 460 Ile Arg Ser His Ala Ile Glu Gln Val Gly
Lys Lys Leu Arg Gly Tyr 465 470 475 480 Met Thr Asp Met Lys Arg Ile
Ala Val Ala Gly 485 490 1711476DNAArtificial SequenceE. coli ilvC
codon-optimized for expression in S. cerevisiae (P2D1-A1)
171atggccaact attttaacac
attaaatttg agacaacaat tggctcaact gggtaagtgc 60agatttatgg gaagggacga
gtttgctgat ggtgcttctt atctgcaagg aaagaaagta 120gtaattgttg
gctgcggtgc tcagggtcta aaccaaggtt taaacatgag agattcaggt
180ctggatattt cgtatgcatt gaggaaagag tctattgcag aaaaggatgc
cgattggcgt 240aaagcgacgg aaaatgggtt caaagttggt acttacgaag
aactgatccc tcaggcagat 300ttagtgatta acctaacacc agataaggtt
cactcagacg tagtaagaac agttcaaccg 360ctgatgaagg atggggcagc
tttaggttac tctcatggct ttaatatcgt tgaagtgggc 420gagcagatca
gaaaaggtat aacagtcgta atggttgcgc caaagtgccc aggtacggaa
480gtcagagagg agtacaagag gggttttggt gtacctacat tgatcgccgt
acatcctgaa 540aatgacccca aacgtgaagg tatggcaata gcgaaggcat
gggcagccgc aaccggaggt 600catagagcgg gtgtgttaga gagttctttc
gtagctgagg tcaagagtga cttaatgggt 660gaacaaacca ttctgtgcgg
aatgttgcag gcagggtctt tactatgctt tgataaattg 720gtcgaagagg
gtacagatcc tgcctatgct gaaaagttga tacaatttgg ttgggagaca
780atcaccgagg cacttaaaca aggtggcata acattgatga tggatagact
ttcaaatccg 840gccaagctaa gagcctacgc cttatctgag caactaaaag
agatcatggc accattattc 900caaaagcaca tggacgatat tatctccggt
gagttttcct caggaatgat ggcagattgg 960gcaaacgatg ataaaaagtt
attgacgtgg agagaagaaa ccggcaagac ggcattcgag 1020acagccccac
aatacgaagg taaaattggt gaacaagaat actttgataa gggagtattg
1080atgatagcta tggtgaaggc aggggtagaa cttgcattcg aaactatggt
tgactccggt 1140atcattgaag aatctgcata ctatgagtct ttgcatgaat
tgcctttgat agcaaatact 1200attgcaagaa aaagacttta cgagatgaat
gttgtcatat cagacactgc agaatatggt 1260aattacttat ttagctacgc
gtgtgtcccg ttgttagagc ccttcatggc cgagttacaa 1320cctggtgatt
tggggaaggc tattccggaa ggagcggttg acaatggcca actgagagac
1380gtaaatgaag ctattcgttc acatgctata gaacaggtgg gtaaaaagct
gagaggatat 1440atgaccgata tgaaaagaat tgcagtggca ggatga
1476172491PRTArtificial SequenceE. coli ilvC codon-optimized for
expression in S. cerevisiae (P2D1-A1) 172Met Ala Asn Tyr Phe Asn
Thr Leu Asn Leu Arg Gln Gln Leu Ala Gln 1 5 10 15 Leu Gly Lys Cys
Arg Phe Met Gly Arg Asp Glu Phe Ala Asp Gly Ala 20 25 30 Ser Tyr
Leu Gln Gly Lys Lys Val Val Ile Val Gly Cys Gly Ala Gln 35 40 45
Gly Leu Asn Gln Gly Leu Asn Met Arg Asp Ser Gly Leu Asp Ile Ser 50
55 60 Tyr Ala Leu Arg Lys Glu Ser Ile Ala Glu Lys Asp Ala Asp Trp
Arg 65 70 75 80 Lys Ala Thr Glu Asn Gly Phe Lys Val Gly Thr Tyr Glu
Glu Leu Ile 85 90 95 Pro Gln Ala Asp Leu Val Ile Asn Leu Thr Pro
Asp Lys Val His Ser 100 105 110 Asp Val Val Arg Thr Val Gln Pro Leu
Met Lys Asp Gly Ala Ala Leu 115 120 125 Gly Tyr Ser His Gly Phe Asn
Ile Val Glu Val Gly Glu Gln Ile Arg 130 135 140 Lys Gly Ile Thr Val
Val Met Val Ala Pro Lys Cys Pro Gly Thr Glu 145 150 155 160 Val Arg
Glu Glu Tyr Lys Arg Gly Phe Gly Val Pro Thr Leu Ile Ala 165 170 175
Val His Pro Glu Asn Asp Pro Lys Arg Glu Gly Met Ala Ile Ala Lys 180
185 190 Ala Trp Ala Ala Ala Thr Gly Gly His Arg Ala Gly Val Leu Glu
Ser 195 200 205 Ser Phe Val Ala Glu Val Lys Ser Asp Leu Met Gly Glu
Gln Thr Ile 210 215 220 Leu Cys Gly Met Leu Gln Ala Gly Ser Leu Leu
Cys Phe Asp Lys Leu 225 230 235 240 Val Glu Glu Gly Thr Asp Pro Ala
Tyr Ala Glu Lys Leu Ile Gln Phe 245 250 255 Gly Trp Glu Thr Ile Thr
Glu Ala Leu Lys Gln Gly Gly Ile Thr Leu 260 265 270 Met Met Asp Arg
Leu Ser Asn Pro Ala Lys Leu Arg Ala Tyr Ala Leu 275 280 285 Ser Glu
Gln Leu Lys Glu Ile Met Ala Pro Leu Phe Gln Lys His Met 290 295 300
Asp Asp Ile Ile Ser Gly Glu Phe Ser Ser Gly Met Met Ala Asp Trp 305
310 315 320 Ala Asn Asp Asp Lys Lys Leu Leu Thr Trp Arg Glu Glu Thr
Gly Lys 325 330 335 Thr Ala Phe Glu Thr Ala Pro Gln Tyr Glu Gly Lys
Ile Gly Glu Gln 340 345 350 Glu Tyr Phe Asp Lys Gly Val Leu Met Ile
Ala Met Val Lys Ala Gly 355 360 365 Val Glu Leu Ala Phe Glu Thr Met
Val Asp Ser Gly Ile Ile Glu Glu 370 375 380 Ser Ala Tyr Tyr Glu Ser
Leu His Glu Leu Pro Leu Ile Ala Asn Thr 385 390 395 400 Ile Ala Arg
Lys Arg Leu Tyr Glu Met Asn Val Val Ile Ser Asp Thr 405 410 415 Ala
Glu Tyr Gly Asn Tyr Leu Phe Ser Tyr Ala Cys Val Pro Leu Leu 420 425
430 Glu Pro Phe Met Ala Glu Leu Gln Pro Gly Asp Leu Gly Lys Ala Ile
435 440 445 Pro Glu Gly Ala Val Asp Asn Gly Gln Leu Arg Asp Val Asn
Glu Ala 450 455 460 Ile Arg Ser His Ala Ile Glu Gln Val Gly Lys Lys
Leu Arg Gly Tyr 465 470 475 480 Met Thr Asp Met Lys Arg Ile Ala Val
Ala Gly 485 490 173548PRTLactococcus lactis 173Met Tyr Thr Val Gly
Asp Tyr Leu Leu Asp Arg Leu His Glu Leu Gly 1 5 10 15 Ile Glu Glu
Ile Phe Gly Val Pro Gly Asp Tyr Asn Leu Gln Phe Leu 20 25 30 Asp
Gln Ile Ile Ser His Lys Asp Met Lys Trp Val Gly Asn Ala Asn 35 40
45 Glu Leu Asn Ala Ser Tyr Met Ala Asp Gly Tyr Ala Arg Thr Lys Lys
50 55 60 Ala Ala Ala Phe Leu Thr Thr Phe Gly Val Gly Glu Leu Ser
Ala Val 65 70 75 80 Asn Gly Leu Ala Gly Ser Tyr Ala Glu Asn Leu Pro
Val Val Glu Ile 85 90 95 Val Gly Ser Pro Thr Ser Lys Val Gln Asn
Glu Gly Lys Phe Val His 100 105 110 His Thr Leu Ala Asp Gly Asp Phe
Lys His Phe Met Lys Met His Glu 115 120 125 Pro Val Thr Ala Ala Arg
Thr Leu Leu Thr Ala Glu Asn Ala Thr Val 130 135 140 Glu Ile Asp Arg
Val Leu Ser Ala Leu Leu Lys Glu Arg Lys Pro Val 145 150 155 160 Tyr
Ile Asn Leu Pro Val Asp Val Ala Ala Ala Lys Ala Glu Lys Pro 165 170
175 Ser Leu Pro Leu Lys Lys Glu Asn Ser Thr Ser Asn Thr Ser Asp Gln
180 185 190 Glu Ile Leu Asn Lys Ile Gln Glu Ser Leu Lys Asn Ala Lys
Lys Pro 195 200 205 Ile Val Ile Thr Gly His Glu Ile Ile Ser Phe Gly
Leu Glu Lys Thr 210 215 220 Val Thr Gln Phe Ile Ser Lys Thr Lys Leu
Pro Ile Thr Thr Leu Asn 225 230 235 240 Phe Gly Lys Ser Ser Val Asp
Glu Ala Leu Pro Ser Phe Leu Gly Ile 245 250 255 Tyr Asn Gly Thr Leu
Ser Glu Pro Asn Leu Lys Glu Phe Val Glu Ser 260 265 270 Ala Asp Phe
Ile Leu Met Leu Gly Val Lys Leu Thr Asp Ser Ser Thr 275 280 285 Gly
Ala Phe Thr His His Leu Asn Glu Asn Lys Met Ile Ser Leu Asn 290 295
300 Ile Asp Glu Gly Lys Ile Phe Asn Glu Arg Ile Gln Asn Phe Asp Phe
305 310 315 320 Glu Ser Leu Ile Ser Ser Leu Leu Asp Leu Ser Glu Ile
Glu Tyr Lys 325 330 335 Gly Lys Tyr Ile Asp Lys Lys Gln Glu Asp Phe
Val Pro Ser Asn Ala 340 345 350 Leu Leu Ser Gln Asp Arg Leu Trp Gln
Ala Val Glu Asn Leu Thr Gln 355 360 365 Ser Asn Glu Thr Ile Val Ala
Glu Gln Gly Thr Ser Phe Phe Gly Ala 370 375 380 Ser Ser Ile Phe Leu
Lys Ser Lys Ser His Phe Ile Gly Gln Pro Leu 385 390 395 400 Trp Gly
Ser Ile Gly Tyr Thr Phe Pro Ala Ala Leu Gly Ser Gln Ile 405 410 415
Ala Asp Lys Glu Ser Arg His Leu Leu Phe Ile Gly Asp Gly Ser Leu 420
425 430 Gln Leu Thr Val Gln Glu Leu Gly Leu Ala Ile Arg Glu Lys Ile
Asn 435 440 445 Pro Ile Cys Phe Ile Ile Asn Asn Asp Gly Tyr Thr Val
Glu Arg Glu 450 455 460 Ile His Gly Pro Asn Gln Ser Tyr Asn Asp Ile
Pro Met Trp Asn Tyr 465 470 475 480 Ser Lys Leu Pro Glu Ser Phe Gly
Ala Thr Glu Asp Arg Val Val Ser 485 490 495 Lys Ile Val Arg Thr Glu
Asn Glu Phe Val Ser Val Met Lys Glu Ala 500 505 510 Gln Ala Asp Pro
Asn Arg Met Tyr Trp Ile Glu Leu Ile Leu Ala Lys 515 520 525 Glu Gly
Ala Pro Lys Val Leu Lys Lys Met Gly Lys Leu Phe Ala Glu 530 535 540
Gln Asn Lys Ser 545
<210> 174
<211> 1023
<212> DNA
<213> Lactococcus lactis
<400> 174
atgaaagcag cagtagtaag acacaatcca gatggttatg cggaccttgt tgaaaaggaa 60
cttcgagcaa tcaaacctaa tgaagctttg cttgacatgg agtattgtgg agtctgtcat 120
accgatttgc acgttgcagc aggtgattat ggcaacaaag cagggactgt tcttggtcat 180
gaaggaattg gaattgtcaa agaaattgga gctgatgtaa gctcgcttca agttggtgat 240
cgggtttcag tggcttggtt ctttgaagga tgtggtcact gtgaatactg tgtatctggt 300
aatgaaactt tttgtcgaga agttaaaaat gcaggatatt cagttgatgg cggaatggct 360
gaagaagcaa ttgttgttgc cgattatgct gtcaaagttc ctgacggact tgacccaatt 420
gaagctagct caattacttg tgctggagta acaacttaca aagcaatcaa agtatcagga 480
gtaaaacctg gtgattggca agtaattttt ggtgctggag gacttggaaa tttagcaatt 540
caatatgcta aaaatgtttt tggagcaaaa gtaattgctg ttgatattaa tcaagataaa 600
ttaaatttag ctaaaaaaat tggagctgat gtgattatca attctggtga tgtaaatcca 660
gttgatgaaa ttaaaaaaat aactggcggc ttaggggtgc aaagtgcaat agtttgtgct 720
gttgcaagga ttgcttttga acaagcggtt gcttctttga aacctatggg caaaatggtt 780
gctgtggcac ttcccaatac tgagatgact ttatcagttc caacagttgt ttttgacgga 840
gtggaggttg caggttcact tgtcggaaca agacttgact tggcagaagc ttttcaattt 900
ggagcagaag gtaaggtaaa accaattgtt gcgacacgca aactggaaga aatcaatgat 960
attattgatg aaatgaaggc aggaaaaatt gaaggccgaa tggtcattga ttttactaaa 1020
taa 1023
<210> 175
<211> 340
<212> PRT
<213> Lactococcus lactis
<400> 175
Met Lys Ala Ala Val Val Arg His
Asn Pro Asp Gly Tyr Ala Asp Leu 1 5 10 15 Val Glu Lys Glu Leu Arg
Ala Ile Lys Pro Asn Glu Ala Leu Leu Asp 20 25 30 Met Glu Tyr Cys
Gly Val Cys His Thr Asp Leu His Val Ala Ala Gly 35 40 45 Asp Tyr
Gly Asn Lys Ala Gly Thr Val Leu Gly His Glu Gly Ile Gly 50 55 60
Ile Val Lys Glu Ile Gly Ala Asp Val Ser Ser Leu Gln Val Gly Asp 65
70 75 80 Arg Val Ser Val Ala Trp Phe Phe Glu Gly Cys Gly His Cys
Glu Tyr 85 90 95 Cys Val Ser Gly Asn Glu Thr Phe Cys Arg Glu Val
Lys Asn Ala Gly 100 105 110 Tyr Ser Val Asp Gly Gly Met Ala Glu Glu
Ala Ile Val Val Ala Asp 115 120 125 Tyr Ala Val Lys Val Pro Asp Gly
Leu Asp Pro Ile Glu Ala Ser Ser 130 135 140 Ile Thr Cys Ala Gly Val
Thr Thr Tyr Lys Ala Ile Lys Val Ser Gly 145 150 155 160 Val Lys Pro
Gly Asp Trp Gln Val Ile Phe Gly Ala Gly Gly Leu Gly 165 170 175 Asn
Leu Ala Ile Gln Tyr Ala Lys Asn Val Phe Gly Ala Lys Val Ile 180 185
190 Ala Val Asp Ile Asn Gln Asp Lys Leu Asn Leu Ala Lys Lys Ile Gly
195 200 205 Ala Asp Val Ile Ile Asn Ser Gly Asp Val Asn Pro Val Asp
Glu Ile 210 215 220 Lys Lys Ile Thr Gly Gly Leu Gly Val Gln Ser Ala
Ile Val Cys Ala 225 230 235 240 Val Ala Arg Ile Ala Phe Glu Gln Ala
Val Ala Ser Leu Lys Pro Met 245 250 255 Gly Lys Met Val Ala Val Ala
Leu Pro Asn Thr Glu Met Thr Leu Ser 260 265 270 Val Pro Thr Val Val
Phe Asp Gly Val Glu Val Ala Gly Ser Leu Val 275 280 285 Gly Thr Arg
Leu Asp Leu Ala Glu Ala Phe Gln Phe Gly Ala Glu Gly 290 295 300 Lys
Val Lys Pro Ile Val Ala Thr Arg Lys Leu Glu Glu Ile Asn Asp 305 310
315 320 Ile Ile Asp Glu Met Lys Ala Gly Lys Ile Glu Gly Arg Met Val
Ile 325 330 335 Asp Phe Thr Lys 340
<210> 176
<211> 256
<212> PRT
<213> Drosophila melanogaster
<400> 176
Met Ser Phe Thr Leu Thr Asn Lys Asn Val Ile Phe Val
Ala Gly Leu 1 5 10 15 Gly Gly Ile Gly Leu Asp Thr Ser Lys Glu Leu
Leu Lys Arg Asp Leu 20 25 30 Lys Asn Leu Val Ile Leu Asp Arg Ile
Glu Asn Pro Ala Ala Ile Ala 35 40 45 Glu Leu Lys Ala Ile Asn Pro
Lys Val Thr Val Thr Phe Tyr Pro Tyr 50 55 60 Asp Val Thr Val Pro
Ile Ala Glu Thr Thr Lys Leu Leu Lys Thr Ile 65 70 75 80 Phe Ala Gln
Leu Lys Thr Val Asp Val Leu Ile Asn Gly Ala Gly Ile 85 90 95 Leu
Asp Asp His Gln Ile Glu Arg Thr Ile Ala Val Asn Tyr Thr Gly 100 105
110 Leu Val Asn Thr Thr Thr Ala Ile Leu Asp Phe Trp Asp Lys Arg Lys
115 120 125 Gly Gly Pro Gly Gly Ile Ile Cys Asn Ile Gly Ser Val Thr
Gly Phe 130 135 140 Asn Ala Ile Tyr Gln Val Pro Val Tyr Ser Gly Thr
Lys Ala Ala Val 145 150 155 160 Val Asn Phe Thr Ser Ser Leu Ala Lys
Leu Ala Pro Ile Thr Gly Val 165 170 175 Thr Ala Tyr Thr Val Asn Pro
Gly Ile Thr Arg Thr Thr Leu Val His 180 185 190 Thr Phe Asn Ser Trp
Leu Asp Val Glu Pro Gln Val Ala Glu Lys Leu 195 200 205 Leu Ala His
Pro Thr Gln Pro Ser Leu Ala Cys Ala Glu Asn Phe Val 210 215 220 Lys
Ala Ile Glu Leu Asn Gln Asn Gly Ala Ile Trp Lys Leu Asp Leu 225 230
235 240 Gly Thr Leu Glu Ala Ile Gln Trp Thr Lys His Trp Asp Ser Gly
Ile 245 250 255
<210> 177
<211> 340
<212> PRT
<213> Artificial Sequence
<220>
<223> L. lactis AdhA RE1
<400> 177
Met Lys Ala Ala Val Val Arg His Asn Pro Asp Gly Tyr Ala Asp Leu
1 5 10 15 Val Glu Lys Glu Leu Arg Ala Ile Lys Pro Asn Glu Ala Leu
Leu Asp 20 25 30 Met Glu Tyr Cys Gly Val Cys His Thr Asp Leu His
Val Ala Ala Gly 35 40 45 Asp Phe Gly Asn Lys Ala Gly Thr Val Leu
Gly His Glu Gly Ile Gly 50 55 60 Ile Val Lys Glu Ile Gly Ala Asp
Val Ser Ser Leu Gln Val Gly Asp 65 70 75 80 Arg Val Ser Val Ala Trp
Phe Phe Glu Gly Cys Gly His Cys Glu Tyr 85 90 95 Cys Val Ser Gly
Asn Glu Thr Phe Cys Arg Glu Val Lys Asn Ala Gly 100 105 110 Tyr Ser
Val Asp Gly Gly Met Ala Glu Glu Ala Ile Val Val Ala Asp 115 120 125
Tyr Ala Val Lys Val Pro Asp Gly Leu Asp Pro Ile Glu Ala Ser Ser 130
135 140 Ile Thr Cys Ala Gly Val Thr Thr Tyr Lys Ala Ile Lys Val Ser
Gly 145 150 155 160 Val Lys Pro Gly Asp Trp Gln Val Ile Phe Gly Ala
Gly Gly Leu Gly 165 170 175 Asn Leu Ala Ile Gln Tyr Ala Lys Asn Val
Phe Gly Ala Lys Val Ile 180 185 190 Ala Val Asp Ile Asn Gln Asp Lys
Leu Asn Leu Ala Lys Lys Ile Gly 195 200 205 Ala Asp Val Thr Ile Asn
Ser Gly Asp Val Asn Pro Val Asp Glu Ile 210 215 220 Lys Lys Ile Thr
Gly Gly Leu Gly Val Gln Ser Ala Ile Val Cys Ala 225 230
235 240 Val Ala Arg Ile Ala Phe Glu Gln Ala Val Ala Ser Leu Lys Pro
Met 245 250 255 Gly Lys Met Val Ala Val Ala Val Pro Asn Thr Glu Met
Thr Leu Ser 260 265 270 Val Pro Thr Val Val Phe Asp Gly Val Glu Val
Ala Gly Ser Leu Val 275 280 285 Gly Thr Arg Leu Asp Leu Ala Glu Ala
Phe Gln Phe Gly Ala Glu Gly 290 295 300 Lys Val Lys Pro Ile Val Ala
Thr Arg Lys Leu Glu Glu Ile Asn Asp 305 310 315 320 Ile Ile Asp Glu
Met Lys Ala Gly Lys Ile Glu Gly Arg Met Val Ile 325 330 335 Asp Phe
Thr Lys 340