U.S. patent application number 14/496979 was filed with the patent office on 2014-09-25 and published on 2015-01-08 for GH61 glycoside hydrolase protein variants and cofactors that enhance GH61 activity.
The applicant listed for this patent is Codexis, Inc. Invention is credited to Dipnath Baidyaroy, David M. Elgart, John H. Grate, Kripa K. Rao, Jie Yang, Jungjoo Yoon, Xiyun Zhang.
Publication Number | 20150010981 |
Application Number | 14/496979 |
Family ID | 47746817 |
Publication Date | 2015-01-08 |
United States Patent Application | 20150010981
Kind Code | A1
Yang; Jie; et al. | January 8, 2015
GH61 GLYCOSIDE HYDROLASE PROTEIN VARIANTS AND COFACTORS THAT
ENHANCE GH61 ACTIVITY
Abstract
The present invention provides various GH61 protein variants
comprising various amino acid substitutions. The GH61 protein
variants have an improved ability to synergize with cellulase
enzymes, thereby increasing the yield of fermentable sugars
obtained by saccharification of biomass. In some embodiments,
sugars obtained from saccharification are fermented to produce
numerous end-products, including but not limited to alcohol.
Inventors:
Yang; Jie (Foster City, CA)
Zhang; Xiyun (Fremont, CA)
Yoon; Jungjoo (Foster City, CA)
Rao; Kripa K. (Union City, CA)
Grate; John H. (Los Altos, CA)
Elgart; David M. (San Mateo, CA)
Baidyaroy; Dipnath (Fremont, CA)
Applicant:
Name | City | State | Country | Type
Codexis, Inc. | Redwood City | CA | US |
Family ID: 47746817
Appl. No.: 14/496979
Filed: September 25, 2014
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
13592024 | Aug 22, 2012 | 8877474
14496979 | |
13215193 | Aug 22, 2011 | 8298795
13592024 | |
61526224 | Aug 22, 2011 |
61601997 | Feb 22, 2012 |
Current U.S. Class: 435/198; 435/203; 435/205; 435/209; 435/254.11; 435/320.1; 536/23.2
Current CPC Class: C12N 9/2434 20130101; C12N 9/2445 20130101; Y02E 50/10 20130101; C12P 19/14 20130101; C12P 7/10 20130101; Y02E 50/16 20130101; C12N 9/242 20130101; C12N 9/2437 20130101; C12Y 302/01004 20130101; C12Y 302/01091 20130101; C12Y 302/01021 20130101
Class at Publication: 435/198; 536/23.2; 435/320.1; 435/254.11; 435/209; 435/203; 435/205
International Class: C12N 9/42 20060101 C12N009/42; C12N 9/30 20060101 C12N009/30
Claims
1. A polynucleotide comprising a nucleic acid sequence encoding a
GH61 variant protein that is at least about 90% identical to SEQ ID
NO:2 or a polynucleotide that hybridizes under stringent
hybridization conditions to the polynucleotide and/or a complement
of a polynucleotide encoding said GH61 variant protein.
2. A polynucleotide sequence encoding a GH61 variant protein,
wherein said polynucleotide sequence is at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% identical to any of SEQ ID
NOS:1, 4, 7, and/or 10, or a polynucleotide that hybridizes under
stringent hybridization conditions to the polynucleotide and/or a
complement of any of SEQ ID NOS:1, 4, 7, and/or 10.
3. A recombinant nucleic acid construct comprising at least one
polynucleotide sequence encoding at least one GH61 protein, wherein
the polynucleotide is selected from: (a) a polynucleotide that
encodes a polypeptide comprising an amino acid sequence having at
least about 70%, at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, or at
least about 99% identity to SEQ ID NO:2, 3, 5, 6, 8, and/or 9,
wherein the amino acid sequence comprises at least one substitution
and/or substitution set provided herein; (b) a polynucleotide that
hybridizes under stringent hybridization conditions to at least a
fragment of a polynucleotide that encodes a polypeptide having the
amino acid sequence of SEQ ID NO:2, 3, 5, 6, 8, and/or 9, and
wherein said amino acid sequence comprises at least one
substitution and/or at least one substitution set provided herein;
and/or (c) a polynucleotide that hybridizes under stringent
hybridization conditions to the complement of at least a fragment
of a polynucleotide that encodes a polypeptide having the amino
acid sequence of SEQ ID NO:2, 3, 5, 6, 8, and/or 9, and wherein
said amino acid sequence comprises at least one substitution and/or
at least one substitution set provided herein.
4. A recombinant nucleic acid construct comprising at least one
polynucleotide sequence encoding at least one GH61 protein, wherein
the polynucleotide is selected from: (a) a polynucleotide that
encodes a polypeptide comprising an amino acid sequence having at
least 70%, at least 75%, at least 80%, at least 85%, at least 90%,
at least 91%, at least 92%, at least 93%, at least 94%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%
identity to SEQ ID NO:2, wherein the amino acid sequence comprises
at least one substitution and/or substitution set provided herein;
(b) a polynucleotide that hybridizes under stringent hybridization
conditions to a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein said amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein; and/or (c) a polynucleotide that
hybridizes under stringent hybridization conditions to the
complement of a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein said amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein.
5. The recombinant nucleic acid construct of claim 3, wherein the
polynucleotide sequence is at least about 70%, at least about 75%,
at least about 80%, at least about 85%, at least about 90%, at
least about 91%, at least about 92%, at least about 93%, at least
about 94%, at least about 95%, at least about 96%, at least about
97%, at least about 98%, or at least about 99% identical to any of
SEQ ID NOS:1, 4, 7, and/or 10, and wherein said polynucleotide
sequence comprises at least one mutation and/or at least one
mutation set provided herein.
6. The recombinant nucleic acid construct of claim 3, wherein the
polynucleotide sequence comprises at least one mutation or mutation
set selected from t60c/c573g, t60c/c573g/g1026a, c573g,
t60c/c291a/c573g, t60c/c291a, t60c/c876t, a312g, t60c,
t379a/c380g/g381c, c300t, t204c/t379a/c380g/g381c/c385t, g1026a,
c246t, c597g, c72t, c732g/c843t/c882t, c909t, c912g, g921a, c792t,
g972t, g921a, t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t,
c520a/c522g; t60c/c573g; t60c/c288t/c573g; t60c/c198t/c573g; and/or
t60c/g399a/c573g.
7. The recombinant nucleic acid construct of claim 3, wherein said
nucleic acid sequence is operably linked to a promoter.
8. The nucleic acid construct of claim 3, wherein said construct
further encodes at least one enzyme in addition to said GH61
variant protein.
9. The nucleic acid construct of claim 8, wherein said at least one
additional enzyme is selected from wild-type GH61 enzymes,
endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases.
10. An expression construct comprising at least one nucleic acid
construct of claim 3.
11. A host cell comprising the nucleic acid construct of claim
3.
12. The host cell of claim 11, wherein said host cell further
produces at least one enzyme selected from wild-type GH61 enzymes,
endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases.
13. The host cell of claim 11, wherein said host cell is a yeast or
filamentous fungal cell.
14. A method of producing at least one GH61 variant protein
comprising culturing the host cell set forth in claim 3, under
conditions such that said host cell produces at least one of GH61
variant proteins.
15. The method of claim 14, wherein said host cell further produces
at least one additional enzyme selected from wild-type GH61
enzymes, endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases.
Description
[0001] The present application is a Divisional of U.S. patent
application Ser. No. 13/592,024, filed Aug. 22, 2012, which claims
priority to previously filed U.S. patent application Ser. No.
13/215,193, filed Aug. 22, 2011, U.S. Prov. Appln. Ser. No.
61/526,224, filed Aug. 22, 2011, and U.S. Prov. Appln. Ser. No.
61/601,997, filed Feb. 22, 2012, all of which are hereby
incorporated in their entireties for all purposes.
REFERENCE TO A "SEQUENCE LISTING," A TABLE, OR A COMPUTER PROGRAM
LISTING APPENDIX SUBMITTED AS AN ASCII TEXT FILE
[0002] The Sequence Listing written in file CX35-101US2A_ST25.TXT,
created on Aug. 20, 2012, 416,766 bytes, machine format IBM-PC,
MS-Windows operating system, is hereby incorporated by
reference.
FIELD OF THE INVENTION
[0003] The invention relates generally to the field of glycolytic
enzymes and their use, and to the field of directed enzyme
evolution or modification. More specifically, the present invention
provides GH61 protein variants, and methods for the use of such
protein variants in production of fermentable sugars and ethanol
from cellulosic biomass.
BACKGROUND
[0004] Cellulosic biomass is a significant renewable resource for
the generation of fermentable sugars. These sugars can be used as
substrates for fermentation and other metabolic processes to
produce biofuels, chemical compounds and other commercially
valuable end-products.
[0005] The conversion of cellulosic biomass to fermentable sugars
may begin with chemical, mechanical, enzymatic or other
pretreatments to increase the susceptibility of cellulose to
hydrolysis. Such pretreatment may be followed by the enzymatic
conversion of cellulose to cellobiose, cello-oligosaccharides,
glucose, and other sugars and sugar polymers, using enzymes that
break down cellulose. These enzymes are collectively referred to as
"cellulases" and include endoglucanases, beta-glucosidases and
cellobiohydrolases.
SUMMARY OF THE INVENTION
[0006] The invention provides numerous variants of GH61 proteins.
In some embodiments, these variants comprise amino acid
substitutions as set forth herein. In some embodiments, these
variants exhibit an improved ability to synergize with cellulase
enzymes, thereby increasing the yield of fermentable sugars
obtained by saccharification of cellulose-containing biomass.
Sugars obtained from saccharification can be fermented to produce
alcohol and other end-products. Thus, the GH61 variant proteins of
this invention have important commercial applicability in the
production of biofuels and other end-products. In some embodiments,
the present invention provides GH61 variant proteins comprising an
amino acid sequence that is substantially identical (for example,
at least about 70%, about 75%, about 80%, about 85%, about 90%,
about 91%, about 92%, about 93%, about 94%, about 95%, about 96%,
about 97%, about 98%, about 99%, or about 100% identical) to SEQ ID
NO:2 or a fragment of SEQ ID NO:2 having GH61 activity as defined
below. In some embodiments, the variant protein has one or more
amino acid substitutions with respect to SEQ ID NO:2 or a fragment
of SEQ ID NO:2. In some embodiments, the GH61 is at least 95%
identical to SEQ ID NO:2 or a fragment of SEQ ID NO:2 having GH61
activity. In some embodiments, the GH61 variant proteins have
increased thermoactivity compared with the GH61 wild-type protein
of SEQ ID NO:2. In some further embodiments, the GH61 variant
proteins have increased thermostability compared with the GH61
wild-type protein of SEQ ID NO:2.
[0007] In some embodiments, the present invention provides GH61
variants comprising substitution(s) in at least one of the
positions as indicated herein. In some embodiments, the
substitution(s) provide GH61 variants that have increased activity
as compared to wild-type GH61. In some embodiments, the GH61
variants comprise at least one substitution selected from those
listed in Table 1 and/or Table 2 in any combination, wherein the
positions are numbered with reference to SEQ ID NO:2.
[0008] In some further embodiments, the GH61 variants provided
herein comprise any one or more of the mutations listed in
Table 1 and/or Table 2 in any combination. It is not intended that
the present invention be limited to the specific substitutions. Any
two, three, four, or more than four substitutions find use in any
combination that improves GH61 activity. Non-limiting illustrations
of effective combinations are provided herein.
[0009] In some embodiments, a substitution or combination of
substitutions in the amino acid sequence as provided herein results
in the variant protein having increased GH61 activity in a
saccharification reaction. In some embodiments, crystalline
cellulose undergoes saccharification by cellulase enzymes that are
contained in culture broth from M. thermophila cells. When measured
in this manner, a GH61 variant protein of this invention causes an
increase in yield of fermentable sugars (e.g., glucose) to a degree
that is about 1.5-fold, about 2-fold, about 3-fold, about 5-fold,
about 8-fold, about 10-fold or more compared with the parental GH61
sequence (SEQ ID NO:2) or biologically active fragment, that is,
compared with a reference protein comprising SEQ ID NO:2 or the
fragment, without any substitutions. It is not intended that the present
invention be limited to the production of any particular
fermentable sugar(s). It is also not intended that the present
invention be limited to any specific level of improvement in the
yield of fermentable sugar using at least one of the variants
provided herein.
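The fold comparison in the preceding paragraph can be illustrated with a short calculation. The following is a minimal sketch, assuming that fold improvement is taken as the simple ratio of the sugar yield obtained with a variant to the yield obtained with the SEQ ID NO:2 reference under otherwise identical conditions. The function name and the numeric values are hypothetical and are not data from this application.

```python
def fold_improvement(variant_yield: float, reference_yield: float) -> float:
    """Return the variant yield divided by the reference yield.

    Both yields must be in the same units (for example, g/L glucose).
    This ratio definition is an assumption made for illustration.
    """
    if reference_yield <= 0:
        raise ValueError("reference yield must be positive")
    return variant_yield / reference_yield


# Hypothetical glucose yields in g/L, for illustration only.
print(fold_improvement(12.0, 8.0))  # 1.5
```

A result of 1.5 would correspond to the "about 1.5-fold" level named in the text.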
[0010] This invention also provides GH61 protein variants that are
more resistant to the presence of enzyme inhibitors that may be
present in commercial sources of biomass, or be generated as a
result of pretreatment of the biomass substrate.
[0011] In some embodiments, the present invention provides GH61
variant proteins comprising amino acid sequences that are at least
about 60%, at least about 65%, at least about 70%, at least about
75%, at least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, at least about 99%, or at least about 100%
identical to SEQ ID NO:2 or a fragment of SEQ ID NO:2 having GH61
activity, wherein the amino acid sequence of the variant protein
has one or more amino acid substitutions with respect to SEQ ID
NO:2 or the fragment.
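The percent-identity thresholds recited above can be made concrete with a small calculation. The following is a minimal sketch, assuming identity is counted over the columns of an existing pairwise alignment in which gaps are written as "-". The definition of identity that governs this application is given elsewhere in the specification and may differ (for example, in the choice of denominator). The function name and the toy sequences are hypothetical and are not SEQ ID NO:2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch. This convention is an assumption.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


reference = "MKLSAVTGLA"  # hypothetical 10-residue reference
variant = "MKLSGVTGLA"    # same sequence with one substitution at position 5
print(percent_identity(reference, variant))  # 90.0
```

Under this convention, a variant carrying a single substitution in a 10-residue reference is 90% identical to it, so a variant can satisfy a high identity threshold while still carrying substitutions.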
[0012] In some embodiments, the present invention provides GH61
variant proteins comprising amino acid sequences that are at least
about 60%, at least about 65%, at least about 70%,
at least about 75%, at least about 80%, at least about 85%, at
least about 90%, at least about 91%, at least about 92%, at least
about 93%, at least about 94%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%
identical to SEQ ID NO:2 or a fragment of SEQ ID NO:2 having GH61
activity, wherein the amino acid sequence of the variant protein
has one or more amino acid substitutions with respect to SEQ ID
NO:2 or the fragment, and wherein the substitution(s) in the amino
acid sequence result in the variant protein having increased GH61
activity in a reaction where crystalline cellulose undergoes
saccharification by cellulase enzymes that are contained in culture
broth from M. thermophila cells, compared with a reference protein
comprising SEQ ID NO:2 or the fragment, without any
substitutions.
[0013] In some embodiments, the present invention provides GH61
variant proteins comprising amino acid sequences that are at least
about 60%, at least about 65%, at least about 75%, at least about
80%, at least about 85%, at least about 90%, at least about 91%, at
least about 92%, at least about 93%, at least about 94%, at least
about 95%, at least about 96%, at least about 97%, at least about
98%, at least about 99%, or at least about 100% identical to SEQ ID
NO:2 or a fragment of SEQ ID NO:2 having GH61 activity, wherein the
amino acid sequence of the variant protein has one or more amino
acid substitutions with respect to SEQ ID NO:2 or the fragment, and
wherein the polynucleotide encoding the GH61 variant protein
comprises at least one mutation and/or mutation set selected from
those listed in Table 1 and/or Table 2 in any combination, wherein
the nucleotide positions of the substitutions are determined by
alignment with SEQ ID NO:1.
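Nucleotide mutation sets such as "t60c/c573g" can be read as slash-separated substitutions, each giving the original base, the position relative to SEQ ID NO:1, and the replacement base. The following is a minimal sketch, assuming that conventional reading of the notation. The parser, its function names, and the six-base toy sequence are hypothetical and are not taken from the sequence listing.

```python
import re
from typing import List, Tuple

# One substitution: original base, 1-based position, replacement base.
_MUTATION = re.compile(r"^([acgt])(\d+)([acgt])$")


def parse_mutation_set(mutation_set: str) -> List[Tuple[str, int, str]]:
    """Split a set such as 't60c/c573g' into (original, position, replacement) tuples."""
    parsed = []
    for token in mutation_set.strip().lower().split("/"):
        match = _MUTATION.match(token)
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        original, position, replacement = match.groups()
        parsed.append((original, int(position), replacement))
    return parsed


def apply_mutation_set(sequence: str, mutation_set: str) -> str:
    """Apply each substitution after checking the base found at that position."""
    bases = list(sequence.lower())
    for original, position, replacement in parse_mutation_set(mutation_set):
        if position < 1 or position > len(bases):
            raise ValueError(f"position {position} is outside the sequence")
        if bases[position - 1] != original:
            raise ValueError(f"expected {original!r} at position {position}")
        bases[position - 1] = replacement
    return "".join(bases)


print(parse_mutation_set("t60c/c573g"))         # [('t', 60, 'c'), ('c', 573, 'g')]
print(apply_mutation_set("atgtcc", "t2a/c5g"))  # aagtgc
```

Checking the original base before substituting guards against applying a mutation set to a sequence that is numbered differently from the reference.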
[0014] In some embodiments, the present invention provides enzyme
compositions comprising at least one GH61 variant of the present
invention and/or at least one wild-type GH61 protein. In some
embodiments, the present invention provides enzyme compositions
comprising at least one GH61 variant protein of this invention
combined with one or more cellulase enzyme(s), including but not
limited to endoglucanases (EG), beta-glucosidases (BGL),
cellobiohydrolases (e.g., CBH1 and/or CBH2), and/or at least one
wild-type GH61 protein. In some embodiments, the enzyme
compositions further comprise one or more enzymes selected from
cellulases, hemicellulases, xylanases, amylases, glucoamylases,
proteases, esterases, xylosidases, and lipases.
[0015] The invention also includes polynucleotides encoding GH61
variant proteins, recombinant cells expressing such polynucleotides
and optionally one or more cellulase enzymes, and methods for
increasing yield of fermentable sugars in a saccharification
reaction by conducting the reaction in the presence of at least one
GH61 protein of this invention.
[0016] In some embodiments, the present invention provides at least
one polynucleotide comprising at least one nucleic acid sequence
encoding at least one GH61 variant protein; at least one
polynucleotide that hybridizes under stringent hybridization
conditions to at least one polynucleotide encoding at least one
GH61 variant protein; and/or at least one polynucleotide that
hybridizes under stringent hybridization conditions to the
complement of at least one polynucleotide encoding at least one
polypeptide comprising at least one GH61 variant protein.
[0017] The present invention also provides recombinant nucleic acid
constructs comprising at least one polynucleotide sequence encoding
at least one GH61 protein, wherein the polynucleotide is selected
from: (a) a polynucleotide that encodes a polypeptide comprising an
amino acid sequence having at least 60%, at least 65%, at least
70%, at least 75%, at least 80%, at least 85%, at least 90%, at
least 91%, at least 92%, at least 93%, at least 94%, at least 95%,
at least 96%, at least 97%, at least 98%, at least 99%, or at least
100% identity to SEQ ID NO:2, wherein the amino acid sequence
comprises at least one substitution and/or substitution set
provided herein; (b) a polynucleotide that hybridizes under
stringent hybridization conditions to at least a fragment of a
polynucleotide that encodes a polypeptide having the amino acid
sequence of SEQ ID NO:2, and wherein the amino acid sequence
comprises at least one substitution and/or at least one
substitution set provided herein; and/or (c) a polynucleotide that
hybridizes under stringent hybridization conditions to the
complement of at least a fragment of a polynucleotide that encodes
a polypeptide having the amino acid sequence of SEQ ID NO:2, and
wherein the amino acid sequence comprises at least one substitution
and/or at least one substitution set provided herein.
[0018] The present invention further provides recombinant nucleic
acid constructs comprising at least one polynucleotide sequence
encoding at least one GH61 protein, wherein the polynucleotide is
selected from: (a) a polynucleotide that encodes a polypeptide
comprising an amino acid sequence having at least about 60%, at
least about 65%, at least about 70%, at least about 75%, at least
about 80%, at least about 85%, at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99%, or at least about 100% identity to
SEQ ID NO:2, wherein the amino acid sequence comprises at least one
substitution and/or substitution set provided herein; (b) a
polynucleotide that hybridizes under stringent hybridization
conditions to a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein the amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein; and/or (c) a polynucleotide that
hybridizes under stringent hybridization conditions to the
complement of a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein the amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein. In some embodiments of the
nucleic acid constructs, the polynucleotide sequence is at least
about 70%, at least about 75%, at least about 80%, at least about
85%, at least about 90%, at least about 91%, at least about 92%, at
least about 93%, at least about 94%, at least about 95%, at least
about 96%, at least about 97%, at least about 98%, or at least
about 99% identical to SEQ ID NO:1, and wherein the polynucleotide
sequence comprises at least one mutation and/or at least one
mutation set provided herein. Exemplary are those shown in Table 1
and Table 2, which may be incorporated into the polynucleotide in
any combination.
[0019] In some embodiments, the present invention provides
polynucleotides and nucleic acid constructs comprising
polynucleotides encoding at least one GH61 variant and/or wild-type
protein (e.g., any of SEQ ID NOS:2, 3, 5, 6, 8, 9, 11, 12, 14, 15,
17, 18, 20, 21, 23, 24, 26, 27, 29, 30, 32, 33, 35, 36, 38, 39, 41,
42, 44, 45, 47, 48, 50, 51, 53, 54, 56, 57, 59, 60, 62, 64, 65, 67,
68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 93,
95, 96, 98, 99, 101, 102, 104, 105, 107, 108), operably linked to
promoters. In some embodiments, the promoters are heterologous
promoters. In some embodiments, the present invention provides
expression constructs comprising polynucleotides and/or nucleic
acid constructs that comprise polynucleotides encoding at least one
GH61 variant and/or wild-type protein. In some embodiments, the
expression constructs comprise at least one nucleic acid sequence
operably linked to at least one additional regulatory sequence.
[0020] The present invention also provides recombinant host cells
that express at least one polynucleotide sequence encoding at least
one GH61 variant protein. In some embodiments, the host cell also
expresses at least one polynucleotide sequence encoding at least
one GH61 wild-type protein. In some embodiments, the expressed GH61
variant and/or wild-type protein is secreted from the host cell. In
some embodiments, the host cell also produces at least one
cellulase enzyme selected from endoglucanases (EG),
beta-glucosidases (BGL), cellobiohydrolases (e.g., CBH1 and/or
CBH2), xylanases, xylosidases, etc. In some embodiments, the host
cell is a yeast, while in some other embodiments, the host cell is
a filamentous fungal cell. In some further embodiments, the
filamentous fungal cell is a Myceliophthora, a Thielavia, a
Trichoderma, or an Aspergillus cell. In some embodiments, the
filamentous fungal cell is Myceliophthora thermophila. In some
additional embodiments, the host cell also produces at least one
additional enzyme (e.g., esterase, protease, amylase, laccase,
etc.).
[0021] In some additional embodiments, the present invention
provides methods for producing at least one end-product from at
least one cellulosic substrate. The substrate is contacted with at
least one GH61 variant protein of the invention, and one or more
cellulase enzymes. The fermentable sugars that are produced as a
result are contacted with a microorganism in a fermentation to
produce an end-product (e.g., an alcohol such as ethanol). The
fermentation may be simultaneous with the saccharification, or may
occur subsequently. It is not intended that the fermentation
end-product be limited to any specific composition, as various
end-products may be obtained from the fermentation reaction,
including but not limited to alcohols.
[0022] The present invention also provides methods for producing
fermentable sugars from cellulosic substrates, comprising
contacting the cellulosic substrate with at least one enzyme
composition provided herein, under culture conditions whereby
fermentable sugars are produced. In some embodiments the enzyme
composition comprises a plurality of enzymes selected from at least
one GH61 variant, at least one wild-type GH61, at least one
endoglucanase (EG), at least one beta-glucosidase (BGL), at least
one cellobiohydrolase (e.g., CBH1 and/or CBH2), at least one
xylanase, at least one xylosidase, and/or at least one esterase. In
some embodiments, the CBH1 is CBH1a. In further embodiments, the
CBH2 is CBH2b. In some embodiments, the methods further comprise
the step of pretreating the cellulosic substrate prior to the
contacting step. In some embodiments, the enzyme composition is
added concurrently with the pretreating step.
[0023] In some embodiments, the cellulosic substrate comprises
wheat grass, wheat straw, barley straw, sorghum, rice grass,
sugarcane, sugarcane straw, bagasse, switchgrass, corn stover, corn
fiber, grains, or a combination thereof. In further embodiments,
the fermentable sugars comprise glucose and/or xylose. In some
embodiments, the methods further comprise the step of recovering
the fermentable sugars. In some embodiments, the methods further
comprise the step of contacting the fermentable sugars with a
microorganism under conditions such that the microorganism produces
at least one fermentation end product. In further embodiments, the
fermentation end product is selected from alcohols, fatty alcohols,
fatty acids, lactic acid, acetic acid, 3-hydroxypropionic acid,
acrylic acid, succinic acid, citric acid, malic acid, fumaric acid,
amino acids, 1,3-propanediol, ethylene, glycerol, butadiene, and/or
beta-lactams. In some still further embodiments, the fermentation
end product is an alcohol selected from ethanol and butanol. In
some embodiments, the alcohol is ethanol. It is not intended that
the fermentation end-product be limited to any specific
composition(s), as various end-products can be produced using the
present invention.
[0024] The present invention also provides methods for producing an
end product from a cellulosic substrate, comprising: contacting the
cellulosic substrate with any enzyme composition provided
herein, under conditions whereby fermentable sugars are produced
from the substrate; and contacting the fermentable sugars with a
microorganism in a fermentation to produce an end-product. In some
embodiments, the methods comprise simultaneous saccharification and
fermentation reactions (SSF). In some alternative embodiments, the
methods comprise saccharification of the cellulosic substrate and
fermentation in separate reactions (SHF). In some additional
embodiments, the methods comprise production of at least one enzyme
simultaneously with hydrolysis and/or fermentation (e.g.,
"consolidated bioprocessing").
[0025] The present invention also provides methods for producing a
fermentation end product from a cellulosic substrate, comprising
obtaining fermentable sugars produced according to any method
provided herein, and contacting the fermentable sugars with a
microorganism in a fermentation to produce a fermentation end
product. In some embodiments, the fermentation end product is
selected from alcohols, fatty alcohols, fatty acids, lactic acid,
acetic acid, 3-hydroxypropionic acid, acrylic acid, citric acid,
malic acid, fumaric acid, succinic acid, amino acids,
1,3-propanediol, ethylene, glycerol, butadiene, and/or
beta-lactams. In some embodiments, the fermentation end product is
at least one alcohol selected from ethanol and butanol. In further
embodiments, the alcohol is ethanol. In some still further
embodiments, the microorganism is a yeast. In some embodiments, the
methods further comprise the step of recovering the fermentation
end product. It is not intended that the fermentation end-product
be limited to any specific composition(s), as various end-products
can be produced using the present invention. It is also not
intended that the present invention be limited to any particular
microorganism. It is further not intended that the present
invention be limited to any particular yeast, as any suitable yeast
finds use in the present invention.
[0026] The present invention also provides for use of at least one
GH61 variant protein as provided herein to produce at least one
fermentation end product. The present invention also provides for
use of at least one GH61 variant protein provided herein to produce
at least one fermentation end product selected from alcohols, fatty
alcohols, fatty acids, lactic acid, acetic acid, 3-hydroxypropionic
acid, acrylic acid, citric acid, malic acid, fumaric acid, succinic
acid, amino acids, 1,3-propanediol, ethylene, glycerol, butadiene,
and/or beta-lactams. In some embodiments, the fermentation end
product is an alcohol selected from ethanol and butanol. In some
embodiments, the alcohol is ethanol. It is not intended that the
fermentation end-product be limited to any specific composition(s),
as various end-products can be produced using the present
invention.
[0027] A further embodiment of the invention is a composition
comprising a GH61 protein, one or more cellulase enzymes, a
cellulosic substrate, and an effective concentration of Cu++
and/or gallic acid, as further described and illustrated below. The
GH61 protein may be any GH61 protein disclosed herein, such as a
protein comprising an amino acid sequence at least about 60%, at
least about 65%, at least about 70%, at least about 75%, at least
about 80%, at least about 85%, at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99%, or about 100% identical to SEQ ID
NO:2, or a fragment thereof with GH61 activity. In some
embodiments, the GH61 protein is a variant protein comprising all
or part of SEQ ID NO:2 having GH61 activity, wherein the variant
comprises one or more of the amino acid substitutions provided
herein. In some embodiments, the cellulase enzyme(s) are selected
from endoglucanases (EG), beta-glucosidases (BGL),
cellobiohydrolases (e.g., CBH1 and/or CBH2), xylanases,
xylosidases, etc. In some embodiments, the presence of Cu++,
gallic acid, or both enhances activity of the GH61 protein, thereby
increasing the rate of glucose production or reducing the amount of
GH61 protein needed to supply GH61 activity in a saccharification
reaction.
[0028] In another embodiment, the present invention provides
methods for producing fermentable sugars from cellulosic
substrate(s), in which a composition comprising at least one GH61,
at least one cofactor, at least one additional cellulase enzyme,
and at least one cellulosic substrate is cultured or maintained
under conditions whereby fermentable sugars are produced from the
substrate(s). The fermentable sugars can then be contacted with a
microorganism under conditions such that the microorganism produces
at least one fermentation end product, such as ethanol. A further
embodiment of the invention is use of Cu++ to increase
production of fermentable sugars from a saccharification reaction
where cellulase activity is enhanced in the presence of a protein
or protein variant with GH61 activity.
[0029] The present invention provides GH61 variant proteins comprising
amino acid sequences that are at least about 75%, at least about
80%, at least about 85%, at least about 86%, at least about 87%, at
least about 88%, at least about 89%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% identical to SEQ ID NO:2 or
a fragment of SEQ ID NO:2 having GH61 activity, wherein the amino
acid sequence of the variant protein has one or more amino acid
substitutions with respect to SEQ ID NO:2 or the fragment. In some
embodiments, the GH61 variant proteins comprise an amino acid
sequence that is at least 75%, at least 80%, at least 85%, at least
86%, at least 87%, at least 88%, at least 89%, at least 90%, at
least 91%, at least 92%, at least 93%, at least 94%, at least 95%,
at least 96%, at least 97%, at least 98%, or at least 99% identical
to SEQ ID NO:2 or a fragment of SEQ ID NO:2 having GH61 activity,
wherein the amino acid sequence of the variant protein has one or
more amino acid substitutions with respect to SEQ ID NO:2 or the
fragment. In some embodiments, the GH61 variant proteins are at
least 95% identical to SEQ ID NO:2 or a fragment of SEQ ID NO:2
having GH61 activity. In some embodiments, the GH61 variant
proteins have increased thermoactivity, thermostability, and/or
activity, as compared to the GH61 wild-type protein of SEQ ID NO:2.
In some further embodiments, the GH61 variant proteins comprise at
least one substitution(s) at one or more of the following amino
acid positions: 20, 35, 42, 44, 45, 68, 87, 97, 103, 104, 127, 131,
132, 133, 134, 137, 139, 142, 143, 162, 163, 164, 165, 166, 167,
168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180,
181, 190, 191, 192, 205, 212, 215, 218, 232, 236, 239, 244,
246, 258, 270, 273, 317, 322, 323, 328, 330, and/or 341, wherein
the amino acid positions are numbered with reference to SEQ ID
NO:2. In some embodiments, the GH61 variant proteins comprise at
least one substitution(s) at one or more of the following amino
acid positions: H20, N35, W42, Q44, P45, F68, T87, V97, P103, E104,
S127, W131, F132, K133, I134, A137, Y139, A142, A143, I162, P163,
S164, D165, L166, K167, A168, G169, N170, Y171, V172, L173, R174,
H175, E176, I177, I178, A179, L180, H181, Q190, A191, Y192,
S205, A212, S215, K218, S232, T236, G239, A244, A246, T258, G270,
P273, N317, P322, T323, G328, S330, and/or C341, wherein the amino
acid positions are numbered with reference to SEQ ID NO:2. In some
further embodiments, the GH61 variant proteins comprise at least
one substitution(s) at one or more of the following amino acid
positions: H20, N35, W42, E104, I134, S164, K167, A168, V172, I177,
A179, and/or A191, wherein the amino acid positions are numbered
with reference to SEQ ID NO:2. In some additional embodiments, the
GH61 variant proteins comprise at least two amino acid
substitutions. In still some further embodiments, the GH61 variant
proteins comprise at least one substitution set selected from:
N35/E104/A168; W42/E104/K167; N35/W42/V97/A191; W42/E104;
E104/K167; W42/A191; N35/W42/A191; V97/A191; and N35/E104/A191,
wherein the amino acid positions are numbered with reference to SEQ
ID NO:2. In some embodiments, the GH61 variant proteins comprise at
least one amino acid substitution comprising one or more of the
following substitutions numbered with reference to SEQ ID NO:2:
H20C/D, N35G, W42P, Q44V, P45T, F68Y, T87P, V97Q, P103E/H,
E104C/D/H/Q, S127T, W131X, F132X, K133X, I134X, A137P, Y139L, A142W,
A143P, I162X, P163X, S164X, D165X, L166X, K167A/X, A168P/X, G169X,
N170X, Y171A/R, V172X, L173X, R174X, H175X, E176X, I177X, I178X,
A179X, L180M/W, H181X, Q190E/H, A191N/T, Y192H, Y192Q, S205N,
A212P, S215W, K218T, S232A, T236P, G239D, A244D, A246T, T258I,
G270S, P273S, N317K, P322L, T323P, G328A, S330R, and/or C341R,
wherein the amino acid positions are numbered with reference to SEQ
ID NO:2. In some additional embodiments, the GH61 variant proteins
comprise one or more of the following substitutions: N35G, W42P,
V97Q, E104H, K167A, A168P, and/or A191N, wherein the amino acid
positions are numbered with reference to SEQ ID NO:2. In some
embodiments, the GH61 variant proteins comprise one or more of the
following substitution sets: N35G/E104H/A168P; W42P/E104H/K167A;
N35G/W42P/V97Q/A191N; W42P/E104H; E104H/K167A; W42P/A191N;
N35G/W42P/A191N; V97Q/A191N; and/or N35G/E104H/A191N, wherein the
amino acid positions are numbered with reference to SEQ ID NO:2. In
some additional embodiments, the GH61 variant proteins comprise the
substitutions N35G/E104H/A168P, wherein the amino acid positions
are numbered with reference to SEQ ID NO:2. In some further
embodiments, the GH61 variant proteins comprise the sequence set
forth in any of SEQ ID NOS:4, 6, and/or 8. In some additional
further embodiments, the GH61 variant proteins are encoded by at
least one polynucleotide sequence set forth in SEQ ID NOS:3, 5,
and/or 7. In some embodiments, the GH61 variant proteins comprise
at least one substitution(s) at one or more of the following amino
acid positions: 24, 28, 32, 34, 35, 40, 44, 45, 46, 49, 51, 54, 55,
56, 58, 64, 66, 67, 69, 70, 71, 78, 80, 82, 83, 88, 93, 95, 101,
104, 116, 118, 128, 130, 136, 137, 141, 142, 144, 145, 150, 155,
161, 164, 168, 184, 187, 199, 203, 205, 212, 218, 219, 230, 231,
232, 233, 234, 236, 237, 245, 253, 263, 266, 267, 268, 269, 270,
271, 280, 281, 282, 290, 295, 297, 303, 305, 310, 317, 320, 324,
326, 327, 329, 330, 332, 333, 336, 337, and/or 339, wherein the
amino acid positions are numbered with reference to SEQ ID NO:2. In
some further embodiments, the GH61 variant proteins comprise at
least one substitution(s) at one or more of the following amino
acid positions: S24, V28, Y32, R34, N35, T40, Q44, P45, N46, T49,
I51, T54, A55, A56, Q58, E64, N66, S67, G69, T70, P71, S78, T80,
G82, G83, V88, K93, N95, E101, E104, A116, N118, S128, R130, G136,
A137, K141, A142, G144, R145, A150, G155, Q161, S164, A168, Q184,
N187, R199, G203, S205, A212, K218, A219, V230, S231, S232, P233,
D234, T236, V237, G245, S253, A263, P266, G267, G268, G269, G270,
A271, A280, T281, S282, R290, S295, A297, P303, G305, K310, N317,
T320, V324, A326, P327, S329, S330, S332, V333, E336, W337, and/or
S339, wherein the amino acid positions are numbered with reference
to SEQ ID NO:2. In some further embodiments, the GH61 variant
proteins comprise a plurality of amino acid substitutions as set
forth herein. In some embodiments, the GH61 variant proteins
comprise at least one substitution set selected from:
N35/T40/E104/A168/P327; N35/P45/E104/A168/N317; N35/E104/A168/N317;
N35/E104/A168/N317/S329; N35/E104/A137/A168/S232;
N35/E104/A168/N317/T320; N35/E104/A168/D234;
N35/T40/E104/A142/A168; N35/E104/R145/A168;
N35/T40/S78/V88/E104/S128/A168/D234; N35/E104/A168/S330;
N35/E104/A168/G203/P266; N35/E104/A168/D234; N35/E104/A168/S330;
N35/E104/A168/W337; R34/N35/E104/R145/A168; Y32/N35/E64/E104/A168;
V28/N35/P45/E104/A168; N35/E104/G144/A168/V333; N35/N66/E104/A168;
N35/E104/A168/P327; N35/E104/A168/G203; N35/E104/A168/S339;
N35/P45/N46/E104/A150/A168; N35/E104/A168/S231;
N35/T40/E104/A168/D234/P327; N35/E104/A168/S231;
N35/E104/A168/N317; N35/E104/A168/S330; N35/E104/A168/S329;
N35/E104/A168/P327; N35/P45/E104/A168; N35/E104/A116/A168;
N35/T40/E104/A168/V230/P327; N35/E104/A168/S332; N35/E104/A168/G203;
N35/E104/R145/A168/S329; N35/T40/T49/E104/A168/D234/P327;
N35/A56/E104/A168; N35/E104/Q161/A168; N35/E104/A168/S332;
N35/P45/T49/E104/A168/N317/T320; N35/E104/A168/V237;
N35/E104/A168/E336; N35/E104/A168/P233; N35/E104/R130/A168;
N35/E104/A168/P327; N35/E104/A168/N317; N35/Q44/E104/A168;
N35/E104/A168/A326; N35/E104/A168/N317; N35/T40/E104/S128/A168;
N35/T80/E104/A168/P303; N35/E104/A116/A168;
N35/E104/A168/S231/S295; N35/T40/E101/E104/A168/P327;
N35/P45/E104/A168/A219/S232; N35/N46/E104/A168; N35/E104/A168/A326;
N35/E104/A168/G203/T281; N35/E104/A168/E336;
N35/T40/E104/S128/A142/A168; N35/E104/N118/A168;
N35/E104/G155/A168; S24/N35/E104/A168/V237/P303;
N35/E104/Q161/A168; N35/Q44/S67/E104/A168; V28/N35/E104/A168;
N35/E104/A168/Q184; N35/T54/E104/A168; N35/N66/E104/A168;
N35/E64/E104/A168; N35/E104/S164/A168/A271; N35/N66/E104/A168;
N35/G83/E104/A168; N35/E104/K141/A168; N35/E104/A168/N317/T320;
N35/E104/R130/A168; N35/E104/R145/A168; N35/T70/E104/A168;
N35/E104/R130/A168; N35/E104/A168/Q184; N35/E104/A168/S329;
N35/T49/E104/A168; Y32/N35/E104/A168; N35/E104/A168/S330;
N35/Q58/E104/A168; Y32/N35/P71/E104/A168; N35/E104/A168/S330;
N35/T80/E104/A168; N35/G82/E104/A168; N35/E104/A168/S295;
N35/N66/E104/A168; N35/T54/E104/A168; N35/P45/E104/A168;
N35/E104/S128/A168; N35/N66/N95/E104/S164/A168/G267;
N35/T54/E104/A168; N35/P45/E104/K141/A168; N35/E104/A168/S332;
N35/E104/A168/A297; N35/E104/K141/R145/A168;
N35/Q44/E104/A168/S231; N35/T40/T49/S78/E104/A142/A168;
N35/E104/S164/A168/S295; N35/E104/A168/N317; N35/P45/E104/A168;
N35/G82/E104/A168; N35/N46/E104/A168/G203/A263; N35/Q58/E104/A168;
N35/G69/E104/A168; N35/S67/E104/A168; N35/E104/A168/R199;
N35/E104/A168/G203/G268/G269/G270; N35/E104/A168/V324;
N35/E104/A168/P266; N35/E104/A168/G245; N35/N66/E104/A168;
S24/N35/Q44/T80/E104/A168; N35/E104/A168/T236; N35/E104/A168/K310;
N35/E104/R130/A168; N35/N66/S78/E104/A168/S253;
N35/N66/E104/S164/A168/S282; N35/E104/A142/A168;
N35/E104/R145/A168; N35/E104/A168/S231; N35/E104/A168/Q184;
N35/E104/A168/K218; N35/E104/A168/P233; N35/T49/E104/A168/Q184;
N35/T40/E104/A168/P327; N35/T54/E104/A168;
N35/N66/E104/S164/A168/S231/S253; N35/E104/A168/G203;
N35/T49/E104/A168; N35/E104/A168/P266/G267; N35/Q44/N66/E104/A168;
N35/S67/E104/A168; N35/E104/A137/A168; N35/T49/E104/S128/A168;
N35/T49/E104/A168/K218/N317; N35/I51/E104/A168; N35/E104/A168/A326;
N35/P45/E104/A168/T320; N35/N66/E104/A168; N35/E104/A168/V237/P303;
N35/P45/E104/A168/K218/N317; N35/T80/E104/A168; N35/A55/E104/A168;
N35/E104/K141/A168/P266; N35/E104/A168/S330;
N35/N66/E104/A168/R290; N35/E104/N118/A168; N35/E104/A168/A212;
N35/K93/E104/R130/A168; N35/E104/A168/G267;
N35/P45/T49/E104/A168/N317; N35/E104/A168/V230; N35/E104/A168/S329;
N35/P45/E104/A168/A219; N35/S78/E104/S164/A168; N35/E104/A168/S205;
N35/E104/A168/Q184; V28/N35/N46/Q58/E104/A168; N35/E104/A142/A168;
N35/E104/A168/E336; N35/E104/A168/A280; N35/E104/A168/A219;
N35/E104/A168/P303/G305; R34/N35/E104/A168/A280;
N35/E104/A168/N187; N35/E104/G136/A168; N35/E104/A168/Q184;
N35/T49/E104/A168/N317; N35/T40/T49/S78/E104/A168;
R34/N35/K93/E104/R130/R145/A168/R199/K218/A280;
N35/T40/E104/A142/A168; and N35/N66/E104/A168, wherein the amino
acid positions are numbered with reference to SEQ ID NO:2. In some
further embodiments, the GH61 variant proteins comprise at least
one amino acid substitution comprising one or more of the following
substitutions numbered with reference to SEQ ID NO:2: S24Q; V28H;
Y32S; R34E; N35G; T40A/G/L/S; Q44K; P45D/E/K/R/S; N46E/R;
T49A/Q/R/Y; I51A; T54G/M/S/W; A55G; A56S; Q58H/P; E64L/S;
N66A/D/G/L/M/Q/R/V; S67G/H/T; G69T; T70A; P71A; S78C/D; T80H/L/V;
G82A/S; G83R; V88I; K93N/T; N95E; E101T; E104H; A116Q/S; N118E/S;
S128K/L/N; R130E/G/H/K/Y; G136H; A137M/S; K141A/N/P/R; A142D/G/L;
G144S; R145H/L/N/Q/T; A150Y; G155N; Q161E/R; S164E; A168P;
Q184E/H/L/N/R; N187D; R199E; G203E/V/Y; S205T; A212M; K218L/T;
A219R/T; V230I/Q; S231A/H/K/I; S232E; P233F/T; D234E/M/N; T236E;
V237I; G245A; S253D/T; A263V; P266S; G267D/V; G268A; G269A; G270A;
A271T; A280D/T; T281A; S282D; R290K; S295D/L/T; A297T; P303T;
G305D; K310I; N317D/H/I/M/Q/R; T320A; V324M; A326C/Q/V;
P327F/K/L/M; S329H/I/Q/T/Y; S330A/H/I/T/V; S332C/F/R; V333Q;
E336L/R/S; W337R; and/or S339W, wherein the amino acid positions
are numbered with reference to SEQ ID NO:2. In some embodiments,
the GH61 variant proteins comprise a plurality of substitutions
and/or substitution sets as provided herein. In some additional
embodiments, the GH61 variant proteins comprise one or more of the
following substitution sets: N35G/T40A/E104H/A168P/P327M;
N35G/P45D/E104H/A168P/N317R; N35G/E104H/A168P/N317R;
N35G/E104H/A168P/N317D/S329Y; N35G/E104H/A137S/A168P/S232E;
N35G/E104H/A168P/N317R/T320A; N35G/E104H/A168P/D234E;
N35G/T40S/E104H/A142G/A168P; N35G/E104H/R145L/A168P;
N35G/T40S/S78C/V88I/E104H/S128K/A168P/D234M;
N35G/E104H/A168P/S330V; N35G/E104H/A168P/G203E/P266S;
N35G/E104H/A168P/D234N; N35G/E104H/A168P/S330H;
N35G/E104H/A168P/W337R; R34E/N35G/E104H/R145T/A168P;
Y32S/N35G/E64S/E104H/A168P; V28H/N35G/P45K/E104H/A168P;
N35G/E104H/G144S/A168P/V333Q; N35G/N66Q/E104H/A168P;
N35G/E104H/A168P/P327K; N35G/E104H/A168P/G203E;
N35G/E104H/A168P/S339W; N35G/P45K/N46E/E104H/A150Y/A168P;
N35G/E104H/A168P/S231K; N35G/T40A/E104H/A168P/D234E/P327M;
N35G/E104H/A168P/S231H; N35G/E104H/A168P/N317M;
N35G/E104H/A168P/S330Y; N35G/E104H/A168P/S329I;
N35G/E104H/A168P/P327F; N35G/P45D/E104H/A168P;
N35G/E104H/A116S/A168P; N35G/T40A/E104H/A168P/V230I/P327M;
N35G/E104H/A168P/S332R; N35G/E104H/A168P/G203V;
N35G/E104H/R145N/A168P/S329H; N35G/T40S/T49R/E104H/A168P/D234E/P327M;
N35G/A56S/E104H/A168P; N35G/E104H/Q161R/A168P;
N35G/E104H/A168P/S332F; N35G/P45R/T49A/E104H/A168P/N317R/T320A;
N35G/E104H/A168P/V237I; N35G/E104H/A168P/E336S;
N35G/E104H/A168P/P233T; N35G/E104H/R130H/A168P;
N35G/E104H/A168P/P327L; N35G/E104H/A168P/N317I;
N35G/Q44K/E104H/A168P; N35G/E104H/A168P/A326V;
N35G/E104H/A168P/N317H; N35G/T40L/E104H/S128K/A168P;
N35G/T80V/E104H/A168P/P303T; N35G/E104H/A116Q/A168P;
N35G/E104H/A168P/S231A/S295L; N35G/T40S/E101T/E104H/A168P/P327M;
N35G/P45K/E104H/A168P/A219R/S232E; N35G/N46R/E104H/A168P;
N35G/E104H/A168P/A326Q; N35G/E104H/A168P/G203E/T281A;
N35G/E104H/A168P/E336R; N35G/T40S/E104H/S128K/A142G/A168P;
N35G/E104H/N118S/A168P; N35G/E104H/G155N/A168P;
S24Q/N35G/E104H/A168P/V237I/P303T; N35G/E104H/Q161E/A168P;
N35G/Q44K/S67T/E104H/A168P; V28H/N35G/E104H/A168P;
N35G/E104H/A168P/Q184L; N35G/T54G/E104H/A168P;
N35G/N66M/E104H/A168P; N35G/E64L/E104H/A168P;
N35G/E104H/S164E/A168P/A271T; N35G/N66A/E104H/A168P;
N35G/G83R/E104H/A168P; N35G/E104H/K141A/A168P;
N35G/E104H/A168P/N317Q/T320A; N35G/E104H/R130G/A168P;
N35G/E104H/R145Q/A168P; N35G/T70A/E104H/A168P;
N35G/E104H/R130K/A168P; N35G/E104H/A168P/Q184E;
N35G/E104H/A168P/S329T; N35G/T49A/E104H/A168P;
Y32S/N35G/E104H/A168P; N35G/E104H/A168P/S330I;
N35G/Q58H/E104H/A168P; Y32S/N35G/P71A/E104H/A168P;
N35G/E104H/A168P/S330T; N35G/T80V/E104H/A168P;
N35G/G82A/E104H/A168P; N35G/E104H/A168P/S295T;
N35G/N66G/E104H/A168P; N35G/T54S/E104H/A168P;
N35G/P45S/E104H/A168P; N35G/E104H/S128L/A168P;
N35G/N66D/N95E/E104H/S164E/A168P/G267D; N35G/T54W/E104H/A168P;
N35G/P45E/E104H/K141R/A168P; N35G/E104H/A168P/S332C;
N35G/E104H/A168P/A297T; N35G/E104H/K141P/R145Q/A168P;
N35G/Q44K/E104H/A168P/S231T; N35G/T40G/T49R/S78C/E104H/A142G/A168P;
N35G/E104H/S164E/A168P/S295D; N35G/E104H/A168P/N317Q;
N35G/P45R/E104H/A168P; N35G/G82S/E104H/A168P;
N35G/N46R/E104H/A168P/G203E/A263V; N35G/Q58P/E104H/A168P;
N35G/G69T/E104H/A168P; N35G/S67G/E104H/A168P;
N35G/E104H/A168P/R199E; N35G/E104H/A168P/G203E/G268A/G269A/G270A;
N35G/E104H/A168P/V324M; N35G/E104H/A168P/P266S;
N35G/E104H/A168P/G245A; N35G/N66R/E104H/A168P;
S24Q/N35G/Q44K/T80H/E104H/A168P; N35G/E104H/A168P/T236E;
N35G/E104H/A168P/K310I; N35G/E104H/R130Y/A168P;
N35G/N66D/S78D/E104H/A168P/S253D;
N35G/N66D/E104H/S164E/A168P/S282D; N35G/E104H/A142L/A168P;
N35G/E104H/R145H/A168P; N35G/E104H/A168P/S231T;
N35G/E104H/A168P/Q184R; N35G/E104H/A168P/K218L;
N35G/E104H/A168P/P233F; N35G/T49A/E104H/A168P/Q184H;
N35G/T40S/E104H/A168P/P327M; N35G/T54M/E104H/A168P;
N35G/N66D/E104H/S164E/A168P/S231T/S253T; N35G/E104H/A168P/G203Y;
N35G/T49Q/E104H/A168P; N35G/E104H/A168P/P266S/G267V;
N35G/Q44K/N66V/E104H/A168P; N35G/S67H/E104H/A168P;
N35G/E104H/A137M/A168P; N35G/T49A/E104H/S128N/A168P;
N35G/T49R/E104H/A168P/K218L/N317Q; N35G/I51A/E104H/A168P;
N35G/E104H/A168P/A326C; N35G/P45R/E104H/A168P/T320A;
N35G/N66L/E104H/A168P; N35G/E104H/A168P/V237I/P303T;
N35G/P45R/E104H/A168P/K218L/N317Q; N35G/T80L/E104H/A168P;
N35G/A55G/E104H/A168P; N35G/E104H/K141N/A168P/P266S;
N35G/E104H/A168P/S330A; N35G/N66D/E104H/A168P/R290K;
N35G/E104H/N118E/A168P; N35G/E104H/A168P/A212M;
N35G/K93N/E104H/R130Y/A168P; N35G/E104H/A168P/G267D;
N35G/P45R/T49Y/E104H/A168P/N317D; N35G/E104H/A168P/V230Q;
N35G/E104H/A168P/S329Q; N35G/P45K/E104H/A168P/A219R;
N35G/S78D/E104H/S164E/A168P; N35G/E104H/A168P/S205T;
N35G/E104H/A168P/Q184H;
V28H/N35G/N46E/Q58H/E104H/A168P; N35G/E104H/A142D/A168P;
N35G/E104H/A168P/E336L; N35G/E104H/A168P/A280T;
N35G/E104H/A168P/A219T; N35G/E104H/A168P/P303T/G305D;
R34E/N35G/E104H/A168P/A280T; N35G/E104H/A168P/N187D;
N35G/E104H/G136H/A168P; N35G/E104H/A168P/Q184N;
N35G/T49Y/E104H/A168P/N317R; N35G/T40A/T49Q/S78C/E104H/A168P;
R34E/N35G/K93T/E104H/R130E/R145T/A168P/R199E/K218T/A280D;
N35G/T40L/E104H/A142G/A168P; and/or N35G/N66G/E104H/A168P, wherein
the amino acid positions are numbered with reference to SEQ ID
NO:2. In some further embodiments, the GH61 variant proteins
comprise a plurality of substitutions as provided herein. In some
additional embodiments, the GH61 variant proteins comprise
polypeptide sequences that are at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99%, or 100% identical to any of SEQ ID
NOS:2, 3, 5, 6, 8, and/or 9, and/or a biologically active fragment
of any of SEQ ID NOS: 2, 3, 5, 6, 8, and/or 9, wherein the fragment
has GH61 activity. In still some additional embodiments, the GH61
variant proteins comprise polypeptide sequences that are at least
90%, at least 91%, at least 92%, at least 93%, at least 94%, at
least 95%, at least 96%, at least 97%, at least 98%, at least 99%,
or 100% identical to any of SEQ ID NOS:2, 3, 5, 6, 8, and/or 9,
and/or a biologically active fragment of any of SEQ ID NOS: 2, 3,
5, 6, 8, and/or 9, wherein the fragment has GH61 activity.
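The substitution notation used above can be read mechanically. The following Python sketch, offered only as an illustration, expands a shorthand entry such as E104C/D/H/Q into individual substitutions, parses a substitution set such as N35G/E104H/A168P, and applies the set to a reference sequence after checking the wild-type residue at each position. All function names are hypothetical, and the 200-residue reference is a made-up placeholder rather than SEQ ID NO:2.

```python
import re


def expand_shorthand(token):
    """Expand 'E104C/D/H/Q' into ['E104C', 'E104D', 'E104H', 'E104Q']."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)", token.strip())
    if m is None:
        raise ValueError(f"unrecognized shorthand: {token!r}")
    wild_type, position, alternatives = m.groups()
    return [f"{wild_type}{position}{alt}" for alt in alternatives.split("/")]


def parse_substitution_set(text):
    """Parse 'N35G/E104H/A168P' into [('N', 35, 'G'), ('E', 104, 'H'), ...]."""
    subs = []
    for token in text.strip().split("/"):
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if m is None:
            raise ValueError(f"not a single substitution: {token!r}")
        subs.append((m.group(1), int(m.group(2)), m.group(3)))
    return subs


def apply_substitutions(reference, subs):
    """Return a variant sequence; positions are 1-based on the reference."""
    residues = list(reference)
    for wild_type, position, new in subs:
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Placeholder reference (NOT SEQ ID NO:2): serine filler with N35, E104, A168.
toy = ["S"] * 200
toy[35 - 1] = "N"
toy[104 - 1] = "E"
toy[168 - 1] = "A"
toy_reference = "".join(toy)

variant = apply_substitutions(
    toy_reference, parse_substitution_set("N35G/E104H/A168P")
)
```

The wild-type check is a deliberate design choice: it catches numbering offsets, for example when a signal peptide has been removed from the reference, before a substitution is silently applied at the wrong residue.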
[0030] The present invention also provides GH61 variant proteins
comprising amino acid sequences that are at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% identical to any of SEQ ID
NOS:2, 3, 5, 6, 8, and/or 9, or a fragment of SEQ ID NOS:2, 3, 5,
6, 8, and/or 9 having GH61 activity, wherein the amino acid
sequence of the variant protein has one or more amino acid
substitutions with respect to SEQ ID NOS:2, 3, 5, 6, 8, and/or 9 or
the fragment, and wherein the substitution(s) in the amino acid
sequences result in the variant proteins having increased GH61
activity in a reaction where crystalline cellulose undergoes
saccharification by cellulase enzymes that are contained in culture
broth from M. thermophila cells, compared with a reference protein
comprising SEQ ID NO:2, 3, 5, 6, 8, and/or 9 or the fragment,
without any substitutions. In some embodiments, the GH61 variant
proteins comprise amino acid sequences that are at least 75%, at
least 80%, at least 85%, at least 90%, at least 91%, at least 92%,
at least 93%, at least 94%, at least 95%, at least 96%, at least
97%, at least 98%, or at least 99% identical to any of SEQ ID
NOS:2, 3, 5, 6, 8, and/or 9, or a fragment of SEQ ID NOS:2, 3, 5,
6, 8, and/or 9 having GH61 activity, wherein the amino acid
sequence of the variant protein has one or more amino acid
substitutions with respect to SEQ ID NOS:2, 3, 5, 6, 8, and/or 9 or
the fragment, and wherein the substitution(s) in the amino acid
sequences result in the variant proteins having increased GH61
activity in a reaction where crystalline cellulose undergoes
saccharification by cellulase enzymes that are contained in culture
broth from M. thermophila cells, compared with a reference protein
comprising SEQ ID NO:2, 3, 5, 6, 8, and/or 9 or the fragment,
without any substitutions. In some further embodiments, the present
invention provides GH61 variant proteins encoded by
polynucleotides, wherein the proteins comprise amino acid sequences
that are at least about 75%, at least about 80%, at least about
85%, at least about 90%, at least about 91%, at least about 92%, at
least about 93%, at least about 94%, at least about 95%, at least
about 96%, at least about 97%, at least about 98%, or at least
about 99% identical to any of SEQ ID NOS:2, 3, 5, 6, 8, and/or 9 or
a fragment of SEQ ID NO:2, 3, 5, 6, 8, and/or 9 having GH61
activity, wherein the amino acid sequence of the variant protein
has one or more amino acid substitutions with respect to SEQ ID
NO:2, 3, 5, 6, 8, and/or 9 or the fragment, and wherein the
polynucleotide encoding the GH61 variant protein comprises at least
one mutation and/or mutation set selected from t60c/c573g,
t60c/c573g/g1026a, c573g, t60c/c291a/c573g, t60c/c291a, t60c/c876t,
a312g, t60c, t379a/c380g/g381c, c300t,
t204c/t379a/c380g/g381c/c385t, g1026a, c246t, c597g, c72t,
c732g/c843t/c882t, c909t, c912g, g921a, c792t, g972t, g921a,
t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t, c520a/c522g,
t60c/c573g; t60c/c288t/c573g; t60c/c198t/c573g; and/or
t60c/g399a/c573g; wherein the nucleotide positions are numbered
with reference to SEQ ID NO:1. In still some further embodiments,
the present invention provides GH61 variant proteins encoded by
polynucleotides, wherein the proteins comprise amino acid sequences
that are at least 75%, at least 80%, at least 85%, at least 90%, at
least 91%, at least 92%, at least 93%, at least 94%, at least 95%,
at least 96%, at least 97%, at least 98%, or at least 99% identical
to any of SEQ ID NOS:2, 3, 5, 6, 8, and/or 9 or a fragment of SEQ
ID NO:2, 3, 5, 6, 8, and/or 9 having GH61 activity, wherein the
amino acid sequence of the variant protein has one or more amino
acid substitutions with respect to SEQ ID NO:2, 3, 5, 6, 8, and/or
9 or the fragment, and wherein the polynucleotide encoding the GH61
variant protein comprises at least one mutation and/or mutation set
selected from t60c/c573g, t60c/c573g/g1026a, c573g,
t60c/c291a/c573g, t60c/c291a, t60c/c876t, a312g, t60c,
t379a/c380g/g381c, c300t, t204c/t379a/c380g/g381c/c385t, g1026a,
c246t, c597g, c72t, c732g/c843t/c882t, c909t, c912g, g921a, c792t,
g972t, g921a, t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t,
c520a/c522g, t60c/c573g; t60c/c288t/c573g; t60c/c198t/c573g; and/or
t60c/g399a/c573g; wherein the nucleotide positions are numbered
with reference to SEQ ID NO:1.
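The nucleotide mutation sets above use the form "original base, position, new base" (for example, t60c). As a hedged illustration, the Python sketch below parses such a set and reports which codons the mutated positions fall in. It assumes that nucleotide 1 of the reference is the first base of the first codon, which is not confirmed here for SEQ ID NO:1, and the function names are hypothetical.

```python
import re


def parse_nt_mutation_set(text):
    """Parse 't379a/c380g/g381c' into [('t', 379, 'a'), ('c', 380, 'g'), ...]."""
    mutations = []
    for token in text.strip().split("/"):
        m = re.fullmatch(r"([acgt])(\d+)([acgt])", token)
        if m is None:
            raise ValueError(f"unrecognized nucleotide mutation: {token!r}")
        mutations.append((m.group(1), int(m.group(2)), m.group(3)))
    return mutations


def codon_number(nt_position):
    """1-based codon index containing a 1-based nucleotide position.

    Assumes the coding sequence starts at nucleotide 1 (illustrative only).
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


def codons_touched(text):
    """Sorted, de-duplicated codon numbers affected by a mutation set."""
    return sorted(
        {codon_number(position) for _, position, _ in parse_nt_mutation_set(text)}
    )


single_codon = codons_touched("t379a/c380g/g381c")
two_codons = codons_touched("t60c/c573g")
```

Under the stated assumption, the three changes in t379a/c380g/g381c all fall within codon 127, while t60c and c573g fall in codons 20 and 191. Whether a given change alters the encoded amino acid depends on the actual reference sequence, which this sketch does not inspect.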
[0031] The present invention also provides polynucleotides
comprising nucleic acid sequences encoding the GH61 variant
proteins provided herein, as well as polynucleotides that hybridize
under stringent hybridization conditions to at least one
polynucleotide and/or a complement of at least one polynucleotide
encoding GH61 variant proteins provided herein. In some
embodiments, the present invention provides polynucleotide
sequences encoding GH61 variant proteins, wherein the
polynucleotide sequences are at least about 75%, at least about
80%, at least about 85%, at least about 90%, at least about 91%, at
least about 92%, at least about 93%, at least about 94%, at least
about 95%, at least about 96%, at least about 97%, at least about
98%, or at least about 99% identical to any of SEQ ID NOS:1, 4, 7,
and/or 10, or at least one polynucleotide that hybridizes under
stringent hybridization conditions to at least one polynucleotide
and/or complement of any of SEQ ID NOS:1, 4, 7, and/or 10. In some
additional embodiments, the present invention provides
polynucleotide sequences encoding GH61 variant proteins, wherein
the polynucleotide sequences are at least 75%, at least 80%, at
least 85%, at least 90%, at least 91%, at least 92%, at least 93%,
at least 94%, at least 95%, at least 96%, at least 97%, at least
98%, at least 99%, or 100% identical to any of SEQ ID NOS:1, 4, 7,
and/or 10, or at least one polynucleotide that hybridizes under
stringent hybridization conditions to at least one polynucleotide
and/or complement of any of SEQ ID NOS:1, 4, 7, and/or 10.
[0032] The present invention also provides recombinant nucleic acid
constructs comprising at least one polynucleotide sequence encoding
at least one GH61 protein, wherein the polynucleotide is selected
from: (a) a polynucleotide that encodes a polypeptide comprising an
amino acid sequence having at least about 70%, at least about 75%,
at least about 80%, at least about 85%, at least about 90%, at
least about 91%, at least about 92%, at least about 93%, at least
about 94%, at least about 95%, at least about 96%, at least about
97%, at least about 98%, or at least about 99% identity to SEQ ID
NO:2, 3, 5, 6, 8, and/or 9, wherein the amino acid sequence
comprises at least one substitution and/or substitution set
provided herein; (b) a polynucleotide that hybridizes under
stringent hybridization conditions to at least a fragment of a
polynucleotide that encodes a polypeptide having the amino acid
sequence of SEQ ID NO:2, 3, 5, 6, 8, and/or 9, and wherein the
amino acid sequence comprises at least one substitution and/or at
least one substitution set provided herein; and/or (c) a
polynucleotide that hybridizes under stringent hybridization
conditions to the complement of at least a fragment of a
polynucleotide that encodes a polypeptide having the amino acid
sequence of SEQ ID NO:2, 3, 5, 6, 8, and/or 9, and wherein the
amino acid sequence comprises at least one substitution and/or at
least one substitution set provided herein. In some embodiments,
the recombinant nucleic acid constructs comprise at least one
polynucleotide sequence encoding at least one GH61 protein, wherein
the polynucleotide is selected from: (a) a polynucleotide that
encodes a polypeptide comprising an amino acid sequence having at
least 70%, at least 75%, at least 80%, at least 85%, at least 90%,
at least 91%, at least 92%, at least 93%, at least 94%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%
identity to SEQ ID NO:2, wherein the amino acid sequence comprises
at least one substitution and/or substitution set provided herein;
(b) a polynucleotide that hybridizes under stringent hybridization
conditions to a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein the amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein; and/or (c) a polynucleotide that
hybridizes under stringent hybridization conditions to the
complement of a polynucleotide that encodes a polypeptide having
the amino acid sequence of SEQ ID NO:2, and wherein the amino acid
sequence comprises at least one substitution and/or at least one
substitution set provided herein. In some additional embodiments,
the recombinant nucleic acid constructs comprise at least one
polynucleotide sequence at least about 70%, at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% identical to any of SEQ ID
NOS:1, 4, 7, and/or 10, and wherein the polynucleotide sequence
comprises at least one mutation and/or at least one mutation set
provided herein. In some further additional embodiments, the
recombinant nucleic acid constructs comprise polynucleotide
sequences that are at least 70%, at least 75%, at least 80%, at
least 85%, at least 90%, at least 91%, at least 92%, at least 93%,
at least 94%, at least 95%, at least 96%, at least 97%, at least
98%, at least 99%, or 100% identical to any of SEQ ID NOS:1, 4, 7,
and/or 10, and wherein the polynucleotide sequence comprises at
least one mutation and/or at least one mutation set provided
herein. In some embodiments, the polynucleotides and/or nucleic
acid constructs provided herein comprise at least one
polynucleotide sequence comprising at least one mutation or
mutation set selected from t60c/c573g, t60c/c573g/g1026a, c573g,
t60c/c291a/c573g, t60c/c291a, t60c/c876t, a312g, t60c,
t379a/c380g/g381c, c300t, t204c/t379a/c380g/g381c/c385t, g1026a,
c246t, c597g, c72t, c732g/c843t/c882t, c909t, c912g, g921a, c792t,
g972t, g921a, t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t,
c520a/c522g; t60c/c573g; t60c/c288t/c573g; t60c/c198t/c573g; and/or
t60c/g399a/c573g. In some additional embodiments, the
polynucleotide and/or nucleic acid construct comprises at least one
nucleic acid sequence operably linked to a promoter. In some
additional embodiments, the promoter is a heterologous promoter. In
some further embodiments, the nucleic acid constructs further
encode at least one enzyme in addition to the GH61 variant protein.
In some embodiments, the at least one additional enzyme encoded by
the nucleic acid constructs is selected from wild-type GH61 enzymes,
endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases. In some further
embodiments, at least one additional enzyme is selected from
wild-type GH61 enzymes, endoglucanases (EG), beta-glucosidases
(BGL), Type 1 cellobiohydrolases (CBH1), Type 2 cellobiohydrolases
(CBH2), xylanases, and xylosidases.
[0033] The present invention also provides expression constructs
comprising at least one polynucleotide or nucleic acid construct as
provided herein. In some expression construct embodiments, the
nucleic acid construct and/or the polynucleotide is operably linked
to a promoter. In some embodiments, the promoter is heterologous.
In some further embodiments of the expression constructs provided
herein, the nucleic acid sequence is operably linked to at least
one additional regulatory sequence.
[0034] The present invention also provides host cells that express
at least one polynucleotide sequence encoding at least one GH61
variant protein provided herein. In some embodiments, the host
cells produce at least one GH61 variant protein provided herein. In
some additional embodiments, at least one GH61 variant protein is
secreted from the host cells. In some further embodiments, the host
cells further produce at least one enzyme selected from wild-type
GH61 enzymes, endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases. In some
additional embodiments, the host cell further produces at least one
enzyme selected from wild-type GH61 enzymes, endoglucanases (EG),
beta-glucosidases (BGL), Type 1 cellobiohydrolases (CBH1), and Type
2 cellobiohydrolases (CBH2). In some embodiments, the host cell is
a yeast or filamentous fungal cell. In some embodiments, the
filamentous fungal cell is a Myceliophthora, a Chrysosporium, a
Thielavia, a Trichoderma, or an Aspergillus cell. In some further
embodiments, the filamentous fungal cell is Myceliophthora
thermophila. In some additional embodiments, the host cell is a
yeast cell. In some further additional embodiments, the host cell
is Saccharomyces. In some further embodiments, the host cells
further comprise at least one polynucleotide, polynucleotide
construct, and/or expression construct as provided herein.
[0035] The present invention also provides methods of producing at
least one GH61 variant protein comprising culturing the host cell
set forth herein under conditions such that the host cell produces
at least one GH61 variant protein as provided herein. In some
embodiments of the methods, the host cell further produces at least
one additional enzyme selected from wild-type GH61 enzymes,
endoglucanases (EG), beta-glucosidases (BGL), Type 1
cellobiohydrolases (CBH1), Type 2 cellobiohydrolases (CBH2),
cellulases, hemicellulases, xylanases, xylosidases, amylases,
glucoamylases, proteases, esterases, and lipases. In some
embodiments of the methods, the host cell further produces at least
one EG, at least one BGL, at least one CBH1, at least one CBH2,
and/or at least one wild-type GH61 enzyme. In some further
embodiments of the methods, the conditions comprise culturing at
about pH 5, while in some alternative embodiments of the methods,
the conditions comprise culturing at about pH 6.7. In some
embodiments of the methods, the filamentous fungal cell is a
Myceliophthora, a Chrysosporium, a Thielavia, a Trichoderma, or an
Aspergillus cell. In some further embodiments of the methods, the
filamentous fungal cell is Myceliophthora thermophila. In some
additional embodiments of the methods, the host cell is a yeast
cell. In some further additional embodiments of the methods, the
host cell is Saccharomyces.
[0036] The present invention also provides enzyme compositions
comprising at least one GH61 variant protein as provided herein. In
some embodiments, the enzyme compositions further comprise one or
more enzymes selected from wild-type GH61 enzymes, endoglucanases
(EG), beta-glucosidases (BGL), Type 1 cellobiohydrolases (CBH1),
and/or Type 2 cellobiohydrolases (CBH2), cellulases,
hemicellulases, xylanases, xylosidases, amylases, glucoamylases,
proteases, esterases, and lipases. In some additional embodiments,
the enzyme compositions further comprise at least two additional
enzymes selected from wild-type GH61 enzymes, endoglucanases (EG),
beta-glucosidases (BGL), Type 1 cellobiohydrolases (CBH1), and/or
Type 2 cellobiohydrolases (CBH2), cellulases, hemicellulases,
xylanases, xylosidases, amylases, glucoamylases, proteases,
esterases, and lipases. In some embodiments, the enzyme
compositions are produced by the host cells provided herein. In
some additional embodiments, the enzyme compositions further
comprise a microorganism. In some further embodiments, the
microorganism comprises M. thermophila. In some embodiments, the
enzyme compositions further comprise at least one adjunct
composition. In some additional embodiments, the enzyme
compositions comprise at least one adjunct composition selected
from divalent metal cations, reductants, surfactants, buffers,
culture media, and enzyme stabilizing systems. In some further
embodiments, the enzyme compositions comprise an adjunct composition
comprising copper and/or gallic acid. In some additional
embodiments, the enzyme compositions find use in saccharification
reactions.
[0037] The present invention also provides compositions comprising
at least one GH61 protein, one or more cellulase enzymes, a
cellulosic substrate, and Cu.sup.++, wherein the GH61 protein is at
least about 70%, about 75%, about 80%, about 85%, about 86%, about
87%, about 88%, about 89%, about 90%, about 91%, about 92%, about
93%, about 94%, about 95%, about 96%, about 97%, about 98%, about
99%, or about 100% identical to any of SEQ ID NOS:2, 5, 6, 8, 9,
11, and/or 12, and/or a biologically active fragment thereof with GH61
activity. In some embodiments, the present invention provides
compositions comprising at least one GH61 protein, one or more
cellulase enzymes, a cellulosic substrate, and Cu.sup.++, wherein
the GH61 protein is at least 70%, 75%, 80%, 85%, 86%, 87%, 88%,
89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%
identical to SEQ ID NO:2, 5, 6, 8, 9, 11, and/or 12, and/or a
biologically active fragment thereof with GH61 activity. In some
embodiments, the concentration of Cu.sup.++ is at least about 4
.mu.M. In some embodiments, the concentration of Cu.sup.++ is
between about 1 .mu.M and about 100 .mu.M, between about 4 .mu.M
and about 100 .mu.M, or between about 5 .mu.M and about 100
.mu.M.
[0038] The present invention also provides compositions comprising
at least one GH61 protein, one or more cellulase enzymes, a
cellulosic substrate, and gallic acid, wherein the GH61 protein is
at least about 70%, about 80%, about 85%, about 86%, about 87%,
about 88%, about 89%, about 90%, about 91%, about 92%, about 93%,
about 94%, about 95%, about 96%, about 97%, about 98%, about 99%,
or about 100% identical to any of SEQ ID NO:2, 5, 6, 8, 9, 11,
and/or 12, and/or a biologically active fragment thereof with GH61
activity. The present invention also provides compositions
comprising at least one GH61 protein, one or more cellulase
enzymes, a cellulosic substrate, and gallic acid, wherein the GH61
protein is at least 70%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical to any of
SEQ ID NO:2, 5, 6, 8, 9, 11, and/or 12, and/or a biologically active
fragment thereof with GH61 activity. In some embodiments, the
concentration of gallic acid in the compositions is at least about
0.1 mM. In some embodiments, the compositions comprise gallic acid
at a concentration between about 1 mM and about 5 mM. In some
embodiments, the concentration of gallic acid in the composition is
at least 0.1 mM. In some embodiments, the compositions comprise
gallic acid at a concentration between 1 mM and 5 mM. In some
embodiments, the compositions comprise at least one GH61 protein
comprising SEQ ID NO:2, 5, 6, 8, 9, 11, and/or 12, and/or a
biologically active fragment thereof with GH61 activity. In some
embodiments, the compositions comprise at least one GH61 variant
protein as provided herein. In some embodiments, the compositions
comprise at least one cellulase enzyme selected from endoglucanases
(EG), beta-glucosidases (BGL), Type 1 cellobiohydrolases (CBH1),
and/or Type 2 cellobiohydrolases (CBH2). In some embodiments, the
compositions comprise at least one BGL, CBH1, and CBH2. In some
additional embodiments, the compositions further comprise at least
one additional enzyme. In some further embodiments, at least one
additional enzyme is selected from hemicellulases, xylanases,
xylosidases, amylases, glucoamylases, proteases, esterases, and
lipases. In still some further embodiments of the compositions, the
cellulosic substrate is selected from wheat grass, wheat straw,
barley straw, sorghum, rice grass, sugarcane straw, bagasse,
switchgrass, corn stover, corn fiber, grains, or any combination
thereof.
[0039] The present invention also provides methods for producing
fermentable sugars from a cellulosic substrate, comprising
contacting the cellulosic substrate with at least one enzyme
composition as provided herein under conditions whereby fermentable
sugars are produced. In some embodiments, the methods further
comprise pretreating the cellulosic substrate prior to the
contacting. In some additional embodiments of the methods, the
enzyme composition is added concurrently with pretreating. In some
further embodiments of the methods, the cellulosic substrate
comprises wheat grass, wheat straw, barley straw, sorghum, rice
grass, sugarcane, sugarcane straw, bagasse, switchgrass, corn
stover, corn fiber, grains, or any combination thereof. In some
additional embodiments of the methods, the fermentable sugars
comprise glucose and/or xylose. In some embodiments, the methods
further comprise recovering the fermentable sugars. In some
embodiments of the methods, the conditions comprise using
continuous, batch, and/or fed-batch culturing conditions. In some
further embodiments, the method is a batch process, while in some
alternative embodiments, the method is a continuous process, and in
some still further embodiments, the method is a fed-batch process.
In some embodiments, the methods comprise any combination of batch,
continuous, and/or fed-batch processes conducted in any order. In
still some further embodiments, the methods are conducted in a
reaction volume of at least 10,000 liters, while in some other
embodiments, the methods are conducted in a reaction volume of at
least 100,000 liters. In some embodiments, the methods further
comprise use of at least one adjunct composition. In some
embodiments, the adjunct composition is selected from at least one
divalent metal cation, gallic acid, and/or at least one surfactant.
In some embodiments, the adjunct composition comprises copper
and/or gallic acid. In some additional embodiments, the surfactant
is selected from TWEEN.RTM.-20 non-ionic detergent and polyethylene
glycol. In some further embodiments, the methods are conducted at
about pH 5.0, while in some alternative embodiments, the methods
are conducted at about pH 6.0. In some additional embodiments, the
pH is in the range of about 4.5 to about 7. In some embodiments,
the methods further comprise contacting the fermentable sugars with
a microorganism under conditions such that the microorganism
produces at least one fermentation end product. In some
embodiments, the fermentation end product is selected from
alcohols, fatty acids, lactic acid, acetic acid, 3-hydroxypropionic
acid, acrylic acid, succinic acid, citric acid, malic acid, fumaric
acid, amino acids, 1,3-propanediol, ethylene, glycerol, fatty
alcohols, butadiene, and beta-lactams. In some further embodiments,
the fermentation product is an alcohol selected from ethanol and
butanol. In some still further embodiments, the alcohol is
ethanol.
[0040] The present invention also provides methods for increasing
production of fermentable sugars from a saccharification reaction
comprising combining at least one cellulase substrate, one or more
cellulase enzymes, and at least one GH61 protein wherein the
protein is at least about 70%, about 75%, about 80%, about 85%,
about 86%, about 87%, about 88%, about 89%, about 90%, about 91%,
about 92%, about 93%, about 94%, about 95%, about 96%, about 97%,
about 98%, about 99%, or about 100% identical to SEQ ID NO:2, and
an adjunct composition in a saccharification reaction, wherein the
adjunct composition comprises Cu.sup.++ at a concentration of at
least about 4 .mu.M and/or gallic acid at a concentration of at
least about 0.5 mM. The present invention also provides methods for
increasing production of fermentable sugars from a saccharification
reaction comprising combining at least one cellulase substrate, one
or more cellulase enzymes, and at least one GH61 protein wherein
the protein is at least 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%,
90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identical
to SEQ ID NO:2, and an adjunct composition in a saccharification
reaction, wherein the adjunct composition comprises Cu.sup.++ at a
concentration of at least about 4 .mu.M and/or gallic acid at a
concentration of at least about 0.5 mM. In some embodiments, at
least one GH61 protein comprises SEQ ID NO:2, 5, 6, 8, 9, 11,
and/or a biologically active fragment thereof. In some embodiments
of the methods, the GH61 protein is at least one GH61 protein
variant as provided herein. In some embodiments, the methods
further comprise use of at least one surfactant selected from
TWEEN.RTM.-20 non-ionic detergent and polyethylene glycol. In some
additional embodiments, the methods are conducted at about pH 5.0,
while in some other embodiments, the methods are conducted at about
pH 6.0.
[0041] The present invention also provides methods of producing at
least one end product from at least one cellulosic substrate,
comprising: a) providing at least one cellulosic substrate and at
least one enzyme composition as provided herein; b) contacting the
cellulosic substrate with the enzyme composition under conditions
whereby fermentable sugars are produced from the cellulosic
substrate in a saccharification reaction; and c) contacting the
fermentable sugars with a microorganism under fermentation
conditions such that at least one end product is produced. In some
embodiments, the method comprises simultaneous saccharification and
fermentation reactions (SSF), while in some alternative embodiments
of the methods, saccharification of the cellulosic substrate and
fermentation are conducted in separate reactions (SHF). In some
additional embodiments, the methods comprise production of at least
one enzyme simultaneously with hydrolysis and/or fermentation
(e.g., "consolidated bioprocessing"; CBP). In some embodiments, the
enzyme composition is produced simultaneously with the
saccharification and fermentation reactions. In some additional
embodiments at least one enzyme of said composition is produced
simultaneously with the saccharification and fermentation
reactions. In some embodiments, in which at least one enzyme and/or
the enzyme composition is produced simultaneously with the
saccharification and fermentation reactions, the methods are
conducted in a single reaction vessel. In some embodiments, the
methods further comprise use of at least one adjunct composition in
the saccharification reaction. In some embodiments of the methods,
at least one adjunct composition is selected from at least one
divalent metal cation, gallic acid, and/or at least one surfactant.
In some further embodiments of the methods, the divalent metal
cation comprises copper. In some further embodiments of the
methods, the adjunct composition comprises gallic acid. In some
additional embodiments of the methods, the surfactant is selected
from TWEEN.RTM.-20 non-ionic detergent and polyethylene glycol. In
some embodiments, the method is conducted at about pH 5.0. In some
embodiments, the method is conducted at about pH 6.0. In some
further embodiments, the method is conducted at a pH in the range
of about 4.5 to about 7.0. In some embodiments, the methods further
comprise recovering at least one end product. In some embodiments
of the methods the end product comprises at least one fermentation
end product. In some further embodiments of the methods, the
fermentation end product is selected from alcohols, fatty acids,
lactic acid, acetic acid, 3-hydroxypropionic acid, acrylic acid,
succinic acid, citric acid, malic acid, fumaric acid, an amino
acid, 1,3-propanediol, ethylene, glycerol, fatty alcohols,
butadiene, and beta-lactams. In some embodiments of the methods,
the fermentation end product is at least one alcohol selected from
ethanol and butanol. In some embodiments of the methods, the
alcohol is ethanol. In some additional embodiments of the methods,
the microorganism is a yeast. In some further embodiments, the
yeast is Saccharomyces. In some further additional embodiments, the
methods further comprise recovering at least one fermentation end
product.
[0042] The present invention also provides for use of at least one
GH61 variant protein provided herein to produce at least one
fermentation end product. In some embodiments, at least one GH61
variant protein provided herein is used to produce at least one
fermentation end product selected from alcohols, fatty acids,
lactic acid, acetic acid, 3-hydroxypropionic acid, acrylic acid,
citric acid, malic acid, fumaric acid, succinic acid, amino acids,
1,3-propanediol, ethylene, glycerol, butadiene, fatty alcohols, and
beta-lactams. In some embodiments, the fermentation end product is
at least one alcohol selected from ethanol and butanol. In some
further embodiments, the alcohol is ethanol.
[0043] Additional embodiments of the invention are apparent from
the present description.
DESCRIPTION OF THE DRAWINGS
[0044] FIG. 1 provides results of an experiment using recombinantly
produced GH61a protein having the sequence shown in SEQ ID NO:2.
The protein was tested for its ability to promote the activity of
cellulases present in culture broth of M. thermophila. The graph
shows the improvement in the yield of the fermentable sugar glucose
that is attained by adding GH61 to the reaction.
[0045] FIG. 2 shows specific GH61 activity observed in a reaction
where a wheat straw substrate was hydrolyzed by cellulase enzymes
CBH1, CBH2, and beta-glucosidase. The results show that GH61a
Variants 5 and 9 have a 2.0 to 2.9 fold improvement over the
parental GH61 sequence (SEQ ID NO:2); and Variant 1 has a 3.0 to
3.9 fold improvement.
[0046] FIG. 3 shows the increase in glucose production in the
presence of GH61 protein when Cu.sup.++ is included in the reaction.
In this Figure, n=4, and values are mean.+-.SD. Panel A shows the increase
with a GH61 variant protein "Variant 5," while Panel B shows the
increase with the wild-type GH61a protein (SEQ ID NO:2).
[0047] FIG. 4 shows activity of GH61a pre-incubated with 0 or 50
.mu.M CuSO.sub.4 (copper(II) ion) at a saccharification pH of either
5.0 or 6.0. Panel A shows glucose production, while Panel B shows
the total production of C5 sugars.
[0048] FIG. 5 shows activity of M. thermophila-produced GH61a
Variant 1 on cellulosic substrates. Panel A shows the results on
AVICEL.RTM. PH microcrystalline cellulose, and Panel B shows the
results on phosphoric acid swollen cellulose (PASC), in the
presence of ascorbic acid, gallic acid and pretreatment
filtrate.
[0049] FIG. 6 provides results showing the effects of surfactants
on saccharification. Panel A shows enzymatic hydrolysis activity of
a cellulase mixture in the presence of TWEEN.RTM.-20, while Panel B
shows the enzymatic hydrolysis activity of a cellulase mixture in
the presence of PEG-4000.
DETAILED DESCRIPTION OF THE INVENTION
[0050] As described herein, the present invention provides GH61
proteins of the filamentous fungus Myceliophthora thermophila that
have been genetically modified. These GH61 protein variants exhibit
improved activity and other benefits, as compared to wild-type GH61
proteins.
[0051] Before modification, the GH61 protein having the sequence
shown in SEQ ID NO:2 improves the yield of fermentable sugars
produced from a cellulosic substrate through the activity of
cellulase enzymes (e.g., endoglucanase, beta-glucosidase (BGL),
cellobiohydrolase, and combinations of such enzymes; See, FIG. 1).
The GH61 variant proteins of this invention have certain amino acid
substitutions in relation to SEQ ID NO:2, either alone or in
various combinations. GH61 variant proteins that have gone through
one round of optimization, when included in a saccharification
assay, improve the yield of fermentable sugars in such reactions by
at least about 2-fold, about 3-fold, or more, in relation to the
improvement in yield when wild-type GH61a (SEQ ID NO:2) is used
instead. (See, FIG. 2). After multiple rounds of optimization, the
GH61 activity can be improved by a further 1.5-fold, 2-fold, 3-fold
or more.
[0052] The GH61 variant proteins of the present invention have
important industrial applicability in the processing of cellulosic
biomass to produce fermentable sugars, which in turn can be
fermented or processed to produce commercially important
fermentation products (e.g., "fermentation end-products" or
"end-products"), including but not limited to, at least one
alcohol, fatty acid, lactic acid, acetic acid, 3-hydroxypropionic
acid, acrylic acid, succinic acid, citric acid, malic acid, fumaric
acid, amino acid, 1,3-propanediol, ethylene, glycerol, fatty
alcohol, butadiene, and/or beta-lactam. In further embodiments, the
alcohol is ethanol, butanol, and/or a fatty alcohol. In some
embodiments, the fermentation product is ethanol. In some still
further embodiments, the fermentation product is a fatty alcohol
that is a C8-C20 fatty alcohol. In some embodiments, the
fermentation medium comprises at least one product from a
saccharification process.
[0053] GH61 proteins, their production and use are generally
described in PCT/US11/488700. This application claims priority to
U.S. Ser. No. 61/375,788, both of which are incorporated herein by
reference in their entirety. Proteins, procedures, and uses
described in these applications find use with the GH61 variant
proteins of the present invention.
DEFINITIONS
[0054] All patents and publications, including all sequences
disclosed within such patents and publications, referred to herein
are expressly incorporated by reference. Unless otherwise
indicated, the practice of the present invention involves
conventional techniques commonly used in molecular biology,
fermentation, microbiology, and related fields, which are known to
those of skill in the art. Unless defined otherwise herein, all
technical and scientific terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to which
this invention belongs. Although any methods and materials similar
or equivalent to those described herein can be used in the practice
or testing of the present invention, the preferred methods and
materials are described. Indeed, it is intended that the present
invention not be limited to the particular methodology, protocols,
and reagents described herein, as these may vary, depending upon
the context in which they are used. The headings provided herein
are not limitations of the various aspects or embodiments of the
present invention.
[0055] Nonetheless, in order to facilitate understanding of the
present invention, a number of terms are defined below. Numeric
ranges are inclusive of the numbers defining the range. Thus, every
numerical range disclosed herein is intended to encompass every
narrower numerical range that falls within such broader numerical
range, as if such narrower numerical ranges were all expressly
written herein. It is also intended that every maximum (or minimum)
numerical limitation disclosed herein includes every lower (or
higher) numerical limitation, as if such lower (or higher)
numerical limitations were expressly written herein.
[0056] As used herein, the term "comprising" and its cognates are
used in their inclusive sense (i.e., equivalent to the term
"including" and its corresponding cognates).
[0057] As used herein and in the appended claims, the singular "a",
"an" and "the" include the plural reference unless the context
clearly dictates otherwise. Thus, for example, reference to a "host
cell" includes a plurality of such host cells.
[0058] Unless otherwise indicated, nucleic acids are written left
to right in 5' to 3' orientation; amino acid sequences are written
left to right in amino to carboxy orientation, respectively. The
headings provided herein are not limitations of the various aspects
or embodiments of the invention that can be had by reference to the
specification as a whole. Accordingly, the terms defined below are
more fully defined by reference to the specification as a
whole.
[0059] As used herein, the term "produces" refers to the production
of proteins (polypeptides) and/or other compounds by cells. It is
intended that the term encompass any step involved in the
production of polypeptides including, but not limited to,
transcription, post-transcriptional modification, translation, and
post-translational modification. In some embodiments, the term also
encompasses secretion of the polypeptide from a cell.
[0060] As used in this disclosure, the term "GH61 protein" means a
protein that has GH61 activity, including GH61 variants and
wild-type GH61 enzymes. In some embodiments, the GH61 proteins have
been purified from M. thermophila cells, while in other
embodiments, they are structurally related to the amino acid
sequences shown in Tables 1 and 2. The term also encompasses
species and strain homologs and orthologs comprising protein
sequences listed in Tables 1 and 2, as well as variants, and
fragments of such sequences (produced using any suitable means
known in the art), having GH61 activity.
[0061] As used herein, the terms "variant" and "GH61 variant" refer
to a GH61 polypeptide or polynucleotide encoding a GH61 polypeptide
comprising one or more modifications relative to wild-type GH61 or
the wild-type polynucleotide encoding GH61 (such as substitutions,
insertions, deletions, and/or truncations of one or more amino acid
residues or of one or more specific nucleotides or codons in the
polypeptide or polynucleotide, respectively), and biologically
active fragments thereof. In some embodiments, the variant is
derived from a M. thermophila polypeptide and comprises one or more
modifications relative to wild-type M. thermophila GH61 or the
wild-type polynucleotide encoding wild-type M. thermophila GH61, or
a biologically active fragment thereof. In some embodiments, a
"GH61 variant protein" ("GH61 variant polypeptide") of the present
invention is a protein that is structurally related to a reference
protein comprising SEQ ID NO:2 or a fragment of SEQ ID NO:2 that
has GH61 activity, but has one or more amino acid substitutions in
relation to the reference protein. In some embodiments, the GH61
variant is a GH61a variant (i.e., a variant of GH61a enzyme). In
some embodiments, the GH61 variant polypeptide is a "polypeptide of
interest." In some additional embodiments, the GH61 variant
polypeptide is encoded by a "polynucleotide of interest."
[0062] The terms "improved" or "improved properties," as used in
the context of describing the properties of a GH61 variant, refer
to a GH61 variant polypeptide that exhibits an improvement in a
property or properties as compared to the wild-type GH61 (e.g., SEQ
ID NO:2) or a specified reference polypeptide. Improved properties
may include, but are not limited to increased protein expression,
increased thermoactivity, increased thermostability, increased pH
activity, increased stability (e.g., increased pH stability or pH
tolerance at various pH levels), increased product specificity,
increased specific activity, increased substrate specificity,
increased resistance to substrate or end-product inhibition,
increased chemical stability, reduced inhibition by glucose,
increased resistance to inhibitors (e.g., acetic acid, lectins,
tannic acids, and phenolic compounds), and altered pH/temperature
profile.
[0063] The term "biologically active fragment," as used herein,
refers to a polypeptide that has an amino-terminal and/or
carboxy-terminal deletion and/or internal deletion, but where the
remaining amino acid sequence is identical to the corresponding
positions in the sequence to which it is being compared (e.g., a
full-length GH61 variant of the invention) and that retains
substantially all of the activity of the full-length polypeptide. A
biologically active fragment can comprise about 60%, about 65%,
about 70%, about 75%, about 80%, about 85%, about 90%, about
91%, about 92%, about 93%, about 94%, about 95%, about 96%, about
97%, about 98%, or about 99% of a full-length GH61 polypeptide.
[0064] A GH61 variant protein of this invention having "increased
GH61 activity" has more GH61 activity when that protein is present
in a saccharification reaction with a specified substrate and
specified cellulase enzyme(s), compared with a saccharification
reaction conducted with the same substrate and enzyme(s) under the
same conditions in the presence of a reference protein (e.g.,
including but not limited to wild-type GH61). The increase is
determined by measuring the amount of fermentable sugar produced in
the reaction in the presence of the GH61 variant protein, in the
presence of the reference protein (Positive Control), and in the
absence of either protein (Negative Control). The Fold Improvement Over
Positive Control (FIOPC) is calculated as ([Glucose production of
the GH61 Variant Protein]-[Glucose production of the Negative
Control])/([Glucose production of the Positive Control]-[Glucose
production of the Negative Control]).
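The ratio described in this paragraph can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the disclosed assay: the function name `fiopc`, its argument names, and the example glucose values are hypothetical and were chosen only to show the arithmetic. All three yields are assumed to be measured in the same units under the same reaction conditions.

```python
def fiopc(variant_glucose, positive_control_glucose, negative_control_glucose):
    """Ratio of background-corrected glucose yields, as defined in the text.

    variant_glucose: yield with the GH61 variant protein present.
    positive_control_glucose: yield with the reference protein present.
    negative_control_glucose: yield with neither protein present.
    All values must share the same units (for example, g/L).
    """
    # Gain attributable to the reference protein over the no-GH61 background.
    reference_gain = positive_control_glucose - negative_control_glucose
    if reference_gain == 0:
        raise ValueError("positive and negative controls are equal; ratio is undefined")
    # Gain attributable to the variant over the same background.
    variant_gain = variant_glucose - negative_control_glucose
    return variant_gain / reference_gain


# Hypothetical yields in g/L, for illustration only (not data from this disclosure).
example = fiopc(
    variant_glucose=40.0,
    positive_control_glucose=30.0,
    negative_control_glucose=25.0,
)
print(example)
```

Subtracting the Negative Control from both numerator and denominator means the ratio compares only the additional glucose attributable to each GH61 protein, rather than the total glucose produced by the cellulase enzymes alone.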
[0065] As used herein, "GH61 activity" is the functional activity
of a GH61 protein that results in production of more fermentable
sugar from a polysaccharide substrate when the GH61 protein is
present in a saccharification reaction, compared with a
saccharification reaction conducted under the same conditions in
the absence of the GH61 protein.
[0066] A GH61 variant protein of this invention having "increased
GH61 thermoactivity" has more GH61 activity in a saccharification
reaction conducted at an elevated temperature (about 50.degree. C.,
about 55.degree. C., about 60.degree. C., or higher) with a
specified substrate and specified cellulase enzyme(s), compared
with a saccharification reaction conducted under the same
conditions in the presence of the reference protein (e.g.,
including but not limited to wild-type GH61).
[0067] GH61 proteins of this invention may be said to "enhance",
"promote", or "facilitate" activity of one or more cellulase
enzymes during hydrolysis of sugar polymers (e.g., cellulosic
and/or lignocellulosic biomass) such that the enzyme(s) produce(s)
more product over a particular time period, hydrolysis proceeds
more rapidly, or goes further to completion when the GH61 protein
is present, compared with a similar reaction mixture in which the
GH61 protein is absent. This invention may be practiced by
following GH61 activity in an empirical fashion using assay methods
provided in this disclosure, without knowing the mechanism of
operation of the GH61 variant protein being used. However, it is
not intended that the present invention be limited to any
particular assay system and/or method, as any suitable method known
in the art finds use.
[0068] The terms "transform" or "transformation," as used in
reference to a cell, mean a cell has a non-native nucleic acid
sequence integrated into its genome and/or as an episome (e.g.,
plasmid) that is maintained through multiple generations.
[0069] The term "introduced," as used in the context of inserting a
nucleic acid sequence into a cell, means that the nucleic acid has
been conjugated, transfected, transduced or transformed
(collectively "transformed") or otherwise incorporated into the
genome of and/or maintained as an episome in the cell. Thus, the
term encompasses transformation, transduction, conjugation,
transfection, and/or any other suitable method(s) known in the art
for inserting nucleic acid sequences into host cells. Any suitable
means for the introduction of nucleic acid into host cells find use
in the present invention.
[0070] The terms "percent identity," "% identity", "percent
identical", and "% identical" are used interchangeably to refer to
a comparison of two optimally aligned sequences over a comparison
window. The comparison window may include additions or deletions in
either sequence to optimize alignment. The percentage of identity
is the number of positions that are identical between the
sequences, divided by the total number of positions in the
comparison window (including positions where one of the sequences
has a gap). For example, a protein with an amino acid sequence that
matches at 310 positions a sequence of GH61a (which has 323 amino
acids in the secreted form), would have 310/323=96.0% identity to
the reference. Similarly, a protein variant that has 300 residues
(i.e., less than full-length) and matches the reference sequence at
280 positions would have 280/300=93.3% identity.
Computer-implemented alignment algorithms useful in determining the
degree of identity are known in the art, including the BLAST and
BLAST 2.0 algorithms (See e.g., Altschul et al., J. Mol. Biol.,
215: 403-410 [1990]; and Altschul et al., Nucl. Acids Res., 25:
3389-3402 [1997]).
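The counting rule in this paragraph (identical positions divided by all positions in the comparison window, gapped positions included) can be sketched in Python as follows. This is an illustrative sketch, not the BLAST algorithm: it assumes the two sequences have already been aligned by some other tool and padded to equal length, and that gaps are written as "-". The function name `percent_identity` and the short example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    A position where either sequence has a gap counts toward the window
    length (the denominator) but never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for residue_a, residue_b in zip(aligned_a, aligned_b)
        if residue_a == residue_b and residue_a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment, for illustration only: 8 columns, 6 identical residues,
# one gapped column and one mismatch.
toy_identity = percent_identity("MKT-AYIA", "MKTLAYLA")
print(toy_identity)
```

Because gapped columns stay in the denominator, inserting gaps to improve an alignment never inflates the reported identity on its own.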
[0071] As used herein, "polynucleotide" refers to a polymer of
deoxyribonucleotides or ribonucleotides in either single- or
double-stranded form, and complements thereof.
[0072] As used herein, the term "allelic variant" refers to any of
two or more (e.g., several) alternative forms of a gene occupying
the same chromosomal locus. In some embodiments, allelic variation
arises naturally through mutation and results in genetic
polymorphism within populations. In some embodiments, gene
mutations are silent (i.e., there is no change in the encoded
polypeptide), while in some other embodiments the genes encode
polypeptides that have altered amino acid sequences. An "allelic
variant of a polypeptide" is a polypeptide encoded by an allelic
variant of a gene.
[0073] As used herein, "cDNA" refers to a DNA molecule that can be
prepared by reverse transcription from a mature, spliced, mRNA
molecule obtained from a eukaryotic or prokaryotic cell. cDNA
sequences lack intron sequences that may be present in the
corresponding genomic DNA. The initial, primary RNA transcript is a
precursor to mRNA that is processed through a series of steps,
including splicing, before appearing as mature spliced mRNA.
[0074] As used herein, the term "coding sequence" refers to a
polynucleotide that directly specifies the amino acid sequence of a
polypeptide. The boundaries of the coding sequence are generally
determined by an open reading frame, which begins with a start
codon (e.g., ATG, GTG, or TTG) and ends with a stop codon (e.g.,
TAA, TAG, or TGA). In some embodiments, a coding sequence comprises
genomic DNA, while in some alternative embodiments, the coding
sequence comprises cDNA, synthetic DNA, and/or a combination
thereof.
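As a non-limiting sketch, the boundaries of a coding sequence can be
located by scanning for a start codon followed by an in-frame stop
codon. The example uses only the exemplary codons listed above; the
function name find_open_reading_frame is hypothetical, only the
forward strand is scanned, and no claim is made that the first open
reading frame found is the biologically relevant one.

```python
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_open_reading_frame(dna: str):
    """Return (start, end) of the first forward-strand ORF, or None.

    Indices are 0-based; end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    for start in range(len(dna) - 2):
        if dna[start:start + 3] not in START_CODONS:
            continue
        # Walk codon by codon in the frame set by the start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
    return None


bounds = find_open_reading_frame("CCATGAAATTTTAGGG")
print(bounds)
```

For the toy input shown, the slice between the returned indices is
"ATGAAATTTTAG", i.e., a start codon, two sense codons, and a stop
codon.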
[0075] As used herein, the terms "control sequences" and
"regulatory sequences" refer to nucleic acid sequences necessary
and/or useful for expression of a polynucleotide encoding a
polypeptide. In some embodiments, control sequences are native
(i.e., from the same gene) or foreign (i.e., from a different gene)
to the polynucleotide encoding the polypeptide. Control sequences
include, but are not limited to leaders, polyadenylation sequences,
propeptide sequences, promoters, signal peptide sequences, and
transcription terminators. In some embodiments, at a minimum,
control sequences include a promoter, and transcriptional and
translational stop signals. In some embodiments, control sequences
are provided with linkers for the purpose of introducing specific
restriction sites facilitating ligation of the control sequences
with the coding region of the polynucleotide encoding the
polypeptide.
[0076] A nucleic acid construct, nucleic acid (e.g., a
polynucleotide), polypeptide, or host cell is referred to herein as
"recombinant" when it is non-naturally occurring, artificial or
engineered. The present invention also provides recombinant nucleic
acid constructs comprising at least one GH61 variant polynucleotide
sequence that hybridizes under stringent hybridization conditions
to the complement of a polynucleotide which encodes a polypeptide
comprising the amino acid sequence of any of SEQ ID NOS:2, 3, 5, 6,
8, 9, 11, and/or 12.
[0077] The term "recombinant nucleic acid" has its conventional
meaning. A recombinant nucleic acid, or equivalently,
"polynucleotide," is one that is inserted into a heterologous
location such that it is not associated with nucleotide sequences
that normally flank the nucleic acid as it is found in nature (for
example, a nucleic acid inserted into a vector or a genome of a
heterologous organism). Likewise, a nucleic acid sequence that does
not appear in nature, for example a variant of a naturally
occurring gene, is recombinant. A cell containing a recombinant
nucleic acid, or protein expressed in vitro or in vivo from a
recombinant nucleic acid, is also "recombinant." Examples of
recombinant nucleic acids include a protein-encoding DNA sequence
that is (i) operably linked to a heterologous promoter and/or (ii)
encodes a fusion polypeptide with a protein sequence and a
heterologous signal peptide sequence.
[0078] For purposes of this disclosure, a promoter is
"heterologous" to a gene sequence if the promoter is not associated
in nature with the gene. A signal peptide is "heterologous" to a
protein sequence when the signal peptide sequence is not associated
with the protein in nature. In some embodiments, "hybrid promoters"
find use. Hybrid promoters are promoters comprising portions of two
or more (e.g., several) promoters that are linked together to
generate a sequence that is a fusion of the portions of the two or
more promoters, which when operably linked to a coding sequence,
mediates the transcription of the coding sequence into mRNA.
[0079] In relation to regulatory sequences (e.g., promoters), the
term "operably linked" refers to a configuration in which a
regulatory sequence is located at a position relative to a
polypeptide encoding sequence such that the regulatory sequence
influences the expression of the polypeptide. In relation to a
signal sequence, the term "operably linked" refers to a
configuration in which the signal sequence encodes an
amino-terminal signal peptide fused to the polypeptide, such that
expression of the gene produces a pre-protein.
[0080] Nucleic acids "hybridize" when they associate, typically in
solution. Nucleic acids hybridize due to a variety of
well-characterized physico-chemical forces, such as hydrogen
bonding, solvent exclusion, base stacking and the like. As used
herein, "stringent hybridization wash conditions" in the context of
nucleic acid hybridization experiments, such as Southern and
Northern hybridizations, are sequence dependent, and differ under
different environmental parameters. An extensive
guide to the hybridization of nucleic acids is found in Tijssen,
1993, "Laboratory Techniques in Biochemistry and Molecular
Biology-Hybridization with Nucleic Acid Probes," Part I, Chapter 2
(Elsevier, New York), which is incorporated herein by reference.
For polynucleotides of at least 100 nucleotides in length, low to
very high stringency conditions are defined as follows:
prehybridization and hybridization at 42.degree. C. in
5.times.SSPE, 0.3% SDS, 200 .mu.g/ml sheared and denatured salmon
sperm DNA, and either 25% formamide for low stringencies, 35%
formamide for medium and medium-high stringencies, or 50% formamide
for high and very high stringencies, following standard Southern
blotting procedures. For polynucleotides of at least 100
nucleotides in length, the carrier material is finally washed three
times each for 15 minutes using 2.times.SSC, 0.2% SDS at 50.degree. C.
(low stringency), at 55.degree. C. (medium stringency), at
60.degree. C. (medium-high stringency), at 65.degree. C. (high
stringency), or at 70.degree. C. (very high stringency).
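For convenience, the stringency levels defined above for
polynucleotides of at least 100 nucleotides can be collected in a
simple lookup table. The following is a minimal sketch only; the
names Stringency, STRINGENCY and conditions_for are assumptions made
for this example, and the values are transcribed from the preceding
paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    formamide_percent: int  # formamide in the hybridization buffer at 42 C
    wash_temp_c: int        # temperature of the 2x SSC, 0.2% SDS washes


# Three washes of 15 minutes each are used at every level.
STRINGENCY = {
    "low": Stringency(25, 50),
    "medium": Stringency(35, 55),
    "medium-high": Stringency(35, 60),
    "high": Stringency(50, 65),
    "very high": Stringency(50, 70),
}


def conditions_for(level: str) -> Stringency:
    """Look up the conditions for a named stringency level."""
    key = level.strip().lower()
    if key not in STRINGENCY:
        raise ValueError(f"unknown stringency level: {level!r}")
    return STRINGENCY[key]


print(conditions_for("medium-high"))
```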
[0081] As used herein, a "vector" and "nucleic acid construct"
comprise nucleic acid (e.g., DNA) constructs for introducing a DNA
sequence into a cell. In some embodiments, the vector is an
expression vector that is operably linked to a suitable control
sequence capable of effecting the expression in a suitable host of
the polypeptide encoded in the DNA sequence. The term "expression
vector" refers to a DNA molecule, linear or circular, that
comprises a segment encoding a polypeptide of the invention, and
which is operably linked to additional segments that provide for
its transcription (e.g., a promoter, a transcription terminator
sequence, enhancers, etc.) and optionally a selectable marker.
[0082] As used herein, the term "isolated" refers to a nucleic
acid, polypeptide, or other component that is partially or
completely separated from components with which it is normally
associated in nature. Thus, the term encompasses a substance in a
form or environment that does not occur in nature. Non-limiting
examples of isolated substances include:
any non-naturally occurring substance; any substance including, but
not limited to, any enzyme, variant, polynucleotide, protein,
peptide or cofactor, that is at least partially removed from one or
more or all of the naturally occurring constituents with which it
is associated in nature; any substance modified by the hand of man
relative to that substance found in nature; and/or any substance
modified by increasing the amount of the substance relative to
other components with which it is naturally associated (e.g.,
multiple copies of a gene encoding the substance; and/or use of a
stronger promoter than the promoter naturally associated with the
gene encoding the substance). In some embodiments, a polypeptide of
interest is used in industrial applications in the form of a
fermentation broth product (i.e., the polypeptide is a component of
a fermentation broth) used as a product in industrial applications
such as ethanol production. In some embodiments, in addition to the
polypeptide of interest (e.g., a GH61 variant polypeptide), the
fermentation broth product further comprises ingredients used in
the fermentation process (e.g., cells, including the host cells
containing the gene encoding the polypeptide of interest and/or the
polypeptide of interest), cell debris, biomass, fermentation media,
and/or fermentation products. In some embodiments, the fermentation
broth is optionally subjected to one or more purification steps
(e.g., filtration) to remove or reduce at least one component of a
fermentation process. Accordingly, in some embodiments, an isolated
substance is present in such a fermentation broth product.
[0083] As used herein, the terms "peptide," "polypeptide," and
"protein" are used interchangeably to refer to a polymer of
amino acid residues.
[0084] As used herein, the term "amino acid" refers to naturally
occurring and synthetic amino acids, as well as amino acid analogs.
Naturally occurring amino acids are those encoded by the genetic
code, as well as those amino acids that are later modified (e.g.,
hydroxyproline, .gamma.-carboxyglutamate, and O-phosphoserine).
Amino acid analogs refer to compounds that have the same basic
chemical structure as a naturally occurring amino acid (i.e., an
.alpha.-carbon that is bound to a hydrogen, a carboxyl group, an
amino group, and an R group), such as homoserine, norleucine,
methionine sulfoxide, and methionine methyl sulfonium. Such
analogs have modified R groups (e.g., norleucine) or modified
peptide backbones, but retain the same basic chemical structure as
a naturally occurring amino acid.
[0085] An "amino acid substitution" in a protein sequence is
replacement of a single amino acid within that sequence with
another amino acid. Unless indicated otherwise, variant GH61
proteins of this invention have substitutions as specifically
indicated. In some embodiments, the variant GH61 proteins of the
present invention also have other substitutions and/or alterations
at any position in any combination with the substitutions
specifically indicated.
[0086] Amino acids are referred to herein by either their commonly
known three letter symbols or by the one-letter symbols recommended
by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides,
likewise, may be referred to by their commonly accepted
single-letter codes.
[0087] An amino acid or nucleotide base "position" is denoted by a
number that sequentially identifies each amino acid (or nucleotide
base) in the reference sequence based on its position relative to
the N-terminus (or 5'-end). Due to deletions, insertions,
truncations, fusions, and the like that must be taken into account
when determining an optimal alignment, the amino acid residue
number in a test sequence determined by simply counting from the
N-terminus will not necessarily be the same as the number of its
corresponding position in the reference sequence. For example, in a
case where a test sequence has a deletion relative to an aligned
reference sequence, there will be no amino acid in the variant that
corresponds to a position in the reference sequence at the site of
deletion. Where there is an insertion in an aligned reference
sequence, that insertion will not correspond to a numbered amino
acid position in the reference sequence. In the case of truncations
or fusions there can be stretches of amino acids in either the
reference or aligned sequence that do not correspond to any amino
acid in the corresponding sequence.
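The relationship between residue numbers in a test sequence and
positions in the reference sequence can be illustrated with the
following non-limiting sketch. It assumes the two sequences have
already been aligned and padded with "-" at gaps; the function name
map_to_reference_positions is hypothetical.

```python
def map_to_reference_positions(aligned_ref: str, aligned_test: str, gap: str = "-"):
    """Map 1-based test-sequence positions to 1-based reference positions.

    A test residue that aligns to a gap in the reference (an insertion)
    maps to None because it has no numbered reference position.
    """
    if len(aligned_ref) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    mapping = {}
    ref_pos = test_pos = 0
    for r, t in zip(aligned_ref, aligned_test):
        if r != gap:
            ref_pos += 1
        if t != gap:
            test_pos += 1
            mapping[test_pos] = ref_pos if r != gap else None
    return mapping


# Deletion in the test sequence: reference position 3 has no counterpart.
print(map_to_reference_positions("MKTAYL", "MK-AYL"))
# Insertion in the test sequence: test residue 3 has no reference number.
print(map_to_reference_positions("MK-TA", "MKGTA"))
```

In the deletion case, the residue counted third from the N-terminus
of the test sequence corresponds to position 4 of the reference,
which is why simple counting does not necessarily give the reference
numbering.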
[0088] As used herein, the terms "numbered with reference to" and
"corresponding to," when used in the context of the numbering of a
given amino acid or polynucleotide sequence, refer to the
numbering of the residues of a specified reference sequence when
the given amino acid or polynucleotide sequence is compared to the
reference sequence.
[0089] As used herein, the term "reference enzyme" refers to an
enzyme to which another enzyme of the present invention (e.g., a
"test" enzyme) is compared in order to determine the presence of an
improved property in the other enzyme being evaluated. In some
embodiments, a reference enzyme is a wild-type enzyme (e.g.,
wild-type GH61). In some embodiments, the reference enzyme is an
enzyme with which a test enzyme of the present invention is
compared in order to determine the presence of an improved property
in the test enzyme being evaluated, including but not limited to
improved thermoactivity, improved thermostability, improved
activity, and/or improved stability.
[0090] Amino acid substitutions in a GH61 protein are referred to
in this disclosure using the following notation: The single-letter
abbreviation for the amino acid being substituted; its position in
the reference sequence (e.g., the wild-type "parental sequence" set
forth in SEQ ID NO:2); and the single-letter abbreviation for the
amino acid that replaces it. Thus, the following nomenclature is
used herein to describe substitutions in a variant polypeptide or
nucleic acid sequence relative to a reference sequence: "R-#-V,"
where # refers to the position in
the reference sequence, R refers to the amino acid (or base) at
that position in the reference sequence, and V refers to the amino
acid (or base) at that position in the variant sequence. In some
embodiments, an amino acid (or base) may be called "X," by which is
meant any amino acid (or base). As a non-limiting example, for a
variant polypeptide described with reference to a wild-type GH61
polypeptide (e.g., SEQ ID NO:2), "N35G" indicates that in the
variant polypeptide, the asparagine at position 35 of the reference
sequence is replaced by glycine, with amino acid position being
determined by optimal alignment of the variant sequence with SEQ ID
NO:2. Similarly, "H20C/D" describes two variants: a variant in
which the histidine at position 20 of the reference sequence is
replaced by cysteine and a variant in which the histidine at
position 20 of the reference sequence is replaced by aspartic acid.
The example "W141X" indicates that the tryptophan at position 141
has been replaced with any amino acid.
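The "R-#-V" notation lends itself to mechanical parsing. The
following is a minimal sketch, assuming single-letter amino acid
codes and the slash convention for alternative replacements; the
function name parse_substitution is an assumption made for this
example.

```python
import re

# One reference letter, a position, then one or more slash-separated
# replacement letters (e.g., "N35G", "H20C/D", "W141X").
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_substitution(code: str):
    """Expand a substitution code into (reference, position, variant) tuples."""
    m = _PATTERN.match(code.strip())
    if m is None:
        raise ValueError(f"unrecognized substitution: {code!r}")
    ref, pos = m.group(1), int(m.group(2))
    # "X" is passed through unchanged; it denotes any amino acid.
    return [(ref, pos, variant) for variant in m.group(3).split("/")]


for example in ("N35G", "H20C/D", "W141X"):
    print(example, parse_substitution(example))
```

A code such as "H20C/D" expands to two tuples, one per variant, each
sharing the same reference residue and position.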
[0091] As used herein in reference to nucleotide and amino acid
sequences, the term "mutation" refers to any change in the
sequence, as compared to a reference nucleotide or amino acid
sequence, including but not limited to substitutions, deletions,
additions, truncations, modifications, etc. Indeed, it is intended
that any change in a reference (or "parent" or "starting")
nucleotide or amino acid sequence comprises a mutation in the
sequence.
[0092] As used herein, the terms "amino acid mutation set" and
"mutation set" when used in the context of amino acid sequences
(e.g., polypeptides) refer to a group of amino acid substitutions,
insertions, deletions and/or other modifications to the sequence.
In some embodiments, "mutation set" refers to the amino acid
mutation sets present in some of the GH61 variants provided in
Table 1 and Table 2.
[0093] The terms "amino acid substitution set," "substitution set,"
and "combination of amino acid substitutions" refer to a group
(i.e., set of combinations) of amino acid substitutions. A
substitution set can have about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, or more amino acid substitutions. In some
embodiments, a substitution set refers to the set of amino acid
substitutions that is present in any of the variant GH61 enzymes
provided herein.
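By way of illustration only, a substitution set can be applied to a
reference sequence as sketched below. The sketch assumes 1-based
positions that already refer to the reference numbering and
single-residue substitutions written as reference letter, position,
and replacement letter; the function name apply_substitution_set and
the six-residue toy sequence are hypothetical and do not correspond
to any sequence of the disclosure.

```python
import re


def apply_substitution_set(reference: str, substitutions):
    """Return the variant obtained by applying each substitution to the reference."""
    residues = list(reference)
    for code in substitutions:
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if m is None:
            raise ValueError(f"unrecognized substitution: {code!r}")
        ref, pos, variant = m.group(1), int(m.group(2)), m.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{code}: position {pos} is outside the reference")
        # Guard against a code that does not match the reference residue.
        if reference[pos - 1] != ref:
            raise ValueError(
                f"{code}: reference has {reference[pos - 1]!r} at position {pos}"
            )
        residues[pos - 1] = variant
    return "".join(residues)


print(apply_substitution_set("MKTAYL", ["K2R", "Y5F"]))
```

Checking the stated reference residue before substituting catches
numbering errors, such as codes written against a different parent
sequence.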
[0094] As used herein, the terms "nucleic acid substitution set"
and "substitution set" when used in the context of nucleotide
sequences (e.g., polynucleotides) refer to a group of nucleic acid
substitutions. In some embodiments, "substitution set" refers to the
nucleic acid substitution sets present in some of the variant GH61
proteins provided in Table 1 and Table 2.
[0095] As used herein, the terms "nucleic acid mutation set" and
"mutation set" when used in the context of nucleotide sequences
(e.g., polynucleotides) refer to a group of nucleic acid
substitutions, insertions, deletions, and/or other modifications to
the sequence. In some embodiments, "mutation set" refers to the
nucleic acid mutation sets present in some of the GH61 variants
provided in Table 1 and Table 2.
[0096] A "cellulase-engineered" cell is a cell comprising at least
one, at least two, at least three, or at least four recombinant
sequences encoding a cellulase or cellulase variant, and in which
expression of the cellulase(s) or cellulase variant(s) has been
modified relative to the wild-type form. Expression of a cellulase
is "modified" when a non-naturally occurring cellulase variant is
expressed or when a naturally occurring cellulase is
over-expressed. One exemplary means to over-express a cellulase is
to operably link a strong (optionally constitutive) promoter to the
cellulase encoding sequence. Another exemplary way to over-express
a cellulase is to increase the copy number of a heterologous,
variant, or endogenous cellulase gene. The cellulase-engineered
cell may be any suitable fungal cell, including, but not limited to,
Myceliophthora, Trichoderma, and Aspergillus cells.
[0097] As used herein, the terms "host cell" and "host strain"
refer to suitable hosts for expression vectors comprising DNA
provided herein. In some embodiments, the host cells are
prokaryotic or eukaryotic cells that have been transformed or
transfected with vectors constructed using recombinant DNA
techniques as known in the art. Transformed hosts are capable of
replicating vectors encoding at least one protein of
interest and/or expressing the desired protein of interest. In
addition, reference to a cell of a particular strain refers to a
parental cell of the strain as well as progeny and genetically
modified derivatives. Genetically modified derivatives of a
parental cell include progeny cells that contain a modified genome
or episomal plasmids that confer for example, antibiotic
resistance, improved fermentation, etc. In some embodiments, host
cells are genetically modified to have characteristics that improve
protein secretion, protein stability or other properties desirable
for expression and/or secretion of a protein. For example, knockout
of Alp1 function results in a cell that is protease deficient.
Knockout of pyr5 function results in a cell with a pyrimidine
deficient phenotype. In some embodiments, host cells are modified
to delete endogenous cellulase protein-encoding sequences or
otherwise eliminate expression of one or more endogenous
cellulases. In some embodiments, expression of one or more
endogenous cellulases is inhibited to increase production of
cellulases of interest. Genetic modification can be achieved by any
suitable genetic engineering techniques and/or classical
microbiological techniques (e.g., chemical or UV mutagenesis and
subsequent selection). Using recombinant technology, nucleic acid
molecules can be introduced, deleted, inhibited or modified, in a
manner that results in increased yields of GH61 variant(s) within
the organism or in the culture. In some genetic engineering
approaches, homologous
recombination is used to induce targeted gene modifications by
specifically targeting a gene in vivo to suppress expression of the
encoded protein. In an alternative approach, siRNA, antisense,
and/or ribozyme technology finds use in inhibiting gene
expression.
[0098] As used herein, the term "C1" refers to strains of
Myceliophthora thermophila, including the fungal strain described
by Garg (See, Garg, Mycopathol., 30: 3-4 [1966]). As used herein,
"Chrysosporium lucknowense" includes the strains described in U.S.
Pat. Nos. 6,015,707, 5,811,381 and 6,573,086; US Pat. Pub. Nos.
2007/0238155, 2008/0194005, and 2009/0099079; and International Pat.
Pub. Nos. WO 2008/073914 and WO 98/15633, all of which are
incorporated herein by reference, and include, without limitation,
Chrysosporium lucknowense Garg 27K, VKM-F 3500 D (Accession No. VKM
F-3500-D), C1 strain UV13-6 (Accession No. VKM F-3632 D), C1 strain
NG7C-19 (Accession No. VKM F-3633 D), and C1 strain UV18-25 (VKM
F-3631 D), all of which have been deposited at the All-Russian
Collection of Microorganisms of Russian Academy of Sciences (VKM),
Bakhrushina St. 8, Moscow, Russia, 113184, and any derivatives
thereof. Although initially described as Chrysosporium lucknowense,
C1 may currently be considered a strain of Myceliophthora
thermophila. Other C1 strains include cells deposited under
accession numbers ATCC 44006, CBS (Centraalbureau voor
Schimmelcultures) 122188, CBS 251.72, CBS 143.77, CBS 272.77,
CBS 122190, CBS 122189, and VKM F-3500D. Exemplary C1 derivatives
include but are not limited to modified organisms in which one or
more endogenous genes or sequences have been deleted or modified
and/or one or more heterologous genes or sequences have been
introduced. Derivatives include, but are not limited to UV18#100f
.DELTA.alp1, UV18#100f .DELTA.pyr5 .DELTA.alp1, UV18#100.f
.DELTA.alp1 .DELTA.pep4 .DELTA.alp2, UV18#100.f .DELTA.pyr5
.DELTA.alp1 .DELTA.pep4 .DELTA.alp2 and UV18#100.f .DELTA.pyr4
.DELTA.pyr5 .DELTA.alp1 .DELTA.pep4 .DELTA.alp2, as described in
WO2008073914 and WO2010107303, each of which is incorporated herein
by reference.
[0099] As used herein, the term "culturing" refers to growing a
population of microbial cells under suitable conditions in a
liquid, semi-solid, or solid medium.
[0100] In general, "saccharification" refers to the process in
which substrates (e.g., cellulosic biomass and/or lignocellulosic
biomass) are broken down via the action of cellulases to produce
fermentable sugars (e.g. monosaccharides, including but not limited
to glucose and/or xylose). In particular, "saccharification" is an
enzyme-catalyzed reaction that results in hydrolysis of a complex
carbohydrate to produce shorter-chain carbohydrate polymers and/or
fermentable sugar(s) that are more suitable for fermentation or
further hydrolysis. In some embodiments, the enzymes comprise
cellulase enzyme(s) such as endoglucanases, beta-glucosidases,
cellobiohydrolases (e.g., CBH1 and/or CBH2), a synthetic mixture of
any of such enzymes, and/or cellulase enzymes contained in culture
broth from an organism that produces cellulase enzymes, such as M.
thermophila or recombinant yeast cells. Products of
saccharification may include disaccharides, and/or monosaccharides
such as glucose or xylose.
[0101] In some embodiments, the fermentable sugars produced by the
methods of the present invention are used to produce an alcohol
(e.g., including but not limited to ethanol, butanol, etc.). The
variant GH61 proteins of the present invention find use in any
suitable method to generate alcohols and/or other biofuels from
cellulose and/or lignocellulose, and are not limited necessarily to
those described herein. Two methods commonly employed are the
separate saccharification and fermentation (SHF) method (See, Wilke
et al., Biotechnol. Bioengin. 6:155-75 [1976]) and the simultaneous
saccharification and fermentation (SSF) method (See e.g., U.S. Pat.
Nos. 3,990,944 and 3,990,945). An additional method that finds use
with the present invention is consolidated bioprocessing (CBP),
which encompasses the combination of the biological steps used in
the conversion of lignocellulosic biomass to bioethanol (e.g.,
production of cellulase(s), hydrolysis of the polysaccharides in
the biomass, and fermentation of hexose and pentose sugars) in one
reactor (See e.g., Vertes et al., Biomass to Biofuels: Strategies
for Global Industries, John Wiley & Sons, Ltd., [2010],
Hoboken, N.J., pp. 324-325).
[0102] The SHF method of saccharification comprises the steps of
contacting cellulase with a cellulose-containing substrate to
enzymatically break down cellulose into fermentable sugars (e.g.,
monosaccharides such as glucose), contacting the fermentable sugars
with an alcohol-producing microorganism to produce alcohol (e.g.,
ethanol or butanol) and recovering the alcohol. In some
embodiments, the method of consolidated bioprocessing (CBP) can be
used, in which the cellulase production from the host is
simultaneous with saccharification and fermentation either from one
host or from a mixed cultivation.
[0103] In addition to SHF methods, an SSF method may be used. In
some cases, SSF methods result in a higher efficiency of alcohol
production than is afforded by the SHF method (See e.g., Drissen et
al., Biocat. Biotransform., 27:27-35 [2009]). One disadvantage of
SSF over SHF is that higher temperatures are required for SSF than
for SHF. In some embodiments, the present invention provides GH61
polypeptides that have higher thermostability than a wild-type
GH61. Thus, it is contemplated that the present invention will find
use in increasing ethanol production in SSF, as well as SHF
methods.
[0104] As used herein, "fermentable sugars" refers to simple
sugars (e.g., monosaccharides, disaccharides and short
oligosaccharides), including but not limited to glucose, xylose,
galactose, arabinose, mannose and sucrose. In general, the term
"fermentable sugar" refers to any sugar that a microorganism can
utilize or ferment.
[0105] As used herein, the terms "adjunct material," "adjunct
composition," and "adjunct compound" refer to any composition
suitable for use in the compositions and/or saccharification
reactions provided herein, including but not limited to cofactors,
surfactants, builders, buffers, enzyme stabilizing systems,
chelants, dispersants, colorants, preservatives, antioxidants,
solubilizing agents, carriers, processing aids, pH control agents,
etc. In some embodiments, divalent metal cations are used to
supplement saccharification reactions and/or the growth of host
cells producing GH61 variant proteins. Any suitable divalent metal
cation finds use in the present invention, including but not
limited to Cu.sup.++, Mn.sup.++, Co.sup.++, Mg.sup.++, Ni.sup.++,
Zn.sup.++, and Ca.sup.++. In addition, any suitable combination of
divalent metal cations finds use in the present invention.
Furthermore, divalent metal cations find use from any suitable
source.
[0106] In some embodiments, the host cells producing GH61 variant
proteins of the present invention are grown under culture
conditions comprising about pH 5, while in some other embodiments,
the host cells are grown at about pH 6.7. In some embodiments, the
host cells cultured at pH 5 provide improved saccharification in
the presence of supplemented copper, when saccharification is
conducted at about pH 5 or about pH 6.7. In some alternative
embodiments, the host cells cultured at about pH 6.7 provide
improved saccharification in the absence of supplemented copper
when saccharification is conducted at about pH 5 or about pH 6.
[0107] As used herein, the terms "biomass," "biomass substrate,"
"cellulosic biomass," "cellulosic feedstock," and "cellulosic
substrate" refer to any materials that contain cellulose. Biomass
can be derived from plants, animals, or microorganisms, and may
include, but is not limited to agricultural, industrial, and
forestry residues, industrial and municipal wastes, and terrestrial
and aquatic crops grown for energy purposes. Examples of cellulosic
substrates include, but are not limited to, wood, wood pulp, paper
pulp, corn fiber, corn grain, corn cobs, crop residues such as corn
husks, corn stover, grasses, wheat, wheat straw, barley, barley
straw, hay, rice, rice straw, switchgrass, waste paper, paper and
pulp processing waste, woody or herbaceous plants, fruit or
vegetable pulp, corn cobs, distillers grain, grasses, rice hulls,
cotton, hemp, flax, sisal, sugar cane bagasse, sorghum, soy,
switchgrass, components obtained from milling of grains, trees,
branches, roots, leaves, wood chips, sawdust, shrubs and bushes,
vegetables, fruits, and flowers and any suitable mixtures thereof. In
some embodiments, the cellulosic biomass comprises, but is not
limited to cultivated crops (e.g., grasses, including C4 grasses,
such as switch grass, cord grass, rye grass, miscanthus, reed
canary grass, or any combination thereof), sugar processing
residues, for example, but not limited to, bagasse (e.g., sugar
cane bagasse, beet pulp [e.g., sugar beet], or a combination
thereof), agricultural residues (e.g. soybean stover, corn stover,
corn fiber, rice straw, sugar cane straw, rice, rice hulls, barley
straw, corn cobs, wheat straw, canola straw, oat straw, oat hulls,
corn fiber, hemp, flax, sisal, cotton, or any combination thereof),
fruit pulp, vegetable pulp, distillers' grains, forestry biomass
(e.g., wood, wood pulp, paper pulp, recycled wood pulp fiber,
sawdust, hardwood, such as aspen wood, softwood, or a combination
thereof). Furthermore, in some embodiments, the cellulosic biomass
comprises cellulosic waste material and/or forestry waste
materials, including but not limited to, paper and pulp processing
waste, newsprint, cardboard and the like. In some embodiments, the
cellulosic biomass comprises one species of fiber, while in some
alternative embodiments, the cellulosic biomass comprises a mixture
of fibers that originate from different cellulosic biomasses. In
some embodiments, the biomass may also comprise transgenic plants
that express ligninase and/or cellulase enzymes (US 2008/0104724
A1).
[0108] The terms "lignocellulosic biomass" and "lignocellulosic
feedstock" refer to plant biomass that is composed of cellulose and
hemicellulose, bound to lignin. The biomass may optionally be
pretreated to increase the susceptibility of cellulose to
hydrolysis by chemical, physical and biological pretreatments (such
as steam explosion, pulping, grinding, acid hydrolysis, solvent
exposure, and the like, as well as combinations thereof). Various
lignocellulosic feedstocks find use, including those that comprise
fresh lignocellulosic feedstock, partially dried lignocellulosic
feedstock, fully dried lignocellulosic feedstock, and/or any
combination thereof. In some embodiments, lignocellulosic
feedstocks comprise cellulose in an amount greater than about 20%,
more preferably greater than about 30%, more preferably greater
than about 40% (w/w). For example, in some embodiments, the
lignocellulosic material comprises from about 20% to about 90%
(w/w) cellulose, or any amount therebetween, although in some
embodiments, the lignocellulosic material comprises less than about
19%, less than about 18%, less than about 17%, less than about 16%,
less than about 15%, less than about 14%, less than about 13%, less
than about 12%, less than about 11%, less than about 10%, less than
about 9%, less than about 8%, less than about 7%, less than about
6%, or less than about 5% cellulose (w/w).
[0109] Furthermore, in some embodiments, the lignocellulosic
feedstock comprises lignin in an amount greater than about 10%,
more typically in an amount greater than about 15% (w/w). In some
embodiments, the lignocellulosic feedstock comprises small amounts
of sucrose, fructose and/or starch. The lignocellulosic feedstock
is generally first subjected to size reduction by methods
including, but not limited to, milling, grinding, agitation,
shredding, compression/expansion, or other types of mechanical
action. Size reduction by mechanical action can be performed by any
type of equipment adapted for the purpose, for example, but not
limited to, hammer mills, tub-grinders, roll presses, refiners and
hydrapulpers. In some embodiments, at least 90% by weight of the
particles produced from the size reduction have lengths less than
between about 1/16 and about 4 in (the measurement may be a volume
or a weight average length). In some embodiments, the equipment
used for the particle size reduction is a hammer mill or
shredder. Subsequent to size reduction, the feedstock is typically
slurried in water, as this facilitates pumping of the feedstock. In
some embodiments, lignocellulosic feedstocks of particle size less
than about 6 inches do not require size reduction.
[0110] As used herein, the term "pretreated lignocellulosic
feedstock," refers to lignocellulosic feedstocks that have been
subjected to physical and/or chemical processes to make the fiber
more accessible and/or receptive to the actions of cellulolytic
enzymes, as described above.
[0111] A cellulosic substrate or lignocellulosic substrate is said
to be "pretreated" when it has been processed by some physical
and/or chemical means to facilitate saccharification. As described
further herein, in some embodiments, the biomass substrate is
"pretreated," or treated using methods known in the art, such as
chemical pretreatment (e.g., ammonia pretreatment, dilute acid
pretreatment, dilute alkali pretreatment, or solvent exposure),
physical pretreatment (e.g., steam explosion or irradiation),
mechanical pretreatment (e.g., grinding or milling) and biological
pretreatment (e.g., application of lignin-solubilizing
microorganisms) and combinations thereof, to increase the
susceptibility of cellulose to hydrolysis. Thus, the term
"cellulosic biomass" encompasses any living or dead biological
material that contains a polysaccharide substrate, including but
not limited to cellulose, starch, other forms of long-chain
carbohydrate polymers, and mixtures of such sources. It may or may
not be assembled entirely or primarily from glucose or xylose, and
may optionally also contain various other pentose or hexose
monomers. Xylose is an aldopentose containing five carbon atoms and
an aldehyde group. It is the precursor to hemicellulose, and is
often a main constituent of biomass. In some embodiments, the
substrate is slurried prior to pretreatment. In some embodiments,
the consistency of the slurry is between about 2% and about 30% and
more typically between about 4% and about 15%. In some embodiments,
the slurry is subjected to a water and/or acid soaking operation
prior to pretreatment. In some embodiments, the slurry is dewatered
using any suitable method to reduce steam and chemical usage prior
to pretreatment. Examples of dewatering devices include, but are
not limited to, pressurized screw presses (See e.g., WO 2010/022511,
incorporated herein by reference), pressurized filters, and
extruders.
[0112] In some embodiments, the pretreatment is carried out to
hydrolyze hemicellulose, and/or a portion thereof present in the
cellulosic substrate to monomeric pentose and hexose sugars (e.g.,
xylose, arabinose, mannose, galactose, and/or any combination
thereof). In some embodiments, the pretreatment is carried out so
that nearly complete hydrolysis of the hemicellulose and a small
amount of conversion of cellulose to glucose occurs. In some
embodiments, an acid concentration in the aqueous slurry from about
0.02% (w/w) to about 2% (w/w), or any amount therebetween, is
typically used for the treatment of the cellulosic substrate. Any
suitable acid finds use in these methods, including but not limited
to, hydrochloric acid, nitric acid, and/or sulfuric acid. In some
embodiments, the acid used during pretreatment is sulfuric acid.
Steam explosion is one method of performing acid pretreatment of
biomass substrates (See e.g., U.S. Pat. No. 4,461,648). Another
method of pretreating the slurry involves continuous pretreatment
(i.e., the cellulosic biomass is pumped through a reactor
continuously). These methods are well-known to those skilled in the
art (See e.g., U.S. Pat. No. 7,754,457).
[0113] In some embodiments, alkali is used in the pretreatment. In
contrast to acid pretreatment, pretreatment with alkali may not
hydrolyze the hemicellulose component of the biomass. Rather, the
alkali reacts with acidic groups present on the hemicellulose to
open up the surface of the substrate. In some embodiments, the
addition of alkali alters the crystal structure of the cellulose so
that it is more amenable to hydrolysis. Examples of alkali that
find use in the pretreatment include, but are not limited to
ammonia, ammonium hydroxide, potassium hydroxide, and sodium
hydroxide. One method of alkali pretreatment is Ammonia Freeze
Explosion, Ammonia Fiber Explosion or Ammonia Fiber Expansion
("AFEX" process; See e.g., U.S. Pat. Nos. 5,171,592; 5,037,663;
4,600,590; 6,106,888; 4,356,196; 5,939,544; and 6,176,176).
During this process, the cellulosic substrate is
contacted with ammonia or ammonium hydroxide in a pressure vessel
for a sufficient time to enable the ammonia or ammonium hydroxide
to alter the crystal structure of the cellulose fibers. The
pressure is then rapidly reduced, which allows the ammonia to flash
or boil and explode the cellulose fiber structure. In some
embodiments, the flashed ammonia is then recovered using methods
known in the art. In some alternative methods, dilute ammonia
pretreatment is utilized. The dilute ammonia pretreatment method
utilizes more dilute solutions of ammonia or ammonium hydroxide
than AFEX (See e.g., WO2009/045651 and US 2007/0031953). This
pretreatment process may or may not produce any
monosaccharides.
[0114] Additional pretreatment processes for use in the present
invention include chemical treatment of the cellulosic substrate
with organic solvents, in methods such as those utilizing organic
liquids in pretreatment systems (See e.g., U.S. Pat. No. 4,556,430;
incorporated herein by reference). These methods have the advantage
that the low boiling point liquids easily can be recovered and
reused. Other pretreatments, such as the Organosolv.TM. process,
also use organic liquids (See e.g., U.S. Pat. No. 7,465,791, which
is also incorporated herein by reference). Subjecting the substrate
to pressurized water may also be a suitable pretreatment method
(See e.g., Weil et al., Appl. Biochem. Biotechnol., 68(1-2): 21-40
[1997], which is incorporated herein by reference). In some
embodiments, the pretreated cellulosic biomass is processed after
pretreatment by any of several steps, such as dilution with water,
washing with water, buffering, filtration, or centrifugation, or
any combination of these processes, prior to enzymatic hydrolysis,
as is familiar to those skilled in the art. The pretreatment
produces a pretreated feedstock composition (e.g., a "pretreated
feedstock slurry") that contains a soluble component including the
sugars resulting from hydrolysis of the hemicellulose, optionally
acetic acid and other inhibitors, and solids including unhydrolyzed
feedstock and lignin. In some embodiments, the soluble components
of the pretreated feedstock composition are separated from the
solids to produce a soluble fraction.
[0115] In some embodiments, the soluble fraction, including the
sugars released during pretreatment and other soluble components
(e.g., inhibitors), is then sent to fermentation. However, in some
embodiments in which the hemicellulose is not effectively
hydrolyzed during the pretreatment one or more additional steps are
included (e.g., a further hydrolysis step(s) and/or enzymatic
treatment step(s) and/or further alkali and/or acid treatment) to
produce fermentable sugars. In some embodiments, the separation is
carried out by washing the pretreated feedstock composition with an
aqueous solution to produce a wash stream and a solids stream
comprising the unhydrolyzed, pretreated feedstock. Alternatively,
the soluble component is separated from the solids by subjecting
the pretreated feedstock composition to a solids-liquid separation,
using any suitable method (e.g., centrifugation, microfiltration,
plate and frame filtration, cross-flow filtration, pressure
filtration, vacuum filtration, etc.). Optionally, in some
embodiments, a washing step is incorporated into the solids-liquids
separation. In some embodiments, the separated solids containing
cellulose then undergo enzymatic hydrolysis with cellulase enzymes
in order to convert the cellulose to glucose. In some embodiments,
the pretreated feedstock composition is fed into the fermentation
process without separation of the solids contained therein. In some
embodiments, the unhydrolyzed solids are subjected to enzymatic
hydrolysis with cellulase enzymes to convert the cellulose to
glucose after the fermentation process. In some embodiments, the
pretreated cellulosic feedstock is subjected to enzymatic
hydrolysis with cellulase enzymes.
[0116] As used herein, the term "recovered" refers to the
harvesting, isolating, collecting, or recovering of protein from a
cell and/or culture medium. In the context of saccharification, it
is used in reference to harvesting the fermentable sugars
produced during the saccharification reaction from the culture
medium and/or cells. In the context of fermentation, it is used in
reference to harvesting the fermentation product from the culture
medium and/or cells. Thus, a process can be said to comprise
"recovering" a product of a reaction (such as a soluble sugar
recovered from saccharification) if the process includes separating
the product from other components of a reaction mixture subsequent
to at least some of the product being generated in the
reaction.
[0117] As used herein, the term "slurry" refers to an aqueous
solution in which are dispersed one or more solid components, such
as a cellulosic substrate.
[0118] "Increasing" yield of a product (such as a fermentable
sugar) from a reaction occurs when a particular component present
during the reaction (such as a GH61 protein) causes more product to
be produced, compared with a reaction conducted under the same
conditions with the same substrate and other substituents, but in
the absence of the component of interest.
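As a rough illustration of the comparison described above, the following Python sketch computes the fold change in product yield between a reaction run with the component of interest and an otherwise identical control run without it. The function names and the example yields are hypothetical and are not taken from this disclosure.

```python
def fold_change_in_yield(yield_with_component, yield_without_component):
    """Ratio of product yield with the component to yield without it.

    Both yields are assumed to come from reactions run under the same
    conditions, with the same substrate, and in the same units.
    """
    if yield_without_component <= 0:
        raise ValueError("control yield must be positive")
    return yield_with_component / yield_without_component


def increases_yield(yield_with_component, yield_without_component):
    """True when the component causes more product to be produced."""
    return fold_change_in_yield(yield_with_component, yield_without_component) > 1.0


# Hypothetical glucose yields (g/L) with and without a GH61 protein.
print(fold_change_in_yield(30.0, 24.0))  # 1.25
print(increases_yield(30.0, 24.0))       # True
```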
[0119] "Hydrolyzing" cellulose or other polysaccharide occurs when
at least some of the glycosidic bonds between two monosaccharides
present in the substrate are hydrolyzed, thereby detaching from
each other the two monomers that were previously bonded.
[0120] A reaction is said to be "substantially free" of a
particular enzyme if the amount of that enzyme compared with other
enzymes that participate in catalyzing the reaction is less than
about 2%, about 1%, or about 0.1% (wt/wt).
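The threshold test in this definition can be sketched in Python as follows. This is a minimal illustration only: the function names and the example masses are hypothetical, the default threshold of 2% is one of the three values recited above, and the word "about" is not modeled as a numerical tolerance.

```python
def percent_of_other_enzymes(enzyme_mass, other_enzymes_mass):
    """Amount of one enzyme as a percentage (wt/wt) of the other enzymes."""
    if other_enzymes_mass <= 0:
        raise ValueError("mass of other enzymes must be positive")
    return 100.0 * enzyme_mass / other_enzymes_mass


def is_substantially_free(enzyme_mass, other_enzymes_mass, threshold_percent=2.0):
    """True when the enzyme is below the chosen wt/wt threshold."""
    return percent_of_other_enzymes(enzyme_mass, other_enzymes_mass) < threshold_percent


# Hypothetical case: 0.5 mg of the enzyme alongside 100 mg of other enzymes.
print(is_substantially_free(0.5, 100.0))                         # True (0.5% < 2%)
print(is_substantially_free(0.5, 100.0, threshold_percent=0.1))  # False
```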
[0121] "Fractionating" a liquid (e.g., a culture broth) means
applying a separation process (e.g., salt precipitation, column
chromatography, size exclusion, and filtration) or a combination of
such processes to provide a solution in which a desired protein
(e.g., GH61 protein, cellulase enzyme, or combination thereof)
comprises a greater percentage of total protein in the solution
than in the initial liquid product.
GH61 Variant Proteins with Improved Activity
[0122] GH61 variant proteins of the present invention have certain
amino acid substitutions in relation to wild-type GH61a protein. In
saccharification reactions, wild-type GH61a protein increases the
yield of fermentable sugars. Using an equivalent amount of a GH61
variant protein in place of the wild type increases the yield of
fermentable sugars still further. The present invention provides
numerous GH61 variants, as indicated herein. Substitutions that
have been shown to improve GH61 activity are included in Table 1,
below.
TABLE-US-00001 TABLE 1 GH61 Variants with Improved Activity
Var. No. | Amino Acid Changes | Silent Nucleotide Changes
1 | N35G/E104H/A168P (SEQ ID NO: 5) | t60c/c573g
2 | W42P/E104H/K167A | t60c/c573g/g1026a
3 | N35G/W42P/V97Q/A191N |
4 | W42P/E104H | c573g
5 | E104H/K167A | t60c/c291a/c573g
6 | W42P/A191N | t60c/c291a
7 | N35G/W42P/A191N | t60c/c291a
8 | H20D |
9 | V97Q/A191N |
10 | N35G/E104H/A191N | t60c/c876t
11 | E104H |
12 | E104Q |
13 | H20D/E104D/Q190H/Y192H |
14 | H20D/Q190E/Y192Q | a312g
15 | H20D/E104C |
16 | H20D/P103H/E104C |
17 | H20D/P103H | a312g
18 | N35G/E104H | t60c/c573g
19 | H20D/P103H/E104Q/Q190E |
20 | H20D/P103H/E104C/Y192Q |
21 | E104D | t60c
22 | N35G/W42P | t60c/c573g
23 | A137P |
24 | H20D/P103H/E104Q |
25 | P103E/E104D | t60c
26 | N35G/F68Y/A191N | t379a/c380g/g381c
27 | W42P/A168P |
28 | H20D/E104C/Q190E/Y192Q |
29 | A142W |
30 | N35G |
31 | H20C/Q190E |
32 | W42P/A212P/T236P |
33 | N35G/W42P/V97Q/K167A/A168P | t60c/c573g
34 | V97Q/A168P | c573g
35 | S232A |
36 | W42P/E104H/K167A/A168P/Q190E | c573g
37 | W42P/A168P/A212P/T236P |
38 | N35G/V97Q/K167A |
39 | N35G/V97Q |
40 | N35G/A191N |
41 | S127T/K167A/A191N |
42 | W42P |
43 | W42P/E104C/K167A/A168P | t60c/c291a/c573g
44 | K167Q |
45 | W131V |
46 | E176C |
47 | K167I/P273S | c300t
48 | W42P/T87P |
49 | W42P/A212P |
50 | K133H |
51 | D165N |
52 | D165A |
53 | A168D |
54 | K218T |
55 | P45T |
56 | Q44V |
57 | S164W |
58 | I177F |
59 | A191N |
60 | I134P |
61 | K133F |
62 | I134D |
63 | N35G/K167A | t60c/c291a/c573g
64 | I162R |
65 | N35G/K167A | t204c/t379a/c380g/g381c/c385t
66 | D165W/A246T |
67 | I162L |
68 | S164M |
69 | F132D/A244D |
70 | H181Q |
71 | I177G | g1026a
72 | L166W |
73 | I162F |
74 | I134V |
75 | E176Q |
76 | H181S |
77 | I178A |
78 | K167A |
79 | V172K |
80 | I177H |
81 | I134N |
82 | K133Y |
83 | N35G/Y139L |
84 | A168G |
85 | T12A/I162G | c246t
86 | D165E |
87 | D165M |
88 | I134M |
89 | A168P |
90 | I177D |
91 | S164P |
92 | H175T |
93 | N187K/S330R | c597g
94 | H175R |
95 | L166H |
96 | I178L |
97 | L173H |
98 | I177T |
99 | N170Y |
100 | H175S |
101 | K167T |
102 | L166R |
103 | V172Y |
104 | P163S/E176D |
105 | S164I |
106 | H175M |
107 | A168N |
108 | A179W |
109 | W131K/H175Q | g1026a
110 | Y171A |
111 | N170H |
112 | P163R |
113 | A168C |
114 | G169T |
115 | R174F |
116 | W131Y |
117 | I134L |
118 | I177V |
119 | K167E |
120 | H175C |
121 | W131I |
122 | W42P/A143P |
123 | I178G | c72t
124 | N170P |
125 | A179D/N317K | c732g/c843t/c882t/c909t/c912g
126 | I162V |
127 | I178M |
128 | V172A |
129 | K167A/A191N | t60c/c291a
130 | F132A |
131 | P163E |
132 | F132M |
133 | A179G |
134 | I177S |
135 | K167A | g921a
136 | K167F |
137 | A168I |
138 | A179N |
139 | I134A | c792t
140 | K167E | g972t
141 | R174K |
142 | S164F |
143 | V172L |
144 | A168H |
145 | I134T |
146 | K167H |
147 | L166A |
148 | S164R |
149 | R174C |
150 | A179P |
151 | G169R | g1026a
152 | L173M |
153 | D165K |
154 | E176S |
155 | F132L |
156 | F132I/A179I |
157 | F132P |
158 | S164Q |
159 | V172Q |
160 | W131D |
161 | W131Q |
162 | A179H |
163 | I134H/G270S |
164 | N170G |
165 | A168T |
166 | A179C |
167 | K133N |
168 | K167L |
169 | L180M |
170 | W131F |
171 | I134W | g1026a
172 | I178H |
173 | N170A |
174 | V172H |
175 | A168H/S205N |
176 | I134H | g921a
177 | S164C |
178 | S164K |
179 | I177C |
180 | I178Q |
181 | L180W |
182 | I177M |
183 | R174D |
184 | V172M |
185 | A179M |
186 | H175Y |
187 | I178P |
188 | L173A |
189 | N170E |
190 | N170F |
191 | N35G/A191N/T258I/T323P/G328A/C341R | t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t
192 | A168R |
193 | D165I |
194 | I162M |
195 | K167V |
196 | A179S |
197 | E176N |
198 | I134L/P322L |
199 | P163L |
200 | H181D |
201 | N170S |
202 | R174G |
203 | I177R |
204 | K167C |
205 | L166Q |
206 | P163I |
207 | S164L/L166I |
208 | Y171R |
209 | F132P/Q190E/A191T |
210 | F132Q |
211 | I134C |
212 | I177A |
213 | E176R |
214 | G169A |
215 | G169K |
216 | H181A |
217 | I177L |
218 | A168G |
219 | A179R |
220 | D165T |
221 | K167R |
222 | L166V |
223 | N170C |
224 | I178R |
225 | R174H |
226 | S164H |
227 | W131R/L166I |
228 | I162A/A191T |
229 | L173F |
230 | N170Q |
231 | I177P |
232 | R174N |
233 | V172K/S215W |
234 | D165R |
235 | G239D | c520a/c522g
236 | H175V |
237 | H181R |
238 | I134Y |
239 | V172F |
240 | V172G |
[0123] Positions that were changed in variants with improved GH61
activity listed in Table 1 include 20, 35, 42, 44, 45, 68, 87, 97,
103, 104, 127, 131, 132, 133, 134, 137, 139, 142, 143, 162, 163,
164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176,
177, 178, 179, 180, 181, 190, 191, 192, 205, 212, 215, 218, 232,
236, 239, 244, 246, 258, 270, 273, 317, 322, 323, 328, 330, and
341, wherein the amino acid positions are numbered with reference
to SEQ ID NO:2.
[0124] Residues that were changed in variants with improved GH61
activity listed in Table 1 include H20, N35, W42, Q44, P45, F68,
T87, V97, P103, E104, S127, W131, F132, K133, I134, A137, Y139,
A142, A143, I162, P163, S164, D165, L166, K167, A168, G169, N170,
Y171, V172, L173, R174, H175, E176, I177, I178, A179, L180, H181,
Q190, A191, Y192, S205, A212, S215, K218, S232, T236, G239, A244,
A246, T258, G270, P273, N317, P322, T323, G328, S330, and C341,
wherein the amino acid positions are numbered with reference to SEQ
ID NO:2.
[0125] Substitutions occurring in variants with improved GH61
activity listed in Table 1 include H20C/D, N35G, W42P, Q44V, P45T,
F68Y, T87P, V97Q, P103E/H, E104C/D/H/Q, S127T, W131X, F132X, K133X,
I134X, A137P, Y139L, A142W, A143P, I162X, P163X, S164X, D165X,
L166X, K167A/X, A168P/X, G169X, N170X, Y171A/R, V172X, L173X,
R174X, H175X, E176X, I177X, I178X, A179X, L180M/W, H181X, Q190E/H,
A191N/T, Y192H/Q, S205N, A212P, S215W, K218T, S232A, T236P, G239D,
A244D, A246T, T258I, G270S, P273S, N317K, P322L, T323P, G328A,
S330R, and C341R, wherein the amino acid positions are numbered
with reference to SEQ ID NO:2.
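The lists of positions, residues, and substitutions above follow mechanically from the slash-delimited mutation notation used in Table 1 (wild-type residue, position, replacement residue). The Python sketch below shows one way such lists could be derived from a set of variant entries. It is illustrative only: the function names are hypothetical, only three Table 1 entries are used as input, and annotations such as "(SEQ ID NO: 5)" are assumed to have been stripped beforehand.

```python
import re

# One substitution such as "N35G": wild-type residue, position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(mutation_set):
    """Split a mutation set like "N35G/E104H" into (wild_type, position, new) tuples."""
    parsed = []
    for token in mutation_set.split("/"):
        match = SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        wild_type, position, replacement = match.groups()
        parsed.append((wild_type, int(position), replacement))
    return parsed


def summarize(variants):
    """Return sorted positions, residues, and substitutions seen across variants."""
    positions, residues, substitutions = set(), set(), set()
    for mutation_set in variants:
        for wild_type, position, replacement in parse_variant(mutation_set):
            positions.add(position)
            residues.add((position, wild_type))
            substitutions.add((position, wild_type, replacement))
    return (
        sorted(positions),
        [f"{wt}{pos}" for pos, wt in sorted(residues)],
        [f"{wt}{pos}{new}" for pos, wt, new in sorted(substitutions)],
    )


# Three entries from Table 1 (Variants 1, 2, and 12).
example = ["N35G/E104H/A168P", "W42P/E104H/K167A", "E104Q"]
positions, residues, substitutions = summarize(example)
print(positions)      # [35, 42, 104, 167, 168]
print(residues)       # ['N35', 'W42', 'E104', 'K167', 'A168']
print(substitutions)  # ['N35G', 'W42P', 'E104H', 'E104Q', 'K167A', 'A168P']
```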
[0126] As shown herein, the changed residues and substitutions of
the GH61 variants of this invention may be combined in a manner
that produces an effect that is cumulative or synergistic.
Cumulative effects occur when adding an additional mutation
increases the effect beyond those of the mutations already present.
Synergistic effects occur when having two or more mutations in a
variant produces an effect that is more than the product of the
effects of the mutations when incorporated by themselves. This
invention includes
without limitation any and all combinations of any two, three,
four, five, six, seven, eight, nine, ten, or more than ten of the
mutations listed in this disclosure.
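The distinction between cumulative and synergistic effects can be made concrete with a short Python sketch. It assumes that each effect is expressed as a fold improvement over the wild type, and it approximates "the mutations already present" by the best single-mutation effect; both choices, the function name, and the example numbers are assumptions made for illustration and are not taken from this disclosure.

```python
from math import prod


def classify_combination(single_effects, combined_effect):
    """Classify a combined variant relative to its individual mutations.

    single_effects: fold improvement of each mutation on its own.
    combined_effect: fold improvement of the variant carrying all of them.
    """
    if len(single_effects) < 2:
        raise ValueError("need at least two individual mutations")
    if combined_effect > prod(single_effects):
        # More than the product of the individual effects.
        return "synergistic"
    if combined_effect > max(single_effects):
        # Better than any single mutation, but not above the product.
        return "cumulative"
    return "neither"


# Hypothetical fold improvements for two single mutations (product = 1.8).
singles = [1.2, 1.5]
print(classify_combination(singles, 2.0))  # synergistic
print(classify_combination(singles, 1.6))  # cumulative
print(classify_combination(singles, 1.4))  # neither
```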
[0127] Useful combinations include but are not limited to the
mutations and mutation sets: N35G/E104H/A168P (SEQ ID NO:5);
W42P/E104H/K167A; N35G/W42P/V97Q/A191N; W42P/E104H; E104H/K167A;
W42P/A191N; N35G/W42P/A191N; V97Q/A191N; N35G/E104H/A191N;
H20D/E104D/Q190H/Y192H; H20D/Q190E/Y192Q; H20D/E104C;
H20D/P103H/E104C; H20D/P103H; N35G/E104H; H20D/P103H/E104Q/Q190E;
H20D/P103H/E104C/Y192Q; N35G/W42P; H20D/P103H/E104Q; P103E/E104D;
N35G/F68Y/A191N; W42P/A168P; H20D/E104C/Q190E/Y192Q; H20C/Q190E;
W42P/A212P/T236P; N35G/W42P/V97Q/K167A/A168P; V97Q/A168P;
W42P/E104H/K167A/A168P/Q190E; W42P/A168P/A212P/T236P;
N35G/V97Q/K167A; N35G/V97Q; N35G/A191N; S127T/K167A/A191N;
W42P/E104C/K167A/A168P; K167I/P273S; W42P/T87P; W42P/A212P;
N35G/K167A; D165W/A246T; F132D/A244D; N35G/Y139L;
T12A/I162G; N187K/S330R; P163S/E176D; W131K/H175Q; W42P/A143P;
A179D/N317K; K167A/A191N; F132I/A179I; I134H/G270S; A168H/S205N;
N35G/A191N/T258I/T323P/G328A/C341R; I134L/P322L; S164L/L166I;
F132P/Q190E/A191T; W131R/L166I; I162A/A191T; and V172K/S215W,
wherein the amino acid positions are numbered with reference to SEQ
ID NO:2.
GH61 Variant Proteins Made with Multiple Rounds of Activity
Enhancement
[0128] GH61 variant proteins can be generated that have been
further optimized by subjecting them to multiple rounds of variation and
selection. In some embodiments, additional rounds of optimization
increase saccharification reaction yields beyond what is achieved
with one round of variation and selection. Substitutions improving
GH61 activity are compiled in Table 2 below.
[0129] Table 2 shows GH61a variants derived from the GH61a protein
designated "Variant 1" (SEQ ID NO:5) in Table 1 with improved
thermoactivity. The second-round variants usually retained the
alterations of Variant 1 compared with wild-type GH61a
(N35G/E104H/A168P), along with additional modifications.
TABLE-US-00002 TABLE 2 GH61 Variants with Improved Activity Compared to Variant 1
Var. No. | Amino Acid Changes | Silent Nucleotide Changes
241 | N35G/T40A/E104H/A168P/P327M | t60c/c573g
242 | N35G/P45D/E104H/A168P/N317R | t60c/c573g
243 | N35G/E104H/A168P/N317R | t60c/c573g
244 | N35G/E104H/A168P/N317L | t60c/c573g
245 | N35G/T54H/E104H/A168P | t60c/c573g
246 | N35G/E104H/A168P/N317D/S329Y | t60c/c573g
247 | N35G/E104H/A137S/A168P/S232E | t60c/c573g
248 | N35G/E104H/A168P/N317R/T320A | t60c/c573g
249 | N35G/E104H/A168P/D234E | t60c/c573g
250 | N35G/T40S/E104H/A142G/A168P | t60c/c573g
251 | N35G/T40S/S78C/V88I/E104H/S128K/A168P/D234M | t60c/c573g
252 | N35G/E104H/A168P/S330V | t60c/c573g
253 | N35G/E104H/A168P/G203E/P266S | t60c/c573g
254 | N35G/E104H/A168P/D234N | t60c/c573g
255 | N35G/E104H/A168P/S286N/S329H | t60c/c573g
256 | N35G/E104H/A168P/S330H | t60c/c573g
257 | N35G/E104H/A168P/W337R | t60c/c573g
258 | N35G/N66D/E104H/S164E/A168P/G267T | t60c/c573g
259 | N35G/E104H/A168P/P233V | t60c/c573g
260 | R34E/N35G/E104H/R145T/A168P | t60c/c573g
261 | S24Q/N35G/E104H/A168P/V237I | t60c/c573g
262 | Y32S/N35G/E64S/E104H/A168P | t60c/c573g
263 | N35G/E104H/A168P/V333R | t60c/c573g
264 | N35G/E104H/G144S/A168P/V333Q | t60c/c573g
265 | V28H/N35G/P45K/E104H/A168P | t60c/c573g
266 | N35G/E104H/A168P/P327K | t60c/c573g
267 | N35G/N66Q/E104H/A168P | t60c/c573g
268 | N35G/E104H/A168P/G203E | t60c/c573g
269 | N35G/E104H/A168P/S339W | t60c/c573g
270 | N35G/P45K/N46E/E104H/A150Y/A168P | t60c/c573g
271 | N35G/E104H/R130S/A168P | t60c/c573g
272 | N35G/E104H/R145T/A168P | t60c/c573g/g891a
273 | N35G/E104H/A168P/S231K | t60c/c573g
274 | N35G/T40A/E104H/A168P/D234E/P327M | t60c/c573g
275 | N35G/E104H/A168P/S231H | t60c/c573g
276 | N35G/E104H/A168P/N317M | t60c/c573g
277 | N35G/E104H/A168P/S330Y | t60c/c573g
278 | N35G/E104H/A168P/S329I | t60c/c573g
279 | N35G/E104H/A168P/S329R | t60c/c573g
280 | N35G/N66D/E104H/A168P/P322R/S329L | t60c/c573g
281 | N35G/E104H/A168P/P327F | t60c/c288t/c573g
282 | N35G/P45D/E104H/A168P | t60c/c573g
283 | N35G/E104H/A168P/S332R | t60c/c573g
284 | N35G/E104H/A116S/A168P | t60c/c573g
285 | N35G/T40A/E104H/A168P/V230I/P327M | t60c/c573g
286 | N35G/T49A/E104H/A168P | t60c/c573g
287 | N35G/E104H/A168P/N317T | t60c/c573g
288 | N35G/N46Y/E104H/A168P | t60c/c573g
289 | N35G/E104H/A168P/G203V | t60c/c573g
290 | N35G/E104H/A168P/S329L | t60c/c573g
291 | N35G/E104H/R145N/A168P/S329H | t60c/c573g
292 | N35G/A56S/E104H/A168P | t60c/c573g
293 | N35G/T40S/T49R/E104H/A168P/D234E/P327M | t60c/c573g
294 | N35G/E104H/Q161R/A168P | t60c/c573g
295 | N35G/E104H/A168P/S332F | t60c/c573g
296 | N35G/P45R/T49A/E104H/A168P/N317R/T320A | t60c/c573g
297 | N35G/E104H/A168P/V237I | t60c/c573g
298 | N35G/Q44K/T80V/E104H/A168P | t60c/c573g
299 | N35G/E104H/A168P/E336S | t60c/c573g
300 | N35G/E104H/A168P/P233T | t60c/c573g
301 | N35G/E104H/A168P/S329Y | t60c/c573g
302 | N35G/E104H/A168P/P327L | t60c/c573g
303 | N35G/E104H/A168P/N317I | t60c/c573g
304 | N35G/E104H/R130H/A168P | t60c/c573g
305 | N35G/Q44K/E104H/A168P | t60c/c573g
306 | N35G/N66D/E104H/A168P | t60c/c573g
307 | N35G/E104H/A168P/S329V | t60c/c573g
308 | N35G/E104H/A168P/W337F | t60c/c573g
309 | N35G/E104H/A168P/N317H | t60c/c573g
310 | N35G/T40L/E104H/S128K/A168P | t60c/c573g
311 | N35G/E104H/A168P/A326V | t60c/c573g
312 | N35G/T80V/E104H/A168P/P303T | t60c/c573g
313 | N35G/E104H/A168P/S231A/S295L | t60c/c573g
314 | N35G/E104H/A116Q/A168P | t60c/c573g
315 | N35G/E104H/A168P/S330C | t60c/c573g
316 | N35G/T40S/E101T/E104H/A168P/P327M | t60c/c573g
317 | N35G/E104H/A168P/A326Q | t60c/c573g
318 | N35G/N46R/E104H/A168P | t60c/c573g
319 | N35G/P45K/E104H/A168P/A219R/S232E | t60c/c573g
320 | S24Q/N35G/E104H/A168P/V237I/P303T | t60c/c573g
321 | N35G/E104H/A168P/G203E/T281A | t60c/c573g
322 | N35G/A56N/E104H/A168P | t60c/c573g
323 | N35G/E104H/A168P/E336G | t60c/c573g
324 | N35G/E104H/A168P/E336R | t60c/c573g
325 | N35G/T40S/E104H/S128K/A142G/A168P | t60c/c573g
326 | N35G/Q44K/S67T/E104H/A168P | t60c/c198t/c573g
327 | N35G/E104H/A168P/N317A | t60c/c573g
328 | N35G/E104H/G155N/A168P | t60c/c573g
329 | N35G/E104H/Q161E/A168P | t60c/c573g
330 | N35G/E104H/N118S/A168P | t60c/c573g
331 | N35G/P45T/V97Q/E104H/A168P/G267S | t60c/c573g
332 | V28H/N35G/E104H/A168P | t60c/c573g
333 | N35G/E104H/A168P/Q184L | t60c/c573g
334 | N35G/E104H/A168P/N317V | t60c/c573g
335 | N35G/Q44L/E104H/A168P | t60c/c573g
336 | N35G/E104H/A168P/S330G | t60c/c573g
337 | N35G/E104H/A168P/T320A/V333W | t60c/c573g
338 | N35G/E104H/A168P/E336A | t60c/c573g
339 | N35G/E104H/A168P/N335S | t60c/c573g
340 | N35G/N66M/E104H/A168P | t60c/c573g
341 | N35G/T54G/E104H/A168P | t60c/c573g
342 | N35G/E104H/A168P/N317S | t60c/c573g
343 | N35G/E64L/E104H/A168P | t60c/c573g
344 | N35G/E104H/S164E/A168P/A271T | t60c/c573g
345 | N35G/N66A/E104H/A168P | t60c/c573g
346 | N35G/G83R/E104H/A168P | t60c/c573g
347 | N35G/E104H/A168P/N317Q/T320A | t60c/c573g
348 | N35G/E104H/K141A/A168P | t60c/c573g
349 | N35G/P71T/E104H/A168P | t60c/c573g
350 | N35G/P71S/E104H/A168P | t60c/c573g
351 | N35G/E104H/R130G/A168P | t60c/c573g
352 | N35G/E104H/R145Q/A168P | t60c/c573g
353 | N35G/T70A/E104H/A168P | t60c/c573g
354 | N35G/E104H/A168P/K218R | t60c/c573g
355 | N35G/E104H/A168P/Q184E | t60c/c573g
356 | N35G/E104H/R130K/A168P | t60c/c573g
357 | N35G/Q58H/E104H/A168P | t60c/c573g
358 | Y32S/N35G/E104H/A168P | t60c/c573g
359 | N35G/E104H/A168P/S329T | t60c/c573g
360 | N35G/E104H/A168P/S330I | t60c/c573g
361 | Y32S/N35G/P71A/E104H/A168P | t60c/c573g
362 | N35G/E104H/A168P/S330T | t60c/c573g
363 | N35G/G82A/E104H/A168P | t60c/c573g
364 | N35G/T80V/E104H/A168P | t60c/c573g
365 | N35G/E104H/A168P/S295T | t60c/c573g
366 | N35G/N66G/E104H/A168P | t60c/c573g
367 | N35G/E104H/R145L/A168P | t60c/c573g
368 | N35G/S67H/E104H/A168P/V230M | t60c/c573g
369 | N35G/E104H/G136E/A168P | t60c/c573g
370 | N35G/T54S/E104H/A168P | t60c/c573g
371 | N35G/P45S/E104H/A168P | t60c/c573g
372 | N35G/E104H/A168P/A326M | t60c/c573g/c882t
373 | N35G/N66D/N95E/E104H/S164E/A168P/G267D | t60c/c573g
374 | N35G/E104H/A168P/S332C | t60c/c573g
375 | N35G/E104H/S128L/A168P | t60c/c573g
376 | N35G/T54W/E104H/A168P | t60c/c573g
377 | N35G/E104H/A168P/G268A/G269A/G270A | t60c/c573g
378 | N35G/Q44K/E104H/A168P/S231T | t60c/c573g
379 | R34E/N35G/E104H/A168P/A280D | t60c/c573g
380 | N35G/E104H/A168P/A297T | t60c/g399a/c573g
381 | N35G/E104H/K141P/R145Q/A168P | t60c/c573g
382 | N35G/P45E/E104H/K141R/A168P | t60c/c573g
383 | N35G/N66T/E104H/A168P | t60c/c573g
384 | N35G/E104H/S164E/A168P/S295D | t60c/c573g
385 | N35G/E104H/A168P/N317F | t60c/c573g
386 | N35G/E104H/A168P/N317Q | t60c/c573g
387 | N35G/T40G/T49R/S78C/E104H/A142G/A168P | t60c/c573g
388 | N35G/G82S/E104H/A168P | t60c/c573g
389 | N35G/Q58P/E104H/A168P | t60c/c573g
390 | N35G/N46R/E104H/A168P/G203E/A263V | t60c/c573g
391 | N35G/P45R/E104H/A168P | t60c/c573g
392 | N35G/S67G/E104H/A168P | t60c/c573g
393 | N35G/E104H/A168P/R199E | t60c/c573g
394 | N35G/G69T/E104H/A168P | t60c/c573g
395 | N35G/E104H/A168P/G203E/G268A/G269A/G270A | t60c/c573g
396 | N35G/E104H/A168P/P266S | t60c/c573g
397 | N35G/E104H/A168P/V324M | t60c/c573g
398 | N35G/E104H/A168P/G245A | t60c/c573g
399 | N35G/N66R/E104H/A168P | t60c/c573g
400 | N35G/E104H/A168P/T236E | t60c/c573g
401 | S24Q/N35G/Q44K/T80H/E104H/A168P | t60c/c573g
402 | N35G/E104H/S128D/A168P | t60c/c573g
403 | N35G/N66D/S78D/E104H/A168P/S253D | t60c/c573g
404 | N35G/E104H/R130Y/A168P | t60c/c573g
405 | N35G/E104H/A168P/K310I | t60c/c573g
406 | N35G/E104H/R145E/A168P | t60c/c573g
407 | N35G/N66D/E104H/S164E/A168P/S282D | t60c/c573g
408 | N35G/E104H/K141P/A168P | t60c/c573g
409 | N35G/E104H/A168P/Q184R | t60c/c573g
410 | N35G/E104H/A168P/S231T | t60c/c573g
411 | N35G/N66V/E104H/A168P | t60c/c573g
412 | N35G/E104H/A142L/A168P | t60c/c573g
413 | N35G/E104H/R145H/A168P | t60c/c573g
414 | N35G/E104H/A168P/K218L | t60c/c573g
415 | N35G/E104H/K141T/A168P | t60c/c573g
416 | N35G/E104H/A168P/P233F | t60c/c573g
417 | N35G/T40S/E104H/A168P/P327M | t60c/c573g
418 | N35G/T54M/E104H/A168P | t60c/c573g
419 | S24T/N35G/E104H/S164E/A168P | t60c/c573g
420 | N35G/P45T/E104H/A168P | t60c/c573g
421 | N35G/N66D/E104H/S164E/A168P/S231T/S253T | t60c/c573g
422 | N35G/G69H/E104H/A168P | t60c/c573g
423 | N35G/E104H/S128Y/A168P | t60c/c573g
424 | N35G/T49Q/E104H/A168P | t60c/c573g
425 | N35G/T49A/E104H/A168P/Q184H | t60c/c573g
426 | N35G/E104H/A168P/G203Y | t60c/c573g
427 | N35G/Q44K/N66V/E104H/A168P | t60c/c573g
428 | N35G/E104H/A137M/A168P | t60c/c573g
429 | N35G/E104H/A168P/P327C | t60c/c573g
430 | N35G/E104H/A168P/T236R | t60c/c573g
431 | N35G/I51A/E104H/A168P | t60c/c573g
432 | N35G/S67H/E104H/A168P | t60c/c573g
433 | N35G/E104H/A168P/A326C | t60c/c573g
434 | N35G/T49A/E104H/S128N/A168P | t60c/c573g
435 | N35G/T49R/E104H/A168P/K218L/N317Q | t60c/c573g
436 | N35G/E104H/A168P/P266S/G267V | t60c/c573g
437 | N35G/E104H/A168P/V237I/P303T | t60c/c573g
438 | N35G/T49E/E104H/A168P | t60c/c573g
439 | N35G/P45R/E104H/A168P/T320A | t60c/c573g
440 | N35G/N66L/E104H/A168P | t60c/c573g
441 | N35G/P45R/E104H/A168P/K218L/N317Q | t60c/c573g
442 | N35G/E104H/R145V/A168P | t60c/c573g
443 | N35G/N66D/E104H/A168P/R290K | t60c/c573g
444 | N35G/T80L/E104H/A168P | t60c/c573g
445 | N35G/A55G/E104H/A168P | t60c/c573g
446 | N35G/E104H/A168P/S330A | t60c/c573g
447 | N35G/E104H/K141N/A168P/P266S | t60c/c573g
448 | N35G/E104H/A142S/A168P | t60c/c573g
449 | N35G/E104H/A168P/Q184G | t60c/c573g
450 | N35G/E104H/N118E/A168P | t60c/c573g
451 | N35G/E104H/A168P/A212M | t60c/c573g
452 | N35G/E104H/A168P/G267D | t60c/c573g
453 | N35G/K93N/E104H/R130Y/A168P | t60c/c573g
454 | N35G/P45R/T49Y/E104H/A168P/N317D | t60c/c573g
455 | N35G/E104H/A168P/S329Q | t60c/c573g
456 | N35G/E104H/A168P/V230Q | t60c/c573g
457 | N35G/P45K/E104H/A168P/A219R | t60c/c573g
458 | N35G/E104H/A142G/A168P | t60c/c573g
459 | N35G/E104H/A168P/S205T | t60c/c573g
460 | N35G/S78D/E104H/S164E/A168P | t60c/c573g
461 | N35G/E104H/R130E/A168P | t60c/c573g
462 | N35G/E104H/A168P/Q184H | t60c/c573g
463 | N35G/E104H/A116P/A168P | t60c/c573g
464 | N35G/E104H/A142D/A168P | t60c/c573g
465 | V28H/N35G/N46E/Q58H/E104H/A168P | t60c/c573g
466 | N35G/E104H/A168P/A280T | t60c/c573g
467 | R34E/N35G/E104H/A168P/A280T | t60c/c573g
468 | N35G/E104H/A168P/E336L | t60c/c573g
469 | N35G/T49D/E104H/A168P | t60c/c573g
470 | N35G/E104H/A168P/A219T | t60c/c573g
471 | N35G/E104H/A142W/A168P | t60c/c573g
472 | N35G/E104H/A168P/P303T/G305D | t60c/c573g
473 | N35G/Q44V/E104H/A168P | t60c/c573g
474 | N35G/E104H/A168P/N187D | t60c/c573g
475 | N35G/E104H/G136H/A168P | t60c/c573g
476 | S24Q/N35G/Q44K/E104H/A168P/P303T/S332D | t60c/c573g
477 | N35G/E104H/A168P/Q184N | t60c/c573g
478 | N35G/E104H/A168P/S332L | t60c/c573g
479 | S24T/N35G/N66D/S78D/E104H/A168P/S205T/S253T | t60c/c573g
480 | N35G/E104H/A168P/P327A | t60c/c573g
481 | N35G/T40A/T49Q/S78C/E104H/A168P | t60c/c573g
482 | N35G/T40L/E104H/A142G/A168P | t60c/c573g
483 | N35G/T49Y/E104H/A168P/N317R | t60c/c573g
484 | R34E/N35G/K93T/E104H/R130E/R145T/A168P/R199E/K218T/A280D | t60c/c573g
[0130] Positions that were changed in variants with improved GH61
activity listed in Table 2 include 24, 28, 32, 34, 35, 40, 44, 45,
46, 49, 51, 54, 55, 56, 58, 64, 66, 67, 69, 70, 71, 78, 80, 82, 83,
88, 93, 95, 101, 104, 116, 118, 128, 130, 136, 137, 141, 142, 144,
145, 150, 155, 161, 164, 168, 184, 187, 199, 203, 205, 212, 218,
219, 230, 231, 232, 233, 234, 236, 237, 245, 253, 263, 266, 267,
268, 269, 270, 271, 280, 281, 282, 290, 295, 297, 303, 305, 310,
317, 320, 324, 326, 327, 329, 330, 332, 333, 336, 337, and 339,
wherein the amino acid positions are numbered with reference to SEQ
ID NO:2.
[0131] Residues that were changed in variants with improved GH61
activity listed in Table 2 include S24, V28, Y32, R34, N35, T40,
Q44, P45, N46, T49, I51, T54, A55, A56, Q58, E64, N66, S67, G69,
T70, P71, S78, T80, G82, G83, V88, K93, N95, E101, E104, A116,
N118, S128, R130, G136, A137, K141, A142, G144, R145, A150, G155,
Q161, S164, A168, Q184, N187, R199, G203, S205, A212, K218, A219,
V230, S231, S232, P233, D234, T236, V237, G245, S253, A263, P266,
G267, G268, G269, G270, A271, A280, T281, S282, R290, S295, A297,
P303, G305, K310, N317, T320, V324, A326, P327, S329, S330, S332,
V333, E336, W337, and S339, wherein the amino acid positions are
numbered with reference to SEQ ID NO:2.
[0132] Substitutions occurring in variants with improved GH61
activity listed in Table 2 include S24Q, V28H, Y32S, R34E, N35G,
T40A/G/L/S, Q44K, P45D/E/K/R/S, N46E/R, T49A/Q/R/Y, I51A,
T54G/M/S/W, A55G, A56S, Q58H/P, E64L/S, N66A/D/G/L/M/Q/R/V,
S67G/H/T, G69T, T70A, P71A, S78C/D, T80H/L/V, G82A/S, G83R, V88I,
K93N/T, N95E, E101T, E104H, A116Q/S, N118E/S, S128K/L/N,
R130E/G/H/K/Y, G136H, A137M/S, K141A/N/P/R, A142D/G/L, G144S,
R145H/L/N/Q/T, A150Y, G155N, Q161E/R, S164E, A168P, Q184E/H/L/N/R,
N187D, R199E, G203E/V/Y, S205T, A212M, K218L/T, A219R/T, V230I/Q,
S231A/H/K/I, S232E, P233F/T, D234E/M/N, T236E, V237I, G245A,
S253D/T, A263V, P266S, G267D/V, G268A, G269A, G270A, A271T,
A280D/T, T281A, S282D, R290K, S295D/L/T, A297T, P303T, G305D,
K310I, N317D/H/I/M/Q/R, T320A, V324M, A326C/Q/V, P327F/K/L/M,
S329H/I/Q/T/Y, S330A/H/I/T/V, S332C/F/R, V333Q, E336L/R/S, W337R,
and S339W.
[0133] In some embodiments, the changed residues and substitutions
of the GH61 variants of this invention may be combined in a manner
that produces an effect that is cumulative or synergistic.
Cumulative effects occur when adding an additional mutation
increases the effect beyond those of the mutations already present.
Synergistic effects occur when having two or more mutations in a
variant produces an effect that is greater than the product of the
effects of the mutations when incorporated by themselves. This
invention includes
without limitation any and all combinations of any two, three,
four, five, six, seven, eight, nine, ten, or more than ten of the
mutations listed in Table 1, Table 2, or both Tables.
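The distinction drawn above between cumulative and synergistic effects can be sketched as a simple comparison of fold-improvement values. The following Python sketch is illustrative only. It assumes that each effect is expressed as a fold improvement relative to the parent (parent = 1.0), and it approximates "cumulative" as a combined effect exceeding the best single-mutation effect; both the function name and this approximation are assumptions, not definitions taken from the disclosure.

```python
from math import prod


def classify_combination(single_effects, combined_effect):
    """Classify a combined effect relative to the single-mutation effects.

    single_effects: fold improvement of each mutation on its own.
    combined_effect: fold improvement of the variant carrying all of them.
    Assumption: 'cumulative' is approximated as exceeding the best single
    mutation; 'synergistic' follows the product criterion stated above.
    """
    if not single_effects:
        raise ValueError("at least one single-mutation effect is required")
    if combined_effect > prod(single_effects):
        return "synergistic"
    if combined_effect > max(single_effects):
        return "cumulative"
    return "neither"


if __name__ == "__main__":
    # Hypothetical values, not measured data.
    print(classify_combination([1.5, 2.0], 3.5))
    print(classify_combination([1.5, 2.0], 2.5))
```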
[0134] Useful combinations of mutated positions include but are not
limited to N35/T40/E104/A168/P327; N35/P45/E104/A168/N317;
N35/E104/A168/N317; N35/E104/A168/N317/S329;
N35/E104/A137/A168/S232; N35/E104/A168/N317/T320;
N35/E104/A168/D234; N35/T40/E104/A142/A168; N35/E104/R145/A168;
N35/T40/S78/V88/E104/S128/A168/D234; N35/E104/A168/S330;
N35/E104/A168/G203/P266; N35/E104/A168/D234; N35/E104/A168/S330;
N35/E104/A168/W337; R34/N35/E104/R145/A168; Y32/N35/E64/E104/A168;
V28/N35/P45/E104/A168; N35/E104/G144/A168/V333; N35/N66/E104/A168;
N35/E104/A168/P327; N35/E104/A168/G203; N35/E104/A168/S339;
N35/P45/N46/E104/A150/A168; N35/E104/A168/S231;
N35/T40/E104/A168/D234/P327; N35/E104/A168/S231;
N35/E104/A168/N317; N35/E104/A168/S330; N35/E104/A168/S329;
N35/E104/A168/P327; N35/P45/E104/A168; N35/E104/A116/A168;
N35/T40/E104/A168/V230/P327; and N35/E104/A168/S332.
[0135] Useful combinations of mutated residues further include but
are not limited to N35/E104/A168/G203; N35/E104/R145/A168/S329;
N35/T40/T49/E104/A168/D234/P327; N35/A56/E104/A168;
N35/E104/Q161/A168; N35/E104/A168/S332;
N35/P45/T49/E104/A168/N317/T320; N35/E104/A168/V237;
N35/E104/A168/E336; N35/E104/A168/P233; N35/E104/R130/A168;
N35/E104/A168/P327; N35/E104/A168/N317; N35/Q44/E104/A168;
N35/E104/A168/A326; N35/E104/A168/N317; N35/T40/E104/S128/A168;
N35/T80/E104/A168/P303; N35/E104/A116/A168;
N35/E104/A168/S231/S295; N35/T40/E101/E104/A168/P327;
N35/P45/E104/A168/A219/S232; N35/N46/E104/A168; N35/E104/A168/A326;
N35/E104/A168/G203/T281; N35/E104/A168/E336;
N35/T40/E104/S128/A142/A168; N35/E104/N118/A168;
N35/E104/G155/A168; S24/N35/E104/A168/V237/P303;
N35/E104/Q161/A168; N35/Q44/S67/E104/A168; V28/N35/E104/A168;
N35/E104/A168/Q184; N35/T54/E104/A168; N35/N66/E104/A168;
N35/E64/E104/A168; N35/E104/S164/A168/A271; N35/N66/E104/A168;
N35/G83/E104/A168; N35/E104/K141/A168; and
N35/E104/A168/N317/T320.
[0136] Useful combinations of mutated residues include but are not
limited to N35/E104/R130/A168; N35/E104/R145/A168;
N35/T70/E104/A168; N35/E104/R130/A168; N35/E104/A168/Q184;
N35/E104/A168/S329; N35/T49/E104/A168; Y32/N35/E104/A168;
N35/E104/A168/S330; N35/Q58/E104/A168; Y32/N35/P71/E104/A168;
N35/E104/A168/S330; N35/T80/E104/A168; N35/G82/E104/A168;
N35/E104/A168/S295; N35/N66/E104/A168; N35/T54/E104/A168;
N35/P45/E104/A168; N35/E104/S128/A168;
N35/N66/N95/E104/S164/A168/G267; N35/T54/E104/A168;
N35/P45/E104/K141/A168;
N35/E104/A168/S332; N35/E104/A168/A297; N35/E104/K141/R145/A168;
N35/Q44/E104/A168/S231; N35/T40/T49/S78/E104/A142/A168;
N35/E104/S164/A168/S295; N35/E104/A168/N317; N35/P45/E104/A168;
N35/G82/E104/A168; N35/N46/E104/A168/G203/A263; N35/Q58/E104/A168;
N35/G69/E104/A168; N35/S67/E104/A168; N35/E104/A168/R199;
N35/E104/A168/G203/G268/G269/G270; N35/E104/A168/V324;
N35/E104/A168/P266; N35/E104/A168/G245; N35/N66/E104/A168; and
S24/N35/Q44/T80/E104/A168.
[0137] Useful combinations of mutated residues further include but
are not limited to N35/E104/A168/T236; N35/E104/A168/K310;
N35/E104/R130/A168; N35/N66/S78/E104/A168/S253;
N35/N66/E104/S164/A168/S282; N35/E104/A142/A168;
N35/E104/R145/A168; N35/E104/A168/S231; N35/E104/A168/Q184;
N35/E104/A168/K218; N35/E104/A168/P233; N35/T49/E104/A168/Q184;
N35/T40/E104/A168/P327; N35/T54/E104/A168;
N35/N66/E104/S164/A168/S231/S253; N35/E104/A168/G203;
N35/T49/E104/A168; N35/E104/A168/P266/G267; N35/Q44/N66/E104/A168;
N35/S67/E104/A168; N35/E104/A137/A168; N35/T49/E104/S128/A168;
N35/T49/E104/A168/K218/N317; N35/I51/E104/A168; N35/E104/A168/A326;
N35/P45/E104/A168/T320; N35/N66/E104/A168; N35/E104/A168/V237/P303;
N35/P45/E104/A168/K218/N317; N35/T80/E104/A168; N35/A55/E104/A168;
N35/E104/K141/A168/P266; N35/E104/A168/S330;
N35/N66/E104/A168/R290; N35/E104/N118/A168; N35/E104/A168/A212;
N35/K93/E104/R130/A168; N35/E104/A168/G267;
N35/P45/T49/E104/A168/N317; N35/E104/A168/V230; N35/E104/A168/S329;
N35/P45/E104/A168/A219; N35/S78/E104/S164/A168; N35/E104/A168/S205;
N35/E104/A168/Q184; V28/N35/N46/Q58/E104/A168; N35/E104/A142/A168;
N35/E104/A168/E336; N35/E104/A168/A280; N35/E104/A168/A219;
N35/E104/A168/P303/G305; R34/N35/E104/A168/A280;
N35/E104/A168/N187; N35/E104/G136/A168; N35/E104/A168/Q184;
N35/T49/E104/A168/N317; N35/T40/T49/S78/E104/A168;
R34/N35/K93/E104/R130/R145/A168/R199/K218/A280;
N35/T40/E104/A142/A168; and N35/N66/E104/A168.
[0138] Useful combinations of mutations further include but are not
limited to N35G/T40A/E104H/A168P/P327M;
N35G/P45D/E104H/A168P/N317R; N35G/E104H/A168P/N317R;
N35G/E104H/A168P/N317D/S329Y; N35G/E104H/A137S/A168P/S232E;
N35G/E104H/A168P/N317R/T320A; N35G/E104H/A168P/D234E;
N35G/T40S/E104H/A142G/A168P; N35G/E104H/R145L/A168P;
N35G/T40S/S78C/V88I/E104H/S128K/A168P/D234M;
N35G/E104H/A168P/S330V; N35G/E104H/A168P/G203E/P266S;
N35G/E104H/A168P/D234N; N35G/E104H/A168P/S330H;
N35G/E104H/A168P/W337R; R34E/N35G/E104H/R145T/A168P;
Y32S/N35G/E64S/E104H/A168P; V28H/N35G/P45K/E104H/A168P;
N35G/E104H/G144S/A168P/V333Q; N35G/N66Q/E104H/A168P;
N35G/E104H/A168P/P327K; N35G/E104H/A168P/G203E;
N35G/E104H/A168P/S339W; N35G/P45K/N46E/E104H/A150Y/A168P;
N35G/E104H/A168P/S231K; N35G/T40A/E104H/A168P/D234E/P327M;
N35G/E104H/A168P/S231H; N35G/E104H/A168P/N317M;
N35G/E104H/A168P/S330Y; N35G/E104H/A168P/S329I;
N35G/E104H/A168P/P327F; N35G/P45D/E104H/A168P;
N35G/E104H/A116S/A168P; N35G/T40A/E104H/A168P/V230I/P327M; and
N35G/E104H/A168P/S332R.
[0139] Useful combinations of mutations further include but are not
limited to N35G/E104H/A168P/G203V; N35G/E104H/R145N/A168P/S329H;
N35G/T40S/T49R/E104H/A168P/D234E/P327M; N35G/A56S/E104H/A168P;
N35G/E104H/Q161R/A168P; N35G/E104H/A168P/S332F;
N35G/P45R/T49A/E104H/A168P/N317R/T320A; N35G/E104H/A168P/V237I;
N35G/E104H/A168P/E336S; N35G/E104H/A168P/P233T;
N35G/E104H/R130H/A168P; N35G/E104H/A168P/P327L;
N35G/E104H/A168P/N317I; N35G/Q44K/E104H/A168P;
N35G/E104H/A168P/A326V; N35G/E104H/A168P/N317H;
N35G/T40L/E104H/S128K/A168P; N35G/T80V/E104H/A168P/P303T;
N35G/E104H/A116Q/A168P; N35G/E104H/A168P/S231A/S295L;
N35G/T40S/E101T/E104H/A168P/P327M;
N35G/P45K/E104H/A168P/A219R/S232E; N35G/N46R/E104H/A168P;
N35G/E104H/A168P/A326Q; N35G/E104H/A168P/G203E/T281A;
N35G/E104H/A168P/E336R; N35G/T40S/E104H/S128K/A142G/A168P;
N35G/E104H/N118S/A168P; N35G/E104H/G155N/A168P;
S24Q/N35G/E104H/A168P/V237I/P303T; N35G/E104H/Q161E/A168P;
N35G/Q44K/S67T/E104H/A168P; V28H/N35G/E104H/A168P;
N35G/E104H/A168P/Q184L; N35G/T54G/E104H/A168P;
N35G/N66M/E104H/A168P; N35G/E64L/E104H/A168P;
N35G/E104H/S164E/A168P/A271T; N35G/N66A/E104H/A168P;
N35G/G83R/E104H/A168P; N35G/E104H/K141A/A168P; and
N35G/E104H/A168P/N317Q/T320A.
[0140] Useful combinations of mutations further include but are not
limited to N35G/E104H/R130G/A168P; N35G/E104H/R145Q/A168P;
N35G/T70A/E104H/A168P; N35G/E104H/R130K/A168P;
N35G/E104H/A168P/Q184E; N35G/E104H/A168P/S329T;
N35G/T49A/E104H/A168P; Y32S/N35G/E104H/A168P;
N35G/E104H/A168P/S330I; N35G/Q58H/E104H/A168P;
Y32S/N35G/P71A/E104H/A168P; N35G/E104H/A168P/S330T;
N35G/T80V/E104H/A168P; N35G/G82A/E104H/A168P;
N35G/E104H/A168P/S295T; N35G/N66G/E104H/A168P;
N35G/T54S/E104H/A168P; N35G/P45S/E104H/A168P;
N35G/E104H/S128L/A168P; N35G/N66D/N95E/E104H/S164E/A168P/G267D;
N35G/T54W/E104H/A168P; N35G/P45E/E104H/K141R/A168P;
N35G/E104H/A168P/S332C; N35G/E104H/A168P/A297T;
N35G/E104H/K141P/R145Q/A168P; N35G/Q44K/E104H/A168P/S231T;
N35G/T40G/T49R/S78C/E104H/A142G/A168P;
N35G/E104H/S164E/A168P/S295D; N35G/E104H/A168P/N317Q;
N35G/P45R/E104H/A168P; N35G/G82S/E104H/A168P;
N35G/N46R/E104H/A168P/G203E/A263V; N35G/Q58P/E104H/A168P;
N35G/G69T/E104H/A168P; N35G/S67G/E104H/A168P;
N35G/E104H/A168P/R199E; N35G/E104H/A168P/G203E/G268A/G269A/G270A;
N35G/E104H/A168P/V324M; N35G/E104H/A168P/P266S;
N35G/E104H/A168P/G245A; N35G/N66R/E104H/A168P; and
S24Q/N35G/Q44K/T80H/E104H/A168P.
[0141] Useful combinations of mutations further include but are not
limited to N35G/E104H/A168P/T236E; N35G/E104H/A168P/K310I;
N35G/E104H/R130Y/A168P; N35G/N66D/S78D/E104H/A168P/S253D;
N35G/N66D/E104H/S164E/A168P/S282D; N35G/E104H/A142L/A168P;
N35G/E104H/R145H/A168P; N35G/E104H/A168P/S231T;
N35G/E104H/A168P/Q184R; N35G/E104H/A168P/K218L;
N35G/E104H/A168P/P233F; N35G/T49A/E104H/A168P/Q184H;
N35G/T40S/E104H/A168P/P327M; N35G/T54M/E104H/A168P;
N35G/N66D/E104H/S164E/A168P/S231T/S253T; N35G/E104H/A168P/G203Y;
N35G/T49Q/E104H/A168P; N35G/E104H/A168P/P266S/G267V;
N35G/Q44K/N66V/E104H/A168P; N35G/S67H/E104H/A168P;
N35G/E104H/A137M/A168P; N35G/T49A/E104H/S128N/A168P;
N35G/T49R/E104H/A168P/K218L/N317Q; N35G/I51A/E104H/A168P;
N35G/E104H/A168P/A326C; N35G/P45R/E104H/A168P/T320A;
N35G/N66L/E104H/A168P; N35G/E104H/A168P/V237I/P303T;
N35G/P45R/E104H/A168P/K218L/N317Q; N35G/T80L/E104H/A168P;
N35G/A55G/E104H/A168P; N35G/E104H/K141N/A168P/P266S;
N35G/E104H/A168P/S330A; N35G/N66D/E104H/A168P/R290K;
N35G/E104H/N118E/A168P; N35G/E104H/A168P/A212M;
N35G/K93N/E104H/R130Y/A168P; N35G/E104H/A168P/G267D;
N35G/P45R/T49Y/E104H/A168P/N317D; N35G/E104H/A168P/V230Q;
N35G/E104H/A168P/S329Q; N35G/P45K/E104H/A168P/A219R;
N35G/S78D/E104H/S164E/A168P; N35G/E104H/A168P/S205T;
N35G/E104H/A168P/Q184H; V28H/N35G/N46E/Q58H/E104H/A168P;
N35G/E104H/A142D/A168P; N35G/E104H/A168P/E336L;
N35G/E104H/A168P/A280T; N35G/E104H/A168P/A219T;
N35G/E104H/A168P/P303T/G305D; R34E/N35G/E104H/A168P/A280T;
N35G/E104H/A168P/N187D; N35G/E104H/G136H/A168P;
N35G/E104H/A168P/Q184N; N35G/T49Y/E104H/A168P/N317R;
N35G/T40A/T49Q/S78C/E104H/A168P;
R34E/N35G/K93T/E104H/R130E/R145T/A168P/R199E/K218T/A280D;
N35G/T40L/E104H/A142G/A168P; and N35G/N66G/E104H/A168P.
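By way of illustration only, each combination of mutations listed above (e.g., N35G/T40A/E104H/A168P/P327M) can be parsed into its individual substitutions, from which the corresponding combination of mutated positions (e.g., N35/T40/E104/A168/P327) follows directly. The following is a minimal Python sketch; the function names parse_combination and positions_of are hypothetical helpers and are not part of the disclosure.

```python
import re

# One mutation: wild-type residue, position (numbered with reference to
# SEQ ID NO:2), substituted residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_combination(combo):
    """Parse 'N35G/E104H/A168P' into [('N', 35, 'G'), ('E', 104, 'H'), ...]."""
    parsed = []
    for token in combo.strip().split("/"):
        match = _MUTATION.match(token)
        if match is None:
            raise ValueError(f"unrecognized mutation: {token!r}")
        wild_type, position, substituted = match.groups()
        parsed.append((wild_type, int(position), substituted))
    return parsed


def positions_of(combo):
    """Return the mutated positions of a combination, e.g. 'N35/E104/A168'."""
    return "/".join(f"{wt}{pos}" for wt, pos, _ in parse_combination(combo))


if __name__ == "__main__":
    print(positions_of("N35G/T40A/E104H/A168P/P327M"))
```

A parser of this kind also makes it straightforward to detect entries that do not follow the notation, since malformed tokens raise an error rather than being silently accepted.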
Production of GH61 Variant Proteins
[0142] In some embodiments, the GH61 variant proteins of this
invention are produced by recombinant expression in a host cell.
Any suitable method for recombinant expression in any suitable host
cell finds use in the present invention. In some embodiments, a
nucleotide sequence encoding the protein is obtained, and
introduced into a suitable host cell by way of a suitable transfer
vector or expression vector. In some embodiments, the nucleotide
sequence is operably linked to a promoter that promotes expression
in the host cell. The promoter sequence is often selected to
optimize expression in a cell that is not M. thermophila, in which case the
promoter is typically heterologous to the GH61 variant protein
encoding sequence. In some embodiments, the host cell is a
eukaryotic cell and the GH61 variant protein comprises a
heterologous signal peptide at the N-terminus.
[0143] Optionally, in some embodiments, the encoding sequence is
codon-optimized for the host cell (e.g., a particular species of
yeast cell). Any suitable method for obtaining codon-optimized
sequences finds use in the present invention (e.g., GCG
CodonPreference, Genetics Computer Group Wisconsin Package; Codon
W, John Peden, University of Nottingham; and McInerney, Bioinform.,
14:372-73 [1998]).
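By way of illustration only, the simplest form of codon optimization replaces each amino acid with a single host-preferred codon. The following minimal Python sketch assumes a hypothetical preference table (PREFERRED_CODON); its entries are valid codons of the standard genetic code, but they are illustrative and are not measured usage data for any particular host. The sketch is not the method implemented by any of the tools cited above.

```python
# Hypothetical "preferred codon" table for an unspecified yeast host.
# Each codon encodes the stated amino acid in the standard genetic code,
# but the choice among synonymous codons is illustrative only.
PREFERRED_CODON = {
    "A": "GCT", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "TTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCA",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAC", "V": "GTT",
}


def back_translate(protein, table=PREFERRED_CODON, stop_codon="TAA"):
    """Back-translate a protein sequence using one codon per amino acid."""
    codons = []
    for residue in protein.upper():
        if residue not in table:
            raise ValueError(f"no codon defined for residue {residue!r}")
        codons.append(table[residue])
    return "".join(codons) + stop_codon


if __name__ == "__main__":
    print(back_translate("MNG"))
```

Practical tools generally weigh additional factors, such as codon-pair effects and avoidance of unwanted sequence motifs, that this sketch does not attempt to model.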
[0144] General reference texts relating to gene expression include
but are not limited to the most recent editions of Protocols in
Molecular Biology (Ausubel et al. eds.); Molecular Cloning: A
Laboratory Manual (Sambrook et al. eds.); Advances In Fungal
Biotechnology For Industry, Agriculture, And Medicine (Tkacz and
Lange, 2004); and Fungi: Biology and Applications (K. Kavanagh ed.,
2005).
[0145] In some embodiments, culture broth from GH61
protein-producing cells is collected and combined directly with
cellulase enzymes in a saccharification reaction. In some
alternative embodiments, the broth is fractionated to any extent
desired to provide partially or substantially purified GH61
protein using standard protein separation techniques, with GH61
activity followed during fractionation using a suitable GH61
activity assay. Such protocols may combine one or
more of the following methods (but are not limited to these
particular methods): salt precipitation, solid phase binding,
affinity chromatography, ion exchange chromatography, molecular
size separation, and/or filtration. Protein separation techniques
are generally described in Protein Purification: Principles, High
Resolution Methods, and Applications (J. C. Janson, ed., 2011); and
High Throughput Protein Expression and Purification: Methods and
Protocols (S. A. Doyle ed., 2009).
[0146] The present invention provides GH61 variant protein having
an amino acid sequence that is at least about 60%, at least about
65%, at least about 70%, about 75%, about 80%, about 85%, about
90%, about 91%, about 92%, about 93%, about 94%, about 95%, about
96%, about 97%, about 98%, or about 99% identical to SEQ ID NO:2 or
a fragment of SEQ ID NO:2 having GH61 activity. In some
embodiments, the amino acid sequence of the variant proteins has
one or more amino acid substitutions with respect to SEQ ID NO:2 or
said fragment. In some embodiments, the substitution(s) that are
present in the amino acid sequence result in the variant protein
having increased GH61 activity in a saccharification reaction by
certain cellulase enzymes under specified conditions, compared with
a reference protein comprising SEQ ID NO:2 or said fragment,
without any of the substitutions.
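By way of illustration only, a percent identity of the kind recited above can be computed from two sequences that have already been aligned to equal length. The following minimal Python sketch assumes pre-aligned input with "-" as the gap character and counts every alignment column that is not a gap in both sequences in the denominator; the function name and this denominator convention are assumptions, and other conventions exist.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns that are gaps ('-') in both sequences are ignored; a gap
    opposite a residue counts as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences, not SEQ ID NO:2.
    print(percent_identity("MKTAY", "MKSAY"))
```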
[0147] In some embodiments, GH61 variant proteins of this invention
comprise one or more of SEQ ID NOS:5, 6, 8, 9, 11, and/or 12, or
biologically-active fragments of these sequences having GH61
activity. These correspond to Variant 1 (SEQ ID NOS:5 and 6),
Variant 5 (SEQ ID NOS: 8 and 9), and Variant 9 (SEQ ID NOS: 11 and
12). In some embodiments, the variants have more than about 2-fold,
3-fold, or more than 3-fold GH61 activity compared with wild-type
GH61a (i.e., SEQ ID NO:2). The combined effect of multiple rounds
of optimization yields GH61 variant proteins that have about 3-fold,
about 5-fold, about 8-fold, or about 10-fold activity compared with
the original parental sequence (SEQ ID NO:2).
[0148] Also provided are polynucleotides encoding such GH61 variant
proteins, expression vectors comprising such polynucleotides, and
host cells that have been transfected with such vectors so as to
express the GH61 variant proteins that are encoded.
Fragments and Variants
[0149] GH61 variant proteins of this invention may comprise one or
more substitutions, deletions, or additions in the sequence in
addition to the substitutions highlighted above. By way of
illustration, the GH61 protein may be longer or shorter by at least
about 5, 10, 20, 40, 75, 100, 125, 150, or 200 amino acids; or by
about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%,
35%, 40%, 45%, 50%, 60%, 70%, or 80% of the total number of amino
acids in the polypeptide, compared with SEQ ID NO:2. The variant or
any of these fragments may also be part of a fusion protein in
which a portion having GH61 activity is joined to one or more other
sequences. Providing the protein retains a degree of GH61 activity
or other commercial applicability, the variations may comprise any
combination of amino acid substitutions at any position that is not
specifically indicated otherwise. Depending on the circumstances, a
conservative amino acid substitution may be preferred over other
types of substitutions.
[0150] Where an amino acid substitution is a "conservative"
substitution, the substituted amino acid shares one or more
chemical properties with the amino acid it is replacing. Shared
properties include the following: Basic amino acids: arginine (R),
lysine (K), histidine (H); acidic amino acids: glutamic acid (E)
and aspartic acid (D); uncharged polar amino acids: glutamine (Q)
and asparagine (N); hydrophobic amino acids: leucine (L),
isoleucine (I), valine (V); aromatic amino acids: phenylalanine
(F), tryptophan (W), and tyrosine (Y); sulphur-containing amino
acids: cysteine (C), methionine (M); small amino acids: glycine
(G), alanine (A), serine (S), threonine (T), proline (P), cysteine
(C), and methionine (M).
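By way of illustration only, the shared-property groups listed above can be encoded as sets so that a substitution can be checked for whether it is conservative in the sense of this paragraph. The following is a minimal Python sketch; the group memberships are taken from the list above, while the names CONSERVATIVE_GROUPS, shared_groups, and is_conservative are hypothetical.

```python
# Groups as listed in the paragraph above (one-letter amino acid codes).
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),
    "acidic": set("ED"),
    "uncharged polar": set("QN"),
    "hydrophobic": set("LIV"),
    "aromatic": set("FWY"),
    "sulphur-containing": set("CM"),
    "small": set("GASTPCM"),
}


def shared_groups(original, replacement):
    """Return the sorted names of groups containing both residues."""
    return sorted(
        name
        for name, members in CONSERVATIVE_GROUPS.items()
        if original in members and replacement in members
    )


def is_conservative(original, replacement):
    """True if the two residues share at least one listed property."""
    return bool(shared_groups(original, replacement))


if __name__ == "__main__":
    print(is_conservative("K", "R"), shared_groups("C", "M"))
```

Because cysteine and methionine appear in two of the listed groups, a pair of residues may share more than one property; under these groupings, a substitution such as N35G would not be classed as conservative.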
Obtaining Functional Fragments and Variants
[0151] Functional fragments of GH61 protein variants of this
invention can be identified by standard methodology for mapping
function within a polypeptide. In some embodiments, recombinant
protein is expressed that has effectively been trimmed at the N- or
C-terminus, and then tested in a GH61 activity assay. Trimming can
continue until activity is lost, at which point the minimum
functional unit of the protein would be identified. Fragments
containing any portion of the protein down to the identified size
would typically be functional, as would be fusion constructs
containing at least the functional core of the protein.
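The trimming procedure described above can be sketched as a loop that removes residues from one terminus until activity is lost. The following minimal Python sketch is illustrative only: has_activity stands in for a GH61 activity assay on the expressed fragment and is a hypothetical callback, and only C-terminal trimming is shown.

```python
def minimal_c_terminal_fragment(sequence, has_activity, step=1):
    """Trim from the C-terminus until activity is lost.

    has_activity: hypothetical callback standing in for a GH61 activity
    assay; it takes a sequence and returns True or False.
    Returns the shortest fragment found that still tests as active.
    """
    if not has_activity(sequence):
        raise ValueError("the starting sequence is not active")
    fragment = sequence
    while len(fragment) > step:
        candidate = fragment[:-step]
        if not has_activity(candidate):
            break  # activity lost: the previous fragment is the minimum
        fragment = candidate
    return fragment


if __name__ == "__main__":
    # Toy assay: "active" while the first five residues are intact.
    toy_assay = lambda seq: seq.startswith("ABCDE")
    print(minimal_c_terminal_fragment("ABCDEFGH", toy_assay))
```

N-terminal trimming would follow the same pattern with the slice taken from the other end of the sequence.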
[0152] To generate further variants that incorporate one or more
amino acid changes in a GH61 encoding sequence, the skilled artisan
can change particular nucleotides, and then retest the expressed
protein for GH61 activity.
[0153] An effective way to generate a large collection of
functional variants is to use a random mutation strategy. The
standard texts Protocols in Molecular Biology (Ausubel et al. eds.)
and Molecular Cloning: A Laboratory Manual (Sambrook et al. eds.)
describe techniques employing chemical mutagenesis, cassette
mutagenesis, degenerate oligonucleotides, mutually priming
oligonucleotides, linker-scanning mutagenesis, alanine-scanning
mutagenesis, and error-prone PCR. Other efficient methods include
the E. coli mutator strains of Stratagene (See e.g., Greener et
al., Methods Mol. Biol. 57:375 [1996]) and the DNA shuffling
technique of Maxygen (See e.g., Patten et al., Curr. Opin.
Biotechnol., 8:724 [1997]; Harayama, Tr. Biotechnol., 16:76 [1998];
U.S. Pat. Nos. 5,605,793 and 6,132,970). To increase variation, a
technology can be used that generates more abrupt changes, such as
DNA shuffling techniques.
[0154] Mutagenesis may be performed in accordance with any of the
techniques known in the art, including random and site-specific
mutagenesis. Directed evolution can be performed with any of the
techniques known in the art to screen for production of variants
including shuffling. Mutagenesis and directed evolution methods are
well known in the art (See e.g., U.S. Pat. Nos. 5,605,793,
5,830,721, 6,132,970, 6,420,175, 6,277,638, 6,365,408, 6,602,986,
7,288,375, 6,287,861, 6,297,053, 6,576,467, 6,444,468, 5,811,238,
6,117,679, 6,165,793, 6,180,406, 6,291,242, 6,995,017, 6,395,547,
6,506,602, 6,519,065, 6,506,603, 6,413,774, 6,573,098, 6,323,030,
6,344,356, 6,372,497, 7,868,138, 5,834,252, 5,928,905, 6,489,146,
6,096,548, 6,387,702, 6,391,552, 6,358,742, 6,482,647, 6,335,160,
6,653,072, 6,355,484, 6,03,344, 6,319,713, 6,613,514, 6,455,253,
6,579,678, 6,586,182, 6,406,855, 6,946,296, 7,534,564, 7,776,598,
5,837,458, 6,391,640, 6,309,883, 7,105,297, 7,795,030, 6,326,204,
6,251,674, 6,716,631, 6,528,311, 6,287,862, 6,335,198, 6,352,859,
6,379,964, 7,148,054, 7,629,170, 7,620,500, 6,365,377, 6,358,740,
6,406,910, 6,413,745, 6,436,675, 6,961,664, 7,430,477, 7,873,499,
7,702,464, 7,783,428, 7,747,391, 7,747,393, 7,751,986, 6,376,246,
6,426,224, 6,423,542, 6,479,652, 6,319,714, 6,521,453, 6,368,861,
7,421,347, 7,058,515, 7,024,312, 7,620,502, 7,853,410, 7,957,912,
7,904,249, and all related US and non-US counterparts; Ling et al.,
Anal. Biochem., 254(2):157-78 [1997]; Dale et al., Meth. Mol.
Biol., 57:369-74 [1996]; Smith, Ann. Rev. Genet., 19:423-462
[1985]; Botstein et al., Science, 229:1193-1201 [1985]; Carter,
Biochem. J., 237:1-7 [1986]; Kramer et al., Cell, 38:879-887
[1984]; Wells et al., Gene, 34:315-323 [1985]; Minshull et al.,
Curr. Op. Chem. Biol., 3:284-290 [1999]; Christians et al., Nat.
Biotechnol., 17:259-264 [1999]; Crameri et al., Nature, 391:288-291
[1998]; Crameri, et al., Nat. Biotechnol., 15:436-438 [1997]; Zhang
et al., Proc. Nat. Acad. Sci. U.S.A., 94:4504-4509 [1997]; Crameri
et al., Nat. Biotechnol., 14:315-319 [1996]; Stemmer, Nature,
370:389-391 [1994]; Stemmer, Proc. Nat. Acad. Sci. USA,
91:10747-10751 [1994]; WO 95/22625; WO 97/0078; WO 97/35966; WO
98/27230; WO 00/42651; WO 01/75767; and WO 2009/152336, all of
which are incorporated herein by reference).
[0155] Services and kits are commercially available to the skilled
reader for use in obtaining variants of the claimed proteins. By way
of illustration, systems specifically designed for mutagenesis
projects include the following: the GeneTailor™ Site-Directed
Mutagenesis System sold by Invitrogen™ Life Technologies; the BD
Diversify™ PCR Random Mutagenesis Kit™, sold by BD
Biosciences/Clontech; the Template Generation System™, sold by MJ
Research Inc.; the XL1-Red™ mutator strain of E. coli, sold by
Stratagene; and the GeneMorph® Random Mutagenesis Kit, also sold by
Stratagene. By employing any
of these systems in conjunction with a suitable GH61 activity
assay, variants can be generated and tested in a high throughput
manner.
[0156] Alternatively or in addition, the user may conduct further
evolution of the encoded protein (See e.g., U.S. Pat. No.
7,981,614; US Pat. Appln. Publ. No. 2011/0034342; U.S. Pat. No.
7,795,030; U.S. Pat. No. 7,647,184; U.S. Pat. No. 6,939,689; and
U.S. Pat. No. 6,773,900).
[0157] After each iteration of mutagenesis, the user can test and
select the desired clones retaining GH61 activity. Optionally, the
selected clones can be subject to further rounds of mutagenesis,
until the desired degree of variation from the original sequence
has been achieved.
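The iterative cycle described above (mutagenize, test, select, and repeat) can be sketched as a simple loop. The following Python sketch is illustrative only and is not the screening procedure of the disclosure: the assay callback stands in for a GH61 activity assay, random single-residue substitution stands in for whichever mutagenesis method is used, and all names and parameter values are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_point_mutant(sequence, rng):
    """Return a copy of sequence with one residue changed at random."""
    pos = rng.randrange(len(sequence))
    choices = [aa for aa in AMINO_ACIDS if aa != sequence[pos]]
    return sequence[:pos] + rng.choice(choices) + sequence[pos + 1:]


def evolve(parent, assay, rounds=3, library_size=50, seed=0):
    """Run rounds of mutagenesis, keeping the best-scoring clone.

    assay: hypothetical callback returning a numeric activity score.
    Each round builds a library from the current best clone and selects
    the highest-scoring member if it improves on that clone.
    """
    rng = random.Random(seed)
    best, best_score = parent, assay(parent)
    for _ in range(rounds):
        library = [random_point_mutant(best, rng) for _ in range(library_size)]
        for clone in library:
            score = assay(clone)
            if score > best_score:
                best, best_score = clone, score
    return best, best_score


if __name__ == "__main__":
    # Toy assay: counts histidines. Not a real activity measurement.
    toy_assay = lambda seq: seq.count("H")
    print(evolve("MKTAYNGESA", toy_assay))
```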
Cellulase Enzymes and Compositions
[0158] The GH61 proteins of this invention are useful for
increasing the yield of fermentable sugars in a saccharification
reaction with one or more cellulase enzymes. The cellulase enzymes
can be produced in the same cell as the GH61 protein or in a
different cell. In either case, the cellulase enzymes can be
expressed from a recombinant encoding region or from a constitutive
gene. The cellulase enzymes can be provided in the form of a
culture broth (with or without the microorganism producing the
enzyme(s)) or supernatant, or purified to any extent desired.
[0159] The terms "cellulase" and "cellulase enzyme" broadly refer
to enzymes that catalyze the hydrolysis of the beta-1,4-glycosidic
bonds joining individual glucose units in a cellulose-containing
substrate. Examples of cellulase enzymes suitable for use with the
GH61 proteins of this invention are described in more detail later
in this section.
[0160] Endoglucanases (EGs) comprise a group of cellulase enzymes
classified as E.C. 3.2.1.4. These enzymes catalyze the hydrolysis
of internal beta-1,4 glycosidic bonds of cellulose. In some
embodiments, the present invention comprises an endogenous M.
thermophila endoglucanase such as M. thermophila EG2 (See, WO
2007/109441) or a variant thereof. In some additional embodiments,
the EG is from S. avermitilis, having a sequence set forth in
GenBank accession NP_821730, or a variant thereof (See e.g.,
US Pat. Appln. Publ. No. 2010/0267089 A1). In some additional
embodiments, the EG is a Thermoascus aurantiacus EG or variant
thereof. In some further embodiments, the EG is an endogenous EG
from a bacterium, a yeast, or a filamentous fungus other than M.
thermophila. Indeed, it is contemplated that any suitable EG will
find use in combination with the GH61 proteins provided herein. It
is not intended that the present invention be limited to any
specific EG.
[0161] Beta-glucosidases (BGL) comprise a group of cellulase
enzymes classified as E.C. 3.2.1.21. These enzymes hydrolyze
cellobiose to glucose. In some embodiments, the BGL is an
endogenous M. thermophila enzyme, or a variant thereof (See e.g.,
US Pat. Appln. Publ. No. 2011/0129881 A1; and US Pat. Appln. Publ.
No. 2011/0124058 A1). In some alternative embodiments, the BGL is
from Azospirillum irakense (CelA), or a variant thereof (See e.g.,
US Pat. Appln. Publ. No. 2011/0114744 A1; and PCT/US2010/038902).
Indeed, it is contemplated that any suitable BGL will find use in
combination with the GH61 proteins provided herein. It is not
intended that the present invention be limited to any specific
BGL.
[0162] Cellobiohydrolases comprise a group of cellulase enzymes
classified as E.C. 3.2.1.91. Type 1 cellobiohydrolase (CBH1)
releases cellobiose processively from the reducing end of
cellulose chains. Type 2 cellobiohydrolase (CBH2) releases
cellobiose processively from the nonreducing end of cellulose
chains. In some embodiments, the CBH1 and/or CBH2 enzymes used in
the present invention are endogenous to M. thermophila, while in
some other embodiments, the CBH1 and/or CBH2 enzymes used in the
present invention are obtained from bacteria, yeast, and/or a
filamentous fungus other than M. thermophila. Indeed, it is
contemplated that any suitable CBHs will find use in combination
with the GH61 proteins provided herein. It is not intended that the
present invention be limited to any specific CBHs. The invention
provides compositions comprising a GH61 variant protein in
combination with at least one, at least two, at least three, or
more than three cellulases selected from EG, BGL, CBH1, CBH2,
xylosidase, and/or xylanase. In some embodiments, enzymes are
purified or partly purified before combining them, so that the
combined mass of the GH61, EG, BGL, CBH1 and CBH2 is at least about
50% or at least about 70% of the total cell-free protein in
compositions.
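By way of illustration only, the mass criterion described above (the combined mass of the GH61, EG, BGL, CBH1 and CBH2 being at least about 50% or at least about 70% of the total cell-free protein) amounts to a simple ratio. The following minimal Python sketch uses hypothetical names and hypothetical mass values.

```python
CORE_ENZYMES = ("GH61", "EG", "BGL", "CBH1", "CBH2")


def core_mass_fraction(protein_masses):
    """Fraction of total cell-free protein mass contributed by the
    GH61, EG, BGL, CBH1 and CBH2 components.

    protein_masses: mapping of component name to mass (any one unit).
    """
    total = sum(protein_masses.values())
    if total <= 0:
        raise ValueError("total protein mass must be positive")
    core = sum(protein_masses.get(name, 0.0) for name in CORE_ENZYMES)
    return core / total


if __name__ == "__main__":
    # Hypothetical composition in milligrams, not measured data.
    masses = {"GH61": 1, "EG": 2, "BGL": 1, "CBH1": 3, "CBH2": 1, "other": 2}
    fraction = core_mass_fraction(masses)
    print(fraction, fraction >= 0.70)
```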
[0163] In addition to one or more cellulase enzymes such as those
listed above, in some embodiments, GH61 variant enzymes are
combined with other enzymes to produce mixtures with industrial
applicability. Such combinations are useful, for example, in
rendering a cellulose-containing source into an intermediate that
is more amenable to hydrolysis by the cellulase enzymes in the
mixture. For example, in some embodiments, enzymes are selected to
digest or hydrolyze other components of a particular cellulosic
biomass, such as hemicellulose, arabinogalactan, pectin,
rhamnogalacturonan and/or lignin.
[0164] In some embodiments, the compositions comprise enzymes
selected from endoxylanases (EC 3.2.1.8); beta-xylosidases (EC
3.2.1.37); alpha-L-arabinofuranosidases (EC 3.2.1.55);
alpha-glucuronidases (EC 3.2.1.139); acetylxylanesterases (EC
3.1.1.72); feruloyl esterases (EC 3.1.1.73); coumaroyl esterases
(EC 3.1.1.73); alpha-galactosidases (EC 3.2.1.22);
beta-galactosidases (EC 3.2.1.23); beta-mannanases (EC 3.2.1.78);
beta-mannosidases (EC 3.2.1.25); endo-polygalacturonases (EC
3.2.1.15); pectin methyl esterases (EC 3.1.1.11); endo-galactanases
(EC 3.2.1.89); pectin acetyl esterases (EC 3.1.1.6); endo-pectin
lyases (EC 4.2.2.10); pectate lyases (EC 4.2.2.2); alpha
rhamnosidases (EC 3.2.1.40); exo-poly-alpha-galacturonosidase (EC
3.2.1.82); 1,4-alpha-galacturonidase (EC 3.2.1.67);
exopolygalacturonate lyases (EC 4.2.2.9); rhamnogalacturonan
endolyases (EC 4.2.2.B3); rhamnogalacturonan acetylesterases (EC
3.2.1.B11); rhamnogalacturonan galacturonohydrolases (EC
3.2.1.B11); endo-arabinanases (EC 3.2.1.99); laccases (EC
1.10.3.2); manganese-dependent peroxidases (EC 1.11.1.13); amylases
(EC 3.2.1.1), glucoamylases (EC 3.2.1.3), proteases, lipases, and
lignin peroxidases (EC 1.11.1.14). Any combination of one, two,
three, four, five, or more than five enzymes finds use in the
compositions of the present invention.
[0165] Cellulase mixtures for efficient enzymatic hydrolysis of
cellulose are known (See e.g., Viikari et al., Adv. Biochem. Eng.
Biotechnol., 108:121-45 [2007]; and US Pat. Publns. 2009/0061484;
US 2008/0057541; and US 2009/0209009, each of which is incorporated
herein by reference). In some embodiments, mixtures of purified
naturally occurring or recombinant enzymes are combined with
cellulosic feedstock or a product of cellulose hydrolysis. In some
embodiments, one or more cell populations, each producing one or
more naturally occurring or recombinant cellulases, are combined
with cellulosic feedstock or a product of cellulose hydrolysis.
[0166] In some embodiments, the GH61 variant polypeptides of the
present invention are present in mixtures comprising enzymes other
than cellulases that degrade cellulose, hemicellulose, pectin,
and/or lignocellulose.
[0167] In some embodiments, the present invention provides at least
one GH61 variant and at least one endoxylanase. Endoxylanases (EC
3.2.1.8) catalyze the endo hydrolysis of 1,4-beta-D-xylosidic
linkages in xylans. This enzyme may also be referred to as
endo-1,4-beta-xylanase or 1,4-beta-D-xylan xylanohydrolase. In some
embodiments, an alternative is EC 3.2.1.136, a
glucuronoarabinoxylan endoxylanase, an enzyme that is able to
hydrolyze 1,4 xylosidic linkages in glucuronoarabinoxylans.
[0168] In some embodiments, the present invention provides at least
one GH61 variant and at least one beta-xylosidase. Beta-xylosidases
(EC 3.2.1.37) catalyze the hydrolysis of 1,4-beta-D-xylans, to
remove successive D-xylose residues from the non-reducing termini.
This enzyme may also be referred to as xylan 1,4-beta-xylosidase,
1,4-beta-D-xylan xylohydrolase, exo-1,4-beta-xylosidase or
xylobiase.
[0169] In some embodiments, the present invention provides at least
one GH61 variant and at least one alpha-L-arabinofuranosidase.
Alpha-L-arabinofuranosidases (EC 3.2.1.55) catalyze the hydrolysis
of terminal non-reducing alpha-L-arabinofuranoside residues in
alpha-L-arabinosides. The enzyme acts on
alpha-L-arabinofuranosides, alpha-L-arabinans containing (1,3)-
and/or (1,5)-linkages, arabinoxylans, and arabinogalactans.
Alpha-L-arabinofuranosidase is also known as arabinosidase,
alpha-arabinosidase, alpha-L-arabinosidase,
alpha-arabinofuranosidase, arabinofuranosidase, polysaccharide
alpha-L-arabinofuranosidase, alpha-L-arabinofuranoside hydrolase,
L-arabinosidase and alpha-L-arabinanase.
[0170] In some embodiments, the present invention provides at least
one GH61 variant and at least one alpha-glucuronidase.
Alpha-glucuronidases (EC 3.2.1.139) catalyze the hydrolysis of an
alpha-D-glucuronoside to D-glucuronate and an alcohol.
[0171] In some embodiments, the present invention provides at least
one GH61 variant and at least one acetylxylanesterase.
Acetylxylanesterases (EC 3.1.1.72) catalyze the hydrolysis of
acetyl groups from polymeric xylan, acetylated xylose, acetylated
glucose, alpha-naphthyl acetate, and p-nitrophenyl acetate.
[0172] In some embodiments, the present invention provides at least
one GH61 variant and at least one feruloyl esterase. Feruloyl
esterases (EC 3.1.1.73) have 4-hydroxy-3-methoxycinnamoyl-sugar
hydrolase activity that catalyzes the hydrolysis of
the 4-hydroxy-3-methoxycinnamoyl (feruloyl) group from an
esterified sugar, which is usually arabinose in "natural"
substrates, to produce ferulate (4-hydroxy-3-methoxycinnamate).
Feruloyl esterase is also known as ferulic acid esterase,
hydroxycinnamoyl esterase, FAE-III, cinnamoyl ester hydrolase,
FAEA, cinnAE, FAE-I, or FAE-II.
[0173] In some embodiments, the present invention provides at least
one GH61 variant and at least one coumaroyl esterase. Coumaroyl
esterases (EC 3.1.1.73) catalyze a reaction of the form:
coumaroyl-saccharide + H2O = coumarate + saccharide. In some
embodiments, the saccharide is an oligosaccharide or a
polysaccharide. This enzyme may also be referred to as
trans-4-coumaroyl esterase, trans-p-coumaroyl esterase, p-coumaroyl
esterase or p-coumaric acid esterase. The enzyme also falls within
EC 3.1.1.73; it may also be referred to as a "feruloyl
esterase."
[0174] In some embodiments, the present invention provides at least
one GH61 variant and at least one alpha-galactosidase.
Alpha-galactosidases (EC 3.2.1.22) catalyze the hydrolysis of
terminal, non-reducing alpha-D-galactose residues in
alpha-D-galactosides, including galactose oligosaccharides,
galactomannans, galactans and arabinogalactans. This enzyme may
also be referred to as "melibiase."
[0175] In some embodiments, the present invention provides at least
one GH61 variant and at least one beta-galactosidase.
Beta-galactosidases (EC 3.2.1.23) catalyze the hydrolysis of
terminal non-reducing beta-D-galactose residues in
beta-D-galactosides. In some embodiments, the polypeptide is also
capable of hydrolyzing alpha-L-arabinosides. This enzyme may also
be referred to as exo-(1->4)-beta-D-galactanase or lactase.
[0176] In some embodiments, the present invention provides at least
one GH61 variant and at least one beta-mannanase. Beta-mannanases
(EC 3.2.1.78) catalyze the random hydrolysis of
1,4-beta-D-mannosidic linkages in mannans, galactomannans and
glucomannans. This enzyme may also be referred to as "mannan
endo-1,4-beta-mannosidase" or "endo-1,4-mannanase."
[0177] In some embodiments, the present invention provides at least
one GH61 variant and at least one beta-mannosidase.
Beta-mannosidases (EC 3.2.1.25) catalyze the hydrolysis of
terminal, non-reducing beta-D-mannose residues in
beta-D-mannosides. This enzyme may also be referred to as mannanase
or mannase.
[0178] In some embodiments, the present invention provides at least
one GH61 variant and at least one glucoamylase. Glucoamylases (EC
3.2.1.3) catalyze the release of D-glucose from non-reducing ends
of oligo- and poly-saccharide molecules. Glucoamylase is also
generally considered a type of amylase known as
amylo-glucosidase.
[0179] In some embodiments, the present invention provides at least
one GH61 variant and at least one amylase. Amylases (EC 3.2.1.1)
are starch cleaving enzymes that degrade starch and related
compounds by hydrolyzing the alpha-1,4 and/or alpha-1,6 glucosidic
linkages in an endo- or an exo-acting fashion. Amylases include
alpha-amylases (EC 3.2.1.1), beta-amylases (EC 3.2.1.2),
amylo-amylases (EC 3.2.1.3), alpha-glucosidases (EC 3.2.1.20),
pullulanases (EC 3.2.1.41), and isoamylases (EC 3.2.1.68). In some
embodiments, the amylase is an alpha-amylase.
[0180] In some embodiments, one or more enzymes that degrade pectin
are included in enzyme mixtures that comprise at least one GH61
variant of the present invention. Pectinases catalyze the
hydrolysis of pectin into smaller units such as oligosaccharides or
monomeric saccharides. In some embodiments, the enzyme mixtures
comprise any pectinase, for example an endo-polygalacturonase, a
pectin methyl esterase, an endo-galactanase, a pectin acetyl
esterase, an endo-pectin lyase, pectate lyase, alpha rhamnosidase,
an exo-galacturonase, an exo-polygalacturonate lyase, a
rhamnogalacturonan hydrolase, a rhamnogalacturonan lyase, a
rhamnogalacturonan acetyl esterase, a rhamnogalacturonan
galacturonohydrolase and/or a xylogalacturonase.
[0181] In some embodiments, the present invention provides at least
one GH61 variant and at least one endo-polygalacturonase.
Endo-polygalacturonases (EC 3.2.1.15) catalyze the random
hydrolysis of 1,4-alpha-D-galactosiduronic linkages in pectate and
other galacturonans. This enzyme may also be referred to as
"polygalacturonase pectin depolymerase," "pectinase,"
"endopolygalacturonase," "pectolase," "pectin hydrolase," "pectin
polygalacturonase," "poly-alpha-1,4-galacturonide
glycanohydrolase," "endogalacturonase," "endo-D-galacturonase" or
"poly(1,4-alpha-D-galacturonide) glycanohydrolase."
[0182] In some embodiments, the present invention provides at least
one GH61 variant and at least one pectin methyl esterase. Pectin
methyl esterases (EC 3.1.1.11) catalyze the reaction: pectin + n
H2O = n methanol + pectate. The enzyme may also be known as
"pectin esterase," "pectin demethoxylase," "pectin methoxylase,"
"pectin methylesterase," "pectase," "pectinoesterase," or "pectin
pectylhydrolase."
[0183] In some embodiments, the present invention provides at least
one GH61 variant and at least one endo-galactanase.
Endo-galactanases (EC 3.2.1.89) catalyze the endohydrolysis of
1,4-beta-D-galactosidic linkages in arabinogalactans. The enzyme
may also be known as "arabinogalactan endo-1,4-beta-galactosidase,"
"endo-1,4-beta-galactanase," "galactanase," "arabinogalactanase,"
or "arabinogalactan 4-beta-D-galactanohydrolase."
[0184] In some embodiments, the present invention provides at least
one GH61 variant and at least one pectin acetyl esterase. Pectin
acetyl esterases catalyze the deacetylation of the acetyl groups at
the hydroxyl groups of GalUA residues of pectin.
[0185] In some embodiments, the present invention provides at least
one GH61 variant and at least one endo-pectin lyase. Endo-pectin
lyases (EC 4.2.2.10) catalyze the eliminative cleavage of
(1->4)-alpha-D-galacturonan methyl ester to give
oligosaccharides with
4-deoxy-6-O-methyl-alpha-D-galact-4-enuronosyl groups at their
non-reducing ends. The enzyme may also be known as "pectin lyase,"
"pectin trans-eliminase," "endo-pectin lyase,"
"polymethylgalacturonic transeliminase," "pectin
methyltranseliminase," "pectolyase," "PL," "PNL," "PMGL," or
"(1->4)-6-O-methyl-alpha-D-galacturonan lyase."
[0186] In some embodiments, the present invention provides at least
one GH61 variant and at least one pectate lyase. Pectate lyases (EC
4.2.2.2) catalyze the eliminative cleavage of
(1->4)-alpha-D-galacturonan to give oligosaccharides with
4-deoxy-alpha-D-galact-4-enuronosyl groups at their non-reducing
ends. The enzyme may also be known as "polygalacturonic
transeliminase," "pectic acid transeliminase," "polygalacturonate
lyase," "endopectin methyltranseliminase," "pectate
transeliminase," "endogalacturonate transeliminase," "pectic acid
lyase," "pectic lyase," "alpha-1,4-D-endopolygalacturonic acid
lyase," "PGA lyase," "PPase-N," "endo-alpha-1,4-polygalacturonic
acid lyase," "polygalacturonic acid lyase," "pectin
trans-eliminase," "polygalacturonic acid trans-eliminase," or
"(1->4)-alpha-D-galacturonan lyase."
[0187] In some embodiments, the present invention provides at least
one GH61 variant and at least one alpha-rhamnosidase.
Alpha-rhamnosidases (EC 3.2.1.40) catalyze the hydrolysis of
terminal non-reducing alpha-L-rhamnose residues in
alpha-L-rhamnosides or alternatively in rhamnogalacturonan. This
enzyme may also be known as "alpha-L-rhamnosidase T,"
"alpha-L-rhamnosidase N," or "alpha-L-rhamnoside
rhamnohydrolase."
[0188] In some embodiments, the present invention provides at least
one GH61 variant and at least one exo-galacturonase.
Exo-galacturonases (EC 3.2.1.82) hydrolyze pectic acid from the
non-reducing end, releasing digalacturonate. The enzyme may also be
known as "exo-poly-alpha-galacturonosidase,"
"exopolygalacturonosidase," or "exopolygalacturanosidase."
[0189] In some embodiments, the present invention provides at least
one GH61 variant and at least one galacturan 1,4-alpha
galacturonidase. Exo-galacturonases (EC 3.2.1.67) catalyze a
reaction of the following type:
(1,4-alpha-D-galacturonide)n + H2O = (1,4-alpha-D-galacturonide)n-1
+ D-galacturonate. The enzyme may also be known as
"poly[(1->4)-alpha-D-galacturonide] galacturonohydrolase,"
"exopolygalacturonase," "poly(galacturonate) hydrolase,"
"exo-D-galacturonase," "exo-D-galacturonanase,"
"exopoly-D-galacturonase," or "poly(1,4-alpha-D-galacturonide)
galacturonohydrolase."
[0190] In some embodiments, the present invention provides at least
one GH61 variant and at least one exopolygalacturonate lyase.
Exopolygalacturonate lyases (EC 4.2.2.9) catalyze eliminative
cleavage of 4-(4-deoxy-alpha-D-galact-4-enuronosyl)-D-galacturonate
from the reducing end of pectate (i.e., de-esterified pectin). This
enzyme may be known as "pectate disaccharide-lyase," "pectate
exo-lyase," "exopectic acid transeliminase," "exopectate lyase,"
"exopolygalacturonic acid-trans-eliminase," "PATE," "exo-PATE,"
"exo-PGL," or "(1->4)-alpha-D-galacturonan
reducing-end-disaccharide-lyase."
[0191] In some embodiments, the present invention provides at least
one GH61 variant and at least one rhamnogalacturonanase.
Rhamnogalacturonanases hydrolyze the linkage between
galactosyluronic acid and rhamnopyranosyl in an endo-fashion in
strictly alternating rhamnogalacturonan structures, consisting of
the disaccharide
[(1,2)-alpha-L-rhamnoyl-(1,4)-alpha-galactosyluronic acid].
[0192] In some embodiments, the present invention provides at least
one GH61 variant and at least one rhamnogalacturonan lyase.
Rhamnogalacturonan lyases cleave
alpha-L-Rhap-(1->4)-alpha-D-GalpA linkages in an endo-fashion
in rhamnogalacturonan by beta-elimination.
[0193] In some embodiments, the present invention provides at least
one GH61 variant and at least one rhamnogalacturonan acetyl
esterase. Rhamnogalacturonan acetyl esterases catalyze the
deacetylation of the backbone of alternating rhamnose and
galacturonic acid residues in rhamnogalacturonan.
[0194] In some embodiments, the present invention provides at least
one GH61 variant and at least one rhamnogalacturonan
galacturonohydrolase. Rhamnogalacturonan galacturonohydrolases
hydrolyze galacturonic acid from the non-reducing end of strictly
alternating rhamnogalacturonan structures in an exo-fashion. This
enzyme may also be known as "xylogalacturonan hydrolase."
[0195] In some embodiments, the present invention provides at least
one GH61 variant and at least one endo-arabinanase.
Endo-arabinanases (EC 3.2.1.99) catalyze endohydrolysis of
1,5-alpha-arabinofuranosidic linkages in 1,5-arabinans. The enzyme
may also be known as "endo-arabinase," "arabinan
endo-1,5-alpha-L-arabinosidase," "endo-1,5-alpha-L-arabinanase,"
"endo-alpha-1,5-arabanase," "endo-arabanase," or
"1,5-alpha-L-arabinan 1,5-alpha-L-arabinanohydrolase."
[0196] In some embodiments, the present invention provides at least
one GH61 variant and at least one enzyme that participates in
lignin degradation in an enzyme mixture. Enzymatic lignin
depolymerization can be accomplished by lignin peroxidases,
manganese peroxidases, laccases, and/or cellobiose dehydrogenases
(CDH), often working in synergy. These extracellular enzymes are
often referred to as "lignin-modifying enzymes" or "LMEs." Three of
these enzymes comprise two glycosylated heme-containing
peroxidases, namely lignin peroxidase (LIP) and Mn-dependent
peroxidase (MNP), and one copper-containing phenoloxidase, laccase
(LCC).
[0197] In some embodiments, the present invention provides at least
one GH61 variant and at least one laccase. Laccases are copper
containing oxidase enzymes that are found in many plants, fungi and
microorganisms. Laccases are enzymatically active on phenols and
similar molecules and perform a one electron oxidation. Laccases
can be polymeric and the enzymatically active form can be a dimer
or trimer.
[0198] In some embodiments, the present invention provides at least
one GH61 variant and at least one Mn-dependent peroxidase. The
enzymatic activity of Mn-dependent peroxidase (MnP) is dependent
on Mn2+. Without being bound by theory, it has been suggested that
the main role of this enzyme is to oxidize Mn2+ to Mn3+ (See e.g.,
Glenn et al., Arch. Biochem. Biophys., 251:688-696 [1986]).
Subsequently, phenolic substrates are oxidized by the Mn3+
generated.
[0199] In some embodiments, the present invention provides at least
one GH61 variant and at least one lignin peroxidase. Lignin
peroxidase is an extracellular heme peroxidase that catalyses the
oxidative depolymerization of dilute solutions of polymeric lignin
in vitro. Some of the substrates of LiP, most notably
3,4-dimethoxybenzyl alcohol (veratryl alcohol, VA), are active
redox compounds that have been shown to act as redox mediators. VA
is a secondary metabolite produced at the same time as LiP by
ligninolytic cultures of P. chrysosporium and without being bound
by theory, has been proposed to function as a physiological redox
mediator in the LiP-catalyzed oxidation of lignin in vivo (See
e.g., Harvey, et al., FEBS Lett., 195:242-246 [1986]).
[0200] In some embodiments, the present invention provides at least
one GH61 variant and at least one protease, amylase, glucoamylase,
and/or a lipase that participates in cellulose degradation.
[0201] As used herein, the term "protease" includes enzymes that
hydrolyze peptide bonds (peptidases), as well as enzymes that
hydrolyze bonds between peptides and other moieties, such as sugars
(glycopeptidases). Many proteases are characterized under EC 3.4,
and are suitable for use in the invention. Some specific types of
proteases include cysteine proteases, including pepsin and papain,
and serine proteases, including chymotrypsins, carboxypeptidases
and metalloendopeptidases.
[0202] As used herein, the term "lipase" includes enzymes that
hydrolyze lipids, fatty acids, and acylglycerides, including
phosphoglycerides, lipoproteins, diacylglycerols, and the like. In
plants, lipids are used as structural components to limit water
loss and pathogen infection. These lipids include waxes derived
from fatty acids, as well as cutin and suberin.
[0203] In some additional embodiments, the present invention
provides at least one GH61 variant and at least one expansin or
expansin-like protein, such as a swollenin (See e.g., Saloheimo et
al., Eur. J. Biochem., 269:4202-4211 [2002]) or a swollenin-like
protein. Expansins are implicated in loosening of the cell wall
structure during plant cell growth. Expansins have been proposed to
disrupt hydrogen bonding between cellulose and other cell wall
polysaccharides without having hydrolytic activity. In this way,
they are thought to allow the sliding of cellulose fibers and
enlargement of the cell wall. Swollenin, an expansin-like protein,
contains an N-terminal Carbohydrate Binding Module Family 1 domain
(CBD) and a C-terminal expansin-like domain. In some embodiments,
an expansin-like protein or swollenin-like protein comprises one or
both of such domains and/or disrupts the structure of cell walls
(such as disrupting cellulose structure), optionally without
producing detectable amounts of reducing sugars.
[0204] In some embodiments, the present invention provides at least
one GH61 variant and at least one polypeptide product of a
cellulose integrating protein, scaffoldin or a scaffoldin-like
protein, for example CipA or CipC from Clostridium thermocellum or
Clostridium cellulolyticum, respectively. Scaffoldins and cellulose
integrating proteins are multi-functional integrating subunits
which may organize cellulolytic subunits into a multi-enzyme
complex. This is accomplished by the interaction of two
complementary classes of domains (i.e., a cohesin domain on
scaffoldin and a dockerin domain on each enzymatic unit). The
scaffoldin subunit also bears a cellulose-binding module that
mediates attachment of the cellulosome to its substrate. A
scaffoldin or cellulose integrating protein for the purposes of
this invention may comprise one or both such domains.
[0205] In some embodiments, the present invention provides at least
one GH61 variant and at least one cellulose induced protein or
modulating protein, for example as encoded by a cip1 or cip2 gene
or similar genes from Trichoderma reesei (See e.g., Foreman et al.,
J. Biol. Chem., 278:31988-31997 [2003]).
[0206] In some embodiments, the present invention provides at least
one GH61 variant and at least one member of each of the classes of
the polypeptides described above, several members of one
polypeptide class, or any combination of these polypeptide classes
to provide enzyme mixtures suitable for various uses.
[0207] In some embodiments, the enzyme mixture comprises other
types of cellulases, selected from but not limited to
cellobiohydrolase, endoglucanase, beta-glucosidase, and glycoside
hydrolase 61 protein (GH61) cellulases. These enzymes may be
wild-type or recombinant enzymes. In some embodiments, the
cellobiohydrolase is a type 1 cellobiohydrolase (e.g., a T. reesei
cellobiohydrolase I). In some embodiments, the endoglucanase
comprises a catalytic domain derived from the catalytic domain of a
Streptomyces avermitilis endoglucanase (See e.g., US Pat. Appln.
Pub. No. 2010/0267089; U.S. Pat. No. 8,206,960; and U.S. Pat. No.
8,088,608, each of which is incorporated herein by reference). In
some embodiments, at least one cellulase in the mixtures of the
present invention is derived from Acidothermus cellulolyticus,
Thermobifida fusca, Humicola grisea, Myceliophthora thermophila,
Chaetomium thermophilum, Acremonium sp., Thielavia sp., Trichoderma
reesei, Aspergillus sp., or a Chrysosporium sp. In some
embodiments, cellulase enzymes of the cellulase mixture work
together resulting in decrystallization and hydrolysis of the
cellulose from a biomass substrate to yield fermentable sugars,
such as but not limited to glucose.
[0208] Some cellulase mixtures for efficient enzymatic hydrolysis
of cellulose are known (See e.g., Viikari et al., Adv. Biochem.
Eng. Biotechnol., 108:121-45 [2007]; and US Pat. Appln. Publn. Nos.
US 2009/0061484, US 2008/0057541, and US 2009/0209009, each of
which is incorporated herein by reference in its entirety). In
some embodiments, mixtures of purified naturally occurring or
recombinant enzymes are combined with cellulosic feedstock or a
product of cellulose hydrolysis. Alternatively or in addition, one
or more cell populations, each producing one or more naturally
occurring or recombinant cellulase, are combined with cellulosic
feedstock or a product of cellulose hydrolysis.
[0209] In some embodiments, the enzyme mixture comprises
commercially available purified cellulases. Commercial cellulases
are known and available (e.g., C2730 cellulase from Trichoderma
reesei ATCC No. 25921 available from Sigma-Aldrich, Inc.). Any
suitable commercially available enzyme finds use in the present
invention.
[0210] In some embodiments, the enzyme mixture comprises at least
one isolated GH61 variant as provided herein and at least one or
more isolated enzymes, including but not limited to at least one
isolated CBH1a, isolated CBH2b, isolated endoglucanase (EG) (e.g.,
EG2 and/or EG1), and/or isolated beta-glucosidase (BGL). In some
embodiments, at least 5%, at least 10%, at least 15%, at least 20%,
at least 25%, at least 30%, at least 35%, at least 40%, at least
45%, or at least 50% of the enzyme mixture is GH61. In some
embodiments, the enzyme mixture further comprises a
cellobiohydrolase type 1a (e.g., CBH1a), and GH61, wherein the
enzymes together comprise at least 25%, at least 30%, at least 35%,
at least 40%, at least 45%, at least 50%, at least 55%, at least
60%, at least 65%, at least 70%, at least 75%, or at least 80% of
the enzyme mixture. In some embodiments, the enzyme mixture further
comprises a beta-glucosidase (BGL), GH61, and CBH, wherein the
three enzymes together comprise at least 30%, at least 35%, at
least 40%, at least 45%, at least 50%, at least 55%, at least 60%,
at least 65%, at least 70%, at least 75%, at least 80%, or at least
85% of the enzyme mixture. In some embodiments, the enzyme mixture
further comprises an endoglucanase (EG), GH61, CBH2b, CBH1a, BGL,
wherein the five enzymes together comprise at least 35%, at least
40%, at least 45%, at least 50%, at least 55%, at least 60%, at
least 65%, at least 70%, at least 75%, at least 80%, at least 85%,
or at least 90% of the enzyme mixture. In some embodiments, the
enzyme mixture comprises GH61, CBH2b, CBH1, BGL, and at least one
EG, in any suitable proportion for the desired reaction.
[0211] In some embodiments, the enzyme mixture composition
comprises isolated cellulases in the following proportions by
weight (wherein the total weight of the cellulases is 100%): about
20%-10% of GH61, about 20%-10% of BGL, about 30%-25% of CBH1a,
about 10%-30% of GH61, about 20%-10% of EG, and about 20%-25% of
CBH2b. In some embodiments, the enzyme mixture composition
comprises isolated cellulases in the following proportions by
weight: about 20%-10% of GH61, about 25%-15% of BGL, about 20%-30%
of CBH1a, about 10%-15% of EG, and about 25%-30% of CBH2b. In some
embodiments, the enzyme mixture composition comprises isolated
cellulases in the following proportions by weight: about 30%-20% of
GH61, about 15%-10% of BGL, about 25%-10% of CBH1a, about 25%-10%
of CBH2b, about 15%-10% of EG. In some embodiments, the enzyme
mixture composition comprises isolated cellulases in the following
proportions by weight: about 40%-30% of GH61, about 15%-10% of BGL,
about 20%-10% of CBH1a, about 20%-10% of CBH2b, and about 15%-10%
of EG.
[0212] In some embodiments, the enzyme mixture composition
comprises isolated cellulases in the following proportions by
weight: about 50%-40% of GH61, about 15%-10% of BGL, about 20%-5% of
CBH1a, about 15%-10% of CBH2b, and about 10%-5% of EG. However, in
some embodiments, the enzyme mixture composition comprises no EG
(e.g., EG2). In some embodiments, the enzyme mixture composition
comprises isolated cellulases in the following proportions by
weight: about 10%-15% of GH61, about 20%-25% of BGL, about 30%-20%
of CBH1a, about 15%-5% of EG, and about 25%-35% of CBH2b. In some
embodiments, the enzyme mixture composition comprises isolated
cellulases in the following proportions by weight: about 15%-5% of
GH61, about 15%-10% of BGL, about 45%-30% of CBH1a, about 25%-5% of
EG, and about 40%-10% of CBH2b. In some embodiments, the enzyme
mixture composition comprises isolated cellulases in the following
proportions by weight: about 10% of GH61, about 15% of BGL, about
40% of CBH1a, about 25% of EG, and about 10% of CBH2b.
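By way of illustration only, and without limiting the compositions described above, the arithmetic behind the last composition listed (about 10% GH61, 15% BGL, 40% CBH1a, 25% EG, and 10% CBH2b) can be sketched in Python. The function name `component_masses` and the 1.0 g total cellulase mass are hypothetical choices made for this sketch, and the nominal "about" percentages are treated as exact values, which is an assumption.

```python
# Nominal weight percentages from the last composition listed above.
# The text says "about"; treating them as exact values is an assumption.
MIXTURE_PCT = {"GH61": 10.0, "BGL": 15.0, "CBH1a": 40.0, "EG": 25.0, "CBH2b": 10.0}


def component_masses(pct_by_enzyme, total_mass_g, tol=1e-6):
    """Convert weight percentages to masses (g) for a given total cellulase mass.

    Raises ValueError if the percentages do not sum to 100 within `tol`.
    """
    total_pct = sum(pct_by_enzyme.values())
    if abs(total_pct - 100.0) > tol:
        raise ValueError(f"percentages sum to {total_pct}, expected 100")
    return {name: total_mass_g * pct / 100.0 for name, pct in pct_by_enzyme.items()}


if __name__ == "__main__":
    # Hypothetical 1.0 g of total cellulase, chosen only for illustration.
    for name, grams in component_masses(MIXTURE_PCT, 1.0).items():
        print(f"{name}: {grams:.2f} g")
```

The sum check reflects the statement that the total weight of the cellulases is taken as 100%; compositions given as ranges would first need a specific value chosen within each range.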
[0213] In some embodiments, the enzyme mixtures provided herein
further comprise at least one xylan-active enzyme and/or at least
one ester-active enzyme. In some embodiments, the enzyme mixture
compositions comprise about 0-25% xylanase (e.g., about 2%-5%,
about 1%-10%, about 10%-15%, about 15%-25%, about 1%, about 2%,
about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about
9%, about 10%, about 11%, about 12%, about 13%, about 14%, or about
15% xylanase) by weight. In some embodiments, the enzyme mixture
compositions comprise about 0-15% xylosidase (e.g., about 2%-5%,
about 1%-10%, about 10%-15%, about 1%, about 2%, about 3%, about
4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%,
about 11%, about 12%, about 13%, about 14%, or about 15%
xylosidase) by weight. In some embodiments, the enzyme mixture
compositions comprise about 0-15% esterase (e.g., about 2%-5%,
about 1%-10%, about 10%-15%, about 1%, about 2%, about 3%, about
4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%,
about 11%, about 12%, about 13%, about 14%, or about 15% esterase)
by weight. It is contemplated that any suitable combination of
enzymes and suitable enzyme concentrations will find use in the
present invention, as applied using various saccharification
reactions and conditions.
[0214] In some embodiments, the enzyme component comprises more
than one CBH1a, CBH2b, EG, BGL, and/or GH61 variant enzyme (e.g.,
2, 3 or 4 different enzymes), in any suitable combination. In some
embodiments, an enzyme mixture composition of the invention further
comprises at least one additional protein and/or enzyme. In some
embodiments, enzyme mixture compositions of the present invention
further comprise at least one additional enzyme other than at least
one GH61 variant, BGL, CBH1a, wild-type GH61, and/or CBH2b. In some
embodiments, the enzyme mixture compositions of the invention
further comprise at least one additional cellulase, other than at
least one GH61 variant, BGL, CBH1a, GH61, and/or CBH2b as described
herein. In some embodiments, the GH61 polypeptide variant of the
invention is also present in mixtures with non-cellulase enzymes
that degrade cellulose, hemicellulose, pectin, and/or
lignocellulose.
[0215] In some embodiments, the GH61 polypeptide variant of the present
invention is used in combination with other optional ingredients
such as at least one buffer, surfactant, and/or scouring agent. In
some embodiments, at least one buffer is used with the GH61
polypeptide variant of the present invention (optionally combined
with other enzymes) to maintain a desired pH within the solution in
which the GH61 variant is employed. The exact concentration of
buffer employed depends on several factors which the skilled
artisan can determine. Suitable buffers are well known in the art.
In some embodiments, at least one surfactant is used with the
GH61 variant of the present invention. Suitable surfactants include
any surfactant compatible with the GH61 variant and, optionally,
with any other enzymes being used in the mixture. Exemplary
surfactants include, but are not limited to anionic, non-ionic, and
ampholytic surfactants. Suitable anionic surfactants include, but
are not limited to, linear or branched alkylbenzenesulfonates;
alkyl or alkenyl ether sulfates having linear or branched alkyl
groups or alkenyl groups; alkyl or alkenyl sulfates;
olefinsulfonates; alkanesulfonates, and the like. Suitable counter
ions for anionic surfactants include, for example, alkali metal
ions, such as sodium and potassium; alkaline earth metal ions, such
as calcium and magnesium; ammonium ion; and alkanolamines having
from 1 to 3 alkanol groups of carbon number 2 or 3. Ampholytic
surfactants suitable for use in the practice of the present
invention include, for example, quaternary ammonium salt
sulfonates, betaine-type ampholytic surfactants, and the like.
Suitable nonionic surfactants generally include polyoxyalkylene
ethers, as well as higher fatty acid alkanolamides or alkylene
oxide adducts thereof, fatty acid glycerine monoesters, and the
like. Mixtures of surfactants also find use in the present
invention, as is known in the art.
Exemplary Mixtures of Cellulolytic Enzymes and Cofactors
[0216] As a further guide to the reader, yet without implying any
limitation in the practice of the present invention, exemplary
mixtures of components that may be used as catalysts in a
saccharification reaction to generate fermentable sugars from a
cellulosic substrate are provided herein. Concentrations are given
in wt/vol of each component in the final reaction volume with the
cellulose substrate. Also provided are percentages of each
component (wt/wt) in relation to the total mass of the components
that are listed for addition into each mixture (the "total
protein"). This may be a mixture of purified enzymes and/or enzymes
in a culture supernatant.
[0217] By way of example, the invention embodies mixtures that
comprise at least four, at least five, or all six of the following
components. In some embodiments, cellobiohydrolase 1 (CBH1) finds
use; in some embodiments CBH1 is present at a concentration of
about 0.14 to about 0.23 g/L (about 15% to about 25% of total
protein). Exemplary CBH1 enzymes include, but are not limited to T.
emersonii CBH1 (wild-type) (e.g., SEQ ID NO:125), M. thermophila
CBH1a (wild-type) (e.g., SEQ ID NO:128), and the variants CBH1a-983
(SEQ ID NO:134) and CBH1a-145 (SEQ ID NO:131). In some embodiments,
cellobiohydrolase 2 (CBH2) finds use; in some embodiments, CBH2 is
present at a concentration of about 0.14 to about 0.23 g/L (about
15% to about 25% of total protein). Exemplary CBH2 enzymes include
but are not limited to CBH2b from M. thermophila (wild-type) (e.g.,
SEQ ID NO:137). In some embodiments, endoglucanase 2 (EG2) finds
use; in some embodiments, EG2 is present at a concentration of 0 to
about 0.05 g/L (0 to about 5% of total protein). Exemplary EGs
include, but are not limited to M. thermophila EG2 (wild-type)
(e.g., SEQ ID NO:113). In some further embodiments, endoglucanase 1
(EG1) finds use; in some embodiments, EG1 is present at a
concentration of about 0.05 to about 0.14 g/L (about 5% to about
15% of total protein). Exemplary EG1s include, but are not limited
to M. thermophila EG1b (wild-type) (e.g., SEQ ID NO:110). In some
embodiments, beta-glucosidase (BGL) finds use in the present
invention; in some embodiments, BGL is present at a concentration
of about 0.05 to about 0.09 g/L (about 5% to about 10% of total
protein). Exemplary beta-glucosidases include, but are not limited
to M. thermophila BGL1 (wild-type) (e.g., SEQ ID NO:116), variant
BGL-900 (SEQ ID NO:122), and variant BGL-883 (SEQ ID NO:119). In
some further embodiments, GH61 protein and/or protein variants find
use; in some embodiments, GH61 enzymes are present at a
concentration of about 0.23 to about 0.33 g/L (about 25% to about
35% of total protein). Exemplary GH61s include, but are not limited
to M. thermophila GH61a wild-type (SEQ ID NO:2), Variant 1 (SEQ ID
NO:5), Variant 5 (SEQ ID NO:8) and/or Variant 9 (SEQ ID NO:11),
and/or any other GH61a variant proteins, as well as any of the
other GH61 enzymes (e.g., GH61b, GH61c, GH61d, GH61e, GH61f, GH61g,
GH61h, GH61i, GH61j, GH61k, GH61l, GH61m, GH61n, GH61o, GH61p,
GH61q, GH61r, GH61s, GH61t, GH61u, GH61v, GH61w, GH61x, and/or
GH61y) as provided herein.
[0218] In some embodiments, one, two or more than two enzymes are
present in the mixtures of the present invention. In some
embodiments, GH61p is present at a concentration of about 0.05 to
about 0.14 g/L (e.g., about 1% to about 15% of total protein).
Exemplary M. thermophila GH61p enzymes include those set forth in
SEQ ID NOS:70 and 73. In some embodiments, GH61f is present at a
concentration of about 0.05 to about 0.14 g/L (about 1% to about
15% of total protein). An exemplary M. thermophila GH61f is set
forth in SEQ ID NO:29. In some additional embodiments, at least one
additional GH61 enzyme provided herein (e.g., GH61b, GH61c, GH61d,
GH61e, GH61g, GH61h, GH61i, GH61j, GH61k, GH61l, GH61m, GH61n,
GH61o, GH61q, GH61r, GH61s, GH61t, GH61u, GH61v, GH61w,
GH61x, and/or GH61y) finds use at an appropriate concentration
(e.g., about 0.05 to about 0.14 g/L [about 1% to about 15% of total
protein]).
[0219] In some embodiments, at least one xylanase at a
concentration of about 0.05 to about 0.14 g/L (about 1% to about
15% of total protein) finds use in the present invention. Exemplary
xylanases include but are not limited to the M. thermophila
xylanase-3 (SEQ ID NO:149), xylanase-2 (SEQ ID NO:152), xylanase-1
(SEQ ID NO:155), xylanase-6 (SEQ ID NO:158), and xylanase-5 (SEQ ID
NO:161).
[0220] In some additional embodiments, at least one beta-xylosidase
at a concentration of about 0.05 to about 0.14 g/L (e.g., about 1%
to about 15% of total protein) finds use in the present invention.
Exemplary beta-xylosidases include but are not limited to the M.
thermophila beta-xylosidase (SEQ ID NO:164).
[0221] In still some additional embodiments, at least one acetyl
xylan esterase at a concentration of about 0.05 to about 0.14 g/L
(e.g., about 1% to about 15% of total protein) finds use in the
present invention. Exemplary acetylxylan esterases include but are
not limited to the M. thermophila acetylxylan esterase (SEQ ID
NO:167).
[0222] In some further additional embodiments, at least one ferulic
acid esterase at a concentration of about 0.05 to about 0.14 g/L
(e.g., about 1% to about 15% of total protein) finds use in the
present invention. Exemplary ferulic esterases include but are not
limited to the M. thermophila ferulic acid esterase (SEQ ID
NO:170).
[0223] In some embodiments, the enzyme mixtures comprise at least
one GH61 variant protein as provided herein and at least one
cellulase, including but not limited to any of the enzymes
described herein. In some embodiments, the enzyme mixtures comprise
at least one GH61 variant protein and at least one wild-type GH61
protein. In some embodiments, the enzyme mixtures comprise at least
one GH61 variant protein and at least one non-cellulase enzyme.
Indeed, it is intended that any combination of enzymes will find
use in the enzyme compositions comprising at least one GH61 variant
of the present invention.
[0224] The concentrations listed above are appropriate for a final
reaction volume with the biomass substrate in which the total of all
the components listed (the "total protein") is about 0.75 g/L, and the
amount of glucan is about 93 g/L, subject to routine optimization.
The user may empirically adjust the amount of each component and
total protein for cellulosic substrates that have different
characteristics and/or are processed at a different concentration.
Any one or more of the components may be supplemented or
substituted with variants with common structural and functional
characteristics, as described below.
[0225] Without implying any limitation, the following mixtures
further describe some embodiments of the present invention.
[0226] Some mixtures comprise CBH1a within a range of about 15% to
about 30% total protein, typically about 20% to about 25%; CBH2
within a range of about 15% to about 30%, typically about 17% to
about 22%; EG2 within a range of about 1% to about 10%, typically
about 2% to about 5%; BGL1 within a range of about 5% to about 15%,
typically about 8% to about 12%; GH61a within a range of about 10%
to about 40%, typically about 20% to about 30%; EG1b within a range
of about 5% to about 25%, typically about 10% to about 18%; and
GH61f within a range of 0% to about 30%, typically about 5% to
about 20%.
[0227] In some mixtures, exemplary BGL1s include the BGL1 variant
900 (SEQ ID NO:122) and/or variant 883 (SEQ ID NO:119). In some
embodiments, other enzymes are M. thermophila wild-type: CBH1a (SEQ
ID NO:128), CBH2b (SEQ ID NO:137), EG2 (SEQ ID NO:113), GH61a (SEQ
ID NO:2), EG1b (SEQ ID NO:110) and GH61f (SEQ ID NO:29). Any one or
more of the components may be supplemented or substituted with
variants having common structural and functional characteristics
with the component being substituted or supplemented, as described
below. In a saccharification reaction, the amount of glucan is
generally about 50 to about 300 g/L, typically about 75 to about
150 g/L. The total protein is about 0.1 to about 10 g/L, typically
about 0.5 to about 2 g/L, or about 0.75 g/L.
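As a rough illustration of how the percentage ranges of paragraph [0226] translate into enzyme loadings, the following Python sketch checks a candidate blend against those ranges and converts each share to g/L at the total protein figure of about 0.75 g/L. The particular blend, the function names, and the rounding tolerance are illustrative assumptions and are not values taken from this disclosure.

```python
# Allowed share of total protein (percent) for each component, per paragraph [0226].
ALLOWED_RANGES = {
    "CBH1a": (15.0, 30.0),
    "CBH2": (15.0, 30.0),
    "EG2": (1.0, 10.0),
    "BGL1": (5.0, 15.0),
    "GH61a": (10.0, 40.0),
    "EG1b": (5.0, 25.0),
    "GH61f": (0.0, 30.0),
}

# Hypothetical blend chosen only for illustration; it is not a recipe from the text.
EXAMPLE_BLEND = {
    "CBH1a": 22.0, "CBH2": 20.0, "EG2": 3.0, "BGL1": 10.0,
    "GH61a": 25.0, "EG1b": 12.0, "GH61f": 8.0,
}


def check_blend(blend, ranges=ALLOWED_RANGES, tol=1e-6):
    """Return a list of problems; an empty list means the blend fits the ranges."""
    problems = []
    for name, share in blend.items():
        low, high = ranges[name]
        if not (low <= share <= high):
            problems.append(f"{name}: {share}% outside {low}-{high}%")
    total = sum(blend.values())
    if abs(total - 100.0) > tol:
        problems.append(f"shares sum to {total}%, not 100%")
    return problems


def to_grams_per_liter(blend, total_protein_g_per_l=0.75):
    """Convert percent-of-total-protein shares to g/L in the reaction volume."""
    return {name: share / 100.0 * total_protein_g_per_l
            for name, share in blend.items()}


if __name__ == "__main__":
    print(check_blend(EXAMPLE_BLEND))
    for name, conc in to_grams_per_liter(EXAMPLE_BLEND).items():
        print(f"{name}: {conc:.4f} g/L")
```

A blend that passes this check is only consistent with the recited ranges; the amounts would still be subject to the routine optimization noted above.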
[0228] Some mixtures comprise CBH1 within a range of about 10% to
about 30%, typically about 15% to about 25%; CBH2b within a range
of about 10% to about 25%, typically about 15% to about 20%; EG2
within a range of about 1% to about 10%, typically about 2% to
about 5%; EG1b within a range of about 2% to about 25%, typically
about 6% to about 14%; GH61a within a range of about 5% to about
50%, typically about 10% to about 35%; and BGL1 within a range of
about 2% to about 15%, typically about 5% to about 12%. Also
included is copper sulfate to generate a final concentration of
Cu++ of about 4 µM to about 200 µM, typically about 25
µM to about 60 µM. However, it is not intended that the added
copper be limited to any particular concentration, as any suitable
concentration finds use in the present invention and will be
determined based on the reaction conditions.
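For illustration only, the following Python sketch converts a target copper ion concentration into a mass of copper sulfate to weigh out. The text specifies only "copper sulfate", so the choice between the anhydrous salt and the pentahydrate is an assumption, as are the function and variable names; the molar masses are approximate.

```python
# Approximate molar masses in g/mol. Which form of the salt is weighed out is an
# assumption; the text only specifies "copper sulfate".
MOLAR_MASS_G_PER_MOL = {
    "CuSO4": 159.61,       # anhydrous
    "CuSO4.5H2O": 249.69,  # pentahydrate
}


def copper_sulfate_mass_mg(target_um, volume_l, salt="CuSO4.5H2O"):
    """Mass of salt (mg) giving `target_um` micromolar copper in `volume_l` liters.

    Each formula unit supplies one copper ion, so moles of salt equal moles
    of copper.
    """
    moles = target_um * 1e-6 * volume_l
    return moles * MOLAR_MASS_G_PER_MOL[salt] * 1000.0


if __name__ == "__main__":
    # 50 micromolar in 1 liter, weighed as the pentahydrate.
    print(f"{copper_sulfate_mass_mg(50.0, 1.0):.2f} mg")
```

The calculation covers only the added copper and ignores any copper already carried in with the enzymes or the biomass.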
[0229] In an additional mixture, an exemplary CBH1 is wild-type
CBH1 from T. emersonii (SEQ ID NO:125), as well as wild-type M.
thermophila CBH1a (SEQ ID NO:128), Variant 983 (SEQ ID NO:134), and
Variant 145 (SEQ ID NO:131); exemplary CBH2 enzymes include the
wild-type (SEQ ID NO:137), Variant 962 (SEQ ID NO:146), Variant 196
(SEQ ID NO:140), and Variant 287 (SEQ ID NO:143); an exemplary EG2
is the wild-type M. thermophila (SEQ ID NO:113); an exemplary EG1b
is the wild-type (SEQ ID NO: 110); exemplary GH61a enzymes include
wild-type M. thermophila (SEQ ID NO:2), Variant 1 (SEQ ID NO:5),
Variant 5 (SEQ ID NO:11), and Variant 9 (SEQ ID NO:11); and
exemplary BGLs include wild-type M. thermophila BGL (SEQ ID
NO:116), Variant 883 (SEQ ID NO:119), and Variant 900 (SEQ ID
NO:122). Any one or more of the components may be supplemented or
substituted with other variants having common structural and
functional characteristics with the component being substituted or
supplemented, as described below. In a saccharification reaction,
the amount of glucan is generally about 50 to about 300 g/L,
typically about 75 to about 150 g/L. The total protein is about 0.1
to about 10 g/L, typically about 0.5 to about 2 g/L, or about 0.75
g/L.
[0230] Any or all of the components listed in the mixtures referred
to above may be supplemented or substituted with variant proteins
that are structurally and functionally related, as described
herein.
[0231] In some embodiments, the CBH1 cellobiohydrolase used in
mixtures of the present invention comprises a sequence that is at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% identical to either SEQ ID NO:128 (M.
thermophila), SEQ ID NO:125 (T. emersonii), or a fragment of either
SEQ ID NO:128 or SEQ ID NO:125 having cellobiohydrolase activity,
as well as variants of M. thermophila CBH1a (e.g., SEQ ID NO:131
and/or SEQ ID NO:133), and variant fragment(s) having
cellobiohydrolase activity. Exemplary CBH1 enzymes include, but are
not limited to those described in US Pat. Appln. Publn. No.
2012/0003703 A1, which is hereby incorporated herein by reference
in its entirety for all purposes.
[0232] In some embodiments, the CBH2b cellobiohydrolase used in the
mixtures of the present invention comprises a sequence that is at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% identical to SEQ ID NO:127 or a fragment
of SEQ ID NO:127, as well as at least one variant M. thermophila
CBH2b enzyme (e.g., SEQ ID NO:140, 143, and/or 146) and/or variant
fragment(s) having cellobiohydrolase activity. Exemplary CBH2b
enzymes are described in U.S. Patent Appln. Ser. No. 61/479,800 and
Ser. No. 13/459,038, both of which are hereby incorporated herein
by reference in their entirety for all purposes.
[0233] In some embodiments, the EG2 endoglucanase used in the
mixtures of the present invention comprises a sequence that is at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% identical to SEQ ID NO:113 or a fragment
of SEQ ID NO:113 having endoglucanase activity. Exemplary EG2
enzymes are described in U.S. patent application Ser. No.
13/332,114, and WO 2012/088159, both of which are hereby
incorporated herein by reference in their entirety for all
purposes.
[0234] In some embodiments, the EG1b endoglucanase used in the
mixtures of the present invention comprises a sequence that is at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% identical to SEQ ID NO:110 or a fragment
of SEQ ID NO:110 having endoglucanase activity.
[0235] In some embodiments, the BGL1 beta-glucosidase used in the
mixtures of the present invention comprises a sequence that is at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% identical to SEQ ID NOS:116, 119, and/or
122, or a fragment of SEQ ID NOS:116, 119, and/or 122 having
beta-glucosidase activity. Exemplary BGL1 enzymes include, but are
not limited to those described in US Pat. Appln. Publ. No.
2011/0129881, WO 2011/041594, and US Pat. Appln. Publ. No.
2011/0124058 A1, all of which are hereby incorporated herein by
reference in their entireties for all purposes.
[0236] In some embodiments, the GH61f protein used in the mixtures
of the present invention comprises a sequence that is at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, at least
about 99%, or 100% identical to SEQ ID NO:29, or a fragment of SEQ
ID NO:29 having GH61 activity, assayed as described elsewhere in
this disclosure.
[0237] In some embodiments, the GH61p protein used in the mixtures
of the present invention comprises a sequence that is at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, at least
about 99%, or 100% identical to SEQ ID NO:70, SEQ ID NO:73, or a
fragment of such sequence having GH61p activity.
[0238] In some embodiments, the xylanase used in the mixtures of
the present invention comprises a sequence that is at least about 80%, at least about
85%, at least about 90%, at least about 91%, at least about 92%, at
least about 93%, at least about 94%, at least about 95%, at least
about 96%, at least about 97%, at least about 98%, at least about
99%, or 100% identical to SEQ ID NO:149, SEQ ID NO:151, or a
fragment of such sequence having xylanase activity.
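The percent identity thresholds recited in the preceding paragraphs can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length by some alignment tool, with "-" marking gaps, and it counts identities over all alignment columns. That denominator is one convention among several and is an assumption here, not necessarily the method of this disclosure; the function names are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Identical residues are counted over all alignment columns, so gapped
    columns lower the score. Other conventions use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold_percent=80.0):
    """True if the pair is at least `threshold_percent` identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not sequences from the listing.
    print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
    print(meets_threshold("ACD-F", "ACDEF", threshold_percent=90.0))
```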
GH61 Activity Assays
[0239] The cellulase enhancing activity of GH61 proteins of the
invention can be determined using any suitable GH61 activity assay.
For example, in some embodiments, a purified and/or recombinant
GH61 protein of this invention is obtained, and then assayed for
GH61 activity by combining it with cellulase enzymes in a
saccharification reaction, and determining if there is an increase
in glucose yield, as compared to the same saccharification reaction
conducted without the GH61.
[0240] In one approach, GH61 activity can be assayed by combining a
cellulosic substrate with cellulase enzymes (e.g., 5-10 mg total
weight of cellulase enzymes per gram of substrate) in the presence
and absence of GH61 protein. In some embodiments, the cellulase
enzymes comprise a defined set of recombinant cellulase enzymes
from M. thermophila.
[0241] In another approach, broth from a culture of wild-type M.
thermophila is used (with and without supplementation with GH61
protein and/or GH61 variants). GH61 activity is evidenced by
enhanced glucose yield in the presence of exogenous GH61 (i.e.,
beyond any enhancement resulting from endogenous GH61 in the
broth). It is also possible to use a broth supplemented with one or
more purified enzymes.
[0242] Suitable enzymes include isolated recombinant enzymes cloned
from M. thermophila, including but not limited to EG, BGL, CBH1,
and/or CBH2, in any combination suitable for the chosen substrate
to yield a measurable product.
[0243] In one exemplary assay for measuring GH61 activity from M.
thermophila derived GH61 proteins and variant proteins, the
cellulase enzymes used are M. thermophila BGL1 (e.g., SEQ ID
NOS:116, 119, and/or 122; See e.g., Badhan et al., Biores.
Technol., 98:504-10 [2007]); M. thermophila CBH1 (SEQ ID NOS:128,
131, and/or 134); and M. thermophila CBH2 (SEQ ID NOS:137, 140, 143,
and/or 146). In some embodiments, endoglucanase is also used, such
as M. thermophila EG2 (SEQ ID NO:113; See e.g., Rosgaard et al.,
Biotechnol. Prog., 22:493-8 [2006]; and Badhan et al., supra).
[0244] Alternatively, commercially available preparations
comprising a mixture of cellulase enzymes may be used, such as
Laminex™ and Spezyme™ (Genencor), Rohament™ (Rohm GmbH),
and Celluzyme™, Cereflo™ and Ultraflo™ (Novozymes).
[0245] Assays with cellulase enzymes are typically done at
50 °C, but in some embodiments, other temperatures find use
(e.g., 35, 45, 55, 60, or 65 °C). In some embodiments, the
GH61 enzymes and any other desired enzymes are combined with the
substrate and incubated so as to produce fermentable sugars. The
sugars are then recovered and quantitated for yield of glucose. One
suitable substrate is wheat straw (e.g., pre-treated wheat straw).
Other cellulosic substrates listed in this disclosure may be used
as an alternative, including corn stover pretreated with sulfuric
acid (See e.g., U.S. Pat. No. 7,868,227). Assay methods are known
in the art. For example, the method of Harris et al., (Harris et
al., Biochem., 49:3305-3316 [2010], incorporated herein by
reference) finds use. In this method, corn stover is pretreated
with sulfuric acid, washed, and incubated with cellulase enzymes and
GH61 for several days, and the yield of sugars is then quantitated by
refraction. Another method is described in U.S. Pat. No. 7,868,227
(incorporated herein by reference). In this method, the cellulosic
substrate is PCS (corn stover pretreated with heat and dilute
sulfuric acid, as described in WO 2005/074647), and the cellulase
enzyme mixture is Celluclast®, a blend of cellulase enzymes from
the fungus Trichoderma reesei (Sigma-Aldrich). Hydrolysis of PCS is
conducted in a total reaction volume of 1.0 mL and a PCS
concentration of 50 mg/mL in 1 mM manganese sulfate, 50 mM sodium
acetate buffer pH 5.0. The test protein is combined with the base
cellulase mixture at relative concentrations between 0 and 100%
total protein. The protein composition is incubated with the PCS at
65 °C for 7 days. The combined yield of glucose and
cellobiose is measured by refractive index detection.
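The reaction set-up described above can be sketched in Python as follows. The volume, PCS concentration, and buffer concentrations are those stated in the text. The total protein load per reaction is not given here, so it is left as a hypothetical input, and the sketch assumes that the test protein replaces part of the base cellulase mixture at a fixed total protein; that reading and the function names are assumptions.

```python
def pcs_reaction_amounts(volume_ml=1.0, pcs_mg_per_ml=50.0,
                         mnso4_mm=1.0, acetate_mm=50.0):
    """Amounts of each component for one hydrolysis reaction.

    mM x mL gives micromoles, so the salt and buffer amounts are in umol.
    """
    return {
        "pcs_mg": pcs_mg_per_ml * volume_ml,
        "mnso4_umol": mnso4_mm * volume_ml,
        "acetate_umol": acetate_mm * volume_ml,
    }


def protein_split(total_protein_mg, test_fraction_percent):
    """Split a fixed protein load between test protein and base cellulase mix."""
    if not 0.0 <= test_fraction_percent <= 100.0:
        raise ValueError("test fraction must be between 0 and 100 percent")
    test_mg = total_protein_mg * test_fraction_percent / 100.0
    return {
        "test_protein_mg": test_mg,
        "base_cellulase_mg": total_protein_mg - test_mg,
    }


if __name__ == "__main__":
    print(pcs_reaction_amounts())
    # Hypothetical 2.0 mg total protein with 25% supplied as test protein.
    print(protein_split(2.0, 25.0))
```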
[0246] GH61 activity is calculated as an increase in glucose
production from the substrate by the cellulase(s) in the presence
of GH61 protein, in comparison with the same reaction mixture in
the absence of GH61 protein. Typically, the increase is
dose-dependent within at least a 3-fold range of concentrations.
GH61 activity can be expressed as a degree of "synergy".
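One way to put the comparison above into numbers is sketched below in Python. The first function reports the increase in glucose relative to the reaction without GH61; the second checks that the yield rises with GH61 dose over at least a 3-fold range of doses. The function names and the strict-increase test are illustrative assumptions, and the sketch does not reproduce any particular definition of "synergy" used elsewhere in this disclosure.

```python
def gh61_enhancement(glucose_with_gh61, glucose_without_gh61):
    """Absolute and percent increase in glucose attributable to GH61."""
    if glucose_without_gh61 <= 0:
        raise ValueError("control glucose yield must be positive")
    increase = glucose_with_gh61 - glucose_without_gh61
    return increase, 100.0 * increase / glucose_without_gh61


def is_dose_dependent(doses, glucose_yields, min_fold_range=3.0):
    """Check that yield rises with dose across at least a 3-fold dose range."""
    pairs = sorted(zip(doses, glucose_yields))
    sorted_doses = [d for d, _ in pairs]
    sorted_yields = [y for _, y in pairs]
    if sorted_doses[0] <= 0:
        raise ValueError("doses must be positive")
    spans_range = sorted_doses[-1] / sorted_doses[0] >= min_fold_range
    rising = all(b > a for a, b in zip(sorted_yields, sorted_yields[1:]))
    return spans_range and rising


if __name__ == "__main__":
    # Hypothetical glucose yields in g/L, for illustration only.
    print(gh61_enhancement(60.0, 50.0))
    print(is_dose_dependent([1.0, 2.0, 4.0], [52.0, 55.0, 60.0]))
```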
Use of GH61 Variant Protein to Promote Saccharification
[0247] The GH61 variant proteins of the present invention can be
used industrially to promote or otherwise modulate the activity of
cellulase enzymes.
[0248] In some embodiments, suitably prepared lignocellulose is
subjected to enzymatic hydrolysis using one or more cellulase
enzymes in the presence of one or more GH61 variant proteins or
preparations according to this invention. Thus, in some
embodiments, saccharification reactions are carried out by exposing
biomass to GH61 variant protein and cellulases, which work in
concert to break down the biomass. Typically, the cellulases
include at least one endoglucanase (EG), at least one
beta-glucosidase (BGL), at least one Type 1 cellobiohydrolase
(CBH1), and/or at least one Type 2 cellobiohydrolase (CBH2). In
some alternative embodiments, a minimum enzyme mixture is used, for
example, comprising GH61 protein in combination with BGL and either
CBH1 or CBH2, or both, but with substantially no EG.
[0249] Hydrolysis of the hemicellulose and cellulose components of
a lignocellulosic feedstock yields a lignocellulosic hydrolysate
comprising xylose and glucose. Other sugars typically present
include galactose, mannose, arabinose, fucose, rhamnose, or a
combination thereof. Regardless of the means of hydrolyzing the
lignocellulosic feedstock (e.g., full acid hydrolysis or chemical
pretreatment with or without subsequent enzymatic hydrolysis), the
xylose and glucose generally make up a large proportion of the
sugars present. In some embodiments, if the lignocellulosic
hydrolysate is a hemicellulose hydrolysate resulting from acid
pretreatment, xylose will likely be the predominant sugar and
lesser amounts of glucose will be present. The relative amount of
xylose present in the lignocellulosic hydrolysate will depend on
the feedstock and the pretreatment that is employed.
[0250] The cells and compositions of the present invention
(including culture broth and/or cell lysates) find use in the
production of fermentable sugars from cellulosic biomass. The
biomass substrate may be converted to a fermentable sugar by (a)
optionally pretreating a cellulosic substrate to increase its
susceptibility to hydrolysis; and (b) contacting the optionally
pretreated cellulosic substrate of step (a) with a composition,
culture medium or cell lysate containing at least one GH61 variant
and any additional cellulases under conditions suitable for the
production of cellobiose and fermentable sugars such as
glucose.
[0251] In some embodiments, each of the at least one GH61 variant
and additional cellulase enzymes described herein are partially or
substantially purified, and the purified proteins are added to the
biomass. Alternatively or in addition, the various individual
enzymes are recombinantly expressed in different cells, and the
media containing the secreted proteins are added to the biomass.
The GH61 variant protein(s) and cellulase enzymes are then reacted
with the biomass at a suitable temperature for a suitable
period.
[0252] In some embodiments, sugars produced by methods of this
invention are used to produce an end product such as an alcohol,
such as ethanol. Other end-products may be produced, such as
acetone, amino acid(s) (e.g., glycine, or lysine), organic acids
(e.g., lactic acid, acetic acid, formic acid, citric acid, oxalic
acid, or uric acid), glycerol, diols (e.g., 1,3 propanediol or
butanediol), or at least one hydrocarbon with 1 to 20 carbon atoms.
In some embodiments, cellulosic biomass is treated with at least
one composition of the present invention to prepare an animal
feed.
[0253] In some embodiments, when GH61 protein (e.g., at least one
GH61 variant) is used to increase the yield of fermentable sugars
in a saccharification reaction, at least one divalent metal cation
or additional cofactor or adjunct compound is added to the reaction
at a concentration of about 1 to 100 µM. In some embodiments, the
divalent metal cation (e.g., copper) is included at a concentration
of about 1 to 90 µM, about 10 to 80 µM, about 15 to 75 µM, about 20
to 70 µM, about 30 to 60 µM, about 40 to 50 µM, about 5 to 10 µM,
about 10 to 20 µM, about 15 to 25 µM, about 20 to 30 µM, about
25 to 35 µM, about 30 to 40 µM, about 35 to 45 µM, about 40 to 50
µM, about 45 to 55 µM, about 50 to 60 µM, about 55 to 65 µM, about
60 to 70 µM, about 65 to 75 µM, about 70 to 80 µM, about 75 to 85
µM, about 80 to 90 µM, about 85 to 95 µM, about 90 to 100 µM, about
95 to 100 µM, or about 1 µM, about 2 µM, about 3 µM, about 4 µM,
about 5 µM, about 6 µM, about 7 µM, about 8 µM, about 9 µM, about
10 µM, about 11 µM, about 12 µM, about 13 µM, about 14 µM, about 15
µM, about 16 µM, about 17 µM, about 18 µM, about 19 µM, about 20
µM, about 25 µM, about 30 µM, about 35 µM, about 40 µM, about 45
µM, about 50 µM, about 55 µM, about 60 µM, about 65 µM, about 70
µM, about 75 µM, about 80 µM, about 85 µM, about 90 µM, about 95
µM, or about 100 µM. Divalent cations present in the reaction
include, but are not limited to Cu++, Mn++, Co++,
Mg++, Ni++, Zn++, and Ca++ at concentrations of
0.001 to 50 mM, 1 µM to 1 mM, or 10-50 µM. Indeed, it is not
intended that the concentration of divalent metal cation(s) be
limited to any particular value, as any suitable concentration
finds use in the present invention and will depend upon the
reaction conditions, as known in the art.
Fermentation of Sugars
[0254] In some embodiments, once a suitable cellulosic biomass
substrate has been treated with cellulase(s) and at least one GH61
variant protein(s) according to this invention, sugars and other
components in the product are fermented to produce various
fermentation end products, including but not limited to biofuels,
such as ethanol or alcohol mixtures. Depending on the substrate
used, other components (e.g., long-chain esters) may also be
present.
[0255] Fermentation is the process of extracting energy from the
oxidation of organic compounds, such as carbohydrates, using an
endogenous electron acceptor. Alcoholic fermentation is a process
in which sugars such as xylulose, glucose, fructose, and sucrose
are converted into a fermentation end product, including but not
limited to biofuel. For example, the fermentation product may
comprise alcohol (such as ethanol or butanol) and/or a sugar
alcohol, such as xylitol.
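As a hedged illustration of the alcoholic fermentation described above, the Python sketch below computes the stoichiometric upper bound on ethanol from glucose and xylose, assuming the overall reactions C6H12O6 -> 2 C2H5OH + 2 CO2 and 3 C5H10O5 -> 5 C2H5OH + 5 CO2. Actual yields are lower because some sugar goes to cell mass and by-products. The function names are hypothetical and the molar masses are approximate.

```python
# Approximate molar masses in g/mol.
M_GLUCOSE = 180.16  # C6H12O6
M_XYLOSE = 150.13   # C5H10O5
M_ETHANOL = 46.07   # C2H5OH

# Stoichiometry: C6H12O6 -> 2 C2H5OH + 2 CO2; 3 C5H10O5 -> 5 C2H5OH + 5 CO2.
YIELD_G_PER_G = {
    "glucose": 2 * M_ETHANOL / M_GLUCOSE,
    "xylose": 5 * M_ETHANOL / (3 * M_XYLOSE),
}


def theoretical_ethanol_g(sugars_g):
    """Maximum ethanol mass (g) from a dict mapping sugar name to mass (g)."""
    return sum(YIELD_G_PER_G[name] * mass for name, mass in sugars_g.items())


def percent_of_theoretical(ethanol_g, sugars_g):
    """Measured ethanol as a percentage of the stoichiometric maximum."""
    return 100.0 * ethanol_g / theoretical_ethanol_g(sugars_g)


if __name__ == "__main__":
    # Hypothetical hydrolysate: 100 g glucose and 40 g xylose.
    sugars = {"glucose": 100.0, "xylose": 40.0}
    print(f"{theoretical_ethanol_g(sugars):.1f} g ethanol at most")
```

Both sugars give a ceiling of roughly 0.51 g ethanol per g sugar, which is why the two entries in the table are nearly identical.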
[0256] In some embodiments, enzyme compositions comprising at least
one GH61 variant of the present invention are reacted with a biomass
substrate in the range of about 25 °C to 100 °C,
about 30 °C to 90 °C, about 30 °C to
80 °C, or about 30 °C to 70 °C. In some
embodiments, the biomass is reacted with the enzyme compositions at
about 25 °C, at about 30 °C, at about 35 °C,
at about 40 °C, at about 45 °C, at about
50 °C, at about 55 °C, at about 60 °C, at
about 65 °C, at about 70 °C, at about 75 °C,
at about 80 °C, at about 85 °C, at about
90 °C, at about 95 °C, or at about 100 °C.
In general, the pH range is from about pH 3.0 to 8.5, pH 3.5 to
8.5, pH 4.0 to 7.5, pH 4.0 to 7.0, or pH 4.0 to 6.5. The incubation
time may vary, for example, from 1.0 to 240 hrs, from 5.0 to 180
hrs, or from 10.0 to 150 hrs. For example, the incubation time is
generally at least 1 hr, at least 5 hrs, at least 10 hrs, at least
15 hrs, at least 25 hrs, at least 50 hrs, at least 100 hrs, at least
180 hrs, or longer. Incubation of the cellulase under these conditions
and subsequent contact with the substrate may result in the release
of substantial amounts of fermentable sugars from the substrate
(e.g., glucose when the cellulase is combined with
beta-glucosidase). For example at least 20%, at least 30%, at least
40%, at least 50%, at least 60%, at least 70%, at least 80%, at
least 90% or more fermentable sugar may be available as compared to
the release of sugar by a wild-type polypeptide.
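The broadest temperature, pH, and incubation time ranges recited in the paragraph above can be captured in a small Python sketch that flags a proposed set of reaction conditions falling outside them. The class and field names are illustrative assumptions, and passing the check says nothing about whether the conditions are optimal for a given enzyme mixture.

```python
from dataclasses import dataclass

# Broadest ranges recited in the paragraph above.
TEMP_RANGE_C = (25.0, 100.0)
PH_RANGE = (3.0, 8.5)
TIME_RANGE_H = (1.0, 240.0)


@dataclass(frozen=True)
class ReactionConditions:
    temperature_c: float
    ph: float
    incubation_h: float

    def out_of_range(self):
        """Names of parameters that fall outside the broadest recited ranges."""
        checks = {
            "temperature_c": (self.temperature_c, TEMP_RANGE_C),
            "ph": (self.ph, PH_RANGE),
            "incubation_h": (self.incubation_h, TIME_RANGE_H),
        }
        return [name for name, (value, (low, high)) in checks.items()
                if not low <= value <= high]


if __name__ == "__main__":
    # Hypothetical conditions for illustration only.
    print(ReactionConditions(55.0, 5.0, 72.0).out_of_range())
    print(ReactionConditions(110.0, 9.0, 72.0).out_of_range())
```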
[0257] Any suitable micro-organism finds use in converting sugar in
the sugar hydrolysate to ethanol or other fermentation products.
These include yeast from the genera Saccharomyces, Hansenula,
Pichia, Kluyveromyces, and Candida. Commercially available yeasts
also find use, including but not limited to ETHANOLRED®,
SAFDISTIL®, THERMOSACC®, FERMIOL®, FERMIVIN®, or
Superstart™.
[0258] In some embodiments, the yeast is genetically engineered to
ferment both hexose and pentose sugars to at least one end-product,
including but not limited to ethanol. Alternatively, in some
embodiments, the yeast is a strain that has been made capable of
xylose and glucose fermentation by one or more non-recombinant
methods, such as adaptive evolution or random mutagenesis and
selection. For example, in some embodiments, the fermentation is
performed with recombinant Saccharomyces. In some embodiments, the
recombinant yeast is a strain that has been made capable of xylose
fermentation by recombinant incorporation of genes encoding xylose
reductase (XR) and xylitol dehydrogenase (XDH) (See e.g., U.S. Pat.
Nos. 5,789,210, 5,866,382, 6,582,944 and 7,527,927; and EP 450 530)
and/or gene(s) encoding one or more xylose isomerase (XI) (See
e.g., U.S. Pat. Nos. 6,475,768 and 7,622,284). In some additional
embodiments, the modified yeast strain overexpresses an endogenous
and/or heterologous gene encoding xylulokinase (XK). Other yeast
can ferment hexose and pentose sugars to at least one end-product,
including but not limited to ethanol, such as yeast of the genera
Hansenula, Pichia, Kluyveromyces and Candida (See e.g., WO
2008/130603).
[0259] A typical temperature range for the fermentation of xylose
to ethanol using Saccharomyces spp. is from about 25 °C
to about 37 °C, although the temperature may be higher (up
to 55 °C) if the yeast is naturally or genetically modified
to be thermostable. The pH of a typical fermentation employing
Saccharomyces spp. is between about 3 and about 6, depending on the
pH optimum of the fermentation microorganism. The sugar hydrolysate
may also be supplemented with additional nutrients required for
growth and fermentation performance of the fermentation
microorganism. Examples include yeast extract, specific amino acids,
phosphate, nitrogen sources, salts, trace elements, and vitamins
(See e.g., Verduyn et al., Yeast 8:501-517 [1992]; Jorgensen, Appl.
Biochem. Biotechnol., 153:44-57 [2009]; and Zhao et al., J.
Biotechnol., 139:55-60 [2009]). In some embodiments, the
fermentation is conducted under anaerobic conditions, although
aerobic or microaerobic conditions also find use.
Use of Copper, Gallic Acid, and Biomass Pretreatment Filtrate to
Enhance GH61 Activity
[0260] In some embodiments, GH61 proteins and variants exhibit
increased activity in a saccharification reaction when Cu++,
gallic acid, and/or pretreatment filtrate are added. In some
embodiments, wild-type GH61a (SEQ ID NO:2) and/or Variant 1 (SEQ ID
NO:5) are used. Similarly, in some embodiments, the present
invention encompasses the supplemental addition of Cu++,
gallic acid, and/or pretreatment filtrate as an enhancing agent in
saccharification reactions conducted using any of the GH61a
variants shown in Tables 1 and 2, any of the other GH61 proteins
described herein, and any active variant or fragment thereof such
as may be obtained using any suitable method, including but not
limited to the methods provided herein. In some embodiments,
enhancing GH61 activity allows saccharification reactions to
proceed more quickly and/or with less GH61 or cellulase enzyme.
[0261] In some embodiments, Cu++, gallic acid, and other
potential cofactors are tested by titrating into a saccharification
reaction comprising a GH61 protein, one or more cellulase enzymes
(e.g., CBH1, CBH2, and/or BGL), and a cellulosic substrate, and
measuring the relative rate of glucose production. Controls may
include the combination of GH61 protein, cellulase enzymes, and
substrate in the absence of the putative cofactor (to test the
relative enhancement), and combinations of cellulase enzymes and
substrate with or without cofactor in the absence of GH61 protein
(to determine the effect of the putative cofactor on other enzymes
in the reaction).
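The titration and its two kinds of control can be laid out as a simple experimental design, sketched here in Python. Each cofactor level is run with and without GH61, and a zero-cofactor level is added so that the relative enhancement can be measured. The function name, the dictionary keys, and the example concentrations are illustrative assumptions.

```python
from itertools import product


def titration_design(cofactor_levels_um, include_zero=True):
    """Enumerate reactions for a cofactor titration with both control types.

    Each cofactor level is run with and without GH61. The zero-cofactor,
    with-GH61 reaction measures relative enhancement; the without-GH61
    reactions show any effect of the cofactor on the cellulases alone.
    """
    levels = sorted(set(cofactor_levels_um))
    if include_zero and 0.0 not in levels:
        levels = [0.0] + levels
    return [
        {"gh61": gh61, "cofactor_um": level}
        for gh61, level in product((True, False), levels)
    ]


if __name__ == "__main__":
    # Hypothetical cofactor levels in micromolar.
    for reaction in titration_design([10.0, 50.0]):
        print(reaction)
```

Cellulase enzymes and substrate are held constant across all rows, so only the GH61 and cofactor columns vary.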
[0262] As shown herein, in some embodiments, Cu++ can enhance
the activity of GH61a Variant 1 (SEQ ID NO:5). The source of
Cu++ used in the example was CuSO4, although any
effective copper source can be used as an alternative. Effective
supplemental copper sources include copper salts and metallic
copper, or mixtures thereof. Copper salts include copper(II)
(Cu++) salts and copper(I) (Cu+) salts. Copper in
metallic copper(0) and copper(I) salts can be oxidized to Cu++
in water by oxygen (e.g., by oxygen present in air). Suitable
copper(II) and copper(I) salts include sulfates, chlorides, oxides,
hydroxides, nitrates, carbonates, hydroxycarbonates (basic
carbonates), oxychlorides, and acetates. Suitable sources of
metallic copper include metallic copper refined from copper ores,
including copper vessels and piping in contact with water and
oxygen (e.g., in air).
[0263] In some embodiments, as shown herein, gallic acid and/or
pretreated biomass filtrate can also be used to enhance the
activity of GH61 protein. In some embodiments, the gallic acid
and/or pretreated biomass filtrate are titrated to the optimal dose
for the reaction conditions used. Thus, an effective concentration
of gallic acid can be determined empirically by titrating it into
the reaction mixture, depending on the enzymes being used and the
total biomass. In some embodiments, in which gallic acid is
utilized, an effective concentration of gallic acid is within the
range of about 0.1 to 20 mM, about 0.5 to 5 mM, or about 1 to 2 mM.
However, it is not intended that the present invention be limited
to any particular concentration of gallic acid, as any suitable
concentration finds use in the present invention, depending upon
the reaction conditions.
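An empirical titration of gallic acid across the broadest range given above (about 0.1 to 20 mM) might be planned as in the following Python sketch, which spaces the test concentrations by a constant ratio and converts each to a mass of anhydrous gallic acid. The number of points, the use of the anhydrous form (about 170.12 g/mol), and the function names are assumptions made for illustration.

```python
def geometric_series(low, high, points):
    """`points` concentrations spaced by a constant ratio from low to high."""
    if points < 2 or low <= 0 or high <= low:
        raise ValueError("need points >= 2 and 0 < low < high")
    ratio = (high / low) ** (1.0 / (points - 1))
    return [low * ratio ** i for i in range(points)]


GALLIC_ACID_G_PER_MOL = 170.12  # anhydrous gallic acid, C7H6O5 (approximate)


def gallic_acid_mg(conc_mm, volume_ml):
    """Mass of anhydrous gallic acid (mg) for `conc_mm` mM in `volume_ml` mL."""
    # mM x mL = micromoles; micromoles x g/mol = micrograms; /1000 gives mg.
    return conc_mm * volume_ml * GALLIC_ACID_G_PER_MOL / 1000.0


if __name__ == "__main__":
    for conc in geometric_series(0.1, 20.0, 5):
        print(f"{conc:7.3f} mM -> {gallic_acid_mg(conc, 10.0):.3f} mg per 10 mL")
```

Ratio spacing is used because the recited range spans more than two orders of magnitude; evenly spaced points would leave the low end of the range almost unsampled.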
[0264] A cofactor of GH61, such as Cu++, is said to be
"supplemented" in a reaction volume if it has been added into the reaction
volume as a separate reagent, in addition to any metal
ions that may be bound to GH61 or other reactants beforehand.
Depending on the amount or molar ratio of cofactors such as
Cu++ already present in a GH61 preparation, addition of such
cofactors into the reaction may increase the amount of glucose
produced per weight of GH61 by 25%, 50%, 2-fold, or more.
[0265] Effective concentrations of supplemented Cu++ in the
reaction volume may be readily determined empirically as described
herein. Depending on reaction conditions, effective supplemented
concentrations include but are not limited to 1 µM to 200 µM,
4 µM to 100 µM, 10 µM to 100 µM, or at least 1 µM, 4
µM, 10 µM, 20 µM, 30 µM, 40 µM, or 50 µM in the
reaction volume (i.e., the concentration of supplemented copper in
the reaction volume). However, it is not intended that the present
invention be limited to any particular copper concentration or
range of concentrations, as any suitable concentration finds use
and will depend upon the reaction conditions used. In some
embodiments, prior to or without copper supplementation, copper is
present in the GH61 protein preparation, the other enzymes, the
cellulase fermentation production media, the pretreated biomass,
and/or any other component of the reaction volume (i.e., in some
embodiments, there are other sources of copper present in the
reaction than any copper added to the reaction as a supplement).
Thus, in some embodiments, the reaction is conducted without the
supplemental addition of copper as described herein.
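Where copper is supplemented from a concentrated stock solution, the volume to add follows from C1*V1 = C2*V2. The Python sketch below is a minimal example of that calculation; the stock concentration and the function name are hypothetical, and the result refers only to supplemented copper, not to copper already present in the enzyme preparations or the pretreated biomass.

```python
def stock_volume_ul(target_um, reaction_volume_ml, stock_mm):
    """Microliters of copper stock to reach `target_um` in the final volume.

    Uses C1*V1 = C2*V2, where V2 is the final reaction volume including the
    added stock.
    """
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    stock_um = stock_mm * 1000.0
    if target_um >= stock_um:
        raise ValueError("target must be below the stock concentration")
    return target_um * reaction_volume_ml * 1000.0 / stock_um


if __name__ == "__main__":
    # Hypothetical 10 mM stock; 50 micromolar target in a 1.0 mL reaction.
    print(f"{stock_volume_ul(50.0, 1.0, 10.0):.1f} uL")
```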
[0266] In some embodiments, with inclusion of copper and/or gallic acid
in the reaction mixture at an effective concentration or ratio,
less GH61 protein is needed to produce the same amount of
fermentable sugars from the same cellulase enzymes. In some
embodiments, this provides a cost reduction associated with
saccharification reactions.
Vectors, Promoters, Other Expression Elements, Host Cells, and
Signal Peptides.
[0267] There are numerous general texts that describe molecular
biological techniques including the use of vectors, promoters, in
vitro amplification methods including the polymerase chain reaction
(PCR) and the ligase chain reaction (LCR) (See e.g., Berger and
Kimmel, Guide to Molecular Cloning Techniques, Methods in
Enzymology, Volume 152 Academic Press, Inc., San Diego, Calif.
(Berger); Sambrook et al., Molecular Cloning--A Laboratory Manual
(2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y., 1989; and Current Protocols in Molecular Biology, F.
M. Ausubel et al., eds., Current Protocols [as supplemented through
2009]). Introduction of a vector or a DNA construct into a host
cell can be effected by any suitable method, including but not
limited to calcium phosphate transfection, DEAE-Dextran mediated
transfection, electroporation, or other common techniques (See
Davis et al., 1986, Basic Methods in Molecular Biology). General
references on cell culture techniques and nutrient media for fungal
host cells include Gene Manipulations in Fungi, Bennett, J. W. et
al., Ed., Academic Press, 1985; More Gene Manipulations in Fungi,
Bennett, J. W. et al., Ed., Academic Press, 1991; and The Handbook
of Microbiological Media, CRC Press, Boca Raton, Fla., 1993.
Vectors
[0268] The present invention makes use of recombinant constructs
comprising at least one sequence encoding at least one GH61 variant
as described above. In some embodiments, the present invention
provides expression vectors comprising at least one GH61 variant
polynucleotide operably linked to a heterologous promoter.
Expression vectors of the present invention may be used to
transform an appropriate host cell to permit the host to express
the GH61 variant protein. Methods for recombinant expression of
proteins in fungi and other organisms are well known in the art,
and a number of expression vectors are available or can be constructed
using routine methods (See, e.g., Tkacz and Lange, 2004, Advances
in fungal biotechnology for industry, agriculture, and medicine,
Kluwer Academic/Plenum Publishers, New York; Zhu et al., Plasmid
6:128-33 [2009]; and Kavanagh, K. 2005, Fungi: biology and
applications, Wiley, all of which are incorporated herein by
reference).
[0269] Nucleic acid constructs of the present invention comprise a
vector, such as, a plasmid, a cosmid, a phage, a virus, a bacterial
artificial chromosome (BAC), a yeast artificial chromosome (YAC),
and the like, into which a nucleic acid sequence of the invention
has been inserted. Polynucleotides of the present invention can be
incorporated into any one of a variety of expression vectors
suitable for expressing a polypeptide. Suitable vectors include,
but are not limited to chromosomal, nonchromosomal and synthetic
DNA sequences (e.g., derivatives of SV40); bacterial plasmids;
phage DNA; baculovirus; yeast plasmids; vectors derived from
combinations of plasmids and phage DNA, viral DNA such as vaccinia,
adenovirus, fowl pox virus, pseudorabies, adeno-associated virus,
retroviruses and many others. Any vector
that transduces genetic material into a cell, and, if replication
is desired, which is replicable and viable in the relevant host can
be used.
[0270] In some embodiments, the construct further comprises
regulatory sequences, including, for example, a promoter, operably
linked to the protein encoding sequence. Large numbers of suitable
vectors and promoters are known to those of skill in the art.
Promoters
[0271] In order to obtain high levels of expression in a particular
host it is often useful to express the GH61 variant of the present
invention under the control of a heterologous promoter. A promoter
sequence may be operably linked to the 5' region of the GH61
variant coding sequence using routine methods.
[0272] Examples of useful promoters for expression of GH61 enzymes
include promoters from fungi. In some embodiments, a promoter
sequence that drives expression of a gene other than a GH61 gene in
a fungal strain may be used. As a non-limiting example, a fungal
promoter from a gene encoding an endoglucanase may be used. In some
embodiments, a promoter sequence that drives the expression of a
GH61 gene in a fungal strain other than the fungal strain from
which the GH61 variant was derived may be used. As a non-limiting
example, if the GH61 variant is derived from C1, a promoter from a
T. reesei GH61 gene may be used or a promoter as described in WO
2010/107303, such as but not limited to the sequences identified as
SEQ ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, or SEQ ID
NO:29 in WO 2010/107303.
[0273] Examples of other suitable promoters useful for directing
the transcription of the nucleotide constructs of the present
invention in a filamentous fungal host cell are promoters obtained
from the genes for Aspergillus oryzae TAKA amylase, Rhizomucor
miehei aspartic proteinase, Aspergillus niger neutral
alpha-amylase, Aspergillus niger acid stable alpha-amylase,
Aspergillus niger or Aspergillus awamori glucoamylase (glaA),
Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease,
Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans
acetamidase, and Fusarium oxysporum trypsin-like protease (WO
96/00787, which is incorporated herein by reference), as well as
the NA2-tpi promoter (a hybrid of the promoters from the genes for
Aspergillus niger neutral alpha-amylase and Aspergillus oryzae
triose phosphate isomerase), promoters such as cbh1, cbh2, egl1,
egl2, pepA, hfb1, hfb2, xyn1, amy, and glaA (Nunberg et al., Mol.
Cell Biol., 4:2306-2315 [1984]; Boel et al., EMBO J, 3:1581-85
[1984]; and European Pat. Publ. 137280, all of which are
incorporated herein by reference), and mutant, truncated, and
hybrid promoters thereof. In a yeast host, useful promoters can be
from the genes for Saccharomyces cerevisiae enolase (eno-1),
Saccharomyces cerevisiae galactokinase (gal1), Saccharomyces
cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate
dehydrogenase (ADH2/GAP), and S. cerevisiae 3-phosphoglycerate
kinase. Other useful promoters for yeast host cells are known (See
e.g., Romanos et al., Yeast 8:423-488 [1992], incorporated herein
by reference). Promoters associated with chitinase production in
fungi may be used (See, e.g., Blaiseau and Lafay, Gene 120:243-248
[1992] (filamentous fungus Aphanocladium album); Limon et al.,
Curr. Genet., 28:478-83 (Trichoderma harzianum), both of which are
incorporated herein by reference).
[0274] Promoters known to control expression of genes in
prokaryotic or eukaryotic cells or their viruses and which can be
used in some embodiments of the invention include SV40 promoter, E.
coli lac or trp promoter, phage lambda P_L promoter, tac
promoter, T7 promoter, and the like. In bacterial host cells,
suitable promoters include the promoters obtained from the E. coli
lac operon, Streptomyces coelicolor agarase gene (dagA), Bacillus
subtilis levansucrase gene (sacB), Bacillus licheniformis
α-amylase gene (amyL), Bacillus stearothermophilus maltogenic
amylase gene (amyM), Bacillus amyloliquefaciens α-amylase
gene (amyQ), Bacillus subtilis xylA and xylB genes and prokaryotic
beta-lactamase gene.
[0275] Any other promoter sequence that drives expression in a
suitable host cell may be used. Suitable promoter sequences can be
identified using well known methods. In one approach, a putative
promoter sequence is linked 5' to a sequence encoding a reporter
protein, the construct is transfected into the host cell (e.g., M.
thermophila) and the level of expression of the reporter is
measured. Expression of the reporter can be determined by
measuring, for example, mRNA levels of the reporter sequence, an
enzymatic activity of the reporter protein, or the amount of
reporter protein produced. For example, promoter activity may be
determined by using the green fluorescent protein as coding
sequence (See e.g., Henriksen et al., Microbiol., 145:729-34 [1999],
incorporated herein by reference) or a lacZ reporter gene (Punt et
al., Gene, 197:189-93 [1997], incorporated herein by reference).
Functional promoters may be derived from naturally occurring
promoter sequences by directed evolution methods (See, e.g., Wright
et al., Human Gene Therapy, 16:881-892 [2005], incorporated herein
by reference).
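By way of illustration only, the reporter-based comparison described above can be sketched numerically. In the following Python sketch the function name relative_promoter_activity, the normalization of reporter signal to culture density, and all numerical values are hypothetical assumptions and are not data from this disclosure.

```python
def relative_promoter_activity(reporter_signal, culture_density,
                               ref_signal, ref_density):
    """Return density-normalized reporter output relative to a reference promoter.

    All inputs are hypothetical measurements (for example, GFP fluorescence
    and optical density). The normalization scheme is an assumption.
    """
    if culture_density <= 0 or ref_density <= 0:
        raise ValueError("culture densities must be positive")
    if ref_signal <= 0:
        raise ValueError("reference signal must be positive")
    candidate = reporter_signal / culture_density
    reference = ref_signal / ref_density
    return candidate / reference


# Hypothetical readings: candidate promoter 1200 units, reference 400 units,
# both measured at a culture density of 2.0.
print(relative_promoter_activity(1200.0, 2.0, 400.0, 2.0))  # 3.0
```

A value above 1 would suggest that the candidate promoter drives more reporter expression than the reference under the assumed conditions.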
[0276] Additional promoters include those from M. thermophila,
provided in U.S. Prov. Patent Appln. Ser. Nos. 61/375,702,
61/375,745, 61/375,753, 61/375,755, and 61/375,760, all of which
were filed on Aug. 20, 2010, and are hereby incorporated by
reference in their entireties, as well as WO 2010/107303.
Other Expression Elements
[0277] Cloned GH61 variants may also have a suitable transcription
terminator sequence, a sequence recognized by a host cell to
terminate transcription. The terminator sequence is operably linked
to the 3' terminus of the nucleic acid sequence encoding the
polypeptide. Any terminator that is functional in the host cell of
choice may be used in the present invention.
[0278] For example, exemplary transcription terminators for
filamentous fungal host cells can be obtained from the genes for
Aspergillus oryzae TAKA amylase, Aspergillus niger glucoamylase,
Aspergillus nidulans anthranilate synthase, Aspergillus niger
alpha-glucosidase, and Fusarium oxysporum trypsin-like protease.
Suitable transcription terminators are known in the art (See e.g.,
U.S. Pat. No. 7,399,627, incorporated herein by reference).
[0279] Exemplary terminators for yeast host cells include those
obtained from the genes for Saccharomyces cerevisiae enolase,
Saccharomyces cerevisiae cytochrome C (CYC1), and Saccharomyces
cerevisiae glyceraldehyde-3-phosphate dehydrogenase. Other useful
terminators for yeast host cells are known in the art (See e.g.,
Romanos et al., Yeast 8:423-88 [1992]).
[0280] A suitable leader sequence may be part of a cloned GH61
variant sequence, which is a nontranslated region of an mRNA that
is important for translation by the host cell. The leader sequence
is operably linked to the 5' terminus of the nucleic acid sequence
encoding the polypeptide. Any leader sequence that is functional in
the host cell of choice may be used. Exemplary leaders for
filamentous fungal host cells are obtained from the genes for
Aspergillus oryzae TAKA amylase and Aspergillus nidulans triose
phosphate isomerase. Suitable leaders for yeast host cells are
obtained from the genes for Saccharomyces cerevisiae enolase
(ENO-1), Saccharomyces cerevisiae 3-phosphoglycerate kinase,
Saccharomyces cerevisiae alpha-factor, and Saccharomyces cerevisiae
alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase
(ADH2/GAP).
[0281] In some embodiments, sequences also contain a
polyadenylation sequence, which is a sequence operably linked to
the 3' terminus of the nucleic acid sequence and which, when
transcribed, is recognized by the host cell as a signal to add
polyadenosine residues to transcribed mRNA. Any polyadenylation
sequence which is functional in the host cell of choice may be used
in the present invention. Exemplary polyadenylation sequences for
filamentous fungal host cells can be from the genes for Aspergillus
oryzae TAKA amylase, Aspergillus niger glucoamylase, Aspergillus
nidulans anthranilate synthase, Fusarium oxysporum trypsin-like
protease, and Aspergillus niger alpha-glucosidase. Useful
polyadenylation sequences for yeast host cells are known in the art
(See e.g., Guo and Sherman, Mol. Cell. Biol., 15:5983-5990
[1995]).
[0282] The expression vector of the present invention optionally
contains one or more selectable markers, which facilitate easy
selection of transformed cells. A selectable marker is typically a
gene, the product of which provides for biocide or viral
resistance, resistance to heavy metals, prototrophy to auxotrophs,
and the like. Selectable markers for use in a filamentous fungal
host cell include, but are not limited to, amdS (acetamidase), argB
(ornithine carbamoyltransferase), bar (phosphinothricin
acetyltransferase), hph (hygromycin phosphotransferase), niaD
(nitrate reductase), pyrG (orotidine-5'-phosphate decarboxylase),
sC (sulfate adenyltransferase), and trpC (anthranilate synthase),
as well as equivalents thereof. Embodiments for use in an
Aspergillus cell include the amdS and pyrG genes of Aspergillus
nidulans or Aspergillus oryzae and the bar gene of Streptomyces
hygroscopicus. Suitable markers for yeast host cells include but
are not limited to ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and
URA3.
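By way of illustration only, the expression elements described above can be collected in a simple data structure. The following Python sketch is a hypothetical bookkeeping aid: the class name ExpressionCassette, its field names, and the particular combination of elements in the example are assumptions chosen from the lists above, not a construct described in this disclosure. The terminator and the polyadenylation sequence are grouped together as 3' elements without asserting their relative order.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ExpressionCassette:
    """Hypothetical container for the expression elements named in the text."""
    promoter: str
    coding_sequence: str
    leader: Optional[str] = None
    signal_peptide: Optional[str] = None
    terminator: Optional[str] = None
    polyadenylation: Optional[str] = None
    selectable_marker: Optional[str] = None

    def layout(self) -> Dict[str, List[str]]:
        """Group the elements that are present by position relative to the coding sequence."""
        def present(*items: Optional[str]) -> List[str]:
            return [item for item in items if item is not None]

        return {
            "5' of coding sequence": present(self.promoter, self.leader),
            "coding region": present(self.signal_peptide, self.coding_sequence),
            "3' of coding sequence": present(self.terminator, self.polyadenylation),
            "elsewhere on vector": present(self.selectable_marker),
        }


# Hypothetical combination for a filamentous fungal host.
cassette = ExpressionCassette(
    promoter="A. oryzae TAKA amylase promoter",
    coding_sequence="GH61a variant",
    signal_peptide="TrCBH2 signal peptide",
    terminator="A. niger glucoamylase terminator",
    selectable_marker="amdS",
)
for region, elements in cassette.layout().items():
    print(region, "->", elements)
```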
Host Cells
[0283] In some embodiments, at least one GH61 variant protein of
the present invention is expressed from a nucleic acid that has
been recombinantly introduced into a suitable host cell line. In
some embodiments, the host cell also expresses other proteins of
interest, particularly one or more cellulase enzymes that work in
concert with at least one GH61 variant protein in the process of
saccharification. The cellulase enzymes may be constitutively
expressed by the parent strain of the host cell, or they may be
expressed from other recombinant nucleic acids that were introduced
serially or simultaneously with the GH61 variant encoding
sequence.
[0284] Rather than expressing at least one GH61 variant protein and
at least one additional cellulase enzyme in the same cell, in some
embodiments, the invention is practiced by producing at least one
GH61 variant protein in one host cell, and producing one or more
cellulases together in another host cell, or in a plurality of host
cells. Once such cells have been engineered, cells expressing GH61
protein and cells expressing cellulase enzymes can be combined and
cultured together to produce compositions of this invention
containing both GH61 variant proteins and other cellulase enzymes.
Alternatively, the culture supernatant or broth from each cell line
can be collected separately, optionally fractionated to enrich for
the respective activities, and then mixed together to produce the
desired combination.
[0285] Suitable fungal host cells include, but are not limited to
Ascomycota, Basidiomycota, Deuteromycota, Zygomycota, and Fungi
imperfecti. In some embodiments, preferred fungal host cells are
yeast cells, and filamentous fungal cells, including all
filamentous forms of the subdivision Eumycotina and Oomycota.
Filamentous fungi are characterized by a vegetative mycelium with a
cell wall composed of chitin, cellulose and other complex
polysaccharides, and are morphologically distinct from yeast. In
some embodiments, Trichoderma is a source of one or more cellulases
for use in combination with GH61 variant proteins.
[0286] Any suitable host cell finds use in the present invention,
including but not limited to host cells that are species of Achlya,
Acremonium, Aspergillus, Aureobasidium, Azospirillum, Bjerkandera,
Cellulomonas, Cephalosporium, Ceriporiopsis, Chrysosporium,
Clostridium, Coccidioides, Cochliobolus, Coprinus, Coriolus,
Corynascus, Cryphonectria, Cryptococcus, Dictyostelium, Diplodia,
Elizabethkingia, Endothia, Erwinia, Escherichia, Fusarium,
Gibberella, Gliocladium, Gluconacetobacter, Humicola, Hypocrea,
Kuraishia, Mucor, Myceliophthora, Neurospora, Nicotiana,
Paenibacillus, Penicillium, Periconia, Phaeosphaeria, Phlebia,
Piromyces, Podospora, Prevotella, Pyricularia, Rhizobium,
Rhizomucor, Rhizopus, Ruminococcus, Saccharomycopsis, Salmonella,
Schizophyllum, Scytalidium, Septoria, Sporotrichum, Streptomyces,
Talaromyces, Thermoanaerobacter, Thermoascus, Thermotoga,
Thielavia, Tolypocladium, Trametes, Trichoderma, Tropaeolum,
Uromyces, Verticillium, Volvariella, Wickerhamomyces, or
corresponding teleomorphs, or anamorphs, and synonyms or taxonomic
equivalents thereof.
[0287] An exemplary host cell is yeast, including but not limited
to Candida, Hansenula, Saccharomyces, Schizosaccharomyces, Pichia,
Kluyveromyces, or Yarrowia. In some embodiments, the yeast cell is
Hansenula polymorpha, Saccharomyces cerevisiae, Saccharomyces
carlsbergensis, Saccharomyces diastaticus, Saccharomyces norbensis,
Saccharomyces kluyveri, Schizosaccharomyces pombe, Pichia pastoris,
Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia
membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia
salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis,
Pichia methanolica, Pichia angusta, Kluyveromyces lactis, Candida
albicans, or Yarrowia lipolytica.
[0288] Another exemplary host cell is a Myceliophthora species,
such as M. thermophila. As used herein, the term "C1" refers to
Myceliophthora thermophila, including a fungal strain described by
Garg (See, Garg, Mycopathol., 30: 3-4 [1966]). As used herein,
"Chrysosporium lucknowense" includes the strains described in U.S.
Pat. Nos. 6,015,707, 5,811,381 and 6,573,086; US Pat. Pub. Nos.
2007/0238155, US 2008/0194005, US 2009/0099079; International Pat.
Pub. Nos., WO 2008/073914 and WO 98/15633, all of which are
incorporated herein by reference, and include, without limitation,
Chrysosporium lucknowense Garg 27K, VKM-F 3500 D (Accession No. VKM
F-3500-D), C1 strain UV13-6 (Accession No. VKM F-3632 D), C1 strain
NG7C-19 (Accession No. VKM F-3633 D), and C1 strain UV18-25 (VKM
F-3631 D), all of which have been deposited at the All-Russian
Collection of Microorganisms of Russian Academy of Sciences (VKM),
Bakhurhina St. 8, Moscow, Russia, 113184, and any derivatives
thereof. Although initially described as Chrysosporium lucknowense,
C1 may currently be considered a strain of Myceliophthora
thermophila. Other C1 strains include cells deposited under
accession numbers ATCC 44006, CBS (Centraalbureau voor
Schimmelcultures) 122188, CBS 251.72, CBS 143.77, CBS 272.77,
CBS122190, CBS122189, and VKM F-3500D. Exemplary C1 derivatives
include modified organisms in which one or more endogenous genes or
sequences have been deleted or modified and/or one or more
heterologous genes or sequences have been introduced. Derivatives
include, but are not limited to UV18#100f Δalp1, UV18#100f
Δpyr5 Δalp1, UV18#100.f Δalp1 Δpep4
Δalp2, UV18#100.f Δpyr5 Δalp1 Δpep4
Δalp2 and UV18#100.f Δpyr4 Δpyr5 Δalp1
Δpep4 Δalp2, as described in WO2008073914 and
WO2010107303, each of which is incorporated herein by
reference.
[0289] In some embodiments, the host cell is a Trichoderma species,
such as T. longibrachiatum, T. viride, Hypocrea jecorina or T.
reesei, T. koningii, and T. harzianum.
[0290] In some embodiments, the host cell is an Aspergillus species,
such as A. awamori, A. fumigatus, A. japonicus, A. nidulans, A.
niger, A. aculeatus, A. foetidus, A. oryzae, A. sojae, and A.
kawachii.
[0291] In some additional embodiments, the host cell is a Fusarium
species, such as F. bactridioides, F. cerealis, F. crookwellense,
F. culmorum, F. graminearum, F. graminum, F. oxysporum, F. roseum,
and F. venenatum.
[0292] The host cell may also be a Neurospora species, such as N.
crassa. Alternatively, the host cell is a Humicola species, such as
H. insolens, H. grisea, and H. lanuginosa. Alternatively, the host
cell is a Mucor species, such as M. miehei and M. circinelloides.
Alternatively, the host cell is a Rhizopus species, such as R.
oryzae and R. niveus. Alternatively, the host cell is a Penicillium
species, such as P. purpurogenum, P. chrysogenum, and P.
verruculosum.
[0293] In some embodiments, the host cell is a Thielavia species,
such as T. terrestris. Alternatively, the host cell is a
Tolypocladium species, such as T. inflatum and T. geodes.
Alternatively, the host cell is a Trametes species, such as T.
villosa and T. versicolor.
[0294] In some embodiments, the host cell is of a Chrysosporium
species, such as C. lucknowense, C. keratinophilum, C. tropicum, C.
merdarium, C. inops, C. pannicola, and C. zonatum. In a particular
embodiment the host is C. lucknowense. Alternatively, the host cell
is an alga such as Chlamydomonas (e.g., C. reinhardtii) or
Phormidium (P. sp. ATCC29409).
[0295] In some alternative embodiments, the host cell is a
prokaryotic cell. Suitable prokaryotic cells include Gram-positive,
Gram-negative and Gram-variable bacterial cells. Examples of
bacterial host cells include, but are not limited to Bacillus
(e.g., B. subtilis, B. licheniformis, B. megaterium, B.
stearothermophilus and B. amyloliquefaciens), Streptomyces (e.g.,
S. ambofaciens, S. achromogenes, S. avermitilis, S. coelicolor, S.
aureofaciens, S. aureus, S. fungicidicus, S. griseus, and S.
lividans), and Streptococcus (e.g., S. equisimilis, S. pyogenes,
and S. uberis) species.
[0296] Any suitable eukaryotic or prokaryotic species finds use as
host cells, including but not limited to Aspergillus aculeatus,
Azospirillum irakense KBC1, Bacillus sp. GL1, Cellulomonas
biazotea, Clostridium thermocellum, Thermoanaerobacter brockii,
Coccidioides posadasii, Dictyostelium discoideum, Elizabethkingia
meningoseptica, Erwinia chrysanthemi, Escherichia coli,
Gluconacetobacter xylinus, Hypocrea jecorina, Kuraishia capsulata,
Nicotiana tabacum, Paenibacillus sp. C7, Penicillium brasilianum,
Periconia sp. BCC 2871, Phaeosphaeria avenaria, Prevotella
albensis, Rhizobium leguminosarum, Rhizomucor miehei, Ruminococcus
albus, Saccharomycopsis fibuligera, Salmonella typhimurium,
Septoria lycopersici, Streptomyces coelicolor, Talaromyces
emersonii, Thermotoga maritima, Tropaeolum majus, Uromyces
viciae-fabae, and Wickerhamomyces anomalus.
[0297] Strains that may be used in the practice of the invention
(both prokaryotic and eukaryotic strains) may be obtained from any
suitable source, including but not limited to the American Type
Culture Collection (ATCC), or other biological depositories such as
Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM),
Centraalbureau Voor Schimmelcultures (CBS), and the Agricultural
Research Service Patent Culture Collection, Northern Regional
Research Center (NRRL).
[0298] In some embodiments, host cells are genetically modified to
have characteristics that improve genetic manipulation, protein
secretion, protein stability or other properties desirable for
expression or secretion of a protein. For example, knock-out of
Alp1 function results in a cell that is protease deficient.
Knock-out of pyr5 function results in a cell with a pyrimidine
deficient phenotype. Host cells may be modified to delete
endogenous cellulase protein-encoding sequences or otherwise
eliminate expression of one or more endogenous cellulases.
Expression of one or more unwanted endogenous cellulases may be
inhibited to increase the proportion of cellulases of interest, for
example, by chemical or UV mutagenesis and subsequent selection.
Homologous recombination can be used to induce targeted gene
modifications by specifically targeting a gene in vivo to suppress
expression of the encoded protein.
Signal Peptides
[0299] In general, polypeptides are secreted from the host cell
after being expressed as a pre-protein including a signal peptide
(i.e., an amino acid sequence linked to the amino terminus of a
polypeptide which directs the encoded polypeptide into the cell's
secretory pathway).
[0300] In some embodiments, the secreted part of a GH61 variant is
linked at the N-terminal to a heterologous signal peptide,
depending on the host cell and other factors. Effective signal
peptide coding regions for filamentous fungal host cells include
but are not limited to signal peptide coding regions obtained from
Aspergillus oryzae TAKA amylase, Aspergillus niger neutral amylase,
Aspergillus niger glucoamylase, Rhizomucor miehei aspartic
proteinase, Humicola insolens cellulase, Humicola lanuginosa
lipase, and T. reesei cellobiohydrolase II (TrCBH2).
[0301] Effective signal peptide coding regions for bacterial host
cells include but are not limited to signal peptide coding regions
obtained from the genes for Bacillus NCIB 11837 maltogenic amylase,
Bacillus stearothermophilus alpha-amylase, Bacillus licheniformis
subtilisin, Bacillus licheniformis beta-lactamase, Bacillus
stearothermophilus neutral proteases (nprT, nprS, nprM), and
Bacillus subtilis prsA. Further signal peptides are known in the
art (See e.g., Simonen and Palva, Microbiol. Rev.,
57:109-137 [1993]).
[0302] Useful signal peptides for yeast host cells also include
those from the genes for Saccharomyces cerevisiae alpha-factor,
Saccharomyces cerevisiae SUC2 invertase (see Taussig and Carlson,
Nucl. Acids Res., 11:1943-54 [1983]; SwissProt Accession No.
P00724; and Romanos et al., Yeast 8:423-488 [1992]). Variants of
these signal peptides and other signal peptides are suitable. In
addition, the signal peptides provided herein find use in the
present invention.
EXPERIMENTAL
[0303] The present invention is described in further detail in the
following Examples, which are not in any way intended to limit the
scope of the invention as claimed.
[0304] In the experimental disclosure below, the following
abbreviations apply: ppm (parts per million); M (molar); mM
(millimolar); uM and μM (micromolar); nM (nanomolar); mol
(moles); gm and g (gram); mg (milligrams); ug and μg
(micrograms); L and l (liter); ml and mL (milliliter); cm
(centimeters); mm (millimeters); um and μm (micrometers); sec.
(seconds); min(s) (minute(s)); h(s) and hr(s) (hour(s)); U (units);
MW (molecular weight); rpm (rotations per minute); °C
(degrees Centigrade); DNA (deoxyribonucleic acid); RNA (ribonucleic
acid); HPLC (high pressure liquid chromatography); MES
(2-N-morpholino ethanesulfonic acid); FIOPC (fold improvements over
positive control); YPD (10 g/L yeast extract, 20 g/L peptone, and
20 g/L dextrose); SOE-PCR (splicing by overlapping extension PCR);
PEG (polyethylene glycol); TWEEN®-20 (TWEEN® non-ionic
surfactant; Sigma-Aldrich); ARS (ARS Culture Collection or NRRL
Culture Collection, Peoria, Ill.); Axygen (Axygen, Inc., Union
City, Calif.); Lallemand (Lallemand Ethanol Technology, Milwaukee,
Wis.); Dual Biosystems (Dual Biosystems AG, Schlieven,
Switzerland); US Biological (United States Biological, Swampscott,
Mass.); Megazyme (Megazyme International Ireland, Ltd., Wicklow,
Ireland); Genetix (Genetix USA, Inc., Beaverton, Oreg.);
Sigma-Aldrich (Sigma-Aldrich, St. Louis, Mo.); Dasgip (Dasgip
Biotools, LLC, Shrewsbury, Mass.); Difco (Difco Laboratories, BD
Diagnostic Systems, Detroit, Mich.); PCRdiagnostics
(PCRdiagnostics, by E coli SRO, Slovak Republic); Agilent (Agilent
Technologies, Inc., Santa Clara, Calif.); Molecular Devices
(Molecular Devices, Sunnyvale, Calif.); Symbio (Symbio, Inc., Menlo
Park, Calif.); Newport (Newport Scientific, Australia); and Bio-Rad
(Bio-Rad Laboratories, Hercules, Calif.).
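By way of illustration only, the YPD composition given in the abbreviations above (10 g/L yeast extract, 20 g/L peptone, and 20 g/L dextrose) can be scaled to a chosen batch volume. The following Python sketch is a convenience calculation; the function name ypd_masses and the 0.5 L batch size are hypothetical.

```python
# Concentrations in g/L, taken from the YPD definition in the abbreviations list.
YPD_G_PER_L = {"yeast extract": 10.0, "peptone": 20.0, "dextrose": 20.0}


def ypd_masses(volume_l):
    """Return the mass in grams of each YPD component for a batch of volume_l liters."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: conc * volume_l for name, conc in YPD_G_PER_L.items()}


# Hypothetical 0.5 L batch.
print(ypd_masses(0.5))
# {'yeast extract': 5.0, 'peptone': 10.0, 'dextrose': 10.0}
```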
[0305] The M. thermophila strains used in the development of
the present invention included "Strain CF-400" (Δcdh1),
which is a derivative of C1 strain
("UV18#100fΔalp1Δpyr5"), modified by deletion of cdh1,
wherein cdh1 comprises the polynucleotide sequence of SEQ ID NO:5
of U.S. Pat. No. 8,236,551. "Strain CF-401"
(Δcdh1Δcdh2) (ATCC No. PTA-12255) is a derivative of
the C1 strain modified by deletion of both a cdh1 and a cdh2,
wherein cdh2 comprises the polynucleotide sequence of SEQ ID NO:7
of U.S. Pat. No. 8,236,551. "Strain CF-402" (+Bgl1) is a derivative
of the C1 strain further modified for overexpression of an
endogenous beta-glucosidase 1 enzyme (Bgl1). "Strain CF-403" is a
derivative of the C1 strain modified with a deletion of cdh1 and
further modified to overexpress bgl1. "Strain CF-404" is a
derivative of the C1 strain further modified to overexpress bgl1
with a deletion of both cdh1 and cdh2. "Strain CF-416" is a
derivative of the CF-404 strain, further modified to overexpress
wild-type GH61a enzyme.
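By way of illustration only, the strain descriptions above can be summarized as a lookup table. The following Python sketch is a hypothetical bookkeeping aid: the dictionary layout and the function name genotype are assumptions, a parent of None means the strain is described as derived directly from the C1 strain, background modifications of the C1 strain itself are not tracked, and Strain CF-416 is assumed to inherit the modifications of Strain CF-404 because it is described as a derivative of that strain.

```python
# Modifications per strain, transcribed from the paragraph above.
STRAINS = {
    "CF-400": {"parent": None, "deleted": {"cdh1"}, "overexpressed": set()},
    "CF-401": {"parent": None, "deleted": {"cdh1", "cdh2"}, "overexpressed": set()},
    "CF-402": {"parent": None, "deleted": set(), "overexpressed": {"bgl1"}},
    "CF-403": {"parent": None, "deleted": {"cdh1"}, "overexpressed": {"bgl1"}},
    "CF-404": {"parent": None, "deleted": {"cdh1", "cdh2"}, "overexpressed": {"bgl1"}},
    "CF-416": {"parent": "CF-404", "deleted": set(), "overexpressed": {"GH61a"}},
}


def genotype(name):
    """Return (deleted, overexpressed) gene sets, including inherited modifications."""
    deleted, overexpressed = set(), set()
    while name is not None:
        record = STRAINS[name]
        deleted |= record["deleted"]
        overexpressed |= record["overexpressed"]
        name = record["parent"]
    return deleted, overexpressed


deleted, overexpressed = genotype("CF-416")
print(sorted(deleted), sorted(overexpressed))
# ['cdh1', 'cdh2'] ['GH61a', 'bgl1']
```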
[0306] The following sequences are referred to herein and find use
in the present invention:
Wild-Type M. thermophila C1 GH61a cDNA Sequence:
TABLE-US-00003 (SEQ ID NO: 1)
ATGTCCAAGGCCTCTGCTCTCCTCGCTGGCCTGACGGGCGCGGCCCTCG
TCGCTGCACATGGCCACGTCAGCCACATCGTCGTCAACGGCGTCTACTA
CAGGAACTACGACCCCACGACAGACTGGTACCAGCCCAACCCGCCAACA
GTCATCGGCTGGACGGCAGCCGATCAGGATAATGGCTTCGTTGAACCCA
ACAGCTTTGGCACGCCAGATATCATCTGCCACAAGAGCGCCACCCCCGG
CGGCGGCCACGCTACCGTTGCTGCCGGAGACAAGATCAACATCGTCTGG
ACCCCCGAGTGGCCCGAATCCCACATCGGCCCCGTCATTGACTACCTAG
CCGCCTGCAACGGTGACTGCGAGACCGTCGACAAGTCGTCGCTGCGCTG
GTTCAAGATTGACGGCGCCGGCTACGACAAGGCCGCCGGCCGCTGGGCC
GCCGACGCTCTGCGCGCCAACGGCAACAGCTGGCTCGTCCAGATCCCGT
CGGATCTCAAGGCCGGCAACTACGTCCTCCGCCACGAGATCATCGCCCT
CCACGGTGCTCAGAGCCCCAACGGCGCCCAGGCCTACCCGCAGTGCATC
AACCTCCGCGTCACCGGCGGCGGCAGCAACCTGCCCAGCGGCGTCGCCG
GCACCTCGCTGTACAAGGCGACCGACCCGGGCATCCTCTTCAACCCCTA
CGTCTCCTCCCCGGATTACACCGTCCCCGGCCCGGCCCTCATTGCCGGC
GCCGCCAGCTCGATCGCCCAGAGCACGTCGGTCGCCACTGCCACCGGCA
CGGCCACCGTTCCCGGCGGCGGCGGCGCCAACCCTACCGCCACCACCAC
CGCCGCCACCTCCGCCGCCCCGAGCACCACCCTGAGGACGACCACTACC
TCGGCCGCGCAGACTACCGCCCCGCCCTCCGGCGATGTGCAGACCAAGT
ACGGCCAGTGTGGTGGCAACGGATGGACGGGCCCGACGGTGTGCGCCCC
CGGCTCGAGCTGCTCCGTCCTCAACGAGTGGTACTCCCAGTGTTTGTAA
Wild-Type M. thermophila C1 GH61a Polypeptide Sequence:
TABLE-US-00004 (SEQ ID NO: 2)
MSKASALLAGLTGAALVAAHGHVSHIVVNGVYYRNYDPTTDWYQPNPPT
VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIVW
TPEWPESHIGPVIDYLAACNGDCETVDKSSLRWFKIDGAGYDKAAGRWA
ADALRANGNSWLVQIPSDLKAGNYVLRHEIIALHGAQSPNGAQAYPQCI
NLRVTGGGSNLPSGVAGTSLYKATDPGILFNPYVSSPDYTVPGPALIAG
AASSIAQSTSVATATGTATVPGGGGANPTATTTAATSAAPSTTLRTTTT
SAAQTTAPPSGDVQTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
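By way of illustration only, the correspondence between the cDNA of SEQ ID NO:1 and the polypeptide of SEQ ID NO:2 can be checked by conceptual translation. The following Python sketch assumes the standard genetic code; the function name translate is hypothetical, and only short excerpts from the beginning and end of SEQ ID NO:1 are used.

```python
# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna):
    """Translate a coding sequence from its first base up to the first stop codon."""
    seq = "".join(cdna.split()).upper()  # drop line breaks and spaces
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# First 30 and last 18 nucleotides of SEQ ID NO:1.
print(translate("ATGTCCAAGGCCTCTGCTCTCCTCGCTGGC"))  # MSKASALLAG
print(translate("TACTCCCAGTGTTTGTAA"))              # YSQCL
```

The two outputs agree with the first ten and last five residues of SEQ ID NO:2.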
Wild-Type M. thermophila C1 GH61a Polypeptide Sequence without the
Signal Sequence:
TABLE-US-00005 (SEQ ID NO: 3)
HGHVSHIVVNGVYYRNYDPTTDWYQPNPPTVIGWTAADQDNGFVEPNSF
GTPDIICHKSATPGGGHATVAAGDKINIVWTPEWPESHIGPVIDYLAAC
NGDCETVDKSSLRWFKIDGAGYDKAAGRWAADALRANGNSWLVQIPSDL
KAGNYVLRHEIIALHGAQSPNGAQAYPQCINLRVTGGGSNLPSGVAGTS
LYKATDPGILFNPYVSSPDYTVPGPALIAGAASSIAQSTSVATATGTAT
VPGGGGANPTATTTAATSAAPSTTLRTTTTSAAQTTAPPSGDVQTKYGQ
CGGNGWTGPTVCAPGSSCSVLNEWYSQCL
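By way of illustration only, SEQ ID NO:3 corresponds to SEQ ID NO:2 with the first 19 residues (MSKASALLAGLTGAALVAA) removed. The following Python sketch shows that relationship on a short excerpt; the function names and the constant SIGNAL_PEPTIDE_LENGTH are hypothetical, and the length of 19 is inferred here by comparing the two listed sequences.

```python
SIGNAL_PEPTIDE_LENGTH = 19  # inferred by comparing SEQ ID NO:2 with SEQ ID NO:3


def mature_sequence(precursor, signal_length=SIGNAL_PEPTIDE_LENGTH):
    """Return the precursor sequence with its N-terminal signal peptide removed."""
    seq = "".join(precursor.split()).upper()
    if len(seq) <= signal_length:
        raise ValueError("sequence is not longer than the signal peptide")
    return seq[signal_length:]


def precursor_position(mature_position, signal_length=SIGNAL_PEPTIDE_LENGTH):
    """Convert a 1-based position in the mature protein to precursor numbering."""
    return mature_position + signal_length


# First 30 residues of SEQ ID NO:2.
print(mature_sequence("MSKASALLAGLTGAALVAAHGHVSHIVVNG"))  # HGHVSHIVVNG
print(precursor_position(1))  # 20
```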
GH61a Variant 1 cDNA Sequence:
TABLE-US-00006 (SEQ ID NO: 4)
ATGTCCAAGGCCTCTGCTCTCCTCGCTGGCCTGACGGGCGCGGCCCTC
GTCGCTGCACACGGCCACGTCAGCCACATCGTCGTCAACGGCGTCTAC
TACAGGGGCTACGACCCCACGACAGACTGGTACCAGCCCAACCCGCCA
ACAGTCATCGGCTGGACGGCAGCCGATCAGGATAATGGCTTCGTTGAA
CCCAACAGCTTTGGCACGCCAGATATCATCTGCCACAAGAGCGCCACC
CCCGGCGGCGGCCACGCTACCGTTGCTGCCGGAGACAAGATCAACATC
GTCTGGACCCCCGAGTGGCCCCACTCCCACATCGGCCCCGTCATTGAC
TACCTAGCCGCCTGCAACGGTGACTGCGAGACCGTCGACAAGTCGTCG
CTGCGCTGGTTCAAGATTGACGGCGCCGGCTACGACAAGGCCGCCGGC
CGCTGGGCCGCCGACGCTCTGCGCGCCAACGGCAACAGCTGGCTCGTC
CAGATCCCGTCGGATCTCAAGCCCGGCAACTACGTCCTCCGCCACGAG
ATCATCGCCCTCCACGGTGCTCAGAGCCCCAACGGCGCCCAGGCGTAC
CCGCAGTGCATCAACCTCCGCGTCACCGGCGGCGGCAGCAACCTGCCC
AGCGGCGTCGCCGGCACCTCGCTGTACAAGGCGACCGACCCGGGCATC
CTCTTCAACCCCTACGTCTCCTCCCCGGATTACACCGTCCCCGGCCCG
GCCCTCATTGCCGGCGCCGCCAGCTCGATCGCCCAGAGCACGTCGGTC
GCCACTGCCACCGGCACGGCCACCGTTCCCGGCGGCGGCGGCGCCAAC
CCTACCGCCACCACCACCGCCGCCACCTCCGCCGCCCCGAGCACCACC
CTGAGGACGACCACTACCTCGGCCGCGCAGACTACCGCCCCGCCCTCC
GGCGATGTGCAGACCAAGTACGGCCAGTGTGGTGGCAACGGATGGACG
GGCCCGACGGTGTGCGCCCCCGGCTCGAGCTGCTCCGTCCTCAACGAG
TGGTACTCCCAGTGTTTGTAA
GH61a Variant 1 Polypeptide Sequence:
TABLE-US-00007 [0307] (SEQ ID NO: 5)
MSKASALLAGLTGAALVAAHGHVSHIVVNGVYYRGYDPTTDWYQPNPPT
VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIVW
TPEWPHSHIGPVIDYLAACNGDCETVDKSSLRWFKIDGAGYDKAAGRWA
ADALRANGNSWLVQIPSDLKPGNYVLRHEIIALHGAQSPNGAQAYPQCI
NLRVTGGGSNLPSGVAGTSLYKATDPGILFNPYVSSPDYTVPGPALIAG
AASSIAQSTSVATATGTATVPGGGGANPTATTTAATSAAPSTTLRTTTT
SAAQTTAPPSGDVQTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
GH61a Variant 1 Polypeptide Sequence without the Signal
Sequence:
TABLE-US-00008 (SEQ ID NO: 6)
HGHVSHIVVNGVYYRGYDPTTDWYQPNPPTVIGWTAADQDNGFVEPNS
FGTPDIICHKSATPGGGHATVAAGDKINIVWTPEWPHSHIGPVIDYLA
ACNGDCETVDKSSLRWFKIDGAGYDKAAGRWAADALRANGNSWLVQIP
SDLKPGNYVLRHEIIALHGAQSPNGAQAYPQCINLRVTGGGSNLPSGV
AGTSLYKATDPGILFNPYVSSPDYTVPGPALIAGAASSIAQSTSVATA
TGTATVPGGGGANPTATTTAATSAAPSTTLRTTTTSAAQTTAPPSGDV
QTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
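By way of illustration only, the amino acid substitutions that distinguish a variant from the wild-type sequence can be listed by position-wise comparison of two sequences of equal length. In the following Python sketch the function name substitutions is hypothetical, the excerpts are taken from SEQ ID NO:2 and SEQ ID NO:5, and numbering the positions relative to SEQ ID NO:2 is an assumption of the sketch.

```python
def substitutions(reference, variant, first_position=1):
    """List differences between two equal-length sequences as, e.g., 'N35G'."""
    ref = "".join(reference.split()).upper()
    var = "".join(variant.split()).upper()
    if len(ref) != len(var):
        raise ValueError("sequences must have the same length")
    return [
        f"{r}{position}{v}"
        for position, (r, v) in enumerate(zip(ref, var), start=first_position)
        if r != v
    ]


# Residues 30-39 of SEQ ID NO:2 (wild type) and SEQ ID NO:5 (Variant 1).
print(substitutions("GVYYRNYDPT", "GVYYRGYDPT", first_position=30))  # ['N35G']
# Residues 99-108 of the same two sequences.
print(substitutions("TPEWPESHIG", "TPEWPHSHIG", first_position=99))  # ['E104H']
```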
GH61a Variant 5 cDNA Sequence:
TABLE-US-00009 (SEQ ID NO: 7)
ACACAAATGTCCAAGGCCTCTGCTCTCCTCGCTGGCCTGACGGGCGCG
GCCCTCGTCGCTGCACACGGCCACGTCAGCCACATCGTCGTCAACGGC
GTCTACTACAGGAACTACGACCCCACGACAGACTGGTACCAGCCCAAC
CCGCCAACAGTCATCGGCTGGACGGCAGCCGATCAGGATAATGGCTTC
GTTGAACCCAACAGCTTTGGCACGCCAGATATCATCTGCCACAAGAGC
GCCACCCCCGGCGGCGGCCACGCTACCGTTGCTGCCGGAGACAAGATC
AACATCGTATGGACCCCCGAGTGGCCCCACTCCCACATCGGCCCCGTC
ATTGACTACCTAGCCGCCTGCAACGGTGACTGCGAGACCGTCGACAAG
TCGTCGCTGCGCTGGTTCAAGATTGACGGCGCCGGCTACGACAAGGCC
GCCGGCCGCTGGGCCGCCGACGCTCTGCGCGCCAACGGCAACAGCTGG
CTCGTCCAGATCCCGTCGGATCTCGCGGCCGGCAACTACGTCCTCCGC
CACGAGATCATCGCCCTCCACGGTGCTCAGAGCCCCAACGGCGCCCAG
GCGTACCCGCAGTGCATCAACCTCCGCGTCACCGGCGGCGGCAGCAAC
CTGCCCAGCGGCGTCGCCGGCACCTCGCTGTACAAGGCGACCGACCCG
GGCATCCTCTTCAACCCCTACGTCTCCTCCCCGGATTACACCGTCCCC
GGCCCGGCCCTCATTGCCGGCGCCGCCAGCTCGATCGCCCAGAGCACG
TCGGTCGCCACTGCCACCGGCACGGCCACCGTTCCCGGCGGCGGCGGC
GCCAACCCTACCGCCACCACCACCGCCGCCACCTCCGCCGCCCCGAGC
ACCACCCTGAGGACGACCACTACCTCGGCCGCGCAGACTACCGCCCCG
CCCTCCGGCGATGTGCAGACCAAGTACGGCCAGTGTGGTGGCAACGGA
TGGACGGGCCCGACGGTGTGCGCCCCCGGCTCGAGCTGCTCCGTCCTC
AACGAGTGGTACTCCCAGTGTTTGTAA
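By way of illustration only, SEQ ID NO:7 as listed begins with six nucleotides (ACACAA) that precede the ATG start codon. The following Python sketch separates such a 5' leader from the open reading frame; the function name split_leader is hypothetical, and taking the first ATG as the start codon is a simplifying assumption that holds for this excerpt but not for sequences in general.

```python
def split_leader(cdna, start_codon="ATG"):
    """Split a sequence into (leader, open reading frame) at the first start codon."""
    seq = "".join(cdna.split()).upper()
    start = seq.find(start_codon)
    if start < 0:
        raise ValueError("no start codon found")
    return seq[:start], seq[start:]


# First 24 nucleotides of SEQ ID NO:7.
leader, orf = split_leader("ACACAAATGTCCAAGGCCTCTGCT")
print(leader)  # ACACAA
print(orf)     # ATGTCCAAGGCCTCTGCT
```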
GH61a Variant 5 Polypeptide Sequence:
TABLE-US-00010 [0308] (SEQ ID NO: 8)
MSKASALLAGLTGAALVAAHGHVSHIVVNGVYYRNYDPTTDWYQPNPPT
VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIVW
TPEWPHSHIGPVIDYLAACNGDCETVDKSSLRWFKIDGAGYDKAAGRWA
ADALRANGNSWLVQIPSDLAAGNYVLRHEIIALHGAQSPNGAQAYPQCI
NLRVTGGGSNLPSGVAGTSLYKATDPGILFNPYVSSPDYTVPGPALIAG
AASSIAQSTSVATATGTATVPGGGGANPTATTTAATSAAPSTTLRTTTT
SAAQTTAPPSGDVQTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
GH61a Variant 5 Polypeptide Sequence without the Signal
Sequence:
TABLE-US-00011 (SEQ ID NO: 9)
HGHVSHIVVNGVYYRNYDPTTDWYQPNPPTVIGWTAADQDNGFVEPNS
FGTPDIICHKSATPGGGHATVAAGDKINIVWTPEWPHSHIGPVIDYLA
ACNGDCETVDKSSLRWFKIDGAGYDKAAGRWAADALRANGNSWLVQIP
SDLAAGNYVLRHEIIALHGAQSPNGAQAYPQCINLRVTGGGSNLPSGV
AGTSLYKATDPGILFNPYVSSPDYTVPGPALIAGAASSIAQSTSVATA
TGTATVPGGGGANPTATTTAATSAAPSTTLRTTTTSAAQTTAPPSGDV
QTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
GH61a Variant 9 cDNA Sequence:
TABLE-US-00012 (SEQ ID NO: 10)
ACAAACATGTCCAAGGCCTCTGCTCTCCTCGCTGGCCTGACGGGCGCGG
CCCTCGTCGCTGCACATGGCCACGTCAGCCACATCGTCGTCAACGGCGT
CTACTACAGGAACTACGACCCCACGACAGACTGGTACCAGCCCAACCCG
CCAACAGTCATCGGCTGGACGGCAGCCGATCAGGATAATGGCTTCGTTG
AACCCAACAGCTTTGGCACGCCAGATATCATCTGCCACAAGAGCGCCAC
CCCCGGCGGCGGCCACGCTACCGTTGCTGCCGGAGACAAGATCAACATC
CAGTGGACCCCCGAGTGGCCCGAATCCCACATCGGCCCCGTCATTGACT
ACCTAGCCGCCTGCAACGGTGACTGCGAGACCGTCGACAAGTCGTCGCT
GCGCTGGTTCAAGATTGACGGCGCCGGCTACGACAAGGCCGCCGGCCGC
TGGGCCGCCGACGCTCTGCGCGCCAACGGCAACAGCTGGCTCGTCCAGA
TCCCGTCGGATCTCAAGGCCGGCAACTACGTCCTCCGCCACGAGATCAT
CGCCCTCCACGGTGCTCAGAGCCCCAACGGCGCCCAGAACTACCCGCAG
TGCATCAACCTCCGCGTCACCGGCGGCGGCAGCAACCTGCCCAGCGGCG
TCGCCGGCACCTCGCTGTACAAGGCGACCGACCCGGGCATCCTCTTCAA
CCCCTACGTCTCCTCCCCGGATTACACCGTCCCCGGCCCGGCCCTCATT
GCCGGCGCCGCCAGCTCGATCGCCCAGAGCACGTCGGTCGCCACTGCCA
CCGGCACGGCCACCGTTCCCGGCGGCGGCGGCGCCAACCCTACCGCCAC
CACCACCGCCGCCACCTCCGCCGCCCCGAGCACCACCCTGAGGACGACC
ACTACCTCGGCCGCGCAGACTACCGCCCCGCCCTCCGGCGATGTGCAGA
CCAAGTACGGCCAGTGTGGTGGCAACGGATGGACGGGCCCGACGGTGTG
CGCCCCCGGCTCGAGCTGCTCCGTCCTCAACGAGTGGTACTCCCAGTGT TTGTAA
GH61a Variant 9 Polypeptide Sequence:
TABLE-US-00013 [0309] (SEQ ID NO: 11)
MSKASALLAGLTGAALVAAHGHVSHIVVNGVYYRNYDPTTDWYQPNPPT
VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIQW
TPEWPESHIGPVIDYLAACNGDCETVDKSSLRWFKIDGAGYDKAAGRWA
ADALRANGNSWLVQIPSDLKAGNYVLRHEIIALHGAQSPNGAQNYPQCI
NLRVTGGGSNLPSGVAGTSLYKATDPGILFNPYVSSPDYTVPGPALIAG
AASSIAQSTSVATATGTATVPGGGGANPTATTTAATSAAPSTTLRTTTT
SAAQTTAPPSGDVQTKYGQCGGNGWTGPTVCAPGSSCSVLNEWYSQCL
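The relationship between the GH61a Variant 9 cDNA (SEQ ID NO: 10) and its polypeptide (SEQ ID NO: 11) can be checked with a short translation routine. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard genetic code and assumes that translation begins at the first ATG in the cDNA; the function name `translate_from_first_atg` and the constant names are illustrative only. Only the first 30 nucleotides of SEQ ID NO: 10 are used as input.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (an assumption: no alternative code is used).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_from_first_atg(cdna: str) -> str:
    """Translate a cDNA string starting at its first ATG (hypothetical helper).

    Whitespace is removed first, so sequences copied with line breaks or
    embedded spaces can be passed in directly. Translation stops at the
    first stop codon or when fewer than three bases remain.
    """
    seq = "".join(cdna.split()).upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# First 30 nucleotides of SEQ ID NO: 10 as listed above.
CDNA_START = "ACAAACATGTCCAAGGCCTCTGCTCTCCTC"
print(translate_from_first_atg(CDNA_START))
```

Under these assumptions the fragment translates to the first eight residues of SEQ ID NO: 11. The six nucleotides preceding the ATG are skipped rather than translated.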
GH61a Variant 9 Polypeptide Sequence without the Signal
Sequence:
TABLE-US-00014 (SEQ ID NO: 12)
MSKASALLAGLTGAALVAAHGHVSHIVVNGVYYRNYDPTTDWYQPNPP
TVIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINI
QWTPEWPESHIGPVIDYLAACNGDCETVDKSSLRWFKIDGAGYDKAAG
RWAADALRANGNSWLVQIPSDLKAGNYVLRHEIIALHGAQSPNGAQNY
PQCINLRVTGGGSNLPSGVAGTSLYKATDPGILFNPYVSSPDYTVPGP
ALIAGAASSIAQSTSVATATGTATVPGGGGANPTATTTAATSAAPSTT
LRTTTTSAAQTTAPPSGDVQTKYGQCGGNGWTGPTVCAPGSSCSVLNE WYSQCL
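The variant polypeptides listed above differ from one another at individual residues. One way to locate such differences is a position-by-position comparison of two sequences of equal length. The following is a minimal Python sketch, not part of the original disclosure; the function name `substitutions` is illustrative, and the inputs are only the second listed lines of SEQ ID NO: 8 and SEQ ID NO: 11. Positions are counted from 1 within the fragment and are not the residue numbering used elsewhere in this document.

```python
def substitutions(reference: str, variant: str):
    """Return (position, reference_residue, variant_residue) for each mismatch.

    Hypothetical helper: it assumes the two sequences are already aligned
    and of equal length, so it cannot detect insertions or deletions.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Second line of SEQ ID NO: 8 (Variant 5) and of SEQ ID NO: 11 (Variant 9).
VARIANT_5_FRAGMENT = "VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIVW"
VARIANT_9_FRAGMENT = "VIGWTAADQDNGFVEPNSFGTPDIICHKSATPGGGHATVAAGDKINIQW"

for pos, ref, var in substitutions(VARIANT_5_FRAGMENT, VARIANT_9_FRAGMENT):
    print(f"fragment position {pos}: {ref} -> {var}")
```

The same comparison could be run on the full-length sequences, provided they are copied without the spaces and line breaks of the listing.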
[0310] The polynucleotide (SEQ ID NO:13) and amino acid (SEQ ID
NO:14) sequences of an M. thermophila GH61b are provided below. The
signal sequence is shown underlined in SEQ ID NO:14. SEQ ID NO:15
provides the sequence of this GH61b without the signal
sequence.
TABLE-US-00015 (SEQ ID NO: 13)
ATGAAGCTCTCCCTCTTTTCCGTCCTGGCCACTGCCCTCACCGTCGAG
GGGCATGCCATCTTCCAGAAGGTCTCCGTCAACGGAGCGGACCAGGGC
TCCCTCACCGGCCTCCGCGCTCCCAACAACAACAACCCCGTGCAGAAT
GTCAACAGCCAGGACATGATCTGCGGCCAGTCGGGATCGACGTCGAAC
ACTATCATCGAGGTCAAGGCCGGCGATAGGATCGGTGCCTGGTATCAG
CATGTCATCGGCGGTGCCCAGTTCCCCAACGACCCAGACAACCCGATT
GCCAAGTCGCACAAGGGCCCCGTCATGGCCTACCTCGCCAAGGTTGAC
AATGCCGCAACCGCCAGCAAGACGGGCCTGAAGTGGTTCAAGATTTGG
GAGGATACCTTTAATCCCAGCACCAAGACCTGGGGTGTCGACAACCTC
ATCAACAACAACGGCTGGGTGTACTTCAACCTCCCGCAGTGCATCGCC
GACGGCAACTACCTCCTCCGCGTCGAGGTCCTCGCTCTGCACTCGGCC
TACTCCCAGGGCCAGGCTCAGTTCTACCAGTCCTGCGCCCAGATCAAC
GTATCCGGCGGCGGCTCCTTCACGCCGGCGTCGACTGTCAGCTTCCCG
GGTGCCTACAGCGCCAGCGACCCCGGTATCCTGATCAACATCTACGGC
GCCACCGGCCAGCCCGACAACAACGGCCAGCCGTACACTGCCCCTGGG CCCGCGCCCATCTCCTGC
(SEQ ID NO: 14) MKLSLFSVLATALTVEGHAIFQKVSVNGADQGSLTGLRAPNNNNPVQN
VNSQDMICGQSGSTSNTIIEVKAGDRIGAWYQHVIGGAQFPNDPDNPI
AKSHKGPVMAYLAKVDNAATASKTGLKWFKIWEDTFNPSTKTWGVDNL
INNNGWVYFNLPQCIADGNYLLRVEVLALHSAYSQGQAQFYQSCAQIN
VSGGGSFTPASTVSFPGAYSASDPGILINIYGATGQPDNNGQPYTAPG PAPISC (SEQ ID NO: 15)
IFQKVSVNGADQGSLTGLRAPNNNNPVQNVNSQDMICGQSGSTSNTII
EVKAGDRIGAWYQHVIGGAQFPNDPDNPIAKSHKGPVMAYLAKVDNAA
TASKTGLKWFKIWEDTFNPSTKTWGVDNLINNNGWVYFNLPQCIADGN
YLLRVEVLALHSAYSQGQAQFYQSCAQINVSGGGSFTPASTVSFPGAY
SASDPGILINIYGATGQPDNNGQPYTAPGPAPISC
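Because the underlining that marks the signal sequence is not reproduced in plain text, the signal sequence can instead be recovered by locating the start of the mature sequence within the full-length sequence. The following is a minimal Python sketch, not part of the original disclosure; the function name `signal_peptide` is illustrative, and the inputs are only the N-terminal portions of SEQ ID NO: 14 and SEQ ID NO: 15 as listed above.

```python
def signal_peptide(full_length: str, mature: str) -> str:
    """Return the part of full_length that precedes the mature sequence.

    Hypothetical helper: it assumes the beginning of the mature sequence
    occurs exactly once in the full-length sequence. Whitespace from the
    listing is removed before comparison.
    """
    full = "".join(full_length.split())
    mat = "".join(mature.split())
    start = full.find(mat)
    if start < 0:
        raise ValueError("mature sequence not found in full-length sequence")
    return full[:start]


# N-terminal fragments of SEQ ID NO: 14 (with signal) and SEQ ID NO: 15 (without).
GH61B_FULL_NTERM = "MKLSLFSVLATALTVEGHAIFQKVSVNGADQG"
GH61B_MATURE_NTERM = "IFQKVSVNGADQG"

signal = signal_peptide(GH61B_FULL_NTERM, GH61B_MATURE_NTERM)
print(signal, len(signal))
```

For these two fragments the sketch reports the residues that are present in SEQ ID NO: 14 but absent from the start of SEQ ID NO: 15.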
[0311] The polynucleotide (SEQ ID NO:16) and amino acid (SEQ ID
NO:17) sequences of an M. thermophila GH61c are provided below. The
signal sequence is shown underlined in SEQ ID NO:17. SEQ ID NO:18
provides the sequence of this GH61c without the signal
sequence.
TABLE-US-00016 (SEQ ID NO: 16)
ATGGCCCTCCAGCTCTTGGCGAGCTTGGCCCTCCTCTCAGTGCCGGCC
CTTGCCCACGGTGGCTTGGCCAACTACACCGTCGGTGATACTTGGTAC
AGAGGCTACGACCCAAACCTGCCGCCGGAGACGCAGCTCAACCAGACC
TGGATGATCCAGCGGCAATGGGCCACCATCGACCCCGTCTTCACCGTG
TCGGAGCCGTACCTGGCCTGCAACAACCCGGGCGCGCCGCCGCCCTCG
TACATCCCCATCCGCGCCGGTGACAAGATCACGGCCGTGTACTGGTAC
TGGCTGCACGCCATCGGGCCCATGAGCGTCTGGCTCGCGCGGTGCGGC
GACACGCCCGCGGCCGACTGCCGCGACGTCGACGTCAACCGGGTCGGC
TGGTTCAAGATCTGGGAGGGCGGCCTGCTGGAGGGTCCCAACCTGGCC
GAGGGGCTCTGGTACCAAAAGGACTTCCAGCGCTGGGACGGCTCCCCG
TCCCTCTGGCCCGTCACGATCCCCAAGGGGCTCAAGAGCGGGACCTAC
ATCATCCGGCACGAGATCCTGTCGCTTCACGTCGCCCTCAAGCCCCAG
TTTTACCCGGAGTGTGCGCATCTGAATATTACTGGGGGCGGAGACTTG
CTGCCACCCGAAGAGACTCTGGTGCGGTTTCCGGGGGTTTACAAAGAG
GACGATCCCTCTATCTTCATCGATGTCTACTCGGAGGAGAACGCGAAC
CGGACAGATTATACGGTTCCGGGAGGGCCAATCTGGGAAGGG (SEQ ID NO: 17)
MALQLLASLALLSVPALAHGGLANYTVGDTWYRGYDPNLPPETQLNQT
WMIQRQWATIDPVFTVSEPYLACNNPGAPPPSYIPIRAGDKITAVYWY
WLHAIGPMSVWLARCGDTPAADCRDVDVNRVGWFKIWEGGLLEGPNLA
EGLWYQKDFQRWDGSPSLWPVTIPKGLKSGTYIIRHEILSLHVALKPQ
FYPECAHLNITGGGDLLPPEETLVRFPGVYKEDDPSIFIDVYSEENAN RTDYTVPGGPIWEG
(SEQ ID NO: 18) NYTVGDTWYRGYDPNLPPETQLNQTWMIQRQWATIDPVFTVSEPYLAC
NNPGAPPPSYIPIRAGDKITAVYWYWLHAIGPMSVWLARCGDTPAADC
RDVDVNRVGWFKIWEGGLLEGPNLAEGLWYQKDFQRWDGSPSLWPVTI
PKGLKSGTYIIRHEILSLHVALKPQFYPECAHLNITGGGDLLPPEETL
VRFPGVYKEDDPSIFIDVYSEENANRTDYTVPGGPIWEG
[0312] The polynucleotide (SEQ ID NO:19) and amino acid (SEQ ID
NO:20) sequences of an M. thermophila GH61d are provided below. The
signal sequence is shown underlined in SEQ ID NO:20. SEQ ID NO:21
provides the sequence of this GH61d without the signal
sequence.
TABLE-US-00017 (SEQ ID NO: 19)
ATGAAGGCCCTCTCTCTCCTTGCGGCTGCCGGGGCAGTCTCTGCGCAT
ACCATCTTCGTCCAGCTCGAAGCAGACGGCACGAGGTACCCGGTTTCG
TACGGGATCCGGGACCCAACCTACGACGGCCCCATCACCGACGTCACA
TCCAACGACGTTGCTTGCAACGGCGGTCCGAACCCGACGACCCCCTCC
AGCGACGTCATCACCGTCACCGCGGGCACCACCGTCAAGGCCATCTGG
AGGCACACCCTCCAATCCGGCCCGGACGATGTCATGGACGCCAGCCAC
AAGGGCCCGACCCTGGCCTACATCAAGAAGGTCGGCGATGCCACCAAG
GACTCGGGCGTCGGCGGTGGCTGGTTCAAGATCCAGGAGGACGGTTAC
AACAACGGCCAGTGGGGCACCAGCACCGTTATCTCCAACGGCGGCGAG
CACTACATTGACATCCCGGCCTGCATCCCCGAGGGTCAGTACCTCCTC
CGCGCCGAGATGATCGCCCTCCACGCGGCCGGGTCCCCCGGCGGCGCT
CAGCTCTACATGGAATGTGCCCAGATCAACATCGTCGGCGGCTCCGGC
TCGGTGCCCAGCTCGACGGTCAGCTTCCCCGGCGCGTATAGCCCCAAC
GACCCGGGTCTCCTCATCAACATCTATTCCATGTCGCCCTCGAGCTCG
TACACCATCCCGGGCCCGCCCGTTTTCAAGTGC (SEQ ID NO: 20)
MKALSLLAAAGAVSAHTIFVQLEADGTRYPVSYGIRDPTYDGPITDVT
SNDVACNGGPNPTTPSSDVITVTAGTTVKAIWRHTLQSGPDDVMDASH
KGPTLAYIKKVGDATKDSGVGGGWFKIQEDGYNNGQWGTSTVISNGGE
HYIDIPACIPEGQYLLRAEMIALHAAGSPGGAQLYMECAQINIVGGSG
SVPSSTVSFPGAYSPNDPGLLINIYSMSPSSSYTIPGPPVFKC (SEQ ID NO: 21)
HTIFVQLEADGTRYPVSYGIRDPTYDGPITDVTSNDVACNGGPNPTTP
SSDVITVTAGTTVKAIWRHTLQSGPDDVMDASHKGPTLAYIKKVGDAT
KDSGVGGGWFKIQEDGYNNGQWGTSTVISNGGEHYIDIPACIPEGQYL
LRAEMIALHAAGSPGGAQLYMECAQINIVGGSGSVPSSTVSFPGAYSP
NDPGLLINIYSMSPSSSYTIPGPPVFKC
[0313] The polynucleotide (SEQ ID NO:22) and amino acid (SEQ ID
NO:23) sequences of an M. thermophila GH61e are provided below. The
signal sequence is shown underlined in SEQ ID NO:23. SEQ ID NO:24
provides the sequence of this GH61e without the signal
sequence.
TABLE-US-00018 (SEQ ID NO: 22)
ATGAAGTCGTCTACCCCGGCCTTGTTCGCCGCTGGGCTCCTTGCTCAG
CATGCTGCGGCCCACTCCATCTTCCAGCAGGCGAGCAGCGGCTCGACC
GACTTTGATACGCTGTGCACCCGGATGCCGCCCAACAATAGCCCCGTC
ACTAGTGTGACCAGCGGCGACATGACCTGCAAAGTCGGCGGCACCAAG
GGGGTGTCCGGCTTCTGCGAGGTGAACGCCGGCGACGAGTTCACGGTT
GAGATGCACGCGCAGCCCGGCGACCGCTCGTGCGCCAACGAGGCCATC
GGCGGGAACCACTTCGGCCCGGTCCTCATCTACATGAGCAAGGTCGAC
GACGCCTCCACCGCCGACGGGTCCGGCGACTGGTTCAAGGTGGACGAG
TTCGGCTACGACGCAAGCACCAAGACCTGGGGCACCGACAAGCTCAAC
GAGAACTGCGGCAAGCGCACCTTCAACATCCCCAGCCACATCCCCGCG
GGCGACTATCTCGTCCGGGCCGAGGCTATCGCGCTACACACTGCCAAC
CAGCCAGGCGGCGCGCAGTTCTACATGAGCTGCTATCAAGTCAGGATT
TCCGGCGGCGAAGGGGGCCAGCTGCCTGCCGGAGTCAAGATCCCGGGC
GCGTACAGTGCCAACGACCCCGGCATCCTTGTCGACATCTGGGGTAAC
GATTTCAACGACCCTCCAGGACACTCGGCCCGTCACGCCATCATCATC
ATCAGCAGCAGCAGCAACAACAGCGGCGCCAAGATGACCAAGAAGATC
CAGGAGCCCACCATCACATCGGTCACGGACCTCCCCACCGACGAGGCC
AAGTGGATCGCGCTCCAAAAGATCTCGTACGTGGACCAGACGGGCACG
GCGCGGACATACGAGCCGGCGTCGCGCAAGACGCGGTCGCCAAGAGTC TAG (SEQ ID NO: 23)
MKSSTPALFAAGLLAQHAAAHSIFQQASSGSTDFDTLCTRMPPNNSPV
TSVTSGDMTCKVGGTKGVSGFCEVNAGDEFTVEMHAQPGDRSCANEAI
GGNHFGPVLIYMSKVDDASTADGSGDWFKVDEFGYDASTKTWGTDKLN
ENCGKRTFNIPSHIPAGDYLVRAEAIALHTANQPGGAQFYMSCYQVRI
SGGEGGQLPAGVKIPGAYSANDPGILVDIWGNDFNDPPGHSARHAIII
ISSSSNNSGAKMTKKIQEPTITSVTDLPTDEAKWIALQKISYVDQTGT ARTYEPASRKTRSPRV
(SEQ ID NO: 24) HSIFQQASSGSTDFDTLCTRMPPNNSPVTSVTSGDMTCKVGGTKGVSG
FCEVNAGDEFTVEMHAQPGDRSCANEAIGGNHFGPVLIYMSKVDDAST
ADGSGDWFKVDEFGYDASTKTWGTDKLNENCGKRTFNIPSHIPAGDYL
VRAEAIALHTANQPGGAQFYMSCYQVRISGGEGGQLPAGVKIPGAYSA
NDPGILVDIWGNDFNDPPGHSARHAIIIISSSSNNSGAKMTKKIQEPT
ITSVTDLPTDEAKWIALQKISYVDQTGTARTYEPASRKTRSPRV
[0314] The polynucleotide (SEQ ID NO:25) and amino acid (SEQ ID
NO:26) sequences of an alternative M. thermophila GH61e are
provided below. The signal sequence is shown underlined in SEQ ID
NO:26. SEQ ID NO:27 provides the sequence of this GH61e without the
signal sequence.
TABLE-US-00019 (SEQ ID NO: 25)
ATGAAGTCGTCTACCCCGGCCTTGTTCGCCGCTGGGCTCCTTGCTCAG
CATGCTGCGGCCCACTCCATCTTCCAGCAGGCGAGCAGCGGCTCGACC
GACTTTGATACGCTGTGCACCCGGATGCCGCCCAACAATAGCCCCGTC
ACTAGTGTGACCAGCGGCGACATGACCTGCAACGTCGGCGGCACCAAG
GGGGTGTCGGGCTTCTGCGAGGTGAACGCCGGCGACGAGTTCACGGTT
GAGATGCACGCGCAGCCCGGCGACCGCTCGTGCGCCAACGAGGCCATC
GGCGGGAACCACTTCGGCCCGGTCCTCATCTACATGAGCAAGGTCGAC
GACGCCTCCACTGCCGACGGGTCCGGCGACTGGTTCAAGGTGGACGAG
TTCGGCTACGACGCAAGCACCAAGACCTGGGGCACCGACAAGCTCAAC
GAGAACTGCGGCAAGCGCACCTTCAACATCCCCAGCCACATCCCCGCG
GGCGACTATCTCGTCCGGGCCGAGGCTATCGCGCTACACACTGCCAAC
CAGCCAGGCGGCGCGCAGTTCTACATGAGCTGCTATCAAGTCAGGATT
TCCGGCGGCGAAGGGGGCCAGCTGCCTGCCGGAGTCAAGATCCCGGGC
GCGTACAGTGCCAACGACCCCGGCATCCTTGTCGACATCTGGGGTAAC
GATTTCAACGAGTACGTTATTCCGGGCCCCCCGGTCATCGACAGCAGC TACTTC (SEQ ID NO: 26)
MKSSTPALFAAGLLAQHAAAHSIFQQASSGSTDFDTLCTRMPPNNSPV
TSVTSGDMTCNVGGTKGVSGFCEVNAGDEFTVEMHAQPGDRSCANEAI
GGNHFGPVLIYMSKVDDASTADGSGDWFKVDEFGYDASTKTWGTDKLN
ENCGKRTFNIPSHIPAGDYLVRAEAIALHTANQPGGAQFYMSCYQVRI
SGGEGGQLPAGVKIPGAYSANDPGILVDIWGNDFNEYVIPGPPVIDSS YF (SEQ ID NO: 27)
HSIFQQASSGSTDFDTLCTRMPPNNSPVTSVTSGDMTCNVGGTKGVSG
FCEVNAGDEFTVEMHAQPGDRSCANEAIGGNHFGPVLIYMSKVDDAST
ADGSGDWFKVDEFGYDASTKTWGTDKLNENCGKRTFNIPSHIPAGDYL
VRAEAIALHTANQPGGAQFYMSCYQVRISGGEGGQLPAGVKIPGAYSA
NDPGILVDIWGNDFNEYVIPGPPVIDSSYF
[0315] The polynucleotide (SEQ ID NO:28) and amino acid (SEQ ID
NO:29) sequences of an M. thermophila GH61f are provided below. The
signal sequence is shown underlined in SEQ ID NO:29. SEQ ID NO:30
provides the sequence of this GH61f without the signal
sequence.
TABLE-US-00020 (SEQ ID NO: 28)
ATGAAGTCCTTCACCCTCACCACTCTGGCCGCCCTGGCTGGCAACGCC
GCCGCTCACGCGACCTTCCAGGCCCTCTGGGTCGACGGCGTCGACTAC
GGCGCGCAGTGTGCCCGTCTGCCCGCGTCCAACTCGCCGGTCACCGAC
GTGACCTCCAACGCGATCCGCTGCAACGCCAACCCCTCGCCCGCTCGG
GGCAAGTGCCCGGTCAAGGCCGGCTCGACCGTTACGGTCGAGATGCAT
CAGCAACCCGGTGACCGCTCGTGCAGCAGCGAGGCGATCGGCGGGGCG
CACTACGGCCCCGTGATGGTGTACATGTCCAAGGTGTCGGACGCGGCG
TCGGCGGACGGGTCGTCGGGCTGGTTCAAGGTGTTCGAGGACGGCTGG
GCCAAGAACCCGTCCGGCGGGTCGGGCGACGACGACTACTGGGGCACC
AAGGACCTGAACTCGTGCTGCGGGAAGATGAACGTCAAGATCCCCGCC
GACCTGCCCTCGGGCGACTACCTGCTCCGGGCCGAGGCCCTCGCGCTG
CACACGGCCGGCAGCGCGGGCGGCGCCCAGTTCTACATGACCTGCTAC
CAGCTCACCGTGACCGGCTCCGGCAGCGCCAGCCCGCCCACCGTCTCC
TTCCCGGGCGCCTACAAGGCCACCGACCCGGGCATCCTCGTCAACATC
CACGCCCCGCTGTCCGGCTACACCGTGCCCGGCCCGGCCGTCTACTCG
GGCGGCTCCACCAAGAAGGCCGGCAGCGCCTGCACCGGCTGCGAGTCC
ACTTGCGCCGTCGGCTCCGGCCCCACCGCCACCGTCTCCCAGTCGCCC
GGTTCCACCGCCACCTCGGCCCCCGGCGGCGGCGGCGGCTGCACCGTC
CAGAAGTACCAGCAGTGCGGCGGCCAGGGCTACACCGGCTGCACCAAC
TGCGCGTCCGGCTCCACCTGCAGCGCGGTCTCGCCGCCCTACTACTCG CAGTGCGTC (SEQ ID NO: 29)
MKSFTLTTLAALAGNAAAHATFQALWVDGVDYGAQCARLPASNSPVTD
VTSNAIRCNANPSPARGKCPVKAGSTVTVEMHQQPGDRSCSSEAIGGA
HYGPVMVYMSKVSDAASADGSSGWFKVFEDGWAKNPSGGSGDDDYWGT
KDLNSCCGKMNVKIPADLPSGDYLLRAEALALHTAGSAGGAQFYMTCY
QLTVTGSGSASPPTVSFPGAYKATDPGILVNIHAPLSGYTVPGPAVYS
GGSTKKAGSACTGCESTCAVGSGPTATVSQSPGSTATSAPGGGGGCTV
QKYQQCGGQGYTGCTNCASGSTCSAVSPPYYSQCV (SEQ ID NO: 30)
HATFQALWVDGVDYGAQCARLPASNSPVTDVTSNAIRCNANPSPARGK
CPVKAGSTVTVEMHQQPGDRSCSSEAIGGAHYGPVMVYMSKVSDAASA
DGSSGWFKVFEDGWAKNPSGGSGDDDYWGTKDLNSCCGKMNVKIPADL
PSGDYLLRAEALALHTAGSAGGAQFYMTCYQLTVTGSGSASPPTVSFP
GAYKATDPGILVNIHAPLSGYTVPGPAVYSGGSTKKAGSACTGCESTC
AVGSGPTATVSQSPGSTATSAPGGGGGCTVQKYQQCGGQGYTGCTNCA
SGSTCSAVSPPYYSQCV
[0316] The polynucleotide (SEQ ID NO:31) and amino acid (SEQ ID
NO:32) sequences of an M. thermophila GH61g are provided below. The
signal sequence is shown underlined in SEQ ID NO:32. SEQ ID NO:33
provides the sequence of this GH61g without the signal
sequence.
TABLE-US-00021 (SEQ ID NO: 31)
ATGAAGGGACTCCTCGGCGCCGCCGCCCTCTCGCTGGCCGTCAGCGAT
GTCTCGGCCCACTACATCTTTCAGCAGCTGACGACGGGCGGCGTCAAG
CACGCTGTGTACCAGTACATCCGCAAGAACACCAACTATAACTCGCCC
GTGACCGATCTGACGTCCAACGACCTCCGCTGCAATGTGGGTGCTACC
GGTGCGGGCACCGATACCGTCACGGTGCGCGCCGGCGATTCGTTCACC
TTCACGACCGATACGCCCGTTTACCACCAGGGCCCGACCTCGATCTAC
ATGTCCAAGGCCCCCGGCAGCGCGTCCGACTACGACGGCAGCGGCGGC
TGGTTCAAGATCAAGGACTGGGCTGACTACACCGCCACGATTCCGGAA
TGTATTCCCCCCGGCGACTACCTGCTTCGCATCCAGCAACTCGGCATC
CACAACCCTTGGCCCGCGGGCATCCCCCAGTTCTACATCTCTTGTGCC
CAGATCACCGTGACTGGTGGCGGCAGTGCCAACCCCGGCCCGACCGTC
TCCATCCCAGGCGCCTTCAAGGAGACCGACCCGGGCTACACTGTCAAC
ATCTACAACAACTTCCACAACTACACCGTCCCTGGCCCAGCCGTCTTC
ACCTGCAACGGTAGCGGCGGCAACAACGGCGGCGGCTCCAACCCAGTC
ACCACCACCACCACCACCACCACCAGGCCGTCCACCAGCACCGCCCAG
TCCCAGCCGTCGTCGAGCCCGACCAGCCCCTCCAGCTGCACCGTCGCG
AAGTGGGGCCAGTGCGGAGGACAGGGTTACAGCGGCTGCACCGTGTGC
GCGGCCGGGTCGACCTGCCAGAAGACCAACGACTACTACAGCCAGTGC TTGTAG (SEQ ID NO: 32)
MKGLLGAAALSLAVSDVSAHYIFQQLTTGGVKHAVYQYIRKNTNYNSP
VTDLTSNDLRCNVGATGAGTDTVTVRAGDSFTFTTDTPVYHQGPTSIY
MSKAPGSASDYDGSGGWFKIKDWADYTATIPECIPPGDYLLRIQQLGI
HNPWPAGIPQFYISCAQITVTGGGSANPGPTVSIPGAFKETDPGYTVN
IYNNFHNYTVPGPAVFTCNGSGGNNGGGSNPVTTTTTTTTRPSTSTAQ
SQPSSSPTSPSSCTVAKWGQCGGQGYSGCTVCAAGSTCQKTNDYYSQC L (SEQ ID NO: 33)
HYIFQQLTTGGVKHAVYQYIRKNTNYNSPVTDLTSNDLRCNVGATGAG
TDTVTVRAGDSFTFTTDTPVYHQGPTSIYMSKAPGSASDYDGSGGWFK
IKDWADYTATIPECIPPGDYLLRIQQLGIHNPWPAGIPQFYISCAQIT
VTGGGSANPGPTVSIPGAFKETDPGYTVNIYNNFHNYTVPGPAVFTCN
GSGGNNGGGSNPVTTTTTTTTRPSTSTAQSQPSSSPTSPSSCTVAKWG
QCGGQGYSGCTVCAAGSTCQKTNDYYSQCL
[0317] The polynucleotide (SEQ ID NO:34) and amino acid (SEQ ID
NO:35) sequences of an alternative M. thermophila GH61g are
provided below. The signal sequence is shown underlined in SEQ ID
NO:35. SEQ ID NO:36 provides the sequence of this GH61g without the
signal sequence.
TABLE-US-00022 (SEQ ID NO: 34)
CTGACGACGGGCGGCGTCAAGCACGCTGTGTACCAGTACATCCGCAAG
AACACCAACTATAACTCGCCCGTGACCGATCTGACGTCCAACGACCTC
CGCTGCAATGTGGGTGCTACCGGTGCGGGCACCGATACCGTCACGGTG
CGCGCCGGCGATTCGTTCACCTTCACGACCGATACGCCCGTTTACCAC
CAGGGCCCGACCTCGATCTACATGTCCAAGGCCCCCGGCAGCGCGTCC
GACTACGACGGCAGCGGCGGCTGGTTCAAGATCAAGGACTGGGGTGCC
GACTTTAGCAGCGGCCAGGCCACCTGGACCTTGGCGTCTGACTACACC
GCCACGATTCCGGAATGTATTCCCCCCGGCGACTACCTGCTTCGCATC
CAGCAACTCGGCATCCACAACCCTTGGCCCGCGGGCATCCCCCAGTTC
TACATCTCTTGTGCCCAGATCACCGTGACTGGTGGCGGCAGTGCCAAC
CCCGGCCCGACCGTCTCCATCCCAGGCGCCTTCAAGGAGACCGACCCG
GGCTACACTGTCAACATCTACAACAACTTCCACAACTACACCGTCCCT
GGCCCAGCCGTCTTCACCTGCAACGGTAGCGGCGGCAACAACGGCGGC
GGCTCCAACCCAGTCACCACCACCACCACCACCACCACCAGGCCGTCC
ACCAGCACCGCCCAGTCCCAGCCGTCGTCGAGCCCGACCAGCCCCTCC
AGCTGCACCGTCGCGAAGTGGGGCCAGTGCGGAGGACAGGGTTACAGC
GGCTGCACCGTGTGCGCGGCCGGGTCGACCTGCCAGAAGACCAACGAC TACTACAGCCAGTGCTTG
(SEQ ID NO: 35) MKGLLGAAALSLAVSDVSAHYIFQQLTTGGVKHAVYQYIRKNTNYNSP
VTDLTSNDLRCNVGATGAGTDTVTVRAGDSFTFTTDTPVYHQGPTSIY
MSKAPGSASDYDGSGGWFKIKDWGADFSSGQATWTLASDYTATIPECI
PPGDYLLRIQQLGIHNPWPAGIPQFYISCAQITVTGGGSANPGPTVSI
PGAFKETDPGYTVNIYNNFHNYTVPGPAVFTCNGSGGNNGGGSNPVTT
TTTTTTRPSTSTAQSQPSSSPTSPSSCTVAKWGQCGGQGYSGCTVCAA GSTCQKTNDYYSQCL
(SEQ ID NO: 36) HYIFQQLTTGGVKHAVYQYIRKNTNYNSPVTDLTSNDLRCNVGATGAG
TDTVTVRAGDSFTFTTDTPVYHQGPTSIYMSKAPGSASDYDGSGGWFK
IKDWGADFSSGQATWTLASDYTATIPECIPPGDYLLRIQQLGIHNPWP
AGIPQFYISCAQITVTGGGSANPGPTVSIPGAFKETDPGYTVNIYNNF
HNYTVPGPAVFTCNGSGGNNGGGSNPVTTTTTTTTRPSTSTAQSQPSS
SPTSPSSCTVAKWGQCGGQGYSGCTVCAAGSTCQKTNDYYSQCL
[0318] The polynucleotide (SEQ ID NO:37) and amino acid (SEQ ID
NO:38) sequences of an M. thermophila GH61h are provided below. The
signal sequence is shown underlined in SEQ ID NO:38. SEQ ID NO:39
provides the sequence of this GH61h without the signal
sequence.
TABLE-US-00023 (SEQ ID NO: 37)
ATGTCTTCCTTCACCTCCAAGGGTCTCCTTTCCGCCCTCATGGGCGCG
GCAACGGTTGCCGCCCACGGTCACGTCACCAACATCGTCATCAACGGC
GTCTCATACCAGAACTTCGACCCATTCACGCACCCTTATATGCAGAAC
CCTCCGACGGTTGTCGGCTGGACCGCGAGCAACACGGACAACGGCTTC
GTCGGCCCCGAGTCCTTCTCTAGCCCGGACATCATCTGCCACAAGTCC
GCCACCAACGCTGGCGGCCATGCCGTCGTCGCGGCCGGCGATAAGGTC
TTCATCCAGTGGGACACCTGGCCCGAGTCGCACCACGGTCCGGTCATC
GACTATCTCGCCGACTGCGGCGACGCGGGCTGCGAGAAGGTCGACAAG
ACCACGCTCAAGTTCTTCAAGATCAGCGAGTCCGGCCTGCTCGACGGC
ACTAACGCCCCCGGCAAGTGGGCGTCCGACACGCTGATCGCCAACAAC
AACTCGTGGCTGGTCCAGATCCCGCCCAACATCGCCCCGGGCAACTAC
GTCCTGCGCCACGAGATCATCGCCCTGCACAGCGCCGGCCAGCAGAAC
GGCGCCCAGAACTACCCTCAGTGCTTCAACCTGCAGGTCACCGGCTCC
GGCACTCAGAAGCCCTCCGGCGTCCTCGGCACCGAGCTCTACAAGGCC
ACCGACGCCGGCATCCTGGCCAACATCTACACCTCGCCCGTCACCTAC
CAGATCCCCGGCCCGGCCATCATCTCGGGCGCCTCCGCCGTCCAGCAG
ACCACCTCGGCCATCACCGCCTCTGCTAGCGCCATCACCGGCTCCGCT
ACCGCCGCGCCCACGGCTGCCACCACCACCGCCGCCGCCGCCGCCACC
ACTACCACCACCGCTGGCTCCGGTGCTACCGCCACGCCCTCGACCGGC
GGCTCTCCTTCTTCCGCCCAGCCTGCTCCTACCACCGCTGCCGCTACC
TCCAGCCCTGCTCGCCCGACCCGCTGCGCTGGTCTGAAGAAGCGCCGT
CGCCACGCCCGTGACGTCAAGGTTGCCCTC (SEQ ID NO: 38)
MSSFTSKGLLSALMGAATVAAHGHVTNIVINGVSYQNFDPFTHPYMQN
PPTVVGWTASNTDNGFVGPESFSSPDIICHKSATNAGGHAVVAAGDKV
FIQWDTWPESHHGPVIDYLADCGDAGCEKVDKTTLKFFKISESGLLDG
TNAPGKWASDTLIANNNSWLVQIPPNIAPGNYVLRHEIIALHSAGQQN
GAQNYPQCFNLQVTGSGTQKPSGVLGTELYKATDAGILANIYTSPVTY
QIPGPAIISGASAVQQTTSAITASASAITGSATAAPTAATTTAAAAAT
TTTTAGSGATATPSTGGSPSSAQPAPTTAAATSSPARPTRCAGLKKRR RHARDVKVAL (SEQ ID NO: 39)
AHGHVTNIVINGVSYQNFDPFTHPYMQNPPTVVGWTASNTDNGFVGPE
SFSSPDIICHKSATNAGGHAVVAAGDKVFIQWDTWPESHHGPVIDYLA
DCGDAGCEKVDKTTLKFFKISESGLLDGTNAPGKWASDTLIANNNSWL
VQIPPNIAPGNYVLRHEIIALHSAGQQNGAQNYPQCFNLQVTGSGTQK
PSGVLGTELYKATDAGILANIYTSPVTYQIPGPAIISGASAVQQTTSA
ITASASAITGSATAAPTAATTTAAAAATTTTTAGSGATATPSTGGSPS
SAQPAPTTAAATSSPARPTRCAGLKKRRRHARDVKVAL
[0319] The polynucleotide (SEQ ID NO:40) and amino acid (SEQ ID
NO:41) sequences of an M. thermophila GH61i are provided below. The
signal sequence is shown underlined in SEQ ID NO:41. SEQ ID NO:42
provides the sequence of this GH61i without the signal
sequence.
TABLE-US-00024 (SEQ ID NO: 40)
ATGAAGACGCTCGCCGCCCTCGTGGTCTCGGCCGCCCTCGTGGCCGCG
CACGGCTATGTTGACCACGCCACGATCGGTGGCAAGGATTATCAGTTC
TACCAGCCGTACCAGGACCCTTACATGGGCGACAACAAGCCCGATAGG
GTTTCCCGCTCCATCCCGGGCAACGGCCCCGTGGAGGACGTCAACTCC
ATCGACCTCCAGTGCCACGCCGGTGCCGAACCGGCCAAGCTCCACGCC
CCCGCCGCCGCCGGCTCGACCGTGACGCTCTACTGGACCCTCTGGCCC
GACTCCCACGTCGGCCCCGTCATCACCTACATGGCTCGCTGCCCCGAC
ACCGGCTGCCAGGACTGGTCCCCGGGAACTAAGCCCGTTTGGTTCAAG
ATCAAGGAAGGCGGCCGTGAGGGCACCTCCAATACCCCGCTCATGACG
GCCCCCTCCGCCTACACCTACACGATCCCGTCCTGCCTCAAGAGCGGC
TACTACCTCGTCCGCCACGAGATCATCGCCCTGCACTCGGCCTGGCAG
TACCCCGGCGCCCAGTTCTACCCGGGCTGCCACCAGCTCCAGGTCACC
GGCGGCGGCTCCACCGTGCCCTCTACCAACCTGGTCTCCTTCCCCGGC
GCCTACAAGGGGAGCGACCCCGGCATCACCTACGACGCTTACAAGGCG
CAACCTTACACCATCCCTGGCCCGGCCGTGTTTACCTGCTGA (SEQ ID NO: 41)
MKTLAALVVSAALVAAHGYVDHATIGGKDYQFYQPYQDPYMGDNKPDR
VSRSIPGNGPVEDVNSIDLQCHAGAEPAKLHAPAAAGSTVTLYWTLWP
DSHVGPVITYMARCPDTGCQDWSPGTKPVWFKIKEGGREGTSNTPLMT
APSAYTYTIPSCLKSGYYLVRHEIIALHSAWQYPGAQFYPGCHQLQVT
GGGSTVPSTNLVSFPGAYKGSDPGITYDAYKAQPYTIPGPAVFTC (SEQ ID NO: 42)
YVDHATIGGKDYQFYQPYQDPYMGDNKPDRVSRSIPGNGPVEDVNSID
LQCHAGAEPAKLHAPAAAGSTVTLYWTLWPDSHVGPVITYMARCPDTG
CQDWSPGTKPVWFKIKEGGREGTSNTPLMTAPSAYTYTIPSCLKSGYY
LVRHEIIALHSAWQYPGAQFYPGCHQLQVTGGGSTVPSTNLVSFPGAY
KGSDPGITYDAYKAQPYTIPGPAVFTC
[0320] The polynucleotide (SEQ ID NO:43) and amino acid (SEQ ID
NO:44) sequences of an alternative M. thermophila GH61i are
provided below. The signal sequence is shown underlined in SEQ ID
NO:44. SEQ ID NO:45 provides the sequence of this GH61i without the
signal sequence.
TABLE-US-00025 (SEQ ID NO: 43)
ATGAAGACGCTCGCCGCCCTCGTGGTCTCGGCCGCCCTCGTGGCCGCGCA
CGGCTATGTTGACCACGCCACGATCGGTGGCAAGGATTATCAGTTCTACC
AGCCGTACCAGGACCCTTACATGGGCGACAACAAGCCCGATAGGGTTTCC
CGCTCCATCCCGGGCAACGGCCCCGTGGAGGACGTCAACTCCATCGACCT
CCAGTGCCACGCCGGTGCCGAACCGGCCAAGCTCCACGCCCCCGCCGCCG
CCGGCTCGACCGTGACGCTCTACTGGACCCTCTGGCCCGACTCCCACGTC
GGCCCCGTCATCACCTACATGGCTCGCTGCCCCGACACCGGCTGCCAGGA
CTGGTCCCCGGGAACTAAGCCCGTTTGGTTCAAGATCAAGGAAGGCGGCC
GTGAGGGCACCTCCAATGTCTGGGCTGCTACCCCGCTCATGACGGCCCCC
TCCGCCTACACCTACACGATCCCGTCCTGCCTCAAGAGCGGCTACTACCT
CGTCCGCCACGAGATCATCGCCCTGCACTCGGCCTGGCAGTACCCCGGCG
CCCAGTTCTACCCGGGCTGCCACCAGCTCCAGGTCACCGGCGGCGGCTCC
ACCGTGCCCTCTACCAACCTGGTCTCCTTCCCCGGCGCCTACAAGGGGAG
CGACCCCGGCATCACCTACGACGCTTACAAGGCGCAACCTTACACCATCC
CTGGCCCGGCCGTGTTTACCTGC (SEQ ID NO: 44)
MKTLAALVVSAALVAAHGYVDHATIGGKDYQFYQPYQDPYMGDNKPDRVS
RSIPGNGPVEDVNSIDLQCHAGAEPAKLHAPAAAGSTVTLYWTLWPDSHV
GPVITYMARCPDTGCQDWSPGTKPVWFKIKEGGREGTSNVWAATPLMTAP
SAYTYTIPSCLKSGYYLVRHEIIALHSAWQYPGAQFYPGCHQLQVTGGGS
TVPSTNLVSFPGAYKGSDPGITYDAYKAQPYTIPGPAVFTC (SEQ ID NO: 45)
YVDHATIGGKDYQFYQPYQDPYMGDNKPDRVSRSIPGNGPVEDVNSIDLQ
CHAGAEPAKLHAPAAAGSTVTLYWTLWPDSHVGPVITYMARCPDTGCQDW
SPGTKPVWFKIKEGGREGTSNVWAATPLMTAPSAYTYTIPSCLKSGYYLV
RHEIIALHSAWQYPGAQFYPGCHQLQVTGGGSTVPSTNLVSFPGAYKGSD
PGITYDAYKAQPYTIPGPAVFTC
[0321] The polynucleotide (SEQ ID NO:46) and amino acid (SEQ ID
NO:47) sequences of an M. thermophila GH61j are provided below. The
signal sequence is shown underlined in SEQ ID NO:47. SEQ ID NO:48
provides the sequence of this GH61j without the signal
sequence.
TABLE-US-00026 (SEQ ID NO: 46)
ATGAGATACTTCCTCCAGCTCGCTGCGGCCGCGGCCTTTGCCGTGAACAG
CGCGGCGGGTCACTACATCTTCCAGCAGTTCGCGACGGGCGGGTCCAAGT
ACCCGCCCTGGAAGTACATCCGGCGCAACACCAACCCGGACTGGCTGCAG
AACGGGCCGGTGACGGACCTGTCGTCGACCGACCTGCGCTGCAACGTGGG
CGGGCAGGTCAGCAACGGGACCGAGACCATCACCTTGAACGCCGGCGACG
AGTTCAGCTTCATCCTCGACACGCCCGTCTACCATGCCGGCCCCACCTCG
CTCTACATGTCCAAGGCGCCCGGAGCTGTGGCCGACTACGACGGCGGCGG
GGCCTGGTTCAAGATCTACGACTGGGGTCCGTCGGGGACGAGCTGGACGT
TGAGTGGCACGTACACTCAGAGAATTCCCAAGTGCATCCCTGACGGCGAG
TACCTCCTCCGCATCCAGCAGATCGGGCTCCACAACCCCGGCGCCGCGCC
ACAGTTCTACATCAGCTGCGCTCAAGTCAAGGTCGTCGATGGCGGCAGCA
CCAATCCGACCCCGACCGCCCAGATTCCGGGAGCCTTCCACAGCAACGAC
CCTGGCTTGACTGTCAATATCTACAACGACCCTCTCACCAACTACGTCGT
CCCGGGACCTAGAGTTTCGCACTGG (SEQ ID NO: 47)
MRYFLQLAAAAAFAVNSAAGHYIFQQFATGGSKYPPWKYIRRNTNPDWLQ
NGPVTDLSSTDLRCNVGGQVSNGTETITLNAGDEFSFILDTPVYHAGPTS
LYMSKAPGAVADYDGGGAWFKIYDWGPSGTSWTLSGTYTQRIPKCIPDGE
YLLRIQQIGLHNPGAAPQFYISCAQVKVVDGGSTNPTPTAQIPGAFHSND
PGLTVNIYNDPLTNYVVPGPRVSHW (SEQ ID NO: 48)
HYIFQQFATGGSKYPPWKYIRRNTNPDWLQNGPVTDLSSTDLRCNVGGQV
SNGTETITLNAGDEFSFILDTPVYHAGPTSLYMSKAPGAVADYDGGGAWF
KIYDWGPSGTSWTLSGTYTQRIPKCIPDGEYLLRIQQIGLHNPGAAPQFY
ISCAQVKVVDGGSTNPTPTAQIPGAFHSNDPGLTVNIYNDPLTNYVVPGP RVSHW
[0322] The polynucleotide (SEQ ID NO:49) and amino acid (SEQ ID
NO:50) sequences of an M. thermophila GH61k are provided below. The
signal sequence is shown underlined in SEQ ID NO:50. SEQ ID NO:51
provides the sequence of this GH61k without the signal
sequence.
TABLE-US-00027 (SEQ ID NO: 49)
ATGCACCCCTCCCTTCTTTTCACGCTTGGGCTGGCGAGCGTGCTTGTCCC
CCTCTCGTCTGCACACACTACCTTCACGACCCTCTTCGTCAACGATGTCA
ACCAAGGTGATGGTACCTGCATTCGCATGGCGAAGAAGGGCAATGTCGCC
ACCCATCCTCTCGCAGGCGGTCTCGACTCCGAAGACATGGCCTGTGGTCG
GGATGGTCAAGAACCCGTGGCATTTACGTGTCCGGCCCCAGCTGGTGCCA
AGTTGACTCTCGAGTTTCGCATGTGGGCCGATGCTTCGCAGTCCGGATCG
ATCGATCCATCCCACCTTGGCGTCATGGCCATCTACCTCAAGAAGGTTTC
CGACATGAAATCTGACGCGGCCGCTGGCCCGGGCTGGTTCAAGATTTGGG
ACCAAGGCTACGACTTGGCGGCCAAGAAGTGGGCCACCGAGAAGCTCATC
GACAACAACGGCCTCCTGAGCGTCAACCTTCCAACCGGCTTACCAACCGG
CTACTACCTCGCCCGCCAGGAGATCATCACGCTCCAAAACGTTACCAATG
ACAGGCCAGAGCCCCAGTTCTACGTCGGCTGCGCACAGCTCTACGTCGAG
GGCACCTCGGACTCACCCATCCCCTCGGACAAGACGGTCTCCATTCCCGG
CCACATCAGCGACCCGGCCGACCCGGGCCTGACCTTCAACGTCTACACGG
GCGACGCATCCACCTACAAGCCGCCCGGCCCCGAGGTTTACTTCCCCACC
ACCACCACCACCACCTCCTCCTCCTCCTCCGGAAGCAGCGACAACAAGGG
AGCCAGGCGCCAGCAAACCCCCGACGACAAGCAGGCCGACGGCCTCGTTC
CAGCCGACTGCCTCGTCAAGAACGCGAACTGGTGCGCCGCTGCCCTGCCG
CCGTACACCGACGAGGCCGGCTGCTGGGCCGCCGCCGAGGACTGCAACAA
GCAGCTGGACGCGTGCTACACCAGCGCACCCCCCTCGGGCAGCAAGGGGT
GCAAGGTCTGGGAGGAGCAGGTGTGCACCGTCGTCTCGCAGAAGTGCGAG
GCCGGGGATTTCAAGGGGCCCCCGCAGCTCGGGAAGGAGCTCGGCGAGGG
GATCGATGAGCCTATTCCGGGGGGAAAGCTGCCCCCGGCGGTCAACGCGG
GAGAGAACGGGAATCATGGCGGAGGTGGTGGTGATGATGGTGATGATGAT
AATGATGAGGCCGGGGCTGGGGCAGCGTCGACTCCGACTTTTGCTGCTCC
TGGTGCGGCCAAGACTCCCCAACCAAACTCCGAGAGGGCCCGGCGCCGTG
AGGCGCATTGGCGGCGACTGGAATCTGCTGAG (SEQ ID NO: 50)
MHPSLLFTLGLASVLVPLSSAHTTFTTLFVNDVNQGDGTCIRMAKKGNVA
THPLAGGLDSEDMACGRDGQEPVAFTCPAPAGAKLTLEFRMWADASQSGS
IDPSHLGVMAIYLKKVSDMKSDAAAGPGWFKIWDQGYDLAAKKWATEKLI
DNNGLLSVNLPTGLPTGYYLARQEIITLQNVTNDRPEPQFYVGCAQLYVE
GTSDSPIPSDKTVSIPGHISDPADPGLTFNVYTGDASTYKPPGPEVYFPT
TTTTTSSSSSGSSDNKGARRQQTPDDKQADGLVPADCLVKNANWCAAALP
PYTDEAGCWAAAEDCNKQLDACYTSAPPSGSKGCKVWEEQVCTVVSQKCE
AGDFKGPPQLGKELGEGIDEPIPGGKLPPAVNAGENGNHGGGGGDDGDDD
NDEAGAGAASTPTFAAPGAAKTPQPNSERARRREAHWRRLESAE (SEQ ID NO: 51)
HTTFTTLFVNDVNQGDGTCIRMAKKGNVATHPLAGGLDSEDMACGRDGQE
PVAFTCPAPAGAKLTLEFRMWADASQSGSIDPSHLGVMAIYLKKVSDMKS
DAAAGPGWFKIWDQGYDLAAKKWATEKLIDNNGLLSVNLPTGLPTGYYLA
RQEIITLQNVTNDRPEPQFYVGCAQLYVEGTSDSPIPSDKTVSIPGHISD
PADPGLTFNVYTGDASTYKPPGPEVYFPTTTTTTSSSSSGSSDNKGARRQ
QTPDDKQADGLVPADCLVKNANWCAAALPPYTDEAGCWAAAEDCNKQLDA
CYTSAPPSGSKGCKVWEEQVCTVVSQKCEAGDFKGPPQLGKELGEGIDEP
IPGGKLPPAVNAGENGNHGGGGGDDGDDDNDEAGAGAASTPTFAAPGAAK
TPQPNSERARRREAHWRRLESAE
[0323] The polynucleotide (SEQ ID NO:52) and amino acid (SEQ ID
NO:53) sequences of an M. thermophila GH61l are provided below. The
signal sequence is shown underlined in SEQ ID NO:53. SEQ ID NO:54
provides the sequence of this GH61l without the signal
sequence.
TABLE-US-00028 (SEQ ID NO: 52)
ATGTTTTCTCTCAAGTTCTTTATCTTGGCCGGTGGGCTTGCTGTCCTCAC
CGAGGCTCACATAAGACTAGTGTCGCCCGCCCCTTTTACCAACCCTGACC
AGGGCCCCAGCCCACTCCTAGAGGCTGGCAGCGACTATCCCTGCCACAAC
GGCAATGGGGGCGGTTATCAGGGAACGCCAACCCAGATGGCAAAGGGTTC
TAAGCAGCAGCTAGCCTTCCAGGGGTCTGCCGTTCATGGGGGTGGCTCCT
GCCAAGTGTCCATCACCTACGACGAAAACCCGACCGCTCAGAGCTCCTTC
AAGGTCATTCACTCGATTCAAGGTGGCTGCCCCGCCAGGGCCGAGACGAT
CCCGGATTGCAGCGCACAAAATATCAACGCCTGCAATATAAAGCCCGATA
ATGCCCAGATGGACACCCCGGATAAGTATGAGTTCACGATCCCGGAGGAT
CTCCCCAGTGGCAAGGCCACCCTCGCCTGGACATGGATCAACACTATCGG
CAACCGCGAGTTTTATATGGCATGCGCCCCGGTTGAGATCACCGGCGACG
GCGGTAGCGAGTCGGCTCTGGCTGCGCTGCCCGACATGGTCATTGCCAAC
ATCCCGTCCATCGGAGGAACCTGCGCGACCGAGGAGGGGAAGTACTACGA
ATATCCCAACCCCGGTAAGTCGGTCGAAACCATCCCGGGCTGGACCGATT
TGGTTCCCCTGCAAGGCGAATGCGGTGCTGCCTCCGGTGTCTCGGGCTCC
GGCGGAAACGCCAGCAGTGCTACCCCTGCCGCAGGGGCCGCCCCGACTCC
TGCTGTCCGCGGCCGCCGTCCCACCTGGAACGCC (SEQ ID NO: 53)
MFSLKFFILAGGLAVLTEAHIRLVSPAPFTNPDQGPSPLLEAGSDYPCHN
GNGGGYQGTPTQMAKGSKQQLAFQGSAVHGGGSCQVSITYDENPTAQSSF
KVIHSIQGGCPARAETIPDCSAQNINACNIKPDNAQMDTPDKYEFTIPED
LPSGKATLAWTWINTIGNREFYMACAPVEITGDGGSESALAALPDMVIAN
IPSIGGTCATEEGKYYEYPNPGKSVETIPGWTDLVPLQGECGAASGVSGS
GGNASSATPAAGAAPTPAVRGRRPTWNA (SEQ ID NO: 54)
HIRLVSPAPFTNPDQGPSPLLEAGSDYPCHNGNGGGYQGTPTQMAKGSKQ
QLAFQGSAVHGGGSCQVSITYDENPTAQSSFKVIHSIQGGCPARAETIPD
CSAQNINACNIKPDNAQMDTPDKYEFTIPEDLPSGKATLAWTWINTIGNR
EFYMACAPVEITGDGGSESALAALPDMVIANIPSIGGTCATEEGKYYEYP
NPGKSVETIPGWTDLVPLQGECGAASGVSGSGGNASSATPAAGAAPTPAV RGRRPTWNA
[0324] The polynucleotide (SEQ ID NO:55) and amino acid (SEQ ID
NO:56) sequences of an M. thermophila GH61m are provided below. The
signal sequence is shown underlined in SEQ ID NO:56. SEQ ID NO:57
provides the sequence of this GH61m without the signal
sequence.
TABLE-US-00029 (SEQ ID NO: 55)
ATGAAGCTCGCCACGCTCCTCGCCGCCCTCACCCTCGGGGTGGCCGACCA
GCTCAGCGTCGGGTCCAGAAAGTTTGGCGTGTACGAGCACATTCGCAAGA
ACACGAACTACAACTCGCCCGTTACCGACCTGTCGGACACCAACCTGCGC
TGCAACGTCGGCGGGGGCTCGGGCACCAGCACCACCGTGCTCGACGTCAA
GGCCGGAGACTCGTTCACCTTCTTCAGCGACGTTGCCGTCTACCACCAGG
GGCCCATCTCGCTGTGCGTGGACCGGACCAGTGCAGAGAGCATGGATGGA
CGGGAACCGGACATGCGCTGCCGAACTGGCTCACAAGCTGGCTACCTGGC
GGTGACTGACTACGACGGGTCCGGTGACTGTTTCAAGATCTATGACTGGG
GACCGACGTTCAACGGGGGCCAGGCGTCGTGGCCGACGAGGAATTCGTAC
GAGTACAGCATCCTCAAGTGCATCAGGGACGGCGAATACCTACTGCGGAT
TCAGTCCCTGGCCATCCATAACCCAGGTGCCCTTCCGCAGTTCTACATCA
GCTGCGCCCAGGTGAATGTGACGGGCGGAGGCACCGTCACCCCGAGATCA
AGGCGACCGATCCTGATCTATTTCAACTTCCACTCGTATATCGTCCCTGG
GCCGGCAGTGTTCAAGTGCTAG (SEQ ID NO: 56)
MKLATLLAALTLGVADQLSVGSRKFGVYEHIRKNTNYNSPVTDLSDTNLR
CNVGGGSGTSTTVLDVKAGDSFTFFSDVAVYHQGPISLCVDRTSAESMDG
REPDMRCRTGSQAGYLAVTDYDGSGDCFKIYDWGPTFNGGQASWPTRNSY
EYSILKCIRDGEYLLRIQSLAIHNPGALPQFYISCAQVNVTGGGTVTPRS
RRPILIYFNFHSYIVPGPAVFKC (SEQ ID NO: 57)
DQLSVGSRKFGVYEHIRKNTNYNSPVTDLSDTNLRCNVGGGSGTSTTVLD
VKAGDSFTFFSDVAVYHQGPISLCVDRTSAESMDGREPDMRCRTGSQAGY
LAVTDYDGSGDCFKIYDWGPTFNGGQASWPTRNSYEYSILKCIRDGEYLL
RIQSLAIHNPGALPQFYISCAQVNVTGGGTVTPRSRRPILIYFNFHSYIV PGPAVFKC
[0325] The polynucleotide (SEQ ID NO:58) and amino acid (SEQ ID
NO:59) sequences of an alternative M. thermophila GH61m are
provided below. The signal sequence is shown underlined in SEQ ID
NO:59. SEQ ID NO:60 provides the sequence of this GH61m without the
signal sequence.
TABLE-US-00030 (SEQ ID NO: 58)
ATGAAGCTCGCCACGCTCCTCGCCGCCCTCACCCTCGGGCTCAGCGTCGG
GTCCAGAAAGTTTGGCGTGTACGAGCACATTCGCAAGAACACGAACTACA
ACTCGCCCGTTACCGACCTGTCGGACACCAACCTGCGCTGCAACGTCGGC
GGGGGCTCGGGCACCAGCACCACCGTGCTCGACGTCAAGGCCGGAGACTC
GTTCACCTTCTTCAGCGACGTTGCCGTCTACCACCAGGGGCCCATCTCGC
TGTGCGTGGACCGGACCAGTGCAGAGAGCATGGATGGACGGGAACCGGAC
ATGCGCTGCCGAACTGGCTCACAAGCTGGCTACCTGGCGGTGACTGTGAT
GACTGTGACTGACTACGACGGGTCCGGTGACTGTTTCAAGATCTATGACT
GGGGACCGACGTTCAACGGGGGCCAGGCGTCGTGGCCGACGAGGAATTCG
TACGAGTACAGCATCCTCAAGTGCATCAGGGACGGCGAATACCTACTGCG
GATTCAGTCCCTGGCCATCCATAACCCAGGTGCCCTTCCGCAGTTCTACA
TCAGCTGCGCCCAGGTGAATGTGACGGGCGGAGGCACCATCTATTTCAAC
TTCCACTCGTATATCGTCCCTGGGCCGGCAGTGTTCAAGTGC (SEQ ID NO: 59)
MKLATLLAALTLGLSVGSRKFGVYEHIRKNTNYNSPVTDLSDTNLRCNVG
GGSGTSTTVLDVKAGDSFTFFSDVAVYHQGPISLCVDRTSAESMDGREPD
MRCRTGSQAGYLAVTVMTVTDYDGSGDCFKIYDWGPTFNGGQASWPTRNS
YEYSILKCIRDGEYLLRIQSLAIHNPGALPQFYISCAQVNVTGGGTIYFN FHSYIVPGPAVFKC
(SEQ ID NO: 60) RKFGVYEHIRKNTNYNSPVTDLSDTNLRCNVGGGSGTSTTVLDVKAGDSF
TFFSDVAVYHQGPISLCVDRTSAESMDGREPDMRCRTGSQAGYLAVTVMT
VTDYDGSGDCFKIYDWGPTFNGGQASWPTRNSYEYSILKCIRDGEYLLRI
QSLAIHNPGALPQFYISCAQVNVTGGGTIYFNFHSYIVPGPAVFKC
[0326] The polynucleotide (SEQ ID NO:61) and amino acid (SEQ ID
NO:62) sequences of an M. thermophila GH61n are provided below.
TABLE-US-00031 (SEQ ID NO: 61)
ATGACCAAGAATGCGCAGAGCAAGCAGGGCGTTGAGAACCCAACAAGCGG
CGACATCCGCTGCTACACCTCGCAGACGGCGGCCAACGTCGTGACCGTGC
CGGCCGGCTCGACCATTCACTACATCTCGACCCAGCAGATCAACCACCCC
GGCCCGACTCAGTACTACCTGGCCAAGGTACCCCCCGGCTCGTCGGCCAA
GACCTTTGACGGGTCCGGCGCCGTCTGGTTCAAGATCTCGACCACGATGC
CTACCGTGGACAGCAACAAGCAGATGTTCTGGCCAGGGCAGAACACTTAT
GAGACCTCAAACACCACCATTCCCGCCAACACCCCGGACGGCGAGTACCT
CCTTCGCGTCAAGCAGATCGCCCTCCACATGGCGTCTCAGCCCAACAAGG
TCCAGTTCTACCTCGCCTGCACCCAGATCAAGATCACCGGTGGTCGCAAC
GGCACCCCCAGCCCGCTGGTCGCGCTGCCCGGAGCCTACAAGAGCACCGA
CCCCGGCATCCTGGTCGACATCTACTCCATGAAGCCCGAATCGTACCAGC
CTCCCGGGCCGCCCGTCTGGCGCGGCTAA (SEQ ID NO: 62)
MTKNAQSKQGVENPTSGDIRCYTSQTAANVVTVPAGSTIHYISTQQINHP
GPTQYYLAKVPPGSSAKTFDGSGAVWFKISTTMPTVDSNKQMFWPGQNTY
ETSNTTIPANTPDGEYLLRVKQIALHMASQPNKVQFYLACTQIKITGGRN
GTPSPLVALPGAYKSTDPGILVDIYSMKPESYQPPGPPVWRG
[0327] The polynucleotide (SEQ ID NO:63) and amino acid (SEQ ID
NO:64) sequences of an alternative M. thermophila GH61n are
provided below. The signal sequence is shown underlined in SEQ ID
NO:64. SEQ ID NO:65 provides the sequence of this GH61n without the
signal sequence.
TABLE-US-00032 (SEQ ID NO: 63)
ATGAGGCTTCTCGCAAGCTTGTTGCTCGCAGCTACGGCTGTTCAAGCTCA
CTTTGTTAACGGACAGCCCGAAGAGAGTGACTGGTCAGCCACGCGCATGA
CCAAGAATGCGCAGAGCAAGCAGGGCGTTGAGAACCCAACAAGCGGCGAC
ATCCGCTGCTACACCTCGCAGACGGCGGCCAACGTCGTGACCGTGCCGGC
CGGCTCGACCATTCACTACATCTCGACCCAGCAGATCAACCACCCCGGCC
CGACTCAGTACTACCTGGCCAAGGTACCCCCCGGCTCGTCGGCCAAGACC
TTTGACGGGTCCGGCGCCGTCTGGTTCAAGATCTCGACCACGATGCCTAC
CGTGGACAGCAACAAGCAGATGTTCTGGCCAGGGCAGAACACTTATGAGA
CCTCAAACACCACCATTCCCGCCAACACCCCGGACGGCGAGTACCTCCTT
CGCGTCAAGCAGATCGCCCTCCACATGGCGTCTCAGCCCAACAAGGTCCA
GTTCTACCTCGCCTGCACCCAGATCAAGATCACCGGTGGTCGCAACGGCA
CCCCCAGCCCGCTGGTCGCGCTGCCCGGAGCCTACAAGAGCACCGACCCC
GGCATCCTGGTCGACATCTACTCCATGAAGCCCGAATCGTACCAGCCTCC
CGGGCCGCCCGTCTGGCGCGGC (SEQ ID NO: 64)
MRLLASLLLAATAVQAHFVNGQPEESDWSATRMTKNAQSKQGVENPTSGD
IRCYTSQTAANVVTVPAGSTIHYISTQQINHPGPTQYYLAKVPPGSSAKT
FDGSGAVWFKISTTMPTVDSNKQMFWPGQNTYETSNTTIPANTPDGEYLL
RVKQIALHMASQPNKVQFYLACTQIKITGGRNGTPSPLVALPGAYKSTDP
GILVDIYSMKPESYQPPGPPVWRG (SEQ ID NO: 65)
HFVNGQPEESDWSATRMTKNAQSKQGVENPTSGDIRCYTSQTAANVVTVP
AGSTIHYISTQQINHPGPTQYYLAKVPPGSSAKTFDGSGAVWFKISTTMP
TVDSNKQMFWPGQNTYETSNTTIPANTPDGEYLLRVKQIALHMASQPNKV
QFYLACTQIKITGGRNGTPSPLVALPGAYKSTDPGILVDIYSMKPESYQP PGPPVWRG
[0328] The polynucleotide (SEQ ID NO:66) and amino acid (SEQ ID
NO:67) sequences of an alternative M. thermophila GH61o are
provided below. The signal sequence is shown underlined in SEQ ID
NO:67. SEQ ID NO:68 provides the sequence of this GH61o without the
signal sequence.
TABLE-US-00033 (SEQ ID NO: 66)
ATGAAGCCCTTTAGCCTCGTCGCCCTGGCGACTGCCGTGAGCGGCCATGC
CATCTTCCAGCGGGTGTCGGTCAACGGGCAGGACCAGGGCCAGCTCAAGG
GGGTGCGGGCGCCGTCGAGCAACTCCCCGATCCAGAACGTCAACGATGCC
AACATGGCCTGCAACGCCAACATTGTGTACCACGACAACACCATCATCAA
GGTGCCCGCGGGAGCCCGCGTCGGCGCGTGGTGGCAGCACGTCATCGGCG
GGCCGCAGGGCGCCAACGACCCGGACAACCCGATCGCCGCCTCCCACAAG
GGCCCCATCCAGGTCTACCTGGCCAAGGTGGACAACGCGGCGACGGCGTC
GCCGTCGGGCCTCAAGTGGTTCAAGGTGGCCGAGCGCGGCCTGAACAACG
GCGTGTGGGCCTACCTGATGCGCGTCGAGCTGCTCGCCCTGCACAGCGCC
TCGAGCCCCGGCGGCGCCCAGTTCTACATGGGCTGTGCACAGATCGAAGT
CACTGGCTCCGGCACCAACTCGGGCTCCGACTTTGTCTCGTTCCCCGGCG
CCTACTCGGCCAACGACCCGGGCATCTTGCTGAGCATCTACGACAGCTCG
GGCAAGCCCAACAATGGCGGGCGCTCGTACCCGATCCCCGGCCCGCGCCC
CATCTCCTGCTCCGGCAGCGGCGGCGGCGGCAACAACGGCGGCGACGGCG
GCGACGACAACAACGGTGGTGGCAACAACAACGGCGGCGGCAGCGTCCCC
CTGTACGGGCAGTGCGGCGGCATCGGCTACACGGGCCCGACCACCTGTGC
CCAGGGAACTTGCAAGGTGTCGAACGAATACTACAGCCAGTGCCTCCCC (SEQ ID NO: 67)
MKPFSLVALATAVSGHAIFQRVSVNGQDQGQLKGVRAPSSNSPIQNVNDA
NMACNANIVYHDNTIIKVPAGARVGAWWQHVIGGPQGANDPDNPIAASHK
GPIQVYLAKVDNAATASPSGLKWFKVAERGLNNGVWAYLMRVELLALHSA
SSPGGAQFYMGCAQIEVTGSGTNSGSDFVSFPGAYSANDPGILLSIYDSS
GKPNNGGRSYPIPGPRPISCSGSGGGGNNGGDGGDDNNGGGNNNGGGSVP
LYGQCGGIGYTGPTTCAQGTCKVSNEYYSQCLP (SEQ ID NO: 68)
HAIFQRVSVNGQDQGQLKGVRAPSSNSPIQNVNDANMACNANIVYHDNTI
IKVPAGARVGAWWQHVIGGPQGANDPDNPIAASHKGPIQVYLAKVDNAAT
ASPSGLKWFKVAERGLNNGVWAYLMRVELLALHSASSPGGAQFYMGCAQI
EVTGSGTNSGSDFVSFPGAYSANDPGILLSIYDSSGKPNNGGRSYPIPGP
RPISCSGSGGGGNNGGDGGDDNNGGGNNNGGGSVPLYGQCGGIGYTGPTT
CAQGTCKVSNEYYSQCLP
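By way of non-limiting illustration, the relationship between a full-length sequence (e.g., SEQ ID NO:67) and the corresponding sequence without the signal sequence (e.g., SEQ ID NO:68) can be checked computationally. The Python sketch below assumes that the mature sequence is an exact suffix of the full-length sequence. The function name `split_signal` is hypothetical, and the two constants are truncated fragments (the first 50 residues of SEQ ID NO:67 and the matching first 35 residues of SEQ ID NO:68) chosen only to keep the example short.

```python
def split_signal(full: str, mature: str) -> tuple[str, str]:
    """Return (signal, mature) when `mature` is an exact suffix of `full`.

    Assumption: the signal sequence is simply the N-terminal residues that
    are present in `full` but absent from `mature`.
    """
    if not full.endswith(mature):
        raise ValueError("mature sequence is not a suffix of the full-length sequence")
    cut = len(full) - len(mature)
    return full[:cut], full[cut:]


# Truncated fragments, for illustration only (not the complete sequences).
FULL_FRAGMENT = "MKPFSLVALATAVSGHAIFQRVSVNGQDQGQLKGVRAPSSNSPIQNVNDA"
MATURE_FRAGMENT = "HAIFQRVSVNGQDQGQLKGVRAPSSNSPIQNVNDA"

signal, mature = split_signal(FULL_FRAGMENT, MATURE_FRAGMENT)
print(signal, len(signal))
```

For these fragments the residues removed are MKPFSLVALATAVSG (15 residues). The same check can be applied to the complete sequences once they have been assembled from the rows of the table.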
[0329] The polynucleotide (SEQ ID NO:69) and amino acid (SEQ ID
NO:70) sequences of an M. thermophila GH61p are provided below. The
signal sequence is shown underlined in SEQ ID NO:70. SEQ ID NO:71
provides the sequence of this GH61p without the signal
sequence.
TABLE-US-00034 (SEQ ID NO: 69)
ATGAAGCTCACCTCGTCCCTCGCTGTCCTGGCCGCTGCCGGCGCCCAGGC
TCACTATACCTTCCCTAGGGCCGGCACTGGTGGTTCGCTCTCTGGCGAGT
GGGAGGTGGTCCGCATGACCGAGAACCATTACTCGCACGGCCCGGTCACC
GATGTCACCAGCCCCGAGATGACCTGCTATCAGTCCGGCGTGCAGGGTGC
GCCCCAGACCGTCCAGGTCAAGGCGGGCTCCCAATTCACCTTCAGCGTGG
ATCCCTCCATCGGCCACCCCGGCCCTCTCCAGTTCTACATGGCTAAGGTG
CCGTCGGGCCAGACGGCCGCCACCTTTGACGGCACGGGAGCCGTGTGGTT
CAAGATCTACCAAGACGGCCCGAACGGCCTCGGCACCGACAGCATTACCT
GGCCCAGCGCCGGCAAAACCGAGGTCTCGGTCACCATCCCCAGCTGCATC
GAGGATGGCGAGTACCTGCTCCGGGTCGAGCACACCCCCCTCCCTACAGC
GCCAGCAGCGCAAAACCGAGCTCGCTCGTCACCATCCCCAGCTGCATACA
AGGCCACCGACCCGGGCATCCTCTTCCAGCTCTACTGGCCCATCCCGACC
GAGTACATCAACCCCGGCCCGGCCCCCGTCTCTTGCTAA (SEQ ID NO: 70)
MKLTSSLAVLAAAGAQAHYTFPRAGTGGSLSGEWEVVRMTENHYSHGPVT
DVTSPEMTCYQSGVQGAPQTVQVKAGSQFTFSVDPSIGHPGPLQFYMAKV
PSGQTAATFDGTGAVWFKIYQDGPNGLGTDSITWPSAGKTEVSVTIPSCI
EDGEYLLRVEHTPLPTAPAAQNRARSSPSPAAYKATDPGILFQLYWPIPT EYINPGPAPVSC (SEQ ID NO: 71)
HYTFPRAGTGGSLSGEWEVVRMTENHYSHGPVTDVTSPEMTCYQSGVQGA
PQTVQVKAGSQFTFSVDPSIGHPGPLQFYMAKVPSGQTAATFDGTGAVWF
KIYQDGPNGLGTDSITWPSAGKTEVSVTIPSCIEDGEYLLRVEHTPLPTA
PAAQNRARSSPSPAAYKATDPGILFQLYWPIPTEYINPGPAPVSC
[0330] The polynucleotide (SEQ ID NO:72) and amino acid (SEQ ID
NO:73) sequences of an alternative M. thermophila GH61p are
provided below. The signal sequence is shown underlined in SEQ ID
NO:73. SEQ ID NO:74 provides the sequence of this GH61p without the
signal sequence.
TABLE-US-00035 (SEQ ID NO: 72)
ATGAAGCTCACCTCGTCCCTCGCTGTCCTGGCCGCTGCCGGCGCCCAGGC
TCACTATACCTTCCCTAGGGCCGGCACTGGTGGTTCGCTCTCTGGCGAGT
GGGAGGTGGTCCGCATGACCGAGACCATTACTCGCACGGCCCGGTCACCG
ATGTCACCAGCCCCGAGATGACCTGCTATCAGTCCGGCGTGCAGGGTGCG
CCCCAGACCGTCCAGGTCAAGGCGGGCTCCCAATTCACCTTCAGCGTGGA
TCCCTCCATCGGCCACCCCGGCCCTCTCCAGTTCTACATGGCTAAGGTGC
CGTCGGGCCAGACGGCCGCCACCTTTGACGGCACGGGAGCCGTGTGGTTC
AAGATCTACCAAGACGGCCCGAACGGCCTCGGCACCGACAGCATTACCTG
GCCCAGCGCCGGCAAAACCGAGGTCTCGGTCACCATCCCCAGCTGCATCG
AGGATGGCGAGTACCTGCTCCGGGTCGAGCACATCGCGCTCCACAGCGCC
AGCAGCGTGGGCGGCGCCCAGTTCTACATCGCCTGCGCCCAGCTCTCCGT
CACCGGCGGCTCCGGCACCCTCAACACGGGCTCGCTCGTCTCCCTGCCCG
GCGCCTACAAGGCCACCGACCCGGGCATCCTCTTCCAGCTCTACTGGCCC
ATCCCGACCGAGTACATCAACCCCGGCCCGGCCCCCGTCTCTTGC (SEQ ID NO: 73)
MKLTSSLAVLAAAGAQAHYTFPRAGTGGSLSGEWEVVRMTENHYSHGPVT
DVTSPEMTCYQSGVQGAPQTVQVKAGSQFTFSVDPSIGHPGPLQFYMAKV
PSGQTAATFDGTGAVWFKIYQDGPNGLGTDSITWPSAGKTEVSVTIPSCI
EDGEYLLRVEHIALHSASSVGGAQFYIACAQLSVTGGSGTLNTGSLVSLP
GAYKATDPGILFQLYWPIPTEYINPGPAPVSC (SEQ ID NO: 74)
HYTFPRAGTGGSLSGEWEVVRMTENHYSHGPVTDVTSPEMTCYQSGVQGA
PQTVQVKAGSQFTFSVDPSIGHPGPLQFYMAKVPSGQTAATFDGTGAVWF
KIYQDGPNGLGTDSITWPSAGKTEVSVTIPSCIEDGEYLLRVEHIALHSA
SSVGGAQFYIACAQLSVTGGSGTLNTGSLVSLPGAYKATDPGILFQLYWP
IPTEYINPGPAPVSC
[0331] The polynucleotide (SEQ ID NO:75) and amino acid (SEQ ID
NO:76) sequences of an alternative M. thermophila GH61q are
provided below. The signal sequence is shown underlined in SEQ ID
NO:76. SEQ ID NO:77 provides the sequence of this GH61q without the
signal sequence.
TABLE-US-00036 (SEQ ID NO: 75)
ATGCCGCCACCACGACTGAGCACCCTCCTTCCCCTCCTAGCCTTAATAGC
CCCCACCGCCCTGGGGCACTCCCACCTCGGGTACATCATCATCAACGGCG
AGGTATACCAAGGATTCGACCCGCGGCCGGAGCAGGCGAACTCGCCGTTG
CGCGTGGGCTGGTCGACGGGGGCAATCGACGACGGGTTCGTGGCGCCGGC
CAACTACTCGTCGCCCGACATCATCTGCCACATCGAGGGGGCCAGCCCGC
CGGCGCACGCGCCCGTCCGGGCGGGCGACCGGGTGCACGTGCAATGGAAC
GGCTGGCCGCTCGGACACGTGGGGCCGGTGCTGTCGTACCTGGCGCCCTG
CGGCGGGCTGGAGGGGTCCGAGAGCGGGTGCGCCGGGGTGGACAAGCGGC
AGCTGCGGTGGACCAAGGTGGACGACTCGCTGCCGGCGATGGAGCTG (SEQ ID NO: 76)
MPPPRLSTLLPLLALIAPTALGHSHLGYIIINGEVYQGFDPRPEQANSPL
RVGWSTGAIDDGFVAPANYSSPDIICHIEGASPPAHAPVRAGDRVHVQWN
GWPLGHVGPVLSYLAPCGGLEGSESGCAGVDKRQLRWTKVDDSLPAMEL (SEQ ID NO: 77)
HSHLGYIIINGEVYQGFDPRPEQANSPLRVGWSTGAIDDGFVAPANYSSP
DIICHIEGASPPAHAPVRAGDRVHVQWNGWPLGHVGPVLSYLAPCGGLEG
SESGCAGVDKRQLRWTKVDDSLPAMEL
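By way of non-limiting illustration, the correspondence between a listed polynucleotide sequence and its listed amino acid sequence can be checked by in-frame translation. The Python sketch below assumes the standard genetic code and a reading frame beginning at the first nucleotide. The function name `translate` and the constant names are hypothetical, and only the first row of SEQ ID NO:75 and the first row of SEQ ID NO:76 are used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete in-frame codons; a stop codon is written as '*'.

    Whitespace is removed first, and a trailing partial codon is ignored.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First rows only, for illustration (not the complete sequences).
SEQ75_FIRST_ROW = "ATGCCGCCACCACGACTGAGCACCCTCCTTCCCCTCCTAGCCTTAATAGC"
SEQ76_FIRST_ROW = "MPPPRLSTLLPLLALIAPTALGHSHLGYIIINGEVYQGFDPRPEQANSPL"

peptide = translate(SEQ75_FIRST_ROW)
print(peptide, SEQ76_FIRST_ROW.startswith(peptide))
```

The 50-nucleotide row contains 16 complete codons, which translate to MPPPRLSTLLPLLALI, matching the start of SEQ ID NO:76. A mismatch found when translating a complete sequence may indicate either a transcription error in the listing or a genuine difference, so such a check is a screening aid only.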
[0332] The polynucleotide (SEQ ID NO:78) and amino acid (SEQ ID
NO:79) sequences of an alternative M. thermophila GH61q are
provided below. The signal sequence is shown underlined in SEQ ID
NO:79. SEQ ID NO:80 provides the sequence of this GH61q without the
signal sequence.
TABLE-US-00037 (SEQ ID NO: 78)
ATGCCGCCACCACGACTGAGCACCCTCCTTCCCCTCCTAGCCTTAATAGC
CCCCACCGCCCTGGGGCACTCCCACCTCGGGTACATCATCATCAACGGCG
AGGTATACCAAGGATTCGACCCGCGGCCGGAGCAGGCGAACTCGCCGTTG
CGCGTGGGCTGGTCGACGGGGGCAATCGACGACGGGTTCGTGGCGCCGGC
CAACTACTCGTCGCCCGACATCATCTGCCACATCGAGGGGGCCAGCCCGC
CGGCGCACGCGCCCGTCCGGGCGGGCGACCGGGTGCACGTGCAATGGAAA
CGGCTGGCCGCTCGGACACGTGGGGCCGGTGCTGTCGTACCTGGCGCCCT
GCGGCGGGCTGGAGGGGTCCGAGAGCGGGTGGACGACTCGCTGCCGGCGA
TGGAGCTGGTCGGGGCCGCGGGGGGCGCGGGGGGCGAGGACGACGGCAGC
GGCAGCGACGGCAGCGGCAGCGGCGGCAGCGGACGCGTCGGCGTGCCCGG
GCAGCGCTGGGCCACCGACGTGTTGATCGCGGCCAACAACAGCTGGCAGG
TCGAGATCCCGCGCGGGCTGCGGGACGGGCCGTACGTGCTGCGCCACGAG
ATCGTCGCGCTGCACTACGCGGCCGAGCCCGGCGGCGCGCAGAACTACCC
GCTCTGCGTCAACCTGTGGGTCGAGGGCGGCGACGGCAGCATGGAGCTGG
ACCACTTCGACGCCACCCAGTTCTACCGGCCCGACGACCCGGGCATCCTG
CTCAACGTGACGGCCGGCCTGCGCTCATACGCCGTGCCGGGCCCGACGCT
GGCCGCGGGGGCGACGCCGGTGCCGTACGCGCAGCAGAACATCAGCTCGG
CGAGGGCGGATGGAACCCCCGTGATTGTCACCAGGAGCACGGAGACGGTG
CCCTTCACCGCGGCACCCACGCCAGCCGAGACGGCAGAAGCCAAAGGGGG
GAGGTATGATGACCAAACCCGAACTAAAGACCTAAATGAACGCTTCTTTT
ATAGTAGCCGGCCAGAACAGAAGAGGCTGACAGCGACCTCAAGAAGGGAA
CTAGTTGATCATCGTACCCGGTACCTCTCCGTAGCTGTCTGCGCAGATTT
CGGCGCTCATAAGGCAGCAGAAACCAACCACGAAGCTTTGAGAGGCGGCA
ATAAGCACCATGGCGGTGTTTCAGAG (SEQ ID NO: 79)
MPPPRLSTLLPLLALIAPTALGHSHLGYIIINGEVYQGFDPRPEQANSPL
RVGWSTGAIDDGFVAPANYSSPDIICHIEGASPPAHAPVRAGDRVHVQWK
RLAARTRGAGAVVPGALRRAGGVRERVDDSLPAMELVGAAGGAGGEDDGS
GSDGSGSGGSGRVGVPGQRWATDVLIAANNSWQVEIPRGLRDGPYVLRHE
IVALHYAAEPGGAQNYPLCVNLWVEGGDGSMELDHFDATQFYRPDDPGIL
LNVTAGLRSYAVPGPTLAAGATPVPYAQQNISSARADGTPVIVTRSTETV
PFTAAPTPAETAEAKGGRYDDQTRTKDLNERFFYSSRPEQKRLTATSRRE
LVDHRTRYLSVAVCADFGAHKAAETNHEALRGGNKHHGGVSE (SEQ ID NO: 80)
HSHLGYIIINGEVYQGFDPRPEQANSPLRVGWSTGAIDDGFVAPANYSSP
DIICHIEGASPPAHAPVRAGDRVHVQWKRLAARTRGAGAVVPGALRRAGG
VRERVDDSLPAMELVGAAGGAGGEDDGSGSDGSGSGGSGRVGVPGQRWAT
DVLIAANNSWQVEIPRGLRDGPYVLRHEIVALHYAAEPGGAQNYPLCVNL
WVEGGDGSMELDHFDATQFYRPDDPGILLNVTAGLRSYAVPGPTLAAGAT
PVPYAQQNISSARADGTPVIVTRSTETVPFTAAPTPAETAEAKGGRYDDQ
TRTKDLNERFFYSSRPEQKRLTATSRRELVDHRTRYLSVAVCADFGAHKA
AETNHEALRGGNKHHGGVSE
[0333] The polynucleotide (SEQ ID NO:81) and amino acid (SEQ ID
NO:82) sequences of an M. thermophila GH61r are provided below. The
signal sequence is shown underlined in SEQ ID NO:82. SEQ ID NO:83
provides the sequence of this GH61r without the signal
sequence.
TABLE-US-00038 (SEQ ID NO: 81)
ATGAGGTCGACATTGGCCGGTGCCCTGGCAGCCATCGCTGCTCAGAAAGT
AGCCGGCCACGCCACGTTTCAGCAGCTCTGGCACGGCTCCTCCTGTGTCC
GCCTTCCGGCTAGCAACTCACCCGTCACCAATGTGGGAAGCAGAGACTTC
GTCTGCAACGCTGGCACCCGCCCCGTCAGTGGCAAGTGCCCCGTGAAGGC
TGGCGGCACCGTCACCATCGAGATGCACCAGCAACCCGGCGACCGCAGCT
GCAACAACGAAGCCATCGGAGGGGCGCATTGGGGCCCCGTCCAGGTGTAC
CTGACCAAGGTTCAGGACGCCGCGACGGCCGACGGCTCGACGGGCTGGTT
CAAGATCTTCTCCGACTCGTGGTCCAAGAAGCCCGGGGGCAACTTGGGCG
ACGACGACAACTGGGGCACGCGCGACCTGAACGCCTGCTGCGGGAAGATG GAC (SEQ ID NO: 82)
MRSTLAGALAAIAAQKVAGHATFQQLWHGSSCVRLPASNSPVTNVGSRDF
VCNAGTRPVSGKCPVKAGGTVTIEMHQQPGDRSCNNEAIGGAHWGPVQVY
LTKVQDAATADGSTGWFKIFSDSWSKKPGGNLGDDDNWGTRDLNACCGKM D (SEQ ID NO: 83)
HATFQQLWHGSSCVRLPASNSPVTNVGSRDFVCNAGTRPVSGKCPVKAGG
TVTIEMHQQPGDRSCNNEAIGGAHWGPVQVYLTKVQDAATADGSTGWFKI
FSDSWSKKPGGNLGDDDNWGTRDLNACCGKMD
[0334] The polynucleotide (SEQ ID NO:84) and amino acid (SEQ ID
NO:85) sequences of an alternative M. thermophila GH61r are
provided below. The signal sequence is shown underlined in SEQ ID
NO:85. SEQ ID NO:86 provides the sequence of this GH61r without the
signal sequence.
TABLE-US-00039 (SEQ ID NO: 84)
ATGAGGTCGACATTGGCCGGTGCCCTGGCAGCCATCGCTGCTCAGAAAGT
AGCCGGCCACGCCACGTTTCAGCAGCTCTGGCACGGCTCCTCCTGTGTCC
GCCTTCCGGCTAGCAACTCACCCGTCACCAATGTGGGAAGCAGAGACTTC
GTCTGCAACGCTGGCACCCGCCCCGTCAGTGGCAAGTGCCCCGTGAAGGC
TGGCGGCACCGTCACCATCGAGATGCACCAGCAACCCGGCGACCGCAGCT
GCAACAACGAAGCCATCGGAGGGGCGCATTGGGGCCCCGTCCAGGTGTAC
CTGACCAAGGTTCAGGACGCCGCGACGGCCGACGGCTCGACGGGCTGGTT
CAAGATCTTCTCCGACTCGTGGTCCAAGAAGCCCGGGGGCAACTCGGGCG
ACGACGACAACTGGGGCACGCGCGACCTGAACGCCTGCTGCGGGAAGATG
GACGTGGCCATCCCGGCCGACATCGCGTCGGGCGACTACCTGCTGCGGGC
CGAGGCGCTGGCCCTGCACACGGCCGGACAGGCCGGCGGCGCCCAGTTCT
ACATGAGCTGCTACCAGATGACGGTCGAGGGCGGCTCCGGGACCGCCAAC
CCGCCCACCGTCAAGTTCCCGGGCGCCTACAGCGCCAACGACCCGGGCAT
CCTCGTCAACATCCACGCCCCCCTTTCCAGCTACACCGCGCCCGGCCCGG
CCGTCTACGCGGGCGGCACCATCCGCGAGGCCGGCTCCGCCTGCACCGGC
TGCGCGCAGACCTGCAAGGTCGGGTCGTCCCCGAGCGCCGTTGCCCCCGG
CAGCGGCGCGGGCAACGGCGGCGGGTTCCAACCCCGA (SEQ ID NO: 85)
MRSTLAGALAAIAAQKVAGHATFQQLWHGSSCVRLPASNSPVTNVGSRDF
VCNAGTRPVSGKCPVKAGGTVTIEMHQQPGDRSCNNEAIGGAHWGPVQVY
LTKVQDAATADGSTGWFKIFSDSWSKKPGGNSGDDDNWGTRDLNACCGKM
DVAIPADIASGDYLLRAEALALHTAGQAGGAQFYMSCYQMTVEGGSGTAN
PPTVKFPGAYSANDPGILVNIHAPLSSYTAPGPAVYAGGTIREAGSACTG
CAQTCKVGSSPSAVAPGSGAGNGGGFQPR (SEQ ID NO: 86)
HATFQQLWHGSSCVRLPASNSPVTNVGSRDFVCNAGTRPVSGKCPVKAGG
TVTIEMHQQPGDRSCNNEAIGGAHWGPVQVYLTKVQDAATADGSTGWFKI
FSDSWSKKPGGNSGDDDNWGTRDLNACCGKMDVAIPADIASGDYLLRAEA
LALHTAGQAGGAQFYMSCYQMTVEGGSGTANPPTVKFPGAYSANDPGILV
NIHAPLSSYTAPGPAVYAGGTIREAGSACTGCAQTCKVGSSPSAVAPGSG AGNGGGFQPR
[0335] The polynucleotide (SEQ ID NO:87) and amino acid (SEQ ID
NO:88) sequences of an M. thermophila GH61s are provided below. The
signal sequence is shown underlined in SEQ ID NO:88. SEQ ID NO:89
provides the sequence of this GH61s without the signal
sequence.
TABLE-US-00040 (SEQ ID NO: 87)
ATGCTCCTCCTCACCCTAGCCACACTCGTCACCCTCCTGGCGCGCCACG
TCTCGGCTCACGCCCGGCTGTTCCGCGTCTCTGTCGACGGGAAAGACCA
GGGCGACGGGCTGAACAAGTACATCCGCTCGCCGGCGACCAACGACCCC
GTGCGCGACCTCTCGAGCGCCGCCATCGTGTGCAACACCCAGGGGTCCA
AGGCCGCCCCGGACTTCGTCAGGGCCGCGGCCGGCGACAAGCTGACCTT
CCTCTGGGCGCACGACAACCCGGACGACCCGGTCGACTACGTCCTCGAC
CCGTCCCACAAGGGCGCCATCCTGACCTACGTCGCCGCCTACCCCTCCG
GGGACCCGACCGGCCCCATCTGGAGCAAGCTTGCCGAGGAAGGATTCAC
CGGCGGGCAGTGGGCGACCATCAAGATGATCGACAACGGCGGCAAGGTC
GACGTGACGCTGCCCGAGGCCCTTGCGCCGGGAAAGTACCTGATCCGCC
AGGAGCTGCTGGCCCTGCACCGGGCCGACTTTGCCTGCGACGACCCGGC
CCACCCCAACCGCGGCGCCGAGTCGTACCCCAACTGCGTCCAGGTGGAG
GTGTCGGGCAGCGGCGACAAGAAGCCGGACCAGAACTTTGACTTCAACA
AGGGCTATACCTGCGATAACAAAGGACTCCACTTTAAGATCTACATCGG
TCAGGACAGCCAGTATGTGGCCCCGGGGCCGCGGCCTTGGAATGGGAGC (SEQ ID NO: 88)
MLLLTLATLVTLLARHVSAHARLFRVSVDGKDQGDGLNKYIRSPATNDP
VRDLSSAAIVCNTQGSKAAPDFVRAAAGDKLTFLWAHDNPDDPVDYVLD
PSHKGAILTYVAAYPSGDPTGPIWSKLAEEGFTGGQWATIKMIDNGGKV
DVTLPEALAPGKYLIRQELLALHRADFACDDPAHPNRGAESYPNCVQVE
VSGSGDKKPDQNFDFNKGYTCDNKGLHFKIYIGQDSQYVAPGPRPWNGS (SEQ ID NO: 89)
HARLFRVSVDGKDQGDGLNKYIRSPATNDPVRDLSSAAIVCNTQGSKAA
PDFVRAAAGDKLTFLWAHDNPDDPVDYVLDPSHKGAILTYVAAYPSGDP
TGPIWSKLAEEGFTGGQWATIKMIDNGGKVDVTLPEALAPGKYLIRQEL
LALHRADFACDDPAHPNRGAESYPNCVQVEVSGSGDKKPDQNFDFNKGY
TCDNKGLHFKIYIGQDSQYVAPGPRPWNGS
[0336] The polynucleotide (SEQ ID NO:90) and amino acid (SEQ ID
NO:91) sequences of an M. thermophila GH61t are provided below.
TABLE-US-00041 (SEQ ID NO: 90)
ATGTTCACTTCGCTTTGCATCACAGATCATTGGAGGACTCTTAGCAGCC
ACTCTGGGCCAGTCATGAACTATCTCGCCCATTGCACCAATGACGACTG
CAAGTCTTTCAAGGGCGACAGCGGCAACGTCTGGGTCAAGATCGAGCAG
CTCGCGTACAACCCGTCAGCCAACCCCCCCTGGGCGTCTGACCTCCTCC
GTGAGCACGGTGCCAAGTGGAAGGTGACGATCCCGCCCAGTCTTGTCCC
CGGCGAATATCTGCTGCGGCACGAGATCCTGGGGTTGCACGTCGCAGGA
ACCGTGATGGGCGCCCAGTTCTACCCCGGCTGCACCCAGATCAGGGTCA
CCGAAGGCGGGAGCACGCAGCTGCCCTCGGGTATTGCGCTCCCAGGCGC
TTACGGCCCACAAGACGAGGGTATCTTGGTCGACTTGTGGAGGGTTAAC
CAGGGCCAGGTCAACTACACGGCGCCTGGAGGACCCGTTTGGAGCGAAG
CGTGGGACACCGAGTTTGGCGGGTCCAACACGACCGAGTGCGCCACCAT
GCTCGACGACCTGCTCGACTACATGGCGGCCAACGACGAGTGGATCGGC TGGACGGCCTAG (SEQ ID NO: 91)
MFTSLCITDHWRTLSSHSGPVMNYLAHCTNDDCKSFKGDSGNVWVKIEQ
LAYNPSANPPWASDLLREHGAKWKVTIPPSLVPGEYLLRHEILGLHVAG
TVMGAQFYPGCTQIRVTEGGSTQLPSGIALPGAYGPQDEGILVDLWRVN
QGQVNYTAPGGPVWSEAWDTEFGGSNTTECATMLDDLLDYMAANDEWIG WTA
[0337] The polynucleotide (SEQ ID NO:92) and amino acid (SEQ ID
NO:93) sequences of an alternative M. thermophila GH61t are
provided below.
TABLE-US-00042 (SEQ ID NO: 92)
ATGAACTATCTCGCCCATTGCACCAATGACGACTGCAAGTCTTTCAAGG
GCGACAGCGGCAACGTCTGGGTCAAGATCGAGCAGCTCGCGTACAACCC
GTCAGCCAACCCCCCCTGGGCGTCTGACCTCCTCCGTGAGCACGGTGCC
AAGTGGAAGGTGACGATCCCGCCCAGTCTTGTCCCCGGCGAATATCTGC
TGCGGCACGAGATCCTGGGGTTGCACGTCGCAGGAACCGTGATGGGCGC
CCAGTTCTACCCCGGCTGCACCCAGATCAGGGTCACCGAAGGCGGGAGC
ACGCAGCTGCCCTCGGGTATTGCGCTCCCAGGCGCTTACGGCCCACAAG
ACGAGGGTATCTTGGTCGACTTGTGGAGGGTTAACCAGGGCCAGGTCAA
CTACACGGCGCCTGGAGGACCCGTTTGGAGCGAAGCGTGGGACACCGAG
TTTGGCGGGTCCAACACGACCGAGTGCGCCACCATGCTCGACGACCTGC
TCGACTACATGGCGGCCAACGACGACCCATGCTGCACCGACCAGAACCA
GTTCGGGAGTCTCGAGCCGGGGAGCAAGGCGGCCGGCGGCTCGCCGAGC
CTGTACGATACCGTCTTGGTCCCCGTTCTCCAGAAGAAAGTGCCGACAA
AGCTGCAGTGGAGCGGACCGGCGAGCGTCAACGGGGATGAGTTGACAGA GAGGCCC (SEQ ID NO: 93)
MNYLAHCTNDDCKSFKGDSGNVWVKIEQLAYNPSANPPWASDLLREHGA
KWKVTIPPSLVPGEYLLRHEILGLHVAGTVMGAQFYPGCTQIRVTEGGS
TQLPSGIALPGAYGPQDEGILVDLWRVNQGQVNYTAPGGPVWSEAWDTE
FGGSNTTECATMLDDLLDYMAANDDPCCTDQNQFGSLEPGSKAAGGSPS
LYDTVLVPVLQKKVPTKLQWSGPASVNGDELTERP
[0338] The polynucleotide (SEQ ID NO:94) and amino acid (SEQ ID
NO:95) sequences of an M. thermophila GH61u are provided below. The
signal sequence is shown underlined in SEQ ID NO:95. SEQ ID NO:96
provides the sequence of this GH61u without the signal
sequence.
TABLE-US-00043 (SEQ ID NO: 94)
ATGAAGCTGAGCGCTGCCATCGCCGTGCTCGCGGCCGCCCTTGCCGAGG
GGCACTATACCTTCCCCAGCATCGCCAACACGGCCGACTGGCAATATGT
GCGCATCACGACCAACTTCCAGAGCAACGGCCCCGTGACGGACGTCAAC
TCGGACCAGATCCGGTGCTACGAGCGCAACCCGGGCACCGGCGCCCCCG
GCATCTACAACGTCACGGCCGGCACAACCATCAACTACAACGCCAAGTC
GTCCATCTCCCACCCGGGACCCATGGCCTTCTACATTGCCAAGGTTCCC
GCCGGCCAGTCGGCCGCCACCTGGGACGGTAAGGGCGCCGTCTGGTCCA
AGATCCACCAGGAGATGCCGCACTTTGGCACCAGCCTCACCTGGGACTC
CAACGGCCGCACCTCCATGCCCGTCACCATCCCCCGCTGTCTGCAGGAC
GGCGAGTATCTGCTGCGTGCAGAGCACATTGCCCTCCACAGCGCCGGCA
GCCCCGGCGGCGCCCAGTTCTACATTTCTTGTGCCCAGCTCTCAGTCAC
CGGCGGCAGCGGGACCTGGAACCCCAGGAACAAGGTGTCGTTCCCCGGC
GCCTACAAGGCCACTGACCCGGGCATCCTGATCAACATCTACTACCCCG
TCCCGACTAGCTACACTCCCGCTGGTCCCCCCGTCGACACCTGC (SEQ ID NO: 95)
MKLSAAIAVLAAALAEGHYTFPSIANTADWQYVRITTNFQSNGPVTDVN
SDQIRCYERNPGTGAPGIYNVTAGTTINYNAKSSISHPGPMAFYIAKVP
AGQSAATWDGKGAVWSKIHQEMPHFGTSLTWDSNGRTSMPVTIPRCLQD
GEYLLRAEHIALHSAGSPGGAQFYISCAQLSVTGGSGTWNPRNKVSFPG
AYKATDPGILINIYYPVPTSYTPAGPPVDTC (SEQ ID NO: 96)
HYTFPSIANTADWQYVRITTNFQSNGPVTDVNSDQIRCYERNPGTGAPG
IYNVTAGTTINYNAKSSISHPGPMAFYIAKVPAGQSAATWDGKGAVWSK
IHQEMPHFGTSLTWDSNGRTSMPVTIPRCLQDGEYLLRAEHIALHSAGS
PGGAQFYISCAQLSVTGGSGTWNPRNKVSFPGAYKATDPGILINIYYPV
PTSYTPAGPPVDTC
[0339] The polynucleotide (SEQ ID NO:97) and amino acid (SEQ ID
NO:98) sequences of an M. thermophila GH61v are provided below. The
signal sequence is shown underlined in SEQ ID NO:98. SEQ ID NO:99
provides the sequence of this GH61v without the signal
sequence.
TABLE-US-00044 (SEQ ID NO: 97)
ATGTACCGCACGCTCGGTTCCATTGCCCTGCTCGCGGGGGGCGCTGCCG
CCCACGGCGCCGTGACCAGCTACAACATTGCGGGCAAGGACTACCCTGG
ATACTCGGGCTTCGCCCCTACCGGCCAGGATGTCATCCAGTGGCAATGG
CCCGACTATAACCCCGTGCTGTCCGCCAGCGACCCCAAGCTCCGCTGCA
ACGGCGGCACCGGGGCGGCGCTGTATGCCGAGGCGGCCCCCGGCGACAC
CATCACGGCCACCTGGGCCCAGTGGACGCACTCCCAGGGCCCGATCCTG
GTGTGGATGTACAAGTGCCCCGGCGACTTCAGCTCCTGCGACGGCTCCG
GCGCGGGTTGGTTCAAGATCGACGAGGCCGGCTTCCACGGCGACGGCAC
GACCGTCTTCCTCGACACCGAGACCCCCTCGGGCTGGGACATTGCCAAG
CTGGTCGGCGGCAACAAGTCGTGGAGCAGCAAGATCCCTGACGGCCTCG
CCCCGGGCAATTACCTGGTCCGCCACGAGCTCATCGCCCTGCACCAGGC
CAACAACCCGCAATTCTACCCCGAGTGCGCCCAGATCAAGGTCACCGGC
TCTGGCACCGCCGAGCCCGCCGCCTCCTACAAGGCCGCCATCCCCGGCT
ACTGCCAGCAGAGCGACCCCAACATTTCGTTCAACATCAACGACCACTC
CCTCCCGCAGGAGTACAAGATCCCCGGTCCCCCGGTCTTCAAGGGCACC
GCCTCCGCCAAGGCTCGCGCTTTCCAGGCC (SEQ ID NO: 98)
MYRTLGSIALLAGGAAAHGAVTSYNIAGKDYPGYSGFAPTGQDVIQWQW
PDYNPVLSASDPKLRCNGGTGAALYAEAAPGDTITATWAQWTHSQGPIL
VWMYKCPGDFSSCDGSGAGWFKIDEAGFHGDGTTVFLDTETPSGWDIAK
LVGGNKSWSSKIPDGLAPGNYLVRHELIALHQANNPQFYPECAQIKVTG
SGTAEPAASYKAAIPGYCQQSDPNISFNINDHSLPQEYKIPGPPVFKGT ASAKARAFQA (SEQ ID NO: 99)
AVTSYNIAGKDYPGYSGFAPTGQDVIQWQWPDYNPVLSASDPKLRCNGG
TGAALYAEAAPGDTITATWAQWTHSQGPILVWMYKCPGDFSSCDGSGAG
WFKIDEAGFHGDGTTVFLDTETPSGWDIAKLVGGNKSWSSKIPDGLAPG
NYLVRHELIALHQANNPQFYPECAQIKVTGSGTAEPAASYKAAIPGYCQ
QSDPNISFNINDHSLPQEYKIPGPPVFKGTASAKARAFQA
[0340] The polynucleotide (SEQ ID NO:100) and amino acid (SEQ ID
NO:101) sequences of an M. thermophila GH61w are provided below.
The signal sequence is shown underlined in SEQ ID NO:101. SEQ ID
NO:102 provides the sequence of this GH61w without the signal
sequence.
TABLE-US-00045 (SEQ ID NO: 100)
ATGCTGACAACAACCTTCGCCCTCCTGACGGCCGCTCTCGGCGTCAGCG
CCCATTATACCCTCCCCAGGGTCGGGACCGGTTCCGACTGGCAGCACGT
GCGGCGGGCTGACAACTGGCAAAACAACGGCTTCGTCGGCGACGTCAAC
TCGGAGCAGATCAGGTGCTTCCAGGCGACCCCTGCCGGCGCCCAAGACG
TCTACACTGTTCAGGCGGGATCGACCGTGACCTACCACGCCAACCCCAG
TATCTACCACCCCGGCCCCATGCAGTTCTACCTGGCCCGCGTTCCGGAC
GGACAGGACGTCAAGTCGTGGACCGGCGAGGGTGCCGTGTGGTTCAAGG
TGTACGAGGAGCAGCCTCAATTTGGCGCCCAGCTGACCTGGCCTAGCAA
CGGCAAGAGCTCGTTCGAGGTTCCTATCCCCAGCTGCATTCGGGCGGGC
AACTACCTCCTCCGCGCTGAGCACATCGCCCTGCACGTTGCCCAAAGCC
AGGGCGGCGCCCAGTTCTACATCTCGTGCGCCCAGCTCCAGGTCACTGG
TGGCGGCAGCACCGAGCCTTCTCAGAAGGTTTCCTTCCCGGGTGCCTAC
AAGTCCACCGACCCCGGCATTCTTATCAACATCAACTACCCCGTCCCTA
CCTCGTACCAGAATCCGGGTCCGGCTGTCTTCCGTTGC (SEQ ID NO: 101)
MLTTTFALLTAALGVSAHYTLPRVGTGSDWQHVRRADNWQNNGFVGDVN
SEQIRCFQATPAGAQDVYTVQAGSTVTYHANPSIYHPGPMQFYLARVPD
GQDVKSWTGEGAVWFKVYEEQPQFGAQLTWPSNGKSSFEVPIPSCIRAG
NYLLRAEHIALHVAQSQGGAQFYISCAQLQVTGGGSTEPSQKVSFPGAY
KSTDPGILININYPVPTSYQNPGPAVFRC (SEQ ID NO: 102)
HYTLPRVGTGSDWQHVRRADNWQNNGFVGDVNSEQIRCFQATPAGAQDV
YTVQAGSTVTYHANPSIYHPGPMQFYLARVPDGQDVKSWTGEGAVWFKV
YEEQPQFGAQLTWPSNGKSSFEVPIPSCIRAGNYLLRAEHIALHVAQSQ
GGAQFYISCAQLQVTGGGSTEPSQKVSFPGAYKSTDPGILININYPVPT SYQNPGPAVFRC
[0341] The polynucleotide (SEQ ID NO:103) and amino acid (SEQ ID
NO:104) sequences of an M. thermophila GH61x are provided below. The
signal sequence is shown underlined in SEQ ID NO:104. SEQ ID NO:105
provides the sequence of this GH61x without the signal
sequence.
TABLE-US-00046 (SEQ ID NO: 103)
ATGAAGGTTCTCGCGCCCCTGATTCTGGCCGGTGCCGCCAGCGCCCACA
CCATCTTCTCATCCCTCGAGGTGGGCGGCGTCAACCAGGGCATCGGGCA
GGGTGTCCGCGTGCCGTCGTACAACGGTCCGATCGAGGACGTGACGTCC
AACTCGATCGCCTGCAACGGGCCCCCCAACCCGACGACGCCGACCAACA
AGGTCATCACGGTCCGGGCCGGCGAGACGGTGACGGCCGTCTGGCGGTA
CATGCTGAGCACCACCGGCTCGGCCCCCAACGACATCATGGACAGCAGC
CACAAGGGCCCGACCATGGCCTACCTCAAGAAGGTCGACAACGCCACCA
CCGACTCGGGCGTCGGCGGCGGCTGGTTCAAGATCCAGGAGGACGGCCT
TACCAACGGCGTCTGGGGCACCGAGCGCGTCATCAACGGCCAGGGCCGC
CACAACATCAAGATCCCCGAGTGCATCGCCCCCGGCCAGTACCTCCTCC
GCGCCGAGATGCTTGCCCTGCACGGAGCTTCCAACTACCCCGGCGCTCA
GTTCTACATGGAGTGCGCCCAGCTCAATATCGTCGGCGGCACCGGCAGC
AAGACGCCGTCCACCGTCAGCTTCCCGGGCGCTTACAAGGGTACCGACC
CCGGAGTCAAGATCAACATCTACTGGCCCCCCGTCACCAGCTACCAGAT
TCCCGGCCCCGGCGTGTTCACCTGC (SEQ ID NO: 104)
MKVLAPLILAGAASAHTIFSSLEVGGVNQGIGQGVRVPSYNGPIEDVTS
NSIACNGPPNPTTPTNKVITVRAGETVTAVWRYMLSTTGSAPNDIMDSS
HKGPTMAYLKKVDNATTDSGVGGGWFKIQEDGLTNGVWGTERVINGQGR
HNIKIPECIAPGQYLLRAEMLALHGASNYPGAQFYMECAQLNIVGGTGS
KTPSTVSFPGAYKGTDPGVKINIYWPPVTSYQIPGPGVFTC (SEQ ID NO: 105)
HTIFSSLEVGGVNQGIGQGVRVPSYNGPIEDVTSNSIACNGPPNPTTPT
NKVITVRAGETVTAVWRYMLSTTGSAPNDIMDSSHKGPTMAYLKKVDNA
TTDSGVGGGWFKIQEDGLTNGVWGTERVINGQGRHNIKIPECIAPGQYL
LRAEMLALHGASNYPGAQFYMECAQLNIVGGTGSKTPSTVSFPGAYKGT
DPGVKINIYWPPVTSYQIPGPGVFTC
[0342] The polynucleotide (SEQ ID NO:106) and amino acid (SEQ ID
NO:107) sequences of an M. thermophila GH61y are provided below.
The signal sequence is underlined in SEQ ID NO:107. SEQ ID NO:108
provides the sequence of GH61y, without the signal sequence.
TABLE-US-00047 (SEQ ID NO: 106)
ATGATCGACAACCTCCCTGATGACTCCCTACAACCCGCCTGCCTCCGCC
CGGGCCACTACCTCGTCCGCCACGAGATCATCGCGCTGCACTCGGCCTG
GGCCGAGGGCGAGGCCCAGTTCTACCCCTTCCCCCTTTTTCCTTTTTTT
CCCTCCCTTCTTTTGTCCGGTAACTACACGATTCCCGGTCCCGCGATCT
GGAAGTGCCCAGAGGCACAGCAGAACGAG (SEQ ID NO: 107)
MIDNLPDDSLQPACLRPGHYLVRHEIIALHSAWAEGEAQFYPFPLFPFF
PSLLLSGNYTIPGPAIWKCPEAQQNE (SEQ ID NO: 108)
HYLVRHEIIALHSAWAEGEAQFYPFPLFPFFPSLLLSGNYTIPGPAIWK CPEAQQNE
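By way of non-limiting illustration, the sequence tables herein can be separated into individual sequences by reading each "(SEQ ID NO: n)" label as introducing the sequence that follows it. The Python sketch below makes that assumption and also assumes that whitespace within a sequence carries no meaning. The function name `split_by_label` is hypothetical, and the sample text is assembled from truncated fragments of SEQ ID NOS:106-108, with one label deliberately broken across a line to show that such breaks are tolerated.

```python
import re

# Matches "(SEQ ID NO: 107)" even when whitespace or a line break
# falls inside the label.
LABEL = re.compile(r"\(SEQ\s+ID\s+NO:\s*(\d+)\)")


def split_by_label(text: str) -> dict[int, str]:
    """Map each SEQ ID number to the whitespace-free text that follows its label."""
    parts = LABEL.split(text)
    # parts = [text before first label, id, body, id, body, ...]
    records = {}
    for seq_id, body in zip(parts[1::2], parts[2::2]):
        records[int(seq_id)] = re.sub(r"\s+", "", body)
    return records


# Truncated fragments, for illustration only (not the complete sequences).
SAMPLE = (
    "TABLE-US-00047 (SEQ ID NO: 106)\n"
    "ATGATCGACAACCTCCCTGATGACTCC (SEQ ID NO:\n"
    "107) MIDNLPDDS (SEQ ID NO: 108)\n"
    "HYLVRHEII\n"
)

records = split_by_label(SAMPLE)
print(records)
```

This sketch does not detect where a table ends, so any paragraph text following the final sequence would need to be removed before the text is passed to the function.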
[0343] Additional enzymes (i.e., non-GH61 enzymes) that find use in
the present invention include, but are not limited to, the following
enzymes.
[0344] Wild-type EG1b cDNA (SEQ ID NO:109) and amino acid (SEQ ID
NO:110) sequences are provided below. The signal sequence is
underlined in SEQ ID NO:110. SEQ ID NO:111 provides the sequence of
EG1b, without the signal sequence.
TABLE-US-00048 (SEQ ID NO: 109)
ATGGGGCAGAAGACTCTCCAGGGGCTGGTGGCGGCGGCGGCACTGGCAG
CCTCGGTGGCGAACGCGCAGCAACCGGGCACCTTCACGCCCGAGGTGCA
TCCGACGCTGCCGACGTGGAAGTGCACGACGAGCGGCGGGTGCGTCCAG
CAGGACACGTCGGTGGTGCTCGACTGGAACTACCGCTGGTTCCACACCG
AGGACGGTAGCAAGTCGTGCATCACCTCTAGCGGCGTCGACCGGACCCT
GTGCCCGGACGAGGCGACGTGCGCCAAGAACTGCTTCGTCGAGGGCGTC
AACTACACGAGCAGCGGGGTCGAGACGTCCGGCAGCTCCCTCACCCTCC
GCCAGTTCTTCAAGGGCTCCGACGGCGCCATCAACAGCGTCTCCCCGCG
CGTCTACCTGCTCGGGGGAGACGGCAACTATGTCGTGCTCAAGCTCCTC
GGCCAGGAGCTGAGCTTCGACGTGGACGTATCGTCGCTCCCGTGCGGCG
AGAACGCGGCCCTGTACCTGTCCGAGATGGACGCGACGGGAGGACGGAA
CGAGTACAACACGGGCGGGGCCGAGTACGGGTCGGGCTACTGTGACGCC
CAGTGCCCCGTGCAGAACTGGAACAACGGGACGCTCAACACGGGCCGGG
TGGGCTCGTGCTGCAACGAGATGGACATCCTCGAGGCCAACTCCAAGGC
CGAGGCCTTCACGCCGCACCCCTGCATCGGCAACTCGTGCGACAAGAGC
GGGTGCGGCTTCAACGCGTACGCGCGCGGTTACCACAACTACTGGGCCC
CCGGCGGCACGCTCGACACGTCCCGGCCTTTCACCATGATCACCCGCTT
CGTCACCGACGACGGCACCACCTCGGGCAAGCTCGCCCGCATCGAGCGC
GTCTACGTCCAGGACGGCAAGAAGGTGCCCAGCGCGGCGCCCGGGGGGG
ACGTCATCACGGCCGACGGGTGCACCTCCGCGCAGCCCTACGGCGGCCT
TTCCGGCATGGGCGACGCCCTCGGCCGCGGCATGGTCCTGGCCCTGAGC
ATCTGGAACGACGCGTCCGGGTACATGAACTGGCTCGACGCCGGCAGCA
ACGGCCCCTGCAGCGACACCGAGGGTAACCCGTCCAACATCCTGGCCAA
CCACCCGGACGCCCACGTCGTGCTCTCCAACATCCGCTGGGGCGACATC
GGCTCCACCGTCGACACCGGCGATGGCGACAACAACGGCGGCGGCCCCA
ACCCGTCATCCACCACCACCGCTACCGCTACCACCACCTCCTCCGGCCC
GGCCGAGCCTACCCAGACCCACTACGGCCAGTGTGGAGGGAAAGGATGG
ACGGGCCCTACCCGCTGCGAGACGCCCTACACCTGCAAGTACCAGAACG
ACTGGTACTCGCAGTGCCTGTAG (SEQ ID NO: 110)
MGQKTLQGLVAAAALAASVANAQQPGTFTPEVHPTLPTWKCTTSGGCVQ
QDTSVVLDWNYRWFHTEDGSKSCITSSGVDRTLCPDEATCAKNCFVEGV
NYTSSGVETSGSSLTLRQFFKGSDGAINSVSPRVYLLGGDGNYVVLKLL
GQELSFDVDVSSLPCGENAALYLSEMDATGGRNEYNTGGAEYGSGYCDA
QCPVQNWNNGTLNTGRVGSCCNEMDILEANSKAEAFTPHPCIGNSCDKS
GCGFNAYARGYHNYWAPGGTLDTSRPFTMITRFVTDDGTTSGKLARIER
VYVQDGKKVPSAAPGGDVITADGCTSAQPYGGLSGMGDALGRGMVLALS
IWNDASGYMNWLDAGSNGPCSDTEGNPSNILANHPDAHVVLSNIRWGDI
GSTVDTGDGDNNGGGPNPSSTTTATATTTSSGPAEPTQTHYGQCGGKGW
TGPTRCETPYTCKYQNDWYSQCL (SEQ ID NO: 111)
QQPGTFTPEVHPTLPTWKCTTSGGCVQQDTSVVLDWNYRWFHTEDGSKS
CITSSGVDRTLCPDEATCAKNCFVEGVNYTSSGVETSGSSLTLRQFFKG
SDGAINSVSPRVYLLGGDGNYVVLKLLGQELSFDVDVSSLPCGENAALY
LSEMDATGGRNEYNTGGAEYGSGYCDAQCPVQNWNNGTLNTGRVGSCCN
EMDILEANSKAEAFTPHPCIGNSCDKSGCGFNAYARGYHNYWAPGGTLD
TSRPFTMITRFVTDDGTTSGKLARIERVYVQDGKKVPSAAPGGDVITAD
GCTSAQPYGGLSGMGDALGRGMVLALSIWNDASGYMNWLDAGSNGPCSD
TEGNPSNILANHPDAHVVLSNIRWGDIGSTVDTGDGDNNGGGPNPSSTT
TATATTTSSGPAEPTQTHYGQCGGKGWTGPTRCETPYTCKYQNDWYSQC L
[0345] Wild-type M. thermophila EG2 polynucleotide (SEQ ID NO:112)
and amino acid (SEQ ID NO:113) sequences are provided below. The
signal sequence is underlined in SEQ ID NO:113. SEQ ID NO:114
provides the sequence of EG2, without the signal sequence.
TABLE-US-00049 (SEQ ID NO: 112)
ATGAAGTCCTCCATCCTCGCCAGCGTCTTCGCCACGGGCGCCGTGGCTC
AAAGTGGTCCGTGGCAGCAATGTGGTGGCATCGGATGGCAAGGATCGAC
CGACTGTGTGTCGGGTTACCACTGCGTCTACCAGAACGATTGGTACAGC
CAGTGCGTGCCTGGCGCGGCGTCGACAACGCTCCAGACATCTACCACGT
CCAGGCCCACCGCCACCAGCACCGCCCCTCCGTCGTCCACCACCTCGCC
TAGCAAGGGCAAGCTCAAGTGGCTCGGCAGCAACGAGTCGGGCGCCGAG
TTCGGGGAGGGCAACTACCCCGGCCTCTGGGGCAAGCACTTCATCTTCC
CGTCGACTTCGGCGATTCAGACGCTCATCAATGATGGATACAACATCTT
CCGGATCGACTTCTCGATGGAGCGTCTGGTGCCCAACCAGTTGACGTCG
TCCTTCGACGAGGGCTACCTCCGCAACCTGACCGAGGTGGTCAACTTCG
TGACGAACGCGGGCAAGTACGCCGTCCTGGACCCGCACAACTACGGCCG
GTACTACGGCAACGTCATCACGGACACGAACGCGTTCCGGACCTTCTGG
ACCAACCTGGCCAAGCAGTTCGCCTCCAACTCGCTCGTCATCTTCGACA
CCAACAACGAGTACAACACGATGGACCAGACCCTGGTGCTCAACCTCAA
CCAGGCCGCCATCGACGGCATCCGGGCCGCCGGCGCGACCTCGCAGTAC
ATCTTCGTCGAGGGCAACGCGTGGAGCGGGGCCTGGAGCTGGAACACGA
CCAACACCAACATGGCCGCCCTGACGGACCCGCAGAACAAGATCGTGTA
CGAGATGCACCAGTACCTCGACTCGGACAGCTCGGGCACCCACGCCGAG
TGCGTCAGCAGCAACATCGGCGCCCAGCGCGTCGTCGGAGCCACCCAGT
GGCTCCGCGCCAACGGCAAGCTCGGCGTCCTCGGCGAGTTCGCCGGCGG
CGCCAACGCCGTCTGCCAGCAGGCCGTCACCGGCCTCCTCGACCACCTC
CAGGACAACAGCGACGTCTGGCTGGGTGCCCTCTGGTGGGCCGCCGGTC
CCTGGTGGGGCGACTACATGTACTCGTTCGAGCCTCCTTCGGGCACCGG
CTATGTCAACTACAACTCGATCCTAAAGAAGTACTTGCCGTAA (SEQ ID NO: 113)
MKSSILASVFATGAVAQSGPWQQCGGIGWQGSTDCVSGYHCVYQNDWYS
QCVPGAASTTLQTSTTSRPTATSTAPPSSTTSPSKGKLKWLGSNESGAE
FGEGNYPGLWGKHFIFPSTSAIQTLINDGYNIFRIDFSMERLVPNQLTS
SFDEGYLRNLTEVVNFVTNAGKYAVLDPHNYGRYYGNVITDTNAFRTFW
TNLAKQFASNSLVIFDTNNEYNTMDQTLVLNLNQAAIDGIRAAGATSQY
IFVEGNAWSGAWSWNTTNTNMAALTDPQNKIVYEMHQYLDSDSSGTHAE
CVSSNIGAQRVVGATQWLRANGKLGVLGEFAGGANAVCQQAVTGLLDHL
QDNSEVWLGALWWAAGPWWGDYMYSFEPPSGTGYVNYNSILKKYLP (SEQ ID NO: 114)
QSGPWQQCGGIGWQGSTDCVSGYHCVYQNDWYSQCVPGAASTTLQTSTT
SRPTATSTAPPSSTTSPSKGKLKWLGSNESGAEFGEGNYPGLWGKHFIF
PSTSAIQTLINDGYNIFRIDFSMERLVPNQLTSSFDEGYLRNLTEVVNF
VTNAGKYAVLDPHNYGRYYGNVITDTNAFRTFWTNLAKQFASNSLVIFD
TNNEYNTMDQTLVLNLNQAAIDGIRAAGATSQYIFVEGNAWSGAWSWNT
TNTNMAALTDPQNKIVYEMHQYLDSDSSGTHAECVSSNIGAQRVVGATQ
WLRANGKLGVLGEFAGGANAVCQQAVTGLLDHLQDNSEVWLGALWWAAG
PWWGDYMYSFEPPSGTGYVNYNSILKKYLP
[0346] The polynucleotide (SEQ ID NO:115) and amino acid (SEQ ID
NO:116) sequences of a wild-type BGL are provided below. The signal
sequence is underlined in SEQ ID NO:116. SEQ ID NO:117 provides the
polypeptide sequence without the signal sequence.
TABLE-US-00050 (SEQ ID NO: 115)
ATGAAGGCTGCTGCGCTTTCCTGCCTCTTCGGCAGTACCCTTGCCGTTGCAGGCGCCATTGAATCGAGAAAGGT-
TCA
CCAGAAGCCCCTCGCGAGATCTGAACCTTTTTACCCGTCGCCATGGATGAATCCCAACGCCGACGGCTGGGCGG-
AGG
CCTATGCCCAGGCCAAGTCCTTTGTCTCCCAAATGACTCTGCTAGAGAAGGTCAACTTGACCACGGGAGTCGGC-
TGG
GGGGCTGAGCAGTGCGTCGGCCAAGTGGGCGCGATCCCTCGCCTTGGACTTCGCAGTCTGTGCATGCATGACTC-
CCC
TCTCGGCATCCGAGGAGCCGACTACAACTCAGCGTTCCCCTCTGGCCAGACCGTTGCTGCTACCTGGGATCGCG-
GTC
TGATGTACCGTCGCGGCTACGCAATGGGCCAGGAGGCCAAAGGCAAGGGCATCAATGTCCTTCTCGGACCAGTC-
GCC
GGCCCCCTTGGCCGCATGCCCGAGGGCGGTCGTAACTGGGAAGGCTTCGCTCCGGATCCCGTCCTTACCGGCAT-
CGG
CATGTCCGAGACGATCAAGGGCATTCAGGATGCTGGCGTCATCGCTTGTGCGAAGCACTTTATTGGAAACGAGC-
AGG
AGCACTTCAGACAGGTGCCAGAAGCCCAGGGATACGGTTACAACATCAGCGAAACCCTCTCCTCCAACATTGAC-
GAC
AAGACCATGCACGAGCTCTACCTTTGGCCGTTTGCCGATGCCGTCCGGGCCGGCGTCGGCTCTGTCATGTGCTC-
GTA
CCAGCAGGTCAACAACTCGTACGCCTGCCAGAACTCGAAGCTGCTGAACGACCTCCTCAAGAACGAGCTTGGGT-
TTC
AGGGCTTCGTCATGAGCGACTGGCAGGCACAGCACACTGGCGCAGCAAGCGCCGTGGCTGGTCTCGATATGTCC-
ATG
CCGGGCGACACCCAGTTCAACACTGGCGTCAGTTTCTGGGGCGCCAATCTCACCCTCGCCGTCCTCAACGGCAC-
AGT
CCCTGCCTACCGTCTCGACGACATGGCCATGCGCATCATGGCCGCCCTCTTCAAGGTCACCAAGACCACCGACC-
TGG
AACCGATCAACTTCTCCTTCTGGACCGACGACACTTATGGCCCGATCCACTGGGCCGCCAAGCAGGGCTACCAG-
GAG
ATTAATTCCCACGTTGACGTCCGCGCCGACCACGGCAACCTCATCCGGGAGATTGCCGCCAAGGGTACGGTGCT-
GCT
GAAGAATACCGGCTCTCTACCCCTGAACAAGCCAAAGTTCGTGGCCGTCATCGGCGAGGATGCTGGGTCGAGCC-
CCA
ACGGGCCCAACGGCTGCAGCGACCGCGGCTGTAACGAAGGCACGCTCGCCATGGGCTGGGGATCCGGCACAGCC-
AAC
TATCCGTACCTCGTTTCCCCCGACGCCGCGCTCCAGGCCCGGGCCATCCAGGACGGCACGAGGTACGAGAGCGT-
CCT
GTCCAACTACGCCGAGGAAAAGACAAAGGCTCTGGTCTCGCAGGCCAATGCAACCGCCATCGTCTTCGTCAATG-
CCG
ACTCAGGCGAGGGCTACATCAACGTGGACGGTAACGAGGGCGACCGTAAGAACCTGACTCTCTGGAACAACGGT-
GAT
ACTCTGGTCAAGAACGTCTCGAGCTGGTGCAGCAACACCATCGTCGTCATCCACTCGGTCGGCCCGGTCCTCCT-
GAC
CGATTGGTACGACAACCCCAACATCACGGCCATTCTCTGGGCTGGTCTTCCGGGCCAGGAGTCGGGCAACTCCA-
TCA
CCGACGTGCTTTACGGCAAGGTCAACCCCGCCGCCCGCTCGCCCTTCACTTGGGGCAAGACCCGCGAAAGCTAT-
GGC
GCGGACGTCCTGTACAAGCCGAATAATGGCAATGGTGCGCCCCAACAGGACTTCACCGAGGGCGTCTTCATCGA-
CTA
CCGCTACTTCGACAAGGTTGACGATGACTCGGTCATCTACGAGTTCGGCCACGGCCTGAGCTACACCACCTTCG-
AGT
ACAGCAACATCCGCGTCGTCAAGTCCAACGTCAGCGAGTACCGGCCCACGACGGGCACCACGGCCCAGGCCCCG-
ACG
TTTGGCAACTTCTCCACCGACCTCGAGGACTATCTCTTCCCCAAGGACGAGTTCCCCTACATCTACCAGTACAT-
CTA
CCCGTACCTCAACACGACCGACCCCCGGAGGGCCTCGGCCGATCCCCACTACGGCCAGACCGCCGAGGAGTTCC-
TCC
CGCCCCACGCCACCGATGACGACCCCCAGCCGCTCCTCCGGTCCTCGGGCGGAAACTCCCCCGGCGGCAACCGC-
CAG
CTGTACGACATTGTCTACACAATCACGGCCGACATCACGAATACGGGCTCCGTTGTAGGCGAGGAGGTACCGCA-
GCT
CTACGTCTCGCTGGGCGGTCCCGAGGATCCCAAGGTGCAGCTGCGCGACTTTGACAGGATGCGGATCGAACCCG-
GCG
AGACGAGGCAGTTCACCGGCCGCCTGACGCGCAGAGATCTGAGCAACTGGGACGTCACGGTGCAGGACTGGGTC-
ATC
AGCAGGTATCCCAAGACGGCATATGTTGGGAGGAGCAGCCGGAAGTTGGATCTCAAGATTGAGCTTCCTTGA
(SEQ ID NO: 116)
MKAAALSCLFGSTLAVAGAIESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTG-
VGW
GAEQCVGQVGAIPRLGLRSLCMHDSPLGIRGADYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLG-
PVA
GPLGRMPEGGRNWEGFAPDPVLTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSN-
IDD
KTMHELYLWPFADAVRAGVGSVMCSYQQVNNSYACQNSKLLNDLLKNELGFQGFVMSDWQAQHTGAASAVAGLD-
MSM
PGDTQFNTGVSFWGANLTLAVLNGTVPAYRLDDMAMRIMAALFKVTKTTDLEPINFSFWTDDTYGPIHWAAKQG-
YQE
INSHVDVRADHGNLIREIAAKGTVLLKNTGSLPLNKPKFVAVIGEDAGSSPNGPNGCSDRGCNEGTLAMGWGSG-
TAN
YPYLVSPDAALQARAIQDGTRYESVLSNYAEEKTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWN-
NGD
TLVKNVSSWCSNTIVVIHSVGPVLLTDWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRE-
SYG
ADVLYKPNNGNGAPQQDFTEGVFIDYRYFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTAQ-
APT
FGNFSTDLEDYLFPKDEFPYIYQYIYPYLNTTDPRRASADPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGG-
NRQ
LYDIVYTITADITNTGSVVGEEVPQLYVSLGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQD-
WVI SRYPKTAYVGRSSRKLDLKIELP (SEQ ID NO: 117)
IESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTGVGWGAEQCVGQVGAIPRLG-
LRS
LCMHDSPLGIRGADYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLGPVAGPLGRMPEGGRNWEGF-
APD
PVLTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSNIDDKTMHELYLWPFADAVR-
AGV
GSVMCSYQQVNNSYACQNSKLLNDLLKNELGFQGFVMSDWQAQHTGAASAVAGLDMSMPGDTQFNTGVSFWGAN-
LTL
AVLNGTVPAYRLDDMAMRIMAALFKVTKTTDLEPINFSFWTDDTYGPIHWAAKQGYQEINSHVDVRADHGNLIR-
EIA
AKGTVLLKNTGSLPLNKPKFVAVIGEDAGSSPNGPNGCSDRGCNEGTLAMGWGSGTANYPYLVSPDAALQARAI-
QDG
TRYESVLSNYAEEKTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWNNGDTLVKNVSSWCSNTIVV-
IHS
VGPVLLTDWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRESYGADVLYKPNNGNGAPQQ-
DFT
EGVFIDYRYFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTAQAPTFGNFSTDLEDYLFPKD-
EFP
YIYQYIYPYLNTTDPRRASADPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGGNRQLYDIVYTITADITNTG-
SVV
GEEVPQLYVSLGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQDWVISRYPKTAYVGRSSRKL-
DLK IELP
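By way of non-limiting illustration, rows of the foregoing table that end in a hyphen are continued on the following line, and can be rejoined before further analysis. The Python sketch below assumes that a trailing "-" serves only as a line-continuation marker and never represents a residue or an alignment gap. The function name `join_wrapped` is hypothetical, and the sample consists of the first two wrapped rows of SEQ ID NO:116.

```python
def join_wrapped(lines):
    """Rejoin rows that were wrapped with a trailing '-' continuation marker."""
    rows, pending = [], ""
    for line in lines:
        text = line.strip()
        if text.endswith("-"):
            # Drop the marker and hold the row until its continuation arrives.
            pending += text[:-1]
        else:
            rows.append(pending + text)
            pending = ""
    if pending:
        # A final row whose continuation is missing is kept as-is.
        rows.append(pending)
    return rows


# First two wrapped rows of SEQ ID NO:116, for illustration only.
WRAPPED = [
    "MKAAALSCLFGSTLAVAGAIESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTG-",
    "VGW",
    "GAEQCVGQVGAIPRLGLRSLCMHDSPLGIRGADYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLG-",
    "PVA",
]

rows = join_wrapped(WRAPPED)
for row in rows:
    print(row)
```

The rejoined rows can then be concatenated to give the complete sequence. Internal spaces in a final row, where present, would still need to be removed separately.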
[0347] The polynucleotide (SEQ ID NO:118) and amino acid (SEQ ID
NO:119) sequences of a BGL variant ("Variant 883") are provided
below. The signal sequence is underlined in SEQ ID NO:119. SEQ ID
NO:120 provides the sequence of this BGL variant, without the
signal sequence.
TABLE-US-00051 (SEQ ID NO: 118)
ATGAAGGCTGCTGCGCTTTCCTGCCTCTTCGGCAGTACCCTTGCCGTTGCAGGCGCCATTGAATCGAGAAAGGT-
TCAC
CAGAAGCCCCTCGCGAGATCTGAACCTTTTTACCCGTCGCCATGGATGAATCCCAACGCCGACGGCTGGGCGGA-
GGCC
TATGCCCAGGCCAAGTCCTTTGTCTCCCAAATGACTCTGCTAGAGAAGGTCAACTTGACCACGGGAGTCGGCTG-
GGGG
GCTGAGCAGTGCGTCGGCCAAGTGGGCGCGATCCCTCGCCTTGGACTTCGCAGTCTGTGCATGCATGACTCCCC-
TCTC
GGCATCCGAGGAGCCGACTACAACTCAGCGTTCCCCTCTGGCCAGACCGTTGCTGCTACCTGGGATCGCGGTCT-
GATG
TACCGTCGCGGCTACGCAATGGGCCAGGAGGCCAAAGGCAAGGGCATCAATGTCCTTCTCGGACCAGTCGCCGG-
CCCC
CTTGGCCGCATGCCCGAGGGCGGTCGTAACTGGGAAGGCTTCGCTCCGGATCCCGTCCTTACCGGCATCGGCAT-
GTCC
GAGACGATCAAGGGCATTCAGGATGCTGGCGTCATCGCTTGTGCGAAGCACTTTATTGGAAACGAGCAGGAGCA-
CTTC
AGACAGGTGCCAGAAGCCCAGGGATACGGTTACAACATCAGCGAAACCCTCTCCTCCAACATTGACGACAAGAC-
CATG
CACGAGCTCTACCTTTGGCCGTTTGCCGATGCCGTCCGGGCCGGCGTCGGCTCTGTCATGTGCTCGTACAACCA-
GGTC
AACAACTCGTACGCCTGCCAGAACTCGAAGCTGCTGAACGACCTCCTCAAGAACGAGCTTGGGTTTCAGGGCTT-
CGTC
ATGAGCGACTGGTGGGCACAGCACACTGGCGCAGCAAGCGCCGTGGCTGGTCTCGATATGTCCATGCCGGGCGA-
CACC
ATGTTCAACACTGGCGTCAGTTTCTGGGGCGCCAATCTCACCCTCGCCGTCCTCAACGGCACAGTCCCTGCCTA-
CCGT
CTCGACGACATGGCCATGCGCATCATGGCCGCCCTCTTCAAGGTCACCAAGACCACCGACCTGGAACCGATCAA-
CTTC
TCCTTCTGGACCCGCGACACTTATGGCCCGATCCACTGGGCCGCCAAGCAGGGCTACCAGGAGATTAATTCCCA-
CGTT
GACGTCCGCGCCGACCACGGCAACCTCATCCGGAACATTGCCGCCAAGGGTACGGTGCTGCTGAAGAATACCGG-
CTCT
CTACCCCTGAACAAGCCAAAGTTCGTGGCCGTCATCGGCGAGGATGCTGGGCCGAGCCCCAACGGGCCCAACGG-
CTGC
AGCGACCGCGGCTGTAACGAAGGCACGCTCGCCATGGGCTGGGGATCCGGCACAGCCAACTATCCGTACCTCGT-
TTCC
CCCGACGCCGCGCTCCAGTTGCGGGCCATCCAGGACGGCACGAGGTACGAGAGCGTCCTGTCCAACTACGCCGA-
GGAA
AATACAAAGGCTCTGGTCTCGCAGGCCAATGCAACCGCCATCGTCTTCGTCAATGCCGACTCAGGCGAGGGCTA-
CATC
AACGTGGACGGTAACGAGGGCGACCGTAAGAACCTGACTCTCTGGAACAACGGTGATACTCTGGTCAAGAACGT-
CTCG
AGCTGGTGCAGCAACACCATCGTCGTCATCCACTCGGTCGGCCCGGTCCTCCTGACCGATTGGTACGACAACCC-
CAAC
ATCACGGCCATTCTCTGGGCTGGTCTTCCGGGCCAGGAGTCGGGCAACTCCATCACCGACGTGCTTTACGGCAA-
GGTC
AACCCCGCCGCCCGCTCGCCCTTCACTTGGGGCAAGACCCGCGAAAGCTATGGCGCGGACGTCCTGTACAAGCC-
GAAT
AATGGCAATTGGGCGCCCCAACAGGACTTCACCGAGGGCGTCTTCATCGACTACCGCTACTTCGACAAGGTTGA-
CGAT
GACTCGGTCATCTACGAGTTCGGCCACGGCCTGAGCTACACCACCTTCGAGTACAGCAACATCCGCGTCGTCAA-
GTCC
AACGTCAGCGAGTACCGGCCCACGACGGGCACCACGATTCAGGCCCCGACGTTTGGCAACTTCTCCACCGACCT-
CGAG
GACTATCTCTTCCCCAAGGACGAGTTCCCCTACATCCCGCAGTACATCTACCCGTACCTCAACACGACCGACCC-
CCGG
AGGGCCTCGGCCGATCCCCACTACGGCCAGACCGCCGAGGAGTTCCTCCCGCCCCACGCCACCGATGACGACCC-
CCAG
CCGCTCCTCCGGTCCTCGGGCGGAAACTCCCCCGGCGGCAACCGCCAGCTGTACGACATTGTCTACACAATCAC-
GGCC
GACATCACGAATACGGGCTCCGTTGTAGGCGAGGAGGTACCGCAGCTCTACGTCTCGCTGGGCGGTCCCGAGGA-
TCCC
AAGGTGCAGCTGCGCGACTTTGACAGGATGCGGATCGAACCCGGCGAGACGAGGCAGTTCACCGGCCGCCTGAC-
GCGC
AGAGATCTGAGCAACTGGGACGTCACGGTGCAGGACTGGGTCATCAGCAGGTATCCCAAGACGGCATATGTTGG-
GAGG AGCAGCCGGAAGTTGGATCTCAAGATTGAGCTTCCTTGA (SEQ ID NO: 119)
MKAAALSCLFGSTLAVAGAIESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTG-
VGWG
AEQCVGQVGAIPRLGLRSLCMHDSPLGIRGADYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLGP-
VAGP
LGRMPEGGRNWEGFAPDPVLTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSNID-
DKTM
HELYLWPFADAVRAGVGSVMCSYNQVNNSYACQNSKLLNDLLKNELGFQGFVMSDWWAQHTGAASAVAGLDMSM-
PGDT
MFNTGVSFWGANLTLAVLNGTVPAYRLDDMAMRIMAALFKVTKTTDLEPINFSFWTRDTYGPIHWAAKQGYQEI-
NSHV
DVRADHGNLIRNIAAKGTVLLKNTGSLPLNKPKFVAVIGEDAGPSPNGPNGCSDRGCNEGTLAMGWGSGTANYP-
YLVS
PDAALQLRAIQDGTRYESVLSNYAEENTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWNNGDTLV-
KNVS
SWCSNTIVVIHSVGPVLLTDWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRESYGADVL-
YKPN
NGNWAPQQDFTEGVFIDYRYFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTIQAPTFGNFS-
TDLE
DYLFPKDEFPYIPQYIYPYLNTTDPRRASADPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGGNRQLYDIVY-
TITA
DITNTGSVVGEEVPQLYVSLGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQDWVISRYPKTA-
YVGR SSRKLDLKIELP (SEQ ID NO: 120)
IESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTGVGWGAEQCVGQVGAIPRLG-
LRSL
CMHDSPLGIRGADYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLGPVAGPLGRMPEGGRNWEGFA-
PDPV
LTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSNIDDKTMHELYLWPFADAVRAG-
VGSV
MCSYNQVNNSYACQNSKLLNDLLKNELGFQGFVMSDWWAQHTGAASAVAGLDMSMPGDTMFNTGVSFWGANLTL-
AVLN
GTVPAYRLDDMAMRIMAALFKVTKTTDLEPINFSFWTRDTYGPIHWAAKQGYQEINSHVDVRADHGNLIRNIAA-
KGTV
LLKNTGSLPLNKPKFVAVIGEDAGPSPNGPNGCSDRGCNEGTLAMGWGSGTANYPYLVSPDAALQLRAIQDGTR-
YESV
LSNYAEENTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWNNGDTLVKNVSSWCSNTIVVIHSVGP-
VLLT
DWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRESYGADVLYKPNNGNWAPQQDFTEGVF-
IDYR
YFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTIQAPTFGNFSTDLEDYLFPKDEFPYIPQY-
IYPY
LNTTDPRRASADPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGGNRQLYDIVYTITADITNTGSVVGEEVPQ-
LYVS
LGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQDWVISRYPKTAYVGRSSRKLDLKIELP
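The relationship between SEQ ID NO:119 and SEQ ID NO:120 described above (the same variant with and without its signal sequence) can be checked mechanically. The following is a minimal Python sketch and is not part of the original disclosure. The helper name `split_signal` is hypothetical, and only the leading residues of each sequence, copied from the listing above, are used. It locates the mature sequence inside the precursor and treats everything before that point as the signal sequence.

```python
def split_signal(precursor: str, mature: str) -> tuple[str, str]:
    """Return (signal, mature_part) by locating `mature` inside `precursor`."""
    start = precursor.find(mature)
    if start < 0:
        raise ValueError("mature sequence not found in precursor")
    return precursor[:start], precursor[start:]


# Leading residues copied from SEQ ID NO:119 and SEQ ID NO:120 (excerpts only,
# not the full-length sequences).
SEQ119_START = "MKAAALSCLFGSTLAVAGAIESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTG"
SEQ120_START = "IESRKVHQKPLARSEPFYPSPWMNPNADGWAEAYAQAKSFVSQMTLLEKVNLTTG"

signal, mature_part = split_signal(SEQ119_START, SEQ120_START)
print(signal, len(signal))
```

For these excerpts, the residues preceding the mature sequence are MKAAALSCLFGSTLAVAGA (19 residues). The same comparison could be applied to the other precursor and mature sequence pairs in this listing.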
[0348] The polynucleotide (SEQ ID NO:121) and amino acid (SEQ ID
NO:122) sequences of a BGL variant ("Variant 900") are provided
below. The signal sequence is underlined in SEQ ID NO:122. SEQ ID
NO:123 provides the sequence of this BGL variant, without the
signal sequence.
TABLE-US-00052 (SEQ ID NO: 121)
ATGAAGGCTGCTGCGCTTTCCTGCCTCTTCGGCAGTACCCTTGCCGTTGCAGGCGCCATTGAATCGAGAAAGGT-
TCAC
CAGAAGCCCCTCGCGAGATCTGAACCTTTTTACCCGTCGCCATGGATGAATCCCAACGCCATCGGCTGGGCGGA-
GGCC
TATGCCCAGGCCAAGTCCTTTGTCTCCCAAATGACTCTGCTAGAGAAGGTCAACTTGACCACGGGAGTCGGCTG-
GGGG
GAGGAGCAGTGCGTCGGCAACGTGGGCGCGATCCCTCGCCTTGGACTTCGCAGTCTGTGCATGCATGACTCCCC-
TCTC
GGCGTGCGAGGAACCGACTACAACTCAGCGTTCCCCTCTGGCCAGACCGTTGCTGCTACCTGGGATCGCGGTCT-
GATG
TACCGTCGCGGCTACGCAATGGGCCAGGAGGCCAAAGGCAAGGGCATCAATGTCCTTCTCGGACCAGTCGCCGG-
CCCC
CTTGGCCGCATGCCCGAGGGCGGTCGTAACTGGGAAGGCTTCGCTCCGGATCCCGTCCTTACCGGCATCGGCAT-
GTCC
GAGACGATCAAGGGCATTCAGGATGCTGGCGTCATCGCTTGTGCGAAGCACTTTATTGGAAACGAGCAGGAGCA-
CTTC
AGACAGGTGCCAGAAGCCCAGGGATACGGTTACAACATCAGCGAAACCCTCTCCTCCAACATTGACGACAAGAC-
CATG
CACGAGCTCTACCTTTGGCCGTTTGCCGATGCCGTCCGGGCCGGCGTCGGCTCTGTCATGTGCTCGTACAACCA-
GGGC
AACAACTCGTACGCCTGCCAGAACTCGAAGCTGCTGAACGACCTCCTCAAGAACGAGCTTGGGTTTCAGGGCTT-
CGTC
ATGAGCGACTGGTGGGCACAGCACACTGGCGCAGCAAGCGCCGTGGCTGGTCTCGATATGTCCATGCCGGGCGA-
CACC
ATGGTCAACACTGGCGTCAGTTTCTGGGGCGCCAATCTCACCCTCGCCGTCCTCAACGGCACAGTCCCTGCCTA-
CCGT
CTCGACGACATGTGCATGCGCATCATGGCCGCCCTCTTCAAGGTCACCAAGACCACCGACCTGGAACCGATCAA-
CTTC
TCCTTCTGGACCCGCGACACTTATGGCCCGATCCACTGGGCCGCCAAGCAGGGCTACCAGGAGATTAATTCCCA-
CGTT
GACGTCCGCGCCGACCACGGCAACCTCATCCGGAACATTGCCGCCAAGGGTACGGTGCTGCTGAAGAATACCGG-
CTCT
CTACCCCTGAACAAGCCAAAGTTCGTGGCCGTCATCGGCGAGGATGCTGGGCCGAGCCCCAACGGGCCCAACGG-
CTGC
AGCGACCGCGGCTGTAACGAAGGCACGCTCGCCATGGGCTGGGGATCCGGCACAGCCAACTATCCGTACCTCGT-
TTCC
CCCGACGCCGCGCTCCAGGCGCGGGCCATCCAGGACGGCACGAGGTACGAGAGCGTCCTGTCCAACTACGCCGA-
GGAA
AATACAAAGGCTCTGGTCTCGCAGGCCAATGCAACCGCCATCGTCTTCGTCAATGCCGACTCAGGCGAGGGCTA-
CATC
AACGTGGACGGTAACGAGGGCGACCGTAAGAACCTGACTCTCTGGAACAACGGTGATACTCTGGTCAAGAACGT-
CTCG
AGCTGGTGCAGCAACACCATCGTCGTCATCCACTCGGTCGGCCCGGTCCTCCTGACCGATTGGTACGACAACCC-
CAAC
ATCACGGCCATTCTCTGGGCTGGTCTTCCGGGCCAGGAGTCGGGCAACTCCATCACCGACGTGCTTTACGGCAA-
GGTC
AACCCCGCCGCCCGCTCGCCCTTCACTTGGGGCAAGACCCGCGAAAGCTATGGCGCGGACGTCCTGTACAAGCC-
GAAT
AATGGCAATTGGGCGCCCCAACAGGACTTCACCGAGGGCGTCTTCATCGACTACCGCTACTTCGACAAGGTTGA-
CGAT
GACTCGGTCATCTACGAGTTCGGCCACGGCCTGAGCTACACCACCTTCGAGTACAGCAACATCCGCGTCGTCAA-
GTCC
AACGTCAGCGAGTACCGGCCCACGACGGGCACCACGATTCAGGCCCCGACGTTTGGCAACTTCTCCACCGACCT-
CGAG
GACTATCTCTTCCCCAAGGACGAGTTCCCCTACATCCCGCAGTACATCTACCCGTACCTCAACACGACCGACCC-
CCGG
AGGGCCTCGGGCGATCCCCACTACGGCCAGACCGCCGAGGAGTTCCTCCCGCCCCACGCCACCGATGACGACCC-
CCAG
CCGCTCCTCCGGTCCTCGGGCGGAAACTCCCCCGGCGGCAACCGCCAGCTGTACGACATTGTCTACACAATCAC-
GGCC
GACATCACGAATACGGGCTCCGTTGTAGGCGAGGAGGTACCGCAGCTCTACGTCTCGCTGGGCGGTCCCGAGGA-
TCCC
AAGGTGCAGCTGCGCGACTTTGACAGGATGCGGATCGAACCCGGCGAGACGAGGCAGTTCACCGGCCGCCTGAC-
GCGC
AGAGATCTGAGCAACTGGGACGTCACGGTGCAGGACTGGGTCATCAGCAGGTATCCCAAGACGGCATATGTTGG-
GAGG AGCAGCCGGAAGTTGGATCTCAAGATTGAGCTTCCTTGA (SEQ ID NO: 122)
MKAAALSCLFGSTLAVAGAIESRKVHQKPLARSEPFYPSPWMNPNAIGWAEAYAQAKSFVSQMTLLEKVNLTTG-
VGWG
EEQCVGNVGAIPRLGLRSLCMHDSPLGVRGTDYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLGP-
VAGP
LGRMPEGGRNWEGFAPDPVLTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSNID-
DKTM
HELYLWPFADAVRAGVGSVMCSYNQGNNSYACQNSKLLNDLLKNELGFQGFVMSDWWAQHTGAASAVAGLDMSM-
PGDT
MVNTGVSFWGANLTLAVLNGTVPAYRLDDMCMRIMAALFKVTKTTDLEPINFSFWTRDTYGPIHWAAKQGYQEI-
NSHV
DVRADHGNLIRNIAAKGTVLLKNTGSLPLNKPKFVAVIGEDAGPSPNGPNGCSDRGCNEGTLAMGWGSGTANYP-
YLVS
PDAALQARAIQDGTRYESVLSNYAEENTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWNNGDTLV-
KNVS
SWCSNTIVVIHSVGPVLLTDWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRESYGADVL-
YKPN
NGNWAPQQDFTEGVFIDYRYFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTIQAPTFGNFS-
TDLE
DYLFPKDEFPYIPQYIYPYLNTTDPRRASGDPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGGNRQLYDIVY-
TITA
DITNTGSVVGEEVPQLYVSLGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQDWVISRYPKTA-
YVGR SSRKLDLKIELP (SEQ ID NO: 123)
IESRKVHQKPLARSEPFYPSPWMNPNAIGWAEAYAQAKSFVSQMTLLEKVNLTTGVGWGEEQCVGNVGAIPRLG-
LRSL
CMHDSPLGVRGTDYNSAFPSGQTVAATWDRGLMYRRGYAMGQEAKGKGINVLLGPVAGPLGRMPEGGRNWEGFA-
PDPV
LTGIGMSETIKGIQDAGVIACAKHFIGNEQEHFRQVPEAQGYGYNISETLSSNIDDKTMHELYLWPFADAVRAG-
VGSV
MCSYNQGNNSYACQNSKLLNDLLKNELGFQGFVMSDWWAQHTGAASAVAGLDMSMPGDTMVNTGVSFWGANLTL-
AVLN
GTVPAYRLDDMCMRIMAALFKVTKTTDLEPINFSFWTRDTYGPIHWAAKQGYQEINSHVDVRADHGNLIRNIAA-
KGTV
LLKNTGSLPLNKPKFVAVIGEDAGPSPNGPNGCSDRGCNEGTLAMGWGSGTANYPYLVSPDAALQARAIQDGTR-
YESV
LSNYAEENTKALVSQANATAIVFVNADSGEGYINVDGNEGDRKNLTLWNNGDTLVKNVSSWCSNTIVVIHSVGP-
VLLT
DWYDNPNITAILWAGLPGQESGNSITDVLYGKVNPAARSPFTWGKTRESYGADVLYKPNNGNWAPQQDFTEGVF-
IDYR
YFDKVDDDSVIYEFGHGLSYTTFEYSNIRVVKSNVSEYRPTTGTTIQAPTFGNFSTDLEDYLFPKDEFPYIPQY-
IYPY
LNTTDPRRASGDPHYGQTAEEFLPPHATDDDPQPLLRSSGGNSPGGNRQLYDIVYTITADITNTGSVVGEEVPQ-
LYVS
LGGPEDPKVQLRDFDRMRIEPGETRQFTGRLTRRDLSNWDVTVQDWVISRYPKTAYVGRSSRKLDLKIELP
[0349] The polynucleotide (SEQ ID NO:124) and amino acid (SEQ ID
NO:125) sequences of wild-type Talaromyces emersonii CBH1 are
provided below. The signal sequence is shown underlined in SEQ ID
NO:125. SEQ ID NO:126 provides the sequence of this CBH1, without
the signal sequence.
TABLE-US-00053 (SEQ ID NO: 124)
ATGCTTCGACGGGCTCTTCTTCTATCCTCTTCCGCCATCCTTGCTGTCAAGGCACAGCAGGCCGGCACGGCGAC-
GG
CAGAGAACCACCCGCCCCTGACATGGCAGGAATGCACCGCCCCTGGGAGCTGCACCACCCAGAACGGGGCGGTC-
GT
TCTTGATGCGAACTGGCGTTGGGTGCACGATGTGAACGGATACACCAACTGCTACACGGGCAATACCTGGGACC-
CC
ACGTACTGCCCTGACGACGAAACCTGCGCCCAGAACTGTGCGCTGGACGGCGCGGATTACGAGGGCACCTACGG-
CG
TGACTTCGTCGGGCAGCTCCTTGAAACTCAATTTCGTCACCGGGTCGAACGTCGGATCCCGTCTCTACCTGCTG-
CA
GGACGACTCGACCTATCAGATCTTCAAGCTTCTGAACCGCGAGTTCAGCTTTGACGTCGATGTCTCCAATCTTC-
CG
TGCGGATTGAACGGCGCTCTGTACTTTGTCGCCATGGACGCCGACGGCGGCGTGTCCAAGTACCCGAACAACAA-
GG
CTGGTGCCAAGTACGGAACCGGGTATTGCGACTCCCAATGCCCACGGGACCTCAAGTTCATCGACGGCGAGGCC-
AA
CGTCGAGGGCTGGCAGCCGTCTTCGAACAACGCCAACACCGGAATTGGCGACCACGGCTCCTGCTGTGCGGAGA-
TG
GATGTCTGGGAAGCAAACAGCATCTCCAATGCGGTCACTCCGCACCCGTGCGACACGCCAGGCCAGACGATGTG-
CT
CTGGAGATGACTGCGGTGGCACATACTCTAACGATCGCTACGCGGGAACCTGCGATCCTGACGGCTGTGACTTC-
AA
CCCTTACCGCATGGGCAACACTTCTTTCTACGGGCCTGGCAAGATCATCGATACCACCAAGCCCTTCACTGTCG-
TG
ACGCAGTTCCTCACTGATGATGGTACGGATACTGGAACTCTCAGCGAGATCAAGCGCTTCTACATCCAGAACAG-
CA
ACGTCATTCCGCAGCCCAACTCGGACATCAGTGGCGTGACCGGCAACTCGATCACGACGGAGTTCTGCACTGCT-
CA
GAAGCAGGCCTTTGGCGACACGGACGACTTCTCTCAGCACGGTGGCCTGGCCAAGATGGGAGCGGCCATGCAGC-
AG
GGTATGGTCCTGGTGATGAGTTTGTGGGACGACTACGCCGCGCAGATGCTGTGGTTGGATTCCGACTACCCGAC-
GG
ATGCGGACCCCACGACCCCTGGTATTGCCCGTGGAACGTGTCCGACGGACTCGGGCGTCCCATCGGATGTCGAG-
TC
GCAGAGCCCCAACTCCTACGTGACCTACTCGAACATTAAGTTTGGTCCGATCAACTCGACCTTCACCGCTTCGT-
GA (SEQ ID NO: 125)
MLRRALLLSSSAILAVKAQQAGTATAENHPPLTWQECTAPGSCTTQNGAVVLDANWRWVHDVNGYTNCYTGNTW-
DP
TYCPDDETCAQNCALDGADYEGTYGVTSSGSSLKLNFVTGSNVGSRLYLLQDDSTYQIFKLLNREFSFDVDVSN-
LP
CGLNGALYFVAMDADGGVSKYPNNKAGAKYGTGYCDSQCPRDLKFIDGEANVEGWQPSSNNANTGIGDHGSCCA-
EM
DVWEANSISNAVTPHPCDTPGQTMCSGDDCGGTYSNDRYAGTCDPDGCDFNPYRMGNTSFYGPGKIIDTTKPFT-
VV
TQFLTDDGTDTGTLSEIKRFYIQNSNVIPQPNSDISGVTGNSITTEFCTAQKQAFGDTDDFSQHGGLAKMGAAM-
QQ
GMVLVMSLWDDYAAQMLWLDSDYPTDADPTTPGIARGTCPTDSGVPSDVESQSPNSYVTYSNIKFGPINSTFTA-
S (SEQ ID NO: 126)
QQAGTATAENHPPLTWQECTAPGSCTTQNGAVVLDANWRWVHDVNGYTNCYTGNTWDPTYCPDDETCAQNCALD-
GA
DYEGTYGVTSSGSSLKLNFVTGSNVGSRLYLLQDDSTYQIFKLLNREFSFDVDVSNLPCGLNGALYFVAMDADG-
GV
SKYPNNKAGAKYGTGYCDSQCPRDLKFIDGEANVEGWQPSSNNANTGIGDHGSCCAEMDVWEANSISNAVTPHP-
CD
TPGQTMCSGDDCGGTYSNDRYAGTCDPDGCDFNPYRMGNTSFYGPGKIIDTTKPFTVVTQFLTDDGTDTGTLSE-
IK
RFYIQNSNVIPQPNSDISGVTGNSITTEFCTAQKQAFGDTDDFSQHGGLAKMGAAMQQGMVLVMSLWDDYAAQM-
LW LDSDYPTDADPTTPGIARGTCPTDSGVPSDVESQSPNSYVTYSNIKFGPINSTFTAS
[0350] The polynucleotide (SEQ ID NO:127) and amino acid (SEQ ID
NO:128) sequences of wild-type M. thermophila CBH1a are provided
below. The signal sequence is shown underlined in SEQ ID NO:128.
SEQ ID NO:129 provides the sequence of this CBH1a, without the
signal sequence.
TABLE-US-00054 (SEQ ID NO: 127)
ATGTACGCCAAGTTCGCGACCCTCGCCGCCCTTGTGGCTGGCGCCGCTGCTCAGAACGCCTGCACTCTGACCGC-
TGA
GAACCACCCCTCGCTGACGTGGTCCAAGTGCACGTCTGGCGGCAGCTGCACCAGCGTCCAGGGTTCCATCACCA-
TCG
ACGCCAACTGGCGGTGGACTCACCGGACCGATAGCGCCACCAACTGCTACGAGGGCAACAAGTGGGATACTTCG-
TAC
TGCAGCGATGGTCCTTCTTGCGCCTCCAAGTGCTGCATCGACGGCGCTGACTACTCGAGCACCTATGGCATCAC-
CAC
GAGCGGTAACTCCCTGAACCTCAAGTTCGTCACCAAGGGCCAGTACTCGACCAACATCGGCTCGCGTACCTACC-
TGA
TGGAGAGCGACACCAAGTACCAGATGTTCCAGCTCCTCGGCAACGAGTTCACCTTCGATGTCGACGTCTCCAAC-
CTC
GGCTGCGGCCTCAATGGCGCCCTCTACTTCGTGTCCATGGATGCCGATGGTGGCATGTCCAAGTACTCGGGCAA-
CAA
GGCAGGTGCCAAGTACGGTACCGGCTACTGTGATTCTCAGTGCCCCCGCGACCTCAAGTTCATCAACGGCGAGG-
CCA
ACGTAGAGAACTGGCAGAGCTCGACCAACGATGCCAACGCCGGCACGGGCAAGTACGGCAGCTGCTGCTCCGAG-
ATG
GACGTCTGGGAGGCCAACAACATGGCCGCCGCCTTCACTCCCCACCCTTGCACCGTGATCGGCCAGTCGCGCTG-
CGA
GGGCGACTCGTGCGGCGGTACCTACAGCACCGACCGCTATGCCGGCATCTGCGACCCCGACGGATGCGACTTCA-
ACT
CGTACCGCCAGGGCAACAAGACCTTCTACGGCAAGGGCATGACGGTCGACACGACCAAGAAGATCACGGTCGTC-
ACC
CAGTTCCTCAAGAACTCGGCCGGCGAGCTCTCCGAGATCAAGCGGTTCTACGTCCAGAACGGCAAGGTCATCCC-
CAA
CTCCGAGTCCACCATCCCGGGCGTCGAGGGCAACTCCATCACCCAGGACTGGTGCGACCGCCAGAAGGCCGCCT-
TCG
GCGACGTGACCGACTTCCAGGACAAGGGCGGCATGGTCCAGATGGGCAAGGCCCTCGCGGGGCCCATGGTCCTC-
GTC
ATGTCCATCTGGGACGACCACGCCGTCAACATGCTCTGGCTCGACTCCACCTGGCCCATCGACGGCGCCGGCAA-
GCC
GGGCGCCGAGCGCGGTGCCTGCCCCACCACCTCGGGCGTCCCCGCTGAGGTCGAGGCCGAGGCCCCCAACTCCA-
ACG
TCATCTTCTCCAACATCCGCTTCGGCCCCATCGGCTCCACCGTCTCCGGCCTGCCCGACGGCGGCAGCGGCAAC-
CCC
AACCCGCCCGTCAGCTCGTCCACCCCGGTCCCCTCCTCGTCCACCACATCCTCCGGTTCCTCCGGCCCGACTGG-
CGG
CACGGGTGTCGCTAAGCACTATGAGCAATGCGGAGGAATCGGGTTCACTGGCCCTACCCAGTGCGAGAGCCCCT-
ACA CTTGCACCAAGCTGAATGACTGGTACTCGCAGTGCCTGTAA (SEQ ID NO: 128)
MYAKFATLAALVAGAAAQNACTLTAENHPSLTYSKCTSGGSCTSVQGSITIDANWRWTHRTDSATNCYEGNKWD-
TSW
CSDGPSCASKCCIDGADYSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTYLMESDTKYQMFQLLGNEFTFDVDV-
SNL
GCGLNGALYFVSMDADGGMSKYSGNKAGAKYGTGYCDSQCPRDLKFINGEANVENWQSSTNDANAGTGKYGSCC-
SEM
DVWEANNMAAAFTPHPCTVIGQSRCEGDSCGGTYSTDRYAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKIT-
VVT
QFLKNSAGELSEIKRFYVQNGKVIPNSESTIPGVEGNSITQDWCDRQKAAFGDVTDFQDKGGMVQMGKALAGPM-
VLV
MSIWDDHAVNMLWLDSTWPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFSNIRFGPIGSTVSGLPDGGS-
GNP
NPPVSSSTPVPSSSTTSSGSSGPTGGTGVAKHYEQCGGIGFTGPTQCESPYTCTKLNDWYSQCL
(SEQ ID NO: 129)
QNACTLTAENHPSLTYSKCTSGGSCTSVQGSITIDANWRWTHRTDSATNCYEGNKWDTSWCSDGPSCASKCCID-
GAD
YSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTYLMESDTKYQMFQLLGNEFTFDVDVSNLGCGLNGALYFVSMD-
ADG
GMSKYSGNKAGAKYGTGYCDSQCPRDLKFINGEANVENWQSSTNDANAGTGKYGSCCSEMDVWEANNMAAAFTP-
HPC
TVIGQSRCEGDSCGGTYSTDRYAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKITVVTQFLKNSAGELSEIK-
RFY
VQNGKVIPNSESTIPGVEGNSITQDWCDRQKAAFGDVTDFQDKGGMVQMGKALAGPMVLVMSIWDDHAVNMLWL-
DST
WPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFSNIRFGPIGSTVSGLPDGGSGNPNPPVSSSTPVPSSS-
TTS SGSSGPTGGTGVAKHYEQCGGIGFTGPTQCESPYTCTKLNDWYSQCL
[0351] The polynucleotide (SEQ ID NO:130) and amino acid (SEQ ID
NO:131) sequences of a M. thermophila CBH1a variant ("Variant 145")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:131. SEQ ID NO:132 provides the sequence of this CBH1a
variant, without the signal sequence.
TABLE-US-00055 (SEQ ID NO: 130)
ATGTACGCCAAGTTCGCGACCCTCGCCGCCCTTGTGGCTGGCGCCGCTGCTCAGAACGCCTG
CACTCTGACCGCTGAGAACCACCCCTCGCTGACGTGGTCCAAGTGCACGTCTGGCGGCAGCT
GCACCAGCGTCCAGGGTTCCATCACCATCGACGCCAACTGGCGGTGGACTCACCGGACCGAT
AGCGCCACCAACTGCTACGAGGGCAACAAGTGGGATACTTCGTGGTGCAGCGATGGTCCTTC
TTGCGCCTCCAAGTGCTGCATCGACGGCGCTGACTACTCGAGCACCTATGGCATCACCACGA
GCGGTAACTCCCTGAACCTCAAGTTCGTCACCAAGGGCCAGTACTCGACCAACATCGGCTCG
CGTACCTACCTGATGGAGAGCGACACCAAGTACCAGATGTTCCAGCTCCTCGGCAACGAGTT
CACCTTCGATGTCGACGTCTCCAACCTCGGCTGCGGCCTCAATGGCGCCCTCTACTTCGTGTC
CATGGATGCCGATGGTGGCATGTCCAAGTACTCGGGCAACAAGGCAGGTGCCAAGTACGGT
ACCGGCTACTGTGATTCTCAGTGCCCCCGCGACCTCAAGTTCATCAACGGCGAGGCCAACGT
AGAGAACTGGCAGAGCTCGACCAACGATGCCAACGCCGGCACGGGCAAGTACGGCAGCTGC
TGCTCCGAGATGGACGTCTGGGAGGCCAACAACATGGCCGCCGCCTTCACTCCCCACCCTTG
CACCGTGATCGGCCAGTCGCGCTGCGAGGGCGACTCGTGCGGCGGTACCTACAGCACCGAC
CGCTATGCCGGCATCTGCGACCCCGACGGATGCGACTTCAACTCGTACCGCCAGGGCAACAA
GACCTTCTACGGCAAGGGCATGACGGTCGACACGACCAAGAAGATCACGGTCGTCACCCAG
TTCCTCAAGAACTCGGCCGGCGAGCTCTCCGAGATCAAGCGGTTCTACGTCCAGAACGGCAA
GGTCATCCCCAACTCCGAGTCCACCATCCCGGGCGTCGAGGGCAACTCCATCACCCAGGACT
GGTGCGACCGCCAGAAGGCCGCCTTCGGCGACGTGACCGACTTCCAGGACAAGGGCGGCAT
GGTCCAGATGGGCAAGGCCCTCGCGGGGCCCATGGTCCTCGTCATGTCCATCTGGGACGACC
ACGCCGTCAACATGCTCTGGCTCGACTCCACCTGGCCCATCGACGGCGCCGGCAAGCCGGGC
GCCGAGCGCGGTGCCTGCCCCACCACCTCGGGCGTCCCCGCTGAGGTCGAGGCCGAGGCCC
CCAACTCCAACGTCATCTTCTCCAACATCCGCTTCGGCCCCATCGGCTCCACCGTCTCCGGCC
TGCCCGACGGCGGCAGCGGCAACCCCAACCCGCCCGTCAGCTCGTCCACCCCGGTCCCCTCC
TCGTCCACCACATCCTCCGGTTCCTCCGGCCCGACTGGCGGCACGGGTGTCGCTAAGCACTA
TGAGCAATGCGGAGGAATCGGGTTCACTGGCCCTACCCAGTGCGAGAGCCCCTACACTTGCA
CCAAGCTGAATGACTGGTACTCGCAGTGCCTGTAA (SEQ ID NO: 131)
MYAKFATLAALVAGAAAQNACTLTAENHPSLTWSKCTSGGSCTSVQGSITIDANWRWTHRTDS
ATNCYEGNKWDTSWCSDGPSCASKCCIDGADYSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTY
LMESDTKYQMFQLLGNEFTFDVDVSNLGCGLNGALYFVSMDADGGMSKYSGNKAGAKYGTG
YCDSQCPRDLKFINGEANVENWQSSTNDANAGTGKYGSCCSEMDVWEANNMAAAFTPHPCTVI
GQSRCEGDSCGGTYSTDRYAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKITVVTQFLKNS
AGELSEIKRFYVQNGKVIPNSESTIPGVEGNSITQDWCDRQKAAFGDVTDFQDKGGMVQMGKA
LAGPMVLVMSIWDDHAVNMLWLDSTWPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFS
NIRFGPIGSTVSGLPDGGSGNPNPPVSSSTPVPSSSTTSSGSSGPTGGTGVAKHYEQCGGIGFTGPT
QCESPYTCTKLNDWYSQCL (SEQ ID NO: 132)
QNACTLTAENHPSLTWSKCTSGGSCTSVQGSITIDANWRWTHRTDSATNCYEGNKWDTSWCSD
GPSCASKCCIDGADYSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTYLMESDTKYQMFQLLGNEF
TFDVDVSNLGCGLNGALYFVSMDADGGMSKYSGNKAGAKYGTGYCDSQCPRDLKFINGEANV
ENWQSSTNDANAGTGKYGSCCSEMDVWEANNMAAAFTPHPCTVIGQSRCEGDSCGGTYSTDR
YAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKITVVTQFLKNSAGELSEIKRFYVQNGKVIP
NSESTIPGVEGNSITQDWCDRQKAAFGDVTDFQDKGGMVQMGKALAGPMVLVMSIWDDHAVN
MLWLDSTWPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFSNIRFGPIGSTVSGLPDGGSG
NPNPPVSSSTPVPSSSTTSSGSSGPTGGTGVAKHYEQCGGIGFTGPTQCESPYTCTKLNDWYSQCL
[0352] The polynucleotide (SEQ ID NO:133) and amino acid (SEQ ID
NO:134) sequences of a M. thermophila CBH1a variant ("Variant 983")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:134. SEQ ID NO:135 provides the sequence of this CBH1a
variant, without the signal sequence.
TABLE-US-00056 (SEQ ID NO: 133)
ATGTACGCCAAGTTCGCGACCCTCGCCGCCCTTGTGGCTGGCGCCGCTGCTCAGAACGCCTG
CACTCTGAACGCTGAGAACCACCCCTCGCTGACGTGGTCCAAGTGCACGTCTGGCGGCAGCT
GCACCAGCGTCCAGGGTTCCATCACCATCGACGCCAACTGGCGGTGGACTCACCGGACCGAT
AGCGCCACCAACTGCTACGAGGGCAACAAGTGGGATACTTCGTACTGCAGCGATGGTCCTTC
TTGCGCCTCCAAGTGCTGCATCGACGGCGCTGACTACTCGAGCACCTATGGCATCACCACGA
GCGGTAACTCCCTGAACCTCAAGTTCGTCACCAAGGGCCAGTACTCGACCAACATCGGCTCG
CGTACCTACCTGATGGAGAGCGACACCAAGTACCAGATGTTCCAGCTCCTCGGCAACGAGTT
CACCTTCGATGTCGACGTCTCCAACCTCGGCTGCGGCCTCAATGGCGCCCTCTACTTCGTGTC
CATGGATGCCGATGGTGGCATGTCCAAGTACTCGGGCAACAAGGCAGGTGCCAAGTACGGT
ACCGGCTACTGTGATTCTCAGTGCCCCCGCGACCTCAAGTTCATCAACGGCGAGGCCAACGT
AGAGAACTGGCAGAGCTCGACCAACGATGCCAACGCCGGCACGGGCAAGTACGGCAGCTGC
TGCTCCGAGATGGACGTCTGGGAGGCCAACAACATGGCCGCCGCCTTCACTCCCCACCCTTG
CACCGTGATCGGCCAGTCGCGCTGCGAGGGCGACTCGTGCGGCGGTACCTACAGCACCGAC
CGCTATGCCGGCATCTGCGACCCCGACGGATGCGACTTCAACTCGTACCGCCAGGGCAACAA
GACCTTCTACGGCAAGGGCATGACGGTCGACACGACCAAGAAGATCACGGTCGTCACCCAG
TTCCTCAAGAACTCGGCCGGCGAGCTCTCCGAGATCAAGCGGTTCTACGTCCAGAACGGCAA
GGTCATCCCCAACTCCGAGTCCACCATCCCGGGCGTCGAGGGCAACTCCATCACCCAGGAGT
ACTGCGACCGCCAGAAGGCCGCCTTCGGCGACGTGACCGACTTCCAGGACAAGGGCGGCAT
GGTCCAGATGGGCAAGGCCCTCGCGGGGCCCATGGTCCTCGTCATGTCCATCTGGGACGACC
ACGCCGACAACATGCTCTGGCTCGACTCCACCTGGCCCATCGACGGCGCCGGCAAGCCGGG
CGCCGAGCGCGGTGCCTGCCCCACCACCTCGGGCGTCCCCGCTGAGGTCGAGGCCGAGGCC
CCCAACTCCAACGTCATCTTCTCCAACATCCGCTTCGGCCCCATCGGCTCCACCGTCTCCGGC
CTGCCCGACGGCGGCAGCGGCAACCCCAACCCGCCCGTCAGCTCGTCCACCCCGGTCCCCTC
CTCGTCCACCACATCCTCCGGTTCCTCCGGCCCGACTGGCGGCACGGGTGTCGCTAAGCACT
ATGAGCAATGCGGAGGAATCGGGTTCACTGGCCCTACCCAGTGCGAGAGCCCCTACACTTGC
ACCAAGCTGAATGACTGGTACTCGCAGTGCCTGTAA (SEQ ID NO: 134)
MYAKFATLAALVAGAAAQNACTLNAENHPSLTWSKCTSGGSCTSVQGSITIDANWRWTHRTDS
ATNCYEGNKWDTSYCSDGPSCASKCCIDGADYSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTYL
MESDTKYQMFQLLGNEFTFDVDVSNLGCGLNGALYFVSMDADGGMSKYSGNKAGAKYGTGY
CDSQCPRDLKFINGEANVENWQSSTNDANAGTGKYGSCCSEMDVWEANNMAAAFTPHPCTVIG
QSRCEGDSCGGTYSTDRYAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKITVVTQFLKNSA
GELSEIKRFYVQNGKVIPNSESTIPGVEGNSITQEYCDRQKAAFGDVTDFQDKGGMVQMGKALA
GPMVLVMSIWDDHADNMLWLDSTWPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFSNI
RFGPIGSTVSGLPDGGSGNPNPPVSSSTPVPSSSTTSSGSSGPTGGTGVAKHYEQCGGIGFTGPTQC
ESPYTCTKLNDWYSQCL (SEQ ID NO: 135)
QNACTLNAENHPSLTWSKCTSGGSCTSVQGSITIDANWRWTHRTDSATNCYEGNKWDTSYCSD
GPSCASKCCIDGADYSSTYGITTSGNSLNLKFVTKGQYSTNIGSRTYLMESDTKYQMFQLLGNEF
TFDVDVSNLGCGLNGALYFVSMDADGGMSKYSGNKAGAKYGTGYCDSQCPRDLKFINGEANV
ENWQSSTNDANAGTGKYGSCCSEMDVWEANNMAAAFTPHPCTVIGQSRCEGDSCGGTYSTDR
YAGICDPDGCDFNSYRQGNKTFYGKGMTVDTTKKITVVTQFLKNSAGELSEIKRFYVQNGKVIP
NSESTIPGVEGNSITQEYCDRQKAAFGDVTDFQDKGGMVQMGKALAGPMVLVMSIWDDHADN
MLWLDSTWPIDGAGKPGAERGACPTTSGVPAEVEAEAPNSNVIFSNIRFGPIGSTVSGLPDGGSG
NPNPPVSSSTPVPSSSTTSSGSSGPTGGTGVAKHYEQCGGIGFTGPTQCESPYTCTKLNDWYSQCL
[0353] The polynucleotide (SEQ ID NO:136) and amino acid (SEQ ID
NO:137) sequences of wild-type M. thermophila CBH2b are provided
below. The signal sequence is shown underlined in SEQ ID NO:137.
SEQ ID NO:138 provides the sequence of this CBH2b, without the
signal sequence.
TABLE-US-00057 (SEQ ID NO: 136)
ATGGCCAAGAAGCTTTTCATCACCGCCGCGCTTGCGGCTGCCGTGTTGGCGGCCCCCGTCAT
TGAGGAGCGCCAGAACTGCGGCGCTGTGTGGACTCAATGCGGCGGTAACGGGTGGCAAGGT
CCCACATGCTGCGCCTCGGGCTCGACCTGCGTTGCGCAGAACGAGTGGTACTCTCAGTGCCT
GCCCAACAGCCAGGTGACGAGTTCCACCACTCCGTCGTCGACTTCCACCTCGCAGCGCAGCA
CCAGCACCTCCAGCAGCACCACCAGGAGCGGCAGCTCCTCCTCCTCCTCCACCACGCCCCCG
CCCGTCTCCAGCCCCGTGACCAGCATTCCCGGCGGTGCGACCTCCACGGCGAGCTACTCTGG
CAACCCCTTCTCGGGCGTCCGGCTCTTCGCCAACGACTACTACAGGTCCGAGGTCCACAATC
TCGCCATTCCTAGCATGACTGGTACTCTGGCGGCCAAGGCTTCCGCCGTCGCCGAAGTCCCT
AGCTTCCAGTGGCTCGACCGGAACGTCACCATCGACACCCTGATGGTCCAGACTCTGTCCCA
GGTCCGGGCTCTCAATAAGGCCGGTGCCAATCCTCCCTATGCTGCCCAACTCGTCGTCTACG
ACCTCCCCGACCGTGACTGTGCCGCCGCTGCGTCCAACGGCGAGTTTTCGATTGCAAACGGC
GGCGCCGCCAACTACAGGAGCTACATCGACGCTATCCGCAAGCACATCATTGAGTACTCGG
ACATCCGGATCATCCTGGTTATCGAGCCCGACTCGATGGCCAACATGGTGACCAACATGAAC
GTGGCCAAGTGCAGCAACGCCGCGTCGACGTACCACGAGTTGACCGTGTACGCGCTCAAGC
AGCTGAACCTGCCCAACGTCGCCATGTATCTCGACGCCGGCCACGCCGGCTGGCTCGGCTGG
CCCGCCAACATCCAGCCCGCCGCCGAGCTGTTTGCCGGCATCTACAATGATGCCGGCAAGCC
GGCTGCCGTCCGCGGCCTGGCCACTAACGTCGCCAACTACAACGCCTGGAGCATCGCTTCGG
CCCCGTCGTACACGTCGCCTAACCCTAACTACGACGAGAAGCACTACATCGAGGCCTTCAGC
CCGCTCTTGAACTCGGCCGGCTTCCCCGCACGCTTCATTGTCGACACTGGCCGCAACGGCAA
ACAACCTACCGGCCAACAACAGTGGGGTGACTGGTGCAATGTCAAGGGCACCGGCTTTGGC
GTGCGCCCGACGGCCAACACGGGCCACGAGCTGGTCGATGCCTTTGTCTGGGTCAAGCCCGG
CGGCGAGTCCGACGGCACAAGCGACACCAGCGCCGCCCGCTACGACTACCACTGCGGCCTG
TCCGATGCCCTGCAGCCTGCCCCCGAGGCTGGACAGTGGTTCCAGGCCTACTTCGAGCAGCT
GCTCACCAACGCCAACCCGCCCTTCTAA (SEQ ID NO: 137)
MAKKLFITAALAAAVLAAPVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCL
PNSQVTSSTTPSSTSTSQRSTSTSSSTTRSGSSSSSSTTPPPVSSPVTSIPGGATSTASYSGNPFSGVRL
FANDYYRSEVHNLAIPSMTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVQTLSQVRALNKAGA
NPPYAAQLVVYDLPDRDCAAAASNGEFSIANGGAANYRSYIDAIRKHIIEYSDIRIILVIEPDSMAN
MVTNMNVAKCSNAASTYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAGI
YNDAGKPAAVRGLATNVANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNSAGFPARFIVDTG
RNGKQPTGQQQWGDWCNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYDY
HCGLSDALQPAPEAGQWFQAYFEQLLTNANPPF (SEQ ID NO: 138)
APVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCLPNSQVTSSTTPSSTSTSQR
STSTSSSTTRSGSSSSSSTTPPPVSSPVTSIPGGATSTASYSGNPFSGVRLFANDYYRSEVHNLAIPS
MTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVQTLSQVRALNKAGANPPYAAQLVVYDLPDR
DCAAAASNGEFSIANGGAANYRSYIDAIRKHIIEYSDIRIILVIEPDSMANMVTNMNVAKCSNAAS
TYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAGIYNDAGKPAAVRGLATN
VANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNSAGFPARFIVDTGRNGKQPTGQQQWGDW
CNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYDYHCGLSDALQPAPEAGQ
WFQAYFEQLLTNANPPF
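The correspondence between a polynucleotide and its encoded amino acid sequence, such as SEQ ID NO:136 and SEQ ID NO:137, can be illustrated by translating the coding sequence with the standard genetic code. The sketch below is an illustration only and is not part of the original disclosure. The function name `translate` is hypothetical, the codon table is the standard one built from the conventional TCAG ordering, and only the first printed line of SEQ ID NO:136 is used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons in frame 1; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First printed line of SEQ ID NO:136 and the matching start of SEQ ID NO:137.
SEQ136_START = "ATGGCCAAGAAGCTTTTCATCACCGCCGCGCTTGCGGCTGCCGTGTTGGCGGCCCCCGTCAT"
SEQ137_START = "MAKKLFITAALAAAVLAAPV"

protein = translate(SEQ136_START)
print(protein, protein == SEQ137_START)
```

The 62-nucleotide excerpt contains 20 complete codons, so the translation covers the first 20 residues of the encoded protein. A stop codon is rendered as `*`.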
[0354] The polynucleotide (SEQ ID NO:139) and amino acid (SEQ ID
NO:140) sequences of a M. thermophila CBH2b variant ("Variant 196")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:140. SEQ ID NO:141 provides the sequence of this CBH2b
variant, without the signal sequence.
TABLE-US-00058 (SEQ ID NO: 139)
ATGGCCAAGAAGCTTTTCATCACCGCCGCGCTTGCGGCTGCCGTGTTGGCGGCCCCCGTCAT
TGAGGAGCGCCAGAACTGCGGCGCTGTGTGGACTCAATGCGGCGGTAACGGGTGGCAAGGT
CCCACATGCTGCGCCTCGGGCTCGACCTGCGTTGCGCAGAACGAGTGGTACTCTCAGTGCCT
GCCCAACAGCCAGGTGACGAGTTCCACCACTCCGTCGTCGACTTCCACCTCGCAGCGCAGCA
CCAGCACCTCCAGCAGCACCACCAGGAGCGGCAGCTCCTCCTCCTCCTCCACCACGCCCACC
CCCGTCTCCAGCCCCGTGACCAGCATTCCCGGCGGTGCGACCTCCACGGCGAGCTACTCTGG
CAACCCCTTCTCGGGCGTCCGGCTCTTCGCCAACGACTACTACAGGTCCGAGGTCCACAATC
TCGCCATTCCTAGCATGACTGGTACTCTGGCGGCCAAGGCTTCCGCCGTCGCCGAAGTCCCT
AGCTTCCAGTGGCTCGACCGGAACGTCACCATCGACACCCTGATGGTCCCGACTCTGTCCCG
CGTCCGGGCTCTCAATAAGGCCGGTGCCAATCCTCCCTATGCTGCCCAACTCGTCGTCTACG
ACCTCCCCGACCGTGACTGTGCCGCCGCTGCGTCCAACGGCGAGTTTTCGATTGCAAACGGC
GGCGCCGCCAACTACAGGAGCTACATCGACGCTATCCGCAAGCACATCATTGAGTACTCGG
ACATCCGGATCATCCTGGTTATCGAGCCCGACTCGATGGCCAACATGGTGACCAACATGAAC
GTGGCCAAGTGCAGCAACGCCGCGTCGACGTACCACGAGTTGACCGTGTACGCGCTCAAGC
AGCTGAACCTGCCCAACGTCGCCATGTATCTCGACGCCGGCCACGCCGGCTGGCTCGGCTGG
CCCGCCAACATCCAGCCCGCCGCCGAGCTGTTTGCCGGCATCTACAATGATGCCGGCAAGCC
GGCTGCCGTCCGCGGCCTGGCCACTAACGTCGCCAACTACAACGCCTGGAGCATCGCTTCGG
CCCCGTCGTACACGTCGCCTAACCCTAACTACGACGAGAAGCACTACATCGAGGCCTTCAGC
CCGCTCTTGAACTCGGCCGGCTTCCCCGCACGCTTCATTGTCGACACTGGCCGCAACGGCAA
ACAACCTACCGGCCAACAACAGTGGGGTGACTGGTGCAATGTCAAGGGCACCGGCTTTGGC
GTGCGCCCGACGGCCAACACGGGCCACGAGCTGGTCGATGCCTTTGTCTGGGTCAAGCCCGG
CGGCGAGTCCGACGGCACAAGCGACACCAGCGCCGCCCGCTACGACTACCACTGCGGCCTG
TCCGATGCCCTGCAGCCTGCCCCCGAGGCTGGACAGTGGTTCCAGGCCTACTTCGAGCAGCT
GCTCACCAACGCCAACCCGCCCTTCTAA (SEQ ID NO: 140)
MAKKLFITAALAAAVLAAPVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCL
PNSQVTSSTTPSSTSTSQRSTSTSSSTTRSGSSSSSSTTPTPVSSPVTSIPGGATSTASYSGNPFSGVR
LFANDYYRSEVHNLAIPSMTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVPTLSRVRALNKAG
ANPPYAAQLVVYDLPDRDCAAAASNGEFSIANGGAANYRSYIDAIRKHIIEYSDIRIILVIEPDSMA
NMVTNMNVAKCSNAASTYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAG
IYNDAGKPAAVRGLATNVANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNSAGFPARFIVDT
GRNGKQPTGQQQWGDWCNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYD
YHCGLSDALQPAPEAGQWFQAYFEQLLTNANPPF (SEQ ID NO: 141)
APVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCLPNSQVTSSTTPSSTSTSQR
STSTSSSTTRSGSSSSSSTTPTPVSSPVTSIPGGATSTASYSGNPFSGVRLFANDYYRSEVHNLAIPS
MTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVPTLSRVRALNKAGANPPYAAQLVVYDLPDRD
CAAAASNGEFSIANGGAANYRSYIDAIRKHIIEYSDIRIILVIEPDSMANMVTNMNVAKCSNAAST
YHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAGIYNDAGKPAAVRGLATNV
ANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNSAGFPARFIVDTGRNGKQPTGQQQWGDWC
NVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYDYHCGLSDALQPAPEAGQW
FQAYFEQLLTNANPPF
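Amino acid substitutions in a variant can be located by comparing it position by position with the wild-type sequence. The sketch below is an illustration only and is not part of the original disclosure. The function name `substitutions` is hypothetical, and the inputs are short, equal-length excerpts copied from SEQ ID NO:138 (wild-type) and SEQ ID NO:141 (Variant 196). The reported positions are relative to the excerpt, not to the full-length sequence numbering.

```python
def substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List (1-based position, reference residue, variant residue) mismatches."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


# Excerpts only; positions below are counted from the start of the excerpt.
WT_EXCERPT = "STSTSSSTTRSGSSSSSSTTPPPVSSPVTSIPGG"    # from SEQ ID NO:138
V196_EXCERPT = "STSTSSSTTRSGSSSSSSTTPTPVSSPVTSIPGG"  # from SEQ ID NO:141

diffs = substitutions(WT_EXCERPT, V196_EXCERPT)
print(diffs)
```

This simple comparison assumes the two sequences have the same length and contain no insertions or deletions. Sequences that differ in length would need a proper alignment first.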
[0355] The polynucleotide (SEQ ID NO:142) and amino acid (SEQ ID
NO:143) sequences of a M. thermophila CBH2b variant ("Variant 287")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:143. SEQ ID NO:144 provides the sequence of this CBH2b
variant, without the signal sequence.
TABLE-US-00059 (SEQ ID NO: 142)
ATGGCCAAGAAGCTTTTCATCACCGCCGCGCTTGCGGCTGCCGTGTTGGCGGCCCCCGTCAT
TGAGGAGCGCCAGAACTGCGGCGCTGTGTGGACTCAATGCGGCGGTAACGGGTGGCAAGGT
CCCACATGCTGCGCCTCGGGCTCGACCTGCGTTGCGCAGAACGAGTGGTACTCTCAGTGCCT
GCCCAACAGCCAGGTGACGAGTTCCACCACTCCGTCGTCGACTTCCACCTCGCAGCGCAGCA
CCAGCACCTCCAGCAGCACCACCAGGAGCGGCAGCTCCTCCTCCTCCTCCACCACGCCCCCG
CCCGTCTCCAGCCCCGTGACCAGCATTCCCGGCGGTGCGACCTCCACGGCGAGCTACTCTGG
CAACCCCTTCTCGGGCGTCCGGCTCTTCGCCAACGACTACTACAGGTCCGAGGTCCACAATC
TCGCCATTCCTAGCATGACTGGTACTCTGGCGGCCAAGGCTTCCGCCGTCGCCGAAGTCCCT
AGCTTCCAGTGGCTCGACCGGAACGTCACCATCGACACCCTGATGGTCCCGACTCTGTCCCG
CGTCCGGGCTCTCAATAAGGCCGGTGCCAATCCTCCCTATGCTGCCCAACTCGTCGTCTACG
ACCTCCCCGACCGTGACTGTGCCGCCGCTGCGTCCAACGGCGAGTTTTCGATTGCAAACGGC
GGCGCCGCCAACTACAGGAGCTACATCGACGCTATCCGCAAGCACATCAAGGAGTACTCGG
ACATCCGGATCATCCTGGTTATCGAGCCCGACTCGATGGCCAACATGGTGACCAACATGAAC
GTGGCCAAGTGCAGCAACGCCGCGTCGACGTACCACGAGTTGACCGTGTACGCGCTCAAGC
AGCTGAACCTGCCCAACGTCGCCATGTATCTCGACGCCGGCCACGCCGGCTGGCTCGGCTGG
CCCGCCAACATCCAGCCCGCCGCCGAGCTGTTTGCCGGCATCTACAATGATGCCGGCAAGCC
GGCTGCCGTCCGCGGCCTGGCCACTAACGTCGCCAACTACAACGCCTGGAGCATCGCTTCGG
CCCCGTCGTACACGTCGCCTAACCCTAACTACGACGAGAAGCACTACATCGAGGCCTTCAGC
CCGCTCTTGAACGACGCCGGCTTCCCCGCACGCTTCATTGTCGACACTGGCCGCAACGGCAA
ACAACCTACCGGCCAACAACAGTGGGGTGACTGGTGCAATGTCAAGGGCACCGGCTTTGGC
GTGCGCCCGACGGCCAACACGGGCCACGAGCTGGTCGATGCCTTTGTCTGGGTCAAGCCCGG
CGGCGAGTCCGACGGCACAAGCGACACCAGCGCCGCCCGCTACGACTACCACTGCGGCCTG
TCCGATGCCCTGCAGCCTGCCCCCGAGGCTGGACAGTGGTTCCAGGCCTACTTCGAGCAGCT
GCTCACCAACGCCAACCCGCCCTTCTAA (SEQ ID NO: 143)
MAKKLFITAALAAAVLAAPVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCL
PNSQVTSSTTPSSTSTSQRSTSTSSSTTRSGSSSSSSTTPPPVSSPVTSIPGGATSTASYSGNPFSGVRL
FANDYYRSEVHNLAIPSMTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVPTLSRVRALNKAGA
NPPYAAQLVVYDLPDRDCAAAASNGEFSIANGGAANYRSYIDAIRKHIKEYSDIRIILVIEPDSMA
NMVTNMNVAKCSNAASTYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAG
IYNDAGKPAAVRGLATNVANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNDAGFPARFIVDT
GRNGKQPTGQQQWGDWCNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYD
YHCGLSDALQPAPEAGQWFQAYFEQLLTNANPPF (SEQ ID NO: 144)
APVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCLPNSQVTSSTTPSSTSTSQR
STSTSSSTTRSGSSSSSSTTPPPVSSPVTSIPGGATSTASYSGNPFSGVRLFANDYYRSEVHNLAIPS
MTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVPTLSRVRALNKAGANPPYAAQLVVYDLPDRD
CAAAASNGEFSIANGGAANYRSYIDAIRKHIKEYSDIRIILVIEPDSMANMVTNMNVAKCSNAAS
TYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAGIYNDAGKPAAVRGLATN
VANYNAWSIASAPSYTSPNPNYDEKHYIEAFSPLLNDAGFPARFIVDTGRNGKQPTGQQQWGDW
CNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYDYHCGLSDALQPAPEAGQ
WFQAYFEQLLTNANPPF
[0356] The polynucleotide (SEQ ID NO:145) and amino acid (SEQ ID
NO:146) sequences of a M. thermophila CBH2b variant ("Variant 962")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:146. SEQ ID NO:147 provides the sequence of this CBH2b
variant, without the signal sequence.
TABLE-US-00060 (SEQ ID NO: 145)
ATGGCCAAGAAGCTTTTCATCACCGCCGCGCTTGCGGCTGCCGTGTTGGCGGCCCCCGTCAT
TGAGGAGCGCCAGAACTGCGGCGCTGTGTGGACTCAATGCGGCGGTAACGGGTGGCAAGGT
CCCACATGCTGCGCCTCGGGCTCGACCTGCGTTGCGCAGAACGAGTGGTACTCTCAGTGCCT
GCCCAACAGCCAGGTGACGAGTTCCACCACTCCGTCGTCGACTTCCACCTCGCAGCGCAGCA
CCAGCACCTCCAGCAGCACCACCAGGAGCGGCAGCTCCTCCTCCTCCTCCACCACGCCCACC
CCCGTCTCCAGCCCCGTGACCAGCATTCCCGGCGGTGCGACCTCCACGGCGAGCTACTCTGG
CAACCCCTTCTCGGGCGTCCGGCTCTTCGCCAACGACTACTACAGGTCCGAGGTCATGAATC
TCGCCATTCCTAGCATGACTGGTACTCTGGCGGCCAAGGCTTCCGCCGTCGCCGAAGTCCCT
AGCTTCCAGTGGCTCGACCGGAACGTCACCATCGACACCCTGATGGTCACCACTCTGTCCCA
GGTCCGGGCTCTCAATAAGGCCGGTGCCAATCCTCCCTATGCTGCCCAACTCGTCGTCTACG
ACCTCCCCGACCGTGACTGTGCCGCCGCTGCGTCCAACGGCGAGTTTTCGATTGCAAACGGC
GGCAGCGCCAACTACAGGAGCTACATCGACGCTATCCGCAAGCACATCATTGAGTACTCGG
ACATCCGGATCATCCTGGTTATCGAGCCCGACTCGATGGCCAACATGGTGACCAACATGAAC
GTGGCCAAGTGCAGCAACGCCGCGTCGACGTACCACGAGTTGACCGTGTACGCGCTCAAGC
AGCTGAACCTGCCCAACGTCGCCATGTATCTCGACGCCGGCCACGCCGGCTGGCTCGGCTGG
CCCGCCAACATCCAGCCCGCCGCCGAGCTGTTTGCCGGCATCTACAATGATGCCGGCAAGCC
GGCTGCCGTCCGCGGCCTGGCCACTAACGTCGCCAACTACAACGCCTGGAGCATCGCTTCGG
CCCCGTCGTACACGCAGCCTAACCCTAACTACGACGAGAAGCACTACATCGAGGCCTTCAGC
CCGCTCTTGAACTCGGCCGGCTTCCCCGCACGCTTCATTGTCGACACTGGCCGCAACGGCAA
ACAACCTACCGGCCAACAACAGTGGGGTGACTGGTGCAATGTCAAGGGCACCGGCTTTGGC
GTGCGCCCGACGGCCAACACGGGCCACGAGCTGGTCGATGCCTTTGTCTGGGTCAAGCCCGG
CGGCGAGTCCGACGGCACAAGCGACACCAGCGCCGCCCGCTACGACTACCACTGCGGCCTG
TCCGATGCCCTGCAGCCTGCCCCCGAGGCTGGACAGTGGTTCCAGGCCTACTTCGAGCAGCT
GCTCACCAACGCCAACCCGCCCTTCTAA (SEQ ID NO: 146)
MAKKLFITAALAAAVLAAPVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCL
PNSQVTSSTTPSSTSTSQRSTSTSSSTTRSGSSSSSSTTPTPVSSPVTSIPGGATSTASYSGNPFSGVR
LFANDYYRSEVMNLAIPSMTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVTTLSQVRALNKAG
ANPPYAAQLVVYDLPDRDCAAAASNGEFSIANGGSANYRSYIDAIRKHIIEYSDIRIILVIEPDSMA
NMVTNMNVAKCSNAASTYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAAELFAG
IYNDAGKPAAVRGLATNVANYNAWSIASAPSYTQPNPNYDEKHYIEAFSPLLNSAGFPARFIVDT
GRNGKQPTGQQQWGDWCNVKGTGFGVRPTANTGHELVDAFVWVKPGGESDGTSDTSAARYD
YHCGLSDALQPAPEAGQWFQAYFEQLLTNANPPF (SEQ ID NO: 147)
APVIEERQNCGAVWTQCGGNGWQGPTCCASGSTCVAQNEWYSQCLPNSQVTSSTTPSSTS
TSQRSTSTSSSTTRSGSSSSSSTTPTPVSSPVTSIPGGATSTASYSGNPFSGVRLFANDY
YRSEVMNLAIPSMTGTLAAKASAVAEVPSFQWLDRNVTIDTLMVTTLSQVRALNKAGANP
PYAAQLVVYDLPDRDCAAAASNGEFSIANGGSANYRSYIDAIRKHIIEYSDIRIILVIEP
DSMANMVTNMNVAKCSNAASTYHELTVYALKQLNLPNVAMYLDAGHAGWLGWPANIQPAA
ELFAGIYNDAGKPAAVRGLATNVANYNAWSIASAPSYTQPNPNYDEKHYIEAFSPLLNSA
GFPARFIVDTGRNGKQPTGQQQWGDWCNVKGTGFGVRPTANTGHELVDAFVWVKPGGESD
GTSDTSAARYDYHCGLSDALQPAPEAGQWFQAYFEQLLTNANPPF
[0357] The polynucleotide (SEQ ID NO:148) and amino acid (SEQ ID
NO:149) sequences of another wild-type M. thermophila xylanase
("Xyl3") are provided below. The signal sequence is shown
underlined in SEQ ID NO:149. SEQ ID NO:150 provides the sequence of
this xylanase without the signal sequence.
TABLE-US-00061 (SEQ ID NO: 148)
ATGCACTCCAAAGCTTTCTTGGCAGCGCTTCTTGCGCCTGCCGTCTCAGGGCAACTGAACGA
CCTCGCCGTCAGGGCTGGACTCAAGTACTTTGGTACTGCTCTTAGCGAGAGCGTCATCAACA
GTGATACTCGGTATGCTGCCATCCTCAGCGACAAGAGCATGTTCGGCCAGCTCGTCCCCGAG
AATGGCATGAAGTGGGATGCTACTGAGCCGTCCCGTGGCCAGTTCAACTACGCCTCGGGCGA
CATCACGGCCAACACGGCCAAGAAGAATGGCCAGGGCATGCGTTGCCACACCATGGTCTGG
TACAGCCAGCTCCCGAGCTGGGTCTCCTCGGGCTCGTGGACCAGGGACTCGCTCACCTCGGT
CATCGAGACGCACATGAACAACGTCATGGGCCACTACAAGGGCCAATGCTACGCCTGGGAT
GTCATCAACGAGGCCATCAATGACGACGGCAACTCCTGGCGCGACAACGTCTTTCTCCGGAC
CTTTGGGACCGACTACTTCGCCCTGTCCTTCAACCTAGCCAAGAAGGCCGATCCCGATACCA
AGCTGTACTACAACGACTACAACCTCGAGTACAACCAGGCCAAGACGGACCGCGCTGTTGA
GCTCGTCAAGATGGTCCAGGCCGCCGGCGCGCCCATCGACGGTGTCGGCTTCCAGGGCCACC
TCATTGTCGGCTCGACCCCGACGCGCTCGCAGCTGGCCACCGCCCTCCAGCGCTTCACCGCG
CTCGGCCTCGAGGTCGCCTACACCGAGCTCGACATCCGCCACTCGAGCCTGCCGGCCTCTTC
GTCGGCGCTCGCGACCCAGGGCAACGACTTCGCCAACGTGGTCGGCTCTTGCCTCGACACCG
CCGGCTGCGTCGGCGTCACCGTCTGGGGCTTCACCGATGCGCACTCGTGGATCCCGAACACG
TTCCCCGGCCAGGGCGACGCCCTGATCTACGACAGCAACTACAACAAGAAGCCCGCGTGGA
CCTCGATCTCGTCCGTCCTGGCCGCCAAGGCCACCGGCGCCCCGCCCGCCTCGTCCTCCACC
ACCCTCGTCACCATCACCACCCCTCCGCCGGCATCCACCACCGCCTCCTCCTCCTCCAGTGCC
ACGCCCACGAGCGTCCCGACGCAGACGAGGTGGGGACAGTGCGGCGGCATCGGATGGACGG
GGCCGACCCAGTGCGAGAGCCCATGGACCTGCCAGAAGCTGAACGACTGGTACTGGCAGTGCCTG
(SEQ ID NO: 149)
MHSKAFLAALLAPAVSGQLNDLAVRAGLKYFGTALSESVINSDTRYAAILSDKSMFGQLVPENG
MKWDATEPSRGQFNYASGDITANTAKKNGQGMRCHTMVWYSQLPSWVSSGSWTRDSLTSVIE
THMNNVMGHYKGQCYAWDVINEAINDDGNSWRDNVFLRTFGTDYFALSFNLAKKADPDTKLY
YNDYNLEYNQAKTDRAVELVKMVQAAGAPIDGVGFQGHLIVGSTPTRSQLATALQRFTALGLE
VAYTELDIRHSSLPASSSALATQGNDFANVVGSCLDTAGCVGVTVWGFTDAHSWIPNTFPGQGD
ALIYDSNYNKKPAWTSISSVLAAKATGAPPASSSTTLVTITTPPPASTTASSSSSATPTSVPTQTRW
GQCGGIGWTGPTQCESPWTCQKLNDWYWQCL
(SEQ ID NO: 150)
QLNDLAVRAGLKYFGTALSESVINSDTRYAAILSDKSMFGQLVPENGMKWDATEPSRGQFNYAS
GDITANTAKKNGQGMRCHTMVWYSQLPSWVSSGSWTRDSLTSVIETHMNNVMGHYKGQCYA
WDVINEAINDDGNSWRDNVFLRTFGTDYFALSFNLAKKADPDTKLYYNDYNLEYNQAKTDRA
VELVKMVQAAGAPIDGVGFQGHLIVGSTPTRSQLATALQRFTALGLEVAYTELDIRHSSLPASSS
ALATQGNDFANVVGSCLDTAGCVGVTVWGFTDAHSWIPNTFPGQGDALIYDSNYNKKPAWTSI
SSVLAAKATGAPPASSSTTLVTITTPPPASTTASSSSSATPTSVPTQTRWGQCGGIGWTGPTQCESP
WTCQKLNDWYWQCL
[0358] The polynucleotide (SEQ ID NO:151) and amino acid (SEQ ID
NO:152) sequences of a wild-type M. thermophila xylanase ("Xyl2")
are provided below. The signal sequence is shown underlined in SEQ
ID NO:152. SEQ ID NO:153 provides the sequence of this xylanase
without the signal sequence.
TABLE-US-00062 (SEQ ID NO: 151)
ATGGTCTCGTTCACTCTCCTCCTCACGGTCATCGCCGCTGCGGTGACGACGGCCAGCCCTCTC
GAGGTGGTCAAGCGCGGCATCCAGCCGGGCACGGGCACCCACGAGGGGTACTTCTACTCGT
TCTGGACCGACGGCCGTGGCTCGGTCGACTTCAACCCCGGGCCCCGCGGCTCGTACAGCGTC
ACCTGGAACAACGTCAACAACTGGGTTGGCGGCAAGGGCTGGAACCCGGGCCCGCCGCGCA
AGATTGCGTACAACGGCACCTGGAACAACTACAACGTGAACAGCTACCTCGCCCTGTACGG
CTGGACTCGCAACCCGCTGGTCGAGTATTACATCGTGGAGGCATACGGCACGTACAACCCCT
CGTCGGGCACGGCGCGGCTGGGCACCATCGAGGACGACGGCGGCGTGTACGACATCTACAA
GACGACGCGGTACAACCAGCCGTCCATCGAGGGGACCTCCACCTTCGACCAGTACTGGTCCG
TCCGCCGCCAGAAGCGCGTCGGCGGCACTATCGACACGGGCAAGCACTTTGACGAGTGGAA
GCGCCAGGGCAACCTCCAGCTCGGCACCTGGAACTACATGATCATGGCCACCGAGGGCTAC
CAGAGCTCTGGTTCGGCCACTATCGAGGTCCGGGAGGCC
(SEQ ID NO: 152)
MVSFTLLLTVIAAAVTTASPLEVVKRGIQPGTGTHEGYFYSFWTDGRGSVDFNPGPRGSYSVTW
NNVNNWVGGKGWNPGPPRKIAYNGTWNNYNVNSYLALYGWTRNPLVEYYIVEAYGTYNPSS
GTARLGTIEDDGGVYDIYKTTRYNQPSIEGTSTFDQYWSVRRQKRVGGTIDTGKHFDEWKRQGN
LQLGTWNYMIMATEGYQSSGSATIEVREA
(SEQ ID NO: 153)
MVSFTLLLTVIAAAVTTASPLEVVKRGIQPGTGTHEGYFYSFWTDGRGSVDFNPGPRGSYSVTW
NNVNNWVGGKGWNPGPPRKIAYNGTWNNYNVNSYLALYGWTRNPLVEYYIVEAYGTYNPSS
GTARLGTIEDDGGVYDIYKTTRYNQPSIEGTSTFDQYWSVRRQKRVGGTIDTGKHFDEWKRQGN
LQLGTWNYMIMATEGYQSSGSATIEVREA
[0359] The polynucleotide (SEQ ID NO:154) and amino acid (SEQ ID
NO:155) sequences of another wild-type M. thermophila xylanase
("Xyl1") are provided below. The signal sequence is shown
underlined in SEQ ID NO:155. SEQ ID NO:156 provides the sequence of
this xylanase without the signal sequence.
TABLE-US-00063 (SEQ ID NO: 154)
ATGCGTACTCTTACGTTCGTGCTGGCAGCCGCCCCGGTGGCTGTGCTTGCCCAATCTCCTCTG
TGGGGCCAGTGCGGCGGTCAAGGCTGGACAGGTCCCACGACCTGCGTTTCTGGCGCAGTATG
CCAATTCGTCAATGACTGGTACTCCCAATGCGTGCCCGGATCGAGCAACCCTCCTACGGGCA
CCACCAGCAGCACCACTGGAAGCACCCCGGCTCCTACTGGCGGCGGCGGCAGCGGAACCGG
CCTCCACGACAAATTCAAGGCCAAGGGCAAGCTCTACTTCGGAACCGAGATCGATCACTACC
ATCTCAACAACAATGCCTTGACCAACATTGTCAAGAAAGACTTTGGTCAAGTCACTCACGAG
AACAGCTTGAAGTGGGATGCTACTGAGCCGAGCCGCAATCAATTCAACTTTGCCAACGCCGA
CGCGGTTGTCAACTTTGCCCAGGCCAACGGCAAGCTCATCCGCGGCCACACCCTCCTCTGGC
ACTCTCAGCTGCCGCAGTGGGTGCAGAACATCAACGACCGCAACACCTTGACCCAGGTCATC
GAGAACCACGTCACCACCCTTGTCACTCGCTACAAGGGCAAGATCCTCCACTGGGACGTCGT
TAACGAGATCTTTGCCGAGGACGGCTCGCTCCGCGACAGCGTCTTCAGCCGCGTCCTCGGCG
AGGACTTTGTCGGCATCGCCTTCCGCGCCGCCCGCGCCGCCGATCCCAACGCCAAGCTCTAC
ATCAACGACTACAACCTCGACATTGCCAACTACGCCAAGGTGACCCGGGGCATGGTCGAGA
AGGTCAACAAGTGGATCGCCCAGGGCATCCCGATCGACGGCATCGGCACCCAGTGCCACCT
GGCCGGGCCCGGCGGGTGGAACACGGCCGCCGGCGTCCCCGACGCCCTCAAGGCCCTCGCC
GCGGCCAACGTCAAGGAGATCGCCATCACCGAGCTCGACATCGCCGGCGCCTCCGCCAACG
ACTACCTCACCGTCATGAACGCCTGCCTCCAGGTCTCCAAGTGCGTCGGCATCACCGTCTGG
GGCGTCTCTGACAAGGACAGCTGGAGGTCGAGCAGCAACCCGCTCCTCTTCGACAGCAACT
ACCAGCCAAAGGCGGCATACAATGCTCTGATTAATGCCTTGTAA
(SEQ ID NO: 155)
MRTLTFVLAAAPVAVLAQSPLWGQCGGQGWTGPTTCVSGAVCQFVNDWYSQCVPGSSNPPTG
TTSSTTGSTPAPTGGGGSGTGLHDKFKAKGKLYFGTEIDHYHLNNNALTNIVKKDFGQVTHENS
LKWDATEPSRNQFNFANADAVVNFAQANGKLIRGHTLLWHSQLPQWVQNINDRNTLTQVIENH
VTTLVTRYKGKILHWDVVNEIFAEDGSLRDSVFSRVLGEDFVGIAFRAARAADPNAKLYINDYN
LDIANYAKVTRGMVEKVNKWIAQGIPIDGIGTQCHLAGPGGWNTAAGVPDALKALAAANVKEI
AITELDIAGASANDYLTVMNACLQVSKCVGITVWGVSDKDSWRSSSNPLLFDSNYQPKAAYNA
LINAL
(SEQ ID NO: 156)
QSPLWGQCGGQGWTGPTTCVSGAVCQFVNDWYSQCVPGSSNPPTGTTSSTTGSTPAPTGGGGS
GTGLHDKFKAKGKLYFGTEIDHYHLNNNALTNIVKKDFGQVTHENSLKWDATEPSRNQFNFAN
ADAVVNFAQANGKLIRGHTLLWHSQLPQWVQNINDRNTLTQVIENHVTTLVTRYKGKILHWDV
VNEIFAEDGSLRDSVFSRVLGEDFVGIAFRAARAADPNAKLYINDYNLDIANYAKVTRGMVEKV
NKWIAQGIPIDGIGTQCHLAGPGGWNTAAGVPDALKALAAANVKEIAITELDIAGASANDYLTV
MNACLQVSKCVGITVWGVSDKDSWRSSSNPLLFDSNYQPKAAYNALINAL
[0360] The polynucleotide (SEQ ID NO:157) and amino acid (SEQ ID
NO:158) sequences of another wild-type M. thermophila xylanase
("Xyl6") are provided below. The signal sequence is shown
underlined in SEQ ID NO:158. SEQ ID NO:159 provides the sequence of
this xylanase without the signal sequence.
TABLE-US-00064 (SEQ ID NO: 157)
ATGGTCTCGCTCAAGTCCCTCCTCCTCGCCGCGGCGGCGACGTTGACGGCGGTGACGGCGCG
CCCGTTCGACTTTGACGACGGCAACTCGACCGAGGCGCTGGCCAAGCGCCAGGTCACGCCC
AACGCGCAGGGCTACCACTCGGGCTACTTCTACTCGTGGTGGTCCGACGGCGGCGGCCAGGC
CACCTTCACCCTGCTCGAGGGCAGCCACTACCAGGTCAACTGGAGGAACACGGGCAACTTTG
TCGGTGGCAAGGGCTGGAACCCGGGTACCGGCCGGACCATCAACTACGGCGGCTCGTTCAA
CCCGAGCGGCAACGGCTACCTGGCCGTCTACGGCTGGACGCACAACCCGCTGATCGAGTACT
ACGTGGTCGAGTCGTACGGGACCTACAACCCGGGCAGCCAGGCCCAGTACAAGGGCAGCTT
CCAGAGCGACGGCGGCACCTACAACATCTACGTCTCGACCCGCTACAACGCGCCCTCGATCG
AGGGCACCCGCACCTTCCAGCAGTACTGGTCCATCCGCACCTCCAAGCGCGTCGGCGGCTCC
GTCACCATGCAGAACCACTTCAACGCCTGGGCCCAGCACGGCATGCCCCTCGGCTCCCACGA
CTACCAGATCGTCGCCACCGAGGGCTACCAGAGCAGCGGCTCCTCCGACATCTACGTCCAGA
CTCACTAG
(SEQ ID NO: 158)
MVSLKSLLLAAAATLTAVTARPFDFDDGNSTEALAKRQVTPNAQGYHSGYFYSWWSDGGGQA
TFTLLEGSHYQVNWRNTGNFVGGKGWNPGTGRTINYGGSFNPSGNGYLAVYGWTHNPLIEYYV
VESYGTYNPGSQAQYKGSFQSDGGTYNIYVSTRYNAPSIEGTRTFQQYWSIRTSKRVGGSVTMQ
NHFNAWAQHGMPLGSHDYQIVATEGYQSSGSSDIYVQTH
(SEQ ID NO: 159)
RPFDFDDGNSTEALAKRQVTPNAQGYHSGYFYSWWSDGGGQATFTLLEGSHYQVNWRNTGNF
VGGKGWNPGTGRTINYGGSFNPSGNGYLAVYGWTHNPLIEYYVVESYGTYNPGSQAQYKGSFQ
SDGGTYNIYVSTRYNAPSIEGTRTFQQYWSIRTSKRVGGSVTMQNHFNAWAQHGMPLGSHDYQI
VATEGYQSSGSSDIYVQTH
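As a quick consistency check on the paired listings above, a polynucleotide can be translated with the standard genetic code and compared with the corresponding amino acid sequence. The sketch below is illustrative only: the `translate` helper and the choice of the first six and last four codons of SEQ ID NO:157 as inputs are assumptions made for this example and are not part of the disclosure.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence; '*' marks a stop codon."""
    # Remove line breaks and spaces left over from the printed listing.
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First six codons and last four codons of SEQ ID NO:157.
n_terminus = translate("ATGGTCTCGCTCAAGTCC")
c_terminus = translate("CAGACTCACTAG")
print(n_terminus, c_terminus)
```

The two fragments translate to the start (MVSLKS) and the end (QTH, followed by a stop codon) of SEQ ID NO:158, so the same helper could be applied to a full listing to flag transcription errors.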
[0361] The polynucleotide (SEQ ID NO:160) and amino acid (SEQ ID
NO:161) sequences of another wild-type M. thermophila xylanase
("Xyl5") are provided below. The signal sequence is shown
underlined in SEQ ID NO:161. SEQ ID NO:162 provides the sequence of
this xylanase without the signal sequence.
TABLE-US-00065 (SEQ ID NO: 160)
ATGGTTACCCTCACTCGCCTGGCGGTCGCCGCGGCGGCCATGATCTCCAGCACTGGCCTGGC
TGCCCCGACGCCCGAAGCTGGCCCCGACCTTCCCGACTTTGAGCTCGGGGTCAACAACCTCG
CCCGCCGCGCGCTGGACTACAACCAGAACTACAGGACCAGCGGCAACGTCAACTACTCGCC
CACCGACAACGGCTACTCGGTCAGCTTCTCCAACGCGGGAGATTTTGTCGTCGGGAAGGGCT
GGAGGACGGGAGCCACCAGAAACATCACCTTCTCGGGATCGACACAGCATACCTCGGGCAC
CGTGCTCGTCTCCGTCTACGGCTGGACCCGGAACCCGCTGATCGAGTACTACGTGCAGGAGT
ACACGTCCAACGGGGCCGGCTCCGCTCAGGGCGAGAAGCTGGGCACGGTCGAGAGCGACGG
GGGCACGTACGAGATCTGGCGGCACCAGCAGGTCAACCAGCCGTCGATCGAGGGCACCTCG
ACCTTCTGGCAGTACATCTCGAACCGCGTGTCCGGCCAGCGGCCCAACGGCGGCACCGTCAC
CCTCGCCAACCACTTCGCCGCCTGGCAGAAGCTCGGCCTGAACCTGGGCCAGCACGACTACC
AGGTCCTGGCCACCGAGGGCTGGGGCAACGCCGGCGGCAGCTCCCAGTACACCGTCAGCGGCTGA
(SEQ ID NO: 161)
MVTLTRLAVAAAAMISSTGLAAPTPEAGPDLPDFELGVNNLARRALDYNQNYRTSGNVNYSPT
DNGYSVSFSNAGDFVVGKGWRTGATRNITFSGSTQHTSGTVLVSVYGWTRNPLIEYYVQEYTSN
GAGSAQGEKLGTVESDGGTYEIWRHQQVNQPSIEGTSTFWQYISNRVSGQRPNGGTVTLANHFA
AWQKLGLNLGQHDYQVLATEGWGNAGGSSQYTVSG
(SEQ ID NO: 162)
APTPEAGPDLPDFELGVNNLARRALDYNQNYRTSGNVNYSPTDNGYSVSFSNAGDFVVGKGWR
TGATRNITFSGSTQHTSGTVLVSVYGWTRNPLIEYYVQEYTSNGAGSAQGEKLGTVESDGGTYEI
WRHQQVNQPSIEGTSTFWQYISNRVSGQRPNGGTVTLANHFAAWQKLGLNLGQHDYQVLATE
GWGNAGGSSQYTVSG
[0362] The polynucleotide (SEQ ID NO:163) and amino acid (SEQ ID
NO:164) sequences of a wild-type M. thermophila beta-xylosidase are
provided below. The signal sequence is shown underlined in SEQ ID
NO:164. SEQ ID NO:165 provides the sequence of this beta-xylosidase
without the signal sequence.
TABLE-US-00066 (SEQ ID NO: 163)
ATGTTCTTCGCTTCTCTGCTGCTCGGTCTCCTGGCGGGCGTGTCCGCTTCACCGGGACACGGG
CGGAATTCCACCTTCTACAACCCCATCTTCCCCGGCTTCTACCCCGATCCGAGCTGCATCTAC
GTGCCCGAGCGTGACCACACCTTCTTCTGTGCCTCGTCGAGCTTCAACGCCTTCCCGGGCATC
CCGATTCATGCCAGCAAGGACCTGCAGAACTGGAAGTTGATCGGCCATGTGCTGAATCGCA
AGGAACAGCTTCCCCGGCTCGCTGAGACCAACCGGTCGACCAGCGGCATCTGGGCACCCAC
CCTCCGGTTCCATGACGACACCTTCTGGTTGGTCACCACACTAGTGGACGACGACCGGCCGC
AGGAGGACGCTTCCAGATGGGACAATATTATCTTCAAGGCAAAGAATCCGTATGATCCGAG
GTCCTGGTCCAAGGCCGTCCACTTCAACTTCACTGGCTACGACACGGAGCCTTTCTGGGACG
AAGATGGAAAGGTGTACATCACCGGCGCCCATGCTTGGCATGTTGGCCCATACATCCAGCAG
GCCGAAGTCGATCTCGACACGGGGGCCGTCGGCGAGTGGCGCATCATCTGGAACGGAACGG
GCGGCATGGCTCCTGAAGGGCCGCACATCTACCGCAAAGATGGGTGGTACTACTTGCTGGCT
GCTGAAGGGGGGACCGGCATCGACCATATGGTGACCATGGCCCGGTCGAGAAAAATCTCCA
GTCCTTACGAGTCCAACCCAAACAACCCCGTGTTGACCAACGCCAACACGACCAGTTACTTT
CAAACCGTCGGGCATTCAGACCTGTTCCATGACAGACATGGGAACTGGTGGGCAGTCGCCCT
CTCCACCCGCTCCGGTCCAGAATATCTTCACTACCCCATGGGCCGCGAGACCGTCATGACAG
CCGTGAGCTGGCCGAAGGACGAGTGGCCAACCTTCACCCCCATATCTGGCAAGATGAGCGG
CTGGCCGATGCCTCCTTCGCAGAAGGACATTCGCGGAGTCGGCCCCTACGTCAACTCCCCCG
ACCCGGAACACCTGACCTTCCCCCGCTCGGCGCCCCTGCCGGCCCACCTCACCTACTGGCGA
TACCCGAACCCGTCCTCCTACACGCCGTCCCCGCCCGGGCACCCCAACACCCTCCGCCTGAC
CCCGTCCCGCCTGAACCTGACCGCCCTCAACGGCAACTACGCGGGGGCCGACCAGACCTTCG
TCTCGCGCCGGCAGCAGCACACCCTCTTCACCTACAGCGTCACGCTCGACTACGCGCCGCGG
ACCGCCGGGGAGGAGGCCGGCGTGACCGCCTTCCTGACGCAGAACCACCACCTCGACCTGG
GCGTCGTCCTGCTCCCTCGCGGCTCCGCCACCGCGCCCTCGCTGCCGGGCCTGAGTAGTAGT
ACAACTACTACTAGTAGTAGTAGTAGTCGTCCGGACGAGGAGGAGGAGCGCGAGGCGGGCG
AAGAGGAAGAAGAGGGCGGACAAGACTTGATGATCCCGCATGTGCGGTTCAGGGGCGAGTC
GTACGTGCCCGTCCCGGCGCCCGTCGTGTACCCGATACCCCGGGCCTGGAGAGGCGGGAAG
CTTGTGTTAGAGATCCGGGCTTGTAATTCGACTCACTTCTCGTTCCGTGTCGGGCCGGACGGG
AGACGGTCTGAGCGGACGGTGGTCATGGAGGCTTCGAACGAGGCCGTTAGCTGGGGCTTTA
CTGGAACGCTGCTGGGCATCTATGCGACCAGTAATGGTGGCAACGGAACCACGCCGGCGTA
TTTTTCGGATTGGAGGTACACACCATTGGAGCAGTTTAGGGAT
(SEQ ID NO: 164)
MFFASLLLGLLAGVSASPGHGRNSTFYNPIFPGFYPDPSCIYVPERDHTFFCASSSFNAFPGIPIHAS
KDLQNWKLIGHVLNRKEQLPRLAETNRSTSGIWAPTLRFHDDTFWLVTTLVDDDRPQEDASRW
DNIIFKAKNPYDPRSWSKAVHFNFTGYDTEPFWDEDGKVYITGAHAWHVGPYIQQAEVDLDTG
AVGEWRIIWNGTGGMAPEGPHIYRKDGWYYLLAAEGGTGIDHMVTMARSRKISSPYESNPNNP
VLTNANTTSYFQTVGHSDLFHDRHGNWWAVALSTRSGPEYLHYPMGRETVMTAVSWPKDEWP
TFTPISGKMSGWPMPPSQKDIRGVGPYVNSPDPEHLTFPRSAPLPAHLTYWRYPNPSSYTPSPPGH
PNTLRLTPSRLNLTALNGNYAGADQTFVSRRQQHTLFTYSVTLDYAPRTAGEEAGVTAFLTQNH
HLDLGVVLLPRGSATAPSLPGLSSSTTTTSSSSSRPDEEEEREAGEEEEEGGQDLMIPHVRFRGESY
VPVPAPVVYPIPRAWRGGKLVLEIRACNSTHFSFRVGPDGRRSERTVVMEASNEAVSWGFTGTL
LGIYATSNGGNGTTPAYFSDWRYTPLEQFRD
(SEQ ID NO: 165)
SPGHGRNSTFYNPIFPGFYPDPSCIYVPERDHTFFCASSSFNAFPGIPIHASKDLQNWKLIGHVLNR
KEQLPRLAETNRSTSGIWAPTLRFHDDTFWLVTTLVDDDRPQEDASRWDNIIFKAKNPYDPRSW
SKAVHFNFTGYDTEPFWDEDGKVYITGAHAWHVGPYIQQAEVDLDTGAVGEWRIIWNGTGGM
APEGPHIYRKDGWYYLLAAEGGTGIDHMVTMARSRKISSPYESNPNNPVLTNANTTSYFQTVGH
SDLFHDRHGNWWAVALSTRSGPEYLHYPMGRETVMTAVSWPKDEWPTFTPISGKMSGWPMPP
SQKDIRGVGPYVNSPDPEHLTFPRSAPLPAHLTYWRYPNPSSYTPSPPGHPNTLRLTPSRLNLTAL
NGNYAGADQTFVSRRQQHTLFTYSVTLDYAPRTAGEEAGVTAFLTQNHHLDLGVVLLPRGSAT
APSLPGLSSSTTTTSSSSSRPDEEEEREAGEEEEEGGQDLMIPHVRFRGESYVPVPAPVVYPIPRAW
RGGKLVLEIRACNSTHFSFRVGPDGRRSERTVVMEASNEAVSWGFTGTLLGIYATSNGGNGTTP
AYFSDWRYTPLEQFRD
[0363] The polynucleotide (SEQ ID NO:166) and amino acid (SEQ ID
NO:167) sequences of a wild-type M. thermophila acetylxylan
esterase ("Axe3") are provided below. The signal sequence is shown
underlined in SEQ ID NO:167. SEQ ID NO:168 provides the sequence of
this acetylxylan esterase without the signal sequence.
TABLE-US-00067 (SEQ ID NO: 166)
ATGAAGCTCCTGGGCAAACTCTCGGCGGCACTCGCCCTCGCGGGCAGCAGGCTGGCTGCCGC
GCACCCGGTCTTCGACGAGCTGATGCGGCCGACGGCGCCGCTGGTGCGCCCGCGGGCGGCC
CTGCAGCAGGTGACCAACTTTGGCAGCAACCCGTCCAACACGAAGATGTTCATCTACGTGCC
CGACAAGCTGGCCCCCAACCCGCCCATCATAGTGGCCATCCACTACTGCACCGGCACCGCCC
AGGCCTACTACTCGGGCTCCCCTTACGCCCGCCTCGCCGACCAGAAGGGCTTCATCGTCATC
TACCCGGAGTCCCCCTACAGCGGCACCTGTTGGGACGTCTCGTCGCGCGCCGCCCTGACCCA
CAACGGCGGCGGCGACAGCAACTCGATCGCCAACATGGTCACCTACACCCTCGAAAAGTAC
AATGGCGACGCCAGCAAGGTCTTTGTCACCGGCTCCTCGTCCGGCGCCATGATGACGAACGT
GATGGCCGCCGCGTACCCGGAACTGTTCGCGGCAGGAATCGCCTACTCGGGCGTGCCCGCCG
GCTGCTTCTACAGCCAGTCCGGAGGCACCAACGCGTGGAACAGCTCGTGCGCCAACGGGCA
GATCAACTCGACGCCCCAGGTGTGGGCCAAGATGGTCTTCGACATGTACCCGGAATACGAC
GGCCCGCGCCCCAAGATGCAGATCTACCACGGCTCGGCCGACGGCACGCTCAGACCCAGCA
ACTACAACGAGACCATCAAGCAGTGGTGCGGCGTCTTCGGCTTCGACTACACCCGCCCCGAC
ACCACCCAGGCCAACTCCCCGCAGGCCGGCTACACCACCTACACCTGGGGCGAGCAGCAGC
TCGTCGGCATCTACGCCCAGGGCGTCGGACACACGGTCCCCATCCGCGGCAGCGACGACAT
GGCCTTCTTTGGCCTGTGA
(SEQ ID NO: 167)
MKLLGKLSAALALAGSRLAAAHPVFDELMRPTAPLVRPRAALQQVTNFGSNPSNTKMFIYVPDK
LAPNPPIIVAIHYCTGTAQAYYSGSPYARLADQKGFIVIYPESPYSGTCWDVSSRAALTHNGGGDS
NSIANMVTYTLEKYNGDASKVFVTGSSSGAMMTNVMAAAYPELFAAGIAYSGVPAGCFYSQSG
GTNAWNSSCANGQINSTPQVWAKMVFDMYPEYDGPRPKMQIYHGSADGTLRPSNYNETIKQW
CGVFGFDYTRPDTTQANSPQAGYTTYTWGEQQLVGIYAQGVGHTVPIRGSDDMAFFGL
(SEQ ID NO: 168)
HPVFDELMRPTAPLVRPRAALQQVTNFGSNPSNTKMFIYVPDKLAPNPPIIVAIHYCTGTAQAYY
SGSPYARLADQKGFIVIYPESPYSGTCWDVSSRAALTHNGGGDSNSIANMVTYTLEKYNGDASK
VFVTGSSSGAMMTNVMAAAYPELFAAGIAYSGVPAGCFYSQSGGTNAWNSSCANGQINSTPQV
WAKMVFDMYPEYDGPRPKMQIYHGSADGTLRPSNYNETIKQWCGVFGFDYTRPDTTQANSPQ
AGYTTYTWGEQQLVGIYAQGVGHTVPIRGSDDMAFFGL
[0364] The polynucleotide (SEQ ID NO:169) and amino acid (SEQ ID
NO:170) sequences of a wild-type M. thermophila ferulic acid
esterase ("FAE") are provided below. The signal sequence is shown
underlined in SEQ ID NO:170. SEQ ID NO:171 provides the sequence of
this ferulic acid esterase without the signal sequence.
TABLE-US-00068 (SEQ ID NO: 169)
ATGATCTCGGTTCCTGCTCTCGCTCTGGCCCTTCTGGCCGCCGTCCAGGTCGTCGAGTCTGCC
TCGGCTGGCTGTGGCAAGGCGCCCCCTTCCTCGGGCACCAAGTCGATGACGGTCAACGGCAA
GCAGCGCCAGTACATTCTCCAGCTGCCCAACAACTACGACGCCAACAAGGCCCACAGGGTG
GTGATCGGGTACCACTGGCGCGACGGATCCATGAACGACGTGGCCAACGGCGGCTTCTACG
ATCTGCGGTCCCGGGCGGGCGACAGCACCATCTTCGTTGCCCCCAACGGCCTCAATGCCGGA
TGGGCCAACGTGGGCGGCGAGGACATCACCTTTACGGACCAGATCGTAGACATGCTCAAGA
ACGACCTCTGCGTGGACGAGACCCAGTTCTTTGCTACGGGCTGGAGCTATGGCGGTGCCATG
AGCCATAGCGTGGCTTGTTCTCGGCCAGACGTCTTCAAGGCCGTCGCGGTCATCGCCGGGGC
CCAGCTGTCCGGCTGCGCCGGCGGCACGACGCCCGTGGCGTACCTAGGCATCCACGGAGCC
GCCGACAACGTCCTGCCCATCGACCTCGGCCGCCAGCTGCGCGACAAGTGGCTGCAGACCA
ACGGCTGCAACTACCAGGGCGCCCAGGACCCCGCGCCGGGCCAGCAGGCCCACATCAAGAC
CACCTACAGCTGCTCCCGCGCGCCCGTCACCTGGATCGGCCACGGGGGCGGCCACGTCCCCG
ACCCCACGGGCAACAACGGCGTCAAGTTTGCGCCCCAGGAGACCTGGGACTTCTTTGATGCC
GCCGTCGGAGCGGCCGGCGCGCAGAGCCCGATGACATAA
(SEQ ID NO: 170)
MISVPALALALLAAVQVVESASAGCGKAPPSSGTKSMTVNGKQRQYILQLPNNYDANKAHRVV
IGYHWRDGSMNDVANGGFYDLRSRAGDSTIFVAPNGLNAGWANVGGEDITFTDQIVDMLKNDL
CVDETQFFATGWSYGGAMSHSVACSRPDVFKAVAVIAGAQLSGCAGGTTPVAYLGIHGAADNV
LPIDLGRQLRDKWLQTNGCNYQGAQDPAPGQQAHIKTTYSCSRAPVTWIGHGGGHVPDPTGNN
GVKFAPQETWDFFDAAVGAAGAQSPMT
(SEQ ID NO: 171)
ASAGCGKAPPSSGTKSMTVNGKQRQYILQLPNNYDANKAHRVVIGYHWRDGSMNDVANGGFY
DLRSRAGDSTIFVAPNGLNAGWANVGGEDITFTDQIVDMLKNDLCVDETQFFATGWSYGGAMS
HSVACSRPDVFKAVAVIAGAQLSGCAGGTTPVAYLGIHGAADNVLPIDLGRQLRDKWLQTNGC
NYQGAQDPAPGQQAHIKTTYSCSRAPVTWIGHGGGHVPDPTGNNGVKFAPQETWDFFDAAVG
AAGAQSPMT
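Because the underlining that marks each signal sequence is lost in plain text, the signal sequence can instead be recovered by comparing a full-length sequence with its counterpart lacking the signal sequence. The following is a minimal sketch under stated assumptions: the helper name `split_signal_peptide` is hypothetical, and only the N-terminal fragments of SEQ ID NO:170 and SEQ ID NO:171 are used as inputs to keep the example short.

```python
def split_signal_peptide(precursor: str, mature: str) -> tuple[str, str]:
    """Return (signal, mature) given a precursor that ends with `mature`."""
    if not precursor.endswith(mature):
        raise ValueError("mature sequence is not a C-terminal part of precursor")
    cut = len(precursor) - len(mature)
    return precursor[:cut], precursor[cut:]


# N-terminal fragments only (illustrative): start of SEQ ID NO:170
# (with signal sequence) and start of SEQ ID NO:171 (without it).
fae_precursor_fragment = "MISVPALALALLAAVQVVESASAGCGKAPPSSGTK"
fae_mature_fragment = "ASAGCGKAPPSSGTK"

signal, mature = split_signal_peptide(fae_precursor_fragment, fae_mature_fragment)
print(signal, len(signal))
```

For these fragments the helper reports a 20-residue signal sequence (MISVPALALALLAAVQVVES). Run on complete listings, the same comparison would raise an error wherever the sequence said to lack the signal sequence is not a C-terminal portion of the full-length sequence.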
Example 1
Gene Acquisition and Construction of Expression Vectors
[0365] A protein from a strain of M. thermophila having the amino
acid sequence provided in SEQ ID NO:2 was previously identified as
having GH61 activity. It was designated "GH61a". FIG. 1 shows the
improvement in glucose yield resulting from having GH61a present in
a reaction where crystalline cellulose undergoes saccharification
by cellulase enzymes that are contained in culture broth from M.
thermophila cells.
[0366] In this Example, the wild type GH61a gene from M.
thermophila was isolated from the genome and the DNA sequence
verified. The gene was cloned into a Saccharomyces cerevisiae/M.
thermophila shuttle vector pYTDX60 at Pml1 cloning sites by
standard methods known in the art. The signal peptide and gene were
under the control of a yeast transcription elongation factor 1
promoter (pTEF1). The vector contained the REP2, rep1 and protein D
(partial) origin of replication for S. cerevisiae and a URA3
selectable marker.
[0367] The resulting plasmid (pYTDX60-GH61a) was transformed into
S. cerevisiae INVSC1 strain and the transformed host cells were
grown in Costar 96 deep well plates for GH61a protein production.
The GH61a sequence in the transformants was verified to be the wild
type GH61a DNA sequence (SEQ ID NO:1), which encodes the polypeptide
of SEQ ID NO:2.
Example 2
Shake Flask Procedure
[0368] A single colony of S. cerevisiae containing a plasmid with
the GH61a gene was inoculated into 3 mL synthetic defined-uracil
(SD-ura) broth (2 g/L synthetic drop-out minus uracil without yeast
nitrogen base (US Biological), 5 g/L ammonium sulfate, 0.1 g/L
calcium chloride, 2 mg/L inositol, 0.5 g/L magnesium sulfate, 1 g/L
potassium phosphate monobasic (KH₂PO₄), 0.1 g/L sodium chloride)
containing 6% glucose. Cells were grown overnight (at least 21 hrs)
in an incubator at 30°C with shaking at 250 rpm. Then, 500 µL of
the overnight culture was diluted into either 50 mL SD-ura
medium or modified galactose expression medium (30 g/L galactose,
6.7 g/L yeast nitrogen base without amino acids, 5 g/L ammonium
sulfate, 24 g/L amino acid mix minus uracil, 10 g/L potassium
phosphate monobasic (KH₂PO₄) and 0.38% vitamin mix)
containing 2% glucose in a 250 mL baffled sterile shake flask and
incubated at 37°C (for SD-ura medium) or 30°C
(for modified galactose expression medium) for 48 hours. Cells were
pelleted by centrifugation (4000 rpm, 15 min, 4°C). The
clear media supernatant containing the secreted GH61a enzyme was
collected and stored at 4°C until used.
Example 3
GH61 Activity Assays
[0369] In some experiments, GH61 activity was determined using a
biomass assay. The substrate was wheat straw that had been
pretreated under acidic conditions (hereinafter referred to as
"pretreated wheat straw"). The reaction was carried out in a total
volume of 77 µL in the presence of 10 mg of pretreated wheat
straw, with 62 µL of 1x-20x concentrated clear media supernatant
("broth") containing S. cerevisiae-produced M. thermophila GH61a
enzyme and 15 µL of sodium acetate buffer (pH 5.0) mixed with M.
thermophila-produced cellobiohydrolase 1a (CBH1a),
cellobiohydrolase 2b (CBH2b) and beta-glucosidase. The final
concentration of sodium acetate was 150 mM and the enzyme loads of
CBHs and beta-glucosidase were approximately 0.0025% to 0.0125%
(CBH1a and CBH2b in 1:1 ratio) and 0.01 to 0.02%, respectively,
with respect to substrate glucan mass in the biomass substrate.
[0370] Some experiments were also performed in the presence of
inhibitors that may arise through the routine preparation or
pre-treatment of a cellulose substrate. In this way, GH61 protein
variants can be identified that are more resistant to the presence
of such inhibitors, and therefore find use with a wider range of
feedstocks and have wider applicability in the processing of
biomass from different sources.
[0371] In some experiments, the pretreatment filtrate was obtained
by washing pretreated substrate solids with water. The GH61
activity assay was carried out with 50 µL of GH61a-containing
supernatant, 12 µL of pretreatment filtrate, and 15 µL of
sodium acetate buffer mixed with CBH1a, CBH2b and beta-glucosidase
isolated from M. thermophila. Background negative controls were
obtained by using media supernatant from cultures of cells without
the GH61a gene in the plasmid. Thus, the negative controls
represent activities of CBH1a, CBH2b and beta-glucosidase in the
absence of GH61a. The reaction was incubated at 50 to 60°C
for 24 to 72 hours with shaking, and then quenched by adding 130
µL H₂O at room temperature.
[0372] Some experiments were carried out in a total volume of 360
µL in the presence of 10 mg of pretreated wheat straw and 40
µL filtrate (11% total volume), with 262 µL of clear media
supernatant containing S. cerevisiae-produced M. thermophila GH61a
enzyme and 48 µL of sodium acetate buffer (pH 5; supplemented
with CuSO₄) mixed with M. thermophila-produced CBH1a, CBH2b
and beta-glucosidase. The final concentrations of sodium acetate
and CuSO₄ were 128 mM and 15 µM, respectively, and the
enzyme loads of CBHs and beta-glucosidase were 0.01% (CBH1a and
CBH2b in 1:1 ratio) and 0.02%, respectively, with respect to
substrate glucan mass in the biomass substrate. Background negative
controls were obtained by using media supernatant from cultures of
S. cerevisiae cells without the GH61a gene in the plasmid. Thus,
the negative controls represent glucose production by CBH1a, CBH2b
and beta-glucosidase in the absence of GH61a. The reaction was
incubated at 55°C for 72 hours with shaking.
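The stated volumes and final concentrations imply the concentration of the buffer stock through the usual dilution relation C1·V1 = C2·V2. The sketch below works this out for the 360 µL and 77 µL reaction formats. The stock concentrations it returns are derived values, not figures given in this document, and they rest on two assumptions: that the added buffer is the only source of sodium acetate and CuSO₄, and that the stated total volume is the final volume. The function name is hypothetical.

```python
def implied_stock_concentration(final_conc: float,
                                total_volume_ul: float,
                                added_volume_ul: float) -> float:
    """Solve C1 * V1 = C2 * V2 for C1 (returned in the units of final_conc)."""
    if added_volume_ul <= 0 or total_volume_ul < added_volume_ul:
        raise ValueError("volumes must satisfy 0 < added <= total")
    return final_conc * total_volume_ul / added_volume_ul


# 360 uL format: 48 uL of buffer, 128 mM acetate and 15 uM CuSO4 final.
acetate_stock_mM = implied_stock_concentration(128, 360, 48)
copper_stock_uM = implied_stock_concentration(15, 360, 48)

# Filtrate share of the 360 uL reaction (stated in the text as 11%).
filtrate_percent = 100 * 40 / 360

# 77 uL format: 15 uL of buffer, 150 mM acetate final.
acetate_stock_77_mM = implied_stock_concentration(150, 77, 15)

print(acetate_stock_mM, copper_stock_uM,
      round(filtrate_percent, 1), acetate_stock_77_mM)
```

Under these assumptions the buffer would be roughly 960 mM acetate with 112.5 µM CuSO₄ for the 360 µL format and 770 mM acetate for the 77 µL format, and the filtrate share works out to about 11%, in line with the stated figure.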
[0373] The GH61 activity in the reaction mixture was measured by
monitoring glucose production, as determined using an enzymatic
glucose assay kit (K-GLUC, Megazyme). In a total volume of 200
µL, 20 µL of GH61a reaction mixture was added to 180 µL of
2x concentrated glucose determination reagent (GOPOD Reagent™,
supplied as part of the K-GLUC assay kit). The reaction was
incubated at room temperature for 30 minutes and the absorbance of
the solution was measured at 510 nm. The glucose oxidase enzyme in
the GOPOD reagent reacts with glucose and produces hydrogen
peroxide, which then reacts with the 4-aminoantipyrine in the
reagent to produce a quinoneimine dye. The amount of quinoneimine
dye was measured spectrophotometrically at 510 nm to calculate the
total amount of D-glucose in the reaction mixture. The total amount
of glucose in the reaction mixture was also measured using an
AGILENT® HPLC 1200 equipped with an AMINEX™ HPX-87H ion
exclusion column (300 mm × 7.8 mm; Bio-Rad) with 5 mM sulfuric
acid in water as eluent at a flow rate of 0.6 mL/min at 65°C.
The retention time of glucose was 9.5 minutes.
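The conversion from absorbance at 510 nm to a glucose amount is not spelled out here. One common approach for colorimetric assays of this kind is a single-point calibration against a glucose standard and a reagent blank run alongside the samples. The sketch below assumes that approach; the function name, the 100 µg standard, and all absorbance readings are hypothetical values chosen for illustration only.

```python
def glucose_from_absorbance(a_sample: float,
                            a_blank: float,
                            a_standard: float,
                            standard_ug: float) -> float:
    """Single-point calibration: scale blank-corrected absorbance
    by the blank-corrected absorbance of a known glucose standard."""
    delta_standard = a_standard - a_blank
    if delta_standard <= 0:
        raise ValueError("standard must absorb more than the blank")
    return (a_sample - a_blank) / delta_standard * standard_ug


# Hypothetical readings at 510 nm (not data from this document).
glucose_ug = glucose_from_absorbance(
    a_sample=0.55, a_blank=0.05, a_standard=1.05, standard_ug=100.0
)
print(round(glucose_ug, 1))
```

A multi-point standard curve would be the more robust choice if the sample absorbances span a wide range; the single-point form is shown only because it is the shortest correct instance of the calculation.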
[0374] Detectable amounts of glucose, as a measure of GH61
activity, were observed under high throughput screening conditions
(pH 5, 55°C). GH61a-specific activity in the reaction
mixture (which also comprised CBH1a, CBH2b and beta-glucosidase)
was determined by subtracting the amount of glucose in the negative
control reaction (comprising CBH1a, CBH2b and beta-glucosidase
(BGL), but not GH61a) from the total glucose measurement.
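The background subtraction described above can be sketched as follows. This is a minimal illustration rather than the actual data handling used: the function name, the averaging of several negative-control wells, and the glucose values (nominally in g/L) are assumptions made for the example.

```python
from statistics import mean


def gh61_attributable_glucose(sample_glucose: list[float],
                              control_glucose: list[float]) -> list[float]:
    """Subtract the mean negative-control glucose from each sample reading."""
    if not control_glucose:
        raise ValueError("at least one negative-control reading is required")
    background = mean(control_glucose)
    return [reading - background for reading in sample_glucose]


# Hypothetical readings: controls contain CBH1a, CBH2b and beta-glucosidase
# but no GH61a; samples contain all four.
controls = [4.0, 4.5, 3.5]
samples = [6.0, 7.5]
print(gh61_attributable_glucose(samples, controls))
```

Averaging the control wells before subtracting is a design choice of this sketch; subtracting a single matched control per sample would follow the same pattern.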
Example 4
High Throughput Assays to Identify Improved GH61a Variants
[0375] Plasmid libraries containing variant GH61a genes were
transformed into S. cerevisiae INVSC1 strain and plated on SD-ura
agar plates containing 2% glucose. After incubation for at least 48
hours at 30°C, colonies were picked using a Q-bot®
robotic colony picker (Genetix) into shallow, 96-well
microtiter plates containing 200 µL SD-ura media and 6% glucose.
Cells were grown for at least 21 hours at 30°C with
shaking at 250 rpm and 85% humidity. Then, 20 µL of the
overnight culture was transferred into 96-deep well microtiter
plates containing 380 µL SD-ura medium with 2% glucose as
described in Example 2. In some cases, 15 µL of the overnight
culture was transferred into 96-deep well microtiter plates
containing 285 µL modified galactose expression medium with 2%
glucose as described in Example 2. The plates were incubated at
37°C (for SD-ura medium) or 30°C (for modified
galactose expression medium) with shaking at 250 rpm and 85%
humidity for 48 hours. The deep well plates were centrifuged at
4000 rpm for 15 minutes and the clear media supernatant containing
the secreted GH61a enzyme was used for the high throughput biomass
assay.
[0376] The GH61a libraries were screened for thermoactivity using a
biomass-based high throughput method using the assays described in
Example 3.
Example 5
Improved GH61 Activity of Engineered GH61a Variants
[0377] Improved GH61a variants were identified from the high
throughput screening of various GH61a variant libraries as
described in the previous Example. The screening was done by
measuring thermoactivity of these variants compared with that of
the parental GH61a enzyme (expressed from GH61a DNA; SEQ ID NO:1).
The high throughput (HTP) saccharification reactions were conducted
at pH 5, 55°C for 24-72 hrs, using 50 g/L pretreated wheat
straw, 0.0025-0.01% of a mixture of CBH1a and CBH2b (1:1 ratio), and
0.01 to 0.02% of beta-glucosidase.
Example 6
Shake Flask Validation of Improved GH61a Variants
[0378] Improved GH61a variants identified in the high throughput
screening (as described in the previous Example) were prepared
using the shake flask procedure described above. GH61 activities
were determined using a biomass assay as described above, in which
normalized concentrations of GH61a variants were used for direct
comparison of the specific activities of the GH61a variants.
Reactions were quenched at different time points between 24 and 72
hours and glucose levels measured for time-course analysis. FIG. 2
shows time course results for three GH61a variants. FIG. 2 also
shows specific activities observed under the following assay
conditions: pH 5.0 and 55°C, utilizing 50 g/L pretreated
wheat straw, 0.0025% to 0.0125% of a mixture of CBH1a and CBH2b
(1:1 ratio) and 0.01 to 0.02% of beta-glucosidase. The protein
concentration was normalized in reactions. In this Figure, N=3;
error bars represent ±1 standard deviation. GH61 activity is
shown as the glucose production by the enzyme
combination [CBH1a+CBH2b+BGL1] supplemented with the GH61 protein,
minus the glucose production by the same enzyme combination in the
absence of the GH61 protein.
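The summary statistics behind a plot of this kind (N=3, error bars of ±1 standard deviation) and the fold improvement over the parent enzyme can be sketched as below. The readings are hypothetical background-subtracted glucose values, the function names are illustrative, and the use of the sample standard deviation (n−1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def summarize(replicates: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of replicate readings."""
    return mean(replicates), stdev(replicates)


def fold_improvement(variant_activity: float, parent_activity: float) -> float:
    """Ratio of variant activity to parent activity."""
    if parent_activity <= 0:
        raise ValueError("parent activity must be positive")
    return variant_activity / parent_activity


# Hypothetical triplicates of background-subtracted glucose (arbitrary units).
parent_mean, parent_sd = summarize([1.0, 2.0, 3.0])
variant_mean, variant_sd = summarize([5.0, 6.0, 7.0])

print(parent_mean, parent_sd, variant_mean, variant_sd,
      fold_improvement(variant_mean, parent_mean))
```

With these made-up numbers the variant shows a 3.0 fold improvement over the parent.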
[0379] The results show that Variants 5 and 9 (SEQ ID NOS:6 and 8)
have a 2.0 to 2.9 fold improvement over the native GH61a (SEQ ID
NO:2); and Variant 1 has a 3.0 to 3.9 fold improvement over GH61a
(SEQ ID NO:2).
[0380] Substitutions improving GH61 activity are compiled in Table
6-1 below. This table shows GH61a variants derived from the native
GH61a enzyme (SEQ ID NO:2) that were shown to have improved
thermoactivity. Improvement in GH61 activity in relation to the
parental GH61a protein (SEQ ID NO:2) is indicated with the
following scale:
[0381] +=1.1 to 1.9 fold improvement compared with wild-type (SEQ
ID NO:2)
[0382] ++=2.0 to 2.9 fold improvement compared with wild-type
[0383] +++=3.0 to 3.9 fold improvement compared with wild-type
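The scale above maps a fold improvement onto a symbol. A minimal sketch of that mapping follows; the function name is hypothetical, and rounding to one decimal place before binning is an assumption made to handle values that fall between the stated ranges (for example, between 1.9 and 2.0). Values outside all three ranges return `None`.

```python
from typing import Optional


def activity_symbol(fold: float) -> Optional[str]:
    """Map fold improvement over wild-type to the '+', '++', '+++' scale."""
    rounded = round(fold, 1)  # assumption: ranges are read at one decimal place
    if 1.1 <= rounded <= 1.9:
        return "+"
    if 2.0 <= rounded <= 2.9:
        return "++"
    if 3.0 <= rounded <= 3.9:
        return "+++"
    return None  # outside the ranges defined by the scale


for value in (1.5, 2.4, 3.2, 1.0):
    print(value, activity_symbol(value))
```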
TABLE-US-00069 TABLE 6-1 GH61 Variants with Improved Activity
Var. No. | Amino Acid Changes | Silent Nucleotide Changes | Improvement in GH61 Activity
1 (SEQ ID NO: 4) | N35G/E104H/A168P | t60c/c573g | +++
2 | W42P/E104H/K167A | t60c/c573g/g1026a | ++
3 | N35G/W42P/V97Q/A191N | | ++
4 | W42P/E104H | c573g | ++
5 | E104H/K167A | t60c/c291a/c573g | ++
6 | W42P/A191N | t60c/c291a | ++
7 | N35G/W42P/A191N | t60c/c291a | ++
8 | H20D | | ++
9 | V97Q/A191N | | ++
10 | N35G/E104H/A191N | t60c/c876t | ++
11 | E104H | | ++
12 | E104Q | | +
13 | H20D/E104D/Q190H/Y192H | | +
14 | H20D/Q190E/Y192Q | a312g | +
15 | H20D/E104C | | +
16 | H20D/P103H/E104C | | +
17 | H20D/P103H | a312g | +
18 | N35G/E104H | t60c/c573g | +
19 | H20D/P103H/E104Q/Q190E | | +
20 | H20D/P103H/E104C/Y192Q | | +
21 | E104D | t60c | +
22 | N35G/W42P | t60c/c573g | +
23 | A137P | | +
24 | H20D/P103H/E104Q | | +
25 | P103E/E104D | t60c | +
26 | N35G/F68Y/A191N | t379a/c380g/g381c | +
27 | W42P/A168P | | +
28 | H20D/E104C/Q190E/Y192Q | | +
29 | A142W | | +
30 | N35G | | +
31 | H20C/Q190E | | +
32 | W42P/A212P/T236P | | +
33 | N35G/W42P/V97Q/K167A/A168P | t60c/c573g | +
34 | V97Q/A168P | c573g | +
35 | S232A | | +
36 | W42P/E104H/K167A/A168P/Q190E | c573g | +
37 | W42P/A168P/A212P/T236P | | +
38 | N35G/V97Q/K167A | | +
39 | N35G/V97Q | | +
40 | N35G/A191N | | +
41 | S127T/K167A/A191N | | +
42 | W42P | | +
43 | W42P/E104C/K167A/A168P | t60c/c291a/c573g | +
44 | K167Q | | +
45 | W131V | | +
46 | E176C | | +
47 | K167I/P273S | c300t | +
48 | W42P/T87P | | +
49 | W42P/A212P | | +
50 | K133H | | +
51 | D165N | | +
52 | D165A | | +
53 | A168D | | +
54 | K218T | | +
55 | P45T | | +
56 | Q44V | | +
57 | S164W | | +
58 | I177F | | +
59 | A191N | | +
60 | I134P | | +
61 | K133F | | +
62 | I134D | | +
63 | N35G/K167A | t60c/c291a/c573g | +
64 | I162R | | +
65 | N35G/K167A | t204c/t379a/c380g/g381c/c385t | +
66 | D165W/A246T | | +
67 | I162L | | +
68 | S164M | | +
69 | F132D/A244D | | +
70 | H181Q | | +
71 | I177G | g1026a | +
72 | L166W | | +
73 | I162F | | +
74 | I134V | | +
75 | E176Q | | +
76 | H181S | | +
77 | I178A | | +
78 | K167A | | +
79 | V172K | | +
80 | I177H | | +
81 | I134N | | +
82 | K133Y | | +
83 | N35G/Y139L | | +
84 | A168G | | +
85 | T12A/I162G | c246t | +
86 | D165E | | +
87 | D165M | | +
88 | I134M | | +
89 | A168P | | +
90 | I177D | | +
91 | S164P | | +
92 | H175T | | +
93 | N187K/S330R | c597g | +
94 | H175R | | +
95 | L166H | | +
96 | I178L | | +
97 | L173H | | +
98 | I177T | | +
99 | N170Y | | +
100 | H175S | | +
101 | K167T | | +
102 | L166R | | +
103 | V172Y | | +
104 | P163S/E176D | | +
105 | S164I | | +
106 | H175M | | +
107 | A168N | | +
108 | A179W | | +
109 | W131K/H175Q | g1026a | +
110 | Y171A | | +
111 | N170H | | +
112 | P163R | | +
113 | A168C | | +
114 | G169T | | +
115 | R174F | | +
116 | W131Y | | +
117 | I134L | | +
118 | I177V | | +
119 | K167E | | +
120 | H175C | | +
121 | W131I | | +
122 | W42P/A143P | | +
123 | I178G | c72t | +
124 | N170P | | +
125 | A179D/N317K | c732g/c843t/c882t/c909t/c912g | +
126 | I162V | | +
127 | I178M | | +
128 | V172A | | +
129 | K167A/A191N | t60c/c291a | +
130 | F132A | | +
131 | P163E | | +
132 | F132M | | +
133 | A179G | | +
134 | I177S | | +
135 | K167A | g921a | +
136 | K167F | | +
137 | A168I | | +
138 | A179N | | +
139 | I134A | c792t | +
140 | K167E | g972t | +
141 | R174K | | +
142 | S164F | | +
143 | V172L | | +
144 | A168H | | +
145 | I134T | | +
146 | K167H | | +
147 | L166A | | +
148 | S164R | | +
149 | R174C | | +
150 | A179P | | +
151 | G169R | g1026a | +
152 | L173M | | +
153 | D165K | | +
154 | E176S | | +
155 | F132L | | +
156 | F132I/A179I | | +
157 | F132P | | +
158 | S164Q | | +
159 | V172Q | | +
160 | W131D | | +
161 | W131Q | | +
162 | A179H | | +
163 | I134H/G270S | | +
164 | N170G | | +
165 | A168T | | +
166 | A179C | | +
167 | K133N | | +
168 | K167L | | +
169 | L180M | | +
170 | W131F | | +
171 | I134W | g1026a | +
172 | I178H | | +
173 | N170A | | +
174 | V172H | | +
175 | A168H/S205N | | +
176 | I134H | g921a | +
177 | S164C | | +
178 | S164K | | +
179 | I177C | | +
180 | I178Q | | +
181 | L180W | | +
182 | I177M | | +
183 | R174D | | +
184 | V172M | | +
185 | A179M | | +
186 | H175Y | | +
187 | I178P | | +
188 | L173A | | +
189 | N170E | | +
190 | N170F | | +
191 | N35G/A191N/T258I/T323P/G328A/C341R | t379a/c380g/g381c/c454a/c456a/c732t/c843t/c849t | +
192 | A168R | | +
193 | D165I | | +
194 | I162M | | +
195 | K167V | | +
196 | A179S | | +
197 | E176N | | +
198 | I134L/P322L | | +
199 | P163L | | +
200 | H181D | | +
201 | N170S | | +
202 | R174G | | +
203 | I177R | | +
204 | K167C | | +
205 | L166Q | | +
206 | P163I | | +
207 | S164L/L166I | | +
208 | Y171R | | +
209 | F132P/Q190E/A191T | | +
210 | F132Q | | +
211 | I134C | | +
212 | I177A | | +
213 | E176R | | +
214 | G169A | | +
215 | G169K | | +
216 | H181A | | +
217 | I177L | | +
218 | A168G | | +
219 | A179R | | +
220 | D165T | | +
221 | K167R | | +
222 | L166V | | +
223 | N170C | | +
224 | I178R | | +
225 | R174H | | +
226 | S164H | | +
227 | W131R/L166I | | +
228 | I162A/A191T | | +
229 | L173F | | +
230 | N170Q | | +
231 | I177P | | +
232 | R174N | | +
233 | V172K/S215W | | +
234 | D165R | | +
235 | G239D | c520a/c522g | +
236 | H175V | | +
237 | H181R | | +
238 | I134Y | | +
239 | V172F | | +
240 | V172G | | +
[0384] Table 6-2 shows GH61a variants derived from the GH61a
protein designated "Variant 1" in Table 6-1 with improved
thermoactivity. The second-round variants usually retained the
alterations of Variant 1 compared with wild-type GH61a
(N35G/E104H/A168P), along with additional alterations. Improvement
in GH61 activity in relation to Variant 1 (SEQ ID NO:4) is
indicated in Table 6-2 according to the following scale:
[0385] *=0.5 to 1.0 fold improvement compared with Variant 1 (SEQ
ID NO:4)
[0386] +=1.1 to 1.9 fold improvement compared with Variant 1;
[0387] ++=2.0 to 2.9 fold improvement compared with Variant 1
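The substitution strings used in these tables follow the conventional pattern of original residue, position, and replacement residue, joined by slashes. The sketch below parses such strings and checks whether a second-round variant retains the three substitutions of Variant 1 (N35G/E104H/A168P) and which substitutions it adds. The function names are hypothetical, and the parser handles only simple single-residue substitutions of this form.

```python
import re

# One substitution: original residue, 1-based position, replacement residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(text: str) -> list[tuple[str, int, str]]:
    """Parse e.g. 'N35G/E104H' into [('N', 35, 'G'), ('E', 104, 'H')]."""
    parsed = []
    for token in text.strip().split("/"):
        match = SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


VARIANT_1 = parse_substitutions("N35G/E104H/A168P")


def compare_to_variant_1(text: str):
    """Return (added, missing) substitutions relative to Variant 1."""
    subs = parse_substitutions(text)
    added = [s for s in subs if s not in VARIANT_1]
    missing = [s for s in VARIANT_1 if s not in subs]
    return added, missing


# Variant 241 from Table 6-2.
print(compare_to_variant_1("N35G/T40A/E104H/A168P/P327M"))
```

For Variant 241 the check reports two added substitutions (T40A and P327M) and none missing, consistent with the statement that second-round variants usually retain the alterations of Variant 1.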
TABLE-US-00070 TABLE 6-2 GH61 Variants with Improved Activity Compared to Variant 1
Variant Number | Amino Acid Changes | Silent Nucleotide Changes | GH61 Activity Improvement
241 | N35G/T40A/E104H/A168P/P327M | t60c/c573g | ++
242 | N35G/P45D/E104H/A168P/N317R | t60c/c573g | ++
243 | N35G/E104H/A168P/N317R | t60c/c573g | +
244 | N35G/E104H/A168P/N317L | t60c/c573g | +
245 | N35G/T54H/E104H/A168P | t60c/c573g | +
246 | N35G/E104H/A168P/N317D/S329Y | t60c/c573g | +
247 | N35G/E104H/A137S/A168P/S232E | t60c/c573g | +
248 | N35G/E104H/A168P/N317R/T320A | t60c/c573g | +
249 | N35G/E104H/A168P/D234E | t60c/c573g | +
250 | N35G/T40S/E104H/A142G/A168P | t60c/c573g | +
251 | N35G/T40S/S78C/V88I/E104H/S128K/A168P/D234M | t60c/c573g | +
252 | N35G/E104H/A168P/S330V | t60c/c573g | +
253 | N35G/E104H/A168P/G203E/P266S | t60c/c573g | +
254 | N35G/E104H/A168P/D234N | t60c/c573g | +
255 | N35G/E104H/A168P/S286N/S329H | t60c/c573g | +
256 | N35G/E104H/A168P/S330H | t60c/c573g | +
257 | N35G/E104H/A168P/W337R | t60c/c573g | +
258 | N35G/N66D/E104H/S164E/A168P/G267T | t60c/c573g | +
259 | N35G/E104H/A168P/P233V | t60c/c573g | +
260 | R34E/N35G/E104H/R145T/A168P | t60c/c573g | +
261 | S24Q/N35G/E104H/A168P/V237I | t60c/c573g | +
262 | Y32S/N35G/E64S/E104H/A168P | t60c/c573g | +
263 | N35G/E104H/A168P/V333R | t60c/c573g | +
264 | N35G/E104H/G144S/A168P/V333Q | t60c/c573g | +
265 | V28H/N35G/P45K/E104H/A168P | t60c/c573g | +
266 | N35G/E104H/A168P/P327K | t60c/c573g | +
267 | N35G/N66Q/E104H/A168P | t60c/c573g | +
268 | N35G/E104H/A168P/G203E | t60c/c573g | +
269 | N35G/E104H/A168P/S339W | t60c/c573g | +
270 | N35G/P45K/N46E/E104H/A150Y/A168P | t60c/c573g | +
271 | N35G/E104H/R130S/A168P | t60c/c573g | +
272 | N35G/E104H/R145T/A168P | t60c/c573g/g891a | +
273 | N35G/E104H/A168P/S231K | t60c/c573g | +
274 | N35G/T40A/E104H/A168P/D234E/P327M | t60c/c573g | +
275 | N35G/E104H/A168P/S231H | t60c/c573g | +
276 | N35G/E104H/A168P/N317M | t60c/c573g | +
277 | N35G/E104H/A168P/S330Y | t60c/c573g | +
278 | N35G/E104H/A168P/S329I | t60c/c573g | +
279 | N35G/E104H/A168P/S329R | t60c/c573g | +
280 | N35G/N66D/E104H/A168P/P322R/S329L | t60c/c573g | +
281 | N35G/E104H/A168P/P327F | t60c/c288t/c573g | +
282 | N35G/P45D/E104H/A168P | t60c/c573g | +
283 | N35G/E104H/A168P/S332R | t60c/c573g | +
284 | N35G/E104H/A116S/A168P | t60c/c573g | +
285 | N35G/T40A/E104H/A168P/V230I/P327M | t60c/c573g | +
286 | N35G/T49A/E104H/A168P | t60c/c573g | +
287 | N35G/E104H/A168P/N317T | t60c/c573g | +
288 | N35G/N46Y/E104H/A168P | t60c/c573g | +
289 | N35G/E104H/A168P/G203V | t60c/c573g | +
290 | N35G/E104H/A168P/S329L | t60c/c573g | +
291 | N35G/E104H/R145N/A168P/S329H | t60c/c573g | +
292 | N35G/A56S/E104H/A168P | t60c/c573g | +
293 | N35G/T40S/T49R/E104H/A168P/D234E/P327M | t60c/c573g | +
294 | N35G/E104H/Q161R/A168P | t60c/c573g | +
295 | N35G/E104H/A168P/S332F | t60c/c573g | +
296 | N35G/P45R/T49A/E104H/A168P/N317R/T320A | t60c/c573g | +
297 | N35G/E104H/A168P/V237I | t60c/c573g | +
298 | N35G/Q44K/T80V/E104H/A168P | t60c/c573g | +
299 | N35G/E104H/A168P/E336S | t60c/c573g | +
300 | N35G/E104H/A168P/P233T | t60c/c573g | +
301 | N35G/E104H/A168P/S329Y | t60c/c573g | +
302 | N35G/E104H/A168P/P327L | t60c/c573g | +
303 | N35G/E104H/A168P/N317I | t60c/c573g | +
304 | N35G/E104H/R130H/A168P | t60c/c573g | +
305 | N35G/Q44K/E104H/A168P | t60c/c573g | +
306 | N35G/N66D/E104H/A168P | t60c/c573g | +
307
N35G/E104H/A168P/S329V t60c/c573g + 308 N35G/E104H/A168P/W337F
t60c/c573g + 309 N35G/E104H/A168P/N317H t60c/c573g + 310
N35G/T40L/E104H/S128K/A168P t60c/c573g + 311 N35G/E104H/A168P/A326V
t60c/c573g + 312 N35G/T80V/E104H/A168P/P303T t60c/c573g + 313
N35G/E104H/A168P/S231A/S295L t60c/c573g + 314
N35G/E104H/A116Q/A168P t60c/c573g + 315 N35G/E104H/A168P/S330C
t60c/c573g + 316 N35G/T40S/E101T/E104H/A168P/P327M t60c/c573g + 317
N35G/E104H/A168P/A326Q t60c/c573g + 318 N35G/N46R/E104H/A168P
t60c/c573g + 319 N35G/P45K/E104H/A168P/A219R/S232E t60c/c573g + 320
S24Q/N35G/E104H/A168P/V237I/P303T t60c/c573g + 321
N35G/E104H/A168P/G203E/T281A t60c/c573g + 322 N35G/A56N/E104H/A168P
t60c/c573g + 323 N35G/E104H/A168P/E336G t60c/c573g + 324
N35G/E104H/A168P/E336R t60c/c573g + 325
N35G/T40S/E104H/S128K/A142G/A168P t60c/c573g + 326
N35G/Q44K/S67T/E104H/A168P t60c/c198t/c573g + 327
N35G/E104H/A168P/N317A t60c/c573g + 328 N35G/E104H/G155N/A168P
t60c/c573g + 329 N35G/E104H/Q161E/A168P t60c/c573g + 330
N35G/E104H/N118S/A168P t60c/c573g + 331
N35G/P45T/V97Q/E104H/A168P/G267S t60c/c573g + 332
V28H/N35G/E104H/A168P t60c/c573g + 333 N35G/E104H/A168P/Q184L
t60c/c573g + 334 N35G/E104H/A168P/N317V t60c/c573g + 335
N35G/Q44L/E104H/A168P t60c/c573g + 336 N35G/E104H/A168P/S330G
t60c/c573g + 337 N35G/E104H/A168P/T320A/V333W t60c/c573g + 338
N35G/E104H/A168P/E336A t60c/c573g + 339 N35G/E104H/A168P/N335S
t60c/c573g + 340 N35G/N66M/E104H/A168P t60c/c573g + 341
N35G/T54G/E104H/A168P t60c/c573g + 342 N35G/E104H/A168P/N317S
t60c/c573g + 343 N35G/E64L/E104H/A168P t60c/c573g + 344
N35G/E104H/S164E/A168P/A271T t60c/c573g + 345 N35G/N66A/E104H/A168P
t60c/c573g + 346 N35G/G83R/E104H/A168P t60c/c573g + 347
N35G/E104H/A168P/N317Q/T320A t60c/c573g + 348
N35G/E104H/K141A/A168P t60c/c573g + 349 N35G/P71T/E104H/A168P
t60c/c573g + 350 N35G/P71S/E104H/A168P t60c/c573g + 351
N35G/E104H/R130G/A168P t60c/c573g + 352 N35G/E104H/R145Q/A168P
t60c/c573g + 353 N35G/T70A/E104H/A168P t60c/c573g + 354
N35G/E104H/A168P/K218R t60c/c573g + 355 N35G/E104H/A168P/Q184E
t60c/c573g + 356 N35G/E104H/R130K/A168P t60c/c573g + 357
N35G/Q58H/E104H/A168P t60c/c573g + 358 Y32S/N35G/E104H/A168P
t60c/c573g + 359 N35G/E104H/A168P/S329T t60c/c573g + 360
N35G/E104H/A168P/S330I t60c/c573g + 361 Y32S/N35G/P71A/E104H/A168P
t60c/c573g + 362 N35G/E104H/A168P/S330T t60c/c573g + 363
N35G/G82A/E104H/A168P t60c/c573g + 364 N35G/T80V/E104H/A168P
t60c/c573g + 365 N35G/E104H/A168P/S295T t60c/c573g + 366
N35G/N66G/E104H/A168P t60c/c573g + 367 N35G/E104H/R145L/A168P
t60c/c573g + 368 N35G/S67H/E104H/A168P/V230M t60c/c573g + 369
N35G/E104H/G136E/A168P t60c/c573g + 370 N35G/T54S/E104H/A168P
t60c/c573g + 371 N35G/P45S/E104H/A168P t60c/c573g + 372
N35G/E104H/A168P/A326M t60c/c573g/c882t + 373
N35G/N66D/N95E/E104H/S164E/A168P/ t60c/c573g + G267D 374
N35G/E104H/A168P/S332C t60c/c573g + 375 N35G/E104H/S128L/A168P
t60c/c573g + 376 N35G/T54W/E104H/A168P t60c/c573g + 377
N35G/E104H/A168P/G268A/G269A/G270A t60c/c573g + 378
N35G/Q44K/E104H/A168P/S231T t60c/c573g + 379
R34E/N35G/E104H/A168P/A280D t60c/c573g + 380 N35G/E104H/A168P/A297T
t60c/g399a/c573g + 381 N35G/E104H/K141P/R145Q/A168P t60c/c573g +
382 N35G/P45E/E104H/K141R/A168P t60c/c573g + 383
N35G/N66T/E104H/A168P t60c/c573g + 384 N35G/E104H/S164E/A168P/S295D
t60c/c573g + 385 N35G/E104H/A168P/N317F t60c/c573g + 386
N35G/E104H/A168P/N317Q t60c/c573g + 387
N35G/T40G/T49R/S78C/E104H/A142G/ t60c/c573g + A168P 388
N35G/G82S/E104H/A168P t60c/c573g + 389 N35G/Q58P/E104H/A168P
t60c/c573g + 390 N35G/N46R/E104H/A168P/G203E/A263V t60c/c573g + 391
N35G/P45R/E104H/A168P t60c/c573g + 392 N35G/S67G/E104H/A168P
t60c/c573g + 393 N35G/E104H/A168P/R199E t60c/c573g + 394
N35G/G69T/E104H/A168P t60c/c573g + 395
N35G/E104H/A168P/G203E/G268A/G269A/ t60c/c573g + G270A 396
N35G/E104H/A168P/P266S t60c/c573g + 397 N35G/E104H/A168P/V324M
t60c/c573g + 398 N35G/E104H/A168P/G245A t60c/c573g + 399
N35G/N66R/E104H/A168P t60c/c573g + 400 N35G/E104H/A168P/T236E
t60c/c573g + 401 S24Q/N35G/Q44K/T80H/E104H/A168P t60c/c573g + 402
N35G/E104H/S128D/A168P t60c/c573g + 403
N35G/N66D/S78D/E104H/A168P/S253D t60c/c573g + 404
N35G/E104H/R130Y/A168P t60c/c573g + 405 N35G/E104H/A168P/K310I
t60c/c573g + 406 N35G/E104H/R145E/A168P t60c/c573g + 407
N35G/N66D/E104H/S164E/A168P/S282D t60c/c573g + 408
N35G/E104H/K141P/A168P t60c/c573g + 409 N35G/E104H/A168P/Q184R
t60c/c573g + 410 N35G/E104H/A168P/S231T t60c/c573g + 411
N35G/N66V/E104H/A168P t60c/c573g + 412 N35G/E104H/A142L/A168P
t60c/c573g + 413 N35G/E104H/R145H/A168P t60c/c573g + 414
N35G/E104H/A168P/K218L t60c/c573g + 415 N35G/E104H/K141T/A168P
t60c/c573g + 416 N35G/E104H/A168P/P233F t60c/c573g + 417
N35G/T40S/E104H/A168P/P327M t60c/c573g + 418 N35G/T54M/E104H/A168P
t60c/c573g + 419 S24T/N35G/E104H/S164E/A168P t60c/c573g + 420
N35G/P45T/E104H/A168P t60c/c573g + 421
N35G/N66D/E104H/S164E/A168P/S231T/ t60c/c573g + S253T 422
N35G/G69H/E104H/A168P t60c/c573g + 423 N35G/E104H/S128Y/A168P
t60c/c573g + 424 N35G/T49Q/E104H/A168P t60c/c573g + 425
N35G/T49A/E104H/A168P/Q184H t60c/c573g + 426 N35G/E104H/A168P/G203Y
t60c/c573g + 427 N35G/Q44K/N66V/E104H/A168P t60c/c573g + 428
N35G/E104H/A137M/A168P t60c/c573g + 429 N35G/E104H/A168P/P327C
t60c/c573g + 430 N35G/E104H/A168P/T236R t60c/c573g + 431
N35G/I51A/E104H/A168P t60c/c573g + 432 N35G/S67H/E104H/A168P
t60c/c573g + 433 N35G/E104H/A168P/A326C t60c/c573g + 434
N35G/T49A/E104H/S128N/A168P t60c/c573g + 435
N35G/T49R/E104H/A168P/K218L/N317Q t60c/c573g + 436
N35G/E104H/A168P/P266S/G267V t60c/c573g + 437
N35G/E104H/A168P/V237I/P303T t60c/c573g + 438 N35G/T49E/E104H/A168P
t60c/c573g + 439 N35G/P45R/E104H/A168P/T320A t60c/c573g + 440
N35G/N66L/E104H/A168P t60c/c573g + 441
N35G/P45R/E104H/A168P/K218L/N317Q t60c/c573g + 442
N35G/E104H/R145V/A168P t60c/c573g + 443 N35G/N66D/E104H/A168P/R290K
t60c/c573g + 444 N35G/T80L/E104H/A168P t60c/c573g + 445
N35G/A55G/E104H/A168P t60c/c573g + 446 N35G/E104H/A168P/S330A
t60c/c573g + 447 N35G/E104H/K141N/A168P/P266S t60c/c573g + 448
N35G/E104H/A142S/A168P t60c/c573g + 449 N35G/E104H/A168P/Q184G
t60c/c573g + 450 N35G/E104H/N118E/A168P t60c/c573g + 451
N35G/E104H/A168P/A212M t60c/c573g + 452 N35G/E104H/A168P/G267D
t60c/c573g + 453 N35G/K93N/E104H/R130Y/A168P t60c/c573g + 454
N35G/P45R/T49Y/E104H/A168P/N317D t60c/c573g + 455
N35G/E104H/A168P/S329Q t60c/c573g + 456 N35G/E104H/A168P/V230Q
t60c/c573g + 457 N35G/P45K/E104H/A168P/A219R t60c/c573g + 458
N35G/E104H/A142G/A168P t60c/c573g + 459 N35G/E104H/A168P/S205T
t60c/c573g + 460 N35G/S78D/E104H/S164E/A168P t60c/c573g + 461
N35G/E104H/R130E/A168P t60c/c573g + 462 N35G/E104H/A168P/Q184H
t60c/c573g + 463 N35G/E104H/A116P/A168P t60c/c573g + 464
N35G/E104H/A142D/A168P t60c/c573g + 465
V28H/N35G/N46E/Q58H/E104H/A168P t60c/c573g + 466
N35G/E104H/A168P/A280T t60c/c573g + 467 R34E/N35G/E104H/A168P/A280T
t60c/c573g + 468 N35G/E104H/A168P/E336L t60c/c573g + 469
N35G/T49D/E104H/A168P t60c/c573g + 470 N35G/E104H/A168P/A219T
t60c/c573g + 471 N35G/E104H/A142W/A168P t60c/c573g + 472
N35G/E104H/A168P/P303T/G305D t60c/c573g + 473 N35G/Q44V/E104H/A168P
t60c/c573g + 474 N35G/E104H/A168P/N187D t60c/c573g + 475
N35G/E104H/G136H/A168P t60c/c573g + 476
S24Q/N35G/Q44K/E104H/A168P/P303T/ t60c/c573g +
S332D 477 N35G/E104H/A168P/Q184N t60c/c573g + 478
N35G/E104H/A168P/S332L t60c/c573g + 479
S24T/N35G/N66D/S78D/E104H/A168P/ t60c/c573g + S205T/S253T 480
N35G/E104H/A168P/P327A t60c/c573g + 481
N35G/T40A/T49Q/S78C/E104H/A168P t60c/c573g + 482
N35G/T40L/E104H/A142G/A168P t60c/c573g + 483
N35G/T49Y/E104H/A168P/N317R t60c/c573g + 484
R34E/N35G/K93T/E104H/R130E/R145T/ t60c/c573g +
A168P/R199E/K218T/A280D
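The "Silent Nucleotide Changes" column lists base changes that leave
the encoded amino acid unchanged. The sketch below is illustrative
only: it uses the first 60 nucleotides of SEQ ID NO:1 as printed in
the sequence listing, the standard genetic code, and hypothetical
helper names. It checks whether a change such as t60c is silent.

```python
import re

# Standard genetic code, built in t/c/a/g order for each codon position.
BASES = "tcag"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 60 nucleotides of SEQ ID NO:1 (wild-type GH61a coding sequence).
SEQ1_START = (
    "atgtccaagg" "cctctgctct" "cctcgctggc"
    "ctgacgggcg" "cggccctcgt" "cgctgcacat"
)


def translate(dna):
    """Translate complete codons from position 1; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def apply_change(dna, change):
    """Apply a change written like 't60c' (old base, 1-based position, new base)."""
    match = re.fullmatch(r"([acgt])(\d+)([acgt])", change)
    if match is None:
        raise ValueError(f"unrecognized nucleotide change: {change!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    if dna[position - 1] != old:
        raise ValueError(f"expected {old!r} at position {position}")
    return dna[:position - 1] + new + dna[position:]


def is_silent(dna, change):
    """True if the change leaves the translated protein unchanged."""
    return translate(dna) == translate(apply_change(dna, change))


if __name__ == "__main__":
    print(translate(SEQ1_START))
    print(is_silent(SEQ1_START, "t60c"))
```

Here t60c converts codon 20 from cat to cac, both of which encode His,
so the change is silent; the translated fragment matches the first 20
residues of SEQ ID NO:2.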
Example 7
Selection of Further GH61 Candidates for Strain Improvement
[0388] This example illustrates the selection of potential
candidates to further improve whole cellulase broth activity of M.
thermophila cultures on different types of pretreated substrates,
such as pretreated corn stover and pretreated wheat straw.
[0389] In this Example, M. thermophila-produced and purified GH61a,
GH61p, GH61f, GH61n, CBH1a, CBH2b, AXE3, FAE, and Xyl3 were used
to supplement the activity present in culture broths (i.e., "whole
broth cellulase base") of the M. thermophila strain CF-416 prepared
using standard methods known in the art. The whole broth cellulase
base was fixed at 0.5% protein, and each single purified enzyme was
added at 0.4% (wt added protein/wt glucan) to the saccharification
reactions. The whole broth cellulase base and the individual
enzymes were characterized by standard BCA assays for total protein
quantification.
[0390] The saccharification reactions were carried out at 74 g/L
glucan load of pretreated wheat straw (PWS) or pretreated corn
stover (PCS) at pH 5.0, 55° C. and 950 rpm in the presence of
50 µM copper in high throughput (HTP) 96 deep well plates.
Glucose analysis was carried out by the glucose oxidase assay as
described above. In each case, the fold improvement was calculated
using the formula Fold Improvement=[total glucose production with
addition of 0.4% single enzyme to the whole cellulase broth
base]/[glucose production from the 0.5% whole cellulase broth
base]. The results are provided in Table 7-1. In this Table, the
fold improvements were ranked from 0 to 3: fold improvements less
than 1.2× are indicated by "0," fold improvements of >1.2× to
<1.5× are indicated by "1," fold improvements of ≥1.5× to <1.7×
are indicated by "2," and fold improvements ≥1.7× are indicated
by "3."
[0391] As indicated by the results in the Table, the greatest
benefit was observed using GH61p on pre-treated corn stover (PCS),
and GH61a on pre-treated wheat straw (PWS), indicating that GH61
activity increases the cellulolytic activity of the reaction
mix. In addition to the enzymes listed in Table 7-1, EG1b, Xyl1,
Xyl6, beta-xylosidase, and another xylanase were also tested, but
did not show any improvement under the test conditions.
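The fold-improvement formula and the 0-to-3 ranking described in this
Example can be sketched as follows. This is an illustration rather
than the applicants' procedure. The function names are hypothetical,
and because the text leaves a fold improvement of exactly 1.2
unassigned, the sketch assumes it falls in rank 1.

```python
def fold_improvement(glucose_with_enzyme, glucose_base_only):
    """Glucose from (base + 0.4% single enzyme) divided by glucose from base alone."""
    if glucose_base_only <= 0:
        raise ValueError("base glucose production must be positive")
    return glucose_with_enzyme / glucose_base_only


def rank(fold):
    """Map a fold improvement onto the 0-3 scale of paragraph [0390].

    Assumption: a value of exactly 1.2 is placed in rank 1, since the
    text only defines '<1.2' and '>1.2 to <1.5'.
    """
    if fold < 1.2:
        return 0
    if fold < 1.5:
        return 1
    if fold < 1.7:
        return 2
    return 3


if __name__ == "__main__":
    # Hypothetical glucose readings (g/L), for illustration only.
    fold = fold_improvement(30.0, 20.0)
    print(fold, rank(fold))
```

Both glucose values must be in the same units, since only their ratio
enters the ranking.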
TABLE-US-00071 TABLE 7-1 Fold Improvement

                                     Fold Improvement   Fold Improvement
                                     Over Whole         Over Whole
                                     Cellulase Broth    Cellulase Broth
                                     Tested on PCS      Tested on PWS
Whole broth cellulases from CF-416   1                  1
CBH1a                                1                  3
CBH2b                                1                  1
GH61a                                2                  3
GH61p                                3                  2
GH61f                                1                  1
GH61n                                1                  1
AXE3                                 0                  1
FAE                                  1                  1
Xyl3                                 0                  1
Example 8
Improvement of GH61 Activity by Copper(II) Ions
[0392] This example illustrates the enhancement in GH61 activity
with the addition of copper(II) ion to the saccharification
reaction.
[0393] Purified M. thermophila-produced GH61a or S. cerevisiae
supernatant containing M. thermophila-GH61a was pre-incubated with
copper(II) (CuSO4) at concentrations of 0 to 100 µM at ambient
temperature for 30 min. The biomass assay was then performed in a
total volume of 300 µL, in the presence of 10 mg of pre-treated
wheat straw, using 261 µL of copper-treated GH61 samples, 39 µL of
sodium acetate buffer (pH 5), and M. thermophila-produced CBH1a,
CBH2b and β-glucosidase. The final concentration of sodium acetate
was 120 mM, and the enzyme loads of the CBHs (CBH1a and CBH2b in a
1:1 ratio) and β-glucosidase were 0.01% and 0.02%, respectively,
with respect to substrate glucan mass in the biomass substrate.
Background (negative) controls were obtained by using either water
or media supernatant from cultures of S. cerevisiae cells without
the GH61a gene in the plasmid. Thus, the negative controls
represent activities of CBH1a, CBH2b and β-glucosidase in the
absence of GH61a. The reaction was incubated at 55° C. for 72 hours
with shaking. The GH61a activity in the reaction mixture was
measured by monitoring glucose production using a glucose
oxidase/peroxidase-based glucose assay.
[0394] Some experiments were also performed without pre-incubating
GH61 with copper(II), but instead by directly adding different
amounts of copper(II) (CuSO4) to the biomass assay reactions
as described herein.
[0395] FIG. 3 shows activity of M. thermophila-GH61a pre-incubated
with different amounts of copper(II) ion. Biomass assays were
performed with (A) S. cerevisiae-produced M. thermophila GH61a
Variant 5, and (B) M. thermophila-produced wild-type GH61a. Glucose
production after 72 h incubation at pH 5, 55° C. was
determined by the glucose assay. The data in this Figure indicate
GH61a-only activity, in which the amount of glucose produced in
the control reaction containing CBH and β-glucosidase was
subtracted from the total amount of glucose produced in the
reactions with GH61a. In this Figure, N=4; and the error bars
represent ±1 standard deviation. Copper concentrations shown are
with respect to the total reaction volume.
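The background subtraction described for FIG. 3 can be sketched as
below. This is a minimal illustration under stated assumptions: the
function name and data are hypothetical, the mean of the control
replicates is subtracted from each GH61a replicate, and the sample
standard deviation (N-1 denominator) is used, since the text does not
say which form of standard deviation or error propagation was
applied.

```python
from statistics import mean, stdev


def gh61_only_activity(total_glucose, control_glucose):
    """Return (mean, sample SD) of glucose attributable to GH61a.

    total_glucose:   replicate readings from reactions containing GH61a
    control_glucose: replicate readings from CBH + beta-glucosidase controls
    """
    background = mean(control_glucose)
    corrected = [reading - background for reading in total_glucose]
    return mean(corrected), stdev(corrected)


if __name__ == "__main__":
    # Hypothetical glucose readings (g/L) with N=4, for illustration only.
    with_gh61a = [12.0, 13.0, 11.0, 12.0]
    controls = [5.0, 5.0, 5.0, 5.0]
    average, spread = gh61_only_activity(with_gh61a, controls)
    print(f"{average:.2f} +/- {spread:.2f}")
```

The spread returned here reflects only the variation among GH61a
replicates; variation among the controls is not propagated in this
sketch.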
[0396] The results indicate that the activities of M.
thermophila-produced GH61a and S. cerevisiae supernatant containing
M. thermophila-GH61a are improved by pre-incubation with copper(II)
ions under the conditions tested. Similar results were obtained
when copper(II) was directly added to the biomass assay
reactions.
Example 9
Further Evaluation of Copper Requirements in Saccharification
Reactions
[0397] This Example describes experiments designed to determine the
effects of added copper in saccharification reactions. The
saccharification reactions were run in 30 g shake flasks (250 mL
flasks) using 82 g/kg glucan of acid-pretreated corn stover and
whole broth enzymes produced by M. thermophila strain CF-416
(produced using standard methods known in the art) at a 0.81% total
enzyme load with respect to glucan. The reactions were conducted at
pH 5.0 or pH 6.0, 55° C. and 250 rpm mixing, with
supplementation of either 0 or 50 µM copper(II) (CuSO4) with
respect to the total reaction volume. A pH trim was also performed
using 2M NaOH at time intervals of 1, 4, 7, 22, 24, 29, 46, 52, 70,
75 and 96 hrs, to maintain the pH at the desired value of pH 5.0 or
pH 6.0. Samples were removed at 72 hours and the total amount of
glucose in the reaction mixture was determined using standard HPLC
methods and equipment as known in the art. The results indicated
that under the conditions described herein, the effect of copper is
dependent on saccharification pH. As shown in FIG. 4, Panel A, at a
saccharification pH of 5.0, the addition of copper increased
glucose yields by ~3.5%, while this effect was not
observed when the saccharification was carried out at pH 6.0. Also,
the addition of copper may cause a decrease in the total amounts of
C5 sugars that are produced, as shown in FIG. 4, Panel B.
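As a worked illustration of the loadings in this Example, the glucan
and enzyme masses per flask can be estimated as follows. The sketch
assumes that "30 g" is the total reaction mass and that 82 g/kg is
expressed relative to that mass. The function names are hypothetical.

```python
def glucan_mass_g(reaction_mass_g, glucan_g_per_kg):
    """Glucan in the flask, given reaction mass (g) and glucan loading (g/kg)."""
    return reaction_mass_g * glucan_g_per_kg / 1000.0


def enzyme_mass_mg(glucan_g, load_percent):
    """Enzyme protein (mg) for a load given as % (wt protein/wt glucan)."""
    return glucan_g * load_percent / 100.0 * 1000.0


if __name__ == "__main__":
    glucan = glucan_mass_g(30.0, 82.0)      # about 2.46 g glucan per flask
    enzyme = enzyme_mass_mg(glucan, 0.81)   # about 19.9 mg protein per flask
    print(f"{glucan:.2f} g glucan, {enzyme:.1f} mg enzyme protein")
```

Under these assumptions each flask holds about 2.46 g of glucan and
about 19.9 mg of whole broth protein.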
Example 10
Effect of Reducing Agents on the Cellulolytic Activity of GH61a
[0398] This Example provides experiments conducted to determine the
effect of adding reducing agents (e.g., gallic acid and ascorbic
acid) to saccharification reactions. In these experiments,
enhancement of GH61 activity was tested using Variant 1 (SEQ ID
NO:5) in the presence of reducing agents (specifically, ascorbic
acid or gallic acid) and pretreatment filtrate, which contains
various reducing agents from lignin degradation. Reactions were
performed on cellulosic substrates, AVICEL® PH microcrystalline
cellulose and phosphoric acid swollen cellulose (PASC), with
purified M. thermophila-produced GH61a Variant 1 and
beta-glucosidase (BGL) at 0.3% and 0.08%, respectively, with
respect to substrate glucan mass (36 g/L AVICEL® cellulose and
5 g/L PASC), in 128 mM sodium acetate buffer (pH 5) supplemented
with 30 µM CuSO4. Background (negative) controls were
beta-glucosidase-only reactions tested in the absence of GH61a.
Glucose production after 48 h incubation at pH 5, 55° C. was
determined by a glucose oxidase/peroxidase-based or HPLC-based
glucose assay, using methods known in the art.
[0399] FIG. 5 shows the activity of M. thermophila-produced GH61a
Variant 1 on cellulosic substrates in the presence of ascorbic
acid, gallic acid and pretreatment filtrate. Panel A shows the
results for AVICEL® PH microcrystalline cellulose and Panel B
shows the results for PASC. GH61-only activity is also shown; these
results were obtained by subtracting the amount of glucose produced
in the beta-glucosidase-only control reaction from the total amount
of glucose produced in the reaction that included GH61a. Filtrate
dilutions are indicated in this Figure, where undiluted filtrate
equals 72% of the total reaction volume. In this Figure, N=4; and
the error bars represent ±1 standard deviation.
[0400] The results indicate that supplementing the GH61a reaction
with gallic acid improved the GH61 activity in generating soluble
sugars from AVICEL® cellulose and PASC, which were then
hydrolyzed by beta-glucosidase to generate glucose monomers. The
improvement was also observed with diluted pretreatment filtrate,
which suggests that the filtrate may contain gallic acid or gallic
acid-like reductants that can beneficially impact GH61
activity.
Example 11
Evaluation of Oxygen Limitation in Saccharification Reactions
[0401] This example describes experiments conducted to determine if
oxygen is a limiting factor in saccharification reactions. To
investigate the level of oxygen required for overall
saccharification efficiency, two shake flask reactions were
performed, in which one was left closed throughout the 72 hour
reaction, while the other was opened at 4 hrs and 24 hrs for 10
seconds to provide fresh air. The reactions were run in 30 g shake
flasks (250 mL flasks) using 87 g/kg glucan and M. thermophila
CF-416 whole broth cellulases. The total protein content in each
reaction was 0.81% total enzyme load with respect to glucan. The
reactions were conducted at pH 5.0, 55° C. and 250 rpm
mixing, with supplementation of 50 µM CuSO4. Samples were
removed at 72 hours and glucose yields were measured by monitoring
glucose production using a glucose oxidase/peroxidase-based glucose
assay. The results indicated that under the reaction conditions
tested, oxygen was not a limiting factor, as the two reactions
(control vs. the reaction with air supplemented) yielded similar
levels of glucose.
Example 12
Enhancement of Saccharification Efficiency by Addition of
Surfactants
[0402] This example illustrates the enhancement of overall
saccharification yield with the addition of surfactants such as
TWEEN®-20 and PEG-4000. Experiments were designed to monitor
the enhancement in cellulase activity with different concentrations
of TWEEN®-20 or PEG-4000 in the biomass assay. The biomass
assay was performed in a total volume of 90 µL, including 10 mg
of pre-treated wheat straw, 64.8 µL (72% by volume) of filtrate
(or H2O for no filtrate conditions), and 11.6 µL of a
mixture of sodium acetate buffer (pH 5.0, supplemented with
CuSO4), M. thermophila-produced cellobiohydrolase 1a (CBH1a),
cellobiohydrolase 2b (CBH2b), beta-glucosidase (BGL), and glycoside
hydrolase type 61 (GH61a). The final concentration of sodium
acetate was 128 mM (with 30 µM CuSO4) and the enzyme loads
of CBH1a, CBH2b, BGL and GH61a were 0.15%, 0.15%, 0.08% and 0.3%
with respect to the substrate glucan mass in the biomass substrate,
respectively. Water was used in place of the enzymes as a negative
control. Herein, "1× filtrate" indicates 72% of filtrate
(i.e., the filtered liquid portion of pre-treated substrate) in the
total reaction volume. The amount of glucose in the filtrate
background was subtracted from the test data (N=2; error bars in
the Figures represent ±1 standard deviation). The reaction was
incubated at 55° C. for 72 hours at pH 5, with shaking at
950 rpm, then was quenched by adding 180 µL of water. The total
cellulase activity in the reaction mixture was measured by
monitoring glucose production using a glucose
oxidase/peroxidase-based glucose assay as known in the art. The
results indicate that the total glucose production in the
saccharification reaction was enhanced with the addition of
TWEEN®-20 or PEG-4000.
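The volume bookkeeping in this assay can be illustrated with a short
sketch. It is not part of the disclosure: the function names and the
"strength" parameter are hypothetical, volumes are assumed to be
additive, and the 10 mg of solids is treated as part of the 90 µL
total. The sketch reproduces the 64.8 µL filtrate volume for 1×
filtrate and the dilution that results from quenching with 180 µL of
water.

```python
REACTION_UL = 90.0            # total assay volume stated in the text
QUENCH_WATER_UL = 180.0       # water added to quench the reaction
FILTRATE_FRACTION_1X = 0.72   # "1x filtrate" is 72% of the reaction volume


def filtrate_volume_ul(strength=1.0, reaction_ul=REACTION_UL):
    """Filtrate volume for a given strength (1.0 = 1x, 0.5 = half strength)."""
    return FILTRATE_FRACTION_1X * strength * reaction_ul


def quench_dilution_factor(reaction_ul=REACTION_UL, water_ul=QUENCH_WATER_UL):
    """Factor by which quenching dilutes the reaction (assumes additive volumes)."""
    return (reaction_ul + water_ul) / reaction_ul


def glucose_before_quench(measured_g_per_l):
    """Convert glucose measured in the quenched sample back to the reaction."""
    return measured_g_per_l * quench_dilution_factor()


if __name__ == "__main__":
    print(round(filtrate_volume_ul(), 1))   # filtrate volume at 1x, in microliters
    print(quench_dilution_factor())         # dilution factor after quenching
```

If glucose is read from the quenched sample, the reading would need
to be multiplied by this factor to express it per unit of original
reaction volume; whether the applicants applied such a correction is
not stated.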
[0403] FIG. 6, Panel A, shows enzymatic hydrolysis activity of the
cellulase mixture in the presence of TWEEN®-20. Data shown are
total glucose produced by a mixture of GH61a, CBH1a, CBH2b, and BGL
at 0.3%, 0.15%, 0.15%, and 0.08% with respect to the substrate
glucan mass in the biomass substrate, respectively. In this Figure,
TWEEN®-20 concentrations are expressed as % total reaction
volume.
[0404] FIG. 6, Panel B, shows enzymatic hydrolysis activity of the
cellulase mixture in the presence of PEG-4000. In this Figure,
PEG-4000 concentrations are expressed as % total reaction
volume.
[0405] While the invention has been described with reference to the
specific embodiments, various changes can be made and equivalents
can be substituted to adapt to a particular situation, material,
composition of matter, process, process step or steps, thereby
achieving benefits of the invention without departing from the
scope of what is claimed.
[0406] For all purposes in the United States of America, each and
every publication and patent document cited in this disclosure is
incorporated herein by reference as if each such publication or
document was specifically and individually indicated to be
incorporated herein by reference. Citation of publications and
patent documents is not intended as an indication that any such
document is pertinent prior art, nor does it constitute an
admission as to its contents or date.
Sequence CWU 1
1
17111029DNAMyceliophthora thermophila 1atgtccaagg cctctgctct
cctcgctggc ctgacgggcg cggccctcgt cgctgcacat 60ggccacgtca gccacatcgt
cgtcaacggc gtctactaca ggaactacga ccccacgaca 120gactggtacc
agcccaaccc gccaacagtc atcggctgga cggcagccga tcaggataat
180ggcttcgttg aacccaacag ctttggcacg ccagatatca tctgccacaa
gagcgccacc 240cccggcggcg gccacgctac cgttgctgcc ggagacaaga
tcaacatcgt ctggaccccc 300gagtggcccg aatcccacat cggccccgtc
attgactacc tagccgcctg caacggtgac 360tgcgagaccg tcgacaagtc
gtcgctgcgc tggttcaaga ttgacggcgc cggctacgac 420aaggccgccg
gccgctgggc cgccgacgct ctgcgcgcca acggcaacag ctggctcgtc
480cagatcccgt cggatctcaa ggccggcaac tacgtcctcc gccacgagat
catcgccctc 540cacggtgctc agagccccaa cggcgcccag gcctacccgc
agtgcatcaa cctccgcgtc 600accggcggcg gcagcaacct gcccagcggc
gtcgccggca cctcgctgta caaggcgacc 660gacccgggca tcctcttcaa
cccctacgtc tcctccccgg attacaccgt ccccggcccg 720gccctcattg
ccggcgccgc cagctcgatc gcccagagca cgtcggtcgc cactgccacc
780ggcacggcca ccgttcccgg cggcggcggc gccaacccta ccgccaccac
caccgccgcc 840acctccgccg ccccgagcac caccctgagg acgaccacta
cctcggccgc gcagactacc 900gccccgccct ccggcgatgt gcagaccaag
tacggccagt gtggtggcaa cggatggacg 960ggcccgacgg tgtgcgcccc
cggctcgagc tgctccgtcc tcaacgagtg gtactcccag 1020tgtttgtaa
10292342PRTMyceliophthora thermophila 2Met Ser Lys Ala Ser Ala Leu
Leu Ala Gly Leu Thr Gly Ala Ala Leu 1 5 10 15 Val Ala Ala His Gly
His Val Ser His Ile Val Val Asn Gly Val Tyr 20 25 30 Tyr Arg Asn
Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro Pro 35 40 45 Thr
Val Ile Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly Phe Val Glu 50 55
60 Pro Asn Ser Phe Gly Thr Pro Asp Ile Ile Cys His Lys Ser Ala Thr
65 70 75 80 Pro Gly Gly Gly His Ala Thr Val Ala Ala Gly Asp Lys Ile
Asn Ile 85 90 95 Val Trp Thr Pro Glu Trp Pro Glu Ser His Ile Gly
Pro Val Ile Asp 100 105 110 Tyr Leu Ala Ala Cys Asn Gly Asp Cys Glu
Thr Val Asp Lys Ser Ser 115 120 125 Leu Arg Trp Phe Lys Ile Asp Gly
Ala Gly Tyr Asp Lys Ala Ala Gly 130 135 140 Arg Trp Ala Ala Asp Ala
Leu Arg Ala Asn Gly Asn Ser Trp Leu Val 145 150 155 160 Gln Ile Pro
Ser Asp Leu Lys Ala Gly Asn Tyr Val Leu Arg His Glu 165 170 175 Ile
Ile Ala Leu His Gly Ala Gln Ser Pro Asn Gly Ala Gln Ala Tyr 180 185
190 Pro Gln Cys Ile Asn Leu Arg Val Thr Gly Gly Gly Ser Asn Leu Pro
195 200 205 Ser Gly Val Ala Gly Thr Ser Leu Tyr Lys Ala Thr Asp Pro
Gly Ile 210 215 220 Leu Phe Asn Pro Tyr Val Ser Ser Pro Asp Tyr Thr
Val Pro Gly Pro 225 230 235 240 Ala Leu Ile Ala Gly Ala Ala Ser Ser
Ile Ala Gln Ser Thr Ser Val 245 250 255 Ala Thr Ala Thr Gly Thr Ala
Thr Val Pro Gly Gly Gly Gly Ala Asn 260 265 270 Pro Thr Ala Thr Thr
Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr 275 280 285 Leu Arg Thr
Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro Ser 290 295 300 Gly
Asp Val Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn Gly Trp Thr 305 310
315 320 Gly Pro Thr Val Cys Ala Pro Gly Ser Ser Cys Ser Val Leu Asn
Glu 325 330 335 Trp Tyr Ser Gln Cys Leu 340 3323PRTMyceliophthora
thermophila 3His Gly His Val Ser His Ile Val Val Asn Gly Val Tyr
Tyr Arg Asn 1 5 10 15 Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn
Pro Pro Thr Val Ile 20 25 30 Gly Trp Thr Ala Ala Asp Gln Asp Asn
Gly Phe Val Glu Pro Asn Ser 35 40 45 Phe Gly Thr Pro Asp Ile Ile
Cys His Lys Ser Ala Thr Pro Gly Gly 50 55 60 Gly His Ala Thr Val
Ala Ala Gly Asp Lys Ile Asn Ile Val Trp Thr 65 70 75 80 Pro Glu Trp
Pro Glu Ser His Ile Gly Pro Val Ile Asp Tyr Leu Ala 85 90 95 Ala
Cys Asn Gly Asp Cys Glu Thr Val Asp Lys Ser Ser Leu Arg Trp 100 105
110 Phe Lys Ile Asp Gly Ala Gly Tyr Asp Lys Ala Ala Gly Arg Trp Ala
115 120 125 Ala Asp Ala Leu Arg Ala Asn Gly Asn Ser Trp Leu Val Gln
Ile Pro 130 135 140 Ser Asp Leu Lys Ala Gly Asn Tyr Val Leu Arg His
Glu Ile Ile Ala 145 150 155 160 Leu His Gly Ala Gln Ser Pro Asn Gly
Ala Gln Ala Tyr Pro Gln Cys 165 170 175 Ile Asn Leu Arg Val Thr Gly
Gly Gly Ser Asn Leu Pro Ser Gly Val 180 185 190 Ala Gly Thr Ser Leu
Tyr Lys Ala Thr Asp Pro Gly Ile Leu Phe Asn 195 200 205 Pro Tyr Val
Ser Ser Pro Asp Tyr Thr Val Pro Gly Pro Ala Leu Ile 210 215 220 Ala
Gly Ala Ala Ser Ser Ile Ala Gln Ser Thr Ser Val Ala Thr Ala 225 230
235 240 Thr Gly Thr Ala Thr Val Pro Gly Gly Gly Gly Ala Asn Pro Thr
Ala 245 250 255 Thr Thr Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr
Leu Arg Thr 260 265 270 Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro
Pro Ser Gly Asp Val 275 280 285 Gln Thr Lys Tyr Gly Gln Cys Gly Gly
Asn Gly Trp Thr Gly Pro Thr 290 295 300 Val Cys Ala Pro Gly Ser Ser
Cys Ser Val Leu Asn Glu Trp Tyr Ser 305 310 315 320 Gln Cys Leu
41029DNAArtificial sequenceSynthetic polynucleotide. 4atgtccaagg
cctctgctct cctcgctggc ctgacgggcg cggccctcgt cgctgcacac 60ggccacgtca
gccacatcgt cgtcaacggc gtctactaca ggggctacga ccccacgaca
120gactggtacc agcccaaccc gccaacagtc atcggctgga cggcagccga
tcaggataat 180ggcttcgttg aacccaacag ctttggcacg ccagatatca
tctgccacaa gagcgccacc 240cccggcggcg gccacgctac cgttgctgcc
ggagacaaga tcaacatcgt ctggaccccc 300gagtggcccc actcccacat
cggccccgtc attgactacc tagccgcctg caacggtgac 360tgcgagaccg
tcgacaagtc gtcgctgcgc tggttcaaga ttgacggcgc cggctacgac
420aaggccgccg gccgctgggc cgccgacgct ctgcgcgcca acggcaacag
ctggctcgtc 480cagatcccgt cggatctcaa gcccggcaac tacgtcctcc
gccacgagat catcgccctc 540cacggtgctc agagccccaa cggcgcccag
gcgtacccgc agtgcatcaa cctccgcgtc 600accggcggcg gcagcaacct
gcccagcggc gtcgccggca cctcgctgta caaggcgacc 660gacccgggca
tcctcttcaa cccctacgtc tcctccccgg attacaccgt ccccggcccg
720gccctcattg ccggcgccgc cagctcgatc gcccagagca cgtcggtcgc
cactgccacc 780ggcacggcca ccgttcccgg cggcggcggc gccaacccta
ccgccaccac caccgccgcc 840acctccgccg ccccgagcac caccctgagg
acgaccacta cctcggccgc gcagactacc 900gccccgccct ccggcgatgt
gcagaccaag tacggccagt gtggtggcaa cggatggacg 960ggcccgacgg
tgtgcgcccc cggctcgagc tgctccgtcc tcaacgagtg gtactcccag
1020tgtttgtaa 10295342PRTArtificial sequenceSynthetic polypeptides.
5Met Ser Lys Ala Ser Ala Leu Leu Ala Gly Leu Thr Gly Ala Ala Leu 1
5 10 15 Val Ala Ala His Gly His Val Ser His Ile Val Val Asn Gly Val
Tyr 20 25 30 Tyr Arg Gly Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro
Asn Pro Pro 35 40 45 Thr Val Ile Gly Trp Thr Ala Ala Asp Gln Asp
Asn Gly Phe Val Glu 50 55 60 Pro Asn Ser Phe Gly Thr Pro Asp Ile
Ile Cys His Lys Ser Ala Thr 65 70 75 80 Pro Gly Gly Gly His Ala Thr
Val Ala Ala Gly Asp Lys Ile Asn Ile 85 90 95 Val Trp Thr Pro Glu
Trp Pro His Ser His Ile Gly Pro Val Ile Asp 100 105 110 Tyr Leu Ala
Ala Cys Asn Gly Asp Cys Glu Thr Val Asp Lys Ser Ser 115 120 125 Leu
Arg Trp Phe Lys Ile Asp Gly Ala Gly Tyr Asp Lys Ala Ala Gly 130 135
140 Arg Trp Ala Ala Asp Ala Leu Arg Ala Asn Gly Asn Ser Trp Leu Val
145 150 155 160 Gln Ile Pro Ser Asp Leu Lys Pro Gly Asn Tyr Val Leu
Arg His Glu 165 170 175 Ile Ile Ala Leu His Gly Ala Gln Ser Pro Asn
Gly Ala Gln Ala Tyr 180 185 190 Pro Gln Cys Ile Asn Leu Arg Val Thr
Gly Gly Gly Ser Asn Leu Pro 195 200 205 Ser Gly Val Ala Gly Thr Ser
Leu Tyr Lys Ala Thr Asp Pro Gly Ile 210 215 220 Leu Phe Asn Pro Tyr
Val Ser Ser Pro Asp Tyr Thr Val Pro Gly Pro 225 230 235 240 Ala Leu
Ile Ala Gly Ala Ala Ser Ser Ile Ala Gln Ser Thr Ser Val 245 250 255
Ala Thr Ala Thr Gly Thr Ala Thr Val Pro Gly Gly Gly Gly Ala Asn 260
265 270 Pro Thr Ala Thr Thr Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr
Thr 275 280 285 Leu Arg Thr Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala
Pro Pro Ser 290 295 300 Gly Asp Val Gln Thr Lys Tyr Gly Gln Cys Gly
Gly Asn Gly Trp Thr 305 310 315 320 Gly Pro Thr Val Cys Ala Pro Gly
Ser Ser Cys Ser Val Leu Asn Glu 325 330 335 Trp Tyr Ser Gln Cys Leu
340 6323PRTArtificial sequenceSynthetic polypeptides. 6His Gly His
Val Ser His Ile Val Val Asn Gly Val Tyr Tyr Arg Gly 1 5 10 15 Tyr
Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro Pro Thr Val Ile 20 25
30 Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly Phe Val Glu Pro Asn Ser
35 40 45 Phe Gly Thr Pro Asp Ile Ile Cys His Lys Ser Ala Thr Pro
Gly Gly 50 55 60 Gly His Ala Thr Val Ala Ala Gly Asp Lys Ile Asn
Ile Val Trp Thr 65 70 75 80 Pro Glu Trp Pro His Ser His Ile Gly Pro
Val Ile Asp Tyr Leu Ala 85 90 95 Ala Cys Asn Gly Asp Cys Glu Thr
Val Asp Lys Ser Ser Leu Arg Trp 100 105 110 Phe Lys Ile Asp Gly Ala
Gly Tyr Asp Lys Ala Ala Gly Arg Trp Ala 115 120 125 Ala Asp Ala Leu
Arg Ala Asn Gly Asn Ser Trp Leu Val Gln Ile Pro 130 135 140 Ser Asp
Leu Lys Pro Gly Asn Tyr Val Leu Arg His Glu Ile Ile Ala 145 150 155
160 Leu His Gly Ala Gln Ser Pro Asn Gly Ala Gln Ala Tyr Pro Gln Cys
165 170 175 Ile Asn Leu Arg Val Thr Gly Gly Gly Ser Asn Leu Pro Ser
Gly Val 180 185 190 Ala Gly Thr Ser Leu Tyr Lys Ala Thr Asp Pro Gly
Ile Leu Phe Asn 195 200 205 Pro Tyr Val Ser Ser Pro Asp Tyr Thr Val
Pro Gly Pro Ala Leu Ile 210 215 220 Ala Gly Ala Ala Ser Ser Ile Ala
Gln Ser Thr Ser Val Ala Thr Ala 225 230 235 240 Thr Gly Thr Ala Thr
Val Pro Gly Gly Gly Gly Ala Asn Pro Thr Ala 245 250 255 Thr Thr Thr
Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr Leu Arg Thr 260 265 270 Thr
Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro Ser Gly Asp Val 275 280
285 Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn Gly Trp Thr Gly Pro Thr
290 295 300 Val Cys Ala Pro Gly Ser Ser Cys Ser Val Leu Asn Glu Trp
Tyr Ser 305 310 315 320 Gln Cys Leu 7 1035 DNA Artificial
sequence Synthetic polynucleotide. 7 acacaaatgt ccaaggcctc tgctctcctc
gctggcctga cgggcgcggc cctcgtcgct 60gcacacggcc acgtcagcca catcgtcgtc
aacggcgtct actacaggaa ctacgacccc 120acgacagact ggtaccagcc
caacccgcca acagtcatcg gctggacggc agccgatcag 180gataatggct
tcgttgaacc caacagcttt ggcacgccag atatcatctg ccacaagagc
240gccacccccg gcggcggcca cgctaccgtt gctgccggag acaagatcaa
catcgtatgg 300acccccgagt ggccccactc ccacatcggc cccgtcattg
actacctagc cgcctgcaac 360ggtgactgcg agaccgtcga caagtcgtcg
ctgcgctggt tcaagattga cggcgccggc 420tacgacaagg ccgccggccg
ctgggccgcc gacgctctgc gcgccaacgg caacagctgg 480ctcgtccaga
tcccgtcgga tctcgcggcc ggcaactacg tcctccgcca cgagatcatc
540gccctccacg gtgctcagag ccccaacggc gcccaggcgt acccgcagtg
catcaacctc 600cgcgtcaccg gcggcggcag caacctgccc agcggcgtcg
ccggcacctc gctgtacaag 660gcgaccgacc cgggcatcct cttcaacccc
tacgtctcct ccccggatta caccgtcccc 720ggcccggccc tcattgccgg
cgccgccagc tcgatcgccc agagcacgtc ggtcgccact 780gccaccggca
cggccaccgt tcccggcggc ggcggcgcca accctaccgc caccaccacc
840gccgccacct ccgccgcccc gagcaccacc ctgaggacga ccactacctc
ggccgcgcag 900actaccgccc cgccctccgg cgatgtgcag accaagtacg
gccagtgtgg tggcaacgga 960tggacgggcc cgacggtgtg cgcccccggc
tcgagctgct ccgtcctcaa cgagtggtac 1020tcccagtgtt tgtaa
1035 8 342 PRT Artificial sequence Synthetic polypeptides. 8 Met Ser Lys
Ala Ser Ala Leu Leu Ala Gly Leu Thr Gly Ala Ala Leu 1 5 10 15 Val
Ala Ala His Gly His Val Ser His Ile Val Val Asn Gly Val Tyr 20 25
30 Tyr Arg Asn Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro Pro
35 40 45 Thr Val Ile Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly Phe
Val Glu 50 55 60 Pro Asn Ser Phe Gly Thr Pro Asp Ile Ile Cys His
Lys Ser Ala Thr 65 70 75 80 Pro Gly Gly Gly His Ala Thr Val Ala Ala
Gly Asp Lys Ile Asn Ile 85 90 95 Val Trp Thr Pro Glu Trp Pro His
Ser His Ile Gly Pro Val Ile Asp 100 105 110 Tyr Leu Ala Ala Cys Asn
Gly Asp Cys Glu Thr Val Asp Lys Ser Ser 115 120 125 Leu Arg Trp Phe
Lys Ile Asp Gly Ala Gly Tyr Asp Lys Ala Ala Gly 130 135 140 Arg Trp
Ala Ala Asp Ala Leu Arg Ala Asn Gly Asn Ser Trp Leu Val 145 150 155
160 Gln Ile Pro Ser Asp Leu Ala Ala Gly Asn Tyr Val Leu Arg His Glu
165 170 175 Ile Ile Ala Leu His Gly Ala Gln Ser Pro Asn Gly Ala Gln
Ala Tyr 180 185 190 Pro Gln Cys Ile Asn Leu Arg Val Thr Gly Gly Gly
Ser Asn Leu Pro 195 200 205 Ser Gly Val Ala Gly Thr Ser Leu Tyr Lys
Ala Thr Asp Pro Gly Ile 210 215 220 Leu Phe Asn Pro Tyr Val Ser Ser
Pro Asp Tyr Thr Val Pro Gly Pro 225 230 235 240 Ala Leu Ile Ala Gly
Ala Ala Ser Ser Ile Ala Gln Ser Thr Ser Val 245 250 255 Ala Thr Ala
Thr Gly Thr Ala Thr Val Pro Gly Gly Gly Gly Ala Asn 260 265 270 Pro
Thr Ala Thr Thr Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr 275 280
285 Leu Arg Thr Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro Ser
290 295 300 Gly Asp Val Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn Gly
Trp Thr 305 310 315 320 Gly Pro Thr Val Cys Ala Pro Gly Ser Ser Cys
Ser Val Leu Asn Glu 325 330 335 Trp Tyr Ser Gln Cys Leu 340
9 323 PRT Artificial sequence Synthetic polypeptides. 9 His Gly His Val
Ser His Ile Val Val Asn Gly Val Tyr Tyr Arg Asn 1 5 10 15 Tyr Asp
Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro Pro Thr Val Ile 20 25 30
Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly Phe Val Glu Pro Asn Ser 35
40 45 Phe Gly Thr Pro Asp Ile Ile Cys His Lys Ser Ala Thr Pro Gly
Gly 50 55 60 Gly His Ala Thr Val Ala Ala Gly Asp Lys Ile Asn Ile
Val Trp Thr 65 70 75 80 Pro Glu Trp Pro His Ser His Ile Gly Pro Val
Ile Asp Tyr Leu Ala 85 90 95 Ala Cys Asn Gly Asp Cys Glu Thr Val
Asp Lys Ser Ser Leu Arg Trp 100 105 110 Phe Lys Ile Asp Gly Ala Gly
Tyr Asp Lys Ala Ala Gly Arg Trp Ala 115 120 125 Ala Asp Ala Leu Arg
Ala Asn Gly Asn Ser Trp Leu Val Gln Ile Pro 130 135 140 Ser Asp Leu
Ala Ala Gly Asn Tyr Val Leu Arg His Glu Ile Ile Ala 145 150 155 160
Leu His Gly Ala Gln Ser Pro Asn Gly Ala Gln Ala Tyr Pro Gln Cys 165
170 175 Ile Asn Leu Arg Val Thr Gly Gly Gly Ser Asn Leu Pro Ser Gly
Val 180 185 190 Ala Gly Thr Ser Leu Tyr Lys Ala Thr Asp Pro Gly Ile
Leu Phe Asn 195 200 205 Pro Tyr Val Ser Ser Pro Asp Tyr Thr Val Pro
Gly Pro Ala Leu Ile 210 215 220 Ala Gly Ala Ala Ser Ser Ile Ala Gln
Ser Thr Ser Val Ala Thr Ala 225 230 235 240 Thr Gly Thr Ala Thr Val
Pro Gly Gly Gly Gly Ala Asn Pro Thr Ala 245 250 255 Thr Thr Thr Ala
Ala Thr Ser Ala Ala Pro Ser Thr Thr Leu Arg Thr 260 265 270 Thr Thr
Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro Ser Gly Asp Val 275 280 285
Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn Gly Trp Thr Gly Pro Thr 290
295 300 Val Cys Ala Pro Gly Ser Ser Cys Ser Val Leu Asn Glu Trp Tyr
Ser 305 310 315 320 Gln Cys Leu 10 1035 DNA Artificial
sequence Synthetic polynucleotide. 10 acaaacatgt ccaaggcctc
tgctctcctc gctggcctga cgggcgcggc cctcgtcgct 60gcacatggcc acgtcagcca
catcgtcgtc aacggcgtct actacaggaa ctacgacccc 120acgacagact
ggtaccagcc caacccgcca acagtcatcg gctggacggc agccgatcag
180gataatggct tcgttgaacc caacagcttt ggcacgccag atatcatctg
ccacaagagc 240gccacccccg gcggcggcca cgctaccgtt gctgccggag
acaagatcaa catccagtgg 300acccccgagt ggcccgaatc ccacatcggc
cccgtcattg actacctagc cgcctgcaac 360ggtgactgcg agaccgtcga
caagtcgtcg ctgcgctggt tcaagattga cggcgccggc 420tacgacaagg
ccgccggccg ctgggccgcc gacgctctgc gcgccaacgg caacagctgg
480ctcgtccaga tcccgtcgga tctcaaggcc ggcaactacg tcctccgcca
cgagatcatc 540gccctccacg gtgctcagag ccccaacggc gcccagaact
acccgcagtg catcaacctc 600cgcgtcaccg gcggcggcag caacctgccc
agcggcgtcg ccggcacctc gctgtacaag 660gcgaccgacc cgggcatcct
cttcaacccc tacgtctcct ccccggatta caccgtcccc 720ggcccggccc
tcattgccgg cgccgccagc tcgatcgccc agagcacgtc ggtcgccact
780gccaccggca cggccaccgt tcccggcggc ggcggcgcca accctaccgc
caccaccacc 840gccgccacct ccgccgcccc gagcaccacc ctgaggacga
ccactacctc ggccgcgcag 900actaccgccc cgccctccgg cgatgtgcag
accaagtacg gccagtgtgg tggcaacgga 960tggacgggcc cgacggtgtg
cgcccccggc tcgagctgct ccgtcctcaa cgagtggtac 1020tcccagtgtt tgtaa
1035 11 342 PRT Artificial sequence Synthetic polypeptides. 11 Met Ser
Lys Ala Ser Ala Leu Leu Ala Gly Leu Thr Gly Ala Ala Leu 1 5 10 15
Val Ala Ala His Gly His Val Ser His Ile Val Val Asn Gly Val Tyr 20
25 30 Tyr Arg Asn Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro
Pro 35 40 45 Thr Val Ile Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly
Phe Val Glu 50 55 60 Pro Asn Ser Phe Gly Thr Pro Asp Ile Ile Cys
His Lys Ser Ala Thr 65 70 75 80 Pro Gly Gly Gly His Ala Thr Val Ala
Ala Gly Asp Lys Ile Asn Ile 85 90 95 Gln Trp Thr Pro Glu Trp Pro
Glu Ser His Ile Gly Pro Val Ile Asp 100 105 110 Tyr Leu Ala Ala Cys
Asn Gly Asp Cys Glu Thr Val Asp Lys Ser Ser 115 120 125 Leu Arg Trp
Phe Lys Ile Asp Gly Ala Gly Tyr Asp Lys Ala Ala Gly 130 135 140 Arg
Trp Ala Ala Asp Ala Leu Arg Ala Asn Gly Asn Ser Trp Leu Val 145 150
155 160 Gln Ile Pro Ser Asp Leu Lys Ala Gly Asn Tyr Val Leu Arg His
Glu 165 170 175 Ile Ile Ala Leu His Gly Ala Gln Ser Pro Asn Gly Ala
Gln Asn Tyr 180 185 190 Pro Gln Cys Ile Asn Leu Arg Val Thr Gly Gly
Gly Ser Asn Leu Pro 195 200 205 Ser Gly Val Ala Gly Thr Ser Leu Tyr
Lys Ala Thr Asp Pro Gly Ile 210 215 220 Leu Phe Asn Pro Tyr Val Ser
Ser Pro Asp Tyr Thr Val Pro Gly Pro 225 230 235 240 Ala Leu Ile Ala
Gly Ala Ala Ser Ser Ile Ala Gln Ser Thr Ser Val 245 250 255 Ala Thr
Ala Thr Gly Thr Ala Thr Val Pro Gly Gly Gly Gly Ala Asn 260 265 270
Pro Thr Ala Thr Thr Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr 275
280 285 Leu Arg Thr Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro
Ser 290 295 300 Gly Asp Val Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn
Gly Trp Thr 305 310 315 320 Gly Pro Thr Val Cys Ala Pro Gly Ser Ser
Cys Ser Val Leu Asn Glu 325 330 335 Trp Tyr Ser Gln Cys Leu 340
12 342 PRT Artificial sequence Synthetic polypeptides. 12 Met Ser Lys
Ala Ser Ala Leu Leu Ala Gly Leu Thr Gly Ala Ala Leu 1 5 10 15 Val
Ala Ala His Gly His Val Ser His Ile Val Val Asn Gly Val Tyr 20 25
30 Tyr Arg Asn Tyr Asp Pro Thr Thr Asp Trp Tyr Gln Pro Asn Pro Pro
35 40 45 Thr Val Ile Gly Trp Thr Ala Ala Asp Gln Asp Asn Gly Phe
Val Glu 50 55 60 Pro Asn Ser Phe Gly Thr Pro Asp Ile Ile Cys His
Lys Ser Ala Thr 65 70 75 80 Pro Gly Gly Gly His Ala Thr Val Ala Ala
Gly Asp Lys Ile Asn Ile 85 90 95 Gln Trp Thr Pro Glu Trp Pro Glu
Ser His Ile Gly Pro Val Ile Asp 100 105 110 Tyr Leu Ala Ala Cys Asn
Gly Asp Cys Glu Thr Val Asp Lys Ser Ser 115 120 125 Leu Arg Trp Phe
Lys Ile Asp Gly Ala Gly Tyr Asp Lys Ala Ala Gly 130 135 140 Arg Trp
Ala Ala Asp Ala Leu Arg Ala Asn Gly Asn Ser Trp Leu Val 145 150 155
160 Gln Ile Pro Ser Asp Leu Lys Ala Gly Asn Tyr Val Leu Arg His Glu
165 170 175 Ile Ile Ala Leu His Gly Ala Gln Ser Pro Asn Gly Ala Gln
Asn Tyr 180 185 190 Pro Gln Cys Ile Asn Leu Arg Val Thr Gly Gly Gly
Ser Asn Leu Pro 195 200 205 Ser Gly Val Ala Gly Thr Ser Leu Tyr Lys
Ala Thr Asp Pro Gly Ile 210 215 220 Leu Phe Asn Pro Tyr Val Ser Ser
Pro Asp Tyr Thr Val Pro Gly Pro 225 230 235 240 Ala Leu Ile Ala Gly
Ala Ala Ser Ser Ile Ala Gln Ser Thr Ser Val 245 250 255 Ala Thr Ala
Thr Gly Thr Ala Thr Val Pro Gly Gly Gly Gly Ala Asn 260 265 270 Pro
Thr Ala Thr Thr Thr Ala Ala Thr Ser Ala Ala Pro Ser Thr Thr 275 280
285 Leu Arg Thr Thr Thr Thr Ser Ala Ala Gln Thr Thr Ala Pro Pro Ser
290 295 300 Gly Asp Val Gln Thr Lys Tyr Gly Gln Cys Gly Gly Asn Gly
Trp Thr 305 310 315 320 Gly Pro Thr Val Cys Ala Pro Gly Ser Ser Cys
Ser Val Leu Asn Glu 325 330 335 Trp Tyr Ser Gln Cys Leu 340
13 738 DNA Myceliophthora thermophila 13 atgaagctct ccctcttttc
cgtcctggcc actgccctca ccgtcgaggg gcatgccatc 60ttccagaagg tctccgtcaa
cggagcggac cagggctccc tcaccggcct ccgcgctccc 120aacaacaaca
accccgtgca gaatgtcaac agccaggaca tgatctgcgg ccagtcggga
180tcgacgtcga acactatcat cgaggtcaag gccggcgata ggatcggtgc
ctggtatcag 240catgtcatcg gcggtgccca gttccccaac gacccagaca
acccgattgc caagtcgcac 300aagggccccg tcatggccta cctcgccaag
gttgacaatg ccgcaaccgc cagcaagacg 360ggcctgaagt ggttcaagat
ttgggaggat acctttaatc ccagcaccaa gacctggggt 420gtcgacaacc
tcatcaacaa caacggctgg gtgtacttca acctcccgca gtgcatcgcc
480gacggcaact acctcctccg cgtcgaggtc ctcgctctgc actcggccta
ctcccagggc 540caggctcagt tctaccagtc ctgcgcccag atcaacgtat
ccggcggcgg ctccttcacg 600ccggcgtcga ctgtcagctt cccgggtgcc
tacagcgcca gcgaccccgg tatcctgatc 660aacatctacg gcgccaccgg
ccagcccgac aacaacggcc agccgtacac tgcccctggg 720cccgcgccca tctcctgc
738 14 246 PRT Myceliophthora thermophila 14 Met Lys Leu Ser Leu Phe Ser
Val Leu Ala Thr Ala Leu Thr Val Glu 1 5 10 15 Gly His Ala Ile Phe
Gln Lys Val Ser Val Asn Gly Ala Asp Gln Gly 20 25 30 Ser Leu Thr
Gly Leu Arg Ala Pro Asn Asn Asn Asn Pro Val Gln Asn 35 40 45 Val
Asn Ser Gln Asp Met Ile Cys Gly Gln Ser Gly Ser Thr Ser Asn 50 55
60 Thr Ile Ile Glu Val Lys Ala Gly Asp Arg Ile Gly Ala Trp Tyr Gln
65 70 75 80 His Val Ile Gly Gly Ala Gln Phe Pro Asn Asp Pro Asp Asn
Pro Ile 85 90 95 Ala Lys Ser His Lys Gly Pro Val Met Ala Tyr Leu
Ala Lys Val Asp 100 105 110 Asn Ala Ala Thr Ala Ser Lys Thr Gly Leu
Lys Trp Phe Lys Ile Trp 115 120 125 Glu Asp Thr Phe Asn Pro Ser Thr
Lys Thr Trp Gly Val Asp Asn Leu 130 135 140 Ile Asn Asn Asn Gly Trp
Val Tyr Phe Asn Leu Pro Gln Cys Ile Ala 145 150 155 160 Asp Gly Asn
Tyr Leu Leu Arg Val Glu Val Leu Ala Leu His Ser Ala 165 170 175 Tyr
Ser Gln Gly Gln Ala Gln Phe Tyr Gln Ser Cys Ala Gln Ile Asn 180 185
190 Val Ser Gly Gly Gly Ser Phe Thr Pro Ala Ser Thr Val Ser Phe Pro
195 200 205 Gly Ala Tyr Ser Ala Ser Asp Pro Gly Ile Leu Ile Asn Ile
Tyr Gly 210 215 220 Ala Thr Gly Gln Pro Asp Asn Asn Gly Gln Pro Tyr
Thr Ala Pro Gly 225 230 235 240 Pro Ala Pro Ile Ser Cys 245
15 227 PRT Myceliophthora thermophila 15 Ile Phe Gln Lys Val Ser Val
Asn Gly Ala Asp Gln Gly Ser Leu Thr 1 5 10 15 Gly Leu Arg Ala Pro
Asn Asn Asn Asn Pro Val Gln Asn Val Asn Ser 20 25 30 Gln Asp Met
Ile Cys Gly Gln Ser Gly Ser Thr Ser Asn Thr Ile Ile 35 40 45 Glu
Val Lys Ala Gly Asp Arg Ile Gly Ala Trp Tyr Gln His Val Ile 50 55
60 Gly Gly Ala Gln Phe Pro Asn Asp Pro Asp Asn Pro Ile Ala Lys Ser
65 70 75 80 His Lys Gly Pro Val Met Ala Tyr Leu Ala Lys Val Asp Asn
Ala Ala 85 90 95 Thr Ala Ser Lys Thr Gly Leu Lys Trp Phe Lys Ile
Trp Glu Asp Thr 100 105 110 Phe Asn Pro Ser Thr Lys Thr Trp Gly Val
Asp Asn Leu Ile Asn Asn 115 120 125 Asn Gly Trp Val Tyr Phe Asn Leu
Pro Gln Cys Ile Ala Asp Gly Asn 130 135 140 Tyr Leu Leu Arg Val Glu
Val Leu Ala Leu His Ser Ala Tyr Ser Gln 145 150 155 160 Gly Gln Ala
Gln Phe Tyr Gln Ser Cys Ala Gln Ile Asn Val Ser Gly 165 170 175 Gly
Gly Ser Phe Thr Pro Ala Ser Thr Val Ser Phe Pro Gly Ala Tyr 180 185
190 Ser Ala Ser Asp Pro Gly Ile Leu Ile Asn Ile Tyr Gly Ala Thr Gly
195 200 205 Gln Pro Asp Asn Asn Gly Gln Pro Tyr Thr Ala Pro Gly Pro
Ala Pro 210 215 220 Ile Ser Cys 225 16 762 DNA Myceliophthora
thermophila 16 atggccctcc agctcttggc gagcttggcc ctcctctcag
tgccggccct tgcccacggt 60ggcttggcca actacaccgt cggtgatact tggtacagag
gctacgaccc aaacctgccg 120ccggagacgc agctcaacca gacctggatg
atccagcggc aatgggccac catcgacccc 180gtcttcaccg tgtcggagcc
gtacctggcc tgcaacaacc cgggcgcgcc gccgccctcg 240tacatcccca
tccgcgccgg tgacaagatc acggccgtgt actggtactg gctgcacgcc
300atcgggccca tgagcgtctg gctcgcgcgg tgcggcgaca cgcccgcggc
cgactgccgc 360gacgtcgacg tcaaccgggt cggctggttc aagatctggg
agggcggcct gctggagggt 420cccaacctgg ccgaggggct ctggtaccaa
aaggacttcc agcgctggga cggctccccg 480tccctctggc ccgtcacgat
ccccaagggg ctcaagagcg ggacctacat catccggcac 540gagatcctgt
cgcttcacgt cgccctcaag ccccagtttt acccggagtg tgcgcatctg
600aatattactg ggggcggaga cttgctgcca cccgaagaga ctctggtgcg
gtttccgggg 660gtttacaaag aggacgatcc ctctatcttc atcgatgtct
actcggagga gaacgcgaac 720cggacagatt atacggttcc gggagggcca
atctgggaag gg 762 17 254 PRT Myceliophthora thermophila 17 Met Ala Leu
Gln Leu Leu Ala Ser Leu Ala Leu Leu Ser Val Pro Ala 1 5 10 15 Leu
Ala His Gly Gly Leu Ala Asn Tyr Thr Val Gly Asp Thr Trp Tyr 20 25
30 Arg Gly Tyr Asp Pro Asn Leu Pro Pro Glu Thr Gln Leu Asn Gln Thr
35 40 45 Trp Met Ile Gln Arg Gln Trp Ala Thr Ile Asp Pro Val Phe
Thr Val 50 55 60 Ser Glu Pro Tyr Leu Ala Cys Asn Asn Pro Gly Ala
Pro Pro Pro Ser 65 70 75 80 Tyr Ile Pro Ile Arg Ala Gly Asp Lys Ile
Thr Ala Val Tyr Trp Tyr 85 90 95 Trp Leu His Ala Ile Gly Pro Met
Ser Val Trp Leu Ala Arg Cys Gly 100 105 110 Asp Thr Pro Ala Ala Asp
Cys Arg Asp Val Asp Val Asn Arg Val Gly 115 120 125 Trp Phe Lys Ile
Trp Glu Gly Gly Leu Leu Glu Gly Pro Asn Leu Ala 130 135 140 Glu Gly
Leu Trp Tyr Gln Lys Asp Phe Gln Arg Trp Asp Gly Ser Pro 145 150 155
160 Ser Leu Trp Pro Val Thr Ile Pro Lys Gly Leu Lys Ser Gly Thr Tyr
165 170 175 Ile Ile Arg His Glu Ile Leu Ser Leu His Val Ala Leu Lys
Pro Gln 180 185 190 Phe Tyr Pro Glu Cys Ala His Leu Asn Ile Thr Gly
Gly Gly Asp Leu 195 200 205 Leu Pro Pro Glu Glu Thr Leu Val Arg Phe
Pro Gly Val Tyr Lys Glu 210 215 220 Asp Asp Pro Ser Ile Phe Ile Asp
Val Tyr Ser Glu Glu Asn Ala Asn 225 230 235 240 Arg Thr Asp Tyr Thr
Val Pro Gly Gly Pro Ile Trp Glu Gly 245 250 18 231 PRT Myceliophthora
thermophila 18 Asn Tyr Thr Val Gly Asp Thr Trp Tyr Arg Gly Tyr Asp
Pro Asn Leu 1 5 10 15 Pro Pro Glu Thr Gln Leu Asn Gln Thr Trp Met
Ile Gln Arg Gln Trp 20 25 30 Ala Thr Ile Asp Pro Val Phe Thr Val
Ser Glu Pro Tyr Leu Ala Cys 35 40 45 Asn Asn Pro Gly Ala Pro Pro
Pro Ser Tyr Ile Pro Ile Arg Ala Gly 50 55 60 Asp Lys Ile Thr Ala
Val Tyr Trp Tyr Trp Leu His Ala Ile Gly Pro 65 70 75 80 Met Ser Val
Trp Leu Ala Arg Cys Gly Asp Thr Pro Ala Ala Asp Cys 85 90 95 Arg
Asp Val Asp Val Asn Arg Val Gly Trp Phe Lys Ile Trp Glu Gly 100 105
110 Gly Leu Leu Glu Gly Pro Asn Leu Ala Glu Gly Leu Trp Tyr Gln Lys
115 120 125 Asp Phe Gln Arg Trp Asp Gly Ser Pro Ser Leu Trp Pro Val
Thr Ile 130 135 140 Pro Lys Gly Leu Lys Ser Gly Thr Tyr Ile Ile Arg
His Glu Ile Leu 145 150 155 160 Ser Leu His Val Ala Leu Lys Pro Gln
Phe Tyr Pro Glu Cys Ala His 165 170 175 Leu Asn Ile Thr Gly Gly Gly
Asp Leu Leu Pro Pro Glu Glu Thr Leu 180 185 190 Val Arg Phe Pro Gly
Val Tyr Lys Glu Asp Asp Pro Ser Ile Phe Ile 195 200 205 Asp Val Tyr
Ser Glu Glu Asn Ala Asn Arg Thr Asp Tyr Thr Val
Pro 210 215 220 Gly Gly Pro Ile Trp Glu Gly 225 230
19 705 DNA Myceliophthora thermophila 19 atgaaggccc tctctctcct
tgcggctgcc ggggcagtct ctgcgcatac catcttcgtc 60cagctcgaag cagacggcac
gaggtacccg gtttcgtacg ggatccggga cccaacctac 120gacggcccca
tcaccgacgt cacatccaac gacgttgctt gcaacggcgg tccgaacccg
180acgaccccct ccagcgacgt catcaccgtc accgcgggca ccaccgtcaa
ggccatctgg 240aggcacaccc tccaatccgg cccggacgat gtcatggacg
ccagccacaa gggcccgacc 300ctggcctaca tcaagaaggt cggcgatgcc
accaaggact cgggcgtcgg cggtggctgg 360ttcaagatcc aggaggacgg
ttacaacaac ggccagtggg gcaccagcac cgttatctcc 420aacggcggcg
agcactacat tgacatcccg gcctgcatcc ccgagggtca gtacctcctc
480cgcgccgaga tgatcgccct ccacgcggcc gggtcccccg gcggcgctca
gctctacatg 540gaatgtgccc agatcaacat cgtcggcggc tccggctcgg
tgcccagctc gacggtcagc 600ttccccggcg cgtatagccc caacgacccg
ggtctcctca tcaacatcta ttccatgtcg 660ccctcgagct cgtacaccat
cccgggcccg cccgttttca agtgc 705 20 235 PRT Myceliophthora thermophila
20 Met Lys Ala Leu Ser Leu Leu Ala Ala Ala Gly Ala Val Ser Ala His 1
5 10 15 Thr Ile Phe Val Gln Leu Glu Ala Asp Gly Thr Arg Tyr Pro Val
Ser 20 25 30 Tyr Gly Ile Arg Asp Pro Thr Tyr Asp Gly Pro Ile Thr
Asp Val Thr 35 40 45 Ser Asn Asp Val Ala Cys Asn Gly Gly Pro Asn
Pro Thr Thr Pro Ser 50 55 60 Ser Asp Val Ile Thr Val Thr Ala Gly
Thr Thr Val Lys Ala Ile Trp 65 70 75 80 Arg His Thr Leu Gln Ser Gly
Pro Asp Asp Val Met Asp Ala Ser His 85 90 95 Lys Gly Pro Thr Leu
Ala Tyr Ile Lys Lys Val Gly Asp Ala Thr Lys 100 105 110 Asp Ser Gly
Val Gly Gly Gly Trp Phe Lys Ile Gln Glu Asp Gly Tyr 115 120 125 Asn
Asn Gly Gln Trp Gly Thr Ser Thr Val Ile Ser Asn Gly Gly Glu 130 135
140 His Tyr Ile Asp Ile Pro Ala Cys Ile Pro Glu Gly Gln Tyr Leu Leu
145 150 155 160 Arg Ala Glu Met Ile Ala Leu His Ala Ala Gly Ser Pro
Gly Gly Ala 165 170 175 Gln Leu Tyr Met Glu Cys Ala Gln Ile Asn Ile
Val Gly Gly Ser Gly 180 185 190 Ser Val Pro Ser Ser Thr Val Ser Phe
Pro Gly Ala Tyr Ser Pro Asn 195 200 205 Asp Pro Gly Leu Leu Ile Asn
Ile Tyr Ser Met Ser Pro Ser Ser Ser 210 215 220 Tyr Thr Ile Pro Gly
Pro Pro Val Phe Lys Cys 225 230 235 21 220 PRT Myceliophthora
thermophila 21 His Thr Ile Phe Val Gln Leu Glu Ala Asp Gly Thr Arg
Tyr Pro Val 1 5 10 15 Ser Tyr Gly Ile Arg Asp Pro Thr Tyr Asp Gly
Pro Ile Thr Asp Val 20 25 30 Thr Ser Asn Asp Val Ala Cys Asn Gly
Gly Pro Asn Pro Thr Thr Pro 35 40 45 Ser Ser Asp Val Ile Thr Val
Thr Ala Gly Thr Thr Val Lys Ala Ile 50 55 60 Trp Arg His Thr Leu
Gln Ser Gly Pro Asp Asp Val Met Asp Ala Ser 65 70 75 80 His Lys Gly
Pro Thr Leu Ala Tyr Ile Lys Lys Val Gly Asp Ala Thr 85 90 95 Lys
Asp Ser Gly Val Gly Gly Gly Trp Phe Lys Ile Gln Glu Asp Gly 100 105
110 Tyr Asn Asn Gly Gln Trp Gly Thr Ser Thr Val Ile Ser Asn Gly Gly
115 120 125 Glu His Tyr Ile Asp Ile Pro Ala Cys Ile Pro Glu Gly Gln
Tyr Leu 130 135 140 Leu Arg Ala Glu Met Ile Ala Leu His Ala Ala Gly
Ser Pro Gly Gly 145 150 155 160 Ala Gln Leu Tyr Met Glu Cys Ala Gln
Ile Asn Ile Val Gly Gly Ser 165 170 175 Gly Ser Val Pro Ser Ser Thr
Val Ser Phe Pro Gly Ala Tyr Ser Pro 180 185 190 Asn Asp Pro Gly Leu
Leu Ile Asn Ile Tyr Ser Met Ser Pro Ser Ser 195 200 205 Ser Tyr Thr
Ile Pro Gly Pro Pro Val Phe Lys Cys 210 215 220
22 915 DNA Myceliophthora thermophila 22 atgaagtcgt ctaccccggc
cttgttcgcc gctgggctcc ttgctcagca tgctgcggcc 60cactccatct tccagcaggc
gagcagcggc tcgaccgact ttgatacgct gtgcacccgg 120atgccgccca
acaatagccc cgtcactagt gtgaccagcg gcgacatgac ctgcaaagtc
180ggcggcacca agggggtgtc cggcttctgc gaggtgaacg ccggcgacga
gttcacggtt 240gagatgcacg cgcagcccgg cgaccgctcg tgcgccaacg
aggccatcgg cgggaaccac 300ttcggcccgg tcctcatcta catgagcaag
gtcgacgacg cctccaccgc cgacgggtcc 360ggcgactggt tcaaggtgga
cgagttcggc tacgacgcaa gcaccaagac ctggggcacc 420gacaagctca
acgagaactg cggcaagcgc accttcaaca tccccagcca catccccgcg
480ggcgactatc tcgtccgggc cgaggctatc gcgctacaca ctgccaacca
gccaggcggc 540gcgcagttct acatgagctg ctatcaagtc aggatttccg
gcggcgaagg gggccagctg 600cctgccggag tcaagatccc gggcgcgtac
agtgccaacg accccggcat ccttgtcgac 660atctggggta acgatttcaa
cgaccctcca ggacactcgg cccgtcacgc catcatcatc 720atcagcagca
gcagcaacaa cagcggcgcc aagatgacca agaagatcca ggagcccacc
780atcacatcgg tcacggacct ccccaccgac gaggccaagt ggatcgcgct
ccaaaagatc 840tcgtacgtgg accagacggg cacggcgcgg acatacgagc
cggcgtcgcg caagacgcgg 900tcgccaagag tctag 91523304PRTMyceliophthora
thermophila 23Met Lys Ser Ser Thr Pro Ala Leu Phe Ala Ala Gly Leu
Leu Ala Gln 1 5 10 15 His Ala Ala Ala His Ser Ile Phe Gln Gln Ala
Ser Ser Gly Ser Thr 20 25 30 Asp Phe Asp Thr Leu Cys Thr Arg Met
Pro Pro Asn Asn Ser Pro Val 35 40 45 Thr Ser Val Thr Ser Gly Asp
Met Thr Cys Lys Val Gly Gly Thr Lys 50 55 60 Gly Val Ser Gly Phe
Cys Glu Val Asn Ala Gly Asp Glu Phe Thr Val 65 70 75 80 Glu Met His
Ala Gln Pro Gly Asp Arg Ser Cys Ala Asn Glu Ala Ile 85 90 95 Gly
Gly Asn His Phe Gly Pro Val Leu Ile Tyr Met Ser Lys Val Asp 100 105
110 Asp Ala Ser Thr Ala Asp Gly Ser Gly Asp Trp Phe Lys Val Asp Glu
115 120 125 Phe Gly Tyr Asp Ala Ser Thr Lys Thr Trp Gly Thr Asp Lys
Leu Asn 130 135 140 Glu Asn Cys Gly Lys Arg Thr Phe Asn Ile Pro Ser
His Ile Pro Ala 145 150 155 160 Gly Asp Tyr Leu Val Arg Ala Glu Ala
Ile Ala Leu His Thr Ala Asn 165 170 175 Gln Pro Gly Gly Ala Gln Phe
Tyr Met Ser Cys Tyr Gln Val Arg Ile 180 185 190 Ser Gly Gly Glu Gly
Gly Gln Leu Pro Ala Gly Val Lys Ile Pro Gly 195 200 205 Ala Tyr Ser
Ala Asn Asp Pro Gly Ile Leu Val Asp Ile Trp Gly Asn 210 215 220 Asp
Phe Asn Asp Pro Pro Gly His Ser Ala Arg His Ala Ile Ile Ile 225 230
235 240 Ile Ser Ser Ser Ser Asn Asn Ser Gly Ala Lys Met Thr Lys Lys
Ile 245 250 255 Gln Glu Pro Thr Ile Thr Ser Val Thr Asp Leu Pro Thr
Asp Glu Ala 260 265 270 Lys Trp Ile Ala Leu Gln Lys Ile Ser Tyr Val
Asp Gln Thr Gly Thr 275 280 285 Ala Arg Thr Tyr Glu Pro Ala Ser Arg
Lys Thr Arg Ser Pro Arg Val 290 295 300 24 284 PRT Myceliophthora
thermophila 24 His Ser Ile Phe Gln Gln Ala Ser Ser Gly Ser Thr Asp
Phe Asp Thr 1 5 10 15 Leu Cys Thr Arg Met Pro Pro Asn Asn Ser Pro
Val Thr Ser Val Thr 20 25 30 Ser Gly Asp Met Thr Cys Lys Val Gly
Gly Thr Lys Gly Val Ser Gly 35 40 45 Phe Cys Glu Val Asn Ala Gly
Asp Glu Phe Thr Val Glu Met His Ala 50 55 60 Gln Pro Gly Asp Arg
Ser Cys Ala Asn Glu Ala Ile Gly Gly Asn His 65 70 75 80 Phe Gly Pro
Val Leu Ile Tyr Met Ser Lys Val Asp Asp Ala Ser Thr 85 90 95 Ala
Asp Gly Ser Gly Asp Trp Phe Lys Val Asp Glu Phe Gly Tyr Asp 100 105
110 Ala Ser Thr Lys Thr Trp Gly Thr Asp Lys Leu Asn Glu Asn Cys Gly
115 120 125 Lys Arg Thr Phe Asn Ile Pro Ser His Ile Pro Ala Gly Asp
Tyr Leu 130 135 140 Val Arg Ala Glu Ala Ile Ala Leu His Thr Ala Asn
Gln Pro Gly Gly 145 150 155 160 Ala Gln Phe Tyr Met Ser Cys Tyr Gln
Val Arg Ile Ser Gly Gly Glu 165 170 175 Gly Gly Gln Leu Pro Ala Gly
Val Lys Ile Pro Gly Ala Tyr Ser Ala 180 185 190 Asn Asp Pro Gly Ile
Leu Val Asp Ile Trp Gly Asn Asp Phe Asn Asp 195 200 205 Pro Pro Gly
His Ser Ala Arg His Ala Ile Ile Ile Ile Ser Ser Ser 210 215 220 Ser
Asn Asn Ser Gly Ala Lys Met Thr Lys Lys Ile Gln Glu Pro Thr 225 230
235 240 Ile Thr Ser Val Thr Asp Leu Pro Thr Asp Glu Ala Lys Trp Ile
Ala 245 250 255 Leu Gln Lys Ile Ser Tyr Val Asp Gln Thr Gly Thr Ala
Arg Thr Tyr 260 265 270 Glu Pro Ala Ser Arg Lys Thr Arg Ser Pro Arg
Val 275 280 25 726 DNA Myceliophthora thermophila 25 atgaagtcgt
ctaccccggc cttgttcgcc gctgggctcc ttgctcagca tgctgcggcc 60cactccatct
tccagcaggc gagcagcggc tcgaccgact ttgatacgct gtgcacccgg
120atgccgccca acaatagccc cgtcactagt gtgaccagcg gcgacatgac
ctgcaacgtc 180ggcggcacca agggggtgtc gggcttctgc gaggtgaacg
ccggcgacga gttcacggtt 240gagatgcacg cgcagcccgg cgaccgctcg
tgcgccaacg aggccatcgg cgggaaccac 300ttcggcccgg tcctcatcta
catgagcaag gtcgacgacg cctccactgc cgacgggtcc 360ggcgactggt
tcaaggtgga cgagttcggc tacgacgcaa gcaccaagac ctggggcacc
420gacaagctca acgagaactg cggcaagcgc accttcaaca tccccagcca
catccccgcg 480ggcgactatc tcgtccgggc cgaggctatc gcgctacaca
ctgccaacca gccaggcggc 540gcgcagttct acatgagctg ctatcaagtc
aggatttccg gcggcgaagg gggccagctg 600cctgccggag tcaagatccc
gggcgcgtac agtgccaacg accccggcat ccttgtcgac 660atctggggta
acgatttcaa cgagtacgtt attccgggcc ccccggtcat cgacagcagc 720tacttc
726 26 242 PRT Myceliophthora thermophila 26 Met Lys Ser Ser Thr Pro Ala
Leu Phe Ala Ala Gly Leu Leu Ala Gln 1 5 10 15 His Ala Ala Ala His
Ser Ile Phe Gln Gln Ala Ser Ser Gly Ser Thr 20 25 30 Asp Phe Asp
Thr Leu Cys Thr Arg Met Pro Pro Asn Asn Ser Pro Val 35 40 45 Thr
Ser Val Thr Ser Gly Asp Met Thr Cys Asn Val Gly Gly Thr Lys 50 55
60 Gly Val Ser Gly Phe Cys Glu Val Asn Ala Gly Asp Glu Phe Thr Val
65 70 75 80 Glu Met His Ala Gln Pro Gly Asp Arg Ser Cys Ala Asn Glu
Ala Ile 85 90 95 Gly Gly Asn His Phe Gly Pro Val Leu Ile Tyr Met
Ser Lys Val Asp 100 105 110 Asp Ala Ser Thr Ala Asp Gly Ser Gly Asp
Trp Phe Lys Val Asp Glu 115 120 125 Phe Gly Tyr Asp Ala Ser Thr Lys
Thr Trp Gly Thr Asp Lys Leu Asn 130 135 140 Glu Asn Cys Gly Lys Arg
Thr Phe Asn Ile Pro Ser His Ile Pro Ala 145 150 155 160 Gly Asp Tyr
Leu Val Arg Ala Glu Ala Ile Ala Leu His Thr Ala Asn 165 170 175 Gln
Pro Gly Gly Ala Gln Phe Tyr Met Ser Cys Tyr Gln Val Arg Ile 180 185
190 Ser Gly Gly Glu Gly Gly Gln Leu Pro Ala Gly Val Lys Ile Pro Gly
195 200 205 Ala Tyr Ser Ala Asn Asp Pro Gly Ile Leu Val Asp Ile Trp
Gly Asn 210 215 220 Asp Phe Asn Glu Tyr Val Ile Pro Gly Pro Pro Val
Ile Asp Ser Ser 225 230 235 240 Tyr Phe 27 222 PRT Myceliophthora
thermophila 27 His Ser Ile Phe Gln Gln Ala Ser Ser Gly Ser Thr Asp
Phe Asp Thr 1 5 10 15 Leu Cys Thr Arg Met Pro Pro Asn Asn Ser Pro
Val Thr Ser Val Thr 20 25 30 Ser Gly Asp Met Thr Cys Asn Val Gly
Gly Thr Lys Gly Val Ser Gly 35 40 45 Phe Cys Glu Val Asn Ala Gly
Asp Glu Phe Thr Val Glu Met His Ala 50 55 60 Gln Pro Gly Asp Arg
Ser Cys Ala Asn Glu Ala Ile Gly Gly Asn His 65 70 75 80 Phe Gly Pro
Val Leu Ile Tyr Met Ser Lys Val Asp Asp Ala Ser Thr 85 90 95 Ala
Asp Gly Ser Gly Asp Trp Phe Lys Val Asp Glu Phe Gly Tyr Asp 100 105
110 Ala Ser Thr Lys Thr Trp Gly Thr Asp Lys Leu Asn Glu Asn Cys Gly
115 120 125 Lys Arg Thr Phe Asn Ile Pro Ser His Ile Pro Ala Gly Asp
Tyr Leu 130 135 140 Val Arg Ala Glu Ala Ile Ala Leu His Thr Ala Asn
Gln Pro Gly Gly 145 150 155 160 Ala Gln Phe Tyr Met Ser Cys Tyr Gln
Val Arg Ile Ser Gly Gly Glu 165 170 175 Gly Gly Gln Leu Pro Ala Gly
Val Lys Ile Pro Gly Ala Tyr Ser Ala 180 185 190 Asn Asp Pro Gly Ile
Leu Val Asp Ile Trp Gly Asn Asp Phe Asn Glu 195 200 205 Tyr Val Ile
Pro Gly Pro Pro Val Ile Asp Ser Ser Tyr Phe 210 215 220
28 969 DNA Myceliophthora thermophila 28 atgaagtcct tcaccctcac
cactctggcc gccctggctg gcaacgccgc cgctcacgcg 60accttccagg ccctctgggt
cgacggcgtc gactacggcg cgcagtgtgc ccgtctgccc 120gcgtccaact
cgccggtcac cgacgtgacc tccaacgcga tccgctgcaa cgccaacccc
180tcgcccgctc ggggcaagtg cccggtcaag gccggctcga ccgttacggt
cgagatgcat 240cagcaacccg gtgaccgctc gtgcagcagc gaggcgatcg
gcggggcgca ctacggcccc 300gtgatggtgt acatgtccaa ggtgtcggac
gcggcgtcgg cggacgggtc gtcgggctgg 360ttcaaggtgt tcgaggacgg
ctgggccaag aacccgtccg gcgggtcggg cgacgacgac 420tactggggca
ccaaggacct gaactcgtgc tgcgggaaga tgaacgtcaa gatccccgcc
480gacctgccct cgggcgacta cctgctccgg gccgaggccc tcgcgctgca
cacggccggc 540agcgcgggcg gcgcccagtt ctacatgacc tgctaccagc
tcaccgtgac cggctccggc 600agcgccagcc cgcccaccgt ctccttcccg
ggcgcctaca aggccaccga cccgggcatc 660ctcgtcaaca tccacgcccc
gctgtccggc tacaccgtgc ccggcccggc cgtctactcg 720ggcggctcca
ccaagaaggc cggcagcgcc tgcaccggct gcgagtccac ttgcgccgtc
780ggctccggcc ccaccgccac cgtctcccag tcgcccggtt ccaccgccac
ctcggccccc 840ggcggcggcg gcggctgcac cgtccagaag taccagcagt
gcggcggcca gggctacacc 900ggctgcacca actgcgcgtc cggctccacc
tgcagcgcgg tctcgccgcc ctactactcg 960cagtgcgtc
969 29 323 PRT Myceliophthora thermophila 29 Met Lys Ser Phe Thr Leu Thr
Thr Leu Ala Ala Leu Ala Gly Asn Ala 1 5 10 15 Ala Ala His Ala Thr
Phe Gln Ala Leu Trp Val Asp Gly Val Asp Tyr 20 25 30 Gly Ala Gln
Cys Ala Arg Leu Pro Ala Ser Asn Ser Pro Val Thr Asp 35 40 45 Val
Thr Ser Asn Ala Ile Arg Cys Asn Ala Asn Pro Ser Pro Ala Arg 50 55
60 Gly Lys Cys Pro Val Lys Ala Gly Ser Thr Val Thr Val Glu Met His
65 70 75 80 Gln Gln Pro Gly Asp Arg Ser Cys Ser Ser Glu Ala Ile Gly
Gly Ala 85 90 95 His Tyr Gly Pro Val Met Val Tyr Met Ser Lys Val
Ser Asp Ala Ala 100 105 110 Ser Ala Asp Gly Ser Ser Gly Trp Phe Lys
Val Phe Glu Asp Gly Trp 115 120 125 Ala Lys Asn Pro Ser Gly Gly Ser
Gly Asp Asp Asp Tyr Trp Gly Thr 130 135 140 Lys Asp Leu Asn Ser Cys
Cys Gly Lys Met Asn Val Lys Ile Pro Ala 145 150 155 160 Asp Leu Pro
Ser Gly Asp Tyr Leu Leu Arg Ala Glu Ala Leu Ala Leu 165 170 175 His
Thr Ala Gly Ser Ala Gly Gly Ala Gln Phe Tyr Met Thr Cys Tyr 180 185
190 Gln Leu Thr Val Thr Gly Ser Gly
Ser Ala Ser Pro Pro Thr Val Ser 195 200 205 Phe Pro Gly Ala Tyr Lys
Ala Thr Asp Pro Gly Ile Leu Val Asn Ile 210 215 220 His Ala Pro Leu
Ser Gly Tyr Thr Val Pro Gly Pro Ala Val Tyr Ser 225 230 235 240 Gly
Gly Ser Thr Lys Lys Ala Gly Ser Ala Cys Thr Gly Cys Glu Ser 245 250
255 Thr Cys Ala Val Gly Ser Gly Pro Thr Ala Thr Val Ser Gln Ser Pro
260 265 270 Gly Ser Thr Ala Thr Ser Ala Pro Gly Gly Gly Gly Gly Cys
Thr Val 275 280 285 Gln Lys Tyr Gln Gln Cys Gly Gly Gln Gly Tyr Thr
Gly Cys Thr Asn 290 295 300 Cys Ala Ser Gly Ser Thr Cys Ser Ala Val
Ser Pro Pro Tyr Tyr Ser 305 310 315 320 Gln Cys Val
30 305 PRT Myceliophthora thermophila 30 His Ala Thr Phe Gln Ala Leu
Trp Val Asp Gly Val Asp Tyr Gly Ala 1 5 10 15 Gln Cys Ala Arg Leu
Pro Ala Ser Asn Ser Pro Val Thr Asp Val Thr 20 25 30 Ser Asn Ala
Ile Arg Cys Asn Ala Asn Pro Ser Pro Ala Arg Gly Lys 35 40 45 Cys
Pro Val Lys Ala Gly Ser Thr Val Thr Val Glu Met His Gln Gln 50 55
60 Pro Gly Asp Arg Ser Cys Ser Ser Glu Ala Ile Gly Gly Ala His Tyr
65 70 75 80 Gly Pro Val Met Val Tyr Met Ser Lys Val Ser Asp Ala Ala
Ser Ala 85 90 95 Asp Gly Ser Ser Gly Trp Phe Lys Val Phe Glu Asp
Gly Trp Ala Lys 100 105 110 Asn Pro Ser Gly Gly Ser Gly Asp Asp Asp
Tyr Trp Gly Thr Lys Asp 115 120 125 Leu Asn Ser Cys Cys Gly Lys Met
Asn Val Lys Ile Pro Ala Asp Leu 130 135 140 Pro Ser Gly Asp Tyr Leu
Leu Arg Ala Glu Ala Leu Ala Leu His Thr 145 150 155 160 Ala Gly Ser
Ala Gly Gly Ala Gln Phe Tyr Met Thr Cys Tyr Gln Leu 165 170 175 Thr
Val Thr Gly Ser Gly Ser Ala Ser Pro Pro Thr Val Ser Phe Pro 180 185
190 Gly Ala Tyr Lys Ala Thr Asp Pro Gly Ile Leu Val Asn Ile His Ala
195 200 205 Pro Leu Ser Gly Tyr Thr Val Pro Gly Pro Ala Val Tyr Ser
Gly Gly 210 215 220 Ser Thr Lys Lys Ala Gly Ser Ala Cys Thr Gly Cys
Glu Ser Thr Cys 225 230 235 240 Ala Val Gly Ser Gly Pro Thr Ala Thr
Val Ser Gln Ser Pro Gly Ser 245 250 255 Thr Ala Thr Ser Ala Pro Gly
Gly Gly Gly Gly Cys Thr Val Gln Lys 260 265 270 Tyr Gln Gln Cys Gly
Gly Gln Gly Tyr Thr Gly Cys Thr Asn Cys Ala 275 280 285 Ser Gly Ser
Thr Cys Ser Ala Val Ser Pro Pro Tyr Tyr Ser Gln Cys 290 295 300 Val
305 31 870 DNA Myceliophthora thermophila 31 atgaagggac tcctcggcgc
cgccgccctc tcgctggccg tcagcgatgt ctcggcccac 60tacatctttc agcagctgac
gacgggcggc gtcaagcacg ctgtgtacca gtacatccgc 120aagaacacca
actataactc gcccgtgacc gatctgacgt ccaacgacct ccgctgcaat
180gtgggtgcta ccggtgcggg caccgatacc gtcacggtgc gcgccggcga
ttcgttcacc 240ttcacgaccg atacgcccgt ttaccaccag ggcccgacct
cgatctacat gtccaaggcc 300cccggcagcg cgtccgacta cgacggcagc
ggcggctggt tcaagatcaa ggactgggct 360gactacaccg ccacgattcc
ggaatgtatt ccccccggcg actacctgct tcgcatccag 420caactcggca
tccacaaccc ttggcccgcg ggcatccccc agttctacat ctcttgtgcc
480cagatcaccg tgactggtgg cggcagtgcc aaccccggcc cgaccgtctc
catcccaggc 540gccttcaagg agaccgaccc gggctacact gtcaacatct
acaacaactt ccacaactac 600accgtccctg gcccagccgt cttcacctgc
aacggtagcg gcggcaacaa cggcggcggc 660tccaacccag tcaccaccac
caccaccacc accaccaggc cgtccaccag caccgcccag 720tcccagccgt
cgtcgagccc gaccagcccc tccagctgca ccgtcgcgaa gtggggccag
780tgcggaggac agggttacag cggctgcacc gtgtgcgcgg ccgggtcgac
ctgccagaag 840accaacgact actacagcca gtgcttgtag
870 32 289 PRT Myceliophthora thermophila 32 Met Lys Gly Leu Leu Gly Ala
Ala Ala Leu Ser Leu Ala Val Ser Asp 1 5 10 15 Val Ser Ala His Tyr
Ile Phe Gln Gln Leu Thr Thr Gly Gly Val Lys 20 25 30 His Ala Val
Tyr Gln Tyr Ile Arg Lys Asn Thr Asn Tyr Asn Ser Pro 35 40 45 Val
Thr Asp Leu Thr Ser Asn Asp Leu Arg Cys Asn Val Gly Ala Thr 50 55
60 Gly Ala Gly Thr Asp Thr Val Thr Val Arg Ala Gly Asp Ser Phe Thr
65 70 75 80 Phe Thr Thr Asp Thr Pro Val Tyr His Gln Gly Pro Thr Ser
Ile Tyr 85 90 95 Met Ser Lys Ala Pro Gly Ser Ala Ser Asp Tyr Asp
Gly Ser Gly Gly 100 105 110 Trp Phe Lys Ile Lys Asp Trp Ala Asp Tyr
Thr Ala Thr Ile Pro Glu 115 120 125 Cys Ile Pro Pro Gly Asp Tyr Leu
Leu Arg Ile Gln Gln Leu Gly Ile 130 135 140 His Asn Pro Trp Pro Ala
Gly Ile Pro Gln Phe Tyr Ile Ser Cys Ala 145 150 155 160 Gln Ile Thr
Val Thr Gly Gly Gly Ser Ala Asn Pro Gly Pro Thr Val 165 170 175 Ser
Ile Pro Gly Ala Phe Lys Glu Thr Asp Pro Gly Tyr Thr Val Asn 180 185
190 Ile Tyr Asn Asn Phe His Asn Tyr Thr Val Pro Gly Pro Ala Val Phe
195 200 205 Thr Cys Asn Gly Ser Gly Gly Asn Asn Gly Gly Gly Ser Asn
Pro Val 210 215 220 Thr Thr Thr Thr Thr Thr Thr Thr Arg Pro Ser Thr
Ser Thr Ala Gln 225 230 235 240 Ser Gln Pro Ser Ser Ser Pro Thr Ser
Pro Ser Ser Cys Thr Val Ala 245 250 255 Lys Trp Gly Gln Cys Gly Gly
Gln Gly Tyr Ser Gly Cys Thr Val Cys 260 265 270 Ala Ala Gly Ser Thr
Cys Gln Lys Thr Asn Asp Tyr Tyr Ser Gln Cys 275 280 285 Leu
33270PRTMyceliophthora thermophila 33His Tyr Ile Phe Gln Gln Leu
Thr Thr Gly Gly Val Lys His Ala Val 1 5 10 15 Tyr Gln Tyr Ile Arg
Lys Asn Thr Asn Tyr Asn Ser Pro Val Thr Asp 20 25 30 Leu Thr Ser
Asn Asp Leu Arg Cys Asn Val Gly Ala Thr Gly Ala Gly 35 40 45 Thr
Asp Thr Val Thr Val Arg Ala Gly Asp Ser Phe Thr Phe Thr Thr 50 55
60 Asp Thr Pro Val Tyr His Gln Gly Pro Thr Ser Ile Tyr Met Ser Lys
65 70 75 80 Ala Pro Gly Ser Ala Ser Asp Tyr Asp Gly Ser Gly Gly Trp
Phe Lys 85 90 95 Ile Lys Asp Trp Ala Asp Tyr Thr Ala Thr Ile Pro
Glu Cys Ile Pro 100 105 110 Pro Gly Asp Tyr Leu Leu Arg Ile Gln Gln
Leu Gly Ile His Asn Pro 115 120 125 Trp Pro Ala Gly Ile Pro Gln Phe
Tyr Ile Ser Cys Ala Gln Ile Thr 130 135 140 Val Thr Gly Gly Gly Ser
Ala Asn Pro Gly Pro Thr Val Ser Ile Pro 145 150 155 160 Gly Ala Phe
Lys Glu Thr Asp Pro Gly Tyr Thr Val Asn Ile Tyr Asn 165 170 175 Asn
Phe His Asn Tyr Thr Val Pro Gly Pro Ala Val Phe Thr Cys Asn 180 185
190 Gly Ser Gly Gly Asn Asn Gly Gly Gly Ser Asn Pro Val Thr Thr Thr
195 200 205 Thr Thr Thr Thr Thr Arg Pro Ser Thr Ser Thr Ala Gln Ser
Gln Pro 210 215 220 Ser Ser Ser Pro Thr Ser Pro Ser Ser Cys Thr Val
Ala Lys Trp Gly 225 230 235 240 Gln Cys Gly Gly Gln Gly Tyr Ser Gly
Cys Thr Val Cys Ala Ala Gly 245 250 255 Ser Thr Cys Gln Lys Thr Asn
Asp Tyr Tyr Ser Gln Cys Leu 260 265 270 34834DNAMyceliophthora
thermophila 34ctgacgacgg gcggcgtcaa gcacgctgtg taccagtaca
tccgcaagaa caccaactat 60aactcgcccg tgaccgatct gacgtccaac gacctccgct
gcaatgtggg tgctaccggt 120gcgggcaccg ataccgtcac ggtgcgcgcc
ggcgattcgt tcaccttcac gaccgatacg 180cccgtttacc accagggccc
gacctcgatc tacatgtcca aggcccccgg cagcgcgtcc 240gactacgacg
gcagcggcgg ctggttcaag atcaaggact ggggtgccga ctttagcagc
300ggccaggcca cctggacctt ggcgtctgac tacaccgcca cgattccgga
atgtattccc 360cccggcgact acctgcttcg catccagcaa ctcggcatcc
acaacccttg gcccgcgggc 420atcccccagt tctacatctc ttgtgcccag
atcaccgtga ctggtggcgg cagtgccaac 480cccggcccga ccgtctccat
cccaggcgcc ttcaaggaga ccgacccggg ctacactgtc 540aacatctaca
acaacttcca caactacacc gtccctggcc cagccgtctt cacctgcaac
600ggtagcggcg gcaacaacgg cggcggctcc aacccagtca ccaccaccac
caccaccacc 660accaggccgt ccaccagcac cgcccagtcc cagccgtcgt
cgagcccgac cagcccctcc 720agctgcaccg tcgcgaagtg gggccagtgc
ggaggacagg gttacagcgg ctgcaccgtg 780tgcgcggccg ggtcgacctg
ccagaagacc aacgactact acagccagtg cttg 83435303PRTMyceliophthora
thermophila 35Met Lys Gly Leu Leu Gly Ala Ala Ala Leu Ser Leu Ala
Val Ser Asp 1 5 10 15 Val Ser Ala His Tyr Ile Phe Gln Gln Leu Thr
Thr Gly Gly Val Lys 20 25 30 His Ala Val Tyr Gln Tyr Ile Arg Lys
Asn Thr Asn Tyr Asn Ser Pro 35 40 45 Val Thr Asp Leu Thr Ser Asn
Asp Leu Arg Cys Asn Val Gly Ala Thr 50 55 60 Gly Ala Gly Thr Asp
Thr Val Thr Val Arg Ala Gly Asp Ser Phe Thr 65 70 75 80 Phe Thr Thr
Asp Thr Pro Val Tyr His Gln Gly Pro Thr Ser Ile Tyr 85 90 95 Met
Ser Lys Ala Pro Gly Ser Ala Ser Asp Tyr Asp Gly Ser Gly Gly 100 105
110 Trp Phe Lys Ile Lys Asp Trp Gly Ala Asp Phe Ser Ser Gly Gln Ala
115 120 125 Thr Trp Thr Leu Ala Ser Asp Tyr Thr Ala Thr Ile Pro Glu
Cys Ile 130 135 140 Pro Pro Gly Asp Tyr Leu Leu Arg Ile Gln Gln Leu
Gly Ile His Asn 145 150 155 160 Pro Trp Pro Ala Gly Ile Pro Gln Phe
Tyr Ile Ser Cys Ala Gln Ile 165 170 175 Thr Val Thr Gly Gly Gly Ser
Ala Asn Pro Gly Pro Thr Val Ser Ile 180 185 190 Pro Gly Ala Phe Lys
Glu Thr Asp Pro Gly Tyr Thr Val Asn Ile Tyr 195 200 205 Asn Asn Phe
His Asn Tyr Thr Val Pro Gly Pro Ala Val Phe Thr Cys 210 215 220 Asn
Gly Ser Gly Gly Asn Asn Gly Gly Gly Ser Asn Pro Val Thr Thr 225 230
235 240 Thr Thr Thr Thr Thr Thr Arg Pro Ser Thr Ser Thr Ala Gln Ser
Gln 245 250 255 Pro Ser Ser Ser Pro Thr Ser Pro Ser Ser Cys Thr Val
Ala Lys Trp 260 265 270 Gly Gln Cys Gly Gly Gln Gly Tyr Ser Gly Cys
Thr Val Cys Ala Ala 275 280 285 Gly Ser Thr Cys Gln Lys Thr Asn Asp
Tyr Tyr Ser Gln Cys Leu 290 295 300 36284PRTMyceliophthora
thermophila 36His Tyr Ile Phe Gln Gln Leu Thr Thr Gly Gly Val Lys
His Ala Val 1 5 10 15 Tyr Gln Tyr Ile Arg Lys Asn Thr Asn Tyr Asn
Ser Pro Val Thr Asp 20 25 30 Leu Thr Ser Asn Asp Leu Arg Cys Asn
Val Gly Ala Thr Gly Ala Gly 35 40 45 Thr Asp Thr Val Thr Val Arg
Ala Gly Asp Ser Phe Thr Phe Thr Thr 50 55 60 Asp Thr Pro Val Tyr
His Gln Gly Pro Thr Ser Ile Tyr Met Ser Lys 65 70 75 80 Ala Pro Gly
Ser Ala Ser Asp Tyr Asp Gly Ser Gly Gly Trp Phe Lys 85 90 95 Ile
Lys Asp Trp Gly Ala Asp Phe Ser Ser Gly Gln Ala Thr Trp Thr 100 105
110 Leu Ala Ser Asp Tyr Thr Ala Thr Ile Pro Glu Cys Ile Pro Pro Gly
115 120 125 Asp Tyr Leu Leu Arg Ile Gln Gln Leu Gly Ile His Asn Pro
Trp Pro 130 135 140 Ala Gly Ile Pro Gln Phe Tyr Ile Ser Cys Ala Gln
Ile Thr Val Thr 145 150 155 160 Gly Gly Gly Ser Ala Asn Pro Gly Pro
Thr Val Ser Ile Pro Gly Ala 165 170 175 Phe Lys Glu Thr Asp Pro Gly
Tyr Thr Val Asn Ile Tyr Asn Asn Phe 180 185 190 His Asn Tyr Thr Val
Pro Gly Pro Ala Val Phe Thr Cys Asn Gly Ser 195 200 205 Gly Gly Asn
Asn Gly Gly Gly Ser Asn Pro Val Thr Thr Thr Thr Thr 210 215 220 Thr
Thr Thr Arg Pro Ser Thr Ser Thr Ala Gln Ser Gln Pro Ser Ser 225 230
235 240 Ser Pro Thr Ser Pro Ser Ser Cys Thr Val Ala Lys Trp Gly Gln
Cys 245 250 255 Gly Gly Gln Gly Tyr Ser Gly Cys Thr Val Cys Ala Ala
Gly Ser Thr 260 265 270 Cys Gln Lys Thr Asn Asp Tyr Tyr Ser Gln Cys
Leu 275 280 371038DNAMyceliophthora thermophila 37atgtcttcct
tcacctccaa gggtctcctt tccgccctca tgggcgcggc aacggttgcc 60gcccacggtc
acgtcaccaa catcgtcatc aacggcgtct cataccagaa cttcgaccca
120ttcacgcacc cttatatgca gaaccctccg acggttgtcg gctggaccgc
gagcaacacg 180gacaacggct tcgtcggccc cgagtccttc tctagcccgg
acatcatctg ccacaagtcc 240gccaccaacg ctggcggcca tgccgtcgtc
gcggccggcg ataaggtctt catccagtgg 300gacacctggc ccgagtcgca
ccacggtccg gtcatcgact atctcgccga ctgcggcgac 360gcgggctgcg
agaaggtcga caagaccacg ctcaagttct tcaagatcag cgagtccggc
420ctgctcgacg gcactaacgc ccccggcaag tgggcgtccg acacgctgat
cgccaacaac 480aactcgtggc tggtccagat cccgcccaac atcgccccgg
gcaactacgt cctgcgccac 540gagatcatcg ccctgcacag cgccggccag
cagaacggcg cccagaacta ccctcagtgc 600ttcaacctgc aggtcaccgg
ctccggcact cagaagccct ccggcgtcct cggcaccgag 660ctctacaagg
ccaccgacgc cggcatcctg gccaacatct acacctcgcc cgtcacctac
720cagatccccg gcccggccat catctcgggc gcctccgccg tccagcagac
cacctcggcc 780atcaccgcct ctgctagcgc catcaccggc tccgctaccg
ccgcgcccac ggctgccacc 840accaccgccg ccgccgccgc caccactacc
accaccgctg gctccggtgc taccgccacg 900ccctcgaccg gcggctctcc
ttcttccgcc cagcctgctc ctaccaccgc tgccgctacc 960tccagccctg
ctcgcccgac ccgctgcgct ggtctgaaga agcgccgtcg ccacgcccgt
1020gacgtcaagg ttgccctc 103838346PRTMyceliophthora thermophila
38Met Ser Ser Phe Thr Ser Lys Gly Leu Leu Ser Ala Leu Met Gly Ala 1
5 10 15 Ala Thr Val Ala Ala His Gly His Val Thr Asn Ile Val Ile Asn
Gly 20 25 30 Val Ser Tyr Gln Asn Phe Asp Pro Phe Thr His Pro Tyr
Met Gln Asn 35 40 45 Pro Pro Thr Val Val Gly Trp Thr Ala Ser Asn
Thr Asp Asn Gly Phe 50 55 60 Val Gly Pro Glu Ser Phe Ser Ser Pro
Asp Ile Ile Cys His Lys Ser 65 70 75 80 Ala Thr Asn Ala Gly Gly His
Ala Val Val Ala Ala Gly Asp Lys Val 85 90 95 Phe Ile Gln Trp Asp
Thr Trp Pro Glu Ser His His Gly Pro Val Ile 100 105 110 Asp Tyr Leu
Ala Asp Cys Gly Asp Ala Gly Cys Glu Lys Val Asp Lys 115 120 125 Thr
Thr Leu Lys Phe Phe Lys Ile Ser Glu Ser Gly Leu Leu Asp Gly 130 135
140 Thr Asn Ala Pro Gly Lys Trp Ala Ser Asp Thr Leu Ile Ala Asn Asn
145 150 155 160 Asn Ser Trp Leu Val Gln Ile Pro Pro Asn Ile Ala Pro
Gly Asn Tyr 165 170 175 Val Leu Arg His Glu Ile Ile Ala Leu His Ser
Ala Gly Gln Gln Asn 180 185 190 Gly Ala Gln Asn Tyr Pro Gln Cys Phe
Asn Leu Gln Val Thr Gly Ser 195 200 205 Gly Thr Gln Lys Pro Ser Gly
Val Leu Gly Thr Glu Leu Tyr Lys Ala 210 215 220 Thr Asp Ala Gly Ile
Leu Ala Asn Ile Tyr Thr Ser Pro Val Thr Tyr 225 230 235 240 Gln Ile
Pro Gly Pro Ala Ile Ile Ser Gly Ala Ser Ala Val Gln
Gln 245 250 255 Thr Thr Ser Ala Ile Thr Ala Ser Ala Ser Ala Ile Thr
Gly Ser Ala 260 265 270 Thr Ala Ala Pro Thr Ala Ala Thr Thr Thr Ala
Ala Ala Ala Ala Thr 275 280 285 Thr Thr Thr Thr Ala Gly Ser Gly Ala
Thr Ala Thr Pro Ser Thr Gly 290 295 300 Gly Ser Pro Ser Ser Ala Gln
Pro Ala Pro Thr Thr Ala Ala Ala Thr 305 310 315 320 Ser Ser Pro Ala
Arg Pro Thr Arg Cys Ala Gly Leu Lys Lys Arg Arg 325 330 335 Arg His
Ala Arg Asp Val Lys Val Ala Leu 340 345 39326PRTMyceliophthora
thermophila 39Ala His Gly His Val Thr Asn Ile Val Ile Asn Gly Val
Ser Tyr Gln 1 5 10 15 Asn Phe Asp Pro Phe Thr His Pro Tyr Met Gln
Asn Pro Pro Thr Val 20 25 30 Val Gly Trp Thr Ala Ser Asn Thr Asp
Asn Gly Phe Val Gly Pro Glu 35 40 45 Ser Phe Ser Ser Pro Asp Ile
Ile Cys His Lys Ser Ala Thr Asn Ala 50 55 60 Gly Gly His Ala Val
Val Ala Ala Gly Asp Lys Val Phe Ile Gln Trp 65 70 75 80 Asp Thr Trp
Pro Glu Ser His His Gly Pro Val Ile Asp Tyr Leu Ala 85 90 95 Asp
Cys Gly Asp Ala Gly Cys Glu Lys Val Asp Lys Thr Thr Leu Lys 100 105
110 Phe Phe Lys Ile Ser Glu Ser Gly Leu Leu Asp Gly Thr Asn Ala Pro
115 120 125 Gly Lys Trp Ala Ser Asp Thr Leu Ile Ala Asn Asn Asn Ser
Trp Leu 130 135 140 Val Gln Ile Pro Pro Asn Ile Ala Pro Gly Asn Tyr
Val Leu Arg His 145 150 155 160 Glu Ile Ile Ala Leu His Ser Ala Gly
Gln Gln Asn Gly Ala Gln Asn 165 170 175 Tyr Pro Gln Cys Phe Asn Leu
Gln Val Thr Gly Ser Gly Thr Gln Lys 180 185 190 Pro Ser Gly Val Leu
Gly Thr Glu Leu Tyr Lys Ala Thr Asp Ala Gly 195 200 205 Ile Leu Ala
Asn Ile Tyr Thr Ser Pro Val Thr Tyr Gln Ile Pro Gly 210 215 220 Pro
Ala Ile Ile Ser Gly Ala Ser Ala Val Gln Gln Thr Thr Ser Ala 225 230
235 240 Ile Thr Ala Ser Ala Ser Ala Ile Thr Gly Ser Ala Thr Ala Ala
Pro 245 250 255 Thr Ala Ala Thr Thr Thr Ala Ala Ala Ala Ala Thr Thr
Thr Thr Thr 260 265 270 Ala Gly Ser Gly Ala Thr Ala Thr Pro Ser Thr
Gly Gly Ser Pro Ser 275 280 285 Ser Ala Gln Pro Ala Pro Thr Thr Ala
Ala Ala Thr Ser Ser Pro Ala 290 295 300 Arg Pro Thr Arg Cys Ala Gly
Leu Lys Lys Arg Arg Arg His Ala Arg 305 310 315 320 Asp Val Lys Val
Ala Leu 325 40714DNAMyceliophthora thermophila 40atgaagacgc
tcgccgccct cgtggtctcg gccgccctcg tggccgcgca cggctatgtt 60gaccacgcca
cgatcggtgg caaggattat cagttctacc agccgtacca ggacccttac
120atgggcgaca acaagcccga tagggtttcc cgctccatcc cgggcaacgg
ccccgtggag 180gacgtcaact ccatcgacct ccagtgccac gccggtgccg
aaccggccaa gctccacgcc 240cccgccgccg ccggctcgac cgtgacgctc
tactggaccc tctggcccga ctcccacgtc 300ggccccgtca tcacctacat
ggctcgctgc cccgacaccg gctgccagga ctggtccccg 360ggaactaagc
ccgtttggtt caagatcaag gaaggcggcc gtgagggcac ctccaatacc
420ccgctcatga cggccccctc cgcctacacc tacacgatcc cgtcctgcct
caagagcggc 480tactacctcg tccgccacga gatcatcgcc ctgcactcgg
cctggcagta ccccggcgcc 540cagttctacc cgggctgcca ccagctccag
gtcaccggcg gcggctccac cgtgccctct 600accaacctgg tctccttccc
cggcgcctac aaggggagcg accccggcat cacctacgac 660gcttacaagg
cgcaacctta caccatccct ggcccggccg tgtttacctg ctga
71441237PRTMyceliophthora thermophila 41Met Lys Thr Leu Ala Ala Leu
Val Val Ser Ala Ala Leu Val Ala Ala 1 5 10 15 His Gly Tyr Val Asp
His Ala Thr Ile Gly Gly Lys Asp Tyr Gln Phe 20 25 30 Tyr Gln Pro
Tyr Gln Asp Pro Tyr Met Gly Asp Asn Lys Pro Asp Arg 35 40 45 Val
Ser Arg Ser Ile Pro Gly Asn Gly Pro Val Glu Asp Val Asn Ser 50 55
60 Ile Asp Leu Gln Cys His Ala Gly Ala Glu Pro Ala Lys Leu His Ala
65 70 75 80 Pro Ala Ala Ala Gly Ser Thr Val Thr Leu Tyr Trp Thr Leu
Trp Pro 85 90 95 Asp Ser His Val Gly Pro Val Ile Thr Tyr Met Ala
Arg Cys Pro Asp 100 105 110 Thr Gly Cys Gln Asp Trp Ser Pro Gly Thr
Lys Pro Val Trp Phe Lys 115 120 125 Ile Lys Glu Gly Gly Arg Glu Gly
Thr Ser Asn Thr Pro Leu Met Thr 130 135 140 Ala Pro Ser Ala Tyr Thr
Tyr Thr Ile Pro Ser Cys Leu Lys Ser Gly 145 150 155 160 Tyr Tyr Leu
Val Arg His Glu Ile Ile Ala Leu His Ser Ala Trp Gln 165 170 175 Tyr
Pro Gly Ala Gln Phe Tyr Pro Gly Cys His Gln Leu Gln Val Thr 180 185
190 Gly Gly Gly Ser Thr Val Pro Ser Thr Asn Leu Val Ser Phe Pro Gly
195 200 205 Ala Tyr Lys Gly Ser Asp Pro Gly Ile Thr Tyr Asp Ala Tyr
Lys Ala 210 215 220 Gln Pro Tyr Thr Ile Pro Gly Pro Ala Val Phe Thr
Cys 225 230 235 42219PRTMyceliophthora thermophila 42Tyr Val Asp
His Ala Thr Ile Gly Gly Lys Asp Tyr Gln Phe Tyr Gln 1 5 10 15 Pro
Tyr Gln Asp Pro Tyr Met Gly Asp Asn Lys Pro Asp Arg Val Ser 20 25
30 Arg Ser Ile Pro Gly Asn Gly Pro Val Glu Asp Val Asn Ser Ile Asp
35 40 45 Leu Gln Cys His Ala Gly Ala Glu Pro Ala Lys Leu His Ala
Pro Ala 50 55 60 Ala Ala Gly Ser Thr Val Thr Leu Tyr Trp Thr Leu
Trp Pro Asp Ser 65 70 75 80 His Val Gly Pro Val Ile Thr Tyr Met Ala
Arg Cys Pro Asp Thr Gly 85 90 95 Cys Gln Asp Trp Ser Pro Gly Thr
Lys Pro Val Trp Phe Lys Ile Lys 100 105 110 Glu Gly Gly Arg Glu Gly
Thr Ser Asn Thr Pro Leu Met Thr Ala Pro 115 120 125 Ser Ala Tyr Thr
Tyr Thr Ile Pro Ser Cys Leu Lys Ser Gly Tyr Tyr 130 135 140 Leu Val
Arg His Glu Ile Ile Ala Leu His Ser Ala Trp Gln Tyr Pro 145 150 155
160 Gly Ala Gln Phe Tyr Pro Gly Cys His Gln Leu Gln Val Thr Gly Gly
165 170 175 Gly Ser Thr Val Pro Ser Thr Asn Leu Val Ser Phe Pro Gly
Ala Tyr 180 185 190 Lys Gly Ser Asp Pro Gly Ile Thr Tyr Asp Ala Tyr
Lys Ala Gln Pro 195 200 205 Tyr Thr Ile Pro Gly Pro Ala Val Phe Thr
Cys 210 215 43723DNAMyceliophthora thermophila 43atgaagacgc
tcgccgccct cgtggtctcg gccgccctcg tggccgcgca cggctatgtt 60gaccacgcca
cgatcggtgg caaggattat cagttctacc agccgtacca ggacccttac
120atgggcgaca acaagcccga tagggtttcc cgctccatcc cgggcaacgg
ccccgtggag 180gacgtcaact ccatcgacct ccagtgccac gccggtgccg
aaccggccaa gctccacgcc 240cccgccgccg ccggctcgac cgtgacgctc
tactggaccc tctggcccga ctcccacgtc 300ggccccgtca tcacctacat
ggctcgctgc cccgacaccg gctgccagga ctggtccccg 360ggaactaagc
ccgtttggtt caagatcaag gaaggcggcc gtgagggcac ctccaatgtc
420tgggctgcta ccccgctcat gacggccccc tccgcctaca cctacacgat
cccgtcctgc 480ctcaagagcg gctactacct cgtccgccac gagatcatcg
ccctgcactc ggcctggcag 540taccccggcg cccagttcta cccgggctgc
caccagctcc aggtcaccgg cggcggctcc 600accgtgccct ctaccaacct
ggtctccttc cccggcgcct acaaggggag cgaccccggc 660atcacctacg
acgcttacaa ggcgcaacct tacaccatcc ctggcccggc cgtgtttacc 720tgc
72344241PRTMyceliophthora thermophila 44Met Lys Thr Leu Ala Ala Leu
Val Val Ser Ala Ala Leu Val Ala Ala 1 5 10 15 His Gly Tyr Val Asp
His Ala Thr Ile Gly Gly Lys Asp Tyr Gln Phe 20 25 30 Tyr Gln Pro
Tyr Gln Asp Pro Tyr Met Gly Asp Asn Lys Pro Asp Arg 35 40 45 Val
Ser Arg Ser Ile Pro Gly Asn Gly Pro Val Glu Asp Val Asn Ser 50 55
60 Ile Asp Leu Gln Cys His Ala Gly Ala Glu Pro Ala Lys Leu His Ala
65 70 75 80 Pro Ala Ala Ala Gly Ser Thr Val Thr Leu Tyr Trp Thr Leu
Trp Pro 85 90 95 Asp Ser His Val Gly Pro Val Ile Thr Tyr Met Ala
Arg Cys Pro Asp 100 105 110 Thr Gly Cys Gln Asp Trp Ser Pro Gly Thr
Lys Pro Val Trp Phe Lys 115 120 125 Ile Lys Glu Gly Gly Arg Glu Gly
Thr Ser Asn Val Trp Ala Ala Thr 130 135 140 Pro Leu Met Thr Ala Pro
Ser Ala Tyr Thr Tyr Thr Ile Pro Ser Cys 145 150 155 160 Leu Lys Ser
Gly Tyr Tyr Leu Val Arg His Glu Ile Ile Ala Leu His 165 170 175 Ser
Ala Trp Gln Tyr Pro Gly Ala Gln Phe Tyr Pro Gly Cys His Gln 180 185
190 Leu Gln Val Thr Gly Gly Gly Ser Thr Val Pro Ser Thr Asn Leu Val
195 200 205 Ser Phe Pro Gly Ala Tyr Lys Gly Ser Asp Pro Gly Ile Thr
Tyr Asp 210 215 220 Ala Tyr Lys Ala Gln Pro Tyr Thr Ile Pro Gly Pro
Ala Val Phe Thr 225 230 235 240 Cys 45223PRTMyceliophthora
thermophila 45Tyr Val Asp His Ala Thr Ile Gly Gly Lys Asp Tyr Gln
Phe Tyr Gln 1 5 10 15 Pro Tyr Gln Asp Pro Tyr Met Gly Asp Asn Lys
Pro Asp Arg Val Ser 20 25 30 Arg Ser Ile Pro Gly Asn Gly Pro Val
Glu Asp Val Asn Ser Ile Asp 35 40 45 Leu Gln Cys His Ala Gly Ala
Glu Pro Ala Lys Leu His Ala Pro Ala 50 55 60 Ala Ala Gly Ser Thr
Val Thr Leu Tyr Trp Thr Leu Trp Pro Asp Ser 65 70 75 80 His Val Gly
Pro Val Ile Thr Tyr Met Ala Arg Cys Pro Asp Thr Gly 85 90 95 Cys
Gln Asp Trp Ser Pro Gly Thr Lys Pro Val Trp Phe Lys Ile Lys 100 105
110 Glu Gly Gly Arg Glu Gly Thr Ser Asn Val Trp Ala Ala Thr Pro Leu
115 120 125 Met Thr Ala Pro Ser Ala Tyr Thr Tyr Thr Ile Pro Ser Cys
Leu Lys 130 135 140 Ser Gly Tyr Tyr Leu Val Arg His Glu Ile Ile Ala
Leu His Ser Ala 145 150 155 160 Trp Gln Tyr Pro Gly Ala Gln Phe Tyr
Pro Gly Cys His Gln Leu Gln 165 170 175 Val Thr Gly Gly Gly Ser Thr
Val Pro Ser Thr Asn Leu Val Ser Phe 180 185 190 Pro Gly Ala Tyr Lys
Gly Ser Asp Pro Gly Ile Thr Tyr Asp Ala Tyr 195 200 205 Lys Ala Gln
Pro Tyr Thr Ile Pro Gly Pro Ala Val Phe Thr Cys 210 215 220
46675DNAMyceliophthora thermophila 46atgagatact tcctccagct
cgctgcggcc gcggcctttg ccgtgaacag cgcggcgggt 60cactacatct tccagcagtt
cgcgacgggc gggtccaagt acccgccctg gaagtacatc 120cggcgcaaca
ccaacccgga ctggctgcag aacgggccgg tgacggacct gtcgtcgacc
180gacctgcgct gcaacgtggg cgggcaggtc agcaacggga ccgagaccat
caccttgaac 240gccggcgacg agttcagctt catcctcgac acgcccgtct
accatgccgg ccccacctcg 300ctctacatgt ccaaggcgcc cggagctgtg
gccgactacg acggcggcgg ggcctggttc 360aagatctacg actggggtcc
gtcggggacg agctggacgt tgagtggcac gtacactcag 420agaattccca
agtgcatccc tgacggcgag tacctcctcc gcatccagca gatcgggctc
480cacaaccccg gcgccgcgcc acagttctac atcagctgcg ctcaagtcaa
ggtcgtcgat 540ggcggcagca ccaatccgac cccgaccgcc cagattccgg
gagccttcca cagcaacgac 600cctggcttga ctgtcaatat ctacaacgac
cctctcacca actacgtcgt cccgggacct 660agagtttcgc actgg
67547225PRTMyceliophthora thermophila 47Met Arg Tyr Phe Leu Gln Leu
Ala Ala Ala Ala Ala Phe Ala Val Asn 1 5 10 15 Ser Ala Ala Gly His
Tyr Ile Phe Gln Gln Phe Ala Thr Gly Gly Ser 20 25 30 Lys Tyr Pro
Pro Trp Lys Tyr Ile Arg Arg Asn Thr Asn Pro Asp Trp 35 40 45 Leu
Gln Asn Gly Pro Val Thr Asp Leu Ser Ser Thr Asp Leu Arg Cys 50 55
60 Asn Val Gly Gly Gln Val Ser Asn Gly Thr Glu Thr Ile Thr Leu Asn
65 70 75 80 Ala Gly Asp Glu Phe Ser Phe Ile Leu Asp Thr Pro Val Tyr
His Ala 85 90 95 Gly Pro Thr Ser Leu Tyr Met Ser Lys Ala Pro Gly
Ala Val Ala Asp 100 105 110 Tyr Asp Gly Gly Gly Ala Trp Phe Lys Ile
Tyr Asp Trp Gly Pro Ser 115 120 125 Gly Thr Ser Trp Thr Leu Ser Gly
Thr Tyr Thr Gln Arg Ile Pro Lys 130 135 140 Cys Ile Pro Asp Gly Glu
Tyr Leu Leu Arg Ile Gln Gln Ile Gly Leu 145 150 155 160 His Asn Pro
Gly Ala Ala Pro Gln Phe Tyr Ile Ser Cys Ala Gln Val 165 170 175 Lys
Val Val Asp Gly Gly Ser Thr Asn Pro Thr Pro Thr Ala Gln Ile 180 185
190 Pro Gly Ala Phe His Ser Asn Asp Pro Gly Leu Thr Val Asn Ile Tyr
195 200 205 Asn Asp Pro Leu Thr Asn Tyr Val Val Pro Gly Pro Arg Val
Ser His 210 215 220 Trp 225 48205PRTMyceliophthora thermophila
48His Tyr Ile Phe Gln Gln Phe Ala Thr Gly Gly Ser Lys Tyr Pro Pro 1
5 10 15 Trp Lys Tyr Ile Arg Arg Asn Thr Asn Pro Asp Trp Leu Gln Asn
Gly 20 25 30 Pro Val Thr Asp Leu Ser Ser Thr Asp Leu Arg Cys Asn
Val Gly Gly 35 40 45 Gln Val Ser Asn Gly Thr Glu Thr Ile Thr Leu
Asn Ala Gly Asp Glu 50 55 60 Phe Ser Phe Ile Leu Asp Thr Pro Val
Tyr His Ala Gly Pro Thr Ser 65 70 75 80 Leu Tyr Met Ser Lys Ala Pro
Gly Ala Val Ala Asp Tyr Asp Gly Gly 85 90 95 Gly Ala Trp Phe Lys
Ile Tyr Asp Trp Gly Pro Ser Gly Thr Ser Trp 100 105 110 Thr Leu Ser
Gly Thr Tyr Thr Gln Arg Ile Pro Lys Cys Ile Pro Asp 115 120 125 Gly
Glu Tyr Leu Leu Arg Ile Gln Gln Ile Gly Leu His Asn Pro Gly 130 135
140 Ala Ala Pro Gln Phe Tyr Ile Ser Cys Ala Gln Val Lys Val Val Asp
145 150 155 160 Gly Gly Ser Thr Asn Pro Thr Pro Thr Ala Gln Ile Pro
Gly Ala Phe 165 170 175 His Ser Asn Asp Pro Gly Leu Thr Val Asn Ile
Tyr Asn Asp Pro Leu 180 185 190 Thr Asn Tyr Val Val Pro Gly Pro Arg
Val Ser His Trp 195 200 205 491332DNAMyceliophthora thermophila
49atgcacccct cccttctttt cacgcttggg ctggcgagcg tgcttgtccc cctctcgtct
60gcacacacta ccttcacgac cctcttcgtc aacgatgtca accaaggtga tggtacctgc
120attcgcatgg cgaagaaggg caatgtcgcc acccatcctc tcgcaggcgg
tctcgactcc 180gaagacatgg cctgtggtcg ggatggtcaa gaacccgtgg
catttacgtg tccggcccca 240gctggtgcca agttgactct cgagtttcgc
atgtgggccg atgcttcgca gtccggatcg 300atcgatccat cccaccttgg
cgtcatggcc atctacctca agaaggtttc cgacatgaaa 360tctgacgcgg
ccgctggccc gggctggttc aagatttggg accaaggcta cgacttggcg
420gccaagaagt gggccaccga gaagctcatc gacaacaacg gcctcctgag
cgtcaacctt 480ccaaccggct taccaaccgg ctactacctc gcccgccagg
agatcatcac gctccaaaac 540gttaccaatg acaggccaga gccccagttc
tacgtcggct gcgcacagct ctacgtcgag 600ggcacctcgg actcacccat
cccctcggac aagacggtct ccattcccgg ccacatcagc 660gacccggccg
acccgggcct gaccttcaac gtctacacgg gcgacgcatc cacctacaag
720ccgcccggcc ccgaggttta cttccccacc accaccacca ccacctcctc
ctcctcctcc 780ggaagcagcg acaacaaggg agccaggcgc cagcaaaccc
ccgacgacaa gcaggccgac
840ggcctcgttc cagccgactg cctcgtcaag aacgcgaact ggtgcgccgc
tgccctgccg 900ccgtacaccg acgaggccgg ctgctgggcc gccgccgagg
actgcaacaa gcagctggac 960gcgtgctaca ccagcgcacc cccctcgggc
agcaaggggt gcaaggtctg ggaggagcag 1020gtgtgcaccg tcgtctcgca
gaagtgcgag gccggggatt tcaaggggcc cccgcagctc 1080gggaaggagc
tcggcgaggg gatcgatgag cctattccgg ggggaaagct gcccccggcg
1140gtcaacgcgg gagagaacgg gaatcatggc ggaggtggtg gtgatgatgg
tgatgatgat 1200aatgatgagg ccggggctgg ggcagcgtcg actccgactt
ttgctgctcc tggtgcggcc 1260aagactcccc aaccaaactc cgagagggcc
cggcgccgtg aggcgcattg gcggcgactg 1320gaatctgctg ag
133250444PRTMyceliophthora thermophila 50Met His Pro Ser Leu Leu
Phe Thr Leu Gly Leu Ala Ser Val Leu Val 1 5 10 15 Pro Leu Ser Ser
Ala His Thr Thr Phe Thr Thr Leu Phe Val Asn Asp 20 25 30 Val Asn
Gln Gly Asp Gly Thr Cys Ile Arg Met Ala Lys Lys Gly Asn 35 40 45
Val Ala Thr His Pro Leu Ala Gly Gly Leu Asp Ser Glu Asp Met Ala 50
55 60 Cys Gly Arg Asp Gly Gln Glu Pro Val Ala Phe Thr Cys Pro Ala
Pro 65 70 75 80 Ala Gly Ala Lys Leu Thr Leu Glu Phe Arg Met Trp Ala
Asp Ala Ser 85 90 95 Gln Ser Gly Ser Ile Asp Pro Ser His Leu Gly
Val Met Ala Ile Tyr 100 105 110 Leu Lys Lys Val Ser Asp Met Lys Ser
Asp Ala Ala Ala Gly Pro Gly 115 120 125 Trp Phe Lys Ile Trp Asp Gln
Gly Tyr Asp Leu Ala Ala Lys Lys Trp 130 135 140 Ala Thr Glu Lys Leu
Ile Asp Asn Asn Gly Leu Leu Ser Val Asn Leu 145 150 155 160 Pro Thr
Gly Leu Pro Thr Gly Tyr Tyr Leu Ala Arg Gln Glu Ile Ile 165 170 175
Thr Leu Gln Asn Val Thr Asn Asp Arg Pro Glu Pro Gln Phe Tyr Val 180
185 190 Gly Cys Ala Gln Leu Tyr Val Glu Gly Thr Ser Asp Ser Pro Ile
Pro 195 200 205 Ser Asp Lys Thr Val Ser Ile Pro Gly His Ile Ser Asp
Pro Ala Asp 210 215 220 Pro Gly Leu Thr Phe Asn Val Tyr Thr Gly Asp
Ala Ser Thr Tyr Lys 225 230 235 240 Pro Pro Gly Pro Glu Val Tyr Phe
Pro Thr Thr Thr Thr Thr Thr Ser 245 250 255 Ser Ser Ser Ser Gly Ser
Ser Asp Asn Lys Gly Ala Arg Arg Gln Gln 260 265 270 Thr Pro Asp Asp
Lys Gln Ala Asp Gly Leu Val Pro Ala Asp Cys Leu 275 280 285 Val Lys
Asn Ala Asn Trp Cys Ala Ala Ala Leu Pro Pro Tyr Thr Asp 290 295 300
Glu Ala Gly Cys Trp Ala Ala Ala Glu Asp Cys Asn Lys Gln Leu Asp 305
310 315 320 Ala Cys Tyr Thr Ser Ala Pro Pro Ser Gly Ser Lys Gly Cys
Lys Val 325 330 335 Trp Glu Glu Gln Val Cys Thr Val Val Ser Gln Lys
Cys Glu Ala Gly 340 345 350 Asp Phe Lys Gly Pro Pro Gln Leu Gly Lys
Glu Leu Gly Glu Gly Ile 355 360 365 Asp Glu Pro Ile Pro Gly Gly Lys
Leu Pro Pro Ala Val Asn Ala Gly 370 375 380 Glu Asn Gly Asn His Gly
Gly Gly Gly Gly Asp Asp Gly Asp Asp Asp 385 390 395 400 Asn Asp Glu
Ala Gly Ala Gly Ala Ala Ser Thr Pro Thr Phe Ala Ala 405 410 415 Pro
Gly Ala Ala Lys Thr Pro Gln Pro Asn Ser Glu Arg Ala Arg Arg 420 425
430 Arg Glu Ala His Trp Arg Arg Leu Glu Ser Ala Glu 435 440
51423PRTMyceliophthora thermophila 51His Thr Thr Phe Thr Thr Leu
Phe Val Asn Asp Val Asn Gln Gly Asp 1 5 10 15 Gly Thr Cys Ile Arg
Met Ala Lys Lys Gly Asn Val Ala Thr His Pro 20 25 30 Leu Ala Gly
Gly Leu Asp Ser Glu Asp Met Ala Cys Gly Arg Asp Gly 35 40 45 Gln
Glu Pro Val Ala Phe Thr Cys Pro Ala Pro Ala Gly Ala Lys Leu 50 55
60 Thr Leu Glu Phe Arg Met Trp Ala Asp Ala Ser Gln Ser Gly Ser Ile
65 70 75 80 Asp Pro Ser His Leu Gly Val Met Ala Ile Tyr Leu Lys Lys
Val Ser 85 90 95 Asp Met Lys Ser Asp Ala Ala Ala Gly Pro Gly Trp
Phe Lys Ile Trp 100 105 110 Asp Gln Gly Tyr Asp Leu Ala Ala Lys Lys
Trp Ala Thr Glu Lys Leu 115 120 125 Ile Asp Asn Asn Gly Leu Leu Ser
Val Asn Leu Pro Thr Gly Leu Pro 130 135 140 Thr Gly Tyr Tyr Leu Ala
Arg Gln Glu Ile Ile Thr Leu Gln Asn Val 145 150 155 160 Thr Asn Asp
Arg Pro Glu Pro Gln Phe Tyr Val Gly Cys Ala Gln Leu 165 170 175 Tyr
Val Glu Gly Thr Ser Asp Ser Pro Ile Pro Ser Asp Lys Thr Val 180 185
190 Ser Ile Pro Gly His Ile Ser Asp Pro Ala Asp Pro Gly Leu Thr Phe
195 200 205 Asn Val Tyr Thr Gly Asp Ala Ser Thr Tyr Lys Pro Pro Gly
Pro Glu 210 215 220 Val Tyr Phe Pro Thr Thr Thr Thr Thr Thr Ser Ser
Ser Ser Ser Gly 225 230 235 240 Ser Ser Asp Asn Lys Gly Ala Arg Arg
Gln Gln Thr Pro Asp Asp Lys 245 250 255 Gln Ala Asp Gly Leu Val Pro
Ala Asp Cys Leu Val Lys Asn Ala Asn 260 265 270 Trp Cys Ala Ala Ala
Leu Pro Pro Tyr Thr Asp Glu Ala Gly Cys Trp 275 280 285 Ala Ala Ala
Glu Asp Cys Asn Lys Gln Leu Asp Ala Cys Tyr Thr Ser 290 295 300 Ala
Pro Pro Ser Gly Ser Lys Gly Cys Lys Val Trp Glu Glu Gln Val 305 310
315 320 Cys Thr Val Val Ser Gln Lys Cys Glu Ala Gly Asp Phe Lys Gly
Pro 325 330 335 Pro Gln Leu Gly Lys Glu Leu Gly Glu Gly Ile Asp Glu
Pro Ile Pro 340 345 350 Gly Gly Lys Leu Pro Pro Ala Val Asn Ala Gly
Glu Asn Gly Asn His 355 360 365 Gly Gly Gly Gly Gly Asp Asp Gly Asp
Asp Asp Asn Asp Glu Ala Gly 370 375 380 Ala Gly Ala Ala Ser Thr Pro
Thr Phe Ala Ala Pro Gly Ala Ala Lys 385 390 395 400 Thr Pro Gln Pro
Asn Ser Glu Arg Ala Arg Arg Arg Glu Ala His Trp 405 410 415 Arg Arg
Leu Glu Ser Ala Glu 420 52834DNAMyceliophthora thermophila
52atgttttctc tcaagttctt tatcttggcc ggtgggcttg ctgtcctcac cgaggctcac
60ataagactag tgtcgcccgc cccttttacc aaccctgacc agggccccag cccactccta
120gaggctggca gcgactatcc ctgccacaac ggcaatgggg gcggttatca
gggaacgcca 180acccagatgg caaagggttc taagcagcag ctagccttcc
aggggtctgc cgttcatggg 240ggtggctcct gccaagtgtc catcacctac
gacgaaaacc cgaccgctca gagctccttc 300aaggtcattc actcgattca
aggtggctgc cccgccaggg ccgagacgat cccggattgc 360agcgcacaaa
atatcaacgc ctgcaatata aagcccgata atgcccagat ggacaccccg
420gataagtatg agttcacgat cccggaggat ctccccagtg gcaaggccac
cctcgcctgg 480acatggatca acactatcgg caaccgcgag ttttatatgg
catgcgcccc ggttgagatc 540accggcgacg gcggtagcga gtcggctctg
gctgcgctgc ccgacatggt cattgccaac 600atcccgtcca tcggaggaac
ctgcgcgacc gaggagggga agtactacga atatcccaac 660cccggtaagt
cggtcgaaac catcccgggc tggaccgatt tggttcccct gcaaggcgaa
720tgcggtgctg cctccggtgt ctcgggctcc ggcggaaacg ccagcagtgc
tacccctgcc 780gcaggggccg ccccgactcc tgctgtccgc ggccgccgtc
ccacctggaa cgcc 83453278PRTMyceliophthora thermophila 53Met Phe Ser
Leu Lys Phe Phe Ile Leu Ala Gly Gly Leu Ala Val Leu 1 5 10 15 Thr
Glu Ala His Ile Arg Leu Val Ser Pro Ala Pro Phe Thr Asn Pro 20 25
30 Asp Gln Gly Pro Ser Pro Leu Leu Glu Ala Gly Ser Asp Tyr Pro Cys
35 40 45 His Asn Gly Asn Gly Gly Gly Tyr Gln Gly Thr Pro Thr Gln
Met Ala 50 55 60 Lys Gly Ser Lys Gln Gln Leu Ala Phe Gln Gly Ser
Ala Val His Gly 65 70 75 80 Gly Gly Ser Cys Gln Val Ser Ile Thr Tyr
Asp Glu Asn Pro Thr Ala 85 90 95 Gln Ser Ser Phe Lys Val Ile His
Ser Ile Gln Gly Gly Cys Pro Ala 100 105 110 Arg Ala Glu Thr Ile Pro
Asp Cys Ser Ala Gln Asn Ile Asn Ala Cys 115 120 125 Asn Ile Lys Pro
Asp Asn Ala Gln Met Asp Thr Pro Asp Lys Tyr Glu 130 135 140 Phe Thr
Ile Pro Glu Asp Leu Pro Ser Gly Lys Ala Thr Leu Ala Trp 145 150 155
160 Thr Trp Ile Asn Thr Ile Gly Asn Arg Glu Phe Tyr Met Ala Cys Ala
165 170 175 Pro Val Glu Ile Thr Gly Asp Gly Gly Ser Glu Ser Ala Leu
Ala Ala 180 185 190 Leu Pro Asp Met Val Ile Ala Asn Ile Pro Ser Ile
Gly Gly Thr Cys 195 200 205 Ala Thr Glu Glu Gly Lys Tyr Tyr Glu Tyr
Pro Asn Pro Gly Lys Ser 210 215 220 Val Glu Thr Ile Pro Gly Trp Thr
Asp Leu Val Pro Leu Gln Gly Glu 225 230 235 240 Cys Gly Ala Ala Ser
Gly Val Ser Gly Ser Gly Gly Asn Ala Ser Ser 245 250 255 Ala Thr Pro
Ala Ala Gly Ala Ala Pro Thr Pro Ala Val Arg Gly Arg 260 265 270 Arg
Pro Thr Trp Asn Ala 275 54259PRTMyceliophthora thermophila 54His
Ile Arg Leu Val Ser Pro Ala Pro Phe Thr Asn Pro Asp Gln Gly 1 5 10
15 Pro Ser Pro Leu Leu Glu Ala Gly Ser Asp Tyr Pro Cys His Asn Gly
20 25 30 Asn Gly Gly Gly Tyr Gln Gly Thr Pro Thr Gln Met Ala Lys
Gly Ser 35 40 45 Lys Gln Gln Leu Ala Phe Gln Gly Ser Ala Val His
Gly Gly Gly Ser 50 55 60 Cys Gln Val Ser Ile Thr Tyr Asp Glu Asn
Pro Thr Ala Gln Ser Ser 65 70 75 80 Phe Lys Val Ile His Ser Ile Gln
Gly Gly Cys Pro Ala Arg Ala Glu 85 90 95 Thr Ile Pro Asp Cys Ser
Ala Gln Asn Ile Asn Ala Cys Asn Ile Lys 100 105 110 Pro Asp Asn Ala
Gln Met Asp Thr Pro Asp Lys Tyr Glu Phe Thr Ile 115 120 125 Pro Glu
Asp Leu Pro Ser Gly Lys Ala Thr Leu Ala Trp Thr Trp Ile 130 135 140
Asn Thr Ile Gly Asn Arg Glu Phe Tyr Met Ala Cys Ala Pro Val Glu 145
150 155 160 Ile Thr Gly Asp Gly Gly Ser Glu Ser Ala Leu Ala Ala Leu
Pro Asp 165 170 175 Met Val Ile Ala Asn Ile Pro Ser Ile Gly Gly Thr
Cys Ala Thr Glu 180 185 190 Glu Gly Lys Tyr Tyr Glu Tyr Pro Asn Pro
Gly Lys Ser Val Glu Thr 195 200 205 Ile Pro Gly Trp Thr Asp Leu Val
Pro Leu Gln Gly Glu Cys Gly Ala 210 215 220 Ala Ser Gly Val Ser Gly
Ser Gly Gly Asn Ala Ser Ser Ala Thr Pro 225 230 235 240 Ala Ala Gly
Ala Ala Pro Thr Pro Ala Val Arg Gly Arg Arg Pro Thr 245 250 255 Trp
Asn Ala 55672DNAMyceliophthora thermophila 55atgaagctcg ccacgctcct
cgccgccctc accctcgggg tggccgacca gctcagcgtc 60gggtccagaa agtttggcgt
gtacgagcac attcgcaaga acacgaacta caactcgccc 120gttaccgacc
tgtcggacac caacctgcgc tgcaacgtcg gcgggggctc gggcaccagc
180accaccgtgc tcgacgtcaa ggccggagac tcgttcacct tcttcagcga
cgttgccgtc 240taccaccagg ggcccatctc gctgtgcgtg gaccggacca
gtgcagagag catggatgga 300cgggaaccgg acatgcgctg ccgaactggc
tcacaagctg gctacctggc ggtgactgac 360tacgacgggt ccggtgactg
tttcaagatc tatgactggg gaccgacgtt caacgggggc 420caggcgtcgt
ggccgacgag gaattcgtac gagtacagca tcctcaagtg catcagggac
480ggcgaatacc tactgcggat tcagtccctg gccatccata acccaggtgc
ccttccgcag 540ttctacatca gctgcgccca ggtgaatgtg acgggcggag
gcaccgtcac cccgagatca 600aggcgaccga tcctgatcta tttcaacttc
cactcgtata tcgtccctgg gccggcagtg 660ttcaagtgct ag
67256223PRTMyceliophthora thermophila 56Met Lys Leu Ala Thr Leu Leu
Ala Ala Leu Thr Leu Gly Val Ala Asp 1 5 10 15 Gln Leu Ser Val Gly
Ser Arg Lys Phe Gly Val Tyr Glu His Ile Arg 20 25 30 Lys Asn Thr
Asn Tyr Asn Ser Pro Val Thr Asp Leu Ser Asp Thr Asn 35 40 45 Leu
Arg Cys Asn Val Gly Gly Gly Ser Gly Thr Ser Thr Thr Val Leu 50 55
60 Asp Val Lys Ala Gly Asp Ser Phe Thr Phe Phe Ser Asp Val Ala Val
65 70 75 80 Tyr His Gln Gly Pro Ile Ser Leu Cys Val Asp Arg Thr Ser
Ala Glu 85 90 95 Ser Met Asp Gly Arg Glu Pro Asp Met Arg Cys Arg
Thr Gly Ser Gln 100 105 110 Ala Gly Tyr Leu Ala Val Thr Asp Tyr Asp
Gly Ser Gly Asp Cys Phe 115 120 125 Lys Ile Tyr Asp Trp Gly Pro Thr
Phe Asn Gly Gly Gln Ala Ser Trp 130 135 140 Pro Thr Arg Asn Ser Tyr
Glu Tyr Ser Ile Leu Lys Cys Ile Arg Asp 145 150 155 160 Gly Glu Tyr
Leu Leu Arg Ile Gln Ser Leu Ala Ile His Asn Pro Gly 165 170 175 Ala
Leu Pro Gln Phe Tyr Ile Ser Cys Ala Gln Val Asn Val Thr Gly 180 185
190 Gly Gly Thr Val Thr Pro Arg Ser Arg Arg Pro Ile Leu Ile Tyr Phe
195 200 205 Asn Phe His Ser Tyr Ile Val Pro Gly Pro Ala Val Phe Lys
Cys 210 215 220 57208PRTMyceliophthora thermophila 57Asp Gln Leu
Ser Val Gly Ser Arg Lys Phe Gly Val Tyr Glu His Ile 1 5 10 15 Arg
Lys Asn Thr Asn Tyr Asn Ser Pro Val Thr Asp Leu Ser Asp Thr 20 25
30 Asn Leu Arg Cys Asn Val Gly Gly Gly Ser Gly Thr Ser Thr Thr Val
35 40 45 Leu Asp Val Lys Ala Gly Asp Ser Phe Thr Phe Phe Ser Asp
Val Ala 50 55 60 Val Tyr His Gln Gly Pro Ile Ser Leu Cys Val Asp
Arg Thr Ser Ala 65 70 75 80 Glu Ser Met Asp Gly Arg Glu Pro Asp Met
Arg Cys Arg Thr Gly Ser 85 90 95 Gln Ala Gly Tyr Leu Ala Val Thr
Asp Tyr Asp Gly Ser Gly Asp Cys 100 105 110 Phe Lys Ile Tyr Asp Trp
Gly Pro Thr Phe Asn Gly Gly Gln Ala Ser 115 120 125 Trp Pro Thr Arg
Asn Ser Tyr Glu Tyr Ser Ile Leu Lys Cys Ile Arg 130 135 140 Asp Gly
Glu Tyr Leu Leu Arg Ile Gln Ser Leu Ala Ile His Asn Pro 145 150 155
160 Gly Ala Leu Pro Gln Phe Tyr Ile Ser Cys Ala Gln Val Asn Val Thr
165 170 175 Gly Gly Gly Thr Val Thr Pro Arg Ser Arg Arg Pro Ile Leu
Ile Tyr 180 185 190 Phe Asn Phe His Ser Tyr Ile Val Pro Gly Pro Ala
Val Phe Lys Cys 195 200 205 58642DNAMyceliophthora thermophila
58atgaagctcg ccacgctcct cgccgccctc accctcgggc tcagcgtcgg gtccagaaag
60tttggcgtgt acgagcacat tcgcaagaac acgaactaca actcgcccgt taccgacctg
120tcggacacca acctgcgctg caacgtcggc gggggctcgg gcaccagcac
caccgtgctc 180gacgtcaagg ccggagactc gttcaccttc ttcagcgacg
ttgccgtcta ccaccagggg 240cccatctcgc tgtgcgtgga ccggaccagt
gcagagagca tggatggacg ggaaccggac 300atgcgctgcc gaactggctc
acaagctggc tacctggcgg tgactgtgat gactgtgact 360gactacgacg
ggtccggtga ctgtttcaag atctatgact ggggaccgac gttcaacggg
420ggccaggcgt cgtggccgac gaggaattcg tacgagtaca gcatcctcaa
gtgcatcagg 480gacggcgaat acctactgcg gattcagtcc ctggccatcc
ataacccagg tgcccttccg 540cagttctaca tcagctgcgc ccaggtgaat
gtgacgggcg gaggcaccat ctatttcaac 600ttccactcgt
atatcgtccc tgggccggca gtgttcaagt gc 642 59 214 PRT Myceliophthora
thermophila 59 Met Lys Leu Ala Thr Leu Leu Ala Ala Leu Thr Leu Gly
Leu Ser Val 1 5 10 15 Gly Ser Arg Lys Phe Gly Val Tyr Glu His Ile
Arg Lys Asn Thr Asn 20 25 30 Tyr Asn Ser Pro Val Thr Asp Leu Ser
Asp Thr Asn Leu Arg Cys Asn 35 40 45 Val Gly Gly Gly Ser Gly Thr
Ser Thr Thr Val Leu Asp Val Lys Ala 50 55 60 Gly Asp Ser Phe Thr
Phe Phe Ser Asp Val Ala Val Tyr His Gln Gly 65 70 75 80 Pro Ile Ser
Leu Cys Val Asp Arg Thr Ser Ala Glu Ser Met Asp Gly 85 90 95 Arg
Glu Pro Asp Met Arg Cys Arg Thr Gly Ser Gln Ala Gly Tyr Leu 100 105
110 Ala Val Thr Val Met Thr Val Thr Asp Tyr Asp Gly Ser Gly Asp Cys
115 120 125 Phe Lys Ile Tyr Asp Trp Gly Pro Thr Phe Asn Gly Gly Gln
Ala Ser 130 135 140 Trp Pro Thr Arg Asn Ser Tyr Glu Tyr Ser Ile Leu
Lys Cys Ile Arg 145 150 155 160 Asp Gly Glu Tyr Leu Leu Arg Ile Gln
Ser Leu Ala Ile His Asn Pro 165 170 175 Gly Ala Leu Pro Gln Phe Tyr
Ile Ser Cys Ala Gln Val Asn Val Thr 180 185 190 Gly Gly Gly Thr Ile
Tyr Phe Asn Phe His Ser Tyr Ile Val Pro Gly 195 200 205 Pro Ala Val
Phe Lys Cys 210 60 196 PRT Myceliophthora thermophila 60 Arg Lys Phe
Gly Val Tyr Glu His Ile Arg Lys Asn Thr Asn Tyr Asn 1 5 10 15 Ser
Pro Val Thr Asp Leu Ser Asp Thr Asn Leu Arg Cys Asn Val Gly 20 25
30 Gly Gly Ser Gly Thr Ser Thr Thr Val Leu Asp Val Lys Ala Gly Asp
35 40 45 Ser Phe Thr Phe Phe Ser Asp Val Ala Val Tyr His Gln Gly
Pro Ile 50 55 60 Ser Leu Cys Val Asp Arg Thr Ser Ala Glu Ser Met
Asp Gly Arg Glu 65 70 75 80 Pro Asp Met Arg Cys Arg Thr Gly Ser Gln
Ala Gly Tyr Leu Ala Val 85 90 95 Thr Val Met Thr Val Thr Asp Tyr
Asp Gly Ser Gly Asp Cys Phe Lys 100 105 110 Ile Tyr Asp Trp Gly Pro
Thr Phe Asn Gly Gly Gln Ala Ser Trp Pro 115 120 125 Thr Arg Asn Ser
Tyr Glu Tyr Ser Ile Leu Lys Cys Ile Arg Asp Gly 130 135 140 Glu Tyr
Leu Leu Arg Ile Gln Ser Leu Ala Ile His Asn Pro Gly Ala 145 150 155
160 Leu Pro Gln Phe Tyr Ile Ser Cys Ala Gln Val Asn Val Thr Gly Gly
165 170 175 Gly Thr Ile Tyr Phe Asn Phe His Ser Tyr Ile Val Pro Gly
Pro Ala 180 185 190 Val Phe Lys Cys 195 61 579 DNA Myceliophthora
thermophila 61 atgaccaaga atgcgcagag caagcagggc gttgagaacc
caacaagcgg cgacatccgc 60tgctacacct cgcagacggc ggccaacgtc gtgaccgtgc
cggccggctc gaccattcac 120tacatctcga cccagcagat caaccacccc
ggcccgactc agtactacct ggccaaggta 180ccccccggct cgtcggccaa
gacctttgac gggtccggcg ccgtctggtt caagatctcg 240accacgatgc
ctaccgtgga cagcaacaag cagatgttct ggccagggca gaacacttat
300gagacctcaa acaccaccat tcccgccaac accccggacg gcgagtacct
ccttcgcgtc 360aagcagatcg ccctccacat ggcgtctcag cccaacaagg
tccagttcta cctcgcctgc 420acccagatca agatcaccgg tggtcgcaac
ggcaccccca gcccgctggt cgcgctgccc 480ggagcctaca agagcaccga
ccccggcatc ctggtcgaca tctactccat gaagcccgaa 540tcgtaccagc
ctcccgggcc gcccgtctgg cgcggctaa 579 62 192 PRT Myceliophthora
thermophila 62 Met Thr Lys Asn Ala Gln Ser Lys Gln Gly Val Glu Asn
Pro Thr Ser 1 5 10 15 Gly Asp Ile Arg Cys Tyr Thr Ser Gln Thr Ala
Ala Asn Val Val Thr 20 25 30 Val Pro Ala Gly Ser Thr Ile His Tyr
Ile Ser Thr Gln Gln Ile Asn 35 40 45 His Pro Gly Pro Thr Gln Tyr
Tyr Leu Ala Lys Val Pro Pro Gly Ser 50 55 60 Ser Ala Lys Thr Phe
Asp Gly Ser Gly Ala Val Trp Phe Lys Ile Ser 65 70 75 80 Thr Thr Met
Pro Thr Val Asp Ser Asn Lys Gln Met Phe Trp Pro Gly 85 90 95 Gln
Asn Thr Tyr Glu Thr Ser Asn Thr Thr Ile Pro Ala Asn Thr Pro 100 105
110 Asp Gly Glu Tyr Leu Leu Arg Val Lys Gln Ile Ala Leu His Met Ala
115 120 125 Ser Gln Pro Asn Lys Val Gln Phe Tyr Leu Ala Cys Thr Gln
Ile Lys 130 135 140 Ile Thr Gly Gly Arg Asn Gly Thr Pro Ser Pro Leu
Val Ala Leu Pro 145 150 155 160 Gly Ala Tyr Lys Ser Thr Asp Pro Gly
Ile Leu Val Asp Ile Tyr Ser 165 170 175 Met Lys Pro Glu Ser Tyr Gln
Pro Pro Gly Pro Pro Val Trp Arg Gly 180 185 190
63 672 DNA Myceliophthora thermophila 63 atgaggcttc tcgcaagctt
gttgctcgca gctacggctg ttcaagctca ctttgttaac 60ggacagcccg aagagagtga
ctggtcagcc acgcgcatga ccaagaatgc gcagagcaag 120cagggcgttg
agaacccaac aagcggcgac atccgctgct acacctcgca gacggcggcc
180aacgtcgtga ccgtgccggc cggctcgacc attcactaca tctcgaccca
gcagatcaac 240caccccggcc cgactcagta ctacctggcc aaggtacccc
ccggctcgtc ggccaagacc 300tttgacgggt ccggcgccgt ctggttcaag
atctcgacca cgatgcctac cgtggacagc 360aacaagcaga tgttctggcc
agggcagaac acttatgaga cctcaaacac caccattccc 420gccaacaccc
cggacggcga gtacctcctt cgcgtcaagc agatcgccct ccacatggcg
480tctcagccca acaaggtcca gttctacctc gcctgcaccc agatcaagat
caccggtggt 540cgcaacggca cccccagccc gctggtcgcg ctgcccggag
cctacaagag caccgacccc 600ggcatcctgg tcgacatcta ctccatgaag
cccgaatcgt accagcctcc cgggccgccc 660gtctggcgcg gc
672 64 224 PRT Myceliophthora thermophila 64 Met Arg Leu Leu Ala Ser Leu
Leu Leu Ala Ala Thr Ala Val Gln Ala 1 5 10 15 His Phe Val Asn Gly
Gln Pro Glu Glu Ser Asp Trp Ser Ala Thr Arg 20 25 30 Met Thr Lys
Asn Ala Gln Ser Lys Gln Gly Val Glu Asn Pro Thr Ser 35 40 45 Gly
Asp Ile Arg Cys Tyr Thr Ser Gln Thr Ala Ala Asn Val Val Thr 50 55
60 Val Pro Ala Gly Ser Thr Ile His Tyr Ile Ser Thr Gln Gln Ile Asn
65 70 75 80 His Pro Gly Pro Thr Gln Tyr Tyr Leu Ala Lys Val Pro Pro
Gly Ser 85 90 95 Ser Ala Lys Thr Phe Asp Gly Ser Gly Ala Val Trp
Phe Lys Ile Ser 100 105 110 Thr Thr Met Pro Thr Val Asp Ser Asn Lys
Gln Met Phe Trp Pro Gly 115 120 125 Gln Asn Thr Tyr Glu Thr Ser Asn
Thr Thr Ile Pro Ala Asn Thr Pro 130 135 140 Asp Gly Glu Tyr Leu Leu
Arg Val Lys Gln Ile Ala Leu His Met Ala 145 150 155 160 Ser Gln Pro
Asn Lys Val Gln Phe Tyr Leu Ala Cys Thr Gln Ile Lys 165 170 175 Ile
Thr Gly Gly Arg Asn Gly Thr Pro Ser Pro Leu Val Ala Leu Pro 180 185
190 Gly Ala Tyr Lys Ser Thr Asp Pro Gly Ile Leu Val Asp Ile Tyr Ser
195 200 205 Met Lys Pro Glu Ser Tyr Gln Pro Pro Gly Pro Pro Val Trp
Arg Gly 210 215 220 65 208 PRT Myceliophthora thermophila 65 His Phe
Val Asn Gly Gln Pro Glu Glu Ser Asp Trp Ser Ala Thr Arg 1 5 10 15
Met Thr Lys Asn Ala Gln Ser Lys Gln Gly Val Glu Asn Pro Thr Ser 20
25 30 Gly Asp Ile Arg Cys Tyr Thr Ser Gln Thr Ala Ala Asn Val Val
Thr 35 40 45 Val Pro Ala Gly Ser Thr Ile His Tyr Ile Ser Thr Gln
Gln Ile Asn 50 55 60 His Pro Gly Pro Thr Gln Tyr Tyr Leu Ala Lys
Val Pro Pro Gly Ser 65 70 75 80 Ser Ala Lys Thr Phe Asp Gly Ser Gly
Ala Val Trp Phe Lys Ile Ser 85 90 95 Thr Thr Met Pro Thr Val Asp
Ser Asn Lys Gln Met Phe Trp Pro Gly 100 105 110 Gln Asn Thr Tyr Glu
Thr Ser Asn Thr Thr Ile Pro Ala Asn Thr Pro 115 120 125 Asp Gly Glu
Tyr Leu Leu Arg Val Lys Gln Ile Ala Leu His Met Ala 130 135 140 Ser
Gln Pro Asn Lys Val Gln Phe Tyr Leu Ala Cys Thr Gln Ile Lys 145 150
155 160 Ile Thr Gly Gly Arg Asn Gly Thr Pro Ser Pro Leu Val Ala Leu
Pro 165 170 175 Gly Ala Tyr Lys Ser Thr Asp Pro Gly Ile Leu Val Asp
Ile Tyr Ser 180 185 190 Met Lys Pro Glu Ser Tyr Gln Pro Pro Gly Pro
Pro Val Trp Arg Gly 195 200 205 66 849 DNA Myceliophthora thermophila
66 atgaagccct ttagcctcgt cgccctggcg actgccgtga gcggccatgc catcttccag
60cgggtgtcgg tcaacgggca ggaccagggc cagctcaagg gggtgcgggc gccgtcgagc
120aactccccga tccagaacgt caacgatgcc aacatggcct gcaacgccaa
cattgtgtac 180cacgacaaca ccatcatcaa ggtgcccgcg ggagcccgcg
tcggcgcgtg gtggcagcac 240gtcatcggcg ggccgcaggg cgccaacgac
ccggacaacc cgatcgccgc ctcccacaag 300ggccccatcc aggtctacct
ggccaaggtg gacaacgcgg cgacggcgtc gccgtcgggc 360ctcaagtggt
tcaaggtggc cgagcgcggc ctgaacaacg gcgtgtgggc ctacctgatg
420cgcgtcgagc tgctcgccct gcacagcgcc tcgagccccg gcggcgccca
gttctacatg 480ggctgtgcac agatcgaagt cactggctcc ggcaccaact
cgggctccga ctttgtctcg 540ttccccggcg cctactcggc caacgacccg
ggcatcttgc tgagcatcta cgacagctcg 600ggcaagccca acaatggcgg
gcgctcgtac ccgatccccg gcccgcgccc catctcctgc 660tccggcagcg
gcggcggcgg caacaacggc ggcgacggcg gcgacgacaa caacggtggt
720ggcaacaaca acggcggcgg cagcgtcccc ctgtacgggc agtgcggcgg
catcggctac 780acgggcccga ccacctgtgc ccagggaact tgcaaggtgt
cgaacgaata ctacagccag 840tgcctcccc 84967283PRTMyceliophthora
thermophila 67Met Lys Pro Phe Ser Leu Val Ala Leu Ala Thr Ala Val
Ser Gly His 1 5 10 15 Ala Ile Phe Gln Arg Val Ser Val Asn Gly Gln
Asp Gln Gly Gln Leu 20 25 30 Lys Gly Val Arg Ala Pro Ser Ser Asn
Ser Pro Ile Gln Asn Val Asn 35 40 45 Asp Ala Asn Met Ala Cys Asn
Ala Asn Ile Val Tyr His Asp Asn Thr 50 55 60 Ile Ile Lys Val Pro
Ala Gly Ala Arg Val Gly Ala Trp Trp Gln His 65 70 75 80 Val Ile Gly
Gly Pro Gln Gly Ala Asn Asp Pro Asp Asn Pro Ile Ala 85 90 95 Ala
Ser His Lys Gly Pro Ile Gln Val Tyr Leu Ala Lys Val Asp Asn 100 105
110 Ala Ala Thr Ala Ser Pro Ser Gly Leu Lys Trp Phe Lys Val Ala Glu
115 120 125 Arg Gly Leu Asn Asn Gly Val Trp Ala Tyr Leu Met Arg Val
Glu Leu 130 135 140 Leu Ala Leu His Ser Ala Ser Ser Pro Gly Gly Ala
Gln Phe Tyr Met 145 150 155 160 Gly Cys Ala Gln Ile Glu Val Thr Gly
Ser Gly Thr Asn Ser Gly Ser 165 170 175 Asp Phe Val Ser Phe Pro Gly
Ala Tyr Ser Ala Asn Asp Pro Gly Ile 180 185 190 Leu Leu Ser Ile Tyr
Asp Ser Ser Gly Lys Pro Asn Asn Gly Gly Arg 195 200 205 Ser Tyr Pro
Ile Pro Gly Pro Arg Pro Ile Ser Cys Ser Gly Ser Gly 210 215 220 Gly
Gly Gly Asn Asn Gly Gly Asp Gly Gly Asp Asp Asn Asn Gly Gly 225 230
235 240 Gly Asn Asn Asn Gly Gly Gly Ser Val Pro Leu Tyr Gly Gln Cys
Gly 245 250 255 Gly Ile Gly Tyr Thr Gly Pro Thr Thr Cys Ala Gln Gly
Thr Cys Lys 260 265 270 Val Ser Asn Glu Tyr Tyr Ser Gln Cys Leu Pro
275 280 68 268 PRT Myceliophthora thermophila 68 His Ala Ile Phe Gln
Arg Val Ser Val Asn Gly Gln Asp Gln Gly Gln 1 5 10 15 Leu Lys Gly
Val Arg Ala Pro Ser Ser Asn Ser Pro Ile Gln Asn Val 20 25 30 Asn
Asp Ala Asn Met Ala Cys Asn Ala Asn Ile Val Tyr His Asp Asn 35 40
45 Thr Ile Ile Lys Val Pro Ala Gly Ala Arg Val Gly Ala Trp Trp Gln
50 55 60 His Val Ile Gly Gly Pro Gln Gly Ala Asn Asp Pro Asp Asn
Pro Ile 65 70 75 80 Ala Ala Ser His Lys Gly Pro Ile Gln Val Tyr Leu
Ala Lys Val Asp 85 90 95 Asn Ala Ala Thr Ala Ser Pro Ser Gly Leu
Lys Trp Phe Lys Val Ala 100 105 110 Glu Arg Gly Leu Asn Asn Gly Val
Trp Ala Tyr Leu Met Arg Val Glu 115 120 125 Leu Leu Ala Leu His Ser
Ala Ser Ser Pro Gly Gly Ala Gln Phe Tyr 130 135 140 Met Gly Cys Ala
Gln Ile Glu Val Thr Gly Ser Gly Thr Asn Ser Gly 145 150 155 160 Ser
Asp Phe Val Ser Phe Pro Gly Ala Tyr Ser Ala Asn Asp Pro Gly 165 170
175 Ile Leu Leu Ser Ile Tyr Asp Ser Ser Gly Lys Pro Asn Asn Gly Gly
180 185 190 Arg Ser Tyr Pro Ile Pro Gly Pro Arg Pro Ile Ser Cys Ser
Gly Ser 195 200 205 Gly Gly Gly Gly Asn Asn Gly Gly Asp Gly Gly Asp
Asp Asn Asn Gly 210 215 220 Gly Gly Asn Asn Asn Gly Gly Gly Ser Val
Pro Leu Tyr Gly Gln Cys 225 230 235 240 Gly Gly Ile Gly Tyr Thr Gly
Pro Thr Thr Cys Ala Gln Gly Thr Cys 245 250 255 Lys Val Ser Asn Glu
Tyr Tyr Ser Gln Cys Leu Pro 260 265 69 639 DNA Myceliophthora
thermophila 69 atgaagctca cctcgtccct cgctgtcctg gccgctgccg
gcgcccaggc tcactatacc 60ttccctaggg ccggcactgg tggttcgctc tctggcgagt
gggaggtggt ccgcatgacc 120gagaaccatt actcgcacgg cccggtcacc
gatgtcacca gccccgagat gacctgctat 180cagtccggcg tgcagggtgc
gccccagacc gtccaggtca aggcgggctc ccaattcacc 240ttcagcgtgg
atccctccat cggccacccc ggccctctcc agttctacat ggctaaggtg
300ccgtcgggcc agacggccgc cacctttgac ggcacgggag ccgtgtggtt
caagatctac 360caagacggcc cgaacggcct cggcaccgac agcattacct
ggcccagcgc cggcaaaacc 420gaggtctcgg tcaccatccc cagctgcatc
gaggatggcg agtacctgct ccgggtcgag 480cacacccccc tccctacagc
gccagcagcg caaaaccgag ctcgctcgtc accatcccca 540gctgcataca
aggccaccga cccgggcatc ctcttccagc tctactggcc catcccgacc
600gagtacatca accccggccc ggcccccgtc tcttgctaa
639 70 212 PRT Myceliophthora thermophila 70 Met Lys Leu Thr Ser Ser Leu
Ala Val Leu Ala Ala Ala Gly Ala Gln 1 5 10 15 Ala His Tyr Thr Phe
Pro Arg Ala Gly Thr Gly Gly Ser Leu Ser Gly 20 25 30 Glu Trp Glu
Val Val Arg Met Thr Glu Asn His Tyr Ser His Gly Pro 35 40 45 Val
Thr Asp Val Thr Ser Pro Glu Met Thr Cys Tyr Gln Ser Gly Val 50 55
60 Gln Gly Ala Pro Gln Thr Val Gln Val Lys Ala Gly Ser Gln Phe Thr
65 70 75 80 Phe Ser Val Asp Pro Ser Ile Gly His Pro Gly Pro Leu Gln
Phe Tyr 85 90 95 Met Ala Lys Val Pro Ser Gly Gln Thr Ala Ala Thr
Phe Asp Gly Thr 100 105 110 Gly Ala Val Trp Phe Lys Ile Tyr Gln Asp
Gly Pro Asn Gly Leu Gly 115 120 125 Thr Asp Ser Ile Thr Trp Pro Ser
Ala Gly Lys Thr Glu Val Ser Val 130 135 140 Thr Ile Pro Ser Cys Ile
Glu Asp Gly Glu Tyr Leu Leu Arg Val Glu 145 150 155 160 His Thr Pro
Leu Pro Thr Ala Pro Ala Ala Gln Asn Arg Ala Arg Ser 165 170 175 Ser
Pro Ser Pro Ala Ala Tyr Lys Ala Thr Asp Pro Gly Ile Leu Phe 180 185
190 Gln Leu Tyr Trp Pro Ile Pro Thr Glu Tyr Ile Asn Pro Gly Pro Ala
195 200 205 Pro
Val Ser Cys 210 71 195 PRT Myceliophthora thermophila 71 His Tyr Thr
Phe Pro Arg Ala Gly Thr Gly Gly Ser Leu Ser Gly Glu 1 5 10 15 Trp
Glu Val Val Arg Met Thr Glu Asn His Tyr Ser His Gly Pro Val 20 25
30 Thr Asp Val Thr Ser Pro Glu Met Thr Cys Tyr Gln Ser Gly Val Gln
35 40 45 Gly Ala Pro Gln Thr Val Gln Val Lys Ala Gly Ser Gln Phe
Thr Phe 50 55 60 Ser Val Asp Pro Ser Ile Gly His Pro Gly Pro Leu
Gln Phe Tyr Met 65 70 75 80 Ala Lys Val Pro Ser Gly Gln Thr Ala Ala
Thr Phe Asp Gly Thr Gly 85 90 95 Ala Val Trp Phe Lys Ile Tyr Gln
Asp Gly Pro Asn Gly Leu Gly Thr 100 105 110 Asp Ser Ile Thr Trp Pro
Ser Ala Gly Lys Thr Glu Val Ser Val Thr 115 120 125 Ile Pro Ser Cys
Ile Glu Asp Gly Glu Tyr Leu Leu Arg Val Glu His 130 135 140 Thr Pro
Leu Pro Thr Ala Pro Ala Ala Gln Asn Arg Ala Arg Ser Ser 145 150 155
160 Pro Ser Pro Ala Ala Tyr Lys Ala Thr Asp Pro Gly Ile Leu Phe Gln
165 170 175 Leu Tyr Trp Pro Ile Pro Thr Glu Tyr Ile Asn Pro Gly Pro
Ala Pro 180 185 190 Val Ser Cys 195 72 695 DNA Myceliophthora
thermophila 72 atgaagctca cctcgtccct cgctgtcctg gccgctgccg
gcgcccaggc tcactatacc 60ttccctaggg ccggcactgg tggttcgctc tctggcgagt
gggaggtggt ccgcatgacc 120gagaccatta ctcgcacggc ccggtcaccg
atgtcaccag ccccgagatg acctgctatc 180agtccggcgt gcagggtgcg
ccccagaccg tccaggtcaa ggcgggctcc caattcacct 240tcagcgtgga
tccctccatc ggccaccccg gccctctcca gttctacatg gctaaggtgc
300cgtcgggcca gacggccgcc acctttgacg gcacgggagc cgtgtggttc
aagatctacc 360aagacggccc gaacggcctc ggcaccgaca gcattacctg
gcccagcgcc ggcaaaaccg 420aggtctcggt caccatcccc agctgcatcg
aggatggcga gtacctgctc cgggtcgagc 480acatcgcgct ccacagcgcc
agcagcgtgg gcggcgccca gttctacatc gcctgcgccc 540agctctccgt
caccggcggc tccggcaccc tcaacacggg ctcgctcgtc tccctgcccg
600gcgcctacaa ggccaccgac ccgggcatcc tcttccagct ctactggccc
atcccgaccg 660agtacatcaa ccccggcccg gcccccgtct cttgc
695 73 232 PRT Myceliophthora thermophila 73 Met Lys Leu Thr Ser Ser Leu
Ala Val Leu Ala Ala Ala Gly Ala Gln 1 5 10 15 Ala His Tyr Thr Phe
Pro Arg Ala Gly Thr Gly Gly Ser Leu Ser Gly 20 25 30 Glu Trp Glu
Val Val Arg Met Thr Glu Asn His Tyr Ser His Gly Pro 35 40 45 Val
Thr Asp Val Thr Ser Pro Glu Met Thr Cys Tyr Gln Ser Gly Val 50 55
60 Gln Gly Ala Pro Gln Thr Val Gln Val Lys Ala Gly Ser Gln Phe Thr
65 70 75 80 Phe Ser Val Asp Pro Ser Ile Gly His Pro Gly Pro Leu Gln
Phe Tyr 85 90 95 Met Ala Lys Val Pro Ser Gly Gln Thr Ala Ala Thr
Phe Asp Gly Thr 100 105 110 Gly Ala Val Trp Phe Lys Ile Tyr Gln Asp
Gly Pro Asn Gly Leu Gly 115 120 125 Thr Asp Ser Ile Thr Trp Pro Ser
Ala Gly Lys Thr Glu Val Ser Val 130 135 140 Thr Ile Pro Ser Cys Ile
Glu Asp Gly Glu Tyr Leu Leu Arg Val Glu 145 150 155 160 His Ile Ala
Leu His Ser Ala Ser Ser Val Gly Gly Ala Gln Phe Tyr 165 170 175 Ile
Ala Cys Ala Gln Leu Ser Val Thr Gly Gly Ser Gly Thr Leu Asn 180 185
190 Thr Gly Ser Leu Val Ser Leu Pro Gly Ala Tyr Lys Ala Thr Asp Pro
195 200 205 Gly Ile Leu Phe Gln Leu Tyr Trp Pro Ile Pro Thr Glu Tyr
Ile Asn 210 215 220 Pro Gly Pro Ala Pro Val Ser Cys 225 230
74 215 PRT Myceliophthora thermophila 74 His Tyr Thr Phe Pro Arg Ala
Gly Thr Gly Gly Ser Leu Ser Gly Glu 1 5 10 15 Trp Glu Val Val Arg
Met Thr Glu Asn His Tyr Ser His Gly Pro Val 20 25 30 Thr Asp Val
Thr Ser Pro Glu Met Thr Cys Tyr Gln Ser Gly Val Gln 35 40 45 Gly
Ala Pro Gln Thr Val Gln Val Lys Ala Gly Ser Gln Phe Thr Phe 50 55
60 Ser Val Asp Pro Ser Ile Gly His Pro Gly Pro Leu Gln Phe Tyr Met
65 70 75 80 Ala Lys Val Pro Ser Gly Gln Thr Ala Ala Thr Phe Asp Gly
Thr Gly 85 90 95 Ala Val Trp Phe Lys Ile Tyr Gln Asp Gly Pro Asn
Gly Leu Gly Thr 100 105 110 Asp Ser Ile Thr Trp Pro Ser Ala Gly Lys
Thr Glu Val Ser Val Thr 115 120 125 Ile Pro Ser Cys Ile Glu Asp Gly
Glu Tyr Leu Leu Arg Val Glu His 130 135 140 Ile Ala Leu His Ser Ala
Ser Ser Val Gly Gly Ala Gln Phe Tyr Ile 145 150 155 160 Ala Cys Ala
Gln Leu Ser Val Thr Gly Gly Ser Gly Thr Leu Asn Thr 165 170 175 Gly
Ser Leu Val Ser Leu Pro Gly Ala Tyr Lys Ala Thr Asp Pro Gly 180 185
190 Ile Leu Phe Gln Leu Tyr Trp Pro Ile Pro Thr Glu Tyr Ile Asn Pro
195 200 205 Gly Pro Ala Pro Val Ser Cys 210 215
75 447 DNA Myceliophthora thermophila 75 atgccgccac cacgactgag
caccctcctt cccctcctag ccttaatagc ccccaccgcc 60ctggggcact cccacctcgg
gtacatcatc atcaacggcg aggtatacca aggattcgac 120ccgcggccgg
agcaggcgaa ctcgccgttg cgcgtgggct ggtcgacggg ggcaatcgac
180gacgggttcg tggcgccggc caactactcg tcgcccgaca tcatctgcca
catcgagggg 240gccagcccgc cggcgcacgc gcccgtccgg gcgggcgacc
gggtgcacgt gcaatggaac 300ggctggccgc tcggacacgt ggggccggtg
ctgtcgtacc tggcgccctg cggcgggctg 360gaggggtccg agagcgggtg
cgccggggtg gacaagcggc agctgcggtg gaccaaggtg 420gacgactcgc
tgccggcgat ggagctg 447 76 149 PRT Myceliophthora thermophila 76 Met Pro
Pro Pro Arg Leu Ser Thr Leu Leu Pro Leu Leu Ala Leu Ile 1 5 10 15
Ala Pro Thr Ala Leu Gly His Ser His Leu Gly Tyr Ile Ile Ile Asn 20
25 30 Gly Glu Val Tyr Gln Gly Phe Asp Pro Arg Pro Glu Gln Ala Asn
Ser 35 40 45 Pro Leu Arg Val Gly Trp Ser Thr Gly Ala Ile Asp Asp
Gly Phe Val 50 55 60 Ala Pro Ala Asn Tyr Ser Ser Pro Asp Ile Ile
Cys His Ile Glu Gly 65 70 75 80 Ala Ser Pro Pro Ala His Ala Pro Val
Arg Ala Gly Asp Arg Val His 85 90 95 Val Gln Trp Asn Gly Trp Pro
Leu Gly His Val Gly Pro Val Leu Ser 100 105 110 Tyr Leu Ala Pro Cys
Gly Gly Leu Glu Gly Ser Glu Ser Gly Cys Ala 115 120 125 Gly Val Asp
Lys Arg Gln Leu Arg Trp Thr Lys Val Asp Asp Ser Leu 130 135 140 Pro
Ala Met Glu Leu 145 77 127 PRT Myceliophthora thermophila 77 His Ser
His Leu Gly Tyr Ile Ile Ile Asn Gly Glu Val Tyr Gln Gly 1 5 10 15
Phe Asp Pro Arg Pro Glu Gln Ala Asn Ser Pro Leu Arg Val Gly Trp 20
25 30 Ser Thr Gly Ala Ile Asp Asp Gly Phe Val Ala Pro Ala Asn Tyr
Ser 35 40 45 Ser Pro Asp Ile Ile Cys His Ile Glu Gly Ala Ser Pro
Pro Ala His 50 55 60 Ala Pro Val Arg Ala Gly Asp Arg Val His Val
Gln Trp Asn Gly Trp 65 70 75 80 Pro Leu Gly His Val Gly Pro Val Leu
Ser Tyr Leu Ala Pro Cys Gly 85 90 95 Gly Leu Glu Gly Ser Glu Ser
Gly Cys Ala Gly Val Asp Lys Arg Gln 100 105 110 Leu Arg Trp Thr Lys
Val Asp Asp Ser Leu Pro Ala Met Glu Leu 115 120 125
78 1176 DNA Myceliophthora thermophila 78 atgccgccac cacgactgag
caccctcctt cccctcctag ccttaatagc ccccaccgcc 60ctggggcact cccacctcgg
gtacatcatc atcaacggcg aggtatacca aggattcgac 120ccgcggccgg
agcaggcgaa ctcgccgttg cgcgtgggct ggtcgacggg ggcaatcgac
180gacgggttcg tggcgccggc caactactcg tcgcccgaca tcatctgcca
catcgagggg 240gccagcccgc cggcgcacgc gcccgtccgg gcgggcgacc
gggtgcacgt gcaatggaaa 300cggctggccg ctcggacacg tggggccggt
gctgtcgtac ctggcgccct gcggcgggct 360ggaggggtcc gagagcgggt
ggacgactcg ctgccggcga tggagctggt cggggccgcg 420gggggcgcgg
ggggcgagga cgacggcagc ggcagcgacg gcagcggcag cggcggcagc
480ggacgcgtcg gcgtgcccgg gcagcgctgg gccaccgacg tgttgatcgc
ggccaacaac 540agctggcagg tcgagatccc gcgcgggctg cgggacgggc
cgtacgtgct gcgccacgag 600atcgtcgcgc tgcactacgc ggccgagccc
ggcggcgcgc agaactaccc gctctgcgtc 660aacctgtggg tcgagggcgg
cgacggcagc atggagctgg accacttcga cgccacccag 720ttctaccggc
ccgacgaccc gggcatcctg ctcaacgtga cggccggcct gcgctcatac
780gccgtgccgg gcccgacgct ggccgcgggg gcgacgccgg tgccgtacgc
gcagcagaac 840atcagctcgg cgagggcgga tggaaccccc gtgattgtca
ccaggagcac ggagacggtg 900cccttcaccg cggcacccac gccagccgag
acggcagaag ccaaaggggg gaggtatgat 960gaccaaaccc gaactaaaga
cctaaatgaa cgcttctttt atagtagccg gccagaacag 1020aagaggctga
cagcgacctc aagaagggaa ctagttgatc atcgtacccg gtacctctcc
1080gtagctgtct gcgcagattt cggcgctcat aaggcagcag aaaccaacca
cgaagctttg 1140agaggcggca ataagcacca tggcggtgtt tcagag
1176 79 392 PRT Myceliophthora thermophila 79 Met Pro Pro Pro Arg Leu
Ser Thr Leu Leu Pro Leu Leu Ala Leu Ile 1 5 10 15 Ala Pro Thr Ala
Leu Gly His Ser His Leu Gly Tyr Ile Ile Ile Asn 20 25 30 Gly Glu
Val Tyr Gln Gly Phe Asp Pro Arg Pro Glu Gln Ala Asn Ser 35 40 45
Pro Leu Arg Val Gly Trp Ser Thr Gly Ala Ile Asp Asp Gly Phe Val 50
55 60 Ala Pro Ala Asn Tyr Ser Ser Pro Asp Ile Ile Cys His Ile Glu
Gly 65 70 75 80 Ala Ser Pro Pro Ala His Ala Pro Val Arg Ala Gly Asp
Arg Val His 85 90 95 Val Gln Trp Lys Arg Leu Ala Ala Arg Thr Arg
Gly Ala Gly Ala Val 100 105 110 Val Pro Gly Ala Leu Arg Arg Ala Gly
Gly Val Arg Glu Arg Val Asp 115 120 125 Asp Ser Leu Pro Ala Met Glu
Leu Val Gly Ala Ala Gly Gly Ala Gly 130 135 140 Gly Glu Asp Asp Gly
Ser Gly Ser Asp Gly Ser Gly Ser Gly Gly Ser 145 150 155 160 Gly Arg
Val Gly Val Pro Gly Gln Arg Trp Ala Thr Asp Val Leu Ile 165 170 175
Ala Ala Asn Asn Ser Trp Gln Val Glu Ile Pro Arg Gly Leu Arg Asp 180
185 190 Gly Pro Tyr Val Leu Arg His Glu Ile Val Ala Leu His Tyr Ala
Ala 195 200 205 Glu Pro Gly Gly Ala Gln Asn Tyr Pro Leu Cys Val Asn
Leu Trp Val 210 215 220 Glu Gly Gly Asp Gly Ser Met Glu Leu Asp His
Phe Asp Ala Thr Gln 225 230 235 240 Phe Tyr Arg Pro Asp Asp Pro Gly
Ile Leu Leu Asn Val Thr Ala Gly 245 250 255 Leu Arg Ser Tyr Ala Val
Pro Gly Pro Thr Leu Ala Ala Gly Ala Thr 260 265 270 Pro Val Pro Tyr
Ala Gln Gln Asn Ile Ser Ser Ala Arg Ala Asp Gly 275 280 285 Thr Pro
Val Ile Val Thr Arg Ser Thr Glu Thr Val Pro Phe Thr Ala 290 295 300
Ala Pro Thr Pro Ala Glu Thr Ala Glu Ala Lys Gly Gly Arg Tyr Asp 305
310 315 320 Asp Gln Thr Arg Thr Lys Asp Leu Asn Glu Arg Phe Phe Tyr
Ser Ser 325 330 335 Arg Pro Glu Gln Lys Arg Leu Thr Ala Thr Ser Arg
Arg Glu Leu Val 340 345 350 Asp His Arg Thr Arg Tyr Leu Ser Val Ala
Val Cys Ala Asp Phe Gly 355 360 365 Ala His Lys Ala Ala Glu Thr Asn
His Glu Ala Leu Arg Gly Gly Asn 370 375 380 Lys His His Gly Gly Val
Ser Glu 385 390 80 370 PRT Myceliophthora thermophila 80 His Ser His
Leu Gly Tyr Ile Ile Ile Asn Gly Glu Val Tyr Gln Gly 1 5 10 15 Phe
Asp Pro Arg Pro Glu Gln Ala Asn Ser Pro Leu Arg Val Gly Trp 20 25
30 Ser Thr Gly Ala Ile Asp Asp Gly Phe Val Ala Pro Ala Asn Tyr Ser
35 40 45 Ser Pro Asp Ile Ile Cys His Ile Glu Gly Ala Ser Pro Pro
Ala His 50 55 60 Ala Pro Val Arg Ala Gly Asp Arg Val His Val Gln
Trp Lys Arg Leu 65 70 75 80 Ala Ala Arg Thr Arg Gly Ala Gly Ala Val
Val Pro Gly Ala Leu Arg 85 90 95 Arg Ala Gly Gly Val Arg Glu Arg
Val Asp Asp Ser Leu Pro Ala Met 100 105 110 Glu Leu Val Gly Ala Ala
Gly Gly Ala Gly Gly Glu Asp Asp Gly Ser 115 120 125 Gly Ser Asp Gly
Ser Gly Ser Gly Gly Ser Gly Arg Val Gly Val Pro 130 135 140 Gly Gln
Arg Trp Ala Thr Asp Val Leu Ile Ala Ala Asn Asn Ser Trp 145 150 155
160 Gln Val Glu Ile Pro Arg Gly Leu Arg Asp Gly Pro Tyr Val Leu Arg
165 170 175 His Glu Ile Val Ala Leu His Tyr Ala Ala Glu Pro Gly Gly
Ala Gln 180 185 190 Asn Tyr Pro Leu Cys Val Asn Leu Trp Val Glu Gly
Gly Asp Gly Ser 195 200 205 Met Glu Leu Asp His Phe Asp Ala Thr Gln
Phe Tyr Arg Pro Asp Asp 210 215 220 Pro Gly Ile Leu Leu Asn Val Thr
Ala Gly Leu Arg Ser Tyr Ala Val 225 230 235 240 Pro Gly Pro Thr Leu
Ala Ala Gly Ala Thr Pro Val Pro Tyr Ala Gln 245 250 255 Gln Asn Ile
Ser Ser Ala Arg Ala Asp Gly Thr Pro Val Ile Val Thr 260 265 270 Arg
Ser Thr Glu Thr Val Pro Phe Thr Ala Ala Pro Thr Pro Ala Glu 275 280
285 Thr Ala Glu Ala Lys Gly Gly Arg Tyr Asp Asp Gln Thr Arg Thr Lys
290 295 300 Asp Leu Asn Glu Arg Phe Phe Tyr Ser Ser Arg Pro Glu Gln
Lys Arg 305 310 315 320 Leu Thr Ala Thr Ser Arg Arg Glu Leu Val Asp
His Arg Thr Arg Tyr 325 330 335 Leu Ser Val Ala Val Cys Ala Asp Phe
Gly Ala His Lys Ala Ala Glu 340 345 350 Thr Asn His Glu Ala Leu Arg
Gly Gly Asn Lys His His Gly Gly Val 355 360 365 Ser Glu 370
81 453 DNA Myceliophthora thermophila 81 atgaggtcga cattggccgg
tgccctggca gccatcgctg ctcagaaagt agccggccac 60gccacgtttc agcagctctg
gcacggctcc tcctgtgtcc gccttccggc tagcaactca 120cccgtcacca
atgtgggaag cagagacttc gtctgcaacg ctggcacccg ccccgtcagt
180ggcaagtgcc ccgtgaaggc tggcggcacc gtcaccatcg agatgcacca
gcaacccggc 240gaccgcagct gcaacaacga agccatcgga ggggcgcatt
ggggccccgt ccaggtgtac 300ctgaccaagg ttcaggacgc cgcgacggcc
gacggctcga cgggctggtt caagatcttc 360tccgactcgt ggtccaagaa
gcccgggggc aacttgggcg acgacgacaa ctggggcacg 420cgcgacctga
acgcctgctg cgggaagatg gac 453 82 151 PRT Myceliophthora thermophila
82 Met Arg Ser Thr Leu Ala Gly Ala Leu Ala Ala Ile Ala Ala Gln Lys 1
5 10 15 Val Ala Gly His Ala Thr Phe Gln Gln Leu Trp His Gly Ser Ser
Cys 20 25 30 Val Arg Leu Pro Ala Ser Asn Ser Pro Val Thr Asn Val
Gly Ser Arg 35 40 45 Asp Phe Val Cys Asn Ala Gly Thr Arg Pro Val
Ser Gly Lys Cys Pro 50 55 60 Val Lys Ala Gly Gly Thr Val Thr Ile
Glu Met His Gln Gln Pro Gly 65 70 75 80 Asp Arg Ser Cys Asn Asn Glu
Ala Ile Gly Gly Ala His Trp Gly Pro 85 90 95 Val Gln Val Tyr Leu
Thr Lys Val Gln Asp Ala Ala Thr Ala Asp Gly 100 105 110 Ser Thr Gly
Trp Phe Lys Ile Phe Ser Asp Ser Trp Ser Lys Lys
Pro 115 120 125 Gly Gly Asn Leu Gly Asp Asp Asp Asn Trp Gly Thr Arg
Asp Leu Asn 130 135 140 Ala Cys Cys Gly Lys Met Asp 145 150
83 132 PRT Myceliophthora thermophila 83 His Ala Thr Phe Gln Gln Leu
Trp His Gly Ser Ser Cys Val Arg Leu 1 5 10 15 Pro Ala Ser Asn Ser
Pro Val Thr Asn Val Gly Ser Arg Asp Phe Val 20 25 30 Cys Asn Ala
Gly Thr Arg Pro Val Ser Gly Lys Cys Pro Val Lys Ala 35 40 45 Gly
Gly Thr Val Thr Ile Glu Met His Gln Gln Pro Gly Asp Arg Ser 50 55
60 Cys Asn Asn Glu Ala Ile Gly Gly Ala His Trp Gly Pro Val Gln Val
65 70 75 80 Tyr Leu Thr Lys Val Gln Asp Ala Ala Thr Ala Asp Gly Ser
Thr Gly 85 90 95 Trp Phe Lys Ile Phe Ser Asp Ser Trp Ser Lys Lys
Pro Gly Gly Asn 100 105 110 Leu Gly Asp Asp Asp Asn Trp Gly Thr Arg
Asp Leu Asn Ala Cys Cys 115 120 125 Gly Lys Met Asp 130
84 837 DNA Myceliophthora thermophila 84 atgaggtcga cattggccgg
tgccctggca gccatcgctg ctcagaaagt agccggccac 60gccacgtttc agcagctctg
gcacggctcc tcctgtgtcc gccttccggc tagcaactca 120cccgtcacca
atgtgggaag cagagacttc gtctgcaacg ctggcacccg ccccgtcagt
180ggcaagtgcc ccgtgaaggc tggcggcacc gtcaccatcg agatgcacca
gcaacccggc 240gaccgcagct gcaacaacga agccatcgga ggggcgcatt
ggggccccgt ccaggtgtac 300ctgaccaagg ttcaggacgc cgcgacggcc
gacggctcga cgggctggtt caagatcttc 360tccgactcgt ggtccaagaa
gcccgggggc aactcgggcg acgacgacaa ctggggcacg 420cgcgacctga
acgcctgctg cgggaagatg gacgtggcca tcccggccga catcgcgtcg
480ggcgactacc tgctgcgggc cgaggcgctg gccctgcaca cggccggaca
ggccggcggc 540gcccagttct acatgagctg ctaccagatg acggtcgagg
gcggctccgg gaccgccaac 600ccgcccaccg tcaagttccc gggcgcctac
agcgccaacg acccgggcat cctcgtcaac 660atccacgccc ccctttccag
ctacaccgcg cccggcccgg ccgtctacgc gggcggcacc 720atccgcgagg
ccggctccgc ctgcaccggc tgcgcgcaga cctgcaaggt cgggtcgtcc
780ccgagcgccg ttgcccccgg cagcggcgcg ggcaacggcg gcgggttcca accccga
837 85 279 PRT Myceliophthora thermophila 85 Met Arg Ser Thr Leu Ala Gly
Ala Leu Ala Ala Ile Ala Ala Gln Lys 1 5 10 15 Val Ala Gly His Ala
Thr Phe Gln Gln Leu Trp His Gly Ser Ser Cys 20 25 30 Val Arg Leu
Pro Ala Ser Asn Ser Pro Val Thr Asn Val Gly Ser Arg 35 40 45 Asp
Phe Val Cys Asn Ala Gly Thr Arg Pro Val Ser Gly Lys Cys Pro 50 55
60 Val Lys Ala Gly Gly Thr Val Thr Ile Glu Met His Gln Gln Pro Gly
65 70 75 80 Asp Arg Ser Cys Asn Asn Glu Ala Ile Gly Gly Ala His Trp
Gly Pro 85 90 95 Val Gln Val Tyr Leu Thr Lys Val Gln Asp Ala Ala
Thr Ala Asp Gly 100 105 110 Ser Thr Gly Trp Phe Lys Ile Phe Ser Asp
Ser Trp Ser Lys Lys Pro 115 120 125 Gly Gly Asn Ser Gly Asp Asp Asp
Asn Trp Gly Thr Arg Asp Leu Asn 130 135 140 Ala Cys Cys Gly Lys Met
Asp Val Ala Ile Pro Ala Asp Ile Ala Ser 145 150 155 160 Gly Asp Tyr
Leu Leu Arg Ala Glu Ala Leu Ala Leu His Thr Ala Gly 165 170 175 Gln
Ala Gly Gly Ala Gln Phe Tyr Met Ser Cys Tyr Gln Met Thr Val 180 185
190 Glu Gly Gly Ser Gly Thr Ala Asn Pro Pro Thr Val Lys Phe Pro Gly
195 200 205 Ala Tyr Ser Ala Asn Asp Pro Gly Ile Leu Val Asn Ile His
Ala Pro 210 215 220 Leu Ser Ser Tyr Thr Ala Pro Gly Pro Ala Val Tyr
Ala Gly Gly Thr 225 230 235 240 Ile Arg Glu Ala Gly Ser Ala Cys Thr
Gly Cys Ala Gln Thr Cys Lys 245 250 255 Val Gly Ser Ser Pro Ser Ala
Val Ala Pro Gly Ser Gly Ala Gly Asn 260 265 270 Gly Gly Gly Phe Gln
Pro Arg 275 86 260 PRT Myceliophthora thermophila 86 His Ala Thr Phe
Gln Gln Leu Trp His Gly Ser Ser Cys Val Arg Leu 1 5 10 15 Pro Ala
Ser Asn Ser Pro Val Thr Asn Val Gly Ser Arg Asp Phe Val 20 25 30
Cys Asn Ala Gly Thr Arg Pro Val Ser Gly Lys Cys Pro Val Lys Ala 35
40 45 Gly Gly Thr Val Thr Ile Glu Met His Gln Gln Pro Gly Asp Arg
Ser 50 55 60 Cys Asn Asn Glu Ala Ile Gly Gly Ala His Trp Gly Pro
Val Gln Val 65 70 75 80 Tyr Leu Thr Lys Val Gln Asp Ala Ala Thr Ala
Asp Gly Ser Thr Gly 85 90 95 Trp Phe Lys Ile Phe Ser Asp Ser Trp
Ser Lys Lys Pro Gly Gly Asn 100 105 110 Ser Gly Asp Asp Asp Asn Trp
Gly Thr Arg Asp Leu Asn Ala Cys Cys 115 120 125 Gly Lys Met Asp Val
Ala Ile Pro Ala Asp Ile Ala Ser Gly Asp Tyr 130 135 140 Leu Leu Arg
Ala Glu Ala Leu Ala Leu His Thr Ala Gly Gln Ala Gly 145 150 155 160
Gly Ala Gln Phe Tyr Met Ser Cys Tyr Gln Met Thr Val Glu Gly Gly 165
170 175 Ser Gly Thr Ala Asn Pro Pro Thr Val Lys Phe Pro Gly Ala Tyr
Ser 180 185 190 Ala Asn Asp Pro Gly Ile Leu Val Asn Ile His Ala Pro
Leu Ser Ser 195 200 205 Tyr Thr Ala Pro Gly Pro Ala Val Tyr Ala Gly
Gly Thr Ile Arg Glu 210 215 220 Ala Gly Ser Ala Cys Thr Gly Cys Ala
Gln Thr Cys Lys Val Gly Ser 225 230 235 240 Ser Pro Ser Ala Val Ala
Pro Gly Ser Gly Ala Gly Asn Gly Gly Gly 245 250 255 Phe Gln Pro Arg
260 87 735 DNA Myceliophthora thermophila 87 atgctcctcc tcaccctagc
cacactcgtc accctcctgg cgcgccacgt ctcggctcac 60gcccggctgt tccgcgtctc
tgtcgacggg aaagaccagg gcgacgggct gaacaagtac 120atccgctcgc
cggcgaccaa cgaccccgtg cgcgacctct cgagcgccgc catcgtgtgc
180aacacccagg ggtccaaggc cgccccggac ttcgtcaggg ccgcggccgg
cgacaagctg 240accttcctct gggcgcacga caacccggac gacccggtcg
actacgtcct cgacccgtcc 300cacaagggcg ccatcctgac ctacgtcgcc
gcctacccct ccggggaccc gaccggcccc 360atctggagca agcttgccga
ggaaggattc accggcgggc agtgggcgac catcaagatg 420atcgacaacg
gcggcaaggt cgacgtgacg ctgcccgagg cccttgcgcc gggaaagtac
480ctgatccgcc aggagctgct ggccctgcac cgggccgact ttgcctgcga
cgacccggcc 540caccccaacc gcggcgccga gtcgtacccc aactgcgtcc
aggtggaggt gtcgggcagc 600ggcgacaaga agccggacca gaactttgac
ttcaacaagg gctatacctg cgataacaaa 660ggactccact ttaagatcta
catcggtcag gacagccagt atgtggcccc ggggccgcgg 720ccttggaatg ggagc
735 88 245 PRT Myceliophthora thermophila 88 Met Leu Leu Leu Thr Leu Ala
Thr Leu Val Thr Leu Leu Ala Arg His 1 5 10 15 Val Ser Ala His Ala
Arg Leu Phe Arg Val Ser Val Asp Gly Lys Asp 20 25 30 Gln Gly Asp
Gly Leu Asn Lys Tyr Ile Arg Ser Pro Ala Thr Asn Asp 35 40 45 Pro
Val Arg Asp Leu Ser Ser Ala Ala Ile Val Cys Asn Thr Gln Gly 50 55
60 Ser Lys Ala Ala Pro Asp Phe Val Arg Ala Ala Ala Gly Asp Lys Leu
65 70 75 80 Thr Phe Leu Trp Ala His Asp Asn Pro Asp Asp Pro Val Asp
Tyr Val 85 90 95 Leu Asp Pro Ser His Lys Gly Ala Ile Leu Thr Tyr
Val Ala Ala Tyr 100 105 110 Pro Ser Gly Asp Pro Thr Gly Pro Ile Trp
Ser Lys Leu Ala Glu Glu 115 120 125 Gly Phe Thr Gly Gly Gln Trp Ala
Thr Ile Lys Met Ile Asp Asn Gly 130 135 140 Gly Lys Val Asp Val Thr
Leu Pro Glu Ala Leu Ala Pro Gly Lys Tyr 145 150 155 160 Leu Ile Arg
Gln Glu Leu Leu Ala Leu His Arg Ala Asp Phe Ala Cys 165 170 175 Asp
Asp Pro Ala His Pro Asn Arg Gly Ala Glu Ser Tyr Pro Asn Cys 180 185
190 Val Gln Val Glu Val Ser Gly Ser Gly Asp Lys Lys Pro Asp Gln Asn
195 200 205 Phe Asp Phe Asn Lys Gly Tyr Thr Cys Asp Asn Lys Gly Leu
His Phe 210 215 220 Lys Ile Tyr Ile Gly Gln Asp Ser Gln Tyr Val Ala
Pro Gly Pro Arg 225 230 235 240 Pro Trp Asn Gly Ser 245
89 226 PRT Myceliophthora thermophila 89 His Ala Arg Leu Phe Arg Val
Ser Val Asp Gly Lys Asp Gln Gly Asp 1 5 10 15 Gly Leu Asn Lys Tyr
Ile Arg Ser Pro Ala Thr Asn Asp Pro Val Arg 20 25 30 Asp Leu Ser
Ser Ala Ala Ile Val Cys Asn Thr Gln Gly Ser Lys Ala 35 40 45 Ala
Pro Asp Phe Val Arg Ala Ala Ala Gly Asp Lys Leu Thr Phe Leu 50 55
60 Trp Ala His Asp Asn Pro Asp Asp Pro Val Asp Tyr Val Leu Asp Pro
65 70 75 80 Ser His Lys Gly Ala Ile Leu Thr Tyr Val Ala Ala Tyr Pro
Ser Gly 85 90 95 Asp Pro Thr Gly Pro Ile Trp Ser Lys Leu Ala Glu
Glu Gly Phe Thr 100 105 110 Gly Gly Gln Trp Ala Thr Ile Lys Met Ile
Asp Asn Gly Gly Lys Val 115 120 125 Asp Val Thr Leu Pro Glu Ala Leu
Ala Pro Gly Lys Tyr Leu Ile Arg 130 135 140 Gln Glu Leu Leu Ala Leu
His Arg Ala Asp Phe Ala Cys Asp Asp Pro 145 150 155 160 Ala His Pro
Asn Arg Gly Ala Glu Ser Tyr Pro Asn Cys Val Gln Val 165 170 175 Glu
Val Ser Gly Ser Gly Asp Lys Lys Pro Asp Gln Asn Phe Asp Phe 180 185
190 Asn Lys Gly Tyr Thr Cys Asp Asn Lys Gly Leu His Phe Lys Ile Tyr
195 200 205 Ile Gly Gln Asp Ser Gln Tyr Val Ala Pro Gly Pro Arg Pro
Trp Asn 210 215 220 Gly Ser 225
SEQ ID NO: 90; LENGTH: 600; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 90
atgttcactt cgctttgcat cacagatcat tggaggactc ttagcagcca ctctgggcca
60gtcatgaact atctcgccca ttgcaccaat gacgactgca agtctttcaa gggcgacagc
120ggcaacgtct gggtcaagat cgagcagctc gcgtacaacc cgtcagccaa
ccccccctgg 180gcgtctgacc tcctccgtga gcacggtgcc aagtggaagg
tgacgatccc gcccagtctt 240gtccccggcg aatatctgct gcggcacgag
atcctggggt tgcacgtcgc aggaaccgtg 300atgggcgccc agttctaccc
cggctgcacc cagatcaggg tcaccgaagg cgggagcacg 360cagctgccct
cgggtattgc gctcccaggc gcttacggcc cacaagacga gggtatcttg
420gtcgacttgt ggagggttaa ccagggccag gtcaactaca cggcgcctgg
aggacccgtt 480tggagcgaag cgtgggacac cgagtttggc gggtccaaca
cgaccgagtg cgccaccatg 540ctcgacgacc tgctcgacta catggcggcc
aacgacgagt ggatcggctg gacggcctag 600
SEQ ID NO: 91; LENGTH: 199; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 91
Met Phe Thr Ser Leu Cys Ile Thr Asp His Trp Arg Thr
Leu Ser Ser 1 5 10 15 His Ser Gly Pro Val Met Asn Tyr Leu Ala His
Cys Thr Asn Asp Asp 20 25 30 Cys Lys Ser Phe Lys Gly Asp Ser Gly
Asn Val Trp Val Lys Ile Glu 35 40 45 Gln Leu Ala Tyr Asn Pro Ser
Ala Asn Pro Pro Trp Ala Ser Asp Leu 50 55 60 Leu Arg Glu His Gly
Ala Lys Trp Lys Val Thr Ile Pro Pro Ser Leu 65 70 75 80 Val Pro Gly
Glu Tyr Leu Leu Arg His Glu Ile Leu Gly Leu His Val 85 90 95 Ala
Gly Thr Val Met Gly Ala Gln Phe Tyr Pro Gly Cys Thr Gln Ile 100 105
110 Arg Val Thr Glu Gly Gly Ser Thr Gln Leu Pro Ser Gly Ile Ala Leu
115 120 125 Pro Gly Ala Tyr Gly Pro Gln Asp Glu Gly Ile Leu Val Asp
Leu Trp 130 135 140 Arg Val Asn Gln Gly Gln Val Asn Tyr Thr Ala Pro
Gly Gly Pro Val 145 150 155 160 Trp Ser Glu Ala Trp Asp Thr Glu Phe
Gly Gly Ser Asn Thr Thr Glu 165 170 175 Cys Ala Thr Met Leu Asp Asp
Leu Leu Asp Tyr Met Ala Ala Asn Asp 180 185 190 Glu Trp Ile Gly Trp
Thr Ala 195
SEQ ID NO: 92; LENGTH: 693; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 92
atgaactatc
tcgcccattg caccaatgac gactgcaagt ctttcaaggg cgacagcggc 60aacgtctggg
tcaagatcga gcagctcgcg tacaacccgt cagccaaccc cccctgggcg
120tctgacctcc tccgtgagca cggtgccaag tggaaggtga cgatcccgcc
cagtcttgtc 180cccggcgaat atctgctgcg gcacgagatc ctggggttgc
acgtcgcagg aaccgtgatg 240ggcgcccagt tctaccccgg ctgcacccag
atcagggtca ccgaaggcgg gagcacgcag 300ctgccctcgg gtattgcgct
cccaggcgct tacggcccac aagacgaggg tatcttggtc 360gacttgtgga
gggttaacca gggccaggtc aactacacgg cgcctggagg acccgtttgg
420agcgaagcgt gggacaccga gtttggcggg tccaacacga ccgagtgcgc
caccatgctc 480gacgacctgc tcgactacat ggcggccaac gacgacccat
gctgcaccga ccagaaccag 540ttcgggagtc tcgagccggg gagcaaggcg
gccggcggct cgccgagcct gtacgatacc 600gtcttggtcc ccgttctcca
gaagaaagtg ccgacaaagc tgcagtggag cggaccggcg 660agcgtcaacg
gggatgagtt gacagagagg ccc 693
SEQ ID NO: 93; LENGTH: 231; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 93
Met Asn Tyr Leu Ala His Cys Thr Asn Asp Asp Cys Lys Ser Phe Lys 1
5 10 15 Gly Asp Ser Gly Asn Val Trp Val Lys Ile Glu Gln Leu Ala Tyr
Asn 20 25 30 Pro Ser Ala Asn Pro Pro Trp Ala Ser Asp Leu Leu Arg
Glu His Gly 35 40 45 Ala Lys Trp Lys Val Thr Ile Pro Pro Ser Leu
Val Pro Gly Glu Tyr 50 55 60 Leu Leu Arg His Glu Ile Leu Gly Leu
His Val Ala Gly Thr Val Met 65 70 75 80 Gly Ala Gln Phe Tyr Pro Gly
Cys Thr Gln Ile Arg Val Thr Glu Gly 85 90 95 Gly Ser Thr Gln Leu
Pro Ser Gly Ile Ala Leu Pro Gly Ala Tyr Gly 100 105 110 Pro Gln Asp
Glu Gly Ile Leu Val Asp Leu Trp Arg Val Asn Gln Gly 115 120 125 Gln
Val Asn Tyr Thr Ala Pro Gly Gly Pro Val Trp Ser Glu Ala Trp 130 135
140 Asp Thr Glu Phe Gly Gly Ser Asn Thr Thr Glu Cys Ala Thr Met Leu
145 150 155 160 Asp Asp Leu Leu Asp Tyr Met Ala Ala Asn Asp Asp Pro
Cys Cys Thr 165 170 175 Asp Gln Asn Gln Phe Gly Ser Leu Glu Pro Gly
Ser Lys Ala Ala Gly 180 185 190 Gly Ser Pro Ser Leu Tyr Asp Thr Val
Leu Val Pro Val Leu Gln Lys 195 200 205 Lys Val Pro Thr Lys Leu Gln
Trp Ser Gly Pro Ala Ser Val Asn Gly 210 215 220 Asp Glu Leu Thr Glu
Arg Pro 225 230
SEQ ID NO: 94; LENGTH: 681; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 94
atgaagctga
gcgctgccat cgccgtgctc gcggccgccc ttgccgaggg gcactatacc 60ttccccagca
tcgccaacac ggccgactgg caatatgtgc gcatcacgac caacttccag
120agcaacggcc ccgtgacgga cgtcaactcg gaccagatcc ggtgctacga
gcgcaacccg 180ggcaccggcg cccccggcat ctacaacgtc acggccggca
caaccatcaa ctacaacgcc 240aagtcgtcca tctcccaccc gggacccatg
gccttctaca ttgccaaggt tcccgccggc 300cagtcggccg ccacctggga
cggtaagggc gccgtctggt ccaagatcca ccaggagatg 360ccgcactttg
gcaccagcct cacctgggac tccaacggcc gcacctccat gcccgtcacc
420atcccccgct gtctgcagga cggcgagtat ctgctgcgtg cagagcacat
tgccctccac 480agcgccggca gccccggcgg cgcccagttc tacatttctt
gtgcccagct ctcagtcacc 540ggcggcagcg ggacctggaa ccccaggaac
aaggtgtcgt tccccggcgc ctacaaggcc 600actgacccgg gcatcctgat
caacatctac taccccgtcc cgactagcta cactcccgct 660ggtccccccg
tcgacacctg c 681
SEQ ID NO: 95; LENGTH: 227; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 95
Met Lys Leu
Ser Ala Ala Ile Ala Val Leu Ala Ala Ala Leu Ala Glu 1 5 10 15 Gly
His Tyr Thr Phe Pro Ser Ile Ala Asn Thr Ala Asp Trp Gln Tyr 20 25
30 Val Arg Ile Thr Thr Asn Phe Gln Ser Asn Gly Pro Val Thr Asp Val
35 40 45 Asn Ser Asp Gln Ile Arg Cys Tyr Glu Arg Asn Pro Gly Thr
Gly Ala 50 55 60 Pro Gly Ile Tyr Asn Val Thr Ala Gly Thr Thr Ile
Asn Tyr Asn Ala 65 70 75 80 Lys Ser Ser Ile Ser His Pro Gly Pro Met Ala Phe Tyr Ile
Ala Lys 85 90 95 Val Pro Ala Gly Gln Ser Ala Ala Thr Trp Asp Gly
Lys Gly Ala Val 100 105 110 Trp Ser Lys Ile His Gln Glu Met Pro His
Phe Gly Thr Ser Leu Thr 115 120 125 Trp Asp Ser Asn Gly Arg Thr Ser
Met Pro Val Thr Ile Pro Arg Cys 130 135 140 Leu Gln Asp Gly Glu Tyr
Leu Leu Arg Ala Glu His Ile Ala Leu His 145 150 155 160 Ser Ala Gly
Ser Pro Gly Gly Ala Gln Phe Tyr Ile Ser Cys Ala Gln 165 170 175 Leu
Ser Val Thr Gly Gly Ser Gly Thr Trp Asn Pro Arg Asn Lys Val 180 185
190 Ser Phe Pro Gly Ala Tyr Lys Ala Thr Asp Pro Gly Ile Leu Ile Asn
195 200 205 Ile Tyr Tyr Pro Val Pro Thr Ser Tyr Thr Pro Ala Gly Pro
Pro Val 210 215 220 Asp Thr Cys 225
SEQ ID NO: 96; LENGTH: 210; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 96
His Tyr Thr Phe Pro Ser Ile Ala Asn Thr Ala Asp Trp
Gln Tyr Val 1 5 10 15 Arg Ile Thr Thr Asn Phe Gln Ser Asn Gly Pro
Val Thr Asp Val Asn 20 25 30 Ser Asp Gln Ile Arg Cys Tyr Glu Arg
Asn Pro Gly Thr Gly Ala Pro 35 40 45 Gly Ile Tyr Asn Val Thr Ala
Gly Thr Thr Ile Asn Tyr Asn Ala Lys 50 55 60 Ser Ser Ile Ser His
Pro Gly Pro Met Ala Phe Tyr Ile Ala Lys Val 65 70 75 80 Pro Ala Gly
Gln Ser Ala Ala Thr Trp Asp Gly Lys Gly Ala Val Trp 85 90 95 Ser
Lys Ile His Gln Glu Met Pro His Phe Gly Thr Ser Leu Thr Trp 100 105
110 Asp Ser Asn Gly Arg Thr Ser Met Pro Val Thr Ile Pro Arg Cys Leu
115 120 125 Gln Asp Gly Glu Tyr Leu Leu Arg Ala Glu His Ile Ala Leu
His Ser 130 135 140 Ala Gly Ser Pro Gly Gly Ala Gln Phe Tyr Ile Ser
Cys Ala Gln Leu 145 150 155 160 Ser Val Thr Gly Gly Ser Gly Thr Trp
Asn Pro Arg Asn Lys Val Ser 165 170 175 Phe Pro Gly Ala Tyr Lys Ala
Thr Asp Pro Gly Ile Leu Ile Asn Ile 180 185 190 Tyr Tyr Pro Val Pro
Thr Ser Tyr Thr Pro Ala Gly Pro Pro Val Asp 195 200 205 Thr Cys 210
SEQ ID NO: 97; LENGTH: 765; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 97
atgtaccgca cgctcggttc
cattgccctg ctcgcggggg gcgctgccgc ccacggcgcc 60gtgaccagct acaacattgc
gggcaaggac taccctggat actcgggctt cgcccctacc 120ggccaggatg
tcatccagtg gcaatggccc gactataacc ccgtgctgtc cgccagcgac
180cccaagctcc gctgcaacgg cggcaccggg gcggcgctgt atgccgaggc
ggcccccggc 240gacaccatca cggccacctg ggcccagtgg acgcactccc
agggcccgat cctggtgtgg 300atgtacaagt gccccggcga cttcagctcc
tgcgacggct ccggcgcggg ttggttcaag 360atcgacgagg ccggcttcca
cggcgacggc acgaccgtct tcctcgacac cgagaccccc 420tcgggctggg
acattgccaa gctggtcggc ggcaacaagt cgtggagcag caagatccct
480gacggcctcg ccccgggcaa ttacctggtc cgccacgagc tcatcgccct
gcaccaggcc 540aacaacccgc aattctaccc cgagtgcgcc cagatcaagg
tcaccggctc tggcaccgcc 600gagcccgccg cctcctacaa ggccgccatc
cccggctact gccagcagag cgaccccaac 660atttcgttca acatcaacga
ccactccctc ccgcaggagt acaagatccc cggtcccccg 720gtcttcaagg
gcaccgcctc cgccaaggct cgcgctttcc aggcc 765
SEQ ID NO: 98; LENGTH: 255; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 98
Met Tyr Arg Thr Leu Gly Ser Ile Ala Leu Leu Ala Gly
Gly Ala Ala 1 5 10 15 Ala His Gly Ala Val Thr Ser Tyr Asn Ile Ala
Gly Lys Asp Tyr Pro 20 25 30 Gly Tyr Ser Gly Phe Ala Pro Thr Gly
Gln Asp Val Ile Gln Trp Gln 35 40 45 Trp Pro Asp Tyr Asn Pro Val
Leu Ser Ala Ser Asp Pro Lys Leu Arg 50 55 60 Cys Asn Gly Gly Thr
Gly Ala Ala Leu Tyr Ala Glu Ala Ala Pro Gly 65 70 75 80 Asp Thr Ile
Thr Ala Thr Trp Ala Gln Trp Thr His Ser Gln Gly Pro 85 90 95 Ile
Leu Val Trp Met Tyr Lys Cys Pro Gly Asp Phe Ser Ser Cys Asp 100 105
110 Gly Ser Gly Ala Gly Trp Phe Lys Ile Asp Glu Ala Gly Phe His Gly
115 120 125 Asp Gly Thr Thr Val Phe Leu Asp Thr Glu Thr Pro Ser Gly
Trp Asp 130 135 140 Ile Ala Lys Leu Val Gly Gly Asn Lys Ser Trp Ser
Ser Lys Ile Pro 145 150 155 160 Asp Gly Leu Ala Pro Gly Asn Tyr Leu
Val Arg His Glu Leu Ile Ala 165 170 175 Leu His Gln Ala Asn Asn Pro
Gln Phe Tyr Pro Glu Cys Ala Gln Ile 180 185 190 Lys Val Thr Gly Ser
Gly Thr Ala Glu Pro Ala Ala Ser Tyr Lys Ala 195 200 205 Ala Ile Pro
Gly Tyr Cys Gln Gln Ser Asp Pro Asn Ile Ser Phe Asn 210 215 220 Ile
Asn Asp His Ser Leu Pro Gln Glu Tyr Lys Ile Pro Gly Pro Pro 225 230
235 240 Val Phe Lys Gly Thr Ala Ser Ala Lys Ala Arg Ala Phe Gln Ala
245 250 255
SEQ ID NO: 99; LENGTH: 236; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 99
Ala Val Thr Ser
Tyr Asn Ile Ala Gly Lys Asp Tyr Pro Gly Tyr Ser 1 5 10 15 Gly Phe
Ala Pro Thr Gly Gln Asp Val Ile Gln Trp Gln Trp Pro Asp 20 25 30
Tyr Asn Pro Val Leu Ser Ala Ser Asp Pro Lys Leu Arg Cys Asn Gly 35
40 45 Gly Thr Gly Ala Ala Leu Tyr Ala Glu Ala Ala Pro Gly Asp Thr
Ile 50 55 60 Thr Ala Thr Trp Ala Gln Trp Thr His Ser Gln Gly Pro
Ile Leu Val 65 70 75 80 Trp Met Tyr Lys Cys Pro Gly Asp Phe Ser Ser
Cys Asp Gly Ser Gly 85 90 95 Ala Gly Trp Phe Lys Ile Asp Glu Ala
Gly Phe His Gly Asp Gly Thr 100 105 110 Thr Val Phe Leu Asp Thr Glu
Thr Pro Ser Gly Trp Asp Ile Ala Lys 115 120 125 Leu Val Gly Gly Asn
Lys Ser Trp Ser Ser Lys Ile Pro Asp Gly Leu 130 135 140 Ala Pro Gly
Asn Tyr Leu Val Arg His Glu Leu Ile Ala Leu His Gln 145 150 155 160
Ala Asn Asn Pro Gln Phe Tyr Pro Glu Cys Ala Gln Ile Lys Val Thr 165
170 175 Gly Ser Gly Thr Ala Glu Pro Ala Ala Ser Tyr Lys Ala Ala Ile
Pro 180 185 190 Gly Tyr Cys Gln Gln Ser Asp Pro Asn Ile Ser Phe Asn
Ile Asn Asp 195 200 205 His Ser Leu Pro Gln Glu Tyr Lys Ile Pro Gly
Pro Pro Val Phe Lys 210 215 220 Gly Thr Ala Ser Ala Lys Ala Arg Ala
Phe Gln Ala 225 230 235
SEQ ID NO: 100; LENGTH: 675; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 100
atgctgacaa caaccttcgc cctcctgacg gccgctctcg gcgtcagcgc
ccattatacc 60ctccccaggg tcgggaccgg ttccgactgg cagcacgtgc ggcgggctga
caactggcaa 120aacaacggct tcgtcggcga cgtcaactcg gagcagatca
ggtgcttcca ggcgacccct 180gccggcgccc aagacgtcta cactgttcag
gcgggatcga ccgtgaccta ccacgccaac 240cccagtatct accaccccgg
ccccatgcag ttctacctgg cccgcgttcc ggacggacag 300gacgtcaagt
cgtggaccgg cgagggtgcc gtgtggttca aggtgtacga ggagcagcct
360caatttggcg cccagctgac ctggcctagc aacggcaaga gctcgttcga
ggttcctatc 420cccagctgca ttcgggcggg caactacctc ctccgcgctg
agcacatcgc cctgcacgtt 480gcccaaagcc agggcggcgc ccagttctac
atctcgtgcg cccagctcca ggtcactggt 540ggcggcagca ccgagccttc
tcagaaggtt tccttcccgg gtgcctacaa gtccaccgac 600cccggcattc
ttatcaacat caactacccc gtccctacct cgtaccagaa tccgggtccg
660gctgtcttcc gttgc 675
SEQ ID NO: 101; LENGTH: 225; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 101
Met
Leu Thr Thr Thr Phe Ala Leu Leu Thr Ala Ala Leu Gly Val Ser 1 5 10
15 Ala His Tyr Thr Leu Pro Arg Val Gly Thr Gly Ser Asp Trp Gln His
20 25 30 Val Arg Arg Ala Asp Asn Trp Gln Asn Asn Gly Phe Val Gly
Asp Val 35 40 45 Asn Ser Glu Gln Ile Arg Cys Phe Gln Ala Thr Pro
Ala Gly Ala Gln 50 55 60 Asp Val Tyr Thr Val Gln Ala Gly Ser Thr
Val Thr Tyr His Ala Asn 65 70 75 80 Pro Ser Ile Tyr His Pro Gly Pro
Met Gln Phe Tyr Leu Ala Arg Val 85 90 95 Pro Asp Gly Gln Asp Val
Lys Ser Trp Thr Gly Glu Gly Ala Val Trp 100 105 110 Phe Lys Val Tyr
Glu Glu Gln Pro Gln Phe Gly Ala Gln Leu Thr Trp 115 120 125 Pro Ser
Asn Gly Lys Ser Ser Phe Glu Val Pro Ile Pro Ser Cys Ile 130 135 140
Arg Ala Gly Asn Tyr Leu Leu Arg Ala Glu His Ile Ala Leu His Val 145
150 155 160 Ala Gln Ser Gln Gly Gly Ala Gln Phe Tyr Ile Ser Cys Ala
Gln Leu 165 170 175 Gln Val Thr Gly Gly Gly Ser Thr Glu Pro Ser Gln
Lys Val Ser Phe 180 185 190 Pro Gly Ala Tyr Lys Ser Thr Asp Pro Gly
Ile Leu Ile Asn Ile Asn 195 200 205 Tyr Pro Val Pro Thr Ser Tyr Gln
Asn Pro Gly Pro Ala Val Phe Arg 210 215 220 Cys 225
SEQ ID NO: 102; LENGTH: 208; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 102
His Tyr Thr Leu Pro Arg Val
Gly Thr Gly Ser Asp Trp Gln His Val 1 5 10 15 Arg Arg Ala Asp Asn
Trp Gln Asn Asn Gly Phe Val Gly Asp Val Asn 20 25 30 Ser Glu Gln
Ile Arg Cys Phe Gln Ala Thr Pro Ala Gly Ala Gln Asp 35 40 45 Val
Tyr Thr Val Gln Ala Gly Ser Thr Val Thr Tyr His Ala Asn Pro 50 55
60 Ser Ile Tyr His Pro Gly Pro Met Gln Phe Tyr Leu Ala Arg Val Pro
65 70 75 80 Asp Gly Gln Asp Val Lys Ser Trp Thr Gly Glu Gly Ala Val
Trp Phe 85 90 95 Lys Val Tyr Glu Glu Gln Pro Gln Phe Gly Ala Gln
Leu Thr Trp Pro 100 105 110 Ser Asn Gly Lys Ser Ser Phe Glu Val Pro
Ile Pro Ser Cys Ile Arg 115 120 125 Ala Gly Asn Tyr Leu Leu Arg Ala
Glu His Ile Ala Leu His Val Ala 130 135 140 Gln Ser Gln Gly Gly Ala
Gln Phe Tyr Ile Ser Cys Ala Gln Leu Gln 145 150 155 160 Val Thr Gly
Gly Gly Ser Thr Glu Pro Ser Gln Lys Val Ser Phe Pro 165 170 175 Gly
Ala Tyr Lys Ser Thr Asp Pro Gly Ile Leu Ile Asn Ile Asn Tyr 180 185
190 Pro Val Pro Thr Ser Tyr Gln Asn Pro Gly Pro Ala Val Phe Arg Cys
195 200 205
SEQ ID NO: 103; LENGTH: 711; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 103
atgaaggttc
tcgcgcccct gattctggcc ggtgccgcca gcgcccacac catcttctca 60tccctcgagg
tgggcggcgt caaccagggc atcgggcagg gtgtccgcgt gccgtcgtac
120aacggtccga tcgaggacgt gacgtccaac tcgatcgcct gcaacgggcc
ccccaacccg 180acgacgccga ccaacaaggt catcacggtc cgggccggcg
agacggtgac ggccgtctgg 240cggtacatgc tgagcaccac cggctcggcc
cccaacgaca tcatggacag cagccacaag 300ggcccgacca tggcctacct
caagaaggtc gacaacgcca ccaccgactc gggcgtcggc 360ggcggctggt
tcaagatcca ggaggacggc cttaccaacg gcgtctgggg caccgagcgc
420gtcatcaacg gccagggccg ccacaacatc aagatccccg agtgcatcgc
ccccggccag 480tacctcctcc gcgccgagat gcttgccctg cacggagctt
ccaactaccc cggcgctcag 540ttctacatgg agtgcgccca gctcaatatc
gtcggcggca ccggcagcaa gacgccgtcc 600accgtcagct tcccgggcgc
ttacaagggt accgaccccg gagtcaagat caacatctac 660tggccccccg
tcaccagcta ccagattccc ggccccggcg tgttcacctg c
711
SEQ ID NO: 104; LENGTH: 237; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 104
Met Lys Val Leu Ala Pro
Leu Ile Leu Ala Gly Ala Ala Ser Ala His 1 5 10 15 Thr Ile Phe Ser
Ser Leu Glu Val Gly Gly Val Asn Gln Gly Ile Gly 20 25 30 Gln Gly
Val Arg Val Pro Ser Tyr Asn Gly Pro Ile Glu Asp Val Thr 35 40 45
Ser Asn Ser Ile Ala Cys Asn Gly Pro Pro Asn Pro Thr Thr Pro Thr 50
55 60 Asn Lys Val Ile Thr Val Arg Ala Gly Glu Thr Val Thr Ala Val
Trp 65 70 75 80 Arg Tyr Met Leu Ser Thr Thr Gly Ser Ala Pro Asn Asp
Ile Met Asp 85 90 95 Ser Ser His Lys Gly Pro Thr Met Ala Tyr Leu
Lys Lys Val Asp Asn 100 105 110 Ala Thr Thr Asp Ser Gly Val Gly Gly
Gly Trp Phe Lys Ile Gln Glu 115 120 125 Asp Gly Leu Thr Asn Gly Val
Trp Gly Thr Glu Arg Val Ile Asn Gly 130 135 140 Gln Gly Arg His Asn
Ile Lys Ile Pro Glu Cys Ile Ala Pro Gly Gln 145 150 155 160 Tyr Leu
Leu Arg Ala Glu Met Leu Ala Leu His Gly Ala Ser Asn Tyr 165 170 175
Pro Gly Ala Gln Phe Tyr Met Glu Cys Ala Gln Leu Asn Ile Val Gly 180
185 190 Gly Thr Gly Ser Lys Thr Pro Ser Thr Val Ser Phe Pro Gly Ala
Tyr 195 200 205 Lys Gly Thr Asp Pro Gly Val Lys Ile Asn Ile Tyr Trp
Pro Pro Val 210 215 220 Thr Ser Tyr Gln Ile Pro Gly Pro Gly Val Phe
Thr Cys 225 230 235
SEQ ID NO: 105; LENGTH: 222; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 105
His Thr
Ile Phe Ser Ser Leu Glu Val Gly Gly Val Asn Gln Gly Ile 1 5 10 15
Gly Gln Gly Val Arg Val Pro Ser Tyr Asn Gly Pro Ile Glu Asp Val 20
25 30 Thr Ser Asn Ser Ile Ala Cys Asn Gly Pro Pro Asn Pro Thr Thr
Pro 35 40 45 Thr Asn Lys Val Ile Thr Val Arg Ala Gly Glu Thr Val
Thr Ala Val 50 55 60 Trp Arg Tyr Met Leu Ser Thr Thr Gly Ser Ala
Pro Asn Asp Ile Met 65 70 75 80 Asp Ser Ser His Lys Gly Pro Thr Met
Ala Tyr Leu Lys Lys Val Asp 85 90 95 Asn Ala Thr Thr Asp Ser Gly
Val Gly Gly Gly Trp Phe Lys Ile Gln 100 105 110 Glu Asp Gly Leu Thr
Asn Gly Val Trp Gly Thr Glu Arg Val Ile Asn 115 120 125 Gly Gln Gly
Arg His Asn Ile Lys Ile Pro Glu Cys Ile Ala Pro Gly 130 135 140 Gln
Tyr Leu Leu Arg Ala Glu Met Leu Ala Leu His Gly Ala Ser Asn 145 150
155 160 Tyr Pro Gly Ala Gln Phe Tyr Met Glu Cys Ala Gln Leu Asn Ile
Val 165 170 175 Gly Gly Thr Gly Ser Lys Thr Pro Ser Thr Val Ser Phe
Pro Gly Ala 180 185 190 Tyr Lys Gly Thr Asp Pro Gly Val Lys Ile Asn
Ile Tyr Trp Pro Pro 195 200 205 Val Thr Ser Tyr Gln Ile Pro Gly Pro
Gly Val Phe Thr Cys 210 215 220
SEQ ID NO: 106; LENGTH: 225; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 106
atgatcgaca acctccctga tgactcccta caacccgcct gcctccgccc
gggccactac 60ctcgtccgcc acgagatcat cgcgctgcac tcggcctggg ccgagggcga
ggcccagttc 120taccccttcc ccctttttcc tttttttccc tcccttcttt
tgtccggtaa ctacacgatt 180cccggtcccg cgatctggaa gtgcccagag
gcacagcaga acgag 225
SEQ ID NO: 107; LENGTH: 75; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 107
Met Ile
Asp Asn Leu Pro Asp Asp Ser Leu Gln Pro Ala Cys Leu Arg 1 5 10 15
Pro Gly His Tyr Leu Val Arg His Glu Ile Ile Ala Leu His Ser Ala 20
25 30 Trp Ala Glu Gly Glu Ala Gln Phe Tyr Pro Phe Pro Leu Phe Pro
Phe 35 40 45 Phe Pro Ser Leu Leu Leu Ser Gly Asn Tyr Thr Ile Pro
Gly Pro Ala 50 55 60 Ile Trp Lys Cys Pro Glu Ala Gln Gln Asn Glu 65
70 75
SEQ ID NO: 108; LENGTH: 57; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 108
His Tyr Leu Val Arg His
Glu Ile Ile Ala Leu His Ser Ala Trp Ala 1 5 10 15 Glu Gly Glu Ala
Gln Phe Tyr Pro Phe Pro Leu Phe Pro Phe Phe Pro 20 25 30 Ser Leu
Leu Leu Ser Gly Asn Tyr Thr Ile Pro Gly Pro Ala Ile Trp 35 40 45 Lys Cys Pro Glu Ala Gln Gln
Asn Glu 50 55
SEQ ID NO: 109; LENGTH: 1395; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 109
atggggcaga
agactctcca ggggctggtg gcggcggcgg cactggcagc ctcggtggcg 60aacgcgcagc
aaccgggcac cttcacgccc gaggtgcatc cgacgctgcc gacgtggaag
120tgcacgacga gcggcgggtg cgtccagcag gacacgtcgg tggtgctcga
ctggaactac 180cgctggttcc acaccgagga cggtagcaag tcgtgcatca
cctctagcgg cgtcgaccgg 240accctgtgcc cggacgaggc gacgtgcgcc
aagaactgct tcgtcgaggg cgtcaactac 300acgagcagcg gggtcgagac
gtccggcagc tccctcaccc tccgccagtt cttcaagggc 360tccgacggcg
ccatcaacag cgtctccccg cgcgtctacc tgctcggggg agacggcaac
420tatgtcgtgc tcaagctcct cggccaggag ctgagcttcg acgtggacgt
atcgtcgctc 480ccgtgcggcg agaacgcggc cctgtacctg tccgagatgg
acgcgacggg aggacggaac 540gagtacaaca cgggcggggc cgagtacggg
tcgggctact gtgacgccca gtgccccgtg 600cagaactgga acaacgggac
gctcaacacg ggccgggtgg gctcgtgctg caacgagatg 660gacatcctcg
aggccaactc caaggccgag gccttcacgc cgcacccctg catcggcaac
720tcgtgcgaca agagcgggtg cggcttcaac gcgtacgcgc gcggttacca
caactactgg 780gcccccggcg gcacgctcga cacgtcccgg cctttcacca
tgatcacccg cttcgtcacc 840gacgacggca ccacctcggg caagctcgcc
cgcatcgagc gcgtctacgt ccaggacggc 900aagaaggtgc ccagcgcggc
gcccgggggg gacgtcatca cggccgacgg gtgcacctcc 960gcgcagccct
acggcggcct ttccggcatg ggcgacgccc tcggccgcgg catggtcctg
1020gccctgagca tctggaacga cgcgtccggg tacatgaact ggctcgacgc
cggcagcaac 1080ggcccctgca gcgacaccga gggtaacccg tccaacatcc
tggccaacca cccggacgcc 1140cacgtcgtgc tctccaacat ccgctggggc
gacatcggct ccaccgtcga caccggcgat 1200ggcgacaaca acggcggcgg
ccccaacccg tcatccacca ccaccgctac cgctaccacc 1260acctcctccg
gcccggccga gcctacccag acccactacg gccagtgtgg agggaaagga
1320tggacgggcc ctacccgctg cgagacgccc tacacctgca agtaccagaa
cgactggtac 1380tcgcagtgcc tgtag 1395110464PRTMyceliophthora
thermophila 110Met Gly Gln Lys Thr Leu Gln Gly Leu Val Ala Ala Ala
Ala Leu Ala 1 5 10 15 Ala Ser Val Ala Asn Ala Gln Gln Pro Gly Thr
Phe Thr Pro Glu Val 20 25 30 His Pro Thr Leu Pro Thr Trp Lys Cys
Thr Thr Ser Gly Gly Cys Val 35 40 45 Gln Gln Asp Thr Ser Val Val
Leu Asp Trp Asn Tyr Arg Trp Phe His 50 55 60 Thr Glu Asp Gly Ser
Lys Ser Cys Ile Thr Ser Ser Gly Val Asp Arg 65 70 75 80 Thr Leu Cys
Pro Asp Glu Ala Thr Cys Ala Lys Asn Cys Phe Val Glu 85 90 95 Gly
Val Asn Tyr Thr Ser Ser Gly Val Glu Thr Ser Gly Ser Ser Leu 100 105
110 Thr Leu Arg Gln Phe Phe Lys Gly Ser Asp Gly Ala Ile Asn Ser Val
115 120 125 Ser Pro Arg Val Tyr Leu Leu Gly Gly Asp Gly Asn Tyr Val
Val Leu 130 135 140 Lys Leu Leu Gly Gln Glu Leu Ser Phe Asp Val Asp
Val Ser Ser Leu 145 150 155 160 Pro Cys Gly Glu Asn Ala Ala Leu Tyr
Leu Ser Glu Met Asp Ala Thr 165 170 175 Gly Gly Arg Asn Glu Tyr Asn
Thr Gly Gly Ala Glu Tyr Gly Ser Gly 180 185 190 Tyr Cys Asp Ala Gln
Cys Pro Val Gln Asn Trp Asn Asn Gly Thr Leu 195 200 205 Asn Thr Gly
Arg Val Gly Ser Cys Cys Asn Glu Met Asp Ile Leu Glu 210 215 220 Ala
Asn Ser Lys Ala Glu Ala Phe Thr Pro His Pro Cys Ile Gly Asn 225 230
235 240 Ser Cys Asp Lys Ser Gly Cys Gly Phe Asn Ala Tyr Ala Arg Gly
Tyr 245 250 255 His Asn Tyr Trp Ala Pro Gly Gly Thr Leu Asp Thr Ser
Arg Pro Phe 260 265 270 Thr Met Ile Thr Arg Phe Val Thr Asp Asp Gly
Thr Thr Ser Gly Lys 275 280 285 Leu Ala Arg Ile Glu Arg Val Tyr Val
Gln Asp Gly Lys Lys Val Pro 290 295 300 Ser Ala Ala Pro Gly Gly Asp
Val Ile Thr Ala Asp Gly Cys Thr Ser 305 310 315 320 Ala Gln Pro Tyr
Gly Gly Leu Ser Gly Met Gly Asp Ala Leu Gly Arg 325 330 335 Gly Met
Val Leu Ala Leu Ser Ile Trp Asn Asp Ala Ser Gly Tyr Met 340 345 350
Asn Trp Leu Asp Ala Gly Ser Asn Gly Pro Cys Ser Asp Thr Glu Gly 355
360 365 Asn Pro Ser Asn Ile Leu Ala Asn His Pro Asp Ala His Val Val
Leu 370 375 380 Ser Asn Ile Arg Trp Gly Asp Ile Gly Ser Thr Val Asp
Thr Gly Asp 385 390 395 400 Gly Asp Asn Asn Gly Gly Gly Pro Asn Pro
Ser Ser Thr Thr Thr Ala 405 410 415 Thr Ala Thr Thr Thr Ser Ser Gly
Pro Ala Glu Pro Thr Gln Thr His 420 425 430 Tyr Gly Gln Cys Gly Gly
Lys Gly Trp Thr Gly Pro Thr Arg Cys Glu 435 440 445 Thr Pro Tyr Thr
Cys Lys Tyr Gln Asn Asp Trp Tyr Ser Gln Cys Leu 450 455 460
SEQ ID NO: 111; LENGTH: 442; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 111
Gln Gln Pro Gly Thr Phe Thr
Pro Glu Val His Pro Thr Leu Pro Thr 1 5 10 15 Trp Lys Cys Thr Thr
Ser Gly Gly Cys Val Gln Gln Asp Thr Ser Val 20 25 30 Val Leu Asp
Trp Asn Tyr Arg Trp Phe His Thr Glu Asp Gly Ser Lys 35 40 45 Ser
Cys Ile Thr Ser Ser Gly Val Asp Arg Thr Leu Cys Pro Asp Glu 50 55
60 Ala Thr Cys Ala Lys Asn Cys Phe Val Glu Gly Val Asn Tyr Thr Ser
65 70 75 80 Ser Gly Val Glu Thr Ser Gly Ser Ser Leu Thr Leu Arg Gln
Phe Phe 85 90 95 Lys Gly Ser Asp Gly Ala Ile Asn Ser Val Ser Pro
Arg Val Tyr Leu 100 105 110 Leu Gly Gly Asp Gly Asn Tyr Val Val Leu
Lys Leu Leu Gly Gln Glu 115 120 125 Leu Ser Phe Asp Val Asp Val Ser
Ser Leu Pro Cys Gly Glu Asn Ala 130 135 140 Ala Leu Tyr Leu Ser Glu
Met Asp Ala Thr Gly Gly Arg Asn Glu Tyr 145 150 155 160 Asn Thr Gly
Gly Ala Glu Tyr Gly Ser Gly Tyr Cys Asp Ala Gln Cys 165 170 175 Pro
Val Gln Asn Trp Asn Asn Gly Thr Leu Asn Thr Gly Arg Val Gly 180 185
190 Ser Cys Cys Asn Glu Met Asp Ile Leu Glu Ala Asn Ser Lys Ala Glu
195 200 205 Ala Phe Thr Pro His Pro Cys Ile Gly Asn Ser Cys Asp Lys
Ser Gly 210 215 220 Cys Gly Phe Asn Ala Tyr Ala Arg Gly Tyr His Asn
Tyr Trp Ala Pro 225 230 235 240 Gly Gly Thr Leu Asp Thr Ser Arg Pro
Phe Thr Met Ile Thr Arg Phe 245 250 255 Val Thr Asp Asp Gly Thr Thr
Ser Gly Lys Leu Ala Arg Ile Glu Arg 260 265 270 Val Tyr Val Gln Asp
Gly Lys Lys Val Pro Ser Ala Ala Pro Gly Gly 275 280 285 Asp Val Ile
Thr Ala Asp Gly Cys Thr Ser Ala Gln Pro Tyr Gly Gly 290 295 300 Leu
Ser Gly Met Gly Asp Ala Leu Gly Arg Gly Met Val Leu Ala Leu 305 310
315 320 Ser Ile Trp Asn Asp Ala Ser Gly Tyr Met Asn Trp Leu Asp Ala
Gly 325 330 335 Ser Asn Gly Pro Cys Ser Asp Thr Glu Gly Asn Pro Ser
Asn Ile Leu 340 345 350 Ala Asn His Pro Asp Ala His Val Val Leu Ser
Asn Ile Arg Trp Gly 355 360 365 Asp Ile Gly Ser Thr Val Asp Thr Gly
Asp Gly Asp Asn Asn Gly Gly 370 375 380 Gly Pro Asn Pro Ser Ser Thr
Thr Thr Ala Thr Ala Thr Thr Thr Ser 385 390 395 400 Ser Gly Pro Ala
Glu Pro Thr Gln Thr His Tyr Gly Gln Cys Gly Gly 405 410 415 Lys Gly
Trp Thr Gly Pro Thr Arg Cys Glu Thr Pro Tyr Thr Cys Lys 420 425 430
Tyr Gln Asn Asp Trp Tyr Ser Gln Cys Leu 435 440
SEQ ID NO: 112; LENGTH: 1170; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 112
atgaagtcct ccatcctcgc
cagcgtcttc gccacgggcg ccgtggctca aagtggtccg 60tggcagcaat gtggtggcat
cggatggcaa ggatcgaccg actgtgtgtc gggttaccac 120tgcgtctacc
agaacgattg gtacagccag tgcgtgcctg gcgcggcgtc gacaacgctc
180cagacatcta ccacgtccag gcccaccgcc accagcaccg cccctccgtc
gtccaccacc 240tcgcctagca agggcaagct caagtggctc ggcagcaacg
agtcgggcgc cgagttcggg 300gagggcaact accccggcct ctggggcaag
cacttcatct tcccgtcgac ttcggcgatt 360cagacgctca tcaatgatgg
atacaacatc ttccggatcg acttctcgat ggagcgtctg 420gtgcccaacc
agttgacgtc gtccttcgac gagggctacc tccgcaacct gaccgaggtg
480gtcaacttcg tgacgaacgc gggcaagtac gccgtcctgg acccgcacaa
ctacggccgg 540tactacggca acgtcatcac ggacacgaac gcgttccgga
ccttctggac caacctggcc 600aagcagttcg cctccaactc gctcgtcatc
ttcgacacca acaacgagta caacacgatg 660gaccagaccc tggtgctcaa
cctcaaccag gccgccatcg acggcatccg ggccgccggc 720gcgacctcgc
agtacatctt cgtcgagggc aacgcgtgga gcggggcctg gagctggaac
780acgaccaaca ccaacatggc cgccctgacg gacccgcaga acaagatcgt
gtacgagatg 840caccagtacc tcgactcgga cagctcgggc acccacgccg
agtgcgtcag cagcaacatc 900ggcgcccagc gcgtcgtcgg agccacccag
tggctccgcg ccaacggcaa gctcggcgtc 960ctcggcgagt tcgccggcgg
cgccaacgcc gtctgccagc aggccgtcac cggcctcctc 1020gaccacctcc
aggacaacag cgacgtctgg ctgggtgccc tctggtgggc cgccggtccc
1080tggtggggcg actacatgta ctcgttcgag cctccttcgg gcaccggcta
tgtcaactac 1140aactcgatcc taaagaagta cttgccgtaa
1170
SEQ ID NO: 113; LENGTH: 389; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 113
Met Lys Ser Ser Ile Leu
Ala Ser Val Phe Ala Thr Gly Ala Val Ala 1 5 10 15 Gln Ser Gly Pro
Trp Gln Gln Cys Gly Gly Ile Gly Trp Gln Gly Ser 20 25 30 Thr Asp
Cys Val Ser Gly Tyr His Cys Val Tyr Gln Asn Asp Trp Tyr 35 40 45
Ser Gln Cys Val Pro Gly Ala Ala Ser Thr Thr Leu Gln Thr Ser Thr 50
55 60 Thr Ser Arg Pro Thr Ala Thr Ser Thr Ala Pro Pro Ser Ser Thr
Thr 65 70 75 80 Ser Pro Ser Lys Gly Lys Leu Lys Trp Leu Gly Ser Asn
Glu Ser Gly 85 90 95 Ala Glu Phe Gly Glu Gly Asn Tyr Pro Gly Leu
Trp Gly Lys His Phe 100 105 110 Ile Phe Pro Ser Thr Ser Ala Ile Gln
Thr Leu Ile Asn Asp Gly Tyr 115 120 125 Asn Ile Phe Arg Ile Asp Phe
Ser Met Glu Arg Leu Val Pro Asn Gln 130 135 140 Leu Thr Ser Ser Phe
Asp Glu Gly Tyr Leu Arg Asn Leu Thr Glu Val 145 150 155 160 Val Asn
Phe Val Thr Asn Ala Gly Lys Tyr Ala Val Leu Asp Pro His 165 170 175
Asn Tyr Gly Arg Tyr Tyr Gly Asn Val Ile Thr Asp Thr Asn Ala Phe 180
185 190 Arg Thr Phe Trp Thr Asn Leu Ala Lys Gln Phe Ala Ser Asn Ser
Leu 195 200 205 Val Ile Phe Asp Thr Asn Asn Glu Tyr Asn Thr Met Asp
Gln Thr Leu 210 215 220 Val Leu Asn Leu Asn Gln Ala Ala Ile Asp Gly
Ile Arg Ala Ala Gly 225 230 235 240 Ala Thr Ser Gln Tyr Ile Phe Val
Glu Gly Asn Ala Trp Ser Gly Ala 245 250 255 Trp Ser Trp Asn Thr Thr
Asn Thr Asn Met Ala Ala Leu Thr Asp Pro 260 265 270 Gln Asn Lys Ile
Val Tyr Glu Met His Gln Tyr Leu Asp Ser Asp Ser 275 280 285 Ser Gly
Thr His Ala Glu Cys Val Ser Ser Asn Ile Gly Ala Gln Arg 290 295 300
Val Val Gly Ala Thr Gln Trp Leu Arg Ala Asn Gly Lys Leu Gly Val 305
310 315 320 Leu Gly Glu Phe Ala Gly Gly Ala Asn Ala Val Cys Gln Gln
Ala Val 325 330 335 Thr Gly Leu Leu Asp His Leu Gln Asp Asn Ser Glu
Val Trp Leu Gly 340 345 350 Ala Leu Trp Trp Ala Ala Gly Pro Trp Trp
Gly Asp Tyr Met Tyr Ser 355 360 365 Phe Glu Pro Pro Ser Gly Thr Gly
Tyr Val Asn Tyr Asn Ser Ile Leu 370 375 380 Lys Lys Tyr Leu Pro 385
SEQ ID NO: 114; LENGTH: 373; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 114
Gln Ser Gly Pro Trp Gln Gln
Cys Gly Gly Ile Gly Trp Gln Gly Ser 1 5 10 15 Thr Asp Cys Val Ser
Gly Tyr His Cys Val Tyr Gln Asn Asp Trp Tyr 20 25 30 Ser Gln Cys
Val Pro Gly Ala Ala Ser Thr Thr Leu Gln Thr Ser Thr 35 40 45 Thr
Ser Arg Pro Thr Ala Thr Ser Thr Ala Pro Pro Ser Ser Thr Thr 50 55
60 Ser Pro Ser Lys Gly Lys Leu Lys Trp Leu Gly Ser Asn Glu Ser Gly
65 70 75 80 Ala Glu Phe Gly Glu Gly Asn Tyr Pro Gly Leu Trp Gly Lys
His Phe 85 90 95 Ile Phe Pro Ser Thr Ser Ala Ile Gln Thr Leu Ile
Asn Asp Gly Tyr 100 105 110 Asn Ile Phe Arg Ile Asp Phe Ser Met Glu
Arg Leu Val Pro Asn Gln 115 120 125 Leu Thr Ser Ser Phe Asp Glu Gly
Tyr Leu Arg Asn Leu Thr Glu Val 130 135 140 Val Asn Phe Val Thr Asn
Ala Gly Lys Tyr Ala Val Leu Asp Pro His 145 150 155 160 Asn Tyr Gly
Arg Tyr Tyr Gly Asn Val Ile Thr Asp Thr Asn Ala Phe 165 170 175 Arg
Thr Phe Trp Thr Asn Leu Ala Lys Gln Phe Ala Ser Asn Ser Leu 180 185
190 Val Ile Phe Asp Thr Asn Asn Glu Tyr Asn Thr Met Asp Gln Thr Leu
195 200 205 Val Leu Asn Leu Asn Gln Ala Ala Ile Asp Gly Ile Arg Ala
Ala Gly 210 215 220 Ala Thr Ser Gln Tyr Ile Phe Val Glu Gly Asn Ala
Trp Ser Gly Ala 225 230 235 240 Trp Ser Trp Asn Thr Thr Asn Thr Asn
Met Ala Ala Leu Thr Asp Pro 245 250 255 Gln Asn Lys Ile Val Tyr Glu
Met His Gln Tyr Leu Asp Ser Asp Ser 260 265 270 Ser Gly Thr His Ala
Glu Cys Val Ser Ser Asn Ile Gly Ala Gln Arg 275 280 285 Val Val Gly
Ala Thr Gln Trp Leu Arg Ala Asn Gly Lys Leu Gly Val 290 295 300 Leu
Gly Glu Phe Ala Gly Gly Ala Asn Ala Val Cys Gln Gln Ala Val 305 310
315 320 Thr Gly Leu Leu Asp His Leu Gln Asp Asn Ser Glu Val Trp Leu
Gly 325 330 335 Ala Leu Trp Trp Ala Ala Gly Pro Trp Trp Gly Asp Tyr
Met Tyr Ser 340 345 350 Phe Glu Pro Pro Ser Gly Thr Gly Tyr Val Asn
Tyr Asn Ser Ile Leu 355 360 365 Lys Lys Tyr Leu Pro 370
SEQ ID NO: 115; LENGTH: 2613; TYPE: DNA; ORGANISM: Myceliophthora thermophila; SEQUENCE: 115
atgaaggctg ctgcgctttc
ctgcctcttc ggcagtaccc ttgccgttgc aggcgccatt 60gaatcgagaa aggttcacca
gaagcccctc gcgagatctg aaccttttta cccgtcgcca 120tggatgaatc
ccaacgccga cggctgggcg gaggcctatg cccaggccaa gtcctttgtc
180tcccaaatga ctctgctaga gaaggtcaac ttgaccacgg gagtcggctg
gggggctgag 240cagtgcgtcg gccaagtggg cgcgatccct cgccttggac
ttcgcagtct gtgcatgcat 300gactcccctc tcggcatccg aggagccgac
tacaactcag cgttcccctc tggccagacc 360gttgctgcta cctgggatcg
cggtctgatg taccgtcgcg gctacgcaat gggccaggag 420gccaaaggca
agggcatcaa tgtccttctc ggaccagtcg ccggccccct tggccgcatg
480cccgagggcg gtcgtaactg ggaaggcttc gctccggatc ccgtccttac
cggcatcggc 540atgtccgaga cgatcaaggg cattcaggat gctggcgtca
tcgcttgtgc gaagcacttt 600attggaaacg agcaggagca cttcagacag
gtgccagaag cccagggata cggttacaac 660atcagcgaaa ccctctcctc
caacattgac gacaagacca tgcacgagct ctacctttgg 720ccgtttgccg
atgccgtccg ggccggcgtc ggctctgtca tgtgctcgta ccagcaggtc
780aacaactcgt acgcctgcca gaactcgaag ctgctgaacg acctcctcaa
gaacgagctt 840gggtttcagg gcttcgtcat gagcgactgg caggcacagc
acactggcgc agcaagcgcc 900gtggctggtc tcgatatgtc catgccgggc
gacacccagt tcaacactgg cgtcagtttc 960tggggcgcca atctcaccct
cgccgtcctc aacggcacag tccctgccta ccgtctcgac 1020gacatggcca
tgcgcatcat ggccgccctc ttcaaggtca ccaagaccac cgacctggaa
1080ccgatcaact tctccttctg gaccgacgac acttatggcc cgatccactg ggccgccaag
1140cagggctacc aggagattaa ttcccacgtt gacgtccgcg ccgaccacgg
caacctcatc 1200cgggagattg ccgccaaggg tacggtgctg ctgaagaata
ccggctctct acccctgaac 1260aagccaaagt tcgtggccgt catcggcgag
gatgctgggt cgagccccaa cgggcccaac 1320ggctgcagcg accgcggctg
taacgaaggc acgctcgcca tgggctgggg atccggcaca 1380gccaactatc
cgtacctcgt ttcccccgac gccgcgctcc aggcccgggc catccaggac
1440ggcacgaggt acgagagcgt cctgtccaac tacgccgagg aaaagacaaa
ggctctggtc 1500tcgcaggcca atgcaaccgc catcgtcttc gtcaatgccg
actcaggcga gggctacatc 1560aacgtggacg gtaacgaggg cgaccgtaag
aacctgactc tctggaacaa cggtgatact 1620ctggtcaaga acgtctcgag
ctggtgcagc aacaccatcg tcgtcatcca ctcggtcggc 1680ccggtcctcc
tgaccgattg gtacgacaac cccaacatca cggccattct ctgggctggt
1740cttccgggcc aggagtcggg caactccatc accgacgtgc tttacggcaa
ggtcaacccc 1800gccgcccgct cgcccttcac ttggggcaag acccgcgaaa
gctatggcgc ggacgtcctg 1860tacaagccga ataatggcaa tggtgcgccc
caacaggact tcaccgaggg cgtcttcatc 1920gactaccgct acttcgacaa
ggttgacgat gactcggtca tctacgagtt cggccacggc 1980ctgagctaca
ccaccttcga gtacagcaac atccgcgtcg tcaagtccaa cgtcagcgag
2040taccggccca cgacgggcac cacggcccag gccccgacgt ttggcaactt
ctccaccgac 2100ctcgaggact atctcttccc caaggacgag ttcccctaca
tctaccagta catctacccg 2160tacctcaaca cgaccgaccc ccggagggcc
tcggccgatc cccactacgg ccagaccgcc 2220gaggagttcc tcccgcccca
cgccaccgat gacgaccccc agccgctcct ccggtcctcg 2280ggcggaaact
cccccggcgg caaccgccag ctgtacgaca ttgtctacac aatcacggcc
2340gacatcacga atacgggctc cgttgtaggc gaggaggtac cgcagctcta
cgtctcgctg 2400ggcggtcccg aggatcccaa ggtgcagctg cgcgactttg
acaggatgcg gatcgaaccc 2460ggcgagacga ggcagttcac cggccgcctg
acgcgcagag atctgagcaa ctgggacgtc 2520acggtgcagg actgggtcat
cagcaggtat cccaagacgg catatgttgg gaggagcagc 2580cggaagttgg
atctcaagat tgagcttcct tga 2613
SEQ ID NO: 116; LENGTH: 870; TYPE: PRT; ORGANISM: Myceliophthora thermophila; SEQUENCE: 116
Met Lys Ala Ala Ala Leu Ser Cys Leu Phe Gly Ser Thr Leu Ala Val
1 5 10 15 Ala Gly Ala Ile Glu Ser Arg Lys Val His Gln Lys Pro Leu
Ala Arg 20 25 30 Ser Glu Pro Phe Tyr Pro Ser Pro Trp Met Asn Pro
Asn Ala Asp Gly 35 40 45 Trp Ala Glu Ala Tyr Ala Gln Ala Lys Ser
Phe Val Ser Gln Met Thr 50 55 60 Leu Leu Glu Lys Val Asn Leu Thr
Thr Gly Val Gly Trp Gly Ala Glu 65 70 75 80 Gln Cys Val Gly Gln Val
Gly Ala Ile Pro Arg Leu Gly Leu Arg Ser 85 90 95 Leu Cys Met His
Asp Ser Pro Leu Gly Ile Arg Gly Ala Asp Tyr Asn 100 105 110 Ser Ala
Phe Pro Ser Gly Gln Thr Val Ala Ala Thr Trp Asp Arg Gly 115 120 125
Leu Met Tyr Arg Arg Gly Tyr Ala Met Gly Gln Glu Ala Lys Gly Lys 130
135 140 Gly Ile Asn Val Leu Leu Gly Pro Val Ala Gly Pro Leu Gly Arg
Met 145 150 155 160 Pro Glu Gly Gly Arg Asn Trp Glu Gly Phe Ala Pro
Asp Pro Val Leu 165 170 175 Thr Gly Ile Gly Met Ser Glu Thr Ile Lys
Gly Ile Gln Asp Ala Gly 180 185 190 Val Ile Ala Cys Ala Lys His Phe
Ile Gly Asn Glu Gln Glu His Phe 195 200 205 Arg Gln Val Pro Glu Ala
Gln Gly Tyr Gly Tyr Asn Ile Ser Glu Thr 210 215 220 Leu Ser Ser Asn
Ile Asp Asp Lys Thr Met His Glu Leu Tyr Leu Trp 225 230 235 240 Pro
Phe Ala Asp Ala Val Arg Ala Gly Val Gly Ser Val Met Cys Ser 245 250
255 Tyr Gln Gln Val Asn Asn Ser Tyr Ala Cys Gln Asn Ser Lys Leu Leu
260 265 270 Asn Asp Leu Leu Lys Asn Glu Leu Gly Phe Gln Gly Phe Val
Met Ser 275 280 285 Asp Trp Gln Ala Gln His Thr Gly Ala Ala Ser Ala
Val Ala Gly Leu 290 295 300 Asp Met Ser Met Pro Gly Asp Thr Gln Phe
Asn Thr Gly Val Ser Phe 305 310 315 320 Trp Gly Ala Asn Leu Thr Leu
Ala Val Leu Asn Gly Thr Val Pro Ala 325 330 335 Tyr Arg Leu Asp Asp
Met Ala Met Arg Ile Met Ala Ala Leu Phe Lys 340 345 350 Val Thr Lys
Thr Thr Asp Leu Glu Pro Ile Asn Phe Ser Phe Trp Thr 355 360 365 Asp
Asp Thr Tyr Gly Pro Ile His Trp Ala Ala Lys Gln Gly Tyr Gln 370 375
380 Glu Ile Asn Ser His Val Asp Val Arg Ala Asp His Gly Asn Leu Ile
385 390 395 400 Arg Glu Ile Ala Ala Lys Gly Thr Val Leu Leu Lys Asn
Thr Gly Ser 405 410 415 Leu Pro Leu Asn Lys Pro Lys Phe Val Ala Val
Ile Gly Glu Asp Ala 420 425 430 Gly Ser Ser Pro Asn Gly Pro Asn Gly
Cys Ser Asp Arg Gly Cys Asn 435 440 445 Glu Gly Thr Leu Ala Met Gly
Trp Gly Ser Gly Thr Ala Asn Tyr Pro 450 455 460 Tyr Leu Val Ser Pro
Asp Ala Ala Leu Gln Ala Arg Ala Ile Gln Asp 465 470 475 480 Gly Thr
Arg Tyr Glu Ser Val Leu Ser Asn Tyr Ala Glu Glu Lys Thr 485 490 495
Lys Ala Leu Val Ser Gln Ala Asn Ala Thr Ala Ile Val Phe Val Asn 500
505 510 Ala Asp Ser Gly Glu Gly Tyr Ile Asn Val Asp Gly Asn Glu Gly
Asp 515 520 525 Arg Lys Asn Leu Thr Leu Trp Asn Asn Gly Asp Thr Leu
Val Lys Asn 530 535 540 Val Ser Ser Trp Cys Ser Asn Thr Ile Val Val
Ile His Ser Val Gly 545 550 555 560 Pro Val Leu Leu Thr Asp Trp Tyr
Asp Asn Pro Asn Ile Thr Ala Ile 565 570 575 Leu Trp Ala Gly Leu Pro
Gly Gln Glu Ser Gly Asn Ser Ile Thr Asp 580 585 590 Val Leu Tyr Gly
Lys Val Asn Pro Ala Ala Arg Ser Pro Phe Thr Trp 595 600 605 Gly Lys
Thr Arg Glu Ser Tyr Gly Ala Asp Val Leu Tyr Lys Pro Asn 610 615 620
Asn Gly Asn Gly Ala Pro Gln Gln Asp Phe Thr Glu Gly Val Phe Ile 625
630 635 640 Asp Tyr Arg Tyr Phe Asp Lys Val Asp Asp Asp Ser Val Ile
Tyr Glu 645 650 655 Phe Gly His Gly Leu Ser Tyr Thr Thr Phe Glu Tyr
Ser Asn Ile Arg 660 665 670 Val Val Lys Ser Asn Val Ser Glu Tyr Arg
Pro Thr Thr Gly Thr Thr 675 680 685 Ala Gln Ala Pro Thr Phe Gly Asn
Phe Ser Thr Asp Leu Glu Asp Tyr 690 695 700 Leu Phe Pro Lys Asp Glu
Phe Pro Tyr Ile Tyr Gln Tyr Ile Tyr Pro 705 710 715 720 Tyr Leu Asn
Thr Thr Asp Pro Arg Arg Ala Ser Ala Asp Pro His Tyr 725 730 735 Gly
Gln Thr Ala Glu Glu Phe Leu Pro Pro His Ala Thr Asp Asp Asp 740 745
750 Pro Gln Pro Leu Leu Arg Ser Ser Gly Gly Asn Ser Pro Gly Gly Asn
755 760 765 Arg Gln Leu Tyr Asp Ile Val Tyr Thr Ile Thr Ala Asp Ile
Thr Asn 770 775 780 Thr Gly Ser Val Val Gly Glu Glu Val Pro Gln Leu
Tyr Val Ser Leu 785 790 795 800 Gly Gly Pro Glu Asp Pro Lys Val Gln
Leu Arg Asp Phe Asp Arg Met 805 810 815 Arg Ile Glu Pro Gly Glu Thr
Arg Gln Phe Thr Gly Arg Leu Thr Arg 820 825 830 Arg Asp Leu Ser Asn
Trp Asp Val Thr Val Gln Asp Trp Val Ile Ser 835 840 845 Arg Tyr Pro
Lys Thr Ala Tyr Val Gly Arg Ser Ser Arg Lys Leu Asp 850 855 860 Leu
Lys Ile Glu Leu Pro 865 870 117851PRTMyceliophthora thermophila
117Ile Glu Ser Arg Lys Val His Gln Lys Pro Leu Ala Arg Ser Glu Pro
1 5 10 15 Phe Tyr Pro Ser Pro Trp Met Asn Pro Asn Ala Asp Gly Trp
Ala Glu 20 25 30 Ala Tyr Ala Gln Ala Lys Ser Phe Val Ser Gln Met
Thr Leu Leu Glu 35 40 45 Lys Val Asn Leu Thr Thr Gly Val Gly Trp
Gly Ala Glu Gln Cys Val 50 55 60 Gly Gln Val Gly Ala Ile Pro Arg
Leu Gly Leu Arg Ser Leu Cys Met 65 70 75 80 His Asp Ser Pro Leu Gly
Ile Arg Gly Ala Asp Tyr Asn Ser Ala Phe 85 90 95 Pro Ser Gly Gln
Thr Val Ala Ala Thr Trp Asp Arg Gly Leu Met Tyr 100 105 110 Arg Arg
Gly Tyr Ala Met Gly Gln Glu Ala Lys Gly Lys Gly Ile Asn 115 120 125
Val Leu Leu Gly Pro Val Ala Gly Pro Leu Gly Arg Met Pro Glu Gly 130
135 140 Gly Arg Asn Trp Glu Gly Phe Ala Pro Asp Pro Val Leu Thr Gly
Ile 145 150 155 160 Gly Met Ser Glu Thr Ile Lys Gly Ile Gln Asp Ala
Gly Val Ile Ala 165 170 175 Cys Ala Lys His Phe Ile Gly Asn Glu Gln
Glu His Phe Arg Gln Val 180 185 190 Pro Glu Ala Gln Gly Tyr Gly Tyr
Asn Ile Ser Glu Thr Leu Ser Ser 195 200 205 Asn Ile Asp Asp Lys Thr
Met His Glu Leu Tyr Leu Trp Pro Phe Ala 210 215 220 Asp Ala Val Arg
Ala Gly Val Gly Ser Val Met Cys Ser Tyr Gln Gln 225 230 235 240 Val
Asn Asn Ser Tyr Ala Cys Gln Asn Ser Lys Leu Leu Asn Asp Leu 245 250
255 Leu Lys Asn Glu Leu Gly Phe Gln Gly Phe Val Met Ser Asp Trp Gln
260 265 270 Ala Gln His Thr Gly Ala Ala Ser Ala Val Ala Gly Leu Asp
Met Ser 275 280 285 Met Pro Gly Asp Thr Gln Phe Asn Thr Gly Val Ser
Phe Trp Gly Ala 290 295 300 Asn Leu Thr Leu Ala Val Leu Asn Gly Thr
Val Pro Ala Tyr Arg Leu 305 310 315 320 Asp Asp Met Ala Met Arg Ile
Met Ala Ala Leu Phe Lys Val Thr Lys 325 330 335 Thr Thr Asp Leu Glu
Pro Ile Asn Phe Ser Phe Trp Thr Asp Asp Thr 340 345 350 Tyr Gly Pro
Ile His Trp Ala Ala Lys Gln Gly Tyr Gln Glu Ile Asn 355 360 365 Ser
His Val Asp Val Arg Ala Asp His Gly Asn Leu Ile Arg Glu Ile 370 375
380 Ala Ala Lys Gly Thr Val Leu Leu Lys Asn Thr Gly Ser Leu Pro Leu
385 390 395 400 Asn Lys Pro Lys Phe Val Ala Val Ile Gly Glu Asp Ala
Gly Ser Ser 405 410 415 Pro Asn Gly Pro Asn Gly Cys Ser Asp Arg Gly
Cys Asn Glu Gly Thr 420 425 430 Leu Ala Met Gly Trp Gly Ser Gly Thr
Ala Asn Tyr Pro Tyr Leu Val 435 440 445 Ser Pro Asp Ala Ala Leu Gln
Ala Arg Ala Ile Gln Asp Gly Thr Arg 450 455 460 Tyr Glu Ser Val Leu
Ser Asn Tyr Ala Glu Glu Lys Thr Lys Ala Leu 465 470 475 480 Val Ser
Gln Ala Asn Ala Thr Ala Ile Val Phe Val Asn Ala Asp Ser 485 490 495
Gly Glu Gly Tyr Ile Asn Val Asp Gly Asn Glu Gly Asp Arg Lys Asn 500
505 510 Leu Thr Leu Trp Asn Asn Gly Asp Thr Leu Val Lys Asn Val Ser
Ser 515 520 525 Trp Cys Ser Asn Thr Ile Val Val Ile His Ser Val Gly
Pro Val Leu 530 535 540 Leu Thr Asp Trp Tyr Asp Asn Pro Asn Ile Thr
Ala Ile Leu Trp Ala 545 550 555 560 Gly Leu Pro Gly Gln Glu Ser Gly
Asn Ser Ile Thr Asp Val Leu Tyr 565 570 575 Gly Lys Val Asn Pro Ala
Ala Arg Ser Pro Phe Thr Trp Gly Lys Thr 580 585 590 Arg Glu Ser Tyr
Gly Ala Asp Val Leu Tyr Lys Pro Asn Asn Gly Asn 595 600 605 Gly Ala
Pro Gln Gln Asp Phe Thr Glu Gly Val Phe Ile Asp Tyr Arg 610 615 620
Tyr Phe Asp Lys Val Asp Asp Asp Ser Val Ile Tyr Glu Phe Gly His 625
630 635 640 Gly Leu Ser Tyr Thr Thr Phe Glu Tyr Ser Asn Ile Arg Val
Val Lys 645 650 655 Ser Asn Val Ser Glu Tyr Arg Pro Thr Thr Gly Thr
Thr Ala Gln Ala 660 665 670 Pro Thr Phe Gly Asn Phe Ser Thr Asp Leu
Glu Asp Tyr Leu Phe Pro 675 680 685 Lys Asp Glu Phe Pro Tyr Ile Tyr
Gln Tyr Ile Tyr Pro Tyr Leu Asn 690 695 700 Thr Thr Asp Pro Arg Arg
Ala Ser Ala Asp Pro His Tyr Gly Gln Thr 705 710 715 720 Ala Glu Glu
Phe Leu Pro Pro His Ala Thr Asp Asp Asp Pro Gln Pro 725 730 735 Leu
Leu Arg Ser Ser Gly Gly Asn Ser Pro Gly Gly Asn Arg Gln Leu 740 745
750 Tyr Asp Ile Val Tyr Thr Ile Thr Ala Asp Ile Thr Asn Thr Gly Ser
755 760 765 Val Val Gly Glu Glu Val Pro Gln Leu Tyr Val Ser Leu Gly
Gly Pro 770 775 780 Glu Asp Pro Lys Val Gln Leu Arg Asp Phe Asp Arg
Met Arg Ile Glu 785 790 795 800 Pro Gly Glu Thr Arg Gln Phe Thr Gly
Arg Leu Thr Arg Arg Asp Leu 805 810 815 Ser Asn Trp Asp Val Thr Val
Gln Asp Trp Val Ile Ser Arg Tyr Pro 820 825 830 Lys Thr Ala Tyr Val
Gly Arg Ser Ser Arg Lys Leu Asp Leu Lys Ile 835 840 845 Glu Leu Pro
850 1182613DNAArtificial SequenceSynthetic polynucleotide.
118atgaaggctg ctgcgctttc ctgcctcttc ggcagtaccc ttgccgttgc
aggcgccatt 60gaatcgagaa aggttcacca gaagcccctc gcgagatctg aaccttttta
cccgtcgcca 120tggatgaatc ccaacgccga cggctgggcg gaggcctatg
cccaggccaa gtcctttgtc 180tcccaaatga ctctgctaga gaaggtcaac
ttgaccacgg gagtcggctg gggggctgag 240cagtgcgtcg gccaagtggg
cgcgatccct cgccttggac ttcgcagtct gtgcatgcat 300gactcccctc
tcggcatccg aggagccgac tacaactcag cgttcccctc tggccagacc
360gttgctgcta cctgggatcg cggtctgatg taccgtcgcg gctacgcaat
gggccaggag 420gccaaaggca agggcatcaa tgtccttctc ggaccagtcg
ccggccccct tggccgcatg 480cccgagggcg gtcgtaactg ggaaggcttc
gctccggatc ccgtccttac cggcatcggc 540atgtccgaga cgatcaaggg
cattcaggat gctggcgtca tcgcttgtgc gaagcacttt 600attggaaacg
agcaggagca cttcagacag gtgccagaag cccagggata cggttacaac
660atcagcgaaa ccctctcctc caacattgac gacaagacca tgcacgagct
ctacctttgg 720ccgtttgccg atgccgtccg ggccggcgtc ggctctgtca
tgtgctcgta caaccaggtc 780aacaactcgt acgcctgcca gaactcgaag
ctgctgaacg acctcctcaa gaacgagctt 840gggtttcagg gcttcgtcat
gagcgactgg tgggcacagc acactggcgc agcaagcgcc 900gtggctggtc
tcgatatgtc catgccgggc gacaccatgt tcaacactgg cgtcagtttc
960tggggcgcca atctcaccct cgccgtcctc aacggcacag tccctgccta
ccgtctcgac 1020gacatggcca tgcgcatcat ggccgccctc ttcaaggtca
ccaagaccac cgacctggaa 1080ccgatcaact tctccttctg gacccgcgac
acttatggcc cgatccactg ggccgccaag 1140cagggctacc aggagattaa
ttcccacgtt gacgtccgcg ccgaccacgg caacctcatc 1200cggaacattg
ccgccaaggg tacggtgctg ctgaagaata ccggctctct acccctgaac
1260aagccaaagt tcgtggccgt catcggcgag gatgctgggc cgagccccaa
cgggcccaac 1320ggctgcagcg accgcggctg taacgaaggc acgctcgcca
tgggctgggg atccggcaca 1380gccaactatc cgtacctcgt ttcccccgac
gccgcgctcc agttgcgggc catccaggac 1440ggcacgaggt acgagagcgt
cctgtccaac tacgccgagg aaaatacaaa ggctctggtc 1500tcgcaggcca
atgcaaccgc catcgtcttc gtcaatgccg actcaggcga gggctacatc
1560aacgtggacg gtaacgaggg cgaccgtaag aacctgactc tctggaacaa
cggtgatact 1620ctggtcaaga acgtctcgag ctggtgcagc aacaccatcg
tcgtcatcca ctcggtcggc 1680ccggtcctcc tgaccgattg gtacgacaac
cccaacatca cggccattct ctgggctggt 1740cttccgggcc aggagtcggg
caactccatc accgacgtgc tttacggcaa ggtcaacccc 1800gccgcccgct
cgcccttcac ttggggcaag acccgcgaaa gctatggcgc ggacgtcctg
1860tacaagccga ataatggcaa ttgggcgccc caacaggact tcaccgaggg
cgtcttcatc 1920gactaccgct acttcgacaa ggttgacgat gactcggtca
tctacgagtt cggccacggc 1980ctgagctaca ccaccttcga gtacagcaac
atccgcgtcg tcaagtccaa cgtcagcgag 2040taccggccca cgacgggcac
cacgattcag gccccgacgt ttggcaactt ctccaccgac 2100ctcgaggact
atctcttccc caaggacgag ttcccctaca tcccgcagta catctacccg
2160tacctcaaca cgaccgaccc ccggagggcc
tcggccgatc cccactacgg ccagaccgcc 2220gaggagttcc tcccgcccca
cgccaccgat gacgaccccc agccgctcct ccggtcctcg 2280ggcggaaact
cccccggcgg caaccgccag ctgtacgaca ttgtctacac aatcacggcc
2340gacatcacga atacgggctc cgttgtaggc gaggaggtac cgcagctcta
cgtctcgctg 2400ggcggtcccg aggatcccaa ggtgcagctg cgcgactttg
acaggatgcg gatcgaaccc 2460ggcgagacga ggcagttcac cggccgcctg
acgcgcagag atctgagcaa ctgggacgtc 2520acggtgcagg actgggtcat
cagcaggtat cccaagacgg catatgttgg gaggagcagc 2580cggaagttgg
atctcaagat tgagcttcct tga 2613119870PRTArtificial SequenceSynthetic
polypeptides. 119Met Lys Ala Ala Ala Leu Ser Cys Leu Phe Gly Ser
Thr Leu Ala Val 1 5 10 15 Ala Gly Ala Ile Glu Ser Arg Lys Val His
Gln Lys Pro Leu Ala Arg 20 25 30 Ser Glu Pro Phe Tyr Pro Ser Pro
Trp Met Asn Pro Asn Ala Asp Gly 35 40 45 Trp Ala Glu Ala Tyr Ala
Gln Ala Lys Ser Phe Val Ser Gln Met Thr 50 55 60 Leu Leu Glu Lys
Val Asn Leu Thr Thr Gly Val Gly Trp Gly Ala Glu 65 70 75 80 Gln Cys
Val Gly Gln Val Gly Ala Ile Pro Arg Leu Gly Leu Arg Ser 85 90 95
Leu Cys Met His Asp Ser Pro Leu Gly Ile Arg Gly Ala Asp Tyr Asn 100
105 110 Ser Ala Phe Pro Ser Gly Gln Thr Val Ala Ala Thr Trp Asp Arg
Gly 115 120 125 Leu Met Tyr Arg Arg Gly Tyr Ala Met Gly Gln Glu Ala
Lys Gly Lys 130 135 140 Gly Ile Asn Val Leu Leu Gly Pro Val Ala Gly
Pro Leu Gly Arg Met 145 150 155 160 Pro Glu Gly Gly Arg Asn Trp Glu
Gly Phe Ala Pro Asp Pro Val Leu 165 170 175 Thr Gly Ile Gly Met Ser
Glu Thr Ile Lys Gly Ile Gln Asp Ala Gly 180 185 190 Val Ile Ala Cys
Ala Lys His Phe Ile Gly Asn Glu Gln Glu His Phe 195 200 205 Arg Gln
Val Pro Glu Ala Gln Gly Tyr Gly Tyr Asn Ile Ser Glu Thr 210 215 220
Leu Ser Ser Asn Ile Asp Asp Lys Thr Met His Glu Leu Tyr Leu Trp 225
230 235 240 Pro Phe Ala Asp Ala Val Arg Ala Gly Val Gly Ser Val Met
Cys Ser 245 250 255 Tyr Asn Gln Val Asn Asn Ser Tyr Ala Cys Gln Asn
Ser Lys Leu Leu 260 265 270 Asn Asp Leu Leu Lys Asn Glu Leu Gly Phe
Gln Gly Phe Val Met Ser 275 280 285 Asp Trp Trp Ala Gln His Thr Gly
Ala Ala Ser Ala Val Ala Gly Leu 290 295 300 Asp Met Ser Met Pro Gly
Asp Thr Met Phe Asn Thr Gly Val Ser Phe 305 310 315 320 Trp Gly Ala
Asn Leu Thr Leu Ala Val Leu Asn Gly Thr Val Pro Ala 325 330 335 Tyr
Arg Leu Asp Asp Met Ala Met Arg Ile Met Ala Ala Leu Phe Lys 340 345
350 Val Thr Lys Thr Thr Asp Leu Glu Pro Ile Asn Phe Ser Phe Trp Thr
355 360 365 Arg Asp Thr Tyr Gly Pro Ile His Trp Ala Ala Lys Gln Gly
Tyr Gln 370 375 380 Glu Ile Asn Ser His Val Asp Val Arg Ala Asp His
Gly Asn Leu Ile 385 390 395 400 Arg Asn Ile Ala Ala Lys Gly Thr Val
Leu Leu Lys Asn Thr Gly Ser 405 410 415 Leu Pro Leu Asn Lys Pro Lys
Phe Val Ala Val Ile Gly Glu Asp Ala 420 425 430 Gly Pro Ser Pro Asn
Gly Pro Asn Gly Cys Ser Asp Arg Gly Cys Asn 435 440 445 Glu Gly Thr
Leu Ala Met Gly Trp Gly Ser Gly Thr Ala Asn Tyr Pro 450 455 460 Tyr
Leu Val Ser Pro Asp Ala Ala Leu Gln Leu Arg Ala Ile Gln Asp 465 470
475 480 Gly Thr Arg Tyr Glu Ser Val Leu Ser Asn Tyr Ala Glu Glu Asn
Thr 485 490 495 Lys Ala Leu Val Ser Gln Ala Asn Ala Thr Ala Ile Val
Phe Val Asn 500 505 510 Ala Asp Ser Gly Glu Gly Tyr Ile Asn Val Asp
Gly Asn Glu Gly Asp 515 520 525 Arg Lys Asn Leu Thr Leu Trp Asn Asn
Gly Asp Thr Leu Val Lys Asn 530 535 540 Val Ser Ser Trp Cys Ser Asn
Thr Ile Val Val Ile His Ser Val Gly 545 550 555 560 Pro Val Leu Leu
Thr Asp Trp Tyr Asp Asn Pro Asn Ile Thr Ala Ile 565 570 575 Leu Trp
Ala Gly Leu Pro Gly Gln Glu Ser Gly Asn Ser Ile Thr Asp 580 585 590
Val Leu Tyr Gly Lys Val Asn Pro Ala Ala Arg Ser Pro Phe Thr Trp 595
600 605 Gly Lys Thr Arg Glu Ser Tyr Gly Ala Asp Val Leu Tyr Lys Pro
Asn 610 615 620 Asn Gly Asn Trp Ala Pro Gln Gln Asp Phe Thr Glu Gly
Val Phe Ile 625 630 635 640 Asp Tyr Arg Tyr Phe Asp Lys Val Asp Asp
Asp Ser Val Ile Tyr Glu 645 650 655 Phe Gly His Gly Leu Ser Tyr Thr
Thr Phe Glu Tyr Ser Asn Ile Arg 660 665 670 Val Val Lys Ser Asn Val
Ser Glu Tyr Arg Pro Thr Thr Gly Thr Thr 675 680 685 Ile Gln Ala Pro
Thr Phe Gly Asn Phe Ser Thr Asp Leu Glu Asp Tyr 690 695 700 Leu Phe
Pro Lys Asp Glu Phe Pro Tyr Ile Pro Gln Tyr Ile Tyr Pro 705 710 715
720 Tyr Leu Asn Thr Thr Asp Pro Arg Arg Ala Ser Ala Asp Pro His Tyr
725 730 735 Gly Gln Thr Ala Glu Glu Phe Leu Pro Pro His Ala Thr Asp
Asp Asp 740 745 750 Pro Gln Pro Leu Leu Arg Ser Ser Gly Gly Asn Ser
Pro Gly Gly Asn 755 760 765 Arg Gln Leu Tyr Asp Ile Val Tyr Thr Ile
Thr Ala Asp Ile Thr Asn 770 775 780 Thr Gly Ser Val Val Gly Glu Glu
Val Pro Gln Leu Tyr Val Ser Leu 785 790 795 800 Gly Gly Pro Glu Asp
Pro Lys Val Gln Leu Arg Asp Phe Asp Arg Met 805 810 815 Arg Ile Glu
Pro Gly Glu Thr Arg Gln Phe Thr Gly Arg Leu Thr Arg 820 825 830 Arg
Asp Leu Ser Asn Trp Asp Val Thr Val Gln Asp Trp Val Ile Ser 835 840
845 Arg Tyr Pro Lys Thr Ala Tyr Val Gly Arg Ser Ser Arg Lys Leu Asp
850 855 860 Leu Lys Ile Glu Leu Pro 865 870 120851PRTArtificial
SequenceSynthetic polypeptides. 120Ile Glu Ser Arg Lys Val His Gln
Lys Pro Leu Ala Arg Ser Glu Pro 1 5 10 15 Phe Tyr Pro Ser Pro Trp
Met Asn Pro Asn Ala Asp Gly Trp Ala Glu 20 25 30 Ala Tyr Ala Gln
Ala Lys Ser Phe Val Ser Gln Met Thr Leu Leu Glu 35 40 45 Lys Val
Asn Leu Thr Thr Gly Val Gly Trp Gly Ala Glu Gln Cys Val 50 55 60
Gly Gln Val Gly Ala Ile Pro Arg Leu Gly Leu Arg Ser Leu Cys Met 65
70 75 80 His Asp Ser Pro Leu Gly Ile Arg Gly Ala Asp Tyr Asn Ser
Ala Phe 85 90 95 Pro Ser Gly Gln Thr Val Ala Ala Thr Trp Asp Arg
Gly Leu Met Tyr 100 105 110 Arg Arg Gly Tyr Ala Met Gly Gln Glu Ala
Lys Gly Lys Gly Ile Asn 115 120 125 Val Leu Leu Gly Pro Val Ala Gly
Pro Leu Gly Arg Met Pro Glu Gly 130 135 140 Gly Arg Asn Trp Glu Gly
Phe Ala Pro Asp Pro Val Leu Thr Gly Ile 145 150 155 160 Gly Met Ser
Glu Thr Ile Lys Gly Ile Gln Asp Ala Gly Val Ile Ala 165 170 175 Cys
Ala Lys His Phe Ile Gly Asn Glu Gln Glu His Phe Arg Gln Val 180 185
190 Pro Glu Ala Gln Gly Tyr Gly Tyr Asn Ile Ser Glu Thr Leu Ser Ser
195 200 205 Asn Ile Asp Asp Lys Thr Met His Glu Leu Tyr Leu Trp Pro
Phe Ala 210 215 220 Asp Ala Val Arg Ala Gly Val Gly Ser Val Met Cys
Ser Tyr Asn Gln 225 230 235 240 Val Asn Asn Ser Tyr Ala Cys Gln Asn
Ser Lys Leu Leu Asn Asp Leu 245 250 255 Leu Lys Asn Glu Leu Gly Phe
Gln Gly Phe Val Met Ser Asp Trp Trp 260 265 270 Ala Gln His Thr Gly
Ala Ala Ser Ala Val Ala Gly Leu Asp Met Ser 275 280 285 Met Pro Gly
Asp Thr Met Phe Asn Thr Gly Val Ser Phe Trp Gly Ala 290 295 300 Asn
Leu Thr Leu Ala Val Leu Asn Gly Thr Val Pro Ala Tyr Arg Leu 305 310
315 320 Asp Asp Met Ala Met Arg Ile Met Ala Ala Leu Phe Lys Val Thr
Lys 325 330 335 Thr Thr Asp Leu Glu Pro Ile Asn Phe Ser Phe Trp Thr
Arg Asp Thr 340 345 350 Tyr Gly Pro Ile His Trp Ala Ala Lys Gln Gly
Tyr Gln Glu Ile Asn 355 360 365 Ser His Val Asp Val Arg Ala Asp His
Gly Asn Leu Ile Arg Asn Ile 370 375 380 Ala Ala Lys Gly Thr Val Leu
Leu Lys Asn Thr Gly Ser Leu Pro Leu 385 390 395 400 Asn Lys Pro Lys
Phe Val Ala Val Ile Gly Glu Asp Ala Gly Pro Ser 405 410 415 Pro Asn
Gly Pro Asn Gly Cys Ser Asp Arg Gly Cys Asn Glu Gly Thr 420 425 430
Leu Ala Met Gly Trp Gly Ser Gly Thr Ala Asn Tyr Pro Tyr Leu Val 435
440 445 Ser Pro Asp Ala Ala Leu Gln Leu Arg Ala Ile Gln Asp Gly Thr
Arg 450 455 460 Tyr Glu Ser Val Leu Ser Asn Tyr Ala Glu Glu Asn Thr
Lys Ala Leu 465 470 475 480 Val Ser Gln Ala Asn Ala Thr Ala Ile Val
Phe Val Asn Ala Asp Ser 485 490 495 Gly Glu Gly Tyr Ile Asn Val Asp
Gly Asn Glu Gly Asp Arg Lys Asn 500 505 510 Leu Thr Leu Trp Asn Asn
Gly Asp Thr Leu Val Lys Asn Val Ser Ser 515 520 525 Trp Cys Ser Asn
Thr Ile Val Val Ile His Ser Val Gly Pro Val Leu 530 535 540 Leu Thr
Asp Trp Tyr Asp Asn Pro Asn Ile Thr Ala Ile Leu Trp Ala 545 550 555
560 Gly Leu Pro Gly Gln Glu Ser Gly Asn Ser Ile Thr Asp Val Leu Tyr
565 570 575 Gly Lys Val Asn Pro Ala Ala Arg Ser Pro Phe Thr Trp Gly
Lys Thr 580 585 590 Arg Glu Ser Tyr Gly Ala Asp Val Leu Tyr Lys Pro
Asn Asn Gly Asn 595 600 605 Trp Ala Pro Gln Gln Asp Phe Thr Glu Gly
Val Phe Ile Asp Tyr Arg 610 615 620 Tyr Phe Asp Lys Val Asp Asp Asp
Ser Val Ile Tyr Glu Phe Gly His 625 630 635 640 Gly Leu Ser Tyr Thr
Thr Phe Glu Tyr Ser Asn Ile Arg Val Val Lys 645 650 655 Ser Asn Val
Ser Glu Tyr Arg Pro Thr Thr Gly Thr Thr Ile Gln Ala 660 665 670 Pro
Thr Phe Gly Asn Phe Ser Thr Asp Leu Glu Asp Tyr Leu Phe Pro 675 680
685 Lys Asp Glu Phe Pro Tyr Ile Pro Gln Tyr Ile Tyr Pro Tyr Leu Asn
690 695 700 Thr Thr Asp Pro Arg Arg Ala Ser Ala Asp Pro His Tyr Gly
Gln Thr 705 710 715 720 Ala Glu Glu Phe Leu Pro Pro His Ala Thr Asp
Asp Asp Pro Gln Pro 725 730 735 Leu Leu Arg Ser Ser Gly Gly Asn Ser
Pro Gly Gly Asn Arg Gln Leu 740 745 750 Tyr Asp Ile Val Tyr Thr Ile
Thr Ala Asp Ile Thr Asn Thr Gly Ser 755 760 765 Val Val Gly Glu Glu
Val Pro Gln Leu Tyr Val Ser Leu Gly Gly Pro 770 775 780 Glu Asp Pro
Lys Val Gln Leu Arg Asp Phe Asp Arg Met Arg Ile Glu 785 790 795 800
Pro Gly Glu Thr Arg Gln Phe Thr Gly Arg Leu Thr Arg Arg Asp Leu 805
810 815 Ser Asn Trp Asp Val Thr Val Gln Asp Trp Val Ile Ser Arg Tyr
Pro 820 825 830 Lys Thr Ala Tyr Val Gly Arg Ser Ser Arg Lys Leu Asp
Leu Lys Ile 835 840 845 Glu Leu Pro 850 1212613DNAArtificial
SequenceSynthetic polynucleotide. 121atgaaggctg ctgcgctttc
ctgcctcttc ggcagtaccc ttgccgttgc aggcgccatt 60gaatcgagaa aggttcacca
gaagcccctc gcgagatctg aaccttttta cccgtcgcca 120tggatgaatc
ccaacgccat cggctgggcg gaggcctatg cccaggccaa gtcctttgtc
180tcccaaatga ctctgctaga gaaggtcaac ttgaccacgg gagtcggctg
gggggaggag 240cagtgcgtcg gcaacgtggg cgcgatccct cgccttggac
ttcgcagtct gtgcatgcat 300gactcccctc tcggcgtgcg aggaaccgac
tacaactcag cgttcccctc tggccagacc 360gttgctgcta cctgggatcg
cggtctgatg taccgtcgcg gctacgcaat gggccaggag 420gccaaaggca
agggcatcaa tgtccttctc ggaccagtcg ccggccccct tggccgcatg
480cccgagggcg gtcgtaactg ggaaggcttc gctccggatc ccgtccttac
cggcatcggc 540atgtccgaga cgatcaaggg cattcaggat gctggcgtca
tcgcttgtgc gaagcacttt 600attggaaacg agcaggagca cttcagacag
gtgccagaag cccagggata cggttacaac 660atcagcgaaa ccctctcctc
caacattgac gacaagacca tgcacgagct ctacctttgg 720ccgtttgccg
atgccgtccg ggccggcgtc ggctctgtca tgtgctcgta caaccagggc
780aacaactcgt acgcctgcca gaactcgaag ctgctgaacg acctcctcaa
gaacgagctt 840gggtttcagg gcttcgtcat gagcgactgg tgggcacagc
acactggcgc agcaagcgcc 900gtggctggtc tcgatatgtc catgccgggc
gacaccatgg tcaacactgg cgtcagtttc 960tggggcgcca atctcaccct
cgccgtcctc aacggcacag tccctgccta ccgtctcgac 1020gacatgtgca
tgcgcatcat ggccgccctc ttcaaggtca ccaagaccac cgacctggaa
1080ccgatcaact tctccttctg gacccgcgac acttatggcc cgatccactg
ggccgccaag 1140cagggctacc aggagattaa ttcccacgtt gacgtccgcg
ccgaccacgg caacctcatc 1200cggaacattg ccgccaaggg tacggtgctg
ctgaagaata ccggctctct acccctgaac 1260aagccaaagt tcgtggccgt
catcggcgag gatgctgggc cgagccccaa cgggcccaac 1320ggctgcagcg
accgcggctg taacgaaggc acgctcgcca tgggctgggg atccggcaca
1380gccaactatc cgtacctcgt ttcccccgac gccgcgctcc aggcgcgggc
catccaggac 1440ggcacgaggt acgagagcgt cctgtccaac tacgccgagg
aaaatacaaa ggctctggtc 1500tcgcaggcca atgcaaccgc catcgtcttc
gtcaatgccg actcaggcga gggctacatc 1560aacgtggacg gtaacgaggg
cgaccgtaag aacctgactc tctggaacaa cggtgatact 1620ctggtcaaga
acgtctcgag ctggtgcagc aacaccatcg tcgtcatcca ctcggtcggc
1680ccggtcctcc tgaccgattg gtacgacaac cccaacatca cggccattct
ctgggctggt 1740cttccgggcc aggagtcggg caactccatc accgacgtgc
tttacggcaa ggtcaacccc 1800gccgcccgct cgcccttcac ttggggcaag
acccgcgaaa gctatggcgc ggacgtcctg 1860tacaagccga ataatggcaa
ttgggcgccc caacaggact tcaccgaggg cgtcttcatc 1920gactaccgct
acttcgacaa ggttgacgat gactcggtca tctacgagtt cggccacggc
1980ctgagctaca ccaccttcga gtacagcaac atccgcgtcg tcaagtccaa
cgtcagcgag 2040taccggccca cgacgggcac cacgattcag gccccgacgt
ttggcaactt ctccaccgac 2100ctcgaggact atctcttccc caaggacgag
ttcccctaca tcccgcagta catctacccg 2160tacctcaaca cgaccgaccc
ccggagggcc tcgggcgatc cccactacgg ccagaccgcc 2220gaggagttcc
tcccgcccca cgccaccgat gacgaccccc agccgctcct ccggtcctcg
2280ggcggaaact cccccggcgg caaccgccag ctgtacgaca ttgtctacac
aatcacggcc 2340gacatcacga atacgggctc cgttgtaggc gaggaggtac
cgcagctcta cgtctcgctg 2400ggcggtcccg aggatcccaa ggtgcagctg
cgcgactttg acaggatgcg gatcgaaccc 2460ggcgagacga ggcagttcac
cggccgcctg acgcgcagag atctgagcaa ctgggacgtc 2520acggtgcagg
actgggtcat cagcaggtat cccaagacgg catatgttgg gaggagcagc
2580cggaagttgg atctcaagat tgagcttcct tga 2613122870PRTArtificial
SequenceSynthetic polypeptides. 122Met Lys Ala Ala Ala Leu Ser Cys
Leu Phe Gly Ser Thr Leu Ala Val 1 5 10 15 Ala Gly Ala Ile Glu Ser
Arg Lys Val His Gln Lys Pro Leu Ala Arg 20 25 30 Ser Glu Pro Phe
Tyr Pro Ser Pro Trp Met Asn Pro Asn Ala Ile Gly 35 40 45 Trp Ala
Glu Ala Tyr Ala Gln Ala Lys Ser Phe Val Ser Gln Met Thr 50 55 60
Leu Leu Glu Lys Val Asn Leu Thr Thr Gly Val Gly Trp Gly Glu Glu 65
70 75 80 Gln Cys Val Gly Asn Val Gly Ala Ile Pro Arg Leu Gly Leu Arg
Ser 85 90 95 Leu Cys Met His Asp Ser Pro Leu Gly Val Arg Gly Thr
Asp Tyr Asn 100 105 110 Ser Ala Phe Pro Ser Gly Gln Thr Val Ala Ala
Thr Trp Asp Arg Gly 115 120 125 Leu Met Tyr Arg Arg Gly Tyr Ala Met
Gly Gln Glu Ala Lys Gly Lys 130 135 140 Gly Ile Asn Val Leu Leu Gly
Pro Val Ala Gly Pro Leu Gly Arg Met 145 150 155 160 Pro Glu Gly Gly
Arg Asn Trp Glu Gly Phe Ala Pro Asp Pro Val Leu 165 170 175 Thr Gly
Ile Gly Met Ser Glu Thr Ile Lys Gly Ile Gln Asp Ala Gly 180 185 190
Val Ile Ala Cys Ala Lys His Phe Ile Gly Asn Glu Gln Glu His Phe 195
200 205 Arg Gln Val Pro Glu Ala Gln Gly Tyr Gly Tyr Asn Ile Ser Glu
Thr 210 215 220 Leu Ser Ser Asn Ile Asp Asp Lys Thr Met His Glu Leu
Tyr Leu Trp 225 230 235 240 Pro Phe Ala Asp Ala Val Arg Ala Gly Val
Gly Ser Val Met Cys Ser 245 250 255 Tyr Asn Gln Gly Asn Asn Ser Tyr
Ala Cys Gln Asn Ser Lys Leu Leu 260 265 270 Asn Asp Leu Leu Lys Asn
Glu Leu Gly Phe Gln Gly Phe Val Met Ser 275 280 285 Asp Trp Trp Ala
Gln His Thr Gly Ala Ala Ser Ala Val Ala Gly Leu 290 295 300 Asp Met
Ser Met Pro Gly Asp Thr Met Val Asn Thr Gly Val Ser Phe 305 310 315
320 Trp Gly Ala Asn Leu Thr Leu Ala Val Leu Asn Gly Thr Val Pro Ala
325 330 335 Tyr Arg Leu Asp Asp Met Cys Met Arg Ile Met Ala Ala Leu
Phe Lys 340 345 350 Val Thr Lys Thr Thr Asp Leu Glu Pro Ile Asn Phe
Ser Phe Trp Thr 355 360 365 Arg Asp Thr Tyr Gly Pro Ile His Trp Ala
Ala Lys Gln Gly Tyr Gln 370 375 380 Glu Ile Asn Ser His Val Asp Val
Arg Ala Asp His Gly Asn Leu Ile 385 390 395 400 Arg Asn Ile Ala Ala
Lys Gly Thr Val Leu Leu Lys Asn Thr Gly Ser 405 410 415 Leu Pro Leu
Asn Lys Pro Lys Phe Val Ala Val Ile Gly Glu Asp Ala 420 425 430 Gly
Pro Ser Pro Asn Gly Pro Asn Gly Cys Ser Asp Arg Gly Cys Asn 435 440
445 Glu Gly Thr Leu Ala Met Gly Trp Gly Ser Gly Thr Ala Asn Tyr Pro
450 455 460 Tyr Leu Val Ser Pro Asp Ala Ala Leu Gln Ala Arg Ala Ile
Gln Asp 465 470 475 480 Gly Thr Arg Tyr Glu Ser Val Leu Ser Asn Tyr
Ala Glu Glu Asn Thr 485 490 495 Lys Ala Leu Val Ser Gln Ala Asn Ala
Thr Ala Ile Val Phe Val Asn 500 505 510 Ala Asp Ser Gly Glu Gly Tyr
Ile Asn Val Asp Gly Asn Glu Gly Asp 515 520 525 Arg Lys Asn Leu Thr
Leu Trp Asn Asn Gly Asp Thr Leu Val Lys Asn 530 535 540 Val Ser Ser
Trp Cys Ser Asn Thr Ile Val Val Ile His Ser Val Gly 545 550 555 560
Pro Val Leu Leu Thr Asp Trp Tyr Asp Asn Pro Asn Ile Thr Ala Ile 565
570 575 Leu Trp Ala Gly Leu Pro Gly Gln Glu Ser Gly Asn Ser Ile Thr
Asp 580 585 590 Val Leu Tyr Gly Lys Val Asn Pro Ala Ala Arg Ser Pro
Phe Thr Trp 595 600 605 Gly Lys Thr Arg Glu Ser Tyr Gly Ala Asp Val
Leu Tyr Lys Pro Asn 610 615 620 Asn Gly Asn Trp Ala Pro Gln Gln Asp
Phe Thr Glu Gly Val Phe Ile 625 630 635 640 Asp Tyr Arg Tyr Phe Asp
Lys Val Asp Asp Asp Ser Val Ile Tyr Glu 645 650 655 Phe Gly His Gly
Leu Ser Tyr Thr Thr Phe Glu Tyr Ser Asn Ile Arg 660 665 670 Val Val
Lys Ser Asn Val Ser Glu Tyr Arg Pro Thr Thr Gly Thr Thr 675 680 685
Ile Gln Ala Pro Thr Phe Gly Asn Phe Ser Thr Asp Leu Glu Asp Tyr 690
695 700 Leu Phe Pro Lys Asp Glu Phe Pro Tyr Ile Pro Gln Tyr Ile Tyr
Pro 705 710 715 720 Tyr Leu Asn Thr Thr Asp Pro Arg Arg Ala Ser Gly
Asp Pro His Tyr 725 730 735 Gly Gln Thr Ala Glu Glu Phe Leu Pro Pro
His Ala Thr Asp Asp Asp 740 745 750 Pro Gln Pro Leu Leu Arg Ser Ser
Gly Gly Asn Ser Pro Gly Gly Asn 755 760 765 Arg Gln Leu Tyr Asp Ile
Val Tyr Thr Ile Thr Ala Asp Ile Thr Asn 770 775 780 Thr Gly Ser Val
Val Gly Glu Glu Val Pro Gln Leu Tyr Val Ser Leu 785 790 795 800 Gly
Gly Pro Glu Asp Pro Lys Val Gln Leu Arg Asp Phe Asp Arg Met 805 810
815 Arg Ile Glu Pro Gly Glu Thr Arg Gln Phe Thr Gly Arg Leu Thr Arg
820 825 830 Arg Asp Leu Ser Asn Trp Asp Val Thr Val Gln Asp Trp Val
Ile Ser 835 840 845 Arg Tyr Pro Lys Thr Ala Tyr Val Gly Arg Ser Ser
Arg Lys Leu Asp 850 855 860 Leu Lys Ile Glu Leu Pro 865 870
123851PRTArtificial SequenceSynthetic polypeptides. 123Ile Glu Ser
Arg Lys Val His Gln Lys Pro Leu Ala Arg Ser Glu Pro 1 5 10 15 Phe
Tyr Pro Ser Pro Trp Met Asn Pro Asn Ala Ile Gly Trp Ala Glu 20 25
30 Ala Tyr Ala Gln Ala Lys Ser Phe Val Ser Gln Met Thr Leu Leu Glu
35 40 45 Lys Val Asn Leu Thr Thr Gly Val Gly Trp Gly Glu Glu Gln
Cys Val 50 55 60 Gly Asn Val Gly Ala Ile Pro Arg Leu Gly Leu Arg
Ser Leu Cys Met 65 70 75 80 His Asp Ser Pro Leu Gly Val Arg Gly Thr
Asp Tyr Asn Ser Ala Phe 85 90 95 Pro Ser Gly Gln Thr Val Ala Ala
Thr Trp Asp Arg Gly Leu Met Tyr 100 105 110 Arg Arg Gly Tyr Ala Met
Gly Gln Glu Ala Lys Gly Lys Gly Ile Asn 115 120 125 Val Leu Leu Gly
Pro Val Ala Gly Pro Leu Gly Arg Met Pro Glu Gly 130 135 140 Gly Arg
Asn Trp Glu Gly Phe Ala Pro Asp Pro Val Leu Thr Gly Ile 145 150 155
160 Gly Met Ser Glu Thr Ile Lys Gly Ile Gln Asp Ala Gly Val Ile Ala
165 170 175 Cys Ala Lys His Phe Ile Gly Asn Glu Gln Glu His Phe Arg
Gln Val 180 185 190 Pro Glu Ala Gln Gly Tyr Gly Tyr Asn Ile Ser Glu
Thr Leu Ser Ser 195 200 205 Asn Ile Asp Asp Lys Thr Met His Glu Leu
Tyr Leu Trp Pro Phe Ala 210 215 220 Asp Ala Val Arg Ala Gly Val Gly
Ser Val Met Cys Ser Tyr Asn Gln 225 230 235 240 Gly Asn Asn Ser Tyr
Ala Cys Gln Asn Ser Lys Leu Leu Asn Asp Leu 245 250 255 Leu Lys Asn
Glu Leu Gly Phe Gln Gly Phe Val Met Ser Asp Trp Trp 260 265 270 Ala
Gln His Thr Gly Ala Ala Ser Ala Val Ala Gly Leu Asp Met Ser 275 280
285 Met Pro Gly Asp Thr Met Val Asn Thr Gly Val Ser Phe Trp Gly Ala
290 295 300 Asn Leu Thr Leu Ala Val Leu Asn Gly Thr Val Pro Ala Tyr
Arg Leu 305 310 315 320 Asp Asp Met Cys Met Arg Ile Met Ala Ala Leu
Phe Lys Val Thr Lys 325 330 335 Thr Thr Asp Leu Glu Pro Ile Asn Phe
Ser Phe Trp Thr Arg Asp Thr 340 345 350 Tyr Gly Pro Ile His Trp Ala
Ala Lys Gln Gly Tyr Gln Glu Ile Asn 355 360 365 Ser His Val Asp Val
Arg Ala Asp His Gly Asn Leu Ile Arg Asn Ile 370 375 380 Ala Ala Lys
Gly Thr Val Leu Leu Lys Asn Thr Gly Ser Leu Pro Leu 385 390 395 400
Asn Lys Pro Lys Phe Val Ala Val Ile Gly Glu Asp Ala Gly Pro Ser 405
410 415 Pro Asn Gly Pro Asn Gly Cys Ser Asp Arg Gly Cys Asn Glu Gly
Thr 420 425 430 Leu Ala Met Gly Trp Gly Ser Gly Thr Ala Asn Tyr Pro
Tyr Leu Val 435 440 445 Ser Pro Asp Ala Ala Leu Gln Ala Arg Ala Ile
Gln Asp Gly Thr Arg 450 455 460 Tyr Glu Ser Val Leu Ser Asn Tyr Ala
Glu Glu Asn Thr Lys Ala Leu 465 470 475 480 Val Ser Gln Ala Asn Ala
Thr Ala Ile Val Phe Val Asn Ala Asp Ser 485 490 495 Gly Glu Gly Tyr
Ile Asn Val Asp Gly Asn Glu Gly Asp Arg Lys Asn 500 505 510 Leu Thr
Leu Trp Asn Asn Gly Asp Thr Leu Val Lys Asn Val Ser Ser 515 520 525
Trp Cys Ser Asn Thr Ile Val Val Ile His Ser Val Gly Pro Val Leu 530
535 540 Leu Thr Asp Trp Tyr Asp Asn Pro Asn Ile Thr Ala Ile Leu Trp
Ala 545 550 555 560 Gly Leu Pro Gly Gln Glu Ser Gly Asn Ser Ile Thr
Asp Val Leu Tyr 565 570 575 Gly Lys Val Asn Pro Ala Ala Arg Ser Pro
Phe Thr Trp Gly Lys Thr 580 585 590 Arg Glu Ser Tyr Gly Ala Asp Val
Leu Tyr Lys Pro Asn Asn Gly Asn 595 600 605 Trp Ala Pro Gln Gln Asp
Phe Thr Glu Gly Val Phe Ile Asp Tyr Arg 610 615 620 Tyr Phe Asp Lys
Val Asp Asp Asp Ser Val Ile Tyr Glu Phe Gly His 625 630 635 640 Gly
Leu Ser Tyr Thr Thr Phe Glu Tyr Ser Asn Ile Arg Val Val Lys 645 650
655 Ser Asn Val Ser Glu Tyr Arg Pro Thr Thr Gly Thr Thr Ile Gln Ala
660 665 670 Pro Thr Phe Gly Asn Phe Ser Thr Asp Leu Glu Asp Tyr Leu
Phe Pro 675 680 685 Lys Asp Glu Phe Pro Tyr Ile Pro Gln Tyr Ile Tyr
Pro Tyr Leu Asn 690 695 700 Thr Thr Asp Pro Arg Arg Ala Ser Gly Asp
Pro His Tyr Gly Gln Thr 705 710 715 720 Ala Glu Glu Phe Leu Pro Pro
His Ala Thr Asp Asp Asp Pro Gln Pro 725 730 735 Leu Leu Arg Ser Ser
Gly Gly Asn Ser Pro Gly Gly Asn Arg Gln Leu 740 745 750 Tyr Asp Ile
Val Tyr Thr Ile Thr Ala Asp Ile Thr Asn Thr Gly Ser 755 760 765 Val
Val Gly Glu Glu Val Pro Gln Leu Tyr Val Ser Leu Gly Gly Pro 770 775
780 Glu Asp Pro Lys Val Gln Leu Arg Asp Phe Asp Arg Met Arg Ile Glu
785 790 795 800 Pro Gly Glu Thr Arg Gln Phe Thr Gly Arg Leu Thr Arg
Arg Asp Leu 805 810 815 Ser Asn Trp Asp Val Thr Val Gln Asp Trp Val
Ile Ser Arg Tyr Pro 820 825 830 Lys Thr Ala Tyr Val Gly Arg Ser Ser
Arg Lys Leu Asp Leu Lys Ile 835 840 845 Glu Leu Pro 850
1241368DNATalaromyces emersonii 124atgcttcgac gggctcttct tctatcctct
tccgccatcc ttgctgtcaa ggcacagcag 60gccggcacgg cgacggcaga gaaccacccg
cccctgacat ggcaggaatg caccgcccct 120gggagctgca ccacccagaa
cggggcggtc gttcttgatg cgaactggcg ttgggtgcac 180gatgtgaacg
gatacaccaa ctgctacacg ggcaatacct gggaccccac gtactgccct
240gacgacgaaa cctgcgccca gaactgtgcg ctggacggcg cggattacga
gggcacctac 300ggcgtgactt cgtcgggcag ctccttgaaa ctcaatttcg
tcaccgggtc gaacgtcgga 360tcccgtctct acctgctgca ggacgactcg
acctatcaga tcttcaagct tctgaaccgc 420gagttcagct ttgacgtcga
tgtctccaat cttccgtgcg gattgaacgg cgctctgtac 480tttgtcgcca
tggacgccga cggcggcgtg tccaagtacc cgaacaacaa ggctggtgcc
540aagtacggaa ccgggtattg cgactcccaa tgcccacggg acctcaagtt
catcgacggc 600gaggccaacg tcgagggctg gcagccgtct tcgaacaacg
ccaacaccgg aattggcgac 660cacggctcct gctgtgcgga gatggatgtc
tgggaagcaa acagcatctc caatgcggtc 720actccgcacc cgtgcgacac
gccaggccag acgatgtgct ctggagatga ctgcggtggc 780acatactcta
acgatcgcta cgcgggaacc tgcgatcctg acggctgtga cttcaaccct
840taccgcatgg gcaacacttc tttctacggg cctggcaaga tcatcgatac
caccaagccc 900ttcactgtcg tgacgcagtt cctcactgat gatggtacgg
atactggaac tctcagcgag 960atcaagcgct tctacatcca gaacagcaac
gtcattccgc agcccaactc ggacatcagt 1020ggcgtgaccg gcaactcgat
cacgacggag ttctgcactg ctcagaagca ggcctttggc 1080gacacggacg
acttctctca gcacggtggc ctggccaaga tgggagcggc catgcagcag
1140ggtatggtcc tggtgatgag tttgtgggac gactacgccg cgcagatgct
gtggttggat 1200tccgactacc cgacggatgc ggaccccacg acccctggta
ttgcccgtgg aacgtgtccg 1260acggactcgg gcgtcccatc ggatgtcgag
tcgcagagcc ccaactccta cgtgacctac 1320tcgaacatta agtttggtcc
gatcaactcg accttcaccg cttcgtga 1368125455PRTTalaromyces emersonii
125Met Leu Arg Arg Ala Leu Leu Leu Ser Ser Ser Ala Ile Leu Ala Val
1 5 10 15 Lys Ala Gln Gln Ala Gly Thr Ala Thr Ala Glu Asn His Pro
Pro Leu 20 25 30 Thr Trp Gln Glu Cys Thr Ala Pro Gly Ser Cys Thr
Thr Gln Asn Gly 35 40 45 Ala Val Val Leu Asp Ala Asn Trp Arg Trp
Val His Asp Val Asn Gly 50 55 60 Tyr Thr Asn Cys Tyr Thr Gly Asn
Thr Trp Asp Pro Thr Tyr Cys Pro 65 70 75 80 Asp Asp Glu Thr Cys Ala
Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr 85 90 95 Glu Gly Thr Tyr
Gly Val Thr Ser Ser Gly Ser Ser Leu Lys Leu Asn 100 105 110 Phe Val
Thr Gly Ser Asn Val Gly Ser Arg Leu Tyr Leu Leu Gln Asp 115 120 125
Asp Ser Thr Tyr Gln Ile Phe Lys Leu Leu Asn Arg Glu Phe Ser Phe 130
135 140 Asp Val Asp Val Ser Asn Leu Pro Cys Gly Leu Asn Gly Ala Leu
Tyr 145 150 155 160 Phe Val Ala Met Asp Ala Asp Gly Gly Val Ser Lys
Tyr Pro Asn Asn 165 170 175 Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr
Cys Asp Ser Gln Cys Pro 180 185 190 Arg Asp Leu Lys Phe Ile Asp Gly
Glu Ala Asn Val Glu Gly Trp Gln 195 200 205 Pro Ser Ser Asn Asn Ala
Asn Thr Gly Ile Gly Asp His Gly Ser Cys 210 215 220 Cys Ala Glu Met
Asp Val Trp Glu Ala Asn Ser Ile Ser Asn Ala Val 225 230 235 240 Thr
Pro His Pro Cys Asp Thr Pro Gly Gln Thr Met Cys Ser Gly Asp 245 250
255 Asp Cys Gly Gly Thr Tyr Ser Asn Asp Arg Tyr Ala Gly Thr Cys Asp
260 265 270 Pro Asp Gly Cys Asp Phe Asn Pro Tyr Arg Met Gly Asn Thr
Ser Phe 275 280 285 Tyr Gly Pro Gly Lys Ile Ile Asp Thr Thr Lys Pro
Phe Thr Val Val 290 295 300 Thr Gln Phe Leu Thr Asp Asp Gly Thr Asp
Thr Gly Thr Leu Ser Glu 305 310 315 320 Ile Lys Arg Phe Tyr Ile Gln
Asn Ser Asn Val Ile Pro Gln Pro Asn 325 330 335 Ser Asp Ile Ser Gly
Val Thr Gly Asn Ser Ile Thr Thr Glu Phe Cys 340 345 350 Thr Ala Gln
Lys Gln Ala Phe Gly Asp Thr Asp Asp Phe Ser Gln His 355 360 365 Gly
Gly Leu Ala Lys Met Gly Ala Ala Met Gln Gln Gly Met Val Leu 370 375
380 Val Met Ser Leu Trp Asp Asp Tyr Ala Ala Gln Met Leu Trp Leu Asp
385 390 395 400 Ser Asp Tyr Pro Thr Asp Ala Asp Pro Thr Thr Pro Gly
Ile Ala Arg 405 410 415 Gly Thr Cys Pro Thr Asp Ser Gly Val Pro Ser
Asp Val Glu Ser Gln 420 425 430 Ser Pro Asn Ser Tyr Val Thr Tyr
Ser Asn Ile Lys Phe Gly Pro Ile 435 440 445 Asn Ser Thr Phe Thr Ala
Ser 450 455 126437PRTTalaromyces emersonii 126Gln Gln Ala Gly Thr
Ala Thr Ala Glu Asn His Pro Pro Leu Thr Trp 1 5 10 15 Gln Glu Cys
Thr Ala Pro Gly Ser Cys Thr Thr Gln Asn Gly Ala Val 20 25 30 Val
Leu Asp Ala Asn Trp Arg Trp Val His Asp Val Asn Gly Tyr Thr 35 40
45 Asn Cys Tyr Thr Gly Asn Thr Trp Asp Pro Thr Tyr Cys Pro Asp Asp
50 55 60 Glu Thr Cys Ala Gln Asn Cys Ala Leu Asp Gly Ala Asp Tyr
Glu Gly 65 70 75 80 Thr Tyr Gly Val Thr Ser Ser Gly Ser Ser Leu Lys
Leu Asn Phe Val 85 90 95 Thr Gly Ser Asn Val Gly Ser Arg Leu Tyr
Leu Leu Gln Asp Asp Ser 100 105 110 Thr Tyr Gln Ile Phe Lys Leu Leu
Asn Arg Glu Phe Ser Phe Asp Val 115 120 125 Asp Val Ser Asn Leu Pro
Cys Gly Leu Asn Gly Ala Leu Tyr Phe Val 130 135 140 Ala Met Asp Ala
Asp Gly Gly Val Ser Lys Tyr Pro Asn Asn Lys Ala 145 150 155 160 Gly
Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys Pro Arg Asp 165 170
175 Leu Lys Phe Ile Asp Gly Glu Ala Asn Val Glu Gly Trp Gln Pro Ser
180 185 190 Ser Asn Asn Ala Asn Thr Gly Ile Gly Asp His Gly Ser Cys
Cys Ala 195 200 205 Glu Met Asp Val Trp Glu Ala Asn Ser Ile Ser Asn
Ala Val Thr Pro 210 215 220 His Pro Cys Asp Thr Pro Gly Gln Thr Met
Cys Ser Gly Asp Asp Cys 225 230 235 240 Gly Gly Thr Tyr Ser Asn Asp
Arg Tyr Ala Gly Thr Cys Asp Pro Asp 245 250 255 Gly Cys Asp Phe Asn
Pro Tyr Arg Met Gly Asn Thr Ser Phe Tyr Gly 260 265 270 Pro Gly Lys
Ile Ile Asp Thr Thr Lys Pro Phe Thr Val Val Thr Gln 275 280 285 Phe
Leu Thr Asp Asp Gly Thr Asp Thr Gly Thr Leu Ser Glu Ile Lys 290 295
300 Arg Phe Tyr Ile Gln Asn Ser Asn Val Ile Pro Gln Pro Asn Ser Asp
305 310 315 320 Ile Ser Gly Val Thr Gly Asn Ser Ile Thr Thr Glu Phe
Cys Thr Ala 325 330 335 Gln Lys Gln Ala Phe Gly Asp Thr Asp Asp Phe
Ser Gln His Gly Gly 340 345 350 Leu Ala Lys Met Gly Ala Ala Met Gln
Gln Gly Met Val Leu Val Met 355 360 365 Ser Leu Trp Asp Asp Tyr Ala
Ala Gln Met Leu Trp Leu Asp Ser Asp 370 375 380 Tyr Pro Thr Asp Ala
Asp Pro Thr Thr Pro Gly Ile Ala Arg Gly Thr 385 390 395 400 Cys Pro
Thr Asp Ser Gly Val Pro Ser Asp Val Glu Ser Gln Ser Pro 405 410 415
Asn Ser Tyr Val Thr Tyr Ser Asn Ile Lys Phe Gly Pro Ile Asn Ser 420
425 430 Thr Phe Thr Ala Ser 435 1271581DNAMyceliophthora
thermophila 127atgtacgcca agttcgcgac cctcgccgcc cttgtggctg
gcgccgctgc tcagaacgcc 60tgcactctga ccgctgagaa ccacccctcg ctgacgtggt
ccaagtgcac gtctggcggc 120agctgcacca gcgtccaggg ttccatcacc
atcgacgcca actggcggtg gactcaccgg 180accgatagcg ccaccaactg
ctacgagggc aacaagtggg atacttcgta ctgcagcgat 240ggtccttctt
gcgcctccaa gtgctgcatc gacggcgctg actactcgag cacctatggc
300atcaccacga gcggtaactc cctgaacctc aagttcgtca ccaagggcca
gtactcgacc 360aacatcggct cgcgtaccta cctgatggag agcgacacca
agtaccagat gttccagctc 420ctcggcaacg agttcacctt cgatgtcgac
gtctccaacc tcggctgcgg cctcaatggc 480gccctctact tcgtgtccat
ggatgccgat ggtggcatgt ccaagtactc gggcaacaag 540gcaggtgcca
agtacggtac cggctactgt gattctcagt gcccccgcga cctcaagttc
600atcaacggcg aggccaacgt agagaactgg cagagctcga ccaacgatgc
caacgccggc 660acgggcaagt acggcagctg ctgctccgag atggacgtct
gggaggccaa caacatggcc 720gccgccttca ctccccaccc ttgcaccgtg
atcggccagt cgcgctgcga gggcgactcg 780tgcggcggta cctacagcac
cgaccgctat gccggcatct gcgaccccga cggatgcgac 840ttcaactcgt
accgccaggg caacaagacc ttctacggca agggcatgac ggtcgacacg
900accaagaaga tcacggtcgt cacccagttc ctcaagaact cggccggcga
gctctccgag 960atcaagcggt tctacgtcca gaacggcaag gtcatcccca
actccgagtc caccatcccg 1020ggcgtcgagg gcaactccat cacccaggac
tggtgcgacc gccagaaggc cgccttcggc 1080gacgtgaccg acttccagga
caagggcggc atggtccaga tgggcaaggc cctcgcgggg 1140cccatggtcc
tcgtcatgtc catctgggac gaccacgccg tcaacatgct ctggctcgac
1200tccacctggc ccatcgacgg cgccggcaag ccgggcgccg agcgcggtgc
ctgccccacc 1260acctcgggcg tccccgctga ggtcgaggcc gaggccccca
actccaacgt catcttctcc 1320aacatccgct tcggccccat cggctccacc
gtctccggcc tgcccgacgg cggcagcggc 1380aaccccaacc cgcccgtcag
ctcgtccacc ccggtcccct cctcgtccac cacatcctcc 1440ggttcctccg
gcccgactgg cggcacgggt gtcgctaagc actatgagca atgcggagga
1500atcgggttca ctggccctac ccagtgcgag agcccctaca cttgcaccaa
gctgaatgac 1560tggtactcgc agtgcctgta a 1581128526PRTMyceliophthora
thermophila 128Met Tyr Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala
Gly Ala Ala 1 5 10 15 Ala Gln Asn Ala Cys Thr Leu Thr Ala Glu Asn
His Pro Ser Leu Thr 20 25 30 Tyr Ser Lys Cys Thr Ser Gly Gly Ser
Cys Thr Ser Val Gln Gly Ser 35 40 45 Ile Thr Ile Asp Ala Asn Trp
Arg Trp Thr His Arg Thr Asp Ser Ala 50 55 60 Thr Asn Cys Tyr Glu
Gly Asn Lys Trp Asp Thr Ser Trp Cys Ser Asp 65 70 75 80 Gly Pro Ser
Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala Asp Tyr Ser 85 90 95 Ser
Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ser Leu Asn Leu Lys Phe 100 105
110 Val Thr Lys Gly Gln Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu
115 120 125 Met Glu Ser Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly
Asn Glu 130 135 140 Phe Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys
Gly Leu Asn Gly 145 150 155 160 Ala Leu Tyr Phe Val Ser Met Asp Ala
Asp Gly Gly Met Ser Lys Tyr 165 170 175 Ser Gly Asn Lys Ala Gly Ala
Lys Tyr Gly Thr Gly Tyr Cys Asp Ser 180 185 190 Gln Cys Pro Arg Asp
Leu Lys Phe Ile Asn Gly Glu Ala Asn Val Glu 195 200 205 Asn Trp Gln
Ser Ser Thr Asn Asp Ala Asn Ala Gly Thr Gly Lys Tyr 210 215 220 Gly
Ser Cys Cys Ser Glu Met Asp Val Trp Glu Ala Asn Asn Met Ala 225 230
235 240 Ala Ala Phe Thr Pro His Pro Cys Thr Val Ile Gly Gln Ser Arg
Cys 245 250 255 Glu Gly Asp Ser Cys Gly Gly Thr Tyr Ser Thr Asp Arg
Tyr Ala Gly 260 265 270 Ile Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser
Tyr Arg Gln Gly Asn 275 280 285 Lys Thr Phe Tyr Gly Lys Gly Met Thr
Val Asp Thr Thr Lys Lys Ile 290 295 300 Thr Val Val Thr Gln Phe Leu
Lys Asn Ser Ala Gly Glu Leu Ser Glu 305 310 315 320 Ile Lys Arg Phe
Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser Glu 325 330 335 Ser Thr
Ile Pro Gly Val Glu Gly Asn Ser Ile Thr Gln Asp Trp Cys 340 345 350
Asp Arg Gln Lys Ala Ala Phe Gly Asp Val Thr Asp Phe Gln Asp Lys 355
360 365 Gly Gly Met Val Gln Met Gly Lys Ala Leu Ala Gly Pro Met Val
Leu 370 375 380 Val Met Ser Ile Trp Asp Asp His Ala Val Asn Met Leu
Trp Leu Asp 385 390 395 400 Ser Thr Trp Pro Ile Asp Gly Ala Gly Lys
Pro Gly Ala Glu Arg Gly 405 410 415 Ala Cys Pro Thr Thr Ser Gly Val
Pro Ala Glu Val Glu Ala Glu Ala 420 425 430 Pro Asn Ser Asn Val Ile
Phe Ser Asn Ile Arg Phe Gly Pro Ile Gly 435 440 445 Ser Thr Val Ser
Gly Leu Pro Asp Gly Gly Ser Gly Asn Pro Asn Pro 450 455 460 Pro Val
Ser Ser Ser Thr Pro Val Pro Ser Ser Ser Thr Thr Ser Ser 465 470 475
480 Gly Ser Ser Gly Pro Thr Gly Gly Thr Gly Val Ala Lys His Tyr Glu
485 490 495 Gln Cys Gly Gly Ile Gly Phe Thr Gly Pro Thr Gln Cys Glu
Ser Pro 500 505 510 Tyr Thr Cys Thr Lys Leu Asn Asp Trp Tyr Ser Gln
Cys Leu 515 520 525 129509PRTMyceliophthora thermophila 129Gln Asn
Ala Cys Thr Leu Thr Ala Glu Asn His Pro Ser Leu Thr Tyr 1 5 10 15
Ser Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Gln Gly Ser Ile 20
25 30 Thr Ile Asp Ala Asn Trp Arg Trp Thr His Arg Thr Asp Ser Ala
Thr 35 40 45 Asn Cys Tyr Glu Gly Asn Lys Trp Asp Thr Ser Trp Cys
Ser Asp Gly 50 55 60 Pro Ser Cys Ala Ser Lys Cys Cys Ile Asp Gly
Ala Asp Tyr Ser Ser 65 70 75 80 Thr Tyr Gly Ile Thr Thr Ser Gly Asn
Ser Leu Asn Leu Lys Phe Val 85 90 95 Thr Lys Gly Gln Tyr Ser Thr
Asn Ile Gly Ser Arg Thr Tyr Leu Met 100 105 110 Glu Ser Asp Thr Lys
Tyr Gln Met Phe Gln Leu Leu Gly Asn Glu Phe 115 120 125 Thr Phe Asp
Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly Ala 130 135 140 Leu
Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr Ser 145 150
155 160 Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser
Gln 165 170 175 Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn
Val Glu Asn 180 185 190 Trp Gln Ser Ser Thr Asn Asp Ala Asn Ala Gly
Thr Gly Lys Tyr Gly 195 200 205 Ser Cys Cys Ser Glu Met Asp Val Trp
Glu Ala Asn Asn Met Ala Ala 210 215 220 Ala Phe Thr Pro His Pro Cys
Thr Val Ile Gly Gln Ser Arg Cys Glu 225 230 235 240 Gly Asp Ser Cys
Gly Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly Ile 245 250 255 Cys Asp
Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn Lys 260 265 270
Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile Thr 275
280 285 Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu Ser Glu
Ile 290 295 300 Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn
Ser Glu Ser 305 310 315 320 Thr Ile Pro Gly Val Glu Gly Asn Ser Ile
Thr Gln Asp Trp Cys Asp 325 330 335 Arg Gln Lys Ala Ala Phe Gly Asp
Val Thr Asp Phe Gln Asp Lys Gly 340 345 350 Gly Met Val Gln Met Gly
Lys Ala Leu Ala Gly Pro Met Val Leu Val 355 360 365 Met Ser Ile Trp
Asp Asp His Ala Val Asn Met Leu Trp Leu Asp Ser 370 375 380 Thr Trp
Pro Ile Asp Gly Ala Gly Lys Pro Gly Ala Glu Arg Gly Ala 385 390 395
400 Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu Ala Pro
405 410 415 Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile
Gly Ser 420 425 430 Thr Val Ser Gly Leu Pro Asp Gly Gly Ser Gly Asn
Pro Asn Pro Pro 435 440 445 Val Ser Ser Ser Thr Pro Val Pro Ser Ser
Ser Thr Thr Ser Ser Gly 450 455 460 Ser Ser Gly Pro Thr Gly Gly Thr
Gly Val Ala Lys His Tyr Glu Gln 465 470 475 480 Cys Gly Gly Ile Gly
Phe Thr Gly Pro Thr Gln Cys Glu Ser Pro Tyr 485 490 495 Thr Cys Thr
Lys Leu Asn Asp Trp Tyr Ser Gln Cys Leu 500 505
1301581DNAArtificial SequenceSynthetic polynucleotide.
130atgtacgcca agttcgcgac cctcgccgcc cttgtggctg gcgccgctgc
tcagaacgcc 60tgcactctga ccgctgagaa ccacccctcg ctgacgtggt ccaagtgcac
gtctggcggc 120agctgcacca gcgtccaggg ttccatcacc atcgacgcca
actggcggtg gactcaccgg 180accgatagcg ccaccaactg ctacgagggc
aacaagtggg atacttcgtg gtgcagcgat 240ggtccttctt gcgcctccaa
gtgctgcatc gacggcgctg actactcgag cacctatggc 300atcaccacga
gcggtaactc cctgaacctc aagttcgtca ccaagggcca gtactcgacc
360aacatcggct cgcgtaccta cctgatggag agcgacacca agtaccagat
gttccagctc 420ctcggcaacg agttcacctt cgatgtcgac gtctccaacc
tcggctgcgg cctcaatggc 480gccctctact tcgtgtccat ggatgccgat
ggtggcatgt ccaagtactc gggcaacaag 540gcaggtgcca agtacggtac
cggctactgt gattctcagt gcccccgcga cctcaagttc 600atcaacggcg
aggccaacgt agagaactgg cagagctcga ccaacgatgc caacgccggc
660acgggcaagt acggcagctg ctgctccgag atggacgtct gggaggccaa
caacatggcc 720gccgccttca ctccccaccc ttgcaccgtg atcggccagt
cgcgctgcga gggcgactcg 780tgcggcggta cctacagcac cgaccgctat
gccggcatct gcgaccccga cggatgcgac 840ttcaactcgt accgccaggg
caacaagacc ttctacggca agggcatgac ggtcgacacg 900accaagaaga
tcacggtcgt cacccagttc ctcaagaact cggccggcga gctctccgag
960atcaagcggt tctacgtcca gaacggcaag gtcatcccca actccgagtc
caccatcccg 1020ggcgtcgagg gcaactccat cacccaggac tggtgcgacc
gccagaaggc cgccttcggc 1080gacgtgaccg acttccagga caagggcggc
atggtccaga tgggcaaggc cctcgcgggg 1140cccatggtcc tcgtcatgtc
catctgggac gaccacgccg tcaacatgct ctggctcgac 1200tccacctggc
ccatcgacgg cgccggcaag ccgggcgccg agcgcggtgc ctgccccacc
1260acctcgggcg tccccgctga ggtcgaggcc gaggccccca actccaacgt
catcttctcc 1320aacatccgct tcggccccat cggctccacc gtctccggcc
tgcccgacgg cggcagcggc 1380aaccccaacc cgcccgtcag ctcgtccacc
ccggtcccct cctcgtccac cacatcctcc 1440ggttcctccg gcccgactgg
cggcacgggt gtcgctaagc actatgagca atgcggagga 1500atcgggttca
ctggccctac ccagtgcgag agcccctaca cttgcaccaa gctgaatgac
1560tggtactcgc agtgcctgta a 1581131526PRTArtificial
SequenceSynthetic polypeptides. 131Met Tyr Ala Lys Phe Ala Thr Leu
Ala Ala Leu Val Ala Gly Ala Ala 1 5 10 15 Ala Gln Asn Ala Cys Thr
Leu Thr Ala Glu Asn His Pro Ser Leu Thr 20 25 30 Trp Ser Lys Cys
Thr Ser Gly Gly Ser Cys Thr Ser Val Gln Gly Ser 35 40 45 Ile Thr
Ile Asp Ala Asn Trp Arg Trp Thr His Arg Thr Asp Ser Ala 50 55 60
Thr Asn Cys Tyr Glu Gly Asn Lys Trp Asp Thr Ser Trp Cys Ser Asp 65
70 75 80 Gly Pro Ser Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala Asp
Tyr Ser 85 90 95 Ser Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ser Leu
Asn Leu Lys Phe 100 105 110 Val Thr Lys Gly Gln Tyr Ser Thr Asn Ile
Gly Ser Arg Thr Tyr Leu 115 120 125 Met Glu Ser Asp Thr Lys Tyr Gln
Met Phe Gln Leu Leu Gly Asn Glu 130 135 140 Phe Thr Phe Asp Val Asp
Val Ser Asn Leu Gly Cys Gly Leu Asn Gly 145 150 155 160 Ala Leu Tyr
Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr 165 170 175 Ser
Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser 180 185
190 Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Val Glu
195 200 205 Asn Trp Gln Ser Ser Thr Asn Asp Ala Asn Ala Gly Thr Gly
Lys Tyr 210 215 220 Gly Ser Cys Cys Ser Glu Met Asp Val Trp Glu Ala
Asn Asn Met Ala 225 230 235 240 Ala Ala Phe Thr Pro His Pro Cys Thr
Val Ile Gly Gln Ser Arg Cys 245 250 255 Glu Gly Asp Ser Cys Gly Gly
Thr Tyr Ser Thr Asp Arg Tyr Ala Gly 260 265 270 Ile Cys Asp Pro Asp
Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn 275 280
285 Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile
290 295 300 Thr Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu
Ser Glu 305 310 315 320 Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys Val
Ile Pro Asn Ser Glu 325 330 335 Ser Thr Ile Pro Gly Val Glu Gly Asn
Ser Ile Thr Gln Asp Trp Cys 340 345 350 Asp Arg Gln Lys Ala Ala Phe
Gly Asp Val Thr Asp Phe Gln Asp Lys 355 360 365 Gly Gly Met Val Gln
Met Gly Lys Ala Leu Ala Gly Pro Met Val Leu 370 375 380 Val Met Ser
Ile Trp Asp Asp His Ala Val Asn Met Leu Trp Leu Asp 385 390 395 400
Ser Thr Trp Pro Ile Asp Gly Ala Gly Lys Pro Gly Ala Glu Arg Gly 405
410 415 Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu
Ala 420 425 430 Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly
Pro Ile Gly 435 440 445 Ser Thr Val Ser Gly Leu Pro Asp Gly Gly Ser
Gly Asn Pro Asn Pro 450 455 460 Pro Val Ser Ser Ser Thr Pro Val Pro
Ser Ser Ser Thr Thr Ser Ser 465 470 475 480 Gly Ser Ser Gly Pro Thr
Gly Gly Thr Gly Val Ala Lys His Tyr Glu 485 490 495 Gln Cys Gly Gly
Ile Gly Phe Thr Gly Pro Thr Gln Cys Glu Ser Pro 500 505 510 Tyr Thr
Cys Thr Lys Leu Asn Asp Trp Tyr Ser Gln Cys Leu 515 520 525
132509PRTArtificial SequenceSynthetic polypeptides. 132Gln Asn Ala
Cys Thr Leu Thr Ala Glu Asn His Pro Ser Leu Thr Trp 1 5 10 15 Ser
Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Gln Gly Ser Ile 20 25
30 Thr Ile Asp Ala Asn Trp Arg Trp Thr His Arg Thr Asp Ser Ala Thr
35 40 45 Asn Cys Tyr Glu Gly Asn Lys Trp Asp Thr Ser Trp Cys Ser
Asp Gly 50 55 60 Pro Ser Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala
Asp Tyr Ser Ser 65 70 75 80 Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ser
Leu Asn Leu Lys Phe Val 85 90 95 Thr Lys Gly Gln Tyr Ser Thr Asn
Ile Gly Ser Arg Thr Tyr Leu Met 100 105 110 Glu Ser Asp Thr Lys Tyr
Gln Met Phe Gln Leu Leu Gly Asn Glu Phe 115 120 125 Thr Phe Asp Val
Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly Ala 130 135 140 Leu Tyr
Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr Ser 145 150 155
160 Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln
165 170 175 Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Val
Glu Asn 180 185 190 Trp Gln Ser Ser Thr Asn Asp Ala Asn Ala Gly Thr
Gly Lys Tyr Gly 195 200 205 Ser Cys Cys Ser Glu Met Asp Val Trp Glu
Ala Asn Asn Met Ala Ala 210 215 220 Ala Phe Thr Pro His Pro Cys Thr
Val Ile Gly Gln Ser Arg Cys Glu 225 230 235 240 Gly Asp Ser Cys Gly
Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly Ile 245 250 255 Cys Asp Pro
Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn Lys 260 265 270 Thr
Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile Thr 275 280
285 Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu Ser Glu Ile
290 295 300 Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser
Glu Ser 305 310 315 320 Thr Ile Pro Gly Val Glu Gly Asn Ser Ile Thr
Gln Asp Trp Cys Asp 325 330 335 Arg Gln Lys Ala Ala Phe Gly Asp Val
Thr Asp Phe Gln Asp Lys Gly 340 345 350 Gly Met Val Gln Met Gly Lys
Ala Leu Ala Gly Pro Met Val Leu Val 355 360 365 Met Ser Ile Trp Asp
Asp His Ala Val Asn Met Leu Trp Leu Asp Ser 370 375 380 Thr Trp Pro
Ile Asp Gly Ala Gly Lys Pro Gly Ala Glu Arg Gly Ala 385 390 395 400
Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu Ala Pro 405
410 415 Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile Gly
Ser 420 425 430 Thr Val Ser Gly Leu Pro Asp Gly Gly Ser Gly Asn Pro
Asn Pro Pro 435 440 445 Val Ser Ser Ser Thr Pro Val Pro Ser Ser Ser
Thr Thr Ser Ser Gly 450 455 460 Ser Ser Gly Pro Thr Gly Gly Thr Gly
Val Ala Lys His Tyr Glu Gln 465 470 475 480 Cys Gly Gly Ile Gly Phe
Thr Gly Pro Thr Gln Cys Glu Ser Pro Tyr 485 490 495 Thr Cys Thr Lys
Leu Asn Asp Trp Tyr Ser Gln Cys Leu 500 505 1331581DNAArtificial
SequenceSynthetic polynucleotide. 133atgtacgcca agttcgcgac
cctcgccgcc cttgtggctg gcgccgctgc tcagaacgcc 60tgcactctga acgctgagaa
ccacccctcg ctgacgtggt ccaagtgcac gtctggcggc 120agctgcacca
gcgtccaggg ttccatcacc atcgacgcca actggcggtg gactcaccgg
180accgatagcg ccaccaactg ctacgagggc aacaagtggg atacttcgta
ctgcagcgat 240ggtccttctt gcgcctccaa gtgctgcatc gacggcgctg
actactcgag cacctatggc 300atcaccacga gcggtaactc cctgaacctc
aagttcgtca ccaagggcca gtactcgacc 360aacatcggct cgcgtaccta
cctgatggag agcgacacca agtaccagat gttccagctc 420ctcggcaacg
agttcacctt cgatgtcgac gtctccaacc tcggctgcgg cctcaatggc
480gccctctact tcgtgtccat ggatgccgat ggtggcatgt ccaagtactc
gggcaacaag 540gcaggtgcca agtacggtac cggctactgt gattctcagt
gcccccgcga cctcaagttc 600atcaacggcg aggccaacgt agagaactgg
cagagctcga ccaacgatgc caacgccggc 660acgggcaagt acggcagctg
ctgctccgag atggacgtct gggaggccaa caacatggcc 720gccgccttca
ctccccaccc ttgcaccgtg atcggccagt cgcgctgcga gggcgactcg
780tgcggcggta cctacagcac cgaccgctat gccggcatct gcgaccccga
cggatgcgac 840ttcaactcgt accgccaggg caacaagacc ttctacggca
agggcatgac ggtcgacacg 900accaagaaga tcacggtcgt cacccagttc
ctcaagaact cggccggcga gctctccgag 960atcaagcggt tctacgtcca
gaacggcaag gtcatcccca actccgagtc caccatcccg 1020ggcgtcgagg
gcaactccat cacccaggag tactgcgacc gccagaaggc cgccttcggc
1080gacgtgaccg acttccagga caagggcggc atggtccaga tgggcaaggc
cctcgcgggg 1140cccatggtcc tcgtcatgtc catctgggac gaccacgccg
acaacatgct ctggctcgac 1200tccacctggc ccatcgacgg cgccggcaag
ccgggcgccg agcgcggtgc ctgccccacc 1260acctcgggcg tccccgctga
ggtcgaggcc gaggccccca actccaacgt catcttctcc 1320aacatccgct
tcggccccat cggctccacc gtctccggcc tgcccgacgg cggcagcggc
1380aaccccaacc cgcccgtcag ctcgtccacc ccggtcccct cctcgtccac
cacatcctcc 1440ggttcctccg gcccgactgg cggcacgggt gtcgctaagc
actatgagca atgcggagga 1500atcgggttca ctggccctac ccagtgcgag
agcccctaca cttgcaccaa gctgaatgac 1560tggtactcgc agtgcctgta a
1581134526PRTArtificial SequenceSynthetic polypeptides. 134Met Tyr
Ala Lys Phe Ala Thr Leu Ala Ala Leu Val Ala Gly Ala Ala 1 5 10 15
Ala Gln Asn Ala Cys Thr Leu Asn Ala Glu Asn His Pro Ser Leu Thr 20
25 30 Trp Ser Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Gln Gly
Ser 35 40 45 Ile Thr Ile Asp Ala Asn Trp Arg Trp Thr His Arg Thr
Asp Ser Ala 50 55 60 Thr Asn Cys Tyr Glu Gly Asn Lys Trp Asp Thr
Ser Tyr Cys Ser Asp 65 70 75 80 Gly Pro Ser Cys Ala Ser Lys Cys Cys
Ile Asp Gly Ala Asp Tyr Ser 85 90 95 Ser Thr Tyr Gly Ile Thr Thr
Ser Gly Asn Ser Leu Asn Leu Lys Phe 100 105 110 Val Thr Lys Gly Gln
Tyr Ser Thr Asn Ile Gly Ser Arg Thr Tyr Leu 115 120 125 Met Glu Ser
Asp Thr Lys Tyr Gln Met Phe Gln Leu Leu Gly Asn Glu 130 135 140 Phe
Thr Phe Asp Val Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly 145 150
155 160 Ala Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys
Tyr 165 170 175 Ser Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr
Cys Asp Ser 180 185 190 Gln Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly
Glu Ala Asn Val Glu 195 200 205 Asn Trp Gln Ser Ser Thr Asn Asp Ala
Asn Ala Gly Thr Gly Lys Tyr 210 215 220 Gly Ser Cys Cys Ser Glu Met
Asp Val Trp Glu Ala Asn Asn Met Ala 225 230 235 240 Ala Ala Phe Thr
Pro His Pro Cys Thr Val Ile Gly Gln Ser Arg Cys 245 250 255 Glu Gly
Asp Ser Cys Gly Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly 260 265 270
Ile Cys Asp Pro Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn 275
280 285 Lys Thr Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys
Ile 290 295 300 Thr Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu
Leu Ser Glu 305 310 315 320 Ile Lys Arg Phe Tyr Val Gln Asn Gly Lys
Val Ile Pro Asn Ser Glu 325 330 335 Ser Thr Ile Pro Gly Val Glu Gly
Asn Ser Ile Thr Gln Glu Tyr Cys 340 345 350 Asp Arg Gln Lys Ala Ala
Phe Gly Asp Val Thr Asp Phe Gln Asp Lys 355 360 365 Gly Gly Met Val
Gln Met Gly Lys Ala Leu Ala Gly Pro Met Val Leu 370 375 380 Val Met
Ser Ile Trp Asp Asp His Ala Asp Asn Met Leu Trp Leu Asp 385 390 395
400 Ser Thr Trp Pro Ile Asp Gly Ala Gly Lys Pro Gly Ala Glu Arg Gly
405 410 415 Ala Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala
Glu Ala 420 425 430 Pro Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe
Gly Pro Ile Gly 435 440 445 Ser Thr Val Ser Gly Leu Pro Asp Gly Gly
Ser Gly Asn Pro Asn Pro 450 455 460 Pro Val Ser Ser Ser Thr Pro Val
Pro Ser Ser Ser Thr Thr Ser Ser 465 470 475 480 Gly Ser Ser Gly Pro
Thr Gly Gly Thr Gly Val Ala Lys His Tyr Glu 485 490 495 Gln Cys Gly
Gly Ile Gly Phe Thr Gly Pro Thr Gln Cys Glu Ser Pro 500 505 510 Tyr
Thr Cys Thr Lys Leu Asn Asp Trp Tyr Ser Gln Cys Leu 515 520 525
135509PRTArtificial SequenceSynthetic polypeptides. 135Gln Asn Ala
Cys Thr Leu Asn Ala Glu Asn His Pro Ser Leu Thr Trp 1 5 10 15 Ser
Lys Cys Thr Ser Gly Gly Ser Cys Thr Ser Val Gln Gly Ser Ile 20 25
30 Thr Ile Asp Ala Asn Trp Arg Trp Thr His Arg Thr Asp Ser Ala Thr
35 40 45 Asn Cys Tyr Glu Gly Asn Lys Trp Asp Thr Ser Tyr Cys Ser
Asp Gly 50 55 60 Pro Ser Cys Ala Ser Lys Cys Cys Ile Asp Gly Ala
Asp Tyr Ser Ser 65 70 75 80 Thr Tyr Gly Ile Thr Thr Ser Gly Asn Ser
Leu Asn Leu Lys Phe Val 85 90 95 Thr Lys Gly Gln Tyr Ser Thr Asn
Ile Gly Ser Arg Thr Tyr Leu Met 100 105 110 Glu Ser Asp Thr Lys Tyr
Gln Met Phe Gln Leu Leu Gly Asn Glu Phe 115 120 125 Thr Phe Asp Val
Asp Val Ser Asn Leu Gly Cys Gly Leu Asn Gly Ala 130 135 140 Leu Tyr
Phe Val Ser Met Asp Ala Asp Gly Gly Met Ser Lys Tyr Ser 145 150 155
160 Gly Asn Lys Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln
165 170 175 Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Glu Ala Asn Val
Glu Asn 180 185 190 Trp Gln Ser Ser Thr Asn Asp Ala Asn Ala Gly Thr
Gly Lys Tyr Gly 195 200 205 Ser Cys Cys Ser Glu Met Asp Val Trp Glu
Ala Asn Asn Met Ala Ala 210 215 220 Ala Phe Thr Pro His Pro Cys Thr
Val Ile Gly Gln Ser Arg Cys Glu 225 230 235 240 Gly Asp Ser Cys Gly
Gly Thr Tyr Ser Thr Asp Arg Tyr Ala Gly Ile 245 250 255 Cys Asp Pro
Asp Gly Cys Asp Phe Asn Ser Tyr Arg Gln Gly Asn Lys 260 265 270 Thr
Phe Tyr Gly Lys Gly Met Thr Val Asp Thr Thr Lys Lys Ile Thr 275 280
285 Val Val Thr Gln Phe Leu Lys Asn Ser Ala Gly Glu Leu Ser Glu Ile
290 295 300 Lys Arg Phe Tyr Val Gln Asn Gly Lys Val Ile Pro Asn Ser
Glu Ser 305 310 315 320 Thr Ile Pro Gly Val Glu Gly Asn Ser Ile Thr
Gln Glu Tyr Cys Asp 325 330 335 Arg Gln Lys Ala Ala Phe Gly Asp Val
Thr Asp Phe Gln Asp Lys Gly 340 345 350 Gly Met Val Gln Met Gly Lys
Ala Leu Ala Gly Pro Met Val Leu Val 355 360 365 Met Ser Ile Trp Asp
Asp His Ala Asp Asn Met Leu Trp Leu Asp Ser 370 375 380 Thr Trp Pro
Ile Asp Gly Ala Gly Lys Pro Gly Ala Glu Arg Gly Ala 385 390 395 400
Cys Pro Thr Thr Ser Gly Val Pro Ala Glu Val Glu Ala Glu Ala Pro 405
410 415 Asn Ser Asn Val Ile Phe Ser Asn Ile Arg Phe Gly Pro Ile Gly
Ser 420 425 430 Thr Val Ser Gly Leu Pro Asp Gly Gly Ser Gly Asn Pro
Asn Pro Pro 435 440 445 Val Ser Ser Ser Thr Pro Val Pro Ser Ser Ser
Thr Thr Ser Ser Gly 450 455 460 Ser Ser Gly Pro Thr Gly Gly Thr Gly
Val Ala Lys His Tyr Glu Gln 465 470 475 480 Cys Gly Gly Ile Gly Phe
Thr Gly Pro Thr Gln Cys Glu Ser Pro Tyr 485 490 495 Thr Cys Thr Lys
Leu Asn Asp Trp Tyr Ser Gln Cys Leu 500 505
1361449DNAMyceliophthora thermophila 136atggccaaga agcttttcat
caccgccgcg cttgcggctg ccgtgttggc ggcccccgtc 60attgaggagc gccagaactg
cggcgctgtg tggactcaat gcggcggtaa cgggtggcaa 120ggtcccacat
gctgcgcctc gggctcgacc tgcgttgcgc agaacgagtg gtactctcag
180tgcctgccca acagccaggt gacgagttcc accactccgt cgtcgacttc
cacctcgcag 240cgcagcacca gcacctccag cagcaccacc aggagcggca
gctcctcctc ctcctccacc 300acgcccccgc ccgtctccag ccccgtgacc
agcattcccg gcggtgcgac ctccacggcg 360agctactctg gcaacccctt
ctcgggcgtc cggctcttcg ccaacgacta ctacaggtcc 420gaggtccaca
atctcgccat tcctagcatg actggtactc tggcggccaa ggcttccgcc
480gtcgccgaag tccctagctt ccagtggctc gaccggaacg tcaccatcga
caccctgatg 540gtccagactc tgtcccaggt ccgggctctc aataaggccg
gtgccaatcc tccctatgct 600gcccaactcg tcgtctacga cctccccgac
cgtgactgtg ccgccgctgc gtccaacggc 660gagttttcga ttgcaaacgg
cggcgccgcc aactacagga gctacatcga cgctatccgc 720aagcacatca
ttgagtactc ggacatccgg atcatcctgg ttatcgagcc cgactcgatg
780gccaacatgg tgaccaacat gaacgtggcc aagtgcagca acgccgcgtc
gacgtaccac 840gagttgaccg tgtacgcgct caagcagctg aacctgccca
acgtcgccat gtatctcgac 900gccggccacg ccggctggct cggctggccc
gccaacatcc agcccgccgc cgagctgttt 960gccggcatct acaatgatgc
cggcaagccg gctgccgtcc gcggcctggc cactaacgtc 1020gccaactaca
acgcctggag catcgcttcg gccccgtcgt acacgtcgcc taaccctaac
1080tacgacgaga agcactacat cgaggccttc agcccgctct tgaactcggc
cggcttcccc 1140gcacgcttca ttgtcgacac tggccgcaac ggcaaacaac
ctaccggcca acaacagtgg 1200ggtgactggt gcaatgtcaa gggcaccggc
tttggcgtgc gcccgacggc caacacgggc 1260cacgagctgg tcgatgcctt
tgtctgggtc aagcccggcg gcgagtccga cggcacaagc 1320gacaccagcg
ccgcccgcta cgactaccac tgcggcctgt ccgatgccct gcagcctgcc
1380cccgaggctg gacagtggtt ccaggcctac ttcgagcagc tgctcaccaa
cgccaacccg 1440cccttctaa 1449137482PRTMyceliophthora thermophila
137Met Ala Lys Lys Leu Phe Ile Thr Ala Ala Leu Ala Ala Ala Val Leu
1 5 10 15 Ala Ala Pro Val Ile
Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr 20 25 30 Gln Cys Gly
Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser Gly 35 40 45 Ser
Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys Leu Pro Asn 50 55
60 Ser Gln Val Thr Ser Ser Thr Thr Pro Ser Ser Thr Ser Thr Ser Gln
65 70 75 80 Arg Ser Thr Ser Thr Ser Ser Ser Thr Thr Arg Ser Gly Ser
Ser Ser 85 90 95 Ser Ser Ser Thr Thr Pro Pro Pro Val Ser Ser Pro
Val Thr Ser Ile 100 105 110 Pro Gly Gly Ala Thr Ser Thr Ala Ser Tyr
Ser Gly Asn Pro Phe Ser 115 120 125 Gly Val Arg Leu Phe Ala Asn Asp
Tyr Tyr Arg Ser Glu Val His Asn 130 135 140 Leu Ala Ile Pro Ser Met
Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala 145 150 155 160 Val Ala Glu
Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile 165 170 175 Asp
Thr Leu Met Val Gln Thr Leu Ser Gln Val Arg Ala Leu Asn Lys 180 185
190 Ala Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu Val Val Tyr Asp Leu
195 200 205 Pro Asp Arg Asp Cys Ala Ala Ala Ala Ser Asn Gly Glu Phe
Ser Ile 210 215 220 Ala Asn Gly Gly Ala Ala Asn Tyr Arg Ser Tyr Ile
Asp Ala Ile Arg 225 230 235 240 Lys His Ile Ile Glu Tyr Ser Asp Ile
Arg Ile Ile Leu Val Ile Glu 245 250 255 Pro Asp Ser Met Ala Asn Met
Val Thr Asn Met Asn Val Ala Lys Cys 260 265 270 Ser Asn Ala Ala Ser
Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys 275 280 285 Gln Leu Asn
Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His Ala 290 295 300 Gly
Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala Glu Leu Phe 305 310
315 320 Ala Gly Ile Tyr Asn Asp Ala Gly Lys Pro Ala Ala Val Arg Gly
Leu 325 330 335 Ala Thr Asn Val Ala Asn Tyr Asn Ala Trp Ser Ile Ala
Ser Ala Pro 340 345 350 Ser Tyr Thr Ser Pro Asn Pro Asn Tyr Asp Glu
Lys His Tyr Ile Glu 355 360 365 Ala Phe Ser Pro Leu Leu Asn Ser Ala
Gly Phe Pro Ala Arg Phe Ile 370 375 380 Val Asp Thr Gly Arg Asn Gly
Lys Gln Pro Thr Gly Gln Gln Gln Trp 385 390 395 400 Gly Asp Trp Cys
Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr 405 410 415 Ala Asn
Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val Lys Pro 420 425 430
Gly Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala Ala Arg Tyr Asp 435
440 445 Tyr His Cys Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu Ala
Gly 450 455 460 Gln Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr Asn
Ala Asn Pro 465 470 475 480 Pro Phe 138465PRTMyceliophthora
thermophila 138Ala Pro Val Ile Glu Glu Arg Gln Asn Cys Gly Ala Val
Trp Thr Gln 1 5 10 15 Cys Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys
Cys Ala Ser Gly Ser 20 25 30 Thr Cys Val Ala Gln Asn Glu Trp Tyr
Ser Gln Cys Leu Pro Asn Ser 35 40 45 Gln Val Thr Ser Ser Thr Thr
Pro Ser Ser Thr Ser Thr Ser Gln Arg 50 55 60 Ser Thr Ser Thr Ser
Ser Ser Thr Thr Arg Ser Gly Ser Ser Ser Ser 65 70 75 80 Ser Ser Thr
Thr Pro Pro Pro Val Ser Ser Pro Val Thr Ser Ile Pro 85 90 95 Gly
Gly Ala Thr Ser Thr Ala Ser Tyr Ser Gly Asn Pro Phe Ser Gly 100 105
110 Val Arg Leu Phe Ala Asn Asp Tyr Tyr Arg Ser Glu Val His Asn Leu
115 120 125 Ala Ile Pro Ser Met Thr Gly Thr Leu Ala Ala Lys Ala Ser
Ala Val 130 135 140 Ala Glu Val Pro Ser Phe Gln Trp Leu Asp Arg Asn
Val Thr Ile Asp 145 150 155 160 Thr Leu Met Val Gln Thr Leu Ser Gln
Val Arg Ala Leu Asn Lys Ala 165 170 175 Gly Ala Asn Pro Pro Tyr Ala
Ala Gln Leu Val Val Tyr Asp Leu Pro 180 185 190 Asp Arg Asp Cys Ala
Ala Ala Ala Ser Asn Gly Glu Phe Ser Ile Ala 195 200 205 Asn Gly Gly
Ala Ala Asn Tyr Arg Ser Tyr Ile Asp Ala Ile Arg Lys 210 215 220 His
Ile Ile Glu Tyr Ser Asp Ile Arg Ile Ile Leu Val Ile Glu Pro 225 230
235 240 Asp Ser Met Ala Asn Met Val Thr Asn Met Asn Val Ala Lys Cys
Ser 245 250 255 Asn Ala Ala Ser Thr Tyr His Glu Leu Thr Val Tyr Ala
Leu Lys Gln 260 265 270 Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp
Ala Gly His Ala Gly 275 280 285 Trp Leu Gly Trp Pro Ala Asn Ile Gln
Pro Ala Ala Glu Leu Phe Ala 290 295 300 Gly Ile Tyr Asn Asp Ala Gly
Lys Pro Ala Ala Val Arg Gly Leu Ala 305 310 315 320 Thr Asn Val Ala
Asn Tyr Asn Ala Trp Ser Ile Ala Ser Ala Pro Ser 325 330 335 Tyr Thr
Ser Pro Asn Pro Asn Tyr Asp Glu Lys His Tyr Ile Glu Ala 340 345 350
Phe Ser Pro Leu Leu Asn Ser Ala Gly Phe Pro Ala Arg Phe Ile Val 355
360 365 Asp Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp
Gly 370 375 380 Asp Trp Cys Asn Val Lys Gly Thr Gly Phe Gly Val Arg
Pro Thr Ala 385 390 395 400 Asn Thr Gly His Glu Leu Val Asp Ala Phe
Val Trp Val Lys Pro Gly 405 410 415 Gly Glu Ser Asp Gly Thr Ser Asp
Thr Ser Ala Ala Arg Tyr Asp Tyr 420 425 430 His Cys Gly Leu Ser Asp
Ala Leu Gln Pro Ala Pro Glu Ala Gly Gln 435 440 445 Trp Phe Gln Ala
Tyr Phe Glu Gln Leu Leu Thr Asn Ala Asn Pro Pro 450 455 460 Phe 465
1391449DNAArtificial SequenceSynthetic polynucleotide.
139atggccaaga agcttttcat caccgccgcg cttgcggctg ccgtgttggc
ggcccccgtc 60attgaggagc gccagaactg cggcgctgtg tggactcaat gcggcggtaa
cgggtggcaa 120ggtcccacat gctgcgcctc gggctcgacc tgcgttgcgc
agaacgagtg gtactctcag 180tgcctgccca acagccaggt gacgagttcc
accactccgt cgtcgacttc cacctcgcag 240cgcagcacca gcacctccag
cagcaccacc aggagcggca gctcctcctc ctcctccacc 300acgcccaccc
ccgtctccag ccccgtgacc agcattcccg gcggtgcgac ctccacggcg
360agctactctg gcaacccctt ctcgggcgtc cggctcttcg ccaacgacta
ctacaggtcc 420gaggtccaca atctcgccat tcctagcatg actggtactc
tggcggccaa ggcttccgcc 480gtcgccgaag tccctagctt ccagtggctc
gaccggaacg tcaccatcga caccctgatg 540gtcccgactc tgtcccgcgt
ccgggctctc aataaggccg gtgccaatcc tccctatgct 600gcccaactcg
tcgtctacga cctccccgac cgtgactgtg ccgccgctgc gtccaacggc
660gagttttcga ttgcaaacgg cggcgccgcc aactacagga gctacatcga
cgctatccgc 720aagcacatca ttgagtactc ggacatccgg atcatcctgg
ttatcgagcc cgactcgatg 780gccaacatgg tgaccaacat gaacgtggcc
aagtgcagca acgccgcgtc gacgtaccac 840gagttgaccg tgtacgcgct
caagcagctg aacctgccca acgtcgccat gtatctcgac 900gccggccacg
ccggctggct cggctggccc gccaacatcc agcccgccgc cgagctgttt
960gccggcatct acaatgatgc cggcaagccg gctgccgtcc gcggcctggc
cactaacgtc 1020gccaactaca acgcctggag catcgcttcg gccccgtcgt
acacgtcgcc taaccctaac 1080tacgacgaga agcactacat cgaggccttc
agcccgctct tgaactcggc cggcttcccc 1140gcacgcttca ttgtcgacac
tggccgcaac ggcaaacaac ctaccggcca acaacagtgg 1200ggtgactggt
gcaatgtcaa gggcaccggc tttggcgtgc gcccgacggc caacacgggc
1260cacgagctgg tcgatgcctt tgtctgggtc aagcccggcg gcgagtccga
cggcacaagc 1320gacaccagcg ccgcccgcta cgactaccac tgcggcctgt
ccgatgccct gcagcctgcc 1380cccgaggctg gacagtggtt ccaggcctac
ttcgagcagc tgctcaccaa cgccaacccg 1440cccttctaa
1449
<210> 140 <211> 482 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 140 Met Ala
Lys Lys Leu Phe Ile Thr Ala Ala Leu Ala Ala Ala Val Leu 1 5 10 15
Ala Ala Pro Val Ile Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr 20
25 30 Gln Cys Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser
Gly 35 40 45 Ser Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys
Leu Pro Asn 50 55 60 Ser Gln Val Thr Ser Ser Thr Thr Pro Ser Ser
Thr Ser Thr Ser Gln 65 70 75 80 Arg Ser Thr Ser Thr Ser Ser Ser Thr
Thr Arg Ser Gly Ser Ser Ser 85 90 95 Ser Ser Ser Thr Thr Pro Thr
Pro Val Ser Ser Pro Val Thr Ser Ile 100 105 110 Pro Gly Gly Ala Thr
Ser Thr Ala Ser Tyr Ser Gly Asn Pro Phe Ser 115 120 125 Gly Val Arg
Leu Phe Ala Asn Asp Tyr Tyr Arg Ser Glu Val His Asn 130 135 140 Leu
Ala Ile Pro Ser Met Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala 145 150
155 160 Val Ala Glu Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr
Ile 165 170 175 Asp Thr Leu Met Val Pro Thr Leu Ser Arg Val Arg Ala
Leu Asn Lys 180 185 190 Ala Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu
Val Val Tyr Asp Leu 195 200 205 Pro Asp Arg Asp Cys Ala Ala Ala Ala
Ser Asn Gly Glu Phe Ser Ile 210 215 220 Ala Asn Gly Gly Ala Ala Asn
Tyr Arg Ser Tyr Ile Asp Ala Ile Arg 225 230 235 240 Lys His Ile Ile
Glu Tyr Ser Asp Ile Arg Ile Ile Leu Val Ile Glu 245 250 255 Pro Asp
Ser Met Ala Asn Met Val Thr Asn Met Asn Val Ala Lys Cys 260 265 270
Ser Asn Ala Ala Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys 275
280 285 Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His
Ala 290 295 300 Gly Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala
Glu Leu Phe 305 310 315 320 Ala Gly Ile Tyr Asn Asp Ala Gly Lys Pro
Ala Ala Val Arg Gly Leu 325 330 335 Ala Thr Asn Val Ala Asn Tyr Asn
Ala Trp Ser Ile Ala Ser Ala Pro 340 345 350 Ser Tyr Thr Ser Pro Asn
Pro Asn Tyr Asp Glu Lys His Tyr Ile Glu 355 360 365 Ala Phe Ser Pro
Leu Leu Asn Ser Ala Gly Phe Pro Ala Arg Phe Ile 370 375 380 Val Asp
Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp 385 390 395
400 Gly Asp Trp Cys Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr
405 410 415 Ala Asn Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val
Lys Pro 420 425 430 Gly Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala
Ala Arg Tyr Asp 435 440 445 Tyr His Cys Gly Leu Ser Asp Ala Leu Gln
Pro Ala Pro Glu Ala Gly 450 455 460 Gln Trp Phe Gln Ala Tyr Phe Glu
Gln Leu Leu Thr Asn Ala Asn Pro 465 470 475 480 Pro Phe
<210> 141 <211> 465 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 141 Ala Pro Val
Ile Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr Gln 1 5 10 15 Cys
Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser Gly Ser 20 25
30 Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys Leu Pro Asn Ser
35 40 45 Gln Val Thr Ser Ser Thr Thr Pro Ser Ser Thr Ser Thr Ser
Gln Arg 50 55 60 Ser Thr Ser Thr Ser Ser Ser Thr Thr Arg Ser Gly
Ser Ser Ser Ser 65 70 75 80 Ser Ser Thr Thr Pro Thr Pro Val Ser Ser
Pro Val Thr Ser Ile Pro 85 90 95 Gly Gly Ala Thr Ser Thr Ala Ser
Tyr Ser Gly Asn Pro Phe Ser Gly 100 105 110 Val Arg Leu Phe Ala Asn
Asp Tyr Tyr Arg Ser Glu Val His Asn Leu 115 120 125 Ala Ile Pro Ser
Met Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala Val 130 135 140 Ala Glu
Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile Asp 145 150 155
160 Thr Leu Met Val Pro Thr Leu Ser Arg Val Arg Ala Leu Asn Lys Ala
165 170 175 Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu Val Val Tyr Asp
Leu Pro 180 185 190 Asp Arg Asp Cys Ala Ala Ala Ala Ser Asn Gly Glu
Phe Ser Ile Ala 195 200 205 Asn Gly Gly Ala Ala Asn Tyr Arg Ser Tyr
Ile Asp Ala Ile Arg Lys 210 215 220 His Ile Ile Glu Tyr Ser Asp Ile
Arg Ile Ile Leu Val Ile Glu Pro 225 230 235 240 Asp Ser Met Ala Asn
Met Val Thr Asn Met Asn Val Ala Lys Cys Ser 245 250 255 Asn Ala Ala
Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys Gln 260 265 270 Leu
Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His Ala Gly 275 280
285 Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala Glu Leu Phe Ala
290 295 300 Gly Ile Tyr Asn Asp Ala Gly Lys Pro Ala Ala Val Arg Gly
Leu Ala 305 310 315 320 Thr Asn Val Ala Asn Tyr Asn Ala Trp Ser Ile
Ala Ser Ala Pro Ser 325 330 335 Tyr Thr Ser Pro Asn Pro Asn Tyr Asp
Glu Lys His Tyr Ile Glu Ala 340 345 350 Phe Ser Pro Leu Leu Asn Ser
Ala Gly Phe Pro Ala Arg Phe Ile Val 355 360 365 Asp Thr Gly Arg Asn
Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp Gly 370 375 380 Asp Trp Cys
Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr Ala 385 390 395 400
Asn Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val Lys Pro Gly 405
410 415 Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala Ala Arg Tyr Asp
Tyr 420 425 430 His Cys Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu
Ala Gly Gln 435 440 445 Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr
Asn Ala Asn Pro Pro 450 455 460 Phe 465
<210> 142 <211> 1449 <212> DNA <213> Artificial Sequence <223> Synthetic polynucleotide.
<400> 142 atggccaaga agcttttcat
caccgccgcg cttgcggctg ccgtgttggc ggcccccgtc 60attgaggagc gccagaactg
cggcgctgtg tggactcaat gcggcggtaa cgggtggcaa 120ggtcccacat
gctgcgcctc gggctcgacc tgcgttgcgc agaacgagtg gtactctcag
180tgcctgccca acagccaggt gacgagttcc accactccgt cgtcgacttc
cacctcgcag 240cgcagcacca gcacctccag cagcaccacc aggagcggca
gctcctcctc ctcctccacc 300acgcccccgc ccgtctccag ccccgtgacc
agcattcccg gcggtgcgac ctccacggcg 360agctactctg gcaacccctt
ctcgggcgtc cggctcttcg ccaacgacta ctacaggtcc 420gaggtccaca
atctcgccat tcctagcatg actggtactc tggcggccaa ggcttccgcc
480gtcgccgaag tccctagctt ccagtggctc gaccggaacg tcaccatcga
caccctgatg 540gtcccgactc tgtcccgcgt ccgggctctc aataaggccg
gtgccaatcc tccctatgct 600gcccaactcg tcgtctacga cctccccgac
cgtgactgtg ccgccgctgc gtccaacggc 660gagttttcga ttgcaaacgg
cggcgccgcc aactacagga gctacatcga cgctatccgc 720aagcacatca
aggagtactc ggacatccgg atcatcctgg ttatcgagcc cgactcgatg
780gccaacatgg tgaccaacat gaacgtggcc aagtgcagca acgccgcgtc
gacgtaccac 840gagttgaccg tgtacgcgct caagcagctg aacctgccca
acgtcgccat gtatctcgac 900gccggccacg ccggctggct cggctggccc
gccaacatcc agcccgccgc cgagctgttt 960gccggcatct acaatgatgc
cggcaagccg
gctgccgtcc gcggcctggc cactaacgtc 1020gccaactaca acgcctggag
catcgcttcg gccccgtcgt acacgtcgcc taaccctaac 1080tacgacgaga
agcactacat cgaggccttc agcccgctct tgaacgacgc cggcttcccc
1140gcacgcttca ttgtcgacac tggccgcaac ggcaaacaac ctaccggcca
acaacagtgg 1200ggtgactggt gcaatgtcaa gggcaccggc tttggcgtgc
gcccgacggc caacacgggc 1260cacgagctgg tcgatgcctt tgtctgggtc
aagcccggcg gcgagtccga cggcacaagc 1320gacaccagcg ccgcccgcta
cgactaccac tgcggcctgt ccgatgccct gcagcctgcc 1380cccgaggctg
gacagtggtt ccaggcctac ttcgagcagc tgctcaccaa cgccaacccg
1440cccttctaa 1449
<210> 143 <211> 482 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 143 Met Ala Lys Lys Leu Phe Ile Thr Ala Ala Leu Ala
Ala Ala Val Leu 1 5 10 15 Ala Ala Pro Val Ile Glu Glu Arg Gln Asn
Cys Gly Ala Val Trp Thr 20 25 30 Gln Cys Gly Gly Asn Gly Trp Gln
Gly Pro Thr Cys Cys Ala Ser Gly 35 40 45 Ser Thr Cys Val Ala Gln
Asn Glu Trp Tyr Ser Gln Cys Leu Pro Asn 50 55 60 Ser Gln Val Thr
Ser Ser Thr Thr Pro Ser Ser Thr Ser Thr Ser Gln 65 70 75 80 Arg Ser
Thr Ser Thr Ser Ser Ser Thr Thr Arg Ser Gly Ser Ser Ser 85 90 95
Ser Ser Ser Thr Thr Pro Pro Pro Val Ser Ser Pro Val Thr Ser Ile 100
105 110 Pro Gly Gly Ala Thr Ser Thr Ala Ser Tyr Ser Gly Asn Pro Phe
Ser 115 120 125 Gly Val Arg Leu Phe Ala Asn Asp Tyr Tyr Arg Ser Glu
Val His Asn 130 135 140 Leu Ala Ile Pro Ser Met Thr Gly Thr Leu Ala
Ala Lys Ala Ser Ala 145 150 155 160 Val Ala Glu Val Pro Ser Phe Gln
Trp Leu Asp Arg Asn Val Thr Ile 165 170 175 Asp Thr Leu Met Val Pro
Thr Leu Ser Arg Val Arg Ala Leu Asn Lys 180 185 190 Ala Gly Ala Asn
Pro Pro Tyr Ala Ala Gln Leu Val Val Tyr Asp Leu 195 200 205 Pro Asp
Arg Asp Cys Ala Ala Ala Ala Ser Asn Gly Glu Phe Ser Ile 210 215 220
Ala Asn Gly Gly Ala Ala Asn Tyr Arg Ser Tyr Ile Asp Ala Ile Arg 225
230 235 240 Lys His Ile Lys Glu Tyr Ser Asp Ile Arg Ile Ile Leu Val
Ile Glu 245 250 255 Pro Asp Ser Met Ala Asn Met Val Thr Asn Met Asn
Val Ala Lys Cys 260 265 270 Ser Asn Ala Ala Ser Thr Tyr His Glu Leu
Thr Val Tyr Ala Leu Lys 275 280 285 Gln Leu Asn Leu Pro Asn Val Ala
Met Tyr Leu Asp Ala Gly His Ala 290 295 300 Gly Trp Leu Gly Trp Pro
Ala Asn Ile Gln Pro Ala Ala Glu Leu Phe 305 310 315 320 Ala Gly Ile
Tyr Asn Asp Ala Gly Lys Pro Ala Ala Val Arg Gly Leu 325 330 335 Ala
Thr Asn Val Ala Asn Tyr Asn Ala Trp Ser Ile Ala Ser Ala Pro 340 345
350 Ser Tyr Thr Ser Pro Asn Pro Asn Tyr Asp Glu Lys His Tyr Ile Glu
355 360 365 Ala Phe Ser Pro Leu Leu Asn Asp Ala Gly Phe Pro Ala Arg
Phe Ile 370 375 380 Val Asp Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly
Gln Gln Gln Trp 385 390 395 400 Gly Asp Trp Cys Asn Val Lys Gly Thr
Gly Phe Gly Val Arg Pro Thr 405 410 415 Ala Asn Thr Gly His Glu Leu
Val Asp Ala Phe Val Trp Val Lys Pro 420 425 430 Gly Gly Glu Ser Asp
Gly Thr Ser Asp Thr Ser Ala Ala Arg Tyr Asp 435 440 445 Tyr His Cys
Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu Ala Gly 450 455 460 Gln
Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr Asn Ala Asn Pro 465 470
475 480 Pro Phe
<210> 144 <211> 465 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 144 Ala Pro Val Ile Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr Gln
1 5 10 15 Cys Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser
Gly Ser 20 25 30 Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys
Leu Pro Asn Ser 35 40 45 Gln Val Thr Ser Ser Thr Thr Pro Ser Ser
Thr Ser Thr Ser Gln Arg 50 55 60 Ser Thr Ser Thr Ser Ser Ser Thr
Thr Arg Ser Gly Ser Ser Ser Ser 65 70 75 80 Ser Ser Thr Thr Pro Pro
Pro Val Ser Ser Pro Val Thr Ser Ile Pro 85 90 95 Gly Gly Ala Thr
Ser Thr Ala Ser Tyr Ser Gly Asn Pro Phe Ser Gly 100 105 110 Val Arg
Leu Phe Ala Asn Asp Tyr Tyr Arg Ser Glu Val His Asn Leu 115 120 125
Ala Ile Pro Ser Met Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala Val 130
135 140 Ala Glu Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile
Asp 145 150 155 160 Thr Leu Met Val Pro Thr Leu Ser Arg Val Arg Ala
Leu Asn Lys Ala 165 170 175 Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu
Val Val Tyr Asp Leu Pro 180 185 190 Asp Arg Asp Cys Ala Ala Ala Ala
Ser Asn Gly Glu Phe Ser Ile Ala 195 200 205 Asn Gly Gly Ala Ala Asn
Tyr Arg Ser Tyr Ile Asp Ala Ile Arg Lys 210 215 220 His Ile Lys Glu
Tyr Ser Asp Ile Arg Ile Ile Leu Val Ile Glu Pro 225 230 235 240 Asp
Ser Met Ala Asn Met Val Thr Asn Met Asn Val Ala Lys Cys Ser 245 250
255 Asn Ala Ala Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys Gln
260 265 270 Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His
Ala Gly 275 280 285 Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala
Glu Leu Phe Ala 290 295 300 Gly Ile Tyr Asn Asp Ala Gly Lys Pro Ala
Ala Val Arg Gly Leu Ala 305 310 315 320 Thr Asn Val Ala Asn Tyr Asn
Ala Trp Ser Ile Ala Ser Ala Pro Ser 325 330 335 Tyr Thr Ser Pro Asn
Pro Asn Tyr Asp Glu Lys His Tyr Ile Glu Ala 340 345 350 Phe Ser Pro
Leu Leu Asn Asp Ala Gly Phe Pro Ala Arg Phe Ile Val 355 360 365 Asp
Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp Gly 370 375
380 Asp Trp Cys Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr Ala
385 390 395 400 Asn Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val
Lys Pro Gly 405 410 415 Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala
Ala Arg Tyr Asp Tyr 420 425 430 His Cys Gly Leu Ser Asp Ala Leu Gln
Pro Ala Pro Glu Ala Gly Gln 435 440 445 Trp Phe Gln Ala Tyr Phe Glu
Gln Leu Leu Thr Asn Ala Asn Pro Pro 450 455 460 Phe 465
<210> 145 <211> 1449 <212> DNA <213> Artificial Sequence <223> Synthetic polynucleotide.
<400> 145 atggccaaga agcttttcat caccgccgcg cttgcggctg ccgtgttggc
ggcccccgtc 60attgaggagc gccagaactg cggcgctgtg tggactcaat gcggcggtaa
cgggtggcaa 120ggtcccacat gctgcgcctc gggctcgacc tgcgttgcgc
agaacgagtg gtactctcag 180tgcctgccca acagccaggt gacgagttcc
accactccgt cgtcgacttc cacctcgcag 240cgcagcacca gcacctccag
cagcaccacc aggagcggca gctcctcctc ctcctccacc 300acgcccaccc
ccgtctccag ccccgtgacc agcattcccg gcggtgcgac ctccacggcg
360agctactctg gcaacccctt ctcgggcgtc cggctcttcg ccaacgacta
ctacaggtcc 420gaggtcatga atctcgccat tcctagcatg actggtactc
tggcggccaa ggcttccgcc 480gtcgccgaag tccctagctt ccagtggctc
gaccggaacg tcaccatcga caccctgatg 540gtcaccactc tgtcccaggt
ccgggctctc aataaggccg gtgccaatcc tccctatgct 600gcccaactcg
tcgtctacga cctccccgac cgtgactgtg ccgccgctgc gtccaacggc
660gagttttcga ttgcaaacgg cggcagcgcc aactacagga gctacatcga
cgctatccgc 720aagcacatca ttgagtactc ggacatccgg atcatcctgg
ttatcgagcc cgactcgatg 780gccaacatgg tgaccaacat gaacgtggcc
aagtgcagca acgccgcgtc gacgtaccac 840gagttgaccg tgtacgcgct
caagcagctg aacctgccca acgtcgccat gtatctcgac 900gccggccacg
ccggctggct cggctggccc gccaacatcc agcccgccgc cgagctgttt
960gccggcatct acaatgatgc cggcaagccg gctgccgtcc gcggcctggc
cactaacgtc 1020gccaactaca acgcctggag catcgcttcg gccccgtcgt
acacgcagcc taaccctaac 1080tacgacgaga agcactacat cgaggccttc
agcccgctct tgaactcggc cggcttcccc 1140gcacgcttca ttgtcgacac
tggccgcaac ggcaaacaac ctaccggcca acaacagtgg 1200ggtgactggt
gcaatgtcaa gggcaccggc tttggcgtgc gcccgacggc caacacgggc
1260cacgagctgg tcgatgcctt tgtctgggtc aagcccggcg gcgagtccga
cggcacaagc 1320gacaccagcg ccgcccgcta cgactaccac tgcggcctgt
ccgatgccct gcagcctgcc 1380cccgaggctg gacagtggtt ccaggcctac
ttcgagcagc tgctcaccaa cgccaacccg 1440cccttctaa
1449
<210> 146 <211> 482 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 146 Met Ala
Lys Lys Leu Phe Ile Thr Ala Ala Leu Ala Ala Ala Val Leu 1 5 10 15
Ala Ala Pro Val Ile Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr 20
25 30 Gln Cys Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser
Gly 35 40 45 Ser Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys
Leu Pro Asn 50 55 60 Ser Gln Val Thr Ser Ser Thr Thr Pro Ser Ser
Thr Ser Thr Ser Gln 65 70 75 80 Arg Ser Thr Ser Thr Ser Ser Ser Thr
Thr Arg Ser Gly Ser Ser Ser 85 90 95 Ser Ser Ser Thr Thr Pro Thr
Pro Val Ser Ser Pro Val Thr Ser Ile 100 105 110 Pro Gly Gly Ala Thr
Ser Thr Ala Ser Tyr Ser Gly Asn Pro Phe Ser 115 120 125 Gly Val Arg
Leu Phe Ala Asn Asp Tyr Tyr Arg Ser Glu Val Met Asn 130 135 140 Leu
Ala Ile Pro Ser Met Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala 145 150
155 160 Val Ala Glu Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr
Ile 165 170 175 Asp Thr Leu Met Val Thr Thr Leu Ser Gln Val Arg Ala
Leu Asn Lys 180 185 190 Ala Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu
Val Val Tyr Asp Leu 195 200 205 Pro Asp Arg Asp Cys Ala Ala Ala Ala
Ser Asn Gly Glu Phe Ser Ile 210 215 220 Ala Asn Gly Gly Ser Ala Asn
Tyr Arg Ser Tyr Ile Asp Ala Ile Arg 225 230 235 240 Lys His Ile Ile
Glu Tyr Ser Asp Ile Arg Ile Ile Leu Val Ile Glu 245 250 255 Pro Asp
Ser Met Ala Asn Met Val Thr Asn Met Asn Val Ala Lys Cys 260 265 270
Ser Asn Ala Ala Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys 275
280 285 Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His
Ala 290 295 300 Gly Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala
Glu Leu Phe 305 310 315 320 Ala Gly Ile Tyr Asn Asp Ala Gly Lys Pro
Ala Ala Val Arg Gly Leu 325 330 335 Ala Thr Asn Val Ala Asn Tyr Asn
Ala Trp Ser Ile Ala Ser Ala Pro 340 345 350 Ser Tyr Thr Gln Pro Asn
Pro Asn Tyr Asp Glu Lys His Tyr Ile Glu 355 360 365 Ala Phe Ser Pro
Leu Leu Asn Ser Ala Gly Phe Pro Ala Arg Phe Ile 370 375 380 Val Asp
Thr Gly Arg Asn Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp 385 390 395
400 Gly Asp Trp Cys Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr
405 410 415 Ala Asn Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val
Lys Pro 420 425 430 Gly Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala
Ala Arg Tyr Asp 435 440 445 Tyr His Cys Gly Leu Ser Asp Ala Leu Gln
Pro Ala Pro Glu Ala Gly 450 455 460 Gln Trp Phe Gln Ala Tyr Phe Glu
Gln Leu Leu Thr Asn Ala Asn Pro 465 470 475 480 Pro Phe
<210> 147 <211> 465 <212> PRT <213> Artificial Sequence <223> Synthetic polypeptides.
<400> 147 Ala Pro Val
Ile Glu Glu Arg Gln Asn Cys Gly Ala Val Trp Thr Gln 1 5 10 15 Cys
Gly Gly Asn Gly Trp Gln Gly Pro Thr Cys Cys Ala Ser Gly Ser 20 25
30 Thr Cys Val Ala Gln Asn Glu Trp Tyr Ser Gln Cys Leu Pro Asn Ser
35 40 45 Gln Val Thr Ser Ser Thr Thr Pro Ser Ser Thr Ser Thr Ser
Gln Arg 50 55 60 Ser Thr Ser Thr Ser Ser Ser Thr Thr Arg Ser Gly
Ser Ser Ser Ser 65 70 75 80 Ser Ser Thr Thr Pro Thr Pro Val Ser Ser
Pro Val Thr Ser Ile Pro 85 90 95 Gly Gly Ala Thr Ser Thr Ala Ser
Tyr Ser Gly Asn Pro Phe Ser Gly 100 105 110 Val Arg Leu Phe Ala Asn
Asp Tyr Tyr Arg Ser Glu Val Met Asn Leu 115 120 125 Ala Ile Pro Ser
Met Thr Gly Thr Leu Ala Ala Lys Ala Ser Ala Val 130 135 140 Ala Glu
Val Pro Ser Phe Gln Trp Leu Asp Arg Asn Val Thr Ile Asp 145 150 155
160 Thr Leu Met Val Thr Thr Leu Ser Gln Val Arg Ala Leu Asn Lys Ala
165 170 175 Gly Ala Asn Pro Pro Tyr Ala Ala Gln Leu Val Val Tyr Asp
Leu Pro 180 185 190 Asp Arg Asp Cys Ala Ala Ala Ala Ser Asn Gly Glu
Phe Ser Ile Ala 195 200 205 Asn Gly Gly Ser Ala Asn Tyr Arg Ser Tyr
Ile Asp Ala Ile Arg Lys 210 215 220 His Ile Ile Glu Tyr Ser Asp Ile
Arg Ile Ile Leu Val Ile Glu Pro 225 230 235 240 Asp Ser Met Ala Asn
Met Val Thr Asn Met Asn Val Ala Lys Cys Ser 245 250 255 Asn Ala Ala
Ser Thr Tyr His Glu Leu Thr Val Tyr Ala Leu Lys Gln 260 265 270 Leu
Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala Gly His Ala Gly 275 280
285 Trp Leu Gly Trp Pro Ala Asn Ile Gln Pro Ala Ala Glu Leu Phe Ala
290 295 300 Gly Ile Tyr Asn Asp Ala Gly Lys Pro Ala Ala Val Arg Gly
Leu Ala 305 310 315 320 Thr Asn Val Ala Asn Tyr Asn Ala Trp Ser Ile
Ala Ser Ala Pro Ser 325 330 335 Tyr Thr Gln Pro Asn Pro Asn Tyr Asp
Glu Lys His Tyr Ile Glu Ala 340 345 350 Phe Ser Pro Leu Leu Asn Ser
Ala Gly Phe Pro Ala Arg Phe Ile Val 355 360 365 Asp Thr Gly Arg Asn
Gly Lys Gln Pro Thr Gly Gln Gln Gln Trp Gly 370 375 380 Asp Trp Cys
Asn Val Lys Gly Thr Gly Phe Gly Val Arg Pro Thr Ala 385 390 395 400
Asn Thr Gly His Glu Leu Val Asp Ala Phe Val Trp Val Lys Pro Gly 405
410 415 Gly Glu Ser Asp Gly Thr Ser Asp Thr Ser Ala Ala Arg Tyr Asp
Tyr 420 425 430 His Cys Gly Leu Ser Asp Ala Leu Gln Pro Ala Pro Glu
Ala Gly Gln 435 440 445 Trp Phe Gln Ala Tyr Phe Glu Gln Leu Leu Thr
Asn Ala Asn Pro Pro 450 455 460 Phe 465
<210> 148 <211> 1239 <212> DNA <213> Myceliophthora thermophila
<400> 148 atgcactcca aagctttctt ggcagcgctt cttgcgcctg
ccgtctcagg gcaactgaac 60gacctcgccg tcagggctgg actcaagtac tttggtactg
ctcttagcga gagcgtcatc 120aacagtgata ctcggtatgc tgccatcctc
agcgacaaga gcatgttcgg ccagctcgtc 180cccgagaatg gcatgaagtg
ggatgctact gagccgtccc gtggccagtt caactacgcc 240tcgggcgaca
tcacggccaa cacggccaag aagaatggcc agggcatgcg ttgccacacc
300atggtctggt acagccagct cccgagctgg gtctcctcgg gctcgtggac
cagggactcg 360ctcacctcgg tcatcgagac gcacatgaac aacgtcatgg
gccactacaa
gggccaatgc 420tacgcctggg atgtcatcaa cgaggccatc aatgacgacg
gcaactcctg gcgcgacaac 480gtctttctcc ggacctttgg gaccgactac
ttcgccctgt ccttcaacct agccaagaag 540gccgatcccg ataccaagct
gtactacaac gactacaacc tcgagtacaa ccaggccaag 600acggaccgcg
ctgttgagct cgtcaagatg gtccaggccg ccggcgcgcc catcgacggt
660gtcggcttcc agggccacct cattgtcggc tcgaccccga cgcgctcgca
gctggccacc 720gccctccagc gcttcaccgc gctcggcctc gaggtcgcct
acaccgagct cgacatccgc 780cactcgagcc tgccggcctc ttcgtcggcg
ctcgcgaccc agggcaacga cttcgccaac 840gtggtcggct cttgcctcga
caccgccggc tgcgtcggcg tcaccgtctg gggcttcacc 900gatgcgcact
cgtggatccc gaacacgttc cccggccagg gcgacgccct gatctacgac
960agcaactaca acaagaagcc cgcgtggacc tcgatctcgt ccgtcctggc
cgccaaggcc 1020accggcgccc cgcccgcctc gtcctccacc accctcgtca
ccatcaccac ccctccgccg 1080gcatccacca ccgcctcctc ctcctccagt
gccacgccca cgagcgtccc gacgcagacg 1140aggtggggac agtgcggcgg
catcggatgg acggggccga cccagtgcga gagcccatgg 1200acctgccaga
agctgaacga ctggtactgg cagtgcctg 1239
<210> 149 <211> 413 <212> PRT <213> Myceliophthora thermophila
<400> 149 Met His Ser Lys Ala Phe Leu Ala Ala Leu Leu Ala Pro
Ala Val Ser 1 5 10 15 Gly Gln Leu Asn Asp Leu Ala Val Arg Ala Gly
Leu Lys Tyr Phe Gly 20 25 30 Thr Ala Leu Ser Glu Ser Val Ile Asn
Ser Asp Thr Arg Tyr Ala Ala 35 40 45 Ile Leu Ser Asp Lys Ser Met
Phe Gly Gln Leu Val Pro Glu Asn Gly 50 55 60 Met Lys Trp Asp Ala
Thr Glu Pro Ser Arg Gly Gln Phe Asn Tyr Ala 65 70 75 80 Ser Gly Asp
Ile Thr Ala Asn Thr Ala Lys Lys Asn Gly Gln Gly Met 85 90 95 Arg
Cys His Thr Met Val Trp Tyr Ser Gln Leu Pro Ser Trp Val Ser 100 105
110 Ser Gly Ser Trp Thr Arg Asp Ser Leu Thr Ser Val Ile Glu Thr His
115 120 125 Met Asn Asn Val Met Gly His Tyr Lys Gly Gln Cys Tyr Ala
Trp Asp 130 135 140 Val Ile Asn Glu Ala Ile Asn Asp Asp Gly Asn Ser
Trp Arg Asp Asn 145 150 155 160 Val Phe Leu Arg Thr Phe Gly Thr Asp
Tyr Phe Ala Leu Ser Phe Asn 165 170 175 Leu Ala Lys Lys Ala Asp Pro
Asp Thr Lys Leu Tyr Tyr Asn Asp Tyr 180 185 190 Asn Leu Glu Tyr Asn
Gln Ala Lys Thr Asp Arg Ala Val Glu Leu Val 195 200 205 Lys Met Val
Gln Ala Ala Gly Ala Pro Ile Asp Gly Val Gly Phe Gln 210 215 220 Gly
His Leu Ile Val Gly Ser Thr Pro Thr Arg Ser Gln Leu Ala Thr 225 230
235 240 Ala Leu Gln Arg Phe Thr Ala Leu Gly Leu Glu Val Ala Tyr Thr
Glu 245 250 255 Leu Asp Ile Arg His Ser Ser Leu Pro Ala Ser Ser Ser
Ala Leu Ala 260 265 270 Thr Gln Gly Asn Asp Phe Ala Asn Val Val Gly
Ser Cys Leu Asp Thr 275 280 285 Ala Gly Cys Val Gly Val Thr Val Trp
Gly Phe Thr Asp Ala His Ser 290 295 300 Trp Ile Pro Asn Thr Phe Pro
Gly Gln Gly Asp Ala Leu Ile Tyr Asp 305 310 315 320 Ser Asn Tyr Asn
Lys Lys Pro Ala Trp Thr Ser Ile Ser Ser Val Leu 325 330 335 Ala Ala
Lys Ala Thr Gly Ala Pro Pro Ala Ser Ser Ser Thr Thr Leu 340 345 350
Val Thr Ile Thr Thr Pro Pro Pro Ala Ser Thr Thr Ala Ser Ser Ser 355
360 365 Ser Ser Ala Thr Pro Thr Ser Val Pro Thr Gln Thr Arg Trp Gly
Gln 370 375 380 Cys Gly Gly Ile Gly Trp Thr Gly Pro Thr Gln Cys Glu
Ser Pro Trp 385 390 395 400 Thr Cys Gln Lys Leu Asn Asp Trp Tyr Trp
Gln Cys Leu 405 410
<210> 150 <211> 396 <212> PRT <213> Myceliophthora thermophila
<400> 150 Gln Leu
Asn Asp Leu Ala Val Arg Ala Gly Leu Lys Tyr Phe Gly Thr 1 5 10 15
Ala Leu Ser Glu Ser Val Ile Asn Ser Asp Thr Arg Tyr Ala Ala Ile 20
25 30 Leu Ser Asp Lys Ser Met Phe Gly Gln Leu Val Pro Glu Asn Gly
Met 35 40 45 Lys Trp Asp Ala Thr Glu Pro Ser Arg Gly Gln Phe Asn
Tyr Ala Ser 50 55 60 Gly Asp Ile Thr Ala Asn Thr Ala Lys Lys Asn
Gly Gln Gly Met Arg 65 70 75 80 Cys His Thr Met Val Trp Tyr Ser Gln
Leu Pro Ser Trp Val Ser Ser 85 90 95 Gly Ser Trp Thr Arg Asp Ser
Leu Thr Ser Val Ile Glu Thr His Met 100 105 110 Asn Asn Val Met Gly
His Tyr Lys Gly Gln Cys Tyr Ala Trp Asp Val 115 120 125 Ile Asn Glu
Ala Ile Asn Asp Asp Gly Asn Ser Trp Arg Asp Asn Val 130 135 140 Phe
Leu Arg Thr Phe Gly Thr Asp Tyr Phe Ala Leu Ser Phe Asn Leu 145 150
155 160 Ala Lys Lys Ala Asp Pro Asp Thr Lys Leu Tyr Tyr Asn Asp Tyr
Asn 165 170 175 Leu Glu Tyr Asn Gln Ala Lys Thr Asp Arg Ala Val Glu
Leu Val Lys 180 185 190 Met Val Gln Ala Ala Gly Ala Pro Ile Asp Gly
Val Gly Phe Gln Gly 195 200 205 His Leu Ile Val Gly Ser Thr Pro Thr
Arg Ser Gln Leu Ala Thr Ala 210 215 220 Leu Gln Arg Phe Thr Ala Leu
Gly Leu Glu Val Ala Tyr Thr Glu Leu 225 230 235 240 Asp Ile Arg His
Ser Ser Leu Pro Ala Ser Ser Ser Ala Leu Ala Thr 245 250 255 Gln Gly
Asn Asp Phe Ala Asn Val Val Gly Ser Cys Leu Asp Thr Ala 260 265 270
Gly Cys Val Gly Val Thr Val Trp Gly Phe Thr Asp Ala His Ser Trp 275
280 285 Ile Pro Asn Thr Phe Pro Gly Gln Gly Asp Ala Leu Ile Tyr Asp
Ser 290 295 300 Asn Tyr Asn Lys Lys Pro Ala Trp Thr Ser Ile Ser Ser
Val Leu Ala 305 310 315 320 Ala Lys Ala Thr Gly Ala Pro Pro Ala Ser
Ser Ser Thr Thr Leu Val 325 330 335 Thr Ile Thr Thr Pro Pro Pro Ala
Ser Thr Thr Ala Ser Ser Ser Ser 340 345 350 Ser Ala Thr Pro Thr Ser
Val Pro Thr Gln Thr Arg Trp Gly Gln Cys 355 360 365 Gly Gly Ile Gly
Trp Thr Gly Pro Thr Gln Cys Glu Ser Pro Trp Thr 370 375 380 Cys Gln
Lys Leu Asn Asp Trp Tyr Trp Gln Cys Leu 385 390 395
<210> 151 <211> 654 <212> DNA <213> Myceliophthora thermophila
<400> 151 atggtctcgt tcactctcct
cctcacggtc atcgccgctg cggtgacgac ggccagccct 60ctcgaggtgg tcaagcgcgg
catccagccg ggcacgggca cccacgaggg gtacttctac 120tcgttctgga
ccgacggccg tggctcggtc gacttcaacc ccgggccccg cggctcgtac
180agcgtcacct ggaacaacgt caacaactgg gttggcggca agggctggaa
cccgggcccg 240ccgcgcaaga ttgcgtacaa cggcacctgg aacaactaca
acgtgaacag ctacctcgcc 300ctgtacggct ggactcgcaa cccgctggtc
gagtattaca tcgtggaggc atacggcacg 360tacaacccct cgtcgggcac
ggcgcggctg ggcaccatcg aggacgacgg cggcgtgtac 420gacatctaca
agacgacgcg gtacaaccag ccgtccatcg aggggacctc caccttcgac
480cagtactggt ccgtccgccg ccagaagcgc gtcggcggca ctatcgacac
gggcaagcac 540tttgacgagt ggaagcgcca gggcaacctc cagctcggca
cctggaacta catgatcatg 600gccaccgagg gctaccagag ctctggttcg
gccactatcg aggtccggga ggcc 654
<210> 152 <211> 218 <212> PRT <213> Myceliophthora thermophila
<400> 152 Met Val Ser Phe Thr Leu Leu Leu Thr Val Ile Ala Ala Ala Val Thr
1 5 10 15 Thr Ala Ser Pro Leu Glu Val Val Lys Arg Gly Ile Gln Pro
Gly Thr 20 25 30 Gly Thr His Glu Gly Tyr Phe Tyr Ser Phe Trp Thr
Asp Gly Arg Gly 35 40 45 Ser Val Asp Phe Asn Pro Gly Pro Arg Gly
Ser Tyr Ser Val Thr Trp 50 55 60 Asn Asn Val Asn Asn Trp Val Gly
Gly Lys Gly Trp Asn Pro Gly Pro 65 70 75 80 Pro Arg Lys Ile Ala Tyr
Asn Gly Thr Trp Asn Asn Tyr Asn Val Asn 85 90 95 Ser Tyr Leu Ala
Leu Tyr Gly Trp Thr Arg Asn Pro Leu Val Glu Tyr 100 105 110 Tyr Ile
Val Glu Ala Tyr Gly Thr Tyr Asn Pro Ser Ser Gly Thr Ala 115 120 125
Arg Leu Gly Thr Ile Glu Asp Asp Gly Gly Val Tyr Asp Ile Tyr Lys 130
135 140 Thr Thr Arg Tyr Asn Gln Pro Ser Ile Glu Gly Thr Ser Thr Phe
Asp 145 150 155 160 Gln Tyr Trp Ser Val Arg Arg Gln Lys Arg Val Gly
Gly Thr Ile Asp 165 170 175 Thr Gly Lys His Phe Asp Glu Trp Lys Arg
Gln Gly Asn Leu Gln Leu 180 185 190 Gly Thr Trp Asn Tyr Met Ile Met
Ala Thr Glu Gly Tyr Gln Ser Ser 195 200 205 Gly Ser Ala Thr Ile Glu
Val Arg Glu Ala 210 215
<210> 153 <211> 218 <212> PRT <213> Myceliophthora thermophila
<400> 153 Met
Val Ser Phe Thr Leu Leu Leu Thr Val Ile Ala Ala Ala Val Thr 1 5 10
15 Thr Ala Ser Pro Leu Glu Val Val Lys Arg Gly Ile Gln Pro Gly Thr
20 25 30 Gly Thr His Glu Gly Tyr Phe Tyr Ser Phe Trp Thr Asp Gly
Arg Gly 35 40 45 Ser Val Asp Phe Asn Pro Gly Pro Arg Gly Ser Tyr
Ser Val Thr Trp 50 55 60 Asn Asn Val Asn Asn Trp Val Gly Gly Lys
Gly Trp Asn Pro Gly Pro 65 70 75 80 Pro Arg Lys Ile Ala Tyr Asn Gly
Thr Trp Asn Asn Tyr Asn Val Asn 85 90 95 Ser Tyr Leu Ala Leu Tyr
Gly Trp Thr Arg Asn Pro Leu Val Glu Tyr 100 105 110 Tyr Ile Val Glu
Ala Tyr Gly Thr Tyr Asn Pro Ser Ser Gly Thr Ala 115 120 125 Arg Leu
Gly Thr Ile Glu Asp Asp Gly Gly Val Tyr Asp Ile Tyr Lys 130 135 140
Thr Thr Arg Tyr Asn Gln Pro Ser Ile Glu Gly Thr Ser Thr Phe Asp 145
150 155 160 Gln Tyr Trp Ser Val Arg Arg Gln Lys Arg Val Gly Gly Thr
Ile Asp 165 170 175 Thr Gly Lys His Phe Asp Glu Trp Lys Arg Gln Gly
Asn Leu Gln Leu 180 185 190 Gly Thr Trp Asn Tyr Met Ile Met Ala Thr
Glu Gly Tyr Gln Ser Ser 195 200 205 Gly Ser Ala Thr Ile Glu Val Arg
Glu Ala 210 215
<210> 154 <211> 1155 <212> DNA <213> Myceliophthora thermophila
<400> 154 atgcgtactc
ttacgttcgt gctggcagcc gccccggtgg ctgtgcttgc ccaatctcct 60ctgtggggcc
agtgcggcgg tcaaggctgg acaggtccca cgacctgcgt ttctggcgca
120gtatgccaat tcgtcaatga ctggtactcc caatgcgtgc ccggatcgag
caaccctcct 180acgggcacca ccagcagcac cactggaagc accccggctc
ctactggcgg cggcggcagc 240ggaaccggcc tccacgacaa attcaaggcc
aagggcaagc tctacttcgg aaccgagatc 300gatcactacc atctcaacaa
caatgccttg accaacattg tcaagaaaga ctttggtcaa 360gtcactcacg
agaacagctt gaagtgggat gctactgagc cgagccgcaa tcaattcaac
420tttgccaacg ccgacgcggt tgtcaacttt gcccaggcca acggcaagct
catccgcggc 480cacaccctcc tctggcactc tcagctgccg cagtgggtgc
agaacatcaa cgaccgcaac 540accttgaccc aggtcatcga gaaccacgtc
accacccttg tcactcgcta caagggcaag 600atcctccact gggacgtcgt
taacgagatc tttgccgagg acggctcgct ccgcgacagc 660gtcttcagcc
gcgtcctcgg cgaggacttt gtcggcatcg ccttccgcgc cgcccgcgcc
720gccgatccca acgccaagct ctacatcaac gactacaacc tcgacattgc
caactacgcc 780aaggtgaccc ggggcatggt cgagaaggtc aacaagtgga
tcgcccaggg catcccgatc 840gacggcatcg gcacccagtg ccacctggcc
gggcccggcg ggtggaacac ggccgccggc 900gtccccgacg ccctcaaggc
cctcgccgcg gccaacgtca aggagatcgc catcaccgag 960ctcgacatcg
ccggcgcctc cgccaacgac tacctcaccg tcatgaacgc ctgcctccag
1020gtctccaagt gcgtcggcat caccgtctgg ggcgtctctg acaaggacag
ctggaggtcg 1080agcagcaacc cgctcctctt cgacagcaac taccagccaa
aggcggcata caatgctctg 1140attaatgcct tgtaa
1155
<210> 155 <211> 384 <212> PRT <213> Myceliophthora thermophila
<400> 155 Met Arg Thr Leu Thr Phe
Val Leu Ala Ala Ala Pro Val Ala Val Leu 1 5 10 15 Ala Gln Ser Pro
Leu Trp Gly Gln Cys Gly Gly Gln Gly Trp Thr Gly 20 25 30 Pro Thr
Thr Cys Val Ser Gly Ala Val Cys Gln Phe Val Asn Asp Trp 35 40 45
Tyr Ser Gln Cys Val Pro Gly Ser Ser Asn Pro Pro Thr Gly Thr Thr 50
55 60 Ser Ser Thr Thr Gly Ser Thr Pro Ala Pro Thr Gly Gly Gly Gly
Ser 65 70 75 80 Gly Thr Gly Leu His Asp Lys Phe Lys Ala Lys Gly Lys
Leu Tyr Phe 85 90 95 Gly Thr Glu Ile Asp His Tyr His Leu Asn Asn
Asn Ala Leu Thr Asn 100 105 110 Ile Val Lys Lys Asp Phe Gly Gln Val
Thr His Glu Asn Ser Leu Lys 115 120 125 Trp Asp Ala Thr Glu Pro Ser
Arg Asn Gln Phe Asn Phe Ala Asn Ala 130 135 140 Asp Ala Val Val Asn
Phe Ala Gln Ala Asn Gly Lys Leu Ile Arg Gly 145 150 155 160 His Thr
Leu Leu Trp His Ser Gln Leu Pro Gln Trp Val Gln Asn Ile 165 170 175
Asn Asp Arg Asn Thr Leu Thr Gln Val Ile Glu Asn His Val Thr Thr 180
185 190 Leu Val Thr Arg Tyr Lys Gly Lys Ile Leu His Trp Asp Val Val
Asn 195 200 205 Glu Ile Phe Ala Glu Asp Gly Ser Leu Arg Asp Ser Val
Phe Ser Arg 210 215 220 Val Leu Gly Glu Asp Phe Val Gly Ile Ala Phe
Arg Ala Ala Arg Ala 225 230 235 240 Ala Asp Pro Asn Ala Lys Leu Tyr
Ile Asn Asp Tyr Asn Leu Asp Ile 245 250 255 Ala Asn Tyr Ala Lys Val
Thr Arg Gly Met Val Glu Lys Val Asn Lys 260 265 270 Trp Ile Ala Gln
Gly Ile Pro Ile Asp Gly Ile Gly Thr Gln Cys His 275 280 285 Leu Ala
Gly Pro Gly Gly Trp Asn Thr Ala Ala Gly Val Pro Asp Ala 290 295 300
Leu Lys Ala Leu Ala Ala Ala Asn Val Lys Glu Ile Ala Ile Thr Glu 305
310 315 320 Leu Asp Ile Ala Gly Ala Ser Ala Asn Asp Tyr Leu Thr Val
Met Asn 325 330 335 Ala Cys Leu Gln Val Ser Lys Cys Val Gly Ile Thr
Val Trp Gly Val 340 345 350 Ser Asp Lys Asp Ser Trp Arg Ser Ser Ser
Asn Pro Leu Leu Phe Asp 355 360 365 Ser Asn Tyr Gln Pro Lys Ala Ala
Tyr Asn Ala Leu Ile Asn Ala Leu 370 375 380
<210> 156 <211> 367 <212> PRT <213> Myceliophthora thermophila
<400> 156 Gln Ser Pro Leu Trp Gly Gln Cys Gly Gly Gln Gly Trp
Thr Gly Pro 1 5 10 15 Thr Thr Cys Val Ser Gly Ala Val Cys Gln Phe
Val Asn Asp Trp Tyr 20 25 30 Ser Gln Cys Val Pro Gly Ser Ser Asn
Pro Pro Thr Gly Thr Thr Ser 35 40 45 Ser Thr Thr Gly Ser Thr Pro
Ala Pro Thr Gly Gly Gly Gly Ser Gly 50 55 60 Thr Gly Leu His Asp
Lys Phe Lys Ala Lys Gly Lys Leu Tyr Phe Gly 65 70 75 80 Thr Glu Ile
Asp His Tyr His Leu Asn Asn Asn Ala Leu Thr Asn Ile 85 90 95 Val
Lys Lys Asp Phe Gly Gln Val Thr His Glu Asn Ser Leu Lys Trp 100 105
110 Asp Ala Thr Glu Pro Ser Arg Asn Gln Phe Asn Phe Ala Asn Ala Asp
115 120 125 Ala Val Val Asn Phe Ala Gln Ala Asn Gly Lys Leu Ile Arg
Gly His 130 135 140 Thr Leu Leu Trp His Ser Gln Leu Pro Gln Trp Val
Gln Asn Ile Asn 145 150 155 160 Asp Arg Asn Thr Leu Thr Gln Val Ile
Glu Asn His Val Thr Thr Leu 165 170 175 Val Thr Arg Tyr Lys Gly Lys
Ile Leu His Trp Asp Val Val Asn Glu 180 185 190 Ile Phe Ala Glu Asp
Gly Ser Leu Arg Asp Ser Val Phe Ser Arg Val 195 200 205 Leu Gly Glu
Asp Phe Val Gly Ile Ala Phe Arg Ala Ala Arg
Ala Ala 210 215 220 Asp Pro Asn Ala Lys Leu Tyr Ile Asn Asp Tyr Asn
Leu Asp Ile Ala 225 230 235 240 Asn Tyr Ala Lys Val Thr Arg Gly Met
Val Glu Lys Val Asn Lys Trp 245 250 255 Ile Ala Gln Gly Ile Pro Ile
Asp Gly Ile Gly Thr Gln Cys His Leu 260 265 270 Ala Gly Pro Gly Gly
Trp Asn Thr Ala Ala Gly Val Pro Asp Ala Leu 275 280 285 Lys Ala Leu
Ala Ala Ala Asn Val Lys Glu Ile Ala Ile Thr Glu Leu 290 295 300 Asp
Ile Ala Gly Ala Ser Ala Asn Asp Tyr Leu Thr Val Met Asn Ala 305 310
315 320 Cys Leu Gln Val Ser Lys Cys Val Gly Ile Thr Val Trp Gly Val
Ser 325 330 335 Asp Lys Asp Ser Trp Arg Ser Ser Ser Asn Pro Leu Leu
Phe Asp Ser 340 345 350 Asn Tyr Gln Pro Lys Ala Ala Tyr Asn Ala Leu
Ile Asn Ala Leu 355 360 365
<210> 157 <211> 687 <212> DNA <213> Myceliophthora thermophila
<400> 157 atggtctcgc tcaagtccct cctcctcgcc gcggcggcga cgttgacggc
ggtgacggcg 60cgcccgttcg actttgacga cggcaactcg accgaggcgc tggccaagcg
ccaggtcacg 120cccaacgcgc agggctacca ctcgggctac ttctactcgt
ggtggtccga cggcggcggc 180caggccacct tcaccctgct cgagggcagc
cactaccagg tcaactggag gaacacgggc 240aactttgtcg gtggcaaggg
ctggaacccg ggtaccggcc ggaccatcaa ctacggcggc 300tcgttcaacc
cgagcggcaa cggctacctg gccgtctacg gctggacgca caacccgctg
360atcgagtact acgtggtcga gtcgtacggg acctacaacc cgggcagcca
ggcccagtac 420aagggcagct tccagagcga cggcggcacc tacaacatct
acgtctcgac ccgctacaac 480gcgccctcga tcgagggcac ccgcaccttc
cagcagtact ggtccatccg cacctccaag 540cgcgtcggcg gctccgtcac
catgcagaac cacttcaacg cctgggccca gcacggcatg 600cccctcggct
cccacgacta ccagatcgtc gccaccgagg gctaccagag cagcggctcc
660tccgacatct acgtccagac tcactag 687158228PRTMyceliophthora
thermophila 158Met Val Ser Leu Lys Ser Leu Leu Leu Ala Ala Ala Ala
Thr Leu Thr 1 5 10 15 Ala Val Thr Ala Arg Pro Phe Asp Phe Asp Asp
Gly Asn Ser Thr Glu 20 25 30 Ala Leu Ala Lys Arg Gln Val Thr Pro
Asn Ala Gln Gly Tyr His Ser 35 40 45 Gly Tyr Phe Tyr Ser Trp Trp
Ser Asp Gly Gly Gly Gln Ala Thr Phe 50 55 60 Thr Leu Leu Glu Gly
Ser His Tyr Gln Val Asn Trp Arg Asn Thr Gly 65 70 75 80 Asn Phe Val
Gly Gly Lys Gly Trp Asn Pro Gly Thr Gly Arg Thr Ile 85 90 95 Asn
Tyr Gly Gly Ser Phe Asn Pro Ser Gly Asn Gly Tyr Leu Ala Val 100 105
110 Tyr Gly Trp Thr His Asn Pro Leu Ile Glu Tyr Tyr Val Val Glu Ser
115 120 125 Tyr Gly Thr Tyr Asn Pro Gly Ser Gln Ala Gln Tyr Lys Gly
Ser Phe 130 135 140 Gln Ser Asp Gly Gly Thr Tyr Asn Ile Tyr Val Ser
Thr Arg Tyr Asn 145 150 155 160 Ala Pro Ser Ile Glu Gly Thr Arg Thr
Phe Gln Gln Tyr Trp Ser Ile 165 170 175 Arg Thr Ser Lys Arg Val Gly
Gly Ser Val Thr Met Gln Asn His Phe 180 185 190 Asn Ala Trp Ala Gln
His Gly Met Pro Leu Gly Ser His Asp Tyr Gln 195 200 205 Ile Val Ala
Thr Glu Gly Tyr Gln Ser Ser Gly Ser Ser Asp Ile Tyr 210 215 220 Val
Gln Thr His 225
<210> 159 <211> 208 <212> PRT <213> Myceliophthora thermophila <400> 159
Arg Pro Phe
Asp Phe Asp Asp Gly Asn Ser Thr Glu Ala Leu Ala Lys 1 5 10 15 Arg
Gln Val Thr Pro Asn Ala Gln Gly Tyr His Ser Gly Tyr Phe Tyr 20 25
30 Ser Trp Trp Ser Asp Gly Gly Gly Gln Ala Thr Phe Thr Leu Leu Glu
35 40 45 Gly Ser His Tyr Gln Val Asn Trp Arg Asn Thr Gly Asn Phe
Val Gly 50 55 60 Gly Lys Gly Trp Asn Pro Gly Thr Gly Arg Thr Ile
Asn Tyr Gly Gly 65 70 75 80 Ser Phe Asn Pro Ser Gly Asn Gly Tyr Leu
Ala Val Tyr Gly Trp Thr 85 90 95 His Asn Pro Leu Ile Glu Tyr Tyr
Val Val Glu Ser Tyr Gly Thr Tyr 100 105 110 Asn Pro Gly Ser Gln Ala
Gln Tyr Lys Gly Ser Phe Gln Ser Asp Gly 115 120 125 Gly Thr Tyr Asn
Ile Tyr Val Ser Thr Arg Tyr Asn Ala Pro Ser Ile 130 135 140 Glu Gly
Thr Arg Thr Phe Gln Gln Tyr Trp Ser Ile Arg Thr Ser Lys 145 150 155
160 Arg Val Gly Gly Ser Val Thr Met Gln Asn His Phe Asn Ala Trp Ala
165 170 175 Gln His Gly Met Pro Leu Gly Ser His Asp Tyr Gln Ile Val
Ala Thr 180 185 190 Glu Gly Tyr Gln Ser Ser Gly Ser Ser Asp Ile Tyr
Val Gln Thr His 195 200 205
<210> 160 <211> 681 <212> DNA <213> Myceliophthora thermophila <400> 160
atggttaccc tcactcgcct ggcggtcgcc gcggcggcca tgatctccag cactggcctg 60
gctgccccga cgcccgaagc tggccccgac cttcccgact ttgagctcgg ggtcaacaac 120
ctcgcccgcc gcgcgctgga ctacaaccag aactacagga ccagcggcaa cgtcaactac 180
tcgcccaccg acaacggcta ctcggtcagc ttctccaacg cgggagattt tgtcgtcggg 240
aagggctgga ggacgggagc caccagaaac atcaccttct cgggatcgac acagcatacc 300
tcgggcaccg tgctcgtctc cgtctacggc tggacccgga acccgctgat cgagtactac 360
gtgcaggagt acacgtccaa cggggccggc tccgctcagg gcgagaagct gggcacggtc 420
gagagcgacg ggggcacgta cgagatctgg cggcaccagc aggtcaacca gccgtcgatc 480
gagggcacct cgaccttctg gcagtacatc tcgaaccgcg tgtccggcca gcggcccaac 540
ggcggcaccg tcaccctcgc caaccacttc gccgcctggc agaagctcgg cctgaacctg 600
ggccagcacg actaccaggt cctggccacc gagggctggg gcaacgccgg cggcagctcc 660
cagtacaccg tcagcggctg a 681
<210> 161 <211> 226 <212> PRT <213> Myceliophthora thermophila <400> 161
Met Val Thr Leu Thr Arg Leu Ala Val Ala Ala Ala Ala Met Ile Ser
1 5 10 15 Ser Thr Gly Leu Ala Ala Pro Thr Pro Glu Ala Gly Pro Asp
Leu Pro 20 25 30 Asp Phe Glu Leu Gly Val Asn Asn Leu Ala Arg Arg
Ala Leu Asp Tyr 35 40 45 Asn Gln Asn Tyr Arg Thr Ser Gly Asn Val
Asn Tyr Ser Pro Thr Asp 50 55 60 Asn Gly Tyr Ser Val Ser Phe Ser
Asn Ala Gly Asp Phe Val Val Gly 65 70 75 80 Lys Gly Trp Arg Thr Gly
Ala Thr Arg Asn Ile Thr Phe Ser Gly Ser 85 90 95 Thr Gln His Thr
Ser Gly Thr Val Leu Val Ser Val Tyr Gly Trp Thr 100 105 110 Arg Asn
Pro Leu Ile Glu Tyr Tyr Val Gln Glu Tyr Thr Ser Asn Gly 115 120 125
Ala Gly Ser Ala Gln Gly Glu Lys Leu Gly Thr Val Glu Ser Asp Gly 130
135 140 Gly Thr Tyr Glu Ile Trp Arg His Gln Gln Val Asn Gln Pro Ser
Ile 145 150 155 160 Glu Gly Thr Ser Thr Phe Trp Gln Tyr Ile Ser Asn
Arg Val Ser Gly 165 170 175 Gln Arg Pro Asn Gly Gly Thr Val Thr Leu
Ala Asn His Phe Ala Ala 180 185 190 Trp Gln Lys Leu Gly Leu Asn Leu
Gly Gln His Asp Tyr Gln Val Leu 195 200 205 Ala Thr Glu Gly Trp Gly
Asn Ala Gly Gly Ser Ser Gln Tyr Thr Val 210 215 220 Ser Gly 225
<210> 162 <211> 205 <212> PRT <213> Myceliophthora thermophila <400> 162
Ala Pro Thr Pro Glu Ala Gly
Pro Asp Leu Pro Asp Phe Glu Leu Gly 1 5 10 15 Val Asn Asn Leu Ala
Arg Arg Ala Leu Asp Tyr Asn Gln Asn Tyr Arg 20 25 30 Thr Ser Gly
Asn Val Asn Tyr Ser Pro Thr Asp Asn Gly Tyr Ser Val 35 40 45 Ser
Phe Ser Asn Ala Gly Asp Phe Val Val Gly Lys Gly Trp Arg Thr 50 55
60 Gly Ala Thr Arg Asn Ile Thr Phe Ser Gly Ser Thr Gln His Thr Ser
65 70 75 80 Gly Thr Val Leu Val Ser Val Tyr Gly Trp Thr Arg Asn Pro
Leu Ile 85 90 95 Glu Tyr Tyr Val Gln Glu Tyr Thr Ser Asn Gly Ala
Gly Ser Ala Gln 100 105 110 Gly Glu Lys Leu Gly Thr Val Glu Ser Asp
Gly Gly Thr Tyr Glu Ile 115 120 125 Trp Arg His Gln Gln Val Asn Gln
Pro Ser Ile Glu Gly Thr Ser Thr 130 135 140 Phe Trp Gln Tyr Ile Ser
Asn Arg Val Ser Gly Gln Arg Pro Asn Gly 145 150 155 160 Gly Thr Val
Thr Leu Ala Asn His Phe Ala Ala Trp Gln Lys Leu Gly 165 170 175 Leu
Asn Leu Gly Gln His Asp Tyr Gln Val Leu Ala Thr Glu Gly Trp 180 185
190 Gly Asn Ala Gly Gly Ser Ser Gln Tyr Thr Val Ser Gly 195 200 205
<210> 163 <211> 1833 <212> DNA <213> Myceliophthora thermophila <400> 163
atgttcttcg cttctctgct gctcggtctc ctggcgggcg tgtccgcttc accgggacac 60
gggcggaatt ccaccttcta caaccccatc ttccccggct tctaccccga tccgagctgc 120
atctacgtgc ccgagcgtga ccacaccttc ttctgtgcct cgtcgagctt caacgccttc 180
ccgggcatcc cgattcatgc cagcaaggac ctgcagaact ggaagttgat cggccatgtg 240
ctgaatcgca aggaacagct tccccggctc gctgagacca accggtcgac cagcggcatc 300
tgggcaccca ccctccggtt ccatgacgac accttctggt tggtcaccac actagtggac 360
gacgaccggc cgcaggagga cgcttccaga tgggacaata ttatcttcaa ggcaaagaat 420
ccgtatgatc cgaggtcctg gtccaaggcc gtccacttca acttcactgg ctacgacacg 480
gagcctttct gggacgaaga tggaaaggtg tacatcaccg gcgcccatgc ttggcatgtt 540
ggcccataca tccagcaggc cgaagtcgat ctcgacacgg gggccgtcgg cgagtggcgc 600
atcatctgga acggaacggg cggcatggct cctgaagggc cgcacatcta ccgcaaagat 660
gggtggtact acttgctggc tgctgaaggg gggaccggca tcgaccatat ggtgaccatg 720
gcccggtcga gaaaaatctc cagtccttac gagtccaacc caaacaaccc cgtgttgacc 780
aacgccaaca cgaccagtta ctttcaaacc gtcgggcatt cagacctgtt ccatgacaga 840
catgggaact ggtgggcagt cgccctctcc acccgctccg gtccagaata tcttcactac 900
cccatgggcc gcgagaccgt catgacagcc gtgagctggc cgaaggacga gtggccaacc 960
ttcaccccca tatctggcaa gatgagcggc tggccgatgc ctccttcgca gaaggacatt 1020
cgcggagtcg gcccctacgt caactccccc gacccggaac acctgacctt cccccgctcg 1080
gcgcccctgc cggcccacct cacctactgg cgatacccga acccgtcctc ctacacgccg 1140
tccccgcccg ggcaccccaa caccctccgc ctgaccccgt cccgcctgaa cctgaccgcc 1200
ctcaacggca actacgcggg ggccgaccag accttcgtct cgcgccggca gcagcacacc 1260
ctcttcacct acagcgtcac gctcgactac gcgccgcgga ccgccgggga ggaggccggc 1320
gtgaccgcct tcctgacgca gaaccaccac ctcgacctgg gcgtcgtcct gctccctcgc 1380
ggctccgcca ccgcgccctc gctgccgggc ctgagtagta gtacaactac tactagtagt 1440
agtagtagtc gtccggacga ggaggaggag cgcgaggcgg gcgaagagga agaagagggc 1500
ggacaagact tgatgatccc gcatgtgcgg ttcaggggcg agtcgtacgt gcccgtcccg 1560
gcgcccgtcg tgtacccgat accccgggcc tggagaggcg ggaagcttgt gttagagatc 1620
cgggcttgta attcgactca cttctcgttc cgtgtcgggc cggacgggag acggtctgag 1680
cggacggtgg tcatggaggc ttcgaacgag gccgttagct ggggctttac tggaacgctg 1740
ctgggcatct atgcgaccag taatggtggc aacggaacca cgccggcgta tttttcggat 1800
tggaggtaca caccattgga gcagtttagg gat 1833
<210> 164 <211> 611 <212> PRT <213> Myceliophthora thermophila <400> 164
Met Phe Phe Ala Ser Leu
Leu Leu Gly Leu Leu Ala Gly Val Ser Ala 1 5 10 15 Ser Pro Gly His
Gly Arg Asn Ser Thr Phe Tyr Asn Pro Ile Phe Pro 20 25 30 Gly Phe
Tyr Pro Asp Pro Ser Cys Ile Tyr Val Pro Glu Arg Asp His 35 40 45
Thr Phe Phe Cys Ala Ser Ser Ser Phe Asn Ala Phe Pro Gly Ile Pro 50
55 60 Ile His Ala Ser Lys Asp Leu Gln Asn Trp Lys Leu Ile Gly His
Val 65 70 75 80 Leu Asn Arg Lys Glu Gln Leu Pro Arg Leu Ala Glu Thr
Asn Arg Ser 85 90 95 Thr Ser Gly Ile Trp Ala Pro Thr Leu Arg Phe
His Asp Asp Thr Phe 100 105 110 Trp Leu Val Thr Thr Leu Val Asp Asp
Asp Arg Pro Gln Glu Asp Ala 115 120 125 Ser Arg Trp Asp Asn Ile Ile
Phe Lys Ala Lys Asn Pro Tyr Asp Pro 130 135 140 Arg Ser Trp Ser Lys
Ala Val His Phe Asn Phe Thr Gly Tyr Asp Thr 145 150 155 160 Glu Pro
Phe Trp Asp Glu Asp Gly Lys Val Tyr Ile Thr Gly Ala His 165 170 175
Ala Trp His Val Gly Pro Tyr Ile Gln Gln Ala Glu Val Asp Leu Asp 180
185 190 Thr Gly Ala Val Gly Glu Trp Arg Ile Ile Trp Asn Gly Thr Gly
Gly 195 200 205 Met Ala Pro Glu Gly Pro His Ile Tyr Arg Lys Asp Gly
Trp Tyr Tyr 210 215 220 Leu Leu Ala Ala Glu Gly Gly Thr Gly Ile Asp
His Met Val Thr Met 225 230 235 240 Ala Arg Ser Arg Lys Ile Ser Ser
Pro Tyr Glu Ser Asn Pro Asn Asn 245 250 255 Pro Val Leu Thr Asn Ala
Asn Thr Thr Ser Tyr Phe Gln Thr Val Gly 260 265 270 His Ser Asp Leu
Phe His Asp Arg His Gly Asn Trp Trp Ala Val Ala 275 280 285 Leu Ser
Thr Arg Ser Gly Pro Glu Tyr Leu His Tyr Pro Met Gly Arg 290 295 300
Glu Thr Val Met Thr Ala Val Ser Trp Pro Lys Asp Glu Trp Pro Thr 305
310 315 320 Phe Thr Pro Ile Ser Gly Lys Met Ser Gly Trp Pro Met Pro
Pro Ser 325 330 335 Gln Lys Asp Ile Arg Gly Val Gly Pro Tyr Val Asn
Ser Pro Asp Pro 340 345 350 Glu His Leu Thr Phe Pro Arg Ser Ala Pro
Leu Pro Ala His Leu Thr 355 360 365 Tyr Trp Arg Tyr Pro Asn Pro Ser
Ser Tyr Thr Pro Ser Pro Pro Gly 370 375 380 His Pro Asn Thr Leu Arg
Leu Thr Pro Ser Arg Leu Asn Leu Thr Ala 385 390 395 400 Leu Asn Gly
Asn Tyr Ala Gly Ala Asp Gln Thr Phe Val Ser Arg Arg 405 410 415 Gln
Gln His Thr Leu Phe Thr Tyr Ser Val Thr Leu Asp Tyr Ala Pro 420 425
430 Arg Thr Ala Gly Glu Glu Ala Gly Val Thr Ala Phe Leu Thr Gln Asn
435 440 445 His His Leu Asp Leu Gly Val Val Leu Leu Pro Arg Gly Ser
Ala Thr 450 455 460 Ala Pro Ser Leu Pro Gly Leu Ser Ser Ser Thr Thr
Thr Thr Ser Ser 465 470 475 480 Ser Ser Ser Arg Pro Asp Glu Glu Glu
Glu Arg Glu Ala Gly Glu Glu 485 490 495 Glu Glu Glu Gly Gly Gln Asp
Leu Met Ile Pro His Val Arg Phe Arg 500 505 510 Gly Glu Ser Tyr Val
Pro Val Pro Ala Pro Val Val Tyr Pro Ile Pro 515 520 525 Arg Ala Trp
Arg Gly Gly Lys Leu Val Leu Glu Ile Arg Ala Cys Asn 530 535 540 Ser
Thr His Phe Ser Phe Arg Val Gly Pro Asp Gly Arg Arg Ser Glu 545 550
555 560 Arg Thr Val Val Met Glu Ala Ser Asn Glu Ala Val Ser Trp Gly
Phe 565 570 575 Thr Gly Thr Leu Leu Gly Ile Tyr Ala Thr Ser Asn Gly
Gly Asn Gly 580 585 590 Thr Thr Pro Ala Tyr Phe Ser Asp Trp Arg Tyr
Thr Pro Leu Glu Gln 595 600 605 Phe Arg Asp 610
<210> 165 <211> 595 <212> PRT <213> Myceliophthora thermophila <400> 165
Ser Pro Gly His Gly Arg Asn
Ser Thr Phe Tyr Asn Pro Ile Phe Pro 1 5 10 15 Gly Phe Tyr Pro Asp
Pro Ser Cys Ile Tyr Val Pro Glu Arg Asp His 20 25 30 Thr Phe Phe
Cys Ala Ser Ser Ser Phe Asn Ala Phe Pro Gly Ile Pro 35 40 45 Ile
His Ala Ser Lys Asp Leu Gln Asn Trp Lys Leu Ile Gly His Val 50 55
60 Leu Asn Arg Lys Glu Gln Leu Pro Arg Leu Ala Glu Thr Asn Arg Ser
65 70 75 80 Thr Ser Gly Ile Trp Ala Pro Thr Leu Arg Phe His Asp Asp
Thr Phe 85 90 95 Trp Leu Val Thr Thr Leu Val Asp Asp Asp Arg Pro
Gln Glu Asp Ala 100 105 110 Ser Arg Trp Asp Asn Ile Ile Phe Lys Ala
Lys Asn Pro Tyr Asp Pro 115 120 125 Arg Ser Trp Ser Lys Ala Val His
Phe Asn Phe Thr Gly Tyr Asp Thr
130 135 140 Glu Pro Phe Trp Asp Glu Asp Gly Lys Val Tyr Ile Thr Gly
Ala His 145 150 155 160 Ala Trp His Val Gly Pro Tyr Ile Gln Gln Ala
Glu Val Asp Leu Asp 165 170 175 Thr Gly Ala Val Gly Glu Trp Arg Ile
Ile Trp Asn Gly Thr Gly Gly 180 185 190 Met Ala Pro Glu Gly Pro His
Ile Tyr Arg Lys Asp Gly Trp Tyr Tyr 195 200 205 Leu Leu Ala Ala Glu
Gly Gly Thr Gly Ile Asp His Met Val Thr Met 210 215 220 Ala Arg Ser
Arg Lys Ile Ser Ser Pro Tyr Glu Ser Asn Pro Asn Asn 225 230 235 240
Pro Val Leu Thr Asn Ala Asn Thr Thr Ser Tyr Phe Gln Thr Val Gly 245
250 255 His Ser Asp Leu Phe His Asp Arg His Gly Asn Trp Trp Ala Val
Ala 260 265 270 Leu Ser Thr Arg Ser Gly Pro Glu Tyr Leu His Tyr Pro
Met Gly Arg 275 280 285 Glu Thr Val Met Thr Ala Val Ser Trp Pro Lys
Asp Glu Trp Pro Thr 290 295 300 Phe Thr Pro Ile Ser Gly Lys Met Ser
Gly Trp Pro Met Pro Pro Ser 305 310 315 320 Gln Lys Asp Ile Arg Gly
Val Gly Pro Tyr Val Asn Ser Pro Asp Pro 325 330 335 Glu His Leu Thr
Phe Pro Arg Ser Ala Pro Leu Pro Ala His Leu Thr 340 345 350 Tyr Trp
Arg Tyr Pro Asn Pro Ser Ser Tyr Thr Pro Ser Pro Pro Gly 355 360 365
His Pro Asn Thr Leu Arg Leu Thr Pro Ser Arg Leu Asn Leu Thr Ala 370
375 380 Leu Asn Gly Asn Tyr Ala Gly Ala Asp Gln Thr Phe Val Ser Arg
Arg 385 390 395 400 Gln Gln His Thr Leu Phe Thr Tyr Ser Val Thr Leu
Asp Tyr Ala Pro 405 410 415 Arg Thr Ala Gly Glu Glu Ala Gly Val Thr
Ala Phe Leu Thr Gln Asn 420 425 430 His His Leu Asp Leu Gly Val Val
Leu Leu Pro Arg Gly Ser Ala Thr 435 440 445 Ala Pro Ser Leu Pro Gly
Leu Ser Ser Ser Thr Thr Thr Thr Ser Ser 450 455 460 Ser Ser Ser Arg
Pro Asp Glu Glu Glu Glu Arg Glu Ala Gly Glu Glu 465 470 475 480 Glu
Glu Glu Gly Gly Gln Asp Leu Met Ile Pro His Val Arg Phe Arg 485 490
495 Gly Glu Ser Tyr Val Pro Val Pro Ala Pro Val Val Tyr Pro Ile Pro
500 505 510 Arg Ala Trp Arg Gly Gly Lys Leu Val Leu Glu Ile Arg Ala
Cys Asn 515 520 525 Ser Thr His Phe Ser Phe Arg Val Gly Pro Asp Gly
Arg Arg Ser Glu 530 535 540 Arg Thr Val Val Met Glu Ala Ser Asn Glu
Ala Val Ser Trp Gly Phe 545 550 555 560 Thr Gly Thr Leu Leu Gly Ile
Tyr Ala Thr Ser Asn Gly Gly Asn Gly 565 570 575 Thr Thr Pro Ala Tyr
Phe Ser Asp Trp Arg Tyr Thr Pro Leu Glu Gln 580 585 590 Phe Arg Asp
595
<210> 166 <211> 942 <212> DNA <213> Myceliophthora thermophila <400> 166
atgaagctcc tgggcaaact ctcggcggca ctcgccctcg cgggcagcag gctggctgcc 60
gcgcacccgg tcttcgacga gctgatgcgg ccgacggcgc cgctggtgcg cccgcgggcg 120
gccctgcagc aggtgaccaa ctttggcagc aacccgtcca acacgaagat gttcatctac 180
gtgcccgaca agctggcccc caacccgccc atcatagtgg ccatccacta ctgcaccggc 240
accgcccagg cctactactc gggctcccct tacgcccgcc tcgccgacca gaagggcttc 300
atcgtcatct acccggagtc cccctacagc ggcacctgtt gggacgtctc gtcgcgcgcc 360
gccctgaccc acaacggcgg cggcgacagc aactcgatcg ccaacatggt cacctacacc 420
ctcgaaaagt acaatggcga cgccagcaag gtctttgtca ccggctcctc gtccggcgcc 480
atgatgacga acgtgatggc cgccgcgtac ccggaactgt tcgcggcagg aatcgcctac 540
tcgggcgtgc ccgccggctg cttctacagc cagtccggag gcaccaacgc gtggaacagc 600
tcgtgcgcca acgggcagat caactcgacg ccccaggtgt gggccaagat ggtcttcgac 660
atgtacccgg aatacgacgg cccgcgcccc aagatgcaga tctaccacgg ctcggccgac 720
ggcacgctca gacccagcaa ctacaacgag accatcaagc agtggtgcgg cgtcttcggc 780
ttcgactaca cccgccccga caccacccag gccaactccc cgcaggccgg ctacaccacc 840
tacacctggg gcgagcagca gctcgtcggc atctacgccc agggcgtcgg acacacggtc 900
cccatccgcg gcagcgacga catggccttc tttggcctgt ga 942
<210> 167 <211> 313 <212> PRT <213> Myceliophthora thermophila <400> 167
Met Lys Leu
Leu Gly Lys Leu Ser Ala Ala Leu Ala Leu Ala Gly Ser 1 5 10 15 Arg
Leu Ala Ala Ala His Pro Val Phe Asp Glu Leu Met Arg Pro Thr 20 25
30 Ala Pro Leu Val Arg Pro Arg Ala Ala Leu Gln Gln Val Thr Asn Phe
35 40 45 Gly Ser Asn Pro Ser Asn Thr Lys Met Phe Ile Tyr Val Pro
Asp Lys 50 55 60 Leu Ala Pro Asn Pro Pro Ile Ile Val Ala Ile His
Tyr Cys Thr Gly 65 70 75 80 Thr Ala Gln Ala Tyr Tyr Ser Gly Ser Pro
Tyr Ala Arg Leu Ala Asp 85 90 95 Gln Lys Gly Phe Ile Val Ile Tyr
Pro Glu Ser Pro Tyr Ser Gly Thr 100 105 110 Cys Trp Asp Val Ser Ser
Arg Ala Ala Leu Thr His Asn Gly Gly Gly 115 120 125 Asp Ser Asn Ser
Ile Ala Asn Met Val Thr Tyr Thr Leu Glu Lys Tyr 130 135 140 Asn Gly
Asp Ala Ser Lys Val Phe Val Thr Gly Ser Ser Ser Gly Ala 145 150 155
160 Met Met Thr Asn Val Met Ala Ala Ala Tyr Pro Glu Leu Phe Ala Ala
165 170 175 Gly Ile Ala Tyr Ser Gly Val Pro Ala Gly Cys Phe Tyr Ser
Gln Ser 180 185 190 Gly Gly Thr Asn Ala Trp Asn Ser Ser Cys Ala Asn
Gly Gln Ile Asn 195 200 205 Ser Thr Pro Gln Val Trp Ala Lys Met Val
Phe Asp Met Tyr Pro Glu 210 215 220 Tyr Asp Gly Pro Arg Pro Lys Met
Gln Ile Tyr His Gly Ser Ala Asp 225 230 235 240 Gly Thr Leu Arg Pro
Ser Asn Tyr Asn Glu Thr Ile Lys Gln Trp Cys 245 250 255 Gly Val Phe
Gly Phe Asp Tyr Thr Arg Pro Asp Thr Thr Gln Ala Asn 260 265 270 Ser
Pro Gln Ala Gly Tyr Thr Thr Tyr Thr Trp Gly Glu Gln Gln Leu 275 280
285 Val Gly Ile Tyr Ala Gln Gly Val Gly His Thr Val Pro Ile Arg Gly
290 295 300 Ser Asp Asp Met Ala Phe Phe Gly Leu 305 310
<210> 168 <211> 292 <212> PRT <213> Myceliophthora thermophila <400> 168
His Pro Val Phe Asp Glu Leu
Met Arg Pro Thr Ala Pro Leu Val Arg 1 5 10 15 Pro Arg Ala Ala Leu
Gln Gln Val Thr Asn Phe Gly Ser Asn Pro Ser 20 25 30 Asn Thr Lys
Met Phe Ile Tyr Val Pro Asp Lys Leu Ala Pro Asn Pro 35 40 45 Pro
Ile Ile Val Ala Ile His Tyr Cys Thr Gly Thr Ala Gln Ala Tyr 50 55
60 Tyr Ser Gly Ser Pro Tyr Ala Arg Leu Ala Asp Gln Lys Gly Phe Ile
65 70 75 80 Val Ile Tyr Pro Glu Ser Pro Tyr Ser Gly Thr Cys Trp Asp
Val Ser 85 90 95 Ser Arg Ala Ala Leu Thr His Asn Gly Gly Gly Asp
Ser Asn Ser Ile 100 105 110 Ala Asn Met Val Thr Tyr Thr Leu Glu Lys
Tyr Asn Gly Asp Ala Ser 115 120 125 Lys Val Phe Val Thr Gly Ser Ser
Ser Gly Ala Met Met Thr Asn Val 130 135 140 Met Ala Ala Ala Tyr Pro
Glu Leu Phe Ala Ala Gly Ile Ala Tyr Ser 145 150 155 160 Gly Val Pro
Ala Gly Cys Phe Tyr Ser Gln Ser Gly Gly Thr Asn Ala 165 170 175 Trp
Asn Ser Ser Cys Ala Asn Gly Gln Ile Asn Ser Thr Pro Gln Val 180 185
190 Trp Ala Lys Met Val Phe Asp Met Tyr Pro Glu Tyr Asp Gly Pro Arg
195 200 205 Pro Lys Met Gln Ile Tyr His Gly Ser Ala Asp Gly Thr Leu
Arg Pro 210 215 220 Ser Asn Tyr Asn Glu Thr Ile Lys Gln Trp Cys Gly
Val Phe Gly Phe 225 230 235 240 Asp Tyr Thr Arg Pro Asp Thr Thr Gln
Ala Asn Ser Pro Gln Ala Gly 245 250 255 Tyr Thr Thr Tyr Thr Trp Gly
Glu Gln Gln Leu Val Gly Ile Tyr Ala 260 265 270 Gln Gly Val Gly His
Thr Val Pro Ile Arg Gly Ser Asp Asp Met Ala 275 280 285 Phe Phe Gly
Leu 290
<210> 169 <211> 840 <212> DNA <213> Myceliophthora thermophila <400> 169
atgatctcgg ttcctgctct cgctctggcc cttctggccg ccgtccaggt cgtcgagtct 60
gcctcggctg gctgtggcaa ggcgccccct tcctcgggca ccaagtcgat gacggtcaac 120
ggcaagcagc gccagtacat tctccagctg cccaacaact acgacgccaa caaggcccac 180
agggtggtga tcgggtacca ctggcgcgac ggatccatga acgacgtggc caacggcggc 240
ttctacgatc tgcggtcccg ggcgggcgac agcaccatct tcgttgcccc caacggcctc 300
aatgccggat gggccaacgt gggcggcgag gacatcacct ttacggacca gatcgtagac 360
atgctcaaga acgacctctg cgtggacgag acccagttct ttgctacggg ctggagctat 420
ggcggtgcca tgagccatag cgtggcttgt tctcggccag acgtcttcaa ggccgtcgcg 480
gtcatcgccg gggcccagct gtccggctgc gccggcggca cgacgcccgt ggcgtaccta 540
ggcatccacg gagccgccga caacgtcctg cccatcgacc tcggccgcca gctgcgcgac 600
aagtggctgc agaccaacgg ctgcaactac cagggcgccc aggaccccgc gccgggccag 660
caggcccaca tcaagaccac ctacagctgc tcccgcgcgc ccgtcacctg gatcggccac 720
gggggcggcc acgtccccga ccccacgggc aacaacggcg tcaagtttgc gccccaggag 780
acctgggact tctttgatgc cgccgtcgga gcggccggcg cgcagagccc gatgacataa 840
<210> 170 <211> 279 <212> PRT <213> Myceliophthora thermophila <400> 170
Met
Ile Ser Val Pro Ala Leu Ala Leu Ala Leu Leu Ala Ala Val Gln 1 5 10
15 Val Val Glu Ser Ala Ser Ala Gly Cys Gly Lys Ala Pro Pro Ser Ser
20 25 30 Gly Thr Lys Ser Met Thr Val Asn Gly Lys Gln Arg Gln Tyr
Ile Leu 35 40 45 Gln Leu Pro Asn Asn Tyr Asp Ala Asn Lys Ala His
Arg Val Val Ile 50 55 60 Gly Tyr His Trp Arg Asp Gly Ser Met Asn
Asp Val Ala Asn Gly Gly 65 70 75 80 Phe Tyr Asp Leu Arg Ser Arg Ala
Gly Asp Ser Thr Ile Phe Val Ala 85 90 95 Pro Asn Gly Leu Asn Ala
Gly Trp Ala Asn Val Gly Gly Glu Asp Ile 100 105 110 Thr Phe Thr Asp
Gln Ile Val Asp Met Leu Lys Asn Asp Leu Cys Val 115 120 125 Asp Glu
Thr Gln Phe Phe Ala Thr Gly Trp Ser Tyr Gly Gly Ala Met 130 135 140
Ser His Ser Val Ala Cys Ser Arg Pro Asp Val Phe Lys Ala Val Ala 145
150 155 160 Val Ile Ala Gly Ala Gln Leu Ser Gly Cys Ala Gly Gly Thr
Thr Pro 165 170 175 Val Ala Tyr Leu Gly Ile His Gly Ala Ala Asp Asn
Val Leu Pro Ile 180 185 190 Asp Leu Gly Arg Gln Leu Arg Asp Lys Trp
Leu Gln Thr Asn Gly Cys 195 200 205 Asn Tyr Gln Gly Ala Gln Asp Pro
Ala Pro Gly Gln Gln Ala His Ile 210 215 220 Lys Thr Thr Tyr Ser Cys
Ser Arg Ala Pro Val Thr Trp Ile Gly His 225 230 235 240 Gly Gly Gly
His Val Pro Asp Pro Thr Gly Asn Asn Gly Val Lys Phe 245 250 255 Ala
Pro Gln Glu Thr Trp Asp Phe Phe Asp Ala Ala Val Gly Ala Ala 260 265
Gly Ala Gln Ser Pro Met Thr 275
<210> 171 <211> 259 <212> PRT <213> Myceliophthora thermophila <400> 171
Ala Ser Ala Gly Cys Gly Lys Ala Pro Pro Ser Ser Gly
Thr Lys Ser 1 5 10 15 Met Thr Val Asn Gly Lys Gln Arg Gln Tyr Ile
Leu Gln Leu Pro Asn 20 25 30 Asn Tyr Asp Ala Asn Lys Ala His Arg
Val Val Ile Gly Tyr His Trp 35 40 45 Arg Asp Gly Ser Met Asn Asp
Val Ala Asn Gly Gly Phe Tyr Asp Leu 50 55 60 Arg Ser Arg Ala Gly
Asp Ser Thr Ile Phe Val Ala Pro Asn Gly Leu 65 70 75 80 Asn Ala Gly
Trp Ala Asn Val Gly Gly Glu Asp Ile Thr Phe Thr Asp 85 90 95 Gln
Ile Val Asp Met Leu Lys Asn Asp Leu Cys Val Asp Glu Thr Gln 100 105
110 Phe Phe Ala Thr Gly Trp Ser Tyr Gly Gly Ala Met Ser His Ser Val
115 120 125 Ala Cys Ser Arg Pro Asp Val Phe Lys Ala Val Ala Val Ile
Ala Gly 130 135 140 Ala Gln Leu Ser Gly Cys Ala Gly Gly Thr Thr Pro
Val Ala Tyr Leu 145 150 155 160 Gly Ile His Gly Ala Ala Asp Asn Val
Leu Pro Ile Asp Leu Gly Arg 165 170 175 Gln Leu Arg Asp Lys Trp Leu
Gln Thr Asn Gly Cys Asn Tyr Gln Gly 180 185 190 Ala Gln Asp Pro Ala
Pro Gly Gln Gln Ala His Ile Lys Thr Thr Tyr 195 200 205 Ser Cys Ser
Arg Ala Pro Val Thr Trp Ile Gly His Gly Gly Gly His 210 215 220 Val
Pro Asp Pro Thr Gly Asn Asn Gly Val Lys Phe Ala Pro Gln Glu 225 230
235 240 Thr Trp Asp Phe Phe Asp Ala Ala Val Gly Ala Ala Gly Ala Gln
Ser 245 250 255 Pro Met Thr
* * * * *