U.S. patent application number 14/816992 was filed with the patent office on 2015-08-03 and published on 2015-11-19 for variant CBH I polypeptides with reduced product inhibition.
This patent application is currently assigned to BP CORPORATION NORTH AMERICA INC. The applicant listed for this patent is BP Corporation North America Inc. Invention is credited to Shaun Healey, Peter Luginbuhl, Chris S. Lyon, John Poland, Justin T. Stege, Alexander Varvak.
Application Number | 20150329880 14/816992 |
Document ID | / |
Family ID | 45002108 |
Filed Date | 2015-08-03 |
United States Patent Application | 20150329880 |
Kind Code | A1 |
Stege; Justin T.; et al. | November 19, 2015 |
VARIANT CBH I POLYPEPTIDES WITH REDUCED PRODUCT INHIBITION
Abstract
The present disclosure relates to variant CBH I polypeptides
that have reduced product inhibition, and compositions, e.g.,
cellulase compositions, comprising variant CBH I polypeptides. The
variant CBH I polypeptides and related compositions can be used in
a variety of agricultural and industrial applications. The present
disclosure further relates to nucleic acids encoding variant CBH I
polypeptides and host cells that recombinantly express the variant
CBH I polypeptides.
Inventors: |
Stege; Justin T.; (San Diego, CA)
Varvak; Alexander; (Netanya, IL)
Poland; John; (San Diego, CA)
Lyon; Chris S.; (San Diego, CA)
Healey; Shaun; (San Diego, CA)
Luginbuhl; Peter; (San Diego, CA) |
Applicant: |
Name | City | State | Country | Type |
BP Corporation North America Inc. | Houston | TX | US | |
Assignee: | BP CORPORATION NORTH AMERICA INC. |
Family ID: | 45002108 |
Appl. No.: | 14/816992 |
Filed: | August 3, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13824317 | Dec 18, 2013 | 9096871 |
PCT/US2011/055181 | Oct 6, 2011 | |
14816992 | | |
61390392 | Oct 6, 2010 | |
Current U.S. Class: | 435/165; 435/209; 536/25.3 |
Current CPC Class: | C12P 7/14 20130101; Y02E 50/17 20130101; C12Y 302/01091 20130101; C12P 19/14 20130101; C12P 19/02 20130101; C12N 9/2437 20130101; C12P 7/10 20130101; Y02E 50/16 20130101; Y02E 50/10 20130101 |
International Class: | C12P 7/10 20060101 C12P007/10; C12N 9/42 20060101 C12N009/42 |
Claims
1. A polypeptide comprising a variant cellobiohydrolase I ("CBH I")
catalytic domain as compared to a reference CBH I catalytic domain,
comprising: (a) a substitution at the amino acid position
corresponding to R268 of T. reesei CBH I ("R268 substitution"); (b)
a substitution at the amino acid position corresponding to R411 of
T. reesei CBH I ("R411 substitution"); or (c) both an R268
substitution and an R411 substitution, wherein substitution (a),
(b) or (c) decreases product inhibition as compared to the
reference CBH I catalytic domain.
2. A method for producing ethanol, comprising: (a) treating biomass
with a composition according to any one of claims 37 to 43 or with
a fermentation broth according to claim 1, thereby producing
monosaccharides; and (b) culturing a fermenting microorganism in
the presence of the monosaccharides produced in step (a) under
fermentation conditions, thereby producing ethanol.
3. The method of claim 2, further comprising, prior to step (a),
pretreating the biomass.
4. The method of claim 2, wherein said fermenting microorganism is
a bacterium or a yeast.
5. The method of claim 4, wherein said fermenting microorganism is
a bacterium selected from Zymomonas mobilis, Escherichia coli and
Klebsiella oxytoca.
6. The method of claim 4, wherein said fermenting microorganism is
a yeast selected from Saccharomyces cerevisiae, Saccharomyces
uvarum, Kluyveromyces fragilis, Kluyveromyces lactis, Candida
pseudotropicalis, and Pachysolen tannophilus.
7. The method of claim 2, wherein said biomass is corn stover,
bagasses, sorghum, giant reed, elephant grass, miscanthus, Japanese
cedar, wheat straw, switchgrass, hardwood pulp, softwood pulp,
crushed sugar cane, energy cane, or Napier grass.
8. A method for generating a nucleic acid that encodes a product
tolerant variant CBH I polypeptide, comprising modifying the
nucleotide sequence of a CBH I-encoding nucleic acid so that the
nucleic acid encodes a variant CBH I polypeptide, wherein said
variant CBH I polypeptide comprises: (i) an R268 substitution; (ii)
an R411 substitution; or (iii) both an R268 substitution and an
R411 substitution, thereby generating a nucleic acid that encodes a
product tolerant variant CBH I polypeptide.
9. The method of claim 8, wherein the modification is by site
directed mutagenesis.
10. The method of claim 8, wherein variant CBH I polypeptide
comprises an R268 substitution.
11. The method of claim 10, wherein the R268 substituent is a
lysine.
12. The method of claim 10, wherein the R268 substituent is an
alanine.
13. The method of claim 8, which comprises an R411
substitution.
14. The method of claim 13, wherein the R411 substituent is a
lysine.
15. The method of claim 13, wherein the R411 substituent is an
alanine.
16. A method for producing ethanol, comprising: (a) treating
biomass with a fermentation broth according to claim 1, thereby
producing monosaccharides; and (b) culturing a fermenting
microorganism in the presence of the monosaccharides produced in
step (a) under fermentation conditions, thereby producing ethanol.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a divisional application of U.S.
application Ser. No. 13/824,317 filed Dec. 18, 2013, now issued as
U.S. Pat. No. 9,096,871; which is a 35 USC §371 National Stage
application of International Application No. PCT/US2011/055181
filed Oct. 6, 2011, now expired; which claims the benefit under 35
USC §119(e) to U.S. Application Ser. No. 61/390,392 filed Oct.
6, 2010, now expired. The disclosure of each of the prior
applications is considered part of and is incorporated by reference
in the disclosure of this application.
BACKGROUND OF THE INVENTION
[0002] Cellulose is an unbranched polymer of glucose linked by
β(1→4)-glycosidic bonds. Cellulose chains can interact
with each other via hydrogen bonding to form a crystalline solid of
high mechanical strength and chemical stability. The cellulose
chains are depolymerized into glucose and short oligosaccharides
before organisms, such as the fermenting microbes used in ethanol
production, can use them as metabolic fuel. Cellulase enzymes
catalyze the hydrolysis of the cellulose (hydrolysis of
β-1,4-D-glucan linkages) in the biomass into products such as
glucose, cellobiose, and other cellooligosaccharides. Cellulase is
a generic term denoting a multienzyme mixture comprising exo-acting
cellobiohydrolases (CBHs), endoglucanases (EGs) and
β-glucosidases (BGs) that can be produced by a number of
plants and microorganisms. Enzymes in the cellulase of Trichoderma
reesei include CBH I (more generally, Cel7A), CBH2 (Cel6A), EG1
(Cel7B), EG2 (Cel5), EG3 (Cel12), EG4 (Cel61A), EG5 (Cel45A), EG6
(Cel74A), Cip1, Cip2, β-glucosidases (including, e.g., Cel3A),
acetyl xylan esterase, β-mannanase, and swollenin.
[0003] Cellulase enzymes work synergistically to hydrolyze
cellulose to glucose. CBH I and CBH II act on opposing ends of
cellulose chains (Barr et al., 1996, Biochemistry 35:586-92), while
the endoglucanases act at internal locations in the cellulose. The
primary product of these enzymes is cellobiose, which is further
hydrolyzed to glucose by one or more β-glucosidases.
[0004] The cellobiohydrolases are subject to inhibition by their
direct product, cellobiose, which results in a slowing down of
saccharification reactions as product accumulates. There is a need
for new and improved cellobiohydrolases with improved productivity
that maintain their reaction rates during the course of a
saccharification reaction, for use in the conversion of cellulose
into fermentable sugars and for related fields of cellulosic
material processing such as pulp and paper, textiles and animal
feeds.
SUMMARY OF THE INVENTION
[0005] The present disclosure relates to variant CBH I
polypeptides. Most naturally occurring CBH I polypeptides have
arginines at positions corresponding to R268 and R411 of T. reesei
CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present
disclosure include a substitution at either or both positions
resulting in a reduction or decrease in product (e.g., cellobiose)
inhibition. Such variants are sometimes referred to herein as
"product tolerant."
[0006] The variant CBH I polypeptides of the disclosure minimally
contain at least a CBH I catalytic domain, comprising (a) a
substitution at the amino acid position corresponding to R268 of T.
reesei CBH I ("R268 substitution"); (b) a substitution at the amino
acid position corresponding to R411 of T. reesei CBH I ("R411
substitution"); or (c) both an R268 substitution and an R411
substitution. The amino acid positions of exemplary CBH I
polypeptides into which R268 and/or R411 substitutions can be
introduced are shown in Table 1, and the amino acid positions
corresponding to R268 and/or R411 in these exemplary CBH I
polypeptides are shown in Table 2.
[0007] R268 and/or R411 substituents can include lysines and/or
alanines. Accordingly, the present disclosure provides a variant CBH
I polypeptide comprising a CBH I catalytic domain with one of the
following amino acid substitutions or pairs of R268 and/or R411
substitutions: (a) R268K and R411K; (b) R268K and R411A; (c) R268A
and R411K; (d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A;
and (h) R411K. In some embodiments, however, the amino acid
sequence of the variant CBH I polypeptide does not comprise or
consist of SEQ ID NO:299, SEQ ID NO:300, SEQ ID NO:301, or SEQ ID
NO:302.
[0008] The variant CBH I polypeptides of the disclosure typically
include a CD comprising an amino acid sequence having at least 50%
sequence identity to a CD of a reference CBH I exemplified in Table
1. The CD portions of the CBH I polypeptides exemplified in Table 1
are delineated in Table 3. The variant CBH I polypeptides can have
a cellulose binding domain ("CBD") sequence in addition to the
catalytic domain ("CD") sequence. The CBD can be N- or C-terminal
to the CD, and the CBD and CD are optionally connected via a linker
sequence.
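The 50% sequence identity criterion can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the input sequences are assumed to be already aligned (with "-" marking gaps), and percent identity is taken as identical residues divided by the number of alignment columns, which is one of several conventions in use and is not specified by the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical residues divided by the number of
    alignment columns, ignoring columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column counts as a match only when both residues are identical;
    # double-gap columns were removed above, so '-' never matches '-'.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 50.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned fragments (hypothetical, not taken from Table 1).
identity = percent_identity("ACDE-G", "ACDQFG")
passes = meets_identity_threshold("ACDE-G", "ACDQFG")
```

With a different denominator (for example, the length of the shorter sequence) the same pair of sequences can give a different percentage, so the convention should be fixed before comparing a candidate CD against a reference CD.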
[0009] The variant CBH I polypeptides can be mature polypeptides or
they may further comprise a signal sequence.
[0010] Additional embodiments of the variant CBH I polypeptides are
provided in Section 0.
[0011] The variant CBH I polypeptides of the disclosure typically
exhibit reduced product inhibition by cellobiose. In certain
embodiments, the IC50 of cellobiose towards a variant CBH I
polypeptide of the disclosure is at least 1.2-fold, at least
1.5-fold, or at least 2-fold the IC50 of cellobiose towards a
reference CBH I lacking the R268 substitution and/or R411
substitution present in the variant. Additional embodiments of the
product inhibition characteristics of the variant CBH I
polypeptides are provided in Section 0.
[0012] The variant CBH I polypeptides of the disclosure typically
retain some cellobiohydrolase activity. In certain embodiments, a
variant CBH I polypeptide retains at least 50% the CBH I activity
of a reference CBH I lacking the R268 substitution and/or R411
substitution present in the variant. Additional embodiments of
cellobiohydrolase activity of the variant CBH I polypeptides are
provided in Section 0.
[0013] The present disclosure further provides compositions
(including cellulase compositions, e.g., whole cellulase
compositions, and fermentation broths) comprising variant CBH I
polypeptides. Additional embodiments of compositions comprising
variant CBH I polypeptides are provided in Section 0. The variant
CBH I polypeptides and compositions comprising them can be used,
inter alia, in processes for saccharifying biomass. Additional
details of saccharification reactions, and additional applications
of the variant CBH I polypeptides, are provided in Section 0.
[0014] The present disclosure further provides nucleic acids (e.g.,
vectors) comprising nucleotide sequences encoding variant CBH I
polypeptides as described herein, and recombinant cells engineered
to express the variant CBH I polypeptides. The recombinant cell can
be a prokaryotic (e.g., bacterial) or eukaryotic (e.g., yeast or
filamentous fungal) cell. Further provided are methods of producing
and optionally recovering the variant CBH I polypeptides.
Additional embodiments of the recombinant expression system
suitable for expression and production of the variant CBH I
polypeptides are provided in Section 0.
BRIEF DESCRIPTION OF THE DRAWINGS AND TABLES
[0015] FIGS. 1A-1B: Cellobiose dose-response curves using a 4-MUL
assay for a wild-type CBH I (BD29555; FIG. 1A) and a R268K/R411K
variant CBH I (BD29555 with the substitutions R273K/R422K; FIG.
1B).
[0016] FIGS. 2A-2B: The effect of cellobiose accumulation on the
activity of wild-type CBH I and a R268K/R411K variant CBH I, based
on percent conversion of glucan after 72 hours in the bagasse
assay. FIG. 2A shows relative activity in the presence (+) and
absence (-) of β-glucosidase (BG), where relative activity is
normalized to wild type activity with BG (WT+=1). FIG. 2B shows
tolerance to cellobiose as a function of the ratio of activity in
the absence vs. presence of β-glucosidase (activity
ratio=Activity -BG/Activity +BG).
[0017] FIG. 3: Cellobiose dose-response curves using PASC assay for
a R268K/R411K variant CBH I polypeptide as compared to two wild
type CBH I polypeptides.
[0018] FIG. 4: The effect of cellobiose accumulation on the
activity of a wild-type CBH I and a R268K/R411K variant CBH I based
on percent conversion of glucan after 72 hours in the bagasse assay
in the presence (+) and absence (-) of β-glucosidase (BG).
Activity is normalized to wild type activity with BG (WT+=1).
[0019] FIG. 5: Characterization of cellobiose product tolerance of
variant CBH I polypeptides, based on percent conversion of glucan
after 72 hours in the absence and presence of β-glucosidase
(BG) in the bagasse assay; tolerance is evaluated as a function of
the ratio of activity in the absence vs. presence of
β-glucosidase.
[0020] TABLE 1: Amino acid sequences of exemplary "reference" CBH I
polypeptides that can be modified at positions corresponding to
R268 and/or R411 in T. reesei CBH I (SEQ ID NO:2). The database
accession numbers are indicated in the second column. Unless
indicated otherwise, the accession numbers refer to the Genbank
database. "#" indicates that the CBH I has no signal peptide;
"&" indicates that the sequence is from the PDB database and
represents the catalytic domain only without signal sequence; *
indicates a nonpublic database. These amino acid sequences are
mostly wild type, with the exception of some sequences from the PDB
database which contain mutations to facilitate protein
crystallization.
[0021] TABLE 2: Amino acid positions in the exemplary reference CBH
I polypeptides that correspond to R268 and R411 in T. reesei CBH I.
Database descriptors are as for Table 1.
[0022] TABLE 3: Approximate amino acid positions of CBH I
polypeptide domains. Abbreviations used: SS is signal sequence; CD
is catalytic domain; and CBD is cellulose binding domain. Database
descriptors are as for Table 1.
[0023] TABLE 4: Table 4 shows a segment within the catalytic domain
of each exemplary reference CBH I polypeptide containing the active
site loop (shown in bold, underlined text) and the catalytic
residues (glutamates in most CBH I polypeptides) (shown in bold,
double underlined text). Database descriptors are as for Table
1.
[0024] TABLE 5: MUL and bagasse assay results for variants of
BD29555. ND means not determined. ±% Activity
(+/-cellobiose)=[(Activity with cellobiose)/(Activity without
cellobiose)]*100. % Activity (-/+BG)=[(Activity without
BG)/(Activity with BG)]*100.
[0025] TABLE 6: MUL and bagasse assay results for variants of T.
reesei CBH I. ND means not determined. ±% Activity
(+/-cellobiose)=[(Activity with cellobiose)/(Activity without
cellobiose)]*100. % Activity (-/+BG)=[(Activity without
BG)/(Activity with BG)]*100.
[0026] TABLE 7: Informal sequence listing. SEQ ID NO:1-149
correspond to the exemplary reference CBH I polypeptides. SEQ ID
NO:299 corresponds to mature T. reesei CBH I (amino acids 26-529 of
SEQ ID NO:2) with an R268A substitution. SEQ ID NO:300 corresponds
to mature T. reesei CBH I (amino acids 26-529 of SEQ ID NO:2) with
an R411A substitution. SEQ ID NO:301 corresponds to full length
BD29555 with both an R268K substitution and an R411K substitution.
SEQ ID NO:302 corresponds to mature BD29555 with both an R268K
substitution and an R411K substitution.
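The two percent-activity ratios defined for Tables 5 and 6 can be written out as a short calculation. The sketch below is illustrative only: the function names are hypothetical and the activity values are made-up placeholders, not results from the tables.

```python
def percent_activity(numerator: float, denominator: float) -> float:
    """Generic ratio used by both table columns: (numerator / denominator) * 100."""
    if denominator <= 0:
        raise ValueError("reference activity must be positive")
    return numerator / denominator * 100.0


def percent_activity_with_cellobiose(with_cellobiose: float,
                                     without_cellobiose: float) -> float:
    """% Activity (+/-cellobiose) = (activity with cellobiose / activity without) * 100."""
    return percent_activity(with_cellobiose, without_cellobiose)


def percent_activity_without_bg(without_bg: float, with_bg: float) -> float:
    """% Activity (-/+BG) = (activity without BG / activity with BG) * 100."""
    return percent_activity(without_bg, with_bg)


# Hypothetical measurements in arbitrary units (placeholders only).
mul_ratio = percent_activity_with_cellobiose(with_cellobiose=12.0,
                                             without_cellobiose=48.0)
bagasse_ratio = percent_activity_without_bg(without_bg=30.0, with_bg=40.0)
```

In both ratios a higher percentage means that less activity is lost when cellobiose is present (added directly, or allowed to accumulate because β-glucosidase is absent), which is how product tolerance is read from these columns.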
DETAILED DESCRIPTION OF THE INVENTION
[0027] The present disclosure relates to variant CBH I
polypeptides. Most naturally occurring CBH I polypeptides have
arginines at positions corresponding to R268 and R411 of T. reesei
CBH I (SEQ ID NO:2). The variant CBH I polypeptides of the present
disclosure include a substitution at either or both positions
resulting in a reduction of product (e.g., cellobiose) inhibition.
The following subsections describe in greater detail the variant
CBH I polypeptides and exemplary methods of their production,
exemplary cellulase compositions comprising them, and some
industrial applications of the polypeptides and cellulase
compositions.
Variant CBH I Polypeptides
[0028] The present disclosure provides variant CBH I polypeptides
comprising at least one amino acid substitution that results in
reduced product inhibition. "Variant" means a polypeptide which
differs in sequence from a reference polypeptide by substitution of
one or more amino acids at one or a number of different sites in
the amino acid sequence. Exemplary reference CBH I polypeptides are
shown in Table 1.
[0029] The variant CBH I polypeptides of the disclosure have (a) an
amino acid substitution at the amino acid position corresponding to
R268 of T. reesei CBH I (SEQ ID NO:2) (an "R268 substitution"); (b)
a substitution at the amino acid position corresponding to R411 of
T. reesei CBH I ("R411 substitution"); or (c) both an R268
substitution and an R411 substitution, as compared to a reference
CBH I polypeptide. It is noted that the R268 and R411 numbering is
made by reference to the full length T. reesei CBH I, which
includes a signal sequence that is generally absent from the mature
enzyme. The corresponding numbering in the mature T. reesei CBH I
(see, e.g., SEQ ID NO:4) is R251 and R394, respectively.
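The relationship between full-length and mature numbering is a fixed offset. The sketch below assumes a 17-residue signal sequence; that value is inferred here from the R411/R394 pair quoted above (411 - 394 = 17) and is not stated explicitly in the text. The function names are hypothetical.

```python
# Assumed signal-sequence length for T. reesei CBH I, inferred from the
# full-length/mature position pair R411/R394 (411 - 394 = 17).
SIGNAL_SEQUENCE_LENGTH = 17


def full_length_to_mature(position: int) -> int:
    """Convert a 1-based full-length position to mature-enzyme numbering."""
    if position <= SIGNAL_SEQUENCE_LENGTH:
        raise ValueError("position lies within the signal sequence")
    return position - SIGNAL_SEQUENCE_LENGTH


def mature_to_full_length(position: int) -> int:
    """Convert a 1-based mature-enzyme position to full-length numbering."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return position + SIGNAL_SEQUENCE_LENGTH


mature_positions = {p: full_length_to_mature(p) for p in (268, 411)}
```

Under this assumption, full-length positions 268 and 411 map to mature positions 251 and 394. The offset applies only to T. reesei CBH I; other reference polypeptides have their own signal-sequence lengths.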
[0030] Accordingly, the present disclosure provides variant CBH I
polypeptides in which at least one of the amino acid positions
corresponding to R268 and R411 of T. reesei CBH I, and optionally
both the amino acid positions corresponding to R268 and R411 of T.
reesei CBH I, is not an arginine.
[0031] The amino acid positions in the reference polypeptides of
Table 1 that correspond to R268 and R411 in T. reesei CBH I are
shown in Table 2. Amino acid positions in other CBH I polypeptides
that correspond to R268 and R411 can be identified through
alignment of their sequences with T. reesei CBH I using a sequence
comparison algorithm. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith
& Waterman, 1981, Adv. Appl. Math. 2:482-89; by the homology
alignment algorithm of Needleman & Wunsch, 1970, J. Mol. Biol.
48:443-53; by the search for similarity method of Pearson &
Lipman, 1988, Proc. Nat'l Acad. Sci. USA 85:2444-48, by
computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by
visual inspection.
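The idea of locating a "corresponding" position through alignment can be sketched in a few lines. The example below is a simplified illustration, not the procedure used in the disclosure: it implements a basic Needleman-Wunsch global alignment with assumed unit scores (match +1, mismatch -1, gap -1) instead of a protein substitution matrix, uses short made-up sequences, and all function names are hypothetical.

```python
def global_align(ref: str, query: str, match: int = 1,
                 mismatch: int = -1, gap: int = -1) -> tuple[str, str]:
    """Needleman-Wunsch global alignment with simple linear gap scoring."""
    n, m = len(ref), len(query)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if ref[i - 1] == query[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_ref, aligned_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and ref[i - 1] == query[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            aligned_ref.append(ref[i - 1])
            aligned_query.append(query[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_query))


def corresponding_position(ref: str, query: str, ref_pos: int):
    """Return the 1-based query position aligned to ref_pos, or None if gapped."""
    aligned_ref, aligned_query = global_align(ref, query)
    ref_index = query_index = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            ref_index += 1
        if q != "-":
            query_index += 1
        if r != "-" and ref_index == ref_pos:
            return query_index if q != "-" else None
    raise ValueError("ref_pos is outside the reference sequence")


# Toy sequences (hypothetical): the arginine at reference position 5
# corresponds to position 3 of the shorter query.
toy_reference = "MKTARGLW"
toy_query = "TARGLWQ"
mapped = corresponding_position(toy_reference, toy_query, 5)
```

For real CBH I sequences an established implementation with a protein scoring matrix and affine gap penalties would normally be preferred; the sketch only shows how an alignment turns a reference position such as R268 into a position in another polypeptide.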
[0032] The R268 and/or R411 substitutions are preferably selected
from (a) R268K and R411K; (b) R268K and R411A; (c) R268A and R411K;
(d) R268A and R411A; (e) R268A; (f) R268K; (g) R411A; and (h)
R411K.
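The eight preferred substitution options can be enumerated and applied programmatically. The sketch below is a minimal illustration: the helper names are hypothetical, and the sequence is an artificial placeholder with arginines placed at positions 268 and 411, not SEQ ID NO:2 or any other sequence from the disclosure.

```python
import re

# The eight options (a) through (h) listed in the preceding paragraph.
PREFERRED_COMBINATIONS = [
    ("R268K", "R411K"), ("R268K", "R411A"),
    ("R268A", "R411K"), ("R268A", "R411A"),
    ("R268A",), ("R268K",), ("R411A",), ("R411K",),
]


def apply_substitution(sequence: str, code: str) -> str:
    """Apply one substitution such as 'R268K' (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognized substitution code: {code}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {position}, "
                         f"found {sequence[position - 1]}")
    return sequence[:position - 1] + replacement + sequence[position:]


def make_variant(sequence: str, codes: tuple) -> str:
    """Apply each substitution code in turn and return the variant sequence."""
    for code in codes:
        sequence = apply_substitution(sequence, code)
    return sequence


# Placeholder sequence: glycines with arginines at positions 268 and 411.
residues = ["G"] * 420
residues[267] = "R"
residues[410] = "R"
placeholder = "".join(residues)

variants = {"/".join(codes): make_variant(placeholder, codes)
            for codes in PREFERRED_COMBINATIONS}
```

The check that the wild-type residue is actually present guards against applying T. reesei numbering to a different reference polypeptide, where the corresponding positions differ (see Table 2).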
[0033] CBH I polypeptides belong to the glycosyl hydrolase family 7
("GH7"). The glycosyl hydrolases of this family include
endoglucanases and cellobiohydrolases (exoglucanases). The
cellobiohydrolases act processively from the reducing ends of
cellulose chains to generate cellobiose. Cellulases of bacterial
and fungal origin characteristically have a small cellulose-binding
domain ("CBD") connected to either the N or the C terminus of the
catalytic domain ("CD") via a linker peptide (see Suurnäkki et al.,
2000, Cellulose 7: 189-209). The CD contains the active site
whereas the CBD interacts with cellulose by binding the enzyme to
it (van Tilbeurgh et al., 1986, FEBS Lett. 204(2): 223-227; Tomme
et al., 1988, Eur. J. Biochem. 170:575-581). The three-dimensional
structure of the catalytic domain of T. reesei CBH I has been
solved (Divne et al., 1994, Science 265:524-528). The CD consists
of two β-sheets that pack face-to-face to form a
β-sandwich. Most of the remaining amino acids in the CD are
loops connecting the β-sheets. Some loops are elongated and
bend around the active site, forming a cellulose-binding tunnel of
~50 Å. In contrast, endoglucanases have an open
substrate binding cleft/groove rather than a tunnel. Typically, the
catalytic residues are glutamic acids corresponding to E229 and
E234 of T. reesei CBH I.
[0034] The loops characteristic of the active sites ("the active
site loops") of reference CBH I polypeptides, which are absent from
GH7 family endoglucanases, as well as catalytic glutamate residues
of the reference CBH I polypeptides, are shown in Table 4. The
variant CBH I polypeptides of the disclosure preferably retain the
catalytic glutamate residues or may include a glutamine instead at
the position corresponding to E234, as for SEQ ID NO:4. In some
embodiments, the variant CBH I polypeptides contain no
substitutions or only conservative substitutions in the active site
loops relative to the reference CBH I polypeptides from which the
variants are derived.
[0035] Many CBH I polypeptides do not have a CBD, and most studies
concerning the activity of cellulase domains on different
substrates have been carried out with only the catalytic domains of
CBH I polypeptides. Because CDs with cellobiohydrolase activity can
be generated by limited proteolysis of mature CBH I by papain (see,
e.g., Chen et al., 1993, Biochem. Mol. Biol. Int. 30(5):901-10),
they are often referred to as "core" domains. Accordingly, a
variant CBH I can include only the CD "core" of CBH I. Exemplary
reference CDs comprise amino acid sequences corresponding to
positions 26 to 455 of SEQ ID NO:1, positions 18 to 444 of SEQ ID
NO:2, positions 26 to 455 of SEQ ID NO:3, positions 1 to 427 of SEQ
ID NO:4, positions 24 to 457 of SEQ ID NO:5, positions 18 to 448 of
SEQ ID NO:6, positions 27 to 460 of SEQ ID NO:7, positions 27 to
460 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1
to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11,
positions 18 to 434 of SEQ ID NO:12, positions 18 to 445 of SEQ ID
NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 443 of
SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to
446 of SEQ ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions
23 to 446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20,
positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID
NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to 447 of
SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to
447 of SEQ ID NO:26, positions 19 to 442 of SEQ ID NO:27, positions
18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29,
positions 18 to 444 of SEQ ID NO:30, positions 18 to 451 of SEQ ID
NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of
SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to
459 of SEQ ID NO:35, positions 19 to 450 of SEQ ID NO:36, positions
19 to 453 of SEQ ID NO:37, positions 18 to 448 of SEQ ID NO:38,
positions 19 to 443 of SEQ ID NO:39, positions 19 to 442 of SEQ ID
NO:40, positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of
SEQ ID NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to
453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45, positions
19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47,
positions 18 to 448 of SEQ ID NO:48, positions 19 to 451 of SEQ ID
NO:49, positions 18 to 444 of SEQ ID NO:50, positions 2 to 419 of
SEQ ID NO:51, positions 27 to 461 of SEQ ID NO:52, positions 21 to
445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions
19 to 448 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56,
positions 20 to 443 of SEQ ID NO:57, positions 18 to 448 of SEQ ID
NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 455 of
SEQ ID NO:60, positions 19 to 449 of SEQ ID NO:61, positions 19 to
449 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions
18 to 448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65,
positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ ID
NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to 443 of
SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to
448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions
18 to 444 of SEQ ID NO:73, positions 23 to 458 of SEQ ID NO:74,
positions 20 to 452 of SEQ ID NO:75, positions 18 to 435 of SEQ ID
NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 457 of
SEQ ID NO:78, positions 18 to 448 of SEQ ID NO:79, positions 1 to
431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions
21 to 440 of SEQ ID NO:82, positions 19 to 442 of SEQ ID NO:83,
positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID
NO:85, positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of
SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to
451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions
18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ ID NO:92,
positions 20 to 436 of SEQ ID NO:93, positions 18 to 450 of SEQ ID
NO:94, positions 22 to 453 of SEQ ID NO:95, positions 16 to 472 of
SEQ ID NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to
447 of SEQ ID NO:98, positions 19 to 450 of SEQ ID NO:99, positions
19 to 451 of SEQ ID NO:100, positions 18 to 448 of SEQ ID NO:101,
positions 19 to 442 of SEQ ID NO:102, positions 20 to 457 of SEQ ID
NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 440
of SEQ ID NO:105, positions 18 to 439 of SEQ ID NO:106, positions
27 to 460 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108,
positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID
NO:110, positions 19 to 447 of SEQ ID NO:111, positions 18 to 449
of SEQ ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions
18 to 445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115,
positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ ID
NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435
of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions
21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122,
positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID
NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437
of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions
23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129,
positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID
NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435
of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions
23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136,
positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID
NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437
of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions
20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143,
positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID
NO:145, positions 24 to 443 of SEQ ID NO:146, positions 20 to 445
of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and
positions 20 to 437 of SEQ ID NO:149.
[0036] The CBDs are particularly involved in the hydrolysis of
crystalline cellulose. It has been shown that the ability of
cellobiohydrolases to degrade crystalline cellulose decreases when
the CBD is absent (Linder and Teeri, 1997, Journal of Biotechnol.
57:15-28). The variant CBH I polypeptides of the disclosure can
further include a CBD. Exemplary CBDs comprise amino acid sequences
corresponding to positions 494 to 529 of SEQ ID NO:1, positions 480
to 514 of SEQ ID NO:2, positions 494 to 529 of SEQ ID NO:3,
positions 491 to 526 of SEQ ID NO:5, positions 477 to 512 of SEQ ID
NO:6, positions 497 to 532 of SEQ ID NO:7, positions 504 to 539 of
SEQ ID NO:8, positions 486 to 521 of SEQ ID NO:13, positions 556 to
596 of SEQ ID NO:15, positions 490 to 525 of SEQ ID NO:18,
positions 495 to 530 of SEQ ID NO:20, positions 471 to 506 of SEQ
ID NO:23, positions 481 to 516 of SEQ ID NO:27, positions 480 to
514 of SEQ ID NO:30, positions 495 to 529 of SEQ ID NO:35,
positions 493 to 528 of SEQ ID NO:36, positions 477 to 512 of SEQ
ID NO:38, positions 547 to 586 of SEQ ID NO:39, positions 475 to
510 of SEQ ID NO:40, positions 479 to 513 of SEQ ID NO:41,
positions 506 to 541 of SEQ ID NO:42, positions 481 to 516 of SEQ
ID NO:43, positions 503 to 537 of SEQ ID NO:45, positions 488 to
523 of SEQ ID NO:46, positions 476 to 511 of SEQ ID NO:48,
positions 488 to 523 of SEQ ID NO:49, positions 479 to 513 of SEQ
ID NO:50, positions 500 to 535 of SEQ ID NO:52, positions 493 to
528 of SEQ ID NO:55, positions 479 to 514 of SEQ ID NO:58,
positions 494 to 529 of SEQ ID NO:60, positions 490 to 525 of SEQ
ID NO:61, positions 497 to 532 of SEQ ID NO:62, positions 475 to
510 of SEQ ID NO:64, positions 477 to 512 of SEQ ID NO:65,
positions 486 to 521 of SEQ ID NO:66, positions 470 to 505 of SEQ
ID NO:67, positions 491 to 526 of SEQ ID NO:68, positions 476 to
511 of SEQ ID NO:69, positions 480 to 514 of SEQ ID NO:73,
positions 506 to 540 of SEQ ID NO:74, positions 471 to 504 of SEQ
ID NO:76, positions 501 to 536 of SEQ ID NO:78, positions 473 to
508 of SEQ ID NO:79, positions 481 to 516 of SEQ ID NO:83,
positions 488 to 523 of SEQ ID NO:86, positions 475 to 510 of SEQ
ID NO:92, positions 468 to 504 of SEQ ID NO:93, positions 501 to
536 of SEQ ID NO:96, positions 482 to 517 of SEQ ID NO:98,
positions 481 to 516 of SEQ ID NO:99, positions 488 to 523 of SEQ
ID NO:100, positions 472 to 507 of SEQ ID NO:101, positions 481 to
516 of SEQ ID NO:102, positions 471 to 505 of SEQ ID NO:105,
positions 481 to 516 of SEQ ID NO:106, positions 495 to 530 of SEQ
ID NO:107, positions 488 to 523 of SEQ ID NO:111, positions 478 to
513 of SEQ ID NO:112, positions 501 to 536 of SEQ ID NO:113,
positions 491 to 526 of SEQ ID NO:115, and positions 503 to 538 of
SEQ ID NO:116.
[0037] The CD and CBD are often connected via a linker. Exemplary
linker sequences correspond to positions 456 to 493 of SEQ ID NO:1,
positions 445 to 479 of SEQ ID NO:2, positions 456 to 493 of SEQ ID
NO:3, positions 458 to 490 of SEQ ID NO:5, positions 449 to 476 of
SEQ ID NO:6, positions 461 to 496 of SEQ ID NO:7, positions 461 to
503 of SEQ ID NO:8, positions 446 to 485 of SEQ ID NO:13, positions
444 to 555 of SEQ ID NO:15, positions 450 to 489 of SEQ ID NO:18,
positions 450 to 494 of SEQ ID NO:20, positions 448 to 470 of SEQ
ID NO:23, positions 443 to 480 of SEQ ID NO:27, positions 445 to
479 of SEQ ID NO:30, positions 460 to 494 of SEQ ID NO:35,
positions 451 to 492 of SEQ ID NO:36, positions 449 to 476 of SEQ
ID NO:38, positions 444 to 546 of SEQ ID NO:39, positions 443 to
474 of SEQ ID NO:40, positions 445 to 478 of SEQ ID NO:41,
positions 458 to 505 of SEQ ID NO:42, positions 450 to 480 of SEQ
ID NO:43, positions 457 to 502 of SEQ ID NO:45, positions 452 to
487 of SEQ ID NO:46, positions 449 to 475 of SEQ ID NO:48,
positions 452 to 487 of SEQ ID NO:49, positions 445 to 478 of SEQ
ID NO:50, positions 462 to 499 of SEQ ID NO:52, positions 449 to
492 of SEQ ID NO:55, positions 449 to 478 of SEQ ID NO:58,
positions 456 to 493 of SEQ ID NO:60, positions 450 to 489 of SEQ
ID NO:61, positions 450 to 496 of SEQ ID NO:62, positions 449 to
474 of SEQ ID NO:64, positions 452 to 476 of SEQ ID NO:65,
positions 448 to 485 of SEQ ID NO:66, positions 425 to 469 of SEQ
ID NO:67, positions 449 to 490 of SEQ ID NO:68, positions 444 to
475 of SEQ ID NO:69, positions 445 to 479 of SEQ ID NO:73,
positions 459 to 505 of SEQ ID NO:74, positions 436 to 470 of SEQ
ID NO:76, positions 458 to 500 of SEQ ID NO:78, positions 449 to
472 of SEQ ID NO:79, positions 443 to 480 of SEQ ID NO:83,
positions 448 to 487 of SEQ ID NO:86, positions 443 to 474 of SEQ
ID NO:92, positions 437 to 467 of SEQ ID NO:93, positions 473 to
500 of SEQ ID NO:96, positions 448 to 481 of SEQ ID NO:98,
positions 451 to 480 of SEQ ID NO:99, positions 452 to 487 of SEQ
ID NO:100, positions 449 to 471 of SEQ ID NO:101, positions 443 to
480 of SEQ ID NO:102, positions 441 to 470 of SEQ ID NO:105,
positions 440 to 480 of SEQ ID NO:106, positions 461 to 494 of SEQ
ID NO:107, positions 448 to 487 of SEQ ID NO:111, positions 450 to
478 of SEQ ID NO:112, positions 458 to 500 of SEQ ID NO:113,
positions 449 to 490 of SEQ ID NO:115, and positions 449 to 502 of
SEQ ID NO:116.
[0038] Because CBH I polypeptides are modular, the CBDs, CDs and
linkers of different CBH I polypeptides, such as the exemplary CBH
I polypeptides of Table 1, can be used interchangeably. However, in
a preferred embodiment, the CBDs, CDs and linkers of a variant CBH
I of the disclosure originate from the same polypeptide.
[0039] The variant CBH I polypeptides of the disclosure preferably
have at least a two-fold reduction of product inhibition, such that
cellobiose has an IC.sub.50 towards the variant CBH I that is at
least 2-fold the IC.sub.50 of the corresponding reference CBH I,
e.g., CBH I lacking the R268 substitution and/or R411 substitution.
More preferably the IC.sub.50 of cellobiose towards the variant CBH
I is at least 3-fold, at least 5-fold, at least 8-fold, at least
10-fold, at least 12-fold or at least 15-fold the IC.sub.50 of the
corresponding reference CBH I. In specific embodiments the
IC.sub.50 of cellobiose towards the variant CBH I ranges from
2-fold to 15-fold, from 2-fold to 10-fold, from 3-fold to 10-fold,
from 5-fold to 12-fold, from 4-fold to 12-fold, from 5-fold to
10-fold, from 2-fold to 8-fold, or from
8-fold to 20-fold the IC.sub.50 of the corresponding reference CBH
I. The IC.sub.50 can be determined in a phosphoric acid swollen
cellulose ("PASC") assay (Du et al., 2010, Applied Biochemistry and
Biotechnology 161:313-317) or a methylumbelliferyl lactoside
("MUL") assay (van Tilbeurgh and Claeyssens, 1985, FEBS Letts.
187(2):283-288), as exemplified in the Examples below.
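By way of illustration only, the fold comparison described above can be sketched in Python. The sketch estimates an IC.sub.50 from a dose-response series and expresses the variant value as a multiple of the reference value. The function name, the log-linear interpolation between the two bracketing concentrations, and all data points are hypothetical assumptions; they are not taken from the PASC or MUL assays cited above, and a fitted inhibition curve could be used instead.

```python
import math


def estimate_ic50(concentrations, relative_activities):
    """Estimate the inhibitor concentration giving 50% residual activity.

    Interpolates linearly on log10(concentration) between the two tested
    concentrations that bracket 50% activity. Concentrations must be > 0.
    """
    pairs = sorted(zip(concentrations, relative_activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pairs, pairs[1:]):
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            frac = (a_lo - 0.5) / (a_lo - a_hi)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("activity does not cross 50% within the tested range")


# Hypothetical data: cellobiose in mM, activity relative to no inhibitor.
cellobiose_mm = [1.0, 10.0, 100.0]
reference_activity = [0.90, 0.50, 0.10]
variant_activity = [0.95, 0.80, 0.20]

ic50_reference = estimate_ic50(cellobiose_mm, reference_activity)
ic50_variant = estimate_ic50(cellobiose_mm, variant_activity)

# Fold value as used in the text: variant IC50 divided by reference IC50.
fold = ic50_variant / ic50_reference
meets_two_fold = fold >= 2.0
```

With these invented numbers the variant tolerates roughly three times more cellobiose than the reference before losing half its activity, which would satisfy the "at least 2-fold" criterion.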
[0040] The variant CBH I polypeptides of the disclosure preferably
have a cellobiohydrolase activity that is at least 30% the
cellobiohydrolase activity of the corresponding reference CBH I,
e.g., CBH I lacking the R268 substitution and/or R411 substitution.
More preferably, the cellobiohydrolase activity of the variant CBH
I is at least 40%, at least 50%, at least 60% or at least 70% the
cellobiohydrolase activity of the corresponding reference CBH I. In
specific embodiments the cellobiohydrolase activity of the variant
CBH I ranges from 30% to 80%, from 40% to 70%, from 30% to 60%,
from 50% to 80% or from 60% to 80% of the cellobiohydrolase
activity of the corresponding reference CBH I. Assays for
cellobiohydrolase activity are described, for example, in Becker et
al., 2001, Biochem J. 356:19-30 and Mitsuishi et al., 1990, FEBS
Letts. 275:135-138, each of which is expressly incorporated by
reference herein. The ability of CBH I to hydrolyze isolated
soluble and insoluble substrates can also be measured using assays
described in Srisodsuk et al., 1997, J. Biotech. 57:49-57 and
Nidetzky and Claeyssens, 1994, Biotech. Bioeng. 44:961-966.
Substrates useful for assaying cellobiohydrolase activity include
crystalline cellulose, filter paper, phosphoric acid swollen
cellulose, cellooligosaccharides, methylumbelliferyl lactoside,
methylumbelliferyl cellobioside, orthonitrophenyl lactoside,
paranitrophenyl lactoside, orthonitrophenyl cellobioside, and
paranitrophenyl cellobioside. Cellobiohydrolase activity can be
measured in an assay utilizing PASC as the substrate and a
calcofluor white detection method (Du et al., 2010, Applied
Biochemistry and Biotechnology 161:313-317). PASC can be prepared
as described by Walseth, 1952, TAPPI 35:228-235 and Wood, 1971,
Biochem. J. 121:353-362.
[0041] Other than said R268 and/or R411 substitution, the variant
CBH I polypeptides of the disclosure preferably:
[0042] comprise an
amino acid sequence having at least 50%, 51%, 52%, 53%, 54%, 55%,
56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%,
69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%,
82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99%, or more, or complete (100%) sequence
identity to a CD of a reference CBH I exemplified in Table 1 (i.e.,
a CD comprising an amino acid sequence corresponding to positions
26 to 455 of SEQ ID NO:1, positions 18 to 444 of SEQ ID NO:2,
positions 26 to 455 of SEQ ID NO:3, positions 1 to 427 of SEQ ID
NO:4, positions 24 to 457 of SEQ ID NO:5, positions 18 to 448 of
SEQ ID NO:6, positions 27 to 460 of SEQ ID NO:7, positions 27 to
460 of SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1
to 424 of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11,
positions 18 to 434 of SEQ ID NO:12, positions 18 to 445 of SEQ ID
NO:13, positions 19 to 454 of SEQ ID NO:14, positions 19 to 443 of
SEQ ID NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to
446 of SEQ ID NO:17, positions 19 to 449 of SEQ ID NO:18, positions
23 to 446 of SEQ ID NO:19, positions 19 to 449 of SEQ ID NO:20,
positions 2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID
NO:22, positions 19 to 447 of SEQ ID NO:23, positions 19 to 447 of
SEQ ID NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to
447 of SEQ ID NO:26, positions 19 to 442 of SEQ ID NO:27, positions
18 to 451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29,
positions 18 to 444 of SEQ ID NO:30, positions 18 to 451 of SEQ ID
NO:31, positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of
SEQ ID NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to
459 of SEQ ID NO:35, positions 19 to 450 of SEQ ID NO:36, positions
19 to 453 of SEQ ID NO:37, positions 18 to 448 of SEQ ID NO:38,
positions 19 to 443 of SEQ ID NO:39, positions 19 to 442 of SEQ ID
NO:40, positions 18 to 444 of SEQ ID NO:41, positions 24 to 457 of
SEQ ID NO:42, positions 18 to 449 of SEQ ID NO:43, positions 19 to
453 of SEQ ID NO:44, positions 26 to 456 of SEQ ID NO:45, positions
19 to 451 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47,
positions 18 to 448 of SEQ ID NO:48, positions 19 to 451 of SEQ ID
NO:49, positions 18 to 444 of SEQ ID NO:50, positions 2 to 419 of
SEQ ID NO:51, positions 27 to 461 of SEQ ID NO:52, positions 21 to
445 of SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions
19 to 448 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56,
positions 20 to 443 of SEQ ID NO:57, positions 18 to 448 of SEQ ID
NO:58, positions 18 to 447 of SEQ ID NO:59, positions 26 to 455 of
SEQ ID NO:60, positions 19 to 449 of SEQ ID NO:61, positions 19 to
449 of SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions
18 to 448 of SEQ ID NO:64, positions 19 to 451 of SEQ ID NO:65,
positions 19 to 447 of SEQ ID NO:66, positions 1 to 424 of SEQ ID
NO:67, positions 19 to 448 of SEQ ID NO:68, positions 19 to 443 of
SEQ ID NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to
448 of SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions
18 to 444 of SEQ ID NO:73, positions 23 to 458 of SEQ ID NO:74,
positions 20 to 452 of SEQ ID NO:75, positions 18 to 435 of SEQ ID
NO:76, positions 18 to 446 of SEQ ID NO:77, positions 22 to 457 of
SEQ ID NO:78, positions 18 to 448 of SEQ ID NO:79, positions 1 to
431 of SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions
21 to 440 of SEQ ID NO:82, positions 19 to 442 of SEQ ID NO:83,
positions 18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID
NO:85, positions 18 to 447 of SEQ ID NO:86, positions 18 to 443 of
SEQ ID NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to
451 of SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions
18 to 444 of SEQ ID NO:91, positions 19 to 442 of SEQ ID NO:92,
positions 20 to 436 of SEQ ID NO:93, positions 18 to 450 of SEQ ID
NO:94, positions 22 to 453 of SEQ ID NO:95, positions 16 to 472 of
SEQ ID NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to
447 of SEQ ID NO:98, positions 19 to 450 of SEQ ID NO:99, positions
19 to 451 of SEQ ID NO:100, positions 18 to 448 of SEQ ID NO:101,
positions 19 to 442 of SEQ ID NO:102, positions 20 to 457 of SEQ ID
NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 440
of SEQ ID NO:105, positions 18 to 439 of SEQ ID NO:106, positions
27 to 460 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108,
positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID
NO:110, positions 19 to 447 of SEQ ID NO:111, positions 18 to 449
of SEQ ID NO:112, positions 22 to 457 of SEQ ID NO:113, positions
18 to 445 of SEQ ID NO:114, positions 18 to 448 of SEQ ID NO:115,
positions 18 to 448 of SEQ ID NO:116, positions 23 to 435 of SEQ ID
NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435
of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions
21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122,
positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID
NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437
of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions
23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129,
positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID
NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435
of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions
23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136,
positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID
NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437
of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions
20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143,
positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID
NO:145, positions 24 to 443 of SEQ ID NO:146, positions 20 to 445
of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and
positions 20 to 437 of SEQ ID NO:149, preferably the CD
corresponding to positions 26-455 of SEQ ID NO:1 or 18-444 of SEQ
ID NO:2); and/or
[0043] comprise an amino acid sequence having at
least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%,
62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%,
75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%,
88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or
more, or complete (100%) sequence identity to a mature polypeptide
of a reference CBH I exemplified in Table 1 (i.e., a mature protein
comprising an amino acid sequence corresponding to positions 26 to
529 of SEQ ID NO:1, positions 18 to 514 of SEQ ID NO:2, positions
26 to 529 of SEQ ID NO:3, positions 1 to 427 of SEQ ID NO:4,
positions 24 to 526 of SEQ ID NO:5, positions 18 to 512 of SEQ ID
NO:6, positions 27 to 532 of SEQ ID NO:7, positions 27 to 539 of
SEQ ID NO:8, positions 20 to 449 of SEQ ID NO:9, positions 1 to 424
of SEQ ID NO:10, positions 18 to 447 of SEQ ID NO:11, positions 18
to 434 of SEQ ID NO:12, positions 18 to 521 of SEQ ID NO:13,
positions 19 to 454 of SEQ ID NO:14, positions 19 to 596 of SEQ ID
NO:15, positions 2 to 426 of SEQ ID NO:16, positions 23 to 446 of
SEQ ID NO:17, positions 19 to 525 of SEQ ID NO:18, positions 23 to
446 of SEQ ID NO:19, positions 19 to 530 of SEQ ID NO:20, positions
2 to 416 of SEQ ID NO:21, positions 19 to 454 of SEQ ID NO:22,
positions 19 to 506 of SEQ ID NO:23, positions 19 to 447 of SEQ ID
NO:24, positions 20 to 443 of SEQ ID NO:25, positions 18 to 447 of
SEQ ID NO:26, positions 19 to 516 of SEQ ID NO:27, positions 18 to
451 of SEQ ID NO:28, positions 23 to 446 of SEQ ID NO:29, positions
18 to 514 of SEQ ID NO:30, positions 18 to 451 of SEQ ID NO:31,
positions 18 to 447 of SEQ ID NO:32, positions 19 to 449 of SEQ ID
NO:33, positions 18 to 447 of SEQ ID NO:34, positions 26 to 529 of
SEQ ID NO:35, positions 19 to 528 of SEQ ID NO:36, positions 19 to
453 of SEQ ID NO:37, positions 18 to 512 of SEQ ID NO:38, positions
19 to 586 of SEQ ID NO:39, positions 19 to 510 of SEQ ID NO:40,
positions 18 to 513 of SEQ ID NO:41, positions 24 to 541 of SEQ ID
NO:42, positions 18 to 516 of SEQ ID NO:43, positions 19 to 453 of
SEQ ID NO:44, positions 26 to 537 of SEQ ID NO:45, positions 19 to
523 of SEQ ID NO:46, positions 18 to 443 of SEQ ID NO:47, positions
18 to 511 of SEQ ID NO:48, positions 19 to 523 of SEQ ID NO:49,
positions 18 to 513 of SEQ ID NO:50, positions 2 to 419 of SEQ ID
NO:51, positions 27 to 535 of SEQ ID NO:52, positions 21 to 445 of
SEQ ID NO:53, positions 19 to 449 of SEQ ID NO:54, positions 19 to
528 of SEQ ID NO:55, positions 18 to 443 of SEQ ID NO:56, positions
20 to 443 of SEQ ID NO:57, positions 18 to 514 of SEQ ID NO:58,
positions 18 to 447 of SEQ ID NO:59, positions 26 to 529 of SEQ ID
NO:60, positions 19 to 525 of SEQ ID NO:61, positions 19 to 532 of
SEQ ID NO:62, positions 26 to 460 of SEQ ID NO:63, positions 18 to
510 of SEQ ID NO:64, positions 19 to 512 of SEQ ID NO:65, positions
19 to 521 of SEQ ID NO:66, positions 1 to 505 of SEQ ID NO:67,
positions 19 to 526 of SEQ ID NO:68, positions 19 to 511 of SEQ ID
NO:69, positions 23 to 447 of SEQ ID NO:70, positions 17 to 448 of
SEQ ID NO:71, positions 19 to 449 of SEQ ID NO:72, positions 18 to
514 of SEQ ID NO:73, positions 23 to 540 of SEQ ID NO:74, positions
20 to 452 of SEQ ID NO:75, positions 18 to 504 of SEQ ID NO:76,
positions 18 to 446 of SEQ ID NO:77, positions 22 to 536 of SEQ ID
NO:78, positions 18 to 508 of SEQ ID NO:79, positions 1 to 431 of
SEQ ID NO:80, positions 19 to 453 of SEQ ID NO:81, positions 21 to
440 of SEQ ID NO:82, positions 19 to 516 of SEQ ID NO:83, positions
18 to 448 of SEQ ID NO:84, positions 17 to 446 of SEQ ID NO:85,
positions 18 to 523 of SEQ ID NO:86, positions 18 to 443 of SEQ ID
NO:87, positions 23 to 448 of SEQ ID NO:88, positions 18 to 451 of
SEQ ID NO:89, positions 21 to 447 of SEQ ID NO:90, positions 18 to
444 of SEQ ID NO:91, positions 19 to 510 of SEQ ID NO:92, positions
20 to 504 of SEQ ID NO:93, positions 18 to 450 of SEQ ID NO:94,
positions 22 to 453 of SEQ ID NO:95, positions 16 to 536 of SEQ ID
NO:96, positions 21 to 445 of SEQ ID NO:97, positions 19 to 517 of
SEQ ID NO:98, positions 19 to 516 of SEQ ID NO:99, positions 19 to
523 of SEQ ID NO:100, positions 18 to 507 of SEQ ID NO:101,
positions 19 to 516 of SEQ ID NO:102, positions 20 to 457 of SEQ ID
NO:103, positions 19 to 454 of SEQ ID NO:104, positions 18 to 505
of SEQ ID NO:105, positions 18 to 516 of SEQ ID NO:106, positions
27 to 530 of SEQ ID NO:107, positions 23 to 446 of SEQ ID NO:108,
positions 17 to 446 of SEQ ID NO:109, positions 21 to 447 of SEQ ID
NO:110, positions 19 to 523 of SEQ ID NO:111, positions 18 to 513
of SEQ ID NO:112, positions 22 to 536 of SEQ ID NO:113, positions
18 to 445 of SEQ ID NO:114, positions 18 to 526 of SEQ ID NO:115,
positions 18 to 538 of SEQ ID NO:116, positions 23 to 435 of SEQ ID
NO:117, positions 21 to 442 of SEQ ID NO:118, positions 23 to 435
of SEQ ID NO:119, positions 20 to 445 of SEQ ID NO:120, positions
21 to 443 of SEQ ID NO:121, positions 20 to 445 of SEQ ID NO:122,
positions 23 to 443 of SEQ ID NO:123, positions 20 to 445 of SEQ ID
NO:124, positions 21 to 435 of SEQ ID NO:125, positions 20 to 437
of SEQ ID NO:126, positions 21 to 442 of SEQ ID NO:127, positions
23 to 434 of SEQ ID NO:128, positions 20 to 444 of SEQ ID NO:129,
positions 21 to 435 of SEQ ID NO:130, positions 20 to 445 of SEQ ID
NO:131, positions 21 to 446 of SEQ ID NO:132, positions 21 to 435
of SEQ ID NO:133, positions 22 to 448 of SEQ ID NO:134, positions
23 to 433 of SEQ ID NO:135, positions 23 to 434 of SEQ ID NO:136,
positions 23 to 435 of SEQ ID NO:137, positions 23 to 435 of SEQ ID
NO:138, positions 20 to 445 of SEQ ID NO:139, positions 20 to 437
of SEQ ID NO:140, positions 21 to 435 of SEQ ID NO:141, positions
20 to 437 of SEQ ID NO:142, positions 21 to 435 of SEQ ID NO:143,
positions 26 to 435 of SEQ ID NO:144, positions 23 to 435 of SEQ ID
NO:145, positions 24 to 443 of SEQ ID NO:146, positions 20 to 445
of SEQ ID NO:147, positions 21 to 441 of SEQ ID NO:148, and
positions 20 to 437 of SEQ ID NO:149, preferably the mature
polypeptide corresponding to positions 26-529 of SEQ ID NO:1 or
18-514 of SEQ ID NO:2).
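For illustration, a percent sequence identity of the kind recited above can be computed from a pairwise alignment as in the following Python sketch. The function name and the two short aligned strings are hypothetical. The convention used here (identical columns divided by all alignment columns, ignoring columns that are gaps in both sequences) is one of several in common use and is an assumption, not a definition taken from this disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two sequences that are already aligned.

    Both strings must have the same length; gaps are marked with `gap`.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if a == b:
            identical += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * identical / columns


# Hypothetical aligned fragments, not real CBH I sequence.
reference_fragment = "ACDEFGHIK-"
variant_fragment = "ACDQFGHIKL"

identity = percent_identity(reference_fragment, variant_fragment)
meets_fifty_percent = identity >= 50.0
```

In practice the aligned strings would come from an alignment of the variant against the CD or mature polypeptide region of the chosen reference sequence.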
[0044] An example of an algorithm that is suitable for determining
sequence similarity is the BLAST algorithm, which is described in
Altschul et al., 1990, J. Mol. Biol. 215:403-410. Software for
performing BLAST analyses is publicly available through the
National Center for Biotechnology Information. This algorithm
involves first identifying high scoring sequence pairs (HSPs) by
identifying short words of length W in the query sequence that
either match or satisfy some positive-valued threshold score T when
aligned with a word of the same length in a database sequence.
These initial neighborhood word hits act as starting points to find
longer HSPs containing them. The word hits are expanded in both
directions along each of the two sequences being compared for as
far as the cumulative alignment score can be increased. Extension
of the word hits is stopped when: the cumulative alignment score
falls off by the quantity X from a maximum achieved value; the
cumulative score goes to zero or below; or the end of either
sequence is reached. The BLAST algorithm parameters W, T, and X
determine the sensitivity and speed of the alignment. The BLAST
program uses as defaults a word length (W) of 11, the BLOSUM62
scoring matrix (see Henikoff & Henikoff, 1992, Proc. Nat'l.
Acad. Sci. USA 89:10915-10919), alignments (B) of 50, expectation
(E) of 10, M=5, N=-4, and a comparison of both strands.
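The seed-and-extend procedure described in this paragraph can be illustrated with the following simplified Python sketch. It is a toy model, not the NCBI implementation: it seeds only on exact word matches (the threshold T and neighborhood words are omitted), extends without gaps, and uses a match reward of 5 and a mismatch penalty of -4 as assumed nucleotide scores. All function names, the short test sequences, and the word length of 4 are hypothetical.

```python
def find_word_hits(query, subject, w):
    """Return (query_index, subject_index) pairs of identical words of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def _extend_one_way(query, subject, qi, sj, step, start_score,
                    match, mismatch, x_drop):
    """Walk from (qi, sj) in direction `step`; return (best score, best length)."""
    best_score, best_len = start_score, 0
    score, length = start_score, 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        elif best_score - score >= x_drop or score <= 0:
            break  # score fell off by X from the maximum, or dropped to zero
        qi += step
        sj += step
    return best_score, best_len


def extend_hit(query, subject, i, j, w, match=5, mismatch=-4, x_drop=20):
    """Extend a word hit in both directions; return (score, q_start, q_end).

    q_end is exclusive, so query[q_start:q_end] is the aligned query segment.
    """
    seed_score = w * match
    right_score, right_len = _extend_one_way(
        query, subject, i + w, j + w, +1, seed_score, match, mismatch, x_drop)
    total_score, left_len = _extend_one_way(
        query, subject, i - 1, j - 1, -1, right_score, match, mismatch, x_drop)
    return total_score, i - left_len, i + w + right_len


query = "TTTACGTACGTAGGG"
subject = "CCCACGTACGTACCC"
hits = find_word_hits(query, subject, 4)
score, q_start, q_end = extend_hit(query, subject, 3, 3, 4)
segment = query[q_start:q_end]
```

Here the word hit at position 3 of both sequences grows into the nine-base shared segment, and extension stops at the sequence ends because the mismatching flanks never improve the score.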
[0045] Most CBH I polypeptides are secreted and are therefore
expressed with a signal sequence that is cleaved upon secretion of
the polypeptide from the cell. Accordingly, in certain aspects, the
variant CBH I polypeptides of the disclosure further include a
signal sequence. Exemplary signal sequences comprise amino acid
sequences corresponding to positions 1 to 25 of SEQ ID NO:1,
positions 1 to 17 of SEQ ID NO:2, positions 1 to 25 of SEQ ID NO:3,
positions 1 to 23 of SEQ ID NO:5, positions 1 to 17 of SEQ ID NO:6,
positions 1 to 26 of SEQ ID NO:7, positions 1 to 27 of SEQ ID NO:8,
positions 1 to 19 of SEQ ID NO:9, positions 1 to 17 of SEQ ID
NO:11, positions 1 to 17 of SEQ ID NO:12, positions 1 to 17 of SEQ
ID NO:13, positions 1 to 18 of SEQ ID NO:14, positions 1 to 18 of
SEQ ID NO:15, positions 1 to 22 of SEQ ID NO:17, positions 1 to 18
of SEQ ID NO:18, positions 1 to 22 of SEQ ID NO:19, positions 1 to
18 of SEQ ID NO:20, positions 1 to 18 of SEQ ID NO:22, positions 1
to 18 of SEQ ID NO:23, positions 1 to 18 of SEQ ID NO:24, positions
1 to 19 of SEQ ID NO:25, positions 1 to 17 of SEQ ID NO:26,
positions 1 to 18 of SEQ ID NO:27, positions 1 to 17 of SEQ ID
NO:28, positions 1 to 22 of SEQ ID NO:29, positions 1 to 18 of SEQ
ID NO:30, positions 1 to 17 of SEQ ID NO:31, positions 1 to 17 of
SEQ ID NO:32, positions 1 to 18 of SEQ ID NO:33, positions 1 to 17
of SEQ ID NO:34, positions 1 to 25 of SEQ ID NO:35, positions 1 to
18 of SEQ ID NO:36, positions 1 to 18 of SEQ ID NO:37, positions 1
to 17 of SEQ ID NO:38, positions 1 to 18 of SEQ ID NO:39, positions
1 to 18 of SEQ ID NO:40, positions 1 to 17 of SEQ ID NO:41,
positions 1 to 23 of SEQ ID NO:42, positions 1 to 17 of SEQ ID
NO:43, positions 1 to 18 of SEQ ID NO:44, positions 1 to 25 of SEQ
ID NO:45, positions 1 to 18 of SEQ ID NO:46, positions 1 to 17 of
SEQ ID NO:47, positions 1 to 17 of SEQ ID NO:48, positions 1 to 18
of SEQ ID NO:49, positions 1 to 17 of SEQ ID NO:50, positions 1 to
26 of SEQ ID NO:52, positions 1 to 20 of SEQ ID NO:53, positions 1
to 18 of SEQ ID NO:54, positions 1 to 18 of SEQ ID NO:55, positions
1 to 17 of SEQ ID NO:56, positions 1 to 19 of SEQ ID NO:57,
positions 1 to 17 of SEQ ID NO:58, positions 1 to 17 of SEQ ID
NO:59, positions 1 to 25 of SEQ ID NO:60, positions 1 to 18 of SEQ
ID NO:61, positions 1 to 18 of SEQ ID NO:62, positions 1 to 25 of
SEQ ID NO:63, positions 1 to 17 of SEQ ID NO:64, positions 1 to 18
of SEQ ID NO:65, positions 1 to 18 of SEQ ID NO:66, positions 1 to
18 of SEQ ID NO:68, positions 1 to 18 of SEQ ID NO:69, positions 1
to 23 of SEQ ID NO:70, positions 1 to 17 of SEQ ID NO:71, positions
1 to 18 of SEQ ID NO:72, positions 1 to 17 of SEQ ID NO:73,
positions 1 to 22 of SEQ ID NO:74, positions 1 to 19 of SEQ ID
NO:75, positions 1 to 17 of SEQ ID NO:76, positions 1 to 17 of SEQ
ID NO:77, positions 1 to 21 of SEQ ID NO:78, positions 1 to 18 of
SEQ ID NO:79, positions 1 to 18 of SEQ ID NO:81, positions 1 to 20
of SEQ ID NO:82, positions 1 to 18 of SEQ ID NO:83, positions 1 to
17 of SEQ ID NO:84, positions 1 to 16 of SEQ ID NO:85, positions 1
to 17 of SEQ ID NO:86, positions 1 to 17 of SEQ ID NO:87, positions
1 to 22 of SEQ ID NO:88, positions 1 to 17 of SEQ ID NO:89,
positions 1 to 20 of SEQ ID NO:90, positions 1 to 17 of SEQ ID
NO:91, positions 1 to 18 of SEQ ID NO:92, positions 1 to 19 of SEQ
ID NO:93, positions 1 to 17 of SEQ ID NO:94, positions 1 to 21 of
SEQ ID NO:95, positions 1 to 15 of SEQ ID NO:96, positions 1 to 20
of SEQ ID NO:97, positions 1 to 18 of SEQ ID NO:98, positions 1 to
18 of SEQ ID NO:99, positions 1 to 18 of SEQ ID NO:100, positions 1
to 17 of SEQ ID NO:101, positions 1 to 18 of SEQ ID NO:102,
positions 1 to 19 of SEQ ID NO:103, positions 1 to 18 of SEQ ID
NO:104, positions 1 to 17 of SEQ ID NO:105, positions 1 to 17 of
SEQ ID NO:106, positions 1 to 26 of SEQ ID NO:107, positions 1 to
22 of SEQ ID NO:108, positions 1 to 16 of SEQ ID NO:109, positions
1 to 20 of SEQ ID NO:110, positions 1 to 18 of SEQ ID NO:111,
positions 1 to 17 of SEQ ID NO:112, positions 1 to 21 of SEQ ID
NO:113, positions 1 to 17 of SEQ ID NO:114, positions 1 to 17 of
SEQ ID NO:115, positions 1 to 18 of SEQ ID NO:116, positions 1 to
22 of SEQ ID NO:117, positions 1 to 20 of SEQ ID NO:118, positions
1 to 22 of SEQ ID NO:119, positions 1 to 19 of SEQ ID NO:120,
positions 1 to 20 of SEQ ID NO:121, positions 1 to 19 of SEQ ID
NO:122, positions 1 to 22 of SEQ ID NO:123, positions 1 to 19 of
SEQ ID NO:124, positions 1 to 20 of SEQ ID NO:125, positions 1 to
19 of SEQ ID NO:126, positions 1 to 21 of SEQ ID NO:127, positions
1 to 22 of SEQ ID NO:128, positions 1 to 19 of SEQ ID NO:129,
positions 1 to 20 of SEQ ID NO:130, positions 1 to 19 of SEQ ID
NO:131, positions 1 to 20 of SEQ ID NO:132, positions 1 to 20 of
SEQ ID NO:133, positions 1 to 21 of SEQ ID NO:134, positions 1 to
22 of SEQ ID NO:135, positions 1 to 22 of SEQ ID NO:136, positions
1 to 22 of SEQ ID NO:137, positions 1 to 22 of SEQ ID NO:138,
positions 1 to 19 of SEQ ID NO:139, positions 1 to 19 of SEQ ID
NO:140, positions 1 to 20 of SEQ ID NO:141, positions 1 to 19 of
SEQ ID NO:142, positions 1 to 20 of SEQ ID NO:143, positions 1 to
25 of SEQ ID NO:144, positions 1 to 22 of SEQ ID NO:145, positions
1 to 23 of SEQ ID NO:146, positions 1 to 19 of SEQ ID NO:147,
positions 1 to 20 of SEQ ID NO:148, and positions 1 to 19 of SEQ ID
NO:149.
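The position ranges recited above are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by many programming languages. The following Python sketch shows one way to split a full-length sequence into its regions using the SEQ ID NO:1 coordinates given in this disclosure for the signal sequence (1 to 25), CD (26 to 455) and linker (456 to 493). The CBD range of 494 to 529 is inferred here from the end of the linker and the end of the mature polypeptide, and the placeholder string stands in for the actual sequence; both are assumptions for illustration, as are the names used.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


# Coordinates for SEQ ID NO:1; the CBD range is inferred (see text above).
SEQ_ID_NO_1_LAYOUT = [
    Region("signal", 1, 25),
    Region("CD", 26, 455),
    Region("linker", 456, 493),
    Region("CBD", 494, 529),
]


def split_regions(sequence, layout):
    """Cut `sequence` into named regions that must tile it without gaps."""
    regions = {}
    expected_start = 1
    for region in layout:
        if region.start != expected_start or region.end < region.start:
            raise ValueError(f"region {region.name} is not contiguous")
        # Convert 1-based inclusive coordinates to a Python slice.
        regions[region.name] = sequence[region.start - 1:region.end]
        expected_start = region.end + 1
    if expected_start - 1 != len(sequence):
        raise ValueError("layout does not cover the whole sequence")
    return regions


# Placeholder of the right length; letters only mark which region a residue is in.
placeholder = "S" * 25 + "C" * 430 + "L" * 38 + "B" * 36
parts = split_regions(placeholder, SEQ_ID_NO_1_LAYOUT)

# The mature polypeptide is everything after the cleaved signal sequence.
mature = parts["CD"] + parts["linker"] + parts["CBD"]
```

Because the regions are held separately, a CD, linker and CBD taken from different layouts could be concatenated in the same way to model a chimeric polypeptide.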
Recombinant Expression of Variant CBH I Polypeptides
Cell Culture Systems
[0046] The disclosure also provides recombinant cells engineered to
express variant CBH I polypeptides. Suitably, the variant CBH I
polypeptide is encoded by a nucleic acid operably linked to a
promoter.
[0047] Where recombinant expression in a filamentous fungal host is
desired, the promoter can be a filamentous fungal promoter. The
nucleic acids can be, for example, under the control of
heterologous promoters. The variant CBH I polypeptides can also be
expressed under the control of constitutive or inducible promoters.
Examples of promoters that can be used include, but are not limited
to, a cellulase promoter, a xylanase promoter, and the 1818 promoter
(previously identified as a highly expressed protein by EST mapping
of Trichoderma). For example, the promoter can suitably be a
cellobiohydrolase, endoglucanase, or .beta.-glucosidase promoter. A
particularly suitable promoter can be, for example, a T. reesei
cellobiohydrolase, endoglucanase, or .beta.-glucosidase promoter.
Non-limiting examples of promoters include a cbh1, cbh2, egl1,
egl2, egl3, egl4, egl5, pki1, gpd1, xyn1, or xyn2 promoter.
[0048] Suitable host cells include cells of any microorganism
(e.g., cells of a bacterium, a protist, an alga, a fungus (e.g., a
yeast or filamentous fungus), or other microbe), and are preferably
cells of a bacterium, a yeast, or a filamentous fungus.
[0049] Suitable host cells of the bacterial genera include, but are
not limited to, cells of Escherichia, Bacillus, Lactobacillus,
Pseudomonas, and Streptomyces. Suitable cells of bacterial species
include, but are not limited to, cells of Escherichia coli,
Bacillus subtilis, Bacillus licheniformis, Lactobacillus brevis,
Pseudomonas aeruginosa, and Streptomyces lividans.
[0050] Suitable host cells of the genera of yeast include, but are
not limited to, cells of Saccharomyces, Schizosaccharomyces,
Candida, Hansenula, Pichia, Kluyveromyces, and Phaffia. Suitable
cells of yeast species include, but are not limited to, cells of
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Candida
albicans, Hansenula polymorpha, Pichia pastoris, P. canadensis,
Kluyveromyces marxianus, and Phaffia rhodozyma.
[0051] Suitable host cells of filamentous fungi include all
filamentous forms of the subdivision Eumycotina. Suitable cells of
filamentous fungal genera include, but are not limited to, cells of
Acremonium, Aspergillus, Aureobasidium, Bjerkandera, Ceriporiopsis,
Chrysosporium, Coprinus, Coriolus, Corynascus, Chaetomium,
Cryptococcus, Filobasidium, Fusarium, Gibberella, Humicola,
Hypocrea, Magnaporthe, Mucor, Myceliophthora,
Neocallimastix, Neurospora, Paecilomyces, Penicillium,
Phanerochaete, Phlebia, Piromyces, Pleurotus, Scytalidium,
Schizophyllum, Sporotrichum, Talaromyces, Thermoascus, Thielavia,
Tolypocladium, Trametes, and Trichoderma. More preferably, the
recombinant cell is a Trichoderma sp. (e.g., Trichoderma reesei),
Penicillium sp., Humicola sp. (e.g., Humicola insolens),
Aspergillus sp. (e.g., Aspergillus niger), Chrysosporium sp.,
Fusarium sp., or Hypocrea sp. Suitable cells can also include cells
of various anamorph and teleomorph forms of these filamentous
fungal genera.
[0052] Suitable cells of filamentous fungal species include, but
are not limited to, cells of Aspergillus awamori, Aspergillus
fumigatus, Aspergillus foetidus, Aspergillus japonicus, Aspergillus
nidulans, Aspergillus niger, Aspergillus oryzae, Chrysosporium
lucknowense, Fusarium bactridioides, Fusarium cerealis, Fusarium
crookwellense, Fusarium culmorum, Fusarium graminearum, Fusarium
graminum, Fusarium heterosporum, Fusarium negundi, Fusarium
oxysporum, Fusarium reticulatum, Fusarium roseum, Fusarium
sambucinum, Fusarium sarcochroum, Fusarium sporotrichioides,
Fusarium sulphureum, Fusarium torulosum, Fusarium trichothecioides,
Fusarium venenatum, Bjerkandera adusta, Ceriporiopsis aneirina,
Ceriporiopsis caregiea, Ceriporiopsis
gilvescens, Ceriporiopsis pannocinta, Ceriporiopsis rivulosa,
Ceriporiopsis subrufa, Ceriporiopsis subvermispora, Coprinus
cinereus, Coriolus hirsutus, Humicola insolens, Humicola
lanuginosa, Mucor miehei, Myceliophthora thermophila, Neurospora
crassa, Neurospora intermedia, Penicillium purpurogenum,
Penicillium canescens, Penicillium solitum, Penicillium
funiculosum, Phanerochaete chrysosporium, Phlebia radiata,
Pleurotus eryngii, Talaromyces flavus, Thielavia terrestris,
Trametes villosa, Trametes versicolor, Trichoderma harzianum,
Trichoderma koningii, Trichoderma longibrachiatum, Trichoderma
reesei, and Trichoderma viride.
[0053] The engineered host cells can be cultured in conventional
nutrient media modified as appropriate for activating promoters,
selecting transformants, or amplifying the nucleic acid sequence
encoding the variant CBH I polypeptide. Culture conditions, such as
temperature, pH and the like, are those previously used with the
host cell selected for expression, and will be apparent to those
skilled in the art. As noted, many references are available for the
culture and production of many cells, including cells of bacterial
and fungal origin. Cell culture media in general are set forth in
Atlas and Parks (eds.), 1993, The Handbook of Microbiological
Media, CRC Press, Boca Raton, Fla., which is incorporated herein by
reference. For recombinant expression in filamentous fungal cells,
the cells are cultured in a standard medium containing
physiological salts and nutrients, such as described in Pourquie et
al., 1988, Biochemistry and Genetics of Cellulose Degradation, eds.
Aubert, et al., Academic Press, pp. 71-86; and Ilmen et al., 1997,
Appl. Environ. Microbiol. 63:1298-1306. Culture conditions are also
standard, e.g., cultures are incubated at 28.degree. C. in shaker
cultures or fermenters until desired levels of variant CBH I
expression are achieved. Preferred culture conditions for a given
filamentous fungus may be found in the scientific literature and/or
from the source of the fungi such as the American Type Culture
Collection (ATCC). After fungal growth has been established, the
cells are exposed to conditions effective to cause or permit the
expression of a variant CBH I.
[0054] In cases where a variant CBH I coding sequence is under the
control of an inducible promoter, the inducing agent, e.g., a
sugar, metal salt or antibiotic, is added to the medium at a
concentration effective to induce variant CBH I expression.
[0055] In one embodiment, the recombinant cell is an Aspergillus
niger, which is a useful strain for obtaining overexpressed
polypeptide. For example, A. niger var. awamori dgr246 is known to
produce elevated amounts of secreted cellulases (Goedegebuur et
al., 2002, Curr. Genet. 41:89-98). Other strains of Aspergillus
niger var. awamori such as GCDAP3, GCDAP4 and GAP3-4 are known (Ward
et al., 1993, Appl. Microbiol. Biotechnol. 39:738-743).
[0056] In another embodiment, the recombinant cell is a Trichoderma
reesei, which is a useful strain for obtaining overexpressed
polypeptide. For example, RL-P37, described by Sheir-Neiss et al.,
1984, Appl. Microbiol. Biotechnol. 20:46-53, is known to secrete
elevated amounts of cellulase enzymes. Functional equivalents of
RL-P37 include Trichoderma reesei strain RUT-C30 (ATCC No. 56765)
and strain QM9414 (ATCC No. 26921). It is contemplated that these
strains would also be useful in overexpressing variant CBH I
polypeptides.
[0057] Cells expressing the variant CBH I polypeptides of the
disclosure can be grown under batch, fed-batch or continuous
fermentation conditions. Classical batch fermentation is a closed
system, wherein the composition of the medium is set at the
beginning of the fermentation and is not subject to artificial
alterations during the fermentation. A variation of the batch
system is a fed-batch fermentation in which the substrate is added
in increments as the fermentation progresses. Fed-batch systems are
useful when catabolite repression is likely to inhibit the
metabolism of the cells and where it is desirable to have limited
amounts of substrate in the medium. Batch and fed-batch
fermentations are common and well known in the art. Continuous
fermentation is an open system where a defined fermentation medium
is added continuously to a bioreactor and an equal amount of
conditioned medium is removed simultaneously for processing.
Continuous fermentation generally maintains the cultures at a
constant high density where cells are primarily in log phase
growth. Continuous fermentation systems strive to maintain steady
state growth conditions. Methods for modulating nutrients and
growth factors for continuous fermentation processes as well as
techniques for maximizing the rate of product formation are well
known in the art of industrial microbiology.
Recombinant Expression in Plants
[0058] The disclosure provides transgenic plants and seeds that
recombinantly express a variant CBH I polypeptide. The disclosure
also provides plant products, e.g., oils, seeds, leaves, extracts
and the like, comprising a variant CBH I polypeptide.
[0059] The transgenic plant can be dicotyledonous (a dicot) or
monocotyledonous (a monocot). The disclosure also provides methods
of making and using these transgenic plants and seeds. The
transgenic plant or plant cell expressing a variant CBH I can be
constructed in accordance with any method known in the art. See,
for example, U.S. Pat. No. 6,309,872. T. reesei CBH I has been
successfully expressed in transgenic tobacco (Nicotiana tabacum)
and potato (Solanum tuberosum). See Hooker et al., 2000, in
Glycosyl Hydrolases for Biomass Conversion, ACS Symposium Series,
Vol. 769, Chapter 4, pp. 55-90.
[0060] In a particular aspect, the present disclosure provides for
the expression of CBH I variants in transgenic plants or plant
organs and methods for the production thereof. DNA expression
constructs are provided for the transformation of plants with a
nucleic acid encoding the variant CBH I polypeptide, preferably
under the control of regulatory sequences which are capable of
directing expression of the variant CBH I polypeptide. These
regulatory sequences include sequences capable of directing
transcription in plants, either constitutively, or in stage and/or
tissue specific manners.
[0061] The expression of variant CBH I polypeptides in plants can
be achieved by a variety of means. Specifically, for example,
technologies are available for transforming a large number of plant
species, including dicotyledonous species (e.g., tobacco, potato,
tomato, Petunia, Brassica) and monocot species. Additionally, for
example, strategies for the expression of foreign genes in plants
are available. Additionally still, regulatory sequences from plant
genes have been identified that are serviceable for the
construction of chimeric genes that can be functionally expressed
in plants and in plant cells (e.g., Klee, 1987, Ann. Rev. of Plant
Phys. 38:467-486; Clark et al., 1990, Virology 179(2):640-7; Smith
et al., 1990, Mol. Gen. Genet. 224(3):477-81).
[0062] The introduction of nucleic acids into plants can be
achieved using several technologies including transformation with
Agrobacterium tumefaciens or Agrobacterium rhizogenes. Non-limiting
examples of plant tissues that can be transformed include
protoplasts, microspores or pollen, and explants such as leaves,
stems, roots, hypocotyls, and cotyls. Furthermore, DNA encoding a
variant CBH I can be introduced directly into protoplasts and plant
cells or tissues by microinjection, electroporation, particle
bombardment, and direct DNA uptake.
[0063] Variant CBH I polypeptides can be produced in plants by a
variety of expression systems. For instance, the use of a
constitutive promoter such as the 35S promoter of Cauliflower
Mosaic Virus (Guilley et al., 1982, Cell 30:763-73) is serviceable
for the accumulation of the expressed protein in virtually all
organs of the transgenic plant. Alternatively, promoters that are
tissue-specific and/or stage-specific can be used (Higgins, 1984,
Annu. Rev. Plant Physiol. 35:191-221; Shotwell and Larkins, 1989,
In: The Biochemistry of Plants Vol. 15 (Academic Press, San Diego:
Stumpf and Conn, eds.), p. 297) to permit expression of variant CBH I
polypeptides in a target tissue and/or during a desired stage of
development.
Compositions of Variant CBH I Polypeptides
[0064] In general, a variant CBH I polypeptide produced in cell
culture is secreted into the medium and may be purified or
isolated, e.g., by removing unwanted components from the cell
culture medium. However, in some cases, a variant CBH I polypeptide
may be produced in a cellular form necessitating recovery from a
cell lysate. In such cases the variant CBH I polypeptide is
purified from the cells in which it was produced using techniques
routinely employed by those of skill in the art. Examples include,
but are not limited to, affinity chromatography (Van Tilbeurgh et
al., 1984, FEBS Lett. 169(2):215-218), ion-exchange chromatographic
methods (Goyal et al., 1991, Bioresource Technology, 36:37-50;
Fliess et al., 1983, Eur. J. Appl. Microbiol. Biotechnol.
17:314-318; Bhikhabhai et al., 1984, J. Appl. Biochem. 6:336-345;
Ellouz et al., 1987, Journal of Chromatography, 396:307-317),
including ion-exchange using materials with high resolution power
(Medve et al., 1998, J. Chromatography A, 808:153-165), hydrophobic
interaction chromatography (Tomaz and Queiroz, 1999, J.
Chromatography A, 865:123-128), and two-phase partitioning
(Brumbauer et al., 1999, Bioseparation 7:287-295).
[0065] The variant CBH I polypeptides of the disclosure are
suitably used in cellulase compositions. Cellulases are known in
the art as enzymes that hydrolyze cellulose (beta-1,4-glucan or
beta D-glucosidic linkages) resulting in the formation of glucose,
cellobiose, cellooligosaccharides, and the like. Cellulase enzymes
have been traditionally divided into three major classes:
endoglucanases ("EG"), exoglucanases or cellobiohydrolases (EC
3.2.1.91) ("CBH") and beta-glucosidases (EC 3.2.1.21) ("BG")
(Knowles et al., 1987, TIBTECH 5:255-261; Schulein, 1988, Methods
in Enzymology 160(25):234-243).
[0066] Certain fungi produce complete cellulase systems which
include exo-cellobiohydrolases or CBH-type cellulases,
endoglucanases or EG-type cellulases and β-glucosidases or
BG-type cellulases (Schulein, 1988, Methods in Enzymology
160(25):234-243). Such cellulase compositions are referred to
herein as "whole" cellulases. However, sometimes these systems lack
CBH-type cellulases and bacterial cellulases also typically include
little or no CBH-type cellulases. In addition, it has been shown
that the EG components and CBH components synergistically interact
to more efficiently degrade cellulose. See, e.g., Wood, 1985,
Biochemical Society Transactions 13(2):407-410.
[0067] The cellulase compositions of the disclosure typically
include, in addition to a variant CBH I polypeptide, one or more
cellobiohydrolases, endoglucanases and/or β-glucosidases. In
their crudest form, cellulase compositions contain the
microorganism culture that produced the enzyme components.
"Cellulase compositions" also refers to a crude fermentation
product of the microorganisms. A crude fermentation product is
preferably a
fermentation broth that has been separated from the microorganism
cells and/or cellular debris (e.g., by centrifugation and/or
filtration). In some cases, the enzymes in the broth can be
optionally diluted, concentrated, partially purified or purified
and/or dried. The variant CBH I polypeptide can be co-expressed
with one or more of the other components of the cellulase
composition or it can be expressed separately, optionally purified
and combined with a composition comprising one or more of the other
cellulase components.
[0068] When employed in cellulase compositions, the variant CBH I
is generally present in an amount sufficient to allow release of
soluble sugars from the biomass. The amount of variant CBH I
enzymes added depends upon the type of biomass to be saccharified
which can be readily determined by the skilled artisan. In certain
embodiments, the weight percent of variant CBH I polypeptide is
suitably at least 1, at least 5, at least 10, or at least 20 weight
percent of the total polypeptides in a cellulase composition.
Exemplary cellulase compositions include a variant CBH I of the
disclosure in an amount ranging from about 1 to about 20 weight
percent, from about 1 to about 25 weight percent, from about 5 to
about 20 weight percent, from about 5 to about 25 weight percent,
from about 5 to about 30 weight percent, from about 5 to about 35
weight percent, from about 5 to about 40 weight percent, from about
5 to about 45 weight percent, from about 5 to about 50 weight
percent, from about 10 to about 20 weight percent, from about 10 to
about 25 weight percent, from about 10 to about 30 weight percent,
from about 10 to about 35 weight percent, from about 10 to about 40
weight percent, from about 10 to about 45 weight percent, from
about 10 to about 50 weight percent, from about 15 to about 20
weight percent, from about 15 to about 25 weight percent, from
about 15 to about 30 weight percent, from about 15 to about 35
weight percent, from about 15 to about 40 weight percent, from
about 15 to about 45 weight percent, or from about 15 to about 50
weight percent of the total polypeptides in the composition.
Utility of Variant CBH I Polypeptides
[0069] It can be appreciated that the variant CBH I polypeptides of
the disclosure and compositions comprising the variant CBH I
polypeptides find utility in a wide variety of applications, for
example detergent compositions that exhibit enhanced cleaning
ability, function as a softening agent and/or improve the feel of
cotton fabrics (e.g., "stone washing" or "biopolishing"), or in
cellulase compositions for degrading wood pulp into sugars (e.g.,
for bio-ethanol production). Other applications include the
treatment of mechanical pulp (Pere et al., 1996, Tappi Pulping
Conference, pp. 693-696 (Nashville, Tenn., Oct. 27-31, 1996)), for
use as a feed additive (see, e.g., WO 91/04673) and in grain wet
milling.
Saccharification Reactions
[0070] Ethanol can be produced via saccharification and
fermentation processes from cellulosic biomass such as trees,
herbaceous plants, municipal solid waste and agricultural and
forestry residues. However, the ratio of individual cellulase
enzymes within a naturally occurring cellulase mixture produced by
a microbe may not be the most efficient for rapid conversion of
cellulose in biomass to glucose. It is known that endoglucanases
act to produce new cellulose chain ends which themselves are
substrates for the action of cellobiohydrolases and thereby improve
the efficiency of hydrolysis of the entire cellulase system. The
use of optimized cellobiohydrolase activity may greatly enhance the
production of ethanol.
[0071] Cellulase compositions comprising one or more of the variant
CBH I polypeptides of the disclosure can be used in
saccharification reactions to produce simple sugars for
fermentation. Accordingly, the present disclosure provides methods
for saccharification comprising contacting biomass with a cellulase
composition comprising a variant CBH I polypeptide of the
disclosure and, optionally, subjecting the resulting sugars to
fermentation by a microorganism.
[0072] The term "biomass," as used herein, refers to any
composition comprising cellulose (optionally also hemicellulose
and/or lignin). As used herein, biomass includes, without
limitation, seeds, grains, tubers, plant waste or byproducts of
food processing or industrial processing (e.g., stalks), corn
(including, e.g., cobs, stover, and the like), grasses (including,
e.g., Indian grass, such as Sorghastrum nutans; or, switchgrass,
e.g., Panicum species, such as Panicum virgatum), wood (including,
e.g., wood chips, processing waste), paper, pulp, and recycled
paper (including, e.g., newspaper, printer paper, and the like).
Other biomass materials include, without limitation, potatoes,
soybean (e.g., rapeseed), barley, rye, oats, wheat, beets, and
sugar cane bagasse.
[0073] The saccharified biomass (e.g., lignocellulosic material
processed by enzymes of the disclosure) can be made into a number
of bio-based products, via processes such as, e.g., microbial
fermentation and/or chemical synthesis. As used herein, "microbial
fermentation" refers to a process of growing and harvesting
fermenting microorganisms under suitable conditions. The fermenting
microorganism can be any microorganism suitable for use in a
desired fermentation process for the production of bio-based
products. Suitable fermenting microorganisms include, without
limitation, filamentous fungi, yeast, and bacteria. The
saccharified biomass can, for example, be made into a fuel
(e.g., a biofuel such as a bioethanol, biobutanol, biomethanol, a
biopropanol, a biodiesel, a jet fuel, or the like) via fermentation
and/or chemical synthesis. The saccharified biomass can, for
example, also be made into a commodity chemical (e.g., ascorbic
acid, isoprene, 1,3-propanediol), lipids, amino acids,
polypeptides, and enzymes, via fermentation and/or chemical
synthesis.
[0074] Thus, in certain aspects, the variant CBH I polypeptides of
the disclosure find utility in the generation of ethanol from
biomass in either separate or simultaneous saccharification and
fermentation processes. Separate saccharification and fermentation
is a process whereby cellulose present in biomass is saccharified
into simple sugars (e.g., glucose) and the simple sugars
subsequently fermented by microorganisms (e.g., yeast) into
ethanol. Simultaneous saccharification and fermentation is a
process whereby cellulose present in biomass is saccharified into
simple sugars (e.g., glucose) and, at the same time and in the same
reactor, microorganisms (e.g., yeast) ferment the simple sugars
into ethanol.
[0075] Prior to saccharification, biomass is preferably subjected to
one or more pretreatment step(s) in order to render cellulose
material more accessible or susceptible to enzymes and thus more
amenable to hydrolysis by the variant CBH I polypeptides of the
disclosure.
[0076] In an exemplary embodiment, the pretreatment entails
subjecting biomass material to a catalyst comprising a dilute
solution of a strong acid and a metal salt in a reactor. The
biomass material can, e.g., be a raw material or a dried material.
This pretreatment can lower the activation energy, or the
temperature, of cellulose hydrolysis, ultimately allowing higher
yields of fermentable sugars. See, e.g., U.S. Pat. Nos. 6,660,506;
6,423,145.
[0077] Another exemplary pretreatment method entails hydrolyzing
biomass by subjecting the biomass material to a first hydrolysis
step in an aqueous medium at a temperature and a pressure chosen to
effectuate primarily depolymerization of hemicellulose without
achieving significant depolymerization of cellulose into glucose.
This step yields a slurry in which the liquid aqueous phase
contains dissolved monosaccharides resulting from depolymerization
of hemicellulose, and a solid phase containing cellulose and
lignin. The slurry is then subjected to a second hydrolysis step
under conditions that allow a major portion of the cellulose to be
depolymerized, yielding a liquid aqueous phase containing
dissolved/soluble depolymerization products of cellulose. See,
e.g., U.S. Pat. No. 5,536,325.
[0078] A further exemplary method involves processing a biomass
material by one or more stages of dilute acid hydrolysis using
about 0.4% to about 2% of a strong acid; followed by treating the
unreacted solid lignocellulosic component of the acid hydrolyzed
material with alkaline delignification. See, e.g., U.S. Pat. No.
6,409,841. Another exemplary pretreatment method comprises
prehydrolyzing biomass (e.g., lignocellulosic materials) in a
prehydrolysis reactor; adding an acidic liquid to the solid
lignocellulosic material to make a mixture; heating the mixture to
reaction temperature; maintaining reaction temperature for a period
of time sufficient to fractionate the lignocellulosic material into
a solubilized portion containing at least about 20% of the lignin
from the lignocellulosic material, and a solid fraction containing
cellulose; separating the solubilized portion from the solid
fraction, and removing the solubilized portion while at or near
reaction temperature; and recovering the solubilized portion. The
cellulose in the solid fraction is rendered more amenable to
enzymatic digestion. See, e.g., U.S. Pat. No. 5,705,369. Further
pretreatment methods can involve the use of hydrogen peroxide
(H₂O₂). See Gould, 1984, Biotech. and Bioengr.
26:46-52.
[0079] Pretreatment can also comprise contacting a biomass material
with stoichiometric amounts of sodium hydroxide and ammonium
hydroxide at a very low concentration. See Teixeira et al., 1999,
Appl. Biochem. and Biotech. 77-79:19-34. Pretreatment can also
comprise contacting a lignocellulose with a chemical (e.g., a base,
such as sodium carbonate or potassium hydroxide) at a pH of about 9
to about 14 at moderate temperature, pressure, and pH. See PCT
Publication WO2004/081185.
[0080] Ammonia pretreatment can also be used. Such a pretreatment
method comprises subjecting a biomass material to low ammonia
concentration under conditions of high solids. See, e.g., U.S.
Patent Publication No. 20070031918 and PCT publication WO
06/110901.
Detergent Compositions Comprising Variant CBH I Proteins
[0081] The present disclosure also provides detergent compositions
comprising a variant CBH I polypeptide of the disclosure. The
detergent compositions may employ, besides the variant CBH I
polypeptide, one or more of a surfactant, including anionic,
non-ionic and ampholytic surfactants; a hydrolase; a bleaching
agent; a bluing agent; a caking inhibitor; a solubilizer; and a
cationic surfactant. All of these components are known in the
detergent art.
[0082] The variant CBH I polypeptide is preferably provided as part
of a cellulase composition. The cellulase composition can be employed
from about 0.00005 weight percent to about 5 weight percent or from
about 0.0002 weight percent to about 2 weight percent of the total
detergent composition. The cellulase composition can be in the form
of a liquid diluent, granule, emulsion, gel, paste, and the like.
Such forms are known to the skilled artisan. When a solid detergent
composition is employed, the cellulase composition is preferably
formulated as granules.
Examples
Materials and Methods
Preparation of CBH I Polypeptides for Biochemical
Characterization
[0083] Protein expression was carried out in an Aspergillus niger
host strain that had been transformed using PEG-mediated
transformation with expression constructs for CBHI that included
the hygromycin resistance gene as a selectable marker, in which the
full length CBH I sequences (signal sequence, catalytic domain,
linker and cellulose binding domain) were under the control of the
glyceraldehyde-3-phosphate dehydrogenase (gpd) promoter.
Transformants were selected on the regeneration medium based on
resistance to hygromycin. The selected transformants were cultured
in Aspergillus salts medium (pH 6.2, supplemented with the
antibiotics penicillin, streptomycin, and hygromycin, and 80 g/L
glycerol, 20 g/L soytone, 10 mM uridine, 20 g/L MES) in baffled
shake flasks at 30° C., 170 rpm. After five days of
incubation, the total secreted protein supernatant was recovered,
and then subjected to hollow fiber filtration to concentrate and
exchange the sample into acetate buffer (50 mM NaAc, pH 5). CBH I
protein represented over 90% of the total protein in these samples.
Protein purity was analyzed by SDS-PAGE. Protein concentration was
determined by gel densitometry and/or HPLC analysis. All CBH I
protein concentrations were normalized before assay and
concentrated to 1-2.5 mg/ml.
CBH I Activity Assays
[0084] 4-Methylumbelliferyl Lactoside (4-MUL) Assay:
[0085] This assay measures the activity of CBH I on the fluorogenic
substrate 4-MUL (also known as MUL). Assays were run in a Costar
96-well black bottom plate, where reactions were initiated by the
addition of 4-MUL to enzyme in buffer (2 mM 4-MUL in 200 mM MES pH
6). Enzymatic rates were monitored by fluorescent readouts over
five minutes on a SPECTRAMAX™ plate reader (ex/em 365/450 nm).
Data in the linear range were used to calculate initial rates
(Vo).
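The initial-rate calculation described above can be sketched as a least-squares slope through the readings that fall in the linear range. This is only an illustration: the function name, the time points and the fluorescence values are hypothetical and are not taken from the disclosure, and the choice of which readings count as linear is left to the analyst.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence vs. time (signal units per second).

    Assumes the caller has already restricted the readings to the linear range.
    """
    n = len(times_s)
    if n < 2 or n != len(fluorescence):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


# Hypothetical readings over five minutes (illustrative values only).
times = [0, 60, 120, 180, 240, 300]
signal = [100.0, 220.0, 340.0, 460.0, 580.0, 700.0]
v0 = initial_rate(times, signal)
print(f"Vo = {v0:.3f} fluorescence units per second")
```

A plain slope is used here rather than a kinetic model because the text only states that linear-range data were used to obtain Vo.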
[0086] Phosphoric Acid Swollen Cellulose (PASC) Assay:
[0087] This assay measures the activity of CBH I using PASC as the
substrate. During the assay, the concentration of PASC is monitored
by a fluorescent signal derived from calcofluor binding to PASC
(ex/em 365/440 nm). The assay is initiated by mixing enzyme (15
µl) and reaction buffer (85 µl of 0.2% PASC, 200 mM MES, pH
6), and then incubating at 35° C. while shaking at 225 RPM.
After 2 hours, one reaction volume of calcofluor stop solution (100
µg/ml in 500 mM glycine pH 10) is added and fluorescence
read-outs obtained (ex/em 365/440 nm).
[0088] Bagasse Assay:
[0089] This assay measures the activity of CBH I on bagasse, a
lignocellulosic substrate. Reactions were run in 10 ml vials with
5% dilute acid pretreated bagasse (250 mg solids per 5 ml
reaction). Each reaction contained 4 mg CBH I enzyme/g solids, 200
mM MES pH 6, kanamycin, and chloramphenicol. Reactions were
incubated at 35° C. in hybridization incubators (Robbins
Scientific), rotating at 20 RPM. Time points were taken by
transferring a sample of homogenous slurry (150 µl) into a
96-well deep well plate and quenching the reaction with stop buffer
(450 µl of 500 mM sodium carbonate, pH 10). Time point
measurements were taken every 24 hours for 72 hours.
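The loadings stated for the bagasse assay can be cross-checked with simple arithmetic. The sketch below is illustrative only; the helper names are hypothetical, and the inputs are the figures given in the paragraph above (250 mg solids per 5 ml, 4 mg enzyme per g solids, 150 µl sample quenched with 450 µl stop buffer).

```python
def percent_solids_w_v(solids_mg, volume_ml):
    """Solids loading as percent weight/volume (g per 100 ml)."""
    return (solids_mg / 1000.0) / volume_ml * 100.0


def enzyme_per_reaction_mg(dose_mg_per_g_solids, solids_mg):
    """Enzyme mass in one reaction for a dose given per gram of solids."""
    return dose_mg_per_g_solids * solids_mg / 1000.0


def quench_dilution_factor(sample_ul, stop_ul):
    """Fold dilution of a sample after stop buffer is added."""
    return (sample_ul + stop_ul) / sample_ul


solids_pct = percent_solids_w_v(250, 5)        # 250 mg in 5 ml
enzyme_mg = enzyme_per_reaction_mg(4, 250)     # 4 mg enzyme per g solids
dilution = quench_dilution_factor(150, 450)    # 150 ul sample + 450 ul stop

print(f"solids: {solids_pct:.1f}% w/v, enzyme: {enzyme_mg:.2f} mg, "
      f"quench dilution: {dilution:.0f}-fold")
```

Under these figures each 5 ml reaction holds 5% solids and 1 mg of CBH I, and any sugar concentration measured in the quenched sample would need to be multiplied by the dilution factor to refer back to the reaction.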
[0090] Cellobiose Tolerance Assays (or Cellobiose Inhibition
Assays):
[0091] Tolerance to cellobiose (or inhibition caused by cellobiose)
was tested in two ways in the CBH I assays. A direct-dose tolerance
method can be applied to all of the CBH I assays (i.e., 4-MUL,
PASC, and/or bagasse assays), and entails the exogenous addition of
a known amount of cellobiose into assay mixtures. A different
indirect method entails the addition of an excess amount of
β-glucosidase (BG) to PASC and bagasse assays (typically, 1 mg
β-glucosidase/g solids loaded). BG will enzymatically
hydrolyze the cellobiose generated during these assays; therefore,
CBH I activity in the presence of BG can be taken as a measure of
activity in the absence of cellobiose. Furthermore, when the activities
in the presence and absence of BG are similar, this indicates
tolerance to cellobiose. Notably, in cases where BG activity is
undesired, but may be present in crude CBH I enzyme preparations,
the BG inhibitor gluconolactone can be added into CBH I assays to
prevent cellobiose breakdown.
Library Screening Assays
[0092] The wild type CBH I polypeptide BD29555 was mutagenized to
identify variants with improved product tolerance. A small
(60-member) library of BD29555 variants was designed to identify
variant CBH I polypeptides with reduced product inhibition. This
product-release-site library was designed based on residues
directly interacting with the cellobiose product in an attempt to
identify variants with weakened interactions with cellobiose from
which the product would be released more readily than from the wild
type enzyme. The 60-member evolution library contained wild-type
residues and mutations at positions R273, W405, and R422 of BD29555
(SEQ ID NO:1), and included the following substitutions: R273 (WT),
R273Q, R273K, R273A, W405 (WT), W405Q, W405H, R422 (WT), R422Q,
R422K, R422L, and R422E (4 variants at position 273 × 3
variants at position 405 × 5 variants at position 422 equals 60
variants in total). All members of the library were screened using
the 4-MUL assay in the presence and absence of 250 mg/L cellobiose
and using gluconolactone to inhibit any BG activity. The R273A,
R273Q, and R273K/R422K variants showed enhanced product tolerance.
The R273K/R422K variant showed the greatest activity among the
variants and cellobiose tolerance at 250 mg/L. Due to low expression,
the R273K variant was not tested for product inhibition.
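The size of the combinatorial library follows from taking one option per position. A minimal sketch of that enumeration is given below; the dictionary layout, the use of None for the wild-type residue and the "WT" label are illustrative choices, not part of the disclosure.

```python
from itertools import product

# One entry per mutated position; None stands for the wild-type residue.
SUBSTITUTIONS = {
    273: [None, "R273Q", "R273K", "R273A"],
    405: [None, "W405Q", "W405H"],
    422: [None, "R422Q", "R422K", "R422L", "R422E"],
}


def enumerate_library(options):
    """Return one name per combination, e.g. 'R273K/R422K' or 'WT'."""
    names = []
    for combo in product(*options.values()):
        subs = [s for s in combo if s is not None]
        names.append("/".join(subs) if subs else "WT")
    return names


library = enumerate_library(SUBSTITUTIONS)
print(len(library), "library members")
```

The enumeration includes the fully wild-type combination, so the 60 members comprise the parent enzyme plus 59 substituted variants.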
Characterization of Product Tolerant Variants of BD29555
[0093] The R273K/R422K substitutions were characterized in both a
wild type BD29555 background and also in combination with the
substitutions Y274Q, D281K, Y410H, P411G, which were identified in
a screen of an expanded product release site evolution library.
[0094] The wild type, the R273K/R422K variant and the
R273K/Y274Q/D281K/Y410H/P411G/R422K variant were tested for
activity on 4-MUL in the presence and absence of 250 mg/L
cellobiose, and the R273K/R422K variant was also tested in the
bagasse assay in the presence and absence of BG. The results are
summarized in Table 5.
[0095] The results from these activity assays were converted into
the percentage of activity remaining with and without cellobiose
present, where values close to 100% indicated cellobiose tolerance.
The percent of activity remaining in the MUL assay in the presence
of cellobiose versus in the absence of cellobiose shows that the
R273K/R422K variant was the most tolerant, followed by the
R273K/Y274Q/D281K/Y410H/P411G/R422K variant, and then wild-type, at
95%, 78%, and 25% activity, respectively.
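The conversion to percentage of activity remaining can be sketched as follows. The rates below are hypothetical, in arbitrary units, and were chosen only so that the resulting percentages match the values reported above; they are not measurements from the disclosure, and the function names are illustrative.

```python
def percent_activity_remaining(rate_with_cellobiose, rate_without_cellobiose):
    """Activity with cellobiose as a percentage of activity without it."""
    if rate_without_cellobiose <= 0:
        raise ValueError("uninhibited rate must be positive")
    return 100.0 * rate_with_cellobiose / rate_without_cellobiose


def rank_by_tolerance(rates):
    """rates maps a name to (rate with cellobiose, rate without cellobiose)."""
    scored = {name: percent_activity_remaining(w, wo)
              for name, (w, wo) in rates.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rates (arbitrary units) picked to reproduce the reported values.
example_rates = {
    "wild type": (25.0, 100.0),
    "R273K/R422K": (95.0, 100.0),
    "R273K/Y274Q/D281K/Y410H/P411G/R422K": (78.0, 100.0),
}

for name, pct in rank_by_tolerance(example_rates):
    print(f"{name}: {pct:.0f}% activity remaining")
```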
[0096] Cellobiose dose response curves of the wild-type and
R273K/R422K variant of BD29555 were obtained during the 4-MUL
assay. Enzyme rates (Vo) were measured in the presence of different
concentrations of cellobiose (200 mM MES pH 6, 25.degree. C.).
Rates were measured in quadruplicate. The results are shown in FIGS.
1A-1B. FIG. 1A shows that wild type BD29555 is inhibited by
cellobiose, with a half maximal inhibitory concentration (IC₅₀
value) of 60 mg/L. FIG. 1B shows that the R273K/R422K variant is
tolerant to cellobiose up to 250 mg/L.
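One simple way to read a half maximal inhibitory concentration off a dose response curve is to interpolate between the two tested concentrations that bracket 50% activity. The sketch below assumes log-linear interpolation, which is a common convention but is not stated in the disclosure as the method used; the function name and both data sets are hypothetical.

```python
import math


def estimate_ic50(concentrations, fractional_activity):
    """Interpolate (on a log concentration axis) where activity crosses 0.5.

    Returns None when the curve never drops to 50% within the tested range.
    Pairs starting at zero concentration are skipped because log(0) is undefined.
    """
    points = sorted(zip(concentrations, fractional_activity))
    for (c1, a1), (c2, a2) in zip(points, points[1:]):
        if c1 > 0 and a1 >= 0.5 >= a2:
            if a1 == a2:
                return c1
            frac = (a1 - 0.5) / (a1 - a2)
            log_c = math.log(c1) + frac * (math.log(c2) - math.log(c1))
            return math.exp(log_c)
    return None


# Hypothetical curves: activity as a fraction of the no-cellobiose rate.
doses_mg_per_l = [0, 30, 120, 250]
inhibited = [1.0, 2 / 3, 1 / 3, 0.19]      # crosses 50% between 30 and 120
tolerant = [1.0, 0.98, 0.97, 0.95]         # never approaches 50%

print("inhibited enzyme IC50 (mg/L):", estimate_ic50(doses_mg_per_l, inhibited))
print("tolerant enzyme IC50:", estimate_ic50(doses_mg_per_l, tolerant))
```

For a tolerant variant the function returns None, which corresponds to reporting that no inhibition midpoint was reached up to the highest concentration tested.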
[0097] The bagasse assay results shown in Table 5, which lists the
percentage of activity remaining in the absence vs. presence of BG,
also demonstrate that the percentage activity of the wild type
BD29555 is lower than the percentage activity of the R273K/R422K
variant, indicating that the R273K/R422K variant is less sensitive
to the presence of cellobiose than the wild type. FIGS. 2A-2B show
bar graph data for the bagasse assay of BD29555 vs. the R273K/R422K
variant. In FIG. 2A, bars represent relative activity, which has
been normalized to wild type activity in the absence of cellobiose
(WT+BG=uninhibited activity=1). In FIG. 2B, bars indicate tolerance
to cellobiose, as represented by the ratio of activity in the
presence of cellobiose (-BG) to that of activity in the absence of
cellobiose (+BG); ratios close to 1 indicate greater tolerance to
cellobiose. These data again demonstrate that the R273K/R422K
variant of BD29555 is more tolerant to cellobiose than the wild
type BD29555.
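The two quantities plotted for the bagasse assay, relative activity normalized to wild type with BG and the -BG/+BG tolerance ratio, can be computed as sketched below. The activity values are hypothetical placeholders in arbitrary units, not data from Table 5 or the figures, and the function name is illustrative.

```python
def summarize_bagasse(activities, reference="wild type"):
    """activities maps a name to {'+BG': value, '-BG': value}.

    Returns, per enzyme, activities relative to the reference enzyme with BG
    (so reference +BG equals 1) and the tolerance ratio (-BG divided by +BG).
    """
    ref = activities[reference]["+BG"]
    if ref <= 0:
        raise ValueError("reference +BG activity must be positive")
    summary = {}
    for name, a in activities.items():
        summary[name] = {
            "relative +BG": a["+BG"] / ref,
            "relative -BG": a["-BG"] / ref,
            "tolerance": a["-BG"] / a["+BG"],
        }
    return summary


# Hypothetical activities (arbitrary units), for illustration only.
example = {
    "wild type": {"+BG": 10.0, "-BG": 4.0},
    "R273K/R422K": {"+BG": 8.0, "-BG": 7.0},
}

for name, row in summarize_bagasse(example).items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Keeping the two measures separate matters because a variant can have a tolerance ratio near 1 while still showing lower uninhibited activity than wild type.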
[0098] The wild type and R273K/R422K variant were also
characterized in the PASC assay. Results are shown in FIG. 3. The
activities of both wild type BD29555 (SEQ ID NO:1) and wild type T.
reesei CBH I (SEQ ID NO:2) were inhibited by cellobiose
concentrations starting around 1 g/L (with IC₅₀ values of 2.2
and 3 g/L, respectively), whereas the R273K/R422K variant showed
little inhibition in the presence of 10 g/L cellobiose.
Characterization of Product Tolerant Variants of T. reesei CBH I
[0099] Cellobiose product tolerant substitutions were introduced
into T. reesei CBH I (SEQ ID NO:2). A panel of variants with single
and double alanine and lysine substitutions at R268 and R411 was
expressed and analyzed. The variants were tested for activity on
4-MUL in the presence and absence of 250 mg/L cellobiose and also
in the bagasse assay in the absence and presence of BG. The results
from these assays were converted into the percentage activity
remaining in the presence and absence of cellobiose and BG,
respectively. Values are summarized in Table 6.
[0100] The 4-MUL assay results shown in Table 6 demonstrate that
the activity of the wild type T. reesei CBH I was reduced to 23% in
the presence of cellobiose, whereas the double mutants at R268 and
R411 retained more than 90% of their activity under the same
conditions.
[0101] The bagasse assay results shown in Table 6 demonstrate that
the activity of the wild type T. reesei CBH I is more significantly
impacted by the presence of BG than is the activity of the single
or double substitution variants, indicating that the variants are
less sensitive to the accumulation of cellobiose than the wild
type. FIGS. 4 and 5 show bar graph data for the bagasse assay of
wild type T. reesei CBH I vs. the variants. In FIG. 4, bars
represent relative activity, normalized to wild type activity in
the absence of cellobiose (WT+BG=1). In FIG. 5, bars represent
tolerance to cellobiose, as represented by the ratio of activity in
the presence of accumulating cellobiose (-BG) to that of activity
in the absence of cellobiose (+BG); ratios close to 1 indicate
greater tolerance to cellobiose.
Specific Embodiments and Incorporation by Reference
[0102] All publications, patents, patent applications and other
documents cited in this application are hereby incorporated by
reference in their entireties for all purposes to the same extent
as if each individual publication, patent, patent application or
other document were individually indicated to be incorporated by
reference for all purposes.
[0103] While various specific embodiments have been illustrated and
described, it will be appreciated that various changes can be made
without departing from the spirit and scope of the
invention(s).
TABLE 1
Sequence Identifier (SEQ ID NO:) | Database Accession Number | Species of Origin | Amino Acid Sequence
BD29555*
Unknown MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC
TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT
TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT
MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA
CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL
GVTDFYGSGK TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS
GNVINSDFCD AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY
PTNATGTPGA ARGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT
TASGTTTTKA SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL
340514556 Trichoderma reesei MYRKLAVISA FLATARAQSA CTLQSETHPP
LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL
DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV
SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE
PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG
GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ
PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN
MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS
GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG TTCQVLNPYY
SQCL 51243029 Penicillium occitanis MSALNSFNMY KSALILGSLL
ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT
WNSAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ
IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR
DLKFIAGQAN VEGWTPSANN ANTGIGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC
TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT
STGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDYCA AEISTFGGTA SFNKHGGLTN
MAAGMEAGMV LVMSLWDDYA VNMLWLDSTY PTNATGTPGA ARGTCATTSG DPKTVESQSG
SSYVTFSDIR VGPFNSTFSG GSSTGGSTTT TASRTTTTSA SSTSTSSTST GTGVAGHWGQ
CGGQGWTGPT TCVSGTTCTV VNPYYSQCL 7cel (PDB) & Trichoderma reesei
ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL
CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF
TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL
KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG
DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI
NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM
VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN
IKFGPIGSTG NPSG 67516425 Aspergillus nidulans FGSC A4 MASSFQLYKA LLFFSSLLSA
VQAQKVGTQQ AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY
TNCYTGNEWD TSICTSNEVC AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR
VYLMDDEDTY TMFYLLNKEF TFDVDVSELP CGLNGAVYFV SMDADGGKSR YATNEAGAKY
GTGYCDSQCP RDLKFINGVA NVEGWESSDT NPNGGVGNHG SCCAEMDIWE ANSISTAFTP
HPCDTPGQTL CTGDSCGGTY SNDRYGGTCD PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT
VVTQFLTDDN TDTGTLSEIK RFYVQNGVVI PNSESTYPAN PGNSITTEFC ESQKELFGDV
DVFSAHGGMA GMGAALEQGM VLVLSLWDDN YSNMLWLDSN YPTDADPTQP GIARGTCPTD
SGVPSEVEAQ YPNAYVVYSN IKFGPIGSTF GNGGGSGPTT TVTTSTATST TSSATSTATG
QAQHWEQCGG NGWTGPTVCA SPWACTVVNS WYSQCL 46107376 Gibberella zeae
PH-1 MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG CSNVKGSVTI
DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAEKCCLD GADYASTYGI TSSGDQLSLS
FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG
GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI GNLGTCCPEM
DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF
YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGNSLTA
DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK
LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN PTNPSEPSNT
ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ 70992391 Aspergillus
fumigatus Af293 MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS
CTTNNGKVVI DANWRWVHKV GDYTNCYTGN TWDTTICPDD ATCASNCALE
GANYESTYGV TASGNSLRLN FVTTSQQKNI GSRLYMMKDD STYEMFKLLN QEFTFDVDVS
NLPCGLNGAL YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP
SSNDANAGTG NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG
TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTSSGTLK EIKRFYVQNG
KVIPNSESTW TGVSGNSITT EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW
DDHSANMLWL DSNYPTTASS TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG
STFNSGGSNP GGGTTTTTTT QPTTTTTTAG NPGGTGVAQH YGQCGGIGWT GPTTCASPYT
CQKLNDYYSQ CL 121699984 Aspergillus clavatus NRRL 1 MLPSTISYRI YKNALFFAAL
FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI DANWRWVHDV
KGYTNCYTGN TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK FVTQSQQKNI
GSRLYMMKDD NTYETFKLLN QEFTFDVDVS NLPCGLNGAL YFVSMDADGG LSRYTGNEAG
AKYGTGYCDS QCPRDLKFIN GLANVEGWTP SSSDANAGNG GHGSCCAEMD IWEANSISTA
YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG TCDPDGCDFN SYRQGNKSFY GPGMTVDTKK
KMTVVTQFLT NDGTATGTLS EIKRFYVQDG KVIANSESTW PNLGGNSLTN DFCKAQKTVF
GDMDTFSKHG GMEGMGAALA EGMVLVMSLW DDHNSNMLWL DSNSPTTGTS TTPGVARGSC
DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA TSTTTTKATT
TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL 1906845
Claviceps purpurea MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR
SGCRSVQGAV TVDANWLWTT VDGSQNCYTG NRWDTSICSS EKTCSESCCI DGADYAGTYG
VTTTGDALSL KFVQQGPYSK NVGSRLYLMK DESRYEMFTL LGNEFTFDVD VSKLGCGLNG
ALYFVSMDED GGMKRFPMNK AGAKFGTGYC DSQCPRDVKF INGMANSKDW IPSKSDANAG
IGSLGACCRE MDIWEANNIA SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD
FNSYRLGNTT FYGPGPKFTI DTTRKISVVT QFLKGRDGSL REIKRFYVQN GKVIPNSVSR
VRGVPGNSIT QGFCNAQKKM FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW
LDSTYPTNSR QRGSKRGSCP ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK 1gpi
(PDB) & Phanerochaete chrysosporium ESACTLQSET HPPLTWQKCS SGGTCTQQTG
SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN
CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD
VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE
GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN
RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT
FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY
YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG
NPSG 119468034 Neosartorya fischeri NRRL 181 MHQRALLFSA LAVAANAQQV
GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS TKDTTNCYTG
NTWNTELCPD NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS RLYLMQDDET
YQHFNLLNNE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK YGTGYCDSQC
PRDLKFINGM ANVEGWKPSS NDKNAGVGGH GSCCPEMDIW EANSISTAVT PHPCDDVSQT
MCSGDACGGT YSATRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSEM TVVTQFITAD
GTDTGALSEI KRLYVQNGKV IANSVSNVAD VSGNSISSDF CTAQKKAFGD EDIFAKHGGL
SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG AGDPEKVESQ
HPDASVTFSN IKFGPIGSTY KA 7804883 Leptosphaeria maculans MYRSLIFATS
LLSLAKGQLV GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS
CATNCAIDGA DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST
YQLFKFTGSQ EFTFDVDLSN LPCGLNGALY FVSMDADGGL KKYPTNKAGA KYGTGYCDAQ
CPRDLKFING EGNVEGWQPS KNDQNAGVGG HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ
SRCDGDGCGG TYSADRYAGV CDPDGCDFNS YRMGVKDFYG KGKTVDTSKK FTVVTQFIGS
GDAMEIKRFY VQNGKTIPQP DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF KDKGGMANMP
STCNGMVLVM SLWDDHYSNM LWLDSTYPTD KNPDTDAGSG RGECAITSGV PADVESQHPD
ASVIYSNIKF GPINTTFG 85108032 Neurospora crassa N150 (OR74A) MLAKFAALAA
LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT SGSTNCYSGN
EWDTSLCSTN TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN
IGSRTYLMNG ADAYQGFELL GNEFTFDVDV SGTGCGLNGA LYFVSMDLDG GKAKYTNNKA
GAKYGTGYCD AQCPRDLKYI NGIANVEGWT PSTNDANAGI GDHGTCCSEM DIWEANKVST
AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF NSYRMGNTTF YGEGKTVDTS
SKFTVVTQFI KDSAGDLAEI KRFYVQNGKV IENSQSNVDG VSGNSITQSF CNAQKTAFGD
IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS TYPVEGGPGA YRGECPTTSG
VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS
NPSGTGAAHW AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V 169859458 Coprinopsis
cinerea okayama MFKKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ
VVLDANWRWL HVTDGYTNCY TGNSWNSTVC SDPTTCAQRC ALEGANYQQT
YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN
GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK FIDGVANSAD WTPSETDPNA
GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD
YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT
NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH
MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY
154292161 Botryotinia fuckeliana B05-10 MYSAAVLATF SFLLGAGAQQ VGTSTAETHP
ALTVQKCAAG GTCTDESDSI VLDANWRWLH STSGSTNCYT GNTWDTTLCP
DAATCTTNCA LDGADYEGTY GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN
EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG
TANVEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHVCTVDSQ TACTGDDCAS
NTGVCDGDGC DFNPYRMGNT TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV
QDDVVYEQPS SDISGVSGNS ITDDFCAAQK TAFGDTDYFT QNGGMAAMGK KMADGMVLVL
SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATDSGVP ATVEAASGSA YVTFSSIKYG
PIGSTFNAPA DSSSSVSASS SPAPIASSSS SASIAPVSSV VAAIVSSSAQ AISSAAPVVS
SSAQAISSAA PVVSSVVSSA APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS
AASSTGSVPL YGNCTGGKTC SEGTCVVQND YYSQCVASS 169615761 #
Phaeosphaeria nodorum SN15 MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF
DGNEWNKTAC PSNAACTKNC AIEGSDYRGT YGITTSGNSL TLKFITKGQY
STNVGSRTYL MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP EKGQGTPGAK
YGTGKLSQCS VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV
WEANSMSTAL TPHSCQPEGY AVCEESNCGG TYSLDRYAGT CDANGCDFNP YRVGNKDFYG
KGKTVDTSKK MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG NSITQKWCDT
QKEVFKEEVY PFNQWGGMAS MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG
AARGECAITS GAPAEVEANN PDASVMFSNI KFGPIGSTFQ QPA 4883502 Humicola
grisea MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR
CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST
YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI
NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN
AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN
GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP
NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS
YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PNAQVVWSNI RFGPIGSTVN V 950686
Humicola grisea MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG
GQCQTVQASI TLDSNWRWTH QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY
GITTNGDSLS LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN
GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA
GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC
DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI
PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL
DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN
GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW
YSQCL 124491660 Chaetomium thermophilum MQIKQYLQYL AAALPLVNMA AAQRAGTQQT
ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA
CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM
FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD
LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE
TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ
FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM
VLVMSIWDGH YANMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI
RFGPIGSTYQ V 58045187 Chaetomium thermophilum MMYKKFAALA ALVAGAAAQQ ACSLTTETHP
RLTWKRCTSG GNCSTVNGAV TIDANWRWTH TVSGSTNCYT GNEWDTSICS
DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE
LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK
FINGEANIEN WTPSTNDANA GFGRYGSCCS EMDIWDANNM ATAFTPHPCT IIGQSRCEGN
SCGGTYSSER YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS
EIKRFYVQDG KIIANAESKI PGNPGNSITQ EWCDAQKVAF GDIDDFNRKG GMAQMSKALE
GPMVLVMSVW DDHYANMLWL DSTYPIDKAG TPGAERGACP TTSGVPAEIE AQVPNSNVIF
SNIRFGPIGS TVPGLDGSTP SNPTATVAPP TSTTTSVRSS TTQISTPTSQ PGGCTTQKWG
QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL 169601100 # Phaeosphaeria nodorum SN15
MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA
DYKGTYGITA SGNSLQLKFI TKGSYSTNIG SRTYLMASDT AYQMFKFDGN
KEFTFDVDLS GLPCGFNGAL YFVSMDEDGG LKKYSGNKAG AKYGTGYCDA QCPRDLKFIN
GEGNVEGWKP SDNDANAGVG GHGSCCAEMD IWEANSISTA VTPHACSTIE QTRCDGDGCG
GTYSADRYAG VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK KFTVVTQFIG TGDAMEIKRF
YVQGGKTIEQ PASTIPGVEG NSITTKFCDQ QKQVFGDRYT YKEKGGTANM AKALAQGMVL
VMSLWDDHYS NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS PDATVIYSNI
KFGPLNSTY 169870197 Coprinopsis cinerea okayama MLGKIAIASL SFLAIAKGQQ
VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY
TGNSWNSSVC SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG TNVGGRVYLM
ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSSQPNN RAGAKYGTGY
CDSQCPRDIK FIDGVANSVG WEPSETDSNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR
TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTIDT NRKMTVVTQF
ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF
ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT
PRETEQNHPD AQVIFSNIKF GDIGSTFSGY 3913806 Agaricus bisporus
MFPRSILLAL SLTAVALGQQ VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR
INDFTNCYTG NEWDTSICPD GVTCAENCAL DGADYAGTYG VTSSGTALTL KFVTESQQKN
IGSRLYLMAD DSNYEIFNLL NKEFTFDVDV SKLPCGLNGA LYFSEMAADG GMSSTNTAGA
KYGTGYCDSQ CPRDIKFIDG EANSEGWEGS PNDVNAGTGN FGACCGEMDI WEANSISSAY
TPHPCREPGL QRCEGNTCSV NDRYATECDP DGCDFNSFRM GDKSFYGPGM TVDTNQPITV
VTQFITDNGS DNGNLQEIRR IYVQNGQVIQ NSNVNIPGID SGNSISAEFC DQAKEAFGDE
RSFQDRGGLS GMGSALDRGM VLVLSIWDDH AVNMLWLDSD YPLDASPSQP GISRGTCSRD
SGKPEDVEAN AGGVQVVYSN IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG
QGWTGPTACQ SPSTCHVIND FYSQCF 169611094 Phaeosphaeria nodorum SN15 MYRNLALASL
SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG KITLDANWRW THVTTGYTNC
YDGNSWNTTA CPDGATCTKN CAVDGADYSG TYGITTSSNS LSIKFVTKGS
NSANIGSRTY LMESDTKYQM FNLIGQEFTF DVDVSKLPCG LNGALYFVEM AADGGIGKGN
NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN AGSGKIGACC PEMDIWEANS
ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC DFNSYRMGVK DFYGPGATLD
TTKKMTVVTQ FLGSGSTLSE IKRFYVQNGK VFKNSDSAIE GVTGNSITES FCAAQKTAFG
DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD STYPTNSTKL GAQRGTCAID
SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS 3131 Phanerochaete chrysosporium MVDIQIATFL
LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL
TANGWDPTLC PDGITCANYC ALDGVSYSST YGITTSGSAL
RLQFVTGTNI GSRVFLMADD THYRTFQLLN QELAFDVDVS KLPCGLNGAL YFVAMDADGG
KSKYPGNRAG AKYGTGYCDS QCPRDVQFIN GQANVQGWNA TSATTGTGSY GSCCTELDIW
EANSNAAALT PHTCTNNAQT RCSGSNCTSN TGFCDADGCD FNSFRLGNTT FLGAGMSVDT
TKTFTVVTQF ITSDNTSTGN LTEIRRFYVQ NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK
KAFIETNYFA QHGGLAQLGQ ALRTGMVLAF SISDDPANHM LWLDSNFPPS ANPAVPGVAR
GMCSITSGNP ADVGILNPSP YVSFLNIKFG SIGTTFRPA 70991503 Aspergillus
fumigatus Af293 MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG SCSQQSGSVV
IDANWRWLHS TKDTTNCYTG NTWNTELCPD NESCAQNCAL DGADYAGTYG
VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE FTFDVDVSNL PCGLNGALYF
VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS SDKNAGVGGH
GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF
RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI KRLYVQNGKV IANSVSNVAG
VSGNSITSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST
YPTDADPSKP GVARGTCEHG AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG 294196
Phanerochaete chrysosporium MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS
GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP
DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ
EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING
EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT
GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN
GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS
IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD
LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA
SPYTCHVLNP YYSQCY 18997123 Thermoascus aurantiacus MYQRALLFSF FLAAARAHEA
GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG
NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD
DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD
SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP
GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI
TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH
GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD
VESQNPNSYV IYSNIKVGPI NSTFTAN 4204214 Humicola grisea var thermoidea
MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW
RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST YGASTSGDSL
TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI NSALYFVAME
EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN AGVGPMGACC
AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG
NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP NSADITPELC
DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG
GDRGPCPTTS GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V 34582632 Trichoderma
viride (also known as Hypocrea rufa) MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV
IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL
DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL
GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD
SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV
GQEICEGDGC GGTYSDNRYG GTCDPDGCDW DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ
FETSGAINRY YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA EEAEFGGSSF SDKGGLTQFK
KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN
AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG
YSGPTVCASG TTCQVLNPYY SQCL 156712284 Thermoascus aurantiacus MYQRALLFSF
FLAAARAQQA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG
NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN
IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG GLSKYPGNKA
GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM DVWEANSIST
AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF YGPGQIVDTS
SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA
FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT
CPTTSGVPAD VESQYPNSYV IYSNIKVGPI NSTFTAN 39977899 Magnaporthe
grisea (oryzae) 70-15 MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT
IDANWRWTHT TSGYTNCYTG NKWDTSICST NADCASKCCV
DGANYQQTYG ASTSGNALSL QYVTQSSGKN VGSRLYLLES ENKYQMFNLL GNEFTFDVDA
SKLGCGLNGA VYFVSMDADG GQSKYSGNKA GAKYGTGYCD SQCPRDLKYI NGAANVEGWQ
PSSGDANSGV GNMGSCCAEM DIWEANSIST AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA
GDCDPDGCDF NSYRQGNRTF YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD IKRFYVQNGK
VIPNSQSTIT GVTGNSVTQD YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD
DHHSQMLWLD STYPTTSTAP GAARGSCSTS SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF
KSS 20986705 Talaromyces emersonii MLRRALLLSS SAILAVKAQQ AGTATAENHP
PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT GNTWDPTYCP DDETCAQNCA
LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS TYQIFKLLNR EFSFDVDVSN
LPCGLNGALY FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ CPRDLKFIDG EANVEGWQPS
SNNANTGIGD HGSCCAEMDV WEANSISNAV TPHPCDTPGQ TMCSGDDCGG TYSNDRYAGT
CDPDGCDFNP YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD DGTDTGTLSE IKRFYIQNSN
VIPQPNSDIS GVTGNSITTE FCTAQKQAFG DTDDFSQHGG LAKMGAAMQQ GMVLVMSLWD
DYAAQMLWLD SDYPTDADPT TPGIARGTCP TDSGVPSDVE SQSPNSYVTY SNIKFGPINS
TFTAS 22138843 Aspergillus oryzae MHQRALLFSA FWTAVQAQQA GTLTAETHPS
LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG NTWDATLCPD NESCASNCAL
DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN EFTFDVDASK
LPCGLNGAVY FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING
QVRKGWEPSD SDKNAGVGGH GSCCPQMDIW EANSISTAYT PHPCDDTAQT MCEGDTCGGT
YSSERYAGTC DPDGCDFNAY RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI
KRFYVQGGKV IANAASNVDG VTGNSITADF CTAQKKAFGD DDIFAQHGGL QGMGNALSSM
VLTLSIWDDH HSSMMWLDSS YPEDADATAP GVARGTCEPH AGDPEKVESQ SGSATVTYSN
IKYGPIGSTF DAPA 55775695 Penicillium chrysogenum MASTLSFKIY KNALLLAAFL
GAAQAQQVGT STAEVHPSLT WQKCTAGGSC TSQSGKVVID SNWRWVHNTG
GYTNCYTGND WDRTLCPDDV TCATNCALDG ADYKGTYGVT ASGSSLRLNF VTQASQKNIG
SRLYLMADDS KYEMFQLLNQ EFTFDVDVSN LPCGLNGALY FVAMDEDGGM ARYPTNKAGA
KYGTGYCDAQ CPRDLKFING QANVEGWEPS SSDVNGGTGN YGSCCAEMDI WEANSISTAF
TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT CDPDGCDFNP YRMGNQSFYG PSKIVDTESP
FTVVTQFITN DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS AVTGNSITDS FCSAQKTAFK
DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD DHAANMLWLD STYPTSASST TPGAARGSCD
ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG TTTTKVTTTT ATKTTTTTGP
STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL 171676762 Podospora
anserina MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV
TLDSNWRWTH TLQGSTNCYS GNEWDTSICT TGTKCAQNCC VEGAEYAATY GITTSGNQLN
LKFVTEGKYS TNVGSRTYLM ENATKYQGFN LLGNEFTFDV DVSNIGCGLN GALYFVSMDL
DGGLAKYSGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WNPSTNDVNA GAGRYGTCCS
EMDIWEANNM ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC DFNSYRMGNK
EFYGKGKTVD TTKKMTVVTQ FLKNAAGELS EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT
QDWCDAQKIA FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW LDSTYPVDAA
GRPGAERGAC PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT
PTSTRVSSTS TAAQTPTSAP GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL
146350520 Pleurotus sp Florida MFPYIALVSF SFLSVVLAQQ VGTLTAETHP
QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT GNTWDTSLCP DAATCSRNCA
LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL ADDSHYQMFN LKNKEFTFDV
DVSQLPCGLN GALYFSQMDA DGGTGRFPNN KAGAKYGTGY CDSQCPHDIK FINGEANVQG
WQPSPNDSNA GKGQYGSCCA EMDIWEANSM ASAYTPHPCT VTTPTRCQGN DCGDGDNRYG
GVCDKDGCDF NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL TSDNTTSGTL SEIRRLYVQN
GRVIQNSKVN IPGMASTLDS ITESFCSTQK TVFGDTNSFA SKGGLRAMGN AFDKGMVLVL
SIWDDHEAKM LWLDSNYPLD KSASAPGVAR GTCATTSGEP KDVESQSPNA QVIFSNIKYG
DIGSTYSN 37732123 Gibberella zeae MYRAIATASA LIAAVRAQQV CSLTQESKPS
LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAERCCLD
GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV
SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ
PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG
GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG
KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW
HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST
YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ
CQ 156055188 Sclerotinia sclerotiorum 1980 MYSAAVLATF SFLLGAGAQQ VGTLKTESHP
PLTIQKCAAG GTCTDEADSV VLDANWRWLH STSGSTNCYT
GNTWDTTLCP DAATCTANCA FDGADYEGTY GITSSGDSLK LSFVTGSNVG SRTYLMDSET
TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ
CPQDMKFVSG GANNEGWVPD SNSANSGTGN IGSCCSEFDV WEANSMSQAL TPHTCTVDGQ
TACTGDDCAG NTGVCDADGC DFNPYRMGNT TFYGSGKTID TTKPFSVVTQ FITDDGTETG
TLTEIKRFYV QDDVVYEQPN SDISGVSGNS ITDDFCTAQK TAFGDTDYFS QKGGMAAMGK
KMADGMVLVL SIWDDYNVNM LWLDSDYPTT KDASTPGVSR GSCATTSGVP ATVEAASGSA
YVTFSSIKYG PIGSTFKAPA DSSSPVVASS SPAAVAAVVS TSSAQAVPSH PAVSSSQAAV
STPEAVSSAP EVPASSSAAQ SVAPTSTKPK CSKVSQSSTL ATSVAAPATT ATSAAVAATS
AASSSGSVPL YGNCTGGKTC SEGTCVVQNP WYSQCVASS 453224 Phanerochaete chrysosporium
MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH
STSGYTNCYT GNEWDTSLCP DGKTCAANCA LDGADYSGTY
GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY
LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG
TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF
LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI
TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK
DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS
SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPYYSQCY 50402144
Trichoderma reesei MYRKLAVISA FLATARAQSA CTLQSETHPP LTWQKCSSGG
TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG
VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA
LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI
GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC GGTYSDNRYG GTCDPDGCDW
NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGSYSG
NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT
NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF GPIGSTGNPS GGNPPGGNRG
TTTTRRPATT TGSSPGPTQS HYGQCGGIGY SGPTVCASGT TCQVLNPYYS QCL
115397177 Aspergillus terreus NIH2624 MPSTYDIYKK LLLLASFLSA SQAQQVGTSK
AEVHPSLTWQ TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY NNCYTGNTWD
TTLCPDDETC ASNCALEGAD YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY
KMFKLLNQEF TFDVDVSNLP CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP
RDLKFINGMA NVEGWEPSAN DANAGTGNHG SCCAEMDIWE ANSISTAYTP HPCDTPGQVM
CTGDSCGGTY SSDRYGGTCD PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG
TASGTLSEIK RFYVQNGKVI PNSESTWSGV SGNSITTAYC NAQKTLFGDT DVFTKHGGME
GMGAALAEGM VLVLSLWDDH NSNMLWLDSN YPTDKPSTTP GVARGSCDIS SGDPKDVEAN
DANAYVVYSN IKVGPIGSTF SGSTGGGSSS STTATSKTTT TSATKTTTTT TKTTTTTSAS
STSTGGAQHW AQCGGIGWTG PTTCVAPYTC QKQNDYYSQC L 154312003 Botryotinia
fuckeliana B05-10 MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG CTTSAQSIVV
DANWRWLHST TGSTNCYTGN TWDKTLCPDG ATCAANCALD GADYSGVYGI
TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL LNQEFTFDVD VSNLPCGLNG
ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW TPSSNDVNAG
AGQYGSCCSE MDIWEANKIS AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS
YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE IRRFYVQNGV VIPNSQSTIA
GVPGNSITDS FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD
APYPATKSPS APGVTRGSCS ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS
VPGTTTVSSA PASTATSGAG GVAKYAQCGG SGYSGATACV SGSTCVALNP YYSQCQ
49333365 Volvariella volvacea MFPAATLFAF SLFAAVYGQQ VGTQLAETHP
RLTWQKCTRS GGCQTQSNGA IVLDANWRWV HNVGGYTNCY TGNTWNTSLC PDGATCAKNC
ALDGANYQST YGITTSGNAL TLKFVTQSEQ KNIGSRVYLL ESDTKYQLFN PLNQEFTFDV
DVSQLPCGLN GAVYFSAMDA DGGMSKFPNN AAGAKYGTGY CDSQCPRDIK FINGEANVQG
WQPSPNDTNA GTGNYGACCN EMDVWEANSI STAYTPHPCT QQGLVRCSGT ACGGGSNRYG
SICDPDGCDF NSFRMGDKSF YGPGLTVNTQ QKFTVVTQFL TNNNSSSGTL REIRRLYVQN
GRVIQNSKVN IPGMPSTMDS VTTEFCNAQK TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL
SIWDDHAANM LWLDSNYPTD RPASQPGVAR GTCPTSSGKP SDVENSTANS QVIYSNIKFG
DIGSTYSA 729650 Penicillium janthinellum MKGSISYQIY KGALLLSALL NSVSAQQVGT
LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG STNCYTGNTW
DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY
QKFNLLNQEF TFDVDVSNLP CGLNGALYFV DMDADGGMAK YPTNKAGAKY GTGYCDSQCP
RDLKFINGQA NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE ANSISNAVTP HPCDTPSQTM
CTGQRCGGTY STDRYGGTCD PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG
TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV SGNSITDSWC NAQKSAFGDT NEFSKHGGMA
GMGAGLADGM VLVMSLWDDH ASDMLWLDST YPTNATSTTP GAKRGTCDIS RRPNTVESTY
PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS SSSKTTTTVT TTTTSSGSSG
TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL 146424871 Pleurotus sp
Florida MFRTAALTAF TLAAVVLGQQ VGTLTAENHP ALSIQQCTAS GCTTQQKSVV
LDSNWRWTHS LPVHTNCYTG NAWDASLCPD PTTCATNCAI DGADYSGTYG ITTSGNALTL
RFVTNGPYSK NIGSRVYLLD DADHYKMFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMPAD
GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW INGEANILDW SASATDANAG NGRYGACCAE
MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL
GRGKTIDTTK KITVVTQFIT DDNTSSGNLV EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS
DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS
PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP
VTSTTSSGPT TPTGPTGTVP KWGQCGGNGY SGPTTCVAGS TCTYSNDWYS QCL 67538012
Aspergillus nidulans FGSC A4 MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG
SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD NESCAQNCAV
DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNL
PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD
SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC
DPDGCDFNSY RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS
ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM
MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF
62006162 Fusarium poae MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG
CSNVQGSVTI DANWRWTHQV SGSTNCHTGN KWDTSVCTSG KVCAEKCCVD GADYASTYGI
TSSGNQLSLS FVTKGSYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA
LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWE PSKSDVNGGI
GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF
NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI
AGNPGSSLTS DFCTTQKKVF GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL
DSTYPTDSTA LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP
TNPTNPNPTN PTNPGTVDQW GQCGGTNYSG PTACKSPFTC KKINDFYSQC Q 146424873
Pleurotus sp Florida MFRTAALTAF TLAAVVLGQQ VGTLAAENHP ALSIQQCTAS
GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDSSLCPN PTTCATNCAI DGADYSGTYG
ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL KNQEFTFDVD MSGLPCGLNG
ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW SGSATDPNAG
NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG VCDKDGCDFN
SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV EIRRVYVQDG VTYQNSFSTF
PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL WSDHAAHMLW
LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI GSTYGGSTPP
VSSGNTSVPP VTSTTSSGPT TPTGPTGTVP KWGQCGGIGY SGPTSCVAGS TCTYSNEWYS
QCL 295937 Trichoderma viride MYQKLALISA FLATARAQSA CTLQAETHPP
LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL
DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV
SQLPCGLNGA LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE
PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDSC GGTYSGDRYG
GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ
PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN
MLWLDSTYPT DETSSTPGAV RGSSSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNPS
GGNPPGGNPP GTTTPRPATS TGSSPGPTQT HYGQCGGIGY IGPTVCASGS TCQVLNPYYS
QCL 6179889 # Alternaria alternata MTWQSCTAKG SCTNKNGKIV IDANWRWLHK
KEGYDNCYTG NEWDATACPD NKACAANCAV DGADYSGTYG ITAGSNSLKL KFITKGSYST
NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV DVSNLPCGFN GALYFVSMDA DGGLKKYSTN
KAGAKYGTGY CDAQCPRDLK FINGEGNVEG WKPSSNDANA GVGGHGSCCA EMDIWEANSV
STAVTPHSCS TIEQSRCDGD GCGGTYSADR YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD
TSKKFTVVTQ FIGTGDAMEI KRFYVQNGKT IAQPASAVPG VEGNSITTKF CDQQKAVFGD
TYTFKDKGGM ANMAKALANG MVLVMSLWDD HYSNMLWLDS TYPTDKNPDT DLGTGRGECE
TSSGVPADVE SQHADATVVY SNIKFGPLNS TFG 119483864 Neosartorya fischeri NRRL 181
MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS CATNQGSVVM
DANWRWVHQV GSTTNCYTGN TWDTSICDTD ETCATECAVD GADYESTYGV
TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD NTHYQMFKLL NQEFTFDVDV SNLPCGLNGA
LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI QGQANVEGWT PSSNNENTGL
GNYGSCCAEL DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA GTCDPDGCDF
NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI TDDGTDTGTL SEIRRYYVQN GVTYAQPDSD
ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL WDDYYADMLW
LDSTYPTNAS SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS
GSSGSGSSGS ASTSTTSTKT TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY
YSQCL 85083281 Neurospora crassa OR74A MKAYFEYLVA ALPLLGLATA QQVGKQTTET
HPKLSWKKCT GKANCNTVNA EVVIDSNWRW LHDSSGKNCY DGNKWTSACS
SATDCASKCQ LDGANYGTTY GASTSGDALT LKFVTKHEYG TNIGSRFYLM NGASKYQMFT
LMNNEFAFDV DLSTVECGLN AALYFVAMEE DGGMASYSSN KAGAKYGTGY CDAQCARDLK
FVGGKANIEG WTPSTNDANA GVGPYGGCCA EIDVWESNAH SFAFTPHACK TNKYHVCERD
NCGGTYSEDR FAGLCDANGC DYNPYRMGNT DFYGKGKTVD TSKKFTVVSR FEENKLTQFF
VQNGQKIEIP GPKWDGIPSD NANITPEFCS AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV
LVMSIWDDHY ANMLWLDSVY PPEKEGQPGA ARGDCPQSSG VPAEVESQYA NSKVVYSNIR
FGPVGSTVNV 3913803 Cryphonectria parasitica MFSKFALTGS LLAGAVNAQG VGTQQTETHP
QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT GNTWNTTLCP
DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL
IGNEFAFDVD LSKLPCGLNG ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF
INGQGNVEGW TPSTNDANAG VGGLGSCCSE MDVWEANSMD MAYTPHPCET AAQHSCNADE
CGGTYSSSRY AGDCDPDGCD WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI
SQYYIQGGTK IQQPNSTWPT LTGYNSITDD FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD
GMVLVMSLWD DHYANMLWLD STYPVDADAS SPGKQRGTCA TTSGVPADVE SSDASATVIY
SNIKFGPIGA TY 60729633 Corticium rolfsii MFPAAALLSF TLLAVASAQQ
IGTNTAEVHP SLTVSQCTTS GGCTSSTQSI VLDANWRWLH STSGYTNCYT GNQWNSDLCP
DPDTCATNCA LDGASYESTY GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL
LNKEFSFDVD ASNIGCGING AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF
INGEANLLDW NATSANSGTG SYGSCCPEMD IWEANKYAAA YTPHPCSVSG QTRCTGTSCG
AGSERYDGYC DKDGCDFNSW RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI
RRLYVQGGTV IQNSVANQPN IPKVNSITDS FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD
MVLVMSIWDD YDAEMLWLDS NYPTSGSAST PGISRGPCSA TSGLPATVES QQASASVTYS
NIKWGDIGST YSGSGSSGSS SSSSSSAASA STSTHTSAAA TATSSAAAAT GSPVPAYGQC
GGQSYTGSTT CASPYVCKVS NAYYSQCLPA 39971383 Magnaporthe grisea 70-15
MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV IDGNWRWIHN
IGGYENCYSG NKWTSVCSTN ADCATKCAME GAKYQETYGV STSGDALTLK
FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK NNEFAFDVDL SSVECGMNSA LYFVPMKEDG
GMSTEPNNKA GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ PSSTDSSAGI GAQGACCAEI
DIWESNKNAF AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY NPYRMGNPDF
YGPGKTIDTN RKFTVISRFE NNRNYQILMQ DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ
FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP EKAGQPGSAR
GPCPADGGDP NGVVNQYPNA KVIWSNVRFG PIGSTYQVD 39973029 Magnaporthe
grisea 70-15 MQLTKAGVFL GALMGGAAAQ QVGTQTAENH PKMTWKKCTG KASCTTVNGE
VVIDANWRWL HDASSKNCYD GNRWTDSCRT ASDCAAKCSL EGADYAKTYG
ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL LGNEFAFDVD LSTIECGLNS
ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF VGGKANIEGW KPSSNDANAG
VGPYGACCAE IDVWESNAHA FAFTPHPCTD NKYHVCQDSN CGGTYSDDRF AGKCDANGCD
INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV QNNKRIDMPS PALEGLPATG
AITAEYCTNV FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV MSIWDDHYSN MLWLDSVYPP
DKEGSPGAAR GDCPQDSGVP SEVESQIPGA TVVWSNIRFG PVGSTVNV 1170141
Fusarium oxysporum MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG
CSDVKGSVVI DANWRWTHQT SGSTNCYTGN KWDTSICTDG KTCAEKCCLD GADYSGTYGI
TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SGIGCGLNGA
PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI NGVANSEGWK PSDSDVNAGV
GNLGTCCPEM DIWEANSIST AFTPHPCTKL TQHSCTGDSC GGTYSSDRYG GTCDADGCDF
NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI
AGNPGSSLTS DFCSKQKSVF GDIDDFSKKG GWNGMSDALS APMVLVMSLW HDHHSNMLWL
DSTYPTDSTK VGSQRGSCAT TSGKPSDLER DVPNSKVSFS NIKFGPIGST YKSDGTTPNP
PASSSTTGSS TPTNPPAGSV DQWGQCGGQN YSGPTTCKSP FTCKKINDFY SQCQ
121710012 Aspergillus clavatus NRRL 1 MYQRALLFSA LATAVSAQQV GTQKAEVHPA
LTWQKCTAAG SCTDQKGSVV IDANWRWLHS TEDTTNCYTG NEWNAELCPD
NEACAKNCAL DGADYSGTYG VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE
FTFDVDVSNL PCGLNGALYF VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE
ANVEGWKPSD NDKNAGVGGY GSCCPEMDIW EANSISTAYT PHPCDGMEQT RCDGNDCGGT
YSSTRYAGTC DPDGCDFNSF RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI
RRVYVQGGKV IGNSASNVAG VEGDSITSDF CTAQKKAFGD EDIFSKHGGL EGMGKALNKM
ALIVSIWDDH ASSMMWLDST YPVDADASTP GVARGTCEHG LGDPETVESQ HPDASVTFSN
IKFGPIGSTY KSV 17902580 Penicillium funiculosum MSALNSFNMY KSALILGSLL
ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN
TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT
YLMADNTHYQ IFDLLNQEFT FTVDVSNLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG
VGYCDSQCPR DLKFIAGQAN VEGWTPSTNN SNTGIGNHGS CCAELDIWEA NSISEALTPH
PCDTPGLTVC TADDCGGTYS SNRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV
VTQFVTDDGT SSGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDFCA AELSAFGETA
SFTNHGGLKN MGSALEAGMV LVMSLWDDYS VNMLWLDSTY PANETGTPGA ARGSCPTTSG
NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA STTSTSSTST
GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL 1346226 Humicola grisea
var thermoidea MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI
TLDSNWRWTH QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY
GITTNGDSLS LKFVTKGQHS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN
GALYFVSMDA DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA
GAGRYGTCCS EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC
DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI
PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL
DSTFPVDAAG KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN
GGNNGGNPPP PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYICTKLNDW
YSQCL 156712282 Chaetomium thermophilum MMYKKFAALA ALVAGASAQQ ACSLTAENHP
SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT GNQWDTSLCT
DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE
LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK
FINGEANVGN WTPSTNDANA GFGRYGSCCS EMDVWEANNM ATAFTPHPCT TVGQSRCEAD
TCGGTYSSDR YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS
EIKRFYVQDG KIIANAESKI PGNPGNSITQ EYCDAQKVAF SNTDDFNRKG GMAQMSKALA
GPMVLVMSVW DDHYANMLWL DSTYPIDQAG APGAERGACP TTSGVPAEIE AQVPNSNVIF
SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS STSSPVSTPT GQPGGCTTQK
WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL 169768818 Aspergillus oryzae RIB40
MASLSLSKIC RNALILSSVL STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD
ANWRWVHQTG SSSNCYTGNK WDTSYCSTND ACAQKCALDG ADYSNTYGIT
TSGSEVRLNF VTSNSNGKNV GSRVYMMADD THYEVYKLLN QEFTFDVDVS KLPCGLNGAL
YFVVMDADGG VSKYPNNKAG AKYGTGYCDS QCPRDLKFIQ GQANVEGWVS STNNANTGTG
NHGSCCAELD IWESNSISQA LTPHPCDTPT NTLCTGDACG GTYSSDRYSG TCDPDGCDFN
PYRVGNTTFY GPGKTIDTNK PITVVTQFIT DDGTSSGTLS EIKRFYVQDG VTYPQPSADV
SGLSGNTINS EYCTAENTLF EGSGSFAKHG GLAGMGEAMS TGMVLVMSLW DDYYANMLWL
DSNYPTNEST SKPGVARGTC STSSGVPSEV EASNPSAYVA YSNIKVGPIG STFKS
46241270 Gibberella pulicaris MYRAIATASA LIAAVRAQQV CSLTPETKPA
LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID
GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV
SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ
PSKSDVNAGI GNMGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG
GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG
KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW
HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST
YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA QCGGTNYSGP TACKSPFTCK KINDFYSQCQ
49333363 Volvariella volvacea MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP
SLNWARCTSS GCTNVAGSVT LDANWRWLHT TSGYTNCYTG NSWNTTLCPD GATCAQNCAL
DGANYQSTCG ITTSGNALTL KFVTQGEQKN IGSRVYLMAS ESRYEMFGLL NKEFTFDVDV
SNLPCGLNGA LYFSSMDADG GMAKNPGNKA GAKYGTGYCD SQCPRDIKFI NGEANVAGWN
GSPNDTNAGT GNWGACCNEM DIWEANSISA AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC
DPDGCDFNSY RMGDKTYYGP GGTGVDTRSK FTVVTQFLTN NNSSSGTLSE IRRLYVQNGR
VVQNSKVNIP GMSNTLDSIT TGFCDSQKTA FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV
WDDHAANMLW LDSNYPVDAD PSKPGIARGT CSTTSGKPTD VEQSAANSSV TFSNIKFGDI
GTTYTGGSVT TTPGNPGTTT STAPGAVQTK WGQCGGQGWT GPTRCESGST CTVVNQWYSQ
CI 46395332 Irpex lacteus MFRKAALLAF SFLAIAHGQQ VGTNQAENHP
SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL
DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE FTFDVDMSNL
PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVAGWTGSS
SDPNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQD ANRYKGVCDP
DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR FYVQDGKVIP
NSKVNIAGCD AVNSITDKFC TQQKTAFGDT NRFADQGGLK QMGAALKSGM VLALSLWDDH
AANMLWLDSD YPTTADASKP GVARGTCPNT SGVPKDVESQ SGSATVTYSN IKWGDLNSTF
SGTASNPTGP SSSPSGPSSS SSSTAGSQPT QPSSGSVAQW GQCGGIGYSG ATGCVSPYTC
HVVNPYYSQC Y
50844407 # Chaetomium thermophilum var thermophilum TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS
TNCYTGNEWD TSICSDGKSC AQTCCVDGAD YSSTYGITTS
GDSLNLKFVT KHQHGTNVGS RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF
VSMDADGGMS KYSGNKAGAK YGTGYCDAQC PRDLKFINGE ANIENWTPST
NDANAGFGRY GSCCSEMDIW EANNMATAFT PHPCTIIGQS RCEGNSCGGT YSSERYAGVC
DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN
AESKIPGNPG NSITQEWCDA QKVAFGDIDD FNRKGGMAQM SKALEGPMVL VMSVWDDHYA
NMLWLDSTYP IDKAGTPGAE RGACPTTSGV PAEIEAQVPN SNVIFSNIRF GPIGSTVPGL
DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS TPTSQPGGCT TQKWGQCGGI GYTGCTNCVA
GTTCTELNPW YSQCL 4586347 Irpex lacteus MFHKAVLVAF SLVTIVHGQQ
AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT GNTWDASICS
DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET NYQMFKLMNQ
EFTFDVDVSN LPCGLNGAVY FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ CPQDIKFING
EANIVDWTAS AGDANSGTGS FGTCCQEMDI WEANSISAAY TPHPCTVTEQ TRCSGSDCGQ
GSDRFNGICD PDGCDFNSFR MGNTEFYGKG LTVDTSQKFT IVTQFISDDG TADGNLAEIR
RFYVQNGKVI PNSVVQITGI DPVNSITEDF CTQQKTVFGD TNNFAAKGGL KQMGEAVKNG
MVLALSLWDD YAAQMLWLDS DYPTTADPSQ PGVARGTCPT TSGVPSQVEG QEGSSSVIYS
NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS QPAQPTQPAG TAAQWAQCGG
MGFTGPTVCA SPFTCHVLNP YYSQCY 3980202 Phanerochaete chrysosporium MFRAAALLAF
TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT
GNEWNTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT
LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM
SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE
ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDHGDGCD FNSFRMGDKT FLGKGMTVDT
SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK
TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL SIWDDHAANM LWLDSDYPTD KDPSAPGVAR
GTCATTSGVP SDVESQVPNS QVVFSNIKFG DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP
PPTGPTVPQW GQCGGIGYSG STTCASPYTC HVLNPYYSQC Y 27125837 Melanocarpus albomyces
MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV NAEVVIDANW
RWLHDDNMQN CYDGNQWTNA CSTATDCAEK CMIEGAGDYL GTYGASTSGD
ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ MFNLMGNELA FDVDLSTVEC GINSALYFVA
MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN IEGWKSSTSD PNAGVGPYGS
CCAEIDVWES NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA NGCDYNPYRM
GNPDFYGKGK TLDTSRKFTV VSRFEENKLS QYFIQDGRKI EIPPPTWEGM PNSSEITPEL
CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS IYPPEKEGQP
GAARGDCPTD SGVPAEVEAQ FPDAQVVWSN IRFGPIGSTY DF 171696102 Podospora
anserina MYRSATFLTF ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL
DANWRWTHIT NGYTNCYTGN EWNATACPDG ATCAKNCAVD GADYSGTYGI TTPSSGALRL
QFVKKNDNGQ NVGSRVYLMA SSDKYKLFNL LNKEFTFDVD VSKLPCGLNG AVYFSEMLED
GGLKSFSGNK AGAKYGTGYC DSQCPQDIKF INGEANVEGW GGADGNSGTG KYGICCAEMD
IWEANSDATA YTPHVCSVNE QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY RLGNREFYGP
GKTVDTTRPF TIVTQFVTDD GTDSGNLKSI HRYYVQDGNV IPNSVTEVAG VDQTNFISEG
FCEQQKSAFG DNNYFGQLGG MRAMGESLKK MVLVLSIWDD HAVNMNWLDS IFPNDADPEQ
PGVARGRCDP ADGVPATIEA AHPDAYVIYS NIKFGAINST FTAN 3913802
Cochliobolus carbonum MYRTLAFASL SLYGAARAQQ VGTSTAENHP KLTWQTCTGT GGTNCSNKSG
SVVLDSNWRW AHNVGGYTNC YTGNSWSTQY CPDGDSCTKN CAIDGADYSG
TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM FNLINKEFTF DVDVSKLPCG
LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN
GGAGKIGACC PEMDIWEANS ISTAYTPHPC RGVGLQECSD AASCGDGSNR YDGQCDKDGC
DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE IKRFYVQNGK VYKNSQSAVA
GVTGNSITES FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG DHAVNMLWLD
STYPTDADPS KPGAARGTCP TTSGKPEDVE KNSPDATVVF SNIKFGPIGS TFAQPA
50403723 Trichoderma viride MYQKLALISA FLATARAQSA CTLQAETHPP
LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL
DGAAYASTYG VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV
SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE
PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICDGDSC GGTYSGDRYG
GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ
PNAELGDYSG NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN
MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNSS
GGNPPGGNPP GTTTTRRPAT STGSSPGPTQ THYGQCGGIG YSGPTVCASG STCQVLNPYY
SQCL 3913798 Aspergillus aculeatus MVDSFSIYKT ALLLSMLATS NAQQVGTYTA
ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT NCYSGNTWDS SICSTDTTCA
SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL YLLADDSTYE TFKLFNREFT
FDVDVSNLPC GLNGALYFVS MDADGGVSRF PTNKAGAKYG TGYCDSQCPR DLKFIDGQAN
IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA NSISSAFTAH PCDSVQQTMC TGDTCGGTYS
DTTDRYSGTC DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF TVVTQFITHD GTDTGTLTEI
RRLYVQNGVV IGNGPSTYTA ASGNSITESF CKAEKTLFGD TNVFETHGGL SAMGDALGDG
MVLVLSLWDD HAADMLWLDS DYPTTSCASS PGVARGTCPT TTGNATYVEA NYPNSYVTYS
NIKFGTLNST YSGTSSGGSS SSSTTLTTKA STSTTSSKTT TTTSKTSTTS SSSTNVAQLY
GQCGGQGWTG PTTCASGTCT KQNDYYSQCL 66828465 Dictyostelium discoideum MYRILKSFIL
LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT
GNTWNPTICP DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK
NIGSRLFLMS NESNYQLFHV LGQEFTFDVD VSNLDCGLNG ALYLVSMDSD GGSARFPTNE
AGAKYGTGYC DAQCPRDLKF ISGSANVDGW IPSTNNPNTG YGNLGSCCAE MDLWEANNMA
TAVTPHPCDT SSQSVCKSDS CGGAASSNRY GGICDPDGCD YNPYRMGNTS FFGPNKMIDT
NSVITVVTQF ITDDGSSDGK LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV NEEFCTNQKK
VFGDEDSFTK HGGLAKMGEA LKDGMVLVLS LWDDYQANML WLDSSYPTTS SPTDPGVARG
SCPTTSGVPS KVEQNYPNAY VVYSNIKVGP IDSTYKK 156060391 Sclerotinia sclerotiorum 1980
MISRVLAISS LLAAARAQQI GTNTAEVHPA LTSIVIDANW RWLHTTSGYT NCYTGNSWDA
TLCPDAVTCA ANCALDGADY SGTYGITTSG NSLKLNFVTK
GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL PCGLNGALYF AEMDADGGVS
RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY GSCCSEMDIW
EANKISAAYT PHPCSVDGQT RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG DTGFYGAGLT
VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN SQSKVTGVSG NSITDSFCAA
QKTAFGDTNE FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP ASKSPSAAGV
SRGSCSASSG VPADVEANSP GASVTYSNIK WGPINSTYSA GTGSNTGSGS GSTTTLVSSV
PSSTPTSTTG VPKYGQCGGS GYTGPTNCIG STCVSMGQYY SQCQ 116181754
Chaetomium globosum CBS 148-51 MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG
SCTKEDTTVV LDANWRWTHV TDGYTNCYTG NAWNETACPD GKTCAANCAI
DGAEYEKTYG ITTPEEGALR LNFVTESNVG SRVYLMAGED KYRLFNLLNK EFTMDVDVSN
LPCGLNGAVY FSEMDEDGGM SRFEGNKAGA KYGTGYCDSQ CPRDIKFING EANSEGWGGE
DGNSGTGKYG TCCAEMDIWE ANLDATAYTP HPCKVTEQTR CEDDTECGAG DARYEGLCDR
DGCDFNSFRL GNKEFYGPEK TVDTSKPFTL VTQFVTADGT DTGALQSIRR FYVQDGTVIP
NSETVVEGVD PTNEITDDFC AQQKTAFGDN NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA
VYMNWLDSNY PTDADPTKPG VARGRCDPEA GVPETVEAAH PDAYVIYSNI KIGALNSTFA
AA 145230535 Aspergillus niger MSSFQVYRAA LLLSILATAN AQQVGTYTTE
THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS ICTDDVTCAA
NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL FKLLGQEFTF
DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD LKFINGEANC
DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD GDSCGGTYSA
SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG TSSGTLTEIK
RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE GMGEAMAKGM
VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE SPNAYVTYSN
IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS AAQAYGQCGG
QGWTGPTTCV SGYTCTYENA YYSQCL 46241266 Nectria haematococca mpVI
MYRAIATASA LLATARAQQV CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST
SSSTNCYTGN TWDKTLCPDG KTCADKCCLD GADYSGTYGV TSSGNQLNLK
FVTVGPYSTN VGSRLYLMED ENNYQMFDLL GNEFTFDVDV NNIGCGLNGA LYFVSMDKDG
GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI NGVANSDEWK PSDSDKNAGV GKYGTCCPEM
DIWEANKIST AYTPHPCKSL TQQSCEGDAC GGTYSATRYA GTCDPDGCDF NPYRQGNKTF
YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD
SFCKAQKVAF NDPDDFNKKG GWSGMSDALA KPMVLVMSLW HDHYANMLWL DSTYPKGSKT
PGSARGSCPE DSGDPDTLEK EVPNSGVSFS NIKFGPIGST YTGTGGSNPD PEEPEEPEEP
VGTVPQYGQC GGINYSGPTA CVSPYKCNKI NDFYSQCQ 1q9h (PDB) # Talaromyces
emersonii EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC
YTGNTWDPTY CPDDETCAQN CALDGADYEG TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD
DSTYQIFKLL NREFSFDVDV SNLPCGLNGA LYFVAMDADG GVSKYPNNKA GAKYGTGYCD
SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM DVWEANSISN AVTPHPCDTP
GQTMCSGDDC GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT KPFTVVTQFL
TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH
GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT CPTDSGVPSD
VESQSPNSYV TYSNIKFGPI NSTFTAS 157362170 Polyporus arcularius
MFPTLALVSL SFLAIAYGQQ VGTLTAETHP KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH
DVGGSTNCYT GNTWDDSLCP DPTTCAANCA LDGADYSGTY GITTSGNALS LKFVTQGPYS
TNIGSRVYLL SEDDSTYEMF NLKNQEFTFD VDMSALPCGL NGALYFVEMD KDGGSGRFPT
NKAGSKYGTG YCDTQCPHDI KFINGEANVL DWAGSSNDPN AGTGHYGTCC NEMDIWEANS
MGAAVTPHVC TVQGQTRCEG TDCGDGDERY DGICDKDGCD FNSWRMGDQT FLGPGKTVDT
SSKFTVVTQF ITADNTTSGD LSEIRRLYVQ NGKVIANSKT QIAGMDAYDS ITDDFCNAQK
TTFGDTNTFE QMGGLATMGD AFETGMVLVM SIWDDHEAKM LWLDSDYPTD ADASAPGVSR
GPCPTTSGDP TDVESQSPGA TVIFSNIKTG PIGSTFTS 7804885 Leptosphaeria maculans
MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV NGEVVIDANW
RWLAHRSGYT NCYTGSEWNQ SACPNNEACT KNCAIEGSDY AGTYGITTSG
NQMNIKFITK RPYSTNIGAR TYLMKDEQNY EMFQLIGNEF TFDVDLSQRC GMNGALYFVS
MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK SASDPNSGVG KKGACCAQMD
VWEANSAATA LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN PFRVGVKDFY
GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ
EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP AKPGVARRDC
PTSGGKPSEV EAANPNAQVM FSNIKFGPIG STFAHAA 121852 Phanerochaete chrysosporium
MFRTATLLAF TMAAMVFGQQ VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH
STSGYTNCYT GNQWDATLCP DGKTCAANCA LDGADYTGTY
GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY
LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG
TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF
LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI
TDNFCSQQKT
AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG
TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS
STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY 126013214
Penicillium decumbens MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG
CTSKDGSVVI DANWRWVHSV DGYKNCYTGN EWDSTLCPDD ATCATNCAVD GADYAGTYGA
TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF TFDVDVSTLP CGLNGALYFV
SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN DKNAGVGPHG
SCCAEMDIWE ANSISTALTP HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR
MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK RVYVQNGKVI ANSASDVSGI
TGNSITSDFC TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE
KYPTDAAASK AGVSRGTCST DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA
156048578 Sclerotinia sclerotiorum 1980 MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS
CTTQSGSIVL DGNWRWTHST TSSTNCYTGN TWDATLCPDD
ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV GSRVYLLADN THYKTFNLLN
QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNKAGAQ YGTGYCDSQC PRDGKFINGK
ANVDGWVPSS NNPNTGVGNY GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDNCGGT
YSTTRYAGTC DPDGCDFNPY RQGNESFYGP GKTVDTNSVF TIVTQFLTTD GTSSGTLNEI
KRFYVQNGKV IPNSESTISG VTGNSITTPF CTAQKTAFGD PTSFSDHGGL ASMSAAFEAG
MVLVLSLWDD YYANMLWLDS TYPTTKTGAG GPRGTCSTSS GVPASVEASS PNAYVVYSNI
KVGAINSTFG 156712278 Acremonium thermophilum MYTKFAALAA LVATVRGQAA CSLTAETHPS
LQWQKCTAPG SCTTVSGQVT IDANWRWLHQ TNSSTNCYTG NEWDTSICSS
DTDCATKCCL DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL
LGQEFTFDVD VSNLGCGLNG ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF
INGQANIDGW QPSSNDANAG LGNHGSCCSE MDIWEANKVS AAYTPHPCTT IGQTMCTGDD
CGGTYSSDRY AGICDPDGCD FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE
IKRFYVQNGK VIPNSESKIA GVSGNSITTD FCTAQKTAFG DTNVFEERGG LAQMGKALAE
PMVLVLSVWD DHAVNMLWLD STYPTDSTKP GAARGDCPIT SGVPADVESQ APNSNVIYSN
IRFGPINSTY TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT TTNPSGPQQT HWGQCGGQGW
TGPTVCQSPY TCKYSNDWYS QCL 21449327 Aspergillus nidulans (also known as Emericella nidulans) MYQRALLFSA
LLSVSRAQQA GTAQEEVHPS LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG
NEWDATLCPD NESCAQNCAV DGADYEATYG ITSNGDSLTL
KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNF
PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD
SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC
DPDGCDFNSY RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS
ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM
MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF
171683762 Podospora anserina (S mat+) MMMKQYLQYL AAGSLMTGLV AGQGVGTQQT
ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN CYDGNAWNTA
ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR FYLMNGASKY
QMFTLMNNEF TFDVDLSTVE CGLNSALYFV AMEEDGGMRS YPTNKAGAKY GTGYCDAQCA
RDLKFVGGKA NIEGWRESSN DENAGVGPYG GCCAEIDVWE SNAHAYAFTP HACENNNYHV
CERDTCGGTY SEDRFAGGCD ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT VVTRFQDDNL
EQFFVQNGQK ILAPAPTFDG IPASPNLTPE FCSTQFDVFT DRNRFREVGD FPQLNAALRI
PMVLVMSIWA DHYANMLWLD SVYPPEKEGE PGAARGPCAQ DSGVPSEVKA NYPNAKVVWS
NIRFGPIGST VNV 56718412 Thermoascus aurantiacus var levisporus MYQRALLFSF FLAAARAQQA
GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG
NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN
IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG
GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSCCAEM
DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA GTCDPDGCDF NPYRQGNHSF
YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT
TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL WDDHAANMLW LDSTYPTDAD
PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI NSTFTAN 15824273
Pseudotrichonympha grassii MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT
SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD NCLIDGADYS
GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC
GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND
ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK
DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE
NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH
QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY
115390801 Aspergillus terreus NIH2624 MHQRALLFSA LVGAVRAQQA GTLTEEVHPP
LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG NTWDESLCPD
NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN
EFTFDVDVSN LPCGLNGALY FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING
MANVEGWTPS DNDKNAGVGG HGSCCPELDI WEANSISSAF TPHPCDDLGQ TMCSGDDCGG
TYSETRYAGT CDPDGCDFNA YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL
YVQNGKVIAN AQSNVDGVTG NSITSDFCTA QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI
LSIWDDHNSS MMWLDSTYPE DADASEPGVA RGTCEHGVGD PETVESQHPG ATVTFSKIKF
GPIGSTYSSN STA 453223 Phanerochaete chrysosporium MFRAAALLAF TCLAMVSGQQ
AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT
GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT
LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM
SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE
ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS
KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT
AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG
TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP
PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPCESILS LQRSSNADQY LQTTRSATKR
RLDTALQPRK 3132 Phanerochaete chrysosporium MRTALALILA LAAFSAVSAQ QAGTITAETH
PTLTIQQCTQ SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY
SGNTWDAILC PDPVTCAANC ALDGADYTGT FGILPSGTSV TLRPVDGLGL RLFLLADDSH
YQMFQLLNKE FTFDVEMPNM RCGSSGAIHL TAMDADGGLA KYPGNQAGAK YGTGFCSAQC
PKGVKFINGQ ANVEGWLGTT ATTGTGFFGS CCTDIALWEA NDNSASFAPH PCTTNSQTRC
SGSDCTADSG LCDADGCNFN SFRMGNTTFF GAGMSVDTTK LFTVVTQFIT SDNTSMGALV
EIHRLYIQNG QVIQNSVVNI PGINPATSIT DDLCAQENAA FGGTSSFAQH GGLAQVGEAL
RSGMVLALSI VNSAADTLWL DSNYPADADP SAPGVARGTC PQDSASIPEA PTPSVVFSNI
KLGDIGTTFG AGSALFSGRS PPGPVPGSAP ASSATATAPP FGSQCGGLGY AGPTGVCPSP
YTCQALNIYY SQCI 16304152 Thermoascus aurantiacus MYQRALLFSF FLAAARAHEA
GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG
NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD
DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD
SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP
GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI
TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FDNTGFFTHG
GLQKISQALA QGMVLVMSLW DDHAANMLWL DSTYPTDADP DTPGVARGTC PTTSGVPADV
ESQNPNSYVI YSNIKVGPIN STFTAN 156712280 Acremonium thermophilum MHKRAATLSA
LVVAAAGFAR GQGVGTQQTE THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN
CYTGNEWNTT ICADAASCAS NCVVDGADYQ GTYGASTSGN ALTLKFVTKG
SYATNIGSRM YLMASPTKYA MFTLLGHEFA FDVDLSKLPC GLNGAVYFVS MDEDGGTSKY
PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN SASWQPSSND QNAGVGGMGS CCAEMDIWEA
NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS ATRFAGDCDP DGCDWNAYRM GVHDFYGNGK
TVDTGKKFSI VTQFKGSGST LTEIKQFYVQ DGRKIENPNA TWPGLEPFNS ITPDFCKAQK
QVFGDPDRFN DMGGFTNMAK ALANPMVLVL SLWDDHYSNM LWLDSTYPTD ADPSAPGKGR
GTCDTSSGVP SDVESKNGDA TVIYSNIKFG PLDSTYTAS 5231154 Volvariella
volvacea MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA
NWRWTHSTSG STNCYTGNTW QATLCPDGKT CAANCALDGA DYTGTYGVTT SGNSLTLQFV
TQSNVGARLG YLMADDTTYQ MFNLLNQEFW FDVDMSNLPC GLNGALYFSA MARTAAWMPM
VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR DIKFINGEAN VQGWQPSPND
TNAGTGNYGA CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN RYGSICDHDG
LGFQNLFGMG RTRVRARVGR VKQFNRSSRV VEPISWTKQT TLHLGNLPWK SADCNVQNGR
VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL RRGMVLVLSI
WDDHAANMLW LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF TNIKFGPFNS
TGTTYTTGSV PTTSTSTGTT GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA SPTTCHVLNP
YYSQCY 116200349 Chaetomium globosum CBS 148-51 MKQYLQYLAA ALPLMSLVSA
QGVGTSTSET HPKITWKKCS SGGSCSTVNA EVVIDANWRW LHNADSKNCY
DGNEWTDACT SSDDCTSKCV LEGAEYGKTY GASTSGDSLS LKFLTKHEYG TNIGSRFYLM
NGASKYQMFT LMNNEFAFDV DLSTVECGLN SALYFVAMEE DGGMASYSTN KAGAKYGTGY
CDAQCARDLK FVGGKANYDG WTPSSNDANA GVGALGGCCA EIDVWESNAH AFAFTPHACE
NNNYHVCEDT TCGGTYSEDR FAGDCDANGC DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ
FQENKLTQFF VQNGKKIEIP GPKHEGLPTE SSDITPELCS AMPEVFGDRD RFAEVGGFDA
LNKALAVPMV LVMSIWDDHY ANMLWLDSSY PPEKAGTPGG DRGPCAQDSG VPSEVESQYP
DATVVWSNIR FGPIGSTVQV 4586343 Irpex lacteus MFPKASLIAL SFIAAVYGQQ
VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG NSWDATLCPD
ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA YQMFQLLNQE
FTFDVDMSNL PCGLNGALYL SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC PHDIKFINGM
ANVAGWAGSA SDPNAGSGTL GTCCSEMDIW EANNDAAAFT PHPCSVDGQT QCSGTQCGDD
DERYSGLCDK DGCDFNSFRM GDKSFLGKGM TVDTSRKFTV VTQFVTTDGT TNGDLHEIRR
LYVQDGKVIQ NSVVSIPGID AVDSITDNFC AQQKSVFGDT NYFATLGGLK KMGAALKSGM
VLAMSVWDDH AASMQWLDSN YPADGDATKP GVARGTCSAD SGLPTNVESQ SASASVTFSN
IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ GTVAQWGQCG GTGFTGPTVC
ASPFTCHVVN PYYSQCY 15321718 Lentinula edodes MFRTAALLSF AYLAVVYGQQ
AGTSTAETHP PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT GNEWNTTVCP
DGTTCAANCA LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF
NPLNQEFTFD VDVSALPCGL NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI
KFIEGKANVE GWTPSSTSPN AGTGGTGICC NEMDIWEANS ISEALTPHPC TAQGGTACTG
DSCSSPNSTA GICDQAGCDF NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL
TAIRRIYVQN GQVIQNSMSN IAGVTPTNEI TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA
FSRGMVLVLS IWDDDAAEML WLDSTYPVGK TGPGAARGTC ATTSGQPDQV ETQSPNAQVV
FSNIKFGAIG STFSSTGTGT GTGTGTGTGT GTTTSSAPAA TQTKYGQCGG QGWTGATVCA
SGSTCTSSGP YYSQCL 146424875 Pleurotus sp Florida MFRTAALTAF
TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV LDSNWRWTHS TAGATNCYTG
NAWDPALCPD PATCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGQYSQ NIGSRVYLLD
DADHYKLFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD GGKAAHAGNN AGAKYGTGYC
DAQCPHDIKW INGEANVLDW SASATDDNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD
EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KVTVVTQFIT
DNNTPTGNLV EIRRVYVQNG VVYQNSFSTF
PSLSQYNSIS DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW
LDSDYPLDKS PSEPGVSRGA CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP
VSSGGSSVPP VTSTTSSGTT TPTGPTGTVP KWGQCGGIGY SGPTACVAGS TCTYSNDWYS
QCL 62006158 Fusarium venenatum MYRAIATASA LIAAVRAQQV CSLTPETKPA
LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID
GAEYASTYGI TSSGNQLSLS FVTKGTYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV
SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ
PSKSDVNGGI GNLGTCCPEM DIWEANSIST AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG
GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG
KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDIDDFEKKG AWGGMSDALE APMVLVMSLW
HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST
YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG GTNYSGPTAC KSPFTCKKIN DFYSQCQ
296027 Phanerochaete chrysosporium MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS
GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP
DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ
EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING
EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT
GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN
GKVIQNSSVK IPGIDLVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS
IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD
LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA
SPYTCHVLNP YYSQCY 154449709 Fusicoccum sp BCC4124 MYQTSLLASL SFLLATSQAQ
QVGTQTAETH PKLTTQKCTT AGGCTDQSTS IVLDANWRWL HTVDGYTNCY
TGQEWDTSIC TDGKTCAEKC ALDGADYEST YGISTSGNAL TMNFVTKSSQ TNIGGRVYLL
AADSDDTYEL FKLKNQEFTF DVDVSNLPCG LNGALYFSEM DSDGGLSKYT TNKAGAKYGT
GYCDTQCPHD IKFINGEANV QNWTASSTDK NAGTGHYGSC CNEMDIWEAN SQATAFTPHV
CEAKVEGQYR CEGTECGDGD NRYGGVCDKD GCDFNSYRMG NETFYGSNGS TIDTTKKFTV
VTQFITADNT ATGALTEIRR KYVQNDVVIE NSYADYETLS KFNSITDDFC AAQKTLSGDT
NDFKTKGGIA RMGESFERGM VLVMSVWDDH AANALWLDSS YPTDADASKP GVKRGPCSTS
SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA 169859460 Coprinopsis cinerea okayama
MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL
HVTDGYTNCY TGNAWNSSVC SDGATCAQRC ALEGANYQQT YGITTSGDAL
TIKFLTRSEQ TNIGARVYLM ENEDRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA
DGGLSSQPNN RAGAKYGTGY CDSQCPRDIK FINGEANSVG WEPSETDPNA GKGQYGICCA
EMDIWEANSI SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD YNPFRMGNKD
FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD
SITQEFCDDA KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH MLWLDSNYPT
DADPNKPGIA RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN 50400675
Trichoderma harzianum (anamorph of Hypocrea lixii) MYRKLAVISA FLAAARAQQV CTQQAETHPP LTWQKCTASG CTPQQGSVVL
DANWRWTHDT KSTTNCYDGN TWSSTLCPDD ATCAKNCCLD
GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF
SFDVDVSQLP CGLNGALYFV SMDADGGQSK YPGNAAGAKY GTGYCDSQCP
RDLKFINGQA NVEGWEPSSN NANTGVGGHG SCCSEMDIWE ANSISEALTP HPCETVGQTM
CSGDSCGGTY SNDRYGGTCD PDGCDWNPYR LGNTSFYGPG SSFALDTTKK LTVVTQFATD
GSISRYYVQN GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA FGGTSFTDKG GLAQINKAFQ
GGMVLVMSLW DDYAVNMLWL DSTYPTNATA STPGAKRGSC STSSGVPAQV EAQSPNSKVI
YSNIRFGPIG STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT GWTGPTRCAS
GYTCQVLNPF YSQCL 729649 Neurospora crassa (OR74A) MRASLLAFSL AAAVAGGQQA
GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL DANWRWTHAT SGSTKCYTGN
KWQATLCPDG KSCAANCALD GADYTGTYGI TGSGWSLTLQ FVTDNVGARA YLMADDTQYQ
MLELLNQELW FDVDMSNIPC GLNGALYLSA MDADGGMRKY PTNKAGAKYA TGYCDAQCPR
DLKYINGIAN VEGWTPSTND ANGIGDHGSC CSEMDIWEAN KVSTAFTPHP CTTIEQHMCE
GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG NTTFYGEGKT VDTSSKFTVV TQFIKDSAGD
LAEIKAFYVQ NGKVIENSQS NVDGVSGNSI TQSFCKSQKT AFGDIDDFNK KGGLKQMGKA
LAQAMVLVMS IWDDHAANML WLDSTYPVPK VPGAYRGSGP TTSGVPAEVD ANAPNSKVAF
SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS TSTASNPSGT GAAHWAQCGG
IGFSGPTTCP EPYTCAKDHD IYSQCV 119472134 Neosartorya fischeri NRRL 181
MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI
DANWRWVHKV GDYTNCYTGN TWDKTLCPDD ATCASNCALE GANYQSTYGA
TTSGDSLRLN FVTTSQQKNI GSRLYMMKDD TTYEMFKLLN QEFTFDVDVS NLPCGLNGAL
YFVAMDADGG MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG
NHGSCCAEMD IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN
SFRQGNKTFY GPGMTVDTKS KFTVVTQFIT DDGTASGTLK EIKRFYVQNG KVIPNSESTW
SGVGGNSITN DYCTAQKSLF KDQNVFAKHG GMEGMGAALA QGMVLVMSLW DDHAANMLWL
DSNYPTTASS STPGVARGTC DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP
GGGTTTTAKP TTTTTTAGSP GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL
117935080 Chaetomium thermophilum MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR
CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA CSSATDCAQK
CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF
DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD LKFVGGKANI
EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE
DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ FFIQDGRKID
IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM VLVMSIWDGH
YASMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI RFGPIGSTYQ V
154300584 Botryotinia fuckeliana B05-10 MTSRIALVSL FAAVYGQQVG TYQTETHPSL
TWQSCTAKGS CTTNTGSIVL DGNWRWTHGV GTSTNCYTGN TWDATLCPDD
ATCAQNCALE GADYSGTYGI TTSGNSLRLN FVTQSANKNI GSRVYLMADT THYKTFNLLN
QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNTAGAE YGTGYCDSQC PRDMKFIKGQ
ANVDGWVPSS NNANTGVGNH GSCCAEMDIW EANSISTAVT PHSCDTVTQT VCTGDDCGGT
YSSSRYAGTC DPDGCDFNSY RMGDETFYGP GKTVDTNSVF TVVTQFLTTD GTASGTLNEI
KRFYVQDGKV IPNSYSTISG VSGNSITTPF CDAQKTAFGD PTSFSDHGGL ASMSAAFEAG
MVLVLSLWDD YYANMLWLDS TYPVGKTSAG GPRGTCDTSS GVPASVEASS PNAYVVYSNI
KVGAINSTYG 15824271 Pseudotrichonympha grassii MFVFVLLWLT QSLGTGTNQA
ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS
LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ
IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH
DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC
DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT
DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS
GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ
HPDSSVTFSD IRFGPIDSTY 4586345 Irpex lacteus MFRKAALLAF SFLAIAHGQQ
VGTNQAENHP SLPSQKCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD
GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQMFQLINQE
FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE
ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQG
DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT TSGNLAEIRR
FYVQDGNVIP NSKVSIAGID AVNSITDDFC TQQKTAFGDT NRFAAQGGLK QMGAALKSGM
VLALSLWDDH AANMLWLDSD YPTTADASNP GVARGTCPTT SGFPRDVESQ SGSATVTYSN
IKWGDLNSTF TGTLTTPSGS SSPSSPASTS GSSTSASSSA SVPTQSGTVA QWAQCGGIGY
SGATTCVSPY TCHVVNAYYS QCY 46241268 Gibberella avenacea MYRAIATASA
LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT SSSTNCYTGN
KWDTSVCTSG ETCAQKCCLD GADYAGTYGI TSSGNQLSLG FVTKGSFSTN IGSRTYLMEN
ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKARYPANKA GAKYGTGYCD
AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST AFTPHPCTKL
TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD TTKKVTVVTQ
FKKGSNGRLS EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF GDIDDFSKKG
GWSGMSDALE SPPMVLVMSL WHDHHSNMLW LDSTYPTDST KLGAQRGSCA TTSGVPSDLE
RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG QYGQCGGQTY
TGPKDCKSPY TCKKINDFYS QCQ 6164684 Aspergillus niger MSSFQIYRAA
LLLSILATAN AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN
CYTGNEWDTS ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY
LMSDDSNYEL FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT
GYCDSQCPRD LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP
CDSVSQTMCD GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT
VVTQFITDDG TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE
NVFDKHGGLE GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD
SGVPATVEAE SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS
TTSSGSSSTS AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL 6164682
Aspergillus niger MHQRALLFSA LLTAVRAQQA GTLTEEVHPS LTWQKCTSEG
SCTEQSGSVV IDSNWRWTHS VNDSTNCYTG NTWDATLCPD DETCAANCAL DGADYESTYG
VTTDGDSLTL KFVTGSNVGS RLYLMDTSDE GYQTFNLLDA EFTFDVDVSN LPCGLNGALY
FTAMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFIDG QANVDGWEPS SNNDNTGIGN
HGSCCPEMDI WEANKISTAL TPHPCDSSEQ TMCEGNDCGG TYSDDRYGGT CDPDGCDFNP
YRMGNDSFYG PGKTIDTGSK MTVVTQFITD GSGSLSEIKR YYVQNGNVIA NADSNISGVT
GNSITTDFCT AQKKAFGDED IFAEHNGLAG ISDAMSSMVL ILSLWDDYYA SMEWLDSDYP
ENATATDPGV ARGTCDSESG VPATVEGAHP DSSVTFSNIK FGPINSTFSA SA 33733371
Chrysosporium lucknowense U.S. Pat. No. 6,573,086-10 MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG
SCTSVQGSIT IDANWRWTHR TDSATNCYEG NKWDTSYCSD GPSCASKCCI
DGADYSSTYG ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL
LGNEFTFDVD VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK
AGAKYGTGYC DSQCPRDLKF INGEANVENW QSSTNDANAG TGKYGSCCSE MDVWEANNMA
AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD FNSYRQGNKT FYGKGMTVDT
TKKITVVTQF LKNSAGELSE IKRFYVQNGK VIPNSESTIP GVEGNSITQD WCDRQKAAFG
DVTDXQDKGG MVQMGKALAG PMVLVMSIWD DHAVNMLWLD STWPIDGAGK PGAERGACPT
TSGVPAEVEA EAPNSNVIFS NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS
GSSGPTGGTG VAKHYEQCGG IGFTGPTQCE SPYTCTKLND WYSQCL 29160311
Thielavia australiensis MYAKFATLAA LVAGASAQAV CSLTAETHPS LTWQKCTAPG
SCTNVAGSIT IDANWRWTHQ TSSATNCYSG SKWDSSICTT GTDCASKCCI DGAEYSSTYG
ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL LGNEFTFDVD VSNLGCGLNG
ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW ESSTNDANAG
SGKYGSCCTE MDVWEANNMA TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY AGVCDPDGCD
FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYAQDGK VIPNSESTIA
GIPGNSITKA YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD DHAVNMLWLD
STYPTDQVGV AGAERGACPT TSGVPSDVEA NAPNSNVIFS NIRFGPIGST VQGLPSSGGT
SSSSSAAPQS TSTKASTTTS AVRTTSTATT KTTSSAPAQG TNTAKHWQQC GGNGWTGPTV
CESPYKCTKQ NDWYSQCL 146197087 uncultured symbiotic MLTLVYFLLS
LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP
protist of SSDTCSQKCY IEGADYSGTY GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL
KDENTYESFK LKNKEFTFTV Reticulitermes DDSKLNCGLN GALYFVAMDA
DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus
GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC
DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD
TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST
YPTDSSDSTA QRGPCPTSSG VPKDVESQHG DATVVFSDIK FGAINSTFKY N 146197237
uncultured symbiotic MLAAALFTFA CSVGVGTKTP ENHPKLNWQN CASKGSCSQV
SGEVTMDSNW RWTHDGNGKN CYDGNTWISS protist of Neotermes LCPDDKTCSD
KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY VFKLNNKEFT
koshunensis FSVDVSKLPC GLNGALYFVE MDADGGKAKY AGAKPGAEYG LGYCDAQCPS
DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSQATALTPH VCKTTGQQRC
SGKSECGGQD GQDRFAGLCD EDGCDFNNWR
MGDKTFFGPG LIVDTKSPFV VVTQFYGSPV TEIRRKYVQN GKVIENSKSN IPGIDATAAI
SDHFCEQQKK AFGDTNDFKN KGGFAKLGQV FDRGMVLVLS LWDDHQVAML WLDSTYPTNK
DKSQPGVDRG PCPTSSGKPD DVESASADAT VVYGNIKFGA LDSTY 146197067
uncultured symbiotic MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS
IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP protist of SSNTCSQKCY IEGADYSGTY
GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV
Reticulitermes DDSKLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY
CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNMK
SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD
TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT
NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVSSG
VPKDVESQYG DATVIYSDIK FGAINSTFKW N 146197407 uncultured symbiotic
MILALLSLAK SLGIATNQAE THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY
DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT YGISTSGNSL
KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus VDDSKLPCGL
NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN
SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDANQ RYNGICDKDG
CDFNSYRLGD KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS
KVNIAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL VLSLWDDHSV
NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK FGPIDSTY
146197157 uncultured symbiotic MLVIALILRG LSVGTGTQQS ETHPSLSWQQ
TSKGGSGQSV SGSVVLDSNW RWTHTTDGTT NCYDGNEWSS protist of DLCPDASTCS
SNCVLEGADY SGTYGITGSG SSLKLGFVTK GSYSTNIGSR VYLLGDESHY KLFKLENNEF
Hodotermopsis TFTVDDSNLE CGLNGALYFV AMDEDGGASK YSGAKPGAKY
GMGYCDAQCP HDMKFINGDA NVEGWKPSDN sjoestedti DENAGTGKWG ACCTEMDIWE
ANKYATAYTP HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR MGNQSFWGPG
LIIDTGKPVT VVTQFLADGG SLSEIRRKYV QGGKVIENTV TKISGMDEFD SITDEFCNQQ
KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT DSGSKAGADR
GPCATSSGVP KDVESNYASA SVTFSDIKFG PIDSTY 146197403 uncultured
symbiotic MLLALFAFGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR
WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT
YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT punctulatus
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGIRRCEG TECGDTDANQ
RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY
VQGGKVIENS KVNIAGMAAG NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDSGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK
FGPIDSTY 146197081 uncultured symbiotic MLASVVYLVS LVVSLEIGTQ
QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL HDSGTTNCYD GNLWSDDLCP protist of
NADTCSSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK
LKNKEFTFTV Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKAKYSSF
KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GDGKLGTCCS
EMDIWEGNAK SQAYTVHACS KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ
SFYGEGKTVD TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC
DATKDATGDT NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV
QAVDRVLCRR VFQRMLKASM VMLQSRTRTL SLELSTRPLV GISPAGRLFF F 146197413
uncultured symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN
GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC
DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT
punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM
KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG
TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS
GTLSEIRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL
SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP
DATVTFSDIK FGPIDSTY 146197309 uncultured symbiotic MLCIGLISFV
YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD
protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST
NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD darwiniensis VSNLPCGLNG ALYHVNMDED
GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE
MDIWEANSIC SAVTPHVCDN LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT
FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC
TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS
DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK 146197227 uncultured
symbiotic MLGALVALAS CIGVGTNTPE KHPDLKWTNG GSSVSGSIVV DSNWRWTHIK
GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGKNCVLE GADYSGTYGV
TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV koshunensis
SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK
PQKNDKNSGN GKYGSCCSEM DIWEANSMAT AYTPHVCDKL EQTRCSGSAC GQNGGGDRFS
SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI
DNSMTNIAAM SKQYNSVSDE FCQAQKKAFG DNDSFTKHGG FRQLGATLSK GHVLVLSLWD
DHDVNMLWLD SVYPTNSNKP GADRGPCKTS SGVPSDVESQ NADSTVKYSD IRFGAIDSTY
SK 146197253 uncultured symbiotic MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ
CACKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS protist of Neotermes
LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL YLLKDKSTYY
VFQLNNKEFT koshunensis FSVDVSKLPC GLNGALYFVE MDADGGKSKY AGAKPGAEYG
LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSMATALTPH
VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR MGDKTFFGPG LTVDTKSPFV
VVTQFYGSPV TEIRRKYVQN GKVIENAKSN IPGIDATNAI SDTFCEQQKK AFGDTNDFKN
KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSVPGVDRG PCPTSSGKPD
DVESASGDAT VVYGNIKFGA LDSTY 146197099 uncultured symbiotic
MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD
GNTWSSDLCP protist of DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS
TNIGSRVYLL KDENTYPMFK LKNKEFTFTV Reticulitermes DVSNLPCGLN
GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA
speratus GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN
GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN
SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT
ANMLWLDSTY PTDSTAIGAS RGPCATSSGD PKDVESASAN ASVKFSDIKF GALDSTY
146197409 uncultured symbiotic MLASLLPLSN SLGTASNQAE THPKLTWTQY
TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC protist of Cryptocercus
PDPTTCSNNC NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF
KLKNKEFTFT punctulatus VDDSKLPCGL NGAVYFVAMD EDGGTAKHSI NKAGAQYGTG
YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRWGARC TEMDIWEANS RATAYTPHIC
TKTGLYRCEG TECGDSDTNR YGGVCDKDGC DFNSYRMGDK SFFGQGKTVD SSKPVTVVTQ
FITDNNQDSG KLTEIRRKYV QGGKVIDNSK VNIAGITAGN PITDTFCDEA KKAFGDNNDF
EKKGGLSALG TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT NASPGALGVE RGDCAITSGV
PADVESQSAD ASVTFSDIKF GPIDSTY 146197315 uncultured symbiotic
MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT KGSTKNCYDG
NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGNALTL
KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD darwiniensis VSNLPCGLSG
ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG
NGKYGSCCSE MDIWEANSIC SAVTPHVCDN LQQTRCQGAA CGENGGGSRF GSSCDPDGCD
FNSWGMGNKT FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD
KFNSVSDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV
YPTNSKKAGS DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK 146197411
uncultured symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN
GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC
DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT
punctulatus VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM
KFINGEANVL DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG
TECGDTDNDQ RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS
GILSETRRKY VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL
SKQLDAGMVL VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP
DATVTFSDIK FGPIDSTY 146197161 uncultured symbiotic MIGIVLIQTV
FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW RWTHIPDGTT NCYDGNEWSS
protist of DLCPDPTTCS NNCVLEGADY SGTYGISTSG SSAKLGFVTK GSYSTNIGSR
VYLLGDESHY KIFDLKNKEF Hodotermopsis TFTVDDSNLE CGLNGALYFV
AMDEDGGASR FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN sjoestedti
DDNAGTGHYG ACCTEMDIWE ANKYATAYTP HICTENGEYR CEGKSCGDSS DDRYGGVCDK
DGCDFNSWRL GNQSFWGPGL IIDTGKPVTV VTQFVTKDGT DSGALSEIRR KYVQGGKTIE
NTVVKISGID EVDSITDEFC NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH
DVNMLWLDSV YPTNPAGKAG ADRGPCATSS GDPKEVEDKY ASASVTFSDI KFGPIDSTY
146197323 uncultured symbiotic MLVFGIVSFV YSIGVGTNTA ETHPKLTWKN
GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes
AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL
NGKEFTFTVD darwiniensis VSQLPCGLNG ALYFVCMDQD GGMSRYPDNQ AGAKYGTGYC
DAQCPTDLKF INGLPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ
VGQTRCEGRA CGENGGGDRF GSICDPDGCD FNSWRMGNKT FWGPGLIIDT KKPVTVVTQF
IGSPVTEIKR EYVQGGKVIE NSYTNIEGMD KFNSISDKFC TAQKKAFGDN DSFTKHGGFS
KLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV YPTNSKKLGS DRGPCPTSSG VPADVESKNA
DSSVKYSDIR FGSIDSTYK 146197077 uncultured symbiotic MLSFVFLLGF
GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT NCYDGNEWSS
protist of DLCPDPETCS KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR
VYLKKDTNTY QIFKLKNHEF Reticulitermes TFTVDVSNLP CGLNGALYFV
EMEADGGKGK YPLAKPGAQY GMGYCDAQCP HDMKFINGNA NVLDWKPQET speratus
DENSGNGRYG TCCTEMDIWE ANSQATAYTP HICTKDGQYQ CEGTECGDSD ANQRYNGVCD
KDGCDFNSYR LGNKTFFGPG LIVDSKKPVT VVTQFITSNG QDSGDLTEIR RIYVQGGKTI
QNSFTNIAGL TSVDSITEAF CDESKDLFGD TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD
HSVNMLWLDS TYPTDAAAGA LGTQRGPCAT SSGAPSDVES QSPDASVTFS DIKFGPLDST Y
146197089 uncultured symbiotic MLTLVVYLLS LVVSLEIGTQ QSESHPALTW
QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP protist of SSDTCTQKCY
IEGADYSGTY GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV
Reticulitermes DDSKLDCGLN GALYFVAMDA DGGKQKYSSF KPGAKYGMGY
CDAQCPHDMK FISGKANVED WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK
SQAYTVHACT KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY ASYRWGDHSF YGEGKTVDTK
QPITVVTQFI GDPLTEIRRL YIQGGKVINN SKTQNLASVY DSITDAFCDA TKAASGDTND
FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP TDSRDATAER GPCATSSGVP
KDVESNQADA SVVFSDIKFG AINSTYSYN 146197091 uncultured symbiotic
MFGFLLSLFA LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD
GNTWSSDLCP protist of DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS
TNIGSRVYLL KDENTYQMFK LKNKEFTFTV Reticulitermes DVSNLPCGLN
GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA
speratus GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN
GICDKDGCDF NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN
SFTNVSGITS VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT
ANMLWLDSTY PSNSTAIGAT RGPCATSSGD PKNVESASAN ASVKFSDIKF GAFDSTY
146197097 uncultured symbiotic MLALVYFLLS LVVSLEIGTQ QSEDHPKLTW
QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP protist of SSDTCTSKCY
IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV
Reticulitermes DDSQLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY
CDAQCPHDMK FISGKANVDD WKPQDNDENS speratus GNGKLGTCCS EMDIWEGNAK
SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD
TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKEASGDT
NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVTSG
VPKDVESQYG SAQVVYSDIK FGAINSTY 146197095 uncultured symbiotic
MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD
GNLWSTDLCG protist of SSDTCSSKCY IEGADYSGTY GISASGSKLT LKFVTKGSYS
TNIGSRVYLL KDENTYETFK LKGKEFTFTV Reticulitermes DDSKLDCGLN
GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS
speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR
FKGTCDKDGC DYASWRWGDQ SFYGEGKTID TKQPVTVVTQ FIGDPLTEIR RVYVQGGKVI
NNSKTSNLAN VYDSITDKFC DDTKDATGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH
TANMLWLDST YPTDSTKTGA SRGPCAVLSG VPKNVESQHG DATVIYSDIK FGAINSTFSY N
146197401 uncultured symbiotic MFLALFVLGK SLGIATNQAE NHPKLTWTRY
QSKGSGQTVN GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus
PDPQTCSSNC DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF
KLKNKEFTFT punctulatus VDDSKLPCGL NGALYFVAME EDGGVAKNSI NKAGAQYGTG
YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC IEMDIWEANS MATAYTPHVC
TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD KSFFGVGKTV DSSKPVTVVT
QFVTSNGQDG GTLSEIKRKY VQGGKVIENS KVNIAGITAV NSITDTFCNE QKKAFGDNND
FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP TDAAAGALGT ERGACATSSG
KPSDVESQSP DASVTFSDIK FGPIDSTY 146197225 uncultured symbiotic
MLLCLLSIAN SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN
LWSDKYCPDA protist of Neotermes ATCGKNCVIE GADYQGTYGV SSSGDGLTLT
FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN GKEFTFTVDV koshunensis SNLPCGLNGA
LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN
GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSSC GHTGGGERFS SSCDPDGCDF
NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM
SKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD
SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K 146197317
uncultured symbiotic MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT
VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD protist of Mastotermes AATCGKNCVL
EGADYSGTYG VTSSGNALTL KFVTHGSYST NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD
darwiniensis VSNLPCGLNG ALYHVNMDED GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF
INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC SAVTPHVCDT LQQTRCQGTA
CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT KSKFTVVTQF VGSPVTEIKR
KYVQNGKVIE NSFSNIEGMD KFNSISDKFC TAQKKAFGDT DSFTKHGGFK QLGSALAKGM
VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG VPADVESKSA NANVIYSDIR
FGAIDSTYK 146197251 uncultured symbiotic MLLCLLGIAS SLDAGTNTAE
NHPQLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA protist of
Neotermes ATCGQNCVIE GADYQGTYGV SASGNALTLT FVTHGQYSTN VGSRLYLLKD
EKTYQIFNLI GKEFTFTVDV koshunensis SNLPCGLNGA LYFVQMDADG GTAKYSDNKA
GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GRYGSCCSEM DVWEANSLAT
AYTPHVCDKL EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF NSWRLGNKTF WGPGLIVDTK
QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI DNSFTKLDSL TKQYNSVSDE FCVAQKKAFG
DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GADRGPCKTS
SGVPADVESQ AASSSVKYSD IRFGAIDSTY K 146197319 uncultured symbiotic
MLGIGFVCIV YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG
NLWSKDLCPD protist of Mastotermes AATCGKNCVL EGADYSGTYG VTSSGDALTL
KFVTHGSYST NVGSRLYLLK DEKTYQIFNL NGKEFTFTVD darwiniensis VSNLPCGLNG
ALYFVNMDAD GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG
NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF GSSCDPDGCD
FNSWRLGNKT FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR KYVQGGKVIE NSYTNIEGLD
KFNSISDKFC TAQKKAFGDN DSFIKHGGFR QLGQSFTKGQ VLVLSLWDDH TVNMLWLDSV
YPTNSKKPGA DRGPCPTSSG VPADVESKNA GSSVKYSDIR FGSIDSTYK 146197071
uncultured symbiotic MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ
TGSIVIDSNW RWLHDSGTTN CYDGNEWSSD protist of LCPDPEKCSQ NCYLEGADYS
GTYGISSSGN SLQLGFVTKG SYSTNIGSRV YLLKDENTYA TFKLKNKEFT
Reticulitermes FTADVSNLPC GLNGALYFVA MPADGGKSKY PLAKPGAKYG
MGYCDAQCPH DMKFINGEAN ILDWKPSSND speratus ENAGAGRYGT CCTEMDIWEA
NSQATAYTVH ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT FFGPNLIVDS
SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ NSFTNISGVA SVDSITDAFC NENKVATGDT
NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA SRGPCAITSG
EPKDVESASA NASVKFSDIK FGAIDSTY 146197075 uncultured symbiotic
MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD
GNLWSTDLCP protist of SSDTCTSKCY IEGADYSGTY GITSSGSKLT LKFVTKGSYS
TNIGSRVYLL KDENTYETFK LKNKEFTFTV Reticulitermes DDSKLDCGLN
GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS
speratus GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR
FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI
NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH
TANMLWLDST YPTDSTKTGA SRGPCAVSSG VPKDVESQHG DATVIYSDIK FGAINSTFKW N
146197159 uncultured symbiotic MLSLVSIFLV GLGFSLGVGT QQSESHPSLS
WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW protist of STDLCPDAST
CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES HYQLFKLKNH
Hodotermopsis EFTFTVDDSQ LPCGLNGALY FVEMAEDGGA KPGAQYGMGY
CDAQCPHDMK FITGEANVKD WKPQETDENA sjoestedti GNGHYGACCT EMDIWEANSQ
ATAYTPHICS KTGIYRCEGT ECGDNDANQR YNGVCDKDGC DFNSYRLGNK TFWGPGLTVD
SNKAMIVVTQ FTTSNNQDSG ELSEIRRIYV QGGKTIQNSD TNVQGITTTN KITQAFCDET
KVTFGDTNDF KAKGGFSGLS KSLESGAVLV LSLWDDHSVN MLWLDSTYPT DSAGKPGADR
GPCAITSGDP KDVESQSPNA SVTFSDIKFG PIDSTY 146197405 uncultured
symbiotic MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN GEIVLDSNWR
WTHHSGTNCY DGNTWSTSLC protist of Cryptocercus PDPTTCSNNC DLDGADYPGT
YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT punctulatus
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ
RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY
VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGFGAL SKQLVAGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK
FGPIDSTY 146197327 uncultured symbiotic MLCVGLFGLV YSIGVGTNTQ
ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS protist of
Mastotermes TLCPDGTTCS KNCVLEGADY SGTYGITSSG DSLTLKFVTH GSYSTNVGSR
LYLLKDDNNY QIFNLAGKEF darwiniensis TFTVDVSNLP CGLNGALYFV EMDQDGGKGK
HKENEAGAKY GTGYCDAQCP TDLKFIDGIA NSDGWKPQDN DENSGNGKYG SCCSEMDIWE
ANSLATAYTP HVCDTKGQKR CQGTACGENG GGDRFGSECD PDGCDFNSWR QGNKSFWGPG
LIIDTKKSVQ VVTQFIGSGS SVTEIRRKYV QNGKVIENSY STISGTEKYN SISDDYCNAQ
KKAFGDTNSF ENHGGFKRFS QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN SNKPGADRGP
CETSSGVPAD VESKSASASV KYSDIRFGPI DSTYK 146197261 uncultured
symbiotic MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK
GETKNCYDGN LWSDKYCPDA protist of Neotermes ATCGKNCVIE GADYQGTYGV
SASGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV koshunensis
SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK
PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSAC GHTGGGERFS
SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI
DNSMSNIAGM TKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD
DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY
K
TABLE-US-00002 TABLE 2
Sequence Identifier (SEQ ID NO:) / Database Accession Number | Species of Origin | Position Corresponding to Position 268 | Position Corresponding to Position 411
BD29555* | Unknown | 273 | 422
340514556 | Trichoderma reesei | 268 | 411
51243029 | Penicillium occitanis | 273 | 422
7cel (PDB) & | Trichoderma reesei | 251 | 394
67516425 | Aspergillus nidulans FGSC A4 | 274 | 424
46107376 | Gibberella zeae PH-1 | 268 | 415
70992391 | Aspergillus fumigatus Af293 | 277 | 427
121699984 | Aspergillus clavatus NRRL 1 | 277 | 427
1906845 | Claviceps purpurea | 269 | 416
1gpi (PDB) & | Phanerochaete chrysosporium | 240 | 391
119468034 | Neosartorya fischeri NRRL 181 | 265 | 414
7804883 | Leptosphaeria maculans | 256 | 401
85108032 | Neurospora crassa N150 | 268 | 412
169859458 | Coprinopsis cinerea okayama | 270 | 421
154292161 | Botryotinia fuckeliana B05-10 | -- | 410
169615761 # | Phaeosphaeria nodorum SN15 | 246 | 393
4883502 | Humicola grisea | 272 | 413
950686 | Humicola grisea | 270 | 416
124491660 | Chaetomium thermophilum | 272 | 413
58045187 | Chaetomium thermophilum | 270 | 416
169601100 # | Phaeosphaeria nodorum SN15 | 237 | 383
169870197 | Coprinopsis cinerea okayama | 269 | 421
3913806 | Agaricus bisporus | 263 | 414
169611094 | Phaeosphaeria nodorum SN15 | 270 | 414
3131 | Phanerochaete chrysosporium | -- | 410
70991503 | Aspergillus fumigatus Af293 | 265 | 414
294196 | Phanerochaete chrysosporium | 258 | 409
18997123 | Thermoascus aurantiacus | 268 | 418
4204214 | Humicola grisea var thermoidea | 272 | 413
34582632 | Trichoderma viride (also known as Hypocrea rufa) | 268 | 411
156712284 | Thermoascus aurantiacus | 268 | 418
39977899 | Magnaporthe grisea (oryzae) 70-15 | 268 | 414
20986705 | Talaromyces emersonii | 266 | 416
22138843 | Aspergillus oryzae | 265 | 414
55775695 | Penicillium chrysogenum | 276 | 426
171676762 | Podospora anserina | 270 | 417
146350520 | Pleurotus sp Florida | 268 | 420
37732123 | Gibberella zeae | 268 | 415
156055188 | Sclerotinia sclerotiorum 1980 | -- | 410
453224 | Phanerochaete chrysosporium | 258 | 409
50402144 | Trichoderma reesei | 268 | 411
115397177 | Aspergillus terreus NIH2624 | 274 | 424
154312003 | Botryotinia fuckeliana B05-10 | 266 | 416
49333365 | Volvariella volvacea | 268 | 420
729650 | Penicillium janthinellum | 274 | 424
146424871 | Pleurotus sp Florida | 267 | 418
67538012 | Aspergillus nidulans FGSC A4 | 265 | 410
62006162 | Fusarium poae | 268 | 415
146424873 | Pleurotus sp Florida | 267 | 418
295937 | Trichoderma viride | 268 | 411
6179889 # | Alternaria alternata | 240 | 386
119483864 | Neosartorya fischeri NRRL 181 | 278 | 428
85083281 | Neurospora crassa OR74A | 270 | 412
3913803 | Cryphonectria parasitica | 269 | 416
60729633 | Corticium rolfsii | 265 | 415
39971383 | Magnaporthe grisea 70-15 | 268 | 410
39973029 | Magnaporthe grisea 70-15 | 269 | 410
1170141 | Fusarium oxysporum | 268 | 415
121710012 | Aspergillus clavatus NRRL 1 | 265 | 414
17902580 | Penicillium funiculosum | 273 | 422
1346226 | Humicola grisea var thermoidea | 270 | 416
156712282 | Chaetomium thermophilum | 270 | 416
169768818 | Aspergillus oryzae RIB40 | 277 | 427
46241270 | Gibberella pulicaris | 268 | 415
49333363 | Volvariella volvacea | 265 | 418
46395332 | Irpex lacteus | 263 | 414
50844407 # | Chaetomium thermophilum var thermophilum | 245 | 391
4586347 | Irpex lacteus | 264 | 415
3980202 | Phanerochaete chrysosporium | 258 | 410
27125837 | Melanocarpus albomyces | 273 | 414
171696102 | Podospora anserina | 265 | 415
3913802 | Cochliobolus carbonum | 270 | 416
50403723 | Trichoderma viride | 268 | 411
3913798 | Aspergillus aculeatus | 275 | 425
66828465 | Dictyostelium discoideum | 269 | 419
156060391 | Sclerotinia sclerotiorum 1980 | 252 | 402
116181754 | Chaetomium globosum CBS 148-51 | 263 | 413
145230535 | Aspergillus niger | 274 | 424
46241266 | Nectria haematococca mpVI | 268 | 415
1q9h (PDB) # | Talaromyces emersonii | 248 | 398
157362170 | Polyporus arcularius | 269 | 420
7804885 | Leptosphaeria maculans | 267 | 407
121852 | Phanerochaete chrysosporium | 258 | 409
126013214 | Penicillium decumbens | 264 | 415
156048578 | Sclerotinia sclerotiorum 1980 | 265 | 413
156712278 | Acremonium thermophilum | 269 | 414
21449327 | Aspergillus nidulans | 265 | 410
171683762 | Podospora anserina | 274 | 415
56718412 | Thermoascus aurantiacus var levisporus | 268 | 418
15824273 | Pseudotrichonympha grassii | 263 | 414
115390801 | Aspergillus terreus NIH2624 | 266 | 411
453223 | Phanerochaete chrysosporium | 258 | 409
3132 | Phanerochaete chrysosporium | -- | 407
16304152 | Thermoascus aurantiacus | 268 | 417
156712280 | Acremonium thermophilum | 273 | 420
5231154 | Volvariella volvacea | 281 | 438
116200349 | Chaetomium globosum CBS 148-51 | 270 | 412
4586343 | Irpex lacteus | 263 | 414
15321718 | Lentinula edodes | -- | 417
146424875 | Pleurotus sp Florida | 267 | 418
62006158 | Fusarium venenatum | 268 | 415
296027 | Phanerochaete chrysosporium | 258 | 409
154449709 | Fusicoccum sp BCC4124 | 272 | 424
169859460 | Coprinopsis cinerea okayama | 269 | 421
50400675 | Trichoderma harzianum | 264 | 407
729649 | Neurospora crassa | 262 | 406
119472134 | Neosartorya fischeri NRRL 181 | 277 | 427
117935080 | Chaetomium thermophilum | 272 | 413
154300584 | Botryotinia fuckeliana B05-10 | 265 | 413
15824271 | Pseudotrichonympha grassii | 263 | 414
4586345 | Irpex lacteus | 263 | 414
46241268 | Gibberella avenacea | 268 | 416
6164684 | Aspergillus niger | 274 | 424
6164682 | Aspergillus niger | 266 | 412
33733371 US6573086-10 | Chrysosporium lucknowense | 269 | 415
29160311 | Thielavia australiensis | 269 | 415
146197087 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197237 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409
146197067 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197407 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197157 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 264 | 410
146197403 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197081 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 410
146197413 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197309 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197227 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197253 | uncultured symbiotic protist of Neotermes koshunensis | 264 | 409
146197099 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401
146197409 | uncultured symbiotic protist of Cryptocercus punctulatus | 260 | 411
146197315 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197411 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197161 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 263 | 413
146197323 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197077 | uncultured symbiotic protist of Reticulitermes speratus | 264 | 415
146197089 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 400
146197091 | uncultured symbiotic protist of Reticulitermes speratus | 258 | 401
146197097 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197095 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197401 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197225 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197317 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197251 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
146197319 | uncultured symbiotic protist of Mastotermes darwiniensis | 259 | 402
146197071 | uncultured symbiotic protist of Reticulitermes speratus | 259 | 402
146197075 | uncultured symbiotic protist of Reticulitermes speratus | 260 | 402
146197159 | uncultured symbiotic protist of Hodotermopsis sjoestedti | 260 | 410
146197405 | uncultured symbiotic protist of Cryptocercus punctulatus | 261 | 412
146197327 | uncultured symbiotic protist of Mastotermes darwiniensis | 264 | 408
146197261 | uncultured symbiotic protist of Neotermes koshunensis | 258 | 404
TABLE-US-00003 TABLE 3 Signal Catalytic Cellulose Database Sequence
(SS) Domain (CD) Linker Start Binding Accession Start and End Start
and End and End Domain (CBD) SEQ ID NO: Number Species of Origin
Position Position Position Start and End BD29555* Unknown 1-25
26-455 456-493 494-529 340514556 Trichoderma reesei 1-17 18-444
445-479 480-514 51243029 Penicillium occitanis 1-25 26-455 456-493
494-529 7cel (PDB) & Trichoderma reesei N/A 1-427 N/A N/A
67516425 Aspergillus nidulans 1-23 24-457 458-490 491-526 FGSC A4
46107376 Gibberella zeae PH-1 1-17 18-448 449-476 477-512 70992391
Aspergillus fumigatus 1-26 27-460 461-496 497-532 Af293 121699984
Aspergillus clavatus 1-27 27-460 461-503 504-539 NRRL 1 1906845
Claviceps purpurea 1-19 20-449 N/A N/A 1gpi (PDB) &
Phanerochaete N/A 1-424 N/A N/A chrysosporium 119468034 Neosartorya
fischeri 1-17 18-447 N/A N/A NRRL 181 7804883 Leptosphaeria 1-17
18-434 N/A N/A maculans 85108032 Neurospora crassa 1-17 18-445
446-485 486-521 N150 169859458 Coprinopsis cinerea 1-18 19-454 N/A
N/A okayama 154292161 Botryotinia fuckeliana 1-18 19-443 444-555
556-596 B05-10 169615761 # Phaeosphaeria 1 2-426 N/A N/A nodorum
SN15 4883502 Humicola grisea 1-22 23-446 N/A N/A 950686 Humicola
grisea 1-18 19-449 450-489 490-525 124491660 Chaetomium 1-22 23-446
N/A N/A thermophilum 58045187 Chaetomium 1-18 19-449 450-494
495-530 thermophilum 169601100 # Phaeosphaeria 1 2-416 N/A N/A
nodorum SN15 169870197 Coprinopsis cinerea 1-18 19-454 N/A N/A
okayama 3913806 Agaricus bisporus 1-18 19-447 448-470 471-506
169611094 Phaeosphaeria 1-18 19-447 N/A N/A nodorum SN15 3131
Phanerochaete 1-19 20-443 N/A N/A chrysosporium 70991503
Aspergillus fumigatus 1-17 18-447 N/A N/A Af293 294196
Phanerochaete 1-18 19-442 443-480 481-516 chrysosporium 18997123
Thermoascus 1-17 18-451 N/A N/A aurantiacus 4204214 Humicola grisea
var 1-22 23-446 N/A N/A thermoidea 34582632 Trichoderma viride 1-18
18-444 445-479 480-514 (also known as Hypochrea rufa) 156712284
Thermoascus 1-17 18-451 N/A N/A aurantiacus 39977899 Magnaporthe
grisea 1-17 18-447 N/A N/A (oryzae) 70-15 20986705 Talaromyces
emersonii 1-18 19-449 N/A N/A 22138843 Aspergillus oryzae 1-17
18-447 N/A N/A 55775695 Penicillium 1-25 26-459 460-494 495-529
chrysogenum 171676762 Podospora anserina 1-18 19-450 451-492
493-528 146350520 Pleurotus sp Florida 1-18 19-453 N/A N/A 37732123
Gibberella zeae 1-17 18-448 449-476 477-512 156055188 Sclerotinia
1-18 19-443 444-546 547-586 sclerotiorum 1980 453224 Phanerochaete
1-18 19-442 443-474 475-510 chrysosporium 50402144 Trichoderma
reesei 1-17 18-444 445-478 479-513 115397177 Aspergillus terreus
1-23 24-457 458-505 506-541 NIH2624 154312003 Botryotinia
fuckeliana 1-17 18-449 450-480 481-516 B05-10 49333365 Volvariella
volvacea 1-18 19-453 N/A N/A 729650 Penicillium 1-25 26-456 457-502
503-537 janthinellum 146424871 Pleurotus sp Florida 1-18 19-451
452-487 488-523 67538012 Aspergillus nidulans 1-17 18-443 N/A N/A
FGSC A4 62006162 Fusarium poae 1-17 18-448 449-475 476-511
146424873 Pleurotus sp Florida 1-18 19-451 452-487 488-523 295937
Trichoderma viride 1-17 18-444 445-478 479-513 6179889 # Alternaria
alternata 1 2-419 N/A N/A 119483864 Neosartorya fischeri 1-26
27-461 462-499 500-535 NRRL 181 85083281 Neurospora crassa 1-20
21-445 N/A N/A OR74A 3913803 Cryphonectria 1-18 19-449 N/A N/A
Parasitica 60729633 Corticium rolfsii 1-18 19-448 449-492 493-528
39971383 Magnaporthe grisea 1-17 18-443 N/A N/A 70-15 39973029
Magnaporthe grisea 1-19 20-443 N/A N/A 70-15 1170141 Fusarium
oxysporum 1-17 18-448 449-478 479-514 121710012 Aspergillus
clavatus 1-17 18-447 N/A N/A NRRL 1 17902580 Penicillium 1-25
26-455 456-493 494-529 funiculosum 1346226 Humicola grisea var 1-18
19-449 450-489 490-525 thermoidea 156712282 Chaetomium 1-18 19-449
450-496 497-532 thermophilum 169768818 Aspergillus oryzae 1-25
26-460 N/A N/A RIB40 46241270 Gibberella pulicaris 1-17 18-448
449-474 475-510 49333363 Volvariella volvacea 1-18 19-451 452-476
477-512 46395332 Irpex lacteus 1-18 19-447 448-485 486-521 50844407
# Chaetomium N/A 1-424 425-469 470-505 thermophilum var
thermophilum 4586347 Irpex lacteus 1-18 19-448 449-490 491-526
3980202 Phanerochaete 1-18 19-443 444-475 476-511 chrysosporium
27125837 Melanocarpus 1-23 23-447 N/A N/A albomyces 171696102
Podospora anserina 1-17 17-448 N/A N/A 3913802 Cochliobolus 1-18
19-449 N/A N/A carbonum 50403723 Trichoderma viride 1-17 18-444
445-479 480-514 3913798 Aspergillus aculeatus 1-22 23-458 459-505
506-540 66828465 Dictyostelium 1-19 20-452 N/A N/A discoideum
156060391 Sclerotinia 1-17 18-435 436-470 471-504 sclerotiorum 1980
116181754 Chaetomium globosum 1-17 18-446 N/A N/A CBS 148-51
145230535 Aspergillus niger 1-21 22-457 458-500 501-536 46241266
Nectria haematococca 1-18 18-448 449-472 473-508 mpVI 1q9h (PDB) #
Talaromyces emersonii N/A 1-431 N/A N/A 157362170 Polyporus
arcularius 1-18 19-453 N/A N/A 7804885 Leptosphaeria 1-20 21-440
N/A N/A maculans 121852 Phanerochaete 1-18 19-442 443-480 481-516
chrysosporium 126013214 Penicillium decumbens 1-17 18-448 N/A N/A
156048578 Sclerotinia 1-16 17-446 N/A N/A sclerotiorum 1980
156712278 Acremonium 1-17 18-447 448-487 488-523 thermophilum
21449327 Aspergillus nidulans 1-17 18-443 N/A N/A 171683762
Podospora anserina 1-22 23-448 N/A N/A 56718412 Thermoascus 1-17
18-451 N/A N/A aurantiacus var levisporus 15824273
Pseudotrichonympha 1-20 21-447 N/A N/A grassii 115390801
Aspergillus terreus 1-17 18-444 N/A N/A NIH2624 453223
Phanerochaete 1-18 19-442 443-474 475-510 chrysosporium 3132
Phanerochaete 1-19 20-436 437-467 468-504 chrysosporium 16304152
Thermoascus 1-17 18-450 N/A N/A aurantiacus 156712280 Acremonium
1-21 22-453 N/A N/A thermophilum 5231154 Volvariella volvacea 1-15
16-472 473-500 501-536 116200349 Chaetomium globosum 1-20 21-445
N/A N/A CBS 148-51 4586343 Irpex lacteus 1-18 19-447 448-481
482-517 15321718 Lentinula edodes 1-18 19-450 451-480 481-516
146424875 Pleurotus sp Florida 1-18 19-451 452-487 488-523 62006158
Fusarium venenatum 1-17 18-448 449-471 472-507 296027 Phanerochaete
1-18 19-442 443-480 481-516 chrysosporium 154449709 Pusicoccum sp
1-19 20-457 N/A N/A BCC4124 169859460 Coprinopsis cinerea 1-18
19-454 N/A N/A okayama 50400675 Trichoderma 1-17 18-440 441-470
471-505 harzianum 729649 Neurospora crassa 1-17 18-439 440-480
481-516 119472134 Neosartorya fischeri 1-26 27-460 461-494 495-530
NRRL 181 117935080 Chaetomium 1-22 23-446 N/A N/A thermophilum
154300584 Botryotinia fuckeliana 1-16 17-446 N/A N/A B05-10
15824271 Pseudotrichonympha 1-20 21-447 N/A N/A grassii 4586345
Irpex lacteus 1-18 19-447 448-487 488-523 46241268 Gibberella
avenacea 1-17 18-449 450-478 478-513 6164684 Aspergillus niger 1-21
22-457 458-500 501-536 6164682 Aspergillus niger 1-17 18-445 N/A
N/A 33733371 Chrysosporium 1-17 18-448 449-490 491-526 lucknowense
US6573086-10 29160311 Thielavia australiensis 1-18 18-448 449-502
503-538 146197087 uncultured symbiotic 1-22 23-435 N/A N/A protist
of Reticulitermes speratus 146197237 uncultured symbiotic 1-20
21-442 N/A N/A protist of Neotermes koshunensis 146197067
uncultured symbiotic 1-22 23-435 N/A N/A protist of Reticulitermes
speratus 146197407 uncultured symbiotic 1-19 20-445 N/A N/A protist
of Cryptocercus punctulatus 146197157 uncultured symbiotic 1-20
21-443 N/A N/A protist of Hodotermopsis sjoestedti 146197403
uncultured symbiotic 1-19 20-445 N/A N/A protist of Cryptocercus
punctulatus 146197081 uncultured symbiotic 1-22 23-443 N/A N/A
protist of Reticulitermes speratus 146197413 uncultured symbiotic
1-19 20-445 N/A N/A protist of Cryptocercus punctulatus 146197309
uncultured symbiotic 1-20 21-435 N/A N/A protist of Mastotermes
darwiniensis 146197227 uncultured symbiotic 1-19 20-437 N/A N/A
protist of Neotermes koshunensis 146197253 uncultured symbiotic
1-21 21-442 N/A N/A protist of Neotermes koshunensis 146197099
uncultured symbiotic 1-22 23-434 N/A N/A protist of Reticulitermes
speratus 146197409 uncultured symbiotic 1-19 20-444 N/A N/A protist
of Cryptocercus punctulatus 146197315 uncultured symbiotic 1-20
21-435 N/A N/A protist of Mastotermes darwiniensis 146197411
uncultured symbiotic 1-19 20-445 N/A N/A protist of Cryptocercus
punctulatus 146197161 uncultured symbiotic 1-20 21-446 N/A N/A
protist of Hodotermopsis sjoestedti 146197323 uncultured symbiotic
1-20 21-435 N/A N/A protist of Mastotermes darwiniensis
146197077 uncultured symbiotic 1-21 22-448 N/A N/A protist of
Reticulitermes speratus 146197089 uncultured symbiotic 1-22 23-433
N/A N/A protist of Reticulitermes speratus 146197091 uncultured
symbiotic 1-22 23-434 N/A N/A protist of Reticulitermes speratus
146197097 uncultured symbiotic 1-22 23-435 N/A N/A protist of
Reticulitermes speratus 146197095 uncultured symbiotic 1-22 23-435
N/A N/A protist of Reticulitermes speratus 146197401 uncultured
symbiotic 1-19 20-445 N/A N/A protist of Cryptocercus punctulatus
146197225 uncultured symbiotic 1-19 20-437 N/A N/A protist of
Neotermes koshunensis 146197317 uncultured symbiotic 1-20 21-435
N/A N/A protist of Mastotermes darwiniensis 146197251 uncultured
symbiotic 1-19 20-437 N/A N/A protist of Neotermes koshunensis
146197319 uncultured symbiotic 1-20 21-435 N/A N/A protist of
Mastotermes darwiniensis 146197071 uncultured symbiotic 1-25 26-435
N/A N/A protist of Reticulitermes speratus 146197075 uncultured
symbiotic 1-22 23-435 N/A N/A protist of Reticulitermes speratus
146197159 uncultured symbiotic 1-23 24-443 N/A N/A protist of
Hodotermopsis sjoestedti 146197405 uncultured symbiotic 1-19 20-445
N/A N/A protist of Cryptocercus punctulatus 146197327 uncultured
symbiotic 1-20 21-441 N/A N/A protist of Mastotermes darwiniensis
146197261 uncultured symbiotic 1-19 20-437 N/A N/A protist of
Neotermes koshunensis
TABLE-US-00004 TABLE 4
Sequence Identifier (SEQ ID NO:) | Database Accession Number | Species of Origin | Amino Acid Sequence of Fragment of Catalytic Domain Including Loop and Catalytic Residue | Amino Acid Position of Fragment in Sequence Identifier | Amino Acid Positions of Active Site Loop in Sequence Identifier | Positions of Catalytic Residues in Sequence Identifier
BD29555* Unknown
NVEGWTPSSNNANTGLGNHGACCAELDIWEANS 210-242 214-226 234, 239
340514556 Trichoderma reesei NVEGWTPSANNANTGIGNHGACCAELDIWEANS
205-237 209-221 229, 234 51243029 Penicillium occitanis
NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS 210-242 214-226 234, 239 7cel
(PDB) & Trichoderma reesei NVEGWEPSSNNANTGIGGHGSCCSEMDIWQANS
188-220 192-204 212, 217 67516425 Aspergillus nidulans
NVEGWESSDTNPNGGVGNHGSCCAEMDIWEANS 211-243 215-227 235, 240 FGSC A4
46107376 Gibberella zeae PH-1 NSDGWQPSDSDVNGGIGNLGTCCPEMDIWEANS
205-237 209-221 229, 234 70992391 Aspergillus fumigatus
NVEGWQPSSNDANAGTGNHGSCCAEMDIWEANS 214-246 218-230 238, 243 Af293
121699984 Aspergillus clavatus NVEGWTPSSSDANAGNGGHGSCCAEMDIWEANS
214-246 218-230 238, 243 NRRL 1 1906845 Claviceps purpurea
NSKDWIPSKSDANAGIGSLGACCREMDIWEANN 206-238 210-222 230, 235 1gpi
(PDB) & Phanerochaete NVGNWTETG-SNTGTGSYGTCCSEMDIWEANN 185-215
189-199 207, 212 chrysosporium 119468034 Neosartorya fischeri
NVEGWKPSSNDKNAGVGGHGSCCPEMDIWEANS 202-234 206-218 226, 231 NRRL 181
7804883 Leptosphaeria NVEGWQPSKNDQNAGVGGHGSCCAEMDIWEANS 193-225
197-209 217, 222 maculans 85108032 Neurospora crassa
NVEGWTPSTNDANAGIGDHGTCCSEMDIWEANK 205-237 209-221 229, 234 N150
(OR74A) 169859458 Coprinopsis cinerea
NSADWTPSETDPNAGRGRYGICCAEMDIWEANS 207-239 211-223 231, 236 okayama
154292161 Botryotinia NVEGWVPDSNSANSGTGNIGSCCSEFDVWEANS 203-235
207-219 227, 232 fuckeliana B05-10 169615761 # Phaeosphaeria
NADGWQASTSDPNAGVGKKGACCAEMDVWEANS 183-215 187-199 207, 212 nodorum
SN15 4883502 Humicola grisea NIEGWRPSTNDPNAGVGPMGACCAEIDVWESNA
208-240 212-224 232, 237 950686 Humicola grisea
NIEGWTGSTNDPNAGAGRYGTCCSEMDIWEANN 207-239 211-223 231, 236
124491660 Chaetomium NIEGWRPSTNDANAGVGPYGACCAEIDVWESNA 209-241
213-225 233, 238 thermophilum 58045187 Chaetomium
NIENWTPSTNDANAGFGRYGSCCSEMDIWEANN 207-239 211-223 231, 236
thermophilum 169601100 # Phaeosphaeria
NVEGWKPSDNDANAGVGGHGSCCAEMDIWEANS 174-206 178-190 198, 203 nodorum
SN15 169870197 Coprinopsis cinerea
NSVGWEPSETDSNAGRGRYGICCAEMDIWEANS 207-239 211-223 231, 236 okayama
3913806 Agaricus bisporus NSEGWEGSPNDVNAGTGNFGACCGEMDIWEANS 203-235
207-219 227, 232 169611094 Phaeosphaeria
NVEGWNPSDADPNAGSGKIGACCPEMDIWEANS 208-240 212-224 232, 237 nodorum
SN15 3131 Phanerochaete NVQGWNATS--ATTGTGSYGSCCTELDIWEANS 204-234
208-218 226, 231 chrysosporium 70991503 Aspergillus fumigatus
NVEGWEPSSSDKNAGVGGHGSCCPEMDIWEANS 202-234 206-218 226, 231 Af293
294196 Phanerochaete NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN 203-233
207-217 225, 230 chrysosporium 18997123 Thermoascus
NVEGWQPSANDPNAGVGNHGSSCAEMDVWEANS 205-237 209-221 229, 234
aurantiacus 4204214 Humicola grisea var
NIEGWRPSTNDPNAGVGPMGACCAEIDVWESNA 208-240 212-224 232, 237
thermoidea 34582632 Trichoderma viride
NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS 205-237 209-221 229, 234 (also
known as Hypocrea rufa) 156712284 Thermoascus
NVEGWQPSANDPNAGVGNHGSCCAEMDVWEANS 205-237 209-221 229, 234
aurantiacus 39977899 Magnaporthe grisea
NVEGWQPSSGDANSGVGNMGSCCAEMDIWEANS 205-237 209-221 229, 234 (oryzae)
70-15 20986705 Talaromyces NVEGWQPSSNNANTGIGDHGSCCAEMDVWEANS
203-235 207-219 227, 232 emersonii 22138843 Aspergillus oryzae
R-KGWEPSDSDKNAGVGGHGSCCPQMDIWEANS 203-234 206-218 226, 231 55775695
Penicillium NVEGWEPSSSDVNGGTGNYGSCCAEMDIWEANS 213-245 217-229 237,
242 chrysogenum 171676762 Podospora anserina
NIEGWNPSTNDVNAGAGRYGTCCSEMDIWEANN 207-239 211-223 231, 236
146350520 Pleurotus sp Florida NVQGWQPSPNDSNAGKGQYGSCCAEMDIWEANS
207-239 211-223 231, 236 37732123 Gibberella zeae
NSDGWQPSDSDVNGGIGNLGTCCPEMDIWEANS 205-237 209-221 229, 234
156055188 Sclerotinia NNEGWVPDSNSANSGTGNIGSCCSEFDVWEANS 203-235
207-219 227, 232 sclerotiorum 1980 453224 Phanerochaete
NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN 203-233 207-217 225, 230
chrysosporium 50402144 Trichoderma reesei
NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS 205-237 209-221 229, 234
115397177 Aspergillus terreus NVEGWEPSANDANAGTGNHGSCCAEMDIWEANS
211-243 215-227 235, 240 NIH2624 154312003 Botryotinia
NSVGWTPSSNDVNAGAGQYGSCCSEMDIWEANK 206-238 210-222 230, 235
fuckeliana B05-10 49333365 Volvariella volvacea
NVQGWQPSPNDTNAGTGNYGACCNEMDVWEANS 207-239 211-223 231, 236 729650
Penicillium NVDGWTPSKNDVNSGIGNHGSCCAEMDIWEANS 211-243 215-227 235,
240 janthinellum 146424871 Pleurotus sp Florida
NILDWSASATDANAGNGRYGACCAEMDIWEANS 206-238 210-222 230, 235 67538012
Aspergillus nidulans NVEGWEPSDSDANAGVGGMGTCCPEMDIWEANS 202-234
206-218 226, 231 FGSC A4 62006162 Fusarium poae
NSDGWEPSKSDVNGGIGNLGTCCPEMDIWEANS 205-237 209-221 229, 234
146424873 Pleurotus sp Florida NILDWSGSATDPNAGNGRYGACCAEMDIWEANS
206-238 210-222 230, 235 295937 Trichoderma viride
NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS 205-237 209-221 229, 234 6179889
# Alternaria alternata NVEGWKPSSNDANAGVGGHGSCCAEMDIWEANS 177-209
181-193 201, 206 119483864 Neosartorya fischeri
NVEGWTPSSNNENTGLGNYGSCCAELDIWESNS 215-247 219-231 239, 244 NRRL 181
85083281 Neurospora crassa NIEGWTPSTNDANAGVGPYGGCCAEIDVWESNA
207-239 211-223 231, 236 OR74A 3913803 Cryphonectria
NVEGWTPSTNDANAGVGGLGSCCSEMDVWEANS 206-238 210-222 230, 235
parasitica 60729633 Corticium rolfsii
NLLDWNATS--ANSGTGSYGSCCPEMDIWEANK 206-236 210-220 228, 233 39971383
Magnaporthe grisea NIEGWQPSSTDSSAGIGAQGACCAEIDIWESNK 205-237
209-221 229, 234 70-15 39973029 Magnaporthe grisea
NIEGWKPSSNDANAGVGPYGACCAEIDVWESNA 206-238 210-222 230, 235 70-15
1170141 Fusarium oxysporum NSEGWKPSDSDVNAGVGNLGTCCPEMDIWEANS
205-237 209-221 229, 234 121710012 Aspergillus clavatus
NVEGWKPSDNDKNAGVGGYGSCCPEMDIWEANS 202-234 206-218 226, 231 NRRL 1
17902580 Penicillium NVEGWTPSTNNSNTGIGNHGSCCAELDIWEANS 210-242
214-226 234, 239 funiculosum 1346226 Humicola grisea var
NIEGWTGSTNDPNAGAGRYGTCCSEMDIWEANN 207-239 211-223 231, 236
thermoidea 156712282 Chaetomium NVGNWTPSTNDANAGFGRYGSCCSEMDVWEANN
207-239 211-223 231, 236 thermophilum 169768818 Aspergillus oryzae
NVEGWVSSTNNANTGTGNHGSCCAELDIWESNS 214-246 218-230 238, 243 RIB40
46241270 Gibberella pulicaris NSDGWQPSKSDVNAGIGNMGTCCPEMDIWEANS
205-237 209-221 229, 234 49333363 Volvariella volvacea
NVAGWNGSPNDTNAGTGNWGACCNEMDIWEANS 205-237 209-221 229, 234 46395332
Irpex lacteus NVAGWTGSSSDPNSGTGNYGTCCSEMDIWEANS 202-234 206-218
226, 231
50844407 # Chaetomium NIENWTPSTNDANAGFGRYGSCCSEMDIWEANN 182-214
186-198 206, 211 thermophilum var thermophilum 4586347 Irpex
lacteus NIVDWTASAGDANSGTGSFGTCCQEMDIWEANS 203-235 207-219 227, 232
3980202 Phanerochaete NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN 203-233
207-217 225, 230 chrysosporium 27125837 Melanocarpus
NIEGWKSSTSDPNAGVGPYGSCCAEIDVWESNA 210-242 214-226 234, 239
albomyces 171696102 Podospora anserina
NVEGWGGAD--GNSGTGKYGICCAEMDIWEANS 206-236 210-220 228, 233 3913802
Cochliobolus NVEGWNPSDADPNGGAGKIGACCPEMDIWEANS 208-240 212-224 232,
237 carbonum 50403723 Trichoderma viride
NVEGWEPSSNNANTGIGGHGSCCSEMDIWEANS 205-237 209-221 229, 234 3913798
Aspergillus aculeatus NIEGWEPSSTDVNAGTGNHGSCCPEMDIWEANS 210-242
214-226 234, 239 66828465 Dictyostelium
NVDGWIPSTNNPNTGYGNLGSCCAEMDLWEANN 206-238 210-222 230, 235
discoideum 156060391 Sclerotinia NSVGWTPSSNDVNTGTGQYGSCCSEMDIWEANK
192-224 196-208 216, 221 sclerotiorum 1980 116181754 Chaetomium
NSEGWGGED--GNSGTGKYGTCCAEMDIWEANL 203-233 207-217 225, 230 globosum
CBS 148-51 145230535 Aspergillus niger
NCDGWEPSSNNVNTGVGDHGSCCAEMDVWEANS 209-241 213-225 233, 238 46241266
Nectria NSDEWKPSDSDKNAGVGKYGTCCPEMDIWEANK 205-237 209-221 229, 234
haematococca mpVI 1q9h (PDB) # Talaromyces
NVEGWQPSSNNANTGIGDHGSCCAEMDVWEANS 185-217 189-201 209, 214
emersonii 157362170 Polyporus arcularius
NVLDWAGSSNDPNAGTGHYGTCCNEMDIWEANS 208-240 212-224 232, 237 7804885
Leptosphaeria NAEGWTKSASDPNSGVGKKGACCAQMDVWEANS 204-236 208-220
228, 233 maculans 121852 Phanerochaete
NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN 203-233 207-217 225, 230
chrysosporium 126013214 Penicillium
NVEGWKPSANDKNAGVGPHGSCCAEMDIWEANS 201-233 205-217 225, 230
decumbens 156048578 Sclerotinia NVDGWVPSSNNPNTGVGNYGSCCAEMDIWEANS
202-234 206-218 226, 231 sclerotiorum 1980 156712278 Acremonium
NIDGWQPSSNDANAGLGNHGSCCSEMDIWEANK 206-238 210-222 230, 235
thermophilum 21449327 Aspergillus nidulans
NVEGWEPSDSDANAGVGGMGTCCPEMDIWEANS 202-234 206-218 226, 231 (also
known as Emericella nidulans) 171683762 Podospora anserina
NIEGWRESSNDENAGVGPYGGCCAEIDVWESNA 211-243 215-227 235, 240 (S mat+)
56718412 Thermoascus NVEGWQPSANDPNAGVGNHGSCCAEMDVWEANS 205-237
209-221 229, 234 aurantiacus var levisporus 15824273
Pseudotrichonympha NVENWKPQTNDENAGNGRYGACCTEMDIWEANK 200-232
204-216 224, 229 grassii 115390801 Aspergillus terreus
NVEGWTPSDNDKNAGVGGHGSCCPELDIWEANS 203-235 207-219 227, 232 NIH2624
453223 Phanerochaete NVGNWTETG--SNTGTGSYGTCCSEMDIWEANN 203-233
207-217 225, 230 chrysosporium 3132 Phanerochaete
NVEGWLGTT--ATTGTGFFGSCCTDIALWEAND 202-232 206-216 224, 229
chrysosporium 16304152 Thermoascus
NVEGWQPSANDPNAGVGNHGSSCAEMDVWEANS 205-237 209-221 229, 234
aurantiacus 156712280 Acremonium NSASWQPSSNDQNAGVGGMGSCCAEMDIWEANS
210-242 214-226 234, 239 thermophilum 5231154 Volvariella volvacea
NVQGWQPSPNDTNAGTGNYGACCNKMDVWEANS 220-252 224-236 244, 249
116200349 Chaetomium NYDGWTPSSNDANAGVGALGGCCAEIDVWESNA 207-239
211-223 231, 236 globosum CBS 148-51 4586343 Irpex lacteus
NVAGWAGSASDPNAGSGTLGTCCSEMDIWEANN 202-234 206-218 226, 231 15321718
Lentinula edodes NVEGWTPSSTSPNAGTGGTGICCNEMDIWEANS 208-240 212-224
232, 237 146424875 Pleurotus sp Florida
NVLDWSASATDDNAGNGRYGACCAEMDIWEANS 206-238 210-222 230, 235 62006158
Fusarium venenatum NSDGWQPSKSDVNGGIGNLGTCCPEMDIWEANS 205-237
209-221 229, 234 296027 Phanerochaete
NVEGWNATS--ANAGTGNYGTCCTEMDIWEANN 203-233 207-217 225, 230
chrysosporium 154449709 Fusicoccum sp
NVQNWTASSTDKNAGTGHYGSCCNEMDIWEANS 209-241 213-225 233, 238 BCC4124
169859460 Coprinopsis cinerea NSVGWEPSETDPNAGKGQYGICCAEMDIWEANS
207-239 211-223 231, 236 okayama 50400675 Trichoderma
NVEGWEPSSNNANTGVGGHGSCCSEMDIWEANS 201-233 205-217 225, 230
harzianum (anamorph of Hypocrea lixii) 729649 Neurospora crassa
NVEGWTPSTNDAN-GIGDHGSCCSEMDIWEANK 200-231 204-215 223, 228 (OR74A)
119472134 Neosartorya fischeri NVEGWQPSSNDANAGTGNHGSCCAEMDIWEANS
214-246 218-230 238, 243 NRRL 181 117935080 Chaetomium
NIEGWRPSTNDANAGVGPYGACCAEIDVWESNA 209-241 213-225 233, 238
thermophilum 154300584 Botryotinia
NVDGWVPSSNNANTGVGNHGSCCAEMDIWEANS 202-234 206-218 226, 231
fuckeliana B05-10 15824271 Pseudotrichonympha
NVENWKPQTNDENAGNGRYGACCTEMDIWEANK 200-232 204-216 224, 229 grassii
4586345 Irpex lacteus NVEGWTGSSTDSNSGTGNYGTCCSEMDIWEANS 202-234
206-218 226, 231 46241268 Gibberella avenacea
NSDGWKPSDSDINAGIGNMGTCCPEMDIWEANS 205-237 209-221 229, 234 6164684
Aspergillus niger NCDGWEPSSNNVNTGVGDHGSCCAEMDVWEANS 209-241 213-225
233, 238 6164682 Aspergillus niger
NVDGWEPSSNNDNTGIGNHGSCCPEMDIWEANK 203-235 207-219 227, 232 33733371
Chrysosporium NVENWQSSTNDANAGTGKYGSCCSEMDVWEANN 206-238 210-222
230, 235 lucknowense U.S. Pat. No. 6,573,086-10 29160311 Thielavia
NVEGWESSTNDANAGSGKYGSCCTEMDVWEANN 206-238 210-222 230, 235
australiensis 146197087 uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNM 197-229 201-213 221, 226 protist
of Reticulitermes speratus 146197237 uncultured symbiotic
NSEGWKPQSGDKNAGNGKYGSCCSEMDVWESNS 200-232 204-216 224, 229 protist
of Neotermes koshunensis 146197067 uncultured symbiotic
NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNM 197-229 201-213 221, 226 protist
of Reticulitermes speratus 146197407 uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS 198-230 202-214 222, 227 protist
of Cryptocercus punctulatus 146197157 uncultured symbiotic
NVEGWKPSDNDENAGTGKWGACCTEMDIWEANK 201-233 205-217 225, 230 protist
of Hodotermopsis sjoestedti 146197403 uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS 198-230 202-214 222, 227 protist
of Cryptocercus punctulatus 146197081 uncultured symbiotic
NVDDWKPQDNDENSGDGKLGTCCSEMDIWEGNA 197-229 201-213 221, 226 protist
of Reticulitermes speratus 146197413 uncultured symbiotic
NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS 198-230 202-214 222, 227 protist
of Cryptocercus punctulatus 146197309 uncultured symbiotic
NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS 196-228 200-212 220, 225 protist
of Mastotermes darwiniensis 146197227 uncultured symbiotic
NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS 195-227 199-211 219, 224 protist
of Neotermes koshunensis
146197253 uncultured symbiotic NSEGWKPQSGDKNAGNGKYGSCCSEMDVWESNS
200-232 204-216 224, 229 protist of Neotermes koshunensis 146197099
uncultured symbiotic NVLDWKPQSNDENAGTGRYGTCCTEMDIWEANS 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197409
uncultured symbiotic NVLDWKPQSNDENSGNGRWGARCTEMDIWEANS 198-230
202-214 222, 227 protist of Cryptocercus punctulatus 146197315
uncultured symbiotic NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS 196-228
200-212 220, 225 protist of Mastotermes darwiniensis 146197411
uncultured symbiotic NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS 198-230
202-214 222, 227 protist of Cryptocercus punctulatus 146197161
uncultured symbiotic NVQDWKPSDNDDNAGTGHYGACCTEMDIWEANK 201-233
205-217 225, 230 protist of Hodotermopsis sjoestedti 146197323
uncultured symbiotic NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS 196-228
200-212 220, 225 protist of Mastotermes darwiniensis 146197077
uncultured symbiotic NVLDWKPQETDENSGNGRYGTCCTEMDIWEANS 201-233
205-217 225, 230 protist of Reticulitermes speratus 146197089
uncultured symbiotic NVEDWKPQDNDENSGNGKLGTCCSEMDIWEGNA 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197091
uncultured symbiotic NVLDWKPQSNDENAGTGRYGTCCTEMDIWEANS 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197097
uncultured symbiotic NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197095
uncultured symbiotic NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197401
uncultured symbiotic NVLDWKPQSNDENSGNGRYGACCIEMDIWEANS 198-230
202-214 222, 227 protist of Cryptocercus punctulatus 146197225
uncultured symbiotic NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS 195-227
199-211 219, 224 protist of Neotermes koshunensis 146197317
uncultured symbiotic NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS 196-228
200-212 220, 225 protist of Mastotermes darwiniensis 146197251
uncultured symbiotic NSDGWKPQKNDKNSGNGRYGSCCSEMDVWEANS 195-227
199-211 219, 224 protist of Neotermes koshunensis 146197319
uncultured symbiotic NSDGWKPQSNDKNSGNGKYGSCCSEMDIWEANS 196-228
200-212 220, 225 protist of Mastotermes darwiniensis 146197071
uncultured symbiotic NILDWKPSSNDENAGAGRYGTCCTEMDIWEANS 200-232
204-216 224, 229 protist of Reticulitermes speratus 146197075
uncultured symbiotic NVDDWKPQDNDENSGNGKLGTCCSEMDIWEGNA 197-229
201-213 221, 226 protist of Reticulitermes speratus 146197159
uncultured symbiotic NVKDWKPQETDENAGNGHYGACCTEMDIWEANS 197-229
201-213 221, 226 protist of Hodotermopsis sjoestedti 146197405
uncultured symbiotic NVLDWKPQSNDENSGNGRYGACCTEMDIWEANS 198-230
202-214 222, 227 protist of Cryptocercus punctulatus 146197327
uncultured symbiotic NSDGWKPQDNDENSGNGKYGSCCSEMDIWEANS 201-233
205-217 225, 230 protist of Mastotermes darwiniensis 146197261
uncultured symbiotic NSDGWKPQKNDKNSGNGKYGSCCSEMDIWEANS 195-227
199-211 219, 224 protist of Neotermes koshunensis
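The position columns of Table 4 can be cross-checked against the fragment sequences with a short script. The following Python sketch is illustrative only: the class and function names are hypothetical, and it assumes that the "-" characters in a fragment are alignment gaps that do not consume a sequence position. It maps each listed position onto the fragment and extracts the active site loop and the residues at the two catalytic positions for two rows copied from the table.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Table4Row:
    """One row of Table 4 (field names are illustrative, not from the source)."""
    accession: str
    fragment: str                         # may contain '-' alignment gaps
    fragment_range: Tuple[int, int]       # inclusive positions of the fragment
    loop_range: Tuple[int, int]           # inclusive positions of the loop
    catalytic_positions: Tuple[int, int]  # positions of the catalytic residues


def residues_by_position(row: Table4Row) -> Dict[int, str]:
    """Map each sequence position in the fragment to its residue.

    Assumption: '-' marks an alignment gap and is skipped when numbering.
    """
    start, end = row.fragment_range
    ungapped = row.fragment.replace("-", "")
    if len(ungapped) != end - start + 1:
        raise ValueError(
            f"{row.accession}: fragment length {len(ungapped)} does not "
            f"match range {start}-{end}"
        )
    return {start + i: aa for i, aa in enumerate(ungapped)}


def loop_sequence(row: Table4Row) -> str:
    """Return the residues spanning the active site loop."""
    by_pos = residues_by_position(row)
    first, last = row.loop_range
    return "".join(by_pos[p] for p in range(first, last + 1))


def catalytic_residues(row: Table4Row) -> Tuple[str, ...]:
    """Return the residues found at the listed catalytic positions."""
    by_pos = residues_by_position(row)
    return tuple(by_pos[p] for p in row.catalytic_positions)


# Two rows copied from Table 4.
bd29555 = Table4Row(
    "BD29555", "NVEGWTPSSNNANTGLGNHGACCAELDIWEANS",
    (210, 242), (214, 226), (234, 239),
)
pdb_7cel = Table4Row(
    "7cel", "NVEGWEPSSNNANTGIGGHGSCCSEMDIWQANS",
    (188, 220), (192, 204), (212, 217),
)

print(loop_sequence(bd29555))        # WTPSSNNANTGLG
print(catalytic_residues(bd29555))   # ('E', 'E')
print(catalytic_residues(pdb_7cel))  # ('E', 'Q')
```

The length check in `residues_by_position` flags any row whose fragment and position range disagree, which is a convenient way to catch transcription errors in a table of this size.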
TABLE-US-00005 TABLE 5
Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/-Cellobiose).sup..+-. | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (-/+BG).sup.
None | 25% | 60%
R273K/R422K | 95% | 84%
R273K/Y274Q/D281K/Y410H/P411G/R422K | 78% | ND
TABLE-US-00006 TABLE 6
Substitution(s) | Tolerance to 250 mg/L Cellobiose: % Activity in 4-MUL Assay (+/-Cellobiose).sup..+-. | Tolerance to Cellobiose Accumulation: % Activity in Bagasse Assay (-/+BG).sup.
None | 23% | 74%
R268K/R411K | 92% | 94%
R268A/R411A | 92% | 95%
R268A/R411K | 97% | 94%
R268K/R411A | 97% | 102%
R268K | ND | 92%
R268A | ND | 86%
R411K | ND | 89%
R411A | ND | 94%
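The substitution labels in Tables 5 and 6 follow the usual convention of wild-type residue, position, and replacement residue, with multiple substitutions joined by "/". As a minimal sketch (the `Substitution` type and `parse_substitutions` function are hypothetical helpers, not part of the disclosure), such labels can be split into their components in Python as follows.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # one-letter code of the original residue
    position: int    # residue position in the reference sequence
    variant: str     # one-letter code of the replacement residue


# One letter, a run of digits, one letter, e.g. "R268K".
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(label: str) -> List[Substitution]:
    """Parse a label such as 'R268K/R411A'; 'None' yields an empty list."""
    if label.strip() == "None":
        return []
    parsed = []
    for token in label.split("/"):
        token = token.strip()  # tolerate spaces left by line wrapping
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        parsed.append(
            Substitution(match.group(1), int(match.group(2)), match.group(3))
        )
    return parsed


print(parse_substitutions("R268K/R411A"))
print(len(parse_substitutions("R273K/Y274Q/D281K/Y410H/P411G/R422K")))  # 6
```

Parsing the labels into structured records makes it straightforward to group the single and double substitutions in Table 6 by position when comparing their assay results.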
TABLE-US-00007 TABLE 7
SEQ ID NO. | Amino Acid Sequence
MSALNSFNMY
KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD ANWRWVHGVN
TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT
YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY PNNKAGAQYG
VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA NSISEALTPH
PCDTPGLSVC TTDACGGTYS SDRYAGTCDP DGCDFNPYRL GVTDFYGSGK TVDTTKPITV
VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD AEISTFGETA
SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA ARGSCPTTSG
DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA SSTSTSSTST
GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL MYRKLAVISA FLATARAQSA
CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD
NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL
GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI
NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC
GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY
YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV
MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF
GPIGSTGNPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG YSGPTVCASG
TTCQVLNPYY SQCL MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS
WSTCKSGGSC TTNSGAITLD ANWRWVHGVN TSTNCYTGNT WNSAICDTDA SCAQDCALDG
ADYSGTYGIT TSGNSLRLNF VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC
GLNGALYFVT MDADGGVSKY PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSANN
ANTGIGNHGA CCAELDIWEA NSISEALTPH PCDTPGLSVC TTDACGGTYS SDRYAGTCDP
DGCDFNPYRL GVTDFYGSGK TVDTTKPFTV VTQFVTNDGT STGSLSEIRR YYVQNGVVIP
QPSSKISGIS GNVINSDYCA AEISTFGGTA SFNKHGGLTN MAAGMEAGMV LVMSLWDDYA
VNMLWLDSTY PTNATGTPGA ARGTCATTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG
GSSTGGSTTT TASRTTTTSA SSTSTSSTST GTGVAGHWGQ CGGQGWTGPT TCVSGTTCTV
VNPYYSQCL ESACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC
YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIDFVTQSA QKNVGARLYL
MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG
YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWQANS ISEALTPHPC
TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV
VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT
QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ
SPNAKVTFSN IKFGPIGSTG NPSG MASSFQLYKA LLFFSSLLSA VQAQKVGTQQ
AEVHPGLTWQ TCTSSGSCTT VNGEVTIDAN WRWLHTVNGY TNCYTGNEWD TSICTSNEVC
AEQCAVDGAN YASTYGITTS GSSLRLNFVT QSQQKNIGSR VYLMDDEDTY TMFYLLNKEF
TFDVDVSELP CGLNGAVYFV SMDADGGKSR YATNEAGAKY GTGYCDSQCP RDLKFINGVA
NVEGWESSDT NPNGGVGNHG SCCAEMDIWE ANSISTAFTP HPCDTPGQTL CTGDSCGGTY
SNDRYGGTCD PDGCDFNSYR QGNKTFYGPG LTVDTNSPVT VVTQFLTDDN TDTGTLSEIK
RFYVQNGVVI PNSESTYPAN PGNSITTEFC ESQKELFGDV DVFSAHGGMA GMGAALEQGM
VLVLSLWDDN YSNMLWLDSN YPTDADPTQP GIARGTCPTD SGVPSEVEAQ YPNAYVVYSN
IKFGPIGSTF GNGGGSGPTT TVTTSTATST TSSATSTATG QAQHWEQCGG NGWTGPTVCA
SPWACTVVNS WYSQCL MYRAIATASA LIAAVRAQQV CSLTQESKPS LNWSKCTSSG
CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAEKCCLD GADYASTYGI
TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA
LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSDSDVNGGI
GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG GTCDADGCDF
NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI
AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW HDHHSNMLWL
DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKSDGTTPTN
PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ CQ
MLASTFSYRM YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI
DANWRWVHKV GDYTNCYTGN TWDTTICPDD ATCASNCALE GANYESTYGV TASGNSLRLN
FVTTSQQKNI GSRLYMMKDD STYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG
MSKYPTNKAG AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD
IWEANSISTA FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY
GPGMTVDTKS KFTVVTQFIT DDGTSSGTLK EIKRFYVQNG KVIPNSESTW TGVSGNSITT
EYCTAQKSLF QDQNVFEKHG GLEGMGAALA QGMVLVMSLW DDHSANMLWL DSNYPTTASS
TTPGVARGTC DISSGVPADV EANHPDAYVV YSNIKVGPIG STFNSGGSNP GGGTTTTTTT
QPTTTTTTAG NPGGTGVAQH YGQCGGIGWT GPTTCASPYT CQKLNDYYSQ CL
MLPSTISYRI YKNALFFAAL FGAVQAQKVG TSKAEVHPSM AWQTCAADGT CTTKNGKVVI
DANWRWVHDV KGYTNCYTGN TWNAELCPDN ESCAENCALE GADYAATYGA TTSGNALSLK
FVTQSQQKNI GSRLYMMKDD NTYETFKLLN QEFTFDVDVS NLPCGLNGAL YFVSMDADGG
LSRYTGNEAG AKYGTGYCDS QCPRDLKFIN GLANVEGWTP SSSDANAGNG GHGSCCAEMD
IWEANSISTA YTPHPCDTPG QAMCNGDSCG GTYSSDRYGG TCDPDGCDFN SYRQGNKSFY
GPGMTVDTKK KMTVVTQFLT NDGTATGTLS EIKRFYVQDG KVIANSESTW PNLGGNSLTN
DFCKAQKTVF GDMDTFSKHG GMEGMGAALA EGMVLVMSLW DDHNSNMLWL DSNSPTTGTS
TTPGVARGSC DISSGDPKDL EANHPDASVV YSNIKVGPIG STFNSGGSNP GGSTTTTKPA
TSTTTTKATT TATTNTTGPT GTGVAQPWAQ CGGIGYSGPT QCAAPYTCTK QNDYYSQCL
MHPSLQTILL SALFTTAHAQ QACSSKPETH PPLSWSRCSR SGCRSVQGAV TVDANWLWTT
VDGSQNCYTG NRWDTSICSS EKTCSESCCI DGADYAGTYG VTTTGDALSL KFVQQGPYSK
NVGSRLYLMK DESRYEMFTL LGNEFTFDVD VSKLGCGLNG ALYFVSMDED GGMKRFPMNK
AGAKFGTGYC DSQCPRDVKF INGMANSKDW IPSKSDANAG IGSLGACCRE MDIWEANNIA
SAFTPHPCKN SAYHSCTGDG CGGTYSKNRY SGDCDPDGCD FNSYRLGNTT FYGPGPKFTI
DTTRKISVVT QFLKGRDGSL REIKRFYVQN GKVIPNSVSR VRGVPGNSIT QGFCNAQKKM
FGAHESFNAK GGMKGMSAAV SKPMVLVMSL WDDHNSNMLW LDSTYPTNSR QRGSKRGSCP
ASSGRPTDVE SSAPDSTVVF SNIKFGPIGS TFSRGK ESACTLQSET HPPLTWQKCS
SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS
TYGVTTSGNS LSIDFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL
NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN
TGIGGHGSCC SEMDIWQANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG
CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS
YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST
YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSG
MHQRALLFSA LAVAANAQQV GTQKPETHPP LTWQKCTAAG SCSQQSGSVV IDANWRWLHS
TKDTTNCYTG NTWNTELCPD NESCAQNCAV DGADYAGTYG VTTSGSELKL SFVTGANVGS
RLYLMQDDET YQHFNLLNNE FTFDVDVSNL PCGLNGALYF VAMDADGGMS KYPSNKAGAK
YGTGYCDSQC PRDLKFINGM ANVEGWKPSS NDKNAGVGGH GSCCPEMDIW EANSISTAVT
PHPCDDVSQT MCSGDACGGT YSATRYAGTC DPDGCDFNPF RMGNESFYGP GKIVDTKSEM
TVVTQFITAD GTDTGALSEI KRLYVQNGKV IANSVSNVAD VSGNSISSDF CTAQKKAFGD
EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST YPTDADPSKP GVARGTCEHG
AGDPEKVESQ HPDASVTFSN IKFGPIGSTY KA MYRSLIFATS LLSLAKGQLV
GNLYCKGSCT AKNGKVVIDA NWRWLHVKGG YTNCYTGNEW NATACPDNKS CATNCAIDGA
DYRRLRHYCE RQLLGTEVHH QGLYSTNIGS RTYLMQDDST YQLFKFTGSQ EFTFDVDLSN
LPCGLNGALY FVSMDADGGL KKYPTNKAGA KYGTGYCDAQ CPRDLKFING EGNVEGWQPS
KNDQNAGVGG HGSCCAEMDI WEANSVSTAV TPHSCSTIEQ SRCDGDGCGG TYSADRYAGV
CDPDGCDFNS YRMGVKDFYG KGKTVDTSKK FTVVTQFIGS GDAMEIKRFY VQNGKTIPQP
DSTIPGVTGN SITTFFCDAQ KKAFGDKYTF KDKGGMANMP STCNGMVLVM SLWDDHYSNM
LWLDSTYPTD KNPDTDAGSG RGECAITSGV PADVESQHPD ASVIYSNIKF GPINTTFG
MLAKFAALAA LVASANAQAV CSLTAETHPS LNWSKCTSSG CTNVAGSITV DANWRWTHIT
SGSTNCYSGN EWDTSLCSTN TDCATKCCVD GAEYSSTYGI QTSGNSLSLQ FVTKGSYSTN
IGSRTYLMNG ADAYQGFELL GNEFTFDVDV SGTGCGLNGA LYFVSMDLDG GKAKYTNNKA
GAKYGTGYCD AQCPRDLKYI NGIANVEGWT PSTNDANAGI GDHGTCCSEM DIWEANKVST
AFTPHPCTTI EQHMCEGDSC GGTYSDDRYG GTCDADGCDF NSYRMGNTTF YGEGKTVDTS
SKFTVVTQFI KDSAGDLAEI KRFYVQNGKV IENSQSNVDG VSGNSITQSF CNAQKTAFGD
IDDFNKKGGL KQMGKALAKP MVLVMSIWDD HAANMLWLDS TYPVEGGPGA YRGECPTTSG
VPAEVEANAP NSKVIFSNIK FGPIGSTFSG GSSGTPPSNP SSSVKPVTST AKPSSTSTAS
NPSGTGAAHW AQCGGIGFSG PTTCQSPYTC QKINDYYSQC V MFKKVALTAL CFLAVAQAQQ
VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL HVTDGYTNCY TGNSWNSTVC
SDPTTCAQRC ALEGANYQQT YGITTNGDAL TIKFLTRSQQ TNVGARVYLM ENENRYQMFN
LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSKQPNN RAGAKYGTGY CDSQCPRDIK
FIDGVANSAD WTPSETDPNA GRGRYGICCA EMDIWEANSI SNAYTPHPCR TQNDGGYQRC
EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT NRKMTVVTQF ITHDNTDTGT
LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ KNLFGDYSSF ARDGGLAHMG
RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA RGTCPTTGGT PRETEQNHPD
AQVIFSNIKF GDIGSTFSGY MYSAAVLATF SFLLGAGAQQ VGTSTAETHP ALTVQKCAAG
GTCTDESDSI VLDANWRWLH STSGSTNCYT GNTWDTTLCP DAATCTTNCA LDGADYEGTY
GITTSGDSLK LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY
FVPMDADGGM SKYPTNKAGA KYGTGYCDAQ CPQDMKFVNG TANVEGWVPD SNSANSGTGN
IGSCCSEFDV WEANSMSQAL TPHVCTVDSQ TACTGDDCAS NTGVCDGDGC DFNPYRMGNT
TFYGSGMTID TSKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPS SDISGVSGNS
ITDDFCAAQK TAFGDTDYFT QNGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT
KDASTPGVSR GSCATDSGVP ATVEAASGSA YVTFSSIKYG PIGSTFNAPA DSSSSVSASS
SPAPIASSSS SASIAPVSSV VAAIVSSSAQ AISSAAPVVS SSAQAISSAA PVVSSVVSSA
APVATSSTKS KCSKVSSTLK TSVAAPATSA TSAAVVATSS AASSTGSVPL YGNCTGGKTC
SEGTCVVQND YYSQCVASS MTWQRCTGTG GSSCTNVNGE IVIDANWRWI HATGGYTNCF
DGNEWNKTAC PSNAACTKNC AIEGSDYRGT YGITTSGNSL TLKFITKGQY STNVGSRTYL
MKDTNNYEMF NLIGNEFTFD VDLSQLPCGL NGALYFVSMP EKGQGTPGAK YGTGKLSQCS
VHISKTLTDA CARDLKFVGG EANADGWQAS TSDPNAGVGK KGACCAEMDV WEANSMSTAL
TPHSCQPEGY AVCEESNCGG TYSLDRYAGT CDANGCDFNP YRVGNKDFYG KGKTVDTSKK
MTVVTQFLGT GSDLTELKRF YVQDGKVISN PEPTIPGMTG NSITQKWCDT QKEVFKEEVY
PFNQWGGMAS MGKGMAQGMV LVMSLWDDHY SNMLWLDSTY PTDRDPESPG AARGECAITS
GAPAEVEANN PDASVMFSNI KFGPIGSTFQ QPA MQIKSYIQYL AAALPLLSSV
AAQQAGTITA ENHPRMTWKR CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC
SSATDCAQRC ALDGANYQST YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF
TLMNNEFAFD VDLSKVECGI NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL
KFIGGKANIE GWRPSTNDPN AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE
TNNCGGTYSD DRFAGYCDAN GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ
FFVQDGRKIE VPPPTWPGLP NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM
VLVMSIWDDH HSNMLWLDSS YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PNAQVVWSNI
RFGPIGSTVN V MRTAKFATLA ALVASAAAQQ ACSLTTERHP SLSWKKCTAG GQCQTVQASI
TLDSNWRWTH QVSGSTNCYT GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS
LKFVTKGQYS TNVGSRTYLM DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA
DGGLSRYPGN KAGAKYGTGY CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS
EMDIWEANNM ATAFTPHPCT IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK
TFYGKGMTVD TTKKITVVTQ FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ
DWCDRQKVAF GDIDDFNRKG GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG
KPGAERGACP TTSGVPAEVE AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP
PTTTTSSAPA TTTTASAGPK AGRWQQCGGI GFTGPTQCEE PYTCTKLNDW YSQCL
MQIKQYLQYL AAALPLVNMA AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW
RWLHDSNYQN CYDGNRWTSA CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE
YGTNIGSRVY LMNGSDKYQM FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS
SNKAGAKYGT GYCDAQCARD LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN
AYAFAFTPHG CLNNNYHVCE TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT
VDTSRKFTVV TRFEENKLTQ
FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM
VLVMSIWDGH YANMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI
RFGPIGSTYQ V MMYKKFAALA ALVAGAAAQQ ACSLTTETHP RLTWKRCTSG GNCSTVNGAV
TIDANWRWTH TVSGSTNCYT GNEWDTSICS DGKSCAQTCC VDGADYSSTY GITTSGDSLN
LKFVTKHQHG TNVGSRVYLM ENDTKYQMFE LLGNEFTFDV DVSNLGCGLN GALYFVSMDA
DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK FINGEANIEN WTPSTNDANA GFGRYGSCCS
EMDIWDANNM ATAFTPHPCT IIGQSRCEGN SCGGTYSSER YAGVCDPDGC DFNAYRQGDK
TFYGKGMTVD TTKKMTVVTQ FHKNSAGVLS EIKRFYVQDG KIIANAESKI PGNPGNSITQ
EWCDAQKVAF GDIDDFNRKG GMAQMSKALE GPMVLVMSVW DDHYANMLWL DSTYPIDKAG
TPGAERGACP TTSGVPAEIE AQVPNSNVIF SNIRFGPIGS TVPGLDGSTP SNPTATVAPP
TSTTTSVRSS TTQISTPTSQ PGGCTTQKWG QCGGIGYTGC TNCVAGTTCT ELNPWYSQCL
MYRNFLYAAS LLSVARSQLV GTQTTETHPG MTWQSCTAKG SCTTCSDNKA CASNCAVDGA
DYKGTYGITA SGNSLQLKFI TKGSYSTNIG SRTYLMASDT AYQMFKFDGN KEFTFDVDLS
GLPCGFNGAL YFVSMDEDGG LKKYSGNKAG AKYGTGYCDA QCPRDLKFIN GEGNVEGWKP
SDNDANAGVG GHGSCCAEMD IWEANSISTA VTPHACSTIE QTRCDGDGCG GTYSADRYAG
VCDPDGCDFN AYRMGVKNFY GKGMTVDTSK KFTVVTQFIG TGDAMEIKRF YVQGGKTIEQ
PASTIPGVEG NSITTKFCDQ QKQVFGDRYT YKEKGGTANM AKALAQGMVL VMSLWDDHYS
NMLWLDSTYP TDKNPDTDLG SGRGSCDVKS GAPADVESKS PDATVIYSNI KFGPLNSTY
MLGKIAIASL SFLAIAKGQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL
HVTDGYTNCY TGNSWNSSVC SDGTTCAQRC ALEGANYQQT YGITTSGNSL TMKFLTRSQG
TNVGGRVYLM ENENRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGMSSQPNN
RAGAKYGTGY CDSQCPRDIK FIDGVANSVG WEPSETDSNA GRGRYGICCA EMDIWEANSI
SNAYTPHPCR TQNDGGYQRC EGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTIDT
NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITEQFCTDQ
KNLFGDYSSF ARDGGLAHMG RSLAKGHVLA LSIWNDHGAH MLWLDSNYPT DADPNKPGIA
RGTCPTTGGT PRETEQNHPD AQVIFSNIKF GDIGSTFSGY MFPRSILLAL SLTAVALGQQ
VGTNMAENHP SLTWQRCTSS GCQNVNGKVT LDANWRWTHR INDFTNCYTG NEWDTSICPD
GVTCAENCAL DGADYAGTYG VTSSGTALTL KFVTESQQKN IGSRLYLMAD DSNYEIFNLL
NKEFTFDVDV SKLPCGLNGA LYFSEMAADG GMSSTNTAGA KYGTGYCDSQ CPRDIKFIDG
EANSEGWEGS PNDVNAGTGN FGACCGEMDI WEANSISSAY TPHPCREPGL QRCEGNTCSV
NDRYATECDP DGCDFNSFRM GDKSFYGPGM TVDTNQPITV VTQFITDNGS DNGNLQEIRR
IYVQNGQVIQ NSNVNIPGID SGNSISAEFC DQAKEAFGDE RSFQDRGGLS GMGSALDRGM
VLVLSIWDDH AVNMLWLDSD YPLDASPSQP GISRGTCSRD SGKPEDVEAN AGGVQVVYSN
IKFGDINSTF NNNGGGGGNP SPTTTRPNSP AQTMWGQCGG QGWTGPTACQ SPSTCHVIND
FYSQCF MYRNLALASL SLFGAARAQQ AGTVTTETHP SLSWKTCTGT GGTSCTTKAG
KITLDANWRW THVTTGYTNC YDGNSWNTTA CPDGATCTKN CAVDGADYSG TYGITTSSNS
LSIKFVTKGS NSANIGSRTY LMESDTKYQM FNLIGQEFTF DVDVSKLPCG LNGALYFVEM
AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE GWNPSDADPN AGSGKIGACC
PEMDIWEANS ISTAYTPHPC KGTGLQECTD DVSCGDGSNR YSGLCDKDGC DFNSYRMGVK
DFYGPGATLD TTKKMTVVTQ FLGSGSTLSE IKRFYVQNGK VFKNSDSAIE GVTGNSITES
FCAAQKTAFG DTNSFKTLGG LNEMGASLAR GHVLVMSLWD DHAVNMLWLD STYPTNSTKL
GAQRGTCAID SGKPEDVEKN HPDATVVFSD IKFGPIGSTF QQPS MVDIQIATFL
LLGVVGVAAQ QVGTYIPENH PLLATQSCTA SGGCTTSSSK IVLDANRRWI HSTLGTTSCL
TANGWDPTLC PDGITCANYC ALDGVSYSST YGITTSGSAL RLQFVTGTNI GSRVFLMADD
THYRTFQLLN QELAFDVDVS KLPCGLNGAL YFVAMDADGG KSKYPGNRAG AKYGTGYCDS
QCPRDVQFIN GQANVQGWNA TSATTGTGSY GSCCTELDIW EANSNAAALT PHTCTNNAQT
RCSGSNCTSN TGFCDADGCD FNSFRLGNTT FLGAGMSVDT TKTFTVVTQF ITSDNTSTGN
LTEIRRFYVQ NGNVIPNSVV NVTGIGAVNS ITDPFCSQQK KAFIETNYFA QHGGLAQLGQ
ALRTGMVLAF SISDDPANHM LWLDSNFPPS ANPAVPGVAR GMCSITSGNP ADVGILNPSP
YVSFLNIKFG SIGTTFRPA MHQRALLFSA LAVAANAQQV GTQTPETHPP LTWQKCTAAG
SCSQQSGSVV IDANWRWLHS TKDTTNCYTG NTWNTELCPD NESCAQNCAL DGADYAGTYG
VTTSGSELKL SFVTGANVGS RLYLMQDDET YQHFNLLNHE FTFDVDVSNL PCGLNGALYF
VAMDADGGMS KYPSNKAGAK YGTGYCDSQC PRDLKFINGM ANVEGWEPSS SDKNAGVGGH
GSCCPEMDIW EANSISTAVT PHPCDDVSQT MCSGDACGGT YSESRYAGTC DPDGCDFNPF
RMGNESFYGP GKIVDTKSKM TVVTQFITAD GTDSGALSEI KRLYVQNGKV IANSVSNVAG
VSGNSITSDF CTAQKKAFGD EDIFAKHGGL SGMGKALSEM VLIMSIWDDH HSSMMWLDST
YPTDADPSKP GVARGTCEHG AGDPENVESQ HPDASVTFSN IKFGPIGSTY EG
MFRTATLLAF TMAAMVFGQQ VGTNTAENHR TLTSQKCTKS GGCSNLNTKI VLDANWRWLH
STSGYTNCYT GNQWDATLCP DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG
SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA
KYGTGYCDSQ CPRDIKFING EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP
HPCTTNAQTR CSGSDCTRDT GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI
TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ
HGGLKQVGEA LRTGMVLALS IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA
QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV
TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP YYSQCY MYQRALLFSF FLAAARAHEA
GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD
DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL
GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD SQCPRDLKFI
NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC
GGTYSSTRYA GTCDPDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL
TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL
AQGMVLVMSL WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV
IYSNIKVGPI NSTFTAN MQIKSYIQYL AAALPLLSSV AAQQAGTITA ENHPRMTWKR
CSGPGNCQTV QGEVVIDANW RWLHNNGQNC YEGNKWTSQC SSATDCAQRC ALDGANYQST
YGASTSGDSL TLKFVTKHEY GTNIGSRFYL MANQNKYQMF TLMNNEFAFD VDLSKVECGI
NSALYFVAME EDGGMASYPS NRAGAKYGTG YCDAQCARDL KFIGGKANIE GWRPSTNDPN
AGVGPMGACC AEIDVWESNA YAYAFTPHAC GSKNRYHICE TNNCGGTYSD DRFAGYCDAN
GCDYNPYRMG NKDFYGKGKT VDTNRKFTVV SRFERNRLSQ FFVQDGRKIE VPPPTWPGLP
NSADITPELC DAQFRVFDDR NRFAETGGFD ALNEALTIPM VLVMSIWDDH HSNMLWLDSS
YPPEKAGLPG GDRGPCPTTS GVPAEVEAQY PDAQVVWSNI RFGPIGSTVN V MYRKLAVISA
FLATARAQSA CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG
NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS
DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD
SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV
GQEICEGDGC GGTYSDNRYG GTCDPDGCDW DPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ
FETSGAINRY YVQNGVTFQQ PNAELGSYSG NGLNDDYCTA EEAEFGGSSF SDKGGLTQFK
KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN
AKVTFSNIKF GPIGSTGDPS GGNPPGGNPP GTTTTRRPAT TTGSSPGPTQ SHYGQCGGIG
YSGPTVCASG TTCQVLNPYY SQCL MYQRALLFSF FLAAARAQQA GTVTAENHPS
LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL
DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV
SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ
PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA
GTCDPDGCDF NPYRQGNHSF YGPGQIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN
GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL
WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQYPNSYV IYSNIKVGPI
NSTFTAN MIRKITTLAA LVGVVRGQAA CSLTAETHPS LTWQKCSSGG SCTNVAGSVT
IDANWRWTHT TSGYTNCYTG NKWDTSICST NADCASKCCV DGANYQQTYG ASTSGNALSL
QYVTQSSGKN VGSRLYLLES ENKYQMFNLL GNEFTFDVDA SKLGCGLNGA VYFVSMDADG
GQSKYSGNKA GAKYGTGYCD SQCPRDLKYI NGAANVEGWQ PSSGDANSGV GNMGSCCAEM
DIWEANSIST AYTPHPCSNN AQHSCKGDDC GGTYSSVRYA GDCDPDGCDF NSYRQGNRTF
YGPGSNFNVD SSKKVTVVTQ FISSGGQLTD IKRFYVQNGK VIPNSQSTIT GVTGNSVTQD
YCDKQKTAFG DQNVFNQRGG LRQMGDALAK GMVLVMSVWD DHHSQMLWLD STYPTTSTAP
GAARGSCSTS SGKPSDVQSQ TPGATVVYSN IKFGPIGSTF KSS MLRRALLLSS
SAILAVKAQQ AGTATAENHP PLTWQECTAP GSCTTQNGAV VLDANWRWVH DVNGYTNCYT
GNTWDPTYCP DDETCAQNCA LDGADYEGTY GVTSSGSSLK LNFVTGSNVG SRLYLLQDDS
TYQIFKLLNR EFSFDVDVSN LPCGLNGALY FVAMDADGGV SKYPNNKAGA KYGTGYCDSQ
CPRDLKFIDG EANVEGWQPS SNNANTGIGD HGSCCAEMDV WEANSISNAV TPHPCDTPGQ
TMCSGDDCGG TYSNDRYAGT CDPDGCDFNP YRMGNTSFYG PGKIIDTTKP FTVVTQFLTD
DGTDTGTLSE IKRFYIQNSN VIPQPNSDIS GVTGNSITTE FCTAQKQAFG DTDDFSQHGG
LAKMGAAMQQ GMVLVMSLWD DYAAQMLWLD SDYPTDADPT TPGIARGTCP TDSGVPSDVE
SQSPNSYVTY SNIKFGPINS TFTAS MHQRALLFSA FWTAVQAQQA GTLTAETHPS
LTWQKCAAGG TCTEQKGSVV LDSNWRWLHS VDGSTNCYTG NTWDATLCPD NESCASNCAL
DGADYEGTYG VTTSGDALTL QFVTGANIGS RLYLMADDDE SYQTFNLLNN EFTFDVDASK
LPCGLNGAVY FVSMDADGGV AKYSTNKAGA KYGTGYCDSQ CPRDLKFING QVRKGWEPSD
SDKNAGVGGH GSCCPQMDIW EANSISTAYT PHPCDDTAQT MCEGDTCGGT YSSERYAGTC
DPDGCDFNAY RMGNESFYGP SKLVDSSSPV TVVTQFITAD GTDSGALSEI KRFYVQGGKV
IANAASNVDG VTGNSITADF CTAQKKAFGD DDIFAQHGGL QGMGNALSSM VLTLSIWDDH
HSSMMWLDSS YPEDADATAP GVARGTCEPH AGDPEKVESQ SGSATVTYSN IKYGPIGSTF
DAPA MASTLSFKIY KNALLLAAFL GAAQAQQVGT STAEVHPSLT WQKCTAGGSC
TSQSGKVVID SNWRWVHNTG GYTNCYTGND WDRTLCPDDV TCATNCALDG ADYKGTYGVT
ASGSSLRLNF VTQASQKNIG SRLYLMADDS KYEMFQLLNQ EFTFDVDVSN LPCGLNGALY
FVAMDEDGGM ARYPTNKAGA KYGTGYCDAQ CPRDLKFING QANVEGWEPS SSDVNGGTGN
YGSCCAEMDI WEANSISTAF TPHPCDDPAQ TRCTGDSCGG TYSSDRYGGT CDPDGCDFNP
YRMGNQSFYG PSKIVDTESP FTVVTQFITN DGTSTGTLSE IKRFYVQNGK VIPQSVSTIS
AVTGNSITDS FCSAQKTAFK DTDVFAKHGG MAGMGAGLAE GMVLVMSLWD DHAANMLWLD
STYPTSASST TPGAARGSCD ISSGEPSDVE ANHSNAYVVY SNIKVGPLGS TFGSTDSGSG
TTTTKVTTTT ATKTTTTTGP STTGAAHYAQ CGGQNWTGPT TCASPYTCQR QGDYYSQCL
MVSAKFAALA ALVASASAQQ VCSLTPESHP PLTWQRCSAG GSCTNVAGSV TLDSNWRWTH
TLQGSTNCYS GNEWDTSICT TGTKCAQNCC VEGAEYAATY GITTSGNQLN LKFVTEGKYS
TNVGSRTYLM ENATKYQGFN LLGNEFTFDV DVSNIGCGLN GALYFVSMDL DGGLAKYSGN
KAGAKYGTGY CDAQCPRDIK FINGEANIEG WNPSTNDVNA GAGRYGTCCS EMDIWEANNM
ATAYTPHSCT ILDQSRCEGE SCGGTYSSDR YGGVCDPDGC DFNSYRMGNK EFYGKGKTVD
TTKKMTVVTQ FLKNAAGELS EIKRFYVQNG VVIPNSVSSI PGVPNQNSIT QDWCDAQKIA
FGDPDDNTAK GGLRQMGLAL DKPMVLVMSI WNDHAAHMLW LDSTYPVDAA GRPGAERGAC
PTTSGVPSEV EAEAPNSNVA FSNIKFGPIG STFNSGSTNP NPISSSTATT PTSTRVSSTS
TAAQTPTSAP GGTVPRWGQC GGQGYTGPTQ CVAPYTCVVS NQWYSQCL MFPYIALVSF
SFLSVVLAQQ VGTLTAETHP QLTVQQCTRG GSCTTQQRSV VLDGNWRWLH STSGSNNCYT
GNTWDTSLCP DAATCSRNCA LDGADYSGTY GITSSGNALT LKFVTHGPYS TNIGSRVYLL
ADDSHYQMFN LKNKEFTFDV DVSQLPCGLN GALYFSQMDA DGGTGRFPNN KAGAKYGTGY
CDSQCPHDIK FINGEANVQG WQPSPNDSNA GKGQYGSCCA EMDIWEANSM ASAYTPHPCT
VTTPTRCQGN DCGDGDNRYG GVCDKDGCDF NSFRMGDKNF LGPGKTVNTN SKFTVVTQFL
TSDNTTSGTL SEIRRLYVQN GRVIQNSKVN IPGMASTLDS ITESFCSTQK TVFGDTNSFA
SKGGLRAMGN AFDKGMVLVL SIWDDHEAKM LWLDSNYPLD KSASAPGVAR GTCATTSGEP
KDVESQSPNA QVIFSNIKYG DIGSTYSN MYRAIATASA LIAAVRAQQV CSLTQESKPS
LNWSKCTSSG CSNVKGSVTI DANWRWTHQV SGSTNCYTGN KWDTSVCTSG KVCAERCCLD
GADYASTYGI TSSGDQLSLS FVTKGPYSTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV
SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ
PSDSDVNGGI GNLGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG
GTCDADGCDF NSYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG
KVIANSESKI AGVPGNSLTA DFCTKQKKVF NDPDDFTKKG AWSGMSDALE APMVLVMSLW
HDHHSNMLWL DSTYPTDSTK LGSQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST
YKSDGTTPTN PTNPSEPSNT ANPNPGTVDQ WGQCGGSNYS GPTACKSGFT CKKINDFYSQ
CQ MYSAAVLATF SFLLGAGAQQ VGTLKTESHP PLTIQKCAAG GTCTDEADSV
VLDANWRWLH STSGSTNCYT GNTWDTTLCP DAATCTANCA FDGADYEGTY GITSSGDSLK
LSFVTGSNVG SRTYLMDSET TYKEFALLGN EFTFTVDVSK LPCGLNGALY FVPMDADGGM
SKYPTNKAGA KYGTGYCDAQ CPQDMKFVSG GANNEGWVPD SNSANSGTGN
IGSCCSEFDV WEANSMSQAL TPHTCTVDGQ TACTGDDCAG NTGVCDADGC DFNPYRMGNT
TFYGSGKTID TTKPFSVVTQ FITDDGTETG TLTEIKRFYV QDDVVYEQPN SDISGVSGNS
ITDDFCTAQK TAFGDTDYFS QKGGMAAMGK KMADGMVLVL SIWDDYNVNM LWLDSDYPTT
KDASTPGVSR GSCATTSGVP ATVEAASGSA YVTFSSIKYG PIGSTFKAPA DSSSPVVASS
SPAAVAAVVS TSSAQAVPSH PAVSSSQAAV STPEAVSSAP EVPASSSAAQ SVAPTSTKPK
CSKVSQSSTL ATSVAAPATT ATSAAVAATS AASSSGSVPL YGNCTGGKTC SEGTCVVQNP
WYSQCVASS MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS GGCKPLSTKV
VLDSNWRWVH STSGYTNCYT GNEWDTSLCP DGKTCAANCA LDGADYSGTY GITSTGTALT
LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY LSAMDADGGM
SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG TCCSEMDIWE
ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF LGKGMTVDTS
KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI TDNFCAQQKT
AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK DPSAPGVARG
TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS SPVTTSPTPP
PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPYYSQCY MYRKLAVISA FLATARAQSA
CTLQSETHPP LTWQKCSSGG TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD
NETCAKNCCL DGAAYASTYG VTTSGNSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL
GNEFSFDVDV SQLPCGLNGA LYFVSMDADG GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI
NGQANVEGWE PSSNNANTGI GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDGC
GGTYSDNRYG GTCDPDGCDW NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY
YVQNGVTFQQ PNAELGSYSG NELNDDYCTA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV
MSLWDDYYAN MLWLDSTYPT NETSSTPGAV RGSCSTSSGV PAQVESQSPN AKVTFSNIKF
GPIGSTGNPS GGNPPGGNRG TTTTRRPATT TGSSPGPTQS HYGQCGGIGY SGPTVCASGT
TCQVLNPYYS QCL MPSTYDIYKK LLLLASFLSA SQAQQVGTSK AEVHPSLTWQ
TCTSGGSCTT VNGKVVVDAN WRWVHNVDGY NNCYTGNTWD TTLCPDDETC ASNCALEGAD
YSGTYGVTTS GNSLRLNFVT QASQKNIGSR LYLMEDDSTY KMFKLLNQEF TFDVDVSNLP
CGLNGAVYFV SMDADGGMAK YPANKAGAKY GTGYCDSQCP RDLKFINGMA NVEGWEPSAN
DANAGTGNHG SCCAEMDIWE ANSISTAYTP HPCDTPGQVM CTGDSCGGTY SSDRYGGTCD
PDGCDFNSYR QGNKTFYGPG MTVDTKSKIT VVTQFLTNDG TASGTLSEIK RFYVQNGKVI
PNSESTWSGV SGNSITTAYC NAQKTLFGDT DVFTKHGGME GMGAALAEGM VLVLSLWDDH
NSNMLWLDSN YPTDKPSTTP GVARGSCDIS SGDPKDVEAN DANAYVVYSN IKVGPIGSTF
SGSTGGGSSS STTATSKTTT TSATKTTTTT TKTTTTTSAS STSTGGAQHW AQCGGIGWTG
PTTCVAPYTC QKQNDYYSQC L MISKVLAFTS LLAAARAQQA GTLTTETHPP LSVSQCTASG
CTTSAQSIVV DANWRWLHST TGSTNCYTGN TWDKTLCPDG ATCAANCALD GADYSGVYGI
TTSGNSIKLN FVTKGANTNV GSRTYLMAAG STTQYQMLKL LNQEFTFDVD VSNLPCGLNG
ALYFAAMDAD GGLSRFPTNK AGAKYGTGYC DAQCPQDIKF INGVANSVGW TPSSNDVNAG
AGQYGSCCSE MDIWEANKIS AAYTPHPCSV DTQTRCTGTD CGIGARYSSL CDADGCDFNS
YRQGNTSFYG AGLTVNTNKV FTVVTQFITN DGTASGTLKE IRRFYVQNGV VIPNSQSTIA
GVPGNSITDS FCAAQKTAFG DTNEFATKGG LATMSKALAK GMVLVMSIWD DHTANMLWLD
APYPATKSPS APGVTRGSCS ATSGNPVDVE ANSPGSSVTF SNIKWGPINS TYTGSGAAPS
VPGTTTVSSA PASTATSGAG GVAKYAQCGG SGYSGATACV SGSTCVALNP YYSQCQ
MFPAATLFAF SLFAAVYGQQ VGTQLAETHP RLTWQKCTRS GGCQTQSNGA IVLDANWRWV
HNVGGYTNCY TGNTWNTSLC PDGATCAKNC ALDGANYQST YGITTSGNAL TLKFVTQSEQ
KNIGSRVYLL ESDTKYQLFN PLNQEFTFDV DVSQLPCGLN GAVYFSAMDA DGGMSKFPNN
AAGAKYGTGY CDSQCPRDIK FINGEANVQG WQPSPNDTNA GTGNYGACCN EMDVWEANSI
STAYTPHPCT QQGLVRCSGT ACGGGSNRYG SICDPDGCDF NSFRMGDKSF YGPGLTVNTQ
QKFTVVTQFL TNNNSSSGTL REIRRLYVQN GRVIQNSKVN IPGMPSTMDS VTTEFCNAQK
TAFNDTFSFQ QKGGMANMSE ALRRGMVLVL SIWDDHAANM LWLDSNYPTD RPASQPGVAR
GTCPTSSGKP SDVENSTANS QVIYSNIKFG DIGSTYSA MKGSISYQIY KGALLLSALL
NSVSAQQVGT LTAETHPALT WSKCTAGXCS QVSGSVVIDA NWPXVHSTSG STNCYTGNTW
DATLCPDDVT CAANCAVDGA RRQHLRVTTS GNSLRINFVT TASQKNIGSR LYLLENDTTY
QKFNLLNQEF TFDVDVSNLP CGLNGALYFV DMDADGGMAK YPTNKAGAKY GTGYCDSQCP
RDLKFINGQA NVDGWTPSKN DVNSGIGNHG SCCAEMDIWE ANSISNAVTP HPCDTPSQTM
CTGQRCGGTY STDRYGGTCD PDGCDFNPYR MGVTNFYGPG ETIDTKSPFT VVTQFLTNDG
TSTGTLSEIK RFYVQGGKVI GNPQSTIVGV SGNSITDSWC NAQKSAFGDT NEFSKHGGMA
GMGAGLADGM VLVMSLWDDH ASDMLWLDST YPTNATSTTP GAKRGTCDIS RRPNTVESTY
PNAYVIYSNI KTGPLNSTFT GGTTSSSSTT TTTSKSTSTS SSSKTTTTVT TTTTSSGSSG
TGARDWAQCG GNGWTGPTTC VSPYTCTKQN DWYSQCL MFRTAALTAF TLAAVVLGQQ
VGTLTAENHP ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS LPVHTNCYTG NAWDASLCPD
PTTCATNCAI DGADYSGTYG ITTSGNALTL RFVTNGPYSK NIGSRVYLLD DADHYKMFDL
KNQEFTFDVD MSGLPCGLNG ALYFSEMPAD GGKAAHTSNK AGAKYGTGYC DAQCPHDIKW
INGEANILDW SASATDANAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE
CGDGDNRYGG VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTSSGNLV
EIRRVYVQDG VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDAM
ANGMVLIMSL WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV
TFSNIKYGPI GSTYGGSTPP VSSGNTSAPP VTSTTSSGPT TPTGPTGTVP KWGQCGGNGY
SGPTTCVAGS TCTYSNDWYS QCL MYQRALLFSA LLSVSRAQQA GTAQEEVHPS
LTWQRCEASG SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD NESCAQNCAV
DGADYEATYG ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNL
PCGLNGALYF TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD
SDANAGVGGM GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC
DPDGCDFNSY RMGNTSFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS
ESNISGVEGN SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM
MWLDSSYPTD ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF
MYRAIATASA LIAAVRAQQV CSLTTETKPA LTWSKCTSSG CSNVQGSVTI DANWRWTHQV
SGSTNCHTGN KWDTSVCTSG KVCAEKCCVD GADYASTYGI TSSGNQLSLS FVTKGSYGTN
IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA
GAKYGTGYCD AQCPRDVKFI NGQANSDGWE PSKSDVNGGI GNLGTCCPEM DIWEANSIST
AYTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD
TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCTTQKKVF
GDIDDFAKKG AWNGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTA LGSQRGSCST
SSGVPADLEK NVPNSKVAFS NIKFGPIGST YNKEGTQPQP TNPTNPNPTN PTNPGTVDQW
GQCGGTNYSG PTACKSPFTC KKINDFYSQC Q MFRTAALTAF TLAAVVLGQQ VGTLAAENHP
ALSIQQCTAS GCTTQQKSVV LDSNWRWTHS TAGATNCYTG NAWDSSLCPN PTTCATNCAI
DGADYSGTYG ITTSGNSLTL RFVTNGQYSE NIGSRVYLLD DADHYKLFNL KNQEFTFDVD
MSGLPCGLNG ALYFSEMAAD GGKAAHTGNN AGAKYGTGYC DAQCPHDIKW INGEANILDW
SGSATDPNAG NGRYGACCAE MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGDNRYGG
VCDKDGCDFN SYRMGDKNFL GRGKTIDTTK KITVVTQFIT DDNTPTGNLV EIRRVYVQDG
VTYQNSFSTF PSLSQYNSIS DDFCVAQKTL FGDNQYYNTH GGTEKMGDSL ANGMVLIMSL
WSDHAAHMLW LDSDYPLDKS PSEPGVSRGA CATTTGDPDD VVANHPNASV TFSNIKYGPI
GSTYGGSTPP VSSGNTSVPP VTSTTSSGPT TPTGPTGTVP KWGQCGGIGY SGPTSCVAGS
TCTYSNEWYS QCL MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG
TCTQQTGSVV IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG
VTTSADSLSI GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA
LYFVSMDADG GVTKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI
GGHGSCCSEM DIWEANSISE ALTPHPCTTV GQEICEGDSC GGTYSGDRYG GTCDPDGCDW
NPYRLGNTSF YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG
NSLDDDYCAA EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT
DETSSTPGAV RGSSSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNPS GGNPPGGNPP
GTTTPRPATS TGSSPGPTQT HYGQCGGIGY IGPTVCASGS TCQVLNPYYS QCL
MTWQSCTAKG SCTNKNGKIV IDANWRWLHK KEGYDNCYTG NEWDATACPD NKACAANCAV
DGADYSGTYG ITAGSNSLKL KFITKGSYST NIGSRTYLMK DDTTYEMFKF TGNQEFTFDV
DVSNLPCGFN GALYFVSMDA DGGLKKYSTN KAGAKYGTGY CDAQCPRDLK FINGEGNVEG
WKPSSNDANA GVGGHGSCCA EMDIWEANSV STAVTPHSCS TIEQSRCDGD GCGGTYSADR
YAGVCDPDGC DFNSYRMGVK DFYGKGKTVD TSKKFTVVTQ FIGTGDAMEI KRFYVQNGKT
IAQPASAVPG VEGNSITTKF CDQQKAVFGD TYTFKDKGGM ANMAKALANG MVLVMSLWDD
HYSNMLWLDS TYPTDKNPDT DLGTGRGECE TSSGVPADVE SQHADATVVY SNIKFGPLNS
TFG MASAISFQVY RSALILSAFL PSITQAQQIG TYTTETHPSM TWETCTSGGS
CATNQGSVVM DANWRWVHQV GSTTNCYTGN TWDTSICDTD ETCATECAVD GADYESTYGV
TTSGSQIRLN FVTQNSNGAN VGSRLYMMAD NTHYQMFKLL NQEFTFDVDV SNLPCGLNGA
LYFVTMDEDG GVSKYPNNKA GAQYGVGYCD SQCPRDLKFI QGQANVEGWT PSSNNENTGL
GNYGSCCAEL DIWESNSISQ ALTPHPCDTA TNTMCTGDAC GGTYSSDRYA GTCDPDGCDF
NPYRMGNTTF YGPGKTIDTN SPFTVVTQFI TDDGTDTGTL SEIRRYYVQN GVTYAQPDSD
ISGITGNAIN ADYCTAENTV FDGPGTFAKH GGFSAMSEAM STGMVLVMSL WDDYYADMLW
LDSTYPTNAS SSTPGAVRGS CSTDSGVPAT IESESPDSYV TYSNIKVGPI GSTFSSGSGS
GSSGSGSSGS ASTSTTSTKT TAATSTSTAV AQHYSQCGGQ DWTGPTTCVS PYTCQVQNAY
YSQCL MKAYFEYLVA ALPLLGLATA QQVGKQTTET HPKLSWKKCT GKANCNTVNA
EVVIDSNWRW LHDSSGKNCY DGNKWTSACS SATDCASKCQ LDGANYGTTY GASTSGDALT
LKFVTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN AALYFVAMEE
DGGMASYSSN KAGAKYGTGY CDAQCARDLK FVGGKANIEG WTPSTNDANA GVGPYGGCCA
EIDVWESNAH SFAFTPHACK TNKYHVCERD NCGGTYSEDR FAGLCDANGC DYNPYRMGNT
DFYGKGKTVD TSKKFTVVSR FEENKLTQFF VQNGQKIEIP GPKWDGIPSD NANITPEFCS
AQFQAFGDRD RFAEVGGFAQ LNSALRMPMV LVMSIWDDHY ANMLWLDSVY PPEKEGQPGA
ARGDCPQSSG VPAEVESQYA NSKVVYSNIR FGPVGSTVNV MFSKFALTGS LLAGAVNAQG
VGTQQTETHP QMTWQSCTSP SSCTTNQGEV VIDSNWRWVH DKDGYVNCYT GNTWNTTLCP
DDKTCAANCV LDGADYSSTY GITTSGNALS LQFVTQSSGK NIGSRTYLME SSTKYHLFDL
IGNEFAFDVD LSKLPCGLNG ALYFVTMDAD GGMAKYSTNT AGAEYGTGYC DSQCPRDLKF
INGQGNVEGW TPSTNDANAG VGGLGSCCSE MDVWEANSMD MAYTPHPCET AAQHSCNADE
CGGTYSSSRY AGDCDPDGCD WNPFRMGNKD FYGSGDTVDT SQKFTVVTQF HGSGSSLTEI
SQYYIQGGTK IQQPNSTWPT LTGYNSITDD FCKAQKVEFN DTDVFSEKGG LAQMGAGMAD
GMVLVMSLWD DHYANMLWLD STYPVDADAS SPGKQRGTCA TTSGVPADVE SSDASATVIY
SNIKFGPIGA TY MFPAAALLSF TLLAVASAQQ IGTNTAEVHP SLTVSQCTTS
GGCTSSTQSI VLDANWRWLH STSGYTNCYT GNQWNSDLCP DPDTCATNCA LDGASYESTY
GISTDGNAVT LNFVTQGSQT NVGSRVYLLS DDTHYQTFSL LNKEFSFDVD ASNIGCGING
AVYFVQMDAD GGLSKYSSNK AGAQYGTGYC DSQCPQDIKF INGEANLLDW NATSANSGTG
SYGSCCPEMD IWEANKYAAA YTPHPCSVSG QTRCTGTSCG AGSERYDGYC DKDGCDFNSW
RMGNETFLGP GMTIDTNKKF TIVTQFITDD NTANGTLSEI RRLYVQGGTV IQNSVANQPN
IPKVNSITDS FCTAQKTEFG DQDYFGTIGG LSQMGKAMSD MVLVMSIWDD YDAEMLWLDS
NYPTSGSAST PGISRGPCSA TSGLPATVES QQASASVTYS NIKWGDIGST YSGSGSSGSS
SSSSSSAASA STSTHTSAAA TATSSAAAAT GSPVPAYGQC GGQSYTGSTT CASPYVCKVS
NAYYSQCLPA MKRALCASLS LLAAAVAQQV GTNEPEVHPK MTWKKCSSGG SCSTVNGEVV
IDGNWRWIHN IGGYENCYSG NKWTSVCSTN ADCATKCAME GAKYQETYGV STSGDALTLK
FVQQNSSGKN VGSRMYLMNG ANKYQMFTLK NNEFAFDVDL SSVECGMNSA LYFVPMKEDG
GMSTEPNNKA GAKYGTGYCD AQCARDLKFI GGKGNIEGWQ PSSTDSSAGI GAQGACCAEI
DIWESNKNAF AFTPHPCENN EYHVCTEPNC GGTYADDRYG GGCDANGCDY NPYRMGNPDF
YGPGKTIDTN RKFTVISRFE NNRNYQILMQ DGVAHRIPGP KFDGLEGETG ELNEQFCTDQ
FTVFDERNRF NEVGGWSKLN AAYEIPMVLV MSIWSDHFAN MLWLDSTYPP EKAGQPGSAR
GPCPADGGDP NGVVNQYPNA KVIWSNVRFG PIGSTYQVD MQLTKAGVFL GALMGGAAAQ
QVGTQTAENH PKMTWKKCTG KASCTTVNGE VVIDANWRWL HDASSKNCYD GNRWTDSCRT
ASDCAAKCSL EGADYAKTYG ASTSGDALSL KFVTRHDYGT NIGSRFYLMN GASKYQMFSL
LGNEFAFDVD LSTIECGLNS ALYFVAMEED GGMKSYSSNK AGAKYGTGYC DAQCARDLKF
VGGKANIEGW KPSSNDANAG VGPYGACCAE IDVWESNAHA FAFTPHPCTD NKYHVCQDSN
CGGTYSDDRF AGKCDANGCD INPYRLGNTD FYGKGKTVDT SKKFTVVTRF ERDALTQFFV
QNNKRIDMPS PALEGLPATG AITAEYCTNV FNVFGDRNRF DEVGGWSQLQ QALSLPMVLV
MSIWDDHYSN MLWLDSVYPP DKEGSPGAAR GDCPQDSGVP SEVESQIPGA TVVWSNIRFG
PVGSTVNV MYRIVATASA LIAAARAQQV CSLNTETKPA LTWSKCTSSG CSDVKGSVVI
DANWRWTHQT SGSTNCYTGN KWDTSICTDG
KTCAEKCCLD GADYSGTYGI TSSGNQLSLG FVTNGPYSKN IGSRTYLMEN ENTYQMFQLL
GNEFTFDVDV SGIGCGLNGA PHFVSMDEDG GKAKYSGNKA GAKYGTGYCD AQCPRDVKFI
NGVANSEGWK PSDSDVNAGV GNLGTCCPEM DIWEANSIST AFTPHPCTKL TQHSCTGDSC
GGTYSSDRYG GTCDADGCDF NAYRQGNKTF YGPGSNFNID TTKKMTVVTQ FHKGSNGRLS
EITRLYVQNG KVIANSESKI AGNPGSSLTS DFCSKQKSVF GDIDDFSKKG GWNGMSDALS
APMVLVMSLW HDHHSNMLWL DSTYPTDSTK VGSQRGSCAT TSGKPSDLER DVPNSKVSFS
NIKFGPIGST YKSDGTTPNP PASSSTTGSS TPTNPPAGSV DQWGQCGGQN YSGPTTCKSP
FTCKKINDFY SQCQ MYQRALLFSA LATAVSAQQV GTQKAEVHPA LTWQKCTAAG
SCTDQKGSVV IDANWRWLHS TEDTTNCYTG NEWNAELCPD NEACAKNCAL DGADYSGTYG
VTADGSSLKL NFVTSANVGS RLYLMEDDET YQMFNLLNNE FTFDVDVSNL PCGLNGALYF
VSMDADGGLS KYPGNKAGAK YGTGYCDSQC PRDLKFINGE ANVEGWKPSD NDKNAGVGGY
GSCCPEMDIW EANSISTAYT PHPCDGMEQT RCDGNDCGGT YSSTRYAGTC DPDGCDFNSF
RMGNESFYGP GGLVDTKSPI TVVTQFVTAG GTDSGALKEI RRVYVQGGKV IGNSASNVAG
VEGDSITSDF CTAQKKAFGD EDIFSKHGGL EGMGKALNKM ALIVSIWDDH ASSMMWLDST
YPVDADASTP GVARGTCEHG LGDPETVESQ HPDASVTFSN IKFGPIGSTY KSV
MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD
ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF
VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSNLPC GLNGALYFVT MDADGGVSKY
PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSTNN SNTGIGNHGS CCAELDIWEA
NSISEALTPH PCDTPGLTVC TADDCGGTYS SNRYAGTCDP DGCDFNPYRL GVTDFYGSGK
TVDTTKPFTV VTQFVTDDGT SSGSLSEIRR YYVQNGVVIP QPSSKISGIS GNVINSDFCA
AELSAFGETA SFTNHGGLKN MGSALEAGMV LVMSLWDDYS VNMLWLDSTY PANETGTPGA
ARGSCPTTSG NPKTVESQSG SSYVVFSDIK VGPFNSTFSG GTSTGGSTTT TASGTTSTKA
STTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL MRTAKFATLA
ALVASAAAQQ ACSLTTERHP SLSWNKCTAG GQCQTVQASI TLDSNWRWTH QVSGSTNCYT
GNKWDTSICT DAKSCAQNCC VDGADYTSTY GITTNGDSLS LKFVTKGQHS TNVGSRTYLM
DGEDKYQTFE LLGNEFTFDV DVSNIGCGLN GALYFVSMDA DGGLSRYPGN KAGAKYGTGY
CDAQCPRDIK FINGEANIEG WTGSTNDPNA GAGRYGTCCS EMDIWEANNM ATAFTPHPCT
IIGQSRCEGD SCGGTYSNER YAGVCDPDGC DFNSYRQGNK TFYGKGMTVD TTKKITVVTQ
FLKDANGDLG EIKRFYVQDG KIIPNSESTI PGVEGNSITQ DWCDRQKVAF GDIDDFNRKG
GMKQMGKALA GPMVLVMSIW DDHASNMLWL DSTFPVDAAG KPGAERGACP TTSGVPAEVE
AEAPNSNVVF SNIRFGPIGS TVAGLPGAGN GGNNGGNPPP PTTTTSSAPA TTTTASAGPK
AGRWQQCGGI GFTGPTQCEE PYICTKLNDW YSQCL MMYKKFAALA ALVAGASAQQ
ACSLTAENHP SLTWKRCTSG GSCSTVNGAV TIDANWRWTH TVSGSTNCYT GNQWDTSLCT
DGKSCAQTCC VDGADYSSTY GITTSGDSLN LKFVTKHQYG TNVGSRVYLM ENDTKYQMFE
LLGNEFTFDV DVSNLGCGLN GALYFVSMDA DGGMSKYSGN KAGAKYGTGY CDAQCPRDLK
FINGEANVGN WTPSTNDANA GFGRYGSCCS EMDVWEANNM ATAFTPHPCT TVGQSRCEAD
TCGGTYSSDR YAGVCDPDGC DFNAYRQGDK TFYGKGMTVD TNKKMTVVTQ FHKNSAGVLS
EIKRFYVQDG KIIANAESKI PGNPGNSITQ EYCDAQKVAF SNTDDFNRKG GMAQMSKALA
GPMVLVMSVW DDHYANMLWL DSTYPIDQAG APGAERGACP TTSGVPAEIE AQVPNSNVIF
SNIRFGPIGS TVPGLDGSNP GNPTTTVVPP ASTSTSRPTS STSSPVSTPT GQPGGCTTQK
WGQCGGIGYT GCTNCVAGTT CTQLNPWYSQ CL MASLSLSKIC RNALILSSVL
STAQGQQVGT YQTETHPSMT WQTCGNGGSC STNQGSVVLD ANWRWVHQTG SSSNCYTGNK
WDTSYCSTND ACAQKCALDG ADYSNTYGIT TSGSEVRLNF VTSNSNGKNV GSRVYMMADD
THYEVYKLLN QEFTFDVDVS KLPCGLNGAL YFVVMDADGG VSKYPNNKAG AKYGTGYCDS
QCPRDLKFIQ GQANVEGWVS STNNANTGTG NHGSCCAELD IWESNSISQA LTPHPCDTPT
NTLCTGDACG GTYSSDRYSG TCDPDGCDFN PYRVGNTTFY GPGKTIDTNK PITVVTQFIT
DDGTSSGTLS EIKRFYVQDG VTYPQPSADV SGLSGNTINS EYCTAENTLF EGSGSFAKHG
GLAGMGEAMS TGMVLVMSLW DDYYANMLWL DSNYPTNEST SKPGVARGTC STSSGVPSEV
EASNPSAYVA YSNIKVGPIG STFKS MYRAIATASA LIAAVRAQQV CSLTPETKPA
LSWSKCTSSG CSNVQGSVTI DANWRWTHQL SGSTNCYTGN KWDTSICTSG KVCAEKCCID
GAEYASTYGI TSSGNQLSLS FVTKGAYGTN IGSRTYLMED ENTYQMFQLL GNEFTFDVDV
SNIGCGLNGA LYFVSMDADG GKAKYPGNKA GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ
PSKSDVNAGI GNMGTCCPEM DIWEANSIST AYTPHPCTKL TQHSCTGDSC GGTYSNDRYG
GTCDADGCDF NAYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG
KVIANSESKI AGVPGSSLTP EFCTAQKKVF GDTDDFAKKG AWSGMSDALE APMVLVMSLW
HDHHSNMLWL DSTYPTDSTK LGAQRGSCST SSGVPADLEK NVPNSKVAFS NIKFGPIGST
YKEGVPEPTN PTNPTNPTNP TNPGTVDQWA QCGGTNYSGP TACKSPFTCK KINDFYSQCQ
MFPKSSLLVL SFLATAYAQQ VGTQTAEVHP SLNWARCTSS GCTNVAGSVT LDANWRWLHT
TSGYTNCYTG NSWNTTLCPD GATCAQNCAL DGANYQSTCG ITTSGNALTL KFVTQGEQKN
IGSRVYLMAS ESRYEMFGLL NKEFTFDVDV SNLPCGLNGA LYFSSMDADG GMAKNPGNKA
GAKYGTGYCD SQCPRDIKFI NGEANVAGWN GSPNDTNAGT GNWGACCNEM DIWEANSISA
AYTPHPCTVQ GLSRCSGTAC GTNDRYGTVC DPDGCDFNSY RMGDKTYYGP GGTGVDTRSK
FTVVTQFLTN NNSSSGTLSE IRRLYVQNGR VVQNSKVNIP GMSNTLDSIT TGFCDSQKTA
FGDTRSFQNK GGMSAMGQAL GAGMVLVLSV WDDHAANMLW LDSNYPVDAD PSKPGIARGT
CSTTSGKPTD VEQSAANSSV TFSNIKFGDI GTTYTGGSVT TTPGNPGTTT STAPGAVQTK
WGQCGGQGWT GPTRCESGST CTVVNQWYSQ CI MFRKAALLAF SFLAIAHGQQ
VGTNQAENHP SLPSQHCTAS GCTTSSTSVV LDANWRWVHT TTGYTNCYTG QTWDASICPD
GVTCAKACAL DGADYSGTYG ITTSGNALTL QFVKGTNVGS RVYLLQDASN YQLFKLINQE
FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS RFPTNTAGAK YGTGYCDSQC PRDIKFINGE
ANVAGWTGSS SDPNSGTGNY GTCCSEMDIW EANSVAAAYT PHPCSVNQQT RCTGADCGQD
ANRYKGVCDP DGCDFNSFRM GDQTFLGKGL TVDTSRKFTI VTQFISDDGT SSGNLAEIRR
FYVQDGKVIP NSKVNIAGCD AVNSITDKFC TQQKTAFGDT NRFADQGGLK QMGAALKSGM
VLALSLWDDH AANMLWLDSD YPTTADASKP GVARGTCPNT SGVPKDVESQ SGSATVTYSN
IKWGDLNSTF SGTASNPTGP SSSPSGPSSS SSSTAGSQPT QPSSGSVAQW GQCGGIGYSG
ATGCVSPYTC HVVNPYYSQC Y TETHPRLTWK RCTSGGNCST VNGAVTIDAN WRWTHTVSGS
TNCYTGNEWD TSICSDGKSC AQTCCVDGAD YSSTYGITTS GDSLNLKFVT KHQHGTNVGS
RVYLMENDTK YQMFELLGNE FTFDVDVSNL GCGLNGALYF VSMDADGGMS KYSGNKAGAK
YGTGYCDAQC PRDLKFINGE ANIENWTPST NDANAGFGRY GSCCSEMDIW EANNMATAFT
PHPCTIIGQS RCEGNSCGGT YSSERYAGVC DPDGCDFNAY RQGDKTFYGK GMTVDTTKKM
TVVTQFHKNS AGVLSEIKRF YVQDGKIIAN AESKIPGNPG NSITQEWCDA QKVAFGDIDD
FNRKGGMAQM SKALEGPMVL VMSVWDDHYA NMLWLDSTYP IDKAGTPGAE RGACPTTSGV
PAEIEAQVPN SNVIFSNIRF GPIGSTVPGL DGSTPSNPTA TVAPPTSTTT SVRSSTTQIS
TPTSQPGGCT TQKWGQCGGI GYTGCTNCVA GTTCTELNPW YSQCL MFHKAVLVAF
SLVTIVHGQQ AGTQTAENHP QLSSQKCTAG GSCTSASTSV VLDSNWRWVH TTSGYTNCYT
GNTWDASICS DPVSCAQNCA LDGADYAGTY GITTSGDALT LKFVTGSNVG SRVYLMEDET
NYQMFKLMNQ EFTFDVDVSN LPCGLNGAVY FVQMDQDGGT SKFPNNKAGA KFGTGYCDSQ
CPQDIKFING EANIVDWTAS AGDANSGTGS FGTCCQEMDI WEANSISAAY TPHPCTVTEQ
TRCSGSDCGQ GSDRFNGICD PDGCDFNSFR MGNTEFYGKG LTVDTSQKFT IVTQFISDDG
TADGNLAEIR RFYVQNGKVI PNSVVQITGI DPVNSITEDF CTQQKTVFGD TNNFAAKGGL
KQMGEAVKNG MVLALSLWDD YAAQMLWLDS DYPTTADPSQ PGVARGTCPT TSGVPSQVEG
QEGSSSVIYS NIKFGDLNST FTGTLTNPSS PAGPPVTSSP SEPSQSTQPS QPAQPTQPAG
TAAQWAQCGG MGFTGPTVCA SPFTCHVLNP YYSQCY MFRAAALLAF TCLAMVSGQQ
AGTNTAENHP QLQSQQCTTS GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWNTSLCP
DGKTCAANCA LDGADYSGTY GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ
EFTFDVDMSN LPCGLNGALY LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING
EANVGNWTET GSNTGTGSYG TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT
GLCDHGDGCD FNSFRMGDKT FLGKGMTVDT SKPFTDVTQF LTNDNTSTGT LSEIRRIYIQ
NGKVIQNSVA NIPGVDPVNS ITDNFCAQQK TAFGDTNWFA QKGGLKQMGE ALGNGMVLAL
SIWDDHAANM LWLDSDYPTD KDPSAPGVAR GTCATTSGVP SDVESQVPNS QVVFSNIKFG
DIGSTFSGTS SPNPPGGSTT SSPVTTSPTP PPTGPTVPQW GQCGGIGYSG STTCASPYTC
HVLNPYYSQC Y MMMKQYLQYL AAALPLVGLA AGQRAGNETP ENHPPLTWQR CTAPGNCQTV
NAEVVIDANW RWLHDDNMQN CYDGNQWTNA CSTATDCAEK CMIEGAGDYL GTYGASTSGD
ALTLKFVTKH EYGTNVGSRF YLMNGPDKYQ MFNLMGNELA FDVDLSTVEC GINSALYFVA
MEEDGGMASY PSNQAGARYG TGYCDAQCAR DLKFVGGKAN IEGWKSSTSD PNAGVGPYGS
CCAEIDVWES NAYAFAFTPH ACTTNEYHVC ETTNCGGTYS EDRFAGKCDA NGCDYNPYRM
GNPDFYGKGK TLDTSRKFTV VSRFEENKLS QYFIQDGRKI EIPPPTWEGM PNSSEITPEL
CSTMFDVFND RNRFEEVGGF EQLNNALRVP MVLVMSIWDD HYANMLWLDS IYPPEKEGQP
GAARGDCPTD SGVPAEVEAQ FPDAQVVWSN IRFGPIGSTY DF MYRSATFLTF
ASLVLGQQVG TYTAERHPSM PIQVCTAPGQ CTRESTEVVL DANWRWTHIT NGYTNCYTGN
EWNATACPDG ATCAKNCAVD GADYSGTYGI TTPSSGALRL QFVKKNDNGQ NVGSRVYLMA
SSDKYKLFNL LNKEFTFDVD VSKLPCGLNG AVYFSEMLED GGLKSFSGNK AGAKYGTGYC
DSQCPQDIKF INGEANVEGW GGADGNSGTG KYGICCAEMD IWEANSDATA YTPHVCSVNE
QTRCEGVDCG AGSDRYNSIC DKDGCDFNSY RLGNREFYGP GKTVDTTRPF TIVTQFVTDD
GTDSGNLKSI HRYYVQDGNV IPNSVTEVAG VDQTNFISEG FCEQQKSAFG DNNYFGQLGG
MRAMGESLKK MVLVLSIWDD HAVNMNWLDS IFPNDADPEQ PGVARGRCDP ADGVPATIEA
AHPDAYVIYS NIKFGAINST FTAN MYRTLAFASL SLYGAARAQQ VGTSTAENHP
KLTWQTCTGT GGTNCSNKSG SVVLDSNWRW AHNVGGYTNC YTGNSWSTQY CPDGDSCTKN
CAIDGADYSG TYGITTSNNA LSLKFVTKGS FSSNIGSRTY LMETDTKYQM FNLINKEFTF
DVDVSKLPCG LNGALYFVEM AADGGIGKGN NKAGAKYGTG YCDSQCPHDI KFINGKANVE
GWNPSDADPN GGAGKIGACC PEMDIWEANS ISTAYTPHPC RGVGLQECSD AASCGDGSNR
YDGQCDKDGC DFNSYRMGVK DFYGPGATLD TTKKMTVITQ FLGSGSSLSE IKRFYVQNGK
VYKNSQSAVA GVTGNSITES FCTAQKKAFG DTSSFAALGG LNEMGASLAR GHVLIMSLWG
DHAVNMLWLD STYPTDADPS KPGAARGTCP TTSGKPEDVE KNSPDATVVF SNIKFGPIGS
TFAQPA MYQKLALISA FLATARAQSA CTLQAETHPP LTWQKCSSGG TCTQQTGSVV
IDANWRWTHA TNSSTNCYDG NTWSSTLCPD NETCAKNCCL DGAAYASTYG VTTSADSLSI
GFVTQSAQKN VGARLYLMAS DTTYQEFTLL GNEFSFDVDV SQLPCGLNGA LYFVSMDADG
GVSKYPTNTA GAKYGTGYCD SQCPRDLKFI NGQANVEGWE PSSNNANTGI GGHGSCCSEM
DIWEANSISE ALTPHPCTTV GQEICDGDSC GGTYSGDRYG GTCDPDGCDW NPYRLGNTSF
YGPGSSFTLD TTKKLTVVTQ FETSGAINRY YVQNGVTFQQ PNAELGDYSG NSLDDDYCAA
EEAEFGGSSF SDKGGLTQFK KATSGGMVLV MSLWDDYYAN MLWLDSTYPT NETSSTPGAV
RGSCSTSSGV PAQLESNSPN AKVVYSNIKF GPIGSTGNSS GGNPPGGNPP GTTTTRRPAT
STGSSPGPTQ THYGQCGGIG YSGPTVCASG STCQVLNPYY SQCL MVDSFSIYKT
ALLLSMLATS NAQQVGTYTA ETHPSLTWQT CSGSGSCTTT SGSVVIDANW RWVHEVGGYT
NCYSGNTWDS SICSTDTTCA SECALEGATY ESTYGVTTSG SSLRLNFVTT ASQKNIGSRL
YLLADDSTYE TFKLFNREFT FDVDVSNLPC GLNGALYFVS MDADGGVSRF PTNKAGAKYG
TGYCDSQCPR DLKFIDGQAN IEGWEPSSTD VNAGTGNHGS CCPEMDIWEA NSISSAFTAH
PCDSVQQTMC TGDTCGGTYS DTTDRYSGTC DPDGCDFNPY RFGNTNFYGP GKTVDNSKPF
TVVTQFITHD GTDTGTLTEI RRLYVQNGVV IGNGPSTYTA ASGNSITESF CKAEKTLFGD
TNVFETHGGL SAMGDALGDG MVLVLSLWDD HAADMLWLDS DYPTTSCASS PGVARGTCPT
TTGNATYVEA NYPNSYVTYS NIKFGTLNST YSGTSSGGSS SSSTTLTTKA STSTTSSKTT
TTTSKTSTTS SSSTNVAQLY GQCGGQGWTG PTTCASGTCT KQNDYYSQCL MYRILKSFIL
LSLVNMSLSQ KIGKLTPEVH PPMTFQKCSE GGSCETIQGE VVVDANWRWV HSAQGQNCYT
GNTWNPTICP DDETCAENCY LDGANYESVY GVTTSEDSVR LNFVTQSQGK NIGSRLFLMS
NESNYQLFHV LGQEFTFDVD VSNLDCGLNG ALYLVSMDSD GGSARFPTNE AGAKYGTGYC
DAQCPRDLKF ISGSANVDGW IPSTNNPNTG YGNLGSCCAE MDLWEANNMA TAVTPHPCDT
SSQSVCKSDS CGGAASSNRY GGICDPDGCD YNPYRMGNTS FFGPNKMIDT NSVITVVTQF
ITDDGSSDGK LTSIKRLYVQ DGNVISQSVS TIDGVEGNEV NEEFCTNQKK VFGDEDSFTK
HGGLAKMGEA LKDGMVLVLS LWDDYQANML WLDSSYPTTS SPTDPGVARG SCPTTSGVPS
KVEQNYPNAY VVYSNIKVGP IDSTYKK MISRVLAISS LLAAARAQQI GTNTAEVHPA
LTSIVIDANW RWLHTTSGYT NCYTGNSWDA TLCPDAVTCA ANCALDGADY SGTYGITTSG
NSLKLNFVTK GANTNVGSRT YLMAAGSKTQ YQLLKLLGQE FTFDVDVSNL PCGLNGALYF
AEMDADGGVS RFPTNKAGAQ YGTGYCDAQC PQDIKFINGQ ANSVGWTPSS NDVNTGTGQY
GSCCSEMDIW EANKISAAYT PHPCSVDGQT RCTGTDCGIG ARYSSLCDAD GCDFNSYRMG
DTGFYGAGLT VDTSKVFTVV TQFITNDGTT SGTLSEIRRF YVQNGKVIPN SQSKVTGVSG
NSITDSFCAA QKTAFGDTNE FATKGGLATM SKALAKGMVL VMSIWDDHSA NMLWLDAPYP
ASKSPSAAGV SRGSCSASSG VPADVEANSP GASVTYSNIK WGPINSTYSA GTGSNTGSGS
GSTTTLVSSV PSSTPTSTTG VPKYGQCGGS GYTGPTNCIG STCVSMGQYY SQCQ
MYRQVATALS FASLVLGQQV GTLTAETHPS LPIEVCTAPG SCTKEDTTVV LDANWRWTHV
TDGYTNCYTG NAWNETACPD GKTCAANCAI DGAEYEKTYG ITTPEEGALR LNFVTESNVG
SRVYLMAGED KYRLFNLLNK EFTMDVDVSN LPCGLNGAVY FSEMDEDGGM SRFEGNKAGA
KYGTGYCDSQ CPRDIKFING EANSEGWGGE DGNSGTGKYG TCCAEMDIWE ANLDATAYTP
HPCKVTEQTR CEDDTECGAG DARYEGLCDR DGCDFNSFRL GNKEFYGPEK TVDTSKPFTL
VTQFVTADGT DTGALQSIRR FYVQDGTVIP NSETVVEGVD PTNEITDDFC AQQKTAFGDN
NHFKTIGGLP AMGKSLEKMV LVLSIWDDHA VYMNWLDSNY PTDADPTKPG VARGRCDPEA
GVPETVEAAH PDAYVIYSNI KIGALNSTFA AA MSSFQVYRAA LLLSILATAN
AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS
ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL
FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD
LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD
GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG
TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE
GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE
SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS
AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL MYRAIATASA LLATARAQQV
CTLNTENKPA LTWAKCTSSG CSNVRGSVVV DANWRWAHST SSSTNCYTGN TWDKTLCPDG
KTCADKCCLD GADYSGTYGV TSSGNQLNLK FVTVGPYSTN VGSRLYLMED ENNYQMFDLL
GNEFTFDVDV NNIGCGLNGA LYFVSMDKDG GKSRFSTNKA GAKYGTGYCD AQCPRDVKFI
NGVANSDEWK PSDSDKNAGV GKYGTCCPEM DIWEANKIST AYTPHPCKSL TQQSCEGDAC
GGTYSATRYA GTCDPDGCDF NPYRQGNKTF YGPGSGFNVD TTKKVTVVTQ FIKGSDGKLS
EIKRLYVQNG KVIGNPQSEI ANNPGSSVTD SFCKAQKVAF NDPDDFNKKG GWSGMSDALA
KPMVLVMSLW HDHYANMLWL DSTYPKGSKT PGSARGSCPE DSGDPDTLEK EVPNSGVSFS
NIKFGPIGST YTGTGGSNPD PEEPEEPEEP VGTVPQYGQC GGINYSGPTA CVSPYKCNKI
NDFYSQCQ EQAGTATAEN HPPLTWQECT APGSCTTQNG AVVLDANWRW VHDVNGYTNC
YTGNTWDPTY CPDDETCAQN CALDGADYEG TYGVTSSGSS LKLNFVTGSN VGSRLYLLQD
DSTYQIFKLL NREFSFDVDV SNLPCGLNGA LYFVAMDADG GVSKYPNNKA GAKYGTGYCD
SQCPRDLKFI DGEANVEGWQ PSSNNANTGI GDHGSCCAEM DVWEANSISN AVTPHPCDTP
GQTMCSGDDC GGTYSNDRYA GTCDPDGCDF NPYRMGNTSF YGPGKIIDTT KPFTVVTQFL
TDDGTDTGTL SEIKRFYIQN SNVIPQPNSD ISGVTGNSIT TEFCTAQKQA FGDTDDFSQH
GGLAKMGAAM QQGMVLVMSL WDDYAAQMLW LDSDYPTDAD PTTPGIARGT CPTDSGVPSD
VESQSPNSYV TYSNIKFGPI NSTFTAS MFPTLALVSL SFLAIAYGQQ VGTLTAETHP
KLSVSQCTAG GSCTTVQRSV VLDSNWRWLH DVGGSTNCYT GNTWDDSLCP DPTTCAANCA
LDGADYSGTY GITTSGNALS LKFVTQGPYS TNIGSRVYLL SEDDSTYEMF NLKNQEFTFD
VDMSALPCGL NGALYFVEMD KDGGSGRFPT NKAGSKYGTG YCDTQCPHDI KFINGEANVL
DWAGSSNDPN AGTGHYGTCC NEMDIWEANS MGAAVTPHVC TVQGQTRCEG TDCGDGDERY
DGICDKDGCD FNSWRMGDQT FLGPGKTVDT SSKFTVVTQF ITADNTTSGD LSEIRRLYVQ
NGKVIANSKT QIAGMDAYDS ITDDFCNAQK TTFGDTNTFE QMGGLATMGD AFETGMVLVM
SIWDDHEAKM LWLDSDYPTD ADASAPGVSR GPCPTTSGDP TDVESQSPGA TVIFSNIKTG
PIGSTFTS MLSASKAAAI LAFCAHTASA WVVGDQQTET HPKLNWQRCT GKGRSSCTNV
NGEVVIDANW RWLAHRSGYT NCYTGSEWNQ SACPNNEACT KNCAIEGSDY AGTYGITTSG
NQMNIKFITK RPYSTNIGAR TYLMKDEQNY EMFQLIGNEF TFDVDLSQRC GMNGALYFVS
MPQKGQGAPG AKYGTGYCDA QCARDLKFVR GSANAEGWTK SASDPNSGVG KKGACCAQMD
VWEANSAATA LTPHSCQPAG YSVCEDTNCG GTYSEDRYAG TCDANGCDFN PFRVGVKDFY
GKGKTVDTTK KMTVVTQFVG SGNQLSEIKR FYVQDGKVIA NPEPTIPGME WCNTQKKVFQ
EEAYPFNEFG GMASMSEGMS QGMVLVMSLW DDHYANMLWL DSNWPREADP AKPGVARRDC
PTSGGKPSEV EAANPNAQVM FSNIKFGPIG STFAHAA MFRTATLLAF TMAAMVFGQQ
VGTNTARSHP ALTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP
DGKTCAANCA LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ
EFTFDVDMSN LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING
EANVEGWNAT SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT
GLCDADGCDF NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN
GKVIQNSSVK IPGIDPVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS
IWDDYAANML WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD
LNTTYTGTVS SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA
SPYTCHVLNP YYSQCY MYQRALLFSA LMAGVSAQQV GTQKPETHPP LAWKECTSSG
CTSKDGSVVI DANWRWVHSV DGYKNCYTGN EWDSTLCPDD ATCATNCAVD GADYAGTYGA
TTEGDSLSIN FVTGSNIGSR FYLMEDENKY QMFKLLNKEF TFDVDVSTLP CGLNGALYFV
SMDADGGMSK YETNKAGAKY GTGYCDSQCP RDLKFINGKG NVEGWKPSAN DKNAGVGPHG
SCCAEMDIWE ANSISTALTP HPCDTNGQTI CEGDSCGGTY STTRYAGTCD PDGCDFNPFR
MGNESFYGPG KMVDTKSKMT VVTQFITSDG TDTGSLKEIK RVYVQNGKVI ANSASDVSGI
TGNSITSDFC TAQKKTFGDE DVFNKHGGLS GMGDALGEGM VLVMSLWDDH NSNMLWLDGE
KYPTDAAASK AGVSRGTCST DSGKPSTVES ESGSAKVVFS NIKVGSIGST FSA
MTSKIALASL FAAAYGQQIG TYTTETHPSL TWQSCTAKGS CTTQSGSIVL DGNWRWTHST
TSSTNCYTGN TWDATLCPDD ATCAQNCALD GADYSGTYGI TTSGDSLRLN FVTQTANKNV
GSRVYLLADN THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG ISSTNKAGAQ
YGTGYCDSQC PRDGKFINGK ANVDGWVPSS NNPNTGVGNY GSCCAEMDIW EANSISTAVT
PHSCDTVTQT VCTGDNCGGT YSTTRYAGTC DPDGCDFNPY RQGNESFYGP GKTVDTNSVF
TIVTQFLTTD GTSSGTLNEI KRFYVQNGKV IPNSESTISG VTGNSITTPF CTAQKTAFGD
PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPTTKTGAG GPRGTCSTSS
GVPASVEASS PNAYVVYSNI KVGAINSTFG MYTKFAALAA LVATVRGQAA CSLTAETHPS
LQWQKCTAPG SCTTVSGQVT IDANWRWLHQ TNSSTNCYTG NEWDTSICSS DTDCATKCCL
DGADYTGTYG VTASGNSLNL KFVTQGPYSK NIGSRMYLME SESKYQGFTL LGQEFTFDVD
VSNLGCGLNG ALYFVSMDLD GGVSKYTTNK AGAKYGTGYC DSQCPRDLKF INGQANIDGW
QPSSNDANAG LGNHGSCCSE MDIWEANKVS AAYTPHPCTT IGQTMCTGDD CGGTYSSDRY
AGICDPDGCD FNSYRMGDTS FYGPGKTVDT GSKFTVVTQF LTGSDGNLSE IKRFYVQNGK
VIPNSESKIA GVSGNSITTD FCTAQKTAFG DTNVFEERGG LAQMGKALAE PMVLVLSVWD
DHAVNMLWLD STYPTDSTKP GAARGDCPIT SGVPADVESQ APNSNVIYSN IRFGPINSTY
TGTPSGGNPP GGGTTTTTTT TTSKPSGPTT TTNPSGPQQT HWGQCGGQGW TGPTVCQSPY
TCKYSNDWYS QCL MYQRALLFSA LLSVSRAQQA GTAQEEVHPS LTWQRCEASG
SCTEVAGSVV LDSNWRWTHS VDGYTNCYTG NEWDATLCPD NESCAQNCAV DGADYEATYG
ITSNGDSLTL KFVTGSNVGS RVYLMEDDET YQMFDLLNNE FTFDVDVSNF PCGLNGALYF
TSMDADGGLS KYEGNTAGAK YGTGYCDSQC PRDIKFINGL GNVEGWEPSD SDANAGVGGM
GTCCPEMDIW EANSISTAYT PHPCDSVEQT MCEGDSCGGT YSDDRYGGTC DPDGCDFNSY
RMGNTRFYGP GAIIDTSSKF TVVTQFIADG GSLSEIKRFY VQNGEVIPNS ESNISGVEGN
SITSEFCTAQ KTAFGDEDIF AQHGGLSAMG DAASAMVLIL SIWDDHHSSM MWLDSSYPTD
ADPSQPGVAR GTCEQGAGDP DVVESEHADA SVTFSNIKFG PIGSTF MMMKQYLQYL
AAGSLMTGLV AGQGVGTQQT ETHPRITWKR CTGKANCTTV QAEVVIDSNW RWIHTSGGTN
CYDGNAWNTA ACSTATDCAS KCLMEGAGNY QQTYGASTSG DSLTLKFVTK HEYGTNVGSR
FYLMNGASKY QMFTLMNNEF TFDVDLSTVE CGLNSALYFV AMEEDGGMRS YPTNKAGAKY
GTGYCDAQCA RDLKFVGGKA NIEGWRESSN DENAGVGPYG GCCAEIDVWE SNAHAYAFTP
HACENNNYHV CERDTCGGTY SEDRFAGGCD ANGCDYNPYR MGNPDFYGKG KTVDTTKKFT
VVTRFQDDNL EQFFVQNGQK ILAPAPTFDG IPASPNLTPE FCSTQFDVFT DRNRFREVGD
FPQLNAALRI PMVLVMSIWA DHYANMLWLD SVYPPEKEGE PGAARGPCAQ DSGVPSEVKA
NYPNAKVVWS NIRFGPIGST VNV MYQRALLFSF FLAAARAQQA GTVTAENHPS
LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG NTWDTSICPD DVTCAQNCAL
DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD DTTYQIFKLL GQEFTFDVDV
SNLPCGLNGA LYFVAMDADG GLSKYPGNKA GAKYGTGYCD SQCPRDLKFI NGQANVEGWQ
PSANDPNAGV GNHGSCCAEM DVWEANSIST AVTPHPCDTP GQTMCQGDDC GGTYSSTRYA
GTCDPDGCDF NPYRQGNHSF YGPGKIVDTS SKFTVVTQFI TDDGTPSGTL TEIKRFYVQN
GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FGDNTGFFTH GGLQKISQAL AQGMVLVMSL
WDDHAANMLW LDSTYPTDAD PDTPGVARGT CPTTSGVPAD VESQNPNSYV IYSNIKVGPI
NSTFTAN MFAIVLLGLT RSLGTGTNQA ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW
RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD NCLIDGADYS GTYGITSSGN SLKLVFVTNG
PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT FTVDDSNLDC GLNGALYFVS MDEDGGTSRF
SSNKAGAKYG TGYCDAQCPH DIKFINGEAN VENWKPQTND ENAGNGRYGA CCTEMDIWEA
NKYATAYTPH ICTVNGEYRC DGSECGDTDS GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL
IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR KYVQGGKVIE NTVVNIAGMS SGNSITDDFC
NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM VLVLSLWDDH QVNMLWLDSI YPTDQPASQP
GVKRGPCATS SGAPSDVESQ HPDSSVTFSD IRFGPIDSTY MHQRALLFSA LVGAVRAQQA
GTLTEEVHPP LTWQKCTADG SCTEQSGSVV IDSNWRWLHS TNGSTNCYTG NTWDESLCPD
NEACAANCAL DGADYESTYG ITTSGDALTL TFVTGENVGS RVYLMAEDDE SYQTFDLVGN
EFTFDVDVSN LPCGLNGALY FTSMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFING
MANVEGWTPS DNDKNAGVGG HGSCCPELDI WEANSISSAF TPHPCDDLGQ TMCSGDDCGG
TYSETRYAGT CDPDGCDFNA YRMGNTSYYG PDKIVDTNSV MTVVTQFIGD GGSLSEIKRL
YVQNGKVIAN AQSNVDGVTG NSITSDFCTA QKTAFGDQDI FSKHGGLSGM GDAMSAMVLI
LSIWDDHNSS MMWLDSTYPE DADASEPGVA RGTCEHGVGD PETVESQHPG ATVTFSKIKF
GPIGSTYSSN STA MFRAAALLAF TCLAMVSGQQ AGTNTAENHP QLQSQQCTTS
GGCKPLSTKV VLDSNWRWVH STSGYTNCYT GNEWDTSLCP DGKTCAANCA LDGADYSGTY
GITSTGTALT LKFVTGSNVG SRVYLMADDT HYQLLKLLNQ EFTFDVDMSN LPCGLNGALY
LSAMDADGGM SKYPGNKAGA KYGTGYCDSQ CPKDIKFING EANVGNWTET GSNTGTGSYG
TCCSEMDIWE ANNDAAAFTP HPCTTTGQTR CSGDDCARNT GLCDGDGCDF NSFRMGDKTF
LGKGMTVDTS KPFTVVTQFL TNDNTSTGTL SEIRRIYIQN GKVIQNSVAN IPGVDPVNSI
TDNFCAQQKT AFGDTNWFAQ KGGLKQMGEA LGNGMVLALS IWDDHAANML WLDSDYPTDK
DPSAPGVARG TCATTSGVPS DVESQVPNSQ VVFSNIKFGD IGSTFSGTSS PNPPGGSTTS
SPVTTSPTPP PTGPTVPQWG QCGGIGYSGS TTCASPYTCH VLNPCESILS LQRSSNADQY
LQTTRSATKR RLDTALQPRK MRTALALILA LAAFSAVSAQ QAGTITAETH PTLTIQQCTQ
SGGCAPLTTK VVLDVNWRWI HSTTGYTNCY SGNTWDAILC PDPVTCAANC ALDGADYTGT
FGILPSGTSV TLRPVDGLGL RLFLLADDSH YQMFQLLNKE FTFDVEMPNM RCGSSGAIHL
TAMDADGGLA KYPGNQAGAK YGTGFCSAQC PKGVKFINGQ ANVEGWLGTT ATTGTGFFGS
CCTDIALWEA NDNSASFAPH PCTTNSQTRC SGSDCTADSG LCDADGCNFN SFRMGNTTFF
GAGMSVDTTK LFTVVTQFIT SDNTSMGALV EIHRLYIQNG QVIQNSVVNI PGINPATSIT
DDLCAQENAA FGGTSSFAQH GGLAQVGEAL RSGMVLALSI VNSAADTLWL DSNYPADADP
SAPGVARGTC PQDSASIPEA PTPSVVFSNI KLGDIGTTFG AGSALFSGRS PPGPVPGSAP
ASSATATAPP FGSQCGGLGY AGPTGVCPSP YTCQALNIYY SQCI MYQRALLFSF
FLAAARAHEA GTVTAENHPS LTWQQCSSGG SCTTQNGKVV IDANWRWVHT TSGYTNCYTG
NTWDTSICPD DVTCAQNCAL DGADYSGTYG VTTSGNALRL NFVTQSSGKN IGSRLYLLQD
DTTYQIFKLL GQEFTFDVDV SNLPCGLNGA LYFVAMDADG NLSKYPGNKA GAKYGTGYCD
SQCPRDLKFI NGQANVEGWQ PSANDPNAGV GNHGSSCAEM DVWEANSIST AVTPHPCDTP
GQTMCQGDDC GGTYSSTRYA GTCDTDGCDF NPYQPGNHSF YGPGKIVDTS SKFTVVTQFI
TDDGTPSGTL TEIKRFYVQN GKVIPQSEST ISGVTGNSIT TEYCTAQKAA FDNTGFFTHG
GLQKISQALA QGMVLVMSLW DDHAANMLWL DSTYPTDADP DTPGVARGTC PTTSGVPADV
ESQNPNSYVI YSNIKVGPIN STFTAN MHKRAATLSA LVVAAAGFAR GQGVGTQQTE
THPKLTFQKC SAAGSCTTQN GEVVIDANWR WVHDKNGYTN CYTGNEWNTT ICADAASCAS
NCVVDGADYQ GTYGASTSGN ALTLKFVTKG SYATNIGSRM YLMASPTKYA MFTLLGHEFA
FDVDLSKLPC GLNGAVYFVS MDEDGGTSKY PSNKAGAKYG TGYCDSQCPR DLKFIDGKAN
SASWQPSSND QNAGVGGMGS CCAEMDIWEA NSVSAAYTPH PCQNYQQHSC SGDDCGGTYS
ATRFAGDCDP DGCDWNAYRM GVHDFYGNGK TVDTGKKFSI VTQFKGSGST LTEIKQFYVQ
DGRKIENPNA TWPGLEPFNS ITPDFCKAQK QVFGDPDRFN DMGGFTNMAK ALANPMVLVL
SLWDDHYSNM LWLDSTYPTD ADPSAPGKGR GTCDTSSGVP SDVESKNGDA TVIYSNIKFG
PLDSTYTAS MRASLLAFSL NSAAGQQAGT LQTKNHPSLT SQKCRQGGCP QVNTTIVLDA
NWRWTHSTSG STNCYTGNTW QATLCPDGKT CAANCALDGA DYTGTYGVTT SGNSLTLQFV
TQSNVGARLG YLMADDTTYQ MFNLLNQEFW FDVDMSNLPC GLNGALYFSA MARTAAWMPM
VVCASTPLIS TRRSTARLLR LPVPPRSRYG RGICDSQCPR DIKFINGEAN VQGWQPSPND
TNAGTGNYGA CCNKMDVWEA NSISTAYTPH PCTQRGLVRC SGTACGGGSN RYGSICDHDG
LGFQNLFGMG RTRVRARVGR VKQFNRSSRV VEPISWTKQT TLHLGNLPWK SADCNVQNGR
VIQNSKVNIP GMPSTMDSVT TEFCNAQKTA FNDTFSFQQK GGMANMSEAL
RRGMVLVLSI WDDHAANMLW LDSITSAAAC RSTPSEVHAT PLRESQIRSS HSRQTRYVTF
TNIKFGPFNS TGTTYTTGSV PTTSTSTGTT GSSTPPQPTG VTVPQGQCGG IGYTGPTTCA
SPTTCHVLNP YYSQCY MKQYLQYLAA ALPLMSLVSA QGVGTSTSET HPKITWKKCS
SGGSCSTVNA EVVIDANWRW LHNADSKNCY DGNEWTDACT SSDDCTSKCV LEGAEYGKTY
GASTSGDSLS LKFLTKHEYG TNIGSRFYLM NGASKYQMFT LMNNEFAFDV DLSTVECGLN
SALYFVAMEE DGGMASYSTN KAGAKYGTGY CDAQCARDLK FVGGKANYDG WTPSSNDANA
GVGALGGCCA EIDVWESNAH AFAFTPHACE NNNYHVCEDT TCGGTYSEDR FAGDCDANGC
DYNPYRVGNT DFYGKGMTVD TSKKFTVVSQ FQENKLTQFF VQNGKKIEIP GPKHEGLPTE
SSDITPELCS AMPEVFGDRD RFAEVGGFDA LNKALAVPMV LVMSIWDDHY ANMLWLDSSY
PPEKAGTPGG DRGPCAQDSG VPSEVESQYP DATVVWSNIR FGPIGSTVQV MFPKASLIAL
SFIAAVYGQQ VGTQMAEVHP KLPSQLCTKS GCTNQNTAVV LDANWRWLHT TSGYTNCYTG
NSWDATLCPD ATTCAQNCAV DGADYSGTYG ITTSGNALTL KFKTGTNVGS RVYLMQTDTA
YQMFQLLNQE FTFDVDMSNL PCGLNGALYL SQMDQDGGLS KFPTNKAGAK YGTGYCDSQC
PHDIKFINGM ANVAGWAGSA SDPNAGSGTL GTCCSEMDIW EANNDAAAFT PHPCSVDGQT
QCSGTQCGDD DERYSGLCDK DGCDFNSFRM GDKSFLGKGM TVDTSRKFTV VTQFVTTDGT
TNGDLHEIRR LYVQDGKVIQ NSVVSIPGID AVDSITDNFC AQQKSVFGDT NYFATLGGLK
KMGAALKSGM VLAMSVWDDH AASMQWLDSN YPADGDATKP GVARGTCSAD SGLPTNVESQ
SASASVTFSN IKWGDINTTF TGTGSTSPSS PAGPVSSSTS VASQPTQPAQ GTVAQWGQCG
GTGFTGPTVC ASPFTCHVVN PYYSQCY MFRTAALLSF AYLAVVYGQQ AGTSTAETHP
PLTWEQCTSG GSCTTQSSSV VLDSNWRWTH VVGGYTNCYT GNEWNTTVCP DGTTCAANCA
LDGADYEGTY GISTSGNALT LKFVTASAQT NVGSRVYLMA PGSETEYQMF NPLNQEFTFD
VDVSALPCGL NGALYFSEMD ADGGLSEYPT NKAGAKYGTG YCDSQCPRDI KFIEGKANVE
GWTPSSTSPN AGTGGTGICC NEMDIWEANS ISEALTPHPC TAQGGTACTG DSCSSPNSTA
GICDQAGCDF NSFRMGDTSF YGPGLTVDTT SKITVVTQFI TSDNTTTGDL TAIRRIYVQN
GQVIQNSMSN IAGVTPTNEI TTDFCDQQKT AFGDTNTFSE KGGLTGMGAA FSRGMVLVLS
IWDDDAAEML WLDSTYPVGK TGPGAARGTC ATTSGQPDQV ETQSPNAQVV FSNIKFGAIG
STFSSTGTGT GTGTGTGTGT GTTTSSAPAA TQTKYGQCGG QGWTGATVCA SGSTCTSSGP
YYSQCL MFRTAALTAF TFAAVVLGQQ VGTLTTENHP ALSIQQCTAT GCTTQQKSVV
LDSNWRWTHS TAGATNCYTG NAWDPALCPD PATCATNCAI DGADYSGTYG ITTSGNALTL
RFVTNGQYSQ NIGSRVYLLD DADHYKLFDL KNQEFTFDVD MSGLPCGLNG ALYFSEMAAD
GGKAAHAGNN AGAKYGTGYC DAQCPHDIKW INGEANVLDW SASATDDNAG NGRYGACCAE
MDIWEANSEA TAYTPHVCRD EGLYRCSGTE CGDGNNRYGG VCDKDGCDFN SYRMGDKNFL
GRGKTIDTTK KVTVVTQFIT DNNTPTGNLV EIRRVYVQNG VVYQNSFSTF PSLSQYNSIS
DEFCVAQKTL FGDNQYYNTH GGTTKMGDAF DNGMVLIMSL WSDHAAHMLW LDSDYPLDKS
PSEPGVSRGA CPTSSGDPDD VVANHPNASV TFSNIKYGPI GSTFGGSTPP VSSGGSSVPP
VTSTTSSGTT TPTGPTGTVP KWGQCGGIGY SGPTACVAGS TCTYSNDWYS QCL
MYRAIATASA LIAAVRAQQV CSLTPETKPA LSWSKCTSSG CSNVQGSVTI DANWRWTHQL
SGSTNCYTGN KWDTSICTSG KVCAEKCCID GAEYASTYGI TSSGNQLSLS FVTKGTYGTN
IGSRTYLMED ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKAKYPGNKA
GAKYGTGYCD AQCPRDVKFI NGQANSDGWQ PSKSDVNGGI GNLGTCCPEM DIWEANSIST
AHTPHPCTKL TQHSCTGDSC GGTYSEDRYG GTCDADGCDF NAYRQGNKTF YGPGSGFNVD
TTKKVTVVTQ FHKGSNGRLS EITRLYVQNG KVIANSESKI AGVPGSSLTP EFCTAQKKVF
GDIDDFEKKG AWGGMSDALE APMVLVMSLW HDHHSNMLWL DSTYPTDSTK LGAQRGSCST
SSGVPADLEK NVPNSKVAFS NIKFGPIGST YKEGQPEPTN PTNPNPTTPG GTVDQWGQCG
GTNYSGPTAC KSPFTCKKIN DFYSQCQ MFRTATLLAF TMAAMVFGQQ VGTNTAENHR
TLTSQKCTKS GGCSNLNTKI VLDANWRWLH STSGYTNCYT GNQWDATLCP DGKTCAANCA
LDGADYTGTY GITASGSSLK LQFVTGSNVG SRVYLMADDT HYQMFQLLNQ EFTFDVDMSN
LPCGLNGALY LSAMDADGGM AKYPTNKAGA KYGTGYCDSQ CPRDIKFING EANVEGWNAT
SANAGTGNYG TCCTEMDIWE ANNDAAAYTP HPCTTNAQTR CSGSDCTRDT GLCDADGCDF
NSFRMGDQTF LGKGLTVDTS KPFTVVTQFI TNDGTSAGTL TEIRRLYVQN GKVIQNSSVK
IPGIDLVNSI TDNFCSQQKT AFGDTNYFAQ HGGLKQVGEA LRTGMVLALS IWDDYAANML
WLDSNYPTNK DPSTPGVARG TCATTSGVPA QIEAQSPNAY VVFSNIKFGD LNTTYTGTVS
SSSVSSSHSS TSTSSSHSSS STPPTQPTGV TVPQWGQCGG IGYTGSTTCA SPYTCHVLNP
YYSQCY MYQTSLLASL SFLLATSQAQ QVGTQTAETH PKLTTQKCTT AGGCTDQSTS
IVLDANWRWL HTVDGYTNCY TGQEWDTSIC TDGKTCAEKC ALDGADYEST YGISTSGNAL
TMNFVTKSSQ TNIGGRVYLL AADSDDTYEL FKLKNQEFTF DVDVSNLPCG LNGALYFSEM
DSDGGLSKYT TNKAGAKYGT GYCDTQCPHD IKFINGEANV QNWTASSTDK NAGTGHYGSC
CNEMDIWEAN SQATAFTPHV CEAKVEGQYR CEGTECGDGD NRYGGVCDKD GCDFNSYRMG
NETFYGSNGS TIDTTKKFTV VTQFITADNT ATGALTEIRR KYVQNDVVIE NSYADYETLS
KFNSITDDFC AAQKTLSGDT NDFKTKGGIA RMGESFERGM VLVMSVWDDH AANALWLDSS
YPTDADASKP GVKRGPCSTS SGVPSDVEAN DADSSVIYSN IRYGDIGSTF NKTA
MFSKVALTAL CFLAVAQAQQ VGREVAENHP RLPWQRCTRN GGCQTVSNGQ VVLDANWRWL
HVTDGYTNCY TGNAWNSSVC SDGATCAQRC ALEGANYQQT YGITTSGDAL TIKFLTRSEQ
TNIGARVYLM ENEDRYQMFN LLNKEFTFDV DVSKVPCGIN GALYFIQMDA DGGLSSQPNN
RAGAKYGTGY CDSQCPRDIK FINGEANSVG WEPSETDPNA GKGQYGICCA EMDIWEANSI
SNAYTPHPCQ TVNDGGYQRC QGRDCNQPRY EGLCDPDGCD YNPFRMGNKD FYGPGKTVDT
NRKMTVVTQF ITHDNTDTGT LVDIRRLYVQ DGRVIANPPT NFPGLMPAHD SITQEFCDDA
KRAFEDNDSF GRNGGLAHMG RSLAKGHVLA LSIWNDHTAH MLWLDSNYPT DADPNKPGIA
RGTCPTTGGS PRDTEQNHPD AQVIFSNIKF GDIGSTFSGN MYRKLAVISA FLAAARAQQV
CTQQAETHPP LTWQKCTASG CTPQQGSVVL DANWRWTHDT KSTTNCYDGN TWSSTLCPDD
ATCAKNCCLD GANYSGTYGV TTSGDALTLQ FVTASNVGSR LYLMANDSTY QEFTLSGNEF
SFDVDVSQLP CGLNGALYFV SMDADGGQSK YPGNAAGAKY GTGYCDSQCP RDLKFINGQA
NVEGWEPSSN NANTGVGGHG SCCSEMDIWE ANSISEALTP HPCETVGQTM CSGDSCGGTY
SNDRYGGTCD PDGCDWNPYR LGNTSFYGPG SSFALDTTKK LTVVTQFATD GSISRYYVQN
GVKFQQPNAQ VGSYSGNTIN TDYCAAEQTA FGGTSFTDKG GLAQINKAFQ GGMVLVMSLW
DDYAVNMLWL DSTYPTNATA STPGAKRGSC STSSGVPAQV EAQSPNSKVI YSNIRFGPIG
STGGNTGSNP PGTSTTRAPP SSTGSSPTAT QTHYGQCGGT GWTGPTRCAS GYTCQVLNPF
YSQCL MRASLLAFSL AAAVAGGQQA GTLTAKRHPS LTWQKCTRGG CPTLNTTMVL
DANWRWTHAT SGSTKCYTGN KWQATLCPDG KSCAANCALD GADYTGTYGI TGSGWSLTLQ
FVTDNVGARA YLMADDTQYQ MLELLNQELW FDVDMSNIPC GLNGALYLSA MDADGGMRKY
PTNKAGAKYA TGYCDAQCPR DLKYINGIAN VEGWTPSTND ANGIGDHGSC CSEMDIWEAN
KVSTAFTPHP CTTIEQHMCE GDSCGGTYSD DRYGVLCDAD GCDFNSYRMG NTTFYGEGKT
VDTSSKFTVV TQFIKDSAGD LAEIKAFYVQ NGKVIENSQS NVDGVSGNSI TQSFCKSQKT
AFGDIDDFNK KGGLKQMGKA LAQAMVLVMS IWDDHAANML WLDSTYPVPK VPGAYRGSGP
TTSGVPAEVD ANAPNSKVAF SNIKFGHLGI SPFSGGSSGT PPSNPSSSAS PTSSTAKPSS
TSTASNPSGT GAAHWAQCGG IGFSGPTTCP EPYTCAKDHD IYSQCV MLASTFSYRM
YKTALILAAL LGSGQAQQVG TSQAEVHPSM TWQSCTAGGS CTTNNGKVVI DANWRWVHKV
GDYTNCYTGN TWDKTLCPDD ATCASNCALE GANYQSTYGA TTSGDSLRLN FVTTSQQKNI
GSRLYMMKDD TTYEMFKLLN QEFTFDVDVS NLPCGLNGAL YFVAMDADGG MSKYPTNKAG
AKYGTGYCDS QCPRDLKFIN GQANVEGWQP SSNDANAGTG NHGSCCAEMD IWEANSISTA
FTPHPCDTPG QVMCTGDACG GTYSSDRYGG TCDPDGCDFN SFRQGNKTFY GPGMTVDTKS
KFTVVTQFIT DDGTASGTLK EIKRFYVQNG KVIPNSESTW SGVGGNSITN DYCTAQKSLF
KDQNVFAKHG GMEGMGAALA QGMVLVMSLW DDHAANMLWL DSNYPTTASS STPGVARGTC
DISSGVPADV EANHPDASVV YSNIKVGPIG STFNSGGSNP GGGTTTTAKP TTTTTTAGSP
GGTGVAQHYG QCGGNGWQGP TTCASPYTCQ KLNDFYSQCL MQIKQYLQYL AAALPLVNMA
AAQRAGTQQT ETHPRLSWKR CSSGGNCQTV NAEIVIDANW RWLHDSNYQN CYDGNRWTSA
CSSATDCAQK CYLEGANYGS TYGVSTSGDA LTLKFVTKHE YGTNIGSRVY LMNGSDKYQM
FTLMNNEFAF DVDLSKVECG LNSALYFVAM EEDGGMRSYS SNKAGAKYGT GYCDAQCARD
LKFVGGKANI EGWRPSTNDA NAGVGPYGAC CAEIDVWESN AYAFAFTPHG CLNNNYHVCE
TSNCGGTYSE DRFGGLCDAN GCDYNPYRMG NKDFYGKGKT VDTSRKFTVV TRFEENKLTQ
FFIQDGRKID IPPPTWPGLP NSSAITPELC TNLSKVFDDR DRYEETGGFR TINEALRIPM
VLVMSIWDGH YASMLWLDSV YPPEKAGQPG AERGPCAPTS GVPAEVEAQF PNAQVIWSNI
RFGPIGSTYQ V MTSRIALVSL FAAVYGQQVG TYQTETHPSL TWQSCTAKGS CTTNTGSIVL
DGNWRWTHGV GTSTNCYTGN TWDATLCPDD ATCAQNCALE GADYSGTYGI TTSGNSLRLN
FVTQSANKNI GSRVYLMADT THYKTFNLLN QEFTFDVDVS NLPCGLNGAV YFANLPADGG
ISSTNTAGAE YGTGYCDSQC PRDMKFIKGQ ANVDGWVPSS NNANTGVGNH GSCCAEMDIW
EANSISTAVT PHSCDTVTQT VCTGDDCGGT YSSSRYAGTC DPDGCDFNSY RMGDETFYGP
GKTVDTNSVF TVVTQFLTTD GTASGTLNEI KRFYVQDGKV IPNSYSTISG VSGNSITTPF
CDAQKTAFGD PTSFSDHGGL ASMSAAFEAG MVLVLSLWDD YYANMLWLDS TYPVGKTSAG
GPRGTCDTSS GVPASVEASS PNAYVVYSNI KVGAINSTYG MFVFVLLWLT QSLGTGTNQA
ENHPSLSWQN CRSGGSCTQT SGSVVLDSNW RWTHDSSLTN CYDGNEWSSS LCPDPKTCSD
NCLIDGADYS GTYGITSSGN SLKLVFVTNG PYSTNIGSRV YLLKDESHYQ IFDLKNKEFT
FTVDDSNLDC GLNGALYFVS MDEDGGTSRF SSNKAGAKYG TGYCDAQCPH DIKFINGEAN
VENWKPQTND ENAGNGRYGA CCTEMDIWEA NKYATAYTPH ICTVNGEYRC DGSECGDTDS
GNRYGGVCDK DGCDFNSYRM GNTSFWGPGL IIDTGKPVTV VTQFVTKDGT DNGQLSEIRR
KYVQGGKVIE NTVVNIAGMS SGNSITDDFC NEQKSAFGDT NDFEKKGGLS GLGKAFDYGM
VLVLSLWDDH QVNMLWLDSI YPTDQPASQP GVKRGPCATS SGAPSDVESQ HPDSSVTFSD
IRFGPIDSTY MFRKAALLAF SFLAIAHGQQ VGTNQAENHP SLPSQKCTAS GCTTSSTSVV
LDANWRWVHT TTGYTNCYTG QTWDASICPD GVTCAKACAL DGADYSGTYG ITTSGNALTL
QFVKGTNVGS RVYLLQDASN YQMFQLINQE FTFDVDMSNL PCGLNGAVYL SQMDQDGGVS
RFPTNTAGAK YGTGYCDSQC PRDIKFINGE ANVEGWTGSS TDSNSGTGNY GTCCSEMDIW
EANSVAAAYT PHPCSVNQQT RCTGADCGQG DDRYDGVCDP DGCDFNSFRM GDQTFLGKGL
TVDTSRKFTI VTQFISDDGT TSGNLAEIRR FYVQDGNVIP NSKVSIAGID AVNSITDDFC
TQQKTAFGDT NRFAAQGGLK QMGAALKSGM VLALSLWDDH AANMLWLDSD YPTTADASNP
GVARGTCPTT SGFPRDVESQ SGSATVTYSN IKWGDLNSTF TGTLTTPSGS SSPSSPASTS
GSSTSASSSA SVPTQSGTVA QWAQCGGIGY SGATTCVSPY TCHVVNAYYS QCY
MYRAIATASA LIAAARAQQV CTLTTETKPA LTWSKCTSSG CTDVKGSVGI DANWRWTHQT
SSSTNCYTGN KWDTSVCTSG ETCAQKCCLD GADYAGTYGI TSSGNQLSLG FVTKGSFSTN
IGSRTYLMEN ENTYQMFQLL GNEFTFDVDV SNIGCGLNGA LYFVSMDADG GKARYPANKA
GAKYGTGYCD AQCPRDVKFI NGKANSDGWK PSDSDINAGI GNMGTCCPEM DIWEANSIST
AFTPHPCTKL TQHACTGDSC GGTYSNDRYG GTCDADGCDF NSYRQGNKTF YGRGSDFNVD
TTKKVTVVTQ FKKGSNGRLS EITRLYVQNG KVIANSESKI PGNSGSSLTA DFCSKQKSVF
GDIDDFSKKG GWSGMSDALE SPPMVLVMSL WHDHHSNMLW LDSTYPTDST KLGAQRGSCA
TTSGVPSDLE RDVPNSKVSF SNIKFGPIGS TYSSGTTNPP PSSTDTSTTP TNPPTGGTVG
QYGQCGGQTY TGPKDCKSPY TCKKINDFYS QCQ MSSFQIYRAA LLLSILATAN
AQQVGTYTTE THPSLTWQTC TSDGSCTTND GEVVIDANWR WVHSTSSATN CYTGNEWDTS
ICTDDVTCAA NCALDGATYE ATYGVTTSGS ELRLNFVTQG SSKNIGSRLY LMSDDSNYEL
FKLLGQEFTF DVDVSNLPCG LNGALYFVAM DADGGTSEYS GNKAGAKYGT GYCDSQCPRD
LKFINGEANC DGWEPSSNNV NTGVGDHGSC CAEMDVWEAN SISNAFTAHP CDSVSQTMCD
GDSCGGTYSA SGDRYSGTCD PDGCDYNPYR LGNTDFYGPG LTVDTNSPFT VVTQFITDDG
TSSGTLTEIK RLYVQNGEVI ANGASTYSSV NGSSITSAFC ESEKTLFGDE NVFDKHGGLE
GMGEAMAKGM VLVLSLWDDY AADMLWLDSD YPVNSSASTP GVARGTCSTD SGVPATVEAE
SPNAYVTYSN IKFGPIGSTY SSGSSSGSGS SSSSSSTTTK ATSTTLKTTS TTSSGSSSTS
AAQAYGQCGG QGWTGPTTCV SGYTCTYENA YYSQCL MHQRALLFSA LLTAVRAQQA
GTLTEEVHPS LTWQKCTSEG SCTEQSGSVV IDSNWRWTHS VNDSTNCYTG NTWDATLCPD
DETCAANCAL DGADYESTYG VTTDGDSLTL KFVTGSNVGS RLYLMDTSDE GYQTFNLLDA
EFTFDVDVSN LPCGLNGALY FTAMDADGGV SKYPANKAGA KYGTGYCDSQ CPRDLKFIDG
QANVDGWEPS SNNDNTGIGN HGSCCPEMDI WEANKISTAL TPHPCDSSEQ TMCEGNDCGG
TYSDDRYGGT CDPDGCDFNP YRMGNDSFYG PGKTIDTGSK MTVVTQFITD GSGSLSEIKR
YYVQNGNVIA NADSNISGVT GNSITTDFCT AQKKAFGDED IFAEHNGLAG ISDAMSSMVL
ILSLWDDYYA SMEWLDSDYP ENATATDPGV ARGTCDSESG VPATVEGAHP DSSVTFSNIK
FGPINSTFSA SA MYAKFATLAA LVAGAAAQNA CTLTAENHPS LTWSKCTSGG
SCTSVQGSIT IDANWRWTHR TDSATNCYEG NKWDTSYCSD GPSCASKCCI DGADYSSTYG
ITTSGNSLNL KFVTKGQYST NIGSRTYLME SDTKYQMFQL LGNEFTFDVD VSNLGCGLNG
ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DSQCPRDLKF INGEANVENW QSSTNDANAG
TGKYGSCCSE MDVWEANNMA AAFTPHPCXV IGQSRCEGDS CGGTYSTDRY AGICDPDGCD
FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYVQNGK VIPNSESTIP
GVEGNSITQD WCDRQKAAFG DVTDXQDKGG MVQMGKALAG
PMVLVMSIWD DHAVNMLWLD STWPIDGAGK PGAERGACPT TSGVPAEVEA EAPNSNVIFS
NIRFGPIGST VSGLPDGGSG NPNPPVSSST PVPSSSTTSS GSSGPTGGTG VAKHYEQCGG
IGFTGPTQCE SPYTCTKLND WYSQCL MYAKFATLAA LVAGASAQAV CSLTAETHPS
LTWQKCTAPG SCTNVAGSIT IDANWRWTHQ TSSATNCYSG SKWDSSICTT GTDCASKCCI
DGAEYSSTYG ITTSGNALNL KFVTKGQYST NIGSRTYLME SDTKYQMFKL LGNEFTFDVD
VSNLGCGLNG ALYFVSMDAD GGMSKYSGNK AGAKYGTGYC DAQCPRDLKF INGEANVEGW
ESSTNDANAG SGKYGSCCTE MDVWEANNMA TAFTPHPCTT IGQTRCEGDT CGGTYSSDRY
AGVCDPDGCD FNSYRQGNKT FYGKGMTVDT TKKITVVTQF LKNSAGELSE IKRFYAQDGK
VIPNSESTIA GIPGNSITKA YCDAQKTVFQ NTDDFTAKGG LVQMGKALAG DMVLVMSVWD
DHAVNMLWLD STYPTDQVGV AGAERGACPT TSGVPSDVEA NAPNSNVIFS NIRFGPIGST
VQGLPSSGGT SSSSSAAPQS TSTKASTTTS AVRTTSTATT KTTSSAPAQG TNTAKHWQQC
GGNGWTGPTV CESPYKCTKQ NDWYSQCL MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW
QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSDTCSQKCY IEGADYSGTY
GIQSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN
GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS
GNGKLGTCCS EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC
DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD
TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST
YPTDSSDSTA QRGPCPTSSG VPKDVESQHG DATVVFSDIK FGAINSTFKY N MLAAALFTFA
CSVGVGTKTP ENHPKLNWQN CASKGSCSQV SGEVTMDSNW RWTHDGNGKN CYDGNTWISS
LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTLKFVTHG SYSTNIGSRL YLLKDKSTYY
VFKLNNKEFT FSVDVSKLPC GLNGALYFVE MDADGGKAKY AGAKPGAEYG LGYCDAQCPS
DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSQATALTPH VCKTTGQQRC
SGKSECGGQD GQDRFAGLCD EDGCDFNNWR MGDKTFFGPG LIVDTKSPFV VVTQFYGSPV
TEIRRKYVQN GKVIENSKSN IPGIDATAAI SDHFCEQQKK AFGDTNDFKN KGGFAKLGQV
FDRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSQPGVDRG PCPTSSGKPD DVESASADAT
VVYGNIKFGA LDSTY MLTLVYFLLS LVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS
IVLDSNWRWV HDSGTTNCYD GNLWSKDLCP SSNTCSQKCY IEGADYSGTY GIQSSGSKLT
LKFVTKGSYS TNIGSRVYLL KDENTYESFK LKNKEFTFTV DDSKLNCGLN GALYFVAMDA
DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS
EMDIWEGNMK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ
SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI NNSKTSNLAD TYDSITDKFC
DATKEASGDT NDFKAKGAMS GFSTNLNNGQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA
SRGPCAVSSG VPKDVESQYG DATVIYSDIK FGAINSTFKW N MILALLSLAK SLGIATNQAE
THPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC
DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDANQ
RYNGICDKDG CDFNSYRLGD KTFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY
VQGGKVIENS KVNIAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK
FGPIDSTY MLVIALILRG LSVGTGTQQS ETHPSLSWQQ TSKGGSGQSV SGSVVLDSNW
RWTHTTDGTT NCYDGNEWSS DLCPDASTCS SNCVLEGADY SGTYGITGSG SSLKLGFVTK
GSYSTNIGSR VYLLGDESHY KLFKLENNEF TFTVDDSNLE CGLNGALYFV AMDEDGGASK
YSGAKPGAKY GMGYCDAQCP HDMKFINGDA NVEGWKPSDN DENAGTGKWG ACCTEMDIWE
ANKYATAYTP HICTKNGEYR CEGTDCGDTK DNNRYGGVCD KDGCDFNSWR MGNQSFWGPG
LIIDTGKPVT VVTQFLADGG SLSEIRRKYV QGGKVIENTV TKISGMDEFD SITDEFCNQQ
KKAFRDTNDF EKKGGLKGLG TAVDAGVVLV LSLWDDHDVN MLWLDSIYPT DSGSKAGADR
GPCATSSGVP KDVESNYASA SVTFSDIKFG PIDSTY MLLALFAFGK SLGIATNQAE
NHPKLTWTRY QSKGSGQTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC
DLDGADYPGT YGISSSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGIRRCEG TECGDTDANQ
RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY
VQGGKVIENS KVNIAGMAAG NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDSGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG APSDVESQSP DATVTFSDIK
FGPIDSTY MLASVVYLVS LVVSLEIGTQ QSEEHPKLTW QNGSSSVSGS IVLDSNWRWL
HDSGTTNCYD GNLWSDDLCP NADTCSSKCY IEGADYSGTY GITSSGSKVT LKFVTKGSYS
TNIGSRIYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKAKYSSF
KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS GDGKLGTCCS EMDIWEGNAK
SQAYTVHACS KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD
TKSPVTVVTQ FIGDPLTEIR RVYVQGGKTI NNSKTSNLAD TYDSITDKFC DATKDATGDT
NDFKAKGAMA GFSTNLNTAQ VLVSVHCGMI IQPICCGLIR RIQRIQQKQV QAVDRVLCRR
VFQRMLKASM VMLQSRTRTL SLELSTRPLV GISPAGRLFF F MILALLVLGK SLGIATNQAE
THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC
DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ
RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY
VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK
FGPIDSTY MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT
KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST
NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED GGTKRYPDNE
AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC
SAVTPHVCDN LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT
KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC TAQKKAFGDT
DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG
VPADVESKSA DANVIYSDIR FGAIDSTYK MLGALVALAS CIGVGTNTPE KHPDLKWTNG
GSSVSGSIVV DSNWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVLE GADYSGTYGV
TTSGDAATLK FVTHGQYSTN VGSRLYLLKD EKTYQMFNLV GKEFTFTVDV SNLPCGLNGA
LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN
GKYGSCCSEM DIWEANSMAT AYTPHVCDKL EQTRCSGSAC GQNGGGDRFS SSCDPDGCDF
NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGGSVTEIK RKYVQGGKVI DNSMTNIAAM
SKQYNSVSDE FCQAQKKAFG DNDSFTKHGG FRQLGATLSK GHVLVLSLWD DHDVNMLWLD
SVYPTNSNKP GADRGPCKTS SGVPSDVESQ NADSTVKYSD IRFGAIDSTY SK
MLAAALFTFA CSVGVGTKTT ETHPKLNWQQ CACKGSCSQV SGEVTMDSNW RWTHDGNGKN
CYDGNTWISS LCPDDKTCSD KCVLDGAEYQ ATYGIQSNGT ALTPKFVTHG SYSTNIGSRL
YLLKDKSTYY VFQLNNKEFT FSVDVSKLPC GLNGALYFVE MDADGGKSKY AGAKPGAEYG
LGYCDAQCPS DLKFINGEAN SEGWKPQSGD KNAGNGKYGS CCSEMDVWES NSMATALTPH
VCKTTGQTRC SGKSECGGQD GQDRFAGNCD EDGCDFNNWR MGDKTFFGPG LTVDTKSPFV
VVTQFYGSPV TEIRRKYVQN GKVIENAKSN IPGIDATNAI SDTFCEQQKK AFGDTNDFKN
KGGFTKLGSV FSRGMVLVLS LWDDHQVAML WLDSTYPTNK DKSVPGVDRG PCPTSSGKPD
DVESASGDAT VVYGNIKFGA LDSTY MFGFLLSLFA LQFALEIGTQ TSESHPSITW
ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP DPEKCSQNCY LEGADYSGTY
GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYPMFK LKNKEFTFTV DVSNLPCGLN
GALYFVAMPS DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA
GTGRYGTCCT EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF
NSWRWGNKTF FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS
VDSITNTFCD ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY
PTDSTAIGAS RGPCATSSGD PKDVESASAN ASVKFSDIKF GALDSTY MLASLLPLSN
SLGTASNQAE THPKLTWTQY TGKGAGQTVN GEIVLDSNWR WTHKDGTNCY DGNTWSSSLC
PDPTTCSNNC NLDGADYPGT YGITTSGNQL KLGFVTHGSY STNIGSRVYL LRDSKNYQMF
KLKNKEFTFT VDDSKLPCGL NGAVYFVAMD EDGGTAKHSI NKAGAQYGTG YCDAQCPHDM
KFINGEANVL DWKPQSNDEN SGNGRWGARC TEMDIWEANS RATAYTPHIC TKTGLYRCEG
TECGDSDTNR YGGVCDKDGC DFNSYRMGDK SFFGQGKTVD SSKPVTVVTQ FITDNNQDSG
KLTEIRRKYV QGGKVIDNSK VNIAGITAGN PITDTFCDEA KKAFGDNNDF EKKGGLSALG
TQLEAGFVLV LSLWDDHSVN MLWLDSTYPT NASPGALGVE RGDCAITSGV PADVESQSAD
ASVTFSDIKF GPIDSTY MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT
VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL
KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD VSNLPCGLSG ALYHVNMDED
GGTKRYPDNE AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE
MDIWEANSIC SAVTPHVCDN LQQTRCQGAA CGENGGGSRF GSSCDPDGCD FNSWGMGNKT
FYGPGLIVDT KSKFTVVTQF VGNPVTEIKR KYVQNGKVIE NSYSNIEGMD KFNSVSDKFC
TAQKKAFGDT DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS
DRGPCPTTSG VPADVESKSA DANVIYSDIR FGAIDSTYK MILALLVLGK SLGIATNQAE
THPKLTWTRY QSKGSGSTVN GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC
DLDGADYPGT YGISTSGNSL KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT
VDDSKLPCGL NGALYFVAMD EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL
DWKPQSNDEN SGNGRYGACC TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ
RYNGICDKDG CDFNSYRLGD KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GILSETRRKY
VQGGKVIENS KVNVAGITAG NSVTDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDAGMVL
VLSLWDDHSV NMLWLDSTYP TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK
FGPIDSTY MIGIVLIQTV FGIGVGTQQS ESHPSLSWQQ CSKGGSCTSV SGSIVLDSNW
RWTHIPDGTT NCYDGNEWSS DLCPDPTTCS NNCVLEGADY SGTYGISTSG SSAKLGFVTK
GSYSTNIGSR VYLLGDESHY KIFDLKNKEF TFTVDDSNLE CGLNGALYFV AMDEDGGASR
FTLAKPGAKY GTGYCDAQCP HDIKFINGEA NVQDWKPSDN DDNAGTGHYG ACCTEMDIWE
ANKYATAYTP HICTENGEYR CEGKSCGDSS DDRYGGVCDK DGCDFNSWRL GNQSFWGPGL
IIDTGKPVTV VTQFVTKDGT DSGALSEIRR KYVQGGKTIE NTVVKISGID EVDSITDEFC
NQQKQAFGDT NDFEKKGGLS GLGKAFDYGV VLVLSLWDDH DVNMLWLDSV YPTNPAGKAG
ADRGPCATSS GDPKEVEDKY ASASVTFSDI KFGPIDSTY MLVFGIVSFV YSIGVGTNTA
ETHPKLTWKN GGSTTNGEVT VDSNWRWTHT KGSTKNCYDG NLWSKDLCPD AATCGKNCVL
EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQMFNL NGKEFTFTVD
VSQLPCGLNG ALYFVCMDQD GGMSRYPDNQ AGAKYGTGYC DAQCPTDLKF INGLPNSDGW
KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA CGENGGGDRF
GSICDPDGCD FNSWRMGNKT FWGPGLIIDT KKPVTVVTQF IGSPVTEIKR EYVQGGKVIE
NSYTNIEGMD KFNSISDKFC TAQKKAFGDN DSFTKHGGFS KLGQSFTKGQ VLVLSLWDDH
TVNMLWLDSV YPTNSKKLGS DRGPCPTSSG VPADVESKNA DSSVKYSDIR FGSIDSTYK
MLSFVFLLGF GVSLEIGTQQ SENHPTLSWQ QCTSSGSCTS QSGSIVLDSN WRWVHDSGTT
NCYDGNEWSS DLCPDPETCS KNCYLDGADY SGTYGITSNG SSLKLGFVTE GSYSTNIGSR
VYLKKDTNTY QIFKLKNHEF TFTVDVSNLP CGLNGALYFV EMEADGGKGK YPLAKPGAQY
GMGYCDAQCP HDMKFINGNA NVLDWKPQET DENSGNGRYG TCCTEMDIWE ANSQATAYTP
HICTKDGQYQ CEGTECGDSD ANQRYNGVCD KDGCDFNSYR LGNKTFFGPG LIVDSKKPVT
VVTQFITSNG QDSGDLTEIR RIYVQGGKTI QNSFTNIAGL TSVDSITEAF CDESKDLFGD
TNDFKAKGGF TAMGKSLDTG VVLVLSLWDD HSVNMLWLDS TYPTDAAAGA LGTQRGPCAT
SSGAPSDVES QSPDASVTFS DIKFGPLDST Y MLTLVVYLLS LVVSLEIGTQ QSESHPALTW
QREGSSASGS IVLDSNWRWV HDSGTTNCYD GNEWSTDLCP SSDTCTQKCY IEGADYSGTY
GITTSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN
GALYFVAMDA DGGKQKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVED WKPQDNDENS
GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGT DCGDSDSRYQ GTCDKDGCDY
ASYRWGDHSF YGEGKTVDTK QPITVVTQFI GDPLTEIRRL YIQGGKVINN SKTQNLASVY
DSITDAFCDA TKAASGDTND FKAKGAMAGF SKNLDTPQVL VLSLWDDHTA NMLWLDSTYP
TDSRDATAER GPCATSSGVP KDVESNQADA SVVFSDIKFG AINSTYSYN MFGFLLSLFA
LQFALEIGTQ TSESHPSITW ELNGARQSGQ IVIDSNWRWL HDSGTTNCYD GNTWSSDLCP
DPEKCSQNCY LEGADYSGTY GISASGSQLT LGFVTKGSYS TNIGSRVYLL KDENTYQMFK
LKNKEFTFTV DVSNLPCGLN GALYFVAMPS
DGGKAKYPLA KPGAKYGMGY CDAQCPHDMK FINGEANVLD WKPQSNDENA GTGRYGTCCT
EMDIWEANSQ ATAYTVHACS KNARCEGTEC GDDSASQRYN GICDKDGCDF NSWRWGNKTF
FGPGLTVDSS KPVTVVTQFI GDPLTEIRRI WVQGGKVIQN SFTNVSGITS VDSITNTFCD
ESKVATGDTN DFKAKGGMSG FSKALDTEVV LVLSLWDDHT ANMLWLDSTY PSNSTAIGAT
RGPCATSSGD PKNVESASAN ASVKFSDIKF GAFDSTY MLALVYFLLS LVVSLEIGTQ
QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY
IEGADYSGTY GITSSGSKVT LKFVTKGSYS TNIGSRIYLL KDENTYETFK LKNKEFTFTV
DDSQLNCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD
WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR
FKGTCDKDGC DYASWRWGDQ SFYGEGKTVD TKQPVTVVTQ FIGDPLTEIR RLYVQGGKTI
NNSKTSNLAD TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVLSLWDDH
TANMLWLDST YPTDSTKTGA SRGPCAVTSG VPKDVESQYG SAQVVYSDIK FGAINSTY
MLALVYFLLS FVVSLEIGTQ QSEDHPKLTW QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD
GNLWSTDLCG SSDTCSSKCY IEGADYSGTY GISASGSKLT LKFVTKGSYS TNIGSRVYLL
KDENTYETFK LKGKEFTFTV DDSKLDCGLN GALYFVAMDA DGGKAKYSSF KPGAKYGMGY
CDAQCPHDMK FISGKANVDD WKPQDNDENS GNGKLGTCCS EMDIWEGNAK SQAYTVHACT
KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC DYASWRWGDQ SFYGEGKTID TKQPVTVVTQ
FIGDPLTEIR RVYVQGGKVI NNSKTSNLAN VYDSITDKFC DDTKDATGDT NDFKAKGAMS
GFSTNLNTAQ VLVMSLWDDH TANMLWLDST YPTDSTKTGA SRGPCAVLSG VPKNVESQHG
DATVIYSDIK FGAINSTFSY N MFLALFVLGK SLGIATNQAE NHPKLTWTRY QSKGSGQTVN
GEVVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPQTCSSNC DLDGADYPGT YGISSSGNSL
KLGFVTHGSY STNIGSRVYL LRDSKNYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAME
EDGGVAKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC
IEMDIWEANS MATAYTPHVC TVTGIHRCEG TECGDTDANQ RYNGICDKDG CDFNSYRMGD
KSFFGVGKTV DSSKPVTVVT QFVTSNGQDG GTLSEIKRKY VQGGKVIENS KVNIAGITAV
NSITDTFCNE QKKAFGDNND FEKKGGLGAL SKQLDLGMVL VLSLWDDHSV NMLWLDSTYP
TDAAAGALGT ERGACATSSG KPSDVESQSP DASVTFSDIK FGPIDSTY MLLCLLSIAN
SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA
ATCGKNCVIE GADYQGTYGV SSSGDGLTLT FVTHGQYSTN VGSRLYLMKD EKTYQMFNLN
GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI
NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM DIWEANSQAT AYTPHVCDKL EQTRCSGSSC
GHTGGGERFS SSCDPDGCDF NSWRMGNKTF WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK
RKYVQGGKVI DNSMSNIAGM SKQYNSVSDD FCQAQKKAFG DNDSFTKHGG FRQLGATLGK
GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP GSDRGPCKTS SGIPADVESQ AASSSVKYSD
IRFGAIDSTY K MLCIGLISFV YSLGVGTNTA ETHPKLTWKN GGQTVNGEVT VDSNWRWTHT
KGSTKNCYDG NLWSKDLCPD AATCGKNCVL EGADYSGTYG VTSSGNALTL KFVTHGSYST
NVGSRLYLMK DEKTYQMFNL NGKEFTFTVD VSNLPCGLNG ALYHVNMDED GGTKRYPDNE
AGAKYGTGYC DAQCPTDLKF INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSIC
SAVTPHVCDT LQQTRCQGTA CGENGGGSRF GSSCDPDGCD FNSWRMGNKT FYGPGLIVDT
KSKFTVVTQF VGSPVTEIKR KYVQNGKVIE NSFSNIEGMD KFNSISDKFC TAQKKAFGDT
DSFTKHGGFK QLGSALAKGM VLVLSLWDDH TVNMLWLDSV YPTNSKKAGS DRGPCPTTSG
VPADVESKSA NANVIYSDIR FGAIDSTYK MLLCLLGIAS SLDAGTNTAE NHPQLSWKNG
GSSVSGSVTV DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGQNCVIE GADYQGTYGV
SASGNALTLT FVTHGQYSTN VGSRLYLLKD EKTYQIFNLI GKEFTFTVDV SNLPCGLNGA
LYFVQMDADG GTAKYSDNKA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN
GRYGSCCSEM DVWEANSLAT AYTPHVCDKL EQVRCDGRAC GQNGGGDRFS SSCDPDGCDF
NSWRLGNKTF WGPGLIVDTK QPVQVVTQWV GSGTSVTEIK RKYVQGGKVI DNSFTKLDSL
TKQYNSVSDE FCVAQKKAFG DNDSFTKHGG FRQLGATLAK GHVLVLSLWD DHDVNMLWLD
SVYPTNSNKP GADRGPCKTS SGVPADVESQ AASSSVKYSD IRFGAIDSTY K MLGIGFVCIV
YSLGVGTNTA ENHPKLTWKN SGSTTNGEVT VDSNWRWTHT KGTTKNCYDG NLWSKDLCPD
AATCGKNCVL EGADYSGTYG VTSSGDALTL KFVTHGSYST NVGSRLYLLK DEKTYQIFNL
NGKEFTFTVD VSNLPCGLNG ALYFVNMDAD GGTGRYPDNQ AGAKYGTGYC DAQCPTDLKF
INGIPNSDGW KPQSNDKNSG NGKYGSCCSE MDIWEANSLA TAVTPHVCDQ VGQTRCEGRA
CGENGGGDRF GSSCDPDGCD FNSWRLGNKT FWGPGLIVDT KKPVTVVTQF VGSPVTEIKR
KYVQGGKVIE NSYTNIEGLD KFNSISDKFC TAQKKAFGDN DSFIKHGGFR QLGQSFTKGQ
VLVLSLWDDH TVNMLWLDSV YPTNSKKPG DRGPCPTSSG VPADVESKNA GSSVKYSDIR
FGSIDSTYK MATLVGILVS LFALEVALEI GTQTSESHPS LSWELNGQRQ TGSIVIDSNW
RWLHDSGTTN CYDGNEWSSD LCPDPEKCSQ NCYLEGADYS GTYGISSSGN SLQLGFVTKG
SYSTNIGSRV YLLKDENTYA TFKLKNKEFT FTADVSNLPC GLNGALYFVA MPADGGKSKY
PLAKPGAKYG MGYCDAQCPH DMKFINGEAN ILDWKPSSND ENAGAGRYGT CCTEMDIWEA
NSQATAYTVH ACSKNARCEG TECGDDDGRY NGICDKDGCD FNSWRWGNKT FFGPNLIVDS
SKPVTVVTQF IGDPLTEIRR IYVQGGKVIQ NSFTNISGVA SVDSITDAFC NENKVATGDT
NDFKAKGGMS GFSKALDTEV VLVLSLWDDH TANMLWLDST YPTDSSALGA SRGPCAITSG
EPKDVESASA NASVKFSDIK FGAIDSTY MLTLVYFLLS LVVSLEIGTQ QSESHPQLSW
QNGSSSVSGS IVLDSNWRWV HDSGTTNCYD GNLWSTDLCP SSDTCTSKCY IEGADYSGTY
GITSSGSKLT LKFVTKGSYS TNIGSRVYLL KDENTYETFK LKNKEFTFTV DDSKLDCGLN
GALYFVAMDA DGGKAKYSSF KPGAKYGMGY CDAQCPHDMK FISGKANVDD WKPQDNDENS
GNGKLGTCCS EMDIWEGNAK SQAYTVHACT KSGQYECTGQ QCGDTDSGDR FKGTCDKDGC
DYASWRWGDQ SFYGEGKTVD TKQPLTVVTQ FVGDPLTEIR RVYVQGGKTI NNSKTSNLAD
TYDSITDKFC DATKEASGDT NDFKAKGAMS GFSTNLNTAQ VLVMSLWDDH TANMLWLDST
YPTDSTKTGA SRGPCAVSSG VPKDVESQHG DATVIYSDIK FGAINSTFKW N MLSLVSIFLV
GLGFSLGVGT QQSESHPSLS WQNCSAKGSC QSVSGSIVLD SNWRWLHDSG TTNCYDGNEW
STDLCPDAST CDKNCYIEGA DYSGTYGITS SGAQLKLGFV TKGSYSTNIG SRVYLLRDES
HYQLFKLKNH EFTFTVDDSQ LPCGLNGALY FVEMAEDGGA KPGAQYGMGY CDAQCPHDMK
FITGEANVKD WKPQETDENA GNGHYGACCT EMDIWEANSQ ATAYTPHICS KTGIYRCEGT
ECGDNDANQR YNGVCDKDGC DFNSYRLGNK TFWGPGLTVD SNKAMIVVTQ FTTSNNQDSG
ELSEIRRIYV QGGKTIQNSD TNVQGITTTN KITQAFCDET KVTFGDTNDF KAKGGFSGLS
KSLESGAVLV LSLWDDHSVN MLWLDSTYPT DSAGKPGADR GPCAITSGDP KDVESQSPNA
SVTFSDIKFG PIDSTY MILALLVLGK SLGIATNQAE THPKLTWTRY QSKGSGSTVN
GEIVLDSNWR WTHHSGTNCY DGNTWSTSLC PDPTTCSNNC DLDGADYPGT YGISTSGNSL
KLGFVTHGSY STNIGSRVYL LKDTKSYEMF KLKNKEFTFT VDDSKLPCGL NGALYFVAMD
EDGGVSKNSI NKAGAQYGTG YCDAQCPHDM KFINGEANVL DWKPQSNDEN SGNGRYGACC
TEMDIWEANS MATAYTPHVC TVTGLRRCEG TECGDTDNDQ RYNGICDKDG CDFNSYRLGD
KSFFGVGKTV DSSKPVTVVT QFVTSNGQDS GTLSEIRRKY VQGGKVIENS KVNVAGITAG
NSVTDTFCNE QKKAFGDNND FEKKGGFGAL SKQLVAGMVL VLSLWDDHSV NMLWLDSTYP
TNAAAGALGT ERGACATSSG KPSDVESQSP DATVTFSDIK FGPIDSTY MLCVGLFGLV
YSIGVGTNTQ ETHPKLSWKQ CSSGGSCTTQ QGSVVIDSNW RWTHSTKDLT NCYDGNLWDS
TLCPDGTTCS KNCVLEGADY SGTYGITSSG DSLTLKFVTH GSYSTNVGSR LYLLKDDNNY
QIFNLAGKEF TFTVDVSNLP CGLNGALYFV EMDQDGGKGK HKENEAGAKY GTGYCDAQCP
TDLKFIDGIA NSDGWKPQDN DENSGNGKYG SCCSEMDIWE ANSLATAYTP HVCDTKGQKR
CQGTACGENG GGDRFGSECD PDGCDFNSWR QGNKSFWGPG LIIDTKKSVQ VVTQFIGSGS
SVTEIRRKYV QNGKVIENSY STISGTEKYN SISDDYCNAQ KKAFGDTNSF ENHGGFKRFS
QHIQDMVLVL SLWDDHTVNM LWLDSVYPTN SNKPGADRGP CETSSGVPAD VESKSASASV
KYSDIRFGPI DSTYK MLLCLWSIAY SLGVGTNTAE NHPKLSWKNG GSSVSGSVTV
DANWRWTHIK GETKNCYDGN LWSDKYCPDA ATCGKNCVIE GADYQGTYGV SASGDGLTLT
FVTHGQYSTN VGSRLYLMKD EKTYQIFNLN GKEFTFTVDV SNLPCGLNGA LYFVQMDSDG
GMAKYPDNQA GAKYGTGYCD AQCPTDLKFI NGIPNSDGWK PQKNDKNSGN GKYGSCCSEM
DIWEANSQAT AYTPHVCDKL EQTRCSGSAC GHTGGGERFS SSCDPDGCDF NSWRMGNKTF
WGPGLIVDTK KPVQVVTQFV GSGNSCTEIK RKYVQGGKVI DNSMSNIAGM TKQYNSVSDD
FCQAQKKAFG DNDSFTKHGG FRQLGATLGK GHVLVLSLWD DHDVNMLWLD SVYPTNSNKP
GSDRGPCKTS SGIPADVESQ AASSSVKYSD IRFGAIDSTY K
SEQ ID NO: 299
QSACTLQSET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC YDGNTWSSTL
CPDNETCAKN CCLDGAAYAS TYGVTTSGNS LSIGFVTQSA QKNVGARLYL MASDTTYQEF
TLLGNEFSFD VDVSQLPCGL NGALYFVSMD ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL
KFINGQANVE GWEPSSNNAN TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICEG
DGCGGTYSDN AYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI
NRYYVQNGVT FQQPNAELGS YSGNELNDDY CTAEEAEFGG SSFSDKGGLT QFKKATSGGM
VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS SGVPAQVESQ SPNAKVTFSN
IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR PATTTGSSPG PTQSHYGQCG GIGYSGPTVC
ASGTTCQVLN PYYSQCL
SEQ ID NO: 300
QSACTLQSET HPPLTWQKCS SGGTCTQQTG
SVVIDANWRW THATNSSTNC YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSGNS
LSIGFVTQSA QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD
ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN TGIGGHGSCC
SEMDIWEANS ISEALTPHPC TTVGQEICEG DGCGGTYSDN RYGGTCDPDG CDWNPYRLGN
TSFYGPGSSF TLDTTKKLTV VTQFETSGAI NRYYVQNGVT FQQPNAELGS YSGNELNDDY
CTAEEAEFGG SSFSDKGGLT QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP
GAVAGSCSTS SGVPAQVESQ SPNAKVTFSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR
PATTTGSSPG PTQSHYGQCG GIGYSGPTVC ASGTTCQVLN PYYSQCL
SEQ ID NO: 301
MSALNSFNMY KSALILGSLL ATAGAQQIGT YTAETHPSLS WSTCKSGGSC TTNSGAITLD
ANWRWVHGVN TSTNCYTGNT WNTAICDTDA SCAQDCALDG ADYSGTYGIT TSGNSLRLNF
VTGSNVGSRT YLMADNTHYQ IFDLLNQEFT FTVDVSHLPC GLNGALYFVT MDADGGVSKY
PNNKAGAQYG VGYCDSQCPR DLKFIAGQAN VEGWTPSSNN ANTGLGNHGA CCAELDIWEA
NSISEALTPH PCDTPGLSVC TTDACGGTYS SDKYAGTCDP DGCDFNPYRL GVTDFYGSGK
TVDTTKPITV VTQFVTDDGT STGTLSEIRR YYVQNGVVIP QPSSKISGVS GNVINSDFCD
AEISTFGETA SFSKHGGLAK MGAGMEAGMV LVMSLWDDYS VNMLWLDSTY PTNATGTPGA
AKGSCPTTSG DPKTVESQSG SSYVTFSDIR VGPFNSTFSG GSSTGGSSTT TASGTTTTKA
SSTSTSSTST GTGVAAHWGQ CGGQGWTGPT TCASGTTCTV VNPYYSQCL
SEQ ID NO: 302
QQIGTYTAET HPSLSWSTCK SGGSCTTNSG AITLDANWRW VHGVNTSTNC
YTGNTWNTAI CDTDASCAQD CALDGADYSG TYGITTSGNS LRLNFVTGSN VGSRTYLMAD
NTHYQIFDLL NQEFTFTVDV SHLPCGLNGA LYFVTMDADG GVSKYPNNKA GAQYGVGYCD
SQCPRDLKFI AGQANVEGWT PSSNNANTGL GNHGACCAEL DIWEANSISE ALTPHPCDTP
GLSVCTTDAC GGTYSSDKYA GTCDPDGCDF NPYRLGVTDF YGSGKTVDTT KPITVVTQFV
TDDGTSTGTL SEIRRYYVQN GVVIPQPSSK ISGVSGNVIN SDFCDAEIST FGETASFSKH
GGLAKMGAGM EAGMVLVMSL WDDYSVNMLW LDSTYPTNAT GTPGAAKGSC PTTSGDPKTV
ESQSGSSYVT FSDIRVGPFN STFSGGSSTG GSSTTTASGT TTTKASSTST SSTSTGTGVA
AHWGQCGGQG WTGPTTCASG TTCTVVNPYY SQCL
SEQUENCE LISTING
The patent application contains a lengthy "Sequence Listing" section.
A copy of the "Sequence Listing" is available in electronic form from
the USPTO web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20150329880A1).
An electronic copy of the "Sequence Listing" will also be available
from the USPTO upon request and payment of the fee set forth in 37
CFR 1.19(b)(3).
* * * * *
References