U.S. patent application number 15/205577 was filed with the patent office on 2016-07-08 and published on 2017-01-05 for fusion enzymes.
This patent application is currently assigned to Glykos Finland Oy. The applicant listed for this patent is Glykos Finland Oy. Invention is credited to Jukka Hiltunen, Anne Huuskonen, Anne Kanerva, Jari Natunen, Markku Saloheimo, Heli Viskari.
Application Number | 20170002337 15/205577 |
Family ID | 45093734 |
Filed Date | 2016-07-08 |
United States Patent Application | 20170002337 |
Kind Code | A1 |
Natunen; Jari; et al. | January 5, 2017 |
Fusion Enzymes
Abstract
The present disclosure relates to recombinant proteins having
N-acetylglucosaminyltransferase activity. The present disclosure
further relates to methods for producing complex N-glycans
including the steps of providing host cells containing such
recombinant proteins and culturing the host cells such that the
recombinant proteins are expressed.
Inventors: | Natunen; Jari (Vantaa, FI); Kanerva; Anne (Helsinki, FI); Hiltunen; Jukka (Helsinki, FI); Saloheimo; Markku (Helsinki, FI); Viskari; Heli (Nummela, FI); Huuskonen; Anne (Helsinki, FI) |
Applicant: |
Name | City | State | Country | Type |
Glykos Finland Oy | Helsinki | | FI | |
Assignee: | Glykos Finland Oy, Helsinki, FI |
Family ID: | 45093734 |
Appl. No.: | 15/205577 |
Filed: | July 8, 2016 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13989084 | Aug 13, 2013 | 9399764 |
PCT/EP2011/070956 | Nov 24, 2011 | |
15205577 | | |
61417144 | Nov 24, 2010 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12Y 204/01101 20130101; C12Y 204/01143 20130101; C12P 21/005 20130101; C07K 2319/03 20130101; Y02P 20/52 20151101; C12N 9/1051 20130101; C12N 15/80 20130101; C12P 19/18 20130101 |
International Class: | C12N 9/10 20060101 C12N009/10; C12N 15/80 20060101 C12N015/80 |
Claims
1-69. (canceled)
70. A fungal host cell comprising an expression vector comprising a
polynucleotide encoding a fusion protein comprising an
N-acetylglucosaminyltransferase I catalytic domain and an
N-acetylglucosaminyltransferase II catalytic domain, wherein the
N-acetylglucosaminyltransferase II catalytic domain is positioned
N-terminal to the N-acetylglucosaminyltransferase I catalytic
domain, and wherein the fusion protein catalyzes the transfer of
N-acetylglucosamine to a terminal Manα3 residue and
N-acetylglucosamine to a terminal Manα6 residue of an
acceptor glycan, wherein the acceptor glycan is attached to a
heterologous polypeptide.
71. The fungal host cell of claim 70, wherein the host cell is
selected from the group consisting of Trichoderma, Aspergillus,
Fusarium, Chrysosporium, Magnaporthe, Myceliophthora, Neurospora, or
Penicillium.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Provisional
Application No. 61/417,144, filed Nov. 24, 2010, which is hereby
incorporated by reference in its entirety.
SUBMISSION OF SEQUENCE LISTING ON ASCII TEXT FILE
[0002] The content of the following submission on ASCII text file
is incorporated herein by reference in its entirety: a computer
readable form (CRF) of the Sequence Listing (file name:
619672001040SEQLIST.txt, date recorded: Nov. 22, 2011, size: 305
KB).
FIELD OF THE INVENTION
[0003] The present disclosure relates to compositions and methods
useful for the production of N-glycans.
BACKGROUND
[0004] Posttranslational modification of proteins is often
necessary for proper protein folding and function. A common protein
modification is the addition of oligosaccharides (glycans) to
nascent polypeptides in the endoplasmic reticulum to form
glycoproteins, a process known as glycosylation. N-glycosylation is
of particular importance in the production of recombinant proteins
used for therapeutic purposes. Because standard prokaryotic
expression systems lack the proper machinery necessary for such
modifications, alternative expression systems have to be used in
production of these therapeutic proteins. Yeast and fungi are
attractive options for expressing proteins as they can be easily
grown at a large scale in simple media, which allows low production
costs. Moreover, tools are available to manipulate the relatively
simple genetic makeup of yeast and fungal cells as well as more
complex eukaryotic cells such as mammalian or insect cells (De
Pourcq et al., Appl Microbiol Biotechnol, 87(5):1617-31).
[0005] Fungal cells and mammalian cells share common steps in the
early stages of glycosylation that result in the formation of
mannose(8)N-acetylglucosamine(2) (Man8GlcNAc2). However,
significant differences exist in the later stages of the
process.
[0006] For example, in yeast, additional mannose subunits are added
to Man8GlcNAc2 by mannosyltransferases and mannan polymerases to
yield high-mannose type N-glycans. In contrast, mannose sugars are
removed from the human Man8GlcNAc2 to yield Man5GlcNAc2, followed
by three sequential reactions involving the enzymes
N-acetylglucosaminyltransferase I (GnTI), mannosidase II (Mns II),
and N-acetylglucosaminyltransferase II (GnTII), to convert
Man5GlcNAc2 into GlcNAc2Man3GlcNAc2.
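As a rough illustration of the three-step conversion described above, the following Python sketch tracks glycan composition through GnTI, Mns II, and GnTII and computes a theoretical sodium-adduct m/z at each step. The per-step composition changes (each GnT adding one GlcNAc, Mns II removing two mannose residues) and the choice of [M+Na]+ ions are assumptions taken from general glycobiology rather than from this disclosure, and all function and variable names are hypothetical.

```python
# Monoisotopic masses in Da (standard values used in glycan mass spectrometry).
HEX = 162.052823        # hexose residue such as mannose, C6H10O5
HEXNAC = 203.079373     # N-acetylhexosamine residue such as GlcNAc, C8H13NO5
WATER = 18.010565       # one water for the free reducing end
SODIUM_ION = 22.989218  # Na+ adduct (assumed ion type)


def sodiated_mz(n_hex, n_hexnac):
    """Return the monoisotopic m/z of the [M+Na]+ ion of Hex(n)HexNAc(m)."""
    return n_hex * HEX + n_hexnac * HEXNAC + WATER + SODIUM_ION


# Each step: (enzyme, change in Hex count, change in HexNAc count).
# These deltas are assumptions, not values stated in the text.
PATHWAY = [
    ("GnTI", 0, +1),
    ("Mns II", -2, 0),
    ("GnTII", 0, +1),
]


def walk_pathway(start=(5, 2)):
    """Apply the three steps to Man5GlcNAc2 and list every intermediate."""
    n_hex, n_hexnac = start
    states = [("start", n_hex, n_hexnac, sodiated_mz(n_hex, n_hexnac))]
    for enzyme, d_hex, d_hexnac in PATHWAY:
        n_hex += d_hex
        n_hexnac += d_hexnac
        states.append((enzyme, n_hex, n_hexnac, sodiated_mz(n_hex, n_hexnac)))
    return states


if __name__ == "__main__":
    for enzyme, n_hex, n_hexnac, mz in walk_pathway():
        print(f"{enzyme:7s} Hex{n_hex}HexNAc{n_hexnac}  [M+Na]+ = {mz:.3f}")
```

Under these assumptions the walk starts at Hex5HexNAc2 (Man5GlcNAc2) and ends at Hex3HexNAc4, the composition of GlcNAc2Man3GlcNAc2.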
[0007] The differences between the glycosylation process in
mammalian and fungal cells pose a challenge to the expression of
glycosylated mammalian proteins in fungal cells since glycoproteins
with high-mannose type N-glycans are not suitable for therapeutic
use in humans (De Pourcq et al., 2010; Wildt and Gerngross, Nature
Reviews Microbiology, 3: 119-128). Consequently, studies have been
conducted to re-engineer the glycosylation pathways in yeast and
fungal species to enable them to express recombinant human
proteins. The general approach in glycoengineering of yeast or
fungal cells has been to disrupt endogenous genes that are involved
in formation of high-mannose type N-glycans. These gene disruptions
can be combined with over-expression of endogenous mannosidases
and/or glycosyltransferases and glycosidases from different species
(Chiba et al., 1998, J Biol Chem 273: 26298-304; Kainz et al.,
2008, Appl Environ Microbiol 74: 1076-86; Maras et al., 1997, Euro
J Biochem 249: 701-07; Maras et al., 1999, Febs Letters 452:
365-70; Hamilton et al., 2003, Science 301: 1244-6; De Pourcq et
al., 2010). However, the production of glycosylated mammalian
proteins in non-mammalian cells still requires complicated and
time-consuming genetic engineering and can be inefficient at
producing a desired glycoprotein.
[0008] Thus, a need remains in the art for a simpler and more
efficient system to express complex N-glycans in non-mammalian
cells.
SUMMARY
[0009] Described herein are compositions including recombinant
proteins having N-acetylglucosaminyltransferase activity. Further
described herein are methods of producing complex N-glycans and
methods of producing Man3GlcNAc2 glycans.
[0010] Thus one aspect includes recombinant proteins having
N-acetylglucosaminyltransferase activity, where the recombinant
proteins catalyze the transfer of N-acetylglucosamine to a terminal
Manα3 residue and catalyze the transfer of
N-acetylglucosamine to a terminal Manα6 residue of an
acceptor glycan, and where the recombinant protein contains
catalytic domains from at least two different enzymes. In certain
embodiments, the acceptor glycan is attached to a molecule selected
from an amino acid, a peptide, or a polypeptide. In certain
embodiments, the molecule is a heterologous polypeptide. In certain
embodiments that may be combined with the preceding embodiments,
the acceptor glycan is Man3. In certain embodiments that may be
combined with the preceding embodiments, the recombinant protein is
a fusion protein containing an N-acetylglucosaminyltransferase I
catalytic domain and an N-acetylglucosaminyltransferase II
catalytic domain. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain and the
N-acetylglucosaminyltransferase II catalytic domain are from human
enzymes. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain includes a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
105-445 of SEQ ID NO: 1. In certain embodiments that may be
combined with the previous embodiments, the
N-acetylglucosaminyltransferase II catalytic domain includes a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
30-447 of SEQ ID NO: 21. In certain embodiments that may be
combined with the preceding embodiments, the
N-acetylglucosaminyltransferase I catalytic domain is N-terminal to
the N-acetylglucosaminyltransferase II catalytic domain. In certain
embodiments that may be combined with the preceding embodiments,
the N-acetylglucosaminyltransferase II catalytic domain is
N-terminal to the N-acetylglucosaminyltransferase I catalytic
domain.
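The percent-identity thresholds listed above can be made concrete with a small Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and defines identity as identical residues divided by the number of columns in which neither sequence has a gap; the disclosure may use a different definition, and the function names and toy sequences are hypothetical.

```python
THRESHOLDS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def highest_threshold_met(identity):
    """Return the largest listed threshold that is satisfied, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Toy sequences for illustration only; not taken from any SEQ ID NO.
    identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
    print(identity, highest_threshold_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the counting step.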
[0011] In certain embodiments that may be combined with the
preceding embodiments, the recombinant proteins further contain a
spacer in between the N-acetylglucosaminyltransferase I catalytic
domain and the N-acetylglucosaminyltransferase II catalytic domain.
In certain embodiments, the spacer contains sequence from a stem
domain. In certain embodiments that may be combined with the
preceding embodiments, the spacer is at least 5, at least 10, at
least 15, at least 20, at least 30, at least 40, or at least 50
amino acids in length. In certain embodiments that may be combined
with the preceding embodiments, the spacer contains a sequence that
is selected from SEQ ID NO: 118, SEQ ID NO: 120, SEQ ID NO: 122,
and SEQ ID NO: 124. In certain embodiments, the spacer contains a
sequence that is selected from SEQ ID NO: 118, SEQ ID NO: 120, and
SEQ ID NO: 124. In certain embodiments, the spacer contains the
sequence of SEQ ID NO: 120 or SEQ ID NO: 124. In certain
embodiments, the spacer contains the sequence of SEQ ID NO:
124.
[0012] In certain embodiments that may be combined with the
preceding embodiments, the recombinant proteins further contain a
targeting peptide linked to the N-terminal end of the catalytic
domains. In certain embodiments, the targeting peptide contains a
stem domain. In certain embodiments, the stem domain is from an
N-acetylglucosaminyltransferase I enzyme or an
N-acetylglucosaminyltransferase II enzyme. In certain embodiments,
the N-acetylglucosaminyltransferase I enzyme and the
N-acetylglucosaminyltransferase II enzyme are human enzymes. In
certain embodiments that may be combined with the preceding
embodiments, the stem domain is from a protein selected from a
mannosidase, a mannosyltransferase, a glycosyltransferase, a Type 2
Golgi protein, MNN2, MNN4, MNN6, MNN9, MNN10, MNS1, KRE2, VAN1, or
OCH1. In certain embodiments, the protein is from an organism
selected from Acremonium, Aspergillus, Aureobasidium, Cryptococcus,
Chrysosporium, Chrysosporium lucknowense, Filibasidium, Fusarium,
Gibberella, Humicola, Magnaporthe, Mucor, Myceliophthora,
Myrothecium, Neocallimastix, Neurospora, Paecilomyces, Penicillium,
Piromyces, Schizophyllum, Talaromyces, Thermoascus, Thielavia,
Tolypocladium, or Trichoderma. In certain embodiments that may be
combined with the preceding embodiments, the targeting peptide is a
Kre2 targeting peptide. In certain embodiments, the targeting
peptide contains a transmembrane domain. In certain embodiments
that may be combined with the preceding embodiments, the targeting
peptide further contains a transmembrane domain linked to the
N-terminal end of the stem domain. In certain embodiments that may
be combined with the preceding embodiments, the transmembrane
domain is from an N-acetylglucosaminyltransferase I enzyme or an
N-acetylglucosaminyltransferase II enzyme. In certain embodiments,
the N-acetylglucosaminyltransferase I enzyme and the
N-acetylglucosaminyltransferase II enzyme are human enzymes. In
certain embodiments that may be combined with the preceding
embodiments, the transmembrane domain is from a protein selected
from a mannosidase, a mannosyltransferase, a glycosyltransferase, a
Type 2 Golgi protein, MNN2, MNN4, MNN6, MNN9, MNN10, MNS1, KRE2,
VAN1, or OCH1. In certain embodiments, the protein is from an
organism selected from Acremonium, Aspergillus, Aureobasidium,
Cryptococcus, Chrysosporium, Chrysosporium lucknowense,
Filibasidium, Fusarium, Gibberella, Humicola, Magnaporthe, Mucor,
Myceliophthora, Myrothecium, Neocallimastix, Neurospora,
Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces,
Thermoascus, Thielavia, Tolypocladium, or Trichoderma. In certain
embodiments, the targeting peptide contains a cytoplasmic domain.
In certain embodiments that may be combined with the preceding
embodiments, the targeting peptide further contains a cytoplasmic
domain linked to the N-terminal end of the stem domain. In certain
embodiments that may be combined with the preceding embodiments,
the targeting peptide further contains a cytoplasmic domain linked
to the N-terminal end of the transmembrane domain. In certain
embodiments that may be combined with the preceding embodiments,
the cytoplasmic domain is from an N-acetylglucosaminyltransferase I
enzyme or an N-acetylglucosaminyltransferase II enzyme. In certain
embodiments, the N-acetylglucosaminyltransferase I enzyme and the
N-acetylglucosaminyltransferase II enzyme are human enzymes. In
certain embodiments that may be combined with the preceding
embodiments, the cytoplasmic domain is from a protein selected from
a mannosidase, a mannosyltransferase, a glycosyltransferase, a Type
2 Golgi protein, MNN2, MNN4, MNN6, MNN9, MNN10, MNS1, KRE2, VAN1,
or OCH1. In certain embodiments, the protein is from an organism
selected from Acremonium, Aspergillus, Aureobasidium, Cryptococcus,
Chrysosporium, Chrysosporium lucknowense, Filibasidium, Fusarium,
Gibberella, Humicola, Magnaporthe, Mucor, Myceliophthora,
Myrothecium, Neocallimastix, Neurospora, Paecilomyces, Penicillium,
Piromyces, Schizophyllum, Talaromyces, Thermoascus, Thielavia,
Tolypocladium, or Trichoderma.
[0013] Another aspect includes recombinant proteins containing a
human N-acetylglucosaminyltransferase II catalytic domain and a
human N-acetylglucosaminyltransferase I catalytic domain where the
N-acetylglucosaminyltransferase II catalytic domain is located
N-terminal to the N-acetylglucosaminyltransferase I catalytic
domain, a spacer sequence containing sequence from a human
N-acetylglucosaminyltransferase I stem domain located in between
the catalytic domains, and a targeting peptide located N-terminal
to the N-acetylglucosaminyltransferase II catalytic domain where
the targeting peptide contains a cytoplasmic domain, a
transmembrane domain, and a stem domain from human
N-acetylglucosaminyltransferase II. Another aspect includes a
recombinant protein containing a sequence that is at least 70%, at
least 75%, at least 80%, at least 85%, at least 90%, at least 95%,
at least 96%, at least 97%, at least 98%, at least 99%, or 100%
identical to SEQ ID NO: 95.
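The domain order described in paragraph [0013] can be sketched as a simple data structure, listed from the N-terminus to the C-terminus. This is only an illustrative model: the class, the function, and the placeholder residue strings are hypothetical and do not correspond to any sequence in the Sequence Listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    source: str
    sequence: str  # placeholder residues, NOT real sequence data


# N-terminal to C-terminal order: targeting peptide (cytoplasmic,
# transmembrane, stem), GnTII catalytic domain, spacer, GnTI catalytic domain.
FUSION_LAYOUT = [
    Domain("cytoplasmic domain", "human GnTII", "AAAA"),
    Domain("transmembrane domain", "human GnTII", "LLLLLL"),
    Domain("stem domain", "human GnTII", "SSSSS"),
    Domain("catalytic domain", "human GnTII", "GGGGGGGG"),
    Domain("spacer (stem sequence)", "human GnTI", "PPPP"),
    Domain("catalytic domain", "human GnTI", "TTTTTTTT"),
]


def assemble(domains):
    """Concatenate domains; return the sequence and 1-based inclusive spans."""
    sequence = ""
    spans = []
    for d in domains:
        start = len(sequence) + 1
        sequence += d.sequence
        spans.append((d.source, d.name, start, len(sequence)))
    return sequence, spans


if __name__ == "__main__":
    seq, spans = assemble(FUSION_LAYOUT)
    for source, name, start, end in spans:
        print(f"{start:3d}-{end:3d}  {source} {name}")
```

Keeping the layout as an ordered list makes it easy to model the alternative arrangement mentioned elsewhere in the summary, in which the GnTI catalytic domain is N-terminal to the GnTII catalytic domain, by reordering the entries.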
[0014] Another aspect includes recombinant proteins containing an
N-acetylglucosaminyltransferase II catalytic domain and an
N-acetylglucosaminyltransferase I catalytic domain, where the
N-acetylglucosaminyltransferase II catalytic domain is located
N-terminal to the N-acetylglucosaminyltransferase I catalytic
domain; a spacer located in between the catalytic domains, where
the spacer contains a sequence selected from SEQ ID NO: 118, SEQ ID
NO: 120, SEQ ID NO: 122, and SEQ ID NO: 124; and a targeting
peptide located N-terminal to the N-acetylglucosaminyltransferase
II catalytic domain where the targeting peptide contains a
cytoplasmic domain, a transmembrane domain, and a stem domain from
human N-acetylglucosaminyltransferase II. In certain embodiments,
the spacer contains a sequence that is selected from SEQ ID NO:
118, SEQ ID NO: 120, and SEQ ID NO: 124. In certain embodiments,
the spacer contains the sequence of SEQ ID NO: 120 or SEQ ID NO:
124. In certain embodiments, the spacer contains the sequence of
SEQ ID NO: 124.
[0015] Another aspect includes isolated polynucleotides encoding
the recombinant protein of any of the preceding embodiments.
Another aspect includes expression vectors containing the isolated
polynucleotide of the preceding embodiment operably linked to a
promoter. In certain embodiments, the promoter is a constitutive
promoter. In certain embodiments, the promoter is an inducible
promoter. In certain embodiments, the promoter is from a gene
selected from gpdA, cbh1, Aspergillus oryzae TAKA amylase,
Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral
alpha-amylase, Aspergillus niger acid stable alpha-amylase,
Aspergillus niger glucoamylase (glaA), Aspergillus awamori glaA,
Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease,
Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans
acetamidase, Aspergillus oryzae acetamidase, Fusarium oxysporum
trypsin-like protease, fungal endo α-L-arabinase (abnA),
fungal α-L-arabinofuranosidase A (abfA), fungal
α-L-arabinofuranosidase B (abfB), fungal xylanase (xlnA),
fungal phytase, fungal ATP-synthetase, fungal subunit 9 (oliC),
fungal triose phosphate isomerase (tpi), fungal alcohol
dehydrogenase (adhA), fungal α-amylase (amy), fungal
amyloglucosidase (glaA), fungal acetamidase (amdS), fungal
glyceraldehyde-3-phosphate dehydrogenase (gpd), yeast alcohol
dehydrogenase, yeast alcohol oxidase, yeast lactase, yeast
3-phosphoglycerate kinase, yeast triosephosphate isomerase,
bacterial α-amylase, bacterial Spo2, or SSO. Another aspect
includes host cells containing the expression vector of any of the
preceding embodiments.
[0016] Another aspect includes methods of producing the recombinant
protein of any of the preceding embodiments, including the steps of
introducing an isolated polynucleotide that encodes the recombinant
protein into a host cell, and culturing the host cell such that the
recombinant protein is expressed. In certain embodiments, the
methods further include a step of purifying the recombinant protein
from the host cell. In certain embodiments that may be combined
with the preceding embodiments, the host cell is a fungal cell. In
certain embodiments, the fungal cell is selected from yeast or
filamentous fungus.
[0017] Another aspect includes methods of producing a complex
N-glycan including the steps of providing a host cell, where the
host cell contains a polynucleotide encoding a fusion protein
containing an N-acetylglucosaminyltransferase I catalytic domain
and an N-acetylglucosaminyltransferase II catalytic domain, and
culturing the host cell such that the fusion protein is expressed,
where the fusion protein catalyzes the transfer of
N-acetylglucosamine to a terminal Manα3 residue and
N-acetylglucosamine to a terminal Manα6 residue of an
acceptor glycan to produce a complex N-glycan. In certain
embodiments, the complex N-glycan is attached to a molecule
selected from an amino acid, a peptide, or a polypeptide. In
certain embodiments, the molecule is a heterologous polypeptide. In
certain embodiments that may be combined with the preceding
embodiments, the acceptor glycan is Man3. In certain embodiments
that may be combined with the preceding embodiments, the complex
N-glycan is
GlcNAcβ2Manα3(GlcNAcβ2Manα6)Manβ4GlcNAcβ4GlcNAc.
In certain embodiments that may be combined with the
preceding embodiments, the host cell is a eukaryotic cell. In
certain embodiments that may be combined with the preceding
embodiments, the host cell is a fungal cell. In certain
embodiments, the fungal cell is a yeast cell selected from S.
cerevisiae, K. lactis, P. pastoris, H. polymorpha, C. albicans,
Schizosaccharomyces, or Yarrowia. In certain embodiments that may
be combined with the preceding embodiments, the fungal cell is a
filamentous fungal cell selected from Trichoderma sp., Acremonium,
Aspergillus, Aureobasidium, Cryptococcus, Chrysosporium,
Chrysosporium lucknowense, Filibasidium, Fusarium, Gibberella,
Magnaporthe, Mucor, Myceliophthora, Myrothecium, Neocallimastix,
Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum,
Talaromyces, Thermoascus, Thielavia, or Tolypocladium. In certain
embodiments that may be combined with the preceding embodiments,
the host cell further contains a polynucleotide encoding a
UDP-GlcNAc transporter. In certain embodiments that may be combined
with the preceding embodiments, the host cell has a reduced level
of activity of a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase compared to the level of activity in a
wild-type host cell. In certain embodiments, the host cell has a
reduced level of expression of an alg3 gene compared to the level
of expression in a wild-type host cell. In certain embodiments, the
alg3 gene is deleted from the host cell. In certain embodiments
that may be combined with the preceding embodiments, the host cell
has a reduced level of activity of an
α-1,6-mannosyltransferase compared to the level of activity
in a wild-type host cell. In certain embodiments, the host cell has
a reduced level of expression of an och1 gene compared to the level
of expression in a wild-type host cell. In certain embodiments, the
och1 gene is deleted from the host cell. In certain embodiments
that may be combined with the preceding embodiments, the host cell
further contains a polynucleotide encoding an
α-1,2-mannosidase. In certain embodiments that may be
combined with the preceding embodiments, the host cell further
contains a polynucleotide encoding a
β-1,4-galactosyltransferase. In certain embodiments that may
be combined with the preceding embodiments, the host cell further
contains a polynucleotide encoding a sialyltransferase. In certain
embodiments that may be combined with the preceding embodiments,
the host cell is a Trichoderma cell that has a reduced level of
activity of a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase compared to the level of activity in a
wild-type Trichoderma cell. In certain embodiments that may be
combined with the preceding embodiments, the host cell is a yeast
or fungal cell that has a reduced level of activity of a
dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl mannosyltransferase and
a reduced level of activity of an alpha-1,6-mannosyltransferase
compared to the levels of activity in a wild-type yeast cell and
further contains a polynucleotide encoding an
α-1,2-mannosidase.
[0018] Another aspect includes methods of producing a complex
N-glycan including the steps of providing a Trichoderma host cell,
where the host cell has a reduced level of expression of an alg3
gene compared to the level of expression in a wild-type host cell
and contains a first polynucleotide encoding an
N-acetylglucosaminyltransferase I catalytic domain and a second
polynucleotide encoding an N-acetylglucosaminyltransferase II
catalytic domain, and culturing the host cell to produce a complex
N-glycan.
[0019] Another aspect includes methods of producing a complex
N-glycan including the steps of incubating a fusion protein
containing an N-acetylglucosaminyltransferase I catalytic domain
and an N-acetylglucosaminyltransferase II catalytic domain, an
acceptor glycan, and an N-acetylglucosamine donor together in a
buffer, where the fusion protein catalyzes the transfer of
N-acetylglucosamine to a terminal Manα3 residue and
N-acetylglucosamine to a terminal Manα6 residue of an
acceptor glycan to produce a complex N-glycan. In certain
embodiments, the acceptor glycan is attached to a molecule selected
from an amino acid, a peptide, or a polypeptide. In certain
embodiments, the molecule is a heterologous polypeptide. In certain
embodiments, the acceptor glycan is Man3. In certain embodiments
that may be combined with the preceding embodiments, the
N-acetylglucosamine donor is a UDP-GlcNAc transporter.
[0020] Another aspect includes filamentous fungal cells containing
a mutation of alg3 and Man3GlcNAc2, where the Man3GlcNAc2 includes
at least 50%, at least 60%, at least 70%, at least 80%, at least
90%, or 100% (mol %) of neutral N-glycans secreted by the cells.
The neutral N-glycans may be attached to a molecule selected from
the group consisting of an amino acid, a peptide, and a
polypeptide. In certain embodiments, the mutation of alg3 is a
deletion of alg3. In certain embodiments that may be combined with
the preceding embodiments, the cell is a Trichoderma reesei cell.
In certain embodiments that may be combined with the preceding
embodiments, the filamentous fungal cell further contains a first
polynucleotide encoding an N-acetylglucosaminyltransferase I
catalytic domain and a second polynucleotide encoding an
N-acetylglucosaminyltransferase II catalytic domain. In certain
embodiments that may be combined with the preceding embodiments,
the filamentous fungal cell further contains a polynucleotide
encoding a fusion protein containing an
N-acetylglucosaminyltransferase I catalytic domain and an
N-acetylglucosaminyltransferase II catalytic domain.
[0021] Another aspect includes methods of producing a Man3GlcNAc2
glycan in a host cell including the steps of providing a host cell
with a reduced level of activity of a mannosyltransferase compared
to the level of activity in a wild-type host cell and culturing the
host cell to produce a Man3GlcNAc2 glycan, where the Man3GlcNAc2
glycan includes at least 50%, at least 60%, at least 70%, at least
80%, at least 90%, or 100% (mol %) of the neutral N-glycans secreted
by the host cell. The neutral N-glycans may be attached to a
molecule selected from an amino acid, a peptide, and a polypeptide.
In certain embodiments, the Man3GlcNAc2 glycan is attached to a
heterologous polypeptide. In certain embodiments that may be
combined with the preceding embodiments, the mannosyltransferase is
a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl mannosyltransferase.
In certain embodiments that may be combined with the preceding
embodiments, the host cell has a reduced level of expression of an
alg3 gene compared to the level of expression in a wild-type host
cell. In certain embodiments, the alg3 gene is deleted from the
host cell. In certain embodiments that may be combined with the
preceding embodiments, the host cell is a Trichoderma cell. In
certain embodiments that may be combined with the preceding
embodiments, the level of activity of alpha-1,6-mannosyltransferase
in the host cell is reduced compared to the level of activity in a
wild-type host cell. In certain embodiments that may be combined
with the preceding embodiments, the host cell contains an
endogenous polynucleotide encoding an α-1,2-mannosidase.
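The mol % thresholds for Man3GlcNAc2 among neutral N-glycans can be illustrated with a short Python sketch. It assumes that relative signal intensities (for example from a mass spectrometric profile) approximate molar proportions, which is a simplification; the intensity values, glycan labels, and function names below are hypothetical and are not data from this disclosure.

```python
def mol_percent(signals):
    """Convert relative signal intensities to percentages of the total."""
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {name: 100.0 * value / total for name, value in signals.items()}


def meets_threshold(signals, glycan="Man3GlcNAc2", threshold=50.0):
    """True if the named glycan reaches the given share (mol %) of the total."""
    return mol_percent(signals).get(glycan, 0.0) >= threshold


# Hypothetical intensities, for illustration only.
EXAMPLE_PROFILE = {
    "Man3GlcNAc2": 60.0,
    "Hex6GlcNAc2": 30.0,
    "Man5GlcNAc2": 10.0,
}

if __name__ == "__main__":
    shares = mol_percent(EXAMPLE_PROFILE)
    for name, share in shares.items():
        print(f"{name}: {share:.1f} mol %")
    for threshold in (50, 60, 70, 80, 90, 100):
        print(threshold, meets_threshold(EXAMPLE_PROFILE, threshold=threshold))
```

With this toy profile the Man3GlcNAc2 share satisfies the 50% and 60% thresholds but not the 70% threshold.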
[0022] Another aspect includes a filamentous fungal cell having a
reduced level of expression of an alg3 gene compared to the level
of expression in a wild-type filamentous fungal cell, where the
filamentous fungal cell contains a recombinant protein of any of
the preceding embodiments. In certain embodiments, the alg3 gene
contains a mutation. Preferably, the recombinant protein has
N-acetylglucosaminyltransferase activity, where the recombinant
protein catalyzes the transfer of N-acetylglucosamine to a terminal
Manα3 residue and catalyzes the transfer of
N-acetylglucosamine to a terminal Manα6 residue of an
acceptor glycan, and where the recombinant protein is a fusion
protein containing an N-acetylglucosaminyltransferase I catalytic
domain and an N-acetylglucosaminyltransferase II catalytic domain.
In certain embodiments, the mutation of the alg3 gene is a deletion
of the alg3 gene. In certain embodiments that may be combined with
the preceding embodiments, the fusion protein is encoded by a
polynucleotide operably linked to a promoter. In certain
embodiments, the promoter is an inducible promoter. In certain
embodiments, the inducible promoter is the cbh1 promoter. In
certain embodiments that may be combined with the preceding
embodiments, the filamentous fungal cell further contains a
polynucleotide encoding a UDP-GlcNAc transporter. In certain
embodiments that may be combined with the preceding embodiments,
the filamentous fungal cell has a reduced level of activity of an
α-1,6-mannosyltransferase compared to the level of activity
in a wild-type filamentous fungal cell. In certain embodiments, the
filamentous fungal cell has a reduced level of expression of an och1
gene compared to the level of expression in a wild-type filamentous
fungal cell. In certain embodiments that may be combined with the
preceding embodiments, the filamentous fungal cell further contains
a polynucleotide encoding an α-1,2-mannosidase. In certain
embodiments that may be combined with the preceding embodiments,
the filamentous fungal cell further contains a polynucleotide
encoding a β-1,4-galactosyltransferase. In certain embodiments
that may be combined with the preceding embodiments, the
filamentous fungal cell further contains a polynucleotide encoding
a sialyltransferase. In certain embodiments that may be combined
with the preceding embodiments, the filamentous fungal cell is
selected from Trichoderma sp., Acremonium, Aspergillus,
Aureobasidium, Cryptococcus, Chrysosporium, Chrysosporium
lucknowense, Filibasidium, Fusarium, Gibberella, Magnaporthe,
Mucor, Myceliophthora, Myrothecium, Neocallimastix, Neurospora,
Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces,
Thermoascus, Thielavia, and Tolypocladium.
DESCRIPTION OF THE FIGURES
[0023] FIG. 1 shows mass spectrometric neutral N-glycan profiles of
average glycosylation on T. reesei strains M44, M81, M84, M109,
M110, M131, M132, M133, M134, and M124.
[0024] FIG. 2 shows fragmentation analysis of monophosphorylated
Man7Gn2. Only one example structure of monophosphorylated Man7Gn2
is shown.
[0025] FIG. 3 shows mass spectrometric acidic glycan profiles of T.
reesei strains M44, M81, M84, M109, M110, M131, M132, M133, M134,
and M124.
[0026] FIG. 4 shows neutral (a) and acidic (b) N-glycan profiles of
T. reesei strain M44 cultured in a fermentor for 131.4 hours (fed
batch).
[0027] FIG. 5 shows mass spectrometric neutral (a) and acidic (b)
N-glycan profiles of T. reesei culture medium.
[0028] FIG. 6 shows a membrane blot of T. reesei M44 secreted
proteins.
[0029] FIG. 7 shows an example of analyzed protein bands of T.
reesei M44 cultivated in a fermentor. The glycosylation of proteins
did not differ significantly from average glycosylation in T.
reesei. The spectrum was focused to the minor base line signals,
and the major signal of the spectrum was not quantitative in
comparison to other signals.
[0030] FIG. 8 shows a Southern blot of DNA from the parental strain
and from Alg3 knockout strains with an alg3 probe.
[0031] FIG. 9A shows a restriction enzyme map of a section of the
pTTv38 construct with sizes of predicted restriction products. FIG.
9B shows a Southern blot of genomic DNA from the parental strain
and the Alg3 knockout strains digested with EcoRI+PvuI (E+P) or
KpnI+NheI (K+N). The control DNA was pTTv38 plasmid DNA digested
with NotI. The blot was probed with an AmdS probe.
[0032] FIG. 10 shows MALDI analysis of neutral N-glycans. Part A
shows the parental strain M124. Part B shows the Alg3 knockout 4A.
Squares represent N-acetylglucosamine, and circles represent
mannose, except for the one labeled glucose.
[0033] FIG. 11 shows fragmentation analysis of Man3Gn2 from the 4A
Alg3 knockout strain.
[0034] FIG. 12 shows fragmentation analysis of Hex5Gn2 from Alg3
knockout strain 4A (part A) and parental strain M124 (part B). The
signal marked with a box exists only as an isomer from the Alg3
knockout strain.
[0035] FIG. 13 shows neutral N-glycans from Alg3 knockout strain 4A
after .alpha.-mannosidase digestion.
[0036] FIG. 14 shows the separation of two major glycans from the
Alg3 knockout strain by liquid chromatography.
[0037] FIG. 15 shows proton NMR spectra of Hex3HexNAc2 (part A) and
Hex6HexNAc2 (part B) fractions. Spectra were collected at
40.degree. C. using a Varian Unity INOVA 600 MHz spectrometer
equipped with a cryoprobe.
[0038] FIG. 16 shows the acidic fraction of parental strain M124
(part A) and Alg3 knockout strain 4A (part B). N-glycans with two
phosphate units are marked with an asterisk.
[0039] FIG. 17 shows neutral N-glycans from supernatant of T.
reesei Alg3 knockout strain 4A that was cultured in a flask for 5
days.
[0040] FIG. 18 shows neutral N-glycans from supernatant of T.
reesei Alg3 knockout strain 4A that was cultured in a fermentor for
10 days.
[0041] FIG. 19 shows a MALDI spectrum of GnTI reaction mixture.
GnTI has converted 54% of the acceptor to the product with one
additional HexNAc.
[0042] FIG. 20 shows Western blot analysis of GnTII expression.
Samples were run in 12% SDS-PAGE gel and blotted on nitrocellulose
membrane. Histidine-tagged GnTII was detected on the membrane using
mouse .alpha.-HIS monoclonal antibodies. Numbers shown on the left
are the sizes of molecular weight marker proteins (kDa).
[0043] FIG. 21 shows a MALDI spectrum of GnTII reaction mixture.
83% of the acceptor (m/z 913.340) was converted to product (m/z
1136.433).
[0044] FIG. 22 shows GnTI activity observed for the GnTI/GnTII
fusion protein.
[0045] FIG. 23 shows the N-glycans present in GnTI/GnTII T. reesei
transformants obtained by targeting to the alg3 locus.
[0046] FIG. 24 shows a MALDI spectrum of the purified reaction
mixture from the enzyme activity test of the GnTII/GnTI fusion
protein.
[0047] FIG. 25 shows a spectrum of the
.beta.1-2,3,4,6-N-acetylglucosaminidase reaction mixture.
[0048] FIG. 26 shows a MALDI spectrum of .beta.1-4GalT reaction
mixture.
[0049] FIG. 27 shows diagrams of observed N-glycans from
supernatant proteins of T. reesei M127 pTTv110 transformants (gnt
II/I in alg3 locus) on days 3 (A), 5 (B) and 7 (C and D). The clone
17A produced the most G0 on day 7. (E) Mass spectrum of neutral
N-glycans of supernatant proteins from T. reesei strain M127 GnT
II/I transformant clone 17A cultivated for 7 days in shake flasks.
Signals marked with asterisks originated from the culture
medium.
[0050] FIG. 28 shows neutral N-glycans of rituximab from T. reesei
M202 GnT II/I transformant clones (A) 9A-1 and (B) 31A-1, both
cultivated with soybean trypsin inhibitor, and (C) mass spectrum of
neutral N-glycans of rituximab purified from T. reesei strain M202
GnT II/I transformant clone 9A-1 cultivated for 5 days in shake
flasks in the presence of soybean trypsin inhibitor.
[0051] FIG. 29 shows MALDI spectra of spacer modified GnTII/GnTI
fusion reaction mixtures. Part (A) shows a reaction mixture of
GnTII/GnTI with 3.times.G4S spacer modification. 36% of the
acceptor has been converted to product with two additional HexNAcs.
Part (B) shows a reaction mixture of GnTII/GnTI with 2.times.G4S
spacer modification. 38% of the acceptor has been converted to
product with two additional HexNAcs. The calculated [M+Na]+ signal
of the GnTI product, Hex3HexNAc2 (calc. m/z 933.318), was not
detected in either spectrum because all of the GnTI product was
converted directly to Hex3HexNAc3 (calc. m/z 1136.318).
[0052] FIG. 30 shows Western blots of GnTII/I spacer variant cell
pellets (A), and supernatants (B). Lanes: 1, GnTII positive control;
2, GY3 mock strain; 3, GY7-2 wild-type GnTII/I; 4, GY32-5 3.times.G4S
spacer; 5, GY32-9 3.times.G4S spacer; 6, GY33-7 2.times.G4S spacer;
7, GY33-8 2.times.G4S spacer; 8, GY49-3 CBHI spacer; and 9, GY50-10
EGIV spacer.
[0053] FIG. 31 shows GnT activities of wild-type GnTII/I and spacer
variants from supernatants and expressed in the presence of
protease inhibitors after day 3 (A) expression phases and day 4 (B)
expression phases. The x-axis depicts sample identity
(wt = wild-type; _1, _2 = parallel clones of the spacer variants), and
the y-axis depicts percentage of products formed (GnTI and GnTII
reaction products added together).
[0054] FIG. 32 shows GnT activities of GnTII/I fusion protein (with
wild type spacer) in supernatant, cells and lysate. GnTI and GnTII
products have been added together.
[0055] FIG. 33 shows GnT activities of GnTII/I wild-type and spacer
variants in (A) supernatants, (B) cells, and (C) lysates.
[0056] FIG. 34 shows example spectra of neutral N-glycans of
parental strain M124 and GnT1 transformants on day 5. Signal with
Gn addition (m/z 1460) is marked with an arrow. (pTTv11 with cbh1
promoter, pTTv13 with gpdA promoter).
[0057] FIG. 35 shows the amounts of Man5 and Gn1Man5 in four
positive GNT1 transformants on days 3 and 5. Quantitation was
carried out against internal calibrant (Hex2HexNAc4, 2 pmol).
[0058] FIG. 36 shows example spectra of phosphorylated N-glycans of
parental M124 strain and GnT1 transformants with internal calibrant
(NeuAcHex4HexNAc2, 0.5 pmol.). GnT1 products are marked with an
arrow.
[0059] FIG. 37 shows diagrams of neutral N-glycans of different
GnTII strains/clones from day 5. Part (A) shows the pTTv140 clone.
Part (B) shows the pTTv142 clone. Part (C) shows the pTTv143 clone.
Part (D) shows the pTTv141 clone.
[0060] FIG. 38 shows an example of neutral N-glycans of different
GnTII strains/clones and the parental strain M198 from days 3, 5,
and 7. Part (A) shows clone 1-117A. Part (B) shows clone 3-11A.
Part (C) shows clone 30A. Part (D) shows parental strain M198.
[0061] FIG. 39 shows the membrane of separated proteins of T.
reesei strain M198 and GnTII clone 3-17A. The 50 kDa protein is
marked with an arrow.
[0062] FIG. 40 shows column diagrams of total secreted proteins
versus individual secreted protein(s) of parental strain M198 (A)
and the GnTII clone 3-17A (B).
[0063] FIG. 41 shows a column diagram of fermentor cultured GnTII
strain M329 from day 3 to day 7, and shake flask culture of strain
M329 from day 5.
[0064] FIG. 42 shows a multiple amino acid sequence alignment of T.
reesei ALG3 and ALG3 homologs.
DETAILED DESCRIPTION
[0065] The present invention relates to recombinant proteins having
N-acetylglucosaminyltransferase activity where the recombinant
protein catalyzes the transfer of N-acetylglucosamine (GlcNAc) to a
terminal Man.alpha.3 residue and catalyzes the transfer of
N-acetylglucosamine to a terminal Man.alpha.6 residue of an
acceptor glycan, and where the recombinant protein contains
catalytic domains from at least two different enzymes.
[0066] In some embodiments, the recombinant proteins of the
invention include two catalytic domains, where one catalytic domain
has N-acetylglucosaminyltransferase I (GnTI) activity (e.g., reacts
with a terminal Man.alpha.3 residue), and the other catalytic
domain has N-acetylglucosaminyltransferase II (GnTII) activity
(e.g., reacts with a terminal Man.alpha.6 residue).
[0067] In some embodiments, the recombinant proteins of the present
invention catalyze reactions that occur essentially sequentially.
For example, the recombinant proteins of the present invention may
catalyze the transfer of GlcNAc to a terminal Man.alpha.3-residue,
first, and then catalyze the transfer of GlcNAc to a terminal
Man.alpha.6-residue of an acceptor glycan. In one embodiment, the
essentially sequential reactions are at least 10 fold, at least 20
fold, at least 30 fold, at least 40 fold, at least 50 fold, at
least 60 fold, at least 70 fold, at least 80 fold, at least 90
fold, or at least 100 fold, more effective than the two reactions
in the reversed order. In certain embodiments, a sequential
reaction means that essentially or absolutely no GlcNAc can be
transferred to the terminal Man.alpha.6-residue if GlcNAc has not
yet been transferred to the terminal Man.alpha.3-residue. In a
specific embodiment, the acceptor glycan contains a
GlcNAc.beta.2Man.alpha.3-branch.
[0068] In some embodiments, the recombinant proteins react
specifically with both Man.alpha.3 and Man.alpha.6 residues,
optionally in branched acceptor glycans but not substantially or
absolutely with other Man.alpha.-structures, e.g.
Man.alpha.-monosaccharide conjugates, with Man.alpha.benzyl and/or
Man.alpha.Ser/Thr-peptide. The non-substantial reactivity is
preferably below 10%, below 8%, below 6%, below 4%, below 2%, below
1%, or below 0.1% of the Vmax with 0.1 mM acceptor glycan
concentrations of reactions with terminal Man.alpha.3 and
Man.alpha.6 residues. In a specific embodiment, the recombinant
proteins have substantially similar reactivities with the terminal
Man.alpha.3 (preferably as GnTI reaction) and the terminal
Man.alpha.6 residue (preferably as GnTII reaction) of the acceptor
glycan. Preferably neither catalytic activity has more than a
10-fold, 5-fold, 3-fold or 2-fold difference in reaction
effectiveness compared to the other catalytic activity under the
same conditions.
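The reactivity criteria in the preceding paragraph can be sketched as simple rate comparisons. The following Python sketch is illustrative only: the function names and any rates passed to them are hypothetical, and the default cutoffs are merely the broadest values stated above (below 10% of Vmax for non-substantial reactivity, and no more than a 10-fold difference between the two catalytic activities).

```python
def relative_reactivity(v_other: float, v_reference: float) -> float:
    """Rate with a non-target acceptor as a fraction of the reference Vmax."""
    if v_reference <= 0:
        raise ValueError("reference rate must be positive")
    return v_other / v_reference


def is_non_substantial(v_other: float, v_reference: float,
                       cutoff: float = 0.10) -> bool:
    """True if the side reactivity is below the chosen fraction of Vmax."""
    return relative_reactivity(v_other, v_reference) < cutoff


def fold_difference(v_a: float, v_b: float) -> float:
    """Ratio of the larger rate to the smaller rate (always >= 1)."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("rates must be positive")
    return max(v_a, v_b) / min(v_a, v_b)


def substantially_similar(v_gnt1: float, v_gnt2: float,
                          max_fold: float = 10.0) -> bool:
    """True if neither activity exceeds the other by more than max_fold."""
    return fold_difference(v_gnt1, v_gnt2) <= max_fold
```

Both rates are assumed to have been measured under the same conditions and in the same units; stricter embodiments correspond to smaller values of `cutoff` or `max_fold`.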
[0069] In a specific embodiment, the transfer of GlcNAc to the
terminal Man.alpha.3 and Man.alpha.6 causes a conversion of at least
10%, at least 25%, at least 50%, at least 70%, at least 90%, or at
least 95% of Man3 glycan to a glycan with two terminal GlcNAcs. The
effectiveness of the reaction can be measured by in vitro or in
vivo assays as described in the examples disclosed herein. The
effectiveness of the GlcNAc transfer reactions can be measured
essentially as described in the Examples or as maximal reaction
rate Vmax with 0.1 mM acceptor concentrations and saturating donor
concentrations. In a specific embodiment, the effectiveness of the
reaction is measured with a Man3 acceptor glycan attached to an
amino acid, a peptide, or a polypeptide.
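As a minimal sketch of how a conversion percentage of this kind might be computed from a mass spectrum, the following Python function divides the summed product signal by the total of acceptor and product signals. It assumes, as a simplification not stated in the disclosure, that relative signal intensities are proportional to the relative amounts of the glycans; the function names are hypothetical.

```python
from typing import Iterable


def percent_conversion(acceptor_signal: float,
                       product_signals: Iterable[float]) -> float:
    """Share of acceptor converted to product(s), in percent.

    Assumes signal intensity is proportional to the amount of each glycan.
    """
    product_total = sum(product_signals)
    total = acceptor_signal + product_total
    if total <= 0:
        raise ValueError("no signal to evaluate")
    return 100.0 * product_total / total


def meets_conversion_threshold(acceptor_signal: float,
                               product_signals: Iterable[float],
                               threshold_percent: float) -> bool:
    """True if the conversion is at least the stated percentage."""
    return percent_conversion(acceptor_signal, list(product_signals)) >= threshold_percent
```

For example, hypothetical relative intensities of 46 for the remaining acceptor and 54 for the product correspond to a conversion of 54%, which satisfies the "at least 50%" embodiment but not the "at least 70%" embodiment.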
[0070] The present disclosure further relates to methods of
producing a complex N-glycan, including the steps of providing a
host cell, where the host cell contains a nucleic acid encoding a
fusion protein containing an N-acetylglucosaminyltransferase I
catalytic domain and an N-acetylglucosaminyltransferase II
catalytic domain, and culturing the host cell such that the fusion
protein is expressed, where the fusion protein catalyzes the
transfer of N-acetylglucosamine to a terminal Man.alpha.3 residue
and N-acetylglucosamine to a terminal Man.alpha.6 residue of an
acceptor glycan to produce a complex N-glycan.
[0071] The present invention also relates to a filamentous fungal
cell having a reduced level of expression of an alg3 gene compared
to the level of expression in a wild-type filamentous fungal cell,
where the filamentous fungal cell contains a recombinant protein of
the invention.
[0072] Definitions
[0073] As used herein, "recombinant protein" refers to any protein
that has been produced from a recombinant nucleic acid.
"Recombinant nucleic acid" as used herein refers to a polymer of
nucleic acids where at least one of the following is true: (a) the
sequence of nucleic acids is foreign to (i.e., not naturally found
in) a given host cell; (b) the sequence may be naturally found in a
given host cell, but is present in an unnatural (e.g., greater than
expected) amount or expressed at a level that is more or less than
the natural level of expression; or (c) the sequence of nucleic
acids includes two or more sub-sequences that are not found in the
same relationship to each other in nature. For example, regarding
instance (c), a recombinant nucleic acid sequence will have two or
more sequences from unrelated genes arranged to make a new
functional nucleic acid. In another example, a recombinant nucleic
acid sequence will contain a promoter sequence and a gene-encoding
sequence that are not naturally found adjacent to one another.
[0074] As used herein, "N-acetylglucosaminyltransferase activity"
refers to the activity of an enzyme that transfers an
N-acetylglucosaminyl residue (GlcNAc) to an acceptor glycan.
Typically, enzymes having this activity are
N-acetylglucosaminyltransferases (GlcNAc transferases). In certain
embodiments, GlcNAc transferases are eukaryotic. In certain
embodiments, the GlcNAc transferases are mammalian enzymes forming
a .beta.-linkage from the 1-position of a GlcNAc-residue to the
terminal mannose residues. In certain embodiments, the GlcNAc
transferases are .beta.2-N-acetylglucosaminyltransferases
transferring .beta.2-linked GlcNAc-residue(s) to the 2-position of
terminal mannose residues of glycans, in particular to an N-linked
glycan. In certain embodiments, the .beta.2-GlcNAc transferases are
enzymes having GnTI activity and GnTII activity. GnTI activity
transfers a GlcNAc residue to a Man.alpha.3 branch. The Man.alpha.3
branch may be a Man.alpha.3(R-Man.alpha.6)Man.beta.-branch of an
N-linked glycan core structure, such as Man3GlcNAc2 or Man3 or
Man5GlcNAc2 or Man5. GnTI enzymes may be mammalian enzymes, plant
enzymes, or lower eukaryotic enzymes. GnTII activity transfers a
GlcNAc residue to a Man.alpha.6-branch such as a
Man.alpha.6(GlcNAc.beta.2Man.alpha.3)Man.beta.-branch of an
N-linked glycan core structure. An example of such a
Man.alpha.6-branch is GlcNAc1Man3GlcNAc2.
[0075] As used herein, "N-acetylglucosamine" refers to an
N-acetylglucosamine residue (GlcNAc). GlcNAc may be part of a
glycan structure. The amine group is on position 2, and the residue
has a D-configuration and a pyranose structure. It may
be alternatively named 2-acetamido-2-deoxy-D-glucopyranose
(D-GlcpNAc). GlcNAc may also be a free reducing monosaccharide
(i.e., not part of a glycan).
[0076] As used herein, "Man" refers to a mannose residue. A
"terminal Man.alpha.3" or a "terminal Man.alpha.6" refers to a
mannose that is the non-reducing end terminal residue, i.e., one
that is not substituted by another monosaccharide residue or residues.
[0077] As used herein, "glycan" refers to an oligosaccharide chain
that can be linked to a carrier such as an amino acid, peptide,
polypeptide, lipid or a reducing end conjugate. In certain
embodiments, the invention relates to N-linked glycans conjugated
to a polypeptide N-glycosylation site such as -Asn-Xxx-Ser/Thr- by
N-linkage to side-chain amide nitrogen of asparagine residue (Asn),
where Xxx is any amino acid residue except Pro. The invention may
further relate to glycans as part of
dolichol-phospho-oligosaccharide (Dol-P-P-OS) precursor lipid
structures, which are precursors of N-linked glycans in the
endoplasmic reticulum of eukaryotic cells. The precursor
oligosaccharides are linked from their reducing end to two
phosphate residues on the dolichol lipid. For example,
.alpha.3-mannosyltransferase Alg3 modifies the
Dol-P-P-oligosaccharide precursor of N-glycans. Generally, the
glycan structures described herein are terminal glycan structures,
where the non-reducing residues are not modified by other
monosaccharide residue or residues.
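The N-glycosylation site motif described above, Asn-Xxx-Ser/Thr where Xxx is any amino acid residue except Pro, can be located in a polypeptide sequence with a short pattern search. The following Python sketch is illustrative only; it assumes one-letter amino acid codes, and the function name and any example sequence are hypothetical.

```python
import re

# Zero-width lookahead so that overlapping motifs are all reported.
# N = Asn, [^P] = any residue except Pro, [ST] = Ser or Thr.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glycosylation_sites(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-Xxx-Ser/Thr motifs."""
    return [match.start() + 1 for match in _SEQUON.finditer(sequence.upper())]
```

For a hypothetical sequence such as MNGSANPTQNKT, the search reports the asparagines at positions 2 and 10 and skips the one at position 6, because it is followed by Pro. A match only indicates a potential site; whether a site is actually glycosylated depends on the host cell and the protein.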
[0078] As used herein, "glycoprotein" refers to a peptide or
polypeptide attached to a glycan. The glycan may be attached to the
peptide or polypeptide in a cotranslational or posttranslational
modification.
[0079] As used herein, "glycolipid" refers to a lipid attached to a
glycan and includes glyceroglycolipids, glycosphingolipids, and
glycosylphosphatidylinositols.
[0080] As used throughout the present disclosure, glycolipid and
carbohydrate nomenclature is essentially according to
recommendations by the IUPAC-IUB Commission on Biochemical
Nomenclature (e.g. Carbohydrate Res. 1998, 312, 167; Carbohydrate
Res. 1997, 297, 1; Eur. J. Biochem. 1998, 257, 29). It is assumed
that Gal (galactose), Glc (glucose), GlcNAc (N-acetylglucosamine),
GalNAc (N-acetylgalactosamine), Man (mannose), and Neu5Ac are of
the D-configuration, Fuc of the L-configuration, and all the
monosaccharide units in the pyranose form (D-Galp, D-Glcp,
D-GlcpNAc, D-GalpNAc, D-Manp, L-Fucp, D-Neup5Ac). The amine group
is as defined for natural galactose and glucosamines on the
2-position of GalNAc or GlcNAc. Glycosidic linkages are shown
partly in shorter and partly in longer nomenclature, the linkages
of the sialic acid SA/Neu5X-residues .alpha.3 and .alpha.6 mean the
same as .alpha.2-3 and .alpha.2-6, respectively, and for hexose
monosaccharide residues .alpha.1-3, .alpha.1-6, .beta.1-2,
.beta.1-3, .beta.1-4, and .beta.1-6 can be shortened as .alpha.3,
.alpha.6, .beta.2, .beta.3, .beta.4, and .beta.6, respectively.
Lactosamine refers to type II N-acetyllactosamine,
Gal.beta.4GlcNAc, and/or type I N-acetyllactosamine,
Gal.beta.3GlcNAc. Sialic acid (SA) refers to N-acetylneuraminic
acid (Neu5Ac), N-glycolylneuraminic acid (Neu5Gc), or any other
natural sialic acid including derivatives of Neu5X. Sialic acid is
referred to as NeuNX or Neu5X, where preferably X is Ac or Gc.
Occasionally Neu5Ac/Gc/X may be referred to as
NeuNAc/NeuNGc/NeuNX.
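The relationship between the shorter and longer linkage nomenclature described above can be sketched as a small conversion routine. In this illustrative Python sketch, the Greek letters are spelled out as "alpha" and "beta", and the function name and the `sialic_acid` flag are hypothetical conveniences: for hexose residues the linkage starts from position 1, and for sialic acid residues it starts from position 2.

```python
import re

# Short form: anomeric configuration followed by a single position digit.
_SHORT_LINKAGE = re.compile(r"^(alpha|beta)(\d)$")


def expand_linkage(short: str, sialic_acid: bool = False) -> str:
    """Expand a short linkage such as 'alpha3' into 'alpha1-3' or 'alpha2-3'."""
    match = _SHORT_LINKAGE.match(short)
    if match is None:
        raise ValueError(f"not a short linkage: {short!r}")
    anomer, position = match.groups()
    # Sialic acids are linked from carbon 2, hexoses from carbon 1.
    anomeric_carbon = 2 if sialic_acid else 1
    return f"{anomer}{anomeric_carbon}-{position}"
```

Thus `expand_linkage("beta2")` gives the hexose form `beta1-2`, while `expand_linkage("alpha6", sialic_acid=True)` gives `alpha2-6`.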
[0081] Recombinant Proteins of the Invention
[0082] The invention herein relates to recombinant proteins having
N-acetylglucosaminyltransferase activity, where the recombinant
proteins catalyze the transfer of N-acetylglucosamine to a terminal
Man.alpha.3 residue and catalyze the transfer of
N-acetylglucosamine to a terminal Man.alpha.6 residue of an
acceptor glycan. Recombinant proteins of the invention may include,
without limitation, full length proteins having
N-acetylglucosaminyltransferase activity, fragments of proteins
having N-acetylglucosaminyltransferase activity, catalytic domains
having N-acetylglucosaminyltransferase activity, and fusion
proteins having N-acetylglucosaminyltransferase activity. A single
recombinant protein of the invention has the capability to catalyze
both transfers of N-acetylglucosamines. The transfer of
N-acetylglucosamine to a terminal Man.alpha.3 residue may occur
before or after the transfer of N-acetylglucosamine to a terminal
Man.alpha.6 residue. Alternatively, the transfers may occur
simultaneously.
[0083] The acceptor glycan may be attached to a molecule such as an
amino acid, a peptide, or a polypeptide. In certain embodiments,
the amino acid is an asparagine residue. The asparagine residue may
be in aminoglycosidic linkage from the side-chain amide (a biologic
mammalian polypeptide N-glycan linkage structure) and may be part
of a peptide chain such as a dipeptide, an oligopeptide, or a
polypeptide. The glycan may be a reducing end derivative such as an
N-, O-, or C-linked, preferably glycosidic, derivative of the
reducing GlcNAc or Man, such as a spacer or terminal organic
residue with a certain glycan linked structure selected from the
group of an amino acid, alkyl, heteroalkyl, acyl, alkyloxy, aryl,
arylalkyl, or heteroarylalkyl. The spacer may be further linked to
a polyvalent carrier or a solid phase. In certain embodiments,
alkyl-containing structures include methyl, ethyl, propyl, and
C4-C26 alkyls, lipids such as glycerolipids, phospholipids,
dolichol-phospholipids and ceramides and derivatives. The reducing
end may also be derivatized by reductive amination to a secondary
amine linkage or a derivative structure. Certain carriers include
biopoly- or oligomers such as (poly)peptides, poly(saccharides)
such as dextran, cellulose, amylose, or glycosaminoglycans, and
other organic polymers or oligomers such as plastics including
polyethylene, polypropylene, polyamides (e.g., nylon),
polystyrene, polyacrylamide, and polylactic acids, dendrimers such
as PAMAM, Starburst or Starfish dendrimers, or polylysine, and
polyalkylglycols such as polyethylene glycol (PEG). Solid phases
may include microtiter wells, silica particles, glass, metal
(including steel, gold, and silver), polymer beads such as
polystyrene or resin beads, polylactic acid beads, polysaccharide
beads or organic spacers containing magnetic beads.
[0084] In certain embodiments, the acceptor glycan is attached to a
heterologous polypeptide. As used herein, a "peptide" and a
"polypeptide" are amino acid sequences including a plurality of
consecutive polymerized amino acid residues. For purpose of this
invention, typically, peptides are those molecules including up to
50 amino acid residues, and polypeptides include more than 50 amino
acid residues. The peptide or polypeptide may include modified
amino acid residues, naturally occurring amino acid residues not
encoded by a codon, and non-naturally occurring amino acid
residues. As used herein, "protein" may refer to a peptide or a
polypeptide of any size. The term "heterologous polypeptide" refers
to a polypeptide that is not naturally found in a given host cell
or is not endogenous to a given host cell. In certain embodiments,
the heterologous polypeptide is a therapeutic protein. Therapeutic
proteins, for example, may include monoclonal antibodies,
erythropoietins, interferons, growth hormones, enzymes, or
blood-clotting factors. For example, the acceptor glycan may be
attached to a therapeutic protein such as rituximab.
[0085] Acceptor Glycans
[0086] In certain embodiments, the structure of the acceptor glycan
has the following formula,
[R.sub.1].sub.yMan.alpha.3([R.sub.2].sub.zMan.alpha.6)Man{.beta.4GlcNAc(Fuc.alpha.x).sub.n[.beta.4GlcNAc].sub.m}.sub.q,
where q, y, z, n and m are 0 or 1; x is linkage position 3 or 6 of
the optional fucose residue; R.sub.1 is GlcNAc, preferably
GlcNAc.beta.2; and R.sub.2 is a branched structure
Man.alpha.3(Man.alpha.6), with the proviso that when z is 1, then y
is 0, and when z is 0, then y is 0 or 1. ( ) defines a branch in the
regular N-glycan core structure, either present or absent. [ ] and
{ } define a part of the glycan structure either present or absent
in a linear sequence. When z is
0 and y is 0 then the structure is a Man3 glycan, and when z is 0
and y is 1, the structure is a GlcNAcMan3 glycan. When y is 0 and z
is 1, the glycan is a Man5 glycan. The acceptor glycan may be
beta-glycosidically linked to an Asn residue, preferably from the
reducing end GlcNAc. In one embodiment, the acceptor glycan is a
polypeptide linked N-glycan, where m and q are 1, and the acceptor
structure contains a derivative of
[R.sub.1].sub.yMan.alpha.3([R.sub.2].sub.zMan.alpha.6)Man.beta.4GlcNAc(Fuc.alpha.x).sub.n.beta.4GlcNAc.
Optional derivatives include
substitutions by monosaccharide residues such as GlcNAc or
xylose.
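The way the indices y and z of the acceptor formula select among the three named acceptor glycans can be sketched as follows. This Python function is illustrative only and its name is hypothetical; it encodes the constraint stated above that y must be 0 when z is 1.

```python
def classify_acceptor(y: int, z: int) -> str:
    """Name the acceptor glycan selected by the formula indices y and z.

    y = 1 means R1 (GlcNAc) is present on the Man-alpha-3 arm;
    z = 1 means R2 (the Man-alpha-3(Man-alpha-6) branch) is present
    on the Man-alpha-6 arm.
    """
    if y not in (0, 1) or z not in (0, 1):
        raise ValueError("y and z must each be 0 or 1")
    if z == 1 and y == 1:
        raise ValueError("when z is 1, y must be 0")
    if z == 1:
        return "Man5"
    return "GlcNAcMan3" if y == 1 else "Man3"
```

The three permitted index combinations (y, z) = (0, 0), (1, 0), and (0, 1) therefore map to Man3, GlcNAcMan3, and Man5, respectively.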
[0087] The acceptor glycan may be Man3, GlcNAcMan3, or Man5. In
certain embodiments, the acceptor glycan is Man3 or GlcNAcMan3.
Man3 is a trimannosyl glycan comprising at least one of Man.alpha.3
or Man.alpha.6 residues and is preferably a branched
oligosaccharide, such as Man.alpha.3(Man.alpha.6)Man. Other certain
Man3 oligosaccharides are Man.alpha.3(Man.alpha.6)Man.beta.,
Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc, and polypeptide-linked
Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4GlcNAc. In addition,
depending on the host cell, the glycan can contain Fuc, Xyl or
GlcNAc in Man.beta. and/or GlcNAc residues, such as
Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.x).sub.nGlcNAc,
where x is 3 or 6 and n is 0 or 1, also described by a
monosaccharide composition formula indicating the terminal mannose
structure and reducing end composition as Man3GlcNAc2 (n is 0) and
Man3GlcNAc2Fuc (n is 1). In certain embodiments, especially those
with a polypeptide-linked structure, the Man3 structure is a
Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.6).sub.nGlcNAc.
In certain embodiments, the polypeptide-linked GlcNAcMan3 structure
is
GlcNAc.beta.2Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.6).sub.nGlcNAc,
also described by a monosaccharide composition
formula GlcNAcMan3GlcNAc2 (n is 0) and GlcNAcMan3GlcNAc2Fuc (n is
1). In certain embodiments, the polypeptide-linked Man5 structure
is
Man.alpha.3{Man.alpha.3(Man.alpha.6)Man.alpha.6}Man.beta.4GlcNAc.beta.4(Fuc.alpha.6).sub.nGlcNAc,
where { } and ( ) indicate a branch and n
is 0 or 1, also described by a monosaccharide composition formula
Man5GlcNAc2 (n is 0) and Man5GlcNAc2Fuc (n is 1).
[0088] Accordingly, certain Man3 glycans have structures
according to the following formula,
Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc(Fuc.alpha.x).sub.n.beta.4GlcNAc,
where n is 0 or 1, indicating presence or absence of part of the
molecule, where x is 3 or 6, and where ( ) defines a branch in the
structure. In embodiments of the invention where the acceptor
glycan is Man3, the recombinant protein catalyzes the transfer of
N-acetylglucosamine to the terminal Man.alpha.3 and Man.alpha.6 of
Man3, thus resulting in GlcNAc2Man3,
GlcNAc.beta.2Man.alpha.3(GlcNAc.beta.2Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.x).sub.nGlcNAc,
where n is 0 or 1, also described by a
monosaccharide composition formula GlcNAc2Man3GlcNAc2 (n is 0) and
GlcNAc2Man3GlcNAc2Fuc (n is 1).
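Monosaccharide composition formulas of this kind map directly onto calculated mass spectrometric signals. The following Python sketch is illustrative only: it assumes standard monoisotopic residue masses, treats the sodium adduct [M+Na]+ as the neutral glycan mass plus the mass of sodium (electron mass neglected), and uses a hypothetical function name.

```python
# Monoisotopic masses in Da (assumed standard values).
HEX_RESIDUE = 162.05282     # hexose residue, C6H10O5 (e.g., Man)
HEXNAC_RESIDUE = 203.07937  # N-acetylhexosamine residue, C8H13NO5 (e.g., GlcNAc)
DHEX_RESIDUE = 146.05791    # deoxyhexose residue, C6H10O4 (e.g., Fuc)
WATER = 18.01056            # added once for the free reducing end
SODIUM = 22.98977           # Na; electron mass is neglected here


def sodium_adduct_mz(n_hex: int, n_hexnac: int, n_dhex: int = 0) -> float:
    """Calculated m/z of the singly charged [M+Na]+ ion of a glycan."""
    if min(n_hex, n_hexnac, n_dhex) < 0:
        raise ValueError("residue counts must be non-negative")
    neutral_mass = (n_hex * HEX_RESIDUE
                    + n_hexnac * HEXNAC_RESIDUE
                    + n_dhex * DHEX_RESIDUE
                    + WATER)
    return neutral_mass + SODIUM
```

Under these assumptions, Man3GlcNAc2 (Hex3HexNAc2) evaluates to approximately m/z 933.318, which agrees with the calculated value given for that composition in the description of FIG. 29. GlcNAc2Man3GlcNAc2 corresponds to the composition Hex3HexNAc4, and the fucosylated form adds one deoxyhexose residue.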
[0089] In embodiments of the invention where the acceptor glycan is
Man5, the recombinant protein catalyzes the transfer of
N-acetylglucosamine to the terminal Man.alpha.3 of Man5. After 2
mannoses have been removed from GlcNAcMan5 (for example, by
mannosidase II) to form GlcNAcMan3, the recombinant protein
catalyzes the transfer of N-acetylglucosamine to the terminal
Man.alpha.6, thus resulting in GlcNAc2Man3 (which has the structure
GlcNAc.beta.2Man.alpha.3(GlcNAc.beta.2Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.x).sub.nGlcNAc,
where n is 0 or 1, also
referred to as G0 if attached to an antibody).
[0090] Fusion Proteins Containing N-acetylglucosaminyltransferase
Catalytic Domains
[0091] In certain embodiments, the recombinant proteins of the
invention are fusion proteins containing an
N-acetylglucosaminyltransferase I catalytic domain and an
N-acetylglucosaminyltransferase II catalytic domain. The term
"fusion protein" refers to any protein or polypeptide containing a
protein or polypeptide linked to heterologous amino acids.
[0092] N-acetylglucosaminyltransferase I (GlcNAc-TI; GnTI; EC
2.4.1.101) catalyzes the reaction
UDP-N-acetyl-D-glucosamine + 3-(alpha-D-mannosyl)-beta-D-mannosyl-R <=>
UDP + 3-(2-(N-acetyl-beta-D-glucosaminyl)-alpha-D-mannosyl)-beta-D-mannosyl-R,
where R represents the remainder of the N-linked
oligosaccharide in the glycan acceptor. An
N-acetylglucosaminyltransferase I catalytic domain is any portion
of an N-acetylglucosaminyltransferase I enzyme that is capable of
catalyzing this reaction. Amino acid sequences for
N-acetylglucosaminyltransferase I enzymes from various organisms
are listed in SEQ ID NOs: 1-19. Additional GnTI enzymes are listed
in the CAZy database in the glycosyltransferase family 13
(cazy.org/GT13_all). Enzymatically characterized species include
A. thaliana AAR78757.1 (U.S. Pat. No. 6,653,459), C. elegans
AAD03023.1 (Chen S. et al J. Biol. Chem 1999; 274(1):288-97), D.
melanogaster AAF57454.1 (Sarkar & Schachter Biol Chem. 2001
February; 382(2):209-17); C. griseus AAC52872.1 (Puthalakath H. et
al J. Biol. Chem 1996 271(44):27818-22); H. sapiens AAA52563.1
(Kumar R. et al Proc Natl Acad Sci USA. 1990 December;
87(24):9948-52); M. auratus AAD04130.1 (Opat AS et al Biochem J.
1998 Dec. 15; 336 (Pt 3):593-8), (including an example of
deactivating mutant), Rabbit, O. cuniculus AAA31493.1 (Sarkar M et
al. Proc Natl Acad Sci USA. 1991 Jan. 1; 88(1):234-8). Additional
examples of characterized active enzymes can be found at
cazy.org/GT13_characterized. The 3D structure of the catalytic
domain of rabbit GnTI was defined by X-ray crystallography in
Unligil U M et al. EMBO J. 2000 Oct. 16; 19(20):5269-80. The
Protein Data Bank (PDB) structures for GnTI are 1FO8, 1FO9, 1FOA,
2AM3, 2AM4, 2AM5, and 2APC. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain is from the
human N-acetylglucosaminyltransferase I enzyme (SEQ ID NO: 1), or
variants thereof. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain contains a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
84-445 of SEQ ID NO: 1. In some embodiments, a shorter sequence can
be used as a catalytic domain (e.g. amino acid residues 105-445 of
the human enzyme or amino acid residues 107-447 of the rabbit
enzyme; Sarkar et al. (1998) Glycoconjugate J 15:193-197).
Additional sequences that can be used as the GnTI catalytic domain
include amino acid residues from about amino acid 30 to 445 of the
human enzyme or any C-terminal stem domain starting between amino
acid residue 30 to 105 and continuing to about amino acid 445 of
the human enzyme, or corresponding homologous sequence of another
GnTI or a catalytically active variant or mutant thereof. The
catalytic domain may include N-terminal parts of the enzyme such as
all or part of the stem domain, the transmembrane domain, or the
cytoplasmic domain.
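A percent identity comparison against a defined residue range, such as amino acid residues 84-445 of a reference sequence, can be sketched as follows. This Python sketch is illustrative only: it assumes the two sequences have already been aligned (with "-" marking gaps) and counts identity over aligned, non-gap columns, which is one of several conventions in use and is not necessarily the one intended by the disclosure. The function names are hypothetical, and no actual SEQ ID NO sequence is reproduced.

```python
def residue_range(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range outside the sequence")
    return sequence[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared
```

A candidate catalytic domain would then satisfy, for example, the "at least 90%" embodiment if `percent_identity` of its alignment against `residue_range(reference, 84, 445)` returned 90.0 or more.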
[0093] As used herein, "cytoplasmic" is used to refer to a part of
a protein that interacts with the cytoplasm of a cell.
[0094] N-acetylglucosaminyltransferase II (GlcNAc-TII; GnTII; EC
2.4.1.143) catalyzes the reaction
UDP-N-acetyl-D-glucosamine + 6-(alpha-D-mannosyl)-beta-D-mannosyl-R <=>
UDP + 6-(2-(N-acetyl-beta-D-glucosaminyl)-alpha-D-mannosyl)-beta-D-mannosyl-R,
where R represents the remainder of the N-linked
oligosaccharide in the glycan acceptor. An
N-acetylglucosaminyltransferase II catalytic domain is any portion
of an N-acetylglucosaminyltransferase II enzyme that is capable of
catalyzing this reaction. Amino acid sequences for
N-acetylglucosaminyltransferase II enzymes from various organisms
are listed in SEQ ID NOs: 20-33. In certain embodiments, the
N-acetylglucosaminyltransferase II catalytic domain is from the
human N-acetylglucosaminyltransferase II enzyme (SEQ ID NO: 20).
Additional GnTII species are listed in the CAZy database in the
glycosyltransferase family 16 (cazy.org/GT16_all). Enzymatically
characterized species include GnTII of C. elegans, D. melanogaster,
Homo sapiens, Rattus norvegicus, Sus scrofa
(cazy.org/GT16_characterized). In certain embodiments, the
N-acetylglucosaminyltransferase II catalytic domain contains a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
from about 30 to about 447 of SEQ ID NO: 21. The catalytic domain
may include N-terminal parts of the enzyme such as all or part of
the stem domain, the transmembrane domain, or the cytoplasmic
domain.
[0095] In certain embodiments, the N-acetylglucosaminyltransferase
I catalytic domain is N-terminal to the
N-acetylglucosaminyltransferase II catalytic domain. In other
embodiments, the N-acetylglucosaminyltransferase II catalytic
domain is N-terminal to the N-acetylglucosaminyltransferase I
catalytic domain. The term "N-terminal" refers to the positioning
of a set of amino acid residues closer to the end of a polypeptide
that is terminated by an amino acid with a free amine group
(.about.NH.sub.2) compared to a reference set of amino acid
residues.
[0096] Spacers
[0097] In certain embodiments of the invention, the recombinant
protein contains a spacer in between the
N-acetylglucosaminyltransferase I catalytic domain and the
N-acetylglucosaminyltransferase II catalytic domain. The term
"spacer" refers to any number of consecutive amino acids of any
sequence separating the N-acetylglucosaminyltransferase I catalytic
domain and the N-acetylglucosaminyltransferase II catalytic domain
such that the spacer has no effect on the enzymatic function of the
catalytic domains. Typically, the spacer is at least 5, at least
10, at least 15, at least 20, at least 30, at least 40, or at least
50 amino acids in length. In certain embodiments, the spacer
contains sequence from a stem domain. "Stem domain" refers to a
protein domain, or a fragment thereof, which is located adjacent to
the transmembrane domain of a native enzyme, such as a
glycosyltransferase or a glycosyl hydrolase, and optionally targets
the enzyme to or assists in retention of the enzyme in the
ER/Golgi. Stem domains generally start with the first amino acid
following the hydrophobic transmembrane domain and end at the
catalytic domain. Exemplary stem domains include, but are not
limited to, the stem domain of human GnTI, amino acid residues from
about 30 to about 83 or from about 30 to about 105 for the human
GnTII, or amino acid residues from about 26 to about 106 or from
about 26 to about 83 for the T. reesei KRE2. In certain embodiments
where the spacer contains sequence from a stem domain, the spacer
includes amino acids 30-83 of the human GnTI sequence (SEQ ID NO:
34). In other embodiments, the spacer may include any of the
sequences listed in SEQ ID NOs: 35-38.
[0098] Further examples of suitable spacers include, without
limitation, the flexible spacer 3×G4S (SEQ ID NO: 118), the
flexible spacer 2×G4S (SEQ ID NO: 120), the spacer for the T.
reesei CBHI (SEQ ID NO: 122), and the spacer for the T. reesei EGIV
cellulase (SEQ ID NO: 124).
[0099] In certain embodiments, the length of the spacer is about
the same as the length of a stem domain of GnTI. In certain
embodiments, the length is about 74 amino acid residues, plus or
minus about 37 amino acids. For example, the spacer length is from
about 30 amino acids to about 110 amino acids, or from about 35
amino acids to about 100 amino acids, or as exemplified in the
examples described herein, plus or minus 2, 3, 4, or 5 amino acids.
In one embodiment, the spacer length corresponds to a truncated stem
domain of GnTI, for example, starting from an amino acid positioned
between amino acid 25 and amino acid 104, or between amino acid 30
and amino acid 101, and extending to the end of the GnTI stem
domain. In certain embodiments, the spacer may include a
part of the stem domain of human GnTI, which may start from an
amino acid positioned between amino acid 70 and amino acid 87
(according to numbering in SEQ ID NO: 34), or between amino acid 76
and amino acid 104, or beginning from amino acid 30, 35, 40, 45,
50, 60, 70, 73, 74, 75, 76, 80, 83, 84, 85, 86, 87, 100, 101, 102,
103, or 104, to the end of the human GnTI stem domain. In other
embodiments, the spacer may include a heterologous spacer peptide,
which may include a fungal spacer peptide and/or a repetitive
oligomer spacer peptide.
[0100] Typically, the spacer is an elongated peptide without
specific conformation and contains amino acid residues allowing
high flexibility (e.g., Gly and Ala), hydrophilicity (e.g., Ser and
Thr), and optionally Pro to prevent conformation. The spacer may be
glycosylated. In certain embodiments the spacer is O-glycosylated
including fungal O-mannosylation. In certain embodiments the spacer
is an endogenous fungal, filamentous fungal, or Trichoderma spacer
peptide, such as a spacer that naturally separates protein domains.
The spacer may be derived from a secreted or cellulolytic enzyme of
a fungus such as a filamentous fungus (e.g., T. reesei), a fragment
thereof, or a multimer of the spacer and/or its fragment or mutated
analog or equivalent thereof. The natural fungal spacer may contain
dimeric or oligomeric proline and/or glycine and/or serine and/or
threonine, and/or multiple amino acid residues selected from Ser,
Thr, Gly, Pro or Ala or any combinations thereof. In certain
embodiments, the spacer is a repeating oligomer containing a
monomer with 1-10 or 1-5 amino acid residues selected from Ser,
Thr, Gly, Pro or Ala and optionally a charged amino acid residue
selected from negatively charged residues Glu or Asp or positively
charged residues Lys or Arg. In certain embodiments the charged
residue is negatively charged. In certain embodiments the monomer
contains dimeric or oligomeric amino acid residues, and/or multiple
single amino acid residues selected from Ser, Thr, Gly, Pro and
Ala. In certain embodiments the oligomer contains a monomer of a
dimer or oligomer of glycine and a single residue selected from
Ser, Thr, Gly, Pro and Ala. In certain embodiments the single
residue is Ser or Thr. In certain embodiments the residue is Ser.
In certain embodiments, the sequence of the repeating spacer is
{(Yyy)_n Xxx}_m where n is 2 to 10, m is 2 to 10, and Xxx
and Yyy are selected from Ser, Thr, Gly, Pro and Ala, with the
proviso that Xxx and Yyy are not the same amino acid residue. In
certain embodiments the repeating spacer is {(Gly)_n Xxx}_m
where n is 2 to 10, m is 2 to 10, and Xxx is selected from Ser,
Thr, Gly, Pro and Ala. In certain embodiments Xxx is Ser or Thr. In
certain embodiments Xxx is Ser.
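The repeating spacer pattern described above can be illustrated
with a short sketch. The function name, the parameter names, and
the use of one-letter amino acid codes are assumptions made for
illustration only, and the output is a generic repeat built from
the stated formula rather than a reproduction of any SEQ ID NO.

```python
# Illustrative sketch only; names are hypothetical, not from the disclosure.
ALLOWED = {"S", "T", "G", "P", "A"}  # Ser, Thr, Gly, Pro, Ala (one-letter codes)


def repeating_spacer(yyy: str, xxx: str, n: int, m: int) -> str:
    """Build {(Yyy)n Xxx}m as a one-letter amino acid string."""
    if yyy not in ALLOWED or xxx not in ALLOWED:
        raise ValueError("residues must be one of Ser, Thr, Gly, Pro, Ala")
    if yyy == xxx:
        raise ValueError("Xxx and Yyy must be different residues")
    if not (2 <= n <= 10 and 2 <= m <= 10):
        raise ValueError("n and m must each be between 2 and 10")
    # (Yyy repeated n times, followed by one Xxx), repeated m times
    return (yyy * n + xxx) * m


# Four glycines followed by one serine, repeated three times
spacer = repeating_spacer("G", "S", n=4, m=3)
print(spacer, len(spacer))
```

With n = 4 and m = 3 the sketch yields a 15-residue Gly/Ser repeat
of the kind commonly written as 3xG4S.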
[0101] Targeting Peptides
[0102] In certain embodiments, recombinant proteins of the
invention include a targeting peptide linked to the catalytic
domains. The term "linked" as used herein means that two polymers
of amino acid residues in the case of a polypeptide or two polymers
of nucleotides in the case of a polynucleotide are either coupled
directly adjacent to each other or are within the same polypeptide
or polynucleotide but are separated by intervening amino acid
residues or nucleotides. A "targeting peptide", as used herein,
refers to any number of consecutive amino acid residues of the
recombinant protein that are capable of localizing the recombinant
protein to the endoplasmic reticulum (ER) or Golgi apparatus
(Golgi) within the host cell. The targeting peptide may be
N-terminal or C-terminal to the catalytic domains. In certain
embodiments, the targeting peptide is N-terminal to the catalytic
domains. In certain embodiments, the targeting peptide provides
binding to an ER or Golgi component, such as to a mannosidase II
enzyme. In other embodiments, the targeting peptide provides direct
binding to the ER or Golgi membrane.
[0103] Components of the targeting peptide may come from any enzyme
that normally resides in the ER or Golgi apparatus. Such enzymes
include mannosidases, mannosyltransferases, glycosyltransferases,
Type 2 Golgi proteins, and MNN2, MNN4, MNN6, MNN9, MNN10, MNS1,
KRE2, VAN1, and OCH1 enzymes. Such enzymes may come from a yeast or
fungal species such as those of Acremonium, Aspergillus,
Aureobasidium, Cryptococcus, Chrysosporium, Chrysosporium
lucknowense, Filobasidium, Fusarium, Gibberella, Humicola,
Magnaporthe, Mucor, Myceliophthora, Myrothecium, Neocallimastix,
Neurospora, Paecilomyces, Penicillium, Piromyces, Schizophyllum,
Talaromyces, Thermoascus, Thielavia, Tolypocladium, and
Trichoderma. Sequences for such enzymes can be found in the GenBank
sequence database.
[0104] In certain embodiments the targeting peptide comes from the
same enzyme and organism as one of the catalytic domains of the
recombinant protein. For example, if the recombinant protein
includes a human GnTII catalytic domain, the targeting peptide of
the recombinant protein is from the human GnTII enzyme. In other
embodiments, the targeting peptide may come from a different enzyme
and/or organism than the catalytic domains of the recombinant
protein.
[0105] Examples of various targeting peptides for use in targeting
proteins to the ER or Golgi that may be used for targeting
recombinant proteins of the invention include: Kre2/Mnt1 N-terminal
peptide fused to galactosyltransferase (Schwientek, JBC 1996,
3398), HDEL for localization of mannosidase to ER of yeast cells to
produce Man5 (Chiba, JBC 1998, 26298-304; Callewaert, FEBS Lett
2001, 173-178), OCH1 targeting peptide fused to GnTI catalytic
domain (Yoshida et al, Glycobiology 1999, 53-8), yeast N-terminal
peptide of Mns1 fused to α2-mannosidase (Martinet et al,
Biotech Lett 1998, 1171), N-terminal portion of Kre2 linked to
catalytic domain of GnTI or β4GalT (Vervecken, Appl. Environ
Microb 2004, 2639-46), various approaches reviewed in Wildt and
Gerngross (Nature Rev Biotech 2005, 119), full-length GnTI in
Aspergillus nidulans (Kalsner et al, Glycocon. J 1995, 360-370),
full-length GnTI in Aspergillus oryzae (Kasajima et al, Biosci
Biotech Biochem 2006, 2662-8), portion of yeast Sec12 localization
structure fused to C. elegans GnTI in Aspergillus (Kainz et al
2008), N-terminal portion of yeast Mnn9 fused to human GnTI in
Aspergillus (Kainz et al 2008), N-terminal portion of Aspergillus
Mnn10 fused to human GnTI (Kainz et al, Appl. Environ Microb 2008,
1076-86), and full-length human GnTI in T. reesei (Maras et al,
FEBS Lett 1999, 365-70).
[0106] In certain embodiments the targeting peptide is the
Kre2/Mnt1 (i.e., Kre2) targeting peptide having the amino acid
sequence of SEQ ID NO: 115 or SEQ ID NO: 116.
[0107] Further examples of sequences that may be used for targeting
peptides include the sequences listed in Table 1 below.
TABLE-US-00001 TABLE 1 Targeting peptides. Homologous to
Cytoplasmic Transmembrane Luminal KRE2 MASTNARYVR YLLIAFFTILVFYF
SKYEGVDLNKGTFTAPDSTKTTPKPPATGDAKDFPLALTPNDP estExt_fgenesh1_ SEQ ID
NO: 39 VSN GFNDLVGIAPGPRMNATFVTIARNSDVVVDIARSIRQVEDRFNRRYNY
pm.C_30039 SEQ ID NO: 40
DWVFLNDKPFDNTFKKVTTSLVSGKTHYGEIAPEHWSFPDWIDQDKA
KKVREDMAERKIIYGDSVSYRHMCRFESGFFFRQPLMMNYEYYWRV
EPSIELYCDIHYDPFRLMVEQGKKYSFVISLYEYPATIATLWESTKKFM
KNHPEHIAPDNSMRFLSDDGGETYNNCHFWSNFEIGSLEWLRSKQYI
DFFESLDKDGGFFYERWGDAPVHSIAAGLMLNRSEIHFFNDIAYWHV
PFTHCPTGEKTRLDLKCHCDPKENFDWKGYSCTSRFFEMNGMDKPE GWENQQD SEQ ID NO:
41 KRE2 alternative1 MAIARPVR ALGGLAAILWCFF
QLLRPSSSYNSPGDRYINFERDPNLDPTGEPEGILVRTSDRYAPDAK e_gw1.28.231.1 SEQ
ID NO: 42 LY DTDRASATLLALVRNEEVDDMVASMVDLERTVVNSKFNYPWTFFNDK SEQ ID
NO: 43 PFSEEFKKKTSAVTNATCNYELIPKEHWDAPSWIDPAIFEESAAVLKK
NGVQYANMMSYHQMCRWNSGMFYKHPALKDVRYYVVRVEPKVHFF
CDVDYDVFRYMQDNNKTYGFTINLYDDPHTLPTLWPQTAKFLADHPN
YLHEHSAIKWVIDDARRPQHNREAQGFSTCHFWSNFEVADMEFWRS
KVYEDYFEHLDRAGGFFYERWGDAPVHSIALGLFEDSSKIHWFRDIG
YQHIPFFNCPNSPKCKGCVTGRLTDGEPFLHREDCRPNWFKYAGMG SEQ ID NO: 44 OCH1
MLNPRR ALIAAAFILTVFFLI
SRSHNSESASTSEPKDAEAEALSAANAQQRAAPPPPPQKPMIDMSG e_gw1.16.371.1 SEQ
ID NO: 45 SEQ ID NO: 46
MSTYDKLAYAYEYDIESKFPAYIWQTWRKTPSEGDFEFREQEASWSI
EHPGFIHEVITDSVADTLLQLLYGSIPEVLEAYHALPLPVLKADLFRYLIL
YARGGIYSDIDTYAIRSALEWIPPQIPKETVGLVIGIEADPDRPDWADW
YSRRIQFCQWTIQSKPGHPVLRDIISRITNQTLEMKKSGKLSAFQGNR
VVDLTGPAVWTDTIMDYFNDERYFDMENSKGRIDYRNFTGMETSKRV
GDVVVLPITSFSPGVGQMGAKDYDDPMAFVKHDFEGTVVKPESERHI GEIVQELGEGQGEAPKEQ
SEQ ID NO: 47 OCH1 alternative1 MGMGQCQWSPF LPLYITVVCVFLVIV
NFDWILAIPNPASVLRREPKAPPLPGSTFPQKIWQTVVKVDPLNFDERD fgenesh1_pm.C_
RNKVPTQMRRC SEQ ID NO: 49
LVTARTWTTINPGMRYEVVTDANEMAYIEDRYGPNGFDRPDIVEFYK scaffold_13000080
SEQ ID NO: 48 MINLPIIKADLLRYMIMYAEGGIYADIDVETMKPFHRFIPDRYDEKDIDIII
GVEIDQPDFKDHPILGKKSMSFCQWTFVARPQQPVMMRLIENIMKWF
KTVARDQGVPLGEVQLDFDQVISGTGPSAFTKAMLEEMNRKTKGPKV
TVVDAFHNLDESKLVGGVLVLTVEAFCAGQGHSDSGNHNARNALVKH
HFHASNWPSRHPRYKHPAYGQVEDCNWVPECVRKWDEDTSNWDK
YSENEQKKILQDIENARLERERQQQALAALP SEQ ID NO: 50 MNN9 MARPMGSVRLKK
LILGAVLCIFIIIFLV SPSSPASASRLSIVSAQHHLSPPTSPYQSPRSGAVQGPPPVTRYNLN e
gw1.5.262.1 ANPST SEQ ID NO: 52
KVTVTSDPVRNQEHILILTPMARFYQEYWDNLLRLNYPHELITLGFILP SEQ ID NO: 51
KTKEGNQATSMLQKQIQKTQNYGPEKDRFKSIIILRQDFDPAVVSQDE
SERHKLANQKARREVMAKARNSLLFTTLGPSTSWVLWLDADITETAP
TLIQDLASHDKPIIVANCFQKYYDPESKKMAERPYDFNSWQDSETALK
MAEQMGPDDILLEGYAEMATYRTLLAYMSTPGGSKDLVVPLDGVGG
TALLVKADVHRDGAMFPPFAFYHLIESEGFAKMAKRLGWQPYGLPNY KVYHYNE SEQ ID NO:
53 MNN9 alternative1 MLLPKGGLDWRS FILLVGITGLILLLW
RGVSTSASEMQSFYCWGPAKPPMEMSPNEHNRWNGHLQTPVIFNH estExt_GeneWise
ARAQIPPTRAL SEQ ID NO: 55
HAPVEVNSSTIEHVDLNPINSTKQAVTKEERILILTPLKDAAPYLSKYFE Plus.C_230146
WNAVTRTR LLAELTYPHRLIDLAFLVSDSTDDTLAVLASELDRIQKRPDQIPFHSATV SEQ ID
NO: 54 IEKDFGFKLSQNVEERHSFEAQGPRRKAMGRARNYLLYTALKPEHSW
VYWRDVDIVDSPTGILEDFIAHDRDILVPNIWFHRYRDGVDIEGRFDYN
SWVESDKGRKLANSLDKDVVLAEGYKQYDTGRTYMAKMGDWRENK
DVELELDGIGGVNILVKADVHRSGINFPCYAFENQAETEGFAKMAKRA
GYEVYGLPNYVVWHIDTEEKGGNA SEQ ID NO: 56 MNN9 alternative2
MMPRHHSSGFSN VGIAVVVILVLVL
QPRSVASLISLGILSGYDDLKLETVRYYDLSNVQGTARGWEREERILL estExt_GeneWise
GYPRADTFEI WFG CVPLRDAEQHLPMFFSHLKNFTYPHNLIDLAFLVSDSKDHTLESLTEH
Plus.C_400029 SPHRFQPRATLPP SEQ ID NO: 58
LEAIQADPDPKQPYGEISIIEKDFGQKVNQDVESRHGFAAQASRRKLM HRKRKRTAIR
AQARNWLLSAALRPYHSWVYWRDVDVETAPFTILEDLMRHNKDVIVP SEQ ID NO: 57
NVVVRPLPDWLGGEQPYDLNSWQESETALALADTLDEDAVIVEGYAE
YATWRPHLAYLRDPYGDPDMEMEIDGVGGVSILAKAKVFRAGVHFPA
FSFEKHAETEGFGKMAKRMHFSVVGLPHYTIWHLYEPSVDDIKHMEE
MERERIAREKEEEERKKKEAQIKEEFGDANSQWEQDKQQMQDLKLQ
DRGGDKEAAAAGVNQGAAAKAAGAMEGQKN SEQ ID NO: 59 MNN10 MSLSRSPSPVPG
ILLPLIIICTIVAYY GTHEAPGFVHWWRRISMGGGGEKFVIILGANVGGGVMEWKGAREW
fgenesh5_pg.C_ GGWSSPGLNINS SEQ ID NO: 61
AIERDSVRNKRKYATRWGYDLEIVDMKTKKRYAHEWRESWEKVDFIR scaffold_5000342
GRSSPSNAAGSS AAMRKYPKAEWFWWLDLNTYVMEPSYSLQRHLFNHLDRHVYRDINV
VSWESAKMRKQG FNPLNITHPPTEEYLDAEARSPVGDGNINSVNLMLTQDCSGFNLGSFF
ANGYPSFSTQNQ IRRSAWTEQLLDIWWDPVLYEQKHMEWEHKEQDALEQLYRTQPWIR
GFFTRHMRRI QHTGFLPQRLINSFPPAACADESGLNNTRIHYNEKDRDFVVNMAGCE
SSSLPRFAAGPG WGRDCWGEMYHYREFSYWLNRNPWELFKEEIVAVIWYKLIGQRVKL
NTYAEREKYERG SEQ ID NO: 62 GHSPHAGGGRLR AFLARIGRRLKWR SEQ ID NO: 60
MNN10 MHFAYPSRKSSN IGIVLFLVLATLWFF
SNPRVPRPDPERVPSGRPPVVLVTVIDPTQYPNAYLKTIKENREQYAA alternative1
PPPFRPRSTRLPG SEQ ID NO: 64
KHGYEAFIVKAYDYDTQGAPQSWSKLMAMRHALTKFPECRFVWYLD estExt_GeneWise
LRRSRIKT QDAYIMDMSKSLEEQLLNRQKLESLMIKNYPVVPPDSIIKTFSHLRPDE
Plus.C_150339 SEQ ID NO: 63
VDLIVSQDSSGLVAGSVVVRNSQWSKFLLETWMDPLYRSYNFQKAE
RHALEHIVQWHPTILSKLALVPQRTLGPYTRTDQGDAYQDGDFVVMF
TGCTKSGEQSCETVSASYYQKWSSSL SEQ ID NO: 65 MNS1 MIRDPFGIHSKNA
VLGMIAAAVMFVL SSGQTEEAKKKASGSAFSWLGLSQERGGVDWDERRKSVVEAFEVVV
fgenesh1_pm.C_ FKATALRAARDIK YVTGFF
DAYERYAWGKDEFHPISKNGRNMAPKGLGWIIIDSLDTMMLMNQTTR scaffold_3000175
EAATQAGANALE SEQ ID NO: 67
LQHAREWISTSLTVVDQDQDVNTFETTIRMLGGLLSAHYLSTEFPELAP MSFSLPKHVPDF
LTEDDEGAPGEDLYLEKAKDLADRLLSAFESESGIPYASVNIGEYKGP GDPSRALEDRAW
SHSDNGASSTAEATTLQLEFKYLAKLTGEKNFWDKVEKVMEVVDDN AALLPMYKDKPYA
QPEDGLVPIYIYATTGEFRGQNIRLGSRGDSYYEYLIKQYLQINKQEPI YAPSMRLRPWWR
YEEMWDEALAGVRKHLVTYTEPSEFTIIAERPDGLEHPMSPKMDHLV RRK
CFMPGTIALAATGGLTEAEARKLSTWNKKKDDDMQLARELMHTCWG SEQ ID NO: 66
MYKYMKTGLAPEIMYFNIPNPPPESSAPHQAPAAFDEDPHAEWRKDF
VVHSNDVHNLQRPETVESLFYMWRITGDVKYREWGWDMFKSFVNYT
AVEDQGGFTSLLDANSIPPTPKDNMESFWLAETLKYMYLLFSPNDVLP
LHKIVLNTEAHPFPRFDMGPLFSTGWKRKPRDGSAKKKATTAATTDAE SEQ ID NO: 68 MNS1
alternative1 MARRRYR LFMICAAVILFLLYR
VSQNTWDDSAHYATLRHPPASNPPAAGGESPLKPAAKPEHEHEHEN estExt_fgenesh1_ SEQ
ID NO: 69 SEQ ID NO: 70
GYAPESKPKPQSEPKPESKPAPEHAAGGQKSQGKPSYEDDEETGKN pm.C_80182
PPKSAVIPSDTRLPPDNKVHWRPVKEHFPVPSESVISLPTGKPLKVPR
VQHEFGVESPEAKSRRVARQERVGKEIERAWSGYKKFAWMHDELSP
VSAKHRDPFCGWAATLVDSLDTLWIAGLKEQFDEAARAVEQIDFTTTP
RNNIPVFETTIRYLGGLLGAFDVSGGHDGGYPMLLTKAVELAEILMGIF
DTPNRMPILYYQWQPEYASQPHRAGSVGIAELGTLSMEFTRLAQLTS
QYKYYDAVDRITDALIELQKQGTSIPGLFPENLDASGCNHTATALRSSL
SEAAQKQMDEDLSNKPENYRPGKNSKADPQTVEKQPAKKQNEPVEK
AKQVPTQQTAKRGKPPFGANGFTANWDCVPQGLVVGGYGFQQY
HMGGGQDSAYEYFPKEYLLLGGLESKYQKLYVDAVEAINEWLLYRPM
TDGDWDILFPAKVSTAGNPSQDLVATFEVTHLTCFIGGMYGLGGKIFG
REKDLETAKRLTDGCVWAYQSTVSGIMPEGSQVLACPTLEKCDFN
ETLWWEKLDPAKDWRDKQVADDKDKATVGEALKETANSHDAAGGS
KAVHKRAAVPLPKPGADDDVGSELPQSLKDKIGFKNGEQKKPTGSSV
GIQRDPDAPVDSVLEAHRLPPQEPEEQQVILPDKPQTHEEFVKQRIAE
MGFAPGVVHIQSRQYILRPEAIESVWYMYRITGDPIWMEKGWKMFEA
TIRATRTEIANSAIDDVNSEEPGLKDEMESFWLAETLKYYYLLFSEPSVI
SLDEWVLNTEAHPFKRPGGSVIGHSI SEQ ID NO: 71 MNS1 alternative2
MLNOLOGRVPRRY IALVAFAFFVAFLLW
SGYDFVPRTATVGRFKYVPSSYDWSKAKVYYPVKDMKTLPQGTPVT estExt_GeneWise SEQ
ID NO: 72 SEQ ID NO: 73
FPRLQLRNQSEAQDDTTKARKQAVKDAFVKSWEAYKTYAWTKDQLQ Plus.C_120298
PLSLSGKETFSGWSAQLVDALDTLWIMDLKDDFFLAVKEVAVIDWSKT
KDNKVINLFEVTIRYLGGLIAAYDLSQEPVLRAKAIELGDTLYATFDTPN
RLPSHWLDYSKAKKGTQRADDSMSGAAGGTLCMEFTRLSQITGDPK
YYDATERIKQFFYRFQNETTLPGMWPVMMNYREETMVESRYSMGGS
ADSLYEYLVKMPALLGGLDPQYPEMAIRALDTARDNLLFRPMTEKGD
NILALGNALVDHGNVQRITEMQHLTCFAGGMYAMAGKLFKRDDYVDL
GSRISSGCVWAYDSFPSGIMPESADMAACAKLDGPCPYDEVKAPVD
PDGRRPHGFIHVKSRHYLLRPEAIESVFYMWRITGDQVWRDTAWRM
WENIVREAETEHAFAIVEDVTRTASKLINNYLLQTFWLAETLKYFYLIF
DDESAIDLDKWVFNTEAHPFKRPAV SEQ ID NO: 74 MNS1 alternative3
MLVVGRPRLVRNS IILTLAILSIWHLGLL
SRTPTSASALVSASVSASSEWSRLERLMNRGAPLTPYPDSNSSFDW estExt_GeneWise SEQ
ID NO: 75 SEQ ID NO: 76
SAIPFRYPPHNTTHLPPRHKQPPLPRIQHRFGPESPAAAKERIKRLKA Plus.C_160228
VKQVFLRAWQAYKGYAWKQDALLPISGGGREQFSGWAATLVDALDT
LWIMGLREEFDEAVAAVAEIDFGSSTSSRVNIFETNIRYLGGLLAAYDL
SGREVLLKKAVELGDLIYAGFNTENGMPVDFLNFYSAKSGEGLVVES
SVVSASPGTLSLELAHLSQVTGDDKYYSAVSQVMDVFYQGQNKTRLP
GVWPIDVNMRAKDVVSGSRFTLGGCADSLYEYLPKMHQLLGGGEPK
YETMSRTFLQAADRHFVFRPMLPGAEEDVLMPGNVNVDEDSGEAVL
DPETEHLACFVGGMFGLAGRLFSRPDDVETGVRLTNGCVYAYRAFP
TGMMPERLDLAPCRDRSSRCPWDEEHWLEERAKRPEWEPHLPRGF
TSAKDPRYLLRPEAIESVFYSYRITGRQEFQTAAWDMFTAVEKGTRT
QFANAAVLDVTRAADELPQEDYMESFWLAETLKYFYLMFTTPDIISLD DYVLNTEAHPFKLVG
SEQ ID NO: 77 MNS1 alternative4 -- MVMLVAIALAWLGCSLL
RPVDAMRADYLAQLRQETVDMFYHGYSNYMEHAFPEDELRPISCTPL e_gw1.13.279.1 SEQ
ID NO: 78 TRDRDNPGRISLNDALGNYSLTLIDSLSTLAILAGGPQNGPYTGPQAL
SDFQDGVAEFVRHYGDGRSGPSGAGIRARGFDLDSKVQVFETVIRG
VGGLLSAHLFAIGELPITGYVPRPEGVAGDDPLELAPIPWPNGFRYDG
QLLRLALDLSERLLPAFYTPTGIPYPRVNLRSGIPFYVNSPLHQNLGEA
VEEQSGRPEITETCSAGAGSLVLEFTVLSRLTGDARFEQAAKRAFWE
VWHRRSEIGLIGNGIDAERGLWIGPHAGIGAGMDSFFEYALKSHILLS
GLGMPNASTSRRQSTTSWLDPNSLHPPLPPEMHTSDAFLQAWHQAH
ASVKRYLYTDRSHFPYYSNNHRATGQPYAMWIDSLGAFYPGLLALAG
EVEEAIEANLVYTALWTRYSALPERWSVREGNVEAGIGWWPGRPEFI
ESTYHIYRATRDPWYLHVGEMVLRDIRRRCYAECGWAGLQDVQTGE
KQDRMESFFLGETAKYMYLLFDPDHPLNKLDAAYVFTTEGHPLIIPKS
KRGSGSHNRQDRARKAKKSRDVAVYTYYDESFTNSCPAPRPPSEHH
LIGSATAARPDLFSVSRFTDLYRTPNVHGPLEKVEMRDKKKGRVVRY
RATSNHTIFPWTLPPAMLPENGTCAAPPERIISLIEFPANDITSGITSRF
GNHLSWQTHLGPTVNILEGLRLQLEQVSDPATGEDKVVRITHIG
NTQLGRHETVFFHAEHVRHLKDEVFSCRRRRDAVEIELLVDKPSDTN
NNNTLASSDDDVVVDAKAEEQDGMLADDDQDTLNAETLSSNSLFQSL
LRAVSSVFEPVYTAIPESDPSAGTAKVYSFDAYTSTGPGAYPMPSI
SDTPIPGNPFYNFRNPASNFPWSTVFLAGQACEGPLPASAPREHQVI
VMLRGGCSFSRKLDNIPSFSPHDRALQLVVVLDEPPPPPPPPPANDR
RDVTRPLLDTEQTTPKGMKRLHGIPMVLVRAARGDYELFGHAIGVG MRRKYRVESQGLVVENAVVL
SEQ ID NO: 79 VAN1 MMPRHHSSGFSN VGIAVVVILVLVLWFG
QPRSVASLISLGILSGYDDLKLETVRYYDLSNVQGTARGWEREERILL estExt_GeneWise
GYPRADTFEISPH SEQ ID NO: 81
CVPLRDAEQHLPMFFSHLKNFTYPHNLIDLAFLVSDSKDHTLESLTEH Plus.C_400029
RFQPRATLPPHRK LEAIQADPDPKQPYGEISIIEKDFGQKVNQDVESRHGFAAQASRRKLM
RKRTAIR AQARNWLLSAALRPYHSWVYWRDVDVETAPFTILEDLMRHNKDVIVP SEQ ID NO:
80 NVWRPLPDWLGGEQPYDLNSWQESETALALADTLDEDAVIVEGYAE
YATWRPHLAYLRDPYGDPDMEMEIDGVGGVSILAKAKVFRAGVHFPA
FSFEKHAETEGFGKMAKRMHFSVVGLPHYTIWHLYEPSVDDIKHMEE
MERERIAREKEEEERKKKEAQIKEEFGDANSQWEQDKQQMQDLKLQ
DRGGDKEAAAAGVNQGAAAKAAGAMEGQKN SEQ ID NO: 82 VAN1 alternative1
MLLPKGGLDWRS FILLVGITGLILLLW
RGVSTSASEMQSFYCWGPAKPPMEMSPNEHNRWNGHLQTPVIFNH estExt_GeneWise
ARAQIPPTR SEQ ID NO: 84
HAPVEVNSSTIEHVDLNPINSTKQAVTKEERILILTPLKDAAPYLSKYF Plus.C_230146
ALWNAVTRTR ELLAELTYPHRLIDLAFLVSDSTDDTLAVLASELDRIQKRPDQIPFHSAT SEQ
ID NO: 83 VIEKDFGFKLSQNVEERHSFEAQGPRRKAMGRARNYLLYTALKPEHS
WVYWRDVDIVDSPTGILEDFIAHDRDILVPNIWFHRYRDGVDIEGRFD
YNSWVESDKGRKLANSLDKDVVLAEGYKQYDTGRTYMAKMGDWRE
NKDVELELDGIGGVNILVKADVHRSGINFPCYAFENQAETEGFAKMAK
RAGYEVYGLPNYVVWHIDTEEKGGNA SEQ ID NO: 85 VAN1 alternative2
MARPMGSVRLKK LILGAVLCIFIIIFLV
SPSSPASASRLSIVSAQHHLSPPTSPYQSPRSGAVQGPPPVTRYNLN
e_gw1.5.262.1 ANPST SEQ ID NO: 87
KVTVTSDPVRNQEHILILTPMARFYQEYWDNLLRLNYPHELITLGFILP SEQ ID NO: 86
KTKEGNQATSMLQKQIQKTQNYGPEKDRFKSIIILRQDFDPAVVSQDE
SERHKLANQKARREVMAKARNSLLFTTLGPSTSWVLWLDADITETAP
TLIQDLASHDKPIIVANCFQKYYDPESKKMAERPYDFNSWQDSETALK
MAEQMGPDDILLEGYAEMATYRTLLAYMSTPGGSKDLVVPLDGVGG
TALLVKADVHRDGAMFPPFAFYHLIESEGFAKMAKRLGWQPYGLPNY KVYHYNE SEQ ID NO:
88 Other01 MHFAYPSRKSSN IGIVLFLVLATLWFF
SNPRVPRPDPERVPSGRPPVVLVTVIDPTQYPNAYLKTIKENREQYAA estExt_GeneWise
PPPFRPRSTRLPG SEQ ID NO: 90
KHGYEAFIVKAYDYDTQGAPQSWSKLMAMRHALTKFPECRFVWYLD Plus.C_150339
LRRSRIKT QDAYIMDMSKSLEEQLLNRQKLESLMIKNYPVVPPDSIIKTFSHLRPDE SEQ ID
NO: 89 VDLIVSQDSSGLVAGSVVVRNSQWSKFLLETWMDPLYRSYNFQKAE
RHALEHIVQWHPTILSKLALVPQRTLGPYTRTDQGDAYQDGDFVVMF
TGCTKSGEQSCETVSASYYQKWSSSL SEQ ID NO: 91 Other02 MSLSRSPSPVPG
ILLPLIIICTIVAYYG THEAPGFVHWWRRISMGGGGEKFVIILGANVGGGVMEWKGAREWAI
fgenesh5_pg.C_ GGWSSPGLNINS SEQ ID NO: 93
ERDSVRNKRKYATRWGYDLEIVDMKTKKRYAHEWRESWEKVDFIRA scaffold_5000342
GRSSPSNAAGSS AMRKYPKAEWFWWLDLNTYVMEPSYSLQRHLFNHLDRHVYRDINVF
VSWESAKMRKQG NPLNITHPPTEEYLDAEARSPVGDGNINSVNLMLTQDCSGFNLGSFFI
ANGYPSFSTQNQ RRSAWTEQLLDIWWDPVLYEQKHMEWEHKEQDALEQLYRTQPWIR
GFFTRHMRRISSS CHTGFLPQRLINSFPPAACADESGLNNTRIHYNEKDRDFVVNMAGCE
LPRFAAGPGNTYA WGRDCWGEMYHYREFSYWLNRNPWELFKEEIVAVIWYKLTGQRVKL
EREKYERGGHSP SEQ ID NO: 94 HAGGGRLRAFLA RIGRRLKWR SEQ ID NO: 92
Putative transmembrane domains are underlined. In KRE2, the stem
domain enabling Golgi localization is underlined and
double-underlined. Other01 and Other02 are putative
mannosylation-related proteins.
[0108] Uncharacterized sequences may be tested for use as targeting
peptides by expressing proteins in the glycosylation pathway in a
host cell, where one of the proteins contains the uncharacterized
sequence as the sole targeting peptide, and measuring the glycans
produced in view of the cytoplasmic localization of glycan
biosynthesis (e.g. as in Schwientek JBC 1996 3398), or by
expressing a fluorescent reporter protein fused with the targeting
peptide, and analyzing the localization of the protein in the Golgi
by immunofluorescence or by fractionating the cytoplasmic membranes
of the Golgi and measuring the location of the protein.
[0109] The targeting peptide may include a stem domain. In certain
embodiments, the stem domain is from an
N-acetylglucosaminyltransferase I enzyme or an
N-acetylglucosaminyltransferase II enzyme. In particular
embodiments, the stem domain is from a human
N-acetylglucosaminyltransferase I enzyme or a human
N-acetylglucosaminyltransferase II enzyme. The sequence
corresponding to the stem domain from human
N-acetylglucosaminyltransferase I enzyme is SEQ ID NO: 34. The
sequence corresponding to the stem domain from human
N-acetylglucosaminyltransferase II enzyme is residues 30-85 of SEQ
ID NO: 20.
[0110] The targeting peptide may include a transmembrane domain. A
"transmembrane domain" refers to any sequence of amino acid
residues that is thermodynamically stable in a membrane as a
three-dimensional structure. In embodiments where the targeting
peptide also includes a stem domain, the transmembrane domain is
N-terminal to the stem domain. In certain embodiments, the
transmembrane domain is from an N-acetylglucosaminyltransferase I
enzyme or an N-acetylglucosaminyltransferase II enzyme. In
particular embodiments, the transmembrane domain is from a
human N-acetylglucosaminyltransferase I enzyme or a human
N-acetylglucosaminyltransferase II enzyme. The sequence
corresponding to the transmembrane domain from human
N-acetylglucosaminyltransferase I enzyme is residues 7-29 of SEQ ID
NO: 1. The sequence corresponding to the transmembrane domain from
human N-acetylglucosaminyltransferase II enzyme is residues 10-29
of SEQ ID NO: 20.
[0111] The targeting peptide may include a cytoplasmic domain. The
term "cytoplasmic domain" refers to an amino acid sequence that is
thermodynamically stable in a cytoplasmic environment as a
three-dimensional structure. In embodiments where the targeting
peptide also includes a stem domain, the cytoplasmic domain is
N-terminal to the stem domain. In embodiments where the targeting
peptide also includes a transmembrane domain, the cytoplasmic
domain is N-terminal to the transmembrane domain. In certain
embodiments, the cytoplasmic domain is from an
N-acetylglucosaminyltransferase I enzyme or an
N-acetylglucosaminyltransferase II enzyme. In particular
embodiments, the cytoplasmic domain is from a human
N-acetylglucosaminyltransferase I enzyme or a human
N-acetylglucosaminyltransferase II enzyme. The sequence
corresponding to the cytoplasmic domain from human
N-acetylglucosaminyltransferase I enzyme is residues 1-6 of SEQ ID
NO: 1. The sequence corresponding to the cytoplasmic domain from
human N-acetylglucosaminyltransferase II enzyme is residues 1-9 of
SEQ ID NO: 20.
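As a minimal sketch of how the residue ranges given above for human
GnTII (cytoplasmic 1-9, transmembrane 10-29, stem 30-85) map onto a
sequence, the following treats the ranges as 1-based and inclusive.
The function names and the placeholder sequence are hypothetical;
the placeholder is not SEQ ID NO: 20.

```python
# Illustrative sketch; the placeholder sequence below is NOT SEQ ID NO: 20.
from typing import Dict, Tuple

# 1-based, inclusive residue ranges quoted in the text for human GnTII
GNTII_DOMAINS: Dict[str, Tuple[int, int]] = {
    "cytoplasmic": (1, 9),
    "transmembrane": (10, 29),
    "stem": (30, 85),
}


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


def targeting_peptide(sequence: str, domains: Dict[str, Tuple[int, int]]) -> str:
    """Concatenate the listed domains in N-terminal to C-terminal order."""
    ordered = sorted(domains.values())  # sort by start position
    return "".join(extract_domain(sequence, s, e) for s, e in ordered)


placeholder = "M" + "A" * 99  # hypothetical 100-residue stand-in
peptide = targeting_peptide(placeholder, GNTII_DOMAINS)
print(len(peptide))
```

Because the three ranges are contiguous, the concatenated targeting
peptide in this sketch simply spans residues 1 through 85, with the
cytoplasmic domain N-terminal to the transmembrane domain and the
transmembrane domain N-terminal to the stem domain.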
[0112] In certain embodiments, the recombinant protein contains a
human GnTII catalytic domain N-terminal to a human GnTI catalytic
domain with a spacer sequence containing human GnTI stem domain
sequence in between the catalytic domains. In this embodiment, the
recombinant protein also includes a targeting peptide N-terminal to
the GnTII catalytic domain with cytoplasmic, transmembrane, and
stem domains from human GnTII. The sequence of the recombinant
protein in this embodiment is at least 70%, at least 75%, at least
80%, at least 85%, at least 90%, at least 95%, at least 96%, at
least 97%, at least 98%, at least 99%, or 100% identical to SEQ ID
NO: 95, and the sequence of a possible cDNA encoding the
recombinant protein of this embodiment is SEQ ID NO: 96.
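The percent identity thresholds recited above can be illustrated
with a simple calculation. This passage does not state how identity
is computed, so the sketch assumes two sequences that have already
been aligned to equal length with "-" as the gap character, and it
counts identical columns over all aligned columns. Other definitions
exist and may give different values; all names are hypothetical.

```python
# Illustrative sketch; the identity definition used here is an assumption.
from typing import Optional

THRESHOLDS = (70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float) -> Optional[int]:
    """Return the largest recited threshold that the identity satisfies."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
print(identity, highest_threshold_met(identity))
```

In the toy example, nine of ten aligned positions match, so the pair
satisfies the "at least 90%" threshold but not "at least 95%".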
[0113] In other embodiments, the recombinant protein contains a
human GnTII catalytic domain N-terminal to a human GnTI catalytic
domain with a spacer sequence. The spacer sequence may include,
without limitation, a sequence that is at least 70%, at least 75%,
at least 80%, at least 85%, at least 90%, at least 95%, at least
96%, at least 97%, at least 98%, at least 99%, or 100% identical to
SEQ ID NOs: 118, 120, 122, or 124. In this embodiment, the
recombinant protein also includes a targeting peptide N-terminal to
the GnTII catalytic domain with cytoplasmic, transmembrane, and
stem domains from human GnTII. Accordingly, in certain embodiments,
the sequence of the recombinant protein is at least 70%, at least
75%, at least 80%, at least 85%, at least 90%, at least 95%, at
least 96%, at least 97%, at least 98%, at least 99%, or 100%
identical to a sequence selected from SEQ ID NOs: 119, 121, 123,
and 125. In certain embodiments, the sequence of a possible cDNA
encoding the recombinant protein of SEQ ID NO: 119 is SEQ ID NO:
141. In other embodiments, the sequence of a possible cDNA encoding
the recombinant protein of SEQ ID NO: 121 is SEQ ID NO: 139. In
still other embodiments, the sequence of a possible cDNA encoding
the recombinant protein of SEQ ID NO: 123 is SEQ ID NO: 143. In
further embodiments, the sequence of a possible cDNA encoding the
recombinant protein of SEQ ID NO: 125 is SEQ ID NO: 145.
[0114] Production of Recombinant Proteins of the Invention
[0115] Another aspect of the invention includes isolated
polynucleotides encoding the recombinant proteins of the invention.
As used herein, the terms "polynucleotide," "nucleic acid
sequence," "sequence of nucleic acids," and variations thereof
shall be generic to polydeoxyribonucleotides (containing
2-deoxy-D-ribose), to polyribonucleotides (containing D-ribose), to
any other type of polynucleotide that is an N-glycoside of a purine
or pyrimidine base, and to other polymers containing
non-nucleotidic backbones, provided that the polymers contain
nucleobases in a configuration that allows for base pairing and
base stacking, as found in DNA and RNA. Thus, these terms include
known types of nucleic acid sequence modifications, for example,
substitution of one or more of the naturally-occurring nucleotides
with an analog; inter-nucleotide modifications, such as, for
example, those with uncharged linkages (e.g., methyl phosphonates,
phosphotriesters, phosphoramidates, carbamates, etc.), with
negatively charged linkages (e.g., phosphorothioates,
phosphorodithioates, etc.), and with positively charged linkages
(e.g., aminoalkylphosphoramidates, aminoalkylphosphotriesters);
those containing pendant moieties, such as, for example, proteins
(including nucleases, toxins, antibodies, signal peptides,
poly-L-lysine, etc.); those with intercalators (e.g., acridine,
psoralen, etc.); and those containing chelators (e.g., metals,
radioactive metals, boron, oxidative metals, etc.). As used herein,
the symbols for nucleotides and polynucleotides are those
recommended by the IUPAC-IUB Commission on Biochemical Nomenclature
(Biochem. 9:4022, 1970).
[0116] Sequences of the isolated polynucleotides are prepared by
any suitable method known to those of ordinary skill in the art,
including, for example, direct chemical synthesis or cloning. For
direct chemical synthesis, formation of a polymer of nucleic acids
typically involves sequential addition of 3'-blocked and 5'-blocked
nucleotide monomers to the terminal 5'-hydroxyl group of a growing
nucleotide chain, where each addition is effected by nucleophilic
attack of the terminal 5'-hydroxyl group of the growing chain on
the 3'-position of the added monomer, which is typically a
phosphorus derivative, such as a phosphotriester, phosphoramidite,
or the like. Such methodology is known to those of ordinary skill
in the art and is described in the pertinent texts and literature
[e.g., in Matteucci et al., (1980) Tetrahedron Lett 21:719-722;
U.S. Pat. Nos. 4,500,707; 5,436,327; and 5,700,637]. In addition,
the desired sequences may be isolated from natural sources by
splitting DNA using appropriate restriction enzymes, separating the
fragments using gel electrophoresis, and thereafter, recovering the
desired nucleic acid sequence from the gel via techniques known to
those of ordinary skill in the art, such as utilization of
polymerase chain reactions (PCR; e.g., U.S. Pat. No.
4,683,195).
[0117] Each polynucleotide of the invention can be incorporated
into an expression vector. "Expression vector" or "vector" refers
to a compound and/or composition that transduces, transforms, or
infects a host cell, thereby causing the cell to express nucleic
acids and/or proteins other than those native to the cell, or in a
manner not native to the cell. An "expression vector" contains a
sequence of nucleic acids (ordinarily RNA or DNA) to be expressed
by the host cell. Optionally, the expression vector also comprises
materials to aid in achieving entry of the nucleic acid into the
host cell, such as a virus, liposome, protein coating, or the like.
The expression vectors contemplated for use in the present
invention include those into which a nucleic acid sequence can be
inserted, along with any desired or required operational elements.
Further, the expression vector must be one that can be transferred
into a host cell and replicated therein. Certain expression vectors
are plasmids, particularly those with restriction sites that have
been well documented and that contain the operational elements
desired or required for transcription of the nucleic acid sequence.
Such plasmids, as well as other expression vectors, are well known
to those of ordinary skill in the art.
[0118] Incorporation of the individual polynucleotides may be
accomplished through known methods that include, for example, the
use of restriction enzymes (such as BamHI, EcoRI, HhaI, XhoI, XmaI,
and so forth) to cleave specific sites in the expression vector,
e.g., plasmid. The restriction enzyme produces single-stranded ends
that may be annealed to a polynucleotide having, or synthesized to
have, a terminus with a sequence complementary to the ends of the
cleaved expression vector. The annealed ends are then joined using
an appropriate enzyme, e.g., DNA ligase. As will be appreciated by
those of ordinary skill in the art, both the expression vector and
the desired polynucleotide are often cleaved with the same
restriction enzyme, thereby assuring that the ends of the
expression vector and the ends of the polynucleotide are
complementary to each other. In addition, DNA linkers may be used
to facilitate linking of nucleic acid sequences into an expression
vector.
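The cleavage step described above can be sketched for two of the
named enzymes. The recognition sites used (EcoRI: G^AATTC, BamHI:
G^GATCC) are standard, but the function names and the example
sequence are hypothetical, and the sketch models only the top strand
of a linear molecule.

```python
# Illustrative sketch; top strand only, linear DNA, hypothetical names.
from typing import Dict, List, Tuple

# Recognition site and cut position within the site on the top strand
SITES: Dict[str, Tuple[str, int]] = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "BamHI": ("GGATCC", 1),  # G^GATCC
}


def digest(dna: str, enzyme: str) -> List[str]:
    """Cut the top strand at every recognition site of the enzyme."""
    site, offset = SITES[enzyme]
    dna = dna.upper()
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def overhang(enzyme: str) -> str:
    """Single-stranded 5' overhang left by a staggered cut in a palindromic site."""
    site, offset = SITES[enzyme]
    return site[offset:len(site) - offset]


print(digest("TTTTGAATTCAAAAGAATTCCCCC", "EcoRI"))
print(overhang("EcoRI"), overhang("BamHI"))
```

Cutting both the vector and the insert with the same enzyme leaves
the same overhang on each (AATT for EcoRI in this sketch), which is
why their ends are complementary and can be annealed and ligated.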
[0119] A series of individual polynucleotides can also be combined
by utilizing methods that are known to those having ordinary skill
in the art (e.g., U.S. Pat. No. 4,683,195).
[0120] For example, each of the desired polynucleotides can be
initially generated in a separate PCR. Thereafter, specific primers
are designed such that the ends of the PCR products contain
complementary sequences. When the PCR products are mixed,
denatured, and reannealed, the strands having the matching
sequences at their 3' ends overlap and can act as primers for each
other. Extension of this overlap by DNA polymerase produces a
molecule in which the original sequences are "spliced" together. In
this way, a series of individual polynucleotides may be "spliced"
together and subsequently transduced into a host cell
simultaneously. Thus, expression of each of the plurality of
polynucleotides is effected.
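The splicing of PCR products through matching ends can be sketched
as a string operation. This is a simplification under stated
assumptions: only the top strand of each product is modeled, the
overlap must match exactly, and the function names, example
fragments, and minimum overlap length are hypothetical.

```python
# Illustrative sketch; top strands only, exact-match overlaps, hypothetical names.
from functools import reduce
from typing import List


def splice_by_overlap(left: str, right: str, min_overlap: int = 6) -> str:
    """Join two fragments where the end of `left` equals the start of `right`."""
    max_len = min(len(left), len(right))
    # Prefer the longest shared end so the overlap is counted only once
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("no overlap of the required length")


def splice_all(fragments: List[str], min_overlap: int = 6) -> str:
    """Splice a series of fragments in order, pairwise from left to right."""
    return reduce(lambda a, b: splice_by_overlap(a, b, min_overlap), fragments)


product = splice_all(["ATGGCCAAAGGTGGCGGT", "GGTGGCGGTTCTGAATAA"])
print(product)
```

In the example, the two fragments share a nine-nucleotide end, so
the spliced product contains that shared stretch once, flanked by
the unique portions of each fragment.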
[0121] Individual polynucleotides, or "spliced" polynucleotides,
are then incorporated into an expression vector. The invention is
not limited with respect to the process by which the polynucleotide
is incorporated into the expression vector. Those of ordinary skill
in the art are familiar with the necessary steps for incorporating
a polynucleotide into an expression vector. A typical expression
vector contains the desired polynucleotide preceded by one or more
regulatory regions, along with a ribosome binding site, e.g., a
nucleotide sequence that is 3-9 nucleotides in length and located
3-11 nucleotides upstream of the initiation codon in E. coli. See
Shine and Dalgarno (1975) Nature 254(5495):34-38 and Steitz (1979)
Biological Regulation and Development (ed. Goldberger, R. F.),
1:349-399 (Plenum, New York).
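The length and spacing figures given above for a ribosome binding site can be expressed as a small check. The sketch below is an assumption-laden illustration: the function name and example sequence are hypothetical, the sequence is written as the DNA sense strand, and the spacing is measured from the last nucleotide of the binding site to the first nucleotide of the initiation codon.

```python
def rbs_is_well_placed(seq: str, rbs: str, start_codon: str = "ATG") -> bool:
    """Check a ribosome binding site against the stated ranges.

    The site must be 3-9 nt long and end 3-11 nt upstream of the first
    initiation codon that follows it.
    """
    if not 3 <= len(rbs) <= 9:
        return False
    rbs_start = seq.find(rbs)
    if rbs_start < 0:
        return False
    rbs_end = rbs_start + len(rbs)
    codon_start = seq.find(start_codon, rbs_end)
    if codon_start < 0:
        return False
    spacer = codon_start - rbs_end
    return 3 <= spacer <= 11


# Hypothetical leader: AGGAGG followed by a 7-nt spacer and ATG.
good_leader = rbs_is_well_placed("TTAAGGAGGTAATCATATGAAA", "AGGAGG")
# Same site placed directly against the initiation codon.
bad_leader = rbs_is_well_placed("AGGAGGATGAAA", "AGGAGG")
```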
[0122] The term "operably linked" as used herein refers to a
configuration in which a control sequence is placed at an
appropriate position relative to the coding sequence of a DNA
sequence or polynucleotide such that the control sequence directs
the expression of a polypeptide.
[0123] Regulatory regions include, for example, those regions that
contain a promoter and an operator. A promoter is operably linked
to the desired polynucleotide or portion of a polynucleotide
encoding a polypeptide, thereby initiating transcription of the
polynucleotide, or portion of the polynucleotide encoding a
polypeptide, via an RNA polymerase enzyme. An operator is a
sequence of nucleic acids adjacent to the promoter, which contains
a protein-binding domain where a repressor protein can bind. In the
absence of a repressor protein, transcription initiates through the
promoter. When present, the repressor protein specific to the
protein-binding domain of the operator binds to the operator,
thereby inhibiting transcription. In this way, control of
transcription is accomplished, based upon the particular regulatory
regions used and the presence or absence of the corresponding
repressor protein. Examples include lactose promoters (LacI
repressor protein changes conformation when contacted with lactose,
thereby preventing the LacI repressor protein from binding to the
operator) and tryptophan promoters (when complexed with tryptophan,
TrpR repressor protein has a conformation that binds the operator;
in the absence of tryptophan, the TrpR repressor protein has a
conformation that does not bind to the operator). Another example
is the tac promoter (see de Boer et al., (1983) Proc Natl Acad Sci
USA 80(1):21-25). As will be appreciated by those of ordinary skill
in the art, these and other regulatory regions may be used in the
present invention, and the invention is not limited in this
respect.
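The two repressor examples above respond to their small-molecule ligands in opposite ways, which can be summarized as Boolean logic. The Python sketch below is a deliberately reduced model with hypothetical function names. It treats repressor binding as all-or-nothing and ignores catabolite repression, attenuation, and leaky expression.

```python
def lac_transcribed(lactose_present: bool) -> bool:
    """Lactose promoter: the repressor binds the operator unless lactose is present."""
    repressor_on_operator = not lactose_present
    return not repressor_on_operator


def trp_transcribed(tryptophan_present: bool) -> bool:
    """Tryptophan promoter: the repressor binds the operator only with tryptophan."""
    repressor_on_operator = tryptophan_present
    return not repressor_on_operator


# Truth table for both systems, keyed by whether the ligand is present.
truth_table = {
    ligand: (lac_transcribed(ligand), trp_transcribed(ligand))
    for ligand in (False, True)
}
```

In this model the lactose system is switched on by its ligand and the tryptophan system is switched off by its ligand.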
[0124] Examples of certain promoters for linkage to the isolated
polynucleotides encoding the recombinant proteins of the invention
include promoters from the following genes: gpdA, cbh1, Aspergillus
oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase,
Aspergillus niger neutral alpha-amylase, Aspergillus niger acid
stable alpha-amylase, Aspergillus niger glucoamylase (glaA),
Aspergillus awamori glaA, Rhizomucor miehei lipase, Aspergillus
oryzae alkaline protease, Aspergillus oryzae triose phosphate
isomerase, Aspergillus nidulans acetamidase, Aspergillus oryzae
acetamidase, Fusarium oxysporum trypsin-like protease, fungal endo
.alpha.-L-arabinase (abnA), fungal .alpha.-L-arabinofuranosidase A
(abfA), fungal .alpha.-L-arabinofuranosidase B (abfB), fungal
xylanase (xlnA), fungal phytase, fungal ATP-synthetase, fungal
subunit 9 (oliC), fungal triose phosphate isomerase (tpi), fungal
alcohol dehydrogenase (adhA), fungal .alpha.-amylase (amy), fungal
amyloglucosidase (glaA), fungal acetamidase (amdS), fungal
glyceraldehyde-3-phosphate dehydrogenase (gpd), yeast alcohol
dehydrogenase, yeast lactase, yeast 3-phosphoglycerate kinase,
yeast triosephosphate isomerase, bacterial .alpha.-amylase,
bacterial Spo2, and SSO. In certain embodiments, isolated
polynucleotides encoding the recombinant proteins of the invention
are operably linked to a constitutive promoter. In other
embodiments, isolated polynucleotides encoding the recombinant
proteins of the invention are operably linked to an inducible
promoter. In certain preferred embodiments, the inducible promoter
is from a cbh1 gene.
[0125] Although any suitable expression vector may be used to
incorporate the desired sequences, readily available expression
vectors include, without limitation: plasmids, such as pSC101,
pBR322, pBBR1MCS-3, pUR, pEX, pMR100, pCR4, pBAD24, pUC19;
bacteriophages, such as M13 phage and .lamda. phage. Of course,
such expression vectors may only be suitable for particular host
cells. One of ordinary skill in the art, however, can readily
determine through routine experimentation whether any particular
expression vector is suited for any given host cell. For example,
the expression vector can be introduced into the host cell, which
is then monitored for viability and expression of the sequences
contained in the vector. In addition, reference may be made to the
relevant texts and literature, which describe expression vectors
and their suitability to any particular host cell.
[0126] Another aspect of the invention includes host cells
containing expression vectors containing isolated polynucleotides
that encode the recombinant proteins of the invention. "Host cell"
as used herein refers to a living biological cell that can be
transformed via insertion of recombinant DNA or RNA. Such
recombinant DNA or RNA can be in an expression vector. Thus, a host
cell as described herein may be a prokaryotic organism (e.g., an
organism of the kingdom eubacteria) or a eukaryotic cell. As will
be appreciated by one of ordinary skill in the art, a prokaryotic
cell lacks a membrane-bound nucleus, while a eukaryotic cell has a
membrane-bound nucleus. In certain embodiments, host cells used for
production of the recombinant proteins of the invention are fungal
cells such as yeast or filamentous fungi. In other embodiments, the
host cells are mammalian cells. Such cells may be human or
non-human.
[0127] Another aspect of the invention includes methods of
producing the recombinant proteins of the invention. The method
includes the steps of introducing an isolated polynucleotide that
encodes the recombinant protein into a host cell, and culturing the
host cell such that the recombinant protein is expressed. The
method may also include a step of purifying the recombinant protein
from the host cell.
[0128] Methods of producing the recombinant proteins of the
invention may include the introduction or transfer of expression
vectors containing the recombinant polynucleotides of the invention
into the host cell. Such methods for transferring expression
vectors into host cells are well known to those of ordinary skill
in the art. For example, one method for transforming E. coli with
an expression vector involves a calcium chloride treatment where
the expression vector is introduced via a calcium precipitate.
Other salts, e.g., calcium phosphate, may also be used following a
similar procedure. In addition, electroporation (i.e., the
application of current to increase the permeability of cells to
nucleic acid sequences) may be used to transfect the host cell.
Also, microinjection of the nucleic acid sequences provides the
ability to transfect host cells. Other means, such as lipid
complexes, liposomes, and dendrimers, may also be employed. Those
of ordinary skill in the art can transfect a host cell with a
desired sequence using these or other methods.
[0129] The vector may be an autonomously replicating vector, i.e.,
a vector which exists as an extrachromosomal entity, the
replication of which is independent of chromosomal replication,
e.g., a plasmid, an extrachromosomal element, a minichromosome, or
an artificial chromosome. The vector may contain any means for
assuring self-replication. Alternatively, the vector may be one
which, when introduced into the host, is integrated into the genome
and replicated together with the chromosome(s) into which it has
been integrated. Furthermore, a single vector or plasmid or two or
more vectors or plasmids which together contain the total DNA to be
introduced into the genome of the host, or a transposon may be
used.
[0130] The vectors may contain one or more selectable markers which
permit easy selection of transformed hosts. A selectable marker is
a gene, the product of which provides, for example, biocide or
viral resistance, resistance to heavy metals, prototrophy to
auxotrophs, and the like. Selection of bacterial cells may be based
upon antimicrobial resistance that has been conferred by genes such
as the amp, gpt, neo, and hyg genes.
[0131] Suitable markers for yeast hosts are, for example, ADE2,
HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Selectable markers for use
in a filamentous fungal host include, but are not limited to, amdS
(acetamidase), argB (ornithine carbamoyltransferase), bar
(phosphinothricin acetyltransferase), hph (hygromycin
phosphotransferase), niaD (nitrate reductase), pyrG (orotidine
5'-phosphate decarboxylase), sC (sulfate adenyltransferase), and
trpC (anthranilate synthase), as well as equivalents thereof.
Certain markers for use in Aspergillus are the amdS and pyrG genes of
Aspergillus nidulans or Aspergillus oryzae and the bar gene of
Streptomyces hygroscopicus. Certain markers for use in Trichoderma
are bar, pyr4, and amdS.
[0132] The vectors may contain an element(s) that permits
integration of the vector into the host's genome or autonomous
replication of the vector in the cell independent of the
genome.
[0133] For integration into the host genome, the vector may rely on
the gene's sequence or any other element of the vector for
integration of the vector into the genome by homologous or
nonhomologous recombination. Alternatively, the vector may contain
additional nucleotide sequences for directing integration by
homologous recombination into the genome of the host. The
additional nucleotide sequences enable the vector to be integrated
into the host genome at a precise location(s) in the chromosome(s).
To increase the likelihood of integration at a precise location,
the integrational elements may contain a sufficient number of
nucleic acids, such as 100 to 10,000 base pairs, preferably 400 to
10,000 base pairs, and most preferably 800 to 10,000 base pairs,
which are highly homologous with the corresponding target sequence
to enhance the probability of homologous recombination. The
integrational elements may be any sequence that is homologous with
the target sequence in the genome of the host. Furthermore, the
integrational elements may be non-encoding or encoding nucleotide
sequences. On the other hand, the vector may be integrated into the
genome of the host by non-homologous recombination.
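The nested length ranges given above for integrational elements can be captured in a short helper. This is a sketch with a hypothetical function name and hypothetical tier labels. It encodes only the base-pair ranges stated in the text and says nothing about sequence identity or locus-specific effects.

```python
def homology_tier(length_bp: int) -> str:
    """Classify a homologous flank by the length ranges stated in the text."""
    if not 100 <= length_bp <= 10_000:
        return "outside stated range"
    if length_bp >= 800:
        return "most preferred"   # 800 to 10,000 base pairs
    if length_bp >= 400:
        return "preferred"        # 400 to 10,000 base pairs
    return "stated range"         # 100 to 10,000 base pairs


# Hypothetical flank lengths, in base pairs.
tiers = {bp: homology_tier(bp) for bp in (50, 150, 500, 1000, 20_000)}
```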
[0134] For autonomous replication, the vector may further comprise
an origin of replication enabling the vector to replicate
autonomously in the host in question. The origin of replication may
be any plasmid replicator mediating autonomous replication which
functions in a cell. The term "origin of replication" or "plasmid
replicator" is defined herein as a sequence that enables a plasmid
or vector to replicate in vivo. Examples of origins of replication
for use in a yeast host are the 2 micron origin of replication,
ARS1, ARS4, the combination of ARS1 and CEN3, and the combination
of ARS4 and CEN6. Examples of origins of replication useful in a
filamentous fungal cell are AMA1 and ANS1 (Gems et al., 1991;
Cullen et al., 1987; WO 00/24883). Isolation of the AMA1 gene and
construction of plasmids or vectors comprising the gene can be
accomplished according to the methods disclosed in WO 00/24883.
[0135] For other hosts, transformation procedures may be found, for
example, in Jeremiah D. Read, et al., Applied and Environmental
Microbiology, August 2007, p. 5088-5096, for Kluyveromyces, in
Osvaldo Delgado, et al., FEMS Microbiology Letters 132, 1995,
23-26, for Zymomonas, in U.S. Pat. No. 7,501,275 for Pichia
stipitis, and in WO 2008/040387 for Clostridium.
[0136] More than one copy of a gene may be inserted into the host
to increase production of the gene product. An increase in the copy
number of the gene can be obtained by integrating at least one
additional copy of the gene into the host genome or by including an
amplifiable selectable marker gene with the nucleotide sequence
where cells containing amplified copies of the selectable marker
gene, and thereby additional copies of the gene, can be selected
for by cultivating the cells in the presence of the appropriate
selectable agent.
[0137] The procedures used to ligate the elements described above
to construct the recombinant expression vectors of the present
invention are well-known to one skilled in the art (see, e.g.,
Sambrook et al., 1989, supra).
[0138] The host cell is transformed with at least one expression
vector. When only a single expression vector is used (without the
addition of an intermediate), the vector will contain all of the
nucleic acid sequences necessary.
[0139] Once the host cell has been transformed with the expression
vector, the host cell is allowed to grow. Methods of the invention
may include culturing the host cell such that recombinant nucleic
acids in the cell are expressed. For microbial hosts, this process
entails culturing the cells in a suitable medium. Typically, cells
are grown at 35.degree. C. in appropriate media. Certain growth
media in the present invention include, for example, common
commercially-prepared media such as Luria-Bertani (LB) broth,
Sabouraud Dextrose (SD) broth or Yeast medium (YM) broth. Other
defined or synthetic growth media may also be used and the
appropriate medium for growth of the particular host cell will be
known by someone skilled in the art of microbiology or fermentation
science. Temperature ranges and other conditions suitable for
growth are known in the art (see, e.g., Bailey and Ollis 1986).
[0140] Methods for purifying recombinant proteins of the invention
from the host cell are well known in the art (see E. L. V. Harris
and S. Angel, Eds. (1989) Protein Purification Methods: A Practical
Approach, IRL Press, Oxford, England). Such methods include,
without limitation, preparative disc-gel electrophoresis,
isoelectric focusing, high-performance liquid chromatography
(HPLC), reversed-phase HPLC, gel filtration, ion exchange and
partition chromatography, and countercurrent distribution, and
combinations thereof. In certain embodiments, the recombinant
proteins carry additional sequence tags to facilitate purification.
Such tags include epitope tags and protein tags. Non-limiting
examples of epitope tags include c-myc, hemagglutinin (HA),
polyhistidine (6.times.-HIS), GLU-GLU, and DYKDDDDK (FLAG) (SEQ ID
NO: 117) epitope tags. Epitope tags can be added to peptides by a
number of established methods. DNA sequences of epitope tags can be
inserted into recombinant protein coding sequences as
oligonucleotides or through primers used in PCR amplification. As
an alternative, peptide-coding sequences can be cloned into
specific vectors that create fusions with epitope tags; for
example, pRSET vectors (Invitrogen Corp., San Diego, Calif.).
Non-limiting examples of protein tags include
glutathione-S-transferase (GST), green fluorescent protein (GFP),
and maltose binding protein (MBP). Protein tags are attached to
peptides or polypeptides by several well-known methods. In one
approach, the coding sequence of a polypeptide or peptide can be
cloned into a vector that creates a fusion between the polypeptide
or peptide and a protein tag of interest. Suitable vectors include,
without limitation, the exemplary plasmids, pGEX (Amersham
Pharmacia Biotech, Inc., Piscataway, N.J.), pEGFP (CLONTECH
Laboratories, Inc., Palo Alto, Calif.), and pMAL.TM. (New England
BioLabs, Inc., Beverly, Mass.). Following expression, the epitope
or protein-tagged polypeptide or peptide can be purified from a
crude lysate of the host cell by chromatography on an appropriate
solid-phase matrix. In some cases, it may be preferable to remove
the epitope or protein tag (i.e., via protease cleavage) following
purification.
[0141] Methods of Producing Complex Glycans
[0142] Another aspect of the invention includes methods of
producing a complex N-glycan, including the steps of providing a
host cell, where the host cell contains a polynucleotide encoding a
fusion protein comprising an N-acetylglucosaminyltransferase I
catalytic domain and an N-acetylglucosaminyltransferase II
catalytic domain and culturing the host cell such that the fusion
protein is expressed, where the fusion protein catalyzes the
transfer of N-acetylglucosamine to a terminal Man.alpha.3 residue
and N-acetylglucosamine to a terminal Man.alpha.6 residue of an
acceptor glycan to produce a complex N-glycan. In certain
embodiments, this aspect includes methods of producing human-like
N-glycans in a Trichoderma cell.
[0143] As used herein, the term "complex N-glycan" refers to an
N-glycan comprising a terminal GlcNAc.sub.2Man.sub.3 structure.
[0144] The complex N-glycan includes any glycan having the formula
[GlcNAc.beta.2].sub.zMan.alpha.3([GlcNAc.beta.2].sub.wMan.alpha.6)Man{.beta.4GlcNAc.beta.(Fuc.alpha.x).sub.n[4GlcNAc].sub.m}.sub.p,
where n,
m, and p are 0 or 1, indicating presence or absence of part of the
molecule, with the provision that when m is 0, then n is 0 (fucose
is a branch linked to the GlcNAc), where x is 3 or 6, where ( )
defines a branch in the structure, where [ ] defines a part of the
glycan structure either present or absent in a linear sequence, and
where z and w are 0 or 1. Preferably w and z are 1. In certain
embodiments, the complex N-glycan includes
GlcNAc.beta.2Man.alpha.3(GlcNAc.beta.2Man.alpha.6)Man.beta.4GlcNAc.beta.4GlcNAc,
GlcNAc.beta.2Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4GlcNAc,
GlcNAc.beta.2Man.alpha.3(GlcNAc.beta.2Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.6)GlcNAc,
GlcNAc.beta.2Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4(Fuc.alpha.6)GlcNAc,
and Man.alpha.3(Man.alpha.6)Man.beta.4GlcNAc.beta.4GlcNAc.
In certain embodiments, the complex N-glycans are fungal
non-fucosylated GlcNAcMan3, GlcNAc2Man3, and/or Man3.
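The way the indices z and w select the individual structures listed above can be sketched in Python. The function name is hypothetical, and the sketch makes a simplifying assumption: it fixes p = m = 1 (a full GlcNAc.beta.4GlcNAc core) and writes the optional fucose in the position used in the listed examples. It reproduces the naming pattern only and is not a general glycan parser.

```python
def complex_n_glycan(z: int, w: int, fucosylated: bool = False, x: int = 6) -> str:
    """Build a structure name from the formula, assuming p = m = 1.

    z and w toggle the GlcNAc.beta.2 residues on the Man.alpha.3 and
    Man.alpha.6 arms; `fucosylated` corresponds to n = 1 with linkage x.
    """
    if z not in (0, 1) or w not in (0, 1):
        raise ValueError("z and w must be 0 or 1")
    if x not in (3, 6):
        raise ValueError("x must be 3 or 6")
    arm3 = "GlcNAc.beta.2" * z + "Man.alpha.3"
    arm6 = "GlcNAc.beta.2" * w + "Man.alpha.6"
    fucose = f"(Fuc.alpha.{x})" if fucosylated else ""
    return f"{arm3}({arm6})Man.beta.4GlcNAc.beta.4{fucose}GlcNAc"


# The preferred case (w = z = 1) and the trimannosyl core (w = z = 0).
biantennary = complex_n_glycan(1, 1)
core_only = complex_n_glycan(0, 0)
```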
[0145] In certain embodiments, the method of producing a complex
N-glycan will generate a mixture of different glycans. The complex
N-glycan may constitute at least 1%, at least 3%, at least 5%, at
least 10%, at least 15%, at least 20%, at least 25%, at least 50%,
or at least 75% or more of such a glycan mixture.
[0146] The acceptor glycan, and thus the complex N-glycan, may be
attached to a molecule such as an amino acid, a peptide, or a
polypeptide. In certain embodiments, the amino acid derivative is
an asparagine residue. The asparagine residue may be in
aminoglycosidic linkage from the side-chain amide (a biologic
mammalian polypeptide N-glycan linkage structure) and may be part
of a peptide chain such as a dipeptide, an oligopeptide, or a
polypeptide. The glycan may be a reducing end derivative such as an
N-, O-, or C-linked, preferably glycosidic, derivative of the
reducing GlcNAc or Man, such as a spacer or terminal organic
residue with a certain glycan linked structure selected from the
group of an amino acid, alkyl, heteroalkyl, acyl, alkyloxy, aryl,
arylalkyl, and heteroarylalkyl. The spacer may be further linked to
a polyvalent carrier or a solid phase. In certain embodiments,
alkyl-containing structures include methyl, ethyl, propyl, and
C4-C26 alkyls, lipids such as glycerolipids, phospholipids,
dolichol-phospholipids and ceramides and derivatives. The reducing
end may also be derivatized by reductive amination to a secondary
amine linkage or a derivative structure. Certain carriers include
biopoly- or oligomers such as (poly)peptides, poly(saccharides)
such as dextran, cellulose, amylose, or glycosaminoglycans, and
other organic polymers or oligomers such as plastics including
polyethylene, polypropylene, polyamides (e.g., nylon),
polystyrene, polyacrylamide, and polylactic acids, dendrimers such
as PAMAM, Starburst or Starfish dendrimers, or polylysine, and
polyalkylglycols such as polyethylene glycol (PEG). Solid phases
may include microtiter wells, silica particles, glass, metal
(including steel, gold and silver), polymer beads such as
polystyrene or resin beads, polylactic acid beads, polysaccharide
beads or organic spacers containing magnetic beads.
[0147] In certain embodiments, the acceptor glycan is attached to a
heterologous polypeptide. In certain embodiments, the heterologous
polypeptide is a therapeutic protein. Therapeutic proteins may
include monoclonal antibodies, erythropoietins, interferons, growth
hormones, enzymes, or blood-clotting factors and may be useful in
the treatment of humans or animals. For example, the acceptor
glycan may be attached to a therapeutic protein such as
rituximab.
[0148] The acceptor glycan may be any of the acceptor glycans
described in the section entitled, "Recombinant Proteins of the
Invention."
[0149] In certain embodiments, the acceptor glycan may be Man5. In
such embodiments, a Man5 expressing T. reesei strain is transformed
with a GnTII/GnTI fusion enzyme by random integration or by
targeted integration into a site known not to affect Man5
glycosylation. Strains that produce GlcNAcMan5 are selected. The
selected strains are further transformed with a catalytic domain of
a mannosidase II-type mannosidase capable of cleaving Man5
structures to generate GlcNAcMan3. In certain embodiments
mannosidase II-type enzymes belong to glycoside hydrolase family 38
(cazy.org/GH38_all.html). Characterized enzymes include enzymes
listed in cazy.org/GH38_characterized.html. Especially useful
enzymes are Golgi-type enzymes that cleave glycoproteins, such as
those of subfamily .alpha.-mannosidase II (Man2A1;ManA2). Examples
of such enzymes include human enzyme AAC50302, D. melanogaster
enzyme (Van den Elsen J. M. et al (2001) EMBO J. 20: 3008-3017),
those with the 3D structure according to PDB-reference 1HTY, and
others referenced with the catalytic domain in PDB. For cytoplasmic
expression, the catalytic domain of the mannosidase is typically
fused with an N-terminal targeting peptide or expressed with
endogenous animal or plant Golgi targeting structures of animal or
plant mannosidase II enzymes. After transformation with the
catalytic domain of a mannosidase II-type mannosidase, a strain
effectively producing GlcNAc2Man3 is selected.
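The order of glycan intermediates in the two-step strain construction above can be traced with a toy model. The Python sketch below is an assumption-based illustration with hypothetical names. Each glycan is reduced to a pair (number of antennary GlcNAc residues, number of mannose residues), and each enzyme acts only on the single substrate named in its comment.

```python
from typing import Tuple

Glycan = Tuple[int, int]  # (antennary GlcNAc count, mannose count)


def name(glycan: Glycan) -> str:
    """Render a glycan pair in the shorthand used in the text."""
    glcnac, man = glycan
    prefix = {0: "", 1: "GlcNAc", 2: "GlcNAc2"}[glcnac]
    return f"{prefix}Man{man}"


def gnt1(glycan: Glycan) -> Glycan:
    # GnTI activity: adds the first GlcNAc to Man5 or Man3.
    return (1, glycan[1]) if glycan[0] == 0 and glycan[1] in (3, 5) else glycan


def mannosidase2(glycan: Glycan) -> Glycan:
    # Mannosidase II activity: trims GlcNAcMan5 to GlcNAcMan3.
    return (1, 3) if glycan == (1, 5) else glycan


def gnt2(glycan: Glycan) -> Glycan:
    # GnTII activity: adds the second GlcNAc to GlcNAcMan3 only.
    return (2, 3) if glycan == (1, 3) else glycan


def fusion_enzyme(glycan: Glycan) -> Glycan:
    # GnTII/GnTI fusion: both activities are available in one protein.
    return gnt2(gnt1(glycan))


after_fusion = fusion_enzyme((0, 5))             # first transformation
after_mannosidase = mannosidase2(after_fusion)   # second transformation
final = fusion_enzyme(after_mannosidase)         # fusion acts on trimmed glycan
pathway = [name(g) for g in ((0, 5), after_fusion, after_mannosidase, final)]
```

In this model the fusion enzyme alone stops at GlcNAcMan5, which is why strains producing that intermediate are selected before the mannosidase II catalytic domain is introduced.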
[0150] Host Cells
[0151] The methods of producing a complex N-glycan include a first
step of providing a host cell. Any prokaryotic or eukaryotic host
cell may be used in the present invention so long as it remains
viable after being transformed with a sequence of nucleic acids.
Preferably, the host cell is not adversely affected by the
transduction of the necessary nucleic acid sequences, the
subsequent expression of recombinant proteins, or the resulting
intermediates. Suitable eukaryotic cells include, but are not
limited to, fungal, plant, insect or mammalian cells.
[0152] In certain embodiments, the host is a fungal strain. "Fungi"
as used herein includes the phyla Ascomycota, Basidiomycota,
Chytridiomycota, and Zygomycota (as defined by Hawksworth et al.,
In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition,
1995, CAB International, University Press, Cambridge, UK) as well
as the Oomycota (as cited in Hawksworth et al., 1995, supra, page
171) and all mitosporic fungi (Hawksworth et al., 1995, supra).
[0153] In particular embodiments, the fungal host is a yeast
strain. "Yeast" as used herein includes ascosporogenous yeast
(Endomycetales), basidiosporogenous yeast, and yeast belonging to
the Fungi Imperfecti (Blastomycetes). Since the classification of
yeast may change in the future, for the purposes of this invention,
yeast shall be defined as described in Biology and Activities of
Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds,
Soc. App. Bacteriol. Symposium Series No. 9, 1980).
[0154] In certain embodiments, the yeast host is a Candida,
Hansenula, Kluyveromyces, Pichia, Saccharomyces,
Schizosaccharomyces, or Yarrowia strain.
[0155] In certain embodiments, the yeast host is Saccharomyces
cerevisiae, Kluyveromyces lactis, Pichia pastoris, Candida
albicans, Hansenula polymorpha, Schizosaccharomyces, or
Yarrowia.
[0156] In another particular embodiment, the fungal host cell is a
filamentous fungal strain. "Filamentous fungi" include all
filamentous forms of the subdivision Eumycota and Oomycota (as
defined by Hawksworth et al., 1995, supra). The filamentous fungi
are generally characterized by a mycelial wall composed of chitin,
cellulose, glucan, chitosan, mannan, and other complex
polysaccharides. Vegetative growth is by hyphal elongation and
carbon catabolism is obligately aerobic. In contrast, vegetative
growth by yeasts such as Saccharomyces cerevisiae is by budding of
a unicellular thallus and carbon catabolism may be
fermentative.
[0157] The filamentous fungal host cell may be, for example, an
Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora,
Neurospora, Penicillium, Scytalidium, Thielavia, Tolypocladium, or
Trichoderma strain.
[0158] In certain embodiments, the filamentous fungal host cell is
a Trichoderma sp., Acremonium, Aspergillus, Aureobasidium,
Cryptococcus, Chrysosporium, Chrysosporium lucknowense,
Filibasidium, Fusarium, Gibberella, Magnaporthe, Mucor,
Myceliophthora, Myrothecium, Neocallimastix, Neurospora,
Paecilomyces, Penicillium, Piromyces, Schizophyllum, Talaromyces,
Thermoascus, Thielavia, or Tolypocladium strain.
[0159] In certain embodiments, the host cell is a mammalian cell.
Such cells may be human or non-human.
[0160] In other certain embodiments, the host cell is prokaryotic,
and in certain embodiments, the prokaryotes are E. coli, Bacillus
subtilis, Zymomonas mobilis, Clostridium sp., Clostridium
phytofermentans, Clostridium thermocellum, Clostridium
beijerinckii, Clostridium acetobutylicum (Moorella thermoacetica),
Thermoanaerobacterium saccharolyticum, or Klebsiella oxytoca. In
other embodiments, the prokaryotic host cells are Carboxydocella
sp., Corynebacterium glutamicum, Enterobacteriaceae, Erwinia
chrysanthemi, Lactobacillus sp., Pediococcus acidilactici,
Rhodopseudomonas capsulata, Streptococcus lactis, Vibrio furnissii,
Vibrio furnissii M1, Caldicellulosiruptor saccharolyticus, or
Xanthomonas campestris. In other embodiments, the host cells are
cyanobacteria. Additional examples of bacterial host cells include,
without limitation, those species assigned to the Escherichia,
Enterobacter, Azotobacter, Erwinia, Bacillus, Pseudomonas,
Klebsiella, Proteus, Salmonella, Serratia, Shigella, Rhizobia,
Vitreoscilla, Synechococcus, Synechocystis, and Paracoccus
taxonomical classes.
[0161] In methods of the invention for producing a complex
N-glycan, the methods include a step of culturing the host cell
such that the fusion protein is expressed. For microbial hosts,
this process entails culturing the cells in a suitable medium.
Typically, cells are grown at 35.degree. C. in appropriate media.
Certain growth media in the present invention include, for example,
common commercially-prepared media such as Luria-Bertani (LB)
broth, Sabouraud Dextrose (SD) broth or Yeast medium (YM) broth.
Other defined or synthetic growth media may also be used and the
appropriate medium for growth of the particular host cell will be
known by someone skilled in the art of microbiology or fermentation
science. Temperature ranges and other conditions suitable for
growth are known in the art (see, e.g., Bailey and Ollis 1986). In
certain embodiments the pH of cell culture is between 3.5 and 7.5,
between 4.0 and 7.0, between 4.5 and 6.5, between 5 and 5.5, or at
5.5.
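The pH embodiments above form a set of nested ranges, which can be encoded directly. The sketch below uses hypothetical names and simply reports the narrowest stated range that contains a measured culture pH. The single value of 5.5 is covered as the upper bound of the narrowest range.

```python
from typing import Optional, Tuple

# pH ranges stated in the text, from broadest to narrowest.
PH_RANGES = [(3.5, 7.5), (4.0, 7.0), (4.5, 6.5), (5.0, 5.5)]


def narrowest_ph_range(ph: float) -> Optional[Tuple[float, float]]:
    """Return the narrowest stated range containing `ph`, or None."""
    matches = [r for r in PH_RANGES if r[0] <= ph <= r[1]]
    if not matches:
        return None
    return min(matches, key=lambda r: r[1] - r[0])


# Hypothetical culture measurements.
at_setpoint = narrowest_ph_range(5.5)
drifted_high = narrowest_ph_range(6.8)
out_of_range = narrowest_ph_range(8.0)
```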
[0162] The host cells used in the methods of producing a complex
N-glycan contain a polynucleotide encoding any of the recombinant
proteins of the invention as described in the section entitled
"Recombinant Proteins of the Invention." In certain embodiments,
the host cell contains a polynucleotide encoding a fusion protein
comprising an N-acetylglucosaminyltransferase I catalytic domain
and an N-acetylglucosaminyltransferase II catalytic domain, where
the fusion protein catalyzes the transfer of N-acetylglucosamine to
a terminal Man.alpha.3 residue and N-acetylglucosamine to a
terminal Man.alpha.6 residue of an acceptor glycan to produce a
complex N-glycan.
[0163] In certain embodiments, the host cell contains a
polynucleotide encoding a UDP-GlcNAc transporter. The
polynucleotide encoding the UDP-GlcNAc transporter may be
endogenous (i.e., naturally present) in the host cell, or it may be
heterologous to the host cell.
[0164] In certain embodiments, the host cell contains a
polynucleotide encoding an .alpha.-1,2-mannosidase. The
polynucleotide encoding the .alpha.-1,2-mannosidase may be
endogenous in the host cell, or it may be heterologous to the host
cell. These polynucleotides are especially useful for a host cell
expressing high-mannose glycans transferred from the Golgi to the
ER without effective exo-.alpha.-2-mannosidase cleavage. The
.alpha.-1,2-mannosidase may be a mannosidase I type enzyme
belonging to the glycoside hydrolase family 47
(cazy.org/GH47_all.html). In certain embodiments the
.alpha.-1,2-mannosidase is an enzyme listed at
cazy.org/GH47_characterized.html. In particular, the
.alpha.-1,2-mannosidase may
be an ER-type enzyme that cleaves glycoproteins such as enzymes in
the subfamily of ER .alpha.-mannosidase I EC 3.2.1.113 enzymes.
Examples of such enzymes include human .alpha.-2-mannosidase 1B
(AAC26169), a combination of mammalian ER mannosidases, or a
filamentous fungal enzyme such as .alpha.-1,2-mannosidase (MDS1)
(T. reesei AAF34579; Maras M et al J Biotech. 77, 2000, 255). For
cytoplasmic expression the catalytic domain of the mannosidase is
typically fused with a targeting peptide, such as HDEL, KDEL, or
part of an ER or early Golgi protein, or expressed with the
endogenous ER targeting structures of an animal or plant
mannosidase I enzyme.
[0165] In certain embodiments, the host cell contains a
polynucleotide encoding a galactosyltransferase.
Galactosyltransferases transfer .beta.-linked galactosyl residues
to a terminal N-acetylglucosaminyl residue. In certain embodiments
the galactosyltransferase is a .beta.-4-galactosyltransferase.
Generally, .beta.-4-galactosyltransferases belong to the CAZy
glycosyltransferase family 7 (cazy.org/GT7_all.html) and include
.beta.-N-acetylglucosaminyl-glycopeptide
.beta.-1,4-galactosyltransferase (EC 2.4.1.38), which is also known
as N-acetyllactosamine synthase (EC 2.4.1.90). Useful subfamilies
include .beta.4-GalT1, .beta.4-GalT-II, -III, -IV, -V, and -VI,
such as mammalian or human .beta.4-GalTI or .beta.4GalT-II, -III,
-IV, -V, and -VI, or any combinations thereof. .beta.4-GalT1,
.beta.4-GalTII, or .beta.4-GalTIII are especially useful for
galactosylation of terminal GlcNAc.beta.2-structures on N-glycans
such as GlcNAcMan3, GlcNAc2Man3, or GlcNAcMan5 (Guo S. et al.
Glycobiology 2001, 11:813-20). The three-dimensional structure of
the catalytic region is known (e.g. (2006) J. Mol. Biol. 357:
1619-1633), and the structure has been represented in the PDB
database with code 2FYD. The CAZy database includes examples of
certain enzymes. Characterized enzymes are also listed in the CAZy
database at cazy.org/GT7_characterized.html. Examples of useful
.beta.4GalT enzymes include .beta.4GalT1, e.g. bovine Bos taurus
enzyme AAA30534.1 (Shaper N. L. et al Proc. Natl. Acad. Sci. U.S.A.
83 (6), 1573-1577 (1986)), human enzyme (Guo S. et al. Glycobiology
2001, 11:813-20), and Mus musculus enzyme AAA37297 (Shaper, N. L.
et al. 1988 J. Biol. Chem. 263 (21), 10420-10428); .beta.4GalTII
enzymes such as human .beta.4GalTII BAA75819.1, Chinese hamster
Cricetulus griseus AAM77195, Mus musculus enzyme BAA34385, and
Japanese Medaka fish Oryzias latipes BAH36754; and .beta.4GalTIII
enzymes such as human .beta.4GalTIII BAA75820.1, Chinese hamster
Cricetulus griseus AAM77196 and Mus musculus enzyme AAF22221.
[0166] The galactosyltransferase may be expressed in the cytoplasm
of the host cell. A heterologous targeting peptide, such as a Kre2
peptide described in Schwientek J. Biol. Chem 1996 3398, may be
used. Promoters that may be used for expression of the
galactosyltransferase include constitutive promoters such as gpd,
promoters of endogenous glycosylation enzymes and
glycosyltransferases such as mannosyltransferases that synthesize
N-glycans in the Golgi or ER, and inducible promoters of high-yield
endogenous proteins such as the cbh1 promoter.
[0167] In certain embodiments of the invention where the host cell
contains a polynucleotide encoding a galactosyltransferase, the
host cell also contains a polynucleotide encoding a UDP-Gal and/or
UDP-Gal transporter. In certain embodiments of the invention where
the host cell contains a polynucleotide encoding a
galactosyltransferase, lactose may be used as the carbon source
instead of glucose when culturing the host cell. The culture medium
may be between pH 4.5 and 7.0 or between 5.0 and 6.5. In certain
embodiments of the invention where the host cell contains a
polynucleotide encoding a galactosyltransferase and a
polynucleotide encoding a UDP-Gal and/or UDP-Gal transporter, a
divalent cation such as Mn2+, Ca2+ or Mg2+ may be added to the cell
culture medium.
[0168] In certain embodiments, the host cell contains a
polynucleotide encoding a sialyltransferase. A sialyltransferase
transfers .alpha.3- or .alpha.6-linked sialic acid, such as Neu5Ac,
to the terminal Gal of galactosylated complex glycans. Examples of
suitable sialyltransferases can be found in the glycosylation
protein family 29 (cazy.org/GT29.html). Useful .alpha.3- or
.alpha.6-sialyltransferases include .beta.-galactoside
.alpha.-2,6-sialyltransferase (EC 2.4.99.1) with a certain
subfamily ST6Gal-I, and N-acetyllactosaminide
.alpha.-2,3-sialyltransferase (EC 2.4.99.6) with possible
cross-reactivity with .beta.-galactoside
.alpha.-2,3-sialyltransferase (EC 2.4.99.4). Useful subtypes of
.alpha.3-sialyltransferases include ST3Gal-III and ST3Gal-IV.
Certain enzymatically characterized species of these are listed as
characterized in the CAZy database of glycosylation enzymes
(cazy.org/GT29_characterized.html). The polynucleotide encoding the
.alpha.3- or .alpha.6-linked sialyltransferase may be endogenous to
the host cell, or it may be heterologous to the host cell.
Sialylation in the host cell may require expression of enzymes
synthesizing the donor CMP-sialic acid such as CMP-Neu5Ac,
especially in fungal, plant, nematode/parasite, or insect
cells.
[0169] The host cell may have increased or reduced levels of
activity of various endogenous enzymes. A reduced level of activity
may be provided by inhibiting the activity of the endogenous enzyme
with an inhibitor, an antibody, or the like. In certain
embodiments, the host cell is genetically modified in ways to
increase or reduce activity of various endogenous enzymes.
"Genetically modified" refers to any recombinant DNA or RNA method
used to create a prokaryotic or eukaryotic host cell that expresses
a polypeptide at elevated levels, at lowered levels, or in a
mutated form. In other words, the host cell has been transfected,
transformed, or transduced with a recombinant polynucleotide
molecule, and thereby been altered so as to cause the cell to alter
expression of a desired protein.
[0170] Genetic modifications which result in a decrease in gene
expression, in the function of the gene, or in the function of the
gene product (i.e., the protein encoded by the gene) can be
referred to as inactivation (complete or partial), deletion,
interruption, blockage, silencing, or down-regulation, or
attenuation of expression of a gene. For example, a genetic
modification in a gene which results in a decrease in the function
of the protein encoded by such gene, can be the result of a
complete deletion of the gene (i.e., the gene does not exist, and
therefore the protein does not exist), a mutation in the gene which
results in incomplete or no translation of the protein (e.g., the
protein is not expressed), or a mutation in the gene which
decreases or abolishes the natural function of the protein (e.g., a
protein is expressed which has decreased or no enzymatic activity
or action). More specifically, reference to decreasing the action
of proteins discussed herein generally refers to any genetic
modification in the host cell in question, which results in
decreased expression and/or functionality (biological activity) of
the proteins and includes decreased activity of the proteins (e.g.,
decreased catalysis), increased inhibition or degradation of the
proteins as well as a reduction or elimination of expression of the
proteins. For example, the action or activity of a protein of the
present invention can be decreased by blocking or reducing the
production of the protein, reducing protein action, or inhibiting
the action of the protein. Combinations of some of these
modifications are also possible. Blocking or reducing the
production of a protein can include placing the gene encoding the
protein under the control of a promoter that requires the presence
of an inducing compound in the growth medium. By establishing
conditions such that the inducer becomes depleted from the medium,
the expression of the gene encoding the protein (and therefore, of
protein synthesis) could be turned off. Blocking or reducing the
action of a protein could also include using an excision technology
approach similar to that described in U.S. Pat. No. 4,743,546. To
use this approach, the gene encoding the protein of interest is
cloned between specific genetic sequences that allow specific,
controlled excision of the gene from the genome. Excision could be
prompted by, for example, a shift in the cultivation temperature of
the culture, as in U.S. Pat. No. 4,743,546, or by some other
physical or nutritional signal.
[0171] In general, according to the present invention, an increase
or a decrease in a given characteristic of a mutant or modified
protein (e.g., enzyme activity) is made with reference to the same
characteristic of a wild-type (i.e., normal, not modified) protein
that is derived from the same organism (from the same source or
parent sequence), which is measured or established under the same
or equivalent conditions. Similarly, an increase or decrease in a
characteristic of a genetically modified host cell (e.g.,
expression and/or biological activity of a protein, or production
of a product) is made with reference to the same characteristic of
a wild-type host cell of the same species, and preferably the same
strain, under the same or equivalent conditions. Such conditions
include the assay or culture conditions (e.g., medium components,
temperature, pH, etc.) under which the activity of the protein
(e.g., expression or biological activity) or other characteristic
of the host cell is measured, as well as the type of assay used,
the host cell that is evaluated, etc. As discussed above,
equivalent conditions are conditions (e.g., culture conditions)
which are similar, but not necessarily identical (e.g., some
conservative changes in conditions can be tolerated), and which do
not substantially change the effect on cell growth or enzyme
expression or biological activity as compared to a comparison made
under the same conditions.
[0172] Preferably, a genetically modified host cell that has a
genetic modification that increases or decreases the activity of a
given protein (e.g., an enzyme) has an increase or decrease,
respectively, in the activity or action (e.g., expression,
production and/or biological activity) of the protein, as compared
to the activity of the wild-type protein in a wild-type host cell,
of at least about 5%, and more preferably at least about 10%, and
more preferably at least about 15%, and more preferably at least
about 20%, and more preferably at least about 25%, and more
preferably at least about 30%, and more preferably at least about
35%, and more preferably at least about 40%, and more preferably at
least about 45%, and more preferably at least about 50%, and more
preferably at least about 55%, and more preferably at least about
60%, and more preferably at least about 65%, and more preferably at
least about 70%, and more preferably at least about 75%, and more
preferably at least about 80%, and more preferably at least about
85%, and more preferably at least about 90%, and more preferably at
least about 95%, or any percentage, in whole integers between 5%
and 100% (e.g., 6%, 7%, 8%, etc.). The same differences are certain
when comparing an isolated modified nucleic acid molecule or
protein directly to the isolated wild-type nucleic acid molecule or
protein (e.g., if the comparison is done in vitro as compared to in
vivo).
[0173] In another aspect of the invention, a genetically modified
host cell that has a genetic modification that increases or
decreases the activity of a given protein (e.g., an enzyme) has an
increase or decrease, respectively, in the activity or action
(e.g., expression, production and/or biological activity) of the
protein, as compared to the activity of the wild-type protein in a
wild-type host cell, of at least about 2-fold, and more preferably
at least about 5-fold, and more preferably at least about 10-fold,
and more preferably about 20-fold, and more preferably at least
about 30-fold, and more preferably at least about 40-fold, and more
preferably at least about 50-fold, and more preferably at least
about 75-fold, and more preferably at least about 100-fold, and
more preferably at least about 125-fold, and more preferably at
least about 150-fold, or any whole integer increment starting from
at least about 2-fold (e.g., 3-fold, 4-fold, 5-fold, 6-fold,
etc.).
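The percentage and fold comparisons described in the two preceding paragraphs can be illustrated with a minimal sketch. The function names, the convention of reporting fold difference as the larger value over the smaller, and the example activities are assumptions made for illustration only; they are not taken from the disclosure.

```python
def percent_change(modified: float, wild_type: float) -> float:
    """Signed percent change of a modified value relative to wild type."""
    if wild_type <= 0:
        raise ValueError("wild-type value must be positive")
    return (modified - wild_type) * 100.0 / wild_type


def fold_difference(modified: float, wild_type: float) -> float:
    """Fold difference (always >= 1), regardless of the direction of change.

    Assumption: a decrease is reported as wild_type / modified.
    """
    if modified <= 0 or wild_type <= 0:
        raise ValueError("both values must be positive")
    return max(modified, wild_type) / min(modified, wild_type)


def meets_percent_threshold(modified: float, wild_type: float,
                            threshold_pct: float = 5.0) -> bool:
    """True if the absolute percent change reaches the given threshold."""
    return abs(percent_change(modified, wild_type)) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical activities in arbitrary units, measured under
    # the same or equivalent conditions.
    wild_type_activity = 100.0
    modified_activity = 40.0
    print(percent_change(modified_activity, wild_type_activity))
    print(fold_difference(modified_activity, wild_type_activity))
    print(meets_percent_threshold(modified_activity, wild_type_activity))
```

Both values are assumed to have been measured with the same assay under the same or equivalent conditions, as the comparison requires.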
[0174] In certain embodiments, the host cell has a reduced level of
activity of a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase compared to the level of activity in a
wild-type host cell. Dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase (EC 2.4.1.130) transfers an alpha-D-mannosyl
residue from dolichyl-phosphate D-mannose into a membrane
lipid-linked oligosaccharide. Typically, the
dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl mannosyltransferase
enzyme is encoded by an alg3 gene. In certain embodiments, the host
cell has a reduced level of expression of an alg3 gene compared to
the level of expression in a wild-type host cell. In certain
embodiments, the alg3 gene is deleted from the host cell.
[0175] In certain embodiments, the host cell has a reduced level of
activity of an alpha-1,6-mannosyltransferase compared to the level
of activity in a wild-type host cell. Alpha-1,6-mannosyltransferase
(EC 2.4.1.232) transfers an alpha-D-mannosyl residue from
GDP-mannose into a protein-linked oligosaccharide, forming an
elongation initiating alpha-(1.fwdarw.6)-D-mannosyl-D-mannose
linkage in the Golgi apparatus. Typically, the
alpha-1,6-mannosyltransferase enzyme is encoded by an och1 gene. In
certain embodiments, the host cell has a reduced level of
expression of an och1 gene compared to the level of expression in a
wild-type host cell. In certain embodiments, the och1 gene is
deleted from the host cell.
[0176] In certain embodiments, the host cell has a reduced level of
protease activity. In certain embodiments, genes encoding various
proteases are deleted from the host cell. These genes include, for
example, genes encoding proteases such as pep1 (pepA in
Aspergillus) and cellulolytic enzymes, such as cellobiohydrolase1
(cbh1).
[0177] In certain embodiments, the host cell may have a reduced
level of activity of proteins involved in non-homologous end
joining (NHEJ) in order to enhance the efficiency of homologous
recombination. In certain embodiments, genes encoding these
proteins are deleted from the host cell. The genes and their
homologues include, but are not limited to, Ku70, Ku80, Lig4,
Rad50, Xrs2, Sir4, Lif1, or Nej1 as described in, for example,
Ninomiya et al. 2004, Ishibashi et al. 2006, Villalba et al. 2008,
and Mizutani et al. 2008.
[0178] In certain embodiments of methods of producing a complex
N-glycan, the host cell is a Trichoderma cell that has a reduced
level of activity of a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase compared to the level of activity in a
wild-type Trichoderma cell.
[0179] In other certain embodiments of methods of producing a
complex N-glycan, the host cell is a yeast cell that has a reduced
level of activity of a dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase and a reduced level of activity of an
alpha-1,6-mannosyltransferase compared to the levels of activity in
a wild-type yeast cell and further comprises a polynucleotide
encoding an .alpha.-1,2-mannosidase.
[0180] In Vitro Methods of Producing Complex N-Glycans
[0181] In another aspect, the invention provides a method of
producing a complex N-glycan, including a step of incubating a
fusion protein comprising an N-acetylglucosaminyltransferase I
catalytic domain and an N-acetylglucosaminyltransferase II
catalytic domain, an acceptor glycan, and an N-acetylglucosamine
donor together in a buffer, where the fusion protein catalyzes the
transfer of N-acetylglucosamine to a terminal Man.alpha.3 residue
and N-acetylglucosamine to a terminal Man.alpha.6 residue of an
acceptor glycan to produce a complex N-glycan. In certain
embodiments the acceptor glycan is attached to an amino acid, a
peptide, or a polypeptide. In certain embodiments the acceptor
glycan is attached to a heterologous polypeptide. In certain
embodiments, the acceptor glycan is Man.sub.3. In certain
embodiments the N-acetylglucosamine donor is a UDP-GlcNAc
transporter. Typically the buffer contains a divalent cation such
as Mn.sup.2+, Ca.sup.2+, or Mg.sup.2+ at concentrations of 1 .mu.M
to 100 mM, 100 .mu.M to 50 mM, or 0.1 mM to 25 mM. The
N-acetylglucosamine donor is typically used in molar excess, such
as 1.1-100 fold excess with regard to the reactive acceptor sites
on the acceptor glycan. The concentration of the acceptor glycan is
typically from 1 .mu.M to 100 mM, 100 .mu.M to 50 mM, or 1 to 25
mM. Where the acceptor glycan is attached to a polypeptide, the
concentration ranges are typically at the lower end because of
higher molecular weights. The concentrations of the components of
the reaction may be adjusted based on their solubilities in the
buffer. The amount of enzyme activity (units) may be adjusted to
allow an effective reaction within a reasonable reaction time. A
reasonable reaction time is typically from a few minutes to several
days. In certain embodiments the reaction time will be from about
0.5 hours to one day or from 1 to 6 hours.
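As a rough illustration of the molar-excess guidance above, the following sketch computes a donor concentration from the acceptor concentration, the number of reactive acceptor sites, and the chosen fold excess. The function name, the default of two reactive sites per acceptor (one terminal Man.alpha.3 and one terminal Man.alpha.6 residue), and the example values are assumptions for illustration.

```python
def donor_concentration_mM(acceptor_mM: float,
                           sites_per_acceptor: int = 2,
                           fold_excess: float = 1.1) -> float:
    """Donor concentration (mM) giving the requested molar excess
    over the reactive acceptor sites.

    Assumption: each acceptor glycan offers `sites_per_acceptor`
    reactive sites, and the excess is kept within the 1.1- to
    100-fold range mentioned in the text.
    """
    if acceptor_mM <= 0:
        raise ValueError("acceptor concentration must be positive")
    if sites_per_acceptor < 1:
        raise ValueError("at least one reactive site is required")
    if not 1.1 <= fold_excess <= 100:
        raise ValueError("fold excess expected between 1.1 and 100")
    return acceptor_mM * sites_per_acceptor * fold_excess


if __name__ == "__main__":
    # Hypothetical reaction: 1 mM acceptor, two sites, 5-fold excess.
    print(donor_concentration_mM(1.0, sites_per_acceptor=2, fold_excess=5.0))
```

In practice the result would still need to be checked against the solubility of each component in the chosen buffer.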
[0182] Useful buffers include buffers suitable for the fusion
protein such as TRIS, HEPES, or MOPS in pH ranges of about 5 to 8.5,
5.5 to 8.0, or 6.0 to 7.5. Typical concentrations of TRIS,
HEPES, or MOPS buffers will be between 5 and 150 mM, between 10 and
100 mM, or between 10 and 60 mM, adjusted to maintain the pH. The
reaction may be optimized by adding salt such as NaCl at 10-200 mM
and/or an enzyme stabilizing but not glycosylatable protein (e.g., a
pure non-glycosylated or non-acceptor glycan containing albumin). In a
certain embodiment the in vitro reaction is adjusted to be
performed in cell culture medium. Phosphate buffers may be used to
reduce reaction speed.
[0183] Cells and Methods for Production of Man.sub.3GlcNAc.sub.2
Glycans
[0184] In another aspect, the present invention provides
filamentous fungal cells containing a mutation of alg3 and
Man3GlcNAc2, where the Man3GlcNAc2 includes at least 50%, at least
60%, at least 70%, at least 80%, at least 90%, or 100% (mol %) of
neutral N-glycans secreted by the cells. The neutral N-glycans may
be attached to an amino acid, a peptide, or a polypeptide. The alg3
gene may be mutated by any means known in the art, such as point
mutations or deletion of the entire alg3 gene. Preferably, the
function of the alg3 protein is reduced or eliminated by the
mutation of alg3. The filamentous fungal cell may be an Acremonium,
Aspergillus, Aureobasidium, Cryptococcus, Chrysosporium,
Chrysosporium lucknowense, Filibasidium, Fusarium, Gibberella,
Humicola, Magnaporthe, Mucor, Myceliophthora, Myrothecium,
Neocallimastix, Neurospora, Paecilomyces, Penicillium, Piromyces,
Schizophyllum, Scytalidium, Talaromyces, Thermoascus, Thielavia,
Tolypocladium, or Trichoderma cell. In certain embodiments, the
filamentous fungal cell is a T. reesei cell. In certain
embodiments, the filamentous fungal cell further contains one or
more polynucleotides encoding any of the recombinant proteins of
the invention. For example, the filamentous fungal cell may further
contain a first polynucleotide encoding an
N-acetylglucosaminyltransferase I catalytic domain and a second
polynucleotide encoding an N-acetylglucosaminyltransferase II
catalytic domain. Alternatively, the filamentous fungal cell may
further contain a polynucleotide encoding a fusion protein
including an N-acetylglucosaminyltransferase I catalytic domain and
an N-acetylglucosaminyltransferase II catalytic domain.
[0185] In yet another aspect, the present invention provides
methods of producing a Man.sub.3GlcNAc.sub.2 glycan in a host cell,
including the steps of providing a host cell with a reduced level
of activity of a mannosyltransferase compared to the level of
activity in a wild-type host cell, and culturing the host cell to
produce a Man.sub.3GlcNAc.sub.2 glycan, where the Man.sub.3GlcNAc.sub.2
glycan makes up at least 50%, at least 60%, at least 70%, at least
80%, at least 90%, or 100% (mol %) of the neutral N-glycans
secreted by the host cell.
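The mol % criterion above can be checked with a short calculation once the molar amounts of the individual neutral N-glycans are known. The sketch below is illustrative only: the function names are hypothetical and the glycan amounts are invented example values, not measured data.

```python
from typing import Dict


def mol_percent(amounts: Dict[str, float]) -> Dict[str, float]:
    """Convert molar amounts of glycan species into mol % of the total."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total molar amount must be positive")
    return {name: 100.0 * amount / total for name, amount in amounts.items()}


def meets_share(amounts: Dict[str, float], species: str,
                minimum_pct: float = 50.0) -> bool:
    """True if `species` makes up at least `minimum_pct` mol % of the pool."""
    return mol_percent(amounts)[species] >= minimum_pct


if __name__ == "__main__":
    # Hypothetical neutral N-glycan pool, arbitrary molar units.
    pool = {"Man3GlcNAc2": 70.0, "Man4GlcNAc2": 20.0, "Man5GlcNAc2": 10.0}
    print(mol_percent(pool))
    print(meets_share(pool, "Man3GlcNAc2", minimum_pct=50.0))
```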
[0186] The Man.sub.3GlcNAc.sub.2 glycan may be attached to a
molecule such as an amino acid, a peptide, or a polypeptide. In
certain embodiments, the amino acid is an asparagine residue. The
asparagine residue may be in aminoglycosidic linkage from the
side-chain amide (a biologic mammalian protein N-glycan linkage
structure) and may be part of a peptide chain such as a dipeptide,
an oligopeptide, or a polypeptide. The glycan may be a reducing end
derivative such as an N-, O-, or C-linked, preferably glycosidic,
derivative of the reducing GlcNAc or Man, such as a spacer or
terminal organic residue with a certain glycan-linked structure
selected from the group of an amino acid, alkyl, heteroalkyl, acyl,
alkyloxy, aryl, arylalkyl, and heteroarylalkyl. The spacer may be
further linked to a polyvalent carrier or a solid phase. In certain
embodiments, alkyl-containing structures include methyl, ethyl,
propyl, and C4-C26 alkyls, lipids such as glycerolipids,
phospholipids, dolichol-phospholipids and ceramides and
derivatives. The reducing end may also be derivatized by reductive
amination to a secondary amine linkage or a derivative structure.
Certain carriers include biopoly- or oligomers such as
(poly)peptides, poly(saccharides) such as dextran, cellulose,
amylose, or glycosaminoglycans, and other organic polymers or
oligomers such as plastics including polyethylene, polypropylene,
polyamides (e.g., nylon or polystyrene), polyacrylamide, and
polylactic acids, dendrimers such as PAMAM, Starburst or Starfish
dendrimers, or polylysine, and polyalkylglycols such as
polyethylene glycol (PEG). Solid phases may include microtiter
wells, silica particles, glass, metal including steel, gold and
silver, polymer beads such as polystyrene or resin beads,
polylactic acid beads, polysaccharide beads or organic spacers
containing magnetic beads.
[0187] In certain embodiments, the Man.sub.3GlcNAc.sub.2 glycan is
attached to a heterologous polypeptide. In certain embodiments, the
heterologous polypeptide is a therapeutic protein. Therapeutic
proteins may include monoclonal antibodies, erythropoietins,
interferons, growth hormones, enzymes, or blood-clotting factors
and may be useful in the treatment of humans or animals. For
example, the Man.sub.3GlcNAc.sub.2 glycan may be attached to a
therapeutic protein such as rituximab. Typically, the
Man.sub.3GlcNAc.sub.2 glycan will be further modified to become a
complex glycan. Such modification may take place in vivo in the
host cell or by in vitro methods.
[0188] In certain embodiments, the mannosyltransferase is a
dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl mannosyltransferase.
Typically, the dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
mannosyltransferase enzyme is encoded by an alg3 gene. In certain
embodiments, the host cell has a reduced level of expression of an
alg3 gene compared to the level of expression in a wild-type host
cell. In certain embodiments, the alg3 gene is deleted from the
host cell. SEQ ID NOs: 97 and 98 provide the nucleic acid and amino
acid sequences of the alg3 gene in T. reesei, respectively.
[0189] In certain embodiments, the level of activity of
alpha-1,6-mannosyltransferase in the host cell is not reduced
compared to the level of activity in a wild-type host cell.
Typically, the alpha-1,6-mannosyltransferase enzyme is encoded by
an och1 gene. In certain embodiments, the host cell contains an
endogenous polynucleotide encoding an .alpha.-1,2-mannosidase.
[0190] In certain embodiments, the host cell is a Trichoderma cell,
and in certain embodiments, the host cell is a Trichoderma reesei
cell.
[0191] Filamentous Fungal Cells of the Invention
[0192] In a further aspect, the present invention provides
filamentous fungal cells having a reduced level of expression of an
alg3 gene of the invention, compared to the level of expression of
the alg3 gene in a wild-type filamentous fungal cell, where the
filamentous fungal cell also contains any of the recombinant
proteins of the invention as described in the section entitled
"Recombinant Proteins of the Invention." For example, in certain
embodiments the filamentous fungal cell further contains a
polynucleotide encoding a fusion protein including an
N-acetylglucosaminyltransferase I catalytic domain and an
N-acetylglucosaminyltransferase II catalytic domain. The expression
of the fusion protein may be controlled by a promoter that is
operably linked to the polynucleotide. The promoter may be a
constitutive promoter or an inducible promoter. In certain
preferred embodiments, the promoter is an inducible promoter, such
as the cbh1 inducible promoter.
[0193] In another aspect, the present invention provides
filamentous fungal cells having a reduced level of expression of an
alg3 gene of the invention, compared to the level of expression of
the alg3 gene in a wild-type filamentous fungal cell, where the
filamentous fungal cell also contains a first polynucleotide
encoding a recombinant N-acetylglucosaminyltransferase I catalytic
domain and a second polynucleotide encoding a recombinant
N-acetylglucosaminyltransferase II catalytic domain. In such
embodiments, the expression of the recombinant
N-acetylglucosaminyltransferase I catalytic domain is controlled by
a promoter that is operably linked to the first polynucleotide and
the expression of the recombinant N-acetylglucosaminyltransferase
II catalytic domain is controlled by a promoter that is operably
linked to the second polynucleotide. The promoter may be a
constitutive promoter or an inducible promoter. In certain
preferred embodiments, the promoter is an inducible promoter, such
as the cbh1 inducible promoter.
[0194] In other embodiments, a single polynucleotide may encode
both the recombinant N-acetylglucosaminyltransferase I catalytic
domain and the recombinant N-acetylglucosaminyltransferase II
catalytic domain such that they are expressed as separate
polypeptides. In such embodiments, the polynucleotide may contain
an internal ribosome entry site that allows for the separate
translation of each catalytic domain from the polynucleotide. In
such embodiments, the expression of the recombinant
N-acetylglucosaminyltransferase I catalytic domain is controlled by
a promoter that is operably linked to the portion of the
polynucleotide that encodes the N-acetylglucosaminyltransferase I
catalytic domain and the expression of the recombinant
N-acetylglucosaminyltransferase II catalytic domain is controlled
by a promoter that is operably linked to the portion of the
polynucleotide that encodes the N-acetylglucosaminyltransferase II
catalytic domain. The promoter may be a constitutive promoter or an
inducible promoter. In certain preferred embodiments, the promoter
is an inducible promoter, such as the cbh1 inducible promoter.
[0195] As disclosed herein, N-acetylglucosaminyltransferase I
(GlcNAc-TI; GnTI; EC 2.4.1.101) catalyzes the reaction
UDP-N-acetyl-D-glucosamine + 3-(alpha-D-mannosyl)-beta-D-mannosyl-R <=>
UDP + 3-(2-(N-acetyl-beta-D-glucosaminyl)-alpha-D-mannosyl)-beta-D-mannosyl-R,
where R represents the remainder of the N-linked
oligosaccharide in the glycan acceptor. An
N-acetylglucosaminyltransferase I catalytic domain is any portion
of an N-acetylglucosaminyltransferase I enzyme that is capable of
catalyzing this reaction. Amino acid sequences for
N-acetylglucosaminyltransferase I enzymes from various organisms
are listed in SEQ ID NOs: 1-19. Additional GnTI enzymes are listed
in the CAZy database in the glycosyltransferase family 13
(cazy.org/GT13_all). Enzymatically characterized species include
A. thaliana AAR78757.1 (U.S. Pat. No. 6,653,459); C. elegans
AAD03023.1 (Chen S. et al J. Biol. Chem 1999; 274(1):288-97); D.
melanogaster AAF57454.1 (Sarkar & Schachter Biol Chem. 2001
February; 382(2):209-17); C. griseus AAC52872.1 (Puthalakath H. et
al J. Biol. Chem 1996 271(44):27818-22); H. sapiens AAA52563.1
(Kumar R. et al Proc Natl Acad Sci USA. 1990 December;
87(24):9948-52); M. auratus AAD04130.1 (Opat A. S. et al Biochem J.
1998 Dec. 15; 336 (Pt 3):593-8, including an example of a
deactivating mutant); and rabbit O. cuniculus AAA31493.1 (Sarkar M et
al. Proc Natl Acad Sci USA. 1991 Jan. 1; 88(1):234-8). Additional
examples of characterized active enzymes can be found at
cazy.org/GT13_characterized. The 3D structure of the catalytic
domain of rabbit GnTI was defined by X-ray crystallography in
Unligil U M et al. EMBO J. 2000 Oct. 16; 19(20):5269-80. The
Protein Data Bank (PDB) structures for GnTI are 1FO8, 1FO9, 1FOA,
2AM3, 2AM4, 2AM5, and 2APC. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain is from the
human N-acetylglucosaminyltransferase I enzyme (SEQ ID NO: 1), or
variants thereof. In certain embodiments, the
N-acetylglucosaminyltransferase I catalytic domain contains a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
84-445 of SEQ ID NO: 1. In some embodiments, a shorter sequence can
be used as a catalytic domain (e.g. amino acid residues 105-445 of
the human enzyme or amino acid residues 107-447 of the rabbit
enzyme; Sarkar et al. (1998) Glycoconjugate J 15:193-197).
Additional sequences that can be used as the GnTI catalytic domain
include amino acid residues from about amino acid 30 to 445 of the
human enzyme or any C-terminal stem domain starting between amino
acid residue 30 to 105 and continuing to about amino acid 445 of
the human enzyme, or corresponding homologous sequence of another
GnTI or a catalytically active variant or mutant thereof. The
catalytic domain may include N-terminal parts of the enzyme such as
all or part of the stem domain, the transmembrane domain, or the
cytoplasmic domain.
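The residue ranges and percent-identity thresholds above can be illustrated with a small sketch. It assumes 1-based, inclusive residue numbering and a pair of sequences that have already been aligned (gaps written as "-"); identity is counted over alignment columns, which is one of several conventions in use. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def residue_range(sequence: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range outside the sequence")
    return sequence[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings have equal length, '-' marks a gap, and
    columns where both sequences carry a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    reference = "MKTAYIAKQR"
    candidate = "MKTAHIAKQR"
    print(residue_range(reference, 3, 6))
    print(percent_identity(reference, candidate) >= 70.0)
```

A real comparison would first align the candidate against the chosen reference segment (for example residues 84-445 of SEQ ID NO: 1) with an established alignment tool before computing identity.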
[0196] As disclosed herein, N-acetylglucosaminyltransferase II
(GlcNAc-TII; GnTII; EC 2.4.1.143) catalyzes the reaction
UDP-N-acetyl-D-glucosamine + 6-(alpha-D-mannosyl)-beta-D-mannosyl-R <=>
UDP + 6-(2-(N-acetyl-beta-D-glucosaminyl)-alpha-D-mannosyl)-beta-D-mannosyl-R,
where R represents the remainder of the N-linked
oligosaccharide in the glycan acceptor. An
N-acetylglucosaminyltransferase II catalytic domain is any portion
of an N-acetylglucosaminyltransferase II enzyme that is capable of
catalyzing this reaction. Amino acid sequences for
N-acetylglucosaminyltransferase II enzymes from various organisms
are listed in SEQ ID NOs: 20-33. In certain embodiments, the
N-acetylglucosaminyltransferase II catalytic domain is from the
human N-acetylglucosaminyltransferase II enzyme (SEQ ID NO: 20), or
variants thereof. Additional GnTII species are listed in the CAZy
database in the glycosyltransferase family 16 (cazy.org/GT16_all).
Enzymatically characterized species include GnTII of C. elegans, D.
melanogaster, Homo sapiens, Rattus norvegicus, and Sus scrofa
(cazy.org/GT16_characterized). In certain embodiments, the
N-acetylglucosaminyltransferase II catalytic domain contains a
sequence that is at least 70%, at least 75%, at least 80%, at least
85%, at least 90%, at least 95%, at least 96%, at least 97%, at
least 98%, at least 99%, or 100% identical to amino acid residues
from about 30 to about 447 of SEQ ID NO: 21. The catalytic domain
may include N-terminal parts of the enzyme such as all or part of
the stem domain, the transmembrane domain, or the cytoplasmic
domain.
[0197] In embodiments where the filamentous fungal cell contains a
fusion protein of the invention, the fusion protein may further
contain a spacer in between the N-acetylglucosaminyltransferase I
catalytic domain and the N-acetylglucosaminyltransferase II
catalytic domain. Any of the spacers of the invention as described
in the section entitled "Spacers" may be used. In certain preferred
embodiments, the spacer is an EGIV spacer, a 2.times.G4S spacer, a
3.times.G4S spacer, or a CBHI spacer. In other embodiments, the
spacer contains a sequence from a stem domain.
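To make the spacer arrangement concrete, the sketch below joins two catalytic-domain sequences with a repeated G4S spacer. It assumes that "G4S" denotes the pentapeptide Gly-Gly-Gly-Gly-Ser (GGGGS), so that a 2.times.G4S spacer is two consecutive copies; the function names and the placeholder domain strings are hypothetical, and no particular domain order is implied.

```python
G4S_UNIT = "GGGGS"  # assumption: one G4S unit is Gly-Gly-Gly-Gly-Ser


def g4s_spacer(repeats: int) -> str:
    """Return a spacer made of `repeats` consecutive G4S units."""
    if repeats < 1:
        raise ValueError("at least one repeat is required")
    return G4S_UNIT * repeats


def assemble_fusion(n_terminal_domain: str, spacer: str,
                    c_terminal_domain: str) -> str:
    """Concatenate domain, spacer, and domain into one polypeptide string."""
    if not n_terminal_domain or not c_terminal_domain:
        raise ValueError("both domains must be non-empty")
    return n_terminal_domain + spacer + c_terminal_domain


if __name__ == "__main__":
    # Placeholder strings standing in for real catalytic-domain sequences.
    domain_a = "AAAA"
    domain_b = "CCCC"
    print(assemble_fusion(domain_a, g4s_spacer(2), domain_b))
    print(assemble_fusion(domain_a, g4s_spacer(3), domain_b))
```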
[0198] For ER/Golgi expression the N-acetylglucosaminyltransferase
I and/or N-acetylglucosaminyltransferase II catalytic domain is
typically fused with a targeting peptide or a part of an ER or
early Golgi protein, or expressed with endogenous ER targeting
structures of an animal or plant N-acetylglucosaminyltransferase
enzyme. In certain preferred embodiments, the
N-acetylglucosaminyltransferase I and/or
N-acetylglucosaminyltransferase II catalytic domain contains any of
the targeting peptides of the invention as described in the section
entitled "Targeting peptides." Preferably, the targeting peptide is
linked to the N-terminal end of the catalytic domain. In some
embodiments, the targeting peptide contains any of the stem domains
of the invention as described in the section entitled "Targeting
peptides." In certain preferred embodiments, the targeting peptide
is a Kre2 targeting peptide. In other embodiments, the targeting
peptide further contains a transmembrane domain linked to the
N-terminal end of the stem domain or a cytoplasmic domain linked to
the N-terminal end of the stem domain. In embodiments where the
targeting peptide further contains a transmembrane domain, the
targeting peptide may further contain a cytoplasmic domain linked
to the N-terminal end of the transmembrane domain.
[0199] The level of expression of an alg3 gene of the invention may
be reduced by any suitable method known in the art, including,
without limitation, mutating the alg3 gene. The alg3 may be mutated
by, for example, point mutations or deletion of the entire alg3
gene. Preferably, the function of the alg3 protein is reduced or
eliminated by the mutation of alg3. The alg3 gene encodes a
dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl
alpha-1,3-mannosyltransferase. As disclosed herein, a
dolichyl-P-Man:Man(5)GlcNAc(2)-PP-dolichyl mannosyltransferase of
the invention transfers an alpha-D-mannosyl residue from
dolichyl-phosphate D-mannose into a membrane lipid-linked
oligosaccharide.
[0200] In certain embodiments, the filamentous fungal cell may
contain a polynucleotide encoding a UDP-GlcNAc transporter. The
polynucleotide encoding the UDP-GlcNAc transporter may be
endogenous (i.e., naturally present) in the filamentous fungal
cell, or it may be heterologous to the filamentous fungal cell.
[0201] In other embodiments, the filamentous fungal cell may also
contain a polynucleotide encoding an .alpha.-1,2-mannosidase of the
invention as described in the section entitled "Host Cells." The
polynucleotide encoding the .alpha.-1,2-mannosidase may be
endogenous in the filamentous fungal cell, or it may be
heterologous to the filamentous fungal cell. These polynucleotides
are especially useful for a filamentous fungal cell expressing
high-mannose glycans transferred from the Golgi to the ER without
effective exo-.alpha.-2-mannosidase cleavage. For cytoplasmic
expression the catalytic domain of the mannosidase is typically
fused with a targeting peptide, such as HDEL, KDEL, or part of an
ER or early Golgi protein, or expressed with the endogenous ER
targeting structures of an animal or plant mannosidase I
enzyme.
[0202] In further embodiments, the filamentous fungal cell may also
contain a polynucleotide encoding a galactosyltransferase of the
invention as described in the section entitled "Host Cells."
Galactosyltransferases transfer .beta.-linked galactosyl residues
to a terminal N-acetylglucosaminyl residue. In certain embodiments
the galactosyltransferase is a .beta.-4-galactosyltransferase. The
galactosyltransferase may be expressed in the cytoplasm of the
filamentous fungal cell. A heterologous targeting peptide, such as a
Kre2 peptide described in Schwientek J. Biol. Chem 1996 3398, may
be used. Promoters that may be used for expression of the
galactosyltransferase include constitutive promoters such as gpd,
promoters of endogenous glycosylation enzymes and
glycosyltransferases such as mannosyltransferases that synthesize
N-glycans in the Golgi or ER, and inducible promoters of high-yield
endogenous proteins such as the cbh1 promoter. In embodiments of
the invention where the host cell contains a polynucleotide
encoding a galactosyltransferase, the host cell also contains a
polynucleotide encoding a UDP-Gal and/or UDP-Gal transporter. In
certain embodiments of the invention where the filamentous fungal
cell contains a polynucleotide encoding a galactosyltransferase,
lactose may be used as the carbon source instead of glucose when
culturing the filamentous fungal cell. The culture medium may be
between pH 4.5 and 7.0 or between 5.0 and 6.5. In certain
embodiments of the invention where the filamentous fungal cell
contains a polynucleotide encoding a galactosyltransferase and a
polynucleotide encoding a UDP-Gal and/or UDP-Gal transporter, a
divalent cation such as Mn2+, Ca2+ or Mg2+ may be added to the cell
culture medium.
[0203] In other embodiments, the filamentous fungal cell may also
contain a polynucleotide encoding a sialyltransferase of the
invention as described in the section entitled "Host Cells." A
sialyltransferase transfers .alpha.3- or .alpha.6-linked sialic
acid, such as Neu5Ac, to the terminal Gal of galactosylated complex
glycans. The polynucleotide encoding the .alpha.3- or
.alpha.6-linked sialyltransferase may be endogenous to the
filamentous fungal cell, or it may be heterologous to the
filamentous fungal cell. Sialylation in the filamentous fungal cell
may require expression of enzymes synthesizing the donor CMP-sialic
acid such as CMP-Neu5Ac, especially in fungal, plant,
nematode/parasite, or insect cells.
[0204] Additionally, the filamentous fungal cell may have increased
or reduced levels of activity of various additional endogenous
enzymes. A reduced level of activity may be provided by inhibiting
the activity of the endogenous enzyme with an inhibitor, an
antibody, or the like. In certain embodiments, the filamentous
fungal cell is genetically modified in ways to increase or reduce
activity of one or more endogenous enzymes. Methods of genetically
modifying a filamentous fungal cell to increase or reduce activity
of one or more endogenous enzymes are well known in the art and
include, without limitation, those described in the section
entitled "Host Cells." In certain embodiments, the filamentous
fungal cell has a reduced level of activity of an
alpha-1,6-mannosyltransferase compared to the level of activity in
a wild-type filamentous fungal cell. Alpha-1,6-mannosyltransferase
(EC 2.4.1.232) in the Golgi apparatus transfers an elongation
initiating alpha-D-mannosyl residue from GDP-mannose into a
protein-linked N-glycan oligosaccharide, forming an
alpha-(1.fwdarw.6)-D-mannosyl-D-mannose linkage. Typically, the
alpha-1,6-mannosyltransferase enzyme is encoded by an och1 gene. In
certain embodiments, the filamentous fungal cell has a reduced
level of expression of an och1 gene compared to the level of
expression in a wild-type filamentous fungal cell. In certain
embodiments, the och1 gene is deleted from the filamentous fungal
cell.
[0205] The filamentous fungal cell may be, for example, an
Acremonium, Aspergillus, Aureobasidium, Cryptococcus,
Chrysosporium, Chrysosporium lucknowense, Filibasidium, Fusarium,
Gibberella, Humicola, Magnaporthe, Mucor, Myceliophthora,
Myrothecium, Neocallimastix, Neurospora, Paecilomyces, Penicillium,
Piromyces, Schizophyllum, Scytalidium, Talaromyces, Thermoascus,
Thielavia, Tolypocladium, or Trichoderma cell. In certain
embodiments, the filamentous fungal cell is a T. reesei cell.
[0206] Pharmaceutical Compositions Containing Complex N-Glycans
Produced by the Methods of the Invention
[0207] In another aspect, the present invention provides a
composition, e.g., a pharmaceutical composition, containing one or
more complex N-glycans attached to a heterologous molecule produced
by the methods of the invention, formulated together with a
pharmaceutically acceptable carrier. Pharmaceutical compositions of
the invention also can be administered in combination therapy,
i.e., combined with other agents. For example, the combination
therapy can include a complex N-glycan attached to a heterologous
molecule according to the present invention combined with at least
one other therapeutic agent.
[0208] As used herein, "pharmaceutically acceptable carrier"
includes any and all solvents, dispersion media, coatings,
antibacterial and antifungal agents, isotonic and absorption
delaying agents, and the like that are physiologically compatible.
Preferably, the carrier is suitable for intravenous, intramuscular,
subcutaneous, parenteral, spinal or epidermal administration (e.g.,
by injection or infusion). Depending on the route of
administration, the active compound, i.e., the complex N-glycan
attached to a heterologous molecule according to the invention, may
be coated in a material to protect the compound from the action of
acids and other natural conditions that may inactivate the
compound.
[0209] The pharmaceutical compositions of the invention may include
one or more pharmaceutically acceptable salts. A "pharmaceutically
acceptable salt" refers to a salt that retains the desired
biological activity of the parent compound and does not impart any
undesired toxicological effects (see e.g., Berge, S. M., et al.
(1977) J. Pharm. Sci. 66:1-19). Examples of such salts include acid
addition salts and base addition salts. Acid addition salts include
those derived from nontoxic inorganic acids, such as hydrochloric,
nitric, phosphoric, sulfuric, hydrobromic, hydroiodic, phosphorous
and the like, as well as from nontoxic organic acids such as
aliphatic mono- and dicarboxylic acids, phenyl-substituted alkanoic
acids, hydroxy alkanoic acids, aromatic acids, aliphatic and
aromatic sulfonic acids and the like. Base addition salts include
those derived from alkali and alkaline earth metals, such as sodium,
potassium, magnesium, calcium and the like, as well as from
nontoxic organic amines, such as N,N'-dibenzylethylenediamine,
N-methylglucamine, chloroprocaine, choline, diethanolamine,
ethylenediamine, procaine and the like.
[0210] A pharmaceutical composition of the invention also may
include a pharmaceutically acceptable antioxidant. Examples of
pharmaceutically acceptable antioxidants include: (1) water soluble
antioxidants, such as ascorbic acid, cysteine hydrochloride, sodium
bisulfate, sodium metabisulfite, sodium sulfite and the like; (2)
oil-soluble antioxidants, such as ascorbyl palmitate, butylated
hydroxyanisole (BHA), butylated hydroxytoluene (BHT), lecithin,
propyl gallate, alpha-tocopherol, and the like; and (3) metal
chelating agents, such as citric acid, ethylenediamine tetraacetic
acid (EDTA), sorbitol, tartaric acid, phosphoric acid, and the
like.
[0211] Examples of suitable aqueous and nonaqueous carriers that
may be employed in the pharmaceutical compositions of the invention
include water, ethanol, polyols (such as glycerol, propylene
glycol, polyethylene glycol, and the like), and suitable mixtures
thereof, vegetable oils, such as olive oil, and injectable organic
esters, such as ethyl oleate. Proper fluidity can be maintained,
for example, by the use of coating materials, such as lecithin, by
the maintenance of the required particle size in the case of
dispersions, and by the use of surfactants.
[0212] These compositions may also contain adjuvants such as
preservatives, wetting agents, emulsifying agents and dispersing
agents. Prevention of the presence of microorganisms may be ensured
both by sterilization procedures and by the inclusion of various
antibacterial and antifungal agents, for example, paraben,
chlorobutanol, phenol, sorbic acid, and the like. It may also be
desirable to include isotonic agents, such as sugars, sodium
chloride, and the like into the compositions. In addition,
prolonged absorption of the injectable pharmaceutical form may be
brought about by the inclusion of agents which delay absorption
such as aluminum monostearate and gelatin.
[0213] Pharmaceutically acceptable carriers include sterile aqueous
solutions or dispersions and sterile powders for the extemporaneous
preparation of sterile injectable solutions or dispersions. The use
of such media and agents for pharmaceutically active substances is
known in the art. Except insofar as any conventional media or agent
is incompatible with the active compound, use thereof in the
pharmaceutical compositions of the invention is contemplated.
Supplementary active compounds can also be incorporated into the
compositions.
[0214] Therapeutic compositions typically must be sterile and
stable under the conditions of manufacture and storage. The
composition can be formulated as a solution, microemulsion,
liposome, or other ordered structure suitable to high drug
concentration. The carrier can be a solvent or dispersion medium
containing, for example, water, ethanol, polyol (for example,
glycerol, propylene glycol, and liquid polyethylene glycol, and the
like), and suitable mixtures thereof. The proper fluidity can be
maintained, for example, by the use of a coating such as lecithin,
by the maintenance of the required particle size in the case of
dispersion and by the use of surfactants. In many cases, it will be
preferable to include isotonic agents, for example, sugars,
polyalcohols such as mannitol, sorbitol, or sodium chloride in the
composition. Prolonged absorption of the injectable compositions
can be brought about by including in the composition an agent that
delays absorption, for example, monostearate salts and gelatin.
[0215] Sterile injectable solutions can be prepared by
incorporating the active compound in the required amount in an
appropriate solvent with one or a combination of ingredients
enumerated above, as required, followed by sterilization
microfiltration. Generally, dispersions are prepared by
incorporating the active compound into a sterile vehicle that
contains a basic dispersion medium and the required other
ingredients from those enumerated above. In the case of sterile
powders for the preparation of sterile injectable solutions,
certain methods of preparation are vacuum drying and freeze-drying
(lyophilization) that yield a powder of the active ingredient plus
any additional desired ingredient from a previously
sterile-filtered solution thereof.
[0216] The amount of active ingredient which can be combined with a
carrier material to produce a single dosage form will vary
depending upon the subject being treated, and the particular mode
of administration. The amount of active ingredient which can be
combined with a carrier material to produce a single dosage form
will generally be that amount of the composition which produces a
therapeutic effect. Generally, out of one hundred percent, this
amount will range from about 0.01 percent to about ninety-nine
percent of active ingredient, preferably from about 0.1 percent to
about 70 percent, most preferably from about 1 percent to about 30
percent of active ingredient in combination with a pharmaceutically
acceptable carrier.
[0217] Dosage regimens are adjusted to provide the optimum desired
response (e.g., a therapeutic response). For example, a single
bolus may be administered, several divided doses may be
administered over time or the dose may be proportionally reduced or
increased as indicated by the exigencies of the therapeutic
situation. It is especially advantageous to formulate parenteral
compositions in dosage unit form for ease of administration and
uniformity of dosage. Dosage unit form as used herein refers to
physically discrete units suited as unitary dosages for the
subjects to be treated; each unit contains a predetermined quantity
of active compound calculated to produce the desired therapeutic
effect in association with the required pharmaceutical carrier. The
specifications for the dosage unit forms of the invention are
dictated by and directly dependent on (a) the unique
characteristics of the active compound and the particular
therapeutic effect to be achieved, and (b) the limitations inherent
in the art of compounding such an active compound for the treatment
of sensitivity in individuals.
[0218] For administration of the complex N-glycan attached to a
heterologous molecule, in particular where the heterologous
molecule is an antibody, the dosage ranges from about 0.0001 to 100
mg/kg, and more usually 0.01 to 5 mg/kg, of the host body weight.
For example, dosages can be 0.3 mg/kg body weight, 1 mg/kg body
weight, 3 mg/kg body weight, 5 mg/kg body weight or 10 mg/kg body
weight or within the range of 1-10 mg/kg. An exemplary treatment
regime entails administration once per week, once every two weeks,
once every three weeks, once every four weeks, once a month, once
every 3 months or once every three to 6 months. Certain dosage
regimens for a complex N-glycan attached to a heterologous antibody
include 1 mg/kg body weight or 3 mg/kg body weight via intravenous
administration, with the antibody being given using one of the
following dosing schedules: (i) every four weeks for six dosages,
then every three months; (ii) every three weeks; (iii) 3 mg/kg body
weight once followed by 1 mg/kg body weight every three weeks.
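The weight-based arithmetic behind schedule (iii) can be sketched as follows. This is a minimal illustration only, not a dosing tool: the function names, the 70 kg body weight, and the assumption that the first maintenance dose falls three weeks after the loading dose are hypothetical and are not stated in the text.

```python
def dose_mg(dose_mg_per_kg, body_weight_kg):
    """Convert a weight-based dose (mg/kg) into an absolute dose (mg)."""
    return dose_mg_per_kg * body_weight_kg


def loading_then_maintenance(body_weight_kg, n_maintenance,
                             loading_mg_per_kg=3.0,
                             maintenance_mg_per_kg=1.0,
                             interval_weeks=3):
    """Return (week, dose in mg) pairs for one loading dose followed by
    repeated maintenance doses, as in schedule (iii).

    Assumption: the first maintenance dose is given one interval after
    the loading dose.
    """
    schedule = [(0, dose_mg(loading_mg_per_kg, body_weight_kg))]
    for i in range(1, n_maintenance + 1):
        schedule.append((i * interval_weeks,
                         dose_mg(maintenance_mg_per_kg, body_weight_kg)))
    return schedule


# Hypothetical 70 kg subject, loading dose plus two maintenance doses.
for week, mg in loading_then_maintenance(70, 2):
    print(f"week {week}: {mg:.0f} mg")
```

Keeping the mg/kg values as parameters makes it straightforward to evaluate the other exemplary dosages listed above with the same function.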
[0219] Alternatively a complex N-glycan attached to a heterologous
molecule according to the invention can be administered as a
sustained release formulation, in which case less frequent
administration is required. Dosage and frequency vary depending on
the half-life of the administered substance in the patient. In
general, human antibodies show the longest half-life, followed by
humanized antibodies, chimeric antibodies, and nonhuman antibodies.
The dosage and frequency of administration can vary depending on
whether the treatment is prophylactic or therapeutic. In
prophylactic applications, a relatively low dosage is administered
at relatively infrequent intervals over a long period of time. Some
patients continue to receive treatment for the rest of their lives.
In therapeutic applications, a relatively high dosage at relatively
short intervals is sometimes required until progression of the
disease is reduced or terminated, and preferably until the patient
shows partial or complete amelioration of symptoms of disease.
Thereafter, the patient can be administered a prophylactic
regime.
[0220] Actual dosage levels of the active ingredients in the
pharmaceutical compositions of the present invention may be varied
so as to obtain an amount of the active ingredient which is
effective to achieve the desired therapeutic response for a
particular patient, composition, and mode of administration,
without being toxic to the patient. The selected dosage level will
depend upon a variety of pharmacokinetic factors including the
activity of the particular compositions of the present invention
employed, or the ester, salt or amide thereof, the route of
administration, the time of administration, the rate of excretion
of the particular compound being employed, the duration of the
treatment, other drugs, compounds and/or materials used in
combination with the particular compositions employed, the age,
sex, weight, condition, general health and prior medical history of
the patient being treated, and like factors well known in the
medical arts.
[0221] A "therapeutically effective dosage" of immunoglobulin of
the invention preferably results in a decrease in severity of
disease symptoms, an increase in frequency and duration of disease
symptom-free periods, or a prevention of impairment or disability
due to the disease affliction. For example, for the treatment of
tumors, a "therapeutically effective dosage" preferably inhibits
cell growth or tumor growth by at least about 20%, more preferably
by at least about 40%, even more preferably by at least about 60%,
and still more preferably by at least about 80% relative to
untreated subjects. The ability of a compound to inhibit tumor
growth can be evaluated in an animal model system predictive of
efficacy in human tumors. Alternatively, this property of a
composition can be evaluated by examining the ability of the
compound to inhibit cell growth; such inhibition can be measured in
vitro by assays known to the skilled practitioner. A therapeutically
effective amount of a
therapeutic compound can decrease tumor size, or otherwise
ameliorate symptoms in a subject. One of ordinary skill in the art
would be able to determine such amounts based on such factors as
the subject's size, the severity of the subject's symptoms, and the
particular composition or route of administration selected.
[0222] A composition of the present invention can be administered
via one or more routes of administration using one or more of a
variety of methods known in the art. As will be appreciated by the
skilled artisan, the route and/or mode of administration will vary
depending upon the desired results. Certain routes of
administration for binding moieties of the invention include
intravenous, intramuscular, intradermal, intraperitoneal,
subcutaneous, spinal or other parenteral routes of administration,
for example by injection or infusion. The phrase "parenteral
administration" as used herein means modes of administration other
than enteral and topical administration, usually by injection, and
includes, without limitation, intravenous, intramuscular,
intraarterial, intrathecal, intracapsular, intraorbital,
intracardiac, intradermal, intraperitoneal, transtracheal,
subcutaneous, subcuticular, intraarticular, subcapsular,
subarachnoid, intraspinal, epidural and intrasternal injection and
infusion.
[0223] Alternatively, a complex N-glycan attached to a heterologous
molecule according to the invention can be administered via a
nonparenteral route, such as a topical, epidermal or mucosal route
of administration, for example, intranasally, orally, vaginally,
rectally, sublingually or topically.
[0224] The active compounds can be prepared with carriers that will
protect the compound against rapid release, such as a controlled
release formulation, including implants, transdermal patches, and
microencapsulated delivery systems. Biodegradable, biocompatible
polymers can be used, such as ethylene vinyl acetate,
polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and
polylactic acid. Many methods for the preparation of such
formulations are patented or generally known to those skilled in
the art. See, e.g., Sustained and Controlled Release Drug Delivery
Systems, J. R. Robinson, ed., Marcel Dekker, Inc., New York,
1978.
[0225] Therapeutic compositions can be administered with medical
devices known in the art. For example, in a certain embodiment, a
therapeutic composition of the invention can be administered with a
needleless hypodermic injection device, such as the devices
disclosed in U.S. Pat. Nos. 5,399,163; 5,383,851; 5,312,335;
5,064,413; 4,941,880; 4,790,824; or 4,596,556. Examples of
well-known implants and modules useful in the present invention
include: U.S. Pat. No. 4,487,603, which discloses an implantable
micro-infusion pump for dispensing medication at a controlled rate;
U.S. Pat. No. 4,486,194, which discloses a therapeutic device for
administering medicants through the skin; U.S. Pat. No. 4,447,233,
which discloses a medication infusion pump for delivering
medication at a precise infusion rate; U.S. Pat. No. 4,447,224,
which discloses a variable flow implantable infusion apparatus for
continuous drug delivery; U.S. Pat. No. 4,439,196, which discloses
an osmotic drug delivery system having multi-chamber compartments;
and U.S. Pat. No. 4,475,196, which discloses an osmotic drug
delivery system.
[0226] In certain embodiments, the use of the complex N-glycan
attached to a heterologous molecule according to the invention is
for the treatment of any disease that may be treated with
therapeutic antibodies.
[0227] It is to be understood that, while the invention has been
described in conjunction with certain specific embodiments
thereof, the foregoing description is intended to illustrate and
not limit the scope of the invention. Other aspects, advantages,
and modifications within the scope of the invention will be
apparent to those skilled in the art to which the invention
pertains.
[0228] The invention having been described, the following examples
are offered by way of illustration, not by way of limitation.
EXAMPLES
Example 1
Host Strain Selection for Glycoengineering
[0229] The aim of this example was to identify optimal T. reesei
strains for glycoengineering. An optimal strain produces high
amounts of Man5 N-glycans and low amounts of acidic glycans.
[0230] Samples
[0231] Different T. reesei strains including M44 (VTT-D-00775;
Selinheimo et al., FEBS J. 2006, 273(18): 4322-35), M81, M84, M109,
M110, M131, M132, M133, M134 and M124 (a mus53-deleted strain of
M44) were analyzed. Each of the ten strains was grown in shake
flask cultures. Samples were taken at three different time points:
3 days, 5 days, and 7 days. Both supernatants (secreted proteins)
and cell pellets were collected and stored frozen at -20.degree. C.
until glycan analysis was conducted.
[0232] N-glycans were isolated from secreted proteins from the
indicated time points followed by matrix-assisted laser
desorption/ionization-time-of-flight (MALDI-TOF) glycan profiling.
Cell pellets from the 5-day time point were subjected to N-glycan
profiling. A total of 80 samples (30 each of neutral and acidic
supernatant fractions, and 10 each of neutral and acidic pellet
fractions) were subjected to analysis.
[0233] Strain M44 was also subjected to batch and fed-batch
fermentor cultivation in order to assess the difference in glycan
profile between shake flask and fermentor culture. For glycan
analysis, samples from three different time points were analyzed
for a total of 12 samples (6 neutral and 6 acidic fractions). As a
control, culture medium was analyzed.
[0234] Mass Spectrometry Methods
[0235] MALDI-TOF mass spectrometry was performed with a Bruker
Ultraflex TOF/TOF instrument (Bruker Daltonics, Germany). Neutral
N-glycans were detected in positive ion reflector mode as
[M+Na].sup.+ ions, and acidic N-glycans were detected in negative
ion linear mode as [M-H].sup.- ions. The relative molar abundance
of neutral N-glycan components was assigned based on their relative
signal intensities in the spectra. The resulting glycan signals in
the presented glycan profiles were normalized to 100% to allow
comparison between samples.
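The normalization step described above can be sketched as follows. This is a minimal illustration assuming that relative abundance is taken directly as each signal's share of the summed intensity; the function name and the raw intensity values are hypothetical and are not taken from the study.

```python
def normalize_to_percent(intensities):
    """Scale raw signal intensities so that they sum to 100%.

    `intensities` maps a glycan composition label to its raw signal
    intensity in one spectrum.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total signal intensity must be positive")
    return {label: 100.0 * value / total
            for label, value in intensities.items()}


# Hypothetical raw intensities (arbitrary units) for one neutral spectrum.
raw = {"H5N2": 16200.0, "H6N2": 1160.0, "H7N2": 960.0, "H8N2": 740.0}
for label, percent in normalize_to_percent(raw).items():
    print(f"{label}: {percent:.1f}%")
```

Because each spectrum is normalized separately, the resulting percentages compare the distribution of glycoforms between samples, not their absolute amounts.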
[0236] Protein-Specific Glycosylation Methods
[0237] Proteins from a fermentor-cultured sample were separated
with SDS-PAGE and blotted to a PVDF membrane. The protein bands of
interest were excised, and N-glycans were liberated by enzymatic
release with PNGase F.
[0238] Neutral N-glycan Profile of T. reesei Strains
[0239] The desired Man5 structure can be observed as a [M+Na].sup.+
signal at m/z value of 1257.4 in the mass spectra presented in FIG.
1. The neutral glycomes of the analyzed T. reesei strains were found
to have either Man5 or Man8 as the main neutral glycan species
(H5N2 and H8N2 in Table 2).
TABLE 2 The percentage of different neutral N-glycan signals of
analyzed T. reesei strains.
Strain              M44   M81   M84   M109  M110  M131  M132  M133  M134  M124
Composition  m/z    %     %     %     %     %     %     %     %     %     %
H3N2         933    0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
H4N2         1095   1.2   0.0   0.0   1.8   0.9   0.0   2.3   0.0   2.3   4.1
H5N2         1257   81.0  70.8  4.0   78.9  5.8   78.8  84.1  10.7  73.2  77.9
H6N2         1419   5.8   5.3   0.0   5.3   0.9   4.8   4.6   0.9   6.0   7.3
H7N2         1581   4.8   7.3   1.5   4.7   3.0   4.8   3.9   3.8   5.8   4.8
H8N2         1743   3.7   8.6   81.5  5.1   68.2  5.9   2.6   68.1  6.3   3.3
H9N2         1905   2.9   8.0   9.0   3.4   16.0  4.6   2.0   12.8  5.7   2.3
H10N2        2067   0.5   0.0   2.5   0.8   3.7   1.1   0.4   2.5   0.7   0.4
H11N2        2229   0.0   0.0   1.5   0.0   1.4   0.0   0.0   1.2   0.0   0.0
H12N2        2391   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
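The m/z values listed in Table 2 can be reproduced from the glycan compositions. The sketch below assumes singly charged sodium adducts of free reducing-end glycans and uses standard monoisotopic residue masses, which are not given in the text; the function name is hypothetical. Table 2 appears to list the integer part of each value.

```python
# Monoisotopic masses in Da (standard values, not taken from the source text).
HEX = 162.05282      # hexose residue, C6H10O5
HEXNAC = 203.07937   # N-acetylhexosamine residue, C8H13NO5
WATER = 18.01056     # H2O added for the free reducing-end glycan
NA_ION = 22.98922    # sodium cation (Na minus one electron)


def sodiated_mz(n_hex, n_hexnac):
    """Monoisotopic m/z of the singly charged [M+Na]+ ion of a
    Hex(n_hex)HexNAc(n_hexnac) glycan."""
    neutral_mass = n_hex * HEX + n_hexnac * HEXNAC + WATER
    return neutral_mass + NA_ION


# H3N2 through H12N2, the compositions listed in Table 2.
for n in range(3, 13):
    print(f"H{n}N2  {sodiated_mz(n, 2):.1f}")
```

For H5N2 (Man5) this gives approximately 1257.4, matching the signal described for FIG. 1.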
[0240] Some acidic N-glycans were observed in neutral N-glycan
fractions. This may have been due to specific properties of the
phosphorylated glycans, e.g. presence of phosphodiester structures,
or other properties of the phosphoglycans which could lead to
leakage of acidic species to neutral fraction under the
experimental conditions used in this study. To check the
corresponding structure, the signal of interest was subjected to
MS/MS analysis. Mass spectrometric fragmentation of glycans was
performed using Bruker Ultraflex TOF/TOF in MS/MS analysis mode
(FIG. 2). Because the glycans were not permethylated, definitive
structural assignment based on the MS/MS data could not be
obtained.
[0241] Acidic N-glycan Profiles of T. reesei Strains
[0242] For glycoengineering purposes it was useful to have strains
with a minimum amount of acidic N-glycans. Therefore, acidic
N-glycan profiles were analyzed from the strains used for
screening. The acidic N-glycan spectra of analyzed strains are
shown in FIG. 3 and below in Table 3.
TABLE 3 The percentage of different acidic N-glycan signals of
analyzed T. reesei strains.
Strain                  M44   M81   M84   M109  M110  M131  M132  M133  M134  M124
Composition      m/z    %     %     %     %     %     %     %     %     %     %
Hex3HexNAc2SP    989    0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
Hex4HexNAc2SP    1151   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
Hex5HexNAc2SP    1313   4.0   5.2   0.0   3.7   0.0   2.8   7.4   0.0   5.2   2.8
Hex5HexNAc2SP2   1393   0.0   0.0   0.0   0.7   0.0   0.0   0.0   0.0   0.9   0.0
Hex6HexNAc2SP    1475   23.7  27.3  2.2   18.1  2.1   22.4  21.0  3.9   24.9  26.3
Hex6HexNAc2SP2   1555   0.0   2.8   0.0   2.4   0.0   3.2   1.1   0.0   3.6   1.7
Hex7HexNAc2SP    1637   30.3  18.8  1.1   16.2  2.0   14.9  24.7  0.0   17.2  23.3
Hex7HexNAc2SP2   1717   0.0   7.7   0.0   8.6   0.0   10.7  2.5   0.0   10.4  7.0
Hex8HexNAc2SP    1799   18.4  11.8  17.9  12.8  9.7   9.1   19.7  14.5  8.8   11.2
Hex8HexNAc2SP2   1879   5.1   8.8   0.0   11.0  0.0   14.8  4.0   0.0   12.4  10.0
Hex9HexNAc2SP    1961   7.3   6.4   49.1  9.5   37.9  5.9   6.1   53.9  4.1   3.5
Hex9HexNAc2SP2   2041   4.2   5.0   0.0   5.7   0.0   7.3   5.1   0.0   5.9   7.2
Hex10HexNAc2SP   2123   2.8   2.9   19.7  4.5   28.1  2.6   2.3   19.3  2.1   1.6
Hex10HexNAc2SP2  2203   2.8   2.1   0.0   2.2   0.0   2.7   3.6   0.0   1.9   3.3
Hex11HexNAc2SP   2285   1.5   1.3   3.7   2.1   9.5   1.2   0.9   5.0   1.0   0.8
Hex11HexNAc2SP2  2365   0.0   0.0   0.0   0.9   0.0   1.3   1.5   0.0   0.8   1.3
Hex12HexNAc2SP   2447   0.0   0.0   1.3   1.0   1.6   1.0   0.0   0.0   0.5   0.0
Hex12HexNAc2SP2  2527   0.0   0.0   0.0   0.4   0.0   0.0   0.0   0.0   0.3   0.0
Hex13HexNAc2SP   2609   0.0   0.0   1.2   0.4   1.1   0.0   0.0   0.0   0.0   0.0
Hex14HexNAc2SP   2771   0.0   0.0   0.6   0.0   0.9   0.0   0.0   0.0   0.0   0.0
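The m/z values in Table 3 are consistent with deprotonated glycans carrying one or two acidic ester groups. The sketch below assumes that each "SP" unit is a phosphate ester (mass increment HPO3), which the mass data alone cannot distinguish from a sulfate ester at this nominal-mass level; the masses are standard monoisotopic values and the function name is hypothetical.

```python
# Monoisotopic masses in Da (standard values, not taken from the source text).
HEX = 162.05282        # hexose residue, C6H10O5
HEXNAC = 203.07937     # N-acetylhexosamine residue, C8H13NO5
WATER = 18.01056       # H2O added for the free reducing-end glycan
PHOSPHATE = 79.96633   # HPO3 increment per phosphate ester (assumed identity of "SP")
PROTON = 1.00728       # mass of the proton removed in negative ion mode


def deprotonated_mz(n_hex, n_hexnac, n_phosphate):
    """Monoisotopic m/z of the singly charged [M-H]- ion of a
    phosphorylated Hex(n_hex)HexNAc(n_hexnac) glycan."""
    neutral_mass = (n_hex * HEX + n_hexnac * HEXNAC + WATER
                    + n_phosphate * PHOSPHATE)
    return neutral_mass - PROTON


# A few compositions from Table 3: (Hex, HexNAc, number of acidic groups).
for composition in [(5, 2, 1), (5, 2, 2), (7, 2, 1), (9, 2, 1)]:
    print(composition, f"{deprotonated_mz(*composition):.1f}")
```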
[0243] N-glycan Profile from Fermentor Cultured Strain M44
[0244] Strain M44 was cultivated in a fermentor in order to find
out if different culture conditions can cause changes in its glycan
profile. N-glycan analysis was performed for samples cultured in a
fermentor (Batch; 41:10, 88:45 and 112:50 hours, and Fed batch;
45:50, 131:40 and 217:20 hours) and compared to that of shake flask
culture. Neutral and acidic N-glycans of secreted proteins of T.
reesei strain M44 cultured in fermentor are shown in FIG. 4.
Comparison between the N-glycan percentages from flask and
fermentor cultures is presented below in Table 4.
TABLE 4 The percentage of N-glycan signals of T. reesei strain M44
cultured in flask and in fermentor.
Composition  m/z    flask %  fermentor %
H3N2         933    0.0      0.0
H4N2         1095   1.2      0.0
H5N2         1257   81.0     91.3
H6N2         1419   5.8      4.5
H7N2         1581   4.8      4.2
H8N2         1743   3.7      0.0
H9N2         1905   2.9      0.0
H10N2        2067   0.5      0.0
H11N2        2229   0.0      0.0
H12N2        2391   0.0      0.0
[0245] N-glycan Analysis of Shake Flask Culture Medium
[0246] As a control experiment, culture medium (without contact
with fungus) of T. reesei was analyzed. FIG. 5a shows neutral
N-glycan analysis in which no N-glycans were observed. Only minor
signals of hexose oligomers, most likely derived from the plant
material used in the medium, were visible above the baseline. In
FIG. 5b (acidic glycans), no signals corresponding to N-glycans
were observed.
[0247] N-glycosylation of Secreted Proteins
[0248] To check whether there is variation in glycosylation between
individual secreted proteins, the samples from fermentation culture
supernatants were separated with SDS-PAGE and blotted to PVDF
membrane. The N-glycans of selected bands were then detached with
on-membrane enzymatic release. Results are shown in FIGS. 6 and
7.
[0249] Conclusions: Neutral Glycans
[0250] The purpose of this study was to identify T. reesei strains
for glycoengineering with the highest amount of Man5 N-glycans and
the lowest amount of acidic glycans. Strains which have Man5 as a
main peak in mass spectrometry analysis can have higher endogenous
.alpha.-1,2-mannosidase activity. Based on the background
information on T. reesei N-glycosylation, the likely structure for
Man5 is
Man.alpha.3[Man.alpha.3(Man.alpha.6)Man.alpha.6]Man.beta.4GlcNAc.beta.4GlcNAc
(Salovuori et al. 1987; Stals et al. Glycobiology 14, 2004,
page 725).
[0251] Some strains contained H8N2 as a major neutral glycoform.
Based on the literature, this glycoform is most likely to be a
Glc.alpha.3Man.alpha.2Man.alpha.2Man5 structure (Stals et al.
Glycobiology 14, 2004, page 725). It is possible that glucosidase
deficiency in these strains prevents the trimming of the glycans to
the smaller glycoforms.
[0252] In some strains, acidic N-glycans were observed in neutral
spectra. This situation may have been due to a higher proportion of
acidic N-glycans or to leakage of specific structures into the
neutral fraction during the separation of neutral glycans from
acidic glycans.
[0253] The glycan profile of strains was somewhat more favorable for
glycoengineering when cultivated in a fermentor than in
shake flasks. The glycosylation of individual proteins from
fermentor-cultured samples did not differ significantly from average
glycosylation. All analyzed proteins contained Man5 as a main
glycoform. This observation suggested that all secreted proteins go
through similar glycan processing. Thus it appeared that the
majority of secreted proteins were glycosylated similarly by the T.
reesei host cells, which is not always the case with mammalian
cells.
[0254] Acidic Glycans
[0255] The phosphorylation of N-glycans is not generally desired for
glycoengineering because the terminal phosphate residue is not
present in regular therapeutic proteins, including antibodies. Some
exceptions to this rule are a few specialized proteins used for
lysosomal glycosylation storage disorders. Phosphorylation of
N-glycans may be protein-specific in fungi. In animals, mannose
phosphorylation is a conserved lysosomal targeting signal.
[0256] To date there have been no reports of sulfation of T. reesei
N-glycans. Therefore, the acidic structures referred to in this
report were likely to be phosphorylated glycans.
[0257] Phosphorylation is more common when T. reesei is cultivated
at low pH, as is the case in flask cultures, which may be
related to low pH stress and mycelia breakage (Stals et al., 2004,
Glycobiology 14:713-724). In this study a clear difference was
observed between flask and fermentor cultured samples. Acidic
N-glycans, all phosphorylated, were observed in shake flask culture
samples. The amount of acidic N-glycans in fermentor samples may
have been below the detection limit, or, because of higher pH there
may have been no significant phosphorylation of glycans. The
proportion of acidic N-glycans to the total amount of N-glycans
could not be verified with the method used in this study due to the
different ionization efficiencies between neutral and acidic glycan
species.
[0258] In order to determine phosphorylation levels, N-glycans were
released by N-glycanase from 10 .mu.g of T. reesei secreted protein
cultured in batch and fed batch fermentor. Protein concentration
was measured using a Bradford-based method with BSA as a standard.
One pmol of standard molecule NeuAcHex4HexNAc2 was added to acidic
N-glycans samples prior to MALDI-TOF analysis. Amounts of major
glycoforms (Hex7HexNAc2P for fermentor and Hex6-8HexNAc2P for flask
culture) were 0.9 pmol/10 .mu.g of secreted protein of batch
culture, 0.6 pmol/10 .mu.g of secreted protein of fed batch
culture, and 160 pmol/10 .mu.g of secreted protein of flask culture
when the pH of the culture was allowed to drop. The amount of
neutral N-glycans was measured using 10 pmol of standard glycan
Hex2HexNAc4 added to neutral N-glycan samples, prior to MALDI-TOF
analysis. The amount of major glycoform Hex5HexNAc2 was 87 pmol/10
.mu.g of secreted protein in batch and fed-batch cultures and 145
pmol/10 .mu.g of secreted protein in flask culture. Thus, the
proportion of acidic N-glycans to total amount of N-glycans was 1%
in batch culture, 0.7% in fed-batch culture and 52% in flask
culture. Quantitation was based only on signal intensity comparison
using MALDI-TOF data.
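The proportions quoted above follow directly from the reported amounts of the major glycoforms. The short sketch below reproduces that arithmetic; it assumes, as the paragraph does, that the major acidic and neutral glycoforms stand in for the total acidic and neutral pools. The variable and function names are illustrative only.

```python
# Amounts of the major glycoforms in pmol per 10 ug of secreted protein,
# copied from the paragraph above.
major_glycoforms = {
    "batch": {"acidic": 0.9, "neutral": 87.0},
    "fed-batch": {"acidic": 0.6, "neutral": 87.0},
    "flask": {"acidic": 160.0, "neutral": 145.0},
}


def acidic_percentage(acidic_pmol, neutral_pmol):
    """Return acidic glycans as a percentage of acidic plus neutral glycans."""
    return 100.0 * acidic_pmol / (acidic_pmol + neutral_pmol)


percentages = {
    culture: acidic_percentage(amounts["acidic"], amounts["neutral"])
    for culture, amounts in major_glycoforms.items()
}

for culture, pct in percentages.items():
    print(f"{culture}: {pct:.1f}% acidic N-glycans")
```

Because the estimate rests on MALDI-TOF signal intensities against a single spiked standard per fraction, the result is best read as an approximate ratio rather than an absolute quantitation.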
[0259] N-glycans were also larger in acidic fraction. This may have
been due to phospho-mannosylation reactions in which phosphorus
with one hexose unit is attached to a glycan backbone. Some
diphosphorylated structures were seen in acidic spectra. This
explanation is in agreement with the previously published data on
phosphorylated glycans found in T. reesei (Stals et al. 2004,
Glycobiology 14:725-737). When cultured in a fermentor, the
proportion of acidic N-glycans was very low, below the detection
limit.
[0260] The N-glycan spectra of T. reesei culture media did not
reveal contamination of the T. reesei N-glycome with glycans
derived from plant material containing medium.
[0261] In conclusion, N-glycan analysis of different T. reesei
strains revealed that the major glycoform in strains M44, M109,
M131, M132 and M124 is Man5 or
Man.alpha.3[Man.alpha.3(Man.alpha.6)Man.alpha.6]Man.beta.4GlcNAc.beta.4GlcNAc.
The possible presence of glucose, including H8N2 as a minor
component in Man5-producing strains was considered. Two strains
(M109 and M131) contained a larger amount of H8N2 than H7N2. The
enrichment of H8N2 could have indicated partial glucosidase
deficiency.
[0262] Strain M44 contained almost no phosphorylated glycans.
Acidic glycans leaking into the neutral glycan fraction were
observed as signals at m/z 1521 and m/z 1683 in samples from
strains M131, M109, M132 and M124, which indicated higher
phosphorylation levels and the presence of potential phosphodiester
structures.
[0263] The aim of this study was to find a strain with maximal
production of Man5Gn2 structure and low-level production of acidic
(phosphorylated) N-glycans. The best strains had over 80% of Man5
under pH-controlled shake flask culture conditions. The best
strains also had reduced production of di-phosphorylated glycans
and/or larger phosphorylated structures (see Table 3).
Example 2
Generation of an Alg3-Deficient Trichoderma Strain
[0264] Vector Construction and Strain Generation
[0265] The gene encoding the ALG3 mannosyltransferase was
identified in the Trichoderma reesei genome sequence. A disruption
construct was designed to insert the acetamidase selection marker
between 1000 bp 5' and 3' flanking region fragments of the alg3
gene. The flanking region fragments were amplified by PCR, and the
construct was made by homologous recombination cloning in
Saccharomyces cerevisiae. The disruption cassette was released from
its backbone vector by digestion and transformed into the T. reesei
strain M124. Transformants were selected on acetamidase medium and
screened by PCR with a forward primer outside the 5' flanking
region fragment of the construct and the reverse primer inside the
AmdS selection marker.
[0266] Screening of Transformants
[0267] Fifty-eight out of 62 screened transformants gave a PCR
product of the size expected for integration of the construct to
the alg3 locus. Nine PCR-positive transformants were purified to
uninuclear clones through single spore cultures, and spore
suspensions were made from them. These nine clones were analyzed
for the correct integration of the disruption cassette by Southern
hybridization. EcoRI-digested genomic DNA from the parental strain
and from nine clones was hybridized with an alg3 probe under
standard hybridization conditions. The probe hybridized with DNA
from the parental strain, but not with DNA from any of the clones,
indicating successful deletion of alg3 (FIG. 8).
[0268] Further analysis was made by Southern hybridization with an
AmdS probe. The AmdS gene was included in the deletion cassette and
was predicted to be detectable in DNA from the transformants, but
not in DNA from the parental strain. Genomic DNA of parental strain
M124 and nine transformants was digested with EcoRI+PvuI (E+P) and
KpnI+NheI (K+N). NotI digested plasmid carrying the alg3-AmdS
deletion cassette was used as a positive control. The probe
recognized the expected .about.2.7 kb fragment (AmdS) from the
positive control but did not hybridize with the parental strain.
All transformants gave the expected signals (1.6+2.8 kb for E+P and
1.7+3.4 kb for K+N, shown with arrows in FIG. 9B) indicating
correct integration of the deletion cassette. Clones 11A and 15A
also showed hybridization of some additional fragments suggesting
unspecific integration of the deletion cassette to the genome (FIG.
9B).
[0269] N-glycan Analytics
[0270] Shake-flask cultures of five different Alg3 knockout strains
(4A, 5A, 6A, 10A and 16A) and parental strain M124 were analyzed
for N-glycans. Samples were collected from time points of 3, 5, 7,
and 9 days. All cultures were grown as duplicates.
[0271] The protein concentration of secreted proteins from a
randomly selected knockout strain (4A) from all time points was
measured using a Bradford-based assay against a BSA standard curve.
The highest protein concentration was detected on day 5. Therefore,
day 5 samples were used for N-glycan analysis for all five knockout
strains. All samples, including the duplicate cultures, were
analyzed as triplicates. Ten .mu.g was used for N-glycan analysis.
Both neutral and acidic N-glycans were analyzed by MALDI-TOF.
[0272] The major glycoform in parental strain M124 was Man5Gn2. In
all Alg3 knockout strains the major glycoform was Man3 (FIG. 10).
No Man3 was found in the parental strain M124. In different Alg3
knockout strains the amount of Man3 ranged from 49.7% to 55.2% in
shake-flask cultures in which the pH was allowed to drop. Hex6Gn2 was
increased in the knockout strains relative to the parental strain.
Signal intensities as percentages of observed
neutral N-glycan signals are presented in Table 5 below.
TABLE-US-00005 TABLE 5 Neutral N-glycan content of Alg3 knockout strains.
                       Parental M124  4A             5A             6A             10A            16A
Composition        m/z Average STDEV  Average STDEV  Average STDEV  Average STDEV  Average STDEV  Average STDEV
Hex3HexNAc2     933.31     0.0   0.0     53.6   0.2     55.2   4.2     49.7   0.5     53.3   0.9     53.4   0.9
Hex4HexNAc2    1095.37     1.6   0.1      2.7   0.0      2.9   0.7      3.4   0.1      3.2   0.4      3.4   0.4
Hex5HexNAc2    1257.42    70.2   3.3      8.5   0.2      7.3   1.1     10.4   0.5      8.6   0.9      9.7   0.9
Hex6HexNAc2    1419.48     7.9   1.1     35.0   0.3     84.4   1.9     36.1   0.6     34.9   0.5     33.2   0.7
Hex7HexNAc2    1581.53     7.8   0.6      0.3   0.4      0.3   0.4      0.3   0.4      0.0   0.0      0.3   0.4
Hex8HexNAc2    1743.58     5.9   0.7      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0
Hex9HexNAc2    1905.63     6.0   0.9      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0
Hex10HexNAc2   2067.69     0.7   0.1      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0
[0273] The presence of different isomers of each glycoform cannot
be observed by MALDI MS analysis, so further tandem mass
spectrometry studies were performed. First, the Man3 and Hex5Gn2
structures were investigated. For Man3 it was asked whether the
Man3 structure is branched or linear. For this analysis, a sample
containing both these structures was permethylated and analyzed
with mass spectrometric fragmentation using the Bruker Ultraflex
III TOF/TOF instrument according to the manufacturer's instructions
(FIGS. 11 and 12).
[0274] Next, it was determined whether the hexose unit on the
non-reducing end of the Hex6Gn2 structure is a mannose or a
glucose. Alpha-mannosidase digestion was performed on all knockout
strains and the parental strain (FIG. 13). Jack bean mannosidase,
which cleaves .alpha.-mannoses and leaves the .beta.-mannose from
backbone untouched, was used. The resulting structure was expected
to be Man1Gn2.
[0275] Due to low molecular weight range effects in MALDI, the
relative intensity of the Man1GlcNAc2 glycan may have been somewhat
reduced, which would explain the small increase in the relative amount of
Hex6. After .alpha.-mannosidase digestion, Man3 and Man4 glycoforms
disappeared. No Man2 structure was observed. However, Hex6 (m/z
1419) was not digested (Table 6) indicating that there was a
glucose unit on the non-reducing end of the structure. Some
non-digestible Hex5 was also present, likely produced by a weak
reaction removing the sterically hindered Man6-branch of Hex6.
TABLE-US-00006 TABLE 6 Neutral N-glycans of Alg3 knockout strain 4A
before (native) and after .alpha.-mannosidase digestion.
                       4A, Average %
Composition        m/z   Native   a-Man'ase
Hex1HexNAc2     609.21      0.0        53.2
Hex2HexNAc2     771.26      0.0         0.0
Hex3HexNAc2     933.31     47.5         0.0
Hex4HexNAc2    1095.37      3.8         0.0
Hex5HexNAc2    1257.42     11.7         5.0
Hex6HexNAc2    1419.48     36.8        41.0
Hex7HexNAc2    1581.53      0.2         0.8
Hex8HexNAc2    1743.58      0.0         0.0
Hex9HexNAc2    1905.63      0.0         0.0
Hex10HexNAc2   2067.69      0.0         0.0
[0276] For the final analysis of different structures found in the
Alg3 knockout strains, a large-scale PNGase F digestion was
performed on Alg3 knockout strain 4A. Two major glycans were
purified with HPLC (FIG. 14) and analyzed by NMR (FIG. 15).
[0277] Based on the data presented in FIG. 15A, the Hex3HexNAc2
species was unambiguously identified as
Man.alpha.1-3(Man.alpha.1-6)Man.beta.1-4GlcNAc.beta.1-4GlcNAc. The
Man.alpha.3 and Man.alpha.6 H-1 units resonated at 5.105 and 4.914
ppm, respectively. The Man.beta.4 H-2 unit was observed at 4.245
ppm. This signal was very characteristic, due to the neighboring
Man.alpha.3-OH substitution. The N-acetyl group --CH3 signals of
the core GlcNAc units were observed at 2.038 and 2.075 ppm. These
values agreed well with those reported for this pentasaccharide in
the Sugabase-database (www.boc.chem.uu.nl/sugabase/sugabase.html).
Moreover, the proton-NMR spectrum was measured for a commercially
produced
Man.alpha.1-3(Man.alpha.1-6)Man.beta.1-4GlcNAc.beta.1-4GlcNAc
(Glycoseparations, Inc.) in identical experimental conditions, and
nearly identical chemical shifts were obtained.
[0278] The NMR spectrum of the Hex6HexNAc2 component is shown in
FIG. 15B. The data implied that this component represents the
octasaccharide
Glc.alpha.1-3Man.alpha.1-2Man.alpha.1-2Man.alpha.1-3(Man.alpha.1-6)Man.beta.1-4GlcNAc.beta.1-4GlcNAc.
The presence of a glucose unit was evident
from the 5.255 ppm signal showing a typical .alpha.Glc 2.4 Hz coupling.
All Man signals typically show <1 Hz coupling due to the
equatorial H-2 configuration. Small differences were observed
compared to the Sugabase data (Table 7), which may be ascribed to
the different temperature used in the present NMR measurement
(40.degree. C. vs. 26.degree. C.).
TABLE-US-00007 TABLE 7 Published NMR data of
Glc.alpha.1-3Man.alpha.1-2Man.alpha.1-2Man.alpha.1-3(Man.alpha.1-6)Man.beta.1-4GlcNAc.beta.1-4GlcNAc.
Data was obtained from Sugabase (found at boc.chem.uu.nl/sugabase/sugabase).
##STR00001##
Residue        Linkage            Proton   PPM     J Hz
D-GlcNAc                          H-1a     5.189
                                  H-1b     4.694
                                  H-2a     3.867
                                  H-2b     3.692
                                  NAc      2.038
b-D-GlcpNAc    4                  H-1      4.606
                                  H-2      3.792
                                  NAc      2.077
b-D-Manp       4, 4               H-1      4.773
                                  H-2      4.237
a-D-Manp       6, 4, 4            H-1      4.913
                                  H-2      3.964
a-D-Manp       3, 4, 4            H-1      5.346
                                  H-2      4.080
a-D-Manp       2, 3, 4, 4         H-1      5.304
                                  H-2      4.103
a-D-Manp       2, 2, 3, 4, 4      H-1      5.038
                                  H-2      4.224
a-D-Glcp       3, 2, 2, 3, 4, 4   H-1      5.247
                                  H-2      3.544
[0279] Finally, the N-glycan profiles of randomly selected knockout
strain 4A were analyzed at different time points (days 3, 5, 7 and
9). The shake flask culture pH was 4.8 at the starting time point
and 2.6 at the ending time point. Triplicate samples from every
time point of duplicate cultures were analyzed. It was observed
that in both duplicates, the relative amount of Man3Gn2 signal
decreased as a function of growth time because of the reduction of
pH. However, the amount of Hex6Gn2 signal increased as a function
of growth time (Table 8).
TABLE-US-00008 TABLE 8 The percentages of signal intensities from
observed neutral glycan signals of Alg3 4A knockout strain.
Duplicate cultures (3A and 4A) from four different time points
(days 3, 5, 7 and 9) were analyzed.
Alg3 knock out strain 4A (flask 3A)
                       Day 3, 3A      Day 5, 3A      Day 7, 3A      Day 9, 3A
Composition        m/z average stdev  average stdev  average stdev  average stdev
Hex3HexNAc      730.24     0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0
Hex2HexNAc2     771.26     0.0   0.0      0.0   0.0      0.0   0.0      0.0   0.0
Hex3HexNAc2     933.31    61.7   3.7     61.3   0.8     61.1   1.9     52.7   7.7
Hex4HexNAc2    1095.37     2.6   0.2      2.5   0.1      2.1   0.4      3.7   1.0
Hex5HexNAc2    1257.42     4.3   0.6      6.5   0.4      5.7   0.6      6.4   1.0
Hex6HexNAc2    1419.48    31.4   3.5     29.8   0.4     31.1   1.6     37.2   5.7
Alg3 knock out strain 4A (flask 4A)
                       Day 3, 3A      Day 5, 3A      Day 7, 3A      Day 9, 3A
Composition        m/z average stdev  average stdev  average stdev  average stdev
Hex3HexNAc      730.24     0.0   0.0      0.0   0.0      0.0   0.0      0.7   1.2
Hex2HexNAc2     771.26     0.0   0.0      0.0   0.0      0.0   0.0      0.3   0.5
Hex3HexNAc2     933.31    61.7   3.2     58.6   1.1     55.6   1.9     54.8   5.9
Hex4HexNAc2    1095.37     3.4   1.0      2.6   0.2      3.1   0.2      2.6   0.5
Hex5HexNAc2    1257.42     5.2   1.5      6.7   0.4      7.1   0.4      7.6   3.7
Hex6HexNAc2    1419.48    29.7   0.9     32.1   0.8     34.3   1.5     34.0   3.6
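The time-course trend described for Table 8 can be made explicit by averaging the duplicate flasks at each time point. The sketch below does this for the Hex3HexNAc2 (Man3Gn2) and Hex6HexNAc2 signals only. The values are copied from Table 8; the names and the choice to average the two flasks are illustrative assumptions, not part of the reported analysis.

```python
from statistics import mean

days = [3, 5, 7, 9]

# Percent signal intensities copied from Table 8 for the duplicate flasks.
man3 = {"3A": [61.7, 61.3, 61.1, 52.7], "4A": [61.7, 58.6, 55.6, 54.8]}
hex6 = {"3A": [31.4, 29.8, 31.1, 37.2], "4A": [29.7, 32.1, 34.3, 34.0]}


def duplicate_means(series):
    """Average the two duplicate flasks at each time point."""
    return [round(mean(pair), 2) for pair in zip(series["3A"], series["4A"])]


man3_mean = duplicate_means(man3)
hex6_mean = duplicate_means(hex6)

for day, m3, h6 in zip(days, man3_mean, hex6_mean):
    print(f"day {day}: Man3Gn2 {m3:.2f}%  Hex6Gn2 {h6:.2f}%")
```

Averaged this way, the Man3Gn2 signal is lower on day 9 than on day 3 and the Hex6Gn2 signal is higher, which matches the trend stated above. The standard deviations in Table 8 are large relative to some of the day-to-day differences, so the intermediate time points should be read with caution.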
[0280] A difference between these two analyses (Tables 5 and 8)
concerning the percentage of Man3 in clone 4A (Day 5) was noted.
This difference may have been due to differences in the analysis
procedures. Some lability of the heterogeneous culture medium
protein preparations was observed after freeze-thaw cycle(s),
likely due to glycan and/or protein degradation, resulting in
reduced amounts of larger glycans. Generation of the data in Table
5 included additional freeze-thaw cycles.
[0281] Acidic N-glycan fractions were also analyzed by MALDI (FIG.
16). The abundance of different acidic compounds in parental strain
M124 differed from all Alg3 knockout strains, among which the
acidic fraction seemed to be very similar.
[0282] Three major glycans in the parental strain were H6N2P1,
H7N2P1 and H8N2P1. In the Alg3 knockout the size shifted into
smaller glycans: H5N2P1, H6N2P1 and H4N2P1. Additionally,
diphosphorylated glycans were more abundant in the parental strain.
This may have been due to a lack of a suitable substrate for the
particular enzyme that attaches phosphorylated mannose to a glycan.
The phosphorylated mannose can be further elongated by other
mannose residues. Phosphorylation was not substantially present in
glycans of the parent M124 strain produced under fermentation
conditions.
[0283] Comparison of Fermentor and Shake Flask Grown Samples
[0284] One Alg3 knockout strain (transformant 4A) was grown in
batch fermentation on lactose and spent grain extract medium. The
medium was 60 g/l lactose with 20 g/l spent grain extract with a
volume of 7 liters (fermentor run bio01616) after inoculation.
Other medium components were KH.sub.2PO.sub.4 and
(NH.sub.4).sub.2SO.sub.4. Culture pH was controlled between 5.5 and
5.8. Biomass and culture supernatant samples were taken during the
course of the run and stored at -20.degree. C. Mycelial samples
were also collected for possible RNA analysis and were frozen
immediately in liquid nitrogen and transferred to -70.degree. C.
Samples collected from the whole course of these fermentations were
analyzed for N-glycan composition. N-glycan analysis was carried
out for fermentor run bio01616 and for the day 5 time point
sample from the shake flask culture of transformant 4A (FIGS. 17
and 18). The main signal in the shake flask culture was Man3 (59%).
In the fermentor culture, the main signal was Man3 (85%), and the
proportion of Hex6 was decreased to 8%.
[0285] Conclusions
[0286] The Alg3 knockout was successful in producing 50% or more of
the expected Man3 glycoform. The desired branched structure of
Man.alpha.3(Man.alpha.6)Man.beta.- was verified by fragmentation
mass spectrometry and NMR spectroscopy.
[0287] The other products of the Alg3 knockout included Man4
(mannose-containing minor product), Hex5 (a degradation product of
Hex6 as indicated in FIG. 13) and Hex6, which was the second
largest component. The Hex6 component was characterized to contain
terminal Glc by mannosidase resistance and specific NMR signals
including Glc.alpha.3Man-terminal. It was considered that the
glycan structure could be further optimized by methods for reducing
the amount of the terminal Glc, which was likely caused by suboptimal
efficacy of glucosidase II on the glycan devoid of mannoses on
the Man.alpha.6-arm of the molecule. Further optimization of
fermentation conditions may reduce the amount of terminal Glc.
[0288] These data indicated better glycosylation results in the T.
reesei Alg3 knockout compared to earlier data for Alg3 knockouts in
Aspergillus (Kainz et al., Appl Environ Microb. 2008 1076-86) and
P. pastoris (Davidson et al., Glycobiology 2004, 399-407). In the
works of Kainz et al. and Davidson et al., similar or higher Hex6
corresponding product levels were reported. Those studies also
reported additional problems with .alpha.2-Mannose, OCH1 products
and larger size, and cell type-specific glycans produced by P.
pastoris. In conclusion, N-glycan analysis of T. reesei Alg3
knockouts revealed that the major glycoform in the knockout strains
is Man3Gn2, a desired starting point for efficient generation of
mammalian-type N-glycans.
Example 3
Purification and Activity of Individual GnTI and GnTII Enzymes
[0289] Human GnTI and GnTII (N-acetylglucosaminyltransferase I and
N-acetylglucosaminyltransferase II) were expressed as soluble,
secreted proteins in Pichia pastoris in order to study their
acceptor specificity and activity.
[0290] Generation of GnTI Construct for Production in P.
pastoris
[0291] Human GnTI (P26572) sequence was obtained as a full-length
sequence and subcloned into Trichoderma reesei overexpression
vectors. Protein coding sequences (CDS) encoding the soluble part
of human GnTI were cloned to the pBLARG-SX expression vector in
order to produce a secreted form of the protein in Pichia pastoris
for enzymatic studies. During the cloning procedure, a His tag
encoding sequence was added to the 5'-end of the frame to obtain a tag
at the N-terminus of the truncated protein. The sequence was
verified by sequencing analysis. The resulting vector pTTg5 was
linearized and transformed by electroporation to P. pastoris GS190
cells to yield strain GY4. Arg.sup.+ transformants were picked and
screened by PCR. GY4 clones containing the integrated plasmid were
tested for protein expression.
[0292] Expression and Purification of Soluble GnTI
[0293] P. pastoris strain GY4 expressing soluble GnTI was first
grown overnight with shaking at +30.degree. C. in BMGY medium (1%
yeast extract, 2% peptone, 100 mM potassium phosphate pH 6.0, 1.34%
yeast nitrogen base, 4.times.10.sup.-5% biotin, 1% glycerol) to
OD.sub.600 2-6. The cells were then harvested by centrifugation and
resuspended to OD.sub.600 of 1 in BMMY medium (like BMGY, but with
0.5% methanol instead of 1% glycerol). The culture was placed in a
baffled flask and returned to a shaking incubator at +16.degree. C.
100% methanol was added to a final concentration of 0.5% every 24 h
to maintain induction. 1 ml samples of the expression culture were
taken 0, 24, 48, and 72 hours after induction, and both the cell
pellets and the supernatants were stored for analysis. After 3 days
of induction, the cells from the whole culture were harvested by
centrifugation, and the supernatant was collected for further
purification of GnTI.
[0294] Preparation of Crude GnTI Sample for Activity Assay
[0295] Pichia pastoris cell culture, which contained soluble
His-tagged GnTI, was processed for activity assay by concentration
and buffer exchange. In brief, 40 ml of P. pastoris supernatant
from shake flask culture was harvested at day 3 after induction
with MeOH by pelleting the cells in 50 ml Falcon tube (Eppendorf
5810R, 3220 rcf, 5 min at +4.degree. C.) and collecting the
supernatant. The supernatant was then concentrated to <2.5 ml by
sequential centrifugations (Eppendorf 5810R or comparable, 3220
rcf, 10 min at +4.degree. C.) with Millipore Amicon Ultracel 30K
concentrator. The volume of the concentrate was adjusted to 2.5 ml
with 100 mM MES, pH 6.1. Concentrate was subjected to buffer
exchange with a PD-10 gel filtration column (GE Healthcare
17-0851-01). The column was first equilibrated with 100 mM MES, pH
6.1 and then the sample (2.5 ml) was added, flow-through was
discarded and elution with 2.25 ml of MES buffer was collected.
Finally, 500 .mu.l of the eluate was concentrated to 100 .mu.l with
Millipore Biomax 30K concentrator (Eppendorf 5417, 12 000 rcf, 5
min +4.degree. C.) and used directly in activity assays.
[0296] Activity Assay of GnTI Enzyme
[0297] Man.alpha.1-6(Man.alpha.1-3)Man.beta.1-4GlcNAc (Man.sub.3Gn)
was used as an acceptor for GnTI in the GnTI activity assay. The
GnTI reaction was carried out by incubating the reaction mixture,
which contained 0.1 mM acceptor Man.sub.3GlcNAc, 20 mM UDP-GlcNAc,
50 mM GlcNAc, 100 mM MnCl.sub.2, 0.5% BSA and 8 .mu.l GnTI in 100
mM MES, pH 6.1, in a total volume of 10 .mu.l at room temperature
overnight. The reaction was stopped by incubating the reaction at
100.degree. C. for 5 min.
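For orientation, the molar amounts present in the 10 .mu.l reaction can be worked out from the stated concentrations. The sketch below assumes that the listed concentrations are final concentrations in the total reaction volume, which the text does not state explicitly. The function and variable names are illustrative.

```python
def nmol_in_reaction(conc_mM, volume_ul):
    """Amount in nmol: 1 mM equals 1 nmol per microliter."""
    return conc_mM * volume_ul


# Assumption: concentrations are final concentrations in the 10 ul reaction.
volume_ul = 10.0
acceptor_nmol = nmol_in_reaction(0.1, volume_ul)   # Man3GlcNAc acceptor
donor_nmol = nmol_in_reaction(20.0, volume_ul)     # UDP-GlcNAc donor
donor_excess = donor_nmol / acceptor_nmol          # molar excess of donor

print(f"acceptor: {acceptor_nmol:g} nmol")
print(f"donor: {donor_nmol:g} nmol ({donor_excess:g}-fold excess)")
```

Under that assumption the donor is present in large molar excess over the acceptor, so the acceptor is the limiting substrate in the overnight incubation.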
[0298] In parallel to the GnTI activity assay, the crude enzyme
preparation was checked for possible HexNAc'ase activity.
GlcNAc.beta.1-2Man.alpha.1-6(GlcNAc.beta.1-2Man.alpha.1-3)Man.beta.1-4GlcNAc.beta.1-4GlcNAc-Asn
(=Gn.sub.2Man.sub.3Gn.sub.2-Asn) was used as
a substrate for HexNAc'ase. The reaction was carried out in a
similar way as for GnTI, except 100 pmol of
Gn.sub.2Man.sub.3Gn.sub.2-Asn was added instead of Man.sub.3Gn and
UDP-GlcNAc. No HexNAc'ase activity was detected.
[0299] The reaction mixture was purified for MALDI analysis by
sequential Hypersep C.sub.18 (100 mg, Thermo Scientific, cat no:
60300-428) and Hypercarb (10 mg/96 well plate/1 PKG, cat no:
60302-606) chromatography on HyperSep 96-well Vacuum Manifold,
Thermo Scientific. Hypersep C.sub.18 was prepared with 300 .mu.l
EtOH and 300 .mu.l MQ water, the collection plate was then put
under, and samples were loaded and eluted with 150 .mu.l MQ water.
Hypercarb was prepared with 300 .mu.l MeOH and 300 .mu.l MQ water.
Eluates from Hypersep C.sub.18 were loaded, salts were removed with
150 .mu.l 0.5 M NH.sub.4Ac, and wells were washed with 2.times.300
.mu.l MQ water. GnTI reaction products were eluted with 150 .mu.l
25% ACN, and HexNAc'ase reaction products were eluted with 25% ACN
and 0.05% TFA. Samples were dried in a Speedvac.
[0300] Matrix-assisted laser desorption-ionization time-of-flight
(MALDI-TOF) mass spectrometry (MS) was performed with a Bruker
Ultraflex TOF/TOF instrument (Bruker Daltonics, Germany). Acceptor
saccharide and product were detected in positive ion reflector mode
as [M+Na]+ ions. Calculated m/z values for [M+Na]+ signals of
Hex.sub.3HexNAc.sub.1 and Hex.sub.3HexNAc.sub.2 were 730.238 and
933.318, respectively. The percent ratio of the acceptor and the
product was calculated from the signals corresponding to
Hex.sub.3HexNAc.sub.1 and Hex.sub.3HexNAc.sub.2 (FIG. 19).
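Calculated [M+Na]+ values for neutral Hex/HexNAc compositions can be obtained by summing monoisotopic residue masses, one water, and one sodium ion. The sketch below is one way to do this. The mass constants are standard monoisotopic values rounded to six decimals, and the function name is illustrative. It is not the software used in this study.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
HEX = 162.052823        # hexose residue, C6H10O5
HEXNAC = 203.079373     # N-acetylhexosamine residue, C8H13NO5
WATER = 18.010565       # one water for the free reducing-end glycan
SODIUM_ION = 22.989218  # Na+ (sodium atom minus one electron)


def sodiated_mz(n_hex, n_hexnac):
    """Return the monoisotopic m/z of the singly charged [M+Na]+ ion."""
    return n_hex * HEX + n_hexnac * HEXNAC + WATER + SODIUM_ION


for n_hex, n_hexnac in [(3, 1), (3, 2), (3, 3), (5, 2)]:
    mz = sodiated_mz(n_hex, n_hexnac)
    print(f"Hex{n_hex}HexNAc{n_hexnac}: {mz:.3f}")
```

Each added HexNAc residue shifts the signal by about 203.08 m/z units, which is the spacing used to tell acceptor from product in the GnTI and GnTII assays. The last decimal depends on whether the electron mass is subtracted from sodium, so agreement with tabulated values should be judged to within a few thousandths.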
[0301] Generation of GnTII Construct for Production in P.
pastoris
[0302] The nucleotide sequence encoding human GnTII was
PCR-amplified with primers GP3 and GP13, which contained KpnI and
EcoRI restriction sites, respectively. The EcoRI/KpnI-digested PCR
fragment was ligated to a similarly digested pBLARG-SX cloning
vector. After verifying the sequence, the final construct was
transformed to P. pastoris strain GS190 to yield strain GY22.
Positive yeast transformants were screened by PCR. Two clones (only
one of which is shown in FIG. 20) were studied for expression of
GnTII under the control of the methanol-inducible AOX1 promoter at
+16.degree. C. and at +30.degree. C.
[0303] Expression of Soluble GnTII
[0304] According to Western blot analysis (FIG. 20), P. pastoris
strain GY22 produced soluble recombinant GnTII enzyme. GnTII has a
calculated molecular mass of 49049.0 Da and two predicted
N-glycosylation sites. The recombinant GnTII was secreted into the
culture medium at +16.degree. C. (lane 9). When grown at
+30.degree. C., the recombinant GnTII was arrested inside the cells
(lane 4).
[0305] Activity Assays of Soluble GnTII
[0306] P. pastoris cell culture containing soluble His-tagged GnTII
was processed for an activity assay as described for GnTI above.
Cell culture was centrifuged, supernatant was harvested and
concentrated, buffer exchange to 100 mM MES, pH 6.1 was conducted,
and the resulting sample was further concentrated prior to activity
testing.
[0307] The activity assay was carried out as described for GnTI.
GnMan3Gn was used as a GnTII acceptor.
[0308] The GnTII reaction was carried out in the presence of 0.1 mM
acceptor GnMan3Gn, 20 mM UDP-GlcNAc, 50 mM GlcNAc, 100 mM MnCl2,
0.5% BSA, and GnTII in 100 mM MES, pH 6.1. Purification of the
reaction mixture for MALDI-TOF MS analysis was performed by
sequential Hypersep C18 and Hypercarb chromatography on a 96-well
plate on vacuum manifold as described for GnTI above.
[0309] MALDI-TOF MS was performed with a Bruker Ultraflex TOF/TOF
instrument (Bruker Daltonics, Germany). Acceptor saccharide and
product were detected in positive ion reflector mode as [M+Na]+
ions. Ratio of the product and acceptor at the end of the reaction
was calculated from their signal intensities (calculated m/z values
for [M+Na]+ signals of GnMan3Gn acceptor and product with one
GlcNAc addition are 933.318 and 1136.397, respectively).
[0310] Cultivation of P. pastoris producing GnTII was repeated, and
GnTII concentrate (60.times.) from supernatant was prepared and its
activity measured according to the methods described above. MALDI
spectra of time point samples at 2.5 h, 5 h, and overnight showed
that 80%, 83%, and 82% of the acceptor was converted to product,
respectively. Conversion was thus close to its maximum within 2.5
hours.
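The conversion percentages above are derived from the relative MALDI signal intensities of acceptor and product. A minimal sketch of that calculation follows. The intensity values in the example are hypothetical and are not taken from the spectra, and the function name is illustrative.

```python
def percent_conversion(acceptor_intensity, product_intensity):
    """Return product as a percentage of acceptor plus product signal."""
    total = acceptor_intensity + product_intensity
    if total <= 0:
        raise ValueError("acceptor and product signals are both absent")
    return 100.0 * product_intensity / total


# Hypothetical intensities (arbitrary units) for the acceptor signal at
# m/z 933.318 and the product signal at m/z 1136.397.
example = percent_conversion(acceptor_intensity=200.0, product_intensity=800.0)
print(f"{example:.0f}% of the acceptor converted to product")
```

This treats acceptor and product as ionizing with equal efficiency, which is a simplifying assumption for two neutral glycans that differ by a single GlcNAc.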
[0311] In addition, a crude GnTII sample was prepared, and the
activity assay was carried out as described above for the crude
GnTI sample. The reaction mixture was incubated overnight,
purified, and subjected to MALDI analysis. MALDI spectra revealed
GnTII activity (FIG. 21). HexNAc'ase activity was not detected in
the crude GnTII sample.
[0312] The methods used to synthesize a GnTII acceptor for use in
the above-described GnTII activity assays were as follows. A GnTI
sample was prepared from a P. pastoris cultivation medium as
described above. This GnTI sample showed high GnTI activity and,
therefore, it could be used in conversion of about 40 nmol of
Man3Gn to GnMan3Gn. The reaction was carried out in the presence of
0.5 mM Man3Gn, 20 mM UDP-GlcNAc, 50 mM GlcNAc, 100 mM MnCl2, 0.5%
BSA, and GnTI sample. The reaction mixture was incubated three days
at room temperature. A sample of .about.1% was subjected to
purification by Hypercarb chromatography and MALDI analysis. The
GnTI reaction converted almost all of Man3Gn acceptor to GnMan3Gn
product according to MALDI spectrum. Only 2.8% of the acceptor was
not converted.
Example 4
GnTI/GnTII Fusion Protein
[0313] Generation of GnTI/GnTII Expression Construct
[0314] A recombinant GnTI/II fusion protein was constructed by
amplifying a 1313 bp GnTII fragment with a 65-mer fusion primer at
the 5'-end, which contained an in-frame fusion site (a short
sequence from GnTI containing a naturally occurring AleI
restriction site with the stop-codon removed and overlapped with
GnTII sequence) and 3'-end primers homologous to GnTII containing
either SpeI or NdeI restriction sites. This fusion site allowed the
cloning of a fusion fragment directly to a T. reesei overexpression
vector with wild type GnTI under the control of the cbh1 promoter
(cloning with AleI/NdeI) or with wild type GnTI under the control
of the gpd promoter (cloning with AleI/SpeI). High-fidelity Phusion
polymerase (Finnzymes) and standard amplification and cloning
procedures were used. The sequence was verified by sequencing
directly from expression vectors. The resulting vector was used to
express the fusion as a transmembrane protein in T. reesei.
[0315] To gain more information on the functionality of the fusion
proteins, fusion GnTI/II proteins were also expressed as soluble
proteins in P. pastoris. CDS of the GnTI/II fusion encoding the
soluble part of the protein was cloned to the pBLARG-SX expression
vector in order to produce protein for enzymatic studies. During
the cloning procedure, His tag encoding sequence was added to the
5'-end of the frame to obtain a tag at the N-terminus of the
truncated protein. The sequence was verified by sequencing
analysis. The resulting vector was linearized and transformed by
electroporation to P. pastoris strain GS190 to yield strain GY6.
Arg.sup.+ transformants were picked and screened by PCR. P.
pastoris clones containing the integrated plasmid were tested for
protein expression.
[0316] Purification of Soluble GnTI/II Produced in P. pastoris
[0317] Expression in P. pastoris and purification procedures were
carried out as described above with recombinant GnTI protein.
[0318] Enzyme Activity Tests of GnTI/II Fusion Protein
[0319] Activity assays were carried out as described above for GnTI
assays using Man3Gn oligosaccharide as an acceptor and UDP-GlcNAc
donor. The products of the reaction were analyzed by MALDI-TOF mass
spectrometry. Only GnTI activity was observed for the GnTI/GnTII
fusion protein (FIG. 22).
[0320] Transformation of T. reesei with GnTI/GnTII Construct by
Random Integration
[0321] A chimeric human GnTI/GnTII plasmid with a gpdA promoter was
co-transformed into the T. reesei M124 strain with random
integration. Selection was obtained by co-transformation of a
plasmid containing an acetamidase marker gene. Twenty PCR positive
transformants were purified to uninuclear clones and grown in shake
flask cultures for glycan analysis. All transformants and the
parental strain M124 were cultivated in Trichoderma minimal medium
(TrMM), pH 4.8, supplemented with 4% lactose and 2% spent grain
extract. Supernatant and mycelia samples were collected on days 3,
5, and 7, and were stored frozen until analysis. In addition, as a
control, T. reesei was transformed with a GnTI construct by random
integration.
[0322] Glycan Analysis of T. reesei GnTI/GnTII Strains Obtained by
Random Integration
[0323] Samples from 20 different clones at three different time
points (days 3, 5 and 7) from T. reesei strain M124 GnTI/GnTII
transformants were analyzed. Samples from two parental M124 strains
were analyzed as controls. N-glycanase reactions without SDS
denaturation were performed in 96-well plates in triplicate for 5
.mu.g of supernatant protein. The protein concentration of the
supernatants was measured by Bradford-based assay (Bio-Rad Quick
Start Bradford Protein Assay) using BSA as a standard. Both neutral
and acidic N-glycans were analyzed by MALDI-TOF MS. No G0 product
was detected with the GnTI/GnTII construct in any of the clones at
any time point, nor in clones of GnTI transformants with the gpdA
promoter.
[0324] Transformation of T. Reesei with GnTI/GnTII Construct by
Targeted Integration
[0325] A chimeric GnTI/GnTII sequence was subcloned into a pTTv38
backbone, a vector that contains an acetamidase marker gene and 5'-
and 3'-flanking sequence sites for alg3 locus integration. The
vector was transformed into T. reesei M124 strain as a digested
fragment. From this transformation, 18 PCR positive transformants,
yielding PCR fragments indicating correct integration to the alg3
locus, were detected. These transformants were cultured in shake
flasks after a single spore purification step and were analyzed as
described below.
[0326] Glycan Analysis of T. Reesei GnTI/GnTII Strains Obtained by
Targeting to alg3 Locus
[0327] Supernatant samples of 10 different clones at three
different time points (days 3, 5 and 7) of .DELTA.alg3 T. reesei
GnTI/GnTII transformants were obtained. Clones had been cultivated
in shake flasks with two different media compositions. TrMM, pH
5.5, with 2% spent grain extract, 4% lactose, and K-phthalate
buffering was used for all clones and, in parallel, TrMM, pH 5.5,
with 2% spent grain extract, 4% lactose, 1% casamino acids, and
K-phthalate buffering was used for five of the clones. Cultivation
was continued for 7 days: 5 days at +28.degree. C. and days 6 and 7
at +24.degree. C.
[0328] N-glycan analyses were made in triplicate in 96-well plates
for 5 .mu.g of supernatant protein. Samples were analyzed from days
3, 5, and 7. The protein concentration of the supernatants was
measured by Bradford-based assay (Bio-Rad Quick Start Bradford
Protein Assay) using BSA as a standard. Both neutral and acidic
N-glycans were analyzed by MALDI-TOF MS.
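The triplicate measurements are summarized in Table 9 as Average, SD, RSD, MIN and MAX of the signal intensity percentages. The following is a minimal sketch of how such a summary could be computed. It assumes that each spectrum is normalized so that the listed signals sum to 100% and that SD is the sample standard deviation (n - 1); neither assumption is stated in the text, and the function names and input values are hypothetical.

```python
from statistics import mean, stdev


def relative_intensities(raw):
    """Convert raw peak intensities to percentages of the summed signal."""
    total = sum(raw.values())
    return {name: 100.0 * value / total for name, value in raw.items()}


def replicate_summary(values):
    """Average, sample SD, RSD (%), MIN and MAX of replicate percentages."""
    avg = mean(values)
    sd = stdev(values)  # sample SD (n - 1); an assumption, not from the text
    rsd = 100.0 * sd / avg if avg else 0.0
    return {"Average": avg, "SD": sd, "RSD": rsd,
            "MIN": min(values), "MAX": max(values)}


# Hypothetical triplicate in which a glycan is detected in one well only.
summary = replicate_summary([0.0, 0.0, 1.1])
print({key: round(value, 1) for key, value in summary.items()})
```

Under the sample-SD assumption, a signal seen in only one of three replicates always gives an RSD of 100 x sqrt(3), or about 173.2%, which may explain why that value recurs for the weakest signals.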
[0329] Detectable amounts of glycoform G0 were found in every
clone. Clone 201A contained the most, with 1.2% Gn2Man3 (FIG. 23
and Table 9). In addition, the amount of Hex6 was lowest in this
particular clone. The second medium with 1% casamino acids did not
give any extra production of
G0/GlcNAc.beta.2Man.alpha.3(GlcNAc.beta.2Man.alpha.6)Man.beta.4GlcNAc.beta.4GlcNAc.beta..
The results of the days 3 and 7 samples were
essentially the same as for the day 5 sample.
TABLE-US-00009 TABLE 9 The signal intensity percentages of observed
N-glycans from secreted proteins of T. reesei GnTI/II transformants
(GnTI/II integrated into the alg3 locus). Clones with the letter A in
their name were cultivated in medium A and clones with B in medium
B, which had an extra 1% casamino acids compared to medium A.

                     clone 201A, day 5                  clone 202A, day 5                  clone 208A, day 5
Composition  m/z     Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX
Man2         771.3   0.6      0.5   86.8   0.0   1.0    0.4      0.7   173.2  0.0   1.1    0.6      0.6   92.3   0.0   1.1
Man3         933.3   47.9     14.5  30.2   39.0  64.6   41.3     0.2   0.4    41.1  41.5   38.2     1.1   2.8    37.0  38.9
Man4         1095.4  7.9      2.9   36.5   5.9   11.3   6.4      0.6   8.7    6.0   7.0    5.3      0.2   4.0    5.0   5.5
GnMan3       1136.4  1.4      0.7   46.9   1.0   2.2    1.1      0.3   23.5   0.8   1.3    1.0      0.2   17.0   0.9   1.2
Man5         1257.4  10.5     2.5   23.5   8.7   13.3   8.6      0.8   9.7    7.7   9.4    8.2      0.3   4.0    7.8   8.5
Gn2Man3      1339.5  1.2      0.8   69.1   0.6   2.2    0.6      0.1   21.0   0.5   0.8    0.6      0.1   21.5   0.5   0.7
Hex6         1419.5  27.3     23.7  86.7   0.0   42.0   40.5     0.6   1.5    39.9  41.1   44.7     0.7   1.6    43.9  45.2
Hex7         1581.5  2.9      3.0   103.3  1.1   6.4    1.0      0.1   11.0   1.0   1.2    1.1      0.1   11.7   1.0   1.2
Hex8         1743.6  0.1      0.2   173.2  0.0   0.4    0.2      0.3   173.2  0.0   0.5    0.3      0.2   87.0   0.0   0.4

                     clone 210A, day 5                  clone 212A, day 5                  clone 213A, day 5
Composition  m/z     Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX
Man2         771.3   0.1      0.2   173.2  0.0   0.4    0.4      0.4   86.8   0.0   0.7    0.6      0.6   94.4   0.0   1.1
Man3         933.3   38.2     1.1   3.0    37.5  39.5   45.6     1.3   2.8    44.2  46.8   40.0     2.8   7.0    37.3  42.9
Man4         1095.4  6.0      0.4   6.6    5.5   6.2    5.6      0.3   5.1    5.4   5.9    6.5      0.6   8.8    6.0   7.1
GnMan3       1136.4  1.1      0.1   8.9    1.0   1.2    0.9      0.2   22.4   0.7   1.1    0.9      0.1   8.5    0.8   1.0
Man5         1257.4  8.9      0.3   3.7    8.6   9.3    7.2      0.5   7.0    6.8   7.7    9.5      0.4   3.8    9.1   9.8
Gn2Man3      1339.5  0.6      0.1   17.5   0.6   0.8    0.5      0.1   11.9   0.5   0.6    0.6      0.1   18.3   0.5   0.7
Hex6         1419.5  43.2     0.7   1.6    42.7  44.0   38.6     1.2   3.0    37.4  39.7   40.7     2.5   6.1    38.2  43.2
Hex7         1581.5  1.2      0.0   3.7    1.2   1.2    0.8      0.0   4.1    0.8   0.8    1.0      0.1   10.8   0.9   1.2
Hex8         1743.6  0.6      0.3   57.0   0.3   1.0    0.4      0.1   34.8   0.3   0.5    0.1      0.2   173.2  0.0   0.3

                     clone 215A, day 5                  clone 216A, day 5                  clone 217A, day 5
Composition  m/z     Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX
Man2         771.3   0.0      0.0   0.0    0.0   0.0    0.0      0.0   0.0    0.0   0.0    0.9      0.0   5.0    0.9   1.0
Man3         933.3   43.4     1.9   4.4    41.3  45.1   42.6     2.0   4.6    40.5  44.4   54.1     1.1   1.9    53.0  55.0
Man4         1095.4  6.3      0.5   8.5    5.7   6.8    6.1      0.6   10.3   5.4   6.7    5.2      0.3   6.5    4.9   5.5
GnMan3       1136.4  1.1      0.1   6.9    1.0   1.2    1.1      0.2   14.1   0.9   1.2    0.9      0.2   17.4   0.7   1.0
Man5         1257.4  8.5      0.4   4.2    8.2   8.9    7.7      0.6   8.4    7.0   8.3    5.8      0.1   2.6    5.6   5.9
Gn2Man3      1339.5  0.7      0.2   29.3   0.6   1.0    0.7      0.2   26.4   0.5   0.9    0.7      0.1   14.7   0.6   0.7
Hex6         1419.5  38.5     1.8   4.6    37.4  40.5   40.5     1.7   4.2    39.0  42.4   31.5     1.5   4.7    30.5  33.3
Hex7         1581.5  1.1      0.1   4.5    1.1   1.2    1.0      0.1   6.4    0.9   1.0    0.9      0.1   12.9   0.8   1.0
Hex8         1743.6  0.4      0.3   88.5   0.0   0.6    0.4      0.3   87.6   0.0   0.6    0.0      0.0   0.0    0.0   0.0

                     clone 219A, day 5                  clone 201B, day 5                  clone 202B, day 5
Composition  m/z     Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX
Man2         771.3   0.5      0.4   96.7   0.0   0.9    0.4      0.7   173.2  0.0   1.1    0.6      1.1   173.2  0.0   1.8
Man3         933.3   44.0     1.8   4.1    42.4  45.9   46.9     0.2   0.5    46.6  47.1   40.6     1.7   4.3    38.6  41.8
Man4         1095.4  5.7      0.1   1.5    5.6   5.8    6.9      0.9   12.7   6.0   7.8    8.5      0.9   10.0   7.7   9.4
GnMan3       1136.4  1.0      0.2   16.6   0.9   1.2    1.2      0.4   32.1   0.9   1.6    1.3      0.4   0.0    0.9   1.8
Man5         1257.4  8.0      1.2   15.6   6.7   9.2    8.1      0.5   5.7    7.8   8.6    10.0     0.6   6.2    9.5   10.6
Gn2Man3      1339.5  0.9      0.1   14.2   0.8   1.0    0.8      0.1   7.1    0.8   0.9    0.7      0.5   7.8    0.3   1.3
Hex6         1419.5  38.5     1.1   2.8    37.3  39.2   34.2     0.7   2.1    33.8  35.1   37.5     1.1   2.8    36.7  38.7
Hex7         1581.5  1.0      0.2   15.4   0.8   1.1    1.1      0.1   5.2    1.0   1.2    0.8      0.7   86.9   0.0   1.2
Hex8         1743.6  0.4      0.1   17.9   0.3   0.5    0.4      0.3   90.7   0.0   0.7    0.0      0.0   0.0    0.0   0.0

                     clone 208B, day 5                  clone 210B, day 5                  clone 219B, day 5
Composition  m/z     Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX    Average  SD    RSD    MIN   MAX
Man2         771.3   0.9      0.8   87.1   0.0   1.5    0.8      0.7   86.7   0.0   1.2    1.2      0.1   10.3   1.0   1.3
Man3         933.3   48.4     1.2   2.4    47.3  49.6   39.6     1.1   2.7    38.6  40.8   34.9     1.8   5.2    33.2  36.8
Man4         1095.4  7.2      0.2   2.2    7.0   7.3    7.9      0.6   8.0    7.3   8.5    8.1      0.3   4.1    7.8   8.4
GnMan3       1136.4  0.6      0.6   92.1   0.0   1.1    1.0      0.1   12.7   0.9   1.1    1.1      0.1   12.1   1.0   1.2
Man5         1257.4  8.7      0.7   7.6    7.9   9.1    9.6      0.2   2.0    9.4   9.8    11.3     0.8   7.5    10.7  12.3
Gn2Man3      1339.5  0.4      0.2   44.3   0.2   0.6    0.6      0.2   32.4   0.4   0.8    0.6      0.1   13.9   0.5   0.6
Hex6         1419.5  32.4     0.4   1.4    32.1  32.9   38.5     0.3   0.8    38.3  38.9   40.6     0.7   1.8    39.8  41.1
Hex7         1581.5  1.0      0.2   15.5   0.8   1.1    1.5      0.1   8.2    1.4   1.6    1.4      0.2   13.5   1.2   1.5
Hex8         1743.6  0.4      0.4   87.7   0.0   0.7    0.5      0.5   92.4   0.0   0.9    0.8      0.1   16.3   0.7   0.9
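The m/z values listed in Table 9 are consistent with sodium adducts, [M+Na]+, of free N-glycans. The sketch below recomputes them from monosaccharide residue masses. It assumes that the short names in the table omit the two core GlcNAc residues (so that Man3 corresponds to Hex3HexNAc2 and Gn2Man3 to Hex3HexNAc4) and uses standard monoisotopic masses; the function and dictionary names are illustrative only.

```python
# Monoisotopic masses in Da (standard values).
HEX = 162.052823         # hexose residue, C6H10O5
HEXNAC = 203.079373      # N-acetylhexosamine residue, C8H13NO5
WATER = 18.010565        # H2O for the free reducing-end glycan
SODIUM_ION = 22.989218   # Na+ (sodium atom minus one electron)


def sodium_adduct_mz(n_hex, n_hexnac):
    """Monoisotopic m/z of the singly charged [M+Na]+ ion."""
    return n_hex * HEX + n_hexnac * HEXNAC + WATER + SODIUM_ION


# (Hex, HexNAc) counts; the mapping from short names is an assumption.
COMPOSITIONS = {
    "Man2": (2, 2), "Man3": (3, 2), "Man4": (4, 2),
    "GnMan3": (3, 3), "Man5": (5, 2), "Gn2Man3": (3, 4),
    "Hex6": (6, 2), "Hex7": (7, 2), "Hex8": (8, 2),
}

for name, (n_hex, n_hexnac) in COMPOSITIONS.items():
    print(f"{name:8s} {sodium_adduct_mz(n_hex, n_hexnac):.1f}")
```

Rounded to one decimal, the computed values reproduce the m/z column of Table 9, which supports reading the column as [M+Na]+ signals.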
Example 5
GnTII/GnTI Fusion Protein
[0330] Generation of GnTII/GnTI Expression Construct
[0331] A GnTII/GnTI fusion expression construct was generated by
applying PCR overlap techniques. Fusion fragments were amplified
from GnTII and GnTI templates separately with primers containing 50
bp in-frame overlaps at the fusion site. Fragments were purified
from an agarose gel and used as PCR template for amplification of
the fusion construct according to standard procedures. The fusion
construct was cloned into a vector with ApaI/SpeI restriction
sites. The resulting construct was verified by sequencing analysis.
A vector was generated for expressing the soluble form of
GnTII/GnTI in P. pastoris with His tagging at the N-terminus of the
target protein. This vector was generated in a similar manner as
described above for the GnTI/II fusion construct.
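The PCR overlap approach can be illustrated in silico: two fragments that carry the same sequence at the fusion site are joined through that shared overlap. The sketch below is only a toy model of the principle, not the procedure actually used. The sequences and the function name are hypothetical, and the 18-nucleotide toy overlap stands in for the 50 bp overlap described above.

```python
def fuse_by_overlap(upstream, downstream, min_overlap=50):
    """Join two fragments whose end and start share an identical overlap.

    The longest overlap of at least `min_overlap` nucleotides is used.
    """
    longest = min(len(upstream), len(downstream))
    for size in range(longest, min_overlap - 1, -1):
        if upstream[-size:] == downstream[:size]:
            return upstream + downstream[size:]
    raise ValueError("no overlap of the required length found")


# Hypothetical toy fragments sharing an 18-nt overlap at the fusion site.
overlap = "GGAGGTGGCGGCAGTGGA"
fragment_1 = "ATGAAACCC" + overlap       # 3' end of the first fragment
fragment_2 = overlap + "GTGCCCACCTAA"    # 5' end of the second fragment

fusion = fuse_by_overlap(fragment_1, fragment_2, min_overlap=18)
print(fusion)
print("in frame:", len(fusion) % 3 == 0)
```

Keeping the overlap a multiple of three nucleotides, as in this toy case, is one simple way to keep the two coding regions in frame across the junction.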
[0332] Purification of Soluble GnTII/GnTI Produced in P.
Pastoris
[0333] Expression in P. pastoris and purification procedures were
carried out as described above for recombinant GnTI protein.
[0334] Enzyme Activity Tests of GnTII/GnTI Fusion Protein
[0335] Activity assays were carried out as described above for GnTI
using Man3Gn oligosaccharide as an acceptor. A MALDI spectrum of
the purified reaction mixture from the GnTII/GnTI reaction showed
that two GlcNAc.beta.-residues were transferred to the acceptor
(FIG. 24).
TABLE-US-00010 TABLE 10 Summary of GnTII/GnTI fusion protein
activities.

GnTII/GnTI       Acceptor         Products formed
transformant     concentration    ##STR00002##    ##STR00003##
Transformant 1   0.5 mM           47%             5%
Transformant 1   0.1 mM           --              11%
Transformant 2   0.5 mM           3%              2.4%
[0336] Characterization by .beta.-N-acetylglucosaminidase
[0337] The mixture formed in the GnTII/GnTI activity reaction was
treated with .beta.1-2,3,4,6-N-acetylglucosaminidase from
Streptococcus pneumoniae. MALDI MS analysis was used to determine
that both transferred .beta.-linked GlcNAc residues were cleaved
(FIG. 25).
[0338] Galactosylation by .beta.1-4GalT
[0339] The mixture formed in the GnTII/GnTI activity reaction was
treated with .beta.1-4GalT from bovine milk. .beta.1-4GalT was
expected to galactosylate the terminal GlcNAc residues in the
product mixture. According to the MALDI spectrum of the .beta.1-4GalT
reaction mixture, both products were galactosylated. Two galactoses
were transferred to the Gn2Man3Gn product, which indicated that the
GlcNAc residues were linked to separate mannose branches (FIG.
26).
[0340] Transformation of T. Reesei with GnTII/GnTI Construct by
Random Integration
[0341] A chimeric GnTII/GnTI sequence was designed and cloned into
a vector containing the gpdA promoter. After verification of the
plasmid sequence, it was co-transformed into the T. reesei M124
strain with the hygromycin marker gene. Thirteen PCR positive
transformants were identified. All positive transformants and the
parental strain M124 were cultivated in TrMM, pH 4.8, supplemented
with 4% lactose and 2% spent grain extract. In addition, seven
transformants and the parental strain were cultivated in TrMM, pH
5.5, with 4% lactose, 2% spent grain extract, and 1% casamino
acids, buffered with 100 mM PIPPS (piperazine-1,4-bis(2-propanesulfonic
acid)). pH measurements were used to monitor the growth rate of the
strains. Supernatant and mycelia samples were collected on days 3,
5, and 7, stored frozen, and analyzed for glycan structures. The
GnTII/GnTI sequence was also cloned into a plasmid containing the
cbh1 promoter. In addition, as a control, T. reesei was transformed
with a GnTI construct by random integration.
[0342] Glycan Analysis of T. Reesei GnTII/GnTI Strains Obtained by
Random Integration
[0343] 156 supernatant samples of T. reesei strain M124 GnTII/GnTI
transformants and parental M124 strain cultivated in two different
media were analyzed. The first medium was TrMM, pH 4.8,
supplemented with 2% spent grain extract and 4% lactose, and the
second medium was TrMM, pH 5.5, supplemented with 2% spent grain
extract, 4% lactose, 100 mM PIPPS, and 1% casamino acids. Cells
were grown in both types of media for 3, 5 and 7 days.
[0344] N-glycanase reactions without SDS denaturation were carried
out in 96-well plates in triplicate for 5 .mu.g of supernatant
protein for samples from time points of 3 and 5 days. The protein
concentration of the supernatants was measured by Bradford-based
assay (Bio-Rad Quick Start Bradford Protein Assay) using BSA as a
standard. Both neutral and acidic N-glycans were analyzed by
MALDI-TOF MS.
[0345] No sign of the expected GnTII/GnTI product was visible in
any of the clones from time points of 3 and 5 days. In addition, no
product was observed from GnTI and GnTI/II transformants with gpdA
promoters that were generated by random integration.
[0346] Transformation of T. Reesei with GnTII/GnTI Construct by
Targeted Integration
[0347] A vector having the chimeric GnTII/GnTI sequence under the
control of the cbh1 promoter was constructed with a pyr4 gene
loopout marker and subcloned into a backbone vector between alg3
flanking region fragments for targeted integration. A PmeI-digested
expression cassette was transformed into T. reesei strain M127
(pyr4.sup.- strain of M124). After plate selection, the clones were
PCR-screened and purified through single spores. To obtain material
for glycan analyses, shake flask cultivations were performed as
described. Five PCR positive transformants indicating correct
integration to the alg3 locus in the M127 transformation were
cultivated in a 300 ml volume for seven days at +28.degree. C. in a
medium containing TrMM, pH 5.5, supplemented with 40 g/l lactose, 20
g/l spent grain extract, and 100 mM PIPPS. To avoid bacterial
contamination, 100 mg/l ampicillin was added into the flasks at the
time of inoculation. Samples for glycan analyses were collected on
days 3, 5 and 7.
[0348] Glycan Analysis of T. Reesei GnTII/GnTI Strains Obtained by
Targeting to alg3 Locus
[0349] Supernatant samples of T. reesei strain M124 (control), five
different clones of M127 GnTII/GnTI transformants, and control
medium samples were prepared in triplicate on 96-well plates for 5
.mu.g of supernatant protein. The protein concentrations of the
supernatants were measured by Bradford-based assay (Bio-Rad Quick
Start Bradford Protein Assay) using BSA as a standard. PNGase F
reactions were performed as described, but without SDS
denaturation. The released N-glycans were first purified with
Hypersep C-18 and then with Hypersep Hypercarb (both from Thermo
Scientific) where neutral and acidic glycans were separated. Both
purifications were performed in 96-well format. Neutral N-glycans
were analyzed by MALDI-TOF MS.
[0350] The proportions of neutral N-glycans from T. reesei M127
GnTII/GnTI transformants were compared to proportions from strain
M124, which was otherwise the same as strain M127 but pyr4
positive. Four of the five GnTII/GnTI transformants produced G0 as
a main glycoform at all time points (3, 5 and 7 days). Only clone
46A was G0 negative (FIG. 27). The proportion of Man3Gn was small
in every clone at all time points, but the proportion of Hex6 was
still quite large. On day 7, clone 17A produced the most G0 and the
least Hex6 in comparison to other clones (FIG. 27). Four clones of
the GnTII/GnTI transformants produced around 40% of glycoform G0 on
day 5 in shake flask conditions (FIG. 27). Fermentation conditions
with controlled pH can increase the amount of G0 product and reduce
the amount of Hex6 in alg3 knock-outs.
[0351] In the medium sample, a series of plant-type N-glycans were
observed, but no signals corresponding to G0 were observed.
[0352] Transformation of Rituximab-Producing T. reesei with
GnTII/GnTI Construct by Targeted Integration
[0353] The expression cassette described in the section entitled
"Transformation of T. reesei with GnTII/GnTI Construct by Targeted
Integration" was transformed into T. reesei strain M279 (pyr4.sup.-
strain of the strain M202). M202 was obtained by deleting pep1
protease in M124 and introducing rituximab heavy and light chains
(with Kex2 cleavage site). After plate selection, the clones were
PCR-screened and purified through single spores. To obtain material
for glycan analyses, shake flask cultivations were performed as
described in the section entitled "Transformation of T. reesei with
GnTII/GnTI Construct by Targeted Integration" and, in addition,
some culture media were supplemented with 0.3 mg/ml soybean trypsin
inhibitor (SBTI) and 1% casamino acids. SBTI was added first at
inoculation and then daily on days 3-6. PMSF and Pepstatin A were
added to all samples before freezing.
[0354] Glycan Analysis of Rituximab-Producing T. reesei GnTII/GnTI
Strains Obtained by Targeting to alg3 Locus
[0355] Rituximab was purified with Protein G affinity
chromatography from day 5 supernatant samples with SBTI and from
day 5 and 7 samples without SBTI. PNGase F reactions were performed
for .about.10 .mu.g of denatured protein. The released N-glycans
were first purified with Hypersep C-18 and then with Hypersep
Hypercarb (both from Thermo Scientific) where neutral and acidic
glycans were separated. The purification steps were performed in
96-well format. Neutral and acidic N-glycans were analyzed by
MALDI-TOF MS. Two of the GnTII/GnTI transformant clones, 9A-1 and
31A-1, produced G0 glycoform at .about.30% and .about.24%,
respectively. However, reasonable quantities of Hex6 and GnMan3
were still observed (FIG. 28). Rituximab from the other clones
contained little or no G0.
[0356] Optimization of Spacers
[0357] A series of spacer modifications for GnTII/GnTI fusion
proteins were constructed. These variants were produced in Pichia
and studied in vitro for enzyme stability and activity.
[0358] The materials and methods for cloning the GnTII/GnTI fusion
proteins are described here. The T45 sequence was amplified in two
parts using a PCR overlap strategy. First, a fragment was
amplified with GP13 5' primer and GP93 3' primer, and a second
fragment was amplified with GP92 5' primer and GP2 3' primer.
Amplification was carried out with Phusion high-fidelity PCR
polymerase (Finnzymes) under the standard conditions provided by
the supplier. Cycling conditions were as follows: initial
denaturation at 98.degree. C. for 30 seconds, denaturation at
98.degree. C. for 5 seconds, annealing at 65.degree. C. for 30
seconds, extension at 72.degree. C. for 45 seconds, repeat 20
times, and final extension at 72.degree. C. for 20 minutes. The
resulting PCR products were purified from the agarose gel with a
Fermentas GeneJET gel extraction kit. These fragments with
overlapping, modified sequences were combined in the same reaction
mixture with standard conditions without primers. Ten
annealing/extension cycles were carried out as follows: initial
denaturation at 98.degree. C. for 30 seconds, denaturation at
98.degree. C. for 5 seconds, annealing at 65.degree. C. for 30
seconds, extension at 72.degree. C. for 45 seconds, repeat 10
times, and final extension at 72.degree. C. for 20 minutes. Primers
GP13 (5') and GP2 (3') were added, and cycling was continued as
described above for 20 amplification cycles. The amplified T45
fragment was purified with a Fermentas GeneJET PCR purification
kit, digested with EcoRI/KpnI (New England Biolabs) according to
standard protocols, and cloned into EcoRI/KpnI digested yeast
expression vector pBLARG-SX. The resulting vector was sequenced
with primers 3'AOX, 5'AOX, GP9, GP37, GP38 and GP122. The sequence
was found to be correct.
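For planning purposes, the cycling conditions above can be written down as a simple program and the total hold time added up. The following sketch assumes that "repeat 20 times" means 20 cycles in total and ignores ramping between temperatures; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


def cycling_program(cycles):
    """Steps of the cycling conditions described above, in order."""
    initial = [Step("initial denaturation", 98, 30)]
    cycle = [
        Step("denaturation", 98, 5),
        Step("annealing", 65, 30),
        Step("extension", 72, 45),
    ]
    final = [Step("final extension", 72, 20 * 60)]
    return initial + cycle * cycles + final


def hold_seconds(steps):
    """Sum of programmed hold times, excluding temperature ramping."""
    return sum(step.seconds for step in steps)


for cycles in (20, 10):  # fragment amplification, primer-free annealing
    total = hold_seconds(cycling_program(cycles))
    print(f"{cycles} cycles: {total} s ({total / 60:.1f} min)")
```

With these assumptions, most of the programmed time is spent in the 20-minute final extension rather than in the cycles themselves.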
[0359] This resulting plasmid was used as a template for the
3.times.G4S spacer modification. Cloning of the T46 sequence was
done as described above with T45. GP13 5'-primer and GP95 3'-primer
were used for first fragment synthesis, and GP94 5'-primer and GP2
3'-primer were used for second fragment synthesis. Fragments were
combined, and primers GP13 (5') and GP2 (3') were added for
amplification. Amplified fragment T46 was then digested with
EcoRI/KpnI and cloned into yeast expression vector pBLARG-SX. The
resulting vector was sequenced with the primers described above,
and the sequence was found to be correct.
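One way to check a spacer primer before ordering is to translate it in the intended reading frame. The sketch below translates forward primer GP94 (SEQ ID NO: 108, copied from Table 11 with the line wrap removed) using the standard genetic code. The reading-frame offset of 1 is an inference made for this illustration and is not stated in the text; the helper names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons of `dna`, starting at `offset`."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


GP94 = ("AGGTGGGGGCAGTGGAGGTGGCGGCAGTGGCGGCGGTGGAAGT"
        "GGGAGGGTGCCCACCGCCGCCC")

# Offset 1 is the assumed reading frame (see the note above).
print(translate(GP94, offset=1))  # GGGSGGGGSGGGGSGRVPTAA
```

In this frame the primer reads as a run of Gly and Ser codons followed by Gly-Arg-Val-Pro-Thr-Ala-Ala, which is what would be expected for a primer that introduces a G4S-type spacer upstream of the second fusion partner.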
[0360] Cellulase-related natural spacers were constructed with a
similar PCR overlap method. With the CBHI-related spacer, the first
fragment was amplified with GP13 5'-primer and GP107 3'-primer. The
second fragment was amplified with GP108 5'-primer and GP2
3'-primer (Table 11). With the EGIV-related spacer, the first
fragment was amplified with GP13 5'-primer and GP109 3'-primer. The
second fragment was amplified with GP110 5'-primer and GP2
3'-primer (Table 11). In both cases, PCR products were purified
from agarose gel, combined, and used as a template for the next PCR
reaction to amplify the sequences T50 and T51. T50 and T51 PCR
products were then digested with EcoRI/KpnI and cloned into yeast
expression vector pBLARG-SX.
[0361] All PCR amplifications were made with high-fidelity Phusion
polymerase (Finnzymes). Primers (Table 11) were ordered from MWG
Operon. Sequencing was performed by the DNA Sequencing Laboratory
of the Institute of Biotechnology, University of Helsinki, as a
commercial service.
TABLE-US-00011 TABLE 11 Primer sequences.

Primer   Sequence 5'-3'
3'AOX    ##STR00004##
5'AOX    GACTGGTTCCAATTGACAAGC (SEQ ID NO: 100)
GP2      ##STR00005##
GP9      CGGACCACCGCAAGTTCC (SEQ ID NO: 102)
GP13     ##STR00006##
GP37     CCTTTCTCTATCCAACTCTACC (SEQ ID NO: 104)
GP38     ##STR00007##
GP92     CCGCCGGCTCCAGGGAGGTGGGGGCAGTGGAGGTGGCGGCAGTGGGAGGGTGCCCACCGCCGCCCC (SEQ ID NO: 106)
GP93     ##STR00008##
GP94     AGGTGGGGGCAGTGGAGGTGGCGGCAGTGGCGGCGGTGGAAGTGGGAGGGTGCCCACCGCCGCCC (SEQ ID NO: 108)
GP95     ##STR00009##
GP107    GTTTCCGCCGGGAGGGTTGCCGCCGCTAGGGTTGCCGGTGCTCTGGAGCCGGCGGTAAGACTTGC (SEQ ID NO: 110)
GP108    ##STR00010##
GP109    CCGCCTCCAGGAACAGTGGCGCTGGCGGTGGCCGTCGCGGCGGAGCTCTGGAGCCGGCGGTAAGACTTGC (SEQ ID NO: 112)
GP110    ##STR00011##
GP122    CATTAGCGAGAAGTTTACGG (SEQ ID NO: 114)
##STR00012##
[0362] Spacer modified (3.times.G4S and 2.times.G4S) GnTII/GnTI
fusion enzymes were processed for an activity assay by
concentration and buffer exchange in a similar way as described for
GnTI in Example 3. Activity assays were carried out with Man3Gn
acceptor, and reaction mixtures were purified as described in the
GnTI activity assay. MALDI analysis was also performed as described
with the GnTI reaction mixture, but, in addition, formation of the
GnTII product, Hex3HexNAc3, was followed. The calculated m/z value
for the [M+Na]+ signal of Hex3HexNAc3 was 1136.318 (FIG. 29).
[0363] Spacer Variants
[0364] GnTII/I spacer variants were modified from the wild type
spacer sequence of the GnTII/I fusion protein. The modified spacers
are listed in Table 12. All four spacer variant strains (GY32,
GY33, GY49, and GY50), wild-type GnTII/I fusion strain (GY7-2), and
mock strain (GY3) were expressed at +16.degree. C. with protease
inhibitors. Strains were inoculated in 60 ml of BMGY medium at
+30.degree. C., 220 rpm, overnight (o/n). Overnight cultures were
pelleted, and the cells were resuspended in 60 ml of BMMY medium.
Protease inhibitors (1 mM EDTA, 1.5 .mu.M Pepstatin A (Sigma), and one
Complete EDTA-free protease inhibitor cocktail tablet (Roche)) were
added to the cultures when MeOH induction was started
and once a day thereafter. Samples of 25 ml were taken from the
cultures on day 3 and day 4. Supernatant samples were
concentrated using concentration tubes (Millipore), the buffer was
exchanged into 100 mM MES, pH 6.1, on PD-10 columns, and the samples
were concentrated to a final 50.times. concentration. Cell pellets were
resuspended in 500 .mu.l of 1.times.PBS, except the cell pellet of the
wild type (3.sup.rd), which was resuspended in 500 .mu.l of 100 mM MES,
pH 6.1, with Complete (EDTA-free) inhibitor cocktail.
[0365] The amino acid sequence of the GnTII/GnTI fusion protein
containing the 3.times.G4S spacer is set forth in SEQ ID NO: 119.
The nucleotide sequence encoding the GnTII/GnTI fusion protein containing
the 3.times.G4S spacer is set forth in SEQ ID NO: 141. The amino
acid sequence of the GnTII/GnTI fusion protein containing the
2.times.G4S spacer is set forth in SEQ ID NO: 121. The nucleotide
sequence encoding the GnTII/GnTI fusion protein containing the
2.times.G4S spacer is set forth in SEQ ID NO: 139. The amino acid
sequence of the GnTII/GnTI fusion protein containing the CBHI
spacer is set forth in SEQ ID NO: 123. The nucleotide sequence encoding
the GnTII/GnTI fusion protein containing the CBHI spacer is set
forth in SEQ ID NO: 143. The amino acid sequence of the GnTII/GnTI
fusion protein containing the EGIV spacer is set forth in SEQ ID
NO: 125. The nucleotide sequence encoding the GnTII/GnTI fusion protein
containing the EGIV spacer is set forth in SEQ ID NO: 145.
[0366] A 200 .mu.l sample of cell suspension was washed by
repeated centrifugation and resuspension of the cells in 100 mM MES, pH 6.1,
with Complete (EDTA-free) inhibitor cocktail. A cell lysate was
prepared by taking 200 .mu.l of the washed cell sample, adding 50 .mu.l of
glass beads and 2 .mu.l of Triton X-100, and treating the mixture in a
bead beater for 6 min. GnTI activity assays of 50.times. concentrated P.
pastoris culture supernatants, cell samples, and cell lysates were performed
as above.
TABLE-US-00012 TABLE 12 Description of yeast strains.

Yeast Strains      Description                            Sequence of spacer variant
GY3                Mock strain
GY7-2              Wild-type GnTII/I fusion
GY32-5, GY32-9     GnTII/I fusion 3xG4S spacer variant    SEQ ID NO: 118
GY33-7, GY33-8     GnTII/I fusion 2xG4S spacer variant    SEQ ID NO: 120
GY49-3             GnTII/I fusion CBHI spacer variant     SEQ ID NO: 122
GY50-7, GY50-10    GnTII/I fusion EGIV spacer variant     SEQ ID NO: 124
[0367] Western blot analyses of cell pellets and 50.times.
concentrated culture supernatants from day 3 are shown in FIG. 30.
The CBHI spacer variant (GY49) gave a strong signal from the cell
pellet sample but not from the supernatant. The EGIV spacer variant
(GY50) was detected from the supernatant, but only a faint signal was
obtained. Faint signals from supernatant samples were also obtained
with the wild-type GnTII/I fusion strain (GY7-2) and the
2.times.G4S spacer variant strains GY33-7 and GY33-8 (FIG. 30).
[0368] The activities of the GnTII/I fusion protein containing the
spacer variants were then compared to the activity of the GnTII/I
fusion protein containing the wild-type spacer.
[0369] Fusion GnTII/I activity in supernatants. The GnTI substrate
Man3Gn was provided and the reaction product, GnMan3Gn, acted as
the acceptor for the GnTII activity of the fusion protein. Samples
for activity assays were taken after day 3 and day 4 expression
phases. FIG. 31 shows activity assay results of cultures of GnTII/I
fusion proteins containing either the wild type spacer or the
spacer variants. Sample cultivations were done in the presence of
inhibitors (1.5 .mu.M pepstatin A, 1 mM EDTA, 1 tablet/50 ml of
complete EDTA free protease inhibitor cocktail tablet). For
simplicity, the GnTI and GnTII reaction products were added
together. All activity assay samples contained only minor amounts
(<5%) of GnTI product GnMan3Gn, indicating that GnTII actively
transformed the GnMan3Gn to Gn2Man3Gn.
[0370] All four spacer variants showed GnT activities, although
there was some variability between clones and cultivation days. The
GnTII/I fusion proteins containing the 2.times.G4S (clone_1),
3.times.G4S (clone_1 and clone_2), or EGIV spacer variants showed
higher activity than the enzyme with the wild-type spacer (FIG.
31). The GnTII/I fusion protein containing the CBHI spacer variant
showed comparable activity with the enzyme with the wild-type
spacer (FIG. 31). The GnTII/I fusion protein containing the
2.times.G4S variant (clone_2) had lower activity than the enzyme
with the wild-type spacer (FIG. 31). Day 4 samples had higher
activities than day 3 samples, with the exception of the GnTII/I
fusion protein containing the 3.times.G4S spacer variants (clone_1
and clone_2), which showed higher activity on day 3 (FIG. 31). The
GnTII/I fusion protein containing the EGIV spacer variant had the
highest activity on day 4 (FIG. 31).
[0371] Fusion GnTII/I activity in cells and cell lysates. Activity
assays of cell, cell lysate, and supernatant samples from cells
containing the GnTII/I fusion protein having the wild-type spacer
indicated that lysate samples contained the highest activity (FIG.
32). The second highest activity was on the cell surface, and
lowest activity was seen in the supernatant samples (FIG. 32).
Accordingly, it appears that most of the GnTII/I fusion protein was
localized in cells or on the cell surface, with only a small amount
being secreted.
[0372] GnT activities of cells containing GnTII/I fusion proteins
having either the wild-type spacer or the spacer variants are shown
in FIG. 33. Cells with the wild-type spacer were resuspended in 500
.mu.l of 100 mM MES, pH 6.1, with Complete EDTA-free inhibitor cocktail,
and cells with the spacer variants in 500 .mu.l of PBS. Cells and
lysates for activity testing were prepared as above.
[0373] As shown in FIG. 33, GnTII/I fusion proteins containing the
spacer variants had much higher GnTII/I activity in cells than in
supernatants. In lysates, the enzymes appeared to be inactive. It
is believed that this lack of activity is due to the action of
released proteases. The GnTII/I fusion protein containing the CBHI
spacer variant showed a high activity in cells and lysates (FIG.
33), which correlates with Western blot analysis showing higher
signal in the cell pellet sample (FIG. 30).
[0374] Discussion. In supernatants, the GnTII/I fusion proteins
containing the 2.times.G4S and 3.times.G4S spacer variants had
higher activity than the GnTII/I fusion protein containing the
wild-type spacer, while the CBHI spacer variant had comparable
activity to the GnTII/I fusion protein containing the wild-type
spacer. Moreover, the GnTII/I fusion protein containing the EGIV
spacer variant showed the highest GnT activity. Western blot
analysis of day 3 samples correlated to some extent with the results of
day 4 activities. Western blot analysis showed faint bands with
supernatant samples of wild-type, both clones of 2.times.G4S and
EGIV. The activities were detected in the following order:
EGIV>2.times.G4S (clone_1)>3.times.G4S (clone_2)>3.times.G4S
(clone_1).gtoreq.CBHI=wild-type=2.times.G4S (clone_2).
[0375] Determination of GnTII/I fusion protein activity in
supernatant, cell, and cell lysate samples of the GnTII/I fusion
protein containing the wild-type spacer showed that most of the
activity was associated with the cells and a smaller amount was
secreted. It is believed that this explains why much stronger signals
of His-tagged GnTII/I were seen in cell fractions than in
supernatant fractions in Western blot analysis.
[0376] The inhibition of serine and cysteine proteases by the Complete
EDTA-free inhibitor tablet, of metalloproteinases by EDTA, and of aspartic
proteases by pepstatin A improved the yield of GnTII/I fusion
protein. This observation on the use of serine protease inhibitor
is in accordance with the work of Salamin et al. (Appl. Environ.
Microbiol., 76 (2010) 4269-4276), which showed that serine-type
protease activity in the media of P. pastoris was completely
inhibited with PMSF. In addition, Vad et al. (J. Biotechnol. 116
(2005) 251-260) reported high production, over 300 mg/l, of intact
human parathyroid hormone in P. pastoris in the presence of 10 mM
EDTA combined with co-expression of Saccharomyces cerevisiae
protein disulphide isomerase.
[0377] The GnTII/I fusion proteins containing each of the four
spacer variants all possessed GnTII/I activity, and the
enzymes having the 2.times.G4S and EGIV spacer variants had higher
activities than the GnTII/I fusion protein containing the wild-type
spacer.
Example 6
Use of Fusion Proteins with Man5 as the Acceptor Glycan
[0378] Construction of Rituximab-Expressing T. Reesei Strain with
Man5 Type N-glycosylation
[0379] The native rituximab sequence is codon harmonized. Original
plasmids containing the synthesized rituximab light chain and heavy
chain are generated. The antibody chains and CBHI fusion protein
are designed with 40-nucleotide overlapping sequences as are the
expression vectors pHHO1 (acetamidase selection marker, cbh1 flanks
for integration into the cbh1 locus) for the heavy chain or pHHO2
(hygromycin selection marker, egl1 flanks for integration into the
egl1 locus) for the light chain, to enable cloning using yeast
homologous recombination.
[0380] The obtained gene plasmids are transformed into E. coli. DNA
is prepared, and the synthetic genes are digested and isolated from
the plasmid backbones. The expression vectors are constructed by
yeast homologous recombination on the T. reesei expression vectors
with the CBHI fusion protein and either heavy or light chain. The
recombined plasmids are rescued from yeast and transformed into E.
coli. After PCR screening, correct clones are isolated and
sequenced. The expression cassette fragments are digested and
isolated from the plasmid backbone resulting in around 10.2 kb
fragments for the heavy chain constructs and 10.8 kb fragments for
the light chain constructs. The heavy and light chain fragments are
cotransformed into the T. reesei strain M124. Transformants are
selected for hygromycin resistance and ability to grow on acetamide
as a sole nitrogen source. Transformants are streaked on the double
selective medium for two successive rounds and tested by PCR for
integration of the expression constructs into the genome.
[0381] Introduction of GnTII/I Tandem Enzyme and Mannosidase II to
T. Reesei Strain Expressing Rituximab Antibody
[0382] In addition to introducing a recombinant GnTII/I into a
Man5-producing strain such as M124, a mannosidase II activity is
further needed to remove two mannoses from the GlcNAcMan5 glycan
structure so that GnTII/I can use GlcNAcMan3 as an acceptor
molecule.
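The order of enzymatic steps described above can be sketched as a small composition model. This is only an illustrative simplification: glycans are reduced to counts of hexose and HexNAc residues, linkage information is ignored, and the class and function names are hypothetical rather than part of the disclosure.

```python
from typing import NamedTuple


class Glycan(NamedTuple):
    """N-glycan described only by its monosaccharide composition."""
    hexose: int  # mannose residues (Hex)
    hexnac: int  # GlcNAc residues (HexNAc), including the two core GlcNAc


MAN5 = Glycan(hexose=5, hexnac=2)         # Man5GlcNAc2
GLCNAC_MAN5 = Glycan(hexose=5, hexnac=3)  # GlcNAcMan5GlcNAc2
GLCNAC_MAN3 = Glycan(hexose=3, hexnac=3)  # GlcNAcMan3GlcNAc2
G0 = Glycan(hexose=3, hexnac=4)           # GlcNAc2Man3GlcNAc2


def _convert(substrate: Glycan, expected: Glycan, product: Glycan, enzyme: str) -> Glycan:
    """Return the product only if the substrate is the acceptor the enzyme needs."""
    if substrate != expected:
        raise ValueError(f"{enzyme} cannot use {substrate} as a substrate")
    return product


def gnt1(glycan: Glycan) -> Glycan:
    """GnTI activity: adds one GlcNAc to Man5."""
    return _convert(glycan, MAN5, GLCNAC_MAN5, "GnTI")


def mannosidase2(glycan: Glycan) -> Glycan:
    """Mannosidase II activity: removes two mannoses from GlcNAcMan5."""
    return _convert(glycan, GLCNAC_MAN5, GLCNAC_MAN3, "mannosidase II")


def gnt2(glycan: Glycan) -> Glycan:
    """GnTII activity: adds a second GlcNAc to GlcNAcMan3."""
    return _convert(glycan, GLCNAC_MAN3, G0, "GnTII")


if __name__ == "__main__":
    print(gnt2(mannosidase2(gnt1(MAN5))))
```

In this model, calling gnt2 directly on GLCNAC_MAN5 raises an error, which mirrors the statement that mannosidase II activity is needed before the GnTII step can proceed.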
[0383] The GnTII/I expression cassette described in previous
examples can be targeted to, for example, the cbh2 locus of T.
reesei, using methods essentially as described above. To generate a
GlcNAcMan3 acceptor molecule for GnTII/I fusion protein,
mannosidase II activity is then introduced to the strain using
transformation methods described above.
[0384] Mannosidase II activity is introduced to the rituximab
antibody-expressing M124 strain by designing a desired
mannosidase-containing expression cassette with a promoter for
driving the mannosidase expression. Useful promoters are those from
gpdA or cbh1. Mannosidase II activity can be introduced by random
integration followed by screening for strains with the most suitable
expression level. The expression cassette is linked with a
proprietary selection marker gene, or a selection marker is
co-transformed as a separate expression cassette. Transformation is
performed according to methods described above.
[0385] A mannosidase II fusion construct can be derived from a T.
reesei cytoplasmic, transmembrane and stem domain, or targeting
peptide, of KRE2 and ligated in-frame to an N-terminal amino acid
deletion of a human mannosidase II. The encoded fusion protein
localizes in the ER/Golgi by means of the KRE2 targeting peptide
sequence while retaining its mannosidase catalytic domain activity
and is capable of hydrolyzing GlcNAcMan5GlcNAc2 into
GlcNAcMan3GlcNAc2. In certain embodiments, a full-length human
mannosidase II can be expressed in an M124 strain.
[0386] The KRE2 targeting peptide comprises the amino acids from
about 1 to about 106 or from about 1 to about 83 of KRE2.
TABLE-US-00013
Kre2 aa 1-106 (SEQ ID NO: 115)
MASTNARYVRYLLIAFFTILVFYFVSNSKYEGVDLNKGTFTAPDSTKTTP
KPPATGDAKDFPLALTPNDPGFNDLVGIAPGPRMNATFVTLARNSDVWDI
ARSIRQ
Kre2 aa 1-83 (SEQ ID NO: 116)
MASTNARYVRYLLIAFFTILVFYFVSNSKYEGVDLNKGTFTAPDSTKTTP
KPPATGDAKDFPLALTPNDPGFNDLVGIAPGPR
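As a quick consistency check on the two KRE2 targeting peptide sequences listed above, the following sketch confirms their stated lengths and that the shorter peptide is an N-terminal truncation of the longer one. The variable and function names are illustrative assumptions; only the sequences themselves come from the listing.

```python
# Sequences copied from SEQ ID NO: 115 and SEQ ID NO: 116 as listed above.
KRE2_AA_1_106 = (
    "MASTNARYVRYLLIAFFTILVFYFVSNSKYEGVDLNKGTFTAPDSTKTTP"
    "KPPATGDAKDFPLALTPNDPGFNDLVGIAPGPRMNATFVTLARNSDVWDI"
    "ARSIRQ"
)
KRE2_AA_1_83 = (
    "MASTNARYVRYLLIAFFTILVFYFVSNSKYEGVDLNKGTFTAPDSTKTTP"
    "KPPATGDAKDFPLALTPNDPGFNDLVGIAPGPR"
)


def is_n_terminal_fragment(fragment: str, full_length: str) -> bool:
    """True if `fragment` matches the first len(fragment) residues of `full_length`."""
    return full_length.startswith(fragment)


if __name__ == "__main__":
    print(len(KRE2_AA_1_106), len(KRE2_AA_1_83))
    print(is_n_terminal_fragment(KRE2_AA_1_83, KRE2_AA_1_106))
```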
[0387] After transformation of Trichoderma with the mannosidase II
construct described above, Trichoderma strains are selected,
streaked on selective medium for two successive rounds, and tested
by PCR for integration of the expression constructs into the
genome. Selected transformants of Trichoderma strains producing
Man5 and expressing the GnTII/I fusion protein, mannosidase II, and
rituximab antibody are then cultured in shake flasks or fermentor
conditions and analyzed for glycan content as described above.
Example 7
Expression of GnTI and GnTII in T. Reesei
[0388] Transformation of T. Reesei M124 with GnTI Construct by
Random Integration
[0389] Codon optimized human GntI was transformed into the T.
reesei M124 strain. The GntI gene was cloned into a vector under
the control of two different promoters: (1) the inducible promoter
of the cbh1 gene; and (2) the constitutively expressed promoter of
the gpdA gene. The vectors containing GntI under either of the two
promoters were each co-transformed into the T. reesei M124 strain
with a plasmid containing either an acetamidase or a hygromycin
resistance marker gene.
[0390] Thirty-four transformants with GntI under the gpdA promoter
and under acetamide selection were screened by PCR, and all were
positive for GntI. For transformants with GntI under the cbh1
promoter and under acetamide selection, 19 of 26 were PCR-positive
for the GntI construct. In addition, initial DNA extraction was
performed for five strains with GntI under the cbh1 promoter and
under hygromycin selection. All of these strains were PCR-positive.
Twenty-five gpdA promoter transformants and all of the cbh1
promoter transformants (14+5) were purified to uninuclear clones
and spore suspensions were prepared.
[0391] For initial analysis purposes, 23 gpdA promoter
transformants and 19 cbh1 promoter transformants (14 grown from
acetamide and five from hygromycin selection), as well as the
parental strain M124 were cultivated in 250 ml shake flasks with 50
ml of Trichoderma minimal medium supplied with 2% spent grain
extract and 4% lactose. Growth of the strains was monitored by pH
measurements. Samples (supernatants and mycelia) were collected on
days 3, 5, and 7 and stored frozen until used for glycan structure
analysis.
[0392] Glycan Analysis of T. Reesei GnTI Strains Obtained by Random
Integration
[0393] The protein concentration of all supernatant samples was
measured by Bradford-based assay (BioRad Quickstart Bradford
Protein Assay) using BSA as a standard. Secreted protein content of
samples subjected to N-glycan analysis was adjusted to 5 .mu.g or
10 .mu.g. N-glycan analysis was performed either on 96-well plates
for 5 .mu.g of supernatant protein, or in 1.5 ml tubes for 10 .mu.g
of supernatant protein. All N-glycan analyses were performed in
triplicate. Both neutral and acidic N-glycans were analyzed with
MALDI-TOF MS.
[0394] To get more exact measurements of the amount of the GnT1
product Gn1Hex5 produced in four of the GnT1 transformants (from
days 3 and 5) and also of the amount of produced acidic N-glycans,
the MALDI spectra were spiked with a known glycan. For neutral and
acidic N-glycans, an internal calibrant of 2 pmol/spectrum
Hex2HexNAc4 at the mass value of 1177 Da and 0.5 pmol of
monosialylated Hex4HexNAc2 at the mass value of 1362 Da were used,
respectively. Analyses were performed in triplicate.
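The mass values quoted for the calibrants can be reproduced from monosaccharide composition. The sketch below assumes, since the text does not state the ion type, that neutral N-glycans are detected as sodium adducts [M+Na]+ and acidic N-glycans as deprotonated ions [M-H]-, and it uses standard monoisotopic residue masses. The function names are illustrative.

```python
# Monoisotopic residue masses (monosaccharide minus water), in Da.
RESIDUE_MASS = {
    "Hex": 162.052824,     # hexose, C6H10O5
    "HexNAc": 203.079373,  # N-acetylhexosamine, C8H13NO5
    "NeuAc": 291.095417,   # N-acetylneuraminic acid, C11H17NO8
}
WATER = 18.010565       # one water for the free reducing-end glycan
SODIUM_ION = 22.989218  # Na+ (atom mass minus one electron)
PROTON = 1.007276


def neutral_mass(composition: dict) -> float:
    """Monoisotopic mass of a free glycan with the given residue counts."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items()) + WATER


def mz_sodium_adduct(composition: dict) -> float:
    """m/z of the singly charged [M+Na]+ ion (assumed for neutral glycans)."""
    return neutral_mass(composition) + SODIUM_ION


def mz_deprotonated(composition: dict) -> float:
    """m/z of the singly charged [M-H]- ion (assumed for acidic glycans)."""
    return neutral_mass(composition) - PROTON


if __name__ == "__main__":
    print(f"{mz_sodium_adduct({'Hex': 2, 'HexNAc': 4}):.2f}")               # neutral calibrant
    print(f"{mz_deprotonated({'NeuAc': 1, 'Hex': 4, 'HexNAc': 2}):.2f}")    # acidic calibrant
```

Under these assumptions the neutral calibrant Hex2HexNAc4 comes out at about 1177.42 and the monosialylated Hex4HexNAc2 at about 1362.47, in line with the 1177 Da and 1362 Da values given in the text.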
[0395] No GnT1 product was observed in any of the gpdA promoter
transformants. However, eight cbhI promoter transformants produced
the GnT1 product Gn1Man5 (FIGS. 34 and 35, and Table 13); five with
hygromycin selection, three with acetamide selection.
TABLE-US-00014 TABLE 13 The percentages of signal intensities of
Man5 and Gn1Man5 compared to internal calibrant Hex2HexNAc4 in four
positive GnT1 transformants and parental M124 strain on days 3 and
5. Man5 is the main glycoform in parental M124 strain.

                      M1241., day 3                     M1241., day 5
Composition  m/z      Average  SD    RSD   MIN   MAX    Average  SD    RSD   MIN   MAX
Hex2HexNAc4  1177.42  97.7     0.5   0.5   97.1  98.0   36.5     0.8   2.3   35.9  37.1
Hex5HexNAc2  1257.42  2.3      0.5   22.5  2.0   2.9    63.5     0.8   1.3   62.9  64.1
Hex5HexNAc3  1460.5   0.0      0.0   0.0   0.0   0.0    0.0      0.0   0.0   0.0   0.0

                      M124/GNT1, clone HM1, day 3       M124/GNT1, clone HM1, day 5
Composition  m/z      Average  SD    RSD   MIN   MAX    Average  SD    RSD   MIN   MAX
Hex2HexNAc4  1177.42  78.5     14.5  18.4  68.2  88.7   50.1     10.6  21.2  42.6  57.6
Hex5HexNAc2  1257.42  14.5     9.9   68.0  7.5   21.5   44.0     9.6   21.9  37.2  50.8
Hex5HexNAc3  1460.5   7.1      4.6   65.6  3.8   10.3   5.9      1.0   16.7  5.2   6.6

                      M124/GNT1, clone 8, day 3         M124/GNT1, clone 8, day 5
Composition  m/z      Average  SD    RSD   MIN   MAX    Average  SD    RSD   MIN   MAX
Hex2HexNAc4  1177.42  77.3     7.6   9.8   72.0  82.7   67.3     10.0  14.9  56.5  76.3
Hex5HexNAc2  1257.42  15.0     5.2   34.4  11.4  18.7   18.9     6.2   32.5  12.8  25.1
Hex5HexNAc3  1460.5   7.6      2.4   31.6  5.9   9.3    13.8     4.0   29.1  10.8  18.3

                      M124/GNT1, clone 39, day 3        M124/GNT1, clone 39, day 5
Composition  m/z      Average  SD    RSD   MIN   MAX    Average  SD    RSD   MIN   MAX
Hex2HexNAc4  1177.42  83.7     1.5   1.8   82.7  84.8   40.0     1.9   4.6   37.9  41.1
Hex5HexNAc2  1257.42  8.3      1.0   11.7  7.6   8.9    46.9     1.8   3.8   45.6  49.0
Hex5HexNAc3  1460.5   8.0      0.6   6.9   7.6   8.4    13.1     0.3   2.1   12.7  13.3

                      M124/GNT1, clone 90, day 3        M124/GNT1, clone 90, day 5
Composition  m/z      Average  SD    RSD   MIN   MAX    Average  SD    RSD   MIN   MAX
Hex2HexNAc4  1177.42  93.8     1.6   1.7   92.4  95.6   92.6     2.7   2.9   89.8  95.3
Hex5HexNAc2  1257.42  3.7      1.0   25.9  2.6   4.5    4.7      1.4   30.9  3.2   6.0
Hex5HexNAc3  1460.5   2.5      0.7   26.2  1.8   3.1    2.7      1.3   47.8  1.5   4.1
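Because each spectrum was spiked with 2 pmol of the Hex2HexNAc4 calibrant, the relative intensities in Table 13 can be turned into rough absolute amounts. The sketch below does this for the day 5 averages. It assumes that signal intensity scales linearly with amount and that all three species respond equally in MALDI-TOF MS; neither assumption is stated in the text, and the function name is hypothetical.

```python
CALIBRANT_PMOL = 2.0  # Hex2HexNAc4 spiked per spectrum, as stated in the text

# Day 5 averages from Table 13: (calibrant %, Man5 %, Gn1Man5 %).
TABLE13_DAY5 = {
    "M124 parental": (36.5, 63.5, 0.0),
    "clone HM1": (50.1, 44.0, 5.9),
    "clone 8": (67.3, 18.9, 13.8),
    "clone 39": (40.0, 46.9, 13.1),
    "clone 90": (92.6, 4.7, 2.7),
}


def estimate_pmol(signal_pct: float, calibrant_pct: float,
                  calibrant_pmol: float = CALIBRANT_PMOL) -> float:
    """Scale a glycan signal to pmol using the calibrant signal in the same spectrum."""
    if calibrant_pct <= 0:
        raise ValueError("calibrant signal must be positive")
    return calibrant_pmol * signal_pct / calibrant_pct


if __name__ == "__main__":
    for strain, (calibrant, man5, gn1man5) in TABLE13_DAY5.items():
        print(f"{strain}: Man5 ~{estimate_pmol(man5, calibrant):.2f} pmol, "
              f"Gn1Man5 ~{estimate_pmol(gn1man5, calibrant):.2f} pmol")
```

These figures are per spectrum and are only as reliable as the equal-response assumption, so they are best read as a way to compare clones rather than as measured quantities.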
[0396] The GnT1 products Gn1Man6P1, Gn1Man7P1, and Gn1Man8P1 were
also found in phosphorylated N-glycans of all positive
transformants. The amount of phosphorylated N-glycans had increased
in GnT1 transformants, and the profile was biased toward larger
N-glycans, with Man7P1 or Man8P1 having the strongest signal
(Man6P1 in parental M124) (FIG. 36).
[0397] Eight GnTI transformants produced the Gn1Man5 structure.
Gn1Man5 was most abundant in clone 39. However, the best clone
appeared to be clone 8, which produced the second highest level of
Gn1Man5, but had a high proportion of Man5 and Gn1Man5 (FIG. 35).
Clone 8, which contains GnTI under the control of the cbhI
promoter, was named strain M198, and was selected for continued
analysis.
[0398] Transformation of T. Reesei M198 Strain with GnTII Construct
by Targeted Integration
[0399] Five GnTII-harboring vectors were created (Table 14). Two of
the vectors contained the native mammalian Golgi targeting peptide
in GNTII. In the three other vectors, the mammalian targeting
peptide was replaced by a T. reesei MNT1
(.alpha.-1,2-mannosyltransferase) targeting peptide. All five
vectors contained either a cbh1 promoter or a gpdA promoter, and a
pyr4 loop-out marker. Additionally, all five vectors were targeted
to integrate into the alg3 locus, thus deleting the alg3 gene. In
the MNT1/GnTII constructs under the cbh1 promoter, two different
sized GnTII sequence deletions were tested.
TABLE-US-00015 TABLE 14 Constructed GNT2 vectors.

Plasmid name  Promoter  Targeting peptide  N-terminal deletion (GnTII)
pTTv140       cbh1      mammalian          N/A
pTTv141       gpdA      mammalian          N/A
pTTv142       cbh1      Trichoderma MNT1   74 amino acids
pTTv143       cbh1      Trichoderma MNT1   104 amino acids
pTTv144       gpdA      Trichoderma MNT1   74 amino acids
[0400] These vectors, except for the pTTv144 vector, were
transformed into the best pyr4-negative GnTI-producing strain M198
(M319) as PmeI fragments. Transformants were purified to uninuclear
clones and PCR screened. Clones showing the correct integration at
both ends were then selected for continued analysis.
[0401] To study the growth characteristics of the generated
GNTII-expressing strains, large shake flask cultures were prepared.
Shake flask cultures were prepared in two separate batches. The
first batch contained pTTv140, pTTv142, and pTTv143. The second
batch contained pTTv141. The parental strain M198 was used as a
control strain. The cells were grown in TrMM medium supplemented
with 40 g/l lactose, 20 g/l spent grain extract, and 100 mM PIPPS,
pH 5.5. Five transformants per construct were cultured. The
pTTv140, pTTv142, and pTTv143 cultures were sampled on days 3, 5,
7, and 9. The pTTv141 cultures were sampled on days 3, 5, 7, and
10. The pH and cell dry weight of each sample were measured and
culture supernatant samples were used for glycan structure
analysis.
[0402] Glycan Analysis of T. Reesei Strains Obtained by Targeting
GnTII to alg3 Locus of T. Reesei M198 Strain
[0403] Five different clones containing the pTTv140 vector
(containing the native targeting peptide and the cbh1 promoter),
the pTTv142 vector (containing the MNT1 targeting peptide, the
GNTII 74 aa N-terminal deletion, and the cbh1 promoter), the
pTTv143 vector (containing the MNT1 targeting peptide, the GNTII
110 aa N-terminal deletion, and the cbh1 promoter), and the pTTv141
vector (containing the native targeting peptide and the gpdA promoter)
were analyzed.
[0404] N-glycan analyses were prepared in triplicate for day 5
samples, and in duplicate for day 3 and 7 samples on 96-well plates
for 5 .mu.g of supernatant protein. The protein concentrations of
the supernatants were measured by Bradford-based assay (BioRad
Quickstart Bradford Protein Assay) using BSA as a standard. PNGase
F reactions were performed as described. The released N-glycans
were first purified with Hypersep C-18 100 mg and then with
Hypersep Hypercarb 10 mg (both from Thermo Scientific) where
neutral and acidic glycans were separated. Both purifications were
performed in 96-well format. Neutral N-glycans were analyzed by
MALDI-TOF MS.
[0405] N-glycans of Four Different Strains Transformed with GnTII
were Analyzed
[0406] Clone 1-117A, which was transformed with the pTTv140 vector,
and thus contained the native targeting peptide and the cbh1
promoter, produced about 40% of G0 and about 13% of Hex6 (FIG. 37A).
Clones transformed with the pTTv143 vector, thus containing the MNT1
targeting peptide, the GnTII 110 aa N-terminal deletion, and the
cbh1 promoter, produced about 10% of G0 (FIG. 37C). Clone 3B, which
contained the gpdA promoter, produced about 28% of G0 and about 19%
of Hex6 (FIG. 37D).
[0407] The glycosylation patterns of representative clones
containing the pTTv140, pTTv141, and pTTv142 vectors were also
shown to be stable as a function of time (FIG. 38).
[0408] Protein Specific Glycosylation
[0409] To analyze protein specific changes in glycosylation,
samples from the pTTv142 vector-containing clone 3-17A and from the
parental strain M198 were separated with SDS-PAGE and blotted to a
PVDF membrane. The protein bands of interest (four bands of M198
and four of the 3-17A clone) were excised, and the N-glycans were
liberated with on-membrane enzymatic release with PNGase F (FIG.
39).
[0410] Detached and purified neutral N-glycans were analyzed using
MALDI-TOF MS. The glycosylation pattern of total secreted proteins
was similar to a separated 50 kDa protein of the M198 parental
strain (FIG. 40). The smallest size protein band was
unglycosylated.
[0411] In the GnTII clone 3-17A, most of the atypical signals had
disappeared, confirming their origin from the medium. Additionally,
the glycosylation pattern of clone 3-17A differed from the glycan
patterns of total secreted proteins (FIG. 40B). The amount of G0
from clone 3-17A was about 35 to 36% (FIG. 40B).
[0412] Fermenter Cultivation of GnTII Strain
[0413] The GnTII strain 1-117A M329 (which contains the pTTv140
vector) was cultivated in a fermenter in TrMM pH 5.5+2% spent
grain extract+6% lactose+0.5% KH.sub.2PO.sub.4+0.5%
(NH.sub.4).sub.2SO.sub.4 at +28.degree. C. (pH 5.5). N-glycan
analysis was performed in triplicate on 5 .mu.g of secreted
protein, as described in the "Protein specific glycosylation"
section above, on samples taken on day 3. The amount of G0 was about
48% and the amount of Hex6 was about 19% on day 3 (FIG. 41).
Example 8
T. Reesei ALG3 Homologs
[0414] Transformation of T. Reesei M124 with GnTI Construct by
Random Integration
[0415] T. reesei ALG3 homologs were identified from other
organisms. These homologs can be used to design ALG3 deletion
constructs for filamentous fungal cells other than T. reesei. The
ALG3 homologs are listed in Table 15. A multiple amino acid
sequence alignment of T. reesei ALG3 and ALG3 homologs is shown in
FIG. 42.
TABLE-US-00016 TABLE 15 ALG3 Homologs.

Reference sequence                            Organism                        SEQ ID NO:
Trire2|104121|fgenesh5_pg.C_scaffold_3000076  Trichoderma reesei              126
Triat2|270085|fgenesh1_pg.contig_14_#_149     Trichoderma atroviride          127
TriviGv29_8_2|194462|fgenesh1_pm.87_#_115     Trichoderma virens              128
EGU81920.1                                    Fusarium oxysporum Fo5176       129
XP_389829.1                                   Gibberella zeae PH-1            130
AEO60805.1                                    Myceliophthora thermophila      131
XP_962259.1                                   Neurospora crassa OR74A         132
XP_001824044.1                                Aspergillus oryzae RIB40        133
XP_001259497.1                                Neosartorya fischeri NRRL 181   134
XP_001398696.2                                Aspergillus niger CBS 513.88    135
XP_362427.2                                   Magnaporthe oryzae 70-15        136
NP_593853.1                                   Schizosaccharomyces pombe 972h  137
Example 9
GnTII/GnTI Fusion Protein Variants
[0416] Generation of GnTII/GnTI Expression Construct
[0417] A recombinant GnTII/GnTI fusion protein under the control of
the inducible promoter cbh1 and containing one of four spacer
variants is constructed as described in Examples 4 and 5. The four
spacer
variants are the 2.times.G4S spacer, the 3.times.G4S spacer, the
CBHI spacer, and the EGIV spacer.
[0418] Briefly, the fusion fragments are amplified from GnTII and
GnTI templates separately with primers containing 50 bp in-frame
overlaps at the fusion site. Fragments are purified from an agarose
gel and used as PCR template for amplification of the fusion
construct according to standard procedures. The fusion construct is
cloned into a vector with ApaI/SpeI restriction sites, under the
control of the inducible promoter cbh1. Additionally, the native
mammalian Golgi targeting peptide in the GNTII domain is replaced
by a T. reesei MNT1 (.alpha.-1,2-mannosyltransferase) targeting
peptide.
[0419] To introduce the 2.times.G4S spacer variant into the fusion
protein, the T45 sequence is amplified in two parts using a PCR
overlap strategy. First, a fragment is amplified with AKT1-6-1
5' primer (GGTACCGGGCCCACTGCGCATCATGCGCTTCCGAATCTACAAGCG (SEQ ID
NO: 146)) and GP93 3' primer, and a second fragment is amplified
with GP92 5' primer and AKT1-6-4 3' primer
(GGCGCGCCACTAGTCTAATTCCAGCTGGGATCATAGCC (SEQ ID NO: 147)).
Amplification is carried out with Phusion high-fidelity PCR
polymerase (Finnzymes) under the standard conditions provided by
the supplier. Cycling conditions are as described in Example 5. The
resulting PCR products are purified from the agarose gel, and the
fragments with overlapping, modified sequences are combined in the
same reaction mixture with standard conditions without primers. Ten
annealing/extension cycles are carried out as described in Example
5. Primers AKT1-6-1 (5') and AKT1-6-4 (3') are added, and cycling
is continued as described in Example 5 for 20 amplification cycles.
The amplified T45 fragment is then purified, digested with
ApaI/SpeI (New England Biolabs) according to standard protocols,
and cloned into the Trichoderma reesei expression vector. The
cloned fragment is then verified by sequencing with an appropriate
set of primers, and the generated sequence is used for construction
of a T. reesei expression vector with the 2.times.G4S spacer and
alg3 targeting.
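The two outer primers quoted above can be inspected for the cloning sites used in this step. The sketch below looks for the ApaI (GGGCCC) and SpeI (ACTAGT) recognition sequences and translates the reverse complement of the gene-specific part of the 3' primer with the standard genetic code. The helper names are illustrative, and reading the last 24 nucleotides of AKT1-6-4 as the gene-specific region is an assumption made for this example.

```python
AKT1_6_1 = "GGTACCGGGCCCACTGCGCATCATGCGCTTCCGAATCTACAAGCG"  # 5' primer, SEQ ID NO: 146
AKT1_6_4 = "GGCGCGCCACTAGTCTAATTCCAGCTGGGATCATAGCC"         # 3' primer, SEQ ID NO: 147

APAI_SITE = "GGGCCC"
SPEI_SITE = "ACTAGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def reverse_complement(dna: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(AKT1_6_1.find(APAI_SITE), AKT1_6_4.find(SPEI_SITE))
    # Coding-strand view of the assumed gene-specific 3' end of AKT1-6-4.
    print(translate(reverse_complement(AKT1_6_4)[:24]))
```

Read this way, the 3' primer encodes the residues GYDPSWN followed by a stop codon, which is consistent with the C-terminal residues of SEQ ID NO: 1 in the sequence listing.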
[0420] The resulting plasmid is used as a template for the
3.times.G4S spacer modification. Cloning of the T46 sequence is
done as described above with T45. AKT1-6-1 5'-primer and GP95
3'-primer are used for first fragment synthesis, and GP94 5'-primer
and AKT1-6-4 3'-primer are used for second fragment synthesis.
Fragments are combined, and primers AKT1-6-1 (5') and AKT1-6-4 (3')
are added for amplification. Amplified fragment T46 is then
digested with ApaI/SpeI and cloned into the Trichoderma reesei
expression vector. The cloned fragment is then verified by
sequencing with an appropriate set of primers, and the generated
sequence is used for construction of a T. reesei expression vector
with the 3.times.G4S spacer and alg3 targeting.
[0421] The CBHI and EGIV spacers are constructed with a similar PCR
overlap method. For the CBHI spacer, the first fragment is
amplified with AKT1-6-1 5'-primer and GP107 3'-primer. The second
fragment is amplified with GP108 5'-primer and AKT1-6-4 3'-primer
(Table 11). For the EGIV spacer, the first fragment is amplified
with AKT1-6-1 5'-primer and GP109 3'-primer. The second fragment is
amplified with GP110 5'-primer and AKT1-6-4 3'-primer (Table 11).
In both cases, the PCR products are purified from agarose gel,
combined, and used as a template for the next PCR reaction to
amplify the sequences T50 and T51. T50 and T51 PCR products are
then digested with ApaI/SpeI and cloned into the Trichoderma reesei
expression vector. The cloned fragments are then verified by
sequencing with appropriate sets of primers, and the generated
sequences are used for construction of T. reesei expression vectors
with either the CBHI or the EGIV spacer and alg3 targeting.
[0422] All PCR amplifications are made with high-fidelity Phusion
polymerase (Finnzymes). Primers (Table 11) are ordered from MWG
Operon. Sequencing is performed by the DNA Sequencing Laboratory of
the Institute of Biotechnology, University of Helsinki, as a
commercial service.
[0423] Trichoderma reesei expression vectors are then constructed in
which the described chimeric GnTII/GnTI sequences with spacer
variations (2.times.G4S, 3.times.G4S, CBHI, and EGIV) are subcloned
under the control of the cbh1 promoter, with a pyr4 gene loop-out
marker and alg3 flanking region fragments in the backbone for
targeted integration. Expression cassettes are transformed into T. reesei
strain M279 (pyr4.sup.- strain of M202). After plate selection, the
clones are PCR-screened and purified through single spores. To
obtain material for glycan analyses, shake flask cultivations are
performed as described.
[0424] Introduction of GnTII/I Fusion Protein Variants to T. Reesei
Strain Expressing Rituximab Antibody
[0425] The recombinant GnTII/I fusion protein variants are
introduced into the rituximab-expressing T. reesei strain M279
described in Example 5.
[0426] Briefly, the vectors having the GnTII/GnTI fusion protein
under the control of the cbh1 promoter, the MNTI targeting peptide,
the pyr4 loop-out marker, and each of the 4 spacer variants are
each subcloned into a backbone vector between alg3 flanking region
fragments for targeted integration, thus deleting the alg3 gene. A
PmeI-digested expression cassette is transformed into T. reesei
strain M279 (a pyr4.sup.- strain). After plate selection, the
clones are PCR-screened and purified through single spores.
[0427] Glycan Analysis of Rituximab-Producing T. reesei GnTII/GnTI
Variant Strains Obtained by Targeting to alg3 Locus
[0428] To obtain material for glycan analysis, shake flask
cultivations are performed as described in Example 5 and, in
addition, some culture media are supplemented with 0.3 mg/ml
soybean trypsin inhibitor (SBTI) and 1% casamino acids. SBTI is
added first at inoculation and then daily on days 3-6. PMSF and
Pepstatin A are added to all samples before freezing.
[0429] Rituximab is purified with Protein G affinity chromatography
from day 5 supernatant samples with SBTI and from day 5 and 7
samples without SBTI. PNGase F reactions are performed for
.about.10 .mu.g of denatured protein. The released N-glycans are
first purified with Hypersep C-18 and then with Hypersep Hypercarb
(both from Thermo Scientific) where neutral and acidic glycans are
separated. The purification steps are performed in 96-well format.
Neutral and acidic N-glycans are analyzed by MALDI-TOF MS to test
for the presence of the G0 glycoform on the rituximab antibody.
Sequence CWU 1
1
1471445PRTHomo sapiens 1Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu
Trp Gly Ala Ile Leu1 5 10 15 Phe Val Ala Trp Asn Ala Leu Leu Leu
Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ala Pro Gly Arg Pro Pro Ser
Val Ser Ala Leu Asp Gly Asp Pro Ala 35 40 45 Ser Leu Thr Arg Glu
Val Ile Arg Leu Ala Gln Asp Ala Glu Val Glu 50 55 60 Leu Glu Arg
Gln Arg Gly Leu Leu Gln Gln Ile Gly Asp Ala Leu Ser65 70 75 80 Ser
Gln Arg Gly Arg Val Pro Thr Ala Ala Pro Pro Ala Gln Pro Arg 85 90
95 Val Pro Val Thr Pro Ala Pro Ala Val Ile Pro Ile Leu Val Ile Ala
100 105 110 Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu
His Tyr 115 120 125 Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser
Gln Asp Cys Gly 130 135 140 His Glu Glu Thr Ala Gln Ala Ile Ala Ser
Tyr Gly Ser Ala Val Thr145 150 155 160 His Ile Arg Gln Pro Asp Leu
Ser Ser Ile Ala Val Pro Pro Asp His 165 170 175 Arg Lys Phe Gln Gly
Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala 180 185 190 Leu Gly Gln
Val Phe Arg Gln Phe Arg Phe Pro Ala Ala Val Val Val 195 200 205 Glu
Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe Arg Ala 210 215
220 Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val Ser
Ala225 230 235 240 Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ala
Ser Arg Pro Glu 245 250 255 Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly
Leu Gly Trp Leu Leu Leu 260 265 270 Ala Glu Leu Trp Ala Glu Leu Glu
Pro Lys Trp Pro Lys Ala Phe Trp 275 280 285 Asp Asp Trp Met Arg Arg
Pro Glu Gln Arg Gln Gly Arg Ala Cys Ile 290 295 300 Arg Pro Glu Ile
Ser Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser305 310 315 320 His
Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln 325 330
335 Gln Phe Val His Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg Glu
340 345 350 Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro
Gln Leu 355 360 365 Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu
Leu Gly Glu Val 370 375 380 Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe
Lys Ala Phe Ala Lys Ala385 390 395 400 Leu Gly Val Met Asp Asp Leu
Lys Ser Gly Val Pro Arg Ala Gly Tyr 405 410 415 Arg Gly Ile Val Thr
Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala 420 425 430 Pro Pro Pro
Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn 435 440 445 2445PRTPan
troglodytes 2Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly
Ala Ile Leu 1 5 10 15 Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe
Phe Trp Thr Arg Pro 20 25 30 Ala Pro Gly Arg Pro Pro Ser Val Ser
Ala Leu Asp Asp Asp Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile
Arg Leu Ala Gln Asp Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg
Gly Leu Leu Gln Gln Ile Gly Asp Ala Leu Ser65 70 75 80 Ser Gln Arg
Gly Arg Val Pro Thr Ala Ala Pro Pro Ala Gln Pro Arg 85 90 95 Val
Pro Val Thr Pro Ala Pro Ala Val Ile Pro Ile Leu Val Ile Ala 100 105
110 Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu His Tyr
115 120 125 Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp
Cys Gly 130 135 140 His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly
Ser Ala Val Thr145 150 155 160 His Ile Arg Gln Pro Asp Leu Ser Ser
Ile Ala Val Pro Pro Asp His 165 170 175 Arg Lys Phe Gln Gly Tyr Tyr
Lys Ile Ala Arg His Tyr Arg Trp Ala 180 185 190 Leu Gly Gln Val Phe
Arg Gln Phe Gly Phe Pro Ala Ala Val Val Val 195 200 205 Glu Asp Asp
Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe Gln Ala 210 215 220 Thr
Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val Ser Ala225 230
235 240 Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ala Ser Arg Pro
Glu 245 250 255 Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp
Leu Leu Leu 260 265 270 Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp
Pro Lys Ala Phe Trp 275 280 285 Asp Asp Trp Met Arg Arg Pro Glu Gln
Arg Gln Gly Arg Ala Cys Ile 290 295 300 Arg Pro Glu Ile Ser Arg Thr
Met Thr Phe Gly Arg Lys Gly Val Ser305 310 315 320 His Gly Gln Phe
Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln 325 330 335 Gln Phe
Val His Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg Glu 340 345 350
Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro Gln Leu 355
360 365 Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly Glu
Val 370 375 380 Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe
Ala Lys Ala385 390 395 400 Leu Gly Val Met Asp Asp Leu Lys Ser Gly
Val Pro Arg Ala Gly Tyr 405 410 415 Arg Gly Ile Val Thr Phe Gln Phe
Arg Gly Arg Arg Val His Leu Ala 420 425 430 Pro Pro Pro Thr Trp Glu
Gly Tyr Asp Pro Ser Trp Asn 435 440 445 3445PRTPongo abelii 3Met
Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly Ala Ile Leu 1 5 10
15 Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro
20 25 30 Ala Pro Gly Arg Pro Pro Ser Val Ser Ala Leu Asp Asp Asp
Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp
Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln
Ile Gly Asp Ala Leu Trp65 70 75 80 Ser Gln Arg Gly Arg Val Pro Thr
Pro Ala Leu Pro Ala Gln Pro Arg 85 90 95 Val Pro Ala Thr Pro Ala
Pro Ala Val Ile Pro Ile Leu Val Ile Ala 100 105 110 Cys Asp Arg Ser
Thr Val Arg Arg Cys Leu Asp Lys Leu Leu Gln Tyr 115 120 125 Arg Pro
Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp Cys Gly 130 135 140
His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val Thr145
150 155 160 His Ile Arg Gln Pro Asp Leu Ser Ser Ile Ala Val Pro Pro
Asp His 165 170 175 Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His
Tyr Arg Trp Ala 180 185 190 Leu Gly Gln Ile Phe Gln Arg Phe Arg Phe
Pro Ala Ala Val Val Val 195 200 205 Glu Asp Asp Leu Glu Val Ala Pro
Asp Phe Phe Glu Tyr Phe Gln Ala 210 215 220 Thr Tyr Pro Leu Leu Lys
Ala Asp Pro Ser Leu Trp Cys Val Ser Ala225 230 235 240 Trp Asn Asp
Asn Gly Lys Glu Gln Met Val Asp Ala Ser Lys Pro Glu 245 250 255 Leu
Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu Leu Leu 260 265
270 Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala Phe Trp
275 280 285 Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Lys Gly Arg Ala
Cys Ile 290 295 300 Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg
Lys Gly Val Ser305 310 315 320 His Gly Gln Phe Phe Asp Gln His Leu
Lys Phe Ile Lys Leu Asn Gln 325 330 335 Gln Phe Val His Phe Thr Gln
Leu Asp Leu Ser Tyr Leu Gln Arg Glu 340 345 350 Ala Tyr Asp Arg Asp
Phe Leu Ala Arg Val Tyr Gly Ala Pro Gln Leu 355 360 365 Gln Val Glu
Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly Glu Val 370 375 380 Arg
Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala Lys Ala385 390
395 400 Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala Gly
Tyr 405 410 415 Arg Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val
His Leu Ala 420 425 430 Pro Pro Pro Thr Trp Glu Gly Tyr Asp Pro Ser
Trp Asn 435 440 445 4445PRTMacaca mulatta 4Met Leu Lys Lys Gln Ser
Ala Gly Leu Val Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Ala Trp
Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ala Pro
Gly Arg Pro Pro Ser Val Ser Ala Leu Asn Asp Asp Pro Ala 35 40 45
Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp Ala Glu Val Glu 50
55 60 Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Gly Asp Ala Leu
Trp65 70 75 80 Ser Gln Arg Gly Arg Val Pro Thr Ala Gly Pro Pro Ala
Gln Pro His 85 90 95 Val Pro Val Thr Pro Ala Pro Ala Val Ile Pro
Ile Leu Val Ile Ala 100 105 110 Cys Asp Arg Ser Thr Val Arg Arg Cys
Leu Asp Lys Leu Leu His Tyr 115 120 125 Arg Pro Ser Ala Glu Arg Phe
Pro Ile Ile Val Ser Gln Asp Cys Gly 130 135 140 His Glu Glu Thr Ala
Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val Thr145 150 155 160 His Ile
Arg Gln Pro Asp Leu Ser Ser Ile Ala Val Pro Pro Asp His 165 170 175
Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala 180
185 190 Leu Gly Gln Val Phe His Arg Phe Arg Phe Pro Ala Ala Val Val
Val 195 200 205 Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr
Phe Gln Ala 210 215 220 Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu
Trp Cys Val Ser Ala225 230 235 240 Trp Asn Asp Asn Gly Lys Glu Gln
Met Val Asp Ser Gly Lys Pro Glu 245 250 255 Leu Leu Tyr Arg Thr Asp
Phe Phe Pro Gly Leu Gly Trp Leu Leu Leu 260 265 270 Ala Glu Leu Trp
Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala Phe Trp 275 280 285 Asp Asp
Trp Met Arg Arg Pro Glu Gln Arg Lys Gly Arg Ala Cys Ile 290 295 300
Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser305
310 315 320 His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu
Asn Gln 325 330 335 Gln Phe Val His Phe Thr Gln Leu Asp Leu Ser Tyr
Leu Gln Arg Glu 340 345 350 Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val
Tyr Ala Ala Pro Gln Leu 355 360 365 Gln Val Glu Lys Val Arg Thr Asn
Asp Arg Lys Glu Leu Gly Glu Val 370 375 380 Arg Val Gln Tyr Thr Gly
Arg Asp Ser Phe Lys Ala Phe Ala Lys Ala385 390 395 400 Leu Gly Val
Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala Gly Tyr 405 410 415 Arg
Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala 420 425
430 Pro Pro Pro Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn 435 440 445
5447PRTCricetulus griseus 5Met Leu Lys Lys Gln Ser Ala Gly Leu Val
Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Gly Trp Asn Ala Leu Leu
Leu Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ala Pro Gly Arg Pro Pro
Ser Asp Ser Ala Ile Asp Asp Asp Pro Ala 35 40 45 Ser Leu Thr Arg
Glu Val Phe Arg Leu Ala Glu Asp Ala Glu Val Glu 50 55 60 Leu Glu
Arg Gln Arg Gly Leu Leu Gln Gln Ile Arg Glu His His Ala65 70 75 80
Leu Trp Arg Gln Arg Trp Lys Val Pro Thr Val Ala Pro Pro Ala Trp 85
90 95 Pro Arg Val Pro Ala Thr Pro Ser Pro Ala Val Ile Pro Ile Leu
Val 100 105 110 Ile Ala Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp
Lys Leu Leu 115 120 125 His Tyr Arg Pro Ser Ala Glu His Phe Pro Ile
Ile Val Ser Gln Asp 130 135 140 Cys Gly His Glu Glu Thr Ala Gln Val
Ile Ala Ser Tyr Gly Ser Ala145 150 155 160 Val Thr His Ile Arg Gln
Pro Asp Leu Ser Asn Ile Ala Val Pro Pro 165 170 175 Asp His Arg Lys
Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg 180 185 190 Trp Ala
Leu Gly Gln Ile Phe Asn Lys Phe Lys Phe Pro Ala Ala Val 195 200 205
Val Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe 210
215 220 Gln Ala Thr Tyr Pro Leu Leu Arg Thr Asp Pro Ser Leu Trp Cys
Val225 230 235 240 Ser Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val
Asp Ser Ser Lys 245 250 255 Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe
Pro Gly Leu Gly Trp Leu 260 265 270 Leu Met Ala Glu Leu Trp Thr Glu
Leu Glu Pro Lys Trp Pro Lys Ala 275 280 285 Phe Trp Asp Asp Trp Met
Arg Arg Pro Glu Gln Arg Lys Gly Arg Ala 290 295 300 Cys Ile Arg Pro
Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly305 310 315 320 Val
Ser His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu 325 330
335 Asn Gln Gln Phe Val Ser Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln
340 345 350 Arg Glu Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Ser
Ala Pro 355 360 365 Leu Leu Gln Val Glu Lys Val Arg Thr Asn Asp Gln
Lys Glu Leu Gly 370 375 380 Glu Val Arg Val Gln Tyr Thr Ser Arg Asp
Ser Phe Lys Ala Phe Ala385 390 395 400 Lys Ala Leu Gly Val Met Asp
Asp Leu Lys Ser Gly Val Pro Arg Ala 405 410 415 Gly Tyr Arg Gly Val
Val Thr Phe Gln Phe Arg Gly Arg Arg Val His 420 425 430 Leu Ala Pro
Pro Gln Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn 435 440 445
6 447 PRT Rattus norvegicus 6 Met Leu Lys Lys Gln Ser Ala Gly Leu Val
Leu Trp Gly Ala Ile Ile 1 5 10 15 Phe Val Gly Trp Asn Ala Leu Leu
Leu Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ala Pro Gly Arg Leu Pro
Ser Asp Ser Ala Leu Gly Asp Asp Pro Ala 35 40 45 Ser Leu Thr Arg
Glu Val Ile His Leu Ala Glu Asp Ala Glu Ala Glu 50 55 60 Leu Glu
Arg Gln Arg Gly Leu Leu Gln Gln Ile Lys Glu His Tyr Ser65 70 75 80 Leu Trp Arg
Gln Arg Trp Arg Val Pro Thr Val Ala Pro Pro Ala Trp 85 90 95 Pro
Arg Val Pro Gly Thr Pro Ser Pro Ala Val Ile Pro Ile Leu Val 100 105
110 Ile Ala Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu
115 120 125 His Tyr Arg Pro Ser Ala Glu His Phe Pro Ile Ile Val Ser
Gln Asp 130 135 140 Cys Gly His Glu Glu Thr Ala Gln Val Ile Ala Ser
Tyr Gly Thr Ala145 150 155 160 Val Thr His Ile Arg Gln Pro Asp Leu
Ser Asn Ile Ala Val Gln Pro 165 170 175 Asp His Arg Lys Phe Gln Gly
Tyr Tyr Lys Ile Ala Arg His Tyr Arg 180 185 190 Trp Ala Leu Gly Gln
Ile Phe Asn Lys Phe Lys Phe Pro Ala Ala Val 195 200 205 Val Val Glu
Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe 210 215 220 Gln
Ala Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val225 230
235 240 Ser Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ser Ser
Lys 245 250 255 Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu
Gly Trp Leu 260 265 270 Leu Leu Ala Asp Leu Trp Ala Glu Leu Glu Pro
Lys Trp Pro Lys Ala 275 280 285 Phe Trp Asp Asp Trp Met Arg Arg Pro
Glu Gln Arg Lys Gly Arg Ala 290 295 300 Cys Ile Arg Pro Glu Ile Ser
Arg Thr Met Thr Phe Gly Arg Lys Gly305 310 315 320 Val Ser His Gly
Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu 325 330 335 Asn Gln
Gln Phe Val Pro Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln 340 345 350
Arg Glu Ala Tyr Asp Arg Asp Phe Leu Ala Gln Val Tyr Gly Ala Pro 355
360 365 Gln Leu Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu
Gly 370 375 380 Glu Val Arg Val Gln Tyr Thr Ser Arg Asp Ser Phe Lys
Ala Phe Ala385 390 395 400 Lys Ala Leu Gly Val Met Asp Asp Leu Lys
Ser Gly Val Pro Arg Ala 405 410 415 Gly Tyr Arg Gly Ile Val Thr Phe
Gln Phe Arg Gly Arg Arg Val His 420 425 430 Leu Ala Pro Pro Glu Thr
Trp Asn Gly Tyr Asp Pro Ser Trp Asn 435 440 445
7 447 PRT Mus musculus 7 Met Leu Lys Lys Gln Thr Ala Gly Leu Val Leu Trp Gly Ala Ile Ile 1
5 10 15 Phe Val Gly Trp Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg
Pro 20 25 30 Ala Pro Gly Arg Leu Pro Ser Asp Ser Ala Leu Gly Asp
Asp Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile His Leu Ala Glu
Asp Ala Glu Ala Glu 50 55 60 Leu Glu Arg Gln Arg Gly Leu Leu Gln
Gln Ile Lys Glu His Tyr Ala65 70 75 80 Leu Trp Arg Gln Arg Trp Arg
Val Pro Thr Val Ala Pro Pro Ala Trp 85 90 95 Pro Arg Val Pro Val
Thr Pro Ser Pro Val Gln Ile Pro Ile Leu Val 100 105 110 Ile Ala Cys
Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu 115 120 125 His
Tyr Arg Pro Ser Ala Glu Arg Phe Pro Ile Ile Val Ser Gln Asp 130 135
140 Cys Gly His Glu Glu Thr Ala Gln Val Ile Ala Ser Tyr Gly Thr
Ala145 150 155 160 Val Thr His Ile Arg Gln Pro Asp Leu Ser Asn Ile
Ala Val Gln Pro 165 170 175 Asp His Arg Lys Phe Gln Gly Tyr Tyr Lys
Ile Ala Arg His Tyr Arg 180 185 190 Trp Ala Leu Gly Gln Ile Phe Asn
Lys Phe Lys Phe Pro Ala Ala Val 195 200 205 Val Val Glu Asp Asp Leu
Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe 210 215 220 Gln Ala Thr Tyr
Pro Leu Leu Arg Thr Asp Pro Ser Leu Trp Cys Val225 230 235 240 Ser
Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ser Ser Lys 245 250
255 Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu
260 265 270 Leu Leu Ala Asp Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro
Lys Ala 275 280 285 Phe Trp Asp Asp Trp Met Arg Arg Pro Glu Gln Arg
Lys Gly Arg Ala 290 295 300 Cys Ile Arg Pro Glu Ile Ser Arg Thr Met
Thr Phe Gly Arg Lys Gly305 310 315 320 Val Ser His Gly Gln Phe Phe
Asp Gln His Leu Lys Phe Ile Lys Leu 325 330 335 Asn Gln Gln Phe Val
Pro Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln 340 345 350 Gln Glu Ala
Tyr Asp Arg Asp Phe Leu Ala Gln Val Tyr Gly Ala Pro 355 360 365 Gln
Leu Gln Val Glu Lys Val Arg Thr Asn Asp Gln Lys Glu Leu Gly 370 375
380 Glu Val Arg Val Gln Tyr Thr Ser Arg Asp Ser Phe Lys Ala Phe
Ala385 390 395 400 Lys Ala Leu Gly Val Met Asp Asp Leu Lys Ser Gly
Val Pro Arg Ala 405 410 415 Gly Tyr Arg Gly Ile Val Thr Phe Gln Phe
Arg Gly Arg Arg Val His 420 425 430 Leu Ala Pro Pro Gln Thr Trp Thr
Gly Tyr Asp Pro Ser Trp Asn 435 440 445
8 447 PRT Oryctolagus cuniculus 8 Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly Ala
Ile Leu 1 5 10 15 Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe Phe
Trp Thr Arg Pro 20 25 30 Val Pro Ser Arg Leu Pro Ser Asp Asn Ala
Leu Asp Asp Asp Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile Arg
Leu Ala Gln Asp Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg Gly
Leu Leu Gln Gln Ile Arg Glu His His Ala65 70 75 80 Leu Trp Ser Gln
Arg Trp Lys Val Pro Thr Ala Ala Pro Pro Ala Gln 85 90 95 Pro His
Val Pro Val Thr Pro Pro Pro Ala Val Ile Pro Ile Leu Val 100 105 110
Ile Ala Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu 115
120 125 His Tyr Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln
Asp 130 135 140 Cys Gly His Glu Glu Thr Ala Gln Val Ile Ala Ser Tyr
Gly Ser Ala145 150 155 160 Val Thr His Ile Arg Gln Pro Asp Leu Ser
Asn Ile Ala Val Gln Pro 165 170 175 Asp His Arg Lys Phe Gln Gly Tyr
Tyr Lys Ile Ala Arg His Tyr Arg 180 185 190 Trp Ala Leu Gly Gln Ile
Phe His Asn Phe Asn Tyr Pro Ala Ala Val 195 200 205 Val Val Glu Asp
Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe 210 215 220 Gln Ala
Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val225 230 235
240 Ser Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ser Ser Lys
245 250 255 Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly
Trp Leu 260 265 270 Leu Leu Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys
Trp Pro Lys Ala 275 280 285 Phe Trp Asp Asp Trp Met Arg Arg Pro Glu
Gln Arg Lys Gly Arg Ala 290 295 300 Cys Val Arg Pro Glu Ile Ser Arg
Thr Met Thr Phe Gly Arg Lys Gly305 310 315 320 Val Ser His Gly Gln
Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu 325 330 335 Asn Gln Gln
Phe Val Pro Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln 340 345 350 Gln
Glu Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro 355 360
365 Gln Leu Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly
370 375 380 Glu Val Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala
Phe Ala385 390 395 400 Lys Ala Leu Gly Val Met Asp Asp Leu Lys Ser
Gly Val Pro Arg Ala 405 410 415 Gly Tyr Arg Gly Ile Val Thr Phe Leu
Phe Arg Gly Arg Arg Val His 420 425 430 Leu Ala Pro Pro Gln Thr Trp
Asp Gly Tyr Asp Pro Ser Trp Thr 435 440 445
9 447 PRT Mesocricetus auratus 9 Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly Ala
Ile Ile 1 5 10 15 Phe Val Gly Trp Asn Ala Leu Leu Leu Leu Phe Phe
Trp Thr Arg Pro 20 25 30 Ala Pro Gly Arg Pro Pro Leu Asp Ser Ala
Leu Asp Asp Asp Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile Arg
Leu Ala Glu Asp Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg Gly
Leu Leu Gln Gln Ile Arg Glu His His Thr65 70 75 80 Leu Trp Asn Gln
Arg Trp Lys Val Pro Thr Val Ala Pro Pro Ala Trp 85 90 95 Pro Arg
Val Pro Val Thr Pro Ser Pro Pro Val Ile Pro Ile Leu Val 100 105 110
Ile Ala Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu 115
120 125 His Tyr Arg Pro Ser Ala Glu His Phe Pro Ile Ile Val Ser Gln
Asp 130 135 140 Cys Gly His Glu Glu Thr Ala Gln Val Ile Ala Ser Tyr
Gly Ser Ala145 150 155 160 Val Thr His Ile Arg Gln Pro Asp Leu Ser
Asn Ile Ala Val Gln Pro 165 170 175 Asp His Arg Lys Phe Gln Gly Tyr
Tyr Lys Ile Ala Arg His Tyr Arg 180 185 190 Trp Ala Leu Gly Gln Ile
Phe Asn Lys Phe Lys Phe Pro Ala Ala Val 195 200 205 Val Val Glu Asp
Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe 210 215 220 Gln Ala
Thr Tyr Pro Leu Leu Arg Thr Asp Pro Ser Leu Trp Cys Val225 230 235
240 Ser Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ser Ser Lys
245 250 255 Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly
Trp Leu 260 265 270 Leu Leu Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys
Trp Pro Lys Ala 275 280 285 Phe Trp Asp Asp Trp Met Arg Arg Pro Glu
Gln Arg Lys Gly Arg Ala 290 295 300 Cys Ile Arg Pro Glu Ile Ser Arg
Thr Met Thr Phe Gly Arg Lys Gly305 310 315 320 Val Ser His Gly Gln
Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu 325 330 335 Asn Gln Gln
Phe Val Ser Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln 340 345 350 Arg
Glu Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro 355 360
365 Leu Leu Gln Val Glu Lys Val Arg Thr Asn Asp Gln Lys Glu Leu Gly
370 375 380 Glu Val Arg Val Gln Tyr Thr Ser Arg Asp Ser Phe Lys Ala
Phe Ala385 390 395 400 Lys Ala Leu Gly Val Met Asp Asp Leu Lys Ser
Gly Val Pro Arg Ala 405 410 415 Gly Tyr Arg Gly Ile Val Thr Phe Gln
Phe Arg Gly Arg Arg Val His 420 425 430 Leu Ala Pro Pro Arg Ser Trp
Glu Gly Tyr Asp Pro Ser Trp Thr 435 440 445
10 445 PRT Ailuropoda melanoleuca 10 Met Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly
Ala Ile Leu 1 5 10 15 Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe
Phe Trp Thr Arg Pro 20 25 30 Ser Pro Gly Arg Leu Pro Ser Glu Ser
Ala Leu Asp Asp Asp Pro Ala 35 40 45 Val Leu Thr Arg Glu Val Ile
Arg Leu Ala Glu Asp Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg
Gly Leu Leu Gln Gln Ile Arg Glu His His Ala65 70 75 80 Arg Trp Ser
Gln Arg Trp Arg Ala Pro Thr Ala Thr Val Pro Ala Pro 85 90 95 Ala
Pro Ala Ser Asn Ala Pro Ala Val Ile Pro Ile Leu Val Ile Ala 100 105
110 Cys Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu His Tyr
115 120 125 Arg Pro Ser Ala Glu His Phe Pro Ile Ile Val Ser Gln Asp
Cys Gly 130 135 140 His Glu Glu Thr Ala Gln Val Ile Ala Ser Tyr Gly
Ser Ala Val Thr145 150 155 160 His Ile Arg Gln Pro Asp Leu Ser Ser
Ile Ala Val Pro Pro Asp His 165 170 175 Arg Lys Phe Gln Gly Tyr Tyr
Lys Ile Ala Arg His Tyr Arg Trp Ala 180 185 190 Leu Gly Gln Val Phe
His Arg Phe Lys Phe Pro Ala Ala Val Val Val 195 200 205 Glu Asp Asp
Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr Phe Gln Ala 210 215 220 Thr
Tyr Pro Leu Leu Arg Ala Asp Pro Ser Leu Trp Cys Val Ser Ala225 230
235 240 Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ser Ser Lys Pro
Glu 245 250 255 Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp
Leu Leu Leu 260 265 270 Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp
Pro Arg Ala Phe Trp 275 280 285 Asp Asp Trp Met Arg Arg Pro Glu Gln
Arg Gln Gly Arg Ala Cys Val 290 295 300 Arg Pro Glu Ile Ser Arg Thr
Met Thr Phe Gly Arg Lys Gly Val Ser305 310 315 320 His Gly Gln Phe
Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln 325 330 335 His Phe
Val Pro Phe Thr Gln Leu Asp Leu Ser Tyr Leu Arg Gln Glu 340 345 350
Thr Tyr Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro Leu Leu 355
360 365 Gln Val Glu Lys Val Arg Thr Ser Glu Arg Asn Glu Leu Gly Glu
Val 370 375 380 Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe
Ala Lys Ala385 390 395 400 Leu Gly Val Met Asp Asp Leu Lys Ser Gly
Val Pro Arg Ala Gly Tyr 405 410 415 Arg Gly Ile Val Ser Phe Leu Phe
Arg Gly Arg Arg Val His Leu Ala 420 425 430 Pro Pro Gln Thr Trp Asp
Gly Tyr Asp Pro Ser Trp Asn 435 440 445
11 447 PRT Sus scrofa 11 Met
Leu Lys Lys Gln Ser Ala Gly Leu Val Leu Trp Gly Ala Ile Leu 1 5 10
15 Phe Val Ala Trp Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro
20 25 30 Ala Pro Gly Arg Leu Pro Ser Asp Ser Ala Leu Asp Asp Asp
Pro Ala 35 40 45 Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp
Ala Glu Val Glu 50 55 60 Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln
Ile Arg Glu His His Ala65 70 75 80 Arg Trp Ser Gln Arg Trp Arg Val
Pro Thr Val Ala Pro Pro Val Pro 85 90 95 Pro Arg Val Pro Val Thr
Ser Ala Pro Thr Val Ile Pro Ile Leu Val 100 105 110 Ile Ala Cys Asp
Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu 115 120 125 His Tyr
Arg Pro Ser Ala Glu His Phe Pro Ile Ile Val Ser Gln Asp 130 135 140 Cys Gly
His Glu Glu Thr Ala Gln Val Ile Ala Ser Tyr Gly Ser Ala145 150 155
160 Val Thr His Ile Arg Gln Pro Asp Leu Ser Asn Ile Val Val Pro Pro
165 170 175 Asp His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His
Tyr Arg 180 185 190 Trp Ala Leu Gly Gln Val Phe Glu Lys Phe Lys Phe
Ser Ala Ala Val 195 200 205 Val Val Glu Asp Asp Leu Glu Val Ala Pro
Asp Phe Phe Glu Tyr Phe 210 215 220 Gln Ala Thr Tyr Pro Leu Leu Arg
Ala Asp Pro Ser Leu Trp Cys Val225 230 235 240 Ser Ala Trp Asn Asp
Asn Gly Lys Glu Gln Met Val Asp Ser Ser Lys 245 250 255 Pro Glu Leu
Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu 260 265 270 Leu
Leu Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala 275 280
285 Phe Trp Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala
290 295 300 Cys Val Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg
Lys Gly305 310 315 320 Val Ser His Gly Gln Phe Phe Asp Gln His Leu
Lys Phe Ile Lys Leu 325 330 335 Asn Gln His Phe Val Pro Phe Thr Gln
Leu Asp Leu Ser Tyr Leu Arg 340 345 350 Arg Glu Ala Tyr Asp Arg Asp
Phe Leu Ala Arg Val Tyr Gly Ala Pro 355 360 365 Leu Leu Gln Val Glu
Lys Val Arg Thr Ser Glu Arg Ser Glu Leu Gly 370 375 380 Glu Val Arg
Val Gln Tyr Thr Ser Arg Asp Ser Phe Lys Ala Phe Ala385 390 395 400
Lys Ala Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala 405
410 415 Gly Tyr Arg Gly Ile Val Ser Phe Leu Phe Arg Gly Arg Arg Val
Tyr 420 425 430 Leu Ala Pro Pro Glu Thr Trp Asp Gly Tyr Asp Pro Ser
Trp Asn 435 440 445
12 447 PRT Canis familiaris 12 Met Leu Lys Lys Gln
Ser Ala Gly Leu Val Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Ala
Trp Asn Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ser
Pro Ser Arg Leu Pro Ser Asp Ser Ala Leu Asp Asp Asp Pro Ala 35 40
45 Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Glu Asp Ala Glu Val Glu
50 55 60 Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Arg Glu His
His Ala65 70 75 80 Arg Trp Ser Gln Arg Trp Arg Val Pro Thr Ala Ala
Pro Pro Ala Pro 85 90 95 Pro Arg Val Pro Val Ser Ser Pro Pro Ala
Val Ile Pro Ile Leu Val 100 105 110 Ile Ala Cys Asp Arg Ser Thr Val
Arg Arg Cys Leu Asp Lys Leu Leu 115 120 125 His Tyr Arg Pro Ser Ala
Glu His Phe Pro Ile Ile Val Ser Gln Asp 130 135 140 Cys Gly His Glu
Glu Thr Ala Gln Val Ile Ala Ser Tyr Gly Ser Ala145 150 155 160 Ile
Thr His Ile Arg Gln Pro Asp Leu Ser Ser Ile Thr Val Pro Pro 165 170
175 Asp His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg
180 185 190 Trp Ala Leu Gly Gln Val Phe His Lys Phe Lys Phe Pro Ala
Ala Val 195 200 205 Val Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe
Phe Glu Tyr Phe 210 215 220 Gln Ala Thr Tyr Pro Leu Leu Arg Ala Asp
Pro Ser Leu Trp Cys Val225 230 235 240 Ser Ala Trp Asn Asp Asn Gly
Lys Glu Gln Met Val Asp Ser Ser Lys 245 250 255 Pro Glu Leu Leu Tyr
Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu 260 265 270 Leu Leu Ala
Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Arg Ala 275 280 285 Phe
Trp Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala 290 295
300 Cys Val Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys
Gly305 310 315 320 Val Ser His Gly Gln Phe Phe Asp Gln His Leu Lys
Phe Ile Lys Leu 325 330 335 Asn Gln His Phe Val Pro Phe Thr Gln Leu
Asp Leu Ser Tyr Leu Arg 340 345 350 Gln Glu Thr Tyr Asp Arg Asp Phe
Leu Ala Arg Val Tyr Gly Ala Pro 355 360 365 Leu Leu Gln Val Glu Lys
Val Arg Thr Ser Glu Arg Ser Glu Leu Gly 370 375 380 Glu Val Arg Val
Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala385 390 395 400 Lys
Ala Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala 405 410
415 Gly Tyr Arg Gly Ile Val Ser Phe Leu Phe Arg Gly Arg Arg Val His
420 425 430 Leu Ala Pro Pro Gln Thr Trp Asp Gly Tyr Asp Pro Ser Trp
Asn 435 440 445
13 447 PRT Bos taurus 13 Met Leu Lys Lys Gln Ser Ala
Gly Leu Val Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Ala Trp Asn
Ala Leu Leu Leu Leu Phe Phe Trp Thr Arg Pro 20 25 30 Ala Pro Gly
Arg Leu Pro Ser Asp Ser Ala Leu Asp Asp Asp Pro Ala 35 40 45 Ser
Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp Ala Glu Val Glu 50 55
60 Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Arg Glu His His
Ala65 70 75 80 Arg Trp Ser Gln Arg Trp Arg Val Pro Thr Val Ala Pro
Pro Val Pro 85 90 95 Pro Arg Val Pro Val Thr Thr Pro Pro Ala Val
Ile Pro Ile Leu Val 100 105 110 Ile Ala Cys Asp Arg Ser Thr Val Arg
Arg Cys Leu Asp Lys Leu Leu 115 120 125 Asn Tyr Arg Pro Ser Ala Glu
His Phe Pro Ile Ile Val Ser Gln Asp 130 135 140 Cys Gly His Glu Glu
Thr Ala Gln Val Ile Ala Ser Tyr Gly Ser Ala145 150 155 160 Val Met
His Ile Arg Gln Pro Asp Leu Ser Thr Ile Ala Val Pro Pro 165 170 175
Asp His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg 180
185 190 Trp Ala Leu Gly Gln Val Phe His Glu Phe Lys Phe Pro Ala Ala
Val 195 200 205 Val Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe
Glu Tyr Phe 210 215 220 Gln Ala Thr Tyr Pro Leu Leu Arg Ala Asp Pro
Ser Leu Trp Cys Val225 230 235 240 Ser Ala Trp Asn Asp Asn Gly Lys
Glu Gln Met Val Asp Ser Ser Lys 245 250 255 Pro Glu Leu Leu Tyr Arg
Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu 260 265 270 Leu Leu Ala Glu
Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala 275 280 285 Phe Trp
Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala 290 295 300
Cys Val Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly305
310 315 320 Val Ser His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile
Lys Leu 325 330 335 Asn Gln His Phe Val Pro Phe Thr Gln Leu Asp Leu
Ser Tyr Leu Arg 340 345 350 Gln Glu Thr Tyr Asp Arg Asp Phe Leu Ala
Arg Val Tyr Gly Ala Pro 355 360 365 Leu Leu Gln Val Glu Lys Val Arg
Thr Ser Glu Arg Ser Glu Leu Gln 370 375 380 Glu Val Arg Val Gln Tyr
Thr Ser Arg Asp Ser Phe Lys Ala Phe Ala385 390 395 400 Lys Ala Leu
Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala 405 410 415 Gly
Tyr Arg Gly Ile Val Ser Phe Leu Tyr Arg Gly Arg Arg Val His 420 425
430 Leu Ala Pro Pro Gln Thr Trp Asp Gly Tyr Asp Pro Ser Trp Asn 435
440 445
14 447 PRT Equus caballus 14 Met Leu Lys Lys Gln Ser Ala Gly
Leu Val Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Ala Trp Asn Ala
Leu Leu Leu Leu Phe Phe Trp Met Arg Pro 20 25 30 Ser Pro Ser Arg
Leu Pro Ser Asp Gly Thr Leu Asp Asp Asp Pro Thr 35 40 45 Gly Leu
Thr Arg Lys Val Ile His Leu Ala Gln Asp Val Glu Val Glu 50 55 60
Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Arg Glu His His Ala65
70 75 80 Arg Trp Ser Gln Trp Trp Arg Val Pro Thr Val Pro Pro Pro
Val Pro 85 90 95 Pro His Val Ser Val Thr Ser Leu Pro Ala Val Ile
Pro Ile Leu Val 100 105 110 Ile Ala Cys Asp Arg Ser Thr Val Arg Arg
Cys Leu Asp Lys Leu Leu 115 120 125 His Tyr Arg Pro Ser Ala Glu His
Phe Pro Ile Ile Val Ser Gln Asp 130 135 140 Cys Gly His Glu Glu Thr
Ala Gln Val Ile Ala Ser Tyr Gly Ser Ala145 150 155 160 Val Thr His
Ile Arg Gln Pro Asp Leu Ser Asn Ile Ala Val Pro Pro 165 170 175 Asp
His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg 180 185
190 Trp Ala Leu Ala Gln Val Phe His Arg Phe Lys Phe Pro Ala Ala Val
195 200 205 Val Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu
Tyr Phe 210 215 220 Gln Ala Thr Tyr Pro Leu Leu Arg Ala Asp Pro Ser
Leu Trp Cys Val225 230 235 240 Ser Ala Trp Asn Asp Asn Gly Lys Glu
Gln Met Val Asp Ser Ser Lys 245 250 255 Pro Glu Leu Leu Tyr Arg Thr
Asp Phe Phe Pro Gly Leu Gly Trp Leu 260 265 270 Leu Leu Ala Glu Leu
Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala 275 280 285 Phe Trp Asp
Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala 290 295 300 Cys
Val Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Ile Gly305 310
315 320 Val Ser His Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys
Leu 325 330 335 Asn Gln His Phe Val Pro Phe Thr Gln Leu Asp Leu Ser
Tyr Leu Arg 340 345 350 Gln Glu Ala Tyr Asp Lys Asp Phe Leu Ala Arg
Val Tyr Gly Ala Pro 355 360 365 Leu Leu Gln Val Glu Lys Val Arg Thr
Gly Glu Arg Ser Glu Leu Gly 370 375 380 Glu Val Arg Val Gln Tyr Thr
Gly Arg Asp Ser Phe Lys Ala Phe Ala385 390 395 400 Lys Ala Leu Gly
Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala 405 410 415 Gly Tyr
Arg Gly Ile Val Ser Phe Leu Phe Arg Gly Arg Arg Val His 420 425 430
Leu Ala Pro Pro Gln Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn 435 440
445
15 447 PRT Monodelphis domestica 15 Met Leu Lys Lys Gln Ser Ala Gly
Leu Val Leu Trp Gly Ala Ile Leu 1 5 10 15 Phe Val Ala Trp Asn Ala
Leu Leu Leu Phe Phe Phe Trp Ala Arg Pro 20 25 30 Leu Pro Gly Gly
Pro Ser Ser Glu Asp Pro Phe Ala Asn Asp Pro Ala 35 40 45 Ser Leu
Ser Arg Arg Val Ile Arg Leu Ala Gln Glu Ala Glu Ile Glu 50 55 60
Leu Glu Arg Gln His Val Leu Leu Gln Gln Ile Gln Lys His Ser Val65
70 75 80 Leu Trp Asn Gln Arg Gln Gln Val Ala Thr Ala Gly Pro Pro
Ala Val 85 90 95 Ser His Pro Thr Val Ala Pro Thr Thr Phe Val Leu
Pro Ile Leu Val 100 105 110 Ile Ala Cys Asp Arg Ser Thr Val Arg Arg
Cys Leu Asp Lys Leu Leu 115 120 125 His Tyr Arg Pro Ser Ala Glu Arg
Phe Pro Ile Ile Val Ser Gln Asp 130 135 140 Cys Gly His Lys Val Thr
Ala Gln Val Ile Ala Ser Tyr Gly Asn Ala145 150 155 160 Ile Met His
Ile Lys Gln Pro Asp Leu Ser Ser Ile Pro Val Pro Thr 165 170 175 Glu
His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg 180 185
190 Trp Ala Leu Asn Gln Val Phe Arg Thr Phe Lys Tyr Gln Ala Ala Val
195 200 205 Val Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu
Tyr Phe 210 215 220 Gln Ala Thr Tyr Pro Leu Leu Arg Thr Asp Pro Ser
Leu Trp Cys Val225 230 235 240 Ser Ala Trp Asn Asp Asn Gly Lys Glu
Gln Met Val Asp Ala Lys Arg 245 250 255 Pro Asp Leu Leu Tyr Arg Thr
Asp Phe Phe Pro Gly Leu Gly Trp Leu 260 265 270 Leu Leu Ala Glu Leu
Trp Asp Glu Leu Glu Pro Lys Trp Pro Lys Ala 275 280 285 Phe Trp Asp
Asp Trp Met Arg Gln Pro Glu Gln Arg Arg Asp Arg Ala 290 295 300 Cys
Leu Arg Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly305 310
315 320 Val Ser Gln Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys
Leu 325 330 335 Asn Gln Gly Phe Val Phe Phe Thr Gln Leu Asp Leu Ser
Tyr Leu Lys 340 345 350 Gln Glu Ala Tyr Asp Arg Asp Phe Ser Ala Arg
Val Tyr Ala Ala Pro 355 360 365 Gln Val Gln Val Glu Glu Leu Lys Ser
Asn Gln Lys Gln Glu Leu Gly 370 375 380 Glu Val Arg Val Gln Tyr Arg
Gly Arg Asp Ser Phe Arg Ala Phe Ala385 390 395 400 Lys Ala Leu Gly
Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala 405 410 415 Ser Tyr
Arg Gly Ile Val Ser Phe Leu Phe Arg Gly Arg Arg Val Tyr 420 425 430
Leu Ala Pro Pro Gln Asp Trp Thr Gly Tyr Asp Pro Ser Trp Ser 435 440
445
16 438 PRT Salmo salar 16 Met Leu Arg Lys Arg Gly Ser Ala Ile Leu
Cys Gly Ala Phe Leu Phe 1 5 10 15 Val Ala Trp Asn Ala Val Val Val
Leu Tyr Leu Trp Gly Arg Pro Leu 20 25 30 Ser Gly Arg Glu Glu Arg
Glu Met Asp Gly Gly Arg Gly Gly Ala Asp 35 40 45 Leu Ala Gly Asp
Val Ile His Met Ala Glu Ala Phe Glu Ala Glu Leu 50 55 60 Glu Met
Gln Arg Lys Ile Leu Leu Gln Ile Gln Gly His Arg Ser Leu65 70 75 80
Trp Glu Gln Pro Asn Glu Asn Gly Ala Ser Arg Ile Gly Pro Pro Gln 85
90 95 Val Val Ile Pro Ile Leu Val Ile Ala Cys Asn Arg Val Thr Val
Lys 100 105 110 Arg Cys Leu Asp Lys Leu Leu Glu Tyr Arg Pro Ser Ala
Glu Leu Tyr 115 120 125 Pro Ile Ile Val Ser Gln Asp Cys Gly His Ala
Glu Thr Ala Gln Val 130 135 140 Ile Gly Ser Tyr Gly Ser Gln Val Thr
His Leu Lys Gln Pro Asp Leu145 150 155 160 Ser Asp Ile Ala Val Arg
Pro Glu His Lys Lys Phe Gln Gly Tyr Tyr 165 170 175 Lys Ile Ser Arg
His Tyr Arg Trp Ala Leu Asn Gln Val Phe Asn Ser 180 185 190 Leu Ser
His Ser Ser Val Val Ile Val Glu Asp Asp Leu Glu Val Ala 195 200 205 Pro
Asp Phe Phe Glu Tyr Phe Arg Ser Leu His Pro Ile Leu Lys Ser 210 215
220 Asp Leu Ser Leu Trp Cys Val Ser Ala Trp Asn Asp Asn Gly Arg
Asp225 230 235 240 Gly Tyr Val Asp Pro Ala Lys Ala Asp Leu Leu Tyr
Arg Thr Asp Phe 245 250 255 Phe Pro Gly Leu Gly Trp Met Met Leu Lys
Glu Leu Trp Val Glu Leu 260 265 270 Glu Pro Lys Trp Pro Gly Ala Phe
Trp Asp Asp Trp Met Arg Gln Pro 275 280 285 Asp Gln Arg Arg Asp Arg
Ala Cys Ile Arg Pro Glu Ile Ser Arg Thr 290 295 300 Leu Thr Phe Gly
Arg Lys Gly Val Ser Leu Gly Gln Phe Tyr Asp Lys305 310 315 320 Tyr
Leu Arg Tyr Ile Lys Leu Asn Ser Glu Phe Val Pro Phe Thr Lys 325 330
335 Leu Asp Leu Ala Tyr Leu Lys Glu Glu Lys Tyr Lys Glu Ile Phe Glu
340 345 350 Lys Gln Val Tyr Ser Ala Pro Leu Val Lys Tyr Glu Glu Val
Gln Arg 355 360 365 Gly Gln Leu Lys Gly Ala Gly Pro Phe Cys Leu His
Tyr Leu Ser Lys 370 375 380 Asp Gly Phe Lys Val Leu Ala Lys Asn Leu
Gly Val Met Glu Asp Leu385 390 395 400 Lys Ser Gly Val Pro Arg Thr
Gly Tyr Arg Gly Val Val Ser Phe Leu 405 410 415 Ser Arg Gly Arg Arg
Ile Phe Leu Ala Pro Pro Pro Gly Trp Ser Lys 420 425 430 Tyr Asp Pro
Thr Trp Ser 435
17 452 PRT Danio rerio 17 Met Leu Arg Lys Arg Ser Pro
Leu Val Ile Cys Gly Ala Phe Ile Phe 1 5 10 15 Val Ala Trp Asn Val
Val Leu Leu Phe Val Leu Met Arg Arg Pro Ser 20 25 30 Ser Pro Gly
Thr Phe Asn Asn Gln Asp Lys Pro Gly Glu Thr Glu His 35 40 45 Arg
Ala Glu Gly Gly Lys Phe Gly Asn Ile Met Asn Glu Val Ile Arg 50 55
60 Val Ala Asp Ala Phe Glu Ala Glu Leu Ala Ala Gln Lys Lys Ile
Leu65 70 75 80 Gln Gln Ile Gln Ser His Trp Ser Val Trp Asp Ser Lys
Asp Gly Val 85 90 95 Ile Pro Glu Lys Ser Lys Ser Glu Val Glu His
Thr Ala Pro Val Val 100 105 110 Ile Pro Ile Leu Val Ile Ala Cys Asn
Arg Val Thr Val Lys Arg Cys 115 120 125 Leu Asp Lys Leu Ile Glu His
Arg Pro Ser Ala Glu Leu His Pro Ile 130 135 140 Ile Val Ser Gln Asp
Cys Gly His Arg Glu Thr Ser Asp Val Ile Gly145 150 155 160 Ser Tyr
Gly Ser Gln Leu Thr His Ile Lys Gln Pro Asp Leu Ser Asp 165 170 175
Val Ala Val Pro Pro Gln His Lys Lys Phe Gln Gly Tyr Tyr Lys Ile 180
185 190 Ser Arg His Tyr Lys Trp Ala Leu Ser Gln Val Phe Asn Thr Phe
Ser 195 200 205 Tyr Ser Ser Val Val Val Val Glu Asp Asp Leu Glu Val
Ala Pro Asp 210 215 220 Phe Phe Glu Tyr Phe Arg Ala Leu His Pro Met
Leu Lys Ser Asp Pro225 230 235 240 Thr Leu Trp Cys Val Ser Ala Trp
Asn Asp Asn Gly Arg Asp Gly Phe 245 250 255 Val Asp Pro Gly Lys Ala
Ser Leu Leu Tyr Arg Thr Asp Phe Phe Pro 260 265 270 Gly Leu Gly Trp
Met Leu Thr Lys Asp Leu Trp Ala Glu Leu Glu Pro 275 280 285 Lys Trp
Pro Ala Ser Phe Trp Asp Asp Trp Met Arg His Pro Asp Gln 290 295 300
Arg Lys Asp Arg Ser Cys Ile Arg Pro Glu Ile Ser Arg Thr Leu Thr305
310 315 320 Phe Gly Arg Lys Gly Val Ser Leu Gly Gln Phe Tyr Asp Lys
Tyr Leu 325 330 335 Arg Phe Ile Lys Leu Asn Thr Glu Phe Val Pro Phe
Thr Lys Met Asp 340 345 350 Leu Ser Tyr Leu Glu Lys Glu Lys Tyr Asp
Glu Ser Phe Glu Lys Glu 355 360 365 Val Tyr Ala Ala Ser Leu Val Thr
Leu Glu Asp Leu Lys Ser Gly Lys 370 375 380 Leu Ser Gly Ser Gly Pro
Phe Arg Val Gln Tyr Ser Ser Pro Asp Ser385 390 395 400 Phe Lys Ser
Leu Ala Arg Asn Leu Gly Val Met Asp Asp Leu Lys Ser 405 410 415 Gly
Val Pro Arg Ala Gly Tyr Arg Gly Ala Val Ser Phe Leu Leu Arg 420 425
430 Gly Lys Arg Val Tyr Leu Ala Pro Pro Ala Gly Trp Ser Arg Tyr Asp
Pro Ser Trp Ser 450
18 448 PRT Xenopus laevis 18 Met Pro
Arg Lys Val Ser Val Ala Ala Trp Gly Ala Ala Leu Phe Ile 1 5 10 15
Ser Trp Asn Ala Ile Leu Leu Leu Tyr Leu Met Ser Arg Ser Arg Gly 20
25 30 Thr Asp His Ser Asp Leu Thr Ala His Val Ile Gln Leu Ala Glu
Ala 35 40 45 Ala Glu Ala Glu Leu Glu Lys Gln Lys Gly Leu Leu Gln
Gln Ile His 50 55 60 His Tyr Ser Gly Leu Leu Asn Gln Gln Gln Pro
Ser Ser His Val Arg65 70 75 80 Leu Ala Pro Leu Met Pro Ile Lys Asn
Leu Asn Val Ser Ser Pro Phe 85 90 95 Pro Ser Pro Val Gly Ser Gly
Pro Leu Pro Leu Val Ile Pro Ile Leu 100 105 110 Val Val Ala Cys Asp
Arg Pro Ser Val Arg Arg Cys Leu Asp Ser Leu 115 120 125 Leu Lys Tyr
Arg Pro Ser Ala Glu Lys Phe Pro Ile Ile Val Ser Gln 130 135 140 Asp
Cys Gly His Glu Glu Thr Gly Lys Val Ile Asp Ser Tyr Gly Asp145 150
155 160 Ala Val Thr His Ile Lys Gln Pro Asp Leu Ser Glu Val Ala Val
Pro 165 170 175 Pro Glu His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ser
Arg His Tyr 180 185 190 Arg Trp Ala Leu Asn Gln Ile Phe Lys Ser Met
Gly Tyr Lys Ala Ala 195 200 205 Ile Val Val Glu Asp Asp Leu Glu Val
Ala Pro Asp Phe Tyr Glu Tyr 210 215 220 Phe Gln Ala Thr Leu Pro Leu
Leu Gln Lys Asp Arg Met Leu Trp Cys225 230 235 240 Val Ser Ala Trp
Asn Asp Asn Gly Lys Glu Ala Leu Ile Asp Pro Gly 245 250 255 Gly Thr
Ser Leu Leu Tyr Arg Ser Asp Phe Phe Pro Gly Leu Gly Trp 260 265 270
Leu Leu Leu Arg Glu Leu Trp Glu Glu Leu Glu Pro Lys Trp Pro Ser 275
280 285 Ala Phe Trp Asp Asp Trp Val Arg Arg Pro Glu Gln Arg Leu Asp
Arg 290 295 300 Ala Cys Val Arg Pro Glu Leu Ser Arg Thr Arg Thr Phe
Gly Arg Lys305 310 315 320 Gly Val Ser Gln Gly Gln Phe Phe Asp Gln
His Leu Arg Phe Ile Lys 325 330 335 Leu Asn Gln Asp Leu Val Pro Phe
Thr Lys Met Asp Leu Ser Tyr Leu 340 345 350 Leu Lys Asp Thr Tyr Asp
Pro Trp Phe Leu Glu Gln Val Tyr Gly Ala 355 360 365 Pro Lys Ala Arg
Ala Glu Glu Val Leu His Gly Gln Val Pro Gly Gly 370 375 380 Arg Thr
Val Arg Val Glu Tyr Thr Thr Lys Asp Thr Phe Lys Ala Met385 390 395
400 Ala Arg Ala Phe Gly Val Met Glu Asp Leu Lys Ser Gly Val Ala Arg
405 410 415 Ala Ala Tyr Lys Gly Val Val Ser Phe Ser His Arg Gly Arg
Arg Val 420 425 430 Phe Leu Ala Pro Pro Lys Asp Trp Thr Gly Tyr Asp
Pro Leu Trp Asn 435 440 445
19 458 PRT Drosophila melanogaster 19 Met
Arg Thr Arg Lys Val Leu Leu Val Ile Gly Phe Leu Val Thr Trp 1 5 10
15 Thr Tyr Ala Thr Tyr Tyr Leu Leu Leu Arg Gln Thr Gly Ile His Thr
20 25 30 Ser Arg His Gln Ser Leu Gln Ala Tyr Lys Leu Asn Ser Gln
Ala Arg 35 40 45 Asp Ala Asn Met Gln Ser His His Leu Ala Lys Asn
Val Phe Glu Phe 50 55 60 Val Lys Leu Lys Tyr Leu Glu Lys Gln Pro
Pro Ser Val Ala Ser Thr65 70 75 80 Pro Gln Ile Ser Ile Ile Ala Ala
Glu Ile Ser Ala Glu Leu Pro Glu 85 90 95 Gln His Val Ala Lys Ser
Ala Thr Ala Arg Ile Pro Thr Lys Thr Tyr 100 105 110 Leu Ala Asn Gly
Glu Pro Val Phe Pro Val Val Val Phe Ala Cys Asn 115 120 125 Arg Val
Ser Val Lys Lys Cys Ile Asp Asn Leu Val Gln Tyr Arg Pro 130 135 140
Ser Val Glu Gln Phe Pro Ile Ile Val Ser Gln Asp Cys Gly Asp Glu145
150 155 160 Pro Thr Lys Glu Ala Ile Leu Ser Tyr Gly Lys Gln Val Thr
Leu Ile 165 170 175 Glu Gln Pro Asp Leu Ser Asp Ile Thr Val Leu Pro
Lys Glu Lys Lys 180 185 190 Phe Lys Gly Tyr Tyr Lys Ile Ala Arg His
Tyr Gly Trp Ala Leu Asn 195 200 205 Thr Thr Phe Ala Val Gly Phe Glu
Phe Val Ile Ile Val Glu Asp Asp 210 215 220 Leu Asn Val Ala Pro Asp
Phe Phe Glu Tyr Phe Leu Gly Thr His Lys225 230 235 240 Leu Leu Lys
Gln Asp Pro Ser Leu Trp Cys Val Ser Ala Trp Asn Asp 245 250 255 Asn
Gly Lys Ala Ala Val Val Asp Ala Ala Gln Pro Glu Leu Leu Tyr 260 265
270 Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Met Leu Thr Lys Asp Leu
275 280 285 Trp Ala Glu Leu Ser Val Lys Trp Pro Lys Ser Phe Trp Asp
Asp Trp 290 295 300 Ile Arg His Pro Ala Gln Arg Lys Asp Arg Val Cys
Ile Arg Pro Glu305 310 315 320 Ile Ser Arg Thr Arg Thr Phe Gly Lys
Ile Gly Val Ser Asn Gly Leu 325 330 335 Phe Phe Asp Lys Tyr Leu Lys
His Ile Lys Leu Ser Glu Asp Phe Val 340 345 350 Gln Phe Thr Lys Ile
Asn Met Ser Tyr Leu Leu Lys Asp Asn Tyr Asp 355 360 365 Asn Thr Phe
Leu Arg Arg Val Tyr Thr Tyr Pro Ile Val Thr Tyr Asp 370 375 380 Glu
Leu Arg Arg Asn Leu Ile Arg Ile Glu Gly Pro Val Arg Ile Gln385 390
395 400 Tyr Thr Thr Arg Glu Gln Tyr Lys Arg Thr Thr Lys Met Leu Gly
Leu 405 410 415 Met Asp Asp Phe Lys Ser Gly Val Pro Arg Thr Ala Tyr
His Gly Ile 420 425 430 Val Ser Phe Tyr Tyr Asn Lys Arg Arg Val His
Leu Ala Pro Asn Ala 435 440 445 Asn Trp Lys Gly Tyr Glu Leu Ser Trp
Ser 450 455
20 447 PRT Homo sapiens 20 Met Arg Phe Arg Ile Tyr Lys Arg
Lys Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe
Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala
Leu Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala
Gly Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55 60
Arg Arg Val Ser Asn Val Ser Ala Ala Ser Leu Val Pro Ala Val Pro65
70 75 80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu
Val Tyr 85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp
Lys Ala Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val Val
Gln Val His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu Asp
Ser Leu Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val Ile
Phe Ser His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln Leu
Ile Ala Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175 Phe
Pro Phe Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185
190 Pro Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu
195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr
Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp Trp
Lys Leu His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu Arg
Asp Tyr Ala Gly Leu Ile Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr
Leu Ala Pro Asp Phe Tyr His Val Phe 260 265 270 Lys Lys Met Trp Lys
Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val 275 280 285 Leu Ser Leu
Gly Thr Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala 290 295 300 Asp
Lys Val Asp Val Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly305 310
315 320 Leu Ala Leu Thr Arg Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr
Asp 325 330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr
Leu Gln Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys
Val Leu Val Pro Gln 355 360 365 Ile Pro Arg Ile Phe His Ala Gly Asp
Cys Gly Met His His Lys Lys 370 375 380 Thr Cys Arg Pro Ser Thr Gln
Ser Ala Gln Ile Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys Gln
Tyr Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys 405 410 415 Phe Thr
Val Val Ala Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly 420 425 430
Asp Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln 435 440
445 21447PRTPan troglodytes 21Met Arg Phe Arg Ile Tyr Lys Arg Lys
Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val
Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Leu
Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala Gly
Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55 60 Arg
Arg Val Ser Asn Val Ser Ala Ala Pro Leu Val Pro Ala Val Pro65 70 75
80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr
85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp Lys Ala
Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val Val Gln Val
His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu Asp Ser Leu
Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val Ile Phe Ser
His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln Leu Ile Ala
Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175 Phe Pro Phe
Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185 190 Pro
Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu 195 200
205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu
210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu
His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu Arg Asp Tyr
Ala Gly Leu Val Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr Leu Ala Pro
Asp Phe Tyr His Val
Phe 260 265 270 Lys Lys Met Trp Lys Leu Lys Gln Gln Glu Cys Pro Glu
Cys Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Ser Ala Ser Arg Ser
Phe Tyr Gly Met Ala 290 295 300 Asp Lys Val Asp Val Lys Thr Trp Lys
Ser Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr Arg Asn
Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe Cys Thr
Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350 Leu Thr
Val Ser Cys Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln 355 360 365
Val Pro Arg Ile Phe His Ala Gly Asp Cys Gly Met His His Lys Lys 370
375 380 Thr Cys Arg Pro Ser Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu
Asn385 390 395 400 Asn Asn Lys Gln Tyr Met Phe Pro Glu Thr Leu Thr
Ile Ser Glu Lys 405 410 415 Phe Thr Val Val Ala Ile Ser Pro Pro Arg
Lys Asn Gly Gly Trp Gly 420 425 430 Asp Ile Arg Asp His Glu Leu Cys
Lys Ser Tyr Arg Arg Leu Gln 435 440 445 22447PRTOryctolagus
cuniculus 22Met Arg Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu Thr
Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser Asn
Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu Leu
Asp Ala Glu Pro Ala Arg 35 40 45 Ala Ala Gly Gly Arg Gly Gly Asp
His Pro Ala Val Ser Val Gly Ile 50 55 60 Arg Arg Val Ser Asn Glu
Ser Ala Ala Pro Leu Val Pro Ala Ala Ala65 70 75 80 Gln Pro Glu Ala
Asp Asn Gln Thr Leu Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln Leu
Asn Phe Asp Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Ser 100 105 110
Trp Ala Pro Arg Glu Leu Val Leu Val Val Gln Val His Asn Arg Leu 115
120 125 Glu Tyr Leu Arg Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly
Ile 130 135 140 Glu Asp Val Leu Val Ile Phe Ser His Asp Phe Trp Ser
Pro Glu Ile145 150 155 160 Asn Gln Leu Ile Ala Gly Val Asp Phe Cys
Pro Ile Leu Gln Val Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu Tyr
Pro Asn Glu Phe Pro Gly Ser Asp 180 185 190 Pro Arg Asp Cys Pro Arg
Asp Leu Gln Lys Asn Ala Ala Leu Lys Leu 195 200 205 Gly Cys Ile Asn
Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala Lys
Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe225 230 235
240 Val Trp Glu Arg Val Lys Val Leu Gln Asp Tyr Ala Gly Leu Ile Leu
245 250 255 Phe Leu Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr His
Val Phe 260 265 270 Lys Lys Met Trp Lys Leu Lys Gln Gln Glu Cys Pro
Glu Cys Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Thr Ala Ser Arg
Ser Phe His Gly Ile Ala 290 295 300 His Lys Val Asp Val Lys Thr Trp
Lys Ser Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr Arg
Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe Cys
Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350 Leu
Thr Val Ser Cys Leu Pro Lys Leu Trp Arg Val Leu Val Pro Gln 355 360
365 Val Pro Arg Val Phe His Ala Gly Asp Cys Gly Met His His Lys Lys
370 375 380 Thr Cys Arg Pro Phe Thr Gln Ser Ala Gln Ile Glu Ser Leu
Leu Asn385 390 395 400 Ser Asn Arg Gln Tyr Met Phe Pro Glu Thr Leu
Ile Ile Ser Glu Lys 405 410 415 Ser Pro Val Val Ser Ile Ala Ser Pro
Arg Lys Asn Gly Gly Trp Gly 420 425 430 Asp Ile Arg Asp His Glu Leu
Cys Lys Ser Tyr Arg Arg Leu Gln 435 440 445 23446PRTAiluropoda
melanoleuca 23Met Arg Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu
Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser
Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu
Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala Gly Gly Arg Gly Gly
Asp His Ser Ala Val Ser Ala Gly Ile 50 55 60 Arg Arg Val Ser Asn
Asp Ser Ala Ala Pro Leu Val Pro Ala Ala Pro65 70 75 80 Gln Pro Glu
Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln
Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Ser 100 105
110 Trp Ala Pro Arg Glu Leu Val Leu Val Val Gln Val His Asn Arg Pro
115 120 125 Asp Tyr Leu Arg Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln
Gly Ile 130 135 140 Asp Asn Val Leu Val Ile Phe Ser His Asp Phe Trp
Ser Thr Glu Ile145 150 155 160 Asn Gln Leu Ile Ala Gly Val Asp Phe
Cys Pro Val Leu Gln Val Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu
Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185 190 Pro Arg Asp Cys Pro
Arg Asp Leu Glu Lys Asn Ala Ala Leu Lys Met 195 200 205 Gly Cys Ile
Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala
Lys Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe225 230
235 240 Val Trp Glu Arg Val Lys Val Leu Arg Asp Tyr Ala Gly Leu Ile
Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr
His Val Phe 260 265 270 Lys Lys Met Trp Lys Leu Lys Gln Glu Glu Cys
Thr Glu Cys Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Thr Ala Val
Arg Ser Phe His Gly Ile Ala 290 295 300 Asp Lys Val Asp Val Lys Thr
Trp Lys Ser Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr
Arg Asp Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe
Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350
Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln 355
360 365 Val Pro Arg Ile Phe His Ala Gly Asp Cys Gly Met His His Lys
Lys 370 375 380 Thr Cys Arg Pro Ser Thr Gln Ser Ala Gln Ile Glu Ser
Leu Leu Asn385 390 395 400 Asn Asn Lys Gln Tyr Leu Phe Pro Glu Thr
Leu Ile Ile Ser Glu Lys 405 410 415 Phe Val Ala Ala Ile Ser Pro Pro
Arg Lys Asn Gly Gly Trp Gly Asp 420 425 430 Ile Arg Asp His Glu Leu
Cys Lys Ser Tyr Arg Arg Leu Gln 435 440 445 24446PRTCanis
familiaris 24Met Arg Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu
Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser
Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu
Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala Gly Gly Arg Gly Gly
Asp His Ser Ala Val Ser Val Gly Ile 50 55 60 Arg Arg Gly Ser Asn
Glu Ser Ala Ala Pro Leu Val Pro Ala Ala Pro65 70 75 80 Gln Pro Glu
Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln
Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Ser 100 105
110 Trp Ala Pro Arg Glu Leu Val Leu Val Val Gln Val His Asn Arg Pro
115 120 125 Asp Tyr Leu Arg Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln
Gly Ile 130 135 140 Asp Asn Val Leu Val Ile Phe Ser His Asp Phe Trp
Ser Thr Glu Ile145 150 155 160 Asn Gln Leu Ile Ala Gly Val Asp Phe
Cys Pro Val Leu Gln Val Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu
Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185 190 Pro Arg Asp Cys Pro
Arg Asp Leu Glu Lys Asn Ala Ala Leu Lys Met 195 200 205 Gly Cys Ile
Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala
Lys Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe225 230
235 240 Val Trp Glu Arg Val Lys Val Leu Arg Asp Tyr Ala Gly Leu Ile
Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr
His Val Phe 260 265 270 Lys Lys Met Trp Lys Leu Lys Gln Glu Glu Cys
Pro Glu Cys Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Thr Ala Ile
Arg Ser Phe His Gly Ile Ala 290 295 300 Asp Lys Val Asp Val Lys Thr
Trp Lys Ser Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr
Arg Asp Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe
Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350
Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln 355
360 365 Val Pro Arg Ile Phe His Ala Gly Asp Cys Gly Met His His Lys
Lys 370 375 380 Thr Cys Lys Pro Ser Thr Gln Ser Ala Gln Ile Glu Ser
Leu Leu Asn385 390 395 400 Ser Asn Lys Gln Tyr Leu Phe Pro Glu Thr
Leu Ile Ile Ser Glu Lys 405 410 415 Phe Val Ala Ala Ile Ser Pro Pro
Arg Lys Asn Gly Gly Trp Gly Asp 420 425 430 Ile Arg Asp His Glu Leu
Cys Lys Ser Tyr Arg Arg Leu Gln 435 440 445 25446PRTSus scrofa
25Met Arg Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu Thr Phe Val 1
5 10 15 Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser Asn Gly Arg Gln
Arg 20 25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu Leu Asp Ala Glu
Pro Val Arg 35 40 45 Gly Ala Gly Ala Arg Ala Gly Asp His Pro Ala
Ile Ser Val Gly Ile 50 55 60 Arg Arg Gly Ser Asn Asp Ser Ala Ala
Pro Leu Val Ala Ala Ala Pro65 70 75 80 Gln Pro Glu Val Asp Asn Leu
Thr Leu Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln Leu Asn Phe Asp
Gln Thr Leu Arg Asn Val Asp Lys Val Ser Ser 100 105 110 Trp Val Pro
Arg Glu Leu Val Leu Val Val Gln Val His Asn Arg Ala 115 120 125 Glu
Tyr Leu Lys Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly Ile 130 135
140 Asp Asn Val Leu Val Ile Phe Ser His Asp Phe Trp Ser Thr Glu
Ile145 150 155 160 Asn Gln Leu Ile Ala Gly Val Asp Phe Cys Pro Val
Leu Gln Val Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu Tyr Pro Asn
Glu Phe Pro Gly Thr Asp 180 185 190 Pro Arg Asp Cys Pro Arg Asp Leu
Glu Lys Asn Ala Ala Leu Lys Met 195 200 205 Gly Cys Ile Asn Ala Glu
Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala Lys Phe Ser
Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe225 230 235 240 Val
Trp Glu Arg Val Lys Val Leu Arg Asp Tyr Ala Gly Leu Ile Leu 245 250
255 Phe Leu Glu Glu Asp His Tyr Val Ala Pro Asp Phe Tyr His Val Phe
260 265 270 Lys Lys Met Trp Asn Leu Lys Gln Gln Glu Cys Pro Glu Cys
Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Thr Thr Val Arg Ser Phe
Arg Asp Val Ala 290 295 300 Asp Lys Val Asp Val Lys Thr Trp Lys Ser
Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr Arg Asp Ala
Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe Cys Thr Tyr
Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350 Leu Thr Val
Ser Cys Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln 355 360 365 Val
Pro Arg Ile Phe His Ala Gly Asp Cys Gly Met His His Lys Lys 370 375
380 Thr Cys Arg Pro Ser Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu
Asn385 390 395 400 Ser Asn Lys Gln Tyr Met Phe Pro Glu Thr Leu Thr
Ile Ser Glu Lys 405 410 415 Leu Thr Ala Ala Leu Ser Pro Pro Arg Lys
Asn Gly Gly Trp Gly Asp 420 425 430 Ile Arg Asp His Glu Leu Cys Lys
Ser Tyr Arg Arg Leu Gln 435 440 445 26446PRTBos taurus 26Met Arg
Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu Met Leu Val 1 5 10 15
Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20
25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu Leu Asp Ala Asp Pro Val
Arg 35 40 45 Gly Ala Gly Ala Arg Ala Gly Asp His Pro Ala Val Ser
Val Gly Ile 50 55 60 Arg Arg Gly Ser Asn Glu Ser Ala Ala Pro Leu
Val Ala Ala Ala Pro65 70 75 80 Gln Pro Glu Val Asp Asn Leu Thr Leu
Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln Leu Asn Phe Asp Gln Thr
Leu Arg Asn Val Asp Lys Ala Ala Ser 100 105 110 Trp Thr Pro Arg Glu
Leu Ala Leu Val Val Gln Val His Asn Arg Pro 115 120 125 Glu Tyr Leu
Lys Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly Ile 130 135 140 Asp
Asp Val Leu Val Ile Phe Ser His Asp Phe Trp Ser Thr Glu Ile145 150
155 160 Asn Gln Leu Ile Ala Gly Val Asp Phe Cys Pro Val Leu Gln Val
Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro
Gly Thr Asp 180 185 190 Pro Arg Asp Cys Pro Arg Asp Met Glu Lys Asn
Ala Ala Leu Arg Met 195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp
Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr Lys
His His Trp Trp Trp Lys Leu His Phe225 230 235 240 Val Trp Glu Arg
Val Lys Val Leu Arg Asp Tyr Ala Gly Leu Ile Leu 245 250 255 Phe Leu
Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr His Val Phe 260 265 270
Lys Lys Met Trp Lys Leu Lys Gln Leu Glu Cys Pro Glu Cys Asp Val 275
280 285 Leu Ser Leu Gly Thr Tyr Thr Ala Ile Arg Asn Phe Tyr Asp Val
Ala 290 295 300 Asp Lys Val Asp Val Lys Thr Trp Lys Ser Thr Glu His
Asn Met Gly305 310 315 320 Leu Ala Leu Thr Arg Glu Ala Tyr Gln Lys Leu Ile
Glu Cys Thr
Asp 325 330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr
Leu Gln Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys
Val Leu Val Pro Gln 355 360 365 Val Pro Arg Ile Phe His Ala Gly Asp
Cys Gly Met His His Gln Lys 370 375 380 Thr Cys Arg Pro Ala Thr Gln
Ser Ala Gln Leu Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys Gln
Tyr Leu Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys 405 410 415 Phe Met
Thr Ser Leu Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly Asp 420 425 430
Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln 435 440 445
27442PRTRattus norvegicus 27Met Arg Phe Arg Ile Tyr Lys Arg Lys Val
Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val Leu
Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Asp Ala Leu Ala
Pro Pro Leu Leu Asp Ser Glu Pro Leu Arg 35 40 45 Gly Ala Gly His
Phe Ala Ala Ser Val Gly Ile Arg Arg Val Ser Asn 50 55 60 Asp Ser
Ala Ala Pro Leu Val Pro Ala Val Pro Arg Pro Glu Val Asp65 70 75 80
Asn Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr Gln Leu Asn Phe Asp 85
90 95 Gln Met Leu Arg Asn Val Asp Lys Asp Gly Thr Trp Ser Pro Gly
Glu 100 105 110 Leu Val Leu Val Val Gln Val His Asn Arg Pro Glu Tyr
Leu Arg Leu 115 120 125 Leu Ile Asp Ser Leu Arg Lys Ala Gln Gly Ile
Arg Glu Val Leu Val 130 135 140 Ile Phe Ser His Asp Phe Trp Ser Ala
Glu Ile Asn Ser Leu Ile Ser145 150 155 160 Ser Val Asp Phe Cys Pro
Val Leu Gln Val Phe Phe Pro Phe Ser Ile 165 170 175 Gln Leu Tyr Pro
Ser Glu Phe Pro Gly Ser Asp Pro Arg Asp Cys Pro 180 185 190 Arg Asp
Leu Lys Lys Asn Ala Ala Leu Lys Leu Gly Cys Ile Asn Ala 195 200 205
Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu Ala Lys Phe Ser Gln 210
215 220 Thr Lys His His Trp Trp Trp Lys Leu His Phe Val Trp Glu Arg
Val225 230 235 240 Lys Val Leu Gln Asp Tyr Thr Gly Leu Ile Leu Phe
Leu Glu Glu Asp 245 250 255 His Tyr Leu Ala Pro Asp Phe Tyr His Val
Phe Lys Lys Met Trp Lys 260 265 270 Leu Lys Gln Gln Glu Cys Pro Gly
Cys Asp Val Leu Ser Leu Gly Thr 275 280 285 Tyr Thr Thr Ile Arg Ser
Phe Tyr Gly Ile Ala Asp Lys Val Asp Val 290 295 300 Lys Thr Trp Lys
Ser Thr Glu His Asn Met Gly Leu Ala Leu Thr Arg305 310 315 320 Asp
Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp Thr Phe Cys Thr Tyr 325 330
335 Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr Leu Thr Leu Ala Cys
340 345 350 Leu Pro Lys Val Trp Lys Val Leu Val Pro Gln Ala Pro Arg
Ile Phe 355 360 365 His Ala Gly Asp Cys Gly Met His His Lys Lys Thr
Cys Arg Pro Ser 370 375 380 Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu
Asn Asn Asn Lys Gln Tyr385 390 395 400 Leu Phe Pro Glu Thr Leu Val
Ile Gly Glu Lys Phe Pro Met Ala Ala 405 410 415 Ile Ser Pro Pro Arg
Lys Asn Gly Gly Trp Gly Asp Ile Arg Asp His 420 425 430 Glu Leu Cys
Lys Ser Tyr Arg Arg Leu Gln 435 440 28442PRTMus musculus 28Met Arg
Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu Thr Leu Val 1 5 10 15
Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20
25 30 Lys Ser Asp Ala Leu Gly Pro Pro Leu Leu Asp Ala Glu Pro Val
Arg 35 40 45 Gly Ala Gly His Leu Ala Val Ser Val Gly Ile Arg Arg
Val Ser Asn 50 55 60 Glu Ser Ala Ala Pro Leu Val Pro Ala Val Pro
Arg Pro Glu Val Asp65 70 75 80 Asn Leu Thr Leu Arg Tyr Arg Ser Leu
Val Tyr Gln Leu Asn Phe Asp 85 90 95 Gln Met Leu Arg Asn Val Gly
Asn Asp Gly Thr Trp Ser Pro Gly Glu 100 105 110 Leu Val Leu Val Val
Gln Val His Asn Arg Pro Glu Tyr Leu Arg Leu 115 120 125 Leu Ile Asp
Ser Leu Arg Lys Ala Gln Gly Ile Gln Glu Val Leu Val 130 135 140 Ile
Phe Ser His Asp Phe Trp Ser Ala Glu Ile Asn Ser Leu Ile Ser145 150
155 160 Arg Val Asp Phe Cys Pro Val Leu Gln Val Phe Phe Pro Phe Ser
Ile 165 170 175 Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp Pro Arg
Asp Cys Pro 180 185 190 Arg Asp Leu Lys Lys Asn Ala Ala Leu Lys Leu
Gly Cys Ile Asn Ala 195 200 205 Glu Tyr Pro Asp Ser Phe Gly His Tyr
Arg Glu Ala Lys Phe Ser Gln 210 215 220 Thr Lys His His Trp Trp Trp
Lys Leu His Phe Val Trp Glu Arg Val225 230 235 240 Lys Val Leu Gln
Asp Tyr Thr Gly Leu Ile Leu Phe Leu Glu Glu Asp 245 250 255 His Tyr
Leu Ala Pro Asp Phe Tyr His Val Phe Lys Lys Met Trp Lys 260 265 270
Leu Lys Gln Gln Glu Cys Pro Gly Cys Asp Val Leu Ser Leu Gly Thr 275
280 285 Tyr Thr Thr Ile Arg Ser Phe Tyr Gly Ile Ala Asp Lys Val Asp
Val 290 295 300 Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly Leu Ala
Leu Thr Arg305 310 315 320 Asp Ala Tyr Gln Lys Leu Ile Glu Cys Thr
Asp Thr Phe Cys Thr Tyr 325 330 335 Asp Asp Tyr Asn Trp Asp Trp Thr
Leu Gln Tyr Leu Thr Leu Ala Cys 340 345 350 Leu Pro Lys Ile Trp Lys
Val Leu Val Pro Gln Ala Pro Arg Ile Phe 355 360 365 His Ala Gly Asp
Cys Gly Met His His Lys Lys Thr Cys Arg Pro Ser 370 375 380 Thr Gln
Ser Ala Gln Ile Glu Ser Leu Leu Asn Ser Asn Lys Gln Tyr385 390 395
400 Leu Phe Pro Glu Thr Leu Val Ile Gly Glu Lys Phe Pro Met Ala Ala
405 410 415 Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly Asp Ile Arg
Asp His 420 425 430 Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln 435 440
29444PRTMonodelphis domestica 29Met Arg Phe Arg Ile Tyr Lys Arg Lys
Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val
Phe Trp Asn Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Phe
Ala Gly Ser Val Pro Ala Pro Val Arg Ala Val 35 40 45 Gly Pro Gly
Asp Leu Arg Arg Phe Pro Asn Gly Ser Ala Ala Pro Pro 50 55 60 Pro
Glu Val Asp Asn Met Thr Leu Val Tyr Arg Ser Leu Val Tyr Gln65 70 75
80 Val Asn Phe Asp Gln Thr Leu Lys Asn Ala Leu Ala Ala Ala Ala Val
85 90 95 Gly Ala Gly Gly Ala Gly Gly Gly Gly Gly Gly Pro Ala Gln
Leu Glu 100 105 110 Leu Glu Leu Val Leu Val Val Gln Val His Asn Arg
Pro Asp Tyr Leu 115 120 125 Lys Leu Leu Leu Asp Ser Leu Arg Lys Val
Gln Gly Ile Gly Asn Leu 130 135 140 Leu Val Ile Phe Ser His Asp Phe
Trp Ser Ala Glu Ile Asn Gln Leu145 150 155 160 Ile Ala Gly Val Asp
Phe Cys Pro Val Leu Gln Val Phe Phe Pro Phe 165 170 175 Ser Ile Gln
Leu Tyr Pro Asn Glu Phe Pro Gly Asn Asp Pro Lys Asp 180 185 190 Cys
Pro Arg Asp Leu Gln Lys Lys Ala Ala Leu Lys Met Gly Cys Ile 195 200
205 Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu Ala Lys Phe
210 215 220 Ser Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe Ala
Trp Glu225 230 235 240 Arg Val Lys Ile Leu Arg Asn Tyr Ala Gly Leu
Met Val Phe Leu Glu 245 250 255 Glu Asp His Tyr Leu Ala Pro Asp Phe
Phe His Val Leu Lys Lys Met 260 265 270 Trp Lys Leu Lys Leu Gln Glu
Cys Pro Asp Cys Asp Val Leu Ser Leu 275 280 285 Gly Ser Tyr Ala Val
Ser Arg Ser Phe Phe Gly Lys Ala Asp Lys Val 290 295 300 Glu Val Lys
Thr Trp Lys Ser Thr Glu His Asn Met Gly Leu Ala Leu305 310 315 320
Thr Arg Asp Thr Tyr Gln Lys Leu Ile Glu Cys Thr Asp Thr Phe Cys 325
330 335 Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr Leu Thr
Thr 340 345 350 Thr Cys Leu Lys Asn Phe Trp Lys Val Met Val Pro Glu
Val Pro Arg 355 360 365 Ile Tyr His Ala Gly Asp Cys Gly Met His His
Lys Asp Pro Cys Arg 370 375 380 Pro Ser Thr Gln Ser Ala Gln Ile Glu
Leu Leu Leu Asn Lys Asn Lys385 390 395 400 Gln Tyr Leu Phe Pro Lys
Thr Leu Ser Ile Ser Lys Lys Tyr Ser Met 405 410 415 Val Pro Leu Leu
Pro His Gly Lys Asn Gly Gly Trp Gly Asp Ile Arg 420 425 430 Asp His
Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln 435 440 30432PRTXenopus
laevis 30Met Arg Cys Arg Ile Tyr Lys Arg Lys Val Ile Ile Leu Thr
Leu Val 1 5 10 15 Val Val Ala Cys Gly Leu Ala Leu Trp Ser Ser Gly
Arg Gln Lys Lys 20 25 30 Asn Gly Phe Val Pro Glu Val Glu Ser Asp
Arg Phe Gln Asn Lys Gly 35 40 45 His Ile Ser Pro Ala Ala Arg Lys
Val Ser Asn Glu Ser Leu Ala Asn 50 55 60 Lys Glu Gln Lys Thr Arg
Val Asp Asn Met Thr Leu Val Tyr Arg Ser65 70 75 80 Val Val Phe Gln
Trp Asn Phe Asp Gln Ala Ile Arg Asn Val Asp Lys 85 90 95 Ile Asn
Arg Pro Gln Asp Asp Val Val Val Val Val Gln Val His Asn 100 105 110
Arg Pro Glu Phe Leu Arg Arg Leu Leu Asp Ser Leu Gly Lys Ala Lys 115
120 125 Gly Ile Glu Asn Val Leu Leu Val Phe Ser His Asp Tyr Trp Ser
Pro 130 135 140 Glu Ile Asn Gln Ile Ile Ala Ser Val Asp Phe Cys Gln
Val Leu Gln145 150 155 160 Ile Phe Phe Pro Phe Ser Ile Gln Leu Tyr
Pro Asn Glu Phe Pro Gly 165 170 175 His Asp Pro Lys Asp Cys Pro Arg
Asp Ile Lys Lys Lys Asp Ala Val 180 185 190 Glu Leu Gly Cys Ile Asn
Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr 195 200 205 Arg Glu Ala Lys
Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu 210 215 220 Gln Phe
Val Trp Asp Lys Leu Lys Val Leu Lys Glu His Asn Gly Leu225 230 235
240 Val Leu Phe Ile Glu Glu Asp His Tyr Leu Ser Pro Asp Phe Tyr Tyr
245 250 255 Thr Leu Lys Lys Met Trp Ser Lys Lys Asn Glu Glu Cys Pro
Asp Cys 260 265 270 Asp Met Leu Cys Leu Gly Thr Tyr Ala His Thr Pro
Phe Ala Asp Lys 275 280 285 Ala Gly Lys Val Glu Val Lys Thr Trp Lys
Ser Thr Glu His Asn Met 290 295 300 Gly Met Ala Met Asn Arg Glu Thr
Tyr Lys Lys Leu Val Ala Cys Ser305 310 315 320 Glu Thr Phe Cys Thr
Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln 325 330 335 Tyr Leu Thr
Val Asn Cys Leu Pro Lys Phe Trp Lys Val Met Val Pro 340 345 350 Glu
Val Pro Arg Ile Tyr His Ile Gly Asp Cys Gly Met His His Asn 355 360
365 Lys Pro Cys Arg Pro Thr Thr Glu Ser Ala Lys Leu Glu Ala Leu Phe
370 375 380 Thr Ser Asn Gln Arg Asp Leu Phe Pro Glu Lys Ile Asp Ile
Ser Arg385 390 395 400 Arg Tyr Thr Met Ala Ala Leu Ser Pro His Val
Lys Asn Gly Gly Trp 405 410 415 Gly Asp Ile Arg Asp His Glu Leu Cys
Lys Ser Tyr His Arg Leu Gln 420 425 430 31450PRTDanio rerio 31Met
Arg Phe Arg Ile Tyr Lys Arg Lys Val Val Ile Leu Thr Leu Val 1 5 10
15 Val Ile Ile Cys Gly Phe Ala Val Trp Asn Ser Gly Lys Pro Lys Lys
20 25 30 Ala Ser Thr Val Phe Pro Lys Glu Val Glu Thr Val Lys Arg
Ser Ser 35 40 45 Val Gly Ser Gln Ile Gln Ala Thr Ile Pro Val Thr
Arg Lys Pro Ile 50 55 60 Asn Glu Ser Ile Pro Glu Lys Gln Gln Gln
Gln Gln Pro Val Ala Lys65 70 75 80 Ser Glu Ala Asp Asn Thr Thr Leu
Val Tyr Arg Gly Ile Val Phe Gln 85 90 95 Leu Asn Phe Asp Gln Asn
Leu Lys Asn Glu Glu Lys Phe Arg Ala Val 100 105 110 Arg Gln Lys Asp
Asp Leu Val Ile Val Val Gln Val His Asn Arg Pro 115 120 125 Glu Tyr
Leu Arg Leu Leu Val Asp Ser Leu Arg Lys Ser Lys Gly Ile 130 135 140
Glu Asn Ile Leu Leu Ile Phe Ser His Asp Phe Trp Ser Pro Glu Ile145
150 155 160 Asn Gln Ile Val Ala Ser Val Asp Phe Cys Leu Val Leu Gln
Ile Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu Tyr Pro Gln Glu Phe
Pro Gly Asn Asp 180 185 190 Pro Arg Asp Cys Pro Arg Asp Ile Pro Lys
Lys Glu Ala Leu Thr Leu 195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro
Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr
Lys His His Trp Trp Trp Lys Leu His Phe225 230 235 240 Val Trp Asp
Arg Val Arg Val Leu Lys Asp His Lys Gly Leu Val Leu 245 250 255 Leu
Ile Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr His Leu Leu 260 265
270 Lys Leu Met Ala Ser Leu Lys Lys Glu Gln Cys Pro Asp Cys Asp Ile
275 280 285 Leu Ser Leu Gly Ser Tyr Gly His Ile Gly Tyr Ser Ser Lys
Ala Asn 290 295 300 Lys Val Glu Val Lys Ala Trp Lys Ser Thr Glu His
Asn Met Gly Met305 310 315 320 Ala Leu Asn Arg Asp Ala Tyr Gln Lys
Leu Leu Arg Cys Thr Asp Ala 325 330 335 Phe Cys Thr Tyr Asp Asp Tyr
Asn Trp Asp Trp Ser Leu Gln His Leu 340 345 350 Thr Val Thr Cys Leu
Pro Ala Phe Leu Lys Val Met Val Ser Glu Ala 355 360 365 Pro Arg Ile
Phe His Ala Gly Asp Cys Gly Met His His Lys Lys Ser 370 375 380 Ala
Cys Met Pro Ser Gly Gln Lys Thr Lys Ile Glu Asn Val Leu Gln385 390
395 400 Asn Ser Gly Asn Gln Leu Phe Pro Lys Gln Leu Leu Ile Thr Lys
Arg 405 410 415 Leu Pro Ala Ser Gly Ala Lys Gly Val Ala Pro His Val Lys
Asn Gly 420 425 430 Gly Trp Gly Asp Ile Arg Asp His Glu Leu Cys Lys
Ser Tyr Leu Arg 435 440 445 Leu Gln 450 32465PRTSalmo salar 32Met
Arg Phe Arg Val Tyr Lys Arg Lys Val Val Ile Leu Thr Leu Val 1 5 10
15 Val Val Val Cys Gly Leu Ala Phe Trp Thr Ser Gly Lys Gln Lys Lys
20 25 30 Ser Ser Gly Val Val Val Leu Lys Glu Ala Glu Gly Ala Arg
Arg Ser 35 40 45 Ser Ser Ser Gln Val Gln Pro Gln Pro Gln Ala Thr
Pro Glu Val Ser 50 55 60 Arg Ile Pro Asn Val Pro Pro Ile Ala Pro
Val Asn Glu Thr His Pro65 70 75 80 Lys Asn Gln Pro Glu Lys His Leu
Glu Lys Glu Glu Val Val Lys Pro 85 90 95 Glu Val Asp Asn Thr Thr
Gln Val Tyr Arg Gly Ile Val Phe Gln Leu 100 105 110 Asn Phe Asp Gln
Thr Val Arg His Glu Glu Lys Phe Arg Ala Ala Arg 115 120 125 Lys Lys
Asp Asp Leu Val Val Val Val Gln Val His Asn Arg Pro Asp 130 135 140
Tyr Leu Arg Leu Leu Val Glu Ser Leu Arg Lys Ala Arg Gly Val Glu145
150 155 160 Ser Ile Leu Leu Ile Phe Ser His Asp Phe Trp Ser Pro Glu
Ile Asn 165 170 175 Gln Val Val Ala Ser Val Asp Phe Cys Gln Val Leu
Gln Ile Phe Phe 180 185 190 Pro Phe Ser Ile Gln Leu Tyr Pro Gln Glu
Phe Pro Gly His Asp Pro 195 200 205 Arg Asp Cys Pro Arg Asp Ile Ser
Lys Ile Asp Ala Leu Lys Leu Gly 210 215 220 Cys Ile Asn Ala Glu Tyr
Pro Asp Ser Phe Gly His Tyr Arg Glu Ala225 230 235 240 Lys Phe Ser
Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe Val 245 250 255 Trp
Asp Arg Val Arg Ala Leu Lys Asp His Arg Gly Leu Val Leu Leu 260 265
270 Ile Glu Glu Asp His Phe Leu Ser Pro Asp Phe Leu His Phe Leu Lys
275 280 285 Leu Met Ser Ile Leu Lys Arg Glu Asn Cys Pro Asp Cys Asp
Ile Leu 290 295 300 Ser Leu Gly Ser Tyr Gly His Ile Ser Tyr Pro Ser
Lys Ala Asn Lys305 310 315 320 Val Glu Val Lys Ala Trp Lys Ser Thr
Glu His Asn Met Gly Met Ala 325 330 335 Leu Ser Arg Glu Thr Tyr Gln
Lys Leu Ile Gln Cys Thr Asp Ala Phe 340 345 350 Cys Thr Tyr Asp Asp
Tyr Asn Trp Asp Trp Ser Leu Gln His Leu Thr 355 360 365 Val Thr Cys
Leu Pro Ser Tyr Trp Lys Val Met Val Ser Glu Ala Pro 370 375 380 Arg
Val Phe His Ala Gly Asp Cys Gly Met His His Lys Lys Ser Val385 390
395 400 Cys Met Pro Ser Ser Gln Lys Ser Lys Ile Asp Thr Ile Leu Gln
Ser 405 410 415 Ser Ser Asn Gln Leu Phe Pro Lys Asn Leu Leu Ile Thr
Lys Arg Leu 420 425 430 Pro Ala Asn Gly Ala Gly Gly Val Ala Pro His
Val Lys Asn Gly Gly 435 440 445 Trp Gly Asp Ile Arg Asp His Glu Leu
Cys Lys Ser Tyr Pro Arg Leu 450 455 460 Gln465 33471PRTDrosophila
melanogaster 33Met Gly Arg Lys Arg Asn Asn Phe Tyr Met Arg Ser Leu
Phe Leu Leu 1 5 10 15 Ala Leu Gly Ile Phe Gly Leu Leu Gln Tyr Asn
Asn Phe Asn Tyr Leu 20 25 30 Asp Ser Arg Asp Asn Val Leu Gly Asp
Ala Val Thr Asn Asp Ser Asp 35 40 45 Asp Ala Ile Leu Ala Met Val
Pro Ala Thr Leu His Lys Tyr Leu Thr 50 55 60 Pro His Ser Arg Asn
His Ser Ala Ser Gly Ala Gly Ala Leu Asn Gly65 70 75 80 Ala Ala Leu
Leu Leu Asn Ala Ser Ser Pro Gly Ala Ala Thr Ala Ser 85 90 95 Thr
Ile Ser Phe Asp Val Tyr His Pro Pro Asn Ile Thr Glu Ile Lys 100 105
110 Arg Gln Ile Val Arg Tyr Asn Asp Met Gln Met Val Leu Asn Glu Asp
115 120 125 Val Phe Gly Pro Leu Gln Asn Asp Ser Val Ile Ile Val Val
Gln Val 130 135 140 His Thr Arg Ile Thr Tyr Leu Arg His Leu Ile Val
Ser Leu Ala Gln145 150 155 160 Ala Arg Asp Ile Ser Lys Val Leu Leu
Val Phe Ser His Asp Tyr Tyr 165 170 175 Asp Asp Asp Ile Asn Asp Leu
Val Gln Gln Ile Asp Phe Cys Lys Val 180 185 190 Met Gln Ile Phe Tyr
Pro Tyr Ser Ile Gln Thr His Pro Asn Glu Tyr 195 200 205 Pro Gly Val
Asp Pro Asn Asp Cys Pro Arg Asn Ile Lys Lys Glu Gln 210 215 220 Ala
Leu Ile Thr Asn Cys Asn Asn Ala Met Tyr Pro Asp Leu Tyr Gly225 230
235 240 His Tyr Arg Glu Ala Lys Phe Thr Gln Thr Lys His His Trp Ile
Trp 245 250 255 Lys Ala Asn Arg Val Phe Asn Glu Leu Glu Val Thr Arg
Tyr His Thr 260 265 270 Gly Leu Val Leu Phe Leu Glu Glu Asp His Tyr
Val Ala Glu Asp Phe 275 280 285 Leu Tyr Leu Leu Ala Met Met Gln Gln
Arg Thr Lys Asp Leu Cys Pro 290 295 300 Gln Cys Asn Val Leu Ser Leu
Gly Thr Tyr Leu Lys Thr Phe Asn Tyr305 310 315 320 Tyr Thr Tyr His
Ser Lys Val Glu Val Met Pro Trp Val Ser Ser Lys 325 330 335 His Asn
Met Gly Phe Ala Phe Asn Arg Thr Thr Trp Ser Asn Ile Arg 340 345 350
Lys Cys Ala Arg His Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp 355
360 365 Ser Leu Gln His Val Ser Gln Gln Cys Leu Arg Arg Lys Leu His
Ala 370 375 380 Met Ile Val Lys Gly Pro Arg Val Phe His Ile Gly Glu
Cys Gly Val385 390 395 400 His His Lys Asn Lys Asn Cys Glu Ser Asn
Gln Val Ile Ser Lys Val 405 410 415 Gln His Val Leu Arg Ile Ala Arg
Asn Ser His Gln Leu Phe Pro Arg 420 425 430 Ser Leu Thr Leu Thr Val
Pro Ser Leu Met Lys Lys Ser Lys Leu Arg 435 440 445 Lys Gly Asn Gly
Gly Trp Gly Asp Met Arg Asp His Glu Leu Cys Leu 450 455 460 Asn Met
Thr Leu Ala Thr Arg465 470 3454PRTHomo sapiens 34Thr Arg Pro Ala
Pro Gly Arg Pro Pro Ser Val Ser Ala Leu Asp Gly 1 5 10 15 Asp Pro
Ala Ser Leu Thr Arg Glu Val Ile Arg Leu Ala Gln Asp Ala 20 25 30
Glu Val Glu Leu Glu Arg Gln Arg Gly Leu Leu Gln Gln Ile Gly Asp 35
40 45 Ala Leu Ser Ser Gln Arg 50 3515PRTArtificial
SequenceSynthesized Construct 35Gly Gly Gly Gly Ser Gly Gly Gly Gly
Ser Gly Gly Gly Gly Ser1 5 10 15 3610PRTArtificial
SequenceSynthesized Construct 36Gly Gly Gly Gly Ser Gly Gly Gly Gly
Ser1 5 10 3721PRTTrichoderma reesei 37Ser Ser Ala Ala Thr Ala Thr
Ala Ser Ala Thr Val Pro Gly Gly Gly1 5 10 15 Ser Gly Pro Thr Ser 20
3819PRTTrichoderma reesei 38Ser Thr Gly Asn Pro Ser Gly Gly Asn Pro
Pro Gly Gly Asn Pro Pro1 5 10 15 Gly Ser Thr 3910PRTTrichoderma
reesei 39Met Ala Ser Thr Asn Ala Arg Tyr Val Arg1 5 10
4017PRTTrichoderma reesei 40Tyr Leu Leu Ile Ala Phe Phe Thr Ile Leu
Val Phe Tyr Phe Val Ser1 5 10 15 Asn 41379PRTTrichoderma reesei
41Ser Lys Tyr Glu Gly Val Asp Leu Asn Lys Gly Thr Phe Thr Ala Pro 1
5 10 15 Asp Ser Thr Lys Thr Thr Pro Lys Pro Pro Ala Thr Gly Asp Ala
Lys 20 25 30 Asp Phe Pro Leu Ala Leu Thr Pro Asn Asp Pro Gly Phe
Asn Asp Leu 35 40 45 Val Gly Ile Ala Pro Gly Pro Arg Met Asn Ala
Thr Phe Val Thr Leu 50 55 60 Ala Arg Asn Ser Asp Val Trp Asp Ile
Ala Arg Ser Ile Arg Gln Val65 70 75 80 Glu Asp Arg Phe Asn Arg Arg
Tyr Asn Tyr Asp Trp Val Phe Leu Asn 85 90 95 Asp Lys Pro Phe Asp
Asn Thr Phe Lys Lys Val Thr Thr Ser Leu Val 100 105 110 Ser Gly Lys
Thr His Tyr Gly Glu Ile Ala Pro Glu His Trp Ser Phe 115 120 125 Pro
Asp Trp Ile Asp Gln Asp Lys Ala Lys Lys Val Arg Glu Asp Met 130 135
140 Ala Glu Arg Lys Ile Ile Tyr Gly Asp Ser Val Ser Tyr Arg His
Met145 150 155 160 Cys Arg Phe Glu Ser Gly Phe Phe Phe Arg Gln Pro
Leu Met Met Asn 165 170 175 Tyr Glu Tyr Tyr Trp Arg Val Glu Pro Ser
Ile Glu Leu Tyr Cys Asp 180 185 190 Ile His Tyr Asp Pro Phe Arg Leu
Met Val Glu Gln Gly Lys Lys Tyr 195 200 205 Ser Phe Val Ile Ser Leu
Tyr Glu Tyr Pro Ala Thr Ile Ala Thr Leu 210 215 220 Trp Glu Ser Thr
Lys Lys Phe Met Lys Asn His Pro Glu His Ile Ala225 230 235 240 Pro
Asp Asn Ser Met Arg Phe Leu Ser Asp Asp Gly Gly Glu Thr Tyr 245 250
255 Asn Asn Cys His Phe Trp Ser Asn Phe Glu Ile Gly Ser Leu Glu Trp
260 265 270 Leu Arg Ser Lys Gln Tyr Ile Asp Phe Phe Glu Ser Leu Asp
Lys Asp 275 280 285 Gly Gly Phe Phe Tyr Glu Arg Trp Gly Asp Ala Pro
Val His Ser Ile 290 295 300 Ala Ala Gly Leu Met Leu Asn Arg Ser Glu
Ile His Phe Phe Asn Asp305 310 315 320 Ile Ala Tyr Trp His Val Pro
Phe Thr His Cys Pro Thr Gly Glu Lys 325 330 335 Thr Arg Leu Asp Leu
Lys Cys His Cys Asp Pro Lys Glu Asn Phe Asp 340 345 350 Trp Lys Gly
Tyr Ser Cys Thr Ser Arg Phe Phe Glu Met Asn Gly Met 355 360 365 Asp
Lys Pro Glu Gly Trp Glu Asn Gln Gln Asp 370 375 428PRTTrichoderma
reesei 42Met Ala Ile Ala Arg Pro Val Arg1 5 4315PRTTrichoderma
reesei 43Ala Leu Gly Gly Leu Ala Ala Ile Leu Trp Cys Phe Phe Leu
Tyr1 5 10 15 44371PRTTrichoderma reesei 44Gln Leu Leu Arg Pro Ser
Ser Ser Tyr Asn Ser Pro Gly Asp Arg Tyr 1 5 10 15 Ile Asn Phe Glu
Arg Asp Pro Asn Leu Asp Pro Thr Gly Glu Pro Glu 20 25 30 Gly Ile
Leu Val Arg Thr Ser Asp Arg Tyr Ala Pro Asp Ala Lys Asp 35 40 45
Thr Asp Arg Ala Ser Ala Thr Leu Leu Ala Leu Val Arg Asn Glu Glu 50
55 60 Val Asp Asp Met Val Ala Ser Met Val Asp Leu Glu Arg Thr Trp
Asn65 70 75 80 Ser Lys Phe Asn Tyr Pro Trp Thr Phe Phe Asn Asp Lys
Pro Phe Ser 85 90 95 Glu Glu Phe Lys Lys Lys Thr Ser Ala Val Thr
Asn Ala Thr Cys Asn 100 105 110 Tyr Glu Leu Ile Pro Lys Glu His Trp
Asp Ala Pro Ser Trp Ile Asp 115 120 125 Pro Ala Ile Phe Glu Glu Ser
Ala Ala Val Leu Lys Lys Asn Gly Val 130 135 140 Gln Tyr Ala Asn Met
Met Ser Tyr His Gln Met Cys Arg Trp Asn Ser145 150 155 160 Gly Met
Phe Tyr Lys His Pro Ala Leu Lys Asp Val Arg Tyr Tyr Trp 165 170 175
Arg Val Glu Pro Lys Val His Phe Phe Cys Asp Val Asp Tyr Asp Val 180
185 190 Phe Arg Tyr Met Gln Asp Asn Asn Lys Thr Tyr Gly Phe Thr Ile
Asn 195 200 205 Leu Tyr Asp Asp Pro His Thr Leu Pro Thr Leu Trp Pro
Gln Thr Ala 210 215 220 Lys Phe Leu Ala Asp His Pro Asn Tyr Leu His
Glu His Ser Ala Ile225 230 235 240 Lys Trp Val Ile Asp Asp Ala Arg
Arg Pro Gln His Asn Arg Glu Ala 245 250 255 Gln Gly Phe Ser Thr Cys
His Phe Trp Ser Asn Phe Glu Val Ala Asp 260 265 270 Met Glu Phe Trp
Arg Ser Lys Val Tyr Glu Asp Tyr Phe Glu His Leu 275 280 285 Asp Arg
Ala Gly Gly Phe Phe Tyr Glu Arg Trp Gly Asp Ala Pro Val 290 295 300
His Ser Ile Ala Leu Gly Leu Phe Glu Asp Ser Ser Lys Ile His Trp305
310 315 320 Phe Arg Asp Ile Gly Tyr Gln His Ile Pro Phe Phe Asn Cys
Pro Asn 325 330 335 Ser Pro Lys Cys Lys Gly Cys Val Thr Gly Arg Leu
Thr Asp Gly Glu 340 345 350 Pro Phe Leu His Arg Glu Asp Cys Arg Pro
Asn Trp Phe Lys Tyr Ala 355 360 365 Gly Met Gly 370
456PRTTrichoderma reesei 45Met Leu Asn Pro Arg Arg1 5
4615PRTTrichoderma reesei 46Ala Leu Ile Ala Ala Ala Phe Ile Leu Thr
Val Phe Phe Leu Ile1 5 10 15 47352PRTTrichoderma reesei 47Ser Arg
Ser His Asn Ser Glu Ser Ala Ser Thr Ser Glu Pro Lys Asp 1 5 10 15
Ala Glu Ala Glu Ala Leu Ser Ala Ala Asn Ala Gln Gln Arg Ala Ala 20
25 30 Pro Pro Pro Pro Pro Gln Lys Pro Met Ile Asp Met Ser Gly Met
Ser 35 40 45 Thr Tyr Asp Lys Leu Ala Tyr Ala Tyr Glu Tyr Asp Ile
Glu Ser Lys 50 55 60 Phe Pro Ala Tyr Ile Trp Gln Thr Trp Arg Lys
Thr Pro Ser Glu Gly65 70 75 80 Asp Phe Glu Phe Arg Glu Gln Glu Ala
Ser Trp Ser Ile Glu His Pro 85 90 95 Gly Phe Ile His Glu Val Ile
Thr Asp Ser Val Ala Asp Thr Leu Leu 100 105 110 Gln Leu Leu Tyr Gly
Ser Ile Pro Glu Val Leu Glu Ala Tyr His Ala 115 120 125 Leu Pro Leu
Pro Val Leu Lys Ala Asp Leu Phe Arg Tyr Leu Ile Leu 130 135 140 Tyr
Ala Arg Gly Gly Ile Tyr Ser Asp Ile Asp Thr Tyr Ala Ile Arg145 150
155 160 Ser Ala Leu Glu Trp Ile Pro Pro Gln Ile Pro Lys Glu Thr Val
Gly 165 170 175 Leu Val Ile Gly Ile Glu Ala Asp Pro Asp Arg Pro Asp
Trp Ala Asp 180 185 190 Trp Tyr Ser Arg Arg Ile Gln Phe Cys Gln Trp
Thr Ile Gln Ser Lys 195 200 205 Pro Gly His Pro Val Leu Arg Asp Ile
Ile Ser Arg Ile Thr Asn Gln 210 215 220 Thr Leu Glu Met Lys Lys Ser
Gly Lys Leu Ser Ala Phe Gln Gly Asn225 230 235 240 Arg Val Val Asp
Leu Thr Gly Pro Ala Val Trp Thr Asp Thr Ile Met 245 250 255 Asp Tyr
Phe Asn Asp Glu Arg Tyr Phe Asp Met Glu Asn Ser Lys Gly 260 265 270
Arg Ile Asp Tyr Arg Asn Phe Thr Gly Met Glu Thr Ser Lys Arg Val 275
280 285 Gly Asp Val Val Val Leu Pro Ile Thr Ser Phe Ser Pro Gly Val
Gly 290 295 300 Gln Met Gly Ala Lys Asp Tyr Asp Asp Pro Met Ala Phe
Val Lys His305 310 315 320 Asp Phe Glu Gly Thr Trp Lys Pro Glu Ser
Glu Arg His Ile Gly Glu 325 330
335 Ile Val Gln Glu Leu Gly Glu Gly Gln Gly Glu Ala Pro Lys Glu Gln
340 345 350 4822PRTTrichoderma reesei 48Met Gly Met Gly Gln Cys Gln
Trp Ser Pro Phe Arg Asn Lys Val Pro1 5 10 15 Thr Gln Met Arg Arg
Cys 20 4915PRTTrichoderma reesei 49Leu Pro Leu Tyr Ile Thr Val Val
Cys Val Phe Leu Val Ile Val1 5 10 15 50362PRTTrichoderma reesei
50Asn Phe Asp Trp Ile Leu Ala Ile Pro Asn Pro Ala Ser Val Leu Arg 1
5 10 15 Arg Glu Pro Lys Ala Pro Pro Leu Pro Gly Ser Thr Phe Pro Gln
Lys 20 25 30 Ile Trp Gln Thr Trp Lys Val Asp Pro Leu Asn Phe Asp
Glu Arg Asp 35 40 45 Leu Val Thr Ala Arg Thr Trp Thr Thr Ile Asn
Pro Gly Met Arg Tyr 50 55 60 Glu Val Val Thr Asp Ala Asn Glu Met
Ala Tyr Ile Glu Asp Arg Tyr65 70 75 80 Gly Pro Asn Gly Phe Asp Arg
Pro Asp Ile Val Glu Phe Tyr Lys Met 85 90 95 Ile Asn Leu Pro Ile
Ile Lys Ala Asp Leu Leu Arg Tyr Met Ile Met 100 105 110 Tyr Ala Glu
Gly Gly Ile Tyr Ala Asp Ile Asp Val Glu Thr Met Lys 115 120 125 Pro
Phe His Arg Phe Ile Pro Asp Arg Tyr Asp Glu Lys Asp Ile Asp 130 135
140 Ile Ile Ile Gly Val Glu Ile Asp Gln Pro Asp Phe Lys Asp His
Pro145 150 155 160 Ile Leu Gly Lys Lys Ser Met Ser Phe Cys Gln Trp
Thr Phe Val Ala 165 170 175 Arg Pro Gln Gln Pro Val Met Met Arg Leu
Ile Glu Asn Ile Met Lys 180 185 190 Trp Phe Lys Thr Val Ala Arg Asp
Gln Gly Val Pro Leu Gly Glu Val 195 200 205 Gln Leu Asp Phe Asp Gln
Val Ile Ser Gly Thr Gly Pro Ser Ala Phe 210 215 220 Thr Lys Ala Met
Leu Glu Glu Met Asn Arg Lys Thr Lys Gly Pro Lys225 230 235 240 Val
Thr Trp Asp Ala Phe His Asn Leu Asp Glu Ser Lys Leu Val Gly 245 250
255 Gly Val Leu Val Leu Thr Val Glu Ala Phe Cys Ala Gly Gln Gly His
260 265 270 Ser Asp Ser Gly Asn His Asn Ala Arg Asn Ala Leu Val Lys
His His 275 280 285 Phe His Ala Ser Asn Trp Pro Ser Arg His Pro Arg
Tyr Lys His Pro 290 295 300 Ala Tyr Gly Gln Val Glu Asp Cys Asn Trp
Val Pro Glu Cys Val Arg305 310 315 320 Lys Trp Asp Glu Asp Thr Ser
Asn Trp Asp Lys Tyr Ser Glu Asn Glu 325 330 335 Gln Lys Lys Ile Leu
Gln Asp Ile Glu Asn Ala Arg Leu Glu Arg Glu 340 345 350 Arg Gln Gln
Gln Ala Leu Ala Ala Leu Pro 355 360 5117PRTTrichoderma reesei 51Met
Ala Arg Pro Met Gly Ser Val Arg Leu Lys Lys Ala Asn Pro Ser1 5 10
15 Thr5216PRTTrichoderma reesei 52Leu Ile Leu Gly Ala Val Leu Cys
Ile Phe Ile Ile Ile Phe Leu Val1 5 10 15 53339PRTTrichoderma reesei
53Ser Pro Ser Ser Pro Ala Ser Ala Ser Arg Leu Ser Ile Val Ser Ala 1
5 10 15 Gln His His Leu Ser Pro Pro Thr Ser Pro Tyr Gln Ser Pro Arg
Ser 20 25 30 Gly Ala Val Gln Gly Pro Pro Pro Val Thr Arg Tyr Asn
Leu Asn Lys 35 40 45 Val Thr Val Thr Ser Asp Pro Val Arg Asn Gln
Glu His Ile Leu Ile 50 55 60 Leu Thr Pro Met Ala Arg Phe Tyr Gln
Glu Tyr Trp Asp Asn Leu Leu65 70 75 80 Arg Leu Asn Tyr Pro His Glu
Leu Ile Thr Leu Gly Phe Ile Leu Pro 85 90 95 Lys Thr Lys Glu Gly
Asn Gln Ala Thr Ser Met Leu Gln Lys Gln Ile 100 105 110 Gln Lys Thr
Gln Asn Tyr Gly Pro Glu Lys Asp Arg Phe Lys Ser Ile 115 120 125 Ile
Ile Leu Arg Gln Asp Phe Asp Pro Ala Val Val Ser Gln Asp Glu 130 135
140 Ser Glu Arg His Lys Leu Ala Asn Gln Lys Ala Arg Arg Glu Val
Met145 150 155 160 Ala Lys Ala Arg Asn Ser Leu Leu Phe Thr Thr Leu
Gly Pro Ser Thr 165 170 175 Ser Trp Val Leu Trp Leu Asp Ala Asp Ile
Thr Glu Thr Ala Pro Thr 180 185 190 Leu Ile Gln Asp Leu Ala Ser His
Asp Lys Pro Ile Ile Val Ala Asn 195 200 205 Cys Phe Gln Lys Tyr Tyr
Asp Pro Glu Ser Lys Lys Met Ala Glu Arg 210 215 220 Pro Tyr Asp Phe
Asn Ser Trp Gln Asp Ser Glu Thr Ala Leu Lys Met225 230 235 240 Ala
Glu Gln Met Gly Pro Asp Asp Ile Leu Leu Glu Gly Tyr Ala Glu 245 250
255 Met Ala Thr Tyr Arg Thr Leu Leu Ala Tyr Met Ser Thr Pro Gly Gly
260 265 270 Ser Lys Asp Leu Val Val Pro Leu Asp Gly Val Gly Gly Thr
Ala Leu 275 280 285 Leu Val Lys Ala Asp Val His Arg Asp Gly Ala Met
Phe Pro Pro Phe 290 295 300 Ala Phe Tyr His Leu Ile Glu Ser Glu Gly
Phe Ala Lys Met Ala Lys305 310 315 320 Arg Leu Gly Trp Gln Pro Tyr
Gly Leu Pro Asn Tyr Lys Val Tyr His 325 330 335 Tyr Asn
Glu5431PRTTrichoderma reesei 54Met Leu Leu Pro Lys Gly Gly Leu Asp
Trp Arg Ser Ala Arg Ala Gln1 5 10 15 Ile Pro Pro Thr Arg Ala Leu
Trp Asn Ala Val Thr Arg Thr Arg 20 25 30 5515PRTTrichoderma reesei
55Phe Ile Leu Leu Val Gly Ile Thr Gly Leu Ile Leu Leu Leu Trp 1 5
10 15 56358PRTTrichoderma reesei 56Arg Gly Val Ser Thr Ser Ala Ser
Glu Met Gln Ser Phe Tyr Cys Trp 1 5 10 15 Gly Pro Ala Lys Pro Pro
Met Glu Met Ser Pro Asn Glu His Asn Arg 20 25 30 Trp Asn Gly His
Leu Gln Thr Pro Val Ile Phe Asn His His Ala Pro 35 40 45 Val Glu
Val Asn Ser Ser Thr Ile Glu His Val Asp Leu Asn Pro Ile 50 55 60
Asn Ser Thr Lys Gln Ala Val Thr Lys Glu Glu Arg Ile Leu Ile Leu65
70 75 80 Thr Pro Leu Lys Asp Ala Ala Pro Tyr Leu Ser Lys Tyr Phe
Glu Leu 85 90 95 Leu Ala Glu Leu Thr Tyr Pro His Arg Leu Ile Asp
Leu Ala Phe Leu 100 105 110 Val Ser Asp Ser Thr Asp Asp Thr Leu Ala
Val Leu Ala Ser Glu Leu 115 120 125 Asp Arg Ile Gln Lys Arg Pro Asp
Gln Ile Pro Phe His Ser Ala Thr 130 135 140 Val Ile Glu Lys Asp Phe
Gly Phe Lys Leu Ser Gln Asn Val Glu Glu145 150 155 160 Arg His Ser
Phe Glu Ala Gln Gly Pro Arg Arg Lys Ala Met Gly Arg 165 170 175 Ala
Arg Asn Tyr Leu Leu Tyr Thr Ala Leu Lys Pro Glu His Ser Trp 180 185
190 Val Tyr Trp Arg Asp Val Asp Ile Val Asp Ser Pro Thr Gly Ile Leu
195 200 205 Glu Asp Phe Ile Ala His Asp Arg Asp Ile Leu Val Pro Asn
Ile Trp 210 215 220 Phe His Arg Tyr Arg Asp Gly Val Asp Ile Glu Gly
Arg Phe Asp Tyr225 230 235 240 Asn Ser Trp Val Glu Ser Asp Lys Gly
Arg Lys Leu Ala Asn Ser Leu 245 250 255 Asp Lys Asp Val Val Leu Ala
Glu Gly Tyr Lys Gln Tyr Asp Thr Gly 260 265 270 Arg Thr Tyr Met Ala
Lys Met Gly Asp Trp Arg Glu Asn Lys Asp Val 275 280 285 Glu Leu Glu
Leu Asp Gly Ile Gly Gly Val Asn Ile Leu Val Lys Ala 290 295 300 Asp
Val His Arg Ser Gly Ile Asn Phe Pro Cys Tyr Ala Phe Glu Asn305 310
315 320 Gln Ala Glu Thr Glu Gly Phe Ala Lys Met Ala Lys Arg Ala Gly
Tyr 325 330 335 Glu Val Tyr Gly Leu Pro Asn Tyr Val Val Trp His Ile
Asp Thr Glu 340 345 350 Glu Lys Gly Gly Asn Ala 355
5745PRTTrichoderma reesei 57Met Met Pro Arg His His Ser Ser Gly Phe
Ser Asn Gly Tyr Pro Arg 1 5 10 15 Ala Asp Thr Phe Glu Ile Ser Pro
His Arg Phe Gln Pro Arg Ala Thr 20 25 30 Leu Pro Pro His Arg Lys
Arg Lys Arg Thr Ala Ile Arg 35 40 45 5816PRTTrichoderma reesei
58Val Gly Ile Ala Val Val Val Ile Leu Val Leu Val Leu Trp Phe Gly1
5 10 15 59407PRTTrichoderma reesei 59Gln Pro Arg Ser Val Ala Ser
Leu Ile Ser Leu Gly Ile Leu Ser Gly 1 5 10 15 Tyr Asp Asp Leu Lys
Leu Glu Thr Val Arg Tyr Tyr Asp Leu Ser Asn 20 25 30 Val Gln Gly
Thr Ala Arg Gly Trp Glu Arg Glu Glu Arg Ile Leu Leu 35 40 45 Cys
Val Pro Leu Arg Asp Ala Glu Gln His Leu Pro Met Phe Phe Ser 50 55
60 His Leu Lys Asn Phe Thr Tyr Pro His Asn Leu Ile Asp Leu Ala
Phe65 70 75 80 Leu Val Ser Asp Ser Lys Asp His Thr Leu Glu Ser Leu
Thr Glu His 85 90 95 Leu Glu Ala Ile Gln Ala Asp Pro Asp Pro Lys
Gln Pro Tyr Gly Glu 100 105 110 Ile Ser Ile Ile Glu Lys Asp Phe Gly
Gln Lys Val Asn Gln Asp Val 115 120 125 Glu Ser Arg His Gly Phe Ala
Ala Gln Ala Ser Arg Arg Lys Leu Met 130 135 140 Ala Gln Ala Arg Asn
Trp Leu Leu Ser Ala Ala Leu Arg Pro Tyr His145 150 155 160 Ser Trp
Val Tyr Trp Arg Asp Val Asp Val Glu Thr Ala Pro Phe Thr 165 170 175
Ile Leu Glu Asp Leu Met Arg His Asn Lys Asp Val Ile Val Pro Asn 180
185 190 Val Trp Arg Pro Leu Pro Asp Trp Leu Gly Gly Glu Gln Pro Tyr
Asp 195 200 205 Leu Asn Ser Trp Gln Glu Ser Glu Thr Ala Leu Ala Leu
Ala Asp Thr 210 215 220 Leu Asp Glu Asp Ala Val Ile Val Glu Gly Tyr
Ala Glu Tyr Ala Thr225 230 235 240 Trp Arg Pro His Leu Ala Tyr Leu
Arg Asp Pro Tyr Gly Asp Pro Asp 245 250 255 Met Glu Met Glu Ile Asp
Gly Val Gly Gly Val Ser Ile Leu Ala Lys 260 265 270 Ala Lys Val Phe
Arg Ala Gly Val His Phe Pro Ala Phe Ser Phe Glu 275 280 285 Lys His
Ala Glu Thr Glu Gly Phe Gly Lys Met Ala Lys Arg Met His 290 295 300
Phe Ser Val Val Gly Leu Pro His Tyr Thr Ile Trp His Leu Tyr Glu305
310 315 320 Pro Ser Val Asp Asp Ile Lys His Met Glu Glu Met Glu Arg
Glu Arg 325 330 335 Ile Ala Arg Glu Lys Glu Glu Glu Glu Arg Lys Lys
Lys Glu Ala Gln 340 345 350 Ile Lys Glu Glu Phe Gly Asp Ala Asn Ser
Gln Trp Glu Gln Asp Lys 355 360 365 Gln Gln Met Gln Asp Leu Lys Leu
Gln Asp Arg Gly Gly Asp Lys Glu 370 375 380 Ala Ala Ala Ala Gly Val
Asn Gln Gly Ala Ala Ala Lys Ala Ala Gly385 390 395 400 Ala Met Glu
Gly Gln Lys Asn 405 60119PRTTrichoderma reesei 60Met Ser Leu Ser
Arg Ser Pro Ser Pro Val Pro Gly Gly Gly Trp Ser 1 5 10 15 Ser Pro
Gly Leu Asn Ile Asn Ser Gly Arg Ser Ser Pro Ser Asn Ala 20 25 30
Ala Gly Ser Ser Val Ser Trp Glu Ser Ala Lys Met Arg Lys Gln Gly 35
40 45 Ala Asn Gly Tyr Pro Ser Phe Ser Thr Gln Asn Gln Gly Phe Phe
Thr 50 55 60 Arg His Met Arg Arg Ile Ser Ser Ser Leu Pro Arg Phe
Ala Ala Gly65 70 75 80 Pro Gly Asn Thr Tyr Ala Glu Arg Glu Lys Tyr
Glu Arg Gly Gly His 85 90 95 Ser Pro His Ala Gly Gly Gly Arg Leu
Arg Ala Phe Leu Ala Arg Ile 100 105 110 Gly Arg Arg Leu Lys Trp Arg
115 6115PRTTrichoderma reesei 61Ile Leu Leu Pro Leu Ile Ile Ile Cys
Thr Ile Val Ala Tyr Tyr 1 5 10 15 62325PRTTrichoderma reesei 62Gly
Thr His Glu Ala Pro Gly Phe Val His Trp Trp Arg Arg Ile Ser 1 5 10
15 Met Gly Gly Gly Gly Glu Lys Phe Val Ile Ile Leu Gly Ala Asn Val
20 25 30 Gly Gly Gly Val Met Glu Trp Lys Gly Ala Arg Glu Trp Ala
Ile Glu 35 40 45 Arg Asp Ser Val Arg Asn Lys Arg Lys Tyr Ala Thr
Arg Trp Gly Tyr 50 55 60 Asp Leu Glu Ile Val Asp Met Lys Thr Lys
Lys Arg Tyr Ala His Glu65 70 75 80 Trp Arg Glu Ser Trp Glu Lys Val
Asp Phe Ile Arg Ala Ala Met Arg 85 90 95 Lys Tyr Pro Lys Ala Glu
Trp Phe Trp Trp Leu Asp Leu Asn Thr Tyr 100 105 110 Val Met Glu Pro
Ser Tyr Ser Leu Gln Arg His Leu Phe Asn His Leu 115 120 125 Asp Arg
His Val Tyr Arg Asp Ile Asn Val Phe Asn Pro Leu Asn Ile 130 135 140
Thr His Pro Pro Thr Glu Glu Tyr Leu Asp Ala Glu Ala Arg Ser Pro145
150 155 160 Val Gly Asp Gly Asn Ile Asn Ser Val Asn Leu Met Leu Thr
Gln Asp 165 170 175 Cys Ser Gly Phe Asn Leu Gly Ser Phe Phe Ile Arg
Arg Ser Ala Trp 180 185 190 Thr Glu Gln Leu Leu Asp Ile Trp Trp Asp
Pro Val Leu Tyr Glu Gln 195 200 205 Lys His Met Glu Trp Glu His Lys
Glu Gln Asp Ala Leu Glu Gln Leu 210 215 220 Tyr Arg Thr Gln Pro Trp
Ile Arg Gln His Thr Gly Phe Leu Pro Gln225 230 235 240 Arg Leu Ile
Asn Ser Phe Pro Pro Ala Ala Cys Ala Asp Glu Ser Gly 245 250 255 Leu
Asn Asn Thr Arg Ile His Tyr Asn Glu Lys Asp Arg Asp Phe Val 260 265
270 Val Asn Met Ala Gly Cys Glu Trp Gly Arg Asp Cys Trp Gly Glu Met
275 280 285 Tyr His Tyr Arg Glu Phe Ser Tyr Trp Leu Asn Arg Asn Pro
Trp Glu 290 295 300 Leu Phe Lys Glu Glu Ile Val Ala Val Ile Trp Tyr
Lys Leu Thr Gly305 310 315 320 Gln Arg Val Lys Leu 325
6333PRTTrichoderma reesei 63Met His Phe Ala Tyr Pro Ser Arg Lys Ser
Ser Asn Pro Pro Pro Phe 1 5 10 15 Arg Pro Arg Ser Thr Arg Leu Pro
Gly Leu Arg Arg Ser Arg Ile Lys 20 25 30 Thr6415PRTTrichoderma
reesei 64Ile Gly Ile Val Leu Phe Leu Val Leu Ala Thr Leu Trp Phe
Phe 1 5 10 15 65262PRTTrichoderma reesei 65Ser Asn Pro Arg Val Pro
Arg Pro Asp Pro Glu Arg Val Pro Ser Gly 1 5 10 15 Arg Pro Pro Val
Val Leu Val Thr Val Ile Asp Pro Thr Gln Tyr Pro 20 25 30 Asn Ala
Tyr Leu Lys Thr Ile Lys Glu Asn Arg Glu Gln Tyr Ala Ala 35 40 45
Lys His Gly Tyr Glu Ala Phe Ile Val Lys Ala Tyr Asp Tyr Asp Thr 50
55 60 Gln Gly Ala Pro Gln Ser Trp Ser Lys Leu Met Ala Met Arg His
Ala65 70 75
80 Leu Thr Lys Phe Pro Glu Cys Arg Phe Val Trp Tyr Leu Asp Gln Asp
85 90 95 Ala Tyr Ile Met Asp Met Ser Lys Ser Leu Glu Glu Gln Leu
Leu Asn 100 105 110 Arg Gln Lys Leu Glu Ser Leu Met Ile Lys Asn Tyr
Pro Val Val Pro 115 120 125 Pro Asp Ser Ile Ile Lys Thr Phe Ser His
Leu Arg Pro Asp Glu Val 130 135 140 Asp Leu Ile Val Ser Gln Asp Ser
Ser Gly Leu Val Ala Gly Ser Val145 150 155 160 Val Val Arg Asn Ser
Gln Trp Ser Lys Phe Leu Leu Glu Thr Trp Met 165 170 175 Asp Pro Leu
Tyr Arg Ser Tyr Asn Phe Gln Lys Ala Glu Arg His Ala 180 185 190 Leu
Glu His Ile Val Gln Trp His Pro Thr Ile Leu Ser Lys Leu Ala 195 200
205 Leu Val Pro Gln Arg Thr Leu Gly Pro Tyr Thr Arg Thr Asp Gln Gly
210 215 220 Asp Ala Tyr Gln Asp Gly Asp Phe Val Val Met Phe Thr Gly
Cys Thr225 230 235 240 Lys Ser Gly Glu Gln Ser Cys Glu Thr Val Ser
Ala Ser Tyr Tyr Gln 245 250 255 Lys Trp Ser Ser Ser Leu 260
6690PRTTrichoderma reesei 66Met Ile Arg Asp Pro Phe Gly Ile His Ser
Lys Asn Ala Phe Lys Ala 1 5 10 15 Thr Ala Leu Arg Ala Ala Arg Asp
Ile Lys Glu Ala Ala Thr Gln Ala 20 25 30 Gly Ala Asn Ala Leu Glu
Met Ser Phe Ser Leu Pro Lys His Val Pro 35 40 45 Asp Phe Gly Asp
Pro Ser Arg Ala Leu Glu Asp Arg Ala Trp Ala Ala 50 55 60 Leu Leu
Pro Met Tyr Lys Asp Lys Pro Tyr Ala Tyr Ala Pro Ser Met65 70 75 80
Arg Leu Arg Pro Trp Trp Arg Arg Arg Lys 85 90 6719PRTTrichoderma
reesei 67Val Leu Gly Met Ile Ala Ala Ala Val Met Phe Val Leu Tyr
Val Thr 1 5 10 15 Gly Phe Phe68565PRTTrichoderma reesei 68Ser Ser
Gly Gln Thr Glu Glu Ala Lys Lys Lys Ala Ser Gly Ser Ala 1 5 10 15
Phe Ser Trp Leu Gly Leu Ser Gln Glu Arg Gly Gly Val Asp Trp Asp 20
25 30 Glu Arg Arg Lys Ser Val Val Glu Ala Phe Glu Val Trp Asp Ala
Tyr 35 40 45 Glu Arg Tyr Ala Trp Gly Lys Asp Glu Phe His Pro Ile
Ser Lys Asn 50 55 60 Gly Arg Asn Met Ala Pro Lys Gly Leu Gly Trp
Ile Ile Ile Asp Ser65 70 75 80 Leu Asp Thr Met Met Leu Met Asn Gln
Thr Thr Arg Leu Gln His Ala 85 90 95 Arg Glu Trp Ile Ser Thr Ser
Leu Thr Trp Asp Gln Asp Gln Asp Val 100 105 110 Asn Thr Phe Glu Thr
Thr Ile Arg Met Leu Gly Gly Leu Leu Ser Ala 115 120 125 His Tyr Leu
Ser Thr Glu Phe Pro Glu Leu Ala Pro Leu Thr Glu Asp 130 135 140 Asp
Glu Gly Ala Pro Gly Glu Asp Leu Tyr Leu Glu Lys Ala Lys Asp145 150
155 160 Leu Ala Asp Arg Leu Leu Ser Ala Phe Glu Ser Glu Ser Gly Ile
Pro 165 170 175 Tyr Ala Ser Val Asn Ile Gly Glu Tyr Lys Gly Pro Ser
His Ser Asp 180 185 190 Asn Gly Ala Ser Ser Thr Ala Glu Ala Thr Thr
Leu Gln Leu Glu Phe 195 200 205 Lys Tyr Leu Ala Lys Leu Thr Gly Glu
Lys Asn Phe Trp Asp Lys Val 210 215 220 Glu Lys Val Met Glu Val Val
Asp Asp Asn Gln Pro Glu Asp Gly Leu225 230 235 240 Val Pro Ile Tyr
Ile Tyr Ala Thr Thr Gly Glu Phe Arg Gly Gln Asn 245 250 255 Ile Arg
Leu Gly Ser Arg Gly Asp Ser Tyr Tyr Glu Tyr Leu Ile Lys 260 265 270
Gln Tyr Leu Gln Thr Asn Lys Gln Glu Pro Ile Tyr Glu Glu Met Trp 275
280 285 Asp Glu Ala Leu Ala Gly Val Arg Lys His Leu Val Thr Tyr Thr
Glu 290 295 300 Pro Ser Glu Phe Thr Ile Ile Ala Glu Arg Pro Asp Gly
Leu Glu His305 310 315 320 Pro Met Ser Pro Lys Met Asp His Leu Val
Cys Phe Met Pro Gly Thr 325 330 335 Ile Ala Leu Ala Ala Thr Gly Gly
Leu Thr Glu Ala Glu Ala Arg Lys 340 345 350 Leu Ser Thr Trp Asn Lys
Lys Lys Asp Asp Asp Met Gln Leu Ala Arg 355 360 365 Glu Leu Met His
Thr Cys Trp Gly Met Tyr Lys Tyr Met Lys Thr Gly 370 375 380 Leu Ala
Pro Glu Ile Met Tyr Phe Asn Ile Pro Asn Pro Pro Pro Glu385 390 395
400 Ser Ser Ala Pro His Gln Ala Pro Ala Ala Phe Asp Glu Asp Pro His
405 410 415 Ala Glu Trp Arg Lys Asp Phe Val Val His Ser Asn Asp Val
His Asn 420 425 430 Leu Gln Arg Pro Glu Thr Val Glu Ser Leu Phe Tyr
Met Trp Arg Ile 435 440 445 Thr Gly Asp Val Lys Tyr Arg Glu Trp Gly
Trp Asp Met Phe Lys Ser 450 455 460 Phe Val Asn Tyr Thr Ala Val Glu
Asp Gln Gly Gly Phe Thr Ser Leu465 470 475 480 Leu Asp Ala Asn Ser
Ile Pro Pro Thr Pro Lys Asp Asn Met Glu Ser 485 490 495 Phe Trp Leu
Ala Glu Thr Leu Lys Tyr Met Tyr Leu Leu Phe Ser Pro 500 505 510 Asn
Asp Val Leu Pro Leu His Lys Ile Val Leu Asn Thr Glu Ala His 515 520
525 Pro Phe Pro Arg Phe Asp Met Gly Pro Leu Phe Ser Thr Gly Trp Lys
530 535 540 Arg Lys Pro Arg Asp Gly Ser Ala Lys Lys Lys Ala Thr Thr
Ala Ala545 550 555 560 Thr Thr Asp Ala Glu 565 697PRTTrichoderma
reesei 69Met Ala Arg Arg Arg Tyr Arg1 5 7015PRTTrichoderma reesei
70Leu Phe Met Ile Cys Ala Ala Val Ile Leu Phe Leu Leu Tyr Arg 1 5
10 15 71871PRTTrichoderma reesei 71Val Ser Gln Asn Thr Trp Asp Asp
Ser Ala His Tyr Ala Thr Leu Arg 1 5 10 15 His Pro Pro Ala Ser Asn
Pro Pro Ala Ala Gly Gly Glu Ser Pro Leu 20 25 30 Lys Pro Ala Ala
Lys Pro Glu His Glu His Glu His Glu Asn Gly Tyr 35 40 45 Ala Pro
Glu Ser Lys Pro Lys Pro Gln Ser Glu Pro Lys Pro Glu Ser 50 55 60
Lys Pro Ala Pro Glu His Ala Ala Gly Gly Gln Lys Ser Gln Gly Lys65
70 75 80 Pro Ser Tyr Glu Asp Asp Glu Glu Thr Gly Lys Asn Pro Pro
Lys Ser 85 90 95 Ala Val Ile Pro Ser Asp Thr Arg Leu Pro Pro Asp
Asn Lys Val His 100 105 110 Trp Arg Pro Val Lys Glu His Phe Pro Val
Pro Ser Glu Ser Val Ile 115 120 125 Ser Leu Pro Thr Gly Lys Pro Leu
Lys Val Pro Arg Val Gln His Glu 130 135 140 Phe Gly Val Glu Ser Pro
Glu Ala Lys Ser Arg Arg Val Ala Arg Gln145 150 155 160 Glu Arg Val
Gly Lys Glu Ile Glu Arg Ala Trp Ser Gly Tyr Lys Lys 165 170 175 Phe
Ala Trp Met His Asp Glu Leu Ser Pro Val Ser Ala Lys His Arg 180 185
190 Asp Pro Phe Cys Gly Trp Ala Ala Thr Leu Val Asp Ser Leu Asp Thr
195 200 205 Leu Trp Ile Ala Gly Leu Lys Glu Gln Phe Asp Glu Ala Ala
Arg Ala 210 215 220 Val Glu Gln Ile Asp Phe Thr Thr Thr Pro Arg Asn
Asn Ile Pro Val225 230 235 240 Phe Glu Thr Thr Ile Arg Tyr Leu Gly
Gly Leu Leu Gly Ala Phe Asp 245 250 255 Val Ser Gly Gly His Asp Gly
Gly Tyr Pro Met Leu Leu Thr Lys Ala 260 265 270 Val Glu Leu Ala Glu
Ile Leu Met Gly Ile Phe Asp Thr Pro Asn Arg 275 280 285 Met Pro Ile
Leu Tyr Tyr Gln Trp Gln Pro Glu Tyr Ala Ser Gln Pro 290 295 300 His
Arg Ala Gly Ser Val Gly Ile Ala Glu Leu Gly Thr Leu Ser Met305 310
315 320 Glu Phe Thr Arg Leu Ala Gln Leu Thr Ser Gln Tyr Lys Tyr Tyr
Asp 325 330 335 Ala Val Asp Arg Ile Thr Asp Ala Leu Ile Glu Leu Gln
Lys Gln Gly 340 345 350 Thr Ser Ile Pro Gly Leu Phe Pro Glu Asn Leu
Asp Ala Ser Gly Cys 355 360 365 Asn His Thr Ala Thr Ala Leu Arg Ser
Ser Leu Ser Glu Ala Ala Gln 370 375 380 Lys Gln Met Asp Glu Asp Leu
Ser Asn Lys Pro Glu Asn Tyr Arg Pro385 390 395 400 Gly Lys Asn Ser
Lys Ala Asp Pro Gln Thr Val Glu Lys Gln Pro Ala 405 410 415 Lys Lys
Gln Asn Glu Pro Val Glu Lys Ala Lys Gln Val Pro Thr Gln 420 425 430
Gln Thr Ala Lys Arg Gly Lys Pro Pro Phe Gly Ala Asn Gly Phe Thr 435
440 445 Ala Asn Trp Asp Cys Val Pro Gln Gly Leu Val Val Gly Gly Tyr
Gly 450 455 460 Phe Gln Gln Tyr His Met Gly Gly Gly Gln Asp Ser Ala
Tyr Glu Tyr465 470 475 480 Phe Pro Lys Glu Tyr Leu Leu Leu Gly Gly
Leu Glu Ser Lys Tyr Gln 485 490 495 Lys Leu Tyr Val Asp Ala Val Glu
Ala Ile Asn Glu Trp Leu Leu Tyr 500 505 510 Arg Pro Met Thr Asp Gly
Asp Trp Asp Ile Leu Phe Pro Ala Lys Val 515 520 525 Ser Thr Ala Gly
Asn Pro Ser Gln Asp Leu Val Ala Thr Phe Glu Val 530 535 540 Thr His
Leu Thr Cys Phe Ile Gly Gly Met Tyr Gly Leu Gly Gly Lys545 550 555
560 Ile Phe Gly Arg Glu Lys Asp Leu Glu Thr Ala Lys Arg Leu Thr Asp
565 570 575 Gly Cys Val Trp Ala Tyr Gln Ser Thr Val Ser Gly Ile Met
Pro Glu 580 585 590 Gly Ser Gln Val Leu Ala Cys Pro Thr Leu Glu Lys
Cys Asp Phe Asn 595 600 605 Glu Thr Leu Trp Trp Glu Lys Leu Asp Pro
Ala Lys Asp Trp Arg Asp 610 615 620 Lys Gln Val Ala Asp Asp Lys Asp
Lys Ala Thr Val Gly Glu Ala Leu625 630 635 640 Lys Glu Thr Ala Asn
Ser His Asp Ala Ala Gly Gly Ser Lys Ala Val 645 650 655 His Lys Arg
Ala Ala Val Pro Leu Pro Lys Pro Gly Ala Asp Asp Asp 660 665 670 Val
Gly Ser Glu Leu Pro Gln Ser Leu Lys Asp Lys Ile Gly Phe Lys 675 680
685 Asn Gly Glu Gln Lys Lys Pro Thr Gly Ser Ser Val Gly Ile Gln Arg
690 695 700 Asp Pro Asp Ala Pro Val Asp Ser Val Leu Glu Ala His Arg
Leu Pro705 710 715 720 Pro Gln Glu Pro Glu Glu Gln Gln Val Ile Leu
Pro Asp Lys Pro Gln 725 730 735 Thr His Glu Glu Phe Val Lys Gln Arg
Ile Ala Glu Met Gly Phe Ala 740 745 750 Pro Gly Val Val His Ile Gln
Ser Arg Gln Tyr Ile Leu Arg Pro Glu 755 760 765 Ala Ile Glu Ser Val
Trp Tyr Met Tyr Arg Ile Thr Gly Asp Pro Ile 770 775 780 Trp Met Glu
Lys Gly Trp Lys Met Phe Glu Ala Thr Ile Arg Ala Thr785 790 795 800
Arg Thr Glu Ile Ala Asn Ser Ala Ile Asp Asp Val Asn Ser Glu Glu 805
810 815 Pro Gly Leu Lys Asp Glu Met Glu Ser Phe Trp Leu Ala Glu Thr
Leu 820 825 830 Lys Tyr Tyr Tyr Leu Leu Phe Ser Glu Pro Ser Val Ile
Ser Leu Asp 835 840 845 Glu Trp Val Leu Asn Thr Glu Ala His Pro Phe
Lys Arg Pro Gly Gly 850 855 860 Ser Val Ile Gly His Ser Ile865 870
7213PRTTrichoderma reesei 72Met Leu Asn Gln Leu Gln Gly Arg Val Pro
Arg Arg Tyr 1 5 10 7315PRTTrichoderma reesei 73Ile Ala Leu Val Ala
Phe Ala Phe Phe Val Ala Phe Leu Leu Trp 1 5 10 15
74542PRTTrichoderma reesei 74Ser Gly Tyr Asp Phe Val Pro Arg Thr
Ala Thr Val Gly Arg Phe Lys 1 5 10 15 Tyr Val Pro Ser Ser Tyr Asp
Trp Ser Lys Ala Lys Val Tyr Tyr Pro 20 25 30 Val Lys Asp Met Lys
Thr Leu Pro Gln Gly Thr Pro Val Thr Phe Pro 35 40 45 Arg Leu Gln
Leu Arg Asn Gln Ser Glu Ala Gln Asp Asp Thr Thr Lys 50 55 60 Ala
Arg Lys Gln Ala Val Lys Asp Ala Phe Val Lys Ser Trp Glu Ala65 70 75
80 Tyr Lys Thr Tyr Ala Trp Thr Lys Asp Gln Leu Gln Pro Leu Ser Leu
85 90 95 Ser Gly Lys Glu Thr Phe Ser Gly Trp Ser Ala Gln Leu Val
Asp Ala 100 105 110 Leu Asp Thr Leu Trp Ile Met Asp Leu Lys Asp Asp
Phe Phe Leu Ala 115 120 125 Val Lys Glu Val Ala Val Ile Asp Trp Ser
Lys Thr Lys Asp Asn Lys 130 135 140 Val Ile Asn Leu Phe Glu Val Thr
Ile Arg Tyr Leu Gly Gly Leu Ile145 150 155 160 Ala Ala Tyr Asp Leu
Ser Gln Glu Pro Val Leu Arg Ala Lys Ala Ile 165 170 175 Glu Leu Gly
Asp Thr Leu Tyr Ala Thr Phe Asp Thr Pro Asn Arg Leu 180 185 190 Pro
Ser His Trp Leu Asp Tyr Ser Lys Ala Lys Lys Gly Thr Gln Arg 195 200
205 Ala Asp Asp Ser Met Ser Gly Ala Ala Gly Gly Thr Leu Cys Met Glu
210 215 220 Phe Thr Arg Leu Ser Gln Ile Thr Gly Asp Pro Lys Tyr Tyr
Asp Ala225 230 235 240 Thr Glu Arg Ile Lys Gln Phe Phe Tyr Arg Phe
Gln Asn Glu Thr Thr 245 250 255 Leu Pro Gly Met Trp Pro Val Met Met
Asn Tyr Arg Glu Glu Thr Met 260 265 270 Val Glu Ser Arg Tyr Ser Met
Gly Gly Ser Ala Asp Ser Leu Tyr Glu 275 280 285 Tyr Leu Val Lys Met
Pro Ala Leu Leu Gly Gly Leu Asp Pro Gln Tyr 290 295 300 Pro Glu Met
Ala Ile Arg Ala Leu Asp Thr Ala Arg Asp Asn Leu Leu305 310 315 320
Phe Arg Pro Met Thr Glu Lys Gly Asp Asn Ile Leu Ala Leu Gly Asn 325
330 335 Ala Leu Val Asp His Gly Asn Val Gln Arg Ile Thr Glu Met Gln
His 340 345 350 Leu Thr Cys Phe Ala Gly Gly Met Tyr Ala Met Ala Gly
Lys Leu Phe 355 360 365 Lys Arg Asp Asp Tyr Val Asp Leu Gly Ser Arg
Ile Ser Ser Gly Cys 370 375 380 Val Trp Ala Tyr Asp Ser Phe Pro Ser
Gly Ile Met Pro Glu Ser Ala385 390 395 400 Asp Met Ala Ala Cys Ala
Lys Leu Asp Gly Pro Cys Pro Tyr Asp Glu 405 410 415 Val Lys Ala Pro
Val Asp Pro Asp Gly Arg Arg Pro His Gly Phe Ile 420 425 430 His Val
Lys Ser Arg His Tyr Leu Leu Arg Pro Glu Ala Ile Glu Ser 435 440 445
Val Phe Tyr Met Trp Arg Ile Thr Gly Asp Gln Val Trp Arg Asp Thr 450
455 460 Ala Trp Arg Met Trp Glu Asn Ile Val Arg Glu Ala Glu Thr Glu
His465 470 475 480 Ala Phe Ala Ile Val Glu Asp Val Thr Arg Thr Ala
Ser Lys Leu Thr 485 490 495 Asn Asn Tyr Leu Leu Gln Thr Phe Trp Leu
Ala Glu Thr Leu Lys Tyr
500 505 510 Phe Tyr Leu Ile Phe Asp Asp Glu Ser Ala Ile Asp Leu Asp
Lys Trp 515 520 525 Val Phe Asn Thr Glu Ala His Pro Phe Lys Arg Pro
Ala Val 530 535 540 7513PRTTrichoderma reesei 75Met Leu Val Val Gly
Arg Pro Arg Leu Val Arg Asn Ser 1 5 10 7616PRTTrichoderma reesei
76Ile Ile Leu Thr Leu Ala Ile Leu Ser Ile Trp His Leu Gly Leu Leu 1
5 10 15 77576PRTTrichoderma reesei 77Ser Arg Thr Pro Thr Ser Ala
Ser Ala Leu Val Ser Ala Ser Val Ser 1 5 10 15 Ala Ser Ser Glu Trp
Ser Arg Leu Glu Arg Leu Met Asn Arg Gly Ala 20 25 30 Pro Leu Thr
Pro Tyr Pro Asp Ser Asn Ser Ser Phe Asp Trp Ser Ala 35 40 45 Ile
Pro Phe Arg Tyr Pro Pro His Asn Thr Thr His Leu Pro Pro Arg 50 55
60 His Lys Gln Pro Pro Leu Pro Arg Ile Gln His Arg Phe Gly Pro
Glu65 70 75 80 Ser Pro Ala Ala Ala Lys Glu Arg Ile Lys Arg Leu Lys
Ala Val Lys 85 90 95 Gln Val Phe Leu Arg Ala Trp Gln Ala Tyr Lys
Gly Tyr Ala Trp Lys 100 105 110 Gln Asp Ala Leu Leu Pro Ile Ser Gly
Gly Gly Arg Glu Gln Phe Ser 115 120 125 Gly Trp Ala Ala Thr Leu Val
Asp Ala Leu Asp Thr Leu Trp Ile Met 130 135 140 Gly Leu Arg Glu Glu
Phe Asp Glu Ala Val Ala Ala Val Ala Glu Ile145 150 155 160 Asp Phe
Gly Ser Ser Thr Ser Ser Arg Val Asn Ile Phe Glu Thr Asn 165 170 175
Ile Arg Tyr Leu Gly Gly Leu Leu Ala Ala Tyr Asp Leu Ser Gly Arg 180
185 190 Glu Val Leu Leu Lys Lys Ala Val Glu Leu Gly Asp Leu Ile Tyr
Ala 195 200 205 Gly Phe Asn Thr Glu Asn Gly Met Pro Val Asp Phe Leu
Asn Phe Tyr 210 215 220 Ser Ala Lys Ser Gly Glu Gly Leu Val Val Glu
Ser Ser Val Val Ser225 230 235 240 Ala Ser Pro Gly Thr Leu Ser Leu
Glu Leu Ala His Leu Ser Gln Val 245 250 255 Thr Gly Asp Asp Lys Tyr
Tyr Ser Ala Val Ser Gln Val Met Asp Val 260 265 270 Phe Tyr Gln Gly
Gln Asn Lys Thr Arg Leu Pro Gly Val Trp Pro Ile 275 280 285 Asp Val
Asn Met Arg Ala Lys Asp Val Val Ser Gly Ser Arg Phe Thr 290 295 300
Leu Gly Gly Cys Ala Asp Ser Leu Tyr Glu Tyr Leu Pro Lys Met His305
310 315 320 Gln Leu Leu Gly Gly Gly Glu Pro Lys Tyr Glu Thr Met Ser
Arg Thr 325 330 335 Phe Leu Gln Ala Ala Asp Arg His Phe Val Phe Arg
Pro Met Leu Pro 340 345 350 Gly Ala Glu Glu Asp Val Leu Met Pro Gly
Asn Val Asn Val Asp Glu 355 360 365 Asp Ser Gly Glu Ala Val Leu Asp
Pro Glu Thr Glu His Leu Ala Cys 370 375 380 Phe Val Gly Gly Met Phe
Gly Leu Ala Gly Arg Leu Phe Ser Arg Pro385 390 395 400 Asp Asp Val
Glu Thr Gly Val Arg Leu Thr Asn Gly Cys Val Tyr Ala 405 410 415 Tyr
Arg Ala Phe Pro Thr Gly Met Met Pro Glu Arg Leu Asp Leu Ala 420 425
430 Pro Cys Arg Asp Arg Ser Ser Arg Cys Pro Trp Asp Glu Glu His Trp
435 440 445 Leu Glu Glu Arg Ala Lys Arg Pro Glu Trp Glu Pro His Leu
Pro Arg 450 455 460 Gly Phe Thr Ser Ala Lys Asp Pro Arg Tyr Leu Leu
Arg Pro Glu Ala465 470 475 480 Ile Glu Ser Val Phe Tyr Ser Tyr Arg
Ile Thr Gly Arg Gln Glu Phe 485 490 495 Gln Thr Ala Ala Trp Asp Met
Phe Thr Ala Val Glu Lys Gly Thr Arg 500 505 510 Thr Gln Phe Ala Asn
Ala Ala Val Leu Asp Val Thr Arg Ala Ala Asp 515 520 525 Glu Leu Pro
Gln Glu Asp Tyr Met Glu Ser Phe Trp Leu Ala Glu Thr 530 535 540 Leu
Lys Tyr Phe Tyr Leu Met Phe Thr Thr Pro Asp Ile Ile Ser Leu545 550
555 560 Asp Asp Tyr Val Leu Asn Thr Glu Ala His Pro Phe Lys Leu Val
Gly 565 570 575 7817PRTTrichoderma reesei 78Met Val Met Leu Val Ala
Ile Ala Leu Ala Trp Leu Gly Cys Ser Leu 1 5 10 15
Leu791053PRTTrichoderma reesei 79Arg Pro Val Asp Ala Met Arg Ala
Asp Tyr Leu Ala Gln Leu Arg Gln 1 5 10 15 Glu Thr Val Asp Met Phe
Tyr His Gly Tyr Ser Asn Tyr Met Glu His 20 25 30 Ala Phe Pro Glu
Asp Glu Leu Arg Pro Ile Ser Cys Thr Pro Leu Thr 35 40 45 Arg Asp
Arg Asp Asn Pro Gly Arg Ile Ser Leu Asn Asp Ala Leu Gly 50 55 60
Asn Tyr Ser Leu Thr Leu Ile Asp Ser Leu Ser Thr Leu Ala Ile Leu65
70 75 80 Ala Gly Gly Pro Gln Asn Gly Pro Tyr Thr Gly Pro Gln Ala
Leu Ser 85 90 95 Asp Phe Gln Asp Gly Val Ala Glu Phe Val Arg His
Tyr Gly Asp Gly 100 105 110 Arg Ser Gly Pro Ser Gly Ala Gly Ile Arg
Ala Arg Gly Phe Asp Leu 115 120 125 Asp Ser Lys Val Gln Val Phe Glu
Thr Val Ile Arg Gly Val Gly Gly 130 135 140 Leu Leu Ser Ala His Leu
Phe Ala Ile Gly Glu Leu Pro Ile Thr Gly145 150 155 160 Tyr Val Pro
Arg Pro Glu Gly Val Ala Gly Asp Asp Pro Leu Glu Leu 165 170 175 Ala
Pro Ile Pro Trp Pro Asn Gly Phe Arg Tyr Asp Gly Gln Leu Leu 180 185
190 Arg Leu Ala Leu Asp Leu Ser Glu Arg Leu Leu Pro Ala Phe Tyr Thr
195 200 205 Pro Thr Gly Ile Pro Tyr Pro Arg Val Asn Leu Arg Ser Gly
Ile Pro 210 215 220 Phe Tyr Val Asn Ser Pro Leu His Gln Asn Leu Gly
Glu Ala Val Glu225 230 235 240 Glu Gln Ser Gly Arg Pro Glu Ile Thr
Glu Thr Cys Ser Ala Gly Ala 245 250 255 Gly Ser Leu Val Leu Glu Phe
Thr Val Leu Ser Arg Leu Thr Gly Asp 260 265 270 Ala Arg Phe Glu Gln
Ala Ala Lys Arg Ala Phe Trp Glu Val Trp His 275 280 285 Arg Arg Ser
Glu Ile Gly Leu Ile Gly Asn Gly Ile Asp Ala Glu Arg 290 295 300 Gly
Leu Trp Ile Gly Pro His Ala Gly Ile Gly Ala Gly Met Asp Ser305 310
315 320 Phe Phe Glu Tyr Ala Leu Lys Ser His Ile Leu Leu Ser Gly Leu
Gly 325 330 335 Met Pro Asn Ala Ser Thr Ser Arg Arg Gln Ser Thr Thr
Ser Trp Leu 340 345 350 Asp Pro Asn Ser Leu His Pro Pro Leu Pro Pro
Glu Met His Thr Ser 355 360 365 Asp Ala Phe Leu Gln Ala Trp His Gln
Ala His Ala Ser Val Lys Arg 370 375 380 Tyr Leu Tyr Thr Asp Arg Ser
His Phe Pro Tyr Tyr Ser Asn Asn His385 390 395 400 Arg Ala Thr Gly
Gln Pro Tyr Ala Met Trp Ile Asp Ser Leu Gly Ala 405 410 415 Phe Tyr
Pro Gly Leu Leu Ala Leu Ala Gly Glu Val Glu Glu Ala Ile 420 425 430
Glu Ala Asn Leu Val Tyr Thr Ala Leu Trp Thr Arg Tyr Ser Ala Leu 435
440 445 Pro Glu Arg Trp Ser Val Arg Glu Gly Asn Val Glu Ala Gly Ile
Gly 450 455 460 Trp Trp Pro Gly Arg Pro Glu Phe Ile Glu Ser Thr Tyr
His Ile Tyr465 470 475 480 Arg Ala Thr Arg Asp Pro Trp Tyr Leu His
Val Gly Glu Met Val Leu 485 490 495 Arg Asp Ile Arg Arg Arg Cys Tyr
Ala Glu Cys Gly Trp Ala Gly Leu 500 505 510 Gln Asp Val Gln Thr Gly
Glu Lys Gln Asp Arg Met Glu Ser Phe Phe 515 520 525 Leu Gly Glu Thr
Ala Lys Tyr Met Tyr Leu Leu Phe Asp Pro Asp His 530 535 540 Pro Leu
Asn Lys Leu Asp Ala Ala Tyr Val Phe Thr Thr Glu Gly His545 550 555
560 Pro Leu Ile Ile Pro Lys Ser Lys Arg Gly Ser Gly Ser His Asn Arg
565 570 575 Gln Asp Arg Ala Arg Lys Ala Lys Lys Ser Arg Asp Val Ala
Val Tyr 580 585 590 Thr Tyr Tyr Asp Glu Ser Phe Thr Asn Ser Cys Pro
Ala Pro Arg Pro 595 600 605 Pro Ser Glu His His Leu Ile Gly Ser Ala
Thr Ala Ala Arg Pro Asp 610 615 620 Leu Phe Ser Val Ser Arg Phe Thr
Asp Leu Tyr Arg Thr Pro Asn Val625 630 635 640 His Gly Pro Leu Glu
Lys Val Glu Met Arg Asp Lys Lys Lys Gly Arg 645 650 655 Val Val Arg
Tyr Arg Ala Thr Ser Asn His Thr Ile Phe Pro Trp Thr 660 665 670 Leu
Pro Pro Ala Met Leu Pro Glu Asn Gly Thr Cys Ala Ala Pro Pro 675 680
685 Glu Arg Ile Ile Ser Leu Ile Glu Phe Pro Ala Asn Asp Ile Thr Ser
690 695 700 Gly Ile Thr Ser Arg Phe Gly Asn His Leu Ser Trp Gln Thr
His Leu705 710 715 720 Gly Pro Thr Val Asn Ile Leu Glu Gly Leu Arg
Leu Gln Leu Glu Gln 725 730 735 Val Ser Asp Pro Ala Thr Gly Glu Asp
Lys Trp Arg Ile Thr His Ile 740 745 750 Gly Asn Thr Gln Leu Gly Arg
His Glu Thr Val Phe Phe His Ala Glu 755 760 765 His Val Arg His Leu
Lys Asp Glu Val Phe Ser Cys Arg Arg Arg Arg 770 775 780 Asp Ala Val
Glu Ile Glu Leu Leu Val Asp Lys Pro Ser Asp Thr Asn785 790 795 800
Asn Asn Asn Thr Leu Ala Ser Ser Asp Asp Asp Val Val Val Asp Ala 805
810 815 Lys Ala Glu Glu Gln Asp Gly Met Leu Ala Asp Asp Asp Gly Asp
Thr 820 825 830 Leu Asn Ala Glu Thr Leu Ser Ser Asn Ser Leu Phe Gln
Ser Leu Leu 835 840 845 Arg Ala Val Ser Ser Val Phe Glu Pro Val Tyr
Thr Ala Ile Pro Glu 850 855 860 Ser Asp Pro Ser Ala Gly Thr Ala Lys
Val Tyr Ser Phe Asp Ala Tyr865 870 875 880 Thr Ser Thr Gly Pro Gly
Ala Tyr Pro Met Pro Ser Ile Ser Asp Thr 885 890 895 Pro Ile Pro Gly
Asn Pro Phe Tyr Asn Phe Arg Asn Pro Ala Ser Asn 900 905 910 Phe Pro
Trp Ser Thr Val Phe Leu Ala Gly Gln Ala Cys Glu Gly Pro 915 920 925
Leu Pro Ala Ser Ala Pro Arg Glu His Gln Val Ile Val Met Leu Arg 930
935 940 Gly Gly Cys Ser Phe Ser Arg Lys Leu Asp Asn Ile Pro Ser Phe
Ser945 950 955 960 Pro His Asp Arg Ala Leu Gln Leu Val Val Val Leu
Asp Glu Pro Pro 965 970 975 Pro Pro Pro Pro Pro Pro Pro Ala Asn Asp
Arg Arg Asp Val Thr Arg 980 985 990 Pro Leu Leu Asp Thr Glu Gln Thr
Thr Pro Lys Gly Met Lys Arg Leu 995 1000 1005 His Gly Ile Pro Met
Val Leu Val Arg Ala Ala Arg Gly Asp Tyr Glu 1010 1015 1020 Leu Phe
Gly His Ala Ile Gly Val Gly Met Arg Arg Lys Tyr Arg Val1025 1030
1035 1040Glu Ser Gln Gly Leu Val Val Glu Asn Ala Val Val Leu 1045
1050
80 45 PRT Trichoderma reesei 80 Met Met Pro Arg His His Ser Ser
Gly Phe Ser Asn Gly Tyr Pro Arg 1 5 10 15 Ala Asp Thr Phe Glu Ile
Ser Pro His Arg Phe Gln Pro Arg Ala Thr 20 25 30 Leu Pro Pro His
Arg Lys Arg Lys Arg Thr Ala Ile Arg 35 40 45
81 16 PRT Trichoderma reesei 81 Val Gly Ile Ala Val Val Val Ile Leu Val Leu Val Leu Trp Phe Gly 1 5 10 15
82 407 PRT Trichoderma reesei 82 Gln Pro Arg Ser Val
Ala Ser Leu Ile Ser Leu Gly Ile Leu Ser Gly 1 5 10 15 Tyr Asp Asp
Leu Lys Leu Glu Thr Val Arg Tyr Tyr Asp Leu Ser Asn 20 25 30 Val
Gln Gly Thr Ala Arg Gly Trp Glu Arg Glu Glu Arg Ile Leu Leu 35 40
45 Cys Val Pro Leu Arg Asp Ala Glu Gln His Leu Pro Met Phe Phe Ser
50 55 60 His Leu Lys Asn Phe Thr Tyr Pro His Asn Leu Ile Asp Leu
Ala Phe65 70 75 80 Leu Val Ser Asp Ser Lys Asp His Thr Leu Glu Ser
Leu Thr Glu His 85 90 95 Leu Glu Ala Ile Gln Ala Asp Pro Asp Pro
Lys Gln Pro Tyr Gly Glu 100 105 110 Ile Ser Ile Ile Glu Lys Asp Phe
Gly Gln Lys Val Asn Gln Asp Val 115 120 125 Glu Ser Arg His Gly Phe
Ala Ala Gln Ala Ser Arg Arg Lys Leu Met 130 135 140 Ala Gln Ala Arg
Asn Trp Leu Leu Ser Ala Ala Leu Arg Pro Tyr His145 150 155 160 Ser
Trp Val Tyr Trp Arg Asp Val Asp Val Glu Thr Ala Pro Phe Thr 165 170
175 Ile Leu Glu Asp Leu Met Arg His Asn Lys Asp Val Ile Val Pro Asn
180 185 190 Val Trp Arg Pro Leu Pro Asp Trp Leu Gly Gly Glu Gln Pro
Tyr Asp 195 200 205 Leu Asn Ser Trp Gln Glu Ser Glu Thr Ala Leu Ala
Leu Ala Asp Thr 210 215 220 Leu Asp Glu Asp Ala Val Ile Val Glu Gly
Tyr Ala Glu Tyr Ala Thr225 230 235 240 Trp Arg Pro His Leu Ala Tyr
Leu Arg Asp Pro Tyr Gly Asp Pro Asp 245 250 255 Met Glu Met Glu Ile
Asp Gly Val Gly Gly Val Ser Ile Leu Ala Lys 260 265 270 Ala Lys Val
Phe Arg Ala Gly Val His Phe Pro Ala Phe Ser Phe Glu 275 280 285 Lys
His Ala Glu Thr Glu Gly Phe Gly Lys Met Ala Lys Arg Met His 290 295
300 Phe Ser Val Val Gly Leu Pro His Tyr Thr Ile Trp His Leu Tyr
Glu305 310 315 320 Pro Ser Val Asp Asp Ile Lys His Met Glu Glu Met
Glu Arg Glu Arg 325 330 335 Ile Ala Arg Glu Lys Glu Glu Glu Glu Arg
Lys Lys Lys Glu Ala Gln 340 345 350 Ile Lys Glu Glu Phe Gly Asp Ala
Asn Ser Gln Trp Glu Gln Asp Lys 355 360 365 Gln Gln Met Gln Asp Leu
Lys Leu Gln Asp Arg Gly Gly Asp Lys Glu 370 375 380 Ala Ala Ala Ala
Gly Val Asn Gln Gly Ala Ala Ala Lys Ala Ala Gly385 390 395 400 Ala
Met Glu Gly Gln Lys Asn 405
83 31 PRT Trichoderma reesei 83 Met Leu Leu
Pro Lys Gly Gly Leu Asp Trp Arg Ser Ala Arg Ala Gln 1 5 10 15 Ile
Pro Pro Thr Arg Ala Leu Trp Asn Ala Val Thr Arg Thr Arg 20 25 30
84 15 PRT Trichoderma reesei 84 Phe Ile Leu Leu Val Gly Ile Thr Gly Leu Ile Leu Leu Leu Trp 1 5 10 15
85 358 PRT Trichoderma reesei 85 Arg Gly
Val Ser Thr Ser Ala Ser Glu Met Gln Ser Phe Tyr Cys Trp 1 5 10 15
Gly Pro Ala Lys Pro Pro Met Glu Met Ser Pro Asn Glu His Asn Arg 20
25 30 Trp Asn Gly His Leu Gln Thr Pro Val Ile Phe Asn His His Ala
Pro 35 40 45 Val Glu Val Asn Ser Ser Thr Ile Glu His Val Asp Leu Asn Pro Ile
50 55 60 Asn Ser Thr Lys Gln Ala Val Thr Lys Glu Glu Arg Ile Leu
Ile Leu65 70 75 80 Thr Pro Leu Lys Asp Ala Ala Pro Tyr Leu Ser Lys
Tyr Phe Glu Leu 85 90 95 Leu Ala Glu Leu Thr Tyr Pro His Arg Leu
Ile Asp Leu Ala Phe Leu 100 105 110 Val Ser Asp Ser Thr Asp Asp Thr
Leu Ala Val Leu Ala Ser Glu Leu 115 120 125 Asp Arg Ile Gln Lys Arg
Pro Asp Gln Ile Pro Phe His Ser Ala Thr 130 135 140 Val Ile Glu Lys
Asp Phe Gly Phe Lys Leu Ser Gln Asn Val Glu Glu145 150 155 160 Arg
His Ser Phe Glu Ala Gln Gly Pro Arg Arg Lys Ala Met Gly Arg 165 170
175 Ala Arg Asn Tyr Leu Leu Tyr Thr Ala Leu Lys Pro Glu His Ser Trp
180 185 190 Val Tyr Trp Arg Asp Val Asp Ile Val Asp Ser Pro Thr Gly
Ile Leu 195 200 205 Glu Asp Phe Ile Ala His Asp Arg Asp Ile Leu Val
Pro Asn Ile Trp 210 215 220 Phe His Arg Tyr Arg Asp Gly Val Asp Ile
Glu Gly Arg Phe Asp Tyr225 230 235 240 Asn Ser Trp Val Glu Ser Asp
Lys Gly Arg Lys Leu Ala Asn Ser Leu 245 250 255 Asp Lys Asp Val Val
Leu Ala Glu Gly Tyr Lys Gln Tyr Asp Thr Gly 260 265 270 Arg Thr Tyr
Met Ala Lys Met Gly Asp Trp Arg Glu Asn Lys Asp Val 275 280 285 Glu
Leu Glu Leu Asp Gly Ile Gly Gly Val Asn Ile Leu Val Lys Ala 290 295
300 Asp Val His Arg Ser Gly Ile Asn Phe Pro Cys Tyr Ala Phe Glu
Asn305 310 315 320 Gln Ala Glu Thr Glu Gly Phe Ala Lys Met Ala Lys
Arg Ala Gly Tyr 325 330 335 Glu Val Tyr Gly Leu Pro Asn Tyr Val Val
Trp His Ile Asp Thr Glu 340 345 350 Glu Lys Gly Gly Asn Ala 355
86 17 PRT Trichoderma reesei 86 Met Ala Arg Pro Met Gly Ser Val Arg Leu Lys Lys Ala Asn Pro Ser 1 5 10 15 Thr
87 16 PRT Trichoderma reesei 87 Leu Ile Leu Gly Ala Val Leu Cys Ile Phe Ile Ile Ile Phe Leu Val 1 5 10 15
88 339 PRT Trichoderma reesei 88 Ser Pro Ser Ser Pro Ala Ser
Ala Ser Arg Leu Ser Ile Val Ser Ala 1 5 10 15 Gln His His Leu Ser
Pro Pro Thr Ser Pro Tyr Gln Ser Pro Arg Ser 20 25 30 Gly Ala Val
Gln Gly Pro Pro Pro Val Thr Arg Tyr Asn Leu Asn Lys 35 40 45 Val
Thr Val Thr Ser Asp Pro Val Arg Asn Gln Glu His Ile Leu Ile 50 55
60 Leu Thr Pro Met Ala Arg Phe Tyr Gln Glu Tyr Trp Asp Asn Leu
Leu65 70 75 80 Arg Leu Asn Tyr Pro His Glu Leu Ile Thr Leu Gly Phe
Ile Leu Pro 85 90 95 Lys Thr Lys Glu Gly Asn Gln Ala Thr Ser Met
Leu Gln Lys Gln Ile 100 105 110 Gln Lys Thr Gln Asn Tyr Gly Pro Glu
Lys Asp Arg Phe Lys Ser Ile 115 120 125 Ile Ile Leu Arg Gln Asp Phe
Asp Pro Ala Val Val Ser Gln Asp Glu 130 135 140 Ser Glu Arg His Lys
Leu Ala Asn Gln Lys Ala Arg Arg Glu Val Met145 150 155 160 Ala Lys
Ala Arg Asn Ser Leu Leu Phe Thr Thr Leu Gly Pro Ser Thr 165 170 175
Ser Trp Val Leu Trp Leu Asp Ala Asp Ile Thr Glu Thr Ala Pro Thr 180
185 190 Leu Ile Gln Asp Leu Ala Ser His Asp Lys Pro Ile Ile Val Ala
Asn 195 200 205 Cys Phe Gln Lys Tyr Tyr Asp Pro Glu Ser Lys Lys Met
Ala Glu Arg 210 215 220 Pro Tyr Asp Phe Asn Ser Trp Gln Asp Ser Glu
Thr Ala Leu Lys Met225 230 235 240 Ala Glu Gln Met Gly Pro Asp Asp
Ile Leu Leu Glu Gly Tyr Ala Glu 245 250 255 Met Ala Thr Tyr Arg Thr
Leu Leu Ala Tyr Met Ser Thr Pro Gly Gly 260 265 270 Ser Lys Asp Leu
Val Val Pro Leu Asp Gly Val Gly Gly Thr Ala Leu 275 280 285 Leu Val
Lys Ala Asp Val His Arg Asp Gly Ala Met Phe Pro Pro Phe 290 295 300
Ala Phe Tyr His Leu Ile Glu Ser Glu Gly Phe Ala Lys Met Ala Lys305
310 315 320 Arg Leu Gly Trp Gln Pro Tyr Gly Leu Pro Asn Tyr Lys Val
Tyr His 325 330 335 Tyr Asn Glu
89 33 PRT Trichoderma reesei 89 Met His
Phe Ala Tyr Pro Ser Arg Lys Ser Ser Asn Pro Pro Pro Phe 1 5 10 15
Arg Pro Arg Ser Thr Arg Leu Pro Gly Leu Arg Arg Ser Arg Ile Lys 20
25 30 Thr
90 15 PRT Trichoderma reesei 90 Ile Gly Ile Val Leu Phe Leu Val Leu Ala Thr Leu Trp Phe Phe 1 5 10 15
91 262 PRT Trichoderma reesei 91 Ser Asn Pro Arg Val Pro Arg Pro Asp Pro Glu Arg Val Pro
Ser Gly 1 5 10 15 Arg Pro Pro Val Val Leu Val Thr Val Ile Asp Pro
Thr Gln Tyr Pro 20 25 30 Asn Ala Tyr Leu Lys Thr Ile Lys Glu Asn
Arg Glu Gln Tyr Ala Ala 35 40 45 Lys His Gly Tyr Glu Ala Phe Ile
Val Lys Ala Tyr Asp Tyr Asp Thr 50 55 60 Gln Gly Ala Pro Gln Ser
Trp Ser Lys Leu Met Ala Met Arg His Ala65 70 75 80 Leu Thr Lys Phe
Pro Glu Cys Arg Phe Val Trp Tyr Leu Asp Gln Asp 85 90 95 Ala Tyr
Ile Met Asp Met Ser Lys Ser Leu Glu Glu Gln Leu Leu Asn 100 105 110
Arg Gln Lys Leu Glu Ser Leu Met Ile Lys Asn Tyr Pro Val Val Pro 115
120 125 Pro Asp Ser Ile Ile Lys Thr Phe Ser His Leu Arg Pro Asp Glu
Val 130 135 140 Asp Leu Ile Val Ser Gln Asp Ser Ser Gly Leu Val Ala
Gly Ser Val145 150 155 160 Val Val Arg Asn Ser Gln Trp Ser Lys Phe
Leu Leu Glu Thr Trp Met 165 170 175 Asp Pro Leu Tyr Arg Ser Tyr Asn
Phe Gln Lys Ala Glu Arg His Ala 180 185 190 Leu Glu His Ile Val Gln
Trp His Pro Thr Ile Leu Ser Lys Leu Ala 195 200 205 Leu Val Pro Gln
Arg Thr Leu Gly Pro Tyr Thr Arg Thr Asp Gln Gly 210 215 220 Asp Ala
Tyr Gln Asp Gly Asp Phe Val Val Met Phe Thr Gly Cys Thr225 230 235
240 Lys Ser Gly Glu Gln Ser Cys Glu Thr Val Ser Ala Ser Tyr Tyr Gln
245 250 255 Lys Trp Ser Ser Ser Leu 260
92 119 PRT Trichoderma reesei 92 Met Ser Leu Ser Arg Ser Pro Ser Pro Val Pro Gly Gly Gly Trp Ser 1
5 10 15 Ser Pro Gly Leu Asn Ile Asn Ser Gly Arg Ser Ser Pro Ser Asn
Ala 20 25 30 Ala Gly Ser Ser Val Ser Trp Glu Ser Ala Lys Met Arg
Lys Gln Gly 35 40 45 Ala Asn Gly Tyr Pro Ser Phe Ser Thr Gln Asn
Gln Gly Phe Phe Thr 50 55 60 Arg His Met Arg Arg Ile Ser Ser Ser
Leu Pro Arg Phe Ala Ala Gly65 70 75 80 Pro Gly Asn Thr Tyr Ala Glu
Arg Glu Lys Tyr Glu Arg Gly Gly His 85 90 95 Ser Pro His Ala Gly
Gly Gly Arg Leu Arg Ala Phe Leu Ala Arg Ile 100 105 110 Gly Arg Arg
Leu Lys Trp Arg 115
93 16 PRT Trichoderma reesei 93 Ile Leu Leu Pro Leu Ile Ile Ile Cys Thr Ile Val Ala Tyr Tyr Gly 1 5 10 15
94 324 PRT Trichoderma reesei 94 Thr His Glu Ala Pro Gly Phe Val His
Trp Trp Arg Arg Ile Ser Met 1 5 10 15 Gly Gly Gly Gly Glu Lys Phe
Val Ile Ile Leu Gly Ala Asn Val Gly 20 25 30 Gly Gly Val Met Glu
Trp Lys Gly Ala Arg Glu Trp Ala Ile Glu Arg 35 40 45 Asp Ser Val
Arg Asn Lys Arg Lys Tyr Ala Thr Arg Trp Gly Tyr Asp 50 55 60 Leu
Glu Ile Val Asp Met Lys Thr Lys Lys Arg Tyr Ala His Glu Trp65 70 75
80 Arg Glu Ser Trp Glu Lys Val Asp Phe Ile Arg Ala Ala Met Arg Lys
85 90 95 Tyr Pro Lys Ala Glu Trp Phe Trp Trp Leu Asp Leu Asn Thr
Tyr Val 100 105 110 Met Glu Pro Ser Tyr Ser Leu Gln Arg His Leu Phe
Asn His Leu Asp 115 120 125 Arg His Val Tyr Arg Asp Ile Asn Val Phe
Asn Pro Leu Asn Ile Thr 130 135 140 His Pro Pro Thr Glu Glu Tyr Leu
Asp Ala Glu Ala Arg Ser Pro Val145 150 155 160 Gly Asp Gly Asn Ile
Asn Ser Val Asn Leu Met Leu Thr Gln Asp Cys 165 170 175 Ser Gly Phe
Asn Leu Gly Ser Phe Phe Ile Arg Arg Ser Ala Trp Thr 180 185 190 Glu
Gln Leu Leu Asp Ile Trp Trp Asp Pro Val Leu Tyr Glu Gln Lys 195 200
205 His Met Glu Trp Glu His Lys Glu Gln Asp Ala Leu Glu Gln Leu Tyr
210 215 220 Arg Thr Gln Pro Trp Ile Arg Gln His Thr Gly Phe Leu Pro
Gln Arg225 230 235 240 Leu Ile Asn Ser Phe Pro Pro Ala Ala Cys Ala
Asp Glu Ser Gly Leu 245 250 255 Asn Asn Thr Arg Ile His Tyr Asn Glu
Lys Asp Arg Asp Phe Val Val 260 265 270 Asn Met Ala Gly Cys Glu Trp
Gly Arg Asp Cys Trp Gly Glu Met Tyr 275 280 285 His Tyr Arg Glu Phe
Ser Tyr Trp Leu Asn Arg Asn Pro Trp Glu Leu 290 295 300 Phe Lys Glu
Glu Ile Val Ala Val Ile Trp Tyr Lys Leu Thr Gly Gln305 310 315 320
Arg Val Lys Leu
95 863 PRT Homo sapiens 95 Met Arg Phe Arg Ile Tyr Lys
Arg Lys Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly
Phe Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu
Ala Leu Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly
Ala Gly Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55
60 Arg Arg Val Ser Asn Val Ser Ala Ala Ser Leu Val Pro Ala Val
Pro65 70 75 80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser
Leu Val Tyr 85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val
Asp Lys Ala Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val
Val Gln Val His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu
Asp Ser Leu Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val
Ile Phe Ser His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln
Leu Ile Ala Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175
Phe Pro Phe Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180
185 190 Pro Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys
Leu 195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His
Tyr Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp
Trp Lys Leu His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu
Arg Asp Tyr Ala Gly Leu Ile Leu 245 250 255 Phe Leu Glu Glu Asp His
Tyr Leu Ala Pro Asp Phe Tyr His Val Phe 260 265 270 Lys Lys Met Trp
Lys Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val 275 280 285 Leu Ser
Leu Gly Thr Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala 290 295 300
Asp Lys Val Asp Val Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly305
310 315 320 Leu Ala Leu Thr Arg Asn Ala Tyr Gln Lys Leu Ile Glu Cys
Thr Asp 325 330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp
Thr Leu Gln Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp
Lys Val Leu Val Pro Gln 355 360 365 Ile Pro Arg Ile Phe His Ala Gly
Asp Cys Gly Met His His Lys Lys 370 375 380 Thr Cys Arg Pro Ser Thr
Gln Ser Ala Gln Ile Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys
Gln Tyr Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys 405 410 415 Phe
Thr Val Val Ala Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly 420 425
430 Asp Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln Thr
435 440 445 Arg Pro Ala Pro Gly Arg Pro Pro Ser Val Ser Ala Leu Asp
Gly Asp 450 455 460 Pro Ala Ser Leu Thr Arg Glu Val Ile Arg Leu Ala
Gln Asp Ala Glu465 470 475 480 Val Glu Leu Glu Arg Gln Arg Gly Leu
Leu Gln Gln Ile Gly Asp Ala 485 490 495 Leu Ser Ser Gln Arg Gly Arg
Val Pro Thr Ala Ala Pro Pro Ala Gln 500 505 510 Pro Arg Val Pro Val
Thr Pro Ala Pro Ala Val Ile Pro Ile Leu Val 515 520 525 Ile Ala Cys
Asp Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu 530 535 540 His
Tyr Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp545 550
555 560 Cys Gly His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly Ser
Ala 565 570 575 Val Thr His Ile Arg Gln Pro Asp Leu Ser Ser Ile Ala
Val Pro Pro 580 585 590 Asp His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile
Ala Arg His Tyr Arg 595 600 605 Trp Ala Leu Gly Gln Val Phe Arg Gln
Phe Arg Phe Pro Ala Ala Val 610 615 620 Val Val Glu Asp Asp Leu Glu
Val Ala Pro Asp Phe Phe Glu Tyr Phe625 630 635 640 Arg Ala Thr Tyr
Pro Leu Leu Lys Ala Asp Pro Ser Leu Trp Cys Val 645 650 655 Ser Ala
Trp Asn Asp Asn Gly Lys Glu Gln Met Val Asp Ala Ser Arg 660 665 670
Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu 675
680 685 Leu Leu Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys
Ala 690 695 700 Phe Trp Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln
Gly Arg Ala705 710 715 720 Cys Ile Arg Pro Glu Ile Ser Arg Thr Met
Thr Phe Gly Arg Lys Gly 725 730 735 Val Ser His Gly Gln Phe Phe Asp
Gln His Leu Lys Phe Ile Lys Leu 740 745 750 Asn Gln Gln Phe Val His
Phe Thr Gln Leu Asp Leu Ser Tyr Leu Gln 755 760 765 Arg Glu Ala Tyr
Asp Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro 770 775 780 Gln Leu
Gln Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly785 790 795
800 Glu Val Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala
805 810 815 Lys Ala Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro
Arg Ala 820 825 830 Gly Tyr Arg Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val
His 835 840 845 Leu Ala Pro Pro Pro Thr Trp Glu Gly Tyr Asp Pro Ser
Trp Asn 850 855 860
96 2592 DNA Homo sapiens 96 atgcgcttcc gaatctacaa
gcggaaggtc ctcattctga cccttgtcgt ggccgcttgc 60ggctttgttc tctggtccag
caacggtcgc cagcgtaaga acgaggccct ggcgcctccc 120ctcttggacg
ccgaaccggc cagaggcgca ggtggcaggg gaggggatca cccctcggtc
180gctgtcggca tccgccgcgt cagcaatgtg tccgccgcct ctctggtccc
ggcggttccg 240cagcctgagg cagacaacct cacgctgcgc taccgatcac
tcgtgtatca acttaacttc 300gaccagactc tgcggaacgt cgacaaggcc
ggaacctggg ctccgcgtga gttggtcctc 360gtcgttcagg tgcacaacag
gcccgagtac ctccgcctcc tgctggattc gcttcgaaag 420gcccagggca
tcgacaacgt cctggtgatt ttcagccatg acttttggtc cacagagatc
480aatcagctca ttgcgggtgt caacttttgc cccgtcttgc aagttttctt
ccctttctct 540atccaactct accccaacga gttcccgggc agtgaccccc
gcgactgtcc tcgggatctg 600ccaaaaaacg ccgctctcaa gctgggctgc
atcaacgccg aataccccga cagctttggc 660cactatcgcg aggccaagtt
ctcgcagacg aagcaccact ggtggtggaa gctccatttt 720gtctgggagc
gagtgaagat ccttcgtgat tacgcaggac tcattctgtt cttggaagag
780gaccactacc tggccccgga cttctaccac gtctttaaga agatgtggaa
gctcaagcag 840caggaatgcc ccgagtgcga cgttctgtcc cttggcacct
atagcgcgtc ccgctcgttc 900tacggtatgg ctgacaaggt cgatgtgaaa
acctggaagt caactgagca caatatgggc 960ctcgccctga cgaggaacgc
ctaccagaaa ctcatcgagt gtaccgacac cttctgcacg 1020tacgacgact
ataactggga ttggacactg cagtacttga ctgtcagctg cctccctaag
1080ttttggaagg tccttgttcc ccagatcccg agaattttcc atgctggcga
ctgcgggatg 1140caccacaaga aaacctgtcg cccatccacg cagtctgccc
aaatcgagtc gctcctgaac 1200aacaacaagc agtacatgtt ccccgagaca
ctgaccatta gcgagaagtt tacggtcgtg 1260gcgatctccc cgcctcgaaa
gaatggcggc tggggtgaca tccgcgatca cgagctgtgc 1320aagtcttacc
gccggctcca gacgcgccca gcacctggca ggccaccctc agtcagcgct
1380ctcgatggcg accccgccag cctcacccgg gaagtgattc gcctggccca
agacgccgag 1440gtggagctgg agcggcagcg tgggctgctg cagcagatcg
gggatgccct gtcgagccag 1500cgggggaggg tgcccaccgc cgcccctccc
gcccagccgc gtgtgcctgt gacccccgcg 1560ccggcggtga ttcccatcct
ggtcatcgcc tgtgaccgca gcactgttcg gcgctgcctg 1620gacaagctgc
tgcattatcg gccctcggct gagctcttcc ccatcatcgt cagccaggac
1680tgcgggcacg aggagacggc ccaggccatc gcctcctacg gcagcgcggt
cacgcacatc 1740cggcagcccg acctgagcag cattgcggtg ccgccggacc
accgcaagtt ccagggctac 1800tacaagatcg cgcgccacta ccgctgggcg
ctgggccagg tcttccggca gtttcgcttc 1860cccgccgccg tggtggtgga
ggatgacctg gaggtggccc cggacttctt cgagtacttt 1920cgggccacct
atccgctgct gaaggccgac ccctccctgt ggtgcgtctc ggcctggaat
1980gacaacggca aggagcagat ggtggacgcc agcaggcctg agctgctcta
ccgcaccgac 2040tttttccctg gcctgggctg gctgctgttg gccgagctct
gggctgagct ggagcccaag 2100tggccaaagg ccttctggga cgactggatg
cggcggccgg agcagcggca ggggcgggcc 2160tgcatccgcc ctgagatctc
aagaacgatg acctttggcc gcaagggtgt gagccacggg 2220cagttctttg
accagcacct caagttcatc aagctgaacc agcagtttgt gcacttcacc
2280cagctggacc tgtcttacct gcagcgggag gcctatgacc gagatttcct
cgcccgcgtc 2340tacggtgctc cccagctgca ggtggagaaa gtgaggacca
atgaccggaa ggagctgggg 2400gaggtgcggg tgcagtacac gggcagggac
agcttcaagg ctttcgccaa ggctctgggt 2460gtcatggatg acctcaagtc
gggggttccg agagctggct accggggcat tgtcaccttc 2520cagttccggg
gccgccgtgt ccacctggcg cccccaccga cgtgggaggg ctatgatccc
2580 agctggaatt ag 2592
97 1263 DNA Trichoderma reesei 97 atggcgtcac
tcatcaaaac tgccgtggac attgccaacg gccgccatgc gctgtccaga 60tatgtcatct
ttgggctctg gcttgcggat gcggtgctgt gcgggctgat tatctggaaa
120gtgccttata cggaaatcga ctgggtcgcc tacatggagc aagtcaccca
gttcgtccac 180ggagagcgag actaccccaa gatggagggc ggcacagggc
ccctggtgta tcccgcggcc 240catgtgtaca tctacacagg gctctactac
ctgacgaaca agggcaccga catcctgctg 300gcgcagcagc tctttgccgt
gctctacatg gctactctgg cggtcgtcat gacatgctac 360tccaaggcca
aggtcccgcc gtacatcttc ccgcttctca tcctctccaa aagacttcac
420agcgtcttcg tcctgagatg cttcaacgac tgcttcgccg ccttcttcct
ctggctctgc 480atcttcttct tccagaggcg agagtggacc atcggagctc
tcgcatacag catcggcctg 540ggcgtcaaaa tgtcgctgct actggttctc
cccgccgtgg tcatcgtcct ctacctcggc 600cgcggcttca agggcgccct
gcggctgctc tggctcatgg tgcaggtcca gctcctcctc 660gccataccct
tcatcacgac aaattggcgc ggctacctcg gccgtgcatt cgagctctcg
720aggcagttca agtttgaatg gacagtcaat tggcgcatgc tgggcgagga
tctgttcctc 780agccggggct tctctatcac gctactggca tttcacgcca
tcttcctcct cgcctttatc 840ctcggccggt ggctgaagat tagggaacgg
accgtactcg ggatgatccc ctatgtcatc 900cgattcagat cgccctttac
cgagcaggaa gagcgcgcca tctccaaccg cgtcgtcacg 960cccggctatg
tcatgtccac catcttgtcg gccaacgtgg tgggactgct gtttgcccgg
1020tctctgcact accagttcta tgcatatctg gcgtgggcga ccccctatct
cctgtggacg 1080gcctgcccca atcttttggt ggtggccccc ctctgggcgg
cgcaagaatg ggcctggaac 1140gtcttcccca gcacgcctct tagctcgagc
gtcgtggtga gcgtgctggc cgtgacggtg 1200gccatggcgt ttgcaggttc
aaatccgcag ccacgtgaaa catcgaagcc gaagcagcac 1260 taa 1263
98 420 PRT Trichoderma reesei 98 Met Ala Ser Leu Ile Lys Thr Ala
Val Asp Ile Ala Asn Gly Arg His 1 5 10 15 Ala Leu Ser Arg Tyr Val
Ile Phe Gly Leu Trp Leu Ala Asp Ala Val 20 25 30 Leu Cys Gly Leu
Ile Ile Trp Lys Val Pro Tyr Thr Glu Ile Asp Trp 35 40 45 Val Ala
Tyr Met Glu Gln Val Thr Gln Phe Val His Gly Glu Arg Asp 50 55 60
Tyr Pro Lys Met Glu Gly Gly Thr Gly Pro Leu Val Tyr Pro Ala Ala65
70 75 80 His Val Tyr Ile Tyr Thr Gly Leu Tyr Tyr Leu Thr Asn Lys
Gly Thr 85 90 95 Asp Ile Leu Leu Ala Gln Gln Leu Phe Ala Val Leu
Tyr Met Ala Thr 100 105 110 Leu Ala Val Val Met Thr Cys Tyr Ser Lys
Ala Lys Val Pro Pro Tyr 115 120 125 Ile Phe Pro Leu Leu Ile Leu Ser
Lys Arg Leu His Ser Val Phe Val 130 135 140 Leu Arg Cys Phe Asn Asp
Cys Phe Ala Ala Phe Phe Leu Trp Leu Cys145 150 155 160 Ile Phe Phe
Phe Gln Arg Arg Glu Trp Thr Ile Gly Ala Leu Ala Tyr 165 170 175 Ser
Ile Gly Leu Gly Val Lys Met Ser Leu Leu Leu Val Leu Pro Ala 180 185
190 Val Val Ile Val Leu Tyr Leu Gly Arg Gly Phe Lys Gly Ala Leu Arg
195 200 205 Leu Leu Trp Leu Met Val Gln Val Gln Leu Leu Leu Ala Ile
Pro Phe 210 215 220 Ile Thr Thr Asn Trp Arg Gly Tyr Leu Gly Arg Ala
Phe Glu Leu Ser225 230 235 240 Arg Gln Phe Lys Phe Glu Trp Thr Val
Asn Trp Arg Met Leu Gly Glu 245 250 255 Asp Leu Phe Leu Ser Arg Gly
Phe Ser Ile Thr Leu Leu Ala Phe His 260 265 270 Ala Ile Phe Leu Leu
Ala Phe Ile Leu Gly Arg Trp Leu Lys Ile Arg 275 280 285 Glu Arg Thr
Val Leu Gly Met Ile Pro Tyr Val Ile Arg Phe Arg Ser 290 295 300 Pro
Phe Thr Glu Gln Glu Glu Arg Ala Ile Ser Asn Arg Val Val Thr305 310
315 320 Pro Gly Tyr Val Met Ser Thr Ile Leu Ser Ala Asn Val Val Gly
Leu 325 330 335 Leu Phe Ala Arg Ser Leu His Tyr Gln Phe Tyr Ala Tyr
Leu Ala Trp 340 345 350 Ala Thr Pro Tyr Leu Leu Trp Thr Ala Cys Pro
Asn Leu Leu Val Val 355 360 365 Ala Pro Leu Trp Ala Ala Gln Glu Trp
Ala Trp Asn Val Phe Pro Ser 370 375 380 Thr Pro Leu Ser Ser Ser Val
Val Val Ser Val Leu Ala Val Thr Val385 390 395 400 Ala Met Ala Phe
Ala Gly Ser Asn Pro Gln Pro Arg Glu Thr Ser Lys 405 410 415 Pro Lys
Gln His 420
99 21 DNA Artificial Sequence Synthesized Construct 99 gcaaatggca ttctgacatc c 21
100 21 DNA Artificial Sequence Synthesized Construct 100 gactggttcc aattgacaag c 21
101 42 DNA Artificial Sequence Synthesized Construct 101 cagtggtacc ctaattccag ctaggatcat agccctccca cg 42
102 18 DNA Artificial Sequence Synthesized Construct 102 cggaccaccg caagttcc 18
103 54 DNA Artificial Sequence Synthesized Construct 103 atgcggaatt ctgcatcatc atcatcatca tcgccagcgt aagaacgagg ccct 54
104 22 DNA Artificial Sequence Synthesized Construct 104 cctttctcta tccaactcta cc 22
105 18 DNA Artificial Sequence Synthesized Construct 105 ggaacttgcg gtggtccg 18
106 66 DNA Artificial Sequence Synthesized Construct 106 ccgccggctc cagggaggtg ggggcagtgg aggtggcggc agtgggaggg tgcccaccgc 60 cgcccc 66
107 66 DNA Artificial Sequence Synthesized Construct 107 gcggtgggca ccctcccact gccgccacct ccactgcccc cacctccctg gagccggcgg 60 taagac 66
108 65 DNA Artificial Sequence Synthesized Construct 108 aggtgggggc agtggaggtg gcggcagtgg cggcggtgga agtgggaggg tgcccaccgc 60 cgccc 65
109 64 DNA Artificial Sequence Synthesized Construct 109 cggtgggcac cctcccactt ccaccgccgc cactgccgcc acctccactg cccccacctc 60 cctg 64
110 65 DNA Artificial Sequence Synthesized Construct 110 gtttccgccg ggagggttgc cgccgctagg gttgccggtg ctctggagcc ggcggtaaga 60 cttgc 65
111 66 DNA Artificial Sequence Synthesized Construct 111 gcaaccctcc cggcggaaac ccgcctggca gcaccgggag ggtgcccacc gccgcccctc 60 ccgccc 66
112 70 DNA Artificial Sequence Synthesized Construct 112 ccgcctccag gaacagtggc gctggcggtg gccgtcgcgg cggagctctg gagccggcgg 60 taagacttgc 70
113 71 DNA Artificial Sequence Synthesized Construct 113 cgccactgtt cctggaggcg gtagcggccc caccagcggg agggtgccca ccgccgcccc 60 tcccgcccag c 71
114 20 DNA Artificial Sequence Synthesized Construct 114 cattagcgag aagtttacgg 20
115 106 PRT Trichoderma reesei 115 Met Ala Ser Thr Asn Ala
Arg Tyr Val Arg Tyr Leu Leu Ile Ala Phe 1 5 10 15 Phe Thr Ile Leu
Val Phe Tyr Phe Val Ser Asn Ser Lys Tyr Glu Gly 20 25 30 Val Asp
Leu Asn Lys Gly Thr Phe Thr Ala Pro Asp Ser Thr Lys Thr 35 40 45
Thr Pro Lys Pro Pro Ala Thr Gly Asp Ala Lys Asp Phe Pro Leu Ala 50
55 60 Leu Thr Pro Asn Asp Pro Gly Phe Asn Asp Leu Val Gly Ile Ala
Pro65 70 75 80 Gly Pro Arg Met Asn Ala Thr Phe Val Thr Leu Ala Arg
Asn Ser Asp 85 90 95 Val Trp Asp Ile Ala Arg Ser Ile Arg Gln 100
105
116 83 PRT Trichoderma reesei 116 Met Ala Ser Thr Asn Ala Arg Tyr
Val Arg Tyr Leu Leu Ile Ala Phe 1 5 10 15 Phe Thr Ile Leu Val Phe
Tyr Phe Val Ser Asn Ser Lys Tyr Glu Gly 20 25 30 Val Asp Leu Asn
Lys Gly Thr Phe Thr Ala Pro Asp Ser Thr Lys Thr 35 40 45 Thr Pro
Lys Pro Pro Ala Thr Gly Asp Ala Lys Asp Phe Pro Leu Ala 50 55 60
Leu Thr Pro Asn Asp Pro Gly Phe Asn Asp Leu Val Gly Ile Ala Pro65
70 75 80 Gly Pro Arg
117 8 PRT Artificial Sequence Synthesized Construct 117 Asp Tyr Lys Asp Asp Asp Asp Lys 1 5
118 15 PRT Artificial Sequence Synthesized Construct 118 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 1 5 10 15
119 824 PRT Artificial Sequence Synthesized Construct 119 Met Arg Phe Arg Ile Tyr Lys Arg
Lys Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe
Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala
Leu Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala
Gly Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55 60
Arg Arg Val Ser Asn Val Ser Ala Ala Ser Leu Val Pro Ala Val Pro65
70 75 80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu
Val Tyr 85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp
Lys Ala Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val Val
Gln Val His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu Asp
Ser Leu Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val Ile
Phe Ser His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln Leu
Ile Ala Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175 Phe
Pro Phe Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185
190 Pro Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu
195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr
Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp Trp
Lys Leu His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu Arg
Asp Tyr Ala Gly Leu Ile Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr
Leu Ala Pro Asp Phe Tyr His Val Phe 260 265 270 Lys Lys Met Trp Lys
Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val 275 280 285 Leu Ser Leu
Gly Thr Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala 290 295 300 Asp
Lys Val Asp Val Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly305 310
315 320 Leu Ala Leu Thr Arg Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr
Asp 325 330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr
Leu Gln Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys
Val Leu Val Pro Gln 355 360 365 Ile Pro Arg Ile Phe His Ala Gly Asp
Cys Gly Met His His Lys Lys 370 375 380 Thr Cys Arg Pro Ser Thr Gln
Ser Ala Gln Ile Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys Gln
Tyr Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys 405 410 415 Phe Thr
Val Val Ala Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly 420 425 430
Asp Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln Gly 435
440 445 Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
Arg 450 455 460 Val Pro Thr Ala Ala Pro Pro Ala Gln Pro Arg Val Pro
Val Thr Pro465 470 475 480 Ala Pro Ala Val Ile Pro Ile Leu Val Ile
Ala Cys Asp Arg Ser Thr 485 490 495 Val Arg Arg Cys Leu Asp Lys Leu
Leu His Tyr Arg Pro Ser Ala Glu 500 505 510 Leu Phe Pro Ile Ile Val
Ser Gln Asp Cys Gly His Glu Glu Thr Ala 515 520 525 Gln Ala Ile Ala
Ser Tyr Gly Ser Ala Val Thr His Ile Arg Gln Pro 530 535 540 Asp Leu
Ser Ser Ile Ala Val Pro Pro Asp His Arg Lys Phe Gln Gly545 550 555
560 Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala Leu Gly Gln Val Phe
565 570 575 Arg Gln Phe Arg Phe Pro Ala Ala Val Val Val Glu Asp Asp
Leu Glu 580 585 590 Val Ala Pro Asp Phe Phe Glu Tyr Phe Arg Ala Thr
Tyr Pro Leu Leu 595 600 605 Lys Ala Asp Pro Ser Leu Trp Cys Val Ser
Ala Trp Asn Asp Asn Gly 610 615 620 Lys Glu Gln Met Val Asp Ala Ser
Arg Pro Glu Leu Leu Tyr Arg Thr625 630 635 640 Asp Phe Phe Pro
Gly Leu Gly Trp Leu Leu Leu Ala Glu Leu Trp Ala 645 650 655 Glu Leu Glu
Pro Lys Trp Pro Lys Ala Phe Trp Asp Asp Trp Met Arg 660 665 670 Arg
Pro Glu Gln Arg Gln Gly Arg Ala Cys Ile Arg Pro Glu Ile Ser 675 680
685 Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser His Gly Gln Phe Phe
690 695 700 Asp Gln His Leu Lys Phe Ile Lys Leu Asn Gln Gln Phe Val
His Phe705 710 715 720 Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg Glu
Ala Tyr Asp Arg Asp 725 730 735 Phe Leu Ala Arg Val Tyr Gly Ala Pro
Gln Leu Gln Val Glu Lys Val 740 745 750 Arg Thr Asn Asp Arg Lys Glu
Leu Gly Glu Val Arg Val Gln Tyr Thr 755 760 765 Gly Arg Asp Ser Phe
Lys Ala Phe Ala Lys Ala Leu Gly Val Met Asp 770 775 780 Asp Leu Lys
Ser Gly Val Pro Arg Ala Gly Tyr Arg Gly Ile Val Thr785 790 795 800
Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala Pro Pro Pro Thr Trp 805
810 815 Glu Gly Tyr Asp Pro Ser Trp Asn 820
120 10 PRT Artificial Sequence Synthesized Construct 120 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 1 5 10
121 819 PRT Artificial Sequence Synthesized Construct 121 Met Arg Phe Arg Ile Tyr Lys Arg Lys Val Leu Ile Leu Thr Leu Val
1 5 10 15 Val Ala Ala Cys Gly Phe Val Leu Trp Ser Ser Asn Gly Arg
Gln Arg 20 25 30 Lys Asn Glu Ala Leu Ala Pro Pro Leu Leu Asp Ala
Glu Pro Ala Arg 35 40 45 Gly Ala Gly Gly Arg Gly Gly Asp His Pro
Ser Val Ala Val Gly Ile 50 55 60 Arg Arg Val Ser Asn Val Ser Ala
Ala Ser Leu Val Pro Ala Val Pro65 70 75 80 Gln Pro Glu Ala Asp Asn
Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr 85 90 95 Gln Leu Asn Phe
Asp Gln Thr Leu Arg Asn Val Asp Lys Ala Gly Thr 100 105 110 Trp Ala
Pro Arg Glu Leu Val Leu Val Val Gln Val His Asn Arg Pro 115 120 125
Glu Tyr Leu Arg Leu Leu Leu Asp Ser Leu Arg Lys Ala Gln Gly Ile 130
135 140 Asp Asn Val Leu Val Ile Phe Ser His Asp Phe Trp Ser Thr Glu
Ile145 150 155 160 Asn Gln Leu Ile Ala Gly Val Asn Phe Cys Pro Val
Leu Gln Val Phe 165 170 175 Phe Pro Phe Ser Ile Gln Leu Tyr Pro Asn
Glu Phe Pro Gly Ser Asp 180 185 190 Pro Arg Asp Cys Pro Arg Asp Leu
Pro Lys Asn Ala Ala Leu Lys Leu 195 200 205 Gly Cys Ile Asn Ala Glu
Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu 210 215 220 Ala Lys Phe Ser
Gln Thr Lys His His Trp Trp Trp Lys Leu His Phe225 230 235 240 Val
Trp Glu Arg Val Lys Ile Leu Arg Asp Tyr Ala Gly Leu Ile Leu 245 250
255 Phe Leu Glu Glu Asp His Tyr Leu Ala Pro Asp Phe Tyr His Val Phe
260 265 270 Lys Lys Met Trp Lys Leu Lys Gln Gln Glu Cys Pro Glu Cys
Asp Val 275 280 285 Leu Ser Leu Gly Thr Tyr Ser Ala Ser Arg Ser Phe
Tyr Gly Met Ala 290 295 300 Asp Lys Val Asp Val Lys Thr Trp Lys Ser
Thr Glu His Asn Met Gly305 310 315 320 Leu Ala Leu Thr Arg Asn Ala
Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325 330 335 Thr Phe Cys Thr Tyr
Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln Tyr 340 345 350 Leu Thr Val
Ser Cys Leu Pro Lys Phe Trp Lys Val Leu Val Pro Gln 355 360 365 Ile
Pro Arg Ile Phe His Ala Gly Asp Cys Gly Met His His Lys Lys 370 375
380 Thr Cys Arg Pro Ser Thr Gln Ser Ala Gln Ile Glu Ser Leu Leu
Asn385 390 395 400 Asn Asn Lys Gln Tyr Met Phe Pro Glu Thr Leu Thr
Ile Ser Glu Lys 405 410 415 Phe Thr Val Val Ala Ile Ser Pro Pro Arg
Lys Asn Gly Gly Trp Gly 420 425 430 Asp Ile Arg Asp His Glu Leu Cys
Lys Ser Tyr Arg Arg Leu Gln Gly 435 440 445 Gly Gly Gly Ser Gly Gly
Gly Gly Ser Gly Arg Val Pro Thr Ala Ala 450 455 460 Pro Pro Ala Gln
Pro Arg Val Pro Val Thr Pro Ala Pro Ala Val Ile465 470 475 480 Pro
Ile Leu Val Ile Ala Cys Asp Arg Ser Thr Val Arg Arg Cys Leu 485 490
495 Asp Lys Leu Leu His Tyr Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile
500 505 510 Val Ser Gln Asp Cys Gly His Glu Glu Thr Ala Gln Ala Ile
Ala Ser 515 520 525 Tyr Gly Ser Ala Val Thr His Ile Arg Gln Pro Asp
Leu Ser Ser Ile 530 535 540 Ala Val Pro Pro Asp His Arg Lys Phe Gln
Gly Tyr Tyr Lys Ile Ala545 550 555 560 Arg His Tyr Arg Trp Ala Leu
Gly Gln Val Phe Arg Gln Phe Arg Phe 565 570 575 Pro Ala Ala Val Val
Val Glu Asp Asp Leu Glu Val Ala Pro Asp Phe 580 585 590 Phe Glu Tyr
Phe Arg Ala Thr Tyr Pro Leu Leu Lys Ala Asp Pro Ser 595 600 605 Leu
Trp Cys Val Ser Ala Trp Asn Asp Asn Gly Lys Glu Gln Met Val 610 615
620 Asp Ala Ser Arg Pro Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro
Gly625 630 635 640 Leu Gly Trp Leu Leu Leu Ala Glu Leu Trp Ala Glu
Leu Glu Pro Lys 645 650 655 Trp Pro Lys Ala Phe Trp Asp Asp Trp Met
Arg Arg Pro Glu Gln Arg 660 665 670 Gln Gly Arg Ala Cys Ile Arg Pro
Glu Ile Ser Arg Thr Met Thr Phe 675 680 685 Gly Arg Lys Gly Val Ser
His Gly Gln Phe Phe Asp Gln His Leu Lys 690 695 700 Phe Ile Lys Leu
Asn Gln Gln Phe Val His Phe Thr Gln Leu Asp Leu705 710 715 720 Ser
Tyr Leu Gln Arg Glu Ala Tyr Asp Arg Asp Phe Leu Ala Arg Val 725 730
735 Tyr Gly Ala Pro Gln Leu Gln Val Glu Lys Val Arg Thr Asn Asp Arg
740 745 750 Lys Glu Leu Gly Glu Val Arg Val Gln Tyr Thr Gly Arg Asp
Ser Phe 755 760 765 Lys Ala Phe Ala Lys Ala Leu Gly Val Met Asp Asp
Leu Lys Ser Gly 770 775 780 Val Pro Arg Ala Gly Tyr Arg Gly Ile Val
Thr Phe Gln Phe Arg Gly785 790 795 800 Arg Arg Val His Leu Ala Pro
Pro Pro Thr Trp Glu Gly Tyr Asp Pro 805 810 815 Ser Trp Asn
122 19 PRT Trichoderma reesei
122 Ser Thr Gly Asn Pro Ser Gly Gly Asn
Pro Pro Gly Gly Asn Pro Pro 1 5 10 15 Gly Ser Thr
123 828 PRT Trichoderma reesei
123 Met Arg Phe Arg Ile Tyr Lys Arg
Lys Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe
Val Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala
Leu Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala
Gly Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55 60
Arg Arg Val Ser Asn Val Ser Ala Ala Ser Leu Val Pro Ala Val Pro65
70 75 80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu
Val Tyr 85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp
Lys Ala Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val Val
Gln Val His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu Asp
Ser Leu Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val Ile
Phe Ser His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln Leu
Ile Ala Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175 Phe
Pro Phe Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185
190 Pro Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu
195 200 205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr
Arg Glu 210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp Trp
Lys Leu His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu Arg
Asp Tyr Ala Gly Leu Ile Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr
Leu Ala Pro Asp Phe Tyr His Val Phe 260 265 270 Lys Lys Met Trp Lys
Leu Lys Gln Gln Glu Cys Pro Glu Cys Asp Val 275 280 285 Leu Ser Leu
Gly Thr Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala 290 295 300 Asp
Lys Val Asp Val Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly305 310
315 320 Leu Ala Leu Thr Arg Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr
Asp 325 330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr
Leu Gln Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys
Val Leu Val Pro Gln 355 360 365 Ile Pro Arg Ile Phe His Ala Gly Asp
Cys Gly Met His His Lys Lys 370 375 380 Thr Cys Arg Pro Ser Thr Gln
Ser Ala Gln Ile Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys Gln
Tyr Met Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys 405 410 415 Phe Thr
Val Val Ala Ile Ser Pro Pro Arg Lys Asn Gly Gly Trp Gly 420 425 430
Asp Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg Arg Leu Gln Ser 435
440 445 Thr Gly Asn Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Pro Pro
Gly 450 455 460 Ser Thr Gly Arg Val Pro Thr Ala Ala Pro Pro Ala Gln
Pro Arg Val465 470 475 480 Pro Val Thr Pro Ala Pro Ala Val Ile Pro
Ile Leu Val Ile Ala Cys 485 490 495 Asp Arg Ser Thr Val Arg Arg Cys
Leu Asp Lys Leu Leu His Tyr Arg 500 505 510 Pro Ser Ala Glu Leu Phe
Pro Ile Ile Val Ser Gln Asp Cys Gly His 515 520 525 Glu Glu Thr Ala
Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val Thr His 530 535 540 Ile Arg
Gln Pro Asp Leu Ser Ser Ile Ala Val Pro Pro Asp His Arg545 550 555
560 Lys Phe Gln Gly Tyr Tyr Lys Ile Ala Arg His Tyr Arg Trp Ala Leu
565 570 575 Gly Gln Val Phe Arg Gln Phe Arg Phe Pro Ala Ala Val Val
Val Glu 580 585 590 Asp Asp Leu Glu Val Ala Pro Asp Phe Phe Glu Tyr
Phe Arg Ala Thr 595 600 605 Tyr Pro Leu Leu Lys Ala Asp Pro Ser Leu
Trp Cys Val Ser Ala Trp 610 615 620 Asn Asp Asn Gly Lys Glu Gln Met
Val Asp Ala Ser Arg Pro Glu Leu625 630 635 640 Leu Tyr Arg Thr Asp
Phe Phe Pro Gly Leu Gly Trp Leu Leu Leu Ala 645 650 655 Glu Leu Trp
Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala Phe Trp Asp 660 665 670 Asp
Trp Met Arg Arg Pro Glu Gln Arg Gln Gly Arg Ala Cys Ile Arg 675 680
685 Pro Glu Ile Ser Arg Thr Met Thr Phe Gly Arg Lys Gly Val Ser His
690 695 700 Gly Gln Phe Phe Asp Gln His Leu Lys Phe Ile Lys Leu Asn
Gln Gln705 710 715 720 Phe Val His Phe Thr Gln Leu Asp Leu Ser Tyr
Leu Gln Arg Glu Ala 725 730 735 Tyr Asp Arg Asp Phe Leu Ala Arg Val
Tyr Gly Ala Pro Gln Leu Gln 740 745 750 Val Glu Lys Val Arg Thr Asn
Asp Arg Lys Glu Leu Gly Glu Val Arg 755 760 765 Val Gln Tyr Thr Gly
Arg Asp Ser Phe Lys Ala Phe Ala Lys Ala Leu 770 775 780 Gly Val Met
Asp Asp Leu Lys Ser Gly Val Pro Arg Ala Gly Tyr Arg785 790 795 800
Gly Ile Val Thr Phe Gln Phe Arg Gly Arg Arg Val His Leu Ala Pro 805
810 815 Pro Pro Thr Trp Glu Gly Tyr Asp Pro Ser Trp Asn 820 825
124 21 PRT Trichoderma reesei
124 Ser Ser Ala Ala Thr Ala Thr Ala Ser
Ala Thr Val Pro Gly Gly Gly 1 5 10 15 Ser Gly Pro Thr Ser 20
125 830 PRT Trichoderma reesei
125 Met Arg Phe Arg Ile Tyr Lys Arg Lys
Val Leu Ile Leu Thr Leu Val 1 5 10 15 Val Ala Ala Cys Gly Phe Val
Leu Trp Ser Ser Asn Gly Arg Gln Arg 20 25 30 Lys Asn Glu Ala Leu
Ala Pro Pro Leu Leu Asp Ala Glu Pro Ala Arg 35 40 45 Gly Ala Gly
Gly Arg Gly Gly Asp His Pro Ser Val Ala Val Gly Ile 50 55 60 Arg
Arg Val Ser Asn Val Ser Ala Ala Ser Leu Val Pro Ala Val Pro65 70 75
80 Gln Pro Glu Ala Asp Asn Leu Thr Leu Arg Tyr Arg Ser Leu Val Tyr
85 90 95 Gln Leu Asn Phe Asp Gln Thr Leu Arg Asn Val Asp Lys Ala
Gly Thr 100 105 110 Trp Ala Pro Arg Glu Leu Val Leu Val Val Gln Val
His Asn Arg Pro 115 120 125 Glu Tyr Leu Arg Leu Leu Leu Asp Ser Leu
Arg Lys Ala Gln Gly Ile 130 135 140 Asp Asn Val Leu Val Ile Phe Ser
His Asp Phe Trp Ser Thr Glu Ile145 150 155 160 Asn Gln Leu Ile Ala
Gly Val Asn Phe Cys Pro Val Leu Gln Val Phe 165 170 175 Phe Pro Phe
Ser Ile Gln Leu Tyr Pro Asn Glu Phe Pro Gly Ser Asp 180 185 190 Pro
Arg Asp Cys Pro Arg Asp Leu Pro Lys Asn Ala Ala Leu Lys Leu 195 200
205 Gly Cys Ile Asn Ala Glu Tyr Pro Asp Ser Phe Gly His Tyr Arg Glu
210 215 220 Ala Lys Phe Ser Gln Thr Lys His His Trp Trp Trp Lys Leu
His Phe225 230 235 240 Val Trp Glu Arg Val Lys Ile Leu Arg Asp Tyr
Ala Gly Leu Ile Leu 245 250 255 Phe Leu Glu Glu Asp His Tyr Leu Ala
Pro Asp Phe Tyr His Val Phe 260 265 270 Lys Lys Met Trp Lys Leu Lys
Gln Gln Glu Cys Pro Glu Cys Asp Val 275 280 285 Leu Ser Leu Gly Thr
Tyr Ser Ala Ser Arg Ser Phe Tyr Gly Met Ala 290 295 300 Asp Lys Val
Asp Val Lys Thr Trp Lys Ser Thr Glu His Asn Met Gly305 310 315 320
Leu Ala Leu Thr Arg Asn Ala Tyr Gln Lys Leu Ile Glu Cys Thr Asp 325
330 335 Thr Phe Cys Thr Tyr Asp Asp Tyr Asn Trp Asp Trp Thr Leu Gln
Tyr 340 345 350 Leu Thr Val Ser Cys Leu Pro Lys Phe Trp Lys Val Leu
Val Pro Gln 355 360 365 Ile Pro Arg Ile Phe His Ala Gly Asp Cys Gly
Met His His Lys Lys 370 375 380 Thr Cys Arg Pro Ser Thr Gln Ser Ala
Gln Ile Glu Ser Leu Leu Asn385 390 395 400 Asn Asn Lys Gln Tyr Met
Phe Pro Glu Thr Leu Thr Ile Ser Glu Lys
405 410 415 Phe Thr Val Val Ala Ile Ser Pro Pro Arg Lys Asn Gly Gly
Trp Gly 420 425 430 Asp Ile Arg Asp His Glu Leu Cys Lys Ser Tyr Arg
Arg Leu Gln Ser 435 440 445 Ser Ala Ala Thr Ala Thr Ala Ser Ala Thr
Val Pro Gly Gly Gly Ser 450 455 460 Gly Pro Thr Ser Gly Arg Val Pro
Thr Ala Ala Pro Pro Ala Gln Pro465 470 475 480 Arg Val Pro Val Thr
Pro Ala Pro Ala Val Ile Pro Ile Leu Val Ile 485 490 495 Ala Cys Asp
Arg Ser Thr Val Arg Arg Cys Leu Asp Lys Leu Leu His 500 505 510 Tyr
Arg Pro Ser Ala Glu Leu Phe Pro Ile Ile Val Ser Gln Asp Cys 515 520
525 Gly His Glu Glu Thr Ala Gln Ala Ile Ala Ser Tyr Gly Ser Ala Val
530 535 540 Thr His Ile Arg Gln Pro Asp Leu Ser Ser Ile Ala Val Pro
Pro Asp545 550 555 560 His Arg Lys Phe Gln Gly Tyr Tyr Lys Ile Ala
Arg His Tyr Arg Trp 565 570 575 Ala Leu Gly Gln Val Phe Arg Gln Phe
Arg Phe Pro Ala Ala Val Val 580 585 590 Val Glu Asp Asp Leu Glu Val
Ala Pro Asp Phe Phe Glu Tyr Phe Arg 595 600 605 Ala Thr Tyr Pro Leu
Leu Lys Ala Asp Pro Ser Leu Trp Cys Val Ser 610 615 620 Ala Trp Asn
Asp Asn Gly Lys Glu Gln Met Val Asp Ala Ser Arg Pro625 630 635 640
Glu Leu Leu Tyr Arg Thr Asp Phe Phe Pro Gly Leu Gly Trp Leu Leu 645
650 655 Leu Ala Glu Leu Trp Ala Glu Leu Glu Pro Lys Trp Pro Lys Ala
Phe 660 665 670 Trp Asp Asp Trp Met Arg Arg Pro Glu Gln Arg Gln Gly
Arg Ala Cys 675 680 685 Ile Arg Pro Glu Ile Ser Arg Thr Met Thr Phe
Gly Arg Lys Gly Val 690 695 700 Ser His Gly Gln Phe Phe Asp Gln His
Leu Lys Phe Ile Lys Leu Asn705 710 715 720 Gln Gln Phe Val His Phe
Thr Gln Leu Asp Leu Ser Tyr Leu Gln Arg 725 730 735 Glu Ala Tyr Asp
Arg Asp Phe Leu Ala Arg Val Tyr Gly Ala Pro Gln 740 745 750 Leu Gln
Val Glu Lys Val Arg Thr Asn Asp Arg Lys Glu Leu Gly Glu 755 760 765
Val Arg Val Gln Tyr Thr Gly Arg Asp Ser Phe Lys Ala Phe Ala Lys 770
775 780 Ala Leu Gly Val Met Asp Asp Leu Lys Ser Gly Val Pro Arg Ala
Gly785 790 795 800 Tyr Arg Gly Ile Val Thr Phe Gln Phe Arg Gly Arg
Arg Val His Leu 805 810 815 Ala Pro Pro Pro Thr Trp Glu Gly Tyr Asp
Pro Ser Trp Asn 820 825 830
126 420 PRT Trichoderma reesei
126 Met Ala
Ser Leu Ile Lys Thr Ala Val Asp Ile Ala Asn Gly Arg His 1 5 10 15
Ala Leu Ser Arg Tyr Val Ile Phe Gly Leu Trp Leu Ala Asp Ala Val 20
25 30 Leu Cys Gly Leu Ile Ile Trp Lys Val Pro Tyr Thr Glu Ile Asp
Trp 35 40 45 Val Ala Tyr Met Glu Gln Val Thr Gln Phe Val His Gly
Glu Arg Asp 50 55 60 Tyr Pro Lys Met Glu Gly Gly Thr Gly Pro Leu
Val Tyr Pro Ala Ala65 70 75 80 His Val Tyr Ile Tyr Thr Gly Leu Tyr
Tyr Leu Thr Asn Lys Gly Thr 85 90 95 Asp Ile Leu Leu Ala Gln Gln
Leu Phe Ala Val Leu Tyr Met Ala Thr 100 105 110 Leu Ala Val Val Met
Thr Cys Tyr Ser Lys Ala Lys Val Pro Pro Tyr 115 120 125 Ile Phe Pro
Leu Leu Ile Leu Ser Lys Arg Leu His Ser Val Phe Val 130 135 140 Leu
Arg Cys Phe Asn Asp Cys Phe Ala Ala Phe Phe Leu Trp Leu Cys145 150
155 160 Ile Phe Phe Phe Gln Arg Arg Glu Trp Thr Ile Gly Ala Leu Ala
Tyr 165 170 175 Ser Ile Gly Leu Gly Val Lys Met Ser Leu Leu Leu Val
Leu Pro Ala 180 185 190 Val Val Ile Val Leu Tyr Leu Gly Arg Gly Phe
Lys Gly Ala Leu Arg 195 200 205 Leu Leu Trp Leu Met Val Gln Val Gln
Leu Leu Leu Ala Ile Pro Phe 210 215 220 Ile Thr Thr Asn Trp Arg Gly
Tyr Leu Gly Arg Ala Phe Glu Leu Ser225 230 235 240 Arg Gln Phe Lys
Phe Glu Trp Thr Val Asn Trp Arg Met Leu Gly Glu 245 250 255 Asp Leu
Phe Leu Ser Arg Gly Phe Ser Ile Thr Leu Leu Ala Phe His 260 265 270
Ala Ile Phe Leu Leu Ala Phe Ile Leu Gly Arg Trp Leu Lys Ile Arg 275
280 285 Glu Arg Thr Val Leu Gly Met Ile Pro Tyr Val Ile Arg Phe Arg
Ser 290 295 300 Pro Phe Thr Glu Gln Glu Glu Arg Ala Ile Ser Asn Arg
Val Val Thr305 310 315 320 Pro Gly Tyr Val Met Ser Thr Ile Leu Ser
Ala Asn Val Val Gly Leu 325 330 335 Leu Phe Ala Arg Ser Leu His Tyr
Gln Phe Tyr Ala Tyr Leu Ala Trp 340 345 350 Ala Thr Pro Tyr Leu Leu
Trp Thr Ala Cys Pro Asn Leu Leu Val Val 355 360 365 Ala Pro Leu Trp
Ala Ala Gln Glu Trp Ala Trp Asn Val Phe Pro Ser 370 375 380 Thr Pro
Leu Ser Ser Ser Val Val Val Ser Val Leu Ala Val Thr Val385 390 395
400 Ala Met Ala Phe Ala Gly Ser Asn Pro Gln Pro Arg Glu Thr Ser Lys
405 410 415 Pro Lys Gln His 420
127 525 PRT Trichoderma atroviride
127 Met Ala Ser Leu Ile Lys Phe Ala Ser Asp Val Ala Thr Gly Arg His
1 5 10 15 Ala Leu Ser Lys Leu Ile Pro Val Gly Leu Phe Leu Ala Asp
Ala Ile 20 25 30 Leu Cys Gly Leu Val Ile Trp Lys Val Pro Tyr Thr
Glu Ile Asp Trp 35 40 45 Thr Ala Tyr Met Glu Gln Val Thr Gln Phe
Val Asn Gly Glu Arg Asp 50 55 60 Tyr Pro Lys Met Glu Gly Gly Thr
Gly Pro Leu Val Tyr Pro Ala Ala65 70 75 80 His Val Tyr Ile Tyr Thr
Gly Leu Tyr Tyr Leu Thr Asn Arg Gly Thr 85 90 95 Asp Ile Leu Leu
Ala Gln Gln Leu Phe Ala Val Leu Tyr Met Ala Thr 100 105 110 Leu Gly
Val Val Met Leu Ser Tyr Trp Lys Ala Arg Val Pro Pro Tyr 115 120 125
Ile Phe Pro Leu Leu Ile Leu Ser Lys Arg Leu His Ser Val Phe Val 130
135 140 Leu Arg Cys Phe Asn Asp Cys Phe Ala Ala Phe Phe Leu Trp Leu
Cys145 150 155 160 Ile Tyr Ser Phe Gln Asn Arg Ala Trp Thr Phe Gly
Ala Leu Ala Tyr 165 170 175 Thr Leu Gly Leu Gly Val Lys Met Ser Leu
Leu Leu Val Leu Pro Ala 180 185 190 Val Val Ile Ile Leu Phe Leu Gly
Arg Gly Phe Lys Gly Ala Leu Arg 195 200 205 Leu Val Trp Leu Met Ala
Gln Val Gln Leu Val Leu Ala Ile Pro Phe 210 215 220 Ile Thr Thr Asn
Trp Ala Gly Tyr Leu Gly Arg Ala Phe Glu Leu Ser225 230 235 240 Arg
Gln Phe Lys Phe Glu Trp Thr Val Asn Trp Arg Met Met Gly Glu 245 250
255 Glu Thr Phe Leu Ser Arg Gly Phe Ser Ile Thr Leu Leu Thr Phe His
260 265 270 Val Val Thr Leu Leu Val Phe Ile Ala Ala Arg Trp Leu Lys
Leu Gln 275 280 285 Glu Arg Ser Leu Leu Gly Ile Ile Thr Tyr Ala Val
Arg Phe Gln Ser 290 295 300 Pro Phe Thr Glu Gln Glu Glu Ala Lys Val
Ser Lys Lys Val Val Thr305 310 315 320 Pro Arg Tyr Val Leu Ala Thr
Ile Leu Ser Ala Asn Val Ile Gly Leu 325 330 335 Leu Phe Ala Arg Ser
Leu His Tyr Gln Phe Tyr Ala Tyr Leu Ala Trp 340 345 350 Ala Thr Pro
Phe Leu Leu Trp Thr Ala Tyr Pro Asn Leu Leu Val Val 355 360 365 Val
Pro Leu Trp Leu Ala Gln Glu Trp Ala Trp Asn Val Phe Pro Ser 370 375
380 Thr Pro Leu Ser Ser Ser Val Val Ile Ser Leu Val Pro Val Cys
Leu385 390 395 400 Leu Ser Pro Gln Leu Leu Val Ser His Asp Ile Tyr
Asn Phe Ala Asn 405 410 415 Cys Ser Ala Ile Leu Arg Pro Arg Gly Ile
Ala Phe Gly Gln Asp Ile 420 425 430 Ser Ala Thr Leu Asn Pro Asp Gly
Val Ala Lys Pro Leu Gly Glu Leu 435 440 445 Glu Asn Asp Gly Leu Arg
Val Trp His Leu Ala Ser Val Gln Val Val 450 455 460 Ser Phe Gly Leu
His His Ala His Asn Glu Leu Gly Gly Leu Gln Phe465 470 475 480 Gly
Trp Trp Arg Glu Arg Phe Leu Arg Gly Gly Glu Asp Val Ala Leu 485 490
495 Trp Phe Ala His Gly Gly Phe Glu Phe Arg Phe Phe Ser Glu Leu Leu
500 505 510 Val Arg Leu Ala Asp Thr Ser Asp Ile Lys Lys Ser Phe 515
520 525
128 419 PRT Trichoderma virens
128 Met Ala Ser Leu Ile Lys Phe
Ala Ser Asp Val Ala Asn Gly Arg His 1 5 10 15 Ala Leu Ser Lys Phe
Ile Pro Met Gly Leu Trp Leu Ala Asp Ala Val 20 25 30 Leu Cys Gly
Leu Ile Ile Trp Lys Val Pro Tyr Thr Glu Ile Asp Trp 35 40 45 Val
Ala Tyr Met Glu Gln Ile Thr Gln Phe Val His Gly Glu Arg Asp 50 55
60 Tyr Pro Lys Met Glu Gly Gly Thr Gly Pro Leu Val Tyr Pro Ala
Ala65 70 75 80 His Val Tyr Ile Tyr Thr Gly Leu Tyr Tyr Leu Thr Asn
Lys Gly Thr 85 90 95 Asp Ile Leu Leu Ala Gln Gln Leu Phe Ala Val
Leu Tyr Met Ala Thr 100 105 110 Leu Gly Val Val Met Leu Cys Tyr Trp
Lys Ala Lys Val Pro Pro Tyr 115 120 125 Ile Phe Pro Leu Leu Ile Leu
Ser Lys Arg Leu His Ser Val Phe Val 130 135 140 Leu Arg Cys Phe Asn
Asp Cys Phe Ala Ala Phe Phe Leu Trp Leu Ser145 150 155 160 Ile Phe
Phe Phe Gln Arg Arg Val Trp Thr Leu Gly Ala Ile Ala Tyr 165 170 175
Thr Ile Gly Leu Gly Val Lys Met Ser Leu Leu Leu Val Leu Pro Ala 180
185 190 Val Val Ile Val Leu Phe Leu Gly Arg Gly Phe Lys Gly Ala Leu
Arg 195 200 205 Leu Leu Trp Leu Met Val Gln Val Gln Leu Leu Leu Ala
Ile Pro Phe 210 215 220 Ile Thr Thr Asn Trp Lys Gly Tyr Leu Gly Arg
Ala Phe Glu Leu Ser225 230 235 240 Arg Gln Phe Lys Phe Glu Trp Thr
Val Asn Trp Arg Met Leu Gly Glu 245 250 255 Glu Leu Phe Leu Ser Arg
Gly Phe Ser Ile Thr Leu Leu Ala Phe His 260 265 270 Ala Leu Phe Leu
Leu Ile Phe Ile Leu Gly Arg Trp Leu Arg Ile Lys 275 280 285 Glu Arg
Ser Phe Leu Gly Met Ile Pro Tyr Val Leu Arg Phe Thr Ser 290 295 300
Pro Phe Thr Glu His Glu Glu Ala Ser Ile Ser His Arg Val Val Thr305
310 315 320 Pro Glu Tyr Ile Met Ser Ala Met Leu Ser Ala Asn Val Val
Gly Leu 325 330 335 Leu Phe Ala Arg Ser Leu His Tyr Gln Phe Tyr Ala
Tyr Leu Ala Trp 340 345 350 Ala Thr Pro Phe Leu Leu Trp Thr Ala Ser
Pro Asn Leu Leu Val Val 355 360 365 Val Pro Leu Trp Ala Ala Gln Glu
Trp Ala Trp Asn Val Phe Pro Ser 370 375 380 Thr Pro Leu Ser Ser Asn
Val Val Val Ser Val Leu Ala Val Thr Val385 390 395 400 Ala Met Ala
Phe Val Gly Ser Asn Pro Gln Arg Gly Ala Pro Lys Pro 405 410 415 Lys
Gln Leu
129 434 PRT Fusarium oxysporum
129 Met Pro Glu Ser Ala Ser Gly
Thr Leu Ser Gln Gly Val Arg Phe Leu 1 5 10 15 Arg Asn Val Leu Asn
Gly Arg His Ala Leu Ser Lys Leu Ile Pro Ile 20 25 30 Ala Leu Trp
Leu Val Asp Ala Leu Gly Cys Gly Leu Ile Ile Trp Lys 35 40 45 Ile
Pro Tyr Thr Glu Ile Asp Trp Val Ala Tyr Met Gln Gln Ile Ser 50 55
60 Gln Phe Val Ser Gly Glu Arg Asp Tyr Thr Lys Met Glu Gly Asp
Thr65 70 75 80 Gly Pro Leu Val Tyr Pro Ala Ala His Val Tyr Thr Tyr
Thr Gly Leu 85 90 95 Tyr Tyr Ile Thr Asp Lys Gly Thr Asn Ile Leu
Leu Ala Gln Gln Ile 100 105 110 Phe Ala Val Leu Tyr Met Ala Thr Leu
Ala Val Val Met Leu Cys Tyr 115 120 125 Trp Lys Ala Lys Val Pro Pro
Tyr Met Phe Ile Phe Leu Ile Ala Ser 130 135 140 Lys Arg Leu His Ser
Leu Phe Val Leu Arg Cys Phe Asn Asp Cys Phe145 150 155 160 Ala Val
Phe Phe Leu Trp Leu Thr Ile Phe Leu Phe Gln Arg Arg Gln 165 170 175
Trp Thr Val Gly Ser Leu Val Tyr Ser Trp Gly Leu Gly Ile Lys Met 180
185 190 Ser Leu Leu Leu Val Leu Pro Ala Ile Gly Val Ile Leu Phe Leu
Gly 195 200 205 Arg Gly Leu Trp Pro Ser Leu Arg Leu Ala Trp Leu Met
Ala Gln Ile 210 215 220 Gln Phe Ala Ile Gly Leu Pro Phe Ile Thr Lys
Asn Pro Arg Gly Tyr225 230 235 240 Ala Ala Arg Ala Phe Glu Leu Ser
Arg Gln Phe Gln Phe Lys Trp Thr 245 250 255 Val Asn Trp Arg Met Leu
Gly Glu Glu Val Phe Leu Ser Lys Tyr Phe 260 265 270 Ala Leu Ser Leu
Leu Ala Cys His Ile Leu Val Leu Leu Ile Phe Ile 275 280 285 Ser Lys
Arg Trp Ile Gln Pro Thr Gly Arg Ser Leu Tyr Asp Leu Ile 290 295 300
Pro Ser Phe Leu Arg Leu Lys Ser Pro Phe Thr Met Gln Glu Gln Leu305
310 315 320 Arg Ile Ser His Tyr Val Thr Pro Glu Tyr Ala Met Thr Thr
Met Leu 325 330 335 Thr Ala Asn Leu Ile Gly Leu Leu Phe Ala Arg Ser
Leu His Tyr Gln 340 345 350 Phe Tyr Ala Tyr Leu Ala Trp Ala Thr Pro
Tyr Leu Leu Trp Arg Ala 355 360 365 Thr Glu Asp Pro Val Ile Val Ala
Ile Ile Trp Ala Ala Gln Glu Trp 370 375 380 Ala Trp Asn Val Tyr Pro
Ser Thr Asp Leu Ser Ser Thr Ile Ala Val385 390 395 400 Asn Thr Met
Leu Ala Thr Val Val Leu Val Tyr Leu Gly Thr Ala Arg 405 410 415 Arg
Ala Val Pro Ala Pro Ala Ala Gln Val Gly Asn Val Asp Asp Lys 420 425
430 Asn Lys
130 432 PRT Gibberella zeae
130 Met Ala Asp Pro Ala Pro Gly
Ala Leu Ala Arg Gly Thr Arg Phe Val 1 5 10 15 Arg Asn Val Leu Thr
Gly Gln His Ala Leu Ser Lys Leu Ile Pro Val 20 25 30 Ala Leu Trp
Leu Ala Asp Ala Val Gly Thr Ser Leu Ile Ile Trp Lys 35 40 45 Val
Pro Tyr Thr Glu Ile Asp Trp Glu Ala Tyr Met Gln Gln Val Ser 50 55
60 Gln Phe Ile Ser Gly Glu Arg Asp Tyr Thr Lys Ile Glu Gly Gly
Thr65 70 75
80 Gly Pro Leu Val Tyr Pro Ala Ala His Val Tyr Thr Phe Thr Gly Leu
85 90 95 Tyr His Ile Thr Asn Glu Gly Glu Asn Ile Phe Leu Ala Gln
Gln Ile 100 105 110 Phe Gly Val Leu Tyr Met Ala Thr Leu Ala Val Val
Met Leu Cys Tyr 115 120 125 Trp Lys Ala Lys Val Pro Pro Tyr Met Phe
Val Phe Leu Ile Ala Ser 130 135 140 Lys Arg Leu His Ser Leu Phe Val
Leu Arg Cys Phe Asn Asp Cys Phe145 150 155 160 Ala Val Phe Phe Leu
Trp Leu Ser Ile Tyr Phe Phe Gln Arg Arg Asn 165 170 175 Trp Thr Phe
Gly Ser Leu Ala Tyr Thr Trp Gly Leu Gly Ile Lys Met 180 185 190 Ser
Leu Leu Leu Val Leu Pro Ala Ile Gly Val Ile Leu Leu Leu Gly 195 200
205 Arg Gly Phe Trp Pro Gly Leu Arg Leu Ala Trp Leu Met Ala Gln Val
210 215 220 Gln Phe Ala Ile Gly Ile Pro Phe Ile Met Lys Asn Ser Arg
Gly Tyr225 230 235 240 Ala Ala Arg Ala Phe Glu Leu Ser Arg Glu Phe
Lys Phe Glu Trp Thr 245 250 255 Val Asn Trp Arg Met Leu Gly Glu Glu
Val Phe Leu Ser Lys Ser Phe 260 265 270 Ala Ile Phe Leu Leu Ala Cys
His Val Thr Ala Leu Leu Val Phe Ile 275 280 285 Ser Gln Arg Trp Leu
Gln Pro Thr Gly Arg Pro Leu Ser Ala Met Ile 290 295 300 Pro Ser Phe
Leu Gln Leu Lys Ser Pro Phe Thr Leu Gln Glu Gln Leu305 310 315 320
Arg Ile Ser His Tyr Val Thr Pro Glu Tyr Val Met Thr Thr Met Leu 325
330 335 Ser Ala Asn Val Ile Gly Leu Leu Phe Ala Arg Ser Leu His Tyr
Gln 340 345 350 Phe Tyr Ala Tyr Leu Ala Trp Ala Ser Pro Tyr Leu Ile
Trp Arg Ala 355 360 365 Thr Glu Asp Pro Phe Ile Val Leu Leu Ile Trp
Ala Ala Gln Glu Trp 370 375 380 Ala Trp Asn Val Phe Pro Ser Thr Asp
Leu Ser Ser Arg Val Thr Val385 390 395 400 Gly Ala Met Leu Ala Thr
Val Val Leu Ala Tyr Arg Gly Thr Ala Arg 405 410 415 Leu Ala Val Pro
Pro Ser Gln Ala Arg Lys Ile Glu Ala Lys Asn Lys 420 425 430
131 446 PRT Myceliophthora thermophila
131 Met Thr Arg Met Arg Ser Ser
Pro Lys Thr Pro Thr Ala Thr Met Ala 1 5 10 15 Asp Gln Asn Arg Pro
Ile His Ile Arg Ala Thr Arg Leu Val Phe Asp 20 25 30 Ile Leu Asn
Gly Arg His Val Leu Ser Lys Leu Ile Pro Pro Leu Val 35 40 45 Phe
Leu Ala Asp Ala Leu Leu Cys Ala Leu Ile Ile Trp Lys Val Pro 50 55
60 Tyr Thr Glu Ile Asp Trp Asn Ala Tyr Met Glu Gln Val Ala Gln
Ile65 70 75 80 Leu Ser Gly Glu Arg Asp Tyr Thr Lys Ile Arg Gly Asn
Thr Gly Pro 85 90 95 Leu Val Tyr Pro Ala Ala His Val Tyr Ile Tyr
Thr Gly Leu Tyr His 100 105 110 Leu Thr Asp Glu Gly Arg Asn Ile Leu
Thr Ala Gln Lys Leu Phe Gly 115 120 125 Phe Leu Tyr Met Val Thr Leu
Ala Val Val Met Ala Cys Tyr Trp Gln 130 135 140 Ala Lys Val Pro Pro
Tyr Val Phe Pro Leu Leu Ile Leu Ser Lys Arg145 150 155 160 Leu His
Ser Ile Phe Val Leu Arg Cys Phe Asn Asp Cys Phe Ala Thr 165 170 175
Leu Phe Leu Trp Leu Ala Ile Phe Ala Leu Gln Arg Arg Ala Trp Arg 180
185 190 Thr Gly Ala Leu Met Tyr Thr Leu Gly Leu Gly Val Lys Met Ser
Leu 195 200 205 Leu Leu Val Leu Pro Ala Val Gly Val Val Leu Leu Leu
Gly Ala Gly 210 215 220 Phe Ala Thr Ser Leu Arg Leu Ala Ala Val Ile
Gly Leu Val Gln Val225 230 235 240 Leu Ile Ala Val Pro Phe Leu Ser
Asn Asn Pro Trp Gly Tyr Leu Gly 245 250 255 Arg Ala Phe Glu Leu Ser
Arg Gln Phe Phe Phe Lys Trp Thr Val Asn 260 265 270 Trp Arg Phe Val
Gly Glu Glu Val Phe Leu Ser Lys Glu Phe Ser Leu 275 280 285 Ala Leu
Leu Gly Leu His Val Ala Val Leu Ala Ile Phe Val Thr Thr 290 295 300
Arg Trp Leu Lys Pro Ala Arg Lys Pro Val Ser Gln Leu Ile Val Pro305
310 315 320 Ile Leu Leu Gly Lys Ser Pro Phe Thr Glu Glu Glu Gln Arg
Ala Val 325 330 335 Ser Arg Asp Val Thr Pro Arg Phe Ile Leu Thr Ser
Ile Leu Ser Ala 340 345 350 Asn Val Val Gly Leu Leu Phe Ala Arg Ser
Leu His Tyr Gln Phe Tyr 355 360 365 Ser Tyr Leu Ala Trp Met Thr Pro
Tyr Leu Leu Trp Arg Ser Gly Val 370 375 380 His Pro Ile Leu Gln Tyr
Ala Ile Trp Thr Ala Gln Glu Trp Ala Trp385 390 395 400 Asn Val Tyr
Pro Ser Thr Pro Ile Ser Ser Gly Val Val Val Gly Val 405 410 415 Leu
Ala Leu Thr Ala Ala Leu Val Trp Leu Gly Ala Arg Glu Asp Trp 420 425
430 Glu Pro Arg Arg Val Leu Leu Lys Gly Glu Ala Ala Lys Arg 435 440
445
132 442 PRT Neurospora crassa
132 Met Ala Ala Pro Ser Ser Arg Pro
Glu Ser Asn Pro Pro Leu Tyr Lys 1 5 10 15 Gln Ala Leu Asp Phe Ala
Leu Asp Val Ala Asn Gly Arg His Ala Leu 20 25 30 Ser Lys Leu Ile
Pro Pro Ala Leu Phe Leu Val Asp Ala Leu Leu Cys 35 40 45 Gly Leu
Ile Ile Trp Lys Val Pro Tyr Thr Glu Ile Asp Trp Ala Ala 50 55 60
Tyr Met Glu Gln Val Ser Gln Ile Leu Ser Gly Glu Arg Asp Tyr Thr65
70 75 80 Lys Val Arg Gly Gly Thr Gly Pro Leu Val Tyr Pro Ala Ala
His Val 85 90 95 Tyr Ile Tyr Thr Gly Leu Tyr His Leu Thr Asp Glu
Gly Arg Asn Ile 100 105 110 Leu Leu Ala Gln Gln Leu Phe Ala Gly Leu
Tyr Met Val Thr Leu Ala 115 120 125 Val Val Met Gly Cys Tyr Trp Gln
Ala Lys Ala Pro Pro Tyr Leu Phe 130 135 140 Pro Leu Leu Thr Leu Ser
Lys Arg Leu His Ser Ile Phe Val Leu Arg145 150 155 160 Cys Phe Asn
Asp Cys Phe Ala Val Leu Phe Leu Trp Leu Ala Ile Phe 165 170 175 Phe
Phe Gln Arg Arg Asn Trp Gln Ala Gly Ala Leu Leu Tyr Thr Leu 180 185
190 Gly Leu Gly Val Lys Met Thr Leu Leu Leu Ser Leu Pro Ala Val Gly
195 200 205 Ile Val Leu Phe Leu Gly Ser Gly Ser Phe Val Thr Thr Leu
Gln Leu 210 215 220 Val Ala Thr Met Gly Leu Val Gln Ile Leu Ile Gly
Val Pro Phe Leu225 230 235 240 Ala His Tyr Pro Thr Glu Tyr Leu Ser
Arg Ala Phe Glu Leu Ser Arg 245 250 255 Gln Phe Phe Phe Lys Trp Thr
Val Asn Trp Arg Phe Val Gly Glu Glu 260 265 270 Ile Phe Leu Ser Lys
Gly Phe Ala Leu Thr Leu Leu Ala Leu His Val 275 280 285 Leu Val Leu
Gly Ile Phe Ile Thr Thr Arg Trp Ile Lys Pro Ala Arg 290 295 300 Lys
Ser Leu Val Gln Leu Ile Ser Pro Val Leu Leu Ala Gly Lys Pro305 310
315 320 Pro Leu Thr Val Pro Glu His Arg Ala Ala Ala Arg Asp Val Thr
Pro 325 330 335 Arg Tyr Ile Met Thr Thr Ile Leu Ser Ala Asn Ala Val
Gly Leu Leu 340 345 350 Phe Ala Arg Ser Leu His Tyr Gln Phe Tyr Ala
Tyr Val Ala Trp Ser 355 360 365 Thr Pro Phe Leu Leu Trp Arg Ala Gly
Leu His Pro Val Leu Val Tyr 370 375 380 Leu Leu Trp Ala Val His Glu
Trp Ala Trp Asn Val Phe Pro Ser Thr385 390 395 400 Pro Ala Ser Ser
Ala Val Val Val Gly Val Leu Gly Val Thr Val Ala 405 410 415 Gly Val
Trp Phe Gly Ala Arg Glu Glu Trp Glu Pro Gly Met Lys Ser 420 425 430
Ser Ser Lys Lys Glu Glu Ala Ala Met Arg 435 440
133 413 PRT Aspergillus oryzae
133 Met Glu Leu Lys His Phe Ile His Glu
Leu Cys Leu Asn Pro Arg His 1 5 10 15 Thr Lys Trp Ile Ala Pro Leu
Leu Val Ile Gly Asp Ala Phe Leu Cys 20 25 30 Ala Leu Ile Ile Trp
Lys Ile Pro Tyr Thr Glu Ile Asp Trp Thr Thr 35 40 45 Tyr Met Gln
Gln Ile Ala Leu Tyr Ile Ser Gly Glu Arg Asp Tyr Thr 50 55 60 Leu
Ile Lys Gly Ser Thr Gly Pro Leu Val Tyr Pro Ala Ala His Val65 70 75
80 Tyr Ser Tyr Met Ala Leu Tyr His Leu Thr Asp Glu Gly Arg Asp Ile
85 90 95 Leu Phe Gly Gln Ile Leu Phe Ala Val Leu Tyr Leu Val Thr
Leu Ala 100 105 110 Val Val Met Val Cys Tyr Arg Gln Ser Gly Ala Pro
Pro Tyr Leu Phe 115 120 125 Pro Leu Leu Val Leu Ser Lys Arg Leu His
Ser Val Phe Val Leu Arg 130 135 140 Leu Phe Asn Asp Gly Leu Ala Val
Cys Ala Met Trp Ile Ala Ile Leu145 150 155 160 Leu Phe Gln Asn Lys
Lys Trp Thr Ala Gly Val Thr Ala Trp Thr Val 165 170 175 Gly Val Gly
Ile Lys Met Thr Leu Leu Leu Leu Ala Pro Ala Ile Ala 180 185 190 Val
Val Thr Val Leu Ser Leu Ser Leu Val Pro Ser Ile Arg Leu Gly 195 200
205 Ile Leu Ala Leu Leu Ile Gln Val Leu Leu Ala Ile Pro Phe Leu Gln
210 215 220 Gly Asn Pro Ile Gly Tyr Val Ala Arg Ala Phe Glu Leu Thr
Arg Gln225 230 235 240 Phe Met Phe Lys Trp Thr Val Asn Trp Arg Phe
Val Gly Glu Asp Leu 245 250 255 Phe Leu Ser Lys Gln Phe Ser Leu Ala
Leu Leu Gly Leu His Ile Phe 260 265 270 Leu Leu Gly Leu Phe Val Thr
Thr Gly Trp Leu Arg Pro Ser Gly Ser 275 280 285 Asn Val Pro Asp Phe
Leu Arg Ser Leu Leu Gln Gly Arg Gln Arg Thr 290 295 300 Val Val Leu
Ser Lys Ser Phe Ile Met Thr Val Met Leu Thr Ser Leu305 310 315 320
Ala Ile Gly Leu Leu Cys Ala Arg Ser Leu His Tyr Gln Phe Phe Ala 325
330 335 Tyr Leu Ser Trp Ala Thr Pro Cys Leu Leu Trp Arg Ala Arg Leu
His 340 345 350 Pro Ile Leu Ile Tyr Ala Ile Trp Ala Leu Gln Glu Trp
Ala Trp Asn 355 360 365 Val Tyr Pro Ser Thr Asn Ala Ser Ser Ser Val
Val Val Phe Ser Leu 370 375 380 Ala Val Gln Val Phe Gly Val Leu Leu
Asn Ser Arg Asn Ala Leu Ser385 390 395 400 Asp Ala Pro Pro Arg Arg
Lys Gly Lys Glu His Ile Gln 405 410
134 411 PRT Neosartorya fischeri
134 Met Asp Leu Lys His Thr Leu Arg Asp Leu Cys Met Asn Pro Arg His
1 5 10 15 Thr Arg Trp Val Ala Pro Leu Leu Ile Leu Gly Asp Ala Val
Leu Cys 20 25 30 Ala Leu Ile Ile Trp Lys Val Pro Tyr Thr Glu Ile
Asp Trp Thr Thr 35 40 45 Tyr Met Gln Gln Ile Ser Leu Tyr Ile Ser
Gly Glu Arg Asp Tyr Thr 50 55 60 Leu Ile Lys Gly Ser Thr Gly Pro
Leu Val Tyr Pro Ala Ala His Val65 70 75 80 Tyr Ile Phe Asn Ile Leu
Tyr His Leu Thr Asp Glu Gly Arg Asp Ile 85 90 95 Phe Leu Gly Gln
Ile Leu Phe Ala Ile Leu Tyr Leu Ala Thr Leu Thr 100 105 110 Val Ala
Met Thr Cys Tyr Arg Gln Ala Gly Ala Pro Pro Tyr Leu Leu 115 120 125
Val Pro Leu Val Leu Ser Lys Arg Leu His Ser Val Phe Met Leu Arg 130
135 140 Leu Phe Asn Asp Gly Phe Ala Ala Tyr Ala Met Trp Val Ser Ile
Leu145 150 155 160 Leu Phe Met Asn Lys Lys Trp Thr Ala Gly Ala Ile
Val Trp Ser Thr 165 170 175 Gly Val Gly Ile Lys Met Thr Leu Leu Leu
Leu Ala Pro Ala Ile Ala 180 185 190 Val Val Leu Val Leu Ser Leu Ser
Leu Gly Pro Ser Met Gln Leu Gly 195 200 205 Phe Leu Ala Val Leu Ile
Gln Val Leu Phe Gly Ile Pro Phe Leu Gln 210 215 220 Asn Asn Pro Ala
Gly Tyr Val Ser Arg Ala Phe Glu Leu Thr Arg Gln225 230 235 240 Phe
Met Phe Lys Trp Thr Val Asn Trp Arg Phe Val Gly Glu Glu Leu 245 250
255 Phe Leu Ser Arg Lys Phe Ser Leu Ala Leu Leu Ala Leu His Ile Leu
260 265 270 Leu Leu Gly Leu Phe Val Ala Thr Val Trp Leu Lys Pro Ser
Gly Ser 275 280 285 Asp Leu Pro Ser Phe Leu Gln Arg Leu Ile Gln Arg
Arg Tyr Arg Thr 290 295 300 Ala Ser Leu Ser Lys Ser Phe Ile Met Thr
Ala Met Leu Ser Ser Leu305 310 315 320 Ala Ile Gly Leu Leu Cys Ala
Arg Ser Leu His Tyr Gln Phe Phe Ala 325 330 335 Tyr Leu Ala Cys Ala
Thr Pro Phe Leu Leu Trp Gln Ala Gly Phe His 340 345 350 Pro Ile Leu
Val Tyr Val Val Trp Val Ala Gln Glu Trp Ala Trp Asn 355 360 365 Thr
Tyr Pro Ser Thr Asn Ala Ser Ser Leu Val Val Ile Leu Ser Leu 370 375
380 Ala Ala Gln Val Phe Gly Val Leu Gly Asn Ser Phe Ser Arg Lys
His385 390 395 400 Leu Asp Gln Ser Ser Gln Lys Glu His Leu Gln 405
410
135 413 PRT Aspergillus niger
135 Met Asp Trp Met Arg Leu Ile Arg
Asp Leu Cys Phe Asn Pro Arg His 1 5 10 15 Thr Lys Trp Met Ala Pro
Leu Leu Val Leu Gly Asp Ala Phe Leu Cys 20 25 30 Ala Leu Ile Ile
Trp Lys Val Pro Tyr Thr Glu Ile Asp Trp Ala Thr 35 40 45 Tyr Met
Gln Gln Ile Ser Leu Tyr Leu Ser Gly Glu Arg Asp Tyr Thr 50 55 60
Leu Ile Arg Gly Ser Thr Gly Pro Leu Val Tyr Pro Ala Ala His Val65
70 75 80 Tyr Ser Tyr Thr Ala Leu Tyr His Leu Thr Asp Glu Gly Arg
Asp Ile 85 90 95 Phe Phe Gly Gln Ile Leu Phe Ala Val Leu Tyr Leu
Ile Thr Leu Val 100 105 110 Val Val Leu Cys Cys Tyr Arg Gln Ser Gly
Ala Pro Pro Tyr Leu Leu 115 120 125 Pro Leu Leu Val Leu Ser Lys Arg
Leu His Ser Val Tyr Val Leu Arg 130 135 140 Leu Phe Asn Asp Gly Leu
Ala Ala Leu Ala Met Trp Val Ala Ile Leu145 150 155 160 Leu Phe Met
Asn Arg Lys Trp Thr Ala Ala Val Ala Val Trp Ser Thr 165 170 175 Gly
Val Ala Ile Lys Met Thr Leu Leu Leu Leu Ala Pro Ala Ile Ala 180 185
190 Val Val Thr Val Leu Ser Leu Ser Leu Gly Pro Ser Val Gly Leu Gly
195 200 205 Val Leu Ala Val Leu Val Gln Val Leu Leu Ala Ile Pro Phe
Leu Gln 210 215 220 Asn Asn Pro Ala Gly Tyr Leu Ser Arg Ala Phe Glu
Leu Thr Arg Gln225 230 235 240 Phe Met Phe Lys Trp Thr Val Asn Trp Arg
Phe Val Gly Glu Glu Val 245 250 255 Phe Leu Ser Lys Ser Phe Ser Leu
Ala Leu Leu Ala Val His Ile Val 260 265 270 Leu Leu Gly Ala Phe Ala
Val Thr Gly Trp Leu Arg Tyr Ser Arg Ser 275 280 285 Ser Leu Pro Ala
Phe Ile Arg Asn Leu Leu Ala Gly Arg His Arg Thr 290 295 300 Val Ser
Leu Pro Lys Pro Tyr Ile Met Ser Val Met Leu Ser Ser Leu305 310 315
320 Thr Val Gly Leu Leu Cys Ala Arg Ser Leu His Tyr Gln Phe Phe Ala
325 330 335 Tyr Leu Ser Trp Ala Thr Pro Phe Leu Leu Trp Arg Ala Gly
Phe His 340 345 350 Pro Ile Leu Leu Tyr Leu Ile Trp Ala Met Gln Glu
Trp Ala Trp Asn 355 360 365 Thr Phe Pro Ser Thr Asn Leu Ser Ser Ile
Ile Val Val Leu Ser Leu 370 375 380 Ala Thr Gln Ser Phe Gly Val Leu
Ala Asn Ser Ala Ser Ala Phe Tyr385 390 395 400 Thr Met Arg Ser Asn
Pro Ser Gly Lys Glu His Asn Gln 405 410
136 357 PRT Magnaporthe oryzae
136 Met Ala Ala Glu Arg Pro Ser Thr Leu Gly Lys Pro Val Gln Phe Val
1 5 10 15 Phe Asp Val Ala Asn Gly Arg His Pro Leu Ser Arg Ala Ile
Pro Pro 20 25 30 Met Leu Leu Ala Phe Asp Gly Leu Leu Cys Gly Leu
Ile Ile Lys Lys 35 40 45 Val Pro Ser Cys Tyr Arg Lys Ala Lys Val
Pro Pro Tyr Val Leu Pro 50 55 60 Leu Leu Val Leu Ser Lys Arg Leu
His Ser Ile Phe Val Leu Arg Cys65 70 75 80 Phe Asn Asp Cys Phe Ala
Val Leu Phe Phe Trp Leu Ala Ile Tyr Cys 85 90 95 Phe Gln Arg Arg
Ala Trp Ser Leu Gly Gly Val Phe Tyr Ser Phe Gly 100 105 110 Leu Gly
Ile Lys Met Thr Val Leu Leu Ser Leu Pro Ala Val Gly Val 115 120 125
Ile Leu Leu Leu Gly Arg Gly Phe Gly Gly Ala Leu Asn Val Ala Ser 130
135 140 Ile Met Gly Gln Leu Gln Val Ala Ile Gly Leu Pro Phe Leu Ser
Lys145 150 155 160 Asn Ala Trp Gly Tyr Leu Ser Arg Ala Phe Glu Leu
Ser Arg Gln Phe 165 170 175 Met Phe Lys Trp Thr Val Asn Trp Arg Phe
Val Gly Glu Glu Thr Phe 180 185 190 Leu Ser Lys Pro Phe Ala Ile Thr
Leu Leu Ala Leu His Ala Ser Val 195 200 205 Leu Leu Ala Phe Val Thr
Lys Arg Trp Leu Lys Pro Ala Ser Lys Ser 210 215 220 Ile Gly Gly Leu
Ile Ala Pro Leu Leu Ser Gly Arg Pro Ile Phe Thr225 230 235 240 Ala
Glu Glu Ala Gln Thr Ala Ala Arg Ala Val Thr Pro Glu Tyr Val 245 250
255 Met Thr Thr Met Leu Thr Ala Asn Ile Val Gly Met Leu Phe Ala Arg
260 265 270 Ser Leu His Tyr Gln Phe Tyr Ala Tyr Leu Ala Trp Ser Thr
Pro Tyr 275 280 285 Leu Leu Trp Arg Ser Gly Ile His Pro Leu Leu Gln
Trp Gly Leu Trp 290 295 300 Ala Leu Gln Glu Trp Ala Trp Asn Val Tyr
Pro Ser Thr Pro Val Ser305 310 315 320 Ser Gly Val Val Val Gly Val
Met Ala Ile Thr Val Gly Ala Val Met 325 330 335 Val Gly Ala Lys Ala
Glu Phe Arg Pro Gln Val Pro Val Ala Lys Lys 340 345 350 Val Glu Ala
Lys Arg 355
137 406 PRT Schizosaccharomyces pombe
137 Met Ser Ser Val
Glu Thr Arg Asn Ser Phe Asn Pro Phe Arg Val Leu 1 5 10 15 Phe Asp
Leu Gly Ser Tyr Gly Trp Leu His Pro Ser Arg Leu Leu Leu 20 25 30
Leu Glu Ile Pro Phe Val Phe Ala Ile Ile Ser Lys Val Pro Tyr Thr 35
40 45 Glu Ile Asp Trp Ile Ala Tyr Met Glu Gln Val Asn Ser Phe Leu
Leu 50 55 60 Gly Glu Arg Asp Tyr Lys Ser Leu Val Gly Cys Thr Gly
Pro Leu Val65 70 75 80 Tyr Pro Gly Gly His Val Phe Leu Tyr Thr Leu
Leu Tyr Tyr Leu Thr 85 90 95 Asp Gly Gly Thr Asn Ile Val Arg Ala
Gln Tyr Ile Phe Ala Phe Val 100 105 110 Tyr Trp Ile Thr Thr Ala Ile
Val Gly Tyr Leu Phe Lys Ile Val Arg 115 120 125 Ala Pro Phe Tyr Ile
Tyr Val Leu Leu Ile Leu Ser Lys Arg Leu His 130 135 140 Ser Ile Phe
Ile Leu Arg Leu Phe Asn Asp Gly Phe Asn Ser Leu Phe145 150 155 160
Ser Ser Leu Phe Ile Leu Ser Ser Cys Lys Lys Lys Trp Val Arg Ala 165
170 175 Ser Ile Leu Leu Ser Val Ala Cys Ser Val Lys Met Ser Ser Leu
Leu 180 185 190 Tyr Val Pro Ala Tyr Leu Val Leu Leu Leu Gln Ile Leu
Gly Pro Lys 195 200 205 Lys Thr Trp Met His Ile Phe Val Ile Ile Ile
Val Gln Ile Leu Phe 210 215 220 Ser Ile Pro Phe Leu Ala Tyr Phe Trp
Ser Tyr Trp Thr Gln Ala Phe225 230 235 240 Asp Phe Gly Arg Ala Phe
Asp Tyr Lys Trp Thr Val Asn Trp Arg Phe 245 250 255 Ile Pro Arg Ser
Ile Phe Glu Ser Thr Ser Phe Ser Thr Ser Ile Leu 260 265 270 Phe Leu
His Val Ala Leu Leu Val Ala Phe Thr Cys Lys His Trp Asn 275 280 285
Lys Leu Ser Arg Ala Thr Pro Phe Ala Met Val Asn Ser Met Leu Thr 290
295 300 Leu Lys Pro Leu Pro Lys Leu Gln Leu Ala Thr Pro Asn Phe Ile
Phe305 310 315 320 Thr Ala Leu Ala Thr Ser Asn Leu Ile Gly Ile Leu
Cys Ala Arg Ser 325 330 335 Leu His Tyr Gln Phe Tyr Ala Trp Phe Ala
Trp Tyr Ser Pro Tyr Leu 340 345 350 Cys Tyr Gln Ala Ser Phe Pro Ala
Pro Ile Val Ile Gly Leu Trp Met 355 360 365 Leu Gln Glu Tyr Ala Trp
Asn Val Phe Pro Ser Thr Lys Leu Ser Ser 370 375 380 Leu Ile Ala Val
Cys Val Pro Leu Ile Thr Ile Leu Lys Leu Tyr Thr385 390 395 400 Ser
Asp Tyr Arg Lys Pro 405
138 30 DNA Artificial Sequence Synthesized Construct
138 ggaggtgggg gcagtggagg tggcggcagt 30
139 2460 DNA Artificial Sequence Synthesized Construct
139 atgcgcttcc
gaatctacaa gcggaaggtc ctcattctga cccttgtcgt ggccgcttgc 60ggctttgttc
tctggtccag caacggtcgc cagcgtaaga acgaggccct ggcgcctccc
120ctcttggacg ccgaaccggc cagaggcgca ggtggcaggg gaggggatca
cccctcggtc 180gctgtcggca tccgccgcgt cagcaatgtg tccgccgcct
ctctggtccc ggcggttccg 240cagcctgagg cagacaacct cacgctgcgc
taccgatcac tcgtgtatca acttaacttc 300gaccagactc tgcggaacgt
cgacaaggcc ggaacctggg ctccgcgtga gttggtcctc 360gtcgttcagg
tgcacaacag gcccgagtac ctccgcctcc tgctggattc gcttcgaaag
420gcccagggca tcgacaacgt cctggtgatt ttcagccatg acttttggtc
cacagagatc 480aatcagctca ttgcgggtgt caacttttgc cccgtcttgc
aagttttctt ccctttctct 540atccaactct accccaacga gttcccgggc
agtgaccccc gcgactgtcc tcgggatctg 600ccaaaaaacg ccgctctcaa
gctgggctgc atcaacgccg aataccccga cagctttggc 660cactatcgcg
aggccaagtt ctcgcagacg aagcaccact ggtggtggaa gctccatttt
720gtctgggagc gagtgaagat ccttcgtgat tacgcaggac tcattctgtt
cttggaagag 780gaccactacc tggccccgga cttctaccac gtctttaaga
agatgtggaa gctcaagcag 840caggaatgcc ccgagtgcga cgttctgtcc
cttggcacct atagcgcgtc ccgctcgttc 900tacggtatgg ctgacaaggt
cgatgtgaaa acctggaagt caactgagca caatatgggc 960ctcgccctga
cgaggaacgc ctaccagaaa ctcatcgagt gtaccgacac cttctgcacg
1020tacgacgact ataactggga ttggacactg cagtacttga ctgtcagctg
cctccctaag 1080ttttggaagg tccttgttcc ccagatcccg agaattttcc
atgctggcga ctgcgggatg 1140caccacaaga aaacctgtcg cccatccacg
cagtctgccc aaatcgagtc gctcctgaac 1200aacaacaagc agtacatgtt
ccccgagaca ctgaccatta gcgagaagtt tacggtcgtg 1260gcgatctccc
cgcctcgaaa gaatggcggc tggggtgaca tccgcgatca cgagctgtgc
1320aagtcttacc gccggctcca gggaggtggg ggcagtggag gtggcggcag
tgggagggtg 1380cccaccgccg cccctcccgc ccagccgcgt gtgcctgtga
cccccgcgcc ggcggtgatt 1440cccatcctgg tcatcgcctg tgaccgcagc
actgttcggc gctgcctgga caagctgctg 1500cattatcggc cctcggctga
gctcttcccc atcatcgtca gccaggactg cgggcacgag 1560gagacggccc
aggccatcgc ctcctacggc agcgcggtca cgcacatccg gcagcccgac
1620ctgagcagca ttgcggtgcc gccggaccac cgcaagttcc agggctacta
caagatcgcg 1680cgccactacc gctgggcgct gggccaggtc ttccggcagt
ttcgcttccc cgccgccgtg 1740gtggtggagg atgacctgga ggtggccccg
gacttcttcg agtactttcg ggccacctat 1800ccgctgctga aggccgaccc
ctccctgtgg tgcgtctcgg cctggaatga caacggcaag 1860gagcagatgg
tggacgccag caggcctgag ctgctctacc gcaccgactt tttccctggc
1920ctgggctggc tgctgttggc cgagctctgg gctgagctgg agcccaagtg
gccaaaggcc 1980ttctgggacg actggatgcg gcggccggag cagcggcagg
ggcgggcctg catccgccct 2040gagatctcaa gaacgatgac ctttggccgc
aagggtgtga gccacgggca gttctttgac 2100cagcacctca agttcatcaa
gctgaaccag cagtttgtgc acttcaccca gctggacctg 2160tcttacctgc
agcgggaggc ctatgaccga gatttcctcg cccgcgtcta cggtgctccc
2220cagctgcagg tggagaaagt gaggaccaat gaccggaagg agctggggga
ggtgcgggtg 2280cagtacacgg gcagggacag cttcaaggct ttcgccaagg
ctctgggtgt catggatgac 2340ctcaagtcgg gggttccgag agctggctac
cggggcattg tcaccttcca gttccggggc 2400cgccgtgtcc acctggcgcc
cccaccgacg tgggagggct atgatcccag ctggaattag 2460
140 45 DNA Artificial Sequence Synthesized Construct
140 ggaggtgggg gcagtggagg tggcggcagt ggcggcggtg gaagt 45
141 2475 DNA Artificial Sequence Synthesized Construct
141 atgcgcttcc gaatctacaa gcggaaggtc ctcattctga cccttgtcgt
ggccgcttgc 60ggctttgttc tctggtccag caacggtcgc cagcgtaaga acgaggccct
ggcgcctccc 120ctcttggacg ccgaaccggc cagaggcgca ggtggcaggg
gaggggatca cccctcggtc 180gctgtcggca tccgccgcgt cagcaatgtg
tccgccgcct ctctggtccc ggcggttccg 240cagcctgagg cagacaacct
cacgctgcgc taccgatcac tcgtgtatca acttaacttc 300gaccagactc
tgcggaacgt cgacaaggcc ggaacctggg ctccgcgtga gttggtcctc
360gtcgttcagg tgcacaacag gcccgagtac ctccgcctcc tgctggattc
gcttcgaaag 420gcccagggca tcgacaacgt cctggtgatt ttcagccatg
acttttggtc cacagagatc 480aatcagctca ttgcgggtgt caacttttgc
cccgtcttgc aagttttctt ccctttctct 540atccaactct accccaacga
gttcccgggc agtgaccccc gcgactgtcc tcgggatctg 600ccaaaaaacg
ccgctctcaa gctgggctgc atcaacgccg aataccccga cagctttggc
660cactatcgcg aggccaagtt ctcgcagacg aagcaccact ggtggtggaa
gctccatttt 720gtctgggagc gagtgaagat ccttcgtgat tacgcaggac
tcattctgtt cttggaagag 780gaccactacc tggccccgga cttctaccac
gtctttaaga agatgtggaa gctcaagcag 840caggaatgcc ccgagtgcga
cgttctgtcc cttggcacct atagcgcgtc ccgctcgttc 900tacggtatgg
ctgacaaggt cgatgtgaaa acctggaagt caactgagca caatatgggc
960ctcgccctga cgaggaacgc ctaccagaaa ctcatcgagt gtaccgacac
cttctgcacg 1020tacgacgact ataactggga ttggacactg cagtacttga
ctgtcagctg cctccctaag 1080ttttggaagg tccttgttcc ccagatcccg
agaattttcc atgctggcga ctgcgggatg 1140caccacaaga aaacctgtcg
cccatccacg cagtctgccc aaatcgagtc gctcctgaac 1200aacaacaagc
agtacatgtt ccccgagaca ctgaccatta gcgagaagtt tacggtcgtg
1260gcgatctccc cgcctcgaaa gaatggcggc tggggtgaca tccgcgatca
cgagctgtgc 1320aagtcttacc gccggctcca gggaggtggg ggcagtggag
gtggcggcag tggaggtggc 1380ggcagtggga gggtgcccac cgccgcccct
cccgcccagc cgcgtgtgcc tgtgaccccc 1440gcgccggcgg tgattcccat
cctggtcatc gcctgtgacc gcagcactgt tcggcgctgc 1500ctggacaagc
tgctgcatta tcggccctcg gctgagctct tccccatcat cgtcagccag
1560gactgcgggc acgaggagac ggcccaggcc atcgcctcct acggcagcgc
ggtcacgcac 1620atccggcagc ccgacctgag cagcattgcg gtgccgccgg
accaccgcaa gttccagggc 1680tactacaaga tcgcgcgcca ctaccgctgg
gcgctgggcc aggtcttccg gcagtttcgc 1740ttccccgccg ccgtggtggt
ggaggatgac ctggaggtgg ccccggactt cttcgagtac 1800tttcgggcca
cctatccgct gctgaaggcc gacccctccc tgtggtgcgt ctcggcctgg
1860aatgacaacg gcaaggagca gatggtggac gccagcaggc ctgagctgct
ctaccgcacc 1920gactttttcc ctggcctggg ctggctgctg ttggccgagc
tctgggctga gctggagccc 1980aagtggccaa aggccttctg ggacgactgg
atgcggcggc cggagcagcg gcaggggcgg 2040gcctgcatcc gccctgagat
ctcaagaacg atgacctttg gccgcaaggg tgtgagccac 2100gggcagttct
ttgaccagca cctcaagttc atcaagctga accagcagtt tgtgcacttc
2160acccagctgg acctgtctta cctgcagcgg gaggcctatg accgagattt
cctcgcccgc 2220gtctacggtg ctccccagct gcaggtggag aaagtgagga
ccaatgaccg gaaggagctg 2280ggggaggtgc gggtgcagta cacgggcagg
gacagcttca aggctttcgc caaggctctg 2340ggtgtcatgg atgacctcaa
gtcgggggtt ccgagagctg gctaccgggg cattgtcacc 2400ttccagttcc
ggggccgccg tgtccacctg gcgcccccac cgacgtggga gggctatgat
2460cccagctgga attag 2475
142 57 DNA Trichoderma reesei
142 agcaccggca accctagcgg cggcaaccct cccggcggaa acccgcctgg cagcacc 57
143 2487 DNA Artificial Sequence Synthesized Construct
143 atgcgcttcc
gaatctacaa gcggaaggtc ctcattctga cccttgtcgt ggccgcttgc 60ggctttgttc
tctggtccag caacggtcgc cagcgtaaga acgaggccct ggcgcctccc
120ctcttggacg ccgaaccggc cagaggcgca ggtggcaggg gaggggatca
cccctcggtc 180gctgtcggca tccgccgcgt cagcaatgtg tccgccgcct
ctctggtccc ggcggttccg 240cagcctgagg cagacaacct cacgctgcgc
taccgatcac tcgtgtatca acttaacttc 300gaccagactc tgcggaacgt
cgacaaggcc ggaacctggg ctccgcgtga gttggtcctc 360gtcgttcagg
tgcacaacag gcccgagtac ctccgcctcc tgctggattc gcttcgaaag
420gcccagggca tcgacaacgt cctggtgatt ttcagccatg acttttggtc
cacagagatc 480aatcagctca ttgcgggtgt caacttttgc cccgtcttgc
aagttttctt ccctttctct 540atccaactct accccaacga gttcccgggc
agtgaccccc gcgactgtcc tcgggatctg 600ccaaaaaacg ccgctctcaa
gctgggctgc atcaacgccg aataccccga cagctttggc 660cactatcgcg
aggccaagtt ctcgcagacg aagcaccact ggtggtggaa gctccatttt
720gtctgggagc gagtgaagat ccttcgtgat tacgcaggac tcattctgtt
cttggaagag 780gaccactacc tggccccgga cttctaccac gtctttaaga
agatgtggaa gctcaagcag 840caggaatgcc ccgagtgcga cgttctgtcc
cttggcacct atagcgcgtc ccgctcgttc 900tacggtatgg ctgacaaggt
cgatgtgaaa acctggaagt caactgagca caatatgggc 960ctcgccctga
cgaggaacgc ctaccagaaa ctcatcgagt gtaccgacac cttctgcacg
1020tacgacgact ataactggga ttggacactg cagtacttga ctgtcagctg
cctccctaag 1080ttttggaagg tccttgttcc ccagatcccg agaattttcc
atgctggcga ctgcgggatg 1140caccacaaga aaacctgtcg cccatccacg
cagtctgccc aaatcgagtc gctcctgaac 1200aacaacaagc agtacatgtt
ccccgagaca ctgaccatta gcgagaagtt tacggtcgtg 1260gcgatctccc
cgcctcgaaa gaatggcggc tggggtgaca tccgcgatca cgagctgtgc
1320aagtcttacc gccggctcca gagcaccggc aaccctagcg gcggcaaccc
tcccggcgga 1380aacccgcctg gcagcaccgg gagggtgccc accgccgccc
ctcccgccca gccgcgtgtg 1440cctgtgaccc ccgcgccggc ggtgattccc
atcctggtca tcgcctgtga ccgcagcact 1500gttcggcgct gcctggacaa
gctgctgcat tatcggccct cggctgagct cttccccatc 1560atcgtcagcc
aggactgcgg gcacgaggag acggcccagg ccatcgcctc ctacggcagc
1620gcggtcacgc acatccggca gcccgacctg agcagcattg cggtgccgcc
ggaccaccgc 1680aagttccagg gctactacaa gatcgcgcgc cactaccgct
gggcgctggg ccaggtcttc 1740cggcagtttc gcttccccgc cgccgtggtg
gtggaggatg acctggaggt ggccccggac 1800ttcttcgagt actttcgggc
cacctatccg ctgctgaagg ccgacccctc cctgtggtgc 1860gtctcggcct
ggaatgacaa cggcaaggag cagatggtgg acgccagcag gcctgagctg
1920ctctaccgca ccgacttttt ccctggcctg ggctggctgc tgttggccga
gctctgggct 1980gagctggagc ccaagtggcc aaaggccttc tgggacgact
ggatgcggcg gccggagcag 2040cggcaggggc gggcctgcat ccgccctgag
atctcaagaa cgatgacctt tggccgcaag 2100ggtgtgagcc acgggcagtt
ctttgaccag cacctcaagt tcatcaagct gaaccagcag 2160tttgtgcact
tcacccagct ggacctgtct tacctgcagc gggaggccta tgaccgagat
2220ttcctcgccc gcgtctacgg tgctccccag ctgcaggtgg agaaagtgag
gaccaatgac 2280cggaaggagc tgggggaggt gcgggtgcag tacacgggca
gggacagctt caaggctttc 2340gccaaggctc tgggtgtcat ggatgacctc
aagtcggggg ttccgagagc tggctaccgg 2400ggcattgtca ccttccagtt
ccggggccgc cgtgtccacc tggcgccccc accgacgtgg 2460gagggctatg
atcccagctg gaattag 2487
144 65 DNA Trichoderma reesei
144 agctccgccg cgacggccac cgccagcgcc actgttcctg gaggcggtag cggccccacc 60
agcgg 65
145 2493 DNA Artificial Sequence Synthesized Construct
145 atgcgcttcc
gaatctacaa gcggaaggtc ctcattctga cccttgtcgt ggccgcttgc 60ggctttgttc
tctggtccag caacggtcgc cagcgtaaga acgaggccct ggcgcctccc
120ctcttggacg ccgaaccggc cagaggcgca ggtggcaggg gaggggatca
cccctcggtc 180gctgtcggca tccgccgcgt cagcaatgtg tccgccgcct
ctctggtccc ggcggttccg 240cagcctgagg cagacaacct cacgctgcgc
taccgatcac tcgtgtatca acttaacttc 300gaccagactc tgcggaacgt
cgacaaggcc ggaacctggg ctccgcgtga gttggtcctc 360gtcgttcagg
tgcacaacag gcccgagtac ctccgcctcc tgctggattc gcttcgaaag
420gcccagggca tcgacaacgt cctggtgatt ttcagccatg acttttggtc
cacagagatc 480aatcagctca ttgcgggtgt caacttttgc cccgtcttgc
aagttttctt ccctttctct 540atccaactct accccaacga gttcccgggc
agtgaccccc gcgactgtcc tcgggatctg 600ccaaaaaacg ccgctctcaa
gctgggctgc atcaacgccg aataccccga cagctttggc 660cactatcgcg
aggccaagtt ctcgcagacg aagcaccact ggtggtggaa gctccatttt
720gtctgggagc gagtgaagat ccttcgtgat tacgcaggac tcattctgtt
cttggaagag 780gaccactacc tggccccgga cttctaccac gtctttaaga
agatgtggaa gctcaagcag 840caggaatgcc ccgagtgcga cgttctgtcc
cttggcacct atagcgcgtc ccgctcgttc 900tacggtatgg ctgacaaggt
cgatgtgaaa acctggaagt caactgagca caatatgggc 960ctcgccctga
cgaggaacgc ctaccagaaa ctcatcgagt gtaccgacac cttctgcacg
1020tacgacgact ataactggga ttggacactg cagtacttga ctgtcagctg
cctccctaag 1080ttttggaagg tccttgttcc ccagatcccg agaattttcc
atgctggcga ctgcgggatg 1140caccacaaga aaacctgtcg cccatccacg
cagtctgccc aaatcgagtc gctcctgaac 1200aacaacaagc agtacatgtt
ccccgagaca ctgaccatta gcgagaagtt tacggtcgtg 1260gcgatctccc
cgcctcgaaa gaatggcggc tggggtgaca tccgcgatca cgagctgtgc
1320aagtcttacc gccggctcca gagctccgcc gcgacggcca ccgccagcgc
cactgttcct 1380ggaggcggta gcggcccgac cagcgggagg gtgcccaccg
ccgcccctcc cgcccagccg 1440cgtgtgcctg tgacccccgc gccggcggtg
attcccatcc tggtcatcgc ctgtgaccgc 1500agcactgttc ggcgctgcct
ggacaagctg ctgcattatc ggccctcggc tgagctcttc 1560cccatcatcg
tcagccagga ctgcgggcac gaggagacgg cccaggccat cgcctcctac
1620ggcagcgcgg tcacgcacat ccggcagccc gacctgagca gcattgcggt
gccgccggac 1680caccgcaagt tccagggcta ctacaagatc gcgcgccact
accgctgggc gctgggccag 1740gtcttccggc agtttcgctt ccccgccgcc
gtggtggtgg aggatgacct ggaggtggcc 1800ccggacttct tcgagtactt
tcgggccacc tatccgctgc tgaaggccga cccctccctg 1860tggtgcgtct
cggcctggaa tgacaacggc aaggagcaga tggtggacgc cagcaggcct
1920gagctgctct accgcaccga ctttttccct ggcctgggct ggctgctgtt
ggccgagctc 1980tgggctgagc tggagcccaa gtggccaaag gccttctggg
acgactggat gcggcggccg 2040gagcagcggc aggggcgggc ctgcatccgc
cctgagatct caagaacgat gacctttggc 2100cgcaagggtg tgagccacgg
gcagttcttt gaccagcacc tcaagttcat caagctgaac 2160cagcagtttg
tgcacttcac ccagctggac ctgtcttacc tgcagcggga ggcctatgac
2220cgagatttcc tcgcccgcgt ctacggtgct ccccagctgc aggtggagaa
agtgaggacc 2280aatgaccgga aggagctggg ggaggtgcgg gtgcagtaca
cgggcaggga cagcttcaag 2340gctttcgcca aggctctggg tgtcatggat
gacctcaagt cgggggttcc gagagctggc 2400taccggggca ttgtcacctt
ccagttccgg ggccgccgtg tccacctggc gcccccaccg 2460acgtgggagg
gctatgatcc cagctggaat tag 2493
146 45 DNA Artificial Sequence Synthesized Construct
146 ggtaccgggc ccactgcgca tcatgcgctt ccgaatctac aagcg 45
147 38 DNA Artificial Sequence Synthesized Construct
147 ggcgcgccac tagtctaatt ccagctggga tcatagcc 38