Exo-endo Cellulase Fusion Protein Bower; Benjamin S. ; et al. [Danisco US Inc.]

Patent Application Summary

U.S. patent application number 13/290712, for an exo-endo cellulase fusion protein, was filed with the patent office on 2011-11-07 and published on 2012-05-31. This patent application is currently assigned to Danisco US Inc. The invention is credited to Benjamin S. Bower, Edmund A. Larenas, and Colin Mitchinson.

Publication Number	20120135499
Application Number	13/290712
Family ID	34968421
Publication Date	2012-05-31

United States Patent Application	20120135499
Kind Code	A1
Bower; Benjamin S. ; et al.	May 31, 2012

EXO-ENDO CELLULASE FUSION PROTEIN

Abstract

The present invention relates to a heterologous exo-endo cellulase fusion construct, which encodes a fusion protein having cellulolytic activity comprising a catalytic domain derived from a fungal exo-cellobiohydrolase and a catalytic domain derived from an endoglucanase. The invention also relates to vectors and fungal host cells comprising the heterologous exo-endo cellulase fusion construct as well as methods for producing a cellulase fusion protein and enzymatic cellulase compositions.

Inventors:	Bower; Benjamin S.; (Newark, CA) ; Larenas; Edmund A.; (Moss Beach, CA) ; Mitchinson; Colin; (Half Moon Bay, CA)
Assignee:	Danisco US Inc. Palo Alto CA
Family ID:	34968421
Appl. No.:	13/290712
Filed:	November 7, 2011

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
11088306	Mar 23, 2005	8097445
13290712
60556598	Mar 25, 2004

Current U.S. Class:	435/209 ; 435/254.11; 435/254.6; 435/320.1; 536/23.2
Current CPC Class:	C12P 21/02 20130101; C12P 21/06 20130101; C12N 9/2437 20130101; C07K 2319/50 20130101; C12Y 302/01091 20130101; C12N 15/80 20130101; C07K 2319/00 20130101; C12N 15/625 20130101; C12Y 302/01004 20130101
Class at Publication:	435/209 ; 536/23.2; 435/320.1; 435/254.11; 435/254.6
International Class:	C12N 9/42 20060101 C12N009/42; C12N 15/63 20060101 C12N015/63; C12N 1/15 20060101 C12N001/15; C12N 15/62 20060101 C12N015/62

Government Interests

SPONSORED RESEARCH AND DEVELOPMENT

[0002] Portions of this work were funded by Subcontract No. ZC0-0-30017-01 with the National Renewable Energy Laboratory under Prime Contract No. DE-AC36-99GO10337 with the United States Department of Energy. Accordingly, the United States Government may have certain rights in the invention.

Claims

1. A heterologous exo-endo cellulase fusion construct comprising in operable linkage from the 5' end of said molecule a. a DNA molecule encoding a signal sequence; b. a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase; and c. a DNA molecule encoding an endoglucanase catalytic domain.

2. The heterologous exo-endo cellulase fusion construct according to claim 1 further comprising a linker sequence located 3' of the catalytic domain of the exo-cellobiohydrolase and 5' of the catalytic domain of the endoglucanase.

3. The heterologous exo-endo cellulase fusion construct according to claim 1 further comprising a kexin site located after the linker sequence and before the coding region of the endoglucanase catalytic domain.

4. The heterologous exo-endo fusion construct according to claim 1 further comprising a promoter of a filamentous fungus secretable protein, said promoter located in operable linkage 5' of the coding region of the exo-cellobiohydrolase catalytic domain.

5. The heterologous exo-endo fusion construct according to claim 4 wherein the promoter is a cbh promoter.

6. The heterologous exo-endo fusion construct according to claim 5 wherein the promoter is a cbh1 promoter derived from T. reesei.

7. The heterologous exo-endo fusion construct according to claim 1 wherein the exo-cellobiohydrolase is a CBH1.

8. The heterologous exo-endo fusion construct according to claim 7 wherein said CBH1 comprises an amino acid sequence of at least 90% sequence identity with the sequence set forth in SEQ ID NO.: 6.

9. The heterologous exo-endo fusion construct according to claim 1 wherein the endoglucanase catalytic domain is derived from a bacterial endoglucanase.

10. The heterologous exo-endo fusion construct according to claim 9 wherein the bacterial endoglucanase catalytic domain is selected from the group consisting of an Acidothermus cellulolyticus GH5A endoglucanase I (E1) catalytic domain; an Acidothermus cellulolyticus GH74 endoglucanase (GH74-EG) catalytic domain; and a Thermobifida fusca E5 endoglucanase (Tf-E5) catalytic domain.

11. The heterologous exo-endo fusion construct according to claim 10 wherein the endoglucanase is an Acidothermus cellulolyticus GH5A E1 catalytic domain.

12. The heterologous exo-endo fusion construct according to claim 10 wherein the Acidothermus cellulolyticus GH5A E1 catalytic domain has an amino acid sequence of at least 90% sequence identity with the sequence set forth in SEQ ID NO. 8.

13. The heterologous exo-endo fusion construct according to claim 1 further comprising a terminator sequence located 3' to the endoglucanase catalytic domain.

14. The heterologous exo-endo fusion construct according to claim 1 further comprising a selectable marker.

15. A vector comprising in operable linkage a promoter of a filamentous fungus secretable protein, a DNA molecule encoding a signal sequence, a DNA molecule encoding a catalytic domain of a fungal exo-cellobiohydrolase, a DNA molecule encoding a catalytic domain of an endoglucanase, and a terminator.

16. The vector according to claim 15 further comprising a selectable marker.

17. The vector according to claim 15 further comprising a linker located 3' of the exo-cellobiohydrolase (CBH) catalytic domain and 5' of the EG catalytic domain.

18. The vector according to claim 15 further comprising a kexin site.

19. The vector according to claim 15 wherein the catalytic domain of the endoglucanase is derived from a bacterial endoglucanase.

20. A fungal host cell transformed with a heterologous exo-endo cellulase fusion construct according to claim 1.

21. A fungal host cell transformed with a vector according to claim 15.

22. A recombinant fungal cell comprising the heterologous exo-endo cellulase fusion construct according to claim 1.

23. A recombinant fungal cell comprising a vector according to claim 15.

24. The recombinant fungal cell according to claim 22 wherein the fungal host cell is a Trichoderma host cell.

25. The recombinant fungal cell according to claim 22 wherein the fungal host cell is a strain of T. reesei.

26. The recombinant fungal cell according to claim 22 wherein at least one gene selected from the group consisting of the cbh1, cbh2, egl1 and egl2 has been deleted from the fungal cells.

27. An isolated cellulase fusion protein having cellulolytic activity which comprises an exo-cellobiohydrolase catalytic domain and an endoglucanase catalytic domain.

28. The isolated cellulase fusion protein according to claim 27 wherein the exo-cellobiohydrolase is a CBH1.

29. The isolated cellulase fusion protein according to claim 27 wherein the catalytic domain of the endoglucanase is derived from a bacterial cell.

30. The isolated cellulase fusion protein according to claim 29 wherein the bacterial cell is a strain of Acidothermus cellulolyticus.

31. A cellulolytic composition comprising the isolated cellulase fusion protein according to claim 29.

32. A method of producing an enzyme having cellulolytic activity comprising, a) stably transforming a filamentous fungal host cell with a heterologous exo-endo cellulase fusion construct according to claim 1; b) cultivating the transformed fungal host cell under conditions suitable for said fungal host cell to produce an enzyme having cellulolytic activity; and c) recovering said enzyme.

33. The method according to claim 32 wherein the filamentous fungal host cell is a Trichoderma cell.

34. The method according to claim 32 wherein the filamentous fungal host cell is a T. reesei host cell.

35. The method according to claim 32 wherein the exo-cellobiohydrolase is a CBH1 and the endoglucanase is selected from the group consisting of an Acidothermus cellulolyticus endoglucanase and a Thermobifida fusca endoglucanase.

36. The method according to claim 32 wherein the recovered enzyme is selected from the group consisting of a cellulase fusion protein, components of the cellulase fusion protein, and a combination of the cellulase fusion protein and the components thereof.

37. The method according to claim 32 wherein the recovered enzyme(s) is purified.

38. A Trichoderma host cell which expresses a cellulase fusion protein, wherein said fusion protein comprises a catalytic domain of an exo-cellobiohydrolase and a catalytic domain of an endoglucanase.

39. The Trichoderma host cell according to claim 38 wherein the host cell is a T. reesei cell.

40. The Trichoderma host cell according to claim 38 wherein the exo-cellobiohydrolase is a CBH1 and the endoglucanase is an Acidothermus cellulolyticus endoglucanase.

41. The Trichoderma host cell according to claim 38 wherein the endoglucanase is either an Acidothermus cellulolyticus E1 or GH74 endoglucanase.

42. The Trichoderma host cell according to claim 38 wherein at least one gene selected from the group consisting of the cbh1, cbh2, egl1 and egl2 has been deleted from the host cell.

43. A fungal cellulase composition comprising a cellulase fusion protein or components thereof, wherein the fusion protein or components thereof is the product of a recombinant Trichoderma spp. according to claim 38.

44. A fungal cellulase composition according to claim 43 wherein the cellulase fusion protein is a CBH1-Acidothermus cellulolyticus E1 fusion protein and the components are the cleaved products, CBH1 and Acidothermus E1.

Description

RELATED APPLICATIONS

[0001] The present application claims priority to U.S. Provisional Patent Application Ser. No. 60/556,598, entitled "Exo-Endo Cellulase Fusion Protein" and filed on Mar. 25, 2004.

FIELD OF THE INVENTION

[0003] The present invention relates to a heterologous exo-endo cellulase fusion construct, which encodes a fusion protein having cellulolytic activity comprising a catalytic domain derived from a fungal exo-cellobiohydrolase and a catalytic domain derived from an endoglucanase. The invention also relates to vectors and fungal host cells comprising the heterologous exo-endo cellulase fusion construct as well as methods for producing a cellulase fusion protein and enzymatic cellulase compositions.

BACKGROUND OF THE INVENTION

[0004] Cellulose and hemicellulose are the most abundant plant materials produced by photosynthesis. They can be degraded and used as an energy source by numerous microorganisms, including bacteria, yeast and fungi, which produce extracellular enzymes capable of hydrolysis of the polymeric substrates to monomeric sugars (Aro et al., 2001). As the limits of non-renewable resources approach, the potential of cellulose to become a major renewable energy resource is enormous (Krishna et al., 2001). The effective utilization of cellulose through biological processes is one approach to overcoming the shortage of foods, feeds, and fuels (Ohmiya et al., 1997).

[0005] Cellulases are enzymes that hydrolyze cellulose (beta-1,4-glucan or beta D-glucosidic linkages) resulting in the formation of glucose, cellobiose, cellooligosaccharides, and the like. Cellulases have been traditionally divided into three major classes: endoglucanases (EC 3.2.1.4) ("EG"), exoglucanases or cellobiohydrolases (EC 3.2.1.91) ("CBH") and beta-glucosidases ((beta)-D-glucoside glucohydrolase; EC 3.2.1.21) ("BG") (Knowles et al., 1987 and Schulein, 1988). Endoglucanases act mainly on the amorphous parts of the cellulose fiber, whereas cellobiohydrolases are also able to degrade crystalline cellulose.

[0006] Cellulases are known to be produced by a large number of bacteria, yeast and fungi. Certain fungi produce a complete cellulase system capable of degrading crystalline forms of cellulose, such that the cellulases are readily produced in large quantities via fermentation.

[0007] In order to efficiently convert crystalline cellulose to glucose, the complete cellulase system comprising components from each of the CBH, EG and BG classifications is required, with isolated components less effective in hydrolyzing crystalline cellulose (Filho et al., 1996). In particular, EG-type cellulases and CBH-type cellulases interact to degrade cellulose more efficiently than either enzyme used alone (Wood, 1985; Baker et al., 1994; and Nieves et al., 1995).

[0008] Additionally, cellulases are known in the art to be useful in the treatment of textiles for the purposes of enhancing the cleaning ability of detergent compositions, for use as a softening agent, for improving the feel and appearance of cotton fabrics, and the like (Kumar et al., 1997). Cellulase-containing detergent compositions with improved cleaning performance (U.S. Pat. No. 4,435,307; GB App. Nos. 2,095,275 and 2,094,826) and for use in the treatment of fabric to improve the feel and appearance of the textile (U.S. Pat. Nos. 5,648,263, 5,691,178, and 5,776,757, and GB App. No. 1,358,599), have been described in the literature.

[0009] Hence, cellulases produced in fungi and bacteria have received significant attention. In particular, fermentation of Trichoderma spp. (e.g., Trichoderma longibrachiatum or Trichoderma reesei) has been shown to produce a complete cellulase system capable of degrading crystalline forms of cellulose. Over the years, Trichoderma cellulase production has been improved by classical mutagenesis, screening, selection and development of highly refined, large scale inexpensive fermentation conditions. While the multi-component cellulase system of Trichoderma spp. is able to hydrolyze cellulose to glucose, there are cellulases from other microorganisms, particularly bacterial strains, with different properties for efficient cellulose hydrolysis, and it would be advantageous to express these proteins in a filamentous fungus for industrial scale cellulase production. However, the results of many studies demonstrate that the yield of bacterial enzymes from filamentous fungi is low (Jeeves et al., 1991).

[0010] In this invention, a heterologous exo-endo cellulase fusion construct, which includes the coding region of a fungal exo-cellobiohydrolase (CBH) catalytic domain and a coding region of an endoglucanase (EG) catalytic domain, has been introduced and expressed in a filamentous fungal host cell to increase the yield and effectiveness of cellulase enzymes.

SUMMARY OF THE INVENTION

[0011] In a first aspect, the invention includes a heterologous exo-endo cellulase fusion construct comprising in operable linkage from the 5' end of said construct, (a) a DNA molecule encoding a signal sequence, (b) a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, and (c) a DNA molecule encoding an endoglucanase catalytic domain.

[0012] In a first embodiment of this aspect, the heterologous exo-endo cellulase fusion construct further comprises a linker sequence located 3' of the catalytic domain of the exo-cellobiohydrolase and 5' of the catalytic domain of the endoglucanase. In a second embodiment, the heterologous exo-endo cellulase fusion construct lacks the cellulose binding domain (CBD) of the exo-cellobiohydrolase. In a third embodiment, the heterologous exo-endo cellulase fusion construct further comprises a kexin site located after the linker sequence and before the coding region of the endoglucanase catalytic domain. In a fourth embodiment, the heterologous exo-endo fusion construct will comprise a promoter of a filamentous fungus secretable protein, said promoter located in operable linkage 5' of the coding region of the exo-cellobiohydrolase catalytic domain. In a fifth embodiment, the promoter is a cbh promoter and preferably a cbh1 promoter derived from T. reesei. In a sixth embodiment, the exo-cellobiohydrolase is a CBH1 and particularly a CBH1 having an amino acid sequence of at least 90% sequence identity with the sequence set forth in SEQ ID NO.: 6. In a seventh embodiment, the endoglucanase catalytic domain is derived from a bacterial endoglucanase. In an eighth embodiment, the bacterial endoglucanase catalytic domain is selected from the group consisting of an Acidothermus cellulolyticus GH5A endoglucanase I (E1) catalytic domain; an Acidothermus cellulolyticus GH74 endoglucanase (GH74-EG) catalytic domain; and a Thermobifida fusca E5 endoglucanase (Tf-E5) catalytic domain. In a ninth embodiment, the heterologous exo-endo cellulase fusion construct lacks the cellulose binding domain of the endoglucanase. In a tenth embodiment, the endoglucanase is an Acidothermus cellulolyticus GH5A E1 and particularly the Acidothermus cellulolyticus GH5A E1 having an amino acid sequence of at least 90% sequence identity with the sequence set forth in SEQ ID NO. 8. In an eleventh embodiment, the heterologous exo-endo cellulase fusion construct comprises a terminator sequence located 3' to the endoglucanase catalytic domain. In a twelfth embodiment, the heterologous fusion construct comprises a selectable marker.
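The "at least 90% sequence identity" criterion used in the embodiments above can be illustrated with a minimal sketch. This passage does not state how identity is calculated, so the sketch assumes the two sequences have already been aligned (gaps written as "-") and counts identical columns over all compared columns, which is only one of several conventions. The function names and the toy sequences are hypothetical and are not SEQ ID NO: 6 or SEQ ID NO: 8.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same residue.

    Assumption for this sketch: columns where both sequences have a gap
    are skipped, and a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # nothing to compare in this column
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the pair reaches the threshold (90% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue example: nine identical positions out of ten.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV"))
```

In practice the alignment itself would come from a dedicated alignment program, and the choice of denominator (alignment length, shorter sequence, or reference sequence) changes the reported percentage.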

[0013] In a second aspect, the invention includes a vector comprising in operable linkage a promoter of a filamentous fungus secretable protein, a DNA molecule encoding a signal sequence, a DNA molecule encoding a catalytic domain of a fungal exo-cellobiohydrolase, a DNA molecule encoding a catalytic domain of an endoglucanase, and a terminator. In one embodiment, the vector will further include a selectable marker.

[0014] In a second embodiment, the vector will comprise a linker located 3' of the exo-cellobiohydrolase (CBH) catalytic domain and 5' of the EG catalytic domain. In a third embodiment, the vector will lack the CBH cellulose binding domain. In a fourth embodiment, the vector will comprise a kexin site. In a fifth embodiment, the catalytic domain of the endoglucanase is derived from a bacterial endoglucanase. In a sixth embodiment, the vector lacks the cellulose binding domain of the endoglucanase.

[0015] In a third aspect, the invention includes a fungal host cell transformed with a heterologous exo-endo cellulase fusion construct or a fungal host cell transformed with a vector comprising a heterologous exo-endo cellulase fusion construct.

[0016] In a fourth aspect, the invention includes a recombinant fungal cell comprising the heterologous exo-endo cellulase fusion construct or a vector comprising the same.

[0017] In a particularly preferred embodiment of the third and fourth aspects, the fungal host cell is a Trichoderma host cell and more particularly a strain of T. reesei. In another embodiment of these aspects, native cellulase genes, such as cbh1, cbh2, egl1 and egl2 have been deleted from the fungal cells. In a third embodiment, the native cellulose binding domain has been deleted from the fungal cells.

[0018] In a fifth aspect, the invention includes an isolated cellulase fusion protein having cellulolytic activity which comprises an exo-cellobiohydrolase catalytic domain and an endoglucanase catalytic domain, wherein the exo-cellobiohydrolase lacks a cellulose binding domain. In one embodiment of this aspect, the exo-cellobiohydrolase is a CBH1. In a second embodiment, the catalytic domain of the endoglucanase is derived from a bacterial cell. In a third embodiment, the bacterial cell is a strain of Acidothermus cellulolyticus. In a fourth embodiment, the invention concerns a cellulolytic composition comprising the isolated cellulase fusion protein.

[0019] In a sixth aspect, the invention includes a method of producing an enzyme having cellulolytic activity comprising, a) stably transforming a filamentous fungal host cell with a heterologous exo-endo cellulase fusion construct or vector as defined above in the first aspect and second aspect; b) cultivating the transformed fungal host cell under conditions suitable for said fungal host cell to produce an enzyme having cellulolytic activity; and c) recovering said enzyme.

[0020] In one embodiment of this aspect, the filamentous fungal host cell is a Trichoderma cell, and particularly a T. reesei host cell. In a second embodiment, the exo-cellobiohydrolase is a CBH1 and the endoglucanase is an Acidothermus cellulolyticus endoglucanase or a Thermobifida fusca endoglucanase. In a third embodiment, the recovered enzyme is a cellulase fusion protein, components of the cellulase fusion protein, or a combination of the cellulase fusion protein and the components thereof. In a fourth embodiment, the recovered enzyme(s) is purified.

[0021] In a seventh aspect, the invention includes a Trichoderma host cell which expresses a cellulase fusion protein, wherein said fusion protein comprises a catalytic domain of an exo-cellobiohydrolase and a catalytic domain of an endoglucanase, wherein the exo-cellobiohydrolase lacks a cellulose binding domain. In one embodiment, the Trichoderma host cell is a T. reesei cell. In a second embodiment, the exo-cellobiohydrolase is a CBH1 and the endoglucanase is an Acidothermus cellulolyticus endoglucanase and particularly an Acidothermus cellulolyticus E1 or GH74 endoglucanase. In a third embodiment, the endoglucanase lacks a cellulose binding domain. In a fourth embodiment, the T. reesei host cell includes deleted native cellulase genes.

[0022] In an eighth aspect, the invention includes a fungal cellulase composition comprising a cellulase fusion protein or components thereof, wherein the fusion protein or components thereof is the product of a recombinant Trichoderma spp. In one embodiment, the cellulase fusion protein is a CBH1-Acidothermus cellulolyticus E1 fusion protein and the components are the cleaved products, CBH1 and Acidothermus cellulolyticus E1, wherein each component has cellulolytic activity.

BRIEF DESCRIPTION OF THE FIGURES

[0023] FIG. 1 is a representation of a heterologous exo-endo cellulase fusion construct encompassed by the invention, which includes a Trichoderma reesei cbh1 promoter, a cbh1 core (cbh1 signal sequence and cbh1 catalytic domain), a cbh1 linker sequence, a kexin site, an E1 core (an Acidothermus cellulolyticus E1 endoglucanase catalytic domain), a cbh1 terminator and an A. nidulans amdS selectable marker.

[0024] FIG. 2 is a DNA sequence (SEQ ID NO: 1) of the T. reesei cbh1 signal sequence (SEQ ID NO: 2); the T. reesei cbh1 catalytic domain (SEQ ID NO: 3), and the T. reesei cbh1 linker (SEQ ID NO: 4). The signal sequence is underlined, the catalytic domain is in bold, and the linker sequence is in italics.

[0025] FIG. 3 shows the predicted amino acid sequence (SEQ ID NO: 5) based on the nucleotide sequence provided in FIG. 2, wherein the signal peptide is underlined, the catalytic domain, represented by (SEQ ID NO: 6), is in bold, and the linker is in italics.

[0026] FIG. 4 is an illustration of a nucleotide sequence (SEQ ID NO: 7) encoding an Acidothermus cellulolyticus GH5A endoglucanase I (E1) catalytic domain.

[0027] FIG. 5 is the predicted amino acid sequence (SEQ ID NO: 8) of the Acidothermus cellulolyticus GH5A E1 catalytic domain based on the nucleotide sequence provided in FIG. 4.

[0028] FIGS. 6A and 6B are an illustration of a nucleotide sequence (SEQ ID NO: 9) encoding an Acidothermus cellulolyticus GH74-EG catalytic domain.

[0029] FIG. 7 is the predicted amino acid sequence (SEQ ID NO: 10) of the Acidothermus cellulolyticus GH74-EG based on the nucleotide sequence provided in FIGS. 6A and 6B.

[0030] FIG. 8 is an illustration of a nucleotide sequence (SEQ ID NO: 11) encoding the CBD, linker and catalytic domain of endoglucanase 5 (E5) of Thermobifida fusca.

[0031] FIG. 9 is the predicted amino acid sequence (SEQ ID NO: 12) of the CBD, linker and E5 based on the nucleotide sequence provided in FIG. 8.

[0032] FIG. 10 is the nucleotide sequence (2656 bases) (SEQ ID NO: 13) of a heterologous cellulase fusion construct described in example 1 comprising, the T. reesei CBH1 signal sequence; the catalytic domain of the T. reesei CBH1; the T. reesei CBH1 linker sequence; a kexin cleavage site which includes codons for the amino acids SKR and the sequence coding for the Acidothermus cellulolyticus GH5A-E1 catalytic domain.

[0033] FIG. 11 is the predicted amino acid sequence (SEQ ID NO: 14) of the cellulase fusion protein based on the nucleic acid sequence in FIG. 10.

[0034] FIG. 12 provides a schematic diagram of the pTrex4 plasmid, which was used for expression of a heterologous exo-endo cellulase fusion construct (CBH1-endoglucanase) as described in the examples and includes the Trichoderma reesei cbh1 promoter, the T. reesei CBH1 signal sequence, catalytic domain, and linker sequences, a kexin cleavage site and an endoglucanase gene of interest inserted between a SpeI and AscI site, a CBH1 Trichoderma reesei terminator and the amdS Aspergillus nidulans acetamidase marker gene.

[0035] FIGS. 13A-E provide the nucleotide sequence (SEQ ID NO:15) (10239 bp) of the pTrex4 plasmid of FIG. 12 without the catalytic domain of the EG gene of interest.

[0036] FIG. 14 illustrates a SDS-PAGE gel of supernate samples of shake flask growth of clones of a T. reesei strain deleted for the cellulases, cbh1, cbh2, egl1 and egl2 and transformed with the CBH1-E1 fusion construct. Lanes 1 and 10 represent MARK 12 Protein Standard (Invitrogen, Carlsbad, Calif.). Lanes 2-8 represent various transformants and lane 9 represents the untransformed T. reesei strain. The upper arrow indicates the cellulase fusion protein and the lower arrow indicates the cleaved E1 catalytic domain.

[0037] FIG. 15 illustrates a SDS-PAGE gel of supernate samples of shake flask growth of clones of a T. reesei strain deleted for the cellulases, cbh1, cbh2, egl1 and egl2 and transformed with the CBH1-GH74 fusion construct. Lane 1 represents the untransformed control. Lane 3 represents MARK 12 Protein Standard (Invitrogen, Carlsbad, Calif.). Lanes 2 and 4-12 represent various transformants. The upper arrow indicates the CBH1-GH74 fusion protein and the lower arrow indicates the cleaved GH74 catalytic domain.

[0038] FIG. 16 illustrates an SDS-PAGE gel of supernate samples of shake flask growth of clones of a T. reesei strain deleted for the cellulases, cbh1, cbh2, egl1 and egl2 and transformed with the CBH1-TfE5 fusion construct. Lane 1 represents MARK 12 Protein Standard (Invitrogen, Carlsbad, Calif.). Lane 2 represents the untransformed strain and lanes 3-12 represent various transformants. Arrows indicate new bands observed in the CBH1-TfE5 expressing transformants.

[0039] FIG. 17 illustrates the % cellulose conversion to soluble sugars over time for a T. reesei parent strain comprising native cellulase genes compared with a corresponding T. reesei strain which expresses the CBH1-E1 fusion protein; reference is made to example 3.

DETAILED DESCRIPTION OF THE INVENTION

[0040] The invention will now be described in detail by way of reference only using the following definitions and examples. All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference.

[0041] Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Singleton, et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY, 2D ED., John Wiley and Sons, New York (1994), and Hale & Marham, THE HARPER COLLINS DICTIONARY OF BIOLOGY, Harper Perennial, NY (1991) provide one of skill with a general dictionary of many of the terms used in this invention. Practitioners are particularly directed to Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Second and Third Editions, 1989 and 2001), Cold Spring Harbor Press, Plainview, N.Y., and Ausubel F M et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y., 1993, for definitions and terms of the art.

[0042] It is to be understood that this invention is not limited to the particular methodology, protocols, and reagents described, as these may vary. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.

[0043] Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.

[0044] Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the scope and spirit of the invention will become apparent to one skilled in the art from this detailed description.

1. DEFINITIONS

[0045] The term "heterologous exo-endo cellulase fusion construct" refers to a nucleic acid construct that is composed of parts of different genes in operable linkage. The components include, from the 5' end, a DNA molecule encoding an exo-cellobiohydrolase catalytic domain and a DNA molecule encoding an endoglucanase catalytic domain.
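The ordering requirement in this definition can be sketched as a simple check on a list of construct elements read from the 5' end. The element labels below are hypothetical names chosen for illustration; the example layout follows the elements listed for FIG. 1, and the check only enforces the two components named in the definition.

```python
from typing import Sequence

# Hypothetical labels for this sketch; the text does not fix these names.
REQUIRED_ORDER = ("cbh_catalytic_domain", "eg_catalytic_domain")

# Layout modeled on the elements listed for FIG. 1, read 5' to 3'.
FIG1_LAYOUT = (
    "cbh1_promoter",
    "signal_sequence",
    "cbh_catalytic_domain",
    "linker",
    "kexin_site",
    "eg_catalytic_domain",
    "terminator",
    "selectable_marker",
)


def has_required_order(elements: Sequence[str]) -> bool:
    """True if every required element is present and they appear in
    REQUIRED_ORDER when the construct is read from the 5' end."""
    ordered = list(elements)
    positions = []
    for name in REQUIRED_ORDER:
        if name not in ordered:
            return False
        positions.append(ordered.index(name))
    return positions == sorted(positions)


print(has_required_order(FIG1_LAYOUT))
print(has_required_order(tuple(reversed(FIG1_LAYOUT))))
```

Optional elements such as the linker or kexin site are deliberately ignored by the check, since the definition lists only the two catalytic-domain coding regions.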

[0046] The term "cellulase fusion protein" or "fusion protein having cellulolytic activity" refers to an enzyme, which has an exo-cellobiohydrolase catalytic domain and an endoglucanase catalytic domain and exhibits cellulolytic activity. The term "components of a cellulase fusion protein" refers to individual (cleaved) fragments of the cellulase fusion protein, wherein each fragment has cellulolytic activity and includes one of the catalytic domains of the fusion protein.
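How a fusion protein yields its "components" can be sketched as a string split at a kexin site. The sketch assumes cleavage on the carboxyl side of a Lys-Arg (KR) dipeptide, consistent with the SKR kexin site described for FIG. 10, and it cuts only at the first KR it finds, which is a simplification. The function name and the toy sequence are hypothetical and do not correspond to any SEQ ID NO.

```python
from typing import List


def cleave_at_kexin_site(protein: str, site: str = "KR") -> List[str]:
    """Split a one-letter amino acid string once, directly after the
    first occurrence of `site`. Returns [protein] if the site is absent.

    Assumption for this sketch: a single cut at the first dibasic site.
    """
    idx = protein.find(site)
    if idx == -1:
        return [protein]
    cut = idx + len(site)
    return [protein[:cut], protein[cut:]]


# Toy fusion: "AAAA" stands in for the CBH1 part plus linker, "SKR" for
# the kexin site, and "GGGG" for the endoglucanase catalytic domain.
print(cleave_at_kexin_site("AAAASKRGGGG"))
print(cleave_at_kexin_site("AAAAGGGG"))
```

A real fusion protein may contain additional KR motifs, and cleavage in the host may be partial, which is consistent with the text's reference to mixtures of the fusion protein and its components.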

[0047] The term "cellulase" refers to a category of enzymes capable of hydrolyzing cellulose (beta-1,4-glucan or beta D-glucosidic linkages) polymers to shorter cello-oligosaccharide oligomers, cellobiose and/or glucose.

[0048] The term "exo-cellobiohydrolase" (CBH) refers to a group of cellulase enzymes classified as EC 3.2.1.91. These enzymes are also known as exoglucanases or cellobiohydrolases. CBH enzymes hydrolyze cellobiose from the reducing or non-reducing end of cellulose. In general, a CBH1 type enzyme preferentially hydrolyzes cellobiose from the reducing end of cellulose and a CBH2 type enzyme preferentially hydrolyzes the non-reducing end of cellulose.

[0049] The term "endoglucanase" (EG) refers to a group of cellulase enzymes classified as EC 3.2.1.4. An EG enzyme hydrolyzes internal beta-1,4 glucosidic bonds of the cellulose.

[0050] The term "beta-glucosidases" refers to a group of cellulase enzymes classified as EC 3.2.1.21.

[0051] "Cellulolytic activity" encompasses exoglucanase activity, endoglucanase activity or both types of enzymatic activity.

[0052] The term "catalytic domain" refers to a structural portion or region of the amino acid sequence of a cellulase which possesses the catalytic activity of the cellulase. The catalytic domain is a structural element of the cellulase tertiary structure that is distinct from the cellulose binding domain or site, which is a structural element which binds the cellulase to a substrate, such as cellulose.

[0053] The term "cellulose binding domain (CBD)" as used herein refers to a portion of the amino acid sequence of a cellulase or a region of the enzyme that is involved in the cellulose binding activity of a cellulase. Cellulose binding domains generally function by non-covalently binding the cellulase to cellulose, a cellulose derivative or other polysaccharide equivalent thereof. CBDs typically function independent of the catalytic domain.

[0054] A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence. For example, DNA encoding a signal peptide is operably linked to DNA encoding a polypeptide if it is expressed as a preprotein that participates in the secretion of the polypeptide; a promoter is operably linked to a coding sequence if it affects the transcription of the sequence. Generally, "operably linked" means that the DNA sequences being linked are contiguous, and, in the case of the heterologous exo-endo cellulase fusion construct, contiguous and in reading frame.
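The requirement that fused coding sequences be contiguous and in reading frame can be illustrated with a minimal Python sketch. The function names and the toy sequences in the usage note are hypothetical and are not sequences of this application. The check assumes that every segment is intron-free coding DNA whose length is a multiple of three, which is stricter than necessary and will not hold for genomic fungal sequences containing introns.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive triplets; a trailing partial codon is dropped."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def is_in_frame_fusion(segments):
    """Return True if the joined segments form one continuous open reading frame.

    Illustrative assumptions: each segment is intron-free coding DNA, each
    segment length is a multiple of three, and a stop codon may appear only
    as the very last codon of the joined sequence.
    """
    if any(len(s) % 3 != 0 for s in segments):
        return False
    joined = codons("".join(segments))
    # Every codon except the last must be a sense codon.
    return all(c not in STOP_CODONS for c in joined[:-1])
```

For example, `is_in_frame_fusion(["ATGTATCGG", "AAGTTG", "GCGTAA"])` is true, whereas a segment of five nucleotides or an internal stop codon makes the check fail.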

[0055] As used herein, the term "gene" means the segment of DNA involved in producing a polypeptide chain, that may or may not include regions preceding and following the coding region, e.g. 5' untranslated (5' UTR) or "leader" sequences and 3' UTR or "trailer" sequences, as well as intervening sequences (introns) between individual coding segments (exons).

[0056] The term "polypeptide" as used herein refers to a compound made up of a single chain of amino acid residues linked by peptide bonds. The term "protein" as used herein may be synonymous with the term "polypeptide" or may refer, in addition, to a complex of two or more polypeptides.

[0057] The term "nucleic acid molecule", "nucleic acid" or "polynucleotide" includes RNA, DNA and cDNA molecules. It will be understood that, as a result of the degeneracy of the genetic code, a multitude of nucleotide sequences encoding a given protein such as a cellulase fusion protein of the invention may be produced.

[0058] A "heterologous" nucleic acid sequence has a portion of the sequence, which is not native to the cell in which it is expressed. For example, heterologous, with respect to a control sequence refers to a control sequence (i.e. promoter or enhancer) that does not function in nature to regulate the same gene the expression of which it is currently regulating. Generally, heterologous nucleic acid sequences are not endogenous to the cell or part of the genome in which they are present, and have been added to the cell, by infection, transfection, transformation, microinjection, electroporation, or the like. A "heterologous" nucleic acid sequence may contain a control sequence/DNA coding sequence combination that is the same as, or different from a control sequence/DNA coding sequence combination found in the native cell. The term heterologous nucleic acid sequence encompasses a heterologous exo-endo cellulase fusion construct according to the invention.

[0059] As used herein, the term "vector" refers to a nucleic acid sequence or construct designed for transfer between different host cells. An "expression vector" refers to a vector that has the ability to incorporate and express heterologous DNA sequences in a foreign cell. An expression vector may be generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a target cell. The recombinant expression cassette can be incorporated into a plasmid, chromosome, mitochondrial DNA, virus, or nucleic acid fragment.

[0060] As used herein, the term "plasmid" refers to a circular double-stranded (ds) DNA construct used as a cloning vector, and which forms an extrachromosomal self-replicating genetic element in many bacteria and some eukaryotes.

[0061] As used herein, the term "selectable marker" refers to a nucleotide sequence which is capable of expression in cells and where expression of the selectable marker confers to cells containing the expressed gene the ability to grow in the presence of a corresponding selective agent, or under corresponding selective growth conditions.

[0062] As used herein, the term "promoter" refers to a nucleic acid sequence that functions to direct transcription of a downstream gene. The promoter will generally be appropriate to the host cell in which the target gene is being expressed. The promoter together with other transcriptional and translational regulatory nucleic acid sequences (also termed "control sequences") are necessary to express a given gene. In general, the transcriptional and translational regulatory sequences include, but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and stop sequences, translational start and stop sequences, and enhancer or activator sequences.

[0063] The term "signal sequence" or "signal peptide" refers to a sequence of amino acids at the N-terminal portion of a protein, which facilitates the secretion of the mature form of the protein outside the cell. The mature form of the extracellular protein lacks the signal sequence which is cleaved off during the secretion process.

[0064] By the term "host cell" is meant a cell that contains a heterologous exo-endo cellulase fusion construct encompassed by the invention or a vector including the same and supports the replication, and/or transcription or transcription and translation (expression) of the heterologous exo-endo cellulase construct. Host cells for use in the present invention can be prokaryotic cells, such as E. coli, or eukaryotic cells such as yeast, plant, insect, amphibian, or mammalian cells. In general, host cells are filamentous fungi.

[0065] The term "filamentous fungi" includes all filamentous fungi recognized by those of skill in the art. A preferred fungus is selected from the subdivision Eumycota and Oomycota and particularly from the group consisting of Aspergillus, Trichoderma, Fusarium, Chrysosporium, Penicillium, Humicola, Neurospora, or alternative sexual forms thereof such as Emericella and Hypocrea (See, Kuhls et al., 1996).

[0066] The filamentous fungi are characterized by vegetative mycelium having a cell wall composed of chitin, glucan, chitosan, mannan, and other complex polysaccharides, with vegetative growth by hyphal elongation and carbon catabolism that is obligately aerobic.

[0067] The term "derived" encompasses the terms originated from, obtained or obtainable from and isolated from.

[0068] An "equivalent" amino acid sequence is an amino acid sequence that is not identical to an original reference amino acid sequence but includes some amino acid changes, which may be substitutions, deletions, additions or the like, wherein the protein exhibits essentially the same qualitative biological activity of the reference protein. An equivalent amino acid sequence will have between 80%-99% amino acid identity to the original reference sequence. Preferably the equivalent amino acid sequence will have at least 85%, 90%, 93%, 95%, 96%, 98% and 99% identity to the reference sequence.

[0069] A "substitution" results from the replacement of one or more nucleotides or amino acids by different nucleotides or amino acids, respectively. Substitutions are usually made in accordance with known conservative substitutions, wherein one class of amino acid is substituted with an amino acid in the same class. A "non-conservative substitution" refers to the substitution of an amino acid in one class with an amino acid from another class.

[0070] A "deletion" is a change in a nucleotide or amino acid sequence in which one or more nucleotides or amino acids are absent.

[0071] An "addition" is a change in a nucleotide or amino acid sequence that has resulted from the insertion of one or more nucleotides or amino acids as compared to an original reference sequence.

[0072] As used herein, "recombinant" includes reference to a cell or vector, that has been modified by the introduction of heterologous nucleic acid sequences or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found in identical form within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all as a result of deliberate human intervention.

[0073] As used herein, the terms "transformed", "stably transformed" or "transgenic" with reference to a cell means the cell has a heterologous nucleic acid sequence according to the invention integrated into its genome or as an episomal plasmid that is maintained through multiple generations.

[0074] The term "introduced" in the context of inserting a heterologous exo-endo cellulase fusion construct or heterologous nucleic acid sequence into a cell, means "transfection", "transformation" or "transduction" and includes reference to the incorporation of a heterologous nucleic acid sequence or heterologous exo-endo cellulase fusion construct into a eukaryotic or prokaryotic cell where the heterologous nucleic acid sequence or heterologous exo-endo cellulase nucleic acid construct may be incorporated into the genome of the cell (for example, chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (for example, transfected mRNA).

[0075] As used herein, the term "expression" refers to the process by which a polypeptide is produced based on the nucleic acid sequence of a gene. The process includes both transcription and translation.

[0076] It follows that the term "cellulase fusion protein expression" or "fusion expression" refers to transcription and translation of a "heterologous exo-endo cellulase fusion construct" comprising the catalytic domain of an exo-cellobiohydrolase and the catalytic domain of an endoglucanase, the products of which include precursor RNA, mRNA, polypeptide, post-translationally processed polypeptides, and derivatives thereof.

[0077] As used herein, the term "purifying" generally refers to subjecting recombinant nucleic acid or protein containing cells to biochemical purification and/or column chromatography.

[0078] As used herein, the terms "active" and "biologically active" refer to a biological activity associated with a particular protein, such as the enzymatic activity associated with a cellulase. It follows that the biological activity of a given protein refers to any biological activity typically attributed to that protein by those of skill in the art.

[0079] As used herein, the term "enriched" means that the concentration of a cellulase enzyme found in a fungal cellulase composition is greater relative to the concentration found in a wild type or naturally occurring fungal cellulase composition. The terms enriched, elevated and enhanced may be used interchangeably herein.

[0080] A "wild type fungal cellulase composition" is one produced by a naturally occurring fungal source and which comprises one or more BG, CBH and EG components wherein each of these components is found at the ratio produced by the fungal source.

[0081] Thus, to illustrate, a naturally occurring cellulase system may be purified into substantially pure components by recognized separation techniques well published in the literature, including ion exchange chromatography at a suitable pH, affinity chromatography, size exclusion and the like. A purified cellulase fusion protein or components thereof may then be added to the enzymatic solution resulting in an enriched cellulase solution. It is also possible to elevate the amount of EG or CBH produced by a microbe by expressing a cellulase fusion protein encompassed by the invention.

[0082] "A", "an" and "the" include plural references unless the context clearly dictates otherwise.

[0083] As used herein the term "comprising" and its cognates are used in their inclusive sense: that is equivalent to the term "including" and its corresponding cognates.

[0084] "ATCC" refers to the American Type Culture Collection located in Manassas, Va. 20108 (ATCC; www.atcc.org).

[0085] "NRRL" refers to the Agricultural Research Service Culture Collection, National Center for Agricultural Utilization Research (previously known as the USDA Northern Regional Research Laboratory), Peoria, Ill.

2. PREFERRED EMBODIMENTS

A. Components and Construction of Heterologous Exo-Endo Cellulase Fusion Constructs and Expression Vectors.

[0086] A heterologous exo-endo cellulase fusion construct or a vector comprising a heterologous exo-endo cellulase fusion construct may be introduced into and replicated in a filamentous fungal host cell for protein expression and secretion.

[0087] In some embodiments, the heterologous exo-endo cellulase fusion construct comprises in operable linkage from the 5' end of said construct, optionally a signal peptide, a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, and a DNA molecule encoding a catalytic domain of an endoglucanase. In other embodiments, the components of the heterologous exo-endo cellulase fusion construct comprise in operable linkage from the 5' end of said construct, optionally a signal peptide, a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, optionally a DNA molecule encoding the CBD of an endoglucanase, and a DNA molecule encoding a catalytic domain of the endoglucanase.

[0088] In other embodiments the construct will comprise in operable linkage from the 5' end of said construct optionally a signal peptide, a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, optionally a DNA molecule encoding the CBD of the exo-cellobiohydrolase, a linker, optionally a DNA molecule encoding the CBD of an endoglucanase, and a DNA molecule encoding a catalytic domain of the endoglucanase.

[0089] In a further embodiment the heterologous exo-endo cellulase fusion construct or vector comprising a heterologous exo-endo cellulase fusion construct includes in operable linkage from the 5' end, a promoter of a filamentous fungus secretable protein; a DNA molecule encoding a signal sequence; a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, optionally a DNA molecule encoding the exo-cellobiohydrolase CBD; a DNA molecule encoding a catalytic domain of an endoglucanase; and a terminator. Further, the vector may include a DNA molecule encoding the CBD of the endoglucanase, said CBD being located 5' to the DNA molecule encoding the endoglucanase catalytic domain.

[0090] In one embodiment a preferred heterologous exo-endo cellulase fusion construct or expression vector will not include the exo-cellobiohydrolase CBD. In another embodiment, a preferred expression vector will include a promoter of a filamentous fungus secretable protein, a DNA molecule encoding an exo-cellobiohydrolase signal sequence, a DNA molecule encoding a catalytic domain of an exo-cellobiohydrolase, a linker, a DNA molecule encoding a catalytic domain of an endoglucanase, and a terminator, wherein the vector lacks the CBD of the exo-cellobiohydrolase and optionally lacks the CBD of the endoglucanase. In a preferred embodiment, the coding sequence for the endoglucanase catalytic domain (either including the endoglucanase CBD or lacking the endoglucanase CBD) will not include an endoglucanase signal sequence. Reference is made to FIGS. 1, 10 and 12 as examples of embodiments including an expression vector and heterologous exo-endo cellulase fusion construct of the invention.
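The 5' to 3' ordering of elements described in the embodiments above can be summarized in a short Python sketch. The function name `construct_order` and the element labels are hypothetical conveniences for illustration; the defaults reflect the preferred embodiment in which the exo-cellobiohydrolase CBD is absent and the endoglucanase CBD is optional. The sketch models element order only and says nothing about the sequences themselves.

```python
def construct_order(include_cbh_cbd=False, include_eg_cbd=False):
    """Return the 5'-to-3' order of elements in an exo-endo fusion expression construct.

    Labels are illustrative. Optional elements are the CBD of the
    exo-cellobiohydrolase (CBH) and the CBD of the endoglucanase (EG).
    """
    parts = ["promoter", "signal sequence", "CBH catalytic domain"]
    if include_cbh_cbd:
        parts.append("CBH CBD")
    parts.append("linker")
    if include_eg_cbd:
        # The EG CBD, when present, lies 5' to the EG catalytic domain.
        parts.append("EG CBD")
    parts += ["EG catalytic domain", "terminator"]
    return parts
```

Calling `construct_order(include_eg_cbd=True)` lists the endoglucanase CBD immediately upstream of the endoglucanase catalytic domain, with no endoglucanase signal sequence in the construct.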

[0091] Exemplary promoters include both constitutive promoters and inducible promoters. Examples include the promoters from the Aspergillus niger, A. awamori or A. oryzae glucoamylase, alpha-amylase, or alpha-glucosidase encoding genes; the A. nidulans gpdA or trpC genes; the Neurospora crassa cbh1 or trp1 genes; the A. niger or Rhizomucor miehei aspartic proteinase encoding genes; the T. reesei cbh1, cbh2, egl1, egl2, or other cellulase encoding genes; a CMV promoter, an SV40 early promoter, an RSV promoter, an EF-1alpha promoter, a promoter containing the tet responsive element (TRE) in the tet-on or tet-off system as described (ClonTech and BASF), and the beta actin promoter. In some embodiments the promoter is one that is native to the fungal host cell to be transformed.

[0092] In one preferred embodiment, the promoter is an exo-cellobiohydrolase cbh1 or cbh2 promoter and particularly a cbh1 promoter, such as a T. reesei cbh1 promoter. The T. reesei cbh1 promoter is an inducible promoter, and reference is made to GenBank Accession No. D86235.

[0093] The DNA sequence encoding an exo-cellobiohydrolase catalytic domain is operably linked to a DNA sequence encoding a signal sequence. The signal sequence is preferably that which is naturally associated with the exo-cellobiohydrolase to be expressed. Preferably the signal sequence is encoded by a Trichoderma or Aspergillus gene which encodes a CBH. More preferably the signal sequence is encoded by a Trichoderma gene which encodes a CBH1. In further embodiments, the promoter and signal sequence of the heterologous exo-endo cellulase fusion construct are derived from the same source. In some embodiments, the signal sequence is a Trichoderma cbh1 signal sequence that is operably linked to a Trichoderma cbh1 promoter. In further embodiments the signal sequence has the amino acid sequence of SEQ ID NO: 2 or an equivalent sequence or a sequence having at least 95% identity thereto.

[0094] Most exo-cellobiohydrolases (CBHs) and endoglucanases (EGs) have a multidomain structure consisting of a catalytic domain separated from a cellulose binding domain (CBD) by a linker peptide (Suurnakki et al., 2000). The catalytic domain contains the active site whereas the CBD interacts with cellulose by binding the enzyme to it (van Tilbeurgh et al., 1986 and Tomme et al., 1988).

[0095] Numerous cellulases have been described in the scientific literature, examples of which include: from Trichoderma reesei: Shoemaker, S. et al., Bio/Technology, 1:691-696, 1983, which discloses CBH1; Teeri, T. et al., Gene, 51:43-52, 1987, which discloses CBH2; Penttila, M. et al., Gene, 45:253-263, 1986, which discloses EG1; Saloheimo, M. et al., Gene, 63:11-22, 1988, which discloses EG2; Okada, M. et al., Appl. Environ. Microbiol., 64:555-563, 1988, which discloses EG3; Saloheimo, M. et al., Eur. J. Biochem., 249:584-591, 1997, which discloses EG4; and Saloheimo, A. et al., Molecular Microbiology, 13:219-228, 1994, which discloses EG5. Exo-cellobiohydrolases and endoglucanases from species other than Trichoderma have also been described, e.g., Ooi et al., 1990, which discloses the cDNA sequence coding for endoglucanase F1-CMC produced by Aspergillus aculeatus; Kawaguchi T et al., 1996, which discloses the cloning and sequencing of the cDNA encoding beta-glucosidase 1 from Aspergillus aculeatus; Sakamoto et al., 1995, which discloses the cDNA sequence encoding the endoglucanase CMCase-1 from Aspergillus kawachii IFO 4308; and Saarilahti et al., 1990, which discloses an endoglucanase from Erwinia carotovora. The sequences encoding these enzymes may be used in the heterologous exo-endo cellulase fusion construct or vector of the invention.

[0096] In some embodiments, the catalytic domain is derived from a CBH1 type exo-cellobiohydrolase and in other embodiments the catalytic domain is derived from a CBH2 type exo-cellobiohydrolase. In some embodiments, the CBH1 or CBH2 catalytic domain is derived from a Trichoderma spp.

[0097] In one embodiment, the catalytic domain of an exo-cellobiohydrolase is encoded by a nucleic acid sequence of a Trichoderma reesei cbh1. In some embodiments the nucleic acid is the sequence of SEQ ID NO:3 and nucleotide sequences homologous thereto.

[0098] In other embodiments, the catalytic domain will have the amino acid sequence of SEQ ID NO: 6 and equivalent amino acid sequences thereto. Further DNA sequences encoding any equivalents of said amino acid sequences of SEQ ID NO: 6, wherein said equivalents have a similar qualitative biological activity to SEQ ID NO: 6 may be incorporated into the heterologous exo-endo cellulase fusion construct.

[0099] In some embodiments, heterologous exo-endo cellulase fusion constructs encompassed by the invention will include a linker located 3' to the sequence encoding the exo-cellobiohydrolase catalytic domain and 5' to the sequence encoding the endoglucanase catalytic domain. In some preferred embodiments, the linker is derived from the same source as the catalytic domain of the exo-cellobiohydrolase. Preferably the linker will be derived from a Trichoderma cbh1 gene. One preferred linker sequence is illustrated in FIG. 3. In other embodiments, the heterologous exo-endo cellulase fusion construct will include two or more linkers. For example a linker may be located not only between the coding sequence of the CBH catalytic domain and the coding sequence of the EG catalytic domain but also between the coding region of the CBH CBD and the coding region of the EG CBD. Further linkers may be located between the CBD of the endoglucanase and the catalytic domain of the endoglucanase. In general, a linker may be between about 5 to 60 amino acid residues, between about 15 to 50 amino acid residues, and between about 25 to 45 amino acid residues. Reference is made to Srisodsuk M. et al., 1993 for a discussion of the linker peptide of T. reesei CBH1.

[0100] In addition to the linker sequence, a heterologous exo-endo cellulase fusion construct or expression vector of the invention may include a cleavage site, such as a protease cleavage site. In one preferred embodiment, the cleavage site is a kexin site which encodes the dipeptide Lys-Arg.
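The effect of a kexin site on the fusion protein can be sketched in Python as follows. The function name `split_at_kexin_sites` and the toy sequence in the usage note are hypothetical. The sketch assumes cleavage occurs on the carboxyl side of every Lys-Arg (KR) dipeptide; in a cell, the extent of cleavage depends on sequence context and site accessibility, which this simple string model ignores.

```python
def split_at_kexin_sites(protein, site="KR"):
    """Split a one-letter protein string after each occurrence of the cleavage dipeptide.

    Illustrative assumption: cleavage is complete and occurs immediately
    C-terminal to every Lys-Arg (KR) dipeptide.
    """
    fragments = []
    start = 0
    pos = protein.find(site)
    while pos != -1:
        end = pos + len(site)
        # The upstream fragment retains the Lys-Arg dipeptide at its C-terminus.
        fragments.append(protein[start:end])
        start = end
        pos = protein.find(site, start)
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments
```

With a single site between the two catalytic domains, the result is two fragments corresponding to the "components of a cellulase fusion protein" as that term is defined herein, for example `split_at_kexin_sites("MQSACTKRGGSA")` gives `["MQSACTKR", "GGSA"]`.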

[0101] In a preferred embodiment, the heterologous exo-endo cellulase fusion construct and an expression vector including the same will lack the CBD of the CBH. In other embodiments the CBD will be included in the construct or vector.

[0102] The heterologous exo-endo cellulase fusion constructs include a coding sequence for the catalytic domain of an endoglucanase. Endoglucanases are found in more than 13 of the Glycosyl Hydrolase families using the classification of Coutinho, P. M. et al. (1999) Carbohydrate-Active Enzymes (CAZy) server at (afmb.cnrs-mrs.fr/-cazy/CAZY/index). Preferably the catalytic domain is derived from a bacterial endoglucanase. As described above numerous bacterial endoglucanases are known.

[0103] Particularly preferred DNA sequences encoding a catalytic domain of a bacterial endoglucanase include:

[0104] a) the DNA of SEQ ID NO: 7 encoding an Acidothermus cellulolyticus GH5A endoglucanase I (E1) catalytic domain having amino acid sequence SEQ ID NO: 8;

[0105] b) the DNA of SEQ ID NO: 9 encoding an Acidothermus cellulolyticus GH74 endoglucanase catalytic domain having amino acid sequence SEQ ID NO: 10;

[0106] c) the DNA of SEQ ID NO: 11 encoding a Thermobifida fusca E5 endoglucanase having amino acid sequence SEQ ID NO: 12 and

[0107] d) DNA sequences or homologous DNA sequences encoding any equivalents of said amino acid sequences of SEQ ID NOs: 8, 10 and 12 wherein said equivalents have a similar qualitative biological activity to said sequences.

[0108] In some preferred embodiments, the endoglucanase is an Acidothermus cellulolyticus E1 and reference is made to the Acidothermus cellulolyticus endoglucanases disclosed in WO 9105039; WO 9315186; U.S. Pat. No. 5,275,944; WO 9602551; U.S. Pat. No. 5,536,655 and WO 0070031. Also reference is made to GenBank U33212. In some embodiments, the Acidothermus cellulolyticus E1 has an amino acid sequence of at least 90%, 93%, 95% and 98% sequence identity with the sequence set forth in SEQ ID NO: 6.

[0109] As stated above homologous nucleic acid sequences to the nucleic acid sequences illustrated in SEQ ID NOs: 1, 3, 7, 9 and 11 may be used in a heterologous cellulase fusion construct or vector according to the invention. Homologous sequences include sequences found in other species, naturally occurring allelic variants and biologically active functional derivatives. A homologous sequence will have at least 80%, 85%, 88%, 90%, 93%, 95%, 97%, 98% and 99% identity to one of the sequences of SEQ ID NOs: 1, 3, 7, 9 and 11 when aligned using a sequence alignment program. For example, a homologue of a given sequence has greater than 80% sequence identity over a length of the given sequence e.g., the coding sequence for the Tf-E5 catalytic domain as described herein.

[0110] For a given heterologous exo-endo cellulase fusion construct or components of the construct it is appreciated that as a result of the degeneracy of the genetic code, a number of coding sequences can be produced that encode a protein having the same amino acid sequence. For example, the triplet CGT encodes the amino acid arginine. Arginine is alternatively encoded by CGA, CGC, CGG, AGA, and AGG. Therefore it is appreciated that such substitutions in the coding region fall within the nucleic acid sequences covered by the present invention. Any and all of these sequences can be utilized in the same way as described herein for a CBH catalytic domain or a bacterial EG catalytic domain.
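The degeneracy described above can be checked with a short Python sketch built on the standard genetic code. The names `CODON_TABLE`, `synonymous_codons` and `translate` are hypothetical helpers introduced for illustration, and the toy sequences in the usage note are not sequences of this application. The sketch assumes the standard nuclear genetic code.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order;
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return every codon (sorted) that encodes the same amino acid as the given codon."""
    amino_acid = CODON_TABLE[codon.upper()]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(dna):
    """Translate an in-frame DNA string into a one-letter protein string."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))
```

Here `synonymous_codons("CGT")` returns the six arginine codons named in the paragraph above, and two coding sequences that differ only in which arginine codon they use, such as `ATGCGTAAA` and `ATGAGAAAA`, translate to the same amino acid sequence.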

[0111] Exemplary computer programs which can be used to determine identity between two sequences include, but are not limited to, the suite of BLAST programs, e.g., BLASTN, BLASTX, and TBLASTX, BLASTP and TBLASTN, publicly available on the Internet at www.ncbi.nlm.nih.gov/BLAST/. See also, Altschul, et al., 1990 and Altschul, et al., 1997.

[0112] Sequence searches are typically carried out using the BLASTN program when evaluating a given nucleic acid sequence relative to nucleic acid sequences in the GenBank DNA Sequences and other public databases. The BLASTX program is preferred for searching nucleic acid sequences that have been translated in all reading frames against amino acid sequences in the GenBank Protein Sequences and other public databases. Both BLASTN and BLASTX are run using default parameters of an open gap penalty of 11.0, and an extended gap penalty of 1.0, and utilize the BLOSUM-62 matrix. (See, e.g., Altschul, et al., 1997.)

[0113] A preferred alignment of selected sequences in order to determine "% identity" between two or more sequences, is performed using for example, the CLUSTAL-W program in MacVector version 6.5, operated with default parameters, including an open gap penalty of 10.0, an extended gap penalty of 0.1, and a BLOSUM 30 similarity matrix.
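Once two sequences have been aligned, "% identity" reduces to counting identical columns, as in the minimal Python sketch below. The function name `percent_identity` is hypothetical. The sketch does not perform the alignment itself and is not the CLUSTAL-W implementation. It also assumes that the denominator is the number of alignment columns in which at least one sequence has a residue, whereas alignment programs differ in how they treat gapped columns, so values may not match a particular program exactly.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Illustrative assumption: columns where both sequences have a gap are
    ignored, and every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

A candidate sequence could then be compared against a threshold recited herein, for example by testing whether `percent_identity(ref, candidate) >= 80.0`.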

[0114] In one exemplary approach, sequence extension of a nucleic acid encoding a CBH or EG catalytic domain may be carried out using conventional primer extension procedures as described in Sambrook et al., supra, to detect CBH or bacterial EG precursors and processing intermediates of mRNA that may not have been reverse-transcribed into cDNA and/or to identify ORFs that encode the catalytic domain or full length protein.

[0115] In yet another aspect, the entire or partial nucleotide sequence of the nucleic acid sequence of the T. reesei cbh1 or GH5a-E1 may be used as a probe. Such a probe may be used to identify and clone out homologous nucleic acid sequences from related organisms.

[0116] Screening of a cDNA or genomic library with the selected probe may be conducted using standard procedures, such as described in Sambrook et al., (1989). Hybridization conditions, including moderate stringency and high stringency, are provided in Sambrook et al., supra.

[0117] In addition, alignment of amino acid sequences to determine homology or identity between sequences is also preferably determined by using a "sequence comparison algorithm." Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), by visual inspection or MOE by Chemical Computing Group, Montreal Canada.
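To illustrate the kind of computation performed by a global alignment algorithm of the Needleman & Wunsch type, a minimal dynamic-programming sketch in Python follows. The function name `needleman_wunsch_score` and the scoring values (match +1, mismatch -1, gap -2 per position) are assumptions chosen for illustration. The programs cited above use substitution matrices and separate gap-opening and gap-extension penalties, so this sketch is a simplification and will not reproduce their scores.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Illustrative assumptions: a fixed match/mismatch score and a linear
    gap penalty applied to every gapped position.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]
```

Under these scoring values, two identical four-residue sequences score 4, and deleting one residue from one of them lowers the score to 1 (three matches and one gap).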

[0118] An example of an algorithm that is suitable for determining sequence similarity is the BLAST algorithm, which is described in Altschul, et al., J. Mol. Biol. 215:403-410 (1990); reference is also made to Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (<www.ncbi.nlm.nih.gov>).

[0119] The heterologous exo-endo cellulase fusion construct according to the invention may also include a terminator sequence. In some embodiments the terminator and the promoter are derived from the same source, for example a Trichoderma exo-cellobiohydrolase gene. In other embodiments the terminator and promoter are derived from different sources. In preferred embodiments the terminator is derived from a filamentous fungal source and particularly a Trichoderma. Particularly suitable terminators include cbh1 derived from a strain of Trichoderma, specifically T. reesei, and the glucoamylase terminator derived from Aspergillus niger or A. awamori (Nunberg et al., 1984 and Boel et al., 1984).

[0120] The heterologous exo-endo cellulase fusion construct or a vector comprising a fusion construct may also include a selectable marker. The choice of the proper selectable marker will depend on the host cell, and appropriate markers for different hosts are well known in the art. Typical selectable marker genes include argB from A. nidulans or T. reesei, amdS from A. nidulans, pyr4 from Neurospora crassa or T. reesei, and pyrG from Aspergillus niger or A. nidulans. Markers useful in vector systems for transformation of Trichoderma are described in Finkelstein, Chap. 6, in BIOTECHNOLOGY OF FILAMENTOUS FUNGI, Finkelstein et al., eds., Butterworth-Heinemann, Boston, Mass. 1992. The amdS gene from Aspergillus nidulans encodes the enzyme acetamidase that allows transformant cells to grow on acetamide as a nitrogen source (Kelley et al., EMBO J. 4:475-479 (1985) and Penttila et al., Gene 61:155-164 (1987)). The selectable marker (e.g. pyrG) may restore the ability of an auxotrophic mutant strain to grow on a selective minimal medium, or the selectable marker (e.g. olic31) may confer to transformants the ability to grow in the presence of an inhibitory drug or antibiotic.

[0121] A typical heterologous exo-endo cellulase fusion construct is depicted in FIGS. 1 and 10. Methods used to ligate a heterologous exo-endo cellulase fusion construct encompassed by the invention and other heterologous nucleic acid sequences and to insert them into suitable vectors are well known in the art. Linking is generally accomplished by ligation at convenient restriction sites, and if such sites do not exist, synthetic oligonucleotide linkers are used in accordance with conventional practice. Additionally vectors can be constructed using known recombination techniques.

[0122] Any vector may be used as long as it is replicable and viable in the cells into which it is introduced. Large numbers of suitable cloning and expression vectors are described in Sambrook et al., 1989, Ausubel F M et al., 1993, and Strathern et al., 1981, each of which is expressly incorporated by reference herein. Further appropriate expression vectors for fungi are described in van den Hondel, C. A. M. J. J. et al. (1991) In: Bennett, J. W. and Lasure, L. L. (eds.) More Gene Manipulations in Fungi. Academic Press, pp. 396-428. The appropriate DNA sequence may be inserted into a vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by standard procedures. Such procedures and related sub-cloning procedures are deemed to be within the scope of knowledge of those skilled in the art. Exemplary useful plasmids include pUC18, pBR322, pUC100, pSL1180 (Pharmacia Inc., Piscataway, N.J.) and pFB6. Other general purpose vectors, such as pRAX in Aspergillus and pTEX in Trichoderma, may also be used (FIGS. 12 and 13).

[0123] In one embodiment, a preferred vector is the vector disclosed in FIGS. 12 and 13, wherein the vector includes the nucleic acid sequence encoding the CBD, linker and catalytic domain of the Thermobifida fusca endoglucanase 5 (SEQ ID NO: 12). In another embodiment, a preferred vector is the vector disclosed in FIGS. 12 and 13, wherein the vector includes the nucleotide sequence encoding an Acidothermus cellulolyticus GH5A endoglucanase catalytic domain (SEQ ID NO: 8).

B. Target Host Cells.

[0124] In one embodiment of the present invention, the filamentous fungal parent or host cell may be a cell of a species of, but not limited to, Trichoderma sp., Penicillium sp., Humicola sp., Chrysosporium sp., Gliocladium sp., Aspergillus sp., Fusarium sp., Neurospora sp., Hypocrea sp., and Emericella sp. As used herein, the term "Trichoderma" or "Trichoderma sp." refers to any fungal strains which have previously been classified as Trichoderma or are currently classified as Trichoderma. Some preferred species for Trichoderma fungal parent cells include Trichoderma longibrachiatum (reesei), Trichoderma viride, Trichoderma koningii, and Trichoderma harzianum cells. Particularly preferred host cells include cells from strains of T. reesei, such as RL-P37 (Sheir-Neiss, et al., Appl. Microbiol. Biotechnol. 20:46-53 (1984)) and functionally equivalent and derivative strains, such as Trichoderma reesei strain RUT-C30 (ATCC No. 56765) and strain QM9414 (ATCC No. 26921). Reference is also made to ATCC No. 13631, ATCC No. 26921, ATCC No. 56764, ATCC No. 56767 and NRRL 1509.

[0125] Some preferred species for Aspergillus fungal parent cells include Aspergillus niger, Aspergillus awamori, Aspergillus aculeatus, and Aspergillus nidulans cells. In one embodiment, the strain comprises Aspergillus niger, for example A. niger var. awamori dgr246 (Goedegebuur et al, (2002) Curr. Genet. 41: 89-98) and GCDAP3, GCDAP4 and GAPS-4 (Ward, M, et al., (1993), Appl. Microbiol. Biotechnol. 39:738-743).

[0126] In some instances it is desired to obtain a filamentous host cell strain such as a Trichoderma host cell strain which has had one or more cellulase genes deleted prior to introduction of a heterologous exo-endo cellulase fusion construct encompassed by the invention. Such strains may be prepared by the method disclosed in U.S. Pat. No. 5,246,853, U.S. Pat. No. 5,861,271 and WO 92/06209, which disclosures are hereby incorporated by reference. By expressing a cellulase fusion protein or components thereof having cellulolytic activity in a host microorganism that is missing one or more cellulase genes, the identification and subsequent purification procedures are simplified. Any gene from Trichoderma sp. which has been cloned can be deleted, for example, the cbh1, cbh2, egl1, and egl2 genes as well as those encoding EG3 and/or EG5 protein (see e.g., U.S. Pat. No. 5,475,101 and WO 94/28117, respectively). Gene deletion may be accomplished by inserting a form of the desired gene to be deleted or disrupted into a plasmid by methods known in the art.

[0127] Parental fungal cell lines are generally cultured under standard conditions with media containing physiological salts and nutrients, such as described by Pourquie, J. et al., BIOCHEMISTRY AND GENETICS OF CELLULOSE DEGRADATION, eds. Aubert J. P. et al., Academic Press pp. 71-86 (1988) and Ilmen, M. et al., Appl. Environ. Microbiol. 63:1298-1306 (1997). Also reference is made to common commercially prepared media such as yeast Malt Extract (YM) broth, Luria Bertani (LB) broth and Sabouraud Dextrose (SD) broth.

C. Introduction of a Heterologous Exo-Endo Cellulase Fusion Construct or Vector into Fungal Host Cells and Culture Conditions.

[0128] A host fungal cell may be genetically modified (i.e., transduced, transformed or transfected) with a heterologous exo-endo cellulase fusion construct according to the invention, a cloning vector or an expression vector comprising a heterologous exo-endo cellulase fusion construct. The methods of transformation of the present invention may result in the stable integration of all or part of the construct or vector into the genome of the filamentous fungus. However, transformation resulting in the maintenance of a self-replicating extra-chromosomal transformation vector is also contemplated.

[0129] Many standard transformation methods can be used to produce a filamentous fungal cell line such as a Trichoderma or Aspergillus cell line that expresses large quantities of a heterologous protein. Some of the published methods for the introduction of DNA constructs into cellulase-producing strains of Trichoderma include Lorito, Hayes, DiPietro and Harman (1993) Curr. Genet. 24: 349-356; Goldman, VanMontagu and Herrera-Estrella (1990) Curr. Genet. 17:169-174; Penttila, Nevalainen, Ratto, Salminen and Knowles (1987) Gene 61: 155-164, EP-A-0 244 234 and also Hazell B. et al., 2000; for Aspergillus include Yelton, Hamer and Timberlake (1984) Proc. Natl. Acad. Sci. USA 81: 1470-1474; for Fusarium include Bajar, Podila and Kolattukudy, (1991) Proc. Natl. Acad. Sci. USA 88: 8202-8212; for Streptomyces include Hopwood et al., (1985) Genetic Manipulation of Streptomyces: A Laboratory Manual, The John Innes Foundation, Norwich, UK; and for Bacillus include Brigidi, DeRossi, Bertarini, Riccardi and Matteuzzi, (1990), FEMS Microbiol. Lett. 55: 135-138.

[0130] Other methods for introducing a heterologous exo-endo cellulase fusion construct or vector into filamentous fungi (e.g., H. jecorina) include, but are not limited to, the use of a particle or gene gun (biolistics), permeabilization of filamentous fungal cell walls prior to the transformation process (e.g., by use of high concentrations of alkali, e.g., 0.05 M to 0.4 M CaCl.sub.2 or lithium acetate), protoplast fusion, electroporation, or Agrobacterium-mediated transformation (U.S. Pat. No. 6,255,115).

[0131] An exemplary method for transformation of filamentous fungi by treatment of protoplasts or spheroplasts with polyethylene glycol and CaCl.sub.2 is described in Campbell, et al., (1989) Curr. Genet. 16:53-56; Penttila, M. et al., (1988) Gene, 63:11-22; and Penttila, M. et al., (1987) Gene 61:155-164.

[0132] Any of the well-known procedures for introducing foreign nucleotide sequences into host cells may be used. It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the heterologous gene.

[0133] The invention includes the transformants of filamentous fungi especially Trichoderma cells comprising the coding sequences for the cellulase fusion protein. The invention further includes the filamentous fungi transformants for use in producing fungal cellulase compositions, which include the cellulase fusion protein or components thereof.

[0134] Following introduction of a heterologous exo-endo cellulase fusion construct comprising the exoglucanase catalytic domain coding sequence and the endoglucanase catalytic domain coding sequence, the genetically modified cells can be cultured in conventional nutrient media as described above for growth of target host cells and modified as appropriate for activating promoters and selecting transformants. The culture conditions, such as temperature, pH and the like, are those previously used for the host cell selected for expression, and will be apparent to those skilled in the art. Also preferred culture conditions for a given filamentous fungus may be found in the scientific literature and/or from the source of the fungi such as the American Type Culture Collection (ATCC; www.atcc.org/).

[0135] Stable transformants of filamentous fungi can generally be distinguished from unstable transformants by their faster growth rate and the formation of circular colonies with a smooth rather than ragged outline on solid culture medium. Additionally, in some cases, a further test of stability can be made by growing the transformants on solid non-selective medium, harvesting the spores from this culture medium and determining the percentage of these spores which will subsequently germinate and grow on selective medium.
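The spore-based stability test described above reduces to a simple percentage. The following is a minimal sketch only; the function name and the spore counts are hypothetical and are not taken from the disclosure.

```python
def percent_stable(colonies_on_selective: int, spores_plated: int) -> float:
    """Percentage of spores (harvested from non-selective medium) that
    germinate and grow when plated on selective medium."""
    if spores_plated <= 0:
        raise ValueError("spores_plated must be positive")
    return 100.0 * colonies_on_selective / spores_plated


# Hypothetical counts, for illustration only.
print(percent_stable(188, 200))
```

A higher percentage indicates that the selectable marker was retained through growth without selection.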

[0136] The progeny of cells into which such heterologous exo-endo cellulase fusion constructs, or vectors including the same, have been introduced are generally considered to comprise the fusion protein encoded by the nucleic acid sequence found in the heterologous cellulase fusion construct.

[0137] In one exemplary application of the invention encompassed herein a recombinant strain of filamentous fungi, e.g., Trichoderma reesei, comprising a heterologous exo-endo cellulase fusion construct will produce not only a cellulase fusion protein but also will produce components of the cellulase fusion protein. In some embodiments the recombinant cells including the cellulase fusion construct will produce an increased amount of cellulolytic activity compared to a corresponding recombinant filamentous fungi strain grown under essentially the same conditions but genetically modified to include separate heterologous nucleic acid constructs encoding an exo-cellobiohydrolase catalytic domain and/or an endoglucanase catalytic domain.

D. Analysis of Protein Expression

[0138] In order to evaluate the expression of a cellulase fusion protein of the invention by a cell line that has been transformed with a heterologous exo-endo cellulase fusion construct, assays can be carried out at the protein level, the RNA level or by use of functional bioassays particular to exo-cellobiohydrolase activity or endoglucanase activity and/or production.

[0139] In general, the following assays can be used to determine integration of cellulase fusion protein expression constructs and vector sequences: Northern blotting, dot blotting (DNA or RNA analysis), RT-PCR (reverse transcriptase polymerase chain reaction), in situ hybridization using an appropriately labeled probe (based on the nucleic acid coding sequence), conventional Southern blotting and autoradiography.

[0140] In addition, the production and/or expression of a cellulase enzyme may be measured in a sample directly, for example, by assays for cellobiohydrolase or endoglucanase activity, expression and/or production. Such assays are described, for example, in Becker et al., Biochem J. (2001) 356:19-30; Mitsuishi et al., FEBS (1990) 275:135-138; Shoemaker et al. 1978; and Schulein 1988, each of which is expressly incorporated by reference herein. The ability of CBH1 to hydrolyze isolated soluble and insoluble substrates can be measured using assays described in Srisodsuk et al., J. Biotech. (1997) 57:49-57 and Nidetzky and Claeyssens Biotech. Bioeng. (1994) 44:961-966. Substrates useful for assaying exo-cellobiohydrolase, endoglucanase or .beta.-glucosidase activities include crystalline cellulose, filter paper, phosphoric acid swollen cellulose, cellooligosaccharides, methylumbelliferyl lactoside, methylumbelliferyl cellobioside, orthonitrophenyl lactoside, paranitrophenyl lactoside, orthonitrophenyl cellobioside, and paranitrophenyl cellobioside.

[0141] In addition, protein expression may be evaluated by immunological methods, such as immunohistochemical staining of cells, tissue sections or immunoassay of tissue culture medium, e.g., by Western blot or ELISA. Such immunoassays can be used to qualitatively and quantitatively evaluate expression of a cellulase, for example CBH. The details of such methods are known to those of skill in the art and many reagents for practicing such methods are commercially available.

[0142] In an embodiment of the invention, the cellulase fusion protein which is expressed by the recombinant host cell will be about 0.1 to 80% of the total expressed cellulase. In other embodiments, the amount of expressed fusion protein will be in the range of about 0.1 mg to 100 g; about 0.1 mg to 50 g and 0.1 mg to 10 g protein per liter of culture media.

E. Recovery and Purification of Cellulase Fusion Proteins and Components Thereof.

[0143] In general, a cellulase fusion protein or components of the cellulase fusion protein produced in cell culture are secreted into the medium and may be recovered and optionally purified, e.g., by removing unwanted components from the cell culture medium. However, in some cases, a cellulase fusion protein or components thereof may be produced in a cellular form necessitating recovery from a cell lysate. In such cases the protein is purified from the cells in which it was produced using techniques routinely employed by those of skill in the art. Examples include, but are not limited to, affinity chromatography (van Tilbeurgh et al., FEBS Lett. 16:215, 1984), ion-exchange chromatographic methods (Goyal et al., Bioresource Technol. 36:37-50, 1991; Fliess et al., Eur. J. Appl. Microbiol. Biotechnol. 17:314-318, 1983; Bhikhabhai et al., J. Appl. Biochem. 6:336-345, 1984; Ellouz et al., J. Chromatography 396:307-317, 1987), including ion-exchange using materials with high resolution power (Medve et al., J. Chromatography A 808:153-165, 1998), hydrophobic interaction chromatography (Tomaz and Queiroz, J. Chromatography A 865:123-128, 1999), and two-phase partitioning (Brumbauer, et al., Bioseparation 7:287-295, 1999).

[0144] Once expression of a given cellulase fusion protein is achieved, the proteins thereby produced may be purified from the cells or cell culture by methods known in the art and reference is made to Deutscher, Methods in Enzymology, vol. 182, no. 57, pp. 779, 1990; and Scopes, Methods Enzymol. 90: 479-91, 1982. Exemplary procedures suitable for such purification include the following: antibody-affinity column chromatography; ion exchange chromatography; ethanol precipitation; reverse phase HPLC; chromatography on silica or on an anion-exchange resin such as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate precipitation; and gel filtration using, e.g., Sephadex G-75.

[0145] A purified form of a cellulase fusion protein or components thereof may be used to produce either monoclonal or polyclonal antibodies specific to the expressed protein for use in various immunoassays. (See, e.g., Hu et al., Mol Cell Biol. vol. 11, no. 11, pp. 5792-5799, 1991). Exemplary assays include ELISA, competitive immunoassays, radioimmunoassays, Western blot, indirect immunofluorescent assays and the like.

F. Utility of Enzymatic Compositions Comprising the Cellulase Fusion Proteins or Components Thereof.

[0146] The cellulase fusion protein and components comprising the catalytic domains of the cellulase fusion protein find utility in a wide variety of applications, including use in detergent compositions, stonewashing compositions, in compositions for degrading wood pulp into sugars (e.g., for bio-ethanol production), and/or in feed compositions. In some embodiments, the cellulase fusion protein or components thereof may be used as cell free extracts. In other embodiments, the fungal cells expressing a heterologous exo-endo cellulase fusion construct are grown under batch or continuous fermentation conditions. A classical batch fermentation is a closed system, wherein the composition of the medium is set at the beginning of the fermentation and is not subject to artificial alterations during the fermentation. Thus, at the beginning of the fermentation the medium is inoculated with the desired organism(s). In this method, fermentation is permitted to occur without the addition of any components to the system. Typically, a batch fermentation qualifies as a "batch" with respect to the addition of the carbon source and attempts are often made at controlling factors such as pH and oxygen concentration. The metabolite and biomass compositions of the batch system change constantly up to the time the fermentation is stopped. Within batch cultures, cells progress through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase eventually die. In general, cells in log phase are responsible for the bulk of production of end product.

[0147] A variation on the standard batch system is the "fed-batch fermentation" system, which also finds use with the present invention. In this variation of a typical batch system, the substrate is added in increments as the fermentation progresses. Fed-batch systems are useful when catabolite repression is apt to inhibit the production of products and where it is desirable to have limited amounts of substrate in the medium. Measurement of the actual substrate concentration in fed-batch systems is difficult and is therefore estimated on the basis of the changes of measurable factors such as pH, dissolved oxygen and the partial pressure of waste gases such as CO.sub.2. Batch and fed-batch fermentations are common and well known in the art.

[0148] Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. Continuous fermentation generally maintains the cultures at a constant high density where cells are primarily in log phase growth.

[0149] Continuous fermentation allows for the modulation of one factor or any number of factors that affect cell growth and/or end product concentration. For example, in one embodiment, a limiting nutrient such as the carbon source or nitrogen source is maintained at a fixed rate and all other parameters are allowed to moderate. In other systems, a number of factors affecting growth can be altered continuously while the cell concentration, measured by media turbidity, is kept constant. Continuous systems strive to maintain steady state growth conditions. Thus, cell loss due to medium being drawn off must be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art of industrial microbiology.
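The balance between cell loss through withdrawn medium and cell growth can be illustrated with a simple mass balance, dX/dt = (mu - D)X, where D is the dilution rate (feed flow divided by culture volume). The sketch below is illustrative only: it assumes a constant specific growth rate mu, which is a simplification (in practice mu depends on the limiting nutrient), and all names and values are hypothetical.

```python
def dilution_rate(feed_l_per_h: float, volume_l: float) -> float:
    """Dilution rate D (1/h) = medium flow rate / culture volume."""
    return feed_l_per_h / volume_l


def simulate_biomass(x0: float, mu: float, d: float, hours: float,
                     dt: float = 0.01) -> float:
    """Euler integration of dX/dt = (mu - D) * X.

    mu is held constant here (an assumption made for illustration).
    """
    x = x0
    for _ in range(int(round(hours / dt))):
        x += (mu - d) * x * dt
    return x


# Hypothetical values: 10 L culture fed at 1 L/h gives D = 0.1 1/h.
d = dilution_rate(1.0, 10.0)
print(simulate_biomass(10.0, mu=0.1, d=d, hours=24))    # growth balances removal
print(simulate_biomass(10.0, mu=0.1, d=2 * d, hours=24))  # removal exceeds growth
```

Under this simplified model, biomass is constant only when the growth rate equals the dilution rate; drawing off medium faster than the cells grow leads to washout.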

[0150] In some applications, the cellulase fusion protein and components thereof find utility in detergent compositions, stonewashing compositions or in the treatment of fabrics to improve their feel and appearance. A detergent composition refers to a mixture which is intended for use in a wash medium for the laundering of soiled cellulose containing fabrics. A stonewashing composition refers to a formulation for use in stonewashing cellulose containing fabrics. Stonewashing compositions are used to modify cellulose containing fabrics prior to sale, i.e., during the manufacturing process. In contrast, detergent compositions are intended for the cleaning of soiled garments and are not used during the manufacturing process.

[0151] In the context of the present invention, such compositions may also include, in addition to cellulases, surfactants, additional hydrolytic enzymes, builders, bleaching agents, bleach activators, bluing agents and fluorescent dyes, caking inhibitors, masking agents, cellulase activators, antioxidants, and solubilizers.

[0152] Surfactants may comprise anionic, cationic and nonionic surfactants such as those commonly found in detergents. Anionic surfactants include linear or branched alkylbenzenesulfonates; alkyl or alkenyl ether sulfates having linear or branched alkyl groups or alkenyl groups; alkyl or alkenyl sulfates; olefinsulfonates; and alkanesulfonates. Ampholytic surfactants include quaternary ammonium salt sulfonates, and betaine-type ampholytic surfactants. Such ampholytic surfactants have both the positive and negative charged groups in the same molecule. Nonionic surfactants may comprise polyoxyalkylene ethers, as well as higher fatty acid alkanolamides or alkylene oxide adduct thereof, fatty acid glycerine monoesters, and the like.

[0153] Cellulose containing fabric may be any sewn or unsewn fabrics, yarns or fibers made of cotton or non-cotton containing cellulose or cotton or non-cotton containing cellulose blends including natural cellulosics and manmade cellulosics (such as jute, flax, ramie, rayon, and lyocell). Cotton-containing fabrics are sewn or unsewn fabrics, yarns or fibers made of pure cotton or cotton blends including cotton woven fabrics, cotton knits, cotton denims, cotton yarns, raw cotton and the like.

[0154] Preferably the cellulase compositions comprising the cellulase fusion protein or components thereof are employed from about 0.00005 weight percent to about 5 weight percent relative to the total detergent composition. More preferably, the cellulase compositions are employed from about 0.0002 weight percent to about 2 weight percent relative to the total detergent composition.

[0155] Since the rate of hydrolysis of cellulosic products may be increased by using a transformant having a heterologous cellulase fusion construct inserted into the genome, products that contain cellulose or heteroglycans can be degraded at a faster rate and to a greater extent. Products made from cellulose such as paper, cotton, cellulosic diapers and the like can be degraded more efficiently in a landfill. Thus, the fermentation product obtainable from the transformants or the transformants alone may be used in compositions to help degrade by liquefaction a variety of cellulose products that add to the overcrowded landfills.

[0156] Cellulose-based feedstocks are comprised of agricultural wastes, grasses and woods and other low-value biomass such as municipal waste (e.g., recycled paper, yard clippings, etc.). Ethanol may be produced from the fermentation of any of these cellulosic feedstocks. However, the cellulose must first be converted to sugars before there can be conversion to ethanol. A composition containing an enhanced amount of cellulolytic activity due to the inclusion of a cellulase fusion protein or components thereof may find utility in ethanol production.

[0157] Ethanol can be produced via saccharification and fermentation processes from cellulosic biomass such as trees, herbaceous plants, municipal solid waste and agricultural and forestry residues. However, the ratio of individual cellulase enzymes within a naturally occurring cellulase mixture produced by a microbe may not be the most efficient for rapid conversion of cellulose in biomass to glucose. It is known that endoglucanases act to produce new cellulose chain ends which themselves are substrates for the action of cellobiohydrolases and thereby improve the efficiency of hydrolysis of the entire cellulase system. Therefore, the use of increased or optimized endoglucanase activity from a cellulase fusion protein or components thereof may greatly enhance the production of ethanol and sugar which can be converted by fermentation to other chemicals.

[0158] Thus, the inventive cellulase fusion protein and components thereof find use in the hydrolysis of cellulose to its sugar components. In one embodiment, the cellulase fusion protein or components thereof are added to the biomass prior to the addition of a fermentative organism. In another embodiment, the cellulase fusion protein or components thereof are added to the biomass at the same time as a fermentative organism. Optionally, there may be other cellulase components present in either embodiment.

EXPERIMENTAL

[0159] The present invention is described in further detail in the following examples which are not in any way intended to limit the scope of the invention.

[0160] In the disclosure and experimental section, which follows, the following abbreviations apply:

[0161] CBH1-E1 (T. reesei CBH1 catalytic domain and linker fused to an Acidothermus cellulolyticus GH5A endoglucanase I catalytic domain);

[0162] CBH1-74E (T. reesei CBH1 catalytic domain and linker fused to an Acidothermus cellulolyticus GH74 endoglucanase catalytic domain);

[0163] CBH1-TfE5 (T. reesei CBH1 catalytic domain and linker fused to a Thermobifida fusca E5 endoglucanase cellulose binding domain, linker and Thermobifida fusca E5 endoglucanase catalytic domain);

wt % (weight percent); .degree. C. (degrees Centigrade); rpm (revolutions per minute); H.sub.2O (water); dH.sub.2O (deionized water); aa (amino acid); bp (base pair); kb (kilobase pair); kD (kilodaltons); g (grams); .mu.g (micrograms); mg (milligrams); .mu.L (microliters); ml and mL (milliliters); mm (millimeters); .mu.m (micrometer); M (molar); mM (millimolar); .mu.M (micromolar); U (units); MW (molecular weight); sec (seconds); min(s) (minute/minutes); hr(s) (hour/hours); PAGE (polyacrylamide gel electrophoresis); phthalate buffer, (sodium phthalate in water, 20 mN, pH 5.0); PBS (phosphate buffered saline [150 mM NaCl, 10 mM sodium phosphate buffer, pH 7.2]); SDS (sodium dodecyl sulfate); Tris (tris(hydroxymethyl)aminomethane); w/v (weight to volume); w/w (weight to weight); v/v (volume to volume); and Genencor (Genencor International, Inc., Palo Alto, Calif.).

Example 1

Construction of a CBH1-E1 Fusion Vector

[0164] The CBH1-E1 fusion construct included the T. reesei cbh1 promoter; the T. reesei cbh1 gene sequence from the start codon to the end of the cbh1 linker and an additional 12 bases of DNA 5' to the start of the endoglucanase coding sequence, a stop codon and the T. reesei cbh1 terminator (see FIGS. 10 and 11). The additional 12 DNA bases (ACTAGTAAGCGG) (SEQ ID NO. 16) code for the recognition site of the restriction endonuclease SpeI and the amino acids Ser, Lys, and Arg.

[0165] The plasmid E1-pUC19 which contained the open reading frame for the E1 gene locus was used as the DNA template in a PCR reaction. (Equivalent plasmids are described in U.S. Pat. No. 5,536,655, which describes cloning the E1 gene from the actinomycete Acidothermus cellulolyticus ATCC 43068, Mohagheghi A. et al., 1986).

[0166] Standard procedures for working with plasmid DNA and amplification of DNA using the polymerase chain reaction (PCR) are found in Sambrook, et al., 2001.

[0167] The following two primers were used to amplify the coding region of the catalytic domain of the E1 endoglucanase.

TABLE-US-00001
Forward Primer 1 = EL-316 (containing a SpeI site): (SEQ ID NO: 17)
GCTTATACTAGTAAGCGCGCGGGCGGCGGCTATTGGCACAC
Reverse Primer 2 = EL-317 (containing an AscI site and stop codon, reverse complement): (SEQ ID NO: 18)
GCTTATGGCGCGCCTTAGACAGGATCGAAAATCGACGAC
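The restriction sites stated for the two primers can be checked directly against the printed sequences. The following is a small sketch, not part of the original disclosure; it assumes the standard recognition sequences ACTAGT (SpeI) and GGCGCGCC (AscI), and the helper name is illustrative.

```python
SPEI_SITE = "ACTAGT"    # SpeI recognition sequence
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence

FORWARD_EL316 = "GCTTATACTAGTAAGCGCGCGGGCGGCGGCTATTGGCACAC"  # SEQ ID NO: 17
REVERSE_EL317 = "GCTTATGGCGCGCCTTAGACAGGATCGAAAATCGACGAC"    # SEQ ID NO: 18


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Position (0-based) of each site within its primer.
print("SpeI in forward primer at", FORWARD_EL316.find(SPEI_SITE))
print("AscI in reverse primer at", REVERSE_EL317.find(ASCI_SITE))

# On the coding strand, the reverse primer reads as a stop codon (TAA)
# immediately followed by the AscI site.
coding_strand_end = reverse_complement(REVERSE_EL317)
print("TAA + AscI on coding strand:", ("TAA" + ASCI_SITE) in coding_strand_end)
```

Both sites sit after a six-base 5' overhang (GCTTAT), which leaves room for the enzymes to cut near the ends of the PCR product.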

[0168] The reaction conditions were as follows using materials from the PLATINUM Pfx DNA Polymerase kit (Invitrogen, Carlsbad, Calif.): 1 .mu.l dNTP Master Mix (final concentration 0.2 mM); 1 .mu.l primer 1 (final conc 0.5 .mu.M); 1 .mu.l primer 2 (final conc 0.5 .mu.M); 2 .mu.l DNA template (final conc 50-200 ng); 1 .mu.l 50 mM MgSO.sub.4 (final conc 1 mM); 5 .mu.l 10.times.Pfx Amplification Buffer; 5 .mu.l 10.times.PCRx Enhancer Solution; 1 .mu.l Platinum Pfx DNA Polymerase (2.5 U total); 33 .mu.l water for 50 .mu.l total reaction volume.
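The component volumes listed above can be checked for consistency with the stated 50 .mu.l total and the stated final concentrations. This is a minimal bookkeeping sketch; the variable and function names are illustrative.

```python
# Component volumes in microliters, as listed in the reaction conditions.
PCR_MIX_UL = {
    "dNTP Master Mix": 1,
    "primer 1": 1,
    "primer 2": 1,
    "DNA template": 2,
    "50 mM MgSO4": 1,
    "10x Pfx Amplification Buffer": 5,
    "10x PCRx Enhancer Solution": 5,
    "Platinum Pfx DNA Polymerase": 1,
    "water": 33,
}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution: C_final = C_stock * V_added / V_total (same units as stock)."""
    return stock * volume_ul / total_ul


total_ul = sum(PCR_MIX_UL.values())
print("total volume (ul):", total_ul)
print("MgSO4 final (mM):", final_concentration(50, PCR_MIX_UL["50 mM MgSO4"], total_ul))
print("buffer final (x):", final_concentration(10, PCR_MIX_UL["10x Pfx Amplification Buffer"], total_ul))
```

The volumes sum to 50 .mu.l, the MgSO.sub.4 dilutes to 1 mM, and each 10.times. reagent ends at 1.times., matching the values given in the text.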

[0169] Amplification parameters were: step 1--94.degree. C. for 2 min (1st cycle only to denature antibody bound polymerase); step 2--94.degree. C. for 45 sec; step 3--60.degree. C. for 30 sec; step 4--68.degree. C. for 2 min; step 5--repeated step 2 for 24 cycles and step 6--68.degree. C. for 4 min.

[0170] The appropriately sized PCR product was cloned into the Zero Blunt TOPO vector and transformed into chemically competent Top10 E. coli cells (Invitrogen Corp., Carlsbad, Calif.), plated onto appropriate selection media (LA with 50 ppm kanamycin) and grown overnight at 37.degree. C. Several colonies were picked from the plate media and grown overnight in 5 ml cultures at 37.degree. C. in selection media (LB with 50 ppm kanamycin) from which plasmid mini-preps were made. Plasmid DNA from several clones was restriction digested to confirm the correct size insert. The correct sequence was confirmed by DNA sequencing. Following sequence verification, the E1 catalytic domain was excised from the TOPO vector by digesting with the restriction enzymes SpeI and AscI. This fragment was ligated into the pTrex4 vector which had been digested with the restriction enzymes SpeI and AscI (see, FIGS. 12 and 13).

[0171] The ligation mixture was transformed into MM294 competent E. coli cells, plated onto appropriate selection media (LA with 50 ppm carbenicillin) and grown overnight at 37.degree. C. Several colonies were picked from the plate media and grown overnight in 5 ml cultures at 37.degree. C. in selection media (LA with 50 ppm carbenicillin) from which plasmid mini-preps were made. Correctly ligated CBH1-E1 fusion protein vectors were confirmed by restriction digestion.

Example 2

Transformation and Expression of the CBH1-E1 Fusion Construct in a T. reesei Host Strain

[0172] Various T. reesei strains were transformed with the CBH1-E1 fusion construct. The host strains included a derivative of T. reesei RL-P37 and a derivative of T. reesei wherein the native cellulase genes (cbh1, cbh2, egl1 and egl2) were deleted.

[0173] Approximately one-half swab (or 1-2 cm.sup.2) of a plate of a sporulated T. reesei derivative of strain RL-P37 (Sheir-Neiss, et al., 1984) mycelia (grown on a PDA plate for 7 days at 28.degree. C.) was inoculated into 50 ml of YEG (5 g/L yeast extract plus 20 g/L glucose) broth in a 250 ml, 4-baffled shake flask and incubated at 30.degree. C. for 16-20 hours at 200 rpm. The mycelia were recovered by transferring the liquid volume into 50 ml conical tubes and spinning at 2500 rpm for 10 minutes. The supernatant was aspirated off. The mycelial pellet was transferred into a 250 ml, CA Corning bottle containing 40 ml of .beta.-D-glucanase solution and incubated at 30.degree. C., 200 rpm for 2 hrs to generate protoplasts for transformation. Protoplasts were harvested by filtration through sterile miracloth into a 50 ml conical tube. They were pelleted by spinning at 2000 rpm for 5 minutes, and the supernatant was aspirated off. The protoplast pellet was washed once with 50 ml of 1.2 M sorbitol, spun down, aspirated, and washed with 25 ml of sorbitol/CaCl.sub.2. Protoplasts were counted and then pelleted again at 2000 rpm for 5 min, the supernatant was aspirated off, and the protoplast pellet was resuspended in a sufficient volume of sorbitol/CaCl.sub.2 to generate a protoplast concentration of 1.25.times.10.sup.8 protoplasts per ml. This constitutes the protoplast solution.
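The "sufficient volume" needed to reach 1.25.times.10.sup.8 protoplasts per ml follows directly from the protoplast count. The sketch below is illustrative only; the total count used is a hypothetical value, and the function name is an assumption.

```python
import math

TARGET_PER_ML = 1.25e8  # target protoplast concentration from the protocol


def resuspension_volume_ml(total_protoplasts: float) -> float:
    """Volume of sorbitol/CaCl2 (ml) giving the target concentration."""
    if total_protoplasts <= 0:
        raise ValueError("total_protoplasts must be positive")
    return total_protoplasts / TARGET_PER_ML


# Hypothetical count of 5e8 protoplasts recovered after the final spin.
volume_ml = resuspension_volume_ml(5e8)
print("resuspend in (ml):", volume_ml)

# Each transformation uses 200 ul of protoplast solution.
n_transformations = math.floor(round(volume_ml * 1000) / 200)
protoplasts_per_aliquot = TARGET_PER_ML * 200 / 1000
print("200 ul aliquots available:", n_transformations)
print("protoplasts per aliquot:", protoplasts_per_aliquot)
```

At the target concentration, each 200 .mu.l aliquot carries 2.5.times.10.sup.7 protoplasts.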

[0174] Aliquots of up to 20 .mu.g of expression vector DNA (in a volume no greater than 20 .mu.l) were placed into 15 ml conical tubes and the tubes were put on ice. Then 200 .mu.l of the protoplast solution was added, followed by 50 .mu.l PEG solution to each transformation aliquot. The tubes were mixed gently and incubated on ice for 20 min. Next, an additional 2 ml of PEG solution was added to the transformation aliquot tubes, followed by gentle inversion and incubation at room temperature for 5 minutes. Next 4 ml of Sorbitol/CaCl.sub.2 solution was added to the tubes (generating a total volume of 6.2 ml). This transformation mixture was divided into 3 aliquots each containing about 2 ml. An overlay mixture was created by adding each of these three aliquots to three tubes containing 10 ml of melted acetamide/sorbitol top agar (kept molten by holding at 50.degree. C.) and this overlay mixture was poured onto a selection plate of acetamide/sorbitol agar. The transformation plates were then incubated at 30.degree. C. for four to seven days.
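The volumes added in the transformation step can be tallied as follows. This is a bookkeeping sketch only, with illustrative names; it shows that the stated total of 6.2 ml is a rounded figure, since the listed additions sum to about 6.25 to 6.27 ml depending on the DNA volume.

```python
def transformation_total_ul(dna_ul: int = 20) -> int:
    """Sum of the additions listed in the protocol, in microliters."""
    if not 0 <= dna_ul <= 20:
        raise ValueError("protocol specifies a DNA volume no greater than 20 ul")
    protoplasts_ul = 200
    first_peg_ul = 50
    second_peg_ul = 2000
    sorbitol_cacl2_ul = 4000
    return dna_ul + protoplasts_ul + first_peg_ul + second_peg_ul + sorbitol_cacl2_ul


total_ul = transformation_total_ul(20)
aliquot_ul = total_ul / 3             # mixture is split into 3 aliquots
overlay_ul = aliquot_ul + 10_000      # each aliquot goes into 10 ml of top agar

print("total (ml):", total_ul / 1000)
print("per aliquot (ml):", aliquot_ul / 1000)
print("per overlay (ml):", overlay_ul / 1000)
```

Each aliquot is therefore about 2 ml, as stated, and each overlay poured onto a selection plate is about 12 ml.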

[0175] The transformation was performed with amdS selection. Acetamide/sorbitol plates and overlays were used for the transformation. For the selection plates, the same plates were used, but without sorbitol. Transformants were purified by transfer of isolated colonies to fresh selective media containing acetamide.

[0176] With reference to the examples the following solutions were made as follows.

[0177] 1) 40 ml .beta.-D-glucanase solution was made up in 1.2M sorbitol and included 600 mg .beta.-D-glucanase and 400 mg MgSO.sub.4.7H.sub.2O (Catalog No. 0439-1, InterSpex Products Inc., San Mateo, Calif.).

[0178] 2) 200 ml PEG solution contained 50 g polyethylene glycol 4000 (BDH Laboratory Supplies Poole, England) and 1.47 g CaCl.sub.2-2H.sub.2O made up in dH.sub.2O.

[0179] 3) Sorbitol/CaCl.sub.2 contained 1.2M sorbitol and 50 mM CaCl.sub.2.

[0180] 4) Acetamide/sorbitol agar:

[0181] Part I--0.6 g acetamide (Aldrich, 99% sublime.), 1.68 g CsCl, 20 g glucose, 20 g KH.sub.2PO.sub.4, 0.6 g MgSO.sub.4.7H.sub.2O, 0.6 g CaCl.sub.2-2H.sub.2O, 1 ml 1000.times. salts (see below), adjusted to pH 5.5, brought to volume (300 mls) with dH.sub.2O, filtered and sterilized.

[0182] Part II--20 g Noble agar and 218 g sorbitol brought to volume (700 mls) with dH.sub.2O and autoclaved.

[0183] Part II was added to Part I for a final volume of 1 L.

[0184] 5) 1000.times. Salts--5 g FeSO.sub.4.7H.sub.2O, 1.6 g MnSO.sub.4.H.sub.2O, 1.4 g ZnSO.sub.4.7H.sub.2O, 1 g CoCl.sub.2.6H.sub.2O were combined and the volume was brought to 1 L with dH.sub.2O. The solution was filtered and sterilized.

[0185] 6) Acetamide/sorbitol top agar was prepared in the same way as acetamide/sorbitol agar, except that top agar was substituted for Noble agar.
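The recipe masses above can be cross-checked against the stated concentrations. The sketch below is an illustrative calculation only; the molar masses are standard approximate values and are not taken from the text, and the helper name `molarity_mM` is hypothetical.

```python
# Approximate molar masses in g/mol (standard values, not taken from the patent text).
MW_CACL2_2H2O = 147.01
MW_SORBITOL = 182.17


def molarity_mM(mass_g: float, mw_g_per_mol: float, volume_l: float) -> float:
    """Concentration in mM for a solute of the given mass made up to the given volume."""
    return mass_g / mw_g_per_mol / volume_l * 1000.0


# PEG solution: 50 g PEG 4000 and 1.47 g CaCl2·2H2O in 200 ml
peg_percent_w_v = 50 / 200 * 100  # grams per 100 ml
cacl2_mM = molarity_mM(1.47, MW_CACL2_2H2O, 0.200)

# Acetamide/sorbitol agar: 218 g sorbitol in a final volume of 1 L
sorbitol_M = molarity_mM(218, MW_SORBITOL, 1.0) / 1000.0

print(f"PEG 4000: {peg_percent_w_v:.0f}% (w/v)")
print(f"CaCl2 in PEG solution: {cacl2_mM:.1f} mM")
print(f"sorbitol in agar: {sorbitol_M:.2f} M")
```

Under these assumptions the PEG solution works out to 25% (w/v) PEG with about 50 mM CaCl₂, and the agar to about 1.2 M sorbitol, matching the concentrations given for the Sorbitol/CaCl₂ solution.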

The transformation procedure used was similar to that described in Penttila et al., Gene 61: 155-164, 1987.

[0186] Individual fungal transformants were grown up in shake flask culture to determine the level of fusion protein expression. The experiments were conducted essentially as described in example 1 of U.S. Pat. No. 5,874,276 with the following modification: 16 g/L of alpha-lactose was substituted for cellulose in TSF medium. The highest level of cleaved E1 protein expression from a transformant in shake flasks was estimated to be greater than 3 g/L.

[0187] In general, the fermentation protocol described in Foreman et al. (2003) J. Biol. Chem. 278:31988-31997 was followed. Vogel's minimal medium (Davis et al., (1970) Methods in Enzymology 17A, pg 79-143 and Davis, Rowland, NEUROSPORA, CONTRIBUTIONS OF A MODEL ORGANISM, Oxford University Press, (2000)) containing 5% glucose was inoculated with 1.5 ml frozen spore suspension. After 48 hours, each culture was transferred to 6.2 L of the same medium in a 14 L Biolafitte fermenter. The fermenter was run at 25 °C, 750 RPM and 8 standard liters per minute airflow. One hour after the initial glucose was exhausted, a 25% (w/w) lactose feed was started and fed in a carbon-limiting fashion to prevent lactose accumulation. The concentrations of glucose and lactose were monitored using a glucose oxidase assay kit or a glucose hexokinase assay kit with beta-galactosidase added to cleave lactose, respectively (Instrumentation Laboratory Co., Lexington, Mass.). Samples were obtained at regular intervals to monitor the progress of the fermentation. Collected samples were spun in a 50 ml centrifuge tube at 3/4 speed in an International Equipment Company (Needham Heights, Mass.) clinical centrifuge.

[0188] Shake flask grown supernatant samples were run on BIS-TRIS SDS-PAGE gels (Invitrogen), under reducing conditions with MOPS (morpholinepropanesulfonic acid) SDS running buffer and LDS sample buffer. The results are provided in FIG. 14.

Example 3

Assay of Cellulolytic Activity from Transformed Trichoderma reesei Clones

[0189] The following assays and substrates were used to determine the cellulolytic activity of the CBH1-E1 fusion protein.

Pretreated corn stover (PCS)--Corn stover was pretreated with 2% w/w H₂SO₄ as described in Schell, D. et al., J. Appl. Biochem. Biotechnol. 105:69-86 (2003), followed by multiple washes with deionized water to obtain a pH of 4.5. Sodium acetate was added to make a final concentration of 50 mM and this was titrated to pH 5.0.

Measurement of Total Protein--Protein concentration was measured using the bicinchoninic acid method with bovine serum albumin as a standard (Smith P. K. et al., Anal. Biochem. 150:76-85, 1985).

Cellulose conversion (soluble sugar determinations) was evaluated by HPLC according to the methods described in Baker et al., Appl. Biochem. Biotechnol. 70-72:395-403 (1998).

[0190] A standard cellulosic conversion assay was used in the experiments. In this assay, enzyme and buffered substrate were placed in containers and incubated at a set temperature for a set time. The reaction was quenched with enough 100 mM glycine, pH 11.0, to bring the pH of the reaction mixture to at least pH 10. Once the reaction was quenched, an aliquot of the reaction mixture was filtered through a 0.2 micron membrane to remove solids. The filtered solution was then assayed for soluble sugars by HPLC as described above. The cellulose concentration in the reaction mixture was approximately 7%. The enzyme or enzyme mixtures were dosed at 1 to 60 mg of total protein per gram of cellulose.
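The text does not spell out how percent conversion is derived from the HPLC sugar values. One common convention, assumed here purely for illustration, expresses released glucose and cellobiose as anhydroglucose equivalents (mass factors of roughly 162/180 and 324/342) and divides by the cellulose initially present. The function name and the sample numbers below are hypothetical.

```python
# Mass correction factors for water added on hydrolysis (assumed convention,
# not stated in the source): glucose 180 -> anhydroglucose 162,
# cellobiose 342 -> two anhydroglucose units 324.
GLUCOSE_TO_GLUCAN = 162.0 / 180.0
CELLOBIOSE_TO_GLUCAN = 324.0 / 342.0


def percent_conversion(glucose: float, cellobiose: float, cellulose: float) -> float:
    """Percent of initial cellulose recovered as soluble sugars.

    All three arguments must use the same mass basis (e.g. g per kg of reaction mixture).
    """
    glucan_equivalents = glucose * GLUCOSE_TO_GLUCAN + cellobiose * CELLOBIOSE_TO_GLUCAN
    return 100.0 * glucan_equivalents / cellulose


# Hypothetical readings for a reaction containing about 7% cellulose (70 g/kg).
print(f"{percent_conversion(glucose=14.0, cellobiose=1.0, cellulose=70.0):.1f}% conversion")
```

Other sugars or a different normalization could be used instead; the sketch only shows the shape of the calculation.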

[0191] In one set of experiments, the percent conversion of 13.8% PCS (7.06% cellulose) over 1 day at 55 °C was evaluated using 10 mg enzyme/g cellulose in 50 mM acetate buffer. Samples were agitated at 700 rpm. Comparisons were made between supernatants from growth of 1) a T. reesei parent strain which included the native cellulase genes and 2) a corresponding T. reesei CBH1-E1 fusion strain transformed according to the examples herein. The amount of E1 protein expressed by the fusion strain was 10% w/w (estimated by PAGE as a percent of total protein). Samples were quenched at various times up to 24 hours.
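The loading figures in this experiment can be turned into absolute amounts for a given batch. The sketch below assumes a hypothetical 100 g reaction mixture (the batch size is not given in the text) and uses illustrative variable names.

```python
# Values from the experiment description.
pcs_solids_fraction = 0.138      # 13.8% PCS in the reaction mixture
cellulose_fraction = 0.0706      # 7.06% cellulose in the reaction mixture
dose_mg_per_g_cellulose = 10.0   # total protein dose
e1_fraction_of_protein = 0.10    # E1 estimated at 10% w/w of total protein

# Hypothetical batch size, chosen only for illustration.
reaction_mass_g = 100.0

cellulose_g = reaction_mass_g * cellulose_fraction
total_protein_mg = cellulose_g * dose_mg_per_g_cellulose
e1_mg = total_protein_mg * e1_fraction_of_protein
cellulose_in_pcs = cellulose_fraction / pcs_solids_fraction

print(f"cellulose in batch:   {cellulose_g:.2f} g")
print(f"total protein dosed:  {total_protein_mg:.1f} mg")
print(f"of which E1 (approx): {e1_mg:.2f} mg")
print(f"cellulose content of PCS solids: {cellulose_in_pcs:.1%}")
```

On these figures the PCS solids are about half cellulose, and E1 accounts for roughly 1 mg per gram of cellulose in the fusion-strain samples.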

[0192] The results are presented in FIG. 17 and show that the CBH1-E1 fusion protein outperformed the parent. The CBH1-E1 fusion protein reached 20% cellulose conversion in about 6 hours, whereas the parent cellulase required 10 hours to reach 20% hydrolysis.

Example 4

Transformation and Expression of the CBH1-74E Fusion Construct in T. reesei

[0193] The CBH1-74E fusion construct was designed according to the procedures described above in example 1 with the following differences. The forward primer was designed to maintain the translational reading frame and included a Lys-Arg kexin cleavage site (underlined). The reverse primer encodes a stop codon (as its reverse complement) at the end of the catalytic domain.

[0194] Primers were ordered with 5 prime phosphates to enable subsequent blunt cloning. The GH74 catalytic domain was amplified with the following forward and reverse primers:

TABLE-US-00002
GH74 forward primer bluntF4 (SEQ ID NO: 19):
CTAAGAGAGCGACGACTCAGCCGTACACCTGGAGCAACGTGGC
GH74 reverse primer bluntR4 (SEQ ID NO: 20):
TTACGATCCGGACGGCGCACCACCAATGTCCCCGTATA
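The two design features described for these primers (a Lys-Arg site in the forward primer and a stop codon encoded as a reverse complement in the reverse primer) can be checked directly from the sequences. The following is a minimal sketch; the helper `reverse_complement` is an illustrative name, not part of any method in the source.

```python
# Map each base to its complement; reversing afterwards gives the reverse complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


forward = "CTAAGAGAGCGACGACTCAGCCGTACACCTGGAGCAACGTGGC"  # SEQ ID NO: 19
reverse = "TTACGATCCGGACGGCGCACCACCAATGTCCCCGTATA"       # SEQ ID NO: 20

# AAG encodes Lys and AGA encodes Arg (standard genetic code).
kexin_offset = forward.find("AAGAGA")
print("Lys-Arg codons (AAG AGA) start at primer offset", kexin_offset)

# Read on the coding strand, the reverse primer should end in a stop codon.
coding_strand_end = reverse_complement(reverse)
print("last codon on coding strand:", coding_strand_end[-3:])
```

The same helper can be reused to compare the primer ends against the GH74 catalytic domain sequence in the sequence listing.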

[0195] Amplification was performed using Stratagene's Herculase High Fidelity Polymerase (Stratagene, La Jolla, Calif.). The amplification conditions for the GH74 catalytic domain were:

[0196] An isolated fragment of DNA encompassing the GH74 catalytic domain was used as the template for PCR (approximately 0.2 µg of DNA). U.S. Pat. Appln. No. 20030108988 describes the cloning of GH74. (GH74 is referred to as AviIII in the published patent application.)

[0197] Reaction setup (volumes in µl):

TABLE-US-00003
COMPONENT	VOLUME (µl)
10X Herculase Buffer	5
10 mM dNTPs	1.5
H₂O	39.5
Fwd primer (10 µM)	1
Rev primer (10 µM)	1
Template	1
Herculase Polymerase (5U)	1
Total reaction volume	50
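The reaction setup can be checked by summing the component volumes and working out final concentrations by simple dilution. This is an illustrative sketch; the dictionary keys are shorthand, and whether "10 mM dNTPs" means 10 mM total or 10 mM each is not specified in the text.

```python
# Component volumes in microliters, as listed in the reaction setup table.
reaction_ul = {
    "10X Herculase buffer": 5.0,
    "10 mM dNTPs": 1.5,
    "H2O": 39.5,
    "Fwd primer (10 uM)": 1.0,
    "Rev primer (10 uM)": 1.0,
    "Template": 1.0,
    "Herculase polymerase": 1.0,
}

total_ul = sum(reaction_ul.values())

# Final concentration = stock concentration * (volume added / total volume).
buffer_x = 10.0 * reaction_ul["10X Herculase buffer"] / total_ul
dntp_mM = 10.0 * reaction_ul["10 mM dNTPs"] / total_ul
primer_uM = 10.0 * reaction_ul["Fwd primer (10 uM)"] / total_ul

print(f"total volume: {total_ul:g} ul")
print(f"buffer: {buffer_x:g}X, dNTPs: {dntp_mM:g} mM, each primer: {primer_uM:g} uM")
```

The volumes sum to the stated 50 µl, giving 1X buffer, 0.3 mM dNTPs and 0.2 µM of each primer.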

[0198] Cycling:

TABLE-US-00004
Segment	No. of cycles	Temp (°C)	Time (hr:min:sec)
1	1	95	00:03:00
2	10	95	00:00:40
		60	00:00:30
		72	150 sec
3	20	95	00:00:40
		60	00:00:30
		72	150 sec + 10 sec/cycle
4	1	4	hold
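A rough estimate of the programmed hold time for this cycling profile can be computed as below. The sketch assumes that the "+ 10 sec/cycle" increment in segment 3 applies from the first cycle of that segment (160 sec, 170 sec, and so on); the source does not say whether the first cycle is incremented, and ramp times and the final 4 °C hold are ignored. The function name is illustrative.

```python
def segment_seconds(cycles: int, steps_s: list[int], extension_increment_s: int = 0) -> int:
    """Total programmed time for one segment.

    steps_s lists the hold times of one cycle; extension_increment_s is added
    cumulatively per cycle (assumed to start with the first cycle of the segment).
    """
    total = 0
    for cycle in range(1, cycles + 1):
        total += sum(steps_s) + extension_increment_s * cycle
    return total


segment_1 = segment_seconds(1, [180])                                       # 95 C, 3 min
segment_2 = segment_seconds(10, [40, 30, 150])                              # 95/60/72 C
segment_3 = segment_seconds(20, [40, 30, 150], extension_increment_s=10)    # growing extension

total_seconds = segment_1 + segment_2 + segment_3
print(f"programmed time: {total_seconds} s (about {total_seconds / 60:.0f} min)")
```

If the increment instead starts from the second cycle of segment 3, the total is 200 seconds shorter.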

[0199] All PCR products were gel purified and treated with Mung Bean Nuclease to produce blunt ends prior to ligation. The amplified, blunted fragment was ligated into pTrex4 vector that had been digested with the restriction enzymes SpeI and AscI followed by nuclease digestion to remove the 3' overhangs thereby creating blunt ends. The newly created vector was then transformed into E. coli. Plasmid DNA was isolated from colonies of transformed E. coli. Since the amplified GH74 fragment could insert into pTrex4 in two different orientations, restriction digests were performed to discern clones with correctly oriented insert. Putative clones were confirmed by DNA sequencing.

[0200] Transformation of the fusion vector into T. reesei was performed using biolistic transformation according to the teaching of Hazell, B. W. et al., Lett. Appl. Microbiol. 30:282-286 (2000).

[0201] Expression of the CBH1-74E fusion protein was determined as described above for expression of the CBH1-E1 fusion protein in Example 2. The highest level of cleaved GH74 protein expression from a transformant in shake flasks was estimated to be greater than 3 g/L.

[0202] Shake flask grown supernatant samples were run on BIS-TRIS SDS-PAGE gels (Invitrogen), under reducing conditions with MOPS (morpholinepropanesulfonic acid) SDS running buffer and LDS sample buffer. The results are provided in FIG. 15.

Example 5

[0203] Transformation and Expression of the CBH1-TfE5 Fusion Construct in T. reesei

[0204] The CBH1-TfE5 fusion construct was designed according to the procedures described above in example 1 with the following differences. A plasmid equivalent to that described in Collmer & Wilson, Bio/technol. 1: 594-601 (1983) carrying the TfE5 gene was used as the DNA template to amplify the TfE5 gene. The following primers were used to amplify the TfE5 endoglucanase:

TABLE-US-00005
EL-308 (which contains a SpeI site), forward primer (SEQ ID NO: 21):
GCTTATACTAGTAAGCGCGCCGGTCTCACCGCCACAGTCACC
EL-309 (which contains an AscI site), reverse primer (SEQ ID NO: 22):
GCTTATGGCGCGCCTCAGGACTGGAGCTTGCTCCGC
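The restriction sites named for these primers can be located by a simple string search. The sketch below uses the standard recognition sequences for SpeI (ACTAGT) and AscI (GGCGCGCC); the helper `find_sites` is an illustrative name and only scans the primer as written (the top strand).

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"SpeI": "ACTAGT", "AscI": "GGCGCGCC"}

primers = {
    "EL-308": "GCTTATACTAGTAAGCGCGCCGGTCTCACCGCCACAGTCACC",  # SEQ ID NO: 21
    "EL-309": "GCTTATGGCGCGCCTCAGGACTGGAGCTTGCTCCGC",        # SEQ ID NO: 22
}


def find_sites(seq: str) -> dict[str, int]:
    """Return {enzyme: 0-based offset} for each recognition site found in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for name, seq in primers.items():
    print(name, find_sites(seq))
```

Both sites sit directly after a six-base 5' extension (GCTTAT); both recognition sequences are palindromic, so a top-strand search is sufficient here.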

[0205] Transformation was as described in example 2 above. The highest level of cleaved TfE5 protein expression from a transformant in shake flasks was estimated to be greater than 2 g/L.

[0206] Shake flask grown supernatant samples were run on BIS-TRIS SDS-PAGE gels (Invitrogen), under reducing conditions with MOPS (morpholinepropanesulfonic acid) SDS running buffer and LDS sample buffer. The results are provided in FIG. 16.

REFERENCES

[0207] Altschul, S. F., et al., J. Mol. Biol. 215:403-410, 1990.
[0208] Altschul, S. F., et al., Nucleic Acids Res. 25:3389-3402, 1997.
[0209] Aro N, Saloheimo A, Ilmen M, Penttila M. ACEII, a novel transcriptional activator involved in regulation of cellulase and xylanase genes of Trichoderma reesei. J Biol. Chem. 2001 Jun. 29; 276(26):24309-14. (Epub 2001 Apr. 13.)
[0210] Aubert J. P. et al, p 11 et seq., Biochemistry and Genetics of Cellulose Degradation, eds. Aubert, J. P., Beguin, P., Millet, J., Federation of European Microbiological Societies, Academic Press, 1988.
[0211] Ausubel G. M., et al. CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, N.Y., 1993.
[0212] Baker et al., Appl. Biochem. and Biotechnol. 45/46:245-256, 1994.
[0213] Bhikhabhai, R. et al., J. Appl. Biochem. 6:336, 1984.
[0214] Boel et al. EMBO J. 3:1581-1585, 1984.
[0215] Brumbauer, A. et al., Bioseparation 7:287-295, 1999.
[0216] Collmer, A. and D. B. Wilson, Bio/Technol. 1:594-601, 1983.
[0217] Deutscher, M. P., Methods Enzymol. 182:779-80, 1990.
[0218] Ellouz, S. et al., J. Chromatography 396:307, 1987.
[0219] Filho, et al. Can. J. Microbiol. 42:1-5, 1996.
[0220] Fliess, A., et al., Eur. J. Appl. Microbiol. Biotechnol. 17:314, 1983.
[0221] Goedegebuur et al., Curr. Genet. 41:89-98, 2002.
[0222] Goyal, A. et al. Bioresource Technol. 36:37, 1991.
[0223] Hazell, B. W. et al., Lett. Appl. Microbiol. 30:282-286, 2000.
[0224] Herr et al., Appl. Microbiol. Biotechnol. 5:29-36, 1978.
[0225] Hu et al., Mol. Cell. Biol. 11:5792-9, 1991.
[0226] Jeeves et al., Biotechnol. Genet. Eng. Rev. 9:327-369, 1991.
[0227] Kawaguchi, T et al., Gene 173(2):287-8, 1996.
[0228] Kelley et al. EMBO J. 4:475-479, 1985.
[0229] Knowles, J. et al., TIBTECH 5, 255-261, 1987.
[0230] Krishna, S. et al., Bioresource Tech. 77:193-196, 2001.
[0231] Kuhls K. et al., Proc. Natl. Acad. Sci. USA 93(15): 7755-7760, 1996.
[0232] Kumar, A., et al., Textile Chemist and Colorist 29:37-42, 1997.
[0233] Medve, J. et al., J. Chromatography A 808:153, 1998.
[0234] Mohagheghi, A. et al., Int. J. Syst. Bacteriol. 36:435-443, 1986.
[0235] Nieves et al., Appl. Biochem. and Biotechnol. 51/52 211-223, 1995.
[0236] Nunberg et al. Mol. Cell. Biol. 4:2306-2315, 1984.
[0237] Ohmiya et al., Biotechnol. Gen. Engineer. Rev. 14:365-414, 1997.
[0238] Okada, M. et al., Appl. Environ. Microbiol., 64:555-563, 1988.
[0239] Ooi et al., Nucleic Acid Res. 18:5884, 1990.
[0240] Penttila et al., Gene 45:253-263, 1986.
[0241] Penttila et al., Gene 61: 155-164, 1987.
[0242] Penttila et al., Gene 63: 103-112, 1988.
[0243] Pere, J., et al., In Proc. Tappi Pulping Conf., Nashville, Tenn., 27-31, pp. 693-696, 1996.
[0244] Saarilahti et al., Gene 90:9-14, 1990.
[0245] Sakamoto et al., Curr. Genet. 27:435-439, 1995.
[0246] Saloheimo M, et al., Gene 63:11-22, 1988.
[0247] Saloheimo, A. et al., Molecular Microbiology, 13:219-228, 1994.
[0248] Saloheimo, M. et al., Eur. J. Biochem., 249:584-591, 1997.
[0249] Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL (Second Edition),
[0250] Cold Spring Harbor Press, Plainview, N.Y., 1989.
[0251] Schulein, Methods Enzymol., 160, 25, pages 234 et seq, 1988.
[0252] Scopes, Methods Enzymol. 90 Pt E:479-90, 1982.
[0253] Shoemaker et al., Biochem. Biophys. Acta 523:133-146, 1978.
[0254] Shoemaker, S. et al., Bio/Technology, 1:691-696, 1983.
[0255] Srisodsuk, M. et al. J. Biol. Chem. 268(28): 20756-20761, 1993.
[0256] Strathern et al., eds. (1981) The Molecular Biology of the Yeast Saccharomyces, Cold Spring Harbor Press, Plainview, N.Y.
[0257] Suurnakki, A. et al., Cellulose 7:189-209, 2000.
[0258] Teeri, T. et al., Gene, 51:43-52, 1987.
[0259] Van Tilbeurgh, H. et al., FEBS Lett. 16:215, 1984.
[0260] Tomaz, C. and Queiroz, J., J. Chromatography A 865:123-128, 1999.
[0261] Tomme, P. et al., Eur. J. Biochem. 170:575-581, 1988.
[0262] Van Tilbeurgh, H. et al., FEBS Lett. 204:223-227, 1986.
[0263] Ward, M. et al., Appl. Microbiol. Biotechnol. 39:738-743, 1993.
[0264] Wood, Biochem. Soc. Trans., 13, pp. 407-410, 1985.
[0265] Wood et al., METHODS IN ENZYMOLOGY, 160, 25, p. 87 et seq., Academic Press, New York, 1988.

Sequence CWU 1

1

2211570DNATrichoderma reesei 1atgtatcgga agttggccgt catctcggcc ttcttggcca cagctcgtgc tcagtcggcc 60tgcactctcc aatcggagac tcacccgcct ctgacatggc agaaatgctc gtctggtggc 120acttgcactc aacagacagg ctccgtggtc atcgacgcca actggcgctg gactcacgct 180acgaacagca gcacgaactg ctacgatggc aacacttgga gctcgaccct atgtcctgac 240aacgagacct gcgcgaagaa ctgctgtctg gacggtgccg cctacgcgtc cacgtacgga 300gttaccacga gcggtaacag cctctccatt ggctttgtca cccagtctgc gcagaagaac 360gttggcgctc gcctttacct tatggcgagc gacacgacct accaggaatt caccctgctt 420ggcaacgagt tctctttcga tgttgatgtt tcgcagctgc cgtaagtgac ttaccatgaa 480cccctgacgt atcttcttgt gggctcccag ctgactggcc aatttaaggt gcggcttgaa 540cggagctctc tacttcgtgt ccatggacgc ggatggtggc gtgagcaagt atcccaccaa 600caccgctggc gccaagtacg gcacggggta ctgtgacagc cagtgtcccc gcgatctgaa 660gttcatcaat ggccaggcca acgttgaggg ctgggagccg tcatccaaca acgcaaacac 720gggcattgga ggacacggaa gctgctgctc tgagatggat atctgggagg ccaactccat 780ctccgaggct cttacccccc acccttgcac gactgtcggc caggagatct gcgagggtga 840tgggtgcggc ggaacttact ccgataacag atatggcggc acttgcgatc ccgatggctg 900cgactggaac ccataccgcc tgggcaacac cagcttctac ggccctggct caagctttac 960cctcgatacc accaagaaat tgaccgttgt cacccagttc gagacgtcgg gtgccatcaa 1020ccgatactat gtccagaatg gcgtcacttt ccagcagccc aacgccgagc ttggtagtta 1080ctctggcaac gagctcaacg atgattactg cacagctgag gaggcagaat tcggcggatc 1140ctctttctca gacaagggcg gcctgactca gttcaagaag gctacctctg gcggcatggt 1200tctggtcatg agtctgtggg atgatgtgag tttgatggac aaacatgcgc gttgacaaag 1260agtcaagcag ctgactgaga tgttacagta ctacgccaac atgctgtggc tggactccac 1320ctacccgaca aacgagacct cctccacacc cggtgccgtg cgcggaagct gctccaccag 1380ctccggtgtc cctgctcagg tcgaatctca gtctcccaac gccaaggtca ccttctccaa 1440catcaagttc ggacccattg gcagcaccgg caaccctagc ggcggcaacc ctcccggcgg 1500aaacccgcct ggcaccacca ccacccgccg cccagccact accactggaa gctctcccgg 1560acctactagt 1570251DNATrichoderma reesei 2atgtatcgga agttggccgt catctcggcc ttcttggcca cagctcgtgc t 5131423DNATrichoderma reesei 3cagtcggcct gcactctcca atcggagact 
cacccgcctc tgacatggca gaaatgctcg 60tctggtggca cttgcactca acagacaggc tccgtggtca tcgacgccaa ctggcgctgg 120actcacgcta cgaacagcag cacgaactgc tacgatggca acacttggag ctcgacccta 180tgtcctgaca acgagacctg cgcgaagaac tgctgtctgg acggtgccgc ctacgcgtcc 240acgtacggag ttaccacgag cggtaacagc ctctccattg gctttgtcac ccagtctgcg 300cagaagaacg ttggcgctcg cctttacctt atggcgagcg acacgaccta ccaggaattc 360accctgcttg gcaacgagtt ctctttcgat gttgatgttt cgcagctgcc gtaagtgact 420taccatgaac ccctgacgta tcttcttgtg ggctcccagc tgactggcca atttaaggtg 480cggcttgaac ggagctctct acttcgtgtc catggacgcg gatggtggcg tgagcaagta 540tcccaccaac accgctggcg ccaagtacgg cacggggtac tgtgacagcc agtgtccccg 600cgatctgaag ttcatcaatg gccaggccaa cgttgagggc tgggagccgt catccaacaa 660cgcaaacacg ggcattggag gacacggaag ctgctgctct gagatggata tctgggaggc 720caactccatc tccgaggctc ttacccccca cccttgcacg actgtcggcc aggagatctg 780cgagggtgat gggtgcggcg gaacttactc cgataacaga tatggcggca cttgcgatcc 840cgatggctgc gactggaacc cataccgcct gggcaacacc agcttctacg gccctggctc 900aagctttacc ctcgatacca ccaagaaatt gaccgttgtc acccagttcg agacgtcggg 960tgccatcaac cgatactatg tccagaatgg cgtcactttc cagcagccca acgccgagct 1020tggtagttac tctggcaacg agctcaacga tgattactgc acagctgagg aggcagaatt 1080cggcggatcc tctttctcag acaagggcgg cctgactcag ttcaagaagg ctacctctgg 1140cggcatggtt ctggtcatga gtctgtggga tgatgtgagt ttgatggaca aacatgcgcg 1200ttgacaaaga gtcaagcagc tgactgagat gttacagtac tacgccaaca tgctgtggct 1260ggactccacc tacccgacaa acgagacctc ctccacaccc ggtgccgtgc gcggaagctg 1320ctccaccagc tccggtgtcc ctgctcaggt cgaatctcag tctcccaacg ccaaggtcac 1380cttctccaac atcaagttcg gacccattgg cagcaccggc aac 1423496DNATrichoderma reesei 4cctagcggcg gcaaccctcc cggcggaaac ccgcctggca ccaccaccac ccgccgccca 60gccactacca ctggaagctc tcccggacct actagt 965480PRTTrichoderma reesei 5Met Tyr Arg Lys Leu Ala Val Ile Ser Ala Phe Leu Ala Thr Ala Arg1 5 10 15Ala Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr 20 25 30Trp Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser 35 40 45Val Val 
Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser 50 55 60Thr Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp65 70 75 80Asn Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala 85 90 95Ser Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe 100 105 110Val Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr Leu Met 115 120 125Ala Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe 130 135 140Ser Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala145 150 155 160Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro 165 170 175Thr Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln 180 185 190Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly 195 200 205Trp Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly 210 215 220Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu225 230 235 240Ala Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu 245 250 255Gly Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr 260 265 270Cys Asp Pro Asp Gly Cys Asp Trp Asn Pro Tyr Arg Leu Gly Asn Thr 275 280 285Ser Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys 290 295 300Leu Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr305 310 315 320Tyr Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly 325 330 335Ser Tyr Ser Gly Asn Glu Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu 340 345 350Ala Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln 355 360 365Phe Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp 370 375 380Asp Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr385 390 395 400Asn Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr 405 410 415Ser Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys 420 425 430Val Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asn 435 440 445Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Pro Pro Gly Thr Thr Thr 450 455 460Thr Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly 
Pro Thr Ser465 470 475 4806431PRTTrichoderma reesei 6Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr Trp1 5 10 15Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser Val 20 25 30Val Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser Thr 35 40 45Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp Asn 50 55 60Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala Ser65 70 75 80Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe Val 85 90 95Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr Leu Met Ala 100 105 110Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe Ser 115 120 125Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala Leu 130 135 140Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro Thr145 150 155 160Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln Cys 165 170 175Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly Trp 180 185 190Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly Ser 195 200 205Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu Ala 210 215 220Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu Gly225 230 235 240Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr Cys 245 250 255Asp Pro Asp Gly Cys Asp Trp Asn Pro Tyr Arg Leu Gly Asn Thr Ser 260 265 270Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys Leu 275 280 285Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr Tyr 290 295 300Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly Ser305 310 315 320Tyr Ser Gly Asn Glu Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu Ala 325 330 335Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln Phe 340 345 350Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp Asp 355 360 365Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr Asn 370 375 380Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr Ser385 390 395 400Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys Val 405 410 
415Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asn 420 425 43071077DNAAcidothermus cellulolyticus 7gcgggcggcg gctattggca cacgagcggc cgggagatcc tggacgcgaa caacgtgccg 60gtacggatcg ccggcatcaa ctggtttggg ttcgaaacct gcaattacgt cgtgcacggt 120ctctggtcac gcgactaccg cagcatgctc gaccagataa agtcgctcgg ctacaacaca 180atccggctgc cgtactctga cgacattctc aagccgggca ccatgccgaa cagcatcaat 240ttttaccaga tgaatcagga cctgcagggt ctgacgtcct tgcaggtcat ggacaaaatc 300gtcgcgtacg ccggtcagat cggcctgcgc atcattcttg accgccaccg accggattgc 360agcgggcagt cggcgctgtg gtacacgagc agcgtctcgg aggctacgtg gatttccgac 420ctgcaagcgc tggcgcagcg ctacaaggga aacccgacgg tcgtcggctt tgacttgcac 480aacgagccgc atgacccggc ctgctggggc tgcggcgatc cgagcatcga ctggcgattg 540gccgccgagc gggccggaaa cgccgtgctc tcggtgaatc cgaacctgct cattttcgtc 600gaaggtgtgc agagctacaa cggagactcc tactggtggg gcggcaacct gcaaggagcc 660ggccagtacc cggtcgtgct gaacgtgccg aaccgcctgg tgtactcggc gcacgactac 720gcgacgagcg tctacccgca gacgtggttc agcgatccga ccttccccaa caacatgccc 780ggcatctgga acaagaactg gggatacctc ttcaatcaga acattgcacc ggtatggctg 840ggcgaattcg gtacgacact gcaatccacg accgaccaga cgtggctgaa gacgctcgtc 900cagtacctac ggccgaccgc gcaatacggt gcggacagct tccagtggac cttctggtcc 960tggaaccccg attccggcga cacaggagga attctcaagg atgactggca gacggtcgac 1020acagtaaaag acggctatct cgcgccgatc aagtcgtcga ttttcgatcc tgtcggc 10778359PRTAcidothermus cellulolyticus 8Ala Gly Gly Gly Tyr Trp His Thr Ser Gly Arg Glu Ile Leu Asp Ala1 5 10 15Asn Asn Val Pro Val Arg Ile Ala Gly Ile Asn Trp Phe Gly Phe Glu 20 25 30Thr Cys Asn Tyr Val Val His Gly Leu Trp Ser Arg Asp Tyr Arg Ser 35 40 45Met Leu Asp Gln Ile Lys Ser Leu Gly Tyr Asn Thr Ile Arg Leu Pro 50 55 60Tyr Ser Asp Asp Ile Leu Lys Pro Gly Thr Met Pro Asn Ser Ile Asn65 70 75 80Phe Tyr Gln Met Asn Gln Asp Leu Gln Gly Leu Thr Ser Leu Gln Val 85 90 95Met Asp Lys Ile Val Ala Tyr Ala Gly Gln Ile Gly Leu Arg Ile Ile 100 105 110Leu Asp Arg His Arg Pro Asp Cys Ser Gly Gln Ser Ala Leu Trp Tyr 115 120 125Thr Ser Ser Val Ser Glu Ala 
Thr Trp Ile Ser Asp Leu Gln Ala Leu 130 135 140Ala Gln Arg Tyr Lys Gly Asn Pro Thr Val Val Gly Phe Asp Leu His145 150 155 160Asn Glu Pro His Asp Pro Ala Cys Trp Gly Cys Gly Asp Pro Ser Ile 165 170 175Asp Trp Arg Leu Ala Ala Glu Arg Ala Gly Asn Ala Val Leu Ser Val 180 185 190Asn Pro Asn Leu Leu Ile Phe Val Glu Gly Val Gln Ser Tyr Asn Gly 195 200 205Asp Ser Tyr Trp Trp Gly Gly Asn Leu Gln Gly Ala Gly Gln Tyr Pro 210 215 220Val Val Leu Asn Val Pro Asn Arg Leu Val Tyr Ser Ala His Asp Tyr225 230 235 240Ala Thr Ser Val Tyr Pro Gln Thr Trp Phe Ser Asp Pro Thr Phe Pro 245 250 255Asn Asn Met Pro Gly Ile Trp Asn Lys Asn Trp Gly Tyr Leu Phe Asn 260 265 270Gln Asn Ile Ala Pro Val Trp Leu Gly Glu Phe Gly Thr Thr Leu Gln 275 280 285Ser Thr Thr Asp Gln Thr Trp Leu Lys Thr Leu Val Gln Tyr Leu Arg 290 295 300Pro Thr Ala Gln Tyr Gly Ala Asp Ser Phe Gln Trp Thr Phe Trp Ser305 310 315 320Trp Asn Pro Asp Ser Gly Asp Thr Gly Gly Ile Leu Lys Asp Asp Trp 325 330 335Gln Thr Val Asp Thr Val Lys Asp Gly Tyr Leu Ala Pro Ile Lys Ser 340 345 350Ser Ile Phe Asp Pro Val Gly 35592223DNAAcidothermus cellulolyticus 9gcgacgactc agccgtacac ctggagcaac gtggcgatcg ggggcggcgg ctttgtcgac 60gggatcgtct tcaatgaagg tgcaccggga attctgtacg tgcggacgga catcgggggg 120atgtatcgat gggatgccgc caacgggcgg tggatccctc ttctggattg ggtgggatgg 180aacaattggg ggtacaacgg cgtcgtcagc attgcggcag acccgatcaa tactaacaag 240gtatgggccg ccgtcggaat gtacaccaac agctgggacc caaacgacgg agcgattctc 300cgctcgtctg atcagggcgc aacgtggcaa ataacgcccc tgccgttcaa gcttggcggc 360aacatgcccg ggcgtggaat gggcgagcgg cttgcggtgg atccaaacaa tgacaacatt 420ctgtatttcg gcgccccgag cggcaaaggg ctctggagaa gcacagattc cggcgcgacc 480tggtcccaga tgacgaactt tccggacgta ggcacgtaca ttgcaaatcc cactgacacg 540accggctatc agagcgatat tcaaggcgtc gtctgggtcg ctttcgacaa gtcttcgtca 600tcgctcgggc aagcgagtaa gaccattttt gtgggcgtgg cggatcccaa taatccggtc 660ttctggagca gagacggcgg cgcgacgtgg caggcggtgc cgggtgcgcc gaccggcttc 720atcccgcaca agggcgtctt tgacccggtc aaccacgtgc tctatattgc caccagcaat 
780acgggtggtc cgtatgacgg gagctccggc gacgtctgga aattctcggt gacctccggg 840acatggacgc gaatcagccc ggtaccttcg acggacacgg ccaacgacta ctttggttac 900agcggcctca ctatcgaccg ccagcacccg aacacgataa tggtggcaac ccagatatcg 960tggtggccgg acaccataat ctttcggagc accgacggcg gtgcgacgtg gacgcggatc 1020tgggattgga cgagttatcc caatcgaagc ttgcgatatg tgcttgacat ttcggcggag 1080ccttggctga ccttcggcgt acagccgaat cctcccgtac cgagtccgaa gctcggctgg 1140atggatgaag cgatggcaat cgatccgttc aactctgatc ggatgctcta cggaacaggc 1200gcgacgttgt acgcaacaaa tgatctcacg aagtgggact ccggcggcca gattcatatc 1260gcgccgatgg tcaaaggatt ggaggagacg gcggtaaacg atctcatcag cccgccgtct 1320ggcgccccgc tcatcagcgc tctcggagac ctcggcggct tcacccacgc cgacgttact 1380gccgtgccat cgacgatctt cacgtcaccg gtgttcacga ccggcaccag cgtcgactat 1440gcggaattga atccgtcgat catcgttcgc gctggaagtt tcgatccatc gagccaaccg 1500aacgacaggc acgtcgcgtt ctcgacagac ggcggcaaga actggttcca aggcagcgaa 1560cctggcgggg tgacgacggg cggcaccgtc gccgcatcgg ccgacggctc tcgtttcgtc 1620tgggctcccg gcgatcccgg tcagcctgtg gtgtacgcag tcggatttgg caactcctgg 1680gctgcttcgc aaggtgttcc cgccaatgcc cagatccgct cagaccgggt gaatccaaag 1740actttctatg ccctatccaa tggaaccttc tatcgaagca cggacggcgg cgtgacattc 1800caaccggtcg cggccggtct tccgagcagc ggtgccgtcg gtgtcatgtt ccacgcggtg 1860cctggaaaag aaggcgatct gtggctcgct gcatcgagcg ggctttacca ctcaaccaat 1920ggcggcagca gttggtctgc aatcaccggc gtatcctccg cggtgaacgt gggatttggt 1980aagtctgcgc ccgggtcgtc atacccagcc gtctttgtcg tcggcacgat cggaggcgtt 2040acgggggcgt accgctccga cgacggtggg acgacctggg tacggatcaa tgatgaccag 2100caccaatacg gaaattgggg acaagcaatc accggtgacc cgcgaattta cgggcgggtg 2160tacataggca cgaacggccg tggaattgtc tacggggaca ttggtggtgc gccgtccgga 2220tcg 222310741PRTAcidothermus cellulolyticus 10Ala Thr Thr Gln Pro Tyr Thr Trp Ser Asn Val Ala Ile Gly Gly Gly1 5 10 15Gly Phe Val Asp Gly Ile Val Phe Asn Glu Gly Ala Pro Gly Ile Leu 20 25 30Tyr Val Arg Thr Asp Ile Gly Gly Met Tyr Arg Trp Asp Ala Ala Asn 35 40 45Gly Arg Trp Ile Pro Leu Leu Asp Trp Val Gly Trp Asn Asn 
Trp Gly 50 55 60Tyr Asn Gly Val Val Ser Ile Ala Ala Asp Pro Ile Asn Thr Asn Lys65 70 75

80Val Trp Ala Ala Val Gly Met Tyr Thr Asn Ser Trp Asp Pro Asn Asp 85 90 95Gly Ala Ile Leu Arg Ser Ser Asp Gln Gly Ala Thr Trp Gln Ile Thr 100 105 110Pro Leu Pro Phe Lys Leu Gly Gly Asn Met Pro Gly Arg Gly Met Gly 115 120 125Glu Arg Leu Ala Val Asp Pro Asn Asn Asp Asn Ile Leu Tyr Phe Gly 130 135 140Ala Pro Ser Gly Lys Gly Leu Trp Arg Ser Thr Asp Ser Gly Ala Thr145 150 155 160Trp Ser Gln Met Thr Asn Phe Pro Asp Val Gly Thr Tyr Ile Ala Asn 165 170 175Pro Thr Asp Thr Thr Gly Tyr Gln Ser Asp Ile Gln Gly Val Val Trp 180 185 190Val Ala Phe Asp Lys Ser Ser Ser Ser Leu Gly Gln Ala Ser Lys Thr 195 200 205Ile Phe Val Gly Val Ala Asp Pro Asn Asn Pro Val Phe Trp Ser Arg 210 215 220Asp Gly Gly Ala Thr Trp Gln Ala Val Pro Gly Ala Pro Thr Gly Phe225 230 235 240Ile Pro His Lys Gly Val Phe Asp Pro Val Asn His Val Leu Tyr Ile 245 250 255Ala Thr Ser Asn Thr Gly Gly Pro Tyr Asp Gly Ser Ser Gly Asp Val 260 265 270Trp Lys Phe Ser Val Thr Ser Gly Thr Trp Thr Arg Ile Ser Pro Val 275 280 285Pro Ser Thr Asp Thr Ala Asn Asp Tyr Phe Gly Tyr Ser Gly Leu Thr 290 295 300Ile Asp Arg Gln His Pro Asn Thr Ile Met Val Ala Thr Gln Ile Ser305 310 315 320Trp Trp Pro Asp Thr Ile Ile Phe Arg Ser Thr Asp Gly Gly Ala Thr 325 330 335Trp Thr Arg Ile Trp Asp Trp Thr Ser Tyr Pro Asn Arg Ser Leu Arg 340 345 350Tyr Val Leu Asp Ile Ser Ala Glu Pro Trp Leu Thr Phe Gly Val Gln 355 360 365Pro Asn Pro Pro Val Pro Ser Pro Lys Leu Gly Trp Met Asp Glu Ala 370 375 380Met Ala Ile Asp Pro Phe Asn Ser Asp Arg Met Leu Tyr Gly Thr Gly385 390 395 400Ala Thr Leu Tyr Ala Thr Asn Asp Leu Thr Lys Trp Asp Ser Gly Gly 405 410 415Gln Ile His Ile Ala Pro Met Val Lys Gly Leu Glu Glu Thr Ala Val 420 425 430Asn Asp Leu Ile Ser Pro Pro Ser Gly Ala Pro Leu Ile Ser Ala Leu 435 440 445Gly Asp Leu Gly Gly Phe Thr His Ala Asp Val Thr Ala Val Pro Ser 450 455 460Thr Ile Phe Thr Ser Pro Val Phe Thr Thr Gly Thr Ser Val Asp Tyr465 470 475 480Ala Glu Leu Asn Pro Ser Ile Ile Val Arg Ala Gly Ser Phe Asp Pro 485 490 495Ser Ser Gln Pro Asn Asp Arg His Val 
Ala Phe Ser Thr Asp Gly Gly 500 505 510Lys Asn Trp Phe Gln Gly Ser Glu Pro Gly Gly Val Thr Thr Gly Gly 515 520 525Thr Val Ala Ala Ser Ala Asp Gly Ser Arg Phe Val Trp Ala Pro Gly 530 535 540Asp Pro Gly Gln Pro Val Val Tyr Ala Val Gly Phe Gly Asn Ser Trp545 550 555 560Ala Ala Ser Gln Gly Val Pro Ala Asn Ala Gln Ile Arg Ser Asp Arg 565 570 575Val Asn Pro Lys Thr Phe Tyr Ala Leu Ser Asn Gly Thr Phe Tyr Arg 580 585 590Ser Thr Asp Gly Gly Val Thr Phe Gln Pro Val Ala Ala Gly Leu Pro 595 600 605Ser Ser Gly Ala Val Gly Val Met Phe His Ala Val Pro Gly Lys Glu 610 615 620Gly Asp Leu Trp Leu Ala Ala Ser Ser Gly Leu Tyr His Ser Thr Asn625 630 635 640Gly Gly Ser Ser Trp Ser Ala Ile Thr Gly Val Ser Ser Ala Val Asn 645 650 655Val Gly Phe Gly Lys Ser Ala Pro Gly Ser Ser Tyr Pro Ala Val Phe 660 665 670Val Val Gly Thr Ile Gly Gly Val Thr Gly Ala Tyr Arg Ser Asp Asp 675 680 685Gly Gly Thr Thr Trp Val Arg Ile Asn Asp Asp Gln His Gln Tyr Gly 690 695 700Asn Trp Gly Gln Ala Ile Thr Gly Asp Pro Arg Ile Tyr Gly Arg Val705 710 715 720Tyr Ile Gly Thr Asn Gly Arg Gly Ile Val Tyr Gly Asp Ile Gly Gly 725 730 735Ala Pro Ser Gly Ser 740111293DNAArtificial Sequenceendoglucanase nucleotide derived from Thermobifida fusca 11gccggtctca ccgccacagt caccaaagaa tcctcgtggg acaacggcta ctccgcgtcc 60gtcaccgtcc gcaacgacac ctcgagcacc gtctcccagt gggaggtcgt cctcaccctg 120cccggcggca ctacagtggc ccaggtgtgg aacgcccagc acaccagcag cggcaactcc 180cacaccttca ccggggtttc ctggaacagc accatcccgc ccggaggcac cgcctcttcc 240ggcttcatcg cttccggcag cggcgaaccc acccactgca ccatcaacgg cgccccctgc 300gacgaaggct ccgagccggg cggccccggc ggtcccggaa ccccctcccc cgaccccggc 360acgcagcccg gcaccggcac cccggtcgag cggtacggca aagtccaggt ctgcggcacc 420cagctctgcg acgagcacgg caacccggtc caactgcgcg gcatgagcac ccacggcatc 480cagtggttcg accactgcct gaccgacagc tcgctggacg ccctggccta cgactggaag 540gccgacatca tccgcctgtc catgtacatc caggaagacg gctacgagac caacccgcgc 600ggcttcaccg accggatgca ccagctcatc gacatggcca cggcgcgcgg cctgtacgtg 660atcgtggact ggcacatcct caccccgggc 
gatccccact acaacctgga ccgggccaag 720accttcttcg cggaaatcgc ccagcgccac gccagcaaga ccaacgtgct ctacgagatc 780gccaacgaac ccaacggagt gagctgggcc tccatcaaga gctacgccga agaggtcatc 840ccggtgatcc gccagcgcga ccccgactcg gtgatcatcg tgggcacccg cggctggtcg 900tcgctcggcg tctccgaagg ctccggcccc gccgagatcg cggccaaccc ggtcaacgcc 960tccaacatca tgtacgcctt ccacttctac gcggcctcgc accgcgacaa ctacctcaac 1020gcgctgcgtg aggcctccga gctgttcccg gtcttcgtca ccgagttcgg caccgagacc 1080tacaccggtg acggcgccaa cgacttccag atggccgacc gctacatcga cctgatggcg 1140gaacggaaga tcgggtggac caagtggaac tactcggacg acttccgttc cggcgcggtc 1200ttccagccgg gcacctgcgc gtccggcggc ccgtggagcg gttcgtcgct gaaggcgtcc 1260ggacagtggg tgcggagcaa gctccagtcc tga 129312430PRTArtificial Sequenceendoglucanase derived from Thermobifida fusca 12Ala Gly Leu Thr Ala Thr Val Thr Lys Glu Ser Ser Trp Asp Asn Gly1 5 10 15Tyr Ser Ala Ser Val Thr Val Arg Asn Asp Thr Ser Ser Thr Val Ser 20 25 30Gln Trp Glu Val Val Leu Thr Leu Pro Gly Gly Thr Thr Val Ala Gln 35 40 45Val Trp Asn Ala Gln His Thr Ser Ser Gly Asn Ser His Thr Phe Thr 50 55 60Gly Val Ser Trp Asn Ser Thr Ile Pro Pro Gly Gly Thr Ala Ser Ser65 70 75 80Gly Phe Ile Ala Ser Gly Ser Gly Glu Pro Thr His Cys Thr Ile Asn 85 90 95Gly Ala Pro Cys Asp Glu Gly Ser Glu Pro Gly Gly Pro Gly Gly Pro 100 105 110Gly Thr Pro Ser Pro Asp Pro Gly Thr Gln Pro Gly Thr Gly Thr Pro 115 120 125Val Glu Arg Tyr Gly Lys Val Gln Val Cys Gly Thr Gln Leu Cys Asp 130 135 140Glu His Gly Asn Pro Val Gln Leu Arg Gly Met Ser Thr His Gly Ile145 150 155 160Gln Trp Phe Asp His Cys Leu Thr Asp Ser Ser Leu Asp Ala Leu Ala 165 170 175Tyr Asp Trp Lys Ala Asp Ile Ile Arg Leu Ser Met Tyr Ile Gln Glu 180 185 190Asp Gly Tyr Glu Thr Asn Pro Arg Gly Phe Thr Asp Arg Met His Gln 195 200 205Leu Ile Asp Met Ala Thr Ala Arg Gly Leu Tyr Val Ile Val Asp Trp 210 215 220His Ile Leu Thr Pro Gly Asp Pro His Tyr Asn Leu Asp Arg Ala Lys225 230 235 240Thr Phe Phe Ala Glu Ile Ala Gln Arg His Ala Ser Lys Thr Asn Val 245 250 255Leu Tyr Glu Ile Ala Asn Glu Pro 
Asn Gly Val Ser Trp Ala Ser Ile 260 265 270Lys Ser Tyr Ala Glu Glu Val Ile Pro Val Ile Arg Gln Arg Asp Pro 275 280 285Asp Ser Val Ile Ile Val Gly Thr Arg Gly Trp Ser Ser Leu Gly Val 290 295 300Ser Glu Gly Ser Gly Pro Ala Glu Ile Ala Ala Asn Pro Val Asn Ala305 310 315 320Ser Asn Ile Met Tyr Ala Phe His Phe Tyr Ala Ala Ser His Arg Asp 325 330 335Asn Tyr Leu Asn Ala Leu Arg Glu Ala Ser Glu Leu Phe Pro Val Phe 340 345 350Val Thr Glu Phe Gly Thr Glu Thr Tyr Thr Gly Asp Gly Ala Asn Asp 355 360 365Phe Gln Met Ala Asp Arg Tyr Ile Asp Leu Met Ala Glu Arg Lys Ile 370 375 380Gly Trp Thr Lys Trp Asn Tyr Ser Asp Asp Phe Arg Ser Gly Ala Val385 390 395 400Phe Gln Pro Gly Thr Cys Ala Ser Gly Gly Pro Trp Ser Gly Ser Ser 405 410 415Leu Lys Ala Ser Gly Gln Trp Val Arg Ser Lys Leu Gln Ser 420 425 430132656DNAArtificial Sequencefusion construct 13atgtatcgga agttggccgt catctcggcc ttcttggcca cagctcgtgc tcagtcggcc 60tgcactctcc aatcggagac tcacccgcct ctgacatggc agaaatgctc gtctggtggc 120acttgcactc aacagacagg ctccgtggtc atcgacgcca actggcgctg gactcacgct 180acgaacagca gcacgaactg ctacgatggc aacacttgga gctcgaccct atgtcctgac 240aacgagacct gcgcgaagaa ctgctgtctg gacggtgccg cctacgcgtc cacgtacgga 300gttaccacga gcggtaacag cctctccatt ggctttgtca cccagtctgc gcagaagaac 360gttggcgctc gcctttacct tatggcgagc gacacgacct accaggaatt caccctgctt 420ggcaacgagt tctctttcga tgttgatgtt tcgcagctgc cgtaagtgac ttaccatgaa 480cccctgacgt atcttcttgt gggctcccag ctgactggcc aatttaaggt gcggcttgaa 540cggagctctc tacttcgtgt ccatggacgc ggatggtggc gtgagcaagt atcccaccaa 600caccgctggc gccaagtacg gcacggggta ctgtgacagc cagtgtcccc gcgatctgaa 660gttcatcaat ggccaggcca acgttgaggg ctgggagccg tcatccaaca acgcaaacac 720gggcattgga ggacacggaa gctgctgctc tgagatggat atctgggagg ccaactccat 780ctccgaggct cttacccccc acccttgcac gactgtcggc caggagatct gcgagggtga 840tgggtgcggc ggaacttact ccgataacag atatggcggc acttgcgatc ccgatggctg 900cgactggaac ccataccgcc tgggcaacac cagcttctac ggccctggct caagctttac 960cctcgatacc accaagaaat tgaccgttgt cacccagttc gagacgtcgg 
gtgccatcaa 1020ccgatactat gtccagaatg gcgtcacttt ccagcagccc aacgccgagc ttggtagtta 1080ctctggcaac gagctcaacg atgattactg cacagctgag gaggcagaat tcggcggatc 1140ctctttctca gacaagggcg gcctgactca gttcaagaag gctacctctg gcggcatggt 1200tctggtcatg agtctgtggg atgatgtgag tttgatggac aaacatgcgc gttgacaaag 1260agtcaagcag ctgactgaga tgttacagta ctacgccaac atgctgtggc tggactccac 1320ctacccgaca aacgagacct cctccacacc cggtgccgtg cgcggaagct gctccaccag 1380ctccggtgtc cctgctcagg tcgaatctca gtctcccaac gccaaggtca ccttctccaa 1440catcaagttc ggacccattg gcagcaccgg caaccctagc ggcggcaacc ctcccggcgg 1500aaacccgcct ggcaccacca ccacccgccg cccagccact accactggaa gctctcccgg 1560acctactagt aagcgggcgg gcggcggcta ttggcacacg agcggccggg agatcctgga 1620cgcgaacaac gtgccggtac ggatcgccgg catcaactgg tttgggttcg aaacctgcaa 1680ttacgtcgtg cacggtctct ggtcacgcga ctaccgcagc atgctcgacc agataaagtc 1740gctcggctac aacacaatcc ggctgccgta ctctgacgac attctcaagc cgggcaccat 1800gccgaacagc atcaattttt accagatgaa tcaggacctg cagggtctga cgtccttgca 1860ggtcatggac aaaatcgtcg cgtacgccgg tcagatcggc ctgcgcatca ttcttgaccg 1920ccaccgaccg gattgcagcg ggcagtcggc gctgtggtac acgagcagcg tctcggaggc 1980tacgtggatt tccgacctgc aagcgctggc gcagcgctac aagggaaacc cgacggtcgt 2040cggctttgac ttgcacaacg agccgcatga cccggcctgc tggggctgcg gcgatccgag 2100catcgactgg cgattggccg ccgagcgggc cggaaacgcc gtgctctcgg tgaatccgaa 2160cctgctcatt ttcgtcgaag gtgtgcagag ctacaacgga gactcctact ggtggggcgg 2220caacctgcaa ggagccggcc agtacccggt cgtgctgaac gtgccgaacc gcctggtgta 2280ctcggcgcac gactacgcga cgagcgtcta cccgcagacg tggttcagcg atccgacctt 2340ccccaacaac atgcccggca tctggaacaa gaactgggga tacctcttca atcagaacat 2400tgcaccggta tggctgggcg aattcggtac gacactgcaa tccacgaccg accagacgtg 2460gctgaagacg ctcgtccagt acctacggcc gaccgcgcaa tacggtgcgg acagcttcca 2520gtggaccttc tggtcctgga accccgattc cggcgacaca ggaggaattc tcaaggatga 2580ctggcagacg gtcgacacag taaaagacgg ctatctcgcg ccgatcaagt cgtcgatttt 2640cgatcctgtc ggctaa 265614839PRTArtificial Sequencefusion construct 14Met Tyr Arg Lys Leu Ala 
Val Ile Ser Ala Pro Leu Ala Thr Ala Arg1 5 10 15Ala Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr 20 25 30Trp Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser 35 40 45Val Val Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser 50 55 60Thr Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp65 70 75 80Asn Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala 85 90 95Ser Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe 100 105 110Val Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr Leu Met 115 120 125Ala Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe 130 135 140Ser Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala145 150 155 160Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro 165 170 175Thr Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln 180 185 190Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly 195 200 205Trp Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly 210 215 220Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu225 230 235 240Ala Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu 245 250 255Gly Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr 260 265 270Cys Asp Pro Asp Gly Cys Asp Trp Asn Pro Tyr Arg Leu Gly Asn Thr 275 280 285Ser Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys 290 295 300Leu Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr305 310 315 320Tyr Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly 325 330 335Ser Tyr Ser Gly Asn Glu Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu 340 345 350Ala Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln 355 360 365Phe Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp 370 375 380Asp Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr385 390 395 400Asn Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr 405 410 415Ser Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys 420 425 
430Val Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asn 435 440 445Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Pro Pro Gly Thr Met Arg 450 455 460Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro Thr Ser Lys Arg465 470 475 480Ala Gly Gly Gly Tyr Trp His Thr Ser Gly Arg Glu Ile Leu Asp Ala 485 490 495Asn Asn Val Pro Val Arg Ile Ala Gly Ile Asn Trp Phe Gly Phe Glu 500 505 510Thr Cys Asn Tyr Val Val His Gly Leu Trp Ser Arg Asp Tyr Arg Ser 515 520 525Met Leu Asp Gln Ile Lys Ser Leu Gly Tyr Asn Thr Ile Arg Leu Pro 530 535 540Tyr Ser Asp Asp Ile Leu Lys Pro Gly Thr Met Pro Asn Ser Ile Asn545 550 555 560Phe Tyr Gln Met Asn Gln Asp Leu Gln Gly Leu Thr Ser Leu Gln Val 565 570 575Met Asp Lys Ile Val Ala Tyr Ala Gly Gln Ile Gly Leu Arg Ile Ile 580 585 590Leu Asp Arg His Arg Pro Asp Cys Ser Gly Gln Ser Ala Leu Trp Tyr 595 600 605Thr Ser Ser Val Ser Glu Ala Thr Trp Ile Ser Asp Leu Gln Ala Leu 610 615 620Ala Gln Arg Tyr Lys Gly Asn Pro Thr Val Val Gly Phe Asp Leu His625 630 635 640Asn Glu Pro His Asp Pro Ala Cys Trp Gly Cys Gly Asp Pro Ser Ile 645 650 655Asp Trp Arg Leu Ala Ala Glu Arg Ala Gly Asn Ala Val Leu Ser Val 660 665 670Asn Pro Asn Leu Leu Ile Phe Val Glu Gly Val Gln Ser Tyr Asn Gly 675

680 685Asp Ser Tyr Trp Trp Gly Gly Asn Leu Gln Gly Ala Gly Gln Tyr Pro 690 695 700Val Val Leu Asn Val Pro Asn Arg Leu Val Tyr Ser Ala His Asp Tyr705 710 715 720Ala Thr Ser Val Tyr Pro Gln Thr Trp Phe Ser Asp Pro Thr Phe Pro 725 730 735Asn Asn Trp Gly Ile Trp Asn Lys Asn Trp Gly Tyr Leu Ile Phe Asn 740 745 750Gln Asn Ile Ala Pro Val Trp Leu Gly Glu Phe Gly Thr Thr Leu Gln 755 760 765Ser Thr Thr Asp Gln Thr Trp Leu Lys Thr Leu Val Gln Tyr Leu Arg 770 775 780Pro Thr Ala Gln Tyr Gly Ala Asp Ser Phe Gln Trp Thr Phe Trp Ser785 790 795 800Trp Asn Pro Asp Ser Gly Asp Thr Gly Gly Ile Leu Lys Asp Asp Trp 805 810 815Gln Thr Val Asp Thr Val Lys Asp Gly Tyr Leu Ala Pro Ile Lys Ser 820 825 830Ser Ile Phe Asp Pro Val Gly 8351510244DNAArtificial SequencepTrex4 plasmid construct 15aagcttaact agtacttctc gagctctgta catgtccggt cgcgacgtac gcgtatcgat 60ggcgccagct gcaggcggcc gcctgcagcc acttgcagtc ccgtggaatt ctcacggtga 120atgtaggcct tttgtagggt aggaattgtc actcaagcac ccccaacctc cattacgcct 180cccccataga gttcccaatc agtgagtcat ggcactgttc tcaaatagat tggggagaag 240ttgacttccg cccagagctg aaggtcgcac aaccgcatga tatagggtcg gcaacggcaa 300aaaagcacgt ggctcaccga aaagcaagat gtttgcgatc taacatccag gaacctggat 360acatccatca tcacgcacga ccactttgat ctgctggtaa actcgtattc gccctaaacc 420gaagtgacgt ggtaaatcta cacgtgggcc cctttcggta tactgcgtgt gtcttctcta 480ggtgccattc ttttcccttc ctctagtgtt gaattgtttg tgttggagtc cgagctgtaa 540ctacctctga atctctggag aatggtggac taacgactac cgtgcacctg catcatgtat 600ataatagtga tcctgagaag gggggtttgg agcaatgtgg gactttgatg gtcatcaaac 660aaagaacgaa gacgcctctt ttgcaaagtt ttgtttcggc tacggtgaag aactggatac 720ttgttgtgtc ttctgtgtat ttttgtctgc aacaagaggc cagagacaat ctattcaaac 780accaagcttg ctcttttgag ctacaagaac ctgtggggta tatatctaga gttgtgaagt 840cggtaatccc gctgtatagt aatacgagtc gcatctaaat actccgaagc tgctgcgaac 900ccggagaatc gagatgtgct ggaaagcttc tagcgagcgg ctaaattagc atgaaaggct 960atgagaaatt ctggagacgg cttgttgaat catggcgttc cattcttcga caagcaaagc 1020gttccgtcgc agtagcaggc actcattccc gaaaaaactc 
ggagattcct aagtagcgat 1080ggaaccggaa taatataata ggcaatacat tgagttgcct cgacggttgc aatgcagggg 1140tactgagctt ggacataact gttccgtacc ccacctcttc tcaacctttg ggcgtttccc 1200tgattcagcg tacccgtaca agtcgtaatc actattaacc cagactgacc ggacgtgttt 1260tgcccttcat ttggagaaat aatgtcattg cgatgtgtaa tttgcctgct tgaccgactg 1320gggctgttcg aagcccgaat gtaggattgt tatccgaact ctgctcgtag aggcatgttg 1380tgaatctgtg tcgggcagga cacgcctcga aggttcacgg caagggaaac caccgatagc 1440agtgtctagt agcaacctgt aaagccgcaa tgcagcatca ctggaaaata caaaccaatc 1500tgctaaaagt acataagtta atgcctaaag aagtcatata ccagcggcta ataattgtac 1560aatcaagtgg ctaaacgtac cgtaatttgc caacggcttg tggggttgca gaagcaacgg 1620caaagcccca cttccccacg tttgtttctt cactcagtcc aatctcagct ggtgatcccc 1680caattgggtc gcttgtttgt tccggtgaag tgaaagaaga cagaggtaag aatgtctgac 1740tcggagcgtt ttgcatacaa ccaagggcag tgatggaaga cagtgaaatg ttgacattca 1800aggagtattt agccagggat gcttgagtgt atcgtgtaag gaggtttgtc tgccgatacg 1860acgaatactg tatagtcact tctgatgaag tggtccatat tgaaatgtaa gtcggcactg 1920aacaggcaaa agattgagtt gaaactgcct aagatctcgg gccctcgggc cttcggcctt 1980tgggtgtaca tgtttgtgct ccgggcaaat gcaaagtgtg gtaggatcga acacactgct 2040gcctttacca agcagctgag ggtatgtgat aggcaaatgt tcaggggcca ctgcatggtt 2100tcgaatagaa agagaagctt agccaagaac aatagccgat aaagatagcc tcattaaacg 2160gaatgagcta gtaggcaaag tcagcgaatg tgtatatata aaggttcgag gtccgtgcct 2220ccctcatgct ctccccatct actcatcaac tcagatcctc caggagactt gtacaccatc 2280ttttgaggca cagaaaccca atagtcaacc gcggactgcg catcatgtat cggaagttgg 2340ccgtcatctc ggccttcttg gccacagctc gtgctcagtc ggcctgcact ctccaatcgg 2400agactcaccc gcctctgaca tggcagaaat gctcgtctgg tggcacttgc actcaacaga 2460caggctccgt ggtcatcgac gccaactggc gctggactca cgctacgaac agcagcacga 2520actgctacga tggcaacact tggagctcga ccctatgtcc tgacaacgag acctgcgcga 2580agaactgctg tctggacggt gccgcctacg cgtccacgta cggagttacc acgagcggta 2640acagcctctc cattggcttt gtcacccagt ctgcgcagaa gaacgttggc gctcgccttt 2700accttatggc gagcgacacg acctaccagg aattcaccct gcttggcaac gagttctctt 2760tcgatgttga 
tgtttcgcag ctgccgtaag tgacttacca tcaacccctg acgtatcttc 2820ttgtgggctc ccagctgact ggccaattta aggtgcggct tgaacggagc tctctacttc 2880gtgtccatgg acgcggatgg tggcgtgagc aagtatccca ccaacaccgc tggcgccaag 2940tacggcacgg ggtactgtga cagccagtgt ccccgcgatc tgaagttcat caatggccag 3000gccaacgttg agggctggga gccgtcatcc aacaacgcaa acacgggcat tggaggacac 3060ggaagctgct gctctgagat ggatatctgg gaggccaact ccatctccga ggctcttacc 3120ccccaccctt gcacgactgt cggccaggag atctgcgagg gtgatgggtg cggcggaact 3180tactccgata acagatatgg cggcacttgc gatcccgatg gctgcgactg gaacccatac 3240cgcctgggca acaccagctt ctacggccct ggctcaagct ttaccctcga taccaccaag 3300aaattgaccg ttgtcaccca gttcgagacg tcgggtgcca tcaaccgata ctatgtccag 3360aatggcgtca ctttccagca gcccaacgcc gagcttggta gttactctgg caacgagctc 3420aacgatgatt actgcacagc tgaggaggca gaattcggcg gatcctcttt ctcagacaag 3480ggcggcctga ctcagttcaa gaaggctacc tctggcggca tggttctggt catgagtctg 3540tgggatgatg tgagtttgat ggacaaacat gcgcgttgac aaagagtcaa gcagctgact 3600gagatgttac agtactacgc caacatgctg tggctggact ccacctaccc gacaaacgag 3660acctcctcca cacccggtgc cgtgcgcgga agctgctcca ccagctccgg tgtccctgct 3720caggtcgaat ctcagtctcc caacgccaag gtcaccttct ccaacatcaa gttcggaccc 3780attggcagca ccggcaaccc tagcggcggc aaccctcccg gcggaaaccc gcctggcacc 3840accaccaccc gccgcccagc cactaccact ggaagctctc ccggacctac tagtaagcgg 3900ataaggcgcg ccgcgcgcca gctccgtgcg aaagcctgac gcaccggtag attcttggtg 3960agcccgtatc atgacggcgg cgggagctac atggccccgg gtgatttatt ttttttgtat 4020ctacttctga cccttttcaa atatacggtc aactcatctt tcactggaga tgcggcctgc 4080ttggtattgc gatgttgtca gcttggcaaa ttgtggcttt cgaaaacaca aaacgattcc 4140ttagtagcca tgcattttaa gataacggaa tagaagaaag aggaaattaa aaaaaaaaaa 4200aaaacaaaca tcccgttcat aacccgtaga atcgccgctc ttcgtgtatc ccagtaccag 4260tttattttga atagctcgcc cgctggagag catcctgaat gcaagtaaca accgtagagg 4320ctgacacggc aggtgttgct agggagcgtc gtgttctaca aggccagacg tcttcgcggt 4380tgatatatat gtatgtttga ctgcaggctg ctcagcgacg acagtcaagt tcgccctcgc 4440tgcttgtgca ataatcgcag tggggaagcc acaccgtgac 
tcccatcttt cagtaaagct 4500ctgttggtgt ttatcagcaa tacacgtaat ttaaactcgt tagcatgggg ctgatagctt 4560aattaccgtt taccagtgcc gcggttctgc agctttcctt ggcccgtaaa attcggcgaa 4620gccagccaat caccagctag gcaccagcta aaccctataa ttagtctctt atcaacacca 4680tccgctcccc cgggatcaat gaggagaatg agggggatgc ggggctaaac aagcctacat 4740aaccctcatg ccaactccca gtttacactc gtcgagccaa catcctgact ataagctaac 4800acagaatgcc tcaatcctgg gaagaactgg ccgctgataa gcgcgcccgc ctcgcaaaaa 4860ccatccctga tgaatggaaa gtccagacgc tgcctgcgga agacagcgtt attgatttcc 4920caaagaaatc ggggatcctt tcagaggccg aactgaagat cacagaggcc tccgctgcag 4980atcttgtgtc caagctggcg gccggagagt tgacctcggt ggaagttacg ctagcattct 5040gtaaacgggc agcaatcgcc cagcagttag tagggtcccc tctacctctc agggagatgt 5100aacaacgcca ccttatggga ctatcaagct gacgctggct tctgtgcaga caaactgcgc 5160ccacgagttc ttccctgacg ccgctctcgc gcaggcaagg gaactcgatg aatactacgc 5220aaagcacaag agacccgttg gtccactcca tggcctcccc atctctctca aagaccagct 5280tcgagtcaag gtacaccgtt gcccctaagt cgttagatgt ccctttttgt cagctaacat 5340atgccaccag ggctacgaaa catcaatggg ctacatctca tggctaaaca agtacgacga 5400aggggactcg gttctgacaa ccatgctccg caaagccggt gccgtcttct acgtcaagac 5460ctctgtcccg cagaccctga tggtctgcga gacagtcaac aacatcatcg ggcgcaccgt 5520caacccacgc aacaagaact ggtcgtgcgg cggcagttct ggtggtgagg gtgcgatcgt 5580tgggattcgt ggtggcgtca tcggtgtagg aacggatatc ggtggctcga ttcgagtgcc 5640ggccgcgttc aacttcctgt acggtctaag gccgagtcat gggcggctgc cgtatgcaaa 5700gatggcgaac agcatggagg gtcaggagac ggtgcacagc gttgtcgggc cgattacgca 5760ctctgttgag ggtgagtcct tcgcctcttc cttcttttcc tgctctatac caggcctcca 5820ctgtcctcct ttcttgcttt ttatactata tacgagaccg gcagtcactg atgaagtatg 5880ttagacctcc gcctcttcac caaatccgtc ctcggtcagg agccatggaa atacgactcc 5940aaggtcatcc ccatgccctg gcgccagtcc gagtcggaca ttattgcctc caagatcaag 6000aacggcgggc tcaatatcgg ctactacaac ttcgacggca atgtccttcc acaccctcct 6060atcctgcgcg gcgtggaaac caccgtcgcc gcactcgcca aagccggtca caccgtgacc 6120ccgtggacgc catacaagca cgatttcggc cacgatctca tctcccatat ctacgcggct 6180gacggcagcg 
ccgacgtaat gcgcgatatc agtgcatccg gcgagccggc gattccaaat 6240atcaaagacc tactgaaccc gaacatcaaa gctgttaaca tgaacgagct ctgggacacg 6300catctccaga agtggaatta ccagatggag taccttgaga aatggcggga ggctgaagaa 6360aaggccggga aggaactgga cgccatcatc gcgccgatta cgcctaccgc tgcggtacgg 6420catgaccagt tccggtacta tgggtatgcc tctgtgatca acctgctgga tttcacgagc 6480gtggttgttc cggttacctt tgcggataag aacatcgata agaagaatga gagtttcaag 6540gcggttagtg agcttgatgc cctcgtgcag gaagagtatg atccggaggc gtaccatggg 6600gcaccggttg cagtgcaggt tatcggacgg agactcagtg aagagaggac gttggcgatt 6660gcagaggaag tggggaagtt gctgggaaat gtggtgactc catagctaat aagtgtcaga 6720tagcaatttg cacaagaaat caataccagc aactgtaaat aagcgctgaa gtgaccatgc 6780catgctacga aagagcagaa aaaaacctgc cgtagaaccg aagagatatg acacgcttcc 6840atctctcaaa ggaagaatcc cttcagggtt gcgtttccag tctagacacg tataacggca 6900caagtgtctc tcaccaaatg ggttatatct caaatgtgat ctaaggatgg aaagcccaga 6960atctaggcct attaatattc cggagtatac gtagccggct aacgttaaca accggtacct 7020ctagaactat agctagcatg cgcaaattta aagcgctgat atcgatcgcg cgcagatcca 7080tatatagggc ccgggttata attacctcag gtcgacgtcc catggccatt cgaattcgta 7140atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat 7200acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt 7260aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta 7320atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc 7380gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa 7440ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa 7500aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct 7560ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac 7620aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc 7680gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc 7740tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg 7800tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga 7860gtccaacccg gtaagacacg acttatcgcc actggcagca 
gccactggta acaggattag 7920cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta 7980cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag 8040agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg 8100caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac 8160ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc 8220aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag 8280tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc 8340agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac 8400gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc 8460accggctcca gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg 8520tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccctggaa gctagagtaa 8580gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt 8640cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta 8700catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca 8760gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta 8820ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct 8880gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg 8940cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac 9000tctcaaggat cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact 9060gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa 9120atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt 9180ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat 9240gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg 9300acgtctaaga aaccattatt atcatgacat taacctataa aaataggcgt atcacgaggc 9360cctttcgtct cgcgcgtttc ggtgatgacg gtgaaaacct ctgacacatg cagctcccgg 9420agacggtcac agcttgtctg taagcggatg ccgggagcag acaagcccgt cagggcgcgt 9480cagcgggtgt tggcgggtgt cgggggctgg cttaactatg cggcatcaga gcagattgta 9540ctgagagtgc accataaaat tgtaaacgtt aatattttgt taaaattcgc gttaaatttt 9600tgttaaatca 
gctcattttt taaccaatag gccgaaatcg gcaaaatccc ttataaatca 9660aaagaatagc ccgagatagg gttgagtgtt gttccagttt ggaacaagag tccactatta 9720aagaacgtgg actccaacgt caaagggcga aaaaccgtct atcagggcga tggcccacta 9780cgtgaaccat cacccaaatc aagttttttg gggtcgaggt gccgtaaagc actaaatcgg 9840aaccctaaag ggagcccccg atttagagct tgacggggaa agccggcgaa cgtggcgaga 9900aaggaaggga agaaagcgaa aggagcgggc gctagggcgc tggcaagtgt agcggtcacg 9960ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc tacagggcgc gtactatggt 10020tgctttgacg tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 10080ggcgccattc gccattcagg ctgcgcaact gttgggaagg gcgatcggtg cgggcctctt 10140cgctattacg ccagctggcg aaagggggat gtgctgcaag gcgattaagt tgggtaacgc 10200cagggttttc ccagtcacga cgttgtaaaa cgacggccag tgcc 102441612DNAArtificial Sequencesynthetic 16actagtaagc gg 121741DNAArtificial Sequenceprimer 17gcttatacta gtaagcgcgc gggcggcggc tattggcaca c 411839DNAArtificial Sequenceprimer 18gcttatggcg cgccttagac aggatcgaaa atcgacgac 391943DNAArtificial Sequenceprimer 19ctaagagagc gacgactcag ccgtacacct ggagcaacgt ggc 432038DNAArtificial Sequenceprimer 20ttacgatccg gacggcgcac caccaatgtc cccgtata 382142DNAArtificial Sequenceprimer 21gcttatacta gtaagcgcgc cggtgtcacc gccacagtca cc 422236DNAArtificial Sequenceprimer 22gcttatggcg cgcctcagga ctggagcttg ctccgc 36

* * * * *
