Solubilized Enzyme And Uses Thereof Medoff; Marshall ; et al. [XYLECO, INC.]

Patent Application Summary

U.S. patent application number 16/456536 was filed with the patent office on 2019-06-28 and published on 2019-10-17 for solubilized enzyme and uses thereof. The applicant listed for this patent is XYLECO, INC. Invention is credited to Natasha Kreder, Sean Landry, James Lynch, Thomas Craig Masterman, Marshall Medoff, Desiree Pangilinan, Aiichiro Yoshida.

Application Number	20190316108 16/456536
Family ID	55582048
Filed Date	2019-06-28

United States Patent Application	20190316108
Kind Code	A1
Medoff; Marshall ; et al.	October 17, 2019

SOLUBILIZED ENZYME AND USES THEREOF

Abstract

The present invention relates to mixtures comprising a polypeptide or a plurality of polypeptides having biomass-degrading activity that is solubilized from an inclusion body and retains biomass-degrading activity, and methods for producing and using the same. The invention described herein provides methods for increasing the yield of recombinant protein with biomass-degrading activity that can be isolated from host cells.

Inventors:

Medoff; Marshall; (Brookline, MA) ; Kreder; Natasha; (Wakefield, MA) ; Lynch; James; (Woburn, MA) ; Landry; Sean; (Essex, MA) ; Yoshida; Aiichiro; (Canton, MA) ; Pangilinan; Desiree; (Waltham, MA) ; Masterman; Thomas Craig; (Rockport, MA)

Applicant:

Name	City	State	Country	Type
XYLECO, INC.	WAKEFIELD	MA	US

Family ID:

55582048

Appl. No.:

16/456536

Filed:

June 28, 2019

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
14782205	Oct 2, 2015	10377999
PCT/US2015/052200	Sep 25, 2015
16456536
62055702	Sep 26, 2014

Current U.S. Class:	1/1
Current CPC Class:	C12P 19/02 20130101; C12Y 302/01021 20130101; C12N 9/2445 20130101; C12P 19/14 20130101
International Class:	C12N 9/42 20060101 C12N009/42; C12P 19/14 20060101 C12P019/14; C12P 19/02 20060101 C12P019/02

Claims

1. A mixture comprising a plurality of polypeptides having biomass-degrading activity and a solubilizing agent, wherein the polypeptides have at least 8-10% of the biomass-degrading activity compared to a native polypeptide having biomass-degrading activity.

2. The mixture of claim 1, further comprising one or more proteins associated with an inclusion body.

3. The mixture of claim 1, wherein the mixture does not comprise one or more proteins associated with an inclusion body.

4. The mixture of any of the preceding claims, further comprising cellular debris, one or more ribosomal component, one or more host protein, and/or host nucleic acid comprising DNA and/or RNA.

5. The mixture of any of the preceding claims, wherein the biomass-degrading activity is cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.

6. The mixture of any of the preceding claims, wherein the polypeptide is partially unfolded, partially misfolded, or partially denatured.

7. The mixture of claim 1, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.

8. The mixture of any of the preceding claims, wherein the polypeptide comprises a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof.

9. The mixture of claim 8, wherein the Cel3A enzyme comprises the amino acid sequence SEQ ID NO: 1, or an amino acid sequence with at least 90% identity thereof.

10. The mixture of any of the preceding claims, wherein the polypeptide is encoded by a nucleic acid sequence comprising at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.

11. The mixture of any of the preceding claims, wherein the polypeptide is aglycosylated.

12. The mixture of any of claims 1-3 or 7, wherein the solubilizing agent comprises urea, and optionally, is present at a concentration between 0.2M-6M.

13. The mixture of any of the preceding claims, further comprising at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity.

14. The mixture of claim 13, wherein the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof.

15. The mixture of claim 13 or 14, wherein the additional polypeptide is selected from: a. a polypeptide comprising an amino acid sequence with at least 90% identity to SEQ ID NO: 1; b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.

16. The mixture of any of claims 13-15, wherein the additional polypeptide is aglycosylated.

17. The mixture of any of claims 13-15, wherein the additional polypeptide is glycosylated.

18. A mixture comprising a plurality of polypeptides having an amino acid sequence with at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, wherein the plurality of polypeptides have at least 20%-40% of the activity of the native polypeptide comprising SEQ ID NO: 1.

19. The mixture of claim 18, further comprising one or more proteins associated with an inclusion body.

20. The mixture of claim 18, wherein the mixture does not comprise one or more proteins associated with an inclusion body.

21. The mixture of any of claims 18-20, further comprising cellular debris, one or more ribosomal component, one or more host protein, and/or host nucleic acid comprising DNA and/or RNA.

22. The mixture of any of claims 18-21, wherein the polypeptide is partially unfolded, partially misfolded, or partially denatured.

23. The mixture of any of claims 18-22, wherein the polypeptide is encoded by a nucleic acid sequence comprising at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.

24. The mixture of any of claims 18-23, wherein the polypeptide is aglycosylated.

25. The mixture of any of claims 18-20, wherein the solubilizing agent comprises urea, and optionally, is present at a concentration between 0.2M-6M.

26. The mixture of any of claims 18-25, further comprising at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity.

27. The mixture of claim 26, wherein the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof.

28. The mixture of claim 26 or 27, wherein the additional polypeptide is selected from: a. a polypeptide comprising an amino acid sequence with at least 90% identity to SEQ ID NO: 1; b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.

29. The mixture of any of claims 26-28, wherein the additional polypeptide is aglycosylated.

30. The mixture of any of claims 26-28, wherein the additional polypeptide is glycosylated.

31. A method for producing a mixture of any of claims 1-30 comprising contacting a cell expressing the polypeptide having biomass-degrading activity, or lysate thereof, with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.

32. The method of claim 31, further comprising lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent.

33. The method of claim 31 or 32, wherein the solubilizing agent is urea, and optionally, wherein the concentration of the solubilizing agent is between 0.2M-6M.

34. The method of any of claims 31-33, wherein the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

35. The method of any of claims 31-34, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.

36. The method of any of claims 31-35, wherein the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

37. The method of any of claims 31-36, wherein the polypeptide is aglycosylated.

38. A method for producing a polypeptide having biomass-degrading activity comprising expressing the polypeptide in a cell and contacting the cell or a lysate thereof with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.

39. A method for producing a polypeptide having biomass-degrading activity comprising providing a cell that has been genetically modified to produce at least one polypeptide having biomass-degrading activity, wherein at least a portion of said polypeptide having biomass-degrading activity is found in inclusion bodies, and contacting the cell, or a lysate thereof containing the inclusion bodies, with a solubilizing agent at a concentration suitable for solubilizing the polypeptide.

40. The method of claim 38 or 39, wherein the solubilizing agent comprises urea.

41. The method of any of claims 38-40, wherein the concentration of the solubilizing agent is between 0.2M-6M.

42. The method of any of claims 38-41, further comprising lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent.

43. The method of any of claims 38-42, wherein the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

44. The method of claim 38 or 39, wherein the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1.

45. The method of any of claims 38-44, wherein the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

46. The method of any of claims 38-45, wherein the cell is a prokaryotic or bacterial cell, e.g., E. coli cell, origami E. coli cell.

47. The method of any of claims 38-46, wherein the polypeptide is aglycosylated.

48. A method of producing a product from a biomass comprising contacting a biomass with the mixture of any of claims 1-30, and, optionally, a microorganism that produces one or more biomass-degrading enzyme and/or an enzyme mixture comprising biomass-degrading enzymes, under conditions suitable for the production of the product.

49. The method of claim 48, further comprising treating the biomass with an electron beam prior to contacting the biomass with the mixture.

50. The method of claim 48 or 49, wherein the product is a sugar product.

51. The method of claim 50, wherein the sugar product is glucose and/or xylose.

52. The method of any of claims 48-51, further comprising isolating the product.

53. The method of claim 52, wherein the isolating of the product comprises precipitation, crystallization, chromatography, centrifugation, and/or extraction.

54. The method of any of claims 48-53, wherein the enzyme mixture comprises at least two of the enzymes selected from B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, and Swollenin.

55. The method of any of claims 48-54, wherein the biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof, wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

56. The method of any of claims 48-55, further comprising a step of treating the biomass prior to introducing the microorganism or the enzyme mixture to reduce the recalcitrance of the biomass, wherein the treating comprises bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, or freeze grinding.

57. The method of any of claims 48-56, wherein the microorganism that produces a biomass-degrading enzyme is from species in the genera selected from Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium or Trichoderma.

58. The method of any of claims 48-57, wherein the microorganism that produces a biomass-degrading enzyme is selected from Aspergillus, Humicola insolens (Scytalidium thermophilum), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium persicinum, Acremonium acremonium, Acremonium brachypenium, Acremonium dichromosporum, Acremonium obclavatum, Acremonium pinkertoniae, Acremonium roseogriseum, Acremonium incoloratum, Acremonium furatum, Chrysosporium lucknowense, Trichoderma viride, Trichoderma reesei, or Trichoderma koningii.

59. The method of any of claims 48-58, wherein the microorganism has been induced to produce biomass-degrading enzymes by combining the microorganism with an induction biomass sample under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism.

60. The method of any of claims 48-59, wherein the induction biomass sample comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof, wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

Description

RELATED APPLICATIONS

[0001] This application is a divisional of U.S. application Ser. No. 14/782,205, filed Oct. 2, 2015, which is a national stage application under 35 U.S.C. § 371 of International Application No. PCT/US2015/052200, filed Sep. 25, 2015, which claims the benefit of U.S. Provisional Application No. 62/055,702, filed Sep. 26, 2014; the entire contents of each of which are hereby incorporated by reference.

SEQUENCE LISTING

[0002] The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Sep. 1, 2015, is named X2002-7003WO_SL.txt and is 76,946 bytes in size.

FIELD OF THE INVENTION

[0003] The present invention relates generally to mixtures comprising a polypeptide having biomass-degrading activity that is solubilized from inclusion bodies and retains biomass-degrading activity, and to methods for producing the mixtures described herein. The present invention also provides methods for using such mixtures, e.g., to process biomass materials.

BACKGROUND OF THE INVENTION

[0004] Biomass-degrading enzymes, such as cellulases, xylanases, and ligninases, are important for the degradation of biomass, such as feedstock. Cellulosic and lignocellulosic materials are produced, processed, and used in large quantities in a number of applications. Often such materials are used once, and then discarded as waste, or are simply considered to be wasted materials, e.g., sewage, bagasse, sawdust, and stover.

SUMMARY OF THE INVENTION

[0005] High-level expression of recombinant proteins in host cells such as E. coli can lead to accumulation of the recombinant proteins as insoluble aggregates within the host cell. These insoluble aggregates are called inclusion bodies and can also contain other components, such as proteins endogenous to the host cell, ribosomal components, nucleic acids, and cellular debris. Solubilization of the recombinant proteins from the inclusion bodies can be achieved through treatment with high concentrations of a solubilizing agent such as urea, which disrupts hydrogen bonds and hydrophobic interactions. However, treatment with a solubilizing agent, such as urea, can result in denaturation of the protein and loss of enzymatic activity. Thus, the aggregation of recombinant proteins into inclusion bodies can reduce the yield of recombinant protein with enzymatic activity that can be isolated from the host cells.

[0006] The present invention is based, at least in part, on the surprising discovery that a heterologously expressed cellobiase that has been solubilized from inclusion bodies by a solubilizing agent, such as urea, retains cellobiase activity. Therefore, the methods described herein for solubilization of heterologously expressed cellobiase, or other biomass-degrading enzymes, are useful for increasing the yield of the heterologously expressed enzymes having biomass-degrading activity, e.g., by 30-40%. Furthermore, the presence of the solubilizing agent, e.g., urea, from the addition of the solubilized biomass-degrading enzyme, e.g., cellobiase, does not adversely affect the saccharification reaction for converting biomass to a sugar product and/or the yield of products.
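As an illustration of the yield figure above, the following minimal Python sketch computes a percent increase in active enzyme yield. It assumes, as one possible reading, that the increase is the active enzyme recovered from inclusion bodies expressed relative to the active enzyme obtained from the soluble fraction alone. The function name, the activity units, and the numbers are hypothetical and are not taken from the application.

```python
def yield_increase_percent(soluble_units: float, recovered_units: float) -> float:
    """Percent increase in active yield from adding solubilized enzyme.

    soluble_units:   active enzyme from the soluble fraction (hypothetical units)
    recovered_units: active enzyme solubilized from inclusion bodies (same units)
    """
    if soluble_units <= 0:
        raise ValueError("soluble_units must be positive")
    if recovered_units < 0:
        raise ValueError("recovered_units must be non-negative")
    return 100.0 * recovered_units / soluble_units


# Hypothetical example: 100 units from the soluble fraction plus 35 units
# recovered from inclusion bodies.
increase = yield_increase_percent(100.0, 35.0)
print(f"yield increase: {increase:.1f}%")
```

Under this reading, a recovery of 35 units on top of 100 soluble units falls inside the 30-40% range given as an example in the text.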

[0007] Accordingly, in one aspect, the disclosure features a mixture comprising a polypeptide or a plurality of polypeptides having a biomass-degrading activity and a solubilizing agent, e.g., urea, wherein the polypeptide or plurality thereof has at least 8-10% biomass-degrading activity compared to the native polypeptide.
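The activity comparison in this aspect can be sketched as a simple ratio. The Python below is a minimal sketch, assuming both activities are measured in the same assay and units; the function names, the default floor of 8%, and the sample values are illustrative assumptions rather than anything specified in the application.

```python
def relative_activity_percent(solubilized_activity: float, native_activity: float) -> float:
    """Activity of the solubilized polypeptide as a percentage of the native one."""
    if native_activity <= 0:
        raise ValueError("native_activity must be positive")
    return 100.0 * solubilized_activity / native_activity


def meets_activity_floor(solubilized_activity: float,
                         native_activity: float,
                         floor_percent: float = 8.0) -> bool:
    """True if the relative activity is at least floor_percent (assumed default: 8%)."""
    return relative_activity_percent(solubilized_activity, native_activity) >= floor_percent


# Hypothetical measurements in arbitrary activity units.
print(relative_activity_percent(2.5, 10.0))
print(meets_activity_floor(0.5, 10.0))
```

The floor is a parameter so the same check can be reused with other thresholds mentioned in the text, such as 20%.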

[0008] In one embodiment, the mixture further comprises one or more proteins associated with an inclusion body. Alternatively, in one embodiment, the mixture does not comprise one or more proteins associated with an inclusion body. In one embodiment, the mixture further comprises cellular debris, one or more ribosomal component, one or more host protein, e.g., protein endogenously expressed by the host cell, and/or host nucleic acid, e.g., DNA and/or RNA.

[0009] In one embodiment, the biomass-degrading activity is cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.

[0010] In one embodiment, the polypeptide is partially unfolded, partially misfolded, or partially denatured.

[0011] In another aspect, the disclosure features a mixture comprising a polypeptide or a plurality of polypeptides having an amino acid sequence with at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, e.g., urea, wherein the polypeptide or plurality thereof has at least 20% of the activity of the native polypeptide, e.g., SEQ ID NO: 1 or Cel3a from T. reesei. For example, the mixture further comprises one or more proteins associated with an inclusion body. Alternatively, the mixture does not comprise one or more proteins associated with an inclusion body. The mixture may further comprise one or more of the following: cellular debris, one or more ribosomal component, one or more host protein, e.g., protein endogenously expressed by the host cell, and/or host nucleic acid, e.g., DNA and/or RNA. The polypeptide with at least 90% identity to SEQ ID NO: 1 may be partially unfolded, partially misfolded, or partially denatured.

[0012] In one embodiment, the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the polypeptide comprises a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof. In one embodiment, the Cel3A enzyme comprises (e.g., consists of) the amino acid sequence SEQ ID NO: 1. In one embodiment, the polypeptide is encoded by a nucleic acid sequence comprising (e.g., consisting of) at least 90% identity to SEQ ID NO: 2 or SEQ ID NO: 3.
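To make the "at least 90% identity" criterion concrete, the following minimal Python sketch computes percent identity between two sequences that have already been aligned. It assumes one common convention (identical positions divided by aligned columns, skipping columns where both sequences have a gap); other conventions exist, and the application text here does not say which one applies. The function name and the toy sequences are hypothetical and are not SEQ ID NO: 1.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1  # x == y here implies neither is a gap
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy 10-residue sequences differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(f"{identity:.1f}% identity, meets 90% threshold: {identity >= 90.0}")
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the counting step.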

[0013] In one embodiment, the polypeptide is aglycosylated.

[0014] In one embodiment, the solubilizing agent, e.g., urea, is present in the mixture at a concentration between 0.2M-6M.
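For orientation on the 0.2M-6M range, the minimal Python sketch below converts a target urea molarity and final volume into a mass of urea, using urea's molar mass of about 60.06 g/mol. The function names are hypothetical, and the calculation assumes the solution is made up to the stated final volume.

```python
UREA_MOLAR_MASS_G_PER_MOL = 60.06  # CH4N2O


def urea_grams(molarity: float, final_volume_l: float) -> float:
    """Grams of urea needed for a solution of the given molarity and final volume."""
    if molarity < 0 or final_volume_l < 0:
        raise ValueError("molarity and volume must be non-negative")
    return molarity * final_volume_l * UREA_MOLAR_MASS_G_PER_MOL


def in_stated_range(molarity: float, low: float = 0.2, high: float = 6.0) -> bool:
    """True if the molarity lies within the 0.2M-6M range given in the text."""
    return low <= molarity <= high


# Mass of urea for 1 L at each end of the stated range.
for m in (0.2, 6.0):
    print(f"{m} M in 1 L: {urea_grams(m, 1.0):.2f} g, in range: {in_stated_range(m)}")
```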

[0015] In one embodiment, the mixture further comprises at least one additional polypeptide having a biomass-degrading activity or a microorganism that produces one or more enzymes having a biomass-degrading activity. In one embodiment, the additional polypeptide is selected from a ligninase, an endoglucanase, a cellobiohydrolase, a cellobiase, and a xylanase, or any combination thereof. In one embodiment, the additional polypeptide is selected from:

[0016] a. a polypeptide comprising (e.g., consisting of) an amino acid sequence with at least 90% identity to SEQ ID NO: 1;

[0017] b. a Cel3A enzyme from T. reesei, or a functional variant or fragment thereof; or

[0018] c. a polypeptide encoded by a nucleic acid sequence comprising (e.g., consisting of) SEQ ID NO: 2 or SEQ ID NO: 3.

[0019] In one embodiment, the additional polypeptide is aglycosylated.

[0020] In one embodiment, the additional polypeptide is glycosylated.

[0021] In one aspect, the disclosure features a method for producing a mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea, wherein the method comprises contacting a cell expressing the polypeptide having biomass-degrading activity, or lysate thereof, with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide. In one embodiment, the method further comprises lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent, e.g., urea. In one embodiment, the concentration of the solubilizing agent, e.g., urea, is between 0.2M-6M, e.g., 6M.

[0022] In one embodiment, the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

[0023] In one embodiment, the polypeptide comprises an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

[0024] In one embodiment, the polypeptide is aglycosylated.

[0025] In one aspect, the disclosure features a method for producing a polypeptide having a biomass-degrading activity comprising expressing the polypeptide in a cell and contacting the cell or a lysate thereof with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide.

[0026] In another aspect, the disclosure features a method for producing a polypeptide having biomass-degrading activity comprising providing a cell that has been genetically modified to produce at least one polypeptide having biomass-degrading activity, wherein at least a portion of said polypeptide having biomass-degrading activity is found in inclusion bodies, and contacting the cell, or a lysate thereof containing the inclusion bodies, with a solubilizing agent, e.g., urea, at a concentration suitable for solubilizing the polypeptide.

[0027] In one embodiment, the methods disclosed herein further comprise lysing the cell to obtain a lysate, separating a soluble fraction from an insoluble fraction of the lysate, and resuspending the insoluble fraction in the solubilizing agent, e.g., urea. In one embodiment, the concentration of the solubilizing agent, e.g., urea, is between 0.2M-6M, e.g., 6M.

[0028] In one embodiment, the biomass-degrading activity is a cellobiase activity, a ligninase activity, an endoglucanase activity, a cellobiohydrolase activity, or a xylanase activity.

[0029] In one embodiment, the aglycosylated polypeptide comprises (e.g., consists of) an amino acid sequence with at least 90% identity to SEQ ID NO: 1. In one embodiment, the aglycosylated polypeptide comprises a Cel3A from T. reesei, or a functional variant or fragment thereof.

[0030] In one embodiment, the cell is a prokaryotic or bacterial cell, e.g., E. coli cell, origami E. coli cell.

[0031] In one embodiment, the polypeptide is aglycosylated.

[0032] In one aspect, the disclosure features a method of producing a product (e.g., hydrogen, a sugar, an alcohol) from a biomass (or converting a biomass to a product) comprising contacting a biomass with the mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea, and, optionally, with a microorganism that produces one or more biomass-degrading enzyme and/or an enzyme mixture comprising biomass-degrading enzymes, under conditions suitable for the production of the product.

[0033] In one embodiment, the method further comprises a step of treating the biomass with an electron beam prior to contacting the biomass with the mixture described herein comprising a polypeptide having biomass-degrading activity, one or more proteins associated with an inclusion body, and a solubilizing agent, e.g., urea.

[0034] In one embodiment, the product is a sugar product. In one embodiment, the sugar product is glucose and/or xylose.

[0035] In one embodiment, the method further comprises a step of isolating the product. In one embodiment, the step of isolating the product comprises precipitation, crystallization, chromatography, centrifugation, and/or extraction.

[0036] In one embodiment, the enzyme mixture comprises at least two of the enzymes selected from B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, and Swollenin.

[0037] In one embodiment, the biomass comprises starchy materials, sugar cane, agricultural waste, paper, paper product, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or any combination thereof.

[0038] In one embodiment, the biomass comprises a starchy material or a starchy material that includes a cellulosic component. In some embodiments, the biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof; wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

[0039] In one embodiment, the method further comprises a step of treating the biomass prior to introducing the microorganism or the enzyme mixture to reduce the recalcitrance of the biomass, e.g., by treating the biomass with bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, and/or freeze grinding.

[0040] In one embodiment, the microorganism that produces a biomass-degrading enzyme is from species in the genera selected from Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium or Trichoderma. In one embodiment, the microorganism that produces a biomass-degrading enzyme is selected from Aspergillus, Humicola insolens (Scytalidium thermophilum), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium persicinum, Acremonium acremonium, Acremonium brachypenium, Acremonium dichromosporum, Acremonium obclavatum, Acremonium pinkertoniae, Acremonium roseogriseum, Acremonium incoloratum, Acremonium furatum, Chrysosporium lucknowense, Trichoderma viride, Trichoderma reesei, or Trichoderma koningii.

[0041] In one embodiment, the microorganism has been induced to produce biomass-degrading enzymes by combining the microorganism with an induction biomass sample under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism. In one embodiment, the induction biomass sample comprises starchy materials, sugar cane, paper, paper products, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or any combination thereof.

[0042] In one embodiment, the induction biomass comprises a starchy material or a starchy material that includes a cellulosic component. In some embodiments, the induction biomass comprises one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof; wherein: a) an agricultural product or waste comprises sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof; b) a paper product or waste comprises paper, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof; c) a forestry product comprises aspen wood, particle board, wood chips, or sawdust, or a combination thereof; and d) a general waste comprises manure, sewage, or offal, or a combination thereof.

[0043] In one embodiment, the present invention provides advantages over current methods used in the art. These advantages include providing access to insoluble enzymes that would normally be discarded, increasing the yield of desired proteins that retain enzyme activity, providing purified enzymes for cleaner downstream processing, and broadening organism selection (e.g., increasing the availability of organisms that may have been previously excluded from use due to a propensity to develop inclusion bodies).

BRIEF DESCRIPTION OF THE DRAWINGS

[0044] FIG. 1 is a chromatogram showing the results of IMAC purification of solubilized Cel3a. The purified solubilized Cel3a peak is indicated by the arrow.

[0045] FIG. 2 is a picture of an SDS-PAGE gel showing the proteins in different fractions of the IMAC purification. Lane 1 shows the molecular weight standards. Lane 2 shows purified Cel3a from the soluble fraction. Lane 3 shows the flow through from IMAC purification of the insoluble fraction. Lane 4 shows the purified solubilized Cel3a from the insoluble fraction.

[0046] FIG. 3 is a graph comparing the cellobiase activity of purified soluble Cel3a and purified solubilized Cel3a from the insoluble fraction.

[0047] FIG. 4 is a graph comparing the cellobiase activity of purified soluble Cel3a, the wash fraction of the insoluble fraction, and Cel3a solubilized from the insoluble fraction without purification.

DETAILED DESCRIPTION

Definitions

[0048] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains.

[0049] The terms "a" and "an" refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

[0050] The term "aglycosylated", as used herein, refers to a molecule, e.g., a polypeptide, that is not glycosylated (i.e., it comprises a hydroxyl group or other functional group that is not attached to a glycosylate group) at one or more sites that have a glycan attached when the molecule is produced in its native environment. In some embodiments, the aglycosylated molecule does not have any attached glycans. In one embodiment, the molecule has been altered or mutated such that the molecule cannot be glycosylated, e.g., one or more glycosylation sites are mutated such that a glycan cannot be attached to the glycosylation site. In another embodiment, an attached glycan can be removed from the molecule, e.g., by an enzymatic process, e.g., by incubating with enzymes that remove glycans or have deglycosylating activity. In yet another embodiment, glycosylation of the molecule can be inhibited, e.g., by use of a glycosylation inhibitor (that inhibits a glycosylating enzyme). In another embodiment, the molecule, e.g., the polypeptide, can be produced by a host cell that does not glycosylate, e.g., E. coli. For example, a Cel3A enzyme is aglycosylated when one or more sites in the protein that normally have a glycan group attached when the Cel3A enzyme is produced in T. reesei do not have a glycan attached.

[0051] The term "biomass", as used herein, refers to any non-fossilized, organic matter. The various types of biomass include plant biomass (e.g., lignocellulosic and cellulosic biomass), microbial biomass, animal biomass (any animal by-product, animal waste, etc.) and municipal waste biomass (residential and light commercial refuse with recyclables such as metal and glass removed). Plant biomass refers to any plant-derived organic matter (woody or non-woody). Plant biomass can include, but is not limited to, agricultural or food crops (e.g., sugarcane, sugar beets or corn kernels) or an extract therefrom (e.g., sugar from sugarcane and corn starch from corn), agricultural crop wastes and residues such as corn stover, wheat straw, rice straw, sugar cane bagasse, and the like. Plant biomass further includes, but is not limited to, trees, woody energy crops, wood wastes and residues such as softwood forest thinnings, barky wastes, sawdust, paper and pulp industry waste streams, wood fiber, and the like. Additionally, grass crops, such as switchgrass and the like have potential to be produced on a large-scale as another plant biomass source. For urban areas, the best potential plant biomass feedstock includes yard waste (e.g., grass clippings, leaves, tree clippings, and brush) and vegetable processing waste.

[0052] The term "biomass degrading enzymes", as used herein, refers to enzymes that break down components of the biomass matter described herein into intermediates or final products. For example, biomass-degrading enzymes include at least ligninases, endoglucanases, cellobiases, xylanases, and cellobiohydrolases. Biomass-degrading enzymes are produced by a wide variety of microorganisms, and can be isolated from the microorganisms, such as T. reesei.

[0053] The term "biomass degrading activity", as used herein, refers to enzymatic activity that breaks down components of the biomass matter described herein into intermediates or final products. Biomass-degrading activity includes at least ligninase activity, endoglucanase activity, cellobiase activity, cellobiohydrolase activity, and xylanase activity. For example, a polypeptide having biomass degrading activity is a cellobiase such as Cel3a from T. reesei.

[0054] The term "cellobiase", as used herein, refers to an enzyme that catalyzes the hydrolysis of a dimer, trimer, tetramer, pentamer, hexamer, heptamer, octamer, or an oligomer of glucose, or an oligomer of glucose and xylose, to glucose and/or xylose. For example, the cellobiase is beta-glucosidase, which catalyzes the hydrolysis of beta-1,4 bonds in cellobiose to release two glucose molecules.

[0055] The term "cellobiase activity", as used herein, refers to the activity of a category of cellulases that catalyze the hydrolysis of cellobiose to glucose, e.g., catalyzes the hydrolysis of beta-D-glucose residues to release beta-D-glucose. Cellobiase activity can be determined according to the assays described herein, e.g., in Example 4. One unit of cellobiase activity can be defined as [glucose] g/L/[Cel3a] g/L/30 minutes.
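As a hedged illustration of the unit definition above, the following Python sketch computes cellobiase activity as the glucose concentration (g/L) divided by the Cel3a concentration (g/L) for a 30-minute assay. The function name and the example numbers are hypothetical and are not taken from Example 4.

```python
def cellobiase_units(glucose_g_per_l: float, cel3a_g_per_l: float) -> float:
    """Return activity as [glucose] g/L per [Cel3a] g/L per 30 minutes.

    Assumes both concentrations come from the same 30-minute assay.
    """
    if cel3a_g_per_l <= 0:
        raise ValueError("enzyme concentration must be positive")
    if glucose_g_per_l < 0:
        raise ValueError("glucose concentration cannot be negative")
    return glucose_g_per_l / cel3a_g_per_l


# Hypothetical measurements, for illustration only.
example_units = cellobiase_units(glucose_g_per_l=3.0, cel3a_g_per_l=0.5)
```

Because the definition is a ratio of two concentrations over a fixed assay time, activities are comparable between samples only when the same 30-minute window is used.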

[0056] The term "cellobiohydrolase" as used herein, refers to an enzyme that hydrolyzes glycosidic bonds in cellulose. For example, the cellobiohydrolase is 1,4-beta-D-glucan cellobiohydrolase, which catalyzes the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose, cellooligosaccharides, or any beta-1,4-linked glucose containing polymer, releasing oligosaccharides from the polymer chain.

[0057] The term "cellobiohydrolase activity", as used herein, refers to the activity of an enzyme that catalyzes the hydrolysis of glycosidic bonds in cellulose, specifically, the hydrolysis of 1,4-beta-D-glucosidic linkages in cellulose, cellooligosaccharides, or any beta-1,4-linked glucose-containing polymer, to release cellobiose from the ends of the saccharide chain, e.g., from the reducing or the non-reducing ends of the chain. Cellobiohydrolase activity can be determined according to the assays described herein. One unit of cellobiohydrolase activity can be defined, for example, as the amount of enzyme that releases 1 μM of glucose equivalent from substrate (e.g., Avicel) per minute.

[0058] The term "endoglucanase" as used herein, refers to an enzyme that catalyzes the hydrolysis of internal beta-1,4 glycosidic bonds. For example, the endoglucanase is endo-1,4-(1,3; 1,4)-beta-D-glucan 4-glucanohydrolase, which catalyzes endohydrolysis of 1,4-beta-D-glycosidic linkages in cellulose, cellulose derivatives (such as carboxymethyl cellulose and hydroxyethyl cellulose), lichenan, beta-1,4 bonds in mixed beta-1,3 glucans such as cereal beta-D-glucans or xyloglucans, and other plant material containing cellulosic components.

[0059] The term "endoglucanase activity" as used herein, refers to the activity of an enzyme that catalyzes the endohydrolysis of the internal glycosidic bonds, e.g., internal beta-1,4 glycosidic bonds, of cellulose, cellulose derivatives (such as carboxymethyl cellulose and hydroxyethyl cellulose), lichenan, beta-1,4 bonds in mixed beta-1,3 glucans such as cereal beta-D-glucans or xyloglucans, and other plant material containing cellulosic components. Endoglucanase activity can be determined according to the assays described herein. One unit of endoglucanase activity can be defined, for example, as the amount of enzyme that increases the concentration of the reducing ends by 1 μM from substrate per minute.

[0060] The term "enzyme mixture" as used herein, refers to a combination of at least two different enzymes, or two different variants of an enzyme (e.g., a glycosylated and an aglycosylated version of an enzyme). The enzyme mixture referred to herein includes at least the aglycosylated polypeptide having cellobiase activity described herein. In one embodiment, the enzyme mixture includes one or more of a cellobiase, an endoglucanase, a cellobiohydrolase, a ligninase, and/or a xylanase. In some embodiments, the enzyme mixture includes a cell, e.g., a microorganism, which expresses and, e.g., secretes, one or more of the enzymes. For example, the enzyme mixture can include an aglycosylated polypeptide described herein and a cell, e.g., a microorganism, which expresses and, e.g., secretes, one or more additional enzymes and/or variants of the polypeptide.

[0061] The term "inclusion body" as used herein, refers to insoluble aggregates produced by a microorganism, e.g., a host cell, containing one or more of the following: a heterologously expressed polypeptide, e.g., a polypeptide having biomass-degrading activity, cellular debris, one or more ribosomal component, one or more protein endogenously expressed from the host cell, one or more nucleic acids (RNA and/or DNA), or any combination thereof. Inclusion bodies commonly occur in host cells, e.g., bacterial cells, during high levels of expression of a recombinant protein. The heterologously expressed polypeptides found in the inclusion body may be partially unfolded, partially misfolded, or partially denatured.

[0062] The term "ligninase" as used herein, refers to an enzyme that catalyzes the breakdown of lignin, commonly found in the cell walls of plants, such as by an oxidation reaction. Ligninases include lignin-modifying enzymes, lignin peroxidases and laccases.

[0063] The term "ligninase activity" as used herein, refers to the activity of an enzyme that catalyzes the breakdown of lignin and lignin-like polymers by an oxidation reaction. Ligninase activity can be determined according to the assays described herein.

[0064] The terms "nucleic acid" and "polynucleotide" are used interchangeably, and refer to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).

[0065] The term "operably linked", as used herein, refers to a configuration in which a control or regulatory sequence is placed at a position relative to a nucleic acid sequence that encodes a polypeptide, such that the control sequence influences the expression of a polypeptide (encoded by the DNA sequence). In an embodiment, the control or regulatory sequence is upstream of a nucleic acid sequence that encodes a polypeptide with cellobiase activity. In an embodiment, the control or regulatory sequence is downstream of a nucleic acid sequence that encodes a polypeptide with cellobiase activity.

[0066] The terms "peptide," "polypeptide," and "protein" are used interchangeably, and refer to a compound comprised of amino acid residues covalently linked by peptide bonds. A protein or peptide must contain at least two amino acids, and no limitation is placed on the maximum number of amino acids that can comprise a protein's or peptide's sequence. Polypeptides include any peptide or protein comprising two or more amino acids joined to each other by peptide bonds. "Polypeptides" include, for example, biologically active fragments, substantially homologous polypeptides, oligopeptides, homodimers, heterodimers, variants of polypeptides, modified polypeptides, derivatives, analogs, fusion proteins, among others. A polypeptide includes a natural peptide, a recombinant peptide, or a combination thereof. A "plurality of polypeptides" refers to two or more polypeptides, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, 100, 200, or 500 or more polypeptides.

[0067] The term "promoter", as used herein, refers to a DNA sequence recognized by the synthetic machinery of the cell, or introduced synthetic machinery, required to initiate the specific transcription of a polynucleotide sequence.

[0068] The term "regulatory sequence" or "control sequence", as used interchangeably herein, refers to a nucleic acid sequence which is required for expression of a nucleic acid product. In some instances, this sequence may be a promoter sequence and in other instances, this sequence may also include an enhancer sequence and other regulatory elements which are required for expression of the gene product. The regulatory/control sequence may, for example, be one which expresses the nucleic acid product in a regulated manner, e.g., inducible manner.

[0069] The term "constitutive" promoter refers to a nucleotide sequence which, when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell under most or all physiological conditions of the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

[0070] The term "inducible" promoter refers to a nucleotide sequence which, when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell substantially only when an inducer which corresponds to the promoter is present in the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

[0071] The term "repressible" promoter refers to a nucleotide sequence, which when operably linked with a polynucleotide which encodes a polypeptide, causes the polypeptide to be produced in a cell substantially only until a repressor which corresponds to the promoter is present in the cell. In an embodiment, the polypeptide is a polypeptide having cellobiase activity.

[0072] The term "solubilizing agent" refers to an agent that has the capacity for disrupting non-covalent bonds, e.g., hydrogen bonds, hydrophobic interactions, van der Waals interactions, dipole-dipole interactions, ionic interactions, pi-stacking, or any combination thereof. The disruption of the non-covalent bonds leads to the solubilization, or dissolution, of previously insoluble matter into solution. Specifically, a solubilizing agent used herein increases the ability of polypeptides having biomass-degrading activity described herein that have aggregated into inclusion bodies to dissolve into solution, e.g., water-based solution or a buffer. Examples of suitable solubilizing agents are described herein.

[0073] The term "xylanase" as used herein, refers to enzymes that hydrolyze xylan-containing material. Xylan is a polysaccharide comprising units of xylose. A xylanase can be an endoxylanase, a beta-xylosidase, an arabinofuranosidase, an alpha-glucuronidase, an acetylxylan esterase, a feruloyl esterase, or an alpha-glucuronyl esterase.

[0074] The term "xylanase activity" as used herein, refers to the activity of enzymes that catalyze the endohydrolysis of 1,4-beta-D-xylosidic linkages in xylans and xylan-like polymers. Xylanase activity can be determined according to the assays described herein. One unit of xylanase activity will release 1 μM of xylose equivalent from xylan per minute.

Description

[0075] High level of expression of recombinant proteins in host cells such as E. coli often leads to accumulation of the recombinant proteins into inactive, misfolded and insoluble aggregates within the host cell. These insoluble aggregates are called inclusion bodies and can also contain other components endogenous to the host cell, such as protein, ribosomal components, nucleic acids, and cellular debris. As much as 70-80% of proteins produced by recombinant techniques can form inclusion bodies, thereby significantly reducing the yield of active recombinant protein that can be readily isolated from the host cells.

[0076] Solubilization of the recombinant proteins from the inclusion bodies can be achieved through treatment with chaotropic agents, e.g., high concentrations of urea, which disrupt hydrogen bonds and hydrophobic interactions. However, such solubilization processes often result in denaturation of the protein and loss of native function or enzymatic activity. The soluble denatured proteins can be refolded to their native state after removal of chaotropic agents; however, refolding of recombinant proteins into bioactive forms with enzymatic activity can be cumbersome and costly, and can result in low recovery of the final product.

[0077] The present invention is based, at least in part, on the surprising discovery that a heterologously expressed cellobiase that has been solubilized from inclusion bodies by urea retains cellobiase activity. The recovery of heterologously expressed cellobiase from the inclusion bodies increased the total yield of cellobiase by 30-40%. Furthermore, the presence of the solubilizing agent, e.g., urea, from the addition of the solubilized biomass-degrading enzyme, e.g., cellobiase, does not adversely affect the saccharification reaction for converting biomass to a sugar product and/or the yield of products.
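The arithmetic behind a yield increase of this kind can be sketched as follows. This minimal Python example assumes the percentage is expressed relative to the enzyme recovered from the soluble fraction alone, which is one reading of the statement above. The function name and the masses are hypothetical.

```python
def yield_increase_percent(soluble_mg: float, recovered_from_ib_mg: float) -> float:
    """Percent increase in total enzyme yield from adding inclusion-body recovery.

    Assumption: the baseline is the enzyme obtained from the soluble fraction only.
    """
    if soluble_mg <= 0:
        raise ValueError("soluble-fraction yield must be positive")
    if recovered_from_ib_mg < 0:
        raise ValueError("recovered amount cannot be negative")
    return 100.0 * recovered_from_ib_mg / soluble_mg


# Hypothetical masses: 100 mg from the soluble fraction plus 35 mg
# solubilized from inclusion bodies gives a 35% increase in total yield.
increase = yield_increase_percent(soluble_mg=100.0, recovered_from_ib_mg=35.0)
```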

[0078] Accordingly, the present invention provides methods for solubilizing a polypeptide having biomass-degrading activity from inclusion bodies, where the resulting solubilized polypeptide retains biomass-degrading activity, whereby the additional processing steps of refolding the polypeptide and removing the solubilizing agent are not required. The present invention provides methods for increasing the recovery of heterologously-expressed biomass-degrading enzymes from inclusion bodies, while retaining enzymatic activity, and use of the recovered biomass-degrading enzymes in methods for converting a biomass into products, e.g., by saccharification.

Polypeptides Having Biomass-Degrading Activity

[0079] The present disclosure provides a polypeptide, or a plurality of polypeptides, having a biomass-degrading activity. In embodiments, the polypeptide having biomass-degrading activity, or plurality thereof, is present in a mixture with one or more solubilizing agents. Some mixtures may also contain one or more proteins associated with an inclusion body. In other embodiments, the mixture does not contain one or more proteins associated with the inclusion body, e.g., the polypeptide or plurality thereof having biomass-degrading activity was purified from one or more proteins associated with the inclusion body.

[0080] For example, the polypeptide has cellobiase activity, ligninase activity, endoglucanase activity, cellobiohydrolase activity, or xylanase activity.

[0081] In an embodiment, the polypeptide is a cellobiase. A cellobiase is an enzyme that hydrolyzes beta-1,4 bonds in its substrate, e.g., cellobiose, to release two glucose molecules. Cellobiose is a water soluble 1,4-linked dimer of glucose. In an embodiment, the polypeptide is Cel3a. Cel3a (also known as BglI) is a cellobiase that was identified in Trichoderma reesei. The amino acid sequence for Cel3a (GenBank Accession No. NW_006711153) is provided below:

TABLE-US-00001 (SEQ ID NO: 1) MGDSHSTSGASAEAVVPPAGTPWGTAYDKAKAALAKLNLQDKVGIVSGVG WNGGPCVGNTSPASKISYPSLCLQDGPLGVRYSTGSTAFTPGVQAASTWD VNLIRERGQFIGEEVKASGIHVILGPVAGPLGKTPQGGRNWEGFGVDPYL TGIAMGQTINGIQSVGVQATAKHYILNEQELNRETISSNPDDRTLHELYT WPFADAVQANVASVMCSYNKVNTTWACEDQYTLQTVLKDQLGFPGYVMTD WNAQHTTVQSANSGLDMSMPGTDFNGNNRLWGPALTNAVNSNQVPTSRVD DMVTRILAAWYLTGQDQAGYPSFNISRNVQGNHKTNVRAIARDGIVLLKN DANILPLKKPASIAVVGSAAIIGNHARNSPSCNDKGCDDGALGMGWGSGA VNYPYFVAPYDAINTRASSQGTQVTLSNTDNTSSGASAARGKDVAIVFIT ADSGEGYITVEGNAGDRNNLDPWHNGNALVQAVAGANSNVIVVVHSVGAI ILEQILALPQVKAVVWAGLPSQESGNALVDVLWGDVSPSGKLVYTIAKSP NDYNTRIVSGGSDSFSEGLFIDYKHFDDANITPRYEFGYGLSYTKFNYSR LSVLSTAKSGPATGAVVPGGPSDLFQNVATVTVDIANSGQVTGAEVAQLY ITYPSSAPRTPPKQLRGFAKLNLTPGQSGTATFNIRRRDLSYWDTASQKW VVPSGSFGISVGASSRDIRLTSTLSVAGSGS

[0082] In an embodiment, the polypeptide is a ligninase. A ligninase is an enzyme that breaks down lignin, which is a complex polymer of aromatic alcohols known as monolignols and forms an integral part of the secondary cell walls of plants and some algae. Ligninases include lignin peroxidases, 1,2-bis(3,4-dimethoxyphenyl)propane-1,3-diol:hydrogen-peroxide oxidoreductase, diarylpropane oxygenase, ligninase I, diarylpropane peroxidase, LiP, hydrogen-peroxide oxidoreductase (C--C-bond-cleaving), and some laccases. Examples of ligninases include CIP2 from Trichoderma reesei; LPOA, GLG2, GLG4, LIPA, GLG5, GLG3, GLG6, and LIPB from Phanerochaete chrysosporium; ligninase-3 from Phlebia radiata; Ligninase A and B from Coriolus versicolor; and LPG I and LPG IV from Coriolus versicolor.

[0083] In an embodiment, the polypeptide is an endoglucanase. An endoglucanase is an enzyme that catalyzes the hydrolysis of cellulose. Specifically, the endoglucanases cleave the internal bonds of the cellulose chain. Endoglucanases are produced by fungi, bacteria, and protozoans. Endoglucanases are also known as beta-1-4 endoglucanase, 4-beta-D-glucan cellobiohydrolase, exo-cellobiohydrolase, beta-1,4-glucan cellobiohydrolase, beta-1,4-glucan cellobiosylhydrolase, 1,4-beta-glucan cellobiosidase, exoglucanase, avicelase, CBH 1, C1 cellulase, cellobiohydrolase I, cellobiohydrolase, exo-beta-1,4-glucan cellobiohydrolase, 1,4-beta-D-glucan cellobiohydrolase, or cellobiosidase. Examples of endoglucanases include Cel5A, Cel5B, Cel7B, Cel12A, Cel45A, Cel61A, Cel61B, and Cel74A from Trichoderma reesei.

[0084] In an embodiment, the polypeptide is a cellobiohydrolase, also known as exoglucanase. A cellobiohydrolase catalyzes the hydrolysis of 1,4-beta-D-glucosidic linkages in oligosaccharides containing that linkage, e.g., cellulose and cellotetraose, thereby releasing cellobiose from the non-reducing ends of the chains. Examples of cellobiohydrolases include cellobiohydrolase I (CBHI) and cellobiohydrolase II (CBHII) from Trichoderma reesei.

[0085] In an embodiment, the polypeptide is a xylanase. Xylanases are also known as endo-(1-4)-beta-xylan 4-xylanohydrolase, endo-1,4-xylanase, endo-1,4-beta-xylanase, beta-1,4-xylanase, endo-1,4-beta-D-xylanase, 1,4-beta-xylan xylanohydrolase, beta-xylanase, beta-1,4-xylan xylanohydrolase, or beta-D-xylanase. A xylanase breaks down a component of plant cell walls called hemicellulose, e.g., degrades polysaccharides, such as xylan, e.g., beta-1,4-xylan, glucuronoxylan, arabinoxylan, glucomannan, and xyloglucan, to release xylose. Examples of xylanases include Xyn1, Xyn2, and Xyn3 from Trichoderma reesei; and TERTU_1599, TERTU_3603, TERTU_2546, and TERTU_4506 from Teredinibacter turnerae T7901.

[0086] The present disclosure also provides functional variants of a polypeptide having biomass-degrading activity described herein. In an embodiment, a functional variant has an amino acid sequence with at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a biomass-degrading enzyme described herein, or a functional fragment thereof, e.g., at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a biomass-degrading enzyme described herein, or a functional fragment thereof.

[0087] In an embodiment, a functional variant has an amino acid sequence with at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to Cel3a produced by T. reesei or SEQ ID NO: 1, or a functional fragment thereof.

[0088] Percent identity, in the context of two or more amino acid or nucleic acid sequences, refers to the extent to which two or more sequences are the same. Two sequences are "substantially identical" if two sequences have a specified percentage of amino acid residues or nucleotides that are the same (e.g., 60% identity, optionally 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity over a specified region, or, when not specified, over the entire sequence), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Optionally, the identity exists over a region that is at least about 50 nucleotides, 100 nucleotides, or 150 nucleotides in length. More preferably, the identity exists over a region that is at least about 200 or more amino acids, or at least about 500 or 1000 or more nucleotides, in length.

[0089] For sequence comparison, one sequence typically acts as a reference sequence, to which one or more test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, (1981) Adv. Appl. Math. 2:482c, by the homology alignment algorithm of Needleman and Wunsch, (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman, (1988) Proc. Nat'l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Brent et al., (2003) Current Protocols in Molecular Biology).
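To make the percent identity calculation concrete, the following is a minimal Python sketch of a Needleman and Wunsch style global alignment followed by an identity count over the aligned columns. The scoring values (match +1, mismatch -1, gap -1) and the function names are assumptions chosen for illustration. They are not the default parameters of GAP, BESTFIT, FASTA, or BLAST, and production work would normally rely on one of those established tools.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Globally align two sequences with a simple linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(reference: str, test: str) -> float:
    """Identical aligned positions as a percentage of alignment length."""
    aligned_ref, aligned_test = global_align(reference, test)
    if not aligned_ref:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_ref, aligned_test))
    return 100.0 * same / len(aligned_ref)


# "MGDSH" is the start of SEQ ID NO: 1; "MGASH" is a hypothetical variant.
identity = percent_identity("MGDSH", "MGASH")
```

Note that the denominator here is the full alignment length including gap columns. Other conventions (for example, dividing by the length of the shorter sequence) give different percentages, so the convention should be stated whenever an identity threshold is applied.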

[0090] Two examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., (1997) Nuc. Acids Res. 25:3389-3402; and Altschul et al., (1990) J. Mol. Biol. 215:403-410, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information.

[0091] Functional variants may comprise one or more mutations, such that the variant retains biomass-degrading activity that is better than the biomass-degrading activity of a biomass-degrading enzyme described herein produced by the microorganism from which the enzyme originates. In an embodiment, the functional variant has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) of the biomass-degrading activity as a biomass-degrading enzyme as produced by E. coli. In embodiments, the functional variant has at least 200%, at least 300%, at least 400%, at least 500%, at least 1000% or more of the biomass-degrading activity as a biomass-degrading enzyme produced by E. coli or the microorganism from which the enzyme originates. Biomass-degrading activity can be tested using the functional assays described herein. In one embodiment, the functional variant retains cellobiase activity that is better than the cellobiase activity of Cel3a as produced by T. reesei. In another embodiment, the functional variant has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% (e.g., at least 80%, at least 85%, at least 90%, at least 95%, or at least 99%) of the cellobiase activity as a Cel3a or enzyme comprising SEQ ID NO: 1 as produced by E. coli. In embodiments, the functional variant has increased biomass-degrading activity compared to a biomass-degrading enzyme described herein, e.g., at least 200%, at least 300%, at least 400%, at least 500%, at least 1000% or more of the biomass-degrading activity of a biomass-degrading enzyme described herein, e.g., cellobiase activity as a Cel3a or enzyme comprising SEQ ID NO: 1 produced by E. coli or the microorganism from which the enzyme originates.
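The percentage thresholds recited above compare a variant's measured activity with that of a reference enzyme. A minimal Python sketch of that comparison follows. The function names and activity values are hypothetical, and both activities are assumed to be measured in the same units under the same assay conditions.

```python
def relative_activity_percent(variant_activity: float, reference_activity: float) -> float:
    """Variant activity expressed as a percentage of the reference activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * variant_activity / reference_activity


def meets_threshold(variant_activity: float, reference_activity: float,
                    threshold_percent: float) -> bool:
    """True if the variant has at least threshold_percent of the reference activity."""
    return relative_activity_percent(variant_activity, reference_activity) >= threshold_percent


# Hypothetical activities in arbitrary but matching units.
retained = relative_activity_percent(4.5, 5.0)   # 90% of reference
improved = relative_activity_percent(10.0, 5.0)  # 200% of reference
```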

[0092] The mutations present in a functional variant include amino acid substitutions, additions, and deletions. Mutations can be introduced by standard techniques known in the art, such as site-directed mutagenesis and PCR-mediated mutagenesis. The mutation may be a conservative amino acid substitution, in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine, tryptophan), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, one or more amino acid residues within the polypeptide having cellobiase activity of the disclosure can be replaced with other amino acids from the same side chain family, and the resultant polypeptide retains cellobiase activity comparable (e.g., at least 80%, 85%, 90%, 95%, or 99% of the cellobiase activity) to that of the wild-type polypeptide. Alternatively, the mutation may be an amino acid substitution in which an amino acid residue is replaced with an amino acid residue having a different side chain.
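The side-chain families listed above can be encoded directly to check whether a given substitution is conservative in the sense used here. The following Python sketch uses one-letter amino acid codes. The dictionary and function names are hypothetical, and because the families overlap (for example, threonine is both uncharged polar and beta-branched), a substitution is treated as conservative if the two residues share any one family.

```python
# Families as listed in the text, written with one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "acidic": set("DE"),                 # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYCW"),  # glycine ... tryptophan
    "nonpolar": set("AVLIPFM"),          # alanine ... methionine
    "beta-branched": set("TVI"),         # threonine, valine, isoleucine
    "aromatic": set("YFWH"),             # tyrosine, phenylalanine, tryptophan, histidine
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one side-chain family."""
    if original == replacement:
        return False  # not a substitution at all
    return any(original in family and replacement in family
               for family in SIDE_CHAIN_FAMILIES.values())


# Lysine to arginine stays within the basic family; aspartic acid to lysine does not.
k_to_r = is_conservative("K", "R")
d_to_k = is_conservative("D", "K")
```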

[0093] Such mutations may alter or affect various enzymatic characteristics of the biomass-degrading enzyme, e.g., cellobiase, ligninase, endoglucanase, or cellobiohydrolase. For example, such mutations may alter or affect the biomass-degrading activity, thermostability, optimal pH for reaction, enzyme kinetics, or substrate recognition of the biomass-degrading enzyme. In some embodiments, a mutation increases the biomass-degrading activity of the variant in comparison to the biomass-degrading enzyme, e.g., cellobiase produced by T. reesei and/or SEQ ID NO: 1 produced in E. coli. In some embodiments, a mutation increases or decreases the thermostability of the variant in comparison to a wild-type biomass degrading enzyme, e.g., a cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation changes the pH range at which the variant optimally performs the biomass-degrading reaction in comparison to wild-type biomass-degrading enzyme, e.g., wild-type cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation increases or decreases the kinetics of the biomass-degrading reaction (e.g., k.sub.cat, K.sub.M or K.sub.D) in comparison to wild-type biomass-degrading enzyme, e.g., wild-type cellobiase and/or SEQ ID NO: 1 produced in E. coli. In an embodiment, a mutation increases or decreases the ability of the cellobiase to recognize or bind to the substrate (e.g., cellobiose) in comparison to wild-type cellobiase and/or SEQ ID NO:1 produced in E. coli.

[0094] The present invention also provides functional fragments of a polypeptide having biomass-degrading activity, e.g., cellobiase activity, as described herein, e.g., Cel3a or SEQ ID NO: 1. One of ordinary skill in the art could readily envision that a fragment of a polypeptide having biomass-degrading activity as described herein that contains the functional domains responsible for enzymatic activity would retain functional activity, e.g., biomass-degrading activity, and therefore, such fragments are encompassed in the present invention. In an embodiment, the functional fragment is at least 700 amino acids, at least 650 amino acids, at least 600 amino acids, at least 550 amino acids, at least 500 amino acids, at least 450 amino acids, at least 400 amino acids, at least 350 amino acids, at least 300 amino acids, at least 250 amino acids, at least 200 amino acids, at least 150 amino acids, at least 100 amino acids, or at least 50 amino acids in length. In an embodiment, the functional fragment is 700 to 744 amino acids, 650 to 699 amino acids, 600 to 649 amino acids, 550 to 599 amino acids, 500 to 549 amino acids, 450 to 499 amino acids, 400 to 449 amino acids, 350 to 399 amino acids, 300 to 349 amino acids, 250 to 299 amino acids, 200 to 249 amino acids, 150 to 199 amino acids, 100 to 149 amino acids, or 50 to 99 amino acids. With regard to the ranges of amino acid length described above, the lowest and highest values of amino acid length are included within each disclosed range. In an embodiment, the functional fragment has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% of the biomass-degrading activity as a wild-type biomass-degrading enzyme described herein, or the biomass-degrading enzyme produced in E. coli. 
In an embodiment, the functional fragment has at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% of the cellobiase activity as wild-type Cel3a or the polypeptide comprising SEQ ID NO: 1 produced in E. coli.

[0095] Assays for detecting cellobiase activity are known in the art. For example, the amount of glucose released from cellobiose can be determined by incubating purified cellobiase with substrate, e.g., cellobiose, D-(+)-cellobiose, and detecting the resultant amount of free glucose after completion of the reaction. The amount of free glucose can be determined using a variety of methods known in the art. For example, dilutions of purified cellobiase are prepared in a buffer containing 50 mM sodium citrate, pH 5.0 NaOH. The cellobiose substrate is added to the purified cellobiase in an amount such that the final concentration of cellobiose in the reaction mixture is 30 mM. The reaction mixture is incubated under conditions suitable for the reaction to occur, e.g., in a shaker (700 rpm) at 48.degree. C. for 30 minutes. To stop the reaction, the reaction mixture is heated for 5 minutes at 100.degree. C. The reaction mixture is filtered through a 0.45 .mu.m filter and the filtrate is analyzed to quantify the amount of glucose and/or cellobiose. A YSI instrument that measures analytes such as glucose can be used to determine the concentration of glucose produced from the reaction. Alternatively, UPLC (Ultra Performance Liquid Chromatography) can be used to determine the concentration of glucose and cellobiose from the reaction. This assay can be formatted in a single reaction or in multiple reaction formats, e.g., 96 well format. In some embodiments, the multiple reaction format may be preferred to generate an activity curve representing cellobiase activity with respect to different concentrations of the purified cellobiase. The concentration of the purified cellobiase can be determined using a standard Bradford assay. Dilutions of the purified cellobiase are prepared, e.g., 2-fold dilutions, and are aliquoted into a 96 well plate, e.g., 12 wells of 2-fold dilutions. 
Cellobiose substrate is added as previously described, such that the final concentration of cellobiose in the reaction is 30 mM. The plate is sealed and treated under conditions sufficient for the cellobiase reaction to occur, and then under conditions to stop the reaction. The reaction is then filtered through a 96 well format 0.45 .mu.m membrane (e.g., Durapore) and analyzed by YSI and/or HPLC methods, e.g., UPLC.

[0096] This activity assay can also be used to determine the concentration, or titer, of a cellobiase in a sample of unknown concentration by generating a standard curve of activity from known concentrations of the cellobiase and using that curve to determine the concentration of the unknown sample. For example, two-fold serial dilutions of a known concentration of the cellobiase are prepared in one row of a 96 well plate, e.g., 12 two-fold serial dilutions. The other rows contain two-fold serial dilutions of the remaining samples whose titer is to be determined, e.g., the crude lysate sample or solubilized inclusion body sample. The dilutions are incubated with a D-(+)-Cellobiose (Fluka) substrate solution in 50 mM sodium citrate monobasic buffer at pH 5.0, at 48.degree. C. for 30 minutes. After 30 minutes, the samples are heated to 100.degree. C. for 10 minutes to stop the reaction. Samples are analyzed for glucose and cellobiose using the YSI Biochemistry analyser (YSI Life Sciences) and/or HPLC methods. Using the samples of known concentration, a standard curve is generated using the data points within the linear range of the assay. The cellobiase activity detected from the samples with unknown titer can be compared to the standard curve to determine the titer of cellobiase in these samples.
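By way of illustration only, the standard curve comparison described above can be sketched as an ordinary least-squares line fitted to the known-concentration wells and then inverted for a well of unknown titer. The function names `fit_line` and `titer_from_curve`, the linear model itself, and all numerical values in the usage note are assumptions made for this example and are not taken from the disclosure.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def titer_from_curve(glucose, slope, intercept, dilution_factor=1.0):
    """Invert the standard curve for one diluted well, then undo the dilution."""
    return (glucose - intercept) / slope * dilution_factor


# Hypothetical standards: enzyme concentration vs. glucose released (g/L),
# restricted to points assumed to lie in the linear range of the assay.
standard_conc = [0.5, 1.0, 2.0, 4.0]
standard_glucose = [0.25, 0.5, 1.0, 2.0]
slope, intercept = fit_line(standard_conc, standard_glucose)

# Hypothetical unknown sample read at an 8-fold dilution.
estimated_titer = titer_from_curve(0.75, slope, intercept, dilution_factor=8)
```

Only wells whose glucose readings fall within the linear range would be passed to `fit_line` and `titer_from_curve`; readings outside that range would be discarded before fitting.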

[0097] Units of activity are only relative if calculated using values within the linear range of the assay. The linear range of the assay is defined as using glucose values that are less than 30% of the original soluble substrate load. In addition, glucose values lower than 0.05 g/L are omitted due to instrumentation reporting levels. One unit of cellobiase activity is defined as the amount of glucose per the amount of Cel3a per 30 minutes: [Glucose] (g/L) / [Cel3a] (g/L) / 30 min.
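By way of illustration only, the unit definition and the two exclusion rules above can be expressed as a short calculation. In this sketch the function name `cellobiase_units` is hypothetical, a 30 minute incubation is assumed, and the substrate load of 10.269 g/L is an assumption obtained by converting 30 mM cellobiose using a molar mass of 342.30 g/mol.

```python
# Assumed substrate load: 30 mM cellobiose x 342.30 g/mol = 10.269 g/L.
SUBSTRATE_LOAD_G_PER_L = 0.030 * 342.30

MIN_REPORTABLE_GLUCOSE = 0.05  # g/L, instrument reporting level
LINEAR_RANGE_FRACTION = 0.30   # glucose must stay below 30% of substrate load


def cellobiase_units(glucose_g_per_l, cel3a_g_per_l,
                     substrate_g_per_l=SUBSTRATE_LOAD_G_PER_L):
    """Units per 30 min incubation, or None if the reading is excluded."""
    if glucose_g_per_l < MIN_REPORTABLE_GLUCOSE:
        return None  # below instrument reporting level
    if glucose_g_per_l >= LINEAR_RANGE_FRACTION * substrate_g_per_l:
        return None  # outside the linear range of the assay
    return glucose_g_per_l / cel3a_g_per_l
```

A reading that returns `None` would simply be left out of the activity calculation rather than being treated as zero activity.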

[0098] In other embodiments, a colorimetric/fluorometric assay can be used. The purified cellobiase is incubated with substrate cellobiose under conditions for the reaction to occur. Detection of the product glucose is as follows. Glucose oxidase is added to the mixture, which oxidizes glucose (the product) to gluconic acid and hydrogen peroxide. Peroxidase and o-dianisidine are then added. O-dianisidine reacts with the hydrogen peroxide in the presence of peroxidase to form a colored product. Sulfuric acid is added, which reacts with the oxidized o-dianisidine to form a more stable colored product. The intensity of the color when measured, e.g., by spectrophotometer or colorimeter, e.g., at 540 nm, is directly proportional to the glucose concentration. Such colorimetric/fluorometric glucose assays are commercially available, for example from Sigma Aldrich, Catalog No. GAGO-20.

[0099] Assays for detecting ligninase activity are known in the art. Ligninase activity can be measured by determining the rate of oxidation of veratryl alcohol to veratrylaldehyde (abbreviated as VAO for veratryl alcohol oxidation). Reaction mixtures are prepared, and contain dilutions of enzyme, 2 mM veratryl alcohol, 0.4 mM H.sub.2O.sub.2 and either 20 or 100 mM sodium tartrate, pH 2.9 in a final volume of 0.5 ml. The reactions are started by H.sub.2O.sub.2 addition and are monitored by spectrophotometry at 310 nm. Protein is determined according to Bradford, M. M., (1976) Anal. Biochem. 72:248-254, using bovine serum albumin (Sigma Chemical Co., St. Louis, Mo.) as standard or by using the 409 nm absorbance of a protein solution and calculating protein amount from the extinction coefficient of ligninase.

[0100] Assays for detecting endoglucanase activity are known in the art. For example, endoglucanase activity can be determined by measuring the hydrolysis of substrate carboxymethyl cellulose (CMC) and quantifying the concentration of reducing ends by the BCA method, in which the total concentration of reducing ends is exhibited by a color change of the sample solution in proportion to the concentration of the reducing ends. First, the polypeptide having biomass-degrading activity is diluted in a 50 mM citrate buffer at pH 4.8. CMC solution (0.05% w/v CMC in the sodium citrate buffer) is added to a reaction tube and equilibrated at 50.degree. C. The diluted enzyme samples are added to the reaction and incubated at 50.degree. C. for 10 minutes. BCA reagents are added and incubated at 75.degree. C. for 30 minutes. The absorbance is read at 560 nm after subtracting the readings for the enzyme blanks and the substrate blank. Enzyme activity can be calculated based on a linear range between reducing end concentration and enzyme concentrations. Other endoglucanase activity assays are known in the art, for example, by determining a reduction in substrate viscosity (Zhang et al., Biotechnol Adv, 2006, 24:452-481).

[0101] Assays for detecting cellobiohydrolase activity are known in the art. Cellobiohydrolase activity can be determined by measuring soluble substrate released from substrate Avicel in a phenol-sulfuric assay. An Avicel solution (1.25% w/v in acetate buffer) is aliquoted into reaction tubes, and dilutions of the enzyme are prepared. Both substrate and enzyme solutions are equilibrated at 50.degree. C. The diluted enzyme solutions are added to the substrate and incubated for a time sufficient for the reaction to occur, e.g., at 50.degree. C. for 2 hours. The reactions are stopped by submerging the samples into an ice cold water bath. The samples are centrifuged to separate the samples into a soluble and insoluble fraction. The total concentration of soluble sugars in the soluble fraction is determined by phenol-sulfuric assay. Specifically, an aliquot of the soluble fraction is mixed with 5% phenol, and concentrated sulfuric acid is added. The reaction is cooled to room temperature (about 20-30 minutes), and absorbance of the samples is read at 490 nm. The enzyme activity is calculated on the basis of a linear relationship between total soluble sugar release and the enzyme dilution. Other cellobiohydrolase activity assays are described in Zhang et al., Biofuels: Methods and Protocols, Vol. 581, pages 213-231.

[0102] Assays for detecting xylanase activity are known in the art. Xylanase activity can be determined by measuring the level of xylose released from a xylan substrate by a colorimetric assay. Xylan substrate is prepared as a 1.0% w/v solution in 50 mM sodium acetate buffer, pH 4.5. Dilutions of the enzyme are prepared. Xylan and the enzyme dilutions are mixed, and incubated under conditions sufficient for the reaction to occur, e.g., 30.degree. C. for 10 minutes. Then a solution containing 16 mM copper sulfate, 1.3 M sodium sulfate, 226 mM sodium carbonate, 190 mM sodium bicarbonate, and 43 mM sodium potassium tartrate is added to the reaction. The reaction is then boiled for 10 minutes, and allowed to cool to room temperature. A solution containing 40 mM molybdic acid, 19 mM arsenic acid, and 756 mM sulfuric acid is added. The reaction is shaken or vortexed until the foaming stops and any precipitate present is dissolved. The reaction is centrifuged to clarify, then the solutions are read by spectrophotometer at 540 nm, and enzyme activity is calculated on the basis of a linear relationship between total soluble sugar release and the enzyme dilution.

Aglycosylated Polypeptides

[0103] Any of the polypeptides having biomass-degrading activity described herein, e.g., cellobiase activity, can be glycosylated or aglycosylated. An aglycosylated polypeptide having biomass-degrading activity may be solubilized from an inclusion body, as described herein. Alternatively, an aglycosylated polypeptide having biomass-degrading activity may be added to a mixture comprising a polypeptide having biomass-degrading activity that has been solubilized from an inclusion body, in which the polypeptide that was solubilized from an inclusion body can be glycosylated or aglycosylated.

[0104] Glycosylation is the enzymatic process by which a carbohydrate is attached to a glycosyl acceptor, e.g., the nitrogen of arginine or asparagine side chains or the hydroxyl oxygen of serine, threonine, or tyrosine side chains. There are two types of glycosylation: N-linked and O-linked glycosylation. N-linked glycosylation occurs at consensus site Asn-X-Ser/Thr, wherein the X can be any amino acid except a proline. O-linked glycosylation occurs at Ser/Thr residues. Glycosylation sites can be predicted using various algorithms known in the art, such as Prosite, publicly available by the Swiss Institute of Bioinformatics, and NetNGlyc 1.0 or NetOGlyc 4.0, publicly available by the Center for Biological Sequence Analysis.
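By way of illustration only, the N-linked consensus site Asn-X-Ser/Thr (X being any amino acid except proline) can be located in a one-letter amino acid sequence with a simple pattern scan. This sketch is not a substitute for the prediction tools named above; the function name `find_n_glyc_sites` and the toy sequence in the usage note are hypothetical, and the presence of the motif does not by itself establish that a site is glycosylated.

```python
import re

# Zero-width lookahead so that overlapping motifs are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_n_glyc_sites(protein_seq: str) -> list:
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs."""
    return [match.start() + 1 for match in SEQUON.finditer(protein_seq.upper())]


# Hypothetical toy sequence: motifs begin at positions 2, 9 and 10;
# the Asn at position 5 is followed by proline and is therefore skipped.
sites = find_n_glyc_sites("MNATNPSANNTS")
```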

[0105] The present invention provides methods for producing an aglycosylated polypeptide having biomass-degrading activity. In one embodiment, the nucleic acid encoding the polypeptide has been altered or mutated such that the polypeptide cannot be glycosylated, e.g., one or more glycosylation sites are mutated such that a glycan cannot be attached to the glycosylation site. For example, an aglycosylated polypeptide having biomass-degrading activity encoded by a nucleic acid sequence described herein contains one or more mutations at one or more glycosylation sites, such that a glycan can no longer be attached or linked to the glycosylation site. In another example, the polypeptide having biomass-degrading activity encoded by a nucleic acid sequence described herein contains one or more mutations proximal to one or more glycosylation sites, such that a glycan can no longer be attached or linked to the glycosylation site. For example, the mutation proximal to a glycosylation site mutates the consensus motif recognized by the glycosylating enzyme, or changes the conformation of the polypeptide such that the polypeptide cannot be glycosylated, e.g., the glycosylation site is hidden or steric hindrance due to the new conformation prevents the glycosylating enzymes from accessing the glycosylation site. A mutation proximal to a glycosylation site in the polypeptide having biomass-degrading activity is directly adjacent to, or at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 30 or at least 40 amino acids from the glycosylation site that, as a result of the proximal mutation, will not be glycosylated.

[0106] In an embodiment, one or more of the following glycosylation sites of a cellobiase, e.g., Cel3a, or SEQ ID NO: 1, are mutated: the threonine at amino acid position 78, the threonine at amino acid position 241, the serine at amino acid position 343, the serine at amino acid position 450, the threonine at amino acid position 599, the serine at amino acid position 616, the threonine at amino acid position 691, the serine at amino acid position 21, the threonine at amino acid position 24, the serine at amino acid position 25, the serine at amino acid position 28, the threonine at amino acid position 38, the threonine at amino acid position 42, the threonine at amino acid position 303, the serine at amino acid position 398, the serine at amino acid position 435, the serine at amino acid position 436, the threonine at amino acid position 439, the threonine at amino acid position 442, the threonine at amino acid position 446, the serine at amino acid position 451, the serine at amino acid position 619, the serine at amino acid position 622, the threonine at amino acid position 623, the serine at amino acid position 626, or the threonine at amino acid position 630, or any combination thereof. In embodiments, the glycosylation site is mutated from a serine or threonine to an alanine. For example, the aglycosylated polypeptide described herein has one or more of the following mutations: T78A, T241A, S343A, S450A, T599A, S616A, T691A, S21A, T24A, S25A, S28A, T38A, T42A, T303A, T398A, S435A, S436A, T439A, T442A, T446A, S451A, S619A, S622A, T623A, S626A, or T630A, or any combination thereof. Alternatively, one or more amino acids proximal to the glycosylation sites described above are mutated.
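By way of illustration only, substitution labels of the form used above (wild-type residue, 1-based position, replacement residue, e.g., T78A) can be applied to a one-letter amino acid sequence as sketched below. The function name `apply_mutations` and the toy sequence are hypothetical; SEQ ID NO: 1 is not reproduced here. The sketch checks that the residue actually present at each position matches the wild-type letter in the label, which is a convenient way to catch a label that disagrees with the sequence it is applied to.

```python
import re

# Label format: wild-type residue, 1-based position, replacement residue.
MUTATION_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(protein_seq: str, labels) -> str:
    """Apply point substitutions such as 'T78A' to a protein sequence."""
    residues = list(protein_seq)
    for label in labels:
        parsed = MUTATION_LABEL.match(label)
        if parsed is None:
            raise ValueError(f"unrecognized mutation label: {label!r}")
        wild_type, position, replacement = (
            parsed.group(1), int(parsed.group(2)), parsed.group(3))
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{label}: sequence has {residues[position - 1]} "
                f"at position {position}, not {wild_type}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence: Thr at position 3 and Ser at position 5.
mutant = apply_mutations("MKTASGT", ["T3A", "S5A"])
```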

[0107] Assays to detect whether a polypeptide is modified by a glycan (e.g., whether the polypeptide is glycosylated or aglycosylated) are known in the art. The polypeptide can be purified or isolated and can be stained for detection and quantification of glycan moieties, or the polypeptide can be analyzed by mass spectrometry, and compared to a corresponding reference polypeptide. The reference polypeptide has the same primary sequence as the test polypeptide (of which the glycosylation state is to be determined), but is either glycosylated or aglycosylated.

[0108] The aglycosylated polypeptides described herein may have increased biomass-degrading activity, e.g., cellobiase activity, compared to a corresponding glycosylated polypeptide, e.g., glycosylated Cel3a polypeptide. For example, the aglycosylated polypeptide having biomass-degrading activity, e.g., cellobiase activity, has at least 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or 200% biomass-degrading activity, e.g., cellobiase activity, compared to the glycosylated polypeptide.

Nucleic Acids, Expression Vectors and Host Cells

[0109] A polypeptide having biomass-degrading activity as described herein is expressed in host cells. In one embodiment, an expression vector comprising a nucleic acid sequence encoding any of the polypeptides described herein having biomass-degrading activity is introduced into a host cell, and the host cell is cultured under conditions appropriate for expression of the polypeptide having biomass-degrading activity. In embodiments, the expression of the polypeptide having biomass-degrading activity is at a level such that inclusion bodies, or aggregates comprising the polypeptide having biomass-degrading activity, are formed. Also described herein are methods for expressing and isolating the soluble polypeptide having biomass-degrading activity expressed in the host cell. Methods for solubilizing and isolating the polypeptide having biomass-degrading activity from the inclusion bodies are described further in the section titled "Solubilization from Inclusion Bodies".

[0110] The present invention also provides a nucleic acid sequence encoding a polypeptide having biomass-degrading activity. In an embodiment, the nucleic acid sequence encodes a ligninase, an endoglucanase, a cellobiohydrolase, a xylanase, or a cellobiase described herein. The nucleic acid sequence encoding a polypeptide having biomass-degrading activity can be codon-optimized for increased expression in host cells. Codon optimization includes changing the nucleic acid sequence to take into consideration factors including codon usage bias, cryptic splicing sites, mRNA secondary structure, premature polyA sites, interaction of codon and anti-codon, and RNA instability motifs, to increase expression of the encoded polypeptide in the host. Various algorithms and commercial services for codon-optimization are known and available in the art.

[0111] In an embodiment, the nucleic acid sequence encodes a Cel3a enzyme from T. reesei with the amino acid sequence described herein, e.g., SEQ ID NO: 1. In an embodiment, the nucleic acid sequence that encodes Cel3a is provided below:

TABLE-US-00002 (SEQ ID NO: 2) ATGCGTTACCGAACAGCAGCTGCGCTGGCACTTGCCACTGGGCCCTTTGC TAGGGCAGACAGTCACTCAACATCGGGGGCCTCGGCTGAGGCAGTTGTAC CTCCTGCAGGGACTCCATGGGGAACCGCGTACGACAAGGCGAAGGCCGCA TTGGCAAAGCTCAATCTCCAAGATAAGGTCGGCATCGTGAGCGGTGTCGG CTGGAACGGCGGTCCTTGCGTTGGAAACACATCTCCGGCCTCCAAGATCA GCTATCCATCGCTATGCCTTCAAGACGGACCCCTCGGTGTTCGATACTCG ACAGGCAGCACAGCCTTTACGCCGGGCGTTCAAGCGGCCTCGACGTGGGA TGTCAATTTGATCCGCGAACGTGGACAGTTCATCGGTGAGGAGGTGAAGG CCTCGGGGATTCATGTCATACTTGGTCCTGTGGCTGGGCCGCTGGGAAAG ACTCCGCAGGGCGGTCGCAACTGGGAGGGCTTCGGTGTCGATCCATATCT CACGGGCATTGCCATGGGTCAAACCATCAACGGCATCCAGTCGGTAGGCG TGCAGGCGACAGCGAAGCACTATATCCTCAACGAGCAGGAGCTCAATCGA GAAACCATTTCGAGCAACCCAGATGACCGAACTCTCCATGAGCTGTATAC TTGGCCATTTGCCGACGCGGTTCAGGCCAATGTCGCTTCTGTCATGTGCT CGTACAACAAGGTCAATACCACCTGGGCCTGCGAGGATCAGTACACGCTG CAGACTGTGCTGAAAGACCAGCTGGGGTTCCCAGGCTATGTCATGACGGA CTGGAACGCACAGCACACGACTGTCCAAAGCGCGAATTCTGGGCTTGACA TGTCAATGCCTGGCACAGACTTCAACGGTAACAATCGGCTCTGGGGTCCA GCTCTCACCAATGCGGTAAATAGCAATCAGGTCCCCACGAGCAGAGTCGA CGATATGGTGACTCGTATCCTCGCCGCATGGTACTTGACAGGCCAGGACC AGGCAGGCTATCCGTCGTTCAACATCAGCAGAAATGTTCAAGGAAACCAC AAGACCAATGTCAGGGCAATTGCCAGGGACGGCATCGTTCTGCTCAAGAA TGACGCCAACATCCTGCCGCTCAAGAAGCCCGCTAGCATTGCCGTCGTTG GATCTGCCGCAATCATTGGTAACCACGCCAGAAACTCGCCCTCGTGCAAC GACAAAGGCTGCGACGACGGGGCCTTGGGCATGGGTTGGGGTTCCGGCGC CGTCAACTATCCGTACTTCGTCGCGCCCTACGATGCCATCAATACCAGAG CGTCTTCGCAGGGCACCCAGGTTACCTTGAGCAACACCGACAACACGTCC TCAGGCGCATCTGCAGCAAGAGGAAAGGACGTCGCCATCGTCTTCATCAC CGCCGACTCGGGTGAAGGCTACATCACCGTGGAGGGCAACGCGGGCGATC GCAACAACCTGGATCCGTGGCACAACGGCAATGCCCTGGTCCAGGCGGTG GCCGGTGCCAACAGCAACGTCATTGTTGTTGTCCACTCCGTTGGCGCCAT CATTCTGGAGCAGATTCTTGCTCTTCCGCAGGTCAAGGCCGTTGTCTGGG CGGGTCTTCCTTCTCAGGAGAGCGGCAATGCGCTCGTCGACGTGCTGTGG GGAGATGTCAGCCCTTCTGGCAAGCTGGTGTACACCATTGCGAAGAGCCC CAATGACTATAACACTCGCATCGTTTCCGGCGGCAGTGACAGCTTCAGCG AGGGACTGTTCATCGACTATAAGCACTTCGACGACGCCAATATCACGCCG CGGTACGAGTTCGGCTATGGACTGTCTTACACCAAGTTCAACTACTCACG CCTCTCCGTCTTGTCGACCGCCAAGTCTGGTCCTGCGACTGGGGCCGTTG 
TGCCGGGAGGCCCGAGTGATCTGTTCCAGAATGTCGCGACAGTCACCGTT GACATCGCAAACTCTGGCCAAGTGACTGGTGCCGAGGTAGCCCAGCTGTA CATCACCTACCCATCTTCAGCACCCAGGACCCCTCCGAAGCAGCTGCGAG GCTTTGCCAAGCTGAACCTCACGCCTGGTCAGAGCGGAACAGCAACGTTC AACATCCGACGACGAGATCTCAGCTACTGGGACACGGCTTCGCAGAAATG GGTGGTGCCGTCGGGGTCGTTTGGCATCAGCGTGGGAGCGAGCAGCCGGG ATATCAGGCTGACGAGCACTCTGTCGGTAGCG

The codon-optimized nucleic acid sequence that encodes Cel3a is provided below:

TABLE-US-00003 (SEQ ID NO: 3) ATGCGTTATCGTACAGCCGCAGCCCTGGCACTGGCCACAGGTCCGTTCGC ACGTGCCGATAGTCACAGTACCAGCGGTGCCAGCGCAGAAGCCGTGGTTC CGCCGGCAGGCACACCGTGGGGCACAGCCTATGATAAAGCCAAAGCCGCC CTGGCCAAGCTGAATCTGCAGGATAAAGTGGGCATCGTGAGTGGCGTGGG CTGGAACGGTGGTCCGTGCGTTGGCAACACCAGCCCGGCAAGCAAGATCA GCTATCCGAGCTTATGCCTGCAGGATGGTCCGCTGGGCGTGCGCTATAGC ACCGGTAGTACCGCCTTTACACCTGGTGTGCAGGCCGCCAGTACCTGGGA CGTTAACCTGATCCGCGAACGTGGCCAATTTATCGGCGAAGAAGTTAAAG CCAGCGGCATTCATGTTATTCTGGGTCCGGTGGCCGGTCCTCTGGGTAAA ACCCCGCAGGGCGGCCGTAATTGGGAAGGCTTCGGCGTTGATCCGTATTT AACCGGCATCGCAATGGGCCAGACCATTAATGGCATCCAGAGCGTGGGTG TTCAAGCCACCGCCAAACACTACATATTAAACGAACAGGAACTGAATCGT GAAACCATCAGCAGCAATCCGGATGATCGCACCCTGCATGAGCTGTATAC ATGGCCTTTTGCCGACGCAGTTCAGGCCAACGTGGCAAGTGTGATGTGTA GCTATAACAAGGTGAACACCACCTGGGCCTGCGAAGACCAGTACACCCTG CAGACCGTTTTAAAAGACCAACTGGGCTTCCCTGGTTACGTGATGACAGA TTGGAATGCCCAGCACACAACCGTTCAGAGCGCAAACAGTGGCCTGGATA TGAGCATGCCGGGCACCGACTTCAACGGCAATAATCGTCTGTGGGGTCCG GCACTGACCAATGCCGTTAACAGCAACCAGGTGCCGACCAGTCGTGTGGA CGATATGGTTACCCGTATTCTGGCCGCCTGGTACCTGACAGGTCAAGACC AGGCCGGCTACCCGAGCTTCAACATCAGCCGCAACGTGCAGGGTAATCAC AAGACCAACGTTCGCGCAATCGCACGCGATGGTATCGTGCTGTTAAAGAA CGATGCCAACATTCTGCCGCTGAAAAAACCGGCCAGCATCGCCGTTGTTG GTAGCGCAGCCATCATTGGCAACCACGCCCGTAACAGTCCGAGCTGCAAT GATAAAGGCTGTGACGACGGTGCCCTGGGCATGGGTTGGGGTAGTGGTGC CGTGAACTACCCGTATTTCGTGGCCCCGTACGACGCCATTAACACCCGTG CAAGTAGCCAGGGTACCCAGGTTACCCTGAGCAACACCGACAACACAAGC AGCGGTGCCAGTGCAGCACGTGGTAAGGATGTGGCCATCGTGTTCATCAC CGCCGACAGCGGCGAAGGCTACATTACCGTGGAGGGTAATGCCGGTGATC GCAATAATCTGGACCCGTGGCATAACGGCAACGCCCTGGTTCAGGCAGTG GCAGGCGCAAATAGCAACGTGATCGTTGTGGTGCATAGCGTGGGTGCCAT CATTCTGGAGCAGATCCTGGCCCTGCCGCAAGTTAAGGCAGTTGTGTGGG CAGGTCTGCCGAGCCAAGAAAGTGGCAATGCCCTGGTGGACGTTCTGTGG GGCGATGTTAGTCCGAGCGGCAAGCTGGTGTATACAATCGCCAAGAGCCC GAACGACTATAACACCCGCATCGTTAGCGGCGGCAGTGATAGCTTCAGCG AGGGCCTGTTTATCGACTACAAGCATTTCGATGATGCCAATATTACCCCG CGCTACGAATTTGGTTATGGCCTGAGCTATACCAAGTTCAACTACAGCCG CCTGAGCGTTTTAAGTACCGCCAAGAGTGGTCCGGCAACAGGTGCCGTGG 
TTCCTGGTGGTCCGAGTGATCTGTTTCAGAATGTGGCCACCGTGACCGTG GATATCGCCAACAGTGGTCAGGTTACCGGCGCCGAAGTGGCACAGCTGTA CATCACCTATCCGAGCAGTGCACCGCGCACCCCGCCGAAACAGCTGCGTG GCTTCGCCAAATTAAACCTGACCCCGGGCCAGAGCGGTACAGCAACCTTC AATATTCGCCGCCGTGATCTGAGCTATTGGGACACCGCCAGCCAAAAATG GGTGGTGCCGAGCGGCAGCTTTGGCATTAGTGTGGGTGCAAGTAGCCGCG ACATTCGCTTAACAAGCACCCTGAGTGTTGCC
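By way of illustration only, one way to check that a codon-optimized sequence encodes the same polypeptide as the original sequence is to translate both with the standard genetic code and compare the results. The sketch below uses a compact encoding of the standard code; the function name `translate` is hypothetical, and the usage note applies it only to the first 24 nucleotides of SEQ ID NO: 2 and SEQ ID NO: 3 as listed above.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; spaces are ignored."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 24 nucleotides of SEQ ID NO: 2 and of SEQ ID NO: 3.
native_start = translate("ATGCGTTACCGAACAGCAGCTGCG")
optimized_start = translate("ATGCGTTATCGTACAGCCGCAGCC")
same_peptide = native_start == optimized_start
```

The same comparison can be run over the complete sequences by passing the full listings to `translate`, since spaces in the listings are ignored.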

[0112] In an embodiment, the nucleic acid sequence encoding a Cel3a enzyme or functional variant thereof comprises at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2. In an embodiment, the nucleic acid sequence encoding a Cel3a enzyme or functional variant thereof comprises at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO:2 or SEQ ID NO: 3.
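By way of illustration only, percent identity between two nucleic acid sequences of equal length can be approximated by a position-by-position comparison. This is a simplified sketch: the function name `percent_identity` is hypothetical, no gaps are considered, and sequences of different lengths would in practice first be aligned with a sequence alignment tool, which is not shown here.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    seq_a = seq_a.upper().replace(" ", "")
    seq_b = seq_b.upper().replace(" ", "")
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# First 24 nucleotides of SEQ ID NO: 2 and of SEQ ID NO: 3.
identity = percent_identity("ATGCGTTACCGAACAGCAGCTGCG",
                            "ATGCGTTATCGTACAGCCGCAGCC")
```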

[0113] Also provided herein is a nucleic acid sequence encoding an aglycosylated polypeptide having biomass-degrading activity described herein, e.g., Cel3a polypeptide, in which one or more glycosylation sites present in the polypeptide have been mutated such that a glycan can no longer be attached or linked to the glycosylation site. In another embodiment, the nucleic acid sequence described herein encodes an aglycosylated polypeptide, e.g., a Cel3a polypeptide, as described above, in which one or more amino acids proximal to one or more glycosylation sites present in the polypeptide have been mutated such that a glycan can no longer be attached or linked to the glycosylation site, as previously described.

[0114] The techniques used to isolate or clone a nucleic acid sequence encoding a polypeptide are known in the art and include isolation from genomic DNA, preparation from cDNA, or a combination thereof. The cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York. Other amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleotide sequence-based amplification (NASBA) may be used. The nucleic acid sequence may be cloned from a strain of Trichoderma reesei, e.g., wild-type T. reesei, or T. reesei RUTC30, or another or related organism and thus, for example, may be an allelic or species variant of the polypeptide encoding region of the nucleic acid sequence.

[0115] The nucleic acid sequence may be obtained by standard cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its natural location to a different site where it will be reproduced. The cloning procedures may involve excision and isolation of a desired fragment comprising the nucleotide sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and incorporation of the recombinant vector into a host cell where multiple copies or clones of the nucleotide sequence will be replicated. The nucleotide sequence may be of genomic, cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

[0116] As used herein, an "expression vector" is a nucleic acid construct for introducing and expressing a nucleic acid sequence of interest into a host cell. In some embodiments, the vector comprises a suitable control sequence operably linked to and capable of effecting the expression of the polypeptide encoded by the nucleic acid sequence described herein. The control sequence may be an appropriate promoter sequence, recognized by a host cell for expression of the nucleic acid sequence. In an embodiment, the nucleic acid sequence of interest is a nucleic acid sequence encoding a polypeptide having biomass-degrading activity, e.g., cellobiase activity, as described herein.

[0117] A promoter in the expression vector described herein can include promoters obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell, mutant promoters, truncated promoters, and hybrid promoters.

[0118] Examples of suitable promoters for directing transcription of the nucleic acid constructs of the present invention in a bacterial host cell are the promoters obtained from the E. coli lac operon, E. coli tac promoter (hybrid promoter, DeBoer et al., PNAS, 1983, 80:21-25), E. coli recA, E. coli araBAD, E. coli tetA, and prokaryotic beta-lactamase. Other examples of suitable promoters include viral promoters, such as promoters from bacteriophages, including a T7 promoter, a T5 promoter, a T3 promoter, an M13 promoter, and a SP6 promoter. In some embodiments, more than one promoter controls the expression of the nucleic acid sequence of interest, e.g., an E. coli lac promoter and a T7 promoter. Further promoters that may be suitable for use in the present invention are described in "Useful proteins from recombinant bacteria" in Scientific American, 1980, 242:74-94, and Sambrook et al., Molecular Cloning: A Laboratory Manual, 1989. In some preferred embodiments, the promoter is inducible, where the addition of a molecule stimulates the transcription and expression of the downstream reading frame.

[0119] Examples of suitable promoters for directing transcription of the nucleic acid constructs of the present invention in a eukaryotic host cell, e.g., in a fungal or yeast cell, are promoters obtained from the genes of Trichoderma reesei, methanol-inducible alcohol oxidase (AOX promoter), Aspergillus nidulans tryptophan biosynthesis (trpC promoter), Aspergillus niger var. awamori glucoamylase (glaA), Saccharomyces cerevisiae galactokinase (GAL1), or Kluyveromyces lactis Plac4-PBI promoter.

[0120] A control sequence present in the expression vector described herein may also be a signal sequence that codes for an amino acid sequence linked to the amino terminus of a polypeptide and directs the encoded polypeptide into the cell's secretory pathway, e.g., a secretion signal sequence. The signal sequence may be an endogenous signal sequence, e.g., where the signal sequence is present at the N-terminus of the wild-type polypeptide when endogenously expressed by the organism from which the polypeptide of interest originates. The signal sequence may be a foreign, or heterologous, signal peptide, in which the signal sequence is from a different organism or a different polypeptide than that of the polypeptide of interest being expressed. Any signal sequence which directs the expressed polypeptide into the secretory pathway of a host cell may be used in the present invention. Typically, signal sequences are composed of between 6 and 136 basic and/or hydrophobic amino acids.

[0121] Examples of signal sequences suitable for the present invention include the signal sequence from Saccharomyces cerevisiae alpha-factor.

[0122] Fusion tags may also be used in the expression vector described herein to facilitate the detection and purification of the expressed polypeptide. Examples of suitable fusion tags include His-tag (e.g., 3.times. His, 6.times. His (SEQ ID NO: 22), or 8.times. His (SEQ ID NO: 21)), GST-tag, HSV-tag, S-tag, and T7 tag. Other suitable fusion tags include myc tag, hemagglutinin (HA) tag, and fluorescent protein tags (e.g., green fluorescent protein). The fusion tag is typically operably linked to the N or C terminus of the polypeptide to be expressed. In some embodiments, there may be a linker region between the fusion tag sequence and the N-terminus or C-terminus of the polypeptide to be expressed. In an embodiment, the linker region comprises a sequence of 1 to 20 amino acids that does not affect or alter the expression or function of the expressed polypeptide.

[0123] Utilization of the fusion tags described herein allows detection of the expressed protein, e.g., by western blot using antibodies that specifically recognize the tag. The tags also allow for purification of the expressed polypeptide from the host cell, e.g., by affinity chromatography. For example, an expressed polypeptide fused to a His-tag can be purified by using nickel affinity chromatography. The His-tag has affinity for nickel ions, and a nickel column will retain the His-tagged polypeptide, while allowing all other proteins and cell debris to flow through the column. Elution of the His-tagged polypeptide using an elution buffer, e.g., containing imidazole, releases the His-tagged polypeptide from the column, resulting in substantially purified polypeptide.

[0124] The expression vector described herein may further comprise a selectable marker gene to enable isolation of a genetically modified microbe transformed with the construct as is commonly known to those of skill in the art. The selectable marker gene may confer resistance to an antibiotic or the ability to grow on medium lacking a specific nutrient to the host organism that otherwise could not grow under these conditions. The present invention is not limited by the choice of selectable marker gene, and one of skill in the art may readily determine an appropriate gene. For example, the selectable marker gene may confer resistance to ampicillin, chloramphenicol, tetracycline, kanamycin, hygromycin, phleomycin, geneticin, or G418, or may complement a deficiency of the host microbe in one of the trp, arg, leu, pyr4, pyr, ura3, ura5, his, or ade genes or may confer the ability to grow on acetamide as a sole nitrogen source.

[0125] The expression vector described herein may further comprise other nucleic acid sequences, e.g., additional control sequences, as is commonly known to those of skill in the art, for example, transcriptional terminators, synthetic sequences to link the various other nucleic acid sequences together, origins of replication, ribosome binding sites, a multiple cloning site (or polylinker site), a polyadenylation signal and the like. The ribosomal binding site suitable for the expression vector depends on the host cell used, for example, for expression in a prokaryotic host cell, a prokaryotic RBS, e.g., a T7 phage RBS can be used. A multiple cloning site, or polylinker site, contains one or more restriction enzyme sites that are preferably not present in the remaining sequence of the expression vector. The restriction enzyme sites are utilized for the insertion of a nucleic acid sequence encoding a polypeptide having cellobiase activity or other desired control sequences. The practice of the present invention is not limited by the presence of any one or more of these other nucleic acid sequences, e.g., other control sequences.
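The preference that restriction enzyme sites in the multiple cloning site not occur elsewhere in the vector can be checked computationally. The sketch below is illustrative only: the function names and the choice of enzymes are assumptions rather than part of the application. It counts occurrences of three well-known palindromic recognition sequences (EcoRI, BamHI, HindIII) in a plasmid sequence, treating the plasmid as circular so that a site spanning the start/end junction is not missed. Because these sites are palindromic, searching one strand is sufficient.

```python
# Illustrative sketch: enzyme selection and function names are assumptions.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def count_site(vector, site, circular=True):
    """Count (possibly overlapping) occurrences of a recognition site in a vector."""
    seq = vector.upper()
    site = site.upper()
    if circular:
        # Append the first len(site)-1 bases so junction-spanning sites are found.
        seq += seq[: len(site) - 1]
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def unique_cutters(vector, sites=SITES):
    """Return the names of enzymes whose site occurs exactly once in the vector."""
    return sorted(name for name, s in sites.items() if count_site(vector, s) == 1)
```

An enzyme returned by `unique_cutters` cuts the vector at a single position and is therefore a candidate for inclusion in a polylinker; enzymes with two or more sites would fragment the vector backbone.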

[0126] Examples of suitable expression vectors for use in the present invention include vectors for expression in prokaryotes, e.g., bacterial expression vectors. A bacterial expression vector suitable for use in the present invention is the pET vector (Novagen), which contains the following: a viral T7 promoter which is specific to only T7 RNA polymerase (not bacterial RNA polymerase) and also does not occur anywhere in the prokaryotic genome, a lac operator comprising a lac promoter and coding sequence for the lac repressor protein (lacI gene), a polylinker, an f1 origin of replication (so that a single-stranded plasmid can be produced when co-infected with M13 helper phage), an ampicillin resistance gene, and a ColE1 origin of replication (Blaber, 1998). Both the promoter and the lac operator are located 5', or upstream, of the polylinker in which the nucleic acid sequence encoding a polypeptide described herein is inserted. The lac operator confers inducible expression of the nucleic acid sequence encoding a polypeptide having cellobiase activity. Addition of IPTG (isopropyl β-D-1-thiogalactopyranoside), a non-metabolizable analog of the lactose metabolite allolactose, triggers transcription of the lac operon and induces protein expression of the nucleic acid sequence under control of the lac operator. Use of this system requires the addition of T7 RNA polymerase to the host cell for vector expression. The T7 RNA polymerase can be introduced via a second expression vector, or a host cell strain that is genetically engineered to express T7 RNA polymerase can be used.

[0127] An exemplary expression vector for use with the invention is a pET vector, commercially available from Novagen. The pET expression system is described in U.S. Pat. Nos. 4,952,496; 5,693,489; and 5,869,320. In one embodiment, the pET vector is a pET-DUET vector, e.g., pET-Duet1, commercially available from Novagen. Other vectors suitable for use in the present invention include vectors containing His-tag sequences, such as those described in U.S. Pat. Nos. 5,310,663 and 5,284,933; and European Patent No. 282042.

[0128] The present invention also relates to a host cell comprising the nucleic acid sequence or expression vector of the invention, which are used in the recombinant production of the polypeptides having biomass-degrading activity.

[0129] An expression vector comprising a nucleic acid sequence of the present invention is introduced into a host cell so that the vector is maintained (e.g., by chromosomal integration or as a self-replicating extra-chromosomal vector) such that the polypeptide is expressed.

[0130] The host cell may be a prokaryote or a eukaryote. The host cell may be a bacterium, such as an E. coli strain, e.g., K12 strains NovaBlue, NovaBlue T1R, JM109, and DH5α. Preferably, the bacterial cell has the capability to fold, or partially fold, exogenously expressed proteins, such as E. coli Origami strains, e.g., Origami B, Origami B (DE3), Origami 2, and Origami 2(DE3) strains. In some embodiments, it may be preferred to use a host cell that is deficient for glycosylation, or has an impaired glycosylation pathway such that proteins expressed by the host cell are not significantly glycosylated.

[0131] The host cell may be a yeast or a filamentous fungus, particularly those classified as Ascomycota. Genera of yeasts useful as host microbes for the expression of modified TrCel3A beta-glucosidases of the present invention include Saccharomyces, Pichia, Hansenula, Kluyveromyces, Yarrowia, and Arxula. Genera of fungi useful as microbes for the expression of the polypeptides of the present invention include Trichoderma, Hypocrea, Aspergillus, Fusarium, Humicola, Neurospora, Chrysosporium, Myceliophthora, Thielavia, Sporotrichum and Penicillium. For example, the host cell may be Pichia pastoris. For example, the host cell may be an industrial strain of Trichoderma reesei, or a mutant thereof, e.g., T. reesei RUTC30. Typically, the host cell is one which does not express a parental biomass-degrading enzyme, e.g., cellobiase or Cel3a.

[0132] The selection of the particular host cell, e.g., bacterial cell or a fungal cell, depends on the expression vector (e.g., the control sequences) and/or the method utilized for producing an aglycosylated polypeptide of the invention, as described in further detail below.

[0133] The expression vector of the invention may be introduced into the host cell by any number of methods known by one skilled in the art of microbial transformation, including but not limited to, transformation, treatment of cells with CaCl2, electroporation, biolistic bombardment, lipofection, and PEG-mediated fusion of protoplasts (e.g., White et al., WO 2005/093072, which is incorporated herein by reference). After selecting the recombinant host cells containing the expression vector (e.g., by selection utilizing the selectable marker of the expression vector), the recombinant host cells may be cultured under conditions that induce the expression of the polypeptide having biomass-degrading activity of the invention.

[0134] Methods for recovering the soluble polypeptides having biomass-degrading activity expressed from prokaryote and eukaryote cells are known in the art. In embodiments, the method for recovering the polypeptide comprises collecting the cells, e.g., by centrifugation or filtration, and lysing the cells, e.g., by mechanical, chemical, or enzymatic means. For example, cells can be physically broken apart, e.g., by sonication, milling (shaking with beads), or shear forces. Cell membranes can be permeabilized so that the contents of the cells are released, e.g., by treatment with detergents such as Triton, NP-40, or SDS. Cells with cell walls, e.g., bacterial cells, can be permeabilized using enzymes, such as a lysozyme or lysonase. Any combination of the mechanical, chemical, and enzymatic techniques described above is also suitable for recovering expressed polypeptides of interest from the host cell in the context of this invention. For example, when expressing a polypeptide having biomass-degrading activity described herein in a bacterial cell, e.g., an E. coli cell, the cell is typically collected by centrifuging and pelleting the cell culture, and lysed by resuspending the cell pellet in a lysis buffer containing lysozyme. To ensure complete lysis, the resuspended cells are subjected to one of the following methods: sonication, milling, or homogenization. After centrifugation, the soluble polypeptides having biomass-degrading activity are present in the supernatant, while the insoluble polypeptides having biomass-degrading activity, e.g., in inclusion bodies, are found in the pellet. Methods for recovering the insoluble polypeptides having biomass-degrading activity, e.g., from inclusion bodies, are described further below in the section titled "Solubilization from Inclusion Bodies".

[0135] The soluble polypeptides having biomass-degrading activity can then be purified or isolated from the cell lysate using standard methods known in the art. For polypeptides having biomass-degrading activity comprising a tag, e.g., a His tag, affinity chromatography can be used to separate the soluble polypeptides from the remainder of the soluble fraction of the lysate.

[0136] In one embodiment, the host cell expressing a polypeptide having biomass-degrading activity described herein is not lysed before addition to the biomass for the saccharification reaction. In some instances, the methods for lysing host cells and extracting the polypeptides having biomass-degrading activity can result in protein denaturation and/or decreased enzyme activity, which leads to increased cost of downstream processing. Thus, the present invention also provides methods for directly adding the host cells expressing an aglycosylated polypeptide having biomass-degrading activity described herein to the biomass prior to the saccharification step.

[0137] In an embodiment, the host cell, e.g., the E. coli cell, expressing a polypeptide having biomass-degrading activity described herein is isolated, e.g., by centrifugation, and added to the saccharification reaction, e.g., the saccharification reactor containing biomass. The cells are lysed by a combination of shear from the biomass, the impellers, and the increased temperature. In an embodiment, the culture of host cells, e.g., E. coli cells, expressing the polypeptide having biomass-degrading activity described herein is added directly from the fermentation tank to the saccharification tank, eliminating the need to pellet cells by centrifugation. In an embodiment, the polypeptide is glycosylated or aglycosylated.

Solubilization from Inclusion Bodies

[0138] In embodiments, a cell, e.g., a microorganism disclosed herein, has been genetically modified using methods described herein to produce at least one polypeptide having a biomass-degrading activity. At least a portion, e.g., at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90%, of the polypeptide having biomass-degrading activity is found in inclusion bodies in the genetically modified cell. Disclosed herein are methods for solubilizing the at least one polypeptide having biomass-degrading activity from the inclusion bodies.

[0139] Inclusion bodies are insoluble aggregates in host cells comprising heterologously expressed proteins, e.g., a polypeptide having a biomass-degrading activity, when expressed at high levels. Inclusion bodies can be found in the nucleus or the cytoplasm. Inclusion bodies can also contain other components, such as other proteins endogenous to the host cell, e.g., host proteins, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). Proteins endogenous to the host cell include any protein that is encoded by genomic DNA of the host cell. The protein endogenous to the host cell may be localized to the cytoplasm, or may interact with the heterologously expressed protein. Examples of ribosomal components that can be found in an inclusion body include ribosomes, fragments of ribosomal subunits (e.g., 50S subunit or 30S subunit), partially translated polypeptides, transfer RNA, elongation factors, and/or messenger RNA. Examples of nucleic acids that can be found in an inclusion body include genomic DNA of the host, exogenous DNA (e.g., from a plasmid or expression vector introduced into the host cell), messenger RNA, transfer RNA, ribosomal RNA, or any fragments thereof. Examples of cellular debris that can be found in an inclusion body include cell wall or membrane debris (e.g., fragments or components of the cell wall or membrane), nuclear membrane debris (e.g., fragments or components of the nuclear membrane), fragments or components of other host organelles, endotoxins, lipids, and/or metabolites.

[0140] Additional methods for reducing aggregation of inclusion bodies include sonication, incubation at varying temperatures, acid/base treatment, protease treatment, electrical treatment, mechanical treatment, and addition of organisms that produce proteases. For example, the cells or lysates thereof containing inclusion bodies are incubated at temperatures ranging from -20°C to 0°C, 0°C to 4°C, 4°C to 20°C, 20°C to 40°C, or 40°C to 80°C.

[0141] To isolate the inclusion bodies, the host cell expressing a polypeptide having biomass-degrading activity is first lysed, using standard methods in the art, such as lysis by lysozyme or other denaturing agents, ultrasound treatment, sonication, or high pressure homogenization. The host cells are lysed under conditions that do not lead to solubilization of an inclusion body. Inclusion bodies are isolated from the host cell using techniques known in the art. For example, the cell lysate is separated such that the inclusion bodies containing polypeptides having biomass-degrading activity and other insoluble matter are present in an insoluble fraction, while the soluble fraction contains the soluble polypeptides having biomass-degrading activity. Such separation can be accomplished through centrifugation, whereby the inclusion bodies are found in the pellet, e.g., the insoluble fraction, and the soluble polypeptides are found in the supernatant, e.g., the soluble fraction. Other methods suitable for separation of an insoluble fraction from the soluble fraction include filtration.

[0142] Solubilization of the inclusion bodies to release a polypeptide having biomass-degrading activity comprises adding a solubilizing agent to the insoluble fraction or inclusion bodies. In some embodiments a solubilizing agent can be an agent that prevents protein aggregation or precipitation, or dissolves protein aggregates. In some embodiments, the solubilizing agent includes an agent that disrupts van der Waals interactions, hydrophobic interactions, hydrogen bonding, dipole-dipole interactions, ionic interactions, pi stacking, or any combination thereof.

[0143] In some embodiments, the solubilizing agent can be an agent that disrupts hydrophobic interactions, e.g., a detergent. Exemplary detergents include nonionic, zwitterionic, anionic and cationic detergents. In some embodiments, the solubilizing agent can be nonionic, e.g., NP-40 and Triton X-100. In some embodiments, the solubilizing agent can be zwitterionic, e.g., CHAPS and sulfobetaines, e.g., SB3-10 or ASB 14. In some embodiments, the solubilizing agent can be anionic, e.g., sodium dodecyl sulfate (SDS).

[0144] In some embodiments, the solubilizing agent can be an agent that reduces disulfide bonds, e.g., a thiol reducing agent. Exemplary thiol reducing agents include 2-mercaptoethanol (βME) and dithiothreitol (DTT). In some embodiments, the solubilizing agent that reduces disulfide bonds can be a phosphine, e.g., tributylphosphine (TBP) or triscarboxyethylphosphine (TCEP).

[0145] In some embodiments, the solubilizing agent can be an agent that disrupts hydrogen bonding and hydrophobic interactions. In some embodiments, the solubilizing agent can be a chaotropic compound, e.g., urea and substituted ureas (e.g., thiourea), and guanidinium hydrochloride.

[0146] In some embodiments, the solubilizing agent can be a nonpolar solvent. Nonpolar solvents contain bonds between atoms with similar electronegativities, such as carbon and hydrogen, and have very low dielectric constants. For example, nonpolar solvents have a dielectric constant of less than 5. Examples of nonpolar solvents include pentane, hexane, cyclohexane, benzene, toluene, chloroform, and diethyl ether.

[0147] In some embodiments, the solubilizing agent can be a polar solvent. Polar solvents are characterized by having large dipole moments (or "partial charges"); they contain bonds between atoms with very different electronegativities, such as oxygen and hydrogen. In one embodiment, the polar solvents suitable for use in the invention herein have a dielectric constant of at least 5, or at least 20. In a preferred embodiment, the polar solvents have a high dielectric constant, e.g., a dielectric constant greater than 25. In some embodiments, the solubilizing agent is a protic polar solvent, which has O-H or N-H bonds, has a high dielectric constant, e.g., greater than 20 or greater than 25, and is a good hydrogen bond donor, e.g., formic acid, n-butanol, isopropanol, n-propanol, ethanol, methanol, or nitromethane. In some embodiments, the solubilizing agent can be an aprotic polar solvent, which lacks O-H or N-H bonds and has a dielectric constant between 5 and 20, e.g., dimethylsulfoxide (DMSO), dichloromethane (DCM), tetrahydrofuran (THF), ethyl acetate, acetone, dimethylformamide (DMF), or acetonitrile (MeCN).

[0148] In some embodiments, the solubilizing agent can be an agent that has a positive charge, which may be suitable for disrupting ionic interactions of a net negatively charged molecule. Exemplary positively charged solubilizing agents include N-methyl D-glucamine, choline, arginine, lysine, procaine, tromethamine (TRIS), spermine, N-methyl-morpholine, glucosamine, N,N-bis 2-hydroxyethyl glycine, diazabicycloundecene, creatine, arginine ethyl ester, amantadine, rimantadine, ornithine, taurine, and citrulline. Cationic moieties may additionally include sodium, potassium, calcium, magnesium, ammonium, monoethanolamine, diethanolamine, triethanolamine, tromethamine, lysine, histidine, arginine, morpholine, methylglucamine, and glucosamine.

[0149] In some embodiments, the solubilizing agent can be an agent that has a negative charge, which may be suitable for disrupting ionic interactions of a net positively charged molecule. Exemplary negatively charged solubilizing agents include acetate, propionate, butyrate, pentanoate, hexanoate, heptanoate, levulinate, chloride, bromide, iodide, citrate, succinate, maleate, glycolate, gluconate, glucuronate, 3-hydroxyisobutyrate, 2-hydroxyisobutyrate, lactate, malate, pyruvate, fumarate, tartrate, tartronate, nitrate, phosphate, benzene sulfonate, methane sulfonate, sulfate, sulfonate, acetic acid, adamantoic acid, alpha-ketoglutaric acid, D- or L-aspartic acid, benzenesulfonic acid, benzoic acid, 10-camphorsulfonic acid, citric acid, 1,2-ethanedisulfonic acid, fumaric acid, D-gluconic acid, D-glucuronic acid, glucaric acid, D- or L-glutamic acid, glutaric acid, glycolic acid, hippuric acid, hydrobromic acid, hydrochloric acid, 1-hydroxy-2-naphthoic acid, lactobionic acid, maleic acid, L-malic acid, mandelic acid, methanesulfonic acid, mucic acid, 1,5-naphthalenedisulfonic acid tetrahydrate, 2-naphthalenesulfonic acid, nitric acid, oleic acid, pamoic acid, phosphoric acid, p-toluenesulfonic acid hydrate, D-saccharic acid monopotassium salt, salicylic acid, stearic acid, succinic acid, sulfuric acid, tannic acid, and D- or L-tartaric acid.

[0150] The solubilizing agent is added to an inclusion body, or a fraction containing inclusion bodies, at a sufficient concentration to solubilize a polypeptide having biomass degrading activity from the inclusion body, for example, at a concentration of about 0.01-10M, about 0.05-10M, about 0.1-10M, about 0.2-10M, about 0.5-10M, about 1-10M, about 2-10M, about 5-10M, about 8-10M, about 0.01-6M, about 0.05-6M, about 0.1-6M, about 0.2-6M, about 0.5-6M, about 1-6M, about 2-6M, about 4-6M, or about 5-6M. In an embodiment, the solubilizing agent is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01M, about 0.02M, about 0.05M, about 0.1M, about 0.2M, about 0.5M, about 1M, about 2M, about 3M, about 4M, about 5M, about 6M, about 7M, about 8M, about 9M, or about 10M.

[0151] In a preferred embodiment, the solubilizing agent is urea, and is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01-10M, about 0.05-10M, about 0.1-10M, about 0.2-10M, about 0.5-10M, about 1-10M, about 2-10M, about 5-10M, about 8-10M, about 0.01-6M, about 0.05-6M, about 0.1-6M, about 0.2-6M, about 0.5-6M, about 1-6M, about 2-6M, about 4-6M, or about 5-6M. In an embodiment, urea is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of about 0.01M, about 0.02M, about 0.05M, about 0.1M, about 0.2M, about 0.5M, about 1M, about 2M, about 3M, about 4M, about 5M, about 6M, about 7M, about 8M, about 9M, or about 10M. In a preferred embodiment, the urea is added to an inclusion body, or a fraction containing inclusion bodies, at a concentration of 6M.
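The urea concentrations recited above translate into practical quantities by simple arithmetic. The following sketch is illustrative only: the function names and example volumes are assumptions, not part of the application. It computes the mass of urea needed for a given molarity and final volume using the molar mass of urea (CH4N2O, approximately 60.06 g/mol), and the volume of a concentrated stock needed for a dilution using C1V1 = C2V2.

```python
# Illustrative sketch: function names and example volumes are assumptions.
UREA_MOLAR_MASS_G_PER_MOL = 60.06  # CH4N2O


def urea_mass_g(molarity, final_volume_l):
    """Grams of urea to dissolve and bring to final_volume_l at the given molarity."""
    if molarity < 0 or final_volume_l < 0:
        raise ValueError("molarity and volume must be non-negative")
    return molarity * final_volume_l * UREA_MOLAR_MASS_G_PER_MOL


def stock_volume_ml(stock_molarity, target_molarity, final_volume_ml):
    """Volume of stock (mL) to dilute to final_volume_ml, from C1*V1 = C2*V2."""
    if target_molarity > stock_molarity:
        raise ValueError("target concentration exceeds stock concentration")
    return target_molarity * final_volume_ml / stock_molarity
```

For instance, a 6M urea solution with a final volume of 0.5 L corresponds to about 180 g of urea, and 100 mL of 6M urea can be prepared from 75 mL of an 8M stock. The mass refers to solute brought up to the final volume, not solute added to that volume of solvent.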

[0152] After solubilization using a solubilizing agent, the resulting mixture contains a solubilized polypeptide having biomass-degrading activity, as described herein. The resulting mixture can be used directly in an enzymatic process, such as a reaction for producing products, e.g., a saccharification reaction, as described in further detail in the section titled "Methods of Producing Products Using Solubilized Enzymes". In this embodiment, the mixture may contain other components of the inclusion body, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites).

[0153] In other embodiments, the resulting mixture containing a solubilized polypeptide having biomass-degrading activity is further processed to purify or isolate the solubilized polypeptide having biomass-degrading activity from the other solubilized components of the inclusion bodies. Suitable methods for isolating or purifying the solubilized polypeptide having biomass-degrading activity include affinity purification techniques. For example, the polypeptide having biomass-degrading activity preferably contains a tag or fusion peptide that can be utilized for affinity purification. In an embodiment, the polypeptide having biomass-degrading activity contains a His-tag, e.g., an 8.times. His tag (SEQ ID NO: 21), and the solubilized polypeptide having biomass-degrading activity can be purified using nickel affinity chromatography, e.g., an immobilized metal ion affinity chromatography (IMAC) system. In one embodiment, all purification steps occur in the presence of the solubilizing agent used to solubilize the polypeptide from an inclusion body, including in the washing and elution steps of the purification process. Accordingly, in one embodiment, the resulting purified solubilized polypeptide having biomass-degrading activity also contains a solubilizing agent.

[0154] The present invention provides a mixture comprising a polypeptide having biomass-degrading activity and a solubilizing agent. The mixture is obtained through the solubilization of inclusion bodies, as described above. The resulting mixture may also further comprise one or more proteins associated with the inclusion bodies. The solubilized polypeptide having biomass-degrading activity may also be purified by affinity purification techniques. In this case, the resulting mixture does not comprise one or more proteins associated with the inclusion bodies. The mixture can comprise other components found in inclusion bodies, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites).

[0155] In an embodiment, the solubilized polypeptide having biomass-degrading activity can be partially unfolded, partially misfolded, or partially denatured. In an embodiment, the solubilized polypeptide having biomass-degrading activity has at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 8-10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 100% biomass-degrading activity compared to the native polypeptide having biomass-degrading activity. In an embodiment, the solubilized polypeptide having biomass-degrading activity has about 1-10%, 1-20%, 1-30%, 1-40%, 1-50%, 1-60%, 1-70%, 1-80%, 1-90%, 1-100%, 10-20%, 10-30%, 10-40%, 10-50%, 10-60%, 10-70%, 10-80%, 10-90%, 10-100%, 20-30%, 20-40%, 20-50%, 20-60%, 20-70%, 20-80%, 20-90%, 20-100%, 30-40%, 30-50%, 30-60%, 30-70%, 30-80%, 30-90%, 30-100%, 40-50%, 40-60%, 40-70%, 40-80%, 40-90%, 40-100%, 50-60%, 50-70%, 50-80%, 50-90%, 50-100%, 60-70%, 60-80%, 60-90%, 60-100%, 70-80%, 70-90%, 70-100%, 80-90%, 80-100%, or 90-100% of the biomass-degrading activity compared to the native polypeptide having biomass-degrading activity. In a preferred embodiment, the solubilized polypeptide having biomass-degrading activity has at least 8-10% of the activity of the native polypeptide. 
The native polypeptide having biomass-degrading activity refers to, e.g., the corresponding polypeptide having biomass-degrading activity isolated from the soluble fraction, the corresponding polypeptide having biomass-degrading activity that is properly folded in its native form (thereby having 100% biomass-degrading activity), or the corresponding polypeptide having biomass-degrading activity endogenously expressed from the microorganism from which the polypeptide originates. Biomass-degrading activity can be determined by any of the assays described herein, e.g., a ligninase activity assay, an endoglucanase activity assay, a cellobiohydrolase activity assay, a cellobiase activity assay, or a xylanase activity assay.

[0156] In one aspect, the mixture comprises a polypeptide having cellobiase activity, e.g., a Cel3a or a functional variant thereof from T. reesei, e.g., a polypeptide with at least 90% identity to SEQ ID NO: 1, and a solubilizing agent, e.g., urea. In one embodiment, the mixture comprises a polypeptide having at least 90% identity to SEQ ID NO: 1 and a solubilizing agent, e.g., urea, wherein the polypeptide has at least 20% of the cellobiase activity compared to the native polypeptide, e.g., SEQ ID NO: 1 or Cel3a from T. reesei. The mixture is obtained through the solubilization of inclusion bodies, as described above. The resulting mixture may also further comprise one or more proteins associated with the inclusion bodies. The solubilized polypeptide having cellobiase activity, e.g., a polypeptide with at least 90% identity to SEQ ID NO: 1, may also be purified by affinity purification techniques. In this case, the resulting mixture does not comprise one or more proteins associated with the inclusion bodies. The mixture can comprise other components found in inclusion bodies, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). In an embodiment, the solubilizing agent is urea, and the urea is present at 0.2-6M.

[0157] In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, can be partially unfolded, partially misfolded, or partially denatured. In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, has at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, at least 10%, at least 8-10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or at least 100% cellobiase activity compared to the native polypeptide. In an embodiment, the solubilized polypeptide having cellobiase activity, or at least 90% identity with SEQ ID NO: 1, has about 1-10%, 1-20%, 1-30%, 1-40%, 1-50%, 1-60%, 1-70%, 1-80%, 1-90%, 1-100%, 10-20%, 10-30%, 10-40%, 10-50%, 10-60%, 10-70%, 10-80%, 10-90%, 10-100%, 20-30%, 20-40%, 20-50%, 20-60%, 20-70%, 20-80%, 20-90%, 20-100%, 30-40%, 30-50%, 30-60%, 30-70%, 30-80%, 30-90%, 30-100%, 40-50%, 40-60%, 40-70%, 40-80%, 40-90%, 40-100%, 50-60%, 50-70%, 50-80%, 50-90%, 50-100%, 60-70%, 60-80%, 60-90%, 60-100%, 70-80%, 70-90%, 70-100%, 80-90%, 80-100%, or 90-100% of the cellobiase activity compared to the native polypeptide. The native polypeptide is, for example, SEQ ID NO: 1 that is properly folded (e.g., 100% folded), Cel3a that is isolated from T. reesei, or a functional variant thereof. Cellobiase activity can be measured using the assays described herein, and can be quantified as the concentration of glucose (g/L) released after 30 minutes or the % of cellobiose converted to glucose in 30 minutes.
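The two quantities named above, percent of cellobiose converted to glucose and activity relative to the native polypeptide, can be computed as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, both enzymes are assumed to be assayed under identical conditions, and the conversion uses the stoichiometry of one cellobiose (approximately 342.30 g/mol) yielding two glucose (approximately 180.16 g/mol each).

```python
# Illustrative sketch: function names are assumptions, not part of the application.
GLUCOSE_G_PER_MOL = 180.156
CELLOBIOSE_G_PER_MOL = 342.297


def percent_cellobiose_converted(glucose_g_per_l, initial_cellobiose_g_per_l):
    """Percent of the starting cellobiose hydrolyzed, from glucose released (g/L)."""
    if initial_cellobiose_g_per_l <= 0:
        raise ValueError("initial cellobiose concentration must be positive")
    # One mole of cellobiose yields two moles of glucose on complete hydrolysis.
    max_glucose = initial_cellobiose_g_per_l * 2 * GLUCOSE_G_PER_MOL / CELLOBIOSE_G_PER_MOL
    return 100.0 * glucose_g_per_l / max_glucose


def relative_activity_percent(solubilized_activity, native_activity):
    """Activity of the solubilized polypeptide as a percent of the native polypeptide."""
    if native_activity <= 0:
        raise ValueError("native activity must be positive")
    return 100.0 * solubilized_activity / native_activity
```

As a usage example with made-up numbers, if the solubilized polypeptide releases 1.2 g/L glucose in 30 minutes and the native polypeptide releases 5.0 g/L under the same conditions, `relative_activity_percent(1.2, 5.0)` gives 24%, which would satisfy an "at least 20%" threshold.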

Methods for Producing Aglycosylated Polypeptides

[0158] The present invention further provides methods for producing an aglycosylated polypeptide having biomass-degrading activity in a host cell, wherein the host cell, or lysate thereof, is treated with a solubilizing agent at a concentration suitable for solubilizing the aglycosylated polypeptide, as described herein. The method comprises culturing the host cell expressing the polypeptide having biomass-degrading activity under conditions suitable for the expression of the polypeptide. The method may also comprise recovering the aglycosylated polypeptide having biomass-degrading activity from the host cell. In the methods described in further detail below, the polypeptide having biomass-degrading activity has, e.g., ligninase activity, endoglucanase activity, cellobiohydrolase activity, cellobiase activity, or xylanase activity. In an embodiment, the polypeptide having cellobiase activity comprises a Cel3a from T. reesei, or a functional fragment thereof. In another embodiment, the polypeptide having cellobiase activity comprises SEQ ID NO: 1.

Using a Host Cell Deficient for Glycosylation

[0159] In embodiments, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein operably linked to a fusion tag is introduced to and expressed in a cell that does not significantly glycosylate proteins expressed in the cell, e.g., a bacterial host cell. The recombinant host cell is cultured under conditions for expression of the polypeptide, resulting in the production of an aglycosylated polypeptide having biomass-degrading activity. The aglycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein.

[0160] For example, in this embodiment, the expression vector contains a lac operator and a T7 promoter upstream of the nucleic acid sequence encoding a polypeptide having biomass-degrading activity, and the host cell has the capacity to express T7 RNA polymerase. Expression of the polypeptide having biomass-degrading activity is induced by addition of IPTG. Preferably, the host cell is an E. coli cell, preferably an E. coli Origami cell. In this embodiment, the fusion tag is a His-tag, and the purification of the expressed aglycosylated polypeptide comprises nickel affinity chromatography.

Using a Host Cell with the Capacity for Glycosylation

[0161] In another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide comprising one or more glycosylation site mutations such that the polypeptide is not glycosylated, as described herein, is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell, e.g., a yeast or fungal host cell. Alternatively, the host cell is not capable of glycosylating proteins expressed within the cell, e.g., a bacterial host cell. In this embodiment, the polypeptide is operably linked to a fusion tag. The aglycosylated polypeptide can be purified or isolated from the bacterial host cell using affinity chromatography methods for the fusion tag as described herein.

[0162] In yet another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell. The cells are cultured under conditions sufficient for expression and glycosylation of the polypeptide. In this embodiment, the polypeptide is operably linked to a fusion tag. The glycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein. After purification from the host cells and other endogenous host enzymes, e.g., glycosylation enzymes, the glycans of the isolated glycosylated polypeptide can be removed by incubation with deglycosylating enzymes. Deglycosylating enzymes include PNGase F, PNGase A, EndoH (endoglycosidase H), EndoS (endoglycosidase S), EndoD (endoglycosidase D), EndoF (endoglycosidase F), EndoF1 (endoglycosidase F1), or EndoF2 (endoglycosidase F2). Protein deglycosylation mixes containing enzymes sufficient for the complete removal of glycans are commercially available, e.g., from New England Biolabs. The isolated polypeptide is incubated with one or more deglycosylating enzymes under conditions sufficient for the removal of all of the glycans from the polypeptide. Other methods are known in the art for removing glycans from a polypeptide, e.g., β-elimination with mild alkali or mild hydrazinolysis. The glycosylation state of the polypeptide can be assessed using methods for staining and visualization of glycans known in the art, or mass spectrometry.

[0163] In yet another embodiment, an expression vector comprising a nucleic acid sequence encoding a polypeptide having biomass-degrading activity described herein is expressed in a host cell, wherein the host cell is capable of glycosylating proteins expressed within the cell. The cells are cultured under conditions sufficient for expression of the polypeptide, but in the presence of glycosylation inhibitors. The glycosylation inhibitors are present at a concentration and for a sufficient time such that the expressed polypeptides are aglycosylated. In this embodiment, the polypeptide is operably linked to a fusion tag. The resulting aglycosylated polypeptide can be purified or isolated from the host cell using affinity chromatography methods for the fusion tag as described herein.

[0164] Examples of suitable glycosylation inhibitors for use in this embodiment include tunicamycin, Benzyl-GalNAc (Benzyl 2-acetamido-2-deoxy-α-D-galactopyranoside), 2-Fluoro-2-deoxy-D-glucose, and 5'CDP (5' cytidylate diphosphate). In some embodiments, a combination of glycosylation inhibitors is used. Preferably, the concentration of glycosylation inhibitors used in this embodiment is sufficient to inhibit glycosylation of the polypeptide, but does not cause cytotoxicity or inhibition of protein expression of the host cell.

Methods of Converting Biomass into Products

[0165] The present invention provides methods and compositions for converting or processing a biomass into products, using an aglycosylated polypeptide having cellobiase activity, as described herein. Methods for converting a biomass to products, such as sugar products, are known in the art, for example, as described in US Patent Application 2014/0011258, the contents of which are incorporated by reference in their entirety. Briefly, a biomass is optimally pretreated, e.g., to reduce the recalcitrance, and saccharified by a saccharification process that involves incubating the treated biomass with biomass-degrading, or cellulolytic, enzymes to produce sugars (e.g., glucose and/or xylose). The sugar products can then be further processed to produce a final product, e.g., by fermentation or distillation. Final products include alcohols (e.g., ethanol, isobutanol, or n-butanol), sugar alcohols (e.g., erythritol, xylitol, or sorbitol), or organic acids (e.g., lactic acid, pyruvic acid, succinic acid).

[0166] Using the processes described herein, the biomass material can be converted to one or more products, such as energy, fuels, foods and materials. Specific examples of products include, but are not limited to, hydrogen, sugars (e.g., glucose, xylose, arabinose, mannose, galactose, fructose, cellobiose, disaccharides, oligosaccharides and polysaccharides), alcohols (e.g., monohydric alcohols or dihydric alcohols, such as ethanol, n-propanol, isobutanol, sec-butanol, tert-butanol or n-butanol), hydrated or hydrous alcohols (e.g., containing greater than 10%, 20%, 30% or even greater than 40% water), biodiesel, organic acids, hydrocarbons (e.g., methane, ethane, propane, isobutene, pentane, n-hexane, biodiesel, bio-gasoline and mixtures thereof), co-products (e.g., proteins, such as cellulolytic proteins (enzymes) or single cell proteins), and mixtures of any of these in any combination or relative concentration, and optionally in combination with any additives (e.g., fuel additives). Other examples include carboxylic acids, salts of a carboxylic acid, a mixture of carboxylic acids and salts of carboxylic acids and esters of carboxylic acids (e.g., methyl, ethyl and n-propyl esters), ketones (e.g., acetone), aldehydes (e.g., acetaldehyde), alpha and beta unsaturated acids (e.g., acrylic acid) and olefins (e.g., ethylene). Other alcohols and alcohol derivatives include propanol, propylene glycol, 1,4-butanediol, 1,3-propanediol, sugar alcohols and polyols (e.g., glycol, glycerol, erythritol, threitol, arabitol, xylitol, ribitol, mannitol, sorbitol, galactitol, iditol, inositol, volemitol, isomalt, maltitol, lactitol, maltotriitol, maltotetraitol, and polyglycitol and other polyols), and methyl or ethyl esters of any of these alcohols. Other products include methyl acrylate, methylmethacrylate, lactic acid, citric acid, formic acid, acetic acid, propionic acid, butyric acid, succinic acid, valeric acid, caproic acid, 3-hydroxypropionic acid, palmitic acid, stearic acid, oxalic acid, malonic acid, glutaric acid, oleic acid, linoleic acid, glycolic acid, gamma-hydroxybutyric acid, and mixtures thereof, salts of any of these acids, mixtures of any of the acids and their respective salts.

Biomass

[0167] The biomass to be processed using the methods described herein is a starchy material and/or a cellulosic material comprising cellulose, e.g., a lignocellulosic material. The biomass may also comprise hemicellulose and/or lignin. The biomass can comprise one or more of an agricultural product or waste, a paper product or waste, a forestry product, or a general waste, or any combination thereof. An agricultural product or waste comprises material that can be cultivated, harvested, or processed for use or consumption, e.g., by humans or animals, or any intermediate, byproduct, or waste that is generated from the cultivation, harvest, or processing methods. Agricultural products or waste include, but are not limited to, sugar cane, jute, hemp, flax, bamboo, sisal, alfalfa, hay, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, canola straw, wheat straw, barley straw, oat straw, rice straw, corn cobs, corn stover, corn fiber, coconut hair, beet pulp, bagasse, soybean stover, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, or beeswing, or a combination thereof. A paper product or waste comprises material that is used to make a paper product, any paper product, or any intermediate, byproduct or waste that is generated from making or breaking down the paper product. Paper products or waste include, but are not limited to, paper, pigmented papers, loaded papers, coated papers, corrugated paper, filled papers, magazines, printed matter, printer paper, polycoated paper, cardstock, cardboard, paperboard, or paper pulp, or a combination thereof. A forestry product or waste comprises material that is produced by cultivating, harvesting, or processing of wood, or any intermediate, byproduct, or waste that is generated from the cultivation, harvest, or processing of the wood. Forestry products or waste include, but are not limited to, aspen wood, wood from any genus or species of tree, particle board, wood chips, or sawdust, or a combination thereof. A general waste includes, but is not limited to, manure, sewage, or offal, or a combination thereof.

[0168] The biomass may include, but is not limited to, starchy materials, sugar cane, agricultural waste, paper, paper products, paper waste, paper pulp, pigmented papers, loaded papers, coated papers, filled papers, magazines, printed matter, printer paper, polycoated paper, card stock, cardboard, paperboard, cotton, wood, particle board, forestry wastes, sawdust, aspen wood, wood chips, grasses, switchgrass, miscanthus, cord grass, reed canary grass, grain residues, rice hulls, oat hulls, wheat chaff, barley hulls, agricultural waste, silage, canola straw, wheat straw, barley straw, oat straw, rice straw, jute, hemp, flax, bamboo, sisal, abaca, corn cobs, corn stover, soybean stover, corn fiber, alfalfa, hay, coconut hair, sugar processing residues, bagasse, beet pulp, agave bagasse, algae, seaweed, plankton, manure, sewage, offal, agricultural or industrial waste, arracacha, buckwheat, banana, barley, cassava, kudzu, oca, sago, sorghum, potato, sweet potato, taro, yams, beans, favas, lentils, peas, or mixtures of any of these. In a preferred embodiment, the biomass comprises agriculture waste, such as corn cobs, e.g., corn stover. In another embodiment, the biomass comprises grasses.

[0169] In one embodiment, the biomass is treated prior to contact with the compositions described herein. For example, the biomass is treated to reduce the recalcitrance of the biomass, to reduce its bulk density, and/or to increase its surface area. Suitable biomass treatment processes may include, but are not limited to: bombardment with electrons, sonication, oxidation, pyrolysis, steam explosion, chemical treatment, mechanical treatment, and freeze grinding. Preferably, the treatment method is bombardment with electrons.

[0170] In some embodiments, electron bombardment is performed until the biomass receives a total dose of at least 0.5 Mrad, e.g., at least 5, 10, 20, 30, or at least 40 Mrad. In some embodiments, the treatment is performed until the biomass receives a dose of from about 0.5 Mrad to about 150 Mrad, about 1 Mrad to about 100 Mrad, about 5 Mrad to about 75 Mrad, about 2 Mrad to about 75 Mrad, about 10 Mrad to about 50 Mrad, e.g., about 5 Mrad to about 50 Mrad, about 20 Mrad to about 40 Mrad, about 10 Mrad to about 35 Mrad, or from about 20 Mrad to about 30 Mrad. In some implementations, a total dose of 25 to 35 Mrad is preferred, applied ideally over a couple of seconds, e.g., at 5 Mrad/pass with each pass being applied for about one second. Applying a dose of greater than 7 to 9 Mrad/pass can in some cases cause thermal degradation of the feedstock material.
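The relationship between total dose and dose per pass can be sketched as simple arithmetic. The following is a minimal, non-authoritative example: the function names are hypothetical, and treating 7 Mrad/pass (the lower end of the "7 to 9 Mrad/pass" range stated above) as a caution threshold is an assumption made for illustration.

```python
import math

# Assumption: the lower end of the "greater than 7 to 9 Mrad/pass" range is
# used as a conservative caution threshold.
CAUTION_MRAD_PER_PASS = 7.0


def passes_required(total_dose_mrad, dose_per_pass_mrad):
    """Smallest whole number of passes that delivers at least the total dose."""
    if total_dose_mrad <= 0 or dose_per_pass_mrad <= 0:
        raise ValueError("doses must be positive")
    return math.ceil(total_dose_mrad / dose_per_pass_mrad)


def exceeds_caution_threshold(dose_per_pass_mrad):
    """True if the per-pass dose is above the assumed caution threshold."""
    return dose_per_pass_mrad > CAUTION_MRAD_PER_PASS
```

At 5 Mrad/pass, the preferred total dose of 25 to 35 Mrad corresponds to 5 to 7 passes, and the per-pass dose stays below the assumed caution threshold.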

[0171] The biomass material (e.g., plant biomass, animal biomass, paper, and municipal waste biomass) can be used as feedstock to produce useful intermediates and products such as organic acids, salts of organic acids, anhydrides, esters of organic acids and fuels, e.g., fuels for internal combustion engines or feedstocks for fuel cells. Systems and processes are described herein that can use as feedstock cellulosic and/or lignocellulosic materials that are readily available, but often can be difficult to process, e.g., municipal waste streams and waste paper streams, such as streams that include newspaper, kraft paper, corrugated paper or mixtures of these.

[0172] In order to convert the feedstock to a form that can be readily processed, the glucan- or xylan-containing cellulose in the feedstock can be hydrolyzed to low molecular weight carbohydrates, such as sugars, by a saccharifying agent, e.g., an enzyme or acid, a process referred to as saccharification. The low molecular weight carbohydrates can then be used, for example, in an existing manufacturing plant, such as a single cell protein plant, an enzyme manufacturing plant, or a fuel plant, e.g., an ethanol manufacturing facility.

[0173] The feedstock can be hydrolyzed using an enzyme, e.g., by combining the materials and the enzyme in a solvent, e.g., in an aqueous solution. The enzymes can be made/induced according to the methods described herein.

[0174] Specifically, the enzymes can be supplied by organisms that are capable of breaking down biomass (such as the cellulose and/or the lignin portions of the biomass), or that contain or manufacture various cellulolytic enzymes (cellulases), ligninases or various small molecule biomass-degrading metabolites. These enzymes may be a complex of enzymes that act synergistically to degrade crystalline cellulose or the lignin portions of biomass. Examples of cellulolytic enzymes include: endoglucanases, cellobiohydrolases, and cellobiases (beta-glucosidases).

[0175] During saccharification a cellulosic substrate can be initially hydrolyzed by endoglucanases at random locations producing oligomeric intermediates. These intermediates are then substrates for exo-splitting glucanases such as cellobiohydrolase to produce cellobiose from the ends of the cellulose polymer. Cellobiose is a water-soluble 1,4-linked dimer of glucose. Finally, cellobiase cleaves cellobiose to yield glucose. The efficiency (e.g., time to hydrolyze and/or completeness of hydrolysis) of this process depends on the recalcitrance of the cellulosic material.

Saccharification

[0176] The reduced-recalcitrance biomass is treated with the biomass-degrading enzymes discussed above, generally by combining the reduced-recalcitrance biomass and the biomass-degrading enzymes in a fluid medium, e.g., an aqueous solution. In some cases, the feedstock is boiled, steeped, or cooked in hot water prior to saccharification, as described in U.S. Pat. App. Pub. 2012/0100577 A1 by Medoff and Masterman, published on Apr. 26, 2012, the entire contents of which are incorporated herein.

[0177] Provided herein are mixtures of enzymes that are capable of degrading the biomass, e.g., an enzyme mixture of biomass-degrading enzymes, for use in the saccharification process described herein.

[0178] The saccharification process can be partially or completely performed in a tank (e.g., a tank having a volume of at least 4000 L, 40,000 L, 500,000 L, 2,000,000 L, 4,000,000 L, or 6,000,000 L or more) in a manufacturing plant, and/or can be partially or completely performed in transit, e.g., in a rail car, tanker truck, or in a supertanker or the hold of a ship. The time required for complete saccharification will depend on the process conditions and the biomass material and enzyme used. If saccharification is performed in a manufacturing plant under controlled conditions, the cellulose may be substantially entirely converted to sugar, e.g., glucose in about 12-96 hours. If saccharification is performed partially or completely in transit, saccharification may take longer.

[0179] In a preferred embodiment, the saccharification reaction occurs at a pH optimal for the enzymatic reactions to occur, e.g., at the pH optimal for the activity of the biomass-degrading enzymes. Preferably, the pH of the saccharification reaction is at pH 4-4.5. In a preferred embodiment, the saccharification reaction occurs at a temperature optimal for the enzymatic reactions to occur, e.g., at the temperature optimal for the activity of the biomass-degrading enzymes. Preferably, the temperature of the saccharification reaction is at 42° C.-52° C.

[0180] It is generally preferred that the tank contents be mixed during saccharification, e.g., using jet mixing as described in International App. No. PCT/US2010/035331, filed May 18, 2010, which was published in English as WO 2010/135380 and designated the United States, the full disclosure of which is incorporated by reference herein.

[0181] The addition of surfactants can enhance the rate of saccharification. Examples of surfactants include non-ionic surfactants, such as Tween® 20 or Tween® 80 polyethylene glycol surfactants, ionic surfactants, or amphoteric surfactants.

[0182] It is generally preferred that the concentration of the sugar solution resulting from saccharification be relatively high, e.g., greater than 5%, 7.5%, 10%, 10.5%, or greater than 40%, or greater than 50, 60, 70, or even greater than 80% by weight. Water may be removed, e.g., by evaporation, to increase the concentration of the sugar solution. This reduces the volume to be shipped, and also inhibits microbial growth in the solution.
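The amount of water that must be removed to reach a target sugar concentration follows from a mass balance in which the sugar mass is conserved. The sketch below is illustrative only; the function name is hypothetical, concentrations are expressed as weight fractions, and it is assumed that only water leaves the solution during evaporation.

```python
def water_to_remove_kg(solution_mass_kg, initial_wt_fraction, target_wt_fraction):
    """Mass of water to evaporate so the sugar weight fraction reaches the target.

    Assumes sugar mass is conserved and only water is removed.
    """
    if solution_mass_kg <= 0:
        raise ValueError("solution mass must be positive")
    if not (0 < initial_wt_fraction < target_wt_fraction <= 1):
        raise ValueError("require 0 < initial fraction < target fraction <= 1")
    sugar_mass_kg = solution_mass_kg * initial_wt_fraction
    final_solution_mass_kg = sugar_mass_kg / target_wt_fraction
    return solution_mass_kg - final_solution_mass_kg
```

For example, under these assumptions, concentrating 1000 kg of a 10% by weight sugar solution to 40% by weight requires removing 750 kg of water, leaving 250 kg of solution to be shipped.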

[0183] Alternatively, sugar solutions of lower concentrations may be used, in which case it may be desirable to add an antimicrobial additive, e.g., a broad spectrum antibiotic, in a low concentration, e.g., 50 to 150 ppm. Other suitable antibiotics include amphotericin B, ampicillin, chloramphenicol, ciprofloxacin, gentamicin, hygromycin B, kanamycin, neomycin, penicillin, puromycin, and streptomycin. Antibiotics will inhibit growth of microorganisms during transport and storage, and can be used at appropriate concentrations, e.g., between 15 and 10,000 ppm by weight, e.g., between 25 and 500 ppm, or between 50 and 150 ppm. If desired, an antibiotic can be included even if the sugar concentration is relatively high. Alternatively, other additives with anti-microbial or preservative properties may be used. Preferably the antimicrobial additive(s) are food-grade.
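Because the concentrations above are given in ppm by weight, the mass of additive needed for a given batch is a direct proportion. The following minimal sketch uses a hypothetical function name and assumes that ppm by weight means grams of additive per 1,000,000 grams of solution.

```python
def additive_mass_g(solution_mass_kg, ppm_by_weight):
    """Grams of additive needed to dose a solution at the given ppm by weight."""
    if solution_mass_kg < 0 or ppm_by_weight < 0:
        raise ValueError("inputs must be non-negative")
    solution_mass_g = solution_mass_kg * 1000.0
    return solution_mass_g * ppm_by_weight / 1_000_000.0
```

Under this assumption, dosing 1000 kg of sugar solution at 50 to 150 ppm corresponds to 50 to 150 g of antimicrobial additive.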

[0184] A relatively high concentration solution can be obtained by limiting the amount of water added to the biomass material with the enzyme. The concentration can be controlled, e.g., by controlling how much saccharification takes place. For example, concentration can be increased by adding more biomass material to the solution. In order to keep the sugar that is being produced in solution, a surfactant can be added, e.g., one of those discussed above. Solubility can also be increased by increasing the temperature of the solution. For example, the solution can be maintained at a temperature of 40-50° C., 60-80° C., or even higher.

[0185] In the processes described herein, for example after saccharification, sugars (e.g., glucose and xylose) can be isolated. For example, sugars can be isolated by precipitation, crystallization, chromatography (e.g., simulated moving bed chromatography, high pressure chromatography), centrifugation, extraction, any other isolation method known in the art, and combinations thereof.

Mixtures for Use in Saccharification

[0186] In an aspect, the present invention features a mixture for use in a saccharification comprising a polypeptide having biomass-degrading activity that has been solubilized from an inclusion body, as described herein, one or more proteins associated with an inclusion body, and a solubilizing agent. In an embodiment, the mixture may contain other components of the inclusion body, such as other proteins endogenous to the host cell, ribosomal components, nucleic acids (e.g., RNA and/or DNA), and cellular debris (e.g., cell wall debris, lipids, metabolites). The polypeptide having biomass-degrading activity can be glycosylated or aglycosylated.

[0187] In one embodiment, the mixture comprises a polypeptide having cellobiase activity and urea. In an embodiment, the urea is present at 0.2-6M. In one embodiment, the mixture comprises a Cel3a from T. reesei, or a functional variant or a fragment thereof. In one embodiment, the mixture comprises a polypeptide comprising at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to a Cel3a from T. reesei. In another embodiment, the mixture comprises a polypeptide comprising at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1. The polypeptide having cellobiase activity can be glycosylated or aglycosylated.
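As a hedged illustration of what a percent identity threshold means in practice, the sketch below computes identity between two sequences that have already been aligned to the same length. The function name, the gap character, and the choice of denominator (alignment columns that are not gaps in both sequences) are assumptions made for illustration; this section does not specify how identity is calculated, and alignment tools differ in their conventions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns holding identical residues.

    Inputs must be pre-aligned strings of equal length. Columns that are
    gaps in both sequences are skipped; a gap opposite a residue counts
    as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared
```

With this convention, a single substitution across ten aligned residues gives 90% identity, which would satisfy an "at least 90%" threshold but not an "at least 91%" threshold.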

[0188] In embodiments, the mixture described herein further comprises at least one additional enzyme derived from a microorganism, wherein the additional enzyme has biomass or cellulose-based material-degrading activity. For example, the additional enzyme is a ligninase, an endoglucanase, a cellobiohydrolase, a xylanase, or a cellobiase. In an embodiment, the mixture further comprises one or more ligninase, one or more endoglucanase, one or more cellobiohydrolase, one or more xylanase, or one or more cellobiase. In embodiments, the additional biomass-degrading enzyme is glycosylated. In embodiments, the enzyme mixture further comprises at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or at least 20 or more additional biomass-degrading enzymes described herein. Typical primary amino acid sequences for several biomass-degrading enzymes are shown below.

[0189] For example, the mixture further comprises a mixture of additional biomass-degrading enzymes produced by a microorganism, e.g., a fungal cell, such as wild-type T. reesei, or a mutant thereof, e.g., T. reesei RUTC30. In an embodiment, the additional biomass-degrading enzymes are isolated from the microorganisms. In an embodiment, the mixture comprises one or more of the following biomass-degrading enzymes: B2AF03, CIP1, CIP2, Cel1a, Cel3a, Cel5a, Cel6a, Cel7a, Cel7b, Cel12a, Cel45a, Cel74a, paMan5a, paMan26a, or Swollenin, or any combination thereof. The additional biomass-degrading enzymes, e.g., listed above, can be endogenously expressed and isolated from the microorganism, e.g., fungal cell, from which the enzyme originates (listed below in Table 1). Alternatively, the additional biomass-degrading enzymes, e.g., listed above, can be heterologously expressed using similar methods of expression in a host cell described herein, and isolated from the host cells. In an embodiment, the heterologously expressed additional biomass-degrading enzymes are tagged with a His tag at the C or N terminus of the enzyme and are isolated using nickel affinity chromatography techniques known in the art. For example, the additional biomass-degrading enzymes are selected from Table 1 below.

TABLE 1 Examples of Additional Biomass-Degrading Enzymes

Protein	MW, kDa	no. AA's	th. pI	no. Cysteines	Organism
B2AF03	87.1	800	5.94	10	Podospora anserina
CIP1	32.9	316	4.93	8	Trichoderma reesei
CIP2	48.2	460	7.0	12	Trichoderma reesei
Cel1a	52.2	466	5.3	5	Trichoderma reesei
Cel3a	78.4	744	6.3	6	Trichoderma reesei
Cel5a	44.1	418	4.9	12	Trichoderma reesei
Cel6a	49.6	471	5.1	12	Trichoderma reesei
Cel7a	54.1	514	4.6	24	Trichoderma reesei
Cel7b	48.2	459	4.7	22	Trichoderma reesei
Cel12a	25.1	234	6.6	2	Trichoderma reesei
Cel45a	24.4	242	4.2	16	Trichoderma reesei
Cel74a	87.1	838	5.4	4	Trichoderma reesei
paMan5a	41.1	373	7.0	6	Podospora anserina
paMan26a	51.7	469	4.7	1	Podospora anserina
Swollenin	51.5	493	4.8	28	Trichoderma reesei

[0190] The amino acid sequences for the biomass-degrading enzymes listed in Table 1 are provided below.

TABLE-US-00005 B2AF03 (Podospora anserina) (SEQ ID NO: 6) MKSSVFWGASLTSAVVRAIDLPFQFYPNCVDDLLSTNQVCNTTLSPPERAAALVAALTPEEKLQNIVSK SLGAPRIGLPAYNWWSEALHGVAYAPGTQFWQGDGPFNSSTSFPMPLLMAATFDDELLEKIAEVIGIEG RAFGNAGFSGLDYWTPNVNPFKDPRWGRGSETPGEDVLLVKRYAAAMIKGLEGPVPEKERRVVATCKHY AANDFEDWNGATRHNFNAKISLQDMAEYYFMPFQQCVRDSRVGSIMCAYNAVNGVPSCASPYLLQTILR EHWNWTEHNNYITSDCEAVLDVSLNHKYAATNAEGTAISFEAGMDTSCEYEGSSDIPGAWSQGLLKEST VDRALLRLYEGIVRAGYFDGKQSLYSSLGWADVNKPSAQKLSLQAAVDGTVLLKNDGTLPLSDLLDKSR PKKVAMIGFWSDAKDKLRGGYSGTAAYLHTPAYAASQLGIPFSTASGPILHSDLASNQSWTDNAMAAAK DADYILYFGGIDTSAAGETKDRYDLDWPGAQLSLINLLTTLSKPLIVLQMGDQLDNTPLLSNPKINAIL WANWPGQDGGTAVMELVTGLKSPAGRLPVTQYPSNFTELVPMTDMALRPSAGNSQLGRTYRWYKTPVQA FGFGLHYTTFSPKFGKKFPAVIDVDEVLEGCDDKYLDTCPLPDLPVVVENRGNRTSDYVALAFVSAPGV GPGPWPIKTLGAFTRLRGVKGGEKREGGLKWNLGNLARHDEEGNTVVYPGKYEVSLDEPPKARLRFEIV RGGKGKGKVKGKGKAAQKGGVVLDRWPKPPKGQEPPAIERV C1P1 (Trichoderma reesei) (SEQ ID NO: 7) MVRRTALLALGALSTLSMAQISDDFESGWDQTKWPISAPDCNQGGTVSLDTTVAHSGSNSMKVVGGPNG YCGHIFFGTTQVPTGDVYVRAWIRLQTALGSNHVTFIIMPDTAQGGKHLRIGGQSQVLDYNRESDDATL PDLSPNGIASTVTLPTGAFQCFEYHLGTDGTIETWLNGSLIPGMTVGPGVDNPNDAGWTRASYIPEITG VNFGWEAYSGDVNTVWFDDISIASTRVGCGPGSPGGPGSSTTGRSSTSGPTSTSRPSTTIPPPTSRTTT ATGPTQTHYGQCGGIGYSGPTVCASGTTCQVLNPYYSQCL C1P2 (Trichoderma reesei) (SEQ ID NO: 8) MASRFFALLLLAIPIQAQSPVWGQCGGIGWSGPTTCVGGATCVSYNPYYSQCIPSTQASSSIASTTLVT SFTTTTATRTSASTPPASSTGAGGATCSALPGSITLRSNAKLNDLFTMFNGDKVTTKDKFSCRQAEMSE LIQRYELGTLPGRPSTLTASFSGNTLTINCGEAGKSISFTVTITYPSSGTAPYPAIIGYGGGSLPAPAG VAMINFNNDNIAAQVNTGSRGQGKFYDLYGSSHSAGAMTAWAWGVSRVIDALELVPGARIDTTKIGVTG CSRNGKGAMVAGAFEKRIVLTLPQESGAGGSACWRISDYLKSQGANIQTASEIIGEDPWFSTTFNSYVN QVPVLPFDHHSLAALIAPRGLFVIDNNIDWLGPQSCFGCMTAAHMAWQALGVSDHMGYSQIGAHAHCAF PSNQQSQLTAFVQKFLLGQSTNTAIFQSDFSANQSQWIDWTTPTLS Cel1a (Trichoderma reesei) (SEQ ID NO: 9) MLPKDFQWGFATAAYQIEGAVDQDGRGPSIWDTFCAQPGKIADGSSGVTACDSYNRTAEDIALLKSLGA KSYRFSISWSRIIPEGGRGDAVNQAGIDHYVKFVDDLLDAGITPFITLFHWDLPEGLHQRYGGLLNRTE FPLDFENYARVMFRALPKVRNWITFNEPLCSAIPGYGSGTFAPGRQSTSEPWTVGHNILVAHGRAVKAY 
RDDFKPASGDGQIGIVLNGDFTYPWDAADPADKEAAERRLEFFTAWFADPIYLGDYPASMRKQLGDRLP TFTPEERALVHGSNDFYGMNHYTSNYIRHRSSPASADDTVGNVDVLFTNKQGNCIGPETQSPWLRPCAA GFRDFLVWISKRYGYPPIYVTENGTSIKGESDLPKEKILEDDFRVKYYNEYIRAMVTAVELDGVNVKGY FAWSLMDNFEWADGYVTRFGVTYVDYENGQKRFPKKSAKSLKPLFDELIAAA Cel3a (Trichoderma reesei) (SEQ ID NO: 10) MRYRTAAALALATGPFARADSHSTSGASAEAVVPPAGTPWGTAYDKAKAALAKLNLQDKVGIVSGVGWN GGPCVGNTSPASKISYPSLCLQDGPLGVRYSTGSTAFTPGVQAASTWDVNLIRERGQFIGEEVKASGIH VILGPVAGPLGKTPQGGRNWEGFGVDPYLTGIAMGQTINGIQSVGVQATAKHYILNEQELNRETISSNP DDRTLHELYTWPFADAVQANVASVMCSYNKVNTTWACEDQYTLQTVLKDQLGFPGYVMTDWNAQHTTVQ SANSGLDMSMPGTDFNGNNRLWGPALTNAVNSNQVPTSRVDDMVTRILAAWYLTGQDQAGYPSFNISRN VQGNHKTNVRAIARDGIVLLKNDANILPLKKPASIAVVGSAAIIGNHARNSPSCNDKGCDDGALGMGWG SGAVNYPYFVAPYDAINTRASSQGTQVTLSNTDNTSSGASAARGKDVAIVFITADSGEGYITVEGNAGD RNNLDPWHNGNALVQAVAGANSNVIVVVHSVGAIILEQILALPQVKAVVWAGLPSQESGNALVDVLWGD VSPSGKLVYTIAKSPNDYNTRIVSGGSDSFSEGLFIDYKHFDDANITPRYEFGYGLSYTKFNYSRLSVL STAKSGPATGAVVPGGPSDLFQNVATVTVDIANSGQVTGAEVAQLYITYPSSAPRTPPKQLRGFAKLNL TPGQSGTATFNIRRRDLSYWDTASQKWVVPSGSFGISVGASSRDIRLTSTLSVA Cel5a (Trichoderma reesei) (SEQ ID NO: 11) MNKSVAPLLLAASILYGGAAAQQTVWGQCGGIGWSGPTNCAPGSACSTLNPYYAQCIPGATTITTSTRP PSGPTTTTRATSTSSSTPPTSSGVRFAGVNIAGFDFGCTTDGTCVTSKVYPPLKNFTGSNNYPDGIGQM QHFVNDDGMTIFRLPVGWQYLVNNNLGGNLDSTSISKYDQLVQGCLSLGAYCIVDIHNYARWNGGIIGQ GGPTNAQFTSLWSQLASKYASQSRVWFGIMNEPHDVNINTWAATVQEVVTAIRNAGATSQFISLPGNDW QSAGAFISDGSAAALSQVTNPDGSTTNLIFDVHKYLDSDNSGTHAECTTNNIDGAFSPLATWLRQNNRQ AILTETGGGNVQSCIQDMCQQIQYLNQNSDVYLGYVGWGAGSFDSTYVLTETPTGSGNSWTDTSLVSSC LARK Cel6a (Trichoderma reesei) (SEQ ID NO: 12) MIVGILTTLATLATLAASVPLEERQACSSVWGQCGGQNWSGPTCCASGSTCVYSNDYYSQCLPGAASSS SSTRAASTTSRVSPTTSRSSSATPPPGSTTTRVPPVGSGTATYSGNPFVGVTPWANAYYASEVSSLAIP SLTGAMATAAAAVAKVPSFMWLDTLDKTPLMEQTLADIRTANKNGGNYAGQFVVYDLPDRDCAALASNG EYSIADGGVAKYKNYIDTIRQIVVEYSDIRTLLVIEPDSLANLVTNLGTPKCANAQSAYLECINYAVTQ LNLPNVAMYLDAGHAGWLGWPANQDPAAQLFANVYKNASSPRALRGLATNVANYNGWNITSPPSYTQGN AVYNEKLYIHAIGPLLANHGWSNAFFITDQGRSGKQPTGQQQWGDWCNVIGTGFGIRPSANTGDSLLDS 
FVWVKPGGECDGTSDSSAPRFDSHCALPDALQPAPQAGAWFQAYFVQLLTNANPSFL Cel7a (Trichoderma reesei) (SEQ ID NO: 13) MYRKLAVISAFLATARAQSACTLQSETHPPLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNCYD GNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSGNSLSIGFVTQSAQKNVGARLYLMASDTTYQEFT LLGNEFSFDVDVSQLPCGLNGALYFVSMDADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVE GWEPSSNNANTGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICEGDGCGGTYSDNRYGGTCDPD GCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAINRYYVQNGVTFQQPNAELGSYSGNELND DYCTAEEAEFGGSSFSDKGGLTQFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSC STSSGVPAQVESQSPNAKVTFSNIKFGPIGSTGNPSGGNPPGGNPPGTTTTRRPATTTGSSPGPTQSHY GQCGGIGYSGPTVCASGTTCQVLNPYYSQCL Cel7b (Trichoderma reesei) (SEQ ID NO: 14) MAPSVTLPLTTAILAIARLVAAQQPGTSTPEVHPKLTTYKCTKSGGCVAQDTSVVLDWNYRWMHDANYN SCTVNGGVNTTLCPDEATCGKNCFIEGVDYAASGVTTSGSSLTMNQYMPSSSGGYSSVSPRLYLLDSDG EYVMLKLNGQELSFDVDLSALPCGENGSLYLSQMDENGGANQYNTAGANYGSGYCDAQCPVQTWRNGTL NTSHQGFCCNEMDILEGNSRANALTPHSCTATACDSAGCGFNPYGSGYKSYYGPGDTVDTSKTFTIITQ FNTDNGSPSGNLVSITRKYQQNGVDIPSAQPGGDTISSCPSASAYGGLATMGKALSSGMVLVFSIWNDN SQYMNWLDSGNAGPCSSTEGNPSNILANNPNTHVVFSNIRWGDIGSTTNSTAPPPPPASSTTFSTTRRS STTSSSPSCTQTHWGQCGGIGYSGCKTCTSGTTCQYSNDYYSQCL Cel12a (Trichoderma reesei) (SEQ ID NO: 15) MKFLQVLPALIPAALAQTSCDQWATFTGNGYTVSNNLWGASAGSGFGCVTAVSLSGGASWHADWQWSGG QNNVKSYQNSQIAIPQKRTVNSISSMPTTASWSYSGSNIRANVAYDLFTAANPNHVTYSGDYELMIWLG KYGDIGPIGSSQGTVNVGGQSWTLYYGYNGAMQVYSFVAQTNTTNYSGDVKNFFNYLRDNKGYNAAGQY VLSYQFGTEPFTGSGTLNVASWTASIN Cel45a (Trichoderma reesei) (SEQ ID NO: 16) MKATLVLGSLIVGAVSAYKATTTRYYDGQEGACGCGSSSGAFPWQLGIGNGVYTAAGSQALFDTAGASW CGAGCGKCYQLTSTGQAPCSSCGTGGAAGQSIIVMVTNLCPNNGNAQWCPVVGGTNQYGYSYHFDIMAQ NEIFGDNVVVDFEPIACPGQAASDWGTCLCVGQQETDPTPVLGNDTGSTPPGSSPPATSSSPPSGGGQQ TLYGQCGGAGWTGPTTCQAPGTCKVQNQWYSQCLP Cel74a (Trichoderma reesei) (SEQ ID NO: 17) MKVSRVLALVLGAVIPAHAAFSWKNVKLGGGGGFVPGIIFHPKTKGVAYARTDIGGLYRLNADDSWTAV TDGIADNAGWHNWGIDAVALDPQDDQKVYAAVGMYTNSWDPSNGAIIRSSDRGATWSFTNLPFKVGGNM PGRGAGERLAVDPANSNIIYFGARSGNGLWKSTDGGVTFSKVSSFTATGTYIPDPSDSNGYNSDKQGLM 
WVTFDSTSSTTGGATSRIFVGTADNITASVYVSTNAGSTWSAVPGQPGKYFPHKAKLQPAEKALYLTYS DGTGPYDGTLGSVWRYDIAGGTWKDITPVSGSDLYFGFGGLGLDLQKPGTLVVASLNSWWPDAQLFRST DSGTTWSPIWAWASYPTETYYYSISTPKAPWIKNNFIDVTSESPSDGLIKRLGWMIESLEIDPTDSNHW LYGTGMTIFGGHDLTNWDTRHNVSIQSLADGIEEFSVQDLASAPGGSELLAAVGDDNGFTFASRNDLGT SPQTVWATPTWATSTSVDYAGNSVKSVVRVGNTAGTQQVAISSDGGATWSIDYAADTSMNGGTVAYSAD GDTILWSTASSGVQRSQFQGSFASVSSLPAGAVIASDKKTNSVFYAGSGSTFYVSKDTGSSFTRGPKLG SAGTIRDIAAHPTTAGTLYVSTDVGIFRSTDSGTTFGQVSTALTNTYQIALGVGSGSNWNLYAFGTGPS GARLYASGDSGASWTDIQGSQGFGSIDSTKVAGSGSTAGQVYVGTNGRGVFYAQGTVGGGTGGTSSSTK QSSSSTSSASSSTTLRSSVVSTTRASTVTSSRTSSAAGPTGSGVAGHYAQCGGIGWTGPTQCVAPYVCQ KQNDYYYQCV paMan5a (Podospora anserina) (SEQ ID NO: 18) MKGLFAFGLGLLSLVNALPQAQGGGAAASAKVSGTRFVIDGKTGYFAGTNSYWIGFLTNNRDVDTTLDH IASSGLKILRVWGFNDVNNQPSGNTVWFQRLASSGSQINTGPNGLQRLDYLVRSAETRGIKLIIALVNY WDDFGGMKAYVNAFGGTKESWYTNARAQEQYKRYIQAVVSRYVNSPAIFAWELANEPRCKGCNTNVIFN WATQISDYIRSLDKDHLITLGDEGFGLPGQTTYPYQYGEGTDFVKNLQIKNLDFGTFHMYPGHWGVPTS FGPGWIKDHAAACRAAGKPCLLEEYGYESDRCNVQKGWQQASRELSRDGMSGDLFWQWGDQLSTGQTHN DGFTIYYGSSLATCLVTDHVRAINALPA paMan26a (Podospora anserina) (SEQ ID NO: 19) MVKLLDIGLFALALASSAVAKPCKPRDGPVTYEAEDAILTGTTVDTAQVGYTGRGYVTGFDEGSDKITF QISSATTKLYDLSIRYAAIYGDKRTNVVLNNGAVSEVFFPAGDSFTSVAAGQVLLNAGQNTIDIVNNWG WYLIDSITLTPSAPRPPHDINPNLNNPNADTNAKKLYSYLRSVYGNKIISGQQELHHAEWIRQQTGKTP ALVAVDLMDYSPSRVERGTTSHAVEDAIAHHNAGGIVSVLWHWNAPVGLYDTEENKWWSGFYTRATDFD IAATLANPQGANYTLLIRDIDAIAVQLKRLEAAGVPVLWRPLHEAEGGWFWWGAKGPEPAKQLWDILYE RLTVHHGLDNLIWVWNSILEDWYPGDDTVDILSADVYAQGNGPMSTQYNELIALGRDKKMIAAAEVGAA PLPGLLQAYQANWLWFAVWGDDFINNPSWNTVAVLNEIYNSDYVLTLDEIQGWRS Swollenin (Trichoderma reesei) (SEQ ID NO: 20) MAGKLILVALASLVSLSIQQNCAALFGQCGGIGWSGTTCCVAGAQCSFVNDWYSQCLASTGGNPPNGTT SSSLVSRTSSASSSVGSSSPGGNSPTGSASTYTTTDTATVAPHSQSPYPSIAASSCGSWTLVDNVCCPS YCANDDTSESCSGCGTCTTPPSADCKSGTMYPEVHHVSSNESWHYSRSTHFGLTSGGACGFGLYGLCTK GSVTASWTDPMLGATCDAFCTAYPLLCKDPTGTTLRGNFAAPNGDYYTQFWSSLPGALDNYLSCGECIE LIQTKPDGTDYAVGEAGYTDPITLEIVDSCPCSANSKWCCGPGADHCGEIDFKYGCPLPADSIHLDLSD

IAMGRLQGNGSLTNGVIPTRYRRVQCPKVGNAYIWLRNGGGPYYFALTAVNTNGPGSVTKIEIKGADTD NWVALVHDPNYTSSRPQERYGSWVIPQGSGPFNLPVGIRLTSPTGEQIVNEQAIKTFTPPATGDPNFYY IDIGVQFSQN

[0191] Other examples of suitable biomass-degrading enzymes for use in the enzyme mixture of the present invention include the enzymes from species in the genera Bacillus, Coprinus, Myceliophthora, Cephalosporium, Scytalidium, Penicillium, Aspergillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, Chrysosporium and Trichoderma, especially those produced by a strain selected from the species Aspergillus (see, e.g., EP Pub. No. 0 458 162), Humicola insolens (reclassified as Scytalidium thermophilum, see, e.g., U.S. Pat. No. 4,435,307), Coprinus cinereus, Fusarium oxysporum, Myceliophthora thermophila, Meripilus giganteus, Thielavia terrestris, Acremonium sp. (including, but not limited to, A. persicinum, A. acremonium, A. brachypenium, A. dichromosporum, A. obclavatum, A. pinkertoniae, A. roseogriseum, A. incoloratum, and A. furatum). Preferred strains include Humicola insolens DSM 1800, Fusarium oxysporum DSM 2672, Myceliophthora thermophila CBS 117.65, Cephalosporium sp. RYM-202, Acremonium sp. CBS 478.94, Acremonium sp. CBS 265.95, Acremonium persicinum CBS 169.65, Acremonium acremonium AHU 9519, Cephalosporium sp. CBS 535.71, Acremonium brachypenium CBS 866.73, Acremonium dichromosporum CBS 683.73, Acremonium obclavatum CBS 311.74, Acremonium pinkertoniae CBS 157.70, Acremonium roseogriseum CBS 134.56, Acremonium incoloratum CBS 146.62, and Acremonium furatum CBS 299.70H. Biomass-degrading enzymes may also be obtained from Chrysosporium, preferably a strain of Chrysosporium lucknowense. Additional strains that can be used include, but are not limited to, Trichoderma (particularly T. viride, T. reesei, and T. koningii), alkalophilic Bacillus (see, for example, U.S. Pat. No. 3,844,890 and EP Pub. No. 0 458 162), and Streptomyces (see, e.g., EP Pub. No. 0 458 162).

[0192] In embodiments, the microorganism is induced to produce the biomass-degrading enzymes described herein under conditions suitable for increasing production of biomass-degrading enzymes compared to an uninduced microorganism. For example, an induction biomass sample comprising biomass as described herein is incubated with the microorganism to increase production of the biomass-degrading enzymes. Further description of the induction process can be found in US 2014/0011258, the contents of which are hereby incorporated by reference in their entirety.

[0193] The biomass-degrading enzymes produced and/or secreted by the aforementioned microorganisms can be isolated and added to the mixture of the present invention, or directly to the saccharification reaction. Alternatively, in one embodiment, the aforementioned microorganisms or host cells expressing the biomass-degrading enzymes described herein and above are not lysed before addition to the saccharification reaction.

[0194] In an embodiment, an enzyme mixture comprising the host cell expressing one or more additional biomass-degrading enzymes as described herein can be used with the mixture comprising the solubilized polypeptide having biomass-degrading activity described herein.

[0195] Use of the mixture described herein comprising a polypeptide having biomass-degrading activity solubilized from inclusion bodies and a solubilizing agent does not inhibit, prevent or decrease the yield of sugar products from saccharification compared to saccharification without addition of the solubilized polypeptide. In some embodiments, the yield of sugar products increases upon use of the mixture described herein comprising a polypeptide having biomass-degrading activity solubilized from inclusion bodies and a solubilizing agent. For example, the yield of sugar products increases by at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 100% compared to when the standard mixture of biomass-degrading enzymes is added to the saccharification without the mixture containing the solubilized polypeptide and solubilizing agent.

Further Processing

[0196] Further processing steps may be performed on the sugars produced by saccharification to produce alternative products. For example, the sugars can be hydrogenated, fermented, or treated with other chemicals to produce other products.

[0197] Glucose can be hydrogenated to sorbitol. Xylose can be hydrogenated to xylitol. Hydrogenation can be accomplished by use of a catalyst (e.g., Pt/gamma-Al₂O₃, Ru/C, Raney Nickel, or other catalysts known in the art) in combination with H₂ under high pressure (e.g., 10 to 12000 psi). The sorbitol and/or xylitol products can be isolated and purified using methods known in the art.

[0198] Sugar products from saccharification can also be fermented to produce alcohols, sugar alcohols, such as erythritol, or organic acids, e.g., lactic, glutamic or citric acids or amino acids.

[0199] Yeast and Zymomonas bacteria, for example, can be used for fermentation or conversion of sugar(s) to alcohol(s). Other microorganisms are discussed below. The optimum pH for fermentations is about pH 4 to 7. For example, the optimum pH for yeast is from about pH 4 to 5, while the optimum pH for Zymomonas is from about pH 5 to 6. Typical fermentation times are about 24 to 168 hours (e.g., 24 to 96 hrs) with temperatures in the range of 20 °C to 40 °C (e.g., 26 °C to 40 °C); thermophilic microorganisms, however, prefer higher temperatures.

[0200] In some embodiments, e.g., when anaerobic organisms are used, at least a portion of the fermentation is conducted in the absence of oxygen, e.g., under a blanket of an inert gas such as N₂, Ar, He, CO₂ or mixtures thereof. Additionally, the mixture may have a constant purge of an inert gas flowing through the tank during part of or all of the fermentation. In some cases, anaerobic conditions can be achieved or maintained by carbon dioxide production during the fermentation and no additional inert gas is needed.

[0201] In some embodiments, all or a portion of the fermentation process can be interrupted before the low molecular weight sugar is completely converted to a product (e.g., ethanol). The intermediate fermentation products include sugar and carbohydrates in high concentrations. The sugars and carbohydrates can be isolated via any means known in the art. These intermediate fermentation products can be used in preparation of food for human or animal consumption. Additionally or alternatively, the intermediate fermentation products can be ground to a fine particle size in a stainless-steel laboratory mill to produce a flour-like substance.

[0202] Jet mixing may be used during fermentation, and in some cases saccharification and fermentation are performed in the same tank.

[0203] Nutrients for the microorganisms may be added during saccharification and/or fermentation, for example the food-based nutrient packages described in U.S. Pat. App. Pub. 2012/0052536, filed Jul. 15, 2011, the complete disclosure of which is incorporated herein by reference.

[0204] "Fermentation" includes the methods and products that are disclosed in U.S. Prov. App. No. 61/579,559, filed Dec. 22, 2012, and U.S. Prov. App. No. 61/579,576, filed Dec. 22, 2012, the contents of both of which are incorporated by reference herein in their entirety.

[0205] Mobile fermenters can be utilized, as described in International App. No. PCT/US2007/074028 (which was filed Jul. 20, 2007, was published in English as WO 2008/011598 and designated the United States), the contents of which are incorporated herein in their entirety. Similarly, the saccharification equipment can be mobile. Further, saccharification and/or fermentation may be performed in part or entirely during transit.

[0206] The microorganism(s) used in fermentation can be naturally-occurring microorganisms and/or engineered microorganisms. For example, the microorganism can be a bacterium (including, but not limited to, e.g., a cellulolytic bacterium), a fungus (including, but not limited to, e.g., a yeast), a plant, a protist, e.g., a protozoan or a fungus-like protist (including, but not limited to, e.g., a slime mold), or an alga. When the organisms are compatible, mixtures of organisms can be utilized.

[0207] Suitable fermenting microorganisms have the ability to convert carbohydrates, such as glucose, fructose, xylose, arabinose, mannose, galactose, oligosaccharides or polysaccharides into fermentation products. Fermenting microorganisms include strains of the genus Saccharomyces spp. (including, but not limited to, S. cerevisiae (baker's yeast), S. diastaticus, S. uvarum), the genus Kluyveromyces (including, but not limited to, K. marxianus, K. fragilis), the genus Candida (including, but not limited to, C. pseudotropicalis, and C. brassicae), Pichia stipitis (a relative of Candida shehatae), the genus Clavispora (including, but not limited to, C. lusitaniae and C. opuntiae), the genus Pachysolen (including, but not limited to, P. tannophilus), and the genus Brettanomyces (including, but not limited to, e.g., B. clausenii (Philippidis, G. P., 1996, Cellulose bioconversion technology, in Handbook on Bioethanol: Production and Utilization, Wyman, C. E., ed., Taylor & Francis, Washington, D.C., 179-212)). Other suitable microorganisms include, for example, Zymomonas mobilis, Clostridium spp. (including, but not limited to, C. thermocellum (Philippidis, 1996, supra), C. saccharobutylacetonicum, C. saccharobutylicum, C. puniceum, C. beijerinckii, and C. acetobutylicum), Moniliella pollinis, Moniliella megachiliensis, Lactobacillus spp., Yarrowia lipolytica, Aureobasidium sp., Trichosporonoides sp., Trigonopsis variabilis, Trichosporon sp., Moniliella acetoabutans sp., Typhula variabilis, Candida magnoliae, Ustilaginomycetes sp., Pseudozyma tsukubaensis, yeast species of genera Zygosaccharomyces, Debaryomyces, Hansenula and Pichia, and fungi of the dematioid genus Torula.

[0208] For instance, Clostridium spp. can be used to produce ethanol, butanol, butyric acid, acetic acid, and acetone. Lactobacillus spp. can be used to produce lactic acid.

[0209] Many such microbial strains are publicly available, either commercially or through depositories such as the ATCC (American Type Culture Collection, Manassas, Va., USA), the NRRL (Agricultural Research Service Culture Collection, Peoria, Ill., USA), or the DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Braunschweig, Germany), to name a few.

[0210] Commercially available yeasts include, for example, Red Star®/Lesaffre Ethanol Red (available from Red Star/Lesaffre, USA), FALI® (available from Fleischmann's Yeast, a division of Burns Philip Food Inc., USA), SUPERSTART® (available from Alltech, now Lallemand), GERT STRAND® (available from Gert Strand AB, Sweden) and FERMOL® (available from DSM Specialties).

[0211] Many microorganisms that can be used to saccharify biomass material and produce sugars can also be used to ferment and convert those sugars to useful products.

[0212] After fermentation, the resulting fluids can be distilled using, for example, a "beer column" to separate ethanol and other alcohols from the majority of water and residual solids. The vapor exiting the beer column can be, e.g., 35% by weight ethanol and can be fed to a rectification column. A mixture of nearly azeotropic (92.5%) ethanol and water from the rectification column can be purified to pure (99.5%) ethanol using vapor-phase molecular sieves. The beer column bottoms can be sent to the first effect of a three-effect evaporator. The rectification column reflux condenser can provide heat for this first effect. After the first effect, solids can be separated using a centrifuge and dried in a rotary dryer. A portion (25%) of the centrifuge effluent can be recycled to fermentation and the rest sent to the second and third evaporator effects. Most of the evaporator condensate can be returned to the process as fairly clean condensate with a small portion split off to waste water treatment to prevent build-up of low-boiling compounds.
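The ethanol concentrations quoted in this paragraph can be related by a simple mass balance. The following Python sketch is illustrative only: the 1000 kg vapor basis, the assumption that no ethanol is lost between steps, the treatment of all non-ethanol mass as water, and the function name are hypothetical and are not taken from the process described.

```python
def stream_mass_for_purity(ethanol_kg: float, purity: float) -> float:
    """Total stream mass (kg) that carries ethanol_kg at the given ethanol mass fraction."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be a mass fraction in (0, 1]")
    return ethanol_kg / purity


# Hypothetical basis: 1000 kg of beer-column vapor at 35% by weight ethanol.
feed_kg = 1000.0
ethanol_kg = feed_kg * 0.35  # assumption: all ethanol is carried forward

# Stream leaving the rectification column (nearly azeotropic, 92.5% ethanol).
rectified_kg = stream_mass_for_purity(ethanol_kg, 0.925)

# Stream leaving the vapor-phase molecular sieves (99.5% ethanol).
product_kg = stream_mass_for_purity(ethanol_kg, 0.995)

# Non-ethanol mass (treated here as water) removed in each step.
removed_in_rectification_kg = feed_kg - rectified_kg
removed_by_sieves_kg = rectified_kg - product_kg

if __name__ == "__main__":
    print(f"rectified stream: {rectified_kg:.1f} kg")
    print(f"final product:    {product_kg:.1f} kg")
    print(f"removed in rectification: {removed_in_rectification_kg:.1f} kg")
    print(f"removed by sieves:        {removed_by_sieves_kg:.1f} kg")
```

Under these assumptions most of the water is removed in the rectification column, and the molecular sieves handle only the small remainder needed to pass the azeotrope.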

[0213] Other types of chemical transformation of the products from the processes described herein can be used, for example, production of organic sugar-derived products (e.g., furfural and furfural-derived products). Chemical transformations of sugar-derived products are described in U.S. Prov. App. No. 61/667,481, filed Jul. 3, 2012, the disclosure of which is incorporated herein by reference in its entirety.

EXAMPLES

[0214] The invention is further described in detail by reference to the following experimental examples. These examples are provided for purposes of illustration only, and are not intended to be limiting unless otherwise specified. Thus, the invention should in no way be construed as being limited to the following examples, but rather, should be construed to encompass any and all variations which become evident as a result of the teaching provided herein.

[0215] Without further description, it is believed that one of ordinary skill in the art can, using the preceding description and the following illustrative examples, make and utilize the compounds of the present invention and practice the claimed methods. The following working examples specifically point out various aspects of the present invention, and are not to be construed as limiting in any way the remainder of the disclosure.

Example 1: Expression of Cel3a-C'his in E. coli

[0216] The mature sequence for Cel3a (amino acids 20-744) was synthesized and codon-optimized for E. coli expression by Genewiz. The Cel3a-C'His referred to in the following examples refers to the codon-optimized mature sequence for Cel3a (aas 20-744) with an 8× His (SEQ ID NO: 21) tag at the C-terminus. The below primers were used to clone the Cel3a-C'His into pET-Duet (Novagen, Catalog No. 71146):

TABLE-US-00006
Forward (SEQ ID NO: 4)   5'-CATGCCATGGGCGATAGTCACAGTACCAGC
Reverse (SEQ ID NO: 5)   3'-CCCAAGCTTTCATTAGTGATGATGATGATGATGATGATGGCTGCCGCTGCCGGCAACACTCAGGGTGC

(NcoI and HindIII sites are underlined; start and stop codons are in bold; the polyhistidine (8-His (SEQ ID NO: 21)) tag and the glycine-serine (GSGS (SEQ ID NO: 23)) linker are italicized.) The amplification reaction was performed using PfuUltra II Fusion HS Polymerase (Agilent, Catalog No. 600672).
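Because the underlining, bold, and italic markup is not reproduced in plain text, the primer features can be located programmatically. The following Python sketch is a minimal illustration, assuming the standard recognition sequences for NcoI (CCATGG) and HindIII (AAGCTT); the function and variable names are hypothetical. It reads the reverse primer as printed and takes its reverse complement to recover the coding-strand order of the linker, tag, and stop codons.

```python
# Primer sequences as printed in the table above (read left to right).
FORWARD = "CATGCCATGGGCGATAGTCACAGTACCAGC"
REVERSE = ("CCCAAGCTTTCATTAGTGATGATGATGATGATGATGATGGCTGCCGC"
           "TGCCGGCAACACTCAGGGTGC")

NCOI_SITE = "CCATGG"     # NcoI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def describe_primers() -> dict:
    """Locate the cloning features in the two primers."""
    coding = reverse_complement(REVERSE)
    linker = "GGCAGCGGCAGC"      # GGC AGC GGC AGC encodes Gly-Ser-Gly-Ser
    his_tag = "CAT" * 7 + "CAC"  # eight histidine codons
    stops = "TAATGA"             # two consecutive stop codons
    cassette = linker + his_tag + stops + HINDIII_SITE
    ncoi_index = FORWARD.find(NCOI_SITE)
    return {
        "ncoi_index": ncoi_index,
        # The ATG start codon sits inside the NcoI site (CC-ATG-G).
        "start_codon": FORWARD[ncoi_index + 2:ncoi_index + 5],
        "hindiii_in_reverse": HINDIII_SITE in REVERSE,
        "cassette_in_coding_strand": cassette in coding,
    }


if __name__ == "__main__":
    for key, value in describe_primers().items():
        print(key, value)
```

The check confirms that, on the coding strand, the GSGS linker is followed directly by eight histidine codons, two stop codons, and the HindIII site.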

[0217] The amplified DNA was cloned by restriction digestion using NcoI restriction enzyme (New England Biolabs, R3193) and HindIII restriction enzyme (New England Biolabs, R3104) under conditions suggested by the manufacturer. The digested amplified DNA was ligated into the NcoI-HindIII sites in the pETDuet vector using T4 DNA ligase (New England Biolabs, M0202), followed by transformation of E. coli cloning host Top 10 One Shot (Invitrogen). Plasmid purification was carried out using Qiagen's plasmid purification kit.

[0218] The Cel3a-C'His constructs were transformed into the E. coli expression host Origami B (DE3) (EMD Millipore, Catalog No. 70837) and streaked on plates containing LB medium and 100 µg/ml ampicillin (Fisher Scientific, Catalog No. BP1760), 15 µg/ml kanamycin (Fisher Scientific, Catalog No. BP906) and 12.5 µg/ml tetracycline (Fisher Scientific, Catalog No. BP912). Colonies carrying the recombinant DNA were picked from plates for the inoculation of 2 ml starter cultures, and grown overnight at 37 °C, then subsequently used to inoculate 100 ml of LB media containing the appropriate antibiotics. Cultures were grown at 37 °C until OD600 reached 0.8.

[0219] To induce protein expression, 500 µM IPTG (isopropyl-β-D-thiogalactopyranoside; Fisher Scientific, Catalog No. BP1755) was added. The expression culture was grown for another 4 hours at 37 °C. The cells were harvested by centrifugation at 4200 rpm at room temperature (RT) for 30 minutes using the Sorvall ST16 with the TX-400 rotor.
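The volume of IPTG stock needed to reach the stated final concentration follows from the usual dilution relationship (C1·V1 = C2·V2). The Python sketch below is illustrative only; the 1 M stock concentration is an assumption that does not appear in this example, and the function name is hypothetical.

```python
def stock_volume_ul(final_um: float, culture_ml: float, stock_mm: float) -> float:
    """Microlitres of stock needed for a final concentration of final_um (µM)
    in culture_ml (mL) of culture, given a stock at stock_mm (mM).

    The small volume contributed by the stock itself is ignored.
    """
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    nmol_needed = final_um * culture_ml  # (µmol/L) x mL = nmol
    return nmol_needed / stock_mm        # 1 mM equals 1 nmol/µL


# 500 µM IPTG in a 100 ml culture, using a hypothetical 1 M (1000 mM) stock.
iptg_ul = stock_volume_ul(final_um=500, culture_ml=100, stock_mm=1000)

if __name__ == "__main__":
    print(f"add {iptg_ul:.0f} µl of stock")
```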

Example 2: Solubilization of Cel3a from the Insoluble Fraction

[0220] An E. coli culture expressing an enzyme having biomass-degrading activity, Cel3a, was grown and enzyme expression was induced, as described in Example 1. Isolation of Cel3a tagged with a His tag at the C-terminus (Cel3a-C'His) from the soluble and insoluble fractions was performed as follows. The cell culture was centrifuged at 4200 rpm for 30 minutes. The supernatant was discarded and the cell pellet was re-suspended in lysis buffer with 1 mg/mL lysozyme. Lysonase Bioprocessing Reagent (EMD Millipore, Catalog No. 71320) was added, e.g., at 10 µl per gram of cell paste, and the sample was incubated for 1 hour at ambient temperature. After 1 hour, the sample was sonicated for a total of 2 minutes in 30 second intervals. Following sonication, the sample was centrifuged for 30 minutes at 10000 rpm. The supernatant, or soluble fraction, contains soluble Cel3a, while the remaining pellet, or insoluble fraction, contains inclusion bodies and insoluble Cel3a.

[0221] The insoluble fraction was re-suspended for 15 minutes in buffer containing a solubilizing agent, specifically 6M Urea pH 8 IMAC binding buffer, and vortexed at room temperature. The sample was then filtered through a 0.45 µm membrane to prepare for IMAC purification. The amount of 6M Urea pH 8 IMAC binding buffer added was proportional to the amount of cell mass, e.g., 2 or 3 volumes of buffer to 1 volume of cell mass. The amount of binding buffer is increased as the cell mass increases in order to make filtering of the sample possible.

Example 3: Purification of Cel3a

[0222] Purification of Soluble Cel3a

[0223] The soluble fraction from Example 2 was transferred to a fresh tube containing 100 µl of pre-equilibrated Bio-Scale™ Profinity Ni-charged IMAC resin slurry (BioRad, Catalog No. 732-4614). The native binding buffer contained 50 mM Tris HCl pH 7.5, 150 mM NaCl, 0.1% Triton X-100, and 5 µM imidazole. The protein was batch-bound for 1 hour at room temperature (RT), and then washed with native buffer containing 25 µM imidazole. The protein was eluted in 300 µl of native buffer containing 200 µM imidazole.

[0224] Purification of Insoluble Cel3a

[0225] A Bio-Scale™ Mini Profinity IMAC 5 mL cartridge (BioRad, Catalog No. 732-4614) was equilibrated with 5 column volumes of 6M Urea pH 8 IMAC binding buffer at a flow rate of 5 mL/min. After column equilibration the resuspended insoluble fraction from Example 2 was loaded at a flow rate of 1 mL/min. The column then received a 15 column volume wash of the 6M Urea pH 8 IMAC binding buffer at a flow rate of 5 mL/min. The solubilized Cel3a was then eluted from the column with 10 column volumes of 6M Urea pH 4 IMAC elution buffer at a flow rate of 5 mL/min. The resulting solubilized Cel3a sample contains 6M urea.
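The column volumes and flow rates given in this paragraph translate directly into buffer volumes and run times for a 5 mL cartridge. The following Python sketch is a planning aid under that reading; the class and function names are hypothetical, and the sample loading step is omitted because its volume is not stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    column_volumes: float
    flow_ml_per_min: float


CARTRIDGE_ML = 5.0  # cartridge volume stated in the paragraph above

STEPS = [
    Step("equilibration (6M Urea pH 8 binding buffer)", 5, 5.0),
    Step("wash (6M Urea pH 8 binding buffer)", 15, 5.0),
    Step("elution (6M Urea pH 4 elution buffer)", 10, 5.0),
]


def buffer_ml(step: Step, cartridge_ml: float = CARTRIDGE_ML) -> float:
    """Buffer volume consumed by a step, in mL."""
    return step.column_volumes * cartridge_ml


def duration_min(step: Step, cartridge_ml: float = CARTRIDGE_ML) -> float:
    """Run time of a step at its stated flow rate, in minutes."""
    return buffer_ml(step, cartridge_ml) / step.flow_ml_per_min


# Binding buffer is used for both equilibration and wash.
binding_buffer_total_ml = buffer_ml(STEPS[0]) + buffer_ml(STEPS[1])

if __name__ == "__main__":
    for step in STEPS:
        print(f"{step.name}: {buffer_ml(step):.0f} mL over {duration_min(step):.0f} min")
    print(f"binding buffer for column steps: {binding_buffer_total_ml:.0f} mL")
```

The binding buffer total covers only the column steps; the buffer used to resuspend the insoluble fraction in Example 2 is additional.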

[0226] IMAC chromatography analysis was performed (using IMAC columns, Bio-Scale™ Mini Profinity™ IMAC Cartridges 5 mL, Catalog #732-4614), and as shown in FIG. 1, purified solubilized Cel3a was detected.

[0227] SDS-PAGE analysis was performed to assess the amount of Cel3a from the purification described above in the following fractions: purified soluble Cel3a (lane 2 in FIG. 2); flow through from the IMAC purification of the insoluble fraction (lane 3 in FIG. 2); and the purified Cel3a from the insoluble fraction (solubilized Cel3a) (lane 4 in FIG. 2). As shown in FIG. 2, Cel3a was successfully isolated from the inclusion bodies of the insoluble fraction using the methods described above.

Example 4: Analysis of Cellobiase Activity of Solubilized Cel3a

[0228] Cel3a was purified using IMAC techniques, as described in Example 3. Prior to performing the activity assay, the amount (titer) of purified Cel3a was determined using a Bradford assay and/or a NanoDrop spectrophotometer. For NanoDrop quantification, the molar extinction coefficient was estimated by entering the amino acid sequence of the target form of Cel3a into the ExPASy ProtParam online tool.
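For orientation, ProtParam estimates the molar extinction coefficient at 280 nm from the numbers of tryptophan, tyrosine, and cystine residues, and the Beer-Lambert law then converts an absorbance reading into a concentration. The Python sketch below illustrates that style of calculation; the toy sequence, absorbance value, and function names are hypothetical and are not data from this example.

```python
# Per-residue contributions at 280 nm in M^-1 cm^-1, as used by ProtParam.
TRP_E280 = 5500
TYR_E280 = 1490
CYSTINE_E280 = 125  # per disulfide bond; zero if all cysteines are reduced


def molar_extinction_280(sequence: str, cystines: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm from a one-letter sequence."""
    seq = sequence.upper()
    return (seq.count("W") * TRP_E280
            + seq.count("Y") * TYR_E280
            + cystines * CYSTINE_E280)


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration (mol/L) = A / (epsilon * path length)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    toy_sequence = "AWWYC"  # hypothetical peptide, not Cel3a
    eps = molar_extinction_280(toy_sequence)
    conc = molar_concentration(a280=0.6245, epsilon=eps)
    print(f"epsilon = {eps} M^-1 cm^-1, concentration = {conc * 1e6:.1f} µM")
```

Multiplying the molar concentration by the molecular weight of the protein gives a mass concentration comparable to a Bradford result.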

[0229] For the activity assay, two-fold serial dilutions of samples containing purified Cel3a were prepared using 50 mM sodium citrate, pH 5.0 (adjusted with NaOH), as buffer. Dilutions were aliquoted across one row of a 96 well plate. Dilutions were incubated with a D-(+)-Cellobiose (Fluka) substrate solution in 50 mM sodium citrate monobasic buffer at pH 5.0, at 48 °C for 30 minutes. The plates were immediately sealed using an adhesive plate seal and placed on a microplate incubator shaker set at 48 °C, 700 rpm. After 30 minutes, the samples were heated on a heating dry bath for 5 minutes at 100 °C to stop the reaction. The plate was then filtered through a 96 well format 0.45 µm Durapore membrane. The filtrate samples were analysed for glucose and cellobiose using the YSI Biochemistry analyser (YSI Life Sciences) and/or HPLC (UPLC) methods. The cellobiase activity of the dilutions of purified Cel3a from the soluble and insoluble fractions was plotted on a graph and the results are shown in FIG. 3. FIG. 3 shows that the solubilized Cel3a has cellobiase activity, even in the presence of urea.
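A two-fold serial dilution across one 12-well row of a 96 well plate spans a 2048-fold concentration range. The Python sketch below lists the relative concentration in each well; the starting value and the function name are hypothetical.

```python
def twofold_series(start: float, wells: int = 12) -> list:
    """Concentrations for a two-fold serial dilution, first well undiluted."""
    if wells < 1:
        raise ValueError("at least one well is required")
    return [start / (2 ** i) for i in range(wells)]


if __name__ == "__main__":
    # Hypothetical starting concentration of 1.0 (arbitrary units).
    for well, conc in enumerate(twofold_series(1.0), start=1):
        print(f"well {well:2d}: {conc:.6f}")
```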

[0230] Cellobiase activity was also assessed for solubilized Cel3a that had not been purified by the IMAC purification methods described in Example 3. The cell pellet of cells expressing Cel3a was washed with lysis buffer before solubilizing with the solubilizing buffer containing 6M urea. Cellobiase activity of the crude lysate sample containing solubilized Cel3a in 6M urea buffer was assayed by the cellobiase assay described above. The percentage of cellobiose converted to glucose in 30 minutes was compared between soluble Cel3a, the soluble wash, and the solubilized Cel3a from crude lysate (FIG. 4). The solubilized Cel3a without purification also possessed cellobiase activity.
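One way to express the percentage of cellobiose converted to glucose from measured glucose and residual cellobiose concentrations is sketched below in Python. This is an assumed calculation, not necessarily the one used to generate FIG. 4: it takes each mole of cellobiose to yield two moles of glucose, attributes all measured glucose to cellobiose hydrolysis, and uses hypothetical function and parameter names.

```python
GLUCOSE_G_PER_MOL = 180.156     # C6H12O6
CELLOBIOSE_G_PER_MOL = 342.297  # C12H22O11


def percent_converted(glucose_g_per_l: float, cellobiose_g_per_l: float) -> float:
    """Percent of the initial cellobiose hydrolysed, from end-point concentrations.

    Assumes all measured glucose came from cellobiose (2 glucose per cellobiose).
    """
    consumed_mol = (glucose_g_per_l / GLUCOSE_G_PER_MOL) / 2.0
    remaining_mol = cellobiose_g_per_l / CELLOBIOSE_G_PER_MOL
    initial_mol = consumed_mol + remaining_mol
    if initial_mol <= 0:
        raise ValueError("no cellobiose or glucose detected")
    return 100.0 * consumed_mol / initial_mol


if __name__ == "__main__":
    # Hypothetical readings in g/L, for illustration only.
    print(f"{percent_converted(1.80, 1.71):.1f}% converted")
```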

EQUIVALENTS

[0231] The disclosures of each and every patent, patent application, and publication cited herein are hereby incorporated herein by reference in their entirety. While this invention has been disclosed with reference to specific aspects, it is apparent that other aspects and variations of this invention may be devised by others skilled in the art without departing from the true spirit and scope of the invention. The appended claims are intended to be construed to include all such aspects and equivalent variations.

Sequence CWU 1

1

231731PRTTrichoderma reesei 1Met Gly Asp Ser His Ser Thr Ser Gly Ala Ser Ala Glu Ala Val Val1 5 10 15Pro Pro Ala Gly Thr Pro Trp Gly Thr Ala Tyr Asp Lys Ala Lys Ala 20 25 30Ala Leu Ala Lys Leu Asn Leu Gln Asp Lys Val Gly Ile Val Ser Gly 35 40 45Val Gly Trp Asn Gly Gly Pro Cys Val Gly Asn Thr Ser Pro Ala Ser 50 55 60Lys Ile Ser Tyr Pro Ser Leu Cys Leu Gln Asp Gly Pro Leu Gly Val65 70 75 80Arg Tyr Ser Thr Gly Ser Thr Ala Phe Thr Pro Gly Val Gln Ala Ala 85 90 95Ser Thr Trp Asp Val Asn Leu Ile Arg Glu Arg Gly Gln Phe Ile Gly 100 105 110Glu Glu Val Lys Ala Ser Gly Ile His Val Ile Leu Gly Pro Val Ala 115 120 125Gly Pro Leu Gly Lys Thr Pro Gln Gly Gly Arg Asn Trp Glu Gly Phe 130 135 140Gly Val Asp Pro Tyr Leu Thr Gly Ile Ala Met Gly Gln Thr Ile Asn145 150 155 160Gly Ile Gln Ser Val Gly Val Gln Ala Thr Ala Lys His Tyr Ile Leu 165 170 175Asn Glu Gln Glu Leu Asn Arg Glu Thr Ile Ser Ser Asn Pro Asp Asp 180 185 190Arg Thr Leu His Glu Leu Tyr Thr Trp Pro Phe Ala Asp Ala Val Gln 195 200 205Ala Asn Val Ala Ser Val Met Cys Ser Tyr Asn Lys Val Asn Thr Thr 210 215 220Trp Ala Cys Glu Asp Gln Tyr Thr Leu Gln Thr Val Leu Lys Asp Gln225 230 235 240Leu Gly Phe Pro Gly Tyr Val Met Thr Asp Trp Asn Ala Gln His Thr 245 250 255Thr Val Gln Ser Ala Asn Ser Gly Leu Asp Met Ser Met Pro Gly Thr 260 265 270Asp Phe Asn Gly Asn Asn Arg Leu Trp Gly Pro Ala Leu Thr Asn Ala 275 280 285Val Asn Ser Asn Gln Val Pro Thr Ser Arg Val Asp Asp Met Val Thr 290 295 300Arg Ile Leu Ala Ala Trp Tyr Leu Thr Gly Gln Asp Gln Ala Gly Tyr305 310 315 320Pro Ser Phe Asn Ile Ser Arg Asn Val Gln Gly Asn His Lys Thr Asn 325 330 335Val Arg Ala Ile Ala Arg Asp Gly Ile Val Leu Leu Lys Asn Asp Ala 340 345 350Asn Ile Leu Pro Leu Lys Lys Pro Ala Ser Ile Ala Val Val Gly Ser 355 360 365Ala Ala Ile Ile Gly Asn His Ala Arg Asn Ser Pro Ser Cys Asn Asp 370 375 380Lys Gly Cys Asp Asp Gly Ala Leu Gly Met Gly Trp Gly Ser Gly Ala385 390 395 400Val Asn Tyr Pro Tyr Phe Val Ala Pro Tyr Asp Ala Ile Asn Thr Arg 405 410 415Ala Ser Ser Gln Gly Thr 
Gln Val Thr Leu Ser Asn Thr Asp Asn Thr 420 425 430Ser Ser Gly Ala Ser Ala Ala Arg Gly Lys Asp Val Ala Ile Val Phe 435 440 445Ile Thr Ala Asp Ser Gly Glu Gly Tyr Ile Thr Val Glu Gly Asn Ala 450 455 460Gly Asp Arg Asn Asn Leu Asp Pro Trp His Asn Gly Asn Ala Leu Val465 470 475 480Gln Ala Val Ala Gly Ala Asn Ser Asn Val Ile Val Val Val His Ser 485 490 495Val Gly Ala Ile Ile Leu Glu Gln Ile Leu Ala Leu Pro Gln Val Lys 500 505 510Ala Val Val Trp Ala Gly Leu Pro Ser Gln Glu Ser Gly Asn Ala Leu 515 520 525Val Asp Val Leu Trp Gly Asp Val Ser Pro Ser Gly Lys Leu Val Tyr 530 535 540Thr Ile Ala Lys Ser Pro Asn Asp Tyr Asn Thr Arg Ile Val Ser Gly545 550 555 560Gly Ser Asp Ser Phe Ser Glu Gly Leu Phe Ile Asp Tyr Lys His Phe 565 570 575Asp Asp Ala Asn Ile Thr Pro Arg Tyr Glu Phe Gly Tyr Gly Leu Ser 580 585 590Tyr Thr Lys Phe Asn Tyr Ser Arg Leu Ser Val Leu Ser Thr Ala Lys 595 600 605Ser Gly Pro Ala Thr Gly Ala Val Val Pro Gly Gly Pro Ser Asp Leu 610 615 620Phe Gln Asn Val Ala Thr Val Thr Val Asp Ile Ala Asn Ser Gly Gln625 630 635 640Val Thr Gly Ala Glu Val Ala Gln Leu Tyr Ile Thr Tyr Pro Ser Ser 645 650 655Ala Pro Arg Thr Pro Pro Lys Gln Leu Arg Gly Phe Ala Lys Leu Asn 660 665 670Leu Thr Pro Gly Gln Ser Gly Thr Ala Thr Phe Asn Ile Arg Arg Arg 675 680 685Asp Leu Ser Tyr Trp Asp Thr Ala Ser Gln Lys Trp Val Val Pro Ser 690 695 700Gly Ser Phe Gly Ile Ser Val Gly Ala Ser Ser Arg Asp Ile Arg Leu705 710 715 720Thr Ser Thr Leu Ser Val Ala Gly Ser Gly Ser 725 73022232DNATrichoderma reesei 2atgcgttacc gaacagcagc tgcgctggca cttgccactg ggccctttgc tagggcagac 60agtcactcaa catcgggggc ctcggctgag gcagttgtac ctcctgcagg gactccatgg 120ggaaccgcgt acgacaaggc gaaggccgca ttggcaaagc tcaatctcca agataaggtc 180ggcatcgtga gcggtgtcgg ctggaacggc ggtccttgcg ttggaaacac atctccggcc 240tccaagatca gctatccatc gctatgcctt caagacggac ccctcggtgt tcgatactcg 300acaggcagca cagcctttac gccgggcgtt caagcggcct cgacgtggga tgtcaatttg 360atccgcgaac gtggacagtt catcggtgag gaggtgaagg cctcggggat tcatgtcata 420cttggtcctg tggctgggcc 
gctgggaaag actccgcagg gcggtcgcaa ctgggagggc 480ttcggtgtcg atccatatct cacgggcatt gccatgggtc aaaccatcaa cggcatccag 540tcggtaggcg tgcaggcgac agcgaagcac tatatcctca acgagcagga gctcaatcga 600gaaaccattt cgagcaaccc agatgaccga actctccatg agctgtatac ttggccattt 660gccgacgcgg ttcaggccaa tgtcgcttct gtcatgtgct cgtacaacaa ggtcaatacc 720acctgggcct gcgaggatca gtacacgctg cagactgtgc tgaaagacca gctggggttc 780ccaggctatg tcatgacgga ctggaacgca cagcacacga ctgtccaaag cgcgaattct 840gggcttgaca tgtcaatgcc tggcacagac ttcaacggta acaatcggct ctggggtcca 900gctctcacca atgcggtaaa tagcaatcag gtccccacga gcagagtcga cgatatggtg 960actcgtatcc tcgccgcatg gtacttgaca ggccaggacc aggcaggcta tccgtcgttc 1020aacatcagca gaaatgttca aggaaaccac aagaccaatg tcagggcaat tgccagggac 1080ggcatcgttc tgctcaagaa tgacgccaac atcctgccgc tcaagaagcc cgctagcatt 1140gccgtcgttg gatctgccgc aatcattggt aaccacgcca gaaactcgcc ctcgtgcaac 1200gacaaaggct gcgacgacgg ggccttgggc atgggttggg gttccggcgc cgtcaactat 1260ccgtacttcg tcgcgcccta cgatgccatc aataccagag cgtcttcgca gggcacccag 1320gttaccttga gcaacaccga caacacgtcc tcaggcgcat ctgcagcaag aggaaaggac 1380gtcgccatcg tcttcatcac cgccgactcg ggtgaaggct acatcaccgt ggagggcaac 1440gcgggcgatc gcaacaacct ggatccgtgg cacaacggca atgccctggt ccaggcggtg 1500gccggtgcca acagcaacgt cattgttgtt gtccactccg ttggcgccat cattctggag 1560cagattcttg ctcttccgca ggtcaaggcc gttgtctggg cgggtcttcc ttctcaggag 1620agcggcaatg cgctcgtcga cgtgctgtgg ggagatgtca gcccttctgg caagctggtg 1680tacaccattg cgaagagccc caatgactat aacactcgca tcgtttccgg cggcagtgac 1740agcttcagcg agggactgtt catcgactat aagcacttcg acgacgccaa tatcacgccg 1800cggtacgagt tcggctatgg actgtcttac accaagttca actactcacg cctctccgtc 1860ttgtcgaccg ccaagtctgg tcctgcgact ggggccgttg tgccgggagg cccgagtgat 1920ctgttccaga atgtcgcgac agtcaccgtt gacatcgcaa actctggcca agtgactggt 1980gccgaggtag cccagctgta catcacctac ccatcttcag cacccaggac ccctccgaag 2040cagctgcgag gctttgccaa gctgaacctc acgcctggtc agagcggaac agcaacgttc 2100aacatccgac gacgagatct cagctactgg gacacggctt cgcagaaatg ggtggtgccg 
2160tcggggtcgt ttggcatcag cgtgggagcg agcagccggg atatcaggct gacgagcact 2220ctgtcggtag cg 223232232DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 3atgcgttatc gtacagccgc agccctggca ctggccacag gtccgttcgc acgtgccgat 60agtcacagta ccagcggtgc cagcgcagaa gccgtggttc cgccggcagg cacaccgtgg 120ggcacagcct atgataaagc caaagccgcc ctggccaagc tgaatctgca ggataaagtg 180ggcatcgtga gtggcgtggg ctggaacggt ggtccgtgcg ttggcaacac cagcccggca 240agcaagatca gctatccgag cttatgcctg caggatggtc cgctgggcgt gcgctatagc 300accggtagta ccgcctttac acctggtgtg caggccgcca gtacctggga cgttaacctg 360atccgcgaac gtggccaatt tatcggcgaa gaagttaaag ccagcggcat tcatgttatt 420ctgggtccgg tggccggtcc tctgggtaaa accccgcagg gcggccgtaa ttgggaaggc 480ttcggcgttg atccgtattt aaccggcatc gcaatgggcc agaccattaa tggcatccag 540agcgtgggtg ttcaagccac cgccaaacac tacatattaa acgaacagga actgaatcgt 600gaaaccatca gcagcaatcc ggatgatcgc accctgcatg agctgtatac atggcctttt 660gccgacgcag ttcaggccaa cgtggcaagt gtgatgtgta gctataacaa ggtgaacacc 720acctgggcct gcgaagacca gtacaccctg cagaccgttt taaaagacca actgggcttc 780cctggttacg tgatgacaga ttggaatgcc cagcacacaa ccgttcagag cgcaaacagt 840ggcctggata tgagcatgcc gggcaccgac ttcaacggca ataatcgtct gtggggtccg 900gcactgacca atgccgttaa cagcaaccag gtgccgacca gtcgtgtgga cgatatggtt 960acccgtattc tggccgcctg gtacctgaca ggtcaagacc aggccggcta cccgagcttc 1020aacatcagcc gcaacgtgca gggtaatcac aagaccaacg ttcgcgcaat cgcacgcgat 1080ggtatcgtgc tgttaaagaa cgatgccaac attctgccgc tgaaaaaacc ggccagcatc 1140gccgttgttg gtagcgcagc catcattggc aaccacgccc gtaacagtcc gagctgcaat 1200gataaaggct gtgacgacgg tgccctgggc atgggttggg gtagtggtgc cgtgaactac 1260ccgtatttcg tggccccgta cgacgccatt aacacccgtg caagtagcca gggtacccag 1320gttaccctga gcaacaccga caacacaagc agcggtgcca gtgcagcacg tggtaaggat 1380gtggccatcg tgttcatcac cgccgacagc ggcgaaggct acattaccgt ggagggtaat 1440gccggtgatc gcaataatct ggacccgtgg cataacggca acgccctggt tcaggcagtg 1500gcaggcgcaa atagcaacgt gatcgttgtg gtgcatagcg tgggtgccat cattctggag 1560cagatcctgg 
ccctgccgca agttaaggca gttgtgtggg caggtctgcc gagccaagaa 1620agtggcaatg ccctggtgga cgttctgtgg ggcgatgtta gtccgagcgg caagctggtg 1680tatacaatcg ccaagagccc gaacgactat aacacccgca tcgttagcgg cggcagtgat 1740agcttcagcg agggcctgtt tatcgactac aagcatttcg atgatgccaa tattaccccg 1800cgctacgaat ttggttatgg cctgagctat accaagttca actacagccg cctgagcgtt 1860ttaagtaccg ccaagagtgg tccggcaaca ggtgccgtgg ttcctggtgg tccgagtgat 1920ctgtttcaga atgtggccac cgtgaccgtg gatatcgcca acagtggtca ggttaccggc 1980gccgaagtgg cacagctgta catcacctat ccgagcagtg caccgcgcac cccgccgaaa 2040cagctgcgtg gcttcgccaa attaaacctg accccgggcc agagcggtac agcaaccttc 2100aatattcgcc gccgtgatct gagctattgg gacaccgcca gccaaaaatg ggtggtgccg 2160agcggcagct ttggcattag tgtgggtgca agtagccgcg acattcgctt aacaagcacc 2220ctgagtgttg cc 2232430DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 4catgccatgg gcgatagtca cagtaccagc 30568DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 5cgtgggactc acaacggccg tcgccgtcgg tagtagtagt agtagtagta gtgattactt 60tcgaaccc 686800PRTPodospora anserina 6Met Lys Ser Ser Val Phe Trp Gly Ala Ser Leu Thr Ser Ala Val Val1 5 10 15Arg Ala Ile Asp Leu Pro Phe Gln Phe Tyr Pro Asn Cys Val Asp Asp 20 25 30Leu Leu Ser Thr Asn Gln Val Cys Asn Thr Thr Leu Ser Pro Pro Glu 35 40 45Arg Ala Ala Ala Leu Val Ala Ala Leu Thr Pro Glu Glu Lys Leu Gln 50 55 60Asn Ile Val Ser Lys Ser Leu Gly Ala Pro Arg Ile Gly Leu Pro Ala65 70 75 80Tyr Asn Trp Trp Ser Glu Ala Leu His Gly Val Ala Tyr Ala Pro Gly 85 90 95Thr Gln Phe Trp Gln Gly Asp Gly Pro Phe Asn Ser Ser Thr Ser Phe 100 105 110Pro Met Pro Leu Leu Met Ala Ala Thr Phe Asp Asp Glu Leu Leu Glu 115 120 125Lys Ile Ala Glu Val Ile Gly Ile Glu Gly Arg Ala Phe Gly Asn Ala 130 135 140Gly Phe Ser Gly Leu Asp Tyr Trp Thr Pro Asn Val Asn Pro Phe Lys145 150 155 160Asp Pro Arg Trp Gly Arg Gly Ser Glu Thr Pro Gly Glu Asp Val Leu 165 170 175Leu Val Lys Arg Tyr Ala Ala Ala Met Ile Lys Gly Leu Glu Gly Pro 180 185 190Val Pro Glu Lys Glu Arg Arg Val Val 
Ala Thr Cys Lys His Tyr Ala 195 200 205Ala Asn Asp Phe Glu Asp Trp Asn Gly Ala Thr Arg His Asn Phe Asn 210 215 220Ala Lys Ile Ser Leu Gln Asp Met Ala Glu Tyr Tyr Phe Met Pro Phe225 230 235 240Gln Gln Cys Val Arg Asp Ser Arg Val Gly Ser Ile Met Cys Ala Tyr 245 250 255Asn Ala Val Asn Gly Val Pro Ser Cys Ala Ser Pro Tyr Leu Leu Gln 260 265 270Thr Ile Leu Arg Glu His Trp Asn Trp Thr Glu His Asn Asn Tyr Ile 275 280 285Thr Ser Asp Cys Glu Ala Val Leu Asp Val Ser Leu Asn His Lys Tyr 290 295 300Ala Ala Thr Asn Ala Glu Gly Thr Ala Ile Ser Phe Glu Ala Gly Met305 310 315 320Asp Thr Ser Cys Glu Tyr Glu Gly Ser Ser Asp Ile Pro Gly Ala Trp 325 330 335Ser Gln Gly Leu Leu Lys Glu Ser Thr Val Asp Arg Ala Leu Leu Arg 340 345 350Leu Tyr Glu Gly Ile Val Arg Ala Gly Tyr Phe Asp Gly Lys Gln Ser 355 360 365Leu Tyr Ser Ser Leu Gly Trp Ala Asp Val Asn Lys Pro Ser Ala Gln 370 375 380Lys Leu Ser Leu Gln Ala Ala Val Asp Gly Thr Val Leu Leu Lys Asn385 390 395 400Asp Gly Thr Leu Pro Leu Ser Asp Leu Leu Asp Lys Ser Arg Pro Lys 405 410 415Lys Val Ala Met Ile Gly Phe Trp Ser Asp Ala Lys Asp Lys Leu Arg 420 425 430Gly Gly Tyr Ser Gly Thr Ala Ala Tyr Leu His Thr Pro Ala Tyr Ala 435 440 445Ala Ser Gln Leu Gly Ile Pro Phe Ser Thr Ala Ser Gly Pro Ile Leu 450 455 460His Ser Asp Leu Ala Ser Asn Gln Ser Trp Thr Asp Asn Ala Met Ala465 470 475 480Ala Ala Lys Asp Ala Asp Tyr Ile Leu Tyr Phe Gly Gly Ile Asp Thr 485 490 495Ser Ala Ala Gly Glu Thr Lys Asp Arg Tyr Asp Leu Asp Trp Pro Gly 500 505 510Ala Gln Leu Ser Leu Ile Asn Leu Leu Thr Thr Leu Ser Lys Pro Leu 515 520 525Ile Val Leu Gln Met Gly Asp Gln Leu Asp Asn Thr Pro Leu Leu Ser 530 535 540Asn Pro Lys Ile Asn Ala Ile Leu Trp Ala Asn Trp Pro Gly Gln Asp545 550 555 560Gly Gly Thr Ala Val Met Glu Leu Val Thr Gly Leu Lys Ser Pro Ala 565 570 575Gly Arg Leu Pro Val Thr Gln Tyr Pro Ser Asn Phe Thr Glu Leu Val 580 585 590Pro Met Thr Asp Met Ala Leu Arg Pro Ser Ala Gly Asn Ser Gln Leu 595 600 605Gly Arg Thr Tyr Arg Trp Tyr Lys Thr Pro Val Gln Ala Phe Gly Phe 610 
615 620Gly Leu His Tyr Thr Thr Phe Ser Pro Lys Phe Gly Lys Lys Phe Pro625 630 635 640Ala Val Ile Asp Val Asp Glu Val Leu Glu Gly Cys Asp Asp Lys Tyr 645 650 655Leu Asp Thr Cys Pro Leu Pro Asp Leu Pro Val Val Val Glu Asn Arg 660 665 670Gly Asn Arg Thr Ser Asp Tyr Val Ala Leu Ala Phe Val Ser Ala Pro 675 680 685Gly Val Gly Pro Gly Pro Trp Pro Ile Lys Thr Leu Gly Ala Phe Thr 690 695 700Arg Leu Arg Gly Val Lys Gly Gly Glu Lys Arg Glu Gly Gly Leu Lys705 710 715 720Trp Asn Leu Gly Asn Leu Ala Arg His Asp Glu Glu Gly Asn Thr Val 725 730 735Val Tyr Pro Gly Lys Tyr Glu Val Ser Leu Asp Glu Pro Pro Lys Ala 740 745 750Arg Leu Arg Phe Glu Ile Val Arg Gly Gly Lys Gly Lys Gly Lys Val 755 760 765Lys Gly Lys Gly Lys Ala Ala Gln Lys Gly Gly Val Val Leu Asp Arg 770 775 780Trp Pro Lys Pro Pro Lys Gly Gln Glu Pro Pro Ala Ile Glu Arg Val785 790 795 8007316PRTTrichoderma reesei 7Met Val Arg Arg Thr Ala Leu Leu Ala Leu Gly Ala Leu Ser Thr Leu1 5 10 15Ser Met Ala Gln Ile Ser Asp Asp Phe Glu Ser Gly Trp Asp Gln Thr 20 25 30Lys Trp Pro Ile Ser Ala Pro Asp Cys Asn Gln Gly Gly Thr Val Ser 35 40 45Leu Asp Thr Thr Val Ala His Ser Gly Ser Asn Ser Met Lys Val Val 50 55 60Gly Gly Pro Asn Gly Tyr Cys Gly His Ile Phe Phe Gly Thr Thr Gln65 70 75 80Val Pro Thr Gly Asp Val Tyr Val Arg Ala Trp Ile Arg Leu Gln Thr 85 90 95Ala Leu Gly Ser Asn His Val Thr Phe Ile Ile Met Pro Asp Thr Ala 100 105

110Gln Gly Gly Lys His Leu Arg Ile Gly Gly Gln Ser Gln Val Leu Asp 115 120 125Tyr Asn Arg Glu Ser Asp Asp Ala Thr Leu Pro Asp Leu Ser Pro Asn 130 135 140Gly Ile Ala Ser Thr Val Thr Leu Pro Thr Gly Ala Phe Gln Cys Phe145 150 155 160Glu Tyr His Leu Gly Thr Asp Gly Thr Ile Glu Thr Trp Leu Asn Gly 165 170 175Ser Leu Ile Pro Gly Met Thr Val Gly Pro Gly Val Asp Asn Pro Asn 180 185 190Asp Ala Gly Trp Thr Arg Ala Ser Tyr Ile Pro Glu Ile Thr Gly Val 195 200 205Asn Phe Gly Trp Glu Ala Tyr Ser Gly Asp Val Asn Thr Val Trp Phe 210 215 220Asp Asp Ile Ser Ile Ala Ser Thr Arg Val Gly Cys Gly Pro Gly Ser225 230 235 240Pro Gly Gly Pro Gly Ser Ser Thr Thr Gly Arg Ser Ser Thr Ser Gly 245 250 255Pro Thr Ser Thr Ser Arg Pro Ser Thr Thr Ile Pro Pro Pro Thr Ser 260 265 270Arg Thr Thr Thr Ala Thr Gly Pro Thr Gln Thr His Tyr Gly Gln Cys 275 280 285Gly Gly Ile Gly Tyr Ser Gly Pro Thr Val Cys Ala Ser Gly Thr Thr 290 295 300Cys Gln Val Leu Asn Pro Tyr Tyr Ser Gln Cys Leu305 310 3158460PRTTrichoderma reesei 8Met Ala Ser Arg Phe Phe Ala Leu Leu Leu Leu Ala Ile Pro Ile Gln1 5 10 15Ala Gln Ser Pro Val Trp Gly Gln Cys Gly Gly Ile Gly Trp Ser Gly 20 25 30Pro Thr Thr Cys Val Gly Gly Ala Thr Cys Val Ser Tyr Asn Pro Tyr 35 40 45Tyr Ser Gln Cys Ile Pro Ser Thr Gln Ala Ser Ser Ser Ile Ala Ser 50 55 60Thr Thr Leu Val Thr Ser Phe Thr Thr Thr Thr Ala Thr Arg Thr Ser65 70 75 80Ala Ser Thr Pro Pro Ala Ser Ser Thr Gly Ala Gly Gly Ala Thr Cys 85 90 95Ser Ala Leu Pro Gly Ser Ile Thr Leu Arg Ser Asn Ala Lys Leu Asn 100 105 110Asp Leu Phe Thr Met Phe Asn Gly Asp Lys Val Thr Thr Lys Asp Lys 115 120 125Phe Ser Cys Arg Gln Ala Glu Met Ser Glu Leu Ile Gln Arg Tyr Glu 130 135 140Leu Gly Thr Leu Pro Gly Arg Pro Ser Thr Leu Thr Ala Ser Phe Ser145 150 155 160Gly Asn Thr Leu Thr Ile Asn Cys Gly Glu Ala Gly Lys Ser Ile Ser 165 170 175Phe Thr Val Thr Ile Thr Tyr Pro Ser Ser Gly Thr Ala Pro Tyr Pro 180 185 190Ala Ile Ile Gly Tyr Gly Gly Gly Ser Leu Pro Ala Pro Ala Gly Val 195 200 205Ala Met Ile Asn Phe Asn Asn Asp Asn Ile Ala 
Ala Gln Val Asn Thr 210 215 220Gly Ser Arg Gly Gln Gly Lys Phe Tyr Asp Leu Tyr Gly Ser Ser His225 230 235 240Ser Ala Gly Ala Met Thr Ala Trp Ala Trp Gly Val Ser Arg Val Ile 245 250 255Asp Ala Leu Glu Leu Val Pro Gly Ala Arg Ile Asp Thr Thr Lys Ile 260 265 270Gly Val Thr Gly Cys Ser Arg Asn Gly Lys Gly Ala Met Val Ala Gly 275 280 285Ala Phe Glu Lys Arg Ile Val Leu Thr Leu Pro Gln Glu Ser Gly Ala 290 295 300Gly Gly Ser Ala Cys Trp Arg Ile Ser Asp Tyr Leu Lys Ser Gln Gly305 310 315 320Ala Asn Ile Gln Thr Ala Ser Glu Ile Ile Gly Glu Asp Pro Trp Phe 325 330 335Ser Thr Thr Phe Asn Ser Tyr Val Asn Gln Val Pro Val Leu Pro Phe 340 345 350Asp His His Ser Leu Ala Ala Leu Ile Ala Pro Arg Gly Leu Phe Val 355 360 365Ile Asp Asn Asn Ile Asp Trp Leu Gly Pro Gln Ser Cys Phe Gly Cys 370 375 380Met Thr Ala Ala His Met Ala Trp Gln Ala Leu Gly Val Ser Asp His385 390 395 400Met Gly Tyr Ser Gln Ile Gly Ala His Ala His Cys Ala Phe Pro Ser 405 410 415Asn Gln Gln Ser Gln Leu Thr Ala Phe Val Gln Lys Phe Leu Leu Gly 420 425 430Gln Ser Thr Asn Thr Ala Ile Phe Gln Ser Asp Phe Ser Ala Asn Gln 435 440 445Ser Gln Trp Ile Asp Trp Thr Thr Pro Thr Leu Ser 450 455 4609466PRTTrichoderma reesei 9Met Leu Pro Lys Asp Phe Gln Trp Gly Phe Ala Thr Ala Ala Tyr Gln1 5 10 15Ile Glu Gly Ala Val Asp Gln Asp Gly Arg Gly Pro Ser Ile Trp Asp 20 25 30Thr Phe Cys Ala Gln Pro Gly Lys Ile Ala Asp Gly Ser Ser Gly Val 35 40 45Thr Ala Cys Asp Ser Tyr Asn Arg Thr Ala Glu Asp Ile Ala Leu Leu 50 55 60Lys Ser Leu Gly Ala Lys Ser Tyr Arg Phe Ser Ile Ser Trp Ser Arg65 70 75 80Ile Ile Pro Glu Gly Gly Arg Gly Asp Ala Val Asn Gln Ala Gly Ile 85 90 95Asp His Tyr Val Lys Phe Val Asp Asp Leu Leu Asp Ala Gly Ile Thr 100 105 110Pro Phe Ile Thr Leu Phe His Trp Asp Leu Pro Glu Gly Leu His Gln 115 120 125Arg Tyr Gly Gly Leu Leu Asn Arg Thr Glu Phe Pro Leu Asp Phe Glu 130 135 140Asn Tyr Ala Arg Val Met Phe Arg Ala Leu Pro Lys Val Arg Asn Trp145 150 155 160Ile Thr Phe Asn Glu Pro Leu Cys Ser Ala Ile Pro Gly Tyr Gly Ser 165 170 175Gly Thr Phe 
Ala Pro Gly Arg Gln Ser Thr Ser Glu Pro Trp Thr Val 180 185 190Gly His Asn Ile Leu Val Ala His Gly Arg Ala Val Lys Ala Tyr Arg 195 200 205Asp Asp Phe Lys Pro Ala Ser Gly Asp Gly Gln Ile Gly Ile Val Leu 210 215 220Asn Gly Asp Phe Thr Tyr Pro Trp Asp Ala Ala Asp Pro Ala Asp Lys225 230 235 240Glu Ala Ala Glu Arg Arg Leu Glu Phe Phe Thr Ala Trp Phe Ala Asp 245 250 255Pro Ile Tyr Leu Gly Asp Tyr Pro Ala Ser Met Arg Lys Gln Leu Gly 260 265 270Asp Arg Leu Pro Thr Phe Thr Pro Glu Glu Arg Ala Leu Val His Gly 275 280 285Ser Asn Asp Phe Tyr Gly Met Asn His Tyr Thr Ser Asn Tyr Ile Arg 290 295 300His Arg Ser Ser Pro Ala Ser Ala Asp Asp Thr Val Gly Asn Val Asp305 310 315 320Val Leu Phe Thr Asn Lys Gln Gly Asn Cys Ile Gly Pro Glu Thr Gln 325 330 335Ser Pro Trp Leu Arg Pro Cys Ala Ala Gly Phe Arg Asp Phe Leu Val 340 345 350Trp Ile Ser Lys Arg Tyr Gly Tyr Pro Pro Ile Tyr Val Thr Glu Asn 355 360 365Gly Thr Ser Ile Lys Gly Glu Ser Asp Leu Pro Lys Glu Lys Ile Leu 370 375 380Glu Asp Asp Phe Arg Val Lys Tyr Tyr Asn Glu Tyr Ile Arg Ala Met385 390 395 400Val Thr Ala Val Glu Leu Asp Gly Val Asn Val Lys Gly Tyr Phe Ala 405 410 415Trp Ser Leu Met Asp Asn Phe Glu Trp Ala Asp Gly Tyr Val Thr Arg 420 425 430Phe Gly Val Thr Tyr Val Asp Tyr Glu Asn Gly Gln Lys Arg Phe Pro 435 440 445Lys Lys Ser Ala Lys Ser Leu Lys Pro Leu Phe Asp Glu Leu Ile Ala 450 455 460Ala Ala46510744PRTTrichoderma reesei 10Met Arg Tyr Arg Thr Ala Ala Ala Leu Ala Leu Ala Thr Gly Pro Phe1 5 10 15Ala Arg Ala Asp Ser His Ser Thr Ser Gly Ala Ser Ala Glu Ala Val 20 25 30Val Pro Pro Ala Gly Thr Pro Trp Gly Thr Ala Tyr Asp Lys Ala Lys 35 40 45Ala Ala Leu Ala Lys Leu Asn Leu Gln Asp Lys Val Gly Ile Val Ser 50 55 60Gly Val Gly Trp Asn Gly Gly Pro Cys Val Gly Asn Thr Ser Pro Ala65 70 75 80Ser Lys Ile Ser Tyr Pro Ser Leu Cys Leu Gln Asp Gly Pro Leu Gly 85 90 95Val Arg Tyr Ser Thr Gly Ser Thr Ala Phe Thr Pro Gly Val Gln Ala 100 105 110Ala Ser Thr Trp Asp Val Asn Leu Ile Arg Glu Arg Gly Gln Phe Ile 115 120 125Gly Glu Glu Val Lys Ala Ser 
Gly Ile His Val Ile Leu Gly Pro Val 130 135 140Ala Gly Pro Leu Gly Lys Thr Pro Gln Gly Gly Arg Asn Trp Glu Gly145 150 155 160Phe Gly Val Asp Pro Tyr Leu Thr Gly Ile Ala Met Gly Gln Thr Ile 165 170 175Asn Gly Ile Gln Ser Val Gly Val Gln Ala Thr Ala Lys His Tyr Ile 180 185 190Leu Asn Glu Gln Glu Leu Asn Arg Glu Thr Ile Ser Ser Asn Pro Asp 195 200 205Asp Arg Thr Leu His Glu Leu Tyr Thr Trp Pro Phe Ala Asp Ala Val 210 215 220Gln Ala Asn Val Ala Ser Val Met Cys Ser Tyr Asn Lys Val Asn Thr225 230 235 240Thr Trp Ala Cys Glu Asp Gln Tyr Thr Leu Gln Thr Val Leu Lys Asp 245 250 255Gln Leu Gly Phe Pro Gly Tyr Val Met Thr Asp Trp Asn Ala Gln His 260 265 270Thr Thr Val Gln Ser Ala Asn Ser Gly Leu Asp Met Ser Met Pro Gly 275 280 285Thr Asp Phe Asn Gly Asn Asn Arg Leu Trp Gly Pro Ala Leu Thr Asn 290 295 300Ala Val Asn Ser Asn Gln Val Pro Thr Ser Arg Val Asp Asp Met Val305 310 315 320Thr Arg Ile Leu Ala Ala Trp Tyr Leu Thr Gly Gln Asp Gln Ala Gly 325 330 335Tyr Pro Ser Phe Asn Ile Ser Arg Asn Val Gln Gly Asn His Lys Thr 340 345 350Asn Val Arg Ala Ile Ala Arg Asp Gly Ile Val Leu Leu Lys Asn Asp 355 360 365Ala Asn Ile Leu Pro Leu Lys Lys Pro Ala Ser Ile Ala Val Val Gly 370 375 380Ser Ala Ala Ile Ile Gly Asn His Ala Arg Asn Ser Pro Ser Cys Asn385 390 395 400Asp Lys Gly Cys Asp Asp Gly Ala Leu Gly Met Gly Trp Gly Ser Gly 405 410 415Ala Val Asn Tyr Pro Tyr Phe Val Ala Pro Tyr Asp Ala Ile Asn Thr 420 425 430Arg Ala Ser Ser Gln Gly Thr Gln Val Thr Leu Ser Asn Thr Asp Asn 435 440 445Thr Ser Ser Gly Ala Ser Ala Ala Arg Gly Lys Asp Val Ala Ile Val 450 455 460Phe Ile Thr Ala Asp Ser Gly Glu Gly Tyr Ile Thr Val Glu Gly Asn465 470 475 480Ala Gly Asp Arg Asn Asn Leu Asp Pro Trp His Asn Gly Asn Ala Leu 485 490 495Val Gln Ala Val Ala Gly Ala Asn Ser Asn Val Ile Val Val Val His 500 505 510Ser Val Gly Ala Ile Ile Leu Glu Gln Ile Leu Ala Leu Pro Gln Val 515 520 525Lys Ala Val Val Trp Ala Gly Leu Pro Ser Gln Glu Ser Gly Asn Ala 530 535 540Leu Val Asp Val Leu Trp Gly Asp Val Ser Pro Ser Gly Lys Leu 
Val545 550 555 560Tyr Thr Ile Ala Lys Ser Pro Asn Asp Tyr Asn Thr Arg Ile Val Ser 565 570 575Gly Gly Ser Asp Ser Phe Ser Glu Gly Leu Phe Ile Asp Tyr Lys His 580 585 590Phe Asp Asp Ala Asn Ile Thr Pro Arg Tyr Glu Phe Gly Tyr Gly Leu 595 600 605Ser Tyr Thr Lys Phe Asn Tyr Ser Arg Leu Ser Val Leu Ser Thr Ala 610 615 620Lys Ser Gly Pro Ala Thr Gly Ala Val Val Pro Gly Gly Pro Ser Asp625 630 635 640Leu Phe Gln Asn Val Ala Thr Val Thr Val Asp Ile Ala Asn Ser Gly 645 650 655Gln Val Thr Gly Ala Glu Val Ala Gln Leu Tyr Ile Thr Tyr Pro Ser 660 665 670Ser Ala Pro Arg Thr Pro Pro Lys Gln Leu Arg Gly Phe Ala Lys Leu 675 680 685Asn Leu Thr Pro Gly Gln Ser Gly Thr Ala Thr Phe Asn Ile Arg Arg 690 695 700Arg Asp Leu Ser Tyr Trp Asp Thr Ala Ser Gln Lys Trp Val Val Pro705 710 715 720Ser Gly Ser Phe Gly Ile Ser Val Gly Ala Ser Ser Arg Asp Ile Arg 725 730 735Leu Thr Ser Thr Leu Ser Val Ala 74011418PRTTrichoderma reesei 11Met Asn Lys Ser Val Ala Pro Leu Leu Leu Ala Ala Ser Ile Leu Tyr1 5 10 15Gly Gly Ala Ala Ala Gln Gln Thr Val Trp Gly Gln Cys Gly Gly Ile 20 25 30Gly Trp Ser Gly Pro Thr Asn Cys Ala Pro Gly Ser Ala Cys Ser Thr 35 40 45Leu Asn Pro Tyr Tyr Ala Gln Cys Ile Pro Gly Ala Thr Thr Ile Thr 50 55 60Thr Ser Thr Arg Pro Pro Ser Gly Pro Thr Thr Thr Thr Arg Ala Thr65 70 75 80Ser Thr Ser Ser Ser Thr Pro Pro Thr Ser Ser Gly Val Arg Phe Ala 85 90 95Gly Val Asn Ile Ala Gly Phe Asp Phe Gly Cys Thr Thr Asp Gly Thr 100 105 110Cys Val Thr Ser Lys Val Tyr Pro Pro Leu Lys Asn Phe Thr Gly Ser 115 120 125Asn Asn Tyr Pro Asp Gly Ile Gly Gln Met Gln His Phe Val Asn Asp 130 135 140Asp Gly Met Thr Ile Phe Arg Leu Pro Val Gly Trp Gln Tyr Leu Val145 150 155 160Asn Asn Asn Leu Gly Gly Asn Leu Asp Ser Thr Ser Ile Ser Lys Tyr 165 170 175Asp Gln Leu Val Gln Gly Cys Leu Ser Leu Gly Ala Tyr Cys Ile Val 180 185 190Asp Ile His Asn Tyr Ala Arg Trp Asn Gly Gly Ile Ile Gly Gln Gly 195 200 205Gly Pro Thr Asn Ala Gln Phe Thr Ser Leu Trp Ser Gln Leu Ala Ser 210 215 220Lys Tyr Ala Ser Gln Ser Arg Val Trp Phe Gly Ile 
Met Asn Glu Pro225 230 235 240His Asp Val Asn Ile Asn Thr Trp Ala Ala Thr Val Gln Glu Val Val 245 250 255Thr Ala Ile Arg Asn Ala Gly Ala Thr Ser Gln Phe Ile Ser Leu Pro 260 265 270Gly Asn Asp Trp Gln Ser Ala Gly Ala Phe Ile Ser Asp Gly Ser Ala 275 280 285Ala Ala Leu Ser Gln Val Thr Asn Pro Asp Gly Ser Thr Thr Asn Leu 290 295 300Ile Phe Asp Val His Lys Tyr Leu Asp Ser Asp Asn Ser Gly Thr His305 310 315 320Ala Glu Cys Thr Thr Asn Asn Ile Asp Gly Ala Phe Ser Pro Leu Ala 325 330 335Thr Trp Leu Arg Gln Asn Asn Arg Gln Ala Ile Leu Thr Glu Thr Gly 340 345 350Gly Gly Asn Val Gln Ser Cys Ile Gln Asp Met Cys Gln Gln Ile Gln 355 360 365Tyr Leu Asn Gln Asn Ser Asp Val Tyr Leu Gly Tyr Val Gly Trp Gly 370 375 380Ala Gly Ser Phe Asp Ser Thr Tyr Val Leu Thr Glu Thr Pro Thr Gly385 390 395 400Ser Gly Asn Ser Trp Thr Asp Thr Ser Leu Val Ser Ser Cys Leu Ala 405 410 415Arg Lys12471PRTTrichoderma reesei 12Met Ile Val Gly Ile Leu Thr Thr Leu Ala Thr Leu Ala Thr Leu Ala1 5 10 15Ala Ser Val Pro Leu Glu Glu Arg Gln Ala Cys Ser Ser Val Trp Gly 20 25 30Gln Cys Gly Gly Gln Asn Trp Ser Gly Pro Thr Cys Cys Ala Ser Gly 35 40 45Ser Thr Cys Val Tyr Ser Asn Asp Tyr Tyr Ser Gln Cys Leu Pro Gly 50 55 60Ala Ala Ser Ser Ser Ser Ser Thr Arg Ala Ala Ser Thr Thr Ser Arg65 70 75 80Val Ser Pro Thr Thr Ser Arg Ser Ser Ser Ala Thr Pro Pro Pro Gly 85 90 95Ser Thr Thr Thr Arg Val Pro Pro Val Gly Ser Gly Thr Ala Thr Tyr 100 105 110Ser Gly Asn Pro Phe Val Gly Val Thr Pro Trp Ala Asn Ala Tyr Tyr 115 120 125Ala Ser Glu Val Ser Ser Leu Ala Ile Pro Ser Leu Thr Gly Ala Met 130 135 140Ala Thr Ala Ala Ala Ala Val Ala Lys Val Pro Ser Phe Met Trp Leu145 150 155 160Asp Thr Leu Asp Lys Thr Pro Leu Met Glu Gln Thr Leu Ala Asp Ile

165 170 175Arg Thr Ala Asn Lys Asn Gly Gly Asn Tyr Ala Gly Gln Phe Val Val 180 185 190Tyr Asp Leu Pro Asp Arg Asp Cys Ala Ala Leu Ala Ser Asn Gly Glu 195 200 205Tyr Ser Ile Ala Asp Gly Gly Val Ala Lys Tyr Lys Asn Tyr Ile Asp 210 215 220Thr Ile Arg Gln Ile Val Val Glu Tyr Ser Asp Ile Arg Thr Leu Leu225 230 235 240Val Ile Glu Pro Asp Ser Leu Ala Asn Leu Val Thr Asn Leu Gly Thr 245 250 255Pro Lys Cys Ala Asn Ala Gln Ser Ala Tyr Leu Glu Cys Ile Asn Tyr 260 265 270Ala Val Thr Gln Leu Asn Leu Pro Asn Val Ala Met Tyr Leu Asp Ala 275 280 285Gly His Ala Gly Trp Leu Gly Trp Pro Ala Asn Gln Asp Pro Ala Ala 290 295 300Gln Leu Phe Ala Asn Val Tyr Lys Asn Ala Ser Ser Pro Arg Ala Leu305 310 315 320Arg Gly Leu Ala Thr Asn Val Ala Asn Tyr Asn Gly Trp Asn Ile Thr 325 330 335Ser Pro Pro Ser Tyr Thr Gln Gly Asn Ala Val Tyr Asn Glu Lys Leu 340 345 350Tyr Ile His Ala Ile Gly Pro Leu Leu Ala Asn His Gly Trp Ser Asn 355 360 365Ala Phe Phe Ile Thr Asp Gln Gly Arg Ser Gly Lys Gln Pro Thr Gly 370 375 380Gln Gln Gln Trp Gly Asp Trp Cys Asn Val Ile Gly Thr Gly Phe Gly385 390 395 400Ile Arg Pro Ser Ala Asn Thr Gly Asp Ser Leu Leu Asp Ser Phe Val 405 410 415Trp Val Lys Pro Gly Gly Glu Cys Asp Gly Thr Ser Asp Ser Ser Ala 420 425 430Pro Arg Phe Asp Ser His Cys Ala Leu Pro Asp Ala Leu Gln Pro Ala 435 440 445Pro Gln Ala Gly Ala Trp Phe Gln Ala Tyr Phe Val Gln Leu Leu Thr 450 455 460Asn Ala Asn Pro Ser Phe Leu465 47013514PRTTrichoderma reesei 13Met Tyr Arg Lys Leu Ala Val Ile Ser Ala Phe Leu Ala Thr Ala Arg1 5 10 15Ala Gln Ser Ala Cys Thr Leu Gln Ser Glu Thr His Pro Pro Leu Thr 20 25 30Trp Gln Lys Cys Ser Ser Gly Gly Thr Cys Thr Gln Gln Thr Gly Ser 35 40 45Val Val Ile Asp Ala Asn Trp Arg Trp Thr His Ala Thr Asn Ser Ser 50 55 60Thr Asn Cys Tyr Asp Gly Asn Thr Trp Ser Ser Thr Leu Cys Pro Asp65 70 75 80Asn Glu Thr Cys Ala Lys Asn Cys Cys Leu Asp Gly Ala Ala Tyr Ala 85 90 95Ser Thr Tyr Gly Val Thr Thr Ser Gly Asn Ser Leu Ser Ile Gly Phe 100 105 110Val Thr Gln Ser Ala Gln Lys Asn Val Gly Ala Arg Leu Tyr 
Leu Met 115 120 125Ala Ser Asp Thr Thr Tyr Gln Glu Phe Thr Leu Leu Gly Asn Glu Phe 130 135 140Ser Phe Asp Val Asp Val Ser Gln Leu Pro Cys Gly Leu Asn Gly Ala145 150 155 160Leu Tyr Phe Val Ser Met Asp Ala Asp Gly Gly Val Ser Lys Tyr Pro 165 170 175Thr Asn Thr Ala Gly Ala Lys Tyr Gly Thr Gly Tyr Cys Asp Ser Gln 180 185 190Cys Pro Arg Asp Leu Lys Phe Ile Asn Gly Gln Ala Asn Val Glu Gly 195 200 205Trp Glu Pro Ser Ser Asn Asn Ala Asn Thr Gly Ile Gly Gly His Gly 210 215 220Ser Cys Cys Ser Glu Met Asp Ile Trp Glu Ala Asn Ser Ile Ser Glu225 230 235 240Ala Leu Thr Pro His Pro Cys Thr Thr Val Gly Gln Glu Ile Cys Glu 245 250 255Gly Asp Gly Cys Gly Gly Thr Tyr Ser Asp Asn Arg Tyr Gly Gly Thr 260 265 270Cys Asp Pro Asp Gly Cys Asp Trp Asn Pro Tyr Arg Leu Gly Asn Thr 275 280 285Ser Phe Tyr Gly Pro Gly Ser Ser Phe Thr Leu Asp Thr Thr Lys Lys 290 295 300Leu Thr Val Val Thr Gln Phe Glu Thr Ser Gly Ala Ile Asn Arg Tyr305 310 315 320Tyr Val Gln Asn Gly Val Thr Phe Gln Gln Pro Asn Ala Glu Leu Gly 325 330 335Ser Tyr Ser Gly Asn Glu Leu Asn Asp Asp Tyr Cys Thr Ala Glu Glu 340 345 350Ala Glu Phe Gly Gly Ser Ser Phe Ser Asp Lys Gly Gly Leu Thr Gln 355 360 365Phe Lys Lys Ala Thr Ser Gly Gly Met Val Leu Val Met Ser Leu Trp 370 375 380Asp Asp Tyr Tyr Ala Asn Met Leu Trp Leu Asp Ser Thr Tyr Pro Thr385 390 395 400Asn Glu Thr Ser Ser Thr Pro Gly Ala Val Arg Gly Ser Cys Ser Thr 405 410 415Ser Ser Gly Val Pro Ala Gln Val Glu Ser Gln Ser Pro Asn Ala Lys 420 425 430Val Thr Phe Ser Asn Ile Lys Phe Gly Pro Ile Gly Ser Thr Gly Asn 435 440 445Pro Ser Gly Gly Asn Pro Pro Gly Gly Asn Pro Pro Gly Thr Thr Thr 450 455 460Thr Arg Arg Pro Ala Thr Thr Thr Gly Ser Ser Pro Gly Pro Thr Gln465 470 475 480Ser His Tyr Gly Gln Cys Gly Gly Ile Gly Tyr Ser Gly Pro Thr Val 485 490 495Cys Ala Ser Gly Thr Thr Cys Gln Val Leu Asn Pro Tyr Tyr Ser Gln 500 505 510Cys Leu14459PRTTrichoderma reesei 14Met Ala Pro Ser Val Thr Leu Pro Leu Thr Thr Ala Ile Leu Ala Ile1 5 10 15Ala Arg Leu Val Ala Ala Gln Gln Pro Gly Thr Ser Thr Pro 
Glu Val 20 25 30His Pro Lys Leu Thr Thr Tyr Lys Cys Thr Lys Ser Gly Gly Cys Val 35 40 45Ala Gln Asp Thr Ser Val Val Leu Asp Trp Asn Tyr Arg Trp Met His 50 55 60Asp Ala Asn Tyr Asn Ser Cys Thr Val Asn Gly Gly Val Asn Thr Thr65 70 75 80Leu Cys Pro Asp Glu Ala Thr Cys Gly Lys Asn Cys Phe Ile Glu Gly 85 90 95Val Asp Tyr Ala Ala Ser Gly Val Thr Thr Ser Gly Ser Ser Leu Thr 100 105 110Met Asn Gln Tyr Met Pro Ser Ser Ser Gly Gly Tyr Ser Ser Val Ser 115 120 125Pro Arg Leu Tyr Leu Leu Asp Ser Asp Gly Glu Tyr Val Met Leu Lys 130 135 140Leu Asn Gly Gln Glu Leu Ser Phe Asp Val Asp Leu Ser Ala Leu Pro145 150 155 160Cys Gly Glu Asn Gly Ser Leu Tyr Leu Ser Gln Met Asp Glu Asn Gly 165 170 175Gly Ala Asn Gln Tyr Asn Thr Ala Gly Ala Asn Tyr Gly Ser Gly Tyr 180 185 190Cys Asp Ala Gln Cys Pro Val Gln Thr Trp Arg Asn Gly Thr Leu Asn 195 200 205Thr Ser His Gln Gly Phe Cys Cys Asn Glu Met Asp Ile Leu Glu Gly 210 215 220Asn Ser Arg Ala Asn Ala Leu Thr Pro His Ser Cys Thr Ala Thr Ala225 230 235 240Cys Asp Ser Ala Gly Cys Gly Phe Asn Pro Tyr Gly Ser Gly Tyr Lys 245 250 255Ser Tyr Tyr Gly Pro Gly Asp Thr Val Asp Thr Ser Lys Thr Phe Thr 260 265 270Ile Ile Thr Gln Phe Asn Thr Asp Asn Gly Ser Pro Ser Gly Asn Leu 275 280 285Val Ser Ile Thr Arg Lys Tyr Gln Gln Asn Gly Val Asp Ile Pro Ser 290 295 300Ala Gln Pro Gly Gly Asp Thr Ile Ser Ser Cys Pro Ser Ala Ser Ala305 310 315 320Tyr Gly Gly Leu Ala Thr Met Gly Lys Ala Leu Ser Ser Gly Met Val 325 330 335Leu Val Phe Ser Ile Trp Asn Asp Asn Ser Gln Tyr Met Asn Trp Leu 340 345 350Asp Ser Gly Asn Ala Gly Pro Cys Ser Ser Thr Glu Gly Asn Pro Ser 355 360 365Asn Ile Leu Ala Asn Asn Pro Asn Thr His Val Val Phe Ser Asn Ile 370 375 380Arg Trp Gly Asp Ile Gly Ser Thr Thr Asn Ser Thr Ala Pro Pro Pro385 390 395 400Pro Pro Ala Ser Ser Thr Thr Phe Ser Thr Thr Arg Arg Ser Ser Thr 405 410 415Thr Ser Ser Ser Pro Ser Cys Thr Gln Thr His Trp Gly Gln Cys Gly 420 425 430Gly Ile Gly Tyr Ser Gly Cys Lys Thr Cys Thr Ser Gly Thr Thr Cys 435 440 445Gln Tyr Ser Asn Asp Tyr Tyr Ser 
Gln Cys Leu 450 45515234PRTTrichoderma reesei 15Met Lys Phe Leu Gln Val Leu Pro Ala Leu Ile Pro Ala Ala Leu Ala1 5 10 15Gln Thr Ser Cys Asp Gln Trp Ala Thr Phe Thr Gly Asn Gly Tyr Thr 20 25 30Val Ser Asn Asn Leu Trp Gly Ala Ser Ala Gly Ser Gly Phe Gly Cys 35 40 45Val Thr Ala Val Ser Leu Ser Gly Gly Ala Ser Trp His Ala Asp Trp 50 55 60Gln Trp Ser Gly Gly Gln Asn Asn Val Lys Ser Tyr Gln Asn Ser Gln65 70 75 80Ile Ala Ile Pro Gln Lys Arg Thr Val Asn Ser Ile Ser Ser Met Pro 85 90 95Thr Thr Ala Ser Trp Ser Tyr Ser Gly Ser Asn Ile Arg Ala Asn Val 100 105 110Ala Tyr Asp Leu Phe Thr Ala Ala Asn Pro Asn His Val Thr Tyr Ser 115 120 125Gly Asp Tyr Glu Leu Met Ile Trp Leu Gly Lys Tyr Gly Asp Ile Gly 130 135 140Pro Ile Gly Ser Ser Gln Gly Thr Val Asn Val Gly Gly Gln Ser Trp145 150 155 160Thr Leu Tyr Tyr Gly Tyr Asn Gly Ala Met Gln Val Tyr Ser Phe Val 165 170 175Ala Gln Thr Asn Thr Thr Asn Tyr Ser Gly Asp Val Lys Asn Phe Phe 180 185 190Asn Tyr Leu Arg Asp Asn Lys Gly Tyr Asn Ala Ala Gly Gln Tyr Val 195 200 205Leu Ser Tyr Gln Phe Gly Thr Glu Pro Phe Thr Gly Ser Gly Thr Leu 210 215 220Asn Val Ala Ser Trp Thr Ala Ser Ile Asn225 23016242PRTTrichoderma reesei 16Met Lys Ala Thr Leu Val Leu Gly Ser Leu Ile Val Gly Ala Val Ser1 5 10 15Ala Tyr Lys Ala Thr Thr Thr Arg Tyr Tyr Asp Gly Gln Glu Gly Ala 20 25 30Cys Gly Cys Gly Ser Ser Ser Gly Ala Phe Pro Trp Gln Leu Gly Ile 35 40 45Gly Asn Gly Val Tyr Thr Ala Ala Gly Ser Gln Ala Leu Phe Asp Thr 50 55 60Ala Gly Ala Ser Trp Cys Gly Ala Gly Cys Gly Lys Cys Tyr Gln Leu65 70 75 80Thr Ser Thr Gly Gln Ala Pro Cys Ser Ser Cys Gly Thr Gly Gly Ala 85 90 95Ala Gly Gln Ser Ile Ile Val Met Val Thr Asn Leu Cys Pro Asn Asn 100 105 110Gly Asn Ala Gln Trp Cys Pro Val Val Gly Gly Thr Asn Gln Tyr Gly 115 120 125Tyr Ser Tyr His Phe Asp Ile Met Ala Gln Asn Glu Ile Phe Gly Asp 130 135 140Asn Val Val Val Asp Phe Glu Pro Ile Ala Cys Pro Gly Gln Ala Ala145 150 155 160Ser Asp Trp Gly Thr Cys Leu Cys Val Gly Gln Gln Glu Thr Asp Pro 165 170 175Thr Pro Val Leu Gly Asn Asp 
Thr Gly Ser Thr Pro Pro Gly Ser Ser 180 185 190Pro Pro Ala Thr Ser Ser Ser Pro Pro Ser Gly Gly Gly Gln Gln Thr 195 200 205Leu Tyr Gly Gln Cys Gly Gly Ala Gly Trp Thr Gly Pro Thr Thr Cys 210 215 220Gln Ala Pro Gly Thr Cys Lys Val Gln Asn Gln Trp Tyr Ser Gln Cys225 230 235 240Leu Pro17838PRTTrichoderma reesei 17Met Lys Val Ser Arg Val Leu Ala Leu Val Leu Gly Ala Val Ile Pro1 5 10 15Ala His Ala Ala Phe Ser Trp Lys Asn Val Lys Leu Gly Gly Gly Gly 20 25 30Gly Phe Val Pro Gly Ile Ile Phe His Pro Lys Thr Lys Gly Val Ala 35 40 45Tyr Ala Arg Thr Asp Ile Gly Gly Leu Tyr Arg Leu Asn Ala Asp Asp 50 55 60Ser Trp Thr Ala Val Thr Asp Gly Ile Ala Asp Asn Ala Gly Trp His65 70 75 80Asn Trp Gly Ile Asp Ala Val Ala Leu Asp Pro Gln Asp Asp Gln Lys 85 90 95Val Tyr Ala Ala Val Gly Met Tyr Thr Asn Ser Trp Asp Pro Ser Asn 100 105 110Gly Ala Ile Ile Arg Ser Ser Asp Arg Gly Ala Thr Trp Ser Phe Thr 115 120 125Asn Leu Pro Phe Lys Val Gly Gly Asn Met Pro Gly Arg Gly Ala Gly 130 135 140Glu Arg Leu Ala Val Asp Pro Ala Asn Ser Asn Ile Ile Tyr Phe Gly145 150 155 160Ala Arg Ser Gly Asn Gly Leu Trp Lys Ser Thr Asp Gly Gly Val Thr 165 170 175Phe Ser Lys Val Ser Ser Phe Thr Ala Thr Gly Thr Tyr Ile Pro Asp 180 185 190Pro Ser Asp Ser Asn Gly Tyr Asn Ser Asp Lys Gln Gly Leu Met Trp 195 200 205Val Thr Phe Asp Ser Thr Ser Ser Thr Thr Gly Gly Ala Thr Ser Arg 210 215 220Ile Phe Val Gly Thr Ala Asp Asn Ile Thr Ala Ser Val Tyr Val Ser225 230 235 240Thr Asn Ala Gly Ser Thr Trp Ser Ala Val Pro Gly Gln Pro Gly Lys 245 250 255Tyr Phe Pro His Lys Ala Lys Leu Gln Pro Ala Glu Lys Ala Leu Tyr 260 265 270Leu Thr Tyr Ser Asp Gly Thr Gly Pro Tyr Asp Gly Thr Leu Gly Ser 275 280 285Val Trp Arg Tyr Asp Ile Ala Gly Gly Thr Trp Lys Asp Ile Thr Pro 290 295 300Val Ser Gly Ser Asp Leu Tyr Phe Gly Phe Gly Gly Leu Gly Leu Asp305 310 315 320Leu Gln Lys Pro Gly Thr Leu Val Val Ala Ser Leu Asn Ser Trp Trp 325 330 335Pro Asp Ala Gln Leu Phe Arg Ser Thr Asp Ser Gly Thr Thr Trp Ser 340 345 350Pro Ile Trp Ala Trp Ala Ser Tyr Pro Thr Glu 
Thr Tyr Tyr Tyr Ser 355 360 365Ile Ser Thr Pro Lys Ala Pro Trp Ile Lys Asn Asn Phe Ile Asp Val 370 375 380Thr Ser Glu Ser Pro Ser Asp Gly Leu Ile Lys Arg Leu Gly Trp Met385 390 395 400Ile Glu Ser Leu Glu Ile Asp Pro Thr Asp Ser Asn His Trp Leu Tyr 405 410 415Gly Thr Gly Met Thr Ile Phe Gly Gly His Asp Leu Thr Asn Trp Asp 420 425 430Thr Arg His Asn Val Ser Ile Gln Ser Leu Ala Asp Gly Ile Glu Glu 435 440 445Phe Ser Val Gln Asp Leu Ala Ser Ala Pro Gly Gly Ser Glu Leu Leu 450 455 460Ala Ala Val Gly Asp Asp Asn Gly Phe Thr Phe Ala Ser Arg Asn Asp465 470 475 480Leu Gly Thr Ser Pro Gln Thr Val Trp Ala Thr Pro Thr Trp Ala Thr 485 490 495Ser Thr Ser Val Asp Tyr Ala Gly Asn Ser Val Lys Ser Val Val Arg 500 505 510Val Gly Asn Thr Ala Gly Thr Gln Gln Val Ala Ile Ser Ser Asp Gly 515 520 525Gly Ala Thr Trp Ser Ile Asp Tyr Ala Ala Asp Thr Ser Met Asn Gly 530 535 540Gly Thr Val Ala Tyr Ser Ala Asp Gly Asp Thr Ile Leu Trp Ser Thr545 550 555 560Ala Ser Ser Gly Val Gln Arg Ser Gln Phe Gln Gly Ser Phe Ala Ser 565 570 575Val Ser Ser Leu Pro Ala Gly Ala Val Ile Ala Ser Asp Lys Lys Thr 580 585 590Asn Ser Val Phe Tyr Ala Gly Ser Gly Ser Thr Phe Tyr Val Ser Lys 595 600 605Asp Thr Gly Ser Ser Phe Thr Arg Gly Pro Lys Leu Gly Ser Ala Gly 610 615 620Thr Ile Arg Asp Ile Ala Ala His Pro Thr Thr Ala Gly Thr Leu Tyr625 630 635 640Val Ser Thr Asp Val Gly Ile Phe Arg Ser Thr Asp Ser Gly Thr Thr 645 650 655Phe Gly Gln Val Ser Thr Ala Leu Thr Asn Thr Tyr Gln Ile Ala Leu 660 665 670Gly Val Gly Ser Gly Ser Asn Trp Asn Leu Tyr Ala Phe Gly Thr Gly 675 680 685Pro Ser Gly Ala Arg Leu Tyr Ala Ser Gly Asp Ser Gly Ala Ser Trp 690 695 700Thr Asp Ile Gln Gly

Ser Gln Gly Phe Gly Ser Ile Asp Ser Thr Lys705 710 715 720Val Ala Gly Ser Gly Ser Thr Ala Gly Gln Val Tyr Val Gly Thr Asn 725 730 735Gly Arg Gly Val Phe Tyr Ala Gln Gly Thr Val Gly Gly Gly Thr Gly 740 745 750Gly Thr Ser Ser Ser Thr Lys Gln Ser Ser Ser Ser Thr Ser Ser Ala 755 760 765Ser Ser Ser Thr Thr Leu Arg Ser Ser Val Val Ser Thr Thr Arg Ala 770 775 780Ser Thr Val Thr Ser Ser Arg Thr Ser Ser Ala Ala Gly Pro Thr Gly785 790 795 800Ser Gly Val Ala Gly His Tyr Ala Gln Cys Gly Gly Ile Gly Trp Thr 805 810 815Gly Pro Thr Gln Cys Val Ala Pro Tyr Val Cys Gln Lys Gln Asn Asp 820 825 830Tyr Tyr Tyr Gln Cys Val 83518373PRTPodospora anserina 18Met Lys Gly Leu Phe Ala Phe Gly Leu Gly Leu Leu Ser Leu Val Asn1 5 10 15Ala Leu Pro Gln Ala Gln Gly Gly Gly Ala Ala Ala Ser Ala Lys Val 20 25 30Ser Gly Thr Arg Phe Val Ile Asp Gly Lys Thr Gly Tyr Phe Ala Gly 35 40 45Thr Asn Ser Tyr Trp Ile Gly Phe Leu Thr Asn Asn Arg Asp Val Asp 50 55 60Thr Thr Leu Asp His Ile Ala Ser Ser Gly Leu Lys Ile Leu Arg Val65 70 75 80Trp Gly Phe Asn Asp Val Asn Asn Gln Pro Ser Gly Asn Thr Val Trp 85 90 95Phe Gln Arg Leu Ala Ser Ser Gly Ser Gln Ile Asn Thr Gly Pro Asn 100 105 110Gly Leu Gln Arg Leu Asp Tyr Leu Val Arg Ser Ala Glu Thr Arg Gly 115 120 125Ile Lys Leu Ile Ile Ala Leu Val Asn Tyr Trp Asp Asp Phe Gly Gly 130 135 140Met Lys Ala Tyr Val Asn Ala Phe Gly Gly Thr Lys Glu Ser Trp Tyr145 150 155 160Thr Asn Ala Arg Ala Gln Glu Gln Tyr Lys Arg Tyr Ile Gln Ala Val 165 170 175Val Ser Arg Tyr Val Asn Ser Pro Ala Ile Phe Ala Trp Glu Leu Ala 180 185 190Asn Glu Pro Arg Cys Lys Gly Cys Asn Thr Asn Val Ile Phe Asn Trp 195 200 205Ala Thr Gln Ile Ser Asp Tyr Ile Arg Ser Leu Asp Lys Asp His Leu 210 215 220Ile Thr Leu Gly Asp Glu Gly Phe Gly Leu Pro Gly Gln Thr Thr Tyr225 230 235 240Pro Tyr Gln Tyr Gly Glu Gly Thr Asp Phe Val Lys Asn Leu Gln Ile 245 250 255Lys Asn Leu Asp Phe Gly Thr Phe His Met Tyr Pro Gly His Trp Gly 260 265 270Val Pro Thr Ser Phe Gly Pro Gly Trp Ile Lys Asp His Ala Ala Ala 275 280 285Cys Arg Ala Ala 
Gly Lys Pro Cys Leu Leu Glu Glu Tyr Gly Tyr Glu 290 295 300Ser Asp Arg Cys Asn Val Gln Lys Gly Trp Gln Gln Ala Ser Arg Glu305 310 315 320Leu Ser Arg Asp Gly Met Ser Gly Asp Leu Phe Trp Gln Trp Gly Asp 325 330 335Gln Leu Ser Thr Gly Gln Thr His Asn Asp Gly Phe Thr Ile Tyr Tyr 340 345 350Gly Ser Ser Leu Ala Thr Cys Leu Val Thr Asp His Val Arg Ala Ile 355 360 365Asn Ala Leu Pro Ala 37019469PRTPodospora anserina 19Met Val Lys Leu Leu Asp Ile Gly Leu Phe Ala Leu Ala Leu Ala Ser1 5 10 15Ser Ala Val Ala Lys Pro Cys Lys Pro Arg Asp Gly Pro Val Thr Tyr 20 25 30Glu Ala Glu Asp Ala Ile Leu Thr Gly Thr Thr Val Asp Thr Ala Gln 35 40 45Val Gly Tyr Thr Gly Arg Gly Tyr Val Thr Gly Phe Asp Glu Gly Ser 50 55 60Asp Lys Ile Thr Phe Gln Ile Ser Ser Ala Thr Thr Lys Leu Tyr Asp65 70 75 80Leu Ser Ile Arg Tyr Ala Ala Ile Tyr Gly Asp Lys Arg Thr Asn Val 85 90 95Val Leu Asn Asn Gly Ala Val Ser Glu Val Phe Phe Pro Ala Gly Asp 100 105 110Ser Phe Thr Ser Val Ala Ala Gly Gln Val Leu Leu Asn Ala Gly Gln 115 120 125Asn Thr Ile Asp Ile Val Asn Asn Trp Gly Trp Tyr Leu Ile Asp Ser 130 135 140Ile Thr Leu Thr Pro Ser Ala Pro Arg Pro Pro His Asp Ile Asn Pro145 150 155 160Asn Leu Asn Asn Pro Asn Ala Asp Thr Asn Ala Lys Lys Leu Tyr Ser 165 170 175Tyr Leu Arg Ser Val Tyr Gly Asn Lys Ile Ile Ser Gly Gln Gln Glu 180 185 190Leu His His Ala Glu Trp Ile Arg Gln Gln Thr Gly Lys Thr Pro Ala 195 200 205Leu Val Ala Val Asp Leu Met Asp Tyr Ser Pro Ser Arg Val Glu Arg 210 215 220Gly Thr Thr Ser His Ala Val Glu Asp Ala Ile Ala His His Asn Ala225 230 235 240Gly Gly Ile Val Ser Val Leu Trp His Trp Asn Ala Pro Val Gly Leu 245 250 255Tyr Asp Thr Glu Glu Asn Lys Trp Trp Ser Gly Phe Tyr Thr Arg Ala 260 265 270Thr Asp Phe Asp Ile Ala Ala Thr Leu Ala Asn Pro Gln Gly Ala Asn 275 280 285Tyr Thr Leu Leu Ile Arg Asp Ile Asp Ala Ile Ala Val Gln Leu Lys 290 295 300Arg Leu Glu Ala Ala Gly Val Pro Val Leu Trp Arg Pro Leu His Glu305 310 315 320Ala Glu Gly Gly Trp Phe Trp Trp Gly Ala Lys Gly Pro Glu Pro Ala 325 330 335Lys Gln Leu Trp 
Asp Ile Leu Tyr Glu Arg Leu Thr Val His His Gly 340 345 350Leu Asp Asn Leu Ile Trp Val Trp Asn Ser Ile Leu Glu Asp Trp Tyr 355 360 365Pro Gly Asp Asp Thr Val Asp Ile Leu Ser Ala Asp Val Tyr Ala Gln 370 375 380Gly Asn Gly Pro Met Ser Thr Gln Tyr Asn Glu Leu Ile Ala Leu Gly385 390 395 400Arg Asp Lys Lys Met Ile Ala Ala Ala Glu Val Gly Ala Ala Pro Leu 405 410 415Pro Gly Leu Leu Gln Ala Tyr Gln Ala Asn Trp Leu Trp Phe Ala Val 420 425 430Trp Gly Asp Asp Phe Ile Asn Asn Pro Ser Trp Asn Thr Val Ala Val 435 440 445Leu Asn Glu Ile Tyr Asn Ser Asp Tyr Val Leu Thr Leu Asp Glu Ile 450 455 460Gln Gly Trp Arg Ser46520493PRTTrichoderma reesei 20Met Ala Gly Lys Leu Ile Leu Val Ala Leu Ala Ser Leu Val Ser Leu1 5 10 15Ser Ile Gln Gln Asn Cys Ala Ala Leu Phe Gly Gln Cys Gly Gly Ile 20 25 30Gly Trp Ser Gly Thr Thr Cys Cys Val Ala Gly Ala Gln Cys Ser Phe 35 40 45Val Asn Asp Trp Tyr Ser Gln Cys Leu Ala Ser Thr Gly Gly Asn Pro 50 55 60Pro Asn Gly Thr Thr Ser Ser Ser Leu Val Ser Arg Thr Ser Ser Ala65 70 75 80Ser Ser Ser Val Gly Ser Ser Ser Pro Gly Gly Asn Ser Pro Thr Gly 85 90 95Ser Ala Ser Thr Tyr Thr Thr Thr Asp Thr Ala Thr Val Ala Pro His 100 105 110Ser Gln Ser Pro Tyr Pro Ser Ile Ala Ala Ser Ser Cys Gly Ser Trp 115 120 125Thr Leu Val Asp Asn Val Cys Cys Pro Ser Tyr Cys Ala Asn Asp Asp 130 135 140Thr Ser Glu Ser Cys Ser Gly Cys Gly Thr Cys Thr Thr Pro Pro Ser145 150 155 160Ala Asp Cys Lys Ser Gly Thr Met Tyr Pro Glu Val His His Val Ser 165 170 175Ser Asn Glu Ser Trp His Tyr Ser Arg Ser Thr His Phe Gly Leu Thr 180 185 190Ser Gly Gly Ala Cys Gly Phe Gly Leu Tyr Gly Leu Cys Thr Lys Gly 195 200 205Ser Val Thr Ala Ser Trp Thr Asp Pro Met Leu Gly Ala Thr Cys Asp 210 215 220Ala Phe Cys Thr Ala Tyr Pro Leu Leu Cys Lys Asp Pro Thr Gly Thr225 230 235 240Thr Leu Arg Gly Asn Phe Ala Ala Pro Asn Gly Asp Tyr Tyr Thr Gln 245 250 255Phe Trp Ser Ser Leu Pro Gly Ala Leu Asp Asn Tyr Leu Ser Cys Gly 260 265 270Glu Cys Ile Glu Leu Ile Gln Thr Lys Pro Asp Gly Thr Asp Tyr Ala 275 280 285Val Gly Glu Ala Gly 
Tyr Thr Asp Pro Ile Thr Leu Glu Ile Val Asp 290 295 300
Ser Cys Pro Cys Ser Ala Asn Ser Lys Trp Cys Cys Gly Pro Gly Ala 305 310 315 320
Asp His Cys Gly Glu Ile Asp Phe Lys Tyr Gly Cys Pro Leu Pro Ala 325 330 335
Asp Ser Ile His Leu Asp Leu Ser Asp Ile Ala Met Gly Arg Leu Gln 340 345 350
Gly Asn Gly Ser Leu Thr Asn Gly Val Ile Pro Thr Arg Tyr Arg Arg 355 360 365
Val Gln Cys Pro Lys Val Gly Asn Ala Tyr Ile Trp Leu Arg Asn Gly 370 375 380
Gly Gly Pro Tyr Tyr Phe Ala Leu Thr Ala Val Asn Thr Asn Gly Pro 385 390 395 400
Gly Ser Val Thr Lys Ile Glu Ile Lys Gly Ala Asp Thr Asp Asn Trp 405 410 415
Val Ala Leu Val His Asp Pro Asn Tyr Thr Ser Ser Arg Pro Gln Glu 420 425 430
Arg Tyr Gly Ser Trp Val Ile Pro Gln Gly Ser Gly Pro Phe Asn Leu 435 440 445
Pro Val Gly Ile Arg Leu Thr Ser Pro Thr Gly Glu Gln Ile Val Asn 450 455 460
Glu Gln Ala Ile Lys Thr Phe Thr Pro Pro Ala Thr Gly Asp Pro Asn 465 470 475 480
Phe Tyr Tyr Ile Asp Ile Gly Val Gln Phe Ser Gln Asn 485 490

SEQ ID NO: 21; Length: 8; Type: PRT; Organism: Artificial Sequence; Description of Artificial Sequence: Synthetic 8xHis tag
Sequence 21:
His His His His His His His His 1 5

SEQ ID NO: 22; Length: 6; Type: PRT; Organism: Artificial Sequence; Description of Artificial Sequence: Synthetic 6xHis tag
Sequence 22:
His His His His His His 1 5

SEQ ID NO: 23; Length: 4; Type: PRT; Organism: Artificial Sequence; Description of Artificial Sequence: Synthetic peptide
Sequence 23:
Gly Ser Gly Ser 1

* * * * *
