U.S. patent application number 15/502322 was published by the patent office on 2017-08-10 for biological processes for the production of aryl sulfates.
The applicant listed for this patent is Danmarks Tekniske Universitet. Invention is credited to Christian Bille Jendresen, Alex Toftgaard Nielsen.
Application Number | 15/502322 |
Publication Number | 20170226543 |
Family ID | 51383672 |
Publication Date | 2017-08-10 |
United States Patent Application | 20170226543 |
Kind Code | A1 |
Jendresen; Christian Bille; et al. | August 10, 2017 |
BIOLOGICAL PROCESSES FOR THE PRODUCTION OF ARYL SULFATES
Abstract
The present invention generally relates to the field of
biotechnology as it applies to the production of aryl sulfates
using polypeptides or recombinant cells comprising said
polypeptides. More particularly, the present invention pertains to
polypeptides having aryl sulfotransferase activity, recombinant
host cells expressing same and processes for the production of aryl
sulfates employing these polypeptides or recombinant host
cells.
Inventors: | Jendresen; Christian Bille (Copenhagen O, DK); Nielsen; Alex Toftgaard (Rungsted Kyst, DK) |
Applicant: |
Name | City | State | Country | Type |
Danmarks Tekniske Universitet | Kgs. Lyngby | | DK | |
Family ID: | 51383672 |
Appl. No.: | 15/502322 |
Filed: | August 21, 2015 |
PCT Filed: | August 21, 2015 |
PCT NO: | PCT/EP2015/069298 |
371 Date: | February 7, 2017 |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12N 1/20 20130101; C12P 11/00 20130101; C12N 9/1229 20130101; C12N 15/67 20130101; C12N 9/1205 20130101; C12N 9/13 20130101; C12N 9/10 20130101; C12N 15/09 20130101; C12N 9/88 20130101; C12N 9/16 20130101 |
International Class: | C12P 11/00 20060101 C12P011/00; C12N 15/67 20060101 C12N015/67; C12N 1/20 20060101 C12N001/20; C12N 15/09 20060101 C12N015/09; C12N 9/10 20060101 C12N009/10; C12N 9/88 20060101 C12N009/88 |
Foreign Application Data
Date | Code | Application Number |
Aug 22, 2014 | EP | 14182032.4 |
Claims
1. A process for the production of a sulfated phenolic compound
comprising: (i') contacting a medium comprising a phenolic compound
with a first recombinant host cell, wherein the first recombinant
host cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity; or (i'') contacting a medium comprising
a fermentable carbon substrate with a first recombinant host cell;
wherein the first recombinant host cell comprises a heterologous
polypeptide having an aryl sulfotransferase activity; or (i''')
contacting a medium comprising a precursor of a phenolic compound
with a first recombinant host cell, wherein the first recombinant
host cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity.
2-38. (canceled)
39. The process according to claim 1, further comprising: (ii)
culturing the first recombinant host cell under suitable conditions
for the production of the corresponding sulfated phenolic compound;
and (iii) optionally, recovering said sulfated phenolic
compound.
40. The process according to claim 1, wherein the heterologous
polypeptide having an aryl sulfotransferase activity is a
sulfotransferase 1A1 enzyme.
41. The process according to claim 1, wherein the heterologous
polypeptide having an aryl sulfotransferase activity is selected
from the group consisting of: a) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12 or 13; b) a polypeptide comprising an amino acid
sequence which has at least about 70% sequence identity to the
amino acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12 or 13, wherein the polypeptide has aryl
sulfotransferase activity; and c) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12 or 13, wherein 1 to 50 amino acid residues are
substituted, deleted and/or inserted, wherein the polypeptide has
aryl sulfotransferase activity.
42. The process according to claim 1, wherein the heterologous
polypeptide is selected from the group consisting of: a) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1; b) a polypeptide comprising an amino acid sequence which has
at least about 70% sequence identity to the amino acid sequence set
forth in SEQ ID NO: 1, wherein the polypeptide has aryl
sulfotransferase activity; and c) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 1, wherein 1 to 50 amino acid
residues are substituted, deleted and/or inserted, wherein the
polypeptide has aryl sulfotransferase activity.
43. The process according to claim 1, wherein the first recombinant
host cell has been further modified to have an increased expression
of an ATP sulfurylase protein as compared to an identical host cell
that does not carry said modification.
44. The process according to claim 43, wherein the ATP sulfurylase
is encoded by the genes cysD and cysN.
45. The process according to claim 1, wherein said first
recombinant host cell has been further modified to have an
increased expression of an APS kinase protein as compared to an
identical host cell that does not carry said modification.
46. The process according to claim 45, wherein said APS kinase
is encoded by the gene cysC.
47. The process according to claim 1, wherein said first
recombinant host cell has been further modified to have an
increased expression of a PAP phosphatase protein as compared to an
identical host cell that does not carry said modification.
48. The process according to claim 47, wherein said PAP phosphatase
is encoded by the gene cysQ.
49. The process according to claim 1, wherein said first
recombinant host cell further comprises a heterologous polypeptide
having a tyrosine ammonia lyase activity.
50. The process according to claim 1, wherein in step (i'), (i'')
or (i''') the medium is further contacted with a second recombinant
host cell comprising a heterologous polypeptide having a tyrosine
ammonia lyase activity.
51. The process according to claim 49, wherein the heterologous
polypeptide having a tyrosine ammonia lyase activity is selected
from the group consisting of: a) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20,
21, 22 or 23; b) a polypeptide comprising an amino acid sequence
which has at least about 70% sequence identity to the amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23, wherein the polypeptide has tyrosine ammonia lyase activity;
and c) a polypeptide comprising an amino acid sequence set forth in
SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23, wherein 1 to
50 amino acid residues are substituted, deleted and/or inserted,
wherein the polypeptide has tyrosine ammonia lyase activity.
52. The process according to claim 49, wherein the heterologous
polypeptide having a tyrosine ammonia lyase activity is selected
from the group consisting of: a) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 14; b) a polypeptide
comprising an amino acid sequence which has at least about 70%
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14, wherein the polypeptide has tyrosine ammonia lyase
activity; and c) a polypeptide comprising an amino acid sequence
set forth in SEQ ID NO: 14, wherein 1 to 50 amino acid residues are
substituted, deleted and/or inserted, wherein the polypeptide has
tyrosine ammonia lyase activity.
53. The process according to claim 1, wherein the phenolic compound
is represented by the general formula (I): ##STR00009## wherein at
least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4, or R.sub.5 is a
hydroxyl group (--OH); wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4,
R.sub.5 and R.sub.6 are independently selected from the group
consisting of: halide, hydrogen, hydroxyl (--OH), --OR.sub.7,
--OCOR.sub.7, --NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7,
--SR.sub.7, --OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8,
alkyl, alkenyl, alkynyl, aryl, and heteroaryl, wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl; wherein
R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and R.sub.6, are
optionally linked with a bridge member Y.sub.n thereby forming one
or more rings, Y.sub.n being a bond or a C.sub.1-12 alkyl or an
aryl, a carbocyclic, a heterocyclic or a heteroaromatic structure
having 1-3 rings, 3-8 ring members in each and 0 to 4 heteroatoms,
or a heteroalkyl comprising 1 to 12 heteroatoms selected from the
group consisting of N, O, S, S(O).sub.1-2 and carbonyl, and wherein
n is an integer between 1 and 12.
54. The process according to claim 1, wherein the phenolic compound
is represented by the general formula (II): ##STR00010## wherein at
least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4, or R.sub.5 is a
hydroxyl group (--OH); wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4,
R.sub.5 and R.sub.6 are independently selected from the group
consisting of halide, hydrogen, hydroxyl (--OH), --OR.sub.7,
--OCOR.sub.7, --NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7,
--SR.sub.7, --OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8,
alkyl, alkenyl, alkynyl, aryl, and heteroaryl, wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl; wherein
R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and R.sub.6, are
optionally linked with a bridge member Y.sub.n, thereby forming one
or more rings, Y.sub.n being a bond or a C.sub.1-12 alkyl or an
aryl, a carbocyclic, a heterocyclic or a heteroaromatic structure
having 1-3 rings, 3-8 ring members in each and 0 to 4 heteroatoms,
or a heteroalkyl comprising 1 to 12 heteroatoms selected from the
group consisting of N, O, S, S(O).sub.1-2 and carbonyl, and wherein
n is an integer between 1 and 12.
55. The process according to claim 1, wherein the precursor of a
phenolic compound in step (i''') is a compound of the general
Formula (p-I): ##STR00011## wherein at least one of R.sub.1,
R.sub.2, R.sub.3, R.sub.4, or R.sub.5 is a hydroxyl group (--OH);
wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and R.sub.6 are
independently selected from the group consisting of halide,
hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl; wherein
R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and R.sub.6, are
optionally linked with a bridge member Y.sub.n, thereby forming one
or more rings, Y.sub.n being a bond or a C.sub.1-12 alkyl or an
aryl, a carbocyclic, a heterocyclic or a heteroaromatic structure
having 1-3 rings, 3-8 ring members in each and 0 to 4 heteroatoms,
or a heteroalkyl comprising 1 to 12 heteroatoms selected from the
group consisting of N, O, S, S(O).sub.1-2 and carbonyl, and wherein
n is an integer between 1 and 12.
56. The process according to claim 53, wherein R.sub.6 is
--COOR.sub.7 and R.sub.7 is hydrogen.
57. The process according to claim 53, wherein R.sub.3 is hydroxyl
(--OH).
58. The process according to claim 53, wherein each of R.sub.1,
R.sub.2, R.sub.4 and R.sub.5 is hydrogen, R.sub.3 is hydroxyl
(--OH) and R.sub.6 is --COOH.
59. A recombinant host cell comprising a first heterologous
polypeptide having aryl sulfotransferase activity, such as a
polypeptide selected from the group consisting of: a) a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO: 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12 or 13; b) a polypeptide comprising an
amino acid sequence which has at least about 70% sequence identity
to the amino acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12 or 13, wherein the polypeptide has aryl
sulfotransferase activity; and c) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12 or 13, wherein 1 to 50 amino acid residues are
substituted, deleted and/or inserted, wherein the polypeptide has
aryl sulfotransferase activity.
60. The recombinant host cell according to claim 59, wherein the
heterologous polypeptide is selected from the group consisting of:
a) a polypeptide comprising an amino acid sequence set forth in SEQ
ID NO: 1; b) a polypeptide comprising an amino acid sequence which
has at least about 70% sequence identity to the amino acid sequence
set forth in SEQ ID NO: 1, wherein the polypeptide has aryl
sulfotransferase activity; and c) a polypeptide comprising an amino
acid sequence set forth in SEQ ID NO: 1, wherein 1 to 50 amino acid
residues are substituted, deleted and/or inserted, wherein the
polypeptide has aryl sulfotransferase activity.
61. The recombinant host cell according to claim 59, wherein the
recombinant host cell has further been modified to have an
increased protein expression of an ATP sulfurylase compared to an
identical host cell that does not carry said modification.
62. The recombinant host cell according to claim 61, wherein said
ATP sulfurylase is encoded by the genes cysD and cysN.
63. The recombinant host cell according to claim 59, wherein the
recombinant host cell has further been modified to have an
increased expression of an APS kinase protein as compared to an
identical host cell that does not carry said modification.
64. The recombinant host cell according to claim 63, wherein said
APS kinase is encoded by the gene cysC.
65. The recombinant host cell according to claim 59, wherein the
recombinant host cell has further been modified to have an
increased expression of a PAP phosphatase protein as compared to an
identical host cell that does not carry said modification.
66. The recombinant host cell according to claim 65, wherein said
PAP phosphatase is encoded by the gene cysQ.
67. The recombinant host cell according to claim 59, further
comprising a second heterologous polypeptide having tyrosine
ammonia lyase activity, wherein said polypeptide is selected from
the group consisting of: a) a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23; b) a polypeptide comprising an amino acid sequence which has
at least about 70% sequence identity to the amino acid sequence set
forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23,
wherein the polypeptide has tyrosine ammonia lyase activity; and c)
a polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23, wherein 1 to 50 amino
acid residues are substituted, deleted and/or inserted, wherein the
polypeptide has tyrosine ammonia lyase activity.
68. The recombinant host cell according to claim 59, further
comprising a second heterologous polypeptide having tyrosine
ammonia lyase activity, wherein the heterologous polypeptide is
selected from the group consisting of: a) a polypeptide comprising
an amino acid sequence set forth in SEQ ID NO: 14; b) a polypeptide
comprising an amino acid sequence which has at least about 70%
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14, wherein the polypeptide has tyrosine ammonia lyase
activity; and c) a polypeptide comprising an amino acid sequence
set forth in SEQ ID NO: 14, wherein 1 to 50 amino acid residues are
substituted, deleted and/or inserted, wherein the polypeptide has
tyrosine ammonia lyase activity.
69. The recombinant host cell according to claim 59, which is
employed as first recombinant host cell in the process according to
claim 1.
70. A process for the production of a sulfated phenolic compound,
such as zosteric acid, the process comprising sulfating a phenolic
compound, such as p-coumaric acid, using a polypeptide having aryl
sulfotransferase activity, such as a polypeptide selected from the
group consisting of: a) a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12 or 13; b) a polypeptide comprising an amino acid sequence which
has at least about 70% sequence identity to the amino acid sequence
set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or
13, wherein the polypeptide has aryl sulfotransferase activity; and
c) a polypeptide comprising an amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13, wherein 1 to 50
amino acid residues are substituted, deleted and/or inserted,
wherein the polypeptide has aryl sulfotransferase activity.
71. The process according to claim 70, wherein the polypeptide is
selected from the group consisting of: a) a polypeptide comprising
an amino acid sequence set forth in SEQ ID NO: 1; b) a polypeptide
comprising an amino acid sequence which has at least about 70%
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 1, wherein the polypeptide has aryl sulfotransferase activity;
and c) a polypeptide comprising an amino acid sequence set forth in
SEQ ID NO: 1, wherein 1 to 50 amino acid residues are substituted,
deleted and/or inserted, wherein the polypeptide has aryl
sulfotransferase activity.
72. A composition comprising: a first recombinant host cell
comprising a heterologous polypeptide having aryl sulfotransferase
activity, such as a polypeptide selected from the group consisting
of: a) a polypeptide comprising an amino acid sequence set forth in
SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13; b) a
polypeptide comprising an amino acid sequence which has at least
about 70% sequence identity to the amino acid sequence set forth in
SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13, wherein the
polypeptide has aryl sulfotransferase activity; and c) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13, wherein 1 to 50
amino acid residues are substituted, deleted and/or inserted,
wherein the polypeptide has aryl sulfotransferase activity; and a
second recombinant host cell comprising a heterologous polypeptide
having tyrosine ammonia lyase activity, such as a polypeptide
selected from the group consisting of: a) a polypeptide comprising
an amino acid sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18,
19, 20, 21, 22 or 23; b) a polypeptide comprising an amino acid
sequence which has at least about 70% sequence identity to the
amino acid sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19,
20, 21, 22 or 23, wherein the polypeptide has tyrosine ammonia
lyase activity; and c) a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23, wherein 1 to 50 amino acid residues are substituted, deleted
and/or inserted, wherein the polypeptide has tyrosine ammonia lyase
activity.
73. The composition according to claim 72, comprising: a first
recombinant host cell comprising a heterologous polypeptide having
aryl sulfotransferase activity selected from the group consisting
of: a) a polypeptide comprising an amino acid sequence set forth in
SEQ ID NO: 1; b) a polypeptide comprising an amino acid sequence
which has at least about 70% sequence identity to the amino acid
sequence set forth in SEQ ID NO: 1, wherein the polypeptide has
aryl sulfotransferase activity; and c) a polypeptide comprising an
amino acid sequence set forth in SEQ ID NO: 1, wherein 1 to 50
amino acid residues are substituted, deleted and/or inserted,
wherein the polypeptide has aryl sulfotransferase activity; and a
second recombinant host cell comprising a heterologous polypeptide
having tyrosine ammonia lyase activity selected from the group
consisting of: a) a polypeptide comprising an amino acid sequence
set forth in SEQ ID NO: 14; b) a polypeptide comprising an amino
acid sequence which has at least about 70% sequence identity to the
amino acid sequence set forth in SEQ ID NO: 14, wherein the
polypeptide has tyrosine ammonia lyase activity; and c) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, wherein 1 to 50 amino acid residues are substituted,
deleted and/or inserted, wherein the polypeptide has tyrosine
ammonia lyase activity.
74. A composition comprising: a first polypeptide having aryl
sulfotransferase activity, such as a polypeptide selected from the
group consisting of: a) a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12 or 13; b) a polypeptide comprising an amino acid sequence which
has at least about 70% sequence identity to the amino acid sequence
set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or
13, wherein the polypeptide has aryl sulfotransferase activity; and
c) a polypeptide comprising an amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13, wherein 1 to 50
amino acid residues are substituted, deleted and/or inserted,
wherein the polypeptide has aryl sulfotransferase activity; and a
second polypeptide having tyrosine ammonia lyase activity, such as
a polypeptide selected from the group consisting of: a) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23; b) a polypeptide
comprising an amino acid sequence which has at least about 70%
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23, wherein the
polypeptide has tyrosine ammonia lyase activity; and c) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23, wherein 1 to 50 amino
acid residues are substituted, deleted and/or inserted, wherein the
polypeptide has tyrosine ammonia lyase activity.
75. The composition according to claim 74, comprising: a first
polypeptide having aryl sulfotransferase activity selected from the
group consisting of: a) a polypeptide comprising an amino acid
sequence set forth in SEQ ID NO: 1; b) a polypeptide comprising an
amino acid sequence which has at least about 70% sequence identity
to the amino acid sequence set forth in SEQ ID NO: 1, wherein the
polypeptide has aryl sulfotransferase activity; and c) a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1, wherein 1 to 50 amino acid residues are substituted, deleted
and/or inserted, wherein the polypeptide has aryl sulfotransferase
activity; and a second polypeptide having tyrosine ammonia lyase
activity selected from the group consisting of: a) a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO: 14; b) a
polypeptide comprising an amino acid sequence which has at least
about 70% sequence identity to the amino acid sequence set forth in
SEQ ID NO: 14, wherein the polypeptide has tyrosine ammonia lyase
activity; and c) a polypeptide comprising an amino acid sequence
set forth in SEQ ID NO: 14, wherein 1 to 50 amino acid residues are
substituted, deleted and/or inserted, wherein the polypeptide has
tyrosine ammonia lyase activity.
Description
TECHNICAL FIELD OF THE INVENTION
[0001] The present invention generally relates to the field of
biotechnology as it applies to the production of aryl sulfates
using polypeptides or recombinant cells comprising said
polypeptides. More particularly, the present invention pertains to
polypeptides having aryl sulfotransferase activity, recombinant
host cells expressing same and processes for the production of aryl
sulfates employing these polypeptides or recombinant host
cells.
BACKGROUND OF THE INVENTION
[0002] A range of phenolic compounds are of great interest to the
biotech industry since they are building blocks for polymeric
compounds. Examples of such phenolic compounds include p-coumaric
acid (pHCA) or other hydroxycinnamic acids which form the basis for
many secondary metabolites including flavonoids and stilbenes.
However, many of these phenolic compounds are toxic to producing
organisms, and thus limit the productivity during fermentation.
Hence, there is a need for large scale production processes, and
especially for biological large scale production processes allowing
improved productivity.
[0003] Moreover, a range of phenolic compounds, and especially
those used as drugs or food additives such as resveratrol or
vanillin, show poor solubility in water which makes it difficult
for these compounds to be taken up by the body. Hence, there is also
a need for providing such phenolic compounds in a form which
improves the solubility, and hence bioavailability, preferably by
using biological large scale production processes.
SUMMARY OF THE INVENTION
[0004] The object of the present invention is to provide a method
for large scale production of aryl sulfates. Furthermore, it is an
object to provide a biological process for the large scale
production of phenols. The inventors have developed a biological
process that achieves both objects.
[0005] The present invention thus provides in a first aspect a
process for the production of a sulfated phenolic compound
comprising:
[0006] (i') contacting a medium comprising a phenolic compound with
a first recombinant host cell; wherein the first recombinant host
cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity; or
[0007] (i'') contacting a medium comprising a fermentable carbon
substrate with a first recombinant host cell; wherein the first
recombinant host cell comprises a heterologous polypeptide having
an aryl sulfotransferase activity; or
[0008] (i''') contacting a medium comprising a precursor of a
phenolic compound with a first recombinant host cell; wherein the
first recombinant host cell comprises a heterologous polypeptide
having an aryl sulfotransferase activity.
[0009] The present invention provides in a further aspect a process
for the production of a sulfated phenolic compound, such as
zosteric acid, the process comprising sulfating a phenolic compound,
such as p-coumaric acid, using a polypeptide as detailed herein.
Particularly, the process involves the use of a polypeptide having
an aryl sulfotransferase activity, such as a polypeptide selected
from the group consisting of:
[0010] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0011] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0012] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
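The two variant criteria above (a minimum percent sequence identity, or a bounded number of substituted, deleted and/or inserted residues) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from this passage: sequences are aligned globally with unit costs for every substitution and gap, percent identity is computed over the full alignment length, "about 70%" is treated as a hard 70.0 cutoff, and the function names and toy peptides are hypothetical. Practical comparisons would normally use a substitution matrix and gap penalties from an established alignment tool.

```python
from typing import Tuple

GAP = "-"


def align(a: str, b: str) -> Tuple[str, str]:
    """Globally align two sequences with unit edit costs (assumed scoring)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal number of edits turning a[:i] into b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1])  # residue present only in a (deletion)
            out_b.append(GAP)
            i -= 1
        else:
            out_a.append(GAP)  # residue present only in b (insertion)
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical alignment columns as a percentage of the alignment length."""
    x, y = align(a, b)
    if not x:
        raise ValueError("cannot compare two empty sequences")
    same = sum(1 for p, q in zip(x, y) if p == q)
    return 100.0 * same / len(x)


def edit_count(a: str, b: str) -> int:
    """Number of substituted, deleted and/or inserted positions."""
    x, y = align(a, b)
    return sum(1 for p, q in zip(x, y) if p != q)


def meets_identity(ref: str, cand: str, threshold: float = 70.0) -> bool:
    """Criterion b): identity to the reference at or above the threshold."""
    return percent_identity(ref, cand) >= threshold


def within_edit_range(ref: str, cand: str, lo: int = 1, hi: int = 50) -> bool:
    """Criterion c): between lo and hi residues changed versus the reference."""
    return lo <= edit_count(ref, cand) <= hi


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # hypothetical toy peptide, not a SEQ ID NO
    candidate = "MKTAYLAKQR"  # one substitution relative to the reference
    print(percent_identity(reference, candidate))
    print(within_edit_range(reference, candidate))
```

Both checks reuse the same alignment, so the narrower ranges listed above (for example at least 95% identity, or 1 to 10 changed residues) correspond to passing different `threshold`, `lo` and `hi` values.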
[0013] The present invention provides in a further aspect a
recombinant host cell as detailed herein. Particularly, the present
invention provides a recombinant host cell comprising (e.g.,
expressing) a first heterologous polypeptide having an aryl
sulfotransferase activity, such as a heterologous polypeptide
selected from the group consisting of:
[0014] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0015] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0016] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0017] The present invention provides in a further aspect the use
of a polypeptide as detailed herein in the sulfation of a phenolic
compound. Particularly, the present invention provides the use of a
polypeptide having an aryl sulfotransferase activity in the
sulfation of a phenolic compound, such as a polypeptide selected
from the group consisting of:
[0018] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0019] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0020] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0021] The present invention provides in a further aspect a
composition comprising a first recombinant host cell as detailed
herein and a second recombinant host cell as detailed herein.
Particularly, the present invention provides a composition
comprising a first recombinant host cell comprising (e.g.,
expressing) a heterologous polypeptide having an aryl
sulfotransferase activity, such as a heterologous polypeptide
selected from the group consisting of:
[0022] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0023] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0024] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted; and
[0025] a second recombinant host cell comprising (e.g., expressing)
a heterologous polypeptide having tyrosine ammonia lyase activity,
such as a polypeptide selected from the group consisting of:
[0026] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0027] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0028] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
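As a reading aid, the two enzyme activities combined in this composition can be written as a two-step reaction scheme for the example compounds named above. The scheme is a sketch under assumptions not stated in this passage: L-tyrosine is taken as the substrate of the tyrosine ammonia lyase, and 3'-phosphoadenosine 5'-phosphosulfate (PAPS) is taken as the sulfate donor of the aryl sulfotransferase, which is released as 3'-phosphoadenosine 5'-phosphate (PAP).

```latex
\begin{align*}
\text{L-tyrosine}
  &\xrightarrow{\ \text{tyrosine ammonia lyase}\ }
  \text{p-coumaric acid} + \mathrm{NH_3} \\
\text{p-coumaric acid} + \text{PAPS}
  &\xrightarrow{\ \text{aryl sulfotransferase}\ }
  \text{zosteric acid} + \text{PAP}
\end{align*}
```

Under these assumptions, the second recombinant host cell would supply the phenolic compound that the first recombinant host cell then sulfates.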
[0029] The present invention provides in a further aspect a
composition comprising a first polypeptide and a second
polypeptide. Particularly, the present invention provides a
composition comprising a first polypeptide having an aryl
sulfotransferase activity, such as a polypeptide selected from the
group consisting of:
[0030] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0031] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0032] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted; and
[0033] a second polypeptide having tyrosine ammonia lyase activity, such
as a polypeptide selected from the group consisting of:
[0034] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0035] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0036] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] FIG. 1: Map of plasmid for expression of SULT1A1 from Rattus
norvegicus in Escherichia coli.
[0038] FIG. 2: Map of plasmid for over-expression of cysDNC in E.
coli.
[0039] FIG. 3: Map of plasmid for over-expression of cysDNCQ in E.
coli.
[0040] FIG. 4: Map of plasmid for expression of RmXAL from
Rhodotorula mucilaginosa/Rhodotorula rubra in E. coli.
[0041] FIG. 5: Toxicity of unsulfated or sulfated products.
[0042] FIG. 6: Map of plasmid for expression of tyrosine
ammonia-lyase RsTAL from Rhodobacter sphaeroides in E. coli.
[0043] FIG. 7: Map of plasmid for expression of tyrosine
ammonia-lyase FjTAL from Flavobacterium johnsoniae in E. coli.
[0044] FIG. 8: Map of plasmid for expression of tyrosine
ammonia-lyase RcTAL from Rhodobacter capsulatus in E. coli.
[0045] FIG. 9: Map of plasmid for expression of SULT1A1 from Rattus
norvegicus in Saccharomyces cerevisiae (native gene).
[0046] FIG. 10: Map of plasmid for expression of SULT1A1 from
Rattus norvegicus in Saccharomyces cerevisiae (codon-optimized
gene).
DETAILED DESCRIPTION OF THE INVENTION
[0047] Unless specifically defined herein, all technical and
scientific terms used have the same meaning as commonly understood
by a skilled artisan in the fields of biochemistry, genetics, and
molecular biology.
[0048] All methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the
present invention, with suitable methods and materials being
described herein. All publications, patent applications, patents,
and other references mentioned herein are incorporated by reference
in their entirety. In case of conflict, the present specification,
including definitions, will prevail. Further, the materials,
methods, and examples are illustrative only and are not intended to
be limiting, unless otherwise specified.
[0049] The practice of the present invention will employ, unless
otherwise indicated, conventional techniques of cell biology, cell
culture, molecular biology, transgenic biology, microbiology,
recombinant DNA, and immunology, which are within the skill of the
art. Such techniques are explained fully in the literature. See,
for example, Current Protocols in Molecular Biology (Frederick M.
Ausubel, 2000, Wiley and Sons Inc, Library of Congress, USA);
Molecular Cloning: A Laboratory Manual, Third Edition, (Sambrook et
al, 2001, Cold Spring Harbor, New York: Cold Spring Harbor
Laboratory Press); Oligonucleotide Synthesis (M. J. Gait ed.,
1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid
Hybridization (B. D. Hames & S. J. Higgins eds. 1984);
Transcription And Translation (B. D. Hames & S. J. Higgins eds.
1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc.,
1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal,
A Practical Guide To Molecular Cloning (1984); the series, Methods
In Enzymology (J. Abelson and M. Simon, eds.-in-chief, Academic
Press, Inc., New York), specifically, Vols. 154 and 155 (Wu et al.
eds.) and Vol. 185, "Gene Expression Technology" (D. Goeddel, ed.);
Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P.
Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical
Methods In Cell And Molecular Biology (Mayer and Walker, eds.,
Academic Press, London, 1987); Handbook Of Experimental Immunology,
Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); and
Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y., 1986).
[0050] Polypeptides and Host Cells
[0051] As indicated above, the present invention provides and
utilizes polypeptides having aryl sulfotransferase activity
(EC:2.8.2.1). This makes them particularly suitable for the
sulfation of phenolic compounds such as p-coumaric acid and
derivatives thereof (e.g., caffeic acid, ferulic acid or sinapic
acid), or resveratrol.
[0052] The polypeptide having aryl sulfotransferase activity may be
a sulfotransferase 1A1 enzyme, a sulfotransferase 1A2 enzyme, a
sulfotransferase 1A3 enzyme, a sulfotransferase 1B1 enzyme, a
sulfotransferase 1C1 enzyme, a sulfotransferase 1C2 enzyme, a
sulfotransferase 1C4 enzyme, or a sulfotransferase 1E1 enzyme.
According to certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1A1 enzyme.
According to certain other embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1A2 enzyme.
According to certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1B1 enzyme.
According to certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1C1 enzyme.
According to certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1C2 enzyme.
According to certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1C4 enzyme.
According to other certain embodiments, the polypeptide having aryl
sulfotransferase activity is a sulfotransferase 1E1 enzyme
(estrogen sulfotransferase), such as the sulfotransferase 1E1 from
Gallus gallus domesticus.
[0053] According to certain embodiments, the polypeptide having
aryl sulfotransferase activity is a mammalian aryl
sulfotransferase, such as a mammalian sulfotransferase 1A1
enzyme.
[0054] According to certain embodiments, the polypeptide having
aryl sulfotransferase activity is an aryl sulfotransferase from
Rattus norvegicus or a variant thereof. Such variant may have at
least about 70%, such as at least about 75%, at least about 80%, at
least about 85%, at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence of the aryl
sulfotransferase from Rattus norvegicus. Such variant may also have
an amino acid sequence of the sulfotransferase from Rattus
norvegicus, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0055] It is understood that the foregoing values generally define
the total number of alterations to the reference aryl
sulfotransferase. The alterations may solely be amino acid
substitutions, be it conserved or non-conserved substitutions, or
both. They may solely be amino acid deletions. They may solely be
amino acid insertions. The alterations may be a mix of these
specific alterations, such as amino acid substitutions and amino
acid insertions.
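The way the foregoing values count alterations can be illustrated with a minimal Python sketch. It assumes the reference and the variant have already been aligned pairwise, with '-' marking gaps; the function names, the example sequences and the default upper limit of 50 are hypothetical and serve only as illustration.

```python
def count_alterations(ref_aligned: str, var_aligned: str) -> dict:
    """Count substitutions, deletions and insertions in a gapped pairwise alignment.

    Both strings must have the same length and use '-' for gaps.
    """
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")
    counts = {"substituted": 0, "deleted": 0, "inserted": 0}
    for r, v in zip(ref_aligned, var_aligned):
        if r == "-" and v == "-":
            continue  # a column with gaps in both rows carries no information
        if r == "-":
            counts["inserted"] += 1     # residue present only in the variant
        elif v == "-":
            counts["deleted"] += 1      # reference residue missing in the variant
        elif r != v:
            counts["substituted"] += 1  # different residue at the same position
    counts["total"] = sum(counts.values())
    return counts


def within_limit(counts: dict, upper: int = 50) -> bool:
    """True if the total number of alterations lies in the range 1..upper."""
    return 1 <= counts["total"] <= upper


if __name__ == "__main__":
    # Hypothetical toy alignment: one substitution, one deletion, one insertion.
    result = count_alterations("MKTAYLL-G", "MRTA-LLQG")
    print(result, within_limit(result))
```

The total is the sum of all three kinds of alteration, so a variant with only substitutions, only deletions, only insertions, or any mix of them is judged against the same limit.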
[0056] According to certain embodiments, the polypeptide having
aryl sulfotransferase activity may be a polypeptide selected from
the group consisting of:
[0057] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0058] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0059] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0060] According to certain embodiments, a polypeptide according to
the invention is a polypeptide according to a). Accordingly, a
polypeptide according to the invention may be a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO: 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO: 1). According
to particular embodiments, a polypeptide according to a) comprises
an amino acid sequence set forth in SEQ ID NO: 1. According to other
particular embodiments, a polypeptide according to a) comprises an
amino acid sequence set forth in SEQ ID NO: 2. According to yet
other particular embodiments, a polypeptide according to a)
comprises an amino acid sequence set forth in SEQ ID NO: 3.
According to yet other particular embodiments, a polypeptide
according to a) comprises an amino acid sequence set forth in SEQ
ID NO: 4. According to yet other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 5. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 6. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 7. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 8. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 9. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 10. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 11. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 12. According to other particular embodiments, a
polypeptide according to a) comprises an amino acid sequence set
forth in SEQ ID NO: 13.
[0061] According to other certain embodiments, a polypeptide
according to the invention is a polypeptide according to b).
Accordingly, a polypeptide according to the invention may be a
polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 1, 2,
3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO: 1).
According to particular embodiments, a polypeptide according to b)
comprises an amino acid sequence which has at least about 80%, such
as at least about 85%, at least about 90%, at least about 93%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, or at least about 99%, sequence identity to the amino
acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12 or 13 (e.g., SEQ ID NO: 1). According to other
particular embodiments, a polypeptide according to b) comprises an
amino acid sequence which has at least about 85%, such as at least
about 90%, at least about 93%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO:
1). According to other particular embodiments, a polypeptide
according to b) comprises an amino acid sequence which has at least
about 90%, such as at least about 93%, at least about 95%, at least
about 96%, at least about 97%, at least about 98%, or at least
about 99%, sequence identity to the amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1). According to other particular embodiments, a
polypeptide according to b) comprises an amino acid sequence which
has at least about 95%, such as at least about 96%, at least about
97%, at least about 98%, or at least about 99%, sequence identity
to the amino acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO: 1).
[0062] According to particular embodiments, a polypeptide according
to b) comprises an amino acid sequence which has at least about
70%, such as at least about 75%, at least about 80%, at least about
85%, at least about 90%, at least about 93%, at least 95%, at least
96%, at least 97%, at least 98%, or at least 99%, sequence identity
to the amino acid sequence set forth in SEQ ID NO: 1. According to
more particular embodiments, a polypeptide according to b)
comprises an amino acid sequence which has at least about 80%, such
as at least about 85%, at least about 90%, at least about 93%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, or at least about 99%, sequence identity to the amino
acid sequence set forth in SEQ ID NO: 1. According to other more
particular embodiments, a polypeptide according to b) comprises an
amino acid sequence which has at least about 85%, such as at least
about 90%, at least about 93%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 1. According to other more particular embodiments, a
polypeptide according to b) comprises an amino acid sequence which
has at least about 90%, such as at least about 93%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, or
at least about 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 1. According to other more particular
embodiments, a polypeptide according to b) comprises an amino acid
sequence which has at least about 95%, such as at least about 96%,
at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 1.
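As an illustration of how a candidate sequence could be checked against the identity thresholds recited above, the following Python sketch computes percent identity from a pairwise alignment. It is a sketch under stated assumptions: identity is taken here as identical positions divided by the number of aligned columns, gaps included, and the term "about" is ignored. Alignment programs differ in the denominator they use, so any definition of sequence identity given elsewhere in this specification takes precedence. All names are hypothetical.

```python
# Identity tiers recited in the text, in percent.
THRESHOLDS = (70, 75, 80, 85, 90, 93, 95, 96, 97, 98, 99)


def percent_identity(ref_aligned: str, var_aligned: str) -> float:
    """Percent identity over all aligned columns (assumed convention, gaps count)."""
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both rows; they carry no information.
    columns = [(r, v) for r, v in zip(ref_aligned, var_aligned)
               if not (r == "-" and v == "-")]
    if not columns:
        raise ValueError("alignment contains no informative columns")
    matches = sum(1 for r, v in columns if r == v)
    return 100.0 * matches / len(columns)


def highest_threshold_met(identity: float):
    """Return the highest recited tier that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical toy alignment; real sequences would come from SEQ ID NO: 1 etc.
    identity = percent_identity("MKTAYLL-G", "MRTA-LLQG")
    print(round(identity, 2), highest_threshold_met(identity))
```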
[0063] Preferably, a polypeptide according to b) has aryl
sulfotransferase activity. More preferably, a polypeptide according
to b) has an aryl sulfotransferase activity similar to that of the
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO:
1).
[0064] According to certain embodiments, a polypeptide according to
b) has aryl sulfotransferase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 1. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 2. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 3. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 4. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 5. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 6. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 7. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 8. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 9. According to certain other embodiments, a polypeptide
according to b) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 10. According to certain other embodiments, a
polypeptide according to b) has aryl sulfotransferase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 11. According to certain other
embodiments, a polypeptide according to b) has aryl
sulfotransferase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 12.
According to certain other embodiments, a polypeptide according to
b) has aryl sulfotransferase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 13.
[0065] With "similar" aryl sulfotransferase activity, it is meant
that the polypeptide according to b) has at least about 10%, such
as at least about 20%, at least about 30%, at least about 40%, at
least about 50%, at least about 60%, at least about 75%, at least
about 80%, at least about 90%, at least about 95%, at least about
100%, at least about 200%, at least about 300%, at least about
400%, at least about 500%, at least about 800%, at least about
1000% or at least about 2000%, of the aryl sulfotransferase
activity of the reference polypeptide (e.g., SEQ ID NO: 1).
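The definition of "similar" activity amounts to a lower bound on the ratio of two measured activities. A minimal Python sketch is given below, assuming that both activities were measured under the same assay conditions and in the same units; the function names and the example values are hypothetical.

```python
def relative_activity_percent(variant_activity: float, reference_activity: float) -> float:
    """Activity of the variant expressed as a percentage of the reference activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * variant_activity / reference_activity


def is_similar(variant_activity: float, reference_activity: float,
               minimum_percent: float = 10.0) -> bool:
    """True if the variant reaches at least `minimum_percent` of the reference.

    The default of 10% is the broadest tier recited in the text; stricter
    tiers (20%, 50%, 100%, ...) can be passed explicitly.
    """
    return relative_activity_percent(variant_activity, reference_activity) >= minimum_percent


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary but identical units.
    print(relative_activity_percent(2.5, 10.0))            # 25% of the reference
    print(is_similar(2.5, 10.0))                           # meets the 10% tier
    print(is_similar(2.5, 10.0, minimum_percent=50.0))     # fails the 50% tier
```

Because only a lower bound is defined, a variant that is considerably more active than the reference (for example 2000%) also counts as having "similar" activity.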
[0066] The aryl sulfotransferase activity may for instance be
determined in accordance with the following method: Aryl
sulfotransferase activity may be determined by the reaction of
radioactively sulfur-labelled PAPS, [³⁵S]PAPS, with the
substrate in the presence of the polypeptide of interest. This has
been described previously, for example by Hattori et al. (Biosci
Biotechnol Biochem. 2008; 72(2):540-7). The reaction takes place in
a buffer such as 250 µL of 50 mM sodium phosphate, pH 6.8, with 1
µM [³⁵S]PAPS (3.7 kBq) and 100 µM accepting compound
for a period of 30 min at 30 °C. The reaction is stopped by
addition of 100 µL of a 1:1 mixture of 0.1 M barium acetate and
barium hydroxide. 50 µL of 0.1 M zinc sulfate is added, followed
by centrifugation at 1,200×g for 5 min. 300 µL of the
supernatant is then transferred to a new container and 50 µL of
an equal volume of 0.1 M barium hydroxide and 0.1 M zinc sulfate is
added. The mixture is then centrifuged at 13,000×g for 5 min,
and 300-µL aliquots of the supernatant are mixed with 2.5 mL of
Cleasol I (Nacalai Tesque, Kyoto, Japan). The radioactivity is then
measured by scintillation.
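The volumes and amounts in the assay above lend themselves to a short worked calculation. The Python sketch below converts the radioactivity measured in a counted aliquot back to the amount of sulfated product formed. It rests on assumptions not stated in the text: that unreacted [35S]PAPS is removed with the barium/zinc precipitate while the sulfated product stays in the supernatant, that the precipitate volume is negligible, that the second addition is 50 microliters in total, and that the counter reading has already been converted to Bq. All names are hypothetical.

```python
# Values taken from the assay description (volumes in microliters).
REACTION_UL = 250.0        # reaction volume
PAPS_UM = 1.0              # [35S]PAPS concentration, micromolar
TOTAL_BQ = 3700.0          # 3.7 kBq of label per reaction
STOP_MIX_UL = 100.0        # barium acetate / barium hydroxide mixture
ZINC_SULFATE_UL = 50.0     # first zinc sulfate addition
FIRST_TRANSFER_UL = 300.0  # supernatant moved to a new container
SECOND_ADDITION_UL = 50.0  # barium hydroxide / zinc sulfate (assumed 50 uL in total)
ALIQUOT_UL = 300.0         # aliquot mixed with scintillation cocktail


def paps_pmol() -> float:
    """Amount of PAPS in the reaction: micromolar x microliter = picomole."""
    return PAPS_UM * REACTION_UL


def specific_activity_bq_per_pmol() -> float:
    """Radioactivity per picomole of PAPS, and hence per picomole of product."""
    return TOTAL_BQ / paps_pmol()


def aliquot_fraction() -> float:
    """Fraction of the original reaction represented by the counted aliquot."""
    after_stop = REACTION_UL + STOP_MIX_UL + ZINC_SULFATE_UL   # 400 uL
    after_second = FIRST_TRANSFER_UL + SECOND_ADDITION_UL      # 350 uL
    return (FIRST_TRANSFER_UL / after_stop) * (ALIQUOT_UL / after_second)


def product_pmol(aliquot_bq: float, background_bq: float = 0.0) -> float:
    """Sulfated product formed in the whole reaction, in picomoles."""
    net_bq = max(aliquot_bq - background_bq, 0.0)
    return net_bq / aliquot_fraction() / specific_activity_bq_per_pmol()


if __name__ == "__main__":
    print(paps_pmol())                      # pmol of PAPS per reaction
    print(specific_activity_bq_per_pmol())  # Bq per pmol
    print(aliquot_fraction())               # fraction counted
    print(product_pmol(100.0))              # hypothetical reading of 100 Bq
```

Under these assumptions each reaction contains 250 pmol of PAPS, and each counted aliquot represents 300/400 x 300/350, or 9/14, of the original reaction. A no-enzyme control can be supplied as background_bq.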
[0067] Alternatively, the activity of a sulfotransferase may be
detected by direct measurement of the product by analytical methods
such as high performance liquid chromatography (HPLC) or liquid
chromatography in combination with mass spectrometry (LC-MS).
[0068] According to other certain embodiments, a polypeptide
according to the invention is a polypeptide according to c).
Accordingly, a polypeptide according to the invention may be a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO:
1), wherein 1 or more, such as 2 or more, 3 or more, 4 or more, 5
or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11
or more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or
more, 17 or more, 18 or more, 19 or more, 20 or more, 25 or more,
30 or more, 35 or more, 40 or more, 45 or more, 50 or more, 60 or
more, 70 or more, 80 or more, 90 or more, 100 or more, 110 or more,
120 or more, 130 or more, 140 or more, or 150 or more, amino acid
residues are substituted, deleted, and/or inserted. According to
particular embodiments, a polypeptide according to c) comprises an
amino acid sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12 or 13 (e.g., SEQ ID NO: 1), wherein about 1 to about
150, such as about 1 to about 140, about 1 to about 130, about 1 to
about 120, about 1 to about 110, about 1 to about 100, about 1 to
about 90, about 1 to about 80, about 1 to about 70, about 1 to
about 60, about 1 to about 50, about 1 to about 40, about 1 to
about 35, about 1 to about 30, about 1 to about 25, about 1 to
about 20, about 1 to about 15, about 1 to about 10, about 1 to
about 5, or about 1 to about 3, amino acid residues are
substituted, deleted and/or inserted. According to more particular
embodiments, a polypeptide according to c) comprises an amino acid
sequence set forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
12 or 13 (e.g., SEQ ID NO: 1), wherein about 1 to about 50, about 1
to about 40, about 1 to about 35, about 1 to about 30, about 1 to
about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
other more particular embodiments, a polypeptide according to c)
comprises an amino acid sequence set forth in SEQ ID NO: 1, 2, 3,
4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO: 1), wherein
about 1 to about 30, such as about 1 to about 25, about 1 to about
20, about 1 to about 15, about 1 to about 10, about 1 to about 5,
or about 1 to about 3, amino acid residues are substituted, deleted
and/or inserted.
[0069] According to other more particular embodiments, a
polypeptide according to c) comprises an amino acid sequence set
forth in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13
(e.g., SEQ ID NO: 1), wherein about 1 to about 25, such as about 1
to about 20, about 1 to about 15, about 1 to about 10, about 1 to
about 5, or about 1 to about 3, amino acid residues are
substituted, deleted and/or inserted.
[0070] According to particular embodiments, a polypeptide according
to c) comprises an amino acid sequence set forth in SEQ ID NO: 1,
wherein about 1 to about 150, such as about 1 to about 140, about 1
to about 130, about 1 to about 120, about 1 to about 110, about 1
to about 100, about 1 to about 90, about 1 to about 80, about 1 to
about 70, about 1 to about 60, about 1 to about 50, about 1 to
about 40, about 1 to about 35, about 1 to about 30, about 1 to
about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
more particular embodiments, a polypeptide according to c)
comprises an amino acid sequence set forth in SEQ ID NO: 1, wherein
about 1 to about 50, about 1 to about 40, about 1 to about 35,
about 1 to about 30, about 1 to about 25, about 1 to about 20,
about 1 to about 15, about 1 to about 10, about 1 to about 5, or
about 1 to about 3, amino acid residues are substituted, deleted
and/or inserted. According to other more particular embodiments, a
polypeptide according to c) comprises an amino acid sequence set
forth in SEQ ID NO: 1, wherein about 1 to about 30, such as about 1
to about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
other more particular embodiments, a polypeptide according to c)
comprises an amino acid sequence set forth in SEQ ID NO: 1, wherein
about 1 to about 25, such as about 1 to about 20, about 1 to about
15, about 1 to about 10, about 1 to about 5, or about 1 to about 3,
amino acid residues are substituted, deleted and/or inserted.
[0071] It is understood that the foregoing values generally define
the total number of alterations to the reference polypeptide (e.g.,
SEQ ID NO: 1). The alterations may solely be amino acid
substitutions, be it conserved or non-conserved substitutions, or
both. They may solely be amino acid deletions. They may solely be
amino acid insertions. The alterations may be a mix of these
specific alterations, such as amino acid substitutions and amino
acid insertions.
[0072] Preferably, a polypeptide according to c) has aryl
sulfotransferase activity. More preferably, a polypeptide according
to c) has an aryl sulfotransferase activity similar to that of the
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID NO:
1).
[0073] According to certain embodiments, a polypeptide according to
c) has aryl sulfotransferase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 1. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 2. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 3. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 4. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 5. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 6. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 7. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 8. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 9. According to certain other embodiments, a polypeptide
according to c) has aryl sulfotransferase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 10. According to certain other embodiments, a
polypeptide according to c) has aryl sulfotransferase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 11. According to certain other
embodiments, a polypeptide according to c) has aryl
sulfotransferase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 12.
According to certain other embodiments, a polypeptide according to
c) has aryl sulfotransferase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 13.
[0074] With "similar" aryl sulfotransferase activity it is meant
that the polypeptide according to c) has at least about 10%, such
as at least about 20%, at least about 30%, at least about 40%, at
least about 50%, at least about 60%, at least about 75%, at least
about 80%, at least about 90%, at least about 95%, at least about
100%, at least about 200%, at least about 300%, at least about
400%, at least about 500%, at least about 800%, at least about
1000% or at least about 2000%, of the aryl sulfotransferase
activity of the reference polypeptide (e.g., SEQ ID NO: 1).
[0075] The aryl sulfotransferase activity may for instance be
determined in accordance with the following method: Aryl
sulfotransferase activity may be determined by the reaction of
radioactively sulfur-labelled PAPS, [³⁵S]PAPS, with the
substrate in the presence of the polypeptide of interest. This has
been described previously, for example by Hattori et al. (Biosci
Biotechnol Biochem. 2008; 72(2):540-7). The reaction takes place in
a buffer such as 250 µL of 50 mM sodium phosphate, pH 6.8, with 1
µM [³⁵S]PAPS (3.7 kBq) and 100 µM accepting compound
for a period of 30 min at 30 °C. The reaction is stopped by
addition of 100 µL of a 1:1 mixture of 0.1 M barium acetate and
barium hydroxide. 50 µL of 0.1 M zinc sulfate is added, followed
by centrifugation at 1,200×g for 5 min. 300 µL of the
supernatant is then transferred to a new container and 50 µL of
an equal volume of 0.1 M barium hydroxide and 0.1 M zinc sulfate is
added. The mixture is then centrifuged at 13,000×g for 5 min,
and 300-µL aliquots of the supernatant are mixed with 2.5 mL of
Cleasol I (Nacalai Tesque, Kyoto, Japan). The radioactivity is then
measured by scintillation.
[0076] Alternatively, the activity of a sulfotransferase may be
detected by direct measurement of the product by analytical methods
such as high performance liquid chromatography (HPLC) or liquid
chromatography in combination with mass spectrometry (LC-MS).
[0077] Contemplated by the present invention is the production of a
sulfated phenolic compound from a precursor thereof, and in
particular from a precursor of the general formula (p-I) as
described in more detail below. In this case, it may be suitable to
employ a further (e.g., second) polypeptide which has tyrosine
ammonia lyase activity. Such polypeptide may be a polypeptide
selected from the group consisting of:
[0078] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0079] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, or at least about 99%, sequence identity to the amino
acid sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20,
21, 22 or 23 (e.g., SEQ ID NO: 14); or
[0080] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 or more, such as about 1 to about 50, about 1
to about 40, about 1 to about 35, about 1 to about 30, about 1 to
about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted, and/or inserted.
[0081] According to certain embodiments, a further polypeptide
according to the invention is a polypeptide according to d).
Accordingly, a polypeptide according to the invention may be a
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14).
According to particular embodiments, a polypeptide according to d)
comprises an amino acid sequence set forth in SEQ ID NO: 14.
According to other particular embodiments, a polypeptide according
to d) comprises an amino acid sequence set forth in SEQ ID NO: 15.
According to yet other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 16. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 17. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 18. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 19. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 20. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 21. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 22. According to other particular embodiments, a polypeptide
according to d) comprises an amino acid sequence set forth in SEQ
ID NO: 23.
[0082] According to other certain embodiments, a further
polypeptide according to the invention is a polypeptide according
to e). Accordingly, a polypeptide according to the invention may be
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 14, 15,
16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14). According
to particular embodiments, a polypeptide according to e) comprises
an amino acid sequence which has at least about 80%, such as at
least about 85%, at least about 90%, at least about 93%, at least
about 95%, at least about 96%, at least about 97%, at least about
98%, or at least about 99%, sequence identity to the amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23 (e.g., SEQ ID NO: 14). According to other particular
embodiments, a polypeptide according to e) comprises an amino acid
sequence which has at least about 85%, such as at least about 90%,
at least about 93%, at least about 95%, at least about 96%, at
least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14).
According to other particular embodiments, a polypeptide according
to e) comprises an amino acid sequence which has at least about
90%, such as at least about 93%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14).
According to other particular embodiments, a polypeptide according
to e) comprises an amino acid sequence which has at least about
95%, such as at least about 96%, at least about 97%, at least about
98%, or at least about 99%, sequence identity to the amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23 (e.g., SEQ ID NO: 14).
[0083] According to particular embodiments, a polypeptide according
to e) comprises an amino acid sequence which has at least about
70%, such as at least about 75%, at least about 80%, at least about
85%, at least about 90%, at least about 93%, at least 95%, at least
96%, at least 97%, at least 98%, or at least 99%, sequence identity
to the amino acid sequence set forth in SEQ ID NO: 14. According to
more particular embodiments, a polypeptide according to e)
comprises an amino acid sequence which has at least about 80%, such
as at least about 85%, at least about 90%, at least about 93%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, or at least about 99%, sequence identity to the amino
acid sequence set forth in SEQ ID NO: 14. According to other more
particular embodiments, a polypeptide according to e) comprises an
amino acid sequence which has at least about 85%, such as at least
about 90%, at least about 93%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14. According to other more particular embodiments, a
polypeptide according to e) comprises an amino acid sequence which
has at least about 90%, such as at least about 93%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, or
at least about 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 14. According to other more particular
embodiments, a polypeptide according to e) comprises an amino acid
sequence which has at least about 95%, such as at least about 96%,
at least about 97%, at least about 98%, or at least about 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 14.
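By way of illustration only, the identity thresholds recited above might be checked as in the following Python sketch. It assumes that the two sequences have already been aligned (gaps written as "-") and that percent identity is the number of identical aligned residues divided by the number of alignment columns; the definition of sequence identity given elsewhere in this specification governs. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity = identical residue pairs / alignment columns,
    ignoring any column in which both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 70.0) -> bool:
    """True if the pair reaches the minimum percent identity (e.g. 70, 80, 95)."""
    return percent_identity(aligned_a, aligned_b) >= minimum


# Hypothetical toy alignment: one mismatch and one gap over ten columns.
reference = "MKTLLVAG-E"
candidate = "MKSLLVAGHE"
identity = percent_identity(reference, candidate)
```

In this sketch a gap counts against identity; other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same pair.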
[0084] Preferably, a polypeptide according to e) has tyrosine
ammonia lyase activity. More preferably, a polypeptide according to
e) has a tyrosine ammonia lyase activity similar to that of the
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14).
[0085] According to certain embodiments, a polypeptide according to
e) has tyrosine ammonia lyase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 14. According to certain other embodiments, a polypeptide
according to e) has tyrosine ammonia lyase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 15. According to certain other embodiments, a
polypeptide according to e) has tyrosine ammonia lyase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 16. According to certain other
embodiments, a polypeptide according to e) has tyrosine ammonia
lyase activity similar to that of the polypeptide comprising the
amino acid sequence set forth in SEQ ID NO: 17. According to
certain other embodiments, a polypeptide according to e) has
tyrosine ammonia lyase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 18.
According to certain other embodiments, a polypeptide according to
e) has tyrosine ammonia lyase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 19. According to certain other embodiments, a polypeptide
according to e) has tyrosine ammonia lyase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 20. According to certain other embodiments, a
polypeptide according to e) has tyrosine ammonia lyase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 21. According to certain other
embodiments, a polypeptide according to e) has tyrosine ammonia
lyase activity similar to that of the polypeptide comprising the
amino acid sequence set forth in SEQ ID NO: 22. According to
certain other embodiments, a polypeptide according to e) has
tyrosine ammonia lyase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 23. With
"similar" tyrosine ammonia lyase activity it is meant that the
polypeptide according to e) has at least about 10%, such as at
least about 20%, at least about 30%, at least about 40%, at least
about 50%, at least about 60%, at least about 75%, at least about
80%, at least about 90%, at least about 95%, at least about 100%,
at least about 200%, at least about 300%, at least about 400%, at
least about 500%, at least about 800%, at least about 1000% or at
least about 2000%, of the ammonia lyase activity of the reference
polypeptide (e.g., SEQ ID NO: 14).
[0086] The tyrosine ammonia lyase activity may for instance be
determined in accordance with the following method: Enzymatic assays
are performed in 200 µL volumes in wells of a UV-transparent
96-well plate, by following the increase in absorbance at 315 nm
(pHCA) using spectrophotometry or HPLC with UV detection. The
reaction mixtures contain 2 µg of purified protein and are
initiated by adding 1 mM tyrosine or 6 mM after equilibration to
30 °C. The enzymatic activity is calculated as U/g, where U
is defined as µmol substrate converted per minute. Negative
controls contain no purified protein. Kinetic constants Km and Vmax
are determined from assays containing 1.56 µM to 200 µM
tyrosine.
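As a non-limiting illustration, the conversion of a measured absorbance slope into U/g might be sketched in Python as follows. The sketch applies the Beer-Lambert law and assumes one mole of pHCA formed per mole of tyrosine converted. The extinction coefficient of pHCA at 315 nm and the optical path length of the well are not stated in this specification; the values used in the example are placeholders that would have to be determined for the instrument and buffer actually used. The function and parameter names are hypothetical.

```python
def specific_activity_u_per_g(
    slope_abs_per_min: float,
    blank_slope_abs_per_min: float,
    extinction_per_molar_per_cm: float,
    path_length_cm: float,
    reaction_volume_l: float = 200e-6,  # 200 µL reaction, as in the assay above
    protein_g: float = 2e-6,            # 2 µg purified protein, as in the assay above
) -> float:
    """Specific activity in U/g, with U = µmol substrate converted per minute."""
    # Subtract the no-protein negative control from the sample slope.
    net_slope = slope_abs_per_min - blank_slope_abs_per_min
    # Beer-Lambert: dC/dt [mol/L/min] = (dA/dt) / (epsilon * path length)
    rate_molar_per_min = net_slope / (extinction_per_molar_per_cm * path_length_cm)
    # Amount converted in the well per minute, expressed in µmol.
    umol_per_min = rate_molar_per_min * reaction_volume_l * 1e6
    return umol_per_min / protein_g


# Placeholder inputs (assumptions, not values from this specification).
activity = specific_activity_u_per_g(
    slope_abs_per_min=0.021,
    blank_slope_abs_per_min=0.001,
    extinction_per_molar_per_cm=10_000.0,  # placeholder extinction coefficient
    path_length_cm=0.5,                    # placeholder path length
)
```

With these placeholder inputs the net slope of 0.02 absorbance units per minute corresponds to about 400 U/g; real values depend entirely on the calibrated extinction coefficient and path length.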
[0087] According to other certain embodiments, a further
polypeptide according to the invention is a polypeptide according
to f). Accordingly, a polypeptide according to the invention may be
a polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14),
wherein 1 or more, such as 2 or more, 3 or more, 4 or more, 5 or
more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or
more, 12 or more, 13 or more, 14 or more, 15 or more, 16 or more,
17 or more, 18 or more, 19 or more, 20 or more, 25 or more, 30 or
more, 35 or more, 40 or more, 45 or more, 50 or more, 60 or more,
70 or more, 80 or more, 90 or more, 100 or more, 110 or more, 120
or more, 130 or more, 140 or more, or 150 or more, amino acid
residues are substituted, deleted, and/or inserted. According to
particular embodiments, a polypeptide according to f) comprises an
amino acid sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19,
20, 21, 22 or 23 (e.g., SEQ ID NO: 14), wherein about 1 to about
150, such as about 1 to about 140, about 1 to about 130, about 1 to
about 120, about 1 to about 110, about 1 to about 100, about 1 to
about 90, about 1 to about 80, about 1 to about 70, about 1 to
about 60, about 1 to about 50, about 1 to about 40, about 1 to
about 35, about 1 to about 30, about 1 to about 25, about 1 to
about 20, about 1 to about 15, about 1 to about 10, about 1 to
about 5, or about 1 to about 3, amino acid residues are
substituted, deleted and/or inserted. According to more particular
embodiments, a polypeptide according to f) comprises an amino acid
sequence set forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22
or 23 (e.g., SEQ ID NO: 14), wherein about 1 to about 50, about 1
to about 40, about 1 to about 35, about 1 to about 30, about 1 to
about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
other more particular embodiments, a polypeptide according to f)
comprises an amino acid sequence set forth in SEQ ID NO: 14, 15,
16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14), wherein
about 1 to about 30, such as about 1 to about 25, about 1 to about
20, about 1 to about 15, about 1 to about 10, about 1 to about 5,
or about 1 to about 3, amino acid residues are substituted, deleted
and/or inserted. According to other more particular embodiments, a
polypeptide according to f) comprises an amino acid sequence set
forth in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g.,
SEQ ID NO: 14), wherein about 1 to about 25, such as about 1 to
about 20, about 1 to about 15, about 1 to about 10, about 1 to
about 5, or about 1 to about 3, amino acid residues are
substituted, deleted and/or inserted.
[0088] According to particular embodiments, a polypeptide according
to f) comprises an amino acid sequence set forth in SEQ ID NO: 14,
wherein about 1 to about 150, such as about 1 to about 140, about 1
to about 130, about 1 to about 120, about 1 to about 110, about 1
to about 100, about 1 to about 90, about 1 to about 80, about 1 to
about 70, about 1 to about 60, about 1 to about 50, about 1 to
about 40, about 1 to about 35, about 1 to about 30, about 1 to
about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
more particular embodiments, a polypeptide according to f)
comprises an amino acid sequence set forth in SEQ ID NO: 14,
wherein about 1 to about 50, about 1 to about 40, about 1 to about
35, about 1 to about 30, about 1 to about 25, about 1 to about 20,
about 1 to about 15, about 1 to about 10, about 1 to about 5, or
about 1 to about 3, amino acid residues are substituted, deleted
and/or inserted. According to other more particular embodiments, a
polypeptide according to f) comprises an amino acid sequence set
forth in SEQ ID NO: 14, wherein about 1 to about 30, such as about
1 to about 25, about 1 to about 20, about 1 to about 15, about 1 to
about 10, about 1 to about 5, or about 1 to about 3, amino acid
residues are substituted, deleted and/or inserted. According to
other more particular embodiments, a polypeptide according to f)
comprises an amino acid sequence set forth in SEQ ID NO: 14,
wherein about 1 to about 25, such as about 1 to about 20, about 1
to about 15, about 1 to about 10, about 1 to about 5, or about 1 to
about 3, amino acid residues are substituted, deleted and/or
inserted.
[0089] It is understood that the foregoing values generally define
the total number of alterations to the reference polypeptide (e.g.,
SEQ ID NO: 14). The alterations may solely be amino acid
substitutions, be it conserved or non-conserved substitutions, or
both. They may solely be amino acid deletions. They may solely be
amino acid insertions. The alterations may be a mix of these
specific alterations, such as amino acid substitutions and amino
acid insertions.
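For illustration, the total number of alterations relative to a reference polypeptide might be tallied from a pairwise alignment as in the following Python sketch. It assumes the alignment is already available, with "-" in the reference marking an insertion and "-" in the variant marking a deletion. The names and toy sequences are hypothetical, and the sketch does not distinguish conserved from non-conserved substitutions.

```python
def count_alterations(aligned_ref: str, aligned_var: str) -> dict:
    """Count substitutions, deletions and insertions in a pairwise alignment."""
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have equal length")
    counts = {"substitutions": 0, "deletions": 0, "insertions": 0}
    for ref_res, var_res in zip(aligned_ref, aligned_var):
        if ref_res == var_res:
            continue  # identical residue (or gap in both): no alteration
        if ref_res == "-":
            counts["insertions"] += 1      # residue present only in the variant
        elif var_res == "-":
            counts["deletions"] += 1       # residue present only in the reference
        else:
            counts["substitutions"] += 1   # different residues at the same position
    return counts


def within_limit(counts: dict, maximum: int) -> bool:
    """True if the total number of alterations does not exceed the maximum."""
    return sum(counts.values()) <= maximum


# Hypothetical toy alignment: one substitution, one deletion, one insertion.
reference = "MKTLLVAG-E"
variant = "MKSLL-AGHE"
alterations = count_alterations(reference, variant)
total = sum(alterations.values())
```

A variant with the three alterations shown would fall within, for example, the "about 1 to about 3" range recited above.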
[0090] Preferably, a polypeptide according to f) has tyrosine
ammonia lyase activity. More preferably, a polypeptide according to
f) has a tyrosine ammonia lyase activity similar to that of the
polypeptide comprising an amino acid sequence set forth in SEQ ID
NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO: 14).
According to certain embodiments, a polypeptide according to f) has
tyrosine ammonia lyase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 14.
According to certain other embodiments, a polypeptide according to
f) has tyrosine ammonia lyase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 15. According to certain other embodiments, a polypeptide
according to f) has tyrosine ammonia lyase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 16. According to certain other embodiments, a
polypeptide according to f) has tyrosine ammonia lyase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 17. According to certain other
embodiments, a polypeptide according to f) has tyrosine ammonia
lyase activity similar to that of the polypeptide comprising the
amino acid sequence set forth in SEQ ID NO: 18. According to
certain other embodiments, a polypeptide according to f) has
tyrosine ammonia lyase activity similar to that of the polypeptide
comprising the amino acid sequence set forth in SEQ ID NO: 19.
According to certain other embodiments, a polypeptide according to
f) has tyrosine ammonia lyase activity similar to that of the
polypeptide comprising the amino acid sequence set forth in SEQ ID
NO: 20. According to certain other embodiments, a polypeptide
according to f) has tyrosine ammonia lyase activity similar to that
of the polypeptide comprising the amino acid sequence set forth in
SEQ ID NO: 21. According to certain other embodiments, a
polypeptide according to f) has tyrosine ammonia lyase activity
similar to that of the polypeptide comprising the amino acid
sequence set forth in SEQ ID NO: 22. According to certain other
embodiments, a polypeptide according to f) has tyrosine ammonia
lyase activity similar to that of the polypeptide comprising the
amino acid sequence set forth in SEQ ID NO: 23. With "similar"
tyrosine ammonia lyase activity it is meant that the polypeptide
according to f) has at least about 10%, such as at least about 20%,
at least about 30%, at least about 40%, at least about 50%, at
least about 60%, at least about 75%, at least about 80%, at least
about 90%, at least about 95%, at least about 100%, at least about
200%, at least about 300%, at least about 400%, at least about
500%, at least about 800%, at least about 1000% or at least about
2000%, of the ammonia lyase activity of the reference polypeptide
(e.g., SEQ ID NO: 14).
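The "similar" activity criterion above amounts to a ratio of two specific activities. A minimal Python sketch, assuming both activities were measured under the same assay conditions and are expressed in the same unit (for example U/g), could look as follows; the names and numerical values are hypothetical.

```python
def relative_activity_percent(candidate_activity: float, reference_activity: float) -> float:
    """Activity of the candidate as a percentage of the reference activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return 100.0 * candidate_activity / reference_activity


def is_similar(candidate_activity: float, reference_activity: float,
               minimum_percent: float = 10.0) -> bool:
    """True if the candidate reaches the chosen minimum (10% is the lowest tier above)."""
    return relative_activity_percent(candidate_activity, reference_activity) >= minimum_percent


# Hypothetical measurements in U/g.
relative = relative_activity_percent(45.0, 300.0)
```

Raising `minimum_percent` to 20, 50, 100 and so on corresponds to the progressively narrower tiers listed above.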
[0091] The tyrosine ammonia lyase activity may for instance be
determined in accordance with the following method: Enzymatic assays
are performed in 200 µL volumes in wells of a UV-transparent
96-well plate, by following the increase in absorbance at 315 nm
(pHCA) using spectrophotometry or HPLC with UV detection. The
reaction mixtures contain 2 µg of purified protein and are
initiated by adding 1 mM tyrosine or 6 mM after equilibration to
30 °C. The enzymatic activity is calculated as U/g, where U
is defined as µmol substrate converted per minute. Negative
controls contain no purified protein. Kinetic constants Km and Vmax
are determined from assays containing 1.56 µM to 200 µM
tyrosine.
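By way of example only, Km and Vmax might be estimated from such a concentration series as sketched below in Python. Two assumptions are made that are not stated in this specification: that the range of 1.56 µM to 200 µM is covered by a two-fold dilution series (200 divided by 2 seven times gives 1.5625), and that a Hanes-Woolf linearization (S/v plotted against S) is an acceptable stand-in for whatever fitting procedure was actually used. The rates in the example are synthetic, noise-free values generated from hypothetical constants.

```python
def twofold_dilution_series(top_um: float = 200.0, steps: int = 8) -> list[float]:
    """Concentrations in µM from top_um downwards in two-fold steps."""
    return [top_um / (2 ** i) for i in range(steps)]


def hanes_woolf_fit(substrate_um: list[float], rates: list[float]) -> tuple[float, float]:
    """Estimate (Km, Vmax) by least squares on S/v = S/Vmax + Km/Vmax.

    Rates must be positive and in any consistent unit (e.g. U/g).
    """
    if len(substrate_um) != len(rates) or len(rates) < 2:
        raise ValueError("need at least two matching data points")
    xs = [float(s) for s in substrate_um]
    ys = [s / v for s, v in zip(xs, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic data from hypothetical constants (Km = 20 µM, Vmax = 500 U/g).
concentrations = twofold_dilution_series()
true_km, true_vmax = 20.0, 500.0
rates = [true_vmax * s / (true_km + s) for s in concentrations]
km, vmax = hanes_woolf_fit(concentrations, rates)
```

For noisy experimental data, direct non-linear regression on the Michaelis-Menten equation is generally preferred over linearized fits; the linear form is used here only to keep the sketch free of external dependencies.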
[0092] Further contemplated by the present invention is to employ a
further (e.g., third) polypeptide which has phenylalanine ammonia
lyase activity, such as a phenylalanine ammonia lyase (EC
4.3.1.24).
[0093] The polypeptides may be employed in accordance with the
invention in isolated form, such as in purified form. The
polypeptides may for instance be expressed by a recombinant host
cell, and then purified. Techniques and means for the purification
of polypeptides produced by a recombinant host cell are well known
in the art. For example, in order to facilitate purification, an
amino acid motif comprising several histidine residues, such as at
least 6, may be inserted at the C- or N-terminal end of the
polypeptide. A non-limiting example of such amino acid motif is
provided in SEQ ID NO: 24. Various purification kits for
histidine-tagged polypeptides are available from commercial sources
such as Qiagen, Hilden, Germany; Clontech, Mountain View, Calif.,
USA; Bio-Rad, Hercules, Calif., USA and others.
[0094] Alternatively, the polypeptides may be chemically
synthesized. Techniques for chemical peptide synthesis are well
known and include liquid-phase synthesis and solid-phase
synthesis.
[0095] The polypeptides can also be employed in accordance with the
invention as part of a recombinant host cell. Such recombinant host
cells are described in more detail below.
[0096] It is understood that the details given herein with respect
to polypeptides apply to all aspects of the invention.
[0097] The present invention also provides recombinant host cells
comprising (e.g. expressing) one or more polypeptides as detailed
herein. Generally, the polypeptides according to the invention will
be heterologous to the host cells, which means that the
polypeptides are normally not found in or made (i.e. expressed) by
the host cells, but derived from a different species.
[0098] Therefore, the present invention provides a recombinant host
cell comprising a heterologous polypeptide having an aryl
sulfotransferase activity. According to certain embodiments, a
recombinant host cell according to the invention comprises a
heterologous polypeptide selected from the group consisting of:
[0099] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0100] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0101] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0102] A recombinant host cell provided and utilized in accordance
with the present invention may comprise a heterologous polypeptide
having tyrosine ammonia lyase activity. According to certain
embodiments, a recombinant host cell according to the invention
comprises a heterologous polypeptide selected from the group
consisting of:
[0103] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0104] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0105] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 or more, such as about 1 to about 50, about 1
to about 40, about 1 to about 35, 1 to 30, 1 to 25, 1 to 20, 1 to
15, 1 to 10, 1 to 5, 1 to 3, amino acid residues are substituted,
deleted and/or inserted.
[0106] According to certain embodiments, a recombinant host cell
comprises a first heterologous polypeptide having aryl
sulfotransferase activity and a second heterologous polypeptide
having tyrosine ammonia lyase activity. According to particular
embodiments, a recombinant host cell comprises a first heterologous
polypeptide selected from the polypeptides according to items a) to
c) as detailed herein, and a second heterologous polypeptide
selected from the polypeptides according to items d) to f) as
detailed herein.
[0107] According to more particular embodiments, a recombinant host
cell comprises a first heterologous polypeptide selected from the
group consisting of:
[0108] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1;
[0109] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1; or
[0110] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted; and a second
heterologous polypeptide selected from the group consisting of:
[0111] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14;
[0112] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14; or
[0113] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0114] Alternatively, the first heterologous polypeptide having
aryl sulfotransferase activity may be comprised by a first
recombinant host cell, and the second heterologous polypeptide
having tyrosine ammonia lyase activity may be comprised by a second
recombinant host cell.
[0115] According to certain embodiments, a recombinant host cell
comprises a first heterologous polypeptide having aryl
sulfotransferase activity and a further (e.g., third) heterologous
polypeptide having phenylalanine ammonia lyase activity.
[0116] Alternatively, the first heterologous polypeptide having
aryl sulfotransferase activity may be comprised by a first
recombinant host cell, and the further (e.g., third) heterologous
polypeptide having phenylalanine ammonia lyase activity may be
comprised by a further (e.g., third) recombinant host cell. Such
further recombinant host cell may be a recombinant host cell also
comprising a heterologous polypeptide having tyrosine ammonia lyase
activity.
[0117] Recombinant host cells in accordance with the invention can
be produced from any suitable host organism, including
single-celled or multicellular microorganisms such as bacteria,
yeast, fungi, algae and plants, and higher eukaryotic organisms
including nematodes, insects, reptiles, birds, amphibians and
mammals.
[0118] According to certain embodiments, a recombinant host cell
in accordance with the invention is selected from the group
consisting of bacteria, yeast, fungi, algae and plants.
[0119] According to certain other embodiments, a recombinant host
cell in accordance with the invention is selected from the group
consisting of bacteria, yeast, fungi, and algae.
[0120] According to certain other embodiments, a recombinant host
cell in accordance with the invention is selected from the group
consisting of bacteria, yeast and fungi.
[0121] According to certain other embodiments, a recombinant host
cell in accordance with the invention is selected from the group
consisting of bacteria and yeast.
[0122] According to certain embodiments, a recombinant host cell
in accordance with the invention is not a plant cell.
[0123] Bacterial host cells are selected from Gram-positive and
Gram-negative bacteria. Non-limiting examples for Gram-negative
bacterial host cells include species from the genera Escherichia,
Erwinia, Klebsiella and Citrobacter. Non-limiting examples of
Gram-positive bacterial host cells include species from the genera
Bacillus, Lactococcus, Lactobacillus, Clostridium, Corynebacterium,
Streptomyces, Streptococcus, and Cellulomonas.
[0124] According to certain embodiments, the recombinant host cell
is a bacterium, which may be a bacterium of the genus Bacillus,
Lactococcus, Lactobacillus, Clostridium, Corynebacterium,
Geobacillus, Thermoanaerobacterium, Streptococcus, Pseudomonas,
Streptomyces, Escherichia, Shigella, Acinetobacter, Citrobacter,
Salmonella, Klebsiella, Enterobacter, Erwinia, Kluyvera, Serratia,
Cedecea, Morganella, Hafnia, Edwardsiella, Providencia, Proteus, or
Yersinia.
[0125] According to particular embodiments, the recombinant host
cell is a bacterium of the genus Bacillus. Non-limiting examples of
bacteria of the genus Bacillus are Bacillus subtilis, Bacillus
amyloliquefaciens, Bacillus licheniformis, and Bacillus mojavensis.
According to more particular embodiments, the recombinant host cell
is Bacillus subtilis. According to other more particular
embodiments, the recombinant host cell is Bacillus
licheniformis.
[0126] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Lactococcus. A non-limiting
example of a bacterium of the genus Lactococcus is Lactococcus
lactis. According to more particular embodiments, the recombinant
host cell is Lactococcus lactis.
[0127] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Corynebacterium. A
non-limiting example of a bacterium of the genus Corynebacterium is
Corynebacterium glutamicum. According to more particular
embodiments, the recombinant host cell is Corynebacterium
glutamicum.
[0128] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Streptomyces. Non-limiting
examples of bacteria of the genus Streptomyces are Streptomyces
lividans, Streptomyces coelicolor, and Streptomyces griseus.
According to more particular embodiments, the recombinant host cell
is Streptomyces lividans. According to other more particular
embodiments, the recombinant host cell is Streptomyces coelicolor.
According to other more particular embodiments, the recombinant
host cell is Streptomyces griseus.
[0129] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Pseudomonas. A non-limiting
example of a bacterium of the genus Pseudomonas is Pseudomonas
putida. According to more particular embodiments, the recombinant
host cell is Pseudomonas putida.
[0130] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Geobacillus. Non-limiting
examples of bacteria of the genus Geobacillus are Geobacillus
thermoglucosidasius and Geobacillus stearothermophilus. According
to more particular embodiments, the recombinant host cell is
Geobacillus thermoglucosidasius. According to other more particular
embodiments, the recombinant host cell is Geobacillus
stearothermophilus.
[0131] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Thermoanaerobacterium. A
non-limiting example of a bacterium of the genus
Thermoanaerobacterium is Thermoanaerobacterium
thermosaccharolyticum. According to more
particular embodiments, the recombinant host cell is
Thermoanaerobacterium thermosaccharolyticum.
[0132] According to other particular embodiments, the recombinant
host cell is a bacterium of the genus Escherichia. A non-limiting
example of a bacterium of the genus Escherichia is Escherichia
coli. According to more particular embodiments, the recombinant
host cell is Escherichia coli.
[0133] Yeast host cells may be derived from e.g., Saccharomyces,
Pichia, Schizosaccharomyces, Zygosaccharomyces, Hansenula,
Pachysolen, Kluyveromyces, Debaryomyces, Yarrowia, Candida,
Cryptococcus, Komagataella, Lipomyces, Rhodosporidium, Rhodotorula,
or Trichosporon.
[0134] According to certain embodiments, the recombinant host cell
is a yeast, which may be a yeast of the genus Saccharomyces,
Pichia, Schizosaccharomyces, Zygosaccharomyces, Hansenula,
Pachysolen, Kluyveromyces, Debaryomyces, Yarrowia, Candida,
Cryptococcus, Komagataella, Lipomyces, Rhodosporidium, Rhodotorula,
or Trichosporon.
[0135] According to particular embodiments, the recombinant host
cell is a yeast of the genus Saccharomyces. A non-limiting example
of a yeast of the genus Saccharomyces is Saccharomyces cerevisiae.
According to more particular embodiments, the recombinant host cell
is Saccharomyces cerevisiae.
[0136] According to particular embodiments, the recombinant host
cell is a yeast of the genus Pichia. Non-limiting examples of
yeasts of the genus Pichia are Pichia pastoris and Pichia
kudriavzevii. According to more particular embodiments, the
recombinant host cell is Pichia pastoris. According to other more
particular embodiments, the recombinant host cell is Pichia
kudriavzevii.
[0137] Fungal host cells may be derived from, e.g., Aspergillus.
[0138] According to certain embodiments, the recombinant host cell
is a fungus, such as a fungus of the genus Aspergillus. Non-limiting
examples of fungi of the genus Aspergillus are Aspergillus
oryzae, Aspergillus niger and Aspergillus awamori. According to more
particular embodiments, the recombinant host cell is Aspergillus
oryzae. According to other more particular embodiments, the
recombinant host cell is Aspergillus niger. According to other more
particular embodiments, the recombinant host cell is Aspergillus
awamori.
[0139] Algae host cells may be derived from, e.g., Chlamydomonas,
Haematococcus, Phaeodactylum, Volvox or Dunaliella.
[0140] According to certain embodiments, the recombinant host cell
is an alga, which may be an alga of the genus Chlamydomonas,
Haematococcus, Phaeodactylum, Volvox or Dunaliella.
[0141] According to particular embodiments, the recombinant host
cell is an alga cell of the genus Chlamydomonas. A non-limiting
example of an alga of the genus Chlamydomonas is Chlamydomonas
reinhardtii.
[0142] According to particular embodiments, the recombinant host
cell is an alga cell of the genus Haematococcus. A non-limiting
example of an alga of the genus Haematococcus is Haematococcus
pluvialis.
[0143] According to other particular embodiments, the recombinant
host cell is an alga cell of the genus Phaeodactylum. A non-limiting
example of an alga of the genus Phaeodactylum is Phaeodactylum
tricornutum.
[0144] A plant host cell may be derived from, e.g., soybean,
rapeseed, sunflower, cotton, corn, tobacco, alfalfa, wheat, barley,
oats, sorghum, lettuce, rice, broccoli, cauliflower, cabbage,
parsnips, melons, carrots, celery, parsley, tomatoes, potatoes,
strawberries, peanuts, grapes, grass seed crops, sugar beets, sugar
cane, beans, peas, rye, flax, hardwood trees, softwood trees, and
forage grasses.
[0145] According to certain embodiments, the recombinant host cell
is a plant cell, such as a plant cell selected from the group
consisting of soybean, rapeseed, sunflower, cotton, corn, tobacco,
alfalfa, wheat, barley, oats, sorghum, lettuce, rice, broccoli,
cauliflower, cabbage, parsnips, melons, carrots, celery, parsley,
tomatoes, potatoes, strawberries, peanuts, grapes, grass seed
crops, sugar beets, sugar cane, beans, peas, rye, flax, hardwood
trees, softwood trees, and forage grasses.
[0146] Generally, a recombinant host cell according to the
invention has been genetically modified to express one or more
polypeptides as detailed herein, which means that one or more
exogenous nucleic acid molecules, such as DNA molecules, which
comprise(s) a nucleotide sequence or nucleotide sequences encoding
said polypeptide or polypeptides have been introduced into the host
cell. Techniques for introducing an exogenous nucleic acid molecule,
such as a DNA molecule, into the various host cells are well-known
to those of skill in the art, and include transformation (e.g.,
heat shock or natural transformation), transfection, conjugation,
electroporation, microinjection and microparticle bombardment.
[0147] Accordingly, a recombinant host cell according to the
invention comprises an exogenous nucleic acid molecule comprising a
nucleotide sequence encoding a polypeptide as detailed herein.
[0148] In order to facilitate expression of the polypeptide in the
host cell, the exogenous nucleic acid molecule may comprise
suitable regulatory elements such as a promoter that is functional
in the host cell to cause the production of an mRNA molecule and
that is operably linked to the nucleotide sequence encoding said
polypeptide.
[0149] Promoters useful in accordance with the invention are any
known promoters that are functional in a given host cell to cause
the production of an mRNA molecule. Many such promoters are known
to the skilled person. Such promoters include promoters normally
associated with other genes, and/or promoters isolated from any
bacteria, yeast, fungi, alga or plant cell. The use of promoters
for protein expression is generally known to those skilled in
the art of molecular biology, for example, see Sambrook et al.,
Molecular cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y., 1989. The promoter employed
may be inducible. The term "inducible" used in the context of a
promoter means that the promoter only directs transcription of an
operably linked nucleotide sequence if a stimulus is present, such
as a change in temperature or the presence of a chemical substance
("chemical inducer"). As used herein, "chemical induction"
according to the present invention refers to the physical
application of an exogenous or endogenous substance (incl.
macromolecules, e.g., proteins or nucleic acids) to a host cell.
This has the effect of causing the target promoter present in the
host cell to increase the rate of transcription. Alternatively, the
promoter employed may be constitutive. The term "constitutive" used
in the context of a promoter means that the promoter is capable of
directing transcription of an operably linked nucleotide sequence
in the absence of a stimulus (such as heat shock, chemicals,
etc.).
[0150] Non-limiting examples of promoters functional in bacteria,
such as Bacillus subtilis, Lactococcus lactis or Escherichia coli,
include both constitutive and inducible promoters such as the T7
promoter, the beta-lactamase and lactose promoter systems, the alkaline
phosphatase (phoA) promoter, a tryptophan (trp) promoter system, the
tetracycline promoter, the lambda-phage promoter, ribosomal protein
promoters, and hybrid promoters such as the tac promoter. Other
bacterial and synthetic promoters are also suitable.
[0151] Non-limiting examples of promoters functional in yeast, such
as Saccharomyces cerevisiae, include the xylose promoter, the GAL1 and
GAL10 promoters, the TEF1 promoter, and the pgk1 promoter.
[0152] Non-limiting examples of promoters functional in fungi, such
as Aspergillus oryzae or Aspergillus niger, include promoters
derived from the genes encoding Aspergillus oryzae TAKA amylase,
Aspergillus niger neutral α-amylase, Aspergillus niger acid
stable α-amylase, Aspergillus niger or Aspergillus awamori
glucoamylase (gluA), Aspergillus niger acetamidase, Aspergillus
oryzae alkaline protease, Aspergillus oryzae triose phosphate
isomerase, Rhizomucor miehei aspartic proteinase, and Rhizomucor miehei
lipase.
[0153] Non-limiting examples of promoters functional in algae, such
as Haematococcus pluvialis, include the CaMV35S promoter, the SV40
promoter, the promoter of the Chlamydomonas reinhardtii RBCS2 gene
and the promoter of the Volvox carteri ARS gene.
[0154] Non-limiting examples of promoters functional in plant cells
include the Lactuca sativa psbA promoter, the tobacco psbA
promoter, the tobacco rrn16 PEP+NEP promoter, the CaMV 35S
promoter, the 19S promoter, the tomato E8 promoter, the nos
promoter, the Mac promoter, the pet E promoter and the ACT1
promoter.
[0155] Besides a promoter, the exogenous nucleic acid molecule may
further comprise at least one regulatory element selected from a 5'
untranslated region (5'UTR) and 3' untranslated region (3' UTR).
Many such 5' UTRs and 3' UTRs derived from prokaryotes and
eukaryotes are well known to the skilled person. Such regulatory
elements include 5' UTRs and 3' UTRs normally associated with other
genes, and/or 5' UTRs and 3' UTRs isolated from any bacteria,
yeast, fungi, alga or plant cell.
[0156] If the host cell is a prokaryotic organism, the 5' UTR
usually contains a ribosome binding site (RBS), also known as the
Shine-Dalgarno sequence, which is usually located 3-10 base pairs
upstream from the initiation codon. Meanwhile, if the host cell is a
eukaryotic organism, the 5' UTR usually contains the Kozak consensus
sequence. A eukaryotic 5' UTR may also contain cis-acting
regulatory elements.
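By way of illustration only, the positional relationship described above for a prokaryotic 5' UTR can be sketched in Python. The function name find_sd_sites, the consensus core AGGAGG, the minimum match length of four bases and the example sequences are assumptions made for this sketch and are not taken from the present disclosure. The spacing is counted here as the number of bases between the end of the motif and the initiation codon, which is likewise an assumption.

```python
# Illustrative sketch: locate Shine-Dalgarno-like motifs in a 5' UTR.
# SD_CORE is an assumed consensus core, used here for demonstration only.
SD_CORE = "AGGAGG"


def find_sd_sites(utr, min_len=4, min_gap=3, max_gap=10):
    """Return (start, motif, gap) tuples for SD-like motifs in `utr`.

    `utr` is the sequence immediately upstream of the initiation codon,
    i.e. the initiation codon is assumed to follow the last base of `utr`.
    `gap` is the number of bases between the motif and the initiation codon.
    """
    utr = utr.upper().replace("U", "T")  # accept RNA or DNA notation
    # All contiguous sub-motifs of the core that are at least min_len long.
    motifs = {
        SD_CORE[i:j]
        for i in range(len(SD_CORE))
        for j in range(i + min_len, len(SD_CORE) + 1)
    }
    hits = []
    for motif in motifs:
        start = utr.find(motif)
        while start != -1:
            gap = len(utr) - (start + len(motif))
            if min_gap <= gap <= max_gap:
                hits.append((start, motif, gap))
            start = utr.find(motif, start + 1)
    # Longest motif first, then leftmost position.
    return sorted(hits, key=lambda h: (-len(h[1]), h[0]))


if __name__ == "__main__":
    # Hypothetical UTR: full core motif followed by a 7-base spacer.
    for hit in find_sd_sites("CCCAGGAGGTTTTTTT"):
        print(hit)
```

Such a scan only flags candidate sites by sequence and spacing; it does not predict translation initiation strength.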
[0157] The exogenous nucleic acid molecule may be a vector or part
of a vector, such as an expression vector. Normally, such a vector
remains extrachromosomal within the host cell which means that it
is found outside of the nucleus or nucleoid region of the host
cell.
[0158] According to certain embodiments, a recombinant host cell
according to the invention does not express an endogenous
PAPS-dependent aryl sulfotransferase.
[0159] It is also contemplated by the present invention that the
exogenous nucleic acid molecule is stably integrated into the
genome of the host cell. Means for stable integration into the
genome of a host cell, e.g., by homologous recombination, are well
known to the skilled person.
[0160] The sulfation reaction depends on the supply of sulfate, either
from 3'-phosphoadenosine 5'-phosphosulfate (PAPS) or transferred from
another sulfated compound. The inventors have shown that the
sulfation reaction can be enhanced by improving the supply of PAPS
and, in addition, by the
removal of the product 3'-phosphoadenosine 5'-phosphate (PAP). The
improved supply is obtained by deregulation, mutation or
overexpression of enzymes that increase PAPS concentration or
similarly reduce PAP concentration. This is exemplified in Example
2, where an increased production of zosteric acid in Escherichia
coli is obtained by increasing the expression of the genes cysD,
cysN, and cysC which are responsible for production of PAPS.
Without being bound to a specific theory, it is believed that an
adenylyl moiety (AMP) of ATP is transferred to sulfate to form
activated sulfate, or APS (adenosine 5'-phosphosulfate). This
extremely unfavorable reaction is kinetically and energetically
linked to the hydrolysis of GTP by the enzyme ATP sulfurylase,
which is composed of two types of subunits: an adenylyl transferase
(cysD) and a GTPase (cysN). APS is then phosphorylated at the
3'-hydroxyl to form PAPS (3'-phosphoadenosine 5'-phosphosulfate) in
a reaction catalysed by APS kinase, which is encoded by cysC.
Furthermore, the inventors have enhanced the production of zosteric
acid even more by increasing the expression of the gene cysQ
encoding a PAP phosphatase which is responsible for the removal of
PAP.
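For orientation, the reactions referred to above may be summarised in the following scheme, written in LaTeX. The scheme is an illustrative summary based on the description above and on general biochemical knowledge; the by-products shown (pyrophosphate, ADP, AMP and inorganic phosphate) are not recited in the passage above and are included as assumptions. The GTP hydrolysis coupled to the first step is omitted for brevity.

```latex
\begin{align*}
\mathrm{SO_4^{2-}} + \mathrm{ATP}
  &\xrightarrow{\text{ATP sulfurylase (cysD, cysN)}} \mathrm{APS} + \mathrm{PP_i} \\
\mathrm{APS} + \mathrm{ATP}
  &\xrightarrow{\text{APS kinase (cysC)}} \mathrm{PAPS} + \mathrm{ADP} \\
\mathrm{PAPS} + \text{phenolic compound}
  &\xrightarrow{\text{aryl sulfotransferase}} \text{sulfated phenolic compound} + \mathrm{PAP} \\
\mathrm{PAP} + \mathrm{H_2O}
  &\xrightarrow{\text{PAP phosphatase (cysQ)}} \mathrm{AMP} + \mathrm{P_i}
\end{align*}
```

Read in this way, increased expression of cysD, cysN and cysC acts on the first two steps (supply of PAPS), whereas increased expression of cysQ acts on the last step (removal of PAP).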
[0161] Therefore, in order to further improve the production of a
sulfated phenolic compound, such as zosteric acid, a recombinant
host cell according to the present invention may be further
modified to have an increased protein expression of an ATP
sulfurylase compared to an identical host cell that does not carry
said modification; may be further modified to have an increased
protein expression of an APS kinase compared to an identical host
cell that does not carry said modification; and/or may be further
modified to have an increased protein expression of a PAP
phosphatase compared to an identical host cell that does not carry
said modification. By "increased protein expression" it is meant
that the amount of the respective protein produced by the thus
modified host cell is increased compared to an identical host cell
that does not carry said modification. More particularly, by
"increased protein expression" it is meant that the amount of the
respective protein produced by the thus modified host cell is increased by at
least 10%, such as at least 20%, at least 30%, at least 40%, at
least 50%, at least 60%, at least 70%, at least 80%, at least 90%,
at least 100%, at least 150%, at least 200%, at least 300%, at
least 400%, at least 500%, at least 600%, at least 700%, at least
800%, at least about 900%, at least about 1000%, at least about
2000%, at least about 3000%, at least about 4000%, at least about
5000%, at least about 6000%, at least about 7000%, at least about
8000%, at least about 9000% or at least about 10000%, compared to an
identical host cell that does not carry said modification. The
amount of protein in a given cell can be determined by any suitable
quantification technique known in the art, such as ELISA,
Immunohistochemistry or Western Blotting.
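The percentage thresholds recited above can be illustrated with a short calculation, sketched here in Python. The function names and the numerical values are hypothetical and serve only to show the arithmetic; the protein amounts are assumed to be expressed in the same arbitrary units (for example, a quantified ELISA or Western blot signal) for the modified host cell and for the identical host cell that does not carry the modification.

```python
def percent_increase(modified, reference):
    """Percentage increase of `modified` over `reference` (same units)."""
    if reference <= 0:
        raise ValueError("reference amount must be positive")
    return (modified - reference) / reference * 100.0


def meets_threshold(modified, reference, threshold_pct=10.0):
    """True if the increase is at least `threshold_pct` percent."""
    return percent_increase(modified, reference) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical protein amounts in arbitrary units.
    reference_amount = 2.0
    modified_amount = 3.0
    print(percent_increase(modified_amount, reference_amount))
    print(meets_threshold(modified_amount, reference_amount, 50.0))
```

Under this reading, an increase of 100% corresponds to a doubling of the amount of protein, and an increase of 1000% corresponds to an eleven-fold amount.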
[0162] According to certain embodiments, a recombinant host cell
according to the invention has further been modified to have an
increased protein expression of an ATP sulfurylase compared to an
identical host cell that does not carry said modification.
[0163] According to certain embodiments, a recombinant host cell
according to the invention has further been modified to have an
increased protein expression of an APS kinase compared to an
identical host cell that does not carry said modification.
[0164] According to certain embodiments, a recombinant host cell
according to the invention has further been modified to have an
increased protein expression of a PAP phosphatase compared to an
identical host cell that does not carry said modification.
[0165] An increase in protein expression may be achieved by any
suitable means well known to those skilled in the art. For example,
an increase in protein expression may be achieved by increasing the
number of copies of the gene or genes encoding the respective
protein (e.g., ATP sulfurylase, APS kinase and/or PAP phosphatase)
in the host cell, such as by using (e.g., introducing into the host
cell) a vector comprising the gene or genes operably linked to a
promoter that is functional in the host cell to cause the
production of an mRNA molecule. An increase in protein expression
may also be achieved by integration of at least a second copy of
the gene or genes encoding the respective protein into the genome
of the host cell. An increase in protein expression may also be
achieved by increasing the strength of the promoter(s) operably
linked to the gene or genes. An increase in protein expression may
also be achieved by modifying the ribosome binding site on the mRNA
molecule encoding the respective protein (e.g., ATP sulfurylase,
APS kinase and/or PAP phosphatase). By modifying the sequence of
the ribosome binding site the translation initiation rate may be
increased, thus increasing the translation efficiency.
[0166] ATP sulfurylase encoding genes for use according to the
invention may for instance be the cysD and cysN genes from
Escherichia coli (encoding SEQ ID NO: 25 and 26, respectively).
Alternative ATP sulfurylase encoding genes include the Arabidopsis
thaliana ATP sulfurylase ASAL gene (GenBank Accession No. U40715,
Logan et al. (1996) J Biol Chem 271: 12227); the Allium cepa
ATP sulfurylase gene (GenBank Accession No. AF21154); the Lotus
japonicus ATP sulfurylase gene (GenBank Accession No. AW164083);
and the Arabidopsis thaliana met3-1 ATP sulfurylase gene (GenBank
Accession No. X79210).
[0167] According to certain embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising one or more nucleotide
sequences encoding an ATP sulfurylase.
[0168] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 25 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 25, provided that the sequence identity is
not 100%, and a nucleotide sequence encoding iii) a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO: 26 or iv)
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 26,
provided that the sequence identity is not 100%. Preferably, the
polypeptides assemble to form a protein having ATP sulfurylase
activity.
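The sequence identity criterion recited above (at least a given percentage, but not 100%) can be illustrated with the following Python sketch. It assumes that the two amino acid sequences have already been aligned by a suitable alignment tool, with gaps written as "-", and it counts identity over all alignment columns; this convention, the function names and the short example sequences are assumptions made for illustration and do not reproduce SEQ ID NO: 25 or 26, nor any particular definition of sequence identity.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences (gaps as '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_variant_in_range(aligned_ref, aligned_candidate, min_identity=70.0):
    """True if identity is at least `min_identity` but not 100%."""
    identity = percent_identity(aligned_ref, aligned_candidate)
    return min_identity <= identity < 100.0


if __name__ == "__main__":
    # Hypothetical ten-residue sequences differing at two positions.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQK"
    print(percent_identity(reference, candidate))
    print(is_variant_in_range(reference, candidate, 70.0))
```

A candidate identical to the reference is deliberately rejected by is_variant_in_range, mirroring the proviso that the sequence identity is not 100%.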
[0169] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 25 and a nucleotide sequence encoding a polypeptide
comprising an amino acid sequence set forth in SEQ ID NO: 26.
[0170] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 25,
provided that the sequence identity is not 100%, and a nucleotide
sequence encoding a polypeptide comprising an amino acid sequence
which has at least about 70%, such as at least about 75%, at least
about 80%, at least about 85%, at least about 90%, at least about
93%, at least 95%, at least 96%, at least 97%, at least 98%, or at
least 99%, sequence identity to the amino acid sequence set forth
in SEQ ID NO: 26, provided that the sequence identity is not 100%.
Preferably, the polypeptides assemble to form a protein having ATP
sulfurylase activity.
[0171] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 25, provided that the sequence identity is not 100%, and a
nucleotide sequence encoding a polypeptide comprising an amino acid
sequence which has at least about 85%, such as at least about 90%,
at least about 93%, at least 95%, at least 96%, at least 97%, at
least 98%, or at least 99%, sequence identity to the amino acid
sequence set forth in SEQ ID NO: 26, provided that the sequence
identity is not 100%. Preferably, the polypeptides assemble to form
a protein having ATP sulfurylase activity.
[0172] An alternative ATP sulfurylase encoding gene for use
according to the invention may for instance be the MET3 gene from
Saccharomyces cerevisiae (encoding SEQ ID NO: 68).
[0173] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 68 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 68. Preferably, the polypeptide according
to ii) has ATP sulfurylase activity.
[0174] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 68.
[0175] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 68.
Preferably, the polypeptide has ATP sulfurylase activity.
[0176] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 68. Preferably, the polypeptide has ATP sulfurylase
activity.
[0177] An alternative ATP sulfurylase encoding gene for use
according to the invention may for instance be the ATP sulfurylase
encoding gene from Bacillus subtilis (encoding SEQ ID NO: 73).
[0178] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 73 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 73. Preferably, the polypeptide according
to ii) has ATP sulfurylase activity.
[0179] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 73.
[0180] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 73.
Preferably, the polypeptide has ATP sulfurylase activity.
[0181] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 73. Preferably, the polypeptide has ATP sulfurylase
activity.
[0182] In order to facilitate expression of the polypeptides in the
host cell, the exogenous nucleic acid molecule may comprise
suitable regulatory elements such as a promoter that is functional
in the host cell to cause the production of an mRNA molecule and
that is operably linked to the nucleotide sequences encoding said
polypeptides.
[0183] An APS kinase encoding gene for use according to the
invention may for instance be the cysC gene from Escherichia coli
(encoding SEQ ID NO: 27).
[0184] In certain instances a single polypeptide has been shown to
possess both an ATP sulfurylase and a 5'-adenylylsulfate kinase
activity. For example, an ATP sulfurylase/APS kinase encoding gene
has been isolated from mouse (GenBank Accession No. U34883, Li et
al. (1995) J Biol Chem 270: 1945), and human (GenBank Accession No.
AF033026, Yanagisawa (1998) Biosci Biotechnol Biochem 62: 1037)
sources. Other examples of such bifunctional enzymes include
3'-phosphoadenosine 5'-phosphosulfate synthase enzymes (PAPSS) from
rat (Rattus norvegicus) (SEQ ID NO: 71 or 72).
[0185] According to certain embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding an APS kinase.
[0186] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 27 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 27, provided that the sequence identity is
not 100%. Preferably, said polypeptide according to ii) has APS
kinase activity.
[0187] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 27.
[0188] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 27,
provided that the sequence identity is not 100%. Preferably, said
polypeptide has APS kinase activity.
[0189] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 27, provided that the sequence identity is not 100%. Preferably,
said polypeptide has APS kinase activity.
[0190] An alternative APS kinase encoding gene for use according to
the invention may for instance be the MET14 gene from Saccharomyces
cerevisiae (encoding SEQ ID NO: 69).
[0191] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 69 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 69. Preferably, said polypeptide according
to ii) has APS kinase activity.
[0192] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 69.
[0193] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 69.
Preferably, said polypeptide has APS kinase activity.
[0194] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 69. Preferably, said polypeptide has APS kinase activity.
[0195] An alternative APS kinase encoding gene for use according to
the invention may for instance be the APS kinase encoding gene from
Bacillus subtilis (encoding SEQ ID NO: 74).
[0196] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 74 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 74. Preferably, said polypeptide according
to ii) has APS kinase activity.
[0197] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 74.
[0198] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 74.
Preferably, said polypeptide has APS kinase activity.
[0199] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 74. Preferably, said polypeptide has APS kinase activity.
[0200] Alternatively, a polypeptide having both an ATP sulfurylase
and an APS kinase activity can be used, such as a
3'-phosphoadenosine 5'-phosphosulfate synthase (PAPSS).
[0201] According to certain embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a 3'-phosphoadenosine 5'-phosphosulfate synthase.
[0202] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 71 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 71. Preferably, said polypeptide according
to ii) has both an ATP sulfurylase and an APS kinase activity.
[0203] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 71.
[0204] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 71.
Preferably, said polypeptide has both an ATP sulfurylase and an APS
kinase activity.
[0205] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 71. Preferably, said polypeptide has both an ATP sulfurylase
and an APS kinase activity.
[0206] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 72 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 72. Preferably, said polypeptide according
to ii) has both an ATP sulfurylase and an APS kinase activity.
[0207] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 72.
[0208] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 72.
Preferably, said polypeptide has both an ATP sulfurylase and an APS
kinase activity.
[0209] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 72. Preferably, said polypeptide has both an ATP sulfurylase
and an APS kinase activity.
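The sequence identity thresholds recited in the preceding embodiments can be illustrated with a minimal Python sketch. It assumes that the two amino acid sequences have already been aligned (gaps written as "-") and that identity is counted as identical residues divided by the number of aligned columns; this denominator convention, the function names and the toy sequences are assumptions made for illustration only, and the definition of sequence identity given elsewhere in this specification governs.

```python
# Illustrative only: percent identity over a pre-computed pairwise alignment.
# The denominator convention (all aligned columns, gaps included) is an assumption.

THRESHOLDS = (70, 75, 80, 85, 90, 93, 95, 96, 97, 98, 99)  # percentages recited above


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of aligned columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def thresholds_met(identity: float) -> list:
    """List every recited threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any SEQ ID NO.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYLAKQR"
    identity = percent_identity(reference, candidate)
    print(identity, thresholds_met(identity))
```

In practice the alignment itself would be produced by an established alignment tool; the sketch only shows how a computed identity maps onto the tiered "at least about 70% ... at least 99%" language.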
[0210] In order to facilitate expression of the polypeptide in the
host cell, the exogenous nucleic acid molecule may comprise
suitable regulatory elements such as a promoter that is functional
in the host cell to cause the production of an mRNA molecule and
that is operably linked to the nucleotide sequence encoding said
polypeptide.
[0211] A PAP phosphatase encoding gene for use according to the
invention may for instance be the cysQ gene from Escherichia coli
(encoding SEQ ID NO: 28).
[0212] According to certain embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a PAP phosphatase.
[0213] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 28 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 28, provided that the sequence identity is
not 100%. Preferably, said polypeptide according to ii) has PAP
phosphatase activity.
[0214] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 28.
[0215] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 28,
provided that the sequence identity is not 100%. Preferably, said
polypeptide has PAP phosphatase activity.
[0216] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 28, provided that the sequence identity is not 100%. Preferably,
said polypeptide has PAP phosphatase activity.
[0217] An alternative PAP phosphatase encoding gene for use
according to the invention may for instance be the MET22 gene from
Saccharomyces cerevisiae (encoding SEQ ID NO: 70).
[0218] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 70 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 70. Preferably, said polypeptide according
to ii) has PAP phosphatase activity.
[0219] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 70.
[0220] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 70.
Preferably, said polypeptide has PAP phosphatase activity.
[0221] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 70. Preferably, said polypeptide has PAP phosphatase
activity.
[0222] An alternative PAP phosphatase encoding gene for use
according to the invention may for instance be the PAP phosphatase
encoding gene from Bacillus subtilis (encoding SEQ ID NO: 75).
[0223] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding i) a polypeptide comprising an amino acid sequence set
forth in SEQ ID NO: 75 or ii) a polypeptide comprising an amino
acid sequence which has at least about 70%, such as at least about
75%, at least about 80%, at least about 85%, at least about 90%, at
least about 93%, at least 95%, at least 96%, at least 97%, at least
98%, or at least 99%, sequence identity to the amino acid sequence
set forth in SEQ ID NO: 75. Preferably, said polypeptide according
to ii) has PAP phosphatase activity.
[0224] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence
encoding a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 75.
[0225] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 70%, such as at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 93%, at least 95%, at
least 96%, at least 97%, at least 98%, or at least 99%, sequence
identity to the amino acid sequence set forth in SEQ ID NO: 75.
Preferably, said polypeptide has PAP phosphatase activity.
[0226] According to particular embodiments, a recombinant host cell
according to the invention comprises an exogenous nucleic acid
molecule (such as a vector) comprising a nucleotide sequence encoding
a polypeptide comprising an amino acid sequence which has at least
about 85%, such as at least about 90%, at least about 93%, at least
95%, at least 96%, at least 97%, at least 98%, or at least 99%,
sequence identity to the amino acid sequence set forth in SEQ ID
NO: 75. Preferably, said polypeptide has PAP phosphatase
activity.
[0227] In order to facilitate expression of the polypeptide in the
host cell, the exogenous nucleic acid molecule may comprise
suitable regulatory elements such as a promoter that is functional
in the host cell to cause the production of an mRNA molecule and
that is operably linked to the nucleotide sequence encoding said
polypeptide.
[0228] It is understood that the details given herein with respect
to a recombinant host cell apply to other aspects of the invention,
in particular to the processes and uses according to the invention,
which are described in more detail below.
[0229] Methods and Uses
[0230] The present invention provides processes for the production
of sulfated phenolic compounds. Particularly, a process for the
production of a sulfated phenolic compound is provided
comprising:
[0231] (i') contacting a medium comprising a phenolic compound with
a first recombinant host cell; wherein the first recombinant host
cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity; or
[0232] (i'') contacting a medium comprising a fermentable carbon
substrate with a first recombinant host cell; wherein the first
recombinant host cell comprises a heterologous polypeptide having
an aryl sulfotransferase activity; or
[0233] (i''') contacting a medium comprising a precursor of a
phenolic compound with a first recombinant host cell; wherein the
first recombinant host cell comprises a heterologous polypeptide
having an aryl sulfotransferase activity.
[0234] According to certain embodiments, the process for the
production of a sulfated phenolic compound comprises:
[0235] (i') contacting a medium comprising a phenolic compound with
a first recombinant host cell; wherein the first recombinant host
cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity.
[0236] According to other certain embodiments, the process for the
production of a sulfated phenolic compound comprises:
[0237] (i'') contacting a medium comprising a fermentable carbon
substrate with a first recombinant host cell; wherein the first
recombinant host cell comprises a heterologous polypeptide having
an aryl sulfotransferase activity.
[0238] According to other certain embodiments, the process for the
production of a sulfated phenolic compound comprises:
[0239] (i''') contacting a medium comprising a precursor of a
phenolic compound with a first recombinant host cell; wherein the
first recombinant host cell comprises a heterologous polypeptide
having an aryl sulfotransferase activity.
[0240] The medium employed may be any conventional medium suitable
for culturing the host cell in question, and may be composed
according to the principles of the prior art. The medium will
usually contain all nutrients necessary for the growth and survival
of the respective host cell, such as carbon and nitrogen sources
and other inorganic salts. Suitable media, e.g. minimal or complex
media, are available from commercial suppliers, or may be prepared
according to published recipes, e.g. the American Type Culture
Collection (ATCC) Catalogue of strains. Non-limiting examples of
standard media well known to the skilled person include Luria
Bertani (LB) broth, Sabouraud Dextrose (SD) broth, MS broth, Yeast
Peptone Dextrose, BMMY, GMMY, and Yeast Malt Extract (YM) broth,
which are all commercially available. Non-limiting examples of
suitable media for culturing bacterial cells, such as B. subtilis,
L. lactis or E. coli cells, include minimal media and rich media
such as Luria Broth (LB), M9 media, M17 media, SA media, MOPS media,
Terrific Broth, YT and others. Suitable media for culturing
eukaryotic cells, such as yeast cells, are RPMI 1640, MEM, DMEM,
all of which may be supplemented with serum and/or growth factors
as required by the particular host cell being cultured. The medium
for culturing eukaryotic cells may also be any kind of minimal
media such as Yeast minimal media.
[0241] The fermentable carbon substrate may be any suitable carbon
substrate known in the art, and in particular any carbon substrate
commonly used in the cultivation of microorganisms and/or
fermentation. Non-limiting examples of suitable fermentable carbon
substrates include carbohydrates (e.g., C5 sugars such as arabinose
or xylose, or C6 sugars such as glucose), glycerol, glycerine,
acetate, dihydroxyacetone, one-carbon source, methanol, methane,
oils, animal fats, animal oils, plant oils, fatty acids, lipids,
phospholipids, glycerolipids, monoglycerides, diglycerides,
triglycerides, renewable carbon sources, polypeptides (e.g., a
microbial or plant protein or peptide), yeast extract, component
from a yeast extract, peptone, casaminoacids or any combination of
two or more of the foregoing.
[0242] According to certain embodiments, the carbon substrate is
selected from the group consisting of C5 sugars (such as arabinose
or xylose), C6 sugars (such as glucose or fructose), lactose,
sucrose, glycerol, glycerine, acetate, Corn steep liquor, yeast
extract, component from a yeast extract, peptone, casaminoacids or
combinations thereof.
[0243] According to certain embodiments, the medium comprises
glucose.
[0244] According to certain other embodiments, the medium comprises
glycerol.
[0245] According to certain other embodiments, the medium comprises
acetate.
[0246] It is also contemplated to use starch as a carbon substrate.
Depending on the microorganism used, the metabolization of starch
may require the supplementation of beta-glucosidase, such as the
beta-glucosidase from Neurospora crassa, to the medium.
Alternatively, a recombinant host cell according to the invention
may be further genetically modified to express a beta-glucosidase,
such as the beta-glucosidase from Neurospora crassa.
[0247] When a fermentable carbon substrate is employed it is thus
possible that the recombinant host cell produces the phenolic
compound or a precursor thereof directly from such primary carbon
substrate.
[0248] According to certain embodiments, the process further
comprises:
[0249] (ii) culturing the first recombinant host cell under
suitable conditions for the production of the corresponding
sulfated phenolic compound.
[0250] Suitable conditions for culturing the respective host cell
are well known to the skilled person. Typically, the recombinant
host cell is cultured at a temperature ranging from about 23 to
about 60.degree. C., such as from about 25 to about 40.degree. C.,
such as at about 37.degree. C. The pH of the medium may range from
pH 1.0 to pH 14.0, such as from about pH 1 to about pH 2, from
about pH 4 to about pH 11, from about pH 5 to about pH 10, from
about pH 6 to about pH 10, or from about pH 7 to about pH 9.5, e.g.
at pH 6.0, pH 7.0, pH 7.5, pH 8.0, pH 8.5, pH 9.0, pH 9.5, pH
10.0, pH 10.5 or pH 11.0.
[0251] The process may further comprise iii) recovering the
sulfated phenolic compound. The sulfated phenolic compound may be
recovered by conventional methods for the isolation and purification
of chemical compounds from a medium. Well-known purification
procedures include centrifugation or filtration, precipitation, and
chromatographic methods such as e.g. ion exchange chromatography,
gel filtration chromatography, etc.
[0252] Further provided is a process for the production of a
sulfated phenolic compound, such as zosteric acid, the process
comprising sulfating a phenolic compound, such as p-coumaric acid,
using a polypeptide having aryl sulfotransferase activity as
detailed herein. Such polypeptide may be selected from the group
consisting of:
[0253] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0254] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0255] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
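Alternative c) above counts amino acid residues that are substituted, deleted and/or inserted relative to a reference sequence. One way to make such a count concrete, offered here only as a hedged sketch, is the minimal edit (Levenshtein) distance between the reference and the variant. The specification does not prescribe this counting method; the function names and the toy sequences below are hypothetical.

```python
# Illustrative only: count substitutions, deletions and insertions between a
# reference sequence and a variant as the minimal edit (Levenshtein) distance.
# Using the minimal edit distance as the count is an assumption.


def edit_count(reference: str, variant: str) -> int:
    """Minimal number of single-residue substitutions, deletions and insertions."""
    previous = list(range(len(variant) + 1))
    for i, ref_residue in enumerate(reference, start=1):
        current = [i]
        for j, var_residue in enumerate(variant, start=1):
            substitution_cost = 0 if ref_residue == var_residue else 1
            current.append(min(
                previous[j] + 1,                      # delete a reference residue
                current[j - 1] + 1,                   # insert a variant residue
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def within_recited_range(reference: str, variant: str, max_changes: int = 50) -> bool:
    """True if the variant differs by 1 to max_changes residue changes."""
    return 1 <= edit_count(reference, variant) <= max_changes


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any SEQ ID NO.
    print(edit_count("MKTAYIAKQR", "MKTAYLAKQR"))        # one substitution
    print(within_recited_range("MKTAYIAKQR", "MKAYIAKQRG", max_changes=3))
```

The narrower ranges recited above (1 to 40, 1 to 35, and so on down to 1 to 3) correspond to smaller values of the hypothetical `max_changes` parameter.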
[0256] Suitable conditions for the sulfation reaction are well
known to the skilled person. Typically, the sulfation reaction
takes place at a temperature ranging from about 23 to about
60.degree. C., such as from about 25 to about 40.degree. C., such
as at about 37.degree. C. The sulfation reaction may take place
at a pH ranging from pH 1.0 to pH 14.0, such as from about pH 2 to
about pH 11, such as from about pH 5 to about pH 10, from about pH
6 to about pH 10, or from about pH 7 to about pH 9.5, e.g. at pH
6.0, pH 7.0, pH 7.5, pH 8.0, pH 8.5, pH 9.0, pH 9.5, pH 10.0, pH
10.5 or pH 11.0.
[0257] Also provided is the use of a polypeptide in the sulfation of
a phenolic compound, said polypeptide having aryl sulfotransferase
activity as detailed herein. Such polypeptide may be selected from
the group consisting of:
[0258] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0259] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0260] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0261] For the purpose of this specification and the appended
claims, it should be understood that the phenolic compounds include
those compounds in which a hydroxyl group is directly attached to a
benzenoid carbon atom, and which compounds may or may not contain
other substituent groups.
[0262] According to certain embodiments, the phenolic compound is a
compound represented by the general formula (I):
##STR00001##
[0263] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0264] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0265] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0266] Specific examples of compounds of Formula I include, but are
not limited to, resveratrol, o-, m-, and p-coumaric acid, caffeic
acid, ferulic acid, sinapic acid, curcumin, rosmarinic acid,
sinapyl alcohol, coniferyl alcohol, and salvianolic acid.
[0267] A precursor of a phenolic compound according to Formula I
may be a compound represented by the general Formula (p-I):
##STR00002##
[0268] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0269] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0270] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0271] Such a precursor may be converted to the phenolic compound
by a recombinant host cell according to the invention, comprising a
polypeptide having tyrosine ammonia lyase activity. Such
polypeptide will eliminate ammonia from the precursor of Formula
(p-I) under the formation of the corresponding molecule of Formula
I. Preferably, the p-I precursor is the L-isomer.
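The elimination of ammonia described above can be checked with a simple atom balance. The following Python sketch takes L-tyrosine as the precursor and p-coumaric acid as the product, the pair named in the definition of tyrosine ammonia lyase activity, and uses their standard molecular formulas (C9H11NO3, C9H8O3) together with NH3. The small formula parser and the function names are hypothetical helpers that handle only simple formulas without parentheses.

```python
# Illustrative only: verify that L-tyrosine -> p-coumaric acid + ammonia
# conserves every element. The parser handles simple formulas such as "C9H11NO3".
import re
from collections import Counter


def atoms(formula: str) -> Counter:
    """Count atoms per element in a simple molecular formula."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def is_balanced(substrates, products) -> bool:
    """True if both sides of the reaction contain the same atoms."""
    left = sum((atoms(f) for f in substrates), Counter())
    right = sum((atoms(f) for f in products), Counter())
    return left == right


L_TYROSINE = "C9H11NO3"
P_COUMARIC_ACID = "C9H8O3"
AMMONIA = "NH3"

if __name__ == "__main__":
    print(is_balanced([L_TYROSINE], [P_COUMARIC_ACID, AMMONIA]))
```

The same check can be applied to any other precursor and product pair of Formula (p-I) and Formula I by substituting the corresponding molecular formulas.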
[0272] According to certain embodiments, the precursor of a
phenolic compound as employed in step (i''') is a compound of the
general Formula (p-I) as defined herein.
[0273] According to certain other embodiments, the phenolic
compound is a compound represented by the general formula (II):
##STR00003##
[0274] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0275] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0276] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0277] According to certain embodiments, R.sub.6 is
--COOR.sub.7.
[0278] According to certain embodiments, R.sub.7 is hydrogen.
[0279] According to certain embodiments, R.sub.2 is hydroxyl
(--OH).
[0280] According to certain embodiments, R.sub.3 is hydroxyl
(--OH).
[0281] According to certain embodiments, R.sub.4 is hydroxyl
(--OH).
[0282] According to certain embodiments, each of R.sub.1, R.sub.2,
R.sub.4 and R.sub.5 is hydrogen.
[0283] According to certain embodiments, each of R.sub.1, R.sub.2,
and R.sub.5 is hydrogen.
[0284] According to particular embodiments, the phenolic compound
is p-coumaric acid (Formula I: R.sub.1=H, R.sub.2=H, R.sub.3=OH,
R.sub.4=H, R.sub.5=H, R.sub.6=COOH).
[0285] According to other particular embodiments, the phenolic
compound is caffeic acid (Formula I: R.sub.1=H, R.sub.2=H,
R.sub.3=OH, R.sub.4=OH, R.sub.5=H, R.sub.6=COOH).
[0286] According to other particular embodiments, the phenolic acid
is ferulic acid (Formula I: R.sub.1=H, R.sub.2=OCH.sub.3,
R.sub.3=OH, R.sub.4=H, R.sub.5=H, R.sub.6=COOH).
[0287] According to other particular embodiments, the phenolic acid
is sinapic acid (Formula I: R.sub.1=H, R.sub.2=OCH.sub.3,
R.sub.3=OH, R.sub.4=OCH.sub.3, R.sub.5=H, R.sub.6=COOH).
[0288] According to other particular embodiments, the phenolic
compound is resveratrol (Formula I: R.sub.1=H, R.sub.2=OH,
R.sub.3=H, R.sub.4=OH, R.sub.5=H, R.sub.6=p-hydroxyphenyl).
[0289] According to other particular embodiments, the phenolic
compound is vanillin (Formula II: R.sub.1=H, R.sub.2=H, R.sub.3=OH,
R.sub.4=OCH.sub.3, R.sub.5=H, R.sub.6=H).
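The substituent assignments listed in the particular embodiments above can be collected in a small lookup table. The Python sketch below copies those assignments verbatim and checks the requirement that at least one of R1 to R5 is a hydroxyl group; the dictionary layout and the function names are illustrative choices, not part of the specification.

```python
# Illustrative only: substituent assignments copied from the embodiments above.
# "OCH3" denotes a methoxy group; the second tuple element names the general formula.
COMPOUNDS = {
    "p-coumaric acid": ("I", {"R1": "H", "R2": "H", "R3": "OH", "R4": "H",
                              "R5": "H", "R6": "COOH"}),
    "caffeic acid": ("I", {"R1": "H", "R2": "H", "R3": "OH", "R4": "OH",
                           "R5": "H", "R6": "COOH"}),
    "ferulic acid": ("I", {"R1": "H", "R2": "OCH3", "R3": "OH", "R4": "H",
                           "R5": "H", "R6": "COOH"}),
    "sinapic acid": ("I", {"R1": "H", "R2": "OCH3", "R3": "OH", "R4": "OCH3",
                           "R5": "H", "R6": "COOH"}),
    "resveratrol": ("I", {"R1": "H", "R2": "OH", "R3": "H", "R4": "OH",
                          "R5": "H", "R6": "p-hydroxyphenyl"}),
    "vanillin": ("II", {"R1": "H", "R2": "H", "R3": "OH", "R4": "OCH3",
                        "R5": "H", "R6": "H"}),
}


def hydroxyl_positions(substituents: dict) -> list:
    """Return the ring positions R1..R5 that carry a hydroxyl group."""
    return [f"R{i}" for i in range(1, 6) if substituents[f"R{i}"] == "OH"]


def satisfies_hydroxyl_requirement(substituents: dict) -> bool:
    """At least one of R1..R5 must be a hydroxyl group."""
    return len(hydroxyl_positions(substituents)) >= 1


if __name__ == "__main__":
    for name, (formula, substituents) in COMPOUNDS.items():
        print(name, "Formula", formula, hydroxyl_positions(substituents))
```

Each listed compound carries at least one ring hydroxyl group, which is the position available for sulfation by a polypeptide having aryl sulfotransferase activity.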
[0290] According to certain embodiments, the phenolic compound is a
hydroxycinnamic acid.
[0291] According to certain embodiments, the phenolic compound is a
compound represented by the general formula (I), wherein R.sub.1 is
hydrogen; R.sub.2, R.sub.3 and R.sub.4 independently are selected
from the group consisting of hydrogen (H), hydroxyl (--OH),
C.sub.1-6-alkyl and C.sub.1-6-Alkoxy, provided that at least one of
R.sub.2, R.sub.3 and R.sub.4 is hydroxyl (--OH); R.sub.5 is
hydrogen, and R.sub.6 is COOH.
[0292] According to certain embodiments, the precursor of a
phenolic compound as employed in step (i''') is a compound of the
general Formula (p-I), wherein R.sub.1 is hydrogen; R.sub.2,
R.sub.3 and R.sub.4 independently are selected from the group
consisting of hydrogen (H), hydroxyl (--OH), C.sub.1-6-alkyl and
C.sub.1-6-Alkoxy, provided that at least one of R.sub.2, R.sub.3
and R.sub.4 is hydroxyl (--OH); R.sub.5 is hydrogen, and R.sub.6 is
COOH.
[0293] According to certain embodiments, the sulfated phenolic
compound obtained according to the present invention is zosteric
acid.
[0294] Suitable sulfate donor molecules metabolized by a
polypeptide having aryl sulfotransferase activity are well-known to
one skilled in the art. Non-limiting examples include
3'-phosphoadenosine 5'-phosphosulfate (PAPS), para-nitrophenyl
sulfate (pNPS) and 4-methylumbelliferyl sulfate (MUS). Such sulfate
donor molecules may be employed to facilitate the sulfation of
phenolic compounds in accordance with the invention.
[0295] The medium employed for culturing the recombinant host cell
may be any conventional medium suitable for culturing the host cell
in question, and may be composed according to the principles of the
prior art. The medium will usually contain all nutrients necessary
for the growth and survival of the respective host cell, such as
carbon and nitrogen sources and other inorganic salts, such as
sulfate salts. Suitable media, e.g. minimal or complex media, are
available from commercial suppliers, or may be prepared according
to published recipes, e.g. the American Type Culture Collection
(ATCC) Catalogue of strains. Non-limiting examples of standard media
well known to the skilled person include Luria Bertani (LB) broth,
Sabouraud Dextrose (SD) broth, MS broth, Yeast Peptone Dextrose,
BMMY, GMMY, and Yeast Malt Extract (YM) broth, which are all
commercially available. Non-limiting examples of suitable media
for culturing bacterial cells, such as B. subtilis, L. lactis or E.
coli cells, include minimal media and rich media such as Luria
Broth (LB), M9 media, M17 media, SA media, MOPS media, Terrific
Broth, YT and others. Suitable media for culturing eukaryotic
cells, such as yeast cells, are RPMI 1640, MEM, DMEM, all of which
may be supplemented with serum and/or growth factors as required by
the particular host cell being cultured. The medium for culturing
eukaryotic cells may also be any kind of minimal media such as
Yeast minimal media.
Certain Definitions
[0296] "Aryl sulfotransferase activity" as used herein refers to
the ability of a polypeptide to catalyze the transfer
of a sulfate group from a donor molecule to an aryl acceptor
molecule.
[0297] "Tyrosine ammonia lyase activity" as used herein refers to
the ability of a polypeptide to catalyze the conversion of
L-tyrosine into p-coumaric acid.
[0298] "Phenylalanine ammonia lyase activity" as used herein refers
to the ability of a polypeptide to catalyze the conversion of
L-phenylalanine into trans-cinnamic acid.
[0299] "ATP sulfurylase" as used herein refers to an enzyme that
catalyzes the reaction: ATP+sulfate=diphosphate+adenosine
5'-phosphosulfate (APS).
[0300] "APS kinase" as used herein refers to an enzyme that
catalyzes the reaction: ATP+adenosine 5'-phosphosulfate
(APS)=ADP+3'-phosphoadenosine 5'-phosphosulfate (PAPS).
[0301] "PAP phosphatase" as used herein refers to an enzyme that
catalyzes the reaction: 3'-phosphoadenosine 5'-phosphate
(PAP)+H.sub.2O=AMP+phosphate.
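The three reactions defined above can be summed to show the overall cofactor demand of PAPS formation and PAP removal. The Python sketch below encodes each reaction with signed stoichiometric coefficients and cancels the intermediates. The aryl sulfotransferase entry, which assumes PAPS as the sulfate donor and a generic "phenol" acceptor yielding PAP as by-product, is an illustrative addition; all names in the code are hypothetical.

```python
# Illustrative only: sum the reactions defined above into a net reaction.
# Negative coefficients are consumed species, positive coefficients are produced.
from collections import defaultdict

REACTIONS = {
    "ATP sulfurylase": {"ATP": -1, "sulfate": -1, "diphosphate": 1, "APS": 1},
    "APS kinase": {"ATP": -1, "APS": -1, "ADP": 1, "PAPS": 1},
    # Assumption: PAPS-dependent sulfation of a generic phenolic acceptor.
    "aryl sulfotransferase": {"PAPS": -1, "phenol": -1, "aryl sulfate": 1, "PAP": 1},
    "PAP phosphatase": {"PAP": -1, "H2O": -1, "AMP": 1, "phosphate": 1},
}


def net_reaction(step_names) -> dict:
    """Add the coefficients of the named steps and drop cancelled intermediates."""
    totals = defaultdict(int)
    for name in step_names:
        for species, coefficient in REACTIONS[name].items():
            totals[species] += coefficient
    return {species: c for species, c in totals.items() if c != 0}


def describe(net: dict) -> str:
    """Render a net reaction as 'substrates = products'."""
    def side(sign: int) -> str:
        terms = [species if abs(c) == 1 else f"{abs(c)} {species}"
                 for species, c in sorted(net.items()) if c * sign > 0]
        return " + ".join(terms)
    return f"{side(-1)} = {side(1)}"


if __name__ == "__main__":
    print(describe(net_reaction(["ATP sulfurylase", "APS kinase"])))
    print(describe(net_reaction(list(REACTIONS))))
```

Summing the first two steps shows that two ATP molecules are consumed per PAPS formed, with APS cancelling as an intermediate; including the remaining steps shows PAPS and PAP cancelling as well.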
[0302] "Polypeptide," or "protein" are used interchangeably herein
to denote a polymer of at least two amino acids covalently linked
by an amide bond, regardless of length or post-translational
modification (e.g., glycosylation, phosphorylation, lipidation,
myristoylation, ubiquitination, etc.). Included within this
definition are D- and L-amino acids, and mixtures of D- and L-amino
acids.
[0303] "Nucleic acid" or "polynucleotide" are used interchangeably
herein to denote a polymer of at least two nucleic acid monomer
units or bases (e.g., adenine, cytosine, guanine, thymine)
covalently linked by a phosphodiester bond, regardless of length or
base modification.
[0304] "Recombinant" or "non-naturally occurring" when used with
reference to, e.g., a host cell, nucleic acid, or polypeptide,
refers to a material, or a material corresponding to the natural or
native form of the material, that has been modified in a manner
that would not otherwise exist in nature, or is identical thereto
but produced or derived from synthetic materials and/or by
manipulation using recombinant techniques. Non-limiting examples
include, among others, recombinant host cells expressing genes that
are not found within the native (non-recombinant) form of the cell
or express native genes that are otherwise expressed at a different
level.
[0305] "Substitution" or "substituted" refers to modification of
the polypeptide by replacing one amino acid residue with another,
for instance the replacement of an Arginine residue with a
Glutamine residue in a polypeptide sequence is an amino acid
substitution.
[0306] "Conservative substitution" refers to a substitution of an
amino acid residue with a different residue having a similar side
chain, and thus typically involves substitution of the amino acid
in the polypeptide with amino acids within the same or similar
class of amino acids. By way of example and not limitation, an
amino acid with an aliphatic side chain may be substituted with
another aliphatic amino acid, e.g., alanine, valine, leucine, and
isoleucine; an amino acid with hydroxyl side chain is substituted
with another amino acid with a hydroxyl side chain, e.g., serine
and threonine; an amino acid having an aromatic side chain is
substituted with another amino acid having an aromatic side chain,
e.g., phenylalanine, tyrosine, tryptophan, and histidine; an amino
acid with a basic side chain is substituted with another amino acid
with a basic side chain, e.g., lysine and arginine; an amino acid
with an acidic side chain is substituted with another amino acid
with an acidic side chain, e.g., aspartic acid or glutamic acid;
and a hydrophobic or hydrophilic amino acid is replaced with
another hydrophobic or hydrophilic amino acid, respectively.
[0307] "Non-conservative substitution" refers to substitution of an
amino acid in a polypeptide with an amino acid with significantly
differing side chain properties. Non-conservative substitutions may
use amino acids between, rather than within, the defined groups and
affect (a) the structure of the peptide backbone in the area of
the substitution (e.g., proline for glycine), (b) the charge or
hydrophobicity, or (c) the bulk of the side chain. By way of
example and not limitation, an exemplary non-conservative
substitution can be an acidic amino acid substituted with a basic
or aliphatic amino acid; an aromatic amino acid substituted with a
small amino acid; and a hydrophilic amino acid substituted with a
hydrophobic amino acid.
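The side-chain classes listed in the two definitions above lend themselves to a simple lookup. The sketch below is illustrative only: it encodes just the five explicitly named classes using one-letter codes, omits the broader hydrophobic/hydrophilic grouping, and the names `SIDE_CHAIN_CLASSES` and `is_conservative` are assumptions introduced here.

```python
# One-letter codes for the example classes named in the definitions above.
SIDE_CHAIN_CLASSES = {
    "aliphatic": set("AVLI"),   # alanine, valine, leucine, isoleucine
    "hydroxyl": set("ST"),      # serine, threonine
    "aromatic": set("FYWH"),    # phenylalanine, tyrosine, tryptophan, histidine
    "basic": set("KR"),         # lysine, arginine
    "acidic": set("DE"),        # aspartic acid, glutamic acid
}


def shared_classes(original, replacement):
    """Return the names of the classes that contain both residues."""
    return [
        name
        for name, members in SIDE_CHAIN_CLASSES.items()
        if original in members and replacement in members
    ]


def is_conservative(original, replacement):
    """True if both residues fall within one of the listed classes."""
    if original == replacement:
        raise ValueError("not a substitution: residues are identical")
    return bool(shared_classes(original, replacement))


print(is_conservative("L", "I"))  # both aliphatic
print(is_conservative("D", "K"))  # acidic replaced by basic
```

A residue pair that shares none of the listed classes, such as an acidic residue replaced by a basic one, is reported as non-conservative by this simplified check.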
[0308] "Deletion" or "deleted" refers to modification of the
polypeptide by removal of one or more amino acids in the reference
polypeptide. Deletions can comprise removal of 1 or more amino
acids, 2 or more amino acids, 5 or more amino acids, 10 or more
amino acids, 15 or more amino acids, or 20 or more amino acids, up
to 10% of the total number of amino acids, or up to 20% of the
total number of amino acids making up the polypeptide while
retaining enzymatic activity and/or retaining the improved
properties of an engineered enzyme. Deletions can be directed to
the internal portions and/or terminal portions of the polypeptide.
In various embodiments, the deletion can comprise a continuous
segment or can be discontinuous.
[0309] "Insertion" or "inserted" refers to modification of the
polypeptide by addition of one or more amino acids to the reference
polypeptide. Insertions can comprise addition of 1 or more amino
acids, 2 or more amino acids, 5 or more amino acids, 10 or more
amino acids, 15 or more amino acids, or 20 or more amino acids.
Insertions can be in the internal portions of the polypeptide, or
to the carboxy or amino terminus. The insertion can be a contiguous
segment of amino acids or separated by one or more of the amino
acids in the reference polypeptide.
[0310] "Host cell" as used herein refers to a living cell or
microorganism that is capable of reproducing its genetic material
and along with it recombinant genetic material that has been
introduced into it--e.g., via heterologous transformation.
[0311] "Expression" includes any step involved in the production of
a polypeptide (e.g., encoded enzyme) including, but not limited to,
transcription, post-transcriptional modification, translation,
post-translational modification, and secretion.
[0312] As used herein, "vector" refers to a nucleic acid molecule
capable of transporting another nucleic acid molecule to which it
has been linked. One type of vector is a "plasmid", which refers to
a circular double stranded nucleic acid loop into which additional
nucleic acid segments can be ligated. Certain vectors are capable
of directing the expression of genes to which they are operatively
linked. Such vectors are referred to herein as "expression
vectors". Certain other vectors are capable of facilitating the
insertion of an exogenous nucleic acid molecule into a genome of a
host cell. Such vectors are referred to herein as "transformation
vectors". In general, vectors of utility in recombinant nucleic
acid techniques are often in the form of plasmids. In the present
specification, "plasmid" and "vector" can be used interchangeably
as the plasmid is the most commonly used form of a vector. Large
numbers of suitable vectors are known to those of skill in the art
and commercially available.
[0313] As used herein, "promoter" refers to a sequence of DNA,
usually upstream (5') of the coding region of a structural gene,
which controls the expression of the coding region by providing
recognition and binding sites for RNA polymerase and other factors
which may be required for initiation of transcription. The
selection of the promoter will depend upon the nucleic acid
sequence of interest. A "promoter functional in a host cell" refers
to a "promoter" which is capable of supporting the initiation of
transcription in said cell, causing the production of an mRNA
molecule.
[0314] As used herein, "operably linked" refers to a juxtaposition
wherein the components described are in a relationship permitting
them to function in their intended manner. A control sequence
"operably linked" to a coding sequence is ligated in such a way
that expression of the coding sequence is achieved under conditions
compatible with the control sequence. A promoter sequence is
"operably-linked" to a gene when it is in sufficient proximity to
the transcription start site of a gene to regulate transcription of
the gene.
[0315] "Percentage of sequence identity," "% sequence identity" and
"percent identity" are used herein to refer to comparisons between
an amino acid sequence and a reference amino acid sequence. The "%
sequence identity", as used herein, is calculated from the two
amino acid sequences as follows: The sequences are aligned using
Version 9 of the Genetic Computing Group's GAP (global alignment
program), using the default BLOSUM62 matrix (see below) with a gap
open penalty of -12 (for the first null of a gap) and a gap
extension penalty of -4 (for each additional null in the gap).
After alignment, percentage identity is calculated by expressing
the number of matches as a percentage of the number of amino acids
in the reference amino acid sequence.
[0316] The following BLOSUM62 matrix is used:
TABLE-US-00001
Ala   4
Arg  -1   5
Asn  -2   0   6
Asp  -2  -2   1   6
Cys   0  -3  -3  -3   9
Gln  -1   1   0   0  -3   5
Glu  -1   0   0   2  -4   2   5
Gly   0  -2   0  -1  -3  -2  -2   6
His  -2   0   1  -1  -3   0   0  -2   8
Ile  -1  -3  -3  -3  -1  -3  -3  -4  -3   4
Leu  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4
Lys  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5
Met  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5
Phe  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6
Pro  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7
Ser   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4
Thr   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5
Trp  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11
Tyr  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7
Val   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4
    Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met Phe Pro Ser Thr Trp Tyr Val
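The final step of the identity calculation, and the gap scoring it relies on, can be illustrated with a short sketch. It assumes the two sequences have already been globally aligned (for example by GAP) and are supplied as equal-length strings with "-" marking nulls; the function names and the toy sequences are hypothetical and are not sequences of the invention.

```python
GAP_OPEN = -12    # score for the first null of a gap, as stated above
GAP_EXTEND = -4   # score for each additional null in the gap


def gap_score(length):
    """Score contributed by one gap containing `length` nulls."""
    if length < 1:
        raise ValueError("a gap contains at least one null")
    return GAP_OPEN + GAP_EXTEND * (length - 1)


def percent_identity(aligned_query, aligned_reference):
    """Matches as a percentage of the residues in the reference sequence."""
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != "-"
    )
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference sequence is empty")
    return 100.0 * matches / reference_length


# Hypothetical toy alignment: 6 matches over an 8-residue reference.
print(percent_identity("MKT-LLVA", "MKTALLIA"))
print(gap_score(3))
```

Note that the denominator is the length of the reference sequence rather than the alignment length, which follows the definition given in paragraph [0315].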
[0317] "Reference sequence" or "reference amino acid sequence"
refers to a defined sequence to which another sequence is compared.
In the context of the present invention a reference amino acid
sequence may be an amino acid sequence set forth in SEQ ID NO: 1,
2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22 or 23.
[0318] Aliphatic radicals/groups, as referred to herein, are
optionally mono- or polysubstituted and may be branched or
unbranched, saturated or unsaturated. Unsaturated aliphatic groups,
as defined herein, include alkyl, alkenyl and alkinyl radicals.
Preferred aliphatic radicals according to the present invention
include but are not restricted to methyl, ethyl, vinyl (ethenyl),
ethinyl, propyl, n-propyl, isopropyl, allyl (2-propenyl),
1-propinyl, methylethyl, butyl, n-butyl, iso-butyl, sec-butyl,
tert-butyl, butenyl, butinyl, 1-methylpropyl, 2-methylpropyl,
1,1-dimethylethyl, pentyl, n-pentyl, 1,1-dimethylpropyl,
1,2-dimethylpropyl, 2,2-dimethylpropyl, hexyl, 1-methylpentyl,
n-heptyl, n-octyl, n-nonyl and n-decyl. Preferred substituents for
aliphatic radicals, according to the present invention, are a
C.sub.1-4 alkyl group, a linear or branched C.sub.1-6 alkoxy group,
F, Cl, I, Br, CF.sub.3, CH.sub.2F, CHF.sub.2, CN, OH, SH, NH.sub.2,
oxo, (C.dbd.O)R', SR', SOR', SO.sub.2R', NHR', NR'R'' whereby R'
and optionally R'' for each substituent independently represents a
linear or branched C.sub.1-6-alkyl group.
[0319] "Alkyl", "alkyl radical" or group as used herein means
saturated, linear or branched hydrocarbons, which can be
unsubstituted or mono- or polysubstituted. Thus, unsaturated alkyl
is understood to encompass alkenyl and alkinyl groups, like e.g.
--CH.dbd.CH--CH.sub.3 or --C.ident.C--CH.sub.3, while saturated
alkyl encompasses e.g. --CH.sub.3 and --CH.sub.2--CH.sub.3.
"C.sub.1-12-alkyl" includes C.sub.1-2-alkyl, C.sub.1-3-alkyl,
C.sub.1-4-alkyl, C.sub.1-5-alkyl, C.sub.1-6-alkyl,
C.sub.1-7-alkyl, C.sub.1-8-alkyl, C.sub.1-9-alkyl,
C.sub.1-10-alkyl, and C.sub.1-11-alkyl. In these radicals,
C.sub.1-2-alkyl represents C.sub.1- or C.sub.2-alkyl,
C.sub.1-3-alkyl represents C.sub.1-, C.sub.2- or C.sub.3-alkyl,
C.sub.1-4-alkyl represents C.sub.1-, C.sub.2-, C.sub.3- or
C.sub.4-alkyl, C.sub.1-5-alkyl represents C.sub.1-, C.sub.2-,
C.sub.3-, C.sub.4-, or C.sub.5-alkyl, C.sub.1-6-alkyl represents
C.sub.1-, C.sub.2-, C.sub.3-, C.sub.4-, C.sub.5- or C.sub.6-alkyl
etc. The alkyl radicals may be methyl, ethyl, vinyl (ethenyl),
propyl, allyl (2-propenyl), 1-propinyl, methylethyl, butyl,
1-methylpropyl, 2-methylpropyl, 1,1-dimethylethyl, pentyl,
1,1-dimethylpropyl, 1,2-dimethylpropyl, 2,2-dimethylpropyl, hexyl,
1-methylpentyl, if substituted also CHF.sub.2, CF.sub.3 or
CH.sub.2OH etc. These alkyl, alkenyl or alkinyl radicals may
optionally be mono- or polysubstituted by substituents
independently selected from a C.sub.1-4 alkyl group, a linear or
branched C.sub.1-6 alkoxy group, F, Cl, I, Br, CF.sub.3, CH.sub.2F,
CHF.sub.2, CN, OH, SH, NH.sub.2, (C.dbd.O)R', SR', SOR',
SO.sub.2R', NHR', NR'R'' whereby R' and optionally R'' for each
substituent independently represents a linear or branched
C.sub.1-6-alkyl group.
[0320] "Aryl" or "aryl radical" as used herein is understood as
meaning ring systems with at least one aromatic ring but without
heteroatoms even in only one of the rings. These aryl radicals may
optionally be mono- or polysubstituted by substituents
independently selected from a C.sub.1-4 alkyl group, a linear or
branched C.sub.1-6 alkoxy group, an optionally at least
mono-substituted phenyl group, F, Cl, I, Br, CF.sub.3, CH.sub.2F,
CHF.sub.2, CN, OH, SH, NH.sub.2, oxo, (C.dbd.O)R', SR', SOR',
SO.sub.2R', N(C.dbd.O)--OR', NHR', NR'R'' whereby R' and optionally
R'' for each substituent independently represents a linear or
branched C.sub.1-6-alkyl group. Preferred examples of aryl radicals
include but are not restricted to phenyl, naphthyl, fluoranthenyl,
fluorenyl, tetralinyl or indanyl or anthracenyl radicals, which may
optionally be mono- or polysubstituted, if not defined
otherwise.
[0321] "Alkyl-aryl" or "alkyl-aryl radical" as used herein
comprises a linear or branched, optionally at least
mono-substituted alkyl chain which is bonded to an aryl group, as
defined above. A preferred alkyl-aryl radical is a benzyl group,
wherein the alkyl chain is optionally branched or substituted.
Preferred substituents for alkyl-aryl radicals, according to the
present invention, are F, Cl, Br, I, NH.sub.2, SH, OH, SO.sub.2,
CF.sub.3, carboxy, amido, cyano, carbamyl, nitro, phenyl, benzyl,
--SO.sub.2NH.sub.2, C.sub.1-6 alkyl and/or C.sub.1-6-alkoxy.
[0322] "Heteroaryl" or "heteroaryl radical" as used herein is
understood as meaning heterocyclic ring systems which have at least
one aromatic ring and may optionally contain one or more
heteroatoms from the group consisting of nitrogen, oxygen and/or
sulfur and may optionally be mono- or polysubstituted by
substituents independently selected from a C.sub.1-4 alkyl group,
a linear or branched C.sub.1-6 alkoxy group, F, Cl, I, Br,
CF.sub.3, CH.sub.2F, CHF.sub.2, CN, OH, SH, NH.sub.2, oxo,
(C.dbd.O)R', SR', SOR', SO.sub.2R', NHR', NR'R'' whereby R' and
optionally R'' for each substituent independently represents a
linear or branched C.sub.1-6-alkyl group. Preferred examples of
heteroaryls include but are not restricted to furan, benzofuran,
thiophene, benzothiophene, pyrrole, pyridine, pyrimidine,
pyridazine, pyrazine, quinoline, isoquinoline, phthalazine,
benzo-1,2,5-thiadiazole, benzothiazole, indole, benzotriazole,
benzodioxolane, benzodioxane, benzimidazole, carbazole and
quinazoline.
[0323] "Alkoxy", "alkoxy radical" or group as used herein means an
"alkyl" singly bonded to oxygen. "C.sub.1-6-alkoxy" includes
C.sub.1-2-alkoxy, C.sub.1-3-alkoxy, C.sub.1-4-alkoxy, and
C.sub.1-5-alkoxy, as well as C.sub.2-3-alkoxy, C.sub.2-4-alkoxy,
C.sub.2-6-alkoxy, C.sub.3-4-alkoxy, C.sub.3-6-alkoxy, and
C.sub.4-6-alkoxy. In these radicals, C.sub.1-2-alkoxy represents
C.sub.1- or C.sub.2-alkoxy, C.sub.1-3-alkoxy represents C.sub.1-,
C.sub.2- or C.sub.3-alkoxy, C.sub.1-4-alkoxy represents C.sub.1-,
C.sub.2-, C.sub.3- or C.sub.4-alkoxy, C.sub.1-5-alkoxy represents
C.sub.1-, C.sub.2-, C.sub.3-, C.sub.4-, or C.sub.5-alkoxy, and
C.sub.1-6-alkoxy represents C.sub.1-, C.sub.2-, C.sub.3-,
C.sub.4-, C.sub.5- or C.sub.6-alkoxy. The alkoxy radicals may be
methoxy, ethoxy,
propoxy, butoxy, pentyloxy or hexyloxy.
[0324] The term "precursor of a phenolic compound" refers to any
compound that may be converted to a phenolic compound by a host
cell as described herein.
[0325] Where a numerical limit or range is stated herein, the
endpoints are included. Also, all values and subranges within a
numerical limit or range are specifically included as if explicitly
written out.
[0326] Having generally described this invention, a further
understanding can be obtained by reference to certain specific
examples, which are provided herein for purposes of illustration
only, and are not intended to be limiting unless otherwise
specified.
Examples
Example 1--Production of Zosteric Acid in E. coli
[0327] A range of aryl sulfotransferases including SULT1A1 Rattus
norvegicus (SEQ ID NO: 1), SULT1A1 Homo sapiens (SEQ ID NO: 2),
SULT1A1 Equus caballus (SEQ ID NO: 3), SULT1A1 Sus scrofa
domesticus (SEQ ID NO: 4), SULT1A1 Canis lupus familiaris (SEQ ID
NO: 5) and SULT1E1 Gallus gallus domesticus (SEQ ID NO: 6) were
expressed in Escherichia coli. The respective genes encoding SEQ ID
NO: 1, 3, 4, 5, and 6 were amplified from liver tissue cDNA
(Zyagen) by PCR using the primers listed in Table 1. The nucleotide
sequence of the gene encoding SEQ ID NO: 2 was codon optimized for
expression in Escherichia coli (GeneArt, Life Technologies) and
amplified by PCR using the primers in Table 1. The pETDuet-1
plasmid was digested with restriction endonucleases NcoI and SalI.
The PCR products were then individually cloned into the plasmid
pETDuet-1 using the Gibson reaction (New England Biolabs). The
resulting plasmids were transformed into BL21(DE3)pLysS (Life
Technologies). FIG. 1 shows the plasmid map of the plasmid encoding
SULT1A1 Rattus norvegicus (SEQ ID NO: 1).
TABLE-US-00002 TABLE 1 Overview of enzymes and primers for cloning
aryl sulfotransferases
SEQ ID NO  Name                              Fwd Primer  Rev Primer
1          SULT1A1 Rattus norvegicus         CBJP472     CBJP473
2          SULT1A1 Homo sapiens              CBJP470     CBJP471
3          SULT1A1 Equus caballus            CBJP499     CBJP500
4          SULT1A1 Sus scrofa domesticus     CBJP505     CBJP506
5          SULT1A1 Canis lupus familiaris    CBJP503     CBJP504
6          SULT1E1 Gallus gallus domesticus  CBJP501     CBJP502
[0328] The strains were grown in M9 minimal media containing
glucose as a carbon source, and 0.1 mM IPTG for induction of gene
expression as well as 0.1 mM p-coumaric acid (pHCA). After four
days of growth, samples were withdrawn by filtration and analyzed
by HPLC.
[0329] The concentrations of p-coumaric acid (pHCA) and zosteric
acid in the supernatant were quantified by high performance liquid
chromatography (HPLC) and compared to chemical standards. HPLC was
done on a Thermo setup using a HS-F5 column and mobile phases: 5 mM
ammonium formate pH 4.0 (A) and acetonitrile (B) at 1.5 mL
min.sup.-1, using a gradient elution starting at 5% B. From 0.5 min
after injection to 7 min, the fraction of B increased linearly from
5% to 60%, and between 9.5 min and 9.6 min the fraction of B
decreased back to 5%, remaining there until 12 min. pHCA and
zosteric acid were quantified by measuring absorbance at 277 nm.
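The gradient program described above can be written as a piecewise-linear function of time. This is a sketch for illustration only; the function name `percent_b` is hypothetical, and holding B at 60% between 7 min and 9.5 min is an assumption, since the text does not state the composition in that interval.

```python
def percent_b(t_min):
    """Fraction of mobile phase B (acetonitrile), in percent, at time t_min."""
    if not 0.0 <= t_min <= 12.0:
        raise ValueError("the run lasts from 0 to 12 min")
    if t_min <= 0.5:
        return 5.0                                   # initial hold at 5% B
    if t_min <= 7.0:
        # linear ramp from 5% to 60% B between 0.5 and 7 min
        return 5.0 + (60.0 - 5.0) * (t_min - 0.5) / (7.0 - 0.5)
    if t_min <= 9.5:
        return 60.0                                  # assumed hold at 60% B
    if t_min <= 9.6:
        # linear return from 60% to 5% B between 9.5 and 9.6 min
        return 60.0 + (5.0 - 60.0) * (t_min - 9.5) / (9.6 - 9.5)
    return 5.0                                       # re-equilibration until 12 min


for t in (0.0, 0.5, 3.75, 7.0, 9.5, 12.0):
    print(f"{t:5.2f} min: {percent_b(t):.1f}% B")
```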
[0330] Table 2 shows the remaining pHCA and the produced zosteric
acid in the culture media. Zosteric acid was formed with an aryl
sulfotransferase heterologously expressed in a microorganism
exemplified by E. coli supplied with the substrate.
TABLE-US-00003 TABLE 2 Production of zosteric acid in E. coli from
pHCA through the heterologous expression of sulfotransferases.
                                  pHCA            Zosteric acid
Enzyme                            remaining (mM)  formed (mM)
No enzyme                         0.10            Not detectable
SULT1A1 Rattus norvegicus         0.02            0.10
SULT1A1 Homo sapiens              0.08            0.02
SULT1A1 Equus caballus            0.09            0.01
SULT1A1 Sus scrofa domesticus     0.09            0.01
SULT1A1 Canis lupus familiaris    0.10            0.01
SULT1E1 Gallus gallus domesticus  0.08            0.01
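The values in Table 2 can be related to the 0.1 mM pHCA supplied in the medium. The sketch below simply restates the tabulated numbers as percentages; the variable and function names are illustrative assumptions, and "Not detectable" is treated as zero for the purpose of the arithmetic.

```python
SUPPLIED_PHCA_MM = 0.10  # pHCA added to the medium in Example 1

# (pHCA remaining, zosteric acid formed) in mM; None marks "Not detectable".
TABLE_2 = {
    "No enzyme": (0.10, None),
    "SULT1A1 Rattus norvegicus": (0.02, 0.10),
    "SULT1A1 Homo sapiens": (0.08, 0.02),
    "SULT1A1 Equus caballus": (0.09, 0.01),
    "SULT1A1 Sus scrofa domesticus": (0.09, 0.01),
    "SULT1A1 Canis lupus familiaris": (0.10, 0.01),
    "SULT1E1 Gallus gallus domesticus": (0.08, 0.01),
}


def percent_of_supplied(value_mm):
    """Express a concentration as a percentage of the supplied pHCA."""
    return 100.0 * value_mm / SUPPLIED_PHCA_MM


for enzyme, (remaining, formed) in TABLE_2.items():
    formed_mm = 0.0 if formed is None else formed
    sulfated = percent_of_supplied(formed_mm)
    recovered = percent_of_supplied(remaining + formed_mm)
    print(f"{enzyme}: {sulfated:.0f}% sulfated, {recovered:.0f}% recovered")
```

Recoveries slightly above 100% for some rows presumably reflect the two-decimal precision of the reported concentrations.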
Example 2--Increased Production of Zosteric Acid in E. coli
[0331] The addition of sulfated groups to targets is dependent on
supply of the donor molecule 3'-Phosphoadenosine 5'-phosphosulfate
(PAPS). We examined if we could increase the production of zosteric
acid by overexpressing enzymes providing PAPS and an enzyme that
removes the product 3'-Phosphoadenosine 5'-phosphate (PAP).
TABLE-US-00004 TABLE 3 Cloning of enzymes involved in activating
sulfate and product removal.
Genes                         Fwd Primer  Rev Primer
cysDNC alone                  CBJP491     CBJP492
cysDNC for artificial operon  CBJP491     CBJP497
cysQ for artificial operon    CBJP498     CBJP496
[0332] In E. coli, the genes cysD and cysN encode the two subunits
of ATP sulfurylase (EC:2.7.7.4), cysC encodes APS kinase
(EC:2.7.1.25), and cysQ encodes a PAP phosphatase.
[0333] The cysDNC cluster was amplified by PCR from E. coli MG1655
chromosomal DNA using the primers shown in Table 3. The plasmid
pRSFDuet-1 (Life Technologies) was digested by the restriction
endonucleases NdeI and BglII. The gene cluster was inserted into
the digested plasmid using the Gibson reaction (New England
Biolabs). FIG. 2 shows the resulting plasmid. For the combined
expression of cysDNC and cysQ in an artificial operon, cysDNCQ, the
two parts were amplified by PCR from E. coli MG1655 chromosomal DNA
using the primers shown in Table 3. Again the parts were inserted
into the digested plasmid. FIG. 3 shows the resulting plasmids. The
plasmid expressing SULT1A1 Homo sapiens (SEQ ID NO: 2) from example
1 was co-transformed into E. coli BL21(DE3)pLysS cells (Life
Technologies) with either the plasmid expressing cysDNC or
cysDNCQ.
[0334] Cells were grown as in Example 1 and the supernatants were
analyzed for product formation as in example 1. The strain
expressing SULT1A1 in combination with cysDNCQ was also grown
without the addition of IPTG for induction. Table 4 shows the
concentrations of pHCA and zosteric acid.
TABLE-US-00005 TABLE 4 Concentrations of pHCA and zosteric acid in
culture media with E. coli expressing an aryl sulfotransferase in
combination with cysDNC and cysQ.
                                            pHCA            Zosteric acid
Enzymes                        Induction    remaining (mM)  formed (mM)
SULT1A1 Homo sapiens           0.1 mM IPTG  0.08            0.02
SULT1A1 Homo sapiens, CysDNC   0.1 mM IPTG  0.06            0.06
SULT1A1 Homo sapiens, CysDNCQ  0.1 mM IPTG  0.04            0.09
SULT1A1 Homo sapiens, CysDNCQ  None         0.10            Not detectable
[0335] This shows that more of the pHCA is transformed into
zosteric acid when the protein expression of cysDNC is increased.
Even more zosteric acid is formed when the protein expression of
cysQ is additionally increased.
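The size of the effect can be read directly from the induced rows of Table 4. The following sketch computes the fold change in zosteric acid relative to the strain expressing SULT1A1 alone; the dictionary keys and the function name `fold_change` are illustrative assumptions.

```python
# Zosteric acid formed (mM) in the IPTG-induced cultures of Table 4.
ZOSTERIC_ACID_MM = {
    "SULT1A1": 0.02,
    "SULT1A1 + CysDNC": 0.06,
    "SULT1A1 + CysDNCQ": 0.09,
}


def fold_change(strain, baseline="SULT1A1"):
    """Zosteric acid titer of `strain` relative to the baseline strain."""
    return ZOSTERIC_ACID_MM[strain] / ZOSTERIC_ACID_MM[baseline]


for strain in ZOSTERIC_ACID_MM:
    print(f"{strain}: {fold_change(strain):.1f}-fold")
```

On these numbers, co-expression of cysDNC corresponds to roughly a threefold increase and cysDNCQ to roughly a 4.5-fold increase over SULT1A1 alone.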
Example 3--A Sulfated Product can be Formed In Vivo by
Co-Expression of a Heterologous Pathway and an Aryl
Sulfotransferase
[0336] The production of a sulfated product can be accomplished
biologically by the expression of aryl sulfotransferase as shown in
example 1. The substrate for sulfation may also be formed by a
biological organism, and here it will be shown for an organism
expressing both a heterologous pathway leading to a phenolic
compound and expressing a sulfotransferase acting upon the phenolic
compound.
[0337] The enzyme RmXAL from Rhodotorula mucilaginosa/Rhodotorula
rubra (SEQ ID NO: 20) has tyrosine ammonia lyase activity, thus
catalyzing the non-oxidative deamination of the amino acid
tyrosine, releasing p-coumaric acid (pHCA) and ammonia. The gene
encoding RmXAL was codon optimized using standard algorithms for
expression in E. coli available by GeneArt (Life Technologies) and
amplified by PCR using the primers shown in table 5 and inserted
into the pCDFDuet-1 vector (Novagen/Life Technologies), which had
been digested by the restriction enzymes NdeI and BglII, using
Gibson reaction (New England Biolabs). FIG. 4 provides an image of
the plasmid expressing RmXAL.
TABLE-US-00006 TABLE 5 Primers used for cloning of tyrosine
ammonia lyase
Genes  Fwd Primer  Rev Primer
RmXAL  CBJP487     CBJP488
[0338] The resulting plasmid was co-transformed into E. coli
BL21(DE3)pLysS cells (Life Technologies) alone or together with the
plasmid expressing SULT1A1 from Homo sapiens (example 1). The
resulting strains were grown in M9 media with glucose as a carbon
source, with 0.1 mM IPTG for induction of gene expression. Samples
were taken as described previously (example 1) for analysis of
product formation. Table 6 shows the resulting concentrations of
pHCA and zosteric acid. RmXAL allowed the production of pHCA
without addition of any substrate, thus providing a heterologous
pathway from the cell's normal metabolism to a heterologous product.
The additional expression of an aryl sulfotransferase, exemplified
by SULT1A1 from Homo sapiens, allowed the in vivo conversion of
pHCA to zosteric acid. Thus, an aryl sulfotransferase can act upon
a compound produced in vivo and the cells can release the resulting
sulfated product to the medium.
TABLE-US-00007 TABLE 6 Concentrations of pHCA and zosteric acid in
culture media with E. coli expressing an aryl sulfotransferase in
combination with a tyrosine ammonia lyase.
Enzymes                      pHCA (mM)  Zosteric acid formed (mM)
RmXAL                        0.04       Not detectable
SULT1A1 Homo sapiens, RmXAL  0.02       0.01
Example 4--Decreased Toxicity of Sulfated Product
[0339] E. coli MG1655 was grown in chemically defined M9 minimal
media with 0.2% glucose as a carbon source without further addition
or with the additions of either 10 mM, 20 mM, 25 mM, 30 mM, 35 mM
or 40 mM p-coumaric acid (pHCA), or with 20 mM or 40 mM of the
sulfate ester of pHCA (zosteric acid). All media preparations had
been adjusted to pH 7. Cells were grown at 37.degree. C. with 250
rpm shaking in an orbital shaker. The growth rates were examined by
following the optical density at 600 nm. The resulting growth rates
in exponential growth phase are shown in FIG. 5. Filled squares
represent growth rates in media with pHCA. Open squares represent
growth rates in media with zosteric acid. And the circle represents
the growth rate in media without any of these additions. It is
evident that the presence of pHCA is toxic to the cells, while the
sulfate ester, zosteric acid, is much less so.
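Growth rates in exponential phase are commonly estimated as the slope of ln(OD600) against time. The sketch below shows one way to do this with an ordinary least-squares fit; it is not necessarily the procedure used for FIG. 5, and the time points and optical densities are hypothetical values generated for illustration.

```python
import math


def specific_growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in h^-1."""
    n = len(times_h)
    if n != len(od600) or n < 2:
        raise ValueError("need at least two paired time/OD measurements")
    logs = [math.log(value) for value in od600]
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    return sxy / sxx


# Hypothetical exponential-phase readings: OD600 = 0.05 * exp(0.6 * t).
times_h = [0.0, 1.0, 2.0, 3.0, 4.0]
od600 = [0.05 * math.exp(0.6 * t) for t in times_h]

mu = specific_growth_rate(times_h, od600)
print(f"growth rate: {mu:.3f} per hour")
print(f"doubling time: {math.log(2) / mu:.2f} hours")
```

Only readings taken during exponential growth should be passed to such a fit, since lag-phase and stationary-phase points would bias the slope.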
Example 5--In Vivo Supply of Precursor of Sulfated Product
[0340] The substrate that is the subject for sulfation may be
supplied to the medium void of such precursors or may be provided
by microorganisms in the medium. Here we show that p-coumaric acid
that is sulfated to generate zosteric acid, can be produced in vivo
by the expression of a tyrosine ammonia-lyase.
[0341] The genes encoding the tyrosine ammonia-lyases RcTAL (from
Rhodobacter capsulatus; SEQ ID NO: 48), RsTAL (from Rhodobacter
sphaeroides; SEQ ID NO: 17) and FjTAL (from Flavobacterium
johnsoniae; SEQ ID NO: 14) were cloned into expression vectors as
follows. Genes (SEQ ID NO: 49, 50, and 51, respectively) were
optimized for E. coli and synthesized by GeneArt, amplified by PCR
using the oligonucleotides shown in the table below, and cloned
into pCDFDuet-1 (Novagen). The plasmid was digested with NdeI and
BglII and gel purified. The genes were inserted by isothermal
assembly using Gibson Assembly Master Mix (New England Biolabs),
and transformed into chemically competent DH5.alpha. (laboratory
strain) or NEB5.alpha. (New England Biolabs), selecting for
resistance to 50 .mu.g mL.sup.-1 spectinomycin in LB medium.
Resulting plasmids pCBJ215 (RsTAL), pCBJ228 (FjTAL) and pCBJ297
(RcTAL) (FIGS. 6 to 8, respectively) were co-transformed by
electroporation into the E. coli expression strain BL21(DE3)
(Invitrogen/Life Technologies) together with a pETDuet-1-based
plasmid expressing SULT1A1 from rat (Example 1). Transformation
cultures were plated on LB containing 50 .mu.g mL.sup.-1
spectinomycin and 100 .mu.g mL.sup.-1 ampicillin. A control strain
carrying pCDFDuet-1 was also made.
[0342] Primers:
TABLE-US-00008
Oligonucleotide  Gene   Direction  Sequence
CBJP483          RsTAL  Forward    CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGCTGGCAATGAGCCCT
CBJP484          RsTAL  Reverse    TGGCCGGCCGATATCCAATTGATTAAACCGGACTCTGTTGC
CBJP555          FjTAL  Forward    CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGAACACCATCAACGAATATCTG
CBJP556          FjTAL  Reverse    TGGCCGGCCGATATCCAATTGATTAATTGTTAATCAGGTGGTCTTTTACTTTCTG
CBJP745          RcTAL  Forward    CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGCTGGATGCAACCATTGG
CBJP746          RcTAL  Reverse    TGGCCGGCCGATATCCAATTGATTATGCCGGAGGATCCGCT
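The primers listed above share common 5' portions, which is what one would expect for fragments intended for isothermal assembly into the same digested vector. The sketch below extracts those shared portions; interpreting them as vector-overlap regions is an assumption, and the dictionary and function names are introduced here for illustration.

```python
import os

FORWARD = {
    "CBJP483": "CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGCTGGCAATGAGCCCT",
    "CBJP555": "CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGAACACCATCAACGAATATCTG",
    "CBJP745": "CATCTTAGTATATTAGTTAAGTATAAGAAGGAGATATACATATGCTGGATGCAACCATTGG",
}
REVERSE = {
    "CBJP484": "TGGCCGGCCGATATCCAATTGATTAAACCGGACTCTGTTGC",
    "CBJP556": "TGGCCGGCCGATATCCAATTGATTAATTGTTAATCAGGTGGTCTTTTACTTTCTG",
    "CBJP746": "TGGCCGGCCGATATCCAATTGATTATGCCGGAGGATCCGCT",
}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# os.path.commonprefix compares strings character by character.
forward_tail = os.path.commonprefix(list(FORWARD.values()))
reverse_tail = os.path.commonprefix(list(REVERSE.values()))

print("shared forward 5' portion:", forward_tail)
print("shared reverse 5' portion:", reverse_tail)
print("reverse portion on the coding strand:", reverse_complement(reverse_tail))
```

The shared forward portion ends in the ATG start codon, and the shared reverse portion reads as a TAA stop codon on the coding strand, so the gene-specific parts of the primers begin immediately inside the open reading frame.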
[0343] Strains harboring recombinant plasmids were pre-cultured in
2xYT liquid medium with 100 .mu.g mL.sup.-1 ampicillin and 50 .mu.g
mL.sup.-1 spectinomycin and incubated at 37.degree. C. and 250 rpm
overnight. The following day, each pre-culture was transferred into
5 ml of M9 minimal medium with 0.2% glucose, 2 mM tyrosine and 1 mM
IPTG for induction of expression. Cultures were placed in an
incubator at 37.degree. C. with shaking at 250 rpm overnight. The
supernatants were then collected by centrifugation twice and
applied to HPLC analysis as described in example 1, and the titers
of p-coumaric acid (pHCA) and zosteric acid (ZA) were quantified
using chemical standards and are presented in the table below.
TABLE-US-00009
Sulfotransferase  Tyrosine ammonia-lyase  .mu.M pHCA  .mu.M ZA
SULT1A1 rat       None                    0           0
SULT1A1 rat       RsTAL                   78          <1
SULT1A1 rat       RcTAL                   20          <1
SULT1A1 rat       FjTAL                   398         16
[0344] Here, it is evident that zosteric acid is formed when
there is a supply of exogenous p-coumaric acid or when the cells
are capable of producing p-coumaric acid. In conclusion, a sulfated
product may be formed from an unsulfated precursor molecule when
this is produced in vivo.
[0345] Furthermore, the data surprisingly show that employing the
tyrosine ammonia-lyase from Flavobacterium johnsoniae (FjTAL; SEQ
ID NO: 14) results in a higher supply in unsulfated precursor
molecule (here: p-coumaric acid), which in turn leads to a higher
yield of sulfated product (here: zosteric acid) compared to other
tyrosine ammonia-lyases.
Example 6--Production of Sulfated Products in Other Hosts
[0346] We have shown that zosteric acid can be produced in vivo in
Escherichia coli by expression of an aryl sulfotransferase. To show
that the reaction is possible in other microorganisms, we here show
that the yeast Saccharomyces cerevisiae can also be used as a host
for the production.
[0347] The gene encoding aryl sulfotransferase SULT1A1 (Example 1)
was cloned after a TEF1 promoter into an episomal plasmid with a
2-micron origin of replication as follows. The gene was amplified
by PCR using primers CBJP633 and CBJP634. Alternatively, the gene
was codon-optimized for E. coli and synthesized by GeneArt and
amplified by primers CBJP635 and CBJP636. The TEF1 promoter (Jensen
et al., 2014, FEMS Yeast Res 14: 238-248) was amplified by PCR
using the primers PTEF1_fw and PTEF1_rv. Plasmid pCfB132 (Jensen et
al., supra) was digested by restriction enzymes AsiSI and Nt.BsmI.
The three fragments--plasmid, TEF1 promoter and SULT1A1-encoding
gene--were assembled using a uracil-excision cloning procedure,
resulting in plasmids pCBJ283 and pCBJ284 (FIGS. 9 and 10,
respectively), which were subsequently transformed into the
Saccharomyces cerevisiae strain CEN.PK102-5B selecting for growth
on synthetic dropout media plates lacking uracil. A control strain
was also made by transformation of pCfB132 into CEN.PK102-5B.
[0348] Primers:
TABLE-US-00010
Oligonucleotide  Gene/promoter          Direction  Sequence
CBJP633          SULT1A1 rat            Forward    AGTGCAGGUAAAACAATGgagttctcccgtcca
CBJP634          SULT1A1 rat            Reverse    CGTGCGAUTCAtagttcacaacgaaacttg
CBJP635          SULT1A1 rat (E. coli)  Forward    ATCTGTCAUAAAACAATGgaattttcacgtccgc
CBJP636          SULT1A1 rat (E. coli)  Reverse    CACGCGAUTCAcagttcacaacgaaatttgaa
PTEF1_fw         PTEF1                  Forward    Cacgcgaugcacacaccatagcttc
PTEF1_rv         PTEF1                  Reverse    Cgtgcgauggaagtaccttcaaaga
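Each primer listed above contains a single uracil, which in uracil-excision cloning marks the end of the 5' tail that is removed to expose a single-stranded overhang. The sketch below splits each primer at that position; the function name `split_at_uracil` is hypothetical, and the code makes no claim about which overhangs pair with which in the actual assembly.

```python
PRIMERS = {
    "CBJP633": "AGTGCAGGUAAAACAATGgagttctcccgtcca",
    "CBJP634": "CGTGCGAUTCAtagttcacaacgaaacttg",
    "CBJP635": "ATCTGTCAUAAAACAATGgaattttcacgtccgc",
    "CBJP636": "CACGCGAUTCAcagttcacaacgaaatttgaa",
    "PTEF1_fw": "Cacgcgaugcacacaccatagcttc",
    "PTEF1_rv": "Cgtgcgauggaagtaccttcaaaga",
}


def split_at_uracil(primer):
    """Return (5' tail including the uracil, remainder), upper-cased."""
    upper = primer.upper()
    if upper.count("U") != 1:
        raise ValueError("expected exactly one uracil in the primer")
    index = upper.index("U")
    return upper[: index + 1], upper[index + 1:]


for name, sequence in PRIMERS.items():
    tail, remainder = split_at_uracil(sequence)
    print(f"{name}: tail={tail} remainder={remainder}")
```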
[0349] The strains were grown in modified Delft medium (Jensen et
al., supra) with 20 mg/mL histidine and 60 mg/mL leucine and 10 mM
p-coumaric acid overnight at 30.degree. C. with aeration. The
supernatant was then isolated and examined by HPLC as described in
Example 1. The table below shows that zosteric acid (ZA) was
produced by the strain expressing SULT1A1 and not the control
strain lacking a sulfotransferase.
TABLE-US-00011
                                             .mu.M ZA (averages and standard
Sulfotransferase                             deviations of replicate experiments)
None                                         0 .+-. 0
SULT1A1 rat (native)                         37.8 .+-. 5.7
SULT1A1 rat (codon optimized for E. coli)    46.2 .+-. 3.5
[0350] It is evident that zosteric acid is formed only when a
sulfotransferase is expressed in yeast, and that the gene encoding
this may be natural or encoded by a synthetic gene with a specific
codon-optimization. Conclusively, the sulfation reactions shown to
be catalyzed by sulfotransferases in E. coli are also catalyzed
when the sulfotransferases are expressed in other organisms, as
demonstrated here for the yeast S. cerevisiae. The efficacy of
production may be affected by means such as the codon-usage of the
genes encoding the sulfotransferase. Thus yeast expressing
sulfotransferases may be able to detoxify aromatic compounds such
as p-coumaric acid, and form sulfated products such as zosteric
acid.
Example 7--A Range of Compounds are Substrates for Sulfation In
Vivo
[0351] Here we show that the expression of an aryl sulfotransferase
may be able to convert several substrates. Some of these are
inhibitors that can be found in biomass hydrolyzate used as a
substrate for cell growth and production in biotechnology. The
compounds also include some that are of biotechnological interest
as products of a cell culture, or whose sulfate esters are of
economic interest.
[0352] Different sulfotransferases were examined for their
substrate specificities against three substrates. We tested the
sulfotransferases mentioned in example 1, as well as additional
ones. The genes encoding these were cloned as described in example
1 using the primers shown in the table below from cDNA libraries of
the respective organisms, except for the SULT1A1 from rat (Rattus
norvegicus) codon-optimized for E. coli (described above). The
resulting vectors were transformed into BL21(DE3)pLysS.
[0353] Primers:
TABLE-US-00012
Oligonucleotide | Gene | Direction | Sequence
CBJP517 | SULT1C1 Gallus gallus domesticus | Forward | TAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCatggccctggataaaatgg
CBJP518 | SULT1C1 Gallus gallus domesticus | Reverse | TAAGCATTATGCGGCCGCAAGCTTGtcacaattccatgcgaaaaactag
CBJP533 | SULT1A1 Rattus norvegicus (Codon-optimized for E. coli) | Forward | TAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCatggaattttcacgtcc
CBJP534 | SULT1A1 Rattus norvegicus (Codon-optimized for E. coli) | Reverse | TAAGCATTATGCGGCCGCAAGCTTGttacagttcacaacgaaatttg
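The primers listed above appear to share a layout: an upper-case 5' tail and a lower-case 3' part. Reading the upper-case tail as vector sequence and the lower-case part as the gene-annealing region is an assumption made here for illustration; the text does not state it. Under that assumption, a minimal Python sketch can check that each forward primer's lower-case part starts with a start codon and that each reverse primer's lower-case part, once reverse-complemented, ends with a stop codon. The helper names `split_primer`, `reverse_complement` and `check_primer` are hypothetical.

```python
# Sketch only. Assumption: upper case = vector tail, lower case = gene-annealing part.
PRIMERS = {
    "CBJP517": ("Forward", "TAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCatggccctggataaaatgg"),
    "CBJP518": ("Reverse", "TAAGCATTATGCGGCCGCAAGCTTGtcacaattccatgcgaaaaactag"),
    "CBJP533": ("Forward", "TAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACCatggaattttcacgtcc"),
    "CBJP534": ("Reverse", "TAAGCATTATGCGGCCGCAAGCTTGttacagttcacaacgaaatttg"),
}

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")
STOP_CODONS = {"taa", "tag", "tga"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def split_primer(seq):
    """Split a primer into (leading upper-case tail, remaining lower-case part)."""
    i = 0
    while i < len(seq) and seq[i].isupper():
        i += 1
    return seq[:i], seq[i:]


def check_primer(direction, seq):
    """Forward: annealing part starts with atg. Reverse: its reverse complement ends with a stop codon."""
    _, anneal = split_primer(seq)
    if direction == "Forward":
        return anneal.startswith("atg")
    return reverse_complement(anneal)[-3:] in STOP_CODONS


if __name__ == "__main__":
    for name, (direction, seq) in PRIMERS.items():
        print(name, direction, check_primer(direction, seq))
```

A check of this kind only confirms that the primer ends are consistent with amplifying a complete open reading frame; it says nothing about annealing temperature or specificity.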
[0354] The resulting strains were grown in M9 medium containing
either 100 µM pHCA, 95 µM resveratrol or 87 µM kaempferol.
The cultures were grown overnight at 37 °C, 300 rpm. The
following day the supernatants were isolated and examined by HPLC
as described in example 1. BL21(DE3)pLysS was used as a control
strain and did not convert the substrates.
TABLE-US-00013
Enzyme | pHCA 100 µM | resveratrol 95 µM | kaempferol 87 µM
SULT1A1 Rattus norvegicus | 93% | 93% | 95%
SULT1C1 Gallus gallus domesticus | 26% | 100% | 80%
SULT1A1 Rattus norvegicus (Codon-optimized for E. coli) | 73% | 58% | 38%
SULT1A1 human | 39% | 36% | 97%
SULT1A1 Equus caballus | 21% | 100% | 96%
SULT1E1 Gallus gallus domesticus | 17% | 100% | 47%
SULT1A1 Canis lupus familiaris | 34% | 61% | 60%
SULT1A1 Sus scrofa domesticus | 8% | 88% | 45%
[0355] The table shows the percent conversion of the various
substrates by cells expressing the different sulfotransferases. The
results show that several sulfotransferases, and especially the
aryl sulfotransferase from rat (Rattus norvegicus), may be employed
in the sulfation of phenolic compounds.
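One way to read the conversion table is to ask which enzyme performs well on all three substrates rather than on a single one. The following Python sketch transcribes the percentages from the table and ranks the enzymes by their lowest conversion across pHCA, resveratrol and kaempferol. The ranking criterion and the function names are illustrative assumptions, not a method described in the text.

```python
# Percent conversion transcribed from the table: (pHCA, resveratrol, kaempferol).
SUBSTRATES = ("pHCA", "resveratrol", "kaempferol")
CONVERSION = {
    "SULT1A1 Rattus norvegicus": (93, 93, 95),
    "SULT1C1 Gallus gallus domesticus": (26, 100, 80),
    "SULT1A1 Rattus norvegicus (Codon-optimized for E. coli)": (73, 58, 38),
    "SULT1A1 human": (39, 36, 97),
    "SULT1A1 Equus caballus": (21, 100, 96),
    "SULT1E1 Gallus gallus domesticus": (17, 100, 47),
    "SULT1A1 Canis lupus familiaris": (34, 61, 60),
    "SULT1A1 Sus scrofa domesticus": (8, 88, 45),
}


def rank_by_worst_case(table):
    """Order enzymes by their lowest conversion over all substrates (assumed criterion)."""
    return sorted(table, key=lambda enzyme: min(table[enzyme]), reverse=True)


def best_per_substrate(table):
    """For each substrate, list every enzyme that reaches the highest conversion."""
    result = {}
    for i, substrate in enumerate(SUBSTRATES):
        top = max(values[i] for values in table.values())
        result[substrate] = [e for e, values in table.items() if values[i] == top]
    return result


if __name__ == "__main__":
    for enzyme in rank_by_worst_case(CONVERSION):
        print(f"{min(CONVERSION[enzyme]):3d}%  {enzyme}")
    print(best_per_substrate(CONVERSION))
```

By this criterion the native rat SULT1A1 ranks first, since it is the only enzyme in the table that converts more than 90% of every substrate tested, while several other enzymes match or exceed it on resveratrol or kaempferol alone.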
[0356] To further test the range of substrates that can be
sulfated, we used strains carrying plasmids expressing SULT1A1 from
rat (Rattus norvegicus) and SULT1E1 from chicken (Gallus gallus
domesticus) (Example 1) cloned into the expression vector
pETDuet-1, and cysDNCQ from E. coli cloned into expression vector
pRSFDuet-1 (Example 2). The plasmids were introduced into the E.
coli expression strain BL21(DE3)pLysS as described previously,
selecting for transformants with appropriate antibiotics, namely 34
µg/mL chloramphenicol for pLysS, 100 µg/mL ampicillin
for pETDuet-1-based vectors, and 100 µg/mL kanamycin for
pRSFDuet-1-based vectors. The table below shows the combination of
over-expressed genes on plasmids. A control strain without a
sulfotransferase gene or cysDNCQ operon was also examined.
TABLE-US-00014
E. coli strains | Sulfotransferase | Cys genes
Control strain | -- | --
SULT1A1 rat | SULT1A1 rat | --
SULT1E1 chicken | SULT1E1 chicken | --
SULT1A1 rat + CysDNCQ | SULT1A1 rat | CysDNCQ
[0357] The strains were precultured in 2xYT medium with appropriate
antibiotics. 10 µL of these precultures were used to inoculate
M9 media with 1 mM IPTG and either no substrate or a single
substrate for sulfation. After overnight growth at 37 °C, 300 rpm the
supernatants were withdrawn and examined by HPLC as described in
Example 1. The compounds were detected by UV absorbance. The table
below shows the percent reduction in concentration in the strains
expressing sulfotransferases alone or in combination with cysDNCQ
genes when compared to the control strain.
TABLE-US-00015
Compound | Start concentration in µM | SULT1A1 | SULT1E1 | SULT1A1 + CysDNCQ
Ferulic acid | 110 | 72% | 67% | 100%
Quercetin | 85 | 75% | 74% | 81%
4-hydroxybenzoic acid | 287 | 5% | 4% | 6%
4-acetamidophenol | 114 | 24% | 10% | 30%
3-Hydroxy-4-methoxycinnamic acid | 132 | 51% | 24% | 62%
4-Hydroxyphenylpyruvic acid | 255 | 47% | 100% | 64%
3-(4-Hydroxyphenyl)propionic acid | 241 | 3% | 1% | 7%
Vanillic acid | 173 | 33% | 0% | 39%
Luteolin | 61 | 27% | 0% | 37%
Apigenin | 77 | 41% | 98% | 99%
Fisetin | 81 | 98% | 98% | 100%
[0358] In conclusion, a wide range of phenolic compounds are
substrates for sulfotransferases. In the examples shown, the
conversion is enhanced by the overexpression of cysDNCQ genes. Some
of these compounds and their sulfate esters are of interest in
biotechnology. Also, some of these compounds are inhibitors of cell
growth and function, and thus conversion by sulfation is of
interest for use in biological systems.
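The values in Example 7 are reported as percent reduction in substrate concentration relative to the control strain. A minimal Python sketch of that calculation is given below, assuming the reduction is computed as 100 × (1 − c_strain / c_control); the exact formula is not spelled out in the text, and the function names are hypothetical. The sketch also transcribes the SULT1A1 and SULT1A1 + CysDNCQ columns of the last table to show the gain, in percentage points, associated with overexpressing cysDNCQ.

```python
def percent_reduction(c_strain, c_control):
    """Percent reduction of substrate relative to the control culture (assumed formula)."""
    if c_control <= 0:
        raise ValueError("control concentration must be positive")
    return 100.0 * (1.0 - c_strain / c_control)


# Percent reduction transcribed from the table: (SULT1A1, SULT1A1 + CysDNCQ).
REDUCTION = {
    "Ferulic acid": (72, 100),
    "Quercetin": (75, 81),
    "4-hydroxybenzoic acid": (5, 6),
    "4-acetamidophenol": (24, 30),
    "3-Hydroxy-4-methoxycinnamic acid": (51, 62),
    "4-Hydroxyphenylpyruvic acid": (47, 64),
    "3-(4-Hydroxyphenyl)propionic acid": (3, 7),
    "Vanillic acid": (33, 39),
    "Luteolin": (27, 37),
    "Apigenin": (41, 99),
    "Fisetin": (98, 100),
}


def cys_gain(table):
    """Percentage-point difference between SULT1A1 + CysDNCQ and SULT1A1 alone."""
    return {compound: with_cys - alone for compound, (alone, with_cys) in table.items()}


if __name__ == "__main__":
    # Hypothetical concentrations, used only to exercise the formula.
    print(percent_reduction(25.0, 100.0))
    for compound, gain in sorted(cys_gain(REDUCTION).items(), key=lambda kv: -kv[1]):
        print(f"{gain:+3d} points  {compound}")
```

For every compound in the table the strain carrying cysDNCQ shows a higher reduction than the strain expressing SULT1A1 alone, with the largest difference for apigenin.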
Embodiments of the Invention
[0359] 1. A process for the production of a sulfated phenolic
compound comprising:
[0360] (i') contacting a medium comprising a phenolic compound with
a first recombinant host cell; wherein the first recombinant host
cell comprises a heterologous polypeptide having an aryl
sulfotransferase activity; or
[0361] (i'') contacting a medium comprising a fermentable carbon
substrate with a first recombinant host cell; wherein the first
recombinant host cell comprises a heterologous polypeptide having
an aryl sulfotransferase activity; or
[0362] (i''') contacting a medium comprising a precursor of a
phenolic compound with a first recombinant host cell; wherein the
first recombinant host cell comprises a heterologous polypeptide
having an aryl sulfotransferase activity.
[0363] 2. The process according to item 1, further comprising:
[0364] (ii) culturing the first recombinant host cell under
suitable conditions for the production of the corresponding
sulfated phenolic compound; and
[0365] (iii) optionally, recovering said sulfated phenolic
compound.
[0366] 3. The process according to item 1 or 2, wherein the
heterologous polypeptide having an aryl sulfotransferase activity
is a sulfotransferase 1A1 enzyme.
[0367] 4. The process according to any one of items 1-3, wherein
the heterologous polypeptide having an aryl sulfotransferase
activity is selected from the group consisting of:
[0368] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0369] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0370] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0371] 5. The process according to any one of items 1-4, wherein
the heterologous polypeptide is selected from the group consisting
of:
[0372] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1;
[0373] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1; or
[0374] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0375] 6. The process according to any one of items 1-5, wherein
the first recombinant host cell comprises an exogenous nucleic acid
molecule comprising a nucleotide sequence encoding said
heterologous polypeptide.
[0376] 7. The process according to item 6, wherein the exogenous
nucleic acid molecule further comprises a promoter that is
functional in the host cell to cause the production of an mRNA
molecule and that is operably linked to the nucleotide sequence
encoding said heterologous polypeptide.
[0377] 8. The process according to item 6 or 7, wherein the
exogenous nucleic acid molecule is a vector.
[0378] 9. The process according to item 6 or 7, wherein the
exogenous nucleic acid molecule is stably integrated into the
genome of said first recombinant host cell.
[0379] 10. The process according to any one of items 1-7, wherein
the first recombinant host cell has been further modified to have
an increased protein expression of an ATP sulfurylase compared to
an identical host cell that does not carry said modification.
[0380] 11. The process according to item 10, wherein the ATP
sulfurylase is encoded by the genes cysD and cysN.
[0381] 12. The process according to any one of items 1-11, wherein
said first recombinant host cell has been further modified to have
an increased protein expression of an APS kinase compared to an
identical host cell that does not carry said modification.
[0382] 13. The process according to item 12, wherein said APS
kinase is encoded by the gene cysC.
[0383] 14. The process according to any one of items 1-13, wherein
said first recombinant host cell has been further modified to have
an increased protein expression of a PAP phosphatase compared to an
identical host cell that does not carry said modification.
[0384] 15. The process according to item 14, wherein said PAP
phosphatase is encoded by the gene cysQ.
[0385] 16. The process according to any one of items 10-15, wherein
the increase in protein expression is achieved by increasing the
number of copies of the encoding gene or genes.
[0386] 17. The process according to item 16, wherein the increase
in the number of copies of the gene or genes is achieved by using
one or more vectors comprising the gene or genes operably linked to
a promoter that is functional in the host cell to cause the
production of an mRNA molecule.
[0387] 18. The process according to any one of items 10-15, wherein
the increase in protein expression is achieved by modifying the
ribosome binding site.
[0388] 19. The process according to any one of items 10-18, wherein
the increase in protein expression is achieved by increasing the
strength of the promoter(s) operably linked to the gene or
genes.
[0389] 20. The process according to any one of items 1-19, wherein
said first recombinant host cell further comprises a heterologous
polypeptide having a tyrosine ammonia lyase activity.
[0390] 21. The process according to any one of items 1-20, wherein
in step (i'), (i'') or (i''') the medium is further contacted with
a second recombinant host cell comprising a heterologous
polypeptide having a tyrosine ammonia lyase activity.
[0391] 22. The process according to item 20 or 21, wherein the
heterologous polypeptide having a tyrosine ammonia lyase activity
is selected from the group consisting of:
[0392] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0393] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0394] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0395] 23. The process according to any one of items 20 to 22,
wherein the first and/or second recombinant host cell comprises an
exogenous nucleic acid molecule comprising a nucleotide sequence
encoding said heterologous polypeptide having a tyrosine ammonia
lyase activity.
[0396] 24. The process according to item 23, wherein the exogenous
nucleic acid molecule further comprises a promoter that is
functional in the host cell to cause the production of an mRNA
molecule and that is operably linked to the nucleotide sequence
encoding said heterologous polypeptide.
[0397] 25. The process according to item 23 or 24, wherein the
exogenous nucleic acid molecule is a vector.
[0398] 26. The process according to item 23 or 24, wherein the
exogenous nucleic acid is stably integrated into the genome of the
first and/or second recombinant host cell.
[0399] 27. The process according to any one of items 1 to 26,
wherein the first recombinant host cell and the second recombinant
host cell are independently selected from the group consisting of
bacteria, yeasts, fungi, algae and plant cells.
[0400] 28. The process according to any one of items 1 to 27,
wherein the first recombinant host cell is a bacterium.
[0401] 29. The process according to item 28, wherein the bacterium
is a bacterium of the genus Bacillus, Lactococcus, Lactobacillus,
Clostridium, Corynebacterium, Geobacillus, Thermoanaerobacterium,
Streptococcus, Pseudomonas, Streptomyces, Escherichia, Shigella,
Acinetobacter, Citrobacter, Salmonella, Klebsiella, Enterobacter,
Erwinia, Kluyvera, Serratia, Cedecea, Morganella, Hafnia,
Edwardsiella, Providencia, Proteus, or Yersinia.
[0402] 30. The process according to item 28, wherein the bacterium
is a bacterium of the genus Bacillus.
[0403] 31. The process according to item 30, wherein the bacterium
is Bacillus subtilis.
[0404] 32. The process according to item 28, wherein the bacterium
is a bacterium of the genus Lactococcus.
[0405] 33. The process according to item 32, wherein the bacterium
is Lactococcus lactis.
[0406] 34. The process according to item 28, wherein the bacterium
is a bacterium of the genus Pseudomonas.
[0407] 35. The process according to item 34, wherein the bacterium
is Pseudomonas putida.
[0408] 36. The process according to item 28, wherein the bacterium
is a bacterium of the genus Corynebacterium.
[0409] 37. The process according to item 36, wherein the bacterium
is Corynebacterium glutamicum.
[0410] 38. The process according to item 28, wherein the bacterium
is a bacterium of the genus Escherichia.
[0412] 39. The process according to item 38, wherein the bacterium
is Escherichia coli.
[0413] 40. The process according to any one of items 1-27, wherein
the first recombinant host cell is a yeast.
[0414] 41. The process according to item 40, wherein the yeast is
of the genus Saccharomyces, Pichia, Schizosaccharomyces,
Zygosaccharomyces, Hansenula, Pachysolen, Kluyveromyces,
Debaryomyces, Yarrowia, Candida, Cryptococcus, Komagataella,
Lipomyces, Rhodosporidium, Rhodotorula, or Trichosporon.
[0415] 42. The process according to item 40, wherein the yeast is a
yeast of the genus Saccharomyces or Pichia.
[0416] 43. The process according to item 40, wherein the yeast is
selected from the group consisting of Saccharomyces cerevisiae,
Pichia pastoris, and Pichia kudriavzevii.
[0417] 44. The process according to item 43, wherein the yeast is
Saccharomyces cerevisiae.
[0418] 45. The process according to item 43, wherein the yeast is
Pichia pastoris.
[0419] 46. The process according to any one of items 1-27, wherein
the first recombinant host cell is a fungus.
[0420] 47. The process according to item 46, wherein the fungus is
a fungus of the genus Aspergillus.
[0421] 48. The process according to item 47, wherein the fungus is
Aspergillus oryzae or Aspergillus niger.
[0422] 49. The process according to any one of items 1-27, wherein
the first recombinant host cell is an algae cell.
[0423] 50. The process according to item 49, wherein the algae
cell is an algae cell of the genus Haematococcus, Phaeodactylum,
Volvox or Dunaliella.
[0424] 51. The process according to any one of items 1-27, wherein
the first recombinant host cell is a plant cell.
[0425] 52. The process according to item 51, wherein the plant cell
is selected from the group consisting of soybean, rapeseed,
sunflower, cotton, corn, tobacco, alfalfa, wheat, barley, oats,
sorghum, lettuce, rice, broccoli, cauliflower, cabbage, parsnips,
melons, carrots, celery, parsley, tomatoes, potatoes, strawberries,
peanuts, grapes, grass seed crops, sugar beets, sugar cane, beans,
peas, rye, flax, hardwood trees, softwood trees, and forage
grasses.
[0426] 53. The process according to any one of items 1-52, wherein
the phenolic compound is represented by the general formula
(I):
##STR00004##
[0427] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0428] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0429] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0430] 54. A process according to any one of the items 1-52,
wherein the phenolic compound is represented by the general formula
(II):
##STR00005##
[0431] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0432] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0433] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0434] 55. The process according to any one of items 1-53, wherein
the precursor of a phenolic compound in step (i''') is a compound
of the general Formula (p-I):
##STR00006##
[0435] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0436] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7, and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0437] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6, are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0438] 56. The process according to any one of items 53-55, wherein
R.sub.6 is --COOR.sub.7, wherein R.sub.7 is selected from the group
consisting of hydrogen, alkyl, alkenyl, alkynyl, aryl, and
heteroaryl.
[0439] 57. The process according to item 56, wherein R.sub.7 is
hydrogen.
58. The process according to any one of items 53-57,
wherein R.sub.3 is hydroxyl (--OH).
[0440] 59. The process according to any one of items 53-58, wherein
each of R.sub.1, R.sub.2, R.sub.4 and R.sub.5 is hydrogen.
[0441] 60. The process according to any one of items 53-58, wherein
R.sub.4 is hydroxyl (--OH).
[0442] 61. The process according to item 60, wherein each of
R.sub.1, R.sub.2, and R.sub.5 is hydrogen.
[0443] 62. The process according to any one of items 53-55, wherein
each of R.sub.1, R.sub.3 and R.sub.5 is hydrogen, each of R.sub.2
and R.sub.4 is hydroxyl (--OH), and R.sub.6 is p-hydroxyphenyl.
[0444] 63. A recombinant host cell comprising a first heterologous
polypeptide having aryl sulfotransferase activity, such as a
polypeptide selected from the group consisting of:
[0445] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0446] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0447] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0448] 64. The recombinant host cell according to item 63, wherein
the heterologous polypeptide is selected from the group consisting
of:
[0449] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1;
[0450] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1; or
[0451] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0452] 65. The recombinant host cell according to item 63 or 64,
wherein the polypeptide according to b) or c) has aryl
sulfotransferase activity.
[0453] 66. The recombinant host cell according to any one of items
63-65, wherein the host cell comprises an exogenous nucleic acid molecule
comprising a nucleotide sequence encoding said first heterologous
polypeptide.
[0454] 67. The recombinant host cell according to item 66, wherein
the exogenous nucleic acid molecule further comprises a promoter
that is functional in the host cell to cause the production of an
mRNA molecule and that is operably linked to the nucleotide
sequence encoding said first heterologous polypeptide.
[0455] 68. The recombinant host cell according to item 67, wherein
the exogenous nucleic acid molecule further comprises at least one
regulatory element selected from a 5' untranslated region (5'UTR)
and 3' untranslated region (3' UTR).
[0456] 69. The recombinant host cell according to any one of items
66-68, wherein the exogenous nucleic acid is a vector.
[0457] 70. The recombinant host cell according to any one of items
66-68, wherein the exogenous nucleic acid is stably integrated
into the genome of the host cell.
[0458] 71. The recombinant host cell according to any one of items
63-70, wherein the recombinant host cell has further been modified
to have an increased protein expression of an ATP sulfurylase
compared to an identical host cell that does not carry said
modification.
[0459] 72. The recombinant host cell according to item 71, wherein
said ATP sulfurylase is encoded by the genes cysD and cysN.
[0460] 73. The recombinant host cell according to any one of items
63-72, wherein the recombinant host cell has further been modified
to have an increased protein expression of an APS kinase compared
to an identical host cell that does not carry said
modification.
[0461] 74. The recombinant host cell according to item 73, wherein
said APS kinase is encoded by the gene cysC.
[0462] 75. The recombinant host cell according to any one of items
63-74, wherein the recombinant host cell has further been modified
to have an increased protein expression of a PAP phosphatase
compared to an identical host cell that does not carry said
modification.
[0463] 76. The recombinant host cell according to item 75, wherein
said PAP phosphatase is encoded by the gene cysQ.
[0464] 77. The recombinant host cell according to any one of items
63-76, wherein the increase in gene expression has been achieved by
an increased number of copies of the gene or genes.
[0465] 78. The recombinant host cell according to item 77, wherein
the increase in the number of copies of the gene or genes is
achieved by using one or more vectors comprising the gene or genes
operably linked to a promoter that is functional in the host cell
to cause the production of an mRNA molecule.
[0466] 79. The recombinant host cell according to any one of items
63-76, wherein the increase in protein expression is achieved by
modifying the ribosome binding site.
[0467] 80. The recombinant host cell according to any one of items
63-76, wherein the increase in gene expression has been achieved by
increasing the strength of the promoter(s) operably linked to the
gene or genes.
[0468] 81. The recombinant host cell according to any one of items
63-80, further comprising a second heterologous polypeptide having
tyrosine ammonia lyase activity, such as a polypeptide selected
from the group consisting of:
[0469] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0470] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0471] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0472] 82. The recombinant host cell according to item 81, wherein
the heterologous polypeptide according to e) or f) has tyrosine
ammonia lyase activity.
[0473] 83. The recombinant host cell according to item 81 or 82,
wherein the recombinant host cell comprises an exogenous nucleic
acid molecule comprising a nucleotide sequence encoding said second
heterologous polypeptide.
[0474] 84. The recombinant host cell according to item 83, wherein
the exogenous nucleic acid molecule further comprises a promoter
that is functional in the host cell to cause the production of an
mRNA molecule and that is operably linked to the nucleotide
sequence encoding said second heterologous polypeptide.
[0475] 85. The recombinant host cell according to item 83 or 84,
wherein the exogenous nucleic acid molecule is a vector.
[0476] 86. The recombinant host cell according to item 83 or 84,
wherein the exogenous nucleic acid is stably integrated into the
genome of the recombinant host cell.
[0477] 87. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is selected from the group
consisting of bacteria, yeasts, fungi, algae and plant cells.
[0478] 88. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is a bacterium.
[0479] 89. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Bacillus, Lactococcus,
Lactobacillus, Clostridium, Corynebacterium, Geobacillus,
Streptococcus, Pseudomonas, Streptomyces, Escherichia, Shigella,
Acinetobacter, Citrobacter, Salmonella, Klebsiella, Enterobacter,
Erwinia, Kluyvera, Serratia, Cedecea, Morganella, Hafnia,
Edwardsiella, Providencia, Proteus, or Yersinia.
[0480] 90. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Bacillus.
[0481] 91. The recombinant host cell according to item 90, wherein
the bacterium is Bacillus subtilis.
[0482] 92. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Lactococcus.
[0483] 93. The recombinant host cell according to item 92, wherein
the bacterium is Lactococcus lactis.
[0484] 94. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Pseudomonas.
[0485] 95. The recombinant host cell according to item 94, wherein
the bacterium is Pseudomonas putida.
[0486] 96. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Corynebacterium.
[0487] 97. The recombinant host cell according to item 96, wherein
the bacterium is Corynebacterium glutamicum.
[0488] 98. The recombinant host cell according to item 88, wherein
the bacterium is a bacterium of the genus Escherichia.
[0489] 99. The recombinant host cell according to item 98, wherein
the bacterium is Escherichia coli.
[0490] 100. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is a yeast.
[0491] 101. The recombinant host cell according to item 100,
wherein the yeast is of the genus Saccharomyces, Pichia,
Schizosaccharomyces, Zygosaccharomyces, Hansenula, Pachysolen,
Kluyveromyces, Debaryomyces, Yarrowia, Candida, Cryptococcus,
Komagataella, Lipomyces, Rhodosporidium, Rhodotorula, or Trichosporon.
[0493] 102. The recombinant host cell according to item 100,
wherein the yeast is a yeast of the genus Saccharomyces or
Pichia.
[0494] 103. The recombinant host cell according to item 100,
wherein the yeast is selected from the group consisting of
Saccharomyces cerevisiae, Pichia pastoris, and Pichia
kudriavzevii.
[0495] 104. The recombinant host cell according to item 103,
wherein the yeast is Saccharomyces cerevisiae.
[0496] 105. The recombinant host cell according to item 103,
wherein the yeast is Pichia pastoris.
[0497] 106. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is a fungus.
[0498] 107. The recombinant host cell according to item 106,
wherein the fungus is a fungus of the genus Aspergillus.
[0499] 108. The recombinant host cell according to item 107,
wherein the fungus is Aspergillus oryzae or Aspergillus niger.
[0500] 109. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is an algae cell.
[0501] 110. The recombinant host cell according to item 109,
wherein the algae cell is an algae cell of the genus
Haematococcus, Phaeodactylum, Volvox or Dunaliella.
[0502] 111. The recombinant host cell according to any one of items
63-86, wherein the recombinant host cell is a plant cell.
[0503] 112. The recombinant host cell according to item 111,
wherein the plant cell is selected from the group consisting of
soybean, rapeseed, sunflower, cotton, corn, tobacco, alfalfa,
wheat, barley, oats, sorghum, lettuce, rice, broccoli, cauliflower,
cabbage, parsnips, melons, carrots, celery, parsley, tomatoes,
potatoes, strawberries, peanuts, grapes, grass seed crops, sugar
beets, sugar cane, beans, peas, rye, flax, hardwood trees, softwood
trees, and forage grasses.
[0504] 113. The recombinant host cell according to any one of items
63-112, which is employed as the first recombinant host cell in the
process according to any one of items 1-62.
[0505] 114. Use of a polypeptide in the sulfation of a phenolic
compound, said polypeptide having aryl sulfotransferase activity,
such as a polypeptide selected from the group consisting of:
[0506] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0507] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0508] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0509] 115. The use according to item 114, wherein the polypeptide
is selected from the group consisting of:
[0510] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1;
[0511] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1; or
[0512] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
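Option b) of items 114 and 115 is expressed as a minimum percentage of sequence identity to a reference sequence. As a minimal sketch of how such a threshold might be checked, the Python below computes percent identity from two sequences that have already been aligned, with gaps written as `-`. The function names, the gap character and the choice of denominator (all alignment columns, gaps included) are assumptions made for illustration only; any definition of sequence identity given elsewhere in this application takes precedence, and other conventions can give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a gapped pairwise alignment.

    Assumed convention (not taken from the application): identical,
    non-gap columns divided by all alignment columns, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap; they hold no residues.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `MEFSRPPLVH` (the first ten residues of SEQ ID NO: 1 in one-letter code) compared with a hypothetical variant `MELSRPPLEH` differs at two of ten positions, so the sketch reports 80.0 and the pair would pass a 70% threshold but not an 85% one.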
[0513] 116. The use according to item 114 or 115, wherein the
phenolic compound is of the general formula (I):
##STR00007##
[0514] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0515] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7 and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0516] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0517] 117. The use according to item 114 or 115, wherein the
phenolic compound is of the general formula (II)
##STR00008##
[0518] wherein at least one of R.sub.1, R.sub.2, R.sub.3, R.sub.4,
and R.sub.5 is a hydroxyl group (--OH);
[0519] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are independently selected from the group consisting of
halide, hydrogen, hydroxyl (--OH), --OR.sub.7, --OCOR.sub.7,
--NR.sub.7R.sub.8, --COR.sub.7, --COOR.sub.7, --SR.sub.7,
--OSO.sub.3R.sub.7, --OCSR.sub.7, --POR.sub.7R.sub.8, alkyl,
alkenyl, alkynyl, aryl, and heteroaryl; wherein R.sub.7 and
R.sub.8 are independently selected from the group consisting of
hydrogen, alkyl, alkenyl, alkynyl, aryl, and heteroaryl;
[0520] wherein R.sub.1, R.sub.2, R.sub.3, R.sub.4, R.sub.5 and
R.sub.6 are optionally linked with a bridge member Y.sub.n,
thereby forming one or more rings, Y.sub.n being a bond or a
C.sub.1-12 alkyl or an aryl, a carbocyclic, a heterocyclic or a
heteroaromatic structure having 1-3 rings, 3-8 ring members in each
and 0 to 4 heteroatoms, or a heteroalkyl comprising 1 to 12
heteroatoms selected from the group consisting of N, O, S,
S(O).sub.1-2 and carbonyl, and wherein n is an integer between 1
and 12.
[0521] 118. The use according to item 114 or 115, wherein the
phenolic compound is p-coumaric acid.
[0522] 119. A process for the production of a sulfated phenolic
compound, such as zosteric acid, the process comprising sulfating a
phenolic compound, such as p-coumaric acid, using a polypeptide
having aryl sulfotransferase activity, such as a polypeptide
selected from the group consisting of:
[0523] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0524] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0525] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0526] 120. The process according to item 119, wherein the
polypeptide is selected from the group consisting of:
[0527] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1;
[0528] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1; or
[0529] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to
30, 1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0530] 121. The process according to item 119 or 120, wherein the
phenolic compound is of the general formula (I) or (II) as defined
herein.
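Option c) of items 119 and 120 limits a variant to between 1 and 50 substituted, deleted and/or inserted residues relative to a reference sequence. One way to approximate such a count, offered here only as a sketch, is the Levenshtein edit distance between the two sequences in one-letter code. Using edit distance as the counting rule, and the function names below, are assumptions for illustration; the application does not state that changes are to be counted this way.

```python
def edit_distance(reference: str, variant: str) -> int:
    """Levenshtein distance: minimum number of single-residue
    substitutions, deletions and insertions turning reference into variant."""
    previous = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        current = [i]
        for j, var_res in enumerate(variant, start=1):
            substitution_cost = 0 if ref_res == var_res else 1
            current.append(min(
                previous[j] + 1,                      # delete ref_res
                current[j - 1] + 1,                   # insert var_res
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def within_change_limit(reference: str, variant: str,
                        max_changes: int = 50) -> bool:
    """True if the variant differs by at least 1 and at most max_changes
    residue changes (assumed reading of '1 to 50')."""
    return 1 <= edit_distance(reference, variant) <= max_changes
```

With `MEFSRPPLVH` as a short stand-in reference, a single substitution (`MELSRPPLVH`), a single deletion (`MEFSPPLVH`) and a single insertion (`MEFSRAPPLVH`) each give a distance of 1, while an identical sequence gives 0 and therefore falls under option a) rather than option c).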
[0531] 122. A composition comprising a first recombinant host cell
comprising a heterologous polypeptide having aryl sulfotransferase
activity, such as a polypeptide selected from the group consisting
of:
[0532] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0533] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0534] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted; and a second
recombinant host cell comprising a heterologous polypeptide having
tyrosine ammonia lyase activity, such as a polypeptide selected
from the group consisting of:
[0535] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0536] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0537] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
[0538] 123. A composition comprising a first polypeptide having
aryl sulfotransferase activity, such as a polypeptide selected from
the group consisting of:
[0539] a) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1);
[0540] b) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g., SEQ ID
NO: 1); or
[0541] c) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13 (e.g.,
SEQ ID NO: 1), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30,
1 to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted; and a second
polypeptide having tyrosine ammonia lyase activity, such as a
polypeptide selected from the group consisting of:
[0542] d) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14);
[0543] e) a polypeptide comprising an amino acid sequence which has
at least about 70%, such as at least about 75%, at least about 80%,
at least about 85%, at least about 90%, at least about 93%, at
least 95%, at least 96%, at least 97%, at least 98%, or at least
99%, sequence identity to the amino acid sequence set forth in SEQ
ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ ID NO:
14); or
[0544] f) a polypeptide comprising an amino acid sequence set forth
in SEQ ID NO: 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23 (e.g., SEQ
ID NO: 14), wherein 1 to 50, such as 1 to 40, 1 to 35, 1 to 30, 1
to 25, 1 to 20, 1 to 15, 1 to 10, 1 to 5, 1 to 3, amino acid
residues are substituted, deleted and/or inserted.
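The sequences referred to in the items above are given in the sequence listing as three-letter residue codes interleaved with position numbers. As a hedged illustration of how such a residue stream might be turned into a one-letter string for use with alignment tools, the Python sketch below drops the numeric position markers and maps each three-letter code to its standard one-letter code. The function name is hypothetical, and the sketch assumes that any per-sequence header fields (sequence number, length, type, organism) have been removed from the input beforehand.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def listing_to_one_letter(residue_stream: str) -> str:
    """Convert a whitespace-separated stream of three-letter residue
    codes and position numbers into a one-letter sequence string."""
    residues = []
    for token in residue_stream.split():
        if token.isdigit():
            continue  # position marker such as 5, 10, 15
        if token not in THREE_TO_ONE:
            # Header text or a non-standard residue: fail loudly
            # rather than silently skipping it.
            raise ValueError(f"unrecognised token: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)
```

Applied to the opening of SEQ ID NO: 1, `Met Glu Phe Ser Arg Pro Pro Leu Val His Val Lys Gly Ile Pro Leu 1 5 10 15 Ile Lys Tyr Phe`, the sketch returns `MEFSRPPLVHVKGIPLIKYF`.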
Sequence Listing (75 sequences)
SEQ ID NO: 1 (291 amino acids, PRT, Rattus norvegicus)
Met Glu Phe Ser Arg Pro Pro Leu Val His
Val Lys Gly Ile Pro Leu 1 5 10 15 Ile Lys Tyr Phe Ala Glu Thr Ile
Gly Pro Leu Gln Asn Phe Thr Ala 20 25 30 Trp Pro Asp Asp Leu Leu
Ile Ser Thr Tyr Pro Lys Ser Gly Thr Thr 35 40 45 Trp Met Ser Glu
Ile Leu Asp Met Ile Tyr Gln Gly Gly Lys Leu Glu 50 55 60 Lys Cys
Gly Arg Ala Pro Ile Tyr Ala Arg Val Pro Phe Leu Glu Phe 65 70 75 80
Lys Cys Pro Gly Val Pro Ser Gly Leu Glu Thr Leu Glu Glu Thr Pro 85
90 95 Ala Pro Arg Leu Leu Lys Thr His Leu Pro Leu Ser Leu Leu Pro
Gln 100 105 110 Ser Leu Leu Asp Gln Lys Val Lys Val Ile Tyr Ile Ala
Arg Asn Ala 115 120 125 Lys Asp Val Val Val Ser Tyr Tyr Asn Phe Tyr
Asn Met Ala Lys Leu 130 135 140 His Pro Asp Pro Gly Thr Trp Asp Ser
Phe Leu Glu Asn Phe Met Asp 145 150 155 160 Gly Glu Val Ser Tyr Gly
Ser Trp Tyr Gln His Val Lys Glu Trp Trp 165 170 175 Glu Leu Arg His
Thr His Pro Val Leu Tyr Leu Phe Tyr Glu Asp Ile 180 185 190 Lys Glu
Asn Pro Lys Arg Glu Ile Lys Lys Ile Leu Glu Phe Leu Gly 195 200 205
Arg Ser Leu Pro Glu Glu Thr Val Asp Ser Ile Val His His Thr Ser 210
215 220 Phe Lys Lys Met Lys Glu Asn Cys Met Thr Asn Tyr Thr Thr Ile
Pro 225 230 235 240 Thr Glu Ile Met Asp His Asn Val Ser Pro Phe Met
Arg Lys Gly Thr 245 250 255 Thr Gly Asp Trp Lys Asn Thr Phe Thr Val
Ala Gln Asn Glu Arg Phe 260 265 270 Asp Ala His Tyr Ala Lys Thr Met
Thr Asp Cys Asp Phe Lys Phe Arg 275 280 285 Cys Glu Leu 290
SEQ ID NO: 2 (295 amino acids, PRT, Homo sapiens)
Met Glu Leu Ile Gln Asp Thr Ser Arg Pro Pro
Leu Glu Tyr Val Lys 1 5 10 15 Gly Val Pro Leu Ile Lys Tyr Phe Ala
Glu Ala Leu Gly Pro Leu Gln 20 25 30 Ser Phe Gln Ala Arg Pro Asp
Asp Leu Leu Ile Ser Thr Tyr Pro Lys 35 40 45 Ser Gly Thr Thr Trp
Val Ser Gln Ile Leu Asp Met Ile Tyr Gln Gly 50 55 60 Gly Asp Leu
Glu Lys Cys His Arg Ala Pro Ile Phe Met Arg Val Pro 65 70 75 80 Phe
Leu Glu Phe Lys Ala Pro Gly Ile Pro Ser Gly Met Glu Thr Leu 85 90
95 Lys Asp Thr Pro Ala Pro Arg Leu Leu Lys Thr His Leu Pro Leu Ala
100 105 110 Leu Leu Pro Gln Thr Leu Leu Asp Gln Lys Val Lys Val Val
Tyr Val 115 120 125 Ala Arg Asn Ala Lys Asp Val Ala Val Ser Tyr Tyr
His Phe Tyr His 130 135 140 Met Ala Lys Val His Pro Glu Pro Gly Thr
Trp Asp Ser Phe Leu Glu 145 150 155 160 Lys Phe Met Val Gly Glu Val
Ser Tyr Gly Ser Trp Tyr Gln His Val 165 170 175 Gln Glu Trp Trp Glu
Leu Ser Arg Thr His Pro Val Leu Tyr Leu Phe 180 185 190 Tyr Glu Asp
Met Lys Glu Asn Pro Lys Arg Glu Ile Gln Lys Ile Leu 195 200 205 Glu
Phe Val Gly Arg Ser Leu Pro Glu Glu Thr Val Asp Phe Met Val 210 215
220 Gln His Thr Ser Phe Lys Glu Met Lys Lys Asn Pro Met Thr Asn Tyr
225 230 235 240 Thr Thr Val Pro Gln Glu Phe Met Asp His Ser Ile Ser
Pro Phe Met 245 250 255 Arg Lys Gly Met Ala Gly Asp Trp Lys Thr Thr
Phe Thr Val Ala Gln 260 265 270 Asn Glu Arg Phe Asp Ala Asp Tyr Ala
Glu Lys Met Ala Gly Cys Ser 275 280 285 Leu Ser Phe Arg Ser Glu Leu
290 295
SEQ ID NO: 3 (295 amino acids, PRT, Equus caballus)
Met Glu Leu Ile Gln Asp Thr Ser Arg
Pro Pro Leu Lys Tyr Val Lys 1 5 10 15 Gly Val Pro Leu Ile Lys Tyr
Phe Ala Glu Ala Leu Gly Pro Leu Gln 20 25 30 Ser Phe Gln Ala Arg
Pro Asp Asp Leu Leu Ile Ser Thr Tyr Pro Lys 35 40 45 Ser Gly Thr
Thr Trp Val Ser Glu Ile Leu Asp Met Ile Tyr His Gly 50 55 60 Gly
Asp Leu Glu Lys Cys Arg Arg Ala Pro Ile Phe Ile Arg Val Pro 65 70
75 80 Phe Leu Glu Phe Lys Ala Pro Glu Ile Pro Ser Gly Val Glu Val
Leu 85 90 95 Lys Asp Thr Pro Ala Pro Arg Leu Leu Lys Thr His Leu
Pro Leu Ser 100 105 110 Leu Leu Pro Gln Thr Leu Leu Asp Gln Lys Val
Lys Val Val Tyr Leu 115 120 125 Ala Arg Asn Ala Lys Asp Val Ala Val
Ser Tyr Tyr His Phe Tyr Arg 130 135 140 Met Ala Lys Val His Pro Asp
Pro Gly Thr Trp Asp Ser Phe Leu Glu 145 150 155 160 Lys Phe Met Ala
Gly Glu Val Ser Tyr Gly Ser Trp Tyr Lys His Val 165 170 175 Gln Glu
Trp Trp Glu Leu Ser His Thr His Pro Val Leu Tyr Leu Phe 180 185 190
Tyr Glu Asp Met Lys Glu Asn Pro Lys Lys Glu Ile Gln Lys Ile Leu 195
200 205 Glu Phe Val Gly Arg Ser Leu Pro Glu Glu Thr Leu Asp Arg Ile
Val 210 215 220 Gln His Thr Ser Phe Lys Glu Met Lys Lys Asn Pro Met
Ala Asn Tyr 225 230 235 240 Ser Thr Ile Pro Cys Asp Ile Met Asp His
Asn Ile Ser Ala Phe Met 245 250 255 Arg Lys Gly Ile Ala Gly Asp Trp
Lys Asn Thr Phe Thr Val Ala Gln 260 265 270 Asn Glu His Phe Asp Thr
Asp Tyr Ala Glu Lys Met Ala Gly Cys Lys 275 280 285 Leu Ser Phe Arg
Ser Glu Val 290 295
SEQ ID NO: 4 (295 amino acids, PRT, Sus scrofa)
Met Glu Pro Val Gln Asp Thr
Tyr Arg Pro Pro Leu Glu Tyr Val Lys 1 5 10 15 Gly Val Pro Leu Ile
Lys Tyr Phe Ala Glu Ala Leu Gly Pro Leu Glu 20 25 30 Ser Phe Gln
Ala Trp Pro Asp Asp Val Leu Ile Ser Thr Tyr Pro Lys 35 40 45 Ser
Gly Thr Thr Trp Val Ser Glu Ile Leu Asp Leu Ile Tyr Gln Gly 50 55
60 Gly Asp Leu Gln Lys Cys Gln Arg Ala Pro Ile Phe Val Arg Val Pro
65 70 75 80 Phe Leu Glu Phe Lys Ile Pro Gly Cys Pro Thr Gly Phe Glu
Leu Leu 85 90 95 Lys Asp Thr Pro Ala Pro Arg Leu Leu Lys Thr His
Leu Pro Leu Ala 100 105 110 Leu Leu Pro Gln Thr Leu Leu Asp Gln Lys
Val Lys Val Val Tyr Val 115 120 125 Ala Arg Asn Ala Lys Asp Val Ala
Val Ser Tyr Tyr His Phe Tyr Arg 130 135 140 Met Ala Lys Val His Pro
Asn Pro Gly Thr Trp Asp Ser Phe Leu Glu 145 150 155 160 Asp Phe Met
Ala Gly Glu Val Ser Tyr Gly Ser Trp Tyr Gln His Val 165 170 175 Gln
Glu Trp Trp Glu Leu Arg His Thr His Pro Val Leu Tyr Leu Phe 180 185
190 Tyr Glu Asp Met Lys Glu Asn Pro Lys Arg Glu Ile Gln Lys Ile Leu
195 200 205 Glu Phe Val Gly Arg Ser Leu Pro Glu Glu Thr Val Glu Asp
Ile Val 210 215 220 Gln His Thr Ser Phe Gln Glu Met Lys Asn Asn Ala
Met Thr Asn Tyr 225 230 235 240 Arg Thr Leu Pro Ser Asp Leu Leu Asp
His Ser Ile Ser Ala Phe Met 245 250 255 Arg Lys Gly Ile Thr Gly Asp
Trp Lys Ser Thr Phe Thr Val Ala Gln 260 265 270 Asn Glu Arg Phe Glu
Ala Asp Tyr Ala Glu Lys Met Ala Gly Cys Asn 275 280 285 Leu Arg Phe
Arg Ser Glu Leu 290 295
SEQ ID NO: 5 (295 amino acids, PRT, Canis lupus)
Met Glu Asp Ile Pro Asp
Thr Ser Arg Pro Pro Leu Lys Tyr Val Lys 1 5 10 15 Gly Ile Pro Leu
Ile Lys Tyr Phe Ala Glu Ala Leu Glu Ser Leu Gln 20 25 30 Asp Phe
Gln Ala Gln Pro Asp Asp Leu Leu Ile Ser Thr Tyr Pro Lys 35 40 45
Ser Gly Thr Thr Trp Val Ser Glu Ile Leu Asp Met Ile Tyr Gln Asp 50
55 60 Gly Asp Val Glu Lys Cys Arg Arg Ala Pro Val Phe Ile Arg Val
Pro 65 70 75 80 Phe Leu Glu Phe Lys Ala Pro Gly Ile Pro Thr Gly Leu
Glu Val Leu 85 90 95 Lys Asp Thr Pro Ala Pro Arg Leu Ile Lys Thr
His Leu Pro Leu Ala 100 105 110 Leu Leu Pro Gln Thr Leu Leu Asp Gln
Lys Val Lys Val Val Tyr Val 115 120 125 Ala Arg Asn Ala Lys Asp Val
Ala Val Ser Tyr Tyr His Phe Tyr Arg 130 135 140 Met Ala Lys Val His
Pro Asp Pro Asp Thr Trp Asp Ser Phe Leu Glu 145 150 155 160 Lys Phe
Met Ala Gly Glu Val Ser Tyr Gly Ser Trp Tyr Gln His Val 165 170 175
Gln Glu Trp Trp Glu Leu Ser His Thr His Pro Val Leu Tyr Leu Phe 180
185 190 Tyr Glu Asp Met Lys Glu Asn Pro Lys Arg Glu Ile Gln Lys Ile
Leu 195 200 205 Lys Phe Val Gly Arg Ser Leu Pro Glu Glu Thr Val Asp
Leu Ile Val 210 215 220 Gln His Thr Ser Phe Lys Glu Met Lys Asn Asn
Ser Met Ala Asn Tyr 225 230 235 240 Thr Thr Leu Ser Pro Asp Ile Met
Asp His Ser Ile Ser Ala Phe Met 245 250 255 Arg Lys Gly Ile Ser Gly
Asp Trp Lys Thr Thr Phe Thr Val Ala Gln 260 265 270 Asn Glu Arg Phe
Asp Ala Asp Tyr Ala Lys Lys Met Glu Gly Cys Gly 275 280 285 Leu Ser
Phe Arg Thr Gln Leu 290 295
SEQ ID NO: 6 (294 amino acids, PRT, Gallus gallus)
Met Gly Asn Asp
Glu Val Ile Arg Gln Asp Leu Gly Cys Leu Tyr Asp 1 5 10 15 Ile Pro
Leu Tyr Lys Cys Phe Val Ala Gly Trp Pro Gln Val Glu Ala 20 25 30
Phe Gln Ala Arg Pro Asp Asp Leu Leu Ile Ala Thr Tyr Pro Lys Ser 35
40 45 Gly Thr Thr Trp Leu Ser Glu Ile Leu Asp Ala Ile Tyr His Asp
Gly 50 55 60 Asp Leu Glu Lys Cys Arg Arg Asp Ala Ile Tyr Asn Arg
Val Pro Phe 65 70 75 80 Leu Glu Met Lys Ala Pro Gly Ile Leu Ser Gly
Val Glu Gln Leu Glu 85 90 95 Lys Ile Pro Ser Pro Arg Leu Val Lys
Thr His Leu Pro Val His Leu 100 105 110 Leu Pro Ala Ser Phe Gln Glu
Lys Asp Cys Lys Val Ile Tyr Met Ala 115 120 125 Arg Asn Ala Lys Asp
Val Val Ile Ser Tyr Tyr Tyr Phe Tyr Gln Met 130 135 140 Ala Lys Ile
His Pro Asp Pro Gly Thr Leu Ser Glu Phe Leu Gln Ala 145 150 155 160
Phe Met Asp Gly Lys Val Ala Tyr Gly Ser Trp Tyr Lys His Val Lys 165
170 175 Gly Trp Trp Glu Lys Arg His Glu Lys Arg Leu Leu Tyr Leu Phe
Tyr 180 185 190 Glu Asp Met Lys Lys Asp Pro Arg Arg Glu Ile Gln Lys
Ile Leu Gln 195 200 205 Phe Leu Gly Lys Glu Val Ala Glu Glu Thr Val
Ala Arg Ile Leu His 210 215 220 His Thr Ser Phe Gln Glu Met Lys Lys
Asn Pro Ala Thr Asn Tyr Glu 225 230 235 240 Thr Met Pro Thr Glu Leu
Met Asp His Ser Leu Ser Pro Phe Met Arg 245 250 255 Lys Gly Ile Ser
Gly Asp Trp Ala Asn His Phe Thr Val Ala Gln Asn 260 265 270 Glu Arg
Phe Asp Gln His Tyr Gln Gln Gln Met Ala Gly Ser Asp Leu 275 280 285
Cys Phe Gln Met Glu Ala 290
SEQ ID NO: 7 (307 amino acids, PRT, Gallus gallus)
Met Ala Leu Asp
Lys Met Glu Asn Leu Ser Leu Glu Glu Asn Met Leu 1 5 10 15 Arg Ser
Glu Met Gly Glu Val Gln Gly Ile Pro Val Thr Lys Pro Thr 20 25 30
Cys Asp Ile Trp Asp Gln Val Trp Asn Phe Lys Ala Arg Pro Asp Asp 35
40 45 Leu Leu Val Ala Thr Tyr Ala Lys Ala Gly Thr Thr Trp Thr Gln
Glu 50 55 60 Ile Val Asp Met Ile Gln Gln Asn Gly Asp Ile Glu Lys
Cys Arg Arg 65 70 75 80 Ala Ser Thr Tyr Lys Arg His Pro Phe Leu Glu
Trp Tyr Ile Pro Asp 85 90 95 Ser Ser Pro Leu Gly Tyr Ser Gly Leu
Lys Leu Ala Glu Ala Met Pro 100 105 110 Ser Pro Arg Thr Met Lys Thr
His Leu Pro Val Gln Leu Val Pro Pro 115 120 125 Ser Phe Trp Glu Gln
Asn Cys Lys Ile Ile Tyr Val Ala Arg Asn Ala 130 135 140 Lys Asp Asn
Leu Val Ser Tyr Tyr His Phe His Arg Met Asn Lys Val 145 150 155 160
Leu Pro Asp Pro Gly Thr Ile Glu Glu Phe Thr Glu Lys Phe Met Asn 165
170 175 Gly Glu Val Leu Trp Gly Ser Trp Tyr Asp His Val Lys Gly Trp
Trp 180 185 190 Lys Ala Lys Asp Lys His Arg Ile Leu Tyr Leu Phe Tyr
Glu Asp Met 195 200 205 Lys Glu Asn Pro Lys Arg Glu Ile Gln Lys Ile
Met Lys Phe Leu Glu 210 215 220 Lys Asp Leu Asp Glu Glu Val Leu Asn
Lys Ile Ile Tyr Asn Thr Ser 225 230 235 240 Phe Glu Ile Met Lys Asp
Asn Pro Met Thr Asn Tyr Thr Lys Asp Phe 245 250 255 Val Gly Val Met
Asp His Ser Val Ser Pro Phe Met Arg Lys Gly Ser 260 265 270 Val Gly
Asp Trp Lys Asn Tyr Phe Thr Val Ala Leu Asn Lys Lys Phe 275 280 285
Asp Gln Asp Tyr Lys Lys Lys Met Ala Asp Thr Ser Leu Val Phe Arg 290
295 300 Met Glu Leu 305
SEQ ID NO: 8 (301 amino acids, PRT, Danio rerio)
Met Asp Leu Pro Asp Ile
Ser Ser Ile Lys Leu Pro Ser Arg Pro Lys 1 5 10 15 Ile Phe Glu Phe
Glu Gly Ile Ser Met Ile Ser Tyr Phe Thr Asp Asn 20 25 30 Trp Glu
Lys Leu Lys Asn Phe Gln Ala Arg Pro Asp Asp Ile Leu Ile 35 40 45
Ala Thr Tyr Pro Lys Ala Gly Thr Thr Trp Val Ser Tyr Ile Leu Asp 50
55 60 Leu Leu Tyr Phe Gly Lys Val Glu Pro Asn Gly Gln Ser Ser Leu
Pro 65 70 75 80 Ile Tyr Met Arg Val Pro Phe Leu Glu Ser Cys Phe Pro
Gly Met Pro 85 90 95 Ser Gly Thr Glu Leu Ala Asp Asn Leu Pro Asn
Ser Pro Arg Leu Ile 100 105 110 Lys Thr His Leu Pro Val Gln Leu Val
Pro Lys Ser Phe Trp Gly Gln 115 120 125 Asn Ser Lys Val Val Tyr Val
Ala Arg Asn Ala Lys Asp Asn Val Val 130 135 140 Ser Phe Phe His Phe
Asp Arg Met Asn His Gly Gln Pro Glu Pro Gly 145 150 155 160 Asp Trp
Asp Thr Phe Leu Gln Ala Phe Ile Lys Gly Glu Arg Val Phe 165 170 175
Gly Ser Trp Phe Asp His Val Cys Gly Trp Trp Glu Lys Lys Lys Thr 180
185 190 Tyr Pro Asn Leu His Tyr Met Phe Tyr Glu Asp Ile Ala Lys Asp
Ile 195 200 205 Asn Gly Glu Val Glu Ser Leu Cys Thr Phe Leu Lys Leu
Ser Arg Ser 210 215 220 Asp Glu Glu Lys Glu Lys Ile Ile Asn Gly Val
Gln Phe Asp Ala Met 225 230 235 240 Lys Gln Asn Val Met Thr Asn Tyr
Ser Thr Ile Pro Thr Met Asp Phe 245 250 255 Thr Ile Ser Pro Phe Met
Arg Lys Gly Lys Val Gly Asp Trp Lys Asn 260 265 270 His Phe Thr Val
Ala Gln Asn Glu Gln Phe Asp Glu Asp Tyr Lys Glu 275 280 285 Lys Met
Lys Asn Thr Thr Leu Asn Phe Arg Thr Lys Ile 290 295 300
SEQ ID NO: 9 (300 amino acids, PRT, Danio rerio)
Met Glu Ile Gln Gly Lys Ser Ser Thr Asp Leu Pro
Asp Arg Pro Glu 1 5 10 15 Ile Phe Glu Phe Glu Gly Ile Ser Met Val
Glu His Phe Thr Lys Asn 20 25 30 Trp Glu Asn Val Lys Asn Phe Gln
Ala Arg Pro Asp Asp Ile Leu Ile 35 40 45 Ala Thr Tyr Pro Lys Ala
Gly Thr Thr Trp Val Ser Asn Ile Leu Asp 50 55 60 Leu Leu Tyr Phe
Gly Lys Glu Asp Pro Lys Arg Gln Thr Thr Lys Pro 65 70 75 80 Ile Tyr
Lys Arg Val Pro Phe Leu Glu Ser Cys Phe Pro Glu Met Gln 85 90 95
Ser Gly Thr Glu Leu Ala Asn Asn Leu Pro Thr Ser Pro Arg Leu Ile 100
105 110 Lys Thr His Leu Pro Val Gln Leu Val Pro Gln Ser Phe Trp Glu
Lys 115 120 125 Asn Ser Arg Val Ala Tyr Val Ala Arg Asn Ala Lys Asp
Asn Ala Val 130 135 140 Ser Tyr Phe His Phe Asn Arg Met Asn Lys Ala
Gln Pro Glu Pro Gly 145 150 155 160 Asp Trp Asn Thr Phe Leu Glu Glu
Phe Met Lys Gly Lys Met Val Phe 165 170 175 Gly Ser Trp Phe Asp His
Val Cys Gly Trp Trp Glu Lys Lys Lys Thr 180 185 190 Tyr Pro Asn Leu
His Tyr Met Leu Tyr Glu Asp Met Ala Lys Asp Ile 195 200 205 Lys Gly
Glu Val Glu Ser Leu Cys Thr Phe Leu Lys Leu Ser Arg Ser 210 215 220
Asp Glu Glu Lys Glu Lys Ile Ile Asn Gly Ile Gln Phe Asp Ala Met 225
230 235 240 Lys Gln Asn Lys Met Thr Asn Tyr Ser Thr Val Leu Val Met
Asp Phe 245 250 255 Thr Ile Ser Pro Phe Met Arg Lys Gly Lys Val Gly
Asp Trp Lys Asn 260 265 270 His Phe Thr Val Ala Gln Asn Glu Gln Phe
Asn Glu Asp Tyr Lys Gln 275 280 285 Lys Met Lys Asn Ser Thr Leu Lys
Phe Pro Thr Glu 290 295 300
SEQ ID NO: 10 (338 amino acids, PRT, Drosophila melanogaster)
Met
Pro Gln Ser Ser Phe Phe Ala Lys Ser Val Pro Phe Glu Gln Ile 1 5 10
15 Asp Lys Leu Ala Ile Ser Gly Gly Tyr Ser Ser Ile Phe Ala Ser Ser
20 25 30 Lys Pro Ser Val Pro Val Val Gly Asn Trp Glu Gln Arg Phe
Cys Arg 35 40 45 Leu Ala Asp Thr Phe Gln Pro Val Leu Asp Arg Val
Tyr Asp Phe Glu 50 55 60 Val Arg Asp Asp Asp Val Trp Ile Val Thr
Leu Pro Lys Cys Gly Thr 65 70 75 80 Thr Trp Met Gln Glu Leu Ala Trp
Leu Val Ile Asn Glu Cys Asp Phe 85 90 95 Glu Thr Ala Lys Ser Val
Asp Leu Thr His Arg Ser Pro Phe Leu Glu 100 105 110 Phe Asn Gly Val
Val Pro Asn Val Pro His Asp Thr Ile Ala Ala Ala 115 120 125 Asn Ala
Leu Pro Ser Pro Arg Leu Ile Lys Ser His Leu Pro Ala Trp 130 135 140
Met Leu Pro Arg Gln Ile Trp Ser Lys Arg Pro Lys Ile Ile Tyr Val 145
150 155 160 Tyr Arg Asn Pro Lys Asp Ala Ala Ile Ser Tyr Phe His His
Trp Arg 165 170 175 Gly Met Val Gly Tyr Gln Gly Thr Lys Ser Asp Phe
Met His Ser Phe 180 185 190 Ile Asp Gly Tyr Val Asn Phe Thr Pro Cys
Trp Pro His Ile Leu Asp 195 200 205 Phe Trp Gln Leu Arg His Glu Pro
Asn Ile Phe Phe Thr Ser Tyr Glu 210 215 220 Arg Met Lys Gly Gln Leu
Gly Gln Val Ile Ser Glu Val Ala Gln Phe 225 230 235 240 Leu Glu Arg
Ser Val Ser Gln Glu Gln Met Gln Gln Met Gln Arg His 245 250 255 Leu
Ser Phe Glu Ser Met Arg Asp Asn Pro Ala Cys Asn His Val Lys 260 265
270 Glu Phe Glu Ser Met Lys Ala Ala Ala Gly Arg Glu Val Glu Glu Phe
275 280 285 Arg Phe Val Arg Arg Gly Val Val Gly Ser His Lys Asp Glu
Leu Thr 290 295 300 Ala Asp Ile Ile Arg Glu Phe Asp Leu Trp Ser Asp
Ser Asn Leu Arg 305 310 315 320 Asp Phe Lys Leu Asn Met Asp Asp Phe
Ala Asn Tyr Ser Lys Phe Ala 325 330 335 Ser Thr
SEQ ID NO: 11 (313 amino acids, PRT, Drosophila melanogaster)
Met Asn Arg Val Gln Val Thr Pro Arg Ser Tyr Pro Thr
Asn Leu Ile 1 5 10 15 Asp Lys Asp Trp Gly Asn Arg Lys Leu Phe Tyr
Thr Lys Asp Ser Glu 20 25 30 Asn Phe Leu Arg Leu Val His Asp Met
Lys Leu Arg Asp Asp Asp Val 35 40 45 Trp Ile Val Thr Leu Pro Lys
Cys Gly Thr Thr Trp Met Gln Glu Leu 50 55 60 Leu Trp Leu Leu Leu
Asn Asn Cys Asp Phe Glu Gly Ala Leu Ala Lys 65 70 75 80 Asp Gln Glu
Leu Arg Thr Pro Phe Leu Glu Phe Gly Tyr Ser Val Phe 85 90 95 His
Asp Pro Asn Arg Ser Phe Gly Pro Ile Glu Asp Leu Lys Ser Pro 100 105
110 Arg Leu Ile Lys Ser His Leu Ser Leu Ala Leu Leu Pro Ser Lys Leu
115 120 125 Trp Glu Gly Lys Asn Lys Val Ile Tyr Val Ser Arg Asn Pro
Leu Asp 130 135 140 Ser Tyr Val Ser Arg Tyr Tyr His Gly Val Ser Phe
Gly Phe Asn Tyr 145 150 155 160 Gly Lys Ser Leu His Gln Tyr Phe Asp
Glu Val Leu Ala Ser Asp Asp 165 170 175 Phe Pro Thr Glu Phe Ile Glu
His Ala His Glu Phe Tyr Gln Leu Arg 180 185 190 Asn Glu Pro Trp Val
Phe Tyr Thr Ser Phe Glu Met Met Lys Lys Asp 195 200 205 Leu Arg Gly
Val Ile Asn Asp Val Ser Arg Phe Leu Asn Lys Pro Ile 210 215 220 Asn
Asp Gln Gln Met Glu Lys Leu Leu Lys His Leu Ser Phe Ala Glu 225 230
235 240 Met Lys Lys Asn Pro Thr Thr Asn His Leu Trp Glu Leu Ala Gln
Val 245 250 255 Gln His Glu Asn Ala Gly Lys Glu Met His Pro Phe Val
Arg Arg Gly 260 265 270 Asp Val Asn Gly Tyr Lys Asp Glu Leu Lys Pro
Glu Gln Ile Glu Lys 275 280 285 Ala Asn Val Arg Ile Gln Glu Val Leu
Ala Lys Asn Gly Val Thr Leu 290 295 300 Asp Glu Leu Leu Leu Leu Lys
Asp Gln 305 310
SEQ ID NO: 12 (346 amino acids, PRT, Drosophila melanogaster)
Met Glu Asn Thr
Pro Leu Lys Phe Pro His Glu Ile Arg Asp Val Glu 1 5 10 15 Glu Ser
Thr Asn Ala Glu Leu Leu Asp His Phe His Gly Glu Arg Thr 20 25 30
Gly Phe Val Gln Val Gly Ser Glu Gly Tyr Phe Phe Pro His Lys Tyr 35
40 45 Lys Asp Glu Ala Glu Arg Tyr Tyr Asn Phe Glu Ala Arg Pro Asp
Asp 50 55 60 Val Trp Ile Ala Thr Val Pro Arg Ser Gly Thr Thr Trp
Thr Gln Glu 65 70 75 80 Leu Ile Trp Leu Val Ala Asn Gly Leu Asp Phe
Glu His Ala Gln Glu 85 90 95 Arg Pro Leu Thr Glu Arg Phe Pro Phe
Phe Glu Phe Pro Leu Phe Val 100 105 110 His Pro Lys Ile Lys Glu Glu
Leu Gln Glu Glu Asn Arg Asp Ser Ala 115 120 125 Glu Ala Leu Glu Phe
Ile Glu Lys Ile Ala Arg Pro Gly Tyr Glu Ala 130 135 140 Leu Ser Glu
Ile Pro Arg Ser Gln Arg Arg Phe Ile Lys Thr His Phe 145 150 155 160
Pro Phe Ser Leu Met Pro Pro Ser Val Leu Glu Lys Lys Cys Lys Val 165
170 175 Ile Tyr Val Val Arg Asp Pro Lys Asp Val Ala Val Ser Tyr Tyr
His 180 185 190 Leu Asn Arg Leu Phe Arg Thr Gln Gly Tyr Val Gly Asp
Phe Glu Arg 195 200 205 Tyr Trp His Tyr Phe Gln Asn Gly Leu Asn Pro
Trp Leu Pro Tyr Tyr 210 215 220 Ser His Val Lys Glu Ala Arg Glu His
Ala His Leu Ser Asn Val Leu 225 230 235 240 Phe Leu Arg Tyr Glu Asp
Met Leu Ala Asp Leu Pro Gly Ala Ile Asn 245 250 255 Ser Ile Ala Ser
Phe Leu Glu Cys Pro Pro Lys Pro Glu Asp Met Asp 260 265 270 Arg Leu
Leu Asp His Leu Ser Ile Arg Ser Phe Arg Glu Asn Lys Ser 275 280 285
Val Asn Met His Glu Met Ala Ser Val Gly Val Leu Asn Lys Gly Glu 290
295 300 Ala Gly Phe Val Arg Ser Gly Ala Lys Thr Ala Tyr Gln Pro Gln
Gln 305 310 315 320 Glu Phe Val Glu Asn Pro Lys Leu Leu Lys Ser Ala
Asn Glu Trp Val 325 330 335 Glu Gln Asn Ile Lys Ser Phe Lys Thr Ile
340 345
SEQ ID NO: 13 (323 amino acids, PRT, Arabidopsis thaliana)
Met Glu Met Asn Leu Arg Ile
Glu Asp Leu Asn Glu Glu Thr Lys Thr 1 5 10 15 Leu Ile Ser Ser Leu
Pro Ser Asp Lys Asp Phe Thr Gly Lys Thr Ile 20 25 30 Cys Lys Tyr
Gln Gly Cys Trp Tyr Thr His Asn Val Leu Gln Ala Val 35 40 45 Leu
Asn Phe Gln Lys Ser Phe Lys Pro Gln Asp Thr Asp Ile Ile Val 50 55
60 Ala Ser Phe Pro Lys Cys Gly Thr Thr Trp Leu Lys Ala Leu Thr Phe
65 70 75 80 Ala Leu Leu His Arg Ser Lys Gln Pro Ser His Asp Asp Asp
His Pro 85 90 95 Leu Leu Ser Asn Asn Pro His Val Leu Val Pro Tyr
Phe Glu Ile Asp 100 105 110 Leu Tyr Leu Arg Ser Glu Asn Pro Asp Leu
Thr Lys Phe Ser Ser Ser 115 120 125 Pro Arg Leu Phe Ser Thr His Val
Pro Ser His Thr Leu Gln Glu Gly 130 135 140 Leu Lys Gly Ser Thr Cys
Lys Ile Val Tyr Ile Ser Arg Asn Val Lys 145 150 155 160 Asp Thr Leu
Val Ser Tyr Trp His Phe Phe Thr Lys Lys Gln Thr Asp 165 170 175 Glu
Lys Ile Ile Ser Ser Phe Glu Asp Thr Phe Glu Met Phe Cys Arg 180 185
190 Gly Val Ser Ile Phe Gly Pro Phe Trp Asp His Val Leu Ser Tyr Trp
195 200 205 Arg Gly Ser Leu Glu Asp Pro Asn His Val Leu Phe Met Lys
Phe Glu 210 215 220 Glu Met Lys Ala Glu Pro Arg Asp Gln Ile Lys Lys
Phe Ala Glu Phe 225 230 235 240 Leu Gly Cys Pro Phe Thr Lys Glu Glu
Glu Glu Ser Gly Ser Val Asp 245 250 255 Glu Ile Ile Asp Leu Cys Ser
Leu Arg Asn Leu Ser Ser Leu Glu Ile 260 265 270 Asn Lys Thr Gly Lys
Leu Asn Ser Gly Arg Glu Asn Lys Met Phe Phe 275 280 285 Arg Lys Gly
Glu Val Gly Asp Trp Lys Asn Tyr Leu Thr Pro Glu Met 290 295 300 Glu
Asn Lys Ile Asp Met Ile Ile Gln Glu Lys Leu Gln Asn Ser Gly 305 310
315 320 Leu Lys Phe
SEQ ID NO: 14 (506 amino acids, PRT, Flavobacterium johnsoniae)
Met Asn Thr
Ile Asn Glu Tyr Leu Ser Leu Glu Glu Phe Glu Ala Ile 1 5 10 15 Ile
Phe Gly Asn Gln Lys Val Thr Ile Ser Asp Val Val Val Asn Arg 20 25
30 Val Asn Glu Ser Phe Asn Phe Leu Lys Glu Phe Ser Gly Asn Lys Val
35 40 45 Ile Tyr Gly Val Asn Thr Gly Phe Gly Pro Met Ala Gln Tyr
Arg Ile 50 55 60 Lys Glu Ser Asp Gln Ile Gln Leu Gln Tyr Asn Leu
Ile Arg Ser His 65 70 75 80 Ser Ser Gly Thr Gly Lys Pro Leu Ser Pro
Val Cys Ala Lys Ala Ala 85 90 95 Ile Leu Ala Arg Leu Asn Thr Leu
Ser Leu Gly Asn Ser Gly Val His 100 105 110 Pro Ser Val Ile Asn Leu
Met Ser Glu Leu Ile Asn Lys Asp Ile Thr 115 120 125 Pro Leu Ile Phe
Glu His Gly Gly Val Gly Ala Ser Gly Asp Leu Val 130 135 140 Gln Leu
Ser His Leu Ala Leu Val Leu Ile Gly Glu Gly Glu Val Phe 145 150 155
160 Tyr Lys Gly Glu Arg Arg Pro Thr Pro Glu Val Phe Glu Ile Glu Gly
165 170 175 Leu Lys Pro Ile Gln Val Glu Ile Arg Glu Gly Leu Ala Leu
Ile Asn 180 185 190 Gly Thr Ser Val Met Thr Gly Ile Gly Val Val Asn
Val Tyr His Ala 195 200 205 Lys Lys Leu Leu Asp Trp Ser Leu Lys Ser
Ser Cys Ala Ile Asn Glu 210 215 220 Leu Val Gln Ala Tyr Asp Asp His
Phe Ser Ala Glu Leu Asn Gln Thr 225 230 235 240 Lys Arg His Lys Gly
Gln Gln Glu Ile Ala Leu Lys Met Arg Gln Asn 245 250 255 Leu Ser Asp
Ser Thr Leu Ile Arg Lys Arg Glu Asp His Leu Tyr Ser 260 265 270 Gly
Glu Asn Thr Glu Glu Ile Phe Lys Glu Lys Val Gln Glu Tyr Tyr 275 280
285 Ser Leu Arg Cys Val Pro Gln Ile Leu Gly Pro Val Leu Glu Thr Ile
290 295 300 Asn Asn Val Ala Ser Ile Leu Glu Asp Glu Phe Asn Ser Ala
Asn Asp 305 310 315 320 Asn Pro Ile Ile Asp Val Lys Asn Gln His Val
Tyr His Gly Gly Asn 325 330 335 Phe His Gly Asp Tyr Ile Ser Leu Glu
Met Asp Lys Leu Lys Ile Val 340 345 350 Ile Thr Lys Leu Thr Met Leu
Ala Glu Arg Gln Leu Asn Tyr Leu Leu 355 360 365 Asn Ser Lys Ile Asn
Glu Leu Leu Pro Pro Phe Val Asn Leu Gly Thr 370 375 380 Leu Gly Phe
Asn Phe Gly Met Gln Gly Val Gln Phe Thr Ala Thr Ser 385 390 395 400
Thr Thr Ala Glu Ser Gln Met Leu Ser Asn Pro Met Tyr Val His Ser 405
410 415 Ile Pro Asn Asn Asn Asp Asn Gln Asp Ile Val Ser Met Gly Thr
Asn 420 425 430 Ser Ala Val Ile Thr Ser Lys Val Ile Glu Asn Ala Phe
Glu Val Leu 435 440 445 Ala Ile Glu Met Ile Thr Ile Val Gln Ala Ile
Asp Tyr Leu Gly Gln 450 455 460 Lys Asp Lys Ile Ser Ser Val Ser Lys
Lys Trp Tyr Asp Glu Ile Arg 465 470 475 480 Asn Ile Ile Pro Thr Phe
Lys Glu Asp Gln Val Met Tyr Pro Phe Val 485 490 495 Gln Lys Val Lys
Asp His Leu Ile Asn Asn 500 505
<210> 15 <211> 552 <212> PRT <213> Herpetosiphon aurantiacus <400> 15
Met Ser Thr Thr Leu Ile Leu Thr Gly Glu Gly Leu Gly Ile Asp Asp 1
5 10 15 Val Val Arg Val Ala Arg His Gln Asp Arg Val Glu Leu Thr Thr
Asp 20 25 30 Pro Ala Ile Leu Ala Gln Ile Glu Ala Ser Cys Ala Tyr
Ile Asn Gln 35 40 45 Ala Val Lys Glu His Gln Pro Val Tyr Gly Val
Thr Thr Gly Phe Gly 50 55 60 Gly Met Ala Asn Val
Ile Ile Ser Pro Glu Glu Ala Ala Glu Leu Gln 65 70 75 80 Asn Asn Ala
Ile Trp Tyr His Lys Thr Gly Ala Gly Lys Leu Leu Pro 85 90 95 Phe
Thr Asp Val Arg Ala Ala Met Leu Leu Arg Ala Asn Ser His Met 100 105
110 Arg Gly Ala Ser Gly Ile Arg Leu Glu Ile Ile Gln Arg Met Val Thr
115 120 125 Phe Leu Asn Ala Asn Val Thr Pro His Val Arg Glu Phe Gly
Ser Ile 130 135 140 Gly Ala Ser Gly Asp Leu Val Pro Leu Ile Ser Ile
Thr Gly Ala Leu 145 150 155 160 Leu Gly Thr Asp Gln Ala Phe Met Val
Asp Phe Asn Gly Glu Thr Leu 165 170 175 Asp Cys Ile Ser Ala Leu Glu
Arg Leu Gly Leu Pro Arg Leu Arg Leu 180 185 190 Gln Pro Lys Glu Gly
Leu Ala Met Met Asn Gly Thr Ser Val Met Thr 195 200 205 Gly Ile Ala
Ala Asn Cys Val His Asp Ala Arg Ile Leu Leu Ala Leu 210 215 220 Ala
Leu Glu Ala His Ala Leu Met Ile Gln Gly Leu Gln Gly Thr Asn 225 230
235 240 Gln Ser Phe His Pro Phe Ile His Arg His Lys Pro His Thr Gly
Gln 245 250 255 Val Trp Ala Ala Asp His Met Leu Glu Leu Leu Gln Gly
Ser Gln Leu 260 265 270 Ser Arg Asn Glu Leu Asp Gly Ser His Asp Tyr
Arg Asp Gly Asp Leu 275 280 285 Ile Gln Asp Arg Tyr Ser Leu Arg Cys
Leu Pro Gln Phe Leu Gly Pro 290 295 300 Ile Ile Asp Gly Met Ala Phe
Ile Ser His His Leu Arg Val Glu Ile 305 310 315 320 Asn Ser Ala Asn
Asp Asn Pro Leu Ile Asp Thr Ala Ser Ala Ala Ser 325 330 335 Tyr His
Gly Gly Asn Phe Leu Gly Gln Tyr Ile Gly Val Gly Met Asp 340 345 350
Gln Leu Arg Tyr Tyr Met Gly Leu Met Ala Lys His Leu Asp Val Gln 355
360 365 Ile Ala Leu Leu Val Ser Pro Gln Phe Asn Asn Gly Leu Pro Ala
Ser 370 375 380 Leu Val Gly Asn Ile Gln Arg Lys Val Asn Met Gly Leu
Lys Gly Leu 385 390 395 400 Gln Leu Thr Ala Asn Ser Ile Met Pro Ile
Leu Thr Phe Leu Gly Asn 405 410 415 Ser Leu Ala Asp Arg Phe Pro Thr
His Ala Glu Gln Phe Asn Gln Asn 420 425 430 Ile Asn Ser Gln Gly Phe
Gly Ser Ala Asn Leu Ala Arg Gln Thr Ile 435 440 445 Gln Thr Leu Gln
Gln Tyr Ile Ala Ile Thr Leu Met Phe Gly Val Gln 450 455 460 Ala Val
Asp Leu Arg Thr His Lys Leu Ala Gly His Tyr Asn Ala Ala 465 470 475
480 Glu Leu Leu Ser Pro Leu Thr Ala Lys Ile Tyr His Ala Val Arg Ser
485 490 495 Ile Val Lys His Pro Pro Ser Pro Glu Arg Pro Tyr Ile Trp
Asn Asp 500 505 510 Asp Glu Gln Val Leu Glu Ala His Ile Ser Ala Leu
Ala His Asp Ile 515 520 525 Ala Asn Asp Gly Ser Leu Val Ser Ala Val
Glu Gln Thr Leu Ser Gly 530 535 540 Leu Arg Ser Ile Ile Leu Phe Arg
545 550
<210> 16 <211> 552 <212> PRT <213> Herpetosiphon aurantiacus <400> 16
Met Arg His Gln Val Thr
Leu Thr Gly Ala Gly Leu Thr Ile Glu Asp 1 5 10 15 Val Val Arg Val
Ala Arg His His Gln Pro Val Gly Leu Thr Asp Asn 20 25 30 Pro Glu
Ile Leu Gln Arg Ile Glu Asp Ser Cys Ala Tyr Ile Asn Asp 35 40 45
Ala Val Lys Ala Ser Lys Pro Val Tyr Gly Val Thr Thr Gly Phe Gly 50
55 60 Gly Met Ala Asp Val Val Ile Ser Ser Glu Glu Ala Ala Asp Leu
Gln 65 70 75 80 Asn Asn Ala Ile Trp Tyr His Lys Thr Gly Ala Gly Lys
Leu Leu Pro 85 90 95 Leu Ala Asp Val Arg Ala Ala Met Leu Leu Arg
Ala Asn Ser His Met 100 105 110 Arg Gly Val Ser Gly Ile Arg Leu Glu
Ile Ile Gln Arg Met Met Thr 115 120 125 Phe Leu Asn Ala Asn Val Thr
Pro His Val Arg Glu Phe Gly Ser Ile 130 135 140 Gly Ala Ser Gly Asp
Leu Val Pro Leu Ile Ser Ile Thr Gly Ala Leu 145 150 155 160 Leu Gly
Thr Asp Pro Ala Phe Arg Val Asp Phe Asp Gly Glu Asn Ile 165 170 175
Asp Cys Leu Glu Ala Leu Glu Arg Leu Asn Leu Pro Arg Leu Glu Leu 180
185 190 Leu Pro Lys Glu Gly Leu Ala Met Met Asn Gly Thr Ser Val Met
Thr 195 200 205 Gly Ile Ala Ser Asn Val Leu His Asp Ala Arg Ile Leu
Leu Gly Leu 210 215 220 Ala Leu Asn Ile His Gly Leu Met Ile Gln Gly
Leu Gln Gly Thr Asn 225 230 235 240 Gln Ser Phe His Pro Phe Ile His
Gln His Lys Ala His Thr Gly Gln 245 250 255 Val Trp Ala Ala Asp His
Met Leu Gln Ile Leu Glu Gly Ser Ala Leu 260 265 270 Ser Arg Asp Glu
Leu Asp Gly Arg His Glu Tyr Arg Glu Gly Asp Leu 275 280 285 Ile Gln
Asp Arg Tyr Ser Leu Arg Cys Leu Pro Gln Phe Leu Gly Pro 290 295 300
Ile Ile Asp Gly Met Ala Tyr Ile Thr His His Leu Arg Val Glu Ile 305
310 315 320 Asn Ser Ala Asn Asp Asn Pro Leu Ile Asn Thr Glu Ala Gly
Ala Ser 325 330 335 Tyr His Gly Gly Asn Phe Leu Gly Gln Tyr Ile Gly
Val Gly Met Asp 340 345 350 Gln Leu Arg Tyr Tyr Met Gly Leu Met Ala
Lys His Leu Asp Val Gln 355 360 365 Ile Ala Leu Leu Val Ser Pro Gln
Phe Asn Asn Gly Leu Ser Ala Ser 370 375 380 Leu Val Gly Asn Thr Asp
Arg Lys Val Asn Met Gly Leu Lys Gly Leu 385 390 395 400 Gln Ile Ser
Gly Asn Ser Ile Met Pro Ile Leu Gly Phe Leu Gly Asn 405 410 415 Ser
Leu Ala Asp Arg Phe Pro Thr His Ala Glu Gln Phe Asn Gln Asn 420 425
430 Ile Asn Ser Gln Gly Phe Gly Ser Ala Asn Leu Ala Arg Gln Thr Ile
435 440 445 Glu Thr Leu Gln Gln Tyr Ile Ala Ile Ala Leu Ile Phe Gly
Val Gln 450 455 460 Ala Val Asp Leu Arg Thr Phe Lys Arg Thr Gly His
Tyr Asn Ala Val 465 470 475 480 Glu Thr Leu Ser Pro Met Thr Ala Lys
Leu Tyr Ser Ala Met Arg Glu 485 490 495 Val Val Gly Lys Pro Ile Ser
His Glu Arg Pro Tyr Ile Trp Asn Asp 500 505 510 Asn Glu Gln Ala Leu
Glu Gln His Ile Ser Ala Ile Val Ser Asp Ile 515 520 525 Thr Asn Asp
Gly Ile Ile Pro Gln Ala Ile Gln Glu Thr Leu Asp Ser 530 535 540 Leu
Arg Ser Ile Ile Leu Phe Ala 545 550
<210> 17 <211> 523 <212> PRT <213> Rhodobacter sphaeroides <400> 17
Met Leu Ala Met Ser Pro Pro Lys Pro Ala Val Glu Leu Asp Arg His 1
5 10 15 Ile Asp Leu Asp Glu Ala His Ser Val Ala Ser Gly Gly Ala Arg
Ile 20 25 30 Val Leu Ala Pro Pro Ala Arg Asp Arg Cys Arg Ala Ser
Glu Ala Arg 35 40 45 Leu Gly Ala Val Ile Arg Glu Ala Arg His Val
Tyr Gly Leu Thr Thr 50 55 60 Gly Phe Gly Pro Leu Ala Asn Arg Leu
Val Ser Gly Glu Asn Val Arg 65 70 75 80 Thr Leu Gln Ala Asn Leu Val
His His Leu Ala Ser Gly Val Gly Pro 85 90 95 Val Leu Asp Trp Thr
Thr Ala Arg Ala Met Val Leu Ala Arg Leu Val 100 105 110 Ala Ile Ala
Gln Gly Ala Ser Gly Ala Ser Glu Gly Thr Ile Ala Arg 115 120 125 Leu
Ile Asp Leu Leu Asn Ser Glu Leu Ala Pro Ala Val Pro Met Arg 130 135
140 Gly Thr Val Gly Ala Ser Gly Asp Leu Thr Pro Leu Ala His Met Val
145 150 155 160 Leu Cys Leu Gln Gly Arg Gly Asp Phe Leu Asp Arg Asp
Gly Thr Arg 165 170 175 Leu Asp Gly Ala Glu Gly Leu Arg Arg Gly Arg
Leu Gln Pro Leu Asp 180 185 190 Leu Ser His Arg Asp Ala Leu Ala Leu
Val Asn Gly Thr Ser Ala Met 195 200 205 Thr Gly Ile Ala Leu Val Asn
Ala His Ala Cys Arg His Leu Gly Asn 210 215 220 Trp Ala Val Ala Leu
Thr Ala Leu Leu Ala Glu Cys Leu Gly Gly Arg 225 230 235 240 Thr Glu
Ala Trp Ala Ala Ala Leu Ser Asp Leu Arg Pro His Pro Gly 245 250 255
Gln Lys Asp Ala Ala Ala Arg Leu Arg Ala Arg Val Asp Gly Ser Ala 260
265 270 Arg Val Val Arg His Val Ile Ala Glu Arg Arg Leu Gly Ala Ser
Asp 275 280 285 Ile Gly Thr Glu Pro Glu Ala Gly Gln Asp Ala Tyr Ser
Leu Arg Cys 290 295 300 Ala Pro Gln Val Leu Gly Ala Gly Phe Asp Thr
Leu Ala Trp His Asp 305 310 315 320 Arg Val Leu Thr Ile Glu Leu Asn
Ala Val Thr Asp Asn Pro Val Phe 325 330 335 Pro Pro Asp Gly Ser Val
Pro Ala Leu His Gly Gly Asn Phe Met Gly 340 345 350 Gln His Val Ala
Leu Thr Ser Asp Ala Leu Ala Thr Ala Val Thr Val 355 360 365 Leu Ala
Gly Leu Ala Glu Arg Gln Ile Ala Arg Leu Thr Asp Glu Arg 370 375 380
Leu Asn Arg Gly Leu Pro Pro Phe Leu His Arg Gly Pro Ala Gly Leu 385
390 395 400 Asn Ser Gly Phe Met Gly Ala Gln Val Thr Ala Thr Ala Leu
Leu Ala 405 410 415 Glu Met Arg Ala Thr Gly Pro Ala Ser Ile His Ser
Ile Ser Thr Asn 420 425 430 Ala Ala Asn Gln Asp Val Val Ser Leu Gly
Thr Ile Ala Ala Arg Leu 435 440 445 Cys Arg Glu Lys Ile Asp Arg Trp
Ala Glu Ile Leu Ala Ile Leu Ala 450 455 460 Leu Cys Leu Ala Gln Ala
Ala Glu Leu Arg Cys Gly Ser Gly Leu Asp 465 470 475 480 Gly Val Ser
Pro Ala Gly Lys Lys Leu Val Gln Ala Leu Arg Glu Gln 485 490 495 Phe
Pro Pro Leu Glu Thr Asp Arg Pro Leu Gly Gln Glu Ile Ala Ala 500 505
510 Leu Ala Thr His Leu Leu Gln Gln Ser Pro Val 515 520
<210> 18 <211> 510 <212> PRT <213> Saccharothrix espanaensis <400> 18
Met Thr Gln Val Val Glu Arg Gln
Ala Asp Arg Leu Ser Ser Arg Glu 1 5 10 15 Tyr Leu Ala Arg Val Val
Arg Ser Ala Gly Trp Asp Ala Gly Leu Thr 20 25 30 Ser Cys Thr Asp
Glu Glu Ile Val Arg Met Gly Ala Ser Ala Arg Thr 35 40 45 Ile Glu
Glu Tyr Leu Lys Ser Asp Lys Pro Ile Tyr Gly Leu Thr Gln 50 55 60
Gly Phe Gly Pro Leu Val Leu Phe Asp Ala Asp Ser Glu Leu Glu Gln 65
70 75 80 Gly Gly Ser Leu Ile Ser His Leu Gly Thr Gly Gln Gly Ala
Pro Leu 85 90 95 Ala Pro Glu Val Ser Arg Leu Ile Leu Trp Leu Arg
Ile Gln Asn Met 100 105 110 Arg Lys Gly Tyr Ser Ala Val Ser Pro Val
Phe Trp Gln Lys Leu Ala 115 120 125 Asp Leu Trp Asn Lys Gly Phe Thr
Pro Ala Ile Pro Arg His Gly Thr 130 135 140 Val Ser Ala Ser Gly Asp
Leu Gln Pro Leu Ala His Ala Ala Leu Ala 145 150 155 160 Phe Thr Gly
Val Gly Glu Ala Trp Thr Arg Asp Ala Asp Gly Arg Trp 165 170 175 Ser
Thr Val Pro Ala Val Asp Ala Leu Ala Ala Leu Gly Ala Glu Pro 180 185
190 Phe Asp Trp Pro Val Arg Glu Ala Leu Ala Phe Val Asn Gly Thr Gly
195 200 205 Ala Ser Leu Ala Val Ala Val Leu Asn His Arg Ser Ala Leu
Arg Leu 210 215 220 Val Arg Ala Cys Ala Val Leu Ser Ala Arg Leu Ala
Thr Leu Leu Gly 225 230 235 240 Ala Asn Pro Glu His Tyr Asp Val Gly
His Gly Val Ala Arg Gly Gln 245 250 255 Val Gly Gln Leu Thr Ala Ala
Glu Trp Ile Arg Gln Gly Leu Pro Arg 260 265 270 Gly Met Val Arg Asp
Gly Ser Arg Pro Leu Gln Glu Pro Tyr Ser Leu 275 280 285 Arg Cys Ala
Pro Gln Val Leu Gly Ala Val Leu Asp Gln Leu Asp Gly 290 295 300 Ala
Gly Asp Val Leu Ala Arg Glu Val Asp Gly Cys Gln Asp Asn Pro 305 310
315 320 Ile Thr Tyr Glu Gly Glu Leu Leu His Gly Gly Asn Phe His Ala
Met 325 330 335 Pro Val Gly Phe Ala Ser Asp Gln Ile Gly Leu Ala Met
His Met Ala 340 345 350 Ala Tyr Leu Ala Glu Arg Gln Leu Gly Leu Leu
Val Ser Pro Val Thr 355 360 365 Asn Gly Asp Leu Pro Pro Met Leu Thr
Pro Arg Ala Gly Arg Gly Ala 370 375 380 Gly Leu Ala Gly Val Gln Ile
Ser Ala Thr Ser Phe Val Ser Arg Ile 385 390 395 400 Arg Gln Leu Val
Phe Pro Ala Ser Leu Thr Thr Leu Pro Thr Asn Gly 405 410 415 Trp Asn
Gln Asp His Val Pro Met Ala Leu Asn Gly Ala Asn Ser Val 420 425 430
Phe Glu Ala Leu Glu Leu Gly Trp Leu Thr Val Gly Ser Leu Ala Val 435
440 445 Gly Val Ala Gln Leu Ala Ala Met Thr Gly His Ala Ala Glu Gly
Val 450 455 460 Trp Ala Glu Leu Ala Gly Ile Cys Pro Pro Leu Asp Ala
Asp Arg Pro 465 470 475 480 Leu Gly Ala Glu Val Arg Ala Ala Arg Asp
Leu Leu Ser Ala His Ala 485 490 495 Asp Gln Leu Leu Val Asp Glu Ala
Asp Gly Lys Asp Phe Gly 500 505 510
<210> 19 <211> 519 <212> PRT <213> Rheinheimera sp. A13L <400> 19
Met Arg Ser Glu Gln Leu Thr Leu Glu Asp Val Glu Ala Ile Ala Leu 1
5 10 15 Gly Arg Gln Thr Leu Val Val Thr Glu Lys Gln Met His Ala Val
Glu 20 25 30 Asn Ala His Lys Phe Leu Cys Arg Ala Ile Ser Asp Arg
Lys Arg Ile 35 40 45 Tyr Gly Val Thr Thr Gly Tyr Gly Pro Leu Ala
Thr Thr Glu Val Asp 50 55 60 Pro Arg Gln Ser Ala Leu Leu Gln Gln
Asn Leu Val His His Leu Cys 65 70 75 80 Ser Gly Val Gly Asp Pro Leu
Thr His Pro Gln Val Arg Ala Met Met 85 90 95 Val Ala Arg Leu Ile
Ser Leu Leu Ser Gly His Ser Gly Ala Asn Pro 100 105 110 Leu Leu Ile
Lys Arg Met Gln Glu Trp Leu Asp Ala Asp Ile Val Pro 115 120 125 Phe
Ile Pro Cys Arg Gly Thr Val Gly Ala Ser Gly Asp Leu Thr Pro 130 135
140 Leu Ala His Leu Ala Arg Ala Leu Ser Gly Gly Gly Lys Val Ser Ile
145 150 155 160 Lys Gly Gly Leu Trp Ile Glu Ser Arg Asp Ala His Gln
Gln Leu Gly 165 170 175 Trp Gln Pro Leu Val Leu Lys Gly Lys Asp Ala
Ile Ser Leu Val Asn 180 185 190 Gly Thr Ser Ala Thr Val Gly Ile Ala
Ala Leu Asn Ala Thr Ala Ala 195 200 205 Glu Arg Ala Leu Lys Leu Ser
Thr Leu Leu Val Leu Leu Tyr
Ala Glu 210 215 220 Leu Leu Asn Gly His Arg Glu Ala Phe His Pro Ala
Ile Gly Gln Leu 225 230 235 240 Arg Pro His Pro Gly Gln Gln Lys Leu
His Ser Trp Leu Trp Ser Leu 245 250 255 Ser Ala Ser Ser Asp Ala Leu
Val Pro Trp Cys Ala Glu Ser Arg Asn 260 265 270 Leu Asn Leu Met Gly
Glu Asp Ile Gln Gln Asn Gln Pro Leu Leu Gln 275 280 285 Asp Ala Tyr
Thr Leu Arg Cys Ala Pro Gln Ala Leu Gly Ala Ala Leu 290 295 300 Asp
Val Ile Ser Gln His Ala Thr Thr Val Lys Ile Glu Leu Ser Ala 305 310
315 320 Val Thr Asp Asn Pro Leu Leu Phe Ala Glu Asp Glu Leu Ile Leu
His 325 330 335 Gly Gly Asn Phe Phe Gly Gln His Leu Ala Phe Ala Ser
Asp His Leu 340 345 350 Asn Asn Ala Leu Ile Gln Met Ala Leu Tyr Ser
Glu Arg Arg Ile Ala 355 360 365 Arg Ile Thr Asp Pro Leu Arg Asn Lys
Gly Leu Pro Ala Phe Met Gln 370 375 380 Pro Leu Asp Thr Gly Leu His
Ser Gly Phe Met Gly Ala Gln Val Cys 385 390 395 400 Ala Thr Ser Leu
Val Ala Glu Leu Arg Ser Gln Ala Met Pro Ala Ser 405 410 415 Ile Gln
Ser Ile Pro Thr Asn Ala Asp Asn Gln Asp Ile Val Pro Leu 420 425 430
Gly Thr Ile Ala Ala Arg Arg Ala Ser Thr Ser Leu Thr Gln Leu Tyr 435
440 445 Gln Ile Leu Ala Ile Glu Ala Leu Val Leu Val Gln Gly Ala Glu
Leu 450 455 460 Lys Asn Thr His Ser Phe Ser His Ser Ser Gln Val Val
Cys Ala Trp 465 470 475 480 Leu Arg Gln Tyr Ala Leu Pro Leu Lys Glu
Asp Arg Ala Leu Ser Glu 485 490 495 Asp Ile Thr Arg Val Ala Glu Ala
Leu Ile Asp Pro Asp Lys Val Lys 500 505 510 Ser Leu Ile Glu Leu Leu
Ala 515
<210> 20 <211> 713 <212> PRT <213> Rhodotorula mucilaginosa <400> 20
Met Ala Pro Ser Val Asp
Ser Ile Ala Thr Ser Val Ala Asn Ser Leu 1 5 10 15 Ser Asn Gly Leu
His Ala Ala Ala Ala Ala Asn Gly Gly Asp Val His 20 25 30 Lys Lys
Thr Ala Gly Ala Gly Ser Leu Leu Pro Thr Thr Glu Thr Thr 35 40 45
Gln Leu Asp Ile Val Glu Arg Ile Leu Ala Asp Ala Gly Ala Thr Asp 50
55 60 Gln Ile Lys Leu Asp Gly Tyr Thr Leu Thr Leu Gly Asp Val Val
Gly 65 70 75 80 Ala Ala Arg Arg Gly Arg Ser Val Lys Val Ala Asp Ser
Pro His Ile 85 90 95 Arg Glu Lys Ile Asp Ala Ser Val Glu Phe Leu
Arg Thr Gln Leu Asp 100 105 110 Asn Ser Val Tyr Gly Val Thr Thr Gly
Phe Gly Gly Ser Ala Asp Thr 115 120 125 Arg Thr Glu Asp Ala Ile Ser
Leu Gln Lys Ala Leu Leu Glu His Gln 130 135 140 Leu Cys Gly Val Leu
Pro Thr Ser Met Asp Gly Phe Ala Leu Gly Arg 145 150 155 160 Gly Leu
Glu Asn Ser Leu Pro Leu Glu Val Val Arg Gly Ala Met Thr 165 170 175
Ile Arg Val Asn Ser Leu Thr Arg Gly His Ser Ala Val Arg Ile Val 180
185 190 Val Leu Glu Ala Leu Thr Asn Phe Leu Asn His Gly Ile Thr Pro
Ile 195 200 205 Val Pro Leu Arg Gly Thr Ile Ser Ala Ser Gly Asp Leu
Ser Pro Leu 210 215 220 Ser Tyr Ile Ala Ala Ser Ile Thr Gly His Pro
Asp Ser Lys Val His 225 230 235 240 Val Asp Gly Lys Ile Met Ser Ala
Gln Glu Ala Ile Ala Leu Lys Gly 245 250 255 Leu Gln Pro Val Val Leu
Gly Pro Lys Glu Gly Leu Gly Leu Val Asn 260 265 270 Gly Thr Ala Val
Ser Ala Ser Met Ala Thr Leu Ala Leu Thr Asp Ala 275 280 285 His Val
Leu Ser Leu Leu Ala Gln Ala Leu Thr Ala Leu Thr Val Glu 290 295 300
Ala Met Val Gly His Ala Gly Ser Phe His Pro Phe Leu His Asp Val 305
310 315 320 Thr Arg Pro His Pro Thr Gln Ile Glu Val Ala Arg Asn Ile
Arg Thr 325 330 335 Leu Leu Glu Gly Ser Lys Tyr Ala Val His His Glu
Thr Glu Val Lys 340 345 350 Val Lys Asp Asp Glu Gly Ile Leu Arg Gln
Asp Arg Tyr Pro Leu Arg 355 360 365 Cys Ser Pro Gln Trp Leu Gly Pro
Leu Val Ser Asp Met Ile His Ala 370 375 380 His Ala Val Leu Ser Leu
Glu Ala Gly Gln Ser Thr Thr Asp Asn Pro 385 390 395 400 Leu Ile Asp
Leu Glu Asn Lys Met Thr His His Gly Gly Ala Phe Met 405 410 415 Ala
Ser Ser Val Gly Asn Thr Met Glu Lys Thr Arg Leu Ala Val Ala 420 425
430 Leu Met Gly Lys Val Ser Phe Thr Gln Leu Thr Glu Met Leu Asn Ala
435 440 445 Gly Met Asn Arg Ala Leu Pro Ser Cys Leu Ala Ala Glu Asp
Pro Ser 450 455 460 Leu Ser Tyr His Cys Lys Gly Leu Asp Ile Ala Ala
Ala Ala Tyr Thr 465 470 475 480 Ser Glu Leu Gly His Leu Ala Asn Pro
Val Ser Thr His Val Gln Pro 485 490 495 Ala Glu Met Gly Asn Gln Ala
Ile Asn Ser Leu Ala Leu Ile Ser Ala 500 505 510 Arg Arg Thr Ala Glu
Ala Asn Asp Val Leu Ser Leu Leu Leu Ala Thr 515 520 525 His Leu Tyr
Cys Val Leu Gln Ala Val Asp Leu Arg Ala Met Glu Phe 530 535 540 Glu
His Thr Lys Ala Phe Glu Pro Met Val Thr Glu Leu Leu Lys Gln 545 550
555 560 His Phe Gly Ala Leu Ala Thr Ala Glu Val Glu Asp Lys Val Arg
Lys 565 570 575 Ser Ile Tyr Lys Arg Leu Gln Gln Asn Asn Ser Tyr Asp
Leu Glu Gln 580 585 590 Arg Trp His Asp Thr Phe Ser Val Ala Thr Gly
Ala Val Val Glu Ala 595 600 605 Leu Ala Gly Gln Glu Val Ser Leu Ala
Ser Leu Asn Ala Trp Lys Val 610 615 620 Ala Cys Ala Glu Lys Ala Ile
Ala Leu Thr Arg Ser Val Arg Asp Ser 625 630 635 640 Phe Trp Ala Ala
Pro Ser Ser Ser Ser Pro Ala Leu Lys Tyr Leu Ser 645 650 655 Pro Arg
Thr Arg Val Leu Tyr Ser Phe Val Arg Glu Glu Val Gly Val 660 665 670
Lys Ala Arg Arg Gly Asp Val Tyr Leu Gly Lys Gln Glu Val Thr Ile 675
680 685 Gly Thr Asn Val Ser Arg Ile Tyr Glu Ala Ile Lys Ser Gly Cys
Ile 690 695 700 Ala Pro Val Leu Val Lys Met Met Ala 705 710
<210> 21 <211> 689 <212> PRT <213> Trichosporon cutaneum <400> 21
Met Phe Ile Glu Thr Asn Val Ala Lys
Pro Ala Ser Thr Lys Ala Met 1 5 10 15 Asn Ala Gly Ser Ala Lys Ala
Ala Pro Val Glu Pro Phe Ala Thr Tyr 20 25 30 Ala His Ser Gln Ala
Thr Lys Thr Val Ser Ile Asp Gly His Thr Met 35 40 45 Lys Val Gly
Asp Val Val Ala Val Ala Arg His Gly Ala Lys Val Glu 50 55 60 Leu
Ala Ala Ser Val Ala Gly Pro Val Arg Ala Ser Val Asp Phe Lys 65 70
75 80 Glu Ser Lys Lys His Thr Ser Ile Tyr Gly Val Thr Thr Gly Phe
Gly 85 90 95 Gly Ser Ala Asp Thr Arg Thr Ser Asp Thr Glu Ala Leu
Gln Ile Ser 100 105 110 Leu Leu Glu His Gln Leu Cys Gly Phe Leu Pro
Thr Asp Ala Thr Tyr 115 120 125 Glu Gly Met Leu Leu Ala Ala Met Pro
Ile Pro Ile Val Arg Gly Ala 130 135 140 Met Ala Val Arg Val Asn Ser
Cys Val Arg Gly His Ser Gly Val Arg 145 150 155 160 Leu Glu Val Leu
Gln Ser Phe Ala Asp Phe Ile Asn Arg Gly Leu Val 165 170 175 Pro Cys
Val Pro Leu Arg Gly Thr Ile Ser Ala Ser Gly Asp Leu Ser 180 185 190
Pro Leu Ser Tyr Ile Ala Gly Ala Ile Cys Gly His Pro Asp Val Lys 195
200 205 Val Phe Asp Thr Ala Ala Ser Pro Pro Thr Val Leu Thr Ser Pro
Glu 210 215 220 Ala Ile Ala Lys Tyr Gly Leu Lys Thr Val Lys Leu Ala
Ser Lys Glu 225 230 235 240 Gly Leu Gly Leu Val Asn Gly Thr Ala Val
Ser Ala Ala Ala Gly Ala 245 250 255 Leu Ala Leu Tyr Asp Ala Glu Cys
Leu Ala Ile Met Ser Gln Thr Asn 260 265 270 Thr Val Leu Thr Val Glu
Ala Leu Asp Gly His Val Gly Ser Phe Ala 275 280 285 Pro Phe Ile Gln
Glu Ile Arg Pro His Ala Gly Gln Ile Glu Ala Ala 290 295 300 Arg Asn
Ile Arg His Met Leu Gly Gly Ser Lys Leu Ala Val His Glu 305 310 315
320 Glu Ser Glu Leu Leu Ala Asp Gln Asp Ala Gly Ile Leu Arg Gln Asp
325 330 335 Arg Tyr Ala Leu Arg Thr Ser Ala Gln Trp Ile Gly Pro Gln
Leu Glu 340 345 350 Ala Leu Gly Leu Ala Arg Gln Gln Ile Glu Thr Glu
Leu Asn Ser Thr 355 360 365 Thr Asp Asn Pro Leu Ile Asp Val Glu Gly
Gly Met Phe His His Gly 370 375 380 Gly Asn Phe Gln Ala Met Ala Val
Thr Ser Ala Met Asp Ser Ala Arg 385 390 395 400 Ile Val Leu Gln Asn
Leu Gly Lys Leu Ser Phe Ala Gln Val Thr Glu 405 410 415 Leu Ile Asn
Cys Glu Met Asn His Gly Leu Pro Ser Asn Leu Ala Gly 420 425 430 Ser
Glu Pro Ser Thr Asn Tyr His Cys Lys Gly Leu Asp Ile His Cys 435 440
445 Gly Ala Tyr Cys Ala Glu Leu Gly Phe Leu Ala Asn Pro Met Ser Asn
450 455 460 His Val Gln Ser Thr Glu Met His Asn Gln Ser Val Asn Ser
Met Ala 465 470 475 480 Phe Ala Ser Ala Arg Arg Thr Met Glu Ala Asn
Glu Val Leu Ser Leu 485 490 495 Leu Leu Gly Ser Gln Met Tyr Cys Ala
Thr Gln Ala Leu Asp Leu Arg 500 505 510 Val Met Glu Val Lys Phe Lys
Met Ala Ile Val Lys Leu Leu Asn Glu 515 520 525 Thr Leu Thr Lys His
Phe Ala Ala Phe Leu Thr Pro Glu Gln Leu Ala 530 535 540 Lys Leu Asn
Thr His Ala Ala Ile Thr Leu Tyr Lys Arg Leu Asn Gln 545 550 555 560
Thr Pro Ser Trp Asp Ser Ala Pro Arg Phe Glu Asp Ala Ala Lys His 565
570 575 Leu Val Gly Val Ile Met Asp Ala Leu Met Val Asn Asp Asp Ile
Thr 580 585 590 Asp Leu Thr Asn Leu Pro Lys Trp Lys Lys Glu Phe Ala
Lys Glu Ala 595 600 605 Gly Asn Leu Tyr Arg Ser Ile Leu Val Ala Thr
Thr Ala Asp Gly Arg 610 615 620 Asn Asp Leu Glu Pro Ala Glu Tyr Leu
Gly Gln Thr Arg Ala Val Tyr 625 630 635 640 Glu Ala Val Arg Ser Glu
Leu Gly Val Lys Val Arg Arg Gly Asp Val 645 650 655 Ala Glu Gly Lys
Ser Gly Lys Ser Ile Gly Ser Ser Val Ala Lys Ile 660 665 670 Val Glu
Ala Met Arg Asp Gly Arg Leu Met Gly Ala Val Gly Lys Met 675 680 685
Phe
<210> 22 <211> 716 <212> PRT <213> Rhodosporidium toruloides <400> 22
Met Ala Pro Ser Leu Asp Ser
Ile Ser His Ser Phe Ala Asn Gly Val 1 5 10 15 Ala Ser Ala Lys Gln
Ala Val Asn Gly Ala Ser Thr Asn Leu Ala Val 20 25 30 Ala Gly Ser
His Leu Pro Thr Thr Gln Val Thr Gln Val Asp Ile Val 35 40 45 Glu
Lys Met Leu Ala Ala Pro Thr Asp Ser Thr Leu Glu Leu Asp Gly 50 55
60 Tyr Ser Leu Asn Leu Gly Asp Val Val Ser Ala Ala Arg Lys Gly Arg
65 70 75 80 Pro Val Arg Val Lys Asp Ser Asp Glu Ile Arg Ser Lys Ile
Asp Lys 85 90 95 Ser Val Glu Phe Leu Arg Ser Gln Leu Ser Met Ser
Val Tyr Gly Val 100 105 110 Thr Thr Gly Phe Gly Gly Ser Ala Asp Thr
Arg Thr Glu Asp Ala Ile 115 120 125 Ser Leu Gln Lys Ala Leu Leu Glu
His Gln Leu Cys Gly Val Leu Pro 130 135 140 Ser Ser Phe Asp Ser Phe
Arg Leu Gly Arg Gly Leu Glu Asn Ser Leu 145 150 155 160 Pro Leu Glu
Val Val Arg Gly Ala Met Thr Ile Arg Val Asn Ser Leu 165 170 175 Thr
Arg Gly His Ser Ala Val Arg Leu Val Val Leu Glu Ala Leu Thr 180 185
190 Asn Phe Leu Asn His Gly Ile Thr Pro Ile Val Pro Leu Arg Gly Thr
195 200 205 Ile Ser Ala Ser Gly Asp Leu Ser Pro Leu Ser Tyr Ile Ala
Ala Ala 210 215 220 Ile Ser Gly His Pro Asp Ser Lys Val His Val Val
His Glu Gly Lys 225 230 235 240 Glu Lys Ile Leu Tyr Ala Arg Glu Ala
Met Ala Leu Phe Asn Leu Glu 245 250 255 Pro Val Val Leu Gly Pro Lys
Glu Gly Leu Gly Leu Val Asn Gly Thr 260 265 270 Ala Val Ser Ala Ser
Met Ala Thr Leu Ala Leu His Asp Ala His Met 275 280 285 Leu Ser Leu
Leu Ser Gln Ser Leu Thr Ala Met Thr Val Glu Ala Met 290 295 300 Val
Gly His Ala Gly Ser Phe His Pro Phe Leu His Asp Val Thr Arg 305 310
315 320 Pro His Pro Thr Gln Ile Glu Val Ala Gly Asn Ile Arg Lys Leu
Leu 325 330 335 Glu Gly Ser Arg Phe Ala Val His His Glu Glu Glu Val
Lys Val Lys 340 345 350 Asp Asp Glu Gly Ile Leu Arg Gln Asp Arg Tyr
Pro Leu Arg Thr Ser 355 360 365 Pro Gln Trp Leu Gly Pro Leu Val Ser
Asp Leu Ile His Ala His Ala 370 375 380 Val Leu Thr Ile Glu Ala Gly
Gln Ser Thr Thr Asp Asn Pro Leu Ile 385 390 395 400 Asp Val Glu Asn
Lys Thr Ser His His Gly Gly Asn Phe Gln Ala Ala 405 410 415 Ala Val
Ala Asn Thr Met Glu Lys Thr Arg Leu Gly Leu Ala Gln Ile 420 425 430
Gly Lys Leu Asn Phe Thr Gln Leu Thr Glu Met Leu Asn Ala Gly Met 435
440 445 Asn Arg Gly Leu Pro Ser Cys Leu Ala Ala Glu Asp Pro Ser Leu
Ser 450 455 460 Tyr His Cys Lys Gly Leu Asp Ile Ala Ala Ala Ala Tyr
Thr Ser Glu 465 470 475 480 Leu Gly His Leu Ala Asn Pro Val Thr Thr
His Val Gln Pro Ala Glu 485 490 495 Met Ala Asn Gln Ala Val Asn Ser
Leu Ala Leu Ile Ser Ala Arg Arg 500 505 510 Thr Thr Glu Ser Asn Asp
Val Leu Ser Leu Leu Leu Ala Thr His Leu 515 520 525 Tyr Cys Val Leu
Gln Ala Ile Asp Leu Arg Ala Ile Glu Phe Glu Phe 530 535 540 Lys Lys
Gln Phe Gly Pro Ala Ile Val Ser Leu Ile Asp Gln His Phe 545 550 555
560 Gly Ser Ala Met Thr Gly Ser Asn Leu Arg Asp Glu Leu Val Glu Lys
565 570 575 Val Asn Lys Thr Leu Ala Lys Arg Leu Glu Gln Thr Asn Ser
Tyr Asp 580 585
590 Leu Val Pro Arg Trp His Asp Ala Phe Ser Phe Ala Ala Gly Thr Val
595 600 605 Val Glu Val Leu Ser Ser Thr Ser Leu Ser Leu Ala Ala Val
Asn Ala 610 615 620 Trp Lys Val Ala Ala Ala Glu Ser Ala Ile Ser Leu
Thr Arg Gln Val 625 630 635 640 Arg Glu Thr Phe Trp Ser Ala Ala Ser
Thr Ser Ser Pro Ala Leu Ser 645 650 655 Tyr Leu Ser Pro Arg Thr Gln
Ile Leu Tyr Ala Phe Val Arg Glu Glu 660 665 670 Leu Gly Val Lys Ala
Arg Arg Gly Asp Val Phe Leu Gly Lys Gln Glu 675 680 685 Val Thr Ile
Gly Ser Asn Val Ser Lys Ile Tyr Glu Ala Ile Lys Ser 690 695 700 Gly
Arg Ile Asn Asn Val Leu Leu Lys Met Leu Ala 705 710 715
<210> 23 <211> 737 <212> PRT <213> Phanerochaete chrysosporium <400> 23
Met Pro Ser Arg Ile Asp Tyr
Tyr Thr Ser Ser Gly Asn Gly Tyr Ala 1 5 10 15 Gln Ser Arg Lys Ser
Ser Ala Ile Tyr Pro Ala Ser Ala Ser Thr Gly 20 25 30 His Ala Ala
Pro Ser Thr Glu Arg Lys Pro Glu Leu Leu Asp Lys Phe 35 40 45 Val
Glu Ala Tyr Asp Glu Leu Gln Ser Tyr Arg Glu Gly Lys Pro Val 50 55
60 Ile Val Asp Gly His Asn Leu Ser Ile Pro Ala Val Ala Ala Thr Ala
65 70 75 80 Arg Phe Gly Ala Ala Val Val Leu Asp Glu Asn Pro Glu Thr
His Glu 85 90 95 Arg Val Leu Gln Ser Arg Arg Val Ile Val Asp Lys
Val Ser Thr Gln 100 105 110 Arg Ser Val Tyr Gly Val Ser Thr Gly Phe
Gly Gly Ser Ala Asp Thr 115 120 125 Arg Thr Ser Asp Pro Leu Gln Leu
Gly His Ala Leu Leu Gln His Gln 130 135 140 His Val Gly Val Leu Pro
Thr Gln Thr Glu Ser Pro Leu Pro Ala Leu 145 150 155 160 Pro Leu Gly
Asp Pro Leu Ala Thr Thr Ser Met Pro Glu Ala Trp Val 165 170 175 Arg
Gly Ala Ile Leu Ile Arg Met Asn Ser Leu Ile Arg Gly His Ser 180 185
190 Gly Val Arg Trp Glu Leu Ile Glu Lys Met Gly Glu Leu Leu Arg Glu
195 200 205 Asn Ile Thr Pro Leu Val Pro Leu Arg Gly Ser Ile Ser Ala
Ser Gly 210 215 220 Asp Leu Ser Pro Leu Ser Tyr Ile Ala Gly Thr Leu
Ile Gly Ser Pro 225 230 235 240 Ala Ile Arg Val Phe Asp Gly Pro Ala
Ser Tyr Gly Ala Arg Arg Ile 245 250 255 Leu Pro Ser Asn Ile Ala Leu
Ala Asn His Gly Val Ala Pro Ile Pro 260 265 270 Leu Ser Ser Lys Glu
His Leu Gly Ile Leu Asn Gly Thr Ala Phe Ser 275 280 285 Ala Ser Val
Gly Ala Leu Ala Leu Asn Glu Ala Val His Leu Ser Leu 290 295 300 Leu
Ala Gln Val Cys Thr Ala Met Gly Thr Glu Ala Met Ile Gly Ala 305 310
315 320 Val Gly Ser Phe Asp Ala Phe Ile His Asp Thr Ala Arg Pro His
Pro 325 330 335 Gly Gln Val Glu Val Ala Arg Asn Val Arg Thr Leu Leu
Glu Asp Ser 340 345 350 Gln Met Ala Val Lys Ala Glu Asp Glu Val His
Ile Ala Glu Asp Glu 355 360 365 Gly Glu Leu Arg Gln Asp Arg Tyr Pro
Leu Arg Thr Ala Ala Gln Phe 370 375 380 Leu Gly Pro Gln Ile Glu Asp
Ile Leu Ser Ala His Glu Thr Val Thr 385 390 395 400 Leu Glu Cys Asn
Ser Thr Thr Asp Asn Pro Leu Ile Asp Gly Glu Thr 405 410 415 Gly Thr
Val His His Gly Gly Asn Phe Gln Ala Met Ala Val Thr Asn 420 425 430
Ala Met Glu Lys Thr Arg Leu Ala Ile His His Ile Gly Lys Leu Leu 435
440 445 Phe Ala Gln Ala Thr Glu Leu Ile Asn Pro Met Met Asn Arg Gly
Leu 450 455 460 Pro Pro Asn Leu Ala Ala Thr Asp Pro Ser His Asn Tyr
Phe Ala Lys 465 470 475 480 Gly Val Asp Ile His Leu Ala Ala Tyr Val
Gly Glu Leu Gly Phe Leu 485 490 495 Ala Ser Pro Val Ser Ser His Ile
Gln Ser Ala Glu Met His Asn Gln 500 505 510 Ala Val Asn Ser Leu Ala
Leu Val Ser Ala Arg Tyr Thr Ile Ser Ala 515 520 525 Leu Asp Val Leu
Ser Leu Leu Thr Ala Ala Tyr Leu Tyr Val Leu Cys 530 535 540 Gln Ala
Leu Asp Leu Arg Ala Met His Asn Asp Leu Gln Ser Ser Leu 545 550 555
560 Ser Ala Ile Val Arg Glu Leu Leu Pro Lys His Phe Pro Ser Ala Ala
565 570 575 Lys Arg Ala Asp Ala Leu Leu Pro Ile Leu Glu Arg Thr Ile
Phe Arg 580 585 590 Ala Leu Asn Ser Ser Ser Ser Ala Asp Cys Lys Ala
Arg Met Val Ser 595 600 605 Val Ala Ala Ser Thr Thr Thr Pro Leu Val
Asp Phe Leu Ser Ala Asp 610 615 620 Ala Ala Leu Ala Ser Glu Leu Ala
Asn Ile Thr Ala Phe Arg Thr Glu 625 630 635 640 Leu Ala Thr Arg Ala
Ala Asp Ala Leu Thr Thr Leu Arg Thr Gln Tyr 645 650 655 Leu Glu Gly
Ala Arg Gly Ala Ala Pro Ala Ser Lys Tyr Leu Gly Lys 660 665 670 Thr
Arg Pro Val Tyr Glu Phe Val Arg Val Thr Leu Asn Val Pro Met 675 680
685 His Gly Arg Glu Asn Leu His Asn Phe Glu Met Gly Pro Gly Val Glu
690 695 700 Asp Gly Ile Ile Gly Asn Asn Ile Ser Thr Ile Tyr Glu Ala
Ile Arg 705 710 715 720 Asp Gly Lys Met Gln Asn Val Val Met Gln Leu
Val Lys Ser Ile Lys 725 730 735 Ala
<210> 24 <211> 14 <212> PRT <213> Artificial Sequence <220> <223> His-Tag <400> 24
Met Ala His His His His His His Glu Asn Leu Tyr
Phe Gln 1 5 10
<210> 25 <211> 302 <212> PRT <213> Escherichia coli <400> 25
Met Asp Gln Ile Arg Leu
Thr His Leu Arg Gln Leu Glu Ala Glu Ser 1 5 10 15 Ile His Ile Ile
Arg Glu Val Ala Ala Glu Phe Ser Asn Pro Val Met 20 25 30 Leu Tyr
Ser Ile Gly Lys Asp Ser Ser Val Met Leu His Leu Ala Arg 35 40 45
Lys Ala Phe Tyr Pro Gly Thr Leu Pro Phe Pro Leu Leu His Val Asp 50
55 60 Thr Gly Trp Lys Phe Arg Glu Met Tyr Glu Phe Arg Asp Arg Thr
Ala 65 70 75 80 Lys Ala Tyr Gly Cys Glu Leu Leu Val His Lys Asn Pro
Glu Gly Val 85 90 95 Ala Met Gly Ile Asn Pro Phe Val His Gly Ser
Ala Lys His Thr Asp 100 105 110 Ile Met Lys Thr Glu Gly Leu Lys Gln
Ala Leu Asn Lys Tyr Gly Phe 115 120 125 Asp Ala Ala Phe Gly Gly Ala
Arg Arg Asp Glu Glu Lys Ser Arg Ala 130 135 140 Lys Glu Arg Ile Tyr
Ser Phe Arg Asp Arg Phe His Arg Trp Asp Pro 145 150 155 160 Lys Asn
Gln Arg Pro Glu Leu Trp His Asn Tyr Asn Gly Gln Ile Asn 165 170 175
Lys Gly Glu Ser Ile Arg Val Phe Pro Leu Ser Asn Trp Thr Glu Gln 180
185 190 Asp Ile Trp Gln Tyr Ile Trp Leu Glu Asn Ile Asp Ile Val Pro
Leu 195 200 205 Tyr Leu Ala Ala Glu Arg Pro Val Leu Glu Arg Asp Gly
Met Leu Met 210 215 220 Met Ile Asp Asp Asn Arg Ile Asp Leu Gln Pro
Gly Glu Val Ile Lys 225 230 235 240 Lys Arg Met Val Arg Phe Arg Thr
Leu Gly Cys Trp Pro Leu Thr Gly 245 250 255 Ala Val Glu Ser Asn Ala
Gln Thr Leu Pro Glu Ile Ile Glu Glu Met 260 265 270 Leu Val Ser Thr
Thr Ser Glu Arg Gln Gly Arg Val Ile Asp Arg Asp 275 280 285 Gln Ala
Gly Ser Met Glu Leu Lys Lys Arg Gln Gly Tyr Phe 290 295 300
<210> 26 <211> 475 <212> PRT <213> Escherichia coli <400> 26
Met Asn Thr Ala Leu Ala Gln Gln Ile Ala
Asn Glu Gly Gly Val Glu 1 5 10 15 Ala Trp Met Ile Ala Gln Gln His
Lys Ser Leu Leu Arg Phe Leu Thr 20 25 30 Cys Gly Ser Val Asp Asp
Gly Lys Ser Thr Leu Ile Gly Arg Leu Leu 35 40 45 His Asp Thr Arg
Gln Ile Tyr Glu Asp Gln Leu Ser Ser Leu His Asn 50 55 60 Asp Ser
Lys Arg His Gly Thr Gln Gly Glu Lys Leu Asp Leu Ala Leu 65 70 75 80
Leu Val Asp Gly Leu Gln Ala Glu Arg Glu Gln Gly Ile Thr Ile Asp 85
90 95 Val Ala Tyr Arg Tyr Phe Ser Thr Glu Lys Arg Lys Phe Ile Ile
Ala 100 105 110 Asp Thr Pro Gly His Glu Gln Tyr Thr Arg Asn Met Ala
Thr Gly Ala 115 120 125 Ser Thr Cys Glu Leu Ala Ile Leu Leu Ile Asp
Ala Arg Lys Gly Val 130 135 140 Leu Asp Gln Thr Arg Arg His Ser Phe
Ile Ser Thr Leu Leu Gly Ile 145 150 155 160 Lys His Leu Val Val Ala
Ile Asn Lys Met Asp Leu Val Asp Tyr Ser 165 170 175 Glu Glu Thr Phe
Thr Arg Ile Arg Glu Asp Tyr Leu Thr Phe Ala Gly 180 185 190 Gln Leu
Pro Gly Asn Leu Asp Ile Arg Phe Val Pro Leu Ser Ala Leu 195 200 205
Glu Gly Asp Asn Val Ala Ser Gln Ser Glu Ser Met Pro Trp Tyr Ser 210
215 220 Gly Pro Thr Leu Leu Glu Val Leu Glu Thr Val Glu Ile Gln Arg
Val 225 230 235 240 Val Asp Ala Gln Pro Met Arg Phe Pro Val Gln Tyr
Val Asn Arg Pro 245 250 255 Asn Leu Asp Phe Arg Gly Tyr Ala Gly Thr
Leu Ala Ser Gly Arg Val 260 265 270 Glu Val Gly Gln Arg Val Lys Val
Leu Pro Ser Gly Val Glu Ser Asn 275 280 285 Val Ala Arg Ile Val Thr
Phe Asp Gly Asp Arg Glu Glu Ala Phe Ala 290 295 300 Gly Glu Ala Ile
Thr Leu Val Leu Thr Asp Glu Ile Asp Ile Ser Arg 305 310 315 320 Gly
Asp Leu Leu Leu Ala Ala Asp Glu Ala Leu Pro Ala Val Gln Ser 325 330
335 Ala Ser Val Asp Val Val Trp Met Ala Glu Gln Pro Leu Ser Pro Gly
340 345 350 Gln Ser Tyr Asp Ile Lys Ile Ala Gly Lys Lys Thr Arg Ala
Arg Val 355 360 365 Asp Gly Ile Arg Tyr Gln Val Asp Ile Asn Asn Leu
Thr Gln Arg Glu 370 375 380 Val Glu Asn Leu Pro Leu Asn Gly Ile Gly
Leu Val Asp Leu Thr Phe 385 390 395 400 Asp Glu Pro Leu Val Leu Asp
Arg Tyr Gln Gln Asn Pro Val Thr Gly 405 410 415 Gly Leu Ile Phe Ile
Asp Arg Leu Ser Asn Val Thr Val Gly Ala Gly 420 425 430 Met Val His
Glu Pro Val Ser Gln Ala Thr Ala Ala Pro Ser Glu Phe 435 440 445 Ser
Ala Phe Glu Leu Glu Leu Asn Ala Leu Val Arg Arg His Phe Pro 450 455
460 His Trp Gly Ala Arg Asp Leu Leu Gly Asp Lys 465 470 475
<210> 27 <211> 201 <212> PRT <213> Escherichia coli <400> 27
Met Ala Leu His Asp Glu Asn Val Val Trp
His Ser His Pro Val Thr 1 5 10 15 Val Gln Gln Arg Glu Leu His His
Gly His Arg Gly Val Val Leu Trp 20 25 30 Phe Thr Gly Leu Ser Gly
Ser Gly Lys Ser Thr Val Ala Gly Ala Leu 35 40 45 Glu Glu Ala Leu
His Lys Leu Gly Val Ser Thr Tyr Leu Leu Asp Gly 50 55 60 Asp Asn
Val Arg His Gly Leu Cys Ser Asp Leu Gly Phe Ser Asp Ala 65 70 75 80
Asp Arg Lys Glu Asn Ile Arg Arg Val Gly Glu Val Ala Asn Leu Met 85
90 95 Val Glu Ala Gly Leu Val Val Leu Thr Ala Phe Ile Ser Pro His
Arg 100 105 110 Ala Glu Arg Gln Met Val Arg Glu Arg Val Gly Glu Gly
Arg Phe Ile 115 120 125 Glu Val Phe Val Asp Thr Pro Leu Ala Ile Cys
Glu Ala Arg Asp Pro 130 135 140 Lys Gly Leu Tyr Lys Lys Ala Arg Ala
Gly Glu Leu Arg Asn Phe Thr 145 150 155 160 Gly Ile Asp Ser Val Tyr
Glu Ala Pro Glu Ser Ala Glu Ile His Leu 165 170 175 Asn Gly Glu Gln
Leu Val Thr Asn Leu Val Gln Gln Leu Leu Asp Leu 180 185 190 Leu Arg
Gln Asn Asp Ile Ile Arg Ser 195 200 28246PRTEscherichia coli 28Met
Leu Asp Gln Val Cys Gln Leu Ala Arg Asn Ala Gly Asp Ala Ile 1 5 10
15 Met Gln Val Tyr Asp Gly Thr Lys Pro Met Asp Val Val Ser Lys Ala
20 25 30 Asp Asn Ser Pro Val Thr Ala Ala Asp Ile Ala Ala His Thr
Val Ile 35 40 45 Met Asp Gly Leu Arg Thr Leu Thr Pro Asp Val Pro
Val Leu Ser Glu 50 55 60 Glu Asp Pro Pro Gly Trp Glu Val Arg Gln
His Trp Gln Arg Tyr Trp 65 70 75 80 Leu Val Asp Pro Leu Asp Gly Thr
Lys Glu Phe Ile Lys Arg Asn Gly 85 90 95 Glu Phe Thr Val Asn Ile
Ala Leu Ile Asp His Gly Lys Pro Ile Leu 100 105 110 Gly Val Val Tyr
Ala Pro Val Met Asn Val Met Tyr Ser Ala Ala Glu 115 120 125 Gly Lys
Ala Trp Lys Glu Glu Cys Gly Val Arg Lys Gln Ile Gln Val 130 135 140
Arg Asp Ala Arg Pro Pro Leu Val Val Ile Ser Arg Ser His Ala Asp 145
150 155 160 Ala Glu Leu Lys Glu Tyr Leu Gln Gln Leu Gly Glu His Gln
Thr Thr 165 170 175 Ser Ile Gly Ser Ser Leu Lys Phe Cys Leu Val Ala
Glu Gly Gln Ala 180 185 190 Gln Leu Tyr Pro Arg Phe Gly Pro Thr Asn
Ile Trp Asp Thr Ala Ala 195 200 205 Gly His Ala Val Ala Ala Ala Ala
Gly Ala His Val His Asp Trp Gln 210 215 220 Gly Lys Pro Leu Asp Tyr
Thr Pro Arg Glu Ser Phe Leu Asn Pro Gly 225 230 235 240 Phe Arg Val
Ser Ile Tyr 245 2958DNAArtificial SequencePrimer CBJP472
29tagaaataat tttgtttaac tttaagaagg agatatacca tggagttctc ccgtccac
583049DNAArtificial SequencePrimer CBJP473 30taagcattat gcggccgcaa
gcttgtcata gttcacaacg aaacttgaa 493162DNAArtificial SequencePrimer
CBJP470 31tagaaataat tttgtttaac tttaagaagg agatatacca tggaactgat
tcaggatacc 60ag 623247DNAArtificial SequencePrimer CBJP471
32taagcattat gcggccgcaa gcttgttaca gttcgctacg aaagctc
473360DNAArtificial SequencePrimer CBJP499 33tagaaataat tttgtttaac
tttaagaagg agatatacca tggagctgat ccaggacacc 603445DNAArtificial
SequencePrimer CBJP500 34taagcattat gcggccgcaa gcttgtcaca
cctctgagcg gaagc 453557DNAArtificial SequencePrimer CBJP505
35tagaaataat tttgtttaac tttaagaagg agatatacca tggagccggt ccaggac
573645DNAArtificial SequencePrimer CBJP506 36taagcattat gcggccgcaa
gcttgtcaca gctcagagcg gaagc 453757DNAArtificial SequencePrimer
CBJP503 37tagaaataat tttgtttaac tttaagaagg agatatacca tggaggacat
tcccgac 573845DNAArtificial SequencePrimer CBJP504 38taagcattat
gcggccgcaa gcttgtcaca gctgtgtgcg gaagc 453962DNAArtificial
SequencePrimer CBJP501 39tagaaataat tttgtttaac tttaagaagg
agatatacca tggggaatga tgaggtgatc 60ag 624054DNAArtificial
SequencePrimer CBJP502 40taagcattat
gcggccgcaa gcttgttact ctgtctattg caatttatta cagg
544165DNAArtificial SequencePrimer CBJP491 41catcttagta tattagttaa
gtataagaag gagatataca tatggatcaa atacgactta 60ctcac
654246DNAArtificial SequencePrimer CBJP492 42tggccggccg atatccaatt
gatcaggatc tgataatatc gttctg 464324DNAArtificial SequencePrimer
CBJP497 43tcaggatctg ataatatcgt tctg 244459DNAArtificial
SequencePrimer CBJP498 44cagaacgata ttatcagatc ctgataagtt
aacaccgctc acagagacga ggtggagaa 594546DNAArtificial SequencePrimer
CBJP496 45tggccggccg atatccaatt gattagtaaa tagacactct gaaccc
464662DNAArtificial SequencePrimer CBJP487 46catcttagta tattagttaa
gtataagaag gagatataca tatggcaccg agcgttgata 60gc
624747DNAArtificial SequencePrimer CBJP488 47tggccggccg atatccaatt
gattaggcca tcattttaac cagaacc 4748540PRTRhodobacter capsulatus
48Met Leu Asp Ala Thr Ile Gly Arg Lys Arg Met Thr Leu Gln Ser Gln 1
5 10 15 Thr Ala Lys Asp Cys Leu Ala Leu Asp Gly Ala Leu Thr Leu Val
Gln 20 25 30 Cys Glu Ala Ile Ala Thr His Arg Ser Arg Ile Ser Val
Thr Pro Ala 35 40 45 Leu Arg Glu Arg Cys Ala Arg Ala His Ala Arg
Leu Glu His Ala Ile 50 55 60 Ala Glu Gln Arg His Ile Tyr Gly Ile
Thr Thr Gly Phe Gly Pro Leu 65 70 75 80 Ala Asn Arg Leu Ile Gly Ala
Asp Gln Gly Ala Glu Leu Gln Gln Asn 85 90 95 Leu Ile Tyr His Leu
Ala Thr Gly Val Gly Pro Lys Leu Ser Trp Ala 100 105 110 Glu Ala Arg
Ala Leu Met Leu Ala Arg Leu Asn Ser Ile Leu Gln Gly 115 120 125 Ala
Ser Gly Ala Ser Pro Glu Thr Ile Asp Arg Ile Val Ala Val Leu 130 135
140 Asn Ala Gly Phe Ala Pro Glu Val Pro Ala Gln Gly Thr Val Gly Ala
145 150 155 160 Ser Gly Asp Leu Thr Pro Leu Ala His Met Val Leu Ala
Leu Gln Gly 165 170 175 Arg Gly Arg Met Ile Asp Pro Ser Gly Arg Val
Gln Glu Ala Gly Ala 180 185 190 Val Met Asp Arg Leu Cys Gly Gly Pro
Leu Thr Leu Ala Ala Arg Asp 195 200 205 Gly Leu Ala Leu Val Asn Gly
Thr Ser Ala Met Thr Ala Ile Ala Ala 210 215 220 Leu Thr Gly Val Glu
Ala Ala Arg Ala Ile Asp Ala Ala Leu Arg His 225 230 235 240 Ser Ala
Val Leu Met Glu Val Leu Ser Gly His Ala Glu Ala Trp His 245 250 255
Pro Ala Phe Ala Glu Leu Arg Pro His Pro Gly Gln Leu Arg Ala Thr 260
265 270 Glu Arg Leu Ala Gln Ala Leu Asp Gly Ala Gly Arg Val Cys Arg
Thr 275 280 285 Leu Thr Ala Ala Arg Arg Leu Thr Ala Ala Asp Leu Arg
Pro Glu Asp 290 295 300 His Pro Ala Gln Asp Ala Tyr Ser Leu Arg Val
Val Pro Gln Leu Val 305 310 315 320 Gly Ala Val Trp Asp Thr Leu Asp
Trp His Asp Arg Val Val Thr Cys 325 330 335 Glu Leu Asn Ser Val Thr
Asp Asn Pro Ile Phe Pro Glu Gly Cys Ala 340 345 350 Val Pro Ala Leu
His Gly Gly Asn Phe Met Gly Val His Val Ala Leu 355 360 365 Ala Ser
Asp Ala Leu Asn Ala Ala Leu Val Thr Leu Ala Gly Leu Val 370 375 380
Glu Arg Gln Ile Ala Arg Leu Thr Asp Glu Lys Leu Asn Lys Gly Leu 385
390 395 400 Pro Ala Phe Leu His Gly Gly Gln Ala Gly Leu Gln Ser Gly
Phe Met 405 410 415 Gly Ala Gln Val Thr Ala Thr Ala Leu Leu Ala Glu
Met Arg Ala Asn 420 425 430 Ala Thr Pro Val Ser Val Gln Ser Leu Ser
Thr Asn Gly Ala Asn Gln 435 440 445 Asp Val Val Ser Met Gly Thr Ile
Ala Ala Arg Arg Ala Arg Ala Gln 450 455 460 Leu Leu Pro Leu Ser Gln
Ile Gln Ala Ile Leu Ala Leu Ala Leu Ala 465 470 475 480 Gln Ala Met
Asp Leu Leu Asp Asp Pro Glu Gly Gln Ala Gly Trp Ser 485 490 495 Leu
Thr Ala Arg Asp Leu Arg Asp Arg Ile Arg Ala Val Ser Pro Gly 500 505
510 Leu Arg Ala Asp Arg Pro Leu Ala Gly Asp Ile Glu Ala Val Ala Gln
515 520 525 Gly Leu Arg His Pro Ser Ala Ala Asp Pro Pro Ala 530 535
540 491620DNARhodobacter capsulatus 49atgctggatg caaccattgg
tcgtaaacgt atgaccctgc agagccagac cgcaaaagat 60tgtctggcac tggatggtgc
actgaccctg gttcagtgtg aagcaattgc aacccatcgt 120agccgtatta
gcgttacacc ggcactgcgt gaacgttgtg cacgtgccca tgcacgtctg
180gaacatgcaa ttgccgaaca gcgtcatatt tatggtatta ccaccggttt
tggtccgctg 240gcaaatcgtc tgattggtgc agatcagggt gcagaactgc
agcagaatct gatttatcat 300ctggcaaccg gtgtgggtcc gaaactgagc
tgggctgaag cccgtgcact gatgctggca 360cgtctgaata gcatcctgca
gggtgcaagc ggtgcaagtc cggaaaccat tgatcgcatt 420gttgccgttc
tgaatgcagg ttttgcaccg gaagttccgg cacagggcac cgttggtgcc
480agcggtgatc tgactccgct ggcccatatg gttctggccc tgcaaggtcg
tggtcgtatg 540attgatccga gcggtcgtgt tcaagaggca ggcgcagtta
tggatcgtct gtgtggcggt 600ccgctgacac tggcagcacg tgatggtctg
gcgctggtta atggcaccag cgcaatgacc 660gcaattgcag cactgaccgg
tgttgaagcc gcacgtgcaa ttgatgcagc cctgcgtcat 720agcgcagttc
tgatggaagt tctgagcggt catgcagaag catggcatcc ggcatttgcg
780gaactgcgtc cgcatccggg tcagctgcgt gcaaccgaac gtctggcaca
ggccctggat 840ggcgcaggtc gtgtttgtcg taccctgacc gcagcacgtc
gtctgacagc agccgatctg 900cgtccggaag atcatcctgc acaggatgca
tatagcctgc gtgttgttcc gcagctggtt 960ggtgcagttt gggataccct
ggattggcat gatcgtgttg ttacctgtga actgaatagc 1020gttaccgata
atccgatttt tccggaaggt tgtgcagttc ctgccctgca tggtggcaat
1080tttatgggtg ttcatgttgc actggcaagt gatgcactga atgccgcact
ggttaccctg 1140gcaggtctgg ttgaacgtca gattgcccgt ctgaccgatg
aaaaactgaa taaaggtctg 1200cctgcctttc tgcatggcgg tcaggctggt
ctgcagagcg gttttatggg agcacaggtt 1260accgcaaccg cactgctggc
agaaatgcgt gcaaatgcga caccggttag cgttcagagc 1320ctgagcacca
atggtgcgaa tcaggatgtt gttagcatgg gtacaattgc cgcacgtcgt
1380gcgcgtgcac agctgctgcc gctgagccag attcaggcaa tcctggccct
ggctctggcc 1440caggcaatgg atctgctgga tgatccggaa ggccaggcag
gttggagtct gaccgcacgt 1500gatctgcgtg atcgtattcg tgcagttagt
ccgggtctgc gtgcagatcg tcctctggca 1560ggcgatattg aagcagttgc
acagggactg cgtcatccga gcgcagcgga tcctccggca
1620501569DNARhodobacter sphaeroides 50atgctggcaa tgagccctcc
gaaaccggca gttgaactgg atcgtcatat tgatctggat 60gaagcacata gcgttgcaag
cggtggtgca cgtattgttc tggcaccgcc tgcacgtgat 120cgttgtcgtg
caagcgaagc acgtctgggt gcagttattc gtgaagcccg tcatgtttat
180ggtctgacca ccggttttgg tccgctggca aatcgtctgg ttagcggtga
aaatgttcgt 240accctgcagg caaatctggt tcatcatctg gccagcggtg
tgggtccggt tctggattgg 300accaccgcac gtgcaatggt gctggcacgc
ctggttgcaa ttgcccaggg tgcgagcggt 360gcaagtgaag gtacaattgc
acgtctgatt gatctgctga atagcgaact ggcaccggca 420gtgccgatgc
gtggcaccgt tggtgcatca ggtgatctga ctccgctggc ccatatggtt
480ctgtgtctgc agggtcgtgg tgattttctg gatcgtgatg gcacccgtct
ggatggtgcc 540gaaggtctgc gtcgtggtcg tctgcagccg ctggatctga
gccatcgtga tgcactggca 600ctggttaatg gcaccagcgc aatgacaggt
attgcactgg tgaatgcaca tgcctgtcgt 660catctgggta attgggcagt
tgcactgacc gcactgctgg ccgaatgtct gggtggtcgt 720accgaagcat
gggcagcagc actgagcgat ctgcgtccgc atccgggtca gaaagatgca
780gcagcccgtc tgcgtgcacg tgttgatggt agcgcacgtg tggttcgtca
tgttattgca 840gaacgtcgcc tgggtgccag cgatattggc accgaaccgg
aagcaggtca ggatgcatat 900agcctgcgtt gtgcaccgca ggttctgggt
gccggttttg ataccctggc atggcatgat 960cgtgttctga ccattgaact
gaatgcagtt accgataatc cggtttttcc tccggatggt 1020agtgttccgg
cactgcatgg tggcaatttt atgggtcagc atgttgccct gacctcagat
1080gccctggcaa ccgcagtgac cgttctggca ggtctggccg aacgtcagat
tgcccgtctg 1140accgatgaac gtctgaatcg tggtctgcct ccgtttctgc
accgtggtcc ggcaggcctg 1200aatagtggct ttatgggtgc acaggttacc
gcaacagccc tgctggcaga aatgcgtgca 1260accggtccgg caagcattca
tagcattagc accaatgcag caaatcagga tgttgttagc 1320ctgggtacga
ttgccgcacg tctgtgtcgt gaaaaaattg atcgttgggc agaaattctg
1380gccattctgg cactgtgtct ggcacaggca gcagaactgc gttgcggtag
tggcctggat 1440ggcgtttcac cggcaggtaa aaaactggtt caggcactgc
gcgaacagtt tccgcctctg 1500gaaaccgatc gtccgctggg tcaagaaatt
gcagcactgg caacccatct gctgcaacag 1560agtccggtt
1569511518DNAFlavobacterium johnsoniae 51atgaacacca tcaacgaata
tctgagcctg gaagaatttg aagccattat ctttggcaat 60cagaaagtga ccattagtga
tgttgttgtg aatcgcgtta acgagagctt taactttctg 120aaagaattta
gcggcaacaa agtgatctat ggtgtgaata ccggttttgg tccgatggca
180cagtatcgta ttaaagaaag cgatcagatt cagctgcagt ataatctgat
tcgtagccat 240agcagcggca ccggtaaacc gctgagtccg gtttgtgcaa
aagcagcaat tctggcacgt 300ctgaataccc tgagtctggg taatagcggt
gttcatccga gcgttattaa tctgatgagc 360gaactgatca acaaagatat
cacaccgctg atttttgaac atggtggtgt tggtgcaagc 420ggtgatctgg
ttcagctgag ccatctggca ctggttctga ttggtgaagg tgaagttttc
480tataaaggtg aacgtcgtcc gacaccggaa gtttttgaaa ttgaaggtct
gaaaccgatc 540caggtggaaa ttcgcgaagg tctggccctg attaatggca
ccagcgttat gaccggtatt 600ggtgttgtta atgtgtacca tgcaaaaaaa
ctgctggatt ggagcctgaa aagcagctgt 660gcaattaatg aactggttca
ggcatatgat gatcacttta gcgcagaact gaatcagacc 720aaacgtcata
aaggtcagca agaaattgca ctgaaaatgc gtcagaatct gagcgatagc
780accctgattc gcaaacgtga agatcatctg tatagcggtg aaaacaccga
agaaatcttc 840aaagaaaaag tgcaagagta ttatagcctg cgttgtgttc
cgcagattct gggtccggtt 900ctggaaacca ttaacaatgt tgcaagcatt
ctggaagatg aatttaacag cgcaaacgat 960aacccgatca tcgatgttaa
aaaccagcat gtttatcacg gtggcaattt tcatggtgat 1020tatatcagcc
tggaaatgga taaactgaaa atcgtgatta ccaaactgac catgctggca
1080gaacgtcagc tgaattatct gctgaatagc aaaattaacg aactgctgcc
tccgtttgtt 1140aatctgggca ccctgggttt taactttggt atgcagggtg
ttcagtttac cgcaaccagc 1200accaccgcag aaagccagat gctgagcaat
ccgatgtatg ttcatagcat tccgaacaat 1260aatgataacc aggatattgt
tagcatgggc accaatagcg cagttattac cagcaaagtt 1320atcgaaaatg
cctttgaagt tctggccatt gaaatgatta ccattgttca ggcgattgat
1380tatctgggcc agaaagataa aatcagcagc gttagcaaaa aatggtatga
tgaaatccgc 1440aacatcatcc cgacctttaa agaagatcag gtgatgtatc
cgttcgtgca gaaagtaaaa 1500gaccacctga ttaacaat 15185259DNAArtificial
SequencePrimer CBJP483 52catcttagta tattagttaa gtataagaag
gagatataca tatgctggca atgagccct 595341DNAArtificial SequencePrimer
CBJP484 53tggccggccg atatccaatt gattaaaccg gactctgttg c
415465DNAArtificial SequencePrimer CBJP555 54catcttagta tattagttaa
gtataagaag gagatataca tatgaacacc atcaacgaat 60atctg
655555DNAArtificial SequencePrimer CBJP556 55tggccggccg atatccaatt
gattaattgt taatcaggtg gtcttttact ttctg 555661DNAArtificial
SequencePrimer CBJP745 56catcttagta tattagttaa gtataagaag
gagatataca tatgctggat gcaaccattg 60g 615741DNAArtificial
SequencePrimer CBJP746 57tggccggccg atatccaatt gattatgccg
gaggatccgc t 415833DNAArtificial SequencePrimer CBJP633
58agtgcaggua aaacaatgga gttctcccgt cca 335930DNAArtificial
SequencePrimer CBJP634 59cgtgcgautc atagttcaca acgaaacttg
306034DNAArtificial SequencePrimer CBJP635 60atctgtcaua aaacaatgga
attttcacgt ccgc 346132DNAArtificial SequencePrimer CBJP636
61cacgcgautc acagttcaca acgaaatttg aa 326225DNAArtificial
SequencePrimer PTEF1_fw 62cacgcgaugc acacaccata gcttc
256325DNAArtificial SequencePrimer PTEF1_rv 63cgtgcgaugg aagtaccttc
aaaga 256458DNAArtificial SequencePrimer CBJP517 64tagaaataat
tttgtttaac tttaagaagg agatatacca tggccctgga taaaatgg
586549DNAArtificial SequencePrimer CBJP518 65taagcattat gcggccgcaa
gcttgtcaca attccatgcg aaaaactag 496656DNAArtificial SequencePrimer
CBJP533 66tagaaataat tttgtttaac tttaagaagg agatatacca tggaattttc
acgtcc 566747DNAArtificial SequencePrimer CBJP534 67taagcattat
gcggccgcaa gcttgttaca gttcacaacg aaatttg 4768511PRTSaccharomyces
cerevisiae 68Met Pro Ala Pro His Gly Gly Ile Leu Gln Asp Leu Ile
Ala Arg Asp 1 5 10 15 Ala Leu Lys Lys Asn Glu Leu Leu Ser Glu Ala
Gln Ser Ser Asp Ile 20 25 30 Leu Val Trp Asn Leu Thr Pro Arg Gln
Leu Cys Asp Ile Glu Leu Ile 35 40 45 Leu Asn Gly Gly Phe Ser Pro
Leu Thr Gly Phe Leu Asn Glu Asn Asp 50 55 60 Tyr Ser Ser Val Val
Thr Asp Ser Arg Leu Ala Asp Gly Thr Leu Trp 65 70 75 80 Thr Ile Pro
Ile Thr Leu Asp Val Asp Glu Ala Phe Ala Asn Gln Ile 85 90 95 Lys
Pro Asp Thr Arg Ile Ala Leu Phe Gln Asp Asp Glu Ile Pro Ile 100 105
110 Ala Ile Leu Thr Val Gln Asp Val Tyr Lys Pro Asn Lys Thr Ile Glu
115 120 125 Ala Glu Lys Val Phe Arg Gly Asp Pro Glu His Pro Ala Ile
Ser Tyr 130 135 140 Leu Phe Asn Val Ala Gly Asp Tyr Tyr Val Gly Gly
Ser Leu Glu Ala 145 150 155 160 Ile Gln Leu Pro Gln His Tyr Asp Tyr
Pro Gly Leu Arg Lys Thr Pro 165 170 175 Ala Gln Leu Arg Leu Glu Phe
Gln Ser Arg Gln Trp Asp Arg Val Val 180 185 190 Ala Phe Gln Thr Arg
Asn Pro Met His Arg Ala His Arg Glu Leu Thr 195 200 205 Val Arg Ala
Ala Arg Glu Ala Asn Ala Lys Val Leu Ile His Pro Val 210 215 220 Val
Gly Leu Thr Lys Pro Gly Asp Ile Asp His His Thr Arg Val Arg 225 230
235 240 Val Tyr Gln Glu Ile Ile Lys Arg Tyr Pro Asn Gly Ile Ala Phe
Leu 245 250 255 Ser Leu Leu Pro Leu Ala Met Arg Met Ser Gly Asp Arg
Glu Ala Val 260 265 270 Trp His Ala Ile Ile Arg Lys Asn Tyr Gly Ala
Ser His Phe Ile Val 275 280 285 Gly Arg Asp His Ala Gly Pro Gly Lys
Asn Ser Lys Gly Val Asp Phe 290 295 300 Tyr Gly Pro Tyr Asp Ala Gln
Glu Leu Val Glu Ser Tyr Lys His Glu 305 310 315 320 Leu Asp Ile Glu
Val Val Pro Phe Arg Met Val Thr Tyr Leu Pro Asp 325 330 335 Glu Asp
Arg Tyr Ala Pro Ile Asp Gln Ile Asp Thr Thr Lys Thr Arg 340 345 350
Thr Leu Asn Ile Ser Gly Thr Glu Leu Arg Arg Arg Leu Arg Val Gly 355
360 365 Gly Glu Ile Pro Glu Trp Phe Ser Tyr Pro Glu Val Val Lys Ile
Leu 370 375 380 Arg Glu Ser Asn Pro Pro Arg Pro Lys Gln Gly Phe Ser
Ile Val Leu 385 390 395 400 Gly Asn Ser Leu Thr Val Ser Arg Glu Gln
Leu Ser Ile Ala Leu Leu 405 410 415 Ser Thr Phe Leu Gln Phe Gly Gly
Gly Arg Tyr Tyr Lys Ile Phe Glu 420 425 430 His Asn Asn Lys Thr Glu
Leu Leu Ser Leu Ile Gln Asp Phe Ile Gly 435 440 445 Ser Gly Ser Gly
Leu Ile Ile Pro Asn Gln Trp Glu Asp Asp Lys Asp 450 455 460 Ser Val
Val Gly Lys Gln Asn Val Tyr Leu Leu Asp Thr Ser Ser Ser 465 470 475
480 Ala Asp Ile Gln Leu Glu Ser Ala Asp Glu Pro Ile Ser His Ile Val
485 490 495 Gln Lys Val Val Leu Phe Leu Glu Asp Asn Gly Phe Phe Val
Phe 500 505 510 69202PRTSaccharomyces cerevisiae 69Met Ala Thr Asn
Ile Thr Trp His Pro Asn Leu Thr Tyr Asp Glu Arg 1 5 10 15 Lys Ala
Leu Arg Lys Gln Asp Gly Cys Thr Ile Trp Leu Thr Gly Leu 20 25 30
Ser Ala Ser Gly Lys Ser Thr Ile Ala Cys Ala Leu Glu Gln Leu Leu 35
40 45 Leu Gln Lys Asn Leu Ser Ala Tyr Arg Leu Asp Gly Asp Asn Ile
Arg 50 55 60 Phe Gly Leu Asn Lys Asp Leu Gly Phe Ser Glu Lys Asp
Arg Asn Glu 65 70 75 80 Asn Ile Arg Arg Ile Ser Glu Val Ser Lys Leu
Phe Ala Asp Ser
Cys 85 90 95 Ala Ile Ser Ile Thr Ser Phe Ile Ser Pro Tyr Arg Val
Asp Arg Asp 100 105 110 Arg Ala Arg Glu Leu His Lys Glu Ala Gly Leu
Lys Phe Ile Glu Ile 115 120 125 Phe Val Asp Val Pro Leu Glu Val Ala
Glu Gln Arg Asp Pro Lys Gly 130 135 140 Leu Tyr Lys Lys Ala Arg Glu
Gly Val Ile Lys Glu Phe Thr Gly Ile 145 150 155 160 Ser Ala Pro Tyr
Glu Ala Pro Lys Ala Pro Glu Leu His Leu Arg Thr 165 170 175 Asp Gln
Lys Thr Val Glu Glu Cys Ala Thr Ile Ile Tyr Glu Tyr Leu 180 185 190
Ile Ser Glu Lys Ile Ile Arg Lys His Leu 195 200
70357PRTSaccharomyces cerevisiae 70Met Ala Leu Glu Arg Glu Leu Leu
Val Ala Thr Gln Ala Val Arg Lys 1 5 10 15 Ala Ser Leu Leu Thr Lys
Arg Ile Gln Ser Glu Val Ile Ser His Lys 20 25 30 Asp Ser Thr Thr
Ile Thr Lys Asn Asp Asn Ser Pro Val Thr Thr Gly 35 40 45 Asp Tyr
Ala Ala Gln Thr Ile Ile Ile Asn Ala Ile Lys Ser Asn Phe 50 55 60
Pro Asp Asp Lys Val Val Gly Glu Glu Ser Ser Ser Gly Leu Ser Asp 65
70 75 80 Ala Phe Val Ser Gly Ile Leu Asn Glu Ile Lys Ala Asn Asp
Glu Val 85 90 95 Tyr Asn Lys Asn Tyr Lys Lys Asp Asp Phe Leu Phe
Thr Asn Asp Gln 100 105 110 Phe Pro Leu Lys Ser Leu Glu Asp Val Arg
Gln Ile Ile Asp Phe Gly 115 120 125 Asn Tyr Glu Gly Gly Arg Lys Gly
Arg Phe Trp Cys Leu Asp Pro Ile 130 135 140 Asp Gly Thr Lys Gly Phe
Leu Arg Gly Glu Gln Phe Ala Val Cys Leu 145 150 155 160 Ala Leu Ile
Val Asp Gly Val Val Gln Leu Gly Cys Ile Gly Cys Pro 165 170 175 Asn
Leu Val Leu Ser Ser Tyr Gly Ala Gln Asp Leu Lys Gly His Glu 180 185
190 Ser Phe Gly Tyr Ile Phe Arg Ala Val Arg Gly Leu Gly Ala Phe Tyr
195 200 205 Ser Pro Ser Ser Asp Ala Glu Ser Trp Thr Lys Ile His Val
Arg His 210 215 220 Leu Lys Asp Thr Lys Asp Met Ile Thr Leu Glu Gly
Val Glu Lys Gly 225 230 235 240 His Ser Ser His Asp Glu Gln Thr Ala
Ile Lys Asn Lys Leu Asn Ile 245 250 255 Ser Lys Ser Leu His Leu Asp
Ser Gln Ala Lys Tyr Cys Leu Leu Ala 260 265 270 Leu Gly Leu Ala Asp
Val Tyr Leu Arg Leu Pro Ile Lys Leu Ser Tyr 275 280 285 Gln Glu Lys
Ile Trp Asp His Ala Ala Gly Asn Val Ile Val His Glu 290 295 300 Ala
Gly Gly Ile His Thr Asp Ala Met Glu Asp Val Pro Leu Asp Phe 305 310
315 320 Gly Asn Gly Arg Thr Leu Ala Thr Lys Gly Val Ile Ala Ser Ser
Gly 325 330 335 Pro Arg Glu Leu His Asp Leu Val Val Ser Thr Ser Cys
Asp Val Ile 340 345 350 Gln Ser Arg Asn Ala 355 71581PRTRattus
norvegicus 71Met Glu Ile Pro Gly Ser Leu Cys Lys Lys Val Lys Leu
Ser Asn Asn 1 5 10 15 Ala Gln Asn Trp Gly Met Gln Arg Ala Thr Asn
Val Thr Tyr Gln Ala 20 25 30 His His Val Ser Arg Asn Lys Arg Gly
Gln Val Val Gly Thr Arg Gly 35 40 45 Gly Phe Arg Gly Cys Thr Val
Trp Leu Thr Gly Leu Ser Gly Ala Gly 50 55 60 Lys Thr Thr Val Ser
Met Ala Leu Glu Glu Tyr Leu Val Cys His Gly 65 70 75 80 Ile Pro Cys
Tyr Thr Leu Asp Gly Asp Asn Ile Arg Gln Gly Leu Asn 85 90 95 Lys
Asn Leu Gly Phe Ser Pro Glu Asp Arg Glu Glu Asn Val Arg Arg 100 105
110 Ile Ala Glu Val Ala Lys Leu Phe Ala Asp Ala Gly Leu Val Cys Ile
115 120 125 Thr Ser Phe Ile Ser Pro Tyr Thr Gln Val Arg Gln Gly Phe
Thr Gly 130 135 140 Ile Asp Ser Glu Tyr Glu Lys Pro Glu Ala Pro Glu
Leu Val Leu Lys 145 150 155 160 Thr Asp Ser Cys Asp Val Asn Asp Cys
Val Gln Gln Val Val Glu Leu 165 170 175 Leu Gln Glu Arg Asp Ile Val
Pro Val Asp Ala Ser Tyr Glu Val Lys 180 185 190 Glu Leu Tyr Val Pro
Glu Asn Lys Leu His Leu Ala Lys Thr Asp Ala 195 200 205 Glu Ala Leu
Pro Ala Leu Lys Ile Asn Lys Val Asp Met Gln Trp Val 210 215 220 Gln
Val Leu Ala Glu Gly Trp Ala Thr Pro Leu Asn Gly Phe Met Arg 225 230
235 240 Glu Arg Glu Tyr Leu Gln Cys Leu His Phe Asp Cys Leu Leu Asp
Gly 245 250 255 Gly Val Ile Asn Leu Ser Val Pro Ile Val Leu Thr Ala
Thr Gln Glu 260 265 270 Asp Lys Glu Arg Leu Asp Gly Cys Thr Ala Phe
Ala Leu Val Tyr Glu 275 280 285 Gly Arg Arg Val Ala Ile Leu Arg Asn
Pro Glu Phe Phe Glu His Arg 290 295 300 Lys Glu Glu Arg Cys Ala Arg
Gln Trp Gly Thr Thr Cys Arg Ser His 305 310 315 320 Pro Tyr Ile Lys
Met Ile Leu Glu Gln Gly Asp Trp Leu Ile Gly Gly 325 330 335 Asp Leu
Gln Val Leu Asp Arg Ile Tyr Trp Asn Asp Gly Leu Asp Gln 340 345 350
Tyr Arg Leu Thr Pro Ala Glu Leu Lys Gln Lys Phe Lys Asp Met Asn 355
360 365 Ala Asp Ala Val Phe Ala Phe Gln Leu Arg Asn Pro Val His Asn
Gly 370 375 380 His Ala Leu Leu Met Gln Asp Thr His Lys Gln Leu Leu
Glu Arg Gly 385 390 395 400 Tyr Arg Arg Pro Val Leu Leu Leu His Pro
Leu Gly Gly Trp Thr Lys 405 410 415 Asp Asp Asp Val Pro Leu Met Trp
Arg Met Lys Gln His Ala Ala Val 420 425 430 Leu Glu Glu Gly Ile Leu
Asn Pro Glu Thr Thr Val Val Ala Ile Phe 435 440 445 Pro Ser Pro Met
Met Tyr Ala Gly Pro Thr Glu Val Gln Trp His Cys 450 455 460 Arg Ala
Arg Met Val Ala Gly Ala Asn Phe Tyr Ile Val Gly Arg Asp 465 470 475
480 Pro Ala Gly Met Pro His Pro Glu Thr Gly Lys Asp Leu Tyr Glu Pro
485 490 495 Thr His Gly Ala Lys Val Leu Thr Met Ala Pro Gly Leu Ile
Thr Leu 500 505 510 Glu Ile Val Pro Phe Arg Val Ala Ala Tyr Asn Lys
Lys Lys Lys Arg 515 520 525 Met Asp Tyr Tyr Asp Ser Asp His His Glu
Asp Phe Glu Phe Ile Ser 530 535 540 Gly Thr Arg Met Arg Lys Leu Ala
Arg Glu Gly Gln Lys Pro Pro Glu 545 550 555 560 Gly Phe Met Ala Pro
Lys Ala Trp Thr Val Leu Val Glu Tyr Tyr Lys 565 570 575 Ser Leu Glu
Lys Ala 580 72614PRTRattus norvegicus 72Met Ser Glu Ile Lys Lys Gln
Lys Thr Asp Gln Gln Lys Ser Thr Asn 1 5 10 15 Val Val Tyr Gln Ala
His His Val Ser Arg Asn Lys Arg Gly Gln Val 20 25 30 Val Gly Thr
Arg Gly Gly Phe Arg Gly Cys Thr Val Trp Leu Thr Gly 35 40 45 Leu
Ser Gly Ala Gly Lys Thr Thr Ile Ser Phe Ala Leu Glu Glu Tyr 50 55
60 Leu Val Ser His Ala Ile Pro Cys Tyr Ser Leu Asp Gly Asp Asn Val
65 70 75 80 Arg His Gly Leu Asn Lys Asn Leu Gly Phe Ser Ala Gly Asp
Arg Glu 85 90 95 Glu Asn Ile Arg Arg Ile Ala Glu Val Ala Lys Leu
Phe Ala Asp Ala 100 105 110 Gly Leu Val Cys Ile Thr Ser Phe Ile Ser
Pro Phe Ala Lys Asp Arg 115 120 125 Glu Asn Ala Arg Lys Ile His Glu
Ser Ala Gly Leu Pro Phe Phe Glu 130 135 140 Ile Phe Val Asp Ala Pro
Leu Asn Ile Cys Glu Ser Arg Asp Val Lys 145 150 155 160 Gly Leu Tyr
Lys Arg Ala Arg Ala Gly Glu Ile Lys Gly Phe Thr Gly 165 170 175 Ile
Asp Ser Asn Tyr Glu Lys Pro Glu Thr Pro Glu Cys Val Leu Lys 180 185
190 Thr Asn Leu Ser Ser Val Ser Asp Cys Val Gln Gln Val Val Glu Leu
195 200 205 Leu Gln Glu Gln Ser Ile Val Pro His Thr Thr Ile Lys Gly
Ile His 210 215 220 Glu Leu Phe Val Pro Glu Asn Lys Ile Asp Gln Ile
Arg Ala Glu Leu 225 230 235 240 Glu Thr Leu Pro Ser Leu Pro Ile Thr
Lys Leu Asp Leu Gln Trp Val 245 250 255 Gln Ile Leu Ser Glu Gly Trp
Ala Thr Pro Leu Lys Gly Phe Met Arg 260 265 270 Glu Lys Glu Tyr Leu
Gln Thr Leu His Phe Asp Thr Leu Leu Asp Asp 275 280 285 Gly Val Ile
Asn Met Ser Ile Pro Ile Val Leu Pro Val Ser Gly Asp 290 295 300 Asp
Lys Ala Arg Leu Glu Gly Cys Ser Lys Phe Ala Leu Met Tyr Glu 305 310
315 320 Gly Arg Arg Val Ala Leu Leu Gln Asp Pro Glu Phe Tyr Glu His
Arg 325 330 335 Lys Glu Glu Arg Cys Ser Arg Val Trp Gly Thr Ala Ser
Ala Lys His 340 345 350 Pro His Ile Lys Met Val Met Glu Gly Gly Asp
Trp Leu Val Gly Gly 355 360 365 Asp Leu Gln Val Leu Glu Arg Ile Arg
Trp Asn Asp Gly Leu Asp Gln 370 375 380 Tyr Arg Leu Thr Pro Leu Glu
Leu Lys Gln Lys Cys Lys Asp Met Asp 385 390 395 400 Ala Asp Ala Val
Phe Ala Phe Gln Leu Arg Asn Pro Val His Asn Gly 405 410 415 His Ala
Leu Leu Met Gln Asp Thr Arg Arg Arg Leu Leu Glu Arg Gly 420 425 430
Tyr Lys His Pro Val Leu Leu Leu His Pro Leu Gly Gly Trp Thr Lys 435
440 445 Asp Asp Asp Val Pro Leu Asp Trp Arg Met Lys Gln His Ala Ala
Val 450 455 460 Leu Glu Glu Gly Ile Leu Asp Pro Lys Ser Thr Ile Val
Ala Ile Phe 465 470 475 480 Pro Ser Pro Met Leu Tyr Ala Gly Pro Thr
Glu Val Gln Trp His Cys 485 490 495 Arg Cys Arg Met Ile Ala Gly Ala
Asn Phe Tyr Ile Val Gly Arg Asp 500 505 510 Pro Ala Gly Met Pro His
Pro Glu Thr Lys Lys Asp Leu Tyr Glu Pro 515 520 525 Thr His Gly Gly
Lys Val Leu Ser Met Ala Pro Gly Leu Thr Ser Val 530 535 540 Glu Ile
Ile Pro Phe Arg Val Ala Ala Tyr Asn Lys Ile Lys Lys Ala 545 550 555
560 Met Asp Phe Tyr Asp Pro Ala Arg His Asp Glu Phe Asp Phe Ile Ser
565 570 575 Gly Thr Arg Met Arg Lys Leu Ala Arg Glu Gly Glu Asp Pro
Pro Asp 580 585 590 Gly Phe Met Ala Pro Lys Ala Trp Lys Val Leu Thr
Asp Tyr Tyr Arg 595 600 605 Ser Leu Glu Lys Ile Asn 610
73389PRTBacillus subtilis 73Met Asn Gly Asn Glu Pro His Gly Gly Val
Leu Ile Asn Arg Cys Asp 1 5 10 15 Pro Ala Cys His Phe Glu Gly Cys
Ala Cys Gln Ala Glu Leu Asp Gln 20 25 30 Leu Ala Leu Ser Asp Leu
Glu Leu Ile Ala Ile Gly Gly Tyr Ser Pro 35 40 45 Leu Thr Gly Phe
Leu Gly Glu Lys Asp Tyr His Ser Val Val Lys Glu 50 55 60 Met Arg
Leu Ala Asn Gly Leu Pro Trp Ser Leu Pro Ile Thr Leu Pro 65 70 75 80
Val Gly Glu Lys Thr Ala Arg Gln Leu Ser Ala Gly Asp His Val Lys 85
90 95 Leu Val Lys Asp Gly Val Thr Tyr Gly Met Ile Thr Val Thr Asp
Ile 100 105 110 Tyr Gln Pro Asp Lys Thr Gln Glu Ala Leu Ser Val Phe
Lys Thr Asn 115 120 125 Asp Pro Ala His Pro Gly Val Lys Lys Leu Leu
Ala Arg Pro Asp Tyr 130 135 140 Tyr Ile Gly Gly Pro Ile Thr Val Ser
Ser Leu Pro Asp Lys Ser Phe 145 150 155 160 Glu Gln Phe Tyr Ala Thr
Pro Ala Glu Thr Arg Ala Ala Phe Gln Lys 165 170 175 Leu Gly Trp Lys
Thr Ile Val Gly Phe Gln Thr Arg Asn Pro Val His 180 185 190 Arg Ala
His Glu Tyr Ile Gln Lys Thr Ala Leu Glu Thr Val Asp Gly 195 200 205
Leu Leu Leu His Pro Leu Val Gly Glu Thr Lys Ser Asp Asp Ile Pro 210
215 220 Ser Asp Ile Arg Met Glu Ser Tyr Gln Ala Leu Leu Asn His Tyr
Tyr 225 230 235 240 Pro Lys Asp Arg Val Met Leu Ser Val Phe Pro Ala
Ala Met Arg Tyr 245 250 255 Ala Gly Pro Arg Glu Ala Ile Phe His Ala
Leu Val Arg Lys Asn Tyr 260 265 270 Gly Cys Thr His Phe Ile Val Gly
Arg Asp His Ala Gly Val Gly Ser 275 280 285 Tyr Tyr Gly Thr Tyr Asp
Ala Gln Asn Ile Phe Gln Ser Phe Thr Glu 290 295 300 Glu Glu Leu Gly
Ile Lys Pro Leu Phe Phe Glu His Ser Phe Tyr Cys 305 310 315 320 Arg
Lys Cys Gly Asn Met Gly Thr Ser Lys Thr Cys Pro His Ser Pro 325 330
335 Arg Asp His Ile His Leu Ser Gly Thr Lys Val Arg Glu Leu Leu Arg
340 345 350 Gln Gly Lys Lys Pro Pro Lys Glu Phe Ser Arg Pro Glu Val
Ala Ala 355 360 365 Val Leu Ile Lys Gly Leu His Gln Gln Pro Val Ala
Ile Lys Gln Asn 370 375 380 Ser Gly Glu Leu Gln 385
74199PRTBacillus subtilis 74Met Thr His Asn Pro Asn Ile Ile Trp His
Pro Ala Ala Ile Ser Lys 1 5 10 15 Ser Asp Arg Gln Ser Leu Asn Gly
His Lys Ser Cys Val Leu Trp Phe 20 25 30 Thr Gly Leu Ser Gly Ser
Gly Lys Ser Val Leu Ala Asn Ala Val Asp 35 40 45 Glu Lys Leu Tyr
Arg Lys Gly Ile Gln Ser Tyr Val Leu Asp Gly Asp 50 55 60 Asn Ile
Arg His Gly Leu Asn Lys Asp Leu Gly Phe Gln Thr Gly Asp 65 70 75 80
Arg Ile Glu Asn Ile Arg Arg Ile Gly Glu Val Ala Lys Leu Phe Val 85
90 95 Asp Ser Gly Gln Met Ile Leu Thr Ala Phe Ile Ser Pro Phe Arg
Glu 100 105 110 Asp Arg Asp Met Val Arg Ala Leu Phe Pro Lys Gly Glu
Phe Phe Glu 115 120 125 Ile Tyr Val Lys Cys Pro Leu His Val Cys Glu
Gln Arg Asp Pro Lys 130 135 140 Gly Leu Tyr Lys Lys Ala Arg Asn Gly
Glu Ile Lys His Phe Thr Gly 145 150 155 160 Ile Asp Ser Pro Tyr Glu
Ala Pro Leu Ser Pro Asp Phe Ile Ile Glu 165 170 175 Ser Asp Gln Thr
Ser Ile Ser Asp Gly Ala Asp Leu Ile Ile Asn Ala 180 185 190 Leu Gln
Asn Arg Gly Ile Ile 195 75313PRTBacillus subtilis 75Met Lys Thr Glu
Leu Ile Arg Thr Ile Ser Leu Tyr Asp Thr Ile Ile 1 5 10 15 Leu His
Arg His Val Arg Pro Asp Pro Asp Ala Tyr Gly Ser Gln Cys 20 25 30
Gly Leu
Thr Glu Ile Leu Arg Glu Thr Tyr Pro Glu Lys Asn Ile Phe 35 40 45
Ala Val Gly Thr Pro Glu Pro Ser Leu Ser Phe Leu Tyr Ser Leu Asp 50
55 60 Glu Val Asp Asn Glu Thr Tyr Glu Gly Ala Leu Val Ile Val Cys
Asp 65 70 75 80 Thr Ala Asn Gln Glu Arg Ile Asp Asp Gln Arg Tyr Pro
Ser Gly Ala 85 90 95 Lys Leu Met Lys Ile Asp His His Pro Asn Glu
Asp Pro Tyr Gly Asp 100 105 110 Leu Leu Trp Val Asp Thr Ser Ala Ser
Ser Val Ser Glu Met Ile Tyr 115 120 125 Glu Leu Tyr Leu Glu Gly Lys
Glu His Gly Trp Lys Leu Asn Thr Lys 130 135 140 Ala Ala Glu Leu Ile
Tyr Ala Gly Ile Val Gly Asp Thr Gly Arg Phe 145 150 155 160 Leu Phe
Pro Asn Thr Thr Glu Lys Thr Leu Lys Tyr Ala Gly Glu Leu 165 170 175
Ile Gln Tyr Pro Phe Ser Ser Ser Glu Leu Phe Asn Gln Leu Tyr Glu 180
185 190 Thr Lys Leu Asn Val Val Lys Leu Asn Gly Phe Ile Phe Gln Asn
Val 195 200 205 Ser Leu Ser Glu Asn Gly Ala Ala Ser Val Phe Ile Lys
Lys Asp Thr 210 215 220 Leu Glu Lys Phe Gly Thr Thr Ala Ser Glu Ala
Ser Gln Leu Val Gly 225 230 235 240 Thr Leu Gly Asn Ile Ser Gly Ile
Arg Ala Trp Val Phe Phe Val Glu 245 250 255 Glu Asp Asp Gln Ile Arg
Val Arg Phe Arg Ser Lys Gly Pro Val Ile 260 265 270 Asn Gly Leu Ala
Arg Lys Tyr Asn Gly Gly Gly His Pro Leu Ala Ser 275 280 285 Gly Ala
Ser Ile Tyr Ser Trp Asp Glu Ala Asp Arg Ile Leu Ala Asp 290 295 300
Leu Glu Thr Leu Cys Lys Glu His Glu 305 310
* * * * *