U.S. patent application number 12/481314 was filed with the patent office on 2009-06-09 and published on 2009-11-19 for borrelidin-producing polyketide synthase and its use.
Invention is credited to Alfredo F. Brana, Peter F. Leadlay, Christine J. Martin, Carmen Mendez, Steven Moss, Carlos Olano, Marko Oliynyk, Jose A. Salas, Cesar Sanchez, Barrie Wilkinson.
Application Number | 12/481314 |
Publication Number | 20090286291 |
Family ID | 9950463 |
Filed Date | 2009-06-09 |
Publication Date | 2009-11-19 |
United States Patent Application | 20090286291 |
Kind Code | A1 |
Salas; Jose A. ; et al. | November 19, 2009 |
BORRELIDIN-PRODUCING POLYKETIDE SYNTHASE AND ITS USE
Abstract
The present invention relates to the biosynthesis of polyketides
and derives from the cloning of nucleic acids encoding a polyketide
synthase and other associated proteins involved in the synthesis of
the polyketide borrelidin. Materials and methods including enzyme
systems, nucleic acids, vectors and cells are provided for the
preparation of polyketides including borrelidin and analogues and
derivatives thereof. Novel polyketide molecules are also
provided.
Inventors: |
Salas; Jose A.; (Oviedo,
ES) ; Mendez; Carmen; (Oviedo, ES) ; Olano;
Carlos; (Oviedo, ES) ; Sanchez; Cesar;
(Oviedo, ES) ; Brana; Alfredo F.; (Oviedo, ES)
; Wilkinson; Barrie; (Sharnbrook, GB) ; Martin;
Christine J.; (Cambridge, GB) ; Moss; Steven;
(Cambridge, GB) ; Leadlay; Peter F.; (Cambridge,
GB) ; Oliynyk; Marko; (Cambridge, GB) |
Correspondence
Address: |
DANN, DORFMAN, HERRELL & SKILLMAN
1601 MARKET STREET, SUITE 2400
PHILADELPHIA
PA
19103-2307
US
|
Family ID: |
9950463 |
Appl. No.: |
12/481314 |
Filed: |
June 9, 2009 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10534210 | Mar 17, 2006 | 7560252 |
PCT/GB2003/005704 | Dec 24, 2003 | |
12481314 | | |
Current U.S.
Class: |
435/124 ;
435/252.3; 435/252.35; 435/440 |
Current CPC
Class: |
C07D 313/00 20130101;
A61P 35/00 20180101; C12P 17/08 20130101; A61P 33/06 20180101; A61P
17/06 20180101; A61P 19/02 20180101; A61P 31/04 20180101; C12N
15/52 20130101 |
Class at
Publication: |
435/124 ;
435/252.3; 435/252.35; 435/440 |
International
Class: |
C12P 17/08 20060101
C12P017/08; C12N 1/21 20060101 C12N001/21; C12N 15/67 20060101
C12N015/67 |
Foreign Application Data
Date | Code | Application Number |
Dec 27, 2002 | GB | 0230217.2 |
Claims
1-73. (canceled)
74. A host cell capable of expressing a polyketide synthase for
borrelidin or a borrelidin derivative or analogue, in which a
borrelidin biosynthetic gene involved in production of the
borrelidin starter unit in said cell, has been deleted, disrupted,
or otherwise inactivated wherein said gene is selected from the
list consisting of borC, borD, borE, borF, borG, borH, borK, borL,
borM, and borN.
75. The host cell according to claim 74 wherein the gene is
borG.
76. The host cell according to claim 74 wherein the gene is
borE.
77. The host cell according to claim 74 in which one or more
borrelidin biosynthesis genes or borrelidin polyketide synthase
domains or modules are additionally deleted, modified or
replaced.
78. The host cell according to claim 74 which is an
Actinomycete.
79. The host cell according to claim 74 which is a
Streptomycete.
80. The host cell according to claim 74 wherein the host cell is
selected from the group consisting of Saccharopolyspora erythraea,
Streptomyces coelicolor, Streptomyces avermitilis, Streptomyces
griseofuscus, Streptomyces cinnamonensis, Micromonospora
griseorubida, Streptomyces hygroscopicus, Streptomyces fradiae,
Streptomyces longisporoflavus, Streptomyces lasaliensis,
Streptomyces tsukubaensis, Streptomyces griseus, Streptomyces
venezuelae, Streptomyces antibioticus, Streptomyces lividans,
Streptomyces rimosus, Streptomyces albus, Streptomyces rochei
ATCC23956, Streptomyces parvulus Tu113, and Streptomyces parvulus
Tu4055.
81. A method for modifying a host cell to increase its capacity for
the production of borrelidin, or a borrelidin derivative or
analogue, the host cell being capable of expressing a polyketide
synthase for borrelidin or said derivative or analogue, the method
comprising deleting, disrupting, or otherwise inactivating a
borrelidin biosynthetic gene involved in production of the
borrelidin starter unit in said cell, wherein the gene is selected
from the group consisting of borC, borD, borE, borF, borG, borH,
borK, borL, borM and borN.
82. The method according to claim 81 wherein the gene is borG.
83. The method according to claim 81 wherein the gene is borE.
84. The method of claim 81, wherein the gene is borG and the method
additionally comprises deleting, modifying or replacing one or more
borrelidin biosynthesis genes or borrelidin polyketide synthase
domains or modules.
85. A method for producing borrelidin, or a borrelidin derivative
or analogue, said method comprising fermenting a host cell
according to claim 74 and feeding an exogenous carboxylic acid.
86. The method of claim 85 wherein the gene is borG and wherein the
exogenous carboxylic acid is selected from the group consisting of
trans-cyclobutane-1,2-dicarboxylic acid, 2,3-dimethylsuccinic acid,
2-methylsuccinic acid, and trans-cyclopentane-1,2-dicarboxylic
acid.
87. The method according to claim 85, further comprising the step
of isolating the borrelidin, borrelidin derivative or borrelidin
analogue.
Description
FIELD OF THE INVENTION
[0001] The present invention relates to materials and methods for
the preparation of polyketides. Enzyme systems, nucleic acids,
vectors and cells are provided for the preparation of polyketides,
and in particular the polyketide macrolide borrelidin.
BACKGROUND TO THE INVENTION
[0002] Polyketides are natural products produced by a wide range of
organisms, and particularly by microorganisms. Polyketides have
many important pharmaceutical, veterinary and agricultural uses.
Polyketides encompass a huge range of chemical structural space,
and have a wide range of associated biological activities.
Polyketides with use in medical treatments include antibiotics,
immunosuppressants, antitumor agents, other chemotherapeutic
agents, and other compounds possessing a broad range of therapeutic
and biological properties. The Gram-positive bacteria Streptomyces
and their allied genera are prodigious producers of polyketides,
and the genetics and biochemistry of polyketide biosynthesis in
these organisms are relatively well characterised (Hopwood, 1997).
The genes for polyketide biosynthesis in Streptomyces are clustered
and the exploitation of DNA technology has made it possible to
isolate complete biosynthetic gene clusters by screening gene
libraries with DNA probes encoding the genes responsible for their
biosynthesis. Thus, increasing numbers of gene clusters for
polyketide biosynthesis in Streptomyces and other microorganisms
have been isolated and sequenced, including, for example, those for
the polyether monensin (WO 01/68867), the polyene nystatin (WO
01/59126) and for rapamycin (Schwecke et al., 1995).
[0003] Polyketides are synthesised through the repeated
condensation of building blocks that contain a carboxylic acid
function. At each stage of the process this results in the
formation of a new β-keto function and an α-side chain
branch into the growing chain. The structural diversity of
polyketides derives from a number of aspects of their biosynthetic
pathway including: the wide variety of starter units that may be
utilised in their biosynthesis; the different lengths of polyketide
chains that are possible; the various α-side chains that are
introduced either during or after assembly of the polyketide chain;
the various β-substitutions that may be introduced during or
after assembly of the polyketide chain; the various degrees of
processing that the β-keto groups can undergo (keto, hydroxyl,
enoyl, and methylene); and the various stereochemistries that are
possible at the α- and β-centres.
[0004] The synthesis of polyketides is catalysed by an enzyme, or
by a complex of enzymes, called the polyketide synthase (PKS) in a
manner similar to that of fatty acid biosynthesis. The PKSs of
Streptomyces and related genera fall into three main categories: type-I,
type-II and type-III. The type-III PKSs are small proteins related
to plant chalcone synthases that have been discovered only recently
(Moore & Hopke, 2000). Type-III systems have been implicated in
the biosynthesis of a small number of secondary metabolites but may
be more generally involved in the biosynthesis of soluble pigments
(Cortes et al., 2002). The type-II PKSs consist of several
monofunctional proteins that act as a multi-polypeptide complex.
Simple aromatic polyketides such as actinorhodin are formed by
several rounds of chain assembly, which are performed iteratively
on one set of type-II PKS enzymes that are encoded by one set
of PKS genes (Hopwood, 1997). Type-I PKSs are multifunctional
proteins and are required for the synthesis of more complex
polyketides such as erythromycin and rapamycin. Because type-I PKSs
are the focus of this patent, their organisation and function are
described in detail below.
[0005] Type-I PKSs are organised into modules, whereby each module
consists of several catalytic `domains` that are required to carry
out one round of chain assembly (Staunton & Wilkinson, 1997).
In general a modular PKS contains the correct number of modules
(loading plus extension modules) to select and condense the correct
number of loading and extension units. For example the erythromycin
PKS consists of 7 modules (one loading and six extension modules)
to select and condense the one starter and six extension units
required for the biosynthesis of the erythromycin precursor
6-deoxyerythronolide B. Thus, there exists a one to one
relationship between the number of modules present in the PKS and
the number of units incorporated. This one to one relationship is
described as `co-linearity`.
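The co-linearity rule described above can be illustrated with a minimal sketch. The `Module` class, the module names and the helper `units_incorporated` are hypothetical and serve only to restate the one-module-one-unit bookkeeping for the erythromycin PKS; they do not represent any sequence data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str  # illustrative label only
    kind: str  # "loading" or "extension"


def units_incorporated(modules):
    """Under strict co-linearity each module contributes exactly one unit."""
    return len(modules)


# Illustrative layout of the erythromycin PKS as described in the text:
# one loading module followed by six extension modules.
debs = [Module("load", "loading")] + [
    Module(f"ext{i}", "extension") for i in range(1, 7)
]

starter_units = sum(m.kind == "loading" for m in debs)
extender_units = sum(m.kind == "extension" for m in debs)

print(units_incorporated(debs), starter_units, extender_units)
```

Under this simplified model the seven modules account for the one starter and six extension units of 6-deoxyerythronolide B.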
[0006] The term `extension module` as used herein refers to the set
of contiguous domains, from the β-ketoacyl-acyl carrier
protein synthase (KS) domain to the next acyl carrier protein (ACP)
domain, which accomplishes one cycle of polyketide chain extension.
The term `loading module` as used herein refers to any group of
contiguous domains that accomplishes the loading of the starter
unit onto the PKS and thus renders it available to the KS domain of
the first extension module. Besides condensation of the next
extender carboxylic acid (or ketide) unit onto the growing
polyketide chain, which is performed by the catalytic activity of
the essential KS domain, modules of type-I PKSs may contain domains
with β-ketoreductase (KR), dehydratase (DH), and enoyl
reductase (ER) activities which are responsible for the further
processing of the newly formed β-keto groups during chain
extension. The acyl transferase (AT) and the ACP domains present in
each module are responsible for the choice of extender unit, and
the tethering of the growing chain during its passage on the PKS
respectively. The AT domains of a modular PKS can also be found as
discrete proteins (Cheng et al., 2003). The completed polyketide
chain is generally released from PKSs by the action of a terminal
thioesterase (TE) domain that is also generally involved in the
cyclisation (lactonisation) of the final product. Other chain
terminating/cyclising strategies are also employed such as that for
the addition of an amino acid residue and macrolactam formation as
observed for rapamycin (Schwecke et al., 1995), for macrolactam
formation as for rifamycin (August et al., 1998), and for amino
acid incorporation followed by reductive elimination as for
myxalamid biosynthesis (Silakowski et al., 2001). In summary, there
is a single enzymatic domain present for each successive catalytic
step that occurs during biosynthesis on the PKS, and they are used
in defined sequence that depends upon their location within the
protein and the particular function they perform. This mechanism is
termed `processive`.
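The relationship between the reductive domains of a module and the state of the resulting β-carbon can be sketched as follows. This is a simplified model that assumes the usual order of action KR, then DH, then ER; the function name `beta_carbon_state` and the example domain sets are hypothetical and are not taken from any sequence described here.

```python
def beta_carbon_state(domains):
    """Return the beta-carbon state left by a module's active reductive domains.

    Assumes the reductive domains act in the order KR -> DH -> ER, so a
    later domain has an effect only if the earlier ones are present.
    """
    present = set(domains)
    if "KR" not in present:
        return "keto"       # no reduction: beta-keto group retained
    if "DH" not in present:
        return "hydroxyl"   # keto group reduced to a hydroxyl
    if "ER" not in present:
        return "enoyl"      # hydroxyl eliminated to give a double bond
    return "methylene"      # double bond fully reduced


# Hypothetical examples of the four processing levels.
examples = {
    "no reductive domains": ["KS", "AT", "ACP"],
    "KR only": ["KS", "AT", "KR", "ACP"],
    "KR and DH": ["KS", "AT", "DH", "KR", "ACP"],
    "full reductive loop": ["KS", "AT", "DH", "ER", "KR", "ACP"],
}
for label, domains in examples.items():
    print(label, "->", beta_carbon_state(domains))
```

In this model, removing a domain from the set stands in for inactivating it, which is one way to reason about the outcome of a domain knockout.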
[0007] The modular arrangement of type-I PKSs was first confirmed
by mutation of the erythromycin PKS (also known as
6-deoxyerythronolide B synthase, DEBS) through an in-frame deletion
of a region of the KR domain of module 5 (Donadio et al., 1991).
This led to the production of the erythromycin analogues,
5,6-dideoxy-3-α-mycarosyl-5-oxoerythronolide B and
5,6-dideoxy-5-oxoerythronolide B, due to the inability of the
mutated KR domain to reduce the β-keto group 5 at this stage
of processive biosynthesis. Likewise, alteration of the active site
residues in the ER domain of module 4 of DEBS2, by genetic
engineering of the corresponding PKS-encoding DNA and its
introduction into Saccharopolyspora erythraea, led to the
production of 6,7-anhydroerythromycin C (Donadio et al., 1993). In
addition, the length of the polyketide chain formed by DEBS has
been altered through the specific relocation of the TE domain of
DEBS3 to the end of DEBS1; the expected triketide lactone product
was produced in good yield (Cortes et al., 1995). It should be
noted that the changes described involved modification by deletion
of sequence, or by sequence specific inactivation, or by the
alternative juxtaposition of DNA sequence from within the same PKS
cluster (i.e. they are considered `homologous changes`). Other such
`homologous` changes to the erythromycin PKS are described in WO
93/13663.
[0008] The modular organisation of type-I PKS genes lends itself to
the manipulation of these genes to produce altered polyketide
structures. Type I PKSs represent an assembly line for polyketide
biosynthesis that can be manipulated by changing the number of
modules; by changing their specificities towards different
carboxylic acid starter units and extender units; by inactivating,
mutating, removing, swapping or inserting domains with different
activities and specificities; and by altering the chain or ring
size through the repositioning of termination or cyclisation
domains (Staunton & Wilkinson, 1997).
[0009] WO 98/01546 describes the production of hybrid PKS gene
assemblies comprising the incorporation of heterologous DNA. WO
98/01546 describes methods for generating hybrid PKSs in which the
substitution of genes encoding heterologous modules, sub-modules or
domains for the native genes generates novel polyketides with
altered structures. Specifically, for example the AT domains of
heterologous DNA from the rapamycin or monensin PKSs can be
exchanged for that native to the erythromycin PKS in order to
generate novel polyketides with altered alkyl branching. Such an AT
domain swap represented the first example of the production of a
truly hybrid PKS (Oliynyk et al., 1996). WO 98/01546 also describes
in general terms the production of hybrid PKS assemblies comprising
a loading module and at least one extension module. It specifically
describes the construction of a hybrid PKS gene assembly by
grafting the broad-specificity loading module for the
avermectin-producing PKS onto the first protein of the erythromycin
PKS (DEBS1) in place of the normal loading module (see also Marsden
et al., 1998). Additional examples comprising loading module swaps
that are substrate specific have also been described (WO 00/00618;
U.S. Pat. No. 5,876,991; Kuhstoss et al., 1996). WO 00/01827
describes methods for varying the β-keto processing capability
of a PKS module through the ability to swap `reductive loops`, i.e.
the ability to alter, rapidly and in a combinatorial manner, the
number and type of ketoreductase, dehydratase and enoyl reductase
domains within a module. In addition to changing the level of
β-keto group processing, such changes may also lead to changes
in stereochemistry of the α-alkyl and β-hydroxyl groups
thus formed by the altered modules.
[0010] Although modular PKSs operate `normally` in a co-linear and
processive manner as described above, examples of a deviation from
this mode of operation have been described and are discussed
below.
[0011] The picromycin PKS gene cluster in Streptomyces venezuelae
is responsible for the biosynthesis of both picromycin (a
14-membered, heptaketide macrolide) and methymycin (a 12-membered,
hexaketide macrolide) (Xue et al., 1998). The ability of a single
PKS to produce two related macrolides, of different ring sizes,
derives from the alternative expression of the final PKS gene pikA4
(Xue & Sherman, 2000). When `normal` expression occurs and
full-length PikA4 is formed, a sixth extension unit is incorporated
and the picromycin aglycone is produced; when alternative
expression occurs and an N-terminally truncated form of PikA4 is
produced, no sixth extension unit is incorporated and the growing
polyketide chain is passed directly to the TE domain which leads to
formation of the methymycin aglycone. Thus, a breakdown of
co-linearity occurs and a `ring contracted` product is formed. The
biochemical basis for this phenomenon has been investigated and
shown to be an ACP5 to ACP6 transfer, missing out covalent
attachment to the intervening KS6 domain; such a breakdown of
co-linearity has been called `skipping` (Beck et al., 2002).
[0012] Skipping has also been observed to occur when an extra
extension module from the rapamycin PKS was interpolated into the
erythromycin PKS in order to convert the natural
heptaketide-producing PKS into an octaketide-producing one (Rowe et
al., 2001). The expected octaketide, 16-membered macrolide was
produced, but the major product was the normal heptaketide product
6-deoxyerythronolide B. This `skipping` of the interpolated module is
believed to occur due to the interpolated module acting on some
occasions as a `shuttle`, passing the growing chain from the
preceding module to the following downstream module without
performing a round of chain extension. It was subsequently shown
that the ACP domain of the interpolated module is essential for
accepting the growing polyketide chain from the preceding ACP domain
and passing it to the KS domain of the following module during
skipping (Thomas et al., 2002), a mechanism similar to that
described for methymycin biosynthesis above. It was also shown that
skipping can occur without the active site nucleophile of the KS
domain. A ring-contracted (skipped) nemadectin (an antiparasitic
macrolide) has been reported from a mutant of a Streptomyces soil
isolate that was modified by chemical mutation (Rudd et al., 1990);
the biosynthesis of the natural PKS product was abolished.
[0013] An alternative manner in which modular PKSs deviate from
co-linear operation involves the iterative operation of modules.
For example, module 4 of the erythromycin PKS appears to operate
iteratively, at a low level, to produce a ring expanded
16-membered, octaketide macrolide related to 6-deoxyerythronolide B
(Wilkinson et al., 2000). The ability of the erythromycin PKS to
perform this operation has been termed `stuttering`. The
`stuttering` of the erythromycin PKS is considered an aberrant
process, as the products of this stuttering are formed in low yield
and the major product of the erythromycin PKS is the normal
heptaketide 6-deoxyerythronolide B formed by co-linear operation.
Products that appear to be formed by both stuttering and skipping
have also been reported as minor components from the epothilone
producer Sorangium cellulosum (Hardt et al., 2001). The
stigmatellin biosynthetic cluster of Stigmatella aurantiaca encodes
a PKS that comprises ten (one loading and nine extension)
modules (Gaitatzis et al., 2002); however, based on results from
structural elucidation and the feeding of stable isotope labelled
substrates, stigmatellin is formed from eleven modular derived
units. Thus, it would appear that one of the stigmatellin PKS
modules operates (twice) iteratively.
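The deviations from co-linearity discussed above (`skipping` and `stuttering`) can be expressed as a count of how many times each module acts. The following is a minimal sketch under stated assumptions: the helper `units_in_product` and the per-module lists are hypothetical, and the position chosen for the iterating stigmatellin module is arbitrary because the text does not identify it.

```python
KETIDE_NAMES = {6: "hexaketide", 7: "heptaketide", 8: "octaketide"}


def units_in_product(uses_per_module):
    """Count the units incorporated, given how often each module acts.

    1 = normal co-linear use, 0 = module skipped,
    2 = module used twice (one extra, iterative round of extension).
    """
    return sum(uses_per_module)


# Erythromycin PKS: index 0 is the loading module, indices 1-6 are
# extension modules 1-6.
debs_colinear = [1, 1, 1, 1, 1, 1, 1]
debs_stutter = [1, 1, 1, 1, 2, 1, 1]  # extension module 4 acts twice

# Picromycin PKS: loading module plus six extension modules; skipping
# the sixth extension module corresponds to the methymycin aglycone.
pik_full = [1] * 7
pik_skip = [1] * 6 + [0]

# Stigmatellin PKS: ten modules, one of which acts twice. Index 5 is an
# arbitrary placeholder, not a claim about which module iterates.
stig = [1] * 10
stig[5] = 2

for label, uses in [
    ("erythromycin, co-linear", debs_colinear),
    ("erythromycin, stuttering", debs_stutter),
    ("picromycin", pik_full),
    ("methymycin (skipping)", pik_skip),
    ("stigmatellin", stig),
]:
    n = units_in_product(uses)
    print(label, n, KETIDE_NAMES.get(n, f"{n} units"))
```

The counts reproduce the figures given in the text: seven and eight units for the erythromycin PKS, seven and six for the picromycin PKS, and eleven units from the ten stigmatellin modules.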
[0014] Since the priority filing of the present application, the
sequence of the PKS responsible for biosynthesis of the macrolide
lankacidin by Streptomyces rochei has been described (Mochizuki et
al., 2003). This PKS also appears to contain too few modules in
comparison to the number of extension cycles required for
lankacidin biosynthesis, although the mechanism by which this would
occur is not clear.
[0015] Additional structural diversity can be generated through the
modification of polyketides by enzymes other than the PKS, either
during the process of chain assembly as seen during the
biosynthesis of some ansamycins (Floss, 2001), or after the process
of chain assembly following release from the PKS. Such non-PKS
mediated reactions may include, but are not limited to the
following: reduction, oxidation, hydroxylation, acylation,
alkylation, amination, decarboxylation, dehydration, double bond
isomerisation/migration, cyclisation, ring cleavage, conjugation,
glycosylation, reductive elimination and any combination of these.
When these reactions occur after chain assembly they are termed the
post-PKS or tailoring steps. Such tailoring steps are generally,
but not always, essential for endowing the polyketide natural
product with biological activity.
[0016] In addition, the structural diversity of polyketides
obtainable biosynthetically can be further enhanced through the use
of defined heterologous post-PKS tailoring enzymes as well as
through the use of those which naturally modify the natural
polyketide (Gaisser et al., 2000). WO 01/79520 describes the
heterologous modification of polyketide macrolide structures
through glycosylation, epoxidation, hydroxylation, and methylation.
The ability to generate analogues of the agricultural compound
spinosyn through glycosylation with alternative deoxyhexose
substituents has been reported (Gaisser et al., 2002).
[0017] Borrelidin 1 (FIG. 1) is an 18-membered macrolide produced
by several bacterial strains including, but not limited to,
Streptomyces rochei ATCC23956, Streptomyces parvulus Tu113 and
Streptomyces parvulus Tu4055. Borrelidin is herein shown to be
derived from a trans-cyclopentane-1,2-dicarboxylic acid starter
acid, three malonyl-CoA and five methylmalonyl-CoA extender units
(see FIG. 2). From the absolute stereochemistry of borrelidin,
based on the crystal structure and recently confirmed through total
synthesis, the actual starter acid is predicted to be
trans-cyclopentane-(1R,2R)-dicarboxylic acid. Borrelidin isolated
after the feeding of stable isotope labelled acetate and propionate
substrates clearly indicated the expected incorporation of these
building blocks; in addition, it has been demonstrated in the
present application that feeding of
trans-cyclopentane-1,2-dicarboxylic acid was sufficient to
re-establish borrelidin biosynthesis in mutants where specific
genes believed to be involved in the formation of the starter unit
had been disrupted. Borrelidin contains a nitrile group attached to
the C12 position, which is shown herein to arise through the action
of tailoring enzymes acting upon a methylmalonyl-CoA derived methyl
branch present at this position. The gross structure of borrelidin
was first elucidated in 1967 (Keller-Schierlein, 1967), and was
subsequently refined by detailed NMR analysis (Kuo et al., 1989).
The absolute configuration of borrelidin was confirmed by X-ray
crystallography (Anderson et al., 1989). Its co-identity as the
antibiotic treponemycin has been verified (Maehr & Evans,
1987).
[0018] A number of groups have reported the synthesis of fragments
of the borrelidin structure, and since the priority filing of the
present application, two independent total syntheses of borrelidin
have been reported (Hanessian et al., 2003; Duffey et al.,
2003).
[0019] Borrelidin was first discovered due to its antibacterial
activity (Berger et al., 1949), although this antibacterial
activity extends only to a limited number of micrococci, and is not
found against all common test bacteria. The mode of action in
sensitive microorganisms involves selective inhibition of threonyl
tRNA synthetase (Paetz & Nass, 1973). Other activities against
spirochetes of the genus Treponema (Singh et al., 1985; U.S. Pat.
No. 4,759,928), against viruses (Dickinson et al., 1965), uses for
the control of animal pests and weeds (DE 3607287) and use as an
agricultural fungicide (DE 19835669; U.S. Pat. No. 6,193,964) have
been reported. Additionally, since the priority filing of the
present application, borrelidin has been reported to have
antimalarial activity against drug resistant Plasmodium falciparum
strains (Otoguro et al., 2003). Of all these reports, only
two described any synthetically modified derivatives. The first of
these describes the benzyl ester and its bis-O-(4-nitrobenzoyl)
derivative (Berger et al., 1949). The second of these describes the
borrelidin methyl ester, the methyl ester bis O-acetyl derivative,
and the methyl ester Δ14-15-dihydro-,
Δ14-15,12-13-tetrahydro-, and
Δ14-15,12-13-tetrahydro-C12-amino derivatives (Anderton
& Rickards, 1965). No biological activity was reported for any
of these compounds.
[0020] A recent disclosure of particular interest is the discovery
that borrelidin displays anti-angiogenesis activity (Wakabayashi et
al., 1997). Angiogenesis is the process of the formation of new
blood vessels. Angiogenesis occurs only locally and transiently in
adults, being involved in, for example, repair following local
trauma and the female reproductive cycle. It has been established
as a key component in several pathogenic processes including
cancer, rheumatoid arthritis and diabetic retinopathy. Its
importance in enabling tumours to grow beyond a diameter of 1-2 mm
was established by Folkman (Folkman, 1986), and is provoked by the
tumour responding to hypoxia. In its downstream consequences
angiogenesis is mostly a host-derived process, thus inhibition of
angiogenesis offers significant potential in the treatment of
cancers, avoiding the hurdles of other anticancer therapeutic
modalities such as the diversity of cancer types and drug
resistance (Matter, 2001). It is of additional interest that recent
publications have described the functional involvement of
tyrosinyl- and tryptophanyl tRNA synthetases in the regulation of
angiogenesis (Wakasugi et al., 2002; Otani et al., 2002).
[0021] In the rat aorta matrix culture model of angiogenesis,
borrelidin exhibits a potent angiogenesis-inhibiting effect and
also causes disruption of formed capillary tubes in a dose
dependent manner by inducing apoptosis of the capillary-forming
cells (Wakabayashi et al., 1997). Borrelidin inhibited capillary
tube formation with an IC50 value of 0.4 ng/ml (0.8 nM). In
the same study, borrelidin was shown to possess anti-proliferative
activity towards human umbilical vein endothelial cells (HUVEC) in
a cell growth assay; the IC50 value was measured at 6 ng/ml,
which is 15-fold weaker than the anti-angiogenesis activity
measured in the same medium. This anti-proliferative activity of
borrelidin was shown to be general towards various cell lines. In
addition to these data the authors report that borrelidin inhibits
tRNA synthetase and protein synthesis in the cultured rat cells;
however the IC50 value for anti-angiogenesis activity (0.4
ng/ml) was 50-fold lower than that reported for inhibition of
protein synthesis (20 ng/ml), indicating different activities of
the compound.
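The concentrations and fold differences quoted above can be checked with a short calculation. This is a sketch only: the molecular weight of borrelidin (489.65 g/mol, for C28H43NO6) is an assumption brought in for the conversion and is not stated in the text, and the function names are hypothetical.

```python
BORRELIDIN_MW = 489.65  # g/mol for C28H43NO6; assumed, not stated in the text


def ng_per_ml_to_nM(conc_ng_per_ml, mw=BORRELIDIN_MW):
    """Convert a mass concentration in ng/ml to a molar concentration in nM."""
    # 1 ng/ml equals 1 microgram per litre; dividing by g/mol gives
    # micromol per litre, so multiply by 1000 to obtain nmol per litre.
    return conc_ng_per_ml / mw * 1000.0


def fold_difference(weaker, stronger):
    """Ratio of two IC50 values expressed in the same units."""
    return weaker / stronger


capillary_ic50 = 0.4       # ng/ml, capillary tube formation
huvec_ic50 = 6.0           # ng/ml, HUVEC proliferation
protein_synth_ic50 = 20.0  # ng/ml, protein synthesis

print(round(ng_per_ml_to_nM(capillary_ic50), 1))                     # nM
print(round(fold_difference(huvec_ic50, capillary_ic50)))            # fold
print(round(fold_difference(protein_synth_ic50, capillary_ic50)))    # fold
```

With the assumed molecular weight, 0.4 ng/ml corresponds to roughly 0.8 nM, and the ratios 6/0.4 and 20/0.4 give the 15-fold and 50-fold differences reported.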
[0022] Borrelidin also displays potent inhibition of angiogenesis
in vivo using the mouse dorsal air sac model (Funahashi et al.,
1999), which examines VEGF-induced angiogenesis and is an excellent
model for studying tumour-angiogenesis. Borrelidin was administered
at a dose of 1.8 mg/kg by intraperitoneal injection and shown to
significantly reduce the increment of vascular volume induced by
WiDr cells, and to a higher degree than does TNP-470, which is a
synthetic angiogenesis inhibitor in clinical trials. Detailed
controls verified that these data are for angiogenesis inhibition
and not inhibition of growth of the tumour cells. The authors also
showed that borrelidin is effective for the inhibition of the
formation of spontaneous lung metastases of B16-BL6 melanoma cells
at the same dosage by inhibiting the angiogenic processes involved
in their formation.
[0023] JP 9-227,549 and JP 8-173,167 confirm that borrelidin is
effective against WiDr cell lines of human colon cancer, and also
against PC-3 cell lines of human prostate cancer. JP 9-227,549
describes the production of borrelidin by Streptomyces rochei
Mer-N7167 (Ferm P-14670) and its isolation from the resulting
fermentation culture. In addition to borrelidin 1,
12-desnitrile-12-carboxyl borrelidin 2 (presumably a biosynthetic
intermediate or shunt metabolite), 10-desmethyl borrelidin 3
(presumably a biosynthetic analogue arising from the
mis-incorporation of an alternative malonyl-CoA extender unit in
module 4 of the borrelidin PKS), 11-epiborrelidin 4 and the
C14,C15-cis borrelidin analogue 5 were described (see FIG. 1).
Thus, JP 9-227,549 specifies borrelidin and borrelidin analogues
wherein a nitrile or carboxyl group is attached to the carbon skeleton
at C12, and a hydrogen atom or lower alkyl group is attached to the
carbon skeleton at C10.
[0024] WO 01/09113 discloses the preparation of borrelidin
analogues that have undergone synthetic modification at the
carboxylic acid moiety of the cyclopentane ring. The activity of
these compounds was examined using endothelial cell proliferation
and endothelial capillary formation assays in a similar manner to
that described above. In general, modification of the carboxyl
moiety improved the selectivity for inhibiting capillary formation:
the major reason for this improvement in selectivity is through a
decrease in the cell proliferation inhibition activity whereas the
capillary formation inhibitory activity was altered to a much lower
degree. Specifically, the borrelidin-morpholinoethyl ester showed a
60-fold selectivity index, the borrelidin-amide showed a 37-fold
selectivity index, the borrelidin-(2-pyridyl)-ethyl ester showed a
7.5-fold selectivity index and the borrelidin-morpholinoethyl amide
showed a 6-fold selectivity index, for the capillary formation
inhibitory activity versus cell proliferation with respect to
borrelidin. The capillary formation inhibitory activity of these
and other borrelidin derivatives was verified using a micro-vessel
formation assay. In addition, the authors showed that borrelidin
weakly inhibited the propagation of metastatic nodules, after
removal of the primary tumour, when using a Lewis lung
adenocarcinoma model. However, the borrelidin-(3-picolylamide)
derivative was reported to inhibit very considerably the increase
of micrometastases in rats after intraperitoneal and also after per
os administration at subtoxic doses. Similarly, using the colon 38
spleen liver model, the metastasis-forming ability of mouse colon
adenocarcinoma cells transplanted into mouse spleen was
considerably decreased after treatment with a subtoxic dose of this
borrelidin derivative. These data confirm the earlier reported
ability of borrelidin and its derivatives to inhibit the formation
of metastases.
[0025] Borrelidin has also been identified as an inhibitor of
cyclin-dependent kinase Cdc28/Cln2 of Saccharomyces cerevisiae with
an IC50 value of 12 µg/ml (24 µM) (Tsuchiya et al.,
2001). It was shown that borrelidin arrests both haploid and
diploid cells in late G1 phase (at a time point
indistinguishable from α-mating pheromone), and at
concentrations that do not affect gross protein biosynthesis. These
data were taken to indicate that borrelidin has potential as a lead
compound to develop anti-tumour agents.
[0026] Since the priority filing of the present application, two
further reports have been published concerning the biological
activity of borrelidin. The first of these indicates that the
anti-angiogenic effects of borrelidin are mediated through distinct
pathways (Kawamura et al., 2003). High concentrations of threonine
were found to attenuate the ability of borrelidin to inhibit both
capillary tube formation in the rat aorta culture model and HUVEC
cell proliferation; however, threonine did not affect the ability of
borrelidin to collapse formed capillary tubes or to induce
apoptosis in HUVEC. Borrelidin was also found to activate caspase-3
and caspase-8, and inhibitors of both of these suppressed
borrelidin induced apoptosis in HUVEC. The second of these papers
used the method of global cellular mRNA profiling to provide
insight into the effects of borrelidin on Saccharomyces cerevisiae
(Eastwood and Schaus, 2003). This analysis showed the induction of
amino acid biosynthetic enzymes in a time-dependent fashion upon
treatment with borrelidin, and it was ascertained that the
induction of this pathway involves the GCN4 transcription
factor.
[0027] In summary, the angiogenesis-inhibitory effect of borrelidin
is directed towards the twin tumour-biological effects of
proliferation and capillary formation. In addition, borrelidin, and
derivatives thereof, have been shown to inhibit the propagation of
metastases. Borrelidin also has indications for use in cell cycle
modulation. Thus, borrelidin and related compounds are particularly
attractive targets for investigation as therapeutic agents for the
treatment of tumour tissues, either as single agents or for use as
an adjunct to other therapies. In addition, they may be used for
treating other diseases in which angiogenesis is implicated in the
pathogenic process, including, but not restricted to, the following
list: rheumatoid arthritis, psoriasis, atherosclerosis, diabetic
retinopathy and various ophthalmic disorders.
SUMMARY OF THE INVENTION
[0028] The present invention provides the entire nucleic acid
sequence of the biosynthetic gene cluster responsible for governing
the synthesis of the polyketide macrolide borrelidin in
Streptomyces parvulus Tu4055. Also provided is the use of all or
part of the cloned DNA and the nucleic acid sequences thereof in
the specific detection of other polyketide biosynthetic gene
clusters, in the engineering of mutant strains of Streptomyces
parvulus and other suitable host strains for the production of
enhanced levels of borrelidin, or for the production of modified or
novel polyketides, and of recombinant genes encoding PKS systems
for the biosynthesis of modified or novel polyketides.
[0029] The present invention provides an isolated nucleic acid
molecule comprising all or part of a borrelidin biosynthetic gene
cluster.
[0030] The complete nucleotide sequence of the borrelidin
biosynthetic gene cluster from Streptomyces parvulus Tu4055 is
shown in SEQ ID No.1. Its organisation is presented in FIG. 3 and
comprises genes and open reading frames designated hereinafter as:
borA1, borA2, borA3, borA4, borA5, borA6, borB, borC, borD, borE,
borF, borG, borH, borI, borJ, borK, borL, borM, borN, borO, orfB1,
orfB2, orfB3, orfB4, orfB5, orfB6, orfB7, orfB8, orfB9, orfB10,
orfB11, orfB12, orfB13, orfB14, orfB15, orfB16, orfB17, orfB18,
orfB19, orfB20, orfB21 and orfB22.
[0031] The proposed functions of the cloned genes are illustrated in
FIGS. 4 (proposed biosynthesis of the starter unit), 5 (organisation
of the borrelidin PKS and biosynthesis of pre-borrelidin) and 6
(introduction of the C12-nitrile moiety) and are described
below.
[0032] The present invention thus provides an isolated nucleic acid
molecule comprising:
(a) a nucleotide sequence as shown in SEQ ID No.1, or a portion or
fragment thereof; or (b) a nucleotide sequence which is the
complement of SEQ ID No.1, or a portion or fragment thereof; or (c)
a nucleotide sequence which is degenerate with a coding sequence of
SEQ ID No.1, or a portion or fragment thereof.
[0033] As used herein the term "fragment" with respect to
nucleotide sequences refers to a stretch of nucleic acid residues
that are at least 10, preferably at least 20, at least 30, at least
50, at least 75, at least 100, at least 150 or at least 200
nucleotides in length. A preferred portion or fragment of SEQ ID
NO:1 is the sequence extending between nucleotide positions 7603
and 59966 of SEQ ID No.1.
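The definition of a fragment by minimum length, and the preferred fragment defined by nucleotide positions, can be illustrated as follows. This is a minimal sketch which assumes that the sequence is held as a plain string and that positions are 1-based and inclusive; the function names are hypothetical and a toy sequence stands in for SEQ ID No.1.

```python
# Sketch: extract a 1-based, inclusive fragment and apply the
# minimum-length definition of "fragment" from the paragraph above.
# ASSUMPTION: the sequence is held as a plain Python string; the
# function names are hypothetical.
MIN_FRAGMENT_LENGTH = 10  # "at least 10" nucleotides


def fragment_length(start, end):
    """Length of a fragment given 1-based inclusive positions."""
    if start < 1 or end < start:
        raise ValueError("positions must satisfy 1 <= start <= end")
    return end - start + 1


def extract_fragment(sequence, start, end):
    """Return sequence[start..end] using 1-based inclusive positions."""
    if end > len(sequence):
        raise ValueError("end lies beyond the sequence")
    length = fragment_length(start, end)
    if length < MIN_FRAGMENT_LENGTH:
        raise ValueError("shorter than the minimum fragment length")
    return sequence[start - 1:end]


# Length of the preferred fragment (positions 7603 to 59966).
preferred_length = fragment_length(7603, 59966)
print(preferred_length)

# Toy sequence standing in for SEQ ID No.1, which is not reproduced here.
toy = "ACGT" * 10
print(extract_fragment(toy, 3, 14))
```

On this counting the preferred fragment spans 52364 nucleotides.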
[0034] The sequence may encode or be complementary to a sequence
encoding a polypeptide of a polyketide biosynthetic gene cluster,
or a portion thereof. By "a polypeptide of a polyketide
biosynthetic gene cluster" is meant a polypeptide encoded by one or
more open reading frames of a polyketide biosynthetic gene cluster,
and particularly the borrelidin biosynthetic gene cluster.
[0035] A polyketide biosynthetic gene cluster is a segment of DNA
comprising a plurality of genes encoding polypeptides having
activity in the biosynthesis of a polyketide or macrolide moiety.
This is not restricted to components of the polyketide synthase
(PKS) which function inter alia in the synthesis of the polyketide
backbone and reductive processing of side groups, but also
encompasses polypeptides having ancillary functions in the
synthesis of the polyketide. Thus polypeptides of the biosynthetic
gene cluster may also act in macrolide ring or polyketide chain
modification (e.g. catalysing a reaction in the formation of the
C12 nitrile moiety of borrelidin), in the synthesis of a precursor
or starter unit for a polyketide or macrolide moiety (e.g.
catalysing a reaction in the synthesis of the
trans-cyclopentane-1,2-dicarboxylic acid starter unit for the
borrelidin PKS, or responsible for the activation of such molecules
as the coenzyme-A thioesters of the starter and extender units of
the chain), regulatory activity (e.g. regulation of the expression
of the genes or proteins involved in polyketide or macrolide
synthesis), transporter activity (e.g. in transport of substrates
for the polyketide or macrolide moiety into the cell, or of
synthesis products such as the polyketide or macrolide molecule out
of the cell), and in conferring resistance of the producing cell to
the synthesised products (e.g. through specific binding to the
synthesised molecule, or as a replacement for other endogenous
proteins to which the synthesised molecule may bind within or
outside of the cell).
[0036] The gene cluster also includes non-coding regions, such as
promoters and other transcriptional regulatory sequences which are
operably linked to the coding regions of the gene cluster. The
skilled person is well able to identify such elements based upon
the information provided herein, and these are within the scope of
the present invention.
[0037] Genes and open reading frames encoded within SEQ ID No.1
represent preferred parts or fragments of SEQ ID No.1. Thus an
isolated nucleic acid molecule may comprise a sequence that encodes
a polypeptide from a borrelidin biosynthetic gene cluster, wherein
said polypeptide has an amino acid sequence selected from the group
consisting of SEQ ID Nos.2 to 43 and 113.
[0038] In preferred embodiments, the nucleic acid sequence
comprises an open reading frame selected from the group of open
reading frames of SEQ ID NO: 1 consisting of borA1, borA2, borA3,
borA4, borA5, borA6, borB, borC, borD, borE, borF, borG, borH,
borI, borJ, borK, borL, borM, borN, borO, orfB1, orfB2, orfB3,
orfB4, orfB5, orfB6, orfB7, orfB8, orfB9, orfB10, orfB11, orfB12,
orfB13a, orfB13b, orfB14, orfB15, orfB16, orfB17, orfB18, orfB19,
orfB20, orfB21 and orfB22, said open reading frames being described
by, respectively, bases 16184*-18814, 18875-23590, 23686-34188,
34185*-39047, 39122*-45514, 45514-50742, 7603-8397c, 8397-9194c,
9244-9996c, 9993-11165c, 11162-11980c, 11992-13611c, 13608-15659*c,
50739*-52019, 52113-53477, 53486-54466, 54506-56176, 56181*-57098,
57112-57858, 57939-59966, 2-313 (incomplete), 501*-3107,
3172-3810c, 3935-4924c, 5123-5953, 5961-6518*c, 6564*-7538,
60153-60533*c, 60620-61003, 61188*-61436, 61526-61738,
61767-62285c, 62750-63067c, 62586-62858c, 63155-65071c,
65374-65871, 65942-68305*c, 68290-68910*c, 69681-70436,
70445-71848, 71851-72957, 73037-73942 and 73995-74534c of SEQ ID
No.1.
[0039] In the above list, `c` indicates that the gene is encoded by
the complementary strand to that shown in SEQ ID NO: 1. Each open
reading frame above represents the longest probable open reading
frame present. It is sometimes the case that more than one
potential start codon can be identified. One skilled in the art
will recognise this and be able to identify alternative possible
start codons. Those genes which have more than one possible start
codon are indicated with a `*` symbol. Throughout we have indicated
what we believe to be the start codon; however, a person of skill
in the art will appreciate that it may be possible to generate
active protein using an alternative start codon. Proteins generated
using these alternative start codons are also considered within the
scope of the present invention.
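The coordinate notation used in the above list (a base range, an optional `*` and an optional `c` suffix) can be read mechanically. The following sketch is illustrative only: the regular expression reflects only the entries visible in the list, and the class and function names are hypothetical.

```python
import re
from dataclasses import dataclass

# Sketch of a parser for the coordinate notation used in the list above,
# e.g. "16184*-18814" or "13608-15659*c".
# ASSUMPTION: the regular expression reflects only the entries visible in
# this passage; the class and function names are hypothetical.
_COORD = re.compile(r"^(\d+)(\*?)-(\d+)(\*?)(c?)$")


@dataclass(frozen=True)
class OrfLocation:
    name: str
    start: int               # lower coordinate on the strand shown
    end: int                 # higher coordinate on the strand shown
    complement: bool         # True when the entry carries the `c` suffix
    alternative_start: bool  # True when the entry carries a `*`

    @property
    def length(self):
        return self.end - self.start + 1


def parse_orf(name, text):
    """Parse one coordinate entry; an '(incomplete)' note is ignored."""
    token = text.replace("(incomplete)", "").strip()
    match = _COORD.match(token)
    if match is None:
        raise ValueError(f"unrecognised coordinate entry: {text!r}")
    start, star1, end, star2, comp = match.groups()
    return OrfLocation(name, int(start), int(end), comp == "c",
                       bool(star1 or star2))


examples = {
    "borA1": "16184*-18814",
    "borB": "7603-8397c",
    "borH": "13608-15659*c",
    "orfB1": "2-313 (incomplete)",
}
for gene, coords in examples.items():
    loc = parse_orf(gene, coords)
    print(loc.name, loc.start, loc.end, loc.complement,
          loc.alternative_start, loc.length % 3 == 0)
```

In the entries listed, the `*` follows the higher coordinate whenever the `c` suffix is present, which is consistent with the start codon lying at that end on the complementary strand. The final column printed is a simple sanity check that each range is a whole number of codons.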
[0040] It should be noted that a number of these open reading
frames begin with a codon (GTG, CTG or TTG) other than the more
normal ATG initiation codon. It is well known that in some
bacterial systems such codons, which normally denote valine (GTG)
or leucine (CTG, TTG), may be read as initiation codons encoding
methionine at the N terminus of the polypeptide chain. In the amino
acid sequences (SEQ ID Nos: 2 to 43 and 113) provided herein, such
codons are therefore translated as methionine.
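The translation convention described above can be sketched as follows. This is an illustrative example only, assuming the standard genetic code; the function name and the toy sequence are hypothetical and are not taken from SEQ ID No.1.

```python
# Sketch: translate an open reading frame so that an alternative
# initiation codon (GTG, CTG or TTG) in the first position is read as
# methionine, while the same codons elsewhere keep their usual meaning.
# ASSUMPTION: the standard genetic code is used; the function name and
# the toy sequence are hypothetical.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "CTG", "TTG"}


def translate_orf(dna):
    """Translate from the first base up to (not including) the first stop codon."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if pos == 0 and codon in START_CODONS:
            residue = "M"  # initiation codon read as methionine
        else:
            residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# GTG at the start is read as M; the internal GTG, CTG and TTG are not.
print(translate_orf("GTGGTGCTGTTGTAA"))
```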
[0041] Also provided are nucleic acid molecules comprising portions
of the open reading frames identified herein. For example, such a
nucleic acid sequence may comprise one or more isolated domains
derived from the open reading frames identified herein. The
polypeptides encoded by these isolated portions of the open reading
frames may have independent activity, e.g. catalytic activity. In
particular, the polypeptides which make up the borrelidin PKS have
modular structures in which individual domains have particular
catalytic activities as set out above. Thus any of these domains
may be expressed alone or in combination, with other polypeptides
from the borrelidin PKS described herein or domains thereof, or
with polypeptides from the PKS of other polyketides. In particular,
any of these domains may be substituted for the equivalent domains
either within the borrelidin PKS or in other polyketide synthases
and additionally equivalent domains from other PKSs may be
substituted for domains within the borrelidin PKS. In this context
an equivalent domain includes domains which have the same type of
function but differ in, for example, their specificity. Examples
of substitutions contemplated by the present invention include: the
substitution of a malonyl-CoA specific AT domain for a
methylmalonyl-CoA specific AT domain, or the substitution of a
reductive loop containing a KR domain only for one containing KR,
DH and ER. In preferred embodiments the expressed domains represent
at least one PKS module as described below.
[0042] The term `PKS domain` as used herein refers to a polypeptide
sequence, capable of folding independently of the remainder of the
PKS, and having a single distinct enzymatic activity or other
function in polyketide or macrolide synthesis including, but not
restricted to .beta.-ketoacyl-acyl carrier protein synthase (KS),
acyl carrier protein (ACP), acyl transferase (AT),
.beta.-ketoreductase (KR), dehydratase (DH), enoyl reductase (ER)
or terminal thioesterase (TE).
[0043] Accordingly, the invention further provides:
(a) an isolated nucleic acid molecule comprising a sequence that
encodes a PKS domain selected from AT0 and ACP0, said domains being
described by, respectively, amino acids 322-664 and 694-763 of SEQ
ID No.2. In a preferred embodiment, the PKS domain comprises a
sequence selected from the group consisting of bases 17147-18175
and 18263-18472 of SEQ ID No.1; (b) an isolated nucleic acid
molecule comprising a sequence that encodes a PKS domain selected
from KS1, AT1, KR1 and ACP1, said domains being described by,
respectively, amino acids 34-459, 557-885, 1136-1379 and 1419-1486
of SEQ ID No.3. In a preferred embodiment, the PKS domain comprises
a sequence selected from the group consisting of bases 18974-20251,
20543-21529, 22280-23011 and 23129-23332 of SEQ ID No.1; (c) an
isolated nucleic acid molecule comprising a sequence that encodes a
PKS domain selected from KS2, AT2, DH2, KR2, ACP2, KS3, AT3, DH3,
KR3 and ACP3, said domains being described by, respectively, amino
acids 34-459, 559-887, 903-1050, 1354-1597, 1628-1694, 1724-2149,
2245-2576, 2593-2734, 3060-3307 and 3340-3406 of SEQ ID No.4. In a
preferred embodiment, the PKS domain comprises a sequence selected
from the group consisting of bases 23785-25062, 25360-26346,
26392-26835, 27745-28476, 28567-28767, 28855-30132, 30418-31413,
31462-31887, 32863-33606 and 33703-33903 of SEQ ID No.1; (d) an
isolated nucleic acid molecule comprising a sequence that encodes a
PKS domain selected from KS4, AT4, KR4 and ACP4, said domains being
described by, respectively, amino acids 34-459, 555-886, 1179-1423
and 1459-1525 of SEQ ID No.5. In a preferred embodiment, the PKS
domain comprises a sequence selected from the group consisting of
bases 34284-35561, 35847-36842, 37719-38453 and 38559-38759 of SEQ
ID No.1; (e) an isolated nucleic acid molecule comprising a
sequence that encodes a PKS domain selected from KS5, AT5, DH5,
ER5, KR5 and ACP5, said domains being described by, respectively,
amino acids 34-457, 553-888, 905-1046, 1401-1690, 1696-1942 and
1975-2041 of SEQ ID No.6. In a preferred embodiment, the PKS domain
comprises a sequence selected from the group consisting of bases
39221-40492, 40778-41785, 41834-42259, 43322-44191, 44207-44947 and
45044-45244 of SEQ ID No.1; (f) an isolated nucleic acid molecule
comprising a sequence that encodes a PKS domain selected from KS6,
AT6, KR6, ACP6 and TE, said domains being described by,
respectively, amino acids 37-457, 555-883, 1101-1335, 1371-1437 and
1461-1708 of SEQ ID No.7. In a preferred embodiment, the PKS domain
comprises a sequence selected from the group consisting of bases
45622-46884, 47176-48162, 48814-49518, 49624-49824 and 49894-50637
of SEQ ID No.1.
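The amino acid ranges and base ranges given above for each domain are related by simple arithmetic on the start position of the corresponding open reading frame. The following is a minimal sketch, assuming that SEQ ID Nos. 2 to 7 correspond in order to the products of borA1 to borA6 and that the start positions are those given in the coordinate list above; the function name is hypothetical.

```python
# Sketch: convert an amino-acid range within a forward-strand open
# reading frame into base positions of the full nucleotide sequence.
# ASSUMPTION: the ORF start positions are taken from the coordinate list
# given earlier for borA1 to borA6; the function name is hypothetical.
ORF_START = {
    "borA1": 16184, "borA2": 18875, "borA3": 23686,
    "borA4": 34185, "borA5": 39122, "borA6": 45514,
}


def domain_bases(orf, aa_start, aa_end):
    """Map 1-based inclusive residues to 1-based inclusive bases."""
    first = ORF_START[orf] + 3 * (aa_start - 1)
    last = ORF_START[orf] + 3 * aa_end - 1
    return first, last


checks = [
    ("AT0", "borA1", 322, 664),
    ("KS1", "borA2", 34, 459),
    ("KS4", "borA4", 34, 459),
    ("TE", "borA6", 1461, 1708),
]
for domain, orf, aa_start, aa_end in checks:
    print(domain, domain_bases(orf, aa_start, aa_end))
```

For the four domains shown, the computed base ranges agree with those recited above (17147-18175, 18974-20251, 34284-35561 and 49894-50637).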
[0044] In another of its aspects the invention provides an isolated
nucleic acid molecule comprising a sequence that encodes a PKS
module, said module being selected from the group consisting of
amino acids 322-763 of SEQ ID No.2, 34-1486 of SEQ ID No.3, 34-1694
of SEQ ID No.4, 1724-3406 of SEQ ID No.4, 34-1525 of SEQ ID No.5,
34-2041 of SEQ ID No.6 and 37-1437 or 1708 of SEQ ID No.7. In a
preferred embodiment, the module comprises a sequence selected from
the group consisting of bases 17147-18472, 18974-23332,
23785-28767, 28855-33903, 34284-38759, 39221-45244, 45622-49824 or
50637 of SEQ ID No.1.
[0045] The term `module` as used herein refers to a single
polypeptide comprising a plurality of PKS domains each having a
single distinct enzymatic activity in polyketide or macrolide
synthesis including, but not restricted to .beta.-ketoacyl-acyl
carrier protein synthase (KS), acyltransferase (AT), acyl carrier
protein (ACP), .beta.-ketoreductase (KR), dehydratase (DH), or
enoyl reductase (ER) or terminal thioesterase (TE). An extension
module typically comprises a KS, AT and ACP domain (although some
modular PKSs may encode their AT domains as independent proteins).
An extension module may further comprise one or more domains
capable of reducing a beta-keto group to a hydroxyl, enoyl or
methylene group (said group of domains are referred to herein as a
"reductive loop"). Thus a module comprising a reductive loop
typically contains a KR domain, KR and DH domains, or KR, DH and ER
domains.
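The relationship between the domains of a reductive loop and the resulting state of the beta-carbon can be summarised as a simple lookup. This sketch is illustrative only: it assumes that every listed domain is active, which need not hold for any particular module, and the function name is hypothetical.

```python
# Sketch: infer the expected oxidation state of the beta-carbon from the
# reductive domains present in an extension module, following the
# definition of a "reductive loop" given above.
# ASSUMPTION: every listed domain is taken to be active; the function
# name is hypothetical.
def beta_carbon_state(domains):
    """Return 'keto', 'hydroxyl', 'enoyl' or 'methylene'."""
    present = set(domains)
    if "KR" not in present:
        return "keto"       # no reduction of the beta-keto group
    if "DH" not in present:
        return "hydroxyl"   # KR only
    if "ER" not in present:
        return "enoyl"      # KR and DH
    return "methylene"      # KR, DH and ER


modules = {
    "KR only": ["KS", "AT", "KR", "ACP"],
    "KR + DH": ["KS", "AT", "DH", "KR", "ACP"],
    "KR + DH + ER": ["KS", "AT", "DH", "ER", "KR", "ACP"],
}
for label, domains in modules.items():
    print(label, "->", beta_carbon_state(domains))
```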
[0046] A PKS may further comprise a TE domain to perform chain
termination and/or cyclisation of the final product, or
alternatively it may contain another functionality known to perform
a similar function such as that for the addition of an amino acid
residue and macrolactam formation as observed for rapamycin
(Schwecke et al., 1995), for macrolactam formation as for rifamycin
(August et al., 1998), and for amino acid incorporation followed by
reductive elimination as for myxalamid biosynthesis (Silakowski et
al., 2001).
[0047] Also provided is a nucleic acid molecule encoding a
polyketide synthase comprising a sequence encoding one or more of
the domains or modules described above.
[0048] The sequences provided herein provide means with which to
manipulate and/or to enhance polyketide synthesis. Thus there is
provided a method of modifying a parent polyketide synthase,
comprising expressing a domain from a borrelidin polyketide
synthase or a derivative thereof as described herein in a host cell
expressing said parent polyketide synthase, such that the domain is
incorporated into said parent polyketide synthase. There is further
provided a method of modifying a parent polyketide synthase,
comprising introducing into a host cell a nucleic acid encoding a
domain from a borrelidin polyketide synthase, or a derivative
thereof, wherein the host cell contains nucleic acid encoding said
parent polyketide synthase, such that, when expressed, the domain
is incorporated into said parent polyketide synthase. The
borrelidin PKS domain may be inserted in addition to the native
domains of the parent PKS, or may replace a native parent domain.
Typically the parent PKS will be a Type I PKS.
The present invention further provides methods of modifying a
parent borrelidin PKS. A donor domain (e.g. from a Type I PKS) may
be expressed in a host cell expressing said parent borrelidin PKS.
There is further provided a method of modifying a parent borrelidin
polyketide synthase comprising introducing into a host cell a
nucleic acid encoding a domain from a donor polyketide synthase,
wherein the host cell contains nucleic acid encoding said parent
borrelidin polyketide synthase, such that, when expressed, the
domain is incorporated into said parent borrelidin polyketide
synthase.
[0049] Additionally or alternatively, a domain of the parent PKS
may be deleted or otherwise inactivated; e.g. a parent domain may
simply be deleted, or be replaced by a domain from a donor PKS, or
a domain from a donor PKS may be added to the parent. Where a
domain is added or replaced, the donor domain may be derived from
the parent synthase, or from a different synthase.
[0050] These methods may be used to enhance the biosynthesis of
borrelidin, to produce new borrelidin derivatives or analogues, or
other novel polyketide or macrolide structures. The number and
nature of modules in the system may be altered to change the number
and type of extender units recruited, and to change the various
synthase, reductase and dehydratase activities that determine the
structure of the polyketide chain. Such changes can be made by
altering the order of the modules that comprise the PKS, by the
duplication or removal of modules that comprise the PKS, by the
introduction of modules from heterologous sources, or by some
combination of these various approaches.
[0051] Thus domains or modules of the borrelidin PKS may be
deleted, duplicated, or swapped with other domains or modules from
the borrelidin PKS, or from PKS systems responsible for synthesis
of other polyketides (heterologous PKS systems, particularly Type I
PKS systems), which may be from different bacterial strains or
species. Alternatively domains or modules from the borrelidin PKS
may be introduced into heterologous PKS systems in order to produce
novel polyketides or macrolides. Combinatorial modules may also be
swapped between the borrelidin polyketide synthase and other
polyketide synthases; these combinatorial modules extend between
corresponding domains of two natural-type modules, e.g. from the AT
of one module to the AT of the next.
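The deletion, duplication and swapping of modules described above can be pictured as operations on an ordered list. The following sketch is a bookkeeping model only and says nothing about whether a given hybrid would be functional; the module names, extender assignments and function names are all hypothetical and do not describe the borrelidin PKS.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Sketch: a PKS modelled as an ordered tuple of modules, with the kinds
# of rearrangement described above (deletion, duplication, swapping).
# ASSUMPTION: this is a bookkeeping model only; all names are hypothetical.


@dataclass(frozen=True)
class Module:
    name: str
    extender: str               # e.g. "malonyl-CoA" or "methylmalonyl-CoA"
    reductive: Tuple[str, ...]  # subset of ("KR", "DH", "ER")


def delete_module(pks, index):
    return pks[:index] + pks[index + 1:]


def duplicate_module(pks, index):
    return pks[:index + 1] + (pks[index],) + pks[index + 1:]


def swap_module(pks, index, donor):
    return pks[:index] + (donor,) + pks[index + 1:]


def swap_at_specificity(module, extender):
    """Model an AT-domain exchange by changing the extender unit only."""
    return replace(module, extender=extender)


parent = (
    Module("m1", "malonyl-CoA", ("KR",)),
    Module("m2", "methylmalonyl-CoA", ("KR", "DH")),
    Module("m3", "malonyl-CoA", ("KR", "DH", "ER")),
)
donor = Module("d1", "methylmalonyl-CoA", ())

shorter = delete_module(parent, 1)
longer = duplicate_module(parent, 2)
hybrid = swap_module(parent, 0, donor)
retargeted = swap_at_specificity(parent[0], "methylmalonyl-CoA")
print([m.name for m in shorter])
print([m.name for m in longer])
print([m.name for m in hybrid])
print(retargeted.extender)
```

The modules are immutable, so each operation returns a new arrangement and leaves the parent unchanged, which mirrors the distinction drawn above between a parent PKS and its modified derivatives.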
[0052] For example, a particular extender module may be swapped for
one having specificity for a different extender unit (as described
e.g. in WO98/01571 and WO98/01546), or mutated to display
specificity or selectivity for a different extender unit e.g. as
described below. Additionally or alternatively, introduction,
deletion, swapping or mutation of domains or modules, such as the
KR, DH and ER domains responsible for the processing of a given
.beta.-keto moiety, may be used to alter the level of reductive
processing of an extender unit during polyketide synthesis. Such
changes may also lead to changes in stereochemistry of the
alpha-alkyl and beta-hydroxyl groups thus formed by altered
modules. In a preferred embodiment the BorA5 module may be
introduced into a parent PKS to provide iterative addition of
extender units to a polyketide backbone, e.g. expanding the ring
size of a macrolide polyketide relative to that naturally produced
by the parent PKS.
[0053] The borrelidin loading module is the first PKS loading
module to be identified having specificity for an alicyclic
di-carboxylic acid starter unit. Thus this module or a derivative
thereof may be used to introduce alicyclic starter units into
heterologous polyketide synthases. This need not be restricted to
use of trans-cyclopentane-1,2-dicarboxylic acid normally used as
the borrelidin starter unit. The borrelidin loading module is
herein shown also to be capable of directing incorporation of other
starter units including trans-cyclobutane-1,2-dicarboxylic acid,
2,3-dimethylsuccinic acid and 2-methylsuccinic acid. The borrelidin
starter unit may also be modified in a borrelidin producing cell,
or replaced by a heterologous loading module, to introduce
alternative starter units into the borrelidin synthetic
pathway.
[0054] The position of the loading module of the PKS may be chosen
(e.g. by fusing it to a particular location within the PKS) in
order to control the ring size of the resultant
polyketide/macrolide molecules.
[0055] The AT domains that determine the carboxylic acid-CoA
thioester extender units may be deleted, modified or replaced. The
ACP domains may also be deleted, modified or replaced. In addition
domains that are not normally present in the borrelidin PKS but
which are found in other modular PKS and/or mixed PKS/NRPS systems
may be inserted. Examples include, but are not limited to: O-methyl
transferase domains, C-methyl transferase domains, epimerisation
domains, monooxygenase domains and dehydrogenase domains,
aminotransferase domains and non-ribosomal peptide synthetase
domains.
[0056] Further, the thioesterase domain of the borrelidin PKS may
be altered or repositioned (e.g. fused to a chosen location within
the PKS) in order to change its specificity and/or in order to
release polyketide/macrolide molecules with a chosen ring size.
Alternatively, heterologous thioesterase domains may be inserted
into the borrelidin PKS to produce molecules with altered ring size
relative to the molecule normally produced by the parent PKS, or to
produce a free acid.
[0057] In yet another alternative, the amino acid incorporating and
macrolactam forming domains from mixed NRPS/PKS systems such as
that for rapamycin, or for related systems such as for rifamycin
biosynthesis and myxalamid biosynthesis, or modules from NRPS
systems (such as those for bleomycin biosynthesis) may be inserted
into the PKS to produce novel polyketide related molecules of mixed
origin.
[0058] The open reading frames encoding the PKS described herein
may also comprise portions encoding non-enzymatically active
portions which nevertheless have a functional role as scaffold
regions which space and stabilise the enzymatically active domains
and/or modules of the PKS at appropriate distances and
orientations, and which may have recognition and docking functions
that order the domains and modules of the PKS in the correct
spatial arrangement. Thus the nucleic acid sequences of the present
invention comprise sequences encoding such scaffold regions, either
alone or in combination with sequences encoding domains or modules
as described above. It will be appreciated that the various
manipulations of PKS coding sequences described above may give rise
to hybrid PKS genes or systems. Thus the present invention also
provides nucleic acids encoding such hybrid PKS systems. The
invention therefore provides a nucleic acid construct comprising at
least one first nucleic acid portion encoding at least one domain
of a borrelidin PKS and a second nucleic acid portion or portions
encoding at least one type I PKS domain which is heterologous to
said borrelidin PKS. In preferred embodiments the construct
comprises a hybrid polyketide synthase gene, said gene encoding at
least one domain of a borrelidin PKS and at least one type I PKS
domain which is heterologous to said borrelidin PKS. Further
preferred embodiments are as described above.
[0059] In a further aspect, the present invention provides an
isolated nucleic acid molecule comprising a sequence encoding a
polypeptide which catalyses a step in the synthesis of a starter
unit or substrate for polyketide synthesis, preferably in the
synthesis of the trans-cyclopentane-1,2-dicarboxylic acid moiety
used as a starter unit by the borrelidin PKS. The polypeptide may
have activity as a dehydrogenase, 3-oxoacyl-ACP-reductase, cyclase,
F420 dependent dehydrogenase, or
2-hydroxyhepta-2,4-diene-1,7-dioate isomerase. Preferably the
polypeptide comprises the sequence encoded by one of the group of
genes consisting of borC, borD, borE, borF, borG, borH, borK, borL,
borM and borN, as shown in SEQ ID NO: 8, 9, 10, 12, 13, 14, 17, 18,
19 or 20.
[0060] These genes may be deleted, disrupted, or otherwise
inactivated in a borrelidin-producing cell in order to abolish
borrelidin production. Cell lines resulting from such changes may
be chemically complemented by the addition of exogenous carboxylic
acids which may be incorporated in place of the natural starter
unit. Thus, new borrelidin related molecules may be synthesised,
which are initiated from the exogenously fed carboxylic acid. Such
an approach is termed mutasynthesis. The genes responsible for
trans-cyclopentane-1,2-dicarboxylic acid synthesis may be
introduced into a heterologous polyketide producer cell to allow
that cell to synthesise the alicyclic dicarboxylic acid as a
starter unit for its own PKS.
[0061] Thus the present invention further provides a method for the
production of borrelidin and borrelidin analogues at improved
titres, said method comprising disrupting borG in the host strain,
fermenting the resulting cell line and feeding an exogenous
carboxylic acid. In various preferred embodiments the exogenous
carboxylic acid is trans-cyclopentane-1,2-dicarboxylic acid or the
exogenous carboxylic acid is selected from the group consisting of
trans-cyclobutane-1,2-dicarboxylic acid, 2,3-dimethylsuccinic acid
and 2-methylsuccinic acid and/or the method additionally comprises
deleting, modifying or replacing one or more borrelidin
biosynthetic genes, or borrelidin polyketide synthase domains or
modules. A person of skill in the art is aware that polyketide
synthases may also be expressed in heterologous hosts; therefore
the present invention also contemplates a method for the production
of higher titres of borrelidin and borrelidin analogues in a
heterologous host, said method comprising transforming a host cell
with the entire borrelidin gene cluster with the exception of borG
or disrupting the borG gene in situ once the gene cluster has been
transferred.
[0062] Alternatively, genes responsible for the synthesis of the
starter unit may be over-expressed in order to improve the
fermentation titres of borrelidin or borrelidin related molecules.
Thus the present invention further provides a method for increasing
the titre of borrelidin and borrelidin derivatives or borrelidin
related molecules and their derivatives, said method comprising
upregulating a borrelidin biosynthetic gene involved in production
of the starter unit, said gene selected from the group consisting
of borC, borD, borE, borF, borH, borK, borL, borM and borN. In a
preferred embodiment the upregulated gene is borE or borL.
[0063] In another approach the genes responsible for the synthesis
of the starter unit may be modified, or replaced by other synthetic
genes directing the production of altered carboxylic acids, leading
to the production of borrelidin related molecules. These techniques
may be complemented by the modification of the loading module of
the PKS as described above.
[0064] In a further aspect, the present invention provides an
isolated nucleic acid molecule comprising a sequence encoding a
polypeptide which catalyses a step in the modification of a side
chain of a polyketide moiety, for example in the conversion of a
methyl group to a nitrile moiety, e.g. at C12 of pre-borrelidin
(14). The polypeptide may have activity as a cytochrome P450
oxidase, amino transferase, or NAD/quinone oxidoreductase.
Preferably the polypeptide comprises the sequence encoded by one of
the group of genes consisting of borI, borJ, and borK as shown in
SEQ ID NO: 15, 16 or 17.
[0065] Various of these genes may be deleted/inactivated such that
borrelidin-related molecules, or shunt metabolites thereof,
accumulate which represent intermediate stages of the process that
introduces the nitrile moiety. The addition of heterologous genes
to such systems may allow alternative elaboration of any
accumulated biosynthetic intermediates or shunt metabolites
thereof. Alternatively, the genes may be mutated in order to alter
their substrate specificity such that they function on alternative
positions of pre-borrelidin molecules in order to provide
borrelidin-related molecules. In addition, the genes responsible
for formation of the nitrile group may be over-expressed in order
to improve the fermentation titres of borrelidin or
borrelidin-related molecules.
[0066] Alternatively, one, some or all of these genes may be
introduced into cells capable of producing other polyketides to
provide for desired side chain processing of that polyketide, e.g.
the introduction of a nitrile moiety. This opens up the possibility
of specific biosynthetic introduction of nitrile moieties into
polyketides, particularly at side chains derived from
methylmalonyl-CoA or ethylmalonyl-CoA extender units. Purified
enzymes (see below) may also be used to effect the conversion of
polyketide side chains to nitrile moieties in vitro.
[0067] In a further aspect, the present invention provides an
isolated nucleic acid molecule comprising a sequence encoding a
polypeptide involved in conferring resistance to borrelidin. The
polypeptide may have homology to a threonyl tRNA synthase, and
preferably has threonyl tRNA synthase activity. Preferably the
polypeptide comprises the sequence encoded by the borO gene as
shown in SEQ ID NO: 21. A resistance gene such as borO, carried on
a suitable vector (see below) may be used as a selective marker.
Thus cells transformed with such a vector may be positively
selected by culture in the presence of a concentration of
borrelidin which inhibits the growth of, or kills, cells lacking
such a gene.
[0068] In a further aspect, the present invention provides an
isolated nucleic acid molecule comprising a sequence encoding a
polypeptide involved in regulation of expression of one or more
genes of the borrelidin gene cluster. In a preferred embodiment the
polypeptide comprises the sequence encoded by the borL gene as
shown in SEQ ID NO: 18, or as encoded by orfB8 or orfB12 as shown
in SEQ ID NO: 29 or 33. Regulator genes may be engineered to
increase the titre of borrelidin and borrelidin derivatives, or
borrelidin related molecules and their derivatives produced by
fermentation of the resulting cell lines. For example, repressors
may be deleted/inactivated, and/or activators may be up-regulated
or overexpressed, e.g. by increasing gene copy number or placing
the coding sequence under the control of a strong constitutively
active or inducible promoter. The borL gene or a portion thereof
may also find use as a hybridisation probe to identify similar
regulator genes located in or outside other biosynthetic gene
clusters.
[0069] In a further aspect, the present invention provides an
isolated nucleic acid molecule comprising a sequence encoding a
polypeptide having type II thioesterase activity. In a preferred
embodiment the polypeptide comprises the sequence encoded by the
borB gene as shown in SEQ ID NO: 8. This nucleic acid may be
introduced into a host cell to modulate the titre of a polyketide
synthesised by that cell. In particular, the titre may be increased
by "editing" of the products of unwanted side reactions (e.g.
removal of acyl groups formed by inappropriate decarboxylation of
extender units attached to KS domains). However in various aspects
it may be desirable to remove such an activity from a producer
cell, for example to increase the variety of polyketide products
produced by that cell, or to facilitate production of an analogue
of a naturally produced polyketide which would normally be blocked
by such an editing activity.
[0070] The nucleotide sequences of the invention may be portions of
the sequence shown in SEQ ID NO: 1, or the complement thereof, or
mutants, variants, derivatives or alleles of these sequences. The
sequences may differ from that shown by a change which is one or
more of addition, insertion, deletion and substitution of one or
more nucleotides of the sequence shown. Changes to a coding
nucleotide sequence may result in an amino acid change at the
protein level, or not, as determined by the redundancy of the
genetic code. Thus, nucleic acid according to the present invention
may include a sequence different from the sequence shown in SEQ ID
NO: 1 yet encode a polypeptide with the same amino acid sequence.
Preferably mutants, variants, derivatives or alleles of the
sequences provided encode polypeptides having the same enzymatic
activity as those described herein.
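By way of illustration only, the redundancy of the genetic code referred to above can be sketched as follows. The two nucleotide strings are invented for the purpose of the example and are not taken from SEQ ID NO: 1; the helper name `translate` is likewise an assumption. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical sequences that differ at three third-codon positions.
seq_a = "ATGGCCCTGTCGTGA"
seq_b = "ATGGCGCTCTCCTGA"

print(translate(seq_a), translate(seq_b))
```

Both strings translate to the same four-residue peptide although they differ at three nucleotide positions, which is the situation contemplated in the paragraph above.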
[0071] Where the sequence is a coding sequence, the encoded
polypeptide may comprise an amino acid sequence which differs by
one or more amino acid residues from the amino acid sequences shown
in SEQ ID Nos: 2 to 43 and 113. Nucleic acid encoding a polypeptide
which is an amino acid sequence mutant, variant, derivative or
allele of any of the sequences shown is further provided by the
present invention. Such polypeptides are discussed below. Nucleic
acid encoding such a polypeptide may show greater than about 60%
identity with the coding sequence of SEQ ID NO: 1, greater than
about 70% identity, greater than about 80% identity, or greater
than about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99%
identity therewith. Percentage identity may be calculated using one
of the programs such as BLAST or BestFit from within the Genetics
Computer Group (GCG) Version 10 software package available from the
University of Wisconsin, using default parameters.
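Programs such as BLAST and BestFit first compute an alignment and then report identity over it. The following minimal sketch shows only the final arithmetic, for a pair of sequences that is assumed to be already aligned. The sequences, the function name `percent_identity` and the convention of counting gapped columns in the denominator are all assumptions made for illustration; they do not reproduce the default parameters of either program.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment given as two equal-length
    strings with '-' marking gaps. Columns where both rows are gaps are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 20-column alignment: 18 identical positions, 1 mismatch, 1 gap.
a = "ATGGCCCTGTCGACCGGTAA"
b = "ATGGCGCTGTCGACC-GTAA"
print(f"{percent_identity(a, b):.1f}% identity")
```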
[0072] In preferred embodiments, whether coding or non-coding, the
nucleotide sequences of the invention are capable of hybridising
specifically with at least a portion of the sequence of SEQ ID NO:
1 or the complement thereof.
[0073] For example, hybridizations may be performed, according to
the method of Sambrook et al. (Sambrook et al., 1989), using a
hybridization solution comprising: 5×SSC, 5×Denhardt's
reagent, 0.5-1.0% SDS, 100 µg/ml denatured, fragmented salmon
sperm DNA, 0.05% sodium pyrophosphate and up to 50% formamide.
Hybridization is carried out at 37-42 °C for at least six
hours. Following hybridization, filters are washed as follows: (1)
5 minutes at room temperature in 2×SSC and 1% SDS; (2) 15
minutes at room temperature in 2×SSC and 0.1% SDS; (3) 30
minutes-1 hour at 37 °C in 1×SSC and 1% SDS; (4) 2
hours at 42-65 °C in 1×SSC and 1% SDS, changing the
solution every 30 minutes.
[0074] One common formula for calculating the stringency conditions
required to achieve hybridization between nucleic acid molecules of
a specified sequence homology is (Sambrook et al., 1989):
$$T_m = 81.5\,^{\circ}\mathrm{C} + 16.6\,\log[\mathrm{Na}^{+}] + 0.41\,(\%\,\mathrm{G{+}C}) - 0.63\,(\%\,\text{formamide}) - \frac{600}{\text{bp in duplex}}$$
[0075] As an illustration of the above formula, using [Na+] = 0.368
and 50% formamide, with GC content of 42% and an average probe size
of 200 bases, the T_m is 57 °C. The T_m of a DNA
duplex decreases by 1-1.5 °C with every 1% decrease in
homology. Thus, targets with greater than about 75% sequence
identity would be observed using a hybridization temperature of
42 °C. Such hybridisation would be considered substantially
specific to the nucleic acid sequence of the present invention.
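The worked figure given above can be reproduced with a few lines of code. This is a sketch only: the function and parameter names are assumptions, and the logarithm is taken to be base 10, which is the usual convention for this formula.

```python
import math


def melting_temperature(na_molar: float, gc_percent: float,
                        formamide_percent: float, duplex_bp: int) -> float:
    """T_m in degrees C from the formula quoted above (log assumed to be base 10)."""
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / duplex_bp)


# Values taken from the illustration in the text.
tm = melting_temperature(na_molar=0.368, gc_percent=42,
                         formamide_percent=50, duplex_bp=200)
print(f"T_m = {tm:.1f} C")
```

With the stated inputs the result rounds to 57 degrees C, in agreement with the text.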
[0076] The nucleic acids of the present invention preferably
comprise at least 15 contiguous nucleotides of SEQ ID NO: 1. They
may comprise 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120,
150, 200, 300, 500 or more contiguous nucleotides of SEQ ID NO:
1.
[0077] The nucleic acids may be used e.g. as primers or probes for
the identification of novel genes or other genetic elements, such
as transcriptional regulatory sequences, from polyketide or
macrolide biosynthetic gene clusters, e.g. sequences encoding
enzymes of the PKS, or domains or modules thereof, enzymes involved
in the biosynthesis of a starter unit, enzymes modifying side
chains of polyketide moieties, transporters, resistance genes and
regulatory molecules as described.
[0078] Thus the present invention provides a method of identifying
a novel polyketide biosynthetic gene cluster, or a portion thereof,
comprising hybridising a sample of target nucleic acid with a
nucleic acid of the present invention capable of hybridising
specifically to a nucleic acid having the sequence of SEQ ID NO: 1
or a portion thereof. The target nucleic acid may be any suitable
nucleic acid, and is preferably bacterial genomic DNA.
[0079] Typically, the method further comprises the step of
detecting hybridisation between the sample of nucleic acid and the
nucleic acid of the invention. Hybridisation may be measured using
any of a variety of techniques at the disposal of those skilled in
the art. For instance, probes may be radioactively, fluorescently
or enzymatically labelled. Other methods not employing labelling of
probe include amplification using PCR, RNase cleavage and allele
specific oligonucleotide probing.
[0080] A method may include hybridization of one or more (e.g. two)
probes or primers to target nucleic acid. Where the nucleic acid is
double-stranded DNA, hybridization will generally be preceded by
denaturation to produce single-stranded DNA. The hybridization may
be as part of a PCR procedure, or as part of a probing procedure
not involving PCR. An example procedure would be a combination of
PCR and low stringency hybridization. A screening procedure, chosen
from the many available to those skilled in the art, is used to
identify successful hybridization events and isolate hybridized
nucleic acid.
[0081] Those skilled in the art are well able to employ suitable
conditions of the desired stringency for selective hybridisation,
taking into account factors such as oligonucleotide length and base
composition, temperature and so on, as described above.
[0082] An isolated nucleic acid molecule of the invention may be an
isolated naturally occurring nucleic acid molecule (i.e. isolated
or separated from the components with which it is normally found in
nature) such as free or substantially free of nucleic acid flanking
the gene in the bacterial genome, except possibly one or more
regulatory sequence(s) for expression. Nucleic acid may be wholly
or partially synthetic and may include genomic DNA, cDNA or RNA.
Where nucleic acid according to the invention includes RNA,
reference to the sequence shown should be construed as reference to
the RNA equivalent, with U substituted for T.
[0083] The present invention further provides a vector comprising a
nucleic acid according to the present invention. The vector is
preferably an expression vector comprising a nucleic acid encoding
a polypeptide of a polyketide biosynthetic gene cluster (preferably
a borrelidin biosynthetic gene cluster), or a portion thereof, as
described. Suitable vectors comprising nucleic acid for
introduction into bacteria or eukaryotic host cells can be chosen
or constructed, containing appropriate regulatory sequences,
including promoter sequences, terminator fragments, enhancer
sequences, marker genes and other sequences as appropriate. Vectors
may be plasmids, viral, e.g. "phage", or "phagemid", as appropriate.
For further details see, for example, Sambrook et al., 1989. Many
known techniques and protocols for manipulation of nucleic acid,
for example in preparation of nucleic acid constructs, mutagenesis,
sequencing, introduction of DNA into cells and gene expression, and
analysis of proteins, are described in detail in Short Protocols in
Molecular Biology, Second Edition, Ausubel et al. Eds, John Wiley
& Sons 1992. The disclosures of Sambrook et al. and Ausubel et
al. are incorporated herein by reference.
[0084] In another of its aspects the present invention provides an
isolated polypeptide encoded by a nucleic acid molecule of the
invention as described herein. More particularly, there is provided
an isolated polypeptide comprising an amino acid sequence as shown
in any one or more of SEQ ID Nos. 2 to 43 and 113 or a portion
thereof. As set out above, these amino acid sequences represent
translations of the longest possible open reading frames present in
the sequence of SEQ ID NO: 1 and the complement thereof. The first
amino acid is always shown as Met, regardless of whether the
initiation codon is ATG, GTG, CTG or TTG.
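The convention just described, under which the first residue is written as Met whichever of the four initiation codons is present, can be sketched as follows. The test sequence is invented, and the function name `translate_orf` is an assumption; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
START_CODONS = {"ATG", "GTG", "CTG", "TTG"}


def translate_orf(orf: str) -> str:
    """Translate an open reading frame, always writing the initiation codon as Met."""
    orf = orf.upper()
    if orf[:3] not in START_CODONS:
        raise ValueError(f"unexpected initiation codon: {orf[:3]}")
    residues = ["M"]  # first residue is Met regardless of the start codon used
    for i in range(3, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Hypothetical ORF beginning with GTG; the internal GTG is still read as Val.
print(translate_orf("GTGACCGTGCTGTGA"))
```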
[0085] As used herein the term "polypeptide(s)" includes peptides,
polypeptides and proteins; these terms are used interchangeably
unless otherwise specified.
[0086] A polypeptide which is an amino acid sequence variant,
allele, derivative or mutant of any one of the amino acid sequences
shown may exhibit at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity with the
polypeptide of any one of the SEQ ID Nos. 2 to 43 and 113, or with a
portion thereof. Particular amino acid sequence variants may differ
from those shown by insertion, addition, substitution or deletion
of 1 amino acid, 2, 3, 4, 5-10, 10-20, 20-30, 30-50, 50-100,
100-150, or more than 150 amino acids. Percentage identity may be
calculated using one of the programs such as FASTA or BestFit from
within the Genetics Computer Group (GCG) Version 10 software
package available from the University of Wisconsin, using default
parameters.
[0087] The present invention also includes active portions,
fragments, and derivatives of the polypeptides of the
invention.
[0088] An "active portion" means a peptide which is less than the
full length polypeptide, but which retains at least some of its
essential biological activity. For example, isolated domains or
modules of the PKS as described above may be regarded as active
portions of the PKS.
[0089] A "fragment" means a stretch of amino acid residues of at
least five, at least six, or at least seven contiguous amino acids,
often at least eight or at least nine contiguous amino acids,
typically at least 10, at least 13 contiguous amino acids and, most
preferably, at least 20, at least 25, at least 30, at least 50, at
least 75, at least 100 or more contiguous amino acids. Fragments of
the sequence may comprise antigenic determinants or epitopes useful
for raising antibodies to a portion of the relevant polypeptide.
Thus the polypeptide need not comprise a complete sequence provided
in any one of SEQ ID Nos 2 to 43 and 113, but may comprise a
portion thereof having the desired activity, e.g. an isolated
domain or module, such as those of the PKS described above. It
should be noted that the terms part, portion and fragment are used
interchangeably in this specification; no particular significance
should be ascribed to the specific use of one of these terms in any
particular context.
[0090] A "derivative" of a polypeptide of the invention or a
fragment thereof means a polypeptide modified by varying the amino
acid sequence of the protein, e.g. by manipulation of the nucleic
acid encoding the protein or by altering the protein itself. Such
derivatives of the natural amino acid sequence may involve
insertion, addition, deletion or substitution of one, two, three,
five or more amino acids, without fundamentally altering the
essential activity of the wild type polypeptide.
[0091] Polypeptides of the invention are provided in isolated form,
e.g. isolated from one or more components with which they are
normally found associated in nature. They may be isolated from a
host in which they are naturally expressed, or may be synthetic or
recombinant.
[0092] The present invention also encompasses a method of making a
polypeptide (as disclosed), the method including expression from
nucleic acid encoding the polypeptide (generally nucleic acid
according to the invention). This may conveniently be achieved by
growing a host cell in culture, containing an expression vector as
described above, under appropriate conditions which cause or allow
expression of the polypeptide. Polypeptides may also be expressed
in in vitro systems, such as reticulocyte lysate systems.
[0093] The method may include the step of introducing the nucleic
acid into a host cell. The introduction, which may (particularly
for in vitro introduction) be generally referred to without
limitation as "transformation", may employ any available technique.
For eukaryotic cells, suitable techniques may include calcium
phosphate transfection, DEAE-Dextran, electroporation,
liposome-mediated transfection and transduction using retrovirus or
other virus, e.g. vaccinia or, for insect cells, baculovirus. For
bacterial cells, suitable techniques may include calcium chloride
transformation, conjugation, electroporation and transfection using
bacteriophage. As an alternative, direct injection of the nucleic
acid could be employed. Marker genes such as antibiotic resistance
or sensitivity genes may be used in identifying clones containing
nucleic acid of interest, as is well known in the art.
[0094] Preferred host cells include Actinomycetes, preferably
Streptomycetes, and in particular those selected from the group
consisting of Saccharopolyspora erythraea, Streptomyces coelicolor,
Streptomyces avermitilis, Streptomyces griseofuscus, Streptomyces
cinnamonensis, Micromonospora griseorubida, Streptomyces
hygroscopicus, Streptomyces fradiae, Streptomyces longisporoflavus,
Streptomyces lasaliensis, Streptomyces tsukubaensis, Streptomyces
griseus, Streptomyces venezuelae, Streptomyces antibioticus,
Streptomyces lividans, Streptomyces rimosus, Streptomyces albus,
Streptomyces rochei ATCC23956, Streptomyces parvulus Tu113 and
Streptomyces parvulus Tu4055, more preferably selected from the
group consisting of Streptomyces rochei ATCC23956, Streptomyces
parvulus Tu113 and Streptomyces parvulus Tu4055.
[0095] A polypeptide, peptide fragment, allele, mutant or variant
according to the present invention may be used as an immunogen or
otherwise in obtaining specific antibodies, which may be useful in
purification and other manipulation of polypeptides and peptides,
screening or other applications.
[0096] In another of its aspects the invention provides for the
molecules that may be derived from the objects of the invention and
for modified compounds formed therefrom and for methods for their
production. The molecules derived from the objects of the invention
are shown by formula 1 and extend to pharmaceutically acceptable
salts thereof, wherein:
##STR00001##
R1 is a cycloalkyl group of varying size (n=1-2) and
substituted as shown below;
##STR00002##
wherein R1 can also optionally be substituted with one or more
halo atoms, or one or more C1 to C3 alkyl groups;
R2, R3, R6, R7, R8, R9, or R11
are each independently H, OCH3, CH3 or CH2CH3;
R4 is CN, CO2H, CHO, CH3, CONH2, CHNH; R5 and
R10 are OH; or analogues differing from the corresponding
"natural" compound in the oxidation state of one or more of the
ketide units as shown in FIG. 2 (i.e. selection of alternatives
from the group: --CO--, --CH(OH)--, =CH--, and --CH2--), with
the proviso that said compounds are not borrelidin (1),
12-desnitrile-12-carboxyl borrelidin (2), 10-desmethyl borrelidin
(3), 11-epiborrelidin (4) or C14,C15-cis borrelidin analogue (5) as
shown in FIG. 1. In preferred embodiments:
[0097] (a). R7, R8 and R9 are all CH3.
[0098] (b). R4 is CH3 or COOH.
[0099] (c). R7, R8 and R9 are all CH3 and R4 is CH3 or COOH.
[0100] (d). R1 is cyclobutane-1'-carboxylate.
[0101] (e). R1 is cyclobutane-1'-carboxylate and R7, R8 and R9 are all
CH3.
[0102] (f). R6, R7, R8 and R9 are all CH3, R2 and R11 are H, R5 and R10
are OH, R4 is either CH3, COOH or CN and R1 is
cyclopentane-1'-carboxylate or cyclobutane-1'-carboxylate.
[0103] (g). R1 is cyclobutane-1'-carboxylate, R7, R8 and R9 are all
CH3 and R4 is CH3 or COOH.
The present invention also provides compounds of formula 2 and
pharmaceutically acceptable salts thereof, wherein:
##STR00003##
[0103] R2, R3, R6, R7, R8, R9, or R11 are each independently H,
OCH3, CH3 or CH2CH3; R4 is CN, CO2H, CHO, CH3, CONH2, CHNH;
R5 and R10 are OH; or analogues differing
from the corresponding "natural" compound in the oxidation state of
one or more of the ketide units as shown in FIG. 2 (i.e. selection
of alternatives from the group: --CO--, --CH(OH)--, =CH--, and
--CH2--), and R12 and R13 are independently H or a
C1-C4 alkyl group (which may be optionally substituted with OH, F,
Cl, SH) with the proviso that R12 and R13 are not
simultaneously H. In preferred embodiments:
[0104] (a). R7, R8 and R9 are all CH3.
[0105] (b). R4 is CH3 or COOH.
[0106] (c). R7, R8 and R9 are all CH3 and R4 is CH3 or COOH.
[0107] (d). R12 and R13 are independently CH3 or H.
[0108] (e). R12 and R13 are independently CH3 or H and R7, R8 and R9
are all CH3.
[0109] (f). R6, R7, R8 and R9 are all CH3, R2 and R11 are H, R5 and R10
are OH, R4 is either CH3, COOH or CN and R12 and R13 are
independently CH3 or H.
[0110] (g). R6, R7, R8 and R9 are all CH3, R2 and R11 are H, R5 and R10
are OH, R4 is either CH3, COOH or CN and R12 and R13 are both CH3.
[0111] (h). R12 and R13 are independently CH3 or H, R7, R8 and R9 are
all CH3 and R4 is CH3 or COOH.
[0112] The compounds of the present invention may have tRNA
synthetase-inhibitory activity (e.g. they may inhibit threonyl-,
tyrosinyl-, or tryptophanyl-tRNA synthetase). They may display
anti-microbial activity, including activity against intra- or
extracellular parasites and organisms such as bacteria, spirochetes
(e.g. Treponema), malaria, viruses and fungi. Additionally or
alternatively they may have anti-proliferative activity against
mammalian cells, and/or anti-angiogenic activity, either as a
result of tRNA synthetase inhibition, or through some other mode of
action. This may make the compounds of the present invention
particularly suitable as anti-cancer agents (e.g. agents for
treatment of bowel cancer, prostate cancer or others), and may also
provide application in treatment of other proliferative disorders,
such as psoriasis, or conditions in which inappropriate
vascularisation occurs, such as psoriasis, rheumatoid arthritis,
atherosclerosis and diabetic retinopathy.
[0113] The compounds of the present invention may be formulated
into pharmaceutically acceptable compositions, e.g. by admixture
with a pharmaceutically acceptable excipient, carrier, buffer,
stabiliser or other materials well known to those skilled in the
art. Such compositions also fall within the scope of the present
invention.
[0114] Such pharmaceutically acceptable materials should be
non-toxic and should not interfere with the efficacy of the active
ingredient. The precise nature of the carrier or other material may
depend on the route of administration, e.g. oral, intravenous,
cutaneous or subcutaneous, nasal, intramuscular, intraperitoneal
routes.
[0115] Pharmaceutical compositions for oral administration may be
in tablet, capsule, powder or liquid form. A tablet may include a
solid carrier such as gelatin or an adjuvant. Liquid pharmaceutical
compositions generally include a liquid carrier such as water,
petroleum, animal or vegetable oils, mineral oil or synthetic oil.
Physiological saline solution, dextrose or other saccharide
solution or glycols such as ethylene glycol, propylene glycol or
polyethylene glycol may be included.
[0116] For intravenous, cutaneous or subcutaneous injection, or
injection at the site of affliction, the active ingredient will be
in the form of a parenterally acceptable aqueous solution which is
pyrogen-free and has suitable pH, isotonicity and stability. Those
of relevant skill in the art are well able to prepare suitable
solutions using, for example, isotonic vehicles such as Sodium
Chloride Injection, Ringer's Injection, Lactated Ringer's
Injection. Preservatives, stabilisers, buffers, antioxidants and/or
other additives may be included, as required. Examples of the
techniques and protocols mentioned above can be found in
Remington's Pharmaceutical Sciences, 20th Edition, 2000, pub.
Lippincott, Williams & Wilkins.
[0117] The invention further provides the compounds and
compositions described above for use in a method of medical
treatment. Also provided is the use of the compounds of the
invention in the preparation of a medicament for the treatment of
microbial conditions (including malaria), for the inhibition of
angiogenesis, for the treatment of proliferative disorders, or for
the treatment of conditions characterised by inappropriate
vascularisation, as described above.
BRIEF DESCRIPTION OF THE DRAWINGS
[0118] FIG. 1 illustrates the structure of borrelidin and some
related metabolites isolated from borrelidin producing
organisms.
[0119] FIG. 2 illustrates the incorporation patterns for 13C
stable isotope labelled extension substrates and the position of
the trans-cyclopentane-1,2-dicarboxylic acid starter unit derived
carbons.
[0120] FIG. 3 illustrates the organisation of the borrelidin
biosynthetic gene cluster. Restriction sites: B, BamHI; Bc, BclI;
E, EcoRI; X, XhoI.
[0121] FIG. 4 illustrates a scheme showing the proposed
biosynthetic pathway for the trans-cyclopentane-1,2-dicarboxylic
acid starter unit.
[0122] FIG. 5 illustrates the organisation of the borrelidin PKS
and the biosynthesis of the pre-borrelidin molecule.
[0123] FIG. 6 illustrates the proposed biosynthetic route for the
introduction of the nitrile moiety at the C12 position of
borrelidin.
[0124] FIG. 7 illustrates the proposed structure of the molecule
6.
[0125] FIG. 8 illustrates the proposed structure of the molecules 7
& 8.
[0126] FIG. 9 illustrates the molecular characterisation of the
4-hydroxyphenylacetic acid catabolic pathway in E. coli W.
[0127] FIG. 10 illustrates the structures of the molecules
18-20.
[0128] FIG. 11 illustrates the structures of the molecules
21-26.
DETAILED DESCRIPTION OF THE INVENTION
[0129] A cosmid library of S. parvulus Tu4055 genomic DNA was
constructed using fragments obtained from a partial digestion with
Sau3AI that were cloned into pWE15 and introduced into E. coli
cells using the Gigapack® III Gold Packaging Extract kit
(Stratagene). A library of 3000 E. coli transformants was screened
for homology using a labelled probe that was generated using the
DIG DNA Labelling and Detection Kit (Roche). The probe used was a
1.7 kbp BglII-BamHI fragment obtained from the gene that encodes
module 6 of the third subunit of the oleandomycin PKS from
Streptomyces antibioticus (Swan et al., 1994).
[0130] Clones that gave a positive response were selected and
cosmid DNA isolated. Cosmid DNA was digested with BamHI and
fragments less than 3 kbp in size were sub-cloned into pOJ260
(Bierman et al., 1992). The plasmids were then used to transform S.
parvulus Tu4055 protoplasts and resulting mutants were screened for
the ability to produce borrelidin. Two mutants were identified as
borrelidin non-producers, both of which were derived from plasmids
that contained fragments of cosBor32A2. These two fragments were of
1.97 and 2.80 kbp in size, and were later identified as adjacent
fragments encoding parts of the borrelidin PKS (borA2 & borA3).
Using cosBor32A2 as the probe, a second overlapping cosmid,
cosBor19B9 was identified from the original library. These two
cosmids are sufficient to cover the entire borrelidin biosynthetic
gene cluster (see FIG. 3).
[0131] The complete nucleotide sequence of cosBor32A2 and
cosBor19B9 was determined by shotgun sequencing of a Sau3AI-derived
subclone library for each cosmid, consisting of 1.5-2.0 kbp
fragments in pHSG397 (Takeshita et al., 1987). Specific details are
provided in example 3. The complete, overlapping nucleotide-coding
sequence for cosBor32A2 and cosBor19B9 is presented as SEQ ID No.1.
The region encoded by cosmid cosBor32A2 represents the sequence
from nucleotide positions 0-40217 bp of SEQ ID No.1. The region
encoded by cosmid cosBor19B9 overlaps this region by 4452
nucleotides, and corresponds to the nucleotide positions
35766-74787 bp of SEQ ID No.1. As described in more detail in the
following text, we have performed gene inactivation experiments on
many of the orfs identified to be encoded within SEQ ID No.1, and
this leads us to identify the limits of the cluster. The borrelidin
biosynthetic gene cluster is contained between nucleotide positions
7603 to 59966 of SEQ ID No.1 (borB to borO, which includes the borA
region). Thus, these combined efforts have led us to the
identification and sequencing of the DNA region encompassing the
entire borrelidin biosynthetic gene cluster, and to the
identification and description of the functional sequences encoded
within this region.
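The coordinates quoted above can be cross-checked with simple interval arithmetic. The sketch below treats every range as inclusive at both ends, which is an assumption about how the positions are counted; the variable and function names are likewise illustrative.

```python
# Nucleotide ranges in SEQ ID No.1 as stated in the text (assumed inclusive).
COSMIDS = {
    "cosBor32A2": (0, 40217),
    "cosBor19B9": (35766, 74787),
}
CLUSTER = (7603, 59966)  # borB to borO


def overlap(a, b):
    """Return the inclusive overlap of two inclusive ranges, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def length(r):
    """Number of positions in an inclusive range."""
    return r[1] - r[0] + 1


def contains(outer, inner):
    """True if the inclusive range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


shared = overlap(COSMIDS["cosBor32A2"], COSMIDS["cosBor19B9"])
union = (min(s for s, _ in COSMIDS.values()), max(e for _, e in COSMIDS.values()))

print("overlap:", shared, length(shared), "nt")
print("cluster within either cosmid alone:",
      any(contains(c, CLUSTER) for c in COSMIDS.values()))
print("cluster within the two cosmids combined:", contains(union, CLUSTER))
```

Under the inclusive-range assumption the overlap comes to 4452 nucleotides, matching the figure given in the text, and the cluster is covered only by the two cosmids taken together.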
PKS Genes
[0132] Encoded between positions 16184-50742 of SEQ ID No.1 are 6
orfs that display very high homology to the genes that encode the
PKSs of known macrolide producing organisms. These genes are
designated borA1, borA2, borA3, borA4, borA5 and borA6, and encode
the borrelidin PKS as was demonstrated above by disruption of a
1.97 kbp region within borA2. The six orfs are arranged in a
head-to-tail manner and each is terminated by an in-frame stop
codon. The nucleotide sequence and corresponding polypeptide
sequence details are shown below in Table 1:
TABLE 1
PKS encoding gene   Nucleotide position in SEQ ID No. 1   Corresponding polypeptide sequence number
borA1               16184-18814                           SEQ ID No. 2
borA2               18875-23590                           SEQ ID No. 3
borA3               23686-34188                           SEQ ID No. 4
borA4               34185-39047                           SEQ ID No. 5
borA5               39122-45514                           SEQ ID No. 6
borA6               45514-50742                           SEQ ID No. 7
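As a consistency check on Table 1, the length of each open reading frame can be computed from its coordinates. The sketch assumes that the quoted ranges are inclusive and that each range includes the terminating stop codon; neither assumption is stated explicitly in the table, and the names used are illustrative.

```python
# Coordinates in SEQ ID No.1 copied from Table 1 (assumed inclusive).
PKS_GENES = {
    "borA1": (16184, 18814),
    "borA2": (18875, 23590),
    "borA3": (23686, 34188),
    "borA4": (34185, 39047),
    "borA5": (39122, 45514),
    "borA6": (45514, 50742),
}


def orf_summary(start: int, end: int):
    """Return (length in nt, residues excluding the stop codon) for an ORF.

    Assumes the range includes the stop codon, so residues = codons - 1.
    """
    nt = end - start + 1
    if nt % 3:
        raise ValueError("ORF length is not a whole number of codons")
    return nt, nt // 3 - 1


for gene, (start, end) in PKS_GENES.items():
    nt, aa = orf_summary(start, end)
    print(f"{gene}: {nt} nt, {aa} aa")
```

Every range in Table 1 is a whole number of codons, as expected for open reading frames each terminated by an in-frame stop codon.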
[0133] The gene borA1 encodes the starter or loading module (SEQ ID
No.1, position 16184-18814). The assignment of the start codon is
not obvious for this open reading frame. The start codon given here
is what we believe to be the true start codon, but there are at
least another three possible start codons between the first and the
beginning of the AT0 domain sequence and a person of skill in the
art will appreciate that it may be possible to generate active
protein using one of these alternative start codons. The start
codon given here leaves a significant N-terminal tail of 321 amino
acids preceding the AT0 domain. For comparison the N-terminal tail
preceding the AT0 of the erythromycin loading module is 108 amino
acids and that of the avermectin loading module is 28 amino acids.
It is therefore possible that one of the other candidate start
codons could be correct; the most likely of these are at positions
16298, 16607 and 16901 of SEQ ID No.1. The length of the N-terminal
tail suggests it could possibly represent a catalytic activity,
although it does not have any significant homology to other
sequences in the databases. The nucleotide sequence position and
the corresponding amino acid sequence for each of the functional
domains within the starter module are identified below in Table
2:
TABLE 2
Domain in borA1   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 2
AT0               17147-18175             322-664
ACP0              18263-18472             694-763
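The relationship between the base positions and the amino acid positions in Table 2 follows from the position of the start codon. The sketch below maps one to the other and also shows how the length of the N-terminal tail preceding AT0 would change for each of the alternative start codons mentioned above. The function name is an assumption, and positions are taken as 1-based and inclusive.

```python
def domain_residues(gene_start: int, nt_start: int, nt_end: int):
    """Map an in-frame, inclusive nucleotide range to 1-based residue positions
    in the polypeptide whose first codon begins at gene_start."""
    offset = nt_start - gene_start
    span = nt_end - gene_start + 1
    if offset % 3 or span % 3:
        raise ValueError("domain boundaries are not in frame with the gene start")
    return offset // 3 + 1, span // 3


BORA1_START = 16184
AT0 = (17147, 18175)
ACP0 = (18263, 18472)

print("AT0 :", domain_residues(BORA1_START, *AT0))
print("ACP0:", domain_residues(BORA1_START, *ACP0))

# Length of the N-terminal tail preceding AT0 for each candidate start codon.
for start in (16184, 16298, 16607, 16901):
    first_residue, _ = domain_residues(start, *AT0)
    print(f"start {start}: tail of {first_residue - 1} residues")
```

With the start codon at position 16184 the tail is 321 residues, as stated in the text; the three alternative start codons are all in frame with it and would give progressively shorter tails.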
[0134] The gene borA2 encodes the first extension module (SEQ ID
No.1, position 18875-23590). The nucleotide sequence position and
the corresponding amino acid sequence for each of the functional
domains within the first extension module are identified below in
Table 3:
TABLE 3
Domain in borA2   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 3
KS1               18974-20251             34-459
AT1               20543-21529             557-885
KR1               22280-23011             1136-1379
ACP1              23129-23332             1419-1486
[0135] The gene borA3 encodes the second and third extension
modules (SEQ ID No.1, position 23686-34188). The nucleotide
sequence position and the corresponding amino acid sequence for
each of the functional domains within the second and third
extension modules are identified below in Table 4:
TABLE 4
Domain in borA3   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 4
KS2               23785-25062             34-459
AT2               25360-26346             559-887
DH2               26392-26835             903-1050
KR2               27745-28476             1354-1597
ACP2              28567-28767             1628-1694
KS3               28855-30132             1724-2149
AT3               30418-31413             2245-2576
DH3               31462-31887             2593-2734
KR3               32863-33606             3060-3307
ACP3              33703-33903             3340-3406
[0136] The gene borA4 encodes the fourth extension module (SEQ ID
No.1, position 34185-39047). The nucleotide sequence position and
the corresponding amino acid sequence for each of the functional
domains within the fourth extension module are identified below in
Table 5:
TABLE 5
Domain in borA4   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 5
KS4               34284-35561             34-459
AT4               35847-36842             555-886
KR4               37719-38453             1179-1423
ACP4              38559-38759             1459-1525
[0137] The gene borA5 encodes the fifth extension module (SEQ ID
No.1, position 39122-45514). The nucleotide sequence position and
the corresponding amino acid sequence for each of the functional
domains within the fifth extension module are identified below in
Table 6:
TABLE 6
Domain in borA5   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 6
KS5               39221-40492             34-457
AT5               40778-41785             553-888
DH5               41834-42259             905-1046
ER5               43322-44191             1401-1690
KR5               44207-44947             1696-1942
ACP5              45044-45244             1975-2041
[0138] The gene borA6 encodes the sixth extension module and the
chain terminating thioesterase (SEQ ID No.1, position 45514-50742).
The nucleotide sequence position and the corresponding amino acid
sequence for each of the functional domains within the sixth
extension module are identified below in Table 7:
TABLE 7
Domain in borA6   Bases in SEQ ID No. 1   Amino acids in SEQ ID No. 7
KS6               45622-46884             37-457
AT6               47176-48162             555-883
KR6               48814-49518             1101-1335
ACP6              49624-49824             1371-1437
TE                49894-50637             1461-1708
[0139] The identification of functional domains and their
boundaries as described above is based
on the similarities to the conserved amino acid sequences of other
modular PKSs such as those for the rapamycin (Schwecke et al.,
1995; Aparicio et al., 1996) and erythromycin (Cortes et al., 1990)
biosynthesis. The limits of the catalytic domains are established
on the basis of homology to other PKS clusters and the chosen point
at which a domain starts or finishes is not absolutely defined, but
selected based on the aforementioned considerations. In the case of
the β-keto processing domains it is least obvious, as there is
typically a large region not assigned to a functional domain that
precedes the KR domain. This region may be structurally important,
or required for stability of the PKS dimer. An unusual
characteristic of the borrelidin PKS is that all of the individual
enzymatic domains appear to be catalytically competent based on
their nucleotide/amino acid sequence, and are all necessary in
order to provide the β-keto processing required to produce the
functional groups observed in borrelidin. This is rather unusual as
the majority of modular PKS sequences so far reported contain one
or more inactive domains, an exception being for example the
spinosyn PKS (Waldron et al., 2001; U.S. Pat. No. 6,274,50).
[0140] One skilled in the art is familiar with the degeneracy of
the genetic code; the skilled artisan can therefore modify the
specific DNA sequences provided by this disclosure to provide
proteins having the same or altered or improved characteristics
compared to those polypeptides specifically provided herein. One
skilled in the art can also modify the DNA sequences to express an
identical polypeptide to those provided, albeit expressed at higher
levels. Furthermore, one skilled in the art is familiar with means
to prepare synthetically, either partially or in whole, DNA
sequences which would be useful in preparing recombinant DNA
vectors or coding sequences which are encompassed by the current
invention. Additionally, recombinant means for modifying the DNA
sequences provided may include for example site-directed deletion
or site-directed mutagenesis. These techniques are well known to
those skilled in the art and need no further explanation here.
Consequently, as used herein, DNA which is isolated from natural
sources, prepared synthetically or semi-synthetically, or which is
modified by recombinant DNA methods, is within the scope of the
present invention.
[0141] Likewise, those skilled in the art will recognize that the
polypeptides of the invention may be expressed recombinantly.
Alternatively, those polypeptides may be synthesised either in
whole or in part, by conventional known non-recombinant techniques;
for example, solid phase synthesis. Thus, the present invention
should not be construed as necessarily limited to any specific
vector constructions or means for production of the specific
biosynthetic cluster molecules including the polyketide synthase
molecules exemplified.
[0142] The loading module of the borrelidin PKS exists as a
discrete protein. This is rather unusual as the majority of loading
modules are found on the same protein as the first extension
module. Exceptions to this include, for example, the nystatin
(Brautaset et al., 2000) and amphotericin (Caffrey et al., 2001)
PKSs. The loading module, which consists of an AT-ACP didomain, is
similar to the broad specificity loading module of the avermectin
PKS, which accepts a number of alternative starter acids and is of
use in generating libraries of novel polyketides (Marsden et al.,
1998; Pacey et al., 1998). The AT domain of the borrelidin PKS
loading module diverges from the vast majority of AT domains as the
active site serine residue is replaced with a cysteine such that
the active site motif is GXCXG (specifically GHCYG). In most
available type-I PKS AT domain sequences, the conserved active site
motif is GXSXG; the same motif is observed in lipases, fatty acid
synthases and most thioesterases. The nucleophilic serine is
substituted by cysteine in two NRPS thioesterase domains,
specifically the synthetases responsible for the production of
mycobactin and pyochelin (Shaw-Reid et al., 1999). A GXCXG motif is
also observed in a thioesterase-like domain of ORF1 in the
bialaphos cluster (Raibaud et al., 1991). It has been suggested
that since it is not possible to move between the two types of
serine codons by a single base change, active sites containing an
essential serine residue may lie on two lines of descent from an
ancient ancestral enzyme that had a cysteine instead of a serine in
its active site (Brenner, 1988). The presence of enzymes containing
cysteine in the active site may support this view. It may
alternatively be the case that cysteine arises in these active
sites because it is possible to move from one type of serine codon
to the other via a cysteine which would remain active.
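The codon argument above can be checked by enumeration. The sketch below, which assumes the standard genetic code, measures the minimum number of base changes between the two families of serine codons (TCN and AGY) and lists every codon that lies one base change from both families. The variable names are illustrative and are not taken from the cited work.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order
# (first base slowest, third base fastest).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


serine = [c for c, aa in CODON_TABLE.items() if aa == "S"]
tcn = [c for c in serine if c.startswith("TC")]
agy = [c for c in serine if c.startswith("AG")]

# Fewest single-base changes needed to pass between the two families.
min_distance = min(hamming(a, b) for a in tcn for b in agy)

# Codons exactly one change away from a member of each family.
bridges = sorted(
    c for c in CODON_TABLE
    if any(hamming(c, a) == 1 for a in tcn)
    and any(hamming(c, b) == 1 for b in agy)
)
bridge_residues = {c: CODON_TABLE[c] for c in bridges}
print(min_distance, bridge_residues)
```

The enumeration returns the cysteine codons TGT and TGC as one-step intermediates, in line with the passage; it also returns threonine codons, which the passage does not discuss.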
[0143] The AT domains of PKSs select a particular carboxylic acid
unit as substrate. This selectivity has been shown to correlate
with certain motif signatures within the AT domain (Reeves et al.,
2001; WO 02/14482). The borrelidin loading module AT domain motif
differs from any described so far, which is not surprising as this
AT domain is the first to be sequenced that selects an alicyclic
dicarboxylic acid. The AT domains for the borrelidin PKS extension
modules display the expected active site motif GXSXG, and also each
contain the expected motifs for the selection of malonyl-CoA or
methylmalonyl-CoA (Reeves et al., 2001; WO 02/14482). The
malonyl-CoA selective AT domains (AT1, AT2 and AT6) show very high
similarity to one another, both at the protein and at the DNA
level. The same is true for the methylmalonyl-CoA selective AT
domains (AT3, AT4 and AT5); two of these AT domains (AT3 and AT4)
have identical amino acid sequences throughout the conserved
region. The high similarity of AT5 to AT3 and AT4 is evidence that
the extender unit selected in module 5 is methylmalonyl-CoA, and
that the borrelidin C12-methyl group thus incorporated is
subsequently modified to a nitrile function after incorporation
into the PKS.
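By way of illustration, the active site motifs discussed above (GXSXG in the extension module AT domains, GXCXG in the loading module AT domain) can be located in a protein sequence with a simple pattern search. The sketch below is an assumption-laden example: the two sequence fragments are hypothetical toy strings, not excerpts from SEQ ID Nos. 2 to 7, and only the GHCYG motif itself is taken from the text.

```python
import re


def find_active_site_motifs(protein):
    """Return (1-based position, motif, nucleophile) for each G-X-[S/C]-X-G hit.

    A lookahead is used so that overlapping motifs are not skipped.
    """
    hits = []
    for m in re.finditer(r"(?=(G.([SC]).G))", protein):
        hits.append((m.start() + 1, m.group(1), m.group(2)))
    return hits


# Hypothetical toy fragments for demonstration only.
loading_like = "AAVLGHCYGELTA"    # carries the GHCYG motif named in the text
extension_like = "AAVLGHSQGEIAA"  # carries an ordinary GXSXG motif

print(find_active_site_motifs(loading_like))
print(find_active_site_motifs(extension_like))
```

A search of this kind reports only the presence and position of the motif; assignment of extender unit specificity relies on the separate selectivity motifs cited above.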
[0144] To demonstrate that we can alter the PKS derived structure
of borrelidin, the AT domain of module 4 (the AT domain encoded by
borA4) is replaced by the AT domain of module 2 of the rapamycin
PKS (rapAT2) using a replacement strategy (see example 6). This
gives strain S. parvulus Tu4055/467. Upon fermentation and LCMS
analysis of culture extracts of this mutant, it can be determined
that some borrelidin is produced and a new, more polar compound is
also observed with an m/z value 14 units lower than borrelidin. This
is consistent with incorporation of a malonate rather than a
methylmalonate extender unit by module 4 of the PKS to produce
10-desmethyl borrelidin 3.
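The 14 unit shift follows from the loss of a single CH2 group when a malonate extender replaces a methylmalonate extender. A minimal sketch of the arithmetic, assuming standard monoisotopic masses for carbon and hydrogen:

```python
# Monoisotopic masses of the most abundant isotopes (standard values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503}


def mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


# A methyl branch in place of a hydrogen corresponds to one extra CH2 unit.
ch2 = mass({"C": 1, "H": 2})
print(round(ch2, 3))  # 14.016
```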
[0145] In addition to production by domain swapping methods, 3 is
also generated by introducing specific mutations into the module 4
AT domain selectivity motif (Reeves et al., 2001; WO 02/14482) (see
example 7). Such a change affects the selectivity of the AT domain
such that it selects a substrate molecule of malonyl-CoA
preferentially over methylmalonyl-CoA. Thus, the amino acid motif
YASH at positions 739 to 742 of SEQ ID No.5 is mutated to HAFH to
give strain S. parvulus Tu4055/472. Upon fermentation and LCMS
analysis of culture extracts of this mutant it is determined that
borrelidin is produced in addition to a new, more polar compound
with an m/z value 14 units lower than borrelidin. This new compound
is identical to that described above and thus is consistent with
incorporation of a malonate rather than a methylmalonate extender
unit by module 4 of the PKS to produce 3.
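The motif change described above can be represented in silico as a checked substitution at fixed residue positions. The sketch below uses a hypothetical placeholder sequence in which only positions 739 to 742 mirror the text; it does not reproduce SEQ ID No. 5, and the function name is illustrative.

```python
def mutate_motif(protein, start, expected, replacement):
    """Replace `expected` at 1-based position `start`.

    The call is refused if the residues found at that position differ from
    `expected`, or if the replacement would change the sequence length.
    """
    i = start - 1
    found = protein[i:i + len(expected)]
    if found != expected:
        raise ValueError(f"expected {expected} at {start}, found {found!r}")
    if len(replacement) != len(expected):
        raise ValueError("replacement must preserve length")
    return protein[:i] + replacement + protein[i + len(expected):]


# Hypothetical placeholder: poly-alanine with YASH at positions 739-742.
toy_bora4 = "A" * 738 + "YASH" + "A" * 20
mutant = mutate_motif(toy_bora4, 739, "YASH", "HAFH")
print(mutant[735:745])
```

Checking the expected residues before substitution guards against numbering errors, for example where a different candidate start codon has been used to number the protein.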
[0146] These results clearly indicate that the borrelidin PKS is
amenable to genetic manipulation and to the exchange of native
sequence for that of a heterologous strain. It is clear to one
skilled in the art that the biosynthetic engineering, by the
methods described above, of the borrelidin PKS will lead to the
production of novel borrelidin-like molecules.
[0147] The borrelidin loading module is of interest due to the
unique structure of its cognate substrate. To examine its potential
use in other systems, the loading module native to the erythromycin
PKS is replaced with the borrelidin loading module in
Saccharopolyspora erythraea; this experiment is analogous to those
done previously with the avermectin loading module (WO 98/01546;
Marsden et al., 1998). We anticipate that the new strain is capable
of producing novel erythromycin-like molecules in which the
C13-ethyl group is replaced with an exogenously supplied racemic
trans-cyclopentane-1,2-dicarboxylic acid moiety. The methodology
used to perform this experiment is similar to that described in WO
98/01546, but the transformation is performed using a mutant
Saccharopolyspora erythraea DM (Gaisser et al., 2000) which
accumulates the aglycone product erythronolide B rather than the
fully processed macrolide, as well as using S. erythraea WT. This
experiment is described in example 8.
[0148] It is not evident from SEQ ID No.1 which of four candidate
start codons is correct for borA1. The four most obvious candidate
start codons are at nucleotides 16184, 16298, 16607 and 16901 of
SEQ ID No.1. The earliest of these possible start codons was used
in giving the amino acid sequence for SEQ ID No.2. A pile-up of
this loading module with the erythromycin and avermectin loading
modules indicates that the AT0 domain starts at position 321 of SEQ
ID No.2, and that there is a long N-terminal tail. No significant
homology is found for the first 298 amino acids of borA1. The
borrelidin loading module is encoded by a discrete orf, and in
order to retain this architecture the splice site chosen for
joining the borrelidin PKS loading module sequence to the
erythromycin PKS loading module sequence is at the beginning of the
homologous region of the KS1 domain of borA2, at amino acids 42-44
of SEQ ID No.3. This approach maintains the putative docking
regions at the end of BorA1 and start of BorA2 that are believed to
be essential for the production of a functional PKS assembly. To
maintain the continuity of this experiment this loading module is
fused to the equivalent point at the beginning of the KS1 domain of
eryA1. The resulting mutants S. erythraea DM/CJM400-403 are
fermented and analysed by negative ion LCMS using standard
protocols. This analysis clearly indicates the presence of a new
compound 6 with m/z=485.3 as expected (FIG. 7). It is clear to one
skilled in the art that the products of these experiments could be
biotransformed using an appropriate strain such as S. erythraea JC2
(Rowe et al., 1998) to provide novel, biologically active
erythromycin analogues. It is additionally clear to one skilled in
the art that the borrelidin loading module has utility for the
biosynthetic engineering of other PKSs (i.e. not the borrelidin
PKS) to produce further novel polyketides bearing a
trans-cyclopentane-1,2-dicarboxylic acid moiety. It is also clear
that the diversity of products arising from hybrid PKSs derived
from the borrelidin loading module may be further enhanced through
the exogenous feeding of carboxylic acids other than the cognate
substrate.
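The consequence of choosing a later candidate start codon for borA1 can be illustrated with a short calculation. The sketch below takes the four candidate positions and the AT0 start (residue 321 of SEQ ID No.2) from the text, checks that the candidates share a reading frame, and reports how many residues would precede AT0 for each choice. The variable names are illustrative.

```python
CANDIDATE_STARTS = [16184, 16298, 16607, 16901]  # bases in SEQ ID No. 1
AT0_START_IN_SEQ2 = 321  # numbering based on the earliest candidate start


def residue_offset(start, earliest):
    """Number of N-terminal residues lost if translation begins at `start`."""
    delta = start - earliest
    if delta % 3:
        raise ValueError(f"{start} is not in frame with {earliest}")
    return delta // 3


earliest = CANDIDATE_STARTS[0]
tail_lengths = {}
for start in CANDIDATE_STARTS:
    lost = residue_offset(start, earliest)
    # Residues preceding AT0 when this candidate is used as the start codon.
    tail_lengths[start] = AT0_START_IN_SEQ2 - 1 - lost

for start, tail in tail_lengths.items():
    print(start, tail)
```

On these figures all four candidates are in frame with one another, and each leaves an N-terminal tail ahead of the AT0 domain.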
[0149] The most striking feature of the borrelidin PKS is the clear
divergence from the normal co-linear, processive mode of operation
for type-I modular PKSs. Borrelidin is a nonaketide (expected: one
loading plus eight extension steps), but only seven modules (one
loading and six extension modules) are present in the cluster.
Analysis of the PKS domains with respect to the chemical structure
of borrelidin correlates with the fifth extension module (BorA5)
being used iteratively for three rounds of chain elongation as
shown in FIG. 5. Thus, the fifth, sixth and seventh rounds of chain
elongation occur on BorA5 with the incorporation of three
methylmalonyl-CoA extension units, and with full reductive
processing of the β-keto groups to methylene moieties. As
described supra, the divergence from co-linear operation for
modular PKSs is unusual and limited to a few examples. The present
example is interesting as it occurs on a module that reduces the
β-keto group fully to a methylene moiety and which is followed
by an inter- rather than intra-protein transfer of the growing
chain. This is also the case for the two known examples of
erroneous iterative use of type-I modules by the erythromycin
(Wilkinson et al., 2000) and epothilone (Hardt et al., 2001) PKSs.
It is noteworthy that this full reduction makes these modules
functionally equivalent to fatty acid synthase (FAS). The type-I
PKS modules that can operate iteratively may have retained FAS-like
activity.
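The module count can be reconciled with the nonaketide structure by a simple tally. The sketch below assumes the extender unit assignments given earlier (AT1, AT2 and AT6 selecting malonyl-CoA; AT3, AT4 and AT5 selecting methylmalonyl-CoA) and compares strictly co-linear operation with three-fold use of extension module 5. It is an accounting aid only and does not model the enzymology.

```python
# (extension module number, extender unit selected by its AT domain)
EXTENSION_MODULES = [
    (1, "malonyl-CoA"),
    (2, "malonyl-CoA"),
    (3, "methylmalonyl-CoA"),
    (4, "methylmalonyl-CoA"),
    (5, "methylmalonyl-CoA"),
    (6, "malonyl-CoA"),
]


def elongation_rounds(modules, iterations):
    """Expand the module list into successive rounds of chain elongation.

    `iterations` maps a module number to the number of times it is used;
    modules not listed are used once.
    """
    rounds = []
    for number, extender in modules:
        for _ in range(iterations.get(number, 1)):
            rounds.append((number, extender))
    return rounds


colinear = elongation_rounds(EXTENSION_MODULES, {})
iterative = elongation_rounds(EXTENSION_MODULES, {5: 3})

# Ketide units = one starter unit plus one unit per elongation round.
print(len(colinear) + 1, len(iterative) + 1)
print([number for number, _ in iterative])
```

With module 5 used three times, the fifth, sixth and seventh rounds fall on that module and the total reaches nine ketide units, as required for a nonaketide.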
[0150] Although it appears that BorA5 is used iteratively (three
times), two other possible scenarios may explain borrelidin
biosynthesis given the genes present in the borrelidin biosynthetic
cluster. Firstly, two modules may be `missing` from the cluster,
but could be present at some other location in the genome. However,
in the majority of cases investigated, the genes required for
biosynthesis of secondary metabolites in actinomycetes are
clustered in a single locus. The second possibility is that three
separate BorA5 dimers assemble, and that each catalyses a round of
chain elongation; thus the process would be processive. However,
this scenario requires that three times the amount of BorA5 is
produced with respect to the other PKS proteins, but the
organisation of the borrelidin gene cluster does not indicate that
the regulation of borA5 differs from that of any of the other PKS
genes. In addition, this scenario does not fit with the common
thinking as to the roles of inter-protein docking domains, which
suggests that there is a specific recognition between the N- and
C-terminal ends of the proteins of the biosynthetic complex that
need to interact, enabling specific binding between modules encoded
on different proteins (Ranganathan et al., 1999; Wu et al., 2001;
Broadhurst et al., 2003).
[0151] To address the issues described above, the two proteins
encoded by borA4 and borA5 were fused after manipulation at the
genetic level to provide strain S. parvulus Tu4055/borA4-A5 (see
example 9), and separately the two proteins encoded by borA5 and
borA6 were fused in an analogous manner to provide strain S.
parvulus Tu4055/borA5-A6 (see example 10). Additionally, a double
mutant was generated in which the above described fusions were
combined to generate a strain in which borA4, borA5 and borA6 were
fused to generate strain S. parvulus Tu4055/borA4-A5-A6 (see
example 11). Therefore, the new, fused, bi- and tri-modular genes
make it impossible to assemble three separate molecules of BorA5,
or for another protein(s) encoded by a gene(s) remote from the
borrelidin cluster to act in tandem with BorA5. Upon fermentation
of strains S. parvulus Tu4055/borA4-A5, /borA5-A6, and /borA4-A5-A6
followed by extraction and analysis, the production of borrelidin
was verified at a reduced but significant level (21 ± 4%, 27 ± 4%
and 18 ± 5% respectively) when compared to the WT strain. Thus,
the production of borrelidin by these mutants indicates that module
5 of the fused BorA4-A5 or BorA5-A6 operates in an iterative
manner. Since the priority filing of this application, these
limited data have been published (Olano et al., 2003).
[0152] The ability of BorA5 to operate iteratively has great
potential for the engineering of heterologous PKSs to provide
macrolactones with expanded ring sizes. To examine this possibility
BorA5 is swapped into the erythromycin PKS in place of module 4 of
DEBS2. This is done by replacement of the appropriate gene fragment
in both the erythromycin producer S. erythraea WT and S. erythraea
DM. This experiment is chosen as both modules recruit
methylmalonyl-CoA extender units and process the β-keto
functions formed through to methylene groups. In addition, the
stereochemistry of the resulting methyl group in the polyketide
chain is the same in both cases. Of most significance is the fact
that module 4 of DEBS2 is known to perform erroneous iterative
rounds of chain elongation (Wilkinson et al., 2000), indicating
that such a process can indeed occur at this location within the
PKS and give rise to products that can be fully processed by DEBS3,
making it an attractive target to introduce specific iterative use
of a heterologous module to make 16- and 18-membered
macrolides.
[0153] Briefly, the region of DNA encoding borA5 is swapped for
that encoded by module 4 of eryA2, which encodes the C-terminal
portion of DEBS2 of the erythromycin PKS (see example 12). The
resulting mutant S. erythraea DM/421 is grown and extracted as for
the production of metabolites by S. erythraea strains (Wilkinson et
al., 2000) and then analysed by LCMS. Two new significant
compounds, which are less polar than erythronolide B, are observed.
These have an m/z of 435.5 (7, [MNa⁺]) and 477.5 (8,
[MNa⁺]) respectively, which is consistent with the production
of two new ring expanded erythronolide B analogues (FIG. 8).
Compound 7 with m/z=435.5 is consistent with the presence of the
16-membered ring-expanded erythronolide B related macrolide
reported previously as a minor component of S. erythraea WT
fermentations (Wilkinson et al., 2000). It is clear to one skilled
in the art that such new products can be converted to antibacterial
molecules by biotransformation with an appropriate organism. It is
also clear to one skilled in the art, that the inclusion of such a
module into other positions of the erythromycin PKS, or into other
PKSs, may allow the production of novel, ring expanded polyketides
in a similar manner. In addition, it is possible to perform this
experiment by swapping only the region of the DEBS module 4 from
the start of the conserved region of the KS4 to the end of the ACP4
domain; this arrangement retains the C- and N-terminal regions at
the end of DEBS2 and DEBS3 respectively, to ensure the mutual
recognition and docking of these proteins.
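The relationship between the two new compounds can be checked arithmetically. The sketch below assumes that each additional round of elongation with a fully reduced methylmalonate-derived unit adds C3H6 to the chain and two atoms to the macrolactone ring, and that the parent erythronolide ring is 14-membered; the observed m/z values are those reported above for compounds 7 and 8.

```python
C, H = 12.0, 1.00782503  # standard monoisotopic masses

# One fully reduced methylmalonate-derived unit: -CH(CH3)-CH2-, i.e. C3H6.
c3h6 = 3 * C + 6 * H

observed = {"7": 435.5, "8": 477.5}  # m/z values reported in the text
delta = observed["8"] - observed["7"]


def ring_size(base_ring, extra_rounds):
    """Each extra elongation round inserts two backbone carbons into the ring."""
    return base_ring + 2 * extra_rounds


print(round(c3h6, 3), delta)
print(ring_size(14, 1), ring_size(14, 2))
```

On these assumptions the difference between 7 and 8 corresponds to one additional C3H6 unit, and one or two extra rounds give 16- and 18-membered rings respectively.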
Non-PKS Genes
[0154] Both upstream and downstream of the PKS encoding genes are
other orfs involved in the biosynthesis of borrelidin. An orf is
designated as consisting of at least 100 contiguous nucleotides,
that begins with an appropriate start codon and finishes with an
appropriate stop codon, and which has an appropriate codon bias for
protein-coding regions of an organism whose DNA is rich in the
nucleotides guanine and cytosine. In the DNA sequence both upstream
and downstream of the borrelidin PKS genes (borA1-borA6) there are
a number of orfs that could be identified by comparison to other
sequences in the NCBI database (see FIG. 3). The nucleotide
sequence details of these orfs are given below in Table 8:
TABLE 8
Gene | Bases in SEQ ID No. 1 | Corresponding polypeptide sequence number
borB | 7603-8397c | SEQ ID No. 8
borC | 8397-9194c | SEQ ID No. 9
borD | 9244-9996c | SEQ ID No. 10
borE | 9993-11165c | SEQ ID No. 11
borF | 11162-11980c | SEQ ID No. 12
borG | 11992-13611c | SEQ ID No. 13
borH | 13608-15599c * | SEQ ID No. 14
borI | 50739 *-52019 | SEQ ID No. 15
borJ | 52113-53477 | SEQ ID No. 16
borK | 53486-54466 | SEQ ID No. 17
borL | 54506-56176 | SEQ ID No. 18
borM | 56181 *-57098 | SEQ ID No. 19
borN | 57112-57858 | SEQ ID No. 20
borO | 57939-59966 | SEQ ID No. 21
orfB1 | 2-313 | SEQ ID No. 22
orfB2 | 501 *-3107 | SEQ ID No. 23
orfB3 | 3172-3810c | SEQ ID No. 24
orfB4 | 3935-4924c | SEQ ID No. 25
orfB5 | 5123-5953 | SEQ ID No. 26
orfB6 | 5961-6518 *c | SEQ ID No. 27
orfB7 | 6564 *-7538 | SEQ ID No. 28
orfB8 | 60153-60533c | SEQ ID No. 29
orfB9 | 60620-61003 | SEQ ID No. 30
orfB10 | 61188 *-61436 | SEQ ID No. 31
orfB11 | 61526-61738 | SEQ ID No. 32
orfB12 | 61767-62285c | SEQ ID No. 33
orfB13a | 62750-63067c | SEQ ID No. 34
orfB13b | 62586-62858c | SEQ ID No. 113
orfB14 | 63155-65071c | SEQ ID No. 35
orfB15 | 65374-65871 | SEQ ID No. 36
orfB16 | 65942-68305c * | SEQ ID No. 37
orfB17 | 68290-68910c * | SEQ ID No. 38
orfB18 | 69681-70436 | SEQ ID No. 39
orfB19 | 70445-71848 | SEQ ID No. 40
orfB20 | 71851-72957 | SEQ ID No. 41
orfB21 | 73037-73942 | SEQ ID No. 42
orfB22 | 73995-74534c | SEQ ID No. 43
[Note 1: c indicates that the gene is encoded by the
complement DNA strand; Note 2: for each open reading frame given
above, the longest probable open reading frame is described. It is
sometimes the case that more than one potential candidate start
codon can be identified. One skilled in the art will recognise
this and be able to identify alternative possible start codons. We
have indicated those genes which have more than one possible start
codon with a `*` symbol. Throughout we have indicated what we
believe to be the start codon; however, a person of skill in the
art will appreciate that it may be possible to generate active
protein using an alternative start codon, and proteins generated using
these alternative start codons are also considered within the scope
of the present invention. Note 3: the SEQ ID NO for orfB13b was
originally designated SEQ ID NO: 34 but for clarity a separate
sequence and SEQ ID NO has been assigned.]
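The orf definition given in paragraph [0154] (at least 100 contiguous nucleotides, bounded by a start and a stop codon, with a codon bias typical of GC-rich DNA) can be sketched as a scan over both strands. The following is an illustrative example only and is not the procedure used to compile Table 8. The set of start codons (ATG, GTG, TTG) and the use of third-position GC content as a proxy for codon bias are assumptions, and all names are hypothetical.

```python
STARTS = {"ATG", "GTG", "TTG"}  # assumed set of acceptable start codons
STOPS = {"TAA", "TAG", "TGA"}
MIN_LENGTH = 100  # nucleotides, per the definition in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def gc3(orf):
    """Fraction of codons whose third base is G or C (assumed bias proxy)."""
    thirds = orf[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)


def find_orfs(seq, min_length=MIN_LENGTH):
    """Yield (start, end, strand, gc3) for each orf found.

    Coordinates are 1-based and inclusive on the given strand; strand "c"
    marks the complement strand. The earliest start codon before each stop
    is used, so the longest candidate orf is reported.
    """
    n = len(seq)
    for strand, s in (("+", seq), ("c", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i
                elif codon in STOPS:
                    if start is not None and i + 3 - start >= min_length:
                        score = gc3(s[start:i + 3])
                        if strand == "+":
                            yield start + 1, i + 3, strand, score
                        else:
                            yield n - (i + 3) + 1, n - start, strand, score
                    start = None
```

A candidate list produced in this way would still require filtering on the bias score and comparison against database sequences before any orf is assigned.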
[0155] Potential functions of the predicted polypeptides (SEQ ID
Nos. 7 to 43) were obtained from the NCBI database using a BLAST
search. The best matches obtained from these searches are described
below in Table 9.
TABLE 9
Gene | Significant protein match | Score | Accession (GenBank) | Proposed function
orfB1 | hypothetical protein, no full length hits, high GC codon preference | | | unknown
orfB2 | SCM2.07, hypothetical protein (S. coelicolor) | 998 | NP_625154 | unknown
orfB3 | SCF76.07, hypothetical protein (S. coelicolor) | 359 | NP_624786 | unknown
orfB4 | SCF76.06, araC family transcriptional regulator (S. coelicolor) | 412 | NP_624785 | unknown
orfB5 | SCF76.05c, non-heme chloroperoxidase (S. coelicolor) | 495 | NP_624784 | non-heme chloroperoxidase
orfB6 | SCF76.09, hypothetical protein (S. coelicolor) | 159 | NP_624788 | unknown
orfB7 | SCF76.08c, hypothetical protein (S. coelicolor) | 473 | NP_624787 | unknown
borB | PteH, polyene macrolide thioesterase (S. avermitilis) | 244 | BAB69315 | type II thioesterase
borC | XF1726, 2,5-dichloro-2,5-cyclohexadiene-1,4-diol dehydrogenase (Xylella fastidiosa strain 9a5c) | 160 | NP_299015 | dehydrogenase
borD | FabG, 3-oxoacyl-ACP reductase precursor (Plasmodium falciparum) | 124 | AAK83686 | 3-oxoacyl-ACP reductase
borE | FN1586, O-succinylbenzoyl-CoA synthase (Fusobacterium nucleatum subsp. nucleatum ATCC 25586) | 88 | NP_602402 | cyclase (member of enolase superfamily)
borF | putative lysophospholipase homologue (Arabidopsis thaliana) | 57 | NP_565066 | unknown
borG | MTH1444, acetolactate synthase, large subunit (Methanothermobacter thermautotrophicus) | 120 | NP_276558 | unknown
borH | PA3592, conserved hypothetical protein (Pseudomonas aeruginosa) | 116 | NP_252282 | unknown
borI | TylH1, cytochrome P450 (Streptomyces fradiae) | 285 | AAD12167 | cytochrome P450 oxidase
borJ | BioA, DAPA aminotransferase (Kurthia sp. 538-KA26) | 346 | BAB39453 | amino transferase
borK | Adh1, alcohol dehydrogenase (Aquifex aeolicus) | 191 | NP_213938 | NAD/quinone oxidoreductase
borL | putative auxin-regulated protein GH3 (Arabidopsis thaliana) | 92 | NP_176159 | unknown
borM | SCL6.10, hypothetical protein similar to putative F420-dependent dehydrogenase (S. coelicolor) | 108 | CAB76875 | F420 dependent dehydrogenase
borN | SC1C2.27, hypothetical protein, 2-hydroxyhepta-2,4-diene-1,7-dioate isomerase superfamily (S. coelicolor) | 215 | NP_629680 | 2-hydroxyhepta-2,4-diene-1,7-dioate isomerase
borO | ThrS, threonyl-tRNA synthetase (Mycobacterium leprae) | 627 | NP_301410 | threonyl-tRNA synthetase, self resistance gene
orfB8 | conserved hypothetical protein (Methanosarcina acetivorans str. C2A) (Pfam pulls out weak MarR family) | 37 | NP_617908 | possible regulator
orfB9 | putative anti-sigma factor antagonist (Streptomyces coelicolor) | 113 | NP_631789 | anti-sigma factor antagonist
orfB10 | conserved hypothetical protein (S. coelicolor) | 95 | NP_631790 | unknown
orfB11 | hypothetical protein, no full length hits, high GC codon preference | | | unknown
orfB12 | putative regulator (S. coelicolor) | 92 | NP_631494 | regulator (of a two component system, maybe membrane sensor)
orfB13a | putative acetyltransferase (S. coelicolor) | 58 | NP_625155 | tentative assignment of acetyltransferase in two frames, or sequencing error and should be in a single frame
orfB13b | putative acetyltransferase (S. coelicolor) | 100 | NP_625155 | tentative assignment of acetyltransferase in two frames, or sequencing error and should be in a single frame
orfB14 | putative lipoprotein (S. coelicolor) | 386 | NP_631245 | unknown
orfB15 | hypothetical protein (S. coelicolor) | 41 | NP_631424 | unknown
orfB16 | putative formate dehydrogenase (S. coelicolor) (Pfam matches to molybdopterin oxidoreductase/formate dehydrogenase alpha subunit) | 915 | NP_626265 | oxidoreductase
orfB17 | conserved hypothetical protein, S. coelicolor SCBAC25F8.16 | 175 | NP_631569 | unknown
orfB18 | product unknown (Streptomyces aureofaciens) | 396 | AAD23399 | unknown
orfB19 | putative aldehyde dehydrogenase (S. aureofaciens) | 635 | AAD23400 | aldehyde dehydrogenase
orfB20 | putative alcohol dehydrogenase (S. coelicolor) | 450 | NP_630527 | alcohol dehydrogenase
orfB21 | hypothetical protein (S. coelicolor) | 395 | NP_630528 | unknown
orfB22 | putative calcium binding protein (S. coelicolor) | 160 | NP_631687 | calcium binding protein
[0156] Analysis of the functions of the putative gene products
indicates that the genes borB to borO most probably form the
boundaries of the borrelidin biosynthetic cluster. Evidence to
support this came from the disruption of orfB2; the resulting mutant produced
borrelidin at levels indistinguishable from the wild type parental
strain. In addition, orfB3 to orfB7 have homologues in the
Streptomyces coelicolor A3(2) genome encoded on cosmid SCF76; the
same orfs are present, but in a different order. The orfs orfB8 to
orfB10 are arranged identically to homologues in the S. coelicolor
A3(2) cosmid SC5E3. The orfs orfB18 to orfB21 have homologues that
are arranged similarly in the S. coelicolor A3(2) cosmid SC1A2. The
orf orfB13 contains a frame-shift and thus any gene product would
most probably be inactive. In addition, no function can be readily
deduced for the products of these orfs during borrelidin
biosynthesis.
Starter Unit Biosynthesis Genes
[0157] In order to identify the genes that are involved in the
biosynthesis of the trans-cyclopentane-1,2-dicarboxylic acid starter
unit, each of the genes borB to borN was disrupted (e.g. see
examples 13-25). This was done in a manner designed to minimise the
possibility of polar effects, which was verified by successful in
trans complementation with a full-length copy of the disrupted gene
under the control of the ermE* promoter, which gave back
approximately WT levels of borrelidin production in each case.
[0158] Each of the disrupted mutants was grown in triplicate as
described in example 1, and borrelidin production assessed.
Alongside these, each mutant was grown in triplicate and
supplemented, after 24 hours, with exogenous starter acid to a
final concentration of 1 mM, and borrelidin production assessed.
Extraction and analysis for borrelidin provided the data that are
described below in Table 10:
TABLE 10
Borrelidin biosynthetic gene disrupted | Borrelidin production without feeding (% relative to WT) | Borrelidin production with feeding (% relative to unfed WT)
Wild type (control) | 100 ± 16, (100 ± 2) | 363 ± 65, (269 ± 49)
borB | 75 ± 11, (43 ± 20) | 172 ± 51
borC | 0, (10 ± 3) | 933 ± 42
borD | 7 ± 1, (0) | 75 ± 15
borE | 2 ± 1 | 122 ± 23
borF | 3 ± 2 | 201 ± 52
borG | 11 ± 1, (32 ± 3) | 1532 ± 142
borH | 17 ± 2, (23 ± 13) | 203 ± 40
borI | 0, (0) | 0, (0)
borJ | 0, (0) | 0, (0)
borK | 0, (6 ± 1) | 319 ± 54, (464 ± 18)
borL | 0, (0) | 408 ± 70, (399 ± 69)
borM | 0, (6 ± 3) | 461 ± 29, (553 ± 66)
borN | 25 ± 9, (34 ± 3) | 68 ± 12, (46 ± 9)
borO | N/A | N/A
[Note 1: The values given in brackets indicate where repeat runs of some
experiments were performed; Note 2: N/A = not applicable.]
[0159] Based on the data in table 10, it is clear to one skilled in
the art that the gene products BorC-F and K-M are essential or very
important for the biosynthesis of
trans-cyclopentane-1,2-dicarboxylic acid, as these mutants produced
no or very low levels of borrelidin without the addition of
exogenous starter acid, whereupon they produced borrelidin at
levels approaching, or better than, that of the WT organism. In
addition the gene products BorG, H, and N appear to be involved in,
but not essential for, the biosynthesis of the starter unit, as
they produced significantly lower levels of borrelidin unless
exogenous starter acid was added, whereupon they produced
borrelidin at levels approaching or better than that of the WT
organism; this was particularly notable in the case of the
borG⁻ mutant.
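The grouping of gene products drawn from Table 10 can be expressed as a simple rule applied to the unfed and fed production levels. The sketch below uses the first-run mean values from Table 10; the two cut-off values are illustrative assumptions chosen for this example and are not stated in the source.

```python
# First-run means from Table 10: (unfed, fed), as % of unfed WT production.
TABLE_10 = {
    "borB": (75, 172), "borC": (0, 933), "borD": (7, 75),
    "borE": (2, 122), "borF": (3, 201), "borG": (11, 1532),
    "borH": (17, 203), "borI": (0, 0), "borJ": (0, 0),
    "borK": (0, 319), "borL": (0, 408), "borM": (0, 461),
    "borN": (25, 68),
}

# Illustrative cut-offs (assumptions, not values given in the source).
ESSENTIAL_MAX_UNFED = 10
INVOLVED_MAX_UNFED = 50


def classify(unfed, fed):
    """Assign a mutant to a category from its unfed and fed production."""
    if fed <= unfed:
        return "not rescued by feeding"
    if unfed < ESSENTIAL_MAX_UNFED:
        return "essential or very important"
    if unfed < INVOLVED_MAX_UNFED:
        return "involved, not essential"
    return "unassigned"


groups = {}
for gene, (unfed, fed) in TABLE_10.items():
    groups.setdefault(classify(unfed, fed), []).append(gene)

for category, genes in groups.items():
    print(category, genes)
```

With these cut-offs the rule reproduces the grouping described above; the mutants that are not rescued by feeding are those disrupted in borI and borJ.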
[0160] The normal metabolic function of BorN homologues is the
production of 2-oxohepta-3-ene-1,7-dioate 10, a key step in the
catabolism of tyrosine via 4-hydroxyphenyl acetic acid 9 (FIG. 9)
(Prieto et al., 1996). Therefore, 10 may be an intermediate in the
biosynthetic pathway to trans-cyclopentane-1,2-dicarboxylic acid. The
ability of the mutant disrupted in borN to produce borrelidin,
albeit at a reduced level, most probably lies in the presence of a
homologue elsewhere in the genome utilised in the catabolism of
tyrosine during primary metabolism.
[0161] The intermediate 10 contains all the required functionality
for the eventual formation of trans-cyclopentane-1,2-dicarboxylic
acid. The most probable next step of the biosynthesis is the
reduction of the 3-ene position in a reaction similar to that
catalysed by an enoyl reductase. Potential enzymes responsible for
this step are BorC, BorD, BorK or BorM; these enzymes are all
involved in borrelidin starter unit biosynthesis as seen from the
data in table 10. The resulting 2-oxohepta-1,7-dioate 11 is one
possible substrate for cyclisation through formation of a new C--C
bond between C6 and C2. Another possible substrate for this
cyclisation would be 2-hydroxyhepta-1,7-dioate 12 or some activated
form thereof. This would presumably be formed from 11 by the action
of an oxidoreductase such as BorC, BorD or BorM.
[0162] The key cyclisation step is most probably catalysed by BorE,
which displays similarity to O-succinylbenzoyl-CoA synthase and
chloromuconate cycloisomerase. These enzymes belong to the enolase
super-family, the members of which share the common ability to
stabilise the formation of an anion on the carbon atom adjacent to
a carboxylate group (Schmidt et al., 2001). It is further notable
that the substrate for muconate cycloisomerase is a
hexa-1,6-dioate, which is similar in gross structure to 11 and 12.
Abstraction of a proton and formation of an anion equivalent at C6
of 11 or 12 (or an activated form thereof, e.g. 13) with subsequent
cyclisation to C2 provides the correctly substituted cyclopentane
ring structure, although the intermediacy of 11 as substrate would
require some further processing of the substituted cyclopentane,
most probably via elimination of water to give the symmetric
cyclopent-1-ene-1,2-dicarboxylic acid, or possibly the
Δ¹-unsaturated compound,
cyclopent-1-ene-1,2-dicarboxylic acid. However, the feeding of
cyclopent-1-ene-1,2-dicarboxylic acid, or ethyl esters thereof, to
S. parvulus Tu4055 strains disrupted in any of borC-E, or to WT
strains, did not produce any borrelidin, or did not produce
borrelidin in any increased amount when compared to the unfed
controls. These data indicate that this compound is probably not an
intermediate in starter unit biosynthesis, and that the substrate
of BorE is possibly the 2-hydroxyhepta-1,7-dioate 12, or an
activated form thereof (e.g. 13). A putative
biosynthetic pathway to trans-cyclopentane-1,2-dicarboxylic acid is
shown in FIG. 4.
[0163] The combined, specific genes required for the biosynthetic
steps to trans-cyclopentane-1,2-dicarboxylic acid are not clear,
but probably are encoded by some combination of borC-H, borK, borM
and borN. The lack of certain homologues of genes that are involved
in the catabolism of 4-hydroxyphenyl acetic acid 9, and which would
act prior to BorN in the pathway, is most probably an indication
that primary metabolic genes perform these tasks. The addition of
exogenous trans-cyclopentane-1,2-dicarboxylic acid to S. parvulus
Tu4055 and related strains increases the titre of borrelidin in the
order of 2- to 3-fold under our conditions, indicating that the
biosynthesis of starter acid is a limiting factor in borrelidin
biosynthesis. These data are consistent with primary metabolic
degradation of tyrosine being the source of
trans-cyclopentane-1,2-dicarboxylic acid.
[0164] In an attempt to further clarify which genes may be
specifically responsible for biosynthesis of the starter unit, a
number of co-culture experiments were performed with combinations
of the different mutants--these require the knowledge that the gene
products of borI and borJ are specifically involved in the
formation of the C12-nitrile moiety, which is clarified by the data
given in the following section in combination with the data from
table 10. In summary, the co-culture of mutants borE.sup.-
& borD.sup.-, and of borE.sup.- & borM.sup.- failed to
produce any borrelidin whereas the co-culture of mutants borM.sup.-
& borI.sup.-, and borM.sup.- & borK.sup.- produced
borrelidin at approximately WT levels. These data, in combination
with those in table 10, and below, clearly indicate that borD, borE
and borM are involved in starter unit biosynthesis, whereas borI,
and possibly borK, are involved in the formation of the nitrile
moiety at C12 of borrelidin.
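The reasoning behind the co-culture experiments can be sketched in a few lines of Python. This is only an illustrative model, not part of the described method: it assumes that two mutants restore production when their blocks lie in different parts of the pathway, and the ROLE grouping below is taken from the conclusions stated above (borK is described only as possibly involved in nitrile formation).

```python
# Hypothetical functional grouping, taken from the conclusions in the text.
ROLE = {
    "borD": "starter unit",
    "borE": "starter unit",
    "borM": "starter unit",
    "borI": "nitrile formation",
    "borK": "nitrile formation",  # the text says only "possibly"
}


def coculture_restores_production(mutant_a: str, mutant_b: str) -> bool:
    """Assumed rule: mutants blocked in different steps can complement each other."""
    return ROLE[mutant_a] != ROLE[mutant_b]


# Co-culture outcomes as reported in the text (True = borrelidin produced).
OBSERVED = {
    ("borE", "borD"): False,
    ("borE", "borM"): False,
    ("borM", "borI"): True,
    ("borM", "borK"): True,
}

for pair, produced in OBSERVED.items():
    predicted = coculture_restores_production(*pair)
    print(pair, "predicted:", predicted, "observed:", produced)
```

Under this assumed rule the grouping reproduces all four reported outcomes, which is the sense in which the data support the assignment.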
[0165] It is clear from the data in table 10 that exogenous
addition of trans-cyclopentane-1,2-dicarboxylic acid is sufficient
to re-establish approximately WT levels, or better, of borrelidin
production in mutants where genes that are involved in starter unit
biosynthesis have been disrupted. These data indicate that there is
no problem with the active uptake of added carboxylic acid by S.
parvulus Tu4055, and that an activity is present which is capable
of converting the carboxylic acid to a CoA thioester equivalent.
Thus, given the known technologies of mutasynthesis, it is obvious
to one skilled in the art that the addition of exogenous carboxylic
acids to one of the aforementioned mutants, for example the borE
strain S. parvulus Tu4055/borE:aac3(IV) described in example 16,
may lead to the production of borrelidin analogues in which the
starter unit carboxylic acid moiety is replaced with a moiety
derived from the exogenously added carboxylic acid.
[0166] To examine this possibility, strain S. parvulus
Tu4055/borE:aac3(IV) was fed with a
trans-cyclobutane-1,2-dicarboxylic acid according to the protocol
described in example 1 and then analysed as described in example 4.
The structure 18, described in FIG. 10, shows the new borrelidin
structure obtained from feeding this carboxylic acid; this compound
18 displayed the anticipated UV chromophore for borrelidin but
eluted at an earlier retention time and displayed the expected mass
by LCMS (m/z=474.3 [M-H].sup.-). Verification of this methodology
was provided by the production, isolation and characterisation of
18 (example 33). It is clear to one skilled in the art that
other carboxylic acids could also be used in similar feeding
experiments to provide further new borrelidin analogues. Although
it is possible that not all carboxylic acids would be incorporated
using the exact methodology described herein, a person of skill in
the art is aware of a number of available methods to enhance the
incorporation of fed starter units.
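As a rough check on the mass reported above for 18, the expected [M-H].sup.- value can be estimated from monoisotopic atomic masses. The sketch below assumes the molecular formula C28H43NO6 for borrelidin (not stated in this section) and assumes that replacing the cyclopentane starter unit with a cyclobutane unit removes exactly one CH2 group; the function and variable names are illustrative only.

```python
# Monoisotopic atomic masses (u) and the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def mz_m_minus_h(formula: dict) -> float:
    """Return the m/z of the singly charged [M-H]- ion for an elemental formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral - PROTON


# Assumed formula for borrelidin (C28H43NO6); not given in this section.
borrelidin = {"C": 28, "H": 43, "N": 1, "O": 6}
# Assumed cyclobutane analogue 18: one CH2 fewer than borrelidin.
analogue_18 = {"C": 27, "H": 41, "N": 1, "O": 6}

print("borrelidin [M-H]-:", round(mz_m_minus_h(borrelidin), 1))
print("analogue 18 [M-H]-:", round(mz_m_minus_h(analogue_18), 1))
```

Under these assumptions the analogue value rounds to the reported m/z of 474.3.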
[0167] In addition to the use of the strain deleted in borE, it was
observed (see table 10) that the strain S. parvulus
Tu4055/borG:aac3(IV), in which borG has been disrupted, when fed
with the natural starter unit of the bor PKS,
trans-cyclopentane-1,2-dicarboxylic acid, produced borrelidin at
titres significantly higher than those seen when the wild-type
organism was fed (4-fold increase) or unfed (15-fold increase). To
examine this further, this experiment was repeated using both the
natural and an unnatural starter acid as exogenous substrates, fed,
in parallel, to wildtype, the borE mutant and the borG mutant. The
resulting data are described in table 11.
TABLE 11
                                              Fed with 1 mM      Fed with 1 mM
                                              cyclopentane       cyclobutane
                                              trans-1,2-         trans-1,2-
Strain                             Unfed      dicarboxylic acid  dicarboxylic acid
S. parvulus Tu4055                 2.3 mg/l   6.6 mg/l           --
S. parvulus Tu4055/borE:aac3(IV)   0          4.7 mg/l           2.2 mg/l
S. parvulus Tu4055/borG:aac3(IV)   0          88.9 mg/l          43.0 mg/l
As one can see from table 11, using S. parvulus
Tu4055/borG:aac3(IV) instead of S. parvulus Tu4055/borE:aac3(IV)
for mutasynthesis increases the titre approximately 19-fold, and
S. parvulus Tu4055/borG:aac3(IV) fed with the natural starter
acid produces 38-fold more borrelidin A than wild type alone, or
13-fold more borrelidin A than the wild type strain fed with the same
amount of cyclopentane trans-1,2-dicarboxylic acid. These data
clearly indicate that the use of strain S. parvulus
Tu4055/borG:aac3(IV) for mutasynthesis experiments is beneficial
for the production of improved titres of borrelidin analogues. This
method has general applicability for both the production of
borrelidin and borrelidin analogues.
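The fold increases quoted above follow directly from the titres in table 11. A minimal sketch of the arithmetic is given below; the dictionary keys are shorthand labels chosen for this illustration.

```python
# Titres in mg/l from table 11; None marks the value not reported ("--").
TITRES = {
    "WT":   {"unfed": 2.3, "cyclopentane": 6.6,  "cyclobutane": None},
    "borE": {"unfed": 0.0, "cyclopentane": 4.7,  "cyclobutane": 2.2},
    "borG": {"unfed": 0.0, "cyclopentane": 88.9, "cyclobutane": 43.0},
}


def fold_change(numerator: float, denominator: float) -> float:
    """Ratio of two titres; both values must be present and the denominator non-zero."""
    return numerator / denominator


# borG mutant versus borE mutant, for each fed acid.
borg_vs_bore_natural = fold_change(TITRES["borG"]["cyclopentane"], TITRES["borE"]["cyclopentane"])
borg_vs_bore_analogue = fold_change(TITRES["borG"]["cyclobutane"], TITRES["borE"]["cyclobutane"])
# Fed borG mutant versus unfed and fed wild type.
borg_vs_wt_unfed = fold_change(TITRES["borG"]["cyclopentane"], TITRES["WT"]["unfed"])
borg_vs_wt_fed = fold_change(TITRES["borG"]["cyclopentane"], TITRES["WT"]["cyclopentane"])

for label, value in [
    ("borG vs borE, natural starter acid", borg_vs_bore_natural),
    ("borG vs borE, cyclobutane acid", borg_vs_bore_analogue),
    ("fed borG vs unfed WT", borg_vs_wt_unfed),
    ("fed borG vs fed WT", borg_vs_wt_fed),
]:
    print(f"{label}: {value:.1f}-fold")
```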
[0168] On the basis of this finding, the feeding experiments with
alternative carboxylic acids were repeated in S. parvulus
Tu4055/borG:aac3(IV), and extended to include 2,3-dimethyl succinic
acid and 2-methylsuccinic acid; the new compounds derived from the
incorporation of these alternative starter units, 19 and 20
respectively, are described in FIG. 10.
[0169] In an attempt to improve the titre of borrelidin produced in
fermentation cultures of S. parvulus Tu4055 through other means,
additional copies of the genes borE and borL were introduced into
the organism in vectors that place them under the control of the
strong constitutive promoter ermE*. It was anticipated that the
over-expression of these genes would increase the intra-cellular
levels of the starter acid, which appears to be limiting with
respect to borrelidin production.
[0170] The genes borE and borL were amplified by PCR, cloned into
the vector pEM4, and then introduced into S. parvulus Tu4055 as
described in examples 29 and 30 respectively. In addition, the
vector pEM4 alone (not containing any insert) was also introduced
in S. parvulus Tu4055 and used as a control. The resulting strains
were grown, extracted and analysed as described in examples 1 and
4. Introduction of the vector as a control did not significantly
affect the levels of borrelidin production. However, the expression
of additional copies of either borE or borL in this manner brought
a 4.9.+-.0.3 and 4.3.+-.0.7-fold increase respectively in the titre
of borrelidin relative to the wild type strain. Presumably, the
steps of biosynthesis catalysed by their gene products are rate
limiting, or alternatively their gene products may have a positive
regulatory function. For example borL shows greatest homology to
auxin response proteins from plants. Auxins are hormones involved
in the regulation of various cellular processes in plants, and borL
may represent the first example of a related gene having regulatory
function in a bacterium. As controls, additional copies of borJ,
borO and borA5, under the control of ermE* in pEM4, were introduced
into S. parvulus Tu4055, but did not have any significant effect
upon borrelidin titre. This was anticipated, as none of the
respective gene products is expected to be involved in starter
unit biosynthesis. In addition, up-regulation of the putative
`stuttering` PKS module (borA5) did not increase borrelidin titre,
further indicating that iterative use of this module occurs, rather
than three independent copies being utilized. The lack of an effect
on titre when borO is up-regulated indicates that there is most
probably no limitation placed upon borrelidin production due to
toxicity in the producing organism and so indicates that there is
further scope for titre improvement.
Formation of the Nitrile Moiety at C12
[0171] Sequence analysis of the AT domain of the borrelidin PKS
module 3 indicates that the substrate utilised for the third round
of chain extension is methylmalonyl-CoA. Thus, the carbon atom of
the nitrile moiety most probably arises from the methyl group of
methylmalonyl-CoA. This was verified by stable isotope feeding
experiments. Feeding [2,3-.sup.13C.sub.2]sodium propionate to S.
parvulus Tu113 gave borrelidin which displayed intact labelling of
the carbons at C4-C24, C6-C25, C8-C26, C10-C27 and C12-C28, and
identical specific incorporations (as determined within the limits
of our experimental methods), as expected (FIG. 2). These data
indicate that the conversion of the C12-methyl group occurs either
during chain assembly at, or after, the incorporation of the third
extension unit, or that it occurs after polyketide chain assembly
and release from the PKS. Based on functional assignments given to
the borrelidin biosynthetic genes, in conjunction with the gene
disruption data described in table 10, both borI and borJ are
clearly implicated in formation of the nitrile moiety at C12, while
others such as borK may also be.
[0172] The cytochrome P450 hydroxylase BorI shares greatest
similarity to TylHI, which catalyses the hydroxylation of an
exocyclic methyl group of the tylosin macrolactone prior to
addition of a deoxyhexose moiety (Fouces et al., 1999). BorI is
therefore believed to catalyse oxidation of the C12-methyl group
during borrelidin biosynthesis. In agreement with this the
borI.sup.- mutant S. parvulus Tu4055/borI::aac3(IV) fails to
produce borrelidin but accumulates a new product 14 (FIG. 6) that
is less polar than borrelidin. 14 is readily transformed to
borrelidin when fed to the borE mutant S. parvulus
Tu4055/borE::aac3(IV) which lacks the ability to synthesise the PKS
starter unit but maintains the rest of the borrelidin biosynthetic
genes intact. Fermentation of S. parvulus Tu4055/borI::aac3(IV)
followed by extraction and isolation provided .about.30 mg of 14
(example 31). Full structural analysis of 14 identified it as
12-desnitrile-12-methylborrelidin (pre-borrelidin). This is
consistent with the proposed role of BorI in borrelidin
biosynthesis and provides a route to novel borrelidin analogues
with a methyl group attached to C12 of the macrolactone ring.
[0173] The putative PLP dependent aminotransferase BorJ is believed
to catalyse the introduction of a nitrogen atom into borrelidin at
the activated C28-position, probably via a C12-formyl moiety. In
agreement with this the borJ.sup.- mutant S. parvulus
Tu4055/borJ::aac3(IV) does not produce borrelidin and accumulates a
new compound that is more polar than borrelidin. This new compound
is not transformed to borrelidin when fed to mutant S. parvulus
Tu4055/borE::aac3(IV) which indicates that it is probably a shunt
metabolite rather than an intermediate in borrelidin biosynthesis.
Fermentation of S. parvulus Tu4055/borJ::aac3(IV) allowed the
isolation of 17 mg of the accumulated compound (example 32).
Detailed structural analysis identified the accumulant as
12-desnitrile-12-carboxyl borrelidin 2.
[0174] In addition to the compounds isolated from mutation of the
borrelidin biosynthetic genes, 12-desnitrile-12-formyl borrelidin
15 is isolated from the fermentation supernatant of S. parvulus
Tu113. The fermentation media and conditions used for these
experiments differ from those we have described so far herein, but
are designed to maximise the production of borrelidin. We propose
that this altered medium, in combination with a drop in the
dissolved oxygen concentration that is observed to occur during
this specific fermentation, promoted the accumulation of 15. 15 is
readily transformed to borrelidin when fed to the mutant S.
parvulus Tu4055/borE::aac3(IV) which lacks the ability to
synthesise the PKS starter unit but maintains the rest of the
borrelidin biosynthetic genes intact.
[0175] The above data lead us to propose a biosynthetic route to
the nitrile moiety of borrelidin as presented in FIG. 6. The
C12-methyl carbon of pre-borrelidin 14 is first oxidised by BorI to
introduce an allylic hydroxyl group at C28 (16). This hydroxyl
group is then converted to the formyl moiety attached to C12 (15)
using a method selected from the group comprising: spontaneous
oxidation (including oxidation mediated by some background enzyme);
or the action of a specific gene product of the borrelidin
biosynthetic gene cluster. Candidate gene products are thus BorI
itself, acting in a multifunctional manner and operating via the
formation of a gem-diol structure at C12 followed by dehydration,
or alternatively the product of one of the oxidoreductase encoding
genes such as borC or borK. The next step is anticipated to be BorJ-catalysed
transamination of 15 in order to introduce a nitrogen atom at C28,
in the form of an amine, through a pyridoxamine phosphate mediated
process. The putative product amine 17 then undergoes oxidation,
possibly spontaneously, but most probably by an enzymic activity
such as BorI (certain parallels can be drawn to the biosynthesis of
nitriles in plants (Celenza, 2001; Hahn et al., 1999; Nielson and
Moller, 1999)) or by the products of one of the oxidoreductase
encoding genes, e.g. borC or borK, or by a general oxidoreductase
within the proteome.
[0176] In order to examine this proposed pathway in more detail a
number of biotransformation experiments were performed using
pre-borrelidin 14 as substrate for investigating the action of
borI-K individually and in combination, using pEM4 as vector and S.
albus J1074 (Chater & Wilde, 1980) as an expression strain.
Expression of borI or borJ individually did not give borrelidin
production on addition of 14. The added 14 was only consumed during
biotransformation with borI (and not in any of the control
experiments); the 14 added was identified as being converted to the
shunt metabolite 2. However, co-expression of borI & borJ did
convert the added 14 to borrelidin. It thus appears that either
BorI or general proteome activities in S. albus are capable of
oxidising the proposed amine intermediate 17 in the borrelidin
biosynthetic pathway. In addition to the feeding of pre-borrelidin
14, 12-desnitrile-12-carboxyl borrelidin 2 was also fed to the
three strains described above. No conversion of 2 to borrelidin was
observed in any of these experiments, reinforcing the idea that 2
is a shunt metabolite.
[0177] Detailed investigation of genomic DNA from three borrelidin
producing strains, S. rochei ATCC23956, S. parvulus Tu113 and S.
parvulus Tu4055, using numerous restriction digests and subsequent
Southern Blot analysis, indicates that the borrelidin biosynthetic
gene clusters of these three organisms are very closely conserved.
It therefore appears that the borrelidin biosynthetic pathways of
these strains are very similar. This assumption allows us to
consider the data above, which are obtained from different strains,
as applicable to a single biosynthetic pathway.
[0178] It is clear to one skilled in the art that manipulation of
the genes involved in formation of the C12-nitrile moiety of
borrelidin, for example borI, or borJ, is a generally useful method
for the production of novel borrelidin related molecules and
borrelidin derivatives with altered functionality at C12. In
addition, the transfer of these genes to other organisms producing
other natural or engineered polyketide products may allow the
incorporation of nitrile moieties into such compounds.
[0179] In an extension of this work, disruptions in borI and borJ
are separately made in the strain S. parvulus Tu4055/borG:aac3(IV)
to give the doubly mutated strains S. parvulus
Tu4055/borG:aac3(IV)/borI::hyg and S. parvulus
Tu4055/borG:aac3(IV)/borJ::hyg (examples 27 & 28 respectively).
These strains are fed alternative carboxylic acids,
trans-cyclobutane-1,2-dicarboxylic acid, 2,3-dimethylsuccinic acid
and 2-methylsuccinic acid, (as described above) and are found to
produce the mutasynthetic borrelidin analogues carrying, either, a
methyl (21, 22 and 23 respectively) or a carboxyl function at C12
(24, 25 and 26 respectively) in place of the nitrile group, and
which are also derived from alternative starter units corresponding
to the exogenously supplied carboxylic acids. This orthogonal
library of new compounds is described in FIG. 11, where the observed
UV chromophores and mass spectral data for each compound are
shown.
Other Genes Involved in Borrelidin Production
[0180] In addition to the type-I terminal thioesterase domain of
the borrelidin PKS, a discrete type-II thioesterase is located at
the upstream boundary of the biosynthetic gene cluster and is
encoded by the gene borB. Such discrete type-II TE proteins are
commonly found to be associated with type-I PKSs and are believed
to play a role in the `editing` of PKSs by the removal of short
chain acyl groups that are formed by unwanted decarboxylation of
extender units attached to KS domains (Heathcote et al., 2001). The
disruption of such discrete type-II TEs in the picromycin (Xue et
al., 1998) and tylosin (Butler et al., 1999) biosynthetic clusters
leads to a significant reduction in titre of both macrolides. In
accordance with these results, disruption of borB (example 13) gave
a mutant that produced between 43 and 75% of the parental wild type
titre.
[0181] The self-resistance of S. parvulus strains to borrelidin is
most probably due to the product of borO, which encodes a threonyl
tRNA synthetase homologue. Threonyl-tRNA synthetase is the
molecular target of borrelidin in sensitive strains (Paetz &
Ness, 1973). It is predicted that BorO is resistant to the action
of borrelidin, and acts to produce threonyl-tRNAs in cells that
make borrelidin, effectively complementing the normal threonyl-tRNA
synthetases, which are inhibited. To verify this hypothesis, borO was
amplified by PCR and cloned into the expression vector pEM4A, which puts
borO under the control of the strong constitutive promoter ermE*
(example 26). The resulting vector pborOR was then transformed into
the borrelidin-sensitive strain Streptomyces albus J1074 (Chater
& Wilde, 1980). Comparison of this strain with that containing
only the expression vector pEM4A, using a soaked disk bioassay,
clearly indicated that expression of borO confers resistance to
borrelidin.
EXAMPLES
General Methods
[0182] Restriction enzymes, other molecular biology reagents,
antibiotics and chemicals were purchased from standard commercial
sources. Restriction endonuclease digestion and ligation followed
standard methods (Sambrook, J. et al., 1989).
Example 1
Fermentation of S. parvulus Strains
[0183] The following method is generally useful for culturing S.
parvulus for the production of borrelidin and/or borrelidin
analogues:
[0184] A seed flask containing NYG medium (30 ml in a 250 ml
Erlenmeyer flask) was inoculated from a working stock (0.5 ml). NYG
medium contains, in deionised water: beef extract (0.3%), Bacto
peptone (0.5%), glucose (1%) and yeast extract (0.5%). After 2 days
shaking in a rotary incubator (2-inch throw; 30.degree. C.; 250
rpm) the resulting cream culture was used to inoculate PYDG
production medium (30 ml in a 250 ml Erlenmeyer flask; 10%
inoculum). PYDG medium contains per litre of deionised water:
peptonised milk nutrient (1.5%), yeast autolysate (0.15%), dextrin
(4.5%) and glucose (0.5%) adjusted to pH 7.0. After 5 days shaking
on a rotary incubator (2-inch throw; 30.degree. C.; 250 rpm) the
culture was harvested for analysis as described in example 4, or
for isolation purposes as required. For quantitative analysis these
experiments were performed in triplicate.
The following method is useful for the feeding of exogenous
carboxylic acids to S. parvulus strains:
[0185] The S. parvulus strain was grown as described above. After
24 hours growth in PYDG production medium, the carboxylic acid of
choice was added as a 50 .mu.l single aliquot (0.6 M solution in
70% methanol after neutralization with 5 N NaOH). The resulting
culture was harvested after 5 days total fermentation and analysed
as described in example 4. For quantitative studies these
experiments were performed in triplicate, and the equivalent fed
and unfed WT strains served as controls.
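The feeding volumes above can be related to the 1 mM concentration quoted in table 11 with a short calculation. The sketch below assumes a nominal culture volume of 30 ml at the time of feeding (ignoring evaporation and sampling); the function name is illustrative.

```python
def final_concentration_mM(stock_molar: float, aliquot_ul: float, culture_ml: float) -> float:
    """Final concentration (mM) after adding an aliquot of stock to a culture."""
    amount_umol = stock_molar * aliquot_ul          # mol/l x microlitres = micromoles
    total_volume_ml = culture_ml + aliquot_ul / 1000.0
    return amount_umol / total_volume_ml            # micromoles per ml = mM


# 50 microlitres of a 0.6 M solution added to an assumed 30 ml culture.
fed_concentration = final_concentration_mM(stock_molar=0.6, aliquot_ul=50.0, culture_ml=30.0)
print(f"Final concentration: {fed_concentration:.2f} mM")
```

Under this assumption the fed acid is present at approximately 1 mM, in agreement with the column headings of table 11.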
Example 2
Cryopreservation of S. parvulus Strains
Working Stocks
[0186] Working stocks of vegetative mycelia were prepared by mixing
a 2 day old seed culture grown in NYG medium (0.5 ml) with
cryopreservative (0.5 ml). Cryopreservative consists of 20%
glycerol and 10% lactose in deionised water.
Spore Stocks
[0187] Strains of S. parvulus were incubated on HA agar plates at
30.degree. C. After 14 days the resulting spores from a single
plate were harvested and suspended in cryopreservative (1 ml).
HA agar contains in deionised water: 0.4% yeast extract, 1% malt
extract, 0.4% dextrose and 1.5% agar adjusted to pH 7.3.
Example 3
Cloning of the Borrelidin Biosynthetic Gene Cluster and Disruption
of borA2 & borA3
Cosmid Library Generation
[0188] A cosmid library was constructed in pWE15 cosmid vector
using the Gigapack.RTM.III Gold Packaging Extract kit according to
the manufacturer's handbook (Stratagene). Chromosomal DNA was
extracted from S. parvulus Tu4055 according to standard protocols
(Kieser et al., 2000) and treated with Sau3AI prior to cloning into
pWE15. A number of the resulting E. coli transformants (3300) were
picked and transferred to 96 well microtitre plates containing
Luria Broth (LB) medium (0.1 ml per well) with ampicillin (100
.mu.g/ml). The resulting clones were replica-plated to Luria agar
(LA) plates containing ampicillin (100 .mu.g/ml). After incubation
overnight at 37.degree. C. colonies were transferred to nylon
membrane filters for in situ colony hybridization analysis
according to published protocols (Sambrook et al., 1989).
Library Screening
[0189] The cosmid library was screened using a probe that was
generated using the DIG DNA Labelling and detection kit (Roche)
according to the manufacturer's instructions. The probe used was a
BglII-BamHI fragment (1.7 kbp) obtained from the gene that encodes
module 6 of the third subunit of the oleandomycin PKS from
Streptomyces antibioticus (Swan et al., 1994).
Disruption of the Borrelidin Biosynthetic Gene Cluster
[0190] Cosmids that gave a positive response when screened as
described above were digested with BamHI and fragments of less than
3 kbp were subcloned into pOJ260 (Bierman et al., 1992). These were
then used to transform protoplasts of S. parvulus Tu4055 as
described in example 5. The resulting transformants were then
assessed for the ability to produce borrelidin. Two clones were
borrelidin non-producers; both were obtained from cosBor32A2 and
contain sequence typical of a modular PKS. The remaining cosmids
were then screened using probes obtained from the two BamHI
fragments, which led to the identification of the overlapping
cosmid cosBor19B9 that contained the remainder of the borrelidin
biosynthetic cluster.
Sequencing of cosBor32A2 and cosBor19B9
[0191] The cosmids cosBor32A2 and cosBor19B9 were transformed into
E. coli DH10B and the resulting clones grown at 37.degree. C. in
2.times.TY media (30 ml) containing ampicillin. After 15 hours the
cells were harvested and Qiagen Tip 100 kits were used to prepare
cosmid DNA. Approximately 5 .mu.g of the cosmid DNA was digested
with Sau3AI (1 U). Samples were taken at 2, 4, 6, 8 & 10 minute
intervals after the enzyme was added and quenched into an equal
volume of ice cold 0.5M EDTA. The samples were mixed and then
analysed by gel electrophoresis, and those fragments between
1.5-2.0 kbp recovered from the gel. The fragments were cloned into
linearised and dephosphorylated pHSG397 (Takeshita et al., 1987),
and transformed into E. coli DH10B. The resulting clones that
contained insert were grown in 2.times.TY medium (2 ml) containing
chloramphenicol (30 .mu.g/ml) and purified using Wizard kits
(Promega).
[0192] DNA sequencing was carried out using an Applied Biosystems
800 Molecular Biology CATALYST robot to perform the dideoxy
terminator reactions, which were then loaded into an ABI Prism 3700
automated sequencer (Applied Biosystems). The raw sequence data was
processed using the Staden software package. Assembly and contig
editing was performed using GAP (Genome Assembly Program) version
4.2 (Bonfield et al., 1995). The GCG package (Devereux et al.,
1984) version 10.0 was used for sequence analysis.
Example 4
Chemical Analysis of S. parvulus Strains
[0193] The following method is useful for analysing fermentations
(see example 1) for the production of natural borrelidins and of
engineered borrelidin analogues:
[0194] In a 2 ml Eppendorf tube, an aliquot of 5 day old
fermentation broth (1 ml) was adjusted to pH.about.3 by the
addition of 90% formic acid (ca. 20 .mu.l). Ethyl acetate (1 ml)
was added to the sample and mixed vigorously for 10 min using a
vortex tray. The mixture was separated by centrifugation in a
microfuge and the upper phase removed to a clean 2 ml Eppendorf
tube. The ethyl acetate was removed by evaporation using a
Speed-Vac. Residues were dissolved into methanol (250 .mu.l) and
clarified using a microfuge. Analysis was performed on an Agilent
HP1100 HPLC system as described below:
[0195] Injection volume: 50 .mu.l
[0196] Column stationary phase: 150.times.4.6 mm column,
base-deactivated reversed phase silica gel, 3 .mu.m particle size
(Hypersil C.sub.18-BDS).
[0197] Mobile phase A: 10% acetonitrile:90% water, containing 10 mM
ammonium acetate and 0.1% TFA.
[0198] Mobile phase B: 90% acetonitrile:10% water, containing 10 mM
ammonium acetate and 0.1% TFA.
[0199] Mobile phase gradient: T=0 min, 25% B; T=15, 100% B; T=19,
100% B; T=19.5, 25% B; T=25, 25% B.
[0200] Flow rate: 1 ml/min.
[0201] Detection: UV at 258 nm (DAD acquisition over 190-600 nm); MS
detection by electrospray ionisation over m/z range 100-1000 amu,
with +/-ve ion mode switching.
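The gradient table above can be expressed as a small function giving the mobile phase composition at any time in the run. This sketch assumes linear ramps between the listed set points, which is usual for such gradient tables but is not stated explicitly; the names used are illustrative.

```python
# (time in min, % mobile phase B) set points from the gradient description.
GRADIENT = [(0.0, 25.0), (15.0, 100.0), (19.0, 100.0), (19.5, 25.0), (25.0, 25.0)]


def percent_b(t_min: float) -> float:
    """Percentage of mobile phase B at time t, assuming linear ramps between set points."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-25 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: set points cover the whole run")


def percent_acetonitrile(t_min: float) -> float:
    """Overall % acetonitrile, given A = 10% and B = 90% acetonitrile."""
    b = percent_b(t_min)
    return 0.10 * (100.0 - b) + 0.90 * b


for t in (0.0, 7.5, 15.0, 19.25, 25.0):
    print(f"t = {t:5.2f} min: {percent_b(t):5.1f}% B, {percent_acetonitrile(t):5.1f}% acetonitrile")
```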
Example 5
Protoplast Transformation Protocol for S. parvulus Tu4055
[0202] A seed flask containing tryptone soy broth (TSB) medium (10
ml in a 100 ml Erlenmeyer flask) was inoculated from a working stock
(0.15 ml). After 3 days shaking on a rotary incubator (30.degree.
C., 250 rpm), 5 ml of the culture was used to inoculate R5 medium
(Kieser et al., 2000) (50 ml in a 250 ml Erlenmeyer flask) that was
then shaken on a rotary incubator for 24 hours (30.degree. C., 250
rpm). The PEG mediated transformation of protoplasts was then
performed according to standard published protocols (Kieser et al.,
2000).
Example 6
Replacement of borAT4 with rapAT2--Production of C10-desmethyl
Borrelidin
[0203] The borrelidin PKS AT4 domain is replaced with the AT2
domain of the rapamycin polyketide synthase as follows:
[0204] CosBor32A2 is digested with EcoRI and the 5429 bp band
isolated. This is used as a template for PCR using the oligos CM410
(5'-AAAATGCATTCGGCCTGAACGGCCCCGCTGTCA-3') (SEQ ID No.44) and CM411
(5'-AAATGGCCAGCGAACACCAACACCACACCACCA-3') (SEQ ID No.45). CM410
introduces an NsiI restriction site for cloning purposes and CM411
introduces an MscI site for use in the introduction of a
heterologous AT. The .about.1.1 kbp product is cloned into pUC18
digested with SmaI and dephosphorylated. The insert can ligate in
two orientations and the reverse orientation is screened for by
restriction enzyme analysis and the insert sequenced. One correct
plasmid is designated pCJM462. Methylation deficient DNA
(specifically dcm.sup.-1) of pCJM462 and pCJR26 (Rowe et al. 1998)
is isolated by passaging the plasmids through E. coli ET12567. Each
plasmid is then digested with MscI and XbaI and the .about.7.8 kbp
fragment from pCJR26, containing the rapamycin AT2 and sequences
downstream in pCJR26, is ligated to the .about.3.8 kbp backbone
generated by digestion of pCJM462. Plasmid pCJM463 is identified by
restriction analysis.
[0205] CosBor32A2 is digested with EcoRI and EcoRV and the 2871 bp
band isolated. This is used as a template for PCR using the oligos
CM412 (5'-AAAGTCCTAGGCGGCGGCCGGCGGGTCGACCT-3') (SEQ ID No.46) and
CM413 (5'-TTTAGATCTCGCGACGTCGCACGCGCCGAACGTCA-3') (SEQ ID No.47).
CM412 introduces an AvrII restriction site that joins, in frame,
the downstream borrelidin homology to the heterologous AT, and
CM413 introduces a BglII site for cloning purposes. The .about.1.1
kbp product is cloned into pUC18 digested with SmaI and
dephosphorylated. The insert can ligate in two orientations and the
reverse orientation is screened for by restriction enzyme analysis
and the insert sequenced. One correct plasmid is designated
pCJM464.
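The restriction sites said to be introduced by primers CM410 to CM413 can be located with a simple string search. The sketch below uses the standard recognition sequences of the four enzymes; the dictionary and function names are illustrative and are not part of the described method.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
SITES = {"NsiI": "ATGCAT", "MscI": "TGGCCA", "AvrII": "CCTAGG", "BglII": "AGATCT"}

# Primer sequences (5' to 3') as given in the text, with the site each should carry.
PRIMERS = {
    "CM410": ("AAAATGCATTCGGCCTGAACGGCCCCGCTGTCA", "NsiI"),
    "CM411": ("AAATGGCCAGCGAACACCAACACCACACCACCA", "MscI"),
    "CM412": ("AAAGTCCTAGGCGGCGGCCGGCGGGTCGACCT", "AvrII"),
    "CM413": ("TTTAGATCTCGCGACGTCGCACGCGCCGAACGTCA", "BglII"),
}


def site_position(sequence: str, enzyme: str) -> int:
    """Return the 0-based index of the enzyme's site in the sequence, or -1 if absent."""
    return sequence.find(SITES[enzyme])


for name, (sequence, enzyme) in PRIMERS.items():
    position = site_position(sequence, enzyme)
    status = f"found at index {position}" if position >= 0 else "not found"
    print(f"{name}: {enzyme} site ({SITES[enzyme]}) {status}")
```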
[0206] Plasmids pCJM463 and pCJM464 are digested with AvrII and
XbaI and the .about.1.1 kbp fragment from pCJM464 is ligated into
the .about.4.7 kbp backbone of pCJM463 to give pCJM465, which is
identified by restriction enzyme analysis. pCJM465 contains the
hybrid rapamycin AT2 with flanking regions of borrelidin sequence
which provide homology for integration and secondary
recombination.
[0207] Plasmid pCJM465 is digested with NsiI and BglII and the
.about.3 kbp fragment is cloned into pSL1180 previously digested
with NsiI and BamHI to give pCJM466. Plasmid pCJM466 is then
digested with NsiI and the apramycin cassette is incorporated on a
PstI fragment from pEFBA (Lozano et al., 2000) to give the
replacement vector pCJM467. pCJM467 is introduced into S. parvulus
Tu4055 by protoplast transformation as described in example 5.
Colonies resistant to apramycin (25 .mu.g/ml) are initially
identified, and then passaged several times through MA media
without antibiotic selection in order to promote the second
recombination (Fernandez et al., 1998). Several apramycin-sensitive
colonies are isolated and analysed by PCR and Southern blot. The
new mutant is named S. parvulus Tu4055/467.
[0208] S. parvulus Tu4055/467 is analysed as described in example 1
and shown to produce a mixture of compounds with the correct UV
spectrum. One of the new major components that is more polar than
borrelidin has the correct retention time for 10-desmethyl
borrelidin 3. LCMS analysis indicates an m/z ratio for a compound
that is 14 mass units lower than borrelidin as expected, and with
an appropriate mass fragmentation pattern. Borrelidin itself is
also produced, but at levels lower than the WT organism.
Example 7
Mutation of the Methylmalonyl-CoA Selective Motif of borAT4 to
Generate 10-desmethyl Borrelidin
[0209] Site directed mutagenesis of acyl transferase domains may
also be used to alter the specificity of an AT. In this example the
specificity of borAT4 is directed from methylmalonyl-CoA towards
malonyl-CoA. An amino acid motif has been identified (Reeves et
al., 2001; WO 02/14482) which directs the specificity of an AT. The
motif YASH, as observed in borAT4, is found in methylmalonyl-CoA
specific ATs and in this example it is altered to HAFH which is
found in malonyl-CoA specific ATs.
[0210] CosBor32A2 is digested with NcoI and the 5167 bp band
isolated. This is used as a template for PCR using the primers
CM414 (5'-AAACTGCAGAGTCGAACATCGGTCACACGCAGGC-3') (SEQ ID No.48)
and CM415 (5'-AAAATGCATGATCCACATCGATACGACGCGCCCGA-3') (SEQ ID
No.49). CM414 introduces a PstI restriction site for cloning
purposes, and CM415 is a mutagenic primer covering the motif
encoding region of the AT which will effect the amino acid changes
and contains an NsiI site for cloning purposes. The .about.1.1 kbp
fragment is cloned into pUC18 digested with SmaI and
dephosphorylated. The insert can ligate in either orientation and
the forward orientation is screened for by restriction enzyme
analysis and the insert sequenced. One correct plasmid is
designated pCJM468.
[0211] A second PCR reaction is performed using the 5167 bp NcoI
fragment of CosBor32A2 and the primers CM416
(5'-TAAATGCATTCCATTCGGTGCAGGTGGAGTTGATCC-3') (SEQ ID No.50) and
CM417 (5'-ATAGGATCCCCTCCGGGTGCTCCAGACCGGCCACCC-3') (SEQ ID No.51).
CM416 introduces an NsiI restriction site and is also a mutagenic
primer covering the motif encoding region of the AT, and CM417
introduces a BamHI site for cloning purposes. The .about.1.1 kbp
fragment is cloned into pUC18 previously digested with SmaI and
dephosphorylated. The insert can ligate in two orientations and the
forward orientation is screened for by restriction enzyme analysis
and the insert sequenced. One correct plasmid is designated
pCJM469.
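The way the two mutagenic primers could combine at the shared NsiI site to encode the HAFH motif can be illustrated as follows. This is only a sketch under stated assumptions: CM415 is treated as the reverse-strand primer, the two PCR products are joined in silico at the NsiI site, and the reading frame is assumed to begin one base upstream of that site. The partial codon table and all names are illustrative.

```python
# Partial codon table: only the codons needed for this illustration.
CODONS = {"CAT": "H", "GCA": "A", "TTC": "F"}

CM415 = "AAAATGCATGATCCACATCGATACGACGCGCCCGA"   # as given in the text
CM416 = "TAAATGCATTCCATTCGGTGCAGGTGGAGTTGATCC"  # as given in the text
NSII = "ATGCAT"


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate in-frame codons using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Assumed top-strand sequence upstream of the junction (CM415 read on the other strand).
upstream = reverse_complement(CM415)
# Join the two fragments at the shared NsiI site, keeping one copy of the site.
junction = upstream[:upstream.index(NSII)] + CM416[CM416.index(NSII):]

# Assumed reading frame: starts one base before the NsiI site.
start = junction.index(NSII) - 1
motif_dna = junction[start:start + 12]
motif = translate(motif_dna)
print(motif_dna, "->", motif)
```

Under these assumptions the junction reads as the HAFH motif, with the NsiI site lying within the codons for the first three residues.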
[0212] Plasmids pCJM468 and pCJM469 are digested with NsiI and XbaI
and the .about.1.1 kbp fragment from pCJM468 is ligated into the
3.8 kbp backbone of pCJM469 to give pCJM470, which is identified by
restriction enzyme analysis. pCJM470 contains the mutated motif of
borAT4 with ~1.1 kbp of homologous DNA on either side which
provide homology for integration and secondary recombination.
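By way of illustration only, the effect of joining the two mutagenic fragments at the shared NsiI site can be sketched as follows. The sketch assumes that CM415 primes the reverse strand of the upstream fragment, so that its reverse complement gives the coding strand, and that the fragments are joined exactly at the NsiI site; the helper names are hypothetical and form no part of the described method.

```python
# Illustrative sketch only: reconstructs the expected NsiI junction from the
# two mutagenic primers quoted above. Helper names are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

CM415 = "AAAATGCATGATCCACATCGATACGACGCGCCCGA"   # SEQ ID No.49
CM416 = "TAAATGCATTCCATTCGGTGCAGGTGGAGTTGATCC"  # SEQ ID No.50
NSII = "ATGCAT"


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq, frame):
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Assumption: the coding strand runs from the reverse complement of CM415,
# through the single NsiI site, into the sequence that follows the site in CM416.
upstream = reverse_complement(CM415)
upstream = upstream[: upstream.index(NSII) + len(NSII)]
downstream = CM416[CM416.index(NSII) + len(NSII):]
junction = upstream + downstream

frames = [translate(junction, f) for f in range(3)]
for f, protein in enumerate(frames):
    print(f, protein)
```

Under these assumptions one reading frame of the reconstructed junction contains the motif HAFH, with the NsiI site lying within the codons that encode it.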
[0213] Plasmid pCJM470 is digested with PstI and BamHI and the
~2.2 kbp fragment is cloned into pSL1180 (Amersham
Biosciences) previously digested with PstI and BamHI to give
pCJM471. Plasmid pCJM471 is then digested with PstI and the
apramycin cassette is incorporated on a PstI fragment from pEFBA
(Lozano et al., 2000) to provide the replacement vector
pCJM472.
[0214] The replacement vector pCJM472 is introduced into S.
parvulus Tu4055 by protoplast transformation as described in
example 5. Colonies resistant to apramycin are initially
identified, and then passaged several times through MA media
without antibiotic selection in order to promote the second
recombination (Fernandez et al., 1998). Several apramycin-sensitive
colonies are isolated and analysed by PCR and Southern blot, and
one is selected that contains the new AT4 sequence containing the
mutated motif and the NsiI site. The new mutant is named S.
parvulus Tu4055/472.
[0215] S. parvulus Tu4055/472 is grown and analysed as described in
example 1 and shown to produce a mixture of compounds with the
correct UV profile for borrelidin. One of the new major components,
that is more polar than borrelidin, has the correct retention time
for authentic 3. LCMS analysis indicates an m/z ratio for a
compound that is 14 mass units lower than borrelidin as expected,
and with an appropriate mass fragmentation pattern. Borrelidin
itself is also produced, but at levels lower than the WT
organism.
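The expected shift of 14 mass units can be rationalised by a short calculation, given here as a sketch only. It assumes that incorporation of malonyl-CoA in place of methylmalonyl-CoA replaces one methyl branch by hydrogen, a net loss of CH2; the function name is hypothetical.

```python
# Illustrative arithmetic: mass change when one methyl branch (CH3) is
# replaced by hydrogen, i.e. a net loss of CH2.
MONOISOTOPIC = {"C": 12.0, "H": 1.007825}  # standard monoisotopic masses (Da)


def formula_mass(counts):
    """Sum the monoisotopic masses for an element-count mapping."""
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items())


ch2_loss = formula_mass({"C": 1, "H": 2})
print(f"expected shift: -{ch2_loss:.3f} Da")
```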
Example 8
Introduction of the Borrelidin Loading Module into the Erythromycin
PKS
[0216] The borrelidin loading module was amplified for each of the
four putative start codons. The PCR template was a 3376 bp BamHI
fragment of cosBor32A2 covering the region from nucleotides 15858
to 19234 of SEQ ID No.1. The reverse primer CM368
(5'-TTTCCTGCAGGCCATCCCCACGATCGCGATCGGCT-3') (SEQ ID No:52)
introduces a SbfI site at the sequence corresponding to the start
of KS1 of borA2 (conserved MACRL motif) and is used with each of
the forward primers CM369
(5'-TTTCATATGACAGGCAGTGCTGTTTCGGCCCCATT-3') (SEQ ID No.53), CM370
(5'-TTTCATATGGCGGATGCCGTACGTGCCGCCGGCGCT-3') (SEQ ID No.54), CM371
(5'-TTTCATATGCCCCAGGCGATCGTCCGCACCAC-3') (SEQ ID No.55) and CM372
(5'-TTTCATATGGTCTCGGCCCCCCACACAAGAGCCCTCCGGGC-3') (SEQ ID No:56).
The four PCR products (of 2834, 2720, 2411 and 2117 bp
respectively) were cloned into pUC18 that had previously been
digested with SmaI and dephosphorylated. The resulting plasmids
were designated pCJM370, which contains the largest insert,
pCJM371, pCJM372 and pCJM373, which contains the smallest
insert.
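As a consistency check on the four product sizes, the spacing between the putative start codons may be computed as sketched below. The sketch assumes that every product ends at the same reverse primer (CM368) and that the four forward primers carry the same 5' tail, so that differences in product length equal the spacing between the candidate start codons; the function name is hypothetical.

```python
# Illustrative sketch: spacing of the four candidate start codons, inferred
# from the PCR product sizes quoted above.
product_sizes_bp = {"CM369": 2834, "CM370": 2720, "CM371": 2411, "CM372": 2117}


def start_codon_offsets(sizes):
    """Offset (bp) of each candidate start codon from the most upstream one."""
    longest = max(sizes.values())
    return {primer: longest - size for primer, size in sizes.items()}


offsets = start_codon_offsets(product_sizes_bp)
# A candidate shares a reading frame with the first one if its offset is a
# multiple of three.
in_frame = {primer: off % 3 == 0 for primer, off in offsets.items()}
print(offsets)
print(in_frame)
```

On these assumptions all four candidates lie in the same reading frame, as would be required for each to yield the same downstream protein sequence.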
[0217] The four borrelidin loading module fragments were introduced
into the vector pKS1W, which contains a PstI site at the start of
eryKS1 of DEBS1-TE in the conserved MACRL motif (Rowe et al.,
2001); PstI gives the same overhang as SbfI. pKS1W is a pT7-based
plasmid containing DEBS1-TE on an NdeI/XbaI fragment, with unique
sites flanking the loading module, a unique PstI site at nucleotide
position 1698 of the DEBS1-TE encoding gene and a unique NdeI site
at the start codon. The borrelidin loading module fragments were
excised as follows: pCJM370 was digested with NdeI and SbfI,
pCJM371 and pCJM373 were digested with NdeI and PstI, and pCJM372
was digested with NdeI, PstI and DraI. Each loading module
containing fragment was cloned into pKS1W previously digested with
NdeI and PstI. The resulting plasmids were designated pCJM384,
which contains the largest insert, then pCJM386, pCJM388 and
pCJM390, which contains the smallest insert.
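The compatibility of PstI and SbfI ends relied upon above can be illustrated with the following sketch, which derives the single-stranded overhang from the recognition sequence and the cut positions of each enzyme. The cut positions are the standard published ones for these enzymes; the data layout and function name are hypothetical. NsiI, which is used elsewhere in these examples, is included for comparison.

```python
# Illustrative sketch: overhangs left by PstI, SbfI and NsiI.
# Each entry: (recognition site, top-strand cut index,
#              bottom-strand cut index expressed in top-strand coordinates)
ENZYMES = {
    "PstI": ("CTGCAG", 5, 1),    # CTGCA^G
    "SbfI": ("CCTGCAGG", 6, 2),  # CCTGCA^GG
    "NsiI": ("ATGCAT", 5, 1),    # ATGCA^T
}


def overhang(name):
    """Return (overhang type, overhang sequence) for the named enzyme."""
    site, top_cut, bottom_cut = ENZYMES[name]
    lo, hi = sorted((top_cut, bottom_cut))
    kind = "3'" if top_cut > bottom_cut else "5'"
    return kind, site[lo:hi]


for enzyme in ENZYMES:
    print(enzyme, overhang(enzyme))
```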
[0218] The hybrid PKS fragments were transferred into pCJR24, which
is a suitable vector for transformation of S. erythraea WT and S.
erythraea DM, and for expression of the resulting hybrid PKS (WO
98/01546). Each loading module construct was excised along with a
2346 bp fragment of DNA from DEBS1 in order to allow integration
into the chromosome. In order to achieve this, pCJR24 is digested
with XbaI and end-filled using Klenow fragment of DNA polymerase I.
This is then digested with NdeI to give the backbone fragment. Into
this, the four hybrid PKS fragments containing the borrelidin
loading modules plus the region of DEBS1 sequence for integration
are cloned as NdeI/EcoRV fragments from pCJM384, pCJM386, pCJM388
and pCJM390 to give pCJM400, pCJM401, pCJM402 and pCJM403
respectively.
[0219] Plasmids pCJM400, pCJM401, pCJM402 and pCJM403 were
introduced into S. erythraea by transformation of S. erythraea DM
protoplasts as described elsewhere (Gaisser et al., 2000). The
resulting mutants were analysed by PCR and Southern blot to confirm
the presence of the plasmid on the chromosome and to establish that
correct integration had occurred. A number of mutants that appeared
correct by these methods were grown, extracted and analysed
according to standard methods for polyketide production from S.
erythraea strains (Wilkinson et al., 2000). When compared to
control strains using LCMS methods, the extracts from several of
these mutants contained new compounds at reasonable levels.
Analysis of their MS spectra showed the presence of a compound with
m/z=485.3 ([M-H]⁻, 6) in negative ion mode. This is in
agreement with the expected product compound (M=486.3).
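The agreement between the observed ion and the expected mass can be checked as sketched below. The function name is hypothetical; the only inputs are the mass of 486.3 stated above and the mass of a proton.

```python
# Illustrative sketch: m/z of a deprotonated ion from the neutral mass.
PROTON = 1.00728  # mass of a proton in Da


def mz_deprotonated(neutral_mass, charge=1):
    """m/z of the [M-nH]n- ion for a given neutral mass."""
    return (neutral_mass - charge * PROTON) / charge


print(round(mz_deprotonated(486.3), 1))
```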
Example 9
Fusion of PKS Modules 4 and 5 (S. parvulus Tu4055/borA4-A5)
[0220] To examine the iterative action of module 5, the two
separate proteins encoding modules 4 and 5 were fused together
through manipulation at the genetic level. The fusion was performed
by a gene replacement in which the last ~1 kbp of borA4 and
the first ~1 kbp of borA5 were fused by converting the
overlapping stop and start codons respectively into an arginine
residue, introducing a new XbaI site and converting the two
separate orfs into one.
[0221] In the first step of the mutagenesis, two separate PCR
amplifications were performed. In the first PCR reaction, the
template DNA was cosBor19B9, and the primers were B1819A
(5'-GTCATGCATGCGGCGGGCTC-3') (SEQ ID No.57) and B1819B
(5'-GGTCTAGAACGGCCGAACTT-3') (SEQ ID No.58). The 1063 bp product
was purified, digested with NsiI-XbaI and cloned into pSL1180 (Amersham
Biosciences) digested similarly to give plasmid pSL18-19AB. The
second PCR reaction amplified the borA5 fragment and used the
primers B1819C (5'-GTTCTAGAACCTCGGTCGGC-3') (SEQ ID No.59) and
B1819D (5'-CTGGATCCCACGCTGCTGCG-3') (SEQ ID No.60). The 1033 bp
product was purified, digested with XbaI-BamHI and cloned into
pSL18-19AB that had been digested similarly, to give plasmid
pSL18-19ABCD. Finally, the apramycin cassette from pEFBA (Lozano et
al., 2000) was excised as a PstI fragment and cloned into
pSL18-19ABCD digested with NsiI to give the replacement vector
pSL18-19Apra.
[0222] The replacement vector pSL18-19Apra was introduced into S.
parvulus Tu4055 by protoplast transformation as described in
example 5. Colonies resistant to apramycin (25 µg/ml) were
initially selected, and then passaged several times through MA
media without selection. Several apramycin-sensitive colonies were
obtained, two of which produced borrelidin while the others did
not.
[0223] Chromosomal DNA was extracted from all of the apramycin
sensitive colonies and checked initially by PCR using the primers
BLDA (5'-GGAGACTTACGGGGGATGC-3') (SEQ ID No.61) and BLDB
(5'-CTCCAGCAGCGACCAGAAC-3') (SEQ ID No.62) that are selective for
the loading module (borA1). A 2.9 kbp fragment was observed for the
control and the two borrelidin-producing mutants, but not for the
non-producing strains. This result is characteristic of
non-specific deletions in the chromosome.
[0224] The two borrelidin-producing colonies were analysed further
by PCR using the primers B19A (5'-CCCATGCATCACCGACATAC-3') (SEQ ID
No.63) and B19B (5'-GCGATATCCCGAAGAACGCG-3') (SEQ ID No.64) in
order to check the fusion site. The method was as described above.
Both the colonies and the controls gave a PCR product of 1010 bp,
but upon digestion with XbaI only those that carried the
fusion-producing mutation gave digestion to 600 and 400 bp
fragments. Only one of the borrelidin-producing colonies harboured
the fusion, while the other had reverted to wild type. Final
confirmation came from Southern analysis using a BamHI-XhoI
internal fragment from borA5 as probe over chromosomal DNA digested
with XbaI and BclI. The control and wild type revertant colony
showed a fragment of 11.5 kbp as expected, while the fusion mutant
showed a fragment of 7.8 kbp as expected. This new mutant was named
S. parvulus Tu4055/borA4-A5. S. parvulus Tu4055/borA4-A5 was shown
to produce borrelidin at 26±5% of the WT titre, following the
protocol described in example 1.
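The diagnostic XbaI digestion described above can be modelled as sketched below. The two amplicons are hypothetical placeholders of the stated length (1010 bp), not the real borA4/borA5 sequence, and the position of the engineered XbaI site within the fusion amplicon is an assumption chosen to give fragments of roughly 600 and 400 bp.

```python
# Illustrative sketch: fragment lengths from digestion of a linear PCR product.
def digest_linear(seq, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site."""
    cuts, start = [], seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


XBAI = ("TCTAGA", 1)  # T^CTAGA

# Hypothetical amplicons made of placeholder bases.
wild_type = "G" * 1010                      # no XbaI site
fusion = "G" * 599 + "TCTAGA" + "G" * 405   # XbaI site placed by assumption

print(digest_linear(wild_type, *XBAI))
print(digest_linear(fusion, *XBAI))
```

Only the amplicon carrying the engineered site is cut, which is the basis on which the fusion mutant is distinguished from the wild type revertant.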
Example 10
Fusion of PKS Modules 5 and 6 (S. parvulus
Tu4055/borA5-A6)
[0225] This experiment was performed for the same reason as, and in
an analogous manner to, that of example 9 above. The fusion of
these orfs introduced an additional leucine residue into the new
protein at the fusion point, in addition to a new SpeI site at the
genetic level. In the first step of the process two PCR fragments
were generated using cosBor19B9 as template. The first PCR reaction
amplified the borA5 region and used the primers B1920A
(5'-GCCAAGCTTCCTCGACGCGC-3') (SEQ ID No.65) and B1920B
(5'-CACTAGTGCCTCACCCAGTT-3') (SEQ ID No.66). The 804 bp product was
purified and digested with HindIII-SpeI. The second PCR reaction
amplified the borA6 region and used the primers B1920C
(5'-CACTAGTGACGGCCGAAGCG-3') (SEQ ID No.67) and B1920D
(5'-TCGGATCCGTCAGACCGTTC-3') (SEQ ID No.68). The 960 bp product was
purified and digested with SpeI-BamHI. The two purified and
digested gene products were then cloned together into pOJ260 that
had been digested with HindIII-BamHI to give the replacement vector
pOJF19-20. pOJF19-20 was introduced into S. parvulus Tu4055 by
protoplast transformation to give apramycin resistant colonies. One
such colony was passaged several times through MA media without
selection in order to promote double recombination. Two apramycin
sensitive colonies were obtained, and chromosomal DNA from these
was examined by Southern hybridisation to check for the presence of
a 3.2 kbp BamHI fragment (to control for unwanted deletions in the
loading module) and a 3.4 kbp SpeI-BamHI fragment to verify correct
introduction of the borA5-A6 fusion (5.8 kbp BamHI fragment in the
WT). One of the apramycin-sensitive colonies carried the correct
mutation without deletion and was named S. parvulus Tu4055/borA5-A6.
S. parvulus Tu4055/borA5-A6 was shown to produce borrelidin at
25±4% of the WT titre, following the protocol as described in
example 1.
Example 11
Fusion of PKS Modules 4, 5 and 6 (S. parvulus
Tu4055/borA4-A5-A6)
[0226] To generate the strain S. parvulus Tu4055/borA4-A5-A6 we
took advantage of the previously obtained strain S. parvulus
Tu4055/borA4-A5 (Example 9) and plasmid pOJF19-20 (Example 10).
pOJF19-20 was introduced into S. parvulus Tu4055/borA4-A5 by
protoplast transformation to give apramycin resistant colonies. One
such colony was passaged several times through MA media without
selection in order to promote double recombination. One apramycin
sensitive colony was obtained, and chromosomal DNA from it was
examined by Southern hybridisation to check for the presence of a
3.2 kbp BamHI fragment (to control for unwanted deletions in the
loading module), a 3.4 kbp SpeI-BamHI fragment to verify correct
introduction of the borA5-A6 fusion (5.8 kbp BamHI fragment in the
WT) and a 6.4 kbp SpeI-XbaI fragment to verify the presence of both fusions,
borA4-A5 and borA5-A6, within the same strain. The chosen colony
carried the correct mutation without deletion and was named S.
parvulus Tu4055/borA4-A5-A6. S. parvulus Tu4055/borA4-A5-A6 was
shown to produce borrelidin at 18±5% of the WT titre, following
the protocol as described in example 1.
Example 12
Replacement of the Erythromycin PKS Module 4 with Module 5 of the
Borrelidin PKS--Production of Ring Expanded Macrolides
[0227] Example 12 describes the replacement of erythromycin module
4 with borrelidin module 5. Borrelidin module 5 is believed to be
responsible for three rounds of condensation of methylmalonyl-CoA,
in an iterative fashion, within the borrelidin PKS. Previously,
erythromycin module 4 has been shown to occasionally act in an
iterative fashion `mis`-incorporating a second methylmalonyl-CoA to
make very small amounts of a 16-membered macrolide from the
erythromycin PKS. A strain in which the erythromycin module 4 is
replaced by borrelidin module 5 is engineered by a replacement
strategy as follows, and is based on a derivative process as
described for module insertion into the erythromycin PKS (Rowe et
al., 2001):
[0228] Initially a series of plasmids are made in order to generate
a plasmid in which the borrelidin module 5 is flanked by
appropriate regions of homology from the erythromycin PKS. In order
to facilitate this, the SbfI site is first removed from the
polylinker of pUC18 by digestion with PstI, end-polishing with T4
polymerase and religation. The new plasmid, pCJM409 is identified
by restriction enzyme digestion.
[0229] Borrelidin module 5 is isolated on an SbfI fragment by
ligating together 4 PCR fragments. PCRA is generated by
amplification of ~1.4 kb of the beginning of borrelidin
module 5 using the 6062 bp XcmI fragment of cosBor19B9 as the
template and primers CM384
(5'-AACCTGCAGGTACCCCGGTGGGGTGCGGTCGCCCGA-3') (SEQ ID No.69) and
CM385 (5'-CGCCGCACGCGTCGAAGCCAACGA-3') (SEQ ID No.70). CM384
introduces an SbfI site in the conserved amino acid sequence MxCR
at the beginning of borrelidin module 5. CM385 incorporates a
naturally occurring MluI site that is used in the cloning strategy.
PCRA is treated with T4 polynucleotide kinase (T4 PNK, NEB) and
cloned into pCJM409 previously digested with SmaI and
dephosphorylated with Shrimp Alkaline Phosphatase (SAP, Roche).
Inserts cloned in the forward direction are screened for by
restriction enzyme digestion, and for one correct clone the insert
is verified by sequencing. This plasmid is designated pCJM410.
[0230] PCRB is generated by amplification of the adjacent
~1.4 kb of borrelidin module 5 using the 6062 bp XcmI
fragment of cosBor19B9 as the template and primers CM386
(5'-TGTGGGCTGGTCGTTGGCTTCGAC-3') (SEQ ID No.71) and CM387
(5'-GGTGCCTGCAGCGTGAGTTCCTCGACGGATCCGA-3') (SEQ ID No.72). CM386
binds upstream of the same MluI site as CM385 contains, which is
used in the cloning strategy. CM387 is used to remove the SbfI site
within the borrelidin PKS module 5 whilst leaving the overlapping
PstI site for cloning. PCRB is treated with T4 PNK and cloned into
pCJM409 previously digested with SmaI and dephosphorylated with
SAP. Inserts cloned in the forward direction are screened for by
restriction enzyme digestion, and for one correct clone the insert
is verified by sequencing. This plasmid is designated pCJM411.
[0231] PCRC is generated by amplification of the downstream
adjacent ~1.5 kb of borrelidin module 5 using the 6062 bp
XcmI fragment of cosBor19B9 as the template and oligonucleotides
CM388 (5'-GAGGAACTCACCCTGCAGGCACCGCT-3') (SEQ ID No.73) and CM395
(5'-CGAACGTCCAGCCCTCGGGCATGCGT-3') (SEQ ID No.74). CM388 binds at
the same SbfI site as CM387, but is not mutagenic and retains the
SbfI site. CM395 incorporates an SphI site for cloning purposes.
PCRC is treated with T4 PNK and cloned into pCJM409 previously
digested with SmaI and dephosphorylated with SAP. Inserts cloned in
the forward direction are screened for by restriction enzyme
digestion and for one correct clone the insert is verified by
sequencing. This plasmid is designated pCJM412.
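The differing behaviour of CM387 and CM388 at the internal SbfI site can be confirmed directly from the primer sequences, as in the following sketch. Because both recognition sequences are palindromic, a search of one strand suffices. The function and variable names are hypothetical.

```python
# Illustrative sketch: presence of SbfI and PstI sites in the two primers that
# span the internal SbfI site of borrelidin module 5.
SBFI, PSTI = "CCTGCAGG", "CTGCAG"  # both sites are palindromic

PRIMERS = {
    "CM387": "GGTGCCTGCAGCGTGAGTTCCTCGACGGATCCGA",  # SEQ ID No.72
    "CM388": "GAGGAACTCACCCTGCAGGCACCGCT",          # SEQ ID No.73
}


def site_report(seq):
    """Report which of the two recognition sequences occur in seq."""
    return {"SbfI": SBFI in seq, "PstI": PSTI in seq}


report = {name: site_report(seq) for name, seq in PRIMERS.items()}
for name, sites in report.items():
    print(name, sites)
```

CM387 retains only the PstI site, whereas CM388 retains the full SbfI site, in keeping with the description above.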
[0232] PCRD is generated by amplification of the downstream
adjacent ~2.1 kb of borrelidin module 5 using the 7211 bp
BbvCI fragment of cosBor19B9 as the template and primers CM396
(5'-TGGCACGCATGCCCGAGGGCTGGACGTT-3') (SEQ ID No.75) and CM397
(5'-TTTCCTGCAGGCCATGCCGACGATCGCGACAGGCT-3') (SEQ ID No.76). CM396
contains the SphI site for cloning purposes, and CM397 introduces
an SbfI site in the conserved amino acid sequence MxCR at the end
of borrelidin module 5. PCRD is treated with T4 PNK and cloned into
pCJM409 previously digested with SmaI and dephosphorylated with
SAP. Inserts cloned in the forward direction are screened for by
restriction enzyme digestion, and for one correct clone the insert
is verified by sequencing. This plasmid is designated pCJM413.
[0233] The four PCR products (PCRA-D) are used to construct the
borrelidin module 5 on an SbfI fragment as follows:
[0234] pCJM412 is digested with SphI and the ~1.5 kb fragment
isolated is cloned into pCJM413 previously digested with SphI and
dephosphorylated with SAP. This gives plasmid pCJM414, which is
identified by restriction enzyme digestion.
[0235] pCJM414 is digested with SbfI and the ~3.6 kb fragment
isolated is cloned into pCJM411 previously digested with PstI and
dephosphorylated with SAP. This gives pCJM415, which is identified
by restriction enzyme digestion.
[0236] pCJM410 is digested with MluI and HindIII and the ~1.4
kb fragment isolated is cloned into pCJM415 previously digested
with MluI and HindIII. This gives pCJM416, which is identified by
restriction enzyme digestion. pCJM416 is a pUC18-based plasmid
containing the borrelidin module 5 as an SbfI fragment.
[0237] In order to introduce the borrelidin module 5 into the
erythromycin PKS by a replacement strategy, flanking regions of
homology from the erythromycin PKS are incorporated for
recombination as follows:
[0238] PCRE is generated by amplification of ~3.3 kb of the
erythromycin PKS directly upstream of the module 4 KS using the
6428 bp XmnI fragment of pIB023 as the template and primers CM398
(5'-AAACATATGGTCCTGGCGCTGCGCAACGGGGAACTG-3') (SEQ ID No.77) and
CM399 (5'-TTTCCTGCAGGCGATGCCGACGATGGCGATGGGCT-3') (SEQ ID No.78).
CM398 contains an NdeI site for cloning purposes and CM399
introduces an SbfI site in the conserved amino acid sequence M/IxCR
at the beginning of erythromycin module 4. PCRE is treated with T4
PNK and cloned into pCJM409 previously digested with SmaI and
dephosphorylated with SAP. Inserts cloned in the forward direction
are screened for by restriction enzyme digestion, and for one
correct clone the insert is verified by sequencing. This plasmid is
designated pCJM417.
[0239] PCRF is generated by amplification of ~3.4 kb of the
erythromycin PKS directly downstream of the module 5 KS using the
7875 bp XmnI/NheI fragment of pIB023 as the template and primers
CM400 (5'-AAACCTGCAGGTTCCCCGGCGACGTGGACTCGCCGGAGTCGTT-3') (SEQ ID
No.79) and CM401 (5'-TTTTCTAGAGCGACGTCGCAGGCGGCGATGGTCACGCCCGT-3')
(SEQ ID No.80). CM400 introduces an SbfI site in the conserved
amino acid sequence M/IxCR at the beginning of erythromycin module
4, and primer CM401 contains an XbaI site for cloning purposes.
PCRF is treated with T4 PNK and cloned into pCJM409 previously
digested with SmaI and dephosphorylated with SAP. Inserts cloned in
the forward direction are screened for by restriction enzyme
digestion, and for one correct clone the insert is verified by
sequencing. This plasmid is designated pCJM418.
[0240] pCJM417 is digested with NdeI and SbfI and the ~3.3 kb
fragment is cloned into pCJM418 digested with NdeI and SbfI
(~5.8 kbp) to give pCJM419, which is identified by its
restriction digest pattern. pCJM419 contains a unique SbfI site
which can be used to accept any complete module with SbfI (or PstI)
flanking sites appropriate to place, in-frame, the in-coming module
exactly into the conserved region of the KS domain.
[0241] The borrelidin module 5 with flanking SbfI sites is cloned
from pCJM416 as an SbfI fragment into the unique SbfI site of
pCJM419 (which has been dephosphorylated with SAP) to give pCJM420,
which is identified by restriction enzyme analysis to confirm the
presence and correct orientation of the insert. pCJM420 thus
contains borrelidin module 5 with flanking regions of homology to
introduce it in-frame between modules 3 and 5 of the erythromycin
PKS. The complete insert is removed as an NdeI/XbaI fragment from
pCJM420 and cloned into pCJR24 digested with NdeI and XbaI to give
the final plasmid pCJM421. pCJR24, and consequently pCJM421,
contain an appropriate resistance marker for selection of S.
erythraea transformants.
[0242] Plasmid pCJM421 is used to transform S. erythraea strains
NRRL2338 (wild type), and S. erythraea DM (eryCIII⁻,
eryBV⁻) protoplasts (Yamamoto et al., 1986; Rowe et al.,
1998). Integrants are selected for resistance to thiostrepton (50
mg/L) and a number of integrants (typically 5-8) are analysed
further by Southern blot to confirm that the strains are correct
and to identify the site of integration. Two correct integrants in
each case are sub-cultured in TSB liquid media without antibiotic
selection in order to promote the second recombination. Several
thiostrepton-sensitive colonies are isolated and analysed by PCR
and Southern blot, and in each case one selected that contains the
new module correctly inserted. This leads to strains S. erythraea
WT/421 and S. erythraea DM/421.
[0243] Strain S. erythraea DM/421 is cultured under conditions
appropriate for the production of erythronolides (Wilkinson et al.,
2000). Analysis of fermentation broth extracts using LCMS methods
indicates the presence of two new significant peaks when compared
to the control strain, and which are less polar than erythronolide
B. These have an m/z of 435.5 (MNa⁺) and 477.5 (MNa⁺)
respectively, which is consistent with the production of new ring
expanded erythronolide B analogues. The compound with m/z=435.5 (7)
is consistent with the presence of the 16-membered ring-expanded
erythronolide B related macrolide reported previously as a minor
component of S. erythraea WT fermentations (Wilkinson et al.,
2000); the compound with m/z=477.5 (8) is consistent with the
presence of an 18-membered, doubly ring-expanded erythronolide B
related macrolide (see FIG. 8). It is clear to one skilled in the
art that such new products can be converted to antibacterial
molecules by biotransformation with an appropriate organism, or
through the fermentation of the strain S. erythraea WT/421. It is
further clear to one skilled in the art that the inclusion of such
a module into other positions of the erythromycin PKS or into other
PKSs may allow the production of novel, ring expanded polyketides
in a similar manner.
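The relationship between the two sodiated ions can be examined with the short calculation sketched below. The comparison with C3H6 rests on an assumption not stated in the text, namely that each additional round of chain extension by the inserted module contributes one fully reduced propionate-derived unit; the function name is hypothetical.

```python
# Illustrative sketch: neutral masses and mass difference of compounds 7 and 8
# from the [M+Na]+ ions quoted above.
NA_ADDUCT = 22.98922            # mass added on forming [M+Na]+ (Na minus one electron), Da
C3H6 = 3 * 12.011 + 6 * 1.008   # average mass of one C3H6 unit, Da (assumed increment)


def neutral_mass(mz_sodiated):
    """Neutral mass of a singly charged sodium adduct."""
    return mz_sodiated - NA_ADDUCT


m7, m8 = neutral_mass(435.5), neutral_mass(477.5)
delta = m8 - m7
print(round(m7, 1), round(m8, 1), round(delta, 1), round(C3H6, 2))
```

The difference between the two ions is independent of the adduct and, under the stated assumption, is consistent with one further extension unit in compound 8 relative to compound 7.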
[0244] An alternative strategy for generating this hybrid PKS is to
incorporate the borrelidin module 5 in place of erythromycin module
4 within a large plasmid that contains the entire hybrid PKS,
followed by transformation of an eryA⁻ S. erythraea strain.
One such existing eryA⁻ strain is S. erythraea JC2 (Rowe
et al., 1998), and a suitable plasmid containing the eryA genes under
the actI promoter is pIB023, which also contains a thiostrepton
resistance gene and the actII-ORF4 activator. This strategy is
accomplished as follows:
[0245] pIB023 is digested with NdeI and BsmI and the 13.4 kbp
fragment is cloned into pCJM419 digested with NdeI and BsmI to give
plasmid pCJM425. pIB023 is digested with BbvCI and XbaI and the
approx. 6 kbp fragment is cloned into pCJM425 digested with BbvCI
and XbaI to give plasmid pCJM426. The NdeI/XbaI fragment from
pCJM426 is cloned into pCJM395 digested with NdeI and XbaI. pCJM395
is a plasmid made by digesting pCJR24 with SbfI, end-polishing with
T4 polymerase and religating, to give a version of pCJR24 that does
not cut with SbfI. The resulting plasmid, pCJM427, contains an
engineered version of the erythromycin PKS in which module 4 is
removed. This backbone is then ready to accept any complete module
with appropriate flanking sites (SbfI or PstI) to generate a hybrid
PKS. Introduction of the single borrelidin module 5 is accomplished
by digesting pCJM427 with SbfI, dephosphorylating the backbone with
SAP, and ligating in the SbfI fragment from pCJM416, to give
pCJM430.
[0246] Plasmid pCJM430 is used to transform S. erythraea JC2.
Integrants are selected for resistance to thiostrepton (50 mg/L)
and a number of integrants (typically 5-8) are analysed further by
Southern blot to confirm that the strains are correct and to
identify the site of integration. The resulting correct strain S.
erythraea JC2/430 is cultured under conditions appropriate for the
production of erythromycins (Wilkinson et al., 2000) and analysed
for the production of novel compounds 7 and 8.
Example 13
Disruption of borB (S. parvulus Tu4055/borB::aac3(IV))
[0247] In order to disrupt borB, a region of 2751 bp containing
borB was amplified by PCR using primers B5B
(5'-AACTAGTCCGCAGTGGACCG-3') (SEQ ID No.91) and B5A
(5'-TCGATATCCTCACCGCCCGT-3') (SEQ ID No.92) and cosmid Bor32A2 as
template. The PCR product was purified and then digested at the
flanking sites SpeI-EcoRV and subcloned into pSL1180 digested with
the same restriction enzymes to generate pSLB. A SpeI-AgeI fragment
(the latter site internal to the insert) from pSLB containing the
5'-end of borB was subcloned into the SpeI-XmaI sites of pEFBA,
upstream of the apramycin resistance gene aac(3)IV, to produce
pEB1. A BsaAI-EcoRV fragment (the former site internal to the
insert) from pSLB containing the 3'-end of borB was then subcloned
in the correct orientation into the EcoRV site of pEB1 downstream
of aac(3)IV, to generate pEB2. In this way a 741 bp AgeI-BsaAI
fragment internal to borB was deleted and replaced by aac(3)IV.
Finally, the SpeI-EcoRV fragment was rescued from pEB2 and
subcloned, together with a PstI-SpeI fragment containing the hyg
gene from pLHyg, into the PstI-EcoRV sites of pSL1180 to generate
pSLBr1. This approach was used in order to avoid possible polar
effects.
[0248] The vector pSLBr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borB::aac3(IV). Strain S. parvulus
Tu4055/borB::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was observed and
compared to a wild type control. In addition S. parvulus
Tu4055/borB::aac3(IV) was chemically complemented with
trans-1,2-dicyclopentane dicarboxylic acid, following the protocol
described in example 1.
Example 14
Disruption of borC (S. parvulus Tu4055/borC::aac3(IV))
[0249] In order to disrupt borC, a region of 3553 bp containing
borC was amplified by PCR using primers B6B
(5'-AACTAGTGTGGCAGACGGTC-3') (SEQ ID No.93) and B5A
(5'-TCGATATCCTCACCGCCCGT-3') (SEQ ID No.94) and cosmid Bor32A2 as
template. The PCR product was purified and then digested with
SpeI-EcoRV and subcloned into the same restriction sites of pSL1180
to produce pSLC. The SpeI-SphI and BalI-EcoRV fragments from this
plasmid pSLC, containing the 5'-end and the 3'-end of borC
respectively, were then cloned stepwise into the SpeI-SphI and
EcoRV sites of pEFBA and in the correct orientations. In this way a
302 bp SphI-BalI internal fragment of borC was replaced by the
aac(3)IV gene. The resulting plasmid was then digested with SpeI
and EcoRV and the resulting fragment was subcloned together with
the hyg gene as described above, into pSL1180 leading to the final
construct pSLCr1. This approach was used in order to avoid possible
polar effects.
[0250] The vector pSLCr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borC::aac3(IV). Strain S. parvulus
Tu4055/borC::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borC::aac3(IV)
was chemically complemented with trans-1,2-dicyclopentane
dicarboxylic acid, following the protocol described in example
1.
[0251] To verify that no polar effects were introduced a
full-length copy of borC under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. Full-length borC
was amplified by PCR using the primers B6T1
(5'-CGGATGCATCACCGGCACGG-3') (SEQ ID No.95) and B6T2
(5'-TGGGATCCGCGGGGCGGTAC-3') (SEQ ID No.96) using cosmid Bor32A2 as
template. The 943 bp product was purified and then digested with
NsiI-BamHI and subcloned, together with a BamHI-SpeI fragment from
pLHyg (carrying the hyg gene), into pIJ2925 previously digested
with PstI-XbaI. A BglII fragment (using this site from the vector)
was then isolated and subcloned into pEM4 in the correct orientation
to locate borC under the control of the promoter ermE*, giving pborCH.
Plasmid pborCH and the control plasmid pEM4 were introduced into S.
parvulus Tu4055/borC::aac(3)IV by protoplast transformation as
described in example 5. The resulting strain S. parvulus
Tu4055/borC::aac(3)IV/pborCH was analysed as described in example 1
and shown to produce borrelidin at a titre similar to a WT
control.
Example 15
Disruption of borD (S. parvulus Tu4055/borD::aac3(IV))
[0252] In order to disrupt borD, a fragment of 2777 bp was
amplified by PCR using the primers BBB (5'-AACTAGTGCGATCCCGGGGA-3')
(SEQ ID No.97) and BBA (5'-CGTCGATATCCTCCAGGGGC-3') (SEQ ID No.98)
and cosmid Bor32A2 as template. The PCR product was purified and
then digested with SpeI-EcoRV and subcloned into pSL1180 to
generate pSLD. This was then digested with NdeI-StuI to delete an
internal 679 bp region of borD which was replaced by a SmaI-NdeI
fragment isolated from pEFBA containing the aac(3)IV gene. The
resulting construct was digested with SpeI-EcoRV and the 4.3 kb
fragment subcloned together with a SpeI-PstI fragment from pLHyg
containing the hyg gene, into pSL1180 digested with PstI-EcoRV.
This step leads to the final plasmid pSLDr1. This approach was used
in order to avoid possible polar effects.
[0253] The vector pSLDr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borD::aac3(IV). Strain S. parvulus
Tu4055/borD::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borD::aac3(IV)
was chemically complemented with trans-1,2-dicyclopentane
dicarboxylic acid, following the protocol described in example
1.
[0254] To verify that no polar effects were introduced a
full-length copy of borD under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. Full-length borD
was amplified by PCR using the primers BBT1
(5'-TACTGCAGCACACCCGGTGC-3') (SEQ ID No.99) and BBT2
(5'-TGGGATCCGCTGTGTCATAT-3') (SEQ ID No.100) using cosmid Bor32A2
as template. The 816 bp PCR product was purified and then digested
with PstI-BamHI and subcloned together with a BamHI-SpeI fragment
containing the hyg gene from pLHyg, into pIJ2925 digested with
PstI-XbaI, to give pIJDH. The BglII fragment from pIJDH (using
these sites from the vector) was then subcloned into pEM4
(predigested with BamHI) and in the correct orientation to generate
pborDH. Plasmid pborDH and the control plasmid pEM4 were introduced
into S. parvulus Tu4055/borD::aac(3)IV by protoplast transformation
as described in example 5. The resulting strain S. parvulus
Tu4055/borD::aac(3)IV/pborDH was analysed as described in example 1
and shown to produce borrelidin at a titre similar to a WT
control.
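The PstI-BamHI digest of the borD PCR product relies on restriction sites carried in the 5' tails of the primers. The following minimal sketch checks this for BBT1 and BBT2; the primer sequences are copied from the text, the recognition sequences are the standard ones for these enzymes, and the helper name find_sites is purely illustrative.

```python
# Standard recognition sequences for enzymes named in the text.
SITES = {"PstI": "CTGCAG", "BamHI": "GGATCC", "EcoRI": "GAATTC"}

def find_sites(primer):
    """Return {enzyme: 0-based offset} for every recognition site present in the primer."""
    primer = primer.upper()
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

# Primer sequences as given for full-length borD amplification.
BBT1 = "TACTGCAGCACACCCGGTGC"
BBT2 = "TGGGATCCGCTGTGTCATAT"

print("BBT1:", find_sites(BBT1))
print("BBT2:", find_sites(BBT2))
```

The same check can be applied to the other primer pairs in these examples by adding the relevant recognition sequences to SITES.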
Example 16
Disruption of borE (S. parvulus Tu4055/borE::aac3(IV))
[0255] In order to disrupt borE, an internal 761 bp fragment of the
gene was amplified by PCR using primers B25A
(5'-TTCTGCAGCCGCGGCCTTCG-3') (SEQ ID No.81) and B25B
(5'-AGAATTCGCCGGCGCCGCTG-3') (SEQ ID No.82) using cosBor32A2 as
template. The product was purified, digested PstI-EcoRI and cloned
into pOJ260ermE* which had been digested similarly, to provide
pOJEd1. This approach was used in order to avoid possible polar
effects. The vector pOJEd1 was introduced into S. parvulus Tu4055
by protoplast transformation as described in example 5, and
colonies were selected for apramycin resistance on R5 and then on
MA agar. The disruption was verified by Southern hybridisation and
the new mutant was named S. parvulus Tu4055/borE::aac3(IV). Strain
S. parvulus Tu4055/borE::aac3(IV) was grown, extracted and analysed
as described in example 1. No borrelidin production was observed
whereas a wild type control produced borrelidin as expected.
[0256] To verify that no polar effects were introduced a
full-length copy of borE under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. Full-length borE
was amplified by PCR using the primers B7T1
(5'-GGCTGCAGACGCGGCTGAAG-3') (SEQ ID No.83) and B7T2
(5'-CCGGATCCCAGAGCCACGTC-3') (SEQ ID No.84) using cosBor32A2 as
template. The 1216 bp product was purified, digested with
PstI-BamHI and cloned into PstI-XbaI digested pIJ2925 (Janssen
& Bibb, 1993), along with a BamHI-SpeI digested fragment from
pLHyg containing the hygromycin resistance cassette, to generate
pIJEH. A 2.8 kbp BamHI fragment was excised from pIJEH and cloned
into pEM4 (Quiros et al., 1998), which had been digested similarly,
to give pborEH (in which the borE gene was cloned in the correct
orientation for gene expression). pborEH and the control plasmid
pEM4 were introduced into S. parvulus Tu4055/borE::aac(3)IV by
protoplast transformation as described in example 5. The resulting
strain S. parvulus Tu4055/borE::aac(3)IV/pborEH was analysed as
described in example 1 and shown to produce borrelidin at a titre
similar to a WT control; the control strain S. parvulus
Tu4055/borE::aac(3)IV/pEM4 did not produce borrelidin.
[0257] Chemical complementation of S. parvulus
Tu4055/borE::aac3(IV) with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example 1, demonstrated
that the strain thus grown was capable of borrelidin production at
122.+-.23% of the WT parent control. Thus, borE is required for
biosynthesis of trans-cyclopentane-1,2-dicarboxylic acid.
Example 17
Disruption of borF (S. parvulus Tu4055/borF::aac3(IV))
[0258] In order to disrupt borF, a region containing borF was
amplified by PCR using the primers BCB (5'-CACTAGTCCTCGCCGGGCAC-3')
(SEQ ID No.101) and BCA (5'-GAGGATCCCGGTCAGCGGCA-3') (SEQ ID
No.102) and cosmid Bor32A2 as template. The resulting 2132 bp
product was purified and then digested with SpeI-BamHI and
subcloned into the same sites of pSL1180 leading to pSLF. The
aac(3)IV gene from pEFBA was then subcloned as a SphI fragment into
the SphI site of pSLF, which is located inside the borF coding
region. Finally the BamHI-SpeI fragment was subcloned into pLHyg
digested with BamHI-NheI to generate pLHFr1.
[0259] The vector pLHFr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borF::aac3(IV). Strain S. parvulus
Tu4055/borF::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borF::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
[0260] To verify that no polar effects were introduced a
full-length copy of borF under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. Full-length borF
was amplified by PCR using the primers BCT1
(5'-GCCTGCAGCGACCTCGCCGG-3') (SEQ ID No.103) and BCT2
(5'-CGGGATCCCGTGGCGTGGTC-3') (SEQ ID No.104) using cosmid Bor32A2
as template. The 1048 bp PCR product was purified and then digested
with PstI-BamHI and subcloned together with the hyg gene as
described above, into pIJ2925. A BglII fragment was then isolated
and subcloned into pEM4 to generate pborFH. This was used to
complement strain SPMF. Plasmid pborFH and the control plasmid pEM4
were introduced into S. parvulus Tu4055/borF::aac(3)IV by
protoplast transformation as described in example 5. The resulting
strain S. parvulus Tu4055/borF::aac(3)IV/pborFH was analysed as
described in example 1 and shown to produce borrelidin at a titre
similar to a WT control.
Example 18
Disruption of borG (S. parvulus Tu4055/borG::aac3(IV))
[0261] In order to disrupt borG, an internal region of 885 bp was
amplified by PCR using the primers B23A
(5'-ATCTGCAGCGGCATCGGTGT-3') (SEQ ID No.105) and B23B
(5'-AGAATTCTCCACTGCGGTCG-3') (SEQ ID No.106) and cosmid Bor32A2 as
template. The resulting product was purified, digested at the
flanking PstI-EcoRI sites and then subcloned into pOJ260P,
downstream of the promoter ermE*, to generate pOJGd1.
[0262] The vector pOJGd1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected on MA agar. The disruption was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borG::aac3(IV). Strain S. parvulus
Tu4055/borG::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borG::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
Example 19
Disruption of borH (S. parvulus Tu4055/borH::aac3(IV))
[0263] In order to disrupt borH, an internal region of 697 bp was
amplified by PCR using the primers B9A (5'-ACCTGCAGGCCGGGCTCATC-3')
(SEQ ID No.107) and B9B (5'-AGAATTCGGGCGAGCCGCCG-3') (SEQ ID
No.108) and cosmid Bor32A2 as template. The resulting PCR product
was purified and then digested with PstI-EcoRI and then subcloned
into pOJ260P, downstream of the promoter ermE*, to generate
pOJHd2.
[0264] The vector pOJHd2 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected on MA agar. The disruption was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borH::aac3(IV). Strain S. parvulus
Tu4055/borH::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borH::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
Example 20
Disruption of borI (S. parvulus Tu4055/borI::aac3(IV))
[0265] The gene borI and surrounding DNA were amplified from
cosBor19B9 using the PCR primers BP4501
(5'-CGTATGCATGGCGCCATGGA-3') (SEQ ID No.85) and BP4502
(5'-AGCCAATTGGTGCACTCCAG-3') (SEQ ID No.86). The 2.32 kbp product
was purified, digested with NsiI-MfeI and cloned into pSL1180
digested NsiI-EcoRI, to give plasmid pSLI. The apramycin resistance
cassette was excised from pEFBA as an EcoRI fragment and cloned
into pSLI digested with EcoRI, to give the plasmid pSLIA. Finally,
the hygromycin resistance cassette was excised SpeI-PstI from pLHyg
and cloned into pSLIA which had been digested with NsiI-SpeI to
give plasmid pSLIr1.
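Several cloning steps in these examples join fragments cut with different enzymes (for instance an NsiI-MfeI product into an NsiI-EcoRI vector, or a PstI end into an NsiI site). This works because the enzymes leave identical cohesive ends. The sketch below illustrates the principle; the recognition sites and cut positions are the standard ones for these enzymes, while the table layout and function names are assumptions made for the example.

```python
# (recognition site, cut position on the top strand) for palindromic enzymes used in the text.
ENZYMES = {
    "PstI": ("CTGCAG", 5), "NsiI": ("ATGCAT", 5),
    "EcoRI": ("GAATTC", 1), "MfeI": ("CAATTG", 1),
    "SpeI": ("ACTAGT", 1), "XbaI": ("TCTAGA", 1),
    "NheI": ("GCTAGC", 1), "AvrII": ("CCTAGG", 1),
    "BamHI": ("GGATCC", 1), "BglII": ("AGATCT", 1),
}

def overhang(enzyme):
    """Return (overhang type, single-stranded sequence) left by the enzyme."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # cut position on the bottom strand, in top-strand coordinates
    if cut < mirror:
        return ("5'", site[cut:mirror])
    if cut > mirror:
        return ("3'", site[mirror:cut])
    return ("blunt", "")

def compatible(a, b):
    """Two cohesive ends can anneal if overhang type and sequence both match."""
    return overhang(a) == overhang(b)

for pair in [("NsiI", "PstI"), ("MfeI", "EcoRI"), ("SpeI", "XbaI"),
             ("BamHI", "BglII"), ("PstI", "EcoRI")]:
    print(pair, compatible(*pair))
```

Note that ligating two compatible but non-identical ends (for example MfeI to EcoRI) generally destroys both recognition sites at the junction.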
[0266] The replacement vector pSLIr1 was introduced into S.
parvulus Tu4055 by protoplast transformation as described in
example 5. Colonies resistant to apramycin (25 .mu.g/ml) were
selected, and then passaged several times through MA media without
selection. The replacement was verified by Southern hybridisation
and the new mutant was named S. parvulus Tu4055/borI::aac3(IV).
[0267] S. parvulus Tu4055/borI::aac3(IV) was grown and analysed as
described in example 1. No borrelidin production was observed
whereas several new compounds were observed at significantly lower
levels. One of the less polar compounds displayed a UV absorbance
maximum of 240 nm, and LCMS analysis indicated an m/z ratio 11 mass
units lower than that for borrelidin, which is consistent with the
presence of a methyl- rather than a nitrile-group at C12.
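The 11 mass unit difference can be checked from the nominal masses of the two substituents. The sketch below is simple arithmetic under the assumption that the only change is at C12 (a nitrile replaced by a methyl group); the function name is illustrative.

```python
# Nominal (integer) atomic masses.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}

def group_mass(composition):
    """Sum nominal masses for a substituent given as {element: count}."""
    return sum(NOMINAL[element] * count for element, count in composition.items())

nitrile = group_mass({"C": 1, "N": 1})  # -CN
methyl = group_mass({"C": 1, "H": 3})   # -CH3

shift = methyl - nitrile
print(f"nitrile = {nitrile}, methyl = {methyl}, shift = {shift}")
```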
[0268] To verify that no polar effects were introduced a
full-length copy of borI under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. A 2.1 kb
NsiI-AvrII fragment containing borI was recovered from pSLI and
subcloned into the PstI-XbaI sites of pEM4, together with the
NheI-SpeI fragment from pLHyg containing the hyg gene. Both
fragments were subcloned in the same orientation generating pborIH.
Plasmid pborIH and the control plasmid pEM4 were introduced into S.
parvulus Tu4055/borI::aac(3)IV by protoplast transformation as
described in example 5. The resulting strain S. parvulus
Tu4055/borI::aac(3)IV/pborIH was analysed as described in examples
1 & 4, and shown to produce borrelidin at a titre similar to a
WT control.
Example 21
Disruption of borJ (S. parvulus Tu4055/borJ::aac3(IV))
[0269] The gene borJ and surrounding DNA were amplified from
cosBor19B9 using the PCR primers BNHT1 (5'-GTCATGCATCAGCGCACCCG-3')
(SEQ ID No.87) and BNHT2 (5'-GTGCAATTGCCCTGGTAGTC-3') (SEQ ID
No.88). The 2.75 kbp product was purified, digested with NsiI-MfeI
and cloned into pSL1180 that had been digested with NsiI-EcoRI, to
give plasmid pSLJ. The hygromycin resistance cassette was excised
from pLHyg as a PstI-SpeI fragment and cloned into pSLJ digested
with NsiI-SpeI, to give pSLJH. Finally, the apramycin resistance
cassette was excised from pEFBA with SpeI-BamHI and cloned into
pSLJH that had been pre-digested with AvrII-BglII in order to
remove a 453 bp fragment from borJ, to give plasmid pSLJr1.
[0270] The replacement vector pSLJr1 was introduced into S.
parvulus Tu4055 by protoplast transformation as described in
example 5. Colonies resistant to apramycin (25 .mu.g/ml)
were selected, and then passaged several times through MA media
without selection. The replacement was verified by Southern
hybridisation. The new mutant was named S. parvulus
Tu4055/borJ::aac3(IV).
[0271] S. parvulus Tu4055/borJ::aac3(IV) was grown and analysed as
described in example 1. No borrelidin production was observed
whereas a new compound more polar than borrelidin was observed with
a UV maximum at 262 nm. LCMS analysis indicated a parent compound
of 508 amu, which is consistent with a carboxylic acid rather than
a nitrile function at C12.
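The observed mass of 508 amu can be compared with the value expected for a C12 carboxylic acid. The sketch below computes nominal masses from molecular formulae. The formula used for borrelidin (C28H43NO6) is the commonly cited one and is an assumption here, since it is not stated in this section; the formula for the carboxy analogue is derived from it by replacing CN with CO2H.

```python
import re

# Nominal (integer) atomic masses.
NOMINAL = {"C": 12, "H": 1, "N": 14, "O": 16}

def nominal_mass(formula):
    """Nominal mass of a simple formula such as 'C28H43NO6' (no brackets or charges)."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL[element] * (int(count) if count else 1)
    return total

borrelidin = nominal_mass("C28H43NO6")      # assumed literature formula
carboxy_analogue = nominal_mass("C28H44O8")  # CN at C12 replaced by CO2H

print(borrelidin, carboxy_analogue, carboxy_analogue - borrelidin)
```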
[0272] To verify that no polar effects were introduced a
full-length copy of borJ under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. A 2.4 kb NsiI-SphI
fragment from pSLJ containing borJ was subcloned into the PstI-XbaI
sites of pEM4, together with the hyg gene as a SphI-SpeI fragment
from pLHyg; both fragments were subcloned in the same orientation
as the transcription of the genes. The final construct was designated
pborJH. Plasmid pborJH and the control plasmid pEM4 were introduced
into S. parvulus Tu4055/borJ::aac(3)IV by protoplast transformation
as described in example 5. The resulting strain S. parvulus
Tu4055/borJ::aac(3)IV/pborJH was analysed as described in examples
1 & 4, and shown to produce borrelidin at a titre similar to a
WT control.
Example 22
Disruption of borK (S. parvulus Tu4055/borK::aac3(IV))
[0273] In order to disrupt borK, a fragment of 2680 bp was
amplified by PCR using the primers B231
(5'-ATCAAGCTTCGTGTCCATGG-3') (SEQ ID No.109) and B232
(5'-GTCATGCATCAGGCGTTCGG-3') (SEQ ID No.110) and cosmid Bor19B9 as
template. The resulting PCR product was purified and then digested
with HindIII-NsiI and subcloned into the same sites of pSL1180 to
produce pSLK. After MluI digestion of pSLK and treatment with the
Klenow fragment, the aac(3)IV gene from pEFBA was subcloned as a
SmaI-EcoRV fragment leading to pSLKa. Finally a PstI-SpeI fragment
from pLHyg containing the hyg gene was subcloned into pSLKa
digested NsiI-XbaI to obtain pSLKr1.
[0274] The vector pSLKr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borK::aac3(IV). Strain S. parvulus
Tu4055/borK::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borK::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
[0275] To verify that no polar effects were introduced a
full-length copy of borK under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. A 2.2 kb BglII
(blunt-ended)-NsiI fragment from pSLK was subcloned, together with
a 1.6 kb PstI-SpeI fragment from pLHyg containing the hyg gene,
into pEM4 digested with PstI (treated with the Klenow fragment) and
then XbaI. The final vector was named pborKH. Plasmid pborKH and
the control plasmid pEM4 were introduced into S. parvulus
Tu4055/borK::aac(3)IV by protoplast transformation as described in
example 5. The resulting strain S. parvulus
Tu4055/borK::aac(3)IV/pborKH was analysed as described in examples
1 & 4, and shown to produce borrelidin at a titre similar to a
WT control.
Example 23
Disruption of borL (S. parvulus Tu4055/borL::aac3(IV))
[0276] In order to disrupt borL a 3.95 kbp BglII fragment of
cosBor19B9, which contained the full-length borL, was sub-cloned
into pSL1180 digested similarly. The resulting clones were analysed
by restriction digest and one that displayed the correct
orientation was chosen to provide pSL395. Digestion of pSL395 with
NheI and SpeI, and subsequent re-ligation to eliminate a fragment
of borM that included a BglII site, gave pSLL. The apramycin
resistance cassette was excised with KpnI from pEFBA (Lozano et
al., 2000) and cloned into pSLL that had been digested with KpnI, to
give pSLLA. pSLLA was digested with BglII and then subjected to
Klenow treatment following the manufacturer's instructions (Roche);
an EcoRV fragment isolated from pLHyg containing the hygromycin
resistance cassette was then cloned into this prepared vector to
give pSLLr1.
[0277] The replacement vector pSLLr1 was introduced into S.
parvulus Tu4055 by protoplast transformation. Colonies resistant to
apramycin were selected, and then passaged several times through MA
media without selection. The replacement was verified by Southern
hybridisation. The new mutant was named S. parvulus
Tu4055/borL::aac3(IV).
[0278] Strain S. parvulus Tu4055/borL::aac3(IV) was grown,
extracted and analysed as described in example 1. No borrelidin
production was observed whereas a wild type control produced
borrelidin as expected. Chemical complementation of S. parvulus
Tu4055/borL::aac3(IV) using the natural starter acid as described in
example 1 showed that the strain thus grown was capable of
borrelidin production at 408.+-.70% of the WT parent control
titre.
[0279] To verify that no polar effects were introduced a
full-length copy of borL under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. The vector
containing full-length borL was generated as described in example
30. Plasmid pborLH and the control plasmid pEM4 were introduced
into S. parvulus Tu4055/borL::aac(3)IV by protoplast transformation
as described in example 5. The resulting strain S. parvulus
Tu4055/borL::aac(3)IV/pborLH was analysed as described in example
1.
Example 24
Disruption of borM (S. parvulus Tu4055/borM::aac3(IV))
[0280] In order to disrupt borM, a 2870 bp fragment containing borM
was amplified by PCR using the primers B251
(5'-CTTCTAGATGAACCCCTCCA-3') (SEQ ID No.111) and B252
(5'-GGGCAATTGCGCGGCAGCTT-3') (SEQ ID No.112) and cosmid Bor19B9 as
template. The resulting product was purified and then digested with
XbaI-MfeI and subcloned into the XbaI-EcoRI sites of pSL1180,
leading to pSLM. An internal 780 bp SphI-NheI fragment of borM was
then replaced by the aac(3)IV gene which was subcloned from pEFBA
as a SpeI-XbaI fragment, leading to pSLMA. pSLMA was digested with
NsiI-XbaI and the hyg gene subcloned as a SpeI fragment from pLHyg
to generate pSLMr1.
[0281] The vector pSLMr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borM::aac3(IV). Strain S. parvulus
Tu4055/borM::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borM::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
[0282] To verify that no polar effects were introduced a
full-length copy of borM under the control of the ermE* promoter
was introduced in trans to the disrupted mutant. Full-length borM
was cloned as a XbaI-AgeI fragment of 2.0 kb from pSLM and
subcloned into the EcoRI (end-filled with Klenow)-XbaI sites of
pEM4 together with the hyg gene as a XmaI-EcoRV fragment from
pLHyg, to give pborMH. Plasmid pborMH and the control plasmid pEM4
were introduced into S. parvulus Tu4055/borM::aac(3)IV by
protoplast transformation as described in example 5. The resulting
strain S. parvulus Tu4055/borM::aac(3)IV/pborMH was analysed as
described in example 1 and shown to produce borrelidin at a titre
similar to a WT control.
Example 25
Disruption of borN (S. parvulus Tu4055/borN::aac3(IV))
[0283] In order to disrupt borN, a 1201 bp BamHI fragment from pSLM
(containing the 3'-end of borM and the first 161 codons of borN)
was subcloned into the BglII-BamHI sites of pSL1180 in the
correct orientation, to generate pSLMN. A BamHI-EcoRI fragment
(using these sites from the polylinker) containing borO from pborOR
(see below) was subcloned into the BamHI-EcoRI sites of pSLMN,
generating pSLNO. After EcoRI digestion of pSLNO and end-filling
with Klenow fragment, the hyg gene was subcloned from pLHyg as an
EcoRV fragment, leading to pSLNOH. Finally the aac3(IV) gene was
subcloned as a NcoI-BamHI fragment from pEFBA into pSLNOH digested
with the same restriction enzymes, generating pSLNr1.
[0284] The vector pSLNr1 was introduced into S. parvulus Tu4055 by
protoplast transformation as described in example 5. Colonies
resistant to apramycin were selected, and then passaged several
times through MA media without selection. The replacement was
verified by Southern hybridisation and the new mutant was named S.
parvulus Tu4055/borN::aac3(IV). Strain S. parvulus
Tu4055/borN::aac3(IV) was grown, extracted and analysed as
described in example 1. Borrelidin production was compared to a
wild type control. In addition, S. parvulus Tu4055/borN::aac3(IV)
was chemically complemented with trans-cyclopentane-1,2-dicarboxylic
acid, following the protocol described in example
1.
Example 26
Heterologous Expression of borO in Streptomyces albus J1074
[0285] In order to examine whether the putative resistance protein
BorO confers resistance to a borrelidin-sensitive organism, borO
was expressed in Streptomyces albus J1074. The gene borO was
amplified by PCR using the primers BTRNAS1
(5'-TGTCTAGACTCGCGCGAACA-3') (SEQ ID No.89) and BTRNAS2
(5'-TGAATTCCGAAGGGGGTGGT-3') (SEQ ID No.90) with cosBor19B9 as
template. The product was purified, digested XbaI-EcoRI and cloned
into pEM4A that had been similarly digested to give plasmid pborOR
which puts borO under the control of the promoter ermE*. The vector
pborOR was introduced into S. albus J1074 by protoplast
transformation (Chater & Wilde, 1980) and selected for
apramycin resistance. The new strain was named S. albus
J1074/pborOR.
[0286] Resistance to borrelidin was assayed on Bennett's agar
containing apramycin at 25 .mu.g/ml. Spores of S. albus
J1074/pborOR and the control S. albus J1074/pEM4A were spread onto
plates and then disks containing borrelidin at 100 & 200
.mu.g/ml were laid upon the lawn of spores and incubated overnight
at 30.degree. C. Haloes indicating inhibition of growth were
observed for the control strain harbouring pEM4A but not for S.
albus J1074/pborOR.
Example 27
Disruption of borG and borI (S. parvulus
Tu4055/borG::aac3(IV)/borI::hyg)
[0287] The hyg gene is isolated from pLHyg as an EcoRV fragment and
cloned into pSLI (example 20) digested with EcoRI and treated with
Klenow fragment to give pSLIH; the hyg gene is cloned in the same
orientation as borI. pSLIH is introduced into S. parvulus
Tu4055/borG::aac3(IV) by protoplast transformation, as described in
example 5, and selected for both apramycin and hygromycin
resistance, and is then passaged several times through MA media
without selection in order to promote double recombination.
Apramycin and hygromycin resistant colonies are analysed by
Southern hybridisation and PCR to verify the replacement.
Example 28
Disruption of borG and borJ (S. parvulus
Tu4055/borG::aac3(IV)/borJ::hyg)
[0288] The hyg gene is isolated from pLHyg as an EcoRV fragment and
cloned into pSLJ (example 21) digested with AvrII-BglII and treated
with Klenow, to give pSLJH; the hyg gene is cloned in the same
orientation as borJ. pSLJH is introduced into S. parvulus
Tu4055/borG::aac3(IV) by protoplast transformation, as described in
example 5, and selected for both apramycin and hygromycin
resistance, and is then passaged several times through MA media
without selection in order to promote double recombination.
Apramycin and hygromycin resistant colonies are analysed by
Southern hybridisation and PCR to verify the replacement.
Example 29
Effects of borE Up-Regulation in S. parvulus Tu4055
[0289] To examine the possibility that biosynthesis of the
trans-1,2-cyclopentane dicarboxylic acid starter unit may have a
limiting effect upon borrelidin production, borE was up-regulated
in the parental strain and the effect upon borrelidin titre was
analysed. The vector used, pborEH was described in example 16.
[0290] The vectors pborEH and pEM4 (control) were used to transform
protoplasts of S. parvulus Tu4055 to give strains S. parvulus
Tu4055/pborEH and S. parvulus Tu4055/pEM4 respectively. Several
colonies from each transformation were picked, grown in triplicate
and then analysed as described in example 1. Compared to the
control strain, up-regulation of borE brought about a
4.2.+-.0.3-fold increase in the titre of borrelidin.
Example 30
Effects of borL Up-Regulation in S. parvulus Tu4055
[0291] To examine the possibility that borL may have a regulatory,
or some other related function involved in borrelidin production,
the gene was up-regulated in the parental strain and the effect
upon borrelidin titre was analysed.
[0292] The expression vector pborLH was generated as follows: pSLL
was digested with NotI, treated with Klenow fragment and then
digested with BamHI to obtain a fragment of 2190 bp containing
borL. This fragment was sub-cloned together with the BamHI-SpeI hyg
gene from pLHyg, into pEM4 digested with PstI (treated with
Klenow)-XbaI, to obtain pborLH.
[0293] The vectors pborLH and pEM4 (control) were used to transform
protoplasts of S. parvulus Tu4055 to give strains S. parvulus
Tu4055/pborLH and S. parvulus Tu4055/pEM4 respectively. Several
colonies from each transformation were picked, grown in triplicate
and then analysed as described in example 1. Compared to the
control strain, up-regulation of borL brought about a
4.3.+-.0.7-fold increase in the titre of borrelidin.
Example 31
Production of 12-desnitrile-12-methyl Borrelidin 14
(Pre-Borrelidin)
[0294] Working stocks of S. parvulus Tu4055/borI::aac3(IV) (0.5 ml)
were inoculated into primary vegetative pre-cultures of NYG as
described in example 1. Secondary pre-cultures were prepared (as
example 1 but with 250 ml NYG in 2 l Erlenmeyer flasks). PYDG
production medium (4 l), prepared as in example 1 and with 0.01%
Plutronic L0101 added to control foaming, was inoculated with
secondary pre-culture (12.5% inoculum). A second fermenter
containing centre-point medium (4 l) and 0.01% Plutronic L0101 to
control foaming, was set up in parallel and was also inoculated
with secondary pre-culture (12.5% inoculum). Centre-point
production medium contains per litre of deionised water: Tesco's
skimmed milk powder (1.5%), Avidex W-80 (4.5%), glucose (0.5%) and
yeast autolysate (0.15%) adjusted to pH 7.0 with 5 M NaOH.
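The centre-point recipe can be scaled to the 4 l batch with simple arithmetic. The sketch below assumes the percentages are weight per volume (1% = 10 g per litre), which the text does not state explicitly; the dictionary and function names are illustrative.

```python
# Centre-point medium components as percentages, copied from the text.
CENTRE_POINT = {
    "skimmed milk powder": 1.5,
    "Avidex W-80": 4.5,
    "glucose": 0.5,
    "yeast autolysate": 0.15,
}

def grams_for_batch(recipe_percent, volume_l):
    """Convert % (assumed w/v) to grams for a batch: 1% w/v = 10 g per litre."""
    return {name: round(pct * 10 * volume_l, 2) for name, pct in recipe_percent.items()}

batch = grams_for_batch(CENTRE_POINT, 4.0)
for name, grams in batch.items():
    print(f"{name}: {grams} g")
```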
[0295] These batches were each allowed to ferment in a 7 l Applikon
fermenter for 6.5 days at 30.degree. C. Airflow was set at 0.75 vvm
(volume per volume per minute), with tilted baffles and the
impeller speed controlled between 400 and 800 rpm to maintain
dissolved oxygen tension at or above 30% of air saturation. No
further antifoam was added. At 22 hours into the fermentation the
starter acid, trans-cyclopentane-1,2-dicarboxylic acid, was added
as a neutralised solution of 1:1 MeOH/5 M NaOH, through an in-line
filter (0.22 .mu.m). The final concentration in the fermenter
vessel of exogenous starter acid was 0.5 mM.
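The quantities implied by these settings can be estimated as follows. This is a rough sketch that assumes a 4 l working volume (the inoculum volume is ignored) and derives the formula C7H10O4 for the starter acid from its name; function names are illustrative.

```python
# Average atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# trans-cyclopentane-1,2-dicarboxylic acid, C7H10O4 (derived from the name).
STARTER_ACID = {"C": 7, "H": 10, "O": 4}

def molar_mass(composition):
    """Molar mass in g/mol for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())

def feed_mass_mg(final_mM, volume_l, mw):
    """mmol/l x l gives mmol; mmol x (mg/mmol) gives mg."""
    return final_mM * volume_l * mw

def airflow_l_per_min(vvm, volume_l):
    """vvm is gas volume per liquid volume per minute."""
    return vvm * volume_l

mw = molar_mass(STARTER_ACID)
print(f"molar mass: {mw:.2f} g/mol")
print(f"acid for 0.5 mM in 4 l: {feed_mass_mg(0.5, 4.0, mw):.1f} mg")
print(f"airflow at 0.75 vvm: {airflow_l_per_min(0.75, 4.0)} l/min")
```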
[0296] After 6.5 days of fermentation the broths were combined and
acidified to pH 3.5 with concentrated HCl (.about.6 ml), then
clarified by centrifugation at 3,500 rpm for 10 minutes. The
supernatant was extracted into ethyl acetate (3.times.1 volume
equivalent for 4 hours each) and the cell pellet left to steep in
methanol (2.times.1.5 litres for 4 hours each). The organics were
combined and removed under reduced pressure to yield a tarry gum.
The gum was re-suspended in 0.1 M Borax buffer (500 ml at pH 9.4)
and washed with hexanes (500 ml) and ethyl acetate (500 ml). The
aqueous layer was then acidified with concentrated HCl to pH 3.5
and extracted with ethyl acetate (3.times.500 ml), which were
combined and taken to dryness. The resultant gum was dissolved in
methanol (15 ml), diluted with water (285 ml) and loaded under
gravity onto a C.sub.18-reversed-phase cartridge (50 g, prepared in
5% aqueous methanol). The cartridge was washed with 20% and 50%
aqueous methanol (300 ml each) and eluted with 100% methanol (500
ml). This last fraction was taken to dryness under reduced pressure
to yield a black gummy oil (600 mg) that was taken up in methanol.
This residue was finally purified by sequential preparative
reversed-phase HPLC (eluted with the mobile phases used in example
4, without added TFA, running isocratically at 40% B). Active
fractions were combined and desalted on a C.sub.18-cartridge (1 g),
to yield 28 mg of a dark oil (3.5 mg/l isolated yield). Table 12
summarises the .sup.1H and .sup.13C NMR chemical shift data for
12-desnitrile-12-methyl borrelidin 14 in CDCl.sub.3.
TABLE 12
Position    δH (ppm)   Multiplicity   Coupling (Hz)       δC (ppm)
1           --         --             --                  174.5
2a          2.29       m              --                  37.8
2b          2.26       m              --                  --
3           3.85       dt             9.0, 3.0            71.9
4           1.83       m              --                  35.1
5a          1.19       bt             13.5                43.6
5b          0.91       m              --                  --
6           1.75       m              --                  27.0
7a          1.08       m              --                  49.2
7b          0.88       m              --                  --
8           1.69       m              --                  26.5
9a          0.97       m              --                  38.3
9b          0.45       t              12.5                --
10          1.62       m              --                  34.1
11          3.53       d              9.0                 85.7
12          --         --             --                  138.4
13          5.84       d              11.0                127.7
14          6.28       ddd            14.5, 11.0, 1.0     129.6
15          5.48       ddd            14.5, 10.5, 3.5     129.9
16a         2.53       m              --                  39.1
16b         2.22       m              --                  --
17          5.07       ddd            11.0, 8.0, 3.0      76.5
18          2.52       m              --                  48.0
19a         1.92       m              --                  30.4
19b         1.32       m              --                  --
20a         1.74       m              --                  26.2
20b         1.71       m              --                  --
21a         1.96       m              --                  32.0
21b         1.84       m              --                  --
22          2.45       m              8.0                 49.3
23          --         --             --                  182.3
4-CH₃       0.78       d              6.5                 18.5
6-CH₃       0.77       d              6.5                 18.8
8-CH₃       0.75       d              6.5                 20.6
10-CH₃      0.94       d              6.5                 16.3
12-CH₃      1.64       s              --                  11.4
Chemical shifts are referenced to CDCl₃ (for ¹H at 7.26 ppm and for
¹³C at 77.0 ppm).
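The isolated yield quoted in Example 31 follows from the mass recovered and the total broth volume. The sketch below assumes the volume is the nominal medium volume of the two 4 l fermentations that were combined, with inoculum and feed volumes ignored; the function name is illustrative.

```python
def isolated_yield_mg_per_l(mass_mg, n_fermenters, volume_per_fermenter_l):
    """Isolated yield as mass recovered per litre of nominal production medium."""
    return mass_mg / (n_fermenters * volume_per_fermenter_l)

# Example 31: 28 mg recovered from two combined 4 l fermentations.
example_31 = isolated_yield_mg_per_l(28.0, 2, 4.0)
print(f"{example_31} mg/l")
```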
Example 32
Production of 12-desnitrile-12-carboxy Borrelidin 2
[0297] Working stocks of S. parvulus Tu4055/borJ::aac3(IV) (0.5 ml)
were inoculated into primary vegetative pre-cultures of NYG as
described in example 1. Secondary pre-cultures were prepared (as
example 1 but with 250 ml NYG in 2 l Erlenmeyer flasks). PYDG
production medium (4 l), prepared as in example 1 and with 0.01%
Plutronic L0101 added to control foaming, was inoculated with the
entire secondary pre-culture (10% inoculum). This was allowed to
ferment in a 7 L Applikon fermenter for 6 days at 30.degree. C.
Airflow was set at 0.75 vvm, with tilted baffles and the impeller
speed controlled between 250 and 600 rpm to maintain dissolved
oxygen tension at or above 30% of air saturation. No further
antifoam was added. A second fermentation was performed exactly as
above, except that it was batch fed with 0.2 mol of glucose as an
aqueous solution every 12 hours from 60 hours post-inoculation.
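The glucose feed described above can be laid out as a schedule. The sketch below assumes that the last feed falls before harvest at 6 days (144 hours) and that no feed is given at the harvest time itself, which the text does not specify; names are illustrative.

```python
GLUCOSE_MW = 180.156  # g/mol for C6H12O6
FEED_MOL = 0.2        # mol of glucose per feed, from the text

def feed_times(first_h, interval_h, harvest_h):
    """Feed time points in hours post-inoculation, strictly before harvest (an assumption)."""
    return list(range(first_h, harvest_h, interval_h))

times = feed_times(60, 12, 6 * 24)
total_mol = round(FEED_MOL * len(times), 2)
grams_per_feed = round(FEED_MOL * GLUCOSE_MW, 1)

print("feed times (h):", times)
print("glucose per feed (g):", grams_per_feed)
print("total glucose fed (mol):", total_mol)
```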
[0298] After 6 days the fermentations were harvested and combined.
The broth was clarified by centrifugation (3,500 rpm, 10 minutes)
and the resultant supernatant acidified with 10 M HCl (aq) to pH
.about.3.5. This solution was then extracted into ethyl acetate by
stirring (3.times.1 volume equivalent for 4 hours each). The cell
pellet was extracted twice by steeping the cells in 1:1
methanol/ethyl acetate (500 ml). All the organics were combined and
removed under reduced pressure to yield an aqueous slurry. The
slurry was diluted to 500 ml with water, acidified to pH .about.3.5
with 10 M HCl and extracted into ethyl acetate (3.times.300 ml).
The organics were concentrated under reduced pressure to .about.300
ml and extracted with 0.1 M borax (3.times.150 ml, pH=9.4). The
combined borax solutions were acidified with 10 M HCl to pH
.about.3.5 and extracted with 6.times.300 ml of ethyl acetate.
Analytical HPLC demonstrated that some of the accumulant still
resided in the borax solution and so this was loaded, under
gravity, onto a C.sub.18-reverse-phase cartridge (50 g). The
cartridge was washed with water and the accumulant eluted in 100%
methanol. The organics containing the accumulant were combined and
reduced to a 40 ml methanolic solution. This was loaded onto a
Sephadex LH-20 column (70 g, swelled overnight in methanol, column
60 cm.times.2.5 cm), which was developed with 100% methanol; the
active fractions were combined and taken to dryness. The material
was then further processed by preparative reversed-phase HPLC
(eluted with the mobile phases used in example 4, without added
TFA, running isocratically at 40% B). The combined active fractions
were taken to dryness, dissolved in methanol (4 ml) and diluted
with water (200 ml). This mixture was split into 2 equal fractions
and each loaded, under gravity, onto a C.sub.18-reverse-phase
cartridge (20 g). The columns were then eluted with 3 column
volumes of 5%, 10%, 20%, 30%, 40%, 50%, 60%, 75%, 90% and 100%
aqueous methanol. The accumulant eluted in all fractions from 60%
to 100% methanol, which were combined and taken to dryness. The
accumulant (dissolved in DMSO) was then finally purified by
sequential preparative reversed-phase HPLC (eluted with the mobile
phases used in example 4, without added TFA, running isocratically
at 40% B). Active fractions were combined and desalted on a
C.sub.18-cartridge (1 g), to yield 17 mg of a brown oil (2.1 mg/l
isolated yield). Table 13 summarises the .sup.1H and .sup.13C NMR
chemical shift data for 12-desnitrile-12-carboxy borrelidin 2 in
d.sub.4-methanol.
TABLE 13
Position    δH (ppm)   Multiplicity   Coupling (Hz)       δC (ppm)
1           --         --             --                  173.27
2a          2.40       dd             15.8, 4.1           39.31
2b          2.29       dd             15.8, 8.2
3           3.87       m                                  71.64
4           1.80       m                                  36.51
5a          1.29       m                                  44.24
5b          0.90       m
6           1.59       m                                  27.48
7a          1.09       m                                  ~49.0*
7b          1.03       m
8           1.72       m                                  28.17
9a          1.12       m                                  38.42
9b          0.79       m
10          2.03       m                                  36.43
11          3.90       m                                  81.95
12          --         --             --                  132.35
13          6.43       d              11.0                140.83
14          6.96       dd             14.5, 11.5          130.91
15          5.91       ddd            15.0, 9.5, 5.0      138.93
16a         2.61       m              15.0                38.57
16b         2.36       m
17          5.04       m                                  77.40
18          2.50       m                                  49.80
19a         1.90       m                                  30.59
19b         1.32       m
20a         1.85       m                                  26.34
20b         1.41       m
21a         1.97       m                                  32.40
21b         1.75       m
22          2.52       m                                  ~48.0*
23          --         --             --                  180.27
4-CH₃       0.83       d              7.0                 18.76
6-CH₃       0.80       d              6.0                 17.06
8-CH₃       0.81       d              6.5                 20.60
10-CH₃      0.93       d              6.5                 16.61
12-CO₂H     --         --             --                  170.49
Chemical shifts are referenced to methanol (for ¹H at 3.35 ppm
(quintet) and for ¹³C at 49.0 ppm (septet)); *Obscured by solvent
signal, d₄-methanol.
Example 33
Production by Mutasynthesis of 17-des-(cyclopentane-2'-carboxylic
acid)-17-(cyclobutane-2'-carboxylic acid)borrelidin 18
[0299] Working stocks of S. parvulus Tu4055/borE::aac3(IV) (0.5 ml)
were inoculated into primary vegetative pre-cultures of NYG as
described in example 1. Secondary pre-cultures were prepared (as
example 1 but with 250 ml NYG in 2 l Erlenmeyer flasks). PYDG
production medium (4 l), prepared as in example 1 and with 0.01%
Pluronic L0101 added to control foaming, was inoculated with
secondary pre-culture (12.5% inoculum). Two further bioreactors
were set up in the same manner. These batches were each allowed to
ferment in a 7 l Applikon fermenter for 5 days at 30.degree. C.
Airflow was set at 0.75 vvm (volume per volume per minute), with
tilted baffles and the impeller speed controlled between 400 and
700 rpm to maintain dissolved oxygen tension at or above 30% of air
saturation. No further antifoam was added. At 22 hours into the
fermentation the starter acid, trans-cyclobutane-1,2-dicarboxylic
acid, was added as a neutralised solution of 1:1 MeOH 15 M NaOH.
The final concentration in the fermenter vessel of exogenous
starter acid was 0.5 mM.
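As a rough check on the feeding and aeration figures above, the mass of starter acid per vessel and the volumetric airflow can be worked out from the stated values. The following is only a sketch: it assumes a working volume of 4.5 l per vessel (4 l of production medium plus the 12.5% inoculum, which is an interpretation rather than a figure given in the text), and it uses a molar mass calculated from the formula C.sub.6H.sub.8O.sub.4 for trans-cyclobutane-1,2-dicarboxylic acid. All function and variable names are illustrative.

```python
# Standard atomic masses (g/mol) for the elements in the starter acid.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(composition):
    """Molar mass (g/mol) for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def feed_mass_mg(final_conc_mM, volume_l, mw_g_per_mol):
    """Mass (mg) needed to reach final_conc_mM in volume_l litres.

    mmol/l * l gives mmol; mmol * (g/mol) gives mg.
    """
    return final_conc_mM * volume_l * mw_g_per_mol


def airflow_l_per_min(vvm, volume_l):
    """Airflow (l/min) for a given vvm and liquid volume."""
    return vvm * volume_l


# trans-Cyclobutane-1,2-dicarboxylic acid, C6H8O4.
MW_ACID = molar_mass({"C": 6, "H": 8, "O": 4})

# Assumption: 4 l of medium plus a 12.5% (0.5 l) inoculum.
WORKING_VOLUME_L = 4.0 * 1.125

acid_mg = feed_mass_mg(0.5, WORKING_VOLUME_L, MW_ACID)
air_l_min = airflow_l_per_min(0.75, WORKING_VOLUME_L)

print(f"molar mass: {MW_ACID:.2f} g/mol")
print(f"starter acid per vessel: {acid_mg:.0f} mg")
print(f"airflow per vessel: {air_l_min:.2f} l/min")
```

If the 0.5 mM figure is instead taken against the 4 l of medium alone, replacing the working volume with 4.0 gives the corresponding values.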
After 5 days of fermentation the broths were combined and acidified
to pH 4.0 with concentrated HCl, then clarified by centrifugation
at 3,500 rpm for 10 minutes. The supernatant was adsorbed onto
Diaion HP-20SS resin (1 l), which had been pretreated with methanol
(2 l) and then 5% aqueous methanol (2 l), by filtration at a rate
of approximately 100 ml/min. The resin was then eluted with 20%
aqueous methanol (2.5 l) and then 80% aqueous acetone (4.5 l). The
organic solvent was removed from the aqueous acetone and the
resultant aqueous slurry (1 litre) extracted into ethyl acetate
(3.times.1 l). The organics were combined and reduced in vacuo to
yield a yellow/brown oil (1.7 g). Meanwhile, the cell pellet was left
to steep in methanol-ethyl acetate, 1:1 (3.times.1 l for 4 hours
each), and the resultant organic supernatants were reduced in vacuo
to yield an aqueous slurry (400 ml). The particulate matter was
dissolved in methanol (50 ml), and added back to the aqueous
slurry, which was made up to 500 ml with water. This slurry was
adsorbed onto Diaion HP-20SS resin (300 ml), which had been
pretreated with methanol (500 ml) and then 5% aqueous methanol (500
ml). The resin was then eluted with 20% aqueous methanol (1 l) and
then 80% aqueous acetone (1.5 l). The organic solvent was removed
from the aqueous acetone and the resultant aqueous slurry (made up
to 750 ml) extracted into ethyl acetate (3.times.750 ml). The
organics were combined and reduced in vacuo to yield a yellow/brown
oil (1.7 g). The crude extracts were combined (3.4 g), dissolved in
ethyl acetate (10 ml), then adsorbed onto a silica column (5 cm
ID.times.10 cm, treated with EtOAc), and eluted with EtOAc. The
active fractions were combined and the solvent removed in vacuo to
yield a brown gum (1.08 g). This residue was finally purified by
sequential preparative reversed-phase HPLC (eluted with the mobile
phases used in example 4, without added TFA, running from 25% B to
75% B over 25 minutes with a linear gradient). Active fractions
were combined and desalted on a C.sub.18-cartridge (5 g), to yield
83.9 mg (or 7.0 mg/l isolated yield). The .sup.13C-NMR assignments
for 18 are shown in Table 14.
TABLE-US-00014 TABLE 14
.delta..sub.C (ppm)   Position
177.1                 COOH (C22)
172.2                 1
144.0                 13
138.7                 15
126.9                 14
118.3                 12
115.8                 CN
75.5                  17
73.1                  11
69.7                  3
47.6                  5
43.1                  7
40.1                  19
40.0                  2
37.3                  9
35.7                  4
35.1                  10
34.4                  16
30.9                  18
27.3                  6
26.2                  8
21.7                  20
21.0                  21
20.1                  8-Me
18.1                  6-Me
16.9                  4-Me
14.9                  10-Me
.sup.13C-NMR assignment for 18, in CDCl.sub.3, using the solvent
carbon signal as reference at .delta..sub.C=77.7 ppm
REFERENCES
[0300] Anderson, B. F., Herit, A. J., Rickards, R. W., and
Robertson, G. B. (1989) Crystal and molecular structures of two
isomorphous solvates of the macrolide antibiotic borrelidin:
absolute configuration determination by incorporation of a chiral
solvent in the crystal lattice. Aust. J. Chem. 42:717-730.
[0301] Anderton, K., and Rickards, R. W. (1965) Some structural
features of borrelidin, an anti-viral antibiotic. Nature 206:269.
[0302] Aparicio, J. F., Molnar, I., Konig, A., Haydock, S. F.,
Khaw, L. E., Staunton, J., and Leadlay, P. F. (1996) Organisation
of the biosynthetic gene cluster for rapamycin in Streptomyces
hygroscopicus: analysis of the enzymatic domains in the modular
polyketide synthase. Gene 169:9-16. [0303] August, P. R., Tang, L.,
Yoon, Y. J., Ning, S., Muller, R., Yu, T.-W., Taylor, M., Hoffmann,
D., Kim, C. G., Zhang, X. H., Hutchinson, C. R., and Floss, H. G.
(1998) Biosynthesis of the ansamycin antibiotic rifamycin:
deductions from the molecular analysis of the rif biosynthetic gene
cluster of Amycolatopsis mediterranei S699. Chem. Biol. 5:69-79.
[0304] Beck, J. B., Yoon, Y. J., Reynolds, K. A., and Sherman, D.
H. (2002) The hidden steps of domain skipping: ring size
determination in the pikromycin modular polyketide synthase. Chem.
Biol. 9:575-583. [0305] Berger, J., Jampolsky, L. M., and Goldberg,
M. W. (1949) Borrelidin, a new antibiotic with anti-Borrelia
activity and penicillin enhancement properties. Arch. Biochem.
22:476-478. [0306] Bierman, M., Logan, R., O'Brian, K., Seno, E.
T., Rao, N., and Schoner, B. E. (1992) Plasmid vectors for the
conjugal transfer of DNA from Escherichia coli to Streptomyces spp.
Gene 116:43-49. [0307] Bonfield, J. K., Smith, K. F., and Staden,
R. (1995) A new DNA sequence assembly program. Nucleic Acids
Research 23:4992-4999. [0308] Brautaset, T., Sekurova, O. N.,
Sletta, H., Ellingsen, T. E., Strom, A. R., Valla, S., and Zotchev,
S. B. (2000) Biosynthesis of the polyene antifungal antibiotic
nystatin in Streptomyces noursei ATCC 11455: analysis of the gene
cluster and deduction of the biosynthetic pathway. Chem. Biol.
7:395-403. [0309] Brenner, S. (1988) The molecular evolution of
genes and proteins: a tale of two serines. Nature 334:528-530.
[0310] Broadhurst, R. W., Nietlispach, D., Wheatcroft, M. P.,
Leadlay, P. F., and Weissman, K. J. (2003) The structure of docking
domains in modular polyketide synthases. Chem. Biol. 10:723-731.
[0311] Brosius, J. (1989) Super-polylinkers in cloning and
expression vectors. DNA 8:759-777. [0312] Butler, A. R., Bate, N.,
and Cundliffe, E. (1999) Impact of thioesterase activity on tylosin
biosynthesis in Streptomyces fradiae. Chem. Biol. 6:287-292. [0313]
Caffrey, P., Lynch, S., Flood, E., Finnan, S., and Oliynyk, M.
(2001) Amphotericin biosynthesis in Streptomyces nodosus:
deductions from analysis of polyketide synthase and late genes.
Chem. Biol. 8:713-723. [0314] Celenza, J. L. (2001) Metabolism of
tyrosine and tryptophan--new genes for old pathways. Curr. Opin.
Plant Biol. 4:234-240. [0315] Chater, K. F., and Wilde, L. C. (1980)
Streptomyces albus G mutants defective in the SalG1 restriction
modification system. J. Gen. Microbiol. 116:323-334. [0316] Cheng,
Y. Q., Tang, G. L., and Shen, B. (2003) Type I polyketide synthase
requiring a discrete acyltransferase for polyketide biosynthesis.
Proc. Natl. Acad. Sci. USA 100:3149-3154. [0317] Cortes, J.,
Haydock, S. F., Roberts, G. A., Bevitt, D. J., and Leadlay, P. F.
(1990) An unusually large multifunctional polypeptide in the
erythromycin producing polyketide synthase of Saccharopolyspora
erythraea. Nature 348:176-178. [0318] Cortes, J., Weissman, K. E.
H., Roberts, G. A., Brown, M. J. B., Staunton, J., and Leadlay, P.
F. (1995) Repositioning of a domain in a modular polyketide
synthase to promote specific chain cleavage. Science 268:1487-1489.
[0319] Cortes, J., Velasco, J., Foster, G., Blackaby, A. P., Rudd,
B. A. M., and Wilkinson, B. (2002) Identification and cloning of a
type III polyketide synthase required for diffusible pigment
biosynthesis in Saccharopolyspora erythraea. Mol. Micro.
44:1213-1224. [0320] Devereux, J., Haeberli, P., and Smithies, O.
(1984) A comprehensive set of sequence analysis programs for the
VAX. Nucleic Acids Research 12:387-395. [0321] Dickinson, L.,
Griffiths, A. J., Mason, C. G., and Mills, R. F. (1965) Anti-viral
activity of two antibiotics isolated from a species of
Streptomyces. Nature 206:265-268. [0322] Donadio, S., Staver, M.
J., McAlpine, J. B., Swanson, S. J., and Katz, L. (1991) Modular
organization of genes required for complex polyketide biosynthesis.
Science 252:675-679. [0323] Donadio, S., McAlpine, J. B., Sheldon,
P. J., Jackson, M., and Katz, L. (1993) An erythromycin analog
produced by reprogramming of polyketide synthesis. Proc. Nat. Acad.
Sci. USA 90:7119-7123. [0324] Duffey, M. O., LeTiran, A., and
Morken, J. P. (2003) Enantioselective total synthesis of
borrelidin. J. Am. Chem. Soc. 125:1458-1459. [0325] Eastwood, E.
L., and Schaus, S. E. (2003) Borrelidin induces the transcription
of amino acid biosynthetic enzymes via a GCN4-dependent pathway.
Bioorg. Med. Chem. Lett. 13:2235-2237. [0326] Fernandez, E.,
Weissbach, U., Sanchez-Reillo, C., Brana, A. F., Mendez, C., Rohr,
J., and Salas, J. A. (1998) Identification of two genes from
Streptomyces argillaceus encoding glycosyltransferases involved in
transfer of a disaccharide during the biosynthesis of the antitumor
drug mithramycin. J. Bacteriol. 180:4929-4937. [0327] Floss, H. G.
(2001) Antibiotic biosynthesis: from natural to unnatural
compounds. J. Ind. Micro. Biotech. 27:183-194. [0328] Fouces, R.,
Mellado, E., Diez, B., and Barredo, J. L. (1999) The tylosin
biosynthetic cluster from Streptomyces fradiae: genetic
organisation of the left region. Microbiology 145:855-868. [0329]
Folkman, J. (1986) How is blood vessel growth regulated in normal
and neoplastic tissue? G. H. A. Clowes Memorial Lecture. Cancer
Res. 51:467-473. [0330] Funahashi, Y., Wakabayashi, T., Semba, T.,
Sonoda, J., Kitoh, K., and Yoshimatsu, K. (1999) Establishment of a
quantitative mouse dorsal air sac model and its application to
evaluate a new angiogenesis inhibitor. Oncol. Res. 11:319-329.
[0331] Gaisser, S., Reather, J., Wirtz, G., Kellenberger, L.,
Staunton, J., and Leadlay, P. F. (2000) A defined system for hybrid
macrolide biosynthesis in Saccharopolyspora erythraea. Mol.
Microbiol. 36:391-401. [0332] Gaisser, S., Martin, C. J.,
Wilkinson, B., Sheridan, R. M., Lill, R. E., Weston, A. J., Ready,
S. J., Waldron, C., Crouse, G. C., Leadlay, P. F., and Staunton, J.
(2002) Engineered biosynthesis of novel spinosyns bearing altered
deoxyhexose substituents. Chem. Commun. 618-619. [0333] Gaitatzis,
N., Silakowski, B., Kunze, B., Nordsiek, G., Blocker, H., Hofle,
G., and Muller, R. (2002) The biosynthesis of the aromatic
myxobacterial electron transport inhibitor stigmatellin is directed
by a novel type of modular polyketide synthase. J. Biol. Chem.
277:13082-13090. [0334] Hanessian, S., Yang, Y., Giroux, S.,
Mascitti, V., Ma, J., and Raeppel, F. (2003) Application of
conformation design in acyclic stereoselection: total synthesis of
borrelidin as the crystalline benzene solvate. J. Am. Chem. Soc.
125:13784-13792. [0335] Hardt, I. H., Steinmetz, H., Gerth, K.,
Sassa, F., Reichenbach, H., and Hofle, G. (2001) New natural
epothilones from Sorangium cellulosum, strains So ce90/B2 and So
ce90/D13: isolation, structure elucidation, and SAR studies. J.
Nat. Prod. 64:847-856. [0336] Heathcote, M. L., Staunton, J., and
Leadlay, P. F. (2001) Role of type II thioesterases: evidence for
the removal of short acyl chains produced by aberrant
decarboxylation of chain extender units. Chem. Biol. 8:207-220.
[0337] Hopwood, D. (1997) Genetic contributions to understanding
polyketide biosynthesis. Chem. Rev. 97:2465-2497. [0338] Hunziker,
D., Yu, T.-W., Hutchinson, C. R., Floss, H. G., and Khosla, C.
(1998) Primer unit specificity in biosynthesis principally resides
in the later stages of the biosynthetic pathways. J. Am. Chem. Soc.
120:1092-1093. [0339] Janssen, G. R., and Bibb, M. J. (1993)
Derivatives of pUC18 that have BglII sites flanking a modified
cloning site and that retain the ability to identify recombinant
clones by visual screening of E. coli colonies. Gene 124:133-134.
[0340] Kahn, R. A., Fahrendorf, T., Halkier, B. A., and Moller, B.
L. (1999) Substrate specificity of the cytochrome P450 enzymes
CYP79A1 and CYP71E1 involved in the biosynthesis of the cyanogenic
glucoside dhurrin in Sorghum bicolor (L.) Moench. Arch. Biochem.
Biophys. 363:9-18. [0341] Kawamura, T., Liu, D., Towle, M. J.,
Kageyama, R., Tsukahara, N., Wakabayashi, T., and Littlefield, B.
A. (2003) Anti-angiogenesis effects of borrelidin are mediated
through distinct pathways: Threonyl-tRNA synthetase and caspases
are independently involved in suppression of proliferation and
induction of apoptosis in endothelial cells. J. Antibiot.
56:709-715. [0342] Kieser, T., Bibb, M. J., Buttner, M. J., Chater,
K. F., and Hopwood, D. A. (2000) Practical Streptomyces Genetics.
The John Innes Foundation. Norwich. [0343] Keller-Schierlein, W.
(1967) Composition of the antibiotic borrelidin. Helv. Chim. Acta.
60:731-753. [0344] Kuo, M. S., Yurek, D. A., and Kloosterman, D. A.
(1989) Assignment of .sup.1H and .sup.13C NMR signals and the
alkene geometry at C-7 in borrelidin. J. Antibiot. 42:1006-1007.
[0345] Kuhstoss, S., Huber, M., Turner, J. R., Paschal, J. W., and
Rao, R. N. (1996) Production of a novel polyketide through the
construction of a hybrid polyketide synthase. Gene 183:231-236.
[0346] Lozano, M. J., Remsing, L. L., Quiros, L. M., Brana, A. F.,
Fernandez, E., Sanchez, C., Mendez, C., Rohr, J., and Salas, J. A.
(2000) Characterization of two polyketide methyltransferases
involved in the biosynthesis of the antitumor drug mithramycin by
Streptomyces argillaceus. J. Biol. Chem. 275:3065-3074. [0347]
Maehr, H., and Evans, R. H. (1987) Identity of borrelidin with
treponemycin. J. Antibiot. 40:1455-1456. [0348] Marsden, A. F.,
Wilkinson, B., Cortes, J., Dunster, N. J., Staunton, J., and
Leadlay, P. F. (1998) Engineering broader specificity into an
antibiotic-producing polyketide synthase. Science 279:199-202.
[0349] Matter, A., (2001) Tumor angiogenesis as a therapeutic
target. Drug Dis. Today 6:1005-1024. [0350] Mochizuki, S., Hiratsu,
K., Suwa, M., Ishii, T., Sugino, F., Yamada, K., and Kinashi, H.
(2003) The large linear plasmid pSLA2-L of Streptomyces rochei has
an unusually condensed gene organization for secondary metabolism.
Mol Microbiol. 48:1501-1510. [0351] Moore, B. S., and Hopke, J. N.
(2000) Discovery of a new bacterial polyketide biosynthetic
pathway. Chembiochem 2:35-38. [0352] Nielsen, J. S., and Moller, B.
L. (1999) Biosynthesis of cyanogenic glucosides in Triglochin
maritima and the involvement of cytochrome P450 enzymes. Arch.
Biochem. Biophys. 368:121-130. [0353] Olano, C., Wilkinson, B.,
Moss, S. J., Brana, A. F., Mendez, C., Leadlay, P. F., and Salas, J.
A. (2003) Evidence from engineered gene fusions for the repeated
use of a module in a modular polyketide synthase. Chem. Commun.
2780-2782. [0354] Oliynyk, M., Brown, M. J. B., Cortes, J.,
Staunton, J., and Leadlay, P. F. (1996) Chem. Biol. 3:833-839.
[0355] Otani, A., Slike, B. M., Dorrell, H. I., Hood, J., Kinder,
K., Cheresh, D. A., Schimmel, P., and Friedlander, M. (2002) A
fragment of human TrpRS as a potent antagonist of ocular
angiogenesis. Proc. Nat. Acad. Sci. USA 99:178-183. [0356] Otoguro,
K., Ui, H., Ishiyama, A., Kobayashi, M., Togashi, H., Takahashi,
Y., Masuma, R., Tanaka, H., Tomoda, H., Yamada, H., and Omura, S.
(2003) In vitro and in vivo antimalarial activities of a
non-glycosidic 18-membered macrolide antibiotic, borrelidin,
against drug-resistant strains of Plasmodia. J. Antibiot.
56:727-729. [0357] Pacey, M. S., Dirlam, J. P., Geldart, L. W.,
Leadlay, P. F., McArthur, H. A. I., McCormick, E. L., Monday, R.
A., O'Connell, T. N., Staunton, J., and Winchester, T. J. (1998)
Novel erythromycins from a recombinant Saccharopolyspora erythraea
strain NRRL 2338 pIG1 I. Fermentation, isolation and biological
activity. J. Antibiot. 51:1029-1034. [0358] Paetz, W., and Nass, G.
(1973) Biochemical and immunological characterization of
threonyl-tRNA synthetase of two borrelidin-resistant mutants of
Escherichia coli K12. Eur. J. Biochem. 35:331-337. [0359] Prieto,
M. A., Diaz, E., and Garcia, J. L. (1996) Molecular
characterization of the 4-hydroxyphenylacetate catabolic pathway of
Escherichia coli W: engineering a mobile aromatic degradative
cluster. J. Bacteriol. 178:111-120. [0360] Quiros, L. M.,
Aguirrezabalaga, I., Olano, C., Mendez, C., and Salas, J. A. (1998)
Two glycosyltransferases and a glycosidase are involved in
oleandomycin modification during its biosynthesis by Streptomyces
antibioticus. Mol. Microbiol. 28:1177-1185. [0361] Raibaud, A.,
Zalacain, M., Holt, T. G., Tizard, R., and Thompson, C. J. (1991)
Nucleotide sequence analysis reveals linked N-acetyl hydrolase,
thioesterase, transport, and regulatory genes encoded by the
bialaphos biosynthetic gene cluster of Streptomyces hygroscopicus.
J. Bacteriol. 173:4454-4463. [0362] Ranganathan, A., Timoney, M.,
Bycroft, M., Cortes, J., Thomas, I. P., Wilkinson, B.,
Kellenberger, L., Hanefeld, U., Galloway, I. S., Staunton, J., and
Leadlay, P. F. (1999) Knowledge-based design of bimodular and
trimodular polyketide synthases based on domain and module swaps: a
route to simple statin analogues. Chem. Biol. 6:731-741. [0363]
Reeves, C. D., Murli, S., Ashley, G. W., Piagentini, M.,
Hutchinson, C. R., and McDaniel, R. (2001) Alteration of the
substrate specificity of a modular polyketide synthase
acyltransferase domain through site-specific mutations.
Biochemistry 40:15464-15470. [0364] Rowe, C. J., Bohm, I. U.,
Thomas, I. P., Wilkinson, B., Rudd, B. A. M., Foster, G., Blackaby,
A. P., Sidebottom, P. J., Roddis, Y., Buss, A. D., (2001) Chem.
Biol. 8:475-485. [0365] Rowe, C. J., Cortes, J., Gaisser, S.,
Staunton, J., and Leadlay, P. F. (1998) Construction of new vectors
for high-level expression in actinomycetes. Gene 216:215-223.
[0366] Rudd, B. A. M., Noble, D., Foster, S. J., Webb, G., and Haxell,
M. (1990) The biosynthesis of a family of novel antiparasitic
macrolides. Proceedings of the 6.sup.th International Symposium on
the Genetics of Industrial Microorganisms. Strasbourg, France.
Abstract A70. p. 96. ISBN 2-87805-004-5. [0367] Sambrook, J.,
Fritsch, E. F., and Maniatis, T. (1989) Molecular cloning: a
laboratory manual. 2.sup.nd ed. Cold Spring Harbor Laboratory
Press. New York. [0368] Schmidt, D. M. Z., Hubbard, B. K., and
Gerlt, J. A. (2001) Evolution of enzymatic activities in the
enolase superfamily: functional assignment of unknown proteins in
Bacillus subtilis and Escherichia coli as L-Ala-D/L-Glu epimerases.
Biochemistry 40:15707-15715. [0369] Schwecke, T., Aparicio, J. F.,
Molnar, I., Konig, A., Khaw, L. E., Haydock, S. F., Oliynyk, M.,
Caffrey, P., Cortes, J., Lester, J. B., Bohm, G. A., Staunton, J.,
and Leadlay, P.
F. (1995) The biosynthetic gene cluster for the polyketide
immunosuppressant rapamycin. Proc. Nat. Acad. Sci. USA
92:7839-7843. [0370] Shaw-Reid, C. A., Kelleher, N. L., Losey, H.
C., Gehring, A. M., Berg, C., and Walsh, C. T. (1999) Assembly line
enzymology by multimodular nonribosomal peptide synthetases: the
thioesterase domain of E. coli EntF catalyzes both elongation and
cyclolactonization. Chem. Biol. 6:385-400. [0371] Silakowski, B.,
Nordsiek, G., Kunze, B., Blocker, H., and Muller, R. (2001) Novel
features in a combined polyketide synthase/non-ribosomal peptide
synthetase: the myxalamid biosynthetic gene cluster of the
myxobacterium Stigmatella aurantiaca Sga15. Chem. Biol. 8:59-69.
[0372] Singh, S. K., Gurusiddaiah, S., and Whalen, J. W. (1985)
Treponemycin, a nitrile antibiotic active against Treponema
hyodysenteriae. Antimicrob. Agents Chemother. 27:239-245. [0373]
Staunton, J., and Wilkinson, B. (1997) Biosynthesis of erythromycin
and rapamycin. Chem. Rev. 97:2611-2629. [0374] Swan, D. G.,
Rodriguez, A. M., Vilches, C., Mendez, C., and Salas, J. A. (1994)
Characterization of a Streptomyces antibioticus gene encoding a
type I polyketide synthase which has an unusual coding sequence.
Mol. Gen. Genet. 242:358-362. [0375] Takeshita, S., Sato, M., Toba,
M., Masahashi, W., and Hashimoto-Gotoh, T. (1987) High-copy number
and low-copy number plasmid vectors for lacZ alpha-complementation
and chloramphenicol- or kanamycin-resistance selection. Gene
61:63-74. [0376] Thomas, I., Martin (nee Rowe), C. J., Wilkinson,
C. J., Staunton, J., and Leadlay, P. F. (2002) Skipping in a hybrid
polyketide synthase: evidence for ACP to ACP chain transfer. Chem.
Biol. 9:781-787. [0377] Tsuchiya, E., Yukawa, M., Miyakawa, T.,
Kimura, K. I., and Takahashi, H. (2001) Borrelidin inhibits a
cyclin-dependent kinase (CDK), Cdc28/Cln2, of Saccharomyces
cerevisiae. J. Antibiot. 54:84-90. [0378] Wakasugi, K., Slike, B.
M., Hood, J., Otani, A., Ewalt, K. L., Friedlander, M., Cheresh, D.
A., and Schimmel, P. (2002) A human aminoacyl-tRNA synthetase as a
regulator of angiogenesis. Proc. Nat. Acad. Sci. USA 99:173-177.
[0379] Wakabayashi, T., Kageyama, R., Naruse, N., Tsukahara, N.,
Funahashi, Y., Kitoh, K., and Watanabe, Y. (1997) Borrelidin is an
angiogenesis inhibitor; disruption of angiogenic capillary vessels in
a rat aorta matrix culture model. J. Antibiot. 50:671-676. [0380]
Waldron, C., Matsushima, P., Rosteck, P. R., Broughton, M. C.,
Turner, J., Madduri, K., Crawford, K. P., Merlo, D. J., and Baltz,
R. H. (2001) Cloning and analysis of the spinosad biosynthetic gene
cluster of Saccharopolyspora spinosa. Chem. Biol. 8:487-499. [0381]
Wilkinson, B., Foster, G., Rudd, B. A. M., Taylor, N. L., Blackaby,
A. P., Sidebottom, P. J., Dawson, M. J., Buss, A. D., Gaisser, S.,
Bohm, I. U., Rowe, C. J., Cortes, J., Leadlay, P. F., and Staunton,
J. (2000) Novel octaketide macrolides related to
6-deoxyerythronolide B provide evidence for iterative operation of
the erythromycin polyketide synthase. Chem. Biol. 7:111-117. [0382]
Wu, N., Tsuji, S. Y., Cane, D. E., and Khosla, C. (2001) Assessing
the balance between protein-protein interactions and
enzyme-substrate interactions in the channeling of intermediates
between polyketide synthase modules. J. Am. Chem. Soc.
123:6465-6474. [0383] Xue, Y. Q., Zhao, L. S., Liu, H.-W., and Sherman,
D. H. (1998) A gene cluster for macrolide antibiotic biosynthesis
in Streptomyces venezuelae: architecture of metabolic diversity.
Proc. Nat. Acad. Sci. USA 95:12111-12116. [0384] Xue, Y. Q., and
Sherman, D. H. (2000) Alternative modular polyketide synthase
expression controls macrolactone structure. Nature 403:571-575.
[0385] Yamamoto, H., Maurer, K. H., and Hutchinson, C. R. (1986)
Transformation of Streptomyces erythraeus. J. Antibiot.
39:1304-1313.
Sequence CWU 1
1
113174787 DNA Streptomyces parvulus Tu4055 1gatcccgcgc ggcatcgccg
tcgacgtgct gcgggccggc gaccgctggc cccacagcgc 60ggcaccgcgc caccggggac
tcctcaacgc ctggtggggc gcctgggtct gggccacggt 120cttcgaccgc
tacgcgtcga ggacctacga cgacgcccag gacgtcgacg cgatccacga
180cgcggcggga ctggtcatgg ccggtgccgg attcgacatc ctcgccgccg
tgctcgcgat 240cctcttcgtg cgccggctga ccgccgcaca gcacgcgaag
gccctcgcgg ggcccacccc 300gccgacgcac tgagccgccc gcacccgtga
tcccgccccg cgatccccgg gcccgataaa 360tgcgttggcc ccggcgcgcg
cctgtggtgg gatgagcggc gacgggggcg gctccccggc 420gtgcatcctt
ctcaccttcc tgcaaagatc ccgcgcgccc actctccgcc cccgttcttc
480cgtcccgagc cgtcgccgcc gtggaggctt tcctgttgct cgccgccgag
tccgtactgc 540tgcgccgtga ccagagcgtc tacgtgaccc cggggtccga
gccggacggt ccgccgaggg 600ccgcactgcg ccggctcgag gccgaactgc
tcggccgcgg ccacgccgtc tccgcgccgc 660tgcacgcggt cctcgcctcc
ttggactccg aggaactggc ggccgcccac gtacgcctcg 720tcggactcgt
cgacgacctg ctcggctccg accgcaccca caccccgctc ttccgccgct
780tcccgcgcac cgtgccgcgc gacaccgagg cgctgtacgt ggaccgcgtc
ttcgccttcc 840tgctgcagca gcccgagcag ccctgcgtgc tgtgcggcga
ggcgcgcacc gtcctgcccg 900tgtcaccctg cgcgcacctg gtctgccggc
tgtgctggga cggctccgac tacgcgggat 960gcccgctgtg ccaccgcagg
atcgacgggg acgacccctt cctgcgtccg gtccgtgccg 1020tcggcgccgc
cagggcgacc gtaccgggcc cgctgcgact gctgcgcctg ggcaccgaca
1080tgaccgccga cgccaccacg gcggtggacg ccctgctggc ccgccgcacc
ccgctctccc 1140cgcaggaccg ggacgacctg ctcaccctgt tgccgctcac
accggccggc cggggcgacc 1200tgccgcagga catcccggtc cgcgagacca
aggcgctggt cctgggcgcg ctggtgcgcc 1260gggcaccgtc gcggccggcc
ctgcggaggc tgctcgccga gcggctcacc accgccaccg 1320acgtgctgcg
gctgctcgcc gtgctctcgg gcggcgacgc cgggctggtg acaccggcac
1380ggttcacgaa cgttccccgt tccctgcggc gtgacctgct cgccgtcctc
gacggactgc 1440cggcgccgta cctggtcgag gacatgctgc ggcaccccac
ggcgtggaag cgggccgcgg 1500aggtgctgca ccccttcgag gggcacaccc
ggcacccgcg cgccgcgctc gccaccgccg 1560tgctgcgcgc cacaccgttg
gacccggaca ccgccttcgg cgccgccctg ctgaccacgg 1620ccgccgcgca
cccggacgcc gtgcgcccgg acggcacccg agtccgcccg gccacctggg
1680cgggacggct ggagcaggcg atggccgagg gggacgccgc tcgggccgcg
gccctcgccg 1740gggagcggcc cggcgaactg gtgcgccgcc tggacgtgtt
gctgcgcctg cacaccgacg 1800aggcgctcgt gccggagctg gagaaggccc
tgcggcacgg gctgccgaag gtgggcccgg 1860gcccgctgct gtcggcgctc
ggggcgctgc ggacacgcac cgaggaccgc accgggaccc 1920ggcgcgtgtt
cttcccgcgg ggcgacgtca cccgggccct gtccgtcccc gagcggcgcc
1980ccgccctgcc cgccgggccg gtgtccgagg tggtcgccct gctggagggg
gaactgctgc 2040gccggttcgc cgccgggcgg ccctacgagc tgtcggtgct
ggacgccgga ctgaccgacc 2100tcaccgtgcc gttcaccgag cggaccgccg
ccaaggccct ggtgaccgtg ggccgcggca 2160gcgtccaggc actccccgag
ggctccgtgc tccgactgtt cctgcactgg acggaacccc 2220ggggcaaccg
caccgacctg gacctgtccg tcgccttctt cgacgccgag tggacgttca
2280ccggcctgtg cgactacacg aacctggtgc acggtccgga cgcggcgatc
cactccggcg 2340acctcacgtc ggccccggcg ccgcgcggcg ccaccgagta
cgtggacctc gacctggagc 2400ggctggcgcg gcggggagac acctacgccg
tcccgctggt gttcagctac aacaacgtcc 2460cgttcgagga actgccggac
gccttcgccg ggttcatggc gctgcccgcg gaaggcccgc 2520gcgacgcgac
ctacgacccg cgcaccgtgc ggcagcgctt cgacctcgcg ggcgactcca
2580aggtgtgcct gccgatgatc gtggacctgg cccgccggcg ggcgttgtgg
accgacaccc 2640acctgccgtc cgcgggcggc ttccagagca tcggttcgca
cggcggcggt gagctggccg 2700cggtggccgg tgacctctgg cagcagttca
cctcgggcgg ccgggcgacc ctgtgggacc 2760tcgccgtcct gcgggcggcc
gccctctcgc cggaggtggc ggtggtgtcc cgggagccgg 2820agcccgcggt
gctgcgttac cggcggcggg cggccgagag cgaggccgcg ttcgccgtcc
2880gagtcgcgtc ccacaaggac gccgaggaac ggctggcgca caccgacccc
gactcggccg 2940cggccgggct cgccgccggc cggcgggtct tcctcgcgac
ggtccacggt gacgtccggc 3000cgccgggggc gtcgggcacg tcctaccggc
tcttccccgg ggccggggac gcctcaccga 3060ccctgacccg cgtgaccgcc
ggggacctgc tcgccgagct gggctgagcc aggcgccggc 3120ccgcgccggc
ccgcgccggc ccgtccctgc ccgtgccgga gggctcgccg gtcactccgg
3180ccaggcggag ttctcgatga cctcgacgaa gtccgtacgc cggaagccgg
gcgcgaagtg 3240ctccagcaca tccgcgttca ccgtgccgaa ggtggtcgcg
ggccggtgct cgaacccctc 3300ggtgaacgcc cgcaggatct gcttcttgaa
gtccgggcgg ggatgcgcgg cggtgaccgc 3360gtcgatctgg gcccgggtga
gattgcccag ccgcaggccg agcacgtcgg tctccacgcc 3420ggcggtggtc
gccgcgatct cgggggccat ccggtacggc acctccggag tggtgtgcag
3480ggcgacggcc gtccacacgg tgtccgcgtc ggcctcgggg atgccgtggg
cgagcaggaa 3540ggcgtgggcc tggtcggcac cgtccatctc gaagcgctgg
tcgtcaccgc ggtagggcgg 3600caccaggccg gtgtcgtgga agagcgcggc
gatgtacagc agctccgggt cggggcggat 3660gcccagggcg gcggcctgga
ggctgccgaa gaggtacaca cggcgtgagt ggtggaagat 3720cagcggcgga
gtggtgtcgc ggatcaggtc ggtcgcctcc cgcgccggcg cgctgtcggg
3780aatctcgatg ccggcgatct gctcggccat ggctgccctc cggggaatcg
gtgccgtcgt 3840tgctgcctcc accctccgcc cggcgcgacc ccggcgtccg
ctacccgatg gccgacaacc 3900ccttacaagc ggccatgtgc cccgcgccgc
cgcctcagcc gccgtccggg cgcgggccgg 3960cgtccggcac ggtggtggcg
aagcgctgcc ggtagcgggt gggcgacagc cccagatgac 4020gggcgaaggc
ccggcgcagg ctctcgtagc tggggaaacc cgacagcgcg gcggcctcgg
4080tggcgttgtg cccggagtcg agcagcgcct tggcgatgtc gaaacggatc
agctccacgt 4140acttcacggg cgtgacgtcc agctcggccc ggaacatccg
ggtcagatgc cgggggctga 4200cccgcacgcg cgccgccaac gccgccagac
tgtggtcggc ggccggatcg gcctgtacgg 4260cgtcctggac ctgccgcagc
acgggcgtcc gcggcgccgg gccccgcaac gaggcggaga 4320actgcgactg
gccgccggcc cgctgcaggt acaccaccag cgagcgcgcg accctgcggg
4380cgagatcggg cccgtggtcc tcctccagca gcgcgagggc caggtcgatg
cccgccgtca 4440cgccggcgga cgtgtaggtc gccccgtcct tgacgaagat
cgcgtcgggc tccacgcgtg 4500tcgacggaca gcggcgggcc agcgcggtgg
tgtgctgcca gtgcgtcgtc gcccgtctgc 4560cctccagcag acccgcggca
cccagcacga aggcgccggt gcacaccgag gcgacgcgtc 4620cggcccgggc
cgccagcgcc ttcgcggcgt cgatgagccg tgggtcgacg ggcgagccgg
4680gcagcgcgtc accgccgacg acgacgagcg tgtccggcgg gccggcggaa
cgcgcgtccg 4740cctcggccgg gaccagcagg ccgatggacg aacgcaccgg
cgccccgtcc ggggagacga 4800cgccgagccg gtaccgggcc ccgaaccggt
tggcctccgc gaagacctcc gccggccccg 4860acaggtcgag catcttcatg
ccgtcgaaga ccaggatgcc cacgctgtgc gctctcgccg 4920tcatgtctcc
ctctccgcgg gccggcgggc ccctgcgcgc cattgtcccg ccggccgtcc
4980acgccggcgg ccggcggcgc gggcggccgg cggtcggaat gaggcgcgcc
ggacatcggc 5040gtagggtggc gagcgtgtgt tcggccgcgg tcccggagac
cgcggaacgc aggacctttg 5100gcaggcacgc ggaaggacag cgatgggtac
ggtcaccacc tccgacggca cgagcatctt 5160ctacaaggac tggggcccgc
gcgacgcccc gccgatcgtc ttccaccacg gctggccgct 5220caccgcggac
gactgggaca accagatgct gttcttcctc tcgcacggct accgtgtgat
5280cgcccacgac cggcgcggcc acggccgctc gggccagccc tcgacgggcc
acgagatgga 5340cacctacgcc gccgacgtcg cggcgctgac cgaagcgctc
gacctgcggg acgccgtcca 5400catcgggcat tcgaccggcg gcggcgaggt
cgcgcgctat gtggcgcgcg ccgaaccggg 5460ccgggtcgcc aaggccgtgc
tggtcggcgc cgtgccgccg gtgatggtca agtccgacgc 5520caaccccggc
ggcaccccga tcgaggtctt cgacgggttc cgcacggccc tggccgccaa
5580ccgggcccag ttctacatcg acgtgccctc cggccccttc tacggattca
accgggaggg 5640cgcgaaggtc tcccagggcc tgatcgacaa ctggtggcgg
cagggcatgt cgggcgcggc 5700caacgcccac tacgagtgca tcaaggcgtt
ctccgagacc gacttcaccg aggacctcaa 5760ggccatcgac gtgccggtgc
tggtcgcgca cggcaccgac gaccaggtcg tgccctacgc 5820ggactcggcg
ccgctgtcgg tgaagctcct gaagaacggc accctcaagt cgtacgaagg
5880gctcccgcac ggcatgctct ccacccaccc cgaggtggtc aaccccgacc
tcctggactt 5940cgtgaggtcc tagtcggcgc tcacgccggc gacacgggag
cgggtgcggc gccgcgcacc 6000gggtgcttgc tcaggacgga gacccggttg
aaggcgttga tgctgatcgc cacccagatc 6060acggcggaga cctcgtcgtc
cgacaggacg ccccgtgcct gcgcgtaggc ggcgctctgc 6120gcggcggcgt
ccgccggacg ggtggtcgcc tccgcgaggg cgagcgccgc ccgctcccga
6180gcggtgaaca gctcggtgtc ccgccaggcg ggcagcaccg ccaggcgctg
ggtcgtctcg 6240ccggcccgca gcgccgccct ggtgtgcaga ctgagacagt
aggcacaggc attgagttgg 6300gagacgcgga tgttcaccag ttccacgagg
aggcggtcca ggccggccgc cgcggcggcc 6360tcccgcaccg attccgccgc
ggccacgaac gctttgtacg cgccgggggt ctgcttgtcg 6420acgaagaccc
gccgctcgtc cgtcgccacc ggggcctgct gtgtcacgtg gtctccttcg
6480tcgcgctctc ttccggcggg tcctatcatc acccccatgg atgttgaaag
tgaaactttc 6540aggtcggggc cggacggggg cgcgtggtga gcaacacgga
gacacggccc gcggagatgc 6600ggtgcggcgc cctcgaagac gaggtgcccg
ccgcgggcgt cgaagtcctc accgcccgtg 6660acgtccccct cggcggcccg
cgcgccatga ccgtgcggcg cacgctgccc cagcgggccc 6720ggacgctgat
cggagcctgg tgcttcgccg accactacgg tcccgacgac gtggccgcgt
6780cgggcggcat ggacgtcgcc ccgcacccgc acatcggcct gcagacggtc
agctggctgt 6840tcagcgggga gatcgagcac cgggacagcc tcggcaccca
cgccttcgtc aggcccggcg 6900aactcaacct gatgaccggc ggcttcggca
tcgcccactc cgaggtctcg acccccgaca 6960ccactgtcct gcacggcgtc
cagctctggg tggcgctgcc ggaggagcac cgcgacaccg 7020gccgcgactt
ccagcaccac gcacccgcgc cggtcgcctt cgacggcggc acggcacgcg
7080tcttcctcgg ctcgctcgcc ggggacacct cgcccgtgag caccttcacg
ccgctgctgg 7140gcgccgagtt gacgctggtg ccgggcggca ccgccaccct
ggacgtcgac cccggcttcg 7200agcacggcgt cctcgtcgac agcggtgacg
tacgcgtcga gggcgccgtc gtgcgaccgg 7260ccgaactggg ctacgtcgcg
ccgggtcgcg cgacgctgac cctgaccaac gagtcggccg 7320cacccgcccg
gctcatcctc ctcggcggcc ccccgttccc cgaggagatc atcatgtggt
7380ggaacttcat cggccggtcg cacgacgaga tcgtgcgggc ccgcgaggac
tggatgaagg 7440gcgaccgctt cggcgaggtg cacggctacg acggggcacc
cctgcccgcg ccggaactgc 7500cgaacgcacc cttgaagccg cgacgaaggg
cgcgctgatc tgcggggaca tgggttggca 7560ccaagggttt cggcgctgct
cgatcaccga acccaccgcg agtcactctc gggtgagtcc 7620cgaacggtcg
ccgggagcgc gtgagcacgt gcgcagatgc tcggcgatga tgccgagaat
7680cgcatcccgg tgctccagca ggtagaagtg accacccgcg aaggtgtcga
gtgtgaacgg 7740gccgtccgtg tgttcggacc atgcccgggc ctcgaccggg
gtgaccatcg ggtcatcatc 7800cccggtcaag gcatggatgg ggcaccgcag
cttcgggccc ggtcggtagc ggtaggtctc 7860ggcggccctg tagtcgccgc
ggatggcggg gagggccata cgcaccagct cctcgtcgtg 7920gaagacctgc
tccgcggtgc cgtcgagggt cctcagctcg gccaccaact cctcgtccga
7980caggaggtgc accgtacccc ccgtcctctg ccgggacggt gcgggcctgg
ccgagacgag 8040gagtgcctcc agggagatgc ccgcactctc gaaccgtcgg
gccagttcga aggcgagggt 8100ggcgcccatg ctgtgtccga acagcgcgac
cggctggtga acacgggccc gcagcacggg 8160gaagagctgg ttcgcgagtt
cgtcgatgtc ctccaggggc ttctccgcgc gccggtcctg 8220ccggccgggg
tactggaccg cgagcacgtc gcaccggggt gccagcgcgg cagccacggg
8280gtggtagaac gtcgcggagc cgccggcgtg cggcagacag atcaactggg
gtgccgtggg 8340atgtgcgggg cggtactgcc tgatccacac gtcgctgtgg
gtgttcgtac cggtcatcag 8400cggtgctgcc cttccggcgt ggcgttggtg
cgggggatgg ccgatccggc cgtgacgcct 8460ccgtcgaccc cgagggtctg
cccggtgacg taggcggcga gggggctgag cagccagacg 8520accgcgttgg
ccacctcctc gcacttgccc aggcggccca gcggagcccg gcgcgcccgt
8580tgtgcgaggg cgctgggatc ggcgtacagg ctgcgcagca tgggggtgtc
ggtcgaaccg 8640gggctgacca cgttgacgcg gatgccgtcg cccgcgtact
gcagggccac cgacttgctc 8700aggccgatga ccgcgtgctt ggtggccgag
tagagcgggc tctgggcgtg gccgatgtgc 8760ccggccactg atgcgcagtt
cacgatcgcg ccgccgccgg ccgtcagcat ggcctcgatc 8820tgtccgcgca
tgcacgacca gaccccacgc aggttggtgg cgatcacgcg gtcgaagttg
8880tcggcggtgt cctggtgcag cggaccgaac gagccgaagg tcccggcgtt
gttgaacgcc 8940ccgtccagcc gtccgaaccg gctcaccgcc cgggccacgc
agtccgccac ctgcttgtcg 9000tcaccgacgt cgcagggcac caccaagtgg
tgcgaggagg gtagtccggc ggttgtctcc 9060gtgagggccg actcggtgcg
gcccaccagg acgacgcgtg ctccgtgccc cacgaggagc 9120cgggccgcgg
cccggccgat gccgctgccc gctccggtga ccatcatcac gcggtcggtg
9180agttccagac tcatcgttgt tccaacgctc cgtccctgct cgtcggatgt
gcgatccgct 9240gtgtcatatg tgcagtccgc cgttgacgtc gacgaccgtg
ccggtggtgt atccggcgtc 9300ctcgccgcac agatggcaga ccatgcccgc
ggcctcggcg acgctgccga agcggccggc 9360ggggatgtgg ctgacgcggt
cggcggtcca ctgcgggggc ttgtcctccc aagcccggcg 9420gatgcgctcg
gtgccgatga cgccgtgggc gaccgcgttg accgtcacgc cgtgcggggc
9480cagttcgtag gcgcactgct tggtgaaccc gatgacgccg gccttggcgg
cgacgtaggc 9540ggcattgctg aaccgggtgt acgtgcgacc ggccacggac
gccaggttga cgaccctgcc 9600ccaccccgcc gcgaccatcg ccgggacgca
cagccgggtc atggtgaaca cgctcgccag 9660gttgtgcgtg acggcctcct
ggaggtcggc ctcggtcagt tcggtcaccg agcgggcccg 9720ggtgtcgcca
ccgacgccgt tgaccaggac gcccggccgg tgctgcgggg cgagcgagtc
9780gacggcggac gccagggcgt gagggtcggt cacgtcggcg accagcgggt
cccgggccag 9840ccggtcgccg agcccgtccg cgacccggtg caccgcctcg
gcgtccttgt cgagcaggac 9900cacccgcagg ccccgggccg ccaggcctcg
ggcgacctcc gcgcagatgc cgctgcccgc 9960tccggtcacc agagccacgt
cgtgtcgtgc cgtcatgtgt tcctccgcca gccgccgccg 10020gattcccagg
tggccgcgca ccgggtgtgt cgcagtagtt cttcggtcgg ttcgatgccc
10080gtgcccggac cggtcaaggg ttcgacccgg tgcagtgacc ggtcgacggt
gaacgccggc 10140gtggtcagtg gcacggggaa ccactcgtcg gcccggcccg
cctcgaccgt ctgccacagg 10200tcccatgcgg tggccagggt gcgcccggcc
gcccacagcg gccccacctc ggccacgtga 10260acgccgagct ggcaaccgac
gccgagctcg tcggcgcgca gtgccaggcg tgccgcggcg 10320aggaacccgc
cgcacttcga cagccgtacg ttgatgtggc tggcggcgcc gctggtggcg
10380gcggcgtgga ggtcggccgg tccggtacag gactcgtcga gcatgacggg
cagaccggtg 10440gcccgccgca gccggcccaa ctcgggccag gaccgcggcg
ggagcggctc ctccacccac 10500cccacgccgt ccagttcgcc cgcgaccttc
tccgcttcct cggccgtcca ggcgccgttg 10560acgtccagtg agacacgggt
gtcggcgggg aggcggtcct gggccgccgt cagccggtcc 10620accgccccgg
ccgggtccgc caccttgatc ttcacgtgcc gcaacgccgc cagcgcccgc
10680ggcgtgagtg cgtccaggac ggtcgcgacg tcgcgcgaga ggtggatcac
gaggctgacg 10740gacgtcggtc cgtcccgccg tgatcgggca ggcggggcca
ggacccgcag gacgtcggcg 10800agcggccggg cgaaatgccg gcacaccgcg
tcgagcaggg cgatctccac ggcggccgcc 10860gccgacgagc cgtcgacgag
cccggtcagc ggcagctgtg cgatcgaggc gacggcgctc 10920tcgaagtccc
gccactcgat gcgctcggcc agctccccgg gatcgcaggc ctggacggct
10980cgcaccgcgc cgtccagggt ctcaccggtg acgtagtcgc ggggcgctcc
ctctccccat 11040ccgcgggtgc ccgccagctc gatctcgacc agcagggacg
ccgcgctgcg acgggagcgc 11100gtggcgtggt cgaaggccgc ggccatgggc
acgacggcgg tgtgcagccg tacgcgacgg 11160atcacgcttc ctccttcagc
cgcgtggcca gccagtccca gtacgccgtc cgggccgacg 11220tgaactccac
gtagtgccga tccgtggcga agacctcctc gtgcacggct gacgtcagac
11280gccgcagcat cgctcgcgcg gccgacaggt cgatgatcgg gtcgtgagtg
gggagcgcca 11340ggtcgacggg gagccgggtg cggggggcac cgcgggcata
gtggtcctcc aggtgcacga 11400gcgtcgcctg cgtggccgag gtgacctcgc
gcagcatcag gtgatccccg gtgaggaact 11460cccggtagcg cggcaggtcg
gtgtagtcgc cgtcggcgag ccccacgggc cgtagcccgc 11520tgccggtgag
cgcgcggcgc tcggcgagcg tgtccgcggt gtggcgcgcc cgctgctgtc
11580ccagcgcggg cgcgcacagg accaacctgc ggacgggcag atcgcgggtg
caccagagcg 11640cggccagcac gctgccgccg aggctctgcc ccagggcgac
cggcccggca ccgccgacct 11700cggccgtcac ggcgtcgagg gcgcgggcgt
agtcgtcgag gacgagatcg gccgacggca 11760ggtggccgcg aggcccctcg
ctgcggcccg agcctctgcg gtccagggcg tagacgtcga 11820tgccgcgtgc
gttgagctcg ggccccgtct cgaacagcca gcccgcgtgg ctctggatgc
11880cgtggaggta gaagacggcc gaggtggcgc cgggcgtggt ccagtggtgc
agggtgagcc 11940cggtgccgtc ggcagcggtc agcatgctcg tggtgggcat
gggctgcctc ctcagtaccg 12000gacgagattg acgtcggggt ccagccggac
gtcgacgagg agcggtcctt cgagcgtcga 12060caacaggtcg ccgacggcgt
cgagttcttc cgccttgcgg acggtgaggg cacgtgcgcc 12120catcgccgtg
gccagcccgg cgaggtcggg ccaggcgaac gccgagtacg ccgggtcgta
12180gccgtggttc ctgagtttgt agtgctcggc tccgtacgcc ccgtcgttga
gtacgaccac 12240gacgagcggc agccggtacc gtaccgccgt cgtgaactcc
gacaggtgca tcatgaagcc 12300gccgtccccc acggcggcga ccacgggccg
gccggtcccg gccgtcgccg cgccgatcgc 12360cccggcgacg ccgagcccga
tcgagccgaa gccgcccatg acggtgaagt gcagcgggtc 12420cgccacgcgc
agatacggcc agacacccac gtcgaagcgg ccgatatcgc tgacgacact
12480gcgctcggcg ggcagtatcc ggtccagccg gatcatggcc gtccggatgt
cgacggtctc 12540cgctccactg cggtcgtcga cgtcgtcctg cggcgagaac
ccggccagtt gcccggcgac 12600gcgctccgcc caggcgccgt tggccgcggt
gactccggcc tgatccagca ggacgttcat 12660ggtctcggcc gtgcggcggg
catccccggc cacgggctcg tcgacggggc tgtacgagcc 12720gaaccgtgcc
ggatcggtgt ccacgtgcac gactctcttg ccgcggagca gctcgccgtt
12780gagcacggtc cacatgttca ggctcgcccc gaacgcgatc acgcagtccg
actcggcgat 12840gaccgtgctc gccacgctgt gcgcgagcga gccgaagatg
ccgacgtcgc gggggtgacc 12900ggcgaacatc tccttgccga gcacggtggt
ggccagcgct gctccggtac ggtccgccag 12960ctccaccagg gcctctcgcg
caccggcgac ggccgcaccg tgcccggcga ggaccagcgg 13020ccgcttggcc
gagccgatca gccccagcgc gccgtccagc gcctcggcct ccggagcggc
13080cagaggaccc ggcgccaccg ggagcgtgac cggcgcctgc tcgcccgcct
ccgcctgcat 13140gaggtcgatc ggcacattga gtacgacggg ccgccgctcg
gccacgatcc gctggacggc 13200ccggttcagg tccgcgacga gcgaggccgg
tctgtggacg cgttcgtacc ccgcgcccgc 13260cgcggccgcg accgtcgcga
tgtcgaagtg gtggaagtgc gtgggcaccg gtggcggatc 13320acctgtgatc
agcaggacct ggctgtggct acgagccgct tccacaagag gggtcaaggc
13380gttggtgaaa gccggcccgt gcgtcacgga cgccacaccg atgccgccgc
acatacgtgc 13440gcggccgtcg gccatggcga cggcgcccgc ctcgtgggcg
accgccacga accgtccgcc 13500cgcgtcggcg aaggcgggca gatagagcag
attggcgttg cccatgagac cgaagacggt 13560atcgacgccg tgtgcggtca
gagcgtcggc gagcgcgtgg aaaaccttca ttgctgtccc 13620tcggtcgggg
cgggctggag ccagacggga tcgttctggt cgaccggcgc gcaggtgggt
13680ggcgggtcct cgcggagcag ggaacgcagg tggttgccgg tgatgagcac
ggtggttccg 13740gcgcagtccc gggcgacggc catccgccgg gcgagaactt
gggggtgccg caccgcgcgg 13800gtcaggtcgt gcagcggggc cgcgggcacg
ccgacggctg ccgccgcgtc cagtacggcg 13860gcgacggggc cctcggcccg
cggtggcggg ccttccgcgt cgacgaggac cgtgccgtcc 13920gccgcccgga
cgaggtgggg ggcccgtgcg ggtggggcgg tgaaccagcg gtcctgggtg
13980agccagacag cgctgtcgaa cagcgcgacg tcggcggcgc atccgtgctc
ggtgcgcaga 14040cggacgtagg tggacacgac ggcggcggca gcggccaggt
aggcaccgag gacgtcggcg 14100ctggacactg gagtcctcag cccggcgccc
gggccgccga cgaggcgcat gatgcccgac 14160tcggcctgga tcacggtgtc
gacgctgcgg tcggcggcgg ccaggccatg gccggtgacc 14220gtgcagtgca
cgacgccgtg ccgggacagg atctgatcgg gggcgaggcc ctgcgcggtg
14280agtgtgtcgg ccgcgaggtt ggtgagcacg atgtcgcatc cggcgagcag
ccgctcgaac 14340ccggcccggt cctcggcgtc ggcgaggtcg agccgacagg
agcgtttgcc cgcgttgttg 14400acgtagtaga ggtagccgac cccggccacc
tgctgggcga gccgccggga cccttcgccg 14460tgcggcggct ccaccttcag
tacgtcggcg ccgagttggg ccagcagccg gcccgcgtgc 14520ggtccggccg
tgtacgagcc gacctccagc aggcggacgc cgcgcagcgg aggagtgccg
14580cgcgcgatcg gctcccagag gccgccctgc cggggcatcg gaccgtcggt
gacggcgggt 14640atgagggaac gcagtgggct gcccggtgta ccggacgggt
cggtgaccag gccgcgacgc 14700cgggcggcgg ctccgtcgcg tacctcctcg
ggggccgcga cctgggcgca cgggatgccg 14760gcggcccgca gggcggtcac
cacgtccacg gcccgctgcc cggcggtcca cttgccgagg 14820atctcgtcga
gctcgtcggc gttgcggacg cgggcggcgg tgtcggcgaa gcgagggtcg
14880tcggggaggt cccgtcgtcc caggactgcg gtcagcctgt gccatatcgg
ctcgcccatc 14940gtgcagatga cgacaggggc gtcctggcac gtgtagctgt
tccagggtgc cgccatgccg 15000tgtcggttgc cggtacgacg tggcggacga
ccggcgagcg cgacgctcgg aagcaaggtg 15060cccgtcagag tgaacaggct
gtcgaactcg gcgatgtcca ggtagtcccc gcctcctccg 15120cgctcgcggc
cgatgagccc ggccacgacg gcgatcagac cggacagggc cgccgtacgc
15180gacgccaggc ccaccacgga gaggaccgat ggttccccct cggtgccggt
cgccgaggtc 15240aggcccgcca gcgcctgcaa ggtccgctcg gtggccgggg
cgtcacgcag cgggccggtg 15300agcccgaacg cgctcagccg caccgcgacc
aactcggggc tgcgatgcgg aagttccggc 15360gccccgagcc ccagagcggc
gagccgctca tcgccctccg cgtcgcacac cagcacatcg 15420gccgtctgca
gcagacgcga tgcctggtcc caaccggacg ccgactgcgc ggcggagtgc
15480agccaccgct cgaacgggcc cccgtgatcg ggtgatcggc agagcgtcac
gacacgtgcc 15540ccgaggtccg cgagcagtct ccccagcagt gcggctggtg
tactgcggcc ggccatgagt 15600acggcgatgc cctcgagcgg ccctgccctt
gtcatggaat tctccctcgc tccgcgcacc 15660gatgcgggcg tcggtcgtca
ccgctgattg gtcgtggacg tcggccgtga ggcgaccgcc 15720agggaaatca
caccggcgcc gcccgcatcc gcgggggatc tggccggcag tcccgatgcg
15780ccattaaagc gcgcatgatt cgttccgtgc cgaccgtagc accgagacgg
cggaaaatca 15840tcgcacaccc ctgctccgga tccggaaacc ctgctcaggg
ggcaaggggg agggggtccg 15900taatggccaa aacgaaattt tacggagctt
tacgtttgct ggacgatcta ttggtgagcg 15960cctcgacggg ctggacatgg
cagtagtgaa tgtccgcatt catggctatt agtaccgtga 16020ccctgatcac
acgagccctg gttgacgggt gaaatttggg gctggcagag tgatgacgag
16080cttccgtccg caaagtggtt gaataactgt tccgaaatct tcggcaattc
aaaggagact 16140tacgggggat gcctctatta atgtattgct gtagggcgaa
ataatgacag gcagtgctgt 16200ttcggcccca ttcctgcagc ctcccgaacc
cgtctcaggg cactccgaac ggaaaagcga 16260tcccgtcctt ctcgtcggcg
ccggacgccg tgcccgcatg gcggatgccg tacgtgccgc 16320cggcgctcag
gcgggcatcg acccggccgt cctacggcgc acccgggcca ccttgatcac
16380cgcggggagc gcgggagccg caggccggct cgccgccgcc ctgcgcctga
ccggcgccac 16440gatctctctg gacacccgcg agacacccac actgctcgcc
ctgcacctcg ccgcccaagc 16500gctgcgggcg ggcgacacct cttacgccgt
cgtcggtgcc gaacttcccg acgggaactg 16560cgcgttgatc ctggccaggc
agtcagcggc aaccgccgag ggggctgtgc cccaggcgat 16620cgtccgcacc
accacggcgg accgcaccac cacggcggat cacgcccctg cgcccgacga
16680ccacggcagc ccggcccgtg aagccccgca tgccacccgc acgttgtccc
caggcatcac 16740ccaggccccc gccgagggct tcccgggcct gctggcgacc
ctgcacgacg acacacccct 16800gcgccccacc gcggtcaccg agcacggcag
cgacgccacc accgtcctcg tcctcctcga 16860ccagccccag gacgccgcac
ccgcggcacc gctcccctgg gtggtctcgg ccccccacac 16920ccgcgccctc
cgggccacgg ccgcgaccct ggccgtccac ctcgacacca caccggccgc
16980acccgccgac gtcgcgcaca ccctgctcac cgcgcgcccc gaccgccacc
gtgccgccgt 17040cgtcggcgcg gaccgggcca ccctcaccga cggactgcgc
gcactcgcca ccggaggcga 17100cgcgccccac ctcgtccacg gcaccgccac
cggatcgccg cgtcccgtct tcgtcttccc 17160cggccagggg tcgcagtggc
ccggtatggc cgccgaactc ctcgaaacca gcgagccctt 17220tcacgacagc
gtgcacgctt gcgccgacgc gctggccgag ttcgtcgact ggtcggttct
17280cgacgtcctg cgccaggcac cggacgcgcc acccctgcgc cgggtggacg
ttctccagcc 17340caccctgtgg gcgacgatgg tctccctggc cgaggtctgg
cgctcgtacg gcgtggaacc 17400ggccgccgtc gtcggccact gctacggcga
gatcgccgcc gcgcaggtag ccggcgccct 17460cgacatgcgt gacgccgccc
gactgctcgc ccaccgcagc cgggcctggc tgcgactggt 17520gggcaagggc
acggtcatct ccgtcgccac ctcgggacag gacatcaccc ggcgcatggc
17580ggcctggccc gactccgtcg aactggccgc gctcaacggc ccgcgctccg
tggcgctcgc 17640aggcccgccc gacgtcctgg acggcatcgt caacgacctg
accgaccagg gcatccacgc 17700caaacgcatc cccggcgtgg acaccgtcgg
ccactgctcc caggtcgagg tcctccgcga 17760ccacctgctg gacgtcctgc
gcccggtctc gccccggccc gccgccgtgc cgttctactc 17820caccgtcgac
ggaaccgaac gcgacaccac cacgctggac accgactact ggtacctcaa
17880cacccgcagc caggtccgct tccaccaggc cgtgcggaac ctgctcgccg
ccggacaccg 17940ctcgttcgtc gaggtgagcc cgcacccgct gctcggagcc
tccatcgagg acaccgcggc 18000cgagttcggc ctcgacgacg tggccgccgt
cggcaccctg cgtcgaggcc agggcggcac 18060ccgccgggtc ctgacctcgg
tggcggaggc gtatgtccac ggcatcgaca tcgacttcac 18120gcccgccttc
accggcacga cccccaaccg catcgacctt ccgaccgtcg aggaccacgg
18180catcgagggt cacggcgacg acggcggcga gacatggacc gaccgcgtca
gaaccctccc 18240ggacgagcag cgcgaagagg ctttgctgga cctcgtgtgc
cgcaccgtcg ccgcggtgct 18300cgaagcggac ccggccggca cggcggacgc
cgtcgccccc gacacggcgt tcaaggagat 18360gggcctcggc tcactgagcg
cggtccggct gcgcaacggc ctccgcgagg ccaccggcgc 18420ccacctgccg
gccaccatcg cctacgacca ccccaccccg gccgctctgg cccgccacct
18480ggcgatgacc ctgttcgacg cgacgggcgc cgccccggcg gtcccggcac
cgagccgcga 18540cgacgaaccg atcgacgccg agaccgctgt gctgaccgcg
ctggaacggg ccgacgaggc 18600gctggaacgg ttgcgggccc cgcacgcccg
cacgccccgg caggagaccg gccggcggat 18660cgacgagctg ctgcggtccc
tgaccgacaa ggccaggcgg atgagacagg ccgacgccgt 18720cgatgatgtc
gatgatccgg ccaccgaccg gttcgccgca gccaccgacg acgagatgtt
18780cgaactcctc gagaaacgtt tcggcatctc ctgaggcgcg ccgacctccc
gcactgcgag 18840tcgcttcccc cacgatcccc gaaggcggca accgatggca
catgaagaca aactgcgcca 18900cctcctcaag cgtgtcagtg ctgaactcga
cgacacccag cgccgggtgc gtgagatgga 18960ggagagcgag cgcgagccga
tcgcgatcgt ggggatgagc tgccgtctgc ccggcggggt 19020gaacagcccg
ggggagttct ggtcgctgct ggaggccggg acggacgccg tctcggagtt
19080cccgcgggac cgtggctggg atgtggagaa cctctacgac ccggacccgg
acgcccccgg 19140gcggtcgtac gtccgcgagg gcggattcct ggacggggcc
ggacagttcg acgccgcctt 19200cttcggaatc tcgccccgtg aggcgctggc
gatggatccg cagcagcggc tgctgctgga 19260gtgctcgtgg gaggcgatcg
agcggtcgcg gatcgacccg aagaccctgc acggcagccg 19320gaccggcgtc
ttcgcgggct ccaactggca ggactacaac accctgttgc tgaacgccga
19380ggagcgctcc cagagctacc tggccaccgg cgcctccgga agcgtgctgt
ccgggcgcgt 19440ctcgtacacg ctgggcatgg aagggcccgc gatcaccgtg
aacacggcgt gctcgtcctc 19500tctggtcgcc gtccacctgg cggcccgttc
cctgcgggcg ggggagtgcg acctcgccct 19560ggccggcgcc gtcacggtca
tgtccacacc gcagcttccg gtcgccttct cccggcagcg 19620cggactcgcc
cctgacggtc gctcgaaagc cttcgcggtt tcggccgacg gcatgggctt
19680cggcgagggg gtgggcgtgc ttgtgctgga gcggttgtcg gtggcgcggc
ggaacggtca 19740tcgggtgttg gcggtggtgc ggggttcggc ggtgaaccag
gacggtgcgt cgaacggtct 19800gacggcgccg aacggtccgt cgcagcagcg
ggtgatacgt gcggcgttgg cgagtgccgg 19860gctgggtccg gccgatgtgg
atgtggtgga ggcgcacggt acggggacgc ggttgggtga 19920tccgatcgag
gcgcaggcgt tgctggcgac gtacgggcgg ggccgggacg cggagcgtcc
19980gttgtggctg gggtcggtga agtcgaacat cggtcatgcg caggctgctg
ccggtgtcgc 20040cggtgtcatc aagatggtgc tggccatgga gaagggccgt
ctccctcgga cgctgcatgt 20100ggatgagccg tcgggtgagg tggactggga
ctcgggtgcg gtgcggctgc tgaccgaggc 20160gcgggactgg ccgtcggagg
aaggtcgtct gcggcgggcc ggtgtgtcgt cgttcgggat 20220ctcaggcacc
aacgcgcacg tgatcatcga ggaagcaccg gaagaggggg aggaaccgga
20280gtccgacgcg ggtggtgtgg tgccgtgggt gctctccgcg cggacggaag
gggcactgca 20340agcacaggcg gtgcaactga gcgagttcgt cggcgagtcg
agtccggtgg atgtgggttg 20400gtcgttggtt tcgacgcgtg cggcgttcga
gcatcgggcc gtggtggtgg ggcgcgggcg 20460ggacgagttg gtgcggggct
tgtccgaggt cgcgcagggt cggggcgtga ggggtgtcgc 20520gtcttcggcg
tcgggtggtc tcgcgtttgt ttttgctggt cagggcagtc agcggttggg
20580gatggggcgg gggttgtatg agcggttccc ggtgtttgcc gaggcgttcg
acgaggtgtg 20640tgggcgggtc ggtccggggg tgcgggaggt tgttttcggt
tcggatgcgg gtgagttgga 20700ccggacggtg tgggcgcagg cggggttgtt
cgcgttggag gtggcgctgt ttcggttgtt 20760ggagtcctgg ggtgtgcggc
cgggttgtct gatcgggcat tcggtcggtg agttgtcggc 20820ggcgtgtgtg
gcggggttgt ggtcgttgga ggatgcgtgt cgggtcgtgg ctgcccgggc
20880gcggttgatg caggcgttgc cggcgggtgg ggtgatggtc gcggttcggg
ccgaggcggg 20940ggagctggcc ggtttcctcg gtgaggacgt ggtgatcgcg
tcggtgaacg cgccggggca 21000ggtggtgatc gctggtcctg aggggggtgt
ggagcgtgtg gtggctgctt gtggggcgcg 21060gtcgcgtcgt ctggcggtct
cgcatgcttt tcattcgcct ttggtggagc cgatgcttgg 21120ggagttccgt
cgggttgtgg agtcggtggc gttcggtgtg ccgtcgttgc gggtggtttc
21180caatgtcacg ggtgcgtggg tggatccgga ggagtggggg acgccggagt
actgggtgcg 21240tcaggtccgt gagccggtgc gtttcgccga cggggtcgcc
acgttgctcg acgcgggtgt 21300gaggacgttc gtcgagctgg gtcccgccgg
ggcgctcact tcgatggtca gccactgcgc 21360ggacgccacc gccacttcgg
tgacggctgt acctaccttg cgccccgatc acgatgagtc 21420gcggaccgtg
ttgagtgccg cagcgtcctt gtacgtccag ggtcacccgg tcgactgggc
21480cccgctgttc ccgcgggccc gcacggtgga cctgcccacc taccccttcc
agcaccagca 21540ctactggctc gacgtacctc ctctgttcac cgcctcctcg
gcggcccagg acggtggctg 21600gcgataccgc atccactggc ggcggctcgg
cacgagggac tccggggacc ggctctccgg 21660ccgctggttg ctgctggtgc
ccgagtcgga cgggacggag ccctgggtgg agggggccga 21720gaagatgctg
gccgagcgcg ggtgcgaagt cgtccacgtg ccgatcgcgg cgacggccga
21780ccgggacgcg atggtcggag ccgtgcgtga gagcgtcgag gacggtcggg
tcgacggtgt 21840gctcagcctg ctggcgctcg acggccgccc gcaccccgat
gcggctgcgg tgccgacagg 21900gttggtcgcc acggcgcagg ttgtgcaggt
cagtgacgag ctgggcatcg gcccgctgtg 21960ggtcgccacc cgacaggcgg
tctccgtcga cggggccgat gaggctgacg gggccggtag 22020gaccaggaag
gccgacgacc ccgccgatgt cgcgcaggcc gctgtgtggg ggctcggccg
22080ggtcgccgcg ctggagaagc ctcggttgtg gggcggcctc gtcgacctgc
ccgcacgtgc 22140cgacgaacgg atgcgggacc tggtggctca ggccctcacc
gctcccgacg ccgaggacca 22200acttgccgtg cgggccgacg gcatcgccgt
tcgccgactg gtacgctccg ccgcgtcggc 22260cccggccgac gactggcagc
cgagcggcac cgtgctggtc accggcggca ccggaggcgt 22320cggagccaac
gtggcgcgtt ggctggtcac ccaggacatc cagcacctgt tgctggtcag
22380ccggcgcggc ccggacgccc ccggagccgc tgagctgctg gccgaactca
gcgcctcagg 22440aacgtccgtg accatcgagc cctgcgacgt caccgacgcg
gacgcggtac ggcgcctgat 22500cggcgccgta ccggccgaac ggccgctgag
cacggtcgtc cacgccgcgg gcgtactgga 22560cgactgcttg atcgacgccc
tgaccccgca gcgcctcgcc gccgcactgg aggtcaaggc 22620caagggcgca
ctgaacctcc acgaggcggc cggggaagcc cacttggtgc tcttctcctc
22680gctggccgga acaaccggaa ccaagggaca gggcaactac gccgccgcaa
acgcctatct 22740cgacgctctg gccgaacggc ggcgtgctga cggcctgccc
gccacttcgg tcgcctgggg 22800cgcctggcag ggcgcgggca tggtggccga
cgccgccgta gcccaccgca cgcgccgtta 22860tggcctcccg ctcatgagcc
ccgaccgcgc cgtcgccacc ctgcggcagg tcatggccga 22920gccggtggcc
acgcaggtgg tggcggacgt cgactggcag cgattcgtcg ccgacttcac
22980cgcggtgcgc cccagccgcc tcctcgccga cctgccggaa gtgcgctccc
tgggcgagca 23040gcgaaaggac ggcccgggcg gtcagggcga ggaggacggc
ttggccagca agctggcagc 23100cctgcccgaa gccgaccgcc gacgagccgt
gctggacctc gtggaggaac tcgtcctcgg 23160ggttctgggc cacgagacgc
gcgcggcgat cggcccggac agttccttcc acgccatcgg 23220cttcgactcg
ctcaccgccg tcgaactgcg caacctgctg accgtacgcc tcgggatgaa
23280gctgcccgcg accctcgtct acgatcaccc gaccctgtcg tcgctggccg
accacctgca 23340cgagcaactg gttatcgacg gcacccccat gacggacacc
gcggccgacc tgctcgccga 23400actcgacgca ctcgcggcga gactcgccgc
cgtcgggctg gaaccggagg cgcgcgcccg 23460catcggacgc aggctcaagg
acatgcagac cgcctgcgaa cccaggtcgg agtcctcacg 23520cgacctgaag
tccgcctcac gcaccgaagt gctcgacttc ctcaccaacg aactcggcat
23580ctcccgctga ccagttgacc gaccgcgacg aacggcgcac ctggctgcgg
ctcgtccacg 23640ccgaccttcg accttgcccg acgcccccgg gagcggacta
ccaccatgcc caacgacgaa 23700gaactcctcg actacctgaa gcggactgcc
tcgaacctcc aggaggcgcg gcagcgggtg 23760cacgaactgg aggagagcga
gcgcgagccg atcgcgatcg tggggatgag ctgccgtctg 23820cccggcgggg
tgaacagccc ggaagagttc tggtcgctgc tggaggccgg gacggacgcc
23880gtctcggagt tcccgcggga ccgtggctgg gacgtggagc ggctgtacga
cccggacccg 23940gacgcccccg gcaagtcgta cgtgcgggaa ggcggattcc
tcgacggcgc gggccggttc 24000gaccccgcgt tcttcggtat ctccccgcgg
gaggccgtgg tcatggatcc gcagcagcgg 24060ctgctgctgg agtgctcgtg
ggaggcgatc gagcggtcgc ggatcgaccc gaagaccctg 24120cacggcagcc
gcgcgggcgt gttcgtgggc tcgaacggcc aggactacgg gacgcttctc
24180ctgcgtgccg acgaccgctc ccacgcctac ctcgccacgg gcgcctccgc
gagcgtgctc 24240tccggccgca tctcctacac gctcggactg gagggccctg
cggtcacgat cagtacggcc 24300tgctcgtcct cactggtcgc cctccacctg
gcggcccgcg ccctgcgggc gggggagtgc 24360gagctggcgc tcgccggcgg
tgtgacggtc atgccgacga cccgcctgtt cgaggtcttc 24420tcccggcagc
gtggcctggc cggtgacggc cgctgcaagg ccttcgcggc cggggccgac
24480ggcactggct ggggcgaggg cgtgggcgta ctcgtcctgg agcggttgtc
ggtggcgcgg 24540cggaacggtc atcgggtgtt ggcggtggtg cggggttcgg
cggtgaacca ggacggtgcg 24600tcgaacggtc tgacggcgcc gaacggtccg
tcgcagcagc gggtgatccg cgcggccttg 24660gccagtgcac gcctggcccc
cgaggacgtg gacgccgtag aggcacacgg cacggggacc 24720tccctgggcg
acccgatcga ggcgcaggcg ttgctggcga cgtacgggcg gggccgggac
24780gcggagcgtc cgttgtggct ggggtcggtg aagtcgaaca tcggtcacgc
gcaggccgct 24840gccggtgtcg ccggtgtcat caagatggtc aaggcgatgc
aggcgggcac gctgccccgg 24900acgctgcatg tggatgagcc gtcgggtgag
gtggactggg actcgggtgc ggtgcggctg 24960ctgaccgagg cgcgggactg
gccgtcggag gaaggtcgtc tgcggcgggc cggtgtgtcg 25020tcgttcggga
tctccggcac caacgcgcac gtgattctcg aggagccgcc ggcggaggac
25080gcggtaccgg agcctgaagc gggtgatgtg gtgccgtggg ttctttcggc
gcggtcggct 25140gaggcgttgc gggagcaggc tgcccggctg gcgtcggtgg
ctggtgggtt gaacgtggtg 25200gatgtgggct ggtcgttggc ttcgacgcgt
gcggcgttcg agcaccgggc cgtagtggtg 25260gggcgggagc gggaagagct
gctcgcgggt ctgttcgctg tggctgcggg acgcccggct 25320gcgaacgtgg
tgacggggcc cgtcagctcc ggtcggcccg cctttgtttt tgctggtcag
25380ggcagtcagc ggttggggat ggggcggggg ttgtatgagc ggttcccggt
gtttgccgag 25440gcgttcgacg aggtgtgtgg gcgggtcggt ccgggggtgc
gggaggttgt tttcggttcg 25500gatgcgggtg agttggaccg gacggtgtgg
gcgcaggcgg ggttgttcgc gttggaggtg 25560gcgctgtttc ggttgttgga
gtcctggggt gtgcggccgg gttgtctgat cgggcattcg 25620gtcggtgagt
tgtcggcggc gtgtgtggcg gggttgtggt cgttggagga tgcgtgtcgg
25680gtcgtggctg cccgggcgcg gttgatgcag gcgttgccgg cgggtggggt
gatggtcgcg 25740gttcgggccg aggcggggga gctggccggt ttcctcggtg
aggacgtggt gatcgcgtcg 25800gtgaacgcgc cggggcaggt ggtgatcgct
ggtcctgagg ggggtgtgga gcgtgtggtg 25860gctgcttgtg gggcgcggtc
gcgtcgtctg gcggtctcgc atgcttttca ttcgcctttg 25920gtggagccga
tgcttgggga gttccgtcgg gttgtggagt cggtggcgtt cggtgtgccg
25980tcgttgcggg tggtttccaa tgtcacgggt gcgtgggtgg atccggagga
gtgggggacg 26040ccggagtact gggtgcgtca ggtccgtgag ccggtgcgtt
tcgccgacgg ggtcgccacg 26100ttgctcgacg cgggtgtgag gacgttcgtc
gagctgggtc ccgctgggac gctcacttcg 26160atggtcagcc actgcgcgga
cgccaccgcc acttcggtga cggctgtacc taccttgcgc 26220cccgatcacg
atgagtcgcg gaccgtgttg agtgccgcag cgtccttgta cgtccagggt
26280cacccggtcg actgggcccc gctgttcccg cgggcccgca cggtggacct
gcccacctac 26340cccttccagc accagcacta ctggatgatg aacaccggaa
gtgccgccga gccggcggag 26400ctggggctcg gcgatgcccg tcatccgctg
ctcggttccg tcgtcaccgt cgcgggggac 26460gacaaggtcg tcttcgccgg
gcggctggcg ctgcgcacac acccctggct ggccgaccac 26520accgtgctcg
acgcggtctt gctgcccgct acggccttcc tcgaactggc cgtgcgcgcc
26580ggtgaggagg tgagctgtcc ggtcgtacac gacctgacgc tgcaccgacc
gctggtcgta 26640cccgagcggg gcgccgtgca ggtacagatg gctgtgggcg
caccggaagc cgatgggcga 26700cgtgaggtcc gggtgtactc ccgccccgac
gacgacgcgg agcacgagtg gacgctgcac 26760gccgctggac tgctggcgtc
ggccgccacg gcggagcccg ccgtggcggc cggtgcctgg 26820ccgccgccgg
aggcgcaggc cgtggacctc gacggcttct acgccggact cgccgagcac
26880ggctaccact acggcccgct gttccagggc gtccgggccg cgtggcggct
gggcgacgac 26940gttctcgccg agatcgtgct gcccgaggcg gccggcgccg
acgccgcccg gtacggcatg 27000catccggccc tgctcgacgc cgtcctgcac
gcggcacggc tgggcgcctt ccgtgagcgg 27060tcggaggaga agtacctgcc
gttcgcctgg gaaggcgtga ccctgcgtac caggggagcg 27120accgccgtac
gtgctcgaat ctcccgggcc ggtaccgacg ccatccggct ggacgtcacc
27180gacaccgcgg accggccggt cctcacggcc gaatcgctca cgctgcgacc
ggtctccgcc 27240ggtcagctca tggccgtccc gcgcgactca ctgttccggg
tcgactgggt ttccgcgccc 27300gccgcgaacg gtcccggcct gcggctggcc
cgtgccgcca ccgtggaggc ggccctcgcg 27360gcggacgccg acatcgtggt
cgtgccatgc ctcgacagtg agggtccgca tcaggcgacg 27420taccaggcac
tggagctgct acagcgctgg ctggccgccg acaccggtac caccacgctc
27480gccctgctca cccaccgtgc cgtggcggtc ggcgacgacg tccacgacct
ccaccacgcg 27540cctctgtggg gcctggtccg caccgcccag accgaacacc
ccggctgctt ccggctcgtc 27600gactcggacg accccgaccc gacgacggac
gtcctggccg cggcgctcgc caccggggaa 27660ccccaggtcg cgatccgtga
cggcgccgtc ctggccccgc ggctgaccgc ggcctccgcg 27720ccgcgggagc
cggccgagtg ggacgccgag ggaacagtcc tcatcaccgg cggatcgggc
27780gccctcgcag ggatcgtggc ccagcacctc gtcgcacgtc acggcgtacg
ccgactcgtc 27840ctcgcgagcc gcagcggcag gcccgcaccg ggggccgacc
tgctcgacgc cgacgtcacg 27900gccgtgtcct gcgacgtctc cgaccgcgac
gccgtggccg cgctgctcgc ctccgtgccg 27960gacgaacacc cgctcaccgc
cgtcgtgcac accgcaggcg tactggacga cggcgtcctg 28020cacgccctca
cgaccgagcg catcgacacc tcgttcgcgg cgaaggtcga cggcgcccgt
28080catctccacg aactcacctc ccacctggat ctcaccgcgt tcgtgctgtt
ctcctccgcg 28140tcggccgtgc tgggcgccgc cggacagggc aactacgccg
cggccaacgc ctacctcgac 28200gcgctcgccg cccaccgtcg cagcaacgac
ctgcccgccg tgtctctcgc gtgggggctg 28260tgggccgagc acgagggcat
ggcccgcgga ctcggtgacg ccgagctgac gcgtatttcc 28320cggatcggcg
tcaccgcgct gagcgcggag gacggcatgc ggctgttcga cgccggatgc
28380gccggcgatc agtcacagct cgtgccgatg cgggtggaca ccgcggcgct
gcgcgcacgg 28440cgtgaccacc ttcccgcacc gatgtggagc ctggtccccg
agcggacccg agcggcacgt 28500acacagcctg ccgcctcgct tcgggacagg
ctcgccgaac tgaccgcccc cgaacgcaag 28560cgcacggtcc tcaacctggt
gcgcaacgcg gtcgccgaca cactcggcca caacgccgcc 28620gacggagtac
cgcccgacca gagcctcgac gccgccgggt tcgactcgct caccgccgtc
28680gagttccgca accggctctc cgccgtcacc gacctgcgcc tgcccgccac
cctcacctac 28740gatcacccca cccccgcggc catcgccgag cacatcctga
cccgcctcac cctgctgaag 28800gagaccgccg ccccggccgt cggcaccgcc
ccggttgcgg cgccgaccga agacgatgcg 28860atcgtcatcg tgggcatggc
gggccgcttc cctggcggcg tgcgcacacc cgaaggtctt 28920tgggacctcg
tccactccgg cacggacgcc atctcggagt ggcccaccga ccgcggctgg
28980gacgtggaga acctctacga cccggacccc gacgccgtcg gcaagtccta
cgtacggcac 29040ggcggattcc tgcacgacgt cgccggcttc gacgcgggct
tcttcgggat ctcgccgcgt 29100gaggcgctgg cgatggaccc gcagcagcgg
ctcctgctgg agtgctcgta cgaggccctg 29160gagcgggcgg gcatcgaccc
ggccacgctc agaggcagcc ggtcgggcgt gtacgccgga 29220gtgatgtacc
acgagtacgc ctcccggctg ggcgccacgc ccgcaggctt cgaaggcaca
29280ctcggcaccg gaagctcggg cagcatcgcc tccgggcgca tctcctacac
attcgacctc 29340accgggcccg cggtcaccgt cgacaccgca tgttccacct
ccctcgtagg cctgcacctg 29400gccgtgcagg ctctgcgggc cggtgagtgc
gaactggccc tcgccggcgg cgtcaccgtc 29460atgcacacgc cgcgcccctt
cgtcgagttc tcccgccagc gcggcctggc cgcggacggc 29520cggagcaagg
ccttcgcggc ctccgccgac ggggtggcct gggccgaagg cgccggaatc
29580ctcgtcctgg agcggctgtc ggcggcgcgg cggaacggtc atcgggtgtt
ggcggtggtg 29640cggggttcgg cggtgaacca ggacggtgcg tcgaacggtc
tgacggcgcc gaacggtccg 29700tcgcagcagc gggtgatacg tgcggccttg
gcgagtgccg ggctgggtcc ggccgatgtg 29760gatgtcgtcg aggcccacgg
caccggcacg gccctcggcg atccgatcga ggcgcaggcg 29820ttgctggcga
cgtacgggcg ggggcgtgac gcggatcgtc cgttgtggct ggggtcggtg
29880aagtcgaaca tcggtcacac gcaggcggcc gcgggtgtgg caagcgtgat
caagatggtg 29940caggcgatgc aggcgggcgt gctgccgcgg acgctgcatg
tggacgagcc gtcgggtgag 30000gtggattggg actcgggtgc ggtgcggctg
ctgaccgagg cgcgcgagtg gccgtcgggg 30060gaggggcgtg tgcggcgggc gggtgtgtcg
tcgttcggga tctccgggac gaacgcgcac 30120gtgatccttg aggagccgcc
ggcggaggac gcgctgccgg agcctgaagc gggtgatgtg 30180gtgccgtggg
ttctttcggc gcggtcggca gaggcgttgc gggagcaggc tgcccggctg
30240gcgtcggtgg ctggtgggtt gaacgtggtg gatgtgggct ggtcgttggc
ttcgacgcgt 30300gcggcgttcg agcaccgggc cgtcgtcgtg ggaggcgatc
gggaagagct cctggggaag 30360ctttcctcgg tttcgggggt cgaggtcggg
gtcggggtcg gtgccggtgg tggtgtggtg 30420ttggtgttcg ccggtcaggg
gtgtcagtgg gtcggtatgg ggcgggagtt gctgggttcc 30480tcgctggtgt
tcgcggagtc gatgcgggag tgcgcggcgg ctctgtcgcc gtttgtggac
30540ttttctgtgg tggatgttct gggttcggct ggggagttgg gtcgggtcga
ggtggttcag 30600cctgcgttgt gggcggtgat ggtgtcgctg gcgcgggtgt
ggcggtcgtg gggtgttccg 30660gttgctgcgg tggtgggtca ttcgcagggt
gagattgccg cggcgacggt ggcgggtgcg 30720ttgagtgtgg gtgatgcggc
gcgggtggtg gcgttgcgga gccgtttgat cgcggagcgt 30780ctgtcggggc
tgggtgggat ggtttcggtg gcgttgtcgc gtgagcgggt ggtgtcgttg
30840atcgcgggtg tgccgggtgt gtcggtggcg gcggtgaacg gttcttcgtc
gacggtggtc 30900tcgggtgagg ccgcggggct ggagagggtg ctggccgcgt
gtgtgtcgtc gggggttcgg 30960gcgcgtcgta tcgatgtgga ttacgcctcg
cattcggtgc aggtggagtt gatccgtgag 31020gagttgttgg gggttctgga
cgggatcgtc ccgcgctcgg gtgagattcc gttcgtgtcc 31080acggtgacgg
gtgagcggat cgacactgtc gagctggggg cggagtactg gtaccgcaat
31140ctccgtcaga cagtggaatt ccagtcggtg gtggagggtc tggtcgctca
ggggtgtcgg 31200gtgttcctgg agtccagtcc gcatccggtg ttgacggtcg
gcatcgagga gtccgcggat 31260cgggtcgtgg cgttggagtc gctgcgtcgt
ggcgagggtg gtctgcggcg gttggtggat 31320gcggccggtg aggcgtgggt
gcgtggggtg ccgatcgact gggcggggat gctcgccggc 31380ggccggcggg
tcgacctgcc cacctatccc ttccaacacc agccctactg gctcgactca
31440ccacgacacc ctgccggaga cgtgaccgcc gtcggtctca cagaggccgg
tcacgcgttc 31500gtgccggcgg cggtcgacct gccggacggg cagcgggtct
ggacgggacg actgtcgctt 31560ccctcctacc cgtggctggc cgatcatcag
gtgctcgggc aggtgctgct ccccggcgtg 31620gtctgggtcg aactcgccct
gcacgcgggg caccaggccg gatgcgactc tgtcgatgag 31680ctcaccctac
agtcgccgct cgtgctcggt gcgtccgaca ccgtacaggt gagggtcgtc
31740gtcacggaga ccgaagagcc cggcacccgc accgtgtcga tgcactcgcg
ccgtgacgac 31800ggcagctggg tgactcacgc cgaggggatc ctcggggcgg
gcgggccgcc gccggagccg 31860ctgccggaat ggccgccgac cggcgccatg
cccctcgatg tcgagggctt ctacgacgag 31920ctcgcggcgg gcggctacca
ctacgggcct cagttccgct gcctgcggcg cgcctggcgt 31980gccggtgagg
atctcgtcgc cgagatctcg ctgccggagg gcaccgacgt cgatgcgtac
32040ggcctgcacc ctggactctt cgacgcggcg gtgcacagcg tggcctgcgc
ccggacgagc 32100gcgggggccg gcgatgacgg tccccggctg ccgttcgcct
tctcggacgt ccggctcttc 32160gcgaccgggg tgacctcgct acgggtccgg
atcgatccgc agaactcctc gtggcaggcg 32220tgggacgaat ccgggctgcc
ggtcctcacc atcgggcggc tcgccggccg gcctgtcgac 32280gccgatcagt
tcgccgtgcg gcgggcgggc cacctcttcc gcgtcgaaac gcggcacgaa
32340gccctggccg gcccggcccc cgcctcctgg gcggtcatcg gagcggaccc
ggccgggtac 32400gccgcagccc tggaggccac gggcgcgcag gtgacgacgg
ctgccgacct ggccggtctc 32460acatcggcac ccgaagccgc cctgttcacg
ctccccggca caaaggacgc gggggtcacc 32520gaggaggtgc cgaccgctgt
ccgggaggcg accgctcagg tgctggaggt gctgcaggac 32580tggctcaccg
acggacgttt cgacgatgcc cgactggtcg tcgtaagccg cgaagcggaa
32640gacggcgatc tcctccacgg aacggcgcgc ggactgctgc gcgccgcaca
ggccgagcac 32700ccggaccgca tcacccttgt cgacctcgat gctcatcccg
cctcgctcac ggcccttccc 32760ggtttcgccc tcggtcccga accggaggtc
gtcgtacgcg cgggagacgg cagggcaccg 32820cgcctggccc gggcgcaggc
ccccaccgga gcgggctcac tgggcacggg cacggtcctg 32880atcaccggag
gcacgggcac cctcggggga ctgctcgccc ggcacctggt ggagacgcac
32940ggagtcaccc ggctgctgct ggtcagccga cgaggaccgg ccgccgacgg
cgcggaccgg 33000ctgcacgccg agctcaccgg gcatggcgca cacgtcgaca
tcgtggcggc cgacctcggc 33060gaccgcacga gcgtggccgc gctcctcgcc
acggtcgacg ccgaccaccc cctgtcggcc 33120gtcgtgcacg ccgccggagc
gctggacgac ggcgtgctcg gcacccggtc cgccgactgg 33180ctcgacccgg
tcctgcgccc caaggcggac gccgcttggc acctgcacga actcaccgcc
33240gaactgcctc tgaccgcctt cgtcatgttc tcctcggccg catccgtgct
cggcgcggcg 33300ggacaggcca actacgccgc ggccaacgga tttctggacg
cactggccgc ccatcgtgcc 33360gcccggggac tgcccgggac ctcgctggcc
tgggggctgt gggagcaccg cagcgaactg 33420acccggcaca cgggctcccc
ctcccgcagc atcgcggccg tcggcgctct gtccaccgcg 33480gaggcccttg
ccgccttcga cgccggcctg gcctccgggg agccgctggc agtgccgatc
33540cggctggagt cgacatccag tgaggaggta ccgccgatgc tgcgcggcct
ggtccgcgta 33600cgccgccggg ccgccaccgg cacggaaccc gcggcgagcg
cgggcgccgc gcaggaggtc 33660cggcagctgg ccgagttggg cgccgacgag
cgacagcggc gcgtgcagcg gatcgtgctc 33720gacaccgcgg cggccgtcct
cggccatgac agccacgacg ccatccccct cacccggggc 33780ttcctggagc
tggggttcga ctccctgaca gcggtacggc tgcgcaaccg gctcgcccgc
33840cgactggggc tgcgcctgcc ggccacggtg gtgttcgacc accccagccc
ggccgccctg 33900gccgcccacc tggtcgagca tctcgtgggc accgtcgacc
cgaccgcgca ggccatggag 33960cagctggagg ctctgcgccg cagcgtgcac
gcagccacac ccgccggtgg cctggaccgc 34020gccctggtga cccaacgcct
gacggccctg ctcgacgaaa tgcggcacgt cgacggcccc 34080ggcggcaccg
aaggccccga cggctccggg gacgacctgg agaacgcgac agcggacgag
34140atctacgccc tcatcgacaa cgaactgggc atcgggggta cgcagtgaac
ggcgacgaca 34200aagcactggc ctatctcaag cgggtgaccg cggacctgcg
gtcggcgaga gccaggctgc 34260aggaactgga gtccgccgac accgacccca
tcgccatcat cggcatgggc tgccgtctgc 34320ccggtggcgt gcgcaccccc
gaggacctgt gggacctcgt ggagaagaag catgacgcga 34380tcggcccctt
ccccgccgac cgcggatggg acctcgagaa cctgtacgac cccgacccgg
34440acgcgccggg caaggcctac gtccgcgaag gtgggttcgt ccacgacgtc
gccggcttcg 34500acgcgggctt cttcggaatc tcgccgcgtg aggcgctggc
gatggacccg caacaccggc 34560ttctgctgga gtgctcgtgg gaggccctgg
agcgggcggg catcgaccct tcctccctcg 34620agggcacccg caccggcgtc
tacaccgggc tcatgaccca tgaatacgcg acccgactgc 34680cctcgatcga
cgaggagttg gagggtgtca tcggcatcgg caacgccgga agcgttgcct
34740cgggccgcgt ctcctacacg ctcggcctga acggccccgc tgtcaccgtc
gacacggcct 34800gctcctcctc gctcgtcgcc ctgcacctcg ccgcccaagc
cctgcgccag ggccagtgca 34860cccttgcgct ggccggaggt gcctccgtca
tcgcggcgcc gaccgtgttc gccaccttca 34920gccgacagcg gggcctcgcc
cccgacggcc gctgcaaggc gttctcgtcc acgaccgacg 34980gcacgggctt
cggcgagggg gtgggcgtac tggtcctgga gcgcctctcg gacgcccgtc
35040gcaacggaca cgaggtcctg gccgtcgtac ggggctcggc ggtcaaccag
gacggagcca 35100gcagcggatt caccgccccg aacggaccgt cccagcagga
cgtcatccgc gaggccttgg 35160ccgacggtcg actgacccct gcggacgtgg
acgtcgtgga gggtcacggt acggggacgc 35220ggttgggtga tccgatcgag
gcgcaggcgt tgctggcgac gtacgggcgg gggcgtgacg 35280cggatcgtcc
gttgtggctg gggtcggtga agtcgaacat cggtcacacg caggcggccg
35340cgggtgtggc aagcgtgatc aagatggtgc aggcgatgca ggcgggcgtg
ctgccgcgga 35400cgctgcatgt ggacgagccg tcgggtgagg tggattggga
ctcgggtgcg gtgcggctgc 35460tgaccgaggc gcgcgagtgg ccgtcggggg
aggggcgtgt gcggcgggcg ggtgtgtcgt 35520cgttcgggat ctccgggacg
aacgcgcacg tgatccttga ggagccgccg gcggaggacg 35580cgctgccgga
gcctgaagcg ggtgatgtgg tgccgtgggt tctttcggcg cggtcggcag
35640aggcgttgcg ggagcaggct gcccggctgg cgtcggtggc tggtgggttg
aacgtggtgg 35700atgtgggctg gtcgttggct tcgacgcgtg cggcgttcga
gcaccgggcc gtcgtcgtgg 35760gaggcgatcg ggaagagctc ctggggaagc
tttcctcggt ttcgggggtc gaggtcgggg 35820tcggggtcgg tgccggtggt
ggtgtggtgt tggtgttcgc cggtcagggg tgtcagtggg 35880tcggtatggg
gcgggagttg ctgggttcct cgctggtgtt cgcggagtcg atgcgggagt
35940gcgcggcggc tctgtcgccg tttgtggact tttctgtggt ggatgttctg
ggttcggctg 36000gggagttggg tcgggtcgag gtggttcagc ctgcgttgtg
ggcggtgatg gtgtcgctgg 36060cgcgggtgtg gcggtcgtgg ggtgttccgg
ttgctgcggt ggtgggtcat tcgcagggtg 36120agattgccgc ggcgacggtg
gcgggtgcgt tgagtgtggg tgatgcggcg cgggtggtgg 36180cgttgcggag
ccgtttgatc gcggagcgtc tgtcggggct gggtgggatg gtttcggtgg
36240cgttgtcgcg tgagcgggtg gtgtcgttga tcgcgggtgt gccgggtgtg
tcggtggcgg 36300cggtgaacgg ttcttcgtcg acggtggtct cgggtgaggc
cgcggggctg gagagggtgc 36360tggccgcgtg tgtgtcgtcg ggggttcggg
cgcgtcgtat cgatgtggat tacgcctcgc 36420attcggtgca ggtggagttg
atccgtgagg agttgttggg ggttctggac gggatcgtcc 36480cgcgctcggg
tgagattccg ttcgtgtcca cggtgacggg tgagcggatc gacactgtcg
36540agctgggggc ggagtactgg taccgcaatc tccgtcagac agtggaattc
cagtcggtgg 36600tggagggtct ggtcgctcag gggtgtcggg tgttcctgga
gtccagtccg catccggtgt 36660tgacggtcgg catcgaggag tccgcggatc
gggtcgtggc gttggagtcg ctgcgtcgtg 36720gcgagggtgg tctgcggcgg
ttggtggatg cggccggtga ggcgtgggtg cgtggggtgc 36780cgatcgactg
ggcggggatg ctcgccggcg gccggcgggt cgacctgccc acctatccct
36840tccaacacca gccctactgg ctcgactcac cacgacaccc tgccggagac
gtgaccggcc 36900cgggcgacga cgagttctgg gcggccgtgg agcacggtga
ggcgaccgag ttggcggacc 36960tgctccggag gtcggcggcg gagccggggc
aggatcttca cgcacccgtc gcggccctgc 37020tgccgacgct tgcaacgtgg
cgtcgggacc ggcagcgcag ggcggctgtg gactcctggc 37080ggtaccggat
cgtatggcgt ccggtcgcca cgccctcgta cgacagggtg ctgtcggggc
37140gctgggctgt cgtcgtgccc gccggtcacg aggacgaccc cgtcgtcgac
tgggtctgct 37200cggcgctgcg ggaccacggg ggcgagcccg aacgcatggt
gctgggcccg cgggagagcc 37260gttcggcgct ggccacgcgg ctggccgccg
atccccccgg gggcgtggtc tccctgctcg 37320gactgagcgg ggcggcgcac
cccgaccacg aggtgctgcc cagtgccgtc gccggtaccg 37380tcctgcttgc
ccaggccctc tccgacggcg ccgtacgagc accggtgtgg accctgaccc
37440gcaacggtgt gtccgcgacg gcgacggacc cggtggctcc cacgcacgcc
gcgcaggtgt 37500gggccgtggc acgggtggcc ggtctggagc acccggaggc
gtggggtggt ctgctcgacc 37560tgccggaccg tctcgacgac cgcgcggccg
cccggttcgc cgcggtcctg tccgcgggcg 37620aggacgagga ccaactggca
ttacgcgacg ctgggttgct ggcacgaagg ctggtgcgtg 37680cccccgttcc
gcgcgacgcg gtgaccgccg gctggcagcc ccgcgacaca gcgctcgtca
37740cgggcggcac cggcggtctc ggcgggcagg tcgcccgctg gctggcggcc
gcgggcgtac 37800ggcacctcgt gctggtcagc cgtcgggggg cggaggcgga
gggcgcagac cgtctgcgcg 37860acgacctcac cgccctcggc gtacaggtga
cgttcggcgc gtgcgacgtc gcggaccgcg 37920ccgcgctctc ggcgctcctc
gaccgggttc aggaggacgg cccgccgatc cgcacggtcg 37980tgcacgcggc
gggctccggt cgcgccgcca ggctgctgga caccgacgcc gaggagaccg
38040cggcggtgct gcgggcgaag tcggccggag cccggaacct gcacgaactc
ctcgatgacg 38100tggacgcgtt cgtgctgttc tcctccggag cgggtgtgtg
gggaagcagc gcccagggcg 38160cctacgcggc ggccaacgcc tacctggacg
cactggccga acagcgcagg ggccaggggc 38220ggccggcgac ctccgtcgcc
tggggcgcct gggccggtga cggcatgaca gccgccgccg 38280gcgaggaatg
gtggagcagg cagggtctgc ggttcatggc ccctgaggcc gccctcgacg
38340cgctgcgcca ggccgtcgac cgcgccgaga gcacgctcgt cgtcgcagac
atcgactgga 38400agacgttcgc tcccctcttc acgtcggccc gcagccgccc
cctcatcacc gacatacctg 38460aagcccgccc cgaaccgagg ccggaaggcg
cggaccagcc tacgcagggc ctcgtggcca 38520agctggcggt gctgtccgcg
gacgaacggc ggcgcgccct gctcgccgag gtgcgggcgc 38580aggcagcggt
ggtgctcggc caccccggcg cggacgccgt accggtcgac cggccgttcc
38640gcgagctcgg attcgactcc ctcagcgcgg tgaaactgcg caacaggatc
gttgctgcca 38700ccgggctcga gcttccggcc accctggtct tcgaccaccc
cacgtccacg gcgctcgccg 38760cctacctggg cgcccggctc ggaatcgacg
gcgcccccgc ggggtccact ctgctggaag 38820acctcgcgcg gctcgagtcc
accgtcgcca ccctgaccgc ggcacctctc gcagagaccg 38880tgccggacgc
ccgggaccgc gcggcgctca ccacacggct gcgggcgttg ctggagcggt
38940gggaccaggc cgatggcgag gaccaggccg ccgcccgaga agaactcgac
gatctgagcg 39000acgacgacct cttcgacttc atcgacgcga agttcggccg
ttcgtgacct cggtcggccg 39060ccgccaactc cacgtacacc ccgaagacca
cgatcaccac gcgaaaagga cgggcctctc 39120catgggggac gagcagaaac
tccgcaccta cctccggcgc gtcactgccg acctggccga 39180cgtgacggag
cggttgcagc gagcagagga caagaacgcc gagccgatcg cgatcgtcgg
39240catggggtgc cgctaccccg gtggggtgcg gtcgcccgag gagttctgga
acctgctcga 39300cgaaggcgtc gacgcagtgg ccggcttccc cgaggaccgt
ggctgggacc tggagaacct 39360gtacgacccc gaccccgacg agccgggtaa
gtgctatgcc cgcgaaggcg ggttcctcta 39420cgacgcgggc gagttcgacg
ccgcgttctt cgggatatcg ccccgcgagg ccctgtccat 39480ggacccgcag
cagcggctgc tgctggagtg ctcctggagt gccctcgagc gggcgggcat
39540cgacccgggc tcgctgcgcg gcaaagacgt cggcgtgtac gtcggcgcat
ggaacagcaa 39600ctacggcagg ggcggcgggg cggagagctc cgagggccac
ctgctgaccg gcaacgcctc 39660cagcgtggtc tcgggtcgcg tggcgtacgt
gctggggctc gaaggccccg ccgtcaccat 39720cgacaccgcc tgttcctcct
ccctggtcgg cctgcacctg gccgcccagg ccctcaggtc 39780cggcgagtgc
ggtcttgcgc tggccggcgg cgtcaccgtg atgtccaccc ctctgtcgct
39840ggtgtccttc tcccggcagc gggggctcgc acaggacggt cgttccaagg
cgttctcggc 39900ggacgccgat ggcatgggca tggccgaagg tgtaggcgta
ctggtcctgg agcgcctctc 39960ggaggcgcgc cgcaacgggc acgaggtcct
ggccgtcctg cggagctcgg ccgtgaacca 40020ggatggtgcc tcgaacggtc
tgagcgcccc gaacgggccg gcgcagcagc gtgtcatcca 40080gtccgccctg
accgtcggcc gtctcgcccc ctccgacatc gacgtcgtcg aggcccacgg
40140caccggcacg gccctcggcg atccgatcga ggcgcaggcg ttgctggcga
cgtacgggcg 40200ggggcgtgat gcggatcgtc cgttgtggct ggggtcggtg
aagtcgaaca tcggtcacac 40260gcaggcggcc gcgggtgtgg ccggggtcat
caagatggtg ctggccctgc gcaagggcgt 40320actgccgcgg acgttgcatg
tggatgagcc aaccggtgag gtggattggg actcgggtgc 40380ggtgcggctg
ctgaccgagg cgcgcgagtg gccgtcgggg gaggggcgtg tgcggcgggc
40440gggtgtgtcg tcgttcggga tctccgggac gaacgcgcat gtgatcgtcg
aggaggctcc 40500ggaggaggag ccccggccgg aggctccttc cgtcgacgtg
gtgccgtggg ttctttcggc 40560gcggtcggca gaggcgttgc gggagcaggc
tgcccggctg gcgtcggtgg ctggtgggtt 40620gaacgtggtg gatgtgggct
ggtcgttggc ttcgacgcgt gcggcgttcg agcaccgggc 40680cgtggtggtg
gggcgggact ccgaggaatt ggtgtcgggg ctttcctcgg tttcgggggt
40740cgaggtcggg gtcggggtcg gtgccggtgg tggtgtggtg ttggtgttcg
ccggtcaggg 40800gtgtcagtgg gtcggtatgg ggcgggagtt gctgggttcc
tcgctggtgt tcgcggagtc 40860gatgcgggag tgtgcggcgg ctctgtcgcc
gtttgtggac ttttctgtgg tggatgttct 40920gggttcggct ggggagttgg
gtcgggtcga ggtggttcag cctgcgttgt gggcggtgat 40980ggtgtcgctg
gcgcgggtgt ggcggtcgtg gggtgttccg gttgctgcgg tggtgggtca
41040ttcgcagggt gagattgccg cggcgacggt ggcgggtgcg ttgagtgtgg
gtgatgcggc 41100gcgggtggtg gcgttgcgga gccgtttgat cgcggagcgt
ctgtcggggc tgggtgggat 41160ggtttcggtg gcgttgtcgc gtgagcgggt
ggtgtcgttg atcgcgggtg tgccgggtgt 41220gtcggtggcg gcggtgaacg
gttcttcgtc gacggtggtc tcgggtgagg ccgcggggct 41280ggagagggtg
ctggccgcgt gtgtgtcgtc gggggttcgg gcgcgtcgta tcgatgtgga
41340ttacgcctcg cattcggtgc aggtggagtt gatccgtgag gagttgttgg
gggttctgga 41400cgggatcgtc ccgcgctcgg gtgagattcc gttcgtgtcc
acggtgacgg gtgagcggat 41460cgacactgtc gagctggggg cggagtactg
gtaccgcaat ctccgtcaga cagtggaatt 41520ccaagcatcc gtgcagacgc
tcctcgccca ggggcaccag gtcttcctgg agtccagtcc 41580gcacccggtt
ctcaccgtcg gcatcgagga gaccgttcac gagagcgccg cacaggccgt
41640cgttctggga agcctgcggc gggacgaggg tgccctcacc cggctcgtca
cctccgccgg 41700tgaggcatgg gcgcgcggtg tgcccgtcga ctgggcgggc
atgctcgccg gcggcaggcg 41760ggtcgagttg cccacgtatc ccttcctccg
ggagcggctg tggctggagc cgtcgcgctc 41820ccgcaccggg aacctcaaca
tggccgggct ggtcgaagcc ggacatgaaa tcctgcccgc 41880cgcagtggag
ttgcccggag agcagtgggt gtggaccggc gagctgtcgc tctccgcgta
41940cccgtggctg gccgatcacc aggtgctcgg gcagaccctg gtgccgggcg
tggcgtgggt 42000cgaactcgcc ctgcacgcgg gccaccagct cggtttcgga
tccgtcgagg aactcaccct 42060gcaggcaccg ctcgtgctcg gcgagtccga
cgccgtgcag gtcagagtcg ttgtctccga 42120tctcggggag agtgatcgcc
gggcagtgtc ggtgcactcg cgtggtgacg accagacgtg 42180ggtgacccat
gcggagggat tcctcaccgc gaaaggggcg cagccggaga ccatggccgt
42240gtggccgccg tccggtgcgg agccggtgga ggctgacggg ttctacgaac
gcctcgccga 42300tgcggggtac cactatggcc cggtcttcca gggcgtgagc
aaggtctggc gagctggcga 42360ggagatctac gccgaggtcg ggctgctcga
cgacgccgac gtggacggct tcggcatcca 42420ccccgccctg ctcgacgccg
ccctgcagac cgcctacgtc gcccaacggg gccccgcaga 42480gacgaagttg
cctttcgcgt tcggcgatgt acagctgttc gccaccggtg cccggtcgct
42540ccgcgtacgg gtctcgccgg ccgctcagca ggggatggcg tgggaggcct
gggaccccac 42600cggacttccg gtgttctccc tcgggtacct ggcgacccgg
ccggtcgacc gcggccagct 42660gaccgtgaag cggcccgagt cgctgttcaa
ggtggcctgg gacgagaccg tccccgtcgt 42720cgggaatgcg accgccgcgc
atggcgtcgt gctgggcgac gacccgttcg ccctcggtgc 42780cgcgctgcgc
gcggcgggct gggaggtcgg ggccgccccg gaacccgcgt ccgccgacac
42840cgccgccgaa gtactgctgc tgccctgcac cgcgcccggc gagccggacg
cggacctgcc 42900caccgcggtc agggccgtga ctgctcgggt gctcggcgtc
ctacaggagt ggctcgccga 42960cgaacggctc gccggcaccc gactggccgt
cgtgacccgc aacgccctgc cgggtgacct 43020cctgcacagc cccgtctggg
gtctcgtgcg ctccgcccag accgagaacc ccgggcgcat 43080caccctcgtc
gacctcgacg accaccccga ctcggcggcc gtccttgccg aggccgtcca
43140gtccgacgag ccgcgcatca tggtccgcga gggccggccc accgccgccc
gcctggtccg 43200tgccaccgca cccgagctgg tgccgcccgc cggagccgat
gcctggcgcc tcgagatcac 43260cgaaccgggc acgttcgaca acctcacgct
gggcgtctac ccgcacgccg agaagaccct 43320cgccgacaac gaggtccggg
tcgccgtcca cgcgggcggc ctcaacttcc acgacgtggt 43380cgccgcactc
ggcatggtcg aggacgacct gaccctcggc cgtgaggcgg ccggcgtcgt
43440cgtcgaggtc ggagacgccg tgccggatct gacccccggc gaccacgtga
tgggcatcct 43500gtcctccggc ttcgggccgc tcgccgtcac cgatcaccgc
tacctggcac gcatgcccga 43560gggctggacg ttcgcccagg cggcttcggt
gcccgccgcg ttcctgacgg cctactacgg 43620gctgtgcgac ctcggcggca
tccgcgcggg cgaccgcgtc ctcatccacg cggccgccgg 43680cggtgtcggc
atggccgccg tacagatcgc ccggcacctc ggggcggagg tgttcggcac
43740cgccagcccg cgcaagtggg gcgcgctgcg cgccctgggg ctcgacgacg
cccacctgtc 43800ctcctcccgc accctcgact tcgagcagga gttcctggac
gccaccgacg gcaggggagt 43860cgacctcgtt ctgaactcgc tggcccggga
gttcgtcgac gcctcgctgc ggctgatgcc 43920cggcggcggc cggttcgtgg
acatgggcaa gaccgacatc cggcggccgg aacaggtggc 43980ggaggaccac
ggcggagtcg cctaccaggc attcgacctc gtcgaggccg ggccgcagcg
44040cacgggggag atgctcgccg agatcgtccg gctcttccaa gccggcgcgt
tccggccgct 44100gccgatcacc cagtgggacg tgcgccgggc gccggaggcc
ttccgacaca tcagccaggc 44160caagcacata ggcaagatcg tcctcaccgt
gccccggccc atcgacaccg acggcaccgt 44220catggtcacc ggcgccaccg
ggaccctggg cggcttcgtc gcccggcacc tggtcaccca 44280tcacggcata
cgacgactgc tgctggtcag ccgcagcgcg gagcgcaccg acctggtgcg
44340ggaactcacc gagctgggcg ccgacgtcac ctgggcctcc tgcgacctag
ccgacgccac 44400cgccgtcgaa gagaccgttc ggtccgtcga cgaacggcat
ccgctcgtgg ccgtcgtcca 44460ctctgcggga gtactcgacg acggcgtcat
cgacaagcag agccccgaac ggctcgacac 44520cgtgatgcgt cccaaggtcg
acgccgcctg gaatctgcac cgactcctcg acaacgcccc 44580gctggccgac
ttcgtgctct tctcctccgc cagcggcgtg ctcggtggcg ccggacagtc
44640caactacgcg gccgccaacg ccttcctcga cgcgctcgcc gagcaccgcc
gtgcacaggg 44700cctcgccgga caggcgctcg cctggggact gtggtccgac
cgcagcacga tgacgggaca 44760gctcggctcc accgaactcg cccggatcgc
ccgcaacggc gtcgccgaga tgtccgagac 44820ggagggcctg gccctcttcg
acgccgcccg ggacaccgcc gaggcggtgt tgctgcccat 44880gcacctggac
gtcgcgaggc tccgcagccg caacggagag gtacccgcgg tgttccgccg
44940gctgatccac gccacggccc gccgcaccgc gagcaccgcg gtccgcagcg
ccggcctcga 45000acagcagctc gcctcgctgt ccggccccga acgcacggag
ctgctcctgg gactggtgcg 45060cgaccatgcc gccgcggtgc tcggccacgg
cacctccgac gccgtctcgc cggaccggcc 45120cttccgcgac ctgggtttcg
actccctgac tgccgtggag ctgcgtaaca ggttcgccgc 45180cctcaccggc ctgcgtctgc
cggccacgct cgtcttcgac cacccgagcc cgacggccct 45240cgccgggcac
ctcgccggcc tgctgggcgc cgcgacgccc tccgcggccg agccggtcct
45300ggccgccgtc ggacggctgc gcgccgacct ccggtcgctc accccggacg
ccgagggcgc 45360cgaggacgtg acgatccagc tggaggccct cctcgccgag
tggcgggagg ccgcggagaa 45420gcgggctccg gaggcggtcg gtgacgagga
cctgtccacc gccaccgacg acgagatctt 45480cgcgctcgtc gacagcgaac
tgggtgaggc ctgatgacgg ccgaagcgtc tcaggacaag 45540ctgcgtgact
atctgcgaaa gaccctcgcc gacctgcgga ccaccaagca acggctacgc
45600gacaccgaac gcagggcgac cgagcctgtc gcgatcgtcg gcatgagctg
ccgactgccc 45660ggcgacgtac ggacaccgga gcggttctgg gaactcctcg
acactggaac cgacgccctg 45720acgcccttgc ccaccgaccg cggctggaat
ctcgacacgg cgttcgacga cgaacggccg 45780taccggcgcg aaggcggatt
cctttacgac gccggacggt tcgacgccga gttcttcggc 45840atctcgcccc
gtgaggcgct ggccatggac cctcagcagc ggctgctcct cgaaagctcg
45900tgggaggcga tcgagcacgc ccgcatcgac cccaggtccc tgcacggcag
tcgcaccggc 45960gtctggttcg gcacgatcgg ccaggactac ttctccctct
tcgccgcatc cggcggcgag 46020cacgccaact acttggccac cgcctgctcg
gccagcgtga tgtccggccg cgtctcgtac 46080gtgctcggcc tggaggggcc
cgctgtcacg gtcgacacgg cgtgctcgtc ctccctggtc 46140gccctccact
ccgccgtaca ggccctgagg tccggcgagt gcgaactggc tctcgccggg
46200ggcgccacgg tcatggccac cccgacggtg ttcaccgcct tctcccatca
gcgtggcctg 46260gccggtgacg gccgctgcaa ggccttcgcg gcgggtgccg
acggggcggg cttcgccgag 46320ggggtgggcg tgctggtgct ggagcggttg
tcggtggcgc ggcggaacgg tcatcgggtg 46380ttggcggtgg tgcggggttc
ggcggtgaac caggacggtg cgtcgaacgg tctgacggcg 46440ccgaacggtc
cgtcgcagca gcgggtgatc cgcgcggcgc tggccaacgc gcgcttggcg
46500ccggaggacg tggacgctgt cgaaggccac ggcacgggga cttcgctggg
cgacccgatc 46560gaggcgcagg cgttgctggc gacgtacggg cggggccggg
acgcggagcg tccgttgtgg 46620ctggggtcgg tgaagtcgaa catcggtcat
gcgcaggctg ctgccggtgt cgccggtgtc 46680atcaagatgg tgctggccat
ggagaagggc cgtctccctc ggacgctgca tgtggatgag 46740ccgtcgggtg
aggtggactg ggactcgggt gcggtacggc tgctgaccga ggcgcgggac
46800tggccgtcgg gggaggggcg ggtgcggcgg gcgggagtgt cgtcgttcgg
gatctccggg 46860acgaacgcgc acgtgatcat cgaggagccg caggaggagg
aagcggcacc ggattcctct 46920gcttcgggtg ccgtgccgtg ggtgctctcg
gcgcgatcgg ccgaagcgtt gcaggctctg 46980gcttcacaac tcgccgacca
cagcgccaaa tcgagtccgg tggatgtggg ttggtcgttg 47040gtttcgacgc
gtgcggcgtt cgagcatcgg gccgtggtgg tggggcgcgg gcgggacgag
47100ttggtgcggg gcttgtccga ggtcgcgcag ggtcggggcg tgaggggtgt
cgcgtcttcg 47160gcgtcgggtg gtctcgcgtt tgtttttgct ggtcagggca
gtcagcggtt ggggatgggg 47220cgggggttgt atgagcggtt cccggtgttt
gccgaggcgt tcgacgaggt gtgtgggcgg 47280gtcggtccgg gggtgcggga
ggttgttttc ggttcggatg cgggtgagtt ggaccggacg 47340gtgtgggcgc
aggcggggtt gttcgcgttg gaggtggcgc tgtttcggtt gttggagtcc
47400tggggtgtgc ggccgggttg tctgatcggg cattcggtcg gtgagttgtc
ggcggcgtgt 47460gtggcggggt tgtggtcgtt ggaggatgcg tgtcgggtcg
tggctgcccg ggcgcggttg 47520atgcaggcgt tgccggcggg tggggtgatg
gtcgcggttc gggccgaggc gggggagctg 47580gccggtttcc tcggtgagga
cgtggtgatc gcgtcggtga acgcgccggg gcaggtggtg 47640atcgctggtc
ctgagggggg tgtggagcgt gtggtggctg cttgtggggc gcggtcgcgt
47700cgtctggcgg tctcgcatgc ttttcattcg cctttggtgg agccgatgct
gggggagttc 47760cgtcgggttg tggagtcggt ggcgttcggt gtgccgtcgt
tgcgggtggt ttccaatgtc 47820acgggtgcgt gggtggatcc ggaggagtgg
gggacgccgg agtactgggt gcgtcaggtc 47880cgtgagccgg tgcgtttcgc
cgacggggtc gccacgttgc tcgacgcggg tgtgaggacg 47940ttcgtcgagc
tgggtcccgc tgggacgctc acttcgatgg tcagccactg cgcggacgcc
48000accgccactt cggtgacggc tgtacctacc ctgcgccccg atcacgacga
gtcgcggacc 48060gtgttgagtg ccgcagcgtc cttgtacgtc cagggtcacc
cggtcgactg ggccccgctg 48120ttcccgcggg cccgcacggt ggacctgccc
acctacccct tccagcacca gcactactgg 48180atggaaagcg ccgcccggcc
caccgtcgag gacaccccgc gcgagcccct cgacggctgg 48240acgcaccgca
tcgactgggt gccgctggtg gacgaggaac cggcgcccgt cctggccggt
48300acctggctgc tcgttcgtcc cgaagaaggt ccccgcccgc tcgccgacgc
cgtcgcggac 48360gcgctgaccc ggcacggcgc ctccgtcgtc gaggccgctc
gtgtcccgca ccaatccgac 48420accgagctga ccggagtcgt ctctctgctg
ggcccgggcg ccgacggcga cggcggcctg 48480gacgcgaccc tgcggctggt
acaggacttg gccaccgccg ggtccaccgc gcccttgtgg 48540atcgtcacca
gcggagccgt ggccgtcggt acgtccgaca ccgtgccgaa ccccgagcag
48600gcgacgctct gggggttggc ccgggcggcg gccaccgagt ggcccggcct
gggggcggcg 48660cgcatcgacc tgcccgccga cctcaccgag caggtcggac
gtcggctctg cgcccggctg 48720ctcgaccgga gtgagcagga gacggcggtc
cgacaggccg gggtgttcgc caggcggctg 48780gtccgtgccc gtaccagcga
cggccggtgg acgccgcgcg gcaccgtgct ggtcaccggc 48840gggaccggcg
cgctcgccgg acacgtcgcg cgatggctgg cggaggaggg ggccgagcac
48900atcgtgctgg ccgggcgcag agggcccgac ggtcagggcg ccgaggcgct
gcgcgccgac 48960ctggtcgccg caggggtcaa ggcgacgatc gtgcgctgcg
acgtcgccga ccgggatgcc 49020gtacgtctgc tcctggacgc acaccggccc
agcgccatcg tgcacacggc cggggtcgtc 49080gacgacggac tgctcacctc
gctgacgccc gcccaggtcg agcgggtgct gcggcccaag 49140ctgctcggcg
ccaggaacct gcatgagctg acccgggacc gggaactgga cgccttcgtg
49200ctgttctcct ccctcgccgg agtcctcggc ggggcagggc aggccaacta
cgccgctgcc 49260aacgcctact tggacgccct ggccgcacac cgcaccgcgc
atgggctgcc ggcggcctcg 49320ctggcatggg ggccgtggga gggcgacggc
atggccgcgg cgcaggaggc cgccgaccgg 49380ctccgccgca gcggtctcac
cccgctgccg ccggagcagg ccgtacgggc cctcggccgg 49440ggccacgggc
cgctggtggt ggccgacgcg gactgggcgc ggctggccgc cggctcgacg
49500cagcgcctgc tcgacgagct tcccgaggtg cgtgcggtca ggccggcgga
gcctgctgtc 49560ggacagcgcc ccgacctacc ggcccggttg gcggggcgtc
cggccgagga gcagtccgcg 49620gtactgctgg aggcggtccg ggaggagatc
gccgccgtac tgcgttacgc cgatccggcg 49680cggatcggcg ccgatcacga
gttcctcgcc ctcggcttcg actcgctgac atcgatcgaa 49740ctgcgcaaca
ggcttgccac gcgcatcggt ctgacgcttc ccgcgacgct caccctggaa
49800cagcgcaccc ctgccgggct cgccgcgcac ctgcgcgagc ggatcgcgga
ccggcccgtc 49860gggtccggtg ccgtcccggt gcccgggagc gctgatgtcc
cggaggcggg cggcggtagc 49920ggcctcggtg agctgtggca ggaagccgac
cggcacggcc ggcggctgga gttcatcgac 49980gtactcaccg cggccgccgc
cttccggccc gcctaccgtg aaccggccga gctggagctg 50040ccgcctctac
ggctcacctc cggcggggac gagccgcccc tgttctgcat cccctcgcac
50100ctcggcaagg ccgacccgca caagttcctg cggttcgccg cggccctgcg
gggacggcgg 50160gacgtcttcg tcctgcgcca gcccggcttc gtacccgggc
agcccctgcc cgcgggcctc 50220gacgtcctgc tcgacaccca cgcgcgggcc
atggccgggc acgaccggcc cgtgctgctc 50280ggctactcgg ccggcggtct
tgccgcgcag gcgttggccg cccgactcgc cgagctcggc 50340aggccgccgg
cggccgtcgt gctcgtcgac acctatgccc ccgacgagac ggaggtgatg
50400gcccgtatcc agggcgccat ggagcagggc cagcgcgatc gcgacggcag
gaccggtgcc 50460gccttcggtg aggcctggct caccgcgatg ggccactact
tcggcttcga ctggaccccg 50520tgtccggtcg acgtgccggt gctgcacgta
cgcgccggcg accccatgac cggtatgccc 50580gtcgaagggc ggtggcaggc
gcgctggaac ctgccgcaca ccgccgtcga cgtgcccgga 50640gaccacttca
cgatgatgga ggatcacgcc ccgcgcaccg ccgacaccgt gcacgactgg
50700ctcggcacgg ccgtccgccg ccctgagaga acccgcgact gacgactcgc
cggcgacagc 50760ggcatcccgc cctgtcccct tcctgtccgt ccgttccctt
tcctctcctc gaaacggagt 50820tcgttctcat gccttccttc cccgtacgcc
ggtccgtgcc cgacactccg cccgccgagc 50880acctcgaact gctcaaggag
agcggcggcg tctgcccctt caccatggag gacggccgtc 50940cggcctggct
cgcggccagc cacgacgccg tgcgctccct gctcgccgac cgccgtatca
51000gcaacaaccc ggcgaagacg ccgcccttct cgcagcggga ggccctgcag
aaggagcggg 51060gccagttcag ccgtcacctg ttcaacatgg actcgccgga
gcacgacgtg gcccgccgca 51120tgatcgcgga ggacttcact ccccggcacg
ccgaggcggt ccggccgtac ttcgaggagg 51180tgttcggcga gatcgtcgac
gaagtggtcc acaagggccc accggccgag atgatcgagt 51240cgttcgcctt
cccggtcgcc acccgcacca tctgcaaggt gctggacatt ccggaggacg
51300actgcgagta cttccagaag cgcaccgagc agatcatcga gatggaccgc
ggcgaggaga 51360acctcgaagc cgtcgtcgaa ctgcgccgct acgtcgacag
cgtcatgcag cagcgcaccc 51420gcaagcccgg cgacgacctg ctcagcagga
tgatcgtcaa ggcgaaggcg tccaaggaga 51480tcgagctcag cgacgccgac
ctggtcgaca acgcgatgtt cctgctggtg gccgggcacg 51540agccgtcggc
caacatgctg ggcctcggcg tgctcgccct cgccgaattc ccggacgtgg
51600ccgaggaact gcgggccgag ccgcacctgt ggccgggcgc gatcgacgag
atgctccgct 51660actacaccat cgcccgggcc accaagcggg tcgcggccgc
cgacatcgag tacgaggggc 51720acacgatcaa ggagggggac gccgtcatcg
tgctcctcga caccagcaac cgcgacccga 51780aggtgcacgc cgaaccgaac
cggctcgaca tccaccgctc ggcgggcaac cacctggcct 51840tcagccacgg
accgcaccag tgcctgggca agcacctcgt ccgggtccaa ctggagatcg
51900cgctgcgggc tgtcgccgag cggctgcccg gcctgcgcct ggacatcgcc
aaggaggaca 51960tccccttccg cggtgacgcc ctgtcctacg ggccgcgcca
gctgcgcgtc acctggtaac 52020agccaccatc ggcccccgcc gcggaccggg
cagcacgacc cggtccgcgg cgggggcacc 52080accaccgtca acatccccag
agaggcttcc ccgtggagaa gaccgacgtc gaccggctgc 52140gcacactcga
ccgagagcac atgtggtacc cgtggacgcc gatgaccgag tggatggccc
52200gtgatcagct cgtcgtcgaa cgcgccgaag gctgctggct gatcgacgca
gacggtaagc 52260gctacctcga cggccgctcg tcgatgggca tgaacctgca
cggccacggc cgcagcgaga 52320tagtcgaggc cctggtcgcc caggcgcgca
aggccggtga gaccacgctc taccgcgtct 52380cgcacccggc ggcggtggaa
ctcgccgccc gcctggcatc gatggcgccg gccgggctcc 52440agcgcgtctt
cttcgccgag tccggatcga ccgcggtgga gacggctctc aaggctgcct
52500acgcctactg ggtcgcgaag ggcgaaccgc agcgatccac cttcgtgtcc
atggagggcg 52560gttaccacgg cgagacccta ggcacggtca gcctgcgcgg
caccaacggc gaacaggtcg 52620acatgatccg caagacctac gagccactgt
tgttcccctc cctctccttc caccagcccc 52680actgctaccg gtgtcccgtc
ggccagtcgt cggacagcga ctgcgggctg gagtgcaccg 52740attcgctgga
gaacctcctc acccgggaga agggccggat cgccgcggtc atcgtcgagc
52800cgcgggtcca ggccctcgcc ggagtgatca ccgccccgga gggacacctc
gcgaaggtcg 52860cggagatcac ccgcaggcac ggagtgctcc tcatcgtcga
cgaggtcctc accggctggg 52920cccgcaccgg cccgacgttc tcctgcgagg
ccgagggcgt cacaccggat ctgatgacgg 52980tgggcaaggc gctgaccggc
ggatatctgc cgctgtcggc caccttggcc acggaggaga 53040tcttcggagc
cttccgtgag agcgtcttcc tcagcggcag cacctactcc ggatacgcgc
53100tcggggcggc cgtcgccctg gccagcctcg acctgttcga gaaggaggac
gtaccggccc 53160gggccaaggc gctcgccgac gtgctcacca ccgcactgga
acccttccgc gcgctcaccc 53220acgtcggtga cgtccggcag ctcggcctca
tcgccggcgt cgagctggtg gccgaccggg 53280agacccgcgc cccctacccg
ccccaggagc gcgtcgtcga tcgcatctgc accctggcca 53340gggacaacgg
cgtgctggtc aacgcggtcc ccggggacgt gatcaccatg ctgccctcac
53400cgtcgatgag ccccgacgac ctgcgcttcc tcaccggcac cctgtacacg
gccgtccgag 53460aggtgaccga agagtgaaag ggctgatgcg ggcggccgtc
atccgtgcct ggggcggccc 53520cgagcggctg accctggacc gggtcgaacg
gccgtcaccc ccgcccggat ggatcgccgt 53580acgcgtcgag gcctgcgccc
tgaaccacct cgacatccac gtgcgcaacg ggcttccggg 53640cgtacggctg
gaactcccgc acgtctccgg cggcgacgtc gtcggcgtcg tcgagcaggc
53700caccgacgag gcgggggaga gactgctcgg cagccgtgtg ctgctcgacc
cgatgatcgg 53760gcgcggcatc ctcggcgagc actactgggg cgggctcgcc
gagtacgtcg tcgcacccgc 53820ccacaacgcg ctccccgtcc ccgatcagga
cgcggacccg gcacgctacg ccgcactgcc 53880catctcctac ggcacggccc
agcgcatgct cttcagccgc gcccggctgc gtcccggcga 53940gagcgtgctg
ctgttcggcg cgaccggcgg cgtcggcgtc gcctgcgccc agctcgccct
54000gcgtgccggg gcccggatca tcgcctgctc cggatcaccg gccaagctcg
cccggctgcg 54060ccgactcggc gtgatcgaca cgatcgacac cggcaccgag
gacgtacggc gcagggtccg 54120cgaactcacg gacggcggtg ccgacctggt
cgtcgactac cagggcaagg acacctggcc 54180cgtctccctg cgctcggcgc
gcgccggcgg ccgcatcgtc acctgcggcg cgaccaccgg 54240gtacgaggcg
acgaccgacc tgcgctacgt gtggtcgcgt cagctggaca tcctcggctc
54300caacgcgtgg caccgcgacg atctgcacac gctggtcgac ctggtggcca
ccgacgccct 54360ggaaccggtg gtgcacgccg acttcccact ctcccgcgcc
cccgaggcgg tcgccgaact 54420ggaggagcgc cgggcgttcg ggaaggtcgt
gatccgcacg gcgtgaactc actcatgtcc 54480cggctcgatc ccagggggaa
acagcgtgac cggcaacacc acatccgccg ccttcctgcg 54540gcggacacag
aacgcgctcg ccatgcagcg caagatatgc gcccagcccg aggagaccgc
54600ggagcgcgtg ttctccgaca tcctctcggt gtcacgagac accggcttcg
gccgcgaaca 54660cggcctcgcc ggggtccgca cccgccagga gtggcggcgt
gccgtgccca tccgcaccta 54720cgacgaactg gccccctacg tcgagcggca
gttctccggc gaacgccgcg tgctcaccac 54780cgacgacccc cgcgccttcc
tgcgcacctc gggatcgacc ggccgcgcga agctggtacc 54840caccaccgat
cactggcgcc gtgtctaccg cggaccggcg ctgtacgcgc agtgggggct
54900ctacttcgaa cagatcggca cgcatcggct caccggcgac gaggtcctcg
acctgtcctg 54960ggagcccggc cccatccggc accgactgcg cggcttcccc
gtctacagca tcaccgagcg 55020ccccgtgtcg gacgaccccg acgactggaa
cccgccgtgg cgtcacgcga ggtggttcac 55080ccgcgatgcc ggtgccgcga
ccatggccga cctgctctac ggcaaactgc tgcggctggc 55140cgcccacgac
ctgagactga tcgtctcggt gaacccctcc aagatcgtcc tgctcgccga
55200gacactgaag gagaacgccg aacgcctgat ccaggacctg cacgacggcc
acggcacgga 55260ccgggcagcc cgcccggact tcctccgccg cctcaccgcc
gccttcgacc gcaccggagg 55320ccgtccgctg ctcaccgacc tgtggcccgg
cctgcgtctg ctcgtctgct ggaactccgc 55380ctccgcggcg ctgtacgggc
cctggctgtc ccggctcgcg accggcgtgg cggcactgcc 55440gttcagcacc
acgggcaccg agggaatcgt cacgctgccc gtcgacgacc acctctcggc
55500ggggccgctc gctgtcgacc aggggcattt cgaattcgtt ccgtggcagg
acctggacga 55560cggcagccct ctgcccgagg acacccccac cctcggctat
gacgaactcg aactcggcgc 55620cgactaccgg ctcgtcatga gccaggccaa
cgggctctac cgctacgacg tgggcgacgt 55680gtaccgcgtc gtcggagcgg
tcggcgccac gccacggctg gagtttctgg gacgcgcggg 55740attccagtcc
tccttcaccg gcgagaagct caccgaatcc gatgtgcaca ccgccgtgat
55800gcgggtcctc ggcagcgaac gcaccgacca cccgcacttc tccggcatcc
cggtctggga 55860caccccgccc cactacctcg tcgccatcga atgggctgac
gcccacggca cgttgaacgt 55920gcaggacaca gcccgccgca tcgacgcgac
tctccaggaa gtcaacgtgg aatacgccga 55980caagcgccgc agcggacgac
tgcggcccct gcagatcctg cccctggtgc ccggcgcttt 56040cggccagatc
gccgaacgaa ggttccgcca gggcaccgcg ggagcccaga tcaaacacca
56100ctggctgcag aaggactcgg cgttcctcga cacgctgcgc gacctcgacc
tcgtccgcgc 56160ccgcccgggg acgtgacggc atgcgcatcg gattcgccgc
acccatgtcc ggcccctggg 56220ccaccccgga caccgccgtg cacgtcgccc
gcaccgccga acagctcgga tacgcctcgc 56280tctggaccta ccagcgagtc
ctcggcgcgc ccgacgactc ctggggcgag gccaaccgca 56340gcgtccacga
ccccctgacc accttggcct tcctggccgc gcacaccacc gggatccggc
56400tcggtgtcgc cgttctgatc atgccgctgc acacccccgc ggtgctggcc
aagcagctca 56460ccaccctcga cctgctctcc ggcggccgac tcgacgtggg
cctcggcaac ggctgggccg 56520ccgaggagta cgccgccgcc ggcgtgaccc
ccaccgggct cagccgccgc gccgaggact 56580tcctcgcctg tctgcgggcc
ctgtggggtg agcagaccgt ggtggaacac gacggcccct 56640tctaccgggt
cccgcccgcc cgcttcgacc cgaagcccgc ccagtccccg cacccgccgc
56700tgctcctggg cggcgccgcg cccggcgcac tgcgccgcgc cggccgcctg
tgcgacggct 56760ggatcgcgag cagcaaggcc ggcccggccg ccatccgcga
cgccatcacc gtcgtacgcg 56820acagcgctga gcgaaccgga cgcgaccccg
cgaccctgag gttcgtctgc cgcgccccgg 56880tccggctgcg gacccggtcg
gcccccaacg agccgccgct gaccggcacc gcggagacga 56940tccgggccga
tctcgccgcg ctagccgaca ctggcctgac cgagatcttc ctggacccca
57000acttcgaccc cgagatcggc tcaccggacg cgccgaccgg cgacgtgcga
caccgcgttg 57060atctgctgct gcacgaactg gcccccgcaa actggtgaga
ggaagagaac agtgctgatc 57120gcgcgcgccg ccgtcggaga agaccgaacg
tacgcccgcg tcgacacgga cacagggctg 57180atccacctcc tggccggcac
tccctacgac gagatccggc cgaccggcga aaccagaccg 57240cttgccgagg
cccgcctgct cgcaccggtc gaacccagca aagtgctggt cgcaggacgc
57300aattacggcg atgtcgtcac accggacctg gtggtcttca tgaagccgtc
cacctctgtc 57360gtcggcccca ggagcaccgt cctgctgccg gcggaggcca
agcaggtccg gtacgaggga 57420gaactcgccg tggtgatcgg gcgccgctgc
aaagacgtcc ccgaagacac cgcggaccag 57480gccgtgttcg gctacacctg
cgccaacgac gtcaccgcct gggacgtcgg ggaaccgaag 57540ggccactgga
ccaaggcgaa gagcttcgac acattctgcc cgctgggacc atggatccgc
57600accgatctcg accccgctga cctcgtcctg cgcacaaccg tgaacggcac
gctgcgccag 57660gacggctcca ccaaggaaat gaacaggaat gtccgcgccc
tcgtgtcccg ttgcagctca 57720ctgatgacgc tgctgcccgg agacgtgatc
ctcaccggca caccggcggg cgccggcgtg 57780ctgcgtccgg gtgacgaggt
cgtcgtcgag attgacggga tcggttcgct cgcgaatccg 57840atcggcgtgg
ccaagtagtt cactgactac actcgcgcga acaacacggg cccgtctgcg
57900gcgcttcgag ctgcgccgat ccccgaggag agattccagt gtctgtaatc
cgtcccaccg 57960ccgaaaccga acgcgcagtc gtggtggtcc cggctgggac
gacgtgcgcc gacgcggtca 58020ccgcggcaaa gctgccgcgc aatggcccca
acgcgatcgt cgtggtgcga gacccgtccg 58080gcgccctgcg tgacctcgac
tggacccccg attccgacgt cgaggtcgag gccgtcgcgt 58140tgtccagcga
ggacggcctc acggtgctgc gccactccac ggcacacgta ctggcccagg
58200cggtccagca actctggccg gaggccaggc tcggtatcgg cccgccgatc
gagaacggct 58260tctactacga cttcgacgtg gagcgcccct tccagccaga
ggacctcgag cgcgtcgagc 58320agcggatgaa ggagatcatc aagtccggcc
agcgcttctg ccgccgcgag ttccccgatc 58380gggaagcggc ccgtgccgag
cttgccaagg agccgtacaa gctcgagctc gttgacctca 58440agggcgacgt
ggacgccgcc gaggcaatgg aggtcggcgg gagcgacctg acgatctacg
58500acaacctcga cgcgagaact ggagatgtgt gctggtccga cctctgccgc
ggcccccact 58560tgccgtcgac ccgcctgatc ccggcgttca agctgctgcg
caacgcggca gcctactggc 58620gcggcagcga gaagaacccc caactgcagc
gcatctacgg cacggcctgg ccgacccgcg 58680acgagctcaa gtcccatctc
gccgccttgg aggaggccgc caagcgtgac caccgccgca 58740tcggcgagga
actcgacctc ttcgcgttca acaaggagat cggccgcggc ctgccgctgt
58800ggctgcccaa cggcgcgatc atccgcgacg aactcgagga ctgggcccgc
aagaccgaac 58860gcaagctcgg ctacaagcgc gtcgtcaccc cgcacatcac
ccaggaggac ctttactacc 58920tctcaggcca tctgccttac tacgcggagg
acctgtacgc gccgatcgac atcgacggcg 58980agaagtacta tctcaagccg
atgaactgcc cgcaccacca catggtgtac aaggcgcgcc 59040cgcacagcta
tcgcgacctg ccctacaagg tcgccgaata cggcacggtg taccgattcg
59100agcgcagcgg tcagctgcac ggcatgatgc gtacgcgcgg tttcagccag
aatgacgcgc 59160acatctactg cacggcggac caggccaagg accagttcct
ggaagtcatg cgcatgcacg 59220cggactacta ccgcactctg gggatcagcg
acttctacat ggtgctcgcg ctgcgtgact 59280cggcgaacaa ggacaagtac
cacgacgacg agcagatgtg ggaggacgct gagcggatca 59340cccgggaggc
catggaagag tccgacatcc ccttccagat cgacctgggc ggtgccgcgc
59400actacggccc gaaggtcgac ttcatgatcc gagccgtcac cggcaaggag
ttcgccgcct 59460ccaccaacca ggtcgacctg tacaccccgc agcgtttcgg
gctgacctac cacgactccg 59520acggcaccga gaagcccgtc gtggtgatcc
atcgcgctcc gctcggctcg cacgagcgct 59580tcaccgccta tctcaccgag
cacttcgcag gtgccttccc ggtgtggttg gcgccggagc 59640aggtccggat
cattccgatc gtggaggaac tcacggacta cgccgaggaa gtccgcgaca
59700tgctgctgga cgcggacgtg cgtgccgacg tcgatgccgg cgacggccgg
ctgaatgcca 59760aggtacgcgc ggccgtcacc cggaagatcc cgctcgtcgt
ggtggtcggc aggcgagagg 59820ctgagcagcg caccgtaacc gtgcgcgacc
gctccggcga ggagaccccg atgtccctgg 59880agaagttcgt ggcccatgtc
actggactca tcaggaccaa gagcctggac ggcgccggcc 59940acatccgtcc
gctgtccaag gcctgaccca cagccacggg gccccggcag gtgtcccgcc
60000tcggaccacc cccttcggtc ctcagccgac ggcgggctca tggcagccgc
ccgacctgcc 60060ggtgccgtgg ctgttcggca acccgtgggc gccgcccgcc
gaggagaccg cgcgctgccg 60120ggggatgatc tcgtccggcc cgcgccccgg
gctcaccggg cggcgctgga gcgggccgga 60180cgggagccgg gaccggccga
ccgtccgtgg tcggtctcgt cccggacgag cagggacagc 60240agtgcggcca cggtgaggcc
cgcttccgcc gggtggcgca ggaccttgtc cggttcgacg 60300cggtagatgt
tgctgcgccc gtcacgggtg tgggagagat agccgtcctg ctccaggtcc
60360gagatgatcc gctggacggc gcgctcggtg agtctgcagt gggcggcgat
gtcgcggatc 60420cgcacgttcg gattgtcggc gatggccgcc agtacacgcg
cgtggttggt gacgaacgtc 60480catccggagt gagattcagg cactgcaacc
atgcacagca ttgtagggac catctttgcc 60540ggacagccaa tacatgacat
acttttcgcg ttaagagtgg catgttctgt cccatgggca 60600actgagaagg
gacccgaggg tgtctttgga tgaagcggtc gcggggtgct cgcgccacac
60660cggccggcgt cggctcccgg ccgcggagca acccacgcag gcgcagtacg
aagcgcacgg 60720cgcctgggtc gtcagcgcac ggggcgcata cgacatgaac
tcggtcgagc ccttggccga 60780cgcgttgaaa gacgcggccg agaagtctcc
gaaggtggta ctggacgcct ccggcatcac 60840cttcgccgac tccaccctgc
tgagtctgct gatcctcacc caccaggcga cggacttccg 60900ggtggccgcg
ccgacgtggc aagtgatgcg gctcatgcaa ctgacgggcg tcgacgcctt
60960cttgaaggta cgggccacgg tggaagaggc cgccaccgct taggggcacg
gcgtgccggg 61020cctcggctga cgcaagccga tggcttggag ctgagaattc
cgggcattcg acgttctgcc 61080tggcccccgg ccgtcgatgg tggccggcca
ggcgtgatga agacagtcac ctcctcgagg 61140cgcgatcgac cgcgcctcgg
gggagggcgg ttgacgggag agggaaggtg tccatgattc 61200tgccggcgga
gaaggaactg cgtgccgtgc tggctcggtt cgctcaggcg cgcatcgacc
61260acgacgtacg tcccagcggc tgcaccagca ggctcctcga ggacgccacg
tacaccctgt 61320gcgtgatgac cggtgcccgt accgccgaac aggctctgcg
tacggcggac gaacttctcg 61380cacagttcgc cgagcgcacc gctgcccccg
tggaggacga agccctggcc gcgtgagccg 61440acggcacaca cctgcggcgc
ctcgcgtggc aggtgtgtgc ctgccggcgg gcggacgcga 61500gcacctgagg
aaatgagaga gagtcatgag cgatacccgg cttcggcagc gcgatgagac
61560gtcgaagggg ccggccaccg agatcccggc gccgcagtgg cgggacctct
tcctcgcccc 61620cgactggggc ggcactgatg agcaggtgat cgtcgccgaa
gaggcgcgcg ggcccgagca 61680cttcaccgga gcgcgccgtc cgcgcggcgg
ccgccgatcg agtcgacggg ccgcgtgatg 61740cgcggccctg ccgcgacggc
cgcaggtcag gtgagggcga tccgtgcggt gatgcgcttg 61800ccggacggct
cgggccggat ctcgacggcc tggctcagcg ccgccacgat ctccagaccg
61860tgctgaccca cccgggcggg atcggcgggc cggggagcgg gcagggtggt
gtcaccgtct 61920cgcacggtga cagtcaccgc gtcgtccgtg aggctcaggg
tgagttctat ggggtcggcc 61980ccgtacttga ccgcgttcgt gatcagctcg
ctcaccacca ggtggacggc ctccgacgct 62040ctctccggca ccggcgaccg
aaggtcccgc tgtgaacggg tgaggtagtc ggtcgcgaag 62100tgacgggcgt
cggcgatccg cagcgcttcc cgccggtacg tcaccgaggc cgcccgcgga
62160tccgcaccct ccggtgcccg ctgtgtcgac gaaccgtcca gtccggtctc
acccatgtca 62220accgccacgt tacccccgag ccacgcacgc gcgccgacgc
ctcccgcgcg atgagaacat 62280ctcatgtgtt cctacgatag ttctgctttc
cgtcggtcac cgcacccgtc ggccagggag 62340aaagcggggg cctggacgtg
atccgggcct caggccgtgc tgagcacgcc tccgcgcaag 62400cgggccggca
gccgctccgg aatccgtgcg gtcgtcgtcg ccatgatgtc cctccccgtt
62460ccgatggccc gctcgcgtga cgggccactc ggcggctacc ccctgcgcgg
gctcgcatgc 62520acggaggcgg agatcttgtc cgaggccgtt ccgctcccgc
ccgcgcgacg tgatcaggtc 62580ggcggttacg ggatcaggaa ttccagcacg
ggtcggtccc aggccaccgc gggcgggttc 62640gcgcgggcgg ttccggtggg
cagggcgccc acagcgcggt agaagccctc ggccggaggg 62700tgcgacacca
cccggacacg gtccagccct gccgcgcggg cccgcctctt catgtgatcg
62760acaagcagcc gtccgatgcc gcggccctgg gtaccgtcct ggacgaacag
caggtccagc 62820tccgccggcg cgaggagcag cgcgtagaag ccgagcaccc
ggtccgtggc gtcctcggcg 62880tcgacgctac gaagacctgg tggttctcga
tgtagtcggg cccgacgcgg tagtcggaga 62940ccatcgccgc gtacgggccc
tcgtaggctc gtgagccgcg tacgagacgc gagagccgct 63000tggcgtcccg
ggcgaccgcc cggcggatga cgatctcgcc ccgtacggac gcggaactcg
63060attgcacggg gagcagtcta cgcgtccggg acgggccggg acctccgcgg
gctccgtccg 63120agccttccgt cagtccgtcc cgatgacgac cacctcagtc
cgtcccgatg acgaccacct 63180cgtcctcggc cgtccaccgg cgccgctcgc
gtttgggcgg attgatccgc actccgtagt 63240gggggcgcgt cgacgcgtcg
gcgtggtcgc ggtaaccgat ggcgcactcg ccgcgacggc 63300gtgccgcggc
cacgacggtg gcgaaggacg tggtgctccc tggcaacagg tagtcggtcg
63360ccggccgcag gcggaccccg gcgccctcgg cggagaacag ttcctcgaag
accgcggcca 63420ggtgccggtt ctgggagatc tgggacatga gcaggccgat
gagcttgccg ctgatgatga 63480cgtcggcccc ggggccgatg ggggccagcg
cccggttgcg gtcgtcgatc agttcggtga 63540cgaccggcag ctcgcgcccg
gtggcttctt ccagttggcg caggagcaga agggtgacga 63600gcgtgcggtt
gtcgggatcg tccggcggtt ggcccggggc ggggtcccgc cccagcacga
63660tcacgctgtc gtaggagtgg acgtccaggc gccgcagcgt ctcgggacgg
gtgatgtccc 63720cgtggtgcag agccaggctc aggccgttcc caccgttctc
cccgtttccg ctgtccgctt 63780cggcctcgct gatctcgcgg atcgtcgcct
cgcccggttc cgccacgacg tcgacggccg 63840aaccgggccg ggcgcgtcgg
tgcaactggt cgaccacgag cggcgctcgg cggttccagc 63900ccagtagcag
aatccgctcc gccggcgcgg gcgtcggagg gcgggaggcc accgcggcct
63960tctcgaccga ctccgcacag tcgtccagcc gggccgtgtc gtcgtcgccg
gtgatgacga 64020cgagcaggtc gtccggggcg accggcgtcg tcggcggcgg
gttgagcaag ggggtgcagc 64080cgcgcatcag tccgacgacg ctcgtcgtcg
agtaggacag gagaacctcg ccgaacgggc 64140ggccggtcag ggccggctcg
ctgatcagat agaactcgtc tccggcgaag tcgaggagtt 64200cccggtggac
gagggagatc ccggggcggc gggcggcctg gacgatcagc cgggcggtga
64260cggtgtcact ctccaggacg acgccgtcgg gtccggcggc gagacaggcg
gccaggcggt 64320accggtcgtc ccggacggcg gcgacgacgg gcggacgcgg
tttcgccccg gccagagcgg 64380cccgcagcgc cagcagtgtc ttcaccacct
ccgcgtcggc gtgcggctcg tccggaggca 64440gaaccagcac gacaccggcc
gtggccgggc tggtcaacgg caacacggcc gggtcggtgg 64500tggggccgct
gcggcagatc aaccgcgtac cgccgcagga gcccaccttc gtgcccaggg
64560actcctccat gacggtcttg tcccggtcgg ccagcaccac caccgccgcg
ccgcgctggt 64620tgacgttggc ggccaccagc tcgctcacca ccgtgaagac
ctgttccgac catccgagga 64680ccacggcgtg cccctgttcc agcacggtgg
aacggccccg ccgcaacgag gtgagccgct 64740ccgtgagcgc cgtcgtgatc
aggccgacga gcgtggagac gtagagcagg gtgaccagcg 64800cgagcagcac
cgacaacatc gcccgcagcg gcgtacccgt ggcaccgccc agccgtagcg
64860tctccccggt gagacgccac acctccgcca gccgctccgc gagggacggc
ggggcatccg 64920ggtcggtcca caccatcacc gcgctggccg gcacgacgac
ggccagggac agcagcgcca 64980tccagccgac gagcgcggcg gcaccgcggg
ccagggtgct gtcgaaccag taacgggccc 65040tgtcaccgaa cggagtccgc
cgctgcgcca cccgtccccc ttccgtctcc ccgtactccc 65100accaggccgc
gtcggcaccg cagcggcgcc gtgccgcgcg ggcctgcccc ggagcgcgga
65160cgtgggcacg catgtcgcag tgtgggggat cgatcagggc cgcagacgac
gttcgaccag 65220ccttcatccc ttcgggtagc cgcgtcctcc acggcggcgg
acctgcgaag acgcccggct 65280cggacacgca gatgaaagcc gccgggcctc
attcgcggaa cgccgggtat ccgcgagacc 65340ggatcaccgt cacacacagc
gaggagagac cctgtgccgt ccaccgatgt cgtcgaactc 65400atcctgcggg
accaccgccg tatggaggaa ctgttccgca ccctgcgcaa cgtcgaagcg
65460gaccgtgccg cggccctgac ggagttcgcg gacctgctca tcgcgcacgc
ctcggccgag 65520gaggacgagg tctaccccgc cctgcgtcgg tacaagaacg
tcgagggtga ggatgtcgac 65580cacagcgtcc acgagcacca cgaggccaac
gaggcgctgc tggccctgct cgaggtggag 65640gacaccgctt ccgacgagtg
ggatgacaag ctcgaagagc tggtcacggc ggtcaaccac 65700cacgccgacg
aggaggaacg aacactcctc aacgacgccc gggagaacgt cgccgacgac
65760cgccgccggg aactggggca gaagttccag gaggcgcgtt cgcggtatct
ggagaccggc 65820tgcggcagtg tcgagaacgt gcgcaagctg gtcgccgccg
ccgacgactg acccgcgtcg 65880gcgacgtccg ggcgcggagg ggagccgccg
ccgtcgggcc ccctcgccgg gcgtaccgcg 65940gtcaggcggg tgagggctgc
gtccggtccg ggacgggttc gaggcggacg acgatgccct 66000tggacgtggg
ctgattgctg atgtccgcga cgctgtccag cggtaccagg acgttggtct
66060cgggatagta ggcggcggcc gagcccttgg ccgccgggta gggaacgacc
tggaagttct 66120cggcccggcg ctcggtgccg tccgcccaga cgctcacgag
gtcgacgcga tcgccctggg 66180cgaggccgag ttcgctcagg tcggccgggt
tgacgaggac gacgtggcgg ctgccgtgga 66240tgccgcggta gcgatcgttg
tcggtgtagg ggacggtgtt ccactggtcg tgcgaacgca 66300gtgtctgcag
cagcagatga ccttcgggcg cccgtgggac cacgctctcg ttgcgagtga
66360acagggcctt gccgacctcg gtgttgaaga cgccttcgtt gaccgggttg
ggcagttgga 66420agccaccggg ccgggtcacg cgtgcgttga agtcgtggaa
gccgggcacg atgcgcgcga 66480tgcggtcgcg gatggtgtgg tagtcgcctt
cgaaggtctc ccaggggatc tccaccctgc 66540cgtccagggt gagccgggcg
agccggcaca ggatcgcgat ctcgctcagc agcatggggg 66600aggcgggggc
caggcggccc cgggaggtgt gcacctcgct catggagttc tcgacggtga
66660cgaactgctc gccgtcggcc tggacgtcgc gctcggtgcg tcccagcgtc
ggcaggatca 66720acgcggtgtc accgcagacg gtgtgcgagc ggttgagctt
ggtcgagatg tgggcggtca 66780gccggcacga gcgcatcgcc tcctcggtga
cctcgctgtc gggcgccgcc cggacgaagt 66840tcccggccag ggcgaggaag
accttgacgc ggccctcgcg cattgccttg atcgagttca 66900ccgagtccag
gccgtgggcc cgcggcggct cgaagccgaa ctcgtcccgc agggcgtcca
66960ggaaggtgtc cggcatctgc tcccagatgc ccatggtgcg gtcgccctgg
acgttgctgt 67020ggccgcgcac cgggcaggcg ccggtgccgg cgcgtcccag
gttgccgcgg agcatcagga 67080agttgacgat ctcccggacg gtgggcacgc
cgtgcttgtg ctgggtgatg cccatcgccc 67140agcagacgac gacgcgttcg
ctgtcgagga cctcgtcgcg taccttctcg atctcctcgc 67200gggtcagtcc
ggtcgccgcg cgcacgtcgt cccagtccac cgtgcgggcg tgccgggcga
67260actcctcgaa gccggtggtg tgggcgtcga tgaagtcgtg gtccaggacg
gtgccgggcc 67320gggcgtcctc ggcctccagc agcagtcggt tgagggcctg
gaagagggcg aggtcgccac 67380cgggcttgat gtgcaggaaa cggtcggcga
tccgggtgcc gcgcccgacg accccgcgcg 67440gctgctgcgg gttcttgaag
cgtcgcagcc cggcctcggg aagcgggttc acggccacga 67500tccgggcgcc
gttccgcttg gcctcctcca gcgcgctgag ctggcgcggg tggttgctgc
67560cggggttctg cccgaccagg aagatcagat cggcgtggtg gaggtcgtcg
agaccgacgg 67620tgcccttgcc ggtccccagg gtctcgctca gggcgaagcc
gctggactcg tggcacatgt 67680tgctgcagtc gggcaggttg ttggtgccga
aggcgcgggc gaagagctgc agcacgaagg 67740cggcctcgtt gctggcgcgg
cccgaggtgt agaacaccgc ctcgtcgggg gaggccagcg 67800acttgagctc
ctccgcgagg acccccaggg cgtcgttcca gccgatgggc tcgtagtgcg
67860cggagccggg ccgtttgatc atcggctcgg tgagccggcc ctgctggttg
agccacatgt 67920cggagcggcc ggcgaggtcg gcgacgctgt gctcgcggaa
gaagtcggcg gtcacccgcc 67980gtgtggtcgc ctcgtcgttg atgtgcttgg
cgccgttctc gcagtactcg ttgcggtggc 68040gccgtcccgg ggccgggtcc
gcccacgcgc agccggggca gtcgatgccg cccacctggt 68100tcatggtcag
cagatccacc ccggtcctgc gcggggacgt ctgctccaag gagtactcca
68160gcgcgtgcac gaccgcgggg acgccggcgg cccacttctt cggcggtgtc
acggagaggc 68220tggtctccga ctcctcaccg tgcggcttct gcatgtgttc
gcctttctct cgccgtgtcc 68280ggccctcggt cagccgacgg gccaccgggc
caccctgctc ctgggcggtt ccggtcgatc 68340gtgttgcacg cggccgttgc
ggatggtgcc gcgccaggcg ccgccctctt gtcccagtcc 68400ctccatgaac
cgtttgaatc tcgtcagctc gccgtggacc agtcgggtgg tcagccggac
68460cgcccgcgag gaacgggtga ggatcctcgc ggctccccgc ggttccagaa
gcacccggac 68520ggtgatcgcg gtgccgccgg actccgtcgg cctgaactcg
acctctcccc ggtgccacgg 68580cctctgctcc aggccgcgcc aggccaggta
ggcgtcgggg tcctgctcca ggatctcgac 68640cgcgaagcgg cggcgcaggg
ggccgtaacc gagggtccag gcggtcacgg tgggccggac 68700ctgctcgacg
tcgcggacca cggccgagaa ccgagggaag gacttgaact gcgtccactg
68760gttgtacgcc gtccgtacgg gcaccgccac ctcgaccgtc tggtcgacgt
gccgctcgcg 68820cagcggacga cggcccgtgc tcctgtcacc gctggtgcgc
gcccccggtg tgtccggcac 68880ggcccgcggg ccctcgtgct gctccgccat
cgtggtctct cctcctgtgg ggggaaccag 68940gcgtccaggc tctcgacgtc
tcctcgtgcg tcagggtccc ctcgtgcgtc agggttttcc 69000cctcccggtt
cccacctccg gcgtttcgag tcacctcggt cccgccgcgt cctcgtgcgc
69060gtcggcctcc gacgccccgc ccggagagat ccggctcgac gtcggcctgg
tcgctccggc 69120gggcggtgag cagcaggtac tcccagtcga tggccccgtc
ccgcagcgcg ccggccgcga 69180ggtcggtgag ggccgcgtcc agggcggcgg
cgcgctcggt gtcgtcgccg atgtaccggt 69240agacggcgat gatcggcccg
taggcggcct tgaagaactc aaggaagtcc ccccttcgag 69300cggcgctcgg
gaccgggggt gcgccggccg gcctgatgac cagcgtcgac ggggtcgtga
69360gagcgtgatc gaggtgctcg acgagccgga gccaccgcag gagagtcgcg
ggcagcgctc 69420tctccagcaa cccagccttc gcacacccgt gcgcggcgca
aggggcgtac gtcccctgcg 69480caaaggggcg gctggaggcc ggtgccccac
gcctcgtctc cgcacggcgc gtcggacggg 69540gggagaacgc gtccgatgaa
cacggtgtgc gccctcccga ttcctctccg tgacgtcaat 69600gatgagccca
ccgcgcctgg gtcaggcgat ccgcggtccc ctgccgctcg gtcaggggcc
69660ggtgacggaa ggagtccacg gtgttgcttc tcatctctcc ggacggtgtc
gaggaagccc 69720tcgactgcgc gaaggcggcg gagcacctcg acatcgtcga
cgtgaagaag cccgacgagg 69780gctcgctggg cgcgaacttc ccgtgggtca
tcagggagat ccgcgacgcg gtgccggcgg 69840acaagccggt ctcggccacc
gtgggggacg tcccgtacaa gcccggcacg gtggcgcagg 69900ccgcgctggg
cgcggtcgtc tcgggggcca cgtacatcaa ggtcggcctc tacggatgca
69960cgacgcccga acagggcatc gcggtcatgc gcgcggtggt ccgggcggtg
aaggaccacc 70020gtcccgaagc gctcgtcgtc gcgtccggtt acgccgacgc
ccaccggatc ggctgcgtca 70080acccgctcgc cctgcccgac atcgccgccc
gctccggcgc cgacgccgcg atgctcgaca 70140ccgcggtcaa ggacgggacg
cggctgttcg atcacgttcc gccggacacc tgcgccgagt 70200tcgtccgtcg
cgcccacgcc gccggcctgc tcgccgccct cgcgggcagc gtcaggcaga
70260ccgacctcgg ccggctgacc cggatcggca cggacatcgt cggggtgcgc
ggagcggtct 70320gcgagggcgg cgaccgcaac gccggacgca tccggccgca
cctggtggcc gccttccgga 70380gcgagatgga ccggcacgcc cgcgagcacc
gggccggcgt caccaccgcg agctgaccgc 70440cggtatgccg acccccgcac
ccgaccacgc ccccgcacag cgggccgcgc ctctcgcggt 70500cgtcgatccg
gccaccggaa cggtcttcga cgaggccccc gaccagggac cggacgtgct
70560ggacgccgtc gtcgaccggg cccgccgggc ctggcacggc tggcgcgccg
atcccgacgc 70620ccgtaccacc gcgctgcgct cggcggccga cgcggtcgag
gccgccgggg acgacctcgc 70680ccgtctcctc acccgggaac agggaaagcc
cctggccgaa tcgcatgcgg aggtcgcccg 70740gacggcggcc cgcctgcgct
acttcgccgg cctggccccc cggacccggc gcatcaccga 70800cgggcggccg
gtgcgcagcg aggtccgctg gcgccccctc ggacccgtcg ccgcgatcgt
70860gccgtggaac ttccccctcc aactcgcgtc ggcgaagttc gcgcccgcgc
tcgccgcggg 70920caacaccatg gtcctcaaac cctccccctt caccccgctc
gccacccggc tgctcgggtc 70980cgtcctcgcc accgccctgc ccgaggacgt
cctgacggtc gtcaccggcc gcgagccact 71040cggcgcccgc ctcgccgcac
accccggcat ccgccacgtc accttcaccg gatcggtgcc 71100cacgggccgg
gccgtcgccc gagcggcggc ggcctcgctc gcccgggtca ccctggaact
71160gggcggcaac gacgccgcgg tcctgctgga cgacgtcgaa gtggaccgga
tcgccgaccg 71220gctgttctgg gccgcgttcc gcaactgcgg gcaggtctgc
atggcggtca agcgcgtcta 71280cgcaccggcc cgtctgcacg cacaggtcgt
cgaagccctc accgagcgcg ccaaggccgt 71340cgccgtcggg cccggcctcg
acccccgcac ccggctggga ccggtcgcca acgcccccca 71400gctggcccgg
gtcgagcaga tcacccggcg cgccctggcg gacggcgccc gggcggcggc
71460cggcggccac cggctggacg ggccgggctg cttcttcgcc cccacgatcc
tcaccgacgt 71520cccgcccgac agcccggtgg tgaccgagga gcagttcggg
ccggtactgc cggtgctgcc 71580gtaccggagc ctggacgaag ccgtcgacgc
ggccaacggc acgggattcg ggctgggggg 71640ctccgtatgg ggcaccgacc
tcgaccgggc cgaggcggtg gccgaccggc tggaatgcgg 71700cacggcctgg
gtcaaccacc acgccgagct gtccctcgcc cagcccttcg ccggcgacaa
71760ggacagcggg gtcggcgtcg cgggcgggcc gtggggactg tacggcaacc
tccgtccgtt 71820cgtcgtccac cgaccgcggg gggagtgacg gtgagcttcc
gggcggccgt actgcgcggg 71880tacgaggacc ccttcacggt cgaggaggtg
accctgggga cggagcccgg cgcaggggag 71940atcctggtcg agatcgccgg
ctgcggaatg tgccggaccg atctcgcggt ccggcgctcg 72000gccggccgga
gcccgctgcc ggcggtgctc ggccacgagg gctccggggt ggtggtgcgg
72060acgggcggcg gcccggacac cgcgatcggc gtcggtgacc acgtggtgct
gagcttcgac 72120tcctgcgggc actgccgcaa ctgccgcgcg gcggcccccg
cctactgtga ttccttcgcc 72180tccctcaacc tcttcggggg ccgtgcggag
gacccgccgc ggctcaccga cgggtcgggg 72240gcggcactgg ctccccggtg
gttcggacag tccgccttcg ccgagtacgc gctcgtctcc 72300gcccgcaacg
ccgttcgggt cgaccccgcc ctgcccgtcg aactgctcgg gccgctgggc
72360tgcggcttcc tcaccggagc cggagccgtg ctcaacacct tcgccgccgg
gccgggcgac 72420accctcgtcg tgctcggcgc gggcgccgtg ggcctggccg
cggtgatggc ggccaccgcc 72480gccggcgcac cgtccgtggc cgtggaccgc
aacccccgtc gcctggagct ggccgagcgg 72540ttcggcgcgg tcccgctgcc
cgccgcgacg gccggactgg ccgagcggat ccggcggctc 72600acggacggcg
gcgcgcggta cgcactggac acgaccgcct ccgtcccact gatcaacgag
72660gcgctgcgcg cactgcgtcc caccggcgct ctcggcctgg tggcacggct
ccacaccgcg 72720ctgcccctgg aaccgggcac gctcgaccgg gggcgcagca
tccgccacgt ctgcgagggg 72780gacgcggtac ccggtctgct gataccgcag
ctgacccggc tctggcaggc cggacgcttc 72840cccttcgacc agctcgtccg
tacctacccg ctggccgaca tcaacgaggc ggagcgcgac 72900tgcgacgccg
gcctcgtggt caaacccgtg ctgctcccgc ccgcgaggag ccggtgagta
72960cggcgcacgg caccgcggtc cgaccgcatc cgacgagcag gaagctcgcg
gccccacttc 73020cgccaacgga ggagacatga ccggcacggc gccgcagtac
acggacgtgg aaggcgtgaa 73080cggaggtgtg ggcctgacgg ccttcctggt
cgccgccgcg cgggcgatcg agacccatcg 73140cgacgacagt ctggcccagg
acgtctacgc ggaacacttc gtgcgcgccg ccccggcgtg 73200cgcggactgg
ccggtgcgca tcgagcaggt ccccgacggg gacggcaacc cgctgtgggg
73260acggttcgcc cgctatttcg gcctgcggac ccgggccctc gacgacttcc
tgctccggtc 73320ggtccggacg ggcccccgac aggtggtgct gctgggcgcg
gggttggaca cccgtgcctt 73380ccggctcgac tggccgtcgc agtgcgcggt
cttcgagatc gaccggacgg gcgtgctcgc 73440cttcaaacag caggtgctca
cggacctggc ggcaaccccg agagtggagc gcgtccccgt 73500tccggtcgat
ctgcgcgcgg actgggccgg cgcgctgacc gcggccggct tcgaccccgc
73560ggcgcccagc gtctggctgg ccgagggact gctcttctat ctgccgggcc
ccgccgagtc 73620gcttctcgtc gacacggtgg accggctgac caccgacggc
agcgcgctgg ccttcgaggc 73680caagctggag aaggacctgc tggcgtaccg
cgacagtgcg atctacacgg cgacgcgcga 73740gcagatcggc atcgacctcc
tccgcctctt cgacaagggg ccccgacccg actccgcggg 73800tgagctggcg
gccagaggct ggtccacctc gatgcacacg cccttcgtct tcacccaccg
73860gtacggacgc ggtcccctcc ccgagccgaa cgacgcgctg gaggggaacc
gatgggtctt 73920cgcccgcaag cccgggccct gacgtgccgg ccgcgcttgc
cgcccacgcc cggggacgcc 73980gctgacgagc ggtgtcagac ggtccgggcg
gccaccagcc ggtcgcccgg gatcccggcc 74040aggtccggcg cgtagtagtc
ctcgatgccg gaggcccacg cggccacggt gatgcggtcc 74100gcggcgtccg
gggcgaaggc gtcgaacagc gcgtggatgt tcgcttcctc gaaaccgatc
74160gccttcatca gcgcgacgaa gaggggacgc tcgatcagcc cgtccccgtc
gggatcgccg 74220agcgcggaca gtgcccgagc gaactcggcg atggtgggac
cgaagcgctc gggatcgagt 74280acgaacggac ggaactcctc cacggtgatc
acgccgtcgc cgtcggcgtc cagttcggtg 74340gccagcgtgg tccagtagcg
gcggaacgcg gcccggacgg cggccttggc gctgtcgtcc 74400gacccggccg
ccgccgcaac gacccggtcg gtcatcaggt cgaagtcgtc ggagtcgatg
74460actccgttgc cgttggcgtc gaagagggag aagaccagct cgacccgctt
ggcggcctca 74520tctcgcatgc acatcacctg tcttctacgg cccggtcttc
gcgggcccgg ccccatgaat 74580gctctgcgtg accgagcggg gcaggacgaa
agcctccgag cggtcgcgtc ccagagaacc 74640accatgaatg tccctgaact
gcagatcggg catctgctgg cctggtgcgg gcgggggctg 74700gcccggtgcg
gcaggggagt gctctggtgc ctgggcaagg ccgtcacggg gatcatcctg
74760ctcgccatct tcgcgtccgc gatgatc 74787
2 876 PRT Streptomyces parvulus Tu4055 2
Met Thr Gly Ser Ala Val Ser Ala Pro Phe Leu Gln
Pro Pro Glu Pro1 5 10 15Val Ser Gly His Ser Glu Arg Lys Ser Asp Pro
Val Leu Leu Val Gly 20 25 30Ala Gly Arg Arg Ala Arg Met Ala Asp Ala
Val Arg Ala Ala Gly Ala 35 40 45Gln Ala Gly Ile Asp Pro Ala Val Leu
Arg Arg Thr Arg Ala Thr Leu 50 55 60Ile Thr Ala Gly Ser Ala Gly Ala
Ala Gly Arg Leu
Ala Ala Ala Leu65 70 75 80Arg Leu Thr Gly Ala Thr Ile Ser Leu Asp
Thr Arg Glu Thr Pro Thr 85 90 95Leu Leu Ala Leu His Leu Ala Ala Gln
Ala Leu Arg Ala Gly Asp Thr 100 105 110Ser Tyr Ala Val Val Gly Ala
Glu Leu Pro Asp Gly Asn Cys Ala Leu 115 120 125Ile Leu Ala Arg Gln
Ser Ala Ala Thr Ala Glu Gly Ala Val Pro Gln 130 135 140Ala Ile Val
Arg Thr Thr Thr Ala Asp Arg Thr Thr Thr Ala Asp His145 150 155
160Ala Pro Ala Pro Asp Asp His Gly Ser Pro Ala Arg Glu Ala Pro His
165 170 175Ala Thr Arg Thr Leu Ser Pro Gly Ile Thr Gln Ala Pro Ala
Glu Gly 180 185 190Phe Pro Gly Leu Leu Ala Thr Leu His Asp Asp Thr
Pro Leu Arg Pro 195 200 205Thr Ala Val Thr Glu His Gly Ser Asp Ala
Thr Thr Val Leu Val Leu 210 215 220Leu Asp Gln Pro Gln Asp Ala Ala
Pro Ala Ala Pro Leu Pro Trp Val225 230 235 240Val Ser Ala Pro His
Thr Arg Ala Leu Arg Ala Thr Ala Ala Thr Leu 245 250 255Ala Val His
Leu Asp Thr Thr Pro Ala Ala Pro Ala Asp Val Ala His 260 265 270Thr
Leu Leu Thr Ala Arg Pro Asp Arg His Arg Ala Ala Val Val Gly 275 280
285Ala Asp Arg Ala Thr Leu Thr Asp Gly Leu Arg Ala Leu Ala Thr Gly
290 295 300Gly Asp Ala Pro His Leu Val His Gly Thr Ala Thr Gly Ser
Pro Arg305 310 315 320Pro Val Phe Val Phe Pro Gly Gln Gly Ser Gln
Trp Pro Gly Met Ala 325 330 335Ala Glu Leu Leu Glu Thr Ser Glu Pro
Phe His Asp Ser Val His Ala 340 345 350Cys Ala Asp Ala Leu Ala Glu
Phe Val Asp Trp Ser Val Leu Asp Val 355 360 365Leu Arg Gln Ala Pro
Asp Ala Pro Pro Leu Arg Arg Val Asp Val Leu 370 375 380Gln Pro Thr
Leu Trp Ala Thr Met Val Ser Leu Ala Glu Val Trp Arg385 390 395
400Ser Tyr Gly Val Glu Pro Ala Ala Val Val Gly His Cys Tyr Gly Glu
405 410 415Ile Ala Ala Ala Gln Val Ala Gly Ala Leu Asp Met Arg Asp
Ala Ala 420 425 430Arg Leu Leu Ala His Arg Ser Arg Ala Trp Leu Arg
Leu Val Gly Lys 435 440 445Gly Thr Val Ile Ser Val Ala Thr Ser Gly
Gln Asp Ile Thr Arg Arg 450 455 460Met Ala Ala Trp Pro Asp Ser Val
Glu Leu Ala Ala Leu Asn Gly Pro465 470 475 480Arg Ser Val Ala Leu
Ala Gly Pro Pro Asp Val Leu Asp Gly Ile Val 485 490 495Asn Asp Leu
Thr Asp Gln Gly Ile His Ala Lys Arg Ile Pro Gly Val 500 505 510Asp
Thr Val Gly His Cys Ser Gln Val Glu Val Leu Arg Asp His Leu 515 520
525Leu Asp Val Leu Arg Pro Val Ser Pro Arg Pro Ala Ala Val Pro Phe
530 535 540Tyr Ser Thr Val Asp Gly Thr Glu Arg Asp Thr Thr Thr Leu
Asp Thr545 550 555 560Asp Tyr Trp Tyr Leu Asn Thr Arg Ser Gln Val
Arg Phe His Gln Ala 565 570 575Val Arg Asn Leu Leu Ala Ala Gly His
Arg Ser Phe Val Glu Val Ser 580 585 590Pro His Pro Leu Leu Gly Ala
Ser Ile Glu Asp Thr Ala Ala Glu Phe 595 600 605Gly Leu Asp Asp Val
Ala Ala Val Gly Thr Leu Arg Arg Gly Gln Gly 610 615 620Gly Thr Arg
Arg Val Leu Thr Ser Val Ala Glu Ala Tyr Val His Gly625 630 635
640Ile Asp Ile Asp Phe Thr Pro Ala Phe Thr Gly Thr Thr Pro Asn Arg
645 650 655Ile Asp Leu Pro Thr Val Glu Asp His Gly Ile Glu Gly His
Gly Asp 660 665 670Asp Gly Gly Glu Thr Trp Thr Asp Arg Val Arg Thr
Leu Pro Asp Glu 675 680 685Gln Arg Glu Glu Ala Leu Leu Asp Leu Val
Cys Arg Thr Val Ala Ala 690 695 700Val Leu Glu Ala Asp Pro Ala Gly
Thr Ala Asp Ala Val Ala Pro Asp705 710 715 720Thr Ala Phe Lys Glu
Met Gly Leu Gly Ser Leu Ser Ala Val Arg Leu 725 730 735Arg Asn Gly
Leu Arg Glu Ala Thr Gly Ala His Leu Pro Ala Thr Ile 740 745 750Ala
Tyr Asp His Pro Thr Pro Ala Ala Leu Ala Arg His Leu Ala Met 755 760
765Thr Leu Phe Asp Ala Thr Gly Ala Ala Pro Ala Val Pro Ala Pro Ser
770 775 780Arg Asp Asp Glu Pro Ile Asp Ala Glu Thr Ala Val Leu Thr
Ala Leu785 790 795 800Glu Arg Ala Asp Glu Ala Leu Glu Arg Leu Arg
Ala Pro His Ala Arg 805 810 815Thr Pro Arg Gln Glu Thr Gly Arg Arg
Ile Asp Glu Leu Leu Arg Ser 820 825 830Leu Thr Asp Lys Ala Arg Arg
Met Arg Gln Ala Asp Ala Val Asp Asp 835 840 845Val Asp Asp Pro Ala
Thr Asp Arg Phe Ala Ala Ala Thr Asp Asp Glu 850 855 860Met Phe Glu
Leu Leu Glu Lys Arg Phe Gly Ile Ser865 870 875
3 1571 PRT Streptomyces parvulus Tu4055 3
Met Ala His Glu Asp Lys Leu Arg His Leu Leu Lys
Arg Val Ser Ala1 5 10 15Glu Leu Asp Asp Thr Gln Arg Arg Val Arg Glu
Met Glu Glu Ser Glu 20 25 30Arg Glu Pro Ile Ala Ile Val Gly Met Ser
Cys Arg Leu Pro Gly Gly 35 40 45Val Asn Ser Pro Gly Glu Phe Trp Ser
Leu Leu Glu Ala Gly Thr Asp 50 55 60Ala Val Ser Glu Phe Pro Arg Asp
Arg Gly Trp Asp Val Glu Asn Leu65 70 75 80Tyr Asp Pro Asp Pro Asp
Ala Pro Gly Arg Ser Tyr Val Arg Glu Gly 85 90 95Gly Phe Leu Asp Gly
Ala Gly Gln Phe Asp Ala Ala Phe Phe Gly Ile 100 105 110Ser Pro Arg
Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu 115 120 125Glu
Cys Ser Trp Glu Ala Ile Glu Arg Ser Arg Ile Asp Pro Lys Thr 130 135
140Leu His Gly Ser Arg Thr Gly Val Phe Ala Gly Ser Asn Trp Gln
Asp145 150 155 160Tyr Asn Thr Leu Leu Leu Asn Ala Glu Glu Arg Ser
Gln Ser Tyr Leu 165 170 175Ala Thr Gly Ala Ser Gly Ser Val Leu Ser
Gly Arg Val Ser Tyr Thr 180 185 190Leu Gly Met Glu Gly Pro Ala Ile
Thr Val Asn Thr Ala Cys Ser Ser 195 200 205Ser Leu Val Ala Val His
Leu Ala Ala Arg Ser Leu Arg Ala Gly Glu 210 215 220Cys Asp Leu Ala
Leu Ala Gly Ala Val Thr Val Met Ser Thr Pro Gln225 230 235 240Leu
Pro Val Ala Phe Ser Arg Gln Arg Gly Leu Ala Pro Asp Gly Arg 245 250
255Ser Lys Ala Phe Ala Val Ser Ala Asp Gly Met Gly Phe Gly Glu Gly
260 265 270Val Gly Val Leu Val Leu Glu Arg Leu Ser Val Ala Arg Arg
Asn Gly 275 280 285His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val
Asn Gln Asp Gly 290 295 300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly
Pro Ser Gln Gln Arg Val305 310 315 320Ile Arg Ala Ala Leu Ala Ser
Ala Gly Leu Gly Pro Ala Asp Val Asp 325 330 335Val Val Glu Ala His
Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu 340 345 350Ala Gln Ala
Leu Leu Ala Thr Tyr Gly Arg Gly Arg Asp Ala Glu Arg 355 360 365Pro
Leu Trp Leu Gly Ser Val Lys Ser Asn Ile Gly His Ala Gln Ala 370 375
380Ala Ala Gly Val Ala Gly Val Ile Lys Met Val Leu Ala Met Glu
Lys385 390 395 400Gly Arg Leu Pro Arg Thr Leu His Val Asp Glu Pro
Ser Gly Glu Val 405 410 415Asp Trp Asp Ser Gly Ala Val Arg Leu Leu
Thr Glu Ala Arg Asp Trp 420 425 430Pro Ser Glu Glu Gly Arg Leu Arg
Arg Ala Gly Val Ser Ser Phe Gly 435 440 445Ile Ser Gly Thr Asn Ala
His Val Ile Ile Glu Glu Ala Pro Glu Glu 450 455 460Gly Glu Glu Pro
Glu Ser Asp Ala Gly Gly Val Val Pro Trp Val Leu465 470 475 480Ser
Ala Arg Thr Glu Gly Ala Leu Gln Ala Gln Ala Val Gln Leu Ser 485 490
495Glu Phe Val Gly Glu Ser Ser Pro Val Asp Val Gly Trp Ser Leu Val
500 505 510Ser Thr Arg Ala Ala Phe Glu His Arg Ala Val Val Val Gly
Arg Gly 515 520 525Arg Asp Glu Leu Val Arg Gly Leu Ser Glu Val Ala
Gln Gly Arg Gly 530 535 540Val Arg Gly Val Ala Ser Ser Ala Ser Gly
Gly Leu Ala Phe Val Phe545 550 555 560Ala Gly Gln Gly Ser Gln Arg
Leu Gly Met Gly Arg Gly Leu Tyr Glu 565 570 575Arg Phe Pro Val Phe
Ala Glu Ala Phe Asp Glu Val Cys Gly Arg Val 580 585 590Gly Pro Gly
Val Arg Glu Val Val Phe Gly Ser Asp Ala Gly Glu Leu 595 600 605Asp
Arg Thr Val Trp Ala Gln Ala Gly Leu Phe Ala Leu Glu Val Ala 610 615
620Leu Phe Arg Leu Leu Glu Ser Trp Gly Val Arg Pro Gly Cys Leu
Ile625 630 635 640Gly His Ser Val Gly Glu Leu Ser Ala Ala Cys Val
Ala Gly Leu Trp 645 650 655Ser Leu Glu Asp Ala Cys Arg Val Val Ala
Ala Arg Ala Arg Leu Met 660 665 670Gln Ala Leu Pro Ala Gly Gly Val
Met Val Ala Val Arg Ala Glu Ala 675 680 685Gly Glu Leu Ala Gly Phe
Leu Gly Glu Asp Val Val Ile Ala Ser Val 690 695 700Asn Ala Pro Gly
Gln Val Val Ile Ala Gly Pro Glu Gly Gly Val Glu705 710 715 720Arg
Val Val Ala Ala Cys Gly Ala Arg Ser Arg Arg Leu Ala Val Ser 725 730
735His Ala Phe His Ser Pro Leu Val Glu Pro Met Leu Gly Glu Phe Arg
740 745 750Arg Val Val Glu Ser Val Ala Phe Gly Val Pro Ser Leu Arg
Val Val 755 760 765Ser Asn Val Thr Gly Ala Trp Val Asp Pro Glu Glu
Trp Gly Thr Pro 770 775 780Glu Tyr Trp Val Arg Gln Val Arg Glu Pro
Val Arg Phe Ala Asp Gly785 790 795 800Val Ala Thr Leu Leu Asp Ala
Gly Val Arg Thr Phe Val Glu Leu Gly 805 810 815Pro Ala Gly Ala Leu
Thr Ser Met Val Ser His Cys Ala Asp Ala Thr 820 825 830Ala Thr Ser
Val Thr Ala Val Pro Thr Leu Arg Pro Asp His Asp Glu 835 840 845Ser
Arg Thr Val Leu Ser Ala Ala Ala Ser Leu Tyr Val Gln Gly His 850 855
860Pro Val Asp Trp Ala Pro Leu Phe Pro Arg Ala Arg Thr Val Asp
Leu865 870 875 880Pro Thr Tyr Pro Phe Gln His Gln His Tyr Trp Leu
Asp Val Pro Pro 885 890 895Leu Phe Thr Ala Ser Ser Ala Ala Gln Asp
Gly Gly Trp Arg Tyr Arg 900 905 910Ile His Trp Arg Arg Leu Gly Thr
Arg Asp Ser Gly Asp Arg Leu Ser 915 920 925Gly Arg Trp Leu Leu Leu
Val Pro Glu Ser Asp Gly Thr Glu Pro Trp 930 935 940Val Glu Gly Ala
Glu Lys Met Leu Ala Glu Arg Gly Cys Glu Val Val945 950 955 960His
Val Pro Ile Ala Ala Thr Ala Asp Arg Asp Ala Met Val Gly Ala 965 970
975Val Arg Glu Ser Val Glu Asp Gly Arg Val Asp Gly Val Leu Ser Leu
980 985 990Leu Ala Leu Asp Gly Arg Pro His Pro Asp Ala Ala Ala Val
Pro Thr 995 1000 1005Gly Leu Val Ala Thr Ala Gln Val Val Gln Val
Ser Asp Glu Leu Gly 1010 1015 1020Ile Gly Pro Leu Trp Val Ala Thr
Arg Gln Ala Val Ser Val Asp Gly1025 1030 1035 1040Ala Asp Glu Ala
Asp Gly Ala Gly Arg Thr Arg Lys Ala Asp Asp Pro 1045 1050 1055Ala
Asp Val Ala Gln Ala Ala Val Trp Gly Leu Gly Arg Val Ala Ala 1060
1065 1070Leu Glu Lys Pro Arg Leu Trp Gly Gly Leu Val Asp Leu Pro
Ala Arg 1075 1080 1085Ala Asp Glu Arg Met Arg Asp Leu Val Ala Gln
Ala Leu Thr Ala Pro 1090 1095 1100Asp Ala Glu Asp Gln Leu Ala Val
Arg Ala Asp Gly Ile Ala Val Arg1105 1110 1115 1120Arg Leu Val Arg
Ser Ala Ala Ser Ala Pro Ala Asp Asp Trp Gln Pro 1125 1130 1135Ser
Gly Thr Val Leu Val Thr Gly Gly Thr Gly Gly Val Gly Ala Asn 1140
1145 1150Val Ala Arg Trp Leu Val Thr Gln Asp Ile Gln His Leu Leu
Leu Val 1155 1160 1165Ser Arg Arg Gly Pro Asp Ala Pro Gly Ala Ala
Glu Leu Leu Ala Glu 1170 1175 1180Leu Ser Ala Ser Gly Thr Ser Val
Thr Ile Glu Pro Cys Asp Val Thr1185 1190 1195 1200Asp Ala Asp Ala
Val Arg Arg Leu Ile Gly Ala Val Pro Ala Glu Arg 1205 1210 1215Pro
Leu Ser Thr Val Val His Ala Ala Gly Val Leu Asp Asp Cys Leu 1220
1225 1230Ile Asp Ala Leu Thr Pro Gln Arg Leu Ala Ala Ala Leu Glu
Val Lys 1235 1240 1245Ala Lys Gly Ala Leu Asn Leu His Glu Ala Ala
Gly Glu Ala His Leu 1250 1255 1260Val Leu Phe Ser Ser Leu Ala Gly
Thr Thr Gly Thr Lys Gly Gln Gly1265 1270 1275 1280Asn Tyr Ala Ala
Ala Asn Ala Tyr Leu Asp Ala Leu Ala Glu Arg Arg 1285 1290 1295Arg
Ala Asp Gly Leu Pro Ala Thr Ser Val Ala Trp Gly Ala Trp Gln 1300
1305 1310Gly Ala Gly Met Val Ala Asp Ala Ala Val Ala His Arg Thr
Arg Arg 1315 1320 1325Tyr Gly Leu Pro Leu Met Ser Pro Asp Arg Ala
Val Ala Thr Leu Arg 1330 1335 1340Gln Val Met Ala Glu Pro Val Ala
Thr Gln Val Val Ala Asp Val Asp1345 1350 1355 1360Trp Gln Arg Phe
Val Ala Asp Phe Thr Ala Val Arg Pro Ser Arg Leu 1365 1370 1375Leu
Ala Asp Leu Pro Glu Val Arg Ser Leu Gly Glu Gln Arg Lys Asp 1380
1385 1390Gly Pro Gly Gly Gln Gly Glu Glu Asp Gly Leu Ala Ser Lys
Leu Ala 1395 1400 1405Ala Leu Pro Glu Ala Asp Arg Arg Arg Ala Val
Leu Asp Leu Val Glu 1410 1415 1420Glu Leu Val Leu Gly Val Leu Gly
His Glu Thr Arg Ala Ala Ile Gly1425 1430 1435 1440Pro Asp Ser Ser
Phe His Ala Ile Gly Phe Asp Ser Leu Thr Ala Val 1445 1450 1455Glu
Leu Arg Asn Leu Leu Thr Val Arg Leu Gly Met Lys Leu Pro Ala 1460
1465 1470Thr Leu Val Tyr Asp His Pro Thr Leu Ser Ser Leu Ala Asp
His Leu 1475 1480 1485His Glu Gln Leu Val Ile Asp Gly Thr Pro Met
Thr Asp Thr Ala Ala 1490 1495 1500Asp Leu Leu Ala Glu Leu Asp Ala
Leu Ala Ala Arg Leu Ala Ala Val1505 1510 1515 1520Gly Leu Glu Pro
Glu Ala Arg Ala Arg Ile Gly Arg Arg Leu Lys Asp 1525 1530 1535Met
Gln Thr Ala Cys Glu Pro Arg Ser Glu Ser Ser Arg Asp Leu Lys 1540
1545 1550Ser Ala Ser Arg Thr Glu Val Leu Asp Phe Leu Thr Asn Glu
Leu Gly 1555 1560 1565Ile Ser Arg 157043500PRTStreptomyces parvulus
Tu4055 4Met Pro Asn Asp Glu Glu Leu Leu Asp Tyr Leu Lys Arg Thr Ala
Ser1 5 10 15Asn Leu Gln Glu Ala Arg Gln Arg Val His Glu Leu Glu Glu
Ser Glu 20 25 30Arg Glu Pro Ile Ala Ile Val Gly Met Ser Cys Arg Leu
Pro Gly Gly 35 40 45Val Asn Ser Pro Glu Glu Phe Trp Ser Leu Leu Glu
Ala Gly Thr Asp 50 55 60Ala Val Ser Glu Phe Pro Arg Asp Arg Gly Trp
Asp Val Glu Arg Leu65 70 75 80Tyr Asp Pro Asp Pro Asp Ala
Pro Gly Lys Ser Tyr Val Arg Glu Gly 85 90 95Gly Phe Leu Asp Gly Ala
Gly Arg Phe Asp Pro Ala Phe Phe Gly Ile 100 105 110Ser Pro Arg Glu
Ala Val Val Met Asp Pro Gln Gln Arg Leu Leu Leu 115 120 125Glu Cys
Ser Trp Glu Ala Ile Glu Arg Ser Arg Ile Asp Pro Lys Thr 130 135
140Leu His Gly Ser Arg Ala Gly Val Phe Val Gly Ser Asn Gly Gln
Asp145 150 155 160Tyr Gly Thr Leu Leu Leu Arg Ala Asp Asp Arg Ser
His Ala Tyr Leu 165 170 175Ala Thr Gly Ala Ser Ala Ser Val Leu Ser
Gly Arg Ile Ser Tyr Thr 180 185 190Leu Gly Leu Glu Gly Pro Ala Val
Thr Ile Ser Thr Ala Cys Ser Ser 195 200 205Ser Leu Val Ala Leu His
Leu Ala Ala Arg Ala Leu Arg Ala Gly Glu 210 215 220Cys Glu Leu Ala
Leu Ala Gly Gly Val Thr Val Met Pro Thr Thr Arg225 230 235 240Leu
Phe Glu Val Phe Ser Arg Gln Arg Gly Leu Ala Gly Asp Gly Arg 245 250
255Cys Lys Ala Phe Ala Ala Gly Ala Asp Gly Thr Gly Trp Gly Glu Gly
260 265 270Val Gly Val Leu Val Leu Glu Arg Leu Ser Val Ala Arg Arg
Asn Gly 275 280 285His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val
Asn Gln Asp Gly 290 295 300Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly
Pro Ser Gln Gln Arg Val305 310 315 320Ile Arg Ala Ala Leu Ala Ser
Ala Arg Leu Ala Pro Glu Asp Val Asp 325 330 335Ala Val Glu Ala His
Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu 340 345 350Ala Gln Ala
Leu Leu Ala Thr Tyr Gly Arg Gly Arg Asp Ala Glu Arg 355 360 365Pro
Leu Trp Leu Gly Ser Val Lys Ser Asn Ile Gly His Ala Gln Ala 370 375
380Ala Ala Gly Val Ala Gly Val Ile Lys Met Val Lys Ala Met Gln
Ala385 390 395 400Gly Thr Leu Pro Arg Thr Leu His Val Asp Glu Pro
Ser Gly Glu Val 405 410 415Asp Trp Asp Ser Gly Ala Val Arg Leu Leu
Thr Glu Ala Arg Asp Trp 420 425 430Pro Ser Glu Glu Gly Arg Leu Arg
Arg Ala Gly Val Ser Ser Phe Gly 435 440 445Ile Ser Gly Thr Asn Ala
His Val Ile Leu Glu Glu Pro Pro Ala Glu 450 455 460Asp Ala Val Pro
Glu Pro Glu Ala Gly Asp Val Val Pro Trp Val Leu465 470 475 480Ser
Ala Arg Ser Ala Glu Ala Leu Arg Glu Gln Ala Ala Arg Leu Ala 485 490
495Ser Val Ala Gly Gly Leu Asn Val Val Asp Val Gly Trp Ser Leu Ala
500 505 510Ser Thr Arg Ala Ala Phe Glu His Arg Ala Val Val Val Gly
Arg Glu 515 520 525Arg Glu Glu Leu Leu Ala Gly Leu Phe Ala Val Ala
Ala Gly Arg Pro 530 535 540Ala Ala Asn Val Val Thr Gly Pro Val Ser
Ser Gly Arg Pro Ala Phe545 550 555 560Val Phe Ala Gly Gln Gly Ser
Gln Arg Leu Gly Met Gly Arg Gly Leu 565 570 575Tyr Glu Arg Phe Pro
Val Phe Ala Glu Ala Phe Asp Glu Val Cys Gly 580 585 590Arg Val Gly
Pro Gly Val Arg Glu Val Val Phe Gly Ser Asp Ala Gly 595 600 605Glu
Leu Asp Arg Thr Val Trp Ala Gln Ala Gly Leu Phe Ala Leu Glu 610 615
620Val Ala Leu Phe Arg Leu Leu Glu Ser Trp Gly Val Arg Pro Gly
Cys625 630 635 640Leu Ile Gly His Ser Val Gly Glu Leu Ser Ala Ala
Cys Val Ala Gly 645 650 655Leu Trp Ser Leu Glu Asp Ala Cys Arg Val
Val Ala Ala Arg Ala Arg 660 665 670Leu Met Gln Ala Leu Pro Ala Gly
Gly Val Met Val Ala Val Arg Ala 675 680 685Glu Ala Gly Glu Leu Ala
Gly Phe Leu Gly Glu Asp Val Val Ile Ala 690 695 700Ser Val Asn Ala
Pro Gly Gln Val Val Ile Ala Gly Pro Glu Gly Gly705 710 715 720Val
Glu Arg Val Val Ala Ala Cys Gly Ala Arg Ser Arg Arg Leu Ala 725 730
735Val Ser His Ala Phe His Ser Pro Leu Val Glu Pro Met Leu Gly Glu
740 745 750Phe Arg Arg Val Val Glu Ser Val Ala Phe Gly Val Pro Ser
Leu Arg 755 760 765Val Val Ser Asn Val Thr Gly Ala Trp Val Asp Pro
Glu Glu Trp Gly 770 775 780Thr Pro Glu Tyr Trp Val Arg Gln Val Arg
Glu Pro Val Arg Phe Ala785 790 795 800Asp Gly Val Ala Thr Leu Leu
Asp Ala Gly Val Arg Thr Phe Val Glu 805 810 815Leu Gly Pro Ala Gly
Thr Leu Thr Ser Met Val Ser His Cys Ala Asp 820 825 830Ala Thr Ala
Thr Ser Val Thr Ala Val Pro Thr Leu Arg Pro Asp His 835 840 845Asp
Glu Ser Arg Thr Val Leu Ser Ala Ala Ala Ser Leu Tyr Val Gln 850 855
860Gly His Pro Val Asp Trp Ala Pro Leu Phe Pro Arg Ala Arg Thr
Val865 870 875 880Asp Leu Pro Thr Tyr Pro Phe Gln His Gln His Tyr
Trp Met Met Asn 885 890 895Thr Gly Ser Ala Ala Glu Pro Ala Glu Leu
Gly Leu Gly Asp Ala Arg 900 905 910His Pro Leu Leu Gly Ser Val Val
Thr Val Ala Gly Asp Asp Lys Val 915 920 925Val Phe Ala Gly Arg Leu
Ala Leu Arg Thr His Pro Trp Leu Ala Asp 930 935 940His Thr Val Leu
Asp Ala Val Leu Leu Pro Ala Thr Ala Phe Leu Glu945 950 955 960Leu
Ala Val Arg Ala Gly Glu Glu Val Ser Cys Pro Val Val His Asp 965 970
975Leu Thr Leu His Arg Pro Leu Val Val Pro Glu Arg Gly Ala Val Gln
980 985 990Val Gln Met Ala Val Gly Ala Pro Glu Ala Asp Gly Arg Arg
Glu Val 995 1000 1005Arg Val Tyr Ser Arg Pro Asp Asp Asp Ala Glu
His Glu Trp Thr Leu 1010 1015 1020His Ala Ala Gly Leu Leu Ala Ser
Ala Ala Thr Ala Glu Pro Ala Val1025 1030 1035 1040Ala Ala Gly Ala
Trp Pro Pro Pro Glu Ala Gln Ala Val Asp Leu Asp 1045 1050 1055Gly
Phe Tyr Ala Gly Leu Ala Glu His Gly Tyr His Tyr Gly Pro Leu 1060
1065 1070Phe Gln Gly Val Arg Ala Ala Trp Arg Leu Gly Asp Asp Val
Leu Ala 1075 1080 1085Glu Ile Val Leu Pro Glu Ala Ala Gly Ala Asp
Ala Ala Arg Tyr Gly 1090 1095 1100Met His Pro Ala Leu Leu Asp Ala
Val Leu His Ala Ala Arg Leu Gly1105 1110 1115 1120Ala Phe Arg Glu
Arg Ser Glu Glu Lys Tyr Leu Pro Phe Ala Trp Glu 1125 1130 1135Gly
Val Thr Leu Arg Thr Arg Gly Ala Thr Ala Val Arg Ala Arg Ile 1140
1145 1150Ser Arg Ala Gly Thr Asp Ala Ile Arg Leu Asp Val Thr Asp
Thr Ala 1155 1160 1165Asp Arg Pro Val Leu Thr Ala Glu Ser Leu Thr
Leu Arg Pro Val Ser 1170 1175 1180Ala Gly Gln Leu Met Ala Val Pro
Arg Asp Ser Leu Phe Arg Val Asp1185 1190 1195 1200Trp Val Ser Ala
Pro Ala Ala Asn Gly Pro Gly Leu Arg Leu Ala Arg 1205 1210 1215Ala
Ala Thr Val Glu Ala Ala Leu Ala Ala Asp Ala Asp Ile Val Val 1220
1225 1230Val Pro Cys Leu Asp Ser Glu Gly Pro His Gln Ala Thr Tyr
Gln Ala 1235 1240 1245Leu Glu Leu Leu Gln Arg Trp Leu Ala Ala Asp
Thr Gly Thr Thr Thr 1250 1255 1260Leu Ala Leu Leu Thr His Arg Ala
Val Ala Val Gly Asp Asp Val His1265 1270 1275 1280Asp Leu His His
Ala Pro Leu Trp Gly Leu Val Arg Thr Ala Gln Thr 1285 1290 1295Glu
His Pro Gly Cys Phe Arg Leu Val Asp Ser Asp Asp Pro Asp Pro 1300
1305 1310Thr Thr Asp Val Leu Ala Ala Ala Leu Ala Thr Gly Glu Pro
Gln Val 1315 1320 1325Ala Ile Arg Asp Gly Ala Val Leu Ala Pro Arg
Leu Thr Ala Ala Ser 1330 1335 1340Ala Pro Arg Glu Pro Ala Glu Trp
Asp Ala Glu Gly Thr Val Leu Ile1345 1350 1355 1360Thr Gly Gly Ser
Gly Ala Leu Ala Gly Ile Val Ala Gln His Leu Val 1365 1370 1375Ala
Arg His Gly Val Arg Arg Leu Val Leu Ala Ser Arg Ser Gly Arg 1380
1385 1390Pro Ala Pro Gly Ala Asp Leu Leu Asp Ala Asp Val Thr Ala
Val Ser 1395 1400 1405Cys Asp Val Ser Asp Arg Asp Ala Val Ala Ala
Leu Leu Ala Ser Val 1410 1415 1420Pro Asp Glu His Pro Leu Thr Ala
Val Val His Thr Ala Gly Val Leu1425 1430 1435 1440Asp Asp Gly Val
Leu His Ala Leu Thr Thr Glu Arg Ile Asp Thr Ser 1445 1450 1455Phe
Ala Ala Lys Val Asp Gly Ala Arg His Leu His Glu Leu Thr Ser 1460
1465 1470His Leu Asp Leu Thr Ala Phe Val Leu Phe Ser Ser Ala Ser
Ala Val 1475 1480 1485Leu Gly Ala Ala Gly Gln Gly Asn Tyr Ala Ala
Ala Asn Ala Tyr Leu 1490 1495 1500Asp Ala Leu Ala Ala His Arg Arg
Ser Asn Asp Leu Pro Ala Val Ser1505 1510 1515 1520Leu Ala Trp Gly
Leu Trp Ala Glu His Glu Gly Met Ala Arg Gly Leu 1525 1530 1535Gly
Asp Ala Glu Leu Thr Arg Ile Ser Arg Ile Gly Val Thr Ala Leu 1540
1545 1550Ser Ala Glu Asp Gly Met Arg Leu Phe Asp Ala Gly Cys Ala
Gly Asp 1555 1560 1565Gln Ser Gln Leu Val Pro Met Arg Val Asp Thr
Ala Ala Leu Arg Ala 1570 1575 1580Arg Arg Asp His Leu Pro Ala Pro
Met Trp Ser Leu Val Pro Glu Arg1585 1590 1595 1600Thr Arg Ala Ala
Arg Thr Gln Pro Ala Ala Ser Leu Arg Asp Arg Leu 1605 1610 1615Ala
Glu Leu Thr Ala Pro Glu Arg Lys Arg Thr Val Leu Asn Leu Val 1620
1625 1630Arg Asn Ala Val Ala Asp Thr Leu Gly His Asn Ala Ala Asp
Gly Val 1635 1640 1645Pro Pro Asp Gln Ser Leu Asp Ala Ala Gly Phe
Asp Ser Leu Thr Ala 1650 1655 1660Val Glu Phe Arg Asn Arg Leu Ser
Ala Val Thr Asp Leu Arg Leu Pro1665 1670 1675 1680Ala Thr Leu Thr
Tyr Asp His Pro Thr Pro Ala Ala Ile Ala Glu His 1685 1690 1695Ile
Leu Thr Arg Leu Thr Leu Leu Lys Glu Thr Ala Ala Pro Ala Val 1700
1705 1710Gly Thr Ala Pro Val Ala Ala Pro Thr Glu Asp Asp Ala Ile
Val Ile 1715 1720 1725Val Gly Met Ala Gly Arg Phe Pro Gly Gly Val
Arg Thr Pro Glu Gly 1730 1735 1740Leu Trp Asp Leu Val His Ser Gly
Thr Asp Ala Ile Ser Glu Trp Pro1745 1750 1755 1760Thr Asp Arg Gly
Trp Asp Val Glu Asn Leu Tyr Asp Pro Asp Pro Asp 1765 1770 1775Ala
Val Gly Lys Ser Tyr Val Arg His Gly Gly Phe Leu His Asp Val 1780
1785 1790Ala Gly Phe Asp Ala Gly Phe Phe Gly Ile Ser Pro Arg Glu
Ala Leu 1795 1800 1805Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu
Cys Ser Tyr Glu Ala 1810 1815 1820Leu Glu Arg Ala Gly Ile Asp Pro
Ala Thr Leu Arg Gly Ser Arg Ser1825 1830 1835 1840Gly Val Tyr Ala
Gly Val Met Tyr His Glu Tyr Ala Ser Arg Leu Gly 1845 1850 1855Ala
Thr Pro Ala Gly Phe Glu Gly Thr Leu Gly Thr Gly Ser Ser Gly 1860
1865 1870Ser Ile Ala Ser Gly Arg Ile Ser Tyr Thr Phe Asp Leu Thr
Gly Pro 1875 1880 1885Ala Val Thr Val Asp Thr Ala Cys Ser Thr Ser
Leu Val Gly Leu His 1890 1895 1900Leu Ala Val Gln Ala Leu Arg Ala
Gly Glu Cys Glu Leu Ala Leu Ala1905 1910 1915 1920Gly Gly Val Thr
Val Met His Thr Pro Arg Pro Phe Val Glu Phe Ser 1925 1930 1935Arg
Gln Arg Gly Leu Ala Ala Asp Gly Arg Ser Lys Ala Phe Ala Ala 1940
1945 1950Ser Ala Asp Gly Val Ala Trp Ala Glu Gly Ala Gly Ile Leu
Val Leu 1955 1960 1965Glu Arg Leu Ser Ala Ala Arg Arg Asn Gly His
Arg Val Leu Ala Val 1970 1975 1980Val Arg Gly Ser Ala Val Asn Gln
Asp Gly Ala Ser Asn Gly Leu Thr1985 1990 1995 2000Ala Pro Asn Gly
Pro Ser Gln Gln Arg Val Ile Arg Ala Ala Leu Ala 2005 2010 2015Ser
Ala Gly Leu Gly Pro Ala Asp Val Asp Val Val Glu Ala His Gly 2020
2025 2030Thr Gly Thr Ala Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu
Leu Ala 2035 2040 2045Thr Tyr Gly Arg Gly Arg Asp Ala Asp Arg Pro
Leu Trp Leu Gly Ser 2050 2055 2060Val Lys Ser Asn Ile Gly His Thr
Gln Ala Ala Ala Gly Val Ala Ser2065 2070 2075 2080Val Ile Lys Met
Val Gln Ala Met Gln Ala Gly Val Leu Pro Arg Thr 2085 2090 2095Leu
His Val Asp Glu Pro Ser Gly Glu Val Asp Trp Asp Ser Gly Ala 2100
2105 2110Val Arg Leu Leu Thr Glu Ala Arg Glu Trp Pro Ser Gly Glu
Gly Arg 2115 2120 2125Val Arg Arg Ala Gly Val Ser Ser Phe Gly Ile
Ser Gly Thr Asn Ala 2130 2135 2140His Val Ile Leu Glu Glu Pro Pro
Ala Glu Asp Ala Leu Pro Glu Pro2145 2150 2155 2160Glu Ala Gly Asp
Val Val Pro Trp Val Leu Ser Ala Arg Ser Ala Glu 2165 2170 2175Ala
Leu Arg Glu Gln Ala Ala Arg Leu Ala Ser Val Ala Gly Gly Leu 2180
2185 2190Asn Val Val Asp Val Gly Trp Ser Leu Ala Ser Thr Arg Ala
Ala Phe 2195 2200 2205Glu His Arg Ala Val Val Val Gly Gly Asp Arg
Glu Glu Leu Leu Gly 2210 2215 2220Lys Leu Ser Ser Val Ser Gly Val
Glu Val Gly Val Gly Val Gly Ala2225 2230 2235 2240Gly Gly Gly Val
Val Leu Val Phe Ala Gly Gln Gly Cys Gln Trp Val 2245 2250 2255Gly
Met Gly Arg Glu Leu Leu Gly Ser Ser Leu Val Phe Ala Glu Ser 2260
2265 2270Met Arg Glu Cys Ala Ala Ala Leu Ser Pro Phe Val Asp Phe
Ser Val 2275 2280 2285Val Asp Val Leu Gly Ser Ala Gly Glu Leu Gly
Arg Val Glu Val Val 2290 2295 2300Gln Pro Ala Leu Trp Ala Val Met
Val Ser Leu Ala Arg Val Trp Arg2305 2310 2315 2320Ser Trp Gly Val
Pro Val Ala Ala Val Val Gly His Ser Gln Gly Glu 2325 2330 2335Ile
Ala Ala Ala Thr Val Ala Gly Ala Leu Ser Val Gly Asp Ala Ala 2340
2345 2350Arg Val Val Ala Leu Arg Ser Arg Leu Ile Ala Glu Arg Leu
Ser Gly 2355 2360 2365Leu Gly Gly Met Val Ser Val Ala Leu Ser Arg
Glu Arg Val Val Ser 2370 2375 2380Leu Ile Ala Gly Val Pro Gly Val
Ser Val Ala Ala Val Asn Gly Ser2385 2390 2395 2400Ser Ser Thr Val
Val Ser Gly Glu Ala Ala Gly Leu Glu Arg Val Leu 2405 2410 2415Ala
Ala Cys Val Ser Ser Gly Val Arg Ala Arg Arg Ile Asp Val Asp 2420
2425 2430Tyr Ala Ser His Ser Val Gln Val Glu Leu Ile Arg Glu Glu
Leu Leu 2435 2440 2445Gly Val Leu Asp Gly Ile Val Pro Arg Ser Gly
Glu Ile Pro Phe Val 2450 2455 2460Ser Thr Val Thr Gly Glu Arg Ile
Asp Thr Val Glu Leu Gly Ala Glu2465 2470 2475 2480Tyr Trp Tyr Arg
Asn Leu Arg Gln Thr Val Glu Phe Gln Ser Val Val 2485 2490 2495Glu
Gly Leu Val Ala Gln Gly Cys Arg Val Phe Leu Glu Ser Ser Pro 2500
2505 2510His Pro Val Leu Thr Val Gly Ile Glu Glu Ser Ala Asp Arg
Val Val 2515 2520 2525Ala Leu Glu Ser Leu Arg Arg Gly Glu Gly Gly
Leu Arg Arg Leu Val 2530 2535 2540Asp Ala Ala Gly Glu Ala Trp Val
Arg Gly Val Pro Ile Asp Trp Ala2545 2550 2555 2560Gly Met Leu Ala Gly Gly
Arg Arg Val Asp Leu Pro Thr Tyr Pro Phe 2565 2570 2575Gln His Gln
Pro Tyr Trp Leu Asp Ser Pro Arg His Pro Ala Gly Asp 2580 2585
2590Val Thr Ala Val Gly Leu Thr Glu Ala Gly His Ala Phe Val Pro Ala
2595 2600 2605Ala Val Asp Leu Pro Asp Gly Gln Arg Val Trp Thr Gly
Arg Leu Ser 2610 2615 2620Leu Pro Ser Tyr Pro Trp Leu Ala Asp His
Gln Val Leu Gly Gln Val2625 2630 2635 2640Leu Leu Pro Gly Val Val
Trp Val Glu Leu Ala Leu His Ala Gly His 2645 2650 2655Gln Ala Gly
Cys Asp Ser Val Asp Glu Leu Thr Leu Gln Ser Pro Leu 2660 2665
2670Val Leu Gly Ala Ser Asp Thr Val Gln Val Arg Val Val Val Thr Glu
2675 2680 2685Thr Glu Glu Pro Gly Thr Arg Thr Val Ser Met His Ser
Arg Arg Asp 2690 2695 2700Asp Gly Ser Trp Val Thr His Ala Glu Gly
Ile Leu Gly Ala Gly Gly2705 2710 2715 2720Pro Pro Pro Glu Pro Leu
Pro Glu Trp Pro Pro Thr Gly Ala Met Pro 2725 2730 2735Leu Asp Val
Glu Gly Phe Tyr Asp Glu Leu Ala Ala Gly Gly Tyr His 2740 2745
2750Tyr Gly Pro Gln Phe Arg Cys Leu Arg Arg Ala Trp Arg Ala Gly Glu
2755 2760 2765Asp Leu Val Ala Glu Ile Ser Leu Pro Glu Gly Thr Asp
Val Asp Ala 2770 2775 2780Tyr Gly Leu His Pro Gly Leu Phe Asp Ala
Ala Val His Ser Val Ala2785 2790 2795 2800Cys Ala Arg Thr Ser Ala
Gly Ala Gly Asp Asp Gly Pro Arg Leu Pro 2805 2810 2815Phe Ala Phe
Ser Asp Val Arg Leu Phe Ala Thr Gly Val Thr Ser Leu 2820 2825
2830Arg Val Arg Ile Asp Pro Gln Asn Ser Ser Trp Gln Ala Trp Asp Glu
2835 2840 2845Ser Gly Leu Pro Val Leu Thr Ile Gly Arg Leu Ala Gly
Arg Pro Val 2850 2855 2860Asp Ala Asp Gln Phe Ala Val Arg Arg Ala
Gly His Leu Phe Arg Val2865 2870 2875 2880Glu Thr Arg His Glu Ala
Leu Ala Gly Pro Ala Pro Ala Ser Trp Ala 2885 2890 2895Val Ile Gly
Ala Asp Pro Ala Gly Tyr Ala Ala Ala Leu Glu Ala Thr 2900 2905
2910Gly Ala Gln Val Thr Thr Ala Ala Asp Leu Ala Gly Leu Thr Ser Ala
2915 2920 2925Pro Glu Ala Ala Leu Phe Thr Leu Pro Gly Thr Lys Asp
Ala Gly Val 2930 2935 2940Thr Glu Glu Val Pro Thr Ala Val Arg Glu
Ala Thr Ala Gln Val Leu2945 2950 2955 2960Glu Val Leu Gln Asp Trp
Leu Thr Asp Gly Arg Phe Asp Asp Ala Arg 2965 2970 2975Leu Val Val
Val Ser Arg Glu Ala Glu Asp Gly Asp Leu Leu His Gly 2980 2985
2990Thr Ala Arg Gly Leu Leu Arg Ala Ala Gln Ala Glu His Pro Asp Arg
2995 3000 3005Ile Thr Leu Val Asp Leu Asp Ala His Pro Ala Ser Leu
Thr Ala Leu 3010 3015 3020Pro Gly Phe Ala Leu Gly Pro Glu Pro Glu
Val Val Val Arg Ala Gly3025 3030 3035 3040Asp Gly Arg Ala Pro Arg
Leu Ala Arg Ala Gln Ala Pro Thr Gly Ala 3045 3050 3055Gly Ser Leu
Gly Thr Gly Thr Val Leu Ile Thr Gly Gly Thr Gly Thr 3060 3065
3070Leu Gly Gly Leu Leu Ala Arg His Leu Val Glu Thr His Gly Val Thr
3075 3080 3085Arg Leu Leu Leu Val Ser Arg Arg Gly Pro Ala Ala Asp
Gly Ala Asp 3090 3095 3100Arg Leu His Ala Glu Leu Thr Gly His Gly
Ala His Val Asp Ile Val3105 3110 3115 3120Ala Ala Asp Leu Gly Asp
Arg Thr Ser Val Ala Ala Leu Leu Ala Thr 3125 3130 3135Val Asp Ala
Asp His Pro Leu Ser Ala Val Val His Ala Ala Gly Ala 3140 3145
3150Leu Asp Asp Gly Val Leu Gly Thr Arg Ser Ala Asp Trp Leu Asp Pro
3155 3160 3165Val Leu Arg Pro Lys Ala Asp Ala Ala Trp His Leu His
Glu Leu Thr 3170 3175 3180Ala Glu Leu Pro Leu Thr Ala Phe Val Met
Phe Ser Ser Ala Ala Ser3185 3190 3195 3200Val Leu Gly Ala Ala Gly
Gln Ala Asn Tyr Ala Ala Ala Asn Gly Phe 3205 3210 3215Leu Asp Ala
Leu Ala Ala His Arg Ala Ala Arg Gly Leu Pro Gly Thr 3220 3225
3230Ser Leu Ala Trp Gly Leu Trp Glu His Arg Ser Glu Leu Thr Arg His
3235 3240 3245Thr Gly Ser Pro Ser Arg Ser Ile Ala Ala Val Gly Ala
Leu Ser Thr 3250 3255 3260Ala Glu Ala Leu Ala Ala Phe Asp Ala Gly
Leu Ala Ser Gly Glu Pro3265 3270 3275 3280Leu Ala Val Pro Ile Arg
Leu Glu Ser Thr Ser Ser Glu Glu Val Pro 3285 3290 3295Pro Met Leu
Arg Gly Leu Val Arg Val Arg Arg Arg Ala Ala Thr Gly 3300 3305
3310Thr Glu Pro Ala Ala Ser Ala Gly Ala Ala Gln Glu Val Arg Gln Leu
3315 3320 3325Ala Glu Leu Gly Ala Asp Glu Arg Gln Arg Arg Val Gln
Arg Ile Val 3330 3335 3340Leu Asp Thr Ala Ala Ala Val Leu Gly His
Asp Ser His Asp Ala Ile3345 3350 3355 3360Pro Leu Thr Arg Gly Phe
Leu Glu Leu Gly Phe Asp Ser Leu Thr Ala 3365 3370 3375Val Arg Leu
Arg Asn Arg Leu Ala Arg Arg Leu Gly Leu Arg Leu Pro 3380 3385
3390Ala Thr Val Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ala His
3395 3400 3405Leu Val Glu His Leu Val Gly Thr Val Asp Pro Thr Ala
Gln Ala Met 3410 3415 3420Glu Gln Leu Glu Ala Leu Arg Arg Ser Val
His Ala Ala Thr Pro Ala3425 3430 3435 3440Gly Gly Leu Asp Arg Ala
Leu Val Thr Gln Arg Leu Thr Ala Leu Leu 3445 3450 3455Asp Glu Met
Arg His Val Asp Gly Pro Gly Gly Thr Glu Gly Pro Asp 3460 3465
3470Gly Ser Gly Asp Asp Leu Glu Asn Ala Thr Ala Asp Glu Ile Tyr Ala
3475 3480 3485Leu Ile Asp Asn Glu Leu Gly Ile Gly Gly Thr Gln 3490
3495 3500
5 1620 PRT Streptomyces parvulus Tu4055 5
Met Asn Gly Asp Asp
Lys Ala Leu Ala Tyr Leu Lys Arg Val Thr Ala1 5 10 15Asp Leu Arg Ser
Ala Arg Ala Arg Leu Gln Glu Leu Glu Ser Ala Asp 20 25 30Thr Asp Pro
Ile Ala Ile Ile Gly Met Gly Cys Arg Leu Pro Gly Gly 35 40 45Val Arg
Thr Pro Glu Asp Leu Trp Asp Leu Val Glu Lys Lys His Asp 50 55 60Ala
Ile Gly Pro Phe Pro Ala Asp Arg Gly Trp Asp Leu Glu Asn Leu65 70 75
80Tyr Asp Pro Asp Pro Asp Ala Pro Gly Lys Ala Tyr Val Arg Glu Gly
85 90 95Gly Phe Val His Asp Val Ala Gly Phe Asp Ala Gly Phe Phe Gly
Ile 100 105 110Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln His Arg
Leu Leu Leu 115 120 125Glu Cys Ser Trp Glu Ala Leu Glu Arg Ala Gly
Ile Asp Pro Ser Ser 130 135 140Leu Glu Gly Thr Arg Thr Gly Val Tyr
Thr Gly Leu Met Thr His Glu145 150 155 160Tyr Ala Thr Arg Leu Pro
Ser Ile Asp Glu Glu Leu Glu Gly Val Ile 165 170 175Gly Ile Gly Asn
Ala Gly Ser Val Ala Ser Gly Arg Val Ser Tyr Thr 180 185 190Leu Gly
Leu Asn Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser 195 200
205Ser Leu Val Ala Leu His Leu Ala Ala Gln Ala Leu Arg Gln Gly Gln
210 215 220Cys Thr Leu Ala Leu Ala Gly Gly Ala Ser Val Ile Ala Ala
Pro Thr225 230 235 240Val Phe Ala Thr Phe Ser Arg Gln Arg Gly Leu
Ala Pro Asp Gly Arg 245 250 255Cys Lys Ala Phe Ser Ser Thr Thr Asp
Gly Thr Gly Phe Gly Glu Gly 260 265 270Val Gly Val Leu Val Leu Glu
Arg Leu Ser Asp Ala Arg Arg Asn Gly 275 280 285His Glu Val Leu Ala
Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly 290 295 300Ala Ser Ser
Gly Phe Thr Ala Pro Asn Gly Pro Ser Gln Gln Asp Val305 310 315
320Ile Arg Glu Ala Leu Ala Asp Gly Arg Leu Thr Pro Ala Asp Val Asp
325 330 335Val Val Glu Gly His Gly Thr Gly Thr Arg Leu Gly Asp Pro
Ile Glu 340 345 350Ala Gln Ala Leu Leu Ala Thr Tyr Gly Arg Gly Arg
Asp Ala Asp Arg 355 360 365Pro Leu Trp Leu Gly Ser Val Lys Ser Asn
Ile Gly His Thr Gln Ala 370 375 380Ala Ala Gly Val Ala Ser Val Ile
Lys Met Val Gln Ala Met Gln Ala385 390 395 400Gly Val Leu Pro Arg
Thr Leu His Val Asp Glu Pro Ser Gly Glu Val 405 410 415Asp Trp Asp
Ser Gly Ala Val Arg Leu Leu Thr Glu Ala Arg Glu Trp 420 425 430Pro
Ser Gly Glu Gly Arg Val Arg Arg Ala Gly Val Ser Ser Phe Gly 435 440
445Ile Ser Gly Thr Asn Ala His Val Ile Leu Glu Glu Pro Pro Ala Glu
450 455 460Asp Ala Leu Pro Glu Pro Glu Ala Gly Asp Val Val Pro Trp
Val Leu465 470 475 480Ser Ala Arg Ser Ala Glu Ala Leu Arg Glu Gln
Ala Ala Arg Leu Ala 485 490 495Ser Val Ala Gly Gly Leu Asn Val Val
Asp Val Gly Trp Ser Leu Ala 500 505 510Ser Thr Arg Ala Ala Phe Glu
His Arg Ala Val Val Val Gly Gly Asp 515 520 525Arg Glu Glu Leu Leu
Gly Lys Leu Ser Ser Val Ser Gly Val Glu Val 530 535 540Gly Val Gly
Val Gly Ala Gly Gly Gly Val Val Leu Val Phe Ala Gly545 550 555
560Gln Gly Cys Gln Trp Val Gly Met Gly Arg Glu Leu Leu Gly Ser Ser
565 570 575Leu Val Phe Ala Glu Ser Met Arg Glu Cys Ala Ala Ala Leu
Ser Pro 580 585 590Phe Val Asp Phe Ser Val Val Asp Val Leu Gly Ser
Ala Gly Glu Leu 595 600 605Gly Arg Val Glu Val Val Gln Pro Ala Leu
Trp Ala Val Met Val Ser 610 615 620Leu Ala Arg Val Trp Arg Ser Trp
Gly Val Pro Val Ala Ala Val Val625 630 635 640Gly His Ser Gln Gly
Glu Ile Ala Ala Ala Thr Val Ala Gly Ala Leu 645 650 655Ser Val Gly
Asp Ala Ala Arg Val Val Ala Leu Arg Ser Arg Leu Ile 660 665 670Ala
Glu Arg Leu Ser Gly Leu Gly Gly Met Val Ser Val Ala Leu Ser 675 680
685Arg Glu Arg Val Val Ser Leu Ile Ala Gly Val Pro Gly Val Ser Val
690 695 700Ala Ala Val Asn Gly Ser Ser Ser Thr Val Val Ser Gly Glu
Ala Ala705 710 715 720Gly Leu Glu Arg Val Leu Ala Ala Cys Val Ser
Ser Gly Val Arg Ala 725 730 735Arg Arg Ile Asp Val Asp Tyr Ala Ser
His Ser Val Gln Val Glu Leu 740 745 750Ile Arg Glu Glu Leu Leu Gly
Val Leu Asp Gly Ile Val Pro Arg Ser 755 760 765Gly Glu Ile Pro Phe
Val Ser Thr Val Thr Gly Glu Arg Ile Asp Thr 770 775 780Val Glu Leu
Gly Ala Glu Tyr Trp Tyr Arg Asn Leu Arg Gln Thr Val785 790 795
800Glu Phe Gln Ser Val Val Glu Gly Leu Val Ala Gln Gly Cys Arg Val
805 810 815Phe Leu Glu Ser Ser Pro His Pro Val Leu Thr Val Gly Ile
Glu Glu 820 825 830Ser Ala Asp Arg Val Val Ala Leu Glu Ser Leu Arg
Arg Gly Glu Gly 835 840 845Gly Leu Arg Arg Leu Val Asp Ala Ala Gly
Glu Ala Trp Val Arg Gly 850 855 860Val Pro Ile Asp Trp Ala Gly Met
Leu Ala Gly Gly Arg Arg Val Asp865 870 875 880Leu Pro Thr Tyr Pro
Phe Gln His Gln Pro Tyr Trp Leu Asp Ser Pro 885 890 895Arg His Pro
Ala Gly Asp Val Thr Gly Pro Gly Asp Asp Glu Phe Trp 900 905 910Ala
Ala Val Glu His Gly Glu Ala Thr Glu Leu Ala Asp Leu Leu Arg 915 920
925Arg Ser Ala Ala Glu Pro Gly Gln Asp Leu His Ala Pro Val Ala Ala
930 935 940Leu Leu Pro Thr Leu Ala Thr Trp Arg Arg Asp Arg Gln Arg
Arg Ala945 950 955 960Ala Val Asp Ser Trp Arg Tyr Arg Ile Val Trp
Arg Pro Val Ala Thr 965 970 975Pro Ser Tyr Asp Arg Val Leu Ser Gly
Arg Trp Ala Val Val Val Pro 980 985 990Ala Gly His Glu Asp Asp Pro
Val Val Asp Trp Val Cys Ser Ala Leu 995 1000 1005Arg Asp His Gly
Gly Glu Pro Glu Arg Met Val Leu Gly Pro Arg Glu 1010 1015 1020Ser
Arg Ser Ala Leu Ala Thr Arg Leu Ala Ala Asp Pro Pro Gly Gly1025
1030 1035 1040Val Val Ser Leu Leu Gly Leu Ser Gly Ala Ala His Pro
Asp His Glu 1045 1050 1055Val Leu Pro Ser Ala Val Ala Gly Thr Val
Leu Leu Ala Gln Ala Leu 1060 1065 1070Ser Asp Gly Ala Val Arg Ala
Pro Val Trp Thr Leu Thr Arg Asn Gly 1075 1080 1085Val Ser Ala Thr
Ala Thr Asp Pro Val Ala Pro Thr His Ala Ala Gln 1090 1095 1100Val
Trp Ala Val Ala Arg Val Ala Gly Leu Glu His Pro Glu Ala Trp1105
1110 1115 1120Gly Gly Leu Leu Asp Leu Pro Asp Arg Leu Asp Asp Arg
Ala Ala Ala 1125 1130 1135Arg Phe Ala Ala Val Leu Ser Ala Gly Glu
Asp Glu Asp Gln Leu Ala 1140 1145 1150Leu Arg Asp Ala Gly Leu Leu
Ala Arg Arg Leu Val Arg Ala Pro Val 1155 1160 1165Pro Arg Asp Ala
Val Thr Ala Gly Trp Gln Pro Arg Asp Thr Ala Leu 1170 1175 1180Val
Thr Gly Gly Thr Gly Gly Leu Gly Gly Gln Val Ala Arg Trp Leu1185
1190 1195 1200Ala Ala Ala Gly Val Arg His Leu Val Leu Val Ser Arg
Arg Gly Ala 1205 1210 1215Glu Ala Glu Gly Ala Asp Arg Leu Arg Asp
Asp Leu Thr Ala Leu Gly 1220 1225 1230Val Gln Val Thr Phe Gly Ala
Cys Asp Val Ala Asp Arg Ala Ala Leu 1235 1240 1245Ser Ala Leu Leu
Asp Arg Val Gln Glu Asp Gly Pro Pro Ile Arg Thr 1250 1255 1260Val
Val His Ala Ala Gly Ser Gly Arg Ala Ala Arg Leu Leu Asp Thr1265
1270 1275 1280Asp Ala Glu Glu Thr Ala Ala Val Leu Arg Ala Lys Ser
Ala Gly Ala 1285 1290 1295Arg Asn Leu His Glu Leu Leu Asp Asp Val
Asp Ala Phe Val Leu Phe 1300 1305 1310Ser Ser Gly Ala Gly Val Trp
Gly Ser Ser Ala Gln Gly Ala Tyr Ala 1315 1320 1325Ala Ala Asn Ala
Tyr Leu Asp Ala Leu Ala Glu Gln Arg Arg Gly Gln 1330 1335 1340Gly
Arg Pro Ala Thr Ser Val Ala Trp Gly Ala Trp Ala Gly Asp Gly1345
1350 1355 1360Met Thr Ala Ala Ala Gly Glu Glu Trp Trp Ser Arg Gln
Gly Leu Arg 1365 1370 1375Phe Met Ala Pro Glu Ala Ala Leu Asp Ala
Leu Arg Gln Ala Val Asp 1380 1385 1390Arg Ala Glu Ser Thr Leu Val
Val Ala Asp Ile Asp Trp Lys Thr Phe 1395 1400 1405Ala Pro Leu Phe
Thr Ser Ala Arg Ser Arg Pro Leu Ile Thr Asp Ile 1410 1415 1420Pro
Glu Ala Arg Pro Glu Pro Arg Pro Glu Gly Ala Asp Gln Pro Thr1425
1430 1435 1440Gln Gly Leu Val Ala Lys Leu Ala Val Leu Ser Ala Asp
Glu Arg Arg 1445 1450 1455Arg Ala Leu Leu Ala Glu Val Arg Ala Gln
Ala Ala Val Val Leu Gly 1460 1465 1470His Pro Gly Ala Asp Ala Val
Pro Val Asp Arg Pro Phe Arg Glu Leu 1475 1480 1485Gly Phe Asp Ser
Leu Ser Ala Val Lys Leu Arg Asn Arg Ile Val Ala 1490 1495 1500Ala
Thr Gly Leu Glu Leu Pro Ala Thr Leu Val Phe Asp His Pro Thr1505 1510 1515 1520Ser Thr Ala Leu Ala Ala
Tyr Leu Gly Ala Arg Leu Gly Ile Asp Gly 1525 1530 1535Ala Pro Ala
Gly Ser Thr Leu Leu Glu Asp Leu Ala Arg Leu Glu Ser 1540 1545
1550Thr Val Ala Thr Leu Thr Ala Ala Pro Leu Ala Glu Thr Val Pro Asp
1555 1560 1565Ala Arg Asp Arg Ala Ala Leu Thr Thr Arg Leu Arg Ala
Leu Leu Glu 1570 1575 1580Arg Trp Asp Gln Ala Asp Gly Glu Asp Gln
Ala Ala Ala Arg Glu Glu1585 1590 1595 1600Leu Asp Asp Leu Ser Asp
Asp Asp Leu Phe Asp Phe Ile Asp Ala Lys 1605 1610 1615Phe Gly Arg
Ser 1620
6 2130 PRT Streptomyces parvulus Tu4055 6
Met Gly Asp Glu Gln
Lys Leu Arg Thr Tyr Leu Arg Arg Val Thr Ala1 5 10 15Asp Leu Ala Asp
Val Thr Glu Arg Leu Gln Arg Ala Glu Asp Lys Asn 20 25 30Ala Glu Pro
Ile Ala Ile Val Gly Met Gly Cys Arg Tyr Pro Gly Gly 35 40 45Val Arg
Ser Pro Glu Glu Phe Trp Asn Leu Leu Asp Glu Gly Val Asp 50 55 60Ala
Val Ala Gly Phe Pro Glu Asp Arg Gly Trp Asp Leu Glu Asn Leu65 70 75
80Tyr Asp Pro Asp Pro Asp Glu Pro Gly Lys Cys Tyr Ala Arg Glu Gly
85 90 95Gly Phe Leu Tyr Asp Ala Gly Glu Phe Asp Ala Ala Phe Phe Gly
Ile 100 105 110Ser Pro Arg Glu Ala Leu Ser Met Asp Pro Gln Gln Arg
Leu Leu Leu 115 120 125Glu Cys Ser Trp Ser Ala Leu Glu Arg Ala Gly
Ile Asp Pro Gly Ser 130 135 140Leu Arg Gly Lys Asp Val Gly Val Tyr
Val Gly Ala Trp Asn Ser Asn145 150 155 160Tyr Gly Arg Gly Gly Gly
Ala Glu Ser Ser Glu Gly His Leu Leu Thr 165 170 175Gly Asn Ala Ser
Ser Val Val Ser Gly Arg Val Ala Tyr Val Leu Gly 180 185 190Leu Glu
Gly Pro Ala Val Thr Ile Asp Thr Ala Cys Ser Ser Ser Leu 195 200
205Val Gly Leu His Leu Ala Ala Gln Ala Leu Arg Ser Gly Glu Cys Gly
210 215 220Leu Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Leu
Ser Leu225 230 235 240Val Ser Phe Ser Arg Gln Arg Gly Leu Ala Gln
Asp Gly Arg Ser Lys 245 250 255Ala Phe Ser Ala Asp Ala Asp Gly Met
Gly Met Ala Glu Gly Val Gly 260 265 270Val Leu Val Leu Glu Arg Leu
Ser Glu Ala Arg Arg Asn Gly His Glu 275 280 285Val Leu Ala Val Leu
Arg Ser Ser Ala Val Asn Gln Asp Gly Ala Ser 290 295 300Asn Gly Leu
Ser Ala Pro Asn Gly Pro Ala Gln Gln Arg Val Ile Gln305 310 315
320Ser Ala Leu Thr Val Gly Arg Leu Ala Pro Ser Asp Ile Asp Val Val
325 330 335Glu Ala His Gly Thr Gly Thr Ala Leu Gly Asp Pro Ile Glu
Ala Gln 340 345 350Ala Leu Leu Ala Thr Tyr Gly Arg Gly Arg Asp Ala
Asp Arg Pro Leu 355 360 365Trp Leu Gly Ser Val Lys Ser Asn Ile Gly
His Thr Gln Ala Ala Ala 370 375 380Gly Val Ala Gly Val Ile Lys Met
Val Leu Ala Leu Arg Lys Gly Val385 390 395 400Leu Pro Arg Thr Leu
His Val Asp Glu Pro Thr Gly Glu Val Asp Trp 405 410 415Asp Ser Gly
Ala Val Arg Leu Leu Thr Glu Ala Arg Glu Trp Pro Ser 420 425 430Gly
Glu Gly Arg Val Arg Arg Ala Gly Val Ser Ser Phe Gly Ile Ser 435 440
445Gly Thr Asn Ala His Val Ile Val Glu Glu Ala Pro Glu Glu Glu Pro
450 455 460Arg Pro Glu Ala Pro Ser Val Asp Val Val Pro Trp Val Leu
Ser Ala465 470 475 480Arg Ser Ala Glu Ala Leu Arg Glu Gln Ala Ala
Arg Leu Ala Ser Val 485 490 495Ala Gly Gly Leu Asn Val Val Asp Val
Gly Trp Ser Leu Ala Ser Thr 500 505 510Arg Ala Ala Phe Glu His Arg
Ala Val Val Val Gly Arg Asp Ser Glu 515 520 525Glu Leu Val Ser Gly
Leu Ser Ser Val Ser Gly Val Glu Val Gly Val 530 535 540Gly Val Gly
Ala Gly Gly Gly Val Val Leu Val Phe Ala Gly Gln Gly545 550 555
560Cys Gln Trp Val Gly Met Gly Arg Glu Leu Leu Gly Ser Ser Leu Val
565 570 575Phe Ala Glu Ser Met Arg Glu Cys Ala Ala Ala Leu Ser Pro
Phe Val 580 585 590Asp Phe Ser Val Val Asp Val Leu Gly Ser Ala Gly
Glu Leu Gly Arg 595 600 605Val Glu Val Val Gln Pro Ala Leu Trp Ala
Val Met Val Ser Leu Ala 610 615 620Arg Val Trp Arg Ser Trp Gly Val
Pro Val Ala Ala Val Val Gly His625 630 635 640Ser Gln Gly Glu Ile
Ala Ala Ala Thr Val Ala Gly Ala Leu Ser Val 645 650 655Gly Asp Ala
Ala Arg Val Val Ala Leu Arg Ser Arg Leu Ile Ala Glu 660 665 670Arg
Leu Ser Gly Leu Gly Gly Met Val Ser Val Ala Leu Ser Arg Glu 675 680
685Arg Val Val Ser Leu Ile Ala Gly Val Pro Gly Val Ser Val Ala Ala
690 695 700Val Asn Gly Ser Ser Ser Thr Val Val Ser Gly Glu Ala Ala
Gly Leu705 710 715 720Glu Arg Val Leu Ala Ala Cys Val Ser Ser Gly
Val Arg Ala Arg Arg 725 730 735Ile Asp Val Asp Tyr Ala Ser His Ser
Val Gln Val Glu Leu Ile Arg 740 745 750Glu Glu Leu Leu Gly Val Leu
Asp Gly Ile Val Pro Arg Ser Gly Glu 755 760 765Ile Pro Phe Val Ser
Thr Val Thr Gly Glu Arg Ile Asp Thr Val Glu 770 775 780Leu Gly Ala
Glu Tyr Trp Tyr Arg Asn Leu Arg Gln Thr Val Glu Phe785 790 795
800Gln Ala Ser Val Gln Thr Leu Leu Ala Gln Gly His Gln Val Phe Leu
805 810 815Glu Ser Ser Pro His Pro Val Leu Thr Val Gly Ile Glu Glu
Thr Val 820 825 830His Glu Ser Ala Ala Gln Ala Val Val Leu Gly Ser
Leu Arg Arg Asp 835 840 845Glu Gly Ala Leu Thr Arg Leu Val Thr Ser
Ala Gly Glu Ala Trp Ala 850 855 860Arg Gly Val Pro Val Asp Trp Ala
Gly Met Leu Ala Gly Gly Arg Arg865 870 875 880Val Glu Leu Pro Thr
Tyr Pro Phe Leu Arg Glu Arg Leu Trp Leu Glu 885 890 895Pro Ser Arg
Ser Arg Thr Gly Asn Leu Asn Met Ala Gly Leu Val Glu 900 905 910Ala
Gly His Glu Ile Leu Pro Ala Ala Val Glu Leu Pro Gly Glu Gln 915 920
925Trp Val Trp Thr Gly Glu Leu Ser Leu Ser Ala Tyr Pro Trp Leu Ala
930 935 940Asp His Gln Val Leu Gly Gln Thr Leu Val Pro Gly Val Ala
Trp Val945 950 955 960Glu Leu Ala Leu His Ala Gly His Gln Leu Gly
Phe Gly Ser Val Glu 965 970 975Glu Leu Thr Leu Gln Ala Pro Leu Val
Leu Gly Glu Ser Asp Ala Val 980 985 990Gln Val Arg Val Val Val Ser
Asp Leu Gly Glu Ser Asp Arg Arg Ala 995 1000 1005Val Ser Val His
Ser Arg Gly Asp Asp Gln Thr Trp Val Thr His Ala 1010 1015 1020Glu
Gly Phe Leu Thr Ala Lys Gly Ala Gln Pro Glu Thr Met Ala Val1025
1030 1035 1040Trp Pro Pro Ser Gly Ala Glu Pro Val Glu Ala Asp Gly
Phe Tyr Glu 1045 1050 1055Arg Leu Ala Asp Ala Gly Tyr His Tyr Gly
Pro Val Phe Gln Gly Val 1060 1065 1070Ser Lys Val Trp Arg Ala Gly
Glu Glu Ile Tyr Ala Glu Val Gly Leu 1075 1080 1085Leu Asp Asp Ala
Asp Val Asp Gly Phe Gly Ile His Pro Ala Leu Leu 1090 1095 1100Asp
Ala Ala Leu Gln Thr Ala Tyr Val Ala Gln Arg Gly Pro Ala Glu1105
1110 1115 1120Thr Lys Leu Pro Phe Ala Phe Gly Asp Val Gln Leu Phe
Ala Thr Gly 1125 1130 1135Ala Arg Ser Leu Arg Val Arg Val Ser Pro
Ala Ala Gln Gln Gly Met 1140 1145 1150Ala Trp Glu Ala Trp Asp Pro
Thr Gly Leu Pro Val Phe Ser Leu Gly 1155 1160 1165Tyr Leu Ala Thr
Arg Pro Val Asp Arg Gly Gln Leu Thr Val Lys Arg 1170 1175 1180Pro
Glu Ser Leu Phe Lys Val Ala Trp Asp Glu Thr Val Pro Val Val1185
1190 1195 1200Gly Asn Ala Thr Ala Ala His Gly Val Val Leu Gly Asp
Asp Pro Phe 1205 1210 1215Ala Leu Gly Ala Ala Leu Arg Ala Ala Gly
Trp Glu Val Gly Ala Ala 1220 1225 1230Pro Glu Pro Ala Ser Ala Asp
Thr Ala Ala Glu Val Leu Leu Leu Pro 1235 1240 1245Cys Thr Ala Pro
Gly Glu Pro Asp Ala Asp Leu Pro Thr Ala Val Arg 1250 1255 1260Ala
Val Thr Ala Arg Val Leu Gly Val Leu Gln Glu Trp Leu Ala Asp1265
1270 1275 1280Glu Arg Leu Ala Gly Thr Arg Leu Ala Val Val Thr Arg
Asn Ala Leu 1285 1290 1295Pro Gly Asp Leu Leu His Ser Pro Val Trp
Gly Leu Val Arg Ser Ala 1300 1305 1310Gln Thr Glu Asn Pro Gly Arg
Ile Thr Leu Val Asp Leu Asp Asp His 1315 1320 1325Pro Asp Ser Ala
Ala Val Leu Ala Glu Ala Val Gln Ser Asp Glu Pro 1330 1335 1340Arg
Ile Met Val Arg Glu Gly Arg Pro Thr Ala Ala Arg Leu Val Arg1345
1350 1355 1360Ala Thr Ala Pro Glu Leu Val Pro Pro Ala Gly Ala Asp
Ala Trp Arg 1365 1370 1375Leu Glu Ile Thr Glu Pro Gly Thr Phe Asp
Asn Leu Thr Leu Gly Val 1380 1385 1390Tyr Pro His Ala Glu Lys Thr
Leu Ala Asp Asn Glu Val Arg Val Ala 1395 1400 1405Val His Ala Gly
Gly Leu Asn Phe His Asp Val Val Ala Ala Leu Gly 1410 1415 1420Met
Val Glu Asp Asp Leu Thr Leu Gly Arg Glu Ala Ala Gly Val Val1425
1430 1435 1440Val Glu Val Gly Asp Ala Val Pro Asp Leu Thr Pro Gly
Asp His Val 1445 1450 1455Met Gly Ile Leu Ser Ser Gly Phe Gly Pro
Leu Ala Val Thr Asp His 1460 1465 1470Arg Tyr Leu Ala Arg Met Pro
Glu Gly Trp Thr Phe Ala Gln Ala Ala 1475 1480 1485Ser Val Pro Ala
Ala Phe Leu Thr Ala Tyr Tyr Gly Leu Cys Asp Leu 1490 1495 1500Gly
Gly Ile Arg Ala Gly Asp Arg Val Leu Ile His Ala Ala Ala Gly1505
1510 1515 1520Gly Val Gly Met Ala Ala Val Gln Ile Ala Arg His Leu
Gly Ala Glu 1525 1530 1535Val Phe Gly Thr Ala Ser Pro Arg Lys Trp
Gly Ala Leu Arg Ala Leu 1540 1545 1550Gly Leu Asp Asp Ala His Leu
Ser Ser Ser Arg Thr Leu Asp Phe Glu 1555 1560 1565Gln Glu Phe Leu
Asp Ala Thr Asp Gly Arg Gly Val Asp Leu Val Leu 1570 1575 1580Asn
Ser Leu Ala Arg Glu Phe Val Asp Ala Ser Leu Arg Leu Met Pro1585
1590 1595 1600Gly Gly Gly Arg Phe Val Asp Met Gly Lys Thr Asp Ile
Arg Arg Pro 1605 1610 1615Glu Gln Val Ala Glu Asp His Gly Gly Val
Ala Tyr Gln Ala Phe Asp 1620 1625 1630Leu Val Glu Ala Gly Pro Gln
Arg Thr Gly Glu Met Leu Ala Glu Ile 1635 1640 1645Val Arg Leu Phe
Gln Ala Gly Ala Phe Arg Pro Leu Pro Ile Thr Gln 1650 1655 1660Trp
Asp Val Arg Arg Ala Pro Glu Ala Phe Arg His Ile Ser Gln Ala1665
1670 1675 1680Lys His Ile Gly Lys Ile Val Leu Thr Val Pro Arg Pro
Ile Asp Thr 1685 1690 1695Asp Gly Thr Val Met Val Thr Gly Ala Thr
Gly Thr Leu Gly Gly Phe 1700 1705 1710Val Ala Arg His Leu Val Thr
His His Gly Ile Arg Arg Leu Leu Leu 1715 1720 1725Val Ser Arg Ser
Ala Glu Arg Thr Asp Leu Val Arg Glu Leu Thr Glu 1730 1735 1740Leu
Gly Ala Asp Val Thr Trp Ala Ser Cys Asp Leu Ala Asp Ala Thr1745
1750 1755 1760Ala Val Glu Glu Thr Val Arg Ser Val Asp Glu Arg His
Pro Leu Val 1765 1770 1775Ala Val Val His Ser Ala Gly Val Leu Asp
Asp Gly Val Ile Asp Lys 1780 1785 1790Gln Ser Pro Glu Arg Leu Asp
Thr Val Met Arg Pro Lys Val Asp Ala 1795 1800 1805Ala Trp Asn Leu
His Arg Leu Leu Asp Asn Ala Pro Leu Ala Asp Phe 1810 1815 1820Val
Leu Phe Ser Ser Ala Ser Gly Val Leu Gly Gly Ala Gly Gln Ser1825
1830 1835 1840Asn Tyr Ala Ala Ala Asn Ala Phe Leu Asp Ala Leu Ala
Glu His Arg 1845 1850 1855Arg Ala Gln Gly Leu Ala Gly Gln Ala Leu
Ala Trp Gly Leu Trp Ser 1860 1865 1870Asp Arg Ser Thr Met Thr Gly
Gln Leu Gly Ser Thr Glu Leu Ala Arg 1875 1880 1885Ile Ala Arg Asn
Gly Val Ala Glu Met Ser Glu Thr Glu Gly Leu Ala 1890 1895 1900Leu
Phe Asp Ala Ala Arg Asp Thr Ala Glu Ala Val Leu Leu Pro Met1905
1910 1915 1920His Leu Asp Val Ala Arg Leu Arg Ser Arg Asn Gly Glu
Val Pro Ala 1925 1930 1935Val Phe Arg Arg Leu Ile His Ala Thr Ala
Arg Arg Thr Ala Ser Thr 1940 1945 1950Ala Val Arg Ser Ala Gly Leu
Glu Gln Gln Leu Ala Ser Leu Ser Gly 1955 1960 1965Pro Glu Arg Thr
Glu Leu Leu Leu Gly Leu Val Arg Asp His Ala Ala 1970 1975 1980Ala
Val Leu Gly His Gly Thr Ser Asp Ala Val Ser Pro Asp Arg Pro1985
1990 1995 2000Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu
Leu Arg Asn 2005 2010 2015Arg Phe Ala Ala Leu Thr Gly Leu Arg Leu
Pro Ala Thr Leu Val Phe 2020 2025 2030Asp His Pro Ser Pro Thr Ala
Leu Ala Gly His Leu Ala Gly Leu Leu 2035 2040 2045Gly Ala Ala Thr
Pro Ser Ala Ala Glu Pro Val Leu Ala Ala Val Gly 2050 2055 2060Arg
Leu Arg Ala Asp Leu Arg Ser Leu Thr Pro Asp Ala Glu Gly Ala2065
2070 2075 2080Glu Asp Val Thr Ile Gln Leu Glu Ala Leu Leu Ala Glu
Trp Arg Glu 2085 2090 2095Ala Ala Glu Lys Arg Ala Pro Glu Ala Val
Gly Asp Glu Asp Leu Ser 2100 2105 2110Thr Ala Thr Asp Asp Glu Ile
Phe Ala Leu Val Asp Ser Glu Leu Gly 2115 2120 2125Glu Ala
2130
7 1742 PRT Streptomyces parvulus Tu4055 7
Met Thr Ala Glu Ala Ser
Gln Asp Lys Leu Arg Asp Tyr Leu Arg Lys1 5 10 15Thr Leu Ala Asp Leu
Arg Thr Thr Lys Gln Arg Leu Arg Asp Thr Glu 20 25 30Arg Arg Ala Thr
Glu Pro Val Ala Ile Val Gly Met Ser Cys Arg Leu 35 40 45Pro Gly Asp
Val Arg Thr Pro Glu Arg Phe Trp Glu Leu Leu Asp Thr 50 55 60Gly Thr
Asp Ala Leu Thr Pro Leu Pro Thr Asp Arg Gly Trp Asn Leu65 70 75
80Asp Thr Ala Phe Asp Asp Glu Arg Pro Tyr Arg Arg Glu Gly Gly Phe
85 90 95Leu Tyr Asp Ala Gly Arg Phe Asp Ala Glu Phe Phe Gly Ile Ser
Pro 100 105 110Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu
Leu Glu Ser 115 120 125Ser Trp Glu Ala Ile Glu His Ala Arg Ile Asp
Pro Arg Ser Leu His 130 135 140Gly Ser Arg Thr Gly Val Trp Phe Gly
Thr Ile Gly Gln Asp Tyr Phe145 150 155 160Ser Leu Phe Ala Ala Ser
Gly Gly Glu His Ala Asn Tyr Leu Ala Thr 165 170 175Ala Cys Ser Ala
Ser Val Met Ser Gly Arg Val Ser Tyr Val Leu Gly 180 185 190Leu Glu
Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu 195 200
Val Ala Leu His Ser Ala Val Gln Ala Leu Arg Ser Gly Glu Cys Glu 210 215 220Leu Ala Leu Ala Gly Gly Ala
Thr Val Met Ala Thr Pro Thr Val Phe225 230 235 240Thr Ala Phe Ser
His Gln Arg Gly Leu Ala Gly Asp Gly Arg Cys Lys 245 250 255Ala Phe
Ala Ala Gly Ala Asp Gly Ala Gly Phe Ala Glu Gly Val Gly 260 265
270Val Leu Val Leu Glu Arg Leu Ser Val Ala Arg Arg Asn Gly His Arg
275 280 285Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly
Ala Ser 290 295 300Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln
Arg Val Ile Arg305 310 315 320Ala Ala Leu Ala Asn Ala Arg Leu Ala
Pro Glu Asp Val Asp Ala Val 325 330 335Glu Gly His Gly Thr Gly Thr
Ser Leu Gly Asp Pro Ile Glu Ala Gln 340 345 350Ala Leu Leu Ala Thr
Tyr Gly Arg Gly Arg Asp Ala Glu Arg Pro Leu 355 360 365Trp Leu Gly
Ser Val Lys Ser Asn Ile Gly His Ala Gln Ala Ala Ala 370 375 380Gly
Val Ala Gly Val Ile Lys Met Val Leu Ala Met Glu Lys Gly Arg385 390
395 400Leu Pro Arg Thr Leu His Val Asp Glu Pro Ser Gly Glu Val Asp
Trp 405 410 415Asp Ser Gly Ala Val Arg Leu Leu Thr Glu Ala Arg Asp
Trp Pro Ser 420 425 430Gly Glu Gly Arg Val Arg Arg Ala Gly Val Ser
Ser Phe Gly Ile Ser 435 440 445Gly Thr Asn Ala His Val Ile Ile Glu
Glu Pro Gln Glu Glu Glu Ala 450 455 460Ala Pro Asp Ser Ser Ala Ser
Gly Ala Val Pro Trp Val Leu Ser Ala465 470 475 480Arg Ser Ala Glu
Ala Leu Gln Ala Leu Ala Ser Gln Leu Ala Asp His 485 490 495Ser Ala
Lys Ser Ser Pro Val Asp Val Gly Trp Ser Leu Val Ser Thr 500 505
510Arg Ala Ala Phe Glu His Arg Ala Val Val Val Gly Arg Gly Arg Asp
515 520 525Glu Leu Val Arg Gly Leu Ser Glu Val Ala Gln Gly Arg Gly
Val Arg 530 535 540Gly Val Ala Ser Ser Ala Ser Gly Gly Leu Ala Phe
Val Phe Ala Gly545 550 555 560Gln Gly Ser Gln Arg Leu Gly Met Gly
Arg Gly Leu Tyr Glu Arg Phe 565 570 575Pro Val Phe Ala Glu Ala Phe
Asp Glu Val Cys Gly Arg Val Gly Pro 580 585 590Gly Val Arg Glu Val
Val Phe Gly Ser Asp Ala Gly Glu Leu Asp Arg 595 600 605Thr Val Trp
Ala Gln Ala Gly Leu Phe Ala Leu Glu Val Ala Leu Phe 610 615 620Arg
Leu Leu Glu Ser Trp Gly Val Arg Pro Gly Cys Leu Ile Gly His625 630
635 640Ser Val Gly Glu Leu Ser Ala Ala Cys Val Ala Gly Leu Trp Ser
Leu 645 650 655Glu Asp Ala Cys Arg Val Val Ala Ala Arg Ala Arg Leu
Met Gln Ala 660 665 670Leu Pro Ala Gly Gly Val Met Val Ala Val Arg
Ala Glu Ala Gly Glu 675 680 685Leu Ala Gly Phe Leu Gly Glu Asp Val
Val Ile Ala Ser Val Asn Ala 690 695 700Pro Gly Gln Val Val Ile Ala
Gly Pro Glu Gly Gly Val Glu Arg Val705 710 715 720Val Ala Ala Cys
Gly Ala Arg Ser Arg Arg Leu Ala Val Ser His Ala 725 730 735Phe His
Ser Pro Leu Val Glu Pro Met Leu Gly Glu Phe Arg Arg Val 740 745
750Val Glu Ser Val Ala Phe Gly Val Pro Ser Leu Arg Val Val Ser Asn
755 760 765Val Thr Gly Ala Trp Val Asp Pro Glu Glu Trp Gly Thr Pro
Glu Tyr 770 775 780Trp Val Arg Gln Val Arg Glu Pro Val Arg Phe Ala
Asp Gly Val Ala785 790 795 800Thr Leu Leu Asp Ala Gly Val Arg Thr
Phe Val Glu Leu Gly Pro Ala 805 810 815Gly Thr Leu Thr Ser Met Val
Ser His Cys Ala Asp Ala Thr Ala Thr 820 825 830Ser Val Thr Ala Val
Pro Thr Leu Arg Pro Asp His Asp Glu Ser Arg 835 840 845Thr Val Leu
Ser Ala Ala Ala Ser Leu Tyr Val Gln Gly His Pro Val 850 855 860Asp
Trp Ala Pro Leu Phe Pro Arg Ala Arg Thr Val Asp Leu Pro Thr865 870
875 880Tyr Pro Phe Gln His Gln His Tyr Trp Met Glu Ser Ala Ala Arg
Pro 885 890 895Thr Val Glu Asp Thr Pro Arg Glu Pro Leu Asp Gly Trp
Thr His Arg 900 905 910Ile Asp Trp Val Pro Leu Val Asp Glu Glu Pro
Ala Pro Val Leu Ala 915 920 925Gly Thr Trp Leu Leu Val Arg Pro Glu
Glu Gly Pro Arg Pro Leu Ala 930 935 940Asp Ala Val Ala Asp Ala Leu
Thr Arg His Gly Ala Ser Val Val Glu945 950 955 960Ala Ala Arg Val
Pro His Gln Ser Asp Thr Glu Leu Thr Gly Val Val 965 970 975Ser Leu
Leu Gly Pro Gly Ala Asp Gly Asp Gly Gly Leu Asp Ala Thr 980 985
990Leu Arg Leu Val Gln Asp Leu Ala Thr Ala Gly Ser Thr Ala Pro Leu
995 1000 1005Trp Ile Val Thr Ser Gly Ala Val Ala Val Gly Thr Ser
Asp Thr Val 1010 1015 1020Pro Asn Pro Glu Gln Ala Thr Leu Trp Gly
Leu Ala Arg Ala Ala Ala1025 1030 1035 1040Thr Glu Trp Pro Gly Leu
Gly Ala Ala Arg Ile Asp Leu Pro Ala Asp 1045 1050 1055Leu Thr Glu
Gln Val Gly Arg Arg Leu Cys Ala Arg Leu Leu Asp Arg 1060 1065
1070Ser Glu Gln Glu Thr Ala Val Arg Gln Ala Gly Val Phe Ala Arg Arg
1075 1080 1085Leu Val Arg Ala Arg Thr Ser Asp Gly Arg Trp Thr Pro
Arg Gly Thr 1090 1095 1100Val Leu Val Thr Gly Gly Thr Gly Ala Leu
Ala Gly His Val Ala Arg1105 1110 1115 1120Trp Leu Ala Glu Glu Gly
Ala Glu His Ile Val Leu Ala Gly Arg Arg 1125 1130 1135Gly Pro Asp
Gly Gln Gly Ala Glu Ala Leu Arg Ala Asp Leu Val Ala 1140 1145
1150Ala Gly Val Lys Ala Thr Ile Val Arg Cys Asp Val Ala Asp Arg Asp
1155 1160 1165Ala Val Arg Leu Leu Leu Asp Ala His Arg Pro Ser Ala
Ile Val His 1170 1175 1180Thr Ala Gly Val Val Asp Asp Gly Leu Leu
Thr Ser Leu Thr Pro Ala1185 1190 1195 1200Gln Val Glu Arg Val Leu
Arg Pro Lys Leu Leu Gly Ala Arg Asn Leu 1205 1210 1215His Glu Leu
Thr Arg Asp Arg Glu Leu Asp Ala Phe Val Leu Phe Ser 1220 1225
1230Ser Leu Ala Gly Val Leu Gly Gly Ala Gly Gln Ala Asn Tyr Ala Ala
1235 1240 1245Ala Asn Ala Tyr Leu Asp Ala Leu Ala Ala His Arg Thr
Ala His Gly 1250 1255 1260Leu Pro Ala Ala Ser Leu Ala Trp Gly Pro
Trp Glu Gly Asp Gly Met1265 1270 1275 1280Ala Ala Ala Gln Glu Ala
Ala Asp Arg Leu Arg Arg Ser Gly Leu Thr 1285 1290 1295Pro Leu Pro
Pro Glu Gln Ala Val Arg Ala Leu Gly Arg Gly His Gly 1300 1305
1310Pro Leu Val Val Ala Asp Ala Asp Trp Ala Arg Leu Ala Ala Gly Ser
1315 1320 1325Thr Gln Arg Leu Leu Asp Glu Leu Pro Glu Val Arg Ala
Val Arg Pro 1330 1335 1340Ala Glu Pro Ala Val Gly Gln Arg Pro Asp
Leu Pro Ala Arg Leu Ala1345 1350 1355 1360Gly Arg Pro Ala Glu Glu
Gln Ser Ala Val Leu Leu Glu Ala Val Arg 1365 1370 1375Glu Glu Ile
Ala Ala Val Leu Arg Tyr Ala Asp Pro Ala Arg Ile Gly 1380 1385
1390Ala Asp His Glu Phe Leu Ala Leu Gly Phe Asp Ser Leu Thr Ser Ile
1395 1400 1405Glu Leu Arg Asn Arg Leu Ala Thr Arg Ile Gly Leu Thr
Leu Pro Ala 1410 1415 1420Thr Leu Thr Leu Glu Gln Arg Thr Pro Ala
Gly Leu Ala Ala His Leu1425 1430 1435 1440Arg Glu Arg Ile Ala Asp
Arg Pro Val Gly Ser Gly Ala Val Pro Val 1445 1450 1455Pro Gly Ser
Ala Asp Val Pro Glu Ala Gly Gly Gly Ser Gly Leu Gly 1460 1465
1470Glu Leu Trp Gln Glu Ala Asp Arg His Gly Arg Arg Leu Glu Phe Ile
1475 1480 1485Asp Val Leu Thr Ala Ala Ala Ala Phe Arg Pro Ala Tyr
Arg Glu Pro 1490 1495 1500Ala Glu Leu Glu Leu Pro Pro Leu Arg Leu
Thr Ser Gly Gly Asp Glu1505 1510 1515 1520Pro Pro Leu Phe Cys Ile
Pro Ser His Leu Gly Lys Ala Asp Pro His 1525 1530 1535Lys Phe Leu
Arg Phe Ala Ala Ala Leu Arg Gly Arg Arg Asp Val Phe 1540 1545
1550Val Leu Arg Gln Pro Gly Phe Val Pro Gly Gln Pro Leu Pro Ala Gly
1555 1560 1565Leu Asp Val Leu Leu Asp Thr His Ala Arg Ala Met Ala
Gly His Asp 1570 1575 1580Arg Pro Val Leu Leu Gly Tyr Ser Ala Gly
Gly Leu Ala Ala Gln Ala1585 1590 1595 1600Leu Ala Ala Arg Leu Ala
Glu Leu Gly Arg Pro Pro Ala Ala Val Val 1605 1610 1615Leu Val Asp
Thr Tyr Ala Pro Asp Glu Thr Glu Val Met Ala Arg Ile 1620 1625
1630Gln Gly Ala Met Glu Gln Gly Gln Arg Asp Arg Asp Gly Arg Thr Gly
1635 1640 1645Ala Ala Phe Gly Glu Ala Trp Leu Thr Ala Met Gly His
Tyr Phe Gly 1650 1655 1660Phe Asp Trp Thr Pro Cys Pro Val Asp Val
Pro Val Leu His Val Arg1665 1670 1675 1680Ala Gly Asp Pro Met Thr
Gly Met Pro Val Glu Gly Arg Trp Gln Ala 1685 1690 1695Arg Trp Asn
Leu Pro His Thr Ala Val Asp Val Pro Gly Asp His Phe 1700 1705
1710Thr Met Met Glu Asp His Ala Pro Arg Thr Ala Asp Thr Val His Asp
1715 1720 1725Trp Leu Gly Thr Ala Val Arg Arg Pro Glu Arg Thr Arg
Asp 1730 1735 1740
<210> 8 <211> 264 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 8
Met Thr Gly
Thr Asn Thr His Ser Asp Val Trp Ile Arg Gln Tyr Arg1 5 10 15Pro Ala
His Pro Thr Ala Pro Gln Leu Ile Cys Leu Pro His Ala Gly 20 25 30Gly
Ser Ala Thr Phe Tyr His Pro Val Ala Ala Ala Leu Ala Pro Arg 35 40
45Cys Asp Val Leu Ala Val Gln Tyr Pro Gly Arg Gln Asp Arg Arg Ala
50 55 60Glu Lys Pro Leu Glu Asp Ile Asp Glu Leu Ala Asn Gln Leu Phe
Pro65 70 75 80Val Leu Arg Ala Arg Val His Gln Pro Val Ala Leu Phe
Gly His Ser 85 90 95Met Gly Ala Thr Leu Ala Phe Glu Leu Ala Arg Arg
Phe Glu Ser Ala 100 105 110Gly Ile Ser Leu Glu Ala Leu Leu Val Ser
Ala Arg Pro Ala Pro Ser 115 120 125Arg Gln Arg Thr Gly Gly Thr Val
His Leu Leu Ser Asp Glu Glu Leu 130 135 140Val Ala Glu Leu Arg Thr
Leu Asp Gly Thr Ala Glu Gln Val Phe His145 150 155 160Asp Glu Glu
Leu Val Arg Met Ala Leu Pro Ala Ile Arg Gly Asp Tyr 165 170 175Arg
Ala Ala Glu Thr Tyr Arg Tyr Arg Pro Gly Pro Lys Leu Arg Cys 180 185
190Pro Ile His Ala Leu Thr Gly Asp Asp Asp Pro Met Val Thr Pro Val
195 200 205Glu Ala Arg Ala Trp Ser Glu His Thr Asp Gly Pro Phe Thr
Leu Asp 210 215 220Thr Phe Ala Gly Gly His Phe Tyr Leu Leu Glu His
Arg Asp Ala Ile225 230 235 240Leu Gly Ile Ile Ala Glu His Leu Arg
Thr Cys Ser Arg Ala Pro Gly 245 250 255Asp Arg Ser Gly Leu Thr Arg
Glu 260
<210> 9 <211> 265 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 9
Met Ser Leu Glu Leu Thr
Asp Arg Val Met Met Val Thr Gly Ala Gly1 5 10 15Ser Gly Ile Gly Arg
Ala Ala Ala Arg Leu Leu Val Gly His Gly Ala 20 25 30Arg Val Val Leu
Val Gly Arg Thr Glu Ser Ala Leu Thr Glu Thr Thr 35 40 45Ala Gly Leu
Pro Ser Ser His His Leu Val Val Pro Cys Asp Val Gly 50 55 60Asp Asp
Lys Gln Val Ala Asp Cys Val Ala Arg Ala Val Ser Arg Phe65 70 75
80Gly Arg Leu Asp Gly Ala Phe Asn Asn Ala Gly Thr Phe Gly Ser Phe
85 90 95Gly Pro Leu His Gln Asp Thr Ala Asp Asn Phe Asp Arg Val Ile
Ala 100 105 110Thr Asn Leu Arg Gly Val Trp Ser Cys Met Arg Gly Gln
Ile Glu Ala 115 120 125Met Leu Thr Ala Gly Gly Gly Ala Ile Val Asn
Cys Ala Ser Val Ala 130 135 140Gly His Ile Gly His Ala Gln Ser Pro
Leu Tyr Ser Ala Thr Lys His145 150 155 160Ala Val Ile Gly Leu Ser
Lys Ser Val Ala Leu Gln Tyr Ala Gly Asp 165 170 175Gly Ile Arg Val
Asn Val Val Ser Pro Gly Ser Thr Asp Thr Pro Met 180 185 190Leu Arg
Ser Leu Tyr Ala Asp Pro Ser Ala Leu Ala Gln Arg Ala Arg 195 200
205Arg Ala Pro Leu Gly Arg Leu Gly Lys Cys Glu Glu Val Ala Asn Ala
210 215 220Val Val Trp Leu Leu Ser Pro Leu Ala Ala Tyr Val Thr Gly
Gln Thr225 230 235 240Leu Gly Val Asp Gly Gly Val Thr Ala Gly Ser
Ala Ile Pro Arg Thr 245 250 255Asn Ala Thr Pro Glu Gly Gln His Arg
260 265
<210> 10 <211> 250 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 10
Met Thr Ala Arg His
Asp Val Ala Leu Val Thr Gly Ala Gly Ser Gly1 5 10 15Ile Cys Ala Glu
Val Ala Arg Gly Leu Ala Ala Arg Gly Leu Arg Val 20 25 30Val Leu Leu
Asp Lys Asp Ala Glu Ala Val His Arg Val Ala Asp Gly 35 40 45Leu Gly
Asp Arg Leu Ala Arg Asp Pro Leu Val Ala Asp Val Thr Asp 50 55 60Pro
His Ala Leu Ala Ser Ala Val Asp Ser Leu Ala Pro Gln His Arg65 70 75
80Pro Gly Val Leu Val Asn Gly Val Gly Gly Asp Thr Arg Ala Arg Ser
85 90 95Val Thr Glu Leu Thr Glu Ala Asp Leu Gln Glu Ala Val Thr His
Asn 100 105 110Leu Ala Ser Val Phe Thr Met Thr Arg Leu Cys Val Pro
Ala Met Val 115 120 125Ala Ala Gly Trp Gly Arg Val Val Asn Leu Ala
Ser Val Ala Gly Arg 130 135 140Thr Tyr Thr Arg Phe Ser Asn Ala Ala
Tyr Val Ala Ala Lys Ala Gly145 150 155 160Val Ile Gly Phe Thr Lys
Gln Cys Ala Tyr Glu Leu Ala Pro His Gly 165 170 175Val Thr Val Asn
Ala Val Ala His Gly Val Ile Gly Thr Glu Arg Ile 180 185 190Arg Arg
Ala Trp Glu Asp Lys Pro Pro Gln Trp Thr Ala Asp Arg Val 195 200
205Ser His Ile Pro Ala Gly Arg Phe Gly Ser Val Ala Glu Ala Ala Gly
210 215 220Met Val Cys His Leu Cys Gly Glu Asp Ala Gly Tyr Thr Thr
Gly Thr225 230 235 240Val Val Asp Val Asn Gly Gly Leu His Ile 245
250
<210> 11 <211> 390 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 11
Met Ile Arg Arg Val Arg
Leu His Thr Ala Val Val Pro Met Ala Ala1 5 10 15Ala Phe Asp His Ala
Thr Arg Ser Arg Arg Ser Ala Ala Ser Leu Leu 20 25 30Val Glu Ile Glu
Leu Ala Gly Thr Arg Gly Trp Gly Glu Gly Ala Pro 35 40 45Arg Asp Tyr
Val Thr Gly Glu Thr Leu Asp Gly Ala Val Arg Ala Val 50 55 60Gln Ala
Cys Asp Pro Gly Glu Leu Ala Glu Arg Ile Glu Trp Arg Asp65 70 75
80Phe Glu Ser Ala Val Ala Ser Ile Ala Gln Leu Pro Leu Thr Gly Leu
85 90 95Val Asp Gly Ser Ser Ala Ala Ala Ala Val Glu Ile Ala Leu Leu
Asp 100 105 110Ala Val Cys Arg His Phe Ala Arg Pro Leu Ala Asp Val
Leu Arg Val 115 120 125Leu Ala Pro Pro Ala Arg Ser Arg Arg Asp Gly
Pro Thr Ser Val Ser 130 135 140Leu
Val Ile His Leu Ser Arg Asp Val Ala Thr Val Leu Asp Ala Leu145 150
155 160Thr Pro Arg Ala Leu Ala Ala Leu Arg His Val Lys Ile Lys Val
Ala 165 170 175Asp Pro Ala Gly Ala Val Asp Arg Leu Thr Ala Ala Gln
Asp Arg Leu 180 185 190Pro Ala Asp Thr Arg Val Ser Leu Asp Val Asn
Gly Ala Trp Thr Ala 195 200 205Glu Glu Ala Glu Lys Val Ala Gly Glu
Leu Asp Gly Val Gly Trp Val 210 215 220Glu Glu Pro Leu Pro Pro Arg
Ser Trp Pro Glu Leu Gly Arg Leu Arg225 230 235 240Arg Ala Thr Gly
Leu Pro Val Met Leu Asp Glu Ser Cys Thr Gly Pro 245 250 255Ala Asp
Leu His Ala Ala Ala Thr Ser Gly Ala Ala Ser His Ile Asn 260 265
270Val Arg Leu Ser Lys Cys Gly Gly Phe Leu Ala Ala Ala Arg Leu Ala
275 280 285Leu Arg Ala Asp Glu Leu Gly Val Gly Cys Gln Leu Gly Val
His Val 290 295 300Ala Glu Val Gly Pro Leu Trp Ala Ala Gly Arg Thr
Leu Ala Thr Ala305 310 315 320Trp Asp Leu Trp Gln Thr Val Glu Ala
Gly Arg Ala Asp Glu Trp Phe 325 330 335Pro Val Pro Leu Thr Thr Pro
Ala Phe Thr Val Asp Arg Ser Leu His 340 345 350Arg Val Glu Pro Leu
Thr Gly Pro Gly Thr Gly Ile Glu Pro Thr Glu 355 360 365Glu Leu Leu
Arg His Thr Arg Cys Ala Ala Thr Trp Glu Ser Gly Gly 370 375 380Gly
Trp Arg Arg Asn Thr385 390
<210> 12 <211> 272 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 12
Met Pro Thr Thr Ser Met Leu Thr Ala Ala Asp Gly Thr Gly Leu Thr1
5 10 15Leu His His Trp Thr Thr Pro Gly Ala Thr Ser Ala Val Phe Tyr
Leu 20 25 30His Gly Ile Gln Ser His Ala Gly Trp Leu Phe Glu Thr Gly
Pro Glu 35 40 45Leu Asn Ala Arg Gly Ile Asp Val Tyr Ala Leu Asp Arg
Arg Gly Ser 50 55 60Gly Arg Ser Glu Gly Pro Arg Gly His Leu Pro Ser
Ala Asp Leu Val65 70 75 80Leu Asp Asp Tyr Ala Arg Ala Leu Asp Ala
Val Thr Ala Glu Val Gly 85 90 95Gly Ala Gly Pro Val Ala Leu Gly Gln
Ser Leu Gly Gly Ser Val Leu 100 105 110Ala Ala Leu Trp Cys Thr Arg
Asp Leu Pro Val Arg Arg Leu Val Leu 115 120 125Cys Ala Pro Ala Leu
Gly Gln Gln Arg Ala Arg His Thr Ala Asp Thr 130 135 140Leu Ala Glu
Arg Arg Ala Leu Thr Gly Ser Gly Leu Arg Pro Val Gly145 150 155
160Leu Ala Asp Gly Asp Tyr Thr Asp Leu Pro Arg Tyr Arg Glu Phe Leu
165 170 175Thr Gly Asp His Leu Met Leu Arg Glu Val Thr Ser Ala Thr
Gln Ala 180 185 190Thr Leu Val His Leu Glu Asp His Tyr Ala Arg Gly
Ala Pro Arg Thr 195 200 205Arg Leu Pro Val Asp Leu Ala Leu Pro Thr
His Asp Pro Ile Ile Asp 210 215 220Leu Ser Ala Ala Arg Ala Met Leu
Arg Arg Leu Thr Ser Ala Val His225 230 235 240Glu Glu Val Phe Ala
Thr Asp Arg His Tyr Val Glu Phe Thr Ser Ala 245 250 255Arg Thr Ala
Tyr Trp Asp Trp Leu Ala Thr Arg Leu Lys Glu Glu Ala 260 265
270
<210> 13 <211> 539 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 13
Met Lys Val Phe His Ala
Leu Ala Asp Ala Leu Thr Ala His Gly Val1 5 10 15Asp Thr Val Phe Gly
Leu Met Gly Asn Ala Asn Leu Leu Tyr Leu Pro 20 25 30Ala Phe Ala Asp
Ala Gly Gly Arg Phe Val Ala Val Ala His Glu Ala 35 40 45Gly Ala Val
Ala Met Ala Asp Gly Arg Ala Arg Met Cys Gly Gly Ile 50 55 60Gly Val
Ala Ser Val Thr His Gly Pro Ala Phe Thr Asn Ala Leu Thr65 70 75
80Pro Leu Val Glu Ala Ala Arg Ser His Ser Gln Val Leu Leu Ile Thr
85 90 95Gly Asp Pro Pro Pro Val Pro Thr His Phe His His Phe Asp Ile
Ala 100 105 110Thr Val Ala Ala Ala Ala Gly Ala Gly Tyr Glu Arg Val
His Arg Pro 115 120 125Ala Ser Leu Val Ala Asp Leu Asn Arg Ala Val
Gln Arg Ile Val Ala 130 135 140Glu Arg Arg Pro Val Val Leu Asn Val
Pro Ile Asp Leu Met Gln Ala145 150 155 160Glu Ala Gly Glu Gln Ala
Pro Val Thr Leu Pro Val Ala Pro Gly Pro 165 170 175Leu Ala Ala Pro
Glu Ala Glu Ala Leu Asp Gly Ala Leu Gly Leu Ile 180 185 190Gly Ser
Ala Lys Arg Pro Leu Val Leu Ala Gly His Gly Ala Ala Val 195 200
205Ala Gly Ala Arg Glu Ala Leu Val Glu Leu Ala Asp Arg Thr Gly Ala
210 215 220Ala Leu Ala Thr Thr Val Leu Gly Lys Glu Met Phe Ala Gly
His Pro225 230 235 240Arg Asp Val Gly Ile Phe Gly Ser Leu Ala His
Ser Val Ala Ser Thr 245 250 255Val Ile Ala Glu Ser Asp Cys Val Ile
Ala Phe Gly Ala Ser Leu Asn 260 265 270Met Trp Thr Val Leu Asn Gly
Glu Leu Leu Arg Gly Lys Arg Val Val 275 280 285His Val Asp Thr Asp
Pro Ala Arg Phe Gly Ser Tyr Ser Pro Val Asp 290 295 300Glu Pro Val
Ala Gly Asp Ala Arg Arg Thr Ala Glu Thr Met Asn Val305 310 315
320Leu Leu Asp Gln Ala Gly Val Thr Ala Ala Asn Gly Ala Trp Ala Glu
325 330 335Arg Val Ala Gly Gln Leu Ala Gly Phe Ser Pro Gln Asp Asp
Val Asp 340 345 350Asp Arg Ser Gly Ala Glu Thr Val Asp Ile Arg Thr
Ala Met Ile Arg 355 360 365Leu Asp Arg Ile Leu Pro Ala Glu Arg Ser
Val Val Ser Asp Ile Gly 370 375 380Arg Phe Asp Val Gly Val Trp Pro
Tyr Leu Arg Val Ala Asp Pro Leu385 390 395 400His Phe Thr Val Met
Gly Gly Phe Gly Ser Ile Gly Leu Gly Val Ala 405 410 415Gly Ala Ile
Gly Ala Ala Thr Ala Gly Thr Gly Arg Pro Val Val Ala 420 425 430Ala
Val Gly Asp Gly Gly Phe Met Met His Leu Ser Glu Phe Thr Thr 435 440
445Ala Val Arg Tyr Arg Leu Pro Leu Val Val Val Val Leu Asn Asp Gly
450 455 460Ala Tyr Gly Ala Glu His Tyr Lys Leu Arg Asn His Gly Tyr
Asp Pro465 470 475 480Ala Tyr Ser Ala Phe Ala Trp Pro Asp Leu Ala
Gly Leu Ala Thr Ala 485 490 495Met Gly Ala Arg Ala Leu Thr Val Arg
Lys Ala Glu Glu Leu Asp Ala 500 505 510Val Gly Asp Leu Leu Ser Thr
Leu Glu Gly Pro Leu Leu Val Asp Val 515 520 525Arg Leu Asp Pro Asp
Val Asn Leu Val Arg Tyr 530 535
<210> 14 <211> 683 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 14
Met Arg Gly Ala Arg Glu Asn Ser Met Thr Arg Ala Gly Pro Leu Glu1
5 10 15Gly Ile Ala Val Leu Met Ala Gly Arg Ser Thr Pro Ala Ala Leu
Leu 20 25 30Gly Arg Leu Leu Ala Asp Leu Gly Ala Arg Val Val Thr Leu
Cys Arg 35 40 45Ser Pro Asp His Gly Gly Pro Phe Glu Arg Trp Leu His
Ser Ala Ala 50 55 60Gln Ser Ala Ser Gly Trp Asp Gln Ala Ser Arg Leu
Leu Gln Thr Ala65 70 75 80Asp Val Leu Val Cys Asp Ala Glu Gly Asp
Glu Arg Leu Ala Ala Leu 85 90 95Gly Leu Gly Ala Pro Glu Leu Pro His
Arg Ser Pro Glu Leu Val Ala 100 105 110Val Arg Leu Ser Ala Phe Gly
Leu Thr Gly Pro Leu Arg Asp Ala Pro 115 120 125Ala Thr Glu Arg Thr
Leu Gln Ala Leu Ala Gly Leu Thr Ser Ala Thr 130 135 140Gly Thr Glu
Gly Glu Pro Ser Val Leu Ser Val Val Gly Leu Ala Ser145 150 155
160Arg Thr Ala Ala Leu Ser Gly Leu Ile Ala Val Val Ala Gly Leu Ile
165 170 175Gly Arg Glu Arg Gly Gly Gly Gly Asp Tyr Leu Asp Ile Ala
Glu Phe 180 185 190Asp Ser Leu Phe Thr Leu Thr Gly Thr Leu Leu Pro
Ser Val Ala Leu 195 200 205Ala Gly Arg Pro Pro Arg Arg Thr Gly Asn
Arg His Gly Met Ala Ala 210 215 220Pro Trp Asn Ser Tyr Thr Cys Gln
Asp Ala Pro Val Val Ile Cys Thr225 230 235 240Met Gly Glu Pro Ile
Trp His Arg Leu Thr Ala Val Leu Gly Arg Arg 245 250 255Asp Leu Pro
Asp Asp Pro Arg Phe Ala Asp Thr Ala Ala Arg Val Arg 260 265 270Asn
Ala Asp Glu Leu Asp Glu Ile Leu Gly Lys Trp Thr Ala Gly Gln 275 280
285Arg Ala Val Asp Val Val Thr Ala Leu Arg Ala Ala Gly Ile Pro Cys
290 295 300Ala Gln Val Ala Ala Pro Glu Glu Val Arg Asp Gly Ala Ala
Ala Arg305 310 315 320Arg Arg Gly Leu Val Thr Asp Pro Ser Gly Thr
Pro Gly Ser Pro Leu 325 330 335Arg Ser Leu Ile Pro Ala Val Thr Asp
Gly Pro Met Pro Arg Gln Gly 340 345 350Gly Leu Trp Glu Pro Ile Ala
Arg Gly Thr Pro Pro Leu Arg Gly Val 355 360 365Arg Leu Leu Glu Val
Gly Ser Tyr Thr Ala Gly Pro His Ala Gly Arg 370 375 380Leu Leu Ala
Gln Leu Gly Ala Asp Val Leu Lys Val Glu Pro Pro His385 390 395
400Gly Glu Gly Ser Arg Arg Leu Ala Gln Gln Val Ala Gly Val Gly Tyr
405 410 415Leu Tyr Tyr Val Asn Asn Ala Gly Lys Arg Ser Cys Arg Leu
Asp Leu 420 425 430Ala Asp Ala Glu Asp Arg Ala Gly Phe Glu Arg Leu
Leu Ala Gly Cys 435 440 445Asp Ile Val Leu Thr Asn Leu Ala Ala Asp
Thr Leu Thr Ala Gln Gly 450 455 460Leu Ala Pro Asp Gln Ile Leu Ser
Arg His Gly Val Val His Cys Thr465 470 475 480Val Thr Gly His Gly
Leu Ala Ala Ala Asp Arg Ser Val Asp Thr Val 485 490 495Ile Gln Ala
Glu Ser Gly Ile Met Arg Leu Val Gly Gly Pro Gly Ala 500 505 510Gly
Leu Arg Thr Pro Val Ser Ser Ala Asp Val Leu Gly Ala Tyr Leu 515 520
525Ala Ala Ala Ala Ala Val Val Ser Thr Tyr Val Arg Leu Arg Thr Glu
530 535 540His Gly Cys Ala Ala Asp Val Ala Leu Phe Asp Ser Ala Val
Trp Leu545 550 555 560Thr Gln Asp Arg Trp Phe Thr Ala Pro Pro Ala
Arg Ala Pro His Leu 565 570 575Val Arg Ala Ala Asp Gly Thr Val Leu
Val Asp Ala Glu Gly Pro Pro 580 585 590Pro Arg Ala Glu Gly Pro Val
Ala Ala Val Leu Asp Ala Ala Ala Ala 595 600 605Val Gly Val Pro Ala
Ala Pro Leu His Asp Leu Thr Arg Ala Val Arg 610 615 620His Pro Gln
Val Leu Ala Arg Arg Met Ala Val Ala Arg Asp Cys Ala625 630 635
640Gly Thr Thr Val Leu Ile Thr Gly Asn His Leu Arg Ser Leu Leu Arg
645 650 655Glu Asp Pro Pro Pro Thr Cys Ala Pro Val Asp Gln Asn Asp
Pro Val 660 665 670Trp Leu Gln Pro Ala Pro Thr Glu Gly Gln Gln 675
680
<210> 15 <211> 426 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 15
Met Thr Thr Arg Arg Arg
Gln Arg His Pro Ala Leu Ser Pro Ser Cys1 5 10 15Pro Ser Val Pro Phe
Pro Leu Leu Glu Thr Glu Phe Val Leu Met Pro 20 25 30Ser Phe Pro Val
Arg Arg Ser Val Pro Asp Thr Pro Pro Ala Glu His 35 40 45Leu Glu Leu
Leu Lys Glu Ser Gly Gly Val Cys Pro Phe Thr Met Glu 50 55 60Asp Gly
Arg Pro Ala Trp Leu Ala Ala Ser His Asp Ala Val Arg Ser65 70 75
80Leu Leu Ala Asp Arg Arg Ile Ser Asn Asn Pro Ala Lys Thr Pro Pro
85 90 95Phe Ser Gln Arg Glu Ala Leu Gln Lys Glu Arg Gly Gln Phe Ser
Arg 100 105 110His Leu Phe Asn Met Asp Ser Pro Glu His Asp Val Ala
Arg Arg Met 115 120 125Ile Ala Glu Asp Phe Thr Pro Arg His Ala Glu
Ala Val Arg Pro Tyr 130 135 140Phe Glu Glu Val Phe Gly Glu Ile Val
Asp Glu Val Val His Lys Gly145 150 155 160Pro Pro Ala Glu Met Ile
Glu Ser Phe Ala Phe Pro Val Ala Thr Arg 165 170 175Thr Ile Cys Lys
Val Leu Asp Ile Pro Glu Asp Asp Cys Glu Tyr Phe 180 185 190Gln Lys
Arg Thr Glu Gln Ile Ile Glu Met Asp Arg Gly Glu Glu Asn 195 200
205Leu Glu Ala Val Val Glu Leu Arg Arg Tyr Val Asp Ser Val Met Gln
210 215 220Gln Arg Thr Arg Lys Pro Gly Asp Asp Leu Leu Ser Arg Met
Ile Val225 230 235 240Lys Ala Lys Ala Ser Lys Glu Ile Glu Leu Ser
Asp Ala Asp Leu Val 245 250 255Asp Asn Ala Met Phe Leu Leu Val Ala
Gly His Glu Pro Ser Ala Asn 260 265 270Met Leu Gly Leu Gly Val Leu
Ala Leu Ala Glu Phe Pro Asp Val Ala 275 280 285Glu Glu Leu Arg Ala
Glu Pro His Leu Trp Pro Gly Ala Ile Asp Glu 290 295 300Met Leu Arg
Tyr Tyr Thr Ile Ala Arg Ala Thr Lys Arg Val Ala Ala305 310 315
320Ala Asp Ile Glu Tyr Glu Gly His Thr Ile Lys Glu Gly Asp Ala Val
325 330 335Ile Val Leu Leu Asp Thr Ser Asn Arg Asp Pro Lys Val His
Ala Glu 340 345 350Pro Asn Arg Leu Asp Ile His Arg Ser Ala Gly Asn
His Leu Ala Phe 355 360 365Ser His Gly Pro His Gln Cys Leu Gly Lys
His Leu Val Arg Val Gln 370 375 380Leu Glu Ile Ala Leu Arg Ala Val
Ala Glu Arg Leu Pro Gly Leu Arg385 390 395 400Leu Asp Ile Ala Lys
Glu Asp Ile Pro Phe Arg Gly Asp Ala Leu Ser 405 410 415Tyr Gly Pro
Arg Gln Leu Arg Val Thr Trp 420 425
<210> 16 <211> 454 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 16
Met Glu Lys Thr Asp Val Asp Arg Leu Arg Thr Leu Asp Arg
Glu His1 5 10 15Met Trp Tyr Pro Trp Thr Pro Met Thr Glu Trp Met Ala
Arg Asp Gln 20 25 30Leu Val Val Glu Arg Ala Glu Gly Cys Trp Leu Ile
Asp Ala Asp Gly 35 40 45Lys Arg Tyr Leu Asp Gly Arg Ser Ser Met Gly
Met Asn Leu His Gly 50 55 60His Gly Arg Ser Glu Ile Val Glu Ala Leu
Val Ala Gln Ala Arg Lys65 70 75 80Ala Gly Glu Thr Thr Leu Tyr Arg
Val Ser His Pro Ala Ala Val Glu 85 90 95Leu Ala Ala Arg Leu Ala Ser
Met Ala Pro Ala Gly Leu Gln Arg Val 100 105 110Phe Phe Ala Glu Ser
Gly Ser Thr Ala Val Glu Thr Ala Leu Lys Ala 115 120 125Ala Tyr Ala
Tyr Trp Val Ala Lys Gly Glu Pro Gln Arg Ser Thr Phe 130 135 140Val
Ser Met Glu Gly Gly Tyr His Gly Glu Thr Leu Gly Thr Val Ser145 150
155 160Leu Arg Gly Thr Asn Gly Glu Gln Val Asp Met Ile Arg Lys Thr
Tyr 165 170 175Glu Pro Leu Leu Phe Pro Ser Leu Ser Phe His Gln Pro
His Cys Tyr 180 185 190Arg Cys Pro Val Gly Gln Ser Ser Asp Ser Asp
Cys Gly Leu Glu Cys 195 200 205Thr Asp Ser Leu Glu Asn Leu Leu Thr
Arg Glu Lys Gly Arg Ile Ala 210 215 220Ala Val Ile Val Glu Pro Arg
Val Gln Ala Leu Ala Gly Val Ile Thr225 230 235 240Ala Pro Glu Gly
His Leu Ala Lys Val Ala Glu Ile Thr Arg Arg His 245 250 255Gly Val
Leu Leu Ile Val Asp Glu Val Leu Thr Gly Trp Ala Arg Thr 260 265
270Gly Pro Thr Phe Ser Cys Glu Ala Glu Gly Val Thr Pro Asp Leu Met
275 280
285Thr Val Gly Lys Ala Leu Thr Gly Gly Tyr Leu Pro Leu Ser Ala Thr
290 295 300Leu Ala Thr Glu Glu Ile Phe Gly Ala Phe Arg Glu Ser Val
Phe Leu305 310 315 320Ser Gly Ser Thr Tyr Ser Gly Tyr Ala Leu Gly
Ala Ala Val Ala Leu 325 330 335Ala Ser Leu Asp Leu Phe Glu Lys Glu
Asp Val Pro Ala Arg Ala Lys 340 345 350Ala Leu Ala Asp Val Leu Thr
Thr Ala Leu Glu Pro Phe Arg Ala Leu 355 360 365Thr His Val Gly Asp
Val Arg Gln Leu Gly Leu Ile Ala Gly Val Glu 370 375 380Leu Val Ala
Asp Arg Glu Thr Arg Ala Pro Tyr Pro Pro Gln Glu Arg385 390 395
400Val Val Asp Arg Ile Cys Thr Leu Ala Arg Asp Asn Gly Val Leu Val
405 410 415Asn Ala Val Pro Gly Asp Val Ile Thr Met Leu Pro Ser Pro
Ser Met 420 425 430Ser Pro Asp Asp Leu Arg Phe Leu Thr Gly Thr Leu
Tyr Thr Ala Val 435 440 445Arg Glu Val Thr Glu Glu
450
<210> 17 <211> 326 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 17
Met Arg Ala Ala Val Ile
Arg Ala Trp Gly Gly Pro Glu Arg Leu Thr1 5 10 15Leu Asp Arg Val Glu
Arg Pro Ser Pro Pro Pro Gly Trp Ile Ala Val 20 25 30Arg Val Glu Ala
Cys Ala Leu Asn His Leu Asp Ile His Val Arg Asn 35 40 45Gly Leu Pro
Gly Val Arg Leu Glu Leu Pro His Val Ser Gly Gly Asp 50 55 60Val Val
Gly Val Val Glu Gln Ala Thr Asp Glu Ala Gly Glu Arg Leu65 70 75
80Leu Gly Ser Arg Val Leu Leu Asp Pro Met Ile Gly Arg Gly Ile Leu
85 90 95Gly Glu His Tyr Trp Gly Gly Leu Ala Glu Tyr Val Val Ala Pro
Ala 100 105 110His Asn Ala Leu Pro Val Pro Asp Gln Asp Ala Asp Pro
Ala Arg Tyr 115 120 125Ala Ala Leu Pro Ile Ser Tyr Gly Thr Ala Gln
Arg Met Leu Phe Ser 130 135 140Arg Ala Arg Leu Arg Pro Gly Glu Ser
Val Leu Leu Phe Gly Ala Thr145 150 155 160Gly Gly Val Gly Val Ala
Cys Ala Gln Leu Ala Leu Arg Ala Gly Ala 165 170 175Arg Ile Ile Ala
Cys Ser Gly Ser Pro Ala Lys Leu Ala Arg Leu Arg 180 185 190Arg Leu
Gly Val Ile Asp Thr Ile Asp Thr Gly Thr Glu Asp Val Arg 195 200
205Arg Arg Val Arg Glu Leu Thr Asp Gly Gly Ala Asp Leu Val Val Asp
210 215 220Tyr Gln Gly Lys Asp Thr Trp Pro Val Ser Leu Arg Ser Ala
Arg Ala225 230 235 240Gly Gly Arg Ile Val Thr Cys Gly Ala Thr Thr
Gly Tyr Glu Ala Thr 245 250 255Thr Asp Leu Arg Tyr Val Trp Ser Arg
Gln Leu Asp Ile Leu Gly Ser 260 265 270Asn Ala Trp His Arg Asp Asp
Leu His Thr Leu Val Asp Leu Val Ala 275 280 285Thr Asp Ala Leu Glu
Pro Val Val His Ala Asp Phe Pro Leu Ser Arg 290 295 300Ala Pro Glu
Ala Val Ala Glu Leu Glu Glu Arg Arg Ala Phe Gly Lys305 310 315
320Val Val Ile Arg Thr Ala 325
<210> 18 <211> 556 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 18
Met Thr Gly Asn Thr Thr Ser Ala Ala Phe Leu Arg Arg Thr Gln Asn1
5 10 15Ala Leu Ala Met Gln Arg Lys Ile Cys Ala Gln Pro Glu Glu Thr
Ala 20 25 30Glu Arg Val Phe Ser Asp Ile Leu Ser Val Ser Arg Asp Thr
Gly Phe 35 40 45Gly Arg Glu His Gly Leu Ala Gly Val Arg Thr Arg Gln
Glu Trp Arg 50 55 60Arg Ala Val Pro Ile Arg Thr Tyr Asp Glu Leu Ala
Pro Tyr Val Glu65 70 75 80Arg Gln Phe Ser Gly Glu Arg Arg Val Leu
Thr Thr Asp Asp Pro Arg 85 90 95Ala Phe Leu Arg Thr Ser Gly Ser Thr
Gly Arg Ala Lys Leu Val Pro 100 105 110Thr Thr Asp His Trp Arg Arg
Val Tyr Arg Gly Pro Ala Leu Tyr Ala 115 120 125Gln Trp Gly Leu Tyr
Phe Glu Gln Ile Gly Thr His Arg Leu Thr Gly 130 135 140Asp Glu Val
Leu Asp Leu Ser Trp Glu Pro Gly Pro Ile Arg His Arg145 150 155
160Leu Arg Gly Phe Pro Val Tyr Ser Ile Thr Glu Arg Pro Val Ser Asp
165 170 175Asp Pro Asp Asp Trp Asn Pro Pro Trp Arg His Ala Arg Trp
Phe Thr 180 185 190Arg Asp Ala Gly Ala Ala Thr Met Ala Asp Leu Leu
Tyr Gly Lys Leu 195 200 205Leu Arg Leu Ala Ala His Asp Leu Arg Leu
Ile Val Ser Val Asn Pro 210 215 220Ser Lys Ile Val Leu Leu Ala Glu
Thr Leu Lys Glu Asn Ala Glu Arg225 230 235 240Leu Ile Gln Asp Leu
His Asp Gly His Gly Thr Asp Arg Ala Ala Arg 245 250 255Pro Asp Phe
Leu Arg Arg Leu Thr Ala Ala Phe Asp Arg Thr Gly Gly 260 265 270Arg
Pro Leu Leu Thr Asp Leu Trp Pro Gly Leu Arg Leu Leu Val Cys 275 280
285Trp Asn Ser Ala Ser Ala Ala Leu Tyr Gly Pro Trp Leu Ser Arg Leu
290 295 300Ala Thr Gly Val Ala Ala Leu Pro Phe Ser Thr Thr Gly Thr
Glu Gly305 310 315 320Ile Val Thr Leu Pro Val Asp Asp His Leu Ser
Ala Gly Pro Leu Ala 325 330 335Val Asp Gln Gly His Phe Glu Phe Val
Pro Trp Gln Asp Leu Asp Asp 340 345 350Gly Ser Pro Leu Pro Glu Asp
Thr Pro Thr Leu Gly Tyr Asp Glu Leu 355 360 365Glu Leu Gly Ala Asp
Tyr Arg Leu Val Met Ser Gln Ala Asn Gly Leu 370 375 380Tyr Arg Tyr
Asp Val Gly Asp Val Tyr Arg Val Val Gly Ala Val Gly385 390 395
400Ala Thr Pro Arg Leu Glu Phe Leu Gly Arg Ala Gly Phe Gln Ser Ser
405 410 415Phe Thr Gly Glu Lys Leu Thr Glu Ser Asp Val His Thr Ala
Val Met 420 425 430Arg Val Leu Gly Ser Glu Arg Thr Asp His Pro His
Phe Ser Gly Ile 435 440 445Pro Val Trp Asp Thr Pro Pro His Tyr Leu
Val Ala Ile Glu Trp Ala 450 455 460Asp Ala His Gly Thr Leu Asn Val
Gln Asp Thr Ala Arg Arg Ile Asp465 470 475 480Ala Thr Leu Gln Glu
Val Asn Val Glu Tyr Ala Asp Lys Arg Arg Ser 485 490 495Gly Arg Leu
Arg Pro Leu Gln Ile Leu Pro Leu Val Pro Gly Ala Phe 500 505 510Gly
Gln Ile Ala Glu Arg Arg Phe Arg Gln Gly Thr Ala Gly Ala Gln 515 520
525Ile Lys His His Trp Leu Gln Lys Asp Ser Ala Phe Leu Asp Thr Leu
530 535 540Arg Asp Leu Asp Leu Val Arg Ala Arg Pro Gly Thr545 550
555
<210> 19 <211> 305 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 19
Met Arg Ile Gly Phe Ala
Ala Pro Met Ser Gly Pro Trp Ala Thr Pro1 5 10 15Asp Thr Ala Val His
Val Ala Arg Thr Ala Glu Gln Leu Gly Tyr Ala 20 25 30Ser Leu Trp Thr
Tyr Gln Arg Val Leu Gly Ala Pro Asp Asp Ser Trp 35 40 45Gly Glu Ala
Asn Arg Ser Val His Asp Pro Leu Thr Thr Leu Ala Phe 50 55 60Leu Ala
Ala His Thr Thr Gly Ile Arg Leu Gly Val Ala Val Leu Ile65 70 75
80Met Pro Leu His Thr Pro Ala Val Leu Ala Lys Gln Leu Thr Thr Leu
85 90 95Asp Leu Leu Ser Gly Gly Arg Leu Asp Val Gly Leu Gly Asn Gly
Trp 100 105 110Ala Ala Glu Glu Tyr Ala Ala Ala Gly Val Thr Pro Thr
Gly Leu Ser 115 120 125Arg Arg Ala Glu Asp Phe Leu Ala Cys Leu Arg
Ala Leu Trp Gly Glu 130 135 140Gln Thr Val Val Glu His Asp Gly Pro
Phe Tyr Arg Val Pro Pro Ala145 150 155 160Arg Phe Asp Pro Lys Pro
Ala Gln Ser Pro His Pro Pro Leu Leu Leu 165 170 175Gly Gly Ala Ala
Pro Gly Ala Leu Arg Arg Ala Gly Arg Leu Cys Asp 180 185 190Gly Trp
Ile Ala Ser Ser Lys Ala Gly Pro Ala Ala Ile Arg Asp Ala 195 200
205Ile Thr Val Val Arg Asp Ser Ala Glu Arg Thr Gly Arg Asp Pro Ala
210 215 220Thr Leu Arg Phe Val Cys Arg Ala Pro Val Arg Leu Arg Thr
Arg Ser225 230 235 240Ala Pro Asn Glu Pro Pro Leu Thr Gly Thr Ala
Glu Thr Ile Arg Ala 245 250 255Asp Leu Ala Ala Leu Ala Asp Thr Gly
Leu Thr Glu Ile Phe Leu Asp 260 265 270Pro Asn Phe Asp Pro Glu Ile
Gly Ser Pro Asp Ala Pro Thr Gly Asp 275 280 285Val Arg His Arg Val
Asp Leu Leu Leu His Glu Leu Ala Pro Ala Asn 290 295
300Trp305
<210> 20 <211> 248 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 20
Met Leu Ile Ala Arg
Ala Ala Val Gly Glu Asp Arg Thr Tyr Ala Arg1 5 10 15Val Asp Thr Asp
Thr Gly Leu Ile His Leu Leu Ala Gly Thr Pro Tyr 20 25 30Asp Glu Ile
Arg Pro Thr Gly Glu Thr Arg Pro Leu Ala Glu Ala Arg 35 40 45Leu Leu
Ala Pro Val Glu Pro Ser Lys Val Leu Val Ala Gly Arg Asn 50 55 60Tyr
Gly Asp Val Val Thr Pro Asp Leu Val Val Phe Met Lys Pro Ser65 70 75
80Thr Ser Val Val Gly Pro Arg Ser Thr Val Leu Leu Pro Ala Glu Ala
85 90 95Lys Gln Val Arg Tyr Glu Gly Glu Leu Ala Val Val Ile Gly Arg
Arg 100 105 110Cys Lys Asp Val Pro Glu Asp Thr Ala Asp Gln Ala Val
Phe Gly Tyr 115 120 125Thr Cys Ala Asn Asp Val Thr Ala Trp Asp Val
Gly Glu Pro Lys Gly 130 135 140His Trp Thr Lys Ala Lys Ser Phe Asp
Thr Phe Cys Pro Leu Gly Pro145 150 155 160Trp Ile Arg Thr Asp Leu
Asp Pro Ala Asp Leu Val Leu Arg Thr Thr 165 170 175Val Asn Gly Thr
Leu Arg Gln Asp Gly Ser Thr Lys Glu Met Asn Arg 180 185 190Asn Val
Arg Ala Leu Val Ser Arg Cys Ser Ser Leu Met Thr Leu Leu 195 200
205Pro Gly Asp Val Ile Leu Thr Gly Thr Pro Ala Gly Ala Gly Val Leu
210 215 220Arg Pro Gly Asp Glu Val Val Val Glu Ile Asp Gly Ile Gly
Ser Leu225 230 235 240Ala Asn Pro Ile Gly Val Ala Lys
245
<210> 21 <211> 675 <212> PRT <213> Streptomyces parvulus Tu4055 <400> 21
Met Ser Val Ile Arg Pro
Thr Ala Glu Thr Glu Arg Ala Val Val Val1 5 10 15Val Pro Ala Gly Thr
Thr Cys Ala Asp Ala Val Thr Ala Ala Lys Leu 20 25 30Pro Arg Asn Gly
Pro Asn Ala Ile Val Val Val Arg Asp Pro Ser Gly 35 40 45Ala Leu Arg
Asp Leu Asp Trp Thr Pro Asp Ser Asp Val Glu Val Glu 50 55 60Ala Val
Ala Leu Ser Ser Glu Asp Gly Leu Thr Val Leu Arg His Ser65 70 75
80Thr Ala His Val Leu Ala Gln Ala Val Gln Gln Leu Trp Pro Glu Ala
85 90 95Arg Leu Gly Ile Gly Pro Pro Ile Glu Asn Gly Phe Tyr Tyr Asp
Phe 100 105 110Asp Val Glu Arg Pro Phe Gln Pro Glu Asp Leu Glu Arg
Val Glu Gln 115 120 125Arg Met Lys Glu Ile Ile Lys Ser Gly Gln Arg
Phe Cys Arg Arg Glu 130 135 140Phe Pro Asp Arg Glu Ala Ala Arg Ala
Glu Leu Ala Lys Glu Pro Tyr145 150 155 160Lys Leu Glu Leu Val Asp
Leu Lys Gly Asp Val Asp Ala Ala Glu Ala 165 170 175Met Glu Val Gly
Gly Ser Asp Leu Thr Ile Tyr Asp Asn Leu Asp Ala 180 185 190Arg Thr
Gly Asp Val Cys Trp Ser Asp Leu Cys Arg Gly Pro His Leu 195 200
205Pro Ser Thr Arg Leu Ile Pro Ala Phe Lys Leu Leu Arg Asn Ala Ala
210 215 220Ala Tyr Trp Arg Gly Ser Glu Lys Asn Pro Gln Leu Gln Arg
Ile Tyr225 230 235 240Gly Thr Ala Trp Pro Thr Arg Asp Glu Leu Lys
Ser His Leu Ala Ala 245 250 255Leu Glu Glu Ala Ala Lys Arg Asp His
Arg Arg Ile Gly Glu Glu Leu 260 265 270Asp Leu Phe Ala Phe Asn Lys
Glu Ile Gly Arg Gly Leu Pro Leu Trp 275 280 285Leu Pro Asn Gly Ala
Ile Ile Arg Asp Glu Leu Glu Asp Trp Ala Arg 290 295 300Lys Thr Glu
Arg Lys Leu Gly Tyr Lys Arg Val Val Thr Pro His Ile305 310 315
320Thr Gln Glu Asp Leu Tyr Tyr Leu Ser Gly His Leu Pro Tyr Tyr Ala
325 330 335Glu Asp Leu Tyr Ala Pro Ile Asp Ile Asp Gly Glu Lys Tyr
Tyr Leu 340 345 350Lys Pro Met Asn Cys Pro His His His Met Val Tyr
Lys Ala Arg Pro 355 360 365His Ser Tyr Arg Asp Leu Pro Tyr Lys Val
Ala Glu Tyr Gly Thr Val 370 375 380Tyr Arg Phe Glu Arg Ser Gly Gln
Leu His Gly Met Met Arg Thr Arg385 390 395 400Gly Phe Ser Gln Asn
Asp Ala His Ile Tyr Cys Thr Ala Asp Gln Ala 405 410 415Lys Asp Gln
Phe Leu Glu Val Met Arg Met His Ala Asp Tyr Tyr Arg 420 425 430Thr
Leu Gly Ile Ser Asp Phe Tyr Met Val Leu Ala Leu Arg Asp Ser 435 440
445Ala Asn Lys Asp Lys Tyr His Asp Asp Glu Gln Met Trp Glu Asp Ala
450 455 460Glu Arg Ile Thr Arg Glu Ala Met Glu Glu Ser Asp Ile Pro
Phe Gln465 470 475 480Ile Asp Leu Gly Gly Ala Ala His Tyr Gly Pro
Lys Val Asp Phe Met 485 490 495Ile Arg Ala Val Thr Gly Lys Glu Phe
Ala Ala Ser Thr Asn Gln Val 500 505 510Asp Leu Tyr Thr Pro Gln Arg
Phe Gly Leu Thr Tyr His Asp Ser Asp 515 520 525Gly Thr Glu Lys Pro
Val Val Val Ile His Arg Ala Pro Leu Gly Ser 530 535 540His Glu Arg
Phe Thr Ala Tyr Leu Thr Glu His Phe Ala Gly Ala Phe545 550 555
560Pro Val Trp Leu Ala Pro Glu Gln Val Arg Ile Ile Pro Ile Val Glu
565 570 575Glu Leu Thr Asp Tyr Ala Glu Glu Val Arg Asp Met Leu Leu
Asp Ala 580 585 590Asp Val Arg Ala Asp Val Asp Ala Gly Asp Gly Arg
Leu Asn Ala Lys 595 600 605Val Arg Ala Ala Val Thr Arg Lys Ile Pro
Leu Val Val Val Val Gly 610 615 620Arg Arg Glu Ala Glu Gln Arg Thr
Val Thr Val Arg Asp Arg Ser Gly625 630 635 640Glu Glu Thr Pro Met
Ser Leu Glu Lys Phe Val Ala His Val Thr Gly 645 650 655Leu Ile Arg
Thr Lys Ser Leu Asp Gly Ala Gly His Ile Arg Pro Leu 660 665 670Ser
Lys Ala 67522103PRTStreptomyces parvulus Tu4055 22Met Pro Arg Gly
Ile Ala Val Asp Val Leu Arg Ala Gly Asp Arg Trp1 5 10 15Pro His Ser
Ala Ala Pro Arg His Arg Gly Leu Leu Asn Ala Trp Trp 20 25 30Gly Ala
Trp Val Trp Ala Thr Val Phe Asp Arg Tyr Ala Ser Arg Thr 35 40 45Tyr
Asp Asp Ala Gln Asp Val Asp Ala Ile His Asp Ala Ala Gly Leu 50 55
60Val Met Ala Gly Ala Gly Phe Asp Ile Leu Ala Ala Val Leu Ala Ile65
70 75 80Leu Phe Val Arg Arg Leu Thr Ala Ala Gln His Ala Lys Ala Leu
Ala 85 90 95Gly Pro Thr Pro Pro Thr His 10023868PRTStreptomyces
parvulus Tu4055 23Met Glu Ala Phe Leu Leu Leu Ala Ala Glu Ser Val
Leu Leu Arg Arg1 5 10 15Asp Gln Ser Val Tyr Val Thr Pro Gly Ser Glu
Pro Asp Gly Pro Pro 20 25 30Arg Ala Ala Leu Arg Arg Leu Glu Ala Glu
Leu Leu Gly Arg Gly His 35 40 45Ala Val Ser Ala Pro Leu His Ala Val
Leu Ala Ser Leu Asp Ser Glu 50 55 60Glu Leu
Ala Ala Ala His Val Arg Leu Val Gly Leu Val Asp Asp Leu65 70 75
80Leu Gly Ser Asp Arg Thr His Thr Pro Leu Phe Arg Arg Phe Pro Arg
85 90 95Thr Val Pro Arg Asp Thr Glu Ala Leu Tyr Val Asp Arg Val Phe
Ala 100 105 110Phe Leu Leu Gln Gln Pro Glu Gln Pro Cys Val Leu Cys
Gly Glu Ala 115 120 125Arg Thr Val Leu Pro Val Ser Pro Cys Ala His
Leu Val Cys Arg Leu 130 135 140Cys Trp Asp Gly Ser Asp Tyr Ala Gly
Cys Pro Leu Cys His Arg Arg145 150 155 160Ile Asp Gly Asp Asp Pro
Phe Leu Arg Pro Val Arg Ala Val Gly Ala 165 170 175Ala Arg Ala Thr
Val Pro Gly Pro Leu Arg Leu Leu Arg Leu Gly Thr 180 185 190Asp Met
Thr Ala Asp Ala Thr Thr Ala Val Asp Ala Leu Leu Ala Arg 195 200
205Arg Thr Pro Leu Ser Pro Gln Asp Arg Asp Asp Leu Leu Thr Leu Leu
210 215 220Pro Leu Thr Pro Ala Gly Arg Gly Asp Leu Pro Gln Asp Ile
Pro Val225 230 235 240Arg Glu Thr Lys Ala Leu Val Leu Gly Ala Leu
Val Arg Arg Ala Pro 245 250 255Ser Arg Pro Ala Leu Arg Arg Leu Leu
Ala Glu Arg Leu Thr Thr Ala 260 265 270Thr Asp Val Leu Arg Leu Leu
Ala Val Leu Ser Gly Gly Asp Ala Gly 275 280 285Leu Val Thr Pro Ala
Arg Phe Thr Asn Val Pro Arg Ser Leu Arg Arg 290 295 300Asp Leu Leu
Ala Val Leu Asp Gly Leu Pro Ala Pro Tyr Leu Val Glu305 310 315
320Asp Met Leu Arg His Pro Thr Ala Trp Lys Arg Ala Ala Glu Val Leu
325 330 335His Pro Phe Glu Gly His Thr Arg His Pro Arg Ala Ala Leu
Ala Thr 340 345 350Ala Val Leu Arg Ala Thr Pro Leu Asp Pro Asp Thr
Ala Phe Gly Ala 355 360 365Ala Leu Leu Thr Thr Ala Ala Ala His Pro
Asp Ala Val Arg Pro Asp 370 375 380Gly Thr Arg Val Arg Pro Ala Thr
Trp Ala Gly Arg Leu Glu Gln Ala385 390 395 400Met Ala Glu Gly Asp
Ala Ala Arg Ala Ala Ala Leu Ala Gly Glu Arg 405 410 415Pro Gly Glu
Leu Val Arg Arg Leu Asp Val Leu Leu Arg Leu His Thr 420 425 430Asp
Glu Ala Leu Val Pro Glu Leu Glu Lys Ala Leu Arg His Gly Leu 435 440
445Pro Lys Val Gly Pro Gly Pro Leu Leu Ser Ala Leu Gly Ala Leu Arg
450 455 460Thr Arg Thr Glu Asp Arg Thr Gly Thr Arg Arg Val Phe Phe
Pro Arg465 470 475 480Gly Asp Val Thr Arg Ala Leu Ser Val Pro Glu
Arg Arg Pro Ala Leu 485 490 495Pro Ala Gly Pro Val Ser Glu Val Val
Ala Leu Leu Glu Gly Glu Leu 500 505 510Leu Arg Arg Phe Ala Ala Gly
Arg Pro Tyr Glu Leu Ser Val Leu Asp 515 520 525Ala Gly Leu Thr Asp
Leu Thr Val Pro Phe Thr Glu Arg Thr Ala Ala 530 535 540Lys Ala Leu
Val Thr Val Gly Arg Gly Ser Val Gln Ala Leu Pro Glu545 550 555
560Gly Ser Val Leu Arg Leu Phe Leu His Trp Thr Glu Pro Arg Gly Asn
565 570 575Arg Thr Asp Leu Asp Leu Ser Val Ala Phe Phe Asp Ala Glu
Trp Thr 580 585 590Phe Thr Gly Leu Cys Asp Tyr Thr Asn Leu Val His
Gly Pro Asp Ala 595 600 605Ala Ile His Ser Gly Asp Leu Thr Ser Ala
Pro Ala Pro Arg Gly Ala 610 615 620Thr Glu Tyr Val Asp Leu Asp Leu
Glu Arg Leu Ala Arg Arg Gly Asp625 630 635 640Thr Tyr Ala Val Pro
Leu Val Phe Ser Tyr Asn Asn Val Pro Phe Glu 645 650 655Glu Leu Pro
Asp Ala Phe Ala Gly Phe Met Ala Leu Pro Ala Glu Gly 660 665 670Pro
Arg Asp Ala Thr Tyr Asp Pro Arg Thr Val Arg Gln Arg Phe Asp 675 680
685Leu Ala Gly Asp Ser Lys Val Cys Leu Pro Met Ile Val Asp Leu Ala
690 695 700Arg Arg Arg Ala Leu Trp Thr Asp Thr His Leu Pro Ser Ala
Gly Gly705 710 715 720Phe Gln Ser Ile Gly Ser His Gly Gly Gly Glu
Leu Ala Ala Val Ala 725 730 735Gly Asp Leu Trp Gln Gln Phe Thr Ser
Gly Gly Arg Ala Thr Leu Trp 740 745 750Asp Leu Ala Val Leu Arg Ala
Ala Ala Leu Ser Pro Glu Val Ala Val 755 760 765Val Ser Arg Glu Pro
Glu Pro Ala Val Leu Arg Tyr Arg Arg Arg Ala 770 775 780Ala Glu Ser
Glu Ala Ala Phe Ala Val Arg Val Ala Ser His Lys Asp785 790 795
800Ala Glu Glu Arg Leu Ala His Thr Asp Pro Asp Ser Ala Ala Ala Gly
805 810 815Leu Ala Ala Gly Arg Arg Val Phe Leu Ala Thr Val His Gly
Asp Val 820 825 830Arg Pro Pro Gly Ala Ser Gly Thr Ser Tyr Arg Leu
Phe Pro Gly Ala 835 840 845Gly Asp Ala Ser Pro Thr Leu Thr Arg Val
Thr Ala Gly Asp Leu Leu 850 855 860Ala Glu Leu
Gly86524212PRTStreptomyces parvulus Tu4055 24Met Ala Glu Gln Ile
Ala Gly Ile Glu Ile Pro Asp Ser Ala Pro Ala1 5 10 15Arg Glu Ala Thr
Asp Leu Ile Arg Asp Thr Thr Pro Pro Leu Ile Phe 20 25 30His His Ser
Arg Arg Val Tyr Leu Phe Gly Ser Leu Gln Ala Ala Ala 35 40 45Leu Gly
Ile Arg Pro Asp Pro Glu Leu Leu Tyr Ile Ala Ala Leu Phe 50 55 60His
Asp Thr Gly Leu Val Pro Pro Tyr Arg Gly Asp Asp Gln Arg Phe65 70 75
80Glu Met Asp Gly Ala Asp Gln Ala His Ala Phe Leu Leu Ala His Gly
85 90 95Ile Pro Glu Ala Asp Ala Asp Thr Val Trp Thr Ala Val Ala Leu
His 100 105 110Thr Thr Pro Glu Val Pro Tyr Arg Met Ala Pro Glu Ile
Ala Ala Thr 115 120 125Thr Ala Gly Val Glu Thr Asp Val Leu Gly Leu
Arg Leu Gly Asn Leu 130 135 140Thr Arg Ala Gln Ile Asp Ala Val Thr
Ala Ala His Pro Arg Pro Asp145 150 155 160Phe Lys Lys Gln Ile Leu
Arg Ala Phe Thr Glu Gly Phe Glu His Arg 165 170 175Pro Ala Thr Thr
Phe Gly Thr Val Asn Ala Asp Val Leu Glu His Phe 180 185 190Ala Pro
Gly Phe Arg Arg Thr Asp Phe Val Glu Val Ile Glu Asn Ser 195 200
205Ala Trp Pro Glu 21025329PRTStreptomyces parvulus Tu4055 25Met
Thr Ala Arg Ala His Ser Val Gly Ile Leu Val Phe Asp Gly Met1 5 10
15Lys Met Leu Asp Leu Ser Gly Pro Ala Glu Val Phe Ala Glu Ala Asn
20 25 30Arg Phe Gly Ala Arg Tyr Arg Leu Gly Val Val Ser Pro Asp Gly
Ala 35 40 45Pro Val Arg Ser Ser Ile Gly Leu Leu Val Pro Ala Glu Ala
Asp Ala 50 55 60Arg Ser Ala Gly Pro Pro Asp Thr Leu Val Val Val Gly
Gly Asp Ala65 70 75 80Leu Pro Gly Ser Pro Val Asp Pro Arg Leu Ile
Asp Ala Ala Lys Ala 85 90 95Leu Ala Ala Arg Ala Gly Arg Val Ala Ser
Val Cys Thr Gly Ala Phe 100 105 110Val Leu Gly Ala Ala Gly Leu Leu
Glu Gly Arg Arg Ala Thr Thr His 115 120 125Trp Gln His Thr Thr Ala
Leu Ala Arg Arg Cys Pro Ser Thr Arg Val 130 135 140Glu Pro Asp Ala
Ile Phe Val Lys Asp Gly Ala Thr Tyr Thr Ser Ala145 150 155 160Gly
Val Thr Ala Gly Ile Asp Leu Ala Leu Ala Leu Leu Glu Glu Asp 165 170
175His Gly Pro Asp Leu Ala Arg Arg Val Ala Arg Ser Leu Val Val Tyr
180 185 190Leu Gln Arg Ala Gly Gly Gln Ser Gln Phe Ser Ala Ser Leu
Arg Gly 195 200 205Pro Ala Pro Arg Thr Pro Val Leu Arg Gln Val Gln
Asp Ala Val Gln 210 215 220Ala Asp Pro Ala Ala Asp His Ser Leu Ala
Ala Leu Ala Ala Arg Val225 230 235 240Arg Val Ser Pro Arg His Leu
Thr Arg Met Phe Arg Ala Glu Leu Asp 245 250 255Val Thr Pro Val Lys
Tyr Val Glu Leu Ile Arg Phe Asp Ile Ala Lys 260 265 270Ala Leu Leu
Asp Ser Gly His Asn Ala Thr Glu Ala Ala Ala Leu Ser 275 280 285Gly
Phe Pro Ser Tyr Glu Ser Leu Arg Arg Ala Phe Ala Arg His Leu 290 295
300Gly Leu Ser Pro Thr Arg Tyr Arg Gln Arg Phe Ala Thr Thr Val
Pro305 310 315 320Asp Ala Gly Pro Arg Pro Asp Gly Gly
32526276PRTStreptomyces parvulus Tu4055 26Met Gly Thr Val Thr Thr
Ser Asp Gly Thr Ser Ile Phe Tyr Lys Asp1 5 10 15Trp Gly Pro Arg Asp
Ala Pro Pro Ile Val Phe His His Gly Trp Pro 20 25 30Leu Thr Ala Asp
Asp Trp Asp Asn Gln Met Leu Phe Phe Leu Ser His 35 40 45Gly Tyr Arg
Val Ile Ala His Asp Arg Arg Gly His Gly Arg Ser Gly 50 55 60Gln Pro
Ser Thr Gly His Glu Met Asp Thr Tyr Ala Ala Asp Val Ala65 70 75
80Ala Leu Thr Glu Ala Leu Asp Leu Arg Asp Ala Val His Ile Gly His
85 90 95Ser Thr Gly Gly Gly Glu Val Ala Arg Tyr Val Ala Arg Ala Glu
Pro 100 105 110Gly Arg Val Ala Lys Ala Val Leu Val Gly Ala Val Pro
Pro Val Met 115 120 125Val Lys Ser Asp Ala Asn Pro Gly Gly Thr Pro
Ile Glu Val Phe Asp 130 135 140Gly Phe Arg Thr Ala Leu Ala Ala Asn
Arg Ala Gln Phe Tyr Ile Asp145 150 155 160Val Pro Ser Gly Pro Phe
Tyr Gly Phe Asn Arg Glu Gly Ala Lys Val 165 170 175Ser Gln Gly Leu
Ile Asp Asn Trp Trp Arg Gln Gly Met Ser Gly Ala 180 185 190Ala Asn
Ala His Tyr Glu Cys Ile Lys Ala Phe Ser Glu Thr Asp Phe 195 200
205Thr Glu Asp Leu Lys Ala Ile Asp Val Pro Val Leu Val Ala His Gly
210 215 220Thr Asp Asp Gln Val Val Pro Tyr Ala Asp Ser Ala Pro Leu
Ser Val225 230 235 240Lys Leu Leu Lys Asn Gly Thr Leu Lys Ser Tyr
Glu Gly Leu Pro His 245 250 255Gly Met Leu Ser Thr His Pro Glu Val
Val Asn Pro Asp Leu Leu Asp 260 265 270Phe Val Arg Ser
27527185PRTStreptomyces parvulus Tu4055 27Met Gly Val Met Ile Gly
Pro Ala Gly Arg Glu Arg Asp Glu Gly Asp1 5 10 15His Val Thr Gln Gln
Ala Pro Val Ala Thr Asp Glu Arg Arg Val Phe 20 25 30Val Asp Lys Gln
Thr Pro Gly Ala Tyr Lys Ala Phe Val Ala Ala Ala 35 40 45Glu Ser Val
Arg Glu Ala Ala Ala Ala Ala Gly Leu Asp Arg Leu Leu 50 55 60Val Glu
Leu Val Asn Ile Arg Val Ser Gln Leu Asn Ala Cys Ala Tyr65 70 75
80Cys Leu Ser Leu His Thr Arg Ala Ala Leu Arg Ala Gly Glu Thr Thr
85 90 95Gln Arg Leu Ala Val Leu Pro Ala Trp Arg Asp Thr Glu Leu Phe
Thr 100 105 110Ala Arg Glu Arg Ala Ala Leu Ala Leu Ala Glu Ala Thr
Thr Arg Pro 115 120 125Ala Asp Ala Ala Ala Gln Ser Ala Ala Tyr Ala
Gln Ala Arg Gly Val 130 135 140Leu Ser Asp Asp Glu Val Ser Ala Val
Ile Trp Val Ala Ile Ser Ile145 150 155 160Asn Ala Phe Asn Arg Val
Ser Val Leu Ser Lys His Pro Val Arg Gly 165 170 175Ala Ala Pro Ala
Pro Val Ser Pro Ala 180 18528324PRTStreptomyces parvulus Tu4055
28Met Val Ser Asn Thr Glu Thr Arg Pro Ala Glu Met Arg Cys Gly Ala1
5 10 15Leu Glu Asp Glu Val Pro Ala Ala Gly Val Glu Val Leu Thr Ala
Arg 20 25 30Asp Val Pro Leu Gly Gly Pro Arg Ala Met Thr Val Arg Arg
Thr Leu 35 40 45Pro Gln Arg Ala Arg Thr Leu Ile Gly Ala Trp Cys Phe
Ala Asp His 50 55 60Tyr Gly Pro Asp Asp Val Ala Ala Ser Gly Gly Met
Asp Val Ala Pro65 70 75 80His Pro His Ile Gly Leu Gln Thr Val Ser
Trp Leu Phe Ser Gly Glu 85 90 95Ile Glu His Arg Asp Ser Leu Gly Thr
His Ala Phe Val Arg Pro Gly 100 105 110Glu Leu Asn Leu Met Thr Gly
Gly Phe Gly Ile Ala His Ser Glu Val 115 120 125Ser Thr Pro Asp Thr
Thr Val Leu His Gly Val Gln Leu Trp Val Ala 130 135 140Leu Pro Glu
Glu His Arg Asp Thr Gly Arg Asp Phe Gln His His Ala145 150 155
160Pro Ala Pro Val Ala Phe Asp Gly Gly Thr Ala Arg Val Phe Leu Gly
165 170 175Ser Leu Ala Gly Asp Thr Ser Pro Val Ser Thr Phe Thr Pro
Leu Leu 180 185 190Gly Ala Glu Leu Thr Leu Val Pro Gly Gly Thr Ala
Thr Leu Asp Val 195 200 205Asp Pro Gly Phe Glu His Gly Val Leu Val
Asp Ser Gly Asp Val Arg 210 215 220Val Glu Gly Ala Val Val Arg Pro
Ala Glu Leu Gly Tyr Val Ala Pro225 230 235 240Gly Arg Ala Thr Leu
Thr Leu Thr Asn Glu Ser Ala Ala Pro Ala Arg 245 250 255Leu Ile Leu
Leu Gly Gly Pro Pro Phe Pro Glu Glu Ile Ile Met Trp 260 265 270Trp
Asn Phe Ile Gly Arg Ser His Asp Glu Ile Val Arg Ala Arg Glu 275 280
285Asp Trp Met Lys Gly Asp Arg Phe Gly Glu Val His Gly Tyr Asp Gly
290 295 300Ala Pro Leu Pro Ala Pro Glu Leu Pro Asn Ala Pro Leu Lys
Pro Arg305 310 315 320Arg Arg Ala Arg29126PRTStreptomyces parvulus
Tu4055 29Met Val Pro Thr Met Leu Cys Met Val Ala Val Pro Glu Ser
His Ser1 5 10 15Gly Trp Thr Phe Val Thr Asn His Ala Arg Val Leu Ala
Ala Ile Ala 20 25 30Asp Asn Pro Asn Val Arg Ile Arg Asp Ile Ala Ala
His Cys Arg Leu 35 40 45Thr Glu Arg Ala Val Gln Arg Ile Ile Ser Asp
Leu Glu Gln Asp Gly 50 55 60Tyr Leu Ser His Thr Arg Asp Gly Arg Ser
Asn Ile Tyr Arg Val Glu65 70 75 80Pro Asp Lys Val Leu Arg His Pro
Ala Glu Ala Gly Leu Thr Val Ala 85 90 95Ala Leu Leu Ser Leu Leu Val
Arg Asp Glu Thr Asp His Gly Arg Ser 100 105 110Ala Gly Pro Gly Ser
Arg Pro Ala Arg Ser Ser Ala Ala Arg 115 120 12530127PRTStreptomyces
parvulus Tu4055 30Met Ser Leu Asp Glu Ala Val Ala Gly Cys Ser Arg
His Thr Gly Arg1 5 10 15Arg Arg Leu Pro Ala Ala Glu Gln Pro Thr Gln
Ala Gln Tyr Glu Ala 20 25 30His Gly Ala Trp Val Val Ser Ala Arg Gly
Ala Tyr Asp Met Asn Ser 35 40 45Val Glu Pro Leu Ala Asp Ala Leu Lys
Asp Ala Ala Glu Lys Ser Pro 50 55 60Lys Val Val Leu Asp Ala Ser Gly
Ile Thr Phe Ala Asp Ser Thr Leu65 70 75 80Leu Ser Leu Leu Ile Leu
Thr His Gln Ala Thr Asp Phe Arg Val Ala 85 90 95Ala Pro Thr Trp Gln
Val Met Arg Leu Met Gln Leu Thr Gly Val Asp 100 105 110Ala Phe Leu
Lys Val Arg Ala Thr Val Glu Glu Ala Ala Thr Ala 115 120
1253182PRTStreptomyces parvulus Tu4055 31Met Ser Met Ile Leu Pro
Ala Glu Lys Glu Leu Arg Ala Val Leu Ala1 5 10 15Arg Phe Ala Gln Ala
Arg Ile Asp His Asp Val Arg Pro Ser Gly Cys 20 25 30Thr Ser Arg Leu
Leu Glu Asp Ala Thr Tyr Thr Leu Cys Val Met Thr 35 40 45Gly Ala Arg
Thr Ala Glu Gln Ala Leu Arg Thr Ala
Asp Glu Leu Leu 50 55 60Ala Gln Phe Ala Glu Arg Thr Ala Ala Pro Val
Glu Asp Glu Ala Leu65 70 75 80Ala Ala3270PRTStreptomyces parvulus
Tu4055 32Met Ser Asp Thr Arg Leu Arg Gln Arg Asp Glu Thr Ser Lys
Gly Pro1 5 10 15Ala Thr Glu Ile Pro Ala Pro Gln Trp Arg Asp Leu Phe
Leu Ala Pro 20 25 30Asp Trp Gly Gly Thr Asp Glu Gln Val Ile Val Ala
Glu Glu Ala Arg 35 40 45Gly Pro Glu His Phe Thr Gly Ala Arg Arg Pro
Arg Gly Gly Arg Arg 50 55 60Ser Ser Arg Arg Ala Ala65
7033172PRTStreptomyces parvulus Tu4055 33Met Arg Cys Ser His Arg
Ala Gly Gly Val Gly Ala Arg Ala Trp Leu1 5 10 15Gly Gly Asn Val Ala
Val Asp Met Gly Glu Thr Gly Leu Asp Gly Ser 20 25 30Ser Thr Gln Arg
Ala Pro Glu Gly Ala Asp Pro Arg Ala Ala Ser Val 35 40 45Thr Tyr Arg
Arg Glu Ala Leu Arg Ile Ala Asp Ala Arg His Phe Ala 50 55 60Thr Asp
Tyr Leu Thr Arg Ser Gln Arg Asp Leu Arg Ser Pro Val Pro65 70 75
80Glu Arg Ala Ser Glu Ala Val His Leu Val Val Ser Glu Leu Ile Thr
85 90 95Asn Ala Val Lys Tyr Gly Ala Asp Pro Ile Glu Leu Thr Leu Ser
Leu 100 105 110Thr Asp Asp Ala Val Thr Val Thr Val Arg Asp Gly Asp
Thr Thr Leu 115 120 125Pro Ala Pro Arg Pro Ala Asp Pro Ala Arg Val
Gly Gln His Gly Leu 130 135 140Glu Ile Val Ala Ala Leu Ser Gln Ala
Val Glu Ile Arg Pro Glu Pro145 150 155 160Ser Gly Lys Arg Ile Thr
Ala Arg Ile Ala Leu Thr 165 17034105PRTStreptomyces parvulus Tu4055
34Met Gln Ser Ser Ser Ala Ser Val Arg Gly Glu Ile Val Ile Arg Arg1
5 10 15Ala Val Ala Arg Asp Ala Lys Arg Leu Ser Arg Leu Val Arg Gly
Ser 20 25 30Arg Ala Tyr Glu Gly Pro Tyr Ala Ala Met Val Ser Asp Tyr
Arg Val 35 40 45Gly Pro Asp Tyr Ile Glu Asn His Gln Val Phe Val Ala
Ser Thr Pro 50 55 60Arg Thr Pro Arg Thr Gly Cys Ser Ala Ser Thr Arg
Cys Ser Ser Arg65 70 75 80Arg Arg Ser Trp Thr Cys Cys Ser Ser Arg
Thr Val Pro Arg Ala Ala 85 90 95Ala Ser Asp Gly Cys Leu Ser Ile Thr
100 10535638PRTStreptomyces parvulus Tu4055 35Met Ala Gln Arg Arg
Thr Pro Phe Gly Asp Arg Ala Arg Tyr Trp Phe1 5 10 15Asp Ser Thr Leu
Ala Arg Gly Ala Ala Ala Leu Val Gly Trp Met Ala 20 25 30Leu Leu Ser
Leu Ala Val Val Val Pro Ala Ser Ala Val Met Val Trp 35 40 45Thr Asp
Pro Asp Ala Pro Pro Ser Leu Ala Glu Arg Leu Ala Glu Val 50 55 60Trp
Arg Leu Thr Gly Glu Thr Leu Arg Leu Gly Gly Ala Thr Gly Thr65 70 75
80Pro Leu Arg Ala Met Leu Ser Val Leu Leu Ala Leu Val Thr Leu Leu
85 90 95Tyr Val Ser Thr Leu Val Gly Leu Ile Thr Thr Ala Leu Thr Glu
Arg 100 105 110Leu Thr Ser Leu Arg Arg Gly Arg Ser Thr Val Leu Glu
Gln Gly His 115 120 125Ala Val Val Leu Gly Trp Ser Glu Gln Val Phe
Thr Val Val Ser Glu 130 135 140Leu Val Ala Ala Asn Val Asn Gln Arg
Gly Ala Ala Val Val Val Leu145 150 155 160Ala Asp Arg Asp Lys Thr
Val Met Glu Glu Ser Leu Gly Thr Lys Val 165 170 175Gly Ser Cys Gly
Gly Thr Arg Leu Ile Cys Arg Ser Gly Pro Thr Thr 180 185 190Asp Pro
Ala Val Leu Pro Leu Thr Ser Pro Ala Thr Ala Gly Val Val 195 200
205Leu Val Leu Pro Pro Asp Glu Pro His Ala Asp Ala Glu Val Val Lys
210 215 220Thr Leu Leu Ala Leu Arg Ala Ala Leu Ala Gly Ala Lys Pro
Arg Pro225 230 235 240Pro Val Val Ala Ala Val Arg Asp Asp Arg Tyr
Arg Leu Ala Ala Cys 245 250 255Leu Ala Ala Gly Pro Asp Gly Val Val
Leu Glu Ser Asp Thr Val Thr 260 265 270Ala Arg Leu Ile Val Gln Ala
Ala Arg Arg Pro Gly Ile Ser Leu Val 275 280 285His Arg Glu Leu Leu
Asp Phe Ala Gly Asp Glu Phe Tyr Leu Ile Ser 290 295 300Glu Pro Ala
Leu Thr Gly Arg Pro Phe Gly Glu Val Leu Leu Ser Tyr305 310 315
320Ser Thr Thr Ser Val Val Gly Leu Met Arg Gly Cys Thr Pro Leu Leu
325 330 335Asn Pro Pro Pro Thr Thr Pro Val Ala Pro Asp Asp Leu Leu
Val Val 340 345 350Ile Thr Gly Asp Asp Asp Thr Ala Arg Leu Asp Asp
Cys Ala Glu Ser 355 360 365Val Glu Lys Ala Ala Val Ala Ser Arg Pro
Pro Thr Pro Ala Pro Ala 370 375 380Glu Arg Ile Leu Leu Leu Gly Trp
Asn Arg Arg Ala Pro Leu Val Val385 390 395 400Asp Gln Leu His Arg
Arg Ala Arg Pro Gly Ser Ala Val Asp Val Val 405 410 415Ala Glu Pro
Gly Glu Ala Thr Ile Arg Glu Ile Ser Glu Ala Glu Ala 420 425 430Asp
Ser Gly Asn Gly Glu Asn Gly Gly Asn Gly Leu Ser Leu Ala Leu 435 440
445His His Gly Asp Ile Thr Arg Pro Glu Thr Leu Arg Arg Leu Asp Val
450 455 460His Ser Tyr Asp Ser Val Ile Val Leu Gly Arg Asp Pro Ala
Pro Gly465 470 475 480Gln Pro Pro Asp Asp Pro Asp Asn Arg Thr Leu
Val Thr Leu Leu Leu 485 490 495Leu Arg Gln Leu Glu Glu Ala Thr Gly
Arg Glu Leu Pro Val Val Thr 500 505 510Glu Leu Ile Asp Asp Arg Asn
Arg Ala Leu Ala Pro Ile Gly Pro Gly 515 520 525Ala Asp Val Ile Ile
Ser Gly Lys Leu Ile Gly Leu Leu Met Ser Gln 530 535 540Ile Ser Gln
Asn Arg His Leu Ala Ala Val Phe Glu Glu Leu Phe Ser545 550 555
560Ala Glu Gly Ala Gly Val Arg Leu Arg Pro Ala Thr Asp Tyr Leu Leu
565 570 575Pro Gly Ser Thr Thr Ser Phe Ala Thr Val Val Ala Ala Ala
Arg Arg 580 585 590Arg Gly Glu Cys Ala Ile Gly Tyr Arg Asp His Ala
Asp Ala Ser Thr 595 600 605Arg Pro His Tyr Gly Val Arg Ile Asn Pro
Pro Lys Arg Glu Arg Arg 610 615 620Arg Trp Thr Ala Glu Asp Glu Val
Val Val Ile Gly Thr Asp625 630 63536165PRTStreptomyces parvulus
Tu4055 36Met Pro Ser Thr Asp Val Val Glu Leu Ile Leu Arg Asp His
Arg Arg1 5 10 15Met Glu Glu Leu Phe Arg Thr Leu Arg Asn Val Glu Ala
Asp Arg Ala 20 25 30Ala Ala Leu Thr Glu Phe Ala Asp Leu Leu Ile Ala
His Ala Ser Ala 35 40 45Glu Glu Asp Glu Val Tyr Pro Ala Leu Arg Arg
Tyr Lys Asn Val Glu 50 55 60Gly Glu Asp Val Asp His Ser Val His Glu
His His Glu Ala Asn Glu65 70 75 80Ala Leu Leu Ala Leu Leu Glu Val
Glu Asp Thr Ala Ser Asp Glu Trp 85 90 95Asp Asp Lys Leu Glu Glu Leu
Val Thr Ala Val Asn His His Ala Asp 100 105 110Glu Glu Glu Arg Thr
Leu Leu Asn Asp Ala Arg Glu Asn Val Ala Asp 115 120 125Asp Arg Arg
Arg Glu Leu Gly Gln Lys Phe Gln Glu Ala Arg Ser Arg 130 135 140Tyr
Leu Glu Thr Gly Cys Gly Ser Val Glu Asn Val Arg Lys Leu Val145 150
155 160Ala Ala Ala Asp Asp 16537787PRTStreptomyces parvulus Tu4055
37Met Ala Arg Arg Leu Thr Glu Gly Arg Thr Arg Arg Glu Lys Gly Glu1
5 10 15His Met Gln Lys Pro His Gly Glu Glu Ser Glu Thr Ser Leu Ser
Val 20 25 30Thr Pro Pro Lys Lys Trp Ala Ala Gly Val Pro Ala Val Val
His Ala 35 40 45Leu Glu Tyr Ser Leu Glu Gln Thr Ser Pro Arg Arg Thr
Gly Val Asp 50 55 60Leu Leu Thr Met Asn Gln Val Gly Gly Ile Asp Cys
Pro Gly Cys Ala65 70 75 80Trp Ala Asp Pro Ala Pro Gly Arg Arg His
Arg Asn Glu Tyr Cys Glu 85 90 95Asn Gly Ala Lys His Ile Asn Asp Glu
Ala Thr Thr Arg Arg Val Thr 100 105 110Ala Asp Phe Phe Arg Glu His
Ser Val Ala Asp Leu Ala Gly Arg Ser 115 120 125Asp Met Trp Leu Asn
Gln Gln Gly Arg Leu Thr Glu Pro Met Ile Lys 130 135 140Arg Pro Gly
Ser Ala His Tyr Glu Pro Ile Gly Trp Asn Asp Ala Leu145 150 155
160Gly Val Leu Ala Glu Glu Leu Lys Ser Leu Ala Ser Pro Asp Glu Ala
165 170 175Val Phe Tyr Thr Ser Gly Arg Ala Ser Asn Glu Ala Ala Phe
Val Leu 180 185 190Gln Leu Phe Ala Arg Ala Phe Gly Thr Asn Asn Leu
Pro Asp Cys Ser 195 200 205Asn Met Cys His Glu Ser Ser Gly Phe Ala
Leu Ser Glu Thr Leu Gly 210 215 220Thr Gly Lys Gly Thr Val Gly Leu
Asp Asp Leu His His Ala Asp Leu225 230 235 240Ile Phe Leu Val Gly
Gln Asn Pro Gly Ser Asn His Pro Arg Gln Leu 245 250 255Ser Ala Leu
Glu Glu Ala Lys Arg Asn Gly Ala Arg Ile Val Ala Val 260 265 270Asn
Pro Leu Pro Glu Ala Gly Leu Arg Arg Phe Lys Asn Pro Gln Gln 275 280
285Pro Arg Gly Val Val Gly Arg Gly Thr Arg Ile Ala Asp Arg Phe Leu
290 295 300His Ile Lys Pro Gly Gly Asp Leu Ala Leu Phe Gln Ala Leu
Asn Arg305 310 315 320Leu Leu Leu Glu Ala Glu Asp Ala Arg Pro Gly
Thr Val Leu Asp His 325 330 335Asp Phe Ile Asp Ala His Thr Thr Gly
Phe Glu Glu Phe Ala Arg His 340 345 350Ala Arg Thr Val Asp Trp Asp
Asp Val Arg Ala Ala Thr Gly Leu Thr 355 360 365Arg Glu Glu Ile Glu
Lys Val Arg Asp Glu Val Leu Asp Ser Glu Arg 370 375 380Val Val Val
Cys Trp Ala Met Gly Ile Thr Gln His Lys His Gly Val385 390 395
400Pro Thr Val Arg Glu Ile Val Asn Phe Leu Met Leu Arg Gly Asn Leu
405 410 415Gly Arg Ala Gly Thr Gly Ala Cys Pro Val Arg Gly His Ser
Asn Val 420 425 430Gln Gly Asp Arg Thr Met Gly Ile Trp Glu Gln Met
Pro Asp Thr Phe 435 440 445Leu Asp Ala Leu Arg Asp Glu Phe Gly Phe
Glu Pro Pro Arg Ala His 450 455 460Gly Leu Asp Ser Val Asn Ser Ile
Lys Ala Met Arg Glu Gly Arg Val465 470 475 480Lys Val Phe Leu Ala
Leu Ala Gly Asn Phe Val Arg Ala Ala Pro Asp 485 490 495Ser Glu Val
Thr Glu Glu Ala Met Arg Ser Cys Arg Leu Thr Ala His 500 505 510Ile
Ser Thr Lys Leu Asn Arg Ser His Thr Val Cys Gly Asp Thr Ala 515 520
525Leu Ile Leu Pro Thr Leu Gly Arg Thr Glu Arg Asp Val Gln Ala Asp
530 535 540Gly Glu Gln Phe Val Thr Val Glu Asn Ser Met Ser Glu Val
His Thr545 550 555 560Ser Arg Gly Arg Leu Ala Pro Ala Ser Pro Met
Leu Leu Ser Glu Ile 565 570 575Ala Ile Leu Cys Arg Leu Ala Arg Leu
Thr Leu Asp Gly Arg Val Glu 580 585 590Ile Pro Trp Glu Thr Phe Glu
Gly Asp Tyr His Thr Ile Arg Asp Arg 595 600 605Ile Ala Arg Ile Val
Pro Gly Phe His Asp Phe Asn Ala Arg Val Thr 610 615 620Arg Pro Gly
Gly Phe Gln Leu Pro Asn Pro Val Asn Glu Gly Val Phe625 630 635
640Asn Thr Glu Val Gly Lys Ala Leu Phe Thr Arg Asn Glu Ser Val Val
645 650 655Pro Arg Ala Pro Glu Gly His Leu Leu Leu Gln Thr Leu Arg
Ser His 660 665 670Asp Gln Trp Asn Thr Val Pro Tyr Thr Asp Asn Asp
Arg Tyr Arg Gly 675 680 685Ile His Gly Ser Arg His Val Val Leu Val
Asn Pro Ala Asp Leu Ser 690 695 700Glu Leu Gly Leu Ala Gln Gly Asp
Arg Val Asp Leu Val Ser Val Trp705 710 715 720Ala Asp Gly Thr Glu
Arg Arg Ala Glu Asn Phe Gln Val Val Pro Tyr 725 730 735Pro Ala Ala
Lys Gly Ser Ala Ala Ala Tyr Tyr Pro Glu Thr Asn Val 740 745 750Leu
Val Pro Leu Asp Ser Val Ala Asp Ile Ser Asn Gln Pro Thr Ser 755 760
765Lys Gly Ile Val Val Arg Leu Glu Pro Val Pro Asp Arg Thr Gln Pro
770 775 780Ser Pro Ala78538206PRTStreptomyces parvulus Tu4055 38Met
Ala Glu Gln His Glu Gly Pro Arg Ala Val Pro Asp Thr Pro Gly1 5 10
15Ala Arg Thr Ser Gly Asp Arg Ser Thr Gly Arg Arg Pro Leu Arg Glu
20 25 30Arg His Val Asp Gln Thr Val Glu Val Ala Val Pro Val Arg Thr
Ala 35 40 45Tyr Asn Gln Trp Thr Gln Phe Lys Ser Phe Pro Arg Phe Ser
Ala Val 50 55 60Val Arg Asp Val Glu Gln Val Arg Pro Thr Val Thr Ala
Trp Thr Leu65 70 75 80Gly Tyr Gly Pro Leu Arg Arg Arg Phe Ala Val
Glu Ile Leu Glu Gln 85 90 95Asp Pro Asp Ala Tyr Leu Ala Trp Arg Gly
Leu Glu Gln Arg Pro Trp 100 105 110His Arg Gly Glu Val Glu Phe Arg
Pro Thr Glu Ser Gly Gly Thr Ala 115 120 125Ile Thr Val Arg Val Leu
Leu Glu Pro Arg Gly Ala Ala Arg Ile Leu 130 135 140Thr Arg Ser Ser
Arg Ala Val Arg Leu Thr Thr Arg Leu Val His Gly145 150 155 160Glu
Leu Thr Arg Phe Lys Arg Phe Met Glu Gly Leu Gly Gln Glu Gly 165 170
175Gly Ala Trp Arg Gly Thr Ile Arg Asn Gly Arg Val Gln His Asp Arg
180 185 190Pro Glu Pro Pro Arg Ser Arg Val Ala Arg Trp Pro Val Gly
195 200 20539251PRTStreptomyces parvulus Tu4055 39Met Leu Leu Leu
Ile Ser Pro Asp Gly Val Glu Glu Ala Leu Asp Cys1 5 10 15Ala Lys Ala
Ala Glu His Leu Asp Ile Val Asp Val Lys Lys Pro Asp 20 25 30Glu Gly
Ser Leu Gly Ala Asn Phe Pro Trp Val Ile Arg Glu Ile Arg 35 40 45Asp
Ala Val Pro Ala Asp Lys Pro Val Ser Ala Thr Val Gly Asp Val 50 55
60Pro Tyr Lys Pro Gly Thr Val Ala Gln Ala Ala Leu Gly Ala Val Val65
70 75 80Ser Gly Ala Thr Tyr Ile Lys Val Gly Leu Tyr Gly Cys Thr Thr
Pro 85 90 95Glu Gln Gly Ile Ala Val Met Arg Ala Val Val Arg Ala Val
Lys Asp 100 105 110His Arg Pro Glu Ala Leu Val Val Ala Ser Gly Tyr
Ala Asp Ala His 115 120 125Arg Ile Gly Cys Val Asn Pro Leu Ala Leu
Pro Asp Ile Ala Ala Arg 130 135 140Ser Gly Ala Asp Ala Ala Met Leu
Asp Thr Ala Val Lys Asp Gly Thr145 150 155 160Arg Leu Phe Asp His
Val Pro Pro Asp Thr Cys Ala Glu Phe Val Arg 165 170 175Arg Ala His
Ala Ala Gly Leu Leu Ala Ala Leu Ala Gly Ser Val Arg 180 185 190Gln
Thr Asp Leu Gly Arg Leu Thr Arg Ile Gly Thr Asp Ile Val Gly 195 200
205Val Arg Gly Ala Val Cys Glu Gly Gly Asp Arg Asn Ala Gly Arg Ile
210 215 220Arg Pro His Leu Val Ala Ala Phe Arg Ser Glu Met Asp Arg
His Ala225 230 235 240Arg Glu His Arg Ala Gly Val Thr Thr Ala Ser
245 25040467PRTStreptomyces parvulus Tu4055 40Met Pro Thr Pro Ala
Pro Asp His Ala Pro Ala Gln Arg Ala Ala Pro1 5
10 15Leu Ala Val Val Asp Pro Ala Thr Gly Thr Val Phe Asp Glu Ala
Pro 20 25 30Asp Gln Gly Pro Asp Val Leu Asp Ala Val Val Asp Arg Ala
Arg Arg 35 40 45Ala Trp His Gly Trp Arg Ala Asp Pro Asp Ala Arg Thr
Thr Ala Leu 50 55 60Arg Ser Ala Ala Asp Ala Val Glu Ala Ala Gly Asp
Asp Leu Ala Arg65 70 75 80Leu Leu Thr Arg Glu Gln Gly Lys Pro Leu
Ala Glu Ser His Ala Glu 85 90 95Val Ala Arg Thr Ala Ala Arg Leu Arg
Tyr Phe Ala Gly Leu Ala Pro 100 105 110Arg Thr Arg Arg Ile Thr Asp
Gly Arg Pro Val Arg Ser Glu Val Arg 115 120 125Trp Arg Pro Leu Gly
Pro Val Ala Ala Ile Val Pro Trp Asn Phe Pro 130 135 140Leu Gln Leu
Ala Ser Ala Lys Phe Ala Pro Ala Leu Ala Ala Gly Asn145 150 155
160Thr Met Val Leu Lys Pro Ser Pro Phe Thr Pro Leu Ala Thr Arg Leu
165 170 175Leu Gly Ser Val Leu Ala Thr Ala Leu Pro Glu Asp Val Leu
Thr Val 180 185 190Val Thr Gly Arg Glu Pro Leu Gly Ala Arg Leu Ala
Ala His Pro Gly 195 200 205Ile Arg His Val Thr Phe Thr Gly Ser Val
Pro Thr Gly Arg Ala Val 210 215 220Ala Arg Ala Ala Ala Ala Ser Leu
Ala Arg Val Thr Leu Glu Leu Gly225 230 235 240Gly Asn Asp Ala Ala
Val Leu Leu Asp Asp Val Glu Val Asp Arg Ile 245 250 255Ala Asp Arg
Leu Phe Trp Ala Ala Phe Arg Asn Cys Gly Gln Val Cys 260 265 270Met
Ala Val Lys Arg Val Tyr Ala Pro Ala Arg Leu His Ala Gln Val 275 280
285Val Glu Ala Leu Thr Glu Arg Ala Lys Ala Val Ala Val Gly Pro Gly
290 295 300Leu Asp Pro Arg Thr Arg Leu Gly Pro Val Ala Asn Ala Pro
Gln Leu305 310 315 320Ala Arg Val Glu Gln Ile Thr Arg Arg Ala Leu
Ala Asp Gly Ala Arg 325 330 335Ala Ala Ala Gly Gly His Arg Leu Asp
Gly Pro Gly Cys Phe Phe Ala 340 345 350Pro Thr Ile Leu Thr Asp Val
Pro Pro Asp Ser Pro Val Val Thr Glu 355 360 365Glu Gln Phe Gly Pro
Val Leu Pro Val Leu Pro Tyr Arg Ser Leu Asp 370 375 380Glu Ala Val
Asp Ala Ala Asn Gly Thr Gly Phe Gly Leu Gly Gly Ser385 390 395
400Val Trp Gly Thr Asp Leu Asp Arg Ala Glu Ala Val Ala Asp Arg Leu
405 410 415Glu Cys Gly Thr Ala Trp Val Asn His His Ala Glu Leu Ser
Leu Ala 420 425 430Gln Pro Phe Ala Gly Asp Lys Asp Ser Gly Val Gly
Val Ala Gly Gly 435 440 445Pro Trp Gly Leu Tyr Gly Asn Leu Arg Pro
Phe Val Val His Arg Pro 450 455 460Arg Gly
Glu46541368PRTStreptomyces parvulus Tu4055 41Met Ser Phe Arg Ala
Ala Val Leu Arg Gly Tyr Glu Asp Pro Phe Thr1 5 10 15Val Glu Glu Val
Thr Leu Gly Thr Glu Pro Gly Ala Gly Glu Ile Leu 20 25 30Val Glu Ile
Ala Gly Cys Gly Met Cys Arg Thr Asp Leu Ala Val Arg 35 40 45Arg Ser
Ala Gly Arg Ser Pro Leu Pro Ala Val Leu Gly His Glu Gly 50 55 60Ser
Gly Val Val Val Arg Thr Gly Gly Gly Pro Asp Thr Ala Ile Gly65 70 75
80Val Gly Asp His Val Val Leu Ser Phe Asp Ser Cys Gly His Cys Arg
85 90 95Asn Cys Arg Ala Ala Ala Pro Ala Tyr Cys Asp Ser Phe Ala Ser
Leu 100 105 110Asn Leu Phe Gly Gly Arg Ala Glu Asp Pro Pro Arg Leu
Thr Asp Gly 115 120 125Ser Gly Ala Ala Leu Ala Pro Arg Trp Phe Gly
Gln Ser Ala Phe Ala 130 135 140Glu Tyr Ala Leu Val Ser Ala Arg Asn
Ala Val Arg Val Asp Pro Ala145 150 155 160Leu Pro Val Glu Leu Leu
Gly Pro Leu Gly Cys Gly Phe Leu Thr Gly 165 170 175Ala Gly Ala Val
Leu Asn Thr Phe Ala Ala Gly Pro Gly Asp Thr Leu 180 185 190Val Val
Leu Gly Ala Gly Ala Val Gly Leu Ala Ala Val Met Ala Ala 195 200
205Thr Ala Ala Gly Ala Pro Ser Val Ala Val Asp Arg Asn Pro Arg Arg
210 215 220Leu Glu Leu Ala Glu Arg Phe Gly Ala Val Pro Leu Pro Ala
Ala Thr225 230 235 240Ala Gly Leu Ala Glu Arg Ile Arg Arg Leu Thr
Asp Gly Gly Ala Arg 245 250 255Tyr Ala Leu Asp Thr Thr Ala Ser Val
Pro Leu Ile Asn Glu Ala Leu 260 265 270Arg Ala Leu Arg Pro Thr Gly
Ala Leu Gly Leu Val Ala Arg Leu His 275 280 285Thr Ala Leu Pro Leu
Glu Pro Gly Thr Leu Asp Arg Gly Arg Ser Ile 290 295 300Arg His Val
Cys Glu Gly Asp Ala Val Pro Gly Leu Leu Ile Pro Gln305 310 315
320Leu Thr Arg Leu Trp Gln Ala Gly Arg Phe Pro Phe Asp Gln Leu Val
325 330 335Arg Thr Tyr Pro Leu Ala Asp Ile Asn Glu Ala Glu Arg Asp
Cys Asp 340 345 350Ala Gly Leu Val Val Lys Pro Val Leu Leu Pro Pro
Ala Arg Ser Arg 355 360 36542301PRTStreptomyces parvulus Tu4055
42Met Thr Gly Thr Ala Pro Gln Tyr Thr Asp Val Glu Gly Val Asn Gly1
5 10 15
Gly Val Gly Leu Thr Ala Phe Leu Val Ala Ala Ala Arg Ala Ile Glu
            20                  25                  30
Thr His Arg Asp Asp Ser Leu Ala Gln Asp Val Tyr Ala Glu His Phe
        35                  40                  45
Val Arg Ala Ala Pro Ala Cys Ala Asp Trp Pro Val Arg Ile Glu Gln
    50                  55                  60
Val Pro Asp Gly Asp Gly Asn Pro Leu Trp Gly Arg Phe Ala Arg Tyr
65                  70                  75                  80
Phe Gly Leu Arg Thr Arg Ala Leu Asp Asp Phe Leu Leu Arg Ser Val
                85                  90                  95
Arg Thr Gly Pro Arg Gln Val Val Leu Leu Gly Ala Gly Leu Asp Thr
            100                 105                 110
Arg Ala Phe Arg Leu Asp Trp Pro Ser Gln Cys Ala Val Phe Glu Ile
        115                 120                 125
Asp Arg Thr Gly Val Leu Ala Phe Lys Gln Gln Val Leu Thr Asp Leu
    130                 135                 140
Ala Ala Thr Pro Arg Val Glu Arg Val Pro Val Pro Val Asp Leu Arg
145                 150                 155                 160
Ala Asp Trp Ala Gly Ala Leu Thr Ala Ala Gly Phe Asp Pro Ala Ala
                165                 170                 175
Pro Ser Val Trp Leu Ala Glu Gly Leu Leu Phe Tyr Leu Pro Gly Pro
            180                 185                 190
Ala Glu Ser Leu Leu Val Asp Thr Val Asp Arg Leu Thr Thr Asp Gly
        195                 200                 205
Ser Ala Leu Ala Phe Glu Ala Lys Leu Glu Lys Asp Leu Leu Ala Tyr
    210                 215                 220
Arg Asp Ser Ala Ile Tyr Thr Ala Thr Arg Glu Gln Ile Gly Ile Asp
225                 230                 235                 240
Leu Leu Arg Leu Phe Asp Lys Gly Pro Arg Pro Asp Ser Ala Gly Glu
                245                 250                 255
Leu Ala Ala Arg Gly Trp Ser Thr Ser Met His Thr Pro Phe Val Phe
            260                 265                 270
Thr His Arg Tyr Gly Arg Gly Pro Leu Pro Glu Pro Asn Asp Ala Leu
        275                 280                 285
Glu Gly Asn Arg Trp Val Phe Ala Arg Lys Pro Gly Pro
    290                 295                 300

<210> 43 <211> 179 <212> PRT <213> Streptomyces parvulus Tu4055
<400> 43
Met Cys Met Arg Asp Glu Ala Ala Lys Arg Val Glu Leu Val Phe Ser
1               5                   10                  15
Leu Phe Asp Ala Asn Gly Asn Gly Val Ile Asp Ser Asp Asp Phe Asp
            20                  25                  30
Leu Met Thr Asp Arg Val Val Ala Ala Ala Ala Gly Ser Asp Asp Ser
        35                  40                  45
Ala Lys Ala Ala Val Arg Ala Ala Phe Arg Arg Tyr Trp Thr Thr Leu
    50                  55                  60
Ala Thr Glu Leu Asp Ala Asp Gly Asp Gly Val Ile Thr Val Glu Glu
65                  70                  75                  80
Phe Arg Pro Phe Val Leu Asp Pro Glu Arg Phe Gly Pro Thr Ile Ala
                85                  90                  95
Glu Phe Ala Arg Ala Leu Ser Ala Leu Gly Asp Pro Asp Gly Asp Gly
            100                 105                 110
Leu Ile Glu Arg Pro Leu Phe Val Ala Leu Met Lys Ala Ile Gly Phe
        115                 120                 125
Glu Glu Ala Asn Ile His Ala Leu Phe Asp Ala Phe Ala Pro Asp Ala
    130                 135                 140
Ala Asp Arg Ile Thr Val Ala Ala Trp Ala Ser Gly Ile Glu Asp Tyr
145                 150                 155                 160
Tyr Ala Pro Asp Leu Ala Gly Ile Pro Gly Asp Arg Leu Val Ala Ala
                165                 170                 175
Arg Thr Val

<210> 44 <211> 33 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM410
<400> 44 aaaatgcatt cggcctgaac ggccccgctg tca 33

<210> 45 <211> 33 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM411
<400> 45 aaatggccag cgaacaccaa caccacacca cca 33

<210> 46 <211> 32 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM412
<400> 46 aaagtcctag gcggcggccg gcgggtcgac ct 32

<210> 47 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM413
<400> 47 tttagatctc gcgacgtcgc acgcgccgaa cgtca 35

<210> 48 <211> 34 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM414
<400> 48 aaactgcaga gtcgaacatc ggtcacacgc aggc 34

<210> 49 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM415
<400> 49 aaaatgcatg atccacatcg atacgacgcg cccga 35

<210> 50 <211> 36 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM416
<400> 50 taaatgcatt ccattcggtg caggtggagt tgatcc 36

<210> 51 <211> 36 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM417
<400> 51 ataggatccc ctccgggtgc tccagaccgg ccaccc 36

<210> 52 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM368
<400> 52 tttcctgcag gccatcccca cgatcgcgat cggct 35

<210> 53 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM369
<400> 53 tttcatatga caggcagtgc tgtttcggcc ccatt 35

<210> 54 <211> 36 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM370
<400> 54 tttcatatgg cggatgccgt acgtgccgcc ggcgct 36

<210> 55 <211> 32 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM371
<400> 55 tttcatatgc cccaggcgat cgtccgcacc ac 32

<210> 56 <211> 41 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM372
<400> 56 tttcatatgg tctcggcccc ccacacaaga gccctccggg c 41

<210> 57 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1819A
<400> 57 gtcatgcatg cggcgggctc 20

<210> 58 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1819B
<400> 58 ggtctagaac ggccgaactt 20

<210> 59 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1819C
<400> 59 gttctagaac ctcggtcggc 20

<210> 60 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1819D
<400> 60 ctggatccca cgctgctgcg 20

<210> 61 <211> 19 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BLDA
<400> 61 ggagacttac gggggatgc 19

<210> 62 <211> 19 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BLDB
<400> 62 ctccagcagc gaccagaac 19

<210> 63 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B19A
<400> 63 cccatgcatc accgacatac 20

<210> 64 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B19B
<400> 64 gcgatatccc gaagaacgcg 20

<210> 65 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1920A
<400> 65 gccaagcttc ctcgacgcgc 20

<210> 66 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1920B
<400> 66 cactagtgcc tcacccagtt 20

<210> 67 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1920C
<400> 67 cactagtgac ggccgaagcg 20

<210> 68 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B1920D
<400> 68 tcggatccgt cagaccgttc 20

<210> 69 <211> 36 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM384
<400> 69 aacctgcagg taccccggtg gggtgcggtc gcccga 36

<210> 70 <211> 24 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM385
<400> 70 cgccgcacgc gtcgaagcca acga 24

<210> 71 <211> 24 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM386
<400> 71 tgtgggctgg tcgttggctt cgac 24

<210> 72 <211> 34 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM387
<400> 72 ggtgcctgca gcgtgagttc ctcgacggat ccga 34

<210> 73 <211> 26 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM388
<400> 73 gaggaactca ccctgcaggc accgct 26

<210> 74 <211> 26 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM395
<400> 74 cgaacgtcca gccctcgggc atgcgt 26

<210> 75 <211> 28 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM396
<400> 75 tggcacgcat gcccgagggc tggacgtt 28

<210> 76 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM397
<400> 76 tttcctgcag gccatgccga cgatcgcgac aggct 35

<210> 77 <211> 36 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM398
<400> 77 aaacatatgg tcctggcgct gcgcaacggg gaactg 36

<210> 78 <211> 35 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM399
<400> 78 tttcctgcag gcgatgccga cgatggcgat gggct 35

<210> 79 <211> 43 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM400
<400> 79 aaacctgcag gttccccggc gacgtggact cgccggagtc gtt 43

<210> 80 <211> 41 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo CM401
<400> 80 ttttctagag cgacgtcgca ggcggcgatg gtcacgcccg t 41

<210> 81 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B25A
<400> 81 ttctgcagcc gcggccttcg 20

<210> 82 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B25B
<400> 82 agaattcgcc ggcgccgctg 20

<210> 83 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B7T1
<400> 83 ggctgcagac gcggctgaag 20

<210> 84 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B7T2
<400> 84 ccggatccca gagccacgtc 20

<210> 85 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BP4501
<400> 85 cgtatgcatg gcgccatgga 20

<210> 86 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BP4502
<400> 86 agccaattgg tgcactccag 20

<210> 87 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BNHT1
<400> 87 gtcatgcatc agcgcacccg 20

<210> 88 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BNHT2
<400> 88 gtgcaattgc cctggtagtc 20

<210> 89 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BTRNAS1
<400> 89 tgtctagact cgcgcgaaca 20

<210> 90 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BTRNAS2
<400> 90 tgaattccga agggggtggt 20

<210> 91 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B5B
<400> 91 aactagtccg cagtggaccg 20

<210> 92 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B5A
<400> 92 tcgatatcct caccgcccgt 20

<210> 93 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B6B
<400> 93 aactagtgtg gcagacggtc 20

<210> 94 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B5A
<400> 94 tcgatatcct caccgcccgt 20

<210> 95 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B6T1
<400> 95 cggatgcatc accggcacgg 20

<210> 96 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B6T2
<400> 96 tgggatccgc ggggcggtac 20

<210> 97 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BBB
<400> 97 aactagtgcg atcccgggga 20

<210> 98 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BBA
<400> 98 cgtcgatatc ctccaggggc 20

<210> 99 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BBT1
<400> 99 tactgcagca cacccggtgc 20

<210> 100 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BBT2
<400> 100 tgggatccgc tgtgtcatat 20

<210> 101 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BCB
<400> 101 cactagtcct cgccgggcac 20

<210> 102 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BCA
<400> 102 gaggatcccg gtcagcggca 20

<210> 103 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BCT1
<400> 103 gcctgcagcg acctcgccgg 20

<210> 104 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo BCT2
<400> 104 cgggatcccg tggcgtggtc 20

<210> 105 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B23A
<400> 105 atctgcagcg gcatcggtgt 20

<210> 106 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B23B
<400> 106 agaattctcc actgcggtcg 20

<210> 107 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B9A
<400> 107 acctgcaggc cgggctcatc 20

<210> 108 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B9B
<400> 108 agaattcggg cgagccgccg 20

<210> 109 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B231
<400> 109 atcaagcttc gtgtccatgg 20

<210> 110 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B232
<400> 110 gtcatgcatc aggcgttcgg 20

<210> 111 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B251
<400> 111 cttctagatg aacccctcca 20

<210> 112 <211> 20 <212> DNA <213> Artificial Sequence <223> Description of Artificial Sequence oligo B252
<400> 112 gggcaattgc gcggcagctt 20

<210> 113 <211> 90 <212> PRT <213> Streptomyces parvulus Tu4055
<400> 113
Met Leu Gly Phe Tyr Ala Leu Leu Leu Ala Pro Ala Glu Leu Asp Leu
1               5                   10                  15
Leu Phe Val Gln Asp Gly Thr Gln Gly Arg Gly Ile Gly Arg Leu Leu
            20                  25                  30
Val Asp His Met Lys Arg Arg Ala Arg Ala Ala Gly Leu Asp Arg Val
        35                  40                  45
Arg Val Val Ser His Pro Pro Ala Glu Gly Phe Tyr Arg Ala Val Gly
    50                  55                  60
Ala Leu Pro Thr Gly Thr Ala Arg Ala Asn Pro Pro Ala Val Ala Trp
65                  70                  75                  80
Asp Arg Pro Val Leu Glu Phe Leu Ile Pro
                85                  90
* * * * *