U.S. patent application number 13/794610 was filed with the patent office on 2013-03-11 and published on 2013-09-26 for biofuel production.
This patent application is currently assigned to BIO ARCHITECTURE LAB, INC. The applicant listed for this patent is BIO ARCHITECTURE LAB, INC. Invention is credited to Yuki KASHIYAMA, Yasuo YOSHIKUNI.
Application Number | 20130252312 13/794610 |
Document ID | / |
Family ID | 40227859 |
Filed Date | 2013-03-11 |
United States Patent Application | 20130252312 |
Kind Code | A1 |
YOSHIKUNI; Yasuo; et al. | September 26, 2013 |
BIOFUEL PRODUCTION
Abstract
Methods, enzymes, recombinant microorganisms, and microbial
systems are provided for converting polysaccharides, such as those
derived from biomass, into suitable monosaccharides or
oligosaccharides, as well as for converting suitable
monosaccharides or oligosaccharides into commodity chemicals, such
as biofuels. Commodity chemicals produced by the methods described
herein are also provided. Commodity chemical enriched,
refinery-produced petroleum products are also provided, as well as
methods for producing the same.
Inventors: | YOSHIKUNI; Yasuo (Albany, CA); KASHIYAMA; Yuki (Berkeley, CA) |
Applicant: |
Name | City | State | Country | Type |
BIO ARCHITECTURE LAB, INC. | Berkeley | CA | US | |
Assignee: | BIO ARCHITECTURE LAB, INC., Berkeley, CA |
Family ID: | 40227859 |
Appl. No.: | 13/794610 |
Filed: | March 11, 2013 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
12245537 | Oct 3, 2008 | |
13794610 | | |
60977628 | Oct 4, 2007 | |
Current U.S. Class: | 435/252.33; 435/254.2 |
Current CPC Class: | C12P 7/38 20130101; Y02E 50/17 20130101; C12P 7/18 20130101; Y02E 50/10 20130101; Y02T 50/678 20130101; C12P 7/04 20130101; C12P 5/02 20130101; C12P 17/10 20130101; C12P 7/02 20130101; C12N 15/70 20130101; C12P 7/26 20130101; C12P 7/06 20130101; C12P 5/026 20130101; C12P 7/22 20130101; C12P 7/16 20130101 |
Class at Publication: | 435/252.33; 435/254.2 |
International Class: | C12N 15/70 20060101 C12N015/70 |
Claims
1. A recombinant microorganism for production of a commodity
chemical, comprising recombinant DNA encoding a transporter,
wherein the transporter transports an alginate-derived
polysaccharide into the recombinant microorganism, and wherein said
polysaccharide is converted to said commodity chemical in said
microorganism.
2. The microorganism of claim 1 wherein the transporter is a
monosaccharide transporter, disaccharide transporter, trisaccharide
transporter, oligosaccharide transporter, or polysaccharide
transporter.
3. The microorganism of claim 1 wherein the transporter is a
symporter, ABC transporter, or permease.
4. The microorganism of claim 1 wherein the transporter is a
superchannel or outer membrane porin.
5. The microorganism of claim 1 wherein the transporter comprises
SEQ ID NO: 8, SEQ ID NO: 24, or SEQ ID NO: 38.
6. The microorganism of claim 1 wherein the alginate-derived
polysaccharide is selected from the group consisting of a
dialginate, trialginate, pentalginate, hexylginate, heptalginate,
octalginate, nonalginate, decalginate, undecalginate,
dodecalginate, and polyalginate.
7. The microorganism of claim 1 wherein the alginate-derived
polysaccharide is a saturated polysaccharide.
8. The microorganism of claim 1 wherein the alginate-derived
polysaccharide is an unsaturated polysaccharide.
9. The microorganism of claim 1 wherein the alginate-derived
polysaccharide is selected from the group consisting of
β-D-mannuronate, α-L-guluronate,
4-deoxy-L-erythro-5-hexoseulose uronic acid,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-D-mannuronate or L-guluronate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-dialginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-trialginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-tetralginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-pentalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-hexylginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-heptalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-octalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-nonalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-undecalginate, and
4-(4-deoxy-beta-D-mann-4-enuronosyl)-dodecalginate.
10. The microorganism of claim 1 wherein the microorganism is
yeast.
11. The microorganism of claim 1 wherein the microorganism is E.
coli.
12. A system for the production of a commodity chemical, comprising
a) an alginate-derived polysaccharide; and b) a recombinant
microorganism comprising recombinant DNA encoding a transporter,
wherein the transporter transports an alginate-derived
polysaccharide into the recombinant microorganism and wherein said
polysaccharide is converted to said commodity chemical in said
microorganism.
13. The system of claim 12 wherein the transporter is a symporter,
ABC transporter, or permease.
14. The system of claim 12 wherein the transporter is a
superchannel or outer membrane porin.
15. The system of claim 12 wherein the transporter comprises SEQ ID
NO: 8, SEQ ID NO: 24, or SEQ ID NO: 38.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application is a continuation of copending U.S. patent
application Ser. No. 12/245,537, with a filing date of Oct. 3,
2008, which claims the benefit under 35 U.S.C. § 119(e) of U.S.
Provisional Patent Application No. 60/977,628 filed Oct. 4, 2007,
all of which are incorporated herein by reference in their
entirety.
SUBMISSION OF SEQUENCE LISTING AS ASCII TEXT FILE
[0002] The content of the following submission on ASCII text file
is incorporated herein by reference in its entirety: a computer
readable form (CRF) of the Sequence Listing (file name:
690212000607SeqList.txt, date recorded: Mar. 11, 2013, size: 519
KB).
TECHNICAL FIELD
[0003] The present application relates generally to the use of
microbial and chemical systems to convert biomass to commodity
chemicals, such as biofuels/biopetrols.
BACKGROUND
[0004] Petroleum is facing declining global reserves and
contributes to more than 30% of greenhouse gas emissions driving
global warming. Annually 800 billion barrels of transportation fuel
are consumed globally. Diesel and jet fuels account for greater
than 50% of global transportation fuels.
[0005] Significant legislation has been passed, requiring fuel
producers to cap or reduce the carbon emissions from the production
and use of transportation fuels. Fuel producers are seeking
substantially similar, low carbon fuels that can be blended and
distributed through existing infrastructure (e.g., refineries,
pipelines, tankers).
[0006] Due to increasing petroleum costs and reliance on
petrochemical feedstocks, the chemicals industry is also looking
for ways to improve margin and price stability, while reducing its
environmental footprint. The chemicals industry is striving to
develop greener products that are more energy, water, and CO2
efficient than current products. Fuels produced from biological
sources, such as biomass, represent one aspect of this process.
[0007] Present methods for converting biomass into biofuels focus
on the use of lignocellulosic biomass, and there are many problems
associated with using this process. Large-scale cultivation of
lignocellulosic biomass requires a substantial amount of cultivated
land, which can be achieved only by replacing food crop production
with energy crop production, by deforestation, and by recultivating
currently uncultivated land. Other problems include a decrease in
water availability and quality and an increase in the use of
pesticides and fertilizers.
[0008] The degradation of lignocellulosic biomass using biological
systems is a very difficult challenge due to its substantial
mechanical strength and its complex chemical components.
Approximately thirty different enzymes are required to fully
convert lignocellulose to monosaccharides. The only available
alternative to this complex approach requires a substantial amount of
heat, pressure, and strong acids. The art therefore needs an
economic and technically simple process for converting biomass into
hydrocarbons for use as biofuels or biopetrols.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] FIG. 1 shows the Vibrio splendidus genomic region of the
fosmid clone described in Example 1. Genes are indicated with
orange arrows. Labels show the numerical gene indices and the
predicted function of the proteins.
[0010] FIG. 2 illustrates the pathways involved in certain
embodiments in which E. coli may be engineered to grow on alginate
as a sole source of carbon.
[0011] FIG. 3 illustrates the pathways involved in certain
embodiments in which E. coli may be engineered to grow on pectin as
a sole source of carbon.
[0012] FIG. 4 shows the results of engineered or recombinant E.
coli growing on alginate as a sole source of carbon (see solid
circles). Agrobacterium tumefaciens cells provide a positive
control (see hatched circles). The well to the immediate left of
the A. tumefaciens positive control contains DH10B E. coli
cells, which provide a negative control.
[0013] FIG. 5 shows the growth of a recombinant strain of E. coli on
galacturonates and pectin. FIG. 5A shows the growth of E. coli on
various lengths of galacturonate after 24 hr. The recombinant
strain in FIG. 5A is the E. coli BL21(DE3) strain harboring
pTrlogl-kdgR+pBBRGal3P, and the control strain is the BL21(DE3)
strain harboring pTrc99A+pBBR1MCS-2, as described in Example 2.
FIG. 5B shows the growth of recombinant E. coli on pectin after 3-4
days. The recombinant strain in FIG. 5B is the E. coli DH5α strain
containing pPEL74 (Ctrl) and pPEL74 and pROU2, as described in
Example 2.
[0014] FIG. 6 shows the degradation of alginate to form pyruvate.
FIG. 6A illustrates a simplified metabolic pathway for alginate
degradation and metabolism. FIG. 6B shows the results of in vitro
degradation of alginate to form pyruvate by an enzymatic
degradation route. FIG. 6C shows the results of in vitro
degradation of alginate to form pyruvate by a chemical degradation
route.
[0015] FIG. 7 shows the biological activity of various alcohol
dehydrogenases isolated from Agrobacterium tumefaciens C58. FIG. 7A
shows DEHU hydrogenase activity as monitored by NADPH consumption,
and FIG. 7B shows mannuronate hydrogenase activity as monitored by
NADPH consumption.
[0016] FIG. 8 shows the GC-MS chromatogram results for the control
sample (FIG. 8A) and for isobutyraldehyde, 3-methylpentanol, and
2-methylpentanal production from pBADalsS-ilvCD-leuABCD2 and
pTrcBALK (FIG. 8B).
[0017] FIG. 9 shows the GC-MS chromatogram results for the control
sample (FIG. 9A) and for 4-hydroxyphenylethanol and
indole-3-ethanol production from pBADtyrA-aroLAC-aroG-tktA-aroBDE
and pTrcBALK (FIG. 9B).
[0018] FIG. 10 shows the mass spectrometry results for isobutanal
(FIG. 10A), 3-methylpentanol (FIG. 10B), and 2-methylpentanol (FIG.
10C).
[0019] FIG. 11 shows the mass spectrometry results for
phenylethanol (FIG. 11A), 4-hydroxyphenylethanol (FIG. 11B), and
indole-3-ethanol (FIG. 11C).
[0020] FIG. 12 shows the biological activity of diol dehydrogenases.
FIG. 12A shows the reduction of butyroin by ddh1, ddh2, and ddh3 as
monitored by NADH consumption. FIG. 12B shows the oxidation
activity of ddh3 towards 1,2-cyclopentanediol and
1,2-cyclohexanediol as measured by NADH production.
[0021] FIG. 13 summarizes the results of kinetic studies for
various substrates in the oxidation reactions catalyzed by the DDH
polypeptides. These reactions were NAD+ dependent.
[0022] FIG. 14 shows the nucleotide sequence (FIG. 14A) (SEQ ID
NO:97) and polypeptide sequence (FIG. 14B) (SEQ ID NO:98) of diol
dehydrogenase DDH1 isolated from Lactobacillus brevis ATCC 367.
[0023] FIG. 15 shows the nucleotide sequence (FIG. 15A) (SEQ ID
NO:99) and polypeptide sequence (FIG. 15B) (SEQ ID NO:100) of diol
dehydrogenase DDH2 isolated from Pseudomonas putida KT2440.
[0024] FIG. 16 shows the nucleotide sequence (FIG. 16A) (SEQ ID
NO:101) and polypeptide sequence (FIG. 16B) (SEQ ID NO:102) of diol
dehydrogenase DDH3 isolated from Klebsiella pneumoniae
MGH78578.
[0025] FIG. 17 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This reaction illustrates the
sequential conversion of butanal into 5-hydroxy-4-octanone and then
4,5-octanediol. FIG. 17A shows the detection of butyroin
(5-hydroxy-4-octanone) at 5.36 minutes, and FIG. 17B shows the
detection of 4,5-octanediol at 6.49 and 6.65 minutes.
[0026] FIG. 18 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the
sequential conversion of n-pentanal into 6-hydroxy-5-decanone and
then 5,6-decanediol. FIG. 18A shows the detection of valeroin
(6-hydroxy-5-decanone) at 8.22 minutes, and FIG. 18B shows the
detection of 5,6-decanediol at 9.22 and 9.35 minutes.
[0027] FIG. 19 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the
sequential conversion of 3-methylbutanal into
2,7-dimethyl-5-hydroxy-4-octanone and then
2,7-dimethyl-4,5-octanediol. FIG. 19A shows the detection of
isovaleroin (2,7-dimethyl-5-hydroxy-4-octanone) at 6.79 minutes,
and FIG. 19B shows the detection of 2,7-dimethyl-4,5-octanediol at
7.95 and 8.15 minutes.
[0028] FIG. 20 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the
sequential conversion of n-hexanal into 7-hydroxy-6-dodecanone and
then 6,7-dodecanediol. FIG. 20A shows the detection of hexanoin
(7-hydroxy-6-dodecanone) at 10.42 minutes, and FIG. 20B shows the
detection of 6,7-dodecanediol at 10.89 and 10.95 minutes.
[0029] FIG. 21 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the
sequential conversion of 4-methylpentanal into
2,9-dimethyl-6-hydroxy-5-decanone and then
2,9-dimethyl-5,6-decanediol. FIG. 21A shows the detection of
isohexanoin (2,9-Dimethyl-6-hydroxy-5-decanone) at 9.45 minutes,
and FIG. 21B shows the detection of 2,9-dimethyl-5,6-decanediol at
10.38 and 10.44 minutes.
[0030] FIG. 22 shows the in vivo biological activity of a
benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens
(codon usage was optimized for E. coli protein expression) and a
ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH
78578 (DDH3). This Figure illustrates the conversion of n-octanal
into 9-hydroxy-8-hexadecanone by showing the detection
of octanoin (9-hydroxy-8-hexadecanone) at 12.35 minutes.
[0031] FIG. 23 shows the in vivo biological activity of a
benzaldehyde lyase (bal) gene isolated from Pseudomonas fluorescens
(codon usage was optimized for E. coli protein expression) and a
ddh gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH
78578 (DDH3). This Figure illustrates the conversion of
acetaldehyde into 3-hydroxy-2-butanone by showing the detection of
acetoin (3-hydroxy-2-butanone) at rt=0.91 minutes.
[0032] FIG. 24 shows the sequential in vivo biological activity of
a benzaldehyde lyase (bal) gene isolated from Pseudomonas
fluorescens (codon usage was optimized for E. coli protein
expression) and a ddh gene isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 (DDH3). This Figure illustrates the
sequential conversion of n-propanal into 4-hydroxy-3-hexanone and
then 3,4-hexanediol. FIG. 24A shows the detection of propioin
(4-hydroxy-3-hexanone) at rt=2.62 minutes, and FIG. 24B shows the
detection of 3,4-hexanediol at rt=3.79 minutes.
[0033] FIG. 25 shows the in vivo biological activity of a benzaldehyde
lyase (bal) gene isolated from Pseudomonas fluorescens (codon usage
was optimized for E. coli protein expression) and a ddh gene
isolated from Klebsiella pneumoniae subsp. pneumoniae MGH 78578
(DDH3). This Figure illustrates the conversion of
phenylacetaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone by
showing the detection of 1,4-diphenyl-3-hydroxy-2-butanone at
rt=13.66 minutes.
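The unbranched-aldehyde conversions described for FIGS. 17, 18, 20, 22, 23, and 24 follow a regular naming pattern: two molecules of an n-carbon aldehyde give an acyloin with 2n carbons, which may then be reduced to the corresponding vicinal diol. The following is a minimal Python sketch of that pattern only, assuming unbranched aldehydes. The names `STEMS`, `homo_acyloin`, and `vicinal_diol` are hypothetical helpers introduced for illustration; the sketch generates names and does not model enzyme activity or retention times.

```python
# Alkane stems keyed by total carbon count of the coupled product.
# Hypothetical lookup table introduced for this illustration.
STEMS = {4: "but", 6: "hex", 8: "oct", 10: "dec",
         12: "dodec", 14: "tetradec", 16: "hexadec"}


def homo_acyloin(n):
    """Name the acyloin formed from two unbranched n-carbon aldehydes."""
    stem = STEMS[2 * n]
    # The carbonyl sits at carbon n, the hydroxyl at carbon n + 1.
    return f"{n + 1}-hydroxy-{n}-{stem}anone"


def vicinal_diol(n):
    """Name the diol obtained by reducing the acyloin's carbonyl group."""
    stem = STEMS[2 * n]
    return f"{n},{n + 1}-{stem}anediol"


if __name__ == "__main__":
    for n, aldehyde in [(3, "n-propanal"), (4, "butanal"), (5, "n-pentanal")]:
        print(aldehyde, "->", homo_acyloin(n), "->", vicinal_diol(n))
```

For butanal (n = 4) the sketch yields 5-hydroxy-4-octanone and 4,5-octanediol, matching the compounds named for FIG. 17.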
[0034] FIG. 26 shows the sequential biological activity of a diol
dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a
diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIG.
26A shows GC-MS data which confirms the presence of 4,5-octanediol
in the sample extraction, which is the expected product resulting
from the reduction of butyroin by ddh3. FIG. 26B shows GC-MS data
confirming the presence of 4-octanone in the sample extraction,
which is the expected product resulting from the sequential
dehydrogenation of butyroin and dehydration of 4,5-octanediol by
ddh3 and pduCDE, respectively.
[0035] FIG. 27 shows the sequential biological activity of a diol
dehydrogenase ddh from Klebsiella pneumoniae MGH 78578 (DDH3) and a
diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578. FIGS.
27A and 27B show comparisons between the sample extraction gas
chromatograph/mass spectrum and the 4-octanone standard gas
chromatograph/mass spectrum, confirming that 4-octanone was
produced from butyroin using the enzymes diol dehydrogenase (ddh3)
and a diol dehydratase (pduCDE).
[0036] FIG. 28 shows the nucleotide sequence (FIG. 28A) (SEQ ID
NO:103) and polypeptide sequence (FIG. 28B) (SEQ ID NO:104) of a
diol dehydratase large subunit (pduC) isolated from Klebsiella
pneumoniae MGH78578.
[0037] FIG. 29 shows the nucleotide sequence (FIG. 29A) (SEQ ID
NO:105) and polypeptide sequence (FIG. 29B) (SEQ ID NO:106) of a
diol dehydratase medium subunit isolated from Klebsiella pneumoniae
MGH78578 (pduD), in addition to the nucleotide sequence (FIG. 29C)
(SEQ ID NO:107) and polypeptide sequence (FIG. 29D) (SEQ ID NO:108)
of a diol dehydratase small subunit isolated from Klebsiella
pneumoniae MGH78578 (pduE).
[0038] FIG. 30 shows the oxidation of 4-octanol by secondary
alcohol dehydrogenases as monitored by NADH production (FIG. 30A)
and NADPH production (FIG. 30B).
[0039] FIG. 31 shows the oxidation of 4-octanol by secondary
alcohol dehydrogenases as monitored by NADH production (FIG. 31A)
and NADPH production (FIG. 31B).
[0040] FIG. 32 shows the oxidation of 2,7-dimethyl octanol by
secondary alcohol dehydrogenases as monitored by NADH production
(FIG. 32A) and NADPH production (FIG. 32B).
[0041] FIG. 33 shows the oxidation and reduction activity of 2ADH11
and 2ADH16. FIG. 33A shows the reduction of 2,7-dimethyl-4-octanone
as measured by NADPH consumption. FIG. 33B shows the reduction of
2,7-dimethyl-4-octanone, 4-octanone, and cyclopentanone.
[0042] FIG. 34 shows the oxidation and reduction of cyclopentanol
by secondary alcohol dehydrogenases. FIG. 34A shows the oxidation
of cyclopentanol as monitored by NADH or NADPH formation. FIG. 34B
shows the reduction of cyclopentanol as monitored by NADPH
consumption.
[0043] FIG. 35 shows the calculated rate constants for the
illustrated reduction reactions for each substrate catalyzed by
secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).
[0044] FIG. 36 shows the calculated rate constants for the
illustrated oxidation reactions for each substrate catalyzed by
secondary alcohol dehydrogenase ADH-16 (SEQ ID NO:138).
[0045] FIGS. 37A-B show a list of alginate lyase genes/proteins
that may be utilized according to the methods and recombinant
microorganisms described herein.
[0046] FIGS. 38A-E show a list of pectate lyase genes/proteins
that may be utilized according to the methods and recombinant
microorganisms described herein.
[0047] FIG. 39A shows a list of rhamnogalacturonan lyase
genes/proteins that may be utilized according to the methods and
recombinant microorganisms described herein. FIG. 39B shows a list
of rhamnogalacturonate hydrolase genes/proteins that may be
utilized according to the methods and recombinant microorganisms
described herein.
[0048] FIGS. 40A-B show a list of pectin methyl esterase
genes/proteins that may be utilized according to the methods and
recombinant microorganisms described herein.
[0049] FIG. 41 shows a list of pectin acetyl esterase
genes/proteins that may be utilized according to the methods and
recombinant microorganisms described herein.
[0050] FIG. 42 shows the production of 2-phenyl ethanol (FIG. 42A),
2-(4-hydroxyphenyl)ethanol (FIG. 42B), and 2-(indole-3-)ethanol
(FIG. 42C) at 24 hours from the recombinant microorganisms
described in Example 4, which comprise functional 2-phenylethanol,
2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis
pathways.
[0051] FIG. 43 shows the GC-MS chromatogram results that confirm
the production of 2-phenyl ethanol (FIG. 43B) at one week from the
recombinant microorganisms described in Example 4
(pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK). FIG. 43A shows the
negative control cells (pBAD33 and pTrc99A).
[0052] FIG. 44 shows the GC-MS chromatogram results that confirm
the production of 2-(4-hydroxyphenyl)ethanol (9.36 min) and
2-(indole-3) ethanol (10.32 min) at one week from the recombinant
microorganisms described in Example 4
(pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK).
[0053] FIG. 45 confirms both the formation of 1-propanal from
1,2-propanediol (FIG. 45A), and the formation of 2-butanone from
meso-2,3-butanediol (FIG. 45B), both of which were catalyzed in
vitro by an isolated B12 independent diol dehydratase, as described
in Example 9.
[0054] FIG. 46A shows the in vivo production of 1-propanol from
1,2-propanediol. FIG. 46B shows the in vivo production of 2-butanol
from meso-2,3 butanediol. FIG. 46C shows the in vivo production of
cyclopentanone from trans-1,2-cyclopentanediol. These experiments
were performed as described in Example 9.
[0055] FIG. 47 shows the results of the TBA assay, as performed in
Example 10. The left tube in FIG. 47 represents media taken from an
overnight culture of cells expressing Vs24254, showing secretion of
an alginate lyase, while the right hand tube shows the TBA reaction
using media from cells expressing Vs24259 (negative control). The
lack of pink coloration in the negative control indicates that
little or no cleavage of the alginate polymer has occurred.
[0056] FIG. 48 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized
benzaldehyde lyase (BAL) catalyzed the in vivo production of
3-hydroxy-2-pentanone and 2-hydroxy-3-pentanone from a ligation
reaction between acetaldehyde and propionaldehyde (FIG. 48A), and
catalyzed the in vivo production of 4-hydroxy-3-heptanone and
3-hydroxy-4-heptanone from a ligation reaction between
propionaldehyde and butyraldehyde (FIG. 48B).
[0057] FIG. 49 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 3-hydroxy-2-heptanone from a
ligation reaction between acetaldehyde and pentanal (FIG. 49A), and
catalyzed the in vivo production of 4-hydroxy-3-octanone and
3-hydroxy-4-octanone from a ligation reaction between pentanal and
propionaldehyde (FIG. 49B).
[0058] FIG. 50 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 5-hydroxy-4-nonanone from
a ligation reaction between butyraldehyde and pentanal (FIG. 50A),
and catalyzed the in vivo production of
2-methyl-5-hydroxy-4-decanone and 2-methyl-4-hydroxy-5-decanone
from a ligation reaction between hexanal and 3-methylbutyraldehyde
(FIG. 50B).
[0059] FIG. 51 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 6-methyl-3-hydroxy-2-heptanone
from a ligation reaction between acetaldehyde and 4-methylhexanal
(FIG. 51A), and catalyzed the in vivo production of
7-methyl-4-hydroxy-3-octanone from a ligation reaction between
4-methylhexanal and propionaldehyde (FIG. 51B).
[0060] FIG. 52 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 8-methyl-5-hydroxy-4-nonanone
from a ligation reaction between 4-methylhexanal and butyraldehyde
(FIG. 52A), and catalyzed the in vivo production of
3-hydroxy-2-decanone from a ligation reaction between acetaldehyde
and octanal (FIG. 52B).
[0061] FIG. 53 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 4-hydroxy-3-undecanone from
a ligation reaction between octanal and propionaldehyde (FIG. 53A),
and catalyzed the in vivo production of 5-hydroxy-4-dodecanone from
a ligation reaction between octanal and butyraldehyde (FIG.
53B).
[0062] FIG. 54 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of 6-hydroxy-5-tridecanone (FIG.
54A) from a ligation reaction between octanal and pentanal, and
catalyzed the in vivo production of 2-methyl-5-hydroxy-4-dodecanone
and 2-methyl-4-hydroxy-5-dodecanone from a ligation reaction between
octanal and 3-methylbutyraldehyde (FIG. 54B).
[0063] FIG. 55 shows the in vivo biological activity of a C--C
ligase isolated from Pseudomonas fluorescens and cloned into E.
coli. The GC-MS chromatogram results show that codon-optimized BAL
catalyzed the in vivo production of
2-methyl-6-hydroxy-5-tridecanone from a ligation reaction between
octanal and 4-methylpentanal.
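For the ligation reactions between two different unbranched aldehydes described for FIGS. 48 to 54, the product names again follow a simple pattern: the chain length is the sum of the two aldehyde carbon counts, and the carbonyl and hydroxyl groups occupy the two carbons at the junction, in either order. The Python sketch below illustrates only this naming pattern, assuming unbranched aldehydes of different lengths. `STEMS` and `cross_acyloins` are hypothetical names introduced for illustration, and the sketch does not predict which regioisomer, if any, is observed in a given experiment.

```python
# Alkane stems keyed by total carbon count of the ligation product.
# Hypothetical lookup table introduced for this illustration.
STEMS = {5: "pent", 6: "hex", 7: "hept", 8: "oct", 9: "non",
         10: "dec", 11: "undec", 12: "dodec", 13: "tridec"}


def cross_acyloins(m, n):
    """Name both regioisomeric acyloins from unbranched aldehydes of
    m and n carbons (m != n), numbering from the shorter aldehyde's end."""
    if m == n:
        raise ValueError("use two aldehydes of different length")
    short = min(m, n)
    stem = STEMS[m + n]
    # Isomer 1: carbonyl on the carbon contributed by the shorter aldehyde.
    first = f"{short + 1}-hydroxy-{short}-{stem}anone"
    # Isomer 2: carbonyl on the carbon contributed by the longer aldehyde.
    second = f"{short}-hydroxy-{short + 1}-{stem}anone"
    return first, second


if __name__ == "__main__":
    # Acetaldehyde (2 carbons) with propionaldehyde (3 carbons).
    print(cross_acyloins(2, 3))
```

For acetaldehyde and propionaldehyde the sketch returns 3-hydroxy-2-pentanone and 2-hydroxy-3-pentanone, the two products named for FIG. 48A.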
[0064] FIG. 56 shows the growth of recombinant E. coli on alginate
as a sole source of carbon (FIG. 56A), as described in Example 10.
Growth on glucose (FIG. 56B) provides a positive control. The cells
were transformed with either no plasmid (BL21--negative control),
one plasmid (e.g., Da or 3a), or two plasmids (e.g., Dk3a and
Da3k). The plasmids are indicated by the lower case letter: "a"
refers to the pET-DEST42 plasmid backbone and "k" refers to the
pENTR/D/TOPO backbone. "D" indicates that the plasmid contains the
genomic region Vs24214-24249, while "3" indicates that the plasmid
contains the genomic region Vs24189-24209. Thus, Da would be
pET-DEST42-Vs24214-24249, Da3k would be pET-DEST42-Vs24214-24249
and pENTR/D/TOPO-Vs24189-24209 and so on. These results show that
the combined genomic regions Vs24214-24249 and Vs24189-24209 are
sufficient to confer on E. coli the ability to grow on alginate as
a sole source of carbon.
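The plasmid shorthand used for FIG. 56 can be expanded mechanically. The following minimal Python sketch assumes each label is a sequence of two-character codes, a genomic-region character ("D" or "3") followed by a backbone character ("a" or "k"), as in the examples given above. The names `BACKBONES`, `REGIONS`, and `decode_label` are hypothetical and introduced only for illustration; the plasmid-free BL21 control is not a label in this scheme and is rejected.

```python
# Backbone and genomic-region codes as defined in the FIG. 56 description.
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}
REGIONS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}


def decode_label(label):
    """Expand a shorthand label such as 'Da3k' into full plasmid names."""
    if len(label) == 0 or len(label) % 2 != 0:
        raise ValueError(f"label must be region/backbone pairs: {label!r}")
    plasmids = []
    for i in range(0, len(label), 2):
        region, backbone = label[i], label[i + 1]
        if region not in REGIONS or backbone not in BACKBONES:
            raise ValueError(f"unknown code {label[i:i + 2]!r} in {label!r}")
        plasmids.append(f"{BACKBONES[backbone]}-{REGIONS[region]}")
    return plasmids


if __name__ == "__main__":
    for label in ["Da", "3a", "Dk3a", "Da3k"]:
        print(label, "=", " and ".join(decode_label(label)))
```

With this sketch, `decode_label("Da3k")` gives pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, in agreement with the expansion stated in the text.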
[0065] FIG. 57 shows the production of ethanol by E. coli growing
on alginate, as performed in Example 11. E. coli was transformed
with either pBBRPdc-AdhA/B or pBBRPdc-AdhA/B+1.5 FOS and allowed to
grow in M9 media containing alginate.
BRIEF SUMMARY
[0066] Embodiments of the present invention include methods for
converting a polysaccharide to a commodity chemical, comprising (a)
contacting the polysaccharide, wherein the polysaccharide is
optionally derived from biomass, with a polysaccharide degrading or
depolymerizing metabolic system, wherein the metabolic system is
selected from: (i) enzymatic or chemical catalysis, and (ii) a
microbial system, wherein the microbial system comprises a
recombinant microorganism, wherein the recombinant microorganism
comprises one or more exogenous genes that allow it to grow on the
polysaccharide as a sole source of carbon, thereby converting the
polysaccharide to a suitable monosaccharide or oligosaccharide; and
(b) contacting the suitable monosaccharide or oligosaccharide with
a commodity chemical biosynthesis pathway, wherein the commodity
chemical biosynthesis pathway comprises an aldehyde or ketone
biosynthesis pathway, thereby converting the polysaccharide to the
commodity chemical.
[0067] In certain aspects, the biomass is selected from marine
biomass and vegetable/fruit/plant biomass. In certain aspects, the
marine biomass is selected from kelp, giant kelp, sargasso,
seaweed, algae, marine microflora, microalgae, and sea grass. In
certain aspects, the vegetable/fruit/plant biomass comprises plant
peel or pomace. In certain aspects, the vegetable/fruit/plant
biomass is selected from citrus, potato, tomato, grape, gooseberry,
carrot, mango, sugar-beet, apple, switchgrass, wood, and
stover.
[0068] In certain aspects, the polysaccharide is selected from
alginate, agar, carrageenan, fucoidan, pectin, polygalacturonate,
cellulose, hemicellulose, xylan, arabinan, and mannan. In certain
aspects, the suitable monosaccharide or oligosaccharide is selected
from 2-keto-3-deoxy D-gluconate (KDG), D-mannitol, guluronate,
mannuronate, mannitol, lyxose, glycerol, xylitol, glucose, mannose,
galactose, xylose, arabinose, glucuronate, galacturonates, and
rhamnose.
In certain aspects, the commodity chemical is selected from
methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene,
1-propanol, propanal, acetone, propionate, n-butane, 1-butene,
1-butanol, butanal, butanoate, isobutanal, isobutanol,
2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol,
2-butene, 2-butanol, 2-butanone, 2,3-butanediol,
3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene,
ethenylbenzene, 2-phenylethanol, phenylacetaldehyde,
1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene,
1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol,
1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butanediol,
1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone,
1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol,
2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde,
1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene,
4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene,
1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol,
1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone,
1-(4-hydroxyphenyl)-2,3-butanediol,
1-(4-hydroxyphenyl)-3-hydroxy-2-butanone,
4-(4-hydroxyphenyl)-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-2,3-butanedione, indolylethane,
indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene,
1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol,
3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal,
4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone,
3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane,
4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene,
4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone,
2-methyl-3-pentanone, 4-methyl-2,3-pentanediol,
4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone,
4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene,
1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol,
1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone,
1-phenyl-2,3-pentanediol, 1-phenyl-2-hydroxy-3-pentanone,
1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione,
4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene,
4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene,
4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol,
4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone,
4-methyl-1-phenyl-2,3-pentanediol,
4-methyl-1-phenyl-2,3-pentanedione,
4-methyl-1-phenyl-3-hydroxy-2-pentanone,
4-methyl-1-phenyl-2-hydroxy-3-pentanone,
1-(4-hydroxyphenyl)pentane, 1-(4-hydroxyphenyl)-1-pentene,
1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene,
1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol,
1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone,
1-(4-hydroxyphenyl)-2,3-pentanediol,
1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone,
1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone,
1-(4-hydroxyphenyl)-2,3-pentanedione,
4-methyl-1-(4-hydroxyphenyl)pentane,
4-methyl-1-(4-hydroxyphenyl)-2-pentene,
4-methyl-1-(4-hydroxyphenyl)-3-pentene,
4-methyl-1-(4-hydroxyphenyl)-1-pentene,
4-methyl-1-(4-hydroxyphenyl)-3-pentanol,
4-methyl-1-(4-hydroxyphenyl)-2-pentanol,
4-methyl-1-(4-hydroxyphenyl)-3-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol,
4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione,
4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone,
1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene,
1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol,
1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone,
1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol,
1-(indole-3)-2-hydroxy-3-pentanone,
1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione,
4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene,
4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene,
4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol,
4-methyl-1-(indole-3)-3-pentanone,
4-methyl-1-(indole-3)-2-pentanone,
4-methyl-1-(indole-3)-2,3-pentanediol,
4-methyl-1-(indole-3)-2,3-pentanedione,
4-methyl-1-(indole-3)-3-hydroxy-2-pentanone,
4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene,
1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol,
3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione,
3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone,
3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone,
2-methylhexane, 3-methylhexane, 2-methyl-2-hexene,
2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene,
4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene,
3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol,
5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone,
5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol,
2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol,
5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol,
4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone,
2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone,
5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone,
4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane,
2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene,
2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone,
2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione,
2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane,
4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene,
5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene,
4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene,
4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol,
5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol,
4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone,
5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone,
4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol,
4-methyl-1-phenyl-2,3-hexanediol,
5-methyl-1-phenyl-3-hydroxy-2-hexanone,
5-methyl-1-phenyl-2-hydroxy-3-hexanone,
4-methyl-1-phenyl-3-hydroxy-2-hexanone,
4-methyl-1-phenyl-2-hydroxy-3-hexanone,
5-methyl-1-phenyl-2,3-hexanedione,
4-methyl-1-phenyl-2,3-hexanedione,
4-methyl-1-(4-hydroxyphenyl)hexane,
5-methyl-1-(4-hydroxyphenyl)-1-hexene,
5-methyl-1-(4-hydroxyphenyl)-2-hexene,
5-methyl-1-(4-hydroxyphenyl)-3-hexene,
4-methyl-1-(4-hydroxyphenyl)-1-hexene,
4-methyl-1-(4-hydroxyphenyl)-2-hexene,
4-methyl-1-(4-hydroxyphenyl)-3-hexene,
5-methyl-1-(4-hydroxyphenyl)-2-hexanol,
5-methyl-1-(4-hydroxyphenyl)-3-hexanol,
4-methyl-1-(4-hydroxyphenyl)-2-hexanol,
4-methyl-1-(4-hydroxyphenyl)-3-hexanol,
5-methyl-1-(4-hydroxyphenyl)-2-hexanone,
5-methyl-1-(4-hydroxyphenyl)-3-hexanone,
4-methyl-1-(4-hydroxyphenyl)-2-hexanone,
4-methyl-1-(4-hydroxyphenyl)-3-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol,
4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol,
5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone,
4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone,
4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione,
4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione,
4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene,
5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene,
4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene,
4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol,
5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol,
4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone,
5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone,
4-methyl-1-(indole-3)-3-hexanone,
5-methyl-1-(indole-3)-2,3-hexanediol,
4-methyl-1-(indole-3)-2,3-hexanediol,
5-methyl-1-(indole-3)-3-hydroxy-2-hexanone,
5-methyl-1-(indole-3)-2-hydroxy-3-hexanone,
4-methyl-1-(indole-3)-3-hydroxy-2-hexanone,
4-methyl-1-(indole-3)-2-hydroxy-3-hexanone,
5-methyl-1-(indole-3)-2,3-hexanedione,
4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene,
1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol,
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone,
2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol,
3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone,
3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane,
3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene,
2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene,
5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol,
2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol,
3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone,
6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone,
2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione,
6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione,
5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione,
2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone,
6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone,
5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone,
2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene,
2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene,
2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene,
2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol,
2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol,
2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione,
2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione,
2,6-dimethyl-3-hydroxy-4-heptanone,
2,6-dimethyl-4-hydroxy-3-heptanone,
2,5-dimethyl-3-hydroxy-4-heptanone,
2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene,
1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol,
4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone,
2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene,
7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene,
6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol,
3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone,
7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone,
2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione,
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione,
2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone,
3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone,
2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene,
2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone,
2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione,
2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane,
2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene,
3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol,
3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone,
3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol,
2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone,
2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane,
3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene,
3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone,
3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione,
3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol,
nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene,
2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol,
8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone,
8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione,
8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone,
2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene,
2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol,
2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone,
2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol,
2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone,
2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane,
3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene,
3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol,
3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone,
3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol,
3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone,
3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol,
decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene,
2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol,
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol,
2,9-dimethyl-6-hydroxy-5-decanone,
2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol,
undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol,
dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol,
tridecanal, tridecanoate, n-tetradecane, 1-tetradecene,
1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane,
1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate,
n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal,
hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol,
heptadecanal, heptadecanoate, n-octadecane, 1-octadecene,
1-octadecanol, octadecanal, octadecanoate, n-nonadecane,
1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane,
1-eicosene, 1-eicosanol, eicosanal, eicosanoate,
3-hydroxypropanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol,
3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate,
homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde,
glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol,
cyclopentanone, cyclopentanol, (S)-2-acetolactate,
(R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA,
isobutyrate, isobutyraldehyde, 5-amino pentaldehyde,
1,10-diaminodecane, 1,10-diamino-5-decene,
1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone,
1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone,
phenylacetaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene,
1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol,
1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol,
1,4-diphenyl-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-4-phenylbutane,
1-(4-hydroxyphenyl)-4-phenyl-1-butene,
1-(4-hydroxyphenyl)-4-phenyl-2-butene,
1-(4-hydroxyphenyl)-4-phenyl-2-butanol,
1-(4-hydroxyphenyl)-4-phenyl-2-butanone,
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol,
1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone,
1-(indole-3)-4-phenylbutane, 1-(indole-3)-4-phenyl-1-butene,
1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol,
1-(indole-3)-4-phenyl-2-butanone,
1-(indole-3)-4-phenyl-2,3-butanediol,
1-(indole-3)-4-phenyl-3-hydroxy-2-butanone,
4-hydroxyphenylacetaldehyde, 1,4-di(4-hydroxyphenyl)butane,
1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene,
1,4-di(4-hydroxyphenyl)-2-butanol,
1,4-di(4-hydroxyphenyl)-2-butanone,
1,4-di(4-hydroxyphenyl)-2,3-butanediol,
1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-4-(indole-3-)butane,
1-(4-hydroxyphenyl)-4-(indole-3)-1-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone,
1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol,
1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone,
indole-3-acetaldehyde, 1,4-di(indole-3-)butane,
1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene,
1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone,
1,4-di(indole-3)-2,3-butanediol,
1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde,
hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid,
3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic
acid, 3,4-hexanediol-1,8-dicarboxylic acid,
4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine,
chlorophyll, carotenoid, calcium, magnesium, iron, sodium,
potassium, and phosphate.
[0070] Certain embodiments of the present invention include methods
for converting a polysaccharide to a suitable monosaccharide or
oligosaccharide, comprising: (a) contacting the polysaccharide,
wherein the polysaccharide is optionally obtained from biomass,
with a microbial system for a time sufficient to convert the
polysaccharide to a suitable monosaccharide or oligosaccharide,
wherein the microbial system comprises, (i) at least one gene
encoding and expressing an enzyme selected from a lyase and a
hydrolase, wherein the lyase and/or hydrolase optionally comprises
at least one signal peptide or at least one autotransporter domain;
(ii) at least one gene encoding and expressing an enzyme selected
from a monosaccharide transporter, a disaccharide transporter, a
trisaccharide transporter, an oligosaccharide transporter, a
polysaccharide transporter, and a superchannel; and (iii) at least
one gene encoding and expressing an enzyme selected from a
monosaccharide dehydrogenase, an isomerase, a dehydratase, a
kinase, and an aldolase, thereby converting the polysaccharide to a
suitable monosaccharide or oligosaccharide.
[0071] Certain embodiments of the present invention include methods
for converting a polysaccharide to a suitable monosaccharide or
oligosaccharide, comprising: (a) contacting the polysaccharide,
wherein the polysaccharide is optionally obtained from biomass,
with a chemical or enzymatic catalysis pathway for a time
sufficient to convert the polysaccharide to a first monosaccharide
or oligosaccharide; and (b) contacting the first monosaccharide or
oligosaccharide with a microbial system for a time sufficient to
convert the first monosaccharide or oligosaccharide to the suitable
monosaccharide or oligosaccharide, wherein the microbial system
comprises, (i) at least one gene encoding and expressing an enzyme
selected from a lyase and a hydrolase, (ii) at least one gene
encoding and expressing an enzyme selected from a monosaccharide
transporter, a disaccharide transporter, a trisaccharide
transporter, an oligosaccharide transporter, a polysaccharide
transporter, and a superchannel; and (iii) at least one gene
encoding and expressing an enzyme selected from a monosaccharide
dehydrogenase, an isomerase, a dehydratase, a kinase, and an
aldolase, thereby converting the polysaccharide to the suitable
monosaccharide or oligosaccharide.
[0072] In certain aspects, the lyase is selected from an alginate
lyase, a pectate lyase, a polymannuronate lyase, a polyguluronate
lyase, a polygalacturonate lyase and a rhamnogalacturonate lyase.
In certain aspects, the hydrolase is selected from an alginate
hydrolase, a rhamnogalacturonate hydrolase, a polymannuronate
hydrolase, a pectin hydrolase, and a polygalacturonate hydrolase.
In certain aspects, the transporter is selected from an ABC
transporter, a symporter, and an outer membrane porin. In certain
aspects, the ABC transporter is selected from Atu3021, Atu3022,
Atu3023, Atu3024, algM1, algM2, AlgQ1, AlgQ2, AlgS,
OG2516_05558, OG2516_05563, OG2516_05568,
OG2516_05573, TogM, TogN, TogA, TogB, and functional variants
thereof. In certain aspects, the symporter is selected from
V12B01_24239 (SEQ ID NO:26), V12B01_24194 (SEQ ID
NO:8), and TogT, and functional variants thereof. In certain
aspects, the outer membrane porin comprises a porin selected from
V12B01_24269, KdgM, and KdgN, and functional variants
thereof.
[0073] Certain embodiments include a recombinant microorganism that
is capable of growing on a polysaccharide as a sole source of
carbon, wherein the polysaccharide is selected from alginate,
pectin, tri-galacturonate, di-galacturonate, cellulose, and
hemi-cellulose. In certain aspects, the polysaccharide is alginate.
In certain aspects, the polysaccharide is pectin. In certain
aspects, the polysaccharide is tri-galacturonate.
[0074] Certain embodiments include a recombinant microorganism,
comprising (i) at least one gene encoding and expressing an enzyme
selected from a lyase and a hydrolase, wherein the lyase or
hydrolase optionally comprises at least one signal peptide or at
least one autotransporter domain; (ii) at least one gene encoding
and expressing an enzyme selected from a monosaccharide
transporter, a disaccharide transporter, a trisaccharide
transporter, an oligosaccharide transporter, a polysaccharide
transporter, and a superchannel; and (iii) at least one gene
encoding and expressing an enzyme selected from a monosaccharide
dehydrogenase, an isomerase, a dehydratase, a kinase, and an
aldolase. In certain aspects, the microorganism is capable of
growing on a polysaccharide as a sole source of carbon. In certain
aspects, the polysaccharide is selected from alginate, pectin, and
tri-galacturonate.
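The three-part composition recited above can be read as a checklist: at least one degradation enzyme, at least one transporter, and at least one downstream metabolic enzyme. The following is a minimal sketch of such a check, not a method disclosed herein. The group labels (`degradation`, `transport`, `metabolism`) and the function name `missing_groups` are hypothetical and are introduced only for illustration; the class names in each group are taken from the text.

```python
# Enzyme and transporter classes as recited in the text.
# The group labels themselves are hypothetical, chosen for illustration.
REQUIRED_GROUPS = {
    "degradation": {"lyase", "hydrolase"},
    "transport": {
        "monosaccharide transporter",
        "disaccharide transporter",
        "trisaccharide transporter",
        "oligosaccharide transporter",
        "polysaccharide transporter",
        "superchannel",
    },
    "metabolism": {
        "monosaccharide dehydrogenase",
        "isomerase",
        "dehydratase",
        "kinase",
        "aldolase",
    },
}


def missing_groups(expressed_classes):
    """Return the group labels for which no recited class is present.

    `expressed_classes` is an iterable of class names (case-insensitive)
    describing the genes a candidate strain encodes and expresses.
    """
    expressed = {name.strip().lower() for name in expressed_classes}
    return [
        group
        for group, options in REQUIRED_GROUPS.items()
        if not (expressed & options)
    ]
```

An empty result means that each of elements (i), (ii), and (iii) is represented by at least one class; a non-empty result names the groups still missing from the design.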
[0075] Certain embodiments include methods for converting a
suitable monosaccharide or oligosaccharide to a first commodity
chemical comprising, (a) contacting the suitable monosaccharide or
oligosaccharide with a microbial system for a time sufficient to
convert the suitable monosaccharide or oligosaccharide to the
commodity chemical, wherein the microbial system comprises a
recombinant microorganism, wherein the microorganism comprises a
commodity chemical biosynthesis pathway, thereby converting the
suitable monosaccharide or oligosaccharide to the first commodity
chemical. In certain aspects, the commodity chemical pathway
comprises one or more genes encoding an aldehyde or ketone
biosynthesis pathway.
[0076] In certain aspects, the aldehyde or ketone biosynthesis
pathway is selected from one or more of an acetaldehyde, a
propionaldehyde, a butyraldehyde, an isobutyraldehyde, a
2-methyl-butyraldehyde, a 3-methyl-butyraldehyde, a 2-phenyl
acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, a
2-indole-3-acetaldehyde, a glutaraldehyde, a 5-amino-pentaldehyde,
a succinate semialdehyde, and a succinate 4-hydroxyphenyl
acetaldehyde biosynthesis pathway. In certain aspects, the aldehyde
or ketone biosynthesis pathway comprises an acetaldehyde
biosynthesis pathway and a biosynthesis pathway selected from a
propionaldehyde, butyraldehyde, isobutyraldehyde,
2-methyl-butyraldehyde, 3-methyl-butyraldehyde, a 2-phenyl
acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a
2-indole-3-acetaldehyde biosynthesis pathway.
[0077] In certain aspects, the aldehyde or ketone biosynthesis
pathway comprises a propionaldehyde biosynthesis pathway and a
biosynthesis pathway selected from a butyraldehyde,
isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde,
and phenylacetaldehyde biosynthesis pathway. In certain aspects,
the aldehyde or ketone biosynthesis pathway comprises a
butyraldehyde biosynthesis pathway and a biosynthesis pathway
selected from an isobutyraldehyde, 2-methyl-butyraldehyde,
3-methyl-butyraldehyde, a 2-phenyl acetaldehyde, a
2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde
biosynthesis pathway. In certain aspects, the aldehyde or ketone
biosynthesis pathway comprises an isobutyraldehyde biosynthesis
pathway and a biosynthesis pathway selected from a
2-methyl-butyraldehyde, 3-methyl-butyraldehyde, a 2-phenyl
acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a
2-indole-3-acetaldehyde biosynthesis pathway.
[0078] In certain aspects, the aldehyde or ketone biosynthesis
pathway comprises a 2-methyl-butyraldehyde biosynthesis pathway and
a biosynthesis pathway selected from a 3-methyl-butyraldehyde, a
2-phenyl acetaldehyde, a 2-(4-hydroxyphenyl)acetaldehyde, and a
2-indole-3-acetaldehyde biosynthesis pathway. In certain aspects,
the aldehyde or ketone biosynthesis pathway comprises a
3-methyl-butyraldehyde biosynthesis pathway and a biosynthesis
pathway selected from a 2-phenyl acetaldehyde, a
2-(4-hydroxyphenyl)acetaldehyde, and a 2-indole-3-acetaldehyde
biosynthesis pathway. In certain aspects, the aldehyde or ketone
biosynthesis pathway comprises a 2-phenyl acetaldehyde
biosynthesis pathway and a biosynthesis pathway selected from a
2-(4-hydroxyphenyl)acetaldehyde and a 2-indole-3-acetaldehyde
biosynthesis pathway.
[0079] In certain aspects, the aldehyde or ketone biosynthesis
pathway comprises a 2-(4-hydroxyphenyl)acetaldehyde biosynthesis
pathway and a 2-indole-3-acetaldehyde biosynthesis pathway. In
certain aspects, the first commodity chemical is further
enzymatically and/or chemically reduced and dehydrated to a second
commodity chemical.
[0080] Certain embodiments include methods for converting a
suitable monosaccharide or oligosaccharide to a commodity chemical
comprising, (a) contacting the suitable monosaccharide or
oligosaccharide with a microbial system for a time sufficient to
convert the suitable monosaccharide or oligosaccharide to the
commodity chemical, wherein the microbial system comprises: (i) one
or more genes encoding and expressing an aldehyde biosynthesis
pathway, wherein the aldehyde biosynthesis pathway comprises one or
more genes encoding and expressing a decarboxylase enzyme; and (ii)
one or more genes encoding and expressing an aldehyde reductase,
thereby converting the suitable monosaccharide or oligosaccharide
to the commodity chemical. In certain aspects, the decarboxylase
enzyme is an indole-3-pyruvate decarboxylase (IPDC). In certain
aspects, the IPDC comprises an amino acid sequence that is at least
80%, 90%, 95%, 98%, or 99% identical to the amino acid sequence set
forth in SEQ ID NO: 312. In certain aspects, the aldehyde reductase
enzyme is a phenylacetaldehyde reductase (PAR). In certain aspects,
the PAR comprises an amino acid sequence that is at least 80%, 90%,
95%, 98%, or 99% identical to the amino acid sequence set forth in
SEQ ID NO: 313. In certain aspects, the commodity chemical is
selected from 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and
indole-3-ethanol.
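The identity thresholds recited above (at least 80%, 90%, 95%, 98%, or 99%) can be illustrated with a minimal sketch. It assumes that the two amino acid sequences have already been aligned to equal length, with `-` marking gaps, and it uses one common convention: identical positions divided by the number of alignment columns that are not gaps in both sequences. Other conventions for the denominator exist and give different values. The function names `percent_identity` and `meets_threshold` are hypothetical, and any sequences passed to them here are toy inputs, not SEQ ID NO: 312 or SEQ ID NO: 313.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings have equal length and use '-' for gaps.
    Columns that are gaps in both sequences are skipped; every other
    column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.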
[0081] Certain embodiments include a recombinant microorganism,
comprising (i) one or more genes encoding and expressing an
aldehyde biosynthesis pathway, wherein the aldehyde biosynthesis
pathway comprises one or more genes encoding and expressing a
decarboxylase enzyme; and (ii) one or more genes encoding and
expressing an aldehyde reductase. In certain aspects, the aldehyde
biosynthesis pathway further comprises one or more genes encoding
and expressing an enzyme selected from a CoA-linked aldehyde
dehydrogenase, an aldehyde dehydrogenase, and an alcohol
dehydrogenase. In certain aspects, the decarboxylase enzyme is an
indole-3-pyruvate decarboxylase (IPDC). In certain aspects, the
aldehyde reductase enzyme is a phenylacetaldehyde reductase (PAR).
In certain aspects, the microorganism is capable of converting a
suitable monosaccharide or oligosaccharide to a commodity chemical.
In certain aspects, the commodity chemical is selected from
2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and
indole-3-ethanol.
[0082] Certain embodiments include a recombinant microorganism,
wherein the microorganism comprises reduced ethanol production
capability compared to a wild-type microorganism. In certain
aspects, the microorganism comprises a reduction or inhibition in
the conversion of acetyl-coA to ethanol. In certain aspects, the
recombinant microorganism comprises a reduction of an ethanol
dehydrogenase, thereby providing a reduced ethanol production
capability. In certain aspects, the ethanol dehydrogenase is an
adhE, or a homolog or variant thereof. In certain aspects, the
microorganism comprises a deletion or knockout of an adhE, or a
homolog or variant thereof. In certain aspects, the recombinant
microorganism comprises one or more deletions or knockouts in a
gene encoding an enzyme selected from an enzyme that catalyzes the
conversion of acetyl-coA to ethanol, an enzyme that catalyzes the
conversion of pyruvate to lactate, an enzyme that catalyzes the
conversion of fumarate to succinate, an enzyme that catalyzes the
conversion of acetyl-coA and phosphate to coA and acetyl phosphate,
an enzyme that catalyzes the conversion of acetyl-coA and formate
to coA and pyruvate, and an enzyme that catalyzes the conversion of
alpha-keto acid to branched chain amino acids.
[0083] In certain embodiments, the microbial systems or
recombinant microorganisms described herein comprise a
microorganism selected from Acetobacter aceti, Achromobacter,
Acidiphilium, Acinetobacter, Actinomadura, Actinoplanes, Aeropyrum
pernix, Agrobacterium, Alcaligenes, Ananas comosus (M),
Arthrobacter, Aspergillus niger, Aspergillus oryzae, Aspergillus
melleus, Aspergillus pulverulentus, Aspergillus saitoi, Aspergillus
sojae, Aspergillus usamii, Bacillus alcalophilus, Bacillus
amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus
clausii, Bacillus lentus, Bacillus licheniformis, Bacillus
macerans, Bacillus stearothermophilus, Bacillus subtilis,
Bifidobacterium, Brevibacillus brevis, Burkholderia cepacia,
Candida cylindracea, Candida rugosa, Carica papaya (L),
Cellulosimicrobium, Cephalosporium, Chaetomium erraticum,
Chaetomium gracile, Clostridium, Clostridium butyricum, Clostridium
acetobutylicum, Clostridium thermocellum, Corynebacterium
(glutamicum), Corynebacterium efficiens, Escherichia coli,
Enterococcus, Erwinia chrysanthemi, Gluconobacter,
Gluconacetobacter, Haloarcula, Humicola insolens,
Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces,
Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis,
Lactobacillus, Lactobacillus fermentum, Lactobacillus sake,
Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis,
Methanolobus siciliae, Methanogenium organophilum, Methanobacterium
bryantii, Microbacterium imperiale, Micrococcus lysodeikticus,
Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium,
Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus,
Pediococcus halophilus, Penicillium, Penicillium camemberti,
Penicillium citrinum, Penicillium emersonii, Penicillium
roqueforti, Penicillium lilacinum, Penicillium multicolor,
Paracoccus pantotrophus, Propionibacterium, Pseudomonas,
Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus,
Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor
miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar,
Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus
oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia
libertine, Sphingobacterium multivorum, Sphingobium, Sphingomonas,
Streptococcus, Streptococcus thermophilus Y-1, Streptomyces,
Streptomyces griseus, Streptomyces lividans, Streptomyces murinus,
Streptomyces rubiginosus, Streptomyces violaceoruber,
Streptoverticillium mobaraense, Tetragenococcus, Thermus,
Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma
longibrachiatum, Trichoderma reesei, Trichoderma viride,
Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas,
yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas
mobilis.
[0084] Certain embodiments include a commodity chemical produced by
the methods described herein. Certain aspects include a blended
commodity chemical comprising a commodity chemical produced by the
methods provided herein and a refinery-produced petroleum product.
In certain aspects, the commodity chemical is selected from a
C10-C12 hydrocarbon, 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol,
and indole-3-ethanol. In certain aspects, the C10-C12 hydrocarbon
is selected from 2,7-dimethyloctane and 2,9-dimethyldecane. In
certain aspects, the refinery-produced petroleum product is
selected from jet fuel and diesel fuel.
[0085] Certain embodiments include methods of producing a commodity
chemical enriched refinery-produced petroleum product, comprising
(a) blending the refinery-produced petroleum product with the
commodity chemical produced by the methods described herein,
thereby producing the commodity chemical enriched refinery-produced
petroleum product.
DETAILED DESCRIPTION
[0086] Embodiments of the present invention relate to the
unexpected discovery that microorganisms which are otherwise
incapable of growing on certain polysaccharides derived from
biomass as a sole source of carbon, can be engineered to grow on
these polysaccharides as a sole source of carbon. Such
microorganisms can include both prokaryotic and eukaryotic
microorganisms, such as bacteria and yeast. In some aspects,
certain laboratory and/or wild-type strains of E. coli can be
engineered to grow on biomass derived from either alginate or
pectin as a sole source of carbon to produce suitable
monosaccharides or other molecules. Among other uses apparent to a
person skilled in the art, the monosaccharides and other molecules
produced by the growth of these engineered or recombinant
microorganisms on alginate or pectin may be utilized as feedstock
in the production of various commodity chemicals, such as
biofuels.
[0087] Alginate and pectin provide advantages over other biomass
sources in the production of biofuel feedstocks. For example,
large-scale aquatic farming can generate a significant amount of
biomass without replacing food crop production with energy crop
production, without deforestation, and without cultivating currently
uncultivated land, as most of the hydrosphere, including oceans,
rivers, and lakes, remains untapped. As one particular example, the Pacific coast of
North America is abundant in minerals necessary for large-scale
aqua-farming. Giant kelp, which lives in the area, grows as fast as
1 m/day, the fastest among plants on earth, and grows up to 50 m.
Additionally, aqua-farming has other benefits including the
prevention of a red tide outbreak and the creation of a
fish-friendly environment.
[0088] As an additional advantage, and in contrast to
lignocellulosic biomass, biomass derived from aquatic, fruit, plant
and/or vegetable sources is easy to degrade. Such biomass typically
lacks lignin and is significantly more fragile than lignocellulosic
biomass and can thus be easily degraded using either enzymes or
chemical catalysts (e.g., formate). As one example, aquatic biomass
such as seaweed may be easily converted to monosaccharides using
either enzymes or chemical catalysis, as seaweed has significantly
simpler major sugar components (Alginate: 30%, Mannitol: 15%) as
compared to lignocellulose (Glucose: 24.1-39%, Mannose: 0.2-4.6%,
Galactose: 0.5-2.4%, Xylose: 0.4-22.1%, Arabinose: 1.5-2.8%, and
Uronic acids: 1.2-20.7%, with total sugar content corresponding
to 36.5-70% of dry weight).
[0089] As an additional example, biomass from plants such as fruits
and/or vegetables contains pectin, a heteropolysaccharide derived
from the plant cell wall. The characteristic structure of pectin is
a linear chain of α-(1-4)-linked D-galacturonic acid that
forms the pectin-backbone, a homogalacturonan. Pectin can be easily
converted to oligosaccharides or suitable monosaccharides using
enzymes, chemical catalysis, and/or microbial systems
designed to utilize pectin as a source of carbon, as described
herein. Saccharification and fermentation using aquatic, fruit,
and/or vegetable biomass is much easier than using
lignocellulose.
[0090] In this regard, embodiments of the present invention also
relate to the surprising discovery that certain microorganisms can
be engineered to produce various commodity chemicals, such as
biofuels. In certain aspects, these biofuels may include alkanes,
such as medium to long chain alkanes, which provide advantages over
ethanol based biofuels. In certain aspects, the monosaccharides
(e.g., 2-keto-3-deoxy D-gluconate; KDG) and other molecules
produced by the growth of various engineered or recombinant
microorganisms (e.g., recombinant microorganisms growing on pectin
or alginate as a source of carbon) may be useful in the production
of commodity chemicals, such as biofuels. As one example, suitable
monosaccharides such as KDG may be utilized by recombinant
microorganisms to produce alkanes, such as medium to long chain
alkanes, among other chemicals. In certain aspects, such
recombinant microorganisms may be utilized to produce such
commodity chemicals as 2,7-dimethyloctane and 2,9-dimethyldecane,
among others provided herein and known in the art.
[0091] Such processes produce biofuels with significant advantages
over other biofuels. In particular, medium to long chain alkanes
provide a number of important advantages over the existing common
biofuels such as ethanol and butanol, and are attractive long-term
replacements of petroleum-based fuels such as gasoline, diesels,
kerosene, and heavy oils in the future. As one example, medium to
long chain alkanes and alcohols are major components in all
petroleum products, and jet fuel in particular, and hence the alkanes
produced herein can be utilized directly by existing engines. By way of
further example, medium to long chain alcohols are far better fuels
than ethanol, and have a nearly comparable energy density to
gasoline.
[0092] As another example, n-alkanes are major components of all
oil products including gasoline, diesels, kerosene, and heavy oils.
Microbial systems or recombinant microorganisms may be used to
produce n-alkanes with different carbon lengths ranging, for
example, from C7 to over C20: C7 for gasoline (e.g., motor
vehicles), C10-C15 for diesels (e.g., motor vehicles, trains, and
ships), and C8-C16 for kerosene (e.g., aviation and ships), and
for all heavy oils.
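By way of illustration only, the chain-length ranges recited above can be expressed as a simple lookup. The sketch below is not part of the described methods; the names `FUEL_RANGES` and `fuel_classes` are hypothetical, and only the carbon-number ranges themselves (C7 for gasoline, C10-C15 for diesels, C8-C16 for kerosene) are taken from the text. Heavy oils are omitted because no range is stated for them.

```python
# Carbon-number ranges taken from the paragraph above (inclusive bounds).
# The names used here are hypothetical and exist only for this illustration.
FUEL_RANGES = {
    "gasoline": (7, 7),
    "diesel": (10, 15),
    "kerosene": (8, 16),
}


def fuel_classes(carbon_count: int) -> list[str]:
    """Return the fuel classes whose stated range includes this chain length."""
    return [
        fuel
        for fuel, (low, high) in FUEL_RANGES.items()
        if low <= carbon_count <= high
    ]


if __name__ == "__main__":
    for n in (7, 12, 16, 20):
        print(f"C{n}: {fuel_classes(n) or 'no range stated above'}")
```

Because the diesel and kerosene ranges overlap, a single n-alkane such as C12 may be suitable for more than one fuel class, which is why the function returns a list.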
[0093] As one aspect of the invention, the commodity chemicals
produced by the methods and recombinant microorganisms described
herein may be utilized by existing petroleum refineries for the
purposes of blending with petroleum products produced by
traditional refinery methods. To this end, as noted above, fuel
producers are seeking substantially similar, low carbon fuels that
can be blended and distributed through existing infrastructure
(refineries, pipelines, tankers). As hydrocarbons, the commodity
chemicals produced according to the methods herein are
substantially similar to petroleum derived fuels, reduce greenhouse
gas emissions by more than 80% relative to petroleum derived fuels,
and are compatible with existing infrastructure in the oil and gas
industry. For instance, certain of the commodity chemicals produced
herein, including, for example, various C10-C12 hydrocarbons such
as 2,7 dimethyloctane, 2,7 dimethyldecanone, among others, are
blendable directly into refinery-produced petroleum products, such
as jet and diesel fuels. By using such biologically produced
commodity chemicals as a blendstock for jet and diesel fuels,
refineries may reduce greenhouse gas emissions by more than
80%.
[0094] Accordingly, certain embodiments of the present invention
relate generally to methods for converting biomass to a commodity
chemical, comprising obtaining a polysaccharide from biomass;
contacting the polysaccharide with a polysaccharide degrading or
depolymerizing pathway, thereby converting the polysaccharide to a
suitable monosaccharide. The suitable monosaccharide obtained from
such a process may be used for any desired purpose. For instance,
in certain aspects, the suitable monosaccharide may then be
converted to a commodity chemical (e.g., biofuel) by contacting the
suitable monosaccharide with a biofuel biosynthesis pathway,
whether as part of a recombinant microorganism, an in vitro
enzymatic or chemical pathway, or a combination thereof, thereby
converting the monosaccharide to a commodity chemical.
[0095] In other aspects, in producing a commodity chemical such as
a biofuel, a suitable monosaccharide may be obtained directly from
any available source and converted to a commodity chemical by
contacting the suitable monosaccharide with a biofuel biosynthesis
pathway, as described herein. Among other uses apparent to a person
skilled in the art, such biofuels may then be blended directly with
refinery produced petroleum products, such as jet and diesel fuels,
to produce commodity chemical enriched, refinery-produced petroleum
products.
DEFINITIONS
[0096] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by those
of ordinary skill in the art to which the invention belongs.
Although any methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the
present invention, preferred methods and materials are described.
For the purposes of the present invention, the following terms are
defined below. All references referred to herein are incorporated
by reference in their entirety.
[0097] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e. to at least one) of the grammatical object
of the article. By way of example, "an element" means one element
or more than one element.
[0098] By "about" is meant a quantity, level, value, number,
frequency, percentage, dimension, size, amount, weight or length
that varies by as much as 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3,
2 or 1% to a reference quantity, level, value, number, frequency,
percentage, dimension, size, amount, weight or length.
[0099] The term "biologically active fragment", as applied to
fragments of a reference polynucleotide or polypeptide sequence,
refers to a fragment that has at least about 0.1, 0.5, 1, 2, 5, 10,
12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 35, 40, 45, 50, 55, 60, 65,
70, 75, 80, 85, 90, 95, 96, 97, 98, 99, 100, 110, 120, 150, 200,
300, 400, 500, 600, 700, 800, 900, 1000% or more of the activity of
a reference sequence.
[0100] The term "reference sequence" refers generally to a nucleic
acid coding sequence, or amino acid sequence, of any enzyme having
a biological activity described herein (e.g., saccharide
dehydrogenase, alcohol dehydrogenase, dehydratase, lyase,
transporter, decarboxylase, hydrolase, etc.), such as a "wild-type"
sequence, including those reference sequences exemplified by SEQ ID
NOS:1-144, and 308-313. A reference sequence may also include
naturally-occurring, functional variants (i.e., orthologs or
homologs) of the sequences described herein.
[0101] Included within the scope of the present invention are
biologically active fragments of at least about 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 120,
140, 160, 180, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380,
400, 500, 600 or more contiguous nucleotides or amino acid residues
in length, including all integers in between, which comprise or
encode a polypeptide having an enzymatic activity of a reference
polynucleotide or polypeptide. Representative biologically active
fragments generally participate in an interaction, e.g., an
intra-molecular or an inter-molecular interaction. An
inter-molecular interaction can be a specific binding interaction
or an enzymatic interaction. Examples of enzymatic interactions or
activities include saccharide dehydrogenase activities, alcohol
dehydrogenase activities, dehydratase activities, lyase
activities, transporter activities, isomerase activities, kinase
activities, among others described herein. Biologically active
fragments typically comprise one or more active sites or
enzymatic/binding motifs, as described herein and known in the
art.
[0102] By "coding sequence" is meant any nucleic acid sequence that
contributes to the code for the polypeptide product of a gene. By
contrast, the term "non-coding sequence" refers to any nucleic acid
sequence that does not contribute to the code for the polypeptide
product of a gene.
[0103] Throughout this specification, unless the context requires
otherwise, the words "comprise", "comprises" and "comprising" will
be understood to imply the inclusion of a stated step or element or
group of steps or elements but not the exclusion of any other step
or element or group of steps or elements.
[0104] By "consisting of" is meant including, and limited to,
whatever follows the phrase "consisting of." Thus, the phrase
"consisting of" indicates that the listed elements are required or
mandatory, and that no other elements may be present.
[0105] By "consisting essentially of" is meant including any
elements listed after the phrase, and limited to other elements
that do not interfere with or contribute to the activity or action
specified in the disclosure for the listed elements. Thus, the
phrase "consisting essentially of" indicates that the listed
elements are required or mandatory, but that other elements are
optional and may or may not be present depending upon whether or
not they affect the activity or action of the listed elements.
[0106] The terms "complementary" and "complementarity" refer to
polynucleotides (i.e., a sequence of nucleotides) related by the
base-pairing rules. For example, the sequence "A-G-T," is
complementary to the sequence "T-C-A." Complementarity may be
"partial," in which only some of the nucleic acids' bases are
matched according to the base pairing rules. Or, there may be
"complete" or "total" complementarity between the nucleic acids.
The degree of complementarity between nucleic acid strands has
significant effects on the efficiency and strength of hybridization
between nucleic acid strands.
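The base-pairing rules can be illustrated with a short sketch. It pairs two sequences position by position, as in the "A-G-T" and "T-C-A" example above, and reports the fraction of matched positions. The function names, and the use of a simple fraction as a measure of "partial" complementarity, are assumptions made for illustration; strand orientation is not modeled.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the position-by-position complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def complementarity(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which seq_b pairs with seq_a (equal lengths)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matched = sum(PAIRS[a] == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return matched / len(seq_a)


if __name__ == "__main__":
    print(complement("AGT"))              # TCA
    print(complementarity("AGT", "TCA"))  # 1.0, i.e. "complete" complementarity
    print(complementarity("AGT", "TCG"))  # less than 1.0, i.e. "partial"
```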
[0107] By "corresponds to" or "corresponding to" is meant (a) a
polynucleotide having a nucleotide sequence that is substantially
identical or complementary to all or a portion of a reference
polynucleotide sequence or encoding an amino acid sequence
identical to an amino acid sequence in a peptide or protein; or (b)
a peptide or polypeptide having an amino acid sequence that is
substantially identical to a sequence of amino acids in a reference
peptide or protein.
[0108] By "derivative" is meant a polypeptide that has been derived
from the basic sequence by modification, for example by conjugation
or complexing with other chemical moieties (e.g., pegylation) or by
post-translational modification techniques as would be understood
in the art. The term "derivative" also includes within its scope
alterations that have been made to a parent sequence including
additions or deletions that provide for functionally equivalent
molecules.
[0109] By "enzyme reactive conditions" it is meant that any
necessary conditions are available in an environment (i.e., such
factors as temperature, pH, lack of inhibiting substances) which
will permit the enzyme to function. Enzyme reactive conditions can
be either in vitro, such as in a test tube, or in vivo, such as
within a cell.
[0110] As used herein, the terms "function" and "functional" and
the like refer to a biological or enzymatic function.
[0111] By "gene" is meant a unit of inheritance that occupies a
specific locus on a chromosome and consists of transcriptional
and/or translational regulatory sequences and/or a coding region
and/or non-translated sequences (i.e., introns, 5' and 3'
untranslated sequences).
[0112] "Homology" refers to the percentage number of amino acids
that are identical or constitute conservative substitutions.
Homology may be determined using sequence comparison programs such
as GAP (Devereux et al., 1984, Nucleic Acids Research 12, 387-395)
which is incorporated herein by reference. In this way sequences of
a similar or substantially different length to those cited herein
could be compared by insertion of gaps into the alignment, such
gaps being determined, for example, by the comparison algorithm
used by GAP.
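As a minimal sketch of the definition above, and not an implementation of the GAP program, percent homology can be computed from two sequences that have already been aligned, with "-" marking inserted gaps. The grouping of amino acids treated as conservative substitutions, the function names, and the choice to ignore gapped columns are all assumptions made for this illustration.

```python
# Illustrative groups of amino acids with broadly similar side chains.
# This grouping is an assumption for the example, not a definition from the text.
CONSERVATIVE_GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("ST"),     # small hydroxyl
    set("NQ"),     # amide
    set("DE"),     # acidic
    set("KRH"),    # basic
]


def is_conservative(a: str, b: str) -> bool:
    """True if both residues fall within the same illustrative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_homology(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (percent identical, percent identical-or-conservative).

    Inputs must already be aligned to equal length, with '-' marking gaps.
    Percentages are taken over columns in which neither sequence has a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or is_conservative(a, b) for a, b in columns)
    return 100.0 * identical / len(columns), 100.0 * similar / len(columns)


if __name__ == "__main__":
    identity, homology = percent_homology("MKV-LDE", "MRVALDD")
    print(f"identity {identity:.1f}%, homology {homology:.1f}%")
```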
[0113] The term "host cell" includes an individual cell or cell
culture which can be or has been a recipient of any recombinant
vector(s) or isolated polynucleotide of the invention. Host cells
include progeny of a single host cell, and the progeny may not
necessarily be completely identical (in morphology or in total DNA
complement) to the original parent cell due to natural, accidental,
or deliberate mutation and/or change. A host cell includes cells
transfected, transformed, or infected in vivo or in vitro with a
recombinant vector or a polynucleotide of the invention. A host
cell which comprises a recombinant vector of the invention is a
recombinant host cell, recombinant cell, or recombinant
microorganism.
[0114] By "isolated" is meant material that is substantially or
essentially free from components that normally accompany it in its
native state. For example, an "isolated polynucleotide", as used
herein, refers to a polynucleotide, which has been purified from
the sequences which flank it in a naturally-occurring state, e.g.,
a DNA fragment which has been removed from the sequences that are
normally adjacent to the fragment. Alternatively, an "isolated
peptide" or an "isolated polypeptide" and the like, as used herein,
refer to in vitro isolation and/or purification of a peptide or
polypeptide molecule from its natural cellular environment, and
from association with other components of the cell, i.e., it is not
associated with in vivo substances.
[0115] By "increased" or "increasing" is meant the ability of one
or more recombinant microorganisms to produce a greater amount of a
given product or molecule (e.g., commodity chemical, biofuel, or
intermediate product thereof) as compared to a control
microorganism, such as an unmodified microorganism or a differently
modified microorganism. An "increased" amount is typically a
"statistically significant" amount, and may include an increase
that is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30 or more times
(including all integers and decimal points in between, e.g., 1.5,
1.6, 1.7, 1.8, etc.) the amount produced by an unmodified
microorganism or a differently modified microorganism.
[0116] By "obtained from" is meant that a sample such as, for
example, a polynucleotide extract or polypeptide extract is
isolated from, or derived from, a particular source, such as a
desired organism, typically a microorganism. "Obtained from" can
also refer to the situation in which a polynucleotide or
polypeptide sequence is isolated from, or derived from, a
particular organism or microorganism. For example, a polynucleotide
sequence encoding a benzaldehyde lyase enzyme may be isolated from
a variety of prokaryotic or eukaryotic microorganisms, such as
Pseudomonas.
[0117] The term "operably linked" as used herein means placing a
gene under the regulatory control of a promoter, which then
controls the transcription and optionally the translation of the
gene. In the construction of heterologous promoter/structural gene
combinations, it is generally preferred to position the genetic
sequence or promoter at a distance from the gene transcription
start site that is approximately the same as the distance between
that genetic sequence or promoter and the gene it controls in its
natural setting; i.e. the gene from which the genetic sequence or
promoter is derived. As is known in the art, some variation in this
distance can be accommodated without loss of function. Similarly,
the preferred positioning of a regulatory sequence element with
respect to a heterologous gene to be placed under its control is
defined by the positioning of the element in its natural setting;
i.e., the genes from which it is derived. "Constitutive promoters"
are typically active, i.e., promote transcription, under most
conditions. "Inducible promoters" are typically active only under
certain conditions, such as in the presence of a given molecular
factor (e.g., IPTG) or a given environmental condition (e.g.,
CO2 concentration, nutrient levels, light, heat). In the
absence of that condition, inducible promoters typically do not
allow significant or measurable levels of transcriptional
activity.
[0118] The recitation "polynucleotide" or "nucleic acid" as used
herein designates mRNA, RNA, cRNA, rRNA, cDNA or DNA. The term
typically refers to a polymeric form of nucleotides of at least 10
bases in length, either ribonucleotides or deoxynucleotides or a
modified form of either type of nucleotide. The term includes
single and double stranded forms of DNA.
[0119] As will be understood by those skilled in the art, the
polynucleotide sequences of this invention can include genomic
sequences, extra-genomic and plasmid-encoded sequences and smaller
engineered gene segments that express, or may be adapted to
express, proteins, polypeptides, peptides and the like. Such
segments may be naturally isolated, or modified synthetically by
the hand of man.
[0120] Polynucleotides may be single-stranded (coding or antisense)
or double-stranded, and may be DNA (genomic, cDNA or synthetic) or
RNA molecules. Additional coding or non-coding sequences may, but
need not, be present within a polynucleotide of the present
invention, and a polynucleotide may, but need not, be linked to
other molecules and/or support materials.
[0121] Polynucleotides may comprise a native sequence (i.e., an
endogenous sequence) or may comprise a variant, or a biological
functional equivalent of such a sequence. Polynucleotide variants
may contain one or more substitutions, additions, deletions and/or
insertions, as further described below, preferably such that the
enzymatic activity of the encoded polypeptide is not substantially
diminished relative to the unmodified polypeptide, and preferably
such that the enzymatic activity of the encoded polypeptide is
improved (e.g., optimized) relative to the unmodified polypeptide.
The effect on the enzymatic activity of the encoded polypeptide may
generally be assessed as described herein.
[0122] The polynucleotides of the present invention, regardless of
the length of the coding sequence itself, may be combined with
other DNA sequences, such as promoters, polyadenylation signals,
additional restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall length may
vary considerably. It is therefore contemplated that a
polynucleotide fragment of almost any length may be employed, with
the total length preferably being limited by the ease of
preparation and use in the intended recombinant DNA protocol.
[0123] The terms "polynucleotide variant" and "variant" and the
like refer to polynucleotides that display substantial sequence
identity with any of the reference polynucleotide sequences or
genes described herein, and to polynucleotides that hybridize with
any polynucleotide reference sequence described herein, or any
polynucleotide coding sequence of any gene or protein referred to
herein, under low stringency, medium stringency, high stringency,
or very high stringency conditions that are defined hereinafter and
known in the art. These terms also encompass polynucleotides that
are distinguished from a reference polynucleotide by the addition,
deletion or substitution of at least one nucleotide. Accordingly,
the terms "polynucleotide variant" and "variant" include
polynucleotides in which one or more nucleotides have been added or
deleted, or replaced with different nucleotides. In this regard, it
is well understood in the art that certain alterations inclusive of
mutations, additions, deletions and substitutions can be made to a
reference polynucleotide whereby the altered polynucleotide retains
the biological function or activity of the reference
polynucleotide, or has increased activity in relation to the
reference polynucleotide (i.e., optimized). Polynucleotide variants
include, for example, polynucleotides having at least 50% (and at
least 51% to at least 99% and all integer percentages in between)
sequence identity with a reference polynucleotide described
herein.
[0124] The terms "polynucleotide variant" and "variant" also
include naturally-occurring allelic variants that encode these
enzymes. Examples of naturally-occurring variants include allelic
variants (same locus), homologs (different locus), and orthologs
(different organism). Naturally occurring variants such as these
can be identified and isolated using well-known molecular biology
techniques including, for example, various polymerase chain
reaction (PCR) and hybridization-based techniques as known in the
art. Naturally occurring variants can be isolated from any organism
that encodes one or more genes having a suitable enzymatic activity
described herein (e.g., C-C ligase, diol dehydrogenase, pectate
lyase, alginate lyase, diol dehydratase, transporter, etc.).
[0125] Non-naturally occurring variants can be made by mutagenesis
techniques, including those applied to polynucleotides, cells, or
organisms. The variants can contain nucleotide substitutions,
deletions, inversions and insertions. Variation can occur in either
or both the coding and non-coding regions. In certain aspects,
non-naturally occurring variants may have been optimized for use in
a given microorganism (e.g., E. coli), such as by engineering and
screening the enzymes for increased activity, stability, or any
other desirable feature. The variations can produce both
conservative and non-conservative amino acid substitutions (as
compared to the originally encoded product). For nucleotide
sequences, conservative variants include those sequences that,
because of the degeneracy of the genetic code, encode the amino
acid sequence of a reference polypeptide. Variant nucleotide
sequences also include synthetically derived nucleotide sequences,
such as those generated, for example, by using site-directed
mutagenesis but which still encode a biologically active
polypeptide. Generally, variants of a particular reference
nucleotide sequence will have at least about 30%, 40%, 50%, 55%,
60%, 65%, 70%, generally at least about 75%, 80%, 85%, 90% to 95%
or more, and even about 97% or 98% or more sequence identity to
that particular nucleotide sequence as determined by sequence
alignment programs described elsewhere herein using default
parameters.
[0126] As used herein, the term "hybridizes under low stringency,
medium stringency, high stringency, or very high stringency
conditions" describes conditions for hybridization and washing.
Guidance for performing hybridization reactions can be found in
Ausubel et al., "Current Protocols in Molecular Biology", John
Wiley & Sons Inc, 1994-1998, Sections 6.3.1-6.3.6. Aqueous and
non-aqueous methods are described in that reference and either can
be used.
[0127] Reference herein to "low stringency" conditions includes and
encompasses from at least about 1% v/v to at least about 15% v/v
formamide and from at least about 1 M to at least about 2 M salt
for hybridization at 42 °C, and at least about 1 M to at
least about 2 M salt for washing at 42 °C. Low stringency
conditions also may include 1% Bovine Serum Albumin (BSA), 1 mM
EDTA, 0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at
65 °C, and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1
mM EDTA, 40 mM NaHPO4 (pH 7.2), 5% SDS for washing at room
temperature. One embodiment of low stringency conditions includes
hybridization in 6× sodium chloride/sodium citrate (SSC) at
about 45 °C, followed by two washes in 0.2×SSC, 0.1%
SDS at least at 50 °C (the temperature of the washes can be
increased to 55 °C for low stringency conditions).
[0128] "Medium stringency" conditions include and encompass from at
least about 16% v/v to at least about 30% v/v formamide and from at
least about 0.5 M to at least about 0.9 M salt for hybridization at
42 °C, and at least about 0.1 M to at least about 0.2 M
salt for washing at 55 °C. Medium stringency conditions also
may include 1% Bovine Serum Albumin (BSA), 1 mM EDTA, 0.5 M
NaHPO4 (pH 7.2), 7% SDS for hybridization at 65 °C,
and (i) 2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA, 40 mM
NaHPO4 (pH 7.2), 5% SDS for washing at 60-65 °C. One
embodiment of medium stringency conditions includes hybridizing in
6×SSC at about 45 °C, followed by one or more washes
in 0.2×SSC, 0.1% SDS at 60 °C.
[0129] "High stringency" conditions include and encompass from at
least about 31% v/v to at least about 50% v/v formamide and from
about 0.01 M to about 0.15 M salt for hybridization at 42 °C,
and about 0.01 M to about 0.02 M salt for washing at 55 °C.
High stringency conditions also may include 1% BSA, 1 mM EDTA,
0.5 M NaHPO4 (pH 7.2), 7% SDS for hybridization at 65 °C,
and (i) 0.2×SSC, 0.1% SDS; or (ii) 0.5% BSA, 1 mM EDTA,
40 mM NaHPO4 (pH 7.2), 1% SDS for washing at a temperature in
excess of 65 °C. One embodiment of high stringency
conditions includes hybridizing in 6×SSC at about 45 °C,
followed by one or more washes in 0.2×SSC, 0.1% SDS at
65 °C.
[0130] One embodiment of "very high stringency" conditions includes
hybridizing in 0.5 M sodium phosphate, 7% SDS at 65 °C,
followed by one or more washes in 0.2×SSC, 1% SDS at
65 °C.
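For illustration, the formamide ranges recited above for hybridization at 42 °C can be summarized as a lookup. This is a sketch under stated assumptions: the function name is hypothetical, the "about" qualifiers are treated as exact bounds, and values that fall between or outside the recited ranges are reported as unassigned because the text does not classify them. Salt concentrations and wash conditions are not modeled.

```python
from typing import Optional

# Formamide ranges (% v/v) for hybridization at 42 °C, as recited above.
FORMAMIDE_RANGES = [
    ("low", 1.0, 15.0),
    ("medium", 16.0, 30.0),
    ("high", 31.0, 50.0),
]


def stringency_from_formamide(percent_formamide: float) -> Optional[str]:
    """Return the stringency label whose recited range contains the value.

    Returns None for values outside every recited range, including the
    gaps between ranges, because the text does not assign those values.
    """
    for label, low, high in FORMAMIDE_RANGES:
        if low <= percent_formamide <= high:
            return label
    return None


if __name__ == "__main__":
    for pct in (10, 25, 50, 60):
        print(pct, stringency_from_formamide(pct))
```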
[0131] Other stringency conditions are well known in the art and a
skilled addressee will recognize that various factors can be
manipulated to optimize the specificity of the hybridization.
Optimization of the stringency of the final washes can serve to
ensure a high degree of hybridization. For detailed examples, see
Ausubel et al., supra at pages 2.10.1 to 2.10.16 and Sambrook et
al., Molecular Cloning, A Laboratory Manual (1989), at sections
1.101 to 1.104.
[0132] While stringent washes are typically carried out at
temperatures from about 42 °C to 68 °C, one skilled
in the art will appreciate that other temperatures may be suitable
for stringent conditions. Maximum hybridization rate typically
occurs at about 20 °C to 25 °C below the T_m
for formation of a DNA-DNA hybrid. It is well known in the art that
the T_m is the melting temperature, or temperature at which two
complementary polynucleotide sequences dissociate. Methods for
estimating T_m are well known in the art (see Ausubel et al.,
supra at page 2.10.8).
[0133] In general, the T_m of a perfectly matched duplex of DNA
may be predicted as an approximation by the formula:

$$T_m = 81.5 + 16.6\,\log_{10} M + 0.41\,(\%\,\mathrm{G{+}C}) - 0.63\,(\%\,\mathrm{formamide}) - \frac{600}{\mathrm{length}}$$

wherein: M is the concentration of Na+, preferably in the range of
0.01 molar to 0.4 molar; % G+C is the sum of guanine and cytosine
bases as a percentage of the total number of bases, within the range
between 30% and 75% G+C; % formamide is the percent formamide
concentration by volume; and length is the number of base pairs in the
DNA duplex. The T_m of a duplex DNA decreases by approximately 1 °C
with every increase of 1% in the number of randomly mismatched base
pairs. Washing is generally carried out at T_m - 15 °C for high
stringency, or T_m - 30 °C for moderate stringency.
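The melting-temperature approximation above can be written as a short calculation. The sketch below transcribes the recited formula, the approximately 1 °C decrease per 1% of mismatched base pairs, and the wash-temperature offsets; the function and parameter names are assumptions made for illustration, and the preferred ranges for Na+ and G+C content are not enforced.

```python
import math


def melting_temperature(
    na_molar: float,
    gc_percent: float,
    formamide_percent: float,
    length_bp: int,
) -> float:
    """Approximate melting temperature (°C) of a perfectly matched DNA duplex."""
    if na_molar <= 0:
        raise ValueError("Na+ concentration must be positive")
    if length_bp <= 0:
        raise ValueError("duplex length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / length_bp
    )


def adjust_for_mismatch(tm: float, mismatch_percent: float) -> float:
    """Lower the melting temperature by about 1 °C per 1% mismatched base pairs."""
    return tm - 1.0 * mismatch_percent


def wash_temperature(tm: float, stringency: str) -> float:
    """Wash temperature: 15 °C below T_m (high) or 30 °C below T_m (moderate)."""
    offsets = {"high": 15.0, "moderate": 30.0}
    return tm - offsets[stringency]


if __name__ == "__main__":
    # Example inputs chosen for illustration: 0.1 M Na+, 50% G+C,
    # no formamide, and a 100 base pair duplex.
    tm = melting_temperature(0.1, 50.0, 0.0, 100)
    print(f"T_m: {tm:.1f} °C")
    print(f"high stringency wash: {wash_temperature(tm, 'high'):.1f} °C")
```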
[0134] In one example of a hybridization procedure, a membrane
(e.g., a nitrocellulose membrane or a nylon membrane) containing
immobilized DNA is hybridized overnight at 42 °C in a
hybridization buffer (50% deionized formamide, 5×SSC,
5× Denhardt's solution (0.1% Ficoll, 0.1%
polyvinylpyrrolidone and 0.1% bovine serum albumin), 0.1% SDS and
200 mg/mL denatured salmon sperm DNA) containing a labeled probe.
The membrane is then subjected to two sequential medium stringency
washes (i.e., 2×SSC, 0.1% SDS for 15 min at 45 °C,
followed by 2×SSC, 0.1% SDS for 15 min at 50 °C),
followed by two sequential higher stringency washes (i.e.,
0.2×SSC, 0.1% SDS for 12 min at 55 °C followed by
0.2×SSC and 0.1% SDS solution for 12 min at 65-68 °C).
[0135] Polynucleotides and fusions thereof may be prepared,
manipulated and/or expressed using any of a variety of well
established techniques known and available in the art. For example,
polynucleotide sequences which encode polypeptides of the
invention, or fusion proteins or functional equivalents thereof,
may be used in recombinant DNA molecules to direct expression of a
selected enzyme in appropriate host cells. Due to the inherent
degeneracy of the genetic code, other DNA sequences that encode
substantially the same or a functionally equivalent amino acid
sequence may be produced and these sequences may be used to clone
and express a given polypeptide.
[0136] As will be understood by those of skill in the art, it may
be advantageous in some instances to produce polypeptide-encoding
nucleotide sequences possessing non-naturally occurring codons. For
example, codons preferred by a particular prokaryotic or eukaryotic
host can be selected to increase the rate of protein expression or
to produce a recombinant RNA transcript having desirable
properties, such as a half-life which is longer than that of a
transcript generated from the naturally occurring sequence. Such
nucleotides are typically referred to as "codon-optimized." Any of
the nucleotide sequences described herein may be utilized in such a
"codon-optimized" form. For example, the nucleotide coding sequence
of the benzaldehyde lyase from Pseudomonas fluorescens may be
codon-optimized for expression in E. coli.
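As a simplified illustration of codon optimization, an amino acid sequence can be back-translated using one chosen codon per residue, which changes the nucleotide sequence without changing the encoded polypeptide. In the sketch below, each codon does encode the listed residue under the standard genetic code, but the selection of which synonymous codon is "preferred" for E. coli is an assumption made for this example; a measured codon-usage table for the intended host would be used in practice. The names `PREFERRED_CODON` and `codon_optimize` are hypothetical.

```python
# One codon per amino acid (single-letter code); "*" denotes a stop codon.
# The codon choices are illustrative assumptions, not host data from the text.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def codon_optimize(protein: str) -> str:
    """Back-translate a protein sequence using one chosen codon per residue."""
    try:
        return "".join(PREFERRED_CODON[residue] for residue in protein.upper())
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]}") from None


if __name__ == "__main__":
    # Short hypothetical peptide, used only to show the call.
    print(codon_optimize("MKL*"))
```

More complete approaches would also weigh factors such as transcript stability, as noted above, rather than selecting a single codon per residue.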
[0137] Moreover, the polynucleotide sequences of the present
invention can be engineered using methods generally known in the
art in order to alter polypeptide encoding sequences for a variety
of reasons, including but not limited to, alterations which modify
the cloning, processing, expression and/or activity of the gene
product.
[0138] In order to express a desired polypeptide, a nucleotide
sequence encoding the polypeptide, or a functional equivalent, may
be inserted into an appropriate expression vector, i.e., a vector that
contains the necessary elements for the transcription and
translation of the inserted coding sequence. Methods which are well
known to those skilled in the art may be used to construct
expression vectors containing sequences encoding a polypeptide of
interest and appropriate transcriptional and translational control
elements. These methods include in vitro recombinant DNA
techniques, synthetic techniques, and in vivo genetic
recombination. Such techniques are described in Sambrook et al.,
Molecular Cloning, A Laboratory Manual (1989), and Ausubel et al.,
Current Protocols in Molecular Biology (1989).
[0139] "Polypeptide," "polypeptide fragment," "peptide" and
"protein" are used interchangeably herein to refer to a polymer of
amino acid residues and to variants and synthetic analogues of the
same. Thus, these terms apply to amino acid polymers in which one
or more amino acid residues are synthetic non-naturally occurring
amino acids, such as a chemical analogue of a corresponding
naturally occurring amino acid, as well as to naturally-occurring
amino acid polymers. In certain aspects, polypeptides may include
enzymatic polypeptides, or "enzymes," which typically catalyze
(i.e., increase the rate of) various chemical reactions.
[0140] The recitation polypeptide "variant" refers to polypeptides
that are distinguished from a reference polypeptide sequence by the
addition, deletion or substitution of at least one amino acid
residue. In certain embodiments, a polypeptide variant is
distinguished from a reference polypeptide by one or more
substitutions, which may be conservative or non-conservative. In
certain embodiments, the polypeptide variant comprises conservative
substitutions and, in this regard, it is well understood in the art
that some amino acids may be changed to others with broadly similar
properties without changing the nature of the activity of the
polypeptide. Polypeptide variants also encompass polypeptides in
which one or more amino acids have been added or deleted, or
replaced with different amino acid residues.
[0141] The present invention contemplates the use in the methods
described herein of variants of full-length polypeptides having any
of the enzymatic activities described herein, truncated fragments
of these full-length polypeptides, variants of truncated fragments,
as well as their related biologically active fragments. Typically,
biologically active fragments of a polypeptide may participate in
an interaction, for example, an intra-molecular or an
inter-molecular interaction. An inter-molecular interaction can be
a specific binding interaction or an enzymatic interaction (e.g.,
the interaction can be transient and a covalent bond is formed or
broken). Biologically active fragments of a polypeptide/enzyme having an
enzymatic activity described herein include peptides comprising
amino acid sequences sufficiently similar to, or derived from, the
amino acid sequences of a (putative) full-length reference
polypeptide sequence. Typically, biologically active fragments
comprise a domain or motif with at least one enzymatic activity,
and may include one or more (and in some cases all) of the various
active domains. A biologically active fragment of an enzyme can
be a polypeptide fragment which is, for example, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170,
180, 190, 200, 220, 240, 260, 280, 300, 320, 340, 360, 380, 400,
450, 500, 600 or more contiguous amino acids, including all
integers in between, of a reference polypeptide sequence. In
certain embodiments, a biologically active fragment comprises a
conserved enzymatic sequence, domain, or motif, as described
elsewhere herein and known in the art. Suitably, the
biologically-active fragment has no less than about 1%, 10%, 25%, or
50% of an activity of the wild-type polypeptide from which it is
derived.
[0142] The term "exogenous" refers generally to a polynucleotide
sequence or polypeptide that does not naturally occur in a
wild-type cell or organism, but is typically introduced into the
cell by molecular biological techniques, i.e., engineering to
produce a recombinant microorganism. Examples of "exogenous"
polynucleotides include vectors, plasmids, and/or man-made nucleic
acid constructs encoding a desired protein or enzyme. The term
"endogenous" refers generally to naturally occurring polynucleotide
sequences or polypeptides that may be found in a given wild-type
cell or organism. For example, certain naturally-occurring
bacterial or yeast species do not typically contain a benzaldehyde
lyase gene, and, therefore, do not comprise an "endogenous"
polynucleotide sequence that encodes a benzaldehyde lyase. In this
regard, it is also noted that even though an organism may comprise
an endogenous copy of a given polynucleotide sequence or gene, the
introduction of a plasmid or vector encoding that sequence, such as
to over-express or otherwise regulate the expression of the encoded
protein, represents an "exogenous" copy of that gene or
polynucleotide sequence. Any of the pathways, genes, or enzymes
described herein may utilize or rely on an "endogenous" sequence,
or may be provided as one or more "exogenous" polynucleotide
sequences, and/or may be utilized according to the endogenous
sequences already contained within a given microorganism.
[0143] A "recombinant" microorganism typically comprises one or
more exogenous nucleotide sequences, such as in a plasmid or
vector.
[0144] The recitations "sequence identity" or, for example,
comprising a "sequence 50% identical to," as used herein, refer to
the extent that sequences are identical on a
nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis
over a window of comparison. Thus, a "percentage of sequence
identity" may be calculated by comparing two optimally aligned
sequences over the window of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C,
G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser,
Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu,
Asn, Gln, Cys and Met) occurs in both sequences to yield the number
of matched positions, dividing the number of matched positions by
the total number of positions in the window of comparison (i.e.,
the window size), and multiplying the result by 100 to yield the
percentage of sequence identity.
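The calculation recited above can be illustrated with a minimal Python
sketch. It assumes the two sequences have already been optimally
aligned over the window of comparison, are of equal length, and use
"-" to mark a gap; the function name and the gap convention are
illustrative assumptions and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over a window of comparison.

    Assumes both inputs are already aligned and of equal length, with
    '-' marking a gap (an illustrative convention). Every column counts
    toward the window size; a gap column is never counted as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("window of comparison is empty")
    # Number of positions at which the identical base or residue occurs.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    # Matched positions / window size, multiplied by 100.
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(percent_identity("ATGCCGTA", "ATGACGTA"))
```

For the eight-position window in the example, seven positions match,
so the function returns 87.5.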
[0145] Terms used to describe sequence relationships between two or
more polynucleotides or polypeptides include "reference sequence",
"comparison window", "sequence identity", "percentage of sequence
identity" and "substantial identity". A "reference sequence" is at
least 12 but frequently 15 to 18 and often at least 25 monomer
units, inclusive of nucleotides and amino acid residues, in length.
Because two polynucleotides may each comprise (1) a sequence (i.e.,
only a portion of the complete polynucleotide sequence) that is
similar between the two polynucleotides, and (2) a sequence that is
divergent between the two polynucleotides, sequence comparisons
between two (or more) polynucleotides are typically performed by
comparing sequences of the two polynucleotides over a "comparison
window" to identify and compare local regions of sequence
similarity. A "comparison window" refers to a conceptual segment of
at least 6 contiguous positions, usually about 50 to about 100,
more usually about 100 to about 150 in which a sequence is compared
to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. The comparison
window may comprise additions or deletions (i.e., gaps) of about
20% or less as compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two
sequences. Optimal alignment of sequences for aligning a comparison
window may be conducted by computerized implementations of
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package Release 7.0, Genetics Computer Group, 575
Science Drive, Madison, Wis., USA) or by inspection and the best
alignment (i.e., resulting in the highest percentage homology over
the comparison window) generated by any of the various methods
selected. Reference also may be made to the BLAST family of
programs as for example disclosed by Altschul et al., 1997, Nucl.
Acids Res. 25:3389. A detailed discussion of sequence analysis can
be found in Unit 19.3 of Ausubel et al., "Current Protocols in
Molecular Biology", John Wiley & Sons Inc, 1994-1998, Chapter
15.
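As a hedged illustration of what "optimally aligned" can mean in
practice, the sketch below implements a basic Needleman-Wunsch global
alignment with a linear gap penalty and then measures the gaps in the
resulting window relative to the reference length. It is not the
implementation used by GAP, BESTFIT, FASTA, TFASTA, or BLAST; the
scoring values (match +1, mismatch -1, gap -2), the function names, and
the choice to divide gap columns by the ungapped reference length are
all assumptions made for illustration.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> tuple[str, str]:
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns the two aligned strings with '-' marking gaps. The scoring
    values are illustrative assumptions, not values from the source.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def gap_fraction(aligned_a: str, aligned_b: str, reference_length: int) -> float:
    """Gap columns in the window divided by the ungapped reference length."""
    gaps = sum(1 for x, y in zip(aligned_a, aligned_b) if "-" in (x, y))
    return gaps / reference_length


if __name__ == "__main__":
    ref, query = "GATTACA", "GATACA"
    aligned_ref, aligned_query = global_align(ref, query)
    print(aligned_ref)
    print(aligned_query)
    print(gap_fraction(aligned_ref, aligned_query, len(ref)) <= 0.20)
```

In this example a single gap is introduced into the shorter sequence,
which is one gap against a seven-position reference and therefore
within the "about 20% or less" allowance described above.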
[0146] "Transformation" refers generally to the permanent,
heritable alteration in a cell resulting from the uptake and
incorporation of foreign DNA into the host-cell genome; also, the
transfer of an exogenous gene from one organism into the genome of
another organism.
[0147] By "vector" is meant a polynucleotide molecule, preferably a
DNA molecule derived, for example, from a plasmid, bacteriophage,
yeast or virus, into which a polynucleotide can be inserted or
cloned. A vector preferably contains one or more unique restriction
sites and can be capable of autonomous replication in a defined
host cell including a target cell or tissue or a progenitor cell or
tissue thereof, or be integrable with the genome of the defined
host such that the cloned sequence is reproducible. Accordingly,
the vector can be an autonomously replicating vector, i.e., a
vector that exists as an extra-chromosomal entity, the replication
of which is independent of chromosomal replication, e.g., a linear
or closed circular plasmid, an extra-chromosomal element, a
mini-chromosome, or an artificial chromosome. The vector can
contain any means for assuring self-replication. Alternatively, the
vector can be one which, when introduced into the host cell, is
integrated into the genome and replicated together with the
chromosome(s) into which it has been integrated. Such a vector may
comprise specific sequences that allow recombination into a
particular, desired site of the host chromosome. A vector system
can comprise a single vector or plasmid, two or more vectors or
plasmids, which together contain the total DNA to be introduced
into the genome of the host cell, or a transposon. The choice of
the vector will typically depend on the compatibility of the vector
with the host cell into which the vector is to be introduced. In
the present case, the vector is preferably one which is operably
functional in a bacterial cell, such as a cyanobacterial cell. The
vector can include a reporter gene, such as a green fluorescent
protein (GFP), which can be either fused in frame to one or more of
the encoded polypeptides, or expressed separately. The vector can
also include a selection marker such as an antibiotic resistance
gene that can be used for selection of suitable transformants.
[0148] The terms "wild-type" and "naturally occurring" are used
interchangeably to refer to a gene or gene product that has the
characteristics of that gene or gene product when isolated from a
naturally occurring source. A wild type gene or gene product (e.g.,
a polypeptide) is that which is most frequently observed in a
population and is thus arbitrarily designated the "normal" or
"wild-type" form of the gene.
[0149] Examples of "biomass" include aquatic or marine biomass,
fruit-based biomass such as fruit waste, and vegetable-based
biomass such as vegetable waste, among others. Examples of aquatic
or marine biomass include, but are not limited to, kelp, giant
kelp, seaweed, algae, and marine microflora, microalgae, sea grass,
and the like. In certain aspects, biomass does not include
fossilized sources of carbon, such as hydrocarbons that are
typically found within the top layer of the Earth's crust (e.g.,
natural gas, nonvolatile materials composed of almost pure carbon,
like anthracite coal, etc).
[0150] Examples of fruit and/or vegetable biomass include, but are
not limited to, any source of pectin such as plant peel and pomace
including citrus, orange, grapefruit, potato, tomato, grape, mango,
gooseberry, carrot, sugar-beet, and apple, among others.
[0151] Examples of polysaccharides, oligosaccharides,
monosaccharides or other sugar components of biomass include, but
are not limited to, alginate, agar, carrageenan, fucoidan, pectin,
guluronate, mannuronate, mannitol, lyxose, cellulose, hemicellulose,
glycerol, xylitol, glucose, mannose, galactose, xylose, xylan,
mannan, arabinan, arabinose, glucuronate, galacturonate (including
di- and tri-galacturonates), rhamnose, and the like.
[0152] Certain examples of alginate-derived polysaccharides include
saturated polysaccharides, such as β-D-mannuronate,
α-L-guluronate, dialginate, trialginate, pentalginate,
hexalginate, heptalginate, octalginate, nonalginate, decalginate,
undecalginate, dodecalginate and polyalginate, as well as
unsaturated polysaccharides such as 4-deoxy-L-erythro-5-hexoseulose
uronic acid, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-D-mannuronate or
L-guluronate, 4-(4-deoxy-beta-D-mann-4-enuronosyl)-dialginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-trialginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-tetralginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-pentalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-hexalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-heptalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-octalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-nonalginate,
4-(4-deoxy-beta-D-mann-4-enuronosyl)-undecalginate, and
4-(4-deoxy-beta-D-mann-4-enuronosyl)-dodecalginate.
[0153] Certain examples of pectin-derived polysaccharides include
saturated polysaccharides, such as galacturonate, digalacturonate,
trigalacturonate, tetragalacturonate, pentagalacturonate,
hexagalacturonate, heptagalacturonate, octagalacturonate,
nonagalacturonate, decagalacturonate, dodecagalacturonate,
polygalacturonate, and rhamnopolygalacturonate, as well as
unsaturated polysaccharides such as 4-deoxy-L-threo-5-hexosulose
uronate, 4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-galacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-digalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-trigalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-tetragalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-pentagalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-hexagalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-heptagalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-octagalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-nonagalacturonate,
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-decagalacturonate, and
4-(4-Deoxy-alpha-D-gluc-4-enuronosyl)-D-dodecagalacturonate.
[0154] These polysaccharide or oligosaccharide components may be
converted into "suitable monosaccharides" or other "suitable
saccharides," such as "suitable oligosaccharides," by the
microorganisms described herein which are capable of growing on
such polysaccharides or other sugar components as a source of
carbon (e.g., a sole source of carbon).
[0155] A "suitable monosaccharide" or "suitable saccharide" refers
generally to any saccharide that may be produced by a recombinant
microorganism growing on pectin, alginate, or other saccharide
(e.g., galacturonate, cellulose, hemi-cellulose etc.) as a source
or sole source of carbon, and also refers generally to any
saccharide that may be utilized in a biofuel biosynthesis pathway
of the present invention to produce hydrocarbons such as biofuels
or biopetrols. Examples of suitable monosaccharides or
oligosaccharides include, but are not limited to, 2-keto-3-deoxy
D-gluconate (KDG), D-mannitol, guluronate, mannuronate, mannitol,
lyxose, glycerol, xylitol, glucose, mannose, galactose, xylose,
arabinose, glucuronate, galacturonates, and rhamnose, and the like.
As noted herein, a "suitable monosaccharide" or "suitable
saccharide" as used herein may be produced by an engineered or
recombinant microorganism of the present invention, or may be
obtained from commercially available sources.
[0156] The recitation "commodity chemical" as used herein includes
any saleable or marketable chemical that can be produced either
directly or as a by-product of the methods provided herein,
including biofuels and/or biopetrols. General examples of
"commodity chemicals" include, but are not limited to, biofuels,
minerals, polymer precursors, fatty alcohols, surfactants,
plasticizers, and solvents. The recitation "biofuels" as used
herein includes solid, liquid, or gas fuels derived, at least in
part, from a biological source, such as a recombinant
microorganism.
[0157] Examples of commodity chemicals include, but are not limited to,
methane, methanol, ethane, ethene, ethanol, n-propane, 1-propene,
1-propanol, propanal, acetone, propionate, n-butane, 1-butene,
1-butanol, butanal, butanoate, isobutanal, isobutanol,
2-methylbutanal, 2-methylbutanol, 3-methylbutanal, 3-methylbutanol,
2-butene, 2-butanol, 2-butanone, 2,3-butanediol,
3-hydroxy-2-butanone, 2,3-butanedione, ethylbenzene,
ethenylbenzene, 2-phenylethanol, phenylacetaldehyde,
1-phenylbutane, 4-phenyl-1-butene, 4-phenyl-2-butene,
1-phenyl-2-butene, 1-phenyl-2-butanol, 4-phenyl-2-butanol,
1-phenyl-2-butanone, 4-phenyl-2-butanone, 1-phenyl-2,3-butanediol,
1-phenyl-3-hydroxy-2-butanone, 4-phenyl-3-hydroxy-2-butanone,
1-phenyl-2,3-butanedione, n-pentane, ethylphenol, ethenylphenol,
2-(4-hydroxyphenyl)ethanol, 4-hydroxyphenylacetaldehyde,
1-(4-hydroxyphenyl)butane, 4-(4-hydroxyphenyl)-1-butene,
4-(4-hydroxyphenyl)-2-butene, 1-(4-hydroxyphenyl)-1-butene,
1-(4-hydroxyphenyl)-2-butanol, 4-(4-hydroxyphenyl)-2-butanol,
1-(4-hydroxyphenyl)-2-butanone, 4-(4-hydroxyphenyl)-2-butanone,
1-(4-hydroxyphenyl)-2,3-butanediol,
1-(4-hydroxyphenyl)-3-hydroxy-2-butanone,
4-(4-hydroxyphenyl)-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-2,3-butanedione, indolylethane,
indolylethene, 2-(indole-3-)ethanol, n-pentane, 1-pentene,
1-pentanol, pentanal, pentanoate, 2-pentene, 2-pentanol,
3-pentanol, 2-pentanone, 3-pentanone, 4-methylpentanal,
4-methylpentanol, 2,3-pentanediol, 2-hydroxy-3-pentanone,
3-hydroxy-2-pentanone, 2,3-pentanedione, 2-methylpentane,
4-methyl-1-pentene, 4-methyl-2-pentene, 4-methyl-3-pentene,
4-methyl-2-pentanol, 2-methyl-3-pentanol, 4-methyl-2-pentanone,
2-methyl-3-pentanone, 4-methyl-2,3-pentanediol,
4-methyl-2-hydroxy-3-pentanone, 4-methyl-3-hydroxy-2-pentanone,
4-methyl-2,3-pentanedione, 1-phenylpentane, 1-phenyl-1-pentene,
1-phenyl-2-pentene, 1-phenyl-3-pentene, 1-phenyl-2-pentanol,
1-phenyl-3-pentanol, 1-phenyl-2-pentanone, 1-phenyl-3-pentanone,
1-phenyl-2,3-pentanediol, 1-phenyl-2-hydroxy-3-pentanone,
1-phenyl-3-hydroxy-2-pentanone, 1-phenyl-2,3-pentanedione,
4-methyl-1-phenylpentane, 4-methyl-1-phenyl-1-pentene,
4-methyl-1-phenyl-2-pentene, 4-methyl-1-phenyl-3-pentene,
4-methyl-1-phenyl-3-pentanol, 4-methyl-1-phenyl-2-pentanol,
4-methyl-1-phenyl-3-pentanone, 4-methyl-1-phenyl-2-pentanone,
4-methyl-1-phenyl-2,3-pentanediol,
4-methyl-1-phenyl-2,3-pentanedione,
4-methyl-1-phenyl-3-hydroxy-2-pentanone,
4-methyl-1-phenyl-2-hydroxy-3-pentanone,
1-(4-hydroxyphenyl)pentane, 1-(4-hydroxyphenyl)-1-pentene,
1-(4-hydroxyphenyl)-2-pentene, 1-(4-hydroxyphenyl)-3-pentene,
1-(4-hydroxyphenyl)-2-pentanol, 1-(4-hydroxyphenyl)-3-pentanol,
1-(4-hydroxyphenyl)-2-pentanone, 1-(4-hydroxyphenyl)-3-pentanone,
1-(4-hydroxyphenyl)-2,3-pentanediol,
1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone,
1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone,
1-(4-hydroxyphenyl)-2,3-pentanedione,
4-methyl-1-(4-hydroxyphenyl)pentane,
4-methyl-1-(4-hydroxyphenyl)-2-pentene,
4-methyl-1-(4-hydroxyphenyl)-3-pentene,
4-methyl-1-(4-hydroxyphenyl)-1-pentene,
4-methyl-1-(4-hydroxyphenyl)-3-pentanol,
4-methyl-1-(4-hydroxyphenyl)-2-pentanol,
4-methyl-1-(4-hydroxyphenyl)-3-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2,3-pentanediol,
4-methyl-1-(4-hydroxyphenyl)-2,3-pentanedione,
4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-pentanone,
4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-pentanone,
1-indole-3-pentane, 1-(indole-3)-1-pentene, 1-(indole-3)-2-pentene,
1-(indole-3)-3-pentene, 1-(indole-3)-2-pentanol,
1-(indole-3)-3-pentanol, 1-(indole-3)-2-pentanone,
1-(indole-3)-3-pentanone, 1-(indole-3)-2,3-pentanediol,
1-(indole-3)-2-hydroxy-3-pentanone,
1-(indole-3)-3-hydroxy-2-pentanone, 1-(indole-3)-2,3-pentanedione,
4-methyl-1-(indole-3-)pentane, 4-methyl-1-(indole-3)-2-pentene,
4-methyl-1-(indole-3)-3-pentene, 4-methyl-1-(indole-3)-1-pentene,
4-methyl-2-(indole-3)-3-pentanol, 4-methyl-1-(indole-3)-2-pentanol,
4-methyl-1-(indole-3)-3-pentanone,
4-methyl-1-(indole-3)-2-pentanone,
4-methyl-1-(indole-3)-2,3-pentanediol,
4-methyl-1-(indole-3)-2,3-pentanedione,
4-methyl-1-(indole-3)-3-hydroxy-2-pentanone,
4-methyl-1-(indole-3)-2-hydroxy-3-pentanone, n-hexane, 1-hexene,
1-hexanol, hexanal, hexanoate, 2-hexene, 3-hexene, 2-hexanol,
3-hexanol, 2-hexanone, 3-hexanone, 2,3-hexanediol, 2,3-hexanedione,
3,4-hexanediol, 3,4-hexanedione, 2-hydroxy-3-hexanone,
3-hydroxy-2-hexanone, 3-hydroxy-4-hexanone, 4-hydroxy-3-hexanone,
2-methylhexane, 3-methylhexane, 2-methyl-2-hexene,
2-methyl-3-hexene, 5-methyl-1-hexene, 5-methyl-2-hexene,
4-methyl-1-hexene, 4-methyl-2-hexene, 3-methyl-3-hexene,
3-methyl-2-hexene, 3-methyl-1-hexene, 2-methyl-3-hexanol,
5-methyl-2-hexanol, 5-methyl-3-hexanol, 2-methyl-3-hexanone,
5-methyl-2-hexanone, 5-methyl-3-hexanone, 2-methyl-3,4-hexanediol,
2-methyl-3,4-hexanedione, 5-methyl-2,3-hexanediol,
5-methyl-2,3-hexanedione, 4-methyl-2,3-hexanediol,
4-methyl-2,3-hexanedione, 2-methyl-3-hydroxy-4-hexanone,
2-methyl-4-hydroxy-3-hexanone, 5-methyl-2-hydroxy-3-hexanone,
5-methyl-3-hydroxy-2-hexanone, 4-methyl-2-hydroxy-3-hexanone,
4-methyl-3-hydroxy-2-hexanone, 2,5-dimethylhexane,
2,5-dimethyl-2-hexene, 2,5-dimethyl-3-hexene,
2,5-dimethyl-3-hexanol, 2,5-dimethyl-3-hexanone,
2,5-dimethyl-3,4-hexanediol, 2,5-dimethyl-3,4-hexanedione,
2,5-dimethyl-3-hydroxy-4-hexanone, 5-methyl-1-phenylhexane,
4-methyl-1-phenylhexane, 5-methyl-1-phenyl-1-hexene,
5-methyl-1-phenyl-2-hexene, 5-methyl-1-phenyl-3-hexene,
4-methyl-1-phenyl-1-hexene, 4-methyl-1-phenyl-2-hexene,
4-methyl-1-phenyl-3-hexene, 5-methyl-1-phenyl-2-hexanol,
5-methyl-1-phenyl-3-hexanol, 4-methyl-1-phenyl-2-hexanol,
4-methyl-1-phenyl-3-hexanol, 5-methyl-1-phenyl-2-hexanone,
5-methyl-1-phenyl-3-hexanone, 4-methyl-1-phenyl-2-hexanone,
4-methyl-1-phenyl-3-hexanone, 5-methyl-1-phenyl-2,3-hexanediol,
4-methyl-1-phenyl-2,3-hexanediol,
5-methyl-1-phenyl-3-hydroxy-2-hexanone,
5-methyl-1-phenyl-2-hydroxy-3-hexanone,
4-methyl-1-phenyl-3-hydroxy-2-hexanone,
4-methyl-1-phenyl-2-hydroxy-3-hexanone,
5-methyl-1-phenyl-2,3-hexanedione,
4-methyl-1-phenyl-2,3-hexanedione,
4-methyl-1-(4-hydroxyphenyl)hexane,
5-methyl-1-(4-hydroxyphenyl)-1-hexene,
5-methyl-1-(4-hydroxyphenyl)-2-hexene,
5-methyl-1-(4-hydroxyphenyl)-3-hexene,
4-methyl-1-(4-hydroxyphenyl)-1-hexene,
4-methyl-1-(4-hydroxyphenyl)-2-hexene,
4-methyl-1-(4-hydroxyphenyl)-3-hexene,
5-methyl-1-(4-hydroxyphenyl)-2-hexanol,
5-methyl-1-(4-hydroxyphenyl)-3-hexanol,
4-methyl-1-(4-hydroxyphenyl)-2-hexanol,
4-methyl-1-(4-hydroxyphenyl)-3-hexanol,
5-methyl-1-(4-hydroxyphenyl)-2-hexanone,
5-methyl-1-(4-hydroxyphenyl)-3-hexanone,
4-methyl-1-(4-hydroxyphenyl)-2-hexanone,
4-methyl-1-(4-hydroxyphenyl)-3-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol,
4-methyl-1-(4-hydroxyphenyl)-2,3-hexanediol,
5-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone,
4-methyl-1-(4-hydroxyphenyl)-3-hydroxy-2-hexanone,
4-methyl-1-(4-hydroxyphenyl)-2-hydroxy-3-hexanone,
5-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione,
4-methyl-1-(4-hydroxyphenyl)-2,3-hexanedione,
4-methyl-1-(indole-3-)hexane, 5-methyl-1-(indole-3)-1-hexene,
5-methyl-1-(indole-3)-2-hexene, 5-methyl-1-(indole-3)-3-hexene,
4-methyl-1-(indole-3)-1-hexene, 4-methyl-1-(indole-3)-2-hexene,
4-methyl-1-(indole-3)-3-hexene, 5-methyl-1-(indole-3)-2-hexanol,
5-methyl-1-(indole-3)-3-hexanol, 4-methyl-1-(indole-3)-2-hexanol,
4-methyl-1-(indole-3)-3-hexanol, 5-methyl-1-(indole-3)-2-hexanone,
5-methyl-1-(indole-3)-3-hexanone, 4-methyl-1-(indole-3)-2-hexanone,
4-methyl-1-(indole-3)-3-hexanone,
5-methyl-1-(indole-3)-2,3-hexanediol,
4-methyl-1-(indole-3)-2,3-hexanediol,
5-methyl-1-(indole-3)-3-hydroxy-2-hexanone,
5-methyl-1-(indole-3)-2-hydroxy-3-hexanone,
4-methyl-1-(indole-3)-3-hydroxy-2-hexanone,
4-methyl-1-(indole-3)-2-hydroxy-3-hexanone,
5-methyl-1-(indole-3)-2,3-hexanedione,
4-methyl-1-(indole-3)-2,3-hexanedione, n-heptane, 1-heptene,
1-heptanol, heptanal, heptanoate, 2-heptene, 3-heptene, 2-heptanol,
3-heptanol, 4-heptanol, 2-heptanone, 3-heptanone, 4-heptanone,
2,3-heptanediol, 2,3-heptanedione, 3,4-heptanediol,
3,4-heptanedione, 2-hydroxy-3-heptanone, 3-hydroxy-2-heptanone,
3-hydroxy-4-heptanone, 4-hydroxy-3-heptanone, 2-methylheptane,
3-methylheptane, 6-methyl-2-heptene, 6-methyl-3-heptene,
2-methyl-3-heptene, 2-methyl-2-heptene, 5-methyl-2-heptene,
5-methyl-3-heptene, 3-methyl-3-heptene, 2-methyl-3-heptanol,
2-methyl-4-heptanol, 6-methyl-3-heptanol, 5-methyl-3-heptanol,
3-methyl-4-heptanol, 2-methyl-3-heptanone, 2-methyl-4-heptanone,
6-methyl-3-heptanone, 5-methyl-3-heptanone, 3-methyl-4-heptanone,
2-methyl-3,4-heptanediol, 2-methyl-3,4-heptanedione,
6-methyl-3,4-heptanediol, 6-methyl-3,4-heptanedione,
5-methyl-3,4-heptanediol, 5-methyl-3,4-heptanedione,
2-methyl-3-hydroxy-4-heptanone, 2-methyl-4-hydroxy-3-heptanone,
6-methyl-3-hydroxy-4-heptanone, 6-methyl-4-hydroxy-3-heptanone,
5-methyl-3-hydroxy-4-heptanone, 5-methyl-4-hydroxy-3-heptanone,
2,6-dimethylheptane, 2,5-dimethylheptane, 2,6-dimethyl-2-heptene,
2,6-dimethyl-3-heptene, 2,5-dimethyl-2-heptene,
2,5-dimethyl-3-heptene, 3,6-dimethyl-3-heptene,
2,6-dimethyl-3-heptanol, 2,6-dimethyl-4-heptanol,
2,5-dimethyl-3-heptanol, 2,5-dimethyl-4-heptanol,
2,6-dimethyl-3,4-heptanediol, 2,6-dimethyl-3,4-heptanedione,
2,5-dimethyl-3,4-heptanediol, 2,5-dimethyl-3,4-heptanedione,
2,6-dimethyl-3-hydroxy-4-heptanone,
2,6-dimethyl-4-hydroxy-3-heptanone,
2,5-dimethyl-3-hydroxy-4-heptanone,
2,5-dimethyl-4-hydroxy-3-heptanone, n-octane, 1-octene, 2-octene,
1-octanol, octanal, octanoate, 3-octene, 4-octene, 4-octanol,
4-octanone, 4,5-octanediol, 4,5-octanedione, 4-hydroxy-5-octanone,
2-methyloctane, 2-methyl-3-octene, 2-methyl-4-octene,
7-methyl-3-octene, 3-methyl-3-octene, 3-methyl-4-octene,
6-methyl-3-octene, 2-methyl-4-octanol, 7-methyl-4-octanol,
3-methyl-4-octanol, 6-methyl-4-octanol, 2-methyl-4-octanone,
7-methyl-4-octanone, 3-methyl-4-octanone, 6-methyl-4-octanone,
2-methyl-4,5-octanediol, 2-methyl-4,5-octanedione,
3-methyl-4,5-octanediol, 3-methyl-4,5-octanedione,
2-methyl-4-hydroxy-5-octanone, 2-methyl-5-hydroxy-4-octanone,
3-methyl-4-hydroxy-5-octanone, 3-methyl-5-hydroxy-4-octanone,
2,7-dimethyloctane, 2,7-dimethyl-3-octene, 2,7-dimethyl-4-octene,
2,7-dimethyl-4-octanol, 2,7-dimethyl-4-octanone,
2,7-dimethyl-4,5-octanediol, 2,7-dimethyl-4,5-octanedione,
2,7-dimethyl-4-hydroxy-5-octanone, 2,6-dimethyloctane,
2,6-dimethyl-3-octene, 2,6-dimethyl-4-octene,
3,7-dimethyl-3-octene, 2,6-dimethyl-4-octanol,
3,7-dimethyl-4-octanol, 2,6-dimethyl-4-octanone,
3,7-dimethyl-4-octanone, 2,6-dimethyl-4,5-octanediol,
2,6-dimethyl-4,5-octanedione, 2,6-dimethyl-4-hydroxy-5-octanone,
2,6-dimethyl-5-hydroxy-4-octanone, 3,6-dimethyloctane,
3,6-dimethyl-3-octene, 3,6-dimethyl-4-octene,
3,6-dimethyl-4-octanol, 3,6-dimethyl-4-octanone,
3,6-dimethyl-4,5-octanediol, 3,6-dimethyl-4,5-octanedione,
3,6-dimethyl-4-hydroxy-5-octanone, n-nonane, 1-nonene, 1-nonanol,
nonanal, nonanoate, 2-methylnonane, 2-methyl-4-nonene,
2-methyl-5-nonene, 8-methyl-4-nonene, 2-methyl-5-nonanol,
8-methyl-4-nonanol, 2-methyl-5-nonanone, 8-methyl-4-nonanone,
8-methyl-4,5-nonanediol, 8-methyl-4,5-nonanedione,
8-methyl-4-hydroxy-5-nonanone, 8-methyl-5-hydroxy-4-nonanone,
2,8-dimethylnonane, 2,8-dimethyl-3-nonene, 2,8-dimethyl-4-nonene,
2,8-dimethyl-5-nonene, 2,8-dimethyl-4-nonanol,
2,8-dimethyl-5-nonanol, 2,8-dimethyl-4-nonanone,
2,8-dimethyl-5-nonanone, 2,8-dimethyl-4,5-nonanediol,
2,8-dimethyl-4,5-nonanedione, 2,8-dimethyl-4-hydroxy-5-nonanone,
2,8-dimethyl-5-hydroxy-4-nonanone, 2,7-dimethylnonane,
3,8-dimethyl-3-nonene, 3,8-dimethyl-4-nonene,
3,8-dimethyl-5-nonene, 3,8-dimethyl-4-nonanol,
3,8-dimethyl-5-nonanol, 3,8-dimethyl-4-nonanone,
3,8-dimethyl-5-nonanone, 3,8-dimethyl-4,5-nonanediol,
3,8-dimethyl-4,5-nonanedione, 3,8-dimethyl-4-hydroxy-5-nonanone,
3,8-dimethyl-5-hydroxy-4-nonanone, n-decane, 1-decene, 1-decanol,
decanoate, 2,9-dimethyldecane, 2,9-dimethyl-3-decene,
2,9-dimethyl-4-decene, 2,9-dimethyl-5-decanol,
2,9-dimethyl-5-decanone, 2,9-dimethyl-5,6-decanediol,
2,9-dimethyl-6-hydroxy-5-decanone,
2,9-dimethyl-5,6-decanedione, n-undecane, 1-undecene, 1-undecanol,
undecanal, undecanoate, n-dodecane, 1-dodecene, 1-dodecanol,
dodecanal, dodecanoate, n-tridecane, 1-tridecene, 1-tridecanol,
tridecanal, tridecanoate, n-tetradecane, 1-tetradecene,
1-tetradecanol, tetradecanal, tetradecanoate, n-pentadecane,
1-pentadecene, 1-pentadecanol, pentadecanal, pentadecanoate,
n-hexadecane, 1-hexadecene, 1-hexadecanol, hexadecanal,
hexadecanoate, n-heptadecane, 1-heptadecene, 1-heptadecanol,
heptadecanal, heptadecanoate, n-octadecane, 1-octadecene,
1-octadecanol, octadecanal, octadecanoate, n-nonadecane,
1-nonadecene, 1-nonadecanol, nonadecanal, nonadecanoate, eicosane,
1-eicosene, 1-eicosanol, eicosanal, eicosanoate, 3-hydroxy
propanal, 1,3-propanediol, 4-hydroxybutanal, 1,4-butanediol,
3-hydroxy-2-butanone, 2,3-butanediol, 1,5-pentanediol, homocitrate,
homoisocitrate, b-hydroxy adipate, glutarate, glutarsemialdehyde,
glutaraldehyde, 2-hydroxy-1-cyclopentanone, 1,2-cyclopentanediol,
cyclopentanone, cyclopentanol, (S)-2-acetolactate,
(R)-2,3-Dihydroxy-isovalerate, 2-oxoisovalerate, isobutyryl-CoA,
isobutyrate, isobutyraldehyde, 5-amino pentaldehyde,
1,10-diaminodecane, 1,10-diamino-5-decene,
1,10-diamino-5-hydroxydecane, 1,10-diamino-5-decanone,
1,10-diamino-5,6-decanediol, 1,10-diamino-6-hydroxy-5-decanone,
phenylacetaldehyde, 1,4-diphenylbutane, 1,4-diphenyl-1-butene,
1,4-diphenyl-2-butene, 1,4-diphenyl-2-butanol,
1,4-diphenyl-2-butanone, 1,4-diphenyl-2,3-butanediol,
1,4-diphenyl-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-4-phenylbutane,
1-(4-hydroxyphenyl)-4-phenyl-1-butene,
1-(4-hydroxyphenyl)-4-phenyl-2-butene,
1-(4-hydroxyphenyl)-4-phenyl-2-butanol,
1-(4-hydroxyphenyl)-4-phenyl-2-butanone,
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol,
1-(4-hydroxyphenyl)-4-phenyl-3-hydroxy-2-butanone,
1-(indole-3)-4-phenylbutane, 1-(indole-3)-4-phenyl-1-butene,
1-(indole-3)-4-phenyl-2-butene, 1-(indole-3)-4-phenyl-2-butanol,
1-(indole-3)-4-phenyl-2-butanone,
1-(indole-3)-4-phenyl-2,3-butanediol,
1-(indole-3)-4-phenyl-3-hydroxy-2-butanone,
4-hydroxyphenylacetaldehyde, 1,4-di(4-hydroxyphenyl)butane,
1,4-di(4-hydroxyphenyl)-1-butene, 1,4-di(4-hydroxyphenyl)-2-butene,
1,4-di(4-hydroxyphenyl)-2-butanol,
1,4-di(4-hydroxyphenyl)-2-butanone,
1,4-di(4-hydroxyphenyl)-2,3-butanediol,
1,4-di(4-hydroxyphenyl)-3-hydroxy-2-butanone,
1-(4-hydroxyphenyl)-4-(indole-3-)butane,
1-(4-hydroxyphenyl)-4-(indole-3)-1-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butene,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanol,
1-(4-hydroxyphenyl)-4-(indole-3)-2-butanone,
1-(4-hydroxyphenyl)-4-(indole-3)-2,3-butanediol,
1-(4-hydroxyphenyl)-4-(indole-3)-3-hydroxy-2-butanone,
indole-3-acetaldehyde, 1,4-di(indole-3-)butane,
1,4-di(indole-3)-1-butene, 1,4-di(indole-3)-2-butene,
1,4-di(indole-3)-2-butanol, 1,4-di(indole-3)-2-butanone,
1,4-di(indole-3)-2,3-butanediol,
1,4-di(indole-3)-3-hydroxy-2-butanone, succinate semialdehyde,
hexane-1,8-dicarboxylic acid, 3-hexene-1,8-dicarboxylic acid,
3-hydroxy-hexane-1,8-dicarboxylic acid, 3-hexanone-1,8-dicarboxylic
acid, 3,4-hexanediol-1,8-dicarboxylic acid,
4-hydroxy-3-hexanone-1,8-dicarboxylic acid, fucoidan, iodine,
chlorophyll, carotenoid, calcium, magnesium, iron, sodium,
potassium, phosphate, and the like.
[0158] The recitation "optimized" as used herein refers to a
pathway, gene, polypeptide, enzyme, or other molecule having an
altered biological activity, such as by the genetic alteration of a
polypeptide's amino acid sequence or by the alteration/modification
of the polypeptide's surrounding cellular environment, to improve
its functional characteristics in relation to the original molecule
or original cellular environment (e.g., a wild-type sequence of a
given polypeptide or a wild-type microorganism). Any of the
polypeptides or enzymes described herein may be optionally
"optimized," and any of the genes or nucleotide sequences described
herein may optionally encode an optimized polypeptide or enzyme.
Any of the pathways described herein may optionally contain one or
more "optimized" enzymes, or one or more nucleotide sequences
encoding for an optimized enzyme or polypeptide.
[0159] Typically, the improved functional characteristics of the
polypeptide, enzyme, or other molecule relate to the suitability of
the polypeptide or other molecule for use in a biological pathway
(e.g., a biosynthesis pathway, a C-C ligation pathway) to convert
a monosaccharide or oligosaccharide into a biofuel. Certain
embodiments, therefore, contemplate the use of "optimized"
biological pathways. An exemplary "optimized" polypeptide may
contain one or more alterations or mutations in its amino acid
coding sequence (e.g., point mutations, deletions, addition of
heterologous sequences) that facilitate improved expression and/or
stability in a given microbial system or microorganism, allow
regulation of polypeptide activity in relation to a desired
substrate (e.g., inducible or repressible activity), modulate the
localization of the polypeptide within a cell (e.g., intracellular
localization, extracellular secretion), and/or affect the
polypeptide's overall level of activity in relation to a desired
substrate (e.g., reduce or increase enzymatic activity). A
polypeptide or other molecule may also be "optimized" for use with
a given microbial system or microorganism by altering one or more
pathways within that system or organism, such as by altering a
pathway that regulates the expression (e.g., up-regulation),
localization, and/or activity of the "optimized" polypeptide or
other molecule, or by altering a pathway that minimizes the
production of undesirable by-products, among other alterations. In
this manner, a polypeptide or other molecule may be "optimized"
with or without altering its wild-type amino acid sequence or
original chemical structure. Optimized polypeptides or biological
pathways may be obtained, for example, by direct mutagenesis or by
natural selection for a desired phenotype, according to techniques
known in the art.
[0160] In certain aspects, "optimized" genes or polypeptides may
comprise a nucleotide coding sequence or amino acid sequence that
is 50% to 99% identical (including all integers in between) to the
nucleotide or amino acid sequence of a reference (e.g., wild-type)
gene or polypeptide. In certain aspects, an "optimized" polypeptide
or enzyme may have about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40,
50, 100 (including all integers and decimal points in between e.g.,
1.2, 1.3, 1.4, 1.5, 5.5, 5.6, 5.7, 60, 70, etc.), or more times the
biological activity of a reference polypeptide.
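The identity range above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" as the gap character, and it uses the number of aligned columns as the denominator; other identity conventions exist, and the function names and the 50%-99% default bounds are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: both strings come from the same alignment, use '-' for gaps,
    and columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical
    # and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def in_identity_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """True when the identity falls inside the inclusive range [low, high]."""
    return low <= identity <= high
```

For example, `percent_identity("MKTA", "MKTG")` compares four aligned columns, three of which match, and the result can then be passed to `in_identity_range`.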
[0161] Certain aspects of the invention also include a commodity
chemical, such as a biofuel, that is produced according to the
methods and recombinant microorganisms described herein. Such a
biofuel (e.g., medium to long chain alkane) may be distinguished
from other fuels, such as those fuels produced by traditional
refinery from crude carbon sources, by radio-carbon dating
techniques. For instance, carbon has two stable, nonradioactive
isotopes: carbon-12 (.sup.12C), and carbon-13 (.sup.13C). In
addition, there are trace amounts of the unstable isotope carbon-14
(.sup.14C) on Earth. Carbon-14 has a half-life of 5730 years, and
would have long ago vanished from Earth were it not for the
unremitting impact of cosmic rays on nitrogen in the Earth's
atmosphere, which create more of this isotope. The neutrons
resulting from the cosmic ray interactions participate in the
following nuclear reaction on the atoms of nitrogen molecules
(N.sub.2) in the atmospheric air:
$$n + {}^{14}_{7}\mathrm{N} \rightarrow {}^{14}_{6}\mathrm{C} + p$$
[0162] Plants and other photosynthetic organisms take up
atmospheric carbon dioxide by photosynthesis. Since many plants are
ingested by animals, every living organism on Earth is constantly
exchanging carbon-14 with its environment for the duration of its
existence. Once an organism dies, however, this exchange stops, and
the amount of carbon-14 gradually decreases over time through
radioactive beta decay.
[0163] Most hydrocarbon-based fuels, such as crude oil and natural
gas derived from mining operations, are the result of compression
and heating of ancient organic materials (i.e., kerogen) over
geological time. Formation of petroleum typically occurs from
hydrocarbon pyrolysis, in a variety of mostly endothermic reactions
at high temperature and/or pressure. Today's oil formed from the
preserved remains of prehistoric zooplankton and algae, which had
settled to a sea or lake bottom in large quantities under anoxic
conditions (the remains of prehistoric terrestrial plants, on the
other hand, tended to form coal). Over geological time the organic
matter mixed with mud, and was buried under heavy layers of
sediment resulting in high levels of heat and pressure (known as
diagenesis). This process caused the organic matter to chemically
change, first into a waxy material known as kerogen which is found
in various oil shales around the world, and then with more heat
into liquid and gaseous hydrocarbons in a process known as
catagenesis. Most hydrocarbon based fuels derived from crude oil
have been undergoing a process of carbon-14 decay over geological
time, and, thus, will have little to no detectable carbon-14. In
contrast, certain biofuels produced by the living microorganisms of
the present invention will comprise carbon-14 at a level comparable
to all other presently living things (i.e., an equilibrium level).
In this manner, by measuring the carbon-12 to carbon-14 ratio of a
hydrocarbon-based biofuel of the present invention, and comparing
that ratio to a hydrocarbon based fuel derived from crude oil, the
biofuels produced by the methods provided herein can be
structurally distinguished from typical sources of hydrocarbon
based fuels.
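The contrast between biogenic and fossil carbon can be made concrete with a short sketch of exponential decay, using the 5730-year half-life stated above. The function names are illustrative, and the model assumes a closed sample with no carbon exchange after death; it is not a description of any particular measurement protocol.

```python
import math

HALF_LIFE_YEARS = 5730.0  # carbon-14 half-life as given in the text


def c14_fraction_remaining(age_years: float) -> float:
    """Fraction of the original carbon-14 left after age_years of decay."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    return 0.5 ** (age_years / HALF_LIFE_YEARS)


def apparent_age_years(fraction_remaining: float) -> float:
    """Invert the decay law: age implied by a measured remaining fraction."""
    if not 0.0 < fraction_remaining <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return -HALF_LIFE_YEARS * math.log2(fraction_remaining)
```

Under these assumptions, a freshly produced biofuel has a remaining fraction near 1, whereas material that has decayed for a million years (roughly 175 half-lives) has a fraction far below any practical detection limit.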
[0164] Embodiments of the present invention include methods for
converting a polysaccharide to a suitable monosaccharide
comprising, (a) obtaining the polysaccharide; and (b) contacting
the polysaccharide with a recombinant microorganism or microbial
system comprising such a microorganism for a time sufficient to
convert the polysaccharide to a suitable monosaccharide, wherein
the microbial system comprises, (i) at least one gene encoding and
expressing an enzyme selected from a lyase and a hydrolase, wherein
the lyase and/or hydrolase optionally comprises at least one signal
peptide or at least one autotransporter domain; (ii) at least one
gene encoding and expressing an enzyme selected from a
monosaccharide transporter, a disaccharide transporter, a
trisaccharide transporter, an oligosaccharide transporter, and a
polysaccharide transporter; and (iii) at least one gene encoding
and expressing an enzyme selected from a monosaccharide
dehydrogenase, an isomerase, a dehydratase, a kinase, and an
aldolase, thereby converting the polysaccharide to a suitable
monosaccharide.
[0165] Alternatively, certain aspects may include methods for
converting a polysaccharide to a suitable monosaccharide
comprising, (a) obtaining the polysaccharide; and (b) contacting
the polysaccharide with a microbial system for a time sufficient to
convert the polysaccharide to a suitable monosaccharide, wherein
the microbial system comprises, (i) at least one gene encoding and
expressing an enzyme selected from a lyase and a hydrolase; (ii) at
least one gene encoding and expressing a superchannel; and (iii) at
least one gene encoding and expressing an enzyme selected from a
monosaccharide dehydrogenase, an isomerase, a dehydratase, a
kinase, and an aldolase, thereby converting the polysaccharide to a
suitable monosaccharide.
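As a bookkeeping aid only, the three functional groups recited in the two preceding paragraphs can be sketched as a simple checklist. The group labels ("depolymerizing enzyme", "uptake", "monosaccharide conversion") and the function name are hypothetical and are not terms used in this application; the sketch merely checks whether a list of expressed functions covers at least one option per group and is not an interpretation of the scope of any embodiment.

```python
# Options per group, taken from items (i) through (iii) above.
# The group labels themselves are hypothetical, chosen for illustration.
REQUIRED_GROUPS = {
    "depolymerizing enzyme": {"lyase", "hydrolase"},
    "uptake": {
        "monosaccharide transporter",
        "disaccharide transporter",
        "trisaccharide transporter",
        "oligosaccharide transporter",
        "polysaccharide transporter",
        "superchannel",
    },
    "monosaccharide conversion": {
        "monosaccharide dehydrogenase",
        "isomerase",
        "dehydratase",
        "kinase",
        "aldolase",
    },
}


def missing_groups(expressed_functions):
    """Return the group labels for which no listed function is expressed."""
    expressed = {f.strip().lower() for f in expressed_functions}
    return [group for group, options in REQUIRED_GROUPS.items()
            if not (expressed & options)]
```

A design listing a lyase, a superchannel, and a kinase would return an empty list, while a design listing only a lyase would be reported as lacking the uptake and conversion groups.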
[0166] In certain embodiments, a microbial system or isolated
microorganism is capable of growing using a polysaccharide (e.g.,
alginate, pectin, etc.) as a sole source of carbon and/or energy. A
"sole source of carbon" refers generally to the ability to grow on
a given carbon source as the only carbon source in a given growth
medium.
[0167] With regard to alginate, approximately 50 percent of seaweed
dry-weight comprises various sugar components, among which alginate
and mannitol are major components corresponding to 30 and 15
percent of seaweed dry-weight, respectively. Although
microorganisms such as E. coli are generally considered as host
organisms in synthetic biology, and although such microorganisms
are able to metabolize mannitol, they completely lack the ability
to degrade and metabolize alginate. In this regard,
many laboratory or wild-type microorganisms, such as E. coli, are
unable to grow on alginate as a sole source of carbon. Similarly,
many organisms such as E. coli are unable to degrade and metabolize
pectin, a polysaccharide found in many food waste products, and,
thus are unable to grow on pectin as a sole source of carbon.
Accordingly, embodiments of the present application include
engineered microorganisms, such as E. coli, or microbial systems
containing such engineered microorganisms, that are capable of
using polysaccharides, such as alginate and pectin, as a sole
source of carbon and/or energy.
[0168] Alginate is a block co-polymer of .beta.-D-mannuronate (M)
and .alpha.-L-guluronate (G) (M and G are epimeric about the
C5-carboxyl group). Each alginate polymer comprises regions of all
M (polyM), all G (polyG), and/or the mixture of M and G (polyMG).
To utilize alginate to produce one or more suitable
monosaccharides, certain aspects of the present invention provide
an engineered or recombinant microorganism or microbial system that
is able to degrade or de-polymerize alginate and to use it as a
source of carbon and/or energy. As one means of accomplishing this
purpose, such recombinant microorganisms may incorporate a set of
polysaccharide degrading or depolymerizing enzymes such as alginate
lyases (ALs) into the microbial system.
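The block structure described above (polyM, polyG, and polyMG regions) can be illustrated with a toy classifier over a string of "M" and "G" residues. This is a sketch under stated assumptions: the function name and the `min_run` threshold are hypothetical, and a run of identical residues shorter than `min_run` is simply treated as part of a mixed (polyMG) region, which is a simplification rather than a biochemical definition.

```python
from itertools import groupby


def classify_blocks(chain: str, min_run: int = 2):
    """Split a chain of 'M'/'G' residues into (block_type, segment) pairs.

    Assumption: runs of at least min_run identical residues are homopolymer
    blocks (polyM or polyG); shorter runs are merged into polyMG blocks.
    """
    if set(chain) - {"M", "G"}:
        raise ValueError("chain may contain only 'M' and 'G'")
    blocks = []
    for residue, run in groupby(chain):
        segment = "".join(run)
        kind = f"poly{residue}" if len(segment) >= min_run else "polyMG"
        if blocks and kind == "polyMG" and blocks[-1][0] == "polyMG":
            # Extend the mixed block that is already open.
            blocks[-1] = ("polyMG", blocks[-1][1] + segment)
        else:
            blocks.append((kind, segment))
    return blocks
```

For instance, the chain `"MMMGGGMGMG"` splits into a polyM block, a polyG block, and a trailing polyMG block.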
[0169] ALs are mainly classified into two distinctive subfamilies
depending on their mode of catalysis: endo- (EC 4.2.2.3) and
exo-acting (EC 4.2.2.-) ALs. Endo-acting ALs are further classified
based on their catalytic specificity; M specific and G specific
ALs. The endo-acting ALs randomly cleave alginate via a
.beta.-elimination mechanism and mainly depolymerize alginate to
di-, tri- and tetrasaccharides. The uronate at the non-reducing
terminus of each oligosaccharide is converted to an unsaturated
sugar uronate, 4-deoxy-.alpha.-L-erythro-hex-4-ene pyranosyl uronate.
The exo-acting ALs catalyze further depolymerization of these
oligosaccharides and release unsaturated monosaccharides, which may
be non-enzymatically converted to monosaccharides, including
.alpha.-keto acid, 4-deoxy-.alpha.-L-erythro-hexoseulose uronate
(DEHU). Certain embodiments of an engineered microbial system or
isolated, engineered microorganism may include endoM-, endoG- and
exo-acting ALs to degrade or depolymerize aquatic or marine-biomass
polysaccharides such as alginate to a monosaccharide such as
DEHU.
[0170] Embodiments of the present invention may also include lyases
such as alginate lyases isolated from various sources, including,
but not limited to, marine algae, mollusks, and wide varieties of
microbes such as genus Pseudomonas, Vibrio, and Sphingomonas. Many
alginate lyases are endo-acting M specific, several are G specific,
and few are exo-acting. For example, ALs isolated from Sphingomonas
sp. strain A1 include five endo-acting ALs, A1-I, A1-II, A1-II',
A1-III, and A1-IV' and an exo-acting AL, A1-IV.
[0171] Typically, A1-I, A1-II, and A1-III have molecular weights of
66 kDa, 25 kDa, and 40 kDa, respectively. A1-II and A1-III are
self-splicing products of A1-I. A1-II may be more specific to G and
A1-III may be specific to M. A1-I may have high activity for both M
and G. A1-IV has a molecular weight of about 85 kDa and catalyzes
exo-lytic depolymerization of oligoalginate. A1-II' and A1-IV' are
functional homologues of A1-II and A1-IV, respectively. A1-II' has
endo-lytic activity and may have no preference for M or G. A1-IV'
has primarily endo-lytic activity. In addition to these ALs, exo-lytic
AL Atu3025 derived from Agrobacterium tumefaciens has high activity
for depolymerization of oligoalginate, and may be used in certain
embodiments of the present invention. Certain embodiments may
incorporate into the microbial system or isolated microorganism the
genes encoding A1-I, A1-II', A1-IV, and Atu3025, and may include
optimal codon usage for the suitable host organisms, such as E.
coli.
[0172] Certain examples of alginate lyases or oligoalginate lyases
that may be utilized herein include enzymes or polypeptides sharing
at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity
(including all integers in between) to SEQ ID NOS:67-68, which show
the nucleotide (SEQ ID NO:67) and polypeptide (SEQ ID NO:68)
sequences of oligoalginate lyase Atu3025 isolated from
Agrobacterium tumefaciens. Certain examples of alginate lyases that
may be utilized herein include enzymes or polypeptides sharing at
least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence identity
(including all integers in between) to the alginate lyase enzymes
described in FIGS. 37A-B, as well as the secreted alginate lyase
encoded by Vs24254 from Vibrio splendidus.
[0173] In certain embodiments, a microbial system or recombinant
microorganism may be engineered to secrete or display the lyases or
alginate lyases (ALs) to the culture media, such as by
incorporating a signal peptide or autotransporter domain into the
lyase. In this regard, it is typically understood that bacteria
have at least four different types of protein secretion machinery
(type I, II, III and IV). For example, in E. coli, the type II
secretion machinery is used for the secretion of recombinant
proteins. The type II secretion machinery may comprise a two-step
process: the translocation of premature proteins tagged with signal
peptides to the periplasm fraction and processing to the mature
proteins followed by secretion to media.
[0174] The first process may proceed by any of three different
pathways: secB-dependent pathway, signal recognition particle (SRP)
pathway, or twin-arginine translocation (TAT) pathway. Recombinant
proteins may be secreted into periplasm fraction. The fates of the
mature proteins vary depending on the type of protein. For
example, some proteins are secreted spontaneously by diffusion or
passively by a secretion apparatus named the secreton that consists of
12-16 proteins, and others stay in periplasm fraction and are
eventually degraded.
[0175] Some proteins may also be secreted by an autotransporter
apparatus, such as by utilizing an autotransporter domain. The
proteins secreted by autotransporter domains typically comprise an
N-terminal signal peptide that plays a role in translocation to the
periplasm, which may be mediated by secB or SRP pathways, passenger
domain, and/or C-terminal translocation unit (UT) having a
characteristic .beta.-barrel structure. The .beta.-barrel portion
of the UT builds an aqueous pore channel across the outer membrane
and helps the transportation of passenger domain to media.
Autodisplayed passenger proteins are often cleaved by the
autotransporter and set free to media.
[0176] The type I secretion machinery may also be used for the
secretion of recombinant proteins in E. coli. The type I secretion
machinery may be used for the secretion of high-molecular-weight
toxins and exoenzymes. The type I secretion machinery consists of
two inner membrane proteins (HlyB and HlyD) that are members of
the ATP binding cassette (ABC) transporter family, and an
endogenous outer membrane protein (TolC). The secretion of
recombinant proteins based on type I secretion machinery may
utilize the C-terminal region of .alpha.-haemolysin (HlyA) as a
signal sequence. The recombinant proteins may readily pass through
the inner membrane, periplasm, and outer membrane through the type
I secretion machinery.
[0177] Depending on the types of linker and signal peptides
utilized by various embodiments of the present application, both
autotransporter and type I secretion machinery can be adapted into
cell surface display machinery. Alternatively, a system
specific to cell surface display may be used. For example, in this
system, target proteins may be fused to PgsA protein (a
poly-.gamma.-glutamate synthetase complex) that is natively
displayed on the surface of Bacillus subtilis.
[0178] Certain embodiments may include lyases such as alginate
lyases fused with various signal peptides and/or autotransporter
domains found in proteins secreted by both type I and type II
secretion machinery. Other embodiments may include lyases such as
alginate lyases fused with any combination of signal peptides and/or
autotransporter domains found in proteins secreted by transport
machinery as described herein or known to a person skilled in the
art. Embodiments may also include signal peptides or
autotransporter domains that are experimentally redesigned to
maximize the secretion of lyases such as alginate lyases to the
culture media, and may also include the use of many different
linker sequences that fuse signal peptides, lyases, and
autotransporters that improve the efficiency of secretion or the
cell surface presentation of lyases.
[0179] Certain embodiments may include a microbial system or
isolated microorganism that comprises saccharide transporters, which
are able to transport monosaccharides (e.g., DEHU) and
oligosaccharides from the media to the cytosol to efficiently
utilize these monosaccharides as a source of carbon and/or energy.
For instance, genes encoding monosaccharide permeases (i.e.,
monosaccharide transporters) such as DEHU permeases may be isolated
from bacteria that grow on polysaccharides such as alginate as a
source of carbon and/or energy, and may be incorporated into
embodiments of the present microbial system or isolated
microorganism. As an additional example, embodiments may also
include redesigned native permeases or transporters with altered
specificity for monosaccharide (e.g., DEHU) transportation.
[0180] In this regard, E. coli contains several permeases able to
transport monosaccharides, which include, but are not limited to,
KdgT for 2-keto-3-deoxy-D-gluconate (KDG) transporter, ExuT for
aldohexuronates such as D-galacturonate and D-glucuronate
transporter, GntT, GntU, and GntP for gluconate transporter,
and KgtP for proton-driven .alpha.-ketoglutarate transporter.
Microbial systems or recombinant microorganisms described herein
may comprise any of these permeases, in addition to those permeases
known to a person of skill in the art and not mentioned herein, and
may also include permease enzymes redesigned to transport other
monosaccharides, such as DEHU.
[0181] A microbial system or recombinant microorganism according to
the present invention may also comprise
permeases/transporters/superchannels/porins that catalyze the
transport of monosaccharides (e.g., D-mannuronate and D-lyxose)
from media to the periplasm or cytosol of a microorganism. For
example, genes encoding the permeases of D-mannuronate in soil
Aeromonas may be incorporated into a microbial system as described
herein.
[0182] As one alternative example, a microbial system or
microorganism may comprise native permeases/transporters that are
redesigned to alter their specificity for efficient monosaccharide
transportation, such as for D-mannuronate and D-lyxose
transportation. For instance, E. coli contains several permeases
that are able to transport monosaccharides or sugars such as
D-mannonate and D-lyxose, including KdgT for
2-keto-3-deoxy-D-gluconate (KDG) transporter, ExuT for
aldohexuronates such as D-galacturonate and D-glucuronate
transporter, GntPTU for gluconate/fructuronate transporter, uidB
for glucuronide transporter, fucP for L-fucose transporter, galP
for galactose transporter, yghK for glycolate transporter, dgoT for
D-galactonate transporter, uhpT for hexose phosphate transporter,
dctA for orotate/citrate transporter, gntUT for gluconate
transporter, malEGF for maltose transporter, alsABC for D-allose
transporter, idnT for L-idonate/D-gluconate transporter, KgtP for
proton-driven .alpha.-ketoglutarate transporter, lacY for
lactose/galactose transporter, xylEFGH for D-xylose transporter,
araEFGH for L-arabinose transporter, and rbsABC for D-ribose
transporter. In certain embodiments, a microbial system or
recombinant microorganism may comprise permeases or transporters as
described above, including those that are re-designed or optimized
for improved transport of certain monosaccharides, such as
D-mannuronate, DEHU, and D-lyxose.
[0183] Certain aspects may employ a recombinant microorganism that
comprises a "superchannel," by which aquatic or marine-biomass
polysaccharides such as alginate polymers, or fruit or vegetable
biomass such as pectin polymers, may be directly incorporated into
the cytosol and degraded inside the microbial system. For instance,
a group of bacteria characterized as Sphingomonads have a wide
range in capability of degrading environmentally hazardous
compounds such as polychlorinated polycyclic aromatics (dioxin).
These bacteria contain characteristic large pleat-like molecules on
their cell surfaces. In this regard, certain Sphingomonads have
structures characterized as "superchannels" that enable the
bacteria to directly take up macromolecules.
[0184] As one particular example of a microorganism comprising a
superchannel, Sphingomonas sp. strain A1 directly incorporates
polysaccharides such as alginate through a superchannel. Such
superchannels may consist of a pit on the outer membrane (e.g.,
AlgR), alginate-binding proteins in the periplasm (e.g., AlgQ1 and
AlgQ2), and an ATP-binding cassette (ABC) transporter (e.g.,
AlgM1, AlgM2, and AlgS). Incorporated polysaccharides such as
alginate may be readily depolymerized by lyases such as alginate
lyases produced in the cytosol. Thus, certain embodiments may
incorporate genes encoding a superchannel (e.g., ccpA, algS, algM1,
algM2, algQ1, algQ2) to introduce this ability to the microbial
system or recombinant microorganism. Other embodiments may include
microorganisms such as Sphingomonas subarctica IFO 16058.sup.T,
which harbor the plasmid containing genes that encode a
superchannel, and which have significantly improved ability to
utilize marine or aquatic biomass polysaccharides such as alginate
as a source of carbon and/or energy. Certain recombinant
microorganisms may employ these superchannel encoding plasmid
sequences contained within Sphingomonas subarctica IFO
16058.sup.T.
[0185] Certain examples of alginate ABC transporters that may be
utilized herein, include ABC transporters Atu3021, Atu3022,
Atu3023, Atu3024, algM1, algM2, AlgQ1, AlgQ2, AlgS,
OG2516.sub.--05558, OG2516.sub.--05563, OG2516.sub.--05568, and
OG2516.sub.--05573, including functional variants thereof. Certain
examples of alginate symporters that may be utilized herein include
symporters V12B01.sub.--24239 and V12B01.sub.--24194, among others,
including functional variants thereof. One additional example of an
alginate porin includes V12B01.sub.--24269, and variants
thereof.
[0186] As noted above, certain embodiments may include recombinant
microorganisms that comprise one or more monosaccharide
dehydrogenases, isomerases, dehydratases, kinases, and aldolases.
With regard to monosaccharide dehydrogenases, certain microbial
systems or recombinant microorganisms may incorporate enzymes that
reduce various monosaccharides (e.g., DEHU, mannuronate) to a
monosaccharide that is suitable for biofuel biosynthesis, such as
2-keto-3-deoxy-D-gluconate (KDG) or D-mannitol. Such exemplary
enzymes include, for example, DEHU hydrogenases and mannuronate
hydrogenases, in addition to various alcohol dehydrogenases having
DEHU hydrogenase and/or mannuronate dehydrogenase activity, such as
the novel ADH1 through ADH12 enzymes isolated from Agrobacterium
tumefaciens C58 (see, e.g., SEQ ID NOS:69-92).
[0187] For more detail on the ADH1 through ADH12 enzymes, SEQ ID
NO:69 shows the nucleotide and SEQ ID NO:70 shows the polypeptide
sequence of ADH1 Atu1557 isolated from Agrobacterium tumefaciens
C58. SEQ ID NO:71 shows the nucleotide and SEQ ID NO:72 shows the
polypeptide sequence of ADH2 Atu2022 isolated from Agrobacterium
tumefaciens C58. SEQ ID NO:73 shows the nucleotide and SEQ ID NO:74
shows the polypeptide sequence of ADH3 Atu0626 isolated from
Agrobacterium tumefaciens C58.
[0188] SEQ ID NO:75 shows the nucleotide and SEQ ID NO:76 shows the
polypeptide sequence of ADH4 Atu5240 isolated from Agrobacterium
tumefaciens C58. SEQ ID NO:77 shows the nucleotide and SEQ ID NO:78
shows the polypeptide sequence of ADH5 Atu3163 isolated from
Agrobacterium tumefaciens C58. SEQ ID NO:79 shows the nucleotide
and SEQ ID NO:80 shows the polypeptide sequence of ADH6 Atu2151
isolated from Agrobacterium tumefaciens C58.
[0189] SEQ ID NO:81 shows the nucleotide and SEQ ID NO:82 shows the
polypeptide sequence of ADH7 Atu2814 isolated from Agrobacterium
tumefaciens C58. SEQ ID NO:83 shows the nucleotide and SEQ ID NO:84
shows the polypeptide sequence of ADH8 Atu5447 isolated from
Agrobacterium tumefaciens C58. SEQ ID NO:85 shows the nucleotide
and SEQ ID NO:86 shows the polypeptide sequence of ADH9 Atu4087
isolated from Agrobacterium tumefaciens C58.
[0190] SEQ ID NO:87 shows the nucleotide and SEQ ID NO:88 shows the
polypeptide sequence of ADH10 Atu4289 isolated from Agrobacterium
tumefaciens C58. SEQ ID NO:89 shows the nucleotide and SEQ ID NO:90
shows the polypeptide sequence of ADH11 Atu3027 isolated from
Agrobacterium tumefaciens C58. SEQ ID NO:91 shows the nucleotide
and SEQ ID NO:92 shows the polypeptide sequence of ADH12 Atu3026
isolated from Agrobacterium tumefaciens C58.
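The numbering pattern in the four preceding paragraphs is regular: for ADH1 through ADH12, the nucleotide sequence is at SEQ ID NO:69 plus twice the enzyme index less one, and the polypeptide sequence is the next number. A minimal lookup sketch follows; the locus tags are copied from the text above, while the function name and the returned field names are illustrative assumptions.

```python
# Locus tags for ADH1..ADH12 in order, as listed in the text above.
ADH_LOCUS_TAGS = [
    "Atu1557", "Atu2022", "Atu0626", "Atu5240", "Atu3163", "Atu2151",
    "Atu2814", "Atu5447", "Atu4087", "Atu4289", "Atu3027", "Atu3026",
]

FIRST_NUCLEOTIDE_SEQ_ID = 69  # SEQ ID NO of the ADH1 nucleotide sequence


def adh_seq_ids(n: int) -> dict:
    """Return the locus tag and SEQ ID NOs for ADHn, with n from 1 to 12."""
    if not 1 <= n <= len(ADH_LOCUS_TAGS):
        raise ValueError("n must be between 1 and 12")
    nucleotide = FIRST_NUCLEOTIDE_SEQ_ID + 2 * (n - 1)
    return {
        "name": f"ADH{n}",
        "locus_tag": ADH_LOCUS_TAGS[n - 1],
        "nucleotide_seq_id": nucleotide,
        "polypeptide_seq_id": nucleotide + 1,
    }
```

Calling `adh_seq_ids(12)`, for example, returns the entry for ADH12 (Atu3026) with SEQ ID NOS:91 and 92, matching the last sentence above.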
[0191] Further examples of enzymes having dehydrogenase activity
include Atu3026, Atu3027, OG2516.sub.--05543, OG2516.sub.--05538
and V12B01.sub.--24244. The microorganisms and methods of the
present invention may also utilize biologically active fragments
and variants of these hydrogenase enzymes, including optimized
variants thereof.
[0192] As a further example, Pseudomonas grown using alginate as a
sole source of carbon and energy comprises a DEHU hydrogenase
enzyme that uses NADPH as a co-factor, is more stable when
NADP.sup.+ is present in the solution, and is active at ambient pH.
Thus, certain embodiments of a microbial system or a recombinant
microorganism as described herein may incorporate genes encoding
hydrogenases such as DEHU or mannuronate hydrogenase derived or
obtained from various microbes, in which these microbes may be
capable of growing on polysaccharides such as alginate or pectin as
a source of carbon and/or energy.
[0193] Certain embodiments may incorporate components of a
microbial system or isolated microorganism that is capable of
efficiently growing on monosaccharides such as D-mannuronate or
D-lyxose as a source of carbon and energy. For instance, both
Aeromonas and Aerobacter aerogenes PRL-R3 comprise genes encoding
monosaccharide dehydrogenases such as D-mannuronate hydrogenase and
D-lyxose isomerase. Thus, certain microbial systems or recombinant
microorganisms may comprise monosaccharide dehydrogenases such as
D-mannuronate hydrogenase and D-lyxose isomerase from Aeromonas,
Aerobacter aerogenes PRL-R3, or various other suitable
microorganisms, including those microorganisms capable of growing
on D-mannuronate or D-lyxose as a source of carbon and energy.
[0194] Certain embodiments may include a microbial system or
isolated microorganism with enhanced efficiency for converting
monosaccharides such as D-mannonate and D-xylulose into
monosaccharides suitable for a biofuel biosynthesis pathway such as
KDG. Merely by way of explanation, D-mannonate and D-xylulose are
metabolites in microbes such as E. coli. D-mannonate is converted
by a D-mannonate dehydratase to KDG. D-xylulose enters the pentose
phosphate pathway. Thus, to increase conversion of D-mannonate to
KDG, an exogenous or endogenous D-mannonate dehydratase (e.g.,
uxuA) gene may be over-expressed in a recombinant microorganism of
the invention. Similarly, in other embodiments, suitable endogenous
or exogenous genes such as kinases (e.g., kdgK), nad, as well as
KDG aldolases (e.g., kdgA and eda) may be either incorporated or
overexpressed in a given recombinant microorganism (see SEQ ID
NOS:93-96), including biologically active variants or fragments
thereof, such as optimized variants of these genes. SEQ ID NO:93
shows the nucleotide sequence and SEQ ID NO:94 shows the
polypeptide sequence of a 2-keto-deoxy gluconate kinase (KdgK) from
Escherichia coli DH10B. SEQ ID NO:95 shows the nucleotide sequence
and SEQ ID NO:96 shows the polypeptide sequence of a 2-keto-deoxy
gluconate-6-phosphate aldolase (KdgA) from Escherichia coli
DH10B.
[0195] In certain aspects, as noted above, a recombinant
microorganism that is capable of growing on alginate or pectin as a
sole source of carbon may utilize a naturally-occurring or
endogenous copy of a dehydratase, kinase, and/or aldolase. For
instance, E. coli contains endogenous dehydratases, kinases, and
aldolases that are capable of catalyzing the appropriate steps in
the conversion of polysaccharides to a suitable monosaccharide. In
these and other related aspects, the naturally-occurring
dehydratase or kinase may also be over-expressed, such as by
providing an exogenous copy of the naturally-occurring dehydratase,
kinase or aldolase operably linked to a highly constitutive or
inducible promoter.
[0196] As one exemplary source of enzymes for engineering a
recombinant microorganism to grow on alginate as a sole source of
carbon, Vibrio splendidus is known to be able to metabolize
alginate to support growth. For example, SEQ ID NO:1 shows a
secretome region carrying certain Vibrio splendidus genes
(V12B01.sub.--02425 to V12B01.sub.--02480), which encodes a type II
secretion apparatus. SEQ ID NO:2 shows the nucleotide sequence of
an entire genomic region from V12B01.sub.--24189 to
V12B01.sub.--24249, which was derived from Vibrio splendidus, and
which when transformed into E. coli as a fosmid clone was
sufficient to confer the ability to grow on alginate as a sole
source of carbon. SEQ ID NOS:3-64 show the individual putative
genes contained within SEQ ID NO:2. Thus, in certain aspects, a
recombinant microorganism (e.g., E. coli) that is able to grow on
alginate as a sole source of carbon and/or energy may comprise one
or more nucleotide or polypeptide reference sequences described in
SEQ ID NOS:1-64, including biologically active fragments or
variants thereof, such as optimized variants.
[0197] In certain aspects, a recombinant microorganism that is able
to grow on alginate as a sole source of carbon may contain certain
coding nucleotide or polypeptide sequences contained within SEQ ID
NO:2, such as the sequences in SEQ ID NOS:3-64, or biologically
active fragments or variants thereof, including optimized variants.
These sequences are described in further detail below.
[0198] SEQ ID NO:3 shows the nucleotide coding sequence of the
putative protein V12B01.sub.--24184. This putative coding sequence
is contained within the polynucleotide sequence of SEQ ID NO:2, and
encodes a polypeptide that is similar to an autotransporter
adhesin or type I secretion target ggxgxdxxx (SEQ ID NO:145)
repeat. SEQ ID NO:4 shows the polypeptide sequence of putative
protein V12B01.sub.--24184, encoded by the polynucleotide of SEQ ID
NO:3. This putative polypeptide is similar to an autotransporter
adhesin or type I secretion target ggxgxdxxx (SEQ ID NO:145)
repeat.
[0199] SEQ ID NO:5 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24189. SEQ ID NO:6 shows the
polypeptide sequence of the putative protein V12B01.sub.--24189,
which is similar to cyclohexadienyl dehydratase.
[0200] SEQ ID NO:7 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24194. SEQ ID NO:8 shows the
polypeptide sequence of the putative protein V12B01.sub.--24194,
which is similar to a Na/proline transporter.
[0201] SEQ ID NO:9 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24199. SEQ ID NO:10 shows the
polypeptide sequence of the putative protein V12B01.sub.--24199,
which is similar to a keto-deoxy-phosphogluconate aldolase.
[0202] SEQ ID NO:11 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24204. SEQ ID NO:12 shows the
polypeptide sequence of the putative protein V12B01.sub.--24204,
which is similar to 2-dehydro-3-deoxygluconokinase.
[0203] SEQ ID NO:13 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24209. SEQ ID NO:14 shows the
polypeptide sequence of the putative protein
V12B01.sub.--24209.
[0204] SEQ ID NO:15 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24214. SEQ ID NO:16 shows the
polypeptide sequence of the putative protein V12B01.sub.--24214,
which is similar to a chondroitin AC/alginate lyase.
[0205] SEQ ID NO:17 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24219. SEQ ID NO:18 shows the
polypeptide sequence of the putative protein V12B01.sub.--24219,
which is similar to a chondroitin AC/alginate lyase.
[0206] SEQ ID NO:19 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24224. SEQ ID NO:20 shows the
polypeptide sequence of the putative protein V12B01.sub.--24224,
which is similar to a 2-keto-4-pentenoate
hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase.
[0207] SEQ ID NO:21 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24229. SEQ ID NO:22 shows the
polypeptide sequence of the putative protein V12B01.sub.--24229,
which is similar to a GntR-family transcriptional regulator.
[0208] SEQ ID NO:23 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24234. SEQ ID NO:24 shows the
polypeptide sequence of the putative protein V12B01.sub.--24234,
which is similar to a Na.sup.+/proline symporter.
[0209] SEQ ID NO:25 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24239. SEQ ID NO:26 shows the
polypeptide sequence of the putative protein V12B01.sub.--24239,
which is similar to an oligoalginate lyase.
[0210] SEQ ID NO:27 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24244. SEQ ID NO:28 shows the
polypeptide sequence of putative protein V12B01.sub.--24244, which
is similar to a 3-hydroxyisobutyrate dehydrogenase.
[0211] SEQ ID NO:29 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24249. SEQ ID NO:30 shows the
polypeptide sequence of the putative protein V12B01.sub.--24249,
which is similar to a methyl-accepting chemotaxis protein.
[0212] SEQ ID NO:31 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24254. SEQ ID NO:32 shows the
polypeptide sequence of putative protein V12B01.sub.--24254, which
is similar to an alginate lyase.
[0213] SEQ ID NO:33 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24259. SEQ ID NO:34 shows the
polypeptide sequence of putative protein V12B01.sub.--24259, which
is similar to an alginate lyase.
[0214] SEQ ID NO:35 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24264. SEQ ID NO:36 shows the
polypeptide sequence of putative protein V12B01.sub.--24264.
[0215] SEQ ID NO:37 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24269. SEQ ID NO:38 shows the
polypeptide sequence of putative protein V12B01.sub.--24269, which
is similar to a putative oligogalacturonate specific porin.
[0216] SEQ ID NO:39 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--24274. SEQ ID NO:40 shows the
polypeptide sequence of putative protein V12B01.sub.--24274, which
is similar to an alginate lyase.
[0217] FIG. 32 shows the nucleotide coding sequence and polypeptide
sequence of putative protein V12B01.sub.--02425. FIG. 32A shows the
nucleotide sequence that encodes the putative protein
V12B01.sub.--02425 (SEQ ID NO:41). FIG. 32B shows the polypeptide
sequence of putative protein V12B01.sub.--02425 (SEQ ID NO:42),
which is similar to a type II secretory pathway component EpsC.
[0218] SEQ ID NO:43 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02430. SEQ ID NO:44 shows the
polypeptide sequence of putative protein V12B01.sub.--02430, which
is similar to a type II secretory pathway component EpsD.
[0219] SEQ ID NO:45 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02435. SEQ ID NO:46 shows the
polypeptide sequence of putative protein V12B01.sub.--02435, which
is similar to a type II secretory pathway component EpsE.
[0220] SEQ ID NO:47 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02440. SEQ ID NO:48 shows the
polypeptide sequence of putative protein V12B01.sub.--02440, which
is similar to a type II secretory pathway component EpsF.
[0221] SEQ ID NO:49 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02445. SEQ ID NO:50 shows the
polypeptide sequence of putative protein V12B01.sub.--02445, which
is similar to a type II secretory pathway component EpsG.
[0222] SEQ ID NO:51 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02450. SEQ ID NO:52 shows the
polypeptide sequence of putative protein V12B01.sub.--02450, which
is similar to a type II secretory pathway component EpsH.
[0223] SEQ ID NO:53 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02455. SEQ ID NO:54 shows the
polypeptide sequence of putative protein V12B01.sub.--02455, which
is similar to a type II secretory pathway component EpsI.
[0224] SEQ ID NO:55 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02460. SEQ ID NO:56 shows the
polypeptide sequence of putative protein V12B01.sub.--02460, which
is similar to a type II secretory pathway component EpsJ.
[0225] SEQ ID NO:57 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02465. SEQ ID NO:58 shows the
polypeptide sequence of putative protein V12B01.sub.--02465, which
is similar to a type II secretory pathway component EpsK.
[0226] SEQ ID NO:59 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02470. SEQ ID NO:60 shows the
polypeptide sequence of putative protein V12B01.sub.--02470, which
is similar to a type II secretory pathway component EpsL.
[0227] SEQ ID NO:61 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02475. SEQ ID NO:62 shows the
polypeptide sequence of putative protein V12B01.sub.--02475, which
is similar to a type II secretory pathway component EpsM.
[0228] SEQ ID NO:63 shows the nucleotide sequence that encodes the
putative protein V12B01.sub.--02480. SEQ ID NO:64 shows the
polypeptide sequence of putative protein V12B01.sub.--02480, which
is similar to a type II secretory pathway component EpsC.
[0229] As a further exemplary source of enzymes for engineering a
microorganism to grow on alginate, Agrobacterium tumefaciens C58 is
able to metabolize relatively small sizes of alginate molecules
(.about.1000 mers) as a sole source of carbon and energy. Since A.
tumefaciens C58 has long been used for plant biotechnology, the
genetics of this organism has been relatively well studied, and
many genetic tools are available and compatible with other
gram-negative bacteria such as E. coli. Thus, certain aspects may
employ this microbe, or the genes therein, for the production of
suitable monosaccharides. For instance, as noted above, the present
disclosure provides a series of novel ADH genes having both DEHU
and mannuronate hydrogenase activity that were obtained from
Agrobacterium tumefaciens C58 (see SEQ ID NOS: 67-92).
[0230] As noted above, certain aspects may include a recombinant
microorganism or microbial system that is capable of growing on
pectin as a sole source of carbon and/or energy. Pectin is a linear
chain of .alpha.-(1-4)-linked D-galacturonic acid that forms the
pectin-backbone, a homogalacturonan. Into this backbone, there are
regions where galacturonic acid is replaced by (1-2)-linked
L-rhamnose. From rhamnose, side chains of various neutral sugars
typically branch off. This type of pectin is called
rhamnogalacturonan I. Overall, up to about every 25th galacturonic
acid in the main chain is exchanged with rhamnose. Some stretches
consist of alternating galacturonic acid and rhamnose ("hairy
regions"), while others have a lower density of rhamnose ("smooth
regions").
The neutral sugars mainly comprise D-galactose, L-arabinose and
D-xylose; the types and proportions of neutral sugars vary with the
origin of pectin. In nature, around 80% of carboxyl groups of
galacturonic acid are esterified with methanol. Some plants, like
sugar-beet, potatoes and pears, contain pectins with acetylated
galacturonic acid in addition to methyl esters. Acetylation
prevents gel-formation but increases the stabilising and
emulsifying effects of pectin. Certain pectin degradation and
metabolic pathways are exemplified in FIG. 3.
[0231] In addition to the genes, enzymes, and biological pathways
described above, certain recombinant microorganisms may incorporate
features that are useful for growth on pectin as a sole source of
carbon. For instance, to degrade and metabolize pectin as a sole
source of carbon, pectin methyl and acetyl esterases first catalyze
the hydrolysis of methyl and acetyl esters on pectin. Examples of
pectin methyl esterases include, but are not limited to, pemA and
pmeB. Examples of pectin acetyl esterases include, but are not
limited to, PaeX and PaeY. Further examples of pectin methyl
esterases that may be utilized herein include enzymes or
polypeptides sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more
sequence identity (including all integers in between) to the
pectate methyl esterases in FIGS. 40A-B. Further examples of
pectate acetyl esterases that may be utilized herein include
enzymes or polypeptides sharing at least 60%, 70%, 80%, 90%, 95%,
98%, or more sequence identity (including all integers in between)
to the pectate acetyl esterases described in FIG. 41.
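Merely by way of illustration, the sequence identity thresholds recited above may be checked with a short routine such as the following Python sketch. It assumes the two sequences have already been aligned to equal length. The function names, the gap character, and the choice of denominator (all columns in which at least one sequence has a residue) are assumptions made for this example only; the disclosure does not prescribe a particular alignment tool or identity formula, and other conventions exist.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical residue columns divided by the number
    of columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # gap-gap columns carry no information
        columns += 1
        if a == b:  # both are residues here, since gap-gap was skipped
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under these assumptions, a candidate esterase would be compared against a reference sequence by calling `meets_threshold` with a value such as 60, 70, 80, 90, 95, or 98.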
[0232] Further to this end, pectate lyases and hydrolases may
catalyze the endolytic cleavage of pectate via .beta.-elimination
and hydrolysis, respectively, to produce oligopectates. Examples
of pectate lyases include, but are not limited to, PelA, PelB,
PelC, PelD, PelE, PelF, PelI, PelL, and PelZ. Examples of pectate
hydrolases include, but are not limited to, PehA, PehN, PehV, PehW,
and PehX. Further examples of pectate lyases include polypeptides
or enzymes sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more
sequence identity (including all integers in between) to the
pectate lyases described in FIGS. 38A-E.
[0233] Polygalacturonases, rhamnogalacturonan lyases, and
rhamnogalacturonan hydrolases may also be utilized herein to
degrade and metabolize pectin. Examples of rhamnogalacturonan
lyases include polypeptides or enzymes sharing at least 60%, 70%,
80%, 90%, 95%, 98%, or more sequence identity (including all
integers in between) to the rhamnogalacturonan lyases (i.e.,
rhamnogalacturonases) described in FIG. 39A. Examples of
rhamnogalacturonate hydrolases include polypeptides or enzymes
sharing at least 60%, 70%, 80%, 90%, 95%, 98%, or more sequence
identity (including all integers in between) to the
rhamnogalacturonate hydrolases described in FIG. 39B.
[0234] Thus, to degrade and metabolize pectin, certain of the
recombinant microorganisms and methods of the present invention may
incorporate one or more of the above noted methyl and acetyl
esterases, lyases, and/or hydrolases, among others known in the
art. These enzymes may be encoded and expressed by endogenous
or exogenous genes, and may also include biologically active
fragments or variants thereof, such as homologs, orthologs, and/or
optimized variants of these enzymes.
[0235] To further metabolize the degradation products of pectin,
oligopectates may be transported into the periplasm fraction of
gram-negative bacteria by outer membrane porins, where they are
further degraded into such components as di- and
tri-galacturonates. Examples of outer membrane porins that
can transport oligopectates into the periplasm include, but are not
limited to, kdgN and kdgM. Certain recombinant microorganisms may
incorporate these or similar genes.
[0236] Di- and tri-galacturonates may then be transported into the
cytosol for further degradation. Bacteria contain at least two
different transporter systems responsible for di- and
tri-galacturonate transport, including a symporter and an ABC
transporter (e.g., TogT and TogMNAB, respectively). Thus, certain
of the recombinant microorganisms provided herein may comprise one
or more di- or tri-galacturonate transporter systems, such as
TogT and/or TogMNAB.
[0237] Once di- and trigalacturonate are incorporated into the
cytosol, short pectate or galacturonate lyases break them down to
D-galacturonate and (4S)-4,6-dihydroxy-2,5-dioxohexuronate.
Examples of short pectate or galacturonate lyases include, but are
not limited to, PelW and Ogl, which genes may be either
endogenously or exogenously incorporated into certain recombinant
microorganisms provided herein. D-galacturonate and
(4S)-4,6-dihydroxy-2,5-dioxohexuronate are then converted to
5-dehydro-4-deoxy-D-glucuronate and further to KDG, which steps may
be catalyzed by KduI and KduD, respectively. The KduI enzyme has an
isomerase activity, and the KduD enzyme has a dehydrogenase
activity, such as a 2-deoxy-D-gluconate 3-dehydrogenase activity.
Accordingly, certain recombinant microorganisms provided herein may
comprise one or more short pectate or galacturonate lyases, such as
PelW and/or Ogl, and may optionally comprise one or more
isomerases, such as KduI, as well as one or more dehydrogenases,
such as KduD, to convert di- and trigalacturonates into a suitable
monosaccharide, such as KDG.
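Merely by way of illustration, the pectin-to-KDG route described in the preceding paragraphs can be written down as an ordered list of steps, each paired with gene or enzyme names given above as examples. The following Python sketch reports which steps a candidate strain would leave uncovered. The data structure, the function name, and the simplification that any single listed catalyst suffices for a step are assumptions of this example; only a subset of the named lyases and hydrolases is listed, and the periplasmic breakdown of oligopectates is omitted because no enzyme is named for it above.

```python
from typing import Iterable, List, NamedTuple


class Step(NamedTuple):
    description: str
    catalysts: frozenset  # assumption: any one member covers the step


# Ordered steps, using names given as examples in the text above.
PECTIN_TO_KDG = [
    Step("remove methyl esters from pectin", frozenset({"pemA", "pmeB"})),
    Step("remove acetyl esters from pectin", frozenset({"PaeX", "PaeY"})),
    Step("cleave pectate to oligopectates",
         frozenset({"PelA", "PelE", "PehA", "PehX"})),  # subset of examples
    Step("transport oligopectates into the periplasm",
         frozenset({"kdgM", "kdgN"})),
    Step("transport di-/tri-galacturonates into the cytosol",
         frozenset({"TogT", "TogMNAB"})),
    Step("cleave di-/tri-galacturonates", frozenset({"PelW", "Ogl"})),
    Step("isomerize to 5-dehydro-4-deoxy-D-glucuronate", frozenset({"KduI"})),
    Step("convert 5-dehydro-4-deoxy-D-glucuronate to KDG",
         frozenset({"KduD"})),
]


def uncovered_steps(available: Iterable[str]) -> List[str]:
    """Return descriptions of steps for which no listed catalyst is available."""
    have = set(available)
    return [s.description for s in PECTIN_TO_KDG if not (s.catalysts & have)]
```

A call such as `uncovered_steps(["pemA", "PaeX", "PelA", "kdgM", "TogT", "Ogl", "KduI"])` would flag only the final conversion to KDG, suggesting that a dehydrogenase such as KduD remains to be supplied.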
[0238] In certain aspects, a recombinant microorganism, such as E.
coli, that is able to grow on pectin or tri-galacturonate as a
sole source of carbon and/or energy may comprise one or more of the
gene sequences contained within SEQ ID NOS:65 and 66, including
biologically active fragments or variants thereof, such as
optimized variants. SEQ ID NO:65 shows the nucleotide sequence of
the kdgF-PaeX region from Erwinia carotovora subsp. Atroseptica
SCRI1043. SEQ ID NO:66 shows the nucleotide sequence of ogl-kdgR
from Erwinia carotovora subsp. Atroseptica SCRI1043.
[0239] In certain aspects, a recombinant microorganism, such as E.
coli, that is able to grow on pectin or tri-galacturonate as a
sole source of carbon and/or energy may comprise one or more
genomic regions of Erwinia chrysanthemi, comprising several genes
(kdgF, kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl,
and kdgR) encoding enzymes (kduI, kduD, ogl, pelW, and paeX),
transporters (togM, togN, togA, togB, and kdgM), and regulatory
proteins (kdgR) responsible for degradation of di- and
trigalacturonate, as well as several genes (pelA, pelE, paeY, and
pem) encoding pectate lyases (pelA and pelE), pectin
acetylesterases (paeY), and pectin methylesterase (pem) (see
Example 2).
[0240] Additional examples of isomerases that may be utilized
herein include glucuronate isomerases, such as those in the family
uxaC, as well as 4-deoxy-L-threo-5-hexosulose uronate isomerases,
such as those in the family KduI. Additional examples of reductases
that may be utilized herein include tagaturonate reductases, such
as those in the family uxaB. Additional examples of dehydratases
that may be utilized herein include altronate dehydratases, such as
those in the family uxaA. Additional examples of dehydrogenases
that may be utilized herein include 2-deoxy-D-gluconate
3-dehydrogenases, such as those in the family kduD.
[0241] Certain aspects may also utilize recombinant microorganisms
engineered to enhance the efficiency of the KDG degradation
pathway. For instance, in bacteria, KDG is a common metabolic
intermediate in the degradation of hexuronates such as
D-glucuronate and D-galacturonate and enters the Entner-Doudoroff
pathway, where it is converted to pyruvate and
glyceraldehyde-3-phosphate (G3P). In this pathway, KDG is first
phosphorylated by KDG kinase (KdgK) followed by its cleavage into
pyruvate and glyceraldehyde-3-phosphate (G3P) using
2-keto-3-deoxy-D-6-phosphate-gluconate (KDPG) aldolase (KdgA). The
expression of these enzymes, together with that of KDG permease
(e.g., KdgT), is negatively regulated by KdgR and is nearly absent
at the basal level. The expression is dramatically (3-5-fold)
induced upon the
addition of hexuronates, and a similar result has been reported in
Pseudomonas grown on alginate. Hence, to increase the conversion of
KDG to pyruvate and G3P, the negative regulator KdgR may be
removed. To further improve the pathway efficiency, exogenous
copies of KdgK and KdgA may also be incorporated into a given
recombinant microorganism.
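Merely by way of illustration, the regulatory logic described above can be reduced to a toy on/off model: the KDG utilization genes (e.g., those encoding KdgK, KdgA, and KdgT) are treated as repressed only when a functional KdgR is present and no hexuronate inducer has been added. The following Python sketch is a deliberate simplification with hypothetical function and parameter names; it does not capture basal leakage, the 3-5-fold induction magnitude, or any other quantitative behavior.

```python
def kdg_genes_expressed(kdgr_functional: bool, hexuronate_present: bool) -> bool:
    """Toy Boolean model of KdgR-controlled expression.

    Assumption: expression is 'off' only when the repressor is functional
    and no inducer is present; every other case is treated as 'on'.
    """
    repressed = kdgr_functional and not hexuronate_present
    return not repressed


def truth_table():
    """Enumerate all four repressor/inducer combinations."""
    return [
        (r, h, kdg_genes_expressed(r, h))
        for r in (True, False)
        for h in (False, True)
    ]
```

In this model, removing KdgR yields expression regardless of inducer, which is the rationale given above for deleting the negative regulator.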
[0242] In certain aspects, a recombinant microorganism that is able
to grow on a polysaccharide (e.g., alginate, pectin, etc.) as a sole
source of carbon may be capable of producing an increased amount of
a given commodity chemical (e.g., ethanol) while growing on that
polysaccharide. For example, E. coli engineered to grow on
alginate may be engineered to produce an increased amount of
ethanol from alginate as compared to E. coli that is not engineered
to grow on alginate (see Example 11). Thus, certain aspects
include a recombinant microorganism that is capable of growing on
alginate or pectin as a sole source of carbon, and that is capable of
producing an increased amount of ethanol, such as by comprising one
or more genes encoding and expressing a pyruvate decarboxylase
(pdc) and/or an alcohol dehydrogenase, including functional
variants thereof. In certain aspects, such a recombinant
microorganism may comprise a pyruvate decarboxylase (pdc) and two
alcohol dehydrogenases (adhA and adhB) obtained from Zymomonas
mobilis.
[0243] Embodiments of the present invention also include methods
for converting polysaccharide to a suitable monosaccharide
comprising, (a) obtaining a polysaccharide; (b) contacting the
polysaccharide with a chemical catalysis or enzymatic pathway,
thereby converting the polysaccharide to a first monosaccharide or
oligosaccharide; and (c) contacting the first monosaccharide with a
microbial system for a time sufficient to convert the first
monosaccharide or oligosaccharide to the suitable monosaccharide,
wherein the microbial system comprises, (i) at least one gene
encoding and expressing an enzyme selected from a monosaccharide
transporter, a disaccharide transporter, a trisaccharide
transporter, an oligosaccharide transporter, and a polysaccharide
transporter; and (ii) at least one gene encoding and expressing an
enzyme selected from a monosaccharide dehydrogenase, an isomerase,
a dehydratase, a kinase, and an aldolase, thereby converting the
polysaccharide to a suitable monosaccharide.
[0244] In certain aspects of the present invention, aquatic or
marine-biomass polysaccharides such as alginate may be chemically
degraded using chemical catalysts such as acids. Similarly,
biomass-derived pectin may be chemically degraded. For instance,
the reaction catalyzed by chemical catalysts is typically through
hydrolysis, as opposed to the .beta.-elimination type of reactions
catalyzed by enzymatic catalysts. Thus, certain embodiments may
include boiling alginate or pectin with strong mineral acids to
liberate carbon dioxide from D-mannuronate, thereby forming
D-lyxose, a common sugar metabolite utilized by many
microorganisms. Such embodiments may use, for example, formate,
hydrochloric acid, sulfuric acid, in addition to other suitable
acids known in the art as chemical catalysts.
[0245] An enzymatic pathway may utilize one or more enzymes
described herein that are capable of catalyzing the degradation of
polysaccharides, such as alginate or pectin.
[0246] Other embodiments may use variations of chemical catalysis
similar to those described herein or known to a person skilled in
the art, including improved or redesigned methods of chemical
catalysis suitable for use with biomass related polysaccharides.
Certain embodiments include those wherein the resulting
monosaccharide uronate is D-mannuronate.
[0247] As noted above, the suitable monosaccharides or suitable
oligosaccharides produced by the recombinant microorganisms and
microbial systems of the present invention may be utilized as a
feedstock in the production of commodity chemicals, such as
biofuels, as well as commodity chemical intermediates. Thus,
certain embodiments of the present invention relate generally to
methods for converting a suitable monosaccharide or oligosaccharide
to a commodity chemical, such as a biofuel, comprising, (a)
obtaining a suitable monosaccharide or oligosaccharide; (b)
contacting the suitable monosaccharide or oligosaccharide with a
microbial system for a time sufficient to convert the suitable
monosaccharide to the biofuel, thereby converting the suitable
monosaccharide to the biofuel.
[0248] Certain aspects include methods for converting a suitable
monosaccharide to a first commodity chemical such as a biofuel,
comprising, (a) obtaining a suitable monosaccharide; (b) contacting
the suitable monosaccharide with a microbial system for a time
sufficient to convert the suitable monosaccharide to the first
commodity chemical, wherein the microbial system comprises one or
more genes encoding an aldehyde or ketone biosynthesis pathway,
thereby converting the suitable monosaccharide to the first
commodity chemical.
[0249] In these and other related aspects, depending on the
particular ketone or aldehyde biosynthesis pathway employed, the
first commodity chemical may be further enzymatically and/or
chemically reduced and dehydrated to a second commodity chemical.
Examples of such second commodity chemicals include, but are not
limited to, butene or butane; 1-phenylbutene or 1-phenylbutane;
pentene or pentane; 2-methylpentene or 2-methylpentane;
1-phenylpentene or 1-phenylpentane; 1-phenyl-4-methylpentene or
1-phenyl-4-methylpentane; hexene or hexane; 2-methylhexene or
2-methylhexane; 3-methylhexene or 3-methylhexane;
2,5-dimethylhexene or 2,5-dimethylhexane; 1-phenylhexene or
1-phenylhexane; 1-phenyl-4-methylhexene or 1-phenyl-4-methylhexane;
1-phenyl-5-methylhexene or 1-phenyl-5-methylhexane; heptene or
heptane; 2-methylheptene or 2-methylheptane; 3-methylheptene or
3-methylheptane; 2,6-dimethylheptene or 2,6-dimethylheptane;
3,6-dimethylheptene or 3,6-dimethylheptane; 3-methyloctene or
3-methyloctane; 2-methyloctene or 2-methyloctane;
2,6-dimethyloctene or 2,6-dimethyloctane; 2,7-dimethyloctene or
2,7-dimethyloctane; 3,6-dimethyloctene or 3,6-dimethyloctane; and
cyclopentane or cyclopentene.
[0250] Certain embodiments of the present invention may also
include methods for converting a suitable monosaccharide or
oligosaccharide to a commodity chemical comprising (a) obtaining a
suitable monosaccharide or oligosaccharide; (b) contacting the
suitable monosaccharide or oligosaccharide with a microbial system
for a time sufficient to convert the suitable monosaccharide or
oligosaccharide to the commodity chemical, wherein the microbial
system comprises: (i) one or more genes encoding a biosynthesis
pathway; (ii) one or more genes encoding and expressing a C--C
ligation pathway; and (iii) one or more genes encoding and
expressing a reduction and dehydration pathway, comprising a diol
dehydrogenase, a diol dehydratase, and a secondary alcohol
dehydrogenase, thereby converting the suitable monosaccharide or
oligosaccharide to the commodity chemical.
[0251] Certain aspects also include recombinant microorganisms that
comprise (i) one or more genes encoding a biosynthesis pathway;
(ii) one or more genes encoding and expressing a C--C ligation
pathway; and (iii) one or more genes encoding and expressing a
reduction and dehydration pathway, comprising a diol dehydrogenase,
a diol dehydratase, and a secondary alcohol dehydrogenase. Certain
aspects also include recombinant microorganisms that comprise the
above pathways individually or in certain combinations, such as
a recombinant microorganism that comprises one or more genes encoding
a biosynthesis pathway, as described herein. Certain aspects may
also include recombinant microorganisms that comprise one or more
genes encoding and expressing a C--C ligation pathway, as described
herein. Certain aspects may also include recombinant
microorganisms that comprise one or more genes encoding and
expressing a reduction and dehydration pathway, comprising a diol
dehydrogenase, a diol dehydratase, and a secondary alcohol
dehydrogenase, as described herein.
[0252] As for recombinant microorganisms that comprise combinations
of the above-noted pathways, certain aspects may include
recombinant microorganisms that comprise (i) one or more genes
encoding a biosynthesis pathway; and (ii) one or more genes
encoding and expressing a C--C ligation pathway. Certain aspects
may also include recombinant microorganisms that comprise (i) one
or more genes encoding and expressing a C--C ligation pathway; and
(ii) one or more genes encoding and expressing a reduction and
dehydration pathway, comprising a diol dehydrogenase, a diol
dehydratase, and a secondary alcohol dehydrogenase.
[0253] Certain aspects may also include recombinant microorganisms
that comprise one or more individual components of a dehydration
and reduction pathway, such as a recombinant microorganism that
comprises a diol dehydrogenase, a diol dehydratase, or a secondary
alcohol dehydrogenase. These and other microorganisms may be
utilized, for example, to convert a suitable polysaccharide to a
first commodity chemical, or an intermediate thereof, or to
convert a first commodity chemical, or an intermediate thereof, to
a second commodity chemical.
[0254] Merely by way of illustration, a recombinant microorganism
comprising a C--C ligation pathway may be utilized to convert
butanal into a first commodity chemical, or an intermediate
thereof, such as 5-hydroxy-4-octanone, which can then be converted
into a second commodity chemical, or intermediate thereof, by any
suitable pathway. As a further example, a recombinant microorganism
comprising a C--C ligation pathway and a diol dehydrogenase may be
utilized for the sequential conversion of butanal into
5-hydroxy-4-octanone and then 4,5-octanediol. Examples of
recombinant microorganisms that comprise these and other various
combinations of the individual pathways described herein, as well
as various combinations of the individual components of those
pathways, will be apparent to those skilled in the art, and may
also be found in the Examples.
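Merely by way of illustration, the atom bookkeeping behind this sequence can be checked with a few lines of Python. The sketch below assumes the C--C ligation joins two molecules of butanal (C4H8O) into one molecule of 5-hydroxy-4-octanone (C8H16O2), and that the subsequent reduction to 4,5-octanediol (C8H18O2) adds two hydrogen atoms, represented here simply as H2 rather than as a specific cofactor. The names and the helper functions are hypothetical and serve only to show that each step is balanced.

```python
from collections import Counter

# Molecular formulas as element counts.
BUTANAL = Counter({"C": 4, "H": 8, "O": 1})
HYDROXY_OCTANONE = Counter({"C": 8, "H": 16, "O": 2})  # 5-hydroxy-4-octanone
OCTANEDIOL = Counter({"C": 8, "H": 18, "O": 2})        # 4,5-octanediol
H2 = Counter({"H": 2})  # assumption: stands in for the reducing equivalents


def total(species):
    """Sum element counts over a list of molecules."""
    combined = Counter()
    for molecule in species:
        combined.update(molecule)
    return combined


def balanced(reactants, products) -> bool:
    """True if every element appears in equal number on both sides."""
    return total(reactants) == total(products)
```

Here `balanced([BUTANAL, BUTANAL], [HYDROXY_OCTANONE])` and `balanced([HYDROXY_OCTANONE, H2], [OCTANEDIOL])` both hold, whereas a single butanal cannot account for the eight-carbon product.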
[0255] Also included are methods of converting a polysaccharide to
a first commodity chemical, or an intermediate thereof, such as by
utilizing a recombinant microorganism that comprises an aldehyde or
ketone biosynthesis pathway. Also included are methods of
converting a first commodity chemical, or intermediate thereof, to
a second commodity chemical, such as by utilizing a recombinant
microorganism that optionally comprises a biosynthesis pathway,
optionally comprises a C--C ligation pathway and/or optionally
comprises one or more of the individual components of a dehydration
and reduction pathway. Merely by way of illustration, a recombinant
microorganism comprising an exogenous C--C ligase (e.g.,
benzaldehyde lyase from Pseudomonas fluorescens) could be utilized
in a method to convert a first commodity chemical such as
3-methylbutanal to a second commodity chemical such as
2,7-dimethyl-5-hydroxy-4-octanone. Along this line of illustration,
the same or different recombinant microorganism comprising a diol
dehydrogenase could be utilized in a method to convert
2,7-dimethyl-5-hydroxy-4-octanone to another commodity chemical
such as 2,7-dimethyl-4,5-octanediol (see Table 2 for other
examples). As an additional illustrative example, a recombinant
microorganism comprising an exogenous secondary alcohol
dehydrogenase could be utilized in a method to convert a first
commodity chemical such as 2,7-dimethyl-4-octanone to a second
commodity chemical such as 2,7-dimethyloctanol.
[0256] Embodiments of a microbial system or isolated microorganism
of the present application may include a naturally-occurring
biosynthesis pathway, and/or an engineered, reconstructed, or
re-designed biosynthesis pathway that has been optimized for
improved functionality.
[0257] Embodiments of a microbial system or recombinant
microorganism of the present invention may include a natural or
reconstructed biosynthesis pathway, such as a butyraldehyde
biosynthesis pathway, as found in such microorganisms as
Clostridium acetobutylicum and Streptomyces coelicolor. In
explanation, butyrate and butanol are the common fermentation
products of certain bacterial species such as Clostridia, in which
the production of butyrate and butanol is mediated by a synthetic
thiolase-dependent pathway characteristically similar to the fatty
acid degradation pathway. Such pathways may be initiated with the
condensation of two molecules of acetyl-CoA to acetoacetyl-CoA,
which is catalyzed by thiolase. Acetoacetyl-CoA is then reduced to
.beta.-hydroxy butyryl-CoA, which is catalyzed by NAD(P)H dependent
.beta.-hydroxy butyryl-CoA dehydrogenase (HBDH). Crotonase
catalyzes dehydration from .beta.-hydroxy butyryl-CoA to form
crotonyl-CoA. Further reduction catalyzed by NADH-dependent
butyryl-CoA dehydrogenase (BCDH) saturates the double bond at C2 of
crotonyl-CoA to form butyryl-CoA.
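Merely by way of illustration, the four steps described above can be tallied to show the net inputs required per molecule of butyryl-CoA. The following Python sketch lists each step with the species it consumes and the single product it passes on; by-products such as released CoA, water, and oxidized cofactors are intentionally ignored, and the data layout and function name are assumptions of this example.

```python
from collections import Counter

# (enzyme, species consumed, product passed to the next step)
STEPS = [
    ("thiolase", {"acetyl-CoA": 2}, "acetoacetyl-CoA"),
    ("HBDH", {"acetoacetyl-CoA": 1, "NAD(P)H": 1}, "beta-hydroxybutyryl-CoA"),
    ("crotonase", {"beta-hydroxybutyryl-CoA": 1}, "crotonyl-CoA"),
    ("BCDH", {"crotonyl-CoA": 1, "NADH": 1}, "butyryl-CoA"),
]


def net_inputs(steps):
    """Net external inputs for one pass through a linear pathway.

    Species produced by one step and consumed by a later one are treated
    as internal intermediates and left out of the tally.
    """
    produced = {product for _, _, product in steps}
    inputs = Counter()
    for _, consumed, _ in steps:
        for species, count in consumed.items():
            if species not in produced:
                inputs[species] += count
    return dict(inputs)
```

Under these simplifications, one butyryl-CoA requires two acetyl-CoA and two reducing equivalents, one at the HBDH step and one at the BCDH step.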
[0258] In certain embodiments, thiolase, the first enzyme in this
pathway, may be overexpressed to maximize production. In certain
embodiments, thiolase may be over-expressed in E. coli. In this
regard, all three enzymes (e.g., HBDH, crotonase, and BCDH)
catalyzing the following reaction steps are found in Clostridium
acetobutylicum ATCC824. In certain embodiments, HBDH, crotonase, and
BCDH may be expressed or over-expressed in a suitable microorganism
such as E. coli. Alternatively, a short-chain aliphatic acyl-CoA
dehydrogenase derived from Pseudomonas putida KT2440 may be
utilized in other embodiments of a microbial system or isolated
microorganism of the present application.
[0259] Further to this end, butyryl-CoA in Clostridia may be
readily converted to butanol and/or butyrate by at least a few
different pathways. In one pathway, butyryl-CoA is directly reduced
to butyraldehyde catalyzed by NADH dependent CoA-acylating aldehyde
dehydrogenase (ALDH). Butyraldehyde may be further reduced to
butanol by NADH-dependent butanol dehydrogenase. Although
CoA-acylating ALDH catalyzes the one step reduction of butyryl-CoA
to butyraldehyde, the incorporation of CoA-acylating ALDH into the
microbial system may result in acetaldehyde formation because of
its promiscuous acetyl-CoA deacylating activity. In certain
embodiments, the formation of acetaldehyde may be minimized by
functionally redesigning the relevant enzyme(s).
[0260] Butyryl-CoA in other biosynthesis pathways is deacylated to
form butyryl phosphate catalyzed by phosphotransbutyrylase. Butyryl
phosphate is then hydrolyzed by reversible butyryl phosphate kinase
to form butyrate. This reaction is coupled with ATP generation from
ADP. The butyrate formation through these enzymes is known to be
significantly more specific. Certain embodiments may incorporate
phosphotransbutyrylase and butyryl phosphate kinase into the
microbial system. In other embodiments, butyrate may be directly
formed from butyryl-CoA by short chain acyl-CoA thioesterase.
[0261] Butyrate in Clostridia may also be sequentially reduced to
butanol, which is catalyzed by a single alcohol/aldehyde
dehydrogenase. Certain embodiments may comprise short chain
aldehyde dehydrogenase from other bacteria such as Pseudomonas
putida to complement the production of butyraldehyde in the
microbial system. One potential concern in using short chain
aldehyde dehydrogenase involves the possible formation of
acetaldehyde from acetate. Certain embodiments may be directed to
minimizing the acetate formation in the microbial system, for
example, by deleting several genes encoding enzymes involved in the
acetate production.
[0262] Moreover, there are multiple routes in E. coli to form
acetate, one of which is mediated by pyruvate oxidase (POXB) from
pyruvate, whereas another is mediated by phosphotransacetylase
(PTA) and acetyl phosphate kinase (ACKA) from acetyl-CoA. The
acetate production from E. coli mutant strains with poxB.sup.-,
pta.sup.-, and acka.sup.- is significantly diminished. In
addition, incorporation of acetyl-CoA synthase (ACS) which
catalyses the acetyl-CoA formation from acetate is also known to
significantly reduce the accumulation of acetate. Certain
embodiments may comprise a microbial system or isolated
microorganism with deleted PDXB, PTA, and/or ACKA genes, and other
embodiments may also comprise, separately or together with the
deleted genes, one or more genes encoding and expressing ACS.
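The two acetate-forming routes described in this paragraph, and the effect of deleting the genes behind them, can be summarized in a small sketch. This is only an illustrative Python model: the dictionary layout and the function name `remaining_acetate_routes` are assumptions introduced here, and the model says nothing about flux or residual acetate levels.

```python
# Acetate-forming routes in E. coli as described in the paragraph above.
# Each route maps to the set of genes it requires. The data structure is
# a hypothetical illustration, not part of the application.
ACETATE_ROUTES = {
    "pyruvate -> acetate": {"poxB"},
    "acetyl-CoA -> acetyl phosphate -> acetate": {"pta", "ackA"},
}


def remaining_acetate_routes(deleted_genes):
    """Return the routes for which none of the required genes is deleted."""
    deleted = set(deleted_genes)
    return sorted(
        route
        for route, genes in ACETATE_ROUTES.items()
        if not (genes & deleted)
    )


if __name__ == "__main__":
    print(remaining_acetate_routes([]))
    print(remaining_acetate_routes(["pta"]))
    print(remaining_acetate_routes(["poxB", "pta", "ackA"]))
```

In this simplified view, deleting either pta or ackA is treated as sufficient to block the acetyl-CoA route, while the pyruvate route remains until poxB is also removed.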
[0263] A microbial system or recombinant microorganism provided
herein may also comprise a glutaraldehyde biosynthesis pathway. As
one example, Saccharomyces cerevisiae has a lysine biosynthetic
pathway in which acetyl-CoA is initially condensed to
.alpha.-ketoglutarate, a common metabolite in citric acid cycle, to
form homocitrate. This reaction is catalyzed by homocitrate
synthase derived from Yeast, Thermus thermophilus, or Deinococcus
radiodurans. Homoaconitase derived from Yeast, Thermus
thermophilus, or Deinococcus radiodurans catalyzes the conversion
between homocitrate and homoisocitrate. Homoisocitrate is then
oxidatively decarboxylated to form 2-ketoadipate, which is
catalyzed by homoisocitrate dehydrogenase derived from Yeast,
Thermus thermophilus, or Deinococcus radiodurans. Homoisocitrate is
also oxidatively decarboxylated to form glutaryl-CoA, which may be
catalyzed by homoisocitrate dehydrogenase. Thus, certain
embodiments may comprise a homocitrate synthase, a homoaconitase,
and/or a homoisocitrate dehydrogenase.
[0264] Further to this end, in synthesizing
2-keto-adipicsemialdehyde, 2-ketoadipate is reduced to
2-keto-adipicsemialdehyde. This reaction can be catalyzed by
dialdehyde dehydrogenase, which, for example, may be isolated from
Agrobacterium tumefaciens C58. Thus, certain embodiments may
incorporate dialdehyde dehydrogenases into a microbial system or
recombinant microorganism.
[0265] In synthesizing glutaraldehyde, Acyl-CoA thioesterases
(ACOT) may also catalyze the hydrolysis of glutaryl-CoA. The genes
encoding .omega.-carboxylic acyl-CoA specific peroxisomal ACOTs are
found in many mammalian species; both ACOT4 and ACOT8 derived from
mice have been previously expressed in E. coli, and both enzymes
were shown to be highly active in hydrolyzing glutaryl-CoA to form
glutarate. Certain embodiments may comprise one or more Acyl-CoA
thioesterases.
[0266] Glutarate is sequentially reduced to glutaraldehyde. This
reaction can be catalyzed by glutaraldehyde dehydrogenase (CpnE),
which, for example, may be isolated from Comamonas sp. Strain NCIMB
9872. Certain embodiments may incorporate glutaraldehyde
dehydrogenases such as CpnE into a microbial system or isolated
microorganism. Other embodiments may comprise both ACOT and CpnE
enzymes. Other embodiments may comprise CpnE enzymes redesigned to
catalyze the reduction of 1-hydroxy propanoate and succinate to
1-hydroxy propanal and succinicaldehyde.
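The conversions recited in the preceding paragraphs can be read as a small directed graph from .alpha.-ketoglutarate to glutaraldehyde. The following Python sketch encodes only the steps as stated above and searches for a route through them. The tuple layout and the function name `find_route` are assumptions made for illustration; co-substrates such as acetyl-CoA and all cofactors are omitted.

```python
from collections import deque

# Conversions as stated in the text above; each entry is
# (substrate, enzyme, product). This encoding is a hypothetical
# illustration, not a kinetic or stoichiometric model.
STEPS = [
    ("alpha-ketoglutarate", "homocitrate synthase", "homocitrate"),
    ("homocitrate", "homoaconitase", "homoisocitrate"),
    ("homoisocitrate", "homoisocitrate dehydrogenase", "2-ketoadipate"),
    ("homoisocitrate", "homoisocitrate dehydrogenase", "glutaryl-CoA"),
    ("glutaryl-CoA", "acyl-CoA thioesterase (ACOT)", "glutarate"),
    ("glutarate", "glutaraldehyde dehydrogenase (CpnE)", "glutaraldehyde"),
]


def find_route(start, target, steps=STEPS):
    """Breadth-first search; returns a list of (enzyme, product) hops or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        metabolite, route = queue.popleft()
        if metabolite == target:
            return route
        for substrate, enzyme, product in steps:
            if substrate == metabolite and product not in seen:
                seen.add(product)
                queue.append((product, route + [(enzyme, product)]))
    return None


if __name__ == "__main__":
    for enzyme, product in find_route("alpha-ketoglutarate", "glutaraldehyde"):
        print(f"{enzyme} -> {product}")
```

Listing the steps this way makes it easy to see which enzymes a given embodiment would need to supply in order to reach a chosen intermediate.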
[0267] In certain aspects, the biosynthesis pathway may include an
aldehyde biosynthesis pathway, a ketone biosynthesis pathway, or
both. In certain aspects, the biosynthesis pathway may include
one or more of an acetoaldehyde, propionaldehyde, butyraldehyde,
isobutyraldehyde, 2-methyl-butyraldehyde, 3-methyl-butyraldehyde,
4-methylpentaldehyde, phenylacetoaldehyde, 2-phenyl acetoaldehyde,
2-(4-hydroxyphenyl)acetaldehyde, 2-Indole-3-acetoaldehyde,
glutaraldehyde, 5-amino-pentaldehyde, succinate semialdehyde,
and/or succinate 4-hydroxyphenyl acetaldehyde biosynthesis pathway,
including various combinations thereof.
[0268] With regard to combinations of biosynthesis pathways, a
biosynthesis pathway may comprise an acetoaldehyde biosynthesis
pathway in combination with at least one of a propionaldehyde,
butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde,
3-methyl-butyraldehyde, or phenylacetoaldehyde biosynthesis
pathway. In certain aspects, a biosynthesis pathway may comprise a
propionaldehyde biosynthesis pathway in combination with at least
one of a butyraldehyde, isobutyraldehyde, 2-methyl-butyraldehyde,
3-methyl-butyraldehyde, or phenylacetoaldehyde biosynthesis
pathway. In certain aspects, a biosynthesis pathway may comprise a
butyraldehyde biosynthesis pathway in combination with at least one
of an isobutyraldehyde, 2-methyl-butyraldehyde,
3-methyl-butyraldehyde, or phenylacetoaldehyde biosynthesis
pathway. In certain aspects, a biosynthesis pathway may comprise an
isobutyraldehyde biosynthesis pathway in combination with at least
one of a 2-methyl-butyraldehyde, 3-methyl-butyraldehyde, or
phenylacetoaldehyde biosynthesis pathway. In certain aspects, a
biosynthesis pathway may comprise a 2-methyl-butyraldehyde
biosynthesis pathway in combination with at least one of a
3-methyl-butyraldehyde or a phenylacetoaldehyde biosynthesis
pathway. In certain aspects, a biosynthesis pathway may comprise a
3-methyl-butyraldehyde biosynthesis pathway in combination with a
phenylacetoaldehyde biosynthesis pathway.
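The pairings recited in this paragraph amount to every unordered pair drawn from seven aldehyde biosynthesis pathways. A minimal Python sketch, assuming only the pathway names given above (the list name `PATHWAYS` and the helper `pathway_pairs` are illustrative), enumerates them:

```python
from itertools import combinations

# The seven aldehyde biosynthesis pathways named in the paragraph above,
# in the order in which the text introduces them.
PATHWAYS = [
    "acetoaldehyde",
    "propionaldehyde",
    "butyraldehyde",
    "isobutyraldehyde",
    "2-methyl-butyraldehyde",
    "3-methyl-butyraldehyde",
    "phenylacetoaldehyde",
]


def pathway_pairs(pathways=PATHWAYS):
    """Return every unordered pair of distinct pathways, in list order."""
    return list(combinations(pathways, 2))


if __name__ == "__main__":
    for first, second in pathway_pairs():
        print(f"{first} + {second}")
```

Because the text recites "at least one of" the partner pathways, these pairs are the smallest combinations covered; larger combinations could be enumerated the same way by changing the second argument of `combinations`.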
[0269] In certain aspects, a propionaldehyde biosynthesis pathway
may comprise a threonine deaminase (ilvA) gene from an organism
such as Escherichia coli and a keto-isovalerate decarboxylase
(kivd) gene from an organism such as Lactococcus lactis, and/or
functional variants of these enzymes, including homologs or
orthologs thereof, as well as optimized variants. These enzymes may
be utilized generally to convert L-threonine to
propionaldehyde.
[0270] In certain aspects, a butyraldehyde biosynthesis pathway may
comprise at least one of a thiolase (atoB) gene from an organism
such as E. coli, a .beta.-hydroxy butyryl-CoA dehydrogenase (hbd)
gene, a crotonase (crt) gene, a butyryl-CoA dehydrogenase (bcd)
gene, an electron transfer flavoprotein A (etfA) gene, and/or an
electron transfer flavoprotein B (etfB) gene from an organism such
as Clostridium acetobutylicum (e.g., ATCC 824), as well as a
coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from an
organism such as Clostridium beijerinckii acetobutylicum ATCC 824.
In certain aspects, a coenzyme A-linked alcohol dehydrogenase
(adhE2) gene from an organism such as Clostridium acetobutylicum
ATCC 824 may be used as an alternative to an ald gene.
[0271] In certain aspects, an isobutyraldehyde biosynthetic pathway
may comprise an acetolactate synthase (alsS) from an organism such
as Bacillus subtilis or an als gene from an organism such as
Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage may
be optimized for E. coli protein expression). Such a pathway may
also comprise acetolactate reductoisomerase (ilvC) and/or
2,3-dihydroxyisovalerate dehydratase (ilvD) genes from an organism
such as E. coli, as well as a keto-isovalerate decarboxylase (kivd)
gene from an organism such as Lactococcus lactis.
[0272] In certain aspects, a 3-methylbutyraldehyde and
2-methylbutyraldehyde biosynthesis pathway may comprise an
acetolactate synthase (alsS) gene from an organism such as Bacillus
subtilis or an (als) gene from an organism such as Klebsiella
pneumoniae subsp. pneumoniae MGH 78578 (codon usage may be
optimized for E. coli protein expression). Certain aspects of such
a pathway may also comprise acetolactate reductoisomerase (ilvC),
2,3-dihydroxyisovalerate dehydratase (ilvD), isopropylmalate
synthase (LeuA), isopropylmalate isomerase (LeuC and LeuD), and
3-isopropylmalate dehydrogenase (LeuB) genes from an organism such
as E. coli, as well as a keto-isovalerate decarboxylase (kivd) from
an organism such as Lactococcus lactis.
[0273] In certain aspects, a phenylacetoaldehyde and
4-hydroxyphenylacetoaldehyde biosynthesis pathway may comprise one
or more of 3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and
aroH), 3-dehydroquinate synthase (aroB), a 3-dehydroquinate
dehydratase (aroD), dehydroshikimate reductase (aroE), shikimate
kinase II (aroL), shikimate kinase I (aroK),
5-enolpyruvylshikimate-3-phosphate synthetase (aroA), chorismate
synthase (aroC), fused chorismate mutase P/prephenate dehydratase
(pheA), and/or fused chorismate mutase T/prephenate dehydrogenase
(tyrA) genes from an organism such as E. coli, as well as a
keto-isovalerate decarboxylase (kivd) from an organism such as
Lactococcus lactis.
[0274] In certain aspects, such as for the ultimate production of
1,10-diamino-5-decanol and 1,10-dicarboxylic-5-decanol, a
biosynthesis pathway may comprise one or more homocitrate synthase,
homoaconitate hydratase, and/or homoisocitrate dehydrogenase genes
from an organism such as
Deinococcus radiodurans and/or Thermus thermophilus, as well as a
keto-adipate decarboxylase gene, a 2-aminoadipate transaminase
gene, and an L-2-aminoadipate-6-semialdehyde: NAD+ 6-oxidoreductase
gene. Such a biosynthesis pathway would be able to convert
.alpha.-ketoglutarate to 5-aminopentaldehyde.
[0275] In certain aspects, such as for one step in cyclopentanol
production, an .alpha.-ketoadipate semialdehyde biosynthesis pathway
may comprise homocitrate synthase (hcs), homoaconitate hydratase,
and homoisocitrate dehydrogenase genes from an organism such as
Deinococcus radiodurans and/or Thermus thermophilus, and an
.alpha.-ketoadipate semialdehyde dehydrogenase gene. Such a
biosynthesis pathway would be able to convert acetyl-CoA and
.alpha.-ketoglutarate to .alpha.-ketoadipate semialdehyde.
[0276] For the production of certain commodity chemicals, such as
2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol,
among other similar chemicals, a biosynthesis pathway (e.g.,
aldehyde biosynthesis pathway) may optionally or further comprise
one or more genes encoding a decarboxylase enzyme, such as an
indole-3-pyruvate decarboxylase (IPDC). An IPDC may be obtained,
for example, from such microorganisms as Azospirillum brasilense
and Paenibacillus polymyxa E681. In this regard, an IPDC may be
utilized to more efficiently catalyze the decarboxylation of
various carboxylic acids to form the corresponding aldehyde, which
can be further converted to a commodity chemical by a reductase or
dehydrogenase, as detailed herein.
[0277] In certain aspects, a 2-phenylethanol,
2-(4-hydroxyphenyl)ethanol, and 2-(indole-3-)ethanol biosynthesis
pathway may comprise a transketolase (tktA), a
3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH),
3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase
(aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II
(aroL), a shikimate kinase I (aroK), a
5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate
synthase (aroC), a fused chorismate mutase P/prephenate dehydratase
(pheA), and a fused chorismate mutase T/prephenate dehydrogenase
(tyrA) genes from E. coli, keto-isovalerate decarboxylase (kivd)
from Lactococcus lactis, alcohol dehydrogenase (adh2) from
Saccharomyces cerevisiae, Indole-3-pyruvate decarboxylase (ipdc)
from Azospirillum brasilense, phenylethanol reductase (par) from
Rhodococcus sp. ST-10, and a benzaldehyde lyase (bal) from
Pseudomonas fluorescens.
[0278] As for all other pathways described herein, the components
for each of the biosynthesis pathways described herein may be
present in a recombinant microorganism either endogenously or
exogenously. To improve the efficiency of a given biosynthesis
pathway, endogenous genes, for example, may be up-regulated or
over-expressed, such as by introducing an additional (i.e.,
exogenous) copy of that endogenous gene into the recombinant
microorganism. Such pathways may also be optimized by altering via
mutagenesis the endogenous version of a gene to improve
functionality, followed by introduction of the altered gene into
the microorganism. The expression of endogenous genes may be up or
down-regulated, or even eliminated, according to known techniques
in the art and described herein. Similarly, the expression levels
of exogenously provided genes may be regulated as desired, such as
by using various constitutive or inducible promoters. Such genes
may also be "codon-optimized," as described herein and known in the
art. Also included are functional naturally-occurring variants of
the genes and enzymes described herein, including homologs or
orthologs thereof.
[0279] Certain embodiments of a microbial system or isolated
microorganism may comprise a CC-ligation pathway. In certain
aspects, a CC-ligation pathway may comprise a ThDP-dependent
enzyme, such as a C--C ligase, or an optimized C--C ligase. For
example, eight-carbon unit molecules (butyroins) may be made from
condensing together two four-carbon unit molecules
(butyraldehydes). ThDP-dependent enzymes are a group of enzymes
known to catalyze both breaking and formation of C--C bonds and
have been utilized as catalysts in chemoenzymatic syntheses. The
spectrum of chemical reactions that these enzymes catalyze ranges
from decarboxylation of .alpha.-keto acids, oxidative
decarboxylation, and carboligation to the cleavage of C--C
bonds.
[0280] To provide a few examples, benzaldehyde lyase (BAL) from
Pseudomonas fluorescens, benzoylformate decarboxylase (BFD) from
Pseudomonas putida, and pyruvate decarboxylase (PDC) from Zymomonas
mobilis may catalyze a carboligation reaction between two
aldehydes. BAL accepts the broadest spectrum of aldehydes as
substrates among these three enzymes ranging from substituted
benzaldehyde to acetoaldehyde, among others, as shown herein. BAL
catalyzes a stereospecific carboligation reaction between two
aldehydes and forms .alpha.-hydroxy ketones with over 99% ee for
R-configuration. The benzoin formation from two benzaldehyde
molecules is a favored reaction catalyzed by BAL and proceeds as
fast as 320 mmol (benzoin) mg (protein).sup.-1 min.sup.-1. The
formation of .alpha.-hydroxy ketone may be carried out using many
different aldehydes, including butyraldehyde.
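The "over 99% ee" figure quoted above can be related to product composition with the standard definition of enantiomeric excess. The short Python sketch below is an illustration of that arithmetic only; the function name `enantiomeric_excess` is an assumption, and no measured data from the application are used.

```python
def enantiomeric_excess(r_amount, s_amount):
    """Percent ee in favor of the R enantiomer: 100 * (R - S) / (R + S)."""
    total = r_amount + s_amount
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * (r_amount - s_amount) / total


if __name__ == "__main__":
    # A product that is 99.5% R and 0.5% S by amount.
    print(enantiomeric_excess(99.5, 0.5))
    # A racemic mixture.
    print(enantiomeric_excess(50.0, 50.0))
```

Under this definition, an ee above 99% for the R-configuration means that more than 99.5% of the .alpha.-hydroxy ketone formed is the R enantiomer.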
[0281] BFD and PDC may also catalyze the carboligation reactions
between two aldehyde molecules. BFD and PDC accept relatively
larger and smaller aldehyde molecules, respectively. In the
presence of benzaldehyde and acetoaldehyde, BFD catalyzes the
formation of benzoin and (S)-.alpha.-hydroxy phenylpropanone
(2S-HPP), whereas PDC catalyzes the formation of
(R)-.alpha.-hydroxy phenylpropanone (2R-HPP) and
(R)-.alpha.-hydroxy 2-butanone (acetoin). As detailed below,
certain microbial systems or isolated microorganisms of the present
application may comprise natural or optimized C--C ligases
(ThDP-dependent enzymes) selected from benzaldehyde lyase (BAL)
from Pseudomonas fluorescens, benzoylformate decarboxylase (BFD)
from Pseudomonas putida, and pyruvate decarboxylase (PDC) from
Zymomonas mobilis. Other embodiments may comprise a benzaldehyde
lyase (BAL) from Pseudomonas fluorescens (see SEQ ID NOS:143-144,
showing the nucleotide and polypeptide sequences, respectively)
including biologically active variants thereof, such as optimized
variants.
[0282] A C--C ligation pathway of the present invention typically
comprises one or more C--C ligases, such as a lyase enzyme.
Exemplary lyases include, but are not limited to, acetoaldehyde
lyases, propionaldehyde lyases, butyraldehyde lyases,
isobutyraldehyde lyases, 2-methyl-butyraldehyde lyases,
3-methyl-butyraldehyde lyases (isovaleraldehyde), phenylacetaldehyde
lyases, .alpha.-keto adipate carboxylyases, pentaldehyde lyases,
4-methyl-pentaldehyde lyases, hexyldehyde lyases, heptaldehyde
lyases, octaldehyde lyases, 4-hydroxyphenylacetaldehyde lyases,
indoleacetaldehyde lyases, and indolephenylacetaldehyde lyases. In
certain aspects, a selected CC-ligase or lyase enzyme may have one
or more of the above exemplified lyase activities, such as an
acetoaldehyde lyase activity, a propionaldehyde lyase activity, a
butyraldehyde lyase activity, and/or an isobutyraldehyde lyase
activity, among others.
[0283] As noted above, a C--C ligase may comprise a benzaldehyde
lyase, such as a benzaldehyde lyase isolated from Pseudomonas
fluorescens (SEQ ID NOS:143-144), as well as biologically active
fragments or variants of this reference sequence, such as optimized
variants of a benzaldehyde lyase. In this regard, certain aspects
may comprise nucleotide sequences or polypeptide sequences having
80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID
NOS:143-144, and which are capable of catalyzing a carboligation
reaction, or which possess C--C lyase activity, as described
herein. In certain aspects, a BAL enzyme will comprise one or more
conserved amino acid residues, including G27, E50, A57, G155, P162,
P234, D271, G277, G422, G447, D448, and/or G512.
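The two criteria in this paragraph, a percent-identity threshold and the presence of named conserved residues, can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned by an external tool, that identity is counted over gap-free columns (other conventions for the denominator exist), and that residue labels such as G27 use the numbering of the sequence being checked. The function names and the toy sequences in the usage lines are hypothetical and do not reproduce SEQ ID NOS:143-144.

```python
import re


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def missing_conserved(sequence, residues):
    """Return labels such as 'G27' whose 1-based position does not hold that residue."""
    missing = []
    for label in residues:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            raise ValueError(f"bad residue label: {label}")
        amino_acid, position = match.group(1), int(match.group(2))
        if position > len(sequence) or sequence[position - 1] != amino_acid:
            missing.append(label)
    return missing


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(percent_identity("ACDE", "ACDF"))
    print(missing_conserved("MGE", ["G2", "E3", "A1"]))
```

A candidate variant could then be screened by requiring, for example, `percent_identity(...) >= 80.0` together with an empty result from `missing_conserved`.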
[0284] Pseudomonas fluorescens is able to grow on R-benzoin as the
sole carbon and energy source because it harbours the enzyme
benzaldehyde lyase that cleaves the acyloin linkage using thiamine
diphosphate (ThDP) as a cofactor. In the reverse reaction, as
utilized herein, benzaldehyde lyase catalyses the carboligation of
two aldehydes with high substrate and stereospecificity.
Structure-based comparisons with other proteins show that
benzaldehyde lyase belongs to a group of closely related
ThDP-dependent enzymes. The ThDP cofactors of these enzymes are
fixed at their two ends in separate domains, suspending a
comparatively mobile thiazolium ring between them. While the
residues binding the two ends of ThDP are well conserved, the
lining of the active centre pocket around the thiazolium moiety
varies greatly within the group. The active sites for BAL have been
described, for example, in Kneen et al. (Biochimica et Biophysica
Acta 1753:263-271, 2005) and Brandt et al. (Biochemistry
47:7734-43, 2008). Benzaldehyde lyase derived from Pseudomonas
fluorescens has been demonstrated herein to at least have an
acetoaldehyde lyase activity, a propionaldehyde lyase activity, a
butyraldehyde lyase activity, a 3-methyl-butyraldehyde lyase
activity, a pentaldehyde lyase activity, a 4-methylpentaldehyde
lyase activity, a hexyldehyde lyase activity, a phenylacetoaldehyde
lyase activity, and an octaldehyde lyase activity (see Table 2),
among other in vivo lyase activities (see FIGS. 48-55).
[0285] In certain aspects, a C--C ligase, such as BAL derived from
Pseudomonas fluorescens, BFD derived from Pseudomonas putida, or
PDC derived from Zymomonas mobilis may comprise a lyase with a
combination of lyase activities, such as a lyase having both a
propionaldehyde lyase activity and a 3-methyl-butyraldehyde lyase
activity, among other combinations and activities, such as those
exemplary combinations detailed herein. Merely by way of
illustration, a lyase having a combination of lyase activities may
be referred to herein as a propionaldehyde/3-methyl-butyraldehyde
lyase.
[0286] A dehydration and reduction pathway, comprising a diol
dehydrogenase, a diol dehydratase, and a secondary alcohol
dehydrogenase, may be utilized to further convert an aldehyde,
ketone, or corresponding alcohol, to a commodity chemical, such as
a biofuel.
[0287] To this end, a dehydration and reduction pathway may
comprise one or more diol dehydrogenases. A "diol dehydrogenase"
refers generally to an enzyme that catalyzes the reversible
reduction and oxidation of an .alpha.-hydroxy ketone and/or its
corresponding diol. Certain embodiments of a microbial system or
isolated microorganism may comprise genes encoding a diol
dehydrogenase that specifically catalyzes the reduction of
.alpha.-hydroxy-ketones, including, for example, a 4,5-octanediol
dehydrogenase. Diol dehydrogenases, such as 4,5-octanediol
dehydrogenase, may be isolated from a variety of organisms and
incorporated into a microbial system or isolated microorganism. A
particular group of alcohol dehydrogenases has a characteristic
ability to oxidize various .alpha.-hydroxy alcohols and reduce
various .alpha.-hydroxy ketones and .alpha.-keto ketones. As such,
the recitation "diol dehydrogenase" may also encompass such alcohol
dehydrogenases.
[0288] By way of example regarding diol dehydrogenases from
exemplary organisms, glycerol dehydrogenase isolated from Hansenula
ofunaensis has broad substrate specificity and is capable of
catalyzing the oxidation of various .alpha.-hydroxy alcohols,
including 1,2-octanediol, as well as the reduction of various
.alpha.-hydroxy ketones and .alpha.-keto ketones, including
3-hydroxy-2-butanone and 3,4-hexanedione, with activity
comparable to that on its native substrates, glycerol and
dihydroxyacetone, respectively (40-200%). As one further example,
glycerol dehydrogenase discovered in Hansenula polymorpha DI-1
works similarly. In certain embodiments, a microbial system or
recombinant microorganism may comprise a glycerol dehydrogenase
gene isolated from Hansenula ofunaensis, a glycerol dehydrogenase
isolated from Hansenula polymorpha DI-1 and/or a
meso-2,3-butanediol dehydrogenase from Klebsiella pneumoniae. In
other embodiments, a microbial system or isolated microorganism may
comprise a 4,5-octanediol dehydrogenase, among others detailed
herein. Diol dehydrogenases may also be obtained from Lactobacillus
brevis ATCC 367, Pseudomonas putida KT2440, and Klebsiella
pneumoniae MGH 78578, as described herein (see Example 5).
[0289] Exemplary diol dehydrogenases include, but are not limited
to, 2,3-butanediol dehydrogenase, 3,4-hexanediol dehydrogenase,
4,5-octanediol dehydrogenase, 5,6-decanediol dehydrogenase,
6,7-dodecanediol dehydrogenase, 7,8-tetradecanediol dehydrogenase,
8,9-hexadecanediol dehydrogenase, 2,5-dimethyl-3,4-hexanediol
dehydrogenase, 3,6-dimethyl-4,5-octanediol dehydrogenase,
2,7-dimethyl-4,5-octanediol dehydrogenase,
2,9-dimethyl-5,6-decanediol dehydrogenase,
1,4-diphenyl-2,3-butanediol dehydrogenase,
bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase,
1,4-diindole-2,3-butanediol dehydrogenase, 1,2-cyclopentanediol
dehydrogenase, 2,3-pentanediol dehydrogenase, 2,3-hexanediol
dehydrogenase, 2,3-heptanediol dehydrogenase, 2,3-octanediol
dehydrogenase, 2,3-nonanediol dehydrogenase,
4-methyl-2,3-pentanediol dehydrogenase, 4-methyl-2,3-hexanediol
dehydrogenase, 5-methyl-2,3-hexanediol dehydrogenase,
6-methyl-2,3-heptanediol dehydrogenase, 1-phenyl-2,3-butanediol
dehydrogenase, 1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase,
1-indole-2,3-butanediol dehydrogenase, 3,4-heptanediol
dehydrogenase, 3,4-octanediol dehydrogenase, 3,4-nonanediol
dehydrogenase, 3,4-decanediol dehydrogenase, 3,4-undecanediol
dehydrogenase, 2-methyl-3,4-hexanediol dehydrogenase,
5-methyl-3,4-heptanediol dehydrogenase, 6-methyl-3,4-heptanediol
dehydrogenase, 7-methyl-3,4-octanediol dehydrogenase,
1-phenyl-2,3-pentanediol dehydrogenase,
1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase,
1-indole-2,3-pentanediol dehydrogenase, 4,5-nonanediol
dehydrogenase, 4,5-decanediol dehydrogenase, 4,5-undecanediol
dehydrogenase, 4,5-dodecanediol dehydrogenase,
2-methyl-3,4-heptanediol dehydrogenase, 3-methyl-4,5-octanediol
dehydrogenase, 2-methyl-4,5-octanediol dehydrogenase,
8-methyl-4,5-nonanediol dehydrogenase, 1-phenyl-2,3-hexanediol
dehydrogenase, 1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase,
1-indole-2,3-hexanediol dehydrogenase, 5,6-undecanediol
dehydrogenase, 5,6-undecanediol dehydrogenase, 5,6-tridecanediol
dehydrogenase, 2-methyl-3,4-octanediol dehydrogenase,
3-methyl-4,5-nonanediol dehydrogenase, 2-methyl-4,5-nonanediol
dehydrogenase, 2-methyl-5,6-decanediol dehydrogenase,
1-phenyl-2,3-heptanediol dehydrogenase,
1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase,
1-indole-2,3-heptanediol dehydrogenase, 6,7-tridecanediol
dehydrogenase, 6,7-tetradecanediol dehydrogenase,
2-methyl-3,4-nonanediol dehydrogenase, 3-methyl-4,5-decanediol
dehydrogenase, 2-methyl-4,5-decanediol dehydrogenase,
2-methyl-5,6-undecanediol dehydrogenase, 1-phenyl-2,3-octanediol
dehydrogenase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase,
1-indole-2,3-octanediol dehydrogenase, 7,8-pentadecanediol
dehydrogenase, 2-methyl-3,4-decanediol dehydrogenase,
3-methyl-4,5-undecanediol dehydrogenase, 2-methyl-4,5-undecanediol
dehydrogenase, 2-methyl-5,6-dodecanediol dehydrogenase,
1-phenyl-2,3-nonanediol dehydrogenase,
1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase,
1-indole-2,3-nonanediol dehydrogenase, 2-methyl-3,4-undecanediol
dehydrogenase, 3-methyl-4,5-dodecanediol dehydrogenase,
2-methyl-4,5-dodecanediol dehydrogenase, 2-methyl-5,6-tridecanediol
dehydrogenase, 1-phenyl-2,3-decanediol dehydrogenase,
1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase,
1-indole-2,3-decanediol dehydrogenase, 2,5-dimethyl-3,4-heptanediol
dehydrogenase, 2,6-dimethyl-3,4-heptanediol dehydrogenase,
2,7-dimethyl-3,4-octanediol dehydrogenase,
1-phenyl-4-methyl-2,3-pentanediol dehydrogenase,
1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase,
1-indole-4-methyl-2,3-pentanediol dehydrogenase,
2,6-dimethyl-4,5-octanediol dehydrogenase,
3,8-dimethyl-4,5-nonanediol dehydrogenase,
1-phenyl-4-methyl-2,3-hexanediol dehydrogenase,
1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase,
1-indole-4-methyl-2,3-hexanediol dehydrogenase,
2,8-dimethyl-4,5-nonanediol dehydrogenase,
1-phenyl-5-methyl-2,3-hexanediol dehydrogenase,
1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase,
1-indole-5-methyl-2,3-hexanediol dehydrogenase,
1-phenyl-6-methyl-2,3-heptanediol dehydrogenase,
1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase,
1-indole-6-methyl-2,3-heptanediol dehydrogenase,
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase,
1-indole-4-phenyl-2,3-butanediol dehydrogenase,
1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase,
1,10-diamino-5,6-decanediol dehydrogenase,
1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase,
2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, and the
like.
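Many of the straight-chain entries in the list above follow a simple pattern: when two linear aldehydes are joined at their carbonyl carbons and the resulting .alpha.-hydroxy ketone is reduced, the diol has as many carbons as the two aldehydes combined, and its hydroxyl groups sit at the position of the junction counted from the shorter side. The Python sketch below derives the diol name from the two carbon counts. It is an illustration under that assumption only; it covers unbranched aldehydes, and the names `ALKANE_STEMS` and `linear_diol_name` are introduced here.

```python
# Parent alkane names for the chain lengths that occur in the list above.
ALKANE_STEMS = {
    4: "butane", 5: "pentane", 6: "hexane", 7: "heptane", 8: "octane",
    9: "nonane", 10: "decane", 11: "undecane", 12: "dodecane",
    13: "tridecane", 14: "tetradecane", 15: "pentadecane", 16: "hexadecane",
}


def linear_diol_name(carbons_a, carbons_b):
    """Name the vicinal diol obtained from two unbranched aldehydes.

    carbons_a and carbons_b are the carbon counts of the two aldehydes,
    e.g. 4 for butyraldehyde.
    """
    short, long_ = sorted((carbons_a, carbons_b))
    # Acetaldehyde (two carbons) is the smallest aldehyde considered here.
    if short < 2:
        raise ValueError("each aldehyde needs at least two carbons")
    total = short + long_
    if total not in ALKANE_STEMS:
        raise ValueError(f"no alkane stem stored for {total} carbons")
    return f"{short},{short + 1}-{ALKANE_STEMS[total]}diol"


if __name__ == "__main__":
    print(linear_diol_name(4, 4))  # two butyraldehyde units
    print(linear_diol_name(2, 3))  # acetaldehyde plus propionaldehyde
```

The corresponding enzyme names in the list are then obtained by appending "dehydrogenase" to the diol name.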
[0290] In certain aspects, a selected diol dehydrogenase enzyme may
have one or more of the above exemplified diol dehydrogenase
activities, such as a 2,3-butanediol dehydrogenase activity, a
3,4-hexanediol dehydrogenase activity, and/or a 4,5-octanediol
dehydrogenase activity, among others.
[0291] In certain aspects, a recombinant microorganism may comprise
a diol dehydrogenase encoded by a nucleotide reference sequence
selected from SEQ ID NO:97, 99, and 101, or an enzyme having a
polypeptide sequence selected from SEQ ID NO:98, 100, and 102,
including biologically active fragments or variants thereof, such
as optimized variants. Certain aspects may also comprise
nucleotide sequences or polypeptide sequences having 80%, 85%, 90%,
95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:97-102.
[0292] Other embodiments may comprise re-designed diol
dehydrogenases for reduction of 1-hydroxy propanal,
succinicaldehyde, and glutaraldehyde to 1,3-propanediol,
1,4-butanediol, and 1,5-pentanediol, respectively, among
others.
[0293] A dehydration and reduction pathway, as described herein,
may comprise one or more diol dehydratases. A "diol dehydratase"
refers generally to an enzyme that catalyzes the irreversible
dehydration of diols. For instance, this enzyme may serve to
dehydrate octanediol to form 4-octanone. It has been recognized
that there are at least two different types of diol dehydratases:
one group that depends on coenzyme B12 for catalysis and one that
is independent of it. Coenzyme B12 dependent diol dehydratases are
known to
catalyze a radical mediated dehydration reaction from
.alpha.-hydroxy alcohol to aldehydes or ketones. For example, a
diol dehydratase from Klebsiella pneumoniae catalyzes the
dehydration of glycerol to form .beta.-hydroxypropyl aldehyde,
accepts 2,3-butanediol as a substrate, and catalyzes the
dehydration reaction to form 2-butanone.
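The dehydrations named in this paragraph each remove one molecule of water from the diol. A minimal Python sketch of that bookkeeping follows, using elemental compositions written as dictionaries. The function name `dehydrate` and the dictionary representation are assumptions for illustration; the check concerns composition only and does not determine which isomer is formed.

```python
from collections import Counter

# One molecule of water.
WATER = Counter({"H": 2, "O": 1})


def dehydrate(formula):
    """Subtract one H2O from an elemental composition given as a mapping."""
    product = Counter(formula)
    product.subtract(WATER)
    if any(count < 0 for count in product.values()):
        raise ValueError("formula has too few H or O atoms to lose water")
    return {element: count for element, count in product.items() if count > 0}


if __name__ == "__main__":
    # 2,3-butanediol (C4H10O2) loses water to give the composition of 2-butanone.
    print(dehydrate({"C": 4, "H": 10, "O": 2}))
    # Glycerol (C3H8O3) loses water to give the composition of
    # .beta.-hydroxypropyl aldehyde (3-hydroxypropanal).
    print(dehydrate({"C": 3, "H": 8, "O": 3}))
```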
[0294] As a further example, Clostridium butylicum contains
coenzyme B12 independent diol dehydratases. FIG. 46 shows the in
vivo biological activity of coenzyme B12 independent diol
dehydratase (dhaB1) and activator (dhaB2) isolated from Clostridium
butylicum (see Example 9). FIG. 46A shows the in vivo production of
1-propanol from 1,2-propanediol, FIG. 46B shows the in vivo
production of 2-butanol from meso-2,3-butanediol, and FIG. 46C
shows the in vivo production of cyclopentanone from
trans-1,2-cyclopentanediol.
[0295] Thus, certain embodiments of the present invention may
comprise optimized or redesigned diol dehydratases that accommodate
various substrates, such as 4,5-octanediol as a substrate, and may
include diol dehydratases isolated and/or optimized from Klebsiella
pneumoniae and Clostridium butylicum, among other organisms
described herein and known in the art.
[0296] Exemplary diol dehydratases include, but are not limited to,
2,3-butanediol dehydratase, 3,4-hexanediol dehydratase,
4,5-octanediol dehydratase, 5,6-decanediol dehydratase,
6,7-dodecanediol dehydratase, 7,8-tetradecanediol dehydratase,
8,9-hexadecanediol dehydratase, 2,5-dimethyl-3,4-hexanediol
dehydratase, 3,6-dimethyl-4,5-octanediol dehydratase,
2,7-dimethyl-4,5-octanediol dehydratase,
2,9-dimethyl-5,6-decanediol dehydratase,
1,4-diphenyl-2,3-butanediol dehydratase,
bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase,
1,4-diindole-2,3-butanediol dehydratase, 1,2-cyclopentanediol
dehydratase, 2,3-pentanediol dehydratase, 2,3-hexanediol
dehydratase, 2,3-heptanediol dehydratase, 2,3-octanediol
dehydratase, 2,3-nonanediol dehydratase, 4-methyl-2,3-pentanediol
dehydratase, 4-methyl-2,3-hexanediol dehydratase,
5-methyl-2,3-hexanediol dehydratase, 6-methyl-2,3-heptanediol
dehydratase, 1-phenyl-2,3-butanediol dehydratase,
1-(4-hydroxyphenyl)-2,3-butanediol dehydratase,
1-indole-2,3-butanediol dehydratase, 3,4-heptanediol dehydratase,
3,4-octanediol dehydratase, 3,4-nonanediol dehydratase,
3,4-decanediol dehydratase, 3,4-undecanediol dehydratase,
2-methyl-3,4-hexanediol dehydratase, 5-methyl-3,4-heptanediol
dehydratase, 6-methyl-3,4-heptanediol dehydratase,
7-methyl-3,4-octanediol dehydratase, 1-phenyl-2,3-pentanediol
dehydratase, 1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase,
1-indole-2,3-pentanediol dehydratase, 4,5-nonanediol dehydratase,
4,5-decanediol dehydratase, 4,5-undecanediol dehydratase,
4,5-dodecanediol dehydratase, 2-methyl-3,4-heptanediol dehydratase,
3-methyl-4,5-octanediol dehydratase, 2-methyl-4,5-octanediol
dehydratase, 8-methyl-4,5-nonanediol dehydratase,
1-phenyl-2,3-hexanediol dehydratase,
1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase,
1-indole-2,3-hexanediol dehydratase, 5,6-undecanediol dehydratase,
5,6-undecanediol dehydratase, 5,6-tridecanediol dehydratase,
2-methyl-3,4-octanediol dehydratase, 3-methyl-4,5-nonanediol
dehydratase, 2-methyl-4,5-nonanediol dehydratase,
2-methyl-5,6-decanediol dehydratase, 1-phenyl-2,3-heptanediol
dehydratase, 1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase,
1-indole-2,3-heptanediol dehydratase, 6,7-tridecanediol
dehydratase, 6,7-tetradecanediol dehydratase,
2-methyl-3,4-nonanediol dehydratase, 3-methyl-4,5-decanediol
dehydratase, 2-methyl-4,5-decanediol dehydratase,
2-methyl-5,6-undecanediol dehydratase, 1-phenyl-2,3-octanediol
dehydratase, 1-(4-hydroxyphenyl)-2,3-octanediol dehydratase,
1-indole-2,3-octanediol dehydratase, 7,8-pentadecanediol
dehydratase, 2-methyl-3,4-decanediol dehydratase,
3-methyl-4,5-undecanediol dehydratase, 2-methyl-4,5-undecanediol
dehydratase, 2-methyl-5,6-dodecanediol dehydratase,
1-phenyl-2,3-nonanediol dehydratase,
1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase,
1-indole-2,3-nonanediol dehydratase, 2-methyl-3,4-undecanediol
dehydratase, 3-methyl-4,5-dodecanediol dehydratase,
2-methyl-4,5-dodecanediol dehydratase, 2-methyl-5,6-tridecanediol
dehydratase, 1-phenyl-2,3-decanediol dehydratase,
1-(4-hydroxyphenyl)-2,3-decanediol dehydratase,
1-indole-2,3-decanediol dehydratase, 2,5-dimethyl-3,4-heptanediol
dehydratase, 2,6-dimethyl-3,4-heptanediol dehydratase,
2,7-dimethyl-3,4-octanediol dehydratase,
1-phenyl-4-methyl-2,3-pentanediol dehydratase,
1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase,
1-indole-4-methyl-2,3-pentanediol dehydratase,
2,6-dimethyl-4,5-octanediol dehydratase,
3,8-dimethyl-4,5-nonanediol dehydratase,
1-phenyl-4-methyl-2,3-hexanediol dehydratase,
1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase,
1-indole-4-methyl-2,3-hexanediol dehydratase,
2,8-dimethyl-4,5-nonanediol dehydratase,
1-phenyl-5-methyl-2,3-hexanediol dehydratase,
1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase,
1-indole-5-methyl-2,3-hexanediol dehydratase,
1-phenyl-6-methyl-2,3-heptanediol dehydratase,
1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase,
1-indole-6-methyl-2,3-heptanediol dehydratase,
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase,
1-indole-4-phenyl-2,3-butanediol dehydratase,
1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase,
1,10-diamino-5,6-decanediol dehydratase,
1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase,
2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and the like.
[0297] In certain aspects, a selected diol dehydratase enzyme may
have one or more of the above exemplified diol dehydratase
activities, such as a 2,3-butanediol dehydratase activity, a
3,4-hexanediol dehydratase activity, and/or a 4,5-octanediol
dehydratase activity, among others.
[0298] In certain aspects, diol dehydratases may be obtained from
Klebsiella pneumoniae MGH 78578, including from the pduCDE gene of
this and other microorganisms. In certain aspects, a recombinant
microorganism may comprise one or more diol dehydratases encoded by
a nucleotide reference sequence selected from SEQ ID NO:103, 105,
and 107, or an enzyme having a polypeptide sequence selected from
SEQ ID NO:104, 106, and 108, including biologically active
fragments or variants thereof, such as optimized variants. Certain
aspects may also comprise nucleotide sequences or polypeptide
sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99% sequence
identity to SEQ ID NOS:103-108. In certain aspects, polypeptides of
SEQ ID NO:104 may comprise certain conserved amino acid residues,
including those chosen from D149, P151, A155, A159, G165, E168,
E170, A183, G189, G196, Q200, E208, G215, Y219, E221, T222, S224,
Y226, G227, T228, F232, G235, D236, D237, T238, P239, S241, L245,
Y249, S251, R252, G253, K255, R257, S260, E265, M268, G269, S275,
Y278, L279, E280, C283, G291, Q293, G294, Q296, N297, G298, G312,
E329, S341, R344, G356, D371, N372, F374, S377, R392, D393, R412,
L477, A486, G499, D500, S516, N522, D523, Y524, G526, and G530.
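By way of illustration only, the following minimal sketch shows one way a candidate polypeptide could be screened for conserved residues written in the "D149" style used above (one-letter amino acid code followed by a 1-based position). The function names and the toy sequence are hypothetical; the toy sequence is not SEQ ID NO:104, and the sketch assumes the candidate is numbered identically to the reference, which in practice would first require an alignment.

```python
import re

def parse_residue(label):
    """Split a label such as 'D149' into ('D', 149)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))

def missing_residues(sequence, labels):
    """Return labels whose residue is absent at the stated 1-based position."""
    missing = []
    for label in labels:
        residue, position = parse_residue(label)
        # Positions past the end of the sequence count as missing.
        if position > len(sequence) or sequence[position - 1] != residue:
            missing.append(label)
    return missing

# Toy input for demonstration only (hypothetical, not a real SEQ ID).
toy_sequence = "MKDAP"
print(missing_residues(toy_sequence, ["D3", "P5", "G4"]))  # ['G4']
```

An empty result would indicate that every listed residue is present at its stated position in the candidate.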
[0299] In certain aspects, a diol dehydratase may include a
polypeptide that comprises an amino acid sequence having 80%, 85%,
90%, 95%, 97%, 98%, or 99% sequence identity to SEQ ID NOS:308-311.
SEQ ID NO:308 shows the polypeptide sequence of PduG, a diol
dehydratase reactivation large subunit derived from Klebsiella
pneumoniae subsp. pneumoniae MGH 78578. SEQ ID NO:309 shows the
polypeptide sequence of PduH, a diol dehydratase reactivation small
subunit derived from Klebsiella pneumoniae subsp. pneumoniae MGH
78578. SEQ ID NO:310 shows the polypeptide sequence of a
B12-independent glycerol dehydratase from Clostridium butyricum.
SEQ ID NO:311 shows the polypeptide sequence of a glycerol
dehydratase activator from Clostridium butyricum. In certain
aspects, a B12-independent glycerol dehydratase may comprise
conserved amino acid residues, such as T36, G74, P87, E88, E97,
W126, R221, A263, Q265, R287, D289, E309, R317, G335, G345, G346,
N356, P374, R379, G399, G401, P403, D408, G432, C433, N452, C529,
G533, G539, G540, S559, G603, N604, A654, G658, R659, D676, N702,
Q735, N737, A747, P751, R760, V761, A762, G763, Q776, I780, and/or
R782. In certain aspects, a B12-independent glycerol dehydratase
activator may comprise certain conserved amino acid residues,
including D19, G20, G22, R24, F28, G31, C32, C36, W38, C39, N41,
P42, C58, C64, C96, G129, T132, G135, G136, D185, R187, N208, R222,
and/or R264.
[0300] A dehydration and reduction pathway, as described herein,
may comprise one or more alcohol dehydrogenases or secondary
alcohol dehydrogenases. An "alcohol dehydrogenase" or "secondary
alcohol dehydrogenase" that is part of a dehydration and reduction
pathway refers generally to an enzyme that catalyzes the conversion
of aldehyde or ketone substituents to alcohols. For instance,
4-octanone may be reduced to 4-octanol by a secondary alcohol
dehydrogenase as one enzymatic step in the conversion of butyroin to
a biofuel. Pseudomonads express at least one secondary alcohol
dehydrogenase that oxidizes 4-octanol to 4-octanone using NAD+
as a co-factor. As another example, Rhodococcus erythropolis
ATCC4277 catalyzes oxidation of medium to long chain secondary
fatty alcohols using NADH as a co-factor, using an enzyme that also
catalyzes the oxidation of 3-decanol and 4-decanol. In addition,
Nocardia fusca AKU2123 contains an (S)-specific secondary alcohol
dehydrogenase.
[0301] Genes encoding secondary alcohol dehydrogenases may be
isolated from these and other organisms according to known
techniques in the art and incorporated into the microbial systems or
recombinant organisms as described herein. In certain embodiments,
a microbial system or isolated microorganism may comprise natural
or optimized secondary alcohol dehydrogenases from Pseudomonads,
Rhodococcus erythropolis ATCC4277, Nocardia fusca AKU2123, or other
suitable organisms.
[0302] Examples of secondary alcohol dehydrogenases include, but
are not limited to, 2-butanol dehydrogenase, 3-hexanol
dehydrogenase, 4-octanol dehydrogenase, 5-decanol dehydrogenase,
6-dodecanol dehydrogenase, 7-tetradecanol dehydrogenase,
8-hexadecanol dehydrogenase, 2,5-dimethyl-3-hexanol dehydrogenase,
3,6-dimethyl-4-octanol dehydrogenase, 2,7-dimethyl-4-octanol
dehydrogenase, 2,9-dimethyl-4-decanol dehydrogenase,
1,4-diphenyl-2-butanol dehydrogenase,
bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase,
1,4-diindole-2-butanol dehydrogenase, cyclopentanol dehydrogenase,
2(or 3)-pentanol dehydrogenase, 2(or 3)-hexanol dehydrogenase, 2(or
3)-heptanol dehydrogenase, 2(or 3)-octanol dehydrogenase, 2(or
3)-nonanol dehydrogenase, 4-methyl-2(or 3)-pentanol dehydrogenase,
4-methyl-2(or 3)-hexanol dehydrogenase, 5-methyl-2(or 3)-hexanol
dehydrogenase, 6-methyl-2(or 3)-heptanol dehydrogenase,
1-phenyl-2(or 3)-butanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or
3)-butanol dehydrogenase, 1-indole-2(or 3)-butanol dehydrogenase,
3(or 4)-heptanol dehydrogenase, 3(or 4)-octanol dehydrogenase, 3(or
4)-nonanol dehydrogenase, 3(or 4)-decanol dehydrogenase, 3(or
4)-undecanol dehydrogenase, 2-methyl-3(or 4)-hexanol dehydrogenase,
5-methyl-3(or 4)-heptanol dehydrogenase, 6-methyl-3(or 4)-heptanol
dehydrogenase, 7-methyl-3(or 4)-octanol dehydrogenase,
1-phenyl-2(or 3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or
3)-pentanol dehydrogenase, 1-indole-2(or 3)-pentanol dehydrogenase,
4(or 5)-nonanol dehydrogenase, 4(or 5)-decanol dehydrogenase, 4(or
5)-undecanol dehydrogenase, 4(or 5)-dodecanol dehydrogenase,
2-methyl-3(or 4)-heptanol dehydrogenase, 3-methyl-4(or 5)-octanol
dehydrogenase, 2-methyl-4(or 5)-octanol dehydrogenase,
8-methyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-2(or 3)-hexanol
dehydrogenase, 1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase,
1-indole-2(or 3)-hexanol dehydrogenase, 4(or 5)-undecanol
dehydrogenase, 5(or 6)-undecanol dehydrogenase, 5(or 6)-tridecanol
dehydrogenase, 2-methyl-3(or 4)-octanol dehydrogenase,
3-methyl-4(or 5)-nonanol dehydrogenase, 2-methyl-4(or 5)-nonanol
dehydrogenase, 2-methyl-5(or 6)-decanol dehydrogenase,
1-phenyl-2(or 3)-heptanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or
3)-heptanol dehydrogenase, 1-indole-2(or 3)-heptanol dehydrogenase,
6(or 7)-tridecanol dehydrogenase, 6(or 7)-tetradecanol
dehydrogenase, 2-methyl-3(or 4)-nonanol dehydrogenase,
3-methyl-4(or 5)-decanol dehydrogenase, 2-methyl-4(or 5)-decanol
dehydrogenase, 2-methyl-5(or 6)-undecanol dehydrogenase,
1-phenyl-2(or 3)-octanol dehydrogenase, 1-(4-hydroxyphenyl)-2(or
3)-octanol dehydrogenase, 1-indole-2(or 3)-octanol dehydrogenase,
7(or 8)-pentadecanol dehydrogenase, 2-methyl-3(or 4)-decanol
dehydrogenase, 3-methyl-4(or 5)-undecanol dehydrogenase,
2-methyl-4(or 5)-undecanol dehydrogenase, 2-methyl-5(or
6)-dodecanol dehydrogenase, 1-phenyl-2(or 3)-nonanol dehydrogenase,
1-(4-hydroxyphenyl)-2(or 3)-nonanol dehydrogenase, 1-indole-2(or
3)-nonanol dehydrogenase, 2-methyl-3(or 4)-undecanol dehydrogenase,
3-methyl-4(or 5)-dodecanol dehydrogenase, 2-methyl-4(or
5)-dodecanol dehydrogenase, 2-methyl-5(or 6)-tridecanol
dehydrogenase, 1-phenyl-2(or 3)-decanol dehydrogenase,
1-(4-hydroxyphenyl)-2(or 3)-decanol dehydrogenase, 1-indole-2(or
3)-decanol dehydrogenase, 2,5-dimethyl-3(or 4)-heptanol
dehydrogenase, 2,6-dimethyl-3(or 4)-heptanol dehydrogenase,
2,7-dimethyl-3(or 4)-octanol dehydrogenase, 1-phenyl-4-methyl-2(or
3)-pentanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or
3)-pentanol dehydrogenase, 1-indole-4-methyl-2(or 3)-pentanol
dehydrogenase, 2,6-dimethyl-4(or 5)-octanol dehydrogenase,
3,8-dimethyl-4(or 5)-nonanol dehydrogenase, 1-phenyl-4-methyl-2(or
3)-hexanol dehydrogenase, 1-(4-hydroxyphenyl)-4-methyl-2(or
3)-hexanol dehydrogenase, 1-indole-4-methyl-2(or 3)-hexanol
dehydrogenase, 2,8-dimethyl-4(or 5)-nonanol dehydrogenase,
1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase,
1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase,
1-indole-5-methyl-2(or 3)-hexanol dehydrogenase,
1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase,
1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase,
1-indole-6-methyl-2(or 3)-heptanol dehydrogenase,
1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase,
1-indole-4-phenyl-2(or 3)-butanol dehydrogenase,
1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase,
1,10-diamino-5-decanol dehydrogenase,
1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase,
2-hexanol-1,6-dicarboxylic acid dehydrogenase, phenylethanol
dehydrogenase, 4-hydroxyphenylethanol dehydrogenase,
indole-3-ethanol dehydrogenase, and the like.
[0303] In certain aspects, a selected alcohol dehydrogenase or
secondary alcohol dehydrogenase may have one or more of the above
exemplified alcohol dehydrogenase activities, such as a 2-butanol
dehydrogenase activity, 3-hexanol dehydrogenase activity, and/or a
4-octanol dehydrogenase activity, among others.
[0304] In certain aspects, a recombinant microorganism may comprise
one or more secondary alcohol dehydrogenases encoded by a
nucleotide reference sequence selected from SEQ ID NO:109, 111,
113, 115, 117, 119, 121, 123, 125, 127, 129, 131, 133, 135, 137,
139, and 141, or an enzyme having a polypeptide sequence selected
from SEQ ID NO:110, 112, 114, 116, 118, 120, 122, 124, 126, 128,
130, 132, 134, 136, 138, 140, and 142, including biologically
active fragments or variants thereof, such as optimized variants.
Certain aspects may also comprise nucleotide sequences or
polypeptide sequences having 80%, 85%, 90%, 95%, 97%, 98%, or 99%
sequence identity to SEQ ID NOS:109-142.
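By way of illustration only, the following sketch shows one common way a percent sequence identity could be computed from two sequences that have already been aligned, and compared against a threshold such as 80%. It assumes identity is counted as matching columns over all aligned columns, with "-" marking a gap; other conventions exist and the applicable definition may differ. The function names and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True when the percent identity is at or above the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold

# Toy alignment (hypothetical): 6 matching columns out of 8.
print(percent_identity("MKTAY-LV", "MKTGYSLV"))  # 75.0
print(meets_threshold("MKTAY-LV", "MKTGYSLV"))   # False
```

The alignment itself is outside the scope of this sketch and would ordinarily be produced by a separate alignment tool.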
[0305] For the secondary alcohol dehydrogenase sequences referred
to above, SEQ ID NO:109 is the nucleotide sequence and SEQ ID
NO:110 is the polypeptide sequence of a secondary alcohol
dehydrogenase (2adh-1: PP_1946) isolated from Pseudomonas
putida KT2440. SEQ ID NO:111 is the nucleotide sequence and SEQ ID
NO:112 is the polypeptide sequence of a secondary alcohol
dehydrogenase (2adh-2: PP_1817) isolated from Pseudomonas
putida KT2440.
[0306] SEQ ID NO:113 is the nucleotide sequence and SEQ ID NO:114
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-3: PP_1953) isolated from Pseudomonas putida KT2440.
SEQ ID NO:115 is the nucleotide sequence and SEQ ID NO:116 is the
polypeptide sequence of a secondary alcohol dehydrogenase (2adh-4:
PP_3037) isolated from Pseudomonas putida KT2440.
[0307] SEQ ID NO:117 is the nucleotide sequence and SEQ ID NO:118
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-5: PP_1852) isolated from Pseudomonas putida KT2440.
SEQ ID NO:119 is the nucleotide sequence and SEQ ID NO:120 is the
polypeptide sequence of a secondary alcohol dehydrogenase (2adh-6:
PP_2723) isolated from Pseudomonas putida KT2440.
[0308] SEQ ID NO:121 is the nucleotide sequence and SEQ ID NO:122
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-7: PP_2002) isolated from Pseudomonas putida KT2440.
SEQ ID NO:123 is the nucleotide sequence and SEQ ID NO:124 is the
polypeptide sequence of a secondary alcohol dehydrogenase (2adh-8:
PP_1914) isolated from Pseudomonas putida KT2440.
[0309] SEQ ID NO:125 is the nucleotide sequence and SEQ ID NO:126
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-9: PP_1914) isolated from Pseudomonas putida KT2440.
SEQ ID NO:127 is the nucleotide sequence and SEQ ID NO:128 is the
polypeptide sequence of a secondary alcohol dehydrogenase (2adh-10:
PP_3926) isolated from Pseudomonas putida KT2440.
[0310] SEQ ID NO:129 is the nucleotide sequence and SEQ ID NO:130
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-11: PFL_1756) isolated from Pseudomonas fluorescens
Pf-5. SEQ ID NO:131 is the nucleotide sequence and SEQ ID NO:132 is
the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-12: KPN_01694) isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578.
[0311] SEQ ID NO:133 is the nucleotide sequence and SEQ ID NO:134
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-13: KPN_02061) isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578. SEQ ID NO:135 is the nucleotide
sequence and SEQ ID NO:136 is the polypeptide sequence of a
secondary alcohol dehydrogenase (2adh-14: KPN_00827) isolated
from Klebsiella pneumoniae subsp. pneumoniae MGH 78578.
[0312] SEQ ID NO:137 is the nucleotide sequence and SEQ ID NO:138
is the polypeptide sequence of a secondary alcohol dehydrogenase
(2adh-16: KPN_01350) isolated from Klebsiella pneumoniae
subsp. pneumoniae MGH 78578. SEQ ID NO:139 is the nucleotide
sequence and SEQ ID NO:140 is the polypeptide sequence of a
secondary alcohol dehydrogenase (2adh-17: KPN_03369) isolated
from Klebsiella pneumoniae subsp. pneumoniae MGH 78578. SEQ ID
NO:141 is the nucleotide sequence and SEQ ID NO:142 is the
polypeptide sequence of a secondary alcohol dehydrogenase (2adh-18:
KPN_03363) isolated from Klebsiella pneumoniae subsp.
pneumoniae MGH 78578.
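By way of illustration only, the sequence assignments recited above can be collected into a simple lookup table. In the sketch below each odd-numbered nucleotide SEQ ID NO is paired with the next even-numbered polypeptide SEQ ID NO, and locus tags are written with a plain underscore. The variable and function names are hypothetical; the entries are copied from the text above, including the locus tag PP_1914, which is listed for both 2adh-8 and 2adh-9.

```python
PUTIDA = "Pseudomonas putida KT2440"
FLUORESCENS = "Pseudomonas fluorescens Pf-5"
KLEBSIELLA = "Klebsiella pneumoniae subsp. pneumoniae MGH 78578"

# nucleotide SEQ ID NO -> (enzyme label, locus tag, source organism)
SECONDARY_ADH = {
    109: ("2adh-1", "PP_1946", PUTIDA),
    111: ("2adh-2", "PP_1817", PUTIDA),
    113: ("2adh-3", "PP_1953", PUTIDA),
    115: ("2adh-4", "PP_3037", PUTIDA),
    117: ("2adh-5", "PP_1852", PUTIDA),
    119: ("2adh-6", "PP_2723", PUTIDA),
    121: ("2adh-7", "PP_2002", PUTIDA),
    123: ("2adh-8", "PP_1914", PUTIDA),   # as listed in the text
    125: ("2adh-9", "PP_1914", PUTIDA),   # same tag as 2adh-8 in the text
    127: ("2adh-10", "PP_3926", PUTIDA),
    129: ("2adh-11", "PFL_1756", FLUORESCENS),
    131: ("2adh-12", "KPN_01694", KLEBSIELLA),
    133: ("2adh-13", "KPN_02061", KLEBSIELLA),
    135: ("2adh-14", "KPN_00827", KLEBSIELLA),
    137: ("2adh-16", "KPN_01350", KLEBSIELLA),
    139: ("2adh-17", "KPN_03369", KLEBSIELLA),
    141: ("2adh-18", "KPN_03363", KLEBSIELLA),
}

def polypeptide_seq_id(nucleotide_seq_id):
    """Return the polypeptide SEQ ID NO paired with a nucleotide SEQ ID NO."""
    if nucleotide_seq_id not in SECONDARY_ADH:
        raise KeyError(f"no secondary alcohol dehydrogenase entry for {nucleotide_seq_id}")
    return nucleotide_seq_id + 1

print(SECONDARY_ADH[129], polypeptide_seq_id(129))
```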
[0313] In certain aspects, an alcohol dehydrogenase (e.g., DEHU
hydrogenase), a secondary alcohol dehydrogenase (2ADH), a fragment,
variant, or derivative thereof, or any other enzyme that utilizes
such an active site, may comprise at least one of a nicotinamide
adenine dinucleotide (NAD+), NADH, nicotinamide adenine
dinucleotide phosphate (NADP+), or NADPH binding motif. In certain
embodiments, the NAD+, NADH, NADP+, or NADPH binding motif may be
selected from the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y,
Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y,
Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein
Y is independently selected from alanine, glycine, and serine,
wherein G is glycine, and wherein X is independently selected from
a genetically encoded amino acid.
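By way of illustration only, the binding motifs recited above can be expressed as regular expressions and searched for in a polypeptide sequence. The sketch below assumes that "a genetically encoded amino acid" means one of the 20 standard one-letter codes, translates Y to the class [AGS], G to glycine, and X to any standard residue, and reports overlapping hits with 1-based start positions. The function names and the toy sequence are hypothetical.

```python
import re

# Motif templates as recited in the text.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Y is alanine, glycine, or serine; G is glycine; X is any standard residue
# (the 20-letter alphabet is an assumption of this sketch).
SYMBOLS = {
    "Y": "[AGS]",
    "G": "G",
    "X": "[ACDEFGHIKLMNPQRSTVWY]",
}

def motif_to_pattern(motif):
    """Compile a motif template into a lookahead regex so hits may overlap."""
    body = "".join(SYMBOLS[symbol] for symbol in motif.split("-"))
    return re.compile(f"(?=({body}))")

def find_binding_motifs(sequence):
    """Return (motif, 1-based start, matched text) for every hit."""
    hits = []
    for motif in MOTIFS:
        for match in motif_to_pattern(motif).finditer(sequence):
            hits.append((motif, match.start() + 1, match.group(1)))
    return hits

# Toy sequence for demonstration only (hypothetical).
for hit in find_binding_motifs("MKGTGGIGLA"):
    print(hit)
```

A hit indicates only that the sequence pattern is present; it does not by itself establish co-factor binding.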
[0314] As one example of a step in a reduction and dehydration
pathway, α-hydroxy cyclopentanone may be reduced to
1,2-cyclopentanediol. For example, the glycerol dehydrogenase
isolated from Hansenula ofunaensis favors the reduction of
α-hydroxy ketones and α-keto ketones, and has very
broad substrate specificity. The alcohol dehydrogenase
derived from Hansenula polymorpha and meso-2,3-butanediol
dehydrogenase have similar properties. Certain embodiments may
incorporate a 1,2-cyclopentanediol dehydrogenase into the microbial
system or isolated microorganism. Other embodiments may incorporate
a glycerol dehydrogenase from Hansenula ofunaensis, Hansenula
polymorpha, Klebsiella pneumoniae, or any other suitable
organism.
[0315] By way of example, a chemical or hydrocarbon such as
1,2-cyclopentanediol may be dehydrated to form cyclopentanone as
one enzymatic step in a reduction and dehydration pathway. There
are at least two different types of diol dehydratases that may
catalyze dehydration of chemicals such as 1,2-cyclopentanediol.
Certain embodiments of a microbial system comprising a reduction and
dehydration pathway will comprise diol dehydratases such as
1,2-cyclopentanediol dehydratase.
[0316] In the last enzymatic step for a reduction and dehydration
pathway, the conversion of such exemplary chemicals as
α-hydroxy cyclopentanone to cyclopentanol may include the
reduction of cyclopentanone to cyclopentanol. This step may be
catalyzed by cyclopentanol dehydrogenase, which is found in
Comamonas sp. strain NCIMB 9872 and whose gene (cpnA) has been
isolated. Certain embodiments of a microbial system or isolated
microorganism may comprise a cyclopentanol dehydrogenase, such as
that encoded by cpnA in Comamonas sp. strain NCIMB 9872, among
others described herein.
[0317] As detailed below, in certain embodiments, selected C--C
ligation pathways may be utilized in combination with selected
components or enzymes of a reduction and dehydration pathway to
produce a commodity chemical, or intermediate thereof.
[0318] For example, certain embodiments include a method wherein
the C--C ligation pathway may comprise an acetoaldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 2,3-butanediol dehydrogenase, a 2,3-butanediol
dehydratase, and a 2-butanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise a
propionaldehyde lyase and wherein the reduction and dehydration
pathway may comprise at least one of a 3,4-hexanediol
dehydrogenase, a 3,4-hexanediol dehydratase, and a 3-hexanol
dehydrogenase.
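By way of illustration only, the relationship between the ligated aldehydes and the names of the resulting diol and alcohol can be sketched for the simplest case of linear, unbranched aldehydes. The sketch assumes that ligating a C(m) aldehyde with a C(n) aldehyde (m not greater than n) yields an (m+n)-carbon chain carrying oxygen at positions m and m+1, so that acetaldehyde with acetaldehyde gives 2,3-butanediol and then 2-butanol, as recited above. Branched and aryl-substituted aldehydes are not handled, and the function and table names are hypothetical.

```python
# Alkane stems by total carbon count (covers 4 to 16 carbons).
ALKANE_STEMS = {
    4: "butan", 5: "pentan", 6: "hexan", 7: "heptan", 8: "octan",
    9: "nonan", 10: "decan", 11: "undecan", 12: "dodecan",
    13: "tridecan", 14: "tetradecan", 15: "pentadecan", 16: "hexadecan",
}

def pathway_names(carbons_a, carbons_b):
    """Return (diol name, alcohol name) for two linear aldehydes."""
    short, long_ = sorted((carbons_a, carbons_b))
    total = short + long_
    if short < 2 or total not in ALKANE_STEMS:
        raise ValueError("unsupported aldehyde chain lengths")
    stem = ALKANE_STEMS[total]
    diol = f"{short},{short + 1}-{stem}ediol"
    if short == long_:
        # Symmetrical diol: both hydroxyl positions are equivalent.
        alcohol = f"{short}-{stem}ol"
    else:
        # Unsymmetrical diol: the hydroxyl may remain at either position.
        alcohol = f"{short}(or {short + 1})-{stem}ol"
    return diol, alcohol

print(pathway_names(2, 2))  # ('2,3-butanediol', '2-butanol')
print(pathway_names(3, 3))  # ('3,4-hexanediol', '3-hexanol')
print(pathway_names(2, 3))  # ('2,3-pentanediol', '2(or 3)-pentanol')
```

The "m(or m+1)" form mirrors the naming used herein for alcohols derived from unsymmetrical diols.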
[0319] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
4,5-octanediol dehydrogenase, a 4,5-octanediol dehydratase, and a
4-octanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise a butyraldehyde
lyase and wherein the reduction and dehydration pathway may
comprise at least one of a 5,6-decanediol dehydrogenase, a
5,6-decanediol dehydratase, and a 5-decanol dehydrogenase.
[0320] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
6,7-dodecanediol dehydrogenase, a 6,7-dodecanediol dehydratase, and
a 6-dodecanol dehydrogenase. Additional embodiments include a
method wherein the C--C ligation pathway may comprise a
butyraldehyde lyase and wherein the reduction and dehydration
pathway may comprise at least one of a 7,8-tetradecanediol
dehydrogenase, a 7,8-tetradecanediol dehydratase, and a
7-tetradecanol dehydrogenase.
[0321] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
8,9-hexadecanediol dehydrogenase, a 8,9-hexadecanediol dehydratase,
and a 8-hexadecanol dehydrogenase. Additional embodiments include a
method wherein the C--C ligation pathway may comprise an
isobutyraldehyde lyase and wherein the reduction and dehydration
pathway may comprise at least one of a 2,5-dimethyl-3,4-hexanediol
dehydrogenase, a 2,5-dimethyl-3,4-hexanediol dehydratase, and a
2,5-dimethyl-3-hexanol dehydrogenase.
[0322] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a 2-methyl-butyraldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 3,6-dimethyl-4,5-octanediol dehydrogenase, a
3,6-dimethyl-4,5-octanediol dehydratase, and a
3,6-dimethyl-4-octanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise a
3-methyl-butyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
2,7-dimethyl-4,5-octanediol dehydrogenase, a
2,7-dimethyl-4,5-octanediol dehydratase, and a
2,7-dimethyl-4-octanol dehydrogenase.
[0323] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a 3-methyl-butyraldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 2,9-dimethyl-5,6-decanediol dehydrogenase, a
2,9-dimethyl-5,6-decanediol dehydratase, and a
2,9-dimethyl-4-decanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise a
phenylacetaldehyde lyase and wherein the reduction and dehydration
pathway may comprise at least one of a 1,4-diphenyl-2,3-butanediol
dehydrogenase, a 1,4-diphenyl-2,3-butanediol dehydratase, and a
1,4-diphenyl-2-butanol dehydrogenase.
[0324] Additional embodiments include a method wherein the C--C
ligation pathway may comprise a phenylacetaldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a
bis-1,4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a
bis-1,4-(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise a phenylacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1,4-diindole-2,3-butanediol dehydrogenase, a
1,4-diindole-2,3-butanediol dehydratase, and a
1,4-diindole-2-butanol dehydrogenase.
[0325] Additional embodiments include a method wherein the C--C
ligation pathway may comprise an α-keto adipate carboxylyase,
and wherein the reduction and dehydration pathway may comprise at
least one of a 1,2-cyclopentanediol dehydrogenase, a
1,2-cyclopentanediol dehydratase, and a cyclopentanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of an
acetoaldehyde/propionaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 2,3-pentanediol
dehydrogenase, a 2,3-pentanediol dehydratase, and a 2(or
3)-pentanol dehydrogenase.
[0326] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/butyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 2,3-hexanediol
dehydrogenase, a 2,3-hexanediol dehydratase, and a 2(or 3)-hexanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of an
acetoaldehyde/pentaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 2,3-heptanediol
dehydrogenase, a 2,3-heptanediol dehydratase, and a 2(or
3)-heptanol dehydrogenase.
[0327] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/hexyldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 2,3-octanediol
dehydrogenase, a 2,3-octanediol dehydratase, and a 2(or 3)-octanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of an
acetoaldehyde/octaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 2,3-nonanediol
dehydrogenase, a 2,3-nonanediol dehydratase, and a 2(or 3)-nonanol
dehydrogenase.
[0328] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/isobutyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
4-methyl-2,3-pentanediol dehydrogenase, a 4-methyl-2,3-pentanediol
dehydratase, and a 4-methyl-2(or 3)-pentanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of an
acetoaldehyde/2-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
4-methyl-2,3-hexanediol dehydrogenase, a 4-methyl-2,3-hexanediol
dehydratase, and a 4-methyl-2(or 3)-hexanol dehydrogenase.
[0329] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/3-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
5-methyl-2,3-hexanediol dehydrogenase, a 5-methyl-2,3-hexanediol
dehydratase, and a 5-methyl-2(or 3)-hexanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of an
acetoaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
6-methyl-2,3-heptanediol dehydrogenase, a 6-methyl-2,3-heptanediol
dehydratase, and a 6-methyl-2(or 3)-heptanol dehydrogenase.
[0330] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/phenylacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-phenyl-2,3-butanediol dehydrogenase, a 1-phenyl-2,3-butanediol
dehydratase, and a 1-phenyl-2(or 3)-butanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of an
acetoaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase.
[0331] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
acetoaldehyde/indoleacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-indole-2,3-butanediol dehydrogenase, a 1-indole-2,3-butanediol
dehydratase, and a 1-indole-2(or 3)-butanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
propionaldehyde/butyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 3,4-heptanediol
dehydrogenase, a 3,4-heptanediol dehydratase, and a 3(or
4)-heptanol dehydrogenase.
[0332] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/pentaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 3,4-octanediol
dehydrogenase, a 3,4-octanediol dehydratase, and a 3(or 4)-octanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of a
propionaldehyde/hexyldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 3,4-nonanediol
dehydrogenase, a 3,4-nonanediol dehydratase, and a 3(or 4)-nonanol
dehydrogenase.
[0333] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/heptaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 3,4-decanediol
dehydrogenase, a 3,4-decanediol dehydratase, and a 3(or 4)-decanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of a
propionaldehyde/octaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 3,4-undecanediol
dehydrogenase, a 3,4-undecanediol dehydratase, and a 3(or
4)-undecanol dehydrogenase.
[0334] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/isobutyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-3,4-hexanediol dehydrogenase, a 2-methyl-3,4-hexanediol
dehydratase, and a 2-methyl-3(or 4)-hexanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
propionaldehyde/2-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
5-methyl-3,4-heptanediol dehydrogenase, a 5-methyl-3,4-heptanediol
dehydratase, and a 5-methyl-3(or 4)-heptanol dehydrogenase.
[0335] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/3-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
6-methyl-3,4-heptanediol dehydrogenase, a 6-methyl-3,4-heptanediol
dehydratase, and a 6-methyl-3(or 4)-heptanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
propionaldehyde/4-methyl-pentaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
7-methyl-3,4-octanediol dehydrogenase, a 7-methyl-3,4-octanediol
dehydratase, and a 7-methyl-3(or 4)-octanol dehydrogenase.
[0336] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/phenylacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-phenyl-2,3-pentanediol dehydrogenase, a 1-phenyl-2,3-pentanediol
dehydratase, and a 1-phenyl-2(or 3)-pentanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
propionaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-pentanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-pentanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-pentanol dehydrogenase.
[0337] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
propionaldehyde/indoleacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-indole-2,3-pentanediol dehydrogenase, a 1-indole-2,3-pentanediol
dehydratase, and a 1-indole-2(or 3)-pentanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a butyraldehyde/pentaldehyde
lyase and wherein the reduction and dehydration pathway may
comprise at least one of a 4,5-nonanediol dehydrogenase, a
4,5-nonanediol dehydratase, and a 4(or 5)-nonanol
dehydrogenase.
[0338] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
butyraldehyde/hexyldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 4,5-decanediol
dehydrogenase, a 4,5-decanediol dehydratase, and a 4(or 5)-decanol
dehydrogenase. Additional embodiments include a method wherein the
C--C ligation pathway may comprise at least one of a
butyraldehyde/heptaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 4,5-undecanediol
dehydrogenase, a 4,5-undecanediol dehydratase, and a 4(or
5)-undecanol dehydrogenase.
[0339] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
butyraldehyde/octaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 4,5-dodecanediol
dehydrogenase, a 4,5-dodecanediol dehydratase, and a 4(or
5)-dodecanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of a
butyraldehyde/isobutyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
2-methyl-3,4-heptanediol dehydrogenase, a 2-methyl-3,4-heptanediol
dehydratase, and a 2-methyl-3(or 4)-heptanol dehydrogenase.
[0340] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
butyraldehyde/2-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
3-methyl-4,5-octanediol dehydrogenase, a 3-methyl-4,5-octanediol
dehydratase, and a 3-methyl-4(or 5)-octanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
butyraldehyde/3-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2-methyl-4,5-octanediol dehydrogenase, a 2-methyl-4,5-octanediol
dehydratase, and a 2-methyl-4(or 5)-octanol dehydrogenase.
[0341] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
butyraldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of an
8-methyl-4,5-nonanediol dehydrogenase, an 8-methyl-4,5-nonanediol
dehydratase, and an 8-methyl-4(or 5)-nonanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
butyraldehyde/phenylacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-phenyl-2,3-hexanediol dehydrogenase, a 1-phenyl-2,3-hexanediol
dehydratase, and a 1-phenyl-2(or 3)-hexanol dehydrogenase.
[0342] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
butyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-hexanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-hexanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-hexanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a butyraldehyde/indoleacetaldehyde lyase
and wherein the reduction and dehydration pathway may comprise at
least one of a 1-indole-2,3-hexanediol dehydrogenase, a
1-indole-2,3-hexanediol dehydratase, and a 1-indole-2(or 3)-hexanol
dehydrogenase.
[0343] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
pentaldehyde/hexyldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 5,6-undecanediol
dehydrogenase, a 5,6-undecanediol dehydratase, and a 5(or
6)-undecanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of a
pentaldehyde/heptaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a 5,6-dodecanediol
dehydrogenase, a 5,6-dodecanediol dehydratase, and a 5(or
6)-dodecanol dehydrogenase.
[0344] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
pentaldehyde/octaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
5,6-tridecanediol dehydrogenase, a 5,6-tridecanediol dehydratase,
and a 5(or 6)-tridecanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise at
least one of a pentaldehyde/isobutyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2-methyl-3,4-octanediol dehydrogenase, a 2-methyl-3,4-octanediol
dehydratase, and a 2-methyl-3(or 4)-octanol dehydrogenase.
[0345] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
pentaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
3-methyl-4,5-nonanediol dehydrogenase, a 3-methyl-4,5-nonanediol
dehydratase, and a 3-methyl-4(or 5)-nonanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
pentaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-4,5-nonanediol dehydrogenase, a 2-methyl-4,5-nonanediol
dehydratase, and a 2-methyl-4(or 5)-nonanol dehydrogenase.
[0346] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
pentaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-5,6-decanediol dehydrogenase, a 2-methyl-5,6-decanediol
dehydratase, and a 2-methyl-5(or 6)-decanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
pentaldehyde/phenylacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-phenyl-2,3-heptanediol dehydrogenase, a 1-phenyl-2,3-heptanediol
dehydratase, and a 1-phenyl-2(or 3)-heptanol dehydrogenase.
[0347] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-heptanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-heptanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-heptanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a pentaldehyde/indoleacetaldehyde lyase
and wherein the reduction and dehydration pathway may comprise at
least one of a 1-indole-2,3-heptanediol dehydrogenase, a
1-indole-2,3-heptanediol dehydratase, and a 1-indole-2(or
3)-heptanol dehydrogenase.
[0348] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
hexyldehyde/heptaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
6,7-tridecanediol dehydrogenase, a 6,7-tridecanediol dehydratase,
and a 6(or 7)-tridecanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise at
least one of a hexyldehyde/octaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
6,7-tetradecanediol dehydrogenase, a 6,7-tetradecanediol
dehydratase, and a 6(or 7)-tetradecanol dehydrogenase.
[0349] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
hexyldehyde/isobutyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
2-methyl-3,4-nonanediol dehydrogenase, a 2-methyl-3,4-nonanediol
dehydratase, and a 2-methyl-3(or 4)-nonanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
hexyldehyde/2-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
3-methyl-4,5-decanediol dehydrogenase, a 3-methyl-4,5-decanediol
dehydratase, and a 3-methyl-4(or 5)-decanol dehydrogenase.
[0350] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
hexyldehyde/3-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-4,5-decanediol dehydrogenase, a 2-methyl-4,5-decanediol
dehydratase, and a 2-methyl-4(or 5)-decanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
hexyldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-5,6-undecanediol dehydrogenase, a
2-methyl-5,6-undecanediol dehydratase, and a 2-methyl-5(or
6)-undecanol dehydrogenase.
[0351] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
hexyldehyde/phenylacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-phenyl-2,3-octanediol dehydrogenase, a 1-phenyl-2,3-octanediol
dehydratase, and a 1-phenyl-2(or 3)-octanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
hexyldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-octanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-octanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-octanol dehydrogenase.
[0352] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
hexyldehyde/indoleacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-indole-2,3-octanediol dehydrogenase, a 1-indole-2,3-octanediol
dehydratase, and a 1-indole-2(or 3)-octanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a heptaldehyde/octaldehyde
lyase and wherein the reduction and dehydration pathway may
comprise at least one of a 7,8-pentadecanediol dehydrogenase, a
7,8-pentadecanediol dehydratase, and a 7(or 8)-pentadecanol
dehydrogenase.
[0353] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
heptaldehyde/isobutyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
2-methyl-3,4-decanediol dehydrogenase, a 2-methyl-3,4-decanediol
dehydratase, and a 2-methyl-3(or 4)-decanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
heptaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
3-methyl-4,5-undecanediol dehydrogenase, a
3-methyl-4,5-undecanediol dehydratase, and a 3-methyl-4(or
5)-undecanol dehydrogenase.
[0354] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
heptaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-4,5-undecanediol dehydrogenase, a
2-methyl-4,5-undecanediol dehydratase, and a 2-methyl-4(or
5)-undecanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of a
heptaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-5,6-dodecanediol dehydrogenase, a
2-methyl-5,6-dodecanediol dehydratase, and a 2-methyl-5(or
6)-dodecanol dehydrogenase.
[0355] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
heptaldehyde/phenylacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-phenyl-2,3-nonanediol dehydrogenase, a 1-phenyl-2,3-nonanediol
dehydratase, and a 1-phenyl-2(or 3)-nonanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
heptaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-nonanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-nonanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-nonanol dehydrogenase.
[0356] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
heptaldehyde/indoleacetaldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
1-indole-2,3-nonanediol dehydrogenase, a 1-indole-2,3-nonanediol
dehydratase, and a 1-indole-2(or 3)-nonanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of an
octaldehyde/isobutyraldehyde lyase and wherein the reduction and
dehydration pathway may comprise at least one of a
2-methyl-3,4-undecanediol dehydrogenase, a
2-methyl-3,4-undecanediol dehydratase, and a 2-methyl-3(or
4)-undecanol dehydrogenase.
[0357] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
octaldehyde/2-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
3-methyl-4,5-dodecanediol dehydrogenase, a
3-methyl-4,5-dodecanediol dehydratase, and a 3-methyl-4(or
5)-dodecanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of an
octaldehyde/3-methyl-butyraldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-4,5-dodecanediol dehydrogenase, a
2-methyl-4,5-dodecanediol dehydratase, and a 2-methyl-4(or
5)-dodecanol dehydrogenase.
[0358] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
octaldehyde/4-methyl-pentaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
2-methyl-5,6-tridecanediol dehydrogenase, a
2-methyl-5,6-tridecanediol dehydratase, and a 2-methyl-5(or
6)-tridecanol dehydrogenase. Additional embodiments include a
method wherein the C--C ligation pathway may comprise at least one
of an octaldehyde/phenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-phenyl-2,3-decanediol dehydrogenase, a 1-phenyl-2,3-decanediol
dehydratase, and a 1-phenyl-2(or 3)-decanol dehydrogenase.
[0359] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
octaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-2,3-decanediol dehydrogenase, a
1-(4-hydroxyphenyl)-2,3-decanediol dehydratase, and a
1-(4-hydroxyphenyl)-2(or 3)-decanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of an octaldehyde/indoleacetaldehyde lyase
and wherein the reduction and dehydration pathway may comprise at
least one of a 1-indole-2,3-decanediol dehydrogenase, a
1-indole-2,3-decanediol dehydratase, and a 1-indole-2(or 3)-decanol
dehydrogenase.
[0360] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
isobutyraldehyde/2-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2,5-dimethyl-3,4-heptanediol dehydrogenase, a
2,5-dimethyl-3,4-heptanediol dehydratase, and a 2,5-dimethyl-3(or
4)-heptanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of an
isobutyraldehyde/3-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2,6-dimethyl-3,4-heptanediol dehydrogenase, a
2,6-dimethyl-3,4-heptanediol dehydratase, and a 2,6-dimethyl-3(or
4)-heptanol dehydrogenase.
[0361] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
isobutyraldehyde/4-methyl-pentaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2,7-dimethyl-3,4-octanediol dehydrogenase, a
2,7-dimethyl-3,4-octanediol dehydratase, and a 2,7-dimethyl-3(or
4)-octanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of an
isobutyraldehyde/phenylacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-phenyl-4-methyl-2,3-pentanediol dehydrogenase, a
1-phenyl-4-methyl-2,3-pentanediol dehydratase, and a
1-phenyl-4-methyl-2(or 3)-pentanol dehydrogenase.
[0362] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of an
isobutyraldehyde/4-hydroxyphenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydrogenase, a
1-(4-hydroxyphenyl)-4-methyl-2,3-pentanediol dehydratase, and a
1-(4-hydroxyphenyl)-4-methyl-2(or 3)-pentanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of an
isobutyraldehyde/indoleacetaldehyde lyase and wherein the reduction
and dehydration pathway may comprise at least one of a
1-indole-4-methyl-2,3-pentanediol dehydrogenase, a
1-indole-4-methyl-2,3-pentanediol dehydratase, and a
1-indole-4-methyl-2(or 3)-pentanol dehydrogenase.
[0363] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
2-methyl-butyraldehyde/3-methyl-butyraldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2,6-dimethyl-4,5-octanediol dehydrogenase, a
2,6-dimethyl-4,5-octanediol dehydratase, and a 2,6-dimethyl-4(or
5)-octanol dehydrogenase. Additional embodiments include a method
wherein the C--C ligation pathway may comprise at least one of a
2-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
3,8-dimethyl-4,5-nonanediol dehydrogenase, a
3,8-dimethyl-4,5-nonanediol dehydratase, and a 3,8-dimethyl-4(or
5)-nonanol dehydrogenase.
[0364] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
2-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-phenyl-4-methyl-2,3-hexanediol dehydrogenase, a
1-phenyl-4-methyl-2,3-hexanediol dehydratase, and a
1-phenyl-4-methyl-2(or 3)-hexanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a
2-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydrogenase,
a 1-(4-hydroxyphenyl)-4-methyl-2,3-hexanediol dehydratase, and a
1-(4-hydroxyphenyl)-4-methyl-2(or 3)-hexanol dehydrogenase.
[0365] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
2-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-indole-4-methyl-2,3-hexanediol dehydrogenase, a
1-indole-4-methyl-2,3-hexanediol dehydratase, and a
1-indole-4-methyl-2(or 3)-hexanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a
3-methyl-butyraldehyde/4-methyl-pentaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
2,8-dimethyl-4,5-nonanediol dehydrogenase, a
2,8-dimethyl-4,5-nonanediol dehydratase, and a 2,8-dimethyl-4(or
5)-nonanol dehydrogenase.
[0366] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
3-methyl-butyraldehyde/phenylacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-phenyl-5-methyl-2,3-hexanediol dehydrogenase, a
1-phenyl-5-methyl-2,3-hexanediol dehydratase, and a
1-phenyl-5-methyl-2(or 3)-hexanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a
3-methyl-butyraldehyde/4-hydroxyphenylacetaldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydrogenase,
a 1-(4-hydroxyphenyl)-5-methyl-2,3-hexanediol dehydratase, and a
1-(4-hydroxyphenyl)-5-methyl-2(or 3)-hexanol dehydrogenase.
[0367] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
3-methyl-butyraldehyde/indoleacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-indole-5-methyl-2,3-hexanediol dehydrogenase, a
1-indole-5-methyl-2,3-hexanediol dehydratase, and a
1-indole-5-methyl-2(or 3)-hexanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise at least one of a 4-methyl-pentaldehyde/phenylacetaldehyde
lyase and wherein the reduction and dehydration pathway may
comprise at least one of a 1-phenyl-6-methyl-2,3-heptanediol
dehydrogenase, a 1-phenyl-6-methyl-2,3-heptanediol dehydratase, and
a 1-phenyl-6-methyl-2(or 3)-heptanol dehydrogenase.
[0368] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
4-methyl-pentaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein
the reduction and dehydration pathway may comprise at least one of
a 1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydrogenase, a
1-(4-hydroxyphenyl)-6-methyl-2,3-heptanediol dehydratase, and a
1-(4-hydroxyphenyl)-6-methyl-2(or 3)-heptanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
4-methyl-pentaldehyde/indoleacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-indole-6-methyl-2,3-heptanediol dehydrogenase, a
1-indole-6-methyl-2,3-heptanediol dehydratase, and a
1-indole-6-methyl-2(or 3)-heptanol dehydrogenase.
[0369] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
phenylacetaldehyde/4-hydroxyphenylacetaldehyde lyase and wherein
the reduction and dehydration pathway may comprise at least one of
a 1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydrogenase, a
1-(4-hydroxyphenyl)-4-phenyl-2,3-butanediol dehydratase, and a
1-(4-hydroxyphenyl)-4-phenyl-2(or 3)-butanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise at least one of a
phenylacetaldehyde/indoleacetaldehyde lyase and wherein the
reduction and dehydration pathway may comprise at least one of a
1-indole-4-phenyl-2,3-butanediol dehydrogenase, a
1-indole-4-phenyl-2,3-butanediol dehydratase, and a
1-indole-4-phenyl-2(or 3)-butanol dehydrogenase.
[0370] Additional embodiments include a method wherein the C--C
ligation pathway may comprise at least one of a
4-hydroxyphenylacetaldehyde/indoleacetaldehyde lyase and
wherein the reduction and dehydration pathway may comprise at least
one of a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydrogenase,
a 1-indole-4-(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a
1-indole-4-(4-hydroxyphenyl)-2(or 3)-butanol dehydrogenase.
Additional embodiments include a method wherein the C--C ligation
pathway may comprise a 5-amino-pentaldehyde lyase, and wherein the
reduction and dehydration pathway may comprise at least one of a
1,10-diamino-5,6-decanediol dehydrogenase, a
1,10-diamino-5,6-decanediol dehydratase, and a
1,10-diamino-5-decanol dehydrogenase. Additional embodiments
include a method wherein the C--C ligation pathway may comprise a
4-hydroxyphenylacetaldehyde lyase, and wherein the reduction and
dehydration pathway may comprise at least one of a
1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydrogenase, a
1,4-di(4-hydroxyphenyl)-2,3-butanediol dehydratase, and a
1,4-di(4-hydroxyphenyl)-2-butanol dehydrogenase. Additional
embodiments include a method wherein the C--C ligation pathway may
comprise a succinate semialdehyde lyase, and wherein the reduction
and dehydration pathway may comprise at least one of a
2,3-hexanediol-1,6-dicarboxylic acid dehydrogenase, a
2,3-hexanediol-1,6-dicarboxylic acid dehydratase, and a
2-hexanol-1,6-dicarboxylic acid dehydrogenase.
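The diol names in the preceding embodiments follow simple carbon bookkeeping for the unbranched aldehydes: joining the carbonyl carbons of an m-carbon aldehyde and an n-carbon aldehyde (m not greater than n) gives an (m+n)-carbon chain carrying hydroxyl groups at positions m and m+1. The following Python sketch is illustrative only. It covers the linear aldehydes alone, and the helper names (CARBONS, ALKANE_STEM, diol_name) are hypothetical and do not appear in the application.

```python
# Illustrative sketch: carbon bookkeeping for ligation of two linear aldehydes.
# All names here are hypothetical helpers, not part of the application.

# Carbon counts of the linear aldehydes, spelled as in the text above.
CARBONS = {
    "butyraldehyde": 4,
    "pentaldehyde": 5,
    "hexyldehyde": 6,
    "heptaldehyde": 7,
    "octaldehyde": 8,
}

# Alkane stems for the chain lengths reachable from the pairs above.
ALKANE_STEM = {
    8: "octane", 9: "nonane", 10: "decane", 11: "undecane", 12: "dodecane",
    13: "tridecane", 14: "tetradecane", 15: "pentadecane", 16: "hexadecane",
}


def diol_name(aldehyde_a: str, aldehyde_b: str) -> str:
    """Name the vicinal diol expected from ligating two linear aldehydes."""
    # Number the chain from the end contributed by the shorter aldehyde.
    m, n = sorted((CARBONS[aldehyde_a], CARBONS[aldehyde_b]))
    stem = ALKANE_STEM[m + n]
    return f"{m},{m + 1}-{stem}diol"


if __name__ == "__main__":
    print(diol_name("butyraldehyde", "pentaldehyde"))
    print(diol_name("heptaldehyde", "octaldehyde"))
```

Under these assumptions, the butyraldehyde/pentaldehyde pair gives 4,5-nonanediol and the heptaldehyde/octaldehyde pair gives 7,8-pentadecanediol, matching the names recited above. Branched and aryl-substituted aldehydes would need additional locant handling that this sketch does not attempt.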
[0371] Certain embodiments of a microbial system or recombinant
microorganism may comprise genes encoding enzymes that are able to
catalyze the conversion (e.g., by reduction and dehydration) of
4-octanol to octene or octane. Other embodiments may comprise
redesigned or de novo designed enzymes for this reduction and
dehydration pathway. For example, three redesigned enzymes could
convert 4-octanone to either 3- or 4-octene. The first step could
be catalyzed by a redesigned isocitrate dehydrogenase. This enzyme
could catalyze the formation of 4-hydroxy-3(or 5)-carboxylic
octane. The 4-hydroxy group could then be phosphorylated by a
redesigned kinase. Finally, a redesigned mevalonate diphosphate
decarboxylase could catalyze the formation of 3(or 4)-octene.
[0372] In other embodiments, several redesigned enzymes could
convert 4-octanone to octane. For example, the 4-hydroxy-3(or
5)-carboxylic octane is sequentially reduced and dehydrated to form
3(or 5)-carboxylic octane. Redesigned enzymes involved in fatty
acid metabolism can catalyze these reactions. The 3(or
5)-carboxylic octane can be reduced to the corresponding aldehyde by
an aldehyde dehydrogenase, and the product may be decarbonylated to
form octane by a redesigned decarbonylase.
[0373] As noted above, for the production of certain commodity
chemicals, such as 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and
indole-3-ethanol, among other similar chemicals, a biosynthesis
pathway (e.g., aldehyde biosynthesis pathway) may optionally or
further comprise one or more genes encoding a decarboxylase enzyme,
such as an indole-3-pyruvate decarboxylase (IPDC), to produce an
aldehyde. In certain aspects, an IPDC may comprise an amino acid
sequence that is at least 80%, 90%, 95%, 98%, or 99% identical to
the amino acid sequence set forth in SEQ ID NO:312. An IPDC enzyme
may comprise certain conserved amino acid residues, such as G24,
D25, E48, A55, R60, G75, E89, H113, G252, G405, G413, G428, G430,
and/or N456.
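The identity thresholds recited above (at least 80%, 90%, 95%, 98%, or 99%) can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps. The application does not specify an alignment method or a denominator convention, so both are assumptions here, and the function name percent_identity and the toy sequences are hypothetical (they are not SEQ ID NO:312).

```python
# Illustrative sketch: percent identity over a pre-aligned sequence pair.
# Assumption: identity = identical residue columns / all columns in which
# at least one sequence has a residue (gap-versus-residue counts as a mismatch).


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity for two aligned, equal-length sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # column is a gap in both sequences; skip it
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences for illustration only.
    reference = "MKTAYIAKQR"
    candidate = "MKTAYIAKQA"
    identity = percent_identity(reference, candidate)
    for threshold in (80, 90, 95, 98, 99):
        print(threshold, identity >= threshold)
```

Other conventions, such as dividing by the length of the shorter sequence, would give different values for gapped alignments, so the convention should be stated whenever a threshold is applied.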
[0374] In these and other embodiments, a recombinant microorganism
may comprise an aldehyde reductase, such as a phenylacetaldehyde
reductase (PAR), to convert an aldehyde to a commodity chemical. In
certain aspects, a PAR may comprise an amino acid sequence that is
at least 80%, 90%, 95%, 98%, or 99% identical to the amino acid
sequence set forth in SEQ ID NO:313, which shows the sequence of a
PAR enzyme derived from Rhodococcus sp. ST-10. In certain aspects,
a PAR enzyme may comprise at least one of a nicotinamide adenine
dinucleotide (NAD+), NADH, nicotinamide adenine dinucleotide
phosphate (NADP+), or NADPH binding motif. In certain embodiments,
the NAD+, NADH, NADP+, or NADPH binding motif may be selected from
the group consisting of Y-X-G-G-X-Y, Y-X-X-G-G-X-Y,
Y-X-X-X-G-G-X-Y, Y-X-G-X-X-Y, Y-X-X-G-G-X-X-Y, Y-X-X-X-G-X-X-Y,
Y-X-G-X-Y, Y-X-X-G-X-Y, Y-X-X-X-G-X-Y, and Y-X-X-X-X-G-X-Y; wherein
Y is independently selected from alanine, glycine, and serine,
wherein G is glycine, and wherein X is independently selected from
a genetically encoded amino acid.
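The binding motifs listed above can be read as simple patterns over one-letter amino acid codes. The Python sketch below translates each motif into a regular expression and scans a sequence for matches. Note that in the motif notation "Y" is a placeholder for alanine, glycine, or serine rather than tyrosine. The function names and the test sequence are hypothetical, and treating "X" as any of the 20 standard residues is an assumption about what "genetically encoded" covers.

```python
import re

# Motifs exactly as recited in the text above.
MOTIFS = [
    "Y-X-G-G-X-Y", "Y-X-X-G-G-X-Y", "Y-X-X-X-G-G-X-Y", "Y-X-G-X-X-Y",
    "Y-X-X-G-G-X-X-Y", "Y-X-X-X-G-X-X-Y", "Y-X-G-X-Y", "Y-X-X-G-X-Y",
    "Y-X-X-X-G-X-Y", "Y-X-X-X-X-G-X-Y",
]

# Motif symbol -> regex character class (one-letter amino acid codes).
# "Y" here means alanine, glycine, or serine, not tyrosine.
TOKEN_TO_CLASS = {
    "Y": "[AGS]",
    "G": "G",
    "X": "[ACDEFGHIKLMNPQRSTVWY]",  # assumption: the 20 standard residues
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a dash-separated motif into a compiled regular expression."""
    return re.compile("".join(TOKEN_TO_CLASS[token] for token in motif.split("-")))


def find_motifs(sequence: str) -> list:
    """Return (motif, start_index, matched_text) for every match, overlaps included."""
    hits = []
    for motif in MOTIFS:
        body = motif_to_regex(motif).pattern
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({body}))", sequence):
            hits.append((motif, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    for hit in find_motifs("MKGAGGVAL"):  # made-up sequence for illustration
        print(hit)
```

Because the motifs are short and permissive, a match by itself is weak evidence of cofactor binding; a scan of this kind is best treated as a first filter.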
[0375] In certain embodiments, such a recombinant microorganism may
also or alternatively comprise a secondary alcohol dehydrogenase
having an activity selected from at least one of a phenylethanol
dehydrogenase activity, a 4-hydroxyphenylethanol dehydrogenase
activity, and an indole-3-ethanol dehydrogenase activity, to reduce
the aldehyde to its corresponding alcohol (e.g., 2-phenylethanol,
2-(4-hydroxyphenyl)ethanol, and indole-3-ethanol).
[0376] Embodiments of the present invention also include methods
for converting a suitable monosaccharide to a commodity chemical
comprising: (a) obtaining a suitable monosaccharide; (b) contacting
the suitable monosaccharide with a microbial system for a time
sufficient to convert the suitable monosaccharide to the commodity
chemical, wherein the microbial system comprises: (i) one or more
genes encoding and expressing a fatty acid biosynthesis pathway, an
amino acid biosynthetic pathway, and/or a short chain alcohol
biosynthetic pathway; (ii) one or more genes encoding and
expressing a keto-acid decarboxylase, aldehyde dehydrogenase,
and/or alcohol dehydrogenase; and (iii) an enzymatic reduction
pathway selected from (1) an enzymatic long chain alcohol reduction
pathway, (2) an enzymatic decarbonylation pathway, (3) an enzymatic
decarboxylation pathway, and (4) an enzymatic reduction pathway
comprising (1), (2), and/or (3), thereby converting the suitable
monosaccharide to the commodity chemical.
[0377] Embodiments of the present invention may comprise one or
more genes encoding and expressing enzymes in a fatty acid
synthesis pathway, which may be used, as one example, to produce
biofuels in the form of alkanes, such as medium to long chain
alkanes. In certain embodiments, the specificity of the fatty acid
biosynthesis pathway in the microbial system may be recalibrated or
redesigned. Merely by way of example, microorganisms generally
produce a mixture of long chain fatty acids (e.g., E. coli
naturally produces large quantities of long chain fatty acids
(C16-C19: <95% in whole cells) and small quantities of medium
chain fatty acids (C12: 2% and C14: 5% in whole cells)).
[0378] In certain embodiments, the recalibration or re-engineering
may be directed to increasing production of medium chain fatty
acids, including, but not limited to, caprylate (C8), caprate (C10),
laurate (C12), myristate (C14), and palmitate (C16), as alkanes
produced from these fatty acids are major components of gasoline,
diesel, and kerosene. In addition to these fatty acids, other
embodiments may be directed to increased production of long chain
fatty acids, including, but not limited to, stearate (C18),
arachidonate (C20), behenate (C22) and longer fatty acids, as
n-alkanes produced from these fatty acids are among the major
components of heavy oils.
[0379] For example, Cuphea species mainly accumulate medium chain
fatty acids as major components of their seed oils, and the
compositions vary by species. In particular, Cuphea
pulcherrima accumulates caprylate (C8:0) at 96%, Cuphea koehneana
accumulates caprate (C10:0) at 95.3%, and Cuphea polymorpha
accumulates laurate (C12:0) at 80.1%. Embodiments of the microbial
systems or isolated microorganisms according to the present
application may incorporate genes from various Cuphea species
encoding enzymes involved in a fatty acid biosynthesis pathway, and
these microorganisms may be directed in part to the production of
medium chain fatty acids.
[0380] In other embodiments, acyl-acyl carrier protein (ACP)
thioesterases (TEs) derived from various species including Cuphea
hookeriana, Cuphea palustris, Umbellularia californica, and
Cinnamomum camphora may be over-expressed in such microorganisms
as E. coli, wherein the specific activity for the formation of each
medium chain fatty acid, namely caprylate (C8), caprate (C10),
laurate (C12), myristate (C14), and palmitate (C16), is improved over
the wild type. Certain embodiments may include other enzyme components
involved in fatty acid biosynthesis as known to a person skilled in
the art, including, but not limited to, ACP and β-ketoacyl
ACP synthase (KAS) IV.
[0381] Microbial systems and isolated microorganisms of the present
application may also incorporate fatty aldehyde dehydrogenases to
reduce fatty acids to fatty aldehydes. Merely by way of
explanation, the conversion of fatty acids to fatty aldehydes may
be catalyzed by medium and/or long chain fatty aldehyde
dehydrogenases isolated from various suitable organisms. Certain
embodiments may incorporate, for example, a fatty aldehyde
dehydrogenase derived from Vibrio harveyi.
[0382] Microbial systems and isolated microorganisms of the present
application may also incorporate one or more enzymes that catalyze
the conversion of fatty aldehydes to biofuels such as n-alkanes,
including, for example, enzymes comprising an enzymatic long chain
alcohol reduction pathway. Certain embodiments may incorporate
genes from various other sources that encode enzymes capable of
catalyzing the reduction and dehydration of fatty acids to
biofuels, such as alkanes. For example, bacterial strain HD-1 is
able to produce biofuels, such as n-alkanes, with various chain
lengths, and also produces both odd and even numbered alkanes.
Certain embodiments of the microbial systems and recombinant
microorganisms provided herein may incorporate the HD-1 genes
encoding the enzymes involved in this pathway.
[0383] Other embodiments may incorporate redesigned or de novo
designed enzymes for this reduction pathway. For example,
embodiments of the present invention may include a redesigned
isocitrate dehydrogenase, which may catalyze the formation of
2-carboxy-1-alcohols. In certain embodiments, the
2-carboxy-1-alcohols may be sequentially reduced and dehydrated to
form 2-carboxy-alkanes, which may be catalyzed by redesigned
enzymes involved in fatty acid metabolism. The 2-carboxy-alkanes
can be reduced to corresponding aldehyde by aldehyde dehydrogenase
and then decarbonylated to form n-alkanes catalyzed by the
redesigned decarbonylase as discussed below. Certain embodiments of
these microbial systems may produce either even numbered n-alkanes,
odd numbered n-alkanes, or both.
[0384] Certain embodiments of the present application may
incorporate the genes encoding enzymes catalyzing decarbonylation,
or an enzymatic decarbonylation pathway. Merely by way of example,
the green colonial alga Botryococcus braunii, race A, produces linear
odd-numbered C27, C29, and C31 hydrocarbons that total up to 32% of
the alga's dry weight. Microsomal preparations of this organism
have decarbonylation activity. This decarbonylase from B. braunii
culture is a cobalt-protoporphyrin IX containing enzyme. Certain
microbial systems or isolated microorganisms may incorporate the
gene encoding fatty aldehyde decarbonylase from Botryococcus
braunii.
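Merely by way of illustration, the carbon bookkeeping implied by a decarbonylation pathway may be sketched as follows. The sketch assumes that decarbonylation removes exactly one carbon (as CO) from a fatty aldehyde, so that an even-numbered aldehyde yields an odd-numbered n-alkane. The function name is hypothetical and is not part of the present disclosure.

```python
def alkane_from_aldehyde(aldehyde_carbons: int) -> int:
    """Carbon count of the n-alkane left after decarbonylation.

    Assumption: one carbon is lost as CO per fatty aldehyde.
    """
    if aldehyde_carbons < 2:
        raise ValueError("a fatty aldehyde needs at least two carbons")
    return aldehyde_carbons - 1


# Even-numbered aldehydes that would give the odd-numbered hydrocarbons
# (C27, C29, C31) mentioned for B. braunii.
for n in (28, 30, 32):
    print(f"C{n} aldehyde -> C{alkane_from_aldehyde(n)} n-alkane")
```

Under this assumption, the odd-numbered chain lengths reported for B. braunii would correspond to even-numbered aldehyde precursors.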
[0385] Other embodiments may include redesigned decarbonylase
enzymes, for example, wherein the N-terminal membrane sequence is
substituted. By way of explanation, the functional activity of a
similar enzyme, cytochrome P450 containing Fe-protoporphyrin IX
(heme), is improved by substituting the N-terminal membrane-associated
sequence, and the decarbonylases of the present microbial systems
may comprise similar substitutions or improvements.
[0386] Other embodiments may incorporate the genes encoding a
Co-porphyrin synthase. In explanation, decarbonylase enzymes may
use Co-protoporphyrin IX as a co-factor, and Clostridium
tetanomorphum is able to incorporate cobalt into incubated
protoporphyrin IX. Certain embodiments may incorporate the
Co-porphyrin synthase from Clostridium tetanomorphum, or from
other suitable microorganisms. Other embodiments may incorporate de
novo designed decarbonylation enzymes using inorganic metals such
as Co.sup.2+, Fe.sup.2+, and Ni.sup.2+ as catalysts.
[0387] Certain embodiments may comprise genes encoding the enzymes
responsible for the formation of alkenes, or an enzymatic
decarboxylation pathway. These genes may be derived or isolated
from various sources, such as higher plants and insects. For
example, higher plants such as germinating safflower (Carthamus
tinctorius L.) produce a number of odd numbered 1-alkenes,
including 1-pentadecene, 1-heptadecene, 1,8-heptadecadiene and
1,8,11-heptadecatriene besides about 80-90%
1,8,11,14-heptadecatetraene by decarboxylation from their
corresponding fatty acids. Certain embodiments may incorporate the
genes from higher plants such as Carthamus tinctorius.
[0388] Other embodiments may incorporate the genes encoding the
enzymes responsible for the formation of alkenes (e.g., an
enzymatic decarboxylation pathway) from microorganisms, including,
but not limited to, bacterial strain HD-1. By way of
explanation, bacterial strain HD-1 produces n-alkenes in addition
to n-alkanes.
[0389] Other embodiments may incorporate the genes encoding de novo
designed enzymes for an enzymatic decarboxylation pathway. For
example, these redesigned enzymes may convert .beta.-hydroxy fatty
acids to n-alkenes. The first step is catalyzed by a redesigned
kinase, which catalyzes the phosphorylation of a .beta.-hydroxy
group. A redesigned mevalonate diphosphate decarboxylase then
catalyzes the formation of n-alkenes, such as n-1-alkene.
[0390] Any microorganism may be utilized according to the present
invention. In certain aspects, a microorganism is a eukaryotic or
prokaryotic microorganism. In certain aspects, a microorganism is a
yeast, such as S. cerevisiae. In certain aspects, a microorganism
is a bacterium, such as a gram-positive bacterium or a gram-negative
bacterium. Given its rapid growth rate, well-understood genetics,
the variety of available genetic tools, and its capability in
producing heterologous proteins, genetically modified E. coli may
be used in certain embodiments of a microbial system as described
herein, whether for the degradation and metabolism of a
polysaccharide, such as alginate or pectin, or the formation or
biosynthesis of commodity chemicals, such as biofuels.
[0391] Other microorganisms may be used according to the present
invention, based in part on the compatibility of enzymes and
metabolites to host organisms. For example, other organisms such as
Acetobacter aceti, Achromobacter, Acidiphilium, Acinetobacter,
Actinomadura, Actinoplanes, Aeropyrum pernix, Agrobacterium,
Alcaligenes, Ananas comosus (M), Arthrobacter, Aspergillus niger,
Aspergillus oryzae, Aspergillus melleus, Aspergillus pulverulentus,
Aspergillus saitoi, Aspergillus sojae, Aspergillus usamii, Bacillus
alcalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus
circulans, Bacillus clausii, Bacillus lentus, Bacillus
licheniformis, Bacillus macerans, Bacillus stearothermophilus,
Bacillus subtilis, Bifidobacterium, Brevibacillus brevis,
Burkholderia cepacia, Candida cylindracea, Candida rugosa, Carica
papaya (L), Cellulosimicrobium, Cephalosporium, Chaetomium
erraticum, Chaetomium gracile, Clostridium, Clostridium butyricum,
Clostridium acetobutylicum, Clostridium thermocellum,
Corynebacterium (glutamicum), Corynebacterium efficiens,
Escherichia coli, Enterococcus, Erwinia chrysanthemi, Gluconobacter,
Gluconacetobacter, Haloarcula, Humicola insolens,
Kitasatospora setae, Klebsiella, Klebsiella oxytoca, Kluyveromyces,
Kluyveromyces fragilis, Kluyveromyces lactis, Kocuria, Lactlactis,
Lactobacillus, Lactobacillus fermentum, Lactobacillus sake,
Lactococcus, Lactococcus lactis, Leuconostoc, Methylocystis,
Methanolobus siciliae, Methanogenium organophilum, Methanobacterium
bryantii, Microbacterium imperiale, Micrococcus lysodeikticus,
Microlunatus, Mucor javanicus, Mycobacterium, Myrothecium,
Nitrobacter, Nitrosomonas, Nocardia, Papaya carica, Pediococcus,
Pediococcus halophilus, Penicillium, Penicillium camemberti,
Penicillium citrinum, Penicillium emersonii, Penicillium
roqueforti, Penicillium lilacinum, Penicillium multicolor,
Paracoccus pantotrophus, Propionibacterium, Pseudomonas,
Pseudomonas fluorescens, Pseudomonas denitrificans, Pyrococcus,
Pyrococcus furiosus, Pyrococcus horikoshii, Rhizobium, Rhizomucor
miehei, Rhizomucor pusillus Lindt, Rhizopus, Rhizopus delemar,
Rhizopus japonicus, Rhizopus niveus, Rhizopus oryzae, Rhizopus
oligosporus, Rhodococcus, Saccharomyces cerevisiae, Sclerotinia
libertiana, Sphingobacterium multivorum, Sphingobium, Sphingomonas,
Streptococcus, Streptococcus thermophilus Y-1, Streptomyces,
Streptomyces griseus, Streptomyces lividans, Streptomyces murinus,
Streptomyces rubiginosus, Streptomyces violaceoruber,
Streptoverticillium mobaraense, Tetragenococcus, Thermus,
Thiosphaera pantotropha, Trametes, Trichoderma, Trichoderma
longibrachiatum, Trichoderma reesei, Trichoderma viride,
Trichosporon penicillatum, Vibrio alginolyticus, Xanthomonas,
yeast, Zygosaccharomyces rouxii, Zymomonas, and Zymomonas mobilis,
may be utilized as recombinant microorganisms provided herein, and,
thus, may be utilized according to the various methods of the
present invention.
[0392] The following Examples are offered by way of illustration,
not limitation.
EXAMPLES
Example 1
Engineering E. coli to Grow on Alginate as a Sole Source of
Carbon
[0393] Wild type E. coli cannot use alginate polymer or degraded
alginate as its sole carbon source (see FIG. 4). Vibrio splendidus,
however, is known to be able to metabolize alginate to support
growth. To generate recombinant E. coli that can use degraded alginate
as a sole carbon source, a Vibrio splendidus fosmid library was
constructed and cloned into E. coli.
[0394] To prepare the Vibrio splendidus fosmid library, genomic DNA
was isolated from Vibrio splendidus B01 (gift from Dr. Martin Polz,
MIT) using the DNeasy Blood and Tissue Kit (Qiagen, Valencia,
Calif.). A fosmid library was then constructed using Copy Control
Fosmid Library Production Kit (Epicentre, Madison, Wis.). This
library consisted of random genomic fragments of approximately 40
kb inserted into the vector pCC1FOS (Epicentre, Madison, Wis.).
[0395] The fosmid library was packaged into phage, and E. coli
DH10B cells harboring a pDONR221 plasmid (Invitrogen, Carlsbad,
Calif.) carrying certain Vibrio splendidus genes
(V12B01.sub.--02425 to V12B01.sub.--02480; encoding a type II
secretion apparatus; see SEQ ID NO:1) were transfected with the
phage library. This secretome region encodes a type II secretion
apparatus derived from Vibrio splendidus, which was cloned into a
pDONR221 plasmid and introduced into E. coli strain DH10B (see
Example 1).
[0396] Transformants were selected for chloramphenicol resistance
and then screened for their ability to grow on degraded alginate
media. Degraded alginate media was prepared by incubating
2% alginate (Sigma-Aldrich, St. Louis, Mo.) in 10 mM Na-phosphate
buffer, 50 mM KCl, and 400 mM NaCl with alginate lyase from
Flavobacterium sp. (Sigma-Aldrich, St. Louis, Mo.) at room
temperature for at least one week. This degraded alginate was
diluted to a concentration of 0.8% to make growth media that had a
final concentration of 1.times.M9 salts, 2 mM MgSO.sub.4, 100 .mu.M
CaCl.sub.2, 0.007% leucine, 0.01% casamino acids, and 1.5% NaCl (this
includes all sources of sodium: M9, diluted alginate and added
NaCl).
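By way of a worked example only, the dilution of the 2% degraded alginate stock to the 0.8% working concentration may be computed as sketched below. The 100 ml batch volume and the function name are assumptions introduced for illustration; the remaining volume would be made up with the other media components and water.

```python
def stock_volume_ml(stock_pct: float, final_pct: float, final_volume_ml: float) -> float:
    """Volume of stock needed so that the final mixture is at final_pct (C1*V1 = C2*V2)."""
    if final_pct > stock_pct:
        raise ValueError("cannot dilute to a concentration above the stock")
    return final_pct * final_volume_ml / stock_pct


# Hypothetical 100 ml batch of growth media from the 2% degraded alginate stock.
batch_ml = 100.0
alginate_ml = stock_volume_ml(stock_pct=2.0, final_pct=0.8, final_volume_ml=batch_ml)
print(f"degraded alginate stock: {alginate_ml:.1f} ml")
print(f"other components and water: {batch_ml - alginate_ml:.1f} ml")
```

Because the stock also carries 400 mM NaCl, the same proportion would contribute to the total NaCl stated for the final media.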
[0397] One fosmid-containing E. coli clone was isolated that grew
well on this media. The fosmid DNA from this clone was isolated and
prepared using FosmidMAX DNA Purification Kit (Epicentre, Madison,
Wis.). This isolated fosmid was transferred back into DH10B cells,
and these cells were tested for the ability to grow on
alginate.
[0398] The results are illustrated in FIG. 4, which shows that
certain fosmid-containing E. coli clones are capable of growing on
alginate as a sole source of carbon. Agrobacterium tumefaciens
provides a positive control (see hatched circles). As a negative
control, E. coli DH10B cells are not capable of growing on alginate
(see immediate left of positive control).
[0399] These results also demonstrate that the sequences contained
within this Vibrio splendidus derived fosmid clone are sufficient
to confer on E. coli the ability to grow on degraded alginate as a
sole source of carbon. Accordingly, the type II secretion machinery
sequences contained within the pDONR221 vector (i.e., SEQ ID NO:1),
which was harbored by the original DH10B cells, were not necessary
for growth on degraded alginate.
[0400] The isolated fosmid sufficient to confer growth on alginate as
a sole source of carbon was sequenced by Elim Biopharmaceuticals
(Hayward, Calif.) using the following primers: Uni
R3-GGGCGGCCGCAAGGGGTTCGCGTTGGCCGA (SEQ ID NO:147) and
PCC1FOS_uni_F-GGAGAAAATACCGCATCAGGCG (SEQ ID NO:148). Sequencing
showed that the vector contained a genomic DNA section that
contained the full length genes V12B01.sub.--24189 to
V12B01.sub.--24249 (see SEQ ID NOS:2-64). SEQ ID NO:2 shows the
nucleotide sequence of the entire region from V12B01.sub.--24189 to
V12B01.sub.--24249. SEQ ID NOS:3-64 show the individual putative
genes contained within SEQ ID NO:2. In this sequence, there is a
large gene before V12B01.sub.--24189 that is truncated in the
fosmid clone. The large gene V12B01.sub.--24184 is a putative
protein with similarity to autotransporters and belongs to COG3210,
which is a cluster of orthologous proteins that include large
exoproteins involved in heme utilization or adhesion. In the fosmid
clone, V12B01.sub.--24184 is N-terminally truncated such that the
first 5893 bp are missing from the predicted open reading frame
(which is predicted to contain 22889 bp in total).
Example 2
Engineering E. coli to Grow on Pectin as a Sole Source of
Carbon
[0401] Wild type E. coli is not capable of growing on pectin, di-,
or tri-galacturonates as a sole source of carbon. To identify the
minimal components to confer on E. coli the capability of growing
on pectin, di- and/or tri-galacturonates as a sole source of
carbon, an E. coli strain BL21(DE3) harboring both the pBBRGal3P
plasmid and the pTrcogl-kdgR plasmid was engineered and tested for
the ability to grow on these polysaccharides.
[0402] The pBBRGal3P plasmid was engineered to contain a certain
genomic region of Erwinia carotovora subsp. atroseptica SCRI 1043,
comprising several genes (kdgF, kduI, kduD, pelW, togM, togN, togA,
togB, kdgM, and paeX) encoding certain enzymes (kduI, kduD, ogl,
pelW, and paeX), transporters (togM, togN, togA, togB, and kdgM),
and regulatory proteins (kdgR) responsible for the degradation of
di- and trigalacturonate. SEQ ID NO:65 shows the nucleotide
sequence of the kdgF-PaeX region from Erwinia carotovora subsp.
Atroseptica SCRI1043.
[0403] To construct this plasmid, the DNA sequence encoding kdgF,
kduI, kduD, pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR
of Erwinia carotovora subsp. Atroseptica SCRI 1043 was amplified by
polymerase chain reaction (PCR): 98.degree. C. for 10 sec,
60.degree. C. for 15 sec, and 72.degree. C. for 6 min, repeated 30
times. The reaction mixture contained 1.times. Phusion buffer
(NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CGGGATCCAAGTTGCAGGATATGACGAAAGCG-3') (SEQ ID NO:149) and
reverse (5'-GCTCTAGA AGATTATCCCTGTCTGCGGAAGCGG-3') (SEQ ID NO:150)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Erwinia carotovora subsp. Atroseptica SCRI 1043 genome (ATCC) in 50
.mu.l.
[0404] The vector pBBR1MCS-2 was then amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 2.5 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-GCTCTAGA GGGGTGCCTAATGAGTGAGCTAAC-3') (SEQ ID
NO:151) and reverse (5'-CGGGATCC GCGTTAATATTTTGTTAAAATTCGC-3') (SEQ
ID NO:152) primers, 1U Phusion High Fidelity DNA polymerase (NEB),
and 50 ng pBBR1MCS-2 in 50 .mu.l. Both amplified DNA fragments were
digested with BamHI and XbaI and ligated.
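Merely by way of illustration, the restriction sites carried by the four primers above may be checked with the short sketch below. The recognition sequences for BamHI (GGATCC) and XbaI (TCTAGA) are the standard ones. The dictionary keys and function name are hypothetical labels, and the spaces that appear within some of the printed primer sequences have been removed.

```python
# Standard recognition sequences for the two enzymes used in the digest.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}

# Primer sequences as listed above, written 5' to 3'.
PRIMERS = {
    "insert_forward_SEQ149": "CGGGATCCAAGTTGCAGGATATGACGAAAGCG",
    "insert_reverse_SEQ150": "GCTCTAGAAGATTATCCCTGTCTGCGGAAGCGG",
    "vector_forward_SEQ151": "GCTCTAGAGGGGTGCCTAATGAGTGAGCTAAC",
    "vector_reverse_SEQ152": "CGGGATCCGCGTTAATATTTTGTTAAAATTCGC",
}


def sites_in(primer: str) -> list[str]:
    """Names of the enzymes whose recognition sequence occurs in the primer."""
    return [name for name, site in SITES.items() if site in primer]


for label, seq in PRIMERS.items():
    print(label, sites_in(seq))
```

Each amplified fragment therefore carries one BamHI end and one XbaI end, which is consistent with a directional ligation of the insert into the vector.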
[0405] The pTrcogl-kdgR plasmid was engineered to contain certain
genomic regions of Erwinia carotovora subsp. Atroseptica SCRI 1043,
comprising two genes (ogl and kdgR) encoding an enzyme (ogl) and a
regulatory protein (kdgR) responsible for degradation of di- and
trigalacturonate. SEQ ID NO:66 shows the nucleotide sequence of
ogl-kdgR from Erwinia carotovora subsp. Atroseptica SCRI1043.
[0406] To prepare this construct, the DNA sequence encoding ogl and
kdgR of Erwinia carotovora subsp. Atroseptica SCRI 1043 was
amplified by polymerase chain reaction (PCR): 98.degree. C. for 10
sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward (5'-GCTCTAGA
GTTTATGTCGCACCCGCCGTTGG-3') (SEQ ID NO:153) and reverse (5'-CCCAAGC
TTAGAAAGGGAAATTGTGGTAGCCC-3') (SEQ ID NO:154) primers, 1U Phusion
High Fidelity DNA polymerase (NEB), and 50 ng Erwinia carotovora
subsp. Atroseptica SCRI 1043 genome (ATCC) in 50 .mu.l. The
amplified DNA fragment was digested with XbaI and HindIII and
ligated into pTrc99A pre-digested with the same restriction
enzymes.
[0407] The plasmids pBBRGal3P and pTrcogl-kdgR were co-transformed
into E. coli strain BL21(DE3). A single colony was inoculated into
LB media containing 50 .mu.g/ml kanamycin and 100 .mu.g/ml ampicillin,
and the culture was grown in an incubation shaker at 200 rpm and
37.degree. C. When the culture reached an OD.sub.600nm of 0.6, 500
.mu.l of culture was transferred to an Eppendorf tube and centrifuged
to pellet the cells. The cells were resuspended in 50 .mu.l of M9
media containing 2 mM MgSO.sub.4, 100 .mu.M CaCl.sub.2, and 0.4% di-
or trigalacturonate, and 5 .mu.l of this suspension was inoculated
into 500 .mu.l of fresh M9 media containing 2 mM MgSO.sub.4, 100
.mu.M CaCl.sub.2, and 0.4% di- or trigalacturonate. The culture was
grown in an incubation shaker at 200 rpm and 37.degree. C.
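By way of a worked example only, the approximate starting optical density of the test culture may be estimated from the volumes given above. The sketch assumes that no cells are lost during pelleting and that optical density scales linearly with cell concentration. The function name is hypothetical.

```python
def starting_od(culture_od: float, pelleted_ul: float, resuspend_ul: float,
                inoculum_ul: float, fresh_ul: float) -> float:
    """Estimated OD after concentrating a pellet and diluting it into fresh media."""
    # Concentration step: all cells from pelleted_ul end up in resuspend_ul.
    concentrated_od = culture_od * pelleted_ul / resuspend_ul
    # Dilution step: inoculum_ul is mixed into fresh_ul of media.
    return concentrated_od * inoculum_ul / (inoculum_ul + fresh_ul)


od0 = starting_od(culture_od=0.6, pelleted_ul=500.0, resuspend_ul=50.0,
                  inoculum_ul=5.0, fresh_ul=500.0)
print(f"estimated starting OD600: {od0:.3f}")
```

Under these assumptions the ten-fold concentration and the roughly hundred-fold dilution leave the test culture at about one tenth of the density of the LB culture.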
[0408] The results in FIG. 5A show that these two plasmids were
sufficient to confer on E. coli the ability to grow on di- and
trigalacturonate as a sole source of carbon, but not on pectin. In
particular, these results show that the regions kdgF-paeX and
ogl-kdgR were sufficient to confer this ability on E. coli.
[0409] Based on the information obtained from the above
experiments, it was considered whether the introduction of pectate
lyase, pectate acetylesterase, and methylesterase might confer on E.
coli the capability of growing on pectin. To test this hypothesis, E.
coli strain DH5.alpha. bacterial cells were engineered to contain
both the pROU2 plasmid and the pPEL74 plasmid.
[0410] The pROU2 plasmid contains certain genomic regions of
Erwinia chrysanthemi, comprising several genes (kdgF, kduI, kduD,
pelW, togM, togN, togA, togB, kdgM, paeX, ogl, and kdgR) encoding
enzymes (kduI, kduD, ogl, pelW, and paeX), transporters (togM,
togN, togA, togB, and kdgM), and regulatory proteins (kdgR)
responsible for degradation of di- and trigalacturonate.
[0411] The pPEL74 plasmid contains certain genomic regions of
Erwinia chrysanthemi, comprising several genes (pelA, pelE, paeY,
and pem) encoding pectate lyases (pelA and pelE), pectin
acetylesterases (paeY), and pectin methylesterase (pem).
[0412] As shown in FIG. 5B, E. coli DH5.alpha. engineered with
pROU2 and pPEL74 was able to grow on pectin as a sole source of
carbon, showing that the genes contained within these plasmids are
sufficient to confer this property on an organism that is otherwise
incapable of growing on pectin as a sole source of carbon.
Example 3
In Vitro Conversion of Alginate to Pyruvate and
Glyceraldehyde-3-Phosphate
[0413] An enzyme mixture containing all enzymes required for
alginate degradation and metabolism was investigated
for its ability to produce pyruvate from alginate. In addition,
various novel alcohol dehydrogenases (ADHs), such as ADH1-12 (see
SEQ ID NOS:69-92), isolated from Agrobacterium tumefaciens, were
tested for their ability to catalyze either DEHU or mannuronate
hydrogenation.
[0414] A simplified metabolic pathway for alginate degradation and
metabolism is shown in FIG. 2. Alginate can be degraded by at least
two different methodologies: enzymatic and chemical
methodologies.
[0415] In enzymatic degradation, the degradation of alginate is
catalyzed by a family of enzymes called alginate lyases. For this
experiment, Atu3025 was used. Atu3025 is an exolytically acting
enzyme and yields DEHU from alginate polymer. DEHU is converted to
the common hexuronate metabolite, KDG. This reaction is catalyzed
by alcohol dehydrogenases (e.g., DEHU hydrogenases).
[0416] Chemical degradation catalyzed by an acid solution, such as
formate, yields the monosaccharide mannuronate. Mannuronate is then
converted to mannonate, which is catalyzed by enzymes with
mannonate dehydrogenase (mannuronate reductase) activity. In
bacteria, mannonate dehydratase (UxuA) catalyzes the dehydration of
mannonate to form KDG.
[0417] KDG is readily metabolized to form pyruvate and
glyceraldehyde-3-phosphate (G3P). KDG is first phosphorylated to
KDG-6-phosphate (KDGP), which is catalyzed by KDG kinase, and then
broken down to pyruvate and G3P, which is catalyzed by KDGP
aldolase.
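Merely by way of illustration, the enzymatic and chemical routes described above may be represented as a small directed graph and enumerated as sketched below. The data structure and function name are hypothetical and serve only to summarize the steps. The mannonate-to-KDG step reflects the mannonate dehydratase activity of UxuA.

```python
# Each substrate maps to a list of (product, catalyst) pairs taken from the text.
PATHWAY = {
    "alginate": [
        ("DEHU", "alginate lyase (e.g., Atu3025)"),
        ("mannuronate", "acid solution (e.g., formate)"),
    ],
    "DEHU": [("KDG", "DEHU hydrogenase")],
    "mannuronate": [("mannonate", "mannuronate reductase")],
    "mannonate": [("KDG", "mannonate dehydratase (UxuA)")],
    "KDG": [("KDGP", "KDG kinase")],
    "KDGP": [("pyruvate + G3P", "KDGP aldolase")],
}


def routes(start: str, goal: str):
    """Yield each route from start to goal as a list of (substrate, catalyst, product)."""
    if start == goal:
        yield []
        return
    for product, catalyst in PATHWAY.get(start, []):
        for rest in routes(product, goal):
            yield [(start, catalyst, product)] + rest


for route in routes("alginate", "pyruvate + G3P"):
    print(" -> ".join([route[0][0]] + [step[2] for step in route]))
```

The enumeration returns one route through DEHU and one through mannuronate and mannonate, both converging on KDG.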
[0418] Preparation of oligoalginate lyase Atu3025 derived from
Agrobacterium tumefaciens C58. pETAtu3025 was constructed based on
pET29 plasmid backbone (Novagen). The oligoalginate lyase Atu3025
was amplified by PCR: 98.degree. C. for 10 sec, 55.degree. C. for
15 sec, and 72.degree. C. for 60 sec, repeated 30 times. The
reaction mixture contained 1.times. Phusion buffer, 2 mM dNTP, 0.5
.mu.M forward (5'-GGAATTCCATATGCGTCCCTCTGCCCCGGCC-3') (SEQ ID
NO:155) and reverse (5'-CGGGATCCTTAGAACTGCTTGGGAAGGGAG-3') (SEQ ID
NO:156) primers, 2.5 U Phusion DNA polymerase (Finnzymes), and an
aliquot of Agrobacterium tumefaciens C58 (gift from Professor
Eugene Nester, University of Washington) cells as a template in
total volume of 100 .mu.l. The amplified fragment was digested with
NdeI and BamHI and ligated into pET29 pre-digested with the same
enzymes using T4 DNA ligase to form pETAtu3025. The constructed
plasmid was sequenced (Elim Biopharmaceuticals) and the DNA sequence
of the insert was confirmed. The nucleotide sequence of the Atu3025
insert is provided in SEQ ID NO:67. The polypeptide sequence
encoded by the Atu3025 insert is provided in SEQ ID NO:68.
[0419] The pETAtu3025 was transformed into Escherichia coli strain
BL21(DE3). A colony of BL21(DE3) containing pETAtu3025 was
inoculated into 50 ml of LB media containing 50 .mu.g/ml kanamycin
(Km.sup.50). This strain was grown in an orbital shaker at 200
rpm and 37.degree. C. IPTG (0.2 mM) was added to the culture when
the OD.sub.600nm reached 0.6, and the induced culture was grown in
an orbital shaker at 200 rpm and 20.degree. C. 24 hours after the
induction, the cells were harvested by centrifugation at 4,000
rpm.times.g for 10 min and the pellet was resuspended into 2 ml of
Bugbuster (Novagen) containing 10 .mu.l of Lysonase.TM.
Bioprocessing Reagent (Novagen). The solution was again centrifuged
at 4,000 rpm.times.g for 10 min and the supernatant was
obtained.
[0420] Construction of pETADH1 through pETADH12. DNA sequences of
ADH1-12 of Agrobacterium tumefaciens C58 were amplified by
polymerase chain reaction (PCR): 98.degree. C. for 10 sec,
60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30
times. The reaction mixture contained 1.times. Phusion buffer
(NEB), 2 mM dNTP, 0.5 .mu.M forward (Table 1) and reverse (Table 1)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Agrobacterium tumefaciens C58 genome in 50 .mu.l. Amplified DNA
fragment was digested with NdeI and BamHI and ligated into pET28
pre-digested with the same restriction enzymes. For DNA sequences
with an internal NdeI or BamHI site, the front and bottom half
sequences of each ADH were first amplified using the described method. The
resulting two DNA fragments were gel purified and spliced by
overlapping PCR.
TABLE-US-00001 TABLE 1 Primers used to amplify ADH1-12 from Agrobacterium tumefaciens C58.
Name | A. tumefaciens C58 | Forward Primer | Reverse Primer
ADH1 | Atu1557 | GGAATTCCATATGTTCACAACGTCCGCCTA (SEQ ID NO: 276) | GCTTGACGGCCATGTGGCCGAGGCCGC (SEQ ID NO: 277)
ADH1 | Atu1557 | GCGGCCTCGGCCACATGGCCGTCAAGC (SEQ ID NO: 278) | CGGGATCCTTAGGCGGCCTTCTGGCGCG (SEQ ID NO: 279)
ADH2 | Atu2022 | GGAATTCCATATGGCTATTGCAAGAGGTTA (SEQ ID NO: 280) | CGGGATCCTTAAGCGTCGAGCGAGGCCA (SEQ ID NO: 281)
ADH3 | Atu0626 | GGAATTCCATATGACTAAAACAATGAAGGC (SEQ ID NO: 282) | CACCGGGGCCGGGGTCCGGTATTGCCA (SEQ ID NO: 283)
ADH3 | Atu0626 | TGGCAATACCGGACCCCGGCCCCGGTG (SEQ ID NO: 284) | CGGGATCCTTAGGCGGCGAGATCCACGA (SEQ ID NO: 285)
ADH4 | Atu5240 | GGAATTCCATATGACCGGGGCGAACCAGCC (SEQ ID NO: 286) | ATAGCCGCTCATACGCCTCGGTTGCCT (SEQ ID NO: 287)
ADH4 | Atu5240 | AGGCAACCGAGGCGTATGAGCGGCTAT (SEQ ID NO: 288) | CGGGATCCTTAAGCGCCGTGCGGAAGGA (SEQ ID NO: 289)
ADH5 | Atu3163 | GGAATTCCATATGACCATGCATGCCATTCA (SEQ ID NO: 290) | CGGGATCCTTATTCGGCTGCAAATTGCA (SEQ ID NO: 291)
ADH6 | Atu2151 | GGAATTCCATATGCGCGCGCTTTATTACGA (SEQ ID NO: 292) | CGGGATCCTTATTCGAACCGGTCGATGA (SEQ ID NO: 293)
ADH7 | Atu2814 | GGAATTCCATATGCTGGCGATTTTCTGTGA (SEQ ID NO: 294) | CGGGATCCTTATGCGACCTCCACCATGC (SEQ ID NO: 295)
ADH8 | Atu5447 | GGAATTCCATATGAAAGCCTTCGTCGTCGA (SEQ ID NO: 296) | CGGGATCCTTAGGATGCGTATGTAACCA (SEQ ID NO: 297)
ADH9 | Atu4087 | GGAATTCCATATGAAAGCGATTGTCGCCCA (SEQ ID NO: 298) | CGGGATCCTTAGGAAAAGGCGATCTGCA (SEQ ID NO: 299)
ADH10 | Atu4289 | GGAATTCCATATGCCGATGGCGCTCGGGCA (SEQ ID NO: 300) | CGGGATCCTTAGAATTCGATGACTTGCC (SEQ ID NO: 301)
ADH11 | Atu3027 | GGAATTCCATATGAAACATTCTCAGGACAA (SEQ ID NO: 302) | GGGCGCCGATCATGTGGTGCGTTTCCG (SEQ ID NO: 303)
ADH11 | Atu3027 | CGGAAACGCACCACATGATCGGCGCCC (SEQ ID NO: 304) | CGGGATCCTTATGCCATACGTTCCATAT (SEQ ID NO: 305)
ADH12 | Atu3026 | GGAATTCCATATGCAGCGTTTTACCAACAG (SEQ ID NO: 306) | CGGGATCCTTAGGAAAACAGGACGCCGC (SEQ ID NO: 307)
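Merely by way of illustration, the internal primer pairs listed in Table 1 for ADH1, ADH3, ADH4, and ADH11 may be checked for the properties expected of overlapping PCR primers, as sketched below. The sketch assumes that the reverse primer of the front half and the forward primer of the bottom half should be exact reverse complements of one another, and that neither should contain an intact NdeI (CATATG) or BamHI (GGATCC) site. The variable and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# (front-half reverse primer, bottom-half forward primer), from Table 1.
INTERNAL_PAIRS = {
    "ADH1": ("GCTTGACGGCCATGTGGCCGAGGCCGC", "GCGGCCTCGGCCACATGGCCGTCAAGC"),
    "ADH3": ("CACCGGGGCCGGGGTCCGGTATTGCCA", "TGGCAATACCGGACCCCGGCCCCGGTG"),
    "ADH4": ("ATAGCCGCTCATACGCCTCGGTTGCCT", "AGGCAACCGAGGCGTATGAGCGGCTAT"),
    "ADH11": ("GGGCGCCGATCATGTGGTGCGTTTCCG", "CGGAAACGCACCACATGATCGGCGCCC"),
}
CLONING_SITES = ("CATATG", "GGATCC")  # NdeI, BamHI

for name, (front_rev, bottom_fwd) in INTERNAL_PAIRS.items():
    overlaps = reverse_complement(front_rev) == bottom_fwd
    site_free = not any(s in p for p in (front_rev, bottom_fwd) for s in CLONING_SITES)
    print(f"{name}: complementary={overlaps}, free of NdeI/BamHI sites={site_free}")
```

A full overlap between the two internal primers is what allows the gel-purified front and bottom fragments to anneal to each other and be spliced in the subsequent PCR.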
Expression and Purification of ADH1-10.
[0421] All plasmids were transformed into Escherichia coli strain
BL21(DE3). The single colonies of BL21(DE3) containing respective
alcohol dehydrogenase (ADH) genes were inoculated into 50 ml of LB
media containing 50 .mu.g/ml kanamycin (Km.sup.50). These strains
were grown in an orbital shaker at 200 rpm and 37.degree. C.
IPTG (0.2 mM) was added to each culture when the OD.sub.600nm reached
0.6, and the induced culture was grown in an orbital shaker at
200 rpm and 20.degree. C. 24 hours after the induction, the cells
were harvested by centrifugation at 4,000 rpm.times.g for 10 min
and the pellet was resuspended into 2 ml of Bugbuster (Novagen)
containing 10 .mu.l of Lysonase.TM. Bioprocessing Reagent
(Novagen). The solution was again centrifuged at 4,000 rpm.times.g
for 10 min and the supernatant was obtained.
Preparation of .about.2% DEHU Solution by Enzymatic
Degradation.
[0422] DEHU solution was enzymatically prepared. A 2% alginate
solution was prepared by adding 10 g of low viscosity alginate to
500 ml of 20 mM Tris-HCl (pH 7.5) solution. Approximately 10
mg of alginate lyase derived from Flavobacterium sp. (purchased
from Sigma-Aldrich) was added to the alginate solution. 250 ml of
this solution was then transferred to another bottle and the E.
coli cell lysate containing Atu3025 prepared in the above section
was added. The alginate degradation was carried out at room
temperature overnight. The resulting products were analyzed by thin
layer chromatography, and DEHU formation was confirmed.
Preparation of D-Mannuronate Solution by Chemical Degradation.
[0423] D-mannuronate solution was chemically prepared based on the
protocol previously described by Spoehr (Archive of Biochemistry,
14: pp 153-155). Fifty milligrams of alginate was dissolved in 800
.mu.L of ninety percent formate. This solution was incubated at
100.degree. C. overnight. Formate was then evaporated and the
residual substances were washed with absolute ethanol twice. The
residual substance was again dissolved in absolute ethanol and
filtered. Ethanol was evaporated and residual substances were
resuspended in 20 mL of 20 mM Tris-HCl (pH 8.0) and the solution
was filtered to make a D-mannuronate solution. This D-mannuronate
solution was diluted 5-fold and used for assay.
Assay for DEHU Hydrogenase.
[0424] To identify DEHU hydrogenase, an NADPH dependent DEHU
hydrogenation assay was performed. 20 .mu.l of prepared cell lysate
containing each ADH was added to 160 .mu.l of 20-fold diluted DEHU
solution prepared in the above section. 20 .mu.l of 2.5 mg/ml of
NADPH solution (20 mM Tris-HCl, pH 8.0) was added to initiate the
hydrogenation reaction, as a preliminary study using cell lysate of
A. tumefaciens C58 had shown that DEHU hydrogenation requires
NADPH as a co-factor. The consumption of NADPH was monitored by
absorbance at 340 nm for 30 min using the kinetic mode of a ThermoMAX
96 well plate reader (Molecular Devices). E. coli cell lysate
containing alcohol dehydrogenase (ADH) 10 lacking a portion of the
N-terminal domain was used in a control reaction mixture.
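By way of a worked example only, the assay volumes above and the conversion of an absorbance change at 340 nm into NADPH consumed may be sketched as follows. The molar extinction coefficient of 6220 M.sup.-1 cm.sup.-1 is the commonly cited value for NADPH at 340 nm. The optical path length of the well and the example absorbance change are assumptions and are not taken from the present disclosure. The function names are hypothetical.

```python
NADPH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly cited value at 340 nm


def assay_volume_ul(lysate_ul: float = 20.0, substrate_ul: float = 160.0,
                    nadph_ul: float = 20.0) -> float:
    """Total reaction volume per well."""
    return lysate_ul + substrate_ul + nadph_ul


def final_nadph_mg_per_ml(stock_mg_per_ml: float = 2.5, nadph_ul: float = 20.0) -> float:
    """NADPH concentration in the well after adding the stock."""
    return stock_mg_per_ml * nadph_ul / assay_volume_ul(nadph_ul=nadph_ul)


def nadph_consumed_um(delta_a340: float, path_cm: float) -> float:
    """NADPH consumed (micromolar) from a drop in A340, by the Beer-Lambert law."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return delta_a340 / (NADPH_EXTINCTION_340 * path_cm) * 1e6


print(assay_volume_ul(), "ul per well")
print(final_nadph_mg_per_ml(), "mg/ml NADPH at the start")
# Hypothetical reading: A340 falls by 0.311 over the run, assumed 0.5 cm path.
print(round(nadph_consumed_um(delta_a340=0.311, path_cm=0.5), 1), "uM NADPH consumed")
```

In practice the path length depends on the plate and fill volume, so it would need to be measured or supplied by the plate reader before absolute rates are compared.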
Assay for D-Mannuronate Hydrogenase.
[0425] To identify D-mannuronate hydrogenase, an NADPH dependent
D-mannuronate hydrogenation assay was performed. 20 .mu.l of
prepared cell lysate containing each ADH was added to 160 .mu.l of
D-mannuronate solution prepared in the above section. 20 .mu.l of
2.5 mg/ml of NADPH solution (20 mM Tris-HCl, pH 8.0) was added to
initiate the hydrogenation reaction. The consumption of NADPH was
monitored by absorbance at 340 nm for 30 min using the kinetic mode
of a ThermoMAX 96 well plate reader (Molecular Devices). E. coli cell
lysate containing alcohol dehydrogenase (ADH) 10 lacking a portion
of the N-terminal domain was used in a control reaction mixture.
Construction of pETkdgK.
[0426] DNA sequence of kdgK of Escherichia coli encoding
2-keto-deoxy gluconate kinase was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-AGGTACGGTGAAATAA AGGAGG ATATACAT
ATGTCCAAAAAGATTGCCGT-3') (SEQ ID NO:157) and reverse
(5'-TTTTCCTTTTGCGGCCGCCCCGCTGGCATCGCCTCAC-3') (SEQ ID NO:158)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Escherichia coli DH10B genome in 50 .mu.l. Amplified DNA fragment
was digested with NdeI and NotI and ligated into pET29 pre-digested
with the same restriction enzymes.
Construction of pETkdgA.
[0427] DNA sequence of kdgA of Escherichia coli encoding 2-keto-deoxy
gluconate-6-phosphate aldolase was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-GGCGATGCCAGCGTAA AGGAGG ATATACAT
ATGAAAAACTGGAAAACAAG-3') (SEQ ID NO:159) and reverse
(5'-TTTTCCTTTTGCGGCCGCCCCAGCTTAGCGCCTTCTA-3') (SEQ ID NO:160)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Escherichia coli DH10B genome in 50 .mu.l. Amplified DNA fragment
was digested with NdeI and NotI and ligated into pET29 pre-digested
with the same restriction enzymes.
Protein Expression and Purification.
[0428] All plasmids (pETAtu3025, pETADH11, pETADH12, pETkdgA,
pETkdgK, and pETuxuA) were transformed into Escherichia coli strain
BL21(DE3). The single colonies of BL21(DE3) containing respective
plasmids were inoculated into 50 ml of LB media containing 50
.mu.g/ml kanamycin (Km.sup.50). These strains were grown in an
orbital shaker at 200 rpm and 37.degree. C. IPTG (0.2 mM) was
added to each culture when the OD.sub.600nm reached 0.6, and the
induced culture was grown in an orbital shaker at 200 rpm and
20.degree. C. 24 hours after the induction, the cells were
harvested by centrifugation at 4,000 rpm.times.g for 10 min and the
pellet was resuspended into 2 ml of Bugbuster (Novagen) containing
10 .mu.l of Lysonase.TM. Bioprocessing Reagent (Novagen) and
suggested amount of protease inhibitor cocktail (SIGMA). The
solution was again centrifuged at 4,000 rpm.times.g for 10 min and
the supernatant was obtained. The supernatant was applied to
Nickel-NTA spin column (Qiagen) to purify His-tagged proteins.
[0429] The results of the assays for DEHU hydrogenase activity and
D-mannuronate hydrogenase activity of ADH1-10 are shown in FIGS. 7A
and 7B. These results demonstrate that the novel enzymes ADH1 and
ADH2 showed significant DEHU hydrogenase activity (FIG. 7A), and
that the novel enzymes ADH3, ADH4, and ADH9 showed significant
mannuronate hydrogenase activity (FIG. 7B).
In Vitro Pyruvate Formation.
[0430] The reaction mixture contained 1% alginate or .about.0.5%
mannuronate, .about.5 .mu.g of purified Atu3026 (ADH12) or Atu3027
(ADH11), and .about.5 .mu.g of purified oligoalginate lyase (Atu3025),
UxuA, KdgK, and KdgA, 2 mM of ATP, and 0.6 mM of NADPH in 20 mM
Tris-HCl pH 7.0. The reaction was carried out overnight and the
pyruvate formation was monitored by the pyruvate assay kit
(BioVision, Inc).
[0431] The results of in vitro pyruvate formation from alginate
mediated by enzymatic and chemical degradation are shown in FIG. 6B
and FIG. 6C, respectively. As can be seen in these figures,
alginate was converted to pyruvate via the isolated enzymes. These
results also show that each of Atu3026 (ADH12) and Atu3027 (ADH11)
is capable of catalyzing both DEHU hydrogenase and mannuronate
hydrogenase reactions.
Example 4
Construction and Biological Activity of Biosynthesis Pathways
Construction of Pathways:
[0432] A propionaldehyde biosynthetic pathway comprising a
threonine deaminase (ilvA) gene from Escherichia coli and
keto-isovalerate decarboxylase (kivd) from Lactococcus lactis is
constructed and tested for the ability to convert L-threonine to
propionaldehyde.
[0433] A butyraldehyde biosynthetic pathway comprising a thiolase
(atoB) gene from E. coli, .beta.-hydroxy butyryl-CoA dehydrogenase
(hbd), crotonase (crt), butyryl-CoA dehydrogenase (bcd), electron
transfer flavoprotein A (etfA), and electron transfer flavoprotein
B (etfB) genes from Clostridium acetobutylicum ATCC 824, and a
coenzyme A-linked butyraldehyde dehydrogenase (ald) gene from
Clostridium beijerinckii was constructed in
E. coli and tested for the ability to produce butyraldehyde. Also,
a coenzyme A-linked alcohol dehydrogenase (adhE2) gene from
Clostridium acetobutylicum ATCC 824 was used as an alternative to
ald and tested for the ability to produce butanol.
[0434] An isobutyraldehyde biosynthetic pathway comprising an
acetolactate synthase (alsS) from Bacillus subtilis or (als) from
Klebsiella pneumoniae subsp. pneumoniae MGH 78578 (codon usage was
optimized for E. coli protein expression) and acetolactate
reductoisomerase (ilvC) and 2,3-dihydroxyisovalerate dehydratase
(ilvD), genes from E. coli and keto-isovalerate decarboxylase
(kivd) from Lactococcus lactis was constructed and tested for the
ability to produce isobutyraldehyde, as measured by isobutanal
production.
[0435] 3-methylbutyraldehyde and 2-methylbutyraldehyde biosynthesis
pathways comprising an acetolactate synthase (alsS) from Bacillus
subtilis or (als) from Klebsiella pneumoniae subsp. pneumoniae MGH
78578 (codon usage was optimized for E. coli protein expression),
acetolactate reductoisomerase (ilvC), 2,3-dihydroxyisovalerate
dehydratase (ilvD), isopropylmalate synthase (LeuA),
isopropylmalate isomerase (LeuC and LeuD), and 3-isopropylmalate
dehydrogenase (LeuB) genes from E. coli and keto-isovalerate
decarboxylase (kivd) from Lactococcus lactis were constructed and
tested for the ability to produce 3-isovaleraldehyde and
2-isovaleraldehyde.
[0436] Phenylacetoaldehyde and 4-hydroxyphenylacetoaldehyde
biosynthesis pathways comprising a transketolase (tktA), a
3-deoxy-7-phosphoheptulonate synthase (aroF, aroG, and aroH),
3-dehydroquinate synthase (aroB), a 3-dehydroquinate dehydratase
(aroD), a dehydroshikimate reductase (aroE), a shikimate kinase II
(aroL), a shikimate kinase I (aroK), a
5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a chorismate
synthase (aroC), a fused chorismate mutase P/prephenate dehydratase
(pheA), and a fused chorismate mutase T/prephenate dehydrogenase
(tyrA) genes from E. coli, and keto-isovalerate decarboxylase (kivd)
from Lactococcus lactis were constructed and tested for the ability
to produce phenylacetoaldehyde and/or
4-hydroxyphenylacetoaldehyde.
[0437] A 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and
2-(indole-3-)ethanol biosynthesis pathway comprising a
transketolase (tktA), a 3-deoxy-7-phosphoheptulonate synthase
(aroF, aroG, and aroH), 3-dehydroquinate synthase (aroB), a
3-dehydroquinate dehydratase (aroD), a dehydroshikimate reductase
(aroE), a shikimate kinase II (aroL), a shikimate kinase I (aroK),
a 5-enolpyruvylshikimate-3-phosphate synthetase (aroA), a
chorismate synthase (aroC), a fused chorismate mutase P/prephenate
dehydratase (pheA), and a fused chorismate mutase T/prephenate
dehydrogenase (tyrA) genes from E. coli, keto-isovalerate
decarboxylase (kivd) from Lactococcus lactis, alcohol dehydrogenase
(adh2) from Saccharomyces cerevisiae, indole-3-pyruvate
decarboxylase (ipdc) from Azospirillum brasilense, phenylethanol
reductase (par) from Rhodococcus sp. ST-10, and benzaldehyde lyase
(bal) from Pseudomonas fluorescens was constructed and tested for
the ability to produce 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol
and/or 2-(indole-3-)ethanol.
Construction of pBADButP.
[0438] The DNA sequence encoding hbd, crt, bcd, etfA, and etfB of
Clostridium acetobutylicum ATCC 824 was amplified by polymerase
chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for
15 sec, and 72.degree. C. for 3 min, repeated 30 times. The
reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM
dNTP, 0.5 .mu.M forward (5'-CCCGAGCTCTTAGGAGGATTAGTCATGGAAC-3')
(SEQ ID NO:161) and reverse
(5'-GCTCTAGATTATTTTGAATAATCGTAGAAACC-3') (SEQ ID NO:162) primers, 1U
Phusion High Fidelity DNA polymerase (NEB), and 50 ng Clostridium
acetobutylicum ATCC 824 genome (ATCC) in 50 .mu.l. The amplified DNA
fragment was digested with BamHI and XbaI and ligated into pBAD33
pre-digested with the same restriction enzymes.
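As a consistency check on cloning primers such as SEQ ID NO:161 and SEQ ID NO:162, the restriction sites they carry can be listed programmatically. The following is a minimal illustrative sketch, not part of the described method. The SITES table holds the standard recognition sequences of enzymes named in this example, and find_sites is a hypothetical helper introduced here for illustration.

```python
# Standard recognition sequences (5'->3') for enzymes named in this example.
SITES = {
    "SacI": "GAGCTC",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "SphI": "GCATGC",
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
}


def find_sites(primer):
    """Return {enzyme: [0-based positions]} for every site found in the primer."""
    seq = primer.upper().replace(" ", "")
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


forward = "CCCGAGCTCTTAGGAGGATTAGTCATGGAAC"   # SEQ ID NO:161
reverse = "GCTCTAGATTATTTTGAATAATCGTAGAAACC"  # SEQ ID NO:162

print(find_sites(forward))
print(find_sites(reverse))
```

For these two primers the scan reports a SacI site in the forward primer and an XbaI site in the reverse primer. The same helper can be applied to any other primer pair in this example.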
Construction of pBADButP-atoB.
[0439] The DNA sequence encoding atoB of Escherichia coli DH10B was
amplified by polymerase chain reaction (PCR): 98.degree. C. for 10
sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-GCTCTAGAGGAGGATATATATATGAAAAATTGTGTCATCGTC-3') (SEQ ID NO:163)
and reverse (5'-AACTGCAGTTAATTCAACCGTTCAATCACC-3') (SEQ ID NO:164)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Escherichia coli DH10B genome in 50 .mu.l. Amplified DNA fragment
was digested with XbaI and PstI and ligated into pBADButP
pre-digested with the same restriction enzymes.
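The cycling program and primer concentration used throughout this example reduce to simple arithmetic. The sketch below is illustrative only. It assumes the three hold steps are the entire cycle: ramp times and any initial denaturation or final extension are not stated in the text and are therefore excluded. The function names are hypothetical.

```python
def cycling_seconds(denature_s, anneal_s, extend_s, cycles):
    """Total hold time of the repeated three-step cycle (ramp times excluded)."""
    return (denature_s + anneal_s + extend_s) * cycles


def primer_pmol(conc_um, volume_ul):
    """Amount of primer in pmol: (umol/L) x (uL) = pmol."""
    return conc_um * volume_ul


# 98 C for 10 sec, 60 C for 15 sec, 72 C for 1 min, repeated 30 times
total = cycling_seconds(10, 15, 60, 30)
print(total, "s =", total / 60, "min")

# 0.5 uM of each primer in a 50 ul reaction
print(primer_pmol(0.5, 50), "pmol of each primer")
```

With a 1 min extension this gives 2550 s (42.5 min) of hold time and 25 pmol of each primer per 50 .mu.l reaction. Longer extension times (2, 3, or 4 min) scale the total accordingly.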
Construction of pBADatoB-ald.
The DNA sequences encoding atoB of
Escherichia coli DH10B and ald from Clostridium beijerinckii were
amplified separately by polymerase chain reaction (PCR): 98.degree.
C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1
min, repeated 30 times. The reaction mixture contained 1.times.
Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3') (SEQ ID
NO:165) for atoB and
(5'-GGTTGAATTAAGGAGGATATATATATGAATAAAGACACACTAATACCTAC-3') (SEQ ID
NO:166) for ald, and reverse
(5'-GTCTTTATTCATATATATATCCTCCTTAATTCAACCGTTCAATCACCATC-3') (SEQ ID
NO:146) for atoB and (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID
NO:167) for ald primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Escherichia coli DH10B and Clostridium
beijerinckii genome (ATCC) in 50 .mu.l, respectively. The amplified
DNA fragments were gel purified and eluted into 30 ul of EB buffer
(Qiagen). 5 ul from each DNA solution was combined and each DNA
fragment was spliced by another round of PCR: 98.degree. C. for 10
sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAAAAATTGTGTCATCGTCAGTG-3') (SEQ ID NO:168) and
reverse (5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:169)
primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced
fragment was digested with SacI and HindIII and ligated into
pBADButP pre-digested with the same restriction enzymes.
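Splicing two fragments in a second round of PCR relies on the end of the upstream fragment overlapping the start of the downstream fragment. A minimal sketch of how that overlap can be computed from the internal primers is given below, using the atoB reverse primer and the ald forward primer listed above. The helper names are hypothetical, and the approach (longest suffix/prefix match) is an illustrative assumption rather than a procedure stated in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splice_overlap(upstream_reverse, downstream_forward):
    """Longest suffix of the upstream fragment's top strand (the reverse
    complement of its reverse primer) that equals the start of the
    downstream forward primer. Returns "" if there is no overlap."""
    top = reverse_complement(upstream_reverse)
    fwd = downstream_forward.upper()
    for k in range(min(len(top), len(fwd)), 0, -1):
        if top[-k:] == fwd[:k]:
            return fwd[:k]
    return ""


atob_reverse = "GTCTTTATTCATATATATATCCTCCTTAATTCAACCGTTCAATCACCATC"
ald_forward = "GGTTGAATTAAGGAGGATATATATATGAATAAAGACACACTAATACCTAC"

overlap = splice_overlap(atob_reverse, ald_forward)
print(len(overlap), overlap)
```

For this primer pair the shared region is 36 nucleotides long and spans the end of atoB, the ribosome binding site, and the first codons of ald.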
Construction of pBADButP-atoB-ALD.
[0440] The DNA fragment 1 encoding chloramphenicol
acetyltransferase (CAT), p15A origin of replication, araBAD
promoter, atoB of Escherichia coli DH10B and ald of Clostridium
beijerinckii and the DNA fragment 2 encoding araBAD promoter, hbd,
crt, bcd, etfA, and etfB of Clostridium acetobutylicum ATCC 824
were amplified separately by polymerase chain reaction (PCR):
98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree.
C. for 4 min, repeated 30 times. The reaction mixture contained
1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:170)
for fragment 1 and (5'-CGGGGTACCACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID
NO:274) for fragment 2, and reverse
(5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:171) for
fragment 1 and (5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3')
(SEQ ID NO:172) for fragment 2 primers, 1U Phusion High Fidelity
DNA polymerase (NEB), and 50 ng pBADatoB-ald and pBADButP in 50
.mu.l, respectively. The amplified DNA fragments were digested with
NotI and KpnI and ligated to each other.
Construction of pBADilvCD.
[0441] The DNA fragments encoding ilvC and ilvD of Escherichia coli
DH10B were amplified separately by polymerase chain reaction (PCR):
98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree.
C. for 1 min, repeated 30 times. The reaction mixture contained
1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:173)
for ilvC and
(5'-TGCTGTTGCGGGTTAAGGAGGATATATATATGCCTAAGTACCGTTCCGCC-3') (SEQ ID
NO:174) for ilvD, and reverse
(5'-AACGGTACTTAGGCATATATATATCCTCCTTAACCCGCAACAGCAATACG-3') (SEQ ID
NO:175) for ilvC and (5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID
NO:176) for ilvD primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Escherichia coli DH10B genome (ATCC) in 50 .mu.l.
The amplified DNA fragments were gel purified and eluted into 30 ul
of EB buffer (Qiagen). 5 ul from each DNA solution was combined and
each DNA fragment was spliced by another round of PCR: 98.degree.
C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 2
min, repeated 30 times. The reaction mixture contained 1.times.
Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-GCTCTAGAGGAGGATATATATATGGCTAACTACTTCAATACAC-3') (SEQ ID NO:177)
and reverse (5'-ACATGCATGCTTAACCCCCCAGTTTCGATT-3') (SEQ ID NO:178)
primers, 1U Phusion High Fidelity DNA polymerase (NEB). The spliced
fragment was digested with XbaI and SphI and ligated into pBAD33
pre-digested with the same restriction enzymes.
Construction of pBADals-ilvCD.
[0442] The DNA fragment encoding als of Klebsiella pneumoniae
subsp. pneumoniae MGH 78578, with its codon usage optimized for
over-expression in E. coli, was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGGATAAACAGTATCCGGT-3')
(SEQ ID NO:179) and reverse (5'-GCTCTAGATTACAGAATTTGACTCAGGT-3')
(SEQ ID NO:180) primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng pETals in 50 .mu.l. The amplified DNA fragment was
digested with SacI and XbaI and ligated into pBADilvCD pre-digested
with the same restriction enzymes.
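Most forward primers in this example place an AGGAGG ribosome binding motif a short distance upstream of the ATG start codon. The sketch below measures that spacing for the als forward primer (SEQ ID NO:179). It is illustrative only: sd_to_start_spacing is a hypothetical helper, and treating the first AGGAGG and the next downstream ATG as the ribosome binding site and start codon is an assumption made for this check.

```python
def sd_to_start_spacing(primer, sd="AGGAGG", start="ATG"):
    """Number of bases between the first Shine-Dalgarno-like motif and the
    next start codon downstream of it, or None if either is missing."""
    seq = primer.upper().replace(" ", "")
    sd_pos = seq.find(sd)
    if sd_pos == -1:
        return None
    after_sd = sd_pos + len(sd)
    start_pos = seq.find(start, after_sd)
    if start_pos == -1:
        return None
    return start_pos - after_sd


als_forward = "CCCGAGCTCAGGAGGATATATATATGGATAAACAGTATCCGGT"  # SEQ ID NO:179
print(sd_to_start_spacing(als_forward))
```

For SEQ ID NO:179 the spacer between AGGAGG and ATG is 8 bases (ATATATAT). Primers with a different spacer, such as those for tktA or aroB, can be checked the same way.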
Construction of pBADalsS-ilvCD.
[0443] The DNA fragments encoding front and back halves of alsS
of Bacillus subtilis B26 were amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 0.5 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3')
(SEQ ID NO:181) for front and (5'-CGGTACCCTTTCCAGAGATTTAGAG-3') (SEQ
ID NO:275) for back halves, and reverse
(5'-CTCTAAATCTCTGGAAAGGGTACCG-3') (SEQ ID NO:182) for front and
(5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3') (SEQ ID NO:183) for back
halves primers, 1U Phusion High Fidelity DNA polymerase (NEB), and
50 ng Bacillus subtilis B26 genome (ATCC) in 50 .mu.l. The
amplified DNA fragments were gel purified and eluted into 30 ul of
EB buffer (Qiagen). 5 ul from each DNA solution was combined and
each DNA fragment was spliced by another round of PCR: 98.degree.
C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1
min, repeated 30 times. The reaction mixture contained 1.times.
Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CCCGAGCTCAGGAGGATATATATATGTTGACAAAAGCAACAAAAG-3') (SEQ ID
NO:184) and reverse (5'-GCTCTAGATTAGAGAGCTTTCGTTTTCATG-3') (SEQ ID
NO:185) primers, 1U Phusion High Fidelity DNA polymerase (NEB). The
spliced fragment was free of internal XbaI sites and was therefore
digested with SacI and XbaI and ligated into pBADilvCD pre-digested
with the same restriction enzymes.
Construction of pBADLeuABCD.
[0444] The DNA fragment encoding leuA, leuB, leuC, and leuD of
Escherichia coli BL21(DE3) was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 3 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID
NO:186) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3')
(SEQ ID NO:187) primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l. The
amplified DNA fragment was digested with SacI and XbaI and ligated
into pBAD33 pre-digested with the same restriction enzymes.
Construction of pBADLeuABCD2.
[0445] The DNA fragment 1 encoding leuA and leuB and the DNA
fragment 2 encoding leuC and leuD of Escherichia coli BL21(DE3)
were amplified by polymerase chain reaction (PCR): 98.degree. C.
for 10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID
NO:188) for fragment 1 and
(5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID
NO:189) for fragment 2 and reverse
(5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID
NO:190) for fragment 1 and
(5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:191) for
fragment 2 primers, 1U Phusion High Fidelity DNA polymerase (NEB),
and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l,
respectively. The amplified DNA fragments were gel purified and
eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 3 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID
NO:192) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3')
(SEQ ID NO:193) primers, 1U Phusion High Fidelity DNA polymerase
(NEB). The spliced fragment was digested with SacI and XbaI and
ligated into pBAD33 pre-digested with the same restriction
enzymes.
Construction of pBADLeuABCD4.
[0446] The DNA fragments encoding leuA, leuB, leuC and leuD of
Escherichia coli BL21(DE3) were amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID
NO:194) for leuA,
(5'-GAAACCGTGTGAGGAGGATATATATATGTCGAAGAATTACCATATTGCCG-3') (SEQ ID
NO:195) for leuB,
(5'-AGGGGTGTAAGGAGGATATATATATGGCTAAGACGTTATACGAAAAATTG-3') (SEQ ID
NO:196) for leuC, and
(5'-ACATTAAATAAGGAGGATATATATATGGCAGAGAAATTTATCAAACACAC-3') (SEQ ID
NO:197) for leuD and reverse
(5'-ATTCTTCGACATATATATATCCTCCTCACACGGTTTCCTTGTTGTTTTCG-3') (SEQ ID
NO:198) for leuA,
(5'-CGTCTTAGCCATATATATATCCTCCTTACACCCCTTCTGCTACATAGCGG-3') (SEQ ID
NO:199) for leuB,
(5'-TTTCTCTGCCATATATATATCCTCCTTATTTAATGTTGCGAATGTCGGCG-3') (SEQ ID
NO:200) for leuC, and (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3')
(SEQ ID NO:201) for leuD primers, 1U Phusion High Fidelity DNA
polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50
.mu.l, respectively. The amplified DNA fragments were gel purified
and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 3 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAGCCAGCAAGTCATTATTTTCG-3') (SEQ ID
NO:202) and reverse (5'-AAAACTGCAGCGTTTGATGACGTGGACGATAGCGG-3')
(SEQ ID NO:203) primers, 1U Phusion High Fidelity DNA polymerase
(NEB). The spliced fragment was digested with SacI and XbaI and
ligated into pBAD33 pre-digested with the same restriction
enzymes.
Construction of pBADals-ilvCD-leuABCD, pBADals-ilvCD-leuABCD2,
pBADals-ilvCD-leuABCD4, pBADalsS-ilvCD-leuABCD,
pBADalsS-ilvCD-leuABCD2, pBADalsS-ilvCD-leuABCD4.
[0447] The DNA fragments 1 (for als) and 2 (for alsS) encoding
chloramphenicol acetyltransferase (CAT), p15A origin of replication,
araBAD promoter, als of Klebsiella pneumoniae subsp. pneumoniae MGH
78578 (with its codon usage optimized for over-expression in E. coli)
or alsS of Bacillus subtilis B26, and ilvC and ilvD of E. coli DH10B
were amplified separately by polymerase chain reaction (PCR):
98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree.
C. for 4 min, repeated 30 times. The reaction mixture contained
1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:204)
and reverse (5'-CGGGGTACCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID
NO:205) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and
50 ng pBADals-ilvCD and pBADalsS-ilvCD in 50 .mu.l,
respectively.
[0448] To remove an internal SphI restriction enzyme site from
leuC, overlap PCR was carried out. The front and back halves of
DNA fragment 3 (for leuABCD), fragment 4 (for leuABCD2), and
fragment 5 (for leuABCD4) encoding araBAD promoter, leuA, leuB,
leuC, and leuD of E. coli BL21(DE3) were amplified separately by
polymerase chain reaction (PCR): 98.degree. C. for 10 sec,
60.degree. C. for 15 sec, and 72.degree. C. for 4 min, repeated 30
times. The reaction mixture contained 1.times. Phusion buffer
(NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3') (SEQ ID NO:206)
for front and (5'-CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA-3') (SEQ ID
NO:207) for back halves, and reverse
(5'-TTTGCCTTTCGGCGCGTGCAGACGGCCTTTG-3') (SEQ ID NO:208) for front
and (5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:209) for
back halves primers, 1U Phusion High Fidelity DNA polymerase (NEB), and
50 ng pBADleuABCD, pBADleuABCD2, and pBADleuABCD4 in 50 .mu.l,
respectively. The amplified DNA fragments were gel purified and
eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 4 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-AAGGAAAAAAGCGGCCGCACTTTTCATACTCCCGCCATTCAG-3')
(SEQ ID NO:210) and reverse
(5'-ACATGCATGCCGTTTGATGACGTGGACGATAGCGG-3') (SEQ ID NO:211)
primers, 1U Phusion High Fidelity DNA polymerase (NEB). The
resulting fragments 3, 4, and 5 were digested with SphI and NotI and
ligated into both fragments 1 and 2 pre-digested with the same
restriction enzymes.
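The two internal primers used in the overlap PCR above (SEQ ID NO:207 and SEQ ID NO:208) can be checked for the properties that site removal depends on: they should be complementary to each other, and neither should contain the SphI recognition sequence GCATGC. The sketch below is illustrative only. The helper names are hypothetical, and the native leuC sequence is not reproduced here, so the code can only show where a primer differs from GCATGC by a single base, not which base was originally present.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def near_sites(seq, site, max_mismatches=1):
    """(position, window) pairs where seq differs from site by at least one
    and at most max_mismatches bases."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        mismatches = sum(a != b for a, b in zip(window, site))
        if 0 < mismatches <= max_mismatches:
            hits.append((i, window))
    return hits


SPHI = "GCATGC"
inner_forward = "CAAAGGCCGTCTGCACGCGCCGAAAGGCAAA"  # SEQ ID NO:207
inner_reverse = "TTTGCCTTTCGGCGCGTGCAGACGGCCTTTG"  # SEQ ID NO:208

print(reverse_complement(inner_forward) == inner_reverse)
print(SPHI in inner_forward, SPHI in inner_reverse)
print(near_sites(inner_forward, SPHI))
```

The two primers are exact reverse complements, neither contains GCATGC, and the forward primer carries GCACGC, which is one base away from the SphI site. This is consistent with a single-base change introduced through the primers.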
Construction of pBADaroG-tktA-aroBDE.
[0449] The DNA fragments encoding aroG, tktA, aroB, aroD, and aroE
of Escherichia coli BL21(DE3) were amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID
NO:212) for aroG,
(5'-GCGTCGCGGGTAAGGAGGAAAATTTTATGTCCTCACGTAAAGAGCTTGCC-3') (SEQ ID
NO:213) for tktA,
(5'-GAACTGCTGTAAGGAGGTTAAAATTATGGAGAGGATTGTCGTTACTCTCG-3') (SEQ ID
NO:214) for aroB,
(5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID
NO:215) for aroD, and
(5'-TACACCAGGCATAAGGAGGAATTAATTATGGAAACCTATGCTGTTTTTGG-3') (SEQ ID
NO:216) for aroE and reverse
(5'-TACGTGAGGACATAAAATTTTCCTCCTTACCCGCGACGCGCTTTTACTGC-3') (SEQ ID
NO:217) for aroG,
(5'-CAATCCTCTCCATAATTTTAACCTCCTTACAGCAGTTCTTTTGCTTTCGC-3') (SEQ ID
NO:218) for tktA,
(5'-CAATCAGCGTAAGGAGGTATATATAATGAAAACCGTAACTGTAAAAGATC-3') (SEQ ID
NO:219) for aroB,
(5'-TACGGTTTTCATTATATATACCTCCTTACGCTGATTGACAATCGGCAATG-3') (SEQ ID
NO:220) for aroD, and (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3')
(SEQ ID NO:221) for aroE primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l,
respectively. The amplified DNA fragments were gel purified and
eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 3 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CCCGAGCTCAGGAGGATATATATATGAATTATCAGAACGACGATTTAC-3') (SEQ ID
NO:222) and reverse (5'-ACATGCATGCTTACGCGGACAATTCCTCCTGCAA-3') (SEQ
ID NO:223) primers, 1U Phusion High Fidelity DNA polymerase (NEB).
The spliced fragment was digested with SacI and SphI and ligated
into pBAD33 pre-digested with the same restriction enzymes.
Construction of pBADpheA-aroLAC.
[0450] The DNA fragments encoding pheA, aroL, aroA, and aroC of
Escherichia coli DH10 were amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3')
(SEQ ID NO:224) for pheA,
(5'-GATCCAACCTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID
NO:225) for aroL,
(5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID
NO:226) for aroA, and
(5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID
NO:227) for aroC and reverse
(5'-AGGTTGTGTCATAAAATTTTCCTCCTTAGGTTGGATCAACAGGCACTACG-3') (SEQ ID
NO:228) for pheA,
(5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID
NO:229) for aroL,
(5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID
NO:230) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3')
(SEQ ID NO:231) for aroC primers, 1U Phusion High Fidelity DNA
polymerase (NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50
.mu.l, respectively. The amplified DNA fragments were gel purified
and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 4 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CCCGAGCTCAGGAGGATATATATATGACATCGGAAAACCCGTTACTGG-3') (SEQ ID
NO:232) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3')
(SEQ ID NO:233) primers, 1U Phusion High Fidelity DNA polymerase
(NEB). The spliced fragment was digested with SacI and SphI and
ligated into pBAD33 pre-digested with the same restriction
enzymes.
Construction of pBADtyrA-aroLAC.
[0451] The DNA fragments encoding tyrA, aroL, aroA, and aroC of
Escherichia coli DH10 were amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3')
(SEQ ID NO:234) for tyrA,
(5'-AATCGCCAGTAAGGAGGAAAATTTTATGACACAACCTCTTTTTCTGATCG-3') (SEQ ID
NO:235) for aroL,
(5'-GATCAATTGTTAAGGAGGTATATATAATGGAATCCCTGACGTTACAACCC-3') (SEQ ID
NO:236) for aroA, and
(5'-CAGGCAGCCTAAGGAGGAATTAATTATGGCTGGAAACACAATTGGACAAC-3') (SEQ ID
NO:237) for aroC, and reverse
(5'-GAGGTTGTGTCATAAAATTTTCCTCCTTACTGGCGATTGTCATTCGCCTG-3') (SEQ ID
NO:238) for tyrA,
(5'-CAGGGATTCCATTATATATACCTCCTTAACAATTGATCGTCTGTGCCAGG-3') (SEQ ID
NO:239) for aroL,
(5'-GTTTCCAGCCATAATTAATTCCTCCTTAGGCTGCCTGGCTAATCCGCGCC-3') (SEQ ID
NO:240) for aroA, and (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3')
(SEQ ID NO:241) for aroC primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Escherichia coli BL21(DE3) genome in 50 .mu.l,
respectively. The amplified DNA fragments were gel purified and
eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 4 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CCCGAGCTCAGGAGGATATATATATGGTTGCTGAATTGACCGCATTAC-3') (SEQ ID
NO:242) and reverse (5'-ACATGCATGCTTACCAGCGTGGAATATCAGTCTTC-3')
(SEQ ID NO:243) primers, 1U Phusion High Fidelity DNA polymerase
(NEB). The spliced fragment was digested with SacI and SphI and
ligated into pBAD33 pre-digested with the same restriction
enzymes.
Construction of pBADpheA-aroLAC-aroG-tktA-aroBDE and
pBADtyrA-aroLAC-aroG-tktA-aroBDE.
[0452] DNA fragments 1 (for pheA) and 2 (for tyrA) encoding
chloramphenicol acetyltransferase (CAT), p15A origin of replication,
araBAD promoter, pheA or tyrA, aroL, aroA, aroC of Escherichia coli
DH10B and a DNA fragment 3 encoding araBAD promoter, aroG, tktA,
aroB, aroD, and aroE of Escherichia coli DH10B were amplified
separately by polymerase chain reaction (PCR): 98.degree. C. for 10
sec, 60.degree. C. for 15 sec, and 72.degree. C. for 4 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-AAGGAAAAAAGCGGCCGCCCCTGAACCGACGACCGGGTCG-3') (SEQ ID NO:244)
for fragment 1 and 2 and (5'-GCTCTAGAACTTTTCATACTCCCGCCATTCAG-3')
(SEQ ID NO:245) for fragment 3, and reverse
(5'-GCTCTAGAGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID NO:246) for
fragment 1 and 2 and
(5'-AAGGAAAAAAGCGGCCGCGCGGATACATATTTGAATGTATTTAG-3') (SEQ ID
NO:247) for fragment 3 primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng pBADpheA-aroLAC, pBADtyrA-aroLAC, and
pBADaroG-tktA-aroBDE in 50 .mu.l, respectively. Amplified DNA
fragments 1 and 2 were digested with NotI and XbaI and ligated into
fragment 3 pre-digested with the same restriction enzymes.
Construction of pTrcBAL.
[0453] A DNA sequence encoding benzaldehyde lyase (bal) of
Pseudomonas fluorescens, with its codon usage optimized for
over-expression in E. coli, was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward (5'-CATGCCATGGCTATGATTACTGGTGG-3') (SEQ ID NO:248)
and reverse (5'-CCCCGAGCTCTTACGCGCCGGATTGGAAATACA-3') (SEQ ID
NO:249) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and
50 ng pETBAL in 50 .mu.l. Amplified DNA fragment was digested with
NcoI and SacI and ligated into pTrc99A pre-digested with the same
restriction enzymes.
Construction of pTrcAdhE2.
[0454] A DNA sequence encoding CoA-linked alcohol/aldehyde
dehydrogenase (adhE2) of Clostridium acetobutylicum ATCC 824 was
amplified by polymerase chain reaction (PCR): 98.degree. C. for 10
sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CATGCCATGGCCAAAGTTACAAATCAAAAAG-3') (SEQ ID NO:250) and reverse
(5'-CGAGCTCTTAAAATGATTTTATATAGATATCC-3') (SEQ ID NO:251) primers,
1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
Clostridium acetobutylicum ATCC 824 genome in 50 .mu.l. Amplified
DNA fragment was digested with NcoI and SacI and ligated into
pTrc99A pre-digested with the same restriction enzymes.
Construction of pTrcAdh2.
[0455] A DNA sequence encoding alcohol dehydrogenase (adh2) of
Saccharomyces cerevisiae was amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CATGCCATGGGTATTCCAGAAACTCAAAAAG-3') (SEQ ID NO:252) and
reverse (5'-CCCGAGCTCTTATTTAGAAGTGTCAACAACG-3') (SEQ ID NO:253)
primers, 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
genome of Saccharomyces cerevisiae in 50 .mu.l. Amplified DNA
fragment was digested with NcoI and SacI and ligated into pTrc99A
pre-digested with the same restriction enzymes.
Construction of pTrcBALD.
[0456] A DNA sequence encoding CoA-linked aldehyde dehydrogenase
(ald) of Clostridium beijerinckii was amplified by polymerase chain
reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 1 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CCCCGAGCTCAGGAGGATATACATATGAATAAAGACACACTAATACC-3') (SEQ ID NO:254) and reverse
(5'-CCCAAGCTTAGCCGGCAAGTACACATCTTC-3') (SEQ ID NO:255) primers, 1U
Phusion High Fidelity DNA polymerase (NEB), and 50 ng pETBAL in 50
.mu.l. Amplified DNA fragment was digested with SacI and HindIII and
ligated into pTrcBAL pre-digested with the same restriction
enzymes.
Construction of pTrcBALK.
[0457] A DNA sequence encoding ketoisovalerate decarboxylase (kivd)
of Lactococcus lactis was amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ
ID NO:256) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3')
(SEQ ID NO:257) primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng pETBAL in 50 .mu.l. Amplified DNA fragment was
digested with SacI and XbaI and ligated into pTrcBAL pre-digested
with the same restriction enzymes.
Construction of pTrcAdh-Kivd.
[0458] A DNA sequence encoding ketoisovalerate decarboxylase (kivd)
of Lactococcus lactis was amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CCCGAGCTCAGGAGGATATATATATGTATACAGTAGGAGATTACC-3') (SEQ
ID NO:258) and reverse (5'-GCTCTAGATTATGATTTATTTTGTTCAGCAAAT-3')
(SEQ ID NO:259) primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng pETBAL in 50 .mu.l. Amplified DNA fragment was
digested with SacI and XbaI and ligated into pTrcAdh2 pre-digested
with the same restriction enzymes.
Construction of pTrcBAL-DDH-2ADH.
[0459] To remove internal NcoI site, overlap PCR was carried out.
DNA fragments encoding front and bottom halves of
meso-2,3-butanediol dehydrogenase (ddh) of Klebsiella pneumoniae
subsp. pneumoniae MGH 78578 and secondary alcohol dehydrogenase
(2adh) of Pseudomonas fluorescens were amplified separately by
polymerase chain reaction (PCR): 98.degree. C. for 10 sec,
60.degree. C. for 15 sec, and 72.degree. C. for 1 min, repeated 30
times. The reaction mixture contained 1.times. Phusion buffer
(NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID
NO:260) for front half of ddh,
(5'-GGCCGGCGGCCGCGCGATGGCGGTGAAAGTG-3') (SEQ ID NO:261) for bottom
half of ddh,
(5'-AACTAATCTAGAGGAGGATATATATATGAGCATGACGTTTTCCGGCCAGG-3') (SEQ ID
NO:262) for front half of 2adh, and
(5'-CCTTGCGGAGGGCTCGATGGATGAGTTCGAC-3') (SEQ ID NO:263) for bottom
half of 2adh, and reverse (5'-CACTTTCACCGCCATCGCGCGGCCGCCGGCC-3')
(SEQ ID NO:264) for front half of ddh,
(5'-GCTCATATATATATCCTCCTCTAGATTAGTTAAACACCATCCCGCCGTCG-3') (SEQ ID
NO:265) for bottom half of ddh,
(5'-GTCGAACTCATCCATCGAGCCCTCCGCAAGG-3') (SEQ ID NO:266) for front
half of 2adh, and (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ ID
NO:267) for bottom half of 2adh primers, 1U Phusion High Fidelity DNA
polymerase (NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae
MGH 78578 genome for ddh and Pseudomonas fluorescens genome for 2adh in 50
.mu.l, respectively. The amplified DNA fragments were gel purified
and eluted into 30 ul of EB buffer (Qiagen). 5 ul from each DNA
solution was combined and each DNA fragment was spliced by another
round of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec,
and 72.degree. C. for 2 min, repeated 30 times. The reaction
mixture contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5
.mu.M forward
(5'-CGAGCTCAGGAGGATATATATATGAAAAAAGTCGCACTTGTTACCG-3') (SEQ ID
NO:268) and reverse (5'-CCCAAGCTTAGATCGCGGTGGCCCCGCCGTCG-3') (SEQ
ID NO:269) primers, 1U Phusion High Fidelity DNA polymerase (NEB).
The spliced fragment was digested with SacI and HindIII and ligated
into pTrcBAL pre-digested with the same restriction enzymes.
Construction of pBBRPduCDEGH.
[0460] A DNA sequence encoding propanediol dehydratase medium
(pduD) and small (pduE) subunits and propanediol dehydratase
reactivation large (pduG) and small (pduH) subunits of Klebsiella
pneumoniae subsp. pneumoniae MGH 78578 was amplified by polymerase
chain reaction (PCR): 98.degree. C. for 10 sec, 60.degree. C. for
15 sec, and 72.degree. C. for 2 min, repeated 30 times. The
reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM
dNTP, 0.5 .mu.M forward
(5'-GCTCTAGAGGAGGATTTAAAAATGGAAATTAACGAAACGCTGC-3') (SEQ ID NO:270)
and reverse (5'-TCCCCGCGGTTAAGCATGGCGATCCCGAAATGGAATCCCTTTGAC-3')
(SEQ ID NO:271) primers, 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578
in 50 .mu.l. The amplified DNA fragment was digested with SacII and
XbaI and ligated into pTrc99A pre-digested with the same
restriction enzymes to form pBBRPduDEGH.
[0461] A DNA sequence encoding propanediol dehydratase large
subunit (pduC) of Klebsiella pneumoniae subsp. pneumoniae MGH 78578
was amplified by polymerase chain reaction (PCR): 98.degree. C. for
10 sec, 60.degree. C. for 15 sec, and 72.degree. C. for 1 min,
repeated 30 times. The reaction mixture contained 1.times. Phusion
buffer (NEB), 2 mM dNTP, 0.5 .mu.M forward
(5'-CCGCTCGAGGAGGATATATATATGAGATCGAAAAGATTTGAAGC-3') (SEQ ID
NO:272) and reverse (5'-GCTCTAGATTAGCCAAGTTCATTGGGATCG-3') (SEQ ID
NO:273) primers, 1U Phusion High Fidelity DNA polymerase (NEB), and
50 ng Klebsiella pneumoniae subsp. pneumoniae MGH 78578 in 50
.mu.l. The amplified DNA fragment was digested with XhoI and XbaI and
ligated into pBBRPduDEGH pre-digested with the same restriction
enzymes.
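As a non-limiting illustration, the primers used for pBBRPduCDEGH can be checked for the restriction sites named in the digestion steps above. The Python sketch below uses the standard recognition sequences for SacII, XbaI, and XhoI, which are well known but are not recited in the text; the dictionary and function names are hypothetical.

```python
# Standard recognition sequences (assumed here; not recited in the text).
RECOGNITION_SITES = {
    "SacII": "CCGCGG",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
}

# Primers copied from the text above, keyed by SEQ ID NO.
PRIMERS = {
    270: "GCTCTAGAGGAGGATTTAAAAATGGAAATTAACGAAACGCTGC",
    271: "TCCCCGCGGTTAAGCATGGCGATCCCGAAATGGAATCCCTTTGAC",
    272: "CCGCTCGAGGAGGATATATATATGAGATCGAAAAGATTTGAAGC",
    273: "GCTCTAGATTAGCCAAGTTCATTGGGATCG",
}


def sites_in(primer: str) -> list:
    """Return the sorted names of enzymes whose site occurs in the primer."""
    seq = primer.upper()
    return sorted(name for name, site in RECOGNITION_SITES.items() if site in seq)


if __name__ == "__main__":
    for seq_id, primer in PRIMERS.items():
        print(seq_id, sites_in(primer))
```

Each primer is expected to carry exactly the site used to clone its fragment, which is a quick way to catch a primer that was paired with the wrong enzyme.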
Construction of pTrcIpdc-Par.
[0462] DNA sequences encoding indole-3-pyruvate decarboxylase (ipdc)
of Azospirillum brasilense and phenylethanol reductase (par) of
Rhodococcus sp. ST-10 were amplified by polymerase chain reaction
(PCR): 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 1 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward primers (5'-CATGCCATGGGACTGGCTGAGGCACTGCTGC-3' (SEQ ID
NO:314) for ipdc and
5'-CGAGCTCAGGAGGATATATATATGAAAGCTATCCAGTACACCCGTAT-3' (SEQ ID
NO:315) for par) and reverse primers
(5'-CGAGCTCTTATTCGCGCGGTGCCGCGTGCAGG-3' (SEQ ID NO:316) for ipdc
and 5'-GCTCTAGATTACAGGCCCGGAACCACAACGGCGC-3' (SEQ ID NO:317) for
par), 1U Phusion High Fidelity DNA polymerase (NEB), and 50 ng
pTrcIpdc and pTrcPar, respectively, in 50 .mu.l. The amplified DNA
fragments of ipdc and par were digested with NcoI/SacI and
SacI/XbaI, respectively, and were ligated into pTrc99A pre-digested
with NcoI and XbaI.
Testing and Results:
[0463] To test the butyraldehyde biosynthesis pathway, DH10B
harboring pBADButP-atoB/pTrcBALD and
pBADButP-atoB-ALD/pTrcB2DH/pBBRpduCDEGH were grown overnight in LB
media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100 ug/ml
ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot of
each seed culture was inoculated into fresh TB media containing
Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at
37.degree. C., 200 rpm. Three hours after inoculation, the cultures
were induced with 13.3 mM arabinose and 1 mM IPTG and were grown
overnight. 700 ul of this culture was extracted with an equal volume
of ethyl acetate and analyzed by GC-MS.
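For orientation only, the inducer concentrations given above can be restated in weight-per-volume terms. The Python sketch below assumes molar masses of 150.13 g/mol for arabinose and 238.30 g/mol for IPTG; these are standard values that are not stated in the text, and the function name is hypothetical.

```python
# Molar masses in g/mol. These values are NOT given in the text and are
# assumed here for illustration.
ARABINOSE_G_PER_MOL = 150.13  # C5H10O5
IPTG_G_PER_MOL = 238.30       # isopropyl beta-D-1-thiogalactopyranoside


def millimolar_to_percent_wv(conc_mm: float, molar_mass: float) -> float:
    """Convert a concentration in mM to percent weight/volume (g per 100 mL)."""
    if conc_mm < 0 or molar_mass <= 0:
        raise ValueError("concentration must be >= 0 and molar mass > 0")
    grams_per_litre = conc_mm / 1000.0 * molar_mass
    return grams_per_litre / 10.0  # 1% (w/v) corresponds to 10 g/L


if __name__ == "__main__":
    print(round(millimolar_to_percent_wv(13.3, ARABINOSE_G_PER_MOL), 3))
    print(round(millimolar_to_percent_wv(1.0, IPTG_G_PER_MOL), 4))
```

Under these assumptions, 13.3 mM arabinose corresponds to approximately 0.2% (w/v).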
[0464] To test the isobutyraldehyde biosynthesis pathway, DH10B
cells harboring pBADals-ilvCD/pTrcBALK or pBADalsS-ilvCD/pTrcBALK
were grown overnight in LB media containing 50 ug/ml
chloramphenicol (Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100)
at 37.degree. C., 200 rpm. An aliquot of each seed culture was
inoculated into fresh TB media containing Cm.sup.50 and Amp.sup.100
and was grown in an incubation shaker at 37.degree. C., 200 rpm.
Three hours after inoculation, the cultures were induced with 13.3
mM arabinose and 1 mM IPTG and were grown overnight. 700 ul of this
culture was extracted with an equal volume of ethyl acetate and
analyzed by GC-MS for the production of isobutyraldehyde. FIG. 8B
shows the production of isobutanal from these cultures.
[0465] To test the 3-methylbutyraldehyde and 2-methylbutyraldehyde
biosynthesis pathways, DH10B harboring
pBADals-ilvCD-LeuABCD/pTrcBALK, pBADals-ilvCD-LeuABCD2/pTrcBALK,
pBADals-ilvCD-LeuABCD/pTrcBALK4, pBADalsS-LeuABCD/pTrcBALK,
pBADalsS-LeuABCD2/pTrcBALK, or pBADalsS-LeuABCD4/pTrcBALK were
grown overnight in LB media containing 50 ug/ml chloramphenicol
(Cm.sup.50) and 100 ug/ml ampicillin (Amp.sup.100) at 37.degree. C.,
200 rpm. An aliquot of each seed culture was inoculated into fresh TB
media containing Cm.sup.50 and Amp.sup.100 and was grown in an
incubation shaker at 37.degree. C., 200 rpm. Three hours after
inoculation, the cultures were induced with 13.3 mM arabinose and 1
mM IPTG and were grown overnight. 700 ul of this culture was
extracted with an equal volume of ethyl acetate and analyzed by
GC-MS. The production
of 2-isovaleralcohol (2-methylpental) and 3-isovaleralcohol
(3-methylpentanal) was monitored because 3-isovaleraldehyde and
2-isovaleraldehyde are spontaneously converted to their
corresponding alcohols. FIG. 8B shows the production of
2-methylpental and 3-methylpentanal from these cultures.
[0466] To test the phenylacetoaldehyde and
4-hydroxyphenylacetoaldehyde biosynthesis pathways, DH10B cells
harboring pBADpheA-aroLAC/pTrcBALK, pBADtyrA-aroLAC/pTrcBALK,
pBADaroG-tktA-aroBDE/pTrcBALK,
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK, and
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK were grown overnight in
LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100
ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot
of each seed culture was inoculated into fresh TB media containing
Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at
37.degree. C., 200 rpm. Three hours after inoculation, the cultures
were induced with 13.3 mM arabinose and 1 mM IPTG and were grown
overnight. 700 ul of this culture was extracted with an equal volume
of ethyl acetate and analyzed by GC-MS. The production of
phenylacetoaldehyde, 4-hydroxyphenylacetoaldehyde and their
corresponding alcohols was monitored using GC-MS. FIG. 9B shows
the production of 4-hydroxyphenylethanol from these cultures.
[0467] To test the 2-phenylethanol, 2-(4-hydroxyphenyl)ethanol, and
2-(indole-3) ethanol biosynthesis pathways, DH10B harboring
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK,
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcBALK,
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd,
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcAdh2-Kivd,
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par, and
pBADpheA-aroLAC-aroG-tktA-aroBDE/pTrcIpdc-Par were grown overnight
in LB media containing 50 ug/ml chloramphenicol (Cm.sup.50) and 100
ug/ml ampicillin (Amp.sup.100) at 37.degree. C., 200 rpm. An aliquot
of each seed culture was inoculated into fresh TB media containing
Cm.sup.50 and Amp.sup.100 and was grown in an incubation shaker at
37.degree. C., 200 rpm. Three hours after inoculation, the cultures
were induced with 13.3 mM arabinose and 1 mM IPTG and were grown for
between overnight and one week. 700 ul of this culture was extracted
with an equal volume of ethyl acetate and analyzed by GC-MS. The
results are detailed below.
[0468] The production of 2-phenylethanol,
2-(4-hydroxyphenyl)ethanol and/or 2-(indole-3-)ethanol was
monitored using GC-MS. FIG. 42A shows the production of
2-phenylethanol from these cultures at 24 hours. FIG. 42B shows the
production of 2-(4-hydroxyphenyl)ethanol from these cultures at 24
hours. FIG. 42C shows the production of 2-(indole-3-)ethanol from
these cultures at 24 hours.
[0469] FIG. 43A shows the GC-MS chromatogram for control (pBAD33
and pTrc99A) at one week. FIG. 43B shows the GC-MS chromatogram for
2-phenylethanol (5.97 min) production from
pBADpheA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week. FIG. 44
shows the GC-MS chromatogram for 2-(4-hydroxyphenyl)ethanol (9.36
min) and 2-(indole-3) ethanol (10.32 min) production from
pBADtyrA-aroLAC-aroG-tktA-aroBDE and pTrcBALK at one week.
Example 5
Isolation and Biological Activity of Diol Dehydrogenases
[0470] Available substrates such as 3-hydroxy-2-butanone (acetoin),
4-hydroxy-3-hexanone (propioin), 5-hydroxy-4-octanone (butyroin),
6-hydroxy-5-decanone (valeroin), and 1,2-cyclopentanediol were used
to measure the ability of diol dehydrogenases (ddh) to catalyze the
reduction of large saturated .alpha.-hydroxyketones to produce a
diol. All reagents were purchased from Sigma-Aldrich Co. and TCI
America, unless otherwise stated.
[0471] For cloning and isolation of DDH polypeptides, genomic DNA
from several species of bacteria was obtained from ATCC
(Lactobacillus brevis ATCC 367, Pseudomonas putida KT2440, and
Klebsiella pneumoniae MGH78578) and PCR-amplified (using Phusion
polymerase with 1.times. Phusion buffer, 0.2 mM dNTP, 0.5 .mu.L
Phusion enzyme, 1.5 .mu.M primers, and 20 pg template DNA in a 50
.mu.L reaction) utilizing the following protocol: 30 cycles,
98.degree. C./10 secs (denaturing), 60.degree. C./15 secs
(annealing), 72.degree. C./30 secs (elongation). Polymerase chain
reaction products were then digested using restriction enzymes NdeI
and BamHI, then ligated into NdeI/BamHI digested pET28 vectors.
Vectors containing ddh clones were transformed into BL21(DE3)
competent cells for protein expression. A single colony was
inoculated into LB media, and expression of 6.times.His-tagged
proteins of interest was induced at OD.sub.600=0.6 with 0.1 mM
IPTG. Expression was allowed to proceed for 15 hours at 22.degree.
C. The 6.times.His-tagged enzymes were purified using Ni-NTA spin
columns following the protocols suggested by QIAGEN, yielding
purified protein concentrations in the range of 1.1-6.5 mg/mL
(determined by Bradford assay).
[0472] Diol dehydrogenase ddh1 was isolated from Lactobacillus
brevis ATCC 367, diol dehydrogenase ddh2 was isolated from
Pseudomonas putida KT2440, and diol dehydrogenase ddh3 was isolated
from Klebsiella pneumoniae MGH78578. The nucleotide sequence
encoding and polypeptide sequence of ddh1 are shown in SEQ ID
NOS:97 and 98, respectively; nucleotide sequence encoding and
polypeptide sequence of ddh2 are shown in SEQ ID NOS:99 and 100,
respectively; and nucleotide sequence encoding and polypeptide
sequence of ddh3 are shown in SEQ ID NOS: 101 and 102,
respectively.
[0473] Reactions to measure biological activity of DDH polypeptides
were performed in a final volume of 200 .mu.L as follows: 25 mM
substrate, 0.04 mg/mL DDH polypeptide, 0.25 mg/mL nicotinamide
cofactor, 200 mM imidazole, 14 mM Tris-HCl, and 1.5% by volume
DMSO. Biological activity was assayed using a Molecular Devices
Thermomax 96 well plate reader, monitoring absorbance at 340 nm,
which corresponds to NADH or NADPH concentration. For the kinetic
studies, 0.04 mg/mL DDH polypeptide, 0.25 mg/mL NADH, 20 mM Tris
HCl Buffer pH 6.5(red) or 9.0(ox), T=25 C, 100 uL total volume was
used.
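By way of illustration, absorbance readings at 340 nm of the kind described above can be converted to NAD(P)H concentration with the Beer-Lambert law, and a kinetic trace can be reduced to an initial rate by a linear fit. The Python sketch below assumes the commonly cited molar extinction coefficient of 6220 M.sup.-1 cm.sup.-1 for NADH at 340 nm and an optical path length supplied by the user. Neither value is stated in this example, and the path length in a microplate well depends on the fill volume. All function names are hypothetical.

```python
# Assumed molar extinction coefficient of NADH at 340 nm, in 1/(M*cm).
# This value is not given in the text.
NADH_EXT_COEFF = 6220.0


def nadh_concentration_mm(absorbance_340: float, path_cm: float) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mM."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return absorbance_340 / (NADH_EXT_COEFF * path_cm) * 1000.0


def slope(xs, ys) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def initial_rate_mm_per_min(times_min, absorbances, path_cm: float) -> float:
    """Rate of NADH change in mM/min from the linear part of a trace."""
    return nadh_concentration_mm(slope(times_min, absorbances), path_cm)
```

A positive rate corresponds to NADH production (oxidation of the substrate) and a negative rate to NADH consumption (reduction of the substrate).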
[0474] FIG. 12A shows the biological activity of ddh1, ddh2, and
ddh3 using butyroin as a substrate (triangles represent ddh3
activity). FIG. 12B shows the oxidation activity of ddh3 towards
1,2-cyclopentanediol and 1,2-cyclohexanediol as measured by NADH
production. FIG. 13 summarizes the results of kinetic studies for
various substrates in the oxidation reactions catalyzed by the DDH
polypeptides. These reactions were NAD+ dependent.
Example 6
Sequential In Vivo Biological Activity of CC-Ligases (Lyases) and
Diol Dehydrogenases
[0475] The ability of a C--C lyase and a diol dehydrogenase to
perform the following sequential reaction was tested in E.
coli:
##STR00001##
[0476] For .alpha.-hydroxyketone and diol production, a pathway
comprising a benzaldehyde lyase (bal) gene isolated from
Pseudomonas fluorescens (codon usage was optimized for E. coli
protein expression) and meso-2,3-butanediol dehydrogenase (ddh)
gene isolated from Klebsiella pneumoniae subsp. pneumoniae MGH
78578 was constructed in E. coli and tested for its ability to
condense the substrates detailed below in Table 2 (e.g.,
acetaldehyde, propionaldehyde, butyraldehyde, isobutyraldehyde,
2-methyl-butyraldehyde, 3-methyl-butyraldehyde, phenylacetaldehyde,
and 4-hydroxyphenylacetaldehyde, or their corresponding alcohols)
to form .alpha.-hydroxyketone and the corresponding diol in vivo.
The production of various .alpha.-hydroxyketones and diols was
monitored by gas chromatography-mass spectrometry (GC-MS).
TABLE-US-00002 TABLE 2 Summary of substrates and products.
Substrate | Produced .alpha.-hydroxyketone | Produced diol | FIGS.
Butanal | 5-Hydroxy-4-octanone | 4,5-Octanediol | 17A & B
n-Pentanal | 6-Hydroxy-5-decanone | 5,6-Decanediol | 18A & B
3-Methylbutanal | 2,7-Dimethyl-5-hydroxy-4-octanone | 2,7-Dimethyl-4,5-octanediol | 19A & B
n-Hexanal | 7-Hydroxy-6-dodecanone | 6,7-Dodecanediol | 20A & B
4-Methylpentanal | 2,9-Dimethyl-6-hydroxy-5-decanone | 2,9-Dimethyl-5,6-decanediol | 21A & B
n-Octanal | 9-Hydroxy-8-hexadecanone | 8,9-Hexadecanediol | 22
Acetaldehyde | 3-Hydroxy-2-butanone | 2,3-Butanediol | 23
n-Propanal | 4-Hydroxy-3-hexanone | 3,4-Hexanediol | 24A & B
Phenylacetoaldehyde | 1,4-Diphenyl-3-hydroxy-2-butanone | 1,4-Diphenyl-2,3-butanediol | 25
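For the straight-chain aldehydes in Table 2, the products follow a simple pattern: two molecules of an aldehyde having n carbons give an .alpha.-hydroxyketone of 2n carbons with the ketone at position n and the hydroxyl at position n+1, and reduction gives the n,n+1-diol. The Python sketch below merely restates that pattern and is offered as an illustration. It does not apply to the branched or aromatic entries, and the function name is hypothetical.

```python
def acyloin_from_linear_aldehyde(n_carbons: int) -> dict:
    """Describe the self-condensation product of a linear C_n aldehyde.

    Returns the chain length and the positions of the ketone and hydroxyl
    groups in the alpha-hydroxyketone, plus the positions of the two
    hydroxyl groups in the corresponding diol.
    """
    if n_carbons < 2:
        raise ValueError("pattern is stated here for aldehydes of 2+ carbons")
    return {
        "chain_length": 2 * n_carbons,
        "ketone_position": n_carbons,
        "hydroxyl_position": n_carbons + 1,
        "diol_positions": (n_carbons, n_carbons + 1),
    }


if __name__ == "__main__":
    # Butanal (n=4) -> 5-hydroxy-4-octanone -> 4,5-octanediol
    print(acyloin_from_linear_aldehyde(4))
    # n-Octanal (n=8) -> 9-hydroxy-8-hexadecanone -> 8,9-hexadecanediol
    print(acyloin_from_linear_aldehyde(8))
```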
For Analysis of .ltoreq.C10.
[0477] E. coli harboring pTrcBAL-DDH-2ADH was grown overnight
in LB media containing 50 ug/ml kanamycin (Km). This seed culture
was inoculated into M9 media containing 3% (v/v) glycerol, 0.5%
(g/v) and 50 ug/ml Km. 10 mL cultures were grown to
O.D..sub.600=0.7, then cultures were induced with 0.5 mM IPTG. The
cells were allowed to express the enzymes of interest for 3 hours
before various aldehydes were added to a concentration of 5-10 mM.
After addition of aldehydes, the cultures were capped and incubated
at 37.degree. C. with shaking for 72 hours. Cultures were extracted
with 2 mL ethyl acetate, and analyzed on GC-MS using the following
protocol:
[0478] 1 .mu.L injection w/ 50:1 split
[0479] Inlet temperature--150.degree. C.
[0480] Initial oven temperature--50.degree. C.
[0481] Temperature Ramp 1--10.degree. C./min to 150.degree. C.
[0482] Temperature Ramp 2--50.degree. C./min to 300.degree. C.
[0483] GC to MS transfer temp--250.degree. C.
[0484] MS detection--full scan MW 35-200
For Analysis of .gtoreq.C12.
[0485] E. coli DH10B strains harboring pTrc99A (Ctrl vector) or
pTrcBAL were inoculated into 0.75.times.M9/0.5% LB containing 0.1
mM CaCl.sub.2, 2 mM MgSO.sub.4, 1 mM KCl, 1% galacturonate, 5
.mu.g/mL thiamine, Amp. The cultures were grown up to an optical
density (600 nm) of 0.8 and induced with 0.25 mM IPTG. The cells
were allowed to express the proteins for 2.5 hours at 37.degree.
C., then aldehyde substrate was added to a concentration of 5 mM,
the culture vial was capped tightly and incubated for 72 hours at
37.degree. C. with shaking at 200 rpm. 1 mL of the final culture was
extracted with 0.75 mL of ethyl acetate, centrifuged to facilitate
phase separation, then analyzed via GCMS using the following
method.
[0486] 1 .mu.L injection w/50:1 split
[0487] Inlet temperature--250.degree. C.
[0488] Initial oven temperature--50.degree. C.
[0489] Temperature Ramp 1--10.degree. C./min to 125.degree. C.
[0490] Temperature Ramp 2--30.degree. C./min to 300.degree. C.
[0491] Final Temperature 300.degree. C.--1 minute
[0492] GC to MS transfer temp--250.degree. C.
[0493] MS detection--full scan MW 40-260.
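As an illustration, the length of a GC oven program such as those listed above follows directly from its ramps. The Python sketch below adds up the ramp times and any final hold. It assumes there is no initial hold and no hold between ramps, since none is stated, and the function name is hypothetical.

```python
def program_minutes(initial_c: float, ramps, final_hold_min: float = 0.0) -> float:
    """Total oven program time in minutes.

    `ramps` is a sequence of (rate in deg C per min, target temperature in
    deg C) pairs applied in order, starting from `initial_c`.
    """
    total = 0.0
    temperature = initial_c
    for rate, target in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        if target < temperature:
            raise ValueError("this sketch only handles heating ramps")
        total += (target - temperature) / rate
        temperature = target
    return total + final_hold_min


if __name__ == "__main__":
    # Program listed for analysis of <=C10 products.
    print(program_minutes(50, [(10, 150), (50, 300)]))
    # Program listed for analysis of >=C12 products (1 minute final hold).
    print(program_minutes(50, [(10, 125), (30, 300)], final_hold_min=1.0))
```

Under these assumptions the first program runs for 13 minutes and the second for a little over 14 minutes.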
[0494] The results are depicted in FIGS. 17 through 25. FIG. 17
shows the sequential conversion of butanal into
5-hydroxy-4-octanone and then 4,5-octanediol. FIG. 18 shows the
sequential conversion of n-pentanal into 6-hydroxy-5-decanone and
then 5,6-decanediol. FIG. 19 shows the conversion of
3-methylbutanal into 2,7-dimethyl-5-hydroxy-4-octanone and then
2,7-Dimethyl-4,5-octanediol. FIG. 20 shows the sequential
conversion of n-hexanal into 7-hydroxy-6-dodecanone and then
6,7-dodecanediol. FIG. 21 shows the conversion of 4-methylpentanal
into 2,9-dimethyl-6-hydroxy-5-decanone and then
2,9-dimethyl-5,6-decanediol. FIG. 22 shows the conversion of
n-octanal into 9-hydroxy-8-hexadecanone. FIG. 23 shows the
conversion of acetaldehyde into 3-hydroxy-2-butanone. FIG. 24 shows
the sequential conversion of n-propanal into 4-hydroxy-3-hexanone
and then 3,4-hexanediol. FIG. 25 shows the conversion of
phenylacetoaldehyde into 1,4-diphenyl-3-hydroxy-2-butanone.
[0495] Similar to above, a pathway comprising a benzaldehyde lyase
(bal) gene isolated from Pseudomonas fluorescens (codon usage was
optimized for E. coli protein expression) was constructed in E.
coli and tested for its ability to catalyze the production of
various .alpha.-hydroxyketones. The results, which show the broad
spectrum of C--C ligase activity for the bal gene tested, are set
forth in FIG. 48 through FIG. 55.
Example 7
Sequential Biological Activity of Diol Dehydrogenases and Diol
Dehydratases
[0496] To test the sequential biological activity of diol
dehydrogenases and diol dehydratases in a dehydration and reduction
pathway, butyroin was used as a substrate in a sequential reaction
to produce 4-octanone. The enzyme diol dehydrogenase (e.g., ddh)
catalyzes the reversible reduction and oxidation of .alpha.-hydroxy
ketones and its corresponding diol, such as 5-hydroxy-4-octanone
and 4,5-octanediol, and the enzyme diol dehydratase (e.g., pduCDE)
catalyzes the irreversible dehydration of diols, such as
4,5-octanediol.
[0497] Diol dehydrogenase ddh from Klebsiella pneumoniae MGH 78578
and diol dehydratase pduCDE from Klebsiella pneumoniae MGH 78578
were cloned into a bacterial expression vector and expressed and
purified on a Ni-NTA column, as described in Example X except that
1 mM of 1,2-propanediol was added at all times during the expression
and purification of diol dehydratase. The large, medium, and small
subunits of the pduCDE polypeptide are encoded by the nucleotide
sequences of SEQ ID NOs:103, 105, and 107, respectively, and the
polypeptide sequences are set forth in SEQ ID NOs: 104, 106, and
108, respectively.
[0498] The ddh3 and pduCDE polypeptides were incubated with
butyroin and their appropriate cofactors, then assayed using gas
chromatography-mass spectrometry (GC-MS) for their ability to
perform sequential reactions resulting in the product 4-octanone.
Reaction conditions are given in Table 3 below. The reaction
mixture was incubated at 37.degree. C. for 40 hours in a 0.6 mL
eppendorf tube with minimal head space. The reaction product was
extracted with an equivalent volume of ethyl acetate, stored in a
glass vial, and sent to Thermo Fisher Scientific Instruments
Division for compositional analysis by GC-MS.
TABLE-US-00003 TABLE 3 Reaction Conditions
Rxn Component | Concentration
5-hydroxy-4-octanone (butyroin) | 8.4 mM
Adenosylcobalamin (coenzyme B.sub.12) | 33.5 .mu.M
KCl | 9.6 mM
NADH | 18 mM
dDH3 enzyme | 0.19 mg/mL
dDOH1 enzyme mix | 0.15 mg/mL
Reaction Buffer | 10 mM Tris HCl pH 7.0
[0499] FIG. 26A shows GC-MS data confirming the presence of
4,5-octanediol in the sample extraction. From its mass spectrum, the
peak at a retention time of 5.36 min was identified as butyroin
(substrate), and the peaks at 6.01, 6.09, and 6.12 min were
identified as different isomers of 4,5-octanediol. This compound is
the expected product resulting from the reduction of butyroin by ddh3.
[0500] FIG. 26B shows GC-MS data confirming the presence of
4-octanone in the sample extraction. From its mass spectrum, the
peak at a retention time of 4.55 min was identified as 4-octanone.
This compound is the expected product resulting from the sequential
reduction of butyroin and dehydration of 4,5-octanediol by
ddh3 and pduCDE, respectively.
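For illustration, the mass changes expected along this reaction sequence can be computed from the molecular formulas of the three compounds. The Python sketch below uses standard average atomic masses and the formulas C.sub.8H.sub.16O.sub.2 (butyroin), C.sub.8H.sub.18O.sub.2 (4,5-octanediol), and C.sub.8H.sub.16O (4-octanone), which follow from the compound names but are not written out in the text. The names in the code are hypothetical.

```python
# Standard average atomic masses in g/mol (assumed; not given in the text).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Molecular formulas inferred from the compound names.
BUTYROIN = {"C": 8, "H": 16, "O": 2}        # 5-hydroxy-4-octanone
OCTANEDIOL = {"C": 8, "H": 18, "O": 2}      # 4,5-octanediol
OCTANONE = {"C": 8, "H": 16, "O": 1}        # 4-octanone
WATER = {"H": 2, "O": 1}


def molar_mass(formula: dict) -> float:
    """Sum of atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


if __name__ == "__main__":
    for name, formula in (("butyroin", BUTYROIN),
                          ("4,5-octanediol", OCTANEDIOL),
                          ("4-octanone", OCTANONE)):
        print(name, round(molar_mass(formula), 3))
```

The reduction step adds two hydrogens (about 2 mass units) and the dehydration step removes one water (about 18 mass units), so the three species are readily distinguished within the scan ranges listed for the GC-MS methods.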
[0501] FIGS. 27A and 27B show comparisons between the sample
extraction gas chromatograph/mass spectrum and the 4-octanone
standard gas chromatograph/mass spectrum. These results demonstrate
that 4-octanone was produced from butyroin using the enzymes diol
dehydrogenase (ddh3) and a diol dehydratase (pduCDE). GC-MS
analysis of the incubated reaction mixture confirmed starting
material, intermediate and product, demonstrating that these
enzymes can be reappropriated for these specific substrates.
Example 8
Isolation and Biological Activity of Secondary Alcohol
Dehydrogenases
[0502] Substrates such as 4-octanone, 2,7-dimethyl-4-octanone,
cyclopentanone and corresponding alcohols were utilized to measure
the ability of secondary alcohol dehydrogenases (2ADHs) to catalyze
the reduction of large saturated ketones to secondary alcohols. An
example of a reaction catalyzed by secondary alcohol dehydrogenases
is illustrated below (reduction of 4-octanone to 4-octanol is
shown):
##STR00002##
[0503] All enzymes and reagents were purchased from New England
Biolabs and Sigma, respectively, unless otherwise stated.
[0504] Various secondary alcohol dehydrogenases (2ADHs) were
isolated from Pseudomonas putida KT2440, Pseudomonas fluorescens
Pf-5, and Klebsiella pneumoniae MGH 78578. All vectors were
transformed in BL21(DE3) competent cells and expression of the
genes encoding the proteins of interest was induced with IPTG (via
the T7 promoter). The cells were lysed, proteins were extracted and
then purified on Ni-NTA columns. Final protein concentration in the
Ni-NTA eluate was diluted to 0.15 mg/mL prior to assays.
[0505] NADH/NADPH consumption and production assays were performed
using a THERMOmax microplate reader in the kinetic mode, monitoring
the NAD(P)H absorbance peak at 340 nm until the reaction reached
equilibrium. In the assays summarized in Table 4, 2ADH-2, 2ADH-5,
2ADH-8, and 2ADH-10 were tested for their ability to either
catalyze the oxidation of 4-octanol or catalyze the reduction of
4-octanone. These reaction conditions are found in Table 4
below.
TABLE-US-00004 TABLE 4 Reaction Conditions for Various Enzyme Assays
Reaction Component | Final Concentration
NADH Production Assay (30.degree. C.)
2ADH enzyme | Approx. 0.058 .mu.g/.mu.L
4-octanol | 5.55 mM
NAD+ | Approx. 1.4 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 280 mM
NADH Consumption Assay (30.degree. C.)
2ADH enzyme | Approx. 0.075 .mu.g/.mu.L
4-octanone | 5.0 mM
NADH | Approx. 0.25 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 250 mM
NADPH Production Assay (30.degree. C.)
2ADH enzyme | Approx. 0.058 .mu.g/.mu.L
4-octanol | 5.55 mM
NADP+ | Approx. 1.4 .mu.g/.mu.L
Imidazole (from Elution Buffer) | Approx. 280 mM
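Table 4 gives cofactor amounts in mass units and substrate amounts in molar units. For comparison only, the Python sketch below converts the cofactor values to millimolar. It assumes free-acid molar masses of 663.43 g/mol for NAD+ and 665.44 g/mol for NADH; the text does not state which salt form was used, so the results are approximate. The function name is hypothetical.

```python
# Free-acid molar masses in g/mol. The salt form actually used is not
# stated in the text, so these are assumptions for illustration.
NAD_PLUS_G_PER_MOL = 663.43
NADH_G_PER_MOL = 665.44


def mass_conc_to_mm(ug_per_ul: float, molar_mass: float) -> float:
    """Convert ug/uL (numerically equal to mg/mL and g/L) to mM."""
    if ug_per_ul < 0 or molar_mass <= 0:
        raise ValueError("concentration must be >= 0 and molar mass > 0")
    return ug_per_ul / molar_mass * 1000.0


if __name__ == "__main__":
    print(round(mass_conc_to_mm(1.4, NAD_PLUS_G_PER_MOL), 2))   # NAD+ in Table 4
    print(round(mass_conc_to_mm(0.25, NADH_G_PER_MOL), 3))      # NADH in Table 4
```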
[0506] Further testing was performed, as described in Table 5
below, in which 2ADH-2, 2ADH-11, 2ADH-12, 2ADH-13, 2ADH-14,
2ADH-15, 2ADH-16, 2ADH-17, and 2ADH-18 were tested for their
ability to either catalyze the oxidation of 4-octanol,
2,7-dimethyl-4-octanol, or cyclopentanol, or catalyze the
reduction of 4-octanone, 2,7-dimethyl-4-octanone, or
cyclopentanone.
TABLE-US-00005 TABLE 5
Rxn Component | Final Concentration
Rxn Components for NADPH Consumption Assays (Reduction)
Substrate | 25 mM
Enzyme | 0.04 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
DMSO | 1.5% by volume
Total Volume | 200 .mu.L
Rxn Components for NAD(P)H Production Assays (Oxidation)
Substrate | 5 mM
Enzyme | 0.04 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
Rxn Components for NAD(P)H Production Assay using 2,7-dimethyl-4-octanone as a substrate
Substrate | 50 mM
Enzyme | 0.08 mg/mL
Nicotinamide cofactor | 0.25 mg/mL
Imidazole | 200 mM
Tris HCl | 14 mM
DMSO | 3% by volume
[0507] FIG. 30A shows the results from the NADH Production Assay of
Table 4, in which 2ADH-2 catalyzes the oxidation of 4-octanol in
the presence of NAD+, as measured by NADH production. FIG. 30B
shows the results of the NADPH Production Assay of Table 4, in
which 2ADH-5, 2ADH-8, and 2ADH-10 catalyze the oxidation of
4-octanol in the presence of NADP+, as measured by NADPH
production.
[0508] FIG. 31 shows the oxidation of 4-octanol by 2ADH-11 (FIG.
31A) and 2ADH-16 (FIG. 31B), as measured by NADH and NADPH
production, respectively.
[0509] FIG. 32 shows the oxidation of 2,7-dimethyloctanol by
2ADH-11 and others (FIG. 32A) and 2ADH-16 (FIG. 32B), as measured
by NADH and NADPH production, respectively.
[0510] FIG. 33A shows the reduction of 2,7-dimethyl octanol by
2ADH-11 and 2ADH-16 as monitored by NADPH consumption. FIG. 33B shows
the reduction activity of both 2ADH-11 and 2ADH-16 towards various
substrates. FIG. 34 shows the oxidation (FIG. 34A) and reduction
(FIG. 34B) of cyclopentanol by 2ADH-16.
[0511] Similar to above, kinetic testing for both oxidation and
reduction reactions was performed on various substrates using
2ADH-16. The conditions for these studies were as follows: 0.04
mg/mL enzyme, 0.25 mg/mL cofactor, 20 mM Tris HCl Buffer pH
6.5(red) or 9.0(ox), T=25 C, 100 uL total volume was used. The
calculated rate constants for the reduction reactions, along with
the structures of the substrates, are summarized in FIG. 35. The
calculated rate constants for the oxidation reactions, along with
the structures of the substrates, are summarized in FIG. 36. These
results show that 2ADH-16 is capable of catalyzing both the
oxidation and reduction of a wide variety of substrates.
Example 9
Isolation and In Vitro and In Vivo Activity of Coenzyme B12
Independent Diol Dehydratases
[0512] Substrates such as 1,2-propanediol, meso-2,3-butanediol, and
trans-1,2-cyclopentanediol were utilized to test both the in vitro
and in vivo biological activity of a B12 independent diol
dehydratase in a dehydration and reduction pathway. Diol
dehydratases catalyze the irreversible dehydration of diols, such
as 1,2-propanediol.
[0513] For in vitro activity, E. coli BL21(DE3) harboring pETPduCDE
(diol dehydratase subunits) was inoculated into 100 mL LB media,
grown to OD.sub.600=0.7, induced with 0.15 mM IPTG, and
incubated for 22 hours at 22.degree. C. The cells were lysed and
proteins of interest were purified on a Ni-NTA spin column.
Purification of all three dehydratase subunits was accomplished by
adding 5 mM 1,2-propanediol to the lysis and wash buffers. The
Ni-NTA purification yielded approximately 660 .mu.L of protein
mixture at a concentration of 2.2 mg/mL. Protein concentration
assays were conducted using a Bradford reagent protocol.
[0514] The purified PduCDE was used to set up in vitro diol
dehydratase reactions. Three assays were conducted with
1,2-propanediol and meso-2,3-butanediol. Control reactions were
also set up with elution buffer added in place of purified PduCDE.
In vitro reactions were conducted under semi-anaerobic conditions
in 2 mL screw cap glass vials. Reaction components and
concentrations are given in Table 6.
TABLE-US-00006 TABLE 6 Reaction conditions for B.sub.12 dependent DDOH in vitro assay
Rxn Component | Concentration
Diol substrate | 10 mM
Adenosylcobalamin (B.sub.12) | 100 .mu.g/mL
KCl | 10 mM
dOH1 enzyme mix | 0.08 mg/mL
Reaction Buffer | 10 mM Tris HCl pH 7.5
[0515] After 48 hours, 1 mL of the reaction mixture was extracted
with 0.5 mL of either ethyl acetate or hexanol and analyzed by
GCMS.
[0516] The following GCMS protocol was used for all
experiments:
[0517] 1 .mu.L injection w/50:1 split
[0518] Inlet temperature--250.degree. C.
[0519] Initial oven temperature--50.degree. C.
[0520] Temperature Ramp 1--10.degree. C./min to 125.degree. C.
[0521] Temperature Ramp 2--30.degree. C./min to 300.degree. C.
[0522] Final Temperature 300.degree. C.--1 minute
[0523] GC to MS transfer temp--250.degree. C.
[0524] MS detection--full scan MW 40-260
[0525] The results are shown in FIG. 45. FIG. 45A confirms the
formation of 1-propanal from 1,2-propanediol, and FIG. 45B confirms
the formation of 2-butanone from meso-2,3-butanediol, both of which
were catalyzed by B12 independent diol dehydratase.
[0526] For in vivo activity, the pBBRDhaB1/2 plasmid was
constructed as follows: the DNA sequence encoding B12-independent
glycerol dehydratase (dhaB1) and activator (dhaB2) of Clostridium
butyricum was amplified by polymerase chain reaction (PCR):
98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and 72.degree.
C. for 2 min for dhaB1 and 1 min for dhaB2, repeated 30 times. The
reaction mixture contained 1.times. Phusion buffer (NEB), 2 mM
dNTP, 0.5 .mu.M forward primers
(5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3' (SEQ ID
NO:318) for dhaB1 and
5'-ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG-3' (SEQ ID
NO:319) for dhaB2) and reverse primers
(5'-TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC-3' (SEQ ID
NO:320) for dhaB1 and 5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3' (SEQ
ID NO:321) for dhaB2), 1U Phusion High Fidelity DNA polymerase
(NEB), and 50 ng pETdhaB1 and pETdhaB2, respectively, in 50 .mu.l.
Amplified fragments were gel purified and spliced by another round
of PCR: 98.degree. C. for 10 sec, 60.degree. C. for 15 sec, and
72.degree. C. for 2.5 min, repeated 30 times. The reaction mixture
contained 1.times. Phusion buffer (NEB), 2 mM dNTP, 0.5 .mu.M
forward (5'-CCGCTCGAGGAGGATATATATATGATTTCTAAAGGCTTTAGCACCC-3') (SEQ
ID NO:322) and reverse primers
(5'-CGAGCTCTTATTCGGCGCCAATGGTGCACGGG-3') (SEQ ID NO:323), 1U
Phusion High Fidelity DNA polymerase (NEB), and 50 ng each fragment
in 50 .mu.l. The amplified DNA fragment was digested with XhoI and SacI
and ligated into pBBR1MCS-2 pre-digested with the same restriction
enzymes.
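As a non-limiting illustration, the overlap that allows the dhaB1 and dhaB2 fragments to be spliced can be found by comparing the 3' end of the first fragment, given by the reverse complement of its reverse primer (SEQ ID NO:320), with the 5' end of the second fragment, given by its forward primer (SEQ ID NO:319). The Python sketch below is not part of the described method, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splice_overlap(upstream_reverse_primer: str, downstream_forward_primer: str) -> str:
    """Longest suffix of the upstream fragment's top strand that is also a
    prefix of the downstream fragment's top strand, judged from primers only."""
    upstream_top = reverse_complement(upstream_reverse_primer)
    downstream_top = downstream_forward_primer.upper()
    for length in range(min(len(upstream_top), len(downstream_top)), 0, -1):
        if upstream_top[-length:] == downstream_top[:length]:
            return upstream_top[-length:]
    return ""


# Primers copied from the text above.
DHAB1_REV = "TCTTTGCTCATATATATATCCTCCTCTAGATTACATCACGTGTTCAGTAC"  # SEQ ID NO:320
DHAB2_FWD = "ACGTGATGTAATCTAGAGGAGGATATATATATGAGCAAAGAAATTAAAGG"  # SEQ ID NO:319

if __name__ == "__main__":
    overlap = splice_overlap(DHAB1_REV, DHAB2_FWD)
    print(len(overlap), overlap)
```

An empty result would mean the two fragments share no terminal homology and could not be joined in the second round of PCR.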
[0527] Two strains of E. coli DH10B, harboring pBBR1MCS-2 or
pBBRDhaB1/2, were inoculated into TB media without glycerol.
Cultures were grown to OD.sub.600=0.5 and the substrates
1,2-propanediol, meso-2,3-butanediol, and
trans-1,2-cyclopentanediol were added to separate cultures to a
concentration of 10 mM. 5 ug/ml of co-enzyme S-adenosylmethionine
was added before the cultures were transferred to an anaerobic
environment. The cultures were incubated at 37.degree. C. for 48 hrs.
[0528] After 48 hours, 1 mL of culture was extracted with 0.5 mL of
ethyl acetate or hexanol and analyzed by GCMS, as described above.
The results are shown in FIG. 46. FIG. 46A shows the in vivo
production of 1-propanol from 1,2-propanediol. FIG. 46B shows the
in vivo production of 2-butanol from meso-2,3-butanediol. FIG. 46C
shows the in vivo production of cyclopentanone from
trans-1,2-cyclopentanediol.
Example 10
Identification of Secreted Alginate Lyase and Genomic Regions
Sufficient for Growth on Alginate as a Sole Source of Carbon
[0529] To identify secreted or external alginate lyases, and to
identify genomic regions from Vibrio splendidus that are sufficient
to confer growth in alginate as a sole source of carbon, the
following clones were made using the Gateway system from Invitrogen
(Carlsbad, Calif.). First, entry vectors were made by TOPO cloning
PCR fragments into pENTR/D/TOPO. PCR fragments were generated using
Vibrio splendidus 12B01 genomic DNA as a template and amplified with
the following primer pairs:
[0530] Vs24214-24249: genomic region corresponding to gene id
between V12B01.sub.--24214 and V12B01.sub.--24249 (see Example
1).
TABLE-US-00007 TABLE 7
24214 F | cacc caagcgatagtttatatagcgt | (SEQ ID NO: 324)
24249 R | gaaatgaacggatattacgt | (SEQ ID NO: 325)
[0531] Vs24189-24209: genomic region corresponding to gene id
between V12B01.sub.--24189 and V12B01.sub.--24209 (see Example
1).
TABLE-US-00008 TABLE 8
24189 R | cggaacaggtgattgtggt | (SEQ ID NO: 326)
24209 F | cacc gcccacttcaagatgaagctgt | (SEQ ID NO: 327)
[0532] Vs24214-24239: genomic region corresponding to gene id
between V12B01.sub.--24214 and V12B01.sub.--24239 (see Example
1).
TABLE-US-00009 TABLE 9
24214 F | cacc caagcgatagtttatatagcgt | (SEQ ID NO: 328)
24239 R_1 | gtggctaagtacatgccggt | (SEQ ID NO: 329)
[0533] The entry vectors were recombined with the destination
vector pET-DEST42 (Invitrogen) using the LR recombinase enzyme
(Invitrogen). These destination vectors were then put into
electro-competent DH10B or BL21 cells.
[0534] The alginate lyase clones were then made by digesting (using
enzymes Nde I and Bam HI) the PCR products that were generated
using Vibrio splendidus 12B01 genomic DNA as a template and
amplified with the following primer pairs:
TABLE-US-00010 TABLE 10
24214 ndeF | GGAATTC CAT atgacaaagaatatgacgactaaac | (SEQ ID NO: 330) | forward primer for V12B01_24214
24214 bamR | CG GGATCC ttattatttcccctgccctgcagt | (SEQ ID NO: 331) | reverse primer for V12B01_24214
24219 ndeF | GGAATTC CAT atgagctatcaaccacttttac | (SEQ ID NO: 332) | forward primer for V12B01_24219
24219 bamR | CG GGATCC ttacagttgagcaaatgatcc | (SEQ ID NO: 333) | reverse primer for V12B01_24219
[0535] The digested PCR products were then ligated into cut pET28
vector. Certain of the cloned genomic regions of Vibrio splendidus
12B01 were tested for the presence of secreted alginate lyases, and
the above-described constructs were tested in various combinations
for the ability to confer growth on alginate as a sole source of
carbon.
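The restriction-site design of the Table 10 primers can be checked with a short script. The sketch below, offered only as an illustration, locates the NdeI recognition sequence (CATATG) in each forward primer and the BamHI recognition sequence (GGATCC) in each reverse primer. The primer sequences are copied from Table 10 with the spacing removed; the helper names are assumptions introduced for this example.

```python
NDEI = "CATATG"   # NdeI recognition sequence
BAMHI = "GGATCC"  # BamHI recognition sequence

# Primers from Table 10 (spaces removed; case kept as in the table).
PRIMERS = {
    "24214 ndeF": "GGAATTCCATatgacaaagaatatgacgactaaac",
    "24214 bamR": "CGGGATCCttattatttcccctgccctgcagt",
    "24219 ndeF": "GGAATTCCATatgagctatcaaccacttttac",
    "24219 bamR": "CGGGATCCttacagttgagcaaatgatcc",
}


def find_site(primer, site):
    """Return the 0-based index of the first site match, or -1 if absent."""
    return primer.upper().find(site)


def check(primers):
    """Map each primer name to the position of its expected site.

    Forward primers (names ending in 'ndeF') are searched for NdeI,
    all other primers for BamHI.
    """
    report = {}
    for name, seq in primers.items():
        site = NDEI if name.endswith("ndeF") else BAMHI
        report[name] = find_site(seq, site)
    return report


if __name__ == "__main__":
    for name, pos in check(PRIMERS).items():
        print(name, pos)
```

In the forward primers the NdeI site is completed by the gene's own atg start codon (CAT followed by atg), so the site spans the junction between the uppercase tail and the lowercase gene-specific sequence.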
[0536] The Vs24254 (SEQ ID NO: 32) region of Vibrio splendidus
encodes a functional external alginate lyase. BL21 cells expressing
Vs24254 from the pET28 vector were capable of breaking down
alginate in the growth medium. When grown on LB+2% alginate+0.1 mM
isopropyl β-D-1-thiogalactopyranoside (IPTG), only cells
expressing the Vs24254 gene gave a positive TBA assay result,
indicated by pink color. This assay was performed by spinning down
an overnight culture grown on the above-mentioned medium. The
medium (supernatant) was then mixed in a 1:1 ratio with 0.8%
thiobarbituric acid (TBA), heated for 10 min at 99 degrees Celsius,
and assayed for pink coloration.
FIG. 47 shows the results of this assay. The left tube in FIG. 47
contains medium taken from an overnight culture of cells
expressing Vs24254, while the right-hand tube shows the TBA
reaction using medium from cells expressing Vs24259 (negative
control). The lack of pink coloration in the negative control
indicates that little or no cleavage of the alginate polymer
occurred. Wild-type E. coli cells not expressing any recombinant
proteins showed the same coloration as the negative control Vs24259
(data not shown).
[0537] To test the ability of recombinant E. coli to grow on
alginate as a sole source of carbon, transformed cells were grown
for 19 hours at 30 degrees Celsius with mild shaking in a 96-well
plate. Each well held 222 µl of minimal medium (see growth
conditions for an explanation of minimal medium) containing 0.66%
carbon source in the form of either degraded alginate or glucose
(positive control for growth). The cells were BL21 carrying no
plasmid (BL21, negative control), one plasmid (Da or 3a), or two
plasmids (Dk3a and Da3k). The plasmid backbone is indicated by the
lowercase letter: "a" refers to the plasmid backbone pET-DEST42 and
"k" refers to the pENTR/D/TOPO backbone. "D" indicates that the
plasmid contains the genomic region Vs24214-24249, while "3"
indicates that the plasmid contains the genomic region
Vs24189-24209. Thus, Da is pET-DEST42-Vs24214-24249, Da3k is
pET-DEST42-Vs24214-24249 together with pENTR/D/TOPO-Vs24189-24209,
and so on.
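The strain shorthand defined in this paragraph can be expanded mechanically. The following minimal sketch decodes a label such as Da3k into the full plasmid names; the two lookup tables restate the definitions given above, while the function name and the error handling are assumptions made for this illustration.

```python
import re

# Lowercase letter: plasmid backbone, as defined in the text.
BACKBONES = {"a": "pET-DEST42", "k": "pENTR/D/TOPO"}
# Uppercase letter or digit: cloned genomic region, as defined in the text.
INSERTS = {"D": "Vs24214-24249", "3": "Vs24189-24209"}


def decode(label):
    """Expand a strain label into a list of full plasmid names.

    'BL21' denotes the plasmid-free negative control and yields an
    empty list. Any other label must consist only of insert/backbone
    pairs such as 'Da' or '3k'.
    """
    if label == "BL21":
        return []
    pairs = re.findall(r"([D3])([ak])", label)
    # Reject labels containing characters outside the defined scheme.
    if not pairs or "".join(i + b for i, b in pairs) != label:
        raise ValueError("unrecognized strain label: " + label)
    return [BACKBONES[b] + "-" + INSERTS[i] for i, b in pairs]


if __name__ == "__main__":
    for label in ("BL21", "Da", "3a", "Dk3a", "Da3k"):
        print(label, decode(label))
```

For example, decode("Da3k") returns pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, in agreement with the expansion given in the paragraph above.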
[0538] As shown in FIG. 56A, the two vector constructs
pET-DEST42-Vs24214-24249 and pENTR/D/TOPO-Vs24189-24209, when
combined in E. coli, confer growth on degraded alginate as the sole
carbon source. The same result was observed when these genomic
inserts were switched into the opposite vectors
(pET-DEST42-Vs24189-24209 and pENTR/D/TOPO-Vs24214-24249). FIG. 56B
shows growth on glucose as a positive control. Thus, the combined
genomic regions of Vs24214-24249 and Vs24189-24209 from Vibrio
splendidus were sufficient to confer on E. coli the ability to
grow on alginate as a sole source of carbon.
Example 11
Production of Ethanol from Alginate
[0539] The ability of recombinant E. coli to produce ethanol by
growing on alginate as a source of carbon was tested. To generate
recombinant E. coli, DNA sequences encoding pyruvate decarboxylase
(pdc) and two alcohol dehydrogenases (adhA and adhB) of Zymomonas
mobilis were amplified by polymerase chain reaction (PCR). These
amplified fragments were gel purified and spliced together by
another round of PCR. The final amplified DNA fragment was digested
with BamHI and XbaI and ligated into pBBR1MCS-2 pre-digested with
the same restriction enzymes. The resulting plasmid is referred to
as pBBRPdc-AdhA/B.
[0540] E. coli was transformed with either pBBRPdc-AdhA/B or
pBBRPdc-AdhA/B+1.5 FOS (fosmid clone containing the genomic region
between V12B01_24189 and V12B01_24249; these sequences
confer on E. coli the ability to use alginate as a sole source of
carbon, see Examples 1 and 10), grown in M9 medium containing
alginate, and tested for the production of ethanol. The results are
shown in FIG. 57, which demonstrates that the strain harboring
pBBRPdc-AdhA/B+1.5 FOS showed significantly higher ethanol
production when growing on alginate. These results indicate that
the strain harboring pBBRPdc-AdhA/B+1.5 FOS was able to utilize
alginate as a source of carbon in the production of ethanol.
[0541] The various embodiments described above can be combined to
provide further embodiments. All of the U.S. patents, U.S. patent
application publications, U.S. patent applications, foreign
patents, foreign patent applications and non-patent publications
referred to in this specification and/or listed in the Application
Data Sheet, are incorporated herein by reference, in their
entirety. Aspects of the embodiments can be modified, if necessary,
to employ concepts of the various patents, applications and
publications to provide yet further embodiments.
[0542] These and other changes can be made to the embodiments in
light of the above-detailed description. In general, in the
following claims, the terms used should not be construed to limit
the claims to the specific embodiments disclosed in the
specification and the claims, but should be construed to include
all possible embodiments along with the full scope of equivalents
to which such claims are entitled. Accordingly, the claims are not
limited by the disclosure.
[0543] The following publications are herein incorporated by
reference in their entirety.
[0544] 1. T. Y. Wong, L. A. Preston, N. L. Schiller, Annu Rev Microbiol 54, 289 (2000).
[0545] 2. W. Hashimoto, O. Miyake, A. Ochiai, K. Murata, J Biosci Bioeng 99, 48 (January, 2005).
[0546] 3. M. Yamasaki, K. Ogura, W. Hashimoto, B. Mikami, K. Murata, J Mol Biol 352, 11 (Sep. 9, 2005).
[0547] 4. M. Yamasaki et al., Acta Crystallogr Sect F Struct Biol Cryst Commun 61, 288 (Mar. 1, 2005).
[0548] 5. O. Miyake, A. Ochiai, W. Hashimoto, K. Murata, J Bacteriol 186, 2891 (May, 2004).
[0549] 6. O. Miyake, W. Hashimoto, K. Murata, Protein Expr Purif 29, 33 (May, 2003).
[0550] 7. H. J. Yoon, B. Mikami, W. Hashimoto, K. Murata, J Mol Biol 290, 505 (Jul. 9, 1999).
[0551] 8. H. J. Yoon, W. Hashimoto, O. Miyake, K. Murata, B. Mikami, J Mol Biol 307, 9 (Mar. 16, 2001).
[0552] 9. W. Hashimoto, O. Miyake, K. Momma, S. Kawai, K. Murata, J Bacteriol 182, 4572 (August, 2000).
[0553] 10. H. J. Yoon et al., Protein Expr Purif 19, 84 (June, 2000).
[0554] 11. T. Osawa, Y. Matsubara, T. Muramatsu, M. Kimura, Y. Kakuta, J Mol Biol 345, 1111 (Feb. 4, 2005).
[0555] 12. A. Ochiai, W. Hashimoto, K. Murata, Res Microbiol 157, 642 (September, 2006).
[0556] 13. F. J. Mergulhao, D. K. Summers, G. A. Monteiro, Biotechnol Adv 23, 177 (May, 2005).
[0557] 14. J. H. Choi, S. Y. Lee, Appl Microbiol Biotechnol 64, 625 (June, 2004).
[0558] 15. M. P. DeLisa, D. Tullman, G. Georgiou, Proc Natl Acad Sci USA 100, 6115 (May 13, 2003).
[0559] 16. N. Blaudeck, G. A. Sprenger, R. Freudl, T. Wiegert, J Bacteriol 183, 604 (January, 2001).
[0560] 17. N. Pradel et al., Biochem Biophys Res Commun 306, 786 (Jul. 4, 2003).
[0561] 18. L. Masip et al., Science 303, 1185 (Feb. 20, 2004).
[0562] 19. C. M. Barrett, N. Ray, J. D. Thomas, C. Robinson, A. Bolhuis, Biochem Biophys Res Commun 304, 279 (May 2, 2003).
[0563] 20. R. Binet, S. Letoffe, J. M. Ghigo, P. Delepelaire, C. Wandersman, Folia Microbiol (Praha) 42, 179 (1997).
[0564] 21. I. Gentschev, G. Dietrich, W. Goebel, Trends Microbiol 10, 39 (January, 2002).
[0565] 22. V. Koronakis, FEBS Lett 555, 66 (Nov. 27, 2003).
[0566] 23. J. Jose, Appl Microbiol Biotechnol 69, 607 (February, 2006).
[0567] 24. J. Jose, D. Betscheider, D. Zangen, Anal Biochem 346, 258 (Nov. 15, 2005).
[0568] 25. M. Ashiuchi, H. Misono, Appl Microbiol Biotechnol 59, 9 (June, 2002).
[0569] 26. J. Narita et al., Appl Microbiol Biotechnol 70, 564 (May, 2006).
[0570] 27. Y. Aso et al., Nat Biotechnol 24, 188 (February, 2006).
[0571] 28. W. Hashimoto et al., Biosci Biotechnol Biochem 69, 673 (April, 2005).
[0572] 29. A. E. Lagarde, F. R. Stoeber, J Bacteriol 129, 606 (February, 1977).
[0573] 30. M. A. Mandrand-Berthelot, P. Ritzenthaler, M. Mata-Gilsinger, J Bacteriol 160, 600 (November, 1984).
[0574] 31. J. Pouyssegur, F. Stoeber, J Bacteriol 117, 641 (February, 1974).
[0575] 32. J. Preiss, G. Ashwell, J Biol Chem 237, 309 (February, 1962).
[0576] 33. J. Preiss, G. Ashwell, J Biol Chem 237, 317 (February, 1962).
[0577] 34. G. M. Bird, P. Haas, Biochemical Journal 25, 403 (1931).
[0578] 35. L. H. Cretcher, W. L. Nelson, Science 67, 537 (May 25, 1928).
[0579] 36. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 51, 1914 (1929).
[0580] 37. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 52, 2130 (1930).
[0581] 38. W. L. Nelson, L. H. Cretcher, Journal of the American Chemical Society 54, 3409 (1932).
[0582] 39. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 95, 213 (1932).
[0583] 40. E. Schoeffel, K. P. Link, Journal of Biological Chemistry 100, 397 (1933).
[0584] 41. H. A. Spoehr, Archive of Biochemistry 14, 153 (1947).
[0585] 42. J. J. Farmer, 3rd, R. G. Eagon, J Bacteriol 97, 97 (January, 1969).
[0586] 43. R. L. Anderson, D. P. Allison, J Biol Chem 240, 2367 (June, 1965).
[0587] 44. W. J. Lennarz, R. J. Light, K. Bloch, Proc Natl Acad Sci USA 48, 840 (May, 1962).
[0588] 45. S. A. Graham, Crit Rev Food Sci Nutr 28, 139 (1989).
[0589] 46. E. Wiberg, P. Edwards, J. Byrne, S. Stymne, K. Dehesh, Planta 212, 33 (December, 2000).
[0590] 47. L. Yuan, T. A. Voelker, D. J. Hawkins, Proc Natl Acad Sci USA 92, 10639 (Nov. 7, 1995).
[0591] 48. K. Dehesh, A. Jones, D. S. Knutzon, T. A. Voelker, Plant J 9, 167 (February, 1996).
[0592] 49. K. Dehesh, P. Edwards, T. Hayes, A. M. Cranmer, J. Fillatti, Plant Physiol 110, 203 (January, 1996).
[0593] 50. K. M. Mayer, J. Shanklin, BMC Plant Biol 7, 1 (2007).
[0594] 51. J. K. Jha et al., Plant Physiol Biochem 44, 645 (November-December, 2006).
[0595] 52. B. S. Schutt, M. Brummel, R. Schuch, F. Spener, Planta 205, 263 (June, 1998).
[0596] 53. K. Dehesh, P. Edwards, J. Fillatti, M. Slabaugh, J. Byrne, Plant J 15, 383 (August, 1998).
[0597] 54. J. M. Leonard, S. J. Knapp, M. B. Slabaugh, Plant J 13, 621 (March, 1998).
[0598] 55. M. Vedadi, R. Szittner, L. Smillie, E. Meighen, Biochemistry 34, 16725 (Dec. 26, 1995).
[0599] 56. M. O. Park, J Bacteriol 187, 1426 (February, 2005).
[0600] 57. M. O. Park, K. Heguri, K. Hirata, K. Miyamoto, J Appl Microbiol 98, 324 (2005).
[0601] 58. M. O. Park, M. Tanabe, K. Hirata, K. Miyamoto, Appl Microbiol Biotechnol 56, 448 (August, 2001).
[0602] 59. M. Morikawa, T. Iwasa, S. Yanagida, T. Imanaka, Journal of Fermentation and Bioengineering 85, 243 (1998).
[0603] 60. M. Dennis, P. E. Kolattukudy, Proc Natl Acad Sci USA 89, 5306 (Jun. 15, 1992).
[0604] 61. T. M. Cheesbrough, P. E. Kolattukudy, J Biol Chem 263, 2738 (Feb. 25, 1988).
[0605] 62. M. C. Chang, R. A. Eachus, W. Trieu, D. K. Ro, J. D. Keasling, Nat Chem Biol 3, 274 (May, 2007).
[0606] 63. R. J. Porra, B. D. Ross, Biochem J 94, 557 (March, 1965).
[0607] 64. X. Chen, W. Guo, L. Zhao, Q. Fu, Y. Ma, J Phys Chem A 111, 3566 (May 10, 2007).
[0608] 65. L. Zhao, W. Guo, R. Zhang, S. Wu, X. Lu, Chemphyschem 7, 1345 (Jun. 12, 2006).
[0609] 66. L. Zhao, R. Zhang, W. Guo, S. Wu, X. Lu, Chemical Physics Letters 414, 28 (2005).
[0610] 67. G. Gorgen, W. Boland, Eur J Biochem 185, 237 (Nov. 6, 1989).
[0611] 68. P. Ney, W. Boland, Eur J Biochem 162, 203 (Jan. 2, 1987).
[0612] 69. Z. L. Boynton, G. N. Bennett, F. B. Rudolph, Appl Environ Microbiol 62, 2758 (August, 1996).
[0613] 70. R. T. Yan, J. S. Chen, Appl Environ Microbiol 56, 2591 (September, 1990).
[0614] 71. R. V. Nair, G. N. Bennett, E. T. Papoutsakis, J Bacteriol 176, 871 (February, 1994).
[0615] 72. D. P. Wiesenborn, F. B. Rudolph, E. T. Papoutsakis, Appl Environ Microbiol 55, 317 (February, 1989).
[0616] 73. D. K. Thompson, J. S. Chen, Appl Environ Microbiol 56, 607 (March, 1990).
[0617] 74. M. G. Hartmanis, J Biol Chem 262, 617 (Jan. 15, 1987).
[0618] 75. K. X. Huang, S. Huang, F. B. Rudolph, G. N. Bennett, J Mol Microbiol Biotechnol 2, 33 (January, 2000).
[0619] 76. L. Fontaine et al., J Bacteriol 184, 821 (February, 2002).
[0620] 77. B. McMahon, M. E. Gallagher, S. G. Mayhew, FEMS Microbiol Lett 250, 121 (Sep. 1, 2005).
[0621] 78. M. Li, S. Yao, S. K., Microbial Biotechnology 23, 573 (2007).
[0622] 79. T. B. Causey, S. Zhou, K. T. Shanmugam, L. O. Ingram, Proc Natl Acad Sci USA 100, 825 (Feb. 4, 2003).
[0623] 80. D. E. Chang, S. Shin, J. S. Rhee, J. G. Pan, J Bacteriol 181, 6656 (November, 1999).
[0624] 81. C. R. Dittrich, R. V. Vadali, G. N. Bennett, K. Y. San, Biotechnol Prog 21, 627 (March-April, 2005).
[0625] 82. H. Lin, N. M. Castro, G. N. Bennett, K. Y. San, Appl Microbiol Biotechnol 71, 870 (August, 2006).
[0626] 83. U. Schorken, G. A. Sprenger, Biochim Biophys Acta 1385, 229 (Jun. 29, 1998).
[0627] 84. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymatic 6, 145 (1999).
[0628] 85. G. A. Sprenger, M. Pohl, Journal of Molecular Catalysis B: Enzymic 6, 145 (1999).
[0629] 86. B. Gonzalez, R. Vicuna, J Bacteriol 171, 2401 (May, 1989).
[0630] 87. P. Hinrichsen, I. Gomez, R. Vicuna, Gene 144, 137 (Jun. 24, 1994).
[0631] 88. E. Janzen et al., Bioorg Chem 34, 345 (December, 2006).
[0632] 89. M. M. Kneen, I. D. Pogozheva, G. L. Kenyon, M. J. McLeish, Biochim Biophys Acta 1753, 263 (Dec. 1, 2005).
[0633] 90. K. Yamada-Onodera, A. Nakajima, Y. Tani, J Biosci Bioeng 102, 545 (December, 2006).
[0634] 91. K. Yamada-Onodera, M. Fukui, Y. Tani, J Biosci Bioeng 103, 174 (February, 2007).
[0635] 92. T. Tobimatsu, M. Azuma, S. Hayashi, K. Nishimoto, T. Toraya, Biosci Biotechnol Biochem 62, 1774 (September, 1998).
[0636] 93. T. Tobimatsu et al., J Biol Chem 271, 22352 (Sep. 13, 1996).
[0637] 94. T. Toraya, T. Shirakashi, T. Kosuga, S. Fukui, Biochem Biophys Res Commun 69, 475 (Mar. 22, 1976).
[0638] 95. M. Yamanishi et al., Eur J Biochem 269, 4484 (September, 2002).
[0639] 96. J. R. O'Brien et al., Biochemistry 43, 4635 (Apr. 27, 2004).
[0640] 97. C. Raynaud, P. Sarcabal, I. Meynial-Salles, C. Croux, P. Soucaille, Proc Natl Acad Sci USA 100, 5010 (Apr. 29, 2003).
[0641] 98. B. Ludwig, A. Akundi, K. Kendall, Appl Environ Microbiol 61, 3729 (October, 1995).
[0642] 99. S. X. Xie, J. Ogawa, S. Shimizu, Biosci Biotechnol Biochem 63, 1721 (October, 1999).
[0643] 100. T. Zelinski, J. Peters, M. R. Kula, J Biotechnol 33, 283 (Apr. 15, 1994).
[0644] 101. M. C. Hunt, A. Rautanen, M. A. Westin, L. T. Svensson, S. E. Alexson, Faseb J 20, 1855 (September, 2006).
[0645] 102. M. A. Westin, S. E. Alexson, M. C. Hunt, J Biol Chem 279, 21841 (May 21, 2004).
[0646] 103. M. A. Westin, M. C. Hunt, S. E. Alexson, J Biol Chem 280, 38125 (Nov. 18, 2005).
[0647] 104. H. Iwaki, Y. Hasegawa, S. Wang, M. M. Kayser, P. C. Lau, Appl Environ Microbiol 68, 5671 (November, 2002).
Sequence Listing
Number of SEQ ID NOs: 333
SEQ ID NO: 1; length: 12066; type: DNA; organism: Vibrio splendidus
Sequence 1:
ggggacaagt ttgtacaaaa aagcaggctt
gacgcttatc acatttagta gaagcttatg 60tggaggcgat tggctttttt ttcaaggaag
attacaaaat agctcaggta atgccgattt 120atagatttgc tatgatatag
ttcaggatct tatgctttta ataagcagga acagaattta 180tgaacaaaaa
agctgatagt ttagtaggtt acagctttat tcgttataga aagggttagg
240gaacgtgaac tttttagagc tcaaacttcg catggataac tctccggtgc
tgagccgatt 300tttagagaat ggatttttac tccagcagaa actgagcctt
gttctttgtt gtgtgttgat 360cgcagcttct gcatggattt taggacagct
tgcatggttt attgaacctg ctgagcaaac 420cgtcgtgcca tggacagcaa
cggcttcctc gtcttcaacg cctcaatcga ctcttgatat 480ctcttctttg
cagcagagca acatgtttgg tgcttataac ccaaccacgc ctgctgtggt
540tgagcagcaa gttatccaag atgcgccaaa gacgcgactg aacctcgttt
tagtgggtgc 600agtagccagt tctaatccaa agctgagctt ggctgtgatt
gccaatcgcg gcacacaagc 660aacctacggc attaatgaag agatcgaagg
tacgcgagct aagttaaaag cggtattagt 720cgatcgcgtg attattgata
actcaggtcg agacgaaacc ttgatgcttg aaggcattga 780gtacaagcgt
ttgtctgtat cagcacctgc gccacctcgt acctcttctt ctgtgcgtgg
840caacaaccca gcttctgcag aagagaagct agatgaaatt aaagcgaaga
taatgaaaga 900tccgcaacaa atcttccaat atgttcgact gtctcaggtg
aaacgcgacg ataaagtgat 960tggttatcgt gtgagccctg gcaaagattc
agaacttttt aactctgttg ggctccaaaa 1020cggagatatt gccactcagt
taaatggaca agacctgaca gaccctgctg ctatgggcaa 1080catattccgt
tctatctcag agctgacaga gctaaacctc gtcgtcgaga gagatggtca
1140acaacatgaa gtgtttattg aattttagaa ctttgcgtct aacgaaggac
gaaagtgtag 1200gagaagtacg tgaagcattg gtttaagaaa agtgcatggt
tattggcagg aagcttaatc 1260tgcacacccg cagccatcgc gagtgatttt
agtgccagct ttaaaggcac tgatattcaa 1320gagtttatta atattgttgg
tcgtaaccta gagaagacga tcatcgttga cccttcggtg 1380cgcggaaaaa
tcgatgtacg cagctacgac gtactcaatg aagagcaata ctacagcttc
1440ttcctaaacg tattggaagt gtatggctac gcggttgtcg aaatggactc
gggtgttctt 1500aagatcatca aggccaaaga ttcgaaaaca tcggcaattc
cagtcgttgg agacagtgac 1560acgatcaaag gcgacaatgt ggtgacacgt
gttgtgacgg ttcgtaatgt ctcggtgcgt 1620gaactttctc ctctgcttcg
tcaactaaac gacaatgcag gcgcgggtaa cgttgtgcac 1680tacgacccag
ccaacatcat ccttattaca ggccgagcgg cggtagtaaa ccgtttagct
1740gaaatcatca agcgtgttga ccaagcgggt gataaagaga ttgaagtcgt
tgagctaaag 1800aatgcttctg cggcagaaat ggtacgtatc gttgatgcgt
taagcaaaac cactgatgcg 1860aaaaacacac ctgcatttct acaacctaaa
ttagttgccg atgaacgtac caatgcgatt 1920cttatctcag gcgaccctaa
agtacgtagc cgtttaagaa ggctgattga acagcttgat 1980gttgaaatgg
caaccaaggg caataaccaa gttatttacc ttaaatatgc aaaagccgaa
2040gatctagttg atgtgctgaa aggcgtgtcg gacaacctac aatcagagaa
gcagacatca 2100accaaaggaa gttcatcgca gcgtaaccaa gtgatgatct
cagctcacag tgacaccaac 2160tctttagtga ttaccgcaca gccggacatc
atgaatgcgc ttcaagatgt gatcgcacag 2220ctggatattc gtcgtgctca
agtattgatt gaagcactga ttgtcgaaat ggccgaaggt 2280gacggcgtta
accttggtgt gcagtggggt aaccttgaaa cgggtgccat gattcagtac
2340agcaacactg gcgcttccat tggcggtgtg atggttggtt tagaagaagc
gaaagacagc 2400gaaacgacaa ccgctgttta tgattcagac ggtaaattct
tacgtaatga aaccacgacg 2460gaagaaggtg actattcaac attagcttcc
gcactttctg gtgttaatgg tgcggcaatg 2520agtgtggtaa tgggtgactg
gaccgccttg atcagtgcag tagcgaccga ttcaaattca 2580aatatcctat
cttctccaag tatcaccgtg atggataacg gcgaagcgtc attcattgtg
2640ggtgaagagg tgcctgttct aaccggttct acagcaggct caagtaacga
caacccattc 2700caaacagttg aacgtaaaga agtgggtatc aagcttaaag
tggtgccgca aatcaatgaa 2760ggtgattcgg ttcaactgca aatagaacaa
gaagtatcga acgtattagg cgccaatggt 2820gcggttgatg tgcgttttgc
taagcgacag ctaaatacat cagtgattgt tcaagacggt 2880caaatgctgg
tgttgggtgg cttgattgac gagcgagcat tggaaagtga atctaaggtg
2940ccgttcttgg gagatattcc tgtgcttgga cacttgttca aatcaaccag
tactcaggtt 3000gagaaaaaga acctaatggt cttcatcaaa ccaaccatta
ttcgtgatgg tatgacagcc 3060gatggtatca cgcagcgtaa atacaacttc
atccgtgctg agcagttgta caaggctgag 3120caaggactga agttaatggc
agacgataac atcccagtat tgcctaaatt tggtgccgac 3180atgaatcacc
cggctgaaat tcaagccttc atcgatcaaa tggaacaaga ataatggctg
3240aattggtagg ggcggcacgt acttatcagc gcttgccgtt tagctttgcg
aatcgctaca 3300agatggtgtt ggaataccaa catccagagc gcgcaccgat
actttattat gttgagccac 3360tgaaatcggc ggcgatcatt gaagtgagtc
gtgttgtgaa aaatggtttc acgccacaag 3420cgattactct cgatgagttt
gataaaaaac taaccgatgc ttatcagcgt gactcgtcag 3480aagctcgtca
gctcatggaa gacattggtg ctgatagtga tgatttcttc tcactagcgg
3540aagaactgcc tcaagacgaa gacttacttg aatcagaaga tgatgcacca
atcatcaagt 3600taatcaatgc gatgctgggt gaggcgatca aagagggtgc
ttcggatata cacatcgaaa 3660cctttgaaaa gtcactttgt atccgtttcc
gagttgatgg tgtgctgcgt gatgttctag 3720cgccaagccg taaactggct
ccgctattgg tttcacgtgt caaggttatg gctaaactgg 3780atattgcgga
aaaacgcgtg ccacaagatg gtcgtatttc tctgcgtatt ggtggccgag
3840cggttgatgt tcgtgtttca accatgcctt cttcgcatgg tgagcgtgtg
gtaatgcgtc 3900tgttggacaa aaatgccact cgtctagact tgcacagttt
aggtatgaca gccgaaaacc 3960atgaaaactt ccgtaagctg attcagcgcc
cacatggcat tatcttggtg accggcccga 4020caggttcagg taaatcgacg
accttgtacg caggtctgca agaactcaac agcaatgaac 4080gaaacatttt
aaccgttgaa gacccaatcg aattcgatat cgatggcatt ggtcaaacac
4140aagtgaaccc taaggttgat atgacctttg cgcgtggttt acgtgccatt
cttcgtcaag 4200atcctgatgt tgttatgatt ggtgagatcc gtgacttgga
gaccgcagag attgctgtcc 4260aggcctcttt gacaggtcac ttagttatgt
cgactctgca taccaatact gccgtcggtg 4320cgattacacg tctacgtgat
atgggcattg aacctttctt gatctcttct tcgctgctgg 4380gtgttttggc
tcagcgcttg gttcgtactt tatgtaacga atgtaaagaa ccttatgaag
4440ccgataaaga gcagaagaaa ctgtttgggt tgaagaagaa agaaagcttg
acgctttacc 4500atgccaaagg ttgtgaagag tgtggccata agggttatcg
aggtcgtacg ggtattcatg 4560agctgttgat gattgatgat tcagtacaag
agctgattca cagtgaagcg ggtgagcagg 4620cgattgataa agcaattcgt
ggcacaacac caagtattcg agatgatggc ttgagcaaag 4680ttctgaaagg
ggtaacgtcc ctagaagaag tgatgcgcgt gaccaaggaa gtctagtatg
4740gcggcatttg aatacaaagc actggatgcc aaaggcaaaa gtaaaaaagg
ctcaattgaa 4800gcagataatg ctcgtcaggc tcgccaaaga ataaaagagc
ttggcttgat gccggttgag 4860atgaccgagg ctaaagcaaa aacagcaaaa
ggtgctcagc catcgaccag ctttaaacgc 4920ggcatcagta cgcctgatct
tgcgcttatt actcgtcaaa tatccacgct cgttcaatct 4980ggtatgccgc
tagaagagtg tttgaaagcc gttgccgaac agtctgagaa acctcgtatt
5040cgcaccatgc tactcgcggt gagatctaag gtgactgaag gttattcgtt
agcagacagc 5100ttgtctgatt atccccatat cttcgatgag ctattcagag
ccatggttgc tgctggtgag 5160aagtcagggc atctagatgc ggtattggaa
cgattggctg actacgcaga aaaccgtcag 5220aagatgcgtt ctaagttgct
gcaagcgatg atctacccca tcgtgctggt ggtgtttgcg 5280gtgacgattg
tgtcgttcct actggcaacg gtagtgccga agatcgttga gcctattatc
5340caaatgggac aagagctccc tcagtcgaca caatttttat tagcatcgag
tgaatttatc 5400cagaattggg gcatccaatt actggtgttg accattggtg
tgattgtgtt ggttaagact 5460gcgctgaaaa agccgggcgt tcgcatgagc
tgggatcgca aattattgag catcccgctg 5520ataggcaaga tagcgaaagg
gatcaacacc tctcgttttg cacgaacact ttctatctgt 5580acctctagtg
cgattcctat ccttgaaggg atgaaggtcg cggtagatgt gatgtcgaat
5640catcacgtga aacaacaagt attacaggca tcagatagcg ttagagaagg
ggcaagcctg 5700cgtaaagcgc ttgatcaaac caaactcttt cccccgatga
tgctgcatat gatcgccagt 5760ggtgagcaga gtggccaatt ggaacagatg
ctgacaagag cggcagataa tcaggatcaa 5820agctttgaat cgaccgttaa
tatcgcgtta ggcattttta ccccagcgct tattgcgttg 5880atggctggct
tagtgctgtt tatcgtgatg gcgacgctga tgccaatgct tgaaatgaac
5940aatttaatga gtggttaacc tgccgctcat cagacgttag tttttggatt
atcgagaaga 6000aggacatcat tcccctcaac tcgctatctg taatttggag
aaaataatga aaaataaaat 6060gaaaaaacaa tcaggcttta ccctattaga
agtcatggtt gttgtcgtta tccttggtgt 6120tctagcaagt tttgttgtac
ctaacctgtt gggcaacaaa gagaaggcgg atcaacaaaa 6180agccatcact
gatattgtgg cgctagagaa cgcgctcgac atgtacaaac tggataacag
6240cgtttaccca acaacggatc aaggcctgga cgggttggtg acaaagccaa
gcagtccaga 6300gcctcgtaac taccgagacg gcggttacat caagcgtcta
cctaacgacc catggggcaa 6360tgagtaccaa tacctaagtc ctggtgataa
cggcacaatt gatatcttca ctcttggcgc 6420agatggtcaa gaaggtggtg
aaggtattgc tgcagatatc ggcaactgga acatgcagga 6480cttccaataa
gcttcggctt gttgtcggtt gatacgttcc tgttgtttga ttcgttatcg
6540ttgcttgata cgttattgat ggtagtacgc aaaaaatgga gtctacaagg
tgaaaactaa 6600gcaaacacag ccaggtttca ccttgattga gattcttttg
gtgttggtat tactgtcagt 6660atcggcggtc gcggtgatct cgaccatccc
taccaatagc aaagatgttg ctaaaaaata 6720cgctcaaagc ttttatcagc
gaattcagct actcaatgaa gaggctattt tgagtggctt 6780agattttggt
gttcgtgttg atgaaaaaaa atcgacttac gttctgatga ctttgaagtc
6840tgatggctgg caagaaacgg agttcgaaaa gatcccttct tcaactgaat
taccggaaga 6900actggcactg tcgctgacat taggtggtgg cgcgtgggaa
gacgatgatc ggttgttcaa 6960tccaggaagc ttatttgatg aagatatgtt
tgctgatctt gaagaggaaa agaagccgaa 7020accaccacag atctacatct
tgtcgagtgc tgaaatgacg ccatttgtac tgtcgtttta 7080cccaaatacc
ggtgacacaa tacaagatgt ttggcgcatt cgagtattgg ataatggtgt
7140gattcgatta ctcgagccgg gagaagaaga tgaagaagaa taaccgttct
ccttatcgtt 7200ctcgcggtat gcctcttggt tctcgaggaa tgactctgct
tgaagtattg gttgcgctgg 7260ctatcttcgc tacggcggcg atcagtgtga
ttcgtgctgt cacccagcac atcaatacgc 7320tcagttatct cgaagaaaaa
accttcgcgg cgatggtcgt tgataatcaa atggccctag 7380tcatgctaca
tcctgagatg cttaaaaaag cgcagggcac gcaagagtta gcgggaagag
7440aatggttctg gaaggtgact cccatcgata ccagcgataa tttattaaag
gcgtttgatg 7500tgagtgcggc aaccagtaag aaagcgtctc cagtcgttac
ggtgcgcagt tatgtggtta 7560attaagagaa tgtggtcaat taagagcatg
ttattaatta agaacagctc gctaactaag 7620agcgtgtcgc taactaagag
catgtcggaa aataagcgta cgccgcgtaa acaaggtcta 7680ccttcaaaag
ggagaggctt taccttaatt gaagtcttgg tctcgattgc tatctttgcc
7740acgctaagta tggcggctta tcaggtggtt aatcaggtgc agcgaagcaa
cgagatctct 7800attgagcgca gtgctcgttt gaaccaactg caacgcagtt
tagtcatttt agataatgat 7860tttcgccaga tggcggtgcg aaaatttcgt
accaacggtg aagaagcatc atctaagctg 7920atcttaatga aagagtattt
attggactcc gacagtgtag gcatcatgtt tactcgtcta 7980ggttggcaca
acccacaaca gcagtttcct cgcggtgaag tcacgaaggt tggctaccgt
8040attaaagaag aaacacttga gcgtgtatgg tggcgttatc ccgatacacc
ttcaggccaa 8100gaaggtgtga ttacccctct gcttgatgat gttgaaagct
tggaattcga gttttatgac 8160ggaagccgct gggggaaaga gtggcaaacc
gataaatcac tgccgaaagc ggtgaggctt 8220aagctgacac tgaaagacta
tggtgagata gagcgtgttt atctcactcc cggtggcacc 8280ctagatcagg
ccgatgattc ttcaaacagt gactcttcag gcagtagtga ggggaataat
8340gactcatcga actaataagc gtttagcgac aaggtcagcc ttgggacgta
aacaacgtgg 8400tgtcgcgctg atcattattt tgatgctatt ggcgatcatg
gcaaccattg ctggcagcat 8460gtccgagcgt ttgtttacgc aattcaagcg
cgttggtaac caactgaatt accaacaggc 8520ttactggtac agcattggtg
tggaagcgct tgtgcaaaac ggtattaggc aaagttacaa 8580agacagtgat
accgtgaacc taagccaacc atgggcgtta gaagagcagg tatacccatt
8640ggattatggc caagttaagg gccgcattgt tgatgctcag gcatgtttta
atcttaatgc 8700cttagccgga gtggcgacca cttcaagtaa ccagactcct
tatttaatca cggtttggca 8760aaccttattg gaaaaccaag acgttgagcc
ttatcaggct gaggttatcg caaattcaac 8820gtgggaattt gttgatgcgg
atacacgaac cacctcttcg tctggtgtag aagacagcac 8880gtatgaagcg
atgaagccct cttatttggc ggcgaatggc ttaatggccg atgaatccga
8940gctacgagcg gtttatcaag tcactggtga agtgatgaat aaggttcgcc
cctttgtttg 9000cgctctgcca accgatgatt tccgcttgaa tgtgaatact
ctcacggaaa aacaagcacc 9060gttattggaa gcgatgtttg cgccaggctt
aagtgaatcg gatgccaaac agctgataga 9120taaacgccca tttgatggct
gggatacggt agatgctttc atggctgaac ctgccattgt 9180tggtgtaagt
gccgaagtca gcaagaaagc gaaagcatat ttaactgtag atagcgccta
9240ttttgagcta gatgcagagg tattagttga gcagtcacgt gtacgtatac
ggacgctttt 9300ctatagtagt aatcgagaaa cagtgacggt agtacgccgt
cgttttggag gaatcagtga 9360gcgagtttct gaccgttcga ctgagtagcg
aaccacaaag ccctgtgcag tggttagttt 9420ggtcgacaag ccaacaagaa
gtgatagcaa gcggtgaact gtctagctgg gaacagcttg 9480acgagttaac
gccttacgct gaaaagcgca gctgtatcgc tttattgccg ggaagtgaat
9540gcttaattaa gcgtgttgag atcccgaaag gtgctgctcg ccagtttgat
tctatgctgc 9600cgttcttatt agaagacgaa gtcgcacaag atatcgaaga
cttacacctg actattttag 9660ataaagatgc cactcacgct accgtgtgtg
gtgtggatcg tgaatggcta aaacaagctt 9720tagacctgtt tcgcgaagcc
aatataatct tccgtaaggt gctaccagat acactagccg 9780tgccttttga
agaacaaggc atcagtgcgt tgcagataga tcagcattgg ttattgcgcc
9840aaggtcactc tcaacgtcaa ggtcactatc aagccgtatc gatcagtgaa
gcatggttac 9900cgatgttttt gcaaagtgat tgggttgtcg ctggtgagga
agagcaagcg acgactatct 9960tcagctatac cgcgatgccg agcgacgacg
ttcaacagca aagcggcctc gagtggcaag 10020caaagcctgc ggaattggtg
atgtctttat tgagtcagca agcgatcaca agcggcgtaa 10080atttactgac
tggcaccttt aaaaccaaat cttcattcag taaatattgg cgtgtttggc
10140agaaagtggc gattgctgct tgtttgctgg tggccgtgat tgtgactcag
caagtgttga 10200aggttcagca atacgaagcg caagcacaag cctaccgcat
ggagagtgag cgtatcttta 10260gagctgtgct gcctggcaaa caacgcattc
cgaccgtgag ttacctcaag cgtcagatga 10320atgatgaagc taagaaatac
ggtggttcag gcgaaggtga ttctttactt ggttggttag 10380ctttgctgcc
tgaaacctta gggcaagtga agacgatcga agttgaaagc attcgctacg
10440atggcaaccg ttctgaggtt cgactgcagg ctaaaagttc tgacttccaa
cactttgaga 10500ccgcaagggt gaagctcgaa gagaagtttg tcgttgagca
agggccattg aaccgtaatg 10560gcgatgccgt atttggcagt tttactctta
aaccccatca ataacctgcg taaggagatc 10620agtgatgaga aatatgattg
aaccactcca agcgtggtgg gcttcaataa gtcagcggga 10680acaacgatta
gtcattggtt gttctatttt attgatactg ggcgttgtct attggggatt
10740aatacaacca cttagccaac gagccgagct tgcacaaagc cgcattcaaa
gtgagaagca 10800acttctggct tgggtaacgg acaaagcgaa tcaagtggtt
gaactacgag gcagtggtgg 10860catcagtgcc agtcagcctt tgaaccaatc
tgtgcctgct tctatgcgcc gttttaacat 10920cgagctgata cgcgtgcaac
cacgcggtga gatgctgcaa gtttggatta agcctgtgcc 10980atttaataag
ttcgttgact ggctgacata cctgaaagaa aagcagggtg ttgaggttga
11040gtttatggat attgatcgct ctgatagccc tggggttatt gagatcaacc
gactacagtt 11100taaacgaggt taatgtgaaa cgcggtttat ctttcaaata
cggcctgtta ttcagcgtca 11160tttttatcgt ttttttctcg gtaagcttgt
tgctgcattt gcctgccgct tttgctctca 11220agcatgcacc cgtcgtgcgt
ggtttaagca ttgaaggcgt tgagggcacc gtttggcaag 11280gtcgcgctaa
caatatcgcg tggcagcgtg tcaattacgg ctcagtgcag tgggacttcc
11340agttctctaa actattccaa gccaaagcag aacttgcggt tcgctttggc
cgcaacagcg 11400acatgaactt atcaggtaaa ggacgtgtcg gatatagcat
gagtggtgct tacgcggaaa 11460acttagtggc atcaatgcca gccagcaacg
tgatgaaata tgcgccagct atcccagtgc 11520ctgtgtctat tgcagggcaa
gttgaactga cgatcaaaca tgcggttcat gctcaacctt 11580ggtgtcaatc
aggtgaaggt acgcttgctt ggtctggtgc agcagtcgac tcgccagtgg
11640gttcgttaga ccttggccct gtgattgcgg acataacgtg tgaagacagc
acaattgcag 11700ccaaaggcac tcagaagagc gatcaggtag acagcgagtt
ctcagcgagc gtaacaccta 11760accaacgcta cacctcggca gcatggttta
agccaggcgc tgaattcccg ccagcaatgc 11820agagtcagct taagtggttg
ggcaatcctg atagccaagg taaataccaa tttacttatc 11880aaggccgctt
ttagcccggt atttacttca gagctagtat ctgaagtaaa tttggcgatc
11940aaatcgcgaa ctataaaaaa cgggcacctc actgaggtgc ccgttttgtt
tgttctgaga 12000atctagagga tatctgacgg ttaaagagag caaactcacc
cagctttctt gtacaaagtg 12060gtcccc 12066254080DNAVibrio splendidus
2gtgctttgtg acaacggggg atgtatggat attgaagttt cgcgccaggt tgcggtagtt
60gaagctacga gtggagatgt cgtcgtagtt aagccagacg gcagcgcaag aaaagtttca
120gttggcgata ccatccgtga aaatgagatc gtgattacgg ccaacaagtc
agagcttgta 180ttaggcgttc agaatgattc gattccggtt gcagagaatt
gcgtcggttg tgttgatgaa 240aacgctgcat gggtagatgc cccaatagct
ggtgaggtta attttgactt acagcaagca 300gacgcagaaa ccttcactga
agacgacctt gctgcaattc aagaagccat tttaggtggt 360gccgatccga
ctcaaatctt agaagcaacg gctgctggtg gcggactagg ttctgcaaat
420gctggctttg tgacgattga ctataactac actgaaactc atccatcgac
tttctttgag 480accgctggtc tagcagaaca aactgttgat gaagacagag
aagaattcag atctatcact 540cgttcatcag gtggccaatc aatcagtgaa
acactgactg aaggctccat atctggcaat 600acctatcccc aatctgtaac
aacgacagaa acgattattg ctggtagttt agctctcgcc 660cctaactctt
tcattccaga aactttatcc ctcgcttcac tacttagtga attaaacagc
720gacattactt caagtggtca gtccgttatc ttcacctatg acgcgacgac
taattctatc 780gttggtgttc aagataccga cgaagtatta cgtatcgaca
ttgatgccgt cagtgttggc 840aataacattg agctttctct aaccacaacg
atttcccagc cgattgatca tgtaccgtcg 900gttggcggtg gtcaggtttc
ttacactggc gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca
ccgctgggaa cccgctagca acacccgtta acgcacaagt ttcagtgttt
1020gacgggatag atccgtctgt tgaaagtgtc aatatcacta acgttgaaac
tagcagcgcg 1080gcaatcgaag ggacgttctc aaatattggt agtgataacc
ttcaatcagc cgtatttgat 1140gcaagtgcac tggaccagtt tgatgggttg
ctcagtgata atcaaaacac gcttgcgaga 1200ctttctgatg atggaacaac
gattactctg tccatccaag gtcgaggtga ggttgttctc 1260actatctctc
tagataccga tggcacctat aaattcgagc agtctaatcc gatagaacaa
1320gtgggtaccg attcactgac gttcgctttg ccaatcacga ttaccgattt
tgaccaagat 1380gttgtaacca atacgatcaa cattgccatt actgatggcg
atagccctgt tattactaat 1440gttgacagta ttgatgttga tgaagcgggc
attgttggcg gctcacaaga gggcacggcg 1500ccagtgtctg gcactggcgg
tatcaccgcg gacatttttg aaagtgacat cattgaccat 1560tatgagctag
aacccactga atttaatact aatggcacct tggtttcaaa tggcgaggct
1620gtgctacttg agttgattga tgaaaccaac ggtgtaagaa cttacgaagg
ttatgttgag 1680gtcaatggtt cgagaattac ggtctttgac gttaaaattg
atagcccttc attgggcaac 1740tatgagttta atctttatga agaactttct
catcaaggcg ctgaagatgc gctgttaact 1800tttgcattgc caatttatgc
tgttgatgca gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac
cagaagctgc tgagatcctc gttaatgtta aagacgatgt cgttgaatta
1920gttgataagg ttgaatcagt caccgagccg accttagcgg gcgatactat
tgtttcgtat 1980aacctgttca attttgaagg cgcagatggt tctacaattc
aatcgtttaa ctacgacggt 2040gttgattact cactcgatca aagcctgctc
cccgatgcta cccagatttt cagttttact 2100gaaggtgtcg tcactatctc
attaaacggt gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa
gcagtgaaac tatcgtcaaa cagttctcat ttttagccga agatggtgat
2220ggggatactg atagttcgac gcttgagtta agtattaccg atggccaaga
tccgatcatt 2280gatttgatcc cgcctgtgac tctctctgaa accaacctta
atgacggctc tgctcccagc 2340ggaagtacag ttagcgcaac cgagacgatt
acctttaccg caggcagcga cgatgtagca 2400agtttccgta ttgaaccaac
agagtttaat gtgggcggtg cacttaaatc gaatggattt 2460tcggttgaga
taaaagaaga ttcggctaat ccgggtactt acattggctt tattaccaac
2520ggttcgggcg ctgaaatccc agtgtttacg attgctttct ctacgagcac
attgggtgaa 2580tacaccttta ctctgcttga agcgttagac catgtagatg
gtttagataa gaacgatctg 2640agctttgatc tgcctattta tgcggttgat
acggacggcg acgattcatt ggtgtctcag 2700cttaatgtga ctatcggtga
tgatgttcaa atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc
ttgctgacgg tacaatcaca accaacacca ttgatgtaat gccaaatcaa
2820agtgctgatg gcgcgacgat cactcggttc acttatgacg gtgtcgtaaa
cacactggat 2880caaagtattt
caggagaaca gcagttcagc ttcacagaag gcgaactgtt tatcaccctt
2940gaaggtgaag tgcgctttga gcctaatcgc gatctagacc actcagtgag
tgaagatatc 3000gtgaagtcga ttgtggtgac ttcaagcgac ttcgataacg
atccggtgac ttcaaccatt 3060acgctgacga tcactgatgg tgataacccg
acgattgatg ttattccaag tgttacgctt 3120tctgaaatta acctgagcga
tggctctgct ccaagtggca gcgcggtaag ctcgactcaa 3180actattactt
ttaccaatca aagtgatgat gtggttcgtt tccgtattga gtcaacggag
3240ttcaatacta acgatgatct taaatcgaac ggtttagctg ttgagttacg
tgaagacccg 3300gcagggtcgg gtgactacat tggttttacg accagtgcga
cgaacgtaga aactccagta 3360ttcacattaa gctttaattc tggatcatta
ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc aagatgcccg
tggcaacaac gacctcagtt ttgatttacc tgtttacgcg 3480gtagatagtg
atggcgatga ttcattggtg tctccgttaa acgtcactat cggtgatgat
3540gttcaaatca tgcaagatag tacgttagat atcgtcgagc caaccgtcgc
agatttggcc 3600gctggcacag tgacaactaa caccattgat gtgatgccaa
atcaaagtgc cgatggcgca 3660acggtgacgc aattcactta tgatggccag
cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat ttagcttcac
agaaggtgaa ctgttcatca cgcttcaagg tgatgtgcgc 3780tttgagccta
atcgtaatct agaccacaca ctcagcgaag acatcgtgaa atcaatcgtg
3840gtgacatcta gcgattccga taacgatgtg ttgacctcaa ccgtcactct
gaccattacc 3900gatggtgata tcccaaccat tgataatgtt ccaactgtga
acttgtctga aactaatctg 3960agtgatggct ctgcacctag cggaagcgcg
gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg atgatgtgac
aagcttccgt attgaaccga ctgaatttaa tgttggtggc 4080gctctcacat
caaacggatt ggcagtcgag ttaaaagctg atccaaccac accgggtggc
4140tacatcggtt ttgtgactga tggttcgaac gttgaaacta acgtgttcac
gattagcttc 4200tcagatacca atttaggcca gtacaccttc accttacttg
aagcgttaga ccatgtggat 4260ggtttagcga acaatgatct gacctttgat
ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac tggtgtctca
gttaaatgta accatcggtg atgatgttca aatcatgcaa 4380ggtggtacgt
tagatatcac tgagccaaat cttgcagacg gcacaattac aaccaatacc
4440atcgatgtga tgccagagca aagcgccgat ggtgcgacga tcactcagtt
cacttatgac 4500ggtcaagttc gaacactgga tcaaacggac aatggtgagc
agcaatttag cttcactgaa 4560ggcgagttgt tcatcactct tcaaggtgac
gtgcgtttcg aacccaatcg caacctagat 4620cacacagcta gcgaagatat
cgtgaagtcg atagtggtga cttcaagcga tttagataac 4680gatgtggtga
cgtcaacggt cactctgacg attactgatg gtgatatccc aaccattgat
4740gcagtgccaa gcgttactct gtctgaaatc aatcttagtg acggctctgc
gccaagtggc 4800actgcagtta gtcaaactga gacgattacc ttcaccaatc
aaagtgatga tgtgaccagt 4860ttccgtattg agccaataga gttcaatgtg
ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa aagaagattc
ggctaatccg ggtacttaca ttggctttat taccaacggt 4980tcgggcgctg
aaatcccagt gtttacgatt gctttctcta cgagctcatt gggtgaatac
5040acctttactc tgcttgaagc gttagaccat gtagatggtt tagataagaa
cgatctgagc 5100ttcgatctgc ctgtttatgc ggtcgatacg gacggcgatg
attcattggt gtctcagcta 5160aacgtgacca tcggtgatga tgtccaaatc
atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg ctgatggaac
aatcacaacc agcactattg atgtgatgcc aaaccaaagt 5280gctgatggtg
cgacgatcac tcagtttact tatgacggtc agctaagaac gcttgatcaa
5340aatgacactg gcgaacagca gttcagcttc acagaaggcg agttgtttat
cacccttgaa 5400ggtgaagtgc gctttgagcc aaaccgagac ctagaccaca
ccgcgagtga agatattgtt 5460aagtcgattg tggtcacttc aagtgatttc
gataacgact ctctgacttc taccgtaacg 5520ctgaccatta ctgatggtga
taaccctacg atcgacgtca ttccaagcgt taccctttct 5580gaaactaatc
tgagtgatgg ctctgctcca agtggcagcg cggtaagctc gactcaaact
5640attactttta ccaatcaaag tgatgatgtg gttcgtttcc gtattgagcc
aacggagttc 5700aatactaacg atgatcttaa atcgaacggt ttagccgttg
agttacgtga agacccggct 5760gggtcgggtg actacattgg ttttactact
agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt tttctagcac
cacattaggt gaatatacct tcactttgct tgaagcgttg 5880gaccaccaag
atgcccgtgg caacaacgac ctcagttttg aactgcctgt ttatgcggta
5940gacagtgatg gcgatgattc actgatgtct ccgttaaacg tcaccatcgg
cgatgatgtt 6000caaatcatgc aagacggtac gttagatatc gtcgagccaa
ccgtcgcaga tttggccgct 6060ggcattgtga caactaacac cattgatgtg
atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat tcacttatga
tggccaactt cgaacacttg accaaaatga caatggcgaa 6180caacagttta
gcttcacgga aggtgaacta ttcatcactc ttgaaggtga agtgcgcttt
6240gagcctaatc gtaatctaga ccacacgctg aacgaagaca tcgtgaaatc
gatcgtggtg 6300acgtctagtg actccgataa cgatgtgttg acctcaaccg
tcactctgac cattaccgat 6360ggtgatatcc caaccattga taatgtgcca
acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt caccaagtgg
cagcgcagtt agctcaactc aaaccatcac ttacaccact 6480caaagtgatg
atgtaaccag cttccgtatt gaaccgactg agttcaatgt tggcggtgct
6540ctcaaatcaa atggattggc ggttgagctg aaggccgatc caaccactcc
gggcggctac 6600atcggctttg tgactgatgg ttcgaacgtt gaaactaacg
tgttcacgat tagcttctcg 6660gataccaatt taggtcaata caccttcacc
ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata acgatctgag
ctttgatctg ccagtctacg ccgtcgatag tgatggcgat 6780gattcactgg
tgtctcaact caatgtaacc atcggtgatg atgttcaaat catgcaaggt
6840ggtacgttag atatcactga gccaaacctt gcagacggca caaccacaac
taacaccatc 6900gatgtgatgc cagaacaaag tgccgatggt gcgacgatca
ctcagtttac gtatgacggg 6960caagttcgca ctctggatca aactgacaat
ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca tcactcttca
aggtgacgtg cgtttcgaac ccaatcgcaa cctagatcac 7080acagctagcg
aagacatcgt gaagtcgata gtggtgactt caagcgattc agataacgat
7140gtggtgacgt caacggtcac tctgactatt actgatggtg atctcccaac
cattgatgca 7200gtgccaagcg ttactctgtc tgaaactaat cttagtgacg
gctctgcgcc aagtggcagc 7260gcagtcagtc aaactgagac catcaccttt
accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc caaccgagtt
taatgtgggc ggtgcactga aatcgaatgg gtttgcggtt 7380gagataaaag
aagactctgc taatccgggt acttacattg gctttattgc caatggttcg
7440agcgctgaaa tcccagtgtt cacgattgct ttctctacga gtacgttggg
tgaatacacc 7500tttactctgc ttgaagcgtt agaccatgcg gatggtttag
ataagaacga tctgagcttt 7560gagcttccgg tttacgcggt tgatacagac
ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg gtgatgatgt
tcaaatcatg caagatggta cgttagacgt tatcgagcca 7680aatcttgcag
acggcacaat cacaaccaac accattgatg tgatgcccga gcaaagtgct
7740gatggtgcga cgatcactca gtttacttat gacggtcagc taagaacgct
tgatcaaaat 7800gacactggtg aacagcagtt cagcttcaca gaaggcgagt
tgtttatcac ccttgaaggt 7860gaagtgcgct ttgaacctaa tcgcgatcta
gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag tgacttcaag
cgacttcgat aacgatccgg tgacttcagc cattacgctg 7980accattactg
atggtgataa tccgactatc gattcggtac cgagcgttgt acttgaagaa
8040gctgatttaa ctgatggctc atcgccaagt ggcagcgcgg ttagtcaaac
ggaaaccatc 8100actttcacta atcaaagtga cgatgttgag aaattccgtt
tagaaccaag tgaatttaat 8160actaacaacg cgctcaagtc cgatggcttg
atcattgaga ttcgagagga accaacagga 8220tccggcaatt atattggttt
cacgaccgat atttcgaatg tcgaaaccac tgtgtttaca 8280ctcgatttca
gcagtaccac tttgggtgag tacaccttca cgcttctgga agcgattgac
8340cacacgcctg ttcaaggcaa taacgatcta acattcaact tgccagtcta
cgcggttgat 8400agcgacggtg atgattcgct aatgtcatca ctatcggtga
cgattactga tgatgttcaa 8460gtgatggtga gtggttcgct tagtatcgaa
gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa catcagtatt
tgatgtatta acatccgcga gtgctgatgg ggcgaccatt 8580actcagttca
cttatgatgg tggggcggta ttaacgcttg atcaaaacga tacaggtgag
8640cagaagttcg tggttgctga tggggcatta tatatcactc tgcaaggcga
tattcgtttc 8700gaaccaagtc gtaaccttga ccatactggt ggcgatatcg
tcaagtcgat agtcgtaact 8760tcaagtgatt ccgatagcga tcttgtgtct
tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa cgattgacac
ggtgccaagc gttactctgt cagaaacgaa tctgagcgac 8880ggatctgctc
cgaatgcaag tgcggtaagt tcaactcaaa ccattacctt tactaaccaa
8940agtgatgacg tgacgagttt ccgtattgaa ccgactgatt ttaatgttgg
tggtgctctg 9000aaatcgaacg gattggcggt cgaactgaaa gcggacccaa
ctacaccggg tggctacatc 9060ggttttgtga ctgatggttc gaacgttgaa
actaacgtgt ttacgattag cttctcggat 9120accaatttag gtcaatacac
cttcaccctg cttgaagcgt tggatcatgt agatggctta 9180gtgaagaatg
atctgacttt tgatcttcct gtttatgcgg ttgatagcga tggtgatgat
9240tcactggtgt ctcaactgaa tgtgaccatt ggtgatgatg tacaggtcat
gcaaaaccaa 9300gcgcttaata ttattgagcc aacggttgct gatttggctg
caggtactcc gacgacagcc 9360actgttgatg tgatgcctag ccaaagtgcc
gatggcgcga caatcactca gtttacttac 9420gatggcgggg cggcaataac
actcgaccaa aacgacaccg gtgaacagaa gtttgtattt 9480actgaaggtt
cactgtttat caccttgcaa ggtgaagtgc gtttcgagcc aaatcgcaat
9540ctaaaccaca cagcgagcga agacatcgtg aagtcgattg tggtgacttc
aagcgattta 9600gataacgatg tactgacgtc aacggtcact ctgactatta
ctgatggtga tatcccaacc 9660attgatgcag tgccaagcgt tactctgtct
gaaactaatc ttagtgacgg ctcagcgcca 9720agcagcagtg ctgtaagtca
aacagagacg attaccttca tcaatcaaag tgatgatgtg 9780gcgagtttcc
gtattgagcc aacagagttc aatgtgggcg gtgcactgaa atcgaatgga
9840tttgcggttg agataaaaga agattcggct aatccgggta cttatatcgg
ttttattacc 9900gatggttcga atactgaagt tcctgtattc acgattgctt
tctctacaag tacgttgggc 9960gaatacacct tcaccttact tgaagcgcta
gaccatgcaa atggcctaga taagaacgat 10020ctgagttttg atcttcctgt
ttatgcggta gacagtgatg gcgatgattc actggtgtct 10080caactgaatg
tgaccattgg tgatgatgtc caaataatgc aagacggtac gttagatatc
10140actgagccaa atcttgcaga cggaacaatc acaaccaaca ccattgatgt
gatgccaaat 10200cagagtgccg atggtgcgac gatcactgaa ttctcatttg
gcggtattgt caaaacactc 10260gatcaaagca tcgtaggtga gcagcagttt
agtttcaccg aaggtgagct attcatcact 10320cttcaaggtc aagtgcgctt
tgaaccaaat cgtgaccttg accactctgc cagcgaagac 10380atcgtgaagt
cgatagtggt tacttcaagt gattttgata acgatcctgt gacttcaacc
10440gttacgctga ccattaccga tggtgatatt ccaactatcg atgcggtacc
aagtgttacg 10500ctttcagaaa caaacctagc tgatggttct gcgccaagtg
gtagtgcggt tagtcaaacg 10560gagacgatta cttttaccaa tcaaagtgat
gatgtggttc gcttccgtct ggaaccaacc 10620gagttcaata ctaacgatgc
acttaaatcg aatggcttag cggtcgaact gcgcgaagaa 10680cctcaaggct
ctggtcagta cattggcttt accaccagtt cgtctaatgt tgagacaaca
10740gtatttacgt tggactttaa ctccggaacc ttaggtgaat acacatttac
tttaatcgaa 10800gctctggatc atcaagatgc gcgtggcaac aacgatttaa
gctttaatct acctgtgtat 10860gcggtggata gtgatggcga tgactcgtta
gtctctcagc ttggcgtgac cattggcgac 10920gatgtgcagt tgatgcaaga
cggcacaatc accagtcgtg agcctgcagc aagtgttgaa 10980acatcaaata
cctttgatgt gatgccaaac caaagtgctg atggagccaa agtcacttca
11040tttgttttcg atggtaagac tgcagaaagt cttgatttga atgtgaatgg
tgaacaagag 11100ttcgtcttca cggaaggttc ggtatttatt acgacggaag
gtgagatacg attcgagccg 11160gtacgtaatc aaaatcatgc tggtggtgat
attaccaagt cgattgaggt gacgtctgtt 11220gacctcgatg gcgatattgt
cacatcgaca gtgacactga agattgttga tggtgacctt 11280cctactatcg
accttgttcc cggaattacg ttatctgaag tggatctggc cgatggctct
11340gtgccaaccg gtaatccagt gacaatgaca caaaccatta cctacacagc
gggtagtgac 11400gacgtaagcc atttcagaat tgaccctacg cagttcaata
cttcaggggt tttgaaatcg 11460aacggcctag atgtcgaaat aaaagagcag
ccagctaatt ctggtaatta cattggcttc 11520gtcaaagacg gttctaacgt
agaaaccaac gtcttcacga tcagcttctc gacgagcaat 11580ttagggcaat
acacgttcac actacttgaa gcgttagatc atgtagatgg attgcaaaac
11640aatatactaa gcttcgatgt ccctgtttta gcggttgatg cggatggtga
tgattctgca 11700atgtcgccta tgacggttgc gatcaccgat gacgtacaag
gtgttcaaga tggcaccttg 11760agtatcactg agccttcatt agctgatttg
gcatcgggta cgccaccaac gacggcaatc 11820attgatgtta tgccaacgca
gagtgctgat ggcgcgaaag taacacagtt tacttacgat 11880ggtggcacag
ctgtaacgtt agacccaagc atcgccacag aacaagtctt taccgtaacc
11940gatggcttac tgtacatcac cattgaaggg gaggttcgtt ttgagccgag
ccgagatcta 12000gaccattcat ctggcgatat cgtaagaacg attgtcgtca
ccaccagtga ttttgataac 12060gatacagata ccgcggatgt cactttgacg
atcaaagacg gtatcaatcc cgttatcaat 12120gtggttccag atgttaactt
atcggaagtt aatctagcgg atggctcgac gccaagtggt 12180tctgcagtca
gttcgactca cacaatcact tacaccgaag gaagtgatga ttttagtcac
12240tttagaattg cgaccaacga attcaatcct ggcgatctgt tgaaatcaag
tggtcttgtt 12300gttcaactaa aagaagatcc tgcttctgct ggtgattaca
ttggttatac cgatgatggt 12360atgggtaacg ttaccgatgt atttaccatt
agctttgata gtgcaaacaa agctcagttt 12420acatttacct tgattgaggc
gcttgatcac cttgatggtg tgctttacaa cgatcttacg 12480ttccgtttgc
ctatctatgc tgttgataca gatgattctg aatcaacaaa gcgcgatgtg
12540gtggttacga tagaagatga catccagcaa atgcaagatg gcttcttaac
cattaccgag 12600ccaaattctg gtactccaac aacaactacc gttgatgtga
tgccaatacc aagtgcagac 12660ggtgcgacta ttacgcagtt cacgtatgac
ggtggttctc caattactct gaatcaaagc 12720atcagcggcg aacaagagtt
tgttttcact gaaggttcac tgtttgtgac actagatggt 12780gatgtaaggt
ttgagccaaa tagaaacctt gatcactctg cgggcgacat tgttaaatcg
12840attgtgttca cgtcttcaga ctttgataac gacatcttct catcaaaagt
cactctcacc 12900attgttgatg gtgatgggcc aacaatcgac gttgtgccgg
gtgtggcatt gtcagaaagc 12960ttacttgcgg atggttcgac gcctagcgta
aatcccgtga gtatgactca aaccattact 13020tcacttgcaa gtagtgatga
tattgctgaa atagtggtgg aagtcgggtt gttcaatacc 13080aacggcgcgt
tgaagtcgga tggtttgtca ctgagtttac gtgaagaccc tgtaaattca
13140ggcgactaca ttgcatttac tactaatggt tcgggtgttg agaaagttat
cttcactctg 13200gattttgatg atacgaatcc gagtcaatat acgtttactc
tgcttgaacg tttagaccat 13260gttgatggct taggaaataa cgatctgagt
tttgatcttt ctgtttatgc agaagatacc 13320gatggtgata tttcagcgtc
taaaccgctt acagtcacca tcaccgatga tgttcagctc 13380atgcaatccg
gtgcgctcaa cattactgag ccaaccacag gaacaccgac tacagcagtc
13440tttgatgtga tgcctgcgca aagtgcagat ggcgcgacaa tcactaagtt
tacctatggc 13500agccaacctg aagagtctct ggtacaaacc gtcacgggtg
agcaagaatt tgtgttcact 13560gaaggttctc tgtttatcaa tcttgaaggt
gatgtacgtt tcgaacctaa ccgtaatctc 13620gatcattcgg gtggtaacat
cgttaagacc attacggtga catcggaaga taaagatggc 13680gatattgtca
cttcaacagt gacgctgact attgtagatg gcgcgccacc agtaatagac
13740acagtaccaa cggttgcatt ggaagaagcg aatctggtcg acggatcttc
accgggttta 13800cctgttagcc aaactgaaat cattactttc acagcaggaa
gtgatgatgt gagccacttc 13860cgtattgatc cggctcaatt caacacatca
ggcgatctga aagcggatgg tttggtggtt 13920cagttaaaag aagatcctct
aaacagcgat aattatattg gttacgttga aagcggcggt 13980gtccaaacgg
atatcttcac catcaccttt agcagcgtgg ttctaggaga gtacacattc
14040accttgttgg aagagttaga tcacctgcct gtacaaggta acaatgatca
aatcttcacc 14100ttgccagtga tcgcagtcga caaagacaac actgactcag
cggtgaaacc tcttacggtg 14160accattaccg atgatgttcc aaccattact
gacaccaccg gcgccagtac gtttgtggtt 14220gatgaagatg atttgggcac
tctggcacaa gcgacgggtt cgtttgtaac cacagaaggt 14280gcagatcaag
tcgaggttta cgaactacgt aatatatcaa cgttggaagc aacgctatcg
14340tcgggcagtg aaggtattaa gatcactgag atcacaggtg ctgctaacac
gaccacctac 14400caaggggcga ccgacccaag tggaacgcca attttcacat
tagtgctgac tgatgatggt 14460gcctacacct ttaccttgct tggccctctc
aatcacgcta cgacaccgag taacctcgat 14520acattaacaa taccatttga
tgttgttgcc gttgacggtg atggcgatga ttctaaccaa 14580tatgtattgc
caatcgaggt gctagatgat gtgcctgtaa tgacggcgcc gacgggtgaa
14640acggttgttg atgaagacga tcttactggc attggttccg atcaatctga
agatacaatt 14700atcaatggac tgttcaccgt tgatgaaggt gcggatggcg
ttgtgctgta tgagctggtt 14760gatgaagatt tggttctgac gggcttaacc
tctgatggag aaagcttaga gtggctagct 14820gtttcacaaa acggcacaac
atttacttac gttgctcaaa ctgcaacgag taatgaagcg 14880gtgttcgaga
ttattttcga cacctcggat aacagctacc aatttgaatt atttaagcca
14940ctgaagcacc ctgacggtgc aaacgagaac gcgatagatc ttgatttctc
aatcgttgct 15000gaagattttg atcaagacca atcggatgcg atcggtctaa
aaattacggt aaccgatgat 15060gttccgttag tgacaactca atcgattact
cgtcttgaag gtcaggggta tggcaactct 15120aaagtcgaca tgtttgccaa
tgcaacagat gtgggggctg atggcgcggt actgagtcga 15180attgagggta
tctcaaataa tggtgcagat attgttttcc gtagcgggaa caatgggcca
15240tatagtagcg gcttcgattt aaacagcggt agccaacaag ttcgagtcta
cgagcaaaca 15300aatggcggtg ctgatactcg tgaacttggc cgtctacgca
tcaactcaaa tggtgaggtt 15360gaattcagag ctaacggcta tctcgatcat
gacggtgatg acaccatcga cttctcgatt 15420aacgtgattg ccacagatgg
agatttagac acctctgaaa caccgttaga tattacgatt 15480actgataggg
attctacaag aattgcgctg aaagtgacga ccttcgagga tgcgggtaga
15540gactcaacca taccttacgc aacaggtgat gagccgactc ttgagaatgt
tcaagataac 15600caaaatggtt tgccgaatgc gccagcgcaa gttgcgctgc
aagttagtct gtatgaccaa 15660gataacgctg aatctattgg gcagttgacg
attaaaagcc cgaacggagg tgatagtcat 15720caaggtactt tttattactt
tgatggtgct gactacatag aattagtgcc tgagtcaaat 15780gggagcatta
tatttggctc tcctgaactc gaacaaagct tcgctccaaa cccgagtgaa
15840ccaagacaaa ctatcgcgac gatagacaac ctgttctttg ttccagacca
acacgctagt 15900tcggatgaaa ctggtgggcg agttcgttat gagcttgaaa
ttgagaaaaa tggcagtacg 15960gatcacaccg ttaattcaaa cttcagaatt
gagattgaag ctgtagctga tattgcgact 16020tgggatgatt ccaacagcac
gtatcagtat caagtcaacg aagatgaaga caatgtcacg 16080ttgcagctga
acgcagagtc tcaagataac agtaatactg agacgattac ctatgaactt
16140gaagccgttc aaggcgacgg gaagtttgag ttacttgatc aaaatggcaa
tgtgttaacg 16200cccgttaatg gtgtttatat catcgcatct gctgatatca
atagcaccgt agttaaccct 16260attgataact tctcagggca gattgagttc
aaagcgacgg caattacgga agagacgctt 16320aacccatacg atgattcaga
caacggtgga gcaaacgata agacgacggc tcgttctgtg 16380gaacaaagta
ttgttattga tgtgaccgca gatgcggacc ctggcacatt cagtgttagt
16440cgaattcaga tcaacgaaga caatatcgat gatccagatt acgtcgggcc
tttggacaat 16500aaagacgcgt tcacgttaga cgaagtcatc accatgacag
ggtcggtcga ttctgacagt 16560tctgaagaac tgtttgtgcg catcagtaat
gttacggaag gagctgtgct ttacttctta 16620ggcaccacga cagtcgttcc
gaccatcacg atcaatggtg tggattatca agaaatcgcg 16680tattccgatt
tggctaacgt tgaggttgtt ccaaccaaac acagtaatgt cgatttcacc
16740ttcgatgtta cgggagtggt caaagatacg gcaaatctat ccacgggcgc
ccaaatcgat 16800gaggagatac taggaactaa aaccgtcaac gttgaagtca
aaggcgttgc cgatactcct 16860tatggtggaa cgaatggcac ggcttggagt
gcaattacag atggcactac atctggtgtt 16920caaaccacga ttcaagagag
ccaaaatggt gatacctttg ctgagcttga tttcaccgtg 16980ttgtcgggag
agagaagacc agatactggc actacaccat tagctgacga tgggtcagaa
17040tcaataaccg ttattctatc gggtataccc gatggggttg ttctagaaga
cggtgacggt 17100acagtgattg accttaactt tgtcggttat gaaaccggac
cgggcggtag tcctgactta 17160tccaaaccta tctacgaagc gaacattact
gaggcgggta aaacttcagg cattcgcatc 17220agacctgtcg actcttcaac
cgagaatatt cacattcaag gtaaagtgat tgtgactgag 17280aacgatggtc
acacgcttac gtttgatcaa gaaattcgag tgcttgttat acctcgaatc
17340gacacatcag caacttatgt caatacgact aacggtgatg aagatacggc
tatcaatatt 17400gattggcacc ctgaaggcac ggattacatt gatgacgatg
agcatttcac taagataact 17460attaatggaa taccactggg tgttactgca
gtagtcaacg gtgatgtgac cgttgatgac 17520tcaaccccag gaacattgat
tataacgcct aaagatgctt cccaaactcc tgaacaattt 17580actcaaattg
cattagctaa taacttcatt caaatgacgc ctccggctga ttctagtgca
17640gattttacgt tgaccaccga acttaaaatg gaagagcgag atcatgagta
tacgtctagc 17700ggcctagagg atgaagatgg tggttatgtc gaagccgatc
cagatataac cggaatcatt 17760aacgttcaag tacgacctgt ggttgaacct
ggagatgccg acaacaagat tgtcgtttca 17820aacgaagatg gctctggaga
tctcactacg attacggctg atgctaatgg tgtcattaaa 17880tttacaacta
acagtgataa ccaaacgact gatactaacg gagacgaaat ctgggacggt
17940gaatacgtcg tccgatacca agaaacggat ttaagcacag tagaagagca agtcgacgaa
18000gtgattgttc agctgactaa caccgatgga agcgcgttat ctgatgatat
tttagggcaa 18060cttttagtaa ctggtgcctc ttacgaaggc ggtggccgat
gggttgtgac caatgaagat 18120gcctttagcg tcagtgcgcc caatggatta
gatttcaccc ctgccaatga tgcggatgat 18180gtagctactg atttcaatga
tatcaagatg acaattttca ctttggtctc agatcctggt 18240gatgctaaca
atgaaacgtc cgcccaagtg caacgcaccg gagaagtaac gctttcttat
18300cctgaagtgc tgacggcacc tgacaaagtt gccgcagata ttgcgattgt
gccagacagt 18360gttatcgacg ctgttgagga tactcagctt gatctcggcg
cggcactcaa cggcattttg 18420agcttgacgg gtcgcgatga ttctactgac
caagtgacgg tgatcatcga tggcactctg 18480gtcattgatg ctacaacatc
attcccaatt agcctgtcgg gaacaagtga tgttgacttt 18540gtgaatggga
aatatgttta cgagacgact gttgagcagg gcgtagccgt cgattcatcg
18600ggtttgttat tgaatctgcc accaaactac tctggtgact ttaggttgcc
aatgaccatc 18660gtgaccaaag atttacaatc tggtgatgag aagaccttag
tgactgaagt tatcatcaaa 18720gtcgcaccag atgctgagac ggatccaacg
attgaggtga atgtcgtggg ttcgcttgat 18780gatgccttta atcctgttga
taccgacggt caagctgggc aagatccggt gggttacgaa 18840gacacctata
ttcaactcga cttcaattcg accatttcgg atcaggtttc cggcgtcgaa
18900ggcggccaag aagcgtttac gtccattact ttaacgttgg acgacccttc
tataggtgca 18960ttctatgaca acacgggtac ttcattaggt acatctgtta
cgtttaatca ggctgaaata 19020gcagcgggtg cactcgataa cgtgctcttt
agggcaatcg aaaattaccc aacgggtaat 19080gatattaacc aagtgcaggt
taatgtcagc ggtacagtca cagataccgc aacctataat 19140gatcctgctt
ctcctgcggg tacggcaaca gactcagata ctttctctac gagtgtcagc
19200tttgaagtcg ttcctgtggt cgatgacgtg tctgtcactg gaccgggtag
cgatcctgat 19260gttatcgaga ttactggcaa cgaagaccag ctcatttctt
tgtcggggac agggcctgta 19320tcgattgcac tgactgacct tgatggttca
gaacagtttg tatcgattaa gttcacagat 19380gtccctgatg gcttccaaat
gcgtgcagat gctggctcga catataccgt gaaaaataat 19440ggtaatggag
agtggagtgt tcaactgcct caagcttcgg ggttgtcatt cgatttaagt
19500gagatttcga tcttgccgcc taaaaacttc agtggtaccg ctgagtttgg
tgtggaagtc 19560ttcactcaag aatcgttgct gggtgtgcct actgcggcgg
caaacttgcc aagcttcaaa 19620ctgcatgtgg tacctgttgg tgacgatgtt
gataccaatc cgactgattc tgtaacaggc 19680aacgaaggcc aaaacattga
tatcgaaatc aatgcgacta ttttggataa agaattgtct 19740gcaacaggaa
gcgggacgta taccgagaat gcgcccgaaa cgcttcgagt tgaagtggcg
19800ggtgttcctc aagatgcttc tattttctat ccagatggca cgacattggc
tagctacgat 19860ccggcgacgc agctctggac tctcgatgtt ccagctcagt
cgttagataa gatcgtattt 19920aactctggcg aacataatag tgatacaggc
aatgtactgg gtatcaatgg tccactgcag 19980attacggtac gttcagtaga
tactgatgct gataatacag agtacctagg tacgccaacc 20040agcttcgatg
tcgatctggt gattgatcct attaacgatc aaccgatctt tgtgaacgta
20100acgaatattg aaacatcgga agacatcagt gttgccatcg acaactttag
tatctacgac 20160gtcgacgcaa actttgataa tccagatgct ccgtatgaac
tgacgcttaa agtcgaccaa 20220acactgccgg gagcgcaagg tgtgtttgag
tttaccagct ctcctgacgt gacgtttgta 20280ttgcaacctg acggctcatt
ggtgattacc ggtaaagaag ccgacattaa taccgcattg 20340actaatggag
ctgtgacttt caaacccgac ccagaccaga actacctcaa ccagactggt
20400ttagtcacaa tcaatgcaac gctcgatgat ggtggtaata acggtttgat
tgacgcggtt 20460gatccgaata ccgctcaaac caatcaaact accttcacca
ttaaggtgac ggaagtgaat 20520gacgctcctg tggcgactaa cgttgattta
ggctcgattg cggaagacgc tcaaatcgtg 20580attgttgaga gtgacttgat
tgcagccagt tctgatctag aaaaccataa tctcacagta 20640accggtgtga
ctcttactca agggcaaggt cagcttacac gctatgaaaa tgctggtggt
20700gctgatgacg cagcgattac ggggccattc tggatattca ttgcagataa
tgatttcaac 20760ggcgacgtta aattcaatta ctccattatc gatgatggta
ccaccaacgg tgtggatgat 20820tttaaaaccg atagcgctga aatcagcctt
gtagttactg aagtcaatga ccagccagtg 20880gcatcgaaca ttgatttggg
caccatgctt gaagaaggac agctggtcat taaagaggaa 20940gacctgattt
ccgcaaccac tgatccggaa aacgacacga ttactgtgaa cagtttggtg
21000ctcgatcaag gtcagggcca attacaacgc tttgagaacg tgggcggtgc
tgatgatgct 21060acgatcactg gcccgtactg ggtatttact gcagccaacg
aatacaacgg tgatgttaag 21120ttcacttata ccgttgagga cgatggtaca
accaacggcg ctgatgattt cttaacagat 21180accggcgaaa ttagcgttgt
ggtaacggaa gtgaatgatc aaccggtggc aacggatatc 21240gacttaggaa
acatccttga agaagggcag ttgatcatca aagaggaaga cttaattgct
21300gctacgagcg atccggaaaa cgacacgatt accgtgacca atctggtgct
cgacgaaggc 21360caaggccagt tacagcgctt tgagaacgtg ggcggtgctg
atgacgctat gattactggc 21420ccgtactgga tatttacggc tgctgatgaa
tacaacggta acgttaagtt cacctatacc 21480gtcgaggatg atggtacaac
caacggcgct aatgatttcc taacggatac tgcagagatc 21540acagcgattg
tcgacggagt gaacgatacg cctgttgtta atggtgacag tgtcactacg
21600attgttgacg aggatgctgg tcagctattg agtggtatca atgtcagtga
cccagattat 21660gtggatgcat tttctaatga cttgatgaca gtcacgctga
cagtggatta cggtacattg 21720aacgtatcac ttccggcagt gacgacagtg
atggtcaacg gcaacaacac tggttcggtt 21780atcttagttg gtactttgag
tgacctgaat gcgctgattg atacgccaac cagtccaaac 21840ggtgtctacc
tcgatgcgag cttgtctcca accaatagca ttggcttaga agtaatcgcc
21900aaagacagcg gtaacccttc tggtatcgcg attgaaactg caccagtggt
ttataatatc 21960gcagtgacac cagtcgctaa tgcgccaacc ttgtctattg
atccggcatt taactatgtg 22020agaaacatta cgaccagctc atctgtggtc
gctaatagtg gagtcgcttt agttggaatt 22080gtcgctgcat tgacggacat
tactgaagag ttaacgttga agatcagcga tgttccggat 22140ggtgttgatg
taaccagtga tgtgggtacg gtttcgttgg tgggtgatac ttggatagcg
22200accgctgatg cgatcgatag tctcagactc gtagagcagt catcattagg
taaaccgttg 22260accccgggta attacacctt gaaagttgag gcgctatctg
aagagactga caacaacgat 22320attgcgatat ctcaaaacat cgatctgaat
ctcaatattg ttgccaatcc aatagatctc 22380gatctgtctt ctgaaacaga
cgatgtgcaa cttttagcga gtaactttga tactaacctc 22440actggcggaa
ctggaaatga ccgacttgta ggtggagcgg gtgacgatac gctggttggc
22500ggtgacggta acgacacact cattggtggc ggcggttccg atattctaac
cggtggcaat 22560ggtatggatt cgtttgtatg gctcaatatt gaagatggcg
ttgaagacac cattaccgat 22620ttcagcctgt ctgaaggaga ccaaatcgac
ctacgagaag tattacctga gttgaagaat 22680acatctccag acatgtctgc
attgctacaa cagatagacg cgaaagtgga aggggatgat 22740attgagctta
cgatcaagtc tgatggttta ggcactacgg aacaggtgat tgtggttgaa
22800gaccttgctc ctcagctaac cttaagtggc accatgcctt cggatatttt
ggatgcgtta 22860gtgcaacaaa atgtcatcac tcacggttaa cgcctaattg
gaggctagct attagaatct 22920aacgattaaa ctaaaagcgg accatttaac
cataacgaaa gaggccagca ttgctggcct 22980cttttttgtc actgtataaa
tcgtaaagag ttacttaaga gagttgtgga tcaggaactc 23040ttcttcgacg
cctttcaatt tcatctcatc cataatgaag ttcactgtgt tcaacaagcg
23100ttgttcacct tttggtatca ggtaaccgaa ttgactgttg gtaaacggtg
tttcacagcg 23160tgccgcttca agacgttcgt ccgtcacttg atagaacaga
ccttcaggag tttctgtcac 23220cattacatca actttacctt ccgcaacggc
ttgcggaacg tctaggttgt tctcgtaacg 23280cgtaaagctc gcgtcttgca
agttagcatc cgcaaacatc tcattagtcc caccgatatt 23340gacgccaaca
cgcacagaag agaggttcac tttctcaatg ctgttgtatt gttctgcttt
23400gcctttcgca actaagaaac acttgccaaa ggtcatgtaa ccttgagttt
gttctgcgtt 23460taactgacgc tgcattttac gcgtgatacc gcccatcgcg
atgtcgtatt tatcgctgtc 23520tagatcggtc agtagatctt tccatgtggt
acgaacaatc tgtaattcaa cgcccaactg 23580ctctgcaaca tgtttggcta
cgtcaatgtc ataaccagag taggttttgc cgtcgaagta 23640agaaaaaggt
ttgtagtcgc ctgtggtgcc gacgcgaagt gtgcctgatt tttgaatgtc
23700ttctagctgg tcagcttgta ctacaccaga aagtgccaga gtaatggaag
caagtaatag 23760tgatgttttt ttcattgtaa ttatctgttg tgtttgtgtt
gttattcaaa gtaacagaaa 23820caatcagaga aagagatcaa accattggaa
aggttgtaaa agaagataaa acgagggcag 23880gagataggta acgctattga
tttgtgaaca ttgataaaca tgtgtttcat attccatttt 23940gataaaccgt
agacaaacaa aaagcccatg ttatcgaata acatgggctt cattttggtt
24000taacttgtta gctgcttatt tagctgctta tttagctgtt tagctgttta
gctgtttagc 24060tacttagcaa ctgactcgtt gttcatctta gccggagctt
tagatgcgtt aaccagcagg 24120ataccaacgg tgagtaccat cgaaccacat
agtaggaaca acaagcgtcc tgttggttcg 24180tttggaatca gagccattgc
taggataccg aaacctgctg tgctgataag cttaccaagc 24240attgaacgct
gtttagtatc taggttctgc tgctcttcac cttccgctac tagcggcgta
24300ttccagttag tgaatagttg gtcaacttct ttctcacgtt caggcgatag
gcctttgtag 24360aagcgagaag ttaggatgaa gtaaccacca gtaaacacta
cgtgagcagc taagctaaga 24420ccaactttca agtcgctcca ttcacggcca
gtaagcgctg tttccatacc aaataggtgc 24480tcgatgtctt ctgcttgaag
cgagataccg aagatgtaag aaacgaagcc accaacgatt 24540aacgtagacc
aaccagccca gtcaggcgtc ttacgaatcc acataccaag tagtacaggg
24600ataagcattg ggaagccaat taacgcacct acgttcatta cgatatcgaa
caagctcaaa 24660tgacgtagag agttaatgaa caagccaatc gcgatgatga
taatacccat catgatagtg 24720gttagcttac ttacaataac cagctctttc
tgagttgcgt tttgacgtag aatagggctg 24780tagaagttca ttacaaagat
gccagcgtta cggttcaaac ctgaatccat agaagacatt 24840gttgcagcga
acattgctga cataagaaga ccaaccatac ctgctggcat tacgttctgt
24900acgaatgcta ggtaagcagc atcaccagct ttatcaccca ttgaagcgta
ctccaatgcg 24960aaatcaggca tgaatgcact tacgtaccaa ggtggtagga
accagattag tgggccaaca 25020accataagga tacatgctag gcctgccgct
ttacgtgcgt tttcactgtc tttcgcacat 25080aggtaacggt aagcgttgat
gctgttgttc attacaccga actgcttcac gaagatgaat 25140acaacccaaa
gaacgaagat gctcatgtag tttaggttat tacctaacat gaagtcgccg
25200tcgaaatttg caacgatgtt agttaggcca ccaccgtgga agtaagctgc
aaccgcacaa 25260gtaatcgtaa ccgccatgat aacaagcatt tgcatgaagt
cagaagcaac aaccgcccaa 25320gagccgcctg ttactgccat caatactaga
accatacccg ttaccacaat ggttgcttcc 25380attgggatgt tgaataccgc
tgctacgaag atagctagac catttagcca gatacccgca 25440gagataaggc
tgtcaggcat acctgcccat gtgaagaact gttcagacgt tttaccaaag
25500cgctgacgaa tagcttcgat cgccgttacc acacgaagtt ggcggaactt
tggagcgaag 25560tacatatagt tcatgaagta gccaaaagca ttggctaaga
ataggattac aataacgaaa 25620ccgtcattga acgcgcgtcc tgcggcacct
gtaaacgtcc atgctgaaaa ctgtgtcatg 25680aaggcggttg caccaaccat
ccaccacaac attttgccgc cccctctgaa gtaatcacta 25740gtcgacgtgg
tgaacttacg gaacatccaa ccaatagcga ttaaaaagaa gaagtaggcg
25800agaacaacaa aagtatcgat agtcatcttt tcagcctttt aaatatcata
attaactggg 25860cttagattaa cgcgttcaaa ggtttatttg tactacaata
tgtctttagt atgatctagg 25920tcgcattgat ttttgggtgc acacgataag
ttaatttaac ctactgtttt tattgatttt 25980aattgttttt atgaattgct
ctagatccaa gataaattga agttcaaatg tttatatgta 26040ttacaatata
agtaatgagg ctttagttta ccttatttat aagattttaa ttataaccgt
26100aacaaatatg ctacaactga gcgtggttgt gcgacgacat tcacgttaat
ttggaactct 26160attctggaaa ttcttgtatt aggatttcaa gtgtagctca
ttgttttcac ttcgctattt 26220tgtgtttgtc tgcggttctg tcgcctttcc
atgctattga ttaatttttt cgtgctagag 26280agacgcgtat ttggaatgtt
tgtcactgag tgggcgttaa actggacgac gggacactct 26340ttcggctcac
tttgtctatt gtggtcttca gtgcatgcta tgagaaatgt ttgacgacgt
26400attgaaaagg aatattgtcg gataaaggga tgggtaagga gctggataag
cggtagggag 26460ccccagtaac gcttcgctag atgcatactg aggttgcttg
aaagccttac atcactcgtt 26520cttgcctgtc ttagtcacgg agctgtacga
ggccataggg agaacggtga tagggtatgg 26580ggaaacagaa cgttgattga
gcgtgtttta cggttagtca gcgcaataaa cgccagataa 26640taaaaagccc
caccgaggtg aggctttatc acgaaatcta aaacagatta agcgttaacg
26700tgatcaactg cgtcacgaac aagcttgcct agttcgtccc acttaccttc
atcgataagg 26760ttagttggaa ccatccaagt accgccacac gcaagaacag
aagggatcga taggtattca 26820tcaacattct tcaagcttac gccaccagta
ggcatgaatt taacagggta aactgctgtt 26880agtgctttaa gcatgccagt
accgcctgaa ggctcagcag ggaagaactt caacgtgcga 26940agacccattt
ccattgcttg ctcaactagg cttgggttgt taacacccgg tacgattgca
27000atacctttat cgatacagta ttgaacagta cgtgggttaa aacctgggct
tacgatgaaa 27060tcaacaccag cttcgataga tgcgtcaact tgctcgttag
tcagtacagt acctgaaccg 27120attagcatgt ctgggaattc tttacgcatg
atgcgaatcg cttcgattgc acattctgta 27180cgtagtgtaa tttctgcaca
tggcatgcca ttttcaacca acgctttacc tagagggata 27240gcgtcttcag
cacggttgat cgcgattaca ggaattactt ttaggtttgc tagttgttca
27300tttaatgtcg tcatgaattc tttctcacgt taaatgtggg cctgctttca
actaagcaaa 27360cccttgatta atagttaaag tgcgtaatta tagagacaga
tcaggcgtcg cttctagagg 27420aatgatagca cctggatgct gaatcacggt
tcctgccaca atatgacctg caaatgcagc 27480atcacgagca ctaccgccgc
tcaagcgctt ggccaagaag cctgcactga acgagtcgcc 27540agcggcagtc
gtatcaacga tgttgtctac agggttgggt gcaacgtatt gagcgctttg
27600gctttcaacc actaagcagt ctttcgcgcc acgtttaatg acgatctctt
tcacaccaga 27660ctctgacgta cgtgtaatac attgttcaat gctttcgtcg
ccgtatagct cttgctcatc 27720atcaaacgtc agcagagccg tatctgtgta
cttaagcatt ttcaagtacc aagaaatcgc 27780ttcttgttgg ctttcccaaa
gtttaggtcg gtagttattg tcgaagaata cttggccgcc 27840ttgagctttg
aatttgtcta agaagttgaa tagctgcgtg cgaccatttt ctgtcaagat
27900tgccagcgta ataccactta agtaaatcgc gtcaaaagag aacagcttat
caagaagagc 27960aggcgtgtct tcctgatcaa acatgaactt cgctgcagca
tcactacgcc agtagtggaa 28020actgcgttca ccagtttcat cggtctcgat
gtagtaaagc cctggttgtt tgtggtccag 28080ctgagcaatt aagctcgtgt
cgataccttc cgcttgccaa ttttttaaca tgtcggtact 28140gaatgggtca
gtgcctagtg cagttacgta gctcgtgttg atatcttgct cttttgttaa
28200gcgtgacaag taaagtgcag tattcagcgt atcgccacca aaactttgct
taagcccgtc 28260ttgtttcttt tgtagctcaa ccatgcactc gccaatgacc
gcgatgttta atgatttcat 28320atgcttacct tagcaactga ggttgcgcta
gttattattt taggaaatct tcacgcgcag 28380gattgaagat atcaagaagg
atgctgtctt gttctagagc aactgcaccg tgcatcatgt 28440gtttacgagc
gaagtaagca tcgccttctt taagcacttt cttctcgccg tcgatttcag
28500cttcgaagct accacgaaca acataaccga tttggtcgtg aatttcgtga
gtatgagggt 28560ggccaatcgc gcccttatca aagcataggt gtactgccat
tagatcgtca gtgtaagcaa 28620cgattttacg cttaatgccg ccaccaagtt
cttcccatgg attttcatct aggataaaga 28680aagagttcat tgtgtatctc
ctaatctgtt taaatctttt aagtgttact taacttgcat 28740ccatcataag
ggaatgagtt caattgtaat acaatatatc taaatttgtg tgatattgat
28800caagcgatag tttatatagc gtaaatgaat caacaactta agaattgctt
ggtatctggc 28860attagttagc tgcatcaatg gcttacggtg aattatgtga
ctctactcat catttggcga 28920cgaataggta taattaaagc tcatattgta
ttactttata tggagtttga aaatttaatc 28980aaagtttaag cagataaact
ctttattgag ggtgacaaag aatatgacga ctaaaccagt 29040attgttgact
gaagctgaaa tcgaacagct tcatcttgaa gtgggccgtt ctagcttaat
29100gggcaaaacc attgcagcga acgcgaaaga cctagaagca ttcatgcgtt
tacctattga 29160tgttccaggt cacggtgaag ctgggggtta cgaacataac
cgccacaagc aaaattacac 29220gtacatgaac ctagctggtc gcatgttctt
gatcactaaa gagcaaaaat acgctgactt 29280tgttacagaa ttactagaag
agtacgcaga caaatatcta acgtttgatt accacgtaca 29340gaaaaacacc
aacccaacag gtcgtttgtt ccaccaaatc ctaaacgaac actgctggtt
29400aatgttctca agcttagctt attcttgtgt tgcttcaaca ctgacacaag
atcagcgtga 29460caatattgag tctcgcattt ttgaacccat gctagaaatg
ttcacggtta aatacgcaca 29520cgacttcgac cgtattcaca atcacggtat
ttgggcagta gccgctgtgg gtatctgtgg 29580tcttgcttta ggcaaacgtg
aatacctaga aatgtcagtg tacggcatcg accgtaatga 29640tactggcggt
ttcctagcgc aagtttctca gctatttgca ccttctggct actacatgga
29700aggtccttac taccatcgtt atgcgattcg cccaacgtgt gtgttcgctg
aagtgattca 29760ccgtcatatg cctgaagttg atatctacaa ctacaaaggc
ggcgtgattg gtaacacagt 29820acaagctatg cttgcgacag cgtacccgaa
cggcgagttc ccggctctga atgatgcttc 29880tcgtactatg ggtatcacag
acatgggtgt tcaggttgcg gtcagtgttt acagtaagca 29940ttactcttct
gaaaacggtg tagaccaaaa cattctgggt atggcgaaga ttcaagacgc
30000agtatggatg catccatgtg gtcttgagct atctaaagca tacgaagccg
catctgcaga 30060gaaagaaatc ggcatgcctt tctggccaag tgttgaattg
aatgaaggcc ctcaaggtca 30120caacggcgcg caaggcttta tccgtatgca
ggataagaaa ggcgacgttt ctcaacttgt 30180gatgaactac ggccaacacg
gcatgggtca cggcaacttt gatacgctgg gtatttcttt 30240ctttaaccgc
ggtcaagaag tgctacgtga atacggcttc tgtcgttggg ttaacgttga
30300gccaaaattc ggcggccgtt acctagacga aaacaaatct tacgctcgtc
aaacgattgc 30360tcacaatgca gttacgattg atgaaaaatg tcagaacaac
tttgacgttg aacgtgcaga 30420ctcagtacat ggtttacctc acttctttaa
agtagaagac gatcaaatca acggtatgag 30480tgcatttgct aacgatcatt
accaaggctt tgacatgcaa cgcagcgtgt tcatgctaaa 30540tcttgaagaa
ttagaatctc cgttattgtt agacctatac cgcttagatt ctacaaaagg
30600cggcgaaggc gagcaccaat acgactattc acaccaatat gcgggtcaga
ttgttcgcac 30660taacttcgaa taccaagcga acaaagagct aaacactcta
ggtgacgatt tcggttacca 30720acatctatgg aacgtcgcaa gcggtgaagt
gaagggcaca gcaattgtaa gttggctaca 30780aaacaacacc tactacacat
ggctaggtgc aacgtctaac gataatgctg aagtaatatt 30840tactcgcact
ggcgctaacg acccaagttt caatctacgt tcagagcctg cgttcattct
30900acgcagcaaa ggcgaaacaa cactgtttgc ttctgttgtt gaaacgcacg
gttatttcaa 30960cgaagaattc gagcaatctg tcaatgcacg tggtgttgtg
aaagacatca aagtcgtggc 31020tcacaccaat gtcggttcgg tagttgagat
caccacagag aaatcaaacg tgacagtgat 31080gatcagcaac caacttggcg
cgactgacag cactgaacac aaagtagaac tgaacggcaa 31140agtatacagc
tggaaaggct tctactcagt agagacaact ttacaagaaa cgaattcaga
31200agaacttagc actgcagggc aggggaaata ataatgagct atcaaccact
tttacttaac 31260tttgatgaag cagctgaact tcgtaaagaa cttggcaagg
atagcctatt aggtaacgca 31320ctgactcgcg acattaaaca aactgacgct
tacatggctg aagttggcat tgaagtacca 31380ggtcacggtg aaggcggcgg
ttacgagcac aaccgtcata agcaaaacta catccatatg 31440gatctagcag
gccgtttgtt ccttatcact gaggaaacaa aataccgaga ttacatcgtt
31500gatatgctaa cagcgtacgc gacggtatac ccaacacttg aaagcaacgt
aagccgtgac 31560tctaaccctc cgggtaagct gttccaccaa acgttgaacg
agaacatgtg gatgctttac 31620gcttcttgtg cgtacagctg catctaccac
acgatctctg aagagcaaaa gcgtctgatc 31680gaagacgatc ttcttaagca
aatgatcgaa atgttcgttg tgacttacgc acacgacttc 31740gatatcgtac
acaaccacgg cttatgggca gtggcagcag taggtatctg tggttacgca
31800atcaacgatc aagagtctgt agacaaagca ctatacggcc tgaaactaga
caaagtcagc 31860ggcggtttct tagcgcaact agaccaactg ttttcgccag
acggctacta catggaaggt 31920ccttactacc accgtttctc tctgcgtcca
atctacctgt tcgcagaagc gattgaacgt 31980cgtcagcctg aagttggtat
ctatgaattc aacgattcag tgatcaagac aacgtcttac 32040tctgtattca
aaacggcatt cccagacggt acattgcctg ctctgaacga ttcatcgaag
32100acaatctcta tcaacgatga aggcgttatc atggcaacgt ctgtgtgtta
ccaccgttac 32160gagcaaactg aaactctact tggtatggct aaccaccagc
aaaacgtttg ggttcatgct 32220tcaggtaaaa cactgtctga cgcggttgat
gcagcagacg acatcaaagc attcaactgg 32280ggtagcctgt ttgtaaccga
cggccctgaa ggcgaaaaag gcggcgtaag catccttcgt 32340caccgtgacg
aacaagatga cgacacgatg gcgttgatct ggtttggtca acacggttct
32400gatcaccagt accactctgc tctagaccac ggtcactacg atggcctgca
cctaagcgta 32460tttaaccgtg gccacgaagt gctgcacgat ttcggcttcg
gtcgctgggt aaacgttgag 32520cctaagtttg gcggtcgtta catcccagag
aacaagtctt actgtaagca gacggttgct 32580cacaacacag taacggttga
tcagaaaacg cagaacaact tcaacacagc attggctgag 32640tctaagtttg
gtcagaagca cttcttcgta gcagacgacc agtctctaca aggcatgagc
32700ggcacaattt ctgagtacta cactggcgta gacatgcaac gcagcgtgat
tcttgctgaa 32760cttcctgagt tcgagaagcc acttgtaatc gacgtatacc
gcatcgaagc tgacgctgaa 32820caccagtacg acctacccgt tcaccactct
ggtcagatca tccgtactga cttcgattac 32880aacatggaaa aaacgcttaa
gccgctaggt gaagacaacg gttaccagca cttatggaac 32940gtggcttcag
gcaaagtgaa cgaagaaggt tctctagtaa gctggctaca tgacagcagc
33000tactacagcc
tagtaaccag cgcgaatgcg ggcagcgaag tgatttttgc tcgcactggt
33060gctaacgatc cagacttcaa ccttaagagt gagcctgcgt tcatcttacg
tcagtctggt 33120caaaaccacg tgtttgcttc tgtactagaa acgcatggtt
actttaacga gtctatcgaa 33180gcctctgtag gcgctcgtgg tctagttaaa
tcagtatctg ttgtgggcca taacagtgtc 33240gggactgttg ttcgcattca
gactacttct ggcaacactt accactacgg tatctcaaac 33300caagctgaag
acacgcagca agcaactcac actgttgagt tcgcgggtga gacatactcg
33360tgggaaggat catttgctca actgtaaatg attaacatac atgccgttta
acgatggcat 33420gtattgatgt ggtgctttgc gggaacgaag catcacattg
aattcagtcg tgattgcaaa 33480tcgttcgttg ataccaacaa cgactgaata
catcgggaat aagtcaaacc gagtaactca 33540ctgcgagttg ctcggttttt
ttatgcgtgc tgcttttata agaaggggga aagaggatgg 33600ggcaacggag
cttccctttt ccttcgaatc ttacagagtg ggctaaagta taatttagga
33660tttaaaaata aagggattca aggatgaagt ggttattggc aatagttgcg
atgtctggtg 33720tcgcattggc ggcagaaaat aagaatgttg aggtgagcag
tgagcatttc gtccgttatc 33780aataccaaga caaaatcagc tatggaaagc
tagacaatga cgcagtgtta ccggtcagcg 33840gcgatctctt tggcgaatat
tcggtagcaa aaaattcgat cccgttagag tcggttgagg 33900tgttactacc
gacaaaacca gagaaagtct tcgccgtcgg gatgaacttc gctagccact
33960tagcctcacc tgccgatgca ccaccgccga tgtttcttaa acttccttct
tctttgattc 34020tcacgggcga agtgattcaa gtgccaccaa aagcaagaaa
tgttcatttt gaaggcgagc 34080tggtggttgt gattggtaga gagctcagtc
aagccagtga agaagaagcc gaacaagcga 34140tctttggcgt cacggtgggc
aacgatatta ctgaaagaag ttggcaaggc gccgatttac 34200aatggctccg
agcgaaagct tccgatggtt ttggcccggt tggcaacaca attgtgcgcg
34260gcattgatta caacaatatt gagttaacca ctcgtgttaa cggtaaagtg
gttcaacaag 34320aaaatacttc gttcatgatc cacaagccaa gaaaagtcgt
gagctatttg agctattatt 34380ttaccctcaa accgggcgat ctaattttca
tgggcacgcc aggtagaact tatgctctgt 34440ccgacaaaga tcaagtgagt
gtcacgattg aaggggtagg gactgtggta aatgaagtgc 34500ggttctgatg
gaattgaatt agcgttggga gctacagagc ttatgtctga atttgcagta
34560cgtagacgac ttgaacctat taatttgaac taggttaact tgtgtagtga
ataaactaac 34620cgtttttcgg ttccattatt ttagcccaat tgagtgatgt
ttttggaagc gagcagagaa 34680aacgagaatg acgaacctac atgctcggcg
agggttttgt tagtggtgta acacagtgtt 34740tctagctaag agaaattaga
tgctttctaa gtgtttgatt aattgaataa attaacaggt 34800actatccgct
ttgattttac tcaattggct gtaggtttaa atactgttat agtgttcctt
34860aaataataca taaacataac atataaataa gcgaacttat ggctagcact
tttaattcaa 34920tttcgggctc gaagcgtagc ctgcacgtgc aagtagcacg
cgaaatcgct cgaggaattt 34980tgtctggtga tctgccgcaa ggttctatta
ttcctggtga aatggcgttg tgtgaacagt 35040ttggtatcag ccgaacggca
cttcgtgaag cagttaaact actgacctct aaaggtctgt 35100tagagtctcg
ccctaaaatt ggtactcgcg tagtcgaccg cgcatactgg aacttccttg
35160atcctcaact gattgaatgg atggacggac taaccgacgt agaccaattc
tgttctcagt 35220ttttaggcct tcgccgtgcg atcgagcctg aagcgtgtgc
actggcggca aaatttgcga 35280cagctgaaca acgtatcgag ctttcagaga
tcttccaaaa gatggtcgaa gtggatgaag 35340ctgaagtgtt tgaccaagaa
cgttggacag acattgatac tcgtttccat agcttgatct 35400tcaatgcgac
cggtaacgac ttctatctac cgttcggtaa tattctgact actatgttcg
35460ttaacttcat agtgcattct tctgaagagg gaagcacatg catcaatgaa
caccgcagaa 35520tctatgaagc tatcatggcc ggtgattgtg acaaggctag
aattgcttct gctgttcact 35580tgcaagatgc caaccaccgt ttggcaacag
cataatagaa atgatttaaa gcgcacctga 35640gccatctcac atcgagatga
acaccctcac gttcggataa acgactttaa aaggtatgcc 35700tagtgcatgc
cttttttggt ttttagaccg cgtgttgcac tatctgtagc actattttgg
35760gtcagtcttt tcgctacgtc tgttaagcta ttcttccacg ttacaacccg
ccttgttttt 35820aacgtctacg taacaatccc caagcatcgt tctaaacaca
tttttagact gtctgtacct 35880gacaagtagt tatgcgacag ccgggatttt
tcacctctca gtattctaaa tctgggatta 35940aacaaacagg gttctcggat
ttaatattta gatatttaaa tcgaattcta atgatattac 36000ccactcgatt
tcgtaaaaaa cactggttta ttgtgtgatg aatgatgtgg gtttggtcaa
36060ggattctctt ttattatttt tgagaacttt atgtttatat gtgtttgatt
gtatttgtta 36120ataagtgtgc aaagtctcac ttttatttta agttgttgtt
tttaatgttt aatttatttt 36180gagtgtttga tcttttgggt ttttacctaa
aaccctaaca atttccttaa tggattagcc 36240atattccatc ctatgtcata
tatataatta acttaatcaa tcaaaataag atcaccatca 36300cttatttgga
ttattgtact acaaataaag agtcgaattt cctatagtcc tcgtaacaaa
36360ttaaaacgga caaaggatac acgatggaac tcaacacgat tattgtcggc
atttatttcc 36420tattcttgat tgcgataggt tggatgttta gaacatttac
aagtactact agtgactact 36480tccgcggggg cggtaacatg ttgtggtgga
tggttggtgc aaccgccttt atgacccagt 36540ttagtgcatg gacattcacc
ggtgcagcag gtaaagcgta taacgatggt ttcgctgtag 36600cggtcatctt
cgtagccaac gcatttggtt acttcatgaa ctacgcgtac ttcgcgccga
36660aattccgtca acttcgcgtt gttacggtaa tcgaagcgat tcgtatgcgt
tttggtgcga 36720ccaacgaaca agtattcact tggtcttcaa tgccaaactc
agtggtatct gcgggtgtgt 36780ggttaaacgc attggcaatc atcgcttcgg
gtatcttcgg tttcgacatg aacatgacta 36840tctgggtgac tggcctagtg
gtattggcaa tgtcggtaac aggtggttca tgggcggtaa 36900tcgcatctga
cttcatgcag atggttatca tcatggcggt aacggtaact tgtgcggttg
36960tagcggttgt tcaaggtggc ggtgttggtg agattgttaa caacttccca
gtacaagatg 37020gtggttcgtt cctttggggc aacaacatca actacctaag
catctttacg atttgggcat 37080tcttcatctt cgttaagcag ttctcaatca
cgaacaacat gcttaactct taccgttacc 37140tagcggctaa agactcaaag
aacgctaaga aagctgcact gcttgcttgt gtgttgatgt 37200tgtgtggtgt
gtttatttgg ttcatgcctt cttggttcat tgcaggccaa ggtgttgatt
37260tatcagcggc ttacccgaat gcaggtaaaa aagcgggtga ctttgcttac
ctatacttcg 37320tacaagagta catgccagca ggtatggttg gtctattagt
tgccgcgatg tttgcagcga 37380caatgtcttc aatggactca ggtctaaacc
gtaactcagg tatttttgtt aagaacttct 37440acgaaacaat cgttcgtaaa
ggtcaagcat cagagaaaga gctagtaacc gtatctaaaa 37500ttacttcagc
ggtatttggt ttcgctatta tcctaatcgc acagttcatc aactcattaa
37560aaggcttaag cctgtttgat acgatgatgt acgtaggtgc gttaatcggc
ttccctatga 37620cgattcctgc attccttggt ttcttcatca agaagactcc
ggactgggct ggttggggaa 37680cgctagttgt tggtggtatc gtatcttatg
tggttggttt tgttatcaac gcggagatgg 37740tagcagcggc gtttggtctt
gatactctaa caggacgtga atggtctgat gttaaagttg 37800cgattggtct
gattgctcac atcacgctaa ccggtggctt cttcgtacta tctacgatgt
37860tctacaagcc tctatcaaaa gaacgtcaag cggatgttga taagttcttt
ggcaacttag 37920ataccccatt agtagctgaa tcggcagagc aaaaagtgtt
ggataacaaa caacgtcaaa 37980tgcttggtaa actgattgcg gtagcgggtg
ttggtattat gctgatggct cttctgacta 38040acccaatgtg ggggcgccta
gtcttcatct tatgtggtgt gatagtgggt ggtgtcggta 38100ttctacttgt
gaaagcggtc gatgacggcg gcaagcaagc gaaagcagta accgaaagct
38160aatacataga aaacgtttat aatagaatgc gacgactcga aagggcgtcg
cattttttat 38220tctgcggaac tggaaaaccg tcaggtgaaa gatatctgac
ctaaatcacg aaaactgtac 38280aaagtggttc aatcgaatcg aaatatattc
aattgtccta caataagacg tatattgttg 38340ctaattcctt tcaatcaact
tgaaaaataa gtgagttaga atgagcgacc aaaaatctct 38400tgatgcaatc
aggaagatga agctggaaaa cgatacttca gcaggtaatc ttgtagacct
38460actccctatc gaagttcaaa cacgtgactt cgacctatca ttcctagaca
ccttgagcga 38520agcacgtccg cgtcttcttg ttcaagctga tcagctagaa
gaattcaaag caaaagtgaa 38580agctgatcaa gctcactgta tgtttgatga
tttctacaac aactctaccg ttaagttcct 38640tgagactgct cctttcgaag
agcctcaagc gtacccagct gagacggtag gtaaagcttc 38700tctatggcgt
ccttattggc gtcaaatgta cgttgattgc caaatggcac tgaacgcgac
38760acgtaaccta gcgattgctg gtgttgtaaa agaagacgaa gcgctcattg
cgaaagcaaa 38820agcttggact ctaaaactgt ctacgtacga tccagaaggc
gtgacttctc gtggctataa 38880cgatgaagcg gctttccgtg ttatcgctgc
tatggcttgg ggttacgatt ggctacacgg 38940ctacttcacc gatgaagaac
gccagcaagt tcaagatgct ttgattgagc gtctagacga 39000aatcatgcac
cacctgaaag tgacggttga tctattgaac aacccactaa atagccacgg
39060tgttcgttct atctcttctg ctatcatccc aacgtgtatc gcgctttacc
acgatcaccc 39120gaaagcaggc gagtacattg catacgcgct agaatactac
gcagtacatt acccaccatg 39180gggcggtgta gacggcggtt gggctgaagg
tcctgattac tggaacacgc aaactgcatt 39240cctaggcgaa gcattcgacc
tattgaaagc atactgtggt gtagacatgt ttaacaaaac 39300attctacgaa
aacacaggtg atttcccgct ttactgcatg ccagttcact ctaagcgcgc
39360gagcttctgt gaccagtctt caatcggcga tttcccaggt ttaaaactgg
cttacaacat 39420caagcactac gcaggtgtta accagaagcc tgagtacgtt
tggtactata accagcttaa 39480aggccgtgat actgaagcac acaccaaatt
ctacaacttc ggttggtggg acttcggtta 39540tgacgatctt cgttttaact
tcctttggga tgcacctgaa gagaaagccc catcgaacga 39600tccactgttg
aaagtattcc caatcacggg ttgggctgca ttccacaaca agatgactga
39660gcgtgataac catattcaca tggtattcaa atgttctccg tttggctcaa
tcagccactc 39720tcacggtgac caaaacgcat ttacgcttca cgcatttggt
gaaacgctag cgtcagtaac 39780aggttactat ggtggtttcg gtgtagacat
gcacacgaaa tggcgtcgtc aaacgttctc 39840taaaaacctg ccactatttg
gcggtaaagg tcagtacggc gagaacaaga acacaggcta 39900cgaaaaccac
caagatcgct tttgtatcga agcgggcggc actatctctg acttcgacac
39960tgaatctgat gtgaagatgg ttgaaggtga tgcaacggca tcttacaagt
acttcgttcc 40020tgaaatcgaa tcttacaagc gtaaagtctg gttcgttcaa
ggtaaagtct tcgtaatgca 40080agacaaggca acgctttctg aagagaaaga
catgacttgg ctaatgcaca caactttcgc 40140aaacgaagtg gcagacaagt
ctttcactat ccgtggcgaa gttgcgcacc tagacgtaaa 40200cttcatcaac
gagtctgctg ataacatcac gtcagttaag aacgttgaag gctttggcga
40260agttgaccca tacgagttca aagatcttga gatccaccgt cacgtggaag
tggaattcaa 40320gccatcgaaa gagcacaaca tcctgacgct tcttgttcct
aataagaatg aaggcgagca 40380agttgaagtg tttcacaagc ttgaaggcaa
cacgctactg ctaaatgttg acggcgaaac 40440ggtttcaatc gaactgtaat
ccgctgaagt aacagaagtt agatactaaa aactccgagt 40500gaaagctcgg
agtttttttg tttggctagc caattaagtt ggagttggat aagtcagtta
40560agttgtatta gttgacaacg ttggcaaacc gatcaggttg aaagaaaact
taattggcca 40620gagataaata gcttctcgat gccaagtcag tggctgaggg
ctaaatctgg acattgatgc 40680acataaagac cggcatgtac ttagccacta
tgctcaatga aatgtgcagg agtcgtataa 40740gagactcgta tatatcgctc
tgttagaaga acagggcgcc aacgcctgtt tcctagcaat 40800tgttatgact
tacttttccg tgaacagtct tatcactggc tgagtaaggg agtagtgaac
40860tatacatagg taaaggcgta gcttgttctt actaatcgta tgacatttaa
cgtacgttat 40920tcgttattat aatgaacata taatcataca atactatatt
tggagtttga acatgactaa 40980acctgtaatc ggtttcattg gcctaggtct
tatgggcggc aacatggttg aaaacctaca 41040aaagcgcggc taccacgtaa
acgtaatgga tctaagcgct gaagctgttg ctcgcgtaac 41100agatcgcggc
aacgcaactg cattcacttc tgctaaagaa ctagctgctg caagtgacat
41160cgttcagttt tgtctgacaa cttctgctgt tgttgaaaaa atcgtttacg
gcgaagacgg 41220cgttctagcg ggcatcaaag aaggcgcagt actagtagac
ttcggtactt ctatccctgc 41280ttctactaag aaaatcggcg cagctcttgc
tgaaaaaggc gcgggcatga tcgacgcacc 41340tctaggtcgt actcctgcac
acgctaaaga tggtcttctg aacatcatgg ctgctggcga 41400catggaaact
ttcaacaaag ttaaacctgt tcttgaagag caaggcgaaa acgtattcca
41460cctaggggct ctaggttctg gtcacgtgac taagcttgta aacaacttca
tgggtatgac 41520gactgttgcg actatgtctc aagctttcgc tgttgctcaa
cgcgctggtg ttgatggcca 41580acaactgttt gacatcatgt ctgcaggtcc
atctaactct ccgttcatgc aattctgtaa 41640gttctacgcg gtagacggcg
aagagaagct aggtttctct gttgctaacg caaacaaaga 41700ccttggttac
ttccttgcac tttgtgaaga gctaggtact gagtctctaa tcgctcaagg
41760tactgcaaca agcctacaag ctgctgttga tgcaggcatg ggtaacaacg
acgtaccagt 41820aatcttcgac tacttcgcta aactagagaa gtaatcgacg
tacgacctcg ctagggtatt 41880gcttgtcttc taggcggcga tacctcagcg
aggttcgttt ttatctgcca tacccaaccc 41940tttgttccct tgttaaaatc
ttctacttct acttctactt ctacttcaat ttcctcagtt 42000acacctaatc
aaaactctgt ttaactctgt tactgcctca attcctattt ttttctatat
42060ctatttctaa cggtaaattc aaaaccttct agcaccaact cattcactca
tttttcctcg 42120caagctcaaa ctcaacgcgc ttacatgatt gttggtgatg
gcttaacacc gctcgtatat 42180cggtcctgaa aagaaagtaa aaaaaaagcc
cacacagctg gtgactgtat gggcatgttc 42240ggacgagccg tctggacaaa
caaatgagca atagtaagtg aaaaaacgaa taacgagatc 42300ccccgacagt
ttctacgtta aacgcgttca atgaccttaa agcggctgct tcaattatca
42360ctttgaattg aacaaaagca tccagaaaga acttaagtta tgattcaaat
acaccatagt 42420acaagactta ttgtattaca aataaatttt aagattgaat
gcctttagtg aatggttagt 42480tggtagaagt gtgagttaag actcattttt
tcactcagct gggtgaggta aagaagaaga 42540gttttcgaaa agatgttatc
ggaaaaatga tgagctaatt atctaaaaat cgatctattt 42600taatgtgtta
tgcgtcaatg tttaacttcg aacaaaatcc aaactcataa atgataccta
42660tgtcacaggg cggttttagc cagttttaat atatcaagat cgctcacaga
atgtctggtc 42720aattaaacat acaatattaa ttaagttgat ggttgtgacg
atggatcggc atgaacaagt 42780ttcgctttcc gtatcttcga aaatgtaaaa
aatggccatt tcattcggat gaaaataata 42840gacataggtt gatatggatg
atgagtttta tgaattcaaa attgtctcta gggtttaaag 42900gaaaattgat
tttaatggta gcggtcgtca gttctagtgc tttggcattt acgaactggt
42960ttacgcttaa cttggccact gaacaggtaa accaaacgat ttataacgag
attgatcact 43020cgcttacgat agaaatcaat caaatagaaa gtaccgttca
gcgcaccatc gataccgtta 43080actctgttgc acaagagttc atgaaatccc
cttaccaagt gccgaatgaa gcactcatgc 43140attatgccgc taagcttggt
ggcattgaca agattgtggt gggttttgac gacggccgtt 43200cttatacctc
tcgcccttca gagtctttcc ctaacggtgt tggaataaaa gaaaaataca
43260atccaaccac tcgaccttgg tatcaacaag cgaaattgaa atcaggctta
tcttttagtg 43320gtctgttttt cactaagagt actcaagtgc ctatgatcgg
tgtgacctac tcataccaag 43380atcgtgtcat catggccgat atacgctttg
acgatttgga aacgcagctt gaacagctgg 43440acagcatcta cgaagccaaa
ggcattatca tcgacgaaaa ggggatggtg gtcgcttcaa 43500caatcgaaaa
cgtgcttccg caaaccaata tatcttctgc agacactcaa atgaaactca
43560acagtgccat tgaacagcct gatcaattca ttgagggtgt gattgatggt
aaccagagaa 43620tcttgatggc caagaaagtg gatattggca gccagaaaga
gtggttcatg atctccagta 43680ttgaccctga actcgcgctc aatcagctga
atggcgtgat gtcgagtgcg cgcatcctta 43740tcgtcgcttg tgtacttggc
tcggtgatat tgatgatttt acttctgaat cgtttctacc 43800gcccaatcgt
gtcactgcgc aaaatcgtcc acgatctatc acaaggtaac ggagacctca
43860ctcaaaggct tgctgagaag gggaatgatg acttagggca tatcgccaaa
gacatcaact 43920tgttcattat cggcttacaa gagatggtta aggatgtgaa
atacaagaac tcggatctcg 43980ataccaaggt actgagtatt cgcgaaggtt
gtaaagaaac cagcgatgta ctgaaagttc 44040atactgatga aacggttcaa
gtggtctctg cgattaacgg cttgtctgaa gcatcaaacg 44100aagtagagaa
gagttctcag tcggcggcag aagcagcaag agaggccgct gtgttcagtg
44160atgagacgaa acagattaac acggtgacgg aaacctatat cagtgatctt
gagaagcaag 44220tctgcaccac ttctgatgac attcgctcaa tggccaatga
aacgcagagc atccagtcta 44280tcgtgtctgt gattggcgga attgcggaac
aaactaattt gctggcattg aatgcgtcaa 44340ttgaagcggc gagggcgggt
gaacatggtc gaggtttcgc ggtggttgct gatgaagtcc 44400gtgcgctagc
caaccgaacg caaatcagta cctctgaaat tgatgaagcg ttatctggct
44460tgcagtctaa atcagatggt ttggttaaat ctattgagtt gaccaaaagt
aactgtgaac 44520tgactcgcgc tcaagttgtt caagctgtaa acatgttggc
gaagctaacc gagcagatgg 44580aaacagtaag tcgttttaat aatgacattt
cgggttcgtc tgttgagcaa aacgccctta 44640ttcagagcat tgctaagaac
atgcataaga ttgaaagctt tgttgaggag cttaataaac 44700taagccaaga
tcagttaact gaatcagcag aaatcaaaac acttaacggt agcgttagtg
44760aattgatgag cagctttaag gtttaatgtt tctaatattt atacctaaaa
atcaacatgt 44820taagtttagt tgttgatctg aaggccactc aataactgtc
gagtttagag tggcttttct 44880gcgttgttct tgagtctaac tctacgtaat
atccgttcat ttcacttcat ttgccgcatc 44940tcacattctg ataaatagac
aattgacata aaatagtaca aatatacatt gtcactctac 45000tcttatggat
aagtgagata aatgtgaata agccaatctt tgtcgtcgta ctcgcttcgc
45060ttacgtatgg ctgcggtgga agcagctcca gtgactctag tgacccttct
gataccaata 45120actcaggagc atcttatggt gttgttgctc cctatgatat
tgccaagtat caaaacatcc 45180tttccagctc agatcttcag gtgtctgatc
ctaatggaga ggagggcaat aaaacctctg 45240aagtcaaaga tggtaacttc
gatggttatg tcagtgatta tttttatgct gacgaagaga 45300cggaaaatct
gatcttcaaa atggcgaact acaagatgcg ctctgaagtt cgtgaaggag
45360aaaacttcga tatcaatgaa gcaggcgtaa gacgcagtct acatgcggaa
ataagcctac 45420ctgatattga gcatgtaatg gcgagttctc ccgcagatca
cgatgaagtg accgtgctac 45480agatccacaa taaaggtaca gacgagagtg
gcacgggtta tatccctcat ccgctattgc 45540gtgtggtttg ggagcaagaa
cgagatggcc tcacaggtca ctactgggca gtcatgaaaa 45600ataatgccat
tgactgtagc agtgccgctg actcttcgga ttgttatgcc acttcatata
45660atcgctacga tttgggagag gcggatctcg ataacttcac caagtttgat
ctttctgttt 45720atgaaaatac cctttcgatc aaagtgaacg atgaagttaa
agtcgacgaa gacatcacct 45780actggcagca tctactgagt tactttaaag
cgggtatcta caatcaattt gaaaatggtg 45840aagccacggc tcactttcag
gcactgcgat acaccaccac acaggtcaac ggctcaaacg 45900attgggatat
taatgattgg aagttgacga ttcctgcgag taaagacact tggtatggaa
45960gtgggggtga cagtgcggct gaactagaac ctgagcgctg cgaatcgagc
aaagaccttc 46020tcgccaacga cagtgatgtc tacgacagcg atattggtct
ttcttatttc aataccgatg 46080aagggagagt gcactttaga gcggatatgg
gatatggcac ctctaccgaa aattctagct 46140atattcgctc tgagctcagg
gagttgtatc aaagcagtgt tcaaccggat tgtagcacca 46200gcgatgaaga
tacaagttgg tatttggacg acactagaac gaacgctacc agtcacgagt
46260taaccgcaag cttacgaatt gaagactacc cgaacattaa taaccaagac
ccgaaagtgg 46320tgcttgggca aatacacggt tggaagatca atcaagcatt
ggtgaagttg ttatgggaag 46380gcgagagtaa gccagtaaga gtgatactga
actctgattt tgagcgcaac aaccaagact 46440gtaaccattg tgacccgttc
agtgtcgagt taggtactta ttcggcaagt gaagagtggc 46500gatatacgat
tcgagccaat caagacggta tctacttagc gactcatgat ttagatggaa
46560ctaatacggt ttctcattta atcccttggg gacaagatta cacagataaa
gatggggaca 46620cggtctcgtt gacgtcagat tggacatcga cagacatcgc
tttctatttc aaagcgggca 46680tctacccaca atttaagcct gatagcgact
atgcgggtga agtgtttgat gtgagcttta 46740gttctctaag agcagagcat
aactgagttc tctgatgttt ggttagccat gtcggtaatg 46800aagaagacca
tattgatgcc tacaatgtgg tctttttttg tttttggaca cttacagtga
46860tgtgttttga aggacaaatg ttctgctcga atcatgcaaa tacacacgat
tacagctcgc 46920ttgttctgcc cttgctagct catttcgcat tccaaattct
tatatattgt cttttatcaa 46980taggaaatgt gatccagtta aagtatggaa
aaatcggaaa gtgttcctag tctcatttat 47040ccaacgaagt gttttatttg
tattataaga ttacgtaata ttttcgtgtt atcgcaaata 47100ctgataggtg
aatcgcctta tagctcgtgt ttgctgattt agctttcact tacgaacgct
47160gtctttgtat tataataatg gattaaatat gaaacaaatt actctaaaaa
ctttactcgc 47220ttcttctatt ctacttgcgg ttggttgtgc gagcacgagc
acgcctactg ctgattttcc 47280aaataacaaa gaaactggtg aagcgcttct
gacgccagtt gctgtttccg ctagtagcca 47340tgatggtaac ggacctgatc
gtctcgttga ccaagaccta actacacgtt ggtcatctgc 47400gggtgacggc
gagtgggcaa cgctagacta tggttcagta caggagtttg acgcggttca
47460ggcatctttc agtaaaggta atcagcgcca atctaaattt gatatccaag
tgagtgttga 47520tggcgaaagc tggacaacgg tactagaaaa ccaactaagc
tcaggtaaag cgatcggcct 47580agagcgtttc caatttgagc cagtagtgca
agcacgctac gtaagatacg ttggtcacgg 47640taacaccaaa aacggttgga
acagtgtgac tggattagcg gcggttaact gtagcattaa 47700cgcatgtcct
gctagccata tcatcacttc agacgtggtt gcagcagaag ccgtgattat
47760tgctgaaatg aaagcggcag aaaaagcacg taaagatgcg cgcaaagatc
tacgctctgg 47820taacttcggt gtagcagcgg tttacccttg tgagacgacc
gttgaatgtg acactcgcag 47880tgcacttcca gttccgacag gcctgccagc
gacaccagtt gcaggtaact cgccaagcga 47940aaactttgac atgacgcatt
ggtacctatc tcaaccattt gaccatgaca aaaatggcaa 48000acctgatgat
gtgtctgagt ggaaccttgc aaacggttac caacaccctg aaatcttcta
48060cacagctgat
gacggcggcc tagtattcaa agcttacgtg aaaggtgtac gtacctctaa
48120aaacactaag tacgcgcgta cagagcttcg tgaaatgatg cgtcgtggtg
atcagtctat 48180tagcactaaa ggtgttaata agaataactg ggtattctca
agcgctcctg aatctgactt 48240agagtcggca gcgggtattg acggcgttct
agaagcgacg ttgaaaatcg accatgcaac 48300aacgacgggt aatgcgaatg
aagtaggtcg ctttatcatt ggtcagattc acgatcaaaa 48360cgatgaacca
attcgtttgt actaccgtaa actgccaaac caagaaacgg gtgcggttta
48420cttcgcacat gaaagccaag acgcaactaa agaggacttc taccctctag
tgggcgacat 48480gacggctgaa gtgggtgacg atggtatcgc gcttggcgaa
gtgttcagct accgtattga 48540cgttaaaggc aacacgatga ctgtaacgct
aatacgtgaa ggcaaagacg atgttgtaca 48600agtggttgat atgagcaaca
gcggctacga cgcaggcggc aagtacatgt acttcaaagc 48660cggtgtttac
aaccaaaaca tcagcggcga cctagacgat tactcacaag cgactttcta
48720tcagctagat gtatcgcacg atcaatacaa aaagtaatct aatcgaataa
cacttaatat 48780taaaggtatt gcaatagcct ccagccttag ggtttggagg
cttttttgtg cctgctgttg 48840gttgggctta agcgtatgat ttaattgagt
aggagagggg tagttatcag ttgcacagag 48900tttaagacat tatcattaag
ctcattcagt attaacttta gtcattatca gtcactatta 48960ccccccaagc
gccgatcaca attaacctag ctcatgatta atctcagtta ccaataggct
49020agcctgtagc ggattcaaac ccaaataatg tcgtgatgtt tatcggaatc
accatagctc 49080gaaaactttg accttgttct caaggctttg ccaatgcacg
aacgtattat gtgcgtggtt 49140tactaataag cgttagctcg gctgactact
catactgttc ttgaaaccgt tactcttggg 49200ttgtttagct agactcctag
caacagccat aaatagtgct ctaactcttt cataattaga 49260agggtagggt
tagccattct attggttcca atgctttatg aaatactagg cgggctcaag
49320tcgatgatca aacgactcta acagcttaag gttatgcgct tttgcgttag
ttacctgcag 49380gccgtaaatg ccctgattgt agttgtacgg tgacgctgaa
taataatttg taggattagt 49440atagaactga gagactttgt ctatctatga
tcgatacagg ctttgagagg gctggatcag 49500tagaaagaca gaatgacaat
tagcactaga gttattttgg tttttaatta gagttaataa 49560aatagatatt
tggtttgtta aatttaatcg tgtcataagc tctgtgtttt aaaaaataaa
49620aaaagccata gcagttgcta tggctttgaa taagtcaggt tctaaggtaa
gcaaacagca 49680agtcaacttg tctgttttga tattcttagt cttagttcaa
gatattttct ttacctgccg 49740cagtgttcac tgcagatggt tgtgcgtaga
tggctgtatt tcttatatct ttaccgttgt 49800cagctgaaag tagggttttg
taaccattga acgtattgct ttcaatagtg acattacagc 49860caaggttatc
atcactattc tttatgacgg cactgctaga atcaccaact ttaattccgc
49920ccacatcact accaccagag ttaatagtga atgtattatt tgtgatttgt
gaaccaaatc 49980gcccgttgtc actactacag ttaatacgaa ttgcattatt
ttggaaactg ccacttaaac 50040cgacaaattc gctattgtct aatgtaaagt
aacctcgaga gaataaccaa cttgcttttt 50100tagtacctag atcatcttcg
gtaatgccgt ttgcatcgaa ctttaggttt tcaagtgcta 50160caggatctga
atctttacca attttaccaa taacgatcgc accagtttca ttatctgaag
50220taccagctga ctccctacca aaacacccgg ccaaattgtc gttagcaaaa
gtcatgtttt 50280tgatacctgc accgggtgca gtgacatcaa tacaagcatc
tccggtaatg gttgctaaac 50340cagcaccatc aattgtgaca gctttattta
gctcaataac accggtatca aacgtacctt 50400cagatgataa atcaataatc
gcgccatctt ctgctgatgc aatcgcagcg ttcacatcat 50460cgactgattt
aggttttcca tcatcggcat tttctaatgc agttataaca gattcatctg
50520taatctctgt tgctgagtaa gctgtttcta cgtcatcttt ttcacaggtc
atatctagct 50580gttgctcagt cactgtgtaa gtacagtcct taccttcaaa
ggaaacaaca ccttcttctt 50640cagcagtata tattgaactc tcaaaagtga
agccattagc tacgtctcca ctgtaaatgc 50700ttagagcacg agtaccttcc
tcattattat caaagcgata tggtgaagtt ccgctgagtg 50760actgtgcagc
aacagcacca cctgtcagat cccaatagac gttttctata gagtaaactt
50820caacaggttc aacagggtct gttccgcctg gatctgttgg aataggtaaa
ccatcactgt 50880tacaaccaaa taataaacct gtcgaaagag cgacagctgt
agcgacttta gaaatttgca 50940taaaatattc tctttatgat attaaatcca
tatgtaaatc acataagaaa tagataatga 51000atagtcgtta aatatttatt
aggatgaagc taattctgat tagaacatcc tattatttaa 51060aataaagtaa
ttaaaatatg cccaaataaa ttacaagagg agagggctat tttatatttt
51120gactatttta ttattagaat gagtaagcaa taccaacacg gtatttagct
tcacgatctt 51180ttgaagatga actgatatca gaagaccaga tttcagcaaa
aggtttccaa gaaccgaact 51240tgtagtttac ctttaagcca gcatcccatt
cccagtcatc actattataa agaagtacgt 51300tatccaaaga ttttacatag
tttgcttcgt aagaaaggcc aagcttaggt agagattcaa 51360ttttgtaaga
gcccgtcagt gtaactttag acttttgagc tgattctaaa cgctcgccag
51420tttcagaatc tttgtcgcca aattgtgtgt ggttacggaa gtcagcatat
tcatgacggt 51480aacgaatagc agttgttaaa cccatatctg ctttatagcc
aacgcggaac tgaggtttaa 51540acgtaacctt tttcatcttc cagtcgccat
cgttagcatt aggctcatcc caatcccaag 51600caataggcat acccatttgt
agataccaat tattgtctat tttgtatgtc gcagtgttat 51660cgatctccat
accatagatg taccaattgc catcgtaaaa actctggctg tttgctgatt
51720taacagaacc tgaatcttca tcatagtaag agtcatcacc gtggaactta
agttctagac 51780cagtagagtg cttccacttg tctgacagct taaagctttc
acctagctta actcggtgtt 51840gatggcgagc gtctacgtga gccgtatcac
cattagtctt tgtataatcc gtcgcagcac 51900gatactcgta acgataatca
agagatgcac cagcagctgt gcccgctaaa agagtacatg 51960caacagctgc
agcaattttt gtaacagaat tcataccttt gtctcactat tattttttta
52020ttttggatac atccaatgta cccctgactc acaaaccaat accttacatg
gtattaaatt 52080aatgtatgac aaatatggta tttattccta gggtagattt
ctgtgagatc tatcaaaagt 52140tccgactaat ggcctattta tatagctaaa
tgttatgaat atctcaattt aaggcttacc 52200aatcaaatca atcatgactc
agttctcata ttaacaaacc ttgtaagctc agttggttgt 52260atgtgttaaa
ataatacaaa tataagaata ttcccacact ttcatatcga tgttctagtt
52320gttgtggttt aaacataacg gcgcatgttg agggatatag atataaacca
ccgccaaatg 52380tttggtaaaa gttaaaagat ggcgaaatgt aaattctatt
tattggttgg tttatttaag 52440tcgaagagaa aatatttagt actaattcgt
gttcaaaagt agtttctgtg ctgagagtgt 52500actcagtatc tgttaacaat
aaaggatgag tcatgtttaa gaaaaacata ttagcagtgg 52560cgttattagc
gactgtgcca atggttactt tcgcaaataa cggtgtttct taccccgtac
52620ctgccgataa attcgatatg cataattgga aaataaccat accttcagat
attaatgaag 52680atggtcgcgt tgatgaaata gaaggggtcg ctatgatgag
ctactcacat agtgatttct 52740tccatcttga taaagacggc aaccttgtat
ttgaagtgca gaaccaagcg attacgacga 52800aaaactcgaa gaatgcgcgt
tctgagttac gccagatgcc aagaggcgca gatttctcta 52860tcgatacggc
tgataaagga aaccagtggg cactgtcgag tcacccagcg gctagtgaat
52920acagtgctgt gggcggaaca ttagaagcga cattaaaagt gaatcacgtc
tcagttaacg 52980ctaagttccc agaaaaatac ccagctcatt ctgttgtggt
tggtcagatt catgctaaaa 53040aacacaacga gctaatcaaa gctggaaccg
gttatgggca tggtaatgaa ccactaaaga 53100tcttctataa gaagtttcct
gaccaagaaa tgggttcagt attctggaac tatgaacgta 53160acctagagaa
aaaagatcct aaccgtgccg atatcgctta tccagtgtgg ggtaacacgt
53220gggaaaaccc tgcagagccg ggtgaagccg gtattgctct tggtgaagag
tttagctaca 53280aagtggaagt gaaaggcacc atgatgtacc taacgtttga
aaccgagcgt cacgataccg 53340ttaagtatga aatcgacctg agtaagggca
tcgatgaact tgactcacca acgggctatg 53400ctgaagatga tttttactac
aaagcgggcg catacggcca atgtagcgtg agcgattctc 53460accctgtatg
ggggcctggt tgtggcggta ctggcgattt cgctgtcgat aaaaagaatg
53520gcgattacaa cagtgtgact ttctctgcgc ttaagttaaa cggtaaatag
cacatagcat 53580aaccaatagt ctagctagac gcagtcctta aggaatattt
tcgaagacca cttaaccgaa 53640tgttgagtgg tctttttgtt ttatatgagt
tttaagatga acttggtatt aatgtgacct 53700tggtatcaat gagggtgtac
gtgaagccta ccaatgaaag gtacagctaa aacaatacaa 53760ccttgtcaaa
agacaaggtt gcattcagaa agcgtaggaa gattttagga cgacaactcg
53820atacggagtt tagtcataca tcaactcttt ggctttgtcg gcatcaaact
ctttaagaga 53880ctttcgagcc aagtgacgga atgggaaagc tttcacgact
tcttcgaatg gttggatggc 53940aaatgcccaa aagatagaac cgtctaatcc
aaagatgatc aatgcacaca atggaattga 54000aattacccat tgaccagtaa
agttgatttt gaagactgcg gtcgtttttc ctagggctct 54060taatacattc
ccatgaaccg 54080
3 22890 DNA Vibrio splendidus 3
gtgctttgtg acaacggggg
atgtatggat attgaagttt cgcgccaggt tgcggtagtt 60gaagctacga gtggagatgt
cgtcgtagtt aagccagacg gcagcgcaag aaaagtttca 120gttggcgata
ccatccgtga aaatgagatc gtgattacgg ccaacaagtc agagcttgta
180ttaggcgttc agaatgattc gattccggtt gcagagaatt gcgtcggttg
tgttgatgaa 240aacgctgcat gggtagatgc cccaatagct ggtgaggtta
attttgactt acagcaagca 300gacgcagaaa ccttcactga agacgacctt
gctgcaattc aagaagccat tttaggtggt 360gccgatccga ctcaaatctt
agaagcaacg gctgctggtg gcggactagg ttctgcaaat 420gctggctttg
tgacgattga ctataactac actgaaactc atccatcgac tttctttgag
480accgctggtc tagcagaaca aactgttgat gaagacagag aagaattcag
atctatcact 540cgttcatcag gtggccaatc aatcagtgaa acactgactg
aaggctccat atctggcaat 600acctatcccc aatctgtaac aacgacagaa
acgattattg ctggtagttt agctctcgcc 660cctaactctt tcattccaga
aactttatcc ctcgcttcac tacttagtga attaaacagc 720gacattactt
caagtggtca gtccgttatc ttcacctatg acgcgacgac taattctatc
780gttggtgttc aagataccga cgaagtatta cgtatcgaca ttgatgccgt
cagtgttggc 840aataacattg agctttctct aaccacaacg atttcccagc
cgattgatca tgtaccgtcg 900gttggcggtg gtcaggtttc ttacactggc
gatcaaatag atattgcctt tgatattcaa 960ggtgaagaca ccgctgggaa
cccgctagca acacccgtta acgcacaagt ttcagtgttt 1020gacgggatag
atccgtctgt tgaaagtgtc aatatcacta acgttgaaac tagcagcgcg
1080gcaatcgaag ggacgttctc aaatattggt agtgataacc ttcaatcagc
cgtatttgat 1140gcaagtgcac tggaccagtt tgatgggttg ctcagtgata
atcaaaacac gcttgcgaga 1200ctttctgatg atggaacaac gattactctg
tccatccaag gtcgaggtga ggttgttctc 1260actatctctc tagataccga
tggcacctat aaattcgagc agtctaatcc gatagaacaa 1320gtgggtaccg
attcactgac gttcgctttg ccaatcacga ttaccgattt tgaccaagat
1380gttgtaacca atacgatcaa cattgccatt actgatggcg atagccctgt
tattactaat 1440gttgacagta ttgatgttga tgaagcgggc attgttggcg
gctcacaaga gggcacggcg 1500ccagtgtctg gcactggcgg tatcaccgcg
gacatttttg aaagtgacat cattgaccat 1560tatgagctag aacccactga
atttaatact aatggcacct tggtttcaaa tggcgaggct 1620gtgctacttg
agttgattga tgaaaccaac ggtgtaagaa cttacgaagg ttatgttgag
1680gtcaatggtt cgagaattac ggtctttgac gttaaaattg atagcccttc
attgggcaac 1740tatgagttta atctttatga agaactttct catcaaggcg
ctgaagatgc gctgttaact 1800tttgcattgc caatttatgc tgttgatgca
gatggcgacc gttctgcact gtctggaggt 1860tcgaacacac cagaagctgc
tgagatcctc gttaatgtta aagacgatgt cgttgaatta 1920gttgataagg
ttgaatcagt caccgagccg accttagcgg gcgatactat tgtttcgtat
1980aacctgttca attttgaagg cgcagatggt tctacaattc aatcgtttaa
ctacgacggt 2040gttgattact cactcgatca aagcctgctc cccgatgcta
cccagatttt cagttttact 2100gaaggtgtcg tcactatctc attaaacggt
gacttcagtt ttgaagtcgc tcgtgatatc 2160gaccactcaa gcagtgaaac
tatcgtcaaa cagttctcat ttttagccga agatggtgat 2220ggggatactg
atagttcgac gcttgagtta agtattaccg atggccaaga tccgatcatt
2280gatttgatcc cgcctgtgac tctctctgaa accaacctta atgacggctc
tgctcccagc 2340ggaagtacag ttagcgcaac cgagacgatt acctttaccg
caggcagcga cgatgtagca 2400agtttccgta ttgaaccaac agagtttaat
gtgggcggtg cacttaaatc gaatggattt 2460tcggttgaga taaaagaaga
ttcggctaat ccgggtactt acattggctt tattaccaac 2520ggttcgggcg
ctgaaatccc agtgtttacg attgctttct ctacgagcac attgggtgaa
2580tacaccttta ctctgcttga agcgttagac catgtagatg gtttagataa
gaacgatctg 2640agctttgatc tgcctattta tgcggttgat acggacggcg
acgattcatt ggtgtctcag 2700cttaatgtga ctatcggtga tgatgttcaa
atcatgcaag acggtacgtt agatatcacc 2760gagccaaatc ttgctgacgg
tacaatcaca accaacacca ttgatgtaat gccaaatcaa 2820agtgctgatg
gcgcgacgat cactcggttc acttatgacg gtgtcgtaaa cacactggat
2880caaagtattt caggagaaca gcagttcagc ttcacagaag gcgaactgtt
tatcaccctt 2940gaaggtgaag tgcgctttga gcctaatcgc gatctagacc
actcagtgag tgaagatatc 3000gtgaagtcga ttgtggtgac ttcaagcgac
ttcgataacg atccggtgac ttcaaccatt 3060acgctgacga tcactgatgg
tgataacccg acgattgatg ttattccaag tgttacgctt 3120tctgaaatta
acctgagcga tggctctgct ccaagtggca gcgcggtaag ctcgactcaa
3180actattactt ttaccaatca aagtgatgat gtggttcgtt tccgtattga
gtcaacggag 3240ttcaatacta acgatgatct taaatcgaac ggtttagctg
ttgagttacg tgaagacccg 3300gcagggtcgg gtgactacat tggttttacg
accagtgcga cgaacgtaga aactccagta 3360ttcacattaa gctttaattc
tggatcatta ggtgaataca cgttcacact catcgaagcg 3420ttggaccacc
aagatgcccg tggcaacaac gacctcagtt ttgatttacc tgtttacgcg
3480gtagatagtg atggcgatga ttcattggtg tctccgttaa acgtcactat
cggtgatgat 3540gttcaaatca tgcaagatag tacgttagat atcgtcgagc
caaccgtcgc agatttggcc 3600gctggcacag tgacaactaa caccattgat
gtgatgccaa atcaaagtgc cgatggcgca 3660acggtgacgc aattcactta
tgatggccag cttcgaacac ttgaccaaaa tgacaatggt 3720gagcagcaat
ttagcttcac agaaggtgaa ctgttcatca cgcttcaagg tgatgtgcgc
3780tttgagccta atcgtaatct agaccacaca ctcagcgaag acatcgtgaa
atcaatcgtg 3840gtgacatcta gcgattccga taacgatgtg ttgacctcaa
ccgtcactct gaccattacc 3900gatggtgata tcccaaccat tgataatgtt
ccaactgtga acttgtctga aactaatctg 3960agtgatggct ctgcacctag
cggaagcgcg gtgagttcaa ctcaaactat tacttacacc 4020actcaaagtg
atgatgtgac aagcttccgt attgaaccga ctgaatttaa tgttggtggc
4080gctctcacat caaacggatt ggcagtcgag ttaaaagctg atccaaccac
accgggtggc 4140tacatcggtt ttgtgactga tggttcgaac gttgaaacta
acgtgttcac gattagcttc 4200tcagatacca atttaggcca gtacaccttc
accttacttg aagcgttaga ccatgtggat 4260ggtttagcga acaatgatct
gacctttgat ctgcctgttt atgcagttga tagcgatggc 4320gacgattcac
tggtgtctca gttaaatgta accatcggtg atgatgttca aatcatgcaa
4380ggtggtacgt tagatatcac tgagccaaat cttgcagacg gcacaattac
aaccaatacc 4440atcgatgtga tgccagagca aagcgccgat ggtgcgacga
tcactcagtt cacttatgac 4500ggtcaagttc gaacactgga tcaaacggac
aatggtgagc agcaatttag cttcactgaa 4560ggcgagttgt tcatcactct
tcaaggtgac gtgcgtttcg aacccaatcg caacctagat 4620cacacagcta
gcgaagatat cgtgaagtcg atagtggtga cttcaagcga tttagataac
4680gatgtggtga cgtcaacggt cactctgacg attactgatg gtgatatccc
aaccattgat 4740gcagtgccaa gcgttactct gtctgaaatc aatcttagtg
acggctctgc gccaagtggc 4800actgcagtta gtcaaactga gacgattacc
ttcaccaatc aaagtgatga tgtgaccagt 4860ttccgtattg agccaataga
gttcaatgtg ggcggtgcac tgaaatcgaa tggatttgcg 4920gttgagataa
aagaagattc ggctaatccg ggtacttaca ttggctttat taccaacggt
4980tcgggcgctg aaatcccagt gtttacgatt gctttctcta cgagctcatt
gggtgaatac 5040acctttactc tgcttgaagc gttagaccat gtagatggtt
tagataagaa cgatctgagc 5100ttcgatctgc ctgtttatgc ggtcgatacg
gacggcgatg attcattggt gtctcagcta 5160aacgtgacca tcggtgatga
tgtccaaatc atgcaagacg gtacgttaga tatcatcgag 5220ccaaatctgg
ctgatggaac aatcacaacc agcactattg atgtgatgcc aaaccaaagt
5280gctgatggtg cgacgatcac tcagtttact tatgacggtc agctaagaac
gcttgatcaa 5340aatgacactg gcgaacagca gttcagcttc acagaaggcg
agttgtttat cacccttgaa 5400ggtgaagtgc gctttgagcc aaaccgagac
ctagaccaca ccgcgagtga agatattgtt 5460aagtcgattg tggtcacttc
aagtgatttc gataacgact ctctgacttc taccgtaacg 5520ctgaccatta
ctgatggtga taaccctacg atcgacgtca ttccaagcgt taccctttct
5580gaaactaatc tgagtgatgg ctctgctcca agtggcagcg cggtaagctc
gactcaaact 5640attactttta ccaatcaaag tgatgatgtg gttcgtttcc
gtattgagcc aacggagttc 5700aatactaacg atgatcttaa atcgaacggt
ttagccgttg agttacgtga agacccggct 5760gggtcgggtg actacattgg
ttttactact agtgcgacga atgtcgaaac cacggtattt 5820acgctgagtt
tttctagcac cacattaggt gaatatacct tcactttgct tgaagcgttg
5880gaccaccaag atgcccgtgg caacaacgac ctcagttttg aactgcctgt
ttatgcggta 5940gacagtgatg gcgatgattc actgatgtct ccgttaaacg
tcaccatcgg cgatgatgtt 6000caaatcatgc aagacggtac gttagatatc
gtcgagccaa ccgtcgcaga tttggccgct 6060ggcattgtga caactaacac
cattgatgtg atgccaaatc aaagtgccga tggcgcgacg 6120atcactcaat
tcacttatga tggccaactt cgaacacttg accaaaatga caatggcgaa
6180caacagttta gcttcacgga aggtgaacta ttcatcactc ttgaaggtga
agtgcgcttt 6240gagcctaatc gtaatctaga ccacacgctg aacgaagaca
tcgtgaaatc gatcgtggtg 6300acgtctagtg actccgataa cgatgtgttg
acctcaaccg tcactctgac cattaccgat 6360ggtgatatcc caaccattga
taatgtgcca acagtgagct tgtcagaaac aagtctgagt 6420gacggctctt
caccaagtgg cagcgcagtt agctcaactc aaaccatcac ttacaccact
6480caaagtgatg atgtaaccag cttccgtatt gaaccgactg agttcaatgt
tggcggtgct 6540ctcaaatcaa atggattggc ggttgagctg aaggccgatc
caaccactcc gggcggctac 6600atcggctttg tgactgatgg ttcgaacgtt
gaaactaacg tgttcacgat tagcttctcg 6660gataccaatt taggtcaata
caccttcacc ttgcttgaag cgttggatca tgcggatagc 6720cttgcaaata
acgatctgag ctttgatctg ccagtctacg ccgtcgatag tgatggcgat
6780gattcactgg tgtctcaact caatgtaacc atcggtgatg atgttcaaat
catgcaaggt 6840ggtacgttag atatcactga gccaaacctt gcagacggca
caaccacaac taacaccatc 6900gatgtgatgc cagaacaaag tgccgatggt
gcgacgatca ctcagtttac gtatgacggg 6960caagttcgca ctctggatca
aactgacaat ggtgagcagc aatttagctt cactgaaggc 7020gagttgttca
tcactcttca aggtgacgtg cgtttcgaac ccaatcgcaa cctagatcac
7080acagctagcg aagacatcgt gaagtcgata gtggtgactt caagcgattc
agataacgat 7140gtggtgacgt caacggtcac tctgactatt actgatggtg
atctcccaac cattgatgca 7200gtgccaagcg ttactctgtc tgaaactaat
cttagtgacg gctctgcgcc aagtggcagc 7260gcagtcagtc aaactgagac
catcaccttt accaatcaaa gtgatgatgt ggcgagtttc 7320cgtattgagc
caaccgagtt taatgtgggc ggtgcactga aatcgaatgg gtttgcggtt
7380gagataaaag aagactctgc taatccgggt acttacattg gctttattgc
caatggttcg 7440agcgctgaaa tcccagtgtt cacgattgct ttctctacga
gtacgttggg tgaatacacc 7500tttactctgc ttgaagcgtt agaccatgcg
gatggtttag ataagaacga tctgagcttt 7560gagcttccgg tttacgcggt
tgatacagac ggtgatgatt cattggtatc tcagcttaat 7620gtgaccattg
gtgatgatgt tcaaatcatg caagatggta cgttagacgt tatcgagcca
7680aatcttgcag acggcacaat cacaaccaac accattgatg tgatgcccga
gcaaagtgct 7740gatggtgcga cgatcactca gtttacttat gacggtcagc
taagaacgct tgatcaaaat 7800gacactggtg aacagcagtt cagcttcaca
gaaggcgagt tgtttatcac ccttgaaggt 7860gaagtgcgct ttgaacctaa
tcgcgatcta gaccattccg ttagcgaaga catcgtgaag 7920tcgatagtag
tgacttcaag cgacttcgat aacgatccgg tgacttcagc cattacgctg
7980accattactg atggtgataa tccgactatc gattcggtac cgagcgttgt
acttgaagaa 8040gctgatttaa ctgatggctc atcgccaagt ggcagcgcgg
ttagtcaaac ggaaaccatc 8100actttcacta atcaaagtga cgatgttgag
aaattccgtt tagaaccaag tgaatttaat 8160actaacaacg cgctcaagtc
cgatggcttg atcattgaga ttcgagagga accaacagga 8220tccggcaatt
atattggttt cacgaccgat atttcgaatg tcgaaaccac tgtgtttaca
8280ctcgatttca gcagtaccac tttgggtgag tacaccttca cgcttctgga
agcgattgac 8340cacacgcctg ttcaaggcaa taacgatcta acattcaact
tgccagtcta cgcggttgat 8400agcgacggtg atgattcgct aatgtcatca
ctatcggtga cgattactga tgatgttcaa 8460gtgatggtga gtggttcgct
tagtatcgaa gagcctactg ttgccgactt ggctgcaggc 8520acgccaacaa
catcagtatt tgatgtatta acatccgcga gtgctgatgg ggcgaccatt
8580actcagttca cttatgatgg tggggcggta ttaacgcttg atcaaaacga
tacaggtgag 8640cagaagttcg tggttgctga tggggcatta tatatcactc
tgcaaggcga tattcgtttc 8700gaaccaagtc gtaaccttga ccatactggt
ggcgatatcg tcaagtcgat agtcgtaact 8760tcaagtgatt ccgatagcga
tcttgtgtct tcaacggtaa cgctaaccat tactgatggc 8820gatatcccaa
cgattgacac ggtgccaagc gttactctgt cagaaacgaa tctgagcgac
8880ggatctgctc cgaatgcaag tgcggtaagt tcaactcaaa ccattacctt
tactaaccaa 8940agtgatgacg tgacgagttt ccgtattgaa
ccgactgatt ttaatgttgg tggtgctctg 9000aaatcgaacg gattggcggt
cgaactgaaa gcggacccaa ctacaccggg tggctacatc 9060ggttttgtga
ctgatggttc gaacgttgaa actaacgtgt ttacgattag cttctcggat
9120accaatttag gtcaatacac cttcaccctg cttgaagcgt tggatcatgt
agatggctta 9180gtgaagaatg atctgacttt tgatcttcct gtttatgcgg
ttgatagcga tggtgatgat 9240tcactggtgt ctcaactgaa tgtgaccatt
ggtgatgatg tacaggtcat gcaaaaccaa 9300gcgcttaata ttattgagcc
aacggttgct gatttggctg caggtactcc gacgacagcc 9360actgttgatg
tgatgcctag ccaaagtgcc gatggcgcga caatcactca gtttacttac
9420gatggcgggg cggcaataac actcgaccaa aacgacaccg gtgaacagaa
gtttgtattt 9480actgaaggtt cactgtttat caccttgcaa ggtgaagtgc
gtttcgagcc aaatcgcaat 9540ctaaaccaca cagcgagcga agacatcgtg
aagtcgattg tggtgacttc aagcgattta 9600gataacgatg tactgacgtc
aacggtcact ctgactatta ctgatggtga tatcccaacc 9660attgatgcag
tgccaagcgt tactctgtct gaaactaatc ttagtgacgg ctcagcgcca
9720agcagcagtg ctgtaagtca aacagagacg attaccttca tcaatcaaag
tgatgatgtg 9780gcgagtttcc gtattgagcc aacagagttc aatgtgggcg
gtgcactgaa atcgaatgga 9840tttgcggttg agataaaaga agattcggct
aatccgggta cttatatcgg ttttattacc 9900gatggttcga atactgaagt
tcctgtattc acgattgctt tctctacaag tacgttgggc 9960gaatacacct
tcaccttact tgaagcgcta gaccatgcaa atggcctaga taagaacgat
10020ctgagttttg atcttcctgt ttatgcggta gacagtgatg gcgatgattc
actggtgtct 10080caactgaatg tgaccattgg tgatgatgtc caaataatgc
aagacggtac gttagatatc 10140actgagccaa atcttgcaga cggaacaatc
acaaccaaca ccattgatgt gatgccaaat 10200cagagtgccg atggtgcgac
gatcactgaa ttctcatttg gcggtattgt caaaacactc 10260gatcaaagca
tcgtaggtga gcagcagttt agtttcaccg aaggtgagct attcatcact
10320cttcaaggtc aagtgcgctt tgaaccaaat cgtgaccttg accactctgc
cagcgaagac 10380atcgtgaagt cgatagtggt tacttcaagt gattttgata
acgatcctgt gacttcaacc 10440gttacgctga ccattaccga tggtgatatt
ccaactatcg atgcggtacc aagtgttacg 10500ctttcagaaa caaacctagc
tgatggttct gcgccaagtg gtagtgcggt tagtcaaacg 10560gagacgatta
cttttaccaa tcaaagtgat gatgtggttc gcttccgtct ggaaccaacc
10620gagttcaata ctaacgatgc acttaaatcg aatggcttag cggtcgaact
gcgcgaagaa 10680cctcaaggct ctggtcagta cattggcttt accaccagtt
cgtctaatgt tgagacaaca 10740gtatttacgt tggactttaa ctccggaacc
ttaggtgaat acacatttac tttaatcgaa 10800gctctggatc atcaagatgc
gcgtggcaac aacgatttaa gctttaatct acctgtgtat 10860gcggtggata
gtgatggcga tgactcgtta gtctctcagc ttggcgtgac cattggcgac
10920gatgtgcagt tgatgcaaga cggcacaatc accagtcgtg agcctgcagc
aagtgttgaa 10980acatcaaata cctttgatgt gatgccaaac caaagtgctg
atggagccaa agtcacttca 11040tttgttttcg atggtaagac tgcagaaagt
cttgatttga atgtgaatgg tgaacaagag 11100ttcgtcttca cggaaggttc
ggtatttatt acgacggaag gtgagatacg attcgagccg 11160gtacgtaatc
aaaatcatgc tggtggtgat attaccaagt cgattgaggt gacgtctgtt
11220gacctcgatg gcgatattgt cacatcgaca gtgacactga agattgttga
tggtgacctt 11280cctactatcg accttgttcc cggaattacg ttatctgaag
tggatctggc cgatggctct 11340gtgccaaccg gtaatccagt gacaatgaca
caaaccatta cctacacagc gggtagtgac 11400gacgtaagcc atttcagaat
tgaccctacg cagttcaata cttcaggggt tttgaaatcg 11460aacggcctag
atgtcgaaat aaaagagcag ccagctaatt ctggtaatta cattggcttc
11520gtcaaagacg gttctaacgt agaaaccaac gtcttcacga tcagcttctc
gacgagcaat 11580ttagggcaat acacgttcac actacttgaa gcgttagatc
atgtagatgg attgcaaaac 11640aatatactaa gcttcgatgt ccctgtttta
gcggttgatg cggatggtga tgattctgca 11700atgtcgccta tgacggttgc
gatcaccgat gacgtacaag gtgttcaaga tggcaccttg 11760agtatcactg
agccttcatt agctgatttg gcatcgggta cgccaccaac gacggcaatc
11820attgatgtta tgccaacgca gagtgctgat ggcgcgaaag taacacagtt
tacttacgat 11880ggtggcacag ctgtaacgtt agacccaagc atcgccacag
aacaagtctt taccgtaacc 11940gatggcttac tgtacatcac cattgaaggg
gaggttcgtt ttgagccgag ccgagatcta 12000gaccattcat ctggcgatat
cgtaagaacg attgtcgtca ccaccagtga ttttgataac 12060gatacagata
ccgcggatgt cactttgacg atcaaagacg gtatcaatcc cgttatcaat
12120gtggttccag atgttaactt atcggaagtt aatctagcgg atggctcgac
gccaagtggt 12180tctgcagtca gttcgactca cacaatcact tacaccgaag
gaagtgatga ttttagtcac 12240tttagaattg cgaccaacga attcaatcct
ggcgatctgt tgaaatcaag tggtcttgtt 12300gttcaactaa aagaagatcc
tgcttctgct ggtgattaca ttggttatac cgatgatggt 12360atgggtaacg
ttaccgatgt atttaccatt agctttgata gtgcaaacaa agctcagttt
12420acatttacct tgattgaggc gcttgatcac cttgatggtg tgctttacaa
cgatcttacg 12480ttccgtttgc ctatctatgc tgttgataca gatgattctg
aatcaacaaa gcgcgatgtg 12540gtggttacga tagaagatga catccagcaa
atgcaagatg gcttcttaac cattaccgag 12600ccaaattctg gtactccaac
aacaactacc gttgatgtga tgccaatacc aagtgcagac 12660ggtgcgacta
ttacgcagtt cacgtatgac ggtggttctc caattactct gaatcaaagc
12720atcagcggcg aacaagagtt tgttttcact gaaggttcac tgtttgtgac
actagatggt 12780gatgtaaggt ttgagccaaa tagaaacctt gatcactctg
cgggcgacat tgttaaatcg 12840attgtgttca cgtcttcaga ctttgataac
gacatcttct catcaaaagt cactctcacc 12900attgttgatg gtgatgggcc
aacaatcgac gttgtgccgg gtgtggcatt gtcagaaagc 12960ttacttgcgg
atggttcgac gcctagcgta aatcccgtga gtatgactca aaccattact
13020tcacttgcaa gtagtgatga tattgctgaa atagtggtgg aagtcgggtt
gttcaatacc 13080aacggcgcgt tgaagtcgga tggtttgtca ctgagtttac
gtgaagaccc tgtaaattca 13140ggcgactaca ttgcatttac tactaatggt
tcgggtgttg agaaagttat cttcactctg 13200gattttgatg atacgaatcc
gagtcaatat acgtttactc tgcttgaacg tttagaccat 13260gttgatggct
taggaaataa cgatctgagt tttgatcttt ctgtttatgc agaagatacc
13320gatggtgata tttcagcgtc taaaccgctt acagtcacca tcaccgatga
tgttcagctc 13380atgcaatccg gtgcgctcaa cattactgag ccaaccacag
gaacaccgac tacagcagtc 13440tttgatgtga tgcctgcgca aagtgcagat
ggcgcgacaa tcactaagtt tacctatggc 13500agccaacctg aagagtctct
ggtacaaacc gtcacgggtg agcaagaatt tgtgttcact 13560gaaggttctc
tgtttatcaa tcttgaaggt gatgtacgtt tcgaacctaa ccgtaatctc
13620gatcattcgg gtggtaacat cgttaagacc attacggtga catcggaaga
taaagatggc 13680gatattgtca cttcaacagt gacgctgact attgtagatg
gcgcgccacc agtaatagac 13740acagtaccaa cggttgcatt ggaagaagcg
aatctggtcg acggatcttc accgggttta 13800cctgttagcc aaactgaaat
cattactttc acagcaggaa gtgatgatgt gagccacttc 13860cgtattgatc
cggctcaatt caacacatca ggcgatctga aagcggatgg tttggtggtt
13920cagttaaaag aagatcctct aaacagcgat aattatattg gttacgttga
aagcggcggt 13980gtccaaacgg atatcttcac catcaccttt agcagcgtgg
ttctaggaga gtacacattc 14040accttgttgg aagagttaga tcacctgcct
gtacaaggta acaatgatca aatcttcacc 14100ttgccagtga tcgcagtcga
caaagacaac actgactcag cggtgaaacc tcttacggtg 14160accattaccg
atgatgttcc aaccattact gacaccaccg gcgccagtac gtttgtggtt
14220gatgaagatg atttgggcac tctggcacaa gcgacgggtt cgtttgtaac
cacagaaggt 14280gcagatcaag tcgaggttta cgaactacgt aatatatcaa
cgttggaagc aacgctatcg 14340tcgggcagtg aaggtattaa gatcactgag
atcacaggtg ctgctaacac gaccacctac 14400caaggggcga ccgacccaag
tggaacgcca attttcacat tagtgctgac tgatgatggt 14460gcctacacct
ttaccttgct tggccctctc aatcacgcta cgacaccgag taacctcgat
14520acattaacaa taccatttga tgttgttgcc gttgacggtg atggcgatga
ttctaaccaa 14580tatgtattgc caatcgaggt gctagatgat gtgcctgtaa
tgacggcgcc gacgggtgaa 14640acggttgttg atgaagacga tcttactggc
attggttccg atcaatctga agatacaatt 14700atcaatggac tgttcaccgt
tgatgaaggt gcggatggcg ttgtgctgta tgagctggtt 14760gatgaagatt
tggttctgac gggcttaacc tctgatggag aaagcttaga gtggctagct
14820gtttcacaaa acggcacaac atttacttac gttgctcaaa ctgcaacgag
taatgaagcg 14880gtgttcgaga ttattttcga cacctcggat aacagctacc
aatttgaatt atttaagcca 14940ctgaagcacc ctgacggtgc aaacgagaac
gcgatagatc ttgatttctc aatcgttgct 15000gaagattttg atcaagacca
atcggatgcg atcggtctaa aaattacggt aaccgatgat 15060gttccgttag
tgacaactca atcgattact cgtcttgaag gtcaggggta tggcaactct
15120aaagtcgaca tgtttgccaa tgcaacagat gtgggggctg atggcgcggt
actgagtcga 15180attgagggta tctcaaataa tggtgcagat attgttttcc
gtagcgggaa caatgggcca 15240tatagtagcg gcttcgattt aaacagcggt
agccaacaag ttcgagtcta cgagcaaaca 15300aatggcggtg ctgatactcg
tgaacttggc cgtctacgca tcaactcaaa tggtgaggtt 15360gaattcagag
ctaacggcta tctcgatcat gacggtgatg acaccatcga cttctcgatt
15420aacgtgattg ccacagatgg agatttagac acctctgaaa caccgttaga
tattacgatt 15480actgataggg attctacaag aattgcgctg aaagtgacga
ccttcgagga tgcgggtaga 15540gactcaacca taccttacgc aacaggtgat
gagccgactc ttgagaatgt tcaagataac 15600caaaatggtt tgccgaatgc
gccagcgcaa gttgcgctgc aagttagtct gtatgaccaa 15660gataacgctg
aatctattgg gcagttgacg attaaaagcc cgaacggagg tgatagtcat
15720caaggtactt tttattactt tgatggtgct gactacatag aattagtgcc
tgagtcaaat 15780gggagcatta tatttggctc tcctgaactc gaacaaagct
tcgctccaaa cccgagtgaa 15840ccaagacaaa ctatcgcgac gatagacaac
ctgttctttg ttccagacca acacgctagt 15900tcggatgaaa ctggtgggcg
agttcgttat gagcttgaaa ttgagaaaaa tggcagtacg 15960gatcacaccg
ttaattcaaa cttcagaatt gagattgaag ctgtagctga tattgcgact
16020tgggatgatt ccaacagcac gtatcagtat caagtcaacg aagatgaaga
caatgtcacg 16080ttgcagctga acgcagagtc tcaagataac agtaatactg
agacgattac ctatgaactt 16140gaagccgttc aaggcgacgg gaagtttgag
ttacttgatc aaaatggcaa tgtgttaacg 16200cccgttaatg gtgtttatat
catcgcatct gctgatatca atagcaccgt agttaaccct 16260attgataact
tctcagggca gattgagttc aaagcgacgg caattacgga agagacgctt
16320aacccatacg atgattcaga caacggtgga gcaaacgata agacgacggc
tcgttctgtg 16380gaacaaagta ttgttattga tgtgaccgca gatgcggacc
ctggcacatt cagtgttagt 16440cgaattcaga tcaacgaaga caatatcgat
gatccagatt acgtcgggcc tttggacaat 16500aaagacgcgt tcacgttaga
cgaagtcatc accatgacag ggtcggtcga ttctgacagt 16560tctgaagaac
tgtttgtgcg catcagtaat gttacggaag gagctgtgct ttacttctta
16620ggcaccacga cagtcgttcc gaccatcacg atcaatggtg tggattatca
agaaatcgcg 16680tattccgatt tggctaacgt tgaggttgtt ccaaccaaac
acagtaatgt cgatttcacc 16740ttcgatgtta cgggagtggt caaagatacg
gcaaatctat ccacgggcgc ccaaatcgat 16800gaggagatac taggaactaa
aaccgtcaac gttgaagtca aaggcgttgc cgatactcct 16860tatggtggaa
cgaatggcac ggcttggagt gcaattacag atggcactac atctggtgtt
16920caaaccacga ttcaagagag ccaaaatggt gatacctttg ctgagcttga
tttcaccgtg 16980ttgtcgggag agagaagacc agatactggc actacaccat
tagctgacga tgggtcagaa 17040tcaataaccg ttattctatc gggtataccc
gatggggttg ttctagaaga cggtgacggt 17100acagtgattg accttaactt
tgtcggttat gaaaccggac cgggcggtag tcctgactta 17160tccaaaccta
tctacgaagc gaacattact gaggcgggta aaacttcagg cattcgcatc
17220agacctgtcg actcttcaac cgagaatatt cacattcaag gtaaagtgat
tgtgactgag 17280aacgatggtc acacgcttac gtttgatcaa gaaattcgag
tgcttgttat acctcgaatc 17340gacacatcag caacttatgt caatacgact
aacggtgatg aagatacggc tatcaatatt 17400gattggcacc ctgaaggcac
ggattacatt gatgacgatg agcatttcac taagataact 17460attaatggaa
taccactggg tgttactgca gtagtcaacg gtgatgtgac cgttgatgac
17520tcaaccccag gaacattgat tataacgcct aaagatgctt cccaaactcc
tgaacaattt 17580actcaaattg cattagctaa taacttcatt caaatgacgc
ctccggctga ttctagtgca 17640gattttacgt tgaccaccga acttaaaatg
gaagagcgag atcatgagta tacgtctagc 17700ggcctagagg atgaagatgg
tggttatgtc gaagccgatc cagatataac cggaatcatt 17760aacgttcaag
tacgacctgt ggttgaacct ggagatgccg acaacaagat tgtcgtttca
17820aacgaagatg gctctggaga tctcactacg attacggctg atgctaatgg
tgtcattaaa 17880tttacaacta acagtgataa ccaaacgact gatactaacg
gagacgaaat ctgggacggt 17940gaatacgtcg tccgatacca agaaacggat
ttaagcacag tagaagagca agtcgacgaa 18000gtgattgttc agctgactaa
caccgatgga agcgcgttat ctgatgatat tttagggcaa 18060cttttagtaa
ctggtgcctc ttacgaaggc ggtggccgat gggttgtgac caatgaagat
18120gcctttagcg tcagtgcgcc caatggatta gatttcaccc ctgccaatga
tgcggatgat 18180gtagctactg atttcaatga tatcaagatg acaattttca
ctttggtctc agatcctggt 18240gatgctaaca atgaaacgtc cgcccaagtg
caacgcaccg gagaagtaac gctttcttat 18300cctgaagtgc tgacggcacc
tgacaaagtt gccgcagata ttgcgattgt gccagacagt 18360gttatcgacg
ctgttgagga tactcagctt gatctcggcg cggcactcaa cggcattttg
18420agcttgacgg gtcgcgatga ttctactgac caagtgacgg tgatcatcga
tggcactctg 18480gtcattgatg ctacaacatc attcccaatt agcctgtcgg
gaacaagtga tgttgacttt 18540gtgaatggga aatatgttta cgagacgact
gttgagcagg gcgtagccgt cgattcatcg 18600ggtttgttat tgaatctgcc
accaaactac tctggtgact ttaggttgcc aatgaccatc 18660gtgaccaaag
atttacaatc tggtgatgag aagaccttag tgactgaagt tatcatcaaa
18720gtcgcaccag atgctgagac ggatccaacg attgaggtga atgtcgtggg
ttcgcttgat 18780gatgccttta atcctgttga taccgacggt caagctgggc
aagatccggt gggttacgaa 18840gacacctata ttcaactcga cttcaattcg
accatttcgg atcaggtttc cggcgtcgaa 18900ggcggccaag aagcgtttac
gtccattact ttaacgttgg acgacccttc tataggtgca 18960ttctatgaca
acacgggtac ttcattaggt acatctgtta cgtttaatca ggctgaaata
19020gcagcgggtg cactcgataa cgtgctcttt agggcaatcg aaaattaccc
aacgggtaat 19080gatattaacc aagtgcaggt taatgtcagc ggtacagtca
cagataccgc aacctataat 19140gatcctgctt ctcctgcggg tacggcaaca
gactcagata ctttctctac gagtgtcagc 19200tttgaagtcg ttcctgtggt
cgatgacgtg tctgtcactg gaccgggtag cgatcctgat 19260gttatcgaga
ttactggcaa cgaagaccag ctcatttctt tgtcggggac agggcctgta
19320tcgattgcac tgactgacct tgatggttca gaacagtttg tatcgattaa
gttcacagat 19380gtccctgatg gcttccaaat gcgtgcagat gctggctcga
catataccgt gaaaaataat 19440ggtaatggag agtggagtgt tcaactgcct
caagcttcgg ggttgtcatt cgatttaagt 19500gagatttcga tcttgccgcc
taaaaacttc agtggtaccg ctgagtttgg tgtggaagtc 19560ttcactcaag
aatcgttgct gggtgtgcct actgcggcgg caaacttgcc aagcttcaaa
19620ctgcatgtgg tacctgttgg tgacgatgtt gataccaatc cgactgattc
tgtaacaggc 19680aacgaaggcc aaaacattga tatcgaaatc aatgcgacta
ttttggataa agaattgtct 19740gcaacaggaa gcgggacgta taccgagaat
gcgcccgaaa cgcttcgagt tgaagtggcg 19800ggtgttcctc aagatgcttc
tattttctat ccagatggca cgacattggc tagctacgat 19860ccggcgacgc
agctctggac tctcgatgtt ccagctcagt cgttagataa gatcgtattt
19920aactctggcg aacataatag tgatacaggc aatgtactgg gtatcaatgg
tccactgcag 19980attacggtac gttcagtaga tactgatgct gataatacag
agtacctagg tacgccaacc 20040agcttcgatg tcgatctggt gattgatcct
attaacgatc aaccgatctt tgtgaacgta 20100acgaatattg aaacatcgga
agacatcagt gttgccatcg acaactttag tatctacgac 20160gtcgacgcaa
actttgataa tccagatgct ccgtatgaac tgacgcttaa agtcgaccaa
20220acactgccgg gagcgcaagg tgtgtttgag tttaccagct ctcctgacgt
gacgtttgta 20280ttgcaacctg acggctcatt ggtgattacc ggtaaagaag
ccgacattaa taccgcattg 20340actaatggag ctgtgacttt caaacccgac
ccagaccaga actacctcaa ccagactggt 20400ttagtcacaa tcaatgcaac
gctcgatgat ggtggtaata acggtttgat tgacgcggtt 20460gatccgaata
ccgctcaaac caatcaaact accttcacca ttaaggtgac ggaagtgaat
20520gacgctcctg tggcgactaa cgttgattta ggctcgattg cggaagacgc
tcaaatcgtg 20580attgttgaga gtgacttgat tgcagccagt tctgatctag
aaaaccataa tctcacagta 20640accggtgtga ctcttactca agggcaaggt
cagcttacac gctatgaaaa tgctggtggt 20700gctgatgacg cagcgattac
ggggccattc tggatattca ttgcagataa tgatttcaac 20760ggcgacgtta
aattcaatta ctccattatc gatgatggta ccaccaacgg tgtggatgat
20820tttaaaaccg atagcgctga aatcagcctt gtagttactg aagtcaatga
ccagccagtg 20880gcatcgaaca ttgatttggg caccatgctt gaagaaggac
agctggtcat taaagaggaa 20940gacctgattt ccgcaaccac tgatccggaa
aacgacacga ttactgtgaa cagtttggtg 21000ctcgatcaag gtcagggcca
attacaacgc tttgagaacg tgggcggtgc tgatgatgct 21060acgatcactg
gcccgtactg ggtatttact gcagccaacg aatacaacgg tgatgttaag
21120ttcacttata ccgttgagga cgatggtaca accaacggcg ctgatgattt
cttaacagat 21180accggcgaaa ttagcgttgt ggtaacggaa gtgaatgatc
aaccggtggc aacggatatc 21240gacttaggaa acatccttga agaagggcag
ttgatcatca aagaggaaga cttaattgct 21300gctacgagcg atccggaaaa
cgacacgatt accgtgacca atctggtgct cgacgaaggc 21360caaggccagt
tacagcgctt tgagaacgtg ggcggtgctg atgacgctat gattactggc
21420ccgtactgga tatttacggc tgctgatgaa tacaacggta acgttaagtt
cacctatacc 21480gtcgaggatg atggtacaac caacggcgct aatgatttcc
taacggatac tgcagagatc 21540acagcgattg tcgacggagt gaacgatacg
cctgttgtta atggtgacag tgtcactacg 21600attgttgacg aggatgctgg
tcagctattg agtggtatca atgtcagtga cccagattat 21660gtggatgcat
tttctaatga cttgatgaca gtcacgctga cagtggatta cggtacattg
21720aacgtatcac ttccggcagt gacgacagtg atggtcaacg gcaacaacac
tggttcggtt 21780atcttagttg gtactttgag tgacctgaat gcgctgattg
atacgccaac cagtccaaac 21840ggtgtctacc tcgatgcgag cttgtctcca
accaatagca ttggcttaga agtaatcgcc 21900aaagacagcg gtaacccttc
tggtatcgcg attgaaactg caccagtggt ttataatatc 21960gcagtgacac
cagtcgctaa tgcgccaacc ttgtctattg atccggcatt taactatgtg
22020agaaacatta cgaccagctc atctgtggtc gctaatagtg gagtcgcttt
agttggaatt 22080gtcgctgcat tgacggacat tactgaagag ttaacgttga
agatcagcga tgttccggat 22140ggtgttgatg taaccagtga tgtgggtacg
gtttcgttgg tgggtgatac ttggatagcg 22200accgctgatg cgatcgatag
tctcagactc gtagagcagt catcattagg taaaccgttg 22260accccgggta
attacacctt gaaagttgag gcgctatctg aagagactga caacaacgat
22320attgcgatat ctcaaaacat cgatctgaat ctcaatattg ttgccaatcc
aatagatctc 22380gatctgtctt ctgaaacaga cgatgtgcaa cttttagcga
gtaactttga tactaacctc 22440actggcggaa ctggaaatga ccgacttgta
ggtggagcgg gtgacgatac gctggttggc 22500ggtgacggta acgacacact
cattggtggc ggcggttccg atattctaac cggtggcaat 22560ggtatggatt
cgtttgtatg gctcaatatt gaagatggcg ttgaagacac cattaccgat
22620ttcagcctgt ctgaaggaga ccaaatcgac ctacgagaag tattacctga
gttgaagaat 22680acatctccag acatgtctgc attgctacaa cagatagacg
cgaaagtgga aggggatgat 22740attgagctta cgatcaagtc tgatggttta
ggcactacgg aacaggtgat tgtggttgaa 22800gaccttgctc ctcagctaac
cttaagtggc accatgcctt cggatatttt ggatgcgtta 22860gtgcaacaaa
atgtcatcac tcacggttaa 22890
4 7629 PRT Vibrio splendidus 4
Met Leu Cys
Asp Asn Gly Gly Cys Met Asp Ile Glu Val Ser Arg Gln1 5 10 15 Val
Ala Val Val Glu Ala Thr Ser Gly Asp Val Val Val Val Lys Pro 20 25
30 Asp Gly Ser Ala Arg Lys Val Ser Val Gly Asp Thr Ile Arg Glu Asn
35 40 45 Glu Ile Val Ile Thr Ala Asn Lys Ser Glu Leu Val Leu Gly
Val Gln 50 55 60 Asn Asp Ser Ile Pro Val Ala Glu Asn Cys Val Gly
Cys Val Asp Glu65 70 75 80 Asn Ala Ala Trp Val Asp Ala Pro Ile Ala
Gly Glu Val Asn Phe Asp 85 90 95 Leu Gln Gln Ala Asp Ala Glu Thr
Phe Thr Glu Asp Asp Leu Ala Ala 100 105 110 Ile Gln Glu Ala Ile Leu
Gly Gly Ala Asp Pro Thr Gln Ile Leu Glu 115 120 125 Ala Thr Ala Ala
Gly Gly Gly Leu Gly Ser Ala Asn Ala Gly Phe Val 130 135 140 Thr Ile
Asp Tyr Asn Tyr Thr Glu Thr His Pro Ser Thr Phe Phe Glu145 150 155
160 Thr Ala Gly Leu Ala Glu Gln Thr Val Asp Glu Asp
Arg Glu Glu Phe 165 170 175 Arg Ser Ile Thr Arg Ser Ser Gly Gly Gln
Ser Ile Ser Glu Thr Leu 180 185 190 Thr Glu Gly Ser Ile Ser Gly Asn
Thr Tyr Pro Gln Ser Val Thr Thr 195 200 205 Thr Glu Thr Ile Ile Ala
Gly Ser Leu Ala Leu Ala Pro Asn Ser Phe 210 215 220 Ile Pro Glu Thr
Leu Ser Leu Ala Ser Leu Leu Ser Glu Leu Asn Ser225 230 235 240 Asp
Ile Thr Ser Ser Gly Gln Ser Val Ile Phe Thr Tyr Asp Ala Thr 245 250
255 Thr Asn Ser Ile Val Gly Val Gln Asp Thr Asp Glu Val Leu Arg Ile
260 265 270 Asp Ile Asp Ala Val Ser Val Gly Asn Asn Ile Glu Leu Ser
Leu Thr 275 280 285 Thr Thr Ile Ser Gln Pro Ile Asp His Val Pro Ser
Val Gly Gly Gly 290 295 300 Gln Val Ser Tyr Thr Gly Asp Gln Ile Asp
Ile Ala Phe Asp Ile Gln305 310 315 320 Gly Glu Asp Thr Ala Gly Asn
Pro Leu Ala Thr Pro Val Asn Ala Gln 325 330 335 Val Ser Val Phe Asp
Gly Ile Asp Pro Ser Val Glu Ser Val Asn Ile 340 345 350 Thr Asn Val
Glu Thr Ser Ser Ala Ala Ile Glu Gly Thr Phe Ser Asn 355 360 365 Ile
Gly Ser Asp Asn Leu Gln Ser Ala Val Phe Asp Ala Ser Ala Leu 370 375
380 Asp Gln Phe Asp Gly Leu Leu Ser Asp Asn Gln Asn Thr Leu Ala
Arg385 390 395 400 Leu Ser Asp Asp Gly Thr Thr Ile Thr Leu Ser Ile
Gln Gly Arg Gly 405 410 415 Glu Val Val Leu Thr Ile Ser Leu Asp Thr
Asp Gly Thr Tyr Lys Phe 420 425 430 Glu Gln Ser Asn Pro Ile Glu Gln
Val Gly Thr Asp Ser Leu Thr Phe 435 440 445 Ala Leu Pro Ile Thr Ile
Thr Asp Phe Asp Gln Asp Val Val Thr Asn 450 455 460 Thr Ile Asn Ile
Ala Ile Thr Asp Gly Asp Ser Pro Val Ile Thr Asn465 470 475 480 Val
Asp Ser Ile Asp Val Asp Glu Ala Gly Ile Val Gly Gly Ser Gln 485 490
495 Glu Gly Thr Ala Pro Val Ser Gly Thr Gly Gly Ile Thr Ala Asp Ile
500 505 510 Phe Glu Ser Asp Ile Ile Asp His Tyr Glu Leu Glu Pro Thr
Glu Phe 515 520 525 Asn Thr Asn Gly Thr Leu Val Ser Asn Gly Glu Ala
Val Leu Leu Glu 530 535 540 Leu Ile Asp Glu Thr Asn Gly Val Arg Thr
Tyr Glu Gly Tyr Val Glu545 550 555 560 Val Asn Gly Ser Arg Ile Thr
Val Phe Asp Val Lys Ile Asp Ser Pro 565 570 575 Ser Leu Gly Asn Tyr
Glu Phe Asn Leu Tyr Glu Glu Leu Ser His Gln 580 585 590 Gly Ala Glu
Asp Ala Leu Leu Thr Phe Ala Leu Pro Ile Tyr Ala Val 595 600 605 Asp
Ala Asp Gly Asp Arg Ser Ala Leu Ser Gly Gly Ser Asn Thr Pro 610 615
620 Glu Ala Ala Glu Ile Leu Val Asn Val Lys Asp Asp Val Val Glu
Leu625 630 635 640 Val Asp Lys Val Glu Ser Val Thr Glu Pro Thr Leu
Ala Gly Asp Thr 645 650 655 Ile Val Ser Tyr Asn Leu Phe Asn Phe Glu
Gly Ala Asp Gly Ser Thr 660 665 670 Ile Gln Ser Phe Asn Tyr Asp Gly
Val Asp Tyr Ser Leu Asp Gln Ser 675 680 685 Leu Leu Pro Asp Ala Thr
Gln Ile Phe Ser Phe Thr Glu Gly Val Val 690 695 700 Thr Ile Ser Leu
Asn Gly Asp Phe Ser Phe Glu Val Ala Arg Asp Ile705 710 715 720 Asp
His Ser Ser Ser Glu Thr Ile Val Lys Gln Phe Ser Phe Leu Ala 725 730
735 Glu Asp Gly Asp Gly Asp Thr Asp Ser Ser Thr Leu Glu Leu Ser Ile
740 745 750 Thr Asp Gly Gln Asp Pro Ile Ile Asp Leu Ile Pro Pro Val
Thr Leu 755 760 765 Ser Glu Thr Asn Leu Asn Asp Gly Ser Ala Pro Ser
Gly Ser Thr Val 770 775 780 Ser Ala Thr Glu Thr Ile Thr Phe Thr Ala
Gly Ser Asp Asp Val Ala785 790 795 800 Ser Phe Arg Ile Glu Pro Thr
Glu Phe Asn Val Gly Gly Ala Leu Lys 805 810 815 Ser Asn Gly Phe Ser
Val Glu Ile Lys Glu Asp Ser Ala Asn Pro Gly 820 825 830 Thr Tyr Ile
Gly Phe Ile Thr Asn Gly Ser Gly Ala Glu Ile Pro Val 835 840 845 Phe
Thr Ile Ala Phe Ser Thr Ser Thr Leu Gly Glu Tyr Thr Phe Thr 850 855
860 Leu Leu Glu Ala Leu Asp His Val Asp Gly Leu Asp Lys Asn Asp
Leu865 870 875 880 Ser Phe Asp Leu Pro Ile Tyr Ala Val Asp Thr Asp
Gly Asp Asp Ser 885 890 895 Leu Val Ser Gln Leu Asn Val Thr Ile Gly
Asp Asp Val Gln Ile Met 900 905 910 Gln Asp Gly Thr Leu Asp Ile Thr
Glu Pro Asn Leu Ala Asp Gly Thr 915 920 925 Ile Thr Thr Asn Thr Ile
Asp Val Met Pro Asn Gln Ser Ala Asp Gly 930 935 940 Ala Thr Ile Thr
Arg Phe Thr Tyr Asp Gly Val Val Asn Thr Leu Asp945 950 955 960 Gln
Ser Ile Ser Gly Glu Gln Gln Phe Ser Phe Thr Glu Gly Glu Leu 965 970
975 Phe Ile Thr Leu Glu Gly Glu Val Arg Phe Glu Pro Asn Arg Asp Leu
980 985 990 Asp His Ser Val Ser Glu Asp Ile Val Lys Ser Ile Val Val
Thr Ser 995 1000 1005 Ser Asp Phe Asp Asn Asp Pro Val Thr Ser Thr
Ile Thr Leu Thr Ile 1010 1015 1020 Thr Asp Gly Asp Asn Pro Thr Ile
Asp Val Ile Pro Ser Val Thr Leu1025 1030 1035 1040 Ser Glu Ile Asn
Leu Ser Asp Gly Ser Ala Pro Ser Gly Ser Ala Val 1045 1050 1055 Ser
Ser Thr Gln Thr Ile Thr Phe Thr Asn Gln Ser Asp Asp Val Val 1060
1065 1070 Arg Phe Arg Ile Glu Ser Thr Glu Phe Asn Thr Asn Asp Asp
Leu Lys 1075 1080 1085 Ser Asn Gly Leu Ala Val Glu Leu Arg Glu Asp
Pro Ala Gly Ser Gly 1090 1095 1100 Asp Tyr Ile Gly Phe Thr Thr Ser
Ala Thr Asn Val Glu Thr Pro Val1105 1110 1115 1120 Phe Thr Leu Ser
Phe Asn Ser Gly Ser Leu Gly Glu Tyr Thr Phe Thr 1125 1130 1135 Leu
Ile Glu Ala Leu Asp His Gln Asp Ala Arg Gly Asn Asn Asp Leu 1140
1145 1150 Ser Phe Asp Leu Pro Val Tyr Ala Val Asp Ser Asp Gly Asp
Asp Ser 1155 1160 1165 Leu Val Ser Pro Leu Asn Val Thr Ile Gly Asp
Asp Val Gln Ile Met 1170 1175 1180 Gln Asp Ser Thr Leu Asp Ile Val
Glu Pro Thr Val Ala Asp Leu Ala1185 1190 1195 1200 Ala Gly Thr Val
Thr Thr Asn Thr Ile Asp Val Met Pro Asn Gln Ser 1205 1210 1215 Ala
Asp Gly Ala Thr Val Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg 1220
1225 1230 Thr Leu Asp Gln Asn Asp Asn Gly Glu Gln Gln Phe Ser Phe
Thr Glu 1235 1240 1245 Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val
Arg Phe Glu Pro Asn 1250 1255 1260 Arg Asn Leu Asp His Thr Leu Ser
Glu Asp Ile Val Lys Ser Ile Val1265 1270 1275 1280 Val Thr Ser Ser
Asp Ser Asp Asn Asp Val Leu Thr Ser Thr Val Thr 1285 1290 1295 Leu
Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile Asp Asn Val Pro Thr 1300
1305 1310 Val Asn Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro
Ser Gly 1315 1320 1325 Ser Ala Val Ser Ser Thr Gln Thr Ile Thr Tyr
Thr Thr Gln Ser Asp 1330 1335 1340 Asp Val Thr Ser Phe Arg Ile Glu
Pro Thr Glu Phe Asn Val Gly Gly1345 1350 1355 1360 Ala Leu Thr Ser
Asn Gly Leu Ala Val Glu Leu Lys Ala Asp Pro Thr 1365 1370 1375 Thr
Pro Gly Gly Tyr Ile Gly Phe Val Thr Asp Gly Ser Asn Val Glu 1380
1385 1390 Thr Asn Val Phe Thr Ile Ser Phe Ser Asp Thr Asn Leu Gly
Gln Tyr 1395 1400 1405 Thr Phe Thr Leu Leu Glu Ala Leu Asp His Val
Asp Gly Leu Ala Asn 1410 1415 1420 Asn Asp Leu Thr Phe Asp Leu Pro
Val Tyr Ala Val Asp Ser Asp Gly1425 1430 1435 1440 Asp Asp Ser Leu
Val Ser Gln Leu Asn Val Thr Ile Gly Asp Asp Val 1445 1450 1455 Gln
Ile Met Gln Gly Gly Thr Leu Asp Ile Thr Glu Pro Asn Leu Ala 1460
1465 1470 Asp Gly Thr Ile Thr Thr Asn Thr Ile Asp Val Met Pro Glu
Gln Ser 1475 1480 1485 Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr
Asp Gly Gln Val Arg 1490 1495 1500 Thr Leu Asp Gln Thr Asp Asn Gly
Glu Gln Gln Phe Ser Phe Thr Glu1505 1510 1515 1520 Gly Glu Leu Phe
Ile Thr Leu Gln Gly Asp Val Arg Phe Glu Pro Asn 1525 1530 1535 Arg
Asn Leu Asp His Thr Ala Ser Glu Asp Ile Val Lys Ser Ile Val 1540
1545 1550 Val Thr Ser Ser Asp Leu Asp Asn Asp Val Val Thr Ser Thr
Val Thr 1555 1560 1565 Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile
Asp Ala Val Pro Ser 1570 1575 1580 Val Thr Leu Ser Glu Ile Asn Leu
Ser Asp Gly Ser Ala Pro Ser Gly1585 1590 1595 1600 Thr Ala Val Ser
Gln Thr Glu Thr Ile Thr Phe Thr Asn Gln Ser Asp 1605 1610 1615 Asp
Val Thr Ser Phe Arg Ile Glu Pro Ile Glu Phe Asn Val Gly Gly 1620
1625 1630 Ala Leu Lys Ser Asn Gly Phe Ala Val Glu Ile Lys Glu Asp
Ser Ala 1635 1640 1645 Asn Pro Gly Thr Tyr Ile Gly Phe Ile Thr Asn
Gly Ser Gly Ala Glu 1650 1655 1660 Ile Pro Val Phe Thr Ile Ala Phe
Ser Thr Ser Ser Leu Gly Glu Tyr1665 1670 1675 1680 Thr Phe Thr Leu
Leu Glu Ala Leu Asp His Val Asp Gly Leu Asp Lys 1685 1690 1695 Asn
Asp Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp Thr Asp Gly 1700
1705 1710 Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly Asp
Asp Val 1715 1720 1725 Gln Ile Met Gln Asp Gly Thr Leu Asp Ile Ile
Glu Pro Asn Leu Ala 1730 1735 1740 Asp Gly Thr Ile Thr Thr Ser Thr
Ile Asp Val Met Pro Asn Gln Ser1745 1750 1755 1760 Ala Asp Gly Ala
Thr Ile Thr Gln Phe Thr Tyr Asp Gly Gln Leu Arg 1765 1770 1775 Thr
Leu Asp Gln Asn Asp Thr Gly Glu Gln Gln Phe Ser Phe Thr Glu 1780
1785 1790 Gly Glu Leu Phe Ile Thr Leu Glu Gly Glu Val Arg Phe Glu
Pro Asn 1795 1800 1805 Arg Asp Leu Asp His Thr Ala Ser Glu Asp Ile
Val Lys Ser Ile Val 1810 1815 1820 Val Thr Ser Ser Asp Phe Asp Asn
Asp Ser Leu Thr Ser Thr Val Thr1825 1830 1835 1840 Leu Thr Ile Thr
Asp Gly Asp Asn Pro Thr Ile Asp Val Ile Pro Ser 1845 1850 1855 Val
Thr Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala Pro Ser Gly 1860
1865 1870 Ser Ala Val Ser Ser Thr Gln Thr Ile Thr Phe Thr Asn Gln
Ser Asp 1875 1880 1885 Asp Val Val Arg Phe Arg Ile Glu Pro Thr Glu
Phe Asn Thr Asn Asp 1890 1895 1900 Asp Leu Lys Ser Asn Gly Leu Ala
Val Glu Leu Arg Glu Asp Pro Ala1905 1910 1915 1920 Gly Ser Gly Asp
Tyr Ile Gly Phe Thr Thr Ser Ala Thr Asn Val Glu 1925 1930 1935 Thr
Thr Val Phe Thr Leu Ser Phe Ser Ser Thr Thr Leu Gly Glu Tyr 1940
1945 1950 Thr Phe Thr Leu Leu Glu Ala Leu Asp His Gln Asp Ala Arg
Gly Asn 1955 1960 1965 Asn Asp Leu Ser Phe Glu Leu Pro Val Tyr Ala
Val Asp Ser Asp Gly 1970 1975 1980 Asp Asp Ser Leu Met Ser Pro Leu
Asn Val Thr Ile Gly Asp Asp Val1985 1990 1995 2000 Gln Ile Met Gln
Asp Gly Thr Leu Asp Ile Val Glu Pro Thr Val Ala 2005 2010 2015 Asp
Leu Ala Ala Gly Ile Val Thr Thr Asn Thr Ile Asp Val Met Pro 2020
2025 2030 Asn Gln Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr
Asp Gly 2035 2040 2045 Gln Leu Arg Thr Leu Asp Gln Asn Asp Asn Gly
Glu Gln Gln Phe Ser 2050 2055 2060 Phe Thr Glu Gly Glu Leu Phe Ile
Thr Leu Glu Gly Glu Val Arg Phe2065 2070 2075 2080 Glu Pro Asn Arg
Asn Leu Asp His Thr Leu Asn Glu Asp Ile Val Lys 2085 2090 2095 Ser
Ile Val Val Thr Ser Ser Asp Ser Asp Asn Asp Val Leu Thr Ser 2100
2105 2110 Thr Val Thr Leu Thr Ile Thr Asp Gly Asp Ile Pro Thr Ile
Asp Asn 2115 2120 2125 Val Pro Thr Val Ser Leu Ser Glu Thr Ser Leu
Ser Asp Gly Ser Ser 2130 2135 2140 Pro Ser Gly Ser Ala Val Ser Ser
Thr Gln Thr Ile Thr Tyr Thr Thr2145 2150 2155 2160 Gln Ser Asp Asp
Val Thr Ser Phe Arg Ile Glu Pro Thr Glu Phe Asn 2165 2170 2175 Val
Gly Gly Ala Leu Lys Ser Asn Gly Leu Ala Val Glu Leu Lys Ala 2180
2185 2190 Asp Pro Thr Thr Pro Gly Gly Tyr Ile Gly Phe Val Thr Asp
Gly Ser 2195 2200 2205 Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe
Ser Asp Thr Asn Leu 2210 2215 2220 Gly Gln Tyr Thr Phe Thr Leu Leu
Glu Ala Leu Asp His Ala Asp Ser2225 2230 2235 2240 Leu Ala Asn Asn
Asp Leu Ser Phe Asp Leu Pro Val Tyr Ala Val Asp 2245 2250 2255 Ser
Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn Val Thr Ile Gly 2260
2265 2270 Asp Asp Val Gln Ile Met Gln Gly Gly Thr Leu Asp Ile Thr
Glu Pro 2275 2280 2285 Asn Leu Ala Asp Gly Thr Thr Thr Thr Asn Thr
Ile Asp Val Met Pro 2290 2295 2300 Glu Gln Ser Ala Asp Gly Ala Thr
Ile Thr Gln Phe Thr Tyr Asp Gly2305 2310 2315 2320 Gln Val Arg Thr
Leu Asp Gln Thr Asp Asn Gly Glu Gln Gln Phe Ser 2325 2330 2335 Phe
Thr Glu Gly Glu Leu Phe Ile Thr Leu Gln Gly Asp Val Arg Phe 2340
2345 2350 Glu Pro Asn Arg Asn Leu Asp His Thr Ala Ser Glu Asp Ile
Val Lys 2355 2360 2365 Ser Ile Val Val Thr Ser Ser Asp Ser Asp Asn
Asp Val Val Thr Ser 2370 2375 2380 Thr Val Thr Leu Thr Ile Thr Asp
Gly Asp Leu Pro Thr Ile Asp Ala2385 2390 2395 2400 Val Pro Ser Val
Thr Leu Ser Glu Thr Asn Leu Ser Asp Gly Ser Ala 2405 2410 2415 Pro
Ser Gly Ser Ala Val Ser Gln Thr Glu Thr Ile Thr Phe Thr Asn 2420
2425 2430 Gln Ser Asp Asp Val Ala Ser Phe Arg Ile Glu Pro Thr Glu
Phe Asn 2435 2440 2445 Val Gly Gly Ala Leu Lys Ser Asn Gly Phe Ala
Val Glu Ile Lys Glu 2450 2455 2460 Asp Ser Ala Asn Pro Gly Thr Tyr
Ile Gly Phe Ile Ala Asn Gly Ser2465 2470 2475 2480 Ser Ala Glu Ile
Pro Val Phe Thr Ile Ala Phe Ser Thr Ser Thr Leu 2485 2490 2495 Gly Glu
Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His Ala Asp Gly 2500 2505
2510 Leu Asp Lys Asn Asp Leu Ser Phe Glu Leu Pro Val Tyr Ala Val
Asp 2515 2520 2525 Thr Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn
Val Thr Ile Gly 2530 2535 2540 Asp Asp Val Gln Ile Met Gln Asp Gly
Thr Leu Asp Val Ile Glu Pro2545 2550 2555 2560 Asn Leu Ala Asp Gly
Thr Ile Thr Thr Asn Thr Ile Asp Val Met Pro 2565 2570 2575 Glu Gln
Ser Ala Asp Gly Ala Thr Ile Thr Gln Phe Thr Tyr Asp Gly 2580 2585
2590 Gln Leu Arg Thr Leu Asp Gln Asn Asp Thr Gly Glu Gln Gln Phe
Ser 2595 2600 2605 Phe Thr Glu Gly Glu Leu Phe Ile Thr Leu Glu Gly
Glu Val Arg Phe 2610 2615 2620 Glu Pro Asn Arg Asp Leu Asp His Ser
Val Ser Glu Asp Ile Val Lys2625 2630 2635 2640 Ser Ile Val Val Thr
Ser Ser Asp Phe Asp Asn Asp Pro Val Thr Ser 2645 2650 2655 Ala Ile
Thr Leu Thr Ile Thr Asp Gly Asp Asn Pro Thr Ile Asp Ser 2660 2665
2670 Val Pro Ser Val Val Leu Glu Glu Ala Asp Leu Thr Asp Gly Ser
Ser 2675 2680 2685 Pro Ser Gly Ser Ala Val Ser Gln Thr Glu Thr Ile
Thr Phe Thr Asn 2690 2695 2700 Gln Ser Asp Asp Val Glu Lys Phe Arg
Leu Glu Pro Ser Glu Phe Asn2705 2710 2715 2720 Thr Asn Asn Ala Leu
Lys Ser Asp Gly Leu Ile Ile Glu Ile Arg Glu 2725 2730 2735 Glu Pro
Thr Gly Ser Gly Asn Tyr Ile Gly Phe Thr Thr Asp Ile Ser 2740 2745
2750 Asn Val Glu Thr Thr Val Phe Thr Leu Asp Phe Ser Ser Thr Thr
Leu 2755 2760 2765 Gly Glu Tyr Thr Phe Thr Leu Leu Glu Ala Ile Asp
His Thr Pro Val 2770 2775 2780 Gln Gly Asn Asn Asp Leu Thr Phe Asn
Leu Pro Val Tyr Ala Val Asp2785 2790 2795 2800 Ser Asp Gly Asp Asp
Ser Leu Met Ser Ser Leu Ser Val Thr Ile Thr 2805 2810 2815 Asp Asp
Val Gln Val Met Val Ser Gly Ser Leu Ser Ile Glu Glu Pro 2820 2825
2830 Thr Val Ala Asp Leu Ala Ala Gly Thr Pro Thr Thr Ser Val Phe
Asp 2835 2840 2845 Val Leu Thr Ser Ala Ser Ala Asp Gly Ala Thr Ile
Thr Gln Phe Thr 2850 2855 2860 Tyr Asp Gly Gly Ala Val Leu Thr Leu
Asp Gln Asn Asp Thr Gly Glu2865 2870 2875 2880 Gln Lys Phe Val Val
Ala Asp Gly Ala Leu Tyr Ile Thr Leu Gln Gly 2885 2890 2895 Asp Ile
Arg Phe Glu Pro Ser Arg Asn Leu Asp His Thr Gly Gly Asp 2900 2905
2910 Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Ser Asp Ser Asp
Leu 2915 2920 2925 Val Ser Ser Thr Val Thr Leu Thr Ile Thr Asp Gly
Asp Ile Pro Thr 2930 2935 2940 Ile Asp Thr Val Pro Ser Val Thr Leu
Ser Glu Thr Asn Leu Ser Asp2945 2950 2955 2960 Gly Ser Ala Pro Asn
Ala Ser Ala Val Ser Ser Thr Gln Thr Ile Thr 2965 2970 2975 Phe Thr
Asn Gln Ser Asp Asp Val Thr Ser Phe Arg Ile Glu Pro Thr 2980 2985
2990 Asp Phe Asn Val Gly Gly Ala Leu Lys Ser Asn Gly Leu Ala Val
Glu 2995 3000 3005 Leu Lys Ala Asp Pro Thr Thr Pro Gly Gly Tyr Ile
Gly Phe Val Thr 3010 3015 3020 Asp Gly Ser Asn Val Glu Thr Asn Val
Phe Thr Ile Ser Phe Ser Asp3025 3030 3035 3040 Thr Asn Leu Gly Gln
Tyr Thr Phe Thr Leu Leu Glu Ala Leu Asp His 3045 3050 3055 Val Asp
Gly Leu Val Lys Asn Asp Leu Thr Phe Asp Leu Pro Val Tyr 3060 3065
3070 Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser Gln Leu Asn
Val 3075 3080 3085 Thr Ile Gly Asp Asp Val Gln Val Met Gln Asn Gln
Ala Leu Asn Ile 3090 3095 3100 Ile Glu Pro Thr Val Ala Asp Leu Ala
Ala Gly Thr Pro Thr Thr Ala3105 3110 3115 3120 Thr Val Asp Val Met
Pro Ser Gln Ser Ala Asp Gly Ala Thr Ile Thr 3125 3130 3135 Gln Phe
Thr Tyr Asp Gly Gly Ala Ala Ile Thr Leu Asp Gln Asn Asp 3140 3145
3150 Thr Gly Glu Gln Lys Phe Val Phe Thr Glu Gly Ser Leu Phe Ile
Thr 3155 3160 3165 Leu Gln Gly Glu Val Arg Phe Glu Pro Asn Arg Asn
Leu Asn His Thr 3170 3175 3180 Ala Ser Glu Asp Ile Val Lys Ser Ile
Val Val Thr Ser Ser Asp Leu3185 3190 3195 3200 Asp Asn Asp Val Leu
Thr Ser Thr Val Thr Leu Thr Ile Thr Asp Gly 3205 3210 3215 Asp Ile
Pro Thr Ile Asp Ala Val Pro Ser Val Thr Leu Ser Glu Thr 3220 3225
3230 Asn Leu Ser Asp Gly Ser Ala Pro Ser Ser Ser Ala Val Ser Gln
Thr 3235 3240 3245 Glu Thr Ile Thr Phe Ile Asn Gln Ser Asp Asp Val
Ala Ser Phe Arg 3250 3255 3260 Ile Glu Pro Thr Glu Phe Asn Val Gly
Gly Ala Leu Lys Ser Asn Gly3265 3270 3275 3280 Phe Ala Val Glu Ile
Lys Glu Asp Ser Ala Asn Pro Gly Thr Tyr Ile 3285 3290 3295 Gly Phe
Ile Thr Asp Gly Ser Asn Thr Glu Val Pro Val Phe Thr Ile 3300 3305
3310 Ala Phe Ser Thr Ser Thr Leu Gly Glu Tyr Thr Phe Thr Leu Leu
Glu 3315 3320 3325 Ala Leu Asp His Ala Asn Gly Leu Asp Lys Asn Asp
Leu Ser Phe Asp 3330 3335 3340 Leu Pro Val Tyr Ala Val Asp Ser Asp
Gly Asp Asp Ser Leu Val Ser3345 3350 3355 3360 Gln Leu Asn Val Thr
Ile Gly Asp Asp Val Gln Ile Met Gln Asp Gly 3365 3370 3375 Thr Leu
Asp Ile Thr Glu Pro Asn Leu Ala Asp Gly Thr Ile Thr Thr 3380 3385
3390 Asn Thr Ile Asp Val Met Pro Asn Gln Ser Ala Asp Gly Ala Thr
Ile 3395 3400 3405 Thr Glu Phe Ser Phe Gly Gly Ile Val Lys Thr Leu
Asp Gln Ser Ile 3410 3415 3420 Val Gly Glu Gln Gln Phe Ser Phe Thr
Glu Gly Glu Leu Phe Ile Thr3425 3430 3435 3440 Leu Gln Gly Gln Val
Arg Phe Glu Pro Asn Arg Asp Leu Asp His Ser 3445 3450 3455 Ala Ser
Glu Asp Ile Val Lys Ser Ile Val Val Thr Ser Ser Asp Phe 3460 3465
3470 Asp Asn Asp Pro Val Thr Ser Thr Val Thr Leu Thr Ile Thr Asp
Gly 3475 3480 3485 Asp Ile Pro Thr Ile Asp Ala Val Pro Ser Val Thr
Leu Ser Glu Thr 3490 3495 3500 Asn Leu Ala Asp Gly Ser Ala Pro Ser
Gly Ser Ala Val Ser Gln Thr3505 3510 3515 3520 Glu Thr Ile Thr Phe
Thr Asn Gln Ser Asp Asp Val Val Arg Phe Arg 3525 3530 3535 Leu Glu
Pro Thr Glu Phe Asn Thr Asn Asp Ala Leu Lys Ser Asn Gly 3540 3545
3550 Leu Ala Val Glu Leu Arg Glu Glu Pro Gln Gly Ser Gly Gln Tyr
Ile 3555 3560 3565 Gly Phe Thr Thr Ser Ser Ser Asn Val Glu Thr Thr
Val Phe Thr Leu 3570 3575 3580 Asp Phe Asn Ser Gly Thr Leu Gly Glu
Tyr Thr Phe Thr Leu Ile Glu3585 3590 3595 3600 Ala Leu Asp His Gln
Asp Ala Arg Gly Asn Asn Asp Leu Ser Phe Asn 3605 3610 3615 Leu Pro
Val Tyr Ala Val Asp Ser Asp Gly Asp Asp Ser Leu Val Ser 3620 3625
3630 Gln Leu Gly Val Thr Ile Gly Asp Asp Val Gln Leu Met Gln Asp
Gly 3635 3640 3645 Thr Ile Thr Ser Arg Glu Pro Ala Ala Ser Val Glu
Thr Ser Asn Thr 3650 3655 3660 Phe Asp Val Met Pro Asn Gln Ser Ala
Asp Gly Ala Lys Val Thr Ser3665 3670 3675 3680 Phe Val Phe Asp Gly
Lys Thr Ala Glu Ser Leu Asp Leu Asn Val Asn 3685 3690 3695 Gly Glu
Gln Glu Phe Val Phe Thr Glu Gly Ser Val Phe Ile Thr Thr 3700 3705
3710 Glu Gly Glu Ile Arg Phe Glu Pro Val Arg Asn Gln Asn His Ala
Gly 3715 3720 3725 Gly Asp Ile Thr Lys Ser Ile Glu Val Thr Ser Val
Asp Leu Asp Gly 3730 3735 3740 Asp Ile Val Thr Ser Thr Val Thr Leu
Lys Ile Val Asp Gly Asp Leu3745 3750 3755 3760 Pro Thr Ile Asp Leu
Val Pro Gly Ile Thr Leu Ser Glu Val Asp Leu 3765 3770 3775 Ala Asp
Gly Ser Val Pro Thr Gly Asn Pro Val Thr Met Thr Gln Thr 3780 3785
3790 Ile Thr Tyr Thr Ala Gly Ser Asp Asp Val Ser His Phe Arg Ile
Asp 3795 3800 3805 Pro Thr Gln Phe Asn Thr Ser Gly Val Leu Lys Ser
Asn Gly Leu Asp 3810 3815 3820 Val Glu Ile Lys Glu Gln Pro Ala Asn
Ser Gly Asn Tyr Ile Gly Phe3825 3830 3835 3840 Val Lys Asp Gly Ser
Asn Val Glu Thr Asn Val Phe Thr Ile Ser Phe 3845 3850 3855 Ser Thr
Ser Asn Leu Gly Gln Tyr Thr Phe Thr Leu Leu Glu Ala Leu 3860 3865
3870 Asp His Val Asp Gly Leu Gln Asn Asn Ile Leu Ser Phe Asp Val
Pro 3875 3880 3885 Val Leu Ala Val Asp Ala Asp Gly Asp Asp Ser Ala
Met Ser Pro Met 3890 3895 3900 Thr Val Ala Ile Thr Asp Asp Val Gln
Gly Val Gln Asp Gly Thr Leu3905 3910 3915 3920 Ser Ile Thr Glu Pro
Ser Leu Ala Asp Leu Ala Ser Gly Thr Pro Pro 3925 3930 3935 Thr Thr
Ala Ile Ile Asp Val Met Pro Thr Gln Ser Ala Asp Gly Ala 3940 3945
3950 Lys Val Thr Gln Phe Thr Tyr Asp Gly Gly Thr Ala Val Thr Leu
Asp 3955 3960 3965 Pro Ser Ile Ala Thr Glu Gln Val Phe Thr Val Thr
Asp Gly Leu Leu 3970 3975 3980 Tyr Ile Thr Ile Glu Gly Glu Val Arg
Phe Glu Pro Ser Arg Asp Leu3985 3990 3995 4000 Asp His Ser Ser Gly
Asp Ile Val Arg Thr Ile Val Val Thr Thr Ser 4005 4010 4015 Asp Phe
Asp Asn Asp Thr Asp Thr Ala Asp Val Thr Leu Thr Ile Lys 4020 4025
4030 Asp Gly Ile Asn Pro Val Ile Asn Val Val Pro Asp Val Asn Leu
Ser 4035 4040 4045 Glu Val Asn Leu Ala Asp Gly Ser Thr Pro Ser Gly
Ser Ala Val Ser 4050 4055 4060 Ser Thr His Thr Ile Thr Tyr Thr Glu
Gly Ser Asp Asp Phe Ser His4065 4070 4075 4080 Phe Arg Ile Ala Thr
Asn Glu Phe Asn Pro Gly Asp Leu Leu Lys Ser 4085 4090 4095 Ser Gly
Leu Val Val Gln Leu Lys Glu Asp Pro Ala Ser Ala Gly Asp 4100 4105
4110 Tyr Ile Gly Tyr Thr Asp Asp Gly Met Gly Asn Val Thr Asp Val
Phe 4115 4120 4125 Thr Ile Ser Phe Asp Ser Ala Asn Lys Ala Gln Phe
Thr Phe Thr Leu 4130 4135 4140 Ile Glu Ala Leu Asp His Leu Asp Gly
Val Leu Tyr Asn Asp Leu Thr4145 4150 4155 4160 Phe Arg Leu Pro Ile
Tyr Ala Val Asp Thr Asp Asp Ser Glu Ser Thr 4165 4170 4175 Lys Arg
Asp Val Val Val Thr Ile Glu Asp Asp Ile Gln Gln Met Gln 4180 4185
4190 Asp Gly Phe Leu Thr Ile Thr Glu Pro Asn Ser Gly Thr Pro Thr
Thr 4195 4200 4205 Thr Thr Val Asp Val Met Pro Ile Pro Ser Ala Asp
Gly Ala Thr Ile 4210 4215 4220 Thr Gln Phe Thr Tyr Asp Gly Gly Ser
Pro Ile Thr Leu Asn Gln Ser4225 4230 4235 4240 Ile Ser Gly Glu Gln
Glu Phe Val Phe Thr Glu Gly Ser Leu Phe Val 4245 4250 4255 Thr Leu
Asp Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu Asp His 4260 4265
4270 Ser Ala Gly Asp Ile Val Lys Ser Ile Val Phe Thr Ser Ser Asp
Phe 4275 4280 4285 Asp Asn Asp Ile Phe Ser Ser Lys Val Thr Leu Thr
Ile Val Asp Gly 4290 4295 4300 Asp Gly Pro Thr Ile Asp Val Val Pro
Gly Val Ala Leu Ser Glu Ser4305 4310 4315 4320 Leu Leu Ala Asp Gly
Ser Thr Pro Ser Val Asn Pro Val Ser Met Thr 4325 4330 4335 Gln Thr
Ile Thr Ser Leu Ala Ser Ser Asp Asp Ile Ala Glu Ile Val 4340 4345
4350 Val Glu Val Gly Leu Phe Asn Thr Asn Gly Ala Leu Lys Ser Asp
Gly 4355 4360 4365 Leu Ser Leu Ser Leu Arg Glu Asp Pro Val Asn Ser
Gly Asp Tyr Ile 4370 4375 4380 Ala Phe Thr Thr Asn Gly Ser Gly Val
Glu Lys Val Ile Phe Thr Leu4385 4390 4395 4400 Asp Phe Asp Asp Thr
Asn Pro Ser Gln Tyr Thr Phe Thr Leu Leu Glu 4405 4410 4415 Arg Leu
Asp His Val Asp Gly Leu Gly Asn Asn Asp Leu Ser Phe Asp 4420 4425
4430 Leu Ser Val Tyr Ala Glu Asp Thr Asp Gly Asp Ile Ser Ala Ser
Lys 4435 4440 4445 Pro Leu Thr Val Thr Ile Thr Asp Asp Val Gln Leu
Met Gln Ser Gly 4450 4455 4460 Ala Leu Asn Ile Thr Glu Pro Thr Thr
Gly Thr Pro Thr Thr Ala Val4465 4470 4475 4480 Phe Asp Val Met Pro
Ala Gln Ser Ala Asp Gly Ala Thr Ile Thr Lys 4485 4490 4495 Phe Thr
Tyr Gly Ser Gln Pro Glu Glu Ser Leu Val Gln Thr Val Thr 4500 4505
4510 Gly Glu Gln Glu Phe Val Phe Thr Glu Gly Ser Leu Phe Ile Asn
Leu 4515 4520 4525 Glu Gly Asp Val Arg Phe Glu Pro Asn Arg Asn Leu
Asp His Ser Gly 4530 4535 4540 Gly Asn Ile Val Lys Thr Ile Thr Val
Thr Ser Glu Asp Lys Asp Gly4545 4550 4555 4560 Asp Ile Val Thr Ser
Thr Val Thr Leu Thr Ile Val Asp Gly Ala Pro 4565 4570 4575 Pro Val
Ile Asp Thr Val Pro Thr Val Ala Leu Glu Glu Ala Asn Leu 4580 4585
4590 Val Asp Gly Ser Ser Pro Gly Leu Pro Val Ser Gln Thr Glu Ile
Ile 4595 4600 4605 Thr Phe Thr Ala Gly Ser Asp Asp Val Ser His Phe
Arg Ile Asp Pro 4610 4615 4620 Ala Gln Phe Asn Thr Ser Gly Asp Leu
Lys Ala Asp Gly Leu Val Val4625 4630 4635 4640 Gln Leu Lys Glu Asp
Pro Leu Asn Ser Asp Asn Tyr Ile Gly Tyr Val 4645 4650 4655 Glu Ser
Gly Gly Val Gln Thr Asp Ile Phe Thr Ile Thr Phe Ser Ser 4660 4665
4670 Val Val Leu Gly Glu Tyr Thr Phe Thr Leu Leu Glu Glu Leu Asp
His 4675 4680 4685 Leu Pro Val Gln Gly Asn Asn Asp Gln Ile Phe Thr
Leu Pro Val Ile 4690 4695 4700 Ala Val Asp Lys Asp Asn Thr Asp Ser
Ala Val Lys Pro Leu Thr Val4705 4710 4715 4720 Thr Ile Thr Asp Asp
Val Pro Thr Ile Thr Asp Thr Thr Gly Ala Ser 4725 4730 4735 Thr Phe
Val Val Asp Glu Asp Asp Leu Gly Thr Leu Ala Gln Ala Thr 4740 4745
4750 Gly Ser Phe Val Thr Thr Glu Gly Ala Asp Gln Val Glu Val Tyr
Glu 4755 4760 4765 Leu Arg Asn Ile Ser Thr Leu Glu Ala Thr Leu Ser
Ser Gly Ser Glu 4770 4775 4780 Gly Ile Lys Ile Thr Glu Ile Thr Gly
Ala Ala Asn Thr Thr Thr Tyr4785 4790 4795
4800 Gln Gly Ala Thr Asp Pro Ser Gly Thr Pro Ile Phe Thr Leu Val
Leu 4805 4810 4815 Thr Asp Asp Gly Ala Tyr Thr Phe Thr Leu Leu Gly
Pro Leu Asn His 4820 4825 4830 Ala Thr Thr Pro Ser Asn Leu Asp Thr
Leu Thr Ile Pro Phe Asp Val 4835 4840 4845 Val Ala Val Asp Gly Asp
Gly Asp Asp Ser Asn Gln Tyr Val Leu Pro 4850 4855 4860 Ile Glu Val
Leu Asp Asp Val Pro Val Met Thr Ala Pro Thr Gly Glu4865 4870 4875
4880 Thr Val Val Asp Glu Asp Asp Leu Thr Gly Ile Gly Ser Asp Gln
Ser 4885 4890 4895 Glu Asp Thr Ile Ile Asn Gly Leu Phe Thr Val Asp
Glu Gly Ala Asp 4900 4905 4910 Gly Val Val Leu Tyr Glu Leu Val Asp
Glu Asp Leu Val Leu Thr Gly 4915 4920 4925 Leu Thr Ser Asp Gly Glu
Ser Leu Glu Trp Leu Ala Val Ser Gln Asn 4930 4935 4940 Gly Thr Thr
Phe Thr Tyr Val Ala Gln Thr Ala Thr Ser Asn Glu Ala4945 4950 4955
4960 Val Phe Glu Ile Ile Phe Asp Thr Ser Asp Asn Ser Tyr Gln Phe
Glu 4965 4970 4975 Leu Phe Lys Pro Leu Lys His Pro Asp Gly Ala Asn
Glu Asn Ala Ile 4980 4985 4990 Asp Leu Asp Phe Ser Ile Val Ala Glu
Asp Phe Asp Gln Asp Gln Ser 4995 5000 5005 Asp Ala Ile Gly Leu Lys
Ile Thr Val Thr Asp Asp Val Pro Leu Val 5010 5015 5020 Thr Thr Gln
Ser Ile Thr Arg Leu Glu Gly Gln Gly Tyr Gly Asn Ser5025 5030 5035
5040 Lys Val Asp Met Phe Ala Asn Ala Thr Asp Val Gly Ala Asp Gly
Ala 5045 5050 5055 Val Leu Ser Arg Ile Glu Gly Ile Ser Asn Asn Gly
Ala Asp Ile Val 5060 5065 5070 Phe Arg Ser Gly Asn Asn Gly Pro Tyr
Ser Ser Gly Phe Asp Leu Asn 5075 5080 5085 Ser Gly Ser Gln Gln Val
Arg Val Tyr Glu Gln Thr Asn Gly Gly Ala 5090 5095 5100 Asp Thr Arg
Glu Leu Gly Arg Leu Arg Ile Asn Ser Asn Gly Glu Val5105 5110 5115
5120 Glu Phe Arg Ala Asn Gly Tyr Leu Asp His Asp Gly Asp Asp Thr
Ile 5125 5130 5135 Asp Phe Ser Ile Asn Val Ile Ala Thr Asp Gly Asp
Leu Asp Thr Ser 5140 5145 5150 Glu Thr Pro Leu Asp Ile Thr Ile Thr
Asp Arg Asp Ser Thr Arg Ile 5155 5160 5165 Ala Leu Lys Val Thr Thr
Phe Glu Asp Ala Gly Arg Asp Ser Thr Ile 5170 5175 5180 Pro Tyr Ala
Thr Gly Asp Glu Pro Thr Leu Glu Asn Val Gln Asp Asn5185 5190 5195
5200 Gln Asn Gly Leu Pro Asn Ala Pro Ala Gln Val Ala Leu Gln Val
Ser 5205 5210 5215 Leu Tyr Asp Gln Asp Asn Ala Glu Ser Ile Gly Gln
Leu Thr Ile Lys 5220 5225 5230 Ser Pro Asn Gly Gly Asp Ser His Gln
Gly Thr Phe Tyr Tyr Phe Asp 5235 5240 5245 Gly Ala Asp Tyr Ile Glu
Leu Val Pro Glu Ser Asn Gly Ser Ile Ile 5250 5255 5260 Phe Gly Ser
Pro Glu Leu Glu Gln Ser Phe Ala Pro Asn Pro Ser Glu5265 5270 5275
5280 Pro Arg Gln Thr Ile Ala Thr Ile Asp Asn Leu Phe Phe Val Pro
Asp 5285 5290 5295 Gln His Ala Ser Ser Asp Glu Thr Gly Gly Arg Val
Arg Tyr Glu Leu 5300 5305 5310 Glu Ile Glu Lys Asn Gly Ser Thr Asp
His Thr Val Asn Ser Asn Phe 5315 5320 5325 Arg Ile Glu Ile Glu Ala
Val Ala Asp Ile Ala Thr Trp Asp Asp Ser 5330 5335 5340 Asn Ser Thr
Tyr Gln Tyr Gln Val Asn Glu Asp Glu Asp Asn Val Thr5345 5350 5355
5360 Leu Gln Leu Asn Ala Glu Ser Gln Asp Asn Ser Asn Thr Glu Thr
Ile 5365 5370 5375 Thr Tyr Glu Leu Glu Ala Val Gln Gly Asp Gly Lys
Phe Glu Leu Leu 5380 5385 5390 Asp Gln Asn Gly Asn Val Leu Thr Pro
Val Asn Gly Val Tyr Ile Ile 5395 5400 5405 Ala Ser Ala Asp Ile Asn
Ser Thr Val Val Asn Pro Ile Asp Asn Phe 5410 5415 5420 Ser Gly Gln
Ile Glu Phe Lys Ala Thr Ala Ile Thr Glu Glu Thr Leu5425 5430 5435
5440 Asn Pro Tyr Asp Asp Ser Asp Asn Gly Gly Ala Asn Asp Lys Thr
Thr 5445 5450 5455 Ala Arg Ser Val Glu Gln Ser Ile Val Ile Asp Val
Thr Ala Asp Ala 5460 5465 5470 Asp Pro Gly Thr Phe Ser Val Ser Arg
Ile Gln Ile Asn Glu Asp Asn 5475 5480 5485 Ile Asp Asp Pro Asp Tyr
Val Gly Pro Leu Asp Asn Lys Asp Ala Phe 5490 5495 5500 Thr Leu Asp
Glu Val Ile Thr Met Thr Gly Ser Val Asp Ser Asp Ser5505 5510 5515
5520 Ser Glu Glu Leu Phe Val Arg Ile Ser Asn Val Thr Glu Gly Ala
Val 5525 5530 5535 Leu Tyr Phe Leu Gly Thr Thr Thr Val Val Pro Thr
Ile Thr Ile Asn 5540 5545 5550 Gly Val Asp Tyr Gln Glu Ile Ala Tyr
Ser Asp Leu Ala Asn Val Glu 5555 5560 5565 Val Val Pro Thr Lys His
Ser Asn Val Asp Phe Thr Phe Asp Val Thr 5570 5575 5580 Gly Val Val
Lys Asp Thr Ala Asn Leu Ser Thr Gly Ala Gln Ile Asp5585 5590 5595
5600 Glu Glu Ile Leu Gly Thr Lys Thr Val Asn Val Glu Val Lys Gly
Val 5605 5610 5615 Ala Asp Thr Pro Tyr Gly Gly Thr Asn Gly Thr Ala
Trp Ser Ala Ile 5620 5625 5630 Thr Asp Gly Thr Thr Ser Gly Val Gln
Thr Thr Ile Gln Glu Ser Gln 5635 5640 5645 Asn Gly Asp Thr Phe Ala
Glu Leu Asp Phe Thr Val Leu Ser Gly Glu 5650 5655 5660 Arg Arg Pro
Asp Thr Gly Thr Thr Pro Leu Ala Asp Asp Gly Ser Glu5665 5670 5675
5680 Ser Ile Thr Val Ile Leu Ser Gly Ile Pro Asp Gly Val Val Leu
Glu 5685 5690 5695 Asp Gly Asp Gly Thr Val Ile Asp Leu Asn Phe Val
Gly Tyr Glu Thr 5700 5705 5710 Gly Pro Gly Gly Ser Pro Asp Leu Ser
Lys Pro Ile Tyr Glu Ala Asn 5715 5720 5725 Ile Thr Glu Ala Gly Lys
Thr Ser Gly Ile Arg Ile Arg Pro Val Asp 5730 5735 5740 Ser Ser Thr
Glu Asn Ile His Ile Gln Gly Lys Val Ile Val Thr Glu5745 5750 5755
5760 Asn Asp Gly His Thr Leu Thr Phe Asp Gln Glu Ile Arg Val Leu
Val 5765 5770 5775 Ile Pro Arg Ile Asp Thr Ser Ala Thr Tyr Val Asn
Thr Thr Asn Gly 5780 5785 5790 Asp Glu Asp Thr Ala Ile Asn Ile Asp
Trp His Pro Glu Gly Thr Asp 5795 5800 5805 Tyr Ile Asp Asp Asp Glu
His Phe Thr Lys Ile Thr Ile Asn Gly Ile 5810 5815 5820 Pro Leu Gly
Val Thr Ala Val Val Asn Gly Asp Val Thr Val Asp Asp5825 5830 5835
5840 Ser Thr Pro Gly Thr Leu Ile Ile Thr Pro Lys Asp Ala Ser Gln
Thr 5845 5850 5855 Pro Glu Gln Phe Thr Gln Ile Ala Leu Ala Asn Asn
Phe Ile Gln Met 5860 5865 5870 Thr Pro Pro Ala Asp Ser Ser Ala Asp
Phe Thr Leu Thr Thr Glu Leu 5875 5880 5885 Lys Met Glu Glu Arg Asp
His Glu Tyr Thr Ser Ser Gly Leu Glu Asp 5890 5895 5900 Glu Asp Gly
Gly Tyr Val Glu Ala Asp Pro Asp Ile Thr Gly Ile Ile5905 5910 5915
5920 Asn Val Gln Val Arg Pro Val Val Glu Pro Gly Asp Ala Asp Asn
Lys 5925 5930 5935 Ile Val Val Ser Asn Glu Asp Gly Ser Gly Asp Leu
Thr Thr Ile Thr 5940 5945 5950 Ala Asp Ala Asn Gly Val Ile Lys Phe
Thr Thr Asn Ser Asp Asn Gln 5955 5960 5965 Thr Thr Asp Thr Asn Gly
Asp Glu Ile Trp Asp Gly Glu Tyr Val Val 5970 5975 5980 Arg Tyr Gln
Glu Thr Asp Leu Ser Thr Val Glu Glu Gln Val Asp Glu5985 5990 5995
6000 Val Ile Val Gln Leu Thr Asn Thr Asp Gly Ser Ala Leu Ser Asp
Asp 6005 6010 6015 Ile Leu Gly Gln Leu Leu Val Thr Gly Ala Ser Tyr
Glu Gly Gly Gly 6020 6025 6030 Arg Trp Val Val Thr Asn Glu Asp Ala
Phe Ser Val Ser Ala Pro Asn 6035 6040 6045 Gly Leu Asp Phe Thr Pro
Ala Asn Asp Ala Asp Asp Val Ala Thr Asp 6050 6055 6060 Phe Asn Asp
Ile Lys Met Thr Ile Phe Thr Leu Val Ser Asp Pro Gly6065 6070 6075
6080 Asp Ala Asn Asn Glu Thr Ser Ala Gln Val Gln Arg Thr Gly Glu
Val 6085 6090 6095 Thr Leu Ser Tyr Pro Glu Val Leu Thr Ala Pro Asp
Lys Val Ala Ala 6100 6105 6110 Asp Ile Ala Ile Val Pro Asp Ser Val
Ile Asp Ala Val Glu Asp Thr 6115 6120 6125 Gln Leu Asp Leu Gly Ala
Ala Leu Asn Gly Ile Leu Ser Leu Thr Gly 6130 6135 6140 Arg Asp Asp
Ser Thr Asp Gln Val Thr Val Ile Ile Asp Gly Thr Leu6145 6150 6155
6160 Val Ile Asp Ala Thr Thr Ser Phe Pro Ile Ser Leu Ser Gly Thr
Ser 6165 6170 6175 Asp Val Asp Phe Val Asn Gly Lys Tyr Val Tyr Glu
Thr Thr Val Glu 6180 6185 6190 Gln Gly Val Ala Val Asp Ser Ser Gly
Leu Leu Leu Asn Leu Pro Pro 6195 6200 6205 Asn Tyr Ser Gly Asp Phe
Arg Leu Pro Met Thr Ile Val Thr Lys Asp 6210 6215 6220 Leu Gln Ser
Gly Asp Glu Lys Thr Leu Val Thr Glu Val Ile Ile Lys6225 6230 6235
6240 Val Ala Pro Asp Ala Glu Thr Asp Pro Thr Ile Glu Val Asn Val
Val 6245 6250 6255 Gly Ser Leu Asp Asp Ala Phe Asn Pro Val Asp Thr
Asp Gly Gln Ala 6260 6265 6270 Gly Gln Asp Pro Val Gly Tyr Glu Asp
Thr Tyr Ile Gln Leu Asp Phe 6275 6280 6285 Asn Ser Thr Ile Ser Asp
Gln Val Ser Gly Val Glu Gly Gly Gln Glu 6290 6295 6300 Ala Phe Thr
Ser Ile Thr Leu Thr Leu Asp Asp Pro Ser Ile Gly Ala6305 6310 6315
6320 Phe Tyr Asp Asn Thr Gly Thr Ser Leu Gly Thr Ser Val Thr Phe
Asn 6325 6330 6335 Gln Ala Glu Ile Ala Ala Gly Ala Leu Asp Asn Val
Leu Phe Arg Ala 6340 6345 6350 Ile Glu Asn Tyr Pro Thr Gly Asn Asp
Ile Asn Gln Val Gln Val Asn 6355 6360 6365 Val Ser Gly Thr Val Thr
Asp Thr Ala Thr Tyr Asn Asp Pro Ala Ser 6370 6375 6380 Pro Ala Gly
Thr Ala Thr Asp Ser Asp Thr Phe Ser Thr Ser Val Ser6385 6390 6395
6400 Phe Glu Val Val Pro Val Val Asp Asp Val Ser Val Thr Gly Pro
Gly 6405 6410 6415 Ser Asp Pro Asp Val Ile Glu Ile Thr Gly Asn Glu
Asp Gln Leu Ile 6420 6425 6430 Ser Leu Ser Gly Thr Gly Pro Val Ser
Ile Ala Leu Thr Asp Leu Asp 6435 6440 6445 Gly Ser Glu Gln Phe Val
Ser Ile Lys Phe Thr Asp Val Pro Asp Gly 6450 6455 6460 Phe Gln Met
Arg Ala Asp Ala Gly Ser Thr Tyr Thr Val Lys Asn Asn6465 6470 6475
6480 Gly Asn Gly Glu Trp Ser Val Gln Leu Pro Gln Ala Ser Gly Leu
Ser 6485 6490 6495 Phe Asp Leu Ser Glu Ile Ser Ile Leu Pro Pro Lys
Asn Phe Ser Gly 6500 6505 6510 Thr Ala Glu Phe Gly Val Glu Val Phe
Thr Gln Glu Ser Leu Leu Gly 6515 6520 6525 Val Pro Thr Ala Ala Ala
Asn Leu Pro Ser Phe Lys Leu His Val Val 6530 6535 6540 Pro Val Gly
Asp Asp Val Asp Thr Asn Pro Thr Asp Ser Val Thr Gly6545 6550 6555
6560 Asn Glu Gly Gln Asn Ile Asp Ile Glu Ile Asn Ala Thr Ile Leu
Asp 6565 6570 6575 Lys Glu Leu Ser Ala Thr Gly Ser Gly Thr Tyr Thr
Glu Asn Ala Pro 6580 6585 6590 Glu Thr Leu Arg Val Glu Val Ala Gly
Val Pro Gln Asp Ala Ser Ile 6595 6600 6605 Phe Tyr Pro Asp Gly Thr
Thr Leu Ala Ser Tyr Asp Pro Ala Thr Gln 6610 6615 6620 Leu Trp Thr
Leu Asp Val Pro Ala Gln Ser Leu Asp Lys Ile Val Phe6625 6630 6635
6640 Asn Ser Gly Glu His Asn Ser Asp Thr Gly Asn Val Leu Gly Ile
Asn 6645 6650 6655 Gly Pro Leu Gln Ile Thr Val Arg Ser Val Asp Thr
Asp Ala Asp Asn 6660 6665 6670 Thr Glu Tyr Leu Gly Thr Pro Thr Ser
Phe Asp Val Asp Leu Val Ile 6675 6680 6685 Asp Pro Ile Asn Asp Gln
Pro Ile Phe Val Asn Val Thr Asn Ile Glu 6690 6695 6700 Thr Ser Glu
Asp Ile Ser Val Ala Ile Asp Asn Phe Ser Ile Tyr Asp6705 6710 6715
6720 Val Asp Ala Asn Phe Asp Asn Pro Asp Ala Pro Tyr Glu Leu Thr
Leu 6725 6730 6735 Lys Val Asp Gln Thr Leu Pro Gly Ala Gln Gly Val
Phe Glu Phe Thr 6740 6745 6750 Ser Ser Pro Asp Val Thr Phe Val Leu
Gln Pro Asp Gly Ser Leu Val 6755 6760 6765 Ile Thr Gly Lys Glu Ala
Asp Ile Asn Thr Ala Leu Thr Asn Gly Ala 6770 6775 6780 Val Thr Phe
Lys Pro Asp Pro Asp Gln Asn Tyr Leu Asn Gln Thr Gly6785 6790 6795
6800 Leu Val Thr Ile Asn Ala Thr Leu Asp Asp Gly Gly Asn Asn Gly
Leu 6805 6810 6815 Ile Asp Ala Val Asp Pro Asn Thr Ala Gln Thr Asn
Gln Thr Thr Phe 6820 6825 6830 Thr Ile Lys Val Thr Glu Val Asn Asp
Ala Pro Val Ala Thr Asn Val 6835 6840 6845 Asp Leu Gly Ser Ile Ala
Glu Asp Ala Gln Ile Val Ile Val Glu Ser 6850 6855 6860 Asp Leu Ile
Ala Ala Ser Ser Asp Leu Glu Asn His Asn Leu Thr Val6865 6870 6875
6880 Thr Gly Val Thr Leu Thr Gln Gly Gln Gly Gln Leu Thr Arg Tyr
Glu 6885 6890 6895 Asn Ala Gly Gly Ala Asp Asp Ala Ala Ile Thr Gly
Pro Phe Trp Ile 6900 6905 6910 Phe Ile Ala Asp Asn Asp Phe Asn Gly
Asp Val Lys Phe Asn Tyr Ser 6915 6920 6925 Ile Ile Asp Asp Gly Thr
Thr Asn Gly Val Asp Asp Phe Lys Thr Asp 6930 6935 6940 Ser Ala Glu
Ile Ser Leu Val Val Thr Glu Val Asn Asp Gln Pro Val6945 6950 6955
6960 Ala Ser Asn Ile Asp Leu Gly Thr Met Leu Glu Glu Gly Gln Leu
Val 6965 6970 6975 Ile Lys Glu Glu Asp Leu Ile Ser Ala Thr Thr Asp
Pro Glu Asn Asp 6980 6985 6990 Thr Ile Thr Val Asn Ser Leu Val Leu
Asp Gln Gly Gln Gly Gln Leu 6995 7000 7005 Gln Arg Phe Glu Asn Val
Gly Gly Ala Asp Asp Ala Thr Ile Thr Gly 7010 7015 7020 Pro Tyr Trp
Val Phe Thr Ala Ala Asn Glu Tyr Asn Gly Asp Val Lys7025 7030 7035
7040 Phe Thr Tyr Thr Val Glu Asp Asp Gly Thr Thr Asn Gly Ala Asp
Asp 7045 7050 7055 Phe Leu Thr Asp Thr Gly Glu Ile Ser Val Val Val
Thr Glu Val Asn 7060 7065 7070 Asp Gln Pro Val Ala Thr Asp Ile Asp
Leu Gly Asn Ile Leu Glu Glu 7075 7080 7085 Gly Gln Leu Ile Ile Lys
Glu Glu Asp Leu Ile Ala Ala Thr Ser Asp 7090 7095 7100 Pro Glu Asn
Asp Thr Ile Thr Val Thr Asn Leu Val Leu Asp Glu Gly7105 7110
7115 7120 Gln Gly Gln Leu Gln Arg Phe Glu Asn Val Gly Gly Ala Asp
Asp Ala 7125 7130 7135 Met Ile Thr Gly Pro Tyr Trp Ile Phe Thr Ala
Ala Asp Glu Tyr Asn 7140 7145 7150 Gly Asn Val Lys Phe Thr Tyr Thr
Val Glu Asp Asp Gly Thr Thr Asn 7155 7160 7165 Gly Ala Asn Asp Phe
Leu Thr Asp Thr Ala Glu Ile Thr Ala Ile Val 7170 7175 7180 Asp Gly
Val Asn Asp Thr Pro Val Val Asn Gly Asp Ser Val Thr Thr7185 7190
7195 7200 Ile Val Asp Glu Asp Ala Gly Gln Leu Leu Ser Gly Ile Asn
Val Ser 7205 7210 7215 Asp Pro Asp Tyr Val Asp Ala Phe Ser Asn Asp
Leu Met Thr Val Thr 7220 7225 7230 Leu Thr Val Asp Tyr Gly Thr Leu
Asn Val Ser Leu Pro Ala Val Thr 7235 7240 7245 Thr Val Met Val Asn
Gly Asn Asn Thr Gly Ser Val Ile Leu Val Gly 7250 7255 7260 Thr Leu
Ser Asp Leu Asn Ala Leu Ile Asp Thr Pro Thr Ser Pro Asn7265 7270
7275 7280 Gly Val Tyr Leu Asp Ala Ser Leu Ser Pro Thr Asn Ser Ile
Gly Leu 7285 7290 7295 Glu Val Ile Ala Lys Asp Ser Gly Asn Pro Ser
Gly Ile Ala Ile Glu 7300 7305 7310 Thr Ala Pro Val Val Tyr Asn Ile
Ala Val Thr Pro Val Ala Asn Ala 7315 7320 7325 Pro Thr Leu Ser Ile
Asp Pro Ala Phe Asn Tyr Val Arg Asn Ile Thr 7330 7335 7340 Thr Ser
Ser Ser Val Val Ala Asn Ser Gly Val Ala Leu Val Gly Ile7345 7350
7355 7360 Val Ala Ala Leu Thr Asp Ile Thr Glu Glu Leu Thr Leu Lys
Ile Ser 7365 7370 7375 Asp Val Pro Asp Gly Val Asp Val Thr Ser Asp
Val Gly Thr Val Ser 7380 7385 7390 Leu Val Gly Asp Thr Trp Ile Ala
Thr Ala Asp Ala Ile Asp Ser Leu 7395 7400 7405 Arg Leu Val Glu Gln
Ser Ser Leu Gly Lys Pro Leu Thr Pro Gly Asn 7410 7415 7420 Tyr Thr
Leu Lys Val Glu Ala Leu Ser Glu Glu Thr Asp Asn Asn Asp7425 7430
7435 7440 Ile Ala Ile Ser Gln Asn Ile Asp Leu Asn Leu Asn Ile Val
Ala Asn 7445 7450 7455 Pro Ile Asp Leu Asp Leu Ser Ser Glu Thr Asp
Asp Val Gln Leu Leu 7460 7465 7470 Ala Ser Asn Phe Asp Thr Asn Leu
Thr Gly Gly Thr Gly Asn Asp Arg 7475 7480 7485 Leu Val Gly Gly Ala
Gly Asp Asp Thr Leu Val Gly Gly Asp Gly Asn 7490 7495 7500 Asp Thr
Leu Ile Gly Gly Gly Gly Ser Asp Ile Leu Thr Gly Gly Asn7505 7510
7515 7520 Gly Met Asp Ser Phe Val Trp Leu Asn Ile Glu Asp Gly Val
Glu Asp 7525 7530 7535 Thr Ile Thr Asp Phe Ser Leu Ser Glu Gly Asp
Gln Ile Asp Leu Arg 7540 7545 7550 Glu Val Leu Pro Glu Leu Lys Asn
Thr Ser Pro Asp Met Ser Ala Leu 7555 7560 7565 Leu Gln Gln Ile Asp
Ala Lys Val Glu Gly Asp Asp Ile Glu Leu Thr 7570 7575 7580 Ile Lys
Ser Asp Gly Leu Gly Thr Thr Glu Gln Val Ile Val Val Glu7585 7590
7595 7600 Asp Leu Ala Pro Gln Leu Thr Leu Ser Gly Thr Met Pro Ser
Asp Ile 7605 7610 7615 Leu Asp Ala Leu Val Gln Gln Asn Val Ile Thr
His Gly 7620 7625
5 765 DNA Vibrio splendidus 5 atgaaaaaaa catcactatt
acttgcttcc attactctgg cactttctgg tgtagtacaa 60gctgaccagc tagaagacat
tcaaaaatca ggcacacttc gcgtcggcac cacaggcgac 120tacaaacctt
tttcttactt cgacggcaaa acctactctg gttatgacat tgacgtagcc
180aaacatgttg cagagcagtt gggcgttgaa ttacagattg ttcgtaccac
atggaaagat 240ctactgaccg atctagacag cgataaatac gacatcgcga
tgggcggtat cacgcgtaaa 300atgcagcgtc agttaaacgc agaacaaact
caaggttaca tgacctttgg caagtgtttc 360ttagttgcga aaggcaaagc
agaacaatac aacagcattg agaaagtgaa cctctcttct 420gtgcgtgttg
gcgtcaatat cggtgggact aatgagatgt ttgcggatgc taacttgcaa
480gacgcgagct ttacgcgtta cgagaacaac ctagacgttc cgcaagccgt
tgcggaaggt 540aaagttgatg taatggtgac agaaactcct gaaggtctgt
tctatcaagt gacggacgaa 600cgtcttgaag cggcacgctg tgaaacaccg
tttaccaaca gtcaattcgg ttacctgata 660ccaaaaggtg aacaacgctt
gttgaacaca gtgaacttca ttatggatga gatgaaattg 720aaaggcgtcg
aagaagagtt cctgatccac aactctctta agtaa 765
6 254 PRT Vibrio splendidus 6
Met Lys Lys Thr Ser Leu Leu Leu Ala Ser Ile Thr Leu Ala Leu Ser 1 5
10 15 Gly Val Val Gln Ala Asp Gln Leu Glu Asp Ile Gln Lys Ser Gly
Thr 20 25 30 Leu Arg Val Gly Thr Thr Gly Asp Tyr Lys Pro Phe Ser
Tyr Phe Asp 35 40 45 Gly Lys Thr Tyr Ser Gly Tyr Asp Ile Asp Val
Ala Lys His Val Ala 50 55 60 Glu Gln Leu Gly Val Glu Leu Gln Ile
Val Arg Thr Thr Trp Lys Asp65 70 75 80 Leu Leu Thr Asp Leu Asp Ser
Asp Lys Tyr Asp Ile Ala Met Gly Gly 85 90 95 Ile Thr Arg Lys Met
Gln Arg Gln Leu Asn Ala Glu Gln Thr Gln Gly 100 105 110 Tyr Met Thr
Phe Gly Lys Cys Phe Leu Val Ala Lys Gly Lys Ala Glu 115 120 125 Gln
Tyr Asn Ser Ile Glu Lys Val Asn Leu Ser Ser Val Arg Val Gly 130 135
140 Val Asn Ile Gly Gly Thr Asn Glu Met Phe Ala Asp Ala Asn Leu
Gln145 150 155 160 Asp Ala Ser Phe Thr Arg Tyr Glu Asn Asn Leu Asp
Val Pro Gln Ala 165 170 175 Val Ala Glu Gly Lys Val Asp Val Met Val
Thr Glu Thr Pro Glu Gly 180 185 190 Leu Phe Tyr Gln Val Thr Asp Glu
Arg Leu Glu Ala Ala Arg Cys Glu 195 200 205 Thr Pro Phe Thr Asn Ser
Gln Phe Gly Tyr Leu Ile Pro Lys Gly Glu 210 215 220 Gln Arg Leu Leu
Asn Thr Val Asn Phe Ile Met Asp Glu Met Lys Leu225 230 235 240 Lys
Gly Val Glu Glu Glu Phe Leu Ile His Asn Ser Leu Lys 245 250
7 1764 DNA Vibrio splendidus 7 atgactatcg atacttttgt tgttctcgcc
tacttcttct ttttaatcgc tattggttgg 60atgttccgta agttcaccac gtcgactagt
gattacttca gagggggcgg caaaatgttg 120tggtggatgg ttggtgcaac
cgccttcatg acacagtttt cagcatggac gtttacaggt 180gccgcaggac
gcgcgttcaa tgacggtttc gttattgtaa tcctattctt agccaatgct
240tttggctact tcatgaacta tatgtacttc gctccaaagt tccgccaact
tcgtgtggta 300acggcgatcg aagctattcg tcagcgcttt ggtaaaacgt
ctgaacagtt cttcacatgg 360gcaggtatgc ctgacagcct tatctctgcg
ggtatctggc taaatggtct agctatcttc 420gtagcagcgg tattcaacat
cccaatggaa gcaaccattg tggtaacggg tatggttcta 480gtattgatgg
cagtaacagg cggctcttgg gcggttgttg cttctgactt catgcaaatg
540cttgttatca tggcggttac gattacttgt gcggttgcag cttacttcca
cggtggtggc 600ctaactaaca tcgttgcaaa tttcgacggc gacttcatgt
taggtaataa cctaaactac 660atgagcatct tcgttctttg ggttgtattc
atcttcgtga agcagttcgg tgtaatgaac 720aacagcatca acgcttaccg
ttacctatgt gcgaaagaca gtgaaaacgc acgtaaagcg 780gcaggcctag
catgtatcct tatggttgtt ggcccactaa tctggttcct accaccttgg
840tacgtaagtg cattcatgcc tgatttcgca ttggagtacg cttcaatggg
tgataaagct 900ggtgatgctg cttacctagc attcgtacag aacgtaatgc
cagcaggtat ggttggtctt 960cttatgtcag caatgttcgc tgcaacaatg
tcttctatgg attcaggttt gaaccgtaac 1020gctggcatct ttgtaatgaa
cttctacagc cctattctac gtcaaaacgc aactcagaaa 1080gagctggtta
ttgtaagtaa gctaaccact atcatgatgg gtattatcat catcgcgatt
1140ggcttgttca ttaactctct acgtcatttg agcttgttcg atatcgtaat
gaacgtaggt 1200gcgttaattg gcttcccaat gcttatccct gtactacttg
gtatgtggat tcgtaagacg 1260cctgactggg ctggttggtc tacgttaatc
gttggtggct tcgtttctta catcttcggt 1320atctcgcttc aagcagaaga
catcgagcac ctatttggta tggaaacagc gcttactggc 1380cgtgaatgga
gcgacttgaa agttggtctt agcttagctg ctcacgtagt gtttactggt
1440ggttacttca tcctaacttc tcgcttctac aaaggcctat cgcctgaacg
tgagaaagaa 1500gttgaccaac tattcactaa ctggaatacg ccgctagtag
cggaaggtga agagcagcag 1560aacctagata ctaaacagcg ttcaatgctt
ggtaagctta tcagcacagc aggtttcggt 1620atcctagcaa tggctctgat
tccaaacgaa ccaacaggac gcttgttgtt cctactatgt 1680ggttcgatgg
tactcaccgt tggtatcctg ctggttaacg catctaaagc tccggctaag
1740atgaacaacg agtcagttgc taag 17648588PRTVibrio splendidus 8Met
Thr Ile Asp Thr Phe Val Val Leu Ala Tyr Phe Phe Phe Leu Ile1 5 10
15 Ala Ile Gly Trp Met Phe Arg Lys Phe Thr Thr Ser Thr Ser Asp Tyr
20 25 30 Phe Arg Gly Gly Gly Lys Met Leu Trp Trp Met Val Gly Ala
Thr Ala 35 40 45 Phe Met Thr Gln Phe Ser Ala Trp Thr Phe Thr Gly
Ala Ala Gly Arg 50 55 60 Ala Phe Asn Asp Gly Phe Val Ile Val Ile
Leu Phe Leu Ala Asn Ala65 70 75 80 Phe Gly Tyr Phe Met Asn Tyr Met
Tyr Phe Ala Pro Lys Phe Arg Gln 85 90 95 Leu Arg Val Val Thr Ala
Ile Glu Ala Ile Arg Gln Arg Phe Gly Lys 100 105 110 Thr Ser Glu Gln
Phe Phe Thr Trp Ala Gly Met Pro Asp Ser Leu Ile 115 120 125 Ser Ala
Gly Ile Trp Leu Asn Gly Leu Ala Ile Phe Val Ala Ala Val 130 135 140
Phe Asn Ile Pro Met Glu Ala Thr Ile Val Val Thr Gly Met Val Leu145
150 155 160 Val Leu Met Ala Val Thr Gly Gly Ser Trp Ala Val Val Ala
Ser Asp 165 170 175 Phe Met Gln Met Leu Val Ile Met Ala Val Thr Ile
Thr Cys Ala Val 180 185 190 Ala Ala Tyr Phe His Gly Gly Gly Leu Thr
Asn Ile Val Ala Asn Phe 195 200 205 Asp Gly Asp Phe Met Leu Gly Asn
Asn Leu Asn Tyr Met Ser Ile Phe 210 215 220 Val Leu Trp Val Val Phe
Ile Phe Val Lys Gln Phe Gly Val Met Asn225 230 235 240 Asn Ser Ile
Asn Ala Tyr Arg Tyr Leu Cys Ala Lys Asp Ser Glu Asn 245 250 255 Ala
Arg Lys Ala Ala Gly Leu Ala Cys Ile Leu Met Val Val Gly Pro 260 265
270 Leu Ile Trp Phe Leu Pro Pro Trp Tyr Val Ser Ala Phe Met Pro Asp
275 280 285 Phe Ala Leu Glu Tyr Ala Ser Met Gly Asp Lys Ala Gly Asp
Ala Ala 290 295 300 Tyr Leu Ala Phe Val Gln Asn Val Met Pro Ala Gly
Met Val Gly Leu305 310 315 320 Leu Met Ser Ala Met Phe Ala Ala Thr
Met Ser Ser Met Asp Ser Gly 325 330 335 Leu Asn Arg Asn Ala Gly Ile
Phe Val Met Asn Phe Tyr Ser Pro Ile 340 345 350 Leu Arg Gln Asn Ala
Thr Gln Lys Glu Leu Val Ile Val Ser Lys Leu 355 360 365 Thr Thr Ile
Met Met Gly Ile Ile Ile Ile Ala Ile Gly Leu Phe Ile 370 375 380 Asn
Ser Leu Arg His Leu Ser Leu Phe Asp Ile Val Met Asn Val Gly385 390
395 400 Ala Leu Ile Gly Phe Pro Met Leu Ile Pro Val Leu Leu Gly Met
Trp 405 410 415 Ile Arg Lys Thr Pro Asp Trp Ala Gly Trp Ser Thr Leu
Ile Val Gly 420 425 430 Gly Phe Val Ser Tyr Ile Phe Gly Ile Ser Leu
Gln Ala Glu Asp Ile 435 440 445 Glu His Leu Phe Gly Met Glu Thr Ala
Leu Thr Gly Arg Glu Trp Ser 450 455 460 Asp Leu Lys Val Gly Leu Ser
Leu Ala Ala His Val Val Phe Thr Gly465 470 475 480 Gly Tyr Phe Ile
Leu Thr Ser Arg Phe Tyr Lys Gly Leu Ser Pro Glu 485 490 495 Arg Glu
Lys Glu Val Asp Gln Leu Phe Thr Asn Trp Asn Thr Pro Leu 500 505 510
Val Ala Glu Gly Glu Glu Gln Gln Asn Leu Asp Thr Lys Gln Arg Ser 515
520 525 Met Leu Gly Lys Leu Ile Ser Thr Ala Gly Phe Gly Ile Leu Ala
Met 530 535 540 Ala Leu Ile Pro Asn Glu Pro Thr Gly Arg Leu Leu Phe
Leu Leu Cys545 550 555 560 Gly Ser Met Val Leu Thr Val Gly Ile Leu
Leu Val Asn Ala Ser Lys 565 570 575 Ala Pro Ala Lys Met Asn Asn Glu
Ser Val Ala Lys 580 585 9627DNAVibrio splendidus 9atgacgacat
taaatgaaca actagcaaac ctaaaagtaa ttcctgtaat cgcgatcaac 60cgtgctgaag
acgctatccc tctaggtaaa gcgttggttg aaaatggcat gccatgtgca
120gaaattacac tacgtacaga atgtgcaatc gaagcgattc gcatcatgcg
taaagaattc 180ccagacatgc taatcggttc aggtactgta ctgactaacg
agcaagttga cgcatctatc 240gaagctggtg ttgatttcat cgtaagccca
ggttttaacc cacgtactgt tcaatactgt 300atcgataaag gtattgcaat
cgtaccgggt gttaacaacc caagcctagt tgagcaagca 360atggaaatgg
gtcttcgcac gttgaagttc ttccctgctg agccttcagg cggtactggc
420atgcttaaag cactaacagc agtttaccct gttaaattca tgcctactgg
tggcgtaagc 480ttgaagaatg ttgatgaata cctatcgatc ccttctgttc
ttgcgtgtgg cggtacttgg 540atggttccaa ctaaccttat cgatgaaggt
aagtgggacg aactaggcaa gcttgttcgt 600gacgcagttg atcacgttaa cgcttaa
62710208PRTVibrio splendidus 10Met Thr Thr Leu Asn Glu Gln Leu Ala
Asn Leu Lys Val Ile Pro Val1 5 10 15 Ile Ala Ile Asn Arg Ala Glu
Asp Ala Ile Pro Leu Gly Lys Ala Leu 20 25 30 Val Glu Asn Gly Met
Pro Cys Ala Glu Ile Thr Leu Arg Thr Glu Cys 35 40 45 Ala Ile Glu
Ala Ile Arg Ile Met Arg Lys Glu Phe Pro Asp Met Leu 50 55 60 Ile
Gly Ser Gly Thr Val Leu Thr Asn Glu Gln Val Asp Ala Ser Ile65 70 75
80 Glu Ala Gly Val Asp Phe Ile Val Ser Pro Gly Phe Asn Pro Arg Thr
85 90 95 Val Gln Tyr Cys Ile Asp Lys Gly Ile Ala Ile Val Pro Gly
Val Asn 100 105 110 Asn Pro Ser Leu Val Glu Gln Ala Met Glu Met Gly
Leu Arg Thr Leu 115 120 125 Lys Phe Phe Pro Ala Glu Pro Ser Gly Gly
Thr Gly Met Leu Lys Ala 130 135 140 Leu Thr Ala Val Tyr Pro Val Lys
Phe Met Pro Thr Gly Gly Val Ser145 150 155 160 Leu Lys Asn Val Asp
Glu Tyr Leu Ser Ile Pro Ser Val Leu Ala Cys 165 170 175 Gly Gly Thr
Trp Met Val Pro Thr Asn Leu Ile Asp Glu Gly Lys Trp 180 185 190 Asp
Glu Leu Gly Lys Leu Val Arg Asp Ala Val Asp His Val Asn Ala 195 200
205 11933DNAVibrio splendidus 11atgaaatcat taaacatcgc ggtcattggc
gagtgcatgg ttgagctaca aaagaaacaa 60gacgggctta agcaaagttt tggtggcgat
acgctgaata ctgcacttta cttgtcacgc 120ttaacaaaag agcaagatat
caacacgagc tacgtaactg cactaggcac tgacccattc 180agtaccgaca
tgttaaaaaa ttggcaagcg gaaggtatcg acacgagctt aattgctcag
240ctggaccaca aacaaccagg gctttactac atcgagaccg atgaaactgg
tgaacgcagt 300ttccactact ggcgtagtga tgctgcagcg aagttcatgt
ttgatcagga agacacgcct 360gctcttcttg ataagctgtt ctcttttgac
gcgatttact taagtggtat tacgctggca 420atcttgacag aaaatggtcg
cacgcagcta ttcaacttct tagacaaatt caaagctcaa 480ggcggccaag
tattcttcga caataactac cgacctaaac tttgggaaag ccaacaagaa
540gcgatttctt ggtacttgaa aatgcttaag tacacagata cggctctgct
gacgtttgat 600gatgagcaag agctatacgg cgacgaaagc attgaacaat
gtattacacg tacgtcagag 660tctggtgtga aagagatcgt cattaaacgt
ggcgcgaaag actgcttagt ggttgaaagc 720caaagcgctc aatacgttgc
acccaaccct gtagacaaca tcgttgatac gactgccgct 780ggcgactcgt
tcagtgcagg cttcttggcc aagcgcttga gcggcggtag tgctcgtgat
840gctgcatttg caggtcatat tgtggcagga accgtgattc agcatccagg
tgctatcatt 900cctctagaag cgacgcctga tctgtctcta taa
93312310PRTVibrio splendidus 12Met Lys Ser Leu Asn Ile Ala Val Ile
Gly Glu Cys Met Val Glu Leu1 5 10 15 Gln Lys Lys Gln Asp Gly Leu
Lys Gln Ser Phe Gly Gly Asp Thr Leu 20 25 30 Asn Thr Ala Leu Tyr
Leu Ser Arg Leu Thr Lys Glu Gln Asp Ile Asn 35 40 45 Thr Ser Tyr
Val Thr Ala Leu Gly Thr Asp Pro Phe Ser Thr Asp Met 50 55 60 Leu
Lys Asn Trp Gln Ala Glu Gly Ile Asp Thr Ser Leu Ile Ala Gln65 70
75 80 Leu Asp His Lys Gln Pro Gly Leu Tyr Tyr Ile Glu Thr Asp Glu
Thr 85 90 95 Gly Glu Arg Ser Phe His Tyr Trp Arg Ser Asp Ala Ala
Ala Lys Phe 100 105 110 Met Phe Asp Gln Glu Asp Thr Pro Ala Leu Leu
Asp Lys Leu Phe Ser 115 120 125 Phe Asp Ala Ile Tyr Leu Ser Gly Ile
Thr Leu Ala Ile Leu Thr Glu 130 135 140 Asn Gly Arg Thr Gln Leu Phe
Asn Phe Leu Asp Lys Phe Lys Ala Gln145 150 155 160 Gly Gly Gln Val
Phe Phe Asp Asn Asn Tyr Arg Pro Lys Leu Trp Glu 165 170 175 Ser Gln
Gln Glu Ala Ile Ser Trp Tyr Leu Lys Met Leu Lys Tyr Thr 180 185 190
Asp Thr Ala Leu Leu Thr Phe Asp Asp Glu Gln Glu Leu Tyr Gly Asp 195
200 205 Glu Ser Ile Glu Gln Cys Ile Thr Arg Thr Ser Glu Ser Gly Val
Lys 210 215 220 Glu Ile Val Ile Lys Arg Gly Ala Lys Asp Cys Leu Val
Val Glu Ser225 230 235 240 Gln Ser Ala Gln Tyr Val Ala Pro Asn Pro
Val Asp Asn Ile Val Asp 245 250 255 Thr Thr Ala Ala Gly Asp Ser Phe
Ser Ala Gly Phe Leu Ala Lys Arg 260 265 270 Leu Ser Gly Gly Ser Ala
Arg Asp Ala Ala Phe Ala Gly His Ile Val 275 280 285 Ala Gly Thr Val
Ile Gln His Pro Gly Ala Ile Ile Pro Leu Glu Ala 290 295 300 Thr Pro
Asp Leu Ser Leu305 310 13336DNAVibrio splendidus 13atgaactctt
tctttatcct agatgaaaat ccatgggaag aacttggtgg cggcattaag 60cgtaaaatcg
ttgcttacac tgacgatcta atggcagtac acctatgctt tgataagggc
120gcgattggcc accctcatac tcacgaaatt cacgaccaaa tcggttatgt
tgttcgtggt 180agcttcgaag ctgaaatcga cggcgagaag aaagtgctta
aagaaggcga tgcttacttc 240gctcgtaaac acatgatgca cggtgcagtt
gctctagaac aagacagcat ccttcttgat 300atcttcaatc ctgcgcgtga
agatttccta aaataa 33614111PRTVibrio splendidus 14Met Asn Ser Phe
Phe Ile Leu Asp Glu Asn Pro Trp Glu Glu Leu Gly1 5 10 15 Gly Gly
Ile Lys Arg Lys Ile Val Ala Tyr Thr Asp Asp Leu Met Ala 20 25 30
Val His Leu Cys Phe Asp Lys Gly Ala Ile Gly His Pro His Thr His 35
40 45 Glu Ile His Asp Gln Ile Gly Tyr Val Val Arg Gly Ser Phe Glu
Ala 50 55 60 Glu Ile Asp Gly Glu Lys Lys Val Leu Lys Glu Gly Asp
Ala Tyr Phe65 70 75 80 Ala Arg Lys His Met Met His Gly Ala Val Ala
Leu Glu Gln Asp Ser 85 90 95 Ile Leu Leu Asp Ile Phe Asn Pro Ala
Arg Glu Asp Phe Leu Lys 100 105 110 152208DNAVibrio splendidus
15atgacgacta aaccagtatt gttgactgaa gctgaaatcg aacagcttca tcttgaagtg
60ggccgttcta gcttaatggg caaaaccatt gcagcgaacg cgaaagacct agaagcattc
120atgcgtttac ctattgatgt tccaggtcac ggtgaagctg ggggttacga
acataaccgc 180cacaagcaaa attacacgta catgaaccta gctggtcgca
tgttcttgat cactaaagag 240caaaaatacg ctgactttgt tacagaatta
ctagaagagt acgcagacaa atatctaacg 300tttgattacc acgtacagaa
aaacaccaac ccaacaggtc gtttgttcca ccaaatccta 360aacgaacact
gctggttaat gttctcaagc ttagcttatt cttgtgttgc ttcaacactg
420acacaagatc agcgtgacaa tattgagtct cgcatttttg aacccatgct
agaaatgttc 480acggttaaat acgcacacga cttcgaccgt attcacaatc
acggtatttg ggcagtagcc 540gctgtgggta tctgtggtct tgctttaggc
aaacgtgaat acctagaaat gtcagtgtac 600ggcatcgacc gtaatgatac
tggcggtttc ctagcgcaag tttctcagct atttgcacct 660tctggctact
acatggaagg tccttactac catcgttatg cgattcgccc aacgtgtgtg
720ttcgctgaag tgattcaccg tcatatgcct gaagttgata tctacaacta
caaaggcggc 780gtgattggta acacagtaca agctatgctt gcgacagcgt
acccgaacgg cgagttcccg 840gctctgaatg atgcttctcg tactatgggt
atcacagaca tgggtgttca ggttgcggtc 900agtgtttaca gtaagcatta
ctcttctgaa aacggtgtag accaaaacat tctgggtatg 960gcgaagattc
aagacgcagt atggatgcat ccatgtggtc ttgagctatc taaagcatac
1020gaagccgcat ctgcagagaa agaaatcggc atgcctttct ggccaagtgt
tgaattgaat 1080gaaggccctc aaggtcacaa cggcgcgcaa ggctttatcc
gtatgcagga taagaaaggc 1140gacgtttctc aacttgtgat gaactacggc
caacacggca tgggtcacgg caactttgat 1200acgctgggta tttctttctt
taaccgcggt caagaagtgc tacgtgaata cggcttctgt 1260cgttgggtta
acgttgagcc aaaattcggc ggccgttacc tagacgaaaa caaatcttac
1320gctcgtcaaa cgattgctca caatgcagtt acgattgatg aaaaatgtca
gaacaacttt 1380gacgttgaac gtgcagactc agtacatggt ttacctcact
tctttaaagt agaagacgat 1440caaatcaacg gtatgagtgc atttgctaac
gatcattacc aaggctttga catgcaacgc 1500agcgtgttca tgctaaatct
tgaagaatta gaatctccgt tattgttaga cctataccgc 1560ttagattcta
caaaaggcgg cgaaggcgag caccaatacg actattcaca ccaatatgcg
1620ggtcagattg ttcgcactaa cttcgaatac caagcgaaca aagagctaaa
cactctaggt 1680gacgatttcg gttaccaaca tctatggaac gtcgcaagcg
gtgaagtgaa gggcacagca 1740attgtaagtt ggctacaaaa caacacctac
tacacatggc taggtgcaac gtctaacgat 1800aatgctgaag taatatttac
tcgcactggc gctaacgacc caagtttcaa tctacgttca 1860gagcctgcgt
tcattctacg cagcaaaggc gaaacaacac tgtttgcttc tgttgttgaa
1920acgcacggtt atttcaacga agaattcgag caatctgtca atgcacgtgg
tgttgtgaaa 1980gacatcaaag tcgtggctca caccaatgtc ggttcggtag
ttgagatcac cacagagaaa 2040tcaaacgtga cagtgatgat cagcaaccaa
cttggcgcga ctgacagcac tgaacacaaa 2100gtagaactga acggcaaagt
atacagctgg aaaggcttct actcagtaga gacaacttta 2160caagaaacga
attcagaaga acttagcact gcagggcagg ggaaataa 220816735PRTVibrio
splendidus 16Met Thr Thr Lys Pro Val Leu Leu Thr Glu Ala Glu Ile
Glu Gln Leu1 5 10 15 His Leu Glu Val Gly Arg Ser Ser Leu Met Gly
Lys Thr Ile Ala Ala 20 25 30 Asn Ala Lys Asp Leu Glu Ala Phe Met
Arg Leu Pro Ile Asp Val Pro 35 40 45 Gly His Gly Glu Ala Gly Gly
Tyr Glu His Asn Arg His Lys Gln Asn 50 55 60 Tyr Thr Tyr Met Asn
Leu Ala Gly Arg Met Phe Leu Ile Thr Lys Glu65 70 75 80 Gln Lys Tyr
Ala Asp Phe Val Thr Glu Leu Leu Glu Glu Tyr Ala Asp 85 90 95 Lys
Tyr Leu Thr Phe Asp Tyr His Val Gln Lys Asn Thr Asn Pro Thr 100 105
110 Gly Arg Leu Phe His Gln Ile Leu Asn Glu His Cys Trp Leu Met Phe
115 120 125 Ser Ser Leu Ala Tyr Ser Cys Val Ala Ser Thr Leu Thr Gln
Asp Gln 130 135 140 Arg Asp Asn Ile Glu Ser Arg Ile Phe Glu Pro Met
Leu Glu Met Phe145 150 155 160 Thr Val Lys Tyr Ala His Asp Phe Asp
Arg Ile His Asn His Gly Ile 165 170 175 Trp Ala Val Ala Ala Val Gly
Ile Cys Gly Leu Ala Leu Gly Lys Arg 180 185 190 Glu Tyr Leu Glu Met
Ser Val Tyr Gly Ile Asp Arg Asn Asp Thr Gly 195 200 205 Gly Phe Leu
Ala Gln Val Ser Gln Leu Phe Ala Pro Ser Gly Tyr Tyr 210 215 220 Met
Glu Gly Pro Tyr Tyr His Arg Tyr Ala Ile Arg Pro Thr Cys Val225 230
235 240 Phe Ala Glu Val Ile His Arg His Met Pro Glu Val Asp Ile Tyr
Asn 245 250 255 Tyr Lys Gly Gly Val Ile Gly Asn Thr Val Gln Ala Met
Leu Ala Thr 260 265 270 Ala Tyr Pro Asn Gly Glu Phe Pro Ala Leu Asn
Asp Ala Ser Arg Thr 275 280 285 Met Gly Ile Thr Asp Met Gly Val Gln
Val Ala Val Ser Val Tyr Ser 290 295 300 Lys His Tyr Ser Ser Glu Asn
Gly Val Asp Gln Asn Ile Leu Gly Met305 310 315 320 Ala Lys Ile Gln
Asp Ala Val Trp Met His Pro Cys Gly Leu Glu Leu 325 330 335 Ser Lys
Ala Tyr Glu Ala Ala Ser Ala Glu Lys Glu Ile Gly Met Pro 340 345 350
Phe Trp Pro Ser Val Glu Leu Asn Glu Gly Pro Gln Gly His Asn Gly 355
360 365 Ala Gln Gly Phe Ile Arg Met Gln Asp Lys Lys Gly Asp Val Ser
Gln 370 375 380 Leu Val Met Asn Tyr Gly Gln His Gly Met Gly His Gly
Asn Phe Asp385 390 395 400 Thr Leu Gly Ile Ser Phe Phe Asn Arg Gly
Gln Glu Val Leu Arg Glu 405 410 415 Tyr Gly Phe Cys Arg Trp Val Asn
Val Glu Pro Lys Phe Gly Gly Arg 420 425 430 Tyr Leu Asp Glu Asn Lys
Ser Tyr Ala Arg Gln Thr Ile Ala His Asn 435 440 445 Ala Val Thr Ile
Asp Glu Lys Cys Gln Asn Asn Phe Asp Val Glu Arg 450 455 460 Ala Asp
Ser Val His Gly Leu Pro His Phe Phe Lys Val Glu Asp Asp465 470 475
480 Gln Ile Asn Gly Met Ser Ala Phe Ala Asn Asp His Tyr Gln Gly Phe
485 490 495 Asp Met Gln Arg Ser Val Phe Met Leu Asn Leu Glu Glu Leu
Glu Ser 500 505 510 Pro Leu Leu Leu Asp Leu Tyr Arg Leu Asp Ser Thr
Lys Gly Gly Glu 515 520 525 Gly Glu His Gln Tyr Asp Tyr Ser His Gln
Tyr Ala Gly Gln Ile Val 530 535 540 Arg Thr Asn Phe Glu Tyr Gln Ala
Asn Lys Glu Leu Asn Thr Leu Gly545 550 555 560 Asp Asp Phe Gly Tyr
Gln His Leu Trp Asn Val Ala Ser Gly Glu Val 565 570 575 Lys Gly Thr
Ala Ile Val Ser Trp Leu Gln Asn Asn Thr Tyr Tyr Thr 580 585 590 Trp
Leu Gly Ala Thr Ser Asn Asp Asn Ala Glu Val Ile Phe Thr Arg 595 600
605 Thr Gly Ala Asn Asp Pro Ser Phe Asn Leu Arg Ser Glu Pro Ala Phe
610 615 620 Ile Leu Arg Ser Lys Gly Glu Thr Thr Leu Phe Ala Ser Val
Val Glu625 630 635 640 Thr His Gly Tyr Phe Asn Glu Glu Phe Glu Gln
Ser Val Asn Ala Arg 645 650 655 Gly Val Val Lys Asp Ile Lys Val Val
Ala His Thr Asn Val Gly Ser 660 665 670 Val Val Glu Ile Thr Thr Glu
Lys Ser Asn Val Thr Val Met Ile Ser 675 680 685 Asn Gln Leu Gly Ala
Thr Asp Ser Thr Glu His Lys Val Glu Leu Asn 690 695 700 Gly Lys Val
Tyr Ser Trp Lys Gly Phe Tyr Ser Val Glu Thr Thr Leu705 710 715 720
Gln Glu Thr Asn Ser Glu Glu Leu Ser Thr Ala Gly Gln Gly Lys 725 730
735 172154DNAVibrio splendidus 17atgagctatc aaccactttt acttaacttt
gatgaagcag ctgaacttcg taaagaactt 60ggcaaggata gcctattagg taacgcactg
actcgcgaca ttaaacaaac tgacgcttac 120atggctgaag ttggcattga
agtaccaggt cacggtgaag gcggcggtta cgagcacaac 180cgtcataagc
aaaactacat ccatatggat ctagcaggcc gtttgttcct tatcactgag
240gaaacaaaat accgagatta catcgttgat atgctaacag cgtacgcgac
ggtataccca 300acacttgaaa gcaacgtaag ccgtgactct aaccctccgg
gtaagctgtt ccaccaaacg 360ttgaacgaga acatgtggat gctttacgct
tcttgtgcgt acagctgcat ctaccacacg 420atctctgaag agcaaaagcg
tctgatcgaa gacgatcttc ttaagcaaat gatcgaaatg 480ttcgttgtga
cttacgcaca cgacttcgat atcgtacaca accacggctt atgggcagtg
540gcagcagtag gtatctgtgg ttacgcaatc aacgatcaag agtctgtaga
caaagcacta 600tacggcctga aactagacaa agtcagcggc ggtttcttag
cgcaactaga ccaactgttt 660tcgccagacg gctactacat ggaaggtcct
tactaccacc gtttctctct gcgtccaatc 720tacctgttcg cagaagcgat
tgaacgtcgt cagcctgaag ttggtatcta tgaattcaac 780gattcagtga
tcaagacaac gtcttactct gtattcaaaa cggcattccc agacggtaca
840ttgcctgctc tgaacgattc atcgaagaca atctctatca acgatgaagg
cgttatcatg 900gcaacgtctg tgtgttacca ccgttacgag caaactgaaa
ctctacttgg tatggctaac 960caccagcaaa acgtttgggt tcatgcttca
ggtaaaacac tgtctgacgc ggttgatgca 1020gcagacgaca tcaaagcatt
caactggggt agcctgtttg taaccgacgg ccctgaaggc 1080gaaaaaggcg
gcgtaagcat ccttcgtcac cgtgacgaac aagatgacga cacgatggcg
1140ttgatctggt ttggtcaaca cggttctgat caccagtacc actctgctct
agaccacggt 1200cactacgatg gcctgcacct aagcgtattt aaccgtggcc
acgaagtgct gcacgatttc 1260ggcttcggtc gctgggtaaa cgttgagcct
aagtttggcg gtcgttacat cccagagaac 1320aagtcttact gtaagcagac
ggttgctcac aacacagtaa cggttgatca gaaaacgcag 1380aacaacttca
acacagcatt ggctgagtct aagtttggtc agaagcactt cttcgtagca
1440gacgaccagt ctctacaagg catgagcggc acaatttctg agtactacac
tggcgtagac 1500atgcaacgca gcgtgattct tgctgaactt cctgagttcg
agaagccact tgtaatcgac 1560gtataccgca tcgaagctga cgctgaacac
cagtacgacc tacccgttca ccactctggt 1620cagatcatcc gtactgactt
cgattacaac atggaaaaaa cgcttaagcc gctaggtgaa 1680gacaacggtt
accagcactt atggaacgtg gcttcaggca aagtgaacga agaaggttct
1740ctagtaagct ggctacatga cagcagctac tacagcctag taaccagcgc
gaatgcgggc 1800agcgaagtga tttttgctcg cactggtgct aacgatccag
acttcaacct taagagtgag 1860cctgcgttca tcttacgtca gtctggtcaa
aaccacgtgt ttgcttctgt actagaaacg 1920catggttact ttaacgagtc
tatcgaagcc tctgtaggcg ctcgtggtct agttaaatca 1980gtatctgttg
tgggccataa cagtgtcggg actgttgttc gcattcagac tacttctggc
2040aacacttacc actacggtat ctcaaaccaa gctgaagaca cgcagcaagc
aactcacact 2100gttgagttcg cgggtgagac atactcgtgg gaaggatcat
ttgctcaact gtaa 215418717PRTVibrio splendidus 18Met Ser Tyr Gln Pro
Leu Leu Leu Asn Phe Asp Glu Ala Ala Glu Leu1 5 10 15 Arg Lys Glu
Leu Gly Lys Asp Ser Leu Leu Gly Asn Ala Leu Thr Arg 20 25 30 Asp
Ile Lys Gln Thr Asp Ala Tyr Met Ala Glu Val Gly Ile Glu Val 35 40
45 Pro Gly His Gly Glu Gly Gly Gly Tyr Glu His Asn Arg His Lys Gln
50 55 60 Asn Tyr Ile His Met Asp Leu Ala Gly Arg Leu Phe Leu Ile
Thr Glu65 70 75 80 Glu Thr Lys Tyr Arg Asp Tyr Ile Val Asp Met Leu
Thr Ala Tyr Ala 85 90 95 Thr Val Tyr Pro Thr Leu Glu Ser Asn Val
Ser Arg Asp Ser Asn Pro 100 105 110 Pro Gly Lys Leu Phe His Gln Thr
Leu Asn Glu Asn Met Trp Met Leu 115 120 125 Tyr Ala Ser Cys Ala Tyr
Ser Cys Ile Tyr His Thr Ile Ser Glu Glu 130 135 140 Gln Lys Arg Leu
Ile Glu Asp Asp Leu Leu Lys Gln Met Ile Glu Met145 150 155 160 Phe
Val Val Thr Tyr Ala His Asp Phe Asp Ile Val His Asn His Gly 165 170
175 Leu Trp Ala Val Ala Ala Val Gly Ile Cys Gly Tyr Ala Ile Asn Asp
180 185 190 Gln Glu Ser Val Asp Lys Ala Leu Tyr Gly Leu Lys Leu Asp
Lys Val 195 200 205 Ser Gly Gly Phe Leu Ala Gln Leu Asp Gln Leu Phe
Ser Pro Asp Gly 210 215 220 Tyr Tyr Met Glu Gly Pro Tyr Tyr His Arg
Phe Ser Leu Arg Pro Ile225 230 235 240 Tyr Leu Phe Ala Glu Ala Ile
Glu Arg Arg Gln Pro Glu Val Gly Ile 245 250 255 Tyr Glu Phe Asn Asp
Ser Val Ile Lys Thr Thr Ser Tyr Ser Val Phe 260 265 270 Lys Thr Ala
Phe Pro Asp Gly Thr Leu Pro Ala Leu Asn Asp Ser Ser 275 280 285 Lys
Thr Ile Ser Ile Asn Asp Glu Gly Val Ile Met Ala Thr Ser Val 290 295
300 Cys Tyr His Arg Tyr Glu Gln Thr Glu Thr Leu Leu Gly Met Ala
Asn305 310 315 320 His Gln Gln Asn Val Trp Val His Ala Ser Gly Lys
Thr Leu Ser Asp 325 330 335 Ala Val Asp Ala Ala Asp Asp Ile Lys Ala
Phe Asn Trp Gly Ser Leu 340 345 350 Phe Val Thr Asp Gly Pro Glu Gly
Glu Lys Gly Gly Val Ser Ile Leu 355 360 365 Arg His Arg Asp Glu Gln
Asp Asp Asp Thr Met Ala Leu Ile Trp Phe 370 375 380 Gly Gln His Gly
Ser Asp His Gln Tyr His Ser Ala Leu Asp His Gly385 390 395 400 His
Tyr Asp Gly Leu His Leu Ser Val Phe Asn Arg Gly His Glu Val 405 410
415 Leu His Asp Phe Gly Phe Gly Arg Trp Val Asn Val Glu Pro Lys Phe
420 425 430 Gly Gly Arg Tyr Ile Pro Glu Asn Lys Ser Tyr Cys Lys Gln
Thr Val 435 440 445 Ala His Asn Thr Val Thr Val Asp Gln Lys Thr Gln
Asn Asn Phe Asn 450 455 460 Thr Ala Leu Ala Glu Ser Lys Phe Gly Gln
Lys His Phe Phe Val Ala465 470 475
480 Asp Asp Gln Ser Leu Gln Gly Met Ser Gly Thr Ile Ser Glu Tyr Tyr
485 490 495 Thr Gly Val Asp Met Gln Arg Ser Val Ile Leu Ala Glu Leu
Pro Glu 500 505 510 Phe Glu Lys Pro Leu Val Ile Asp Val Tyr Arg Ile
Glu Ala Asp Ala 515 520 525 Glu His Gln Tyr Asp Leu Pro Val His His
Ser Gly Gln Ile Ile Arg 530 535 540 Thr Asp Phe Asp Tyr Asn Met Glu
Lys Thr Leu Lys Pro Leu Gly Glu545 550 555 560 Asp Asn Gly Tyr Gln
His Leu Trp Asn Val Ala Ser Gly Lys Val Asn 565 570 575 Glu Glu Gly
Ser Leu Val Ser Trp Leu His Asp Ser Ser Tyr Tyr Ser 580 585 590 Leu
Val Thr Ser Ala Asn Ala Gly Ser Glu Val Ile Phe Ala Arg Thr 595 600
605 Gly Ala Asn Asp Pro Asp Phe Asn Leu Lys Ser Glu Pro Ala Phe Ile
610 615 620 Leu Arg Gln Ser Gly Gln Asn His Val Phe Ala Ser Val Leu
Glu Thr625 630 635 640 His Gly Tyr Phe Asn Glu Ser Ile Glu Ala Ser
Val Gly Ala Arg Gly 645 650 655 Leu Val Lys Ser Val Ser Val Val Gly
His Asn Ser Val Gly Thr Val 660 665 670 Val Arg Ile Gln Thr Thr Ser
Gly Asn Thr Tyr His Tyr Gly Ile Ser 675 680 685 Asn Gln Ala Glu Asp
Thr Gln Gln Ala Thr His Thr Val Glu Phe Ala 690 695 700 Gly Glu Thr
Tyr Ser Trp Glu Gly Ser Phe Ala Gln Leu705 710 715 19825DNAVibrio
splendidus 19atgaagtggt tattggcaat agttgcgatg tctggtgtcg cattggcggc
agaaaataag 60aatgttgagg tgagcagtga gcatttcgtc cgttatcaat accaagacaa
aatcagctat 120ggaaagctag acaatgacgc agtgttaccg gtcagcggcg
atctctttgg cgaatattcg 180gtagcaaaaa attcgatccc gttagagtcg
gttgaggtgt tactaccgac aaaaccagag 240aaagtcttcg ccgtcgggat
gaacttcgct agccacttag cctcacctgc cgatgcacca 300ccgccgatgt
ttcttaaact tccttcttct ttgattctca cgggcgaagt gattcaagtg
360ccaccaaaag caagaaatgt tcattttgaa ggcgagctgg tggttgtgat
tggtagagag 420ctcagtcaag ccagtgaaga agaagccgaa caagcgatct
ttggcgtcac ggtgggcaac 480gatattactg aaagaagttg gcaaggcgcc
gatttacaat ggctccgagc gaaagcttcc 540gatggttttg gcccggttgg
caacacaatt gtgcgcggca ttgattacaa caatattgag 600ttaaccactc
gtgttaacgg taaagtggtt caacaagaaa atacttcgtt catgatccac
660aagccaagaa aagtcgtgag ctatttgagc tattatttta ccctcaaacc
gggcgatcta 720attttcatgg gcacgccagg tagaacttat gctctgtccg
acaaagatca agtgagtgtc 780acgattgaag gggtagggac tgtggtaaat
gaagtgcggt tctga 82520274PRTVibrio splendidus 20Met Lys Trp Leu Leu
Ala Ile Val Ala Met Ser Gly Val Ala Leu Ala1 5 10 15 Ala Glu Asn
Lys Asn Val Glu Val Ser Ser Glu His Phe Val Arg Tyr 20 25 30 Gln
Tyr Gln Asp Lys Ile Ser Tyr Gly Lys Leu Asp Asn Asp Ala Val 35 40
45 Leu Pro Val Ser Gly Asp Leu Phe Gly Glu Tyr Ser Val Ala Lys Asn
50 55 60 Ser Ile Pro Leu Glu Ser Val Glu Val Leu Leu Pro Thr Lys
Pro Glu65 70 75 80 Lys Val Phe Ala Val Gly Met Asn Phe Ala Ser His
Leu Ala Ser Pro 85 90 95 Ala Asp Ala Pro Pro Pro Met Phe Leu Lys
Leu Pro Ser Ser Leu Ile 100 105 110 Leu Thr Gly Glu Val Ile Gln Val
Pro Pro Lys Ala Arg Asn Val His 115 120 125 Phe Glu Gly Glu Leu Val
Val Val Ile Gly Arg Glu Leu Ser Gln Ala 130 135 140 Ser Glu Glu Glu
Ala Glu Gln Ala Ile Phe Gly Val Thr Val Gly Asn145 150 155 160 Asp
Ile Thr Glu Arg Ser Trp Gln Gly Ala Asp Leu Gln Trp Leu Arg 165 170
175 Ala Lys Ala Ser Asp Gly Phe Gly Pro Val Gly Asn Thr Ile Val Arg
180 185 190 Gly Ile Asp Tyr Asn Asn Ile Glu Leu Thr Thr Arg Val Asn
Gly Lys 195 200 205 Val Val Gln Gln Glu Asn Thr Ser Phe Met Ile His
Lys Pro Arg Lys 210 215 220 Val Val Ser Tyr Leu Ser Tyr Tyr Phe Thr
Leu Lys Pro Gly Asp Leu225 230 235 240 Ile Phe Met Gly Thr Pro Gly
Arg Thr Tyr Ala Leu Ser Asp Lys Asp 245 250 255 Gln Val Ser Val Thr
Ile Glu Gly Val Gly Thr Val Val Asn Glu Val 260 265 270 Arg
Phe21717DNAVibrio splendidus 21atggctagca cttttaattc aatttcgggc
tcgaagcgta gcctgcacgt gcaagtagca 60cgcgaaatcg ctcgaggaat tttgtctggt
gatctgccgc aaggttctat tattcctggt 120gaaatggcgt tgtgtgaaca
gtttggtatc agccgaacgg cacttcgtga agcagttaaa 180ctactgacct
ctaaaggtct gttagagtct cgccctaaaa ttggtactcg cgtagtcgac
240cgcgcatact ggaacttcct tgatcctcaa ctgattgaat ggatggacgg
actaaccgac 300gtagaccaat tctgttctca gtttttaggc cttcgccgtg
cgatcgagcc tgaagcgtgt 360gcactggcgg caaaatttgc gacagctgaa
caacgtatcg agctttcaga gatcttccaa 420aagatggtcg aagtggatga
agctgaagtg tttgaccaag aacgttggac agacattgat 480actcgtttcc
atagcttgat cttcaatgcg accggtaacg acttctatct accgttcggt
540aatattctga ctactatgtt cgttaacttc atagtgcatt cttctgaaga
gggaagcaca 600tgcatcaatg aacaccgcag aatctatgaa gctatcatgg
ccggtgattg tgacaaggct 660agaattgctt ctgctgttca cttgcaagat
gccaaccacc gtttggcaac agcataa 71722238PRTVibrio splendidus 22Met
Ala Ser Thr Phe Asn Ser Ile Ser Gly Ser Lys Arg Ser Leu His1 5 10
15 Val Gln Val Ala Arg Glu Ile Ala Arg Gly Ile Leu Ser Gly Asp Leu
20 25 30 Pro Gln Gly Ser Ile Ile Pro Gly Glu Met Ala Leu Cys Glu
Gln Phe 35 40 45 Gly Ile Ser Arg Thr Ala Leu Arg Glu Ala Val Lys
Leu Leu Thr Ser 50 55 60 Lys Gly Leu Leu Glu Ser Arg Pro Lys Ile
Gly Thr Arg Val Val Asp65 70 75 80 Arg Ala Tyr Trp Asn Phe Leu Asp
Pro Gln Leu Ile Glu Trp Met Asp 85 90 95 Gly Leu Thr Asp Val Asp
Gln Phe Cys Ser Gln Phe Leu Gly Leu Arg 100 105 110 Arg Ala Ile Glu
Pro Glu Ala Cys Ala Leu Ala Ala Lys Phe Ala Thr 115 120 125 Ala Glu
Gln Arg Ile Glu Leu Ser Glu Ile Phe Gln Lys Met Val Glu 130 135 140
Val Asp Glu Ala Glu Val Phe Asp Gln Glu Arg Trp Thr Asp Ile Asp145
150 155 160 Thr Arg Phe His Ser Leu Ile Phe Asn Ala Thr Gly Asn Asp
Phe Tyr 165 170 175 Leu Pro Phe Gly Asn Ile Leu Thr Thr Met Phe Val
Asn Phe Ile Val 180 185 190 His Ser Ser Glu Glu Gly Ser Thr Cys Ile
Asn Glu His Arg Arg Ile 195 200 205 Tyr Glu Ala Ile Met Ala Gly Asp
Cys Asp Lys Ala Arg Ile Ala Ser 210 215 220 Ala Val His Leu Gln Asp
Ala Asn His Arg Leu Ala Thr Ala225 230 235 231779DNAVibrio
splendidus 23atggaactca acacgattat tgtcggcatt tatttcctat tcttgattgc
gataggttgg 60atgtttagaa catttacaag tactactagt gactacttcc gcgggggcgg
taacatgttg 120tggtggatgg ttggtgcaac cgcctttatg acccagttta
gtgcatggac attcaccggt 180gcagcaggta aagcgtataa cgatggtttc
gctgtagcgg tcatcttcgt agccaacgca 240tttggttact tcatgaacta
cgcgtacttc gcgccgaaat tccgtcaact tcgcgttgtt 300acggtaatcg
aagcgattcg tatgcgtttt ggtgcgacca acgaacaagt attcacttgg
360tcttcaatgc caaactcagt ggtatctgcg ggtgtgtggt taaacgcatt
ggcaatcatc 420gcttcgggta tcttcggttt cgacatgaac atgactatct
gggtgactgg cctagtggta 480ttggcaatgt cggtaacagg tggttcatgg
gcggtaatcg catctgactt catgcagatg 540gttatcatca tggcggtaac
ggtaacttgt gcggttgtag cggttgttca aggtggcggt 600gttggtgaga
ttgttaacaa cttcccagta caagatggtg gttcgttcct ttggggcaac
660aacatcaact acctaagcat ctttacgatt tgggcattct tcatcttcgt
taagcagttc 720tcaatcacga acaacatgct taactcttac cgttacctag
cggctaaaga ctcaaagaac 780gctaagaaag ctgcactgct tgcttgtgtg
ttgatgttgt gtggtgtgtt tatttggttc 840atgccttctt ggttcattgc
aggccaaggt gttgatttat cagcggctta cccgaatgca 900ggtaaaaaag
cgggtgactt tgcttaccta tacttcgtac aagagtacat gccagcaggt
960atggttggtc tattagttgc cgcgatgttt gcagcgacaa tgtcttcaat
ggactcaggt 1020ctaaaccgta actcaggtat ttttgttaag aacttctacg
aaacaatcgt tcgtaaaggt 1080caagcatcag agaaagagct agtaaccgta
tctaaaatta cttcagcggt atttggtttc 1140gctattatcc taatcgcaca
gttcatcaac tcattaaaag gcttaagcct gtttgatacg 1200atgatgtacg
taggtgcgtt aatcggcttc cctatgacga ttcctgcatt ccttggtttc
1260ttcatcaaga agactccgga ctgggctggt tggggaacgc tagttgttgg
tggtatcgta 1320tcttatgtgg ttggttttgt tatcaacgcg gagatggtag
cagcggcgtt tggtcttgat 1380actctaacag gacgtgaatg gtctgatgtt
aaagttgcga ttggtctgat tgctcacatc 1440acgctaaccg gtggcttctt
cgtactatct acgatgttct acaagcctct atcaaaagaa 1500cgtcaagcgg
atgttgataa gttctttggc aacttagata ccccattagt agctgaatcg
1560gcagagcaaa aagtgttgga taacaaacaa cgtcaaatgc ttggtaaact
gattgcggta 1620gcgggtgttg gtattatgct gatggctctt ctgactaacc
caatgtgggg gcgcctagtc 1680ttcatcttat gtggtgtgat agtgggtggt
gtcggtattc tacttgtgaa agcggtcgat 1740gacggcggca agcaagcgaa
agcagtaacc gaaagctaa 177924592PRTVibrio splendidus 24Met Glu Leu
Asn Thr Ile Ile Val Gly Ile Tyr Phe Leu Phe Leu Ile1 5 10 15 Ala
Ile Gly Trp Met Phe Arg Thr Phe Thr Ser Thr Thr Ser Asp Tyr 20 25
30 Phe Arg Gly Gly Gly Asn Met Leu Trp Trp Met Val Gly Ala Thr Ala
35 40 45 Phe Met Thr Gln Phe Ser Ala Trp Thr Phe Thr Gly Ala Ala
Gly Lys 50 55 60 Ala Tyr Asn Asp Gly Phe Ala Val Ala Val Ile Phe
Val Ala Asn Ala65 70 75 80 Phe Gly Tyr Phe Met Asn Tyr Ala Tyr Phe
Ala Pro Lys Phe Arg Gln 85 90 95 Leu Arg Val Val Thr Val Ile Glu
Ala Ile Arg Met Arg Phe Gly Ala 100 105 110 Thr Asn Glu Gln Val Phe
Thr Trp Ser Ser Met Pro Asn Ser Val Val 115 120 125 Ser Ala Gly Val
Trp Leu Asn Ala Leu Ala Ile Ile Ala Ser Gly Ile 130 135 140 Phe Gly
Phe Asp Met Asn Met Thr Ile Trp Val Thr Gly Leu Val Val145 150 155
160 Leu Ala Met Ser Val Thr Gly Gly Ser Trp Ala Val Ile Ala Ser Asp
165 170 175 Phe Met Gln Met Val Ile Ile Met Ala Val Thr Val Thr Cys
Ala Val 180 185 190 Val Ala Val Val Gln Gly Gly Gly Val Gly Glu Ile
Val Asn Asn Phe 195 200 205 Pro Val Gln Asp Gly Gly Ser Phe Leu Trp
Gly Asn Asn Ile Asn Tyr 210 215 220 Leu Ser Ile Phe Thr Ile Trp Ala
Phe Phe Ile Phe Val Lys Gln Phe225 230 235 240 Ser Ile Thr Asn Asn
Met Leu Asn Ser Tyr Arg Tyr Leu Ala Ala Lys 245 250 255 Asp Ser Lys
Asn Ala Lys Lys Ala Ala Leu Leu Ala Cys Val Leu Met 260 265 270 Leu
Cys Gly Val Phe Ile Trp Phe Met Pro Ser Trp Phe Ile Ala Gly 275 280
285 Gln Gly Val Asp Leu Ser Ala Ala Tyr Pro Asn Ala Gly Lys Lys Ala
290 295 300 Gly Asp Phe Ala Tyr Leu Tyr Phe Val Gln Glu Tyr Met Pro
Ala Gly305 310 315 320 Met Val Gly Leu Leu Val Ala Ala Met Phe Ala
Ala Thr Met Ser Ser 325 330 335 Met Asp Ser Gly Leu Asn Arg Asn Ser
Gly Ile Phe Val Lys Asn Phe 340 345 350 Tyr Glu Thr Ile Val Arg Lys
Gly Gln Ala Ser Glu Lys Glu Leu Val 355 360 365 Thr Val Ser Lys Ile
Thr Ser Ala Val Phe Gly Phe Ala Ile Ile Leu 370 375 380 Ile Ala Gln
Phe Ile Asn Ser Leu Lys Gly Leu Ser Leu Phe Asp Thr385 390 395 400
Met Met Tyr Val Gly Ala Leu Ile Gly Phe Pro Met Thr Ile Pro Ala 405
410 415 Phe Leu Gly Phe Phe Ile Lys Lys Thr Pro Asp Trp Ala Gly Trp
Gly 420 425 430 Thr Leu Val Val Gly Gly Ile Val Ser Tyr Val Val Gly
Phe Val Ile 435 440 445 Asn Ala Glu Met Val Ala Ala Ala Phe Gly Leu
Asp Thr Leu Thr Gly 450 455 460 Arg Glu Trp Ser Asp Val Lys Val Ala
Ile Gly Leu Ile Ala His Ile465 470 475 480 Thr Leu Thr Gly Gly Phe
Phe Val Leu Ser Thr Met Phe Tyr Lys Pro 485 490 495 Leu Ser Lys Glu
Arg Gln Ala Asp Val Asp Lys Phe Phe Gly Asn Leu 500 505 510 Asp Thr
Pro Leu Val Ala Glu Ser Ala Glu Gln Lys Val Leu Asp Asn 515 520 525
Lys Gln Arg Gln Met Leu Gly Lys Leu Ile Ala Val Ala Gly Val Gly 530
535 540 Ile Met Leu Met Ala Leu Leu Thr Asn Pro Met Trp Gly Arg Leu
Val545 550 555 560 Phe Ile Leu Cys Gly Val Ile Val Gly Gly Val Gly
Ile Leu Leu Val 565 570 575 Lys Ala Val Asp Asp Gly Gly Lys Gln Ala
Lys Ala Val Thr Glu Ser 580 585 590 252079DNAVibrio splendidus
25atgagcgacc aaaaatctct tgatgcaatc aggaagatga agctggaaaa cgatacttca
60gcaggtaatc ttgtagacct actccctatc gaagttcaaa cacgtgactt cgacctatca
120ttcctagaca ccttgagcga agcacgtccg cgtcttcttg ttcaagctga
tcagctagaa 180gaattcaaag caaaagtgaa agctgatcaa gctcactgta
tgtttgatga tttctacaac 240aactctaccg ttaagttcct tgagactgct
cctttcgaag agcctcaagc gtacccagct 300gagacggtag gtaaagcttc
tctatggcgt ccttattggc gtcaaatgta cgttgattgc 360caaatggcac
tgaacgcgac acgtaaccta gcgattgctg gtgttgtaaa agaagacgaa
420gcgctcattg cgaaagcaaa agcttggact ctaaaactgt ctacgtacga
tccagaaggc 480gtgacttctc gtggctataa cgatgaagcg gctttccgtg
ttatcgctgc tatggcttgg 540ggttacgatt ggctacacgg ctacttcacc
gatgaagaac gccagcaagt tcaagatgct 600ttgattgagc gtctagacga
aatcatgcac cacctgaaag tgacggttga tctattgaac 660aacccactaa
atagccacgg tgttcgttct atctcttctg ctatcatccc aacgtgtatc
720gcgctttacc acgatcaccc gaaagcaggc gagtacattg catacgcgct
agaatactac 780gcagtacatt acccaccatg gggcggtgta gacggcggtt
gggctgaagg tcctgattac 840tggaacacgc aaactgcatt cctaggcgaa
gcattcgacc tattgaaagc atactgtggt 900gtagacatgt ttaacaaaac
attctacgaa aacacaggtg atttcccgct ttactgcatg 960ccagttcact
ctaagcgcgc gagcttctgt gaccagtctt caatcggcga tttcccaggt
1020ttaaaactgg cttacaacat caagcactac gcaggtgtta accagaagcc
tgagtacgtt 1080tggtactata accagcttaa aggccgtgat actgaagcac
acaccaaatt ctacaacttc 1140ggttggtggg acttcggtta tgacgatctt
cgttttaact tcctttggga tgcacctgaa 1200gagaaagccc catcgaacga
tccactgttg aaagtattcc caatcacggg ttgggctgca 1260ttccacaaca
agatgactga gcgtgataac catattcaca tggtattcaa atgttctccg
1320tttggctcaa tcagccactc tcacggtgac caaaacgcat ttacgcttca
cgcatttggt 1380gaaacgctag cgtcagtaac aggttactat ggtggtttcg
gtgtagacat gcacacgaaa 1440tggcgtcgtc aaacgttctc taaaaacctg
ccactatttg gcggtaaagg tcagtacggc 1500gagaacaaga acacaggcta
cgaaaaccac caagatcgct tttgtatcga agcgggcggc 1560actatctctg
acttcgacac tgaatctgat gtgaagatgg ttgaaggtga tgcaacggca
1620tcttacaagt acttcgttcc tgaaatcgaa tcttacaagc gtaaagtctg
gttcgttcaa 1680ggtaaagtct tcgtaatgca agacaaggca acgctttctg
aagagaaaga catgacttgg 1740ctaatgcaca caactttcgc aaacgaagtg
gcagacaagt ctttcactat ccgtggcgaa 1800gttgcgcacc tagacgtaaa
cttcatcaac gagtctgctg ataacatcac gtcagttaag 1860aacgttgaag
gctttggcga agttgaccca tacgagttca aagatcttga gatccaccgt
1920cacgtggaag tggaattcaa gccatcgaaa gagcacaaca tcctgacgct
tcttgttcct 1980aataagaatg aaggcgagca agttgaagtg tttcacaagc
ttgaaggcaa cacgctactg 2040ctaaatgttg acggcgaaac ggtttcaatc
gaactgtaa 207926692PRTVibrio splendidus 26Met Ser Asp Gln Lys Ser
Leu Asp Ala Ile Arg Lys Met Lys Leu Glu1 5 10 15 Asn Asp Thr Ser
Ala Gly Asn Leu Val Asp Leu Leu Pro Ile Glu Val 20 25 30 Gln Thr
Arg Asp Phe Asp Leu Ser Phe Leu Asp Thr Leu Ser Glu Ala 35 40 45
Arg Pro Arg Leu Leu Val Gln Ala Asp Gln Leu Glu Glu Phe Lys Ala 50
55 60 Lys Val Lys Ala Asp Gln Ala His Cys Met Phe Asp Asp Phe Tyr
Asn65 70 75 80 Asn Ser Thr Val Lys Phe Leu Glu Thr Ala Pro Phe Glu
Glu Pro Gln 85 90 95 Ala Tyr Pro Ala Glu Thr Val Gly Lys Ala Ser Leu Trp
Arg Pro Tyr 100 105 110 Trp
Arg Gln Met Tyr Val Asp Cys Gln Met Ala Leu Asn Ala Thr Arg 115 120
125 Asn Leu Ala Ile Ala Gly Val Val Lys Glu Asp Glu Ala Leu Ile Ala
130 135 140 Lys Ala Lys Ala Trp Thr Leu Lys Leu Ser Thr Tyr Asp Pro
Glu Gly145 150 155 160 Val Thr Ser Arg Gly Tyr Asn Asp Glu Ala Ala
Phe Arg Val Ile Ala 165 170 175 Ala Met Ala Trp Gly Tyr Asp Trp Leu
His Gly Tyr Phe Thr Asp Glu 180 185 190 Glu Arg Gln Gln Val Gln Asp
Ala Leu Ile Glu Arg Leu Asp Glu Ile 195 200 205 Met His His Leu Lys
Val Thr Val Asp Leu Leu Asn Asn Pro Leu Asn 210 215 220 Ser His Gly
Val Arg Ser Ile Ser Ser Ala Ile Ile Pro Thr Cys Ile225 230 235 240
Ala Leu Tyr His Asp His Pro Lys Ala Gly Glu Tyr Ile Ala Tyr Ala 245
250 255 Leu Glu Tyr Tyr Ala Val His Tyr Pro Pro Trp Gly Gly Val Asp
Gly 260 265 270 Gly Trp Ala Glu Gly Pro Asp Tyr Trp Asn Thr Gln Thr
Ala Phe Leu 275 280 285 Gly Glu Ala Phe Asp Leu Leu Lys Ala Tyr Cys
Gly Val Asp Met Phe 290 295 300 Asn Lys Thr Phe Tyr Glu Asn Thr Gly
Asp Phe Pro Leu Tyr Cys Met305 310 315 320 Pro Val His Ser Lys Arg
Ala Ser Phe Cys Asp Gln Ser Ser Ile Gly 325 330 335 Asp Phe Pro Gly
Leu Lys Leu Ala Tyr Asn Ile Lys His Tyr Ala Gly 340 345 350 Val Asn
Gln Lys Pro Glu Tyr Val Trp Tyr Tyr Asn Gln Leu Lys Gly 355 360 365
Arg Asp Thr Glu Ala His Thr Lys Phe Tyr Asn Phe Gly Trp Trp Asp 370
375 380 Phe Gly Tyr Asp Asp Leu Arg Phe Asn Phe Leu Trp Asp Ala Pro
Glu385 390 395 400 Glu Lys Ala Pro Ser Asn Asp Pro Leu Leu Lys Val
Phe Pro Ile Thr 405 410 415 Gly Trp Ala Ala Phe His Asn Lys Met Thr
Glu Arg Asp Asn His Ile 420 425 430 His Met Val Phe Lys Cys Ser Pro
Phe Gly Ser Ile Ser His Ser His 435 440 445 Gly Asp Gln Asn Ala Phe
Thr Leu His Ala Phe Gly Glu Thr Leu Ala 450 455 460 Ser Val Thr Gly
Tyr Tyr Gly Gly Phe Gly Val Asp Met His Thr Lys465 470 475 480 Trp
Arg Arg Gln Thr Phe Ser Lys Asn Leu Pro Leu Phe Gly Gly Lys 485 490
495 Gly Gln Tyr Gly Glu Asn Lys Asn Thr Gly Tyr Glu Asn His Gln Asp
500 505 510 Arg Phe Cys Ile Glu Ala Gly Gly Thr Ile Ser Asp Phe Asp
Thr Glu 515 520 525 Ser Asp Val Lys Met Val Glu Gly Asp Ala Thr Ala
Ser Tyr Lys Tyr 530 535 540 Phe Val Pro Glu Ile Glu Ser Tyr Lys Arg
Lys Val Trp Phe Val Gln545 550 555 560 Gly Lys Val Phe Val Met Gln
Asp Lys Ala Thr Leu Ser Glu Glu Lys 565 570 575 Asp Met Thr Trp Leu
Met His Thr Thr Phe Ala Asn Glu Val Ala Asp 580 585 590 Lys Ser Phe
Thr Ile Arg Gly Glu Val Ala His Leu Asp Val Asn Phe 595 600 605 Ile
Asn Glu Ser Ala Asp Asn Ile Thr Ser Val Lys Asn Val Glu Gly 610 615
620 Phe Gly Glu Val Asp Pro Tyr Glu Phe Lys Asp Leu Glu Ile His
Arg625 630 635 640 His Val Glu Val Glu Phe Lys Pro Ser Lys Glu His
Asn Ile Leu Thr 645 650 655 Leu Leu Val Pro Asn Lys Asn Glu Gly Glu
Gln Val Glu Val Phe His 660 665 670 Lys Leu Glu Gly Asn Thr Leu Leu
Leu Asn Val Asp Gly Glu Thr Val 675 680 685 Ser Ile Glu Leu 690
27882DNAVibrio splendidus 27atgactaaac ctgtaatcgg tttcattggc
ctaggtctta tgggcggcaa catggttgaa 60aacctacaaa agcgcggcta ccacgtaaac
gtaatggatc taagcgctga agctgttgct 120cgcgtaacag atcgcggcaa
cgcaactgca ttcacttctg ctaaagaact agctgctgca 180agtgacatcg
ttcagttttg tctgacaact tctgctgttg ttgaaaaaat cgtttacggc
240gaagacggcg ttctagcggg catcaaagaa ggcgcagtac tagtagactt
cggtacttct 300atccctgctt ctactaagaa aatcggcgca gctcttgctg
aaaaaggcgc gggcatgatc 360gacgcacctc taggtcgtac tcctgcacac
gctaaagatg gtcttctgaa catcatggct 420gctggcgaca tggaaacttt
caacaaagtt aaacctgttc ttgaagagca aggcgaaaac 480gtattccacc
taggggctct aggttctggt cacgtgacta agcttgtaaa caacttcatg
540ggtatgacga ctgttgcgac tatgtctcaa gctttcgctg ttgctcaacg
cgctggtgtt 600gatggccaac aactgtttga catcatgtct gcaggtccat
ctaactctcc gttcatgcaa 660ttctgtaagt tctacgcggt agacggcgaa
gagaagctag gtttctctgt tgctaacgca 720aacaaagacc ttggttactt
ccttgcactt tgtgaagagc taggtactga gtctctaatc 780gctcaaggta
ctgcaacaag cctacaagct gctgttgatg caggcatggg taacaacgac
840gtaccagtaa tcttcgacta cttcgctaaa ctagagaagt aa 88228293PRTVibrio
splendidus 28Met Thr Lys Pro Val Ile Gly Phe Ile Gly Leu Gly Leu
Met Gly Gly1 5 10 15 Asn Met Val Glu Asn Leu Gln Lys Arg Gly Tyr
His Val Asn Val Met 20 25 30 Asp Leu Ser Ala Glu Ala Val Ala Arg
Val Thr Asp Arg Gly Asn Ala 35 40 45 Thr Ala Phe Thr Ser Ala Lys
Glu Leu Ala Ala Ala Ser Asp Ile Val 50 55 60 Gln Phe Cys Leu Thr
Thr Ser Ala Val Val Glu Lys Ile Val Tyr Gly65 70 75 80 Glu Asp Gly
Val Leu Ala Gly Ile Lys Glu Gly Ala Val Leu Val Asp 85 90 95 Phe
Gly Thr Ser Ile Pro Ala Ser Thr Lys Lys Ile Gly Ala Ala Leu 100 105
110 Ala Glu Lys Gly Ala Gly Met Ile Asp Ala Pro Leu Gly Arg Thr Pro
115 120 125 Ala His Ala Lys Asp Gly Leu Leu Asn Ile Met Ala Ala Gly
Asp Met 130 135 140 Glu Thr Phe Asn Lys Val Lys Pro Val Leu Glu Glu
Gln Gly Glu Asn145 150 155 160 Val Phe His Leu Gly Ala Leu Gly Ser
Gly His Val Thr Lys Leu Val 165 170 175 Asn Asn Phe Met Gly Met Thr
Thr Val Ala Thr Met Ser Gln Ala Phe 180 185 190 Ala Val Ala Gln Arg
Ala Gly Val Asp Gly Gln Gln Leu Phe Asp Ile 195 200 205 Met Ser Ala
Gly Pro Ser Asn Ser Pro Phe Met Gln Phe Cys Lys Phe 210 215 220 Tyr
Ala Val Asp Gly Glu Glu Lys Leu Gly Phe Ser Val Ala Asn Ala225 230
235 240 Asn Lys Asp Leu Gly Tyr Phe Leu Ala Leu Cys Glu Glu Leu Gly
Thr 245 250 255 Glu Ser Leu Ile Ala Gln Gly Thr Ala Thr Ser Leu Gln
Ala Ala Val 260 265 270 Asp Ala Gly Met Gly Asn Asn Asp Val Pro Val
Ile Phe Asp Tyr Phe 275 280 285 Ala Lys Leu Glu Lys 290
291872DNAVibrio splendidus 29atggtagcgg tcgtcagttc tagtgctttg
gcatttacga actggtttac gcttaacttg 60gccactgaac aggtaaacca aacgatttat
aacgagattg atcactcgct tacgatagaa 120atcaatcaaa tagaaagtac
cgttcagcgc accatcgata ccgttaactc tgttgcacaa 180gagttcatga
aatcccctta ccaagtgccg aatgaagcac tcatgcatta tgccgctaag
240cttggtggca ttgacaagat tgtggtgggt tttgacgacg gccgttctta
tacctctcgc 300ccttcagagt ctttccctaa cggtgttgga ataaaagaaa
aatacaatcc aaccactcga 360ccttggtatc aacaagcgaa attgaaatca
ggcttatctt ttagtggtct gtttttcact 420aagagtactc aagtgcctat
gatcggtgtg acctactcat accaagatcg tgtcatcatg 480gccgatatac
gctttgacga tttggaaacg cagcttgaac agctggacag catctacgaa
540gccaaaggca ttatcatcga cgaaaagggg atggtggtcg cttcaacaat
cgaaaacgtg 600cttccgcaaa ccaatatatc ttctgcagac actcaaatga
aactcaacag tgccattgaa 660cagcctgatc aattcattga gggtgtgatt
gatggtaacc agagaatctt gatggccaag 720aaagtggata ttggcagcca
gaaagagtgg ttcatgatct ccagtattga ccctgaactc 780gcgctcaatc
agctgaatgg cgtgatgtcg agtgcgcgca tccttatcgt cgcttgtgta
840cttggctcgg tgatattgat gattttactt ctgaatcgtt tctaccgccc
aatcgtgtca 900ctgcgcaaaa tcgtccacga tctatcacaa ggtaacggag
acctcactca aaggcttgct 960gagaagggga atgatgactt agggcatatc
gccaaagaca tcaacttgtt cattatcggc 1020ttacaagaga tggttaagga
tgtgaaatac aagaactcgg atctcgatac caaggtactg 1080agtattcgcg
aaggttgtaa agaaaccagc gatgtactga aagttcatac tgatgaaacg
1140gttcaagtgg tctctgcgat taacggcttg tctgaagcat caaacgaagt
agagaagagt 1200tctcagtcgg cggcagaagc agcaagagag gccgctgtgt
tcagtgatga gacgaaacag 1260attaacacgg tgacggaaac ctatatcagt
gatcttgaga agcaagtctg caccacttct 1320gatgacattc gctcaatggc
caatgaaacg cagagcatcc agtctatcgt gtctgtgatt 1380ggcggaattg
cggaacaaac taatttgctg gcattgaatg cgtcaattga agcggcgagg
1440gcgggtgaac atggtcgagg tttcgcggtg gttgctgatg aagtccgtgc
gctagccaac 1500cgaacgcaaa tcagtacctc tgaaattgat gaagcgttat
ctggcttgca gtctaaatca 1560gatggtttgg ttaaatctat tgagttgacc
aaaagtaact gtgaactgac tcgcgctcaa 1620gttgttcaag ctgtaaacat
gttggcgaag ctaaccgagc agatggaaac agtaagtcgt 1680tttaataatg
acatttcggg ttcgtctgtt gagcaaaacg cccttattca gagcattgct
1740aagaacatgc ataagattga aagctttgtt gaggagctta ataaactaag
ccaagatcag 1800ttaactgaat cagcagaaat caaaacactt aacggtagcg
ttagtgaatt gatgagcagc 1860tttaaggttt aa 187230623PRTVibrio
splendidus 30Met Val Ala Val Val Ser Ser Ser Ala Leu Ala Phe Thr
Asn Trp Phe1 5 10 15 Thr Leu Asn Leu Ala Thr Glu Gln Val Asn Gln
Thr Ile Tyr Asn Glu 20 25 30 Ile Asp His Ser Leu Thr Ile Glu Ile
Asn Gln Ile Glu Ser Thr Val 35 40 45 Gln Arg Thr Ile Asp Thr Val
Asn Ser Val Ala Gln Glu Phe Met Lys 50 55 60 Ser Pro Tyr Gln Val
Pro Asn Glu Ala Leu Met His Tyr Ala Ala Lys65 70 75 80 Leu Gly Gly
Ile Asp Lys Ile Val Val Gly Phe Asp Asp Gly Arg Ser 85 90 95 Tyr
Thr Ser Arg Pro Ser Glu Ser Phe Pro Asn Gly Val Gly Ile Lys 100 105
110 Glu Lys Tyr Asn Pro Thr Thr Arg Pro Trp Tyr Gln Gln Ala Lys Leu
115 120 125 Lys Ser Gly Leu Ser Phe Ser Gly Leu Phe Phe Thr Lys Ser
Thr Gln 130 135 140 Val Pro Met Ile Gly Val Thr Tyr Ser Tyr Gln Asp
Arg Val Ile Met145 150 155 160 Ala Asp Ile Arg Phe Asp Asp Leu Glu
Thr Gln Leu Glu Gln Leu Asp 165 170 175 Ser Ile Tyr Glu Ala Lys Gly
Ile Ile Ile Asp Glu Lys Gly Met Val 180 185 190 Val Ala Ser Thr Ile
Glu Asn Val Leu Pro Gln Thr Asn Ile Ser Ser 195 200 205 Ala Asp Thr
Gln Met Lys Leu Asn Ser Ala Ile Glu Gln Pro Asp Gln 210 215 220 Phe
Ile Glu Gly Val Ile Asp Gly Asn Gln Arg Ile Leu Met Ala Lys225 230
235 240 Lys Val Asp Ile Gly Ser Gln Lys Glu Trp Phe Met Ile Ser Ser
Ile 245 250 255 Asp Pro Glu Leu Ala Leu Asn Gln Leu Asn Gly Val Met
Ser Ser Ala 260 265 270 Arg Ile Leu Ile Val Ala Cys Val Leu Gly Ser
Val Ile Leu Met Ile 275 280 285 Leu Leu Leu Asn Arg Phe Tyr Arg Pro
Ile Val Ser Leu Arg Lys Ile 290 295 300 Val His Asp Leu Ser Gln Gly
Asn Gly Asp Leu Thr Gln Arg Leu Ala305 310 315 320 Glu Lys Gly Asn
Asp Asp Leu Gly His Ile Ala Lys Asp Ile Asn Leu 325 330 335 Phe Ile
Ile Gly Leu Gln Glu Met Val Lys Asp Val Lys Tyr Lys Asn 340 345 350
Ser Asp Leu Asp Thr Lys Val Leu Ser Ile Arg Glu Gly Cys Lys Glu 355
360 365 Thr Ser Asp Val Leu Lys Val His Thr Asp Glu Thr Val Gln Val
Val 370 375 380 Ser Ala Ile Asn Gly Leu Ser Glu Ala Ser Asn Glu Val
Glu Lys Ser385 390 395 400 Ser Gln Ser Ala Ala Glu Ala Ala Arg Glu
Ala Ala Val Phe Ser Asp 405 410 415 Glu Thr Lys Gln Ile Asn Thr Val
Thr Glu Thr Tyr Ile Ser Asp Leu 420 425 430 Glu Lys Gln Val Cys Thr
Thr Ser Asp Asp Ile Arg Ser Met Ala Asn 435 440 445 Glu Thr Gln Ser
Ile Gln Ser Ile Val Ser Val Ile Gly Gly Ile Ala 450 455 460 Glu Gln
Thr Asn Leu Leu Ala Leu Asn Ala Ser Ile Glu Ala Ala Arg465 470 475
480 Ala Gly Glu His Gly Arg Gly Phe Ala Val Val Ala Asp Glu Val Arg
485 490 495 Ala Leu Ala Asn Arg Thr Gln Ile Ser Thr Ser Glu Ile Asp
Glu Ala 500 505 510 Leu Ser Gly Leu Gln Ser Lys Ser Asp Gly Leu Val
Lys Ser Ile Glu 515 520 525 Leu Thr Lys Ser Asn Cys Glu Leu Thr Arg
Ala Gln Val Val Gln Ala 530 535 540 Val Asn Met Leu Ala Lys Leu Thr
Glu Gln Met Glu Thr Val Ser Arg545 550 555 560 Phe Asn Asn Asp Ile
Ser Gly Ser Ser Val Glu Gln Asn Ala Leu Ile 565 570 575 Gln Ser Ile
Ala Lys Asn Met His Lys Ile Glu Ser Phe Val Glu Glu 580 585 590 Leu
Asn Lys Leu Ser Gln Asp Gln Leu Thr Glu Ser Ala Glu Ile Lys 595 600
605 Thr Leu Asn Gly Ser Val Ser Glu Leu Met Ser Ser Phe Lys Val 610
615 620 311743DNAVibrio splendidus 31gtgaataagc caatctttgt
cgtcgtactc gcttcgctta cgtatggctg cggtggaagc 60agctccagtg actctagtga
cccttctgat accaataact caggagcatc ttatggtgtt 120gttgctccct
atgatattgc caagtatcaa aacatccttt ccagctcaga tcttcaggtg
180tctgatccta atggagagga gggcaataaa acctctgaag tcaaagatgg
taacttcgat 240ggttatgtca gtgattattt ttatgctgac gaagagacgg
aaaatctgat cttcaaaatg 300gcgaactaca agatgcgctc tgaagttcgt
gaaggagaaa acttcgatat caatgaagca 360ggcgtaagac gcagtctaca
tgcggaaata agcctacctg atattgagca tgtaatggcg 420agttctcccg
cagatcacga tgaagtgacc gtgctacaga tccacaataa aggtacagac
480gagagtggca cgggttatat ccctcatccg ctattgcgtg tggtttggga
gcaagaacga 540gatggcctca caggtcacta ctgggcagtc atgaaaaata
atgccattga ctgtagcagt 600gccgctgact cttcggattg ttatgccact
tcatataatc gctacgattt gggagaggcg 660gatctcgata acttcaccaa
gtttgatctt tctgtttatg aaaataccct ttcgatcaaa 720gtgaacgatg
aagttaaagt cgacgaagac atcacctact ggcagcatct actgagttac
780tttaaagcgg gtatctacaa tcaatttgaa aatggtgaag ccacggctca
ctttcaggca 840ctgcgataca ccaccacaca ggtcaacggc tcaaacgatt
gggatattaa tgattggaag 900ttgacgattc ctgcgagtaa agacacttgg
tatggaagtg ggggtgacag tgcggctgaa 960ctagaacctg agcgctgcga
atcgagcaaa gaccttctcg ccaacgacag tgatgtctac 1020gacagcgata
ttggtctttc ttatttcaat accgatgaag ggagagtgca ctttagagcg
1080gatatgggat atggcacctc taccgaaaat tctagctata ttcgctctga
gctcagggag 1140ttgtatcaaa gcagtgttca accggattgt agcaccagcg
atgaagatac aagttggtat 1200ttggacgaca ctagaacgaa cgctaccagt
cacgagttaa ccgcaagctt acgaattgaa 1260gactacccga acattaataa
ccaagacccg aaagtggtgc ttgggcaaat acacggttgg 1320aagatcaatc
aagcattggt gaagttgtta tgggaaggcg agagtaagcc agtaagagtg
1380atactgaact ctgattttga gcgcaacaac caagactgta accattgtga
cccgttcagt 1440gtcgagttag gtacttattc ggcaagtgaa gagtggcgat
atacgattcg agccaatcaa 1500gacggtatct acttagcgac tcatgattta
gatggaacta atacggtttc tcatttaatc 1560ccttggggac aagattacac
agataaagat ggggacacgg tctcgttgac gtcagattgg 1620acatcgacag
acatcgcttt ctatttcaaa gcgggcatct acccacaatt taagcctgat
1680agcgactatg cgggtgaagt gtttgatgtg agctttagtt ctctaagagc
agagcataac 1740tga 174332580PRTVibrio splendidus 32Met Asn Lys Pro
Ile Phe Val Val Val Leu Ala Ser Leu Thr Tyr Gly1 5 10 15 Cys Gly
Gly Ser Ser Ser Ser Asp Ser Ser Asp Pro Ser Asp Thr Asn 20 25 30
Asn Ser Gly Ala Ser Tyr Gly Val Val Ala Pro Tyr Asp Ile Ala Lys 35
40 45 Tyr Gln Asn Ile Leu Ser Ser Ser Asp Leu Gln Val Ser Asp Pro
Asn 50 55 60 Gly Glu Glu Gly Asn Lys Thr Ser Glu Val Lys Asp Gly Asn Phe
Asp65 70 75 80 Gly Tyr Val Ser Asp Tyr Phe Tyr Ala
Asp Glu Glu Thr Glu Asn Leu 85 90 95 Ile Phe Lys Met Ala Asn Tyr
Lys Met Arg Ser Glu Val Arg Glu Gly 100 105 110 Glu Asn Phe Asp Ile
Asn Glu Ala Gly Val Arg Arg Ser Leu His Ala 115 120 125 Glu Ile Ser
Leu Pro Asp Ile Glu His Val Met Ala Ser Ser Pro Ala 130 135 140 Asp
His Asp Glu Val Thr Val Leu Gln Ile His Asn Lys Gly Thr Asp145 150
155 160 Glu Ser Gly Thr Gly Tyr Ile Pro His Pro Leu Leu Arg Val Val
Trp 165 170 175 Glu Gln Glu Arg Asp Gly Leu Thr Gly His Tyr Trp Ala
Val Met Lys 180 185 190 Asn Asn Ala Ile Asp Cys Ser Ser Ala Ala Asp
Ser Ser Asp Cys Tyr 195 200 205 Ala Thr Ser Tyr Asn Arg Tyr Asp Leu
Gly Glu Ala Asp Leu Asp Asn 210 215 220 Phe Thr Lys Phe Asp Leu Ser
Val Tyr Glu Asn Thr Leu Ser Ile Lys225 230 235 240 Val Asn Asp Glu
Val Lys Val Asp Glu Asp Ile Thr Tyr Trp Gln His 245 250 255 Leu Leu
Ser Tyr Phe Lys Ala Gly Ile Tyr Asn Gln Phe Glu Asn Gly 260 265 270
Glu Ala Thr Ala His Phe Gln Ala Leu Arg Tyr Thr Thr Thr Gln Val 275
280 285 Asn Gly Ser Asn Asp Trp Asp Ile Asn Asp Trp Lys Leu Thr Ile
Pro 290 295 300 Ala Ser Lys Asp Thr Trp Tyr Gly Ser Gly Gly Asp Ser
Ala Ala Glu305 310 315 320 Leu Glu Pro Glu Arg Cys Glu Ser Ser Lys
Asp Leu Leu Ala Asn Asp 325 330 335 Ser Asp Val Tyr Asp Ser Asp Ile
Gly Leu Ser Tyr Phe Asn Thr Asp 340 345 350 Glu Gly Arg Val His Phe
Arg Ala Asp Met Gly Tyr Gly Thr Ser Thr 355 360 365 Glu Asn Ser Ser
Tyr Ile Arg Ser Glu Leu Arg Glu Leu Tyr Gln Ser 370 375 380 Ser Val
Gln Pro Asp Cys Ser Thr Ser Asp Glu Asp Thr Ser Trp Tyr385 390 395
400 Leu Asp Asp Thr Arg Thr Asn Ala Thr Ser His Glu Leu Thr Ala Ser
405 410 415 Leu Arg Ile Glu Asp Tyr Pro Asn Ile Asn Asn Gln Asp Pro
Lys Val 420 425 430 Val Leu Gly Gln Ile His Gly Trp Lys Ile Asn Gln
Ala Leu Val Lys 435 440 445 Leu Leu Trp Glu Gly Glu Ser Lys Pro Val
Arg Val Ile Leu Asn Ser 450 455 460 Asp Phe Glu Arg Asn Asn Gln Asp
Cys Asn His Cys Asp Pro Phe Ser465 470 475 480 Val Glu Leu Gly Thr
Tyr Ser Ala Ser Glu Glu Trp Arg Tyr Thr Ile 485 490 495 Arg Ala Asn
Gln Asp Gly Ile Tyr Leu Ala Thr His Asp Leu Asp Gly 500 505 510 Thr
Asn Thr Val Ser His Leu Ile Pro Trp Gly Gln Asp Tyr Thr Asp 515 520
525 Lys Asp Gly Asp Thr Val Ser Leu Thr Ser Asp Trp Thr Ser Thr Asp
530 535 540 Ile Ala Phe Tyr Phe Lys Ala Gly Ile Tyr Pro Gln Phe Lys
Pro Asp545 550 555 560 Ser Asp Tyr Ala Gly Glu Val Phe Asp Val Ser
Phe Ser Ser Leu Arg 565 570 575 Ala Glu His Asn 580 331569DNAVibrio
splendidus 33atgaaacaaa ttactctaaa aactttactc gcttcttcta ttctacttgc
ggttggttgt 60gcgagcacga gcacgcctac tgctgatttt ccaaataaca aagaaactgg
tgaagcgctt 120ctgacgccag ttgctgtttc cgctagtagc catgatggta
acggacctga tcgtctcgtt 180gaccaagacc taactacacg ttggtcatct
gcgggtgacg gcgagtgggc aacgctagac 240tatggttcag tacaggagtt
tgacgcggtt caggcatctt tcagtaaagg taatcagcgc 300caatctaaat
ttgatatcca agtgagtgtt gatggcgaaa gctggacaac ggtactagaa
360aaccaactaa gctcaggtaa agcgatcggc ctagagcgtt tccaatttga
gccagtagtg 420caagcacgct acgtaagata cgttggtcac ggtaacacca
aaaacggttg gaacagtgtg 480actggattag cggcggttaa ctgtagcatt
aacgcatgtc ctgctagcca tatcatcact 540tcagacgtgg ttgcagcaga
agccgtgatt attgctgaaa tgaaagcggc agaaaaagca 600cgtaaagatg
cgcgcaaaga tctacgctct ggtaacttcg gtgtagcagc ggtttaccct
660tgtgagacga ccgttgaatg tgacactcgc agtgcacttc cagttccgac
aggcctgcca 720gcgacaccag ttgcaggtaa ctcgccaagc gaaaactttg
acatgacgca ttggtaccta 780tctcaaccat ttgaccatga caaaaatggc
aaacctgatg atgtgtctga gtggaacctt 840gcaaacggtt accaacaccc
tgaaatcttc tacacagctg atgacggcgg cctagtattc 900aaagcttacg
tgaaaggtgt acgtacctct aaaaacacta agtacgcgcg tacagagctt
960cgtgaaatga tgcgtcgtgg tgatcagtct attagcacta aaggtgttaa
taagaataac 1020tgggtattct caagcgctcc tgaatctgac ttagagtcgg
cagcgggtat tgacggcgtt 1080ctagaagcga cgttgaaaat cgaccatgca
acaacgacgg gtaatgcgaa tgaagtaggt 1140cgctttatca ttggtcagat
tcacgatcaa aacgatgaac caattcgttt gtactaccgt 1200aaactgccaa
accaagaaac gggtgcggtt tacttcgcac atgaaagcca agacgcaact
1260aaagaggact tctaccctct agtgggcgac atgacggctg aagtgggtga
cgatggtatc 1320gcgcttggcg aagtgttcag ctaccgtatt gacgttaaag
gcaacacgat gactgtaacg 1380ctaatacgtg aaggcaaaga cgatgttgta
caagtggttg atatgagcaa cagcggctac 1440gacgcaggcg gcaagtacat
gtacttcaaa gccggtgttt acaaccaaaa catcagcggc 1500gacctagacg
attactcaca agcgactttc tatcagctag atgtatcgca cgatcaatac
1560aaaaagtaa 156934522PRTVibrio splendidus 34Met Lys Gln Ile Thr
Leu Lys Thr Leu Leu Ala Ser Ser Ile Leu Leu1 5 10 15 Ala Val Gly
Cys Ala Ser Thr Ser Thr Pro Thr Ala Asp Phe Pro Asn 20 25 30 Asn
Lys Glu Thr Gly Glu Ala Leu Leu Thr Pro Val Ala Val Ser Ala 35 40
45 Ser Ser His Asp Gly Asn Gly Pro Asp Arg Leu Val Asp Gln Asp Leu
50 55 60 Thr Thr Arg Trp Ser Ser Ala Gly Asp Gly Glu Trp Ala Thr
Leu Asp65 70 75 80 Tyr Gly Ser Val Gln Glu Phe Asp Ala Val Gln Ala
Ser Phe Ser Lys 85 90 95 Gly Asn Gln Arg Gln Ser Lys Phe Asp Ile
Gln Val Ser Val Asp Gly 100 105 110 Glu Ser Trp Thr Thr Val Leu Glu
Asn Gln Leu Ser Ser Gly Lys Ala 115 120 125 Ile Gly Leu Glu Arg Phe
Gln Phe Glu Pro Val Val Gln Ala Arg Tyr 130 135 140 Val Arg Tyr Val
Gly His Gly Asn Thr Lys Asn Gly Trp Asn Ser Val145 150 155 160 Thr
Gly Leu Ala Ala Val Asn Cys Ser Ile Asn Ala Cys Pro Ala Ser 165 170
175 His Ile Ile Thr Ser Asp Val Val Ala Ala Glu Ala Val Ile Ile Ala
180 185 190 Glu Met Lys Ala Ala Glu Lys Ala Arg Lys Asp Ala Arg Lys
Asp Leu 195 200 205 Arg Ser Gly Asn Phe Gly Val Ala Ala Val Tyr Pro
Cys Glu Thr Thr 210 215 220 Val Glu Cys Asp Thr Arg Ser Ala Leu Pro
Val Pro Thr Gly Leu Pro225 230 235 240 Ala Thr Pro Val Ala Gly Asn
Ser Pro Ser Glu Asn Phe Asp Met Thr 245 250 255 His Trp Tyr Leu Ser
Gln Pro Phe Asp His Asp Lys Asn Gly Lys Pro 260 265 270 Asp Asp Val
Ser Glu Trp Asn Leu Ala Asn Gly Tyr Gln His Pro Glu 275 280 285 Ile
Phe Tyr Thr Ala Asp Asp Gly Gly Leu Val Phe Lys Ala Tyr Val 290 295
300 Lys Gly Val Arg Thr Ser Lys Asn Thr Lys Tyr Ala Arg Thr Glu
Leu305 310 315 320 Arg Glu Met Met Arg Arg Gly Asp Gln Ser Ile Ser
Thr Lys Gly Val 325 330 335 Asn Lys Asn Asn Trp Val Phe Ser Ser Ala
Pro Glu Ser Asp Leu Glu 340 345 350 Ser Ala Ala Gly Ile Asp Gly Val
Leu Glu Ala Thr Leu Lys Ile Asp 355 360 365 His Ala Thr Thr Thr Gly
Asn Ala Asn Glu Val Gly Arg Phe Ile Ile 370 375 380 Gly Gln Ile His
Asp Gln Asn Asp Glu Pro Ile Arg Leu Tyr Tyr Arg385 390 395 400 Lys
Leu Pro Asn Gln Glu Thr Gly Ala Val Tyr Phe Ala His Glu Ser 405 410
415 Gln Asp Ala Thr Lys Glu Asp Phe Tyr Pro Leu Val Gly Asp Met Thr
420 425 430 Ala Glu Val Gly Asp Asp Gly Ile Ala Leu Gly Glu Val Phe
Ser Tyr 435 440 445 Arg Ile Asp Val Lys Gly Asn Thr Met Thr Val Thr
Leu Ile Arg Glu 450 455 460 Gly Lys Asp Asp Val Val Gln Val Val Asp
Met Ser Asn Ser Gly Tyr465 470 475 480 Asp Ala Gly Gly Lys Tyr Met
Tyr Phe Lys Ala Gly Val Tyr Asn Gln 485 490 495 Asn Ile Ser Gly Asp
Leu Asp Asp Tyr Ser Gln Ala Thr Phe Tyr Gln 500 505 510 Leu Asp Val
Ser His Asp Gln Tyr Lys Lys 515 520 351230DNAVibrio splendidus
35atgcaaattt ctaaagtcgc tacagctgtc gctctttcga caggtttatt atttggttgt
60aacagtgatg gtttacctat tccaacagat ccaggcggaa cagaccctgt tgaacctgtt
120gaagtttact ctatagaaaa cgtctattgg gatctgacag gtggtgctgt
tgctgcacag 180tcactcagcg gaacttcacc atatcgcttt gataataatg
aggaaggtac tcgtgctcta 240agcatttaca gtggagacgt agctaatggc
ttcacttttg agagttcaat atatactgct 300gaagaagaag gtgttgtttc
ctttgaaggt aaggactgta cttacacagt gactgagcaa 360cagctagata
tgacctgtga aaaagatgac gtagaaacag cttactcagc aacagagatt
420acagatgaat ctgttataac tgcattagaa aatgccgatg atggaaaacc
taaatcagtc 480gatgatgtga acgctgcgat tgcatcagca gaagatggcg
cgattattga tttatcatct 540gaaggtacgt ttgataccgg tgttattgag
ctaaataaag ctgtcacaat tgatggtgct 600ggtttagcaa ccattaccgg
agatgcttgt attgatgtca ctgcacccgg tgcaggtatc 660aaaaacatga
cttttgctaa cgacaatttg gccgggtgtt ttggtaggga gtcagctggt
720acttcagata atgaaactgg tgcgatcgtt attggtaaaa ttggtaaaga
ttcagatcct 780gtagcacttg aaaacctaaa gttcgatgca aacggcatta
ccgaagatga tctaggtact 840aaaaaagcaa gttggttatt ctctcgaggt
tactttacat tagacaatag cgaatttgtc 900ggtttaagtg gcagtttcca
aaataatgca attcgtatta actgtagtag tgacaacggg 960cgatttggtt
cacaaatcac aaataataca ttcactatta actctggtgg tagtgatgtg
1020ggcggaatta aagttggtga ttctagcagt gccgtcataa agaatagtga
tgataacctt 1080ggctgtaatg tcactattga aagcaatacg ttcaatggtt
acaaaaccct actttcagct 1140gacaacggta aagatataag aaatacagcc
atctacgcac aaccatctgc agtgaacact 1200gcggcaggta aagaaaatat
cttgaactaa 123036409PRTVibrio splendidus 36Met Gln Ile Ser Lys Val
Ala Thr Ala Val Ala Leu Ser Thr Gly Leu1 5 10 15 Leu Phe Gly Cys
Asn Ser Asp Gly Leu Pro Ile Pro Thr Asp Pro Gly 20 25 30 Gly Thr
Asp Pro Val Glu Pro Val Glu Val Tyr Ser Ile Glu Asn Val 35 40 45
Tyr Trp Asp Leu Thr Gly Gly Ala Val Ala Ala Gln Ser Leu Ser Gly 50
55 60 Thr Ser Pro Tyr Arg Phe Asp Asn Asn Glu Glu Gly Thr Arg Ala
Leu65 70 75 80 Ser Ile Tyr Ser Gly Asp Val Ala Asn Gly Phe Thr Phe
Glu Ser Ser 85 90 95 Ile Tyr Thr Ala Glu Glu Glu Gly Val Val Ser
Phe Glu Gly Lys Asp 100 105 110 Cys Thr Tyr Thr Val Thr Glu Gln Gln
Leu Asp Met Thr Cys Glu Lys 115 120 125 Asp Asp Val Glu Thr Ala Tyr
Ser Ala Thr Glu Ile Thr Asp Glu Ser 130 135 140 Val Ile Thr Ala Leu
Glu Asn Ala Asp Asp Gly Lys Pro Lys Ser Val145 150 155 160 Asp Asp
Val Asn Ala Ala Ile Ala Ser Ala Glu Asp Gly Ala Ile Ile 165 170 175
Asp Leu Ser Ser Glu Gly Thr Phe Asp Thr Gly Val Ile Glu Leu Asn 180
185 190 Lys Ala Val Thr Ile Asp Gly Ala Gly Leu Ala Thr Ile Thr Gly
Asp 195 200 205 Ala Cys Ile Asp Val Thr Ala Pro Gly Ala Gly Ile Lys
Asn Met Thr 210 215 220 Phe Ala Asn Asp Asn Leu Ala Gly Cys Phe Gly
Arg Glu Ser Ala Gly225 230 235 240 Thr Ser Asp Asn Glu Thr Gly Ala
Ile Val Ile Gly Lys Ile Gly Lys 245 250 255 Asp Ser Asp Pro Val Ala
Leu Glu Asn Leu Lys Phe Asp Ala Asn Gly 260 265 270 Ile Thr Glu Asp
Asp Leu Gly Thr Lys Lys Ala Ser Trp Leu Phe Ser 275 280 285 Arg Gly
Tyr Phe Thr Leu Asp Asn Ser Glu Phe Val Gly Leu Ser Gly 290 295 300
Ser Phe Gln Asn Asn Ala Ile Arg Ile Asn Cys Ser Ser Asp Asn Gly305
310 315 320 Arg Phe Gly Ser Gln Ile Thr Asn Asn Thr Phe Thr Ile Asn
Ser Gly 325 330 335 Gly Ser Asp Val Gly Gly Ile Lys Val Gly Asp Ser
Ser Ser Ala Val 340 345 350 Ile Lys Asn Ser Asp Asp Asn Leu Gly Cys
Asn Val Thr Ile Glu Ser 355 360 365 Asn Thr Phe Asn Gly Tyr Lys Thr
Leu Leu Ser Ala Asp Asn Gly Lys 370 375 380 Asp Ile Arg Asn Thr Ala
Ile Tyr Ala Gln Pro Ser Ala Val Asn Thr385 390 395 400 Ala Ala Gly
Lys Glu Asn Ile Leu Asn 405 37861DNAVibrio splendidus 37atgaattctg
ttacaaaaat tgctgcagct gttgcatgta ctcttttagc gggcacagct 60gctggtgcat
ctcttgatta tcgttacgag tatcgtgctg cgacggatta tacaaagact
120aatggtgata cggctcacgt agacgctcgc catcaacacc gagttaagct
aggtgaaagc 180tttaagctgt cagacaagtg gaagcactct actggtctag
aacttaagtt ccacggtgat 240gactcttact atgatgaaga ttcaggttct
gttaaatcag caaacagcca gagtttttac 300gatggcaatt ggtacatcta
tggtatggag atcgataaca ctgcgacata caaaatagac 360aataattggt
atctacaaat gggtatgcct attgcttggg attgggatga gcctaatgct
420aacgatggcg actggaagat gaaaaaggtt acgtttaaac ctcagttccg
cgttggctat 480aaagcagata tgggtttaac aactgctatt cgttaccgtc
atgaatatgc tgacttccgt 540aaccacacac aatttggcga caaagattct
gaaactggcg agcgtttaga atcagctcaa 600aagtctaaag ttacactgac
gggctcttac aaaattgaat ctctacctaa gcttggcctt 660tcttacgaag
caaactatgt aaaatctttg gataacgtac ttctttataa tagtgatgac
720tgggaatggg atgctggctt aaaggtaaac tacaagttcg gttcttggaa
accttttgct 780gaaatctggt cttctgatat cagttcatct tcaaaagatc
gtgaagctaa ataccgtgtt 840ggtattgctt actcattcta a 86138286PRTVibrio
splendidus 38Met Asn Ser Val Thr Lys Ile Ala Ala Ala Val Ala Cys
Thr Leu Leu1 5 10 15 Ala Gly Thr Ala Ala Gly Ala Ser Leu Asp Tyr
Arg Tyr Glu Tyr Arg 20 25 30 Ala Ala Thr Asp Tyr Thr Lys Thr Asn
Gly Asp Thr Ala His Val Asp 35 40 45 Ala Arg His Gln His Arg Val
Lys Leu Gly Glu Ser Phe Lys Leu Ser 50 55 60 Asp Lys Trp Lys His
Ser Thr Gly Leu Glu Leu Lys Phe His Gly Asp65 70 75 80 Asp Ser Tyr
Tyr Asp Glu Asp Ser Gly Ser Val Lys Ser Ala Asn Ser 85 90 95 Gln
Ser Phe Tyr Asp Gly Asn Trp Tyr Ile Tyr Gly Met Glu Ile Asp 100 105
110 Asn Thr Ala Thr Tyr Lys Ile Asp Asn Asn Trp Tyr Leu Gln Met Gly
115 120 125 Met Pro Ile Ala Trp Asp Trp Asp Glu Pro Asn Ala Asn Asp
Gly Asp 130 135 140 Trp Lys Met Lys Lys Val Thr Phe Lys Pro Gln Phe
Arg Val Gly Tyr145 150 155 160 Lys Ala Asp Met Gly Leu Thr Thr Ala
Ile Arg Tyr Arg His Glu Tyr 165 170 175 Ala Asp Phe Arg Asn His Thr
Gln Phe Gly Asp Lys Asp Ser Glu Thr 180 185 190 Gly Glu Arg Leu Glu
Ser Ala Gln Lys Ser Lys Val Thr Leu Thr Gly 195 200 205 Ser Tyr Lys
Ile Glu Ser Leu Pro Lys Leu Gly Leu Ser Tyr Glu Ala 210 215 220 Asn
Tyr Val Lys Ser Leu Asp Asn Val Leu Leu Tyr Asn Ser Asp Asp225 230
235 240 Trp Glu Trp Asp Ala Gly Leu Lys Val Asn Tyr Lys Phe Gly Ser
Trp 245 250 255 Lys Pro Phe Ala Glu Ile Trp Ser Ser Asp Ile Ser Ser
Ser Ser Lys 260 265 270 Asp Arg Glu Ala Lys Tyr Arg Val Gly Ile Ala Tyr
Ser Phe 275 280
285 391038DNAVibrio splendidus 39atgtttaaga aaaacatatt agcagtggcg
ttattagcga ctgtgccaat ggttactttc 60gcaaataacg gtgtttctta ccccgtacct
gccgataaat tcgatatgca taattggaaa 120ataaccatac cttcagatat
taatgaagat ggtcgcgttg atgaaataga aggggtcgct 180atgatgagct
actcacatag tgatttcttc catcttgata aagacggcaa ccttgtattt
240gaagtgcaga accaagcgat tacgacgaaa aactcgaaga atgcgcgttc
tgagttacgc 300cagatgccaa gaggcgcaga tttctctatc gatacggctg
ataaaggaaa ccagtgggca 360ctgtcgagtc acccagcggc tagtgaatac
agtgctgtgg gcggaacatt agaagcgaca 420ttaaaagtga atcacgtctc
agttaacgct aagttcccag aaaaataccc agctcattct 480gttgtggttg
gtcagattca tgctaaaaaa cacaacgagc taatcaaagc tggaaccggt
540tatgggcatg gtaatgaacc actaaagatc ttctataaga agtttcctga
ccaagaaatg 600ggttcagtat tctggaacta tgaacgtaac ctagagaaaa
aagatcctaa ccgtgccgat 660atcgcttatc cagtgtgggg taacacgtgg
gaaaaccctg cagagccggg tgaagccggt 720attgctcttg gtgaagagtt
tagctacaaa gtggaagtga aaggcaccat gatgtaccta 780acgtttgaaa
ccgagcgtca cgataccgtt aagtatgaaa tcgacctgag taagggcatc
840gatgaacttg actcaccaac gggctatgct gaagatgatt tttactacaa
agcgggcgca 900tacggccaat gtagcgtgag cgattctcac cctgtatggg
ggcctggttg tggcggtact 960ggcgatttcg ctgtcgataa aaagaatggc
gattacaaca gtgtgacttt ctctgcgctt 1020aagttaaacg gtaaatag
103840345PRTVibrio splendidus 40Met Phe Lys Lys Asn Ile Leu Ala Val
Ala Leu Leu Ala Thr Val Pro1 5 10 15 Met Val Thr Phe Ala Asn Asn
Gly Val Ser Tyr Pro Val Pro Ala Asp 20 25 30 Lys Phe Asp Met His
Asn Trp Lys Ile Thr Ile Pro Ser Asp Ile Asn 35 40 45 Glu Asp Gly
Arg Val Asp Glu Ile Glu Gly Val Ala Met Met Ser Tyr 50 55 60 Ser
His Ser Asp Phe Phe His Leu Asp Lys Asp Gly Asn Leu Val Phe65 70 75
80 Glu Val Gln Asn Gln Ala Ile Thr Thr Lys Asn Ser Lys Asn Ala Arg
85 90 95 Ser Glu Leu Arg Gln Met Pro Arg Gly Ala Asp Phe Ser Ile
Asp Thr 100 105 110 Ala Asp Lys Gly Asn Gln Trp Ala Leu Ser Ser His
Pro Ala Ala Ser 115 120 125 Glu Tyr Ser Ala Val Gly Gly Thr Leu Glu
Ala Thr Leu Lys Val Asn 130 135 140 His Val Ser Val Asn Ala Lys Phe
Pro Glu Lys Tyr Pro Ala His Ser145 150 155 160 Val Val Val Gly Gln
Ile His Ala Lys Lys His Asn Glu Leu Ile Lys 165 170 175 Ala Gly Thr
Gly Tyr Gly His Gly Asn Glu Pro Leu Lys Ile Phe Tyr 180 185 190 Lys
Lys Phe Pro Asp Gln Glu Met Gly Ser Val Phe Trp Asn Tyr Glu 195 200
205 Arg Asn Leu Glu Lys Lys Asp Pro Asn Arg Ala Asp Ile Ala Tyr Pro
210 215 220 Val Trp Gly Asn Thr Trp Glu Asn Pro Ala Glu Pro Gly Glu
Ala Gly225 230 235 240 Ile Ala Leu Gly Glu Glu Phe Ser Tyr Lys Val
Glu Val Lys Gly Thr 245 250 255 Met Met Tyr Leu Thr Phe Glu Thr Glu
Arg His Asp Thr Val Lys Tyr 260 265 270 Glu Ile Asp Leu Ser Lys Gly
Ile Asp Glu Leu Asp Ser Pro Thr Gly 275 280 285 Tyr Ala Glu Asp Asp
Phe Tyr Tyr Lys Ala Gly Ala Tyr Gly Gln Cys 290 295 300 Ser Val Ser
Asp Ser His Pro Val Trp Gly Pro Gly Cys Gly Gly Thr305 310 315 320
Gly Asp Phe Ala Val Asp Lys Lys Asn Gly Asp Tyr Asn Ser Val Thr 325
330 335 Phe Ser Ala Leu Lys Leu Asn Gly Lys 340 345 41897DNAVibrio
splendidus 41atggataact ctccggtgct gagccgattt ttagagaatg gatttttact
ccagcagaaa 60ctgagccttg ttctttgttg tgtgttgatc gcagcttctg catggatttt
aggacagctt 120gcatggttta ttgaacctgc tgagcaaacc gtcgtgccat
ggacagcaac ggcttcctcg 180tcttcaacgc ctcaatcgac tcttgatatc
tcttctttgc agcagagcaa catgtttggt 240gcttataacc caaccacgcc
tgctgtggtt gagcagcaag ttatccaaga tgcgccaaag 300acgcgactga
acctcgtttt agtgggtgca gtagccagtt ctaatccaaa gctgagcttg
360gctgtgattg ccaatcgcgg cacacaagca acctacggca ttaatgaaga
gatcgaaggt 420acgcgagcta agttaaaagc ggtattagtc gatcgcgtga
ttattgataa ctcaggtcga 480gacgaaacct tgatgcttga aggcattgag
tacaagcgtt tgtctgtatc agcacctgcg 540ccacctcgta cctcttcttc
tgtgcgtggc aacaacccag cttctgcaga agagaagcta 600gatgaaatta
aagcgaagat aatgaaagat ccgcaacaaa tcttccaata tgttcgactg
660tctcaggtga aacgcgacga taaagtgatt ggttatcgtg tgagccctgg
caaagattca 720gaacttttta actctgttgg gctccaaaac ggagatattg
ccactcagtt aaatggacaa 780gacctgacag accctgctgc tatgggcaac
atattccgtt ctatctcaga gctgacagag 840ctaaacctcg tcgtcgagag
agatggtcaa caacatgaag tgtttattga attttag 89742298PRTVibrio
splendidus 42Met Asp Asn Ser Pro Val Leu Ser Arg Phe Leu Glu Asn
Gly Phe Leu1 5 10 15 Leu Gln Gln Lys Leu Ser Leu Val Leu Cys Cys
Val Leu Ile Ala Ala 20 25 30 Ser Ala Trp Ile Leu Gly Gln Leu Ala
Trp Phe Ile Glu Pro Ala Glu 35 40 45 Gln Thr Val Val Pro Trp Thr
Ala Thr Ala Ser Ser Ser Ser Thr Pro 50 55 60 Gln Ser Thr Leu Asp
Ile Ser Ser Leu Gln Gln Ser Asn Met Phe Gly65 70 75 80 Ala Tyr Asn
Pro Thr Thr Pro Ala Val Val Glu Gln Gln Val Ile Gln 85 90 95 Asp
Ala Pro Lys Thr Arg Leu Asn Leu Val Leu Val Gly Ala Val Ala 100 105
110 Ser Ser Asn Pro Lys Leu Ser Leu Ala Val Ile Ala Asn Arg Gly Thr
115 120 125 Gln Ala Thr Tyr Gly Ile Asn Glu Glu Ile Glu Gly Thr Arg
Ala Lys 130 135 140 Leu Lys Ala Val Leu Val Asp Arg Val Ile Ile Asp
Asn Ser Gly Arg145 150 155 160 Asp Glu Thr Leu Met Leu Glu Gly Ile
Glu Tyr Lys Arg Leu Ser Val 165 170 175 Ser Ala Pro Ala Pro Pro Arg
Thr Ser Ser Ser Val Arg Gly Asn Asn 180 185 190 Pro Ala Ser Ala Glu
Glu Lys Leu Asp Glu Ile Lys Ala Lys Ile Met 195 200 205 Lys Asp Pro
Gln Gln Ile Phe Gln Tyr Val Arg Leu Ser Gln Val Lys 210 215 220 Arg
Asp Asp Lys Val Ile Gly Tyr Arg Val Ser Pro Gly Lys Asp Ser225 230
235 240 Glu Leu Phe Asn Ser Val Gly Leu Gln Asn Gly Asp Ile Ala Thr
Gln 245 250 255 Leu Asn Gly Gln Asp Leu Thr Asp Pro Ala Ala Met Gly
Asn Ile Phe 260 265 270 Arg Ser Ile Ser Glu Leu Thr Glu Leu Asn Leu
Val Val Glu Arg Asp 275 280 285 Gly Gln Gln His Glu Val Phe Ile Glu
Phe 290 295 432025DNAVibrio splendidus 43gtgaagcatt ggtttaagaa
aagtgcatgg ttattggcag gaagcttaat ctgcacaccc 60gcagccatcg cgagtgattt
tagtgccagc tttaaaggca ctgatattca agagtttatt 120aatattgttg
gtcgtaacct agagaagacg atcatcgttg acccttcggt gcgcggaaaa
180atcgatgtac gcagctacga cgtactcaat gaagagcaat actacagctt
cttcctaaac 240gtattggaag tgtatggcta cgcggttgtc gaaatggact
cgggtgttct taagatcatc 300aaggccaaag attcgaaaac atcggcaatt
ccagtcgttg gagacagtga cacgatcaaa 360ggcgacaatg tggtgacacg
tgttgtgacg gttcgtaatg tctcggtgcg tgaactttct 420cctctgcttc
gtcaactaaa cgacaatgca ggcgcgggta acgttgtgca ctacgaccca
480gccaacatca tccttattac aggccgagcg gcggtagtaa accgtttagc
tgaaatcatc 540aagcgtgttg accaagcggg tgataaagag attgaagtcg
ttgagctaaa gaatgcttct 600gcggcagaaa tggtacgtat cgttgatgcg
ttaagcaaaa ccactgatgc gaaaaacaca 660cctgcatttc tacaacctaa
attagttgcc gatgaacgta ccaatgcgat tcttatctca 720ggcgacccta
aagtacgtag ccgtttaaga aggctgattg aacagcttga tgttgaaatg
780gcaaccaagg gcaataacca agttatttac cttaaatatg caaaagccga
agatctagtt 840gatgtgctga aaggcgtgtc ggacaaccta caatcagaga
agcagacatc aaccaaagga 900agttcatcgc agcgtaacca agtgatgatc
tcagctcaca gtgacaccaa ctctttagtg 960attaccgcac agccggacat
catgaatgcg cttcaagatg tgatcgcaca gctggatatt 1020cgtcgtgctc
aagtattgat tgaagcactg attgtcgaaa tggccgaagg tgacggcgtt
1080aaccttggtg tgcagtgggg taaccttgaa acgggtgcca tgattcagta
cagcaacact 1140ggcgcttcca ttggcggtgt gatggttggt ttagaagaag
cgaaagacag cgaaacgaca 1200accgctgttt atgattcaga cggtaaattc
ttacgtaatg aaaccacgac ggaagaaggt 1260gactattcaa cattagcttc
cgcactttct ggtgttaatg gtgcggcaat gagtgtggta 1320atgggtgact
ggaccgcctt gatcagtgca gtagcgaccg attcaaattc aaatatccta
1380tcttctccaa gtatcaccgt gatggataac ggcgaagcgt cattcattgt
gggtgaagag 1440gtgcctgttc taaccggttc tacagcaggc tcaagtaacg
acaacccatt ccaaacagtt 1500gaacgtaaag aagtgggtat caagcttaaa
gtggtgccgc aaatcaatga aggtgattcg 1560gttcaactgc aaatagaaca
agaagtatcg aacgtattag gcgccaatgg tgcggttgat 1620gtgcgttttg
ctaagcgaca gctaaataca tcagtgattg ttcaagacgg tcaaatgctg
1680gtgttgggtg gcttgattga cgagcgagca ttggaaagtg aatctaaggt
gccgttcttg 1740ggagatattc ctgtgcttgg acacttgttc aaatcaacca
gtactcaggt tgagaaaaag 1800aacctaatgg tcttcatcaa accaaccatt
attcgtgatg gtatgacagc cgatggtatc 1860acgcagcgta aatacaactt
catccgtgct gagcagttgt acaaggctga gcaaggactg 1920aagttaatgg
cagacgataa catcccagta ttgcctaaat ttggtgccga catgaatcac
1980ccggctgaaa ttcaagcctt catcgatcaa atggaacaag aataa
202544674PRTVibrio splendidus 44Met Lys His Trp Phe Lys Lys Ser Ala
Trp Leu Leu Ala Gly Ser Leu1 5 10 15 Ile Cys Thr Pro Ala Ala Ile
Ala Ser Asp Phe Ser Ala Ser Phe Lys 20 25 30 Gly Thr Asp Ile Gln
Glu Phe Ile Asn Ile Val Gly Arg Asn Leu Glu 35 40 45 Lys Thr Ile
Ile Val Asp Pro Ser Val Arg Gly Lys Ile Asp Val Arg 50 55 60 Ser
Tyr Asp Val Leu Asn Glu Glu Gln Tyr Tyr Ser Phe Phe Leu Asn65 70 75
80 Val Leu Glu Val Tyr Gly Tyr Ala Val Val Glu Met Asp Ser Gly Val
85 90 95 Leu Lys Ile Ile Lys Ala Lys Asp Ser Lys Thr Ser Ala Ile
Pro Val 100 105 110 Val Gly Asp Ser Asp Thr Ile Lys Gly Asp Asn Val
Val Thr Arg Val 115 120 125 Val Thr Val Arg Asn Val Ser Val Arg Glu
Leu Ser Pro Leu Leu Arg 130 135 140 Gln Leu Asn Asp Asn Ala Gly Ala
Gly Asn Val Val His Tyr Asp Pro145 150 155 160 Ala Asn Ile Ile Leu
Ile Thr Gly Arg Ala Ala Val Val Asn Arg Leu 165 170 175 Ala Glu Ile
Ile Lys Arg Val Asp Gln Ala Gly Asp Lys Glu Ile Glu 180 185 190 Val
Val Glu Leu Lys Asn Ala Ser Ala Ala Glu Met Val Arg Ile Val 195 200
205 Asp Ala Leu Ser Lys Thr Thr Asp Ala Lys Asn Thr Pro Ala Phe Leu
210 215 220 Gln Pro Lys Leu Val Ala Asp Glu Arg Thr Asn Ala Ile Leu
Ile Ser225 230 235 240 Gly Asp Pro Lys Val Arg Ser Arg Leu Arg Arg
Leu Ile Glu Gln Leu 245 250 255 Asp Val Glu Met Ala Thr Lys Gly Asn
Asn Gln Val Ile Tyr Leu Lys 260 265 270 Tyr Ala Lys Ala Glu Asp Leu
Val Asp Val Leu Lys Gly Val Ser Asp 275 280 285 Asn Leu Gln Ser Glu
Lys Gln Thr Ser Thr Lys Gly Ser Ser Ser Gln 290 295 300 Arg Asn Gln
Val Met Ile Ser Ala His Ser Asp Thr Asn Ser Leu Val305 310 315 320
Ile Thr Ala Gln Pro Asp Ile Met Asn Ala Leu Gln Asp Val Ile Ala 325
330 335 Gln Leu Asp Ile Arg Arg Ala Gln Val Leu Ile Glu Ala Leu Ile
Val 340 345 350 Glu Met Ala Glu Gly Asp Gly Val Asn Leu Gly Val Gln
Trp Gly Asn 355 360 365 Leu Glu Thr Gly Ala Met Ile Gln Tyr Ser Asn
Thr Gly Ala Ser Ile 370 375 380 Gly Gly Val Met Val Gly Leu Glu Glu
Ala Lys Asp Ser Glu Thr Thr385 390 395 400 Thr Ala Val Tyr Asp Ser
Asp Gly Lys Phe Leu Arg Asn Glu Thr Thr 405 410 415 Thr Glu Glu Gly
Asp Tyr Ser Thr Leu Ala Ser Ala Leu Ser Gly Val 420 425 430 Asn Gly
Ala Ala Met Ser Val Val Met Gly Asp Trp Thr Ala Leu Ile 435 440 445
Ser Ala Val Ala Thr Asp Ser Asn Ser Asn Ile Leu Ser Ser Pro Ser 450
455 460 Ile Thr Val Met Asp Asn Gly Glu Ala Ser Phe Ile Val Gly Glu
Glu465 470 475 480 Val Pro Val Leu Thr Gly Ser Thr Ala Gly Ser Ser
Asn Asp Asn Pro 485 490 495 Phe Gln Thr Val Glu Arg Lys Glu Val Gly
Ile Lys Leu Lys Val Val 500 505 510 Pro Gln Ile Asn Glu Gly Asp Ser
Val Gln Leu Gln Ile Glu Gln Glu 515 520 525 Val Ser Asn Val Leu Gly
Ala Asn Gly Ala Val Asp Val Arg Phe Ala 530 535 540 Lys Arg Gln Leu
Asn Thr Ser Val Ile Val Gln Asp Gly Gln Met Leu545 550 555 560 Val
Leu Gly Gly Leu Ile Asp Glu Arg Ala Leu Glu Ser Glu Ser Lys 565 570
575 Val Pro Phe Leu Gly Asp Ile Pro Val Leu Gly His Leu Phe Lys Ser
580 585 590 Thr Ser Thr Gln Val Glu Lys Lys Asn Leu Met Val Phe Ile
Lys Pro 595 600 605 Thr Ile Ile Arg Asp Gly Met Thr Ala Asp Gly Ile
Thr Gln Arg Lys 610 615 620 Tyr Asn Phe Ile Arg Ala Glu Gln Leu Tyr
Lys Ala Glu Gln Gly Leu625 630 635 640 Lys Leu Met Ala Asp Asp Asn
Ile Pro Val Leu Pro Lys Phe Gly Ala 645 650 655 Asp Met Asn His Pro
Ala Glu Ile Gln Ala Phe Ile Asp Gln Met Glu 660 665 670 Gln
Glu451503DNAVibrio splendidus 45atggctgaat tggtaggggc ggcacgtact
tatcagcgct tgccgtttag ctttgcgaat 60cgctacaaga tggtgttgga ataccaacat
ccagagcgcg caccgatact ttattatgtt 120gagccactga aatcggcggc
gatcattgaa gtgagtcgtg ttgtgaaaaa tggtttcacg 180ccacaagcga
ttactctcga tgagtttgat aaaaaactaa ccgatgctta tcagcgtgac
240tcgtcagaag ctcgtcagct catggaagac attggtgctg atagtgatga
tttcttctca 300ctagcggaag aactgcctca agacgaagac ttacttgaat
cagaagatga tgcaccaatc 360atcaagttaa tcaatgcgat gctgggtgag
gcgatcaaag agggtgcttc ggatatacac 420atcgaaacct ttgaaaagtc
actttgtatc cgtttccgag ttgatggtgt gctgcgtgat 480gttctagcgc
caagccgtaa actggctccg ctattggttt cacgtgtcaa ggttatggct
540aaactggata ttgcggaaaa acgcgtgcca caagatggtc gtatttctct
gcgtattggt 600ggccgagcgg ttgatgttcg tgtttcaacc atgccttctt
cgcatggtga gcgtgtggta 660atgcgtctgt tggacaaaaa tgccactcgt
ctagacttgc acagtttagg tatgacagcc 720gaaaaccatg aaaacttccg
taagctgatt cagcgcccac atggcattat cttggtgacc 780ggcccgacag
gttcaggtaa atcgacgacc ttgtacgcag gtctgcaaga actcaacagc
840aatgaacgaa acattttaac cgttgaagac ccaatcgaat tcgatatcga
tggcattggt 900caaacacaag tgaaccctaa ggttgatatg acctttgcgc
gtggtttacg tgccattctt 960cgtcaagatc ctgatgttgt tatgattggt
gagatccgtg acttggagac cgcagagatt 1020gctgtccagg cctctttgac
aggtcactta gttatgtcga ctctgcatac caatactgcc 1080gtcggtgcga
ttacacgtct acgtgatatg ggcattgaac ctttcttgat ctcttcttcg
1140ctgctgggtg ttttggctca gcgcttggtt cgtactttat gtaacgaatg
taaagaacct 1200tatgaagccg ataaagagca gaagaaactg tttgggttga
agaagaaaga aagcttgacg 1260ctttaccatg ccaaaggttg tgaagagtgt
ggccataagg gttatcgagg tcgtacgggt 1320attcatgagc tgttgatgat
tgatgattca gtacaagagc tgattcacag tgaagcgggt 1380gagcaggcga
ttgataaagc aattcgtggc acaacaccaa gtattcgaga tgatggcttg
1440agcaaagttc tgaaaggggt aacgtcccta gaagaagtga tgcgcgtgac
caaggaagtc 1500tag 150346500PRTVibrio splendidus 46Met Ala Glu Leu
Val Gly Ala Ala Arg Thr Tyr Gln Arg Leu Pro Phe1 5 10 15 Ser Phe
Ala Asn Arg Tyr Lys Met Val Leu Glu Tyr Gln His Pro Glu 20 25 30
Arg Ala Pro Ile Leu Tyr Tyr Val Glu Pro Leu Lys Ser Ala Ala Ile 35
40 45 Ile Glu Val Ser Arg Val Val Lys Asn Gly Phe Thr Pro Gln Ala
Ile 50 55 60 Thr Leu Asp Glu Phe Asp Lys Lys Leu Thr Asp Ala Tyr
Gln Arg Asp65 70 75 80 Ser Ser Glu Ala Arg Gln Leu Met Glu Asp Ile
Gly Ala Asp Ser Asp 85 90 95 Asp Phe Phe Ser Leu Ala Glu Glu Leu Pro Gln Asp Glu Asp Leu
Leu 100 105 110 Glu Ser Glu Asp Asp Ala Pro Ile Ile Lys Leu Ile Asn
Ala Met Leu 115 120 125 Gly Glu Ala Ile Lys Glu Gly Ala Ser Asp Ile
His Ile Glu Thr Phe 130 135 140 Glu Lys Ser Leu Cys Ile Arg Phe Arg
Val Asp Gly Val Leu Arg Asp145 150 155 160 Val Leu Ala Pro Ser Arg
Lys Leu Ala Pro Leu Leu Val Ser Arg Val 165 170 175 Lys Val Met Ala
Lys Leu Asp Ile Ala Glu Lys Arg Val Pro Gln Asp 180 185 190 Gly Arg
Ile Ser Leu Arg Ile Gly Gly Arg Ala Val Asp Val Arg Val 195 200 205
Ser Thr Met Pro Ser Ser His Gly Glu Arg Val Val Met Arg Leu Leu 210
215 220 Asp Lys Asn Ala Thr Arg Leu Asp Leu His Ser Leu Gly Met Thr
Ala225 230 235 240 Glu Asn His Glu Asn Phe Arg Lys Leu Ile Gln Arg
Pro His Gly Ile 245 250 255 Ile Leu Val Thr Gly Pro Thr Gly Ser Gly
Lys Ser Thr Thr Leu Tyr 260 265 270 Ala Gly Leu Gln Glu Leu Asn Ser
Asn Glu Arg Asn Ile Leu Thr Val 275 280 285 Glu Asp Pro Ile Glu Phe
Asp Ile Asp Gly Ile Gly Gln Thr Gln Val 290 295 300 Asn Pro Lys Val
Asp Met Thr Phe Ala Arg Gly Leu Arg Ala Ile Leu305 310 315 320 Arg
Gln Asp Pro Asp Val Val Met Ile Gly Glu Ile Arg Asp Leu Glu 325 330
335 Thr Ala Glu Ile Ala Val Gln Ala Ser Leu Thr Gly His Leu Val Met
340 345 350 Ser Thr Leu His Thr Asn Thr Ala Val Gly Ala Ile Thr Arg
Leu Arg 355 360 365 Asp Met Gly Ile Glu Pro Phe Leu Ile Ser Ser Ser
Leu Leu Gly Val 370 375 380 Leu Ala Gln Arg Leu Val Arg Thr Leu Cys
Asn Glu Cys Lys Glu Pro385 390 395 400 Tyr Glu Ala Asp Lys Glu Gln
Lys Lys Leu Phe Gly Leu Lys Lys Lys 405 410 415 Glu Ser Leu Thr Leu
Tyr His Ala Lys Gly Cys Glu Glu Cys Gly His 420 425 430 Lys Gly Tyr
Arg Gly Arg Thr Gly Ile His Glu Leu Leu Met Ile Asp 435 440 445 Asp
Ser Val Gln Glu Leu Ile His Ser Glu Ala Gly Glu Gln Ala Ile 450 455
460 Asp Lys Ala Ile Arg Gly Thr Thr Pro Ser Ile Arg Asp Asp Gly
Leu465 470 475 480 Ser Lys Val Leu Lys Gly Val Thr Ser Leu Glu Glu
Val Met Arg Val 485 490 495 Thr Lys Glu Val 500 471221DNAVibrio
splendidus 47atggcggcat ttgaatacaa agcactggat gccaaaggca aaagtaaaaa
aggctcaatt 60gaagcagata atgctcgtca ggctcgccaa agaataaaag agcttggctt
gatgccggtt 120gagatgaccg aggctaaagc aaaaacagca aaaggtgctc
agccatcgac cagctttaaa 180cgcggcatca gtacgcctga tcttgcgctt
attactcgtc aaatatccac gctcgttcaa 240tctggtatgc cgctagaaga
gtgtttgaaa gccgttgccg aacagtctga gaaacctcgt 300attcgcacca
tgctactcgc ggtgagatct aaggtgactg aaggttattc gttagcagac
360agcttgtctg attatcccca tatcttcgat gagctattca gagccatggt
tgctgctggt 420gagaagtcag ggcatctaga tgcggtattg gaacgattgg
ctgactacgc agaaaaccgt 480cagaagatgc gttctaagtt gctgcaagcg
atgatctacc ccatcgtgct ggtggtgttt 540gcggtgacga ttgtgtcgtt
cctactggca acggtagtgc cgaagatcgt tgagcctatt 600atccaaatgg
gacaagagct ccctcagtcg acacaatttt tattagcatc gagtgaattt
660atccagaatt ggggcatcca attactggtg ttgaccattg gtgtgattgt
gttggttaag 720actgcgctga aaaagccggg cgttcgcatg agctgggatc
gcaaattatt gagcatcccg 780ctgataggca agatagcgaa agggatcaac
acctctcgtt ttgcacgaac actttctatc 840tgtacctcta gtgcgattcc
tatccttgaa gggatgaagg tcgcggtaga tgtgatgtcg 900aatcatcacg
tgaaacaaca agtattacag gcatcagata gcgttagaga aggggcaagc
960ctgcgtaaag cgcttgatca aaccaaactc tttcccccga tgatgctgca
tatgatcgcc 1020agtggtgagc agagtggcca attggaacag atgctgacaa
gagcggcaga taatcaggat 1080caaagctttg aatcgaccgt taatatcgcg
ttaggcattt ttaccccagc gcttattgcg 1140ttgatggctg gcttagtgct
gtttatcgtg atggcgacgc tgatgccaat gcttgaaatg 1200aacaatttaa
tgagtggtta a 122148406PRTVibrio splendidus 48Met Ala Ala Phe Glu
Tyr Lys Ala Leu Asp Ala Lys Gly Lys Ser Lys1 5 10 15 Lys Gly Ser
Ile Glu Ala Asp Asn Ala Arg Gln Ala Arg Gln Arg Ile 20 25 30 Lys
Glu Leu Gly Leu Met Pro Val Glu Met Thr Glu Ala Lys Ala Lys 35 40
45 Thr Ala Lys Gly Ala Gln Pro Ser Thr Ser Phe Lys Arg Gly Ile Ser
50 55 60 Thr Pro Asp Leu Ala Leu Ile Thr Arg Gln Ile Ser Thr Leu
Val Gln65 70 75 80 Ser Gly Met Pro Leu Glu Glu Cys Leu Lys Ala Val
Ala Glu Gln Ser 85 90 95 Glu Lys Pro Arg Ile Arg Thr Met Leu Leu
Ala Val Arg Ser Lys Val 100 105 110 Thr Glu Gly Tyr Ser Leu Ala Asp
Ser Leu Ser Asp Tyr Pro His Ile 115 120 125 Phe Asp Glu Leu Phe Arg
Ala Met Val Ala Ala Gly Glu Lys Ser Gly 130 135 140 His Leu Asp Ala
Val Leu Glu Arg Leu Ala Asp Tyr Ala Glu Asn Arg145 150 155 160 Gln
Lys Met Arg Ser Lys Leu Leu Gln Ala Met Ile Tyr Pro Ile Val 165 170
175 Leu Val Val Phe Ala Val Thr Ile Val Ser Phe Leu Leu Ala Thr Val
180 185 190 Val Pro Lys Ile Val Glu Pro Ile Ile Gln Met Gly Gln Glu
Leu Pro 195 200 205 Gln Ser Thr Gln Phe Leu Leu Ala Ser Ser Glu Phe
Ile Gln Asn Trp 210 215 220 Gly Ile Gln Leu Leu Val Leu Thr Ile Gly
Val Ile Val Leu Val Lys225 230 235 240 Thr Ala Leu Lys Lys Pro Gly
Val Arg Met Ser Trp Asp Arg Lys Leu 245 250 255 Leu Ser Ile Pro Leu
Ile Gly Lys Ile Ala Lys Gly Ile Asn Thr Ser 260 265 270 Arg Phe Ala
Arg Thr Leu Ser Ile Cys Thr Ser Ser Ala Ile Pro Ile 275 280 285 Leu
Glu Gly Met Lys Val Ala Val Asp Val Met Ser Asn His His Val 290 295
300 Lys Gln Gln Val Leu Gln Ala Ser Asp Ser Val Arg Glu Gly Ala
Ser305 310 315 320 Leu Arg Lys Ala Leu Asp Gln Thr Lys Leu Phe Pro
Pro Met Met Leu 325 330 335 His Met Ile Ala Ser Gly Glu Gln Ser Gly
Gln Leu Glu Gln Met Leu 340 345 350 Thr Arg Ala Ala Asp Asn Gln Asp
Gln Ser Phe Glu Ser Thr Val Asn 355 360 365 Ile Ala Leu Gly Ile Phe
Thr Pro Ala Leu Ile Ala Leu Met Ala Gly 370 375 380 Leu Val Leu Phe
Ile Val Met Ala Thr Leu Met Pro Met Leu Glu Met385 390 395 400 Asn
Asn Leu Met Ser Gly 405 49444DNAVibrio splendidus 49atgaaaaata
aaatgaaaaa acaatcaggc tttaccctat tagaagtcat ggttgttgtc 60gttatccttg
gtgttctagc aagttttgtt gtacctaacc tgttgggcaa caaagagaag
120gcggatcaac aaaaagccat cactgatatt gtggcgctag agaacgcgct
cgacatgtac 180aaactggata acagcgttta cccaacaacg gatcaaggcc
tggacgggtt ggtgacaaag 240ccaagcagtc cagagcctcg taactaccga
gacggcggtt acatcaagcg tctacctaac 300gacccatggg gcaatgagta
ccaataccta agtcctggtg ataacggcac aattgatatc 360ttcactcttg
gcgcagatgg tcaagaaggt ggtgaaggta ttgctgcaga tatcggcaac
420tggaacatgc aggacttcca ataa 44450146PRTVibrio splendidus 50Lys
Asn Lys Met Lys Lys Gln Ser Gly Phe Thr Leu Leu Glu Val Met1 5 10
15 Val Val Val Val Ile Leu Gly Val Leu Ala Ser Phe Val Val Pro Asn
20 25 30 Leu Leu Gly Asn Lys Glu Lys Ala Asp Gln Gln Lys Ala Ile
Thr Asp 35 40 45 Ile Val Ala Leu Glu Asn Ala Leu Asp Met Tyr Lys
Leu Asp Asn Ser 50 55 60 Val Tyr Pro Thr Thr Asp Gln Gly Leu Asp
Gly Leu Val Thr Lys Pro65 70 75 80 Ser Ser Pro Glu Pro Arg Asn Tyr
Arg Asp Gly Gly Tyr Ile Lys Arg 85 90 95 Leu Pro Asn Asp Pro Trp
Gly Asn Glu Tyr Gln Tyr Leu Ser Pro Gly 100 105 110 Asp Asn Gly Thr
Ile Asp Ile Phe Thr Leu Gly Ala Asp Gly Gln Glu 115 120 125 Gly Gly
Glu Gly Ile Ala Ala Asp Ile Gly Asn Trp Asn Met Gln Asp 130 135 140
Phe Gln145 51594DNAVibrio splendidus 51gtgaaaacta agcaaacaca
gccaggtttc accttgattg agattctttt ggtgttggta 60ttactgtcag tatcggcggt
cgcggtgatc tcgaccatcc ctaccaatag caaagatgtt 120gctaaaaaat
acgctcaaag cttttatcag cgaattcagc tactcaatga agaggctatt
180ttgagtggct tagattttgg tgttcgtgtt gatgaaaaaa aatcgactta
cgttctgatg 240actttgaagt ctgatggctg gcaagaaacg gagttcgaaa
agatcccttc ttcaactgaa 300ttaccggaag aactggcact gtcgctgaca
ttaggtggtg gcgcgtggga agacgatgat 360cggttgttca atccaggaag
cttatttgat gaagatatgt ttgctgatct tgaagaggaa 420aagaagccga
aaccaccaca gatctacatc ttgtcgagtg ctgaaatgac gccatttgta
480ctgtcgtttt acccaaatac cggtgacaca atacaagatg tttggcgcat
tcgagtattg 540gataatggtg tgattcgatt actcgagccg ggagaagaag
atgaagaaga ataa 59452197PRTVibrio splendidus 52Met Lys Thr Lys Gln
Thr Gln Pro Gly Phe Thr Leu Ile Glu Ile Leu1 5 10 15 Leu Val Leu
Val Leu Leu Ser Val Ser Ala Val Ala Val Ile Ser Thr 20 25 30 Ile
Pro Thr Asn Ser Lys Asp Val Ala Lys Lys Tyr Ala Gln Ser Phe 35 40
45 Tyr Gln Arg Ile Gln Leu Leu Asn Glu Glu Ala Ile Leu Ser Gly Leu
50 55 60 Asp Phe Gly Val Arg Val Asp Glu Lys Lys Ser Thr Tyr Val
Leu Met65 70 75 80 Thr Leu Lys Ser Asp Gly Trp Gln Glu Thr Glu Phe
Glu Lys Ile Pro 85 90 95 Ser Ser Thr Glu Leu Pro Glu Glu Leu Ala
Leu Ser Leu Thr Leu Gly 100 105 110 Gly Gly Ala Trp Glu Asp Asp Asp
Arg Leu Phe Asn Pro Gly Ser Leu 115 120 125 Phe Asp Glu Asp Met Phe
Ala Asp Leu Glu Glu Glu Lys Lys Pro Lys 130 135 140 Pro Pro Gln Ile
Tyr Ile Leu Ser Ser Ala Glu Met Thr Pro Phe Val145 150 155 160 Leu
Ser Phe Tyr Pro Asn Thr Gly Asp Thr Ile Gln Asp Val Trp Arg 165 170
175 Ile Arg Val Leu Asp Asn Gly Val Ile Arg Leu Leu Glu Pro Gly Glu
180 185 190 Glu Asp Glu Glu Glu 195 53396DNAVibrio splendidus
53atgaagaaga ataaccgttc tccttatcgt tctcgcggta tgcctcttgg ttctcgagga
60atgactctgc ttgaagtatt ggttgcgctg gctatcttcg ctacggcggc gatcagtgtg
120attcgtgctg tcacccagca catcaatacg ctcagttatc tcgaagaaaa
aaccttcgcg 180gcgatggtcg ttgataatca aatggcccta gtcatgctac
atcctgagat gcttaaaaaa 240gcgcagggca cgcaagagtt agcgggaaga
gaatggttct ggaaggtgac tcccatcgat 300accagcgata atttattaaa
ggcgtttgat gtgagtgcgg caaccagtaa gaaagcgtct 360ccagtcgtta
cggtgcgcag ttatgtggtt aattaa 39654131PRTVibrio splendidus 54Met Lys
Lys Asn Asn Arg Ser Pro Tyr Arg Ser Arg Gly Met Pro Leu1 5 10 15
Gly Ser Arg Gly Met Thr Leu Leu Glu Val Leu Val Ala Leu Ala Ile 20
25 30 Phe Ala Thr Ala Ala Ile Ser Val Ile Arg Ala Val Thr Gln His
Ile 35 40 45 Asn Thr Leu Ser Tyr Leu Glu Glu Lys Thr Phe Ala Ala
Met Val Val 50 55 60 Asp Asn Gln Met Ala Leu Val Met Leu His Pro
Glu Met Leu Lys Lys65 70 75 80 Ala Gln Gly Thr Gln Glu Leu Ala Gly
Arg Glu Trp Phe Trp Lys Val 85 90 95 Thr Pro Ile Asp Thr Ser Asp
Asn Leu Leu Lys Ala Phe Asp Val Ser 100 105 110 Ala Ala Thr Ser Lys
Lys Ala Ser Pro Val Val Thr Val Arg Ser Tyr 115 120 125 Val Val Asn
130 55804DNAVibrio splendidus 55atgtggttaa ttaagagaat gtggtcaatt
aagagcatgt tattaattaa gaacagctcg 60ctaactaaga gcgtgtcgct aactaagagc
atgtcggaaa ataagcgtac gccgcgtaaa 120caaggtctac cttcaaaagg
gagaggcttt accttaattg aagtcttggt ctcgattgct 180atctttgcca
cgctaagtat ggcggcttat caggtggtta atcaggtgca gcgaagcaac
240gagatctcta ttgagcgcag tgctcgtttg aaccaactgc aacgcagttt
agtcatttta 300gataatgatt ttcgccagat ggcggtgcga aaatttcgta
ccaacggtga agaagcatca 360tctaagctga tcttaatgaa agagtattta
ttggactccg acagtgtagg catcatgttt 420actcgtctag gttggcacaa
cccacaacag cagtttcctc gcggtgaagt cacgaaggtt 480ggctaccgta
ttaaagaaga aacacttgag cgtgtatggt ggcgttatcc cgatacacct
540tcaggccaag aaggtgtgat tacccctctg cttgatgatg ttgaaagctt
ggaattcgag 600ttttatgacg gaagccgctg ggggaaagag tggcaaaccg
ataaatcact gccgaaagcg 660gtgaggctta agctgacact gaaagactat
ggtgagatag agcgtgttta tctcactccc 720ggtggcaccc tagatcaggc
cgatgattct tcaaacagtg actcttcagg cagtagtgag 780gggaataatg
actcatcgaa ctaa 80456267PRTVibrio splendidus 56Met Trp Leu Ile Lys
Arg Met Trp Ser Ile Lys Ser Met Leu Leu Ile1 5 10 15 Lys Asn Ser
Ser Leu Thr Lys Ser Val Ser Leu Thr Lys Ser Met Ser 20 25 30 Glu
Asn Lys Arg Thr Pro Arg Lys Gln Gly Leu Pro Ser Lys Gly Arg 35 40
45 Gly Phe Thr Leu Ile Glu Val Leu Val Ser Ile Ala Ile Phe Ala Thr
50 55 60 Leu Ser Met Ala Ala Tyr Gln Val Val Asn Gln Val Gln Arg
Ser Asn65 70 75 80 Glu Ile Ser Ile Glu Arg Ser Ala Arg Leu Asn Gln
Leu Gln Arg Ser 85 90 95 Leu Val Ile Leu Asp Asn Asp Phe Arg Gln
Met Ala Val Arg Lys Phe 100 105 110 Arg Thr Asn Gly Glu Glu Ala Ser
Ser Lys Leu Ile Leu Met Lys Glu 115 120 125 Tyr Leu Leu Asp Ser Asp
Ser Val Gly Ile Met Phe Thr Arg Leu Gly 130 135 140 Trp His Asn Pro
Gln Gln Gln Phe Pro Arg Gly Glu Val Thr Lys Val145 150 155 160 Gly
Tyr Arg Ile Lys Glu Glu Thr Leu Glu Arg Val Trp Trp Arg Tyr 165 170
175 Pro Asp Thr Pro Ser Gly Gln Glu Gly Val Ile Thr Pro Leu Leu Asp
180 185 190 Asp Val Glu Ser Leu Glu Phe Glu Phe Tyr Asp Gly Ser Arg
Trp Gly 195 200 205 Lys Glu Trp Gln Thr Asp Lys Ser Leu Pro Lys Ala
Val Arg Leu Lys 210 215 220 Leu Thr Leu Lys Asp Tyr Gly Glu Ile Glu
Arg Val Tyr Leu Thr Pro225 230 235 240 Gly Gly Thr Leu Asp Gln Ala
Asp Asp Ser Ser Asn Ser Asp Ser Ser 245 250 255 Gly Ser Ser Glu Gly
Asn Asn Asp Ser Ser Asn 260 265 571050DNAVibrio splendidus
57atgactcatc gaactaataa gcgtttagcg acaaggtcag ccttgggacg taaacaacgt
60ggtgtcgcgc tgatcattat tttgatgcta ttggcgatca tggcaaccat tgctggcagc
120atgtccgagc gtttgtttac gcaattcaag cgcgttggta accaactgaa
ttaccaacag 180gcttactggt acagcattgg tgtggaagcg cttgtgcaaa
acggtattag gcaaagttac 240aaagacagtg ataccgtgaa cctaagccaa
ccatgggcgt tagaagagca ggtataccca 300ttggattatg gccaagttaa
gggccgcatt gttgatgctc aggcatgttt taatcttaat 360gccttagccg
gagtggcgac cacttcaagt aaccagactc cttatttaat cacggtttgg
420caaaccttat tggaaaacca agacgttgag ccttatcagg ctgaggttat
cgcaaattca 480acgtgggaat ttgttgatgc ggatacacga accacctctt
cgtctggtgt agaagacagc 540acgtatgaag cgatgaagcc ctcttatttg
gcggcgaatg gcttaatggc cgatgaatcc 600gagctacgag cggtttatca
agtcactggt gaagtgatga ataaggttcg cccctttgtt 660tgcgctctgc
caaccgatga tttccgcttg aatgtgaata ctctcacgga aaaacaagca
720ccgttattgg aagcgatgtt tgcgccaggc ttaagtgaat cggatgccaa
acagctgata 780gataaacgcc catttgatgg ctgggatacg gtagatgctt
tcatggctga acctgccatt 840gttggtgtaa gtgccgaagt cagcaagaaa
gcgaaagcat atttaactgt agatagcgcc 900tattttgagc tagatgcaga
ggtattagtt gagcagtcac gtgtacgtat acggacgctt 960ttctatagta
gtaatcgaga aacagtgacg gtagtacgcc gtcgttttgg aggaatcagt
1020gagcgagttt ctgaccgttc gactgagtag
105058349PRTVibrio splendidus 58Met Thr His Arg Thr Asn Lys Arg Leu
Ala Thr Arg Ser Ala Leu Gly1 5 10 15 Arg Lys Gln Arg Gly Val Ala
Leu Ile Ile Ile Leu Met Leu Leu Ala 20 25 30 Ile Met Ala Thr Ile
Ala Gly Ser Met Ser Glu Arg Leu Phe Thr Gln 35 40 45 Phe Lys Arg
Val Gly Asn Gln Leu Asn Tyr Gln Gln Ala Tyr Trp Tyr 50 55 60 Ser
Ile Gly Val Glu Ala Leu Val Gln Asn Gly Ile Arg Gln Ser Tyr65 70 75
80 Lys Asp Ser Asp Thr Val Asn Leu Ser Gln Pro Trp Ala Leu Glu Glu
85 90 95 Gln Val Tyr Pro Leu Asp Tyr Gly Gln Val Lys Gly Arg Ile
Val Asp 100 105 110 Ala Gln Ala Cys Phe Asn Leu Asn Ala Leu Ala Gly
Val Ala Thr Thr 115 120 125 Ser Ser Asn Gln Thr Pro Tyr Leu Ile Thr
Val Trp Gln Thr Leu Leu 130 135 140 Glu Asn Gln Asp Val Glu Pro Tyr
Gln Ala Glu Val Ile Ala Asn Ser145 150 155 160 Thr Trp Glu Phe Val
Asp Ala Asp Thr Arg Thr Thr Ser Ser Ser Gly 165 170 175 Val Glu Asp
Ser Thr Tyr Glu Ala Met Lys Pro Ser Tyr Leu Ala Ala 180 185 190 Asn
Gly Leu Met Ala Asp Glu Ser Glu Leu Arg Ala Val Tyr Gln Val 195 200
205 Thr Gly Glu Val Met Asn Lys Val Arg Pro Phe Val Cys Ala Leu Pro
210 215 220 Thr Asp Asp Phe Arg Leu Asn Val Asn Thr Leu Thr Glu Lys
Gln Ala225 230 235 240 Pro Leu Leu Glu Ala Met Phe Ala Pro Gly Leu
Ser Glu Ser Asp Ala 245 250 255 Lys Gln Leu Ile Asp Lys Arg Pro Phe
Asp Gly Trp Asp Thr Val Asp 260 265 270 Ala Phe Met Ala Glu Pro Ala
Ile Val Gly Val Ser Ala Glu Val Ser 275 280 285 Lys Lys Ala Lys Ala
Tyr Leu Thr Val Asp Ser Ala Tyr Phe Glu Leu 290 295 300 Asp Ala Glu
Val Leu Val Glu Gln Ser Arg Val Arg Ile Arg Thr Leu305 310 315 320
Phe Tyr Ser Ser Asn Arg Glu Thr Val Thr Val Val Arg Arg Arg Phe 325
330 335 Gly Gly Ile Ser Glu Arg Val Ser Asp Arg Ser Thr Glu 340 345
591248DNAVibrio splendidus 59gtgagcgagt ttctgaccgt tcgactgagt
agcgaaccac aaagccctgt gcagtggtta 60gtttggtcga caagccaaca agaagtgata
gcaagcggtg aactgtctag ctgggaacag 120cttgacgagt taacgcctta
cgctgaaaag cgcagctgta tcgctttatt gccgggaagt 180gaatgcttaa
ttaagcgtgt tgagatcccg aaaggtgctg ctcgccagtt tgattctatg
240ctgccgttct tattagaaga cgaagtcgca caagatatcg aagacttaca
cctgactatt 300ttagataaag atgccactca cgctaccgtg tgtggtgtgg
atcgtgaatg gctaaaacaa 360gctttagacc tgtttcgcga agccaatata
atcttccgta aggtgctacc agatacacta 420gccgtgcctt ttgaagaaca
aggcatcagt gcgttgcaga tagatcagca ttggttattg 480cgccaaggtc
actctcaacg tcaaggtcac tatcaagccg tatcgatcag tgaagcatgg
540ttaccgatgt ttttgcaaag tgattgggtt gtcgctggtg aggaagagca
agcgacgact 600atcttcagct ataccgcgat gccgagcgac gacgttcaac
agcaaagcgg cctcgagtgg 660caagcaaagc ctgcggaatt ggtgatgtct
ttattgagtc agcaagcgat cacaagcggc 720gtaaatttac tgactggcac
ctttaaaacc aaatcttcat tcagtaaata ttggcgtgtt 780tggcagaaag
tggcgattgc tgcttgtttg ctggtggccg tgattgtgac tcagcaagtg
840ttgaaggttc agcaatacga agcgcaagca caagcctacc gcatggagag
tgagcgtatc 900tttagagctg tgctgcctgg caaacaacgc attccgaccg
tgagttacct caagcgtcag 960atgaatgatg aagctaagaa atacggtggt
tcaggcgaag gtgattcttt acttggttgg 1020ttagctttgc tgcctgaaac
cttagggcaa gtgaagacga tcgaagttga aagcattcgc 1080tacgatggca
accgttctga ggttcgactg caggctaaaa gttctgactt ccaacacttt
1140gagaccgcaa gggtgaagct cgaagagaag tttgtcgttg agcaagggcc
attgaaccgt 1200aatggcgatg ccgtatttgg cagttttact cttaaacccc atcaataa
124860415PRTVibrio splendidus 60Met Ser Glu Phe Leu Thr Val Arg Leu
Ser Ser Glu Pro Gln Ser Pro1 5 10 15 Val Gln Trp Leu Val Trp Ser
Thr Ser Gln Gln Glu Val Ile Ala Ser 20 25 30 Gly Glu Leu Ser Ser
Trp Glu Gln Leu Asp Glu Leu Thr Pro Tyr Ala 35 40 45 Glu Lys Arg
Ser Cys Ile Ala Leu Leu Pro Gly Ser Glu Cys Leu Ile 50 55 60 Lys
Arg Val Glu Ile Pro Lys Gly Ala Ala Arg Gln Phe Asp Ser Met65 70 75
80 Leu Pro Phe Leu Leu Glu Asp Glu Val Ala Gln Asp Ile Glu Asp Leu
85 90 95 His Leu Thr Ile Leu Asp Lys Asp Ala Thr His Ala Thr Val
Cys Gly 100 105 110 Val Asp Arg Glu Trp Leu Lys Gln Ala Leu Asp Leu
Phe Arg Glu Ala 115 120 125 Asn Ile Ile Phe Arg Lys Val Leu Pro Asp
Thr Leu Ala Val Pro Phe 130 135 140 Glu Glu Gln Gly Ile Ser Ala Leu
Gln Ile Asp Gln His Trp Leu Leu145 150 155 160 Arg Gln Gly His Ser
Gln Arg Gln Gly His Tyr Gln Ala Val Ser Ile 165 170 175 Ser Glu Ala
Trp Leu Pro Met Phe Leu Gln Ser Asp Trp Val Val Ala 180 185 190 Gly
Glu Glu Glu Gln Ala Thr Thr Ile Phe Ser Tyr Thr Ala Met Pro 195 200
205 Ser Asp Asp Val Gln Gln Gln Ser Gly Leu Glu Trp Gln Ala Lys Pro
210 215 220 Ala Glu Leu Val Met Ser Leu Leu Ser Gln Gln Ala Ile Thr
Ser Gly225 230 235 240 Val Asn Leu Leu Thr Gly Thr Phe Lys Thr Lys
Ser Ser Phe Ser Lys 245 250 255 Tyr Trp Arg Val Trp Gln Lys Val Ala
Ile Ala Ala Cys Leu Leu Val 260 265 270 Ala Val Ile Val Thr Gln Gln
Val Leu Lys Val Gln Gln Tyr Glu Ala 275 280 285 Gln Ala Gln Ala Tyr
Arg Met Glu Ser Glu Arg Ile Phe Arg Ala Val 290 295 300 Leu Pro Gly
Lys Gln Arg Ile Pro Thr Val Ser Tyr Leu Lys Arg Gln305 310 315 320
Met Asn Asp Glu Ala Lys Lys Tyr Gly Gly Ser Gly Glu Gly Asp Ser 325
330 335 Leu Leu Gly Trp Leu Ala Leu Leu Pro Glu Thr Leu Gly Gln Val
Lys 340 345 350 Thr Ile Glu Val Glu Ser Ile Arg Tyr Asp Gly Asn Arg
Ser Glu Val 355 360 365 Arg Leu Gln Ala Lys Ser Ser Asp Phe Gln His
Phe Glu Thr Ala Arg 370 375 380 Val Lys Leu Glu Glu Lys Phe Val Val
Glu Gln Gly Pro Leu Asn Arg385 390 395 400 Asn Gly Asp Ala Val Phe
Gly Ser Phe Thr Leu Lys Pro His Gln 405 410 415 61489DNAVibrio
splendidus 61atgagaaata tgattgaacc actccaagcg tggtgggctt caataagtca
gcgggaacaa 60cgattagtca ttggttgttc tattttattg atactgggcg ttgtctattg
gggattaata 120caaccactta gccaacgagc cgagcttgca caaagccgca
ttcaaagtga gaagcaactt 180ctggcttggg taacggacaa agcgaatcaa
gtggttgaac tacgaggcag tggtggcatc 240agtgccagtc agcctttgaa
ccaatctgtg cctgcttcta tgcgccgttt taacatcgag 300ctgatacgcg
tgcaaccacg cggtgagatg ctgcaagttt ggattaagcc tgtgccattt
360aataagttcg ttgactggct gacatacctg aaagaaaagc agggtgttga
ggttgagttt 420atggatattg atcgctctga tagccctggg gttattgaga
tcaaccgact acagtttaaa 480cgaggttaa 48962162PRTVibrio splendidus
62Met Arg Asn Met Ile Glu Pro Leu Gln Ala Trp Trp Ala Ser Ile Ser1
5 10 15 Gln Arg Glu Gln Arg Leu Val Ile Gly Cys Ser Ile Leu Leu Ile
Leu 20 25 30 Gly Val Val Tyr Trp Gly Leu Ile Gln Pro Leu Ser Gln
Arg Ala Glu 35 40 45 Leu Ala Gln Ser Arg Ile Gln Ser Glu Lys Gln
Leu Leu Ala Trp Val 50 55 60 Thr Asp Lys Ala Asn Gln Val Val Glu
Leu Arg Gly Ser Gly Gly Ile65 70 75 80 Ser Ala Ser Gln Pro Leu Asn
Gln Ser Val Pro Ala Ser Met Arg Arg 85 90 95 Phe Asn Ile Glu Leu
Ile Arg Val Gln Pro Arg Gly Glu Met Leu Gln 100 105 110 Val Trp Ile
Lys Pro Val Pro Phe Asn Lys Phe Val Asp Trp Leu Thr 115 120 125 Tyr
Leu Lys Glu Lys Gln Gly Val Glu Val Glu Phe Met Asp Ile Asp 130 135
140 Arg Ser Asp Ser Pro Gly Val Ile Glu Ile Asn Arg Leu Gln Phe
Lys145 150 155 160 Arg Gly63780DNAVibrio splendidus 63gtgaaacgcg
gtttatcttt caaatacggc ctgttattca gcgtcatttt tatcgttttt 60ttctcggtaa
gcttgttgct gcatttgcct gccgcttttg ctctcaagca tgcacccgtc
120gtgcgtggtt taagcattga aggcgttgag ggcaccgttt ggcaaggtcg
cgctaacaat 180atcgcgtggc agcgtgtcaa ttacggctca gtgcagtggg
acttccagtt ctctaaacta 240ttccaagcca aagcagaact tgcggttcgc
tttggccgca acagcgacat gaacttatca 300ggtaaaggac gtgtcggata
tagcatgagt ggtgcttacg cggaaaactt agtggcatca 360atgccagcca
gcaacgtgat gaaatatgcg ccagctatcc cagtgcctgt gtctattgca
420gggcaagttg aactgacgat caaacatgcg gttcatgctc aaccttggtg
tcaatcaggt 480gaaggtacgc ttgcttggtc tggtgcagca gtcgactcgc
cagtgggttc gttagacctt 540ggccctgtga ttgcggacat aacgtgtgaa
gacagcacaa ttgcagccaa aggcactcag 600aagagcgatc aggtagacag
cgagttctca gcgagcgtaa cacctaacca acgctacacc 660tcggcagcat
ggtttaagcc aggcgctgaa ttcccgccag caatgcagag tcagcttaag
720tggttgggca atcctgatag ccaaggtaaa taccaattta cttatcaagg
ccgcttttag 78064259PRTVibrio splendidus 64Met Lys Arg Gly Leu Ser
Phe Lys Tyr Gly Leu Leu Phe Ser Val Ile1 5 10 15 Phe Ile Val Phe
Phe Ser Val Ser Leu Leu Leu His Leu Pro Ala Ala 20 25 30 Phe Ala
Leu Lys His Ala Pro Val Val Arg Gly Leu Ser Ile Glu Gly 35 40 45
Val Glu Gly Thr Val Trp Gln Gly Arg Ala Asn Asn Ile Ala Trp Gln 50
55 60 Arg Val Asn Tyr Gly Ser Val Gln Trp Asp Phe Gln Phe Ser Lys
Leu65 70 75 80 Phe Gln Ala Lys Ala Glu Leu Ala Val Arg Phe Gly Arg
Asn Ser Asp 85 90 95 Met Asn Leu Ser Gly Lys Gly Arg Val Gly Tyr
Ser Met Ser Gly Ala 100 105 110 Tyr Ala Glu Asn Leu Val Ala Ser Met
Pro Ala Ser Asn Val Met Lys 115 120 125 Tyr Ala Pro Ala Ile Pro Val
Pro Val Ser Ile Ala Gly Gln Val Glu 130 135 140 Leu Thr Ile Lys His
Ala Val His Ala Gln Pro Trp Cys Gln Ser Gly145 150 155 160 Glu Gly
Thr Leu Ala Trp Ser Gly Ala Ala Val Asp Ser Pro Val Gly 165 170 175
Ser Leu Asp Leu Gly Pro Val Ile Ala Asp Ile Thr Cys Glu Asp Ser 180
185 190 Thr Ile Ala Ala Lys Gly Thr Gln Lys Ser Asp Gln Val Asp Ser
Glu 195 200 205 Phe Ser Ala Ser Val Thr Pro Asn Gln Arg Tyr Thr Ser
Ala Ala Trp 210 215 220 Phe Lys Pro Gly Ala Glu Phe Pro Pro Ala Met
Gln Ser Gln Leu Lys225 230 235 240 Trp Leu Gly Asn Pro Asp Ser Gln
Gly Lys Tyr Gln Phe Thr Tyr Gln 245 250 255 Gly Arg
Phe6510967DNAErwinia carotovora subsp. atroseptica SCRI1043
65aagttgcagg atatgacgaa agcgtggccg acgactatac cggccacgct ttgaggaatt
60acaggaaatc agctcgctta ggcgagaaag catcgatcag tacgctaccg tcttccagcg
120aaaccacgcc gtgcatctcg tgtttcaccg ccagataggc gtcgcccgtt
ttcagggtgc 180gtttttcacc ttcgatcacg acttcaaagc tgccagcggc
aacataagca atctggtcgt 240gaatctcatg gaagtgcggc gtaccaatcg
cacctttatc aaagtgcacg taaaccatca 300tcagctcatc gctccatgtc
atgattttac gtttaatgcc accgcccagc tcttcccatg 360gcgtttcatc
atcaataaag tatcttctca tcatctctct cctctaacgc tctttttgcc
420cataccttct attgcgtcaa caaaccgtgt acgacaacga atgcatggct
atggattgcg 480acattttagc cacatcagta ccagaagaaa cataaaataa
gcaaaaccat gacggccctc 540aagaaataaa taaaacatta tttcattttt
attgaattcg catctcatcc aaactatcat 600cccgcataac aagaaagaac
cgggcatgtt gaggaacagg tgacgttgtc actgccacgc 660aacatcatct
gtttcgcccg gcgctttcgc caggaacgat tcctcttctt ggaacggcgc
720ctgatttttg tttttctctg aaagagaggc taagaaatgc aagttcgtca
aagcattcac 780agcgatcacg cgaagcagct agatacagca ggcctgcgtc
gtgaattcct gatcgaacag 840attttttctg ccgatgccta cactatgacc
tatagccaca tcgaccgaat catcgtcggt 900ggcatcatgc ccgtacacag
cgccgtaacg attggcggtg aagtgggtaa acaactcggc 960gttagctatt
tccttgagcg tcgcgaactc ggagccatca acattggcgg cgcgggtacc
1020gttactgtcg atggcgagcg ctatgacgtg ggtaatgaag aagcaattta
tgttggcatg 1080ggcgtgaaag acgtgcagtt taccagcact gatgccacta
acccggccaa gttctactac 1140aacagcgcgc ctgcacatac gacatatcct
acccgcaaga ttacccaagc tgacgcttca 1200ccacaaaccg tgggagaaga
tgcaagctgt aatcgtcgca caattaacaa atacattgtt 1260cccgatgtat
tgccaacctg ccagctcacc atgggattaa ccaagttagc tgaaggcagc
1320ctgtggaaca ccatgccttg tcatacgcat gagcgccgga tggaagtcta
tttctatttt 1380gatatggatg aggaaacggc cgttttccac atgatggggc
aaccgcagga aacccgtcac 1440atagttatta aaaacgagca ggcggtgatt
tcaccgagct ggtcgattca ttccggtgtt 1500ggcaccagac gctacacctt
tatctggggc atggttggcg agaatcaagt tttcggtgac 1560atggatcacg
tcaaggttag cgagttacgt taatcgcttt caaccggaat taccggtgtt
1620ccctacagta acagctaacg actaagtatt gtcgcttata gagagattat
tgatatgatt 1680ttaaattctt ttgatttgca aggtaaagtt gctcttatca
cgggttgtga tacgggttta 1740ggtcagggta tggctatcgg tctggcacaa
gctggctgtg atatcgttgg cgtcaacatc 1800gttgaaccaa aagataccat
cgaaaaagtt accgcactgg gacgccgttt cctcagcctg 1860accgctgaca
tgagcaacgt agcgggtcat gccgagctgg tagagaaagc cgttgctgaa
1920tttggtcacg ttgacattct ggtcaacaac gccggtatca tccgtcgtga
agatgctatc 1980gagttcagcg agaaaaactg ggacgacgtc atgaatctga
acattaagag cgttttcttt 2040atgtctcagg ctgttgcacg ccagtttatc
aaacaaggta aaggcggcaa gatcatcaac 2100atcgcctcta tgctgtcctt
ccaaggcggt atccgcgtgc cttcttacac tgcgtcaaaa 2160agcgccgtta
tgggtgtaac ccgtctgctg gctaacgagt gggcaaaaca cggcatcaac
2220gttaacgcca ttgctccagg gtacatggca accaacaata ctcagcaact
gcgcgccgat 2280gaagaccgca gcaaagagat tctggaccgt atcccggctg
gccgttgggg tttaccacag 2340gatctgatgg gcccatccgt cttcctggca
tccagcgcat ctgattacat caatggctac 2400acgattgccg ttgatggtgg
ctggctggct cgctaagtgt aatttttctt agcggcattt 2460cgctaatcca
cgataaaaag cacaatttag gttgtgcttt ttatttattt ttcaagttgt
2520tatttcgttt tttataattc tcttttctgc ctaaatcctt tcttaaaaaa
aaatcaaaac 2580aacgttccga ctttgatcac actttcgata ttgcgtgcat
gacgacaagg ttaatagcgc 2640aatataatca atcaaaacag tgtttctatt
tataaggaac tgttcacgca gttccataag 2700aaggtactcc atgagtattt
ttgaaaactt atacaccagc aggaaatcgc agctcgacga 2760atgggttgct
gcacttgata gccacatatc ctgcgttcag gaaaaaggcc gcagccaaag
2820ccaaccgacg ctattactgg ccgatggttt tgatgtggaa aattatgcgc
ctgcggtatg 2880gcaatttccg gatgggcaca gcgcgcctat ttctaatttt
gccagccagc agaattggct 2940aagaacgctg tgcgccatga gcgtcgttac
gggtaatgat agttaccaac agcacgctat 3000cgcacaaagc gaatatttcc
tggatcattt cgttgatgat aatagcggcc tgttctactg 3060gggcggccat
cgctttatta atctggatac gctggaaggc gaagggccag aatccaaagc
3120tcaggtgcat gaattaaagc accacctgcc ctattacgcg ctgttacatc
gtgttaacgc 3180ggaaaagacg ctgaacttct ttcaggggtt ctggaacgca
cacgttgaag attggaattc 3240actggatctg ggtcgtcatg gcgattacag
caaaaaacgc gatcctgatg ttttcctgca 3300taaccgtcat gatgtcgtcg
atccggcaca gtggcccgtt ctgccattaa cgaaaggcct 3360gacgtttgtt
aatgccggca cggatctgat ttacgccgca ttcaaatatg cagaatatac
3420gggcgatagc catgccgcgg catggggtaa acacctttat cgccaatacg
ttctggctcg 3480caacccagaa accggtatgc cggtgtatca attcagttca
ccacagcagc gccagccagt 3540gccggaagac gataaccaga cgcagtcctg
gtttggcgat cgcgctcaac gccagtttgg 3600cccagagttc ggtgaaatcg
cacgtgaagc caatgtgctg ttccgcgata tgcgtccact 3660gctgattgat
aacccgctgg caatgctgga tatcctccgc acacagcctg atgcagaaat
3720gctgaattgg gtaatctctg gattaaaaaa ttattaccag tacgcctacg
atgtcaccag 3780caatacgttg cgcccgatgt ggaacaacgg gcaggacatg
acaggctacc gttttaaacg 3840cgatggctat tacggcaaag cgggaacgga
attaaaaccg ttcgcattag aaggtgatta 3900tttattacct ctggttcgtg
cttatcgtct gagcggtgat gaagacctgt acgcactggt 3960taacaccatg
ctgacacggc tgaataaaga agatattcag cacatcgcca gtccgctact
4020tttgttgacc gttatcgaac tggccgatca caagcaatca gaatcctggg
cacattacgc 4080cgcacaactg gcgggcgtta tgtttgaaca acatttccat
cgtggtttgt ttgttcgctc 4140tgcacagcat cgttatgttc gtctggatga
tacctatccg ctggctttac tgactttcgt 4200tgccgcctgt cgcaacaaat
taaacgatat cccgccgtat ctgacacaag gtggatatgt 4260tcacggcgat
tttcacgtta acggggaaaa tagaattgtt tatgacgtgg aattaattta
4320tccagagtta ttaacagctt aattttatgt tttttttaat gattcacaat
taatcaatag 4380gtaagcatta tgaatgaaaa cagaatgctg gggttagcct
atatctcccc ctatattata 4440gggctgatag tttttaccgc tttccccttt
atttcgtcat ttatcctcag ttttactgag 4500tatgatttga
tgagtccgcc tgagtttacg ggtcttgaga actatcaccg tatgttcatg
4560gaggatgatc ttttttggaa atcaatgggc gtcacctttg cctatgtatt
tctgaccatt 4620ccattgaaat taatcttcgc actgttaatt gcgtttgtac
ttaatttcaa attacgtggt 4680atcggtttct tccgtactgc ttactatgtg
ccttctattc tgggcagcag cgtggccatt 4740gccgttctgt ggcgtgccct
attcgccatc gatggcttgc tgaacagctt cctcggcgta 4800tttggctttg
atgccatcaa ctggctgggc gaaccttcgc tggcactgat gtcggtaacc
4860ctgctgcgcg tatggcagtt tggttccgcc atggttatct tccttgctgc
attgcagaac 4920gtcccgcaat cacagtatga agcagccatg atcgacggtg
catccaaatg gcaaatgttc 4980ctgaaagtaa cggttccact gattacgccg
gttattttct ttaactttat catgcagacc 5040actcaggcat tccaggagtt
tacggcacct tacgtcatca ctggcggcgg tccaacgcac 5100tacacctatc
tgttctcgct ctatatctat gataccgcgt tcaagtattt cgatatgggc
5160tatggtgctg cgctggcatg ggttctgttc ctggttgttg cggtatttgc
ggcaatctcc 5220tttaagtcgt cgaaatactg ggtgttctac tccgctgata
aaggaggaaa aaatggctga 5280catgcattca aacctgacta cagcacaaga
aattgctgct gcagaagtac gccgcacgct 5340gcgtaaagag aaactcagtg
cctccatccg ttacgtgata ctgctgttcg ttggcttact 5400gatgctttac
ccactagcgt ggatgttctc agcgtcgttc aaaccgaacc aagagatctt
5460cacgacactg ggcctgtggc cggaacacgc cacatgggac ggtttcgtta
acggttggaa 5520aaccggtacg gaatacaatt tcggtcacta catgatcaat
acgctcaagt tcgtgattcc 5580gaaagtgcta ctgaccatta tctcttccac
cattgtcgct tacggctttg cccgtttcga 5640gattccatgg aagggcttct
ggttcgggac gctgatcacc accatgctgt taccaagcac 5700cgtgttgctg
attccgcagt acatcatgtt ccgtgaaatg ggcatgctga acagctatct
5760gccactgtac ttgccgatgg cgtttgcaac acaagggttc tttgtgttca
tgctgatcca 5820gttcctgcgt ggtgtaccac gtgatatgga agaagccgcc
cagatcgatg gctgtaactc 5880cttccaggtt ctgtggtatg tggtcgtgcc
gattttgaaa ccagccatca tctctgttgc 5940gctgttccag ttcatgtggt
caatgaacga cttcatcggt ccgctgattt atgtctatag 6000cgtggataaa
tatccgattg cgctggcgct gaaaatgtct atcgacgtta ctgaaggcgc
6060tccgtggaat gaaatcctgg caatgtccag catctccatt ctgccatcca
ttattgtttt 6120cttcctggca cagcgttact tcgtacaagg cgtgaccagc
agcggaatta aaggttaata 6180gaggatttat catggctgaa gttattttca
ataaactgga aaaagtatac accaacggct 6240tcaaagcggt tcacggcatc
gacctgacca ttaaagacgg tgagttcatg gttatcgtcg 6300gcccgtcagg
ctgtgcgaaa tcaacgacgc tgcgtatgtt agcgggtctg gaaaccatca
6360gcggcggtga agttcgcatc ggcgagcgcg ttgttaacaa tctggcaccg
aaagagcgtg 6420ggattgcaat ggtgttccag aactatgcgc tctaccctca
tatgacggta aaagagaacc 6480tggcgtttgg tctgaagctg agcaaaatgc
ctaaagatca aattgaagcg caagtaacgg 6540aagcagccaa aattctggag
ctggaagacc tgatggatcg tctgccacgc cagctatctg 6600gtggtcaggc
gcagcgtgtg gccgtaggcc gtgccatcgt taaaaagccg gatgttttcc
6660tgtttgatga accgttatct aacctggatg ccaaactgcg tgcttccatg
cgtatccgta 6720tttctgacct gcataagcag ttgaagaaaa gcggtaaagc
ggcaacgacg gtatatgtta 6780cccacgacca gactgaagcc atgaccatgg
gcgaccgtat ctgcgttatg aagctgggtc 6840acatcatgca ggtcgatacg
ccggataacc tgtaccattt ccctgtcaac atgttcgttg 6900ctggcttcat
tggctcacca gaaatgaaca ttaagccgtg caaactggtc gagaaagacg
6960gtcagattgg cgttgttgtg ggtaataacg cgctggtatt aaatactgaa
aaacaagata 7020aagtgcgcag ctacgtagga caagacgtat tcttcggcgt
tcgcccagac tatgtttcct 7080tgtcagatac gccatttgaa ggcagccact
cacagggtga actggttcgc gtagaaaaca 7140tgggtcacga attctttatg
tacattaaag tcgatggctt tgaattaacc agccgcattc 7200cttatgacga
aggtcggctg attatcgaga agggactgca tcgtccggta tatttccagt
7260tcgacatgga aaaatgccat atttttgatg caaaaacaga aaaaaatatc
tctctttaac 7320aggagtagta accgatgaaa aaagcgatcc tacacacgtt
aatagcttca tctttggcat 7380tagttgcaat gccatctctg gcagccgatc
aggttgagtt gagaatgtcc tggtggggcg 7440gcaacagccg tcaccaacag
acgctcaagg cgattgaaga gttccataag cagcacccag 7500acatcaccgt
gaaagcggaa tacaccggat gggatggtca cctgtctcgt ctgacaacac
7560agattgccgg taacactgag ccagatgtga tgcagactaa ctggaactgg
ctgccgattt 7620tctccaaaaa cggcgatggt ttttatgatc tgaacaaagt
gaaagattct ctggatctga 7680cccagttcga agcaaaagaa ctgcaaaaca
ccacggttaa cggcaagctg aacggtattc 7740ctatttctgt taccgctcgc
gtgttctatt tcaacaacga aagctgggca aaagcgggac 7800tggaataccc
gaaaacgtgg gacgaactgc tgaacgccgg taaagtgttc aaagagaagc
7860tgggcgacca atactaccct atcgtgttgg aacaccagga ttctctggca
ctgctgaact 7920cttacatggt tcaaaaatac aacattcctg ctattgatgt
gaaaagtcag aaattcgcct 7980ataccgatgc acaatgggtt gaattctttg
gcatgtataa gaaactgatc gacagccatg 8040tcatgcctga tgcgaaatac
tatgcctctt tcggtaagag caacatgtat gagatgaagc 8100catggatcaa
tggcgagtgg tctggtactt acatgtggaa ctccactatc actaagtact
8160ctgacaactt gcaaccacca gcaaaactgg cgttaggtaa ctacccaatg
ctgcctggtg 8220caaaagatgc tggcttgttc ttcaaacctg cacaaatgct
gtctatcggt aagtcaacca 8280agcatcctaa agagtctgct cagttgatca
acttcctgct gaacagcaaa gaaggtgctc 8340aggctttggg tctggaacgt
ggtgtaccgt tgagtaaagc ggctgtggct cagctgaccg 8400ctgatggcat
catcaaagat gatgctccag cagttgccgg gttgaagctg gcgctgtctc
8460tgccgcatga agttgctgtt tctccttatt tcgacgaccc acaaatcgtt
tctctgtttg 8520gtgataccat ccaatctatc gattatggtc agaaatctgt
ggaagacgca gcgaaatact 8580tccagcgtca atctgagcgt gttctgaaac
gcgcaatgaa ataatgtagc actcgattta 8640ccctgtaatt catccctgcc
gcaccgacgg cagggatttt tcatttaaat taaaacatcc 8700tctatattca
attcgatctc cctcacaatt tgaaacccta ttttactttt tgttactcaa
8760aacgatctcg atcacagaac gtaatttaat aataaataga atagaacttg
tcccaaaaaa 8820cataatgcgc ctttcgaatt aaagtattaa gcacagtcct
aaccaatggg gaatataaca 8880atgaaattta aattattagc tctggctgtt
acatcattaa ttagtgtgaa tgcaatggct 8940gtaactatcg attaccgtca
tgaaatgaaa gatacaccga aaaatgatca ccgcgatcgt 9000ttgtcaatgt
cacaccgttt tgccaatggc tttggtttat ccgttgaagc aaaatggcgt
9060caatccagtg ctgacagcac accgaataaa ccatttaatg aaaccgtcag
caacggtact 9120gaagttgtcg ccagctatgt ttacaacttc aacaaaactt
tttctctgga gccaggtttc 9180tctttagatt caagctctac ctctaacaac
tatcgccctt atctgcgcgg taaagtgaat 9240atcactgacg atctttctac
ctctttacgt tatcgtcctt actacaaacg taacagcggt 9300gatgttccaa
atgcatcaaa aaacaaccaa gagaatggtt ataacctaac cgccgttctc
9360agctataaat tcctgaaaga tttccaagtt gattacgaac tggactacaa
aaaagcaaat 9420aaagccggtg cgtatcaata cgacaatgaa acatacaatt
tcgaccatga tgtaaaattg 9480tcttataaaa tggataaaaa ctggaagcct
tatatggctg taggtaatgt tgcagattcc 9540ggcaccaacg atcatcgtca
aactcgttac cgtgttggtg tgcaatacag cttctaataa 9600cggccttgtt
atttaaataa gcgttattag gtagcagaag ggatgttatt gttaatcgat
9660ttactcagat ctacttttat cattaacatc cctttattat ggtgtccgtt
gtaggttaag 9720caggttagtt acgtttcttt gttgtacatg atttagttat
atgcgtttta gctgctgtaa 9780ttgctgtgtc tgatttaccc tcttcgtgta
tgaatgttat ttctttatta aaatttgcgg 9840ttcagggtag tcattttttc
tccgatgtga tggctaccct attttttacc accgcccaac 9900gattcccccc
tcattccctt tgtcaggtga tctatcatga ttgttcgttc tctgcttgtc
9960ggggccatta tgatgtctgt aaatggatta agttacgcac aacctgtttt
ctctgtctgg 10020ccacacggtg aagcaccggg tgcctcttct tcaacggcac
agccgcaagt ggtcgaacgg 10080agtaaagatc cttctcttcc cgatcgagcc
gcaacgggta ttcgcagccc tgaaattacc 10140gtttatccgg cagagaaacc
caatggcatg gcattactca ttacgccggg cggttcttat 10200cagcgcgtcg
tgctagataa agaaggcagc gatctagccc ctttctttaa tcaacaaggc
10260tacacccttt tcgtgatgac ctatcgtatg cccggtgaag gccataaaga
aggcgctgac 10320gctccgctag ccgatgccca acgagccatc agaacactga
gagccaacgc cgaaaagtgg 10380cacattaacc cgcagcgcat cggtattatg
gggttctccg ccggtggtca cgttgccgcc 10440agccttggaa cccgattcgc
acagtccgtt taccccgcga tggacgccgt tgataacgta 10500agcgcacgcc
ctgacttcat ggtgttgatg taccccgtaa tttctatgca ggcagatatt
10560gcgcacgccg gttcacgtaa acagttaatc ggcgagcaac cgatggaagt
acaagcggta 10620cgttattctc ctgagaaaca ggttactgat cagactcccc
ccacgttttt ggtgcatgcg 10680gttgacgatc cgtcagtgtc ggttgataac
agcctggtga tgtttagcgc gctgcgggca 10740aagcagattc cggtcgaaat
gcatctcttt gagaaaggta aacacggctt cggtctccgc 10800ggcaccaagg
ggcttcctgc cgctgcctgg cctcaactgc tggacaactg gctacgcgct
10860ttacctgcaa gcaacgaatt gccgaaagcc gcgccataag gtatagcaaa
catcgtaacc 10920gaaataaatc gttacgccgt caccgcttcc gcagacaggg ataatct
10967662582DNAErwinia carotovora subsp. atroseptica SCRI1043
66ccaacggcgg gtgcgacata aacataagcg aatcgaagcg ctgcgctccg gtgagtatct
60gaagtaattt acgatagttt ctttccaaag gcccattcgg gcctttgtta tttcagcgtt
120tattgattca tcaaacctgc gctttctctg ctcgaatgtt ttcactagat
ctgaaacagg 180tggtgaaaac atgaagaatg ttttataaaa taaaaccacg
atcacggaaa aatgaaacat 240tgtttctata ataccgatat gacaggcgtc
tcgcgtgaga tttgtggcct gatttttgaa 300caaccggtgt cggggtgacc
gattcgtcgg acgttcagta atgtcaggtt atcgaagcgt 360atgcgtgtgt
ggcgtcaaat tcttcatgat aagttctaag gatttacgga tggccaaagg
420taataagatc cccctaacgt ttcataccta ccaggatgca gcaaccggca
ccgaagttgt 480gcgtttaacc ccgcccgatg ttatctgcca ccggaattat
ttctaccaga agtgtttctt 540caatgacggt agcaagctgc tgtttggcgc
tgcatttgat ggcccatgga actactatct 600gctggattta aaagagcaga
acgccacaca gttgacggaa ggcaaaggcg acaatacttt 660tggtggtttc
ctgtctccga atgacgatgc gctatattac gttaaaaata cccgtaattt
720gatgcgtgtc gatctgacta cgctggaaga gaaaacgatt tatcaggtgc
ctgacgattg 780ggtcggctac ggtacttggg ttgccaactc cgattgcacc
aaaatggtcg gtattgagat 840caagaaagaa gactggaagc cactgaccga
ttggaaaaaa ttccaggagt tctacttcac 900taatccttgc tgtcgtctga
ttcgcgtcga tttggtaacg ggcgaagcgg agactatcct 960tcaggaaaac
cagtggctgg gtcacccaat ctaccgtcca ggtgatgaca acacggttgc
1020tttctgtcac gaaggcccgc atgacctggt tgatgctcgt atgtggttca
tcaacgaaga 1080tggcaccaac atgcgcaaag tgaaagagca tgcagaaggc
gaaagctgca cccacgaatt 1140ttgggtgccg gatggctccg cgatgattta
tgtctcttat cttaaagacg ataccaaccg 1200ttatattcgc agcatcgatc
ccgttacgct ggaagatcgc caactgcgtg taatgccgcc 1260gtgttctcac
ctgatgagta actatgatgg cacactgttg gtcggtgatg gttccgatgc
1320accggtcgac gtgcaggatg atggtggcta caaaattgag aacgatccgt
tcctgtatgt 1380tttcaacctg aaaactggca aagaacatcg tattgcgcag
cacaatacat cctgggaagt 1440gttggaaggg gaccgtcagg tcactcaccc
gcacccgtct ttcacgccgg ataataaaca 1500agttctgttt acttctgacg
tagatggaaa acctgcgttg tatctggcga aggttcctga 1560ttcagtctgg
aactaataat actaataaat ccgcgtcacg tttcatggcg cggattattt
1620taaaatattt acttacatat tattttatta agtctctgac gcggttattt
ctcaaactta 1680acttgattat cgttgttgct ccattgccat aatcaaagcg
ttccctttat actaaaacca 1740ttgttctatt ttttttaaaa caaaaaaacc
tgagtagggt aaccacaaaa atggctagtg 1800cagatttaga taaacaaccc
gattccgtgt cgtccgtttt aaaggttttt ggtattttgc 1860aggcattagg
tgaagagaga gaaattggta ttaccgagct ttctcagcga gtcatgatgt
1920ctaagagtac cgtttaccgt ttcttgcaga cgatgaaatc cctgggctat
gtcgcgcagg 1980aaggtgaatc agagaagtat tcgctaacgc tcaagttgtt
tgaacttggt gcaaaagcat 2040tgcagaacgt agacttaatc cgcagtgcgg
atatacagat gcgcgagttg tctgtgctga 2100cgcgggaaac gattcacctt
ggcgcgttgg atgaagacgg catcgtttat atccacaaga 2160ttgattctat
gtataacctg cgtatgtatt cgcgcatcgg tcgccgtaat ccactacaca
2220gtaccgcaat tggtaaagtg ttgctggctt ggcgcgatcg cggtgaagtg
gaagaggttc 2280tgtcgactgt cgaattcacg cgtagtacgc cacacacatt
gtgtactgct gaagatcttc 2340tcaatcaact ggatgtcgtg cgtgagcaag
gctacgggga agataaagaa gagcaggaag 2400aagggctgcg ttgtatcgct
gtgccagtat tcgatcgttt tggtgtggtg attgccggcc 2460tcagtatttc
cttcccaacg attcgttttt cagaagaaaa caaacacgaa tatgtggcca
2520tgctgcacac cgcagctaga aatatctctg agcaaatggg ctaccacaat
ttccctttct 2580ga 2582672331DNAAgrobacterium tumefaciens
67atgcgtccct ctgccccggc catctccaga cagacacttc tcgatgaacc ccgcccgggc
60tcattgacca ttggctacga gccgagcgaa gaagcacaac cgacggagaa ccctccgcgc
120ttttcatggc tacccgatat tgacgacggc gcgcgttacg tgctgcgcat
ttcgaccgat 180cccggtttta cagacaaaaa aacgctcgtc ttcgaggatc
tcgcctggaa tttcttcacc 240ccggatgaag cactgccgga cggccattat
cactggtgtt atgcgctatg ggatcagaaa 300tccgcaacag cgcattccaa
ctggagcacc gtacgcagtt tcgagatcag tgaagcactg 360ccgaaaacgc
cgctgcccgg caggtctgcc cgccatgctg ccgcgcaaac cagccaccct
420cggctgtggc tcaactccga gcaattgagt gccttcgccg atgccgttgc
gaaggacccc 480aaccattgtg gctgggccga gttttacgaa aaatcggtcg
agccgtggct cgagcggccg 540gtcatgccgg aaccgcagcc ctatcccaac
aacacgcgtg tcgccacgct ctggcggcag 600atgtatatag actgccagga
agtgatctat gcgatccggc acctggccat tgccggccgc 660gtgctcggac
gcgacgacct tctcgatgca tcccgcaaat ggctgctggc cgtcgccgcc
720tgggacacga aaggtgcgac ctcacgcgcc tataatgacg aggcggggtt
ccgcgtcgtc 780gtcgcactcg cctggggtta tgactggctg tacgaccatc
tgagcgaaga cgaacgcagg 840accgtgcgat ccgttcttct cgaacggacg
cgggaagttg ccgatcatgt catcgcacac 900gcccgcattc acgtctttcc
ctatgacagc catgcggtgc gctcgctttc ggctgtattg 960acgccggcct
gcatcgcact tcagggagaa agcgacgagg ctggcgaatg gctcgactat
1020accgtcgaat tccttgccac gctctattct ccctgggcgg gaaccgatgg
tggttgggcg 1080gaaggtccgc attactggat gaccggcatg gcctatctca
tcgaggccgc caatctgatc 1140cgctcctata ttggttatga cctctatcaa
cggccgtttt tccagaatac cggtcgcttc 1200ccgctttaca ccaaggcgcc
gggaacccgc cgcgccaact tcggcgacga ctccaccctt 1260ggcgaccttc
ccggcctgaa gctgggatac aacgtccggc aattcgccgg cgtcaccggc
1320aatggccatt accagtggta tttcgatcac atcaaggccg atgcgacagg
cacggaaatg 1380gccttttaca attacggctg gtgggacctc aacttcgacg
atctcgtcta tcgccacgat 1440tacccgcagg tggaagccgt gtctcccgcc
gacctgccgg cactcgccgt tttcgatgat 1500attggttggg cgaccatcca
aaaagacatg gaagacccgg accggcacct gcagttcgtc 1560ttcaaatcca
gcccttacgg ttcgctcagc cacagtcacg gcgaccagaa tgcctttgtg
1620ctttatgccc atggcgagga tctggcgatc cagtccggtt attacgtggc
gttcaattcg 1680cagatgcatc tgaattggcg gcgtcagaca cggtcgaaaa
atgccgtgct gatcggcggc 1740aaaggccaat atgcggaaaa ggacaaggcg
cttgcacgcc gcgccgccgg ccgcatcgtc 1800tcggtggagg aacagcccgg
ccatgttcgt atcgtcggcg atgcaaccgc cgcctaccag 1860gttgcgaacc
cgctggttca aaaggtgctg cgcgaaaccc acttcgttaa tgacagctat
1920ttcgtgattg tcgacgaagt cgaatgttcg gaaccccagg aactgcaatg
gctttgccat 1980acactcggag cgccgcagac cggcaggtca agcttccgct
acaatggccg gaaagccggt 2040ttctacggac agttcgttta ctcttcgggc
ggcacgccgc aaatcagcgc cgtggagggt 2100tttcccgata tcgacccgaa
agaattcgaa gggctcgaca tacaccacca tgtctgcgcc 2160acggttccgg
ccgccacccg gcatcgcctt gtcacccttc tggtgcctta cagcctgaag
2220gagccgaagc gcattttcag cttcatcgat gatcagggtt tttccaccga
catctacttc 2280agtgatgtcg atgacgagcg tttcaagctc tcccttccca
agcagttcta a 233168776PRTAgrobacterium tumefaciens 68Met Arg Pro
Ser Ala Pro Ala Ile Ser Arg Gln Thr Leu Leu Asp Glu1 5 10 15 Pro
Arg Pro Gly Ser Leu Thr Ile Gly Tyr Glu Pro Ser Glu Glu Ala 20 25
30 Gln Pro Thr Glu Asn Pro Pro Arg Phe Ser Trp Leu Pro Asp Ile Asp
35 40 45 Asp Gly Ala Arg Tyr Val Leu Arg Ile Ser Thr Asp Pro Gly
Phe Thr 50 55 60 Asp Lys Lys Thr Leu Val Phe Glu Asp Leu Ala Trp
Asn Phe Phe Thr65 70 75 80 Pro Asp Glu Ala Leu Pro Asp Gly His Tyr
His Trp Cys Tyr Ala Leu 85 90 95 Trp Asp Gln Lys Ser Ala Thr Ala
His Ser Asn Trp Ser Thr Val Arg 100 105 110 Ser Phe Glu Ile Ser Glu
Ala Leu Pro Lys Thr Pro Leu Pro Gly Arg 115 120 125 Ser Ala Arg His
Ala Ala Ala Gln Thr Ser His Pro Arg Leu Trp Leu 130 135 140 Asn Ser
Glu Gln Leu Ser Ala Phe Ala Asp Ala Val Ala Lys Asp Pro145 150 155
160 Asn His Cys Gly Trp Ala Glu Phe Tyr Glu Lys Ser Val Glu Pro Trp
165 170 175 Leu Glu Arg Pro Val Met Pro Glu Pro Gln Pro Tyr Pro Asn
Asn Thr 180 185 190 Arg Val Ala Thr Leu Trp Arg Gln Met Tyr Ile Asp
Cys Gln Glu Val 195 200 205 Ile Tyr Ala Ile Arg His Leu Ala Ile Ala
Gly Arg Val Leu Gly Arg 210 215 220 Asp Asp Leu Leu Asp Ala Ser Arg
Lys Trp Leu Leu Ala Val Ala Ala225 230 235 240 Trp Asp Thr Lys Gly
Ala Thr Ser Arg Ala Tyr Asn Asp Glu Ala Gly 245 250 255 Phe Arg Val
Val Val Ala Leu Ala Trp Gly Tyr Asp Trp Leu Tyr Asp 260 265 270 His
Leu Ser Glu Asp Glu Arg Arg Thr Val Arg Ser Val Leu Leu Glu 275 280
285 Arg Thr Arg Glu Val Ala Asp His Val Ile Ala His Ala Arg Ile His
290 295 300 Val Phe Pro Tyr Asp Ser His Ala Val Arg Ser Leu Ser Ala
Val Leu305 310 315 320 Thr Pro Ala Cys Ile Ala Leu Gln Gly Glu Ser
Asp Glu Ala Gly Glu 325 330 335 Trp Leu Asp Tyr Thr Val Glu Phe Leu
Ala Thr Leu Tyr Ser Pro Trp 340 345 350 Ala Gly Thr Asp Gly Gly Trp
Ala Glu Gly Pro His Tyr Trp Met Thr 355 360 365 Gly Met Ala Tyr Leu
Ile Glu Ala Ala Asn Leu Ile Arg Ser Tyr Ile 370 375 380 Gly Tyr Asp
Leu Tyr Gln Arg Pro Phe Phe Gln Asn Thr Gly Arg Phe385 390 395 400
Pro Leu Tyr Thr Lys Ala Pro Gly Thr Arg Arg Ala Asn Phe Gly Asp 405
410 415 Asp Ser Thr Leu Gly Asp Leu Pro Gly Leu Lys Leu Gly Tyr Asn
Val 420 425 430 Arg Gln Phe Ala Gly Val Thr Gly Asn Gly His Tyr Gln
Trp Tyr Phe 435 440 445 Asp His Ile Lys Ala Asp Ala Thr Gly Thr Glu
Met Ala Phe Tyr Asn 450 455 460 Tyr Gly Trp Trp Asp Leu Asn Phe Asp
Asp Leu Val Tyr Arg His Asp465 470 475 480 Tyr Pro Gln Val Glu Ala
Val Ser Pro Ala Asp Leu Pro Ala Leu Ala 485 490 495 Val Phe Asp Asp
Ile Gly Trp Ala Thr Ile Gln Lys Asp Met Glu Asp 500 505 510 Pro Asp
Arg His Leu Gln Phe Val Phe Lys Ser Ser Pro Tyr Gly Ser 515 520 525
Leu Ser His Ser His Gly Asp Gln Asn Ala Phe Val Leu Tyr
Ala His 530 535 540 Gly Glu Asp Leu Ala Ile Gln Ser Gly Tyr Tyr Val
Ala Phe Asn Ser545 550 555 560 Gln Met His Leu Asn Trp Arg Arg Gln
Thr Arg Ser Lys Asn Ala Val 565 570 575 Leu Ile Gly Gly Lys Gly Gln
Tyr Ala Glu Lys Asp Lys Ala Leu Ala 580 585 590 Arg Arg Ala Ala Gly
Arg Ile Val Ser Val Glu Glu Gln Pro Gly His 595 600 605 Val Arg Ile
Val Gly Asp Ala Thr Ala Ala Tyr Gln Val Ala Asn Pro 610 615 620 Leu
Val Gln Lys Val Leu Arg Glu Thr His Phe Val Asn Asp Ser Tyr625 630
635 640 Phe Val Ile Val Asp Glu Val Glu Cys Ser Glu Pro Gln Glu Leu
Gln 645 650 655 Trp Leu Cys His Thr Leu Gly Ala Pro Gln Thr Gly Arg
Ser Ser Phe 660 665 670 Arg Tyr Asn Gly Arg Lys Ala Gly Phe Tyr Gly
Gln Phe Val Tyr Ser 675 680 685 Ser Gly Gly Thr Pro Gln Ile Ser Ala
Val Glu Gly Phe Pro Asp Ile 690 695 700 Asp Pro Lys Glu Phe Glu Gly
Leu Asp Ile His His His Val Cys Ala705 710 715 720 Thr Val Pro Ala
Ala Thr Arg His Arg Leu Val Thr Leu Leu Val Pro 725 730 735 Tyr Ser
Leu Lys Glu Pro Lys Arg Ile Phe Ser Phe Ile Asp Asp Gln 740 745 750
Gly Phe Ser Thr Asp Ile Tyr Phe Ser Asp Val Asp Asp Glu Arg Phe 755
760 765 Lys Leu Ser Leu Pro Lys Gln Phe 770 775
691068DNAAgrobacterium tumefaciens C58 69atgttcacaa cgtccgccta
tgcctgcgat gacggctctt cgccgatgaa gctcgcgacc 60atcaggcgcc gcgatcccgg
tccgcgcgat gtcgaaatcg agatagaatt ctgtggcgtc 120tgccactcgg
acatccatac ggcccgcagc gaatggccgg gctccctcta cccttgcgtc
180cccggccacg aaatcgtcgg ccgtgtcggt cgggtgggcg cgcaagtcac
ccggttcaag 240acgggtgacc gcgtcggtgt cggctgtatc gtcgatagct
gccgcgaatg cgcaagctgc 300gccgaagggc tggagcaata ttgcgaaaac
ggcatgaccg gcacctataa ctcccctgac 360aaggcgatgg gcggcggcgc
gcatacgctt ggcggctatt ccgcccatgt ggtggtggat 420gaccgctatg
tgctcaatat tcccgaaggg ctcgatccgg cggcagcagc accgctactc
480tgcgctggta tcaccaccta ctcgccgctg cgccactgga atgccggccc
cggcaaacgc 540gtcggcgtcg tcggtctggg cggcctcggc catatggccg
tcaagctcgc caatgccatg 600ggtgcgactg tcgtgatgat caccacctcg
cccggcaagg cggaggatgc caaaaaactc 660ggcgcacacg aggtgatcat
ctcccgcgat gcggagcaga tgaagaaggc tacctcgagc 720ctcgatctca
tcatcgatgc tgtcgccgcc gaccacgaca tcgacgccta tctggcgctg
780ctgaaacgcg atggcgcgct ggtgcaggtg ggcgcgccgg aaaagccact
ttcggtgatg 840gccttcagcc tcatccccgg ccgcaagacc tttgccggct
cgatgatcgg cggtattccc 900gagactcagg aaatgctgga tttctgcgcc
gaaaaaggca tcgccggcga aatcgagatg 960atcgatatcg atcagatcaa
tgacgcttat gaacgcatga taaaaagcga tgtgcgttat 1020cgtttcgtca
ttgatatgaa gagcctgccg cgccagaagg ccgcctga 106870355PRTAgrobacterium
tumefaciens C58 70Met Phe Thr Thr Ser Ala Tyr Ala Cys Asp Asp Gly
Ser Ser Pro Met1 5 10 15 Lys Leu Ala Thr Ile Arg Arg Arg Asp Pro
Gly Pro Arg Asp Val Glu 20 25 30 Ile Glu Ile Glu Phe Cys Gly Val
Cys His Ser Asp Ile His Thr Ala 35 40 45 Arg Ser Glu Trp Pro Gly
Ser Leu Tyr Pro Cys Val Pro Gly His Glu 50 55 60 Ile Val Gly Arg
Val Gly Arg Val Gly Ala Gln Val Thr Arg Phe Lys65 70 75 80 Thr Gly
Asp Arg Val Gly Val Gly Cys Ile Val Asp Ser Cys Arg Glu 85 90 95
Cys Ala Ser Cys Ala Glu Gly Leu Glu Gln Tyr Cys Glu Asn Gly Met 100
105 110 Thr Gly Thr Tyr Asn Ser Pro Asp Lys Ala Met Gly Gly Gly Ala
His 115 120 125 Thr Leu Gly Gly Tyr Ser Ala His Val Val Val Asp Asp
Arg Tyr Val 130 135 140 Leu Asn Ile Pro Glu Gly Leu Asp Pro Ala Ala
Ala Ala Pro Leu Leu145 150 155 160 Cys Ala Gly Ile Thr Thr Tyr Ser
Pro Leu Arg His Trp Asn Ala Gly 165 170 175 Pro Gly Lys Arg Val Gly
Val Val Gly Leu Gly Gly Leu Gly His Met 180 185 190 Ala Val Lys Leu
Ala Asn Ala Met Gly Ala Thr Val Val Met Ile Thr 195 200 205 Thr Ser
Pro Gly Lys Ala Glu Asp Ala Lys Lys Leu Gly Ala His Glu 210 215 220
Val Ile Ile Ser Arg Asp Ala Glu Gln Met Lys Lys Ala Thr Ser Ser225
230 235 240 Leu Asp Leu Ile Ile Asp Ala Val Ala Ala Asp His Asp Ile
Asp Ala 245 250 255 Tyr Leu Ala Leu Leu Lys Arg Asp Gly Ala Leu Val
Gln Val Gly Ala 260 265 270 Pro Glu Lys Pro Leu Ser Val Met Ala Phe
Ser Leu Ile Pro Gly Arg 275 280 285 Lys Thr Phe Ala Gly Ser Met Ile
Gly Gly Ile Pro Glu Thr Gln Glu 290 295 300 Met Leu Asp Phe Cys Ala
Glu Lys Gly Ile Ala Gly Glu Ile Glu Met305 310 315 320 Ile Asp Ile
Asp Gln Ile Asn Asp Ala Tyr Glu Arg Met Ile Lys Ser 325 330 335 Asp
Val Arg Tyr Arg Phe Val Ile Asp Met Lys Ser Leu Pro Arg Gln 340 345
350 Lys Ala Ala 355 711047DNAAgrobacterium tumefaciens C58
71atggctattg caagaggtta tgctgcgacc gacgcgtcga agccgcttac cccgttcacc
60ttcgaacgcc gcgagccgaa tgatgacgac gtcgtcatcg atatcaaata tgccggcatc
120tgccactcgg acatccacac cgtccgcaac gaatggcaca atgccgttta
cccgatcgtt 180ccgggccacg aaatcgccgg tgtcgtgcgg gccgttggtt
ccaaggtcac gcggttcaag 240gtcggcgacc atgtcggcgt cggctgcttt
gtcgattcct gcgttggctg cgccacccgc 300gatgtcgaca atgagcagta
tatgccgggt ctcgtgcaga cctacaattc cgttgaacgg 360gacggcaaga
gcgcgaccca gggcggttat tccgaccata tcgtggtcag ggaagactac
420gtcctgtcca tcccggacaa cctgccgctc gatgcctccg cgccgcttct
ctgcgccggc 480atcacgctct attcgccgct gcagcactgg aatgcaggcc
ccggcaagaa agtggctatc 540gtcggcatgg gtggccttgg ccacatgggc
gtgaagatcg gctcggccat gggcgctgat 600atcaccgttc tctcgcagac
gctgtcgaag aaggaagacg gcctcaagct cggcgcgaag 660gaatattacg
ccaccagcga cgcctcgacc tttgagaaac tcgccggcac cttcgacctg
720atcctgtgca cagtctcggc cgaaatcgac tggaacgcct acctcaacct
gctcaaggtc 780aacggcacga tggttctgct cggcgtgccg gaacatgcga
tcccggtgca cgcattctcg 840gtcattcccg cccgccgttc gctcgccggt
tcgatgatcg gctcgatcaa ggaaacccag 900gaaatgctgg atttctgcgg
caagcacgac atcgtttcgg aaatcgaaac gatcggcatc 960aaggacgtca
acgaagccta tgagcgcgtg ctgaagagcg acgtgcgtta ccgcttcgtc
1020atcgacatgg cctcgctcga cgcttga 104772348PRTAgrobacterium
tumefaciens C58 72Met Ala Ile Ala Arg Gly Tyr Ala Ala Thr Asp Ala
Ser Lys Pro Leu1 5 10 15 Thr Pro Phe Thr Phe Glu Arg Arg Glu Pro
Asn Asp Asp Asp Val Val 20 25 30 Ile Asp Ile Lys Tyr Ala Gly Ile
Cys His Ser Asp Ile His Thr Val 35 40 45 Arg Asn Glu Trp His Asn
Ala Val Tyr Pro Ile Val Pro Gly His Glu 50 55 60 Ile Ala Gly Val
Val Arg Ala Val Gly Ser Lys Val Thr Arg Phe Lys65 70 75 80 Val Gly
Asp His Val Gly Val Gly Cys Phe Val Asp Ser Cys Val Gly 85 90 95
Cys Ala Thr Arg Asp Val Asp Asn Glu Gln Tyr Met Pro Gly Leu Val 100
105 110 Gln Thr Tyr Asn Ser Val Glu Arg Asp Gly Lys Ser Ala Thr Gln
Gly 115 120 125 Gly Tyr Ser Asp His Ile Val Val Arg Glu Asp Tyr Val
Leu Ser Ile 130 135 140 Pro Asp Asn Leu Pro Leu Asp Ala Ser Ala Pro
Leu Leu Cys Ala Gly145 150 155 160 Ile Thr Leu Tyr Ser Pro Leu Gln
His Trp Asn Ala Gly Pro Gly Lys 165 170 175 Lys Val Ala Ile Val Gly
Met Gly Gly Leu Gly His Met Gly Val Lys 180 185 190 Ile Gly Ser Ala
Met Gly Ala Asp Ile Thr Val Leu Ser Gln Thr Leu 195 200 205 Ser Lys
Lys Glu Asp Gly Leu Lys Leu Gly Ala Lys Glu Tyr Tyr Ala 210 215 220
Thr Ser Asp Ala Ser Thr Phe Glu Lys Leu Ala Gly Thr Phe Asp Leu225
230 235 240 Ile Leu Cys Thr Val Ser Ala Glu Ile Asp Trp Asn Ala Tyr
Leu Asn 245 250 255 Leu Leu Lys Val Asn Gly Thr Met Val Leu Leu Gly
Val Pro Glu His 260 265 270 Ala Ile Pro Val His Ala Phe Ser Val Ile
Pro Ala Arg Arg Ser Leu 275 280 285 Ala Gly Ser Met Ile Gly Ser Ile
Lys Glu Thr Gln Glu Met Leu Asp 290 295 300 Phe Cys Gly Lys His Asp
Ile Val Ser Glu Ile Glu Thr Ile Gly Ile305 310 315 320 Lys Asp Val
Asn Glu Ala Tyr Glu Arg Val Leu Lys Ser Asp Val Arg 325 330 335 Tyr
Arg Phe Val Ile Asp Met Ala Ser Leu Asp Ala 340 345
731029DNAAgrobacterium tumefaciens C58 73atgactaaaa caatgaaggc
ggcggttgtc cgcgcatttg gaaaaccgct gaccatcgag 60gaagtggcaa taccggatcc
cggccccggt gaaattctca tcaactacaa ggcgacgggc 120gtttgccaca
ccgacctgca cgccgcaacg ggggattggc cggtcaagcc caacccgccc
180ttcattcccg gacatgaagg tgcaggttac gtcgccaaga tcggcgctgg
cgtcaccggc 240atcaaggagg gcgaccgcgc cggcacgccc tggctctaca
ccgcctgcgg atgctgcatt 300ccctgccgta ccggctggga aaccctgtgc
ccgagccaga agaactcagg ttattccgtc 360aacggcagct ttgccgaata
tggccttgcc gatccgaaat tcgtcggccg cctgcctgac 420aatctcgatt
tcggcccagc cgcacccgtg ctctgcgccg gcgttacagt ctataagggc
480ctgaaggaaa ccgaagtcag gcccggtgaa tgggtggtca tttcaggcat
tggcgggctt 540ggccacatgg ccgtgcaata tgcgaaagcc atgggcatgc
atgtggttgc cgccgatatt 600ttcgacgaca agctggcgct tgccaaaaag
ctcggagccg acgtcgtcgt caacggccgc 660gcgcctgacg cggtggagca
agtgcaaaag gcaaccggcg gcgtccatgg cgcgctggtg 720acggcggttt
caccgaaggc catggagcag gcttatggct tcctgcgctc caagggcacg
780atggcgcttg tcggtctgcc gccgggcttc atctccattc cggtgttcga
cacggtgctg 840aagcgcatca cggtgcgtgg ctccatcgtc ggcacgcggc
aggatctgga ggaggcgttg 900accttcgccg gtgaaggcaa ggtggccgcc
cacttctcgt gggacaagct cgaaaacatc 960aatgatatct tccatcgcat
ggaagagggc aagatcgacg gccgtatcgt cgtggatctc 1020gccgcctga
102974342PRTAgrobacterium tumefaciens C58 74Met Thr Lys Thr Met Lys
Ala Ala Val Val Arg Ala Phe Gly Lys Pro1 5 10 15 Leu Thr Ile Glu
Glu Val Ala Ile Pro Asp Pro Gly Pro Gly Glu Ile 20 25 30 Leu Ile
Asn Tyr Lys Ala Thr Gly Val Cys His Thr Asp Leu His Ala 35 40 45
Ala Thr Gly Asp Trp Pro Val Lys Pro Asn Pro Pro Phe Ile Pro Gly 50
55 60 His Glu Gly Ala Gly Tyr Val Ala Lys Ile Gly Ala Gly Val Thr
Gly65 70 75 80 Ile Lys Glu Gly Asp Arg Ala Gly Thr Pro Trp Leu Tyr
Thr Ala Cys 85 90 95 Gly Cys Cys Ile Pro Cys Arg Thr Gly Trp Glu
Thr Leu Cys Pro Ser 100 105 110 Gln Lys Asn Ser Gly Tyr Ser Val Asn
Gly Ser Phe Ala Glu Tyr Gly 115 120 125 Leu Ala Asp Pro Lys Phe Val
Gly Arg Leu Pro Asp Asn Leu Asp Phe 130 135 140 Gly Pro Ala Ala Pro
Val Leu Cys Ala Gly Val Thr Val Tyr Lys Gly145 150 155 160 Leu Lys
Glu Thr Glu Val Arg Pro Gly Glu Trp Val Val Ile Ser Gly 165 170 175
Ile Gly Gly Leu Gly His Met Ala Val Gln Tyr Ala Lys Ala Met Gly 180
185 190 Met His Val Val Ala Ala Asp Ile Phe Asp Asp Lys Leu Ala Leu
Ala 195 200 205 Lys Lys Leu Gly Ala Asp Val Val Val Asn Gly Arg Ala
Pro Asp Ala 210 215 220 Val Glu Gln Val Gln Lys Ala Thr Gly Gly Val
His Gly Ala Leu Val225 230 235 240 Thr Ala Val Ser Pro Lys Ala Met
Glu Gln Ala Tyr Gly Phe Leu Arg 245 250 255 Ser Lys Gly Thr Met Ala
Leu Val Gly Leu Pro Pro Gly Phe Ile Ser 260 265 270 Ile Pro Val Phe
Asp Thr Val Leu Lys Arg Ile Thr Val Arg Gly Ser 275 280 285 Ile Val
Gly Thr Arg Gln Asp Leu Glu Glu Ala Leu Thr Phe Ala Gly 290 295 300
Glu Gly Lys Val Ala Ala His Phe Ser Trp Asp Lys Leu Glu Asn Ile305
310 315 320 Asn Asp Ile Phe His Arg Met Glu Glu Gly Lys Ile Asp Gly
Arg Ile 325 330 335 Val Val Asp Leu Ala Ala 340
751008DNAAgrobacterium tumefaciens C58 75atgaccgggg cgaaccagcc
ttgggaggtt caagaggttc ccgttccgaa ggcagagcca 60ggacttgtcc ttgttaaaat
ccacgcctcc ggcatgtgct acacggacgt gtgggcgacg 120cagggtgccg
gtggcgacat ctatccgcag acccccggcc atgaggttgt cggcgagatc
180atcgaggtcg gcgcgggcgt tcatacgcgc aaggtgggag accgggtcgg
caccacctgg 240gtgcagtcct cttgtggacg atgctcctac tgccgccaga
accgtccgtt gaccggccag 300acagccatga actgcgattc acccaggaca
acggggttcg cgacgcaagg cgggcacgca 360gagtacatcg cgatctctgc
tgaaggcaca gtgttattac ccgacgggct cgactacacg 420gatgccgcac
ccatgatgtg cgcaggctac acgacctgga gcggcttgcg cgacgccgag
480cccaaacctg gtgacagaat tgcggtactt ggcatcggcg ggctggggca
cgtcgccgtg 540cagttctcca aagccttggg gtttgagacc atcgcgatca
cgcattcacc cgacaagcac 600aagttggcca ccgatcttgg tgcagacatc
gtcgtcgccg atggcaaaga gttattggag 660gccggcggtg cggacgttct
tctggttacg accaacgact tcgacaccgc cgaaaaagcg 720atggcgggcg
taaggcctga cgggcgcatc gttctttgcg cgctcgactt cagcaagccg
780ttctcgatcc cgtccgacgg caagccgttc cacatgatgc gccaacgcgt
ggttgggtcc 840acgcatggcg gacagcacta tctcgccgaa atcctcgatc
tcgccgccaa gggcaaggtc 900aagccgattg tcgagacctt cgccctcgag
caggcaaccg aggcatatga gcggctatcc 960accgggaaga tgcgcttccg
gggcgtgttc cttccgcacg gcgcttga 100876335PRTAgrobacterium
tumefaciens C58 76Met Thr Gly Ala Asn Gln Pro Trp Glu Val Gln Glu
Val Pro Val Pro1 5 10 15 Lys Ala Glu Pro Gly Leu Val Leu Val Lys
Ile His Ala Ser Gly Met 20 25 30 Cys Tyr Thr Asp Val Trp Ala Thr
Gln Gly Ala Gly Gly Asp Ile Tyr 35 40 45 Pro Gln Thr Pro Gly His
Glu Val Val Gly Glu Ile Ile Glu Val Gly 50 55 60 Ala Gly Val His
Thr Arg Lys Val Gly Asp Arg Val Gly Thr Thr Trp65 70 75 80 Val Gln
Ser Ser Cys Gly Arg Cys Ser Tyr Cys Arg Gln Asn Arg Pro 85 90 95
Leu Thr Gly Gln Thr Ala Met Asn Cys Asp Ser Pro Arg Thr Thr Gly 100
105 110 Phe Ala Thr Gln Gly Gly His Ala Glu Tyr Ile Ala Ile Ser Ala
Glu 115 120 125 Gly Thr Val Leu Leu Pro Asp Gly Leu Asp Tyr Thr Asp
Ala Ala Pro 130 135 140 Met Met Cys Ala Gly Tyr Thr Thr Trp Ser Gly
Leu Arg Asp Ala Glu145 150 155 160 Pro Lys Pro Gly Asp Arg Ile Ala
Val Leu Gly Ile Gly Gly Leu Gly 165 170 175 His Val Ala Val Gln Phe
Ser Lys Ala Leu Gly Phe Glu Thr Ile Ala 180 185 190 Ile Thr His Ser
Pro Asp Lys His Lys Leu Ala Thr Asp Leu Gly Ala 195 200 205 Asp Ile
Val Val Ala Asp Gly Lys Glu Leu Leu Glu Ala Gly Gly Ala 210 215 220
Asp Val Leu Leu Val Thr Thr Asn Asp Phe Asp Thr Ala Glu Lys Ala225
230 235 240 Met Ala Gly Val Arg Pro Asp Gly Arg Ile Val Leu Cys Ala
Leu Asp 245 250 255 Phe Ser Lys Pro Phe Ser Ile Pro Ser Asp Gly Lys
Pro Phe His Met 260 265 270 Met Arg Gln Arg Val Val Gly Ser Thr His
Gly Gly Gln His Tyr Leu 275 280 285 Ala Glu Ile Leu Asp Leu Ala Ala
Lys Gly Lys Val Lys Pro Ile Val 290 295 300 Glu Thr Phe Ala Leu Glu
Gln Ala Thr Glu Ala Tyr Glu Arg Leu Ser305 310 315 320 Thr Gly Lys
Met Arg Phe Arg Gly Val Phe Leu Pro His Gly Ala
325 330 335 771017DNAAgrobacterium tumefaciens C58 77atgaccatgc
atgccattca attcgtcgag aagggacgcg ccgtgctggc ggaactcccc 60gtcgccgatc
tgccgccggg ccatgcgctc gtgcgggtca aggcttcggg gctttgccat
120accgatatcg acgtgctgca tgcgcgttat ggcgacggtg cgttccccgt
cattccgggg 180catgaatatg ctggcgaagt cgcagccgtg gcttccgatg
tgacagtctt caaggctggc 240gaccgggttg tcgtcgatcc caatctgccc
tgtggcacct gcgccagctg caggaaaggg 300ctgaccaacc tttgcagcac
attgaaagct tacggcgttt cccacaatgg cggctttgcg 360gagttcagtg
tggtgcgtgc cgatcacctg cacggtatcg gttcgatgcc ctatcacgtc
420gcggcgctgg ctgagccgct tgcctgtgtt gtcaatggca tgcagagtgc
gggtattggc 480gagagtggcg tggtgccgga gaatgcgctt gttttcggtg
ctgggcccat cggcctgctg 540cttgccctgt cgctgaaatc acgcggcatt
gcgacggtga cgatggccga tatcaatgaa 600agcaggctgg cctttgccca
ggacctcggg cttcagacgg cggtatccgg ctcggaagcg 660ctctcgcggc
agcggaagga gttcgatttc gtggccgatg cgacgggtat tgccccggtc
720gccgaggcga tgatcccgct ggttgcggat ggcggcacgg cgctattctt
cggcgtctgc 780gcgccggatg cccgtatttc ggtggcaccg tttgaaatct
tccggcgcca gctgaaactt 840gtcggctcgc attcgctgaa ccgcaacata
ccgcaggcgc ttgccattct ggagacggat 900ggcgaggtca tggcgcggct
cgtttcgcac cgcttgccgc tttcggagat gctgccgttc 960tttacgaaaa
aaccgtctga tccggcgacg atgaaagtgc aatttgcagc cgaatga
101778338PRTAgrobacterium tumefaciens C58 78Met Thr Met His Ala Ile
Gln Phe Val Glu Lys Gly Arg Ala Val Leu1 5 10 15 Ala Glu Leu Pro
Val Ala Asp Leu Pro Pro Gly His Ala Leu Val Arg 20 25 30 Val Lys
Ala Ser Gly Leu Cys His Thr Asp Ile Asp Val Leu His Ala 35 40 45
Arg Tyr Gly Asp Gly Ala Phe Pro Val Ile Pro Gly His Glu Tyr Ala 50
55 60 Gly Glu Val Ala Ala Val Ala Ser Asp Val Thr Val Phe Lys Ala
Gly65 70 75 80 Asp Arg Val Val Val Asp Pro Asn Leu Pro Cys Gly Thr
Cys Ala Ser 85 90 95 Cys Arg Lys Gly Leu Thr Asn Leu Cys Ser Thr
Leu Lys Ala Tyr Gly 100 105 110 Val Ser His Asn Gly Gly Phe Ala Glu
Phe Ser Val Val Arg Ala Asp 115 120 125 His Leu His Gly Ile Gly Ser
Met Pro Tyr His Val Ala Ala Leu Ala 130 135 140 Glu Pro Leu Ala Cys
Val Val Asn Gly Met Gln Ser Ala Gly Ile Gly145 150 155 160 Glu Ser
Gly Val Val Pro Glu Asn Ala Leu Val Phe Gly Ala Gly Pro 165 170 175
Ile Gly Leu Leu Leu Ala Leu Ser Leu Lys Ser Arg Gly Ile Ala Thr 180
185 190 Val Thr Met Ala Asp Ile Asn Glu Ser Arg Leu Ala Phe Ala Gln
Asp 195 200 205 Leu Gly Leu Gln Thr Ala Val Ser Gly Ser Glu Ala Leu
Ser Arg Gln 210 215 220 Arg Lys Glu Phe Asp Phe Val Ala Asp Ala Thr
Gly Ile Ala Pro Val225 230 235 240 Ala Glu Ala Met Ile Pro Leu Val
Ala Asp Gly Gly Thr Ala Leu Phe 245 250 255 Phe Gly Val Cys Ala Pro
Asp Ala Arg Ile Ser Val Ala Pro Phe Glu 260 265 270 Ile Phe Arg Arg
Gln Leu Lys Leu Val Gly Ser His Ser Leu Asn Arg 275 280 285 Asn Ile
Pro Gln Ala Leu Ala Ile Leu Glu Thr Asp Gly Glu Val Met 290 295 300
Ala Arg Leu Val Ser His Arg Leu Pro Leu Ser Glu Met Leu Pro Phe305
310 315 320 Phe Thr Lys Lys Pro Ser Asp Pro Ala Thr Met Lys Val Gln
Phe Ala 325 330 335 Ala Glu791044DNAAgrobacterium tumefaciens C58
79atgcgcgcgc tttattacga acgattcggc gagacccctg tagtcgcgtc cctgcctgat
60ccggcaccga gcgatggcgg cgtggtgatt gcggtgaagg caaccggcct ctgccgcagc
120gactggcatg gctggatggg acatgacacg gatatccgtc tgccgcatgt
gcccggccac 180gagttcgccg gcgtcatctc cgcagtcggc agaaacgtca
cccgcttcaa gacgggtgat 240cgcgttaccg tgcctttcgt ctccggctgc
ggccattgcc atgagtgccg ctccggcaat 300cagcaggtct gcgaaacgca
gttccagccc ggcttcaccc attggggttc cttcgccgaa 360tatgtcgcca
tcgactatgc cgatcagaac ctcgtgcacc tgccggaatc gatgagttac
420gccaccgccg ccggcctcgg ttgccgtttc gccacctcct tccgggcggt
gacggatcag 480ggacgcctga agggcggcga atggctggct gtccatggct
gcggcggtgt cggtctctcc 540gccatcatga tcggcgccgg cctcggcgca
caggtcgtcg ccatcgatat tgccgaagac 600aagctcgaac tcgcccggca
actgggtgca accgcaacca tcaacagccg ctccgttgcc 660gatgtcgccg
aagcggtgcg cgacatcacc ggtggcggcg cgcatgtgtc ggtggatgcg
720cttggccatc cgcagacctg ctgcaattcc atcagcaacc tgcgccggcg
cggacgccat 780gtgcaggtgg ggctgatgct ggcagaccat gccatgccgg
ccattcccat ggcccgggtg 840atcgctcatg agctggagat ctatggcagc
cacggcatgc aggcatggcg ttacgaggac 900atgctggcca tgatcgaaag
cggcaggctt gcgccggaaa agctgattgg ccgccatatc 960tcgctgaccg
aagcggccgt cgccctgccc ggaatggata ggttccagga gagcggcatc
1020agcatcatcg accggttcga atag 104480357PRTAgrobacterium
tumefaciens C58 80Met Asn Leu Arg Thr Asn Asp Glu Ala Met Met Arg
Ala Leu Tyr Tyr1 5 10 15 Glu Arg Phe Gly Glu Thr Pro Val Val Ala
Ser Leu Pro Asp Pro Ala 20 25 30 Pro Ser Asp Gly Gly Val Val Ile
Ala Val Lys Ala Thr Gly Leu Cys 35 40 45 Arg Ser Asp Trp His Gly
Trp Met Gly His Asp Thr Asp Ile Arg Leu 50 55 60 Pro His Val Pro
Gly His Glu Phe Ala Gly Val Ile Ser Ala Val Gly65 70 75 80 Arg Asn
Val Thr Arg Phe Lys Thr Gly Asp Arg Val Thr Val Pro Phe 85 90 95
Val Ser Gly Cys Gly His Cys His Glu Cys Arg Ser Gly Asn Gln Gln 100
105 110 Val Cys Glu Thr Gln Phe Gln Pro Gly Phe Thr His Trp Gly Ser
Phe 115 120 125 Ala Glu Tyr Val Ala Ile Asp Tyr Ala Asp Gln Asn Leu
Val His Leu 130 135 140 Pro Glu Ser Met Ser Tyr Ala Thr Ala Ala Gly
Leu Gly Cys Arg Phe145 150 155 160 Ala Thr Ser Phe Arg Ala Val Thr
Asp Gln Gly Arg Leu Lys Gly Gly 165 170 175 Glu Trp Leu Ala Val His
Gly Cys Gly Gly Val Gly Leu Ser Ala Ile 180 185 190 Met Ile Gly Ala
Gly Leu Gly Ala Gln Val Val Ala Ile Asp Ile Ala 195 200 205 Glu Asp
Lys Leu Glu Leu Ala Arg Gln Leu Gly Ala Thr Ala Thr Ile 210 215 220
Asn Ser Arg Ser Val Ala Asp Val Ala Glu Ala Val Arg Asp Ile Thr225
230 235 240 Gly Gly Gly Ala His Val Ser Val Asp Ala Leu Gly His Pro
Gln Thr 245 250 255 Cys Cys Asn Ser Ile Ser Asn Leu Arg Arg Arg Gly
Arg His Val Gln 260 265 270 Val Gly Leu Met Leu Ala Asp His Ala Met
Pro Ala Ile Pro Met Ala 275 280 285 Arg Val Ile Ala His Glu Leu Glu
Ile Tyr Gly Ser His Gly Met Gln 290 295 300 Ala Trp Arg Tyr Glu Asp
Met Leu Ala Met Ile Glu Ser Gly Arg Leu305 310 315 320 Ala Pro Glu
Lys Leu Ile Gly Arg His Ile Ser Leu Thr Glu Ala Ala 325 330 335 Val
Ala Leu Pro Gly Met Asp Arg Phe Gln Glu Ser Gly Ile Ser Ile 340 345
350 Ile Asp Arg Phe Glu 355 811011DNAAgrobacterium tumefaciens C58
81atgctggcga ttttctgtga cactcccggt caattaaccg ccaaggatct gccgaacccc
60gtgcgcggcg aaggtgaagt cctggtacgt attcgccgga ttggcgtttg cggcacggat
120ctgcacatct ttaccggcaa ccagccctat ctttcctatc cgcggatcat
gggtcacgaa 180ctttccggca cggttgagga ggcacccgct ggcagccacc
tttccgctgg cgatgtggtg 240accataattc cctatatgtc ctgcgggaaa
tgcaatgcct gcctgaaggg taagagcaat 300tgctgccgca atatcggtgt
gcttggcgtt catcgcgatg gcggcatggt ggaatatctg 360agcgtgccgc
agcaattcgt gctgaaggcg gaggggctga gcctcgacca ggcagccatg
420acggaatttc tggcgatcgg tgcccatgcg gtgcgtcgcg gtgccgtcga
aaaagggcaa 480aaggtcctga tcgtcggtgc cggcccgatc ggcatggcgg
ttgctgtctt tgcggttctc 540gatggcacgg aagtgacgat gatcgacggt
cgcaccgacc ggctggattt ctgcaaggac 600cacctcggtg tcgctcatac
agtcgccctc ggcgacggtg acaaagatcg tctgtccgac 660attaccggtg
gcaatttctt cgatgcggtg tttgatgcga ccggcaatcc gaaagccatg
720gagcgcggtt tctccttcgt cggtcacggc ggctcctatg ttctggtgtc
catcgtcgcc 780agcgatatca gcttcaacga cccggaattt cacaagcgtg
agacgacgct gctcggcagc 840cgcaacgcga cggctgatga tttcgagcgg
gtgcttcgcg ccttgcgcga agggaaagtg 900ccggaggcac taatcaccca
tcgcatgaca cttgccgatg ttccctcgaa gttcgccggc 960ctgaccgatc
cgaaagccgg agtcatcaag ggcatggtgg aggtcgcatg a
101182336PRTAgrobacterium tumefaciens C58 82Met Leu Ala Ile Phe Cys
Asp Thr Pro Gly Gln Leu Thr Ala Lys Asp1 5 10 15 Leu Pro Asn Pro
Val Arg Gly Glu Gly Glu Val Leu Val Arg Ile Arg 20 25 30 Arg Ile
Gly Val Cys Gly Thr Asp Leu His Ile Phe Thr Gly Asn Gln 35 40 45
Pro Tyr Leu Ser Tyr Pro Arg Ile Met Gly His Glu Leu Ser Gly Thr 50
55 60 Val Glu Glu Ala Pro Ala Gly Ser His Leu Ser Ala Gly Asp Val
Val65 70 75 80 Thr Ile Ile Pro Tyr Met Ser Cys Gly Lys Cys Asn Ala
Cys Leu Lys 85 90 95 Gly Lys Ser Asn Cys Cys Arg Asn Ile Gly Val
Leu Gly Val His Arg 100 105 110 Asp Gly Gly Met Val Glu Tyr Leu Ser
Val Pro Gln Gln Phe Val Leu 115 120 125 Lys Ala Glu Gly Leu Ser Leu
Asp Gln Ala Ala Met Thr Glu Phe Leu 130 135 140 Ala Ile Gly Ala His
Ala Val Arg Arg Gly Ala Val Glu Lys Gly Gln145 150 155 160 Lys Val
Leu Ile Val Gly Ala Gly Pro Ile Gly Met Ala Val Ala Val 165 170 175
Phe Ala Val Leu Asp Gly Thr Glu Val Thr Met Ile Asp Gly Arg Thr 180
185 190 Asp Arg Leu Asp Phe Cys Lys Asp His Leu Gly Val Ala His Thr
Val 195 200 205 Ala Leu Gly Asp Gly Asp Lys Asp Arg Leu Ser Asp Ile
Thr Gly Gly 210 215 220 Asn Phe Phe Asp Ala Val Phe Asp Ala Thr Gly
Asn Pro Lys Ala Met225 230 235 240 Glu Arg Gly Phe Ser Phe Val Gly
His Gly Gly Ser Tyr Val Leu Val 245 250 255 Ser Ile Val Ala Ser Asp
Ile Ser Phe Asn Asp Pro Glu Phe His Lys 260 265 270 Arg Glu Thr Thr
Leu Leu Gly Ser Arg Asn Ala Thr Ala Asp Asp Phe 275 280 285 Glu Arg
Val Leu Arg Ala Leu Arg Glu Gly Lys Val Pro Glu Ala Leu 290 295 300
Ile Thr His Arg Met Thr Leu Ala Asp Val Pro Ser Lys Phe Ala Gly305
310 315 320 Leu Thr Asp Pro Lys Ala Gly Val Ile Lys Gly Met Val Glu
Val Ala 325 330 335 831005DNAAgrobacterium tumefaciens C58
83gtgaaagcct tcgtcgtcga caagtacaag aagaagggcc cgctgcgtct ggccgacatg
60cccaatccgg tcatcggcgc caatgatgtg ctggttcgca tccatgccac tgccatcaat
120cttctcgact ccaaggtgcg cgacggggaa ttcaagctgt tcctgcccta
tcgtcctccc 180ttcattctcg gtcatgatct ggccggaacg gtcatccgcg
tcggcgcgaa tgtacggcag 240ttcaagacag gcgacgaggt tttcgctcgc
ccgcgtgatc accgggtcgg aaccttcgca 300gaaatgattg cggtcgatgc
cgcagacctt gcgctgaagc caacgagcct gtccatggag 360caggcagcgt
cgatcccgct cgtcggactg actgcctggc aggcgcttat cgaggttggc
420aaggtcaagt ccggccagaa ggttttcatc caggccggtt ccggcggtgt
cggcaccttc 480gccatccagc ttgccaagca tctcggcgct accgtggcca
cgaccaccag cgccgcgaat 540gccgaactgg tcaaaagcct cggcgcagat
gtggtgatcg actacaagac gcaggacttc 600gaacaggtgc tgtccggcta
cgatctcgtc ctgaacagcc aggatgccaa gacgctggaa 660aagtcgttga
acgtgctgag accgggcgga aagctcattt cgatctccgg tccgccggat
720gttgcctttg ccagatcgtt gaaactgaat ccgctcctgc gttttgtcgt
cagaatgctg 780agccgtggtg tcctgaaaaa ggcaagcaga cgcggtgtcg
attactcttt cctgttcatg 840cgcgccgaag gtcagcaatt gcatgagatc
gccgaactga tcgatgccgg caccatccgt 900ccggtcgtcg acaaggtgtt
tcaatttgcg cagacgcccg acgccctggc ctatgtcgag 960accggacggg
caaggggcaa ggttgtggtt acatacgcat cctag 100584359PRTAgrobacterium
tumefaciens C58 84Met Pro Ser Leu Cys Arg Lys Pro Trp Leu Ser Ser
Leu Pro Asp Leu1 5 10 15 Ile Asn Val Ser His Trp Arg Lys Pro Val
Lys Ala Phe Val Val Asp 20 25 30 Lys Tyr Lys Lys Lys Gly Pro Leu
Arg Leu Ala Asp Met Pro Asn Pro 35 40 45 Val Ile Gly Ala Asn Asp
Val Leu Val Arg Ile His Ala Thr Ala Ile 50 55 60 Asn Leu Leu Asp
Ser Lys Val Arg Asp Gly Glu Phe Lys Leu Phe Leu65 70 75 80 Pro Tyr
Arg Pro Pro Phe Ile Leu Gly His Asp Leu Ala Gly Thr Val 85 90 95
Ile Arg Val Gly Ala Asn Val Arg Gln Phe Lys Thr Gly Asp Glu Val 100
105 110 Phe Ala Arg Pro Arg Asp His Arg Val Gly Thr Phe Ala Glu Met
Ile 115 120 125 Ala Val Asp Ala Ala Asp Leu Ala Leu Lys Pro Thr Ser
Leu Ser Met 130 135 140 Glu Gln Ala Ala Ser Ile Pro Leu Val Gly Leu
Thr Ala Trp Gln Ala145 150 155 160 Leu Ile Glu Val Gly Lys Val Lys
Ser Gly Gln Lys Val Phe Ile Gln 165 170 175 Ala Gly Ser Gly Gly Val
Gly Thr Phe Ala Ile Gln Leu Ala Lys His 180 185 190 Leu Gly Ala Thr
Val Ala Thr Thr Thr Ser Ala Ala Asn Ala Glu Leu 195 200 205 Val Lys
Ser Leu Gly Ala Asp Val Val Ile Asp Tyr Lys Thr Gln Asp 210 215 220
Phe Glu Gln Val Leu Ser Gly Tyr Asp Leu Val Leu Asn Ser Gln Asp225
230 235 240 Ala Lys Thr Leu Glu Lys Ser Leu Asn Val Leu Arg Pro Gly
Gly Lys 245 250 255 Leu Ile Ser Ile Ser Gly Pro Pro Asp Val Ala Phe
Ala Arg Ser Leu 260 265 270 Lys Leu Asn Pro Leu Leu Arg Phe Val Val
Arg Met Leu Ser Arg Gly 275 280 285 Val Leu Lys Lys Ala Ser Arg Arg
Gly Val Asp Tyr Ser Phe Leu Phe 290 295 300 Met Arg Ala Glu Gly Gln
Gln Leu His Glu Ile Ala Glu Leu Ile Asp305 310 315 320 Ala Gly Thr
Ile Arg Pro Val Val Asp Lys Val Phe Gln Phe Ala Gln 325 330 335 Thr
Pro Asp Ala Leu Ala Tyr Val Glu Thr Gly Arg Ala Arg Gly Lys 340 345
350 Val Val Val Thr Tyr Ala Ser 355 851032DNAAgrobacterium
tumefaciens C58 85atgaaagcga ttgtcgccca cggggcaaag gatgtgcgca
tcgaagaccg gccggaggaa 60aagccgggtc cgggcgaggt gcggctccgt ctggcgaggg
gcgggatctg cggcagtgat 120ctgcattatt acaatcatgg cggtttcggc
gccgtgcggc ttcgtgaacc catggtgctg 180ggccatgagg tttccgccgt
catcgaggaa ctgggcgaag gcgttgaggg gctgaagatc 240ggcggtctgg
tggcggtttc gccgtcgcgc ccatgccgaa cctgccgctt ctgccaggag
300ggtctgcaca atcagtgcct caacatgcgg ttttatggca gcgccatgcc
tttcccgcat 360attcagggcg cgttccggga aattctggtg gcggacgccc
tgcaatgcgt gccggccgat 420ggtctcagcg ccggggaagc cgccatggcg
gaaccgctgg cggtgacgct gcatgccaca 480cgccgggccg gcgatttgct
gggaaaacgt gtgctcgtca cgggttgcgg ccccatcggc 540attctctcca
ttctggctgc gcgccgggcg ggtgctgctg aaatcgtcgc caccgacctt
600tccgatttca cgctcggcaa ggcgcgtgaa gcgggggcgg accgtgtcat
caacagcaag 660gatgagcccg atgcgctcgc cgcttatggt gcaaacaagg
gaaccttcga cattctctat 720gaatgctcgg gtgcggccgt ggcgcttgcc
ggcggcatta cggcactgcg gccgcgcggc 780atcatcgtcc agctcgggct
cggcggcgat atgagcctgc cgatgatggc gatcacagcc 840aaggaactcg
acctgcgtgg ttcctttcgc ttccacgagg aattcgccac cggcgtcgag
900ctgatgcgca agggcctgat cgacgtcaaa cccttcatca cccagaccgt
cgatcttgcc 960gacgccatct cggccttcga attcgcctcg gatcgcagcc
gcgccatgaa ggtgcagatc 1020gccttttcct aa 103286343PRTAgrobacterium
tumefaciens C58 86Met Lys Ala Ile Val Ala His Gly Ala Lys Asp Val
Arg Ile Glu Asp1 5 10 15 Arg Pro Glu Glu Lys Pro Gly Pro Gly Glu
Val Arg Leu Arg Leu Ala 20 25 30 Arg Gly Gly Ile Cys Gly Ser Asp
Leu His Tyr Tyr Asn His Gly Gly 35 40 45 Phe Gly Ala Val Arg Leu
Arg Glu Pro Met Val Leu Gly His Glu Val 50 55
60 Ser Ala Val Ile Glu Glu Leu Gly Glu Gly Val Glu Gly Leu Lys
Ile65 70 75 80 Gly Gly Leu Val Ala Val Ser Pro Ser Arg Pro Cys Arg
Thr Cys Arg 85 90 95 Phe Cys Gln Glu Gly Leu His Asn Gln Cys Leu
Asn Met Arg Phe Tyr 100 105 110 Gly Ser Ala Met Pro Phe Pro His Ile
Gln Gly Ala Phe Arg Glu Ile 115 120 125 Leu Val Ala Asp Ala Leu Gln
Cys Val Pro Ala Asp Gly Leu Ser Ala 130 135 140 Gly Glu Ala Ala Met
Ala Glu Pro Leu Ala Val Thr Leu His Ala Thr145 150 155 160 Arg Arg
Ala Gly Asp Leu Leu Gly Lys Arg Val Leu Val Thr Gly Cys 165 170 175
Gly Pro Ile Gly Ile Leu Ser Ile Leu Ala Ala Arg Arg Ala Gly Ala 180
185 190 Ala Glu Ile Val Ala Thr Asp Leu Ser Asp Phe Thr Leu Gly Lys
Ala 195 200 205 Arg Glu Ala Gly Ala Asp Arg Val Ile Asn Ser Lys Asp
Glu Pro Asp 210 215 220 Ala Leu Ala Ala Tyr Gly Ala Asn Lys Gly Thr
Phe Asp Ile Leu Tyr225 230 235 240 Glu Cys Ser Gly Ala Ala Val Ala
Leu Ala Gly Gly Ile Thr Ala Leu 245 250 255 Arg Pro Arg Gly Ile Ile
Val Gln Leu Gly Leu Gly Gly Asp Met Ser 260 265 270 Leu Pro Met Met
Ala Ile Thr Ala Lys Glu Leu Asp Leu Arg Gly Ser 275 280 285 Phe Arg
Phe His Glu Glu Phe Ala Thr Gly Val Glu Leu Met Arg Lys 290 295 300
Gly Leu Ile Asp Val Lys Pro Phe Ile Thr Gln Thr Val Asp Leu Ala305
310 315 320 Asp Ala Ile Ser Ala Phe Glu Phe Ala Ser Asp Arg Ser Arg
Ala Met 325 330 335 Lys Val Gln Ile Ala Phe Ser 340
87939DNAAgrobacterium tumefaciens C58 87atgccgatgg cgctcgggca
cgaagcggcg ggcgtcgtcg aggcattggg cgaaggcgtg 60cgcgatcttg agcccggcga
tcatgtggtc atggtcttca tgcccagttg cggacattgc 120ctgccctgtg
cggaaggcag gcccgctctg tgcgagccgg gcgccgccgc caatgcagca
180ggcaggctgt tgggtggcgc cacccgcctg aactatcatg gcgaggtcgt
ccatcatcac 240cttggtgtgt cggcctttgc cgaatatgcc gtggtgtcgc
gcaattcgct ggtcaagatc 300gaccgcgatc ttccatttgt cgaggcggca
ctcttcggct gcgcggttct caccggcgtc 360ggcgccgtcg tgaatacggc
aagggtcagg accggctcga ctgcggtcgt catcggactt 420ggcggtgtgg
gccttgccgc ggttctcgga gcccgggcgg ccggtgccag caagatcgtc
480gccgtcgacc tttcgcagga aaagcttgca ctcgccagcg aactgggcgc
gaccgccatc 540gtgaacggac gcgatgagga tgccgtcgag caggtccgcg
agctcacttc cggcggtgcc 600gattatgcct tcgagatggc agggtctatt
cgcgccctcg aaaacgcctt caggatgacc 660aaacgtggcg gcaccaccgt
taccgccggt ctgccaccgc cgggtgcggc cctgccgctc 720aacgtcgtgc
agctcgtcgg cgaggagcgg acactcaagg gcagctatat cggcacctgt
780gtgcctctcc gggatattcc gcgcttcatc gccctttatc gcgacggccg
gttgccggtg 840aaccgccttc tgagcggaag gctgaagcta gaagacatca
atgaagggtt cgaccgcctg 900cacgacggaa gcgccgttcg gcaagtcatc gaattctga
93988312PRTAgrobacterium tumefaciens C58 88Met Pro Met Ala Leu Gly
His Glu Ala Ala Gly Val Val Glu Ala Leu1 5 10 15 Gly Glu Gly Val
Arg Asp Leu Glu Pro Gly Asp His Val Val Met Val 20 25 30 Phe Met
Pro Ser Cys Gly His Cys Leu Pro Cys Ala Glu Gly Arg Pro 35 40 45
Ala Leu Cys Glu Pro Gly Ala Ala Ala Asn Ala Ala Gly Arg Leu Leu 50
55 60 Gly Gly Ala Thr Arg Leu Asn Tyr His Gly Glu Val Val His His
His65 70 75 80 Leu Gly Val Ser Ala Phe Ala Glu Tyr Ala Val Val Ser
Arg Asn Ser 85 90 95 Leu Val Lys Ile Asp Arg Asp Leu Pro Phe Val
Glu Ala Ala Leu Phe 100 105 110 Gly Cys Ala Val Leu Thr Gly Val Gly
Ala Val Val Asn Thr Ala Arg 115 120 125 Val Arg Thr Gly Ser Thr Ala
Val Val Ile Gly Leu Gly Gly Val Gly 130 135 140 Leu Ala Ala Val Leu
Gly Ala Arg Ala Ala Gly Ala Ser Lys Ile Val145 150 155 160 Ala Val
Asp Leu Ser Gln Glu Lys Leu Ala Leu Ala Ser Glu Leu Gly 165 170 175
Ala Thr Ala Ile Val Asn Gly Arg Asp Glu Asp Ala Val Glu Gln Val 180
185 190 Arg Glu Leu Thr Ser Gly Gly Ala Asp Tyr Ala Phe Glu Met Ala
Gly 195 200 205 Ser Ile Arg Ala Leu Glu Asn Ala Phe Arg Met Thr Lys
Arg Gly Gly 210 215 220 Thr Thr Val Thr Ala Gly Leu Pro Pro Pro Gly
Ala Ala Leu Pro Leu225 230 235 240 Asn Val Val Gln Leu Val Gly Glu
Glu Arg Thr Leu Lys Gly Ser Tyr 245 250 255 Ile Gly Thr Cys Val Pro
Leu Arg Asp Ile Pro Arg Phe Ile Ala Leu 260 265 270 Tyr Arg Asp Gly
Arg Leu Pro Val Asn Arg Leu Leu Ser Gly Arg Leu 275 280 285 Lys Leu
Glu Asp Ile Asn Glu Gly Phe Asp Arg Leu His Asp Gly Ser 290 295 300
Ala Val Arg Gln Val Ile Glu Phe305 310 891035DNAAgrobacterium
tumefaciens C58 89atgaaacatt ctcaggacaa accacgcctg ctgattgcga
tgcgtagcga gcttccagaa 60ggcttcttcg gtccgcgcga atgggcaagg ctgaatgccg
tagcggacat tattccgggc 120tttccccata cggatttcga cacggcgaac
ggtgccgagg ctctcgccga agcggatatt 180ctgctcgctg cctggggtac
gccatccctg acacgcgaac gactttcacg cgcgccgcgg 240ctgaaaatgc
tggcctatgc ggcatcatcg gtgcggatgg ttgcgcccgc agaattctgg
300gagacgtcgg atattctggt cacgacagca gcttccgcca tggccgtgcc
ggttgccgaa 360ttcacctatg cggcaatcat catgtgcggc aaggatgtgt
ttcgattgcg ggatgaacat 420agaacagagc gcggcaccgg cgtttttggc
agcaggcgcg gcagaagcct gccctatctt 480ggcaatcatg cccgcaaggt
tggcattgtc ggcgcctcgc gcatcgggcg gctggtgatg 540gagatgctgg
cgcgcggcac attcgagatt gccgtttacg atccctttct gtcggcggaa
600gaggccgcat cccttggcgc gaagaaagcc gaactggacg agcttctcgc
atggtccgat 660gtggtctcgc tgcacgcgcc gatcctgccg gaaacgcacc
atatgatcgg cgcccgcgaa 720ctggcgctga tggcggacca tgccatcttc
atcaacacgg cgcggggctg gctggtcgac 780cacgatgcat tgctgactga
agcgatttcc ggacggctgc gcattctgat tgacacgccc 840gaacccgagc
ccctgcccac ggacagcccg ttttacgatc tgcccaatgt cgttctaacc
900ccccatatag ccggggcgct gggcaatgaa ttgcgcgcac tttccgatct
ggccattacc 960gaaattgaac gtttcgtggc gggacttgcg cccctccacc
cggtccacaa gcaggatatg 1020gaacgtatgg catga
103590331PRTAgrobacterium tumefaciens C58 90Met Arg Ser Glu Leu Pro
Glu Gly Phe Phe Gly Pro Arg Glu Trp Ala1 5 10 15 Arg Leu Asn Ala
Val Ala Asp Ile Ile Pro Gly Phe Pro His Thr Asp 20 25 30 Phe Asp
Thr Ala Asn Gly Ala Glu Ala Leu Ala Glu Ala Asp Ile Leu 35 40 45
Leu Ala Ala Trp Gly Thr Pro Ser Leu Thr Arg Glu Arg Leu Ser Arg 50
55 60 Ala Pro Arg Leu Lys Met Leu Ala Tyr Ala Ala Ser Ser Val Arg
Met65 70 75 80 Val Ala Pro Ala Glu Phe Trp Glu Thr Ser Asp Ile Leu
Val Thr Thr 85 90 95 Ala Ala Ser Ala Met Ala Val Pro Val Ala Glu
Phe Thr Tyr Ala Ala 100 105 110 Ile Ile Met Cys Gly Lys Asp Val Phe
Arg Leu Arg Asp Glu His Arg 115 120 125 Thr Glu Arg Gly Thr Gly Val
Phe Gly Ser Arg Arg Gly Arg Ser Leu 130 135 140 Pro Tyr Leu Gly Asn
His Ala Arg Lys Val Gly Ile Val Gly Ala Ser145 150 155 160 Arg Ile
Gly Arg Leu Val Met Glu Met Leu Ala Arg Gly Thr Phe Glu 165 170 175
Ile Ala Val Tyr Asp Pro Phe Leu Ser Ala Glu Glu Ala Ala Ser Leu 180
185 190 Gly Ala Lys Lys Ala Glu Leu Asp Glu Leu Leu Ala Trp Ser Asp
Val 195 200 205 Val Ser Leu His Ala Pro Ile Leu Pro Glu Thr His His
Met Ile Gly 210 215 220 Ala Arg Glu Leu Ala Leu Met Ala Asp His Ala
Ile Phe Ile Asn Thr225 230 235 240 Ala Arg Gly Trp Leu Val Asp His
Asp Ala Leu Leu Thr Glu Ala Ile 245 250 255 Ser Gly Arg Leu Arg Ile
Leu Ile Asp Thr Pro Glu Pro Glu Pro Leu 260 265 270 Pro Thr Asp Ser
Pro Phe Tyr Asp Leu Pro Asn Val Val Leu Thr Pro 275 280 285 His Ile
Ala Gly Ala Leu Gly Asn Glu Leu Arg Ala Leu Ser Asp Leu 290 295 300
Ala Ile Thr Glu Ile Glu Arg Phe Val Ala Gly Leu Ala Pro Leu His305
310 315 320 Pro Val His Lys Gln Asp Met Glu Arg Met Ala 325 330
91750DNAAgrobacterium tumefaciens C58 91atgcagcgtt ttaccaacag
aaccatcgtt gtcgccgggg ccggccggga tatcggccgg 60gcatgcgcca tccgtttcgc
acaggaaggc gccaatgtcg ttcttaccta taatggcgcg 120gcagagggcg
cggccacagc cgttgccgaa atcgaaaagc ttggtcgttc ggctctggcg
180atcaaggcgg atctcacaaa cgccgccgaa gtcgaggctg ccatatctgc
ggctgcggac 240aagtttgggg agatccacgg cctcgtccat gttgccggcg
gcctgatcgc ccgcaagaca 300atcgcagaaa tggatgaagc cttctggcat
caggtcctcg acgtcaatct gacatcgctg 360ttcctgacgg ccaagaccgc
attgccgaag atggccaagg gcggcgcgat cgtcactttc 420tcgtcgcagg
ccggccgtga tggcggcggc ccgggcgctc ttgcctatgc cacttccaag
480ggtgccgtga tgaccttcac ccgcggactt gccaaagaag tcggccccaa
aatccgcgtc 540aacgccgttt gccccggtat gatctccacc accttccacg
ataccttcac caagccggag 600gtgcgcgaac gggtggccgg cgcgacgtcg
ctcaagcgcg aagggtcgag cgaagacgtc 660gccggtctgg tggccttcct
cgcgtctgac gatgccgctt atgtcaccgg cgcctgctac 720gacatcaatg
gcggcgtcct gttttcctga 75092249PRTAgrobacterium tumefaciens C58
92Met Gln Arg Phe Thr Asn Arg Thr Ile Val Val Ala Gly Ala Gly Arg1
5 10 15 Asp Ile Gly Arg Ala Cys Ala Ile Arg Phe Ala Gln Glu Gly Ala
Asn 20 25 30 Val Val Leu Thr Tyr Asn Gly Ala Ala Glu Gly Ala Ala
Thr Ala Val 35 40 45 Ala Glu Ile Glu Lys Leu Gly Arg Ser Ala Leu
Ala Ile Lys Ala Asp 50 55 60 Leu Thr Asn Ala Ala Glu Val Glu Ala
Ala Ile Ser Ala Ala Ala Asp65 70 75 80 Lys Phe Gly Glu Ile His Gly
Leu Val His Val Ala Gly Gly Leu Ile 85 90 95 Ala Arg Lys Thr Ile
Ala Glu Met Asp Glu Ala Phe Trp His Gln Val 100 105 110 Leu Asp Val
Asn Leu Thr Ser Leu Phe Leu Thr Ala Lys Thr Ala Leu 115 120 125 Pro
Lys Met Ala Lys Gly Gly Ala Ile Val Thr Phe Ser Ser Gln Ala 130 135
140 Gly Arg Asp Gly Gly Gly Pro Gly Ala Leu Ala Tyr Ala Thr Ser
Lys145 150 155 160 Gly Ala Val Met Thr Phe Thr Arg Gly Leu Ala Lys
Glu Val Gly Pro 165 170 175 Lys Ile Arg Val Asn Ala Val Cys Pro Gly
Met Ile Ser Thr Thr Phe 180 185 190 His Asp Thr Phe Thr Lys Pro Glu
Val Arg Glu Arg Val Ala Gly Ala 195 200 205 Thr Ser Leu Lys Arg Glu
Gly Ser Ser Glu Asp Val Ala Gly Leu Val 210 215 220 Ala Phe Leu Ala
Ser Asp Asp Ala Ala Tyr Val Thr Gly Ala Cys Tyr225 230 235 240 Asp
Ile Asn Gly Gly Val Leu Phe Ser 245 93930DNAEscherichia coli DH10B
93atgtccaaaa agattgccgt gattggcgaa tgcatgattg agctttccga gaaaggcgcg
60gacgttaagc gcggtttcgg cggcgatacc ctgaacactt ccgtctatat cgcccgtcag
120gtcgatcctg cggcattaac cgttcattac gtaacggcgc tgggaacgga
cagttttagc 180cagcagatgc tggacgcctg gcacggcgag aacgttgata
cttccctgac ccaacggatg 240gaaaaccgtc tgccgggcct ttactacatt
gaaaccgaca gcaccggcga gcgtacgttc 300tactactggc ggaacgaagc
cgccgccaaa ttctggctgg agagtgagca gtctgcggcg 360atttgcgaag
agctggcgaa tttcgattat ctctacctga gcgggattag cctggcgatc
420ttaagcccga ccagccgcga aaagctgctt tccctgctgc gcgaatgccg
cgccaacggc 480ggaaaagtga ttttcgacaa taactatcgt ccgcgcctgt
gggccagcaa agaagagaca 540cagcaggtgt accaacaaat gctggaatgc
acggatatcg ccttcctgac gctggacgac 600gaagacgcgc tgtggggtca
acagccggtg gaagacgtca ttgcgcgcac ccataacgcg 660ggcgtgaaag
aagtggtggt gaaacgcggg gcggattctt gcctggtgtc cattgctggc
720gaagggttag tggatgttcc ggcggtgaaa ctgccgaaag aaaaagtgat
cgataccacc 780gcagctggcg actctttcag tgccggttat ctggcggtac
gtctgacagg cggcagcgcg 840gaagacgcgg cgaaacgtgg gcacctgacc
gcaagtaccg ttattcagta tcgcggcgcg 900attatcccgc gtgaggcgat
gccagcgtaa 93094309PRTEscherichia coli DH10B 94Met Ser Lys Lys Ile
Ala Val Ile Gly Glu Cys Met Ile Glu Leu Ser1 5 10 15 Glu Lys Gly
Ala Asp Val Lys Arg Gly Phe Gly Gly Asp Thr Leu Asn 20 25 30 Thr
Ser Val Tyr Ile Ala Arg Gln Val Asp Pro Ala Ala Leu Thr Val 35 40
45 His Tyr Val Thr Ala Leu Gly Thr Asp Ser Phe Ser Gln Gln Met Leu
50 55 60 Asp Ala Trp His Gly Glu Asn Val Asp Thr Ser Leu Thr Gln
Arg Met65 70 75 80 Glu Asn Arg Leu Pro Gly Leu Tyr Tyr Ile Glu Thr
Asp Ser Thr Gly 85 90 95 Glu Arg Thr Phe Tyr Tyr Trp Arg Asn Glu
Ala Ala Ala Lys Phe Trp 100 105 110 Leu Glu Ser Glu Gln Ser Ala Ala
Ile Cys Glu Glu Leu Ala Asn Phe 115 120 125 Asp Tyr Leu Tyr Leu Ser
Gly Ile Ser Leu Ala Ile Leu Ser Pro Thr 130 135 140 Ser Arg Glu Lys
Leu Leu Ser Leu Leu Arg Glu Cys Arg Ala Asn Gly145 150 155 160 Gly
Lys Val Ile Phe Asp Asn Asn Tyr Arg Pro Arg Leu Trp Ala Ser 165 170
175 Lys Glu Glu Thr Gln Gln Val Tyr Gln Gln Met Leu Glu Cys Thr Asp
180 185 190 Ile Ala Phe Leu Thr Leu Asp Asp Glu Asp Ala Leu Trp Gly
Gln Gln 195 200 205 Pro Val Glu Asp Val Ile Ala Arg Thr His Asn Ala
Gly Val Lys Glu 210 215 220 Val Val Val Lys Arg Gly Ala Asp Ser Cys
Leu Val Ser Ile Ala Gly225 230 235 240 Glu Gly Leu Val Asp Val Pro
Ala Val Lys Leu Pro Lys Glu Lys Val 245 250 255 Ile Asp Thr Thr Ala
Ala Gly Asp Ser Phe Ser Ala Gly Tyr Leu Ala 260 265 270 Val Arg Leu
Thr Gly Gly Ser Ala Glu Asp Ala Ala Lys Arg Gly His 275 280 285 Leu
Thr Ala Ser Thr Val Ile Gln Tyr Arg Gly Ala Ile Ile Pro Arg 290 295
300 Glu Ala Met Pro Ala305 95642DNAEscherichia coli DH10B
95atgaaaaact ggaaaacaag tgcagaatca atcctgacca ccggcccggt tgtaccggtt
60atcgtggtaa aaaaactgga acacgcggtg ccgatggcaa aagcgttggt tgctggtggg
120gtgcgcgttc tggaagtgac tctgcgtacc gagtgtgcag ttgacgctat
ccgtgctatc 180gccaaagaag tgcctgaagc gattgtgggt gccggtacgg
tgctgaatcc acagcagctg 240gcagaagtca ctgaagcggg tgcacagttc
gcaattagcc cgggtctgac cgagccgctg 300ctgaaagctg ctaccgaagg
gactattcct ctgattccgg ggatcagcac tgtttccgaa 360ctgatgctgg
gtatggacta cggtttgaaa gagttcaaat tcttcccggc tgaagctaac
420ggcggcgtga aagccctgca ggcgatcgcg ggtccgttct cccaggtccg
tttctgcccg 480acgggtggta tttctccggc taactaccgt gactacctgg
cgctgaaaag cgtgctgtgc 540atcggtggtt cctggctggt tccggcagat
gcgctggaag cgggcgatta cgaccgcatt 600actaagctgg cgcgtgaagc
tgtagaaggc gctaagctgt aa 64296213PRTEscherichia coli DH10B 96Met
Lys Asn Trp Lys Thr Ser Ala Glu Ser Ile Leu Thr Thr Gly Pro1 5 10
15 Val Val Pro Val Ile Val Val Lys Lys Leu Glu His Ala Val Pro Met
20 25 30 Ala Lys Ala Leu Val Ala Gly Gly Val Arg Val Leu Glu Val
Thr Leu 35 40 45 Arg Thr Glu Cys Ala Val Asp Ala Ile Arg Ala Ile
Ala Lys Glu Val 50 55 60 Pro Glu Ala Ile Val Gly Ala Gly Thr Val
Leu Asn Pro Gln Gln Leu65 70 75 80 Ala Glu Val Thr Glu Ala Gly Ala
Gln Phe Ala Ile Ser Pro Gly Leu 85 90 95 Thr
Glu Pro Leu Leu Lys Ala Ala Thr Glu Gly Thr Ile Pro Leu Ile 100 105
110 Pro Gly Ile Ser Thr Val Ser Glu Leu Met Leu Gly Met Asp Tyr Gly
115 120 125 Leu Lys Glu Phe Lys Phe Phe Pro Ala Glu Ala Asn Gly Gly
Val Lys 130 135 140 Ala Leu Gln Ala Ile Ala Gly Pro Phe Ser Gln Val
Arg Phe Cys Pro145 150 155 160 Thr Gly Gly Ile Ser Pro Ala Asn Tyr
Arg Asp Tyr Leu Ala Leu Lys 165 170 175 Ser Val Leu Cys Ile Gly Gly
Ser Trp Leu Val Pro Ala Asp Ala Leu 180 185 190 Glu Ala Gly Asp Tyr
Asp Arg Ile Thr Lys Leu Ala Arg Glu Ala Val 195 200 205 Glu Gly Ala
Lys Leu 210 97780DNALactobacillus brevis ATCC 367 97atggcatcaa
atggaaaagt agcaatggtt accggtggcg gacaaggaat tggtgaagcc 60atctcgaaac
ggttagctaa cgacggcttt gctgtggcaa ttgctgattt gaacttggac
120aatgccaaca aggtcgtttc tgatattgaa gctgctggtg gcaaggccat
tgcggtcaag 180accgatgtct ctgatcgtga tagcgtgttt gctgcggtta
atgaagcggc cgacaagctg 240ggcggctttg acgttatcgt taataacgcc
ggccttggcc caaccacgcc aattgacacc 300atcacccaag aacagtttga
tacggtttat cacgttaacg tgggtggggt tctttggggc 360attcaagcag
cccatgcgaa gttcaaggaa ttgggtcatg gtgggaagat catttccgcg
420acgtctcaag ccggggttgt tggtaacccg aacttagctc tgtacagtgg
aactaagttt 480gccattcgtg gtgtgaccca agttgcggcg cgtgacttag
ccgctgaagg tatcacggtc 540aatgcttatg cacccgggat tgttaagaca
ccaatgatgt ttgacatcgc tcacaaggtt 600ggtcaaaatg ctggtaaaga
cgacgaatgg gggatgcaaa ccttctcaaa ggacatcgct 660ttatgtcgat
tgtcagaacc agaagatgtg gctaacgggg tggctttctt agccggtccc
720gattctaact acattacggg tcaaacactt gaagttgatg gtgggatgca
gttccactaa 78098259PRTLactobacillus brevis ATCC 367 98Met Ala Ser
Asn Gly Lys Val Ala Met Val Thr Gly Gly Gly Gln Gly1 5 10 15 Ile
Gly Glu Ala Ile Ser Lys Arg Leu Ala Asn Asp Gly Phe Ala Val 20 25
30 Ala Ile Ala Asp Leu Asn Leu Asp Asn Ala Asn Lys Val Val Ser Asp
35 40 45 Ile Glu Ala Ala Gly Gly Lys Ala Ile Ala Val Lys Thr Asp
Val Ser 50 55 60 Asp Arg Asp Ser Val Phe Ala Ala Val Asn Glu Ala
Ala Asp Lys Leu65 70 75 80 Gly Gly Phe Asp Val Ile Val Asn Asn Ala
Gly Leu Gly Pro Thr Thr 85 90 95 Pro Ile Asp Thr Ile Thr Gln Glu
Gln Phe Asp Thr Val Tyr His Val 100 105 110 Asn Val Gly Gly Val Leu
Trp Gly Ile Gln Ala Ala His Ala Lys Phe 115 120 125 Lys Glu Leu Gly
His Gly Gly Lys Ile Ile Ser Ala Thr Ser Gln Ala 130 135 140 Gly Val
Val Gly Asn Pro Asn Leu Ala Leu Tyr Ser Gly Thr Lys Phe145 150 155
160 Ala Ile Arg Gly Val Thr Gln Val Ala Ala Arg Asp Leu Ala Ala Glu
165 170 175 Gly Ile Thr Val Asn Ala Tyr Ala Pro Gly Ile Val Lys Thr
Pro Met 180 185 190 Met Phe Asp Ile Ala His Lys Val Gly Gln Asn Ala
Gly Lys Asp Asp 195 200 205 Glu Trp Gly Met Gln Thr Phe Ser Lys Asp
Ile Ala Leu Cys Arg Leu 210 215 220 Ser Glu Pro Glu Asp Val Ala Asn
Gly Val Ala Phe Leu Ala Gly Pro225 230 235 240 Asp Ser Asn Tyr Ile
Thr Gly Gln Thr Leu Glu Val Asp Gly Gly Met 245 250 255 Gln Phe
His991089DNAPseudomonas putida KT2440 99atgaatgacc tgagccacac
ccacatgcgc gcggccgtct ggcatggccg ccacgatatt 60cgtgtcgaac aggtaccttt
gccggccgac cctgcgccgg gctgggtgca gatcaaggtg 120gactggtgcg
gcatctgcgg ctccgacctg cacgaatatg ttgccggccc ggtgttcatc
180ccggtagagg ccccgcaccc gctgaccggc attcagggcc agtgcatcct
cggccacgaa 240ttctgcggcc acatcgccaa gcttggcgaa ggcgtggaag
gctatgccgt aggcgacccg 300gtggcggcag acgcgtgcca gcattgtggt
acctgctatt actgcaccca tggcctgtac 360aacatctgcg aacgcctggc
gttcaccggc ctgatgaaca acggtgcctt cgccgagctg 420gtcaacgtgc
ccgccaacct gctctaccgg ctgccgcagg gcttccctgc cgaagccggg
480gcactgatcg agccgctggc ggtgggtatg cacgcggtga aaaaggccgg
cagcctgctt 540gggcaaaccg ttgtagtggt tggggccggc accatcggcc
tgtgcaccat catgtgcgcc 600aaggctgcag gtgcggcaca ggtcatcgcc
cttgagatgt cctctgcgcg caaagccaag 660gccaaggaag cgggcgccaa
cgtggtgctg gaccccagcc agtgcgatgc cctggcggaa 720atccgcgcac
tgactgctgg gctgggcgcc gatgtgagtt ttgagtgcat cggcaacaaa
780catacggcca agctggccat cgacaccatc cgcaaagcag gcaagtgcgt
gctggtgggt 840attttcgaag agcccagcga gttcaacttc ttcgagctgg
tgtccaccga gaagcaagtg 900ctgggggcgt tggcgtacaa cggcgagttt
gctgacgtga ttgccttcat tgctgatggt 960cggctggata ttcgcccgct
ggtaaccggc cggatcggat tggagcagat tgtcgagctg 1020ggcttcgagg
aactggtgaa caacaaagag gagaacgtga agatcatcgt ttcaccaggt
1080gtgcgctga 1089100362PRTPseudomonas putida KT2440 100Met Asn Asp
Leu Ser His Thr His Met Arg Ala Ala Val Trp His Gly1 5 10 15 Arg
His Asp Ile Arg Val Glu Gln Val Pro Leu Pro Ala Asp Pro Ala 20 25
30 Pro Gly Trp Val Gln Ile Lys Val Asp Trp Cys Gly Ile Cys Gly Ser
35 40 45 Asp Leu His Glu Tyr Val Ala Gly Pro Val Phe Ile Pro Val
Glu Ala 50 55 60 Pro His Pro Leu Thr Gly Ile Gln Gly Gln Cys Ile
Leu Gly His Glu65 70 75 80 Phe Cys Gly His Ile Ala Lys Leu Gly Glu
Gly Val Glu Gly Tyr Ala 85 90 95 Val Gly Asp Pro Val Ala Ala Asp
Ala Cys Gln His Cys Gly Thr Cys 100 105 110 Tyr Tyr Cys Thr His Gly
Leu Tyr Asn Ile Cys Glu Arg Leu Ala Phe 115 120 125 Thr Gly Leu Met
Asn Asn Gly Ala Phe Ala Glu Leu Val Asn Val Pro 130 135 140 Ala Asn
Leu Leu Tyr Arg Leu Pro Gln Gly Phe Pro Ala Glu Ala Gly145 150 155
160 Ala Leu Ile Glu Pro Leu Ala Val Gly Met His Ala Val Lys Lys Ala
165 170 175 Gly Ser Leu Leu Gly Gln Thr Val Val Val Val Gly Ala Gly
Thr Ile 180 185 190 Gly Leu Cys Thr Ile Met Cys Ala Lys Ala Ala Gly
Ala Ala Gln Val 195 200 205 Ile Ala Leu Glu Met Ser Ser Ala Arg Lys
Ala Lys Ala Lys Glu Ala 210 215 220 Gly Ala Asn Val Val Leu Asp Pro
Ser Gln Cys Asp Ala Leu Ala Glu225 230 235 240 Ile Arg Ala Leu Thr
Ala Gly Leu Gly Ala Asp Val Ser Phe Glu Cys 245 250 255 Ile Gly Asn
Lys His Thr Ala Lys Leu Ala Ile Asp Thr Ile Arg Lys 260 265 270 Ala
Gly Lys Cys Val Leu Val Gly Ile Phe Glu Glu Pro Ser Glu Phe 275 280
285 Asn Phe Phe Glu Leu Val Ser Thr Glu Lys Gln Val Leu Gly Ala Leu
290 295 300 Ala Tyr Asn Gly Glu Phe Ala Asp Val Ile Ala Phe Ile Ala
Asp Gly305 310 315 320 Arg Leu Asp Ile Arg Pro Leu Val Thr Gly Arg
Ile Gly Leu Glu Gln 325 330 335 Ile Val Glu Leu Gly Phe Glu Glu Leu
Val Asn Asn Lys Glu Glu Asn 340 345 350 Val Lys Ile Ile Val Ser Pro
Gly Val Arg 355 360 101771DNAKlebsiella pneumoniae MGH78578
101atgaaaaaag tcgcacttgt taccggcgcc ggccagggga ttggtaaagc
tatcgccctt 60cgtctggtga aggatggatt tgccgtggcc attgccgatt ataacgacgc
caccgccaaa 120gcggtcgcct cggaaatcaa ccaggccggc ggacacgccg
tggcggtgaa agtggatgtc 180tccgaccgcg atcaggtatt tgccgccgtt
gaacaggcgc gcaaaacgct gggcggcttc 240gacgtcatcg tcaataacgc
cggtgtggca ccgtctacgc cgatcgagtc cattaccccg 300gagattgtcg
acaaagtcta caacatcaac gtcaaagggg tgatctgggg tattcaggcg
360gcggtcgagg cctttaagaa agaggggcac ggcgggaaaa tcatcaacgc
ctgttcccag 420gccggccacg tcggcaaccc ggagctggcg gtgtatagct
ccagtaaatt cgcggtacgc 480ggcttaaccc agaccgccgc tcgcgacctc
gcgccgctgg gcatcacggt caacggctac 540tgcccgggga ttgtcaaaac
gccaatgtgg gccgaaattg accgccaggt gtccgaagcc 600gccggtaaac
cgctgggcta cggtaccgcc gagttcgcca aacgcatcac tctcggtcgt
660ctgtccgagc cggaagatgt cgccgcctgc gtctcctatc ttgccagccc
ggattctgat 720tacatgaccg gtcagtcgtt gctgatcgac ggcgggatgg
tatttaacta a 771102256PRTKlebsiella pneumoniae MGH78578 102Met Lys
Lys Val Ala Leu Val Thr Gly Ala Gly Gln Gly Ile Gly Lys1 5 10 15
Ala Ile Ala Leu Arg Leu Val Lys Asp Gly Phe Ala Val Ala Ile Ala 20
25 30 Asp Tyr Asn Asp Ala Thr Ala Lys Ala Val Ala Ser Glu Ile Asn
Gln 35 40 45 Ala Gly Gly His Ala Val Ala Val Lys Val Asp Val Ser
Asp Arg Asp 50 55 60 Gln Val Phe Ala Ala Val Glu Gln Ala Arg Lys
Thr Leu Gly Gly Phe65 70 75 80 Asp Val Ile Val Asn Asn Ala Gly Val
Ala Pro Ser Thr Pro Ile Glu 85 90 95 Ser Ile Thr Pro Glu Ile Val
Asp Lys Val Tyr Asn Ile Asn Val Lys 100 105 110 Gly Val Ile Trp Gly
Ile Gln Ala Ala Val Glu Ala Phe Lys Lys Glu 115 120 125 Gly His Gly
Gly Lys Ile Ile Asn Ala Cys Ser Gln Ala Gly His Val 130 135 140 Gly
Asn Pro Glu Leu Ala Val Tyr Ser Ser Ser Lys Phe Ala Val Arg145 150
155 160 Gly Leu Thr Gln Thr Ala Ala Arg Asp Leu Ala Pro Leu Gly Ile
Thr 165 170 175 Val Asn Gly Tyr Cys Pro Gly Ile Val Lys Thr Pro Met
Trp Ala Glu 180 185 190 Ile Asp Arg Gln Val Ser Glu Ala Ala Gly Lys
Pro Leu Gly Tyr Gly 195 200 205 Thr Ala Glu Phe Ala Lys Arg Ile Thr
Leu Gly Arg Leu Ser Glu Pro 210 215 220 Glu Asp Val Ala Ala Cys Val
Ser Tyr Leu Ala Ser Pro Asp Ser Asp225 230 235 240 Tyr Met Thr Gly
Gln Ser Leu Leu Ile Asp Gly Gly Met Val Phe Asn 245 250 255
1031665DNAKlebsiella pneumoniae MGH78578 103atgagatcga aaagatttga
agcactggcg aaacgccctg tgaatcagga tggtttcgtt 60aaggagtgga ttgaagaggg
ctttatcgcg atggaaagcc ctaacgatcc caaaccttct 120atccgcatcg
tcaacggcgc ggtgaccgaa ctcgacgata aaccggttga gcagttcgac
180ctgattgacc actttatcgc gcgctacggc attaatctcg cccgggccga
agaagtgatg 240gccatggatt cggttaagct cgccaacatg ctctgcgacc
cgaacgttaa acgcagcgac 300atcgtgccgc tcactaccgc gatgaccccg
gcgaaaatcg tggaagtggt gtcgcatatg 360aacgtggtcg agatgatgat
ggcgatgcaa aaaatgcgcg cccgccgcac gccgtcccag 420caggcgcatg
tcactaatat caaagataat ccggtacaga ttgccgccga cgccgctgaa
480ggcgcatggc gcggctttga cgagcaggag accaccgtcg ccgtggcgcg
ctacgcgccg 540ttcaacgcca tcgccctgct ggtcggttca caggttggcc
gccccggcgt cctcacccag 600tgttcgctgg aagaagccac cgagctgaaa
ctgggcatgc tgggccacac ctgctatgcc 660gaaaccattt cggtatacgg
tacggaaccg gtgtttaccg atggcgatga caccccgtgg 720tcgaaaggct
tcctcgcctc ctcctacgcc tcgcgcggcc tgaaaatgcg ctttacctcc
780ggttccggct cggaggtgca gatgggctat gccgaaggca aatcgatgct
ttatctcgaa 840gcgcgctgca tctacatcac caaagccgcc ggggtgcaag
gcctgcagaa tggctccgtc 900agctgtatcg gcgtgccgtc cgccgtgccg
tccgggatcc gcgccgtact ggcggaaaac 960ctgatctgct cagcgctgga
tctggagtgc gcctccagca acgatcaaac ctttacccac 1020tcggatatgc
ggcgtaccgc gcgtctgctg atgcagttcc tgccaggtac cgactttatc
1080tcctccggtt actcggcggt gccgaactac gacaacatgt tcgccggttc
caacgaagat 1140gccgaagact tcgatgacta caacgtgatc cagcgcgacc
tgaaggtcga tggcggcctg 1200cggccggtgc gtgaagagga cgtgatcgcc
attcgcaaca aagccgcccg cgcgctgcag 1260gcggtatttg ccggcatggg
tttgccgcct attacggatg aagaagtaga agccgccacc 1320tacgcccacg
gttcaaaaga tatgcctgag cgcaatatcg tcgaggacat caagtttgct
1380caggagatca tcaacaagaa ccgcaacggc ctggaggtgg tgaaagccct
ggcgaaaggc 1440ggcttccccg atgtcgccca ggacatgctc aatattcaga
aagccaagct caccggcgac 1500tacctgcata cctccgccat cattgttggc
gagggccagg tgctctcggc cgtgaatgac 1560gtgaacgatt atgccggtcc
ggcaacaggc taccgcctgc aaggcgagcg ctgggaagag 1620attaaaaata
tcccgggcgc gctcgatccc aatgaacttg gctaa 1665104554PRTKlebsiella
pneumoniae MGH78578 104Met Arg Ser Lys Arg Phe Glu Ala Leu Ala Lys
Arg Pro Val Asn Gln1 5 10 15 Asp Gly Phe Val Lys Glu Trp Ile Glu
Glu Gly Phe Ile Ala Met Glu 20 25 30 Ser Pro Asn Asp Pro Lys Pro
Ser Ile Arg Ile Val Asn Gly Ala Val 35 40 45 Thr Glu Leu Asp Asp
Lys Pro Val Glu Gln Phe Asp Leu Ile Asp His 50 55 60 Phe Ile Ala
Arg Tyr Gly Ile Asn Leu Ala Arg Ala Glu Glu Val Met65 70 75 80 Ala
Met Asp Ser Val Lys Leu Ala Asn Met Leu Cys Asp Pro Asn Val 85 90
95 Lys Arg Ser Asp Ile Val Pro Leu Thr Thr Ala Met Thr Pro Ala Lys
100 105 110 Ile Val Glu Val Val Ser His Met Asn Val Val Glu Met Met
Met Ala 115 120 125 Met Gln Lys Met Arg Ala Arg Arg Thr Pro Ser Gln
Gln Ala His Val 130 135 140 Thr Asn Ile Lys Asp Asn Pro Val Gln Ile
Ala Ala Asp Ala Ala Glu145 150 155 160 Gly Ala Trp Arg Gly Phe Asp
Glu Gln Glu Thr Thr Val Ala Val Ala 165 170 175 Arg Tyr Ala Pro Phe
Asn Ala Ile Ala Leu Leu Val Gly Ser Gln Val 180 185 190 Gly Arg Pro
Gly Val Leu Thr Gln Cys Ser Leu Glu Glu Ala Thr Glu 195 200 205 Leu
Lys Leu Gly Met Leu Gly His Thr Cys Tyr Ala Glu Thr Ile Ser 210 215
220 Val Tyr Gly Thr Glu Pro Val Phe Thr Asp Gly Asp Asp Thr Pro
Trp225 230 235 240 Ser Lys Gly Phe Leu Ala Ser Ser Tyr Ala Ser Arg
Gly Leu Lys Met 245 250 255 Arg Phe Thr Ser Gly Ser Gly Ser Glu Val
Gln Met Gly Tyr Ala Glu 260 265 270 Gly Lys Ser Met Leu Tyr Leu Glu
Ala Arg Cys Ile Tyr Ile Thr Lys 275 280 285 Ala Ala Gly Val Gln Gly
Leu Gln Asn Gly Ser Val Ser Cys Ile Gly 290 295 300 Val Pro Ser Ala
Val Pro Ser Gly Ile Arg Ala Val Leu Ala Glu Asn305 310 315 320 Leu
Ile Cys Ser Ala Leu Asp Leu Glu Cys Ala Ser Ser Asn Asp Gln 325 330
335 Thr Phe Thr His Ser Asp Met Arg Arg Thr Ala Arg Leu Leu Met Gln
340 345 350 Phe Leu Pro Gly Thr Asp Phe Ile Ser Ser Gly Tyr Ser Ala
Val Pro 355 360 365 Asn Tyr Asp Asn Met Phe Ala Gly Ser Asn Glu Asp
Ala Glu Asp Phe 370 375 380 Asp Asp Tyr Asn Val Ile Gln Arg Asp Leu
Lys Val Asp Gly Gly Leu385 390 395 400 Arg Pro Val Arg Glu Glu Asp
Val Ile Ala Ile Arg Asn Lys Ala Ala 405 410 415 Arg Ala Leu Gln Ala
Val Phe Ala Gly Met Gly Leu Pro Pro Ile Thr 420 425 430 Asp Glu Glu
Val Glu Ala Ala Thr Tyr Ala His Gly Ser Lys Asp Met 435 440 445 Pro
Glu Arg Asn Ile Val Glu Asp Ile Lys Phe Ala Gln Glu Ile Ile 450 455
460 Asn Lys Asn Arg Asn Gly Leu Glu Val Val Lys Ala Leu Ala Lys
Gly465 470 475 480 Gly Phe Pro Asp Val Ala Gln Asp Met Leu Asn Ile
Gln Lys Ala Lys 485 490 495 Leu Thr Gly Asp Tyr Leu His Thr Ser Ala
Ile Ile Val Gly Glu Gly 500 505 510 Gln Val Leu Ser Ala Val Asn Asp
Val Asn Asp Tyr Ala Gly Pro Ala 515 520 525 Thr Gly Tyr Arg Leu Gln
Gly Glu Arg Trp Glu Glu Ile Lys Asn Ile 530 535 540 Pro Gly Ala Leu
Asp Pro Asn Glu Leu Gly545 550 105690DNAKlebsiella pneumoniae
MGH78578 105atggaaatta acgaaacgct gctgcgccag attatcgaag aggtgctgtc
ggagatgaaa 60tcaggcgcag ataagccggt ctcctttagc gcgcctgcgg cttctgtcgc
ctctgccgcg 120ccggtcgccg ttgcgcctgt gtccggcgac agcttcctga
cggaaatcgg cgaagccaaa 180cccggcacgc agcaggatga agtcattatt
gccgtcgggc cagcgtttgg tctggcgcaa 240accgccaata tcgtcggcat
tccgcataaa aatattctgc gcgaagtgat cgccggcatt 300gaggaagaag
gcatcaaagc ccgggtgatc cgctgcttta agtcttctga cgtcgccttc
360gtggcagtgg aaggcaaccg cctgagcggc tccggcatct cgatcggtat
tcagtcgaaa 420ggcaccaccg tcatccacca gcgcggcctg ccgccgcttt
ccaatctgga actcttcccg 480caggcgccgc tgctgacgct ggaaacctac
cgtcagattg gcaaaaacgc cgcgcgctac 540gccaaacgcg agtcgccgca
gccggtgccg acgcttaacg atcagatggc tcgtcccaaa 600taccaggcga
agtcggccat tttgcacatt aaagagacca aatacgtggt gacgggcaaa
660aacccgcagg aactgcgcgt ggcgctttaa 690106229PRTKlebsiella
pneumoniae MGH78578 106Met Glu Ile Asn Glu Thr Leu Leu Arg Gln Ile
Ile Glu Glu Val Leu1 5 10 15 Ser Glu Met Lys Ser Gly Ala Asp Lys
Pro Val Ser Phe Ser Ala Pro 20 25 30 Ala Ala Ser Val Ala Ser Ala
Ala Pro Val Ala Val Ala Pro Val Ser 35 40 45 Gly Asp Ser Phe Leu
Thr Glu Ile Gly Glu Ala Lys Pro Gly Thr Gln 50 55 60 Gln Asp Glu
Val Ile Ile Ala Val Gly Pro Ala Phe Gly Leu Ala Gln65 70 75 80 Thr
Ala Asn Ile Val Gly Ile Pro His Lys Asn Ile Leu Arg Glu Val 85 90
95 Ile Ala Gly Ile Glu Glu Glu Gly Ile Lys Ala Arg Val Ile Arg Cys
100 105 110 Phe Lys Ser Ser Asp Val Ala Phe Val Ala Val Glu Gly Asn
Arg Leu 115 120 125 Ser Gly Ser Gly Ile Ser Ile Gly Ile Gln Ser Lys
Gly Thr Thr Val 130 135 140 Ile His Gln Arg Gly Leu Pro Pro Leu Ser
Asn Leu Glu Leu Phe Pro145 150 155 160 Gln Ala Pro Leu Leu Thr Leu
Glu Thr Tyr Arg Gln Ile Gly Lys Asn 165 170 175 Ala Ala Arg Tyr Ala
Lys Arg Glu Ser Pro Gln Pro Val Pro Thr Leu 180 185 190 Asn Asp Gln
Met Ala Arg Pro Lys Tyr Gln Ala Lys Ser Ala Ile Leu 195 200 205 His
Ile Lys Glu Thr Lys Tyr Val Val Thr Gly Lys Asn Pro Gln Glu 210 215
220 Leu Arg Val Ala Leu225 107525DNAKlebsiella pneumoniae MGH78578
107atgaataccg acgcaattga atccatggta cgcgacgtgc tgagccggat
gaacagccta 60caggacggga taacgcccgc gccagccgcg ccgacaaacg acaccgttcg
ccagccaaaa 120gttagcgact acccgttagc gacccgccat ccggagtggg
tcaaaaccgc taccaataaa 180acgctcgatg acctgacgct ggagaacgta
ttaagcgatc gcgttacggc gcaggacatg 240cgcatcactc cggaaacgct
gcgtatgcag gcggcgatcg cccaggatgc cggacgcgat 300cggctggcga
tgaactttga gcgggccgca gagctcaccg cggttcccga cgaccgaatc
360cttgagatct acaacgccct gcgcccatac cgttccaccc aggcggagct
actggcgatc 420gctgatgacc tcgagcatcg ctaccaggca cgactctgtg
ccgcctttgt tcgggaagcg 480gccgggctgt acatcgagcg taagaagctg
aaaggcgacg attaa 525108174PRTKlebsiella pneumoniae MGH78578 108Met
Asn Thr Asp Ala Ile Glu Ser Met Val Arg Asp Val Leu Ser Arg1 5 10
15 Met Asn Ser Leu Gln Asp Gly Ile Thr Pro Ala Pro Ala Ala Pro Thr
20 25 30 Asn Asp Thr Val Arg Gln Pro Lys Val Ser Asp Tyr Pro Leu
Ala Thr 35 40 45 Arg His Pro Glu Trp Val Lys Thr Ala Thr Asn Lys
Thr Leu Asp Asp 50 55 60 Leu Thr Leu Glu Asn Val Leu Ser Asp Arg
Val Thr Ala Gln Asp Met65 70 75 80 Arg Ile Thr Pro Glu Thr Leu Arg
Met Gln Ala Ala Ile Ala Gln Asp 85 90 95 Ala Gly Arg Asp Arg Leu
Ala Met Asn Phe Glu Arg Ala Ala Glu Leu 100 105 110 Thr Ala Val Pro
Asp Asp Arg Ile Leu Glu Ile Tyr Asn Ala Leu Arg 115 120 125 Pro Tyr
Arg Ser Thr Gln Ala Glu Leu Leu Ala Ile Ala Asp Asp Leu 130 135 140
Glu His Arg Tyr Gln Ala Arg Leu Cys Ala Ala Phe Val Arg Glu Ala145
150 155 160 Ala Gly Leu Tyr Ile Glu Arg Lys Lys Leu Lys Gly Asp Asp
165 170 109789DNAPseudomonas putida KT2440 109atgacagtca attatgattt
ttccggaaaa gtcgtgctgg ttaccggcgc tggctctggt 60attggccgtg ccactgcgct
tgccttcgcg cagtcgggcg catccgttgc ggtcgcagac 120atctcgactg
accacggttt gaaaaccgta gagttggtca aagccgaagg aggcgaggcg
180accttcttcc atgtcgatgt aggctctgaa cccagcgtcc agtcgatgct
ggctggtgtc 240gtggcgcatt acggcggcct ggacattgcg cacaacaacg
ccggcattga ggccaatatc 300gtgccgctgg ccgagctgga ctccgacaac
tggcgtcgtg tcatcgatgt gaacctttcc 360tcggtgttct attgcctgaa
aggtgaaatc cctctgatgc tgaaaagggg cggcggcgcc 420attgtgaata
ccgcatcggc ctccgggctg attggcggct atcgcctttc cgggtatacc
480gccacgaagc acggcgtagt ggggctgact aaggctgctg ctatcgatta
tgcaaaccag 540aatatccgga ttaatgccgt gtgccctggt ccagttgact
ccccattcct ggctgacatg 600ccgcaaccca tgcgcgatcg acttctcttt
ggcactccaa ttggacgatt ggccaccgca 660gaggagatcg cgcgttcggt
tctgtggctg tgttctgacg atgcaaaata cgtggtgggc 720cattcgatgt
cagtcgacgg tggcgtggca gtgactgcgg ttggtactcg aatggatgat 780ctcttttaa
789110262PRTPseudomonas putida KT2440 110Met Thr Val Asn Tyr Asp
Phe Ser Gly Lys Val Val Leu Val Thr Gly1 5 10 15 Ala Gly Ser Gly
Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala Gln Ser 20 25 30 Gly Ala
Ser Val Ala Val Ala Asp Ile Ser Thr Asp His Gly Leu Lys 35 40 45
Thr Val Glu Leu Val Lys Ala Glu Gly Gly Glu Ala Thr Phe Phe His 50
55 60 Val Asp Val Gly Ser Glu Pro Ser Val Gln Ser Met Leu Ala Gly
Val65 70 75 80 Val Ala His Tyr Gly Gly Leu Asp Ile Ala His Asn Asn
Ala Gly Ile 85 90 95 Glu Ala Asn Ile Val Pro Leu Ala Glu Leu Asp
Ser Asp Asn Trp Arg 100 105 110 Arg Val Ile Asp Val Asn Leu Ser Ser
Val Phe Tyr Cys Leu Lys Gly 115 120 125 Glu Ile Pro Leu Met Leu Lys
Arg Gly Gly Gly Ala Ile Val Asn Thr 130 135 140 Ala Ser Ala Ser Gly
Leu Ile Gly Gly Tyr Arg Leu Ser Gly Tyr Thr145 150 155 160 Ala Thr
Lys His Gly Val Val Gly Leu Thr Lys Ala Ala Ala Ile Asp 165 170 175
Tyr Ala Asn Gln Asn Ile Arg Ile Asn Ala Val Cys Pro Gly Pro Val 180
185 190 Asp Ser Pro Phe Leu Ala Asp Met Pro Gln Pro Met Arg Asp Arg
Leu 195 200 205 Leu Phe Gly Thr Pro Ile Gly Arg Leu Ala Thr Ala Glu
Glu Ile Ala 210 215 220 Arg Ser Val Leu Trp Leu Cys Ser Asp Asp Ala
Lys Tyr Val Val Gly225 230 235 240 His Ser Met Ser Val Asp Gly Gly
Val Ala Val Thr Ala Val Gly Thr 245 250 255 Arg Met Asp Asp Leu Phe
260 111762DNAPseudomonas putida KT2440 111atgagcatga ccttttctgg
ccaggtagcc ctggtgaccg gcgcgggtgc cggcatcggc 60cgggcaaccg ccctggcgtt
cgcccacgag ggcatgaaag tggtggtggc ggacctcgac 120ccggtcggcg
gcgaggccac cgtggcgcag atccacgcgg caggcggcga agcgctgttc
180attgcctgcg acgtgacccg cgacgccgag gtgcgccagt tgcatgagcg
cctgatggcc 240gcctacggcc ggctggacta cgccttcaac aacgccggga
tcgagatcga gcaacaccgc 300ctggccgaag gcagcgaagc ggagttcgat
gccatcatgg gcgtgaacgt gaagggcgtg 360tggttgtgca tgaagtatca
gttgcccttg ttgctggccc aaggcggtgg ggccatcgtc 420aataccgcgt
cggtggcggg gctaggggcg gcgccaaaga tgagcatcta cagcgccagc
480aagcatgcgg tcatcggtct gaccaagtcg gcggccatcg agtacgccaa
gaagggcatc 540cgcgtgaacg ccgtgtgccc ggccgtgatc gacaccgaca
tgttccgccg cgcttaccag 600gccgacccgc gcaaggccga gttcgccgca
gccatgcacc cggtagggcg cattggcaag 660gtcgaggaaa tcgccagcgc
cgtgctgtat ctgtgcagtg acggcgcggc gtttaccacc 720gggcattgcc
tgacggtgga tggtggggct acggcgatct ga 762112253PRTPseudomonas putida
KT2440 112Met Ser Met Thr Phe Ser Gly Gln Val Ala Leu Val Thr Gly
Ala Gly1 5 10 15 Ala Gly Ile Gly Arg Ala Thr Ala Leu Ala Phe Ala
His Glu Gly Met 20 25 30 Lys Val Val Val Ala Asp Leu Asp Pro Val
Gly Gly Glu Ala Thr Val 35 40 45 Ala Gln Ile His Ala Ala Gly Gly
Glu Ala Leu Phe Ile Ala Cys Asp 50 55 60 Val Thr Arg Asp Ala Glu
Val Arg Gln Leu His Glu Arg Leu Met Ala65 70 75 80 Ala Tyr Gly Arg
Leu Asp Tyr Ala Phe Asn Asn Ala Gly Ile Glu Ile 85 90 95 Glu Gln
His Arg Leu Ala Glu Gly Ser Glu Ala Glu Phe Asp Ala Ile 100 105 110
Met Gly Val Asn Val Lys Gly Val Trp Leu Cys Met Lys Tyr Gln Leu 115
120 125 Pro Leu Leu Leu Ala Gln Gly Gly Gly Ala Ile Val Asn Thr Ala
Ser 130 135 140 Val Ala Gly Leu Gly Ala Ala Pro Lys Met Ser Ile Tyr
Ser Ala Ser145 150 155 160 Lys His Ala Val Ile Gly Leu Thr Lys Ser
Ala Ala Ile Glu Tyr Ala 165 170 175 Lys Lys Gly Ile Arg Val Asn Ala
Val Cys Pro Ala Val Ile Asp Thr 180 185 190 Asp Met Phe Arg Arg Ala
Tyr Gln Ala Asp Pro Arg Lys Ala Glu Phe 195 200 205 Ala Ala Ala Met
His Pro Val Gly Arg Ile Gly Lys Val Glu Glu Ile 210 215 220 Ala Ser
Ala Val Leu Tyr Leu Cys Ser Asp Gly Ala Ala Phe Thr Thr225 230 235
240 Gly His Cys Leu Thr Val Asp Gly Gly Ala Thr Ala Ile 245 250
113810DNAPseudomonas putida KT2440 113atgtcttttc aaaacaaaat
cgttgtgctc acaggcgcag cttctggcat cggcaaagcg 60acagcacagc tgctagtgga
gcagggcgcc catgtggttg ccatggatct taaaagcgac 120ttgcttcaac
aagcattcgg cagtgaggag cacgttctgt gcatccctac cgacgtcagc
180gatagcgaag ccgtgcgagc cgccttccag gcagtggacg cgaaatttgg
ccgtgtcgac 240gtgattatta acgccgcggg catcaacgca cctacgcgag
aagccaacca gaaaatggtt 300gatgccaacg tcgctgccct cgatgccatg
aagagcgggc gggcgcccac tttcgacttc 360ctggccgata cctcggatca
ggatttccgg cgcgtaatgg aagtcaattt gttcagccag 420ttttactgca
ttcgagaggg tgttccgctg atgcgccgag cgggtggcgg cagcatcgtc
480aacatctcca gcgtggcagc gctcctgggc gtggcaatgc cactttacta
ccccgcctcc 540aaggcggcgg tgctgggcct cacccgtgca gcggcagctg
agttggcacc ttacaacatt 600cgtgtgaatg ccatcgctcc aggctctgtc
gacacaccat tgatgcatga gcaaccaccg 660gaagtcgttc agttcctggt
cagcatgcaa cccatcaagc ggctggccca acccgaggag 720cttgcccaaa
gcatcctgtt ccttgccggt gagcattcgt ccttcatcac cggacagacg
780ctttctccca acggcgggat gcacatgtaa 810114269PRTPseudomonas putida
KT2440 114Met Ser Phe Gln Asn Lys Ile Val Val Leu Thr Gly Ala Ala
Ser Gly1 5 10 15 Ile Gly Lys Ala Thr Ala Gln Leu Leu Val Glu Gln
Gly Ala His Val 20 25 30 Val Ala Met Asp Leu Lys Ser Asp Leu Leu
Gln Gln Ala Phe Gly Ser 35 40 45 Glu Glu His Val Leu Cys Ile Pro
Thr Asp Val Ser Asp Ser Glu Ala 50 55 60 Val Arg Ala Ala Phe Gln
Ala Val Asp Ala Lys Phe Gly Arg Val Asp65 70 75 80 Val Ile Ile Asn
Ala Ala Gly Ile Asn Ala Pro Thr Arg Glu Ala Asn 85 90 95 Gln Lys
Met Val Asp Ala Asn Val Ala Ala Leu Asp Ala Met Lys Ser 100 105 110
Gly Arg Ala Pro Thr Phe Asp Phe Leu Ala Asp Thr Ser Asp Gln Asp 115
120 125 Phe Arg Arg Val Met Glu Val Asn Leu Phe Ser Gln Phe Tyr Cys
Ile 130 135 140 Arg Glu Gly Val Pro Leu Met Arg Arg Ala Gly Gly Gly
Ser Ile Val145 150 155 160 Asn Ile Ser Ser Val Ala Ala Leu Leu Gly
Val Ala Met Pro Leu Tyr 165 170 175 Tyr Pro Ala Ser Lys Ala Ala Val
Leu Gly Leu Thr Arg Ala Ala Ala 180 185 190 Ala Glu Leu Ala Pro Tyr
Asn Ile Arg Val Asn Ala Ile Ala Pro Gly 195 200 205 Ser Val Asp Thr
Pro Leu Met His Glu Gln Pro Pro Glu Val Val Gln 210 215 220 Phe Leu
Val Ser Met Gln Pro Ile Lys Arg Leu Ala Gln Pro Glu Glu225 230 235
240 Leu Ala Gln Ser Ile Leu Phe Leu Ala Gly Glu His Ser Ser Phe Ile
245 250 255 Thr Gly Gln Thr Leu Ser Pro Asn Gly Gly Met His Met 260
265 115771DNAPseudomonas putida KT2440 115atgacccttg aaggcaaaac
tgcactcgtc accggttcca ccagcggcat tggcctgggc 60atcgcccagg tattggcccg
ggctggcgcc aacatcgtgc tcaacggctt tggtgacccg 120ggccccgcca
tggcggaaat tgcccggcac ggggtgaagg ttgtgcacca cccggccgac
180ctgtcggatg tggtccagat cgaggctttg ttcaacctgg ccgaacgcga
gttcggcggc 240gtcgacatcc tggtcaacaa cgccggtatc cagcatgtgg
caccggttga gcagttcccg 300ccagaaagct gggacaagat catcgccctg
aacctgtcgg ccgtattcca tggcacgcgc 360ctggcgctgc cgggcatgcg
cacgcgcaac tgggggcgca tcatcaatat cgcttcggtg 420catggcctgg
tcggctcgat tggcaaggca gcctacgtgg cagccaagca tggcgtgatc
480ggcctgacca aggtggtcgg cctggaaacc gccaccagtc atgtcacctg
caatgccata 540tgcccgggct gggtgctgac accgctggtg caaaagcaga
tcgacgatcg tgcggccaag 600ggtggcgatc ggctgcaagc gcagcacgat
ctgctggcag aaaagcaacc gtcgctggct 660ttcgtcaccc ccgaacacct
cggtgagctg gtactctttc tgtgcagcga ggccggtagc 720caggttcgcg
gcgccgcctg gaacgtcgat ggtggctggt tggcccagtg a
771116256PRTPseudomonas putida KT2440 116Met Thr Leu Glu Gly Lys
Thr Ala Leu Val Thr Gly Ser Thr Ser Gly1 5 10 15 Ile Gly Leu Gly
Ile Ala Gln Val Leu Ala Arg Ala Gly Ala Asn Ile 20 25 30 Val Leu
Asn Gly Phe Gly Asp Pro Gly Pro Ala Met Ala Glu Ile Ala 35 40 45
Arg His Gly Val Lys Val Val His His Pro Ala Asp Leu Ser Asp Val 50
55 60 Val Gln Ile Glu Ala Leu Phe Asn Leu Ala Glu Arg Glu Phe Gly
Gly65 70 75 80 Val Asp Ile Leu Val Asn Asn Ala Gly Ile Gln His Val
Ala Pro Val 85 90 95 Glu Gln Phe Pro Pro Glu Ser Trp Asp Lys Ile
Ile Ala Leu Asn Leu 100 105 110 Ser Ala Val Phe His Gly Thr Arg Leu
Ala Leu Pro Gly Met Arg Thr 115 120 125 Arg Asn Trp Gly Arg Ile Ile
Asn Ile Ala Ser Val His Gly Leu Val 130 135 140 Gly Ser Ile Gly Lys
Ala Ala Tyr Val Ala Ala Lys His Gly Val Ile145 150 155 160 Gly Leu
Thr Lys Val Val Gly Leu Glu Thr Ala Thr Ser His Val Thr 165 170 175
Cys Asn Ala Ile Cys Pro Gly Trp Val Leu Thr Pro Leu Val Gln Lys 180
185 190 Gln Ile Asp Asp Arg Ala Ala Lys Gly Gly Asp Arg Leu Gln Ala
Gln 195 200 205 His Asp Leu Leu Ala Glu Lys Gln Pro Ser Leu Ala Phe
Val Thr Pro 210 215 220 Glu His Leu Gly Glu Leu Val Leu Phe Leu Cys
Ser Glu Ala Gly Ser225 230 235 240 Gln Val Arg Gly Ala Ala Trp Asn
Val Asp Gly Gly Trp Leu Ala Gln 245 250 255 117750DNAPseudomonas
putida KT2440 117atgtccaagc aacttacact cgaaggcaaa gtggccctgg
ttcagggcgg ttcccgaggc 60attggcgcag ctatcgtaag gcgcctggcc cgcgaaggcg
cgcaagtggc cttcacctat 120gtcagctctg ccggcccggc tgaagaactg
gctcgggaaa ttaccgagaa cggcggcaaa 180gccttggccc tgcgggctga
cagcgctgat gccgcggccg tgcagctggc ggttgatgac 240accgagaaag
ccttgggccg gctggatatc ctggtcaaca acgccggtgt gctggcagtg
300gccccagtga cagagttcga cctggccgac ttcgatcata tgctggccgt
gaacgtacgc 360agcgtgttcg tcgccagcca ggccgcggca cgctatatgg
gccagggcgg tcgtatcatc 420aacattggca gcaccaacgc cgagcgcatg
ccgtttgccg gtggtgcacc gtacgccatg 480agcaagtcgg cactggttgg
tctgacccgc ggcatggcac gcgacctcgg gccgcagggc 540attaccgtga
acaacgtgca gcccggcccg gtggacaccg acatgaaccc ggccagtggc
600gagtttgccg agagcctgat tccgctgatg gccattgggc gatatggcga
gccggaggag 660attgccagct tcgtggctta cctggcaggg cctgaagccg
ggtatatcac cggggccagc 720ctgactgtag atggtgggtt tgcagcctga
750118249PRTPseudomonas putida KT2440 118Met Ser Lys Gln Leu Thr
Leu Glu Gly Lys Val Ala Leu Val Gln Gly1 5 10 15 Gly Ser Arg Gly
Ile Gly Ala Ala Ile Val Arg Arg Leu Ala Arg Glu
20 25 30 Gly Ala Gln Val Ala Phe Thr Tyr Val Ser Ser Ala Gly Pro
Ala Glu 35 40 45 Glu Leu Ala Arg Glu Ile Thr Glu Asn Gly Gly Lys
Ala Leu Ala Leu 50 55 60 Arg Ala Asp Ser Ala Asp Ala Ala Ala Val
Gln Leu Ala Val Asp Asp65 70 75 80 Thr Glu Lys Ala Leu Gly Arg Leu
Asp Ile Leu Val Asn Asn Ala Gly 85 90 95 Val Leu Ala Val Ala Pro
Val Thr Glu Phe Asp Leu Ala Asp Phe Asp 100 105 110 His Met Leu Ala
Val Asn Val Arg Ser Val Phe Val Ala Ser Gln Ala 115 120 125 Ala Ala
Arg Tyr Met Gly Gln Gly Gly Arg Ile Ile Asn Ile Gly Ser 130 135 140
Thr Asn Ala Glu Arg Met Pro Phe Ala Gly Gly Ala Pro Tyr Ala Met145
150 155 160 Ser Lys Ser Ala Leu Val Gly Leu Thr Arg Gly Met Ala Arg
Asp Leu 165 170 175 Gly Pro Gln Gly Ile Thr Val Asn Asn Val Gln Pro
Gly Pro Val Asp 180 185 190 Thr Asp Met Asn Pro Ala Ser Gly Glu Phe
Ala Glu Ser Leu Ile Pro 195 200 205 Leu Met Ala Ile Gly Arg Tyr Gly
Glu Pro Glu Glu Ile Ala Ser Phe 210 215 220 Val Ala Tyr Leu Ala Gly
Pro Glu Ala Gly Tyr Ile Thr Gly Ala Ser225 230 235 240 Leu Thr Val
Asp Gly Gly Phe Ala Ala 245 119858DNAPseudomonas putida KT2440
119atgagcgact accctacccc tccattccca tcccaaccgc aaagcgttcc
cggttcccag 60cgcaagatgg atccgtatcc ggactgcggt gagcagagct acaccggcaa
caatcgcctc 120gcaggcaaga tcgccttgat aaccggtgct gacagcggca
tcgggcgtgc ggtggcgatt 180gcctatgccc gagaaggcgc tgacgttgcc
attgcctatc tgaatgaaca cgacgatgcg 240caggaaaccg cgcgctgggt
caaagcggct ggccgccagt gcctgctgct gcccggcgac 300ctggcacaga
aacagcactg ccacgacatc gtcgacaaga ccgtggcgca gtttggtcgc
360atcgatatcc tggtcaacaa cgccgcgttc cagatggccc atgaaagcct
ggacgacatt 420gatgacgatg aatgggtgaa gaccttcgat accaacatca
ccgccatttt ccgcatttgc 480cagcgcgctt tgccctcgat gccaaagggc
ggttcgatca tcaacaccag ttcggtcaac 540tctgacgacc cgtcacccag
cctgttggcc tatgccgcga ccaaaggggc tattgccaat 600ttcactgcag
gccttgcgca actgctgggc aagcagggca ttcgcgtcaa cagcgtcgca
660cccggcccga tctggacccc gctgatcccg gccaccatgc ctgatgaggc
ggtgagaaac 720ttcggttccg gttacccgat gggacggccg ggtcaacctg
tggaggtggc gccaatctat 780gtcttgctgg ggtccgatga agccagctac
atctcgggtt cgcgttacgc cgtgacggga 840ggcaaaccta ttctgtga
858120285PRTPseudomonas putida KT2440 120Met Ser Asp Tyr Pro Thr
Pro Pro Phe Pro Ser Gln Pro Gln Ser Val1 5 10 15 Pro Gly Ser Gln
Arg Lys Met Asp Pro Tyr Pro Asp Cys Gly Glu Gln 20 25 30 Ser Tyr
Thr Gly Asn Asn Arg Leu Ala Gly Lys Ile Ala Leu Ile Thr 35 40 45
Gly Ala Asp Ser Gly Ile Gly Arg Ala Val Ala Ile Ala Tyr Ala Arg 50
55 60 Glu Gly Ala Asp Val Ala Ile Ala Tyr Leu Asn Glu His Asp Asp
Ala65 70 75 80 Gln Glu Thr Ala Arg Trp Val Lys Ala Ala Gly Arg Gln
Cys Leu Leu 85 90 95 Leu Pro Gly Asp Leu Ala Gln Lys Gln His Cys
His Asp Ile Val Asp 100 105 110 Lys Thr Val Ala Gln Phe Gly Arg Ile
Asp Ile Leu Val Asn Asn Ala 115 120 125 Ala Phe Gln Met Ala His Glu
Ser Leu Asp Asp Ile Asp Asp Asp Glu 130 135 140 Trp Val Lys Thr Phe
Asp Thr Asn Ile Thr Ala Ile Phe Arg Ile Cys145 150 155 160 Gln Arg
Ala Leu Pro Ser Met Pro Lys Gly Gly Ser Ile Ile Asn Thr 165 170 175
Ser Ser Val Asn Ser Asp Asp Pro Ser Pro Ser Leu Leu Ala Tyr Ala 180
185 190 Ala Thr Lys Gly Ala Ile Ala Asn Phe Thr Ala Gly Leu Ala Gln
Leu 195 200 205 Leu Gly Lys Gln Gly Ile Arg Val Asn Ser Val Ala Pro
Gly Pro Ile 210 215 220 Trp Thr Pro Leu Ile Pro Ala Thr Met Pro Asp
Glu Ala Val Arg Asn225 230 235 240 Phe Gly Ser Gly Tyr Pro Met Gly
Arg Pro Gly Gln Pro Val Glu Val 245 250 255 Ala Pro Ile Tyr Val Leu
Leu Gly Ser Asp Glu Ala Ser Tyr Ile Ser 260 265 270 Gly Ser Arg Tyr
Ala Val Thr Gly Gly Lys Pro Ile Leu 275 280 285
121774DNAPseudomonas putida KT2440 121atgatcgaaa tcagcggcag
caccccgggc cacaatggcc gggtagcctt ggtcacgggc 60gccgcccgcg gcatcggtct
gggcattgcc gcatggctga tctgcgaagg ctggcaagtg 120gtgctgagtg
atctggaccg ccagcgtggt accaaagtgg ccaaggcgtt gggcgacaac
180gcctggttca tcaccatgga cgttgccgac gaggcccagg tcagtgccgg
cgtgtccgaa 240gtgctcgggc agttcggccg gctggacgcg ctggtgtgca
atgcggccat tgccaacccg 300cacaaccaga cgctggaaag cctgagcctg
gcacaatgga accgggtgct gggggtcaac 360ctcagcggcc ccatgctgct
ggccaagcat tgtgcgccgt acctgcgtgc gcacaatggg 420gcgatcgtca
acctgacctc tacccgtgct cggcagtccg aacccgacac cgaggcttac
480gcggcaagca agggcggcct ggtggctttg acccatgccc tggccatgag
cctgggcccg 540gagattcgcg tcaatgcggt gagcccgggc tggatcgatg
cccgtgatcc gtcgcagcgc 600cgtgccgagc cgttgagcga agctgaccat
gcccagcatc caacgggcag ggtagggacc 660gtggaagatg tcgcggccat
ggttgcctgg ttgctgtcac gccaggcggc atttgtcacc 720ggccaggagt
ttgtggtcga tggcggcatg acccgcaaga tgatctatac ctga
774122257PRTPseudomonas putida KT2440 122Met Ile Glu Ile Ser Gly
Ser Thr Pro Gly His Asn Gly Arg Val Ala1 5 10 15 Leu Val Thr Gly
Ala Ala Arg Gly Ile Gly Leu Gly Ile Ala Ala Trp 20 25 30 Leu Ile
Cys Glu Gly Trp Gln Val Val Leu Ser Asp Leu Asp Arg Gln 35 40 45
Arg Gly Thr Lys Val Ala Lys Ala Leu Gly Asp Asn Ala Trp Phe Ile 50
55 60 Thr Met Asp Val Ala Asp Glu Ala Gln Val Ser Ala Gly Val Ser
Glu65 70 75 80 Val Leu Gly Gln Phe Gly Arg Leu Asp Ala Leu Val Cys
Asn Ala Ala 85 90 95 Ile Ala Asn Pro His Asn Gln Thr Leu Glu Ser
Leu Ser Leu Ala Gln 100 105 110 Trp Asn Arg Val Leu Gly Val Asn Leu
Ser Gly Pro Met Leu Leu Ala 115 120 125 Lys His Cys Ala Pro Tyr Leu
Arg Ala His Asn Gly Ala Ile Val Asn 130 135 140 Leu Thr Ser Thr Arg
Ala Arg Gln Ser Glu Pro Asp Thr Glu Ala Tyr145 150 155 160 Ala Ala
Ser Lys Gly Gly Leu Val Ala Leu Thr His Ala Leu Ala Met 165 170 175
Ser Leu Gly Pro Glu Ile Arg Val Asn Ala Val Ser Pro Gly Trp Ile 180
185 190 Asp Ala Arg Asp Pro Ser Gln Arg Arg Ala Glu Pro Leu Ser Glu
Ala 195 200 205 Asp His Ala Gln His Pro Thr Gly Arg Val Gly Thr Val
Glu Asp Val 210 215 220 Ala Ala Met Val Ala Trp Leu Leu Ser Arg Gln
Ala Ala Phe Val Thr225 230 235 240 Gly Gln Glu Phe Val Val Asp Gly
Gly Met Thr Arg Lys Met Ile Tyr 245 250 255 Thr123741DNAPseudomonas
putida KT2440 123atgagcctgc aaggtaaagt tgcactggtt accggcgcca
gccgtggcat tggccaggcc 60atcgccctcg agctgggccg ccagggcgcg accgtgatcg
gtaccgccac gtcggcgtcc 120ggtgccgagc gcatcgctgc caccctgaaa
gaacacggca ttaccggcac tggcatggag 180ctgaacgtga ccagcgccga
atcggttgaa gccgtactgg ccgccattgg cgagcagttc 240ggcgcgccgg
ccatcttggt caacaatgcc ggtatcaccc gcgacaacct catgctgcgc
300atgaaagacg acgagtggtt tgatgtcatc gacaccaacc tgaacagcct
ctaccgtctg 360tccaagggcg tgctgcgtgg catgaccaag gcgcgttggg
gtcgtatcat cagcatcggc 420tcggtcgttg gtgccatggg taacgcaggt
caggccaact acgcggctgc caaggccggt 480ctggaaggtt tcagccgcgc
cctggcgcgt gaagtgggtt cgcgtggtat caccgtcaac 540tcggtgaccc
caggcttcat cgataccgac atgacccgcg agctgccaga agctcagcgc
600gaagccctgc agacccagat tccgctgggc cgcctgggcc aggctgacga
aattgccaag 660gtggtttcgt tcctggcatc cgacggcgcc gcctacgtga
ccggcgctac cgtgccggtc 720aacggcggga tgtacatgta a
741124246PRTPseudomonas putida KT2440 124Met Ser Leu Gln Gly Lys
Val Ala Leu Val Thr Gly Ala Ser Arg Gly1 5 10 15 Ile Gly Gln Ala
Ile Ala Leu Glu Leu Gly Arg Gln Gly Ala Thr Val 20 25 30 Ile Gly
Thr Ala Thr Ser Ala Ser Gly Ala Glu Arg Ile Ala Ala Thr 35 40 45
Leu Lys Glu His Gly Ile Thr Gly Thr Gly Met Glu Leu Asn Val Thr 50
55 60 Ser Ala Glu Ser Val Glu Ala Val Leu Ala Ala Ile Gly Glu Gln
Phe65 70 75 80 Gly Ala Pro Ala Ile Leu Val Asn Asn Ala Gly Ile Thr
Arg Asp Asn 85 90 95 Leu Met Leu Arg Met Lys Asp Asp Glu Trp Phe
Asp Val Ile Asp Thr 100 105 110 Asn Leu Asn Ser Leu Tyr Arg Leu Ser
Lys Gly Val Leu Arg Gly Met 115 120 125 Thr Lys Ala Arg Trp Gly Arg
Ile Ile Ser Ile Gly Ser Val Val Gly 130 135 140 Ala Met Gly Asn Ala
Gly Gln Ala Asn Tyr Ala Ala Ala Lys Ala Gly145 150 155 160 Leu Glu
Gly Phe Ser Arg Ala Leu Ala Arg Glu Val Gly Ser Arg Gly 165 170 175
Ile Thr Val Asn Ser Val Thr Pro Gly Phe Ile Asp Thr Asp Met Thr 180
185 190 Arg Glu Leu Pro Glu Ala Gln Arg Glu Ala Leu Gln Thr Gln Ile
Pro 195 200 205 Leu Gly Arg Leu Gly Gln Ala Asp Glu Ile Ala Lys Val
Val Ser Phe 210 215 220 Leu Ala Ser Asp Gly Ala Ala Tyr Val Thr Gly
Ala Thr Val Pro Val225 230 235 240 Asn Gly Gly Met Tyr Met 245
125738DNAPseudomonas putida KT2440 125atgactcaga aaatagctgt
cgtgaccggc ggcagtcgcg gcattggcaa gtccatcgtg 60ctggccctgg ccggcgcggg
ttatcaggtt gccttcagtt atgtccgtga cgaggcgtca 120gccgctgcct
tgcaggcgca ggtcgaaggg ctcggccggg actgcctggc cgtgcagtgt
180gatgtcaagg aagcgccgag cattcaggcg ttttttgaac gggtcgagca
acgtttcgag 240cgtatcgact tgttggtcaa caacgccggt attacccgtg
acggtttgct cgccacgcaa 300tcgttgaacg acatcaccga ggtcatccag
accaacctgg tcggcacgtt gttgtgctgt 360cagcaggtgc tgccctgcat
gatgcgccaa cgcagcgggt gcatcgtcaa cctcagttcg 420gtggccgcgc
aaaagcccgg caagggccag agcaactacg ccgccgccaa aggcggtgta
480gaagcattga cacgcgcact ggcggtggag ttggcgccgc gcaacatccg
ggtcaacgcg 540gtggcgcccg gcatcgtcag caccgacatg agccaagccc
tggtcggcgc ccatgagcag 600gaaatccagt cgcggctgtt gatcaaacgg
ttcgcccggc ctgaagaaat tgccgacgcg 660gtgctgtatc tggccgagcg
cggcctgtac atcacgggcg aagtcctgtc cgtcaacggc 720ggattgaaaa tgccatga
738126245PRTPseudomonas putida KT2440 126Met Thr Gln Lys Ile Ala
Val Val Thr Gly Gly Ser Arg Gly Ile Gly1 5 10 15 Lys Ser Ile Val
Leu Ala Leu Ala Gly Ala Gly Tyr Gln Val Ala Phe 20 25 30 Ser Tyr
Val Arg Asp Glu Ala Ser Ala Ala Ala Leu Gln Ala Gln Val 35 40 45
Glu Gly Leu Gly Arg Asp Cys Leu Ala Val Gln Cys Asp Val Lys Glu 50
55 60 Ala Pro Ser Ile Gln Ala Phe Phe Glu Arg Val Glu Gln Arg Phe
Glu65 70 75 80 Arg Ile Asp Leu Leu Val Asn Asn Ala Gly Ile Thr Arg
Asp Gly Leu 85 90 95 Leu Ala Thr Gln Ser Leu Asn Asp Ile Thr Glu
Val Ile Gln Thr Asn 100 105 110 Leu Val Gly Thr Leu Leu Cys Cys Gln
Gln Val Leu Pro Cys Met Met 115 120 125 Arg Gln Arg Ser Gly Cys Ile
Val Asn Leu Ser Ser Val Ala Ala Gln 130 135 140 Lys Pro Gly Lys Gly
Gln Ser Asn Tyr Ala Ala Ala Lys Gly Gly Val145 150 155 160 Glu Ala
Leu Thr Arg Ala Leu Ala Val Glu Leu Ala Pro Arg Asn Ile 165 170 175
Arg Val Asn Ala Val Ala Pro Gly Ile Val Ser Thr Asp Met Ser Gln 180
185 190 Ala Leu Val Gly Ala His Glu Gln Glu Ile Gln Ser Arg Leu Leu
Ile 195 200 205 Lys Arg Phe Ala Arg Pro Glu Glu Ile Ala Asp Ala Val
Leu Tyr Leu 210 215 220 Ala Glu Arg Gly Leu Tyr Ile Thr Gly Glu Val
Leu Ser Val Asn Gly225 230 235 240 Gly Leu Lys Met Pro 245
127768DNAPseudomonas putida KT2440 127atgtccaaga cccacctgtt
cgacctcgac ggcaagattg cctttgtttc cggcgccagc 60cgtggcatcg gcgaggccat
cgcccacttg ctcgcgcagc aaggggccca tgtgatcgtt 120tccagccgca
agcttgacgg gtgccagcag gtggccgacg ccatcattgc cgccggcggc
180aaggccacgg ctgtggcctg ccacattggt gagctggaac agattcagca
ggtgttcgcc 240ggcattcgcg aacagttcgg gcgactggac gtgctggtca
acaatgcagc caccaacccg 300caattctgca atgtgctgga caccgaccca
ggggcgttcc agaagaccgt ggacgtgaac 360atccgtggtt acttcttcat
gtcggtggag gctggcaagc tgatgcgcga gaacggcggc 420ggcagcatca
tcaacgtggc gtcgatcaac ggtgtttcac ccgggctgtt ccaaggcatc
480tactcggtga ccaaggcggc ggtcatcaac atgaccaagg tgttcgccaa
agagtgtgca 540cccttcggta ttcgctgcaa cgcgctactg ccggggctga
ccgataccaa gttcgcttcg 600gcattggtga agaacgaagc catcctcaac
gccgccttgc agcagatccc cctcaaacgc 660gtggccgacc ccaaggaaat
ggcgggtgcg gtgctgtacc tggccagcga tgcctccagc 720tacaccaccg
gcaccacgct caatgtcgac ggtggcttcc tgtcctga 768128255PRTPseudomonas
putida KT2440 128Met Ser Lys Thr His Leu Phe Asp Leu Asp Gly Lys
Ile Ala Phe Val1 5 10 15 Ser Gly Ala Ser Arg Gly Ile Gly Glu Ala
Ile Ala His Leu Leu Ala 20 25 30 Gln Gln Gly Ala His Val Ile Val
Ser Ser Arg Lys Leu Asp Gly Cys 35 40 45 Gln Gln Val Ala Asp Ala
Ile Ile Ala Ala Gly Gly Lys Ala Thr Ala 50 55 60 Val Ala Cys His
Ile Gly Glu Leu Glu Gln Ile Gln Gln Val Phe Ala65 70 75 80 Gly Ile
Arg Glu Gln Phe Gly Arg Leu Asp Val Leu Val Asn Asn Ala 85 90 95
Ala Thr Asn Pro Gln Phe Cys Asn Val Leu Asp Thr Asp Pro Gly Ala 100
105 110 Phe Gln Lys Thr Val Asp Val Asn Ile Arg Gly Tyr Phe Phe Met
Ser 115 120 125 Val Glu Ala Gly Lys Leu Met Arg Glu Asn Gly Gly Gly
Ser Ile Ile 130 135 140 Asn Val Ala Ser Ile Asn Gly Val Ser Pro Gly
Leu Phe Gln Gly Ile145 150 155 160 Tyr Ser Val Thr Lys Ala Ala Val
Ile Asn Met Thr Lys Val Phe Ala 165 170 175 Lys Glu Cys Ala Pro Phe
Gly Ile Arg Cys Asn Ala Leu Leu Pro Gly 180 185 190 Leu Thr Asp Thr
Lys Phe Ala Ser Ala Leu Val Lys Asn Glu Ala Ile 195 200 205 Leu Asn
Ala Ala Leu Gln Gln Ile Pro Leu Lys Arg Val Ala Asp Pro 210 215 220
Lys Glu Met Ala Gly Ala Val Leu Tyr Leu Ala Ser Asp Ala Ser Ser225
230 235 240 Tyr Thr Thr Gly Thr Thr Leu Asn Val Asp Gly Gly Phe Leu
Ser 245 250 255 129762DNAPseudomonas fluorescens Pf-5 129atgagcatga
cgttttccgg ccaggtggcc ctagtgaccg gcgcagccaa tggtatcggc 60cgcgccaccg
cccaggcatt tgccgcacaa ggcttgaagg tggtggtggc ggacctggac
120acggcggggg gcgagggcac cgtggcgctg atccgcgagg ccggtggcga
ggcattgttc 180gtgccgtgca acgttaccct ggaggcggat gtgcaaagcc
tcatggcccg caccatcgaa 240gcctatgggc gcctggatta cgccttcaac
aatgccggta tcgagatcga aaagggccgc 300cttgcggagg gctccatgga
tgagttcgac gccatcatgg gggtcaacgt caaaggggtc 360tggctgtgca
tgaagtacca gttgccgctg ctgctggccc agggcggtgg ggcgatcgtc
420aacaccgcct cggtggcggg cctgggcgcg gcgccgaaga tgagcatcta
tgcggcctcc 480aagcatgcgg tgatcggcct gaccaagtcg gcggccatcg
aatatgcgaa gaagaaaatc 540cgcgtgaacg cggtatgccc ggcggtgatc
gacaccgaca tgttccgccg tgcctacgag 600gcggacccga agaaggccga
gttcgccgcg gccatgcacc cggtggggcg catcggcaag 660gtcgaggaga
tcgccagtgc ggtgctctac ctgtgcagcg atggcgcggc ctttaccacc
720ggccatgcac tggcggtcga cggcggggcc accgcgatct ga
762130253PRTPseudomonas fluorescens Pf-5 130Met Ser Met Thr Phe Ser
Gly Gln Val Ala Leu Val Thr Gly Ala Ala1 5 10 15 Asn Gly Ile Gly
Arg Ala Thr Ala Gln Ala Phe Ala Ala Gln Gly Leu 20 25 30 Lys Val
Val Val Ala Asp Leu Asp Thr Ala Gly Gly Glu Gly Thr Val 35 40 45
Ala Leu Ile Arg Glu Ala Gly Gly Glu Ala Leu Phe Val Pro Cys Asn 50
55 60 Val Thr Leu Glu Ala Asp Val Gln Ser Leu Met Ala Arg Thr Ile
Glu65 70 75 80 Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly
Ile Glu Ile 85 90 95 Glu Lys Gly Arg Leu Ala Glu Gly Ser Met Asp
Glu Phe Asp Ala Ile 100 105 110 Met Gly Val Asn Val Lys Gly Val Trp
Leu Cys Met Lys Tyr Gln Leu 115 120 125 Pro Leu Leu Leu Ala Gln Gly
Gly Gly Ala Ile Val Asn Thr Ala Ser 130 135 140 Val Ala Gly Leu Gly
Ala Ala Pro Lys Met Ser Ile Tyr Ala Ala Ser145 150 155 160 Lys His
Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr Ala 165 170 175
Lys Lys Lys Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr 180
185 190 Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro Lys Lys Ala Glu
Phe 195 200 205 Ala Ala Ala Met His Pro Val Gly Arg Ile Gly Lys Val
Glu Glu Ile 210 215 220 Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly
Ala Ala Phe Thr Thr225 230 235 240 Gly His Ala Leu Ala Val Asp Gly
Gly Ala Thr Ala Ile 245 250 131735DNAKlebsiella pneumoniae subsp.
pneumoniae MGH78578 131atgaaacttg ccagtaaaac cgccattgtc accggcgccg
cacgcggtat cggctttggc 60attgcccagg tgcttgcgcg ggaaggcgcg cgagtgatta
tcgccgatcg tgatgcacac 120ggcgaagccg ccgccgcttc cctgcgcgaa
tcgggcgcac aggcgctgtt tatcagctgc 180aatatcgctg aaaaaacgca
ggtcgaagcc ctgtattccc aggccgaaga ggcgtttggc 240ccggtagaca
ttctggtgaa taacgccgga atcaaccgcg acgccatgct gcacaaatta
300acggaagcgg actgggacac ggttatcgac gttaacctga aaggcacttt
cctctgtatg 360cagcaggccg ctatccgcat gcgcgagcgc ggtgcgggcc
gcattatcaa tatcgcttcc 420gccagttggc ttggcaacgt cgggcaaacc
aactattcgg cgtcaaaagc cggcgtggtg 480ggaatgacca aaaccgcctg
ccgcgaactg gcgaaaaaag gtgtcacggt gaatgccatc 540tgcccgggct
ttatcgatac cgacatgacg cgcggcgtac cggaaaacgt ctggcaaatc
600atggtcagca aaattcccgc gggttacgcc ggcgaggcga aagacgtcgg
cgagtgtgtg 660gcgtttctgg cgtccgatgg cgcgcgctat atcaatggtg
aagtgattaa cgtcggcggc 720ggcatggtgc tgtaa 735132253PRTKlebsiella
pneumoniae subsp. pneumoniae MGH78578 132Met Ser Met Thr Phe Ser
Gly Gln Val Ala Leu Val Thr Gly Ala Ala1 5 10 15 Asn Gly Ile Gly
Arg Ala Thr Ala Gln Ala Phe Ala Ala Gln Gly Leu 20 25 30 Lys Val
Val Val Ala Asp Leu Asp Thr Ala Gly Gly Glu Gly Thr Val 35 40 45
Ala Leu Ile Arg Glu Ala Gly Gly Glu Ala Leu Phe Val Pro Cys Asn 50
55 60 Val Thr Leu Glu Ala Asp Val Gln Ser Leu Met Ala Arg Thr Ile
Glu65 70 75 80 Ala Tyr Gly Arg Leu Asp Tyr Ala Phe Asn Asn Ala Gly
Ile Glu Ile 85 90 95 Glu Lys Gly Arg Leu Ala Glu Gly Ser Met Asp
Glu Phe Asp Ala Ile 100 105 110 Met Gly Val Asn Val Lys Gly Val Trp
Leu Cys Met Lys Tyr Gln Leu 115 120 125 Pro Leu Leu Leu Ala Gln Gly
Gly Gly Ala Ile Val Asn Thr Ala Ser 130 135 140 Val Ala Gly Leu Gly
Ala Ala Pro Lys Met Ser Ile Tyr Ala Ala Ser145 150 155 160 Lys His
Ala Val Ile Gly Leu Thr Lys Ser Ala Ala Ile Glu Tyr Ala 165 170 175
Lys Lys Lys Ile Arg Val Asn Ala Val Cys Pro Ala Val Ile Asp Thr 180
185 190 Asp Met Phe Arg Arg Ala Tyr Glu Ala Asp Pro Lys Lys Ala Glu
Phe 195 200 205 Ala Ala Ala Met His Pro Val Gly Arg Ile Gly Lys Val
Glu Glu Ile 210 215 220 Ala Ser Ala Val Leu Tyr Leu Cys Ser Asp Gly
Ala Ala Phe Thr Thr225 230 235 240 Gly His Ala Leu Ala Val Asp Gly
Gly Ala Thr Ala Ile 245 250 133750DNAKlebsiella pneumoniae subsp.
pneumoniae MGH78578 133atgttattga aagataaagt cgccattatt actggcgcgg
cctccgcacg cggtttgggc 60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag
tggtcattat cgacctcaat 120ggcgaagcca gtaaaaccgc cgcggcggca
ttaggcgaag accatctcgg cctggcggcc 180aacgtcgctg atgaagtgca
ggtgcaggcg gccatcgaac agatcctggc gaaatacggt 240cgggttgatg
tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc
300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct
gctgatgtcg 360caggcggtta tccccaccat gcgggcgcaa aaatccggca
gcatcgtctg catctcgtcc 420gtctccgccc agcgcggcgg cggtattttc
ggcggaccgc actacagcgc ggcaaaagcc 480ggggtgctgg gtctggcgcg
ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt 540aactgcatca
ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac
600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc
gatagacatc 660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct
actccaccgg catcaccctg 720gacgttaacg gcggcatgtt aattcactaa
750134249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 134Met
Leu Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5 10
15 Arg Gly Leu Gly Phe Ala Thr Ala Lys Leu Phe Ala Glu Asn Gly Ala
20 25 30 Lys Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr
Ala Ala 35 40 45 Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala Ala
Asn Val Ala Asp 50 55 60 Glu Val Gln Val Gln Ala Ala Ile Glu Gln
Ile Leu Ala Lys Tyr Gly65 70 75 80 Arg Val Asp Val Leu Val Asn Asn
Ala Gly Ile Thr Gln Pro Leu Lys 85 90 95 Leu Met Asp Ile Lys Arg
Ala Asn Tyr Asp Ala Val Leu Asp Val Ser 100 105 110 Leu Arg Gly Thr
Leu Leu Met Ser Gln Ala Val Ile Pro Thr Met Arg 115 120 125 Ala Gln
Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln 130 135 140
Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser Ala Ala Lys Ala145
150 155 160 Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly
Pro Asp 165 170 175 Asn Val Arg Val Asn Cys Ile Thr Pro Gly Leu Ile
Gln Thr Asp Ile 180 185 190 Thr Ala Gly Lys Leu Thr Asp Asp Met Thr
Ala Asn Ile Leu Ala Gly 195 200 205 Ile Pro Met Asn Arg Leu Gly Asp
Ala Ile Asp Ile Ala Arg Ala Ala 210 215 220 Leu Phe Leu Gly Ser Asp
Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225 230 235 240 Asp Val Asn
Gly Gly Met Leu Ile His 245 135750DNAKlebsiella pneumoniae subsp.
pneumoniae MGH78578 135atgttattga aagataaagt cgccattatt actggcgcgg
cctccgcacg cggtttgggc 60ttcgcgactg cgaaattatt cgccgaaaac ggcgcgaaag
tggtcattat cgacctcaat 120ggcgaagcca gtaaaaccgc cgcggcggca
ttaggcgaag accatctcgg cctggcggcc 180aacgtcgctg atgaagtgca
ggtgcaggcg gccatcgaac agatcctggc gaaatacggt 240cgggttgatg
tactggtcaa taacgccggg attacccagc cgctgaagct gatggatatc
300aagcgcgcca actatgacgc ggtgcttgat gttagcctgc gcggcacgct
gctgatgtcg 360caggcggtta tccccaccat gcgggcgcaa aaatccggca
gcatcgtctg catctcgtcc 420gtctccgccc agcgcggcgg cggtattttc
ggcggaccgc actacagcgc ggcaaaagcc 480ggggtgctgg gtctggcgcg
ggcgatggcg cgcgagcttg gcccggataa cgtccgcgtt 540aactgcatca
ccccggggct gattcagacc gacattaccg ccggcaagct gactgatgac
600atgacggcca acattcttgc cggcattccg atgaaccgcc ttggcgacgc
gatagacatc 660gcgcgcgccg cgctgttcct cggcagcgat ctttcctcct
actccaccgg catcaccctg 720gacgttaacg gcggcatgtt aattcactaa
750136249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 136Met
Leu Leu Lys Asp Lys Val Ala Ile Ile Thr Gly Ala Ala Ser Ala1 5 10
15 Arg Gly Leu Gly Phe Ala Thr Ala Lys Leu Phe Ala Glu Asn Gly Ala
20 25 30 Lys Val Val Ile Ile Asp Leu Asn Gly Glu Ala Ser Lys Thr
Ala Ala 35 40 45 Ala Ala Leu Gly Glu Asp His Leu Gly Leu Ala Ala
Asn Val Ala Asp 50 55 60 Glu Val Gln Val Gln Ala Ala Ile Glu Gln
Ile Leu Ala Lys Tyr Gly65 70 75 80 Arg Val Asp Val Leu Val Asn Asn
Ala Gly Ile Thr Gln Pro Leu Lys 85 90 95 Leu Met Asp Ile Lys Arg
Ala Asn Tyr Asp Ala Val Leu Asp Val Ser 100 105 110 Leu Arg Gly Thr
Leu Leu Met Ser Gln Ala Val Ile Pro Thr Met Arg 115 120 125 Ala Gln
Lys Ser Gly Ser Ile Val Cys Ile Ser Ser Val Ser Ala Gln 130 135 140
Arg Gly Gly Gly Ile Phe Gly Gly Pro His Tyr Ser Ala Ala Lys Ala145
150 155 160 Gly Val Leu Gly Leu Ala Arg Ala Met Ala Arg Glu Leu Gly
Pro Asp 165 170 175 Asn Val Arg Val Asn Cys Ile Thr Pro Gly Leu Ile
Gln Thr Asp Ile 180 185 190 Thr Ala Gly Lys Leu Thr Asp Asp Met Thr
Ala Asn Ile Leu Ala Gly 195 200 205 Ile Pro Met Asn Arg Leu Gly Asp
Ala Ile Asp Ile Ala Arg Ala Ala 210 215 220 Leu Phe Leu Gly Ser Asp
Leu Ser Ser Tyr Ser Thr Gly Ile Thr Leu225 230 235 240 Asp Val Asn
Gly Gly Met Leu Ile His 245 137714DNAKlebsiella pneumoniae subsp.
pneumoniae MGH78578 137atgacagcgt ttcacaacaa atcagtgctg gttttaggcg
ggagtcgggg aattggcgcg 60gcgatcgtca ggcgttttgt cgccgatggc gcgtcggtgg
tgtttagcta ttccggttcg 120ccggaagcgg ccgagcggct ggcggcagag
accggcagca cggcggtgca ggcggacagc 180gccgatcgcg atgcggtgat
aagcctggtc cgcgacagcg gcccgctgga cgtgttagtg 240gtcaatgccg
ggatcgcgct tttcggtgac gctctcgagc aggacagcga tgcaatcgat
300cgcctgttcc acatcaatat tcacgccccc taccatgcct ccgtcgaagc
ggcgcgccgc 360atgccggaag gcgggcgcat tattgtcatc ggctcagtca
atggcgatcg catgccgttg 420ccgggaatgg cggcctatgc gctcagcaaa
tcggccctgc aggggctggc gcgcggcctg 480gcgcgggatt ttggcccgcg
cggcatcacg gtcaacgtcg tccagcccgg cccaattgat 540accgacgcca
acccggagaa cggcccgatg aaagagctga tgcacagctt tatggccatt
600aagcgccatg gccgtccgga agaggtggcg ggaatggtgg cgtggctggc
cggtccggag 660gcgtcgtttg tcactggcgc catgcacacc atcgacggag
cgtttggcgc ctga 714138237PRTKlebsiella pneumoniae subsp. pneumoniae
MGH78578 138Met Thr Ala Phe His Asn Lys Ser Val Leu Val Leu Gly Gly
Ser Arg1 5 10 15 Gly Ile Gly Ala Ala Ile Val Arg Arg Phe Val Ala
Asp Gly Ala Ser 20 25 30 Val Val Phe Ser Tyr Ser Gly Ser Pro Glu
Ala Ala Glu Arg Leu Ala 35 40 45 Ala Glu Thr Gly Ser Thr Ala Val
Gln Ala Asp Ser Ala Asp Arg Asp 50 55 60 Ala Val Ile Ser Leu Val
Arg Asp Ser Gly Pro Leu Asp Val Leu Val65 70 75 80 Val Asn Ala Gly
Ile Ala Leu Phe Gly Asp Ala Leu Glu Gln Asp Ser 85 90 95 Asp Ala
Ile Asp Arg Leu Phe His Ile Asn Ile His Ala Pro Tyr His 100 105 110
Ala Ser Val Glu Ala Ala Arg Arg Met Pro Glu Gly Gly Arg Ile Ile 115
120 125 Val Ile Gly Ser Val Asn Gly Asp Arg Met Pro Leu Pro Gly Met
Ala 130 135 140 Ala Tyr Ala Leu Ser Lys Ser Ala Leu Gln Gly Leu Ala
Arg Gly Leu145 150 155 160 Ala Arg Asp Phe Gly Pro Arg Gly Ile Thr
Val Asn Val Val Gln Pro 165 170 175 Gly Pro Ile Asp Thr Asp Ala Asn
Pro Glu Asn Gly Pro Met Lys Glu 180 185 190 Leu Met His Ser Phe Met
Ala Ile Lys Arg His Gly Arg Pro Glu Glu 195 200 205 Val Ala Gly Met
Val Ala Trp Leu Ala Gly Pro Glu Ala Ser Phe Val 210 215 220 Thr Gly
Ala Met His Thr Ile Asp Gly Ala Phe Gly Ala225 230 235
139750DNAKlebsiella pneumoniae subsp. pneumoniae MGH78578
139atgaacggcc tgctaaacgg taaacgtatt gtcgtcaccg gtgcggcgcg
cggtctcggg 60taccactttg ccgaagcctg cgccgctcag ggcgcgacgg tggtgatgtg
cgacatcctg 120cagggagagc tggcggaaag cgctcatcgc ctgcagcaga
agggctatca ggtcgaatct 180cacgccatcg atcttgccag tcaagcatcg
atcgagcagg tcttcagcgc catcggcgcg 240caggggtcta tcgatggctt
agtcaataac gcagcgatgg ccaccggcgt cggcggaaaa 300aatatgatcg
attacgatcc ggatctgtgg gatcgggtaa tgacggtcaa cgttaaaggc
360acctggttgg tgacccgcgc ggcggtaccg ctgctgcgcg aaggggcggc
gatcgtcaac 420gtcgcttcgg ataccgcgct gtggggcgcg ccgcggctga
tggcctatgt cgccagtaag 480ggcgcggtga ttgcgatgac ccgctccatg
gcccgcgagc tgggtgaaaa gcggatccgt 540atcaacgcca tcgcgccggg
actgacccgc gttgaggcca cggaatacgt tcccgccgag 600cgtcatcagc
tgtatgagaa cggccgcgcg ctcagcggcg cgcagcagcc ggaagatgtc
660accggcagcg tggtctggct gctgagcgat ctttcgcgct ttatcaccgg
ccaactgatc 720ccggtcaacg gcggttttgt ctttaactaa
750140249PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 140Met
Asn Gly Leu Leu Asn Gly Lys Arg Ile Val Val Thr Gly Ala Ala1 5 10
15 Arg Gly Leu Gly Tyr His Phe Ala Glu Ala Cys Ala Ala Gln Gly Ala
20 25 30 Thr Val Val Met Cys Asp Ile Leu Gln Gly Glu Leu Ala Glu
Ser Ala 35 40 45 His Arg Leu Gln Gln Lys Gly Tyr Gln Val Glu Ser
His Ala Ile Asp 50 55 60 Leu Ala Ser Gln Ala Ser Ile Glu Gln Val
Phe Ser Ala Ile Gly Ala65 70 75 80 Gln Gly Ser Ile Asp Gly Leu Val
Asn Asn Ala Ala Met Ala Thr Gly 85 90 95 Val Gly Gly Lys Asn Met
Ile Asp Tyr Asp Pro Asp Leu Trp Asp Arg 100 105 110 Val Met Thr Val
Asn Val Lys Gly Thr Trp Leu Val Thr Arg Ala Ala 115 120 125 Val Pro
Leu Leu Arg Glu Gly Ala Ala Ile Val Asn Val Ala Ser Asp 130 135 140
Thr Ala Leu Trp Gly Ala Pro Arg Leu Met Ala Tyr Val Ala Ser Lys145
150 155 160 Gly Ala Val Ile Ala Met Thr Arg Ser Met Ala Arg Glu Leu
Gly Glu 165 170 175 Lys Arg Ile Arg Ile Asn Ala Ile Ala Pro Gly Leu
Thr Arg Val Glu 180 185 190 Ala Thr Glu Tyr Val Pro Ala Glu Arg His
Gln Leu Tyr Glu Asn Gly 195 200 205 Arg Ala Leu Ser Gly Ala Gln Gln
Pro Glu Asp Val Thr Gly Ser Val 210 215 220 Val Trp Leu Leu Ser Asp
Leu Ser Arg Phe Ile Thr Gly Gln Leu Ile225 230 235 240 Pro Val Asn
Gly Gly Phe Val Phe Asn 245 141795DNAKlebsiella pneumoniae subsp.
pneumoniae MGH78578 141atgaatgcac aaattgaagg gcgcgtcgcg gtagtcaccg
gcggttcgtc aggaatcggc 60tttgaaacgc tgcgcctgct gctgggcgaa ggggcgaaag
tcgccttttg cggccgcaac 120ccggatcggc ttgccagcgc ccatgcggcg
ttgcaaaacg aatatccaga aggtgaggtg 180ttctcctggc gctgtgacgt
actgaacgaa gctgaagttg aggcgttcgc cgccgcggtc 240gccgcgcgtt
tcggcggcgt cgatatgctg attaataacg ccggccaggg ctatgtcgcc
300cacttcgccg atacgccacg tgaggcctgg ctgcacgaag ccgaactgaa
actgttcggc 360gtgattaacc cggtaaaggc ctttcagtcc ctgctagagg
cgtcggatat cgcctcgatt 420acctgtgtga actcgctgct ggcgttacag
ccggaagagc acatgatcgc cacctctgcc 480gcccgcgccg cgctgctcaa
tatgacgctg actctgtcga aagagctggt ggataaaggt 540attcgtgtga
attccattct gctggggatg gtggagtccg ggcagtggca gcgccgtttt
600gagagccgaa gcgataagag ccagagttgg cagcagtgga ccgccgatat
cgcccgtaag 660cgggggatcc cgatggcgcg tctcggtaag ccgcaggagc
cagcgcaagc gctgctattc 720ctcgcttcgc cgctggcctc ctttaccacc
ggcgcggcgc tggacgtttc cggcggtttc 780tgtcgccatc tgtaa
795142264PRTKlebsiella pneumoniae subsp. pneumoniae MGH78578 142Met
Asn Ala Gln Ile Glu Gly Arg Val Ala Val Val Thr Gly Gly Ser1 5 10
15 Ser Gly Ile Gly Phe Glu Thr Leu Arg Leu Leu Leu Gly Glu Gly Ala
20 25 30 Lys Val Ala Phe Cys Gly Arg Asn Pro Asp Arg Leu Ala Ser
Ala His 35 40 45 Ala Ala Leu Gln Asn Glu Tyr Pro Glu Gly Glu Val
Phe Ser Trp Arg 50 55 60 Cys Asp Val Leu Asn Glu Ala Glu Val Glu
Ala Phe Ala Ala Ala Val65 70 75 80 Ala Ala Arg Phe Gly Gly Val Asp
Met Leu Ile Asn Asn Ala Gly Gln 85 90 95 Gly Tyr Val Ala His Phe
Ala Asp Thr Pro Arg Glu Ala Trp Leu His 100 105 110 Glu Ala Glu Leu
Lys Leu Phe Gly Val Ile Asn Pro Val Lys Ala Phe 115 120 125 Gln Ser
Leu Leu Glu Ala Ser Asp Ile Ala Ser Ile Thr Cys Val Asn 130 135 140
Ser Leu Leu Ala Leu Gln Pro Glu Glu His Met Ile Ala Thr Ser Ala145
150 155 160 Ala Arg Ala Ala Leu Leu Asn Met Thr Leu Thr Leu Ser Lys
Glu Leu 165 170 175 Val Asp Lys Gly Ile Arg Val Asn Ser Ile Leu Leu
Gly Met Val Glu 180 185 190 Ser Gly Gln Trp Gln Arg Arg Phe Glu Ser
Arg Ser Asp Lys Ser Gln 195 200 205 Ser Trp Gln Gln Trp Thr Ala Asp
Ile Ala Arg Lys Arg Gly Ile Pro 210 215 220 Met Ala Arg Leu Gly Lys
Pro Gln Glu Pro Ala Gln Ala Leu Leu Phe225 230 235 240 Leu Ala Ser
Pro Leu Ala Ser Phe Thr Thr Gly Ala Ala Leu Asp Val 245 250 255 Ser
Gly Gly Phe Cys Arg His Leu 260 1431795DNAPseudomonas fluorescens
143cgccaagcaa tcgggctttg gggcagaatt gggtcgcgaa gggcttgagg
agtttgccca 60gtccaagatc atcaacgccg cgctataaat taaaggatcc cccatggcga
tgattacagg 120cggcgaactg gttgttcgca ccctaataaa ggctggggtc
gaacatctgt tcggcctgca 180cggcgcgcat atcgatacga tttttcaagc
ctgtctcgat catgatgtgc cgatcatcga 240cacccgccat gaggccgccg
cagggcatgc ggccgagggc tatgcccgcg ctggcgccaa 300gctgggcgtg
gctggtcacg gcgggcgggg gatttaccaa tgcggtcacg cccattgcca
360acgcttggct ggatcgcaag gccggtgtat tcctcacccg ggatcgggcg
cgctgcgtga 420tgatgaaacc aacacgttgc aggcggggat tgatcaggtc
gccatggcgg cgcccattac 480caaatgggcg catcgggtga tggcaaccga
gcatatccca cggctggtga tgcaggcgat 540ccgcgccgcg ttgagcgcgc
cacgcgggcc ggtgttgctg gatctgccgt gggatattct 600gatgaaccag
attgatgagg atagcgtcat tatccccgat ctggtcttgt ccgcgcatgg
660ggccagaccc gaccctgccg atctggatca ggctctcgcg cttttgcgca
aggcggagcg 720gccggtcatc gtgctcggct cagaagcctc gcggacagcg
cgcaagacgg cgcttagcgc 780cttcgtggcg gcgactggcg tgccggtgtt
tgccgattat gaagggctaa gcatgctctc 840ggggctgccc gatgctatgc
ggggcgggct ggtgcaaaac ctctattctt ttgccaaagc 900cgatgccgcg
ccagatctcg tgctgatgct gggggcgcgc tttggcctta acaccgggca
960tggatctggg cagttgatcc cccatagcgc gcaggtcatt caggtcgacc
ctgatgcctg 1020cgagctggga cgcctgcagg gcatcgctct gggcattgtg
gccgatgtgg gtgggaccat 1080cgaggctttg gcgcaggcca ccgcgcaaga
tgcggcttgg ccggatcgcg gcgactggtg 1140cgccaaagtg acggatctgg
cgcaagagcg ctatgccagc atcgctgcga aatcgagcag 1200cgagcatgcg
ctccacccct ttcacgcctc gcaggtcatt gccaaacacg tcgatgcagg
1260ggtgacggtg gtagcggatg gtgcgctgac ctatctctgg ctgtccgaag
tgatgagccg 1320cgtgaaaccc ggcggttttc tctgccacgg ctatctaggc
tcgatgggcg tgggcttcgg 1380cacggcgctg ggcgcgcaag tggccgatct
tgaagcaggc cgccgcacga tccttgtgac 1440cggcgatggc tcggtgggct
atagcatcgg tgaatttgat acgctggtgc gcaaacaatt 1500gccgctgatc
gtcatcatca tgaacaacca aagctggggg gcgacattgc atttccagca
1560attggccgtc ggccccaatc gcgtgacggg cacccgtttg gaaaatggct
cctatcacgg 1620ggtggccgcc gcctttggcg cggatggcta tcatgtcgac
agtgtggaga gcttttctgc 1680ggctctggcc caagcgctcg cccataatcg
ccccgcctgc atcaatgtcg cggtcgcgct 1740cgatccgatc ccgcccgaag
aactcattct gatcggcatg gaccccttcg catga 1795144563PRTPseudomonas
fluorescens 144Met Ala Met Ile Thr Gly Gly Glu Leu Val Val Arg Thr
Leu Ile Lys1 5 10 15 Ala Gly Val Glu His Leu Phe Gly Leu His Gly
Ala His Ile Asp Thr 20 25 30 Ile Phe Gln Ala Cys Leu Asp His Asp
Val Pro Ile Ile Asp Thr Arg 35 40 45 His Glu Ala Ala Ala Gly His
Ala Ala Glu Gly Tyr Ala Arg Ala Gly 50 55 60 Ala Lys Leu Gly Val
Ala Gly His Gly Gly Arg Gly Ile Tyr Gln Cys65 70 75 80 Gly His Ala
His Cys Gln Arg Leu Ala Gly Ser Gln Gly Arg Cys Ile 85 90 95 Pro
His Pro Gly Ser Gly Ala Leu Arg Asp Asp Glu Thr Asn Thr Leu 100 105
110 Gln Ala Gly Ile Asp Gln Val Ala Met Ala Ala Pro Ile Thr Lys Trp
115 120 125 Ala His Arg Val Met Ala Thr Glu His Ile Pro Arg Leu Val
Met Gln 130 135 140 Ala Ile Arg Ala Ala Leu Ser Ala Pro Arg Gly Pro
Val Leu Leu Asp145 150 155 160 Leu Pro Trp Asp Ile Leu Met Asn Gln
Ile Asp Glu Asp Ser Val Ile 165 170 175 Ile Pro Asp Leu Val Leu Ser
Ala His Gly Ala Arg Pro Asp Pro Ala 180 185 190 Asp Leu Asp Gln Ala
Leu Ala Leu Leu Arg Lys Ala Glu Arg Pro Val 195 200 205 Ile Val Leu
Gly Ser Glu Ala Ser Arg Thr Ala Arg Lys Thr Ala Leu 210 215 220 Ser
Ala Phe Val Ala Ala Thr Gly Val Pro Val Phe Ala Asp Tyr Glu225 230
235 240 Gly Leu Ser Met Leu Ser Gly Leu Pro Asp Ala Met Arg Gly Gly
Leu 245 250 255 Val Gln Asn Leu Tyr Ser Phe Ala Lys Ala Asp Ala Ala
Pro Asp Leu 260 265 270 Val Leu Met Leu Gly Ala Arg Phe Gly Leu Asn
Thr Gly His Gly Ser 275 280 285 Gly Gln Leu Ile Pro His Ser Ala Gln
Val Ile Gln Val Asp Pro Asp 290 295 300 Ala Cys Glu Leu Gly Arg Leu
Gln Gly Ile Ala Leu Gly Ile Val Ala305 310 315 320 Asp Val Gly Gly
Thr Ile Glu Ala Leu Ala Gln Ala Thr Ala Gln Asp 325 330 335 Ala Ala
Trp Pro Asp Arg Gly Asp Trp Cys Ala Lys Val Thr Asp Leu 340 345 350
Ala Gln Glu Arg Tyr Ala Ser Ile Ala Ala Lys Ser Ser Ser Glu His 355
360 365 Ala Leu His Pro Phe His Ala Ser Gln Val Ile Ala Lys His Val
Asp 370 375 380 Ala Gly Val Thr Val Val Ala Asp Gly Ala Leu Thr Tyr
Leu Trp Leu385 390 395 400 Ser Glu Val Met Ser Arg Val Lys Pro Gly
Gly Phe Leu Cys His Gly 405 410 415 Tyr Leu Gly Ser Met Gly Val Gly
Phe Gly Thr Ala Leu Gly Ala Gln 420 425 430 Val Ala Asp Leu Glu Ala
Gly Arg Arg Thr Ile Leu Val Thr Gly Asp 435 440 445 Gly Ser Val Gly
Tyr Ser Ile Gly Glu Phe Asp Thr Leu Val Arg Lys 450 455 460 Gln Leu
Pro Leu Ile Val Ile Ile Met Asn Asn Gln Ser Trp Gly Ala465 470 475
480 Thr Leu His Phe Gln Gln Leu Ala Val Gly Pro Asn Arg Val Thr Gly
485 490 495 Thr Arg Leu Glu Asn Gly Ser Tyr His Gly Val Ala Ala Ala
Phe Gly 500 505 510 Ala Asp Gly Tyr His Val Asp Ser Val Glu Ser Phe
Ser Ala Ala Leu 515 520 525 Ala Gln Ala Leu Ala His Asn Arg Pro Ala
Cys Ile Asn Val Ala Val 530 535 540 Ala Leu Asp Pro Ile Pro Pro Glu
Glu Leu Ile Leu Ile Gly Met Asp545 550 555 560 Pro Phe
Ala1459PRTArtificial SequenceA polypeptide that is similar to an
autotransporter adhesion or type I secretion target repeat. 145Gly
Gly Xaa Gly Xaa Asp Xaa Xaa Xaa1 5 14650DNAArtificial
SequencePrimer 146gtctttattc atatatatat cctccttaat tcaaccgttc
aatcaccatc 5014730DNAArtificial SequencePrimer 147gggcggccgc
aaggggttcg cgttggccga 3014822DNAArtificial SequencePrimer
148ggagaaaata ccgcatcagg cg 2214932DNAArtificial SequencePrimer
149cgggatccaa gttgcaggat atgacgaaag cg 3215033DNAArtificial
SequencePrimer 150gctctagaag attatccctg tctgcggaag cgg
3315132DNAArtificial SequencePrimer 151gctctagagg ggtgcctaat
gagtgagcta ac 3215233DNAArtificial SequencePrimer 152cgggatccgc
gttaatattt tgttaaaatt cgc 3315331DNAArtificial SequencePrimer
153gctctagagt ttatgtcgca cccgccgttg g 3115432DNAArtificial
SequencePrimer 154cccaagctta gaaagggaaa ttgtggtagc cc
3215531DNAArtificial SequencePrimer 155ggaattccat atgcgtccct
ctgccccggc c 3115630DNAArtificial SequencePrimer 156cgggatcctt
agaactgctt gggaagggag 3015750DNAArtificial SequencePrimer
157aggtacggtg aaataaagga ggatatacat atgtccaaaa agattgccgt
5015837DNAArtificial SequencePrimer 158ttttcctttt gcggccgccc
cgctggcatc gcctcac 3715950DNAArtificial SequencePrimer
159ggcgatgcca gcgtaaagga ggatatacat atgaaaaact ggaaaacaag
5016037DNAArtificial SequencePrimer 160ttttcctttt gcggccgccc
cagcttagcg ccttcta 3716131DNAArtificial SequencePrimer
161cccgagctct taggaggatt agtcatggaa c 3116232DNAArtificial
SequencePrimer 162gctctagatt attttgaata atcgtagaaa cc
3216342DNAArtificial SequencePrimer 163gctctagagg aggatatata
tatgaaaaat tgtgtcatcg tc 4216430DNAArtificial SequencePrimer
164aactgcagtt aattcaaccg ttcaatcacc 3016546DNAArtificial
SequencePrimer 165cgagctcagg aggatatata tatgaaaaat tgtgtcatcg
tcagtg 4616650DNAArtificial SequencePrimer 166ggttgaatta aggaggatat
atatatgaat aaagacacac taatacctac 5016730DNAArtificial
SequencePrimer 167cccaagctta gccggcaagt acacatcttc
3016846DNAArtificial SequencePrimer 168cgagctcagg aggatatata
tatgaaaaat tgtgtcatcg tcagtg 4616930DNAArtificial SequencePrimer
169cccaagctta gccggcaagt acacatcttc 3017040DNAArtificial
SequencePrimer 170aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg
4017135DNAArtificial SequencePrimer 171cggggtaccg cggatacata
tttgaatgta tttag 3517244DNAArtificial SequencePrimer 172aaggaaaaaa
gcggccgcgc ggatacatat ttgaatgtat ttag 4417343DNAArtificial
SequencePrimer 173gctctagagg aggatatata tatggctaac tacttcaata cac
4317450DNAArtificial SequencePrimer 174tgctgttgcg ggttaaggag
gatatatata tgcctaagta ccgttccgcc 5017550DNAArtificial
SequencePrimer 175aacggtactt aggcatatat atatcctcct taacccgcaa
cagcaatacg 5017630DNAArtificial SequencePrimer 176acatgcatgc
ttaacccccc agtttcgatt 3017743DNAArtificial SequencePrimer
177gctctagagg aggatatata tatggctaac tacttcaata cac
4317830DNAArtificial SequencePrimer 178acatgcatgc ttaacccccc
agtttcgatt 3017943DNAArtificial SequencePrimer 179cccgagctca
ggaggatata tatatggata aacagtatcc ggt 4318028DNAArtificial
SequencePrimer 180gctctagatt acagaatttg actcaggt
2818145DNAArtificial SequencePrimer 181cccgagctca ggaggatata
tatatgttga caaaagcaac aaaag 4518225DNAArtificial SequencePrimer
182ctctaaatct ctggaaaggg taccg 2518330DNAArtificial SequencePrimer
183gctctagatt agagagcttt cgttttcatg 3018445DNAArtificial
SequencePrimer 184cccgagctca ggaggatata tatatgttga caaaagcaac aaaag
4518530DNAArtificial SequencePrimer 185gctctagatt agagagcttt
cgttttcatg 3018646DNAArtificial SequencePrimer 186cgagctcagg
aggatatata tatgagccag caagtcatta ttttcg 4618735DNAArtificial
SequencePrimer 187aaaactgcag cgtttgatga cgtggacgat agcgg
3518846DNAArtificial SequencePrimer 188cgagctcagg aggatatata
tatgagccag caagtcatta ttttcg 4618950DNAArtificial SequencePrimer
189aggggtgtaa ggaggatata tatatggcta agacgttata cgaaaaattg
5019050DNAArtificial SequencePrimer 190cgtcttagcc atatatatat
cctccttaca ccccttctgc tacatagcgg 5019135DNAArtificial
SequencePrimer 191aaaactgcag cgtttgatga cgtggacgat agcgg
3519246DNAArtificial SequencePrimer 192cgagctcagg aggatatata
tatgagccag caagtcatta ttttcg 4619335DNAArtificial SequencePrimer
193aaaactgcag cgtttgatga cgtggacgat agcgg 3519446DNAArtificial
SequencePrimer 194cgagctcagg aggatatata tatgagccag caagtcatta
ttttcg 4619550DNAArtificial SequencePrimer 195gaaaccgtgt gaggaggata
tatatatgtc gaagaattac catattgccg 5019650DNAArtificial
SequencePrimer 196aggggtgtaa ggaggatata tatatggcta agacgttata
cgaaaaattg 5019750DNAArtificial SequencePrimer 197acattaaata
aggaggatat atatatggca gagaaattta tcaaacacac 5019850DNAArtificial
SequencePrimer 198attcttcgac atatatatat cctcctcaca cggtttcctt
gttgttttcg 5019950DNAArtificial SequencePrimer 199cgtcttagcc
atatatatat cctccttaca ccccttctgc tacatagcgg 5020050DNAArtificial
SequencePrimer 200tttctctgcc atatatatat cctccttatt taatgttgcg
aatgtcggcg 5020135DNAArtificial SequencePrimer 201aaaactgcag
cgtttgatga cgtggacgat agcgg 3520246DNAArtificial SequencePrimer
202cgagctcagg aggatatata tatgagccag caagtcatta ttttcg
4620335DNAArtificial SequencePrimer 203aaaactgcag cgtttgatga
cgtggacgat agcgg 3520440DNAArtificial SequencePrimer 204aaggaaaaaa
gcggccgccc ctgaaccgac gaccgggtcg 4020535DNAArtificial
SequencePrimer 205cggggtaccg cggatacata tttgaatgta tttag
3520642DNAArtificial SequencePrimer 206aaggaaaaaa gcggccgcac
ttttcatact cccgccattc ag 4220731DNAArtificial SequencePrimer
207caaaggccgt ctgcacgcgc cgaaaggcaa a 3120831DNAArtificial
SequencePrimer 208tttgcctttc ggcgcgtgca gacggccttt g
3120935DNAArtificial SequencePrimer 209acatgcatgc cgtttgatga
cgtggacgat agcgg 3521042DNAArtificial SequencePrimer 210aaggaaaaaa
gcggccgcac ttttcatact cccgccattc ag 4221135DNAArtificial
SequencePrimer 211acatgcatgc cgtttgatga cgtggacgat agcgg
3521248DNAArtificial SequencePrimer 212cccgagctca ggaggatata
tatatgaatt atcagaacga cgatttac 4821350DNAArtificial SequencePrimer
213gcgtcgcggg taaggaggaa aattttatgt cctcacgtaa agagcttgcc
5021450DNAArtificial SequencePrimer 214gaactgctgt aaggaggtta
aaattatgga gaggattgtc gttactctcg 5021550DNAArtificial
SequencePrimer 215caatcagcgt aaggaggtat atataatgaa aaccgtaact
gtaaaagatc 5021650DNAArtificial SequencePrimer 216tacaccaggc
ataaggagga attaattatg gaaacctatg ctgtttttgg 5021750DNAArtificial
SequencePrimer 217tacgtgagga cataaaattt tcctccttac ccgcgacgcg
cttttactgc 5021850DNAArtificial SequencePrimer 218caatcctctc
cataatttta acctccttac agcagttctt ttgctttcgc 5021950DNAArtificial
SequencePrimer 219caatcagcgt aaggaggtat atataatgaa aaccgtaact
gtaaaagatc 5022050DNAArtificial SequencePrimer 220tacggttttc
attatatata cctccttacg ctgattgaca atcggcaatg 5022134DNAArtificial
SequencePrimer 221acatgcatgc ttacgcggac aattcctcct gcaa
3422248DNAArtificial SequencePrimer 222cccgagctca ggaggatata
tatatgaatt atcagaacga cgatttac 4822334DNAArtificial SequencePrimer
223acatgcatgc ttacgcggac aattcctcct gcaa 3422448DNAArtificial
SequencePrimer 224cccgagctca ggaggatata tatatgacat cggaaaaccc
gttactgg 4822550DNAArtificial SequencePrimer 225gatccaacct
aaggaggaaa attttatgac acaacctctt tttctgatcg 5022650DNAArtificial
SequencePrimer 226gatcaattgt taaggaggta tatataatgg aatccctgac
gttacaaccc 5022750DNAArtificial SequencePrimer 227caggcagcct
aaggaggaat taattatggc tggaaacaca attggacaac 5022850DNAArtificial
SequencePrimer 228aggttgtgtc ataaaatttt cctccttagg ttggatcaac
aggcactacg 5022950DNAArtificial SequencePrimer 229cagggattcc
attatatata cctccttaac aattgatcgt ctgtgccagg 5023050DNAArtificial
SequencePrimer
230gtttccagcc ataattaatt cctccttagg ctgcctggct aatccgcgcc
5023135DNAArtificial SequencePrimer 231acatgcatgc ttaccagcgt
ggaatatcag tcttc 3523248DNAArtificial SequencePrimer 232cccgagctca
ggaggatata tatatgacat cggaaaaccc gttactgg 4823335DNAArtificial
SequencePrimer 233acatgcatgc ttaccagcgt ggaatatcag tcttc
3523448DNAArtificial SequencePrimer 234cccgagctca ggaggatata
tatatggttg ctgaattgac cgcattac 4823550DNAArtificial SequencePrimer
235aatcgccagt aaggaggaaa attttatgac acaacctctt tttctgatcg
5023650DNAArtificial SequencePrimer 236gatcaattgt taaggaggta
tatataatgg aatccctgac gttacaaccc 5023750DNAArtificial
SequencePrimer 237caggcagcct aaggaggaat taattatggc tggaaacaca
attggacaac 5023850DNAArtificial SequencePrimer 238gaggttgtgt
cataaaattt tcctccttac tggcgattgt cattcgcctg 5023950DNAArtificial
SequencePrimer 239cagggattcc attatatata cctccttaac aattgatcgt
ctgtgccagg 5024050DNAArtificial SequencePrimer 240gtttccagcc
ataattaatt cctccttagg ctgcctggct aatccgcgcc 5024135DNAArtificial
SequencePrimer 241acatgcatgc ttaccagcgt ggaatatcag tcttc
3524248DNAArtificial SequencePrimer 242cccgagctca ggaggatata
tatatggttg ctgaattgac cgcattac 4824335DNAArtificial SequencePrimer
243acatgcatgc ttaccagcgt ggaatatcag tcttc 3524440DNAArtificial
SequencePrimer 244aaggaaaaaa gcggccgccc ctgaaccgac gaccgggtcg
4024532DNAArtificial SequencePrimer 245gctctagaac ttttcatact
cccgccattc ag 3224634DNAArtificial SequencePrimer 246gctctagagc
ggatacatat ttgaatgtat ttag 3424744DNAArtificial SequencePrimer
247aaggaaaaaa gcggccgcgc ggatacatat ttgaatgtat ttag
4424826DNAArtificial SequencePrimer 248catgccatgg ctatgattac tggtgg
2624933DNAArtificial SequencePrimer 249ccccgagctc ttacgcgccg
gattggaaat aca 3325031DNAArtificial SequencePrimer 250catgccatgg
ccaaagttac aaatcaaaaa g 3125132DNAArtificial SequencePrimer
251cgagctctta aaatgatttt atatagatat cc 3225231DNAArtificial
SequencePrimer 252catgccatgg gtattccaga aactcaaaaa g
3125331DNAArtificial SequencePrimer 253cccgagctct tatttagaag
tgtcaacaac g 3125447DNAArtificial SequencePrimer 254ccccgagctc
aggaggatat acatatgaat aaagacacac taatacc 4725530DNAArtificial
SequencePrimer 255cccaagctta gccggcaagt acacatcttc
3025645DNAArtificial SequencePrimer 256cccgagctca ggaggatata
tatatgtata cagtaggaga ttacc 4525733DNAArtificial SequencePrimer
257gctctagatt atgatttatt ttgttcagca aat 3325845DNAArtificial
SequencePrimer 258cccgagctca ggaggatata tatatgtata cagtaggaga ttacc
4525933DNAArtificial SequencePrimer 259gctctagatt atgatttatt
ttgttcagca aat 3326046DNAArtificial SequencePrimer 260cgagctcagg
aggatatata tatgaaaaaa gtcgcacttg ttaccg 4626131DNAArtificial
SequencePrimer 261ggccggcggc cgcgcgatgg cggtgaaagt g
3126250DNAArtificial SequencePrimer 262aactaatcta gaggaggata
tatatatgag catgacgttt tccggccagg 5026331DNAArtificial
SequencePrimer 263ccttgcggag ggctcgatgg atgagttcga c
3126431DNAArtificial SequencePrimer 264cactttcacc gccatcgcgc
ggccgccggc c 3126550DNAArtificial SequencePrimer 265gctcatatat
atatcctcct ctagattagt taaacaccat cccgccgtcg 5026631DNAArtificial
SequencePrimer 266gtcgaactca tccatcgagc cctccgcaag g
3126732DNAArtificial SequencePrimer 267cccaagctta gatcgcggtg
gccccgccgt cg 3226846DNAArtificial SequencePrimer 268cgagctcagg
aggatatata tatgaaaaaa gtcgcacttg ttaccg 4626932DNAArtificial
SequencePrimer 269cccaagctta gatcgcggtg gccccgccgt cg
3227043DNAArtificial SequencePrimer 270gctctagagg aggatttaaa
aatggaaatt aacgaaacgc tgc 4327145DNAArtificial SequencePrimer
271tccccgcggt taagcatggc gatcccgaaa tggaatccct ttgac
4527244DNAArtificial SequencePrimer 272ccgctcgagg aggatatata
tatgagatcg aaaagatttg aagc 4427330DNAArtificial SequencePrimer
273gctctagatt agccaagttc attgggatcg 3027433DNAArtificial
SequencePrimer 274cggggtacca cttttcatac tcccgccatt cag
3327525DNAArtificial SequencePrimer 275cggtaccctt tccagagatt tagag
2527630DNAArtificial SequencePrimer 276ggaattccat atgttcacaa
cgtccgccta 3027727DNAArtificial SequencePrimer 277gcttgacggc
catgtggccg aggccgc 2727827DNAArtificial SequencePrimer
278gcggcctcgg ccacatggcc gtcaagc 2727928DNAArtificial
SequencePrimer 279cgggatcctt aggcggcctt ctggcgcg
2828030DNAArtificial SequencePrimer 280ggaattccat atggctattg
caagaggtta 3028128DNAArtificial SequencePrimer 281cgggatcctt
aagcgtcgag cgaggcca 2828230DNAArtificial SequencePrimer
282ggaattccat atgactaaaa caatgaaggc 3028327DNAArtificial
SequencePrimer 283caccggggcc ggggtccggt attgcca
2728427DNAArtificial SequencePrimer 284tggcaatacc ggaccccggc
cccggtg 2728528DNAArtificial SequencePrimer 285cgggatcctt
aggcggcgag atccacga 2828630DNAArtificial SequencePrimer
286ggaattccat atgaccgggg cgaaccagcc 3028727DNAArtificial
SequencePrimer 287atagccgctc atacgcctcg gttgcct
2728827DNAArtificial SequencePrimer 288aggcaaccga ggcgtatgag
cggctat 2728928DNAArtificial SequencePrimer 289cgggatcctt
aagcgccgtg cggaagga 2829030DNAArtificial SequencePrimer
290ggaattccat atgaccatgc atgccattca 3029128DNAArtificial
SequencePrimer 291cgggatcctt attcggctgc aaattgca
2829230DNAArtificial SequencePrimer 292ggaattccat atgcgcgcgc
tttattacga 3029328DNAArtificial SequencePrimer 293cgggatcctt
attcgaaccg gtcgatga 2829430DNAArtificial SequencePrimer
294ggaattccat atgctggcga ttttctgtga 3029528DNAArtificial
SequencePrimer 295cgggatcctt atgcgacctc caccatgc
2829630DNAArtificial SequencePrimer 296ggaattccat atgaaagcct
tcgtcgtcga 3029728DNAArtificial SequencePrimer 297cgggatcctt
aggatgcgta tgtaacca 2829830DNAArtificial SequencePrimer
298ggaattccat atgaaagcga ttgtcgccca 3029928DNAArtificial
SequencePrimer 299cgggatcctt aggaaaaggc gatctgca
2830030DNAArtificial SequencePrimer 300ggaattccat atgccgatgg
cgctcgggca 3030128DNAArtificial SequencePrimer 301cgggatcctt
agaattcgat gacttgcc 2830230DNAArtificial SequencePrimer
302ggaattccat atgaaacatt ctcaggacaa 3030327DNAArtificial
SequencePrimer 303gggcgccgat catgtggtgc gtttccg
2730427DNAArtificial SequencePrimer 304cggaaacgca ccacatgatc
ggcgccc 2730528DNAArtificial SequencePrimer 305cgggatcctt
atgccatacg ttccatat 2830630DNAArtificial SequencePrimer
306ggaattccat atgcagcgtt ttaccaacag 3030728DNAArtificial
SequencePrimer 307cgggatcctt aggaaaacag gacgccgc
28308610PRTKlebsiella pneumoniae subsp. pneumoniae MGH 78578 308Met
Arg Tyr Ile Ala Gly Ile Asp Ile Gly Asn Ser Ser Thr Glu Val1 5 10
15 Ala Leu Ala Thr Val Asp Asp Ala Gly Val Leu Asn Ile Arg His Ser
20 25 30 Ala Leu Ala Glu Thr Thr Gly Ile Lys Gly Thr Leu Arg Asn
Val Phe 35 40 45 Gly Ile Gln Glu Ala Leu Thr Gln Ala Ala Lys Ala
Ala Gly Ile Gln 50 55 60 Leu Ser Asp Ile Ser Leu Ile Arg Ile Asn
Glu Ala Thr Pro Val Ile65 70 75 80 Gly Asp Val Ala Met Glu Thr Ile
Thr Glu Thr Ile Ile Thr Glu Ser 85 90 95 Thr Met Ile Gly His Asn
Pro Lys Thr Pro Gly Gly Val Gly Leu Gly 100 105 110 Val Gly Ile Thr
Ile Thr Pro Glu Ala Leu Leu Ser Cys Ser Ala Asp 115 120 125 Thr Pro
Tyr Ile Leu Val Val Ser Ser Ala Phe Asp Phe Ala Asp Val 130 135 140
Ala Ala Met Val Asn Ala Ala Thr Ala Ala Gly Tyr Gln Ile Thr Gly145
150 155 160 Ile Ile Leu Gln Gln Asp Asp Gly Val Leu Val Asn Asn Arg
Leu Gln 165 170 175 Gln Pro Leu Pro Val Ile Asp Glu Val Gln His Ile
Asp Arg Ile Pro 180 185 190 Leu Gly Met Leu Ala Ala Val Glu Val Ala
Leu Pro Gly Lys Ile Ile 195 200 205 Glu Thr Leu Ser Asn Pro Tyr Gly
Ile Ala Thr Val Phe Asp Leu Asn 210 215 220 Ala Glu Glu Thr Lys Asn
Ile Val Pro Met Ala Arg Ala Leu Ile Gly225 230 235 240 Asn Arg Ser
Ala Val Val Val Lys Thr Pro Ser Gly Asp Val Lys Ala 245 250 255 Arg
Ala Ile Pro Ala Gly Asn Leu Leu Leu Ile Ala Gln Gly Arg Ser 260 265
270 Val Gln Val Asp Val Ala Ala Gly Ala Glu Ala Ile Met Lys Ala Val
275 280 285 Asp Gly Cys Gly Lys Leu Asp Asn Val Ala Gly Glu Ala Gly
Thr Asn 290 295 300 Ile Gly Gly Met Leu Glu His Val Arg Gln Thr Met
Ala Glu Leu Thr305 310 315 320 Asn Lys Pro Ala Gln Glu Ile Arg Ile
Gln Asp Leu Leu Ala Val Asp 325 330 335 Thr Ala Val Pro Val Ser Val
Thr Gly Gly Leu Ala Gly Glu Phe Ser 340 345 350 Leu Glu Gln Ala Val
Gly Ile Ala Ser Met Val Lys Ser Asp Arg Leu 355 360 365 Gln Met Ala
Leu Ile Ala Arg Glu Ile Glu His Lys Leu Gln Ile Ala 370 375 380 Val
Gln Val Gly Gly Ala Glu Ala Glu Ala Ala Ile Leu Gly Ala Leu385 390
395 400 Thr Thr Pro Gly Thr Thr Arg Pro Leu Ala Ile Leu Asp Leu Gly
Ala 405 410 415 Gly Ser Thr Asp Ala Ser Ile Ile Asn Ala Gln Gly Glu
Ile Ser Ala 420 425 430 Thr His Leu Ala Gly Ala Gly Asp Met Val Thr
Met Ile Ile Ala Arg 435 440 445 Glu Leu Gly Leu Glu Asp Arg Tyr Leu
Ala Glu Glu Ile Lys Lys Tyr 450 455 460 Pro Leu Ala Lys Val Glu Ser
Leu Phe His Leu Arg His Glu Asp Gly465 470 475 480 Ser Val Gln Phe
Phe Pro Ser Ala Leu Pro Pro Ala Val Phe Ala Arg 485 490 495 Val Cys
Val Val Lys Pro Asp Glu Leu Val Pro Leu Pro Gly Asp Leu 500 505 510
Pro Leu Glu Lys Val Arg Ala Ile Arg Arg Ser Ala Lys Ser Arg Val 515
520 525 Phe Val Thr Asn Ala Leu Arg Ala Leu Arg Gln Val Ser Pro Thr
Gly 530 535 540 Asn Ile Arg Asp Ile Pro Phe Val Val Leu Val Gly Gly
Ser Ser Leu545 550 555 560 Asp Phe Glu Ile Pro Gln Leu Val Thr Asp
Ala Leu Ala His Tyr Arg 565 570 575 Leu Val Ala Gly Arg Gly Asn Ile
Arg Gly Cys Glu Gly Pro Arg Asn 580 585 590 Ala Val Ala Ser Gly Leu
Leu Leu Ser Trp Gln Lys Gly Gly Thr His 595 600 605 Gly Glu 610
309116PRTKlebsiella pneumoniae subsp. pneumoniae MGH 78578 309Met
Glu Ser Ser Val Val Ala Pro Ala Ile Val Ile Ala Val Thr Asp1 5 10
15 Glu Cys Ser Glu Gln Trp Arg Asp Val Leu Leu Gly Ile Glu Glu Glu
20 25 30 Gly Ile Pro Phe Val Leu Gln Pro Gln Thr Gly Gly Asp Leu
Ile His 35 40 45 His Ala Trp Gln Ala Ala Gln Arg Ser Pro Leu Gln
Val Gly Ile Ala 50 55 60 Cys Asp Arg Glu Arg Leu Ile Val His Tyr
Lys Asn Leu Pro Ala Ser65 70 75 80 Thr Pro Leu Phe Ser Leu Met Tyr
His Gln Asn Arg Leu Ala Arg Arg 85 90 95 Asn Thr Gly Asn Asn Ala
Ala Arg Leu Val Lys Gly Ile Pro Phe Arg 100 105 110 Asp Arg His Ala
115 310787PRTClostridium butyricum 310Met Ile Ser Lys Gly Phe Ser
Thr Gln Thr Glu Arg Ile Asn Ile Leu1 5 10 15 Lys Ala Gln Ile Leu
Asn Ala Lys Pro Cys Val Glu Ser Glu Arg Ala 20 25 30 Ile Leu Ile
Thr Glu Ser Phe Lys Gln Thr Glu Gly Gln Pro Ala Ile 35 40 45 Leu
Arg Arg Ala Leu Ala Leu Lys His Ile Leu Glu Asn Ile Pro Ile 50 55
60 Thr Ile Arg Asp Gln Glu Leu Ile Val Gly Ser Leu Thr Lys Glu
Pro65 70 75 80 Arg Ser Ser Gln Val Phe Pro Glu Phe Ser Asn Lys Trp
Leu Gln Asp 85 90 95 Glu Leu Asp Arg Leu Asn Lys Arg Thr Gly Asp
Ala Phe Gln Ile Ser 100 105 110 Glu Glu Ser Lys Glu Lys Leu Lys Asp
Val Phe Glu Tyr Trp Asn Gly 115 120 125 Lys Thr Thr Ser Glu Leu Ala
Thr Ser Tyr Met Thr Glu Glu Thr Arg 130 135 140 Glu Ala Val Asn Cys
Asp Val Phe Thr Val Gly Asn Tyr Tyr Tyr Asn145 150 155 160 Gly Val
Gly His Val Ser Val Asp Tyr Gly Lys Val Leu Arg Val Gly 165 170 175
Phe Asn Gly Ile Ile Asn Glu Ala Lys Glu Gln Leu Glu Lys Asn Arg 180
185 190 Ser Ile Asp Pro Asp Phe Ile Lys Lys Glu Lys Phe Leu Asn Ser
Val 195 200 205 Ile Ile Ser Cys Glu Ala Ala Ile Thr Tyr Val Asn Arg
Tyr Ala Lys 210 215 220 Lys Ala Lys Glu Ile Ala Asp Asn Thr Ser Asp
Ala Lys Arg Lys Ala225 230 235 240 Glu Leu Asn Glu Ile Ala Lys Ile
Cys Ser Lys Val Ser Gly Glu Gly 245 250 255 Ala Lys Ser Phe Tyr Glu
Ala Cys Gln Leu Phe Trp Phe Ile His Ala 260 265 270 Ile Ile Asn Ile
Glu Ser Asn Gly His Ser Ile Ser Pro Ala Arg Phe 275 280 285 Asp Gln
Tyr Met Tyr Pro Tyr Tyr Glu Asn Asp Lys Asn Ile Thr Asp 290 295 300
Lys Phe Ala Gln Glu Leu Ile Asp Cys Ile Trp Ile Lys Leu Asn Asp305
310 315 320 Ile Asn Lys Val Arg Asp Glu Ile Ser Thr Lys His Phe Gly
Gly Tyr 325 330 335 Pro Met Tyr Gln Asn Leu Ile Val Gly Gly Gln Asn
Ser Glu Gly Lys 340 345 350 Asp Ala Thr Asn Lys Val Ser Tyr Met Ala
Leu Glu Ala Ala Val His 355 360 365 Val Lys Leu Pro Gln Pro Ser Leu
Ser Val Arg Ile Trp Asn Lys Thr 370 375 380 Pro Asp Glu Phe Leu Leu
Arg Ala Ala Glu Leu Thr Arg Glu Gly Leu385 390 395 400 Gly Leu Pro
Ala Tyr Tyr Asn Asp Glu Val Ile Ile Pro Ala Leu Val 405 410 415 Ser
Arg Gly Leu Thr Leu Glu Asp Ala Arg Asp Tyr Gly Ile Ile Gly 420 425
430 Cys Val Glu Pro Gln Lys Pro Gly Lys Thr Glu Gly Trp His Asp Ser
435 440 445 Ala Phe Phe Asn Leu Ala Arg Ile Val Glu Leu Thr Ile Asn
Ser Gly 450 455 460 Phe Asp Lys Asn Lys Gln Ile Gly Pro Lys Thr Gln
Asn Phe Glu Glu465 470 475 480 Met Lys Ser Phe Asp Glu Phe Met Lys
Ala Tyr Lys Ala Gln Met Glu 485 490
495 Tyr Phe Val Lys His Met Cys Cys Ala Asp Asn Cys Ile Asp Ile Ala
500 505 510 His Ala Glu Arg Ala Pro Leu Pro Phe Leu Ser Ser Met Val
Asp Asn 515 520 525 Cys Ile Gly Lys Gly Lys Ser Leu Gln Asp Gly Gly
Ala Glu Tyr Asn 530 535 540 Phe Ser Gly Pro Gln Gly Val Gly Val Ala
Asn Ile Gly Asp Ser Leu545 550 555 560 Val Ala Val Lys Lys Ile Val
Phe Asp Glu Asn Lys Ile Thr Pro Ser 565 570 575 Glu Leu Lys Lys Thr
Leu Asn Asn Asp Phe Lys Asn Ser Glu Glu Ile 580 585 590 Gln Ala Leu
Leu Lys Asn Ala Pro Lys Phe Gly Asn Asp Ile Asp Glu 595 600 605 Val
Asp Asn Leu Ala Arg Glu Gly Ala Leu Val Tyr Cys Arg Glu Val 610 615
620 Asn Lys Tyr Thr Asn Pro Arg Gly Gly Asn Phe Gln Pro Gly Leu
Tyr625 630 635 640 Pro Ser Ser Ile Asn Val Tyr Phe Gly Ser Leu Thr
Gly Ala Thr Pro 645 650 655 Asp Gly Arg Lys Ser Gly Gln Pro Leu Ala
Asp Gly Val Ser Pro Ser 660 665 670 Arg Gly Cys Asp Val Ser Gly Pro
Thr Ala Ala Cys Asn Ser Val Ser 675 680 685 Lys Leu Asp His Phe Ile
Ala Ser Asn Gly Thr Leu Phe Asn Gln Lys 690 695 700 Phe His Pro Ser
Ala Leu Lys Gly Asp Asn Gly Leu Met Asn Leu Ser705 710 715 720 Ser
Leu Ile Arg Ser Tyr Phe Asp Gln Lys Gly Phe His Val Gln Phe 725 730
735 Asn Val Ile Asp Lys Lys Ile Leu Leu Ala Ala Gln Lys Asn Pro Glu
740 745 750 Lys Tyr Gln Asp Leu Ile Val Arg Val Ala Gly Tyr Ser Ala
Gln Phe 755 760 765 Ile Ser Leu Asp Lys Ser Ile Gln Asn Asp Ile Ile
Ala Arg Thr Glu 770 775 780 His Val Met785 311304PRTClostridium
butyricum 311Met Ser Lys Glu Ile Lys Gly Val Leu Phe Asn Ile Gln Lys
Phe Ser1 5 10 15 Leu His Asp Gly Pro Gly Ile Arg Thr Ile Val Phe
Phe Lys Gly Cys 20 25 30 Ser Met Ser Cys Leu Trp Cys Ser Asn Pro
Glu Ser Gln Asp Ile Lys 35 40 45 Pro Gln Val Met Phe Asn Lys Asn
Leu Cys Thr Lys Cys Gly Arg Cys 50 55 60 Lys Ser Gln Cys Lys Ser
Ala Ala Ile Asp Met Asn Ser Glu Tyr Arg65 70 75 80 Ile Asp Lys Ser
Lys Cys Thr Glu Cys Thr Lys Cys Val Asp Asn Cys 85 90 95 Leu Ser
Gly Ala Leu Val Ile Glu Gly Arg Asn Tyr Ser Val Glu Asp 100 105 110
Val Ile Lys Glu Leu Lys Lys Asp Ser Val Gln Tyr Arg Arg Ser Asn 115
120 125 Gly Gly Ile Thr Leu Ser Gly Gly Glu Val Leu Leu Gln Pro Asp
Phe 130 135 140 Ala Val Glu Leu Leu Lys Glu Cys Lys Ser Tyr Gly Trp
His Thr Ala145 150 155 160 Ile Glu Thr Ala Met Tyr Val Asn Ser Glu
Ser Val Lys Lys Val Ile 165 170 175 Pro Tyr Ile Asp Leu Ala Met Ile
Asp Ile Lys Ser Met Asn Asp Glu 180 185 190 Ile His Arg Lys Phe Thr
Gly Val Ser Asn Glu Ile Ile Leu Gln Asn 195 200 205 Ile Lys Leu Ser
Asp Glu Leu Ala Lys Glu Ile Ile Ile Arg Ile Pro 210 215 220 Val Ile
Glu Gly Phe Asn Ala Asp Leu Gln Ser Ile Gly Ala Ile Ala225 230 235
240 Gln Phe Ser Lys Ser Leu Thr Asn Leu Lys Arg Ile Asp Leu Leu Pro
245 250 255 Tyr His Asn Tyr Gly Glu Asn Lys Tyr Gln Ala Ile Gly Arg
Glu Tyr 260 265 270 Ser Leu Lys Glu Leu Lys Ser Pro Ser Lys Asp Lys
Met Glu Arg Leu 275 280 285 Lys Ala Leu Val Glu Ile Met Gly Ile Pro
Cys Thr Ile Gly Ala Glu 290 295 300 312545PRTAzospirillum
brasilense 312Met Lys Leu Ala Glu Ala Leu Leu Arg Ala Leu Lys Asp
Arg Gly Ala1 5 10 15 Gln Ala Met Phe Gly Ile Pro Gly Asp Phe Ala
Leu Pro Phe Phe Lys 20 25 30 Val Ala Glu Glu Thr Gln Ile Leu Pro
Leu His Thr Leu Ser His Glu 35 40 45 Pro Ala Val Gly Phe Ala Ala
Asp Ala Ala Ala Arg Tyr Ser Ser Thr 50 55 60 Leu Gly Val Ala Ala
Val Thr Tyr Gly Ala Gly Ala Phe Asn Met Val65 70 75 80 Asn Ala Val
Ala Gly Ala Tyr Ala Glu Lys Ser Pro Val Val Val Ile 85 90 95 Ser
Gly Ala Pro Gly Thr Thr Glu Gly Asn Ala Gly Leu Leu Leu His 100 105
110 His Gln Gly Arg Thr Leu Asp Thr Gln Phe Gln Val Phe Lys Glu Ile
115 120 125 Thr Val Ala Gln Ala Arg Leu Asp Asp Pro Ala Lys Ala Pro
Ala Glu 130 135 140 Ile Ala Arg Val Leu Gly Ala Ala Arg Ala Gln Ser
Arg Pro Val Tyr145 150 155 160 Leu Glu Ile Pro Arg Asn Met Val Asn
Ala Glu Val Glu Pro Val Gly 165 170 175 Asp Asp Pro Ala Trp Pro Val
Asp Arg Asp Ala Leu Ala Ala Cys Ala 180 185 190 Asp Glu Val Leu Ala
Ala Met Arg Ser Ala Thr Ser Pro Val Leu Met 195 200 205 Val Cys Val
Glu Val Arg Arg Tyr Gly Leu Glu Ala Lys Val Ala Glu 210 215 220 Leu
Ala Gln Arg Leu Gly Val Pro Val Val Thr Thr Phe Met Gly Arg225 230
235 240 Gly Leu Leu Ala Asp Ala Pro Thr Pro Pro Leu Gly Thr Tyr Ile
Gly 245 250 255 Val Ala Gly Asp Ala Glu Ile Thr Arg Leu Val Glu Glu
Ser Asp Gly 260 265 270 Leu Phe Leu Leu Gly Ala Ile Leu Ser Asp Thr
Asn Phe Ala Val Ser 275 280 285 Gln Arg Lys Ile Asp Leu Arg Lys Thr
Ile His Ala Phe Asp Arg Ala 290 295 300 Val Thr Leu Gly Tyr His Thr
Tyr Ala Asp Ile Pro Leu Ala Gly Leu305 310 315 320 Val Asp Ala Leu
Leu Glu Arg Leu Pro Pro Ser Asp Arg Thr Thr Arg 325 330 335 Gly Lys
Glu Pro His Ala Tyr Pro Thr Gly Leu Gln Ala Asp Gly Glu 340 345 350
Pro Ile Ala Pro Met Asp Ile Ala Arg Ala Val Asn Asp Arg Val Arg 355
360 365 Ala Gly Gln Glu Pro Leu Leu Ile Ala Ala Asp Met Gly Asp Cys
Leu 370 375 380 Phe Thr Ala Met Asp Met Ile Asp Ala Gly Leu Met Ala
Pro Gly Tyr385 390 395 400 Tyr Ala Gly Met Gly Phe Gly Val Pro Ala
Gly Ile Gly Ala Gln Cys 405 410 415 Val Ser Gly Gly Lys Arg Ile Leu
Thr Val Val Gly Asp Gly Ala Phe 420 425 430 Gln Met Thr Gly Trp Glu
Leu Gly Asn Cys Arg Arg Leu Gly Ile Asp 435 440 445 Pro Ile Val Ile
Leu Phe Asn Asn Ala Ser Trp Glu Met Leu Arg Thr 450 455 460 Phe Gln
Pro Glu Ser Ala Phe Asn Asp Leu Asp Asp Trp Arg Phe Ala465 470 475
480 Asp Met Ala Ala Gly Met Gly Gly Asp Gly Val Arg Val Arg Thr Arg
485 490 495 Ala Glu Leu Lys Ala Ala Leu Asp Lys Ala Phe Ala Thr Arg
Gly Arg 500 505 510 Phe Gln Leu Ile Glu Ala Met Ile Pro Arg Gly Val
Leu Ser Asp Thr 515 520 525 Leu Ala Arg Phe Val Gln Gly Gln Lys Arg
Leu His Ala Ala Pro Arg 530 535 540 Glu545 313348PRTRhodococcus sp.
ST-10 313Met Lys Ala Ile Gln Tyr Thr Arg Ile Gly Ala Glu Pro Glu
Leu Thr1 5 10 15 Glu Ile Pro Lys Pro Glu Pro Gly Pro Gly Glu Val
Leu Leu Glu Val 20 25 30 Thr Ala Ala Gly Val Cys His Ser Asp Asp
Phe Ile Met Ser Leu Pro 35 40 45 Glu Glu Gln Tyr Thr Tyr Gly Leu
Pro Leu Thr Leu Gly His Glu Gly 50 55 60 Ala Gly Lys Val Ala Ala
Val Gly Glu Gly Val Glu Gly Leu Asp Ile65 70 75 80 Gly Thr Asn Val
Val Val Tyr Gly Pro Trp Gly Cys Gly Asn Cys Trp 85 90 95 His Cys
Ser Gln Gly Leu Glu Asn Tyr Cys Ser Arg Ala Gln Glu Leu 100 105 110
Gly Ile Asn Pro Pro Gly Leu Gly Ala Pro Gly Ala Leu Ala Glu Phe 115
120 125 Met Ile Val Asp Ser Pro Arg His Leu Val Pro Ile Gly Asp Leu
Asp 130 135 140 Pro Val Lys Thr Val Pro Leu Thr Asp Ala Gly Leu Thr
Pro Tyr His145 150 155 160 Ala Ile Lys Arg Ser Leu Pro Lys Leu Arg
Gly Gly Ser Tyr Ala Val 165 170 175 Val Ile Gly Thr Gly Gly Leu Gly
His Val Ala Ile Gln Leu Leu Arg 180 185 190 His Leu Ser Ala Ala Thr
Val Ile Ala Leu Asp Val Ser Ala Asp Lys 195 200 205 Leu Glu Leu Ala
Thr Lys Val Gly Ala His Glu Val Val Leu Ser Asp 210 215 220 Lys Asp
Ala Ala Glu Asn Val Arg Lys Ile Thr Gly Ser Gln Gly Ala225 230 235
240 Ala Leu Val Leu Asp Phe Val Gly Tyr Gln Pro Thr Ile Asp Thr Ala
245 250 255 Met Ala Val Ala Gly Val Gly Ser Asp Val Thr Ile Val Gly
Ile Gly 260 265 270 Asp Gly Gln Ala His Ala Lys Val Gly Phe Phe Gln
Ser Pro Tyr Glu 275 280 285 Ala Ser Val Thr Val Pro Tyr Trp Gly Ala
Arg Asn Glu Leu Ile Glu 290 295 300 Leu Ile Asp Leu Ala His Ala Gly
Ile Phe Asp Ile Ser Val Glu Thr305 310 315 320 Phe Ser Leu Asp Asn
Gly Ala Glu Ala Tyr Arg Arg Leu Ala Ala Gly 325 330 335 Thr Leu Ser
Gly Arg Ala Val Val Val Pro Gly Leu 340 345 31431DNAArtificial
SequencePrimer 314catgccatgg gactggctga ggcactgctg c
3131547DNAArtificial SequencePrimer 315cgagctcagg aggatatata
tatgaaagct atccagtaca cccgtat 4731632DNAArtificial SequencePrimer
316cgagctctta ttcgcgcggt gccgcgtgca gg 3231734DNAArtificial
SequencePrimer 317gctctagatt acaggcccgg aaccacaacg gcgc
3431846DNAArtificial SequencePrimer 318ccgctcgagg aggatatata
tatgatttct aaaggcttta gcaccc 4631950DNAArtificial SequencePrimer
319acgtgatgta atctagagga ggatatatat atgagcaaag aaattaaagg
5032050DNAArtificial SequencePrimer 320tctttgctca tatatatatc
ctcctctaga ttacatcacg tgttcagtac 5032132DNAArtificial
SequencePrimer 321cgagctctta ttcggcgcca atggtgcacg gg
3232246DNAArtificial SequencePrimer 322ccgctcgagg aggatatata
tatgatttct aaaggcttta gcaccc 4632332DNAArtificial SequencePrimer
323cgagctctta ttcggcgcca atggtgcacg gg 3232426DNAArtificial
SequencePrimer 324cacccaagcg atagtttata tagcgt 2632520DNAArtificial
SequencePrimer 325gaaatgaacg gatattacgt 2032619DNAArtificial
SequencePrimer 326cggaacaggt gattgtggt 1932726DNAArtificial
SequencePrimer 327caccgcccac ttcaagatga agctgt 2632826DNAArtificial
SequencePrimer 328cacccaagcg atagtttata tagcgt 2632920DNAArtificial
SequencePrimer 329gtggctaagt acatgccggt 2033035DNAArtificial
SequencePrimer 330ggaattccat atgacaaaga atatgacgac taaac
3533132DNAArtificial SequencePrimer 331cgggatcctt attatttccc
ctgccctgca gt 3233232DNAArtificial SequencePrimer 332ggaattccat
atgagctatc aaccactttt ac 3233329DNAArtificial SequencePrimer
333cgggatcctt acagttgagc aaatgatcc 29
* * * * *