U.S. patent application number 14/508927 was filed with the patent office on 2014-10-07 and published on 2016-04-07 for methods of making surfactant and cleaning compositions through microbially produced branched fatty alcohols.
This patent application is currently assigned to LS9, INC. The applicant listed for this patent is Mathew RUDE. Invention is credited to Mathew RUDE.
Application Number | 20160097065 14/508927 |
Document ID | / |
Family ID | 44120899 |
Filed Date | 2014-10-07 |
United States Patent
Application |
20160097065 |
Kind Code |
A1 |
RUDE; Mathew |
April 7, 2016 |
METHODS OF MAKING SURFACTANT AND CLEANING COMPOSITIONS THROUGH
MICROBIALLY PRODUCED BRANCHED FATTY ALCOHOLS
Abstract
The invention provides a surfactant and/or a cleaning
composition comprising a microbially produced branched fatty
alcohol or a derivative thereof. The invention also provides a
household cleaning composition and a personal or pet care cleaning
composition comprising a microbially produced branched fatty
alcohol or a derivative thereof.
Inventors: |
RUDE; Mathew; (South San
Francisco, CA) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
RUDE; Mathew |
South San Francisco |
CA |
US |
|
|
Assignee: |
LS9, INC.
South San Francisco
CA
|
Family ID: |
44120899 |
Appl. No.: |
14/508927 |
Filed: |
October 7, 2014 |
Current U.S.
Class: |
435/134 |
Current CPC
Class: |
C11D 1/29 20130101; C11D
1/62 20130101; C11D 3/202 20130101; C12P 7/04 20130101; C07C 33/025
20130101; C11D 1/75 20130101; C11D 1/28 20130101; C07C 31/125
20130101; C12P 7/64 20130101; C11D 1/72 20130101; C11D 1/345
20130101; C11D 1/662 20130101 |
International
Class: |
C12P 7/64 20060101
C12P007/64 |
Claims
1-31. (canceled)
32: A method of making a surfactant composition using branched
chain fatty alcohols produced in a recombinant host cell, the
method comprising, (a) providing a recombinant host cell
genetically modified to comprise (i) a polynucleotide encoding a
polypeptide comprising one or more subunits having branched chain
alpha-keto acid dehydrogenase (BKD) activity (E.C. 1.2.4.4.)
capable of catalyzing a conversion of an alpha-keto acid to a
branched acyl-CoA, (ii) a polynucleotide encoding a polypeptide
having beta-ketoacyl-ACP synthase (FabH) activity capable of
catalyzing a condensation of a branched acyl-CoA and a malonyl-ACP
to produce a branched acyl-ACP, and (iii) a polynucleotide encoding
a polypeptide having fatty aldehyde biosynthesis activity capable
of catalyzing a conversion of a branched fatty acyl-ACP into a
branched fatty aldehyde; (b) culturing the recombinant host cell in
the presence of a carbon source under conditions effective to
express the polynucleotides and produce branched chain fatty
alcohols that are secreted into the extracellular environment of
the host cell; (c) collecting the branched chain fatty alcohols;
and (d) blending the branched chain fatty alcohols to make a
surfactant composition.
33: The method of claim 32, further comprising a polypeptide having
fatty alcohol biosynthesis activity, wherein said polypeptide is an
alcohol dehydrogenase (EC 1.1.1.1).
33. (canceled)
34: The method of claim 32, wherein said one or more subunits are
selected from the group consisting of E1 alpha/beta
(decarboxylase), E2 (dihydrolipoyl transacylase), and E3
(dihydrolipoyl dehydrogenase) subunits.
35: The method of claim 32, wherein said one or more subunits are
selected from the group consisting of E1 alpha/beta (decarboxylase)
and E2 (dihydrolipoyl transacylase) subunits.
36: The method of claim 32, wherein said polypeptide having fatty
aldehyde biosynthesis activity is an acyl-ACP reductase (AAR).
37: The method of claim 36, wherein said AAR is an enzyme from
Synechococcus elongatus.
38: The method of claim 32, wherein said polypeptide having fatty
aldehyde biosynthesis activity is carboxylic acid reductase
(CAR).
39: The method of claim 32, wherein the recombinant host cell is an
E. coli host cell.
40: The method of claim 32, wherein the branched chain fatty
alcohols include one or more of saturated or unsaturated C12,
C14 and C16 fatty alcohols.
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. Pat. No.
8,859,259, filed Feb. 14, 2011, which claims the benefit of U.S.
Provisional Patent Application No. 61/304,448, filed Feb. 14, 2010,
and U.S. Provisional Patent Application No. 61/324,310, filed Apr.
15, 2010, the entire contents of which are hereby incorporated by
reference.
BACKGROUND OF THE INVENTION
[0002] Fatty alcohols have many commercial uses. Worldwide annual
sales of fatty alcohols and their derivatives are in excess of US$1
billion. Fatty alcohols are used in diverse industries. For
example, they are used in the cosmetic and food industries as
emulsifiers, emollients, and thickeners. Due to their amphiphilic
nature, fatty alcohols can be formulated or be used per se as
nonionic surfactants, which are useful in personal care and
household products, for example, in detergents. In addition, fatty
alcohols are used in waxes, gums, resins, pharmaceutical salves and
lotions, lubricating oil additives, textile antistatic and
finishing agents, plasticizers, cosmetics, industrial solvents, and
solvents for fats.
[0003] One major use for fatty alcohols is in cleaning
compositions. In addition, fatty alcohols find applicability
as surfactants, which are, for example, capable of enhancing oil
recovery and/or engine performance. Conventional surfactants
comprise molecules having at least one water-solubilizing
substituent or moiety (e.g., hydrophilic group) and at least one
oleophilic substituent or moiety (e.g., hydrophobic group).
Examples of hydrophilic groups include, without limitation,
carboxylate, sulfate, sulfonate, amine oxide, or polyoxyethylene.
Examples of the hydrophobic groups include, without limitation,
alkyl, alkenyl, or alkaryl hydrophobes, which typically contain
about 10 to about 20 carbon atoms.
[0004] Surfactants are typically regarded as the major force behind
cleaning products' ability to break up stains, solubilize dirt and
soil, and/or prevent their redeposition to surfaces. As such,
surfactants are also referred to as wetting agents and foamers,
which lower the surface tension of the medium in which they are
dissolved. Capable of lowering the interfacial tension between two
media or interfaces (e.g., air/water, water/oil, or oil/solid
interfaces), surfactants play a key role, and are often the most
important component in detergents. Conventional detergent
compositions contain mixtures of various surfactants in order to
remove different types of soils and stains from surfaces.
[0005] The earliest utilized sources of hydrophobe groups were
natural fats and oils, which were converted into soaps (e.g.,
carboxylate hydrophile) using base via saponification processes.
Coconut and palm oils are to this day used to manufacture soaps and
alkylsulfate surfactants. As edible oils became scarcer, it
became increasingly prevalent to manufacture detergents from
petrochemicals, using processes such as the Ziegler process to
convert petroleum-derived ethylene to fatty alcohols. For example,
ethylene has been converted into alkyl benzene sulfonate
surfactants, which are commonly found in today's detergents and
cleaning compositions.
[0006] Fatty alcohols can also serve as starting materials in the
preparation of surfactants and of other cleaning composition
ingredients including, for example, alkyl sulfates, fatty ether
sulfates, fatty alcohol sulfates, fatty phosphate esters,
alkylbenzyl dimethylammonium salts, fatty amine oxides, alkyl
polyglucosides, and alkyl glyceryl ether sulfonates. Among these,
alkyl sulfates are commonly known due to the ease of their
manufacture as well as their improved solubility and surfactant
characteristics over traditional soap-based surfactants. However,
long-chain alkyl surfactants have less than optimal performance as
surfactants or as component(s) of detergents at low temperatures
(e.g., about 50° C. or lower, about 30° C. or
lower).
[0007] While there have been isolated reports that branching,
especially towards the middle part of the long-chain alkyl, can
reduce solubility of the surfactant, others have described that, in
commercial practices, branching in fatty alcohols is highly
desirable. See, e.g., R. G. Laughlin, "The Aqueous Phase Behavior of
Surfactants," Academic Press, N.Y., (1994), at page 347; but see,
Finger et al., Detergent alcohols--the effect of alcohol structure
and molecular weight on surfactant properties, J. Amer. Oil
Chemists' Society, Vol. 44:525 (1967); Technical Bulletin, Shell
Chemical Co., SC:164-80. In addition, K. R. Wormuth, et al.,
Langmuir, vol 7 (1991):2048-2053, describes the technical
advantages observed with a number of branched alkyl sulfates,
especially with the "branched Guerbet" type, derived from the
highly branched "Exxal" alcohols (Exxon). Phase studies have
established a lipophilic ranking (i.e., a hydrophobicity ranking)
of highly branched/double tail > methyl branched > linear.
Furthermore, patents and applications, including, for example, U.S.
Pat. No. 6,008,181, indicate that certain branched or
multi-branched fatty alcohol derivatives exhibit improved cleaning
capacity, especially at lower temperatures.
[0008] Branched fatty alcohols and various precursors are known to
have additional preferred properties such as considerably lower
melting points, which can in turn confer lower pour points when
made into industrial chemicals, as compared to linear alcohols of
comparable molecular weights. They are also known to confer
substantially lower volatility and vapor pressure, and improved
stability against oxidation and rancidity than their linear
counterparts. These additional preferred properties, in addition to
making branched materials desirable surfactants, make them
particularly suited as components or feedstocks for cosmetic and
pharmaceutical applications, as components of plasticizers for
making synthetic resins, as solvents for solutions for printing ink
and specialty inks, or as industrial lubricants.
[0009] Those added preferred properties can be alternatively
obtained from unsaturated fatty alcohols and precursors. But
unsaturation promotes oxidation, leading to short shelf lives and
corrosion. Thus desirable properties, e.g., lower melting points,
pour points, volatility, and vapor pressure and improved oxidative
stability, are better achieved via branching.
[0010] Obtaining branched materials from crude petroleum requires a
significant financial investment and consumes a great deal
of energy. It is also an inefficient process because frequently it
is necessary to crack the long chain hydrocarbons in crude
petroleum to produce smaller monomers, which only then become
useful as raw materials for manufacturing complex specialty
chemicals. Furthermore, it is commonplace in the petrochemical
industry to obtain branched chemicals, such as branched alcohols
and aldehydes, by isomerization of straight-chain hydrocarbons.
Expensive catalysts are typically required for isomerization, thus
increasing manufacturing cost. The catalysts often then become
undesirable contaminants that are removed from the finished
products, adding yet further cost to the processes.
[0011] Obtaining specialty chemicals such as branched alcohols or
derivatives from crude petroleum also drains the dwindling resource
of petroleum, in addition to the cost and problems associated with
exploring, extracting, transporting, and refining. One estimate of
world petroleum consumption is 30 billion barrels per year. By some
estimates, it is predicted that at current production levels, the
world's petroleum reserves could be depleted before 2050.
[0012] Finally, processing and manufacturing of surfactants and/or
detergents from petroleum inevitably releases greenhouse gases
(e.g., in the form of carbon dioxide) and other forms of air
pollution (e.g., carbon monoxide, sulfur dioxide, etc.). The
accumulation of greenhouse gases in the atmosphere can lead to
increased global warming, causing local pollution and spillage as
well as global environmental detriments.
[0013] Thus, although it is possible to obtain branched fatty
alcohols and derivatives from natural oils and petroleum, it would
be desirable to produce these branched materials from other
sources, such as directly from biomass.
SUMMARY OF THE INVENTION
[0014] The invention provides a surfactant composition and a
cleaning composition comprising one or more microbially produced
branched fatty alcohols, branched fatty alcohol precursors, or
branched fatty alcohol derivatives thereof.
[0015] The invention provides a surfactant composition comprising
about 0.001 wt. % to about 100 wt. % of one or more microbially
produced branched fatty alcohols or branched alcohol derivatives
thereof.
[0016] The invention also provides a liquid cleaning composition
comprising (a) about 0.1 wt. % to about 50 wt. % of one or more
microbially produced branched fatty alcohols or derivatives
thereof, or about 0.1 wt. % to about 50 wt. % of a surfactant
comprising one or more microbially produced branched fatty alcohols
or derivatives thereof, (b) about 1 wt. % to about 30 wt. % of one
or more co-surfactants, (c) about 0 wt. % to about 10 wt. % of one
or more detergency builders, (d) 0 wt. % to about 2 wt. % of one or
more enzymes, (e) about 0 wt. % to about 15 wt. % of one or more
chelating agents, (f) about 0 wt. % to about 20 wt. % of one or
more hydrotropes, (g) about 0 wt. % to about 1.0 wt. % of one or
more organic sequestering agents, and (h) about 0.1 wt. % to about
98 wt. % of a solvent system. In some embodiments, the liquid
cleaning composition further comprises one or more suitable
adjuncts.
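The component ranges recited for the liquid cleaning composition can be checked mechanically. The following is a minimal sketch, not part of the disclosure: it assumes the ranges of components (a) through (h) as transcribed from the paragraph above, treats the two alternatives of component (a) as a single entry, and ignores the "about" qualifier. The dictionary keys and the function name `out_of_range` are hypothetical labels chosen for illustration.

```python
# Nominal wt. % ranges for components (a)-(h) of the liquid cleaning
# composition, transcribed from the paragraph above. The keys are
# hypothetical labels; the "about" qualifier is ignored in this sketch.
LIQUID_RANGES = {
    "branched_fatty_alcohol_or_surfactant": (0.1, 50.0),  # (a)
    "co_surfactants": (1.0, 30.0),                        # (b)
    "detergency_builders": (0.0, 10.0),                   # (c)
    "enzymes": (0.0, 2.0),                                # (d)
    "chelating_agents": (0.0, 15.0),                      # (e)
    "hydrotropes": (0.0, 20.0),                           # (f)
    "organic_sequestering_agents": (0.0, 1.0),            # (g)
    "solvent_system": (0.1, 98.0),                        # (h)
}


def out_of_range(formulation, ranges=LIQUID_RANGES):
    """Return the names of components whose wt. % is outside its range.

    Components absent from the formulation are treated as 0 wt. %.
    """
    problems = []
    for name, (low, high) in ranges.items():
        amount = formulation.get(name, 0.0)
        if not (low <= amount <= high):
            problems.append(name)
    return problems


# Hypothetical formulation, for illustration only.
example = {
    "branched_fatty_alcohol_or_surfactant": 10.0,
    "co_surfactants": 5.0,
    "solvent_system": 85.0,
}
print(out_of_range(example))
```

A formulation that omits a required component, or exceeds an upper bound, is reported by name, so the same helper could be reused with a different range table for the solid or household compositions.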
[0017] The invention further provides a solid cleaning composition
comprising (a) about 0.1 wt. % to about 50 wt. % of one or more
microbially produced branched fatty alcohols or derivatives
thereof, or about 0.1 wt. % to about 50 wt. % of a surfactant
comprising one or more microbially produced branched fatty alcohols
or derivatives thereof, (b) about 1 wt. % to about 30 wt. % of one
or more co-surfactants, (c) about 1 wt. % to about 60 wt. % of one
or more detergency builders, (d) about 0 wt. % to about 2 wt. % of
one or more enzymes, (e) about 0 wt. % to about 20 wt. % of one or
more hydrotropes, (f) about 10 wt. % to about 35 wt. % of one or
more filler salts, (g) about 0 wt. % to about 15 wt. % of one or
more chelating agents, and (h) about 0.01 wt. % to about 1 wt. % of
one or more organic sequestering agents. In certain embodiments,
the solid cleaning composition further comprises one or more
suitable adjuncts.
[0018] In particular embodiments, the invention pertains to a
household cleaning composition comprising (a) about 0.1 wt. % to
about 50 wt. % of one or more microbially produced branched fatty
alcohols and/or derivatives thereof, or about 0.1 wt. % to about 50
wt. % of a surfactant comprising one or more microbially produced
fatty alcohols and/or derivatives thereof; (b) about 1 wt. % to
about 30 wt. % of one or more co-surfactants; (c) about 0 wt. % to
about 30 wt. % of one or more detergency builders; (d) about 0 wt.
% to about 2.0 wt. % of one or more suitable detersive enzymes; (e)
about 0 wt. % to about 15 wt. % of one or more chelating agents; (f)
about 0 wt. % to about 20 wt. % of one or more hydrotropes; (g)
about 0 to about 15 wt. % of one or more rheology modifiers; (h)
about 0 wt. % to about 1.0 wt. % of one or more organic
sequestering agents; and (i) various other adjuncts such as, for
example, one or more of bleaching agents, additional enzymes, suds
suppressors, dispersants, lime-soap dispersants, soil suspension
and anti-redeposition agents, and corrosion inhibitors. In an
exemplary embodiment, a laundry composition can also comprise
softening agents, fragrances, bleach systems, dyes or colorants,
preservatives, germicides, fungicides, fabric care benefit agents,
gelling agents, antideposition agents, and other detersive
adjuncts.
[0019] Such a household cleaning composition can be a liquid, which
further comprises water and/or a suitable aqueous carrier or
solvent. Liquid compositions can be in a "concentrated" form, the
density of which can range from, for example, about 400 to about
1,200 g/L, when measured at 200° C. For example, the water
content of a typical concentrated liquid detergent is less than
about 40 wt. %, or less than about 30 wt. %. Alternatively, a
household cleaning composition can be a solid, for example, in the
form of a tablet, a bar, a powder or a granule. Granular
compositions can also be in a "compact" form, which is best
reflected by density and, in terms of composition, by the amount of
inorganic filler salt. Inorganic filler salts are conventional
ingredients of solid cleaning compositions, present in substantial
amounts, varying from, for example, about 10 wt. % to about 35 wt.
%. Suitable filler salts include, for example, alkali and
alkaline-earth metal salts of sulfates and chlorides. An exemplary
filler salt is sodium sulfate.
[0020] In another embodiment, the invention provides a personal or
beauty care cleaning or treatment composition comprising (a) about
0.1 wt. % to about 50 wt. % of one or more microbially produced
branched fatty alcohols and/or derivatives thereof, or about 0.1
wt. % to about 50 wt. % of a surfactant comprising one or more
microbially produced branched fatty alcohols and/or derivatives
thereof; (b) about 0.001 wt. % to about 30 wt. % of one or more
co-surfactants; (c) about 0 wt. % to about 30 wt. % of one or more
detergency builders; (d) about 0 wt. % to about 2.0 wt. % of one or
more suitable detersive enzymes; (e) about 0 wt. % to about 15 wt.
% of one or more chelating agents; (f) about 0 wt. % to about 20 wt. %
of one or more hydrotropes; (g) about 0 to about 15 wt. % of one or
more rheology modifiers; (h) about 0 wt. % to about 1.0 wt. % of one
or more organic sequestering agents; and (i) various other adjuncts
such as, for example, one or more of conditioner, silicone,
fragrances, silica particles, cationic cellulose or guar polymers,
silicone microemulsion stabilizers, fatty amphiphiles, germicides,
fungicides, anti-dandruff agents, pearlescent agents, foam
boosters, pediculocides, pH adjusting agents, UV absorbers,
sunscreens, skin active agents, vitamins, minerals,
herbal/fruit/food extracts, sphingolipids, sensory indicators,
suspension agents, and mixtures thereof.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] FIG. 1A and FIG. 1B are schematics of two exemplary
alternative pathways for producing branched fatty alcohols using
recombinant microbial host cells.
[0022] FIG. 2A lists representative homologs of BKD E1 alpha
subunit, their amino acid sequences and polynucleotide sequences,
as well as amino acid sequence motifs of suitable BKD E1 alpha
subunit homologs and variants. FIG. 2B lists representative
homologs of BKD E1 beta subunit, their amino acid sequences and
polynucleotide sequences, as well as amino acid sequence motifs of
suitable BKD E1 beta subunit homologs and variants. FIG. 2C lists
representative homologs of BKD E2 subunit, their amino acid
sequences and polynucleotide sequences, as well as amino acid sequence
motifs of suitable BKD E2 subunit homologs and variants. FIG. 2D
lists representative BKD E3 subunit homologs and
variants, as well as amino acid sequence motifs of suitable BKD E3
subunit homologs and variants. FIG. 2E lists representative
beta ketoacyl-ACP synthase homologs, their amino acid
sequences and polynucleotide sequences, as well as amino acid
sequences of suitable beta ketoacyl-ACP synthase homologs and
variants.
[0023] FIG. 3A is a table of BKD E1 alpha subunit homologs. FIG. 3B
is a table of BKD E1 beta subunit homologs. FIG. 3C is a table of
BKD E2 subunit homologs. FIG. 3D is a table of BKD E3 subunit
homologs. FIG. 3E is a table of beta ketoacyl-ACP synthase
homologs. These tables also present % identity in reference to the
sequences of various organisms. For example, "ID % Pp" indicates
that the identities listed in the column below are in reference to a
P. putida gene encoding that subunit. "ID % Bs" refers to the
identity to a B. subtilis gene encoding that subunit. "ID % Sc" and
"ID % Sc2" refer to identity to a first and a second S. coelicolor
gene encoding that subunit, respectively. "ID % Sa" and "ID % Sa2"
refer to identity to a first and a second S. avermitilis gene
encoding that subunit, respectively.
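A percent-identity value of the kind tabulated in FIGS. 3A-3E can be illustrated with a short sketch. The text does not state which alignment method or denominator was used for the tables, so the following assumes two sequences that have already been aligned (gaps written as "-") and counts identical, non-gap columns over the full alignment length, which is only one of several conventions in use. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Assumption of this sketch: identity = identical non-gap columns
    divided by the total alignment length, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments, for illustration only.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
```

Other conventions divide by the length of the shorter sequence or exclude gapped columns from the denominator, so values computed this way need not match the figures exactly.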
[0024] FIG. 4A depicts a GC/MS trace of branched fatty alcohol
production of strain MG1655_ΔtonA AAR:kan transformed
with a pGL10 vector containing P. putida Pput1450, Pput1451,
Pput1452 and Pput1453 inserts, and with B. subtilis fabH1. The
figure indicates the production of iso-C14:0, iso-C15:0,
anteiso-C15:0, iso-C16:0, iso-C17:0 and
anteiso-C17:0 branched fatty alcohols. FIG. 4B depicts the
production of branched fatty acyl-CoA precursors by feeding
branched substrates isobutyrate and isovalerate to an engineered E.
coli strain comprising the pDG10 and OP-180 plasmids, the latter
plasmid containing tesA under the control of a Ptrc promoter.
[0025] FIG. 5 is a representative calibration curve obtained by
linear regression, which was used in the semi-quantitative
measurement of the amount of branched fatty alcohol yield relative
to the amount of straight-chain fatty alcohol yield.
[0026] FIG. 6A is a listing of nucleotide sequence of the pDG2
plasmid. FIG. 6B depicts a map of the pDG6 plasmid. FIG. 6C is a
listing of nucleotide sequence of the pDG6 plasmid, constructed by
inserting B. subtilis fabH1 into pDG2, comprising E. coli PfabH1
(promoter) and B. subtilis fabH1. The B. subtilis fabH1 insert is
in upper case italic letters. FIG. 6D depicts a map of the pDG7
plasmid. FIG. 6E is a listing of nucleotide sequence of the pDG7
plasmid, constructed by inserting a B. subtilis fabH2 into pDG2,
comprising E. coli PfabH1 (promoter) and B. subtilis fabH2. FIG. 6F
depicts a map of the pDG8 plasmid. FIG. 6G is a listing of
nucleotide sequence of pDG8 plasmid, constructed by inserting S.
coelicolor fabH into pDG2, comprising E. coli PfabH1 (promoter) and
S. coelicolor fabH. FIG. 6H is a plasmid map of the pDG10 plasmid.
FIG. 6I is a listing of nucleotide sequence of the pDG10 plasmid,
comprising a C. acetobutylicum ptb_buk insert. FIG. 6J is a listing
of nucleotide sequence of the pLS9-111 plasmid. FIG. 6K is a
listing of nucleotide sequence of the pLS9-114 plasmid. FIG. 6L is
a listing of nucleotide sequence of the pLS9-115 plasmid.
[0027] FIG. 7 is a listing of nucleotide sequence of the pKZ4
plasmid having a pGL10.173B vector backbone and a polynucleotide
insert encoding a BKD complex from Pseudomonas putida. The P.
putida genes encoding a BKD complex are shown in lower case italic
letters.
[0028] FIG. 8 is a listing of nucleotide sequence of the pGL10.173B
vector backbone, which contains the BamHI and EcoRI sites into which
the Pseudomonas putida bkd genes (operon) were inserted. The BamHI
and EcoRI restriction sites are marked.
[0029] FIG. 9 is a listing of additional nucleotide and amino acid
sequences of the disclosure.
DETAILED DESCRIPTION OF THE INVENTION
[0030] Unless otherwise defined, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this invention belongs. Although
methods and materials similar or equivalent to those described
herein can be used in the practice or testing of the invention,
suitable methods and materials are described below. All
publications, patent applications, patents, and other references
mentioned herein, including GenBank database sequences, are
incorporated by reference in their entirety. In case of conflict,
the present specification, including definitions, will control. In
addition, the materials, methods, and examples are illustrative
only and not intended to be limiting.
[0031] Other features and advantages of the invention will be
apparent from the following detailed description, and from the
claims.
DEFINITIONS
[0032] Throughout the specification, a reference may be made using
an abbreviated gene name or polypeptide name, but it is understood
that such an abbreviated gene or polypeptide name represents the
genus of genes or polypeptides. Such gene names include all genes
encoding the same polypeptide and homologous polypeptides having
the same physiological function. Polypeptide names include all
polypeptides and homologous polypeptides that have the same
activity (e.g., that catalyze the same fundamental chemical
reaction).
[0033] Unless otherwise indicated, the accession numbers referenced
herein are derived from the NCBI database (National Center for
Biotechnology Information) maintained by the National Institutes of
Health, U.S.A. Unless otherwise indicated, the accession numbers
are as provided in the database as of December 2009.
[0034] EC numbers are established by the Nomenclature Committee of
the International Union of Biochemistry and Molecular Biology
(NC-IUBMB) (available at http://www.chem.qmul.ac.uk/iubmb/enzyme/).
The EC numbers referenced herein are
derived from the KEGG Ligand database, maintained by the Kyoto
Encyclopedia of Genes and Genomes, sponsored in part by the
University of Tokyo. Unless otherwise indicated, the EC numbers are
as provided in the database as of October 2008.
[0035] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e., to at least one) of the grammatical object
of the article. By way of example, "an element" means one element
or more than one element.
[0036] The term "about" is used herein to mean a value ±20% of a
given numerical value. Thus, "about 60%" means a value of between
60±(20% of 60) (i.e., between 48 and 72).
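The arithmetic behind this definition can be sketched in a few lines. The snippet is illustrative only; the function name `about_range` and its `tolerance` parameter are hypothetical and assume the ±20% rule stated above. For "about 60%", the rule gives bounds of 60 - 12 = 48 and 60 + 12 = 72.

```python
def about_range(value, tolerance=0.20):
    """Return the (low, high) bounds covered by "about value".

    Assumes the +/-20% definition given in the text; the tolerance
    parameter is an illustrative generalization, not from the source.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


low, high = about_range(60)
print(low, high)
```

The same helper applies to any recited quantity; for example, an upper limit of "about 50 wt. %" would, under this definition, extend to 60 wt. %.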
[0037] The term "alkyl" is used herein to mean a straight chain or
a branched chain hydrocarbon residue having from about 6 carbon
atoms to about 26 carbon atoms and in the context of the present
specification is used interchangeably with the term "fatty."
[0038] As used herein, the term "alcohol dehydrogenase" (EC
1.1.1.*) refers to a polypeptide capable of catalyzing the
conversion of a fatty aldehyde to an alcohol (e.g., fatty alcohol).
In certain embodiments, these enzymes can also be referred to as
fatty aldehyde reductases, oxidoreductases, or aldo-keto reductases.
Additionally, one of ordinary skill in the art will appreciate that
some alcohol dehydrogenases will catalyze other reactions as well.
For example, some alcohol dehydrogenases will accept other
substrates in addition to fatty aldehydes. Such non-specific
alcohol dehydrogenases are, therefore, also included in this
definition. Nucleic acid sequences encoding alcohol dehydrogenases
are known in the art, and such alcohol dehydrogenases are publicly
available. Exemplary GenBank Accession Numbers are provided in
Table 8 herein.
[0039] As used herein, the term "attenuate" means to weaken,
reduce, or diminish. For example, a polypeptide can be attenuated
by modifying the polypeptide to reduce its activity (e.g., by
modifying a nucleotide sequence that encodes the polypeptide) or
its expression level.
[0040] As used herein, the term "biomass" refers to any biological
material from which a carbon source is derived. In some instances,
a biomass is processed into a carbon source, which is suitable for
bioconversion. In other instances, the biomass may not require
further processing into a carbon source. The carbon source can be
converted into a fatty alcohol. One exemplary source of biomass is
plant matter or vegetation. For example, corn, sugar cane, or
switchgrass can be used as biomass. Another non-limiting example of
biomass is metabolic wastes, such as animal matter, for example cow
manure. In addition, biomass may include algae and other marine
plants. Biomass also includes waste products from industry,
agriculture, forestry, and households. Examples of such waste
products that can be used as biomass are fermentation waste,
ensilage, straw, lumber, sewage, garbage, cellulosic urban waste,
and food leftovers. Biomass also includes carbon sources such as
carbohydrates (e.g., monosaccharides, disaccharides, or
polysaccharides).
[0041] As used herein, the phrase "carbon source" refers to a
substrate or compound suitable to be used as a source of carbon for
prokaryotic or simple eukaryotic cell growth. Carbon sources can be
in various forms, including, but not limited to polymers,
carbohydrates, acids, alcohols, aldehydes, ketones, amino acids,
peptides, and gases (e.g., CO and CO2). These include, for
example, various monosaccharides, such as glucose, fructose,
mannose, and galactose; oligosaccharides, such as
fructo-oligosaccharide and galacto-oligosaccharide; polysaccharides
such as xylose and arabinose; disaccharides, such as sucrose,
maltose, and turanose; cellulosic material, such as methyl
cellulose and sodium carboxymethyl cellulose; saturated or
unsaturated fatty acid esters, such as succinate, lactate, and
acetate; alcohols, such as ethanol, methanol, and glycerol, or
mixtures thereof. The carbon source can also be a product of
photosynthesis, including, but not limited to, glucose. A preferred
carbon source is biomass. Another preferred carbon source is
glucose.
[0042] A nucleotide sequence is "complementary" to another
nucleotide sequence if each of the bases of the two sequences
matches (i.e., is capable of forming Watson-Crick base pairs). The
term "complementary strand" is used herein interchangeably with the
term "complement". The complement of a nucleic acid strand can be
the complement of a coding strand or the complement of a non-coding
strand.
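Complementarity as defined above can be illustrated with a brief sketch. It assumes DNA sequences containing only A, C, G, and T, and assumes the usual antiparallel convention in which both strands are written 5' to 3', so the complementary strand is the reverse complement. The function names are hypothetical.

```python
# Watson-Crick base pairs for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(seq):
    """Return the complementary strand written 5'->3' (reverse complement)."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def is_complementary(seq_a, seq_b):
    """True if every base of seq_a pairs with seq_b in antiparallel order."""
    return complement_strand(seq_a) == seq_b.upper()


print(complement_strand("ATGC"))
print(is_complementary("ATGC", "GCAT"))
```

Because the complement of a complement is the original strand, the same function covers both the coding and the non-coding strand cases mentioned in the definition.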
[0043] As used herein, the term "conditions sufficient to allow
expression" means any conditions that allow a host cell to produce
a desired product, such as a polypeptide or fatty alcohol described
herein. Suitable conditions include, for example, fermentation
conditions. Fermentation conditions can comprise many parameters,
such as temperature ranges, levels of aeration, and media
composition. Each of these conditions, individually and in
combination, allows the host cell to grow. Exemplary culture media
include broths or gels. Generally, the medium includes a carbon
source, such as glucose, fructose, cellulose, or the like, that can
be metabolized by a host cell directly. In addition, enzymes can be
used in the medium to facilitate the mobilization (e.g., the
depolymerization of starch or cellulose to fermentable sugars) and
subsequent metabolism of the carbon source.
[0044] To determine if conditions are sufficient to allow
expression, a host cell can be cultured, for example, for about 4,
8, 12, 24, 36, or 48 hours. During and/or after culturing, samples
can be obtained and analyzed to determine if the conditions allow
expression. For example, the host cells in the sample or the medium
in which the host cells were grown can be tested for the presence
of a desired product. When testing for the presence of a product,
assays, such as TLC, HPLC, GC/FID, GC/MS, LC/MS, and MS, can be
used.
[0045] It is understood that the polypeptides described herein may
have additional conservative or non-essential amino acid
substitutions, which do not have a substantial effect on the
polypeptide functions. Whether or not a particular substitution
will be tolerated (i.e., will not adversely affect desired
biological properties, such as carboxylic acid reductase activity)
can be determined as described in Bowie et al., Science, 247:
1306-1310 (1990). A "conservative amino acid substitution" refers
to the replacement of one amino acid residue with another amino
acid residue having a similar side chain. Families of amino acid
residues having similar side chains have been defined in the art.
These families include amino acids with basic side chains (e.g.,
lysine, arginine, histidine), acidic side chains (e.g., aspartic
acid, glutamic acid), uncharged polar side chains (e.g., glycine,
asparagine, glutamine, serine, threonine, tyrosine, cysteine),
nonpolar side chains (e.g., alanine, valine, leucine, isoleucine,
proline, phenylalanine, methionine, tryptophan), beta-branched side
chains (e.g., threonine, valine, isoleucine), and aromatic side
chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
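The side-chain families listed above can be expressed as a small lookup table. The following is a minimal sketch only: the one-letter residue codes, the dictionary layout, and the helper names `shared_families` and `is_conservative` are illustrative assumptions, not part of the specification. Sharing a family does not establish that a substitution is tolerated in a given polypeptide.

```python
# Side-chain families transcribed from the paragraph above,
# written with standard one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),             # lysine, arginine, histidine
    "acidic": set("DE"),             # aspartic acid, glutamic acid
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of families containing both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(a, b):
    """Hypothetical helper: True if two different residues share a family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))
```

For example, `is_conservative("K", "R")` evaluates to True because lysine and arginine both have basic side chains, whereas `is_conservative("D", "K")` evaluates to False.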
[0046] As used herein, "control element" means a transcriptional
and/or a translational control element. Control elements include
promoters and enhancers, such as ribosome binding sequences. The
term "promoter element," "promoter," or "promoter sequence" refers
to a DNA sequence that functions as a switch that activates the
expression of a gene. If the gene is activated, it is said to be
transcribed or participating in transcription. Transcription
involves the synthesis of mRNA from the gene. A promoter,
therefore, serves as a transcriptional regulatory element and also
provides a site for initiation of transcription of the gene into
mRNA. Control elements interact specifically with cellular proteins
involved in transcription (Maniatis et al., Science, 236: 1237
(1987)).
[0047] As used herein, the term "detergent" refers broadly to
agents and materials that are useful in cleaning applications or as
cleaning aids. This term is thus used interchangeably with the term
"cleaning composition." The term encompasses materials and agents
that comprise various surfactants at various percentages by weight
or by volume, as well as suitable additives, and are capable of
emulsifying stains in a cleaning matrix. A detergent can take the
physical form of, for example, a liquid, a paste, a gel, a bar, a
powder, a tablet, or a granule. Granular compositions can also be
in "compact" form, whereas liquid compositions can be in
"concentrate" form.
[0048] As used herein, detergent compositions include articles and
compositions of cleaning and/or treatment. As used herein, the term
"cleaning and/or treatment composition" includes, unless otherwise
indicated, tablet, granular, or powder-form all-purpose or "heavy
duty" washing agents, especially laundry detergents; liquid, gel,
or paste-form all-purpose washing agents, especially the so-called
heavy-duty liquid types; liquid fine-fabric detergents; hand
dishwashing agents, or light duty dishwashing agents, especially
those of the high-foaming type; machine dishwashing agents,
including the various tablet, granular, liquid and rinse-aid types
for household and institutional use. The compositions can also be
in unit dose packages, including those known in the art and those
that are water soluble, water insoluble and/or water permeable.
[0049] As used herein, detergent compositions also include personal
or beauty care products in the form of skin and hair care
compositions including, for example, conditioning treatments,
cleansing products, such as hair and/or scalp shampoos, body
washes, hand cleaners, water-less hand sanitizers/cleansers, facial
cleansers, and the like.
[0050] As used herein, the term "fatty acid" means a carboxylic
acid having the formula RCOOH. R represents an aliphatic group,
preferably an alkyl group. R can comprise about 4 or more carbon
atoms. In some embodiments, the fatty acid comprises between about
4 and about 22 carbon atoms. Fatty acids can be saturated,
monounsaturated, or polyunsaturated. In addition, fatty acids can
comprise a straight or branched chain. The branched chains may have
one or more points of branching. In addition, the branched chains
may include cyclic branches. In a preferred embodiment, the fatty
acid is made from a fatty acid biosynthetic pathway.
[0051] As used herein, the term "fatty acid biosynthetic pathway"
means a biosynthetic pathway that produces fatty acids. The fatty
acid biosynthetic pathway includes fatty acid enzymes that can be
engineered, as described herein, to produce fatty acids, and in
some embodiments can be expressed with additional enzymes to
produce fatty acids having desired carbon chain
characteristics.
[0052] As used herein, the term "fatty acid derivative" means
products made in part from the fatty acid biosynthetic pathway of
the production host organism. "Fatty acid derivative" also includes
products made in part from acyl-ACP or acyl-ACP derivatives. The
fatty acid biosynthetic pathway includes fatty acid synthase
enzymes which can be engineered as described herein to produce
fatty acid derivatives, and in some examples can be expressed with
additional enzymes to produce fatty acid derivatives having desired
carbon chain characteristics. Exemplary fatty acid derivatives
include, for example, fatty acids, acyl-CoA, fatty aldehyde, short
and long chain alcohols, hydrocarbons, fatty alcohols, and esters
(e.g., waxes, fatty acid esters, or fatty esters), although due to
their separate and industrial utilities and depending on the sources
from which they derive, hydrocarbons can sometimes be grouped into
a separate "hydrocarbon" category.
[0053] As used herein, the term "fatty acid derivative enzyme"
means any enzyme that may be expressed or overexpressed in the
production of fatty acid derivatives. These enzymes may be part of
the fatty acid biosynthetic pathway. Non-limiting examples of fatty
acid derivative enzymes include fatty acid synthases,
thioesterases, acyl-CoA synthases, acyl-CoA reductases, alcohol
dehydrogenases, alcohol acyltransferases, fatty alcohol-forming
acyl-CoA reductases, carboxylic acid reductases (e.g., fatty acid
reductases), acyl-ACP reductases, fatty acid hydroxylases, acyl-CoA
desaturases, acyl-ACP desaturases, acyl-CoA oxidases, acyl-CoA
dehydrogenases, ester synthases, and/or alkane biosynthetic
polypeptides, etc. Fatty acid derivative enzymes can convert a
substrate into a fatty acid derivative. In some examples, the
substrate may be a fatty acid derivative that the fatty acid
derivative enzyme converts into a different fatty acid
derivative.
[0054] As used herein, "fatty acid enzyme" means any enzyme
involved in fatty acid biosynthesis. Fatty acid enzymes can be
expressed or overexpressed in host cells to produce fatty acids.
Non-limiting examples of fatty acid enzymes include fatty acid
synthases and thioesterases.
[0055] As used herein, "fatty aldehyde" means an aldehyde having
the formula RCHO characterized by an unsaturated carbonyl group
(C=O). In a preferred embodiment, the fatty aldehyde is any
aldehyde made from a fatty acid or fatty acid derivative. In one
embodiment, the R group is at least about 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
or 26 carbons in length, or is a value between any two of the
foregoing values.
[0056] R can be straight or branched chain. The branched chains may
have one or more points of branching. In addition, the branched
chains may include cyclic branches.
[0057] Furthermore, R can be saturated or unsaturated. If
unsaturated, the R can have one or more points of unsaturation.
[0058] In one embodiment, the fatty aldehyde is produced
biosynthetically.
[0059] Fatty aldehydes have many uses. For example, fatty aldehydes
can be used to produce many specialty chemicals. For example, fatty
aldehydes are used to produce polymers, resins, dyes, flavorings,
plasticizers, perfumes, pharmaceuticals, and other chemicals. Some
are used as solvents, preservatives, or disinfectants. Some natural
and synthetic compounds, such as vitamins and hormones, are
aldehydes.
[0060] The terms "fatty aldehyde biosynthetic polypeptide",
"carboxylic acid reductase", and "CAR" are used interchangeably
herein.
[0061] As used herein, "fatty alcohol" means an alcohol having the
formula ROH. In a preferred embodiment, the fatty alcohol is any
alcohol made from a fatty acid or fatty acid derivative. In one
embodiment, the R group is at least about 5, 6, 7, 8, 9, 10, 11,
12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or 26
carbons in length, or is a value between any two of the foregoing
values. Typically, the fatty alcohol comprises an R group that is 6
to 26 carbons in length. Preferably, the fatty alcohol comprises an
R group that is 8, 10, 12, 14, 16, or 18 carbons in length.
[0062] R can be straight or branched chain. The branched chains may
have one or more points of branching. In addition, the branched
chains may include cyclic branches. In a particular embodiment, the
fatty alcohol of the present invention comprises one or more points
of branching.
[0063] Furthermore, R can be saturated or unsaturated. If
unsaturated, the R can have one or more points of unsaturation.
[0064] In one embodiment, the branched fatty alcohol is produced
biosynthetically.
[0065] Fatty alcohols have many uses. For example, fatty alcohols
can be used to produce various specialty chemicals. As such, fatty
alcohols are used as a biofuel; as solvents for fats, waxes, gums,
and resins; in pharmaceutical salves, emollients, and lotions; as
lubricating-oil additives; in detergents and emulsifiers; as
textile antistatic and finishing agents; as plasticizers; as
nonionic surfactants; in cosmetics, e.g., as thickeners.
[0066] The term "fatty alcohol derivative" refers to a compound
derived from a fatty alcohol. The fatty alcohol derivative can
include the oxygen atom derived from the fatty alcohol, or, in some
embodiments, does not include the aforesaid oxygen atom, in, for
example, fatty amine oxides. For example, a fatty amide, which also
can be referred to as an alkyl amide, refers to a compound
comprising an amide group and a hydrocarbon residue having about 6
carbon atoms or more, wherein the hydrocarbon residue is bonded to
the carbonyl group of the amide group or to the nitrogen atom of
the amide group. In some embodiments, the hydrocarbon residue of
the fatty alcohol is bonded to the carbonyl group of the amide
group or to the nitrogen atom of the amide group. In some
embodiments, the hydrocarbon residue is saturated. In other
embodiments, the hydrocarbon residue is monounsaturated. In further
embodiments, the hydrocarbon residue is polyunsaturated. In certain
other embodiments, the hydrocarbon residue can be a straight-chain
residue. In certain further embodiments, the hydrocarbon residue
can contain one or more points of branching.
[0067] Branched fatty alcohols have particularly beneficial
properties as compared to their corresponding straight-chain
isomers (i.e., isomers of the same molecular weight). For example,
branched fatty alcohols tend to have considerably lower melting
points when compared to their corresponding straight-chain isomers.
Lower melting points confer lower pour points. In addition,
branched fatty alcohols tend to have substantially lower volatility
and vapor pressure, and improved stability against oxidation and
rancidity, as compared to their corresponding straight-chain
isomers. These beneficial properties make branched fatty alcohols
and/or derivatives thereof particularly suitable as
components or feedstocks for cosmetic and pharmaceutical
applications, as components of plasticizers for synthetic resins,
as solvents for solutions for printing ink and specialty inks, or
as industrial lubricants. These materials are also well suited as
components of surfactants that have good low-temperature detersive
performance. As such, they are especially desirable as ingredients
of various household and/or personal care cleaning/treatment
compositions wherein low washing temperatures are preferred.
[0068] As used herein, "fraction of modern carbon" or "f_M" has
the same meaning as defined by National Institute of Standards and
Technology (NIST) Standard Reference Materials (SRMs) 4990B and
4990C, known as oxalic acid standards HOxI and HOxII,
respectively. The fundamental definition relates to 0.95 times the
¹⁴C/¹²C isotope ratio of HOxI (referenced to AD 1950). This
is roughly equivalent to decay-corrected pre-Industrial Revolution
wood. For the current living biosphere (plant material), f_M is
approximately 1.1.
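The definition above can be illustrated with a short calculation. This is a simplified sketch: the function name and its arguments are hypothetical, both ratios are assumed to be expressed in the same units, and the isotopic fractionation and decay corrections that an actual radiocarbon measurement would apply are omitted.

```python
def fraction_of_modern_carbon(sample_ratio, hox1_ratio):
    """Simplified f_M: the sample 14C/12C ratio divided by
    0.95 times the 14C/12C ratio of the HOxI standard.

    Both ratios must use the same units. Fractionation and decay
    corrections are deliberately left out of this sketch.
    """
    if hox1_ratio <= 0:
        raise ValueError("reference ratio must be positive")
    if sample_ratio < 0:
        raise ValueError("sample ratio cannot be negative")
    return sample_ratio / (0.95 * hox1_ratio)
```

Under these assumptions, a sample whose ratio is 1.045 times that of HOxI gives an f_M of about 1.1, the value stated above for the current living biosphere, while a sample containing no ¹⁴C gives an f_M of 0.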
[0069] "Gene knockout", as used herein, refers to a procedure by
which a gene encoding a target protein is modified or inactivated
so as to reduce or eliminate the function of the intact protein.
Inactivation of the gene may be performed by general methods such
as mutagenesis by UV irradiation or treatment with
N-methyl-N'-nitro-N-nitrosoguanidine, site-directed mutagenesis,
homologous recombination, insertion-deletion mutagenesis, or
"Red-driven integration" (Datsenko et al., Proc. Natl. Acad. Sci.
USA, 97: 6640-45 (2000)). For example, in one embodiment, a
construct is introduced into a host cell, such that it is possible
to select for homologous recombination events in the host cell. One
of skill in the art can readily design a knock-out construct
including both positive and negative selection genes for
efficiently selecting transfected cells that undergo a homologous
recombination event with the construct. The alteration in the host
cell may be obtained, for example, by replacing through a single or
double crossover recombination a wild type DNA sequence by a DNA
sequence containing the alteration. For convenient selection of
transformants, the alteration may, for example, be a DNA sequence
encoding an antibiotic resistance marker or a gene complementing a
possible auxotrophy of the host cell. Mutations include, but are
not limited to, deletion-insertion mutations. An example of such an
alteration includes a gene disruption (i.e., a perturbation of a
gene) such that the product that is normally produced from this
gene is not produced in a functional form. This could be due to a
complete deletion, a deletion and insertion of a selective marker,
an insertion of a selective marker, a frameshift mutation, an
in-frame deletion, or a point mutation that leads to premature
termination. In some instances, the entire mRNA for the gene is
absent. In other situations, the amount of mRNA produced
varies.
[0070] Calculations of "homology" between two sequences can be
performed as follows. The sequences are aligned for optimal
comparison purposes (e.g., gaps can be introduced in one or both of
a first and a second amino acid or nucleic acid sequence for
optimal alignment and non-homologous sequences can be disregarded
for comparison purposes). In a preferred embodiment, the length of
a reference sequence that is aligned for comparison purposes is at
least about 30%, preferably at least about 40%, more preferably at
least about 50%, even more preferably at least about 60%, and even
more preferably at least about 70%, at least about 80%, at least
about 90%, or about 100% of the length of the reference sequence.
The amino acid residues or nucleotides at corresponding amino acid
positions or nucleotide positions are then compared. When a
position in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in the second
sequence, then the molecules are identical at that position (as
used herein, amino acid or nucleic acid "identity" is equivalent to
amino acid or nucleic acid "homology"). The percent identity
between the two sequences is a function of the number of identical
positions shared by the sequences, taking into account the number
of gaps and the length of each gap, which need to be introduced for
optimal alignment of the two sequences.
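The position-by-position comparison described above can be sketched as follows for two sequences that have already been aligned. The function name, the use of "-" as the gap character, and the choice of the number of alignment columns as the denominator are assumptions made for illustration; alignment programs differ in how they count gapped positions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    A column counts as identical when both sequences carry the same
    residue or nucleotide. Columns with a gap in only one sequence
    count toward the length but never toward the matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (x, y)
        for x, y in zip(aligned_a, aligned_b)
        if not (x == gap and y == gap)
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)
```

For example, the aligned pair "AC-T" and "ACGT" shares three identical positions over four columns, which this sketch reports as 75 percent identity.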
[0071] The comparison of sequences and determination of percent
homology between two sequences can be accomplished using a
mathematical algorithm. In a preferred embodiment, the percent
homology between two amino acid sequences is determined using the
Needleman and Wunsch (1970), J. Mol. Biol. 48:444-453, algorithm
that has been incorporated into the GAP program in the GCG software
package, using either a BLOSUM62 matrix or a PAM250 matrix, and a
gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1,
2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent
homology between two nucleotide sequences is determined using the
GAP program in the GCG software package, using a NWSgapdna.CMP
matrix and a gap weight of about 40, 50, 60, 70, or 80 and a length
weight of about 1, 2, 3, 4, 5, or 6. A particularly preferred set
of parameters (and the one that should be used if the practitioner
is uncertain about which parameters should be applied to determine
if a molecule is within a homology limitation of the claims) are a
BLOSUM62 scoring matrix with a gap penalty of 12, a gap extend
penalty of 4, and a frameshift gap penalty of 5.
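The global alignment approach of Needleman and Wunsch referenced above can be sketched in a few lines. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM62 or PAM250 matrix and a single linear gap penalty in place of separate gap and gap-extension weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), with "-" marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two symbols
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With the default scores, aligning "ACGT" with "ACT" introduces a single gap opposite the G. A substitution matrix and affine gap penalties, as used by the programs named above, would replace the flat scores in a fuller implementation.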
[0072] Other methods for aligning sequences for comparison are well
known in the art. Various programs and alignment algorithms are
described in, for example, Smith & Waterman, Adv. Appl. Math.
2:482, 1981; Pearson & Lipman, Proc. Natl. Acad. Sci. USA
85:2444, 1988; Higgins & Sharp, Gene 73:237-244, 1988; Higgins
& Sharp, CABIOS 5:151-153, 1989; Corpet et al., Nucleic Acids
Research 16:10881-10890, 1988; Huang et al., CABIOS 8:155-165,
1992; Pearson et al., Methods in Molecular Biology 24:307-331,
1994; and Altschul et al., J. Mol. Biol. 215:403-410, 1990.
[0073] As used herein, a "host cell" is a cell used to produce a
product described herein (e.g., a branched fatty alcohol described
herein). A host cell can be modified to express or overexpress
selected genes or to have attenuated expression of selected genes.
Non-limiting examples of host cells include plant, animal, human,
bacteria, yeast, or filamentous fungi cells.
[0074] As used herein, the term "hybridizes under low stringency,
medium stringency, high stringency, or very high stringency
conditions" describes conditions for hybridization and washing.
Guidance for performing hybridization reactions can be found, for
example, in Current Protocols in Molecular Biology, John Wiley
& Sons, N.Y. (1989), 6.3.1-6.3.6. Aqueous and nonaqueous
methods are described in that reference, and either method can be
used. Examples of hybridization conditions referred to herein are
as follows: 1) low stringency hybridization conditions in 6×
sodium chloride/sodium citrate (SSC) at about 45°C,
followed by two washes in 0.2× SSC, 0.1% SDS at least at
50°C (the temperature of the washes can be increased to
55°C for low stringency conditions); 2) medium stringency
hybridization conditions in 6× SSC at about 45°C,
followed by one or more washes in 0.2× SSC, 0.1% SDS at
60°C; 3) high stringency hybridization conditions in
6× SSC at about 45°C, followed by one or more washes
in 0.2× SSC, 0.1% SDS at 65°C; and 4) very high
stringency hybridization conditions in 0.5 M sodium phosphate, 7%
SDS at 65°C, followed by one or more washes at
0.2× SSC, 1% SDS at 65°C. Very high stringency
conditions of 4) are the preferred conditions unless otherwise
specified.
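For reference, the four sets of conditions listed above can be collected into a lookup table. The values are transcribed from the paragraph above; the dictionary layout, key names, and helper function are illustrative assumptions only.

```python
# Hybridization and wash conditions transcribed from the text above.
# Temperatures are in degrees Celsius.
STRINGENCY_CONDITIONS = {
    "low": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 50,  # at least 50 C; may be raised to 55 C
        "washes": "two",
    },
    "medium": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 60,
        "washes": "one or more",
    },
    "high": {
        "hybridization": "6x SSC at about 45 C",
        "wash": "0.2x SSC, 0.1% SDS",
        "wash_temp_c": 65,
        "washes": "one or more",
    },
    "very_high": {
        "hybridization": "0.5 M sodium phosphate, 7% SDS at 65 C",
        "wash": "0.2x SSC, 1% SDS",
        "wash_temp_c": 65,
        "washes": "one or more",
    },
}

# The text names the very high stringency conditions as preferred
# unless otherwise specified.
PREFERRED_LEVEL = "very_high"


def wash_temperature_c(level=PREFERRED_LEVEL):
    """Return the wash temperature (degrees C) for a stringency level."""
    return STRINGENCY_CONDITIONS[level]["wash_temp_c"]
```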
[0075] The term "isolated" as used herein with respect to nucleic
acids, such as DNA or RNA, refers to molecules separated from other
DNAs or RNAs, respectively, that are present in the natural source
of the nucleic acid. Moreover, an "isolated nucleic acid" is
meant to include nucleic acid fragments that are not naturally
occurring as fragments and would not be found in the natural state.
The term "isolated" is also used herein to refer to polypeptides,
which are isolated from other cellular proteins, and is meant to
encompass both purified and recombinant polypeptides. The term
"isolated" as used herein also refers to a nucleic acid or peptide
that is substantially free of cellular material, viral material, or
culture medium when produced by recombinant DNA techniques. The
term "isolated" as used herein also refers to a nucleic acid or
peptide that is substantially free of chemical precursors or other
chemicals when chemically synthesized.
[0076] As used herein, the "level of expression of a gene in a
cell" refers to the level of mRNA, pre-mRNA nascent transcript(s),
transcript processing intermediates, mature mRNA(s), and
degradation products encoded by the gene in the cell.
[0077] As used herein, the term "microorganism" means prokaryotic
and eukaryotic microbial species from the domains Archaea,
Bacteria, and Eucarya, the latter including yeast and filamentous
fungi, protozoa, algae, or higher Protista. The terms "microbial
cells" (i.e., cells from microbes) and "microbes" are used
interchangeably and refer to cells or small organisms that can only
be seen with the aid of a microscope.
[0078] As used herein, the term "nucleic acid" refers to
polynucleotides, such as deoxyribonucleic acid (DNA), and, where
appropriate, ribonucleic acid (RNA). The term should also be
understood to include, as equivalents, analogs of either RNA or DNA
made from nucleotide analogs, and, as applicable to the embodiment
being described, single (sense or antisense) and double-stranded
polynucleotides, ESTs, chromosomes, cDNAs, mRNAs, and rRNAs.
[0079] As used herein, the term "operably linked" means that a
selected nucleotide sequence (e.g., encoding a polypeptide
described herein) is in proximity to a promoter to allow the
promoter to regulate expression of the selected DNA. In addition,
the promoter is located upstream of the selected nucleotide
sequence in terms of the direction of transcription and
translation. By "operably linked" is meant that a nucleotide
sequence and a regulatory sequence(s) are connected in such a way
as to permit gene expression when the appropriate molecules (e.g.,
transcriptional activator proteins) are bound to the regulatory
sequence(s).
[0080] The term "or" is used herein to mean, and is used
interchangeably with, the term "and/or," unless context clearly
indicates otherwise.
[0081] As used herein, "overexpress" means to express or cause to
be expressed a nucleic acid or polypeptide in a cell at a greater
concentration than is normally expressed in a corresponding
wild-type cell. For example, a polypeptide can be "overexpressed"
in a recombinant host cell when the polypeptide is present in a
greater concentration in the recombinant host cell compared to its
concentration in a non-recombinant host cell of the same
species.
[0082] As used herein, "partition coefficient" or "P" is defined as
the equilibrium concentration of a compound in an organic phase
divided by the concentration at equilibrium in an aqueous phase
(e.g., fermentation broth). In one embodiment of a bi-phasic system
described herein, the organic phase is formed by the fatty aldehyde
or fatty alcohol during the production process. However, in some
examples, an organic phase can be provided, such as by providing a
layer of octane, to facilitate product separation. When describing
a two phase system, the partition characteristics of a compound can
be described as logP. For example, a compound with a logP of 1
would partition 10:1 to the organic phase: aqueous phase. A
compound with a logP of -1 would partition 1:10 to the organic
phase: aqueous phase. By choosing an appropriate fermentation broth
and organic phase, a branched fatty aldehyde or branched fatty
alcohol with a high logP value can separate into the organic phase
even at very low concentrations in the fermentation vessel.
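The relationship between equilibrium concentrations, P, and logP described above can be illustrated numerically. The function names below are hypothetical, and the phase volumes used in `fraction_in_organic_phase` are an added assumption, because the share of a compound recovered in the organic phase depends on the relative volumes of the two phases as well as on P.

```python
import math


def log_p(conc_organic, conc_aqueous):
    """logP from equilibrium concentrations in the two phases."""
    if conc_organic <= 0 or conc_aqueous <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_organic / conc_aqueous)


def organic_to_aqueous_ratio(logp):
    """Concentration ratio (organic:aqueous) implied by a logP value."""
    return 10.0 ** logp


def fraction_in_organic_phase(logp, vol_organic, vol_aqueous):
    """Fraction of the total compound found in the organic phase.

    Assumes equilibrium and the given phase volumes (same units).
    """
    if vol_organic <= 0 or vol_aqueous <= 0:
        raise ValueError("volumes must be positive")
    p = organic_to_aqueous_ratio(logp)
    return p * vol_organic / (p * vol_organic + vol_aqueous)
```

A logP of 1 corresponds to a 10:1 concentration ratio and a logP of -1 to a 1:10 ratio, matching the examples in the text. With equal phase volumes and a logP of 1, about 10/11 of the compound would reside in the organic phase.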
[0083] As used herein, the term "purify," "purified," or
"purification" means the removal or isolation of a molecule from
its environment by, for example, isolation or separation.
"Substantially purified" molecules are at least about 60% free,
preferably at least about 75% free, and more preferably at least
about 90% free from other components with which they are
associated. As used herein, these terms also refer to the removal
of contaminants from a sample. For example, the removal of
contaminants can result in an increase in the percentage of
branched fatty aldehyde or branched fatty alcohol in a sample. For
example, when branched fatty alcohols are produced in a host cell,
the branched fatty alcohols can be purified by the removal of host
cell proteins, or by simply separating and removing linear fatty
alcohols that are produced during the same process. After
purification, the percentage of branched fatty alcohols in the
sample is increased.
[0084] The terms "purify," "purified," and "purification" do not
require absolute purity. They are relative terms. Thus, for
example, when branched fatty alcohols are produced in host cells, a
purified branched fatty alcohol is one that is substantially
separated from other cellular components (e.g., nucleic acids,
polypeptides, lipids, carbohydrates, or other compounds, such as,
for example, linear fatty alcohols). In another example, a purified
branched fatty alcohol preparation is one in which the branched
fatty alcohol is substantially free from contaminants, such as
those that might be present following fermentation. In some
embodiments, a branched fatty alcohol is purified when at least
about 50% by weight of a sample is composed of the branched fatty
alcohol. In other embodiments, a branched fatty alcohol is purified
when at least about 60%, 70%, 80%, 85%, 90%, 92%, 95%, 98%, or 99%
or more by weight of a sample is composed of the branched fatty
alcohol.
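The weight-based purity levels recited above amount to a simple percentage check. The sketch below assumes that the mass of branched fatty alcohol and the total sample mass are known in the same units; the function names are hypothetical, and the word "about" in the text is not modeled.

```python
# Purity levels (percent by weight) recited in the paragraph above.
PURITY_LEVELS = (50, 60, 70, 80, 85, 90, 92, 95, 98, 99)


def weight_percent(branched_mass, sample_mass):
    """Percent by weight of branched fatty alcohol in a sample."""
    if sample_mass <= 0:
        raise ValueError("sample mass must be positive")
    if not 0 <= branched_mass <= sample_mass:
        raise ValueError("branched mass must lie between 0 and sample mass")
    return 100.0 * branched_mass / sample_mass


def purity_levels_met(branched_mass, sample_mass):
    """Return the recited purity levels that the sample reaches."""
    pct = weight_percent(branched_mass, sample_mass)
    return [level for level in PURITY_LEVELS if pct >= level]
```

For example, a 4 g sample containing 3 g of branched fatty alcohol is 75% branched fatty alcohol by weight, which meets the 50%, 60%, and 70% levels but not the 80% level.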
[0085] As used herein, the term "recombinant polypeptide" refers to
a polypeptide that is produced by recombinant DNA techniques,
wherein generally DNA encoding the expressed protein or RNA is
transferred into a suitable expression vector and that is in turn
used to transform a host cell to produce the polypeptide or
RNA.
[0086] As used herein, the term "substantially identical" (or
"substantially homologous") is used to refer to a first amino acid
or nucleotide sequence that contains a sufficient number of
identical or equivalent (e.g., with a similar side chain, such as
involving conservative amino acid substitutions) amino acid
residues or nucleotides to a second amino acid or nucleotide
sequence such that the first and second amino acid or nucleotide
sequences have similar activities.
[0087] As used herein, the term "surfactants" refers broadly to
surface active agents. These agents are typically amphipathic
molecules comprising both hydrophilic and hydrophobic moieties that
partition preferentially at the interface between fluid phases with
different degrees of polarity and hydrogen bonding, such as, for
example, an oil/water interface, or an air/water interface.
Surfactants are capable of reducing surface and interfacial tension
and forming microemulsions. These characteristics confer
detergency, emulsifying, foaming and dispersing traits, making them
some of the most versatile process chemicals.
[0088] Surfactants can be natural or synthetic in origin.
Surfactants from natural origin can be derived from, for example,
vegetable or animal sources. Surfactants derived from synthetic
origin are typically those derived from petroleum.
[0089] There are many types of surfactants, including, for example,
anionic surfactants, cationic surfactants, non-ionic surfactants,
and amphoteric/zwitterionic surfactants, each with distinct
characteristics.
[0090] The hydrophilic end of an anionic surfactant is negatively
charged in solution. Anionic surfactants have good cleaning
properties and high sudsing potentials, which make them
particularly effective as some of the most widely used types of
surfactants in, for example, laundry detergents, dishwashing
liquids, and shampoos. Known anionic surfactants include, for
example, alkyl sulfates, alkyl ethoxylate sulfates, and soaps.
[0091] The hydrophilic end of a cationic surfactant is positively
charged in solution. Three types of cationic surfactants are the
most commonly known. The first type is the esterquat, which is
widely included in, for example, fabric treatment agents or
softeners and in detergents with built-in softeners. This is
because esterquat is capable of adding softness to fabrics. The
second type is a mono alkyl quaternary system, which is found in
many household cleaners due to its disinfecting and/or sanitizing
properties.
[0092] Non-ionic surfactants do not have an electrical charge in
solution, making them resistant to water hardness deactivation.
They are typically excellent grease removers. The most commonly
used non-ionic surfactants are ethers or derivatives of fatty
alcohols.
[0093] Amphoteric/zwitterionic surfactants are milder than the
other types of surfactants, making them particularly suitable for
use in personal or beauty care cleaning/treatment products. They
may contain two oppositely-charged groups. While the positive
charge is typically conferred by ammonium, the source of the
negative charge can vary. For example, the negative charge can be
conferred by carboxylate, sulfate, sulfonate, or a combination
thereof. They can be anionic (e.g., negatively charged), cationic
(e.g., positively charged) or non-ionic (e.g., no charge) in
solution, depending on the acidity or pH of the solution. They have
good compatibility with the other types of surfactants and are well
known for being soluble and effective in the presence of high
concentrations of electrolytes, acids and alkalis. An example of an
amphoteric/zwitterionic surfactant is an alkyl betaine.
[0094] In typical applications, different types of surfactants are
blended or otherwise used together to achieve an array of desirable
properties.
[0095] As used herein, the term "synthase" means an enzyme that
catalyzes a synthesis process. As used herein, the term synthase
includes synthases, synthetases, and ligases.
[0096] As used herein, the term "transfection" means the
introduction of a nucleic acid (e.g., via an expression vector)
into a recipient cell by nucleic acid-mediated gene transfer.
[0097] As used herein, "transformation" refers to a process in
which a cell's genotype is changed as a result of the cellular
uptake of exogenous DNA or RNA. This may result in the transformed
cell expressing a recombinant form of an RNA or polypeptide. In the
case of antisense expression from the transferred gene, the
expression of a naturally-occurring form of the polypeptide is
disrupted.
[0098] As used herein, a "transport protein" is a polypeptide that
facilitates the movement of one or more compounds in and/or out of
a cellular organelle and/or a cell.
[0099] As used herein, a "variant" of polypeptide X refers to a
polypeptide having the amino acid sequence of polypeptide X in which
one or more amino acid residues is altered. The variant may have
conservative changes or nonconservative changes. Guidance in
determining which amino acid residues may be substituted, inserted,
or deleted without affecting biological activity may be found using
computer programs well known in the art, for example, LASERGENE
software (DNASTAR).
[0100] The term "variant," when used in the context of a
polynucleotide sequence, may encompass a polynucleotide sequence
related to that of a gene or the coding sequence thereof. This
definition may also include, for example, "allelic," "splice,"
"species," or "polymorphic" variants. A splice variant may have
significant identity to a reference polynucleotide, but will
generally have a greater or smaller number of nucleotides due to
alternative splicing of exons during mRNA processing. The
corresponding polypeptide may possess additional functional domains
or an absence of domains. Species variants are polynucleotide
sequences that vary from one species to another. The resulting
polypeptides generally will have significant amino acid sequence
identity relative to each other. A polymorphic variant is a
variation in the polynucleotide sequence of a particular gene
between individuals of a given species.
[0101] As used herein, the term "vector" refers to a nucleic acid
molecule capable of transporting another nucleic acid to which it
has been linked. One type of useful vector is an episome (i.e., a
nucleic acid capable of extra-chromosomal replication). Useful
vectors are those capable of autonomous replication and/or
expression of nucleic acids to which they are linked. Vectors
capable of directing the expression of genes to which they are
operatively linked are referred to herein as "expression vectors".
In general, expression vectors of utility in recombinant DNA
techniques are often in the form of "plasmids," which refer
generally to circular double stranded DNA loops that, in their
vector form, are not bound to the chromosome. In the present
specification, "plasmid" and "vector" are used interchangeably, as
the plasmid is the most commonly used form of vector. However, also
included are such other forms of expression vectors that serve
equivalent functions and that become known in the art subsequently
hereto.
Surfactants and Cleaning Compositions Comprising a Microbially
Produced Branched Fatty Alcohol or a Branched Fatty Alcohol
Derivative Thereof
[0102] The invention provides a surfactant composition comprising
one or more microbially produced branched chain fatty alcohols
and/or derivatives thereof. The invention further provides a
detergent/cleaning composition, such as, for example, a household
cleaning composition or a personal or beauty care cleaning
composition, comprising such a surfactant.
[0103] In one aspect, the invention features a surfactant
composition comprising branched chain fatty alcohols and/or
derivatives thereof produced by microbes. In some embodiments, the
microbially produced branched fatty alcohol and/or derivative
thereof is produced by a host cell expressing genes encoding at
least one subunit of a branched-chain alpha-keto acid dehydrogenase
polypeptide. The host cell expresses genes encoding at least two
subunits of a branched-chain alpha-keto acid dehydrogenase
polypeptide. For example, the host cell expresses a set of genes
encoding the first subunit and a second subunit of a branched-chain
alpha-keto acid dehydrogenase polypeptide. In certain embodiments,
the host cell expresses a third gene encoding the second subunit of
a branched-chain alpha-keto acid dehydrogenase polypeptide. In some
embodiments, the first and second polypeptides have branched-chain
alpha-keto acid decarboxylase activity, and the third polypeptide
has lipoamide acyltransferase activity. In further embodiments, the
host cell expresses a fourth gene encoding the third subunit of a
branched-chain alpha-keto acid dehydrogenase polypeptide. In some
embodiments, the fourth polypeptide has lipoamide dehydrogenase
activity.
[0104] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a gene encoding a beta ketoacyl-ACP synthase
polypeptide. In certain embodiments, the beta ketoacyl-ACP synthase
polypeptide has FabH activity. In certain embodiments the beta
ketoacyl-ACP synthase has specificity for branched-chain acyl-CoA
substrates.
[0105] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a set of genes encoding at least one subunit of a
branched-chain alpha-keto acid dehydrogenase complex. Specifically,
the microbially produced branched fatty alcohol and/or derivative
thereof is produced by a host cell expressing a first gene encoding
a first polypeptide comprising the amino acid sequence that is any
one of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, and 15, or one that has at
least about 30%, at least about 35%, at least about 40%, at least
about 45%, at least about 50%, at least about 55%, at least about
60%, at least about 65%, at least about 70%, at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, at least about 99% sequence identity to the amino
acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, and
15, or a variant thereof; a second gene encoding a second
polypeptide comprising an amino acid sequence of any one of SEQ ID
NOs:24, 26, 28, 30, 32, 34, 36, and 38, or one that has at least
about 30%, at least about 35%, at least about 40%, at least about
45%, at least about 50%, at least about 55%, at least about 60%, at
least about 65%, at least about 70%, at least about 75%, at least
about 80%, at least about 85%, at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99% sequence identity to the amino acid
sequence of any one of SEQ ID NOs:24, 26, 28, 30, 32, 34, 36, and
38, or a variant thereof. In certain embodiments, the host cell
also expresses a third gene encoding a third polypeptide comprising
the amino acid sequence of any one of SEQ ID NOs:47, 49, 51, 53,
55, 57, 59, and 61, or one that has at least about 30%, at least about
35%, at least about 40%, at least about 45%, at least about 50%, at
least about 55%, at least about 60%, at least about 65%, at least
about 70%, at least about 75%, at least about 80%, at least about
85%, at least about 90%, at least about 91%, at least about 92%, at
least about 93%, at least about 94%, at least about 95%, at least
about 96%, at least about 97%, at least about 98%, at least about
99% sequence identity to the amino acid sequence of any one of SEQ
ID NOs:47, 49, 51, 53, 55, 57, 59, and 61, or a variant thereof. In
some embodiments, the branched fatty aldehyde, branched fatty
alcohol, or a derivative thereof is isolated from the host cell,
for example, isolated from the extracellular environment of the
host cell. In some embodiments, the branched fatty aldehyde,
branched fatty alcohol, or the derivative thereof is spontaneously
secreted, completely or partially, from the host cell. In
alternative embodiments, the branched fatty aldehyde, branched
fatty alcohol, or the derivative thereof is transported into the
extracellular environment. In further embodiments, the branched
fatty aldehyde, branched fatty alcohol, or the derivative thereof
is passively transported or spontaneously secreted into the
extracellular environment.
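The tiered sequence-identity language above can be illustrated with a short sketch. It assumes the two amino acid sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" marking gaps; the function names, the choice of denominator (all aligned columns except those where both sequences are gapped), and the treatment of "about" as an exact cutoff are illustrative assumptions, not definitions taken from this specification.

```python
# Tiers recited in the text ("at least about 30%" ... "at least about 99%").
IDENTITY_TIERS = (30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85,
                  90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Skip columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    # A column matches only if both residues are identical and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def highest_tier_met(identity_pct: float):
    """Return the highest recited tier satisfied, or None if below 30%."""
    met = [tier for tier in IDENTITY_TIERS if identity_pct >= tier]
    return max(met) if met else None
```

Under these assumptions, a candidate polypeptide would be compared against each listed reference sequence in turn and the best tier reported; a different identity definition or alignment method could give different percentages.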
[0106] The first polypeptide comprises the amino acid sequence of
any one of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, and 15, with one or
more amino acid substitutions, additions, insertions, or deletions,
the second polypeptide comprises the amino acid sequence of any one
of SEQ ID NOs: 24, 26, 28, 30, 32, 34, 36, and 38, wherein the
first and second polypeptides together have alpha-keto acid
decarboxylase activity. In certain embodiments, the first
polypeptide comprises one or more or all of the amino acid sequence
motifs selected from SEQ ID NOs:17-23. The second polypeptide
comprises one or more or all of the amino acid sequence motifs
selected from SEQ ID NOs:40-46. In some embodiments, the third
polypeptide comprises an amino acid sequence of any one of SEQ ID
NOs: 47, 49, 51, 53, 55, 57, 59, and 61, with one or more amino
acid substitutions, additions, insertions, or deletions, wherein
the third polypeptide has lipoamide acyltransferase activity. The
third polypeptide comprises one or more or all of the amino acid
sequence motifs selected from SEQ ID NOs:63-68. In some
embodiments, the first, second and third polypeptides are capable
of catalyzing the conversion of alpha-keto acids to branched
acyl-CoAs. It is within the capacity of those skilled in the art to
devise a suitable enzymatic assay using the appropriate substrates.
Examples of such assays are described herein.
[0107] In some embodiments, the first, second, and third
polypeptides independently comprise 1 or more, 5 or more, 10 or
more, 15 or more, 20 or more, 30 or more, 40 or more, 50 or more,
or 100 or more of the following conservative amino acid
substitutions: replacement of an aliphatic amino acid, such as
alanine, valine, leucine, and isoleucine, with another aliphatic
amino acid; replacement of a serine with a threonine; replacement
of a threonine with a serine; replacement of an acidic residue,
such as aspartic acid and glutamic acid, with another acidic
residue; replacement of a residue bearing an amide group, such as
asparagine and glutamine, with another residue bearing an amide
group; exchange of a basic residue, such as lysine and arginine,
with another basic residue; and replacement of an aromatic residue,
such as phenylalanine and tyrosine, with another aromatic residue.
In some embodiments, the first and second polypeptides
independently comprise about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15,
20, 30, 40, 50, 60, 70, 80, 90, 100, 200, or more amino acid
substitutions, additions, insertions, or deletions. In some
embodiments, the third polypeptide comprises about 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, or
more amino acid substitutions, additions, insertions, or deletions.
In some embodiments, the first and second polypeptides have
branched-chain alpha-keto acid decarboxylase activity and the third
polypeptide has lipoamide acyltransferase activity. In some
embodiments, the first, second and third polypeptides are capable
of catalyzing the conversion of branched alpha-keto acids to
branched acyl-CoAs.
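The conservative substitution classes listed above can be sketched as a small lookup. The groups below contain only the residues named in the text (which introduces each class with "such as", so the classes may be broader in practice); the function names and the requirement that the two sequences be of equal length and already position-matched are illustrative assumptions.

```python
# One-letter codes for the residues named in each class in the text.
CONSERVATIVE_GROUPS = (
    frozenset("AVLI"),  # aliphatic: alanine, valine, leucine, isoleucine
    frozenset("ST"),    # serine <-> threonine
    frozenset("DE"),    # acidic: aspartic acid, glutamic acid
    frozenset("NQ"),    # amide-bearing: asparagine, glutamine
    frozenset("KR"),    # basic: lysine, arginine
    frozenset("FY"),    # aromatic: phenylalanine, tyrosine
)


def is_conservative(ref: str, alt: str) -> bool:
    """True if ref -> alt is a change within one of the listed classes."""
    if ref == alt:
        return False  # no substitution at all
    return any(ref in group and alt in group for group in CONSERVATIVE_GROUPS)


def count_substitutions(reference: str, variant: str):
    """Return (conservative, other) substitution counts, position by position."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    conservative = other = 0
    for ref, alt in zip(reference, variant):
        if ref == alt:
            continue
        if is_conservative(ref, alt):
            conservative += 1
        else:
            other += 1
    return conservative, other
```

For instance, comparing the hypothetical peptides "AVSDKF" and "LVTEKW" counts three conservative changes (A to L, S to T, D to E) and one other change (F to W, since tryptophan is not among the residues named in the text).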
[0108] In certain embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell
expressing a fourth gene encoding a fourth polypeptide comprising
the amino acid sequence of any one of SEQ ID NOs:69, 71, 73, 75,
77, 79, 81, and 83, or one that has at least about 30%, at least
about 35%, at least about 40%, at least about 45%, at least about
50%, at least about 55%, at least about 60%, at least about 65%, at
least about 70%, at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, at least
about 99% sequence identity to the amino acid sequence of any one
of SEQ ID NOs:69, 71, 73, 75, 77, 79, 81, and 83, or a variant
thereof. In some embodiments, the branched fatty aldehyde, branched
fatty alcohol, or a derivative thereof is isolated from the host
cell, for example, from the extracellular environment. In certain
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or the derivative thereof is spontaneously secreted, partially or
completely, into the extracellular environment. In other
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or the derivative thereof is transported into the extracellular
environment. In certain embodiments, the branched fatty aldehyde,
branched fatty alcohol or the derivative thereof is passively
transported into the extracellular environment.
[0109] The fourth polypeptide comprises the amino acid sequence of
any one of SEQ ID NOs:69, 71, 73, 75, 77, 79, 81, and 83, with one
or more amino acid substitutions, additions, insertions, or
deletions, and the polypeptide has lipoamide dehydrogenase
activity. In certain embodiments, the fourth polypeptide comprises
one or more or all of the amino acid sequence motifs selected from SEQ
ID NOs:85-89. In some embodiments, the fourth polypeptide comprises
1 or more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or
more, 40 or more, 50 or more, or 100 or more of the following
conservative amino acid substitutions: replacement of an aliphatic
amino acid, such as alanine, valine, leucine, and isoleucine, with
another aliphatic amino acid; replacement of a serine with a
threonine; replacement of a threonine with a serine; replacement of
an acidic residue, such as aspartic acid and glutamic acid, with
another acidic residue; replacement of a residue bearing an amide
group, such as asparagine and glutamine, with another residue
bearing an amide group; exchange of a basic residue, such as lysine
and arginine, with another basic residue; and replacement of an
aromatic residue, such as phenylalanine and tyrosine, with another
aromatic residue. In some embodiments, the fourth polypeptide
comprises about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50,
60, 70, 80, 90, 100, 200, or more amino acid substitutions,
additions, insertions, or deletions. In some embodiments, the
fourth polypeptide has lipoamide dehydrogenase activity. In some
embodiments, the first, second, third and fourth polypeptides have
branched chain alpha-keto acid decarboxylase and/or lipoamide
acyltransferase and/or lipoamide dehydrogenase activity. In some
embodiments, the first, second, third and fourth polypeptides,
optionally forming a complex, are capable of catalyzing the
conversion of alpha-keto acids to branched-chain acyl-CoAs.
[0110] In certain embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell
further expressing a gene encoding a beta-ketoacyl ACP synthase
comprising the amino acid sequence of any one of SEQ ID NOs: 90,
92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118,
and 120, or one that has at least about 30%, at least about 35%, at
least about 40%, at least about 45%, at least about 50%, at least
about 55%, at least about 60%, at least about 65%, at least about
70%, at least about 75%, at least about 80%, at least about 85%, at
least about 90%, at least about 91%, at least about 92%, at least
about 93%, at least about 94%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, at least about 99%
sequence identity to the amino acid sequence of any one of SEQ ID
NOs:90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114,
116, 118, and 120, or a variant thereof. In some embodiments, the
branched fatty aldehyde, branched fatty alcohol, or a derivative
thereof is isolated from the host cell, for example, from the
extracellular environment. In certain embodiments, the branched
fatty aldehyde, branched fatty alcohol, or the derivative thereof
is spontaneously secreted, partially or completely, into the
extracellular environment. In other embodiments, the branched fatty
aldehyde, branched fatty alcohol, or the derivative thereof is
transported into the extracellular environment. In certain
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or the derivative thereof is passively transported into the
extracellular environment.
[0111] The beta ketoacyl-ACP synthase polypeptide comprises the
amino acid sequence of any one of SEQ ID NOs:90, 92, 94, 96, 98,
100, 102, 104, 106, 108, 110, 112, 114, 116, 118, and 120, with one
or more amino acid substitutions, additions, insertions, or
deletions. In some embodiments, the beta ketoacyl-ACP synthase
polypeptide comprises one or more or all of the amino acid sequence
motifs selected from SEQ ID NOs:122-127. In some embodiments, the
polypeptide has FabH activity. In certain embodiments, the beta
ketoacyl-ACP synthase polypeptide has specificity for
branched-chain fatty acyl-CoA substrates. In certain embodiments,
the polypeptide is capable of catalyzing the condensation reaction
between a branched acyl-CoA and malonyl-ACP. It is within the
capacity of those skilled in the art to devise a suitable enzymatic
assay using the appropriate substrates in order to distinguish
those polypeptides that have sequence homology to the
beta-ketoacyl-ACP synthase polypeptides herein but are not suitable
or do not have specificity for branched-chain substrates. Two
examples of such enzymatic assays are described herein.
[0112] The beta ketoacyl-ACP synthase polypeptide can comprise 1 or
more, 5 or more, 10 or more, 15 or more, 20 or more, 30 or more, 40
or more, 50 or more, or 100 or more of the following conservative
amino acid substitutions: replacement of an aliphatic amino acid,
such as alanine, valine, leucine, and isoleucine, with another
aliphatic amino acid; replacement of a serine with a threonine;
replacement of a threonine with a serine; replacement of an acidic
residue, such as aspartic acid and glutamic acid, with another
acidic residue; replacement of a residue bearing an amide group,
such as asparagine and glutamine, with another residue bearing an
amide group; exchange of a basic residue, such as lysine and
arginine, with another basic residue; and replacement of an
aromatic residue, such as phenylalanine and tyrosine, with another
aromatic residue. In some embodiments, the polypeptide comprises
about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70,
80, 90, 100, 200 or more amino acid substitutions, additions,
insertions, or deletions. In some embodiments, the polypeptide has
FabH activity. In some embodiments, the polypeptide has specificity
for branched-chain acyl-CoAs. In some embodiments, the polypeptide
is capable of catalyzing the condensation of a branched acyl-CoA
and malonyl-ACP.
[0113] In certain embodiments, the first polypeptide comprises an
amino acid sequence motif of any one of or one or more or all of
SEQ ID NOs:17-23, wherein the first polypeptide is about 200 to
about 800 amino acid residues in length, or about 300 to about 700
amino acid residues in length, or about 400 to about 600 amino
acids in length. In some embodiments, the second polypeptide
comprises an amino acid sequence motif of any one of or one or more
or all of SEQ ID NOs:40-46, wherein the second polypeptide is about
200 to about 800 amino acid residues in length, or about 300 to
about 700 amino acid residues in length, or about 400 to about 600
amino acid residues in length. In some embodiments, the third
polypeptide comprises an amino acid sequence motif of any one of or
one or more or all of SEQ ID NOs:63-68, wherein the third
polypeptide is about 200 to about 800 amino acid residues in
length, or about 300 to about 700 amino acid residues in length, or
about 400 to about 600 amino acid residues in length. In some
embodiments, the first, second and optionally the third
polypeptides are capable of catalyzing the conversion of alpha-keto
acid substrates to branched acyl-CoAs.
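The combined motif and length criteria above can be sketched as two simple checks. The motif strings in the usage note are hypothetical placeholders, not the sequences of the SEQ ID NOs recited in the text; the function names are illustrative, and "about" is treated here as an exact inclusive bound purely for simplicity.

```python
# Length ranges recited in the text, widest first (amino acid residues).
LENGTH_RANGES = ((200, 800), (300, 700), (400, 600))


def narrowest_length_range(length: int):
    """Return the tightest recited range containing length, or None."""
    for low, high in reversed(LENGTH_RANGES):  # check narrowest first
        if low <= length <= high:
            return (low, high)
    return None


def motifs_present(sequence: str, motifs):
    """Return the motifs that occur as exact substrings of the sequence."""
    return [motif for motif in motifs if motif in sequence]
```

As a usage illustration, `motifs_present("MKTAYGDSLLV", ["GDS", "WWW"])` reports only the placeholder motif "GDS", and `narrowest_length_range(450)` reports the 400 to 600 residue range. Real motif matching would more likely use patterns that tolerate variable positions rather than exact substrings.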
[0114] In certain embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell
further expressing a gene encoding a fatty aldehyde biosynthesis
polypeptide selected from those listed in Table 6, or a variant
thereof. In some embodiments, the fatty aldehyde biosynthesis
polypeptide comprises the amino acid sequence of an enzyme listed
in Table 6, with one or more amino acid substitutions, additions,
insertions, or deletions, and the polypeptide has carboxylic acid
reductase activity. In some embodiments, the polypeptide has fatty
acid reductase activity. In some embodiments, the fatty aldehyde
biosynthesis polypeptide comprises one or more
conservative amino acid substitutions. In some embodiments, the
fatty aldehyde biosynthesis polypeptide has about 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, or more amino
acid substitutions, additions, insertions, or deletions. In some
embodiments, the polypeptide has carboxylic acid reductase
activity. In some embodiments, the polypeptide has fatty acid
reductase activity.
[0115] In some embodiments, the branched fatty alcohol or a
derivative thereof is isolated from the host cell, for example,
from the extracellular environment. In some embodiments, the
branched fatty alcohol or the derivative thereof is spontaneously
secreted, partially or completely, from the host cell. In
alternative embodiments, the branched fatty alcohol or the
derivative thereof is transported into the extracellular
environment. In other embodiments, the branched fatty alcohol or
the derivative thereof is passively transported into the
extracellular environment.
[0116] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
wherein a gene encoding a fatty acid synthase is modified. For
example, modifying the expression of a gene encoding a fatty acid
synthase includes expressing a gene encoding a fatty acid synthase
in the host cell and/or increasing the expression or activity of an
endogenous fatty acid synthase in the host cell. Alternatively,
modifying the expression of a gene encoding a fatty acid synthase
includes attenuating a gene encoding a fatty acid synthase in the
host cell and/or decreasing the expression or activity of an
endogenous fatty acid synthase in the host cell. In some
embodiments, the fatty acid synthase is a thioesterase. In
particular embodiments, the thioesterase is encoded by tesA, tesA
without leader sequence, tesB, fatB, fatB2, fatB3, fatA, or
fatA1.
[0117] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a gene encoding a fatty alcohol biosynthesis
polypeptide. The fatty alcohol biosynthesis polypeptide is, for
example, an alcohol dehydrogenase. In particular embodiments, the
fatty alcohol biosynthesis polypeptide is one selected from the
enzymes listed in Table 8, or a variant thereof.
[0118] In certain other embodiments, the microbially produced
branched fatty alcohol and/or derivative thereof is produced by a
host cell expressing a gene encoding another aldehyde biosynthetic
polypeptide or an acyl-ACP reductase polypeptide comprising the
amino acid sequence of any of the enzymes listed in Table 7, or a
variant thereof. In some embodiments, the branched fatty alcohol or
derivative thereof is isolated from the host cell, for example,
from the extracellular environment. In certain embodiments, the
branched fatty alcohol or derivative thereof is spontaneously
secreted, partially or completely, from the host cell. In
alternative embodiments, the branched fatty alcohol or derivative
thereof is transported into the extracellular environment. In other
embodiments, the branched fatty alcohol or derivative thereof is
passively transported into the extracellular environment.
[0119] The acyl-ACP reductase polypeptide, for example, comprises
the amino acid sequence of an enzyme selected from those listed in
Table 7, with one or more amino acid substitutions, additions,
insertions, or deletions, and the polypeptide has reductase
activity. In certain embodiments, the polypeptide is capable of
catalyzing the conversion of a suitable biological substrate into
an aldehyde. The acyl-ACP reductase polypeptide, for example,
comprises one or more conservative amino acid substitutions, or has
about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30, 40, 50, 60, 70,
80, 90, 100, 150, or more amino acid substitutions, additions,
insertions, or deletions. In some embodiments, the polypeptide has
reductase activity. In some embodiments, the polypeptide is capable
of catalyzing the conversion of a suitable biological substrate
into an aldehyde.
[0120] In any of the above-described embodiments, the microbially
produced branched fatty alcohol and/or derivative thereof is
produced by a host cell, which is genetically engineered to express
an attenuated level of a fatty acid degradation enzyme relative to
a wild type host cell. For example, the host cell is genetically
engineered to express an attenuated level of an acyl-CoA synthase
relative to a wild type host cell. In particular embodiments, the
host cell expresses an attenuated level of an acyl-CoA synthase
encoded by fadD, fadK, BH3103, yhfL, Pfl-4354, EAV15023, fadD1,
fadD2, RPC_4074, fadDD35, fadDD22, faa3p or the gene encoding the
protein ZP_01644857. In certain embodiments, the genetically
engineered host cell comprises a knockout of one or more genes
encoding a fatty acid degradation enzyme, such as the
aforementioned acyl-CoA synthase genes. In certain embodiments, the
host cell is genetically engineered to express, relative to a wild
type host cell, a decreased level of at least one of a gene
encoding an acyl-CoA dehydrogenase, a gene encoding an outer
membrane protein receptor, and a gene encoding a transcriptional
regulator of fatty acid biosynthesis. In some embodiments, the gene
encoding an acyl-CoA dehydrogenase is fadE. In some embodiments,
the gene encoding an outer membrane protein receptor is tonA (also
known as fhuA). In yet other embodiments, the gene encoding a
transcriptional regulator of fatty acid biosynthesis is fabR.
[0121] In yet other embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell,
which is genetically engineered to express an attenuated level of a
dehydratase/isomerase enzyme, such as an enzyme encoded by fabA or
by a gene listed in Table 1 or Table 2. In some embodiments, the
host cell comprises a knockout of fabA or a gene listed in Table 1
or Table 2. In other embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell,
which is genetically engineered to express an attenuated level of
an endogenous ketoacyl-ACP synthase, such as an enzyme encoded by
fabB or by a gene listed in Table 3 or Table 4. In certain
embodiments, the host cell comprises a knockout of fabB or a gene
listed in Table 3 or Table 4. In yet other embodiments, the host
cell is genetically engineered to express a modified level of a
gene encoding a desaturase enzyme, such as desA.
[0122] In certain embodiments, the branched-chain alpha-keto acid
dehydrogenase complex polypeptides, the beta ketoacyl-ACP synthase
polypeptide, the aldehyde biosynthesis polypeptide, the fatty acid
synthase, the acyl-ACP reductase, the alcohol biosynthesis
polypeptide, and the fatty acid degradation enzyme polypeptide are
each independently obtained from a bacterium, a plant, an insect, a
yeast, a fungus, or a mammal. For example, each of the
above-mentioned polypeptides is from a mammalian cell, plant cell,
insect cell, yeast cell, fungus cell, filamentous fungi cell,
bacterial cell, or any other organism described herein. In certain
embodiments, the branched-chain alpha-keto acid dehydrogenase
complex polypeptides can be from a bacterium that uses branched
amino acids as carbon source, including, for example, Pseudomonas
putida or Bacillus subtilis. In certain embodiments, the
branched-chain alpha-keto acid dehydrogenase complex polypeptide
can be from a bacterium that comprises branched fatty acids in its
phospholipids, including, for example, a Legionella,
Stenotrophomonas, Alteromonas, Flavobacterium, Myxococcus,
Bacteroides, Micrococcus, Staphylococcus, Bacillus, Clostridium,
Listeria, Lactococcus, or Streptomyces bacterium. In some
embodiments, the bacterium is Legionella pneumophila,
Stenotrophomonas maltophilia, Alteromonas macleodii, Flavobacterium
psychrophilum, Myxococcus xanthus, Bacteroides thetaiotaomicron,
Micrococcus luteus, Staphylococcus aureus, Clostridium
thermocellum, Listeria monocytogenes, Streptomyces lividans,
Streptomyces coelicolor, Streptomyces glaucescens, Streptococcus
pneumoniae, Streptomyces peucetius, Streptococcus pyogenes,
Escherichia coli, Escherichia coli K-12, Lactococcus lactis ssp.
lactis, Mycobacterium tuberculosis, Enterococcus tuberculosis,
Bacillus subtilis, or Lactobacillus plantarum. In certain embodiments,
suitable fatty aldehyde biosynthesis polypeptides, fatty alcohol
biosynthesis polypeptides, acyl-ACP reductases, and other
polypeptides of the invention can be from a mycobacterium selected
from the group consisting of Mycobacterium smegmatis, Mycobacterium
abscessus, Mycobacterium avium, Mycobacterium bovis, Mycobacterium
tuberculosis, Mycobacterium leprae, Mycobacterium marinum, and
Mycobacterium ulcerans. In other embodiments, the bacterium is
Nocardia sp. NRRL 5646, Nocardia farcinica, Streptomyces griseus,
Salinispora arenicola, or Clavibacter michiganensis. In certain
further embodiments, the polypeptide of the invention is derived
from a cyanobacterium, including, for example, Synechococcus
elongatus PCC7942, Synechocystis sp. PCC6803, Cyanothece sp.
ATCC51142, Prochlorococcus marinus subsp. pastoris str. CCMP1986
PMM0533, Gloeobacter violaceus PCC7421, Nostoc punctiforme
PCC73102, Anabaena variabilis ATCC29413, Synechococcus elongatus
PCC6301, Nostoc sp. PCC 7120, Microcoleus chthonoplastes
PCC7420, Arthrospira maxima CS-328, Lyngbya sp. PCC8106, Nodularia
spumigena CCY9414, Trichodesmium erythraeum IMS101, Microcystis
aeruginosa, Nostoc azollae, Anabaena variabilis, Crocosphaera
watsonii, Thermosynechococcus elongatus, Gloeobacter violaceus,
Cyanobium, or Prochlorococcus marinus.
[0123] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
cultured in the presence of at least one biological substrate for
the branched-chain alpha-keto acid dehydrogenase polypeptides, the
beta ketoacyl-ACP synthase polypeptide, the aldehyde biosynthesis
polypeptide, the acyl-ACP reductase, and/or the alcohol
biosynthesis polypeptide. Suitable substrates for the branched-chain
alpha-keto acid dehydrogenase polypeptides can include, without
limitation, 2-oxo-isovalerate, 2-oxo-isobutyrate, or
2-oxo-3-methyl-valerate.
[0124] In another aspect, the invention features a surfactant or
detergent composition comprising a microbially produced branched
fatty alcohol or a derivative thereof. In some embodiments, the
microbially produced branched fatty alcohol and/or derivative
thereof is produced by a host cell expressing a first
polynucleotide that hybridizes to a complement of a nucleotide
sequence of any one of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, and 16,
or to a fragment thereof, and a second polynucleotide that
hybridizes to a complement of a second polynucleotide sequence of
any one of SEQ ID NOs:25, 27, 29, 31, 33, 35, 37, and 39. In
certain embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a third polynucleotide that hybridizes to a complement
of a third nucleotide sequence of any one of SEQ ID NOs:48, 50, 52,
54, 56, 58, 60, and 62, or to a fragment thereof, wherein the first
and second polynucleotides encode the first and second polypeptides
having branched-chain alpha-keto acid decarboxylase activity, and
wherein the third polynucleotide encodes a polypeptide having
lipoamide acyltransferase activity. In some embodiments, the first
and the second polypeptides, optionally forming a single subunit,
optionally together with the third polypeptide, are capable of
catalyzing the conversion of branched-chain alpha-keto acids to
branched acyl-CoAs.
[0125] The first polynucleotide hybridizes under low stringency,
medium stringency, high stringency, or very high stringency
conditions, to a complement of the nucleotide sequence of any one
of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, and 16, or to a fragment
thereof. The second polynucleotide hybridizes under low stringency,
medium stringency, high stringency, or very high stringency
conditions, to a complement of the nucleotide sequence of any one
of SEQ ID NOs: 25, 27, 29, 31, 33, 35, 37, and 39, or to a fragment
thereof. The third polynucleotide hybridizes under low stringency,
medium stringency, high stringency, or very high stringency
conditions, to a complement of the nucleotide sequence of any one
of SEQ ID NOs: 48, 50, 52, 54, 56, 58, 60, and 62.
[0126] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a fourth polynucleotide that hybridizes to a complement
of a nucleotide sequence of any one of SEQ ID NOs:70, 72, 74, 76,
78, 80, 82, and 84, or to a fragment thereof, wherein the fourth
polynucleotide encodes a polypeptide having lipoamide dehydrogenase
activity. In some embodiments, the first, second, and optionally
the third and/or fourth polypeptides are capable of catalyzing the
conversion of branched-chain alpha-keto acids into branched
acyl-CoAs.
[0127] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a polynucleotide that hybridizes to a complement of a
nucleotide sequence of any one of SEQ ID NOs:91, 93, 95, 97, 99,
101, 103, 105, 107, 109, 111, 113, 115, 117, 119, and 121, or to a
fragment thereof, wherein the polynucleotide encodes a polypeptide
having beta ketoacyl-ACP synthase activity. In some embodiments,
the polypeptide is capable of catalyzing the condensation of a
branched acyl-CoA with malonyl-ACP. In some embodiments, the
polypeptide has FabH activity. In some embodiments, the polypeptide
has specificity for branched acyl-CoA substrates. The
polynucleotide hybridizes under low stringency, medium stringency,
high stringency, or very high stringency conditions, to a
complement of the nucleotide sequence of any one of SEQ ID NOs:91,
93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119,
and 121, or to a fragment thereof.
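As an illustration of the pathway described in the preceding paragraphs, the following minimal Python sketch traces a branched amino acid through its alpha-keto acid to the branched acyl-CoA primer, and then estimates the chain type obtained after condensation with malonyl-ACP. The names Primer, PRIMERS, and branched_chain_product are hypothetical and are not part of the disclosure. The amino acid, keto acid, and acyl-CoA pairings are taken from general biochemistry rather than from this passage, and the sketch assumes that each elongation cycle adds two carbons and that the product is saturated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    """Hypothetical record for one branched acyl-CoA primer."""
    keto_acid: str   # substrate of the branched-chain alpha-keto acid dehydrogenase
    acyl_coa: str    # branched acyl-CoA product, used as the FabH primer
    carbons: int     # carbon atoms in the acyl group of the primer
    branch: str      # "iso" or "anteiso" methyl branching


# Assumed pairings (general biochemistry, not recited in this passage).
PRIMERS = {
    "valine": Primer("alpha-ketoisovalerate", "isobutyryl-CoA", 4, "iso"),
    "leucine": Primer("alpha-ketoisocaproate", "isovaleryl-CoA", 5, "iso"),
    "isoleucine": Primer("alpha-keto-beta-methylvalerate",
                         "2-methylbutyryl-CoA", 5, "anteiso"),
}


def branched_chain_product(amino_acid: str, elongation_cycles: int) -> str:
    """Return a label such as 'iso-C16:0' for the resulting acyl chain.

    Each cycle (the first catalyzed by the beta ketoacyl-ACP synthase)
    is assumed to add two carbons derived from malonyl-ACP.
    """
    if elongation_cycles < 1:
        raise ValueError("at least one condensation with malonyl-ACP is required")
    primer = PRIMERS[amino_acid]
    total_carbons = primer.carbons + 2 * elongation_cycles
    return f"{primer.branch}-C{total_carbons}:0"


if __name__ == "__main__":
    for name in PRIMERS:
        print(name, PRIMERS[name].acyl_coa, branched_chain_product(name, 5))
```

Under these assumptions, primers with four carbons give even-numbered iso chains, and primers with five carbons give odd-numbered iso or anteiso chains. A branched fatty alcohol derived from such a chain would have the same carbon count.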
[0128] In some embodiments, the branched fatty aldehyde, branched
fatty alcohol, or derivative thereof is isolated from the host
cell, for example, from the extracellular environment. In certain
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or derivative thereof is spontaneously secreted, partially or
completely, from the host cell. In alternative embodiments, the
branched fatty aldehyde, branched fatty alcohol, or derivative
thereof is transported into the extracellular environment. In other
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or derivative thereof is passively transported into the
extracellular environment.
[0129] In certain embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell
expressing a polynucleotide that hybridizes to a complement of the
nucleotide sequence encoding a fatty aldehyde biosynthesis
polypeptide listed in Table 6, or to a fragment thereof, wherein
the polynucleotide encodes a polypeptide having carboxylic acid
reductase activity. In some embodiments, the polypeptide has fatty
acid reductase activity.
[0130] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell,
wherein the expression of a gene encoding a fatty acid synthase is
modified. In
certain embodiments, modifying the expression of a gene encoding a
fatty acid synthase includes expressing a gene encoding a fatty
acid synthase in the host cell and/or increasing the expression or
activity of an endogenous fatty acid synthase in the host cell. In
alternate embodiments, modifying the expression of a gene encoding
a fatty acid synthase includes attenuating a gene encoding a fatty
acid synthase in the host cell and/or decreasing the expression or
activity of an endogenous fatty acid synthase in the host cell. In
some embodiments, the fatty acid synthase is a thioesterase. In
particular embodiments, the thioesterase is encoded by tesA, tesA
without leader sequence, tesB, fatB, fatB2, fatB3, fatA, or
fatA1.
[0131] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a gene encoding a fatty alcohol biosynthesis
polypeptide. For example, the fatty alcohol biosynthesis
polypeptide is an alcohol dehydrogenase. In particular embodiments,
the fatty alcohol biosynthesis polypeptide is one selected from
those listed in Table 8, or a variant thereof.
[0132] In any of the above-described embodiments, the microbially
produced branched fatty alcohol and/or derivative thereof is
produced by a host cell, which is genetically engineered to express
an attenuated level of a fatty acid degradation enzyme relative to
a wild type host cell. In some embodiments, the host cell is
genetically engineered to express an attenuated level of an
acyl-CoA synthase relative to a wild type host cell. In particular
embodiments, the host cell expresses an attenuated level of an
acyl-CoA synthase encoded by fadD, fadK, BH3103, yhfL, Pfl-4354,
EAV15023, fadD1, fadD2, RPC_4074, fadDD35, fadDD22, faa3p or the
gene encoding the protein ZP_01644857. In certain embodiments, the
genetically engineered host cell comprises a knockout of one or
more genes encoding a fatty acid degradation enzyme, such as the
aforementioned acyl-CoA synthase genes. In certain embodiments, the
host cell is genetically engineered to express, relative to a wild
type host cell, a decreased level of at least one of a gene
encoding an acyl-CoA dehydrogenase, a gene encoding an outer
membrane protein receptor, and a gene encoding a transcriptional
regulator of fatty acid biosynthesis. In some embodiments, the gene
encoding an acyl-CoA dehydrogenase is fadE. In some embodiments,
the gene encoding an outer membrane protein receptor is tonA (also
known as fhuA). In yet other embodiments, the gene encoding a
transcriptional regulator of fatty acid biosynthesis is fabR.
[0133] In yet other embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell,
which is genetically engineered to express an attenuated level of a
dehydratase/isomerase enzyme, such as an enzyme encoded by fabA or
by a gene listed in Table 1 or Table 2. In some embodiments, the
host cell comprises a knockout of fabA or a gene listed in Table 1
or Table 2. In other embodiments, the host cell is genetically
engineered to express an attenuated level of a ketoacyl-ACP
synthase, such as an enzyme encoded by fabB or by a gene listed in
Table 3 or Table 4. In certain embodiments, the host cell comprises
a knockout of fabB or a gene listed in Table 3 or Table 4. In yet
other embodiments, the host cell is genetically engineered to
express a modified level of a gene encoding a desaturase enzyme,
such as desA.
[0134] In some embodiments, the branched fatty alcohol or a
derivative thereof is isolated from the host cell, for example,
from the extracellular environment. In certain embodiments, the
branched fatty alcohol or the derivative thereof is spontaneously
secreted, partially or completely, from the host cell. In
alternative embodiments, the branched fatty alcohol or the
derivative thereof is transported into the extracellular
environment. In other embodiments, the branched fatty alcohol or
the derivative thereof is passively transported into the
extracellular environment.
[0135] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a polynucleotide that hybridizes to a complement of a
nucleotide sequence encoding an acyl-ACP reductase selected from
those listed in Table 7, or to a fragment thereof, wherein the
polynucleotide encodes a polypeptide having reductase activity. The
polynucleotide hybridizes under low stringency, medium stringency,
high stringency, or very high stringency conditions, to a
complement of the nucleotide sequence encoding an acyl-ACP
reductase selected from those listed in Table 7, or to a fragment
thereof.
[0136] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a gene encoding a fatty alcohol biosynthesis polypeptide
in the host cell. For example, the fatty alcohol biosynthesis
polypeptide is an alcohol dehydrogenase. In particular embodiments,
the fatty alcohol biosynthesis polypeptide is one selected from
those listed in Table 8, or a variant thereof.
[0137] In any of the above-described embodiments, the microbially
produced branched fatty alcohol and/or derivative thereof is
produced by a host cell, which is genetically engineered to express
an attenuated level of a fatty acid degradation enzyme relative to
a wild type host cell. In some embodiments, the host cell is
genetically engineered to express an attenuated level of an
acyl-CoA synthase relative to a wild type host cell. In particular
embodiments, the host cell expresses an attenuated level of an
acyl-CoA synthase encoded by fadD, fadK, BH3103, yhfL, Pfl-4354,
EAV15023, fadD1, fadD2, RPC_4074, fadDD35, fadDD22, faa3p or the
gene encoding the protein ZP_01644857. In certain embodiments, the
genetically engineered host cell comprises a knockout of one or
more genes encoding a fatty acid degradation enzyme, such as the
aforementioned acyl-CoA synthase genes. In certain embodiments, the
host cell is genetically engineered to express, relative to a wild
type host cell, a decreased level of at least one of a gene
encoding an acyl-CoA dehydrogenase, a gene encoding an outer
membrane protein receptor, and a gene encoding a transcriptional
regulator of fatty acid biosynthesis. In some embodiments, the gene
encoding an acyl-CoA dehydrogenase is fadE. In some embodiments,
the gene encoding an outer membrane protein receptor is tonA (also
known as fhuA). In yet other embodiments, the gene encoding a
transcriptional regulator of fatty acid biosynthesis is fabR.
[0138] In yet other embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell,
which is genetically engineered to express an attenuated level of a
dehydratase/isomerase enzyme, such as an enzyme encoded by fabA or
by a gene listed in Table 1 or Table 2. In some embodiments, the
host cell comprises a knockout of fabA or a gene listed in Table 1
or Table 2. In other embodiments, the host cell is genetically
engineered to express an attenuated level of a ketoacyl-ACP
synthase, such as an enzyme encoded by fabB or by a gene listed in
Table 3 or Table 4. In certain embodiments, the host cell comprises
a knockout of fabB or a gene listed in Table 3 or Table 4. In yet
other embodiments, the host cell is genetically engineered to
express a modified level of a gene encoding a desaturase enzyme,
such as desA.
[0139] In some embodiments, the branched fatty alcohol or
derivative thereof is isolated from the host cell, e.g., from the
extracellular environment. In certain embodiments, the branched
fatty alcohol or derivative thereof is spontaneously secreted,
partially or completely, from the host cell. In alternative
embodiments, the branched fatty alcohol or derivative thereof is
transported into the extracellular environment. In other
embodiments, the branched fatty alcohol or derivative thereof is
passively transported into the extracellular environment.
[0140] In some embodiments, the branched-chain alpha-keto acid
dehydrogenase complex, the beta ketoacyl-ACP synthase polypeptide,
the aldehyde biosynthesis polypeptide, the fatty acid synthase, the
acyl-ACP reductase, the alcohol biosynthesis polypeptide, and the
fatty acid degradation enzyme polypeptide are each independently
from a bacterium, a plant, an insect, a yeast, a fungus, or a
mammal. For example, the branched-chain alpha-keto acid
dehydrogenase complex polypeptides can be from a bacterium that
uses branched amino acids as carbon source, including, for example,
Pseudomonas putida, or Bacillus subtilis. In another example, the
branched-chain alpha-keto acid dehydrogenase complex polypeptide
can be from a bacterium that comprises branched fatty acids in its
phospholipids, including, for example, a Legionella,
Stenotrophomonas, Alteromonas, Flavobacterium, Myxococcus,
Bacteroides, Micrococcus, Staphylococcus, Bacillus, Clostridium,
Listeria, Lactococcus, or Streptomyces bacterium. In some
embodiments, the bacterium is Legionella pneumophila,
Stenotrophomonas maltophilia, Alteromonas macleodii, Flavobacterium
psychrophilum, Myxococcus xanthus, Bacteroides thetaiotaomicron,
Micrococcus luteus, Staphylococcus aureus, Clostridium
thermocellum, Listeria monocytogenes, Streptomyces lividans,
Streptomyces coelicolor, Streptomyces glaucescens, Streptococcus
pneumoniae, Streptomyces peucetius, Streptococcus pyogenes,
Escherichia coli, Escherichia coli K-12, Lactococcus lactis ssp.
lactis, Mycobacterium tuberculosis, Enterococcus tuberculosis,
Bacillus subtilis, or Lactobacillus plantarum. In some embodiments,
suitable fatty aldehyde biosynthesis polypeptides, fatty alcohol
biosynthesis polypeptides, acyl-ACP reductases, and other
polypeptides of the invention can be from a mycobacterium selected
from the group consisting of Mycobacterium smegmatis, Mycobacterium
abscessus, Mycobacterium avium, Mycobacterium bovis, Mycobacterium
tuberculosis, Mycobacterium leprae, Mycobacterium marinum, and
Mycobacterium ulcerans. In other embodiments, the bacterium is
Nocardia sp. NRRL 5646, Nocardia farcinica, Streptomyces griseus,
Salinispora arenicola, or Clavibacter michiganensis. In yet
further embodiments, the polypeptide of the invention is derived
from a cyanobacterium, including, for example, Synechococcus
elongatus PCC7942, Synechocystis sp. PCC6803, Cyanothece sp.
ATCC51142, Prochlorococcus marinus subsp. pastoris str. CCMP1986
PMM0533, Gloeobacter violaceus PCC7421, Nostoc punctiforme
PCC73102, Anabaena variabilis ATCC29413, Synechococcus elongatus
PCC6301, Nostoc sp. PCC 7120, Microcoleus chthonoplastes
PCC7420, Arthrospira maxima CS-328, Lyngbya sp. PCC8106, Nodularia
spumigena CCY9414, Trichodesmium erythraeum IMS101, Microcystis
aeruginosa, Nostoc azollae, Anabaena variabilis, Crocosphaera
watsonii, Thermosynechococcus elongatus, Gloeobacter violaceus,
Cyanobium, or Prochlorococcus marinus.
[0141] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
cultured in the presence of at least one biological substrate for
the branched-chain alpha-keto acid dehydrogenase polypeptides, the
beta ketoacyl-ACP synthase polypeptide, the aldehyde biosynthesis
polypeptide, the acyl-ACP reductase, or the alcohol biosynthesis
polypeptide. In some embodiments, the host cell is cultured under
conditions that allow the expression of the branched-chain
alpha-keto acid dehydrogenase polypeptides, the beta ketoacyl-ACP
synthase, the aldehyde biosynthesis polypeptide, the acyl-ACP
reductase, and/or the alcohol biosynthesis polypeptide. In
particular embodiments, the host cell is cultured under conditions
that allow the production of branched fatty alcohols or derivatives
thereof.
[0142] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
cultured in the presence of at least one biological substrate for
the branched-chain alpha-keto acid dehydrogenase complex, the
aldehyde biosynthesis polypeptide, the alcohol biosynthesis
polypeptide, and/or the acyl-ACP reductase polypeptide.
Accordingly, the host cell is cultured under conditions that allow
expression of branched-chain alpha-keto acid dehydrogenase complex,
the aldehyde biosynthesis polypeptide, the alcohol biosynthesis
polypeptide, and/or the acyl-ACP reductase polypeptide.
[0143] In some embodiments, the branched fatty alcohol or
derivative thereof is isolated from the host cell, e.g., from the
extracellular environment. In some embodiments, the branched fatty
alcohol or derivative thereof is spontaneously secreted, partially
or completely, from the host cell. In alternative embodiments, the
branched fatty alcohol or derivative thereof is transported into
the extracellular environment. In other embodiments, the branched
fatty alcohol or derivative thereof is passively transported into
the extracellular environment.
[0144] In another aspect, the invention features a surfactant or
detergent composition comprising a microbially produced branched
fatty alcohol or a derivative thereof. In certain embodiments, the
microbially produced branched fatty alcohol and/or derivative
thereof is produced by a host cell expressing one or more
recombinant vectors comprising at least the E1 alpha and beta
subunits of a branched-chain alpha-keto acid dehydrogenase. In
certain embodiments, the recombinant vector further comprises an E2
subunit of a branched-chain alpha-keto acid dehydrogenase. The
subunits can be introduced into the host cell in separate vectors
or together in a single vector. For example, the vector can
comprise a first polynucleotide sequence having at least about 30%,
e.g., at least about 35%, at least about 40%, at least about 45%,
at least about 50%, at least about 55%, at least about 60%, at
least about 65%, at least about 70%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, at least
about 99%, or 100% sequence identity to a polynucleotide
sequence listed in FIG. 2A, and a second polynucleotide sequence
having at least about 30%, e.g., at least about 35%, at least about
40%, at least about 45%, at least about 50%, at least about 55%, at
least about 60%, at least about 65%, at least about 70%, at least
about 80%, at least about 85%, at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99%, or 100% sequence identity
to a polynucleotide sequence listed in FIG. 2B. In another example,
the vector can further comprise a third polynucleotide having at
least about 30%, e.g., at least about 35%, at least about 40%, at
least about 45%, at least about 50%, at least about 55%, at least
about 60%, at least about 65%, at least about 70%, at least about
80%, at least about 85%, at least about 90%, at least about 91%, at
least about 92%, at least about 93%, at least about 94%, at least
about 95%, at least about 96%, at least about 97%, at least about
98%, at least about 99%, or 100% sequence identity to a
polynucleotide sequence listed in FIG. 2C. The polynucleotides
encoding the alpha and beta subunits of the E1 subunit can be
linked and constitute a single operon, or they may be separately
introduced into a vector and/or into a host cell. Likewise, the
polynucleotides encoding the E1 subunit and the polynucleotide
encoding the E2 subunit can be linked and constitute a single
operon, or they may be separately introduced into a vector and/or
into a host cell. For example, a first vector can comprise a first
polynucleotide sequence having at least about 30%, e.g., at least
about 35%, at least about 40%, at least about 45%, at least about
50%, at least about 55%, at least about 60%, at least about 65%, at
least about 70%, at least about 80%, at least about 85%, at least
about 90%, at least about 91%, at least about 92%, at least about
93%, at least about 94%, at least about 95%, at least about 96%, at
least about 97%, at least about 98%, at least about 99%, or 100%
sequence identity to a polynucleotide sequence listed in
FIG. 2A, and a second vector can comprise a second polynucleotide
having at least about 30%, e.g., at least about 35%, at least about
40%, at least about 45%, at least about 50%, at least about 55%, at
least about 60%, at least about 65%, at least about 70%, at least
about 80%, at least about 85%, at least about 90%, at least about
91%, at least about 92%, at least about 93%, at least about 94%, at
least about 95%, at least about 96%, at least about 97%, at least
about 98%, at least about 99%, or 100% sequence identity
to a polynucleotide sequence listed in FIG. 2B.
[0145] In some embodiments, the nucleotide sequence of the first
polynucleotide has at least about 35%, at least about 40%, at least
about 45%, at least about 50%, at least about 55%, at least about
60%, at least about 65%, at least about 70%, at least about 75%, at
least about 80%, at least about 85%, at least about 90%, at least
about 91%, at least about 92%, at least about 93%, at least about
94%, at least about 95%, at least about 96%, at least about 97%, at
least about 98%, or at least about 99% sequence identity to the
nucleotide sequence of any one of SEQ ID NOs: 2, 4, 6, 8, 10, 12,
14 and 16; the second polynucleotide has at least about 35%, at
least about 40%, at least about 45%, at least about 50%, at least
about 55%, at least about 60%, at least about 65%, at least about
70%, at least about 75%, at least about 80%, at least about 85%, at
least about 90%, at least about 91%, at least about 92%, at least
about 93%, at least about 94%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%
sequence identity to the nucleotide sequence of any one of SEQ ID
NOs: 25, 27, 29, 31, 33, 35, 37, and 39; and the nucleotide
sequence of the third polynucleotide has at least about 35%, at
least about 40%, at least about 45%, at least about 50%, at least
about 55%, at least about 60%, at least about 65%, at least about
70%, at least about 75%, at least about 80%, at least about 85%, at
least about 90%, at least about 91%, at least about 92%, at least
about 93%, at least about 94%, at least about 95%, at least about
96%, at least about 97%, at least about 98%, or at least about 99%
sequence identity to the nucleotide sequence of any one of SEQ ID
NOs: 48, 50, 52, 54, 56, 58, 60, and 62. In some embodiments, the
nucleotide sequence of the first polynucleotide is any one of SEQ
ID NOs: 2, 4, 6, 8, 10, 12, 14 and 16, the nucleotide sequence of
the second polynucleotide is any one of SEQ ID NOs: 25, 27, 29, 31,
33, 35, 37, and 39, and the nucleotide sequence of the third
polynucleotide, when present, is any one of SEQ ID NOs: 48, 50, 52,
54, 56, 58, 60, and 62.
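The percent sequence identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two nucleotide sequences have already been aligned to equal length with "-" marking gaps, and it counts identical, non-gap columns over the full alignment length. That denominator is only one of several conventions and is an assumption here, not the definition used in this application. The names percent_identity, THRESHOLDS, and highest_threshold_met are hypothetical.

```python
GAP = "-"

# Thresholds recited in the paragraph above, in percent.
THRESHOLDS = (35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85,
              90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Assumes both inputs come from the same pairwise alignment, so they
    have equal length. The denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != GAP
    )
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity: float):
    """Return the largest recited threshold not exceeding identity, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


if __name__ == "__main__":
    score = percent_identity("ATGCATGCAT", "ATGCATGCAA")
    print(score, highest_threshold_met(score))
```

Because the text recites "at least about" each value, the hard cutoffs in this sketch are a simplification. In practice the alignment itself would come from a dedicated alignment tool.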
[0146] In some embodiments, each of the vectors above, or another
vector can comprise a fourth polynucleotide sequence having at
least about 30%, e.g., at least about 35%, at least about 40%, at
least about 45%, at least about 50%, at least about 55%, at least
about 60%, at least about 65%, at least about 70%, at least about
80%, at least about 85%, at least about 90%, at least about 91%, at
least about 92%, at least about 93%, at least about 94%, at least
about 95%, at least about 96%, at least about 97%, at least about
98%, at least about 99%, or 100% sequence identity to a
polynucleotide sequence listed in FIG. 2D. In certain embodiments,
the nucleotide sequence of the fourth polynucleotide has at least
about 35%, at least about 40%, at least about 45%, at least about
50%, at least about 55%, at least about 60%, at least about 65%, at
least about 70%, at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, or at least
about 99% sequence identity to the nucleotide sequence of any one
of SEQ ID NOs:70, 72, 74, 76, 78, 80, 82, and 84. In some
embodiments, the nucleotide sequence of the fourth polynucleotide
is any one of SEQ ID NOs:70, 72, 74, 76, 78, 80, 82, and 84.
[0147] In some embodiments, each of the vectors above, or another
vector can be introduced into the host cell wherein the vector
comprises a beta-ketoacyl ACP synthase nucleotide that has at least
about 30%, e.g., at least about 35%, at least about 40%, at least
about 45%, at least about 50%, at least about 55%, at least about
60%, at least about 65%, at least about 70%, at least about 80%, at
least about 85%, at least about 90%, at least about 91%, at least
about 92%, at least about 93%, at least about 94%, at least about
95%, at least about 96%, at least about 97%, at least about 98%, at
least about 99%, or 100% sequence identity to a polynucleotide
sequence listed in FIG. 2E. In certain embodiments, the nucleotide
sequence of the beta-ketoacyl ACP synthase nucleotide has at least
about 35%, at least about 40%, at least about 45%, at least about
50%, at least about 55%, at least about 60%, at least about 65%, at
least about 70%, at least about 75%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, or at least
about 99% sequence identity to the nucleotide sequence of any one
of SEQ ID NOs:91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111,
113, 115, 117, 119, and 121. In some embodiments, the nucleotide
sequence of the beta-ketoacyl ACP synthase nucleotide is any one of
SEQ ID NOs: 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113,
115, 117, 119, and 121.
[0148] In yet another embodiment, an individual vector comprising a
beta-ketoacyl-ACP synthase nucleotide that has at least about 30%,
e.g., at least about 35%, at least about 40%, at least about 45%,
at least about 50%, at least about 55%, at least about 60%, at
least about 65%, at least about 70%, at least about 80%, at least
about 85%, at least about 90%, at least about 91%, at least about
92%, at least about 93%, at least about 94%, at least about 95%, at
least about 96%, at least about 97%, at least about 98%, at least
about 99%, or 100% sequence identity to a polynucleotide sequence
listed in FIG. 2E can be introduced into a suitable host cell,
independent of whether one or more other vectors comprising one or
more subunits of a branched-chain alpha-keto acid dehydrogenase is
introduced into the same cell. For example, the host cell can
suitably be one that expresses an endogenous branched-chain
alpha-keto acid dehydrogenase, or one or more subunits thereof.
[0149] In some embodiments, the branched fatty aldehyde, branched
fatty alcohol, or derivative thereof is isolated from the host
cell, for example, from the extracellular environment. In some
embodiments, the branched fatty aldehyde, branched fatty alcohol or
derivative thereof is spontaneously secreted, partially or
completely, from the host cell. In alternative embodiments, the
branched fatty aldehyde, branched fatty alcohol or derivative
thereof is transported into the extracellular environment. In other
embodiments, the branched fatty aldehyde, branched fatty alcohol,
or derivative thereof is passively transported into the
extracellular environment.
[0150] The recombinant vector can further comprise a promoter
operably linked to the nucleotide sequence. In certain embodiments,
the promoter is a developmentally-regulated, an organelle-specific,
a tissue-specific, an inducible, a constitutive, or a cell-specific
promoter.
[0151] In other embodiments, the recombinant vector comprises at
least one sequence selected from the group consisting of (a) a
regulatory sequence operatively coupled to the nucleotide sequence;
(b) a selection marker operatively coupled to the nucleotide
sequence; (c) a marker sequence operatively coupled to the
nucleotide sequence; (d) a purification moiety operatively coupled
to the nucleotide sequence; (e) a secretion sequence operatively
coupled to the nucleotide sequence; and (f) a targeting sequence
operatively coupled to the nucleotide sequence.
[0152] In some embodiments, the recombinant vector is a
plasmid.
[0153] In some embodiments, the host cell expresses a polypeptide
encoded by the recombinant vector. In some embodiments, the
nucleotide sequence is stably incorporated into the genomic DNA of
the host cell, and the expression of the nucleotide sequence is
under the control of a regulated promoter region. In an exemplary
embodiment, one or more of the polynucleotides encoding a
branched-chain alpha-keto acid dehydrogenase polypeptide, a beta
ketoacyl-ACP synthase polypeptide, a fatty aldehyde biosynthesis
polypeptide, a fatty alcohol biosynthesis polypeptide, and/or an
acyl-ACP reductase of the invention can be stably incorporated into
the genomic DNA of the host cell, and the expression of the
polynucleotide sequence is under the control of a regulated
promoter region.
[0154] In some embodiments, an above-described vector or another
vector can be introduced into the host cell wherein the vector
comprises a fatty aldehyde biosynthesis polynucleotide having at
least about 70% sequence identity to a nucleotide sequence encoding
an enzyme listed in Table 6.
[0155] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
wherein the expression of a gene encoding a fatty acid synthase is
modified. In certain embodiments, modifying the expression of a
gene encoding a fatty acid synthase includes expressing a gene
encoding a fatty acid synthase in the host cell and/or increasing
the expression or activity of an endogenous fatty acid synthase in
the host cell. In alternate embodiments, modifying the expression
of a gene encoding a fatty acid synthase includes attenuating a
gene encoding a fatty acid synthase in the host cell and/or
decreasing the expression or activity of an endogenous fatty acid
synthase in the host cell. In some embodiments, the fatty acid
synthase is a thioesterase. In particular embodiments, the
thioesterase is encoded by tesA, tesA without leader sequence,
tesB, fatB, fatB2, fatB3, fatA, or fatA1.
[0156] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
expressing a gene encoding a fatty alcohol biosynthesis
polypeptide. For example, the fatty alcohol biosynthesis
polypeptide is an alcohol dehydrogenase. In particular embodiments,
the fatty alcohol biosynthesis polypeptide comprises the amino acid
sequence of an enzyme listed in Table 8, or a variant thereof.
[0157] In any of the embodiments described above, the microbially
produced branched fatty alcohol and/or derivative thereof is
produced by a host cell, which is genetically engineered to express
an attenuated level of a fatty acid degradation enzyme relative to
a wild type host cell. In some embodiments, the host cell is
genetically engineered to express an attenuated level of an
acyl-CoA synthase relative to a wild type host cell. In particular
embodiments, the host cell expresses an attenuated level of an
acyl-CoA synthase encoded by fadD, fadK, BH3103, yhfL, Pfl-4354,
EAV15023, fadD1, fadD2, RPC_4074, fadDD35, fadDD22, faa3p or the
gene encoding the protein ZP_01644857. In certain embodiments, the
genetically engineered host cell comprises a knockout of one or
more genes encoding a fatty acid degradation enzyme, such as the
aforementioned acyl-CoA synthase genes. In certain embodiments, the
host cell is genetically engineered to express, relative to a wild
type host cell, a decreased level of at least one of a gene
encoding an acyl-CoA dehydrogenase, a gene encoding an outer
membrane protein receptor, and a gene encoding a transcriptional
regulator of fatty acid biosynthesis. In some embodiments, the gene
encoding an acyl-CoA dehydrogenase is fadE. In some embodiments,
the gene encoding an outer membrane protein receptor is tonA (also
known as fhuA). In yet other embodiments, the gene encoding a
transcriptional regulator of fatty acid biosynthesis is fabR.
[0158] In yet other embodiments, the microbially produced branched
fatty alcohol and/or derivative thereof is produced by a host cell,
which is genetically engineered to express an attenuated level of a
dehydratase/isomerase enzyme, such as an enzyme encoded by fabA or
by a gene listed in Table 1 or Table 2. In some embodiments, the
host cell comprises a knockout of fabA or a gene listed in Table 1
or Table 2. In other embodiments, the host cell is genetically
engineered to express an attenuated level of a ketoacyl-ACP
synthase, such as an enzyme encoded by fabB or by a gene listed in
Table 3 or Table 4. In certain embodiments, the host cell comprises
a knockout of fabB or a gene listed in Table 3 or Table 4. In yet
other embodiments, the host cell is genetically engineered to
express a modified level of a gene encoding a desaturase enzyme,
such as desA.
[0159] In certain other embodiments, any of the vectors comprising
the E1 alpha, E1 beta, and/or optionally E2 and/or optionally E3
subunits of a branched-chain alpha-keto acid dehydrogenase complex
or another vector can be introduced into the host cell wherein the
vector further comprises an acyl-ACP reductase polynucleotide
having at least about 70% sequence identity to a nucleotide
sequence encoding an enzyme listed in Table 7.
[0160] In some embodiments, the host cell is cultured in the
presence of at least one biological substrate for the
branched-chain alpha-keto acid dehydrogenase complex, the aldehyde
biosynthesis polypeptide, the alcohol biosynthesis polypeptide,
and/or the acyl-ACP reductase polypeptide. In certain embodiments,
the host cell is cultured under conditions that are sufficient for
expressing a branched-chain alpha-keto acid dehydrogenase complex,
an aldehyde biosynthesis polypeptide, an alcohol biosynthesis
polypeptide, and/or an acyl-ACP reductase polypeptide. In certain
other embodiments, the host cell is cultured under conditions that
allow the production of branched fatty alcohols or derivatives
thereof.
[0161] In some embodiments, the microbially produced branched fatty
alcohol and/or derivative thereof is produced by a host cell
cultured in the presence of at least one biological substrate for
the branched-chain alpha-keto acid dehydrogenase complex, the
aldehyde biosynthesis polypeptide, the alcohol biosynthesis
polypeptide, and/or the acyl-ACP reductase polypeptide.
Accordingly, the host cell is cultured under conditions that allow
expression of the branched-chain alpha-keto acid dehydrogenase complex,
the aldehyde biosynthesis polypeptide, the alcohol biosynthesis
polypeptide, and/or the acyl-ACP reductase polypeptide.
[0162] In some embodiments, the branched fatty alcohol or
derivative thereof is isolated from the host cell, for example,
from the extracellular environment. In some embodiments, the
branched fatty alcohol or derivative thereof is secreted from the
host cell. In alternative embodiments, the branched fatty alcohol
or derivative thereof is transported into the extracellular
environment. In other embodiments, the branched fatty alcohol or
derivative thereof is passively transported into the extracellular
environment.
[0163] In any of the aspects of the invention described herein, the
host cell can be selected from the group consisting of a mammalian
cell, plant cell, insect cell, yeast cell, fungus cell, filamentous
fungi cell, and bacterial cell. In some embodiments, the host cell
is a Gram-positive bacterial cell. In other embodiments, the host
cell is a Gram-negative bacterial cell. In some embodiments, the
host cell is selected from the genus Escherichia, Bacillus,
Lactobacillus, Rhodococcus, Pseudomonas, Aspergillus, Trichoderma,
Neurospora, Fusarium, Humicola, Rhizomucor, Kluyveromyces, Pichia,
Mucor, Myceliophthora, Penicillium, Phanerochaete, Pleurotus,
Trametes, Chrysosporium, Saccharomyces, Stenotrophomonas,
Schizosaccharomyces, Yarrowia, or Streptomyces.
[0164] In certain embodiments, the host cell is a Bacillus lentus
cell, a Bacillus brevis cell, a Bacillus stearothermophilus cell, a
Bacillus licheniformis cell, a Bacillus alkalophilus cell, a
Bacillus coagulans cell, a Bacillus circulans cell, a Bacillus
pumilus cell, a Bacillus thuringiensis cell, a Bacillus clausii
cell, a Bacillus megaterium cell, a Bacillus subtilis cell, or a
Bacillus amyloliquefaciens cell.
[0165] In other embodiments, the host cell is a Trichoderma
koningii cell, a Trichoderma viride cell, a Trichoderma reesei
cell, a Trichoderma longibrachiatum cell, an Aspergillus awamori
cell, an Aspergillus fumigatus cell, an Aspergillus foetidus cell,
an Aspergillus nidulans cell, an Aspergillus niger cell, an
Aspergillus oryzae cell, a Humicola insolens cell, a Humicola
lanuginosa cell, a Rhodococcus opacus cell, a Rhizomucor miehei
cell, or a Mucor miehei cell.
[0166] In yet other embodiments, the host cell is a Streptomyces
lividans cell or a Streptomyces murinus cell.
[0167] In yet other embodiments, the host cell is an Actinomycetes
cell.
[0168] In some embodiments, the host cell is a Saccharomyces
cerevisiae cell.
[0169] In particular embodiments, the host cell is a cell from a
eukaryotic plant, algae, cyanobacterium, green-sulfur bacterium,
green non-sulfur bacterium, purple sulfur bacterium, purple
non-sulfur bacterium, extremophile, yeast, fungus, engineered
organisms thereof, or a synthetic organism. In some embodiments,
the host cell is light dependent or fixes carbon. In
some embodiments, the host cell has autotrophic activity. In some
embodiments, the host cell has photoautotrophic activity, such as
in the presence of light. In some embodiments, the host cell is
heterotrophic or mixotrophic in the absence of light. In certain
embodiments, the host cell is a cell from Arabidopsis thaliana,
Panicum virgatum, Miscanthus giganteus, Zea mays, Botryococcus
braunii, Chlamydomonas reinhardtii, Dunaliella salina, Synechococcus
sp. PCC 7002, Synechococcus sp. PCC 7942, Synechocystis sp. PCC
6803, Thermosynechococcus elongatus BP-1, Chlorobium tepidum,
Chloroflexus aurantiacus, Chromatium vinosum, Rhodospirillum
rubrum, Rhodobacter capsulatus, Rhodopseudomonas palustris,
Clostridium ljungdahlii, Clostridium thermocellum, Penicillium
chrysogenum, Pichia pastoris, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, Pseudomonas fluorescens, or Zymomonas
mobilis.
[0170] In other embodiments, the host cell is a CHO cell, a COS
cell, a VERO cell, a BHK cell, a HeLa cell, a Cv1 cell, an MDCK
cell, a 293 cell, a 3T3 cell, or a PC12 cell.
[0171] In yet other embodiments, the host cell is an E. coli cell.
In certain embodiments, the E. coli cell is a strain B, a strain C,
a strain K, or a strain W E. coli cell.
[0172] In further embodiments, the host cell can be genetically
engineered to express an attenuated level of a
dehydratase/isomerase enzyme. For example, an E. coli cell is
chosen as a suitable host cell, wherein one or more of the
endogenous dehydratase/isomerase enzymes such as those listed in
Table 1 below can be attenuated or knocked out.
TABLE-US-00001 TABLE 1 E. coli dehydratase/isomerase enzymes
Gene | Name | Polynucleotide Acc. No. | Polypeptide Acc. No.
fabA | beta-hydroxydecanoyl thioester dehydrase | GU072596.1 | ACY27485.1
fabZ | (3R)-hydroxymyristoyl acyl carrier protein dehydratase | GU072604 | ACY27493.1
cysM | cysteine synthase B | CP001637.1 | ACX38914
maoC | fused aldehyde dehydrogenase/enoyl-CoA hydratase | CP001637 | ACX39905.1
[0173] Other dehydratase/isomerase enzymes encoded by a gene listed
below in Table 2 can also be attenuated or knocked out from an
organism comprising such a gene.
TABLE-US-00002 TABLE 2 Other dehydratase/isomerase enzymes
Organism | Accession No.
Shigella sp. D9 | ZP_05432652
Citrobacter youngae ATCC 29220 | ZP_04561391.1
Salmonella enterica | YP_001570967.1
Escherichia fergusonii ATCC 35469 | YP_002382254.1
Klebsiella pneumoniae NTUH-K2044 | YP_002918743.1
Enterobacter cancerogenus ATCC 35316 | ZP_03281954.1
Cronobacter turicensis | CBA29728.1
Erwinia pyrifoliae Ep1/96 | YP_002649242.1
Pectobacterium carotovorum subsp. carotovorum PC1 | YP_003018119.1
Dickeya dadantii Ech703 | YP_002987184.1
Edwardsiella ictaluri 93-146 | YP_002932813.1
Providencia alcalifaciens DSM 30120 | ZP_03317956.1
Yersinia kristensenii ATCC 33638 | ZP_04624337.1
Photorhabdus asymbiotica | YP_003041580.1
Pantoea sp. At-9b | ZP_05728924.1
Actinobacillus succinogenes 130Z | YP_001344737.1
Mannheimia succiniciproducens MBEL55E | YP_088386.1
Pasteurella multocida subsp. multocida str. Pm70 | NP_245421.1
Haemophilus somnus 129PT | YP_719117.1
Proteus mirabilis HI4320 | YP_002150544.1
Sodalis glossinidius str. `morsitans` | YP_454706.1
Candidatus Blochmannia pennsylvanicus str. BPEN | YP_277927.1
Aggregatibacter aphrophilus NJ8700 | YP_003007342.1
Vibrio cholerae MZO-3 | ZP_01958381.1
Baumannia cicadellinicola str. Hc (Homalodisca coagulata) | YP_588853.1
Vibrionales bacterium SWAT-3 | ZP_01815187.1
Aliivibrio salmonicida LFI1238 | YP_002262988.1
Aeromonas salmonicida subsp. salmonicida A449 | YP_001141819.1
Wigglesworthia glossinidia endosymbiont of Glossina brevipalpis | NP_871303.1
Glaciecola sp. HTCC2999 | ZP_03560821.1
Alteromonas macleodii ATCC 27126 | ZP_04714556.1
[0174] In other embodiments, the host cell is genetically
engineered to express an attenuated level of an endogenous
ketoacyl-ACP synthase. For example, an E. coli cell is used as a
suitable host cell, wherein one or more of the ketoacyl-ACP synthase genes
listed in Table 3 below can be attenuated or knocked out.
TABLE-US-00003 TABLE 3 E. coli ketoacyl-ACP synthase enzymes
Gene | Name | Polynucleotide Acc. No. | Polypeptide Acc. No.
fabB | B-ketoacyl synthase/3-oxoacyl-[acyl-carrier-protein] synthase I | GU072597.1 | ACY27486.1
fabF | 3-oxoacyl-[acyl-carrier-protein] synthase II | GU072598.1 | ACY27487
fadJ | fused enoyl-CoA hydratase and epimerase/isomerase/3-hydroxyacyl-CoA dehydrogenase | CP001637.1 | ACX38989.1
xerC | site-specific tyrosine recombinase | CP001637.1 | ACX41768.1
yqeF | predicted acyltransferase | CP001637.1 | ACX38529.1
murQ | predicted PTS component | CP001637.1 | ACX38907.1
[0175] Other endogenous ketoacyl-ACP synthases, such as the ones
listed in Table 4, can be attenuated or knocked out from an
organism comprising such an enzyme.
TABLE-US-00004 TABLE 4 Other ketoacyl-ACP synthases
Organism | Accession No.
Shigella boydii CDC 3083-94 | YP_001881145.1
Escherichia fergusonii ATCC 35469 | YP_002382013.1
Salmonella enterica subsp. arizonae | YP_001569590.1
Citrobacter sp. 30_2 | ZP_04562837.1
Klebsiella pneumoniae subsp. pneumoniae MGH 78578 | YP_001336360.1
Pectobacterium carotovorum subsp. carotovorum WPP14 | ZP_03831287.1
Enterobacter cancerogenus ATCC 35316 | ZP_03283474.1
Pantoea sp. At-9b | ZP_05730617.1
Cronobacter turicensis | CBA32510.1
Dickeya dadantii Ech586 | ZP_05723897.1
Erwinia tasmaniensis Et1/99 | YP_001907100.1
Serratia proteamaculans 568 | YP_001479594.1
Edwardsiella ictaluri 93-146 | YP_002934130.1
Sodalis glossinidius str. `morsitans` | YP_455303.1
Yersinia aldovae ATCC 35236 | ZP_04620215.1
Providencia stuartii ATCC 25827 | ZP_02961167.1
Photorhabdus asymbiotica | YP_003040275.1
Proteus mirabilis HI4320 | YP_002151524.1
Candidatus Blochmannia pennsylvanicus str. BPEN | YP_278005.1
Glaciecola sp. HTCC2999 | ZP_03561088.1
Vibrio cholerae V51 | ZP_04919940.1
Wigglesworthia glossinidia endosymbiont of Glossina brevipalpis | NP_871411.1
Tolumonas auensis DSM 9187 | YP_002892770.1
Actinobacillus pleuropneumoniae serovar 1 str. 4074 | ZP_00134992.2
Aggregatibacter aphrophilus NJ8700 | YP_003007711.1
Pseudoalteromonas tunicata D2 | ZP_01135065.1
Vibrionales bacterium SWAT-3 | ZP_01816638.1
Pasteurella multocida subsp. multocida str. Pm70 | NP_245276.1
Mannheimia succiniciproducens MBEL55E | YP_088783.1
Haemophilus somnus 129PT | YP_718877.1
Shewanella loihica PV-4 | YP_001094535.1
Aliivibrio salmonicida LFI1238 | YP_002262558.1
[0176] In yet other embodiments, the host cell is genetically
engineered to express a modified level of a gene encoding a
desaturase enzyme, such as desA.
[0177] In certain embodiments, the microorganism is genetically
engineered to express a modified level (including, e.g., to
attenuate or knock out or to express or overexpress) of a gene
encoding a fatty aldehyde biosynthesis polypeptide. In some
embodiments, the fatty aldehyde biosynthesis polypeptide comprises
an amino acid sequence that has at least 70% sequence identity to
an enzyme listed in Table 6.
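By way of non-limiting illustration only, a sequence identity threshold such as the 70% figure recited above might be checked as sketched below. The application does not specify how identity is calculated; this sketch assumes the two sequences have already been aligned (gaps written as "-") and counts identical columns over all aligned columns, which is only one of several conventions in use. The function names and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings come from the same alignment, so they have
    equal length and use '-' for gaps. Columns where both sequences have
    a gap are ignored; every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 70.0) -> bool:
    """True if the aligned pair reaches the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments, not taken from any table in this document.
query = "MKT-AYIAKQ"
reference = "MKTGAYLAKQ"
print(percent_identity(query, reference))
print(meets_threshold(query, reference))
```

A different denominator (for example, the length of the shorter sequence) would give a different percentage for the same alignment, so the convention should be stated alongside any reported value.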
[0178] In certain embodiments, the microorganism is genetically
engineered to express a modified level of a fatty acid synthase in
the host cell. An exemplary fatty acid synthase is a thioesterase
encoded by, for example, tesA, tesA without leader sequence, tesB,
fatB, fatB2, fatB3, fatA, or fatA1.
[0179] In certain embodiments, the microorganism is genetically
engineered to express a modified level of a gene encoding a fatty
alcohol biosynthesis polypeptide. For example, the fatty alcohol
biosynthesis polypeptide is an alcohol dehydrogenase. In particular
embodiments, the fatty alcohol biosynthesis polypeptide comprises
an amino acid sequence that has at least 70% sequence identity to
an enzyme listed in Table 8.
Branched-Chain Alpha-Keto Acid Dehydrogenase Complex (BKD Complex)
and Beta Ketoacyl-ACP Synthase
[0180] The methods described herein can be used to produce branched
fatty alcohols and/or derivatives, for example, from alpha keto
acids. The oxidative decarboxylation step, which converts the alpha
keto acids to the corresponding branched-chain acyl-CoA involves a
branched-chain alpha-keto acid dehydrogenase complex (bkd; EC
1.2.4.4.) (Denoya et al., J. Bacteriol. 177:3504 (1995)), which
consists of E1 alpha/beta (decarboxylase), E2 (dihydrolipoyl
transacylase), and E3 (dihydrolipoyl dehydrogenase) subunits. Any
microorganism that possesses branched-chain fatty acids, and/or
grows on branched-chain amino acids can be used as a source to
isolate bkd genes for expression in host cells, for example, E.
coli. Furthermore, E. coli has the E3 component as part of its
pyruvate dehydrogenase complex (lpd, EC 1.8.1.4, GenBank accession
NP_414658). Thus, branched fatty alcohols and/or derivatives can be
made by heterologously expressing only the E1 alpha/beta and E2 bkd
genes. Furthermore, certain of the host cells, including E. coli,
can produce branched products when only the E1 alpha/beta is
expressed without co-expression of the E2 bkd gene.
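By way of non-limiting illustration only, the reasoning of the preceding paragraph (a host that already supplies a BKD subunit need not receive that subunit heterologously) can be sketched as a simple set difference. The subunit labels, the function name, and the `e1_only` flag are hypothetical conveniences introduced here; they are not terms defined in this application.

```python
# Subunits of the branched-chain alpha-keto acid dehydrogenase complex,
# labeled as in the text above (labels themselves are illustrative).
BKD_SUBUNITS = ("E1_alpha", "E1_beta", "E2", "E3")


def subunits_to_express(endogenous, e1_only=False):
    """List the BKD subunits a host would still need to receive.

    endogenous: set of subunits the host already provides.
    e1_only: model the case, described above for certain hosts, in which
             branched products are obtained with only E1 alpha/beta expressed.
    """
    required = ("E1_alpha", "E1_beta") if e1_only else BKD_SUBUNITS
    return [s for s in required if s not in endogenous]


# E. coli supplies E3 through lpd of its pyruvate dehydrogenase complex.
e_coli = {"E3"}
print(subunits_to_express(e_coli))                # ['E1_alpha', 'E1_beta', 'E2']
print(subunits_to_express(e_coli, e1_only=True))  # ['E1_alpha', 'E1_beta']
```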
[0181] On the other hand, microorganisms that endogenously express
a suitable beta-ketoacyl ACP synthase can be engineered to express
or overexpress at least the first (E1) subunit of a branched-chain
alpha keto acid dehydrogenase complex, optionally also the second
(E2) and/or the third (E3) subunits of that complex to produce the
desirable branched fatty alcohols and/or derivatives thereof. The
endogenous beta-ketoacyl ACP synthase can be overexpressed, or can
be modified such that it is attenuated or deleted, and a
heterologous beta-ketoacyl ACP synthase gene can be expressed in
its place.
[0182] In a further embodiment, microorganisms that endogenously
express at least the first (E1) subunit of a branched-chain alpha
keto acid dehydrogenase complex, and optionally also the second
(E2) and/or the third (E3) subunits of that complex, can be
engineered to express or overexpress a beta-ketoacyl ACP synthase.
For example, the endogenous genes encoding the subunits of the
branched-chain alpha keto acid dehydrogenase complex can be
overexpressed, or can be modified such that they are attenuated or
deleted and a gene encoding one or more subunits of a heterologous
branched-chain alpha keto acid dehydrogenase complex can be
expressed in the host cell.
Substrates for Branched Fatty Alcohol Production
[0183] The branched fatty alcohols and/or derivatives, as well as
the surfactant compositions comprising them, can be produced from,
for example, branched fatty aldehydes, which themselves can be
produced from an appropriate substrate. While not wishing to be
bound by theory, it is believed that the branched fatty aldehyde
biosynthetic polypeptides described herein produce branched fatty
aldehydes from substrates via a reduction mechanism. In some
instances, the substrate is a branched fatty acid derivative, and a
fatty aldehyde having particular branching patterns and carbon
chain length can be produced from a branched fatty acid derivative
having those characteristics. The branched fatty aldehyde can then
be converted into the desired branched fatty alcohol in a reaction
catalyzed by a fatty alcohol biosynthesis polypeptide.
[0184] Alternatively, a suitable acyl-ACP reductase can be
employed to convert a branched acyl-ACP into a fatty aldehyde,
which can in turn be converted into a branched fatty alcohol in a
reaction catalyzed by a fatty alcohol biosynthesis polypeptide.
[0185] Accordingly, each step within a biosynthetic pathway that
leads to the production of a branched fatty acid derivative
substrate can be modified to produce or overproduce the branched
substrate of interest. For example, known genes involved in the
fatty acid biosynthetic pathway or the fatty aldehyde biosynthesis
pathway can be expressed, overexpressed, or attenuated in host
cells to produce a desired substrate (see, e.g., International
Publication WO 2008/119082, the disclosure of which is incorporated
by reference).
Synthesis of Branched Fatty Alcohols and Substrates
[0186] Fatty acid synthase (FAS) is a group of polypeptides that
catalyze the initiation and elongation of acyl chains (Marrakchi et
al., Biochemical Society, 30: 1050-1055 (2002)). The acyl carrier
protein (ACP) along with the enzymes in the FAS pathway control the
length, degree of saturation, and branching of the fatty acid
derivatives produced. The fatty acid biosynthetic pathway involves
the precursors acetyl-CoA and malonyl-CoA. The steps in this
pathway are catalyzed by enzymes of the fatty acid biosynthesis
(fab) and acetyl-CoA carboxylase (acc) gene families (see, e.g.,
Heath et al., Prog. Lipid Res., 40(6): 467-97 (2001)).
[0187] Host cells can be engineered to express fatty acid
derivative substrates by recombinantly expressing or overexpressing
one or more fatty acid synthase genes, such as acetyl-CoA and/or
malonyl-CoA synthase genes. For example, to increase acetyl-CoA
production, one or more of the following genes can be expressed in
a host cell: pdh (a multienzyme complex comprising aceEF (which
encodes the E1p dehydrogenase component and the E2p dihydrolipoamide
acyltransferase component of the pyruvate and 2-oxoglutarate
dehydrogenase complexes) and lpd), panK, fabH, fabB, fabD, fabG,
acpP, and fabF. Exemplary GenBank accession numbers for these genes
are: pdh (BAB34380, AAC73227, AAC73226), panK (also known as coaA,
AAC76952), aceEF (AAC73227, AAC73226), fabH (AAC74175), fabB
(P0A953), fabD (AAC74176), fabG (AAC74177), acpP (AAC74178), and
fabF (AAC74179). Additionally, the expression levels of fadE, gpsA,
ldhA, pflB, adhE, pta, poxB, ackA, and/or ackB can be attenuated or
knocked-out in an engineered host cell by transformation with
conditionally replicative or non-replicative plasmids containing
null or deletion mutations of the corresponding genes or by
substituting promoter or enhancer sequences. Exemplary GenBank
accession numbers for these genes are: fadE (AAC73325), gpsA
(AAC76632), ldhA (AAC74462), pflB (AAC73989), adhE (AAC74323), pta
(AAC75357), poxB (AAC73958), ackA (AAC75356), and ackB (BAB81430).
The resulting host cells will have increased acetyl-CoA production
levels when grown in an appropriate environment.
[0188] Malonyl-CoA overexpression can be effected by introducing
accABCD (e.g., accession number AAC73296, EC 6.4.1.2) into a host
cell. Fatty acid production can be further increased by introducing
into the host cell a DNA sequence encoding a lipase (e.g.,
accession numbers CAA89087, CAA98876).
[0189] In addition, inhibiting PlsB can lead to an increase in the
levels of long chain acyl-ACP, which will inhibit early steps in
the pathway (e.g., accABCD, fabH, and fabI). The plsB (e.g.,
accession number AAC77011) D311E mutation can be used to increase
the amount of available fatty acids.
[0190] In addition, a host cell can be engineered to overexpress a
sfa gene (suppressor of fabA, e.g., accession number AAN79592) to
increase production of monounsaturated fatty acids (Rock et al., J.
Bacteriology, 178: 5382-5387 (1996)).
[0191] The chain length of a fatty acid derivative substrate can be
selected for by modifying the expression of selected thioesterases.
Thioesterase influences the chain length of fatty acids produced.
Hence, host cells can be engineered to express, overexpress, have
attenuated expression, or not to express one or more selected
thioesterases to increase the production of a preferred fatty acid
derivative substrate. For example, C10 fatty acids can be
produced by expressing a thioesterase that has a preference for
producing C10 fatty acids and attenuating thioesterases that
have a preference for producing fatty acids other than C10
fatty acids (e.g., a thioesterase which prefers to produce C14
fatty acids). This would result in a relatively homogeneous
population of fatty acids that have a carbon chain length of 10. In
other instances, C14 fatty acids can be produced by
attenuating endogenous thioesterases that produce non-C14
fatty acids and expressing the thioesterases that have a preference
for C14-ACP. In some situations, C12 fatty acids can be
produced by expressing thioesterases that have a preference for
C12-ACP and attenuating thioesterases that preferentially
produce non-C12 fatty acids. Acetyl-CoA, malonyl-CoA, and
fatty acid overproduction can be verified using methods known in
the art, for example, by using radioactive precursors, HPLC, or
GC-MS subsequent to cell lysis. Non-limiting examples of
thioesterases that can be used in the methods described herein are
listed in Table 5.
TABLE-US-00005 TABLE 5 Thioesterases
Accession Number | Source Organism | Gene
AAC73596 | E. coli | tesA without leader sequence
AAC73555 | E. coli | tesB
Q41635, AAA34215 | Umbellularia californica | fatB
AAC49269 | Cuphea hookeriana | fatB2
Q39513; AAC72881 | Cuphea hookeriana | fatB3
Q39473, AAC49151 | Cinnamomum camphorum | fatB
CAA85388 | Arabidopsis thaliana | fatB [M141T]*
NP 189147; NP 193041 | Arabidopsis thaliana | fatA
CAC39106 | Bradyrhizobium japonicum | fatA
AAC72883 | Cuphea hookeriana | fatA
AAL79361 | Helianthus annuus | fatA1
*Mayer et al., BMC Plant Biology, 7: 1-11 (2007)
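By way of non-limiting illustration only, the chain-length selection strategy described above (express thioesterases that prefer the target chain length, attenuate those that prefer other lengths) can be sketched as a partition of a preference table. The enzyme labels and the preferences assigned to them below are hypothetical placeholders; this document does not state the chain-length preference of any particular thioesterase in Table 5.

```python
def plan_thioesterases(preferences, target_length):
    """Split thioesterases into those to express and those to attenuate.

    preferences: mapping of thioesterase label -> preferred acyl chain length.
    target_length: desired carbon chain length (e.g., 10, 12, or 14).
    """
    express = sorted(name for name, length in preferences.items()
                     if length == target_length)
    attenuate = sorted(name for name, length in preferences.items()
                       if length != target_length)
    return {"express": express, "attenuate": attenuate}


# Hypothetical labels and preferences, for illustration only.
example_preferences = {
    "TE_hypothetical_1": 10,
    "TE_hypothetical_2": 12,
    "TE_hypothetical_3": 14,
}
print(plan_thioesterases(example_preferences, 12))
```

An empty "express" list would indicate that no enzyme in the supplied table prefers the target length, in which case a different thioesterase would need to be sourced.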
[0192] In certain embodiments, a host cell, which is used to
produce branched fatty alcohols and/or derivatives herein, can be
engineered to express or overexpress one or more fatty aldehyde
biosynthetic polypeptides. Alternatively, the host cell can be
engineered to express an attenuated level of an endogenous fatty
aldehyde biosynthetic polypeptide. In other instances, a fatty
aldehyde biosynthetic polypeptide, a variant, or a fragment thereof
is expressed in a host cell that contains a naturally occurring
mutation that results in an increased level of branched fatty
aldehyde substrate in the host cell or of branched fatty alcohol
produced by the host cell. In some instances, a branched fatty
aldehyde is produced by expressing a fatty aldehyde biosynthesis
gene, for example, a carboxylic acid reductase gene, encoding a
protein listed in Table 6, below, as well as a polynucleotide
variant thereof. In some instances, the fatty aldehyde biosynthesis
gene encodes one of the enzymes listed in Table 6 below.
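By way of non-limiting illustration only, the accession strings used in Table 6 appear in two pipe-delimited layouts, one beginning with "gi" and one beginning with "uniprot", sometimes with a leading ">" and a trailing "|". A minimal sketch for splitting such strings into fields follows. The function name and the returned field names are hypothetical, and the sketch assumes only the two layouts seen in Table 6.

```python
def parse_identifier(raw: str) -> dict:
    """Split a Table 6 style identifier into named fields.

    Handles 'gi|<number>|<database>|<accession>' and
    'uniprot|<accession>|<entry name>', with or without a leading '>'
    or a trailing '|'.
    """
    fields = [f for f in raw.strip().lstrip(">").split("|") if f]
    if len(fields) >= 4 and fields[0] == "gi":
        return {"source": fields[2], "gi": fields[1], "accession": fields[3]}
    if len(fields) >= 3 and fields[0] == "uniprot":
        return {"source": "uniprot", "accession": fields[1],
                "entry_name": fields[2]}
    raise ValueError(f"unrecognized identifier layout: {raw!r}")


print(parse_identifier(">gi|40796035|gb|AAR91681.1|"))
print(parse_identifier("uniprot|A0PPD8|A0PPD8_MYCUA"))
```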
TABLE-US-00006 TABLE 6 Fatty Aldehyde Biosynthesis Genes
Name/Organism | Accession No.
Nocardia sp. NRRL 5646 | >gi|40796035|gb|AAR91681.1|
Mycobacterium tuberculosis H37Rv | >gi|15609727|ref|NP_217106.1
Mycobacterium smegmatis str. MC2 155 | >gi|118174788|gb|ABK75684.1|
Mycobacterium smegmatis str. MC2 155 | >gi|118469671|ref|YP_889972.1|
FadD9 | uniprot|A0PPD8|A0PPD8_MYCUA
Tsukamurella paurometabola DSM 20162 | >gi|22798060|ref|ZP_04027864.1|
Cyanobium sp. PCC 7001 | >gi|254431429|ref|ZP_05045132.1|
Putative acyl-CoA dehydrogenase | >uniprot|A0QIB5|A0QIB5_MYCA1
NAD dependent epimerase/dehydratase | >uniprot|A0QWI7|A0QWI7_MYCS2
Mycobacterium intracellulare ATCC13950 | >gi|254819907|ref|ZP_05224908.1|
Putative long-chain fattyacid-CoA ligase | >uniprot|A0R484|A0R484_MYCS2
Mycobacterium kansasii ATCC 12478 | >gi|240173202|ref|ZP_04751860.1|
Probable fatty-acid-CoA ligase fadD9 | >uniprot|A1KLT8|A1KLT8_MYCBP
Mycobacterium intracellulare ATCC13950 | >gi|254822803|ref|ZP_05227804.1|
Fatty-acid-CoA ligase fadD9 | >uniprot|A1QUM2|A1QUM2_MYCTF
Thioester reductase domain | >uniprot|A1T887|A1T887_MYCVP
Thioester reductase domain | >uniprot|A1UFA8|A1UFA8_MYCSK
Mycobacterium avium subsp. ATCC 25291 | >gi|254775919|ref|ZP_05217435.1|
Thioester reductase domain | >uniprot|A3PYW9|A3PYW9_MYCSJ
Mycobacterium leprae Br4923 | >gi|219932734|emb|CAR70557.1|
Putative acyl-CoA synthetase | >uniprot|A5CM59|A5CM59_CLAM3
Thioester reductase domain | >uniprot|A8M8D3|A8M8D3_SALAI
Probable fatty-acid-CoA ligase FadD | >uniprot|B1MCR9|B1MCR9_MYCAB
Probable fatty-acid-CoA ligase FadD | >uniprot|B1MCS0|B1MCS0_MYCAB
Putative fatty-acid-CoA ligase | >uniprot|B1MDX4|B1MDX4_MYCAB
Probable fatty-acid-coa ligase FadD | >uniprot|B1MLD7|B1MLD7_MYCAB
Putative carboxylic acid reductase | >uniprot|B1VMZ4|B1VMZ4_STRGG
Fatty-acid-CoA ligase FadD9_1 | >uniprot|B2HE95|B2HE95_MYCMM
Fatty-acid-CoA ligase FadD9 | >uniprot|B2HN69|B2HN69_MYCMM
Putative Acyl-CoA synthetase | >uniprot|O69484|O69484_MYCLE
Probable peptide synthetase nrp | >uniprot|Q10896|Q10896_MYCTU
Putative carboxylic acid reductase | >uniprot|Q5YY80|Q5YY80_NOCFA
ATP/NADPH-dependent carboxylic acid reductase | >uniprot|Q6RKB1|Q6RKB1_9NOCA
FadD9 | >uniprot|Q741P9|Q741P9_MYCPA
Substrate--CoA ligase, putative | >uniprot|Q7D6X4|Q7D6X4_MYCTU
Probable fatty-acid-coa ligase fadd9 | >uniprot|Q7TY99|Q7TY99_MYCBO
Putative acyl-CoA synthetase | >uniprot|Q9CCT4|Q9CCT4_MYCLE
Putative uncharacterized protein | >uniprot|Q54JK0|Q54JK0_DICDI
Putative non-ribosomal peptide synthetase | >uniprot|Q2MFQ3|Q2MFQ3_STRRY
Mycobacterium tuberculosis EAS054 | >gi|215431545|ref|ZP_03429464.1|
Mycobacterium tuberculosis GM 1503 | >gi|218754327|ref|ZP_03533123.1|
Mycobacterium tuberculosis T85 | >gi|215446840|ref|ZP_03433592.1|
Mycobacterium tuberculosis T17 | >gi|219558593|ref|ZP_03537669.1|
Mycobacterium intracellulare ATCC13950 | >gi|254819907|ref|ZP_05224908.1|
[0193] In certain embodiments, a host cell, which is used to
produce branched fatty alcohols and/or derivatives herein, can be
engineered to express or overexpress one or more acyl-ACP
reductase polypeptides, variants, or fragments thereof to achieve
an improved production of one or more desirable branched fatty
alcohols or derivatives. Alternatively, a host cell can be
engineered to express an attenuated level of an endogenous acyl-ACP
reductase. Non-limiting examples of suitable acyl-ACP reductases
are listed in Table 7 below:
TABLE-US-00007 TABLE 7 Acyl-ACP Reductase Polypeptides
Organism | Accession No.
Synechococcus elongatus PCC7942 | Synpcc7942_1594 (YP_400611)
Synechocystis sp. | sll0209 (NP_442146)
Cyanothece sp. ATCC51142 | cce_1430 (YP_001802846)
Prochlorococcus marinus subsp. pastoris str. CCMP1986 | PMM0533 (NP_892651)
Gloeobacter violaceus PCC7421 | NP_96091 (gll3145)
Nostoc punctiforme PCC73102 | ZP_00108837 (Npun02004176)
Anabaena variabilis ATCC29413 | YP_323044 (Ava_2534)
Synechococcus elongatus PCC6301 | YP_170761 (syc0051_d)
Nostoc sp. PCC 7120 | alr5284 (NP_489324)
Prochlorococcus marinus subsp. pastoris str. CCMP1986 | PMM0533 (NP_892651)
[0194] In certain embodiments, a host cell, which is used to
produce fatty alcohols and/or derivatives herein, can be further
engineered to express or overexpress one or more fatty alcohol
biosynthesis polypeptides, variants, or fragments thereof in order
to achieve an improved production of one or more desirable branched
fatty alcohols or derivatives. Alternatively, a host cell can be
engineered to express an attenuated level of an endogenous fatty
alcohol biosynthesis polypeptide. Non-limiting examples of suitable
fatty alcohol biosynthesis polypeptides are listed in Table 8
below:
TABLE-US-00008 TABLE 8 Fatty Alcohol Biosynthesis/Alcohol Dehydrogenase Polypeptide
Name | GenBank Accession No. | Name | GenBank Accession No. | Name | GenBank Accession No.
ygjB | NP_418690 | YggP | YP_026187 | YciK | NP_415787
yahK | NP_414859 | YiaY | YP_026233 | YgfF | NP_417378
adhP | NP_415995 | FucO | NP_417279 | YghA | NP_417476
ydjL | NP_416290 | EutG | NP_416948 | YjgI | NP_418670
ydjJ | NP_416288 | YqhD | NP_417484 | YdfG | NP_416057
idnD | NP_418688 | AdhE | NP_415757 | YgcW | NP_417254
Tdh | NP_418073 | dkgB | NP_414743 | UcpA | NP_416921
yjjN | NP_418778 | YdjG | NP_416285 | EntA | NP_415128
rspB | NP_416097 | YeaE | NP_416295 | FolM | NP_416123
gatD | NP_416594 | dkgA | NP_417485 | HdhA | NP_416136
yphC | NP_417040 | YajO | NP_414953 | HcaB | NP_417036
yhdH | NP_417719 | YghZ | NP_417474 | SrlD | NP_417185
ycjQ | NP_415829 | Tas | NP_417311 | KduD | NP_417319
yncB | NP_415966 | YdhF | YP_025305 | IdnO | NP_418687
Qor | NP_418475 | YdbC | NP_415924 | FabG | NP_415611
frmA | NP_414890 | ybbO | NP_415026 | FabI | NP_415804
ybdR | NP_415141 | yohF | NP_416641 | YdjA | NP_416279
[0195] In some instances, a host cell, which can be used to produce
branched fatty alcohols and/or derivatives herein, is genetically
engineered to increase the level of branched fatty acids in the
host cell relative to a corresponding wild-type host cell. For
example, the host cell can be genetically engineered to express a
reduced level of an acyl-CoA synthase relative to a wild-type host
cell. In one embodiment, the level of expression of one or more
genes (e.g., an acyl-CoA synthase gene) is reduced by genetically
engineering a "knock out" host cell.
[0196] Any known acyl-CoA synthase gene can be reduced or knocked
out in a host cell. Non-limiting examples of acyl-CoA synthase
genes include fadD, fadK, BH3103, yhfL, Pfl-4354, EAV15023, fadD1,
fadD2, RPC_4074, fadDD35, fadDD22, faa3p or the gene encoding the
protein ZP_01644857. Specific examples of acyl-CoA synthase genes
include fadDD35 from M. tuberculosis H37Rv [NP_217021], fadDD22
from M. tuberculosis H37Rv [NP_217464], fadD from E. coli
[NP_416319], fadK from E. coli [YP_416216], fadD from Acinetobacter
sp. ADP1 [YP_045024], fadD from Haemophilus influenzae Rd KW20
[NP_438551], fadD from Rhodopseudomonas palustris Bis B18
[YP_533919], BH3101 from Bacillus halodurans C-125 [NP_243969],
Pfl-4354 from Pseudomonas fluorescens Pf0-1 [YP_350082], EAV15023
from Comamonas testosteroni KF-1 [ZP_01520072], yhfL from B.
subtilis [NP_388908], fadD1 from P. aeruginosa PAO1 [NP_251989],
fadD1 from Ralstonia solanacearum GMI1000 [NP_520978], fadD2 from
P. aeruginosa PAO1 [NP_251990], the gene encoding the protein
ZP_01644857 from Stenotrophomonas maltophilia R551-3, faa3p from
Saccharomyces cerevisiae [NP_012257], faa1p from Saccharomyces
cerevisiae [NP_014962], lcfA from Bacillus subtilis [CAA99571], or
those described in Shockey et al., Plant Physiol., 129: 1710-1722
(2002); Caviglia et al., J. Biol. Chem., 279: 1163-1169 (2004);
Knoll et al., J. Biol. Chem., 269(23): 16348-56 (1994); Johnson et
al., J. Biol. Chem., 269: 18037-18046 (1994); and Black et al., J.
Biol. Chem. 267: 25513-25520 (1992).
Production of Branched Precursors
[0197] Branched fatty alcohols and derivatives can be produced from
branched fatty aldehydes containing one or more branch points,
using branched acyl-ACPs as substrates for a fatty aldehyde
biosynthesis polypeptide or an acyl-ACP reductase polypeptide as
described herein. The first step in forming branched fatty alcohol
precursors is the production of the corresponding alpha-keto acids
by a branched-chain amino acid aminotransferase. Host cells may
endogenously include genes encoding such enzymes or such genes can
be recombinantly introduced. E. coli, for example, endogenously
expresses such an enzyme, IlvE (EC 2.6.1.42; GenBank accession
YP_026247). In host cells where no branched-chain amino acid
aminotransferase is expressed, an E. coli IlvE or any other
branched-chain amino acid aminotransferase (e.g., IlvE from
Lactococcus lactis (GenBank accession AAF34406), IlvE from
Pseudomonas putida (GenBank accession NP_745648), or IlvE from
Streptomyces coelicolor (GenBank accession NP_629657)), can be
introduced.
[0198] In another embodiment, the production of alpha-keto acids
can be achieved using the methods described in Park et al., PNAS,
104:7797-7802 (2007) and Atsumi et al., Nature, 451: 86-89 (2008).
For example, 2-ketoisovalerate can be produced by overexpressing
the genes encoding IlvI, IlvH, IlvH mutant, IlvB, IlvN, IlvGM,
IlvC, or IlvD. Alternatively, 2-keto-3-methyl-valerate can be
produced by overexpressing the genes encoding IlvA and IlvI, IlvH
(or AlsS of Bacillus subtilis), IlvC, IlvD, or their homologs.
2-keto-4-methyl-pentanoate can also be produced by overexpressing
the genes encoding IlvI, IlvH, IlvC, IlvD and LeuA, LeuB, LeuC,
LeuD, or their homologs.
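By way of non-limiting illustration only, the enzymes named in the preceding paragraph for each alpha-keto acid can be held in a lookup table, which makes it easy to see which enzymes recur across the three routes. The dictionary layout and function names are hypothetical; the enzyme lists simply restate the paragraph above, with AlsS included because the text offers it as an alternative to IlvH.

```python
# Enzymes named in the text for each alpha-keto acid (restated, not extended).
KETO_ACID_ROUTES = {
    "2-ketoisovalerate": ["IlvI", "IlvH", "IlvH mutant", "IlvB", "IlvN",
                          "IlvGM", "IlvC", "IlvD"],
    "2-keto-3-methyl-valerate": ["IlvA", "IlvI", "IlvH", "AlsS", "IlvC",
                                 "IlvD"],
    "2-keto-4-methyl-pentanoate": ["IlvI", "IlvH", "IlvC", "IlvD", "LeuA",
                                   "LeuB", "LeuC", "LeuD"],
}


def candidate_enzymes(keto_acid):
    """Return the enzymes listed for one alpha-keto acid."""
    if keto_acid not in KETO_ACID_ROUTES:
        raise ValueError(f"no route recorded for {keto_acid!r}")
    return list(KETO_ACID_ROUTES[keto_acid])


def shared_enzymes(*keto_acids):
    """Enzymes common to every named route, in the order of the first route."""
    if not keto_acids:
        raise ValueError("at least one alpha-keto acid is required")
    common = set.intersection(*(set(candidate_enzymes(k)) for k in keto_acids))
    return [e for e in candidate_enzymes(keto_acids[0]) if e in common]


print(shared_enzymes(*KETO_ACID_ROUTES))  # ['IlvI', 'IlvH', 'IlvC', 'IlvD']
```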
[0199] In another example, isobutyryl-CoA can be made in a host
cell, for example in E. coli, through the coexpression of a
crotonyl-CoA reductase (Ccr, EC 1.6.5.5, 1.1.1.1) and
isobutyryl-CoA mutase (large subunit IcmA, EC 5.4.99.2; small
subunit IcmB, EC 5.4.99.2) (Han and Reynolds, J. Bacteriol., 179:
5157 (1997)). Crotonyl-CoA is an intermediate in fatty acid
biosynthesis in E. coli and other microorganisms. Non-limiting
examples of ccr and icm genes from selected microorganisms are
listed in Table 9.
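TABLE-US-00009 TABLE 9 ccr and icm Genes from Selected Microorganisms
Organism | Gene | GenBank Accession #
Streptomyces coelicolor | Ccr | NP_630556
Streptomyces coelicolor | icmA | NP_629554
Streptomyces coelicolor | icmB | NP_630904
Streptomyces cinnamonensis | Ccr | AAD53915
Streptomyces cinnamonensis | icmA | AAC08713
Streptomyces cinnamonensis | icmB | AJ246005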
Formation of Branched Cyclic Fatty Alcohols and Derivatives
[0200] Branched cyclic fatty alcohols can be produced from suitable
alpha keto acids using branched cyclic fatty acid derivatives such
as a branched cyclic acyl-ACP as substrates. To produce branched
cyclic fatty acid derivative substrates, genes that provide cyclic
precursors (e.g., the ans, chc, and plm gene families) can be
introduced into a host cell and expressed to allow initiation of
fatty acid biosynthesis from branched cyclic precursors. For
example, to convert a host cell, such as E. coli, into one capable
of synthesizing .omega.-cyclic fatty acids (cyFA), a gene that
provides the cyclic precursor cyclohexylcarbonyl-CoA (CHC-CoA)
(Cropp et al., Nature Biotech., 18: 980-983 (2000)) can be
introduced and expressed in the host cell. Non-limiting examples of
genes that provide CHC-CoA in E. coli include: ansJ, ansK, ansL,
chcA, and ansM from the ansatrienin gene cluster of Streptomyces
collinus (Chen et al., Eur. J. Biochem., 261: 98-107 (1999)) or
plmJ, plmK, plmL, chcA, and plmM from the phoslactomycin B gene
cluster of Streptomyces sp. HK803 (Palaniappan et al., J. Biol.
Chem., 278: 35552-35557 (2003)) together with the chcB gene (Patton
et al., Biochem., 39: 7595-7604 (2000)) from S. collinus, S.
avermitilis, or S. coelicolor (see Table 10). The genes listed in
Table 10 can then be expressed to allow initiation and elongation
of .omega.-cyclic fatty acids. Alternatively, the homologous genes
can be isolated from microorganisms that make cyFA and expressed in
a host cell (e.g., E. coli).
TABLE 10: Genes for the Synthesis of CHC-CoA
Organism | Gene | GenBank Accession No.
Streptomyces collinus | ansJK, ansL, chcA, ansM, chcB | U72144*, AF268489
Streptomyces sp. HK803 | plmJK | AAQ84158
Streptomyces sp. HK803 | plmL | AAQ84159
Streptomyces sp. HK803 | chcA | AAQ84160
Streptomyces sp. HK803 | plmM | AAQ84161
Streptomyces coelicolor | chcB/caiD | NP_629292
Streptomyces avermitilis | chcB/caiD | NP_629292
*Only chcA is annotated in GenBank entry U72144; ansJKLM are
according to Chen et al. (Eur. J. Biochem., 261: 98-107
(1999)).
[0201] Genes fabH, acp, and fabF allow initiation and elongation of
.omega.-cyclic fatty acids because they have broad substrate
specificity. If the coexpression of any of these genes with the
genes listed in Table 10 does not yield cyFA, then fabH, acp,
and/or fabF homologs from microorganisms that make cyFAs (e.g.,
those listed in Table 11) can be isolated (e.g., by using
degenerate PCR primers or heterologous DNA sequence probes) and
coexpressed.
TABLE 11: Non-Limiting Examples of Microorganisms Containing
.omega.-cyclic Fatty Acids
Organism | Reference
Curtobacterium pusillum | ATCC19096
Alicyclobacillus acidoterrestris | ATCC49025
Alicyclobacillus acidocaldarius | ATCC27009
Alicyclobacillus cycloheptanicus* | Moore, J. Org. Chem., 62: 2173 (1997)
*Uses cycloheptylcarbonyl-CoA and not cyclohexylcarbonyl-CoA as
precursor for cyFA biosynthesis.
Branched Fatty Alcohol Saturation Levels
[0202] The degree of saturation in branched fatty acid derivative
substrates, such as, for example, a branched acyl-ACP (which can
then be converted into branched fatty aldehydes and then branched
fatty alcohols as described herein) can be controlled by regulating
the degree of saturation of fatty acid intermediates. For example,
the sfa, gns, and fab families of genes can be expressed or
overexpressed to control the saturation of a branched acyl-ACP. In
certain embodiments, the host cells can be engineered to reduce the
expression of an sfa, gns, or fab gene and control the level of
saturated substrates vs. unsaturated substrates, which in turn
affects the production level of saturated branched fatty alcohols
or derivatives vs. unsaturated branched fatty alcohols or
derivatives.
[0203] In some instances, a host cell can be engineered to express
an attenuated level of a dehydratase/isomerase and/or a
ketoacyl-ACP synthase. For example, a host cell can be engineered
to express a decreased level of fabA and/or fabB. In some
instances, the host cell can be cultured or grown in the presence
of unsaturated fatty acids. In some instances, the host cell can be
engineered to express or overexpress a gene encoding a desaturase
enzyme. One non-limiting example of a desaturase is B. subtilis
DesA (AF037430). Other genes encoding desaturases are known in the
art and can be introduced or used in the host cell and methods
described herein, such as desaturases that use acyl-ACPs,
including, for example, hexadecanoyl-ACP or octadecanoyl-ACP.
[0204] In some embodiments, host cells can be engineered to
produce unsaturated fatty acids by engineering the production host
to overexpress fabB or by growing the production host at low
temperatures (e.g., less than 37.degree. C.). FabB has a preference
for cis-.delta.3-decenoyl-ACP and results in unsaturated fatty acid
production in E. coli. Overexpression of fabB results in the
production of a significant percentage of unsaturated fatty acids
(de Mendoza et al., J. Biol. Chem., 258: 2098-2101 (1983)). The
gene fabB may be inserted into and expressed in host cells not
naturally having the gene. These unsaturated fatty acids can then
be used as intermediates in host cells that are engineered to
produce branched and unsaturated fatty acid derivative substrates,
such as branched and unsaturated fatty aldehydes, which can in turn
be converted into branched and unsaturated fatty alcohols and
derivatives.
[0205] In other instances, a repressor of fatty acid biosynthesis,
for example, fabR (GenBank accession NP_418398), can be deleted,
which will also result in increased unsaturated fatty acid
production in E. coli (Zhang et al., J. Biol. Chem., 277: 15558
(2002)). Similar deletions may be made in other host cells. A
further increase in unsaturated fatty acids may be achieved, for
example, by overexpressing fabM (trans-2, cis-3-decenoyl-ACP
isomerase, GenBank accession DAA05501) and controlled expression of
fabK (trans-2-enoyl-ACP reductase II, GenBank accession NP_357969)
from Streptococcus pneumoniae (Marrakchi et al., J. Biol. Chem.,
277: 44809 (2002)), while deleting E. coli fabI (trans-2-enoyl-ACP
reductase, GenBank accession NP_415804). In some examples, the
endogenous fabF gene can be attenuated, thus increasing the
percentage of palmitoleate (C16:1) produced.
Production of Genetic Variants
[0206] Variants can be naturally occurring or created in vitro. In
particular, such variants can be created using genetic engineering
techniques, such as site directed mutagenesis, random chemical
mutagenesis, Exonuclease III deletion procedures, or standard
cloning techniques. Alternatively, such variants, fragments,
analogs, or derivatives can be created using chemical synthesis or
modification procedures.
[0207] Methods of making variants are well known in the art. These
include procedures in which nucleic acid sequences obtained from
natural isolates are modified to generate nucleic acids that encode
polypeptides having characteristics that enhance their value in
industrial or laboratory applications. In such procedures, a large
number of variant sequences having one or more nucleotide
differences with respect to the sequence obtained from the natural
isolate are generated and characterized. Typically, these
nucleotide differences result in amino acid changes with respect to
the polypeptides encoded by the nucleic acids from the natural
isolates.
[0208] For example, variants can be created using error prone PCR
(see, e.g., Leung et al., Technique, 1: 11-15 (1989); and Caldwell
et al., PCR Methods Applic., 2: 28-33 (1992)). In error prone PCR,
PCR is performed under conditions where the copying fidelity of the
DNA polymerase is low, such that a high rate of point mutations is
obtained along the entire length of the PCR product. Briefly, in
such procedures, nucleic acids to be mutagenized (e.g., a fatty
aldehyde biosynthetic polynucleotide sequence) are mixed with PCR
primers, reaction buffer, MgCl.sub.2, MnCl.sub.2, Taq polymerase,
and an appropriate concentration of dNTPs for achieving a high rate
of point mutation along the entire length of the PCR product. For
example, the reaction can be performed using 20 fmoles of nucleic
acid to be mutagenized (e.g., a fatty aldehyde biosynthetic
polynucleotide sequence), 30 pmoles of each PCR primer, a reaction
buffer comprising 50 mM KCl, 10 mM Tris HCl (pH 8.3), and 0.01%
gelatin, 7 mM MgCl.sub.2, 0.5 mM MnCl.sub.2, 5 units of Taq
polymerase, 0.2 mM dGTP, 0.2 mM dATP, 1 mM dCTP, and 1 mM dTTP. PCR
can be performed for 30 cycles of 94.degree. C. for 1 min,
45.degree. C. for 1 min, and 72.degree. C. for 1 min. However, it
will be appreciated that these parameters can be varied as
appropriate. The mutagenized nucleic acids are then cloned into an
appropriate vector and the activities of the polypeptides encoded
by the mutagenized nucleic acids are evaluated.
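By way of non-limiting illustration, the example error prone PCR
conditions recited above can be recorded as structured data and
checked programmatically. The following Python sketch is not part of
the described procedure; the dictionary layout and the helper names
total_cycling_minutes and dntp_bias are assumptions introduced here
for illustration, while the numeric values are those recited above.

```python
# Illustrative sketch only. The data layout and function names are
# assumptions; the numbers are the example conditions recited above.
EXAMPLE_REACTION = {
    "template_fmol": 20,
    "primer_pmol_each": 30,
    "KCl_mM": 50,
    "Tris_HCl_mM": 10,
    "Tris_HCl_pH": 8.3,
    "gelatin_percent": 0.01,
    "MgCl2_mM": 7,
    "MnCl2_mM": 0.5,
    "Taq_units": 5,
    "dNTP_mM": {"dGTP": 0.2, "dATP": 0.2, "dCTP": 1.0, "dTTP": 1.0},
}

# (temperature in degrees C, hold time in minutes) for one cycle
EXAMPLE_CYCLE = [(94, 1), (45, 1), (72, 1)]
EXAMPLE_CYCLES = 30


def total_cycling_minutes(cycle, n_cycles):
    """Return the summed hold time, ignoring ramping between steps."""
    return n_cycles * sum(minutes for _, minutes in cycle)


def dntp_bias(dntp_mM):
    """Return the ratio of the highest to the lowest dNTP concentration."""
    values = list(dntp_mM.values())
    return max(values) / min(values)


if __name__ == "__main__":
    print(total_cycling_minutes(EXAMPLE_CYCLE, EXAMPLE_CYCLES))
    print(dntp_bias(EXAMPLE_REACTION["dNTP_mM"]))
```

Under these example conditions the summed hold time is 90 minutes,
and dCTP and dTTP are present at five times the concentration of dGTP
and dATP.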
[0209] Variants can also be created using oligonucleotide directed
mutagenesis to generate site-specific mutations in any cloned DNA
of interest. Oligonucleotide mutagenesis is described in, for
example, Reidhaar-Olson et al., Science, 241: 53-57 (1988).
[0210] Variants can also be generated by assembly PCR, which
involves the assembly of a PCR product from a mixture of small DNA
fragments. A large number of different PCR reactions occur in
parallel in the same vial, with the products of one reaction
priming the products of another reaction. Assembly PCR is described
in, e.g., U.S. Pat. No. 5,965,408.
[0211] Still another method of generating variants is sexual PCR
mutagenesis, wherein forced homologous recombination occurs between
DNA molecules of different, but highly related, DNA sequence in
vitro as a result of random fragmentation of the DNA molecule based
on sequence homology. This is followed by fixation of the crossover
by primer extension in a PCR reaction. Sexual PCR mutagenesis is
described in, for example, Stemmer, Proc. Natl. Acad. Sci. USA, 91:
10747-10751 (1994).
[0212] Variants can also be created by in vivo mutagenesis. In some
embodiments, random mutations in a nucleic acid sequence are
generated by propagating the sequence in a bacterial strain, such
as an E. coli strain, which carries mutations in one or more of the
DNA repair pathways. Such "mutator" strains have a higher random
mutation rate than that of a wild-type strain. Propagating a DNA
sequence (e.g., a BKD polynucleotide sequence, a beta acyl-ACP
synthase polynucleotide sequence, a fatty aldehyde biosynthesis
polynucleotide sequence, or a fatty alcohol biosynthesis
polynucleotide sequence) in one of these strains will eventually
generate random mutations within the DNA. Mutator strains suitable
for use for in vivo mutagenesis are described in, for example,
International Publication WO 91/016427.
[0213] Variants can also be generated using cassette mutagenesis.
In cassette mutagenesis, a small region of a double stranded DNA
molecule is replaced with a synthetic oligonucleotide "cassette"
that differs from the native sequence. The oligonucleotide often
contains a completely and/or partially randomized native
sequence.
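As a non-limiting illustration of the cassette concept, the following
Python sketch enumerates cassettes in which selected positions of a
native sequence are fully randomized and substitutes a cassette into
a template. It models a single strand as a text string only; the
function names cassette_variants and replace_region, the 0-based
coordinates, and the example sequences are hypothetical and are not
taken from the procedures described herein.

```python
import itertools


def cassette_variants(native, positions, alphabet="ACGT"):
    """Yield every cassette obtained by randomizing the given positions.

    `positions` are 0-based indices into `native`; all other positions
    keep the native base (a partially randomized cassette).
    """
    positions = sorted(set(positions))
    for combo in itertools.product(alphabet, repeat=len(positions)):
        seq = list(native)
        for pos, base in zip(positions, combo):
            seq[pos] = base
        yield "".join(seq)


def replace_region(template, start, end, cassette):
    """Replace template[start:end] with a cassette of equal length."""
    if len(cassette) != end - start:
        raise ValueError("cassette length must match the replaced region")
    return template[:start] + cassette + template[end:]


if __name__ == "__main__":
    # Randomize only the third base of the hypothetical cassette "GAT".
    for cassette in cassette_variants("GAT", [2]):
        print(replace_region("AAAGATCCC", 3, 6, cassette))
```

Randomizing n positions over four bases yields 4^n cassettes, so the
number of randomized positions is generally kept small when every
variant is to be enumerated.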
[0214] Recursive ensemble mutagenesis can also be used to generate
variants. Recursive ensemble mutagenesis is an algorithm for
protein engineering (i.e., protein mutagenesis) developed to
produce diverse populations of phenotypically related mutants whose
members differ in amino acid sequence. This method uses a feedback
mechanism to control successive rounds of combinatorial cassette
mutagenesis. Recursive ensemble mutagenesis is described in, for
example, Arkin et al., Proc. Natl. Acad. Sci. USA, 89: 7811-7815
(1992).
[0215] In some embodiments, variants are created using exponential
ensemble mutagenesis. Exponential ensemble mutagenesis is a process
for generating combinatorial libraries with a high percentage of
unique and functional mutants, wherein small groups of residues are
randomized in parallel to identify, at each altered position, amino
acids which lead to functional proteins. Exponential ensemble
mutagenesis is described in, for example, Delegrave et al.,
Biotech. Res., 11: 1548-1552 (1993). Random and site-directed
mutagenesis are described in, for example, Arnold, Curr. Opin.
Biotech., 4: 450-455 (1993).
[0216] In some embodiments, variants are created using shuffling
procedures wherein portions of a plurality of nucleic acids that
encode distinct polypeptides are fused together to create chimeric
nucleic acid sequences that encode chimeric polypeptides as
described in, for example, U.S. Pat. Nos. 5,965,408 and
5,939,250.
[0217] Polynucleotide variants also include nucleic acid analogs.
Nucleic acid analogs can be modified at the base moiety, sugar
moiety, or phosphate backbone to improve, for example, stability,
hybridization, or solubility of the nucleic acid. Modifications at
the base moiety include deoxyuridine for deoxythymidine and
5-methyl-2'-deoxycytidine or 5-bromo-2'-deoxycytidine for
deoxycytidine. Modifications of the sugar moiety include
modification of the 2' hydroxyl of the ribose sugar to form
2'-halo, 2'-O-methyl or 2'-O-allyl sugars. The deoxyribose
phosphate backbone can be modified to produce morpholino nucleic
acids, in which each base moiety is linked to a six-membered,
morpholino ring, or peptide nucleic acids, in which the
deoxyphosphate backbone is replaced by a pseudopeptide backbone and
the four bases are retained. (See, e.g., Summerton et al.,
Antisense Nucleic Acid Drug Dev., 7: 187-195 (1997); and Hyrup et
al., Bioorgan. Med. Chem., 4: 5-23 (1996)). In addition, the
deoxyphosphate backbone can be replaced with, for example, a
phosphorothioate or phosphorodithioate backbone, a
phosphoroamidite, or an alkyl phosphotriester backbone.
Production of Polypeptide Variants
[0218] Conservative substitutions are those that substitute an
amino acid in a polypeptide by another amino acid of similar
characteristics. Common conservative substitutions include, without
limitation: replacing an aliphatic amino acid, such as alanine,
valine, leucine, and isoleucine, with another aliphatic amino acid;
replacing a serine with a threonine or vice versa; replacing an
acidic residue, such as aspartic acid and glutamic acid, with
another acidic residue; replacing a residue bearing an amide group,
such as asparagine and glutamine, with another residue bearing an
amide group; replacing a basic residue, such as lysine and
arginine, with another basic residue; and replacing an aromatic
residue, such as phenylalanine and tyrosine, with another aromatic
residue.
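For illustration only, the groupings listed above can be expressed as
a simple lookup. In the following Python sketch, the group labels and
the function name is_conservative are assumptions introduced here;
the residue groupings follow the non-limiting list in the preceding
paragraph and are not an exhaustive definition of conservative
substitution.

```python
# Illustrative sketch: one-letter amino acid codes grouped according to
# the non-limiting list above. Group labels are hypothetical.
CONSERVATIVE_GROUPS = {
    "aliphatic": set("AVLI"),  # alanine, valine, leucine, isoleucine
    "hydroxyl": set("ST"),     # serine, threonine
    "acidic": set("DE"),       # aspartic acid, glutamic acid
    "amide": set("NQ"),        # asparagine, glutamine
    "basic": set("KR"),        # lysine, arginine
    "aromatic": set("FY"),     # phenylalanine, tyrosine
}


def is_conservative(original, replacement):
    """Return True if two different residues share a listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues: no substitution took place
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both aliphatic
    print(is_conservative("D", "K"))  # acidic versus basic
```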
[0219] Other polypeptide variants are those in which one or more
amino acid residues include a substituent group. Still other
polypeptide variants are those in which the polypeptide is
associated with another compound, such as a compound to increase
the half-life of the polypeptide (e.g., polyethylene glycol).
[0220] Additional polypeptide variants are those in which
additional amino acids are fused to the polypeptide, such as a
leader sequence, a secretory sequence, a proprotein sequence, or a
sequence which facilitates purification, enrichment, or
stabilization of the polypeptide.
[0221] In some instances, the polypeptide variants described herein
retain the same biological function as a polypeptide from which
they are derived (e.g., retain branched-chain alpha keto acid
dehydrogenase activity, retain beta ketoacyl-ACP synthase
activity, such as FabH activity, or retain fatty aldehyde
biosynthetic activity, such as carboxylic acid or fatty acid
reductase activity) and have amino acid sequences substantially
identical thereto.
[0222] In other instances, the polypeptide variants have at least
about 50%, at least about 55%, at least about 60%, at least about
65%, at least about 70%, at least about 75%, at least about 80%, at
least about 85%, at least about 90%, at least about 95%, or more
than about 95% homology to an amino acid sequence from which they
are derived. In another embodiment, the polypeptide variants
include a fragment comprising at least about 5, 10, 15, 20, 25, 30,
35, 40, 50, 75, 100, or 150 consecutive amino acids thereof.
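As a simplified, non-limiting illustration of comparing a variant to
the sequence from which it is derived, the following Python sketch
computes percent identity over two sequences that have already been
aligned. It is an assumption of this sketch that identity over aligned
columns is used as the measure; it does not perform the alignment
itself, and the function names, the gap character, and the example
sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns holding the same residue.

    Both inputs must come from the same pairwise alignment and so have
    equal length. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """Return True if percent identity is at least the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    parent = "MKTAYIAKQR"   # hypothetical parent fragment
    variant = "MKTAYLAKQR"  # hypothetical variant, one substitution
    print(percent_identity(parent, variant))
    print(meets_threshold(parent, variant, 85))
```

In this hypothetical example, nine of ten aligned positions match, so
the variant satisfies an "at least about 85%" criterion but not an
"at least about 95%" criterion.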
[0223] The polypeptide variants or fragments thereof can be
obtained by isolating nucleic acids encoding them using techniques
described herein or by expressing synthetic nucleic acids encoding
them. Alternatively, polypeptide variants or fragments thereof can
be obtained through biochemical enrichment or purification
procedures. The sequence of polypeptide variants or fragments can
be determined by proteolytic digestion, gel electrophoresis, and/or
microsequencing. The sequence of the polypeptide variants or
fragments can then be compared to the amino acid sequence from
which it is derived using any of the programs described herein.
[0224] The polypeptide variants and fragments thereof can be
assayed for enzymatic activity. For example, the polypeptide
variants or fragments can be contacted with a substrate under
conditions that allow the polypeptide variants or fragments to
function. A decrease in the level of the substrate or an increase
in the level of the desired product can be measured to determine
its activity.
Modifications to Increase Conversion of Branched Substrates to
Branched Fatty Alcohol
[0225] Host cells can be engineered using known polypeptides to
produce branched fatty alcohols from branched substrate, including,
for example, a branched fatty acid, a branched fatty acid
derivative, a branched acyl-CoA, or a branched acyl-CoA derivative
substrate. For example, one method of making branched fatty
alcohols involves increasing the expression of, or expressing more
active forms of, fatty alcohol forming acyl-CoA reductases (encoded
by a gene such as acr1 from FAR, EC 1.2.1.50/1.1.1) or acyl-CoA
reductases (EC 1.2.1.50) and alcohol dehydrogenase (EC
1.1.1.1).
[0226] The host cell can also be, for example, modified or
engineered, such that it expresses or overexpresses at least one
(E1) subunit of a branched-chain alpha keto acid dehydrogenase
complex, and a beta ketoacyl-ACP synthase. The host cell can be
further engineered such that it expresses or overexpresses a fatty
aldehyde biosynthesis polypeptide and/or a fatty alcohol
biosynthesis polypeptide. Alternatively, the host cell can be
engineered such that it expresses or overexpresses an acyl-ACP
reductase polypeptide and a fatty alcohol biosynthesis
polypeptide.
[0227] In certain embodiments, the genes encoding the subunits of a
branched-chain alpha keto acid dehydrogenase complex can be derived
from a bacterium, a plant, an insect, a yeast, a fungus, or a
mammal. For example, the subunits of the branched-chain alpha keto
acid dehydrogenase complex can be derived from a bacterium that
uses branched amino acids as carbon source, including, for example,
Pseudomonas putida or Bacillus subtilis. In another example, the
branched-chain alpha-keto acid dehydrogenase complex polypeptide
can be from a bacterium that comprises branched fatty acids in its
phospholipids, including, e.g., a Legionella, Stenotrophomonas,
Alteromonas, Flavobacterium, Myxococcus, Bacteroides, Micrococcus,
Staphylococcus, Bacillus, Clostridium, Listeria, Lactococcus, or
Streptomyces. In some embodiments, the bacterium is Legionella
pneumophila, Stenotrophomonas maltophilia, Alteromonas macleodii,
Flavobacterium psychrophilum, Myxococcus xanthus, Bacteroides
thetaiotaomicron, Micrococcus luteus, Staphylococcus aureus,
Clostridium thermocellum, Listeria monocytogenes, Streptomyces
lividans, Streptomyces coelicolor, Streptomyces glaucescens,
Streptococcus pneumoniae, Streptomyces peucetius, Streptococcus
pyogenes, Escherichia coli, Escherichia coli K-12, Lactococcus
lactis ssp. lactis, Mycobacterium tuberculosis, Enterococcus
tuberculosis, Bacillus subtilis, or Lactobacillus plantarum. In some
embodiments, suitable fatty aldehyde biosynthesis polypeptides,
fatty alcohol biosynthesis polypeptides, acyl-ACP reductases, and
other polypeptides of the invention can be from a mycobacterium
selected from Mycobacterium smegmatis, Mycobacterium abscessus,
Mycobacterium avium, Mycobacterium bovis, Mycobacterium
tuberculosis, Mycobacterium leprae, Mycobacterium marinum, or
Mycobacterium ulcerans. In other embodiments, the bacterium is
Nocardia sp. NRRL 5646, Nocardia farcinica, Streptomyces griseus,
Salinispora arenicola, or Clavibacter michiganensis. In yet
further embodiments, the polypeptide of the invention is derived
from a cyanobacterium, including, for example, Synechococcus
elongatus PCC7942, Synechocystis sp. PCC6803, Cyanothece sp.
ATCC51142, Prochlorococcus marinus subsp. pastoris str. CCMP1986
PMM0533, Gloeobacter violaceus PCC7421, Nostoc punctiforme
PCC73102, Anabaena variabilis ATCC29413, Synechococcus elongatus
PCC6301, Nostoc sp. PCC 7120, Microcoleus chthonoplastes
PCC7420, Arthrospira maxima CS-328, Lyngbya sp. PCC8106, Nodularia
spumigena CCY9414, Trichodesmium erythraeum IMS101, Microcystis
aeruginosa, Nostoc azollae, Anabaena variabilis, Crocosphaera
watsonii, Thermosynechococcus elongatus, Gloeobacter violaceus,
Cyanobium, or Prochlorococcus marinus.
Genetic Engineering of Host Cells to Produce Branched Fatty
Alcohols
[0228] Various host cells can be used to produce branched fatty
alcohols, as described herein. A host cell can be any prokaryotic
or eukaryotic cell. For example, the host cell can be bacterial
cells (such as E. coli), insect cells, yeast, or mammalian cells
(such as Chinese hamster ovary (CHO) cells, COS cells, VERO
cells, BHK cells, HeLa cells, Cv1 cells, MDCK cells, 293 cells, 3T3
cells, or PC12 cells). Other exemplary host cells include cells
from members of the genera Escherichia, Bacillus, Lactobacillus,
Rhodococcus, Pseudomonas, Aspergillus, Trichoderma, Neurospora,
Fusarium, Humicola, Rhizomucor, Kluyveromyces, Pichia, Mucor,
Myceliophthora, Penicillium, Phanerochaete, Pleurotus, Trametes,
Chrysosporium, Saccharomyces, Schizosaccharomyces, Yarrowia, or
Streptomyces. Yet other exemplary host cells can be a Bacillus
lentus cell, a Bacillus brevis cell, a Bacillus stearothermophilus
cell, a Bacillus licheniformis cell, a Bacillus alkalophilus cell,
a Bacillus coagulans cell, a Bacillus circulans cell, a Bacillus
pumilus cell, a Bacillus thuringiensis cell, a Bacillus clausii
cell, a Bacillus megaterium cell, a Bacillus subtilis cell, a
Bacillus amyloliquefaciens cell, a Trichoderma koningii cell, a
Trichoderma viride cell, a Trichoderma reesei cell, a Trichoderma
longibrachiatum cell, an Aspergillus awamori cell, an Aspergillus
fumigatus cell, an Aspergillus foetidus cell, an Aspergillus
nidulans cell, an Aspergillus niger cell, an Aspergillus oryzae
cell, a Humicola insolens cell, a Humicola lanuginosa cell, a
Rhizomucor miehei cell, a Mucor miehei cell, a Streptomyces
lividans cell, a Streptomyces murinus cell, or an Actinomycetes
cell. Host cells can also be cyanobacterial cells such as, for
example, Synechococcus sp., Synechococcus elongatus, or Synechocystis
sp. cells.
[0229] In a preferred embodiment, the host cell is an E. coli cell,
a Saccharomyces cerevisiae cell, or a Bacillus subtilis cell. For
example, the host cell can be one from E. coli strain B, C, K, or
W. Other suitable host cells are known to those skilled in the
art.
[0230] Various methods well known in the art can be used to
genetically engineer host cells to produce branched fatty alcohols.
The methods can include the use of vectors, preferably expression
vectors, containing a nucleic acid encoding the first (E1
alpha/beta) subunit of a branched-chain alpha keto acid
dehydrogenase, and optionally also the second (E2) and/or the third
(E3) subunit of that enzyme, and/or a beta ketoacyl-ACP synthase,
and/or a fatty aldehyde biosynthetic polypeptide, and/or an alcohol
dehydrogenase, and/or an acyl-ACP reductase described herein, or a
polypeptide variant or fragment thereof. Those skilled in the
art will appreciate that a variety of viral vectors (for example,
retroviral vectors, lentiviral vectors, adenoviral vectors, and
adeno-associated viral vectors) and non-viral vectors can be used
in the methods described herein.
[0231] The recombinant expression vectors can include
polynucleotides described herein in a form suitable for expression
in a host cell. The recombinant expression vectors can include one
or more control sequences, selected on the basis of the host cell
to be used for expression. The control sequence is operably linked
to the nucleic acid sequence to be expressed. Such control
sequences are described, for example, in Goeddel, Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego,
Calif. (1990). Control sequences include those that direct
constitutive expression of a nucleotide sequence in many types of
host cells and those that direct expression of the nucleotide
sequence only in certain host cells (e.g., tissue-specific
regulatory sequences). It will be appreciated by those skilled in
the art that the design of the expression vector can depend on such
factors as the choice of the host cell to be transformed, the level
of expression of protein desired, etc. The expression vectors
described herein can be introduced into host cells to produce
polypeptides, including fusion polypeptides, encoded by the nucleic
acids as described herein.
[0232] In some embodiments, recombinant expression vectors can be
designed for expression of a gene encoding a first (E1 alpha/beta)
subunit, and optionally a second (E2) and/or a third (E3) subunit
of a branched-chain alpha-keto acid dehydrogenase (or variant)
and/or a gene encoding a beta-ketoacyl ACP synthase (or variant),
and/or a gene encoding a fatty aldehyde biosynthesis polypeptide
(or variant), and/or a gene encoding an alcohol dehydrogenase (or
variant), and/or a gene encoding an acyl-ACP reductase (or
variant) in a suitable host cell. Suitable host cells are discussed
further in Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990).
Alternatively, the recombinant expression vector can be transcribed
and translated in vitro, for example, by using T7 promoter
regulatory sequences and T7 polymerase.
[0233] Expression of genes encoding polypeptides in prokaryotes,
for example, E. coli, is most often carried out with vectors
containing constitutive or inducible promoters directing the
expression of either fusion or non-fusion polypeptides. Fusion
vectors add a number of amino acids to a polypeptide encoded
therein, usually to the amino terminus of the recombinant
polypeptide. Such fusion vectors typically serve three purposes:
(1) to increase expression of the recombinant polypeptide; (2) to
increase the solubility of the recombinant polypeptide; and (3) to
aid in the purification of the recombinant polypeptide by acting as
a ligand in affinity purification. Often, in fusion expression
vectors, a proteolytic cleavage site is introduced at the junction
of the fusion moiety and the recombinant polypeptide. This enables
separation of the recombinant polypeptide from the fusion moiety
after purification of the fusion polypeptide. Examples of enzymes
that cleave at such sites, and their cognate recognition sequences,
include Factor Xa, thrombin, and enterokinase. Exemplary fusion
expression vectors include pGEX (Pharmacia Biotech Inc.; Smith et
al., Gene, 67: 31-40 (1988)), pMAL (New England Biolabs, Beverly,
Mass.), and pRIT5
(Pharmacia, Piscataway, N.J.), which fuse glutathione S-transferase
(GST), maltose E binding protein, or protein A, respectively, to
the target recombinant polypeptide.
[0234] Examples of inducible, non-fusion E. coli expression vectors
include pTrc (Amann et al., Gene, 69: 301-315 (1988)) and pET 11d
(Studier et al., Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990), pp. 60-89). Target
gene expression from the pTrc vector relies on host RNA polymerase
transcription from a hybrid trp-lac fusion promoter. Target gene
expression from the pET 11d vector relies on transcription from a
T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA
polymerase (T7 gn1). This viral polymerase is supplied by host
strains BL21(DE3) or HMS174(DE3) from a resident .lamda. prophage
harboring a T7 gn1 gene under the transcriptional control of the
lacUV5 promoter.
[0235] One strategy to maximize expression is to express the
polypeptide in a host cell with an impaired capacity to
proteolytically cleave the recombinant polypeptide (see Gottesman,
Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, Calif. (1990), pp. 119-128). Another strategy is
to alter the nucleic acid sequence to be inserted into an
expression vector so that the individual codons for each amino acid
are those preferentially utilized in the host cell (Wada et al.,
Nucleic Acids Res., 20: 2111-18 (1992)). These strategies can be
carried out by standard DNA synthesis techniques.
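The codon-preference strategy can be sketched, in a non-limiting way,
as a back-translation of an amino acid sequence using one preferred
codon per residue. In the following Python sketch, the codon table is
a deliberately partial, illustrative placeholder and is not taken from
Wada et al. or from any host-specific usage table; the function name
back_translate and the example peptide are likewise hypothetical. A
practical implementation would load a complete usage table for the
chosen host cell.

```python
# Illustrative placeholder table: one codon per amino acid for a few
# residues only. Replace with a complete host-specific table in practice.
ILLUSTRATIVE_PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAA",
    "L": "CTG",
    "A": "GCG",
    "G": "GGC",
    "E": "GAA",
    "W": "TGG",
    "*": "TAA",  # stop
}


def back_translate(protein, preferred_codons):
    """Return a coding sequence using one preferred codon per residue."""
    missing = sorted({aa for aa in protein if aa not in preferred_codons})
    if missing:
        raise KeyError(f"no preferred codon listed for: {missing}")
    return "".join(preferred_codons[aa] for aa in protein)


if __name__ == "__main__":
    # Hypothetical peptide ending in a stop symbol.
    print(back_translate("MKLAG*", ILLUSTRATIVE_PREFERRED_CODONS))
```

Choosing a single codon per residue is the simplest possible scheme;
it is shown here only to make the idea concrete.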
[0236] In another embodiment, the host cell is a yeast cell, and
the expression vector is a yeast expression vector. Examples of
vectors for expression in yeast S. cerevisiae include pYepSec1
(Baldari et al., EMBO J., 6: 229-234 (1987)), pMFa (Kurjan et al.,
Cell, 30: 933-943 (1982)), pJRY88 (Schultz et al., Gene, 54:
113-123 (1987)), pYES2 (Invitrogen Corporation, San Diego, Calif.),
and pPicZ (Invitrogen Corp, San Diego, Calif.).
[0237] Alternatively, polypeptides described herein can be
expressed in insect cells using baculovirus expression vectors.
Available baculovirus vectors include, for example, the pAc series
(Smith et al., Mol. Cell Biol., 3: 2156-2165 (1983)) and the pVL
series (Lucklow et al., Virology, 170: 31-39 (1989)).
[0238] In yet another embodiment, the polypeptides described herein
can be expressed in mammalian cells using a mammalian expression
vector. Examples of mammalian expression vectors include pCDM8
(Seed, Nature, 329: 840 (1987)) and pMT2PC (Kaufman et al., EMBO
J., 6: 187-195 (1987)). When used in mammalian cells, the
expression vector's control functions can be provided by viral
regulatory elements. Commonly used promoters include those derived
from polyoma, Adenovirus 2, cytomegalovirus, and Simian Virus 40.
Other suitable expression systems for both prokaryotic and
eukaryotic cells are described in chapters 16-17 of Sambrook et
al., eds., Molecular Cloning: A Laboratory Manual, 2.sup.nd ed.,
Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N. Y., 1989.
[0239] Vectors can be introduced into prokaryotic or eukaryotic
cells via conventional transformation or transfection techniques.
As used herein, the terms "transformation" and "transfection" refer
to a variety of art-recognized techniques for introducing foreign
nucleic acid (e.g., DNA) into a host cell, including calcium
phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofection, or
electroporation.
[0240] It is known that, depending upon the expression vector and
transformation technique used, only a small fraction of bacterial
cells will take up and replicate the expression vector. In order to
identify and select these transformants, a gene that encodes a
selectable marker (e.g., resistance to antibiotics) can be
introduced into the host cells along with the gene of interest.
Selectable markers include those that confer resistance to drugs,
such as ampicillin, kanamycin, chloramphenicol, or tetracycline.
Nucleic acids encoding a selectable marker can be introduced into a
host cell on the same vector as that encoding a polypeptide
described herein or can be introduced on a separate vector. Cells
stably transfected with the introduced nucleic acid can be
identified by drug selection (e.g., cells that have incorporated
the selectable marker gene will survive, while the other cells
die).
[0241] It is known that, depending upon the expression vector and
transfection technique used, only a small fraction of mammalian
cells may integrate the foreign DNA into their genome. In order to
identify and select these integrants, a gene that encodes a
selectable marker (e.g., resistance to antibiotics) can be
introduced into the host cells along with the gene of interest.
Preferred selectable markers include those which confer resistance
to drugs, such as G418, hygromycin, and methotrexate. Nucleic acids
encoding a selectable marker can be introduced into a host cell on
the same vector as that encoding a polypeptide described herein or
can be introduced on a separate vector. Cells stably transfected
with the introduced nucleic acid can be identified by drug
selection (e.g., cells that have incorporated the selectable marker
gene will survive, while the other cells die).
Transport Proteins
[0242] Transport proteins can export or excrete polypeptides and
organic compounds (e.g., branched fatty alcohols) out of a host
cell. A number of transport and efflux proteins can be modified to
selectively secrete particular types of compounds such as branched
fatty alcohols.
[0243] Non-limiting examples of suitable transport proteins are
ATP-Binding Cassette (ABC) transport proteins, efflux proteins, and
fatty acid transporter proteins (FATP). Additional suitable
transport proteins include the ABC transport proteins from
organisms such as Caenorhabditis elegans, Arabidopsis thaliana,
Alcaligenes eutrophus, or Rhodococcus erythropolis. Exemplary ABC
transport proteins include, without limitation, CER5, AtMRP5,
AmiS2, and AtPGP1. Host cells can also be chosen for their
endogenous ability to secrete organic compounds. The efficiency of
organic compound production and secretion into the host cell
environment (e.g., culture medium, fermentation broth) can be
expressed as a ratio of intracellular product to extracellular
product. For example, the ratio can be about 5:1, 4:1, 3:1, 2:1,
1:1, 1:2, 1:3, 1:4, or 1:5.
Fermentation
[0244] The production and isolation of branched fatty alcohols can
be enhanced by employing beneficial fermentation techniques. One
method for maximizing production while reducing costs is increasing
the percentage of the carbon source that is converted to the
branched fatty alcohol products.
[0245] During normal cellular lifecycles, carbon is used in
cellular functions such as producing lipids, saccharides, proteins,
organic acids, and nucleic acids. Reducing the amount of carbon
necessary for growth-related activities can increase the efficiency
of carbon source conversion to product. This can be achieved by,
for example, first growing host cells to a desired density (for
example, a density achieved at the peak of the log phase of
growth). At such a point, replication checkpoint genes can be
harnessed to stop the growth of cells. Specifically, quorum sensing
mechanisms (reviewed in Camilli et al., Science, 311: 1113 (2006);
Venturi FEMS Microbio. Rev., 30: 274-291 (2006); and Reading et
al., FEMS Microbiol. Lett., 254: 1-11 (2006)) can be used to
activate checkpoint genes, such as p53, p21, or other checkpoint
genes.
[0246] Genes that can be activated to stop cell replication and/or
growth in E. coli include umuDC genes. The overexpression of umuDC
genes stops the progression from stationary phase to exponential
growth (Murli et al., J. Bact., 182: 1127 (2000)). UmuC is a DNA
polymerase that can carry out translesion synthesis over non-coding
lesions--the mechanistic basis of most UV and chemical mutagenesis.
The umuDC gene products are involved in the process of translesion
synthesis and also serve as a DNA sequence damage checkpoint. The
umuDC gene products include UmuC, UmuD, UmuD', UmuD'.sub.2C,
UmuD'.sub.2, and UmuD.sub.2. Simultaneously, product-producing genes can
be activated, thus minimizing the need for replication and
maintenance pathways to be used while a fatty aldehyde is being
made. Host cells can also be engineered to express umuC and umuD
from E. coli in pBAD24 under the prpBCDE promoter system through de
novo synthesis of this gene with the appropriate end-product
production genes.
[0247] The percentage of input carbons converted to branched fatty
alcohols can be a cost driver. The more efficient the process is
(i.e., the higher the percentage of input carbons converted to
branched fatty alcohols), the less expensive the process will be.
For oxygen-containing carbon sources (e.g., glucose and other
carbohydrate based sources), the oxygen must be released in the
form of carbon dioxide. For every 2 oxygen atoms released, a carbon
atom is also released leading to a maximal theoretical metabolic
efficiency of approximately 34% (w/w) (for fatty acid derived
products). This figure, however, changes for other organic
compounds and carbon sources. Typical efficiencies in the
literature are less than about 5%. Host cells engineered to
produce fatty alcohols can have greater than about 1, 3, 5, 10, 15,
20, 25, or 30% efficiency. In one example, host cells can exhibit
an efficiency of about 10% to about 25%. In other examples, such
host cells can exhibit an efficiency of about 25% to about 30%. In
other examples, host cells can exhibit greater than 30%
efficiency.
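As a rough illustration of the efficiency figures above, the following Python sketch computes a weight-for-weight (w/w) yield and expresses it as a fraction of the approximately 34% theoretical maximum stated above. The function names and the example masses are hypothetical and are not taken from this disclosure; the sketch assumes that efficiency is measured as mass of product per mass of carbon source consumed.

```python
# Approximate theoretical maximum (w/w) stated in the text for
# fatty acid derived products made from oxygen-containing carbon sources.
THEORETICAL_MAX_PERCENT = 34.0


def mass_yield_percent(product_g: float, carbon_source_g: float) -> float:
    """Return grams of product per 100 grams of carbon source consumed."""
    if carbon_source_g <= 0:
        raise ValueError("carbon_source_g must be positive")
    return 100.0 * product_g / carbon_source_g


def fraction_of_theoretical(yield_percent: float) -> float:
    """Return the yield as a fraction of the assumed theoretical maximum."""
    return yield_percent / THEORETICAL_MAX_PERCENT


# Hypothetical run: 2.0 g of fatty alcohol recovered from 20.0 g of glucose.
observed = mass_yield_percent(2.0, 20.0)
print(observed, fraction_of_theoretical(observed))
```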
[0248] The host cell can be additionally engineered to express
recombinant cellulosomes, such as those described in International
Publication WO 2008/100251. These cellulosomes can allow the host
cell to use cellulosic material as a carbon source. For example,
the host cell can be additionally engineered to express invertases
(EC 3.2.1.26) so that sucrose can be used as a carbon source.
Similarly, the host cell can be engineered using the teachings
described in U.S. Pat. Nos. 5,000,000; 5,028,539; 5,424,202;
5,482,846; and 5,602,030, so that the host cell can assimilate
carbon efficiently and use cellulosic materials as carbon
sources.
[0249] In one example, the fermentation chamber can enclose a
fermentation that is undergoing a continuous reduction. In this
instance, a stable reductive environment can be created. The
electron balance can be maintained by the release of carbon dioxide
(in gaseous form). Efforts to augment the NAD/H and NADP/H balance
can also help stabilize the electron balance. The
availability of intracellular NADPH can also be enhanced by
engineering the host cell to express an NADH:NADPH
transhydrogenase. The expression of one or more NADH:NADPH
transhydrogenases converts the NADH produced in glycolysis to
NADPH, which can enhance the production of fatty alcohols.
[0250] For small scale production, the engineered host cells can be
(a) grown in batches of, for example, about 100 mL, 500 mL, 1 L, 2
L, 5 L, or 10 L, (b) fermented, and (c) induced to express desired
bkd genes, beta-ketoacyl ACP synthase genes, fatty aldehyde
biosynthesis genes, alcohol dehydrogenase genes, and/or acyl-ACP
reductase genes, based on the specific genes encoded in the
appropriate plasmids. For large scale production, the engineered
host cells can be (a) grown in batches of about 10 L, 100 L, 1000
L, 10,000 L, 100,000 L, 1,000,000 L, or larger, (b) fermented, and
(c) induced to express the desired bkd genes, beta-ketoacyl ACP
synthase genes, fatty aldehyde biosynthesis genes, alcohol
dehydrogenase genes, and/or acyl-ACP reductase genes, based on the
specific genes encoded in the plasmids or incorporated into the
host cell's genome.
[0251] For example, a suitable production host, such as an E. coli,
harboring plasmids containing the desired genes or having the genes
integrated in its chromosome can be incubated in a suitable
reactor, for example a 1 L reactor, for 20 hours at 37.degree. C.
in an M9 medium supplemented with 2% glucose, carbenicillin, and
chloramphenicol. When the OD.sub.600 of the culture reaches 0.9,
the production host can be induced with IPTG. After
incubation, the spent media can be extracted and the organic phase
can be examined for the presence of branched fatty alcohols using
GC-MS.
[0252] In some instances, after the first hour of induction,
aliquots of no more than about 10% of the total cell volume can be
removed each hour and allowed to sit without agitation to allow the
branched fatty alcohols to rise to the surface and undergo a
spontaneous phase separation or precipitation. The branched fatty
alcohol component can then be collected, and the aqueous phase
returned to the reaction chamber. The reaction chamber can be
operated continuously. When the OD.sub.600 drops below 0.6, the
cells can be replaced with a new batch grown from a seed
culture.
Producing Branched Fatty Alcohols and Derivatives Using Cell-Free
Methods
[0253] In some embodiments, branched fatty alcohols can be produced
using a purified polypeptide (e.g., a branched-chain alpha keto
acid dehydrogenase complex polypeptide) described herein and a
substrate (e.g., an alpha keto acid, malonyl-CoA,
2-oxo-isovalerate, 2-oxo-isobutyrate, 2-oxo-3-methyl-valerate,
2-oxo-isocaproate, 2-oxoglutarate, 2-oxopentanoate,
3-methyl-2-oxobutanoate, 3-methyl-2-oxopentanoate,
4-methyl-2-oxopentanoate, or pyruvate) produced, for example, by a
method described herein. For example, a host cell can be engineered
to express a branched-chain alpha keto acid dehydrogenase
polypeptide or the E1 (alpha and beta), and optionally, the E2
and/or the E3 subunits thereof, or variants as described herein.
The host cell can be cultured under conditions sufficient to allow
expression of the polypeptide. Cell free extracts can then be
generated using known methods, including, for example, cell lysis
using detergents or sonication. The expressed polypeptides can be
purified. Thereafter, substrates described herein can be added to
the cell free extracts and maintained under conditions to allow
conversion of the substrates (e.g., alpha keto acids, such as
2-oxo-isovalerate, 2-oxo-isobutyrate, 2-oxo-3-methyl-valerate,
2-oxo-isocaproate, 2-oxoglutarate, 2-oxopentanoate,
3-methyl-2-oxobutanoate, 3-methyl-2-oxopentanoate,
4-methyl-2-oxopentanoate, or pyruvate) to branched chain acyl-CoAs,
which can then be converted into branched fatty aldehydes and
branched fatty alcohols. The branched fatty alcohols can then be
separated and purified using known techniques.
Post-Production Processing
[0254] Depending on the intended use of the branched fatty alcohols
produced in accordance with the methods here, post-production
processing may or may not be necessary. As such, in certain
industrial applications, the produced branched fatty alcohols
and/or derivatives may be suitably used per se as surfactants.
Moreover, such surfactants can be directly blended or formulated
into suitable cleaning compositions.
[0255] The branched fatty alcohols produced during fermentation can
be separated from the fermentation media, using any known technique
for separating fatty alcohols from aqueous media. One exemplary
separation process is a two phase (bi-phasic) separation process,
which involves fermenting the genetically engineered host cells
under conditions sufficient to produce a branched fatty alcohol,
allowing it to collect in an organic phase, and separating the
organic phase from the aqueous fermentation broth. This method can
be practiced in both batch and continuous fermentation
processes.
[0256] Bi-phasic separation uses the relative immiscibility of
fatty alcohols to facilitate separation. Immiscible refers to the
relative inability of a compound to dissolve in water and is
defined by the compound's partition coefficient. One of ordinary
skill in the art will appreciate that by choosing a fermentation
broth and organic phase, such that the branched fatty alcohol being
produced has a high logP value, the branched fatty alcohol can
separate into the organic phase, even at very low concentrations,
in the fermentation vessel.
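The effect of a high logP value on bi-phasic separation can be sketched numerically. The Python example below assumes a simple equilibrium model in which the partition coefficient P is the ratio of the product concentration in the organic phase to that in the aqueous phase; the function name, the phase volumes, and the logP values are illustrative assumptions only.

```python
def fraction_in_organic_phase(log_p: float, v_org: float, v_aq: float) -> float:
    """Equilibrium fraction of a compound found in the organic phase.

    Assumes P = C_org / C_aq = 10**log_p and ideal partitioning between
    an organic phase of volume v_org and an aqueous phase of volume v_aq
    (both in the same units).
    """
    if v_org <= 0 or v_aq <= 0:
        raise ValueError("phase volumes must be positive")
    p = 10.0 ** log_p
    return (p * v_org) / (p * v_org + v_aq)


# Hypothetical case: a small organic layer (10 mL) over 1 L of broth.
for log_p in (1.0, 3.0, 5.0):
    print(log_p, fraction_in_organic_phase(log_p, v_org=0.01, v_aq=1.0))
```

Under this model, a compound with a high logP collects almost entirely in the organic phase even when that phase is small relative to the broth.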
[0257] The branched fatty alcohols produced by the methods
described herein can be relatively immiscible in the fermentation
broth and the cytoplasm. Therefore, the branched fatty alcohol can
collect in an organic phase either intracellularly or
extracellularly. The collection of the products in the organic
phase can lessen the impact of the branched fatty alcohol on
cellular function and can allow the host cell to produce more
product.
[0258] The branched fatty alcohols can thus be produced as
homogeneous compounds wherein at least about 60%, 70%, 80%, 90%, or
95% of the branched fatty alcohols produced will have carbon chain
lengths that vary by less than about 6 carbons, less than about 4
carbons, or less than about 2 carbons. These compounds can also be
produced with a relatively uniform degree of saturation. They can
be used per se as surfactants or can be formulated into suitable
cleaning compositions. They can also be used as fuels, fuel
additives, starting materials for production of other chemical
compounds (e.g., polymers, surfactants, plastics, textiles,
solvents, adhesives, etc.), or personal care additives. These
compounds can also be used as feedstock for subsequent reactions,
for example, hydrogenation, catalytic cracking (e.g., via
hydrogenation, pyrolysis, or both), and can be dehydrated to make
other products. In particular, these branched products confer low
volatility, beneficial low-temperature properties, as well as
oxidative stability, making them ideal for low temperature
applications such as in household cleaning compositions and
personal and beauty care products.
[0259] In some embodiments, the branched fatty alcohols produced
using methods described herein can contain between about 50% and
about 90% carbon, or between about 5% and about 25% hydrogen. In
other embodiments, the branched fatty alcohols produced using
methods described herein can contain between about 65% and about
85% carbon, or between about 10% and about 15% hydrogen.
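The carbon and hydrogen ranges above can be checked against simple stoichiometry. The Python sketch below assumes a saturated, acyclic alcohol of formula C(n)H(2n+2)O (branching does not change the molecular formula) and uses rounded standard atomic masses; the function name is hypothetical.

```python
# Standard atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def alcohol_mass_percent(n_carbons: int) -> dict:
    """Mass percent of C, H, and O in a saturated acyclic alcohol C(n)H(2n+2)O."""
    if n_carbons < 1:
        raise ValueError("n_carbons must be at least 1")
    masses = {
        "C": n_carbons * ATOMIC_MASS["C"],
        "H": (2 * n_carbons + 2) * ATOMIC_MASS["H"],
        "O": ATOMIC_MASS["O"],
    }
    total = sum(masses.values())
    return {element: 100.0 * m / total for element, m in masses.items()}


# Shortest, a mid-range, and longest chain lengths discussed in the text.
for n in (6, 16, 26):
    pct = alcohol_mass_percent(n)
    print(n, round(pct["C"], 1), round(pct["H"], 1))
```

For chain lengths of 6 to 26 carbons, this calculation gives carbon and hydrogen contents that fall within the narrower ranges of about 65% to about 85% carbon and about 10% to about 15% hydrogen.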
[0260] In some embodiments, the branched fatty alcohols produced in
accordance with the disclosure herein comprise a C.sub.6-C.sub.26
branched fatty alcohol. In some embodiments, the branched fatty
alcohol comprises a C.sub.6, C.sub.7, C.sub.8, C.sub.9, C.sub.10,
C.sub.11, C.sub.12, C.sub.13, C.sub.14, C.sub.15, C.sub.16,
C.sub.17, C.sub.18, C.sub.19, C.sub.20, C.sub.21, C.sub.22,
C.sub.23, C.sub.24, C.sub.25, or a C.sub.26 branched fatty alcohol.
In particular embodiments, the branched fatty alcohol is a C.sub.6,
C.sub.8, C.sub.10, C.sub.12, C.sub.13, C.sub.14, C.sub.15,
C.sub.16, C.sub.17, or C.sub.18 branched fatty alcohol. In certain
embodiments, the hydroxyl group of the branched fatty alcohol is in
the primary (C.sub.1) position. In certain embodiments, the branched
fatty alcohol is an iso-fatty alcohol or an anteiso-fatty alcohol.
In exemplary embodiments, the branched fatty alcohol is selected
from iso-C.sub.7:0, iso-C.sub.8:0, iso-C.sub.9:0, iso-C.sub.10:0,
iso-C.sub.11:0, iso-C.sub.12:0, iso-C.sub.13:0, iso-C.sub.14:0,
iso-C.sub.15:0, iso-C.sub.16:0, iso-C.sub.17:0, iso-C.sub.18:0,
iso-C.sub.19:0, anteiso-C.sub.7:0, anteiso-C.sub.8:0,
anteiso-C.sub.9:0, anteiso-C.sub.10:0, anteiso-C.sub.11:0,
anteiso-C.sub.12:0, anteiso-C.sub.13:0, anteiso-C.sub.14:0,
anteiso-C.sub.15:0, anteiso-C.sub.16:0, anteiso-C.sub.17:0,
anteiso-C.sub.18:0, and anteiso-C.sub.19:0 fatty alcohol.
[0261] In certain embodiments, the fatty alcohol product can
comprise straight chain fatty alcohols. In other embodiments, the
branched fatty alcohols produced by the host cells described herein
can comprise one or more points of branching. In certain
embodiments, the branched fatty alcohols produced by the host cells
as described herein can comprise one or more cyclic moieties.
[0262] In some embodiments, the branched fatty alcohols can be
unsaturated branched fatty alcohols. For example, the branched
fatty alcohols produced in accordance with the present description
can be monounsaturated branched fatty alcohols. In certain
embodiments, the unsaturated branched fatty alcohol can be a C6:1,
C7:1, C8:1, C9:1, C10:1, C11:1, C12:1, C13:1, C14:1, C15:1, C16:1,
C17:1, C18:1, C19:1, C20:1, C21:1, C22:1, C23:1, C24:1, C25:1, or a
C26:1 unsaturated branched fatty alcohol. In other embodiments, the
branched fatty alcohol is unsaturated at the omega-7 position. In
certain embodiments, the unsaturated branched fatty alcohol
comprises a cis double bond.
[0263] In some embodiments, branched fatty alcohols are produced at
a yield, relative to straight-chain fatty alcohols, of about 20%,
for example, about 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, or
higher. In an exemplary embodiment, the total amount of branched
fatty alcohols produced is estimated to be about 45% to about 50%
relative to the amount of straight-chain fatty alcohols produced by
a host cell.
[0264] In any of the aspects described herein, the production yield
of fatty alcohols, including branched fatty alcohols and straight
chain fatty alcohol, is about 1 mg/L, 5 mg/L, 10 mg/L, 15 mg/L, 20
mg/L, 25 mg/L, about 50 mg/L, about 75 mg/L, about 100 mg/L, about
125 mg/L, about 150 mg/L, about 175 mg/L, about 200 mg/L, about 225
mg/L, about 250 mg/L, about 275 mg/L, about 300 mg/L, about 325
mg/L, about 350 mg/L, about 375 mg/L, about 400 mg/L, about 425
mg/L, about 450 mg/L, about 475 mg/L, about 500 mg/L, about 525
mg/L, about 550 mg/L, about 575 mg/L, about 600 mg/L, about 625
mg/L, about 650 mg/L, about 675 mg/L, about 700 mg/L, about 725
mg/L, about 750 mg/L, about 775 mg/L, about 800 mg/L, about 825
mg/L, about 850 mg/L, about 875 mg/L, about 900 mg/L, about 925
mg/L, about 950 mg/L, about 975 mg/L, about 1000 mg/L, about 1050
mg/L, about 1075 mg/L, about 1100 mg/L, about 1125 mg/L, about 1150
mg/L, about 1175 mg/L, about 1200 mg/L, about 1225 mg/L, about 1250
mg/L, about 1275 mg/L, about 1300 mg/L, about 1325 mg/L, about 1350
mg/L, about 1375 mg/L, about 1400 mg/L, about 1425 mg/L, about 1450
mg/L, about 1475 mg/L, about 1500 mg/L, about 1525 mg/L, about 1550
mg/L, about 1575 mg/L, about 1600 mg/L, about 1625 mg/L, about 1650
mg/L, about 1675 mg/L, about 1700 mg/L, about 1725 mg/L, about 1750
mg/L, about 1775 mg/L, about 1800 mg/L, about 1825 mg/L, about 1850
mg/L, about 1875 mg/L, about 1900 mg/L, about 1925 mg/L, about 1950
mg/L, about 1975 mg/L, about 2000 mg/L, or more.
[0265] In another aspect, the branched fatty alcohol produced in
accordance with the present invention is produced by culturing a
host cell described herein in a medium having a low level of iron,
under conditions sufficient to produce a branched fatty alcohol. In
particular embodiments, the medium contains less than about 500
.mu.M iron, less than about 400 .mu.M iron, less than about 300
.mu.M iron, less than about 200 .mu.M iron, less than about 150
.mu.M iron, less than about 100 .mu.M iron, less than about 90
.mu.M iron, less than about 80 .mu.M iron, less than about 70 .mu.M
iron, less than about 60 .mu.M iron, less than about 50 .mu.M iron,
less than about 40 .mu.M iron, less than about 30 .mu.M iron, less
than about 20 .mu.M iron, less than about 10 .mu.M iron, or less
than about 5 .mu.M iron. In certain embodiments, the medium does
not contain iron.
[0266] Bioproducts (e.g., surfactants and cleaning compositions)
comprising microbially produced branched fatty alcohols and/or
derivatives, produced using the fatty acid biosynthetic pathway,
have not been produced from renewable sources and, as such, are new
compositions of matter. These new bioproducts can be distinguished
from organic compounds derived from petrochemical carbon on the
basis of dual carbon-isotopic fingerprinting or .sup.14C dating.
Additionally, the specific source of biosourced carbon (e.g.,
glucose vs. glycerol) can be determined by dual carbon-isotopic
fingerprinting (see, e.g., U.S. Pat. No. 7,169,588, which is herein
incorporated by reference).
[0267] The ability to distinguish bioproducts from petroleum based
organic compounds is beneficial in tracking these materials in
commerce. Organic compounds or chemicals comprising both
biologically based and petroleum based carbon isotope profiles may
be distinguished from organic compounds and chemicals made only of
petroleum based materials. Hence, the surfactants and cleaning
compositions of the present invention can be followed in commerce on
the basis of their unique carbon isotope profile.
[0268] Surfactants or cleaning compositions produced in accordance
with the present disclosure can be distinguished from
petroleum-derived compounds by comparing the stable carbon isotope
ratio (.sup.13C/.sup.12C) of each. The .sup.13C/.sup.12C ratio in a
given bioproduct is a consequence of the .sup.13C/.sup.12C ratio in
atmospheric carbon dioxide at the time the carbon dioxide is fixed.
It also reflects the precise metabolic pathway. Regional variations
also occur. Petroleum, C.sub.3 plants (the broadleaf), C.sub.4
plants (the grasses), and marine carbonates all show significant
differences in .sup.13C/.sup.12C and the corresponding
.delta..sup.13C values. Moreover, lipid matter of C.sub.3 and
C.sub.4 plants analyzes differently than materials derived from the
carbohydrate components of the same plants as a consequence of the
metabolic pathway.
[0269] Within the precision of measurement, .sup.13C shows large
variations due to isotopic fractionation effects, the most
significant of which for bioproducts is the photosynthetic
mechanism. The major cause of differences in the carbon isotope
ratio in plants is closely associated with differences in the
pathway of photosynthetic carbon metabolism in the plants,
particularly the reaction occurring during the primary
carboxylation (i.e., the initial fixation of atmospheric CO.sub.2).
Two large classes of vegetation are those that incorporate the
"C.sub.3" (or Calvin-Benson) photosynthetic cycle and those that
incorporate the "C.sub.4" (or Hatch-Slack) photosynthetic
cycle.
[0270] In C.sub.3 plants, the primary CO.sub.2
fixation/carboxylation reaction involves the enzyme
ribulose-1,5-diphosphate carboxylase, and the first stable product
is a 3-carbon compound. C.sub.3 plants, such as hardwoods and
conifers, are dominant in the temperate climate zones.
[0271] In C.sub.4 plants, an additional carboxylation reaction
involving phosphoenol-pyruvate carboxylase is the primary
carboxylation reaction. The first stable carbon compound is a
4-carbon acid that is subsequently decarboxylated. The CO.sub.2
thus released is refixed by the C.sub.3 cycle. Examples of C.sub.4
plants are tropical grasses, corn, and sugar cane.
[0272] Both C.sub.4 and C.sub.3 plants exhibit a range of
.sup.13C/.sup.12C isotopic ratios, but typical values are about -7
to about -13 per mil for C.sub.4 plants and about -19 to about -27
per mil for C.sub.3 plants (see, e.g., Stuiver et al., Radiocarbon,
19: 355 (1977)). Coal and petroleum fall generally in this latter
range. The .sup.13C measurement scale was originally defined by a
zero set by Pee Dee Belemnite (PDB) limestone, where values are
given in parts per thousand deviations from this material. The
".delta..sup.13C" values are expressed in parts per thousand (per
mil), abbreviated ‰, and are calculated as follows:
.delta..sup.13C(‰)=[(.sup.13C/.sup.12C).sub.sample-(.sup.13C/.sup.12C).sub.standard]/(.sup.13C/.sup.12C).sub.standard.times.1000
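The per mil calculation above can be written as a short Python sketch. The numerical value used for the PDB standard ratio is a commonly cited figure and is an assumption here, since the text does not state it; the function name is likewise hypothetical.

```python
# Assumed 13C/12C ratio of the PDB standard (commonly cited value;
# not given in the text).
R_PDB = 0.0112372


def delta13c_per_mil(r_sample: float, r_standard: float = R_PDB) -> float:
    """Per mil deviation of a sample 13C/12C ratio from the standard ratio."""
    if r_standard <= 0:
        raise ValueError("r_standard must be positive")
    return (r_sample - r_standard) / r_standard * 1000.0


# A sample whose ratio is 1.3% below the standard has a delta of about -13,
# which lies in the range the text gives for C4 plants.
print(delta13c_per_mil(R_PDB * (1.0 - 0.013)))
```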
[0273] Since the PDB reference material (RM) has been exhausted, a
series of alternative RMs have been developed in cooperation with
the IAEA, USGS, NIST, and other selected international isotope
laboratories. The notation for the per mil deviation from PDB is
.delta..sup.13C. Measurements are made on CO.sub.2 by high
precision stable isotope ratio mass spectrometry (IRMS) on molecular ions
of masses 44, 45, and 46.
[0274] The branched fatty alcohol and derivative compositions as
well as the surfactants or cleaning compositions described herein
include bioproducts produced by any of the methods described
herein. The surfactants and cleaning compositions can have a
.delta..sup.13C of about -28 or greater, about -27 or greater, -20
or greater, -18 or greater, -15.4 or greater, -15 or greater, -13
or greater, -10 or greater, or -8 or greater. A surfactant or
cleaning composition so produced can have a .delta..sup.13C of
about -30 to about -15, about -27 to about -19, about -25 to about
-21, about -15 to about -5, about -15.4 to about -10.9, about
-13.92 to about -13.84, about -13 to about -7, or about -13 to
about -10. For example it can have a .delta..sup.13C of about -10,
-11, -12, or -12.3.
[0275] The surfactants or cleaning compositions herein can also be
distinguished from petroleum-derived compounds by comparing the
amount of .sup.14C in each compound. Because .sup.14C has a nuclear
half life of 5730 years, petroleum based chemicals containing
"older" carbon can be distinguished from bioproducts which contain
"newer" carbon (see, e.g., Currie, "Source Apportionment of
Atmospheric Particles," Characterization of Environmental
Particles, J. Buffle and H. P. van Leeuwen, Eds., 1 of Vol. I of
the IUPAC Environmental Analytical Chemistry Series (Lewis
Publishers, Inc.) (1992), pp. 3-74).
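To illustrate why "older" carbon is distinguishable, the following Python sketch applies the standard exponential decay law with the 5730 year half-life stated above. The function name and the example ages are illustrative assumptions.

```python
# Half-life of 14C in years, as stated in the text.
HALF_LIFE_YEARS = 5730.0


def fraction_14c_remaining(age_years: float) -> float:
    """Fraction of the original 14C remaining after age_years of decay."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return 0.5 ** (age_years / HALF_LIFE_YEARS)


# Recently fixed carbon retains essentially all of its 14C, whereas carbon
# that is many half-lives old retains a negligible fraction.
for age in (0.0, 5730.0, 100_000.0):
    print(age, fraction_14c_remaining(age))
```

Because petroleum carbon is far older than 100,000 years, its residual .sup.14C is effectively zero under this model.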
[0276] The basic assumption in radiocarbon dating is that the
constancy of .sup.14C concentration in the atmosphere leads to the
constancy of .sup.14C in living organisms. But because of
atmospheric nuclear testing since 1950 and the burning of fossil
fuel since 1850, .sup.14C has acquired a second, geochemical time
characteristic. Its concentration in atmospheric CO.sub.2, and
hence in the living biosphere, approximately doubled at the peak of
nuclear testing, in the mid-1960s. It has since been gradually
returning to the steady-state cosmogenic (atmospheric) baseline
isotope ratio (.sup.14C/.sup.12C) of about 1.2.times.10.sup.-12,
with an approximate relaxation "half-life" of 7-10 years. (This
latter half-life is not to be taken literally; rather, the detailed
atmospheric nuclear input/decay function to trace the variation of
atmospheric and biospheric .sup.14C since the onset of the nuclear
age should be used).
[0277] It is this latter biospheric .sup.14C time characteristic
that holds out the promise of annual dating of recent biospheric
carbon. .sup.14C can be measured by accelerator mass spectrometry
(AMS), with results given in units of "fraction of modern carbon"
(f.sub.M). f.sub.M is defined by National Institute of Standards
and Technology (NIST) Standard Reference Materials (SRMs) 4990B and
4990C. As used herein, "fraction of modern carbon" or "f.sub.M" has
the same meaning as defined by National Institute of Standards and
Technology (NIST) Standard Reference Materials (SRMs) 4990B and
4990C, known as oxalic acids standards HOxI and HOxII,
respectively. The fundamental definition relates to 0.95 times the
.sup.14C/.sup.12C isotope ratio HOxI (referenced to AD 1950). This
is roughly equivalent to decay-corrected pre-Industrial Revolution
wood. For the current living biosphere (plant material), f.sub.M is
approximately 1.1.
[0278] The invention provides surfactants or cleaning compositions,
having an f.sub.M .sup.14C of at least about 1. An exemplary
surfactant has an f.sub.M .sup.14C of at least about 1.01, of at
least about 1.5, an f.sub.M .sup.14C of about 1 to about 1.5, an
f.sub.M .sup.14C of about 1.04 to about 1.18, or an f.sub.M
.sup.14C of about 1.111 to about 1.124. Likewise, an exemplary
cleaning composition has an f.sub.M .sup.14C of at least about
1.01, of at least about 1.5, an f.sub.M .sup.14C of about 1 to
about 1.5, an f.sub.M .sup.14C of about 1.04 to about 1.18, or an
f.sub.M .sup.14C of about 1.111 to about 1.124.
[0279] Another measurement of .sup.14C is known as the percent of
modern carbon, pMC. For an archaeologist or geologist using
.sup.14C dates, AD 1950 equals "zero years old". This also
represents 100 pMC. "Bomb carbon" in the atmosphere reached almost
twice the normal level in 1963 at the peak of thermonuclear weapons
testing. Its distribution within the atmosphere has been
approximated since its appearance, showing values that are greater
than 100 pMC for plants and animals living since AD 1950. It has
gradually decreased over time with today's value being near 107.5
pMC. This means that a fresh biomass material, such as corn, would
give a .sup.14C signature near 107.5 pMC. Petroleum based compounds
will have a pMC value of zero. Combining fossil carbon with present
day carbon will result in a dilution of the present day pMC
content. By presuming 107.5 pMC represents the .sup.14C content of
present day biomass materials and 0 pMC represents the .sup.14C
content of petroleum based products, the measured pMC value for
that material will reflect the proportions of the two component
types. For example, a material derived 100% from present day
soybeans would give a radiocarbon signature near 107.5 pMC. If that
material was diluted 50% with petroleum based products, it would
give a radiocarbon signature of approximately 54 pMC.
[0280] A biologically based carbon content is derived by assigning
"100%" equal to 107.5 pMC and "0%" equal to 0 pMC. For example, a
sample measuring 99 pMC will give an equivalent biologically based
carbon content of about 92%. This value is referred to as the mean
biologically based carbon result and assumes all the components
within the analyzed material originated either from present day
biological material or petroleum based material.
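The two-component model described above can be sketched in Python as follows. The 107.5 pMC reference value comes from the text; the function names are hypothetical, and the sketch assumes that every carbon atom is either present day biological carbon (107.5 pMC) or petroleum carbon (0 pMC). Under this linear model, the ratio 99/107.5 evaluates to roughly 92%.

```python
# pMC assumed in the text for present day biomass materials.
MODERN_BIOMASS_PMC = 107.5


def biobased_carbon_percent(measured_pmc: float) -> float:
    """Convert a measured pMC to a mean biologically based carbon content (%)."""
    if measured_pmc < 0:
        raise ValueError("measured_pmc must be non-negative")
    return 100.0 * measured_pmc / MODERN_BIOMASS_PMC


def blended_pmc(bio_fraction: float) -> float:
    """Expected pMC of a blend with the given fraction of present day carbon."""
    if not 0.0 <= bio_fraction <= 1.0:
        raise ValueError("bio_fraction must be between 0 and 1")
    return bio_fraction * MODERN_BIOMASS_PMC


print(blended_pmc(0.5))               # half biomass, half petroleum carbon
print(biobased_carbon_percent(99.0))  # sample measuring 99 pMC
```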
[0281] A surfactant or a cleaning composition comprising branched
fatty alcohols and/or derivatives described herein can have a pMC
of at least about 50, 60, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99,
or 100. In other instances, such a surfactant or cleaning
composition can have a pMC of between about 50 and about 100;
between about 60 and about 100; between about 70 and about 100;
between about 80 and about 100; between about 85 and about 100;
between about 87 and about 98; or between about 90 and about 95. In
yet other instances, it can have a pMC of about 90, 91, 92, 93, 94,
or 94.2.
[0282] Accordingly the present invention is drawn to a branched
fatty alcohol or a derivative thereof produced by an engineered
microbial host cell. The engineered microbial host cell expresses:
(a) a first gene encoding a first polypeptide having at least about
85% sequence identity to the amino acid sequence of any one of SEQ
ID NOs: 1, 3, 5, 7, 9, 11, 13 and 15, or of a variant thereof; and
(b) a second gene encoding a second polypeptide having at least
about 85% sequence identity to the amino acid sequence of any one
of SEQ ID NOs:24, 26, 28, 30, 32, 34, 36, and 38, or of a variant
thereof, and is cultured in the presence of one or more biological
substrates of the first and second polypeptides. In some
embodiments, the microbial host cell is engineered to express a
third gene encoding a third polypeptide comprising an amino acid
sequence having at least an about 85% sequence identity to the
amino acid sequence of any one of SEQ ID NOs:47, 49, 51, 53, 55,
57, 59, and 61, or of a variant thereof. In some embodiments, the
microbial host cell is engineered to express a fourth gene encoding
a fourth polypeptide comprising an amino acid sequence having at
least an about 85% sequence identity to the amino acid sequence of
any one of SEQ ID NOs:69, 71, 73, 75, 77, 79, 81, and 83, or of a
variant thereof. In any of the above embodiments, the microbial
host cell is engineered to express a beta-ketoacyl ACP synthase
gene in the host cell, wherein the beta-ketoacyl ACP synthase gene encodes a
polypeptide comprising an amino acid sequence having at least about
85% sequence identity to the amino acid sequence of any one of SEQ
ID NOs:90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114,
116, 118, and 120, or of a variant thereof. The beta-ketoacyl ACP
synthase is, for example, a FabH that has specificity for
branched-chain acyl-CoA substrates. In any of the embodiments
above, the microbial host cell is engineered to express a fatty
aldehyde biosynthesis polypeptide, or a variant thereof. In any of
the embodiments above, the microbial host cell is engineered to
express an acyl-ACP reductase polypeptide or a variant thereof, to
modify the expression of a gene encoding a fatty acid synthase,
which comprises expressing a gene encoding a thioesterase in the
microbial host cell, to express a gene encoding an alcohol
dehydrogenase or a variant thereof, and/or to express an attenuated
level of a fatty acid degradation enzyme relative to a wild type
host cell. The fatty acid degradation enzyme is, for example, an
acyl-CoA synthase.
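The "at least about 85% sequence identity" thresholds recited above can be illustrated with a minimal sketch. It assumes that the two amino acid sequences have already been aligned to equal length, with "-" marking gaps, and it counts identical residues over the aligned columns. The function names and the toy sequences are hypothetical, and this simplified identity measure is an assumption for illustration, not necessarily the definition or alignment method relied on for the SEQ ID NOs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns that carry the same residue in both sequences.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 85.0) -> bool:
    """True when the pair reaches the threshold (85% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy alignment: 20 columns, of which 18 are identical.
reference = "MKTAYIAKQRQISFVKSHFS"
candidate = "MKTAYIAKQRQISFVKSHAA"
identity = percent_identity(reference, candidate)
```

In this toy case the candidate differs from the reference at two of twenty positions, so it would satisfy an 85% threshold but not a 95% threshold.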
Branched Fatty Alcohol Derivatives
[0283] A derivative of the branched fatty alcohol produced in
accordance with the methods described herein can be produced by
converting the isolated branched fatty alcohol into a branched
fatty alcohol derivative thereof. The branched fatty alcohol
derivative can be any suitable branched fatty alcohol derivative
selected from, for example, a branched fatty ether sulfate, a
branched fatty phosphate ester, an alkylbenzyldimethyl-ammonium
chloride, a branched fatty amine oxide, a branched fatty alcohol
sulfate, a branched alkyl polyglucoside, a branched alkyl glyceryl
ether sulfonate, and a branched ethoxylated fatty alcohol.
Typically, the branched fatty alcohol derivative comprises an alkyl
group that is about 6 to about 26 carbons in length. Preferably,
the branched fatty alcohol comprises an alkyl group that is about
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 carbons in length.
In certain embodiments, the alkyl group comprises one or more
points of branching. In this regard, the number of carbons in the
alkyl group refers to the hydrocarbon group derived from the
branched fatty alcohol, and not to any carbon atoms added in the
preparation of the branched fatty alcohol derivative, such as
polyethoxy groups and the like.
[0284] As used herein, the term "fatty ether sulfate" is the same
as "alkyl ether sulfate" wherein the alkyl residue is a fatty
residue, and denotes a compound of the structure:
RO(CH.sub.2CH.sub.2O).sub.n--SO.sub.3H, wherein R is a
C.sub.6-C.sub.26 alkyl group as defined herein, and n is an integer
of 1 to about 50. Fatty ether sulfate can also refer to the salt
denoted by RO(CH.sub.2CH.sub.2O).sub.nSO.sub.3X, where n and R are
as defined above and X is a cation. An exemplary fatty ether
sulfate salt is a sodium salt, for example,
RO(CH.sub.2CH.sub.2O).sub.nSO.sub.3Na. In an exemplary embodiment,
the R group comprises one or more points of branching.
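The general structure RO(CH.sub.2CH.sub.2O).sub.nSO.sub.3Na can be turned into an element count and an approximate molar mass, as in the following sketch. It assumes that R is a saturated alkyl group C.sub.mH.sub.2m+1 (branching does not change the element count), treats the upper bound of "about 50" as exactly 50, and uses rounded standard atomic masses. The function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "S": 32.06, "Na": 22.990}


def ether_sulfate_sodium_formula(alkyl_carbons: int, n_ethoxy: int) -> dict:
    """Element counts for RO(CH2CH2O)nSO3Na with a saturated alkyl R = CmH(2m+1)."""
    if not 6 <= alkyl_carbons <= 26:
        raise ValueError("R is described as a C6-C26 alkyl group")
    if not 1 <= n_ethoxy <= 50:
        raise ValueError("n is described as an integer of 1 to about 50")
    m, n = alkyl_carbons, n_ethoxy
    return {
        "C": m + 2 * n,          # alkyl carbons plus two per ethoxy unit
        "H": 2 * m + 1 + 4 * n,  # alkyl hydrogens plus four per ethoxy unit
        "O": n + 4,              # RO, one per ethoxy unit, three on sulfur
        "S": 1,
        "Na": 1,
    }


def molar_mass(formula: dict) -> float:
    """Sum of atomic masses weighted by element count, in g/mol."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


# Hypothetical example: a C12 alkyl group (branched or linear) with n = 1.
example = ether_sulfate_sodium_formula(12, 1)
example_mass = molar_mass(example)
```

For the hypothetical C.sub.12, n = 1 case the sketch gives C.sub.14H.sub.29NaO.sub.5S, about 332.4 g/mol.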
[0285] As used herein, the term "fatty alcohol sulfate" denotes a
compound of the structure: ROSO.sub.3H wherein R is a
C.sub.8-C.sub.26 alkyl group. Fatty alcohol sulfate can also refer
to the salt of the above structure, denoted by ROSO.sub.3X where R
is as defined above and X is a cation. An exemplary fatty alcohol
sulfate salt is a sodium salt, for example, ROSO.sub.3Na. In an
exemplary embodiment, the R group comprises one or more points of
branching.
[0286] As used herein, the term "fatty phosphate ester" is the same
as "alkyl phosphate ester" wherein the alkyl residue is a fatty
residue, and denotes a compound of the structure:
ROP(O)(OH).sub.2
[0287] As used herein, alkylbenzyldimethylammonium chlorides have
the structure:
##STR00001##
wherein R is a C.sub.8-C.sub.26 alkyl group as defined herein. For
example, the alkyl group of R comprises one or more points of
branching.
[0288] As used herein, the term "fatty amine oxide" is the same as
"alkyl amine oxide" wherein the alkyl residue is a fatty residue as
defined herein, and denotes a compound of the structure:
##STR00002##
wherein R is a C.sub.8-C.sub.26 alkyl group as defined herein and
wherein R.sup.1 and R.sup.2 are C.sub.1-C.sub.26 alkyl groups,
preferably C.sub.1-C.sub.6 alkyl groups. Preferably the alkyl
groups of R, R.sup.1 and R.sup.2 each independently comprises one
or more points of branching.
[0289] Alkyl polyglucosides have the structure:
RO(C.sub.nH.sub.2nO).sub.tZ.sub.x wherein R is a C.sub.8-C.sub.26
alkyl group, preferably comprising one or more points of branching,
Z is a glucose residue, n is 2 or 3, t is from 0 to 10, and x is
from about 1 to 10, preferably from about 1.5 to 4.
[0290] Alkyl glyceryl ether sulfonates have the structure:
##STR00003##
wherein R is a C.sub.8-C.sub.26 alkyl group as defined herein,
preferably comprising one or more points of branching, and n is an
integer from 1 to 4, for example, 1, 2, 3, or 4.
[0291] As used herein, the term "fatty alcohol alkoxylate" is the
same as "alkoxylated fatty alcohol" and denotes a compound of the
structure: RO(CH.sub.2CH.sub.2O).sub.nH wherein R is a
C.sub.8-C.sub.26 alkyl group as defined herein and n is an integer
from 1 to about 50. Preferably R comprises one or more points of
branching.
[0292] The branched fatty alcohol derivatives can be produced by
any suitable method, many of which are known in the art. See, e.g.,
"Handbook on Soaps, Detergents, and Acid Slurry," 2.sup.nd ed.,
NIIR Board, Asia Pacific Business Press, Inc., Delhi, India.
[0293] In one embodiment, the branched fatty alcohol derivative is
an ethoxylated branched fatty alcohol, which is also known in the
art as a branched fatty alcohol ethoxylate, and has a structure as
described herein. Preferably, the ethoxylated branched fatty
alcohol contains from about 1 to about 50 moles of ethylene oxide
per mole of branched fatty alcohol.
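The degree of ethoxylation stated above (about 1 to about 50 moles of ethylene oxide per mole of alcohol) determines how much of the product mass is ethylene oxide. The sketch below assumes a saturated fatty alcohol C.sub.mH.sub.2m+2O, an average ethoxylation degree that may be non-integer, and rounded standard atomic masses; the function names are hypothetical.

```python
# Rounded standard atomic masses in g/mol (assumed values for illustration).
C_MASS, H_MASS, O_MASS = 12.011, 1.008, 15.999
EO_MASS = 2 * C_MASS + 4 * H_MASS + O_MASS  # ethylene oxide, C2H4O


def fatty_alcohol_molar_mass(carbons: int) -> float:
    """Molar mass of a saturated fatty alcohol CmH(2m+2)O, branched or linear."""
    return carbons * C_MASS + (2 * carbons + 2) * H_MASS + O_MASS


def ethoxylate_summary(carbons: int, eo_moles: float) -> dict:
    """Average molar mass and ethylene oxide weight fraction of the ethoxylate."""
    if not 1 <= eo_moles <= 50:
        raise ValueError("about 1 to about 50 moles of ethylene oxide per mole")
    alcohol_mass = fatty_alcohol_molar_mass(carbons)
    eo_mass = eo_moles * EO_MASS
    total = alcohol_mass + eo_mass
    return {
        "molar_mass": total,
        "eo_weight_fraction": eo_mass / total,
        "kg_eo_per_kg_alcohol": eo_mass / alcohol_mass,
    }


# Hypothetical example: a C12 branched fatty alcohol with 7 moles of ethylene oxide.
summary = ethoxylate_summary(12, 7)
```

In the hypothetical C.sub.12 example with 7 moles of ethylene oxide, ethylene oxide makes up roughly 62% of the ethoxylate by weight.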
Surfactants or Detersive Surfactants
[0294] A surfactant composition of the present invention can
comprise about 0.001 wt. % to about 100 wt. % of microbially
produced branched fatty alcohols and/or derivatives thereof.
Preferably, a surfactant composition is a blend of a microbially
produced branched fatty alcohol and/or derivative in combination
with one or more other surfactants and/or surfactant systems that
have been derived from similar (e.g., microbially derived) or
different sources (e.g., synthetic, petroleum-derived). Those other
surfactants and/or surfactant systems can confer additional
desirable properties. In some embodiments, the one or more other
surfactants and/or surfactant systems that are blended with the
microbially produced branched fatty alcohols and/or derivatives can
comprise linear or branched fatty alcohol derivatives, or they can
be other types of surfactants such as cationic surfactants,
anionic surfactants and/or amphoteric/zwitterionic surfactants.
These other surfactants and/or surfactant systems are collectively
referred to as "co-surfactants" herein. For example, a surfactant
composition of the invention can be a blend of a microbially
produced branched fatty alcohol and/or derivative composition
prepared in accordance with the disclosure herein, and a cationic
surfactant derived from a petrochemical source, and the resulting
surfactant composition not only has good cleaning properties but
also contributes certain disinfecting and/or sanitizing benefits.
[0295] The cleaning composition of the invention can comprise, in
addition to the microbially produced branched fatty alcohols and/or
derivatives, or the surfactants comprising such branched materials
and/or derivatives, co-surfactants selected from nonionic
surfactants, anionic surfactants, cationic surfactants, ampholytic
surfactants, zwitterionic surfactants, semi-polar nonionic
surfactants, and mixtures thereof. When present, the total amount
of surfactants, including the microbially produced branched fatty
alcohols and/or derivatives thereof, and the co-surfactants, is
typically present at a level of about 0.1 wt. % or higher (e.g.,
about 1.0 wt. % or higher, about 10 wt. % or higher, about 25 wt. %
or higher, about 50 wt. % or higher, about 70 wt. % or higher). For
example, the total amount of surfactant in a cleaning composition
can vary from about 0.1 wt. % to about 80 wt. % (e.g., from about
0.1 wt. % to about 40 wt. %, from about 0.1 wt % to about 12 wt. %,
from about 1.0 wt. % to about 50 wt. %, or from about 5 wt. % to
about 40 wt. %).
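The total surfactant level described above is the sum of the weight percentages of every surfactant component, including the microbially produced branched material and the co-surfactants. The following sketch shows that bookkeeping for a hypothetical formulation; the component names and amounts are invented for illustration and are not taken from any example in this document.

```python
def total_surfactant_wt_pct(formulation: dict, surfactant_names: list) -> float:
    """Sum the wt. % of the named surfactant components of a formulation."""
    total = sum(formulation.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"components sum to {total} wt. %, expected 100")
    return sum(formulation[name] for name in surfactant_names)


def within_range(value: float, low: float = 0.1, high: float = 80.0) -> bool:
    """Check a level against the about 0.1 wt. % to about 80 wt. % range."""
    return low <= value <= high


# Hypothetical laundry formulation, in wt. % of the finished composition.
example_formulation = {
    "branched fatty alcohol sulfate": 8.0,
    "linear alkylbenzenesulfonate": 10.0,
    "alcohol ethoxylate": 4.0,
    "sodium sulfate": 20.0,
    "water": 58.0,
}
example_surfactants = [
    "branched fatty alcohol sulfate",
    "linear alkylbenzenesulfonate",
    "alcohol ethoxylate",
]
example_total = total_surfactant_wt_pct(example_formulation, example_surfactants)
```

The hypothetical formulation carries 22 wt. % total surfactant, which falls inside the broad range and also inside the narrower about 5 wt. % to about 40 wt. % range.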
[0296] Various known surfactants can be suitable co-surfactants. In
some embodiments, the co-surfactant can comprise an anionic
surfactant. In certain embodiments, the amount of one or more
anionic surfactants in the cleaning composition can be, for
example, about 1 wt. % or more (e.g., about 5 wt. % or more, about
10 wt. % or more, about 20 wt. % or more, about 30 wt. % or more,
about 40 wt. % or more). For example, the amount of one or more
anionic surfactants in the cleaning composition can vary from about
1 wt. % to about 40 wt. %. Suitable anionic surfactants include,
for example, linear alkylbenzenesulfonate, alpha-olefinsulfonate,
alkyl sulfate (fatty alcohol sulfate), alcohol ethoxysulfate,
secondary alkanesulfonate, alpha-sulfo fatty acid methyl esters,
alkyl- or alkenylsuccinic acid or soap. In some embodiments, an
anionic surfactant is, for example, a C.sub.10-C.sub.18 alkyl alkoxy
ester (AE.sub.xS), wherein x is from 1-30. Other suitable anionic
surfactants can be found in International Publication WO98/39403,
Surface Active Agents and Detergents (Vol. I & II, by
Schwartz, Perry and Berch), and U.S. Pat. Nos. 3,929,678,
6,020,303, 6,060,443, 6,008,181, International Publications WO
99/05243, WO 99/05242 and WO 99/05244, the relevant disclosures of
which are incorporated herein by reference.
[0297] In another embodiment, the co-surfactant can comprise a
cationic surfactant. Suitable cationic surfactants include, for
example, those having long-chain hydrocarbyl groups. Examples
include ammonium surfactants such as alkyltrimethylammonium
halogenides, and surfactants having the formula
[R.sup.2(OR.sup.3).sub.y][R.sup.4(OR.sup.3).sub.y].sub.2R.sup.5N.sup.+X.sup.-,
wherein R.sup.2 is an alkyl or alkyl benzyl group having from about
8 to about 18 carbon atoms in the alkyl chain, each R.sup.3 is
independently selected from the group consisting of
--CH.sub.2CH.sub.2--, --CH.sub.2CH(CH.sub.3)--,
--CH.sub.2(CH(CH.sub.2OH))--, --CH.sub.2CH.sub.2CH.sub.2--, and
mixtures thereof; each R.sup.4 is selected from the group
consisting of C.sub.1-C.sub.4 alkyl, C.sub.1-C.sub.4 hydroxyalkyl,
benzyl ring structures formed by joining the two R.sup.4 groups,
--CH.sub.2CHOH--CHOHCOR.sup.6CHOHCH.sub.2OH wherein R.sup.6 is any
hexose or hexose polymer having a molecular weight less than about
1,000, and hydrogen, when y is not 0; R.sup.5 is the same as
R.sup.4 or is an alkyl chain wherein the total number of carbon
atoms of R.sup.2 plus R.sup.5 is not more than about 18; each y is
from 0 to about 10 and the sum of the y values is from 0 to about
15; and X is any compatible anion.
[0298] Certain quaternary ammonium surfactants may also be suitable
as cationic co-surfactants, and examples of those are described in
International Publication WO 98/39403. Examples of suitable
quaternary ammonium compounds include coconut trimethyl ammonium
chloride or bromide; coconut methyl dihydroxyethyl ammonium
chloride or bromide; decyl triethyl ammonium chloride; decyl
dimethyl hydroxyethyl ammonium chloride or bromide; C.sub.12-15
dimethyl hydroxyethyl ammonium chloride or bromide; coconut
dimethyl hydroxyethyl ammonium chloride or bromide; myristyl
trimethyl ammonium methyl sulphate; lauryl dimethyl benzyl ammonium
chloride or bromide; lauryl dimethyl (ethenoxy).sub.4 ammonium
chloride or bromide. Other cationic surfactants have been described
in U.S. Pat. Nos. 4,228,044, 4,228,042, 4,239,660, 4,260,529,
6,136,769, 6,004,922, 6,022,844, and 6,221,825, International
Publications WO 98/35002, WO 98/35003, WO 98/35004, WO 98/35005, WO
98/35006, and WO 00/47708, as well as European Patent Application
EP 000,224. When included herein, the surfactant/detergent and the
cleaning/treatment compositions of the present invention can
comprise, for example, from about 0.2 wt. % to about 25 wt. %,
preferably from about 1 wt. % to about 8 wt. % of
cationic surfactants.
[0299] In certain embodiments, co-surfactants can comprise nonionic
surfactants. Polyethylene, polypropylene, and polybutylene oxide
condensates of alkyl phenols are suitable, with the polyethylene
oxide condensates being preferred. They include the condensation
products of alkyl phenols having an alkyl group having about 6 to
about 14 carbon atoms, preferably from about 8 to about 14 carbon
atoms, in either a straight-chain or branched-chain configuration,
with alkylene oxide. In particular embodiments, the ethylene oxide
is present in an amount of from about 2 to about 25 moles (e.g.,
from about 3 to about 15 moles) of ethylene oxide per mole of alkyl
phenol. Commercially available nonionic surfactants of this type
include Igepal.TM. CO-630 (The GAF Corp.), Triton.TM. X-45, X-114,
X-100 and X-102 (Dow Chemicals). These surfactants are commonly
referred to as alkylphenol alkoxylates (e.g., alkyl phenol
ethoxylates).
[0300] Moreover, condensation products of primary and secondary
aliphatic alcohols with from about 1 to about 25 moles of ethylene
oxide are suitable nonionic co-surfactants. The alkyl chain of the
aliphatic alcohol can be straight or branched, primary or
secondary, and generally can contain about 8 to about 22 (e.g.,
about 8 to about 20, or about 10 to about 18) carbon atoms with
about 2 to about 10 moles (e.g., about 2 to about 5 moles) of
ethylene oxide per mole of alcohol present in the condensation
products. Examples of commercially available nonionic surfactants
of this type include Tergitol.TM. 15-S-9, Tergitol.TM. 24-L-6 NMW
(Union Carbide); Neodol.TM. 45-9, Neodol.TM. 23-3, Neodol.TM. 45-7,
Neodol.TM. 45-5 (Shell Chemical), Kyro.TM. EOB (Procter &
Gamble), and Genapol LA 030 or 050 (Hoechst).
[0301] Further examples of nonionic co-surfactants include
C.sub.12-C.sub.18 alkyl ethoxylates (e.g., NEODOL.RTM. nonionic
surfactants (Shell)), C.sub.6-C.sub.12 alkyl phenol alkoxylates
wherein the alkoxylate units are a mixture of ethyleneoxy and
propyleneoxy units, C.sub.12-C.sub.18 alcohol and C.sub.6-C.sub.12
alkyl phenol condensates with ethylene oxide/propylene oxide block
alkyl polyamine ethoxylates (e.g., PLURONIC.RTM. (BASF)),
C.sub.14-C.sub.22 mid-chain branched alcohols as described in U.S.
Pat. No. 6,150,322, C.sub.14-C.sub.22 mid-chain branched alkyl
alkoxylates, BAE.sub.x, wherein x is from 1-30, as described in
U.S. Pat. Nos. 6,153,577, 6,020,303 and 6,093,856,
alkylpolysaccharides as described in U.S. Pat. No. 4,565,647,
alkylpolyglycosides as described in U.S. Pat. No. 4,483,780 and
U.S. Pat. No. 4,483,779, polyhydroxy detergent acid amides as
described in U.S. Pat. No. 5,332,528, or ether capped
poly(oxyalkylated) alcohol surfactants as described in U.S. Pat.
No. 6,482,994 and International Publication WO 01/42408.
[0302] Semi-polar nonionic surfactants can also be suitable. They
include, e.g., water-soluble amine oxides containing 1 alkyl moiety
of from about 10 to about 18 carbon atoms and 2 moieties selected
from alkyl or hydroxyalkyl moieties containing about 1 to about 3
carbon atoms, water-soluble phosphine oxides containing 1 alkyl
moiety of about 10 to about 18 carbon atoms and 2 moieties selected
from alkyl or hydroxyalkyl moieties containing about 1 to about 3
carbon atoms; and water-soluble sulfoxides containing 1 alkyl
moiety of about 10 to about 18 carbon atoms and a moiety selected
from alkyl or hydroxyalkyl moieties of about 1 to about 3 carbon
atoms. Semi-polar nonionic surfactants have been described in,
e.g., International Publication WO 01/32816, U.S. Pat. Nos.
4,681,704 and 4,133,779.
[0303] Moreover, alkylpolysaccharides, such as those described in
U.S. Pat. No. 4,565,647, having a hydrophobic group containing
about 6 to about 30 carbon atoms (e.g., from about 10 to about 16
carbon atoms) and a polysaccharide, can also be suitable semi-polar
nonionic co-surfactants. Others have been described in, for
example, International Publication WO 98/39403. When included
herein, the cleaning/treatment compositions of the present
invention can comprise, for example, about 0.2 wt. % or more (e.g.,
about 1 wt. % or more, about 5 wt. % or more, or about 8 wt. % or
more) of such semi-polar nonionic surfactants. For example, the
cleaning compositions of the invention can comprise about 0.2 wt. %
to about 15 wt. % (e.g., about 1 wt. % to about 10 wt. %) of
semi-polar nonionic surfactants.
[0304] In certain embodiments, the co-surfactants comprise
ampholytic surfactants. Ampholytic surfactants can be broadly
described as aliphatic derivatives of secondary or tertiary amines,
or aliphatic derivatives of heterocyclic secondary and tertiary
amines in which the aliphatic radical can be straight- or
branched-chain. One of the aliphatic substituents can contain at
least about 8 carbon atoms (e.g., from about 8 to about 18 carbon
atoms), and at least another contains an anionic water-solubilizing
group, e.g., carboxy, sulfonate, or sulfate. Ampholytic surfactants
have been described in, for example, U.S. Pat. No. 3,929,678. When
included therein, a cleaning composition of the invention can
comprise, for example, about 0.2 wt. % to about 15 wt. % (e.g.,
about 1 wt. % to about 10 wt. %) of ampholytic surfactants.
[0305] In certain other embodiments, especially in personal care
cleaning/treatment compositions, zwitterionic surfactants are
included as co-surfactants. These surfactants can be broadly
described as derivatives of secondary and tertiary amines,
derivatives of heterocyclic secondary and tertiary amines, or
derivatives of quaternary ammonium, quaternary phosphonium or
tertiary sulfonium compounds. Zwitterionic surfactants have been
described in, for example, U.S. Pat. No. 3,929,678. When included
therein, a surfactant/detergent or cleaning/treatment composition
of the invention can comprise, for example, about 0.2 wt. % to
about 15 wt. % (e.g., about 1 wt. % to about 10 wt. %) of
zwitterionic surfactants.
[0306] In further embodiments, primary or tertiary amines can be
included as co-surfactants. Suitable primary amines include amines
according to the formula R.sup.1NH.sub.2 wherein R.sup.1 is a
C.sub.6-C.sub.12, preferably C.sub.6-C.sub.10, alkyl chain; or
R.sup.4X(CH.sub.2).sub.n, wherein X is --O--, --C(O)NH-- or --NH--,
R.sup.4 is a C.sub.6-C.sub.12 alkyl chain, and n is from 1 to 5
(e.g., 3). The alkyl chain of R.sup.1 can be straight or branched,
and can be interrupted with up to 12, but preferably less than 5
ethylene oxide moieties. Preferred amines include n-alkyl amines,
selected from, for example, 1-hexylamine, 1-octylamine,
1-decylamine and laurylamine, C.sub.8-C.sub.10 oxypropylamine,
octyloxypropylamine, 2-ethylhexyl-oxypropylamine, lauryl amido
propylamine or amido propylamine. Suitable tertiary amines include
those having the formula R.sup.1R.sup.2R.sup.3N wherein R.sup.1 and
R.sup.2 are C.sub.1-C.sub.8 alkyl chains, R.sup.3 is either a
C.sub.6-C.sub.12, preferably C.sub.6-C.sub.10, alkyl chain, or
R.sup.3 is R.sup.4X(CH.sub.2).sub.n, whereby X is --O--, --C(O)NH-- or
--NH--, R.sup.4 is a C.sub.4-C.sub.12, n is between 1 and 5 (e.g.,
2, 3, or 4), R.sup.5 is H or C.sub.1-C.sub.2 alkyl, and x is
between 1 and 6. R.sup.3 and R.sup.4 may be linear or branched. The
alkyl chain of R.sup.3 can be interrupted with up to 12, but
preferably less than 5, ethylene oxide moieties. Preferred tertiary
amines include, for example, 1-hexylamine, 1-octylamine,
1-decylamine, 1-dodecylamine, n-dodecyldimethylamine,
bishydroxyethylcoconutalkylamine, oleylamine(7)ethoxylated, lauryl
amido propylamine, and cocoamido propylamine. Other useful
detersive surfactants have been described in the prior art, for
example, in U.S. Pat. Nos. 3,664,961, 3,919,678, 4,222,905, and
4,239,659.
[0307] In some embodiments, the detergent/cleaning composition of
the invention comprises greater than about 5 wt. % anionic
surfactant and/or less than about 25 wt. % nonionic surfactant. For
example, the composition comprises greater than about 10 wt. %
anionic surfactants. In another example, the composition comprises
less than 15%, more preferably, less than 12% nonionic
surfactants.
[0308] The total amount of surfactants included in a cleaning
composition of the invention is typically about 0.1 wt. % or more
(e.g., about 1 wt. % or more, about 10 wt. % or more, about 25 wt.
% or more, about 50 wt. % or more, about 60 wt. % or more, about 70
wt. % or more). An exemplary cleaning composition of the invention
comprises about 0.1 wt. % to about 80 wt. % (e.g., about 1 wt. %
to about 50 wt. %, about 10 wt. % to about 40 wt. %, about 20 wt. %
to about 35 wt. %) of total surfactants,
including the microbially produced branched fatty alcohols and/or
derivatives thereof and co-surfactants.
[0309] One criterion by which the type(s) and amount(s) of
surfactants to be included in cleaning compositions can be
determined is compatibility with the enzyme components present in
the cleaning compositions. For example, in liquid or gel
compositions, the cleaning composition (including all the
surfactants, which are, for example, pre-formulated into a
surfactant package) is prepared such that it promotes, or at least
does not degrade, the stability of any enzyme in the cleaning
composition.
[0310] A surfactant composition of the present invention, or a
surfactant package, which can be formulated and subsequently
included in a cleaning composition, can be in any form, for
example, a liquid; a solid such as a powder, granules, agglomerate,
paste, tablet, pouches, bar; a gel; an emulsion; or in a suitable
form to be delivered in dual-compartment containers. The
composition can also be formulated into a spray or foam detergent,
premoistened wipes (e.g., the cleaning composition in combination
with a nonwoven material as described, for example, in U.S. Pat.
No. 6,121,165), dry wipes (e.g., a cleaning composition in
combination with a nonwoven material, activated with water by a
consumer, as described, for example, in U.S. Pat. No. 5,980,931),
and other homogeneous or multiphase consumer cleaning product
forms.
Cleaning Compositions
[0311] Surfactant compositions comprising branched fatty alcohols
and/or derivatives thereof, e.g., sulfate, alkoxylated or
alkoxylated sulfate derivatives, are particularly suitable as soil
detachment-promoting ingredients of laundry detergents, dishwashing
liquids and powders, and various other cleaning compositions. They
exhibit good dissolving power especially when faced with greasy
soils, and it is particularly advantageous that they display
outstanding soil-detaching power even at low washing
temperatures.
[0312] The branched fatty alcohol/derivative compositions of the
present invention can be included or blended into a surfactant
package as described above, which comprises about 0.0001 wt. % to
about 100 wt. % of one or more branched fatty alcohols and/or
derivatives produced by a genetically engineered host cell or
microbe. That surfactant package can then be blended into a
cleaning composition to impart detergency and cleaning power to the
cleaning composition. In alternative embodiments, the branched
fatty alcohols and/or derivatives thereof produced by the host cell
or microbe can be blended into a cleaning composition directly, in
an amount of about 0.001 wt. % or more (e.g., about 0.001 wt. % or
more, about 0.1 wt. % or more, about 1 wt. % or more, about 10 wt.
% or more, about 20 wt. % or more, or about 40 wt. % or more) based
on the total weight of the cleaning composition. For example, the
branched fatty alcohols and/or derivatives thereof can be blended
into a composition in an amount of about 0.001 wt. % to about 50
wt. % (e.g., about 0.01 wt. % to about 45 wt. %, about 0.1 wt. % to
about 40 wt. %, about 1 wt. % to about 35 wt. %). Accordingly, a
cleaning composition of the present invention, in either a solid
form (e.g., a tablet, granule, powder, or compact), or a liquid
form (e.g., a fluid, gel, paste, emulsion, or concentrate) can
comprise about 0.001 wt. % to about 50 wt. % of microbially
produced branched fatty alcohols and/or derivatives thereof. For
example, a cleaning composition of the invention can comprise about
0.5 wt. % to about 44 wt. % of microbially produced branched fatty
alcohols and/or derivatives thereof. Preferably, the cleaning
composition comprises about 1 wt. % to about 30 wt. % of
microbially produced branched fatty alcohols and/or
derivatives.
[0313] Alternatively, a cleaning composition of the present
invention can comprise about 0.001 wt. % to about 80 wt. % of a
surfactant package formulated to comprise about 0.001 wt. % to
about 100 wt. % of microbially produced branched fatty alcohols
and/or derivatives. For example, a cleaning composition of the
present invention can comprise about 0.1 wt. % to about 50 wt. % of
such a surfactant package. The surfactant package can comprise
other surfactants (i.e., co-surfactants), which can include
surfactants derived from similar (e.g., microbially-produced
surfactant) or different sources (e.g., petroleum-derived
surfactants). In a particular embodiment, however, the surfactant
package can consist mostly or exclusively of branched fatty
alcohols and/or derivatives produced by a host cell or a microbe as
described herein.
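When the branched material is introduced through a surfactant package, its level in the finished cleaning composition is the product of two weight fractions: the package's share of the composition and the branched material's share of the package. A minimal sketch of that arithmetic follows; the function name and the example values are hypothetical.

```python
def branched_wt_pct_in_composition(package_wt_pct: float,
                                   branched_wt_pct_in_package: float) -> float:
    """wt. % of branched fatty alcohols/derivatives in the finished composition.

    package_wt_pct: wt. % of the surfactant package in the cleaning composition.
    branched_wt_pct_in_package: wt. % of branched material within the package.
    """
    for value in (package_wt_pct, branched_wt_pct_in_package):
        if not 0.0 <= value <= 100.0:
            raise ValueError("weight percentages must lie between 0 and 100")
    return package_wt_pct * branched_wt_pct_in_package / 100.0


# Hypothetical example: a package that is 40 wt. % branched material,
# dosed at 30 wt. % of the cleaning composition.
example_level = branched_wt_pct_in_composition(30.0, 40.0)
```

In this hypothetical case the finished composition contains 12 wt. % of branched material, which lies within the about 1 wt. % to about 30 wt. % range described above as preferred.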
Industrial Cleaning Compositions, Household Cleaning Compositions
& Personal Care Cleaning Compositions
[0314] In certain embodiments, the cleaning composition of the
present invention is a liquid or solid laundry detergent
composition. In some embodiments, the cleaning composition is a
hard surface cleaning composition, wherein the hard surface
cleaning composition preferably impregnates a nonwoven substrate.
As used herein, "impregnate" means that the hard surface cleaning
composition is placed in contact with a nonwoven substrate such
that at least a portion of the nonwoven substrate is penetrated by
the hard surface cleaning composition. For example, the hard
surface cleaning composition preferably saturates the nonwoven
substrate. In other embodiments, the cleaning composition of the
present invention is a car care composition, which is useful for
cleaning various surfaces such as hard wood, tile, ceramic,
plastic, leather, metal, and/or glass. In some embodiments, the
cleaning composition is a dish-washing composition, such as, for
example, a liquid hand dishwashing composition, a solid automatic
dishwashing composition, a liquid automatic dishwashing
composition, and a tab/unit dose form automatic dishwashing
composition.
[0315] In further embodiments, the cleaning composition can be used
in industrial environments for cleaning of various equipment,
machinery, and for use in oil drilling operations. For example, the
cleaning composition of the present invention can be particularly
suited in environments wherein it comes into contact with free
hardness and in compositions that require hardness tolerant
surfactant systems, such as when used to aid oil drilling.
[0316] In some embodiments, the cleaning composition of the
invention can be formulated into personal or pet care compositions
such as shampoos, body washes, or liquid or solid soaps.
[0317] Common cleaning adjuncts applicable to most cleaning
compositions, including household cleaning compositions, and
personal care compositions and the like, include builders, enzymes,
polymers, suds boosters, suds suppressors (antifoam), dyes,
fillers, germicides, hydrotropes, anti-oxidants, perfumes,
pro-perfumes, enzyme stabilizing agents, pigments, and the like. In
some embodiments, the cleaning composition is a liquid cleaning
composition, wherein the composition comprises one or more selected
from solvents, chelating agents, dispersants, and water. In other
embodiments, the cleaning composition is a solid, wherein the
composition further comprises, for example, an inorganic filler
salt. Inorganic filler salts are conventional ingredients of solid
cleaning compositions, present in substantial amounts, varying
from, for example, about 10 wt. % to about 35 wt. %. Suitable
filler salts include, for example, alkali and alkaline-earth metal
salts of sulfates and chlorides. An exemplary filler salt is sodium
sulfate.
[0318] Household cleaning compositions, including, e.g., laundry
detergents and household surface cleaners, typically comprise
certain additional, in some embodiments, more specialized,
ingredients or cleaning adjuncts selected from one or more of:
bleaches, bleach activators, catalytic materials, suds boosters,
suds suppressors (antifoams), diverse active ingredients or
specialized materials such as dispersant polymers (e.g., dispersant
polymers sold by BASF or Dow Chemicals), silvercare, anti-tarnish
and/or anti-corrosion agents, dyes, germicides, alkalinity sources,
hydrotropes, anti-oxidants, enzyme stabilizing agents,
pro-perfumes, perfumes, solubilizing agents, carriers, processing
aids, pigments, and, for liquid formulations, solvents, chelating
agents, dye transfer inhibiting agents, dispersants, brighteners,
dyes, structure elasticizing agents, fabric softeners,
anti-abrasion agents, hydrotropes, processing aids, and other
fabric care agents. The cleaning adjuncts particularly useful for
household cleaning compositions and the levels of use have been
described in, e.g., U.S. Pat. Nos. 5,576,282, 6,306,812 and
6,326,348. A comprehensive list of suitable laundry or other
household cleaning adjuncts is, e.g., in International Publication
WO 99/05245.
[0319] Personal/pet or beauty care cleaning compositions including,
e.g., shampoos, facial cleansers, hand sanitizers, bodywash, and
the like, can also comprise, in some embodiments, other more
specialized adjuncts, including, for example, conditioning agents
such as vitamins, silicone, silicone emulsion stabilizing
components, cationic cellulose or polymers such as Guar polymers,
anti-dandruff agents, antibacterial agents, dispersed gel network
phase, suspending agents, viscosity modifiers, dyes, non-volatile
solvents or diluents (water soluble or insoluble), foam boosters,
pediculocides, pH adjusting agents, perfumes, preservatives,
chelants, proteins, skin active agents, sunscreens, UV absorbers,
and minerals, herbal/fruit/food extracts, sphingolipid derivatives
or synthetic derivatives and clay.
Common Adjuncts
[0320] (1) Enzymes
[0321] Various known detersive enzymes can be blended into a
cleaning composition of the present invention. Suitable enzymes
include, e.g., proteases, amylases, lipases, cellulases,
pectinases, mannanases, arabinases, galactanases, xylanases, oxidases
(e.g., laccases), peroxidases, and/or mixtures thereof. They can
provide enhanced cleaning performance and/or fabric care benefits.
In general, just as the selection of the type and amount of
surfactants to be formulated into a cleaning composition should
take account of the enzymes therein, the types of enzyme chosen to
be included in the composition should take account of the other
components in the composition (including the surfactants).
Considerations may include, e.g., the pH-optimum of the overall
composition, the presence or absence of enzyme stabilization
agents, etc. The enzymes should be present in the cleaning
compositions in effective amounts.
[0322] Suitable proteases include those of animal, vegetable or
microbial origin. Microbial origin is preferred. Chemically
modified or engineered mutants (e.g., those described in
International Publications WO 92/19729, 98/20115, 98/20116, and
98/34946) can also be included. Suitable proteases can be a serine
protease or a metallo protease, preferably an alkaline microbial
protease or a trypsin-like protease. Examples of alkaline proteases
are subtilisins, especially those derived from Bacillus, e.g.,
subtilisin Novo, subtilisin Carlsberg, subtilisin 309, subtilisin
147 and subtilisin 168 (as described in International Publications
WO 89/06279 and WO 05/103244). Other suitable serine proteases
include those from Micrococcineae sp. and those from Cellulomonas sp.
and variants thereof as, e.g., described in International
Publication WO05/052146. Examples of trypsin-like proteases include
trypsin (e.g. of porcine or bovine origin) and Fusarium proteases
such as those described in International Publications WO 89/06270
and WO 94/25583. Many proteases are commercially available,
including, e.g., Alcalase.TM., Savinase.TM., Primase.TM.,
Duralase.TM., Esperase.TM., Coronase.TM., Polarzyme.TM., Kannase.TM.
(Novozymes A/S), Maxatase.TM., Maxacal.TM., Maxapem.TM.,
Properase.TM., Purafect.TM., Purafect Prime.TM., Purafect OxP.RTM.,
FNA, FN2, FN3, and FN4 (Genencor Int'l Inc.).
[0323] Suitable lipases include those of bacterial or fungal
origin. For example, suitable lipases can be derived from yeast,
from genera such as Candida, Kluyveromyces, Pichia, Saccharomyces,
Schizosaccharomyces, or Yarrowia, or derived from a filamentous
fungus, such as Acremonium, Aspergillus, Aureobasidium,
Cryptococcus, Filobasidium, Fusarium, Humicola, Magnaporthe,
Mucor, Myceliophthora, Neocallimastix, Neurospora, Paecilomyces,
Penicillium, Piromyces, Schizophyllum, Talaromyces, Thermoascus,
Thielavia, Tolypocladium, Thermomyces or Trichoderma. Many
chemically modified lipases are also suitable, including, for
example, those from Humicola (e.g., a modified lipase from H.
lanuginosa as described in EP 258 068 and 305 216, a modified
lipase from H. insolens as described in International Publication
WO 96/13580), those from Pseudomonas (e.g., a modified lipase from
P. alcaligenes, or from P. pseudoalcaligenes as described in EP 218
272, a modified lipase from P. cepacia as described in EP 331 376,
a modified lipase from P. stutzeri as described in GB 1,372,034, a
modified lipase from P. fluorescens or Pseudomonas sp. strain SD
705, as described in International Publications WO 95/06720 and
WO96/27002, a modified lipase from P. wisconsinensis as described
in International Publication WO 96/12012), those from Bacillus
(e.g., a modified lipase from B. subtilis as described in Dartois et
al., Biochimica et Biophysica Acta, 1131, 253-360 (1993), a
modified lipase from B. stearothermophilus as described in JP
Application 64/744992, a modified lipase from B. pumilus as
described in International Publication WO 91/16422). Other examples
of lipase variants include those described in International
Publications WO 92/05249, WO 94/01541, WO 95/35381, WO 96/00292, WO
95/30744, WO 94/25578, WO 95/14783, WO 95/22615, WO 97/04079, WO
97/07202, and EP 407 225 and 260 105.
[0324] A number of lipase enzymes, which can be included in a
cleaning composition of the invention, are commercially available.
They include Lipolase.TM., Lipolase.TM. Ultra and Lipex.TM.
(Novozymes A/S). Suitable amylases (.alpha. and/or .beta.) include
those of bacterial or fungal origin. Chemically modified or
engineered mutant amylases can also be suitably included in a
cleaning composition of the invention. Amylases include, for
example, .alpha.-amylases obtained from Bacillus (for example, from
a special strain of B. licheniformis as described in GB Patent
1,296,839). Various mutant amylases, which can be suitably included
in a cleaning composition, have been described in International
Publications WO 94/02597, WO 94/18314, WO 96/23873, and WO
97/43424. A number of amylases, which can be included in a cleaning
composition of the present invention, are commercially available.
They include Duramyl.TM., Termamyl.TM., Stainzyme.TM., Stainzyme
Ultra.TM., Stainzyme Plus.TM., Fungamyl.TM. and BAN.TM. (Novozymes
A/S), Rapidase.TM. and Purastar.TM. (Genencor Int'l Inc.).
[0325] Suitable cellulases include those of bacterial or fungal
origin. Chemically modified or engineered mutant cellulases can
also be suitably included in a cleaning composition of the
invention. Cellulases include, for example, those obtained from the
genera Bacillus, Pseudomonas, Humicola (e.g., from Humicola
insolens), Fusarium (e.g., from Fusarium oxysporum), Thielavia,
Acremonium, Myceliophthora, as described in U.S. Pat. Nos.
4,435,307, 5,648,263, 5,691,178, 5,776,757, and International
Publication WO 89/09259. Especially suitable cellulases are the
alkaline or neutral cellulases that impart color care benefits.
Examples of such cellulases include those described in EP 0 495
257, 0 531 372, and International Publications WO 96/11262, WO
96/29397, and WO 98/08940. A number of cellulases, especially those
that provide added color care benefits, are commercially available,
which can be included in a cleaning composition of the invention,
especially in, for example, a laundry detergent composition.
Commercially available cellulases include, e.g., Renozyme.TM.,
Celluclean.TM., Endolase.TM., Celluzyme.TM., and Carezyme.TM.
(Novozymes A/S), Clazinase.TM. and Puradax HA.TM. (Genencor
International Inc.), and KAC-500(B).TM. (Kao Corporation).
[0326] Suitable peroxidases/oxidases include those of plant,
bacterial or fungal origin. Chemically modified or engineered
mutant peroxidases/oxidases can also be suitably included in a
cleaning composition of the invention. Useful peroxidases include,
for example, those obtained from the genera Coprinus (e.g., a
peroxidase from C. cinereus and variants thereof as described in
International Publications WO 93/24618, WO 95/10602, and WO
98/15257). Commercially available peroxidases include, for example,
Guardzyme.TM. (Novozymes A/S).
[0327] Suitable enzymes described above can be present in a
cleaning composition of the present invention at levels of about
0.00001 wt. % or higher (e.g., about 0.01 wt. % or higher, about 0.1
wt. % or higher, about 0.5 wt. % or higher, or about 1 wt. % or
higher). For example, one or more such enzymes can be present in a
cleaning composition of the invention in an amount of about 0.00001
wt. % to about 2 wt. % (e.g., about 0.0001 wt. % to about 1 wt. %,
about 0.001 wt. % to about 0.5 wt. %) based on the total weight of
the cleaning composition. In certain embodiments, the enzyme(s) can
be present or used at very low levels, for example, at about 0.001
wt. % or lower. In alternative embodiments, enzyme(s) can be
formulated, for example, into a heavier duty laundry detergent
composition, at about 0.1 wt. % and higher, for example, at about
0.5 wt. % or higher.
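The weight-percent levels above translate into enzyme mass by simple arithmetic. The following is a minimal sketch, not part of the original disclosure; the function names and the 1 kg batch size are hypothetical and chosen only for illustration, while the default limits of about 0.00001 wt. % and about 2 wt. % are taken from the range stated above.

```python
def enzyme_mass_g(batch_mass_g: float, wt_percent: float) -> float:
    """Mass of enzyme (g) at a given weight percent of the total composition."""
    if not 0.0 <= wt_percent <= 100.0:
        raise ValueError("wt_percent must be between 0 and 100")
    return batch_mass_g * wt_percent / 100.0


def within_stated_range(wt_percent: float,
                        low: float = 0.00001,
                        high: float = 2.0) -> bool:
    """True if the level lies in the about 0.00001 to about 2 wt. % range."""
    return low <= wt_percent <= high


if __name__ == "__main__":
    batch_g = 1000.0  # hypothetical 1 kg batch of cleaning composition
    for level in (0.00001, 0.001, 0.5, 2.0):
        mass = enzyme_mass_g(batch_g, level)
        print(f"{level} wt. % -> {mass:.6f} g, in range: {within_stated_range(level)}")
```

Because the ranges are qualified by "about", the hard cut-offs in this sketch should be read as illustrative only.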
[0328] 2) Enzyme Stabilizers
[0329] In certain embodiments, the cleaning composition of the
present invention, which comprises one or more enzymes, further
comprises one or more enzyme stabilizers. For example, the enzymes
employed in the cleaning composition can be stabilized by the
presence of water-soluble sources of calcium and/or magnesium ions
in the finished compositions that provide such ions to the enzymes.
Known stabilizing agents include, for example, a polyol such as
propylene glycol or a glycerol, a sugar or a sugar alcohol, a
lactic acid, a boric acid, a boric acid derivative such as an
aromatic borate ester, and a phenyl boronic acid derivative such as a
4-formylphenyl boronic acid. The enzyme stabilizers can be
incorporated into the cleaning composition according to known
methods, such as, for example, those described in International
Publications WO 92/19709 and WO 92/19708.
[0330] 3) Builders
[0331] Cleaning compositions of the present invention optionally
comprise one or more detergent builders or builder systems. When a
builder is used, the subject composition can comprise, e.g., at
least about 1 wt. % (e.g., at least about 1 wt. %, at least about 5
wt. %, at least about 10 wt. %, at least about 20 wt. %, at least
about 30 wt. %, at least about 40 wt. %, at least about 50 wt. %,
or more) of one or more builders. For example, a solid cleaning
composition of the present invention can comprise, e.g., about 1
wt. % to about 60 wt. % (e.g., about 5 wt. % to about 50 wt. %,
about 10 wt. % to about 40 wt. %, about 15 wt. % to about 30 wt. %)
of one or more builders or a builder system. For example, a liquid
cleaning composition of the present invention can comprise about 0
wt. % to about 10 wt. % of one or more detergency builders.
[0332] Various known builder materials can be used, including,
e.g., aluminosilicate materials, silicates, polycarboxylates,
alkyl- or alkenyl-succinic acid, and fatty acids, materials such as
ethylenediamine tetraacetate, diethylene triamine
pentamethyleneacetate, metal ion sequestrants such as
aminopolyphosphonates, particularly ethylenediamine tetramethylene
phosphonic acid and diethylene triamine pentamethylenephosphonic
acid. Particularly, builder materials such as calcium sequestrant
materials, precipitating materials, calcium ion-exchange materials,
polycarboxylate materials, citrate builder, succinic acid builders,
aminocarboxylates, and mixtures thereof are preferred.
[0333] Examples of calcium sequestrant builder materials include
alkali metal polyphosphates, such as sodium tripolyphosphate, and
organic sequestrants, such as ethylene diamine tetra-acetic acid.
Examples of precipitating builder materials include sodium
orthophosphate and sodium carbonate. Examples of calcium
ion-exchange builder materials include the water-insoluble
crystalline or amorphous aluminosilicates, of which zeolites are
the best known, e.g., zeolite A, zeolite B (also known as zeolite
P), zeolite C, zeolite X, zeolite Y, and also the zeolite P-type as
described in, e.g., EP 0 384 070.
[0334] Citrate builders, including, for example, citric acid and
soluble salts thereof (particularly the sodium salt), are
polycarboxylate builders of particular importance for heavy duty
liquid detergent formulations due to their availability from
renewable resources and their biodegradability.
Oxydisuccinates are also especially useful in such compositions and
combinations. Useful succinic acid builders can also be
C.sub.5-C.sub.20 alkyl and alkenyl succinic acids and salts
thereof, including laurylsuccinate, myristylsuccinate,
palmitylsuccinate, 2-dodecenylsuccinate, and 2-pentadecenylsuccinate,
with dodecenylsuccinic acid being particularly preferred.
[0335] Suitable polycarboxylate builders include, for example,
cyclic compounds, particularly alicyclic compounds, such as those
described in U.S. Pat. Nos. 3,308,067, 3,723,322, 3,835,163;
3,923,679; 4,102,903, 4,120,874, 4,144,226, and 4,158,635.
[0336] Ether hydroxypolycarboxylates, copolymers of maleic
anhydride with ethylene or vinyl methyl ether, 1,3,5-trihydroxy
benzene-2,4,6-trisulphonic acid, and carboxymethyl oxysuccinic
acid, various alkali metal, ammonium, and substituted ammonium
salts of poly acetic acids such as ethylenediamine tetraacetic acid
and nitrilotriacetic acid, and polycarboxylates such as mellitic
acid, succinic acid, oxy-disuccinic acid, polymaleic acid, benzene
1,3,5-tricarboxylic acid, carboxymethyloxy-succinic acid, and
soluble salts thereof can be used as builders. Other
nitrogen-containing, phospho-free aminocarboxylates are sometimes
used. Specific examples include ethylene diamine disuccinic acid
and salts thereof (ethylene diamine disuccinates, EDDS), ethylene
diamine tetraacetic acid and salts thereof (ethylene diamine
tetraacetates, EDTA), and diethylene triamine penta acetic acid and
salts thereof (diethylene triamine penta acetates, DTPA). In
particular embodiments of a liquid composition,
3,3-dicarboxy-4-oxa-1,6-hexanedioates and related compounds as
described in U.S. Pat. No. 4,566,984 can be suitable.
[0337] 4) Chelating Agents
[0338] Cleaning compositions of the present invention can
optionally comprise one or a mixture of more than one copper, iron
and/or manganese chelating agents. When such an agent is used, the
subject cleaning composition can comprise, for example, about 0.005
wt. % or more (e.g., about 0.01 wt. % or more, about 1 wt. % or
more, about 5 wt. % or more, about 10 wt. % or more) chelating
agents. For example, a cleaning composition of the invention
comprises about 0.005 wt. % to about 15 wt. % (e.g., about 0.01 wt.
% to about 12 wt. %, about 0.1 wt. % to about 10 wt. %, about 1 wt.
% to about 8 wt. %, about 2 wt. % to about 6 wt. %) chelating
agents.
[0339] Suitable chelating agents include, e.g., amino carboxylates,
amino phosphonates, polyfunctionally-substituted aromatic chelating
agents, or mixtures, which are capable of removing copper, iron, or
manganese ions from washing mixtures by forming soluble
chelates.
[0340] Amino carboxylates include, for example,
ethylenediaminetetraacetates,
N-hydroxyethylethylenediaminetriacetates, nitrilotriacetates,
ethylenediamine tetrapropionates,
triethylenetetraamine-hexaacetates, diethylenetriamine
penta-acetates, ethanoldiglycines, and alkali metal, ammonium, and
substituted ammonium salts thereof.
[0341] Amino phosphonates are selectively used in cleaning
compositions because they increase the amount of total phosphorus.
For some applications wherein the amount of total phosphorus in a
cleaning composition is limited, amino phosphonates may not be a
suitable chelating agent or should be used in low amounts. Amino
phosphonates include, e.g., ethylenediamine tetrakis
(methylenephosphonates). The amino phosphonates preferably do not
contain alkyl or alkenyl groups with more than about 6 carbon
atoms.
[0342] Suitable polyfunctionally-substituted aromatic chelating
agents have been described in, e.g., U.S. Pat. No. 3,812,044.
Exemplary polyfunctionally-substituted aromatic chelating agents
include a dihydroxydisulfobenzene, such as a
1,2-dihydroxy-3,5-disulfobenzene.
[0343] In some embodiments, biodegradable chelators can be included
in a cleaning composition of the invention. An exemplary
biodegradable chelator is ethylenediamine disuccinate ("EDDS"),
especially the [S,S] isomer as described in U.S. Pat. No.
4,704,233.
[0344] The compositions herein may also contain water-soluble
methyl glycine diacetic acid (MGDA) salts (or acid form) as a
chelant or co-builder useful with, for example, insoluble builders
such as zeolites, layered silicates and the like.
[0345] 5) Hydrotropes
[0346] Hydrotropes can be optionally included in cleaning
compositions of the present invention to improve the physical and
chemical stability of the compositions. Suitable hydrotropes
include sulfonated hydrotropes, which include, for example, alkyl
aryl sulfonates, or alkyl aryl sulfonic acids. Alkyl aryl
sulfonates can be sodium, potassium, calcium, or ammonium xylene
sulfonates; sodium, potassium, calcium, or ammonium toluene
sulfonates; sodium, potassium, calcium, or ammonium cumene
sulfonates; sodium, potassium, calcium, or ammonium substituted or
unsubstituted naphthalene sulfonates, and mixtures thereof.
Preferred among these are the sodium salts. Alkyl aryl sulfonic
acids can be xylenesulfonic acid, toluenesulfonic acid,
cumenesulfonic acid, substituted or unsubstituted
naphthalenesulfonic acid, or salts thereof. In certain embodiments,
a mixture of xylenesulfonic acid and p-toluene sulfonate can be
used.
[0347] If present, a cleaning composition of the present invention
comprises hydrotropes in an amount of about 0.01 wt. % or more
(e.g., about 0.02 wt. % or more, about 0.05 wt. % or more, about
0.1 wt. % or more, about 1 wt. % or more, about 5 wt. % or more,
about 10 wt. % or more, or about 15 wt. % or more). On the other
hand, a cleaning composition of the present invention comprises
hydrotropes in an amount of no more than about 20 wt. % (e.g., no more
than about 20 wt. %, no more than about 15 wt. %, no more than
about 10 wt. %, no more than about 5 wt. %, no more than about 1
wt. %). In certain embodiments, the cleaning composition can
comprise hydrotropes in an amount of about 0.01 wt. % to about 20
wt. % (e.g., about 0.02 wt. % to about 18 wt. %, about 0.05 wt. %
to about 15 wt. %, about 0.1 wt. % to about 10 wt. %, about 1 wt. %
to about 5 wt. %), based on the total weight of the cleaning
composition.
[0348] 6) Rheology Modifier
[0349] A cleaning composition of the present invention, when in the
form of a liquid, can suitably comprise a rheology modifier to
provide a matrix that is "shear-thinning." A shear-thinning fluid,
as it is understood by those skilled in the art, is a fluid the
viscosity of which decreases as shear is applied. Thus, at rest,
for example, during storage or shipping of a composition, the
liquid matrix of the composition preferably has a relatively high
viscosity. When shear is applied to the composition, however, such
as in the act of pouring or squeezing the composition from its
container, the viscosity of the matrix is lowered to the extent
that dispensing of the fluid product is easily and readily
accomplished.
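The shear-thinning behavior described above can be illustrated numerically. The sketch below assumes a power-law (Ostwald-de Waele) model, which is a common textbook description and is not stated in the original text; the consistency index K and flow index n are hypothetical values chosen for illustration. A flow index below 1 gives a viscosity that falls as the shear rate rises.

```python
def apparent_viscosity(shear_rate: float,
                       consistency_k: float,
                       flow_index_n: float) -> float:
    """Apparent viscosity (Pa.s) of a power-law fluid: eta = K * rate**(n - 1).

    Assumed model, not taken from the source text. n < 1 is shear-thinning.
    """
    if shear_rate <= 0.0:
        raise ValueError("shear_rate must be positive")
    return consistency_k * shear_rate ** (flow_index_n - 1.0)


if __name__ == "__main__":
    k, n = 10.0, 0.4  # hypothetical parameters for a structured liquid
    for rate in (0.1, 1.0, 10.0, 100.0):  # 1/s, from near rest to pouring
        print(f"shear rate {rate:>6} 1/s -> {apparent_viscosity(rate, k, n):.3f} Pa.s")
```

Low shear rates stand in for storage or shipping and high shear rates for pouring or squeezing; the printed viscosities decrease monotonically across that range.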
[0350] Various materials that are capable of forming shear-thinning
fluids when combined with water or other aqueous liquids are known.
One type of useful structuring agent for this purpose comprises
non-polymeric (except for conventional alkoxylation) crystalline
hydroxy-functional materials that can form thread-like structuring
systems throughout the liquid matrix when crystallized within the
matrix in situ. Such materials include, e.g., crystalline
hydroxyl-containing fatty acids, fatty esters, or fatty waxes.
Specific examples of crystalline hydroxyl-containing rheology
modifiers include castor oil and derivatives. Preferred are
hydrogenated castor oil derivatives such as hydrogenated castor oil
and hydrogenated castor wax. A number of these materials are
commercially available, including, for example, THIXCIN.RTM.
(Elementis Specialties), 1,4-di-O-benzyl-D-Threitol in the R,R, and
S, S forms and any mixtures, optically active or not, and others
described in, for example, U.S. Pat. No. 6,080,708 and
International Publication WO 02/40627.
[0351] Suitable polymeric rheology modifiers include those of the
polyacrylate, polysaccharide or polysaccharide derivative type.
Polysaccharide derivatives typically used as rheology modifiers
comprise polymeric gum materials. Such gums include pectin,
alginate, arabinogalactan, carrageenan, gellan gum, xanthan gum and
guar gum. Another suitable rheology modifier is a combination of a
solvent and a polycarboxylate polymer. The solvent can be, for
example, an alkylene glycol, more preferably dipropylene glycol. For
example, the solvent can comprise a mixture of dipropyleneglycol
and 1,2-propanediol, with a ratio of dipropyleneglycol to
1,2-propanediol being about 3:1 to about 1:3 (e.g., about 1:1). The
polycarboxylate polymer can be, e.g., a polyacrylate,
polymethacrylate, or mixtures thereof. The polyacrylate can be a
copolymer of unsaturated mono- or di-carbonic acid and 1-30C alkyl
ester of the (meth) acrylic acid, or a polyacrylate of unsaturated
mono- or di-carbonic acid and 1-30C alkyl ester of the (meth)
acrylic acid. Some of these polymers are commercially available,
for example, under the tradename Carbopol.RTM. Aqua 30 (Lubrizol,
Wickliffe, Ohio).
[0352] The rheology modifiers can be present at a level of about
0.5 wt. % to about 15 wt. % (e.g., about 1 wt. % to about 12 wt. %,
about 2 wt. % to about 9 wt. %), based on the total weight of the
cleaning composition. The polycarboxylate polymer is suitably
present at a level of about 0.1 wt. % to about 10 wt. % (e.g.,
about 1 wt. % to about 8 wt. %, about 1.5 wt. % to about 6 wt. %, about
2 wt. % to about 5 wt. %) in the cleaning composition.
[0353] 6) Solvents or Solvent Systems
[0354] A cleaning composition of the invention can be in a liquid
form, wherein one or more suitable solvents or solvent systems are
included. Suitable solvents include water, lipophilic fluids, or
organic solvents. Examples of suitable lipophilic fluids include
siloxanes, other types of silicones, hydrocarbons, glycol ethers,
glycerine derivatives such as glycerine ethers, perfluorinated
amines, perfluorinated and hydrofluoroether solvents,
low-volatility nonfluorinated organic solvents, diol solvents,
other environmentally-friendly solvents and mixtures. Particularly
suitable solvents include low molecular weight primary and
secondary alcohols, such as methanol, ethanol, propanol, or
isopropanol. Polyols, e.g., those containing about 2
to about 6 carbon atoms, and/or about 2 to about 6 hydroxy groups
(e.g., propylene glycol, ethylene glycol, glycerin, and
1,2-propanediol) are also suitable.
[0355] Solvents can be absent, for example, from anhydrous solid
embodiments of the cleaning compositions of the invention. But in a
liquid cleaning composition, they are typically present at levels
of about 0.1 wt. % to about 98 wt. % (e.g., about 1 wt. % to about
90 wt. %, about 10 wt. % to about 80 wt. %, about 20 wt. % to about
75 wt. %).
[0356] 7) Organic Sequestering Agent
[0357] A cleaning composition of the invention can optionally
comprise about 0.01 wt. % to about 1.0 wt. % of an organic
sequestering agent. Non-limiting examples of organic sequestering
agents include nitrilotriacetic acid, EDTA, organic phosphonates,
sodium citrate, sodium tartrate monosuccinate, sodium tartrate
disuccinate, and mixtures thereof.
[0358] Certain adjuncts are particularly suitable for
laundry/household cleaning applications as compared to for
personal/beauty care cleaning compositions, while other adjuncts
are vice versa. Certain adjuncts are categorized and described
below as particularly suitable for the former or the latter, but
that categorization is not meant to be exclusive in that adjuncts
that are suitable for laundry/household cleaning compositions can
be included in personal/beauty care cleaning compositions and vice
versa as appropriate.
Adjuncts Particularly Suitable for Laundry/Household
Applications
[0359] 1) Bleach System
[0360] A bleach system suitable for use herein can contain one or
more bleaching agents. Suitable bleaching agents include, e.g.,
catalytic metal complexes, activated peroxygen sources, bleach
activators, bleach boosters, photobleaches, bleaching enzymes, free
radical initiators, and hypohalite bleaches. Suitable activated
peroxygen sources include, e.g., preformed peracids, a hydrogen
peroxide source with a bleach activator, or a mixture thereof.
Suitable preformed peracids include, e.g., percarboxylic acids and
salts, percarbonic acids and salts, perimidic acids and salts,
peroxymonosulfuric acids and salts, and mixtures thereof. Suitable
sources of hydrogen peroxide include, e.g., perborate compounds,
percarbonate compounds, perphosphate compounds and mixtures.
Suitable types and levels of activated peroxygen sources are
described in, e.g., U.S. Pat. Nos. 5,576,282, 6,306,812, and
6,326,348.
[0361] A household cleaning composition of the invention can
optionally comprise photobleach, which can be, for example, a
xanthene dye photobleach, a photo-initiator, or mixtures thereof.
Suitable photobleaches can also be catalytic photobleaches and
photo-initiators. In certain embodiments, catalytic photobleaches
are selected from the group consisting of water soluble
phthalocyanines of the formula:
##STR00004##
wherein: PC is the phthalocyanine ring system; Me is Zn; Fe(II);
Ca; Mg; Na; K; Al--Z.sub.1; Si(IV); P(V); Ti(IV); Ge(IV); Cr(VI);
Ga(III); Zr(IV); In(III); Sn(IV) or Hf(VI); Z.sub.1 is a halide;
sulfate; nitrate; carboxylate; alkanolate; or hydroxyl ion; q is 0;
1 or 2; r is 1 to 4; Q.sub.1 is a sulfo or carboxyl group; or a radical
of the formula: --SO.sub.2X.sub.2--R.sub.1--X.sub.3.sup.+;
--O--R.sub.1--X.sub.3.sup.+; or --(CH.sub.2).sub.t--Y.sub.1.sup.+; in
which R.sub.1 is a branched or unbranched C.sub.1-C.sub.8 alkylene;
or 1,3- or 1,4-phenylene; X.sub.2 is --NH--; or
--N--C.sub.1-C.sub.5 alkyl; X.sub.3.sup.+ is a group of the
formula:
##STR00005##
or, in the case where R.sub.1.dbd.C.sub.1-C.sub.5 alkylene, also a
group of the formula:
##STR00006##
Y.sub.1.sup.+ is a group of the formula:
##STR00007##
wherein t is 0 or 1; R.sub.2 and R.sub.3 independently of one
another are C.sub.1-C.sub.6 alkyl; R.sub.4 is C.sub.1-C.sub.5
alkyl; C.sub.5-C.sub.7 cycloalkyl or NR.sub.7R.sub.8; R.sub.5 and
R.sub.6 independently of one another are C.sub.1-C.sub.5 alkyl;
R.sub.7 and R.sub.8 independently of one another are hydrogen or
C.sub.1-C.sub.5 alkyl; R.sub.9 and R.sub.10 independently of one
another are unsubstituted C.sub.1-C.sub.6 alkyl or C.sub.1-C.sub.6
alkyl substituted by hydroxyl, cyano, carboxyl,
carb-C.sub.1-C.sub.6 alkoxy, C.sub.1-C.sub.6 alkoxy, phenyl,
naphthyl or pyridyl; u is from 1 to 6; A.sub.1 is a unit which
completes an aromatic 5- to 7-membered nitrogen heterocycle, which
may where appropriate also contain one or two further nitrogen
atoms as ring members, and B.sub.1 is a unit which completes a
saturated 5- to 7-membered nitrogen heterocycle, which may where
appropriate also contain 1 to 2 nitrogen, oxygen and/or sulfur
atoms as ring members; Q.sub.2 is hydroxyl; C.sub.1-C.sub.22 alkyl;
branched C.sub.3-C.sub.22 alkyl; C.sub.2-C.sub.22 alkenyl; branched
C.sub.3-C.sub.22 alkenyl and mixtures thereof; C.sub.1-C.sub.22
alkoxy; a sulfo or carboxyl radical; a radical of the formula:
##STR00008##
a branched alkoxy radical of the formula:
##STR00009##
an alkylethyleneoxy unit of the formula:
-(T.sub.1).sub.d-(CH.sub.2).sub.b(OCH.sub.2CH.sub.2).sub.e-B.sub.3; or an
ester of the formula: COOR.sub.18, wherein B.sub.2 is hydrogen;
hydroxyl; C.sub.1-C.sub.30 alkyl; C.sub.1-C.sub.30 alkoxy;
--CO.sub.2H; --CH.sub.2COOH; --SO.sub.3-M.sub.1; --OSO.sub.3-M.sub.1;
--PO.sub.3.sup.2-M.sub.1; --OPO.sub.3.sup.2-M.sub.1; and mixtures
thereof; B.sub.3 is hydrogen; hydroxyl; --COOH; --SO.sub.3-M.sub.1;
--OSO.sub.3-M.sub.1 or C.sub.1-C.sub.6 alkoxy; M.sub.1 is a
water-soluble cation; T.sub.1 is --O--; or --NH--; X.sub.1 and
X.sub.4 independently of one another are --O--; --NH-- or
--N--C.sub.1-C.sub.5alkyl; R.sub.11 and R.sub.12 independently of
one another are hydrogen; a sulfo group and salts thereof; a
carboxyl group and salts thereof or a hydroxyl group; at least one
of the radicals R.sub.11 and R.sub.12 being a sulfo or carboxyl
group or salts thereof, Y.sub.2 is --O--; --S--; --NH-- or
--N--C.sub.1-C.sub.5alkyl; R.sub.13 and R.sub.14 independently of
one another are hydrogen; C.sub.1-C.sub.6 alkyl;
hydroxy-C.sub.1-C.sub.6 alkyl; cyano-C.sub.1-C.sub.6 alkyl;
sulfo-C.sub.1-C.sub.6 alkyl; carboxy or halogen-C.sub.1-C.sub.6
alkyl; unsubstituted phenyl or phenyl substituted by halogen,
C.sub.1-C.sub.4 alkyl or C.sub.1-C.sub.4 alkoxy; sulfo or carboxyl
or R.sub.13 and R.sub.14 together with the nitrogen atom to which
they are bonded form a saturated 5- or 6-membered heterocyclic ring
which may additionally also contain a nitrogen or oxygen atom as a
ring member; R.sub.15 and R.sub.16 independently of one another are
C.sub.1-C.sub.6 alkyl or aryl-C.sub.1-C.sub.6 alkyl radicals;
R.sub.17 is hydrogen; an unsubstituted C.sub.1-C.sub.6 alkyl or
C.sub.1-C.sub.6 alkyl substituted by halogen, hydroxyl, cyano,
phenyl, carboxyl, carb-C.sub.1-C.sub.6 alkoxy or C.sub.1-C.sub.6
alkoxy; R.sub.18 is C.sub.1-C.sub.22 alkyl; branched
C.sub.3-C.sub.22 alkyl; C.sub.1-C.sub.22 alkenyl or branched
C.sub.3-C.sub.22 alkenyl; C.sub.3-C.sub.22 glycol; C.sub.1-C.sub.22
alkoxy; branched C.sub.3-C.sub.22 alkoxy; and mixtures thereof; M
is hydrogen; or an alkali metal ion or ammonium ion, Z.sub.2.sup.-
is a chlorine; bromine; alkylsulfate or arylsulfate ion; a is 0 or
1; b is from 0 to 6; c is from 0 to 100; d is 0; or 1; e is from 0
to 22; v is an integer from 2 to 12; w is 0 or 1; and A is an
organic or inorganic anion, and s is equal to r in cases of
monovalent anions A.sup.- and less than or equal to r in cases of
polyvalent anions, it being necessary for A.sub.s.sup.- to
compensate the positive charge; where, when r is not equal to 1,
the radicals Q.sub.1 can be identical or different, and where the
phthalocyanine ring system may also comprise further solubilising
groups.
[0362] Other suitable catalytic photobleaches include xanthene
dyes, sulfonated zinc phthalocyanine, sulfonated aluminium
phthalocyanine, Eosin Y, Phloxine B, Rose Bengal, C. I. Food Red 14,
and mixtures. In some embodiments, a photobleach can be a mixture of
sulfonated zinc phthalocyanine and sulfonated aluminium
phthalocyanine, wherein the weight ratio of sulfonated zinc
phthalocyanine to sulfonated aluminium phthalocyanine is greater
than 1, greater than 1 but less than about 100, or from 1 to about
4.
[0363] Suitable photo-initiators include, e.g., aromatic
1,4-quinones such as anthraquinones and naphthaquinones; alpha
amino ketones, particularly those containing benzoyl moieties;
alphahydroxy ketones, particularly alpha-hydroxy acetophenones;
phosphorus-containing photoinitiators, including monoacyl, bisacyl
and trisacyl phosphine oxide and sulphides; dialkoxy acetophenones;
alpha-haloacetophenones; trisacyl phosphine oxides; benzoin and
benzoin based photoinitiators; and mixtures thereof.
Photo-initiators can, e.g., be 2-ethyl anthraquinone; Vitamin K3;
2-sulphate-anthraquinone; 2-methyl
1-[4-phenyl]-2-morpholinopropan-1-one (Irgacure.RTM. 907);
(2-benzyl-2-dimethyl amino-1-(4-morpholinophenyl)-butan-1-one
(Irgacure.RTM. 369); (1-[4-(2-hydroxyethoxy)-phenyl]-2
hydroxy-2-methyl-1-propan-1-one) (Irgacure.RTM. 2959);
1-hydroxy-cyclohexyl-phenyl-ketone (Irgacure.RTM. 184) (Ciba);
oligo[2-hydroxy 2-methyl-1-[4(1-methyl)-phenyl]propanone
(Esacure.RTM. KIP 150) (Lamberti);
2-4-6-(trimethyl-benzoyl)diphenyl-phosphine oxide,
bis(2,4,6-trimethylbenzoyl)-phenyl-phosphine oxide (Irgacure.RTM.
819); (2,4,6 trimethyl benzoyl) phenyl phosphinic acid ethyl ester
(Lucirin.RTM. TPO-L(BASF)); and mixtures thereof.
[0364] A number of photobleaches are commercially available,
including those described above, from, e.g., Aldrich; Frontier
Scientific; Ciba; BASF; Lamberti S.p.A; Dayglo Color Corporation;
Organic Dyestuffs Corp.
[0365] 2) Pearlescent Agents
[0366] Pearlescent agents are optional but commonly included
ingredients of a number of household cleaners, especially, e.g., in
hard surface cleaners. They are typically crystalline or glassy
solids, transparent or translucent compounds capable of reflecting
and/or refracting light to produce a pearlescent effect. For
example, they are crystalline particles insoluble in the
composition in which they are incorporated. Preferably the
pearlescent agents have the shape of thin plates or spheres (which
are generally spherical). As commonly practiced in the art,
particle sizes are measured across the largest diameter of spheres.
Plate-like particles are defined as those wherein the two
dimensions of the particle (length and width) are at least 5 times
the third dimension (depth or thickness). Other crystal shapes like
cubes or needles typically do not display pearlescent effect and
thus are not used as pearlescent agents.
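The plate-like criterion above (length and width each at least 5 times the thickness) can be expressed as a simple check. This is a minimal sketch; the function name and the sample dimensions are hypothetical, and only the factor of 5 comes from the text.

```python
def is_plate_like(length_um: float,
                  width_um: float,
                  thickness_um: float,
                  min_ratio: float = 5.0) -> bool:
    """True if both length and width are at least min_ratio times the thickness."""
    if min(length_um, width_um, thickness_um) <= 0.0:
        raise ValueError("all dimensions must be positive")
    return (length_um >= min_ratio * thickness_um
            and width_um >= min_ratio * thickness_um)


if __name__ == "__main__":
    # Hypothetical particle dimensions in micrometers: (length, width, thickness).
    samples = {"thin plate": (20.0, 15.0, 2.0),
               "needle": (20.0, 2.0, 2.0),
               "cube": (5.0, 5.0, 5.0)}
    for name, dims in samples.items():
        print(name, is_plate_like(*dims))
```

Under this check a needle fails because only one dimension exceeds the thickness by the required factor, which matches the statement that needles and cubes typically do not display a pearlescent effect.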
[0367] Suitable pearlescent agents have D0.99 (sometimes referred
to as D99) volume particle size of less than 50 .mu.m. Preferably
the pearlescent agents have D0.99 of less than 40 .mu.m, e.g., less
than 30 .mu.m. More preferably the particles have volume particle
size of greater than 1 .mu.m. The D0.99 is a measure of particle
size relating to particle size distribution and meaning in this
instance that 99% of the particles have volume particle size of
less than 50 .mu.m. Volume particle size and particle size
distribution can be measured using conventional methods and
equipment, such as, e.g., a Hydro 2000G (Malvern Instruments). The
choice of a particle size needs to balance the ease of distribution
vs. the efficacy of the pearlescent agent because the smaller the
particles, the easier they are suspended, but the lower the
efficacy.
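To make the D0.99 measure concrete, the sketch below computes a quantile of a binned volume-based particle size distribution. It assumes the instrument reports a volume fraction per size bin; the bin edges and fractions are invented for illustration and are not measurement data, and the function name is hypothetical.

```python
def d_quantile(bin_upper_edges_um, volume_fractions, q=0.99):
    """Smallest bin upper edge at which the cumulative volume fraction reaches q.

    Fractions are normalized by their sum, so they need not add up to exactly 1.
    """
    if len(bin_upper_edges_um) != len(volume_fractions):
        raise ValueError("edges and fractions must have the same length")
    total = sum(volume_fractions)
    if total <= 0.0:
        raise ValueError("volume fractions must sum to a positive value")
    cumulative = 0.0
    for edge, fraction in sorted(zip(bin_upper_edges_um, volume_fractions)):
        cumulative += fraction / total
        if cumulative >= q - 1e-12:  # small tolerance for rounding error
            return edge
    return max(bin_upper_edges_um)


if __name__ == "__main__":
    edges = [5, 10, 20, 30, 40, 60]                    # hypothetical bins, in micrometers
    fractions = [0.10, 0.30, 0.40, 0.15, 0.045, 0.005]  # hypothetical volume fractions
    d99 = d_quantile(edges, fractions, q=0.99)
    print("D0.99 (um):", d99, "| below 50 um:", d99 < 50)
```

The result is only as fine as the bin widths; a real instrument would typically interpolate within a bin rather than return the bin edge.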
[0368] Liquid compositions containing less water and more organic
solvents will typically have a refractive index that is higher in
comparison to the more aqueous compositions. In these compositions,
pearlescent agents with high refractive index are preferably
included because otherwise the pearlescent agents do not impart
sufficient visual pearlescence even when introduced at high levels
(e.g., more than about 3 wt. %). In liquid compositions containing
less water and more organic solvents, the pearlescent agent is
preferably one having a refractive index of more than 1.41 (e.g.,
more than 1.8, more than 2.0). In some embodiments, the difference
in refractive index between the pearlescent agent and the cleaning
composition or medium, to which pearlescent agent is added, is at
least 0.02, or at least 0.2, or at least 0.6.
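The refractive index criterion above reduces to comparing an absolute difference against a threshold. The following sketch is illustrative only: the function names and the sample refractive index values are hypothetical, and the thresholds of 0.02, 0.2, and 0.6 are the ones recited above.

```python
def refractive_index_contrast(agent_ri: float, medium_ri: float) -> float:
    """Absolute difference in refractive index between agent and medium."""
    return abs(agent_ri - medium_ri)


def meets_contrast(agent_ri: float, medium_ri: float,
                   min_delta: float = 0.02) -> bool:
    """True if the contrast is at least min_delta."""
    return refractive_index_contrast(agent_ri, medium_ri) >= min_delta


if __name__ == "__main__":
    medium_ri = 1.40  # hypothetical low-water, solvent-rich composition
    agent_ri = 1.58   # hypothetical pearlescent agent
    for threshold in (0.02, 0.2, 0.6):
        print(f"contrast >= {threshold}: {meets_contrast(agent_ri, medium_ri, threshold)}")
```

In practice the comparison would use refractive indices measured at the same wavelength and temperature for both the agent and the composition.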
[0369] A liquid cleaning composition may comprise about 0.01 wt. %
or more (e.g., about 0.02 wt. % or more, about 0.05 wt. % or more,
about 0.1 wt. % or more, about 0.5 wt. % or more, about 1.0 wt. %
or more, about 1.5 wt. % or more) of one or more pearlescent
agents. Typically, however, the liquid composition comprises no
more than about 2 wt. % (e.g., no more than about 1.5 wt. %, no
more than about 1.0 wt. %, no more than about 0.5 wt. %) of one or
more pearlescent agents. For example, a liquid cleaning composition
herein comprises about 0.01 wt. % to about 2.0 wt. % (e.g., about
0.1 wt. % to about 1.5 wt. %) of pearlescent agents.
[0370] Suitable pearlescent agents may be organic or inorganic.
Organic pearlescent agents include, e.g., monoester and/or diester
of alkylene glycols, propylene glycol, diethylene glycol,
dipropylene glycol, triethylene glycol or tetraethylene glycol with
fatty acids containing 6 to 22, preferably about 12 to about 18
carbon atoms, e.g., caproic acid, caprylic acid, 2-ethylhexanoic
acid, capric acid, lauric acid, isotridecanoic acid, myristic acid,
palmitic acid, palmitoleic acid, stearic acid, isostearic acid,
oleic acid, elaidic acid, petroselic acid, linoleic acid, linolenic
acid, arachic acid, gadoleic acid, behenic acid, erucic acid, and
mixtures.
[0371] Inorganic pearlescent agents include mica, metal oxide
coated mica, silica coated mica, bismuth oxychloride coated mica,
bismuth oxychloride, myristyl myristate, glass, metal oxide coated
glass, guanine, glitter, and mixtures thereof.
[0372] Organic pearlescent agents such as ethylene glycol
monostearate and ethylene glycol distearate provide pearlescence, but
typically only when the composition is in motion. Hence the
composition will exhibit pearlescence only when it is poured.
Inorganic pearlescent materials are preferred as they
provide both dynamic and static pearlescence. By dynamic
pearlescence it is meant that the composition exhibits a
pearlescent effect when the composition is in motion. By static
pearlescence it is meant that the composition exhibits pearlescence
when the composition is static.
[0373] Inorganic pearlescent agents are available as a powder, or
as a slurry of the powder in an appropriate suspending agent.
Suitable suspending agents include ethylhexyl hydroxystearate and
hydrogenated castor oil. The powder or slurry of the powder can be
added to the composition without the need for any additional
process steps.
[0374] Optionally, co-crystallizing agents can be used to enhance
the crystallization of the organic pearlescent agents. Suitable
co-crystallizing agents include but are not limited to fatty acids
and/or fatty alcohols having a linear or branched, optionally
hydroxyl substituted, alkyl group containing from about 12 to about
22, preferably from about 16 to about 22, and more preferably from
about 18 to 20 carbon atoms, such as palmitic acid, linoleic acid,
stearic acid, oleic acid, ricinoleic acid, behenic acid, cetearyl
alcohol, hydroxystearyl alcohol, behenyl alcohol, linolyl alcohol,
linolenyl alcohol, and mixtures thereof.
[0375] 3) Perfumes/Fragrances
[0376] The term "perfume" as used herein encompasses individual
perfume ingredients as well as perfume accords. The perfume
ingredients are often premixed to form a perfume accord prior to
adding to a cleaning composition. As used herein, the term
"perfume" can also include perfume microencapsulates. Perfume
microcapsules comprise perfume raw materials encapsulated within a
capsule made with materials selected from urea and formaldehyde;
melamine and formaldehyde; phenol and formaldehyde; gelatine;
polyurethane; polyamides; cellulose ethers; cellulose esters;
polymethacrylate; and mixtures thereof. Encapsulation techniques
are known and described in, for example, "Microencapsulation":
methods and industrial applications, Benita & Simon, eds.
(Marcel Dekker, Inc., 1996).
[0377] The perfume ingredients that can be included in a cleaning
composition can include various natural and synthetic chemicals.
Exemplary perfume ingredients include aldehydes, ketones, esters,
natural extracts, natural essences and the like.
[0378] Industrial cleaning compositions often do not comprise
perfume ingredients. However, perfume ingredients are commonly
found in household and personal care cleaning compositions. When
present, the level of perfume or perfume accord is, e.g., present
in an amount of about 0.0001 wt. % or more (e.g., about 0.01 wt. %
or more, about 0.1 wt. % or more, about 0.5 wt. % or more, about 2
wt. % or more), based on the total weight of the cleaning
composition. For example, the level of perfume or perfume accord
can be present in an amount of about 0.0001 wt. % to about 10 wt. %
(e.g., about 0.01 wt. % to about 5 wt. %, about 0.1 wt. % to about
2 wt. %, preferably about 0.02 wt. % to about 0.8 wt. %, or about
0.003 wt. % to about 0.6 wt. %), by weight of the detergent
composition. The level of perfume ingredients in a perfume accord,
if one exists, is typically from about 0.0001 wt. % to about 99 wt.
% by weight of the perfume accord. Exemplary perfume ingredients
and perfume accords are disclosed in, for example, U.S. Pat. Nos.
5,445,747, 5,500,138, 5,531,910, 6,491,840, and 6,903,061.
[0379] 4) Dyes, Colorants, and Preservatives
[0380] The cleaning compositions herein can optionally contain
dyes, colorants, and/or preservatives, or contain one or more, or
none of these components. The dyes, colorants and/or preservatives
can be naturally occurring or slightly processed from natural
materials, or they can be synthetic. For example, naturally occurring
preservatives include benzyl alcohol, potassium sorbate,
bisabolol, sodium benzoate, and 2-phenoxyethanol. Synthetic
preservatives can be selected from, for example, mildewstat or
bacteriostat, methyl, ethyl, and propyl parabens, bisguanidine
components (e.g., Dantagard.TM. and/or Glydant.TM. (Lonza Group)).
Mildewstat or bacteriostat compounds include, without limitation,
KATHON.RTM. GC, a 5-chloro-3-methyl-4-isothiazolin-3-one,
KATHON.RTM. ICP, a 2-methyl-4-isothiazolin-4-one, and a blend
thereof, and KATHON.RTM. 886, a
5-chloro-2-methyl-4-isothiazolin-3-one (Dow Chemicals); BRONOPOL, a
2-bromo-2-nitropropane-1,3-diol (Boots, Co. Ltd.); DOWICIDE.TM. A,
a 1,2-benzoisothiazolin-3-one (Dow Chemicals); and IRGASAN.RTM. DP
200, a 2,4,4'-trichloro-2-hydroxydiphenylether (Ciba-Geigy,
AG).
[0381] Dyes and colorants include synthetic dyes such as
Liquitint.RTM. Yellow or Blue or natural plant dyes or pigments,
such as natural yellow, orange, red, and/or brown pigment, such as
carotenoids, including, for example, beta-carotene and lycopene.
The composition can additionally contain fluorescent whitening
agents or bluing agents. Certain dyes can be light sensitive,
including for example Acid Blue 145 (Crompton), Hidacid.RTM. blue
(Hilton, Davis, Knowles & Triconh); Pigment Green No. 7,
FD&C Green No. 7, Acid Blue 1, Acid Blue 80, Acid Violet 48,
and Acid Yellow 17 (Sandoz Corp.); D&C Yellow No. 10 (Warner
Jenkinson).
[0382] If present, dyes or colorants are, e.g., present in an
amount of about 0.001 wt. % or more (e.g., about 0.002 wt. % or
more, 0.01 wt. % or more, 0.05 wt. % or more, 0.1 wt. % or more;
0.5 wt. % or more). Usually, dyes and colorants are present, if at
all, in an amount of no more than about 1 wt. % (e.g., no more than
about 0.8 wt. %, no more than about 0.5 wt. %, no more than about
0.2 wt. %, no more than about 0.1 wt. %, no more than about 0.01
wt. %). For example, dyes and colorants can be present in a
cleaning composition herein in an amount of about 0.001 wt. % to
about 1 wt. % (e.g., about 0.01 wt. % to about 0.4 wt. %).
[0383] 5) Fabric Care Benefit Agents
[0384] A household cleaning composition can be a laundry detergent,
wherein a preferred optional ingredient can be a fabric care
benefit agent. As used herein, "fabric care benefit agent" refers
to any material that can provide fabric care benefits such as
fabric softening, color protection, pill/fuzz reduction,
anti-abrasion, anti-wrinkle, and the like to garments and fabrics,
particularly on cotton and cotton-rich garments and fabrics, when
an adequate amount of the material is present on the
garment/fabric. Non-limiting examples of fabric care benefit agents
include cationic surfactants, silicones, polyolefin waxes,
latexes, oily sugar derivatives, cationic polysaccharides,
polyurethanes and mixtures. Suitable silicones include, e.g.,
silicone fluids such as poly(di)alkyl siloxanes, especially
polydimethyl siloxanes and cyclic silicones.
[0385] Polydimethyl siloxane derivatives include, e.g.,
organofunctional silicones. One embodiment of functional silicones
is the ABn type silicones, as described in U.S. Pat. Nos.
6,903,061, 6,833,344, and International Publication WO02/018528. A
number of silicones are commercially available, including, e.g.,
Waro.TM. and Silsoft.TM. 843 (GE Silicones). Functionalized
silicones or copolymers with one or more different types of
functional groups such as amino, alkoxy, alkyl, phenyl, polyether,
acrylate, silicon hydride, mercaptopropyl, carboxylic acid, and
quaternized nitrogen are also suitable as fabric care benefit
agents. A number of these are commercially available including,
e.g., SM2125, Silwet 7622 (GE Silicones), DC8822, PP-5495, DC-5562
(Dow Chemicals), KF-888, KF-889 (Shin Etsu Silicones);
Ultrasil.RTM. SW-12, Ultrasil.RTM. DW-18, Ultrasil.RTM. DW-AV,
Ultrasil.RTM. Q-Plus, Ultrasil.RTM. Ca-I, Ultrasil.RTM. CA-2,
Ultrasil.RTM. SA-I, Ultrasil.RTM. PE-100 (Noveon Inc.),
Pecosil.RTM. CA-20, Pecosil.RTM. SM-40, Pecosil.RTM. PAN-150
(Phoenix Chemical). Oily sugar derivatives suitable as fabric care
benefit agents were described in International Publication WO
98/16538. Olean.RTM. is a commercial brand for certain oily sugar
derivatives marketed by The Procter and Gamble Co.
[0386] Many dispersible polyolefins can be used to provide fabric
care benefits. The polyolefins can be in the form of waxes,
emulsions, dispersions, or suspensions. Preferably, the polyolefin
is a polyethylene, polypropylene, or a mixture. The polyolefin can
be partially modified to contain various functional groups, such as
carboxyl, alkylamide, sulfonic acid or amide groups. For example,
the polyolefin is partially carboxyl modified or oxidized.
[0387] Polymer latex can also be used to provide fabric care
benefits in a water based cleaning composition. Non-limiting
examples of polymer latexes include those described in, e.g.,
International Publication WO 02/018451. Additional non-limiting
examples include the monomers used in producing polymer latexes,
such as 100% or pure butylacrylate, butylacrylate and butadiene
mixtures with at least 20 wt. % of butylacrylate, butylacrylate and
less than 20 wt. % of other monomers excluding butadiene,
alkylacrylate with an alkyl carbon chain at or greater than
C.sub.6, alkylacrylate with an alkyl carbon chain at or greater
than C.sub.6 and less than 50 wt. % of other monomers, or a third
monomer added into monomer systems above.
[0388] Cationic surfactants are also useful in this invention.
Examples of cationic surfactants have been described in, e.g., U.S.
Patent Publication US2005/0164905.
[0389] Fatty acids can also be used as fabric care benefit agents.
When deposited on fabrics, fatty acids or soaps thereof provide
fabric care benefits (e.g., softness, shape retention) to laundry
fabrics. Useful fatty acids (or soaps, such as alkali metal soaps)
are the higher fatty acids containing from about 8 to about 24
carbon atoms, more preferably from about 12 to about 18 carbon
atoms. Soaps can be made by direct saponification of fats and oils
or by the neutralization of free fatty acids. Particularly useful
are the sodium and potassium salts of the mixtures of fatty acids
derived from coconut oil and tallow. Fatty acids can be from
natural or synthetic origin, both saturated and unsaturated with
linear or branched chains.
[0390] Color care agents are another type of fabric care benefit
agent that can be suitably included in a cleaning composition.
Examples include metallo catalysts for color maintenance, such as
those described in International Publication WO 98/39403.
[0391] Fabric care benefit agents, when present in a household
cleaning composition such as a laundry detergent composition, can
suitably be present at a level of up to about 30 wt. % (e.g., up to
about 20 wt. %, up to about 15 wt. %, up to about 10 wt. %, up to
about 5 wt. %, up to about 2 wt. %), based on the total weight of
the cleaning composition. For example, a cleaning composition of
the invention comprises about 1 wt. % to about 20 wt. % (e.g.,
about 2 wt. % to about 15 wt. %, about 5 wt. % to about 10 wt. %)
of one or more fabric care benefit agents.
[0392] 6) Deposition Aid
[0393] As used herein, "deposition aid" refers to any cationic
polymer or combination of cationic polymers that significantly
enhance the deposition of the fabric care benefit agent onto the
fabric during laundering. An effective deposition aid typically has
a strong binding capability with the water insoluble fabric care
benefit agents via physical forces such as van der Waals forces or
non-covalent chemical bonds such as hydrogen bonding and/or ionic
bonding.
[0394] An exemplary deposition aid is a cationic or amphoteric
polymer. Amphoteric polymers have a net cationic charge. The
cationic charge density of the polymer can range from about 0.05
milliequivalents/g to about 6 milliequivalents/g. The charge
density is calculated by dividing the number of net charges per
repeating unit by the molecular weight of the repeating unit.
Nonlimiting examples of deposition aids include cationic
polysaccharides, chitosan and its derivatives, and cationic
synthetic polymers. Specific deposition aids include, e.g.,
cationic hydroxy ethyl cellulose, cationic starch, cationic guar
derivatives, and mixtures. Certain deposition aids are commercially
available, including, e.g., the JR 30M, JR 400, JR 125, LR 400 and
LK 400 polymers (Amerchol Corp.), Celquat.RTM. H200, Celquat.RTM.
L-200, and the Cato.RTM. starch (National Starch and Chemical Co.),
and Jaguar Cl 3 and Jaguar Excel (Rhodia, Inc.).
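The charge density calculation described in paragraph [0394] can be sketched as follows. This Python example is illustrative only: the function names and the repeat unit values are hypothetical, and the factor of 1000 converts equivalents per gram to milliequivalents per gram on the assumption that the repeat unit molecular weight is given in g/mol.

```python
def charge_density_meq_per_g(net_charges_per_unit, repeat_unit_mw):
    """Net charges per repeating unit divided by the repeating unit
    molecular weight (g/mol), expressed in milliequivalents per gram."""
    if repeat_unit_mw <= 0:
        raise ValueError("repeat unit molecular weight must be positive")
    return 1000.0 * net_charges_per_unit / repeat_unit_mw


def in_stated_range(density, low=0.05, high=6.0):
    """Check a density against the about 0.05 to about 6 meq/g range."""
    return low <= density <= high


# Hypothetical polymer: one net cationic charge per 500 g/mol repeating unit.
density = charge_density_meq_per_g(1, 500.0)
ok = in_stated_range(density)
```

For the assumed repeating unit, the calculation gives 2.0 milliequivalents/g, which lies within the stated range.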
[0395] 7) Fabric Substantive and Hueing Dye
[0396] Dyes can be included in a cleaning composition of the
invention, e.g., a laundry detergent. Conventionally, dyes include
certain types of acid, basic, reactive, disperse, direct, vat,
sulphur or solvent dyes. For inclusion in cleaning compositions,
direct dyes, acid dyes, and reactive dyes are preferred. Direct
dyes are water-soluble dyes taken up directly by fibers from an
aqueous solution containing an electrolyte, presumably due to
selective adsorption. In the Color Index system, direct dye refers
to various planar, highly conjugated molecular structures that
contain one or more anionic sulfonate groups. Acid dyes are water
soluble anionic dyes that are applied from an acidic solution.
Reactive dyes are those containing reactive groups capable of
forming covalent linkages with certain portions of the molecules of
natural or synthetic fibers. Suitable fabric substantive dyes that
can be included in a cleaning composition include, e.g., an azo
compound, stilbenes, oxazines and phthalocyanines.
[0397] Hueing dyes are another type of dyes that may be present in
a household cleaning composition of the invention. Such dyes have
been found to exhibit good tinting efficiency during a laundry wash
cycle without exhibiting excessive undesirable build up during
laundering. Typically, a hueing dye is included in the laundry
detergent composition in an amount sufficient to provide a tinting
effect to fabric washed in a solution containing the detergent. In
one embodiment, the detergent composition comprises, e.g., about
0.0001 wt. % to about 0.05 wt. % (e.g., about 0.001 wt. % to about
0.01 wt. %) of a hueing dye.
[0398] 8) Dye Transfer Inhibitors
[0399] A household cleaning composition of the invention, e.g., a
laundry detergent composition, can comprise one or more compounds
for inhibiting dye transfer from one fabric to another of
solubilized and suspended dyes encountered during fabric laundering
operations involving colored fabrics. Exemplary dye transfer
inhibitors include polymeric dye transfer inhibiting agents, which
are capable of complexing or absorbing the fugitive dyes washed out
of dyed fabrics before the dyes have an opportunity to become
attached to other articles in the wash. Polymeric dye transfer
agents are described in, e.g., International Publication WO
98/39403. Modified polyethyleneimine polymers, such as those
described in International Publication WO 00/05334, which are
water-soluble or dispersible, modified polyamines, can also be used.
Other exemplary dye transfer inhibiting agents include, e.g.,
polyvinylpyridine N-oxide (PVNO), polyvinyl pyrrolidone (PVP),
polyvinyl imidazole, N-vinyl-pyrrolidone and N-vinylimidazole
copolymers (PVPVI), copolymers thereof, and mixtures. The amount of
dye transfer inhibiting agents in the cleaning composition can be,
e.g., about 0.01 wt. % to about 10 wt. % (e.g., about 0.02 wt. % to
about 5 wt. %, about 0.03 wt. % to about 2 wt. %).
[0400] 9) Optional Ingredients
[0401] Unless specified herein below, an "effective amount" of a
particular adjunct or ingredient is preferably present in an amount
of about 0.01 wt. % or more (e.g., about 0.1 wt. % or more, about
0.5 wt. % or more, about 1.0 wt. % or more, about 2.0 wt. % or
more), based on the total weight of the detergent composition.
Optional adjuncts are usually present in an amount of no more
than about 20 wt. % (e.g., no more than about 15 wt. %, no more
than about 10 wt. %, no more than about 5 wt. %, or no more than
about 1 wt. %).
[0402] Examples of other suitable cleaning adjuncts, one or more of
which may be included in a cleaning composition, include, e.g.,
effervescent systems comprising hydrogen peroxide and catalase;
optical brighteners or fluorescers; soil release polymers;
dispersants; suds suppressors; photoactivators; hydrolysable
surfactants; preservatives; anti-oxidants; anti-shrinkage agents;
gelling agents (e.g., amidoamines, amidoamine oxides, gellan gums);
anti-wrinkle agents; germicides; fungicides; color speckles;
antideposition agents such as cellulose derivatives; colored beads,
spheres or extrudates; sunscreens; fluorinated compounds; clays;
luminescent agents or chemiluminescent agents; anti-corrosion
and/or appliance protectant agents; alkalinity sources or other pH
adjusting agents; solubilizing agents; processing aids; pigments;
free radical scavengers, and mixtures. Suitable materials and
effective amounts are described in, e.g., U.S. Pat. Nos. 5,705,464,
5,710,115, 5,698,504, 5,695,679, 5,686,014 and 5,646,101. Mixtures
of the above components can be made in any proportion.
[0403] 10) Encapsulated Composition
[0404] A cleaning composition, such as a household cleaning
composition including a laundry detergent, a dishwashing liquid, or
a surface cleaning composition, of the present invention can
optionally be encapsulated within a water soluble film. The
water-soluble film can be made from polyvinyl alcohol or other
suitable variations, carboxy methyl cellulose, cellulose
derivatives, starch, modified starch, sugars, PEG, waxes, or
combinations thereof.
[0405] In certain embodiments, the water-soluble film may comprise
other adjuncts such as a copolymer of vinyl alcohol and a carboxylic
acid, the advantages of which have been described in, for example,
U.S. Pat. No. 7,022,656. An exemplary benefit of such encapsulation
practice is the improvement of the shelf-life of the pouched
composition. Another exemplary advantage is that this practice
provides improved cold water (e.g., less than 10.degree. C.)
solubility to the cleaning composition. The level of the co-polymer
in the film material is at least about 60 wt. % (e.g., about 65 wt.
%, about 70 wt. %, about 80 wt. %) by weight. The polymer can have
any average molecular weight, preferably about 1,000 daltons to
1,000,000 daltons (e.g., about 10,000 daltons to about 300,000
daltons, about 15,000 daltons to 200,000 daltons, about 20,000
daltons to 150,000 daltons). In certain embodiments, the copolymer
present in the film is about 60% to about 98% hydrolysed (e.g.,
about 80% to 95% hydrolysed), to improve the dissolution of the
material. In certain embodiments, the copolymer comprises about 0.1
mol % to about 30 mol % (e.g., about 1 mol % to about 6 mol %) of
carboxylic acid. In certain embodiments, the water-soluble film
comprises additional co-monomers, including, for example,
sulfonates and ethoxylates such as 2-acrylamido-2-methyl-1-propane
sulphonic acid. In further embodiments, the film can also comprise
other ingredients, including, for example, plasticizers, for
example, glycerol, ethylene glycol, diethyleneglycol, propane diol,
2-methyl-1,3-propane diol, sorbitol, and mixtures thereof,
additional water, disintegrating aids, fillers, anti-foaming
agents, emulsifying/dispersing agents, and/or antiblocking agents.
It may be useful that the pouch or water-soluble film itself
comprises a detergent additive to be delivered to the wash water,
for example organic polymeric soil release agents, dispersants, dye
transfer inhibitors. Optionally the surface of the film of the
pouch may be dusted with fine powder to reduce the coefficient of
friction. Sodium aluminosilicate, silica, talc and amylose are
examples of suitable fine powders. Certain water-soluble films are
commercially available, for example, those marketed under the
tradename M8630.TM. (Mono-Sol).
Adjuncts Particularly Suitable for Personal Care Applications
[0406] 1) Hair Conditioning Agents
[0407] Cleaning compositions of the invention can comprise, in some
embodiments such as, for example, in personal or beauty care
applications, certain known conditioning agents. An exemplary
conditioning agent especially suitable for personal care
compositions such as shampoos, is a silicone or a
silicone-containing material. Such materials can be selected from,
e.g., non-volatile silicones, siloxane gums and resins,
aminofunctional silicones, quaternary silicones, and mixtures
thereof with each other and with volatile silicones. Examples of
these silicone polymers have been disclosed, for example, in U.S.
Pat. No. 6,316,541.
[0408] Silicone oils are flowable silicone materials having a
viscosity as measured at 25.degree. C. of less than about 50,000
centistokes (e.g., less than about 30,000 centistokes). For
example, silicone oils typically have a viscosity of about 5
centistokes to about 50,000 centistokes (e.g., about 10 centistokes
to about 30,000 centistokes). Suitable silicone oils include
polyalkyl siloxanes, polyaryl siloxanes, polyalkylaryl siloxanes,
polyether siloxane copolymers, and mixtures. Other insoluble,
non-volatile silicone fluids having hair conditioning properties
can also be used. Methods of making microemulsions of silicone
particles are described in the art, including, e.g., the technique
described in U.S. Pat. No. 6,316,541. The silicone may, e.g., be a
liquid at ambient temperatures, so as to be of a suitable viscosity
to enable the material itself to be readily emulsified to the
required particle size of about 0.15 microns or less.
[0409] The amount of silicone incorporated into a cleaning
composition of the invention may depend on the type of composition
and the particular silicone materials used. A preferred amount is
about 0.01 wt. % to about 10 wt. %, although these limits are not
absolute. The lower limit is determined by the minimum level to
achieve acceptable conditioning for a target consumer group and the
upper limit by the maximum level to avoid making the hair and/or
skin unacceptably greasy. The activity of the microemulsion can be
adjusted accordingly to achieve the desired amount of silicone or a
lower level of the preformed microemulsion can be added.
[0410] The microemulsion of silicone oil may be further stabilized
by sodium lauryl sulfate or sodium lauryl ether sulfate with 1-10
moles of ethoxylation. Additional emulsifier, preferably chosen
from anionic, cationic, nonionic, amphoteric and zwitterionic
surfactants, and mixtures thereof may be present. The amount of
emulsifier will typically be in the ratio of about 1:1 to about 1:7
parts by weight of the silicone, although larger amounts of
emulsifier can be used, e.g., in about 5:1 parts by weight of the
silicone or more. Use of these emulsifiers may be necessary to
maintain clarity of the microemulsion if the microemulsion is
diluted prior to addition to the personal care cleaning
composition. The same detersive surfactants in the cleaning
composition can also serve as the emulsifier in the preformed
microemulsion.
[0411] The silicone microemulsion may be further stabilized using
an emulsion polymerization process. A suitable emulsion
polymerization process has been described by, for example, U.S.
Pat. No. 6,316,541. A typical emulsifier is TEA dodecyl benzene
sulfonate which is formed in the process when triethanolamine (TEA)
is used to neutralize the dodecyl benzene sulfonic acid used as the
emulsion polymerization catalyst. It has been found that selection
of the anionic counterion, typically an amine, and/or selection of
the alkyl or alkenyl group in the sulfonic acid catalyst can
further improve the stability of the microemulsion in the shampoo
composition. Examples of preferred amines include, without
limitation, triisopropanol amine, diisopropanol amine, and
aminomethyl propanol.
[0412] 2) Pearlescent Agents
[0413] Pearlescent agents, such as those described herein above can
be suitably included in a personal care cleaning composition such
as a shampoo. They are defined, for the purpose of the present
disclosure, as materials which impart to a composition the
appearance of mother of pearl. Pearlescence is produced by specular
reflection of light. Light reflected from pearl platelets or
spheres as they lie essentially parallel to each other at different
levels in the composition creates a sense of depth and luster. Some
light is reflected off the pearlescent agent, and the remainder
will pass through the agent, which may pass directly through or be
refracted. Reflected, refracted light produces a different colour,
brightness and luster.
[0414] 3) Cationic Cellulose or Guar Polymer
[0415] Cleaning compositions of the present invention can further
contain a cationic polymer to aid the deposition of the silicone
oil component and enhance conditioning performance. Non limiting
examples of such polymers are described in the CTFA Cosmetic
Ingredient Dictionary, 3rd ed, Estrin, Crosley, & Haynes eds.,
(The Cosmetic, Toiletry, and Fragrance Association, Inc.,
Washington, D. C. (1982)). Suitable cationic polymers include
polysaccharide polymers, such as cationic cellulose derivatives,
for example, salts of hydroxyethyl cellulose reacted with trimethyl
ammonium substituted epoxide, referred to in the industry (CTFA) as
Polyquaternium 10, as well as Polymer LR, JR, JP and KG series
polymers (Amerchol Corp.). Other suitable cationic cellulose
polymers include the polymeric quaternary ammonium salts of
hydroxyethyl cellulose reacted with lauryl dimethyl
ammonium-substituted epoxide referred to in the industry (CTFA) as
Polyquaternium 24, available under the tradename Polymer LM-200
(Amerchol Corp). Suitable cationic guar polymers include cationic
guar gum derivatives, such as guar hydroxypropyltrimonium chloride,
and those described in, for example, U.S. Pat. No. 5,756,720.
Certain of these polymers are commercially available, including, for
example, Jaguar.RTM. Excel (Rhodia Corp.).
[0416] When used, the cationic polymers herein are either soluble
in the cleaning composition or are soluble in a complex coacervate
phase in the cleaning composition formed by the cationic polymer
and the anionic, amphoteric and/or zwitterionic detersive
surfactant component described hereinbefore. Complex coacervates of
the cationic polymer can also be formed with other charged
materials in the composition.
[0417] Concentrations of the cationic polymer in the composition
can range from about 0.01 wt. % to about 3 wt. % (e.g., about 0.05
wt. % to about 2 wt. %, about 0.1 wt. % to about 1 wt. %). Suitable
cationic polymers have cationic charge densities of at least about
0.4 meq/gm (e.g., at least about 0.6 meq/gm). Suitable cationic
polymers have cationic charge densities of no more than about 5
meq/gm, at the pH of intended use of the cleaning composition. An
exemplary personal care cleaning composition, such as, for
example, a shampoo, generally has a pH of about 3 to
about 9 (e.g., about 4 to about 8). As used herein, "cationic
charge density" of a polymer refers to the ratio of the number of
positive charges on the polymer to the molecular weight of the
polymer. The average molecular weight of suitable cationic guars
and cellulose polymers is typically at least about 800,000 daltons.
For example, a suitable cationic polymer, which can be included in a
cleaning composition of the present invention, is one of
sufficiently high cationic charge density to effectively enhance
deposition efficiency of the solid particle components in the
cleaning composition. Cationic polymers comprising cationic
cellulose polymers and cationic guar derivatives with cationic
charge densities of at least about 0.5 meq/gm and preferably less
than about 7 meq/gm are suitable.
[0418] Preferably, the deposition polymers give good clarity and
adequate flocculation on dilution with water during use, especially
when suitable electrolytes including, e.g., sodium chloride, sodium
benzoate, magnesium chloride, and magnesium sulfate, are added.
[0419] 4) Perfumes/Fragrances
[0420] Just as perfumes or perfume accords are typically included
in a household cleaning composition of the invention, perfumes or
perfume accords as described herein (e.g., supra) are often
included in a personal care cleaning composition, such as a shampoo
or a bodywash composition. The perfume ingredients, which
optionally can be formulated into a perfume accord prior to
blending or formulating the cleaning composition, can be obtained
from a wide variety of natural or synthetic sources. They include,
without limitation, aldehydes, ketones, esters, and the like. They
also include, for example, natural extracts and essences, which can
include complex mixtures of ingredients, such as orange oils, lemon
oils, rose extracts, lavender, musk, patchouli, balsamic essence,
sandalwood oil, pine oil, cedar, and the like. The amount of
perfume to be included in a cleaning composition of the invention
can vary, for example, from about 0.0001 wt. % to about 2 wt. %
(e.g., about 0.01 wt. % to about 1.0 wt. %, about 0.1 wt. % to
about 0.5 wt. %), based on the total weight of the cleaning
composition.
[0421] 5) Sensory Indicators--Silica Particles
[0422] Optionally, in a personal care cleaning composition of the
invention, various sensory indicators can be included. These agents
provide a change in sensory feel after an appropriate usage time,
allowing for easy and precise recognition for the appropriate time
of washing. For example, these agents are particularly suitable for
cleaning compositions such as hand cleansers. An exemplary type of
sensory indicators are silica particles. The properties of the
silica particle may be adjusted to provide the desired end point in
time.
[0423] Various silica particles are commercially available,
including, for example, those made and distributed by INEOS Silicas
Ltd. These particles have also been described in, for example, U.S.
Pat. No. 6,165,510 and U.S. Patent Publication 2003/0044442.
[0424] Silica particles can be present in an amount that can
initially be felt by hands when starting washing with the cleaning
composition. In one embodiment, the amount of silica particles is
about 0.05 wt. % to about 8 wt. %. In some embodiments, suitable
silica particles can have an initial average diameter of about 50
.mu.m to about 600 .mu.m (e.g., about 180 to about 420 .mu.m). In
some embodiments, silica particles can further comprise color or
pigment on the surface. In other embodiments, suitable silica
particles diminish in size and cannot be felt by users during
washing before about 5 min, about 2 min, about 30 sec, about 25
sec, about 20 sec, about 15 sec, about 10 sec, about 5 sec, about 5
to about 30 sec, or about 10 to about 30 sec.
[0425] Silica particles can also, in addition to providing sensory
indications, improve the dispensing of the cleaning composition.
For example, by including these particles, the cleaning
composition, such as a liquid hand cleaner or a shampoo, may
achieve a desirable thickness such that it is easier to be
dispensed with a pump.
[0426] It is often desirable to regulate the viscosity of a
composition comprising silica particles, however. Addition of
glycerin has been found to be an effective approach to achieve this
regulation. Glycerin is typically added to a composition comprising
silica particles in an amount of at least about 1 wt. % (e.g.,
about 2 wt. %, about 2.5 wt. %, about 3 wt. %, about 4 wt. %, about
5 wt. %, or about 6 wt. %), based on the total weight of the
cleaning composition. In some embodiments, glycerin is added in an
amount of less than about 10 wt. % (e.g., less than about 8 wt. %,
less than about 6 wt. %, less than about 4 wt. %, less than about 2
wt. %). The addition of glycerin may, in certain embodiments, help
prevent clogging of pumps.
[0427] 6) Suspension Agents--Viscosity Control
[0428] Cleaning compositions of the invention can also include a
suspending agent that allows the particulate matters therein, e.g.,
the silica particles, to remain suspended. Suspending agents are
materials that are capable of increasing the ability of the
composition to suspend material. Examples of suspending agents
include, e.g., synthetic structuring agents, polymeric gums,
polysaccharides, pectin, alginate, arabinogalactan, carrageen,
gellan gum, xanthan gum, guar gum, rhamsan gum, furcellaran gum,
and other natural gums. An exemplary synthetic structuring agent is
a polyacrylate. An exemplary acrylate aqueous solution used to form
a stable suspension of the solid particles is manufactured by
Lubrizol as CARBOPOL.TM. resins, also known as CARBOMER.TM., which
are hydrophilic high molecular weight, crosslinked acrylic acid
polymers. Other polymers suitable as suspension agents include,
e.g., CARBOPOL.TM. Aqua 30, CARBOPOL.TM. 940 and CARBOPOL.TM.
934.
[0429] The suspending agents can be used alone or in combination.
The amount of suspending agent can be any amount that provides for
a desired level of suspending ability. In certain embodiments, the
suspending agent is present in an amount of about 0.01 wt. % to
about 15 wt. % (e.g., about 0.1 wt. % to about 12 wt. %, about 1
wt. % to about 10 wt. %, about 2 wt. % to about 5 wt. %) by weight
of the cleaning composition.
[0430] 7) Other Suitable Adjuncts
[0431] A number of other adjuncts can be suitable for inclusion in
a personal care cleaning composition. Those include, for example,
thickeners, such as hydroxyl ethyl cellulose derivatives (e.g.,
Methocel.TM. products, Dow Chemicals; Natrosol.RTM. products,
Aqualon Ashland; Carbopol.TM. products, Lubrizol).
[0432] Stability enhancers can also be included as suitable
adjuncts. They are typically nonionic surfactants, including those
having a hydrophilic-lipophilic balance range of about 9-18. These
surfactants can be straight chained or branched chained, and they
typically contain various levels of ethoxylation/propoxylation.
The nonionic surfactants useful in the present invention are
preferably formed from a fatty alcohol, a fatty acid, or a
glyceride with a C.sub.3 to C.sub.24 carbon chain, preferably a
C.sub.12 to C.sub.18 carbon chain derivatized to yield a
Hydrophilic-Lipophilic Balance (HLB) of at least 9. HLB is
understood to mean the balance between the size and strength of the
hydrophilic group and the size and strength of the lipophilic group
of the surfactant. Suitable adjuncts for personal care cleaning
compositions can also include various vitamins, including, for
example, vitamin B complex, including thiamine, nicotinic acid,
biotin, pantothenic acid, choline, riboflavin, vitamin B6, vitamin
B12, pyridoxine, inositol, carnitine, vitamins A, C, D, E, K, and
their derivatives.
[0433] Further suitable adjuncts may include one or more materials
such as antimicrobial agents, antifungal agents, antidandruff
agents, dyes, foam boosters, pediculocides, pH adjusting agents,
preservatives, proteins, skin active agents, sunscreens, UV
absorbers, minerals, herbal/fruit/food extracts, sphingolipid
derivatives or synthetic derivatives, and clay.
Examples of Preferred Embodiments
[0434] Surfactant compositions of the invention can be formulated
or used without substantial post-production processing. This is
especially the case if the surfactant composition is applied in
industrial settings, for example, in the oil industry for oil recovery
applications. Because typically only a minimum purity specification is
required in such settings, it is potentially possible to use
whole-cell broths. Surfactants comprising microbially-produced
branched fatty alcohols and derivatives prepared in accordance with
the methods herein are relatively more selective, as compared with
conventional chemical surfactants. As such, they are required in
small quantities, and are effective under a broad range of oil and
reservoir conditions. They are also more environmentally friendly
in protecting coastal areas from additional damage inflicted by
synthetic chemicals, because they are readily biodegradable and
have lower toxicity than synthetic surfactants. Potentially, an
increase of about 30% or more in total oil recovery from underground
sandstone can be achieved using surfactants comprising microbially
produced fatty alcohols and derivatives such as those described
herein.
[0435] Microbially-produced fatty alcohols, including branched
fatty alcohols and derivatives thereof such as those described
herein, are also more anaerobic, halotolerant and thermo-tolerant
as compared to their petroleum-derived counterparts, making
surfactants comprising these fatty alcohols particularly useful for
in situ enhanced oil recovery. These surfactants are potent
reducers of oil viscosity, making it vastly easier to pump heavy
oils from underground sandstone as well as through commercial
pipelines for long distances. Microbially-produced fatty alcohols
and derivatives and surfactants comprising these materials can also
be used to desludge crude oil storage tanks. The branched fatty
alcohols and derivatives described herein also have improved low
temperature properties, and are thus particularly suited for
application in low temperature environments such as in the deep
sea.
[0436] Potentially, suitable host cells can be engineered such that
the culture broth not only provides suitable surfactants but also
provides biodegradation of hydrocarbons, resulting in microbial
remediation of hydrocarbon- and crude oil-contaminated soils.
Furthermore, the branched fatty alcohols, derivatives thereof, as
well as the surfactants comprising these materials can be used to
manage and emulsify hydrocarbon-water mixtures. This capacity to
effectively emulsify oil/water mixtures can be utilized in oil
spill management.
[0437] With more extensive post-production processing, surfactants
comprising the branched fatty alcohols and derivatives as described
herein can be particularly suitable as food additives or in the
health care and cosmetic industries. The branching of these
molecules confers added oxidative stability and significantly
decreased volatility and vapor pressure. They are also useful as
ingredients in various household and personal and/or pet care
cleaning compositions, with particular advantages at lower washing
temperatures.
[0438] In certain embodiments, the invention features a surfactant
composition comprising about 0.001 wt. % to about 100 wt. % (e.g.,
about 0.01 wt. % to about 80 wt. %, about 0.1 wt. % to about 70 wt.
%, about 1 wt. % to about 60 wt. %, about 5 wt. % to about 50 wt.
%) of one or more microbially produced branched fatty alcohols
and/or derivatives thereof. An exemplary surfactant composition of
the invention comprises about 0.1 wt. % to about 50 wt. % of
microbially produced branched fatty alcohols and/or derivatives
thereof. The surfactant composition of the present invention can
further comprise one or more other co-surfactants, derived from
similar origins (e.g., microbially produced) or different origins
(e.g., chemically synthesized, derived from petroleum sources).
[0439] In another aspect, the invention pertains to a cleaning
composition comprising one or more surfactants comprising branched
fatty alcohols and derivatives produced in accordance with the
methods described herein. The inventive cleaning composition can be
formulated as a solid cleaning composition or as a liquid cleaning
composition.
[0440] In certain embodiments, the invention provides a cleaning
composition comprising about 0.1 wt. % to about 50 wt. % (e.g.,
about 0.1 wt. % to about 50 wt. %, about 0.5 wt. % to about 45 wt.
%, about 1 wt. % to about 40 wt. %, about 5 wt. % to about 35 wt.
%, about 10 wt. % to about 30 wt. %) of one or more microbially
produced branched fatty alcohols and/or derivatives thereof. An
exemplary cleaning composition comprises about 1 wt. % to about 40
wt. % of microbially produced branched fatty alcohols and/or
derivatives thereof. In another embodiment, the composition
comprises about 2 wt. % to about 20 wt. % of microbially produced
branched fatty alcohols and/or derivatives thereof.
[0441] In one embodiment, the invention features a liquid cleaning
composition comprising (a) about 0.1 wt. % to about 50 wt. % (e.g.,
about 0.1 wt. % to about 50 wt. %, about 0.5 wt. % to about 45 wt.
%, about 1 wt. % to about 40 wt. %, about 5 wt. % to about 35 wt.
%, about 10 wt. % to about 30 wt. %) of one or more microbially
produced branched fatty alcohols and/or derivatives thereof; (b)
about 1 wt. % to about 30 wt. % (e.g., about 2 wt. % to about 25
wt. %, about 5 wt. % to about 20 wt. %) of one or more
co-surfactants; (c) about 0 wt. % to about 10 wt. % (e.g., about 0
wt. % to about 10 wt. %, about 0 wt. % to about 8 wt. %, about 0
wt. % to about 5 wt. %, about 0 wt. % to about 2 wt. %) of one or
more detergency builders; (d) about 0 wt. % to about 2.0 wt. %
(e.g., about 0.0001 wt. % to about 1.5 wt. %, about 0.001 wt. % to
about 1 wt. %, about 0.01 wt. % to about 0.8 wt. %) of one or more
enzymes; (e) about 0 wt. % to about 15 wt. % (e.g., about 0 wt. %
to about 12 wt. %, about 0 wt. % to about 10 wt. %, about 0 wt. %
to about 8 wt. %, about 0 wt. % to about 5 wt. %) of one or more
chelating agents; (f) about 0 wt. % to about 20 wt. % (e.g., about 0
wt. % to about 15 wt. %, about 0 wt. % to about 10 wt. %, about 0
wt. % to about 5 wt. %) of one or more hydrotropes; (g) about 0 wt. %
to about 15 wt. % (e.g., about 0 wt. % to about 10 wt. %, about 0
wt. % to about 8 wt. %, about 0 wt. % to about 5 wt. %) of one or
more rheology modifiers; (h) about 0 wt. % to about 1.0 wt. % (e.g.,
about 0 wt. % to about 0.8 wt. %, about 0 wt. % to about 0.5 wt. %,
about 0 wt. % to about 0.2 wt. %) of one or more organic
sequestering agents; and (i) about 0.1 wt. % to about 98 wt. %
(e.g., about 0.1 wt. % to about 95 wt. %, about 1 wt. % to about 90
wt. %, about 10 wt. % to about 85 wt. %) of a solvent system
comprising water or other suitable solvents.
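By way of illustration only, the weight-percent ranges recited above for components (a) through (i) can be recorded as data and a candidate formulation checked against them. The following Python sketch treats each "about" bound as an exact limit and assumes the listed components account for the whole composition; the dictionary keys, the function name check_formulation, and the example recipe are hypothetical and do not appear in the specification.

```python
# Broadest ranges (wt. %) recited above for components (a)-(i) of the
# liquid cleaning composition. The keys are illustrative labels only.
LIQUID_RANGES = {
    "branched_fatty_alcohols": (0.1, 50.0),      # (a)
    "co_surfactants": (1.0, 30.0),               # (b)
    "detergency_builders": (0.0, 10.0),          # (c)
    "enzymes": (0.0, 2.0),                       # (d)
    "chelating_agents": (0.0, 15.0),             # (e)
    "hydrotropes": (0.0, 20.0),                  # (f)
    "rheology_modifiers": (0.0, 15.0),           # (g)
    "organic_sequestering_agents": (0.0, 1.0),   # (h)
    "solvent_system": (0.1, 98.0),               # (i)
}


def check_formulation(recipe, ranges=LIQUID_RANGES, tol=1e-6):
    """Return a list of problems; an empty list means the recipe fits."""
    problems = []
    for name, (low, high) in ranges.items():
        amount = recipe.get(name, 0.0)
        if not (low - tol <= amount <= high + tol):
            problems.append(f"{name}: {amount} wt. % outside {low}-{high}")
    # Assumption: the recipe lists every component, so it should sum to 100.
    total = sum(recipe.values())
    if abs(total - 100.0) > 1e-3:
        problems.append(f"total is {total} wt. %, expected 100")
    return problems


# Hypothetical recipe, for demonstration only.
example = {
    "branched_fatty_alcohols": 12.0,
    "co_surfactants": 8.0,
    "detergency_builders": 2.0,
    "enzymes": 0.5,
    "chelating_agents": 1.0,
    "hydrotropes": 3.0,
    "rheology_modifiers": 1.0,
    "organic_sequestering_agents": 0.1,
    "solvent_system": 72.4,
}
print(check_formulation(example))
```

The narrower "e.g." sub-ranges could be checked the same way by passing a different ranges dictionary.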
[0442] In another embodiment, the invention features a solid
detergent composition comprising (a) about 0.1 wt. % to about 50
wt. % of one or more microbially produced fatty alcohols and/or
derivatives thereof, (b) about 1 wt. % to about 30 wt. % of one or
more co-surfactants, (c) about 1 wt. % to about 60 wt. % of one or
more detergency builders, (d) about 0 wt. % to about 2.0 wt. % of
one or more enzymes, (e) about 0 wt. % to about 20 wt. % of one or
more hydrotropes, (f) about 10 wt. % to about 35 wt. % of one or
more filler salts, (g) about 0 wt. % to about 15 wt. % of one or
more chelating agents, and (h) about 0.01 wt. % to about 1 wt. % of
one or more organic sequestering agents.
[0443] When the cleaning composition is a solid (e.g., a
particulate, a granule, a tablet), the composition herein can be in
any solid form, such as a granular composition or for example a
tablet, flake, extrudate, agglomerate, or granule-containing
composition. Alternatively, the detergent composition can be a
powder. The composition herein can be made by methods such as
dry-mixing, agglomerating, compaction, spray drying of various
ingredients comprised in the composition herein, or a combination
thereof. The composition herein preferably has a bulk density of
from about 300 g/L, 350 g/L, or 450 g/L to 1500 g/L, 1000 g/L, or
850 g/L.
[0444] In certain embodiments, a liquid cleaning composition of the
present invention is formulated such that during use, the wash
water will have a pH of between about 6.5 and about 11.0 (e.g.,
between about 6.5 and about 11, between about 7.0 and about 8.5).
[0445] In one embodiment, the invention provides a dishwashing
detergent composition. The dishwashing detergent composition can be
formulated for use in hand washing of dishes or for use in
automatic dishwashers. A skilled person will appreciate that a
detergent composition formulated for use in automatic dishwashers
should contain suitable antifoaming agents in order to prevent
excessive foaming of the detergent composition within the
dishwasher. However, foaming may be desirable when hand washing
dishes. Antifoaming agents are known. For example, various silicone
antifoam compounds can be used, including a variety of relatively
high molecular weight polymers containing siloxane units and
hydrocarbyl groups of various types. Other suitable antifoam agents
include monocarboxylic fatty acids and soluble salts thereof, high
molecular weight fatty esters (e.g., fatty acid triglycerides),
fatty acid esters of monovalent alcohols, aliphatic
C.sub.18-C.sub.40 ketones (e.g., stearone), N-alkylated amino
triazines (e.g., tri- to hexa-alkylmelamine or di- to
tetra-alkyldiamine chlortriazines), propylene oxide, bis-stearic
acid amide, and monostearyl di-alkali metal (e.g., sodium,
potassium, lithium) phosphate and phosphate esters, amine oxides,
alkanolamides, betaines, and mixtures thereof.
[0446] In addition, the dishwashing detergents can optionally
comprise one or more enzymes, gelling agents, abrasive materials,
fragrances, solubility enhancers, and antideposition agents, e.g.,
cellulose derivatives. Abrasive materials can be, e.g., pumice,
sand, feldspar, corn meal, or mixtures thereof. An antideposition
agent can be present in an exemplary cleaning composition in an
amount of about
0.1 wt. % to about 5 wt. % (e.g., about 0.1 wt. % to about 2 wt.
%).
[0447] In certain embodiments, the invention provides a laundry
detergent composition comprising, in addition to the microbially
produced branched fatty alcohols and/or derivatives thereof as
described herein, the co-surfactants and the builders, optionally
one or more enzymes, gelling agents, fragrances, antideposition
agents, brighteners, anticaking agents, pearlescent agents, fabric
softeners, bleach systems, dyes or colorants, preservatives, fabric
care benefit agents, hueing dyes, soil release polymers,
photoactivators, hydrolysable surfactants, anti-shrinkage agents,
anti-wrinkle agents, germicides, fungicides, color speckles,
colored beads, fluorinated compounds, etc.
[0448] In certain embodiments, the invention further provides a
solid surface cleaning composition. In addition to the microbially
produced branched fatty alcohols and/or derivatives thereof as
described herein, the co-surfactants and the builders, the surface
cleaning composition can further comprise one or more of the
optional ingredients including, without limitation, one or more
enzymes, gelling agents, fragrances, antideposition agents,
pearlescent agents, soil release polymers, germicides, abrasive
materials, fungicides and mixtures thereof.
[0449] In certain embodiments, the invention also provides a
personal and/or pet care cleaning composition comprising one or
more microbially produced branched fatty alcohols and/or
derivatives thereof, builders, and co-surfactants. Optionally,
additional components can be included in the personal and/or pet
care cleaning composition, including, for example, conditioners,
silicones, silica particles, cationic cellulose or guar polymers,
silicone microemulsion stabilizers, enzymes, fatty amphiphiles,
germicides, fungicides, anti-dandruff agents, pearlescent agents,
foam boosters, pediculocides, pH adjusting agents, UV absorbers,
sunscreens, skin active agents, vitamins, minerals,
herbal/fruit/food extracts, sphingolipids, sensory indicators,
suspension agents, and mixtures thereof.
[0450] The invention further provides a method for cleaning a
substrate, such as fibers, fabrics, hard surfaces, skin, hair,
etc., by contacting the substrate with the cleaning composition of
the invention and water. Agitation is preferably provided to
enhance cleaning. Suitable means for providing agitation include
rubbing by hand or with a brush, sponge, cloth, mop, or other
cleaning device, automatic laundry machines, automatic dishwashers,
and the like.
EXAMPLES
[0451] The invention is further illustrated by the following
examples. The examples are provided for illustrative purposes only.
They are not to be construed as limiting the scope or content of
the invention in any way.
[0452] Although particular methods are described, one of ordinary
skill in the art will understand that other, similar methods also
can be used. In general, standard laboratory practices were used,
unless otherwise stipulated. For example, standard laboratory
practices were used for: cloning; manipulation and sequencing of
nucleic acids; purification and analysis of proteins; and other
molecular biological and biochemical techniques. Such techniques
are explained in detail in standard laboratory manuals, such as
Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd ed.,
vol. 1-3, Cold Spring Harbor, New York (2000), and Ausubel et al.,
Current Protocols in Molecular Biology, Greene Publ. Assoc. &
Wiley-Interscience (1989).
Example 1
Constructing E. coli MG1655 .DELTA.fadE .DELTA.tonA AAR:Kan
[0453] This example describes the construction of a genetically
engineered microorganism in which the expression of a fatty acid
degradation enzyme is attenuated.
[0454] The fadE gene of E. coli MG1655 was deleted using the lambda
red system described by Datsenko et al., Proc. Natl. Acad. Sci. USA
97: 6640-6645 (2000), with the following modifications:
[0455] The following two primers were used to create the deletion
of fadE:
TABLE-US-00012
Del-fadE-F (SEQ ID NO: 158)
5'-AAAAACAGCAACAATGTGAGCTTTGTTGTAATTATATTGTAAAC
ATATTGATTCCGGGGATCCGTCGACC; and
Del-fadE-R (SEQ ID NO: 159)
5'-AAACGGAGCCTTTCGGCTCCGTTATTCATTTACGCGGCTTCAAC
TTTCCTGTAGGCTGGAGCTGCTTC
[0456] The Del-fadE-F and Del-fadE-R primers were used to amplify
the kanamycin resistance (Km.sup.R) cassette from plasmid pKD13 by
PCR. The PCR product was then used to transform electrocompetent E.
coli MG1655 cells containing pKD46 that had been previously induced
with arabinose for 3-4 hours. Following a 3-hour outgrowth in SOC
medium at 37.degree. C., the cells were plated on Luria agar plates
containing 50 .mu.g/ml of kanamycin. Resistant colonies were
identified and isolated after an overnight incubation at 37.degree.
C. Disruption of the fadE gene was confirmed in some of the
colonies by PCR amplification using primers fadE-L2 and fadE-R1,
which were designed to flank the E. coli fadE gene.
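The structure of the deletion primers can be illustrated with a short Python sketch: each primer carries a gene-specific homology arm at its 5' end and a 3' segment that anneals to the template plasmid. The two 3' segments below are read off the shared 3' ends of the deletion primers in this example and are assumed, not stated in the text, to be the plasmid priming regions; the function name split_primer is hypothetical.

```python
# 3' segments assumed to anneal to the pKD13 template. They are taken from
# the shared 3' ends of the deletion primers in this example (an assumption).
PRIMING_SITES = {
    "forward": "ATTCCGGGGATCCGTCGACC",
    "reverse": "TGTAGGCTGGAGCTGCTTC",
}

# Del-fadE-F (SEQ ID NO: 158) and Del-fadE-R (SEQ ID NO: 159) as printed above.
DEL_FADE_F = ("AAAAACAGCAACAATGTGAGCTTTGTTGTAATTATATTGTAAAC"
              "ATATTGATTCCGGGGATCCGTCGACC")
DEL_FADE_R = ("AAACGGAGCCTTTCGGCTCCGTTATTCATTTACGCGGCTTCAAC"
              "TTTCCTGTAGGCTGGAGCTGCTTC")


def split_primer(primer, priming_site):
    """Split a primer into (homology arm, priming site)."""
    if not primer.endswith(priming_site):
        raise ValueError("primer does not end with the expected priming site")
    return primer[:-len(priming_site)], priming_site


for name, seq, site in [
    ("Del-fadE-F", DEL_FADE_F, PRIMING_SITES["forward"]),
    ("Del-fadE-R", DEL_FADE_R, PRIMING_SITES["reverse"]),
]:
    arm, tail = split_primer(seq, site)
    print(f"{name}: homology arm {len(arm)} nt, priming site {len(tail)} nt")
```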
[0457] The fadE deletion confirmation primers were:
TABLE-US-00013
fadE-L2 (SEQ ID NO: 160) 5'-CGGGCAGGTGCTATGACCAGGAC; and
fadE-R1 (SEQ ID NO: 161) 5'-CGCGGCGTTGACCGGCAGCCTGG
[0458] After the fadE deletion was confirmed, a single colony was
used to remove the Km.sup.R marker using the pCP20 plasmid as
described by Datsenko et al., supra. The resulting MG1655 E. coli
strain with the fadE gene deleted and the Km.sup.R marker removed
was named E. coli MG1655 .DELTA.fadE, or E. coli MG1655 D1.
[0459] Furthermore, the expression of an outer membrane protein
receptor for ferrichrome, colicin M, or phages T1, T5, and phi80
is attenuated.
[0460] The tonA gene of E. coli MG1655, which encodes a ferrichrome
outer membrane transporter (GenBank Accession No. NP_414692), was
deleted from strain E. coli MG1655 D1 of Example 1, using the
lambda red system according to Datsenko et al., supra, but with the
following modifications:
[0461] The primers used to create the deletion of tonA were:
TABLE-US-00014
Del-tonA-F (SEQ ID NO: 162)
5'-ATCATTCTCGTTTACGTTATCATTCACTTTACATCAGAGATATAC
CAATGATTCCGGGGATCCGTCGACC; and
Del-tonA-R (SEQ ID NO: 163)
5'-GCACGGAAATCCGTGCCCCAAAAGAGAAATTAGAAACGGAAGGTTGCGG
TTGTAGGCTGGAGCTGCTTC
[0462] The Del-tonA-F and Del-tonA-R primers were used to amplify
the kanamycin resistance (Km.sup.R) cassette from plasmid pKD13 by
PCR. The PCR product obtained in this way was used to transform
electrocompetent E. coli MG1655 D1 cells of Example 1 containing
pKD46, which cells had been previously induced with arabinose for
3-4 h. Following a 3-hour outgrowth in SOC medium at 37.degree. C.,
cells were plated on Luria agar plates containing 50 .mu.g/ml of
kanamycin. Resistant colonies were identified and isolated after an
overnight incubation at 37.degree. C. Disruption of the tonA gene
was confirmed in some of the colonies by PCR amplification using
primers flanking the E. coli tonA gene, tonA-verF and
tonA-verR:
TABLE-US-00015
tonA-verF (SEQ ID NO: 164) 5'-CAACAGCAACCTGCTCAGCAA; and
tonA-verR (SEQ ID NO: 165) 5'-AAGCTGGAGCAGCAAAGCGTT
[0463] After the tonA deletion was confirmed, a single colony was
used to remove the Km.sup.R marker using the pCP20 plasmid as
described by Datsenko et al., supra. The resulting MG1655 E. coli
strain having fadE and tonA gene deletions was named E. coli MG1655
.DELTA.fadE_.DELTA.tonA, or E. coli MG1655 DV2.
[0464] The aar gene, which encodes the Synechococcus elongatus
PCC_Synpcc7942_1594 enzyme, was integrated into the chromosome with
the kanamycin marker directly after the aar sequence.
Example 2
Expression of BKD Homologs and FabH in E. coli
[0465] A branched chain alpha-keto acid dehydrogenase complex from
Pseudomonas putida and a FabH from Bacillus subtilis were used to
generate two E. coli plasmids for expression. First, the
Pseudomonas putida BKD operon was PCR-amplified from Pseudomonas
putida F1 genomic DNA. The following primers were used:
TABLE-US-00016
P.p.BKDFUsion_F: (SEQ ID NO: 166)
5'-ATAAACCATGGATCCATGAACGAGTACGCCCC-3'
P.pBKDFusion_R: (SEQ ID NO: 167)
5'-CCAAGCTTCGAATTCTCAGATATGCAAGGCGTG-3'
[0466] Using these primers, Pseudomonas putida Pput_1450 (GenBank
Accession No. A5W0E08), Pput_1451 (GenBank Accession No. A5W0E9),
Pput_1452 (GenBank Accession No. A5W0F0), and Pput_1453 (GenBank
Accession No. A5W0F1) were amplified. The PCR product was then cloned into vector
pGL10.173B (See, FIG. 8), a plasmid with a pBR322 backbone and a
pTrc promoter to drive gene expression. The PCR product was cloned
into pGL between BamHI and EcoRI restriction sites. Correct
insertion of the PCR product was verified by diagnostic restriction
digests. The resulting plasmid was named "pKZ4." (See, FIG. 7)
[0467] To clone E. coli PfabH promoter-B. subtilis fabH1 into a
pACYC vector, the insert of pDG6 (pCFDuet-E. coli PfabH promoter-B.
subtilis fabH1) was subcloned into the pACYC vector using NcoI and
AvrII restriction sites. The resulting plasmid was named pDG6
(pCFDuet+E. coli PfabH+B. subtilis fabH1). (See, FIG. 6B and FIG.
6C)
[0468] E. coli strain MG1655 .DELTA.fadE_.DELTA.tonA, AAR:kan was
transformed with pKZ4 and pDG6 (pCFDuet+E. coli PfabH+B. subtilis
fabH1). The strain was evaluated for production of branched chain
materials using shake flask fermentation. Shake flask fermentation
was carried out using Che-9 media. Specifically, cultures of E.
coli MG1655 .DELTA.fadE_.DELTA.tonA AAR:kan without plasmids or
carrying individual plasmids were used as controls. Seed cultures
of E. coli MG1655 .DELTA.fadE_.DELTA.tonA AAR:kan, E. coli MG1655
.DELTA.fadE_.DELTA.tonA AAR:kan+pKZ4, E. coli MG1655
.DELTA.fadE_.DELTA.tonA AAR:kan+pDG6, and E. coli MG1655
.DELTA.fadE_.DELTA.tonA AAR:kan+pKZ4+pDG6 were grown in LB broth
supplemented with the appropriate antibiotics. After 4 hours of
growth, the cultures were diluted 1:25 in Che-9 2NBT
medium+appropriate selection marker and grown overnight. The
cultures were then diluted in 4NBT to a final OD.sub.600.about.0.2.
After 6 h of growth, IPTG was added to a final concentration of 1
mM. At 24 h post-induction, 1 mL of culture was extracted with 0.5
mL of methyl tert-butyl ether (MTBE) and subjected to GC/MS
analysis. The analysis revealed the production of iso-C.sub.14:0,
iso-C.sub.15:0, anteiso-C.sub.15:0, iso-C.sub.16:0, iso-C.sub.17:0,
and anteiso-C.sub.17:0 fatty alcohols. (See, FIG. 4A).
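The dilution to a final OD.sub.600 of about 0.2 and the addition of IPTG to 1 mM both follow the usual C1*V1 = C2*V2 relationship. The Python sketch below shows the arithmetic under stated assumptions: the overnight culture density, the 20 mL culture volume, and the 1 M IPTG stock are hypothetical values chosen for illustration and are not given in this example.

```python
def volume_of_stock(c_stock, c_final, v_final):
    """Solve C1*V1 = C2*V2 for V1 (V1 has the same units as v_final)."""
    if c_stock <= 0 or c_final > c_stock:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return c_final * v_final / c_stock


# Hypothetical numbers: an overnight culture at OD600 = 3.5, a 20 mL flask
# culture, and a 1 M (1000 mM) IPTG stock. None of these are from the text.
inoculum_ml = volume_of_stock(c_stock=3.5, c_final=0.2, v_final=20.0)
iptg_ul = volume_of_stock(c_stock=1000.0, c_final=1.0, v_final=20.0) * 1000.0

print(f"inoculum: {inoculum_ml:.2f} mL into a 20 mL culture")
print(f"IPTG stock: {iptg_ul:.0f} uL for 1 mM final")
```

The small volume added with the IPTG stock is ignored in this sketch.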
Example 3
Quantification and Identification of Branched Fatty Alcohols
Instrumentation:
[0469] The instrument is an Agilent 5975B MSD system equipped with
a 30 m.times.0.25 mm (0.10 .mu.m film) DB-5 column. The mass
spectrometer was equipped with an electron impact ionization
source. Two GC/MS programs were utilized.
[0470] GC/MS program #1: The temperature of the column is held
isothermal at 90.degree. C. for 5 min, then is raised to
300.degree. C. with a 25.degree. C./min ramp, and finally stays at
300.degree. C. for 1.6 min. The total run time is 15 min. With this
program, the inlet temperature is held at 300.degree. C. The
injector is set at splitless mode. 1 .mu.L of sample is injected
for every injection. The carrier gas (helium) is released at 1.0
mL/min. The source temperature of the mass spectrometer is held at
230.degree. C.
[0471] GC/MS program #2: The temperature of the column is held
isothermal at 100.degree. C. for 3 min, then is raised to
320.degree. C. with a 20.degree. C./min ramp, and finally stays isothermal
at 320.degree. C. for 5 min. The total run time is 19 min. The
injector is set at splitless mode. 1 .mu.L of sample is injected
for every injection. The carrier gas (helium) is released at 1.2
mL/min. The ionization source temperature is set at 230.degree.
C.
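The stated run times follow directly from the hold times and ramp rates of the two oven programs. As a minimal check, the Python sketch below adds the initial hold, the ramp duration, and the final hold for a single-ramp program; the function name run_time_min is hypothetical.

```python
def run_time_min(initial_c, initial_hold_min, ramp_c_per_min, final_c, final_hold_min):
    """Total run time of a single-ramp oven program, in minutes."""
    ramp_min = (final_c - initial_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# Values taken from GC/MS programs #1 and #2 as described above.
program_1 = run_time_min(90, 5, 25, 300, 1.6)
program_2 = run_time_min(100, 3, 20, 320, 5)

print(f"program #1: {program_1:.1f} min")
print(f"program #2: {program_2:.1f} min")
```

Both results agree with the totals of 15 min and 19 min given for the two programs.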
Samples:
[0472] Extracts containing fatty alcohols produced by the engineered
E. coli strains were analyzed by GC/MS. In FIG. 4A, chromatograms of the
extracts from the mutant strains are compared to those from control
strains, which produce only straight chain fatty alcohols. The
branched fatty alcohols produced are: iso-C.sub.14:0,
iso-C.sub.15:0, anteiso-C.sub.15:0, iso-C.sub.16:0, iso-C.sub.17:0,
and anteiso-C.sub.17:0.
[0473] FIG. 4A, top panel, shows GC/MS chromatograms of extract from
strain E. coli MG1655 .DELTA.fadE .DELTA.tonA AAR:kan+pKZ4+pDG6 (a)
and of control strain E. coli MG1655 .DELTA.fadE .DELTA.tonA
AAR:kan+pBR322+pCFDuet (b). Both chromatograms were obtained with
GC/MS program #2. Compared to the control strain, the mutant strain
produces branched-chain fatty alcohols, and the peaks representing
the branched fatty alcohols are boxed.
GC/MS Semi-Quantitative Analysis:
[0474] In addition to the qualitative analysis, semi-quantitative
analysis was performed to obtain the ratio between the branched
chain compounds and the straight chain isomers. Due to the lack of
commercially available standards for branched fatty alcohols,
accurate quantitation for the branched chain compounds was
challenging. However, by using a straight chain standard with the
same functional group, the yield of branched-chain fatty alcohols
relative to the yield of their straight-chain counterparts (isomers)
was estimated semi-quantitatively. A standard curve quantitation
method was applied, wherein standard mixtures with different concentrations
were analyzed by the same GC/MS program as the samples. After data
acquisition, the instrument response (total ion current) was
plotted against the concentrations of the standards. Linear
calibration curves were obtained. (See, FIG. 5). The concentration
of branched alcohols in a given sample was calculated according to
Equation 1: y=ax+b, wherein y is the instrument response for a
particular compound in a sample and x is the concentration of the
branched fatty alcohol product in the sample. Slope a and intercept b
for this calibration curve were determined by the linear regression of
all calibration levels of standard fatty alcohols (FIG. 4A lower
panel). Accordingly, the relative concentration of branched fatty
alcohols in the production mixture was calculated.
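The standard curve calculation can be sketched in Python as an ordinary least-squares fit of y=ax+b followed by inversion of the fitted line to recover a concentration from a measured response. The calibration levels and responses below are invented for illustration and are not data from this example; the function names are likewise hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def concentration_from_response(y, a, b):
    """Invert y = a*x + b to obtain the concentration x."""
    return (y - b) / a


# Hypothetical calibration levels (mg/L) of a straight chain standard and
# the corresponding total ion current responses (arbitrary units).
std_conc = [5.0, 10.0, 25.0, 50.0, 100.0]
std_resp = [1.1e5, 2.1e5, 5.1e5, 10.1e5, 20.1e5]

a, b = fit_line(std_conc, std_resp)

# Hypothetical response of a branched fatty alcohol peak in a sample.
sample_resp = 7.6e5
print(concentration_from_response(sample_resp, a, b))
```

Because a straight chain standard stands in for each branched compound, the value obtained this way is an estimate rather than an absolute quantitation.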
[0475] The table below lists the compounds used as standards to
quantify different branched fatty alcohol compounds.
TABLE-US-00017
Alcohol in sample          Standard used for quantitation
Iso-Alc C.sub.15:0         Alc C.sub.15:0
Anteiso-Alc C.sub.15:0     Alc C.sub.15:0
Alc C.sub.15:0             Alc C.sub.15:0
Ald C.sub.16:0             Alc C.sub.15:0
Alc C.sub.16:0             Alc C.sub.15:0
[0476] Once the titers were obtained for all the fatty alcohol
compounds, the ratio between the production of branched chain fatty
alcohols and the production of straight chain isomers was
calculated according to Equation 2:
$$\text{Percentage production} = \frac{\text{Total branched chain products in mg/L}}{\text{Total straight chain products in mg/L}} \times 100\%$$
[0477] Using this method, we were able to semi-quantitatively
estimate the amount of branched fatty alcohol yield relative to the
straight-chain fatty alcohol yield to be about 48%.
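The ratio described above reduces to summing the branched chain titers, summing the straight chain titers, and dividing. A minimal Python sketch follows; the titers are hypothetical values in mg/L chosen only to show the arithmetic and do not reproduce the result reported in this example.

```python
def percentage_production(branched_mg_per_l, straight_mg_per_l):
    """Total branched chain titer over total straight chain titer, times 100."""
    total_branched = sum(branched_mg_per_l.values())
    total_straight = sum(straight_mg_per_l.values())
    if total_straight <= 0:
        raise ValueError("straight chain total must be positive")
    return 100.0 * total_branched / total_straight


# Hypothetical titers in mg/L, for demonstration only.
branched = {"iso-C14:0": 4.0, "iso-C15:0": 6.0, "anteiso-C15:0": 5.0}
straight = {"C14:0": 20.0, "C16:0": 30.0}

print(f"{percentage_production(branched, straight):.1f}%")
```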
Example 4
Production of Branched Acyl-CoA Precursors
[0478] An E. coli strain, MG1655(DE3) .DELTA.fadE::FRT
.DELTA.fabH::cat/pDG6 was created, which was tested for its ability
to utilize branched-chain substrate molecules to create
branched-chain fatty precursors of branched fatty alcohols in
vivo.
[0479] The strain MG1655(DE3) .DELTA.fadE::FRT
.DELTA.fabH::cat/pDG6 was constructed as follows:
[0480] A region of the E. coli fabH gene described in Lai, et al.,
2003, J. Biol. Chem. 278(51): 59494, was replaced by an antibiotic
resistance gene. This deletion was performed in a strain that was
complemented for fabH by the plasmid pDG6 carrying the B. subtilis
fabH1 gene.
[0481] Initially, the pDG2 plasmid was constructed. The pCDFDuet-1
vector was purchased from Novagen/EMD Biosciences. The vector
carries the CloDF13 replicon, lacI gene and
streptomycin/spectinomycin resistance gene (aadA).
[0482] The C-terminal portion of the plsX gene, which contains an
internal promoter for the downstream fabH gene, was amplified from
E. coli MG1655 genomic DNA using primers
5'-TGAATTCCATGGCGCAACTCACTCTTCTTTTAGTCG-3' (SEQ ID NO:168) and
5'-CAGTACCTCGAGTCTTCGTATACATATGCGCTCAGTCAC-3' (SEQ ID NO:169).
These primers introduced NcoI and XhoI restriction sites near the
ends, as well as an internal NdeI site.
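The restriction sites carried by the two plsX primers can be confirmed by scanning each primer for the recognition sequences of the enzymes named above. The Python sketch below does this; NcoI, XhoI, and NdeI all recognize palindromic sequences, so scanning one strand is sufficient. The primer sequences are copied from this paragraph with whitespace removed, and the function name find_sites is hypothetical.

```python
# Recognition sequences of the enzymes named in this example.
SITES = {"NcoI": "CCATGG", "XhoI": "CTCGAG", "NdeI": "CATATG"}

# Primers SEQ ID NO:168 and SEQ ID NO:169 as printed above (whitespace removed).
PLSX_FWD = "TGAATTCCATGGCGCAACTCACTCTTCTTTTAGTCG"
PLSX_REV = "CAGTACCTCGAGTCTTCGTATACATATGCGCTCAGTCAC"


def find_sites(seq, sites=SITES):
    """Map each enzyme to the 0-based positions where its site starts."""
    seq = seq.upper().replace(" ", "")
    hits = {}
    for enzyme, motif in sites.items():
        positions = [i for i in range(len(seq) - len(motif) + 1)
                     if seq.startswith(motif, i)]
        if positions:
            hits[enzyme] = positions
    return hits


print("forward:", find_sites(PLSX_FWD))
print("reverse:", find_sites(PLSX_REV))
```

The same scan can be applied to the fabH primers used for the later plasmids to confirm their NdeI and XhoI sites.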
[0483] Both the plsX insert and pCDFDuet-1 vector were digested
with restriction enzymes NcoI and XhoI. The cut vector was treated
with Antarctic phosphatase. The insert was ligated into the vector
and transformed into chemically competent TOP10 cells. Clones were
screened by DNA sequencing. See, FIG. 6A.
[0484] Then a pDG6 plasmid was constructed using the pDG2 plasmid.
The fabH1 gene from Bacillus subtilis strain 168 was amplified from
plasmid pLS9-114 (see, FIG. 6K) using primers
5'-CCTTGGGGCATATGAAAGCTG-3' (SEQ ID NO:170) and
5'-TTTAGTCATCTCGAGTGCACCTCACCTTT-3' (SEQ ID NO:171). These primers
introduced or included NdeI and XhoI restriction sites.
[0485] Both the fabH1 insert and pDG2 vector were digested with
restriction enzymes NdeI and XhoI. The cut vector was treated with
Antarctic phosphatase. The insert was ligated into the vector and
transformed into chemically competent TOP10 cells. Clones were
screened by DNA sequencing. See, FIG. 6B and FIG. 6C.
[0486] Then, the cat chloramphenicol resistance gene was amplified
from template plasmid pKD3 using primers
5'-GCCACATTGCCGCGCCAAACGAAACCGTTTCAACCATGGCATATGAATATCCTCCTTAGTTCCTATTCCG-3'
(SEQ ID NO:172) and
5'-CGCCCCAGATTTCACGTATTGATCGGCTACGCTTAATGCATGTGTAGGCTGGAGCTGCTTC-3'
(SEQ ID NO:173), which added 50 bp nucleotide ends that are
homologous to the E. coli fabH gene. This linear PCR product was
used to inactivate the E. coli fabH gene.
[0487] Strain MG1655(DE3) ΔfadE::FRT was first transformed
with plasmid pKD46 encoding the lambda red recombinase genes.
MG1655(DE3) ΔfadE::FRT/pKD46 was then transformed with
plasmid pDG6. Finally, MG1655(DE3) ΔfadE::FRT/pKD46+pDG6 was
induced for expression of the recombinase genes by addition of 10
mM arabinose and transformed with the linear PCR product as
described in Datsenko et al. (supra). Colonies were selected on LB
plates containing 30 µg/mL chloramphenicol and screened using
colony PCR with primers 5'-TTGACACGTCTAACCCTGGC-3' (SEQ ID NO:174)
and 5'-CTGTCCAGGGAACACAAATGC-3' (SEQ ID NO:175).
[0488] A number of other constructs, including pDG7 and pDG8, were
also constructed following the approach described above. The
plasmids were prepared as follows.
[0489] The plasmid pDG7 was prepared from pDG2 with a B. subtilis
fabH2 insert. The fabH2 gene from Bacillus subtilis strain 168 was
amplified from plasmid pLS9-111 (see, FIG. 6J) using primers
5'-TTGTGTCGCCCTTTCGCTG-3' (SEQ ID NO:176) and
5'-CTTACGTACGTACTCGAGTGACGC-3' (SEQ ID NO:177). These primers
introduced or included NdeI and XhoI restriction sites.
[0490] Both the fabH2 insert and pDG2 vector were digested with
restriction enzymes NdeI and XhoI. The cut vector was treated with
Antarctic phosphatase. The insert was ligated into the vector and
transformed into chemically competent TOP10 cells. Clones were
screened by DNA sequencing. See, FIG. 6D and FIG. 6E.
[0491] The plasmid pDG8 was prepared from pDG2 with an S. coelicolor
fabH insert. The fabH gene from Streptomyces coelicolor was
amplified from plasmid pLS9-115 (see, FIG. 6L) using primers
5'-AAGTGGGGCATATGTCTAAGATC-3' (SEQ ID NO:178) and
5'-GTGATCCGGCTCGAGGTGGTTAC-3' (SEQ ID NO:179). These primers
introduced or included NdeI and XhoI restriction sites.
[0492] Both the fabH insert and pDG2 vector were digested with
restriction enzymes NdeI and XhoI. The cut vector was treated with
Antarctic phosphatase. The insert was ligated into the vector and
transformed into chemically competent TOP10 cells. Clones were
screened by DNA sequencing. See, FIG. 6F and FIG. 6G.
[0493] The plasmid pDG10 was prepared using the pCR-Blunt vector,
which was purchased from Invitrogen, with a C. acetobutylicum
ptb_buk operon insert, wherein the ptb part represents the gene
encoding C. acetobutylicum phosphotransbutyrylase (GenBank Accession
AAA75486.1, SEQ ID NO:156), and the buk part represents the gene
encoding C. acetobutylicum butyrate kinase (GenBank Accession
JN0795, SEQ ID NO:157). The ptb_buk operon was amplified from
Clostridium acetobutylicum genomic DNA (ATCC 824) using primers
5'-CTTAACTTCATGTGAAAAGTTTGT-3' (SEQ ID NO:180) and
5'-ACAATACCCATGTTTATAGGGCAA-3' (SEQ ID NO:181). The PCR product was
ligated into the pCR-Blunt vector following the manufacturer's
instructions. Colonies were verified by DNA sequencing. See, FIG.
6H and FIG. 6I.
[0494] E. coli strains were transformed with pDG10, with the OP-180
plasmid comprising the E. coli thioesterase gene tesA under the
control of the Ptrc promoter, and, independently, with one of the
pDG6, pDG7, and pDG8 plasmids described above.
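The constructs described in this example, and the plasmid sets carried by the resulting strains, can be summarized as a small lookup table. This is an illustrative Python sketch that only restates what the text gives; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    name: str
    backbone: str
    insert: str


# Backbone and insert for each construct, as described in the text.
PLASMIDS = [
    Plasmid("pDG2", "pCDFDuet-1", "E. coli plsX C-terminal portion"),
    Plasmid("pDG6", "pDG2", "B. subtilis fabH1"),
    Plasmid("pDG7", "pDG2", "B. subtilis fabH2"),
    Plasmid("pDG8", "pDG2", "S. coelicolor fabH"),
    Plasmid("pDG10", "pCR-Blunt", "C. acetobutylicum ptb_buk operon"),
]


def strain_plasmid_sets():
    """One set per fabH variant: pDG10 + OP-180 + one pDG2-derived plasmid."""
    fabh_plasmids = [p.name for p in PLASMIDS if p.backbone == "pDG2"]
    return [("pDG10", "OP-180", name) for name in fabh_plasmids]


for plasmid_set in strain_plasmid_sets():
    print(" + ".join(plasmid_set))
```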
[0495] These strains were fed the branched molecule isobutyrate,
which resulted in iso-C14:0 and iso-C16:0 branched acyl-CoA
precursors. Independently, they were fed the branched molecule
isovalerate, which resulted in iso-C13:0 and iso-C15:0
branched acyl-CoA precursors. See, FIG. 4B. These precursors can
then be incorporated into the branched fatty alcohol pathways as
described herein and depicted in FIG. 1A and FIG. 1B.
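The chain lengths reported above are consistent with simple carbon counting. The Python sketch below assumes, as in standard fatty acid biosynthesis rather than anything stated in this paragraph, that the fed acid supplies a branched primer (4 carbons for isobutyrate, 5 for isovalerate) that is extended by two carbons per elongation cycle. The function name and the cycle counts are illustrative choices made to reproduce the reported products.

```python
# Carbon count of the branched primer derived from each fed acid.
PRIMER_CARBONS = {"isobutyrate": 4, "isovalerate": 5}


def iso_chain_lengths(fed_acid, cycles):
    """Return iso-Cn:0 labels after the given numbers of 2-carbon elongations."""
    start = PRIMER_CARBONS[fed_acid]
    return ["iso-C%d:0" % (start + 2 * n) for n in cycles]


print(iso_chain_lengths("isobutyrate", [5, 6]))  # even-numbered iso chains
print(iso_chain_lengths("isovalerate", [4, 5]))  # odd-numbered iso chains
```

A 4-carbon primer can only give even-numbered iso chains and a 5-carbon primer only odd-numbered ones, which matches the iso-C14:0/iso-C16:0 and iso-C13:0/iso-C15:0 pairs listed.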
Other Embodiments
[0496] It is to be understood that while the invention has been
described in conjunction with the detailed description thereof, the
foregoing description is intended to illustrate and not limit the
scope of the invention, which is defined by the scope of the
appended claims. Other aspects, advantages, and modifications are
within the scope of the following claims.
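Some DNA entries in the sequence listing that follows are written on the strand complementary to the coding strand. SEQ ID NO:2, for instance, ends in "actcat", the reverse complement of a codon sequence beginning with ATG. The following is a minimal Python sketch, assuming the standard genetic code and using hypothetical helper names, that reverse-complements the last 33 nucleotides of SEQ ID NO:2 and translates them for comparison with the first residues of SEQ ID NO:1.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, in upper case."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from the first base, one-letter code."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Last 33 nucleotides of SEQ ID NO:2 as listed (spaces removed).
tail = "cagccctagtgcttgatgtcggtttgtactcat"
print(translate(reverse_complement(tail)))
```

The result, MSTNRHQALGL, corresponds to Met Ser Thr Asn Arg His Gln Ala Leu Gly Leu, the first eleven residues of SEQ ID NO:1.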
Sequence CWU 1
1
1811330PRTBacillus subtilis 1Met Ser Thr Asn Arg His Gln Ala Leu
Gly Leu Thr Asp Gln Glu Ala 1 5 10 15 Val Asp Met Tyr Arg Thr Met
Leu Leu Ala Arg Lys Ile Asp Glu Arg 20 25 30 Met Trp Leu Leu Asn
Arg Ser Gly Lys Ile Pro Phe Val Ile Ser Cys 35 40 45 Gln Gly Gln
Glu Ala Ala Gln Val Gly Ala Ala Phe Ala Leu Asp Arg 50 55 60 Glu
Met Asp Tyr Val Leu Pro Tyr Tyr Arg Asp Met Gly Val Val Leu 65 70
75 80 Ala Phe Gly Met Thr Ala Lys Asp Leu Met Met Ser Gly Phe Ala
Lys 85 90 95 Ala Ala Asp Pro Asn Ser Gly Gly Arg Gln Met Pro Gly
His Phe Gly 100 105 110 Gln Lys Lys Asn Arg Ile Val Thr Gly Ser Ser
Pro Val Thr Thr Gln 115 120 125 Val Pro His Ala Val Gly Ile Ala Leu
Ala Gly Arg Met Glu Lys Lys 130 135 140 Asp Ile Ala Ala Phe Val Thr
Phe Gly Glu Gly Ser Ser Asn Gln Gly 145 150 155 160 Asp Phe His Glu
Gly Ala Asn Phe Ala Ala Val His Lys Leu Pro Val 165 170 175 Ile Phe
Met Cys Glu Asn Asn Lys Tyr Ala Ile Ser Val Pro Tyr Asp 180 185 190
Lys Gln Val Ala Cys Glu Asn Ile Ser Asp Arg Ala Ile Gly Tyr Gly 195
200 205 Met Pro Gly Val Thr Val Asn Gly Asn Asp Pro Leu Glu Val Tyr
Gln 210 215 220 Ala Val Lys Glu Ala Arg Glu Arg Ala Arg Arg Gly Glu
Gly Pro Thr 225 230 235 240 Leu Ile Glu Thr Ile Ser Tyr Arg Leu Thr
Pro His Ser Ser Asp Asp 245 250 255 Asp Asp Ser Ser Tyr Arg Gly Arg
Glu Glu Val Glu Glu Ala Lys Lys 260 265 270 Ser Asp Pro Leu Leu Thr
Tyr Gln Ala Tyr Leu Lys Glu Thr Gly Leu 275 280 285 Leu Ser Asp Glu
Ile Glu Gln Thr Met Leu Asp Glu Ile Met Ala Ile 290 295 300 Val Asn
Glu Ala Thr Asp Glu Ala Glu Asn Ala Pro Tyr Ala Ala Pro 305 310 315
320 Glu Ser Ala Leu Asp Tyr Val Tyr Ala Lys 325 330 2993DNABacillus
subtilis 2ctacttcgca taaacataat caagcgctga ctcaggagct gcatatgggg
cgttctccgc 60ttcatccgtc gcttcattta cgattgccat aatttcatcc agcatggttt
gttctatctc 120atcggacagc aggcctgttt cctttaagta agcttgataa
gtaagcaggg gatcactttt 180tttcgcttcc tctacttctt cacggcctct
gtagctgctg tcatcgtcat cactggaatg 240tggtgtaagg cggtaagaaa
tcgtttcaat taatgtcggg ccttctcctc tgcgtgccct 300ttcgcgtgct
tctttaaccg cttgataaac ttccagcgga tcatttccat tcacagttac
360gccaggcatc ccatagccta tggcacggtc ggaaatgttc tcacatgcga
cttgcttatc 420gtaaggcact gagattgcgt atttgttgtt ttcacacatg
aaaataaccg gcagcttatg 480gacagcggca aagtttgccc cttcatggaa
atcgccttgg tttgaagacc cttccccgaa 540tgtaacaaag gctgcgatat
cctttttctc catacgtccc gcaagcgcaa taccgactgc 600gtgcggcact
tgcgttgtaa ccggagatga tcccgtcaca atgcggtttt tcttttgtcc
660gaaatgtccc ggcatctggc ggcctcctga gttcggatct gctgcttttg
caaacccgga 720catcattaag tcctttgctg tcatgccaaa cgcgagcacg
acacccatgt ctctgtagta 780cggcaataca taatccattt cacggtcaag
tgcgaaagcc gctcctacct gtgctgcttc 840ctgtccttga caagagatta
caaatggaat tttgccagaa cggtttaaca gccacattct 900ttcatcgatt
tttcttgcta acagcatggt tctatacata tcaacggctt cctgatcagt
960cagccctagt gcttgatgtc ggtttgtact cat 9933406PRTStreptomyces
avermitilis 3Met Thr Val Glu Ser Thr Ala Ala Arg Lys Pro Arg Arg
Ser Ala Gly 1 5 10 15 Thr Lys Ser Ala Ala Ala Lys Arg Thr Ser Pro
Gly Ala Lys Lys Ser 20 25 30 Pro Ser Thr Thr Gly Ala Glu His Glu
Leu Ile Gln Leu Leu Thr Pro 35 40 45 Asp Gly Arg Arg Val Lys Asn
Pro Glu Tyr Asp Ala Tyr Val Ala Asp 50 55 60 Ile Thr Pro Glu Glu
Leu Arg Gly Leu Tyr Arg Asp Met Val Leu Ser 65 70 75 80 Arg Arg Phe
Asp Ala Glu Ala Thr Ser Leu Gln Arg Gln Gly Glu Leu 85 90 95 Gly
Leu Trp Ala Ser Met Leu Gly Gln Glu Ala Ala Gln Ile Gly Ser 100 105
110 Gly Arg Ala Thr Arg Asp Asp Asp Tyr Val Phe Pro Thr Tyr Arg Glu
115 120 125 His Gly Val Ala Trp Cys Arg Gly Val Asp Pro Thr Asn Leu
Leu Gly 130 135 140 Met Phe Arg Gly Val Asn Asn Gly Gly Trp Asp Pro
Asn Ser Asn Asn 145 150 155 160 Phe His Leu Tyr Thr Ile Val Ile Gly
Ser Gln Thr Leu His Ala Thr 165 170 175 Gly Tyr Ala Met Gly Ile Ala
Lys Asp Gly Ala Asp Ser Ala Val Ile 180 185 190 Ala Tyr Phe Gly Asp
Gly Ala Ser Ser Gln Gly Asp Val Ala Glu Ser 195 200 205 Phe Thr Phe
Ser Ala Val Tyr Asn Ala Pro Val Val Phe Phe Cys Gln 210 215 220 Asn
Asn Gln Trp Ala Ile Ser Glu Pro Thr Glu Lys Gln Thr Arg Val 225 230
235 240 Pro Leu Tyr Gln Arg Ala Gln Gly Tyr Gly Phe Pro Gly Val Arg
Val 245 250 255 Asp Gly Asn Asp Val Leu Ala Cys Leu Ala Val Thr Lys
Trp Ala Leu 260 265 270 Glu Arg Ala Arg Arg Gly Glu Gly Pro Thr Leu
Val Glu Ala Phe Thr 275 280 285 Tyr Arg Met Gly Ala His Thr Thr Ser
Asp Asp Pro Thr Lys Tyr Arg 290 295 300 Ala Asp Glu Glu Arg Glu Ala
Trp Glu Ala Lys Asp Pro Ile Leu Arg 305 310 315 320 Leu Arg Thr Tyr
Leu Glu Ala Ser Asn His Ala Asp Glu Gly Phe Phe 325 330 335 Ala Glu
Leu Glu Val Glu Ser Glu Ala Leu Gly Arg Arg Val Arg Glu 340 345 350
Val Val Arg Ala Met Pro Asp Pro Asp His Phe Ala Ile Phe Glu Asn 355
360 365 Val Tyr Ala Asp Gly His Ala Leu Val Asp Glu Glu Arg Ala Gln
Phe 370 375 380 Ala Ala Tyr Gln Ala Ser Phe Thr Thr Glu Pro Asp Gly
Gly Ser Ala 385 390 395 400 Ala Gly Gln Gly Gly Asn 405
41221DNAStreptomyces avermitilis 4gtgaccgtgg agagcactgc cgcgcgaaag
ccgcgacgca gcgccggtac gaagagcgcc 60gcagccaagc gcaccagccc cggcgccaag
aagtcaccga gcacgaccgg cgccgagcac 120gagctgattc agctgctcac
gcccgacggc cggcgggtga agaaccccga gtacgacgcg 180tacgtcgcgg
acatcacccc cgaagagctg cgcggtctgt accgggacat ggtgctgagc
240cgccgcttcg acgcagaggc cacctccctg caacgccagg gcgagctggg
cctgtgggcc 300tcgatgctcg ggcaggaggc cgcccagatc ggctcgggcc
gggccacccg tgacgacgac 360tacgtcttcc cgacctaccg cgagcacggc
gtcgcctggt gccgcggggt cgaccccacc 420aacctgctcg gcatgttccg
cggcgtgaac aacggcggct gggatcccaa cagcaacaac 480ttccacctct
acacgatcgt catcggctcg cagacgctgc acgccaccgg ctacgccatg
540ggtatcgcca aggacggcgc cgactcggcc gtgatcgcgt acttcggtga
cggcgcctcc 600agccagggtg acgtcgccga atcgttcacc ttctccgcgg
tctacaacgc ccctgtcgtc 660ttcttctgcc agaacaacca gtgggcgatc
tccgagccca ccgagaagca gacccgcgtc 720ccgctctacc agcgcgcgca
gggctacggc ttcccgggcg tccgcgtcga cggcaacgac 780gtactggcct
gcctcgccgt caccaagtgg gccctcgagc gggcccgccg gggcgagggg
840cccacgttgg tcgaggcgtt cacgtaccgc atgggcgcgc acaccacctc
cgacgacccg 900accaagtacc gggccgacga ggagcgcgag gcgtgggagg
cgaaggaccc gatcctgcgt 960ctgcgcacgt atctcgaggc ctcaaaccac
gcggacgagg gattcttcgc ggaactcgag 1020gtggagagcg aggcgttggg
aaggcgagtg cgcgaagtgg tgcgtgccat gccggacccg 1080gaccacttcg
ccatcttcga gaacgtgtac gcggacgggc atgcgctcgt cgacgaggag
1140cgggcgcagt tcgccgccta ccaggcgtcg ttcacgacgg agcctgacgg
cggctccgcc 1200gcgggacagg ggggtaactg a 12215410PRTPseudomonas
putida 5Met Asn Glu Tyr Ala Pro Leu Arg Leu His Val Pro Glu Pro Thr
Gly 1 5 10 15 Arg Pro Gly Cys Gln Thr Asp Phe Ser Tyr Leu Arg Leu
Asn Asp Ala 20 25 30 Gly Gln Ala Arg Lys Pro Ala Ile Asp Val Asp
Ala Ala Asp Thr Ala 35 40 45 Asp Leu Ser Tyr Ser Leu Val Arg Val
Leu Asp Glu Gln Gly Asp Ala 50 55 60 Gln Gly Pro Trp Ala Glu Asp
Ile Asp Pro Gln Ile Leu Arg Gln Gly 65 70 75 80 Met Arg Ala Met Leu
Lys Thr Arg Ile Phe Asp Ser Arg Met Val Val 85 90 95 Ala Gln Arg
Gln Lys Lys Met Ser Phe Tyr Met Gln Ser Leu Gly Glu 100 105 110 Glu
Ala Ile Gly Ser Gly Gln Ala Leu Ala Leu Asn Arg Thr Asp Met 115 120
125 Cys Phe Pro Thr Tyr Arg Gln Gln Ser Ile Leu Met Ala Arg Asp Val
130 135 140 Ser Leu Val Glu Met Ile Cys Gln Leu Leu Ser Asn Glu Arg
Asp Pro 145 150 155 160 Leu Lys Gly Arg Gln Leu Pro Ile Met Tyr Ser
Val Arg Glu Ala Gly 165 170 175 Phe Phe Thr Ile Ser Gly Asn Leu Ala
Thr Gln Phe Val Gln Ala Val 180 185 190 Gly Trp Ala Met Ala Ser Ala
Ile Lys Gly Asp Thr Lys Ile Ala Ser 195 200 205 Ala Trp Ile Gly Asp
Gly Ala Thr Ala Glu Ser Asp Phe His Thr Ala 210 215 220 Leu Thr Phe
Ala His Val Tyr Arg Ala Pro Val Ile Leu Asn Val Val 225 230 235 240
Asn Asn Gln Trp Ala Ile Ser Thr Phe Gln Ala Ile Ala Gly Gly Glu 245
250 255 Ser Thr Thr Phe Ala Gly Arg Gly Val Gly Cys Gly Ile Ala Ser
Leu 260 265 270 Arg Val Asp Gly Asn Asp Phe Val Ala Val Tyr Ala Ala
Ser Arg Trp 275 280 285 Ala Ala Glu Arg Ala Arg Arg Gly Leu Gly Pro
Ser Leu Ile Glu Trp 290 295 300 Val Thr Tyr Arg Ala Gly Pro His Ser
Thr Ser Asp Asp Pro Ser Lys 305 310 315 320 Tyr Arg Pro Ala Asp Asp
Trp Ser His Phe Pro Leu Gly Asp Pro Ile 325 330 335 Ala Arg Leu Lys
Gln His Leu Ile Lys Ile Gly His Trp Ser Glu Glu 340 345 350 Glu His
Gln Ala Val Thr Ala Glu Leu Glu Ala Ala Val Ile Ala Ala 355 360 365
Gln Lys Glu Ala Glu Gln Tyr Gly Thr Leu Ala Asn Gly His Ile Pro 370
375 380 Ser Ala Ala Ser Met Phe Glu Asp Val Tyr Lys Glu Met Pro Glu
His 385 390 395 400 Leu Arg Arg Gln Arg Gln Glu Leu Gly Val 405 410
61233DNAPseudomonas putida 6tcaaaccccc agttcctggc gttgacggcg
caggtgttcg ggcatctcct tgtacacatc 60ctcgaacatc gaggcggcgc tcgggatgtg
cccgttagcc agggtgccgt actgctcggc 120ttctttctgt gcggcaatca
ccgcagcttc gagctcggcc gtgacggctt ggtgttcttc 180ttcggaccag
tggccgatct tgatcaggtg ctgcttcagg cgggcgatcg ggtcacccag
240cgggaagtgg ctccagtcat cggcagggcg gtacttggag gggtcgtccg
acgtcgagtg 300cgggccggca cggtaggtga cccactcgat caggcttggg
cccaggccgc ggcgggcgcg 360ctcggcagcc cagcgcgagg cggcgtacac
ggcgacgaag tcgttgccgt caacccgcag 420cgaggcaatg ccgcagccca
cgccacggcc ggcgaaggtg gtcgactcgc caccggcgat 480ggcctggaag
gtagaaatcg cccactggtt gttgaccaca ttgaggatca ccggggcgcg
540gtaaacgtgg gcaaaggtga gggcggtgtg gaagtccgac tcggcggtgg
ctccgtcacc 600gatccacgcc gaagcaatct tggtatcgcc cttgatcgcc
gaggccatgg cccagccgac 660tgcctgcacg aactgggtcg ccaggttgcc
gctgatggtg aagaagccgg cttcgcgcac 720cgagtacatg atcggcaact
ggcggccctt gagggggtcg cgctcgttgg acagcagttg 780gcagatcatc
tcgaccagcg atacgtcgcg ggccatcagg atgctttgct ggcggtaggt
840cgggaagcac atgtcggtgc ggttcagcgc cagcgcctgg ccactgccga
tggcttcttc 900gcccaggctt tgcatgtaga aggacatctt cttctggcgc
tgggcaacca ccatgcggct 960gtcgaagatc cgcgtcttga gcatggcgcg
catgccttga cgaaggatct gtgggtcgat 1020gtcttcggcc caggggcctt
gcgcatcacc ttgctcgtcg agcacgcgga ccaggctgta 1080ggacaggtcg
gcagtgtcgg cagcatcgac atcgatcgcg ggtttacggg cttgacctgc
1140atcgttgagg cgcaggtagg aaaaatcggt ctggcagcct ggccggccgg
tgggctcggg 1200cacatgcaaa cgcagggggg cgtactcgtt cat
12337331PRTListeria monocytogenes 7Met Thr Leu Lys Glu Ala Gly Leu
Thr Glu Asp Lys Leu Ile Lys Met 1 5 10 15 Tyr Glu Thr Met Leu Met
Ala Arg Arg Leu Asp Glu Arg Met Trp Leu 20 25 30 Leu Asn Arg Ser
Gly Lys Ile Pro Phe Thr Ile Ser Gly Gln Gly Gln 35 40 45 Glu Thr
Ala Gln Ile Gly Ala Ala Phe Ala Phe Asp Leu Asp Lys Asp 50 55 60
Tyr Ala Leu Pro Tyr Tyr Arg Asp Leu Ala Val Val Leu Ala Phe Gly 65
70 75 80 Met Thr Ala Lys Asp Ile Met Leu Ser Ala Phe Ala Lys Ala
Glu Asp 85 90 95 Pro Asn Ser Gly Gly Arg Gln Met Pro Ala His Phe
Gly Gln Lys Ser 100 105 110 Asn Arg Ile Val Thr Gln Ser Ser Pro Val
Thr Thr Gln Phe Pro His 115 120 125 Ala Ala Gly Ile Gly Leu Ala Ala
Lys Met Ala Gly Asp Glu Ile Ala 130 135 140 Ile Tyr Ala Ser Thr Gly
Glu Gly Ser Ser Asn Gln Gly Asp Phe His 145 150 155 160 Glu Gly Ile
Asn Phe Ala Ser Val His Lys Leu Pro Val Val Phe Val 165 170 175 Ile
His Asn Asn Gln Tyr Ala Ile Ser Val Pro Ala Ser Lys Gln Tyr 180 185
190 Ala Ala Glu Lys Leu Ser Asp Arg Ala Ile Gly Tyr Gly Ile Pro Gly
195 200 205 Glu Arg Val Asp Gly Thr Asn Met Gly Glu Val Tyr Ala Ala
Phe Lys 210 215 220 Arg Ala Ala Asp Arg Ala Arg Asn Gly Glu Gly Pro
Thr Leu Ile Glu 225 230 235 240 Thr Val Ser Tyr Arg Phe Thr Pro His
Ser Ser Asp Asp Asp Asp Ser 245 250 255 Ser Tyr Arg Ser Arg Glu Glu
Val Asn Glu Ala Lys Gly Lys Asp Pro 260 265 270 Leu Thr Ile Phe Gln
Thr Glu Leu Leu Glu Glu Gly Tyr Leu Thr Glu 275 280 285 Glu Lys Ile
Ala Glu Ile Glu Lys Asn Ile Ala Lys Glu Val Asn Glu 290 295 300 Ala
Thr Asp Tyr Ala Glu Ser Ala Ala Tyr Ala Glu Pro Glu Ser Ser 305 310
315 320 Leu Leu Tyr Val Tyr Asp Glu Glu Ala Asn Ser 325 330
8996DNAListeria monocytogenes 8atgactttaa aagaagcagg tttaacagaa
gataaattaa ttaaaatgta tgaaacaatg 60ctaatggcaa gaagactaga cgagcgtatg
tggttgctga accgttctgg gaaaattcct 120ttcaccattt ctggacaagg
acaagaaacg gcacaaattg gcgcagcgtt tgcctttgat 180ttagataaag
attacgcatt accatattac cgtgatttag cggtggtgtt agcatttggg
240atgacagcga aagatattat gttatccgcg ttcgctaaag cagaggatcc
aaactctggt 300ggacgtcaaa tgccagctca ttttggtcaa aaatcaaatc
gcatcgtgac acaaagttca 360ccagtaacaa cgcagttccc gcatgcagca
ggtattggtc ttgcagcgaa aatggccggt 420gatgagattg caatttatgc
ttcaacgggt gaaggatctt ctaaccaagg agatttccat 480gaaggaatca
acttcgcatc tgtacataag ttgccagttg ttttcgtgat tcacaataac
540caatatgcca tttccgttcc agcatcgaaa caatatgctg cagaaaaact
atccgaccga 600gcaatcggtt atggtatccc aggggaacgt gtggatggca
caaatatggg tgaagtatac 660gcggcattta aacgtgcagc agatcgtgca
agaaacggcg agggccccac tttaattgaa 720acagtttctt accgattcac
accgcactct tctgatgatg atgacagcag ttatcgttcc 780agagaagaag
tgaacgaagc aaaaggaaaa gatccactga caattttcca aacagaatta
840ctcgaagaag gttacttaac agaagaaaaa atcgctgaaa tcgaaaaaaa
tattgcaaaa 900gaagttaacg aagcaaccga ttacgcggaa agtgcagcat
acgctgaacc agaatcatct 960ttactttatg tatatgatga agaagcgaat agctga
9969381PRTStreptomyces avermitilis 9Met Thr Val Met Glu Gln Arg Gly
Ala Tyr Arg Pro Thr Pro Pro Pro 1 5 10 15 Ala Trp Gln Pro Arg Thr
Asp Pro Ala Pro Leu Leu Pro Asp Ala Leu 20 25 30 Pro His Arg Val
Leu Gly Thr Glu Ala Ala Ala Glu Ala Asp Pro Leu 35 40 45 Leu Leu
Arg Arg Leu Tyr Ala Glu Leu Val Arg Gly Arg Arg Tyr Asn 50 55 60
Thr Gln Ala Thr Ala Leu Thr Lys Gln Gly Arg Leu Ala Val Tyr Pro 65
70 75 80 Ser Ser Thr Gly Gln Glu Ala Cys Glu Val Ala Ala Ala Leu
Val Leu 85
90 95 Glu Glu Arg Asp Trp Leu Phe Pro Ser Tyr Arg Asp Thr Leu Ala
Ala 100 105 110 Val Ala Arg Gly Leu Asp Pro Val Gln Ala Leu Thr Leu
Leu Arg Gly 115 120 125 Asp Trp His Thr Gly Tyr Asp Pro Arg Glu His
Arg Ile Ala Pro Leu 130 135 140 Cys Thr Pro Leu Ala Thr Gln Leu Pro
His Ala Val Gly Leu Ala His 145 150 155 160 Ala Ala Arg Leu Lys Gly
Asp Asp Val Val Ala Leu Ala Leu Val Gly 165 170 175 Asp Gly Gly Thr
Ser Glu Gly Asp Phe His Glu Ala Leu Asn Phe Ala 180 185 190 Ala Val
Trp Gln Ala Pro Val Val Phe Leu Val Gln Asn Asn Gly Phe 195 200 205
Ala Ile Ser Val Pro Leu Ala Lys Gln Thr Ala Ala Pro Ser Leu Ala 210
215 220 His Lys Ala Val Gly Tyr Gly Met Pro Gly Arg Leu Val Asp Gly
Asn 225 230 235 240 Asp Ala Ala Ala Val His Glu Val Leu Ser Asp Ala
Val Ala His Ala 245 250 255 Arg Ala Gly Gly Gly Pro Thr Leu Val Glu
Ala Val Thr Tyr Arg Ile 260 265 270 Asp Ala His Thr Asn Ala Asp Asp
Ala Thr Arg Tyr Arg Gly Asp Ser 275 280 285 Glu Val Glu Ala Trp Arg
Ala His Asp Pro Ile Ala Leu Leu Glu His 290 295 300 Glu Leu Thr Glu
Arg Gly Leu Leu Asp Glu Asp Gly Ile Arg Ala Ala 305 310 315 320 Arg
Glu Asp Ala Glu Ala Met Ala Ala Asp Leu Arg Ala Arg Met Asn 325 330
335 Gln Asp Pro Ala Leu Asp Pro Met Asp Leu Phe Ala His Val Tyr Ala
340 345 350 Glu Pro Thr Pro Gln Leu Arg Glu Gln Glu Ala Gln Leu Arg
Ala Glu 355 360 365 Leu Ala Ala Glu Ala Asp Gly Pro Gln Gly Val Gly
Arg 370 375 380 101146DNAStreptomyces avermitilis 10atgacggtca
tggagcagcg gggcgcttac cggcccacac cgccgcccgc ctggcagccc 60cgcaccgacc
ccgcgccact gctgcccgac gcgctgcccc accgcgtcct gggcaccgag
120gcggccgcgg aggccgaccc gctactgctg cgccgcctgt acgcggagct
ggtgcgcggc 180cgccgctaca acacgcaggc cacggctctc accaagcagg
gccggctcgc cgtctacccg 240tcgagcacgg gccaggaggc ctgcgaggtc
gccgccgcgc tcgtgctgga ggagcgcgac 300tggctcttcc ccagctaccg
ggacaccctc gccgccgtcg cccgcggcct cgatcccgtc 360caggcgctca
ccctcctgcg cggcgactgg cacaccgggt acgacccccg tgagcaccgc
420atcgcgcccc tgtgcacccc tctcgcgacc cagctcccgc acgccgtcgg
cctcgcgcac 480gccgcccgcc tcaagggcga cgacgtggtc gcgctcgccc
tggtcggcga cggcggcacc 540agcgagggcg acttccacga ggcactgaac
ttcgccgccg tctggcaggc gccggtcgtc 600ttcctcgtgc agaacaacgg
cttcgccatc tccgtcccgc tcgccaagca gaccgccgcc 660ccgtcgctgg
cccacaaggc cgtcggctac gggatgccgg gccgcctggt cgacggcaac
720gacgcggcgg ccgtgcacga ggtcctcagc gacgccgtgg cccacgcgcg
cgcgggaggg 780gggccgacgc tcgtggaggc ggtgacctac cgcatcgacg
cccacaccaa cgccgacgac 840gcgacgcgct accgggggga ctccgaggtg
gaggcctggc gcgcgcacga cccgatcgcg 900ctcctggagc acgagttgac
cgaacgcggg ctgctcgacg aggacggcat ccgggccgcc 960cgcgaggacg
ccgaggcgat ggccgcggac ctgcgcgcac gcatgaacca ggatccggcc
1020ctggacccca tggacctgtt cgcccatgtg tatgccgagc ccacccccca
gctgcgggag 1080caggaagccc agttgcgggc cgagctggca gcggaggccg
acgggcccca aggagtcggc 1140cgatga 114611384PRTMicrococcus luteus
11Met Thr Leu Val Asp His Thr Arg Pro Thr Gly Gly Gln Ser Ala Gly 1
5 10 15 Ser Pro Pro Pro Ala Gly Pro Ala Glu Ala Val Met Leu Gln Val
Leu 20 25 30 Asp Thr Glu Gly Arg Arg Arg Pro Gln Pro Glu Leu Asp
Pro Trp Ile 35 40 45 Glu Asp Val Asp Ala Ala Ala Leu Ala Ala Leu
Tyr Arg Gln Met Ala 50 55 60 Val Val Arg Arg Leu Asp Val Glu Ala
Thr His Leu Gln Arg Gln Gly 65 70 75 80 Glu Leu Ala Leu Trp Pro Pro
Leu Leu Gly Gln Glu Ala Ala Gln Val 85 90 95 Gly Ser Ala Val Ala
Leu Arg Pro Asp Asp Phe Val Phe Pro Ser Tyr 100 105 110 Arg Glu Asn
Gly Val Ala Leu Leu Arg Gly Val Pro Ala Leu Asp Leu 115 120 125 Leu
Arg Val Trp Arg Gly Ser Thr Phe Ser Ser Trp Asp Pro Asn Glu 130 135
140 Thr Arg Val Ala Thr Gln Gln Ile Ile Ile Gly Ala Gln Ala Leu His
145 150 155 160 Ala Val Gly Tyr Ala Met Gly Val Gln Arg Asp Gln Ala
Asp Val Ala 165 170 175 Thr Ile Val Tyr Phe Gly Asp Gly Ala Thr Ser
Gln Gly Asp Val Asn 180 185 190 Glu Ala Met Val Phe Ser Ala Ser Tyr
Gln Ala Pro Val Val Phe Phe 195 200 205 Cys Gln Asn Asn His Trp Ala
Ile Ser Glu Pro Val Arg Leu Gln Thr 210 215 220 Arg Arg Ser Ile Ala
Asp Arg Pro Trp Gly Phe Gly Ile Pro Ser Met 225 230 235 240 Arg Val
Asp Gly Asn Asp Val Leu Ala Val Leu Ala Ala Thr Arg Ala 245 250 255
Ala Val Glu Arg Ala Ala Asp Gly Gly Gly Pro Thr Phe Val Glu Ala 260
265 270 Val Thr Tyr Arg Met Gly Pro His Thr Thr Ala Asp Asp Pro Thr
Arg 275 280 285 Tyr Arg Asp Asp Ala Glu Leu Glu Ala Trp Lys Ala Arg
Asp Pro Leu 290 295 300 Thr Arg Val Glu Ala His Leu Arg Thr Leu Asp
Val Asp Val Asp Ala 305 310 315 320 Val Leu Ala Gln Ala Gln Ala Glu
Ala Asp Glu Leu Ala Ala Glu Val 325 330 335 Arg Arg Ala Leu Glu Ala
Leu Glu Glu Asp Gly Ala Asp Arg Leu Phe 340 345 350 Asp Glu Ile Tyr
Ala Glu Pro His Gln Glu Leu Glu Arg Gln Arg Arg 355 360 365 Glu His
Ala Leu Tyr Leu Gln Gln Phe Asp Asp Glu Glu Ala Gly Ala 370 375 380
121155DNAMicrococcus luteus 12gtgaccctcg tggaccacac ccgtcccacc
ggcggacagt ccgccggctc tccgcccccg 60gcgggcccgg ccgaggccgt gatgctccag
gtgctcgaca cggagggccg ccgccgtccg 120cagccggagc tcgacccgtg
gatcgaggac gtcgacgccg ccgccctcgc cgcgctgtac 180cgccagatgg
ccgtggtccg tcgcctcgac gtcgaggcca cgcacctgca gcgtcagggc
240gagctggccc tgtggccgcc gctgctgggc caggaggccg cccaggtggg
ctccgccgtc 300gcgctgcgcc cggacgactt cgtcttcccg tcctaccgcg
agaacggcgt ggccctgctg 360cgcggcgtcc ccgcgctgga cctgctgcgg
gtgtggcgcg gctccacgtt ctcgagctgg 420gacccgaacg agacgcgggt
ggccacccag cagatcatca tcggcgcgca ggccctgcac 480gccgtcggct
acgcgatggg cgtccagcgg gaccaggcgg acgtcgccac gatcgtctac
540ttcggcgacg gcgccacgag ccagggcgac gtcaacgagg ccatggtctt
cagcgcctcc 600taccaggcgc ccgtggtgtt cttctgccag aacaaccact
gggccatctc cgagcccgtg 660cgcctgcaga cccgccgcag catcgcggac
cgcccgtggg gcttcggcat cccgtcgatg 720cgcgtggacg gcaacgacgt
cctggccgtg ctcgccgcaa cccgcgccgc cgtcgagcgc 780gcggccgacg
ggggcggccc cacgttcgtc gaggccgtca cctaccgcat gggtccacac
840accaccgcgg acgaccccac ccgctaccgg gacgacgccg agctcgaggc
ctggaaggcc 900cgtgacccgc tgacccgcgt ggaggcgcac ctgcgcaccc
tcgacgtgga cgtggacgcc 960gtgcttgcac aggcccaggc cgaggccgac
gagctggcag cggaggtccg ccgtgccctc 1020gaggcgctcg aggaggacgg
cgcggacagg ctcttcgacg agatctacgc ggagccccac 1080caggagctcg
agcggcagcg ccgcgagcac gccctctacc tgcagcagtt cgacgacgag
1140gaggcgggcg cgtga 115513330PRTStaphylococcus aureus 13Met Ile
Asp Tyr Lys Ser Leu Gly Leu Ser Glu Glu Asp Leu Lys Val 1 5 10 15
Ile Tyr Lys Trp Met Asp Leu Gly Arg Lys Ile Asp Glu Arg Leu Trp 20
25 30 Leu Leu Asn Arg Ala Gly Lys Ile Pro Phe Val Val Ser Gly Gln
Gly 35 40 45 Gln Glu Ala Thr Gln Ile Gly Met Ala Tyr Ala Leu Glu
Glu Gly Asp 50 55 60 Ile Thr Ala Pro Tyr Tyr Arg Asp Leu Ala Phe
Val Thr Tyr Met Gly 65 70 75 80 Ile Ser Ala Tyr Asp Thr Phe Leu Ser
Ala Phe Gly Lys Lys Asp Asp 85 90 95 Val Asn Ser Gly Gly Lys Gln
Met Pro Ser His Phe Ser Ser Arg Ala 100 105 110 Lys Asn Ile Leu Ser
Gln Ser Ser Pro Val Ala Thr Gln Ile Pro His 115 120 125 Ala Val Gly
Ala Ala Leu Ala Leu Lys Met Asp Gly Lys Lys Lys Ile 130 135 140 Ala
Thr Ala Thr Val Gly Glu Gly Ser Ser Asn Gln Gly Asp Phe His 145 150
155 160 Glu Gly Leu Asn Phe Ala Gly Val His Lys Leu Pro Phe Val Cys
Val 165 170 175 Ile Ile Asn Asn Lys Tyr Ala Ile Ser Val Pro Asp Ser
Leu Gln Tyr 180 185 190 Ala Ala Glu Lys Leu Ser Asp Arg Ala Leu Gly
Tyr Gly Ile His Gly 195 200 205 Glu Gln Val Asp Gly Asn Asp Pro Leu
Ala Met Tyr Lys Ala Met Lys 210 215 220 Glu Ala Arg Asp Arg Ala Ile
Ser Gly Gln Gly Ser Thr Leu Ile Glu 225 230 235 240 Ala Val Thr Ser
Arg Met Thr Ala His Ser Ser Asp Asp Asp Asp Gln 245 250 255 Tyr Arg
Thr Lys Glu Glu Arg Glu Ala Leu Lys Lys Ala Asp Cys Asn 260 265 270
Glu Lys Phe Lys Lys Glu Leu Leu Ser Ala Gly Ile Ile Asp Asp Ala 275
280 285 Trp Leu Ala Glu Ile Glu Ala Glu His Lys Asp Ile Ile Asn Lys
Ala 290 295 300 Thr Lys Ala Ala Glu Asp Ala Pro Tyr Pro Ser Val Glu
Glu Ala Tyr 305 310 315 320 Ala Phe Val Tyr Glu Glu Gly Ser Leu Asn
325 330 14993DNAStaphylococcus aureus 14ttagttaaga ctcccttctt
cgtacacaaa tgcataggct tcttcgacac ttggatatgg 60cgcgtcttca gcagcctttg
tcgctttatt gatgatgtct ttatgctccg cttctatttc 120tgccaaccaa
gcatcatcga taatgccagc tgaaagcaac tcttttttga acttttcatt
180gcagtcagct tttttaagcg cttcacgctc ttctttcgta cgatattggt
cgtcatcatc 240tgatgaatga gctgtcatac gacttgttac tgcttcaatc
aaagttgaac cttgaccaga 300aatagctcga tctcttgctt ctttcatcgc
tttatacatt gctaatggat cattaccatc 360tacttgttca ccatgtatac
cgtaaccaag tgctctatcc gataattttt cagctgcgta 420ttgtaatgaa
tcaggtactg aaattgcata tttattattt ataatgacac atacaaaagg
480aagtttgtgt acacccgcga agtttaaacc ttcatggaag tcaccttggt
ttgagctacc 540ttcaccaaca gttgctgttg caattttctt cttaccatcc
atttttaaag ctaaagcagc 600accaacagca tggggtattt gagttgctac
cggtgaactt tgagacaaaa tattcttagc 660tctactacta aagtgtgatg
gcatttgttt tccaccagag ttaacatcgt ctttctttcc 720aaacgctgat
aaaaacgtat catacgctga gatacccata taagtaacga aagctagatc
780tctataataa ggcgctgtaa tatcaccttc ttctaatgcg tatgccatcc
caatctgagt 840tgcttcttgt ccttgaccac ttacaacaaa tggaatttta
cctgcacggt tcaataacca 900cagtctttca tctatttttc tacctaaatc
catccattta tatattactt ttaggtcttc 960ttcgctaagg cctaatgatt
tataatcaat cat 99315331PRTStreptococcus mutans 15Met Ala Arg Lys
Ile Leu Glu Val Ile Ile Ala Met Leu Ser Lys Lys 1 5 10 15 Gln Tyr
Leu Asp Met Phe Leu Lys Met Gln Arg Ile Arg Asp Val Asp 20 25 30
Thr Lys Leu Asn Lys Leu Val Arg Arg Gly Phe Val Gln Gly Met Thr 35
40 45 His Phe Ser Val Gly Glu Glu Ala Ala Ser Val Gly Ala Ile Gln
Gly 50 55 60 Leu Thr Asp Gln Asp Ile Ile Phe Ser Asn His Arg Gly
His Gly Gln 65 70 75 80 Thr Ile Ala Lys Gly Ile Asp Ile Pro Ala Met
Phe Ala Glu Leu Ala 85 90 95 Gly Lys Ala Thr Gly Ser Ser Lys Gly
Arg Gly Gly Ser Met His Leu 100 105 110 Ala Asn Leu Glu Lys Gly Asn
Tyr Gly Thr Asn Gly Ile Val Gly Gly 115 120 125 Gly Tyr Ala Leu Ala
Val Gly Ala Ala Leu Thr Gln Gln Tyr Asp Asn 130 135 140 Thr Gly Asn
Ile Val Val Ala Phe Ser Gly Asp Ser Ala Thr Asn Glu 145 150 155 160
Gly Ser Phe His Glu Ser Val Asn Leu Ala Ala Val Trp Asn Leu Pro 165
170 175 Val Ile Phe Phe Ile Ile Asn Asn Arg Tyr Gly Ile Ser Thr Asp
Ile 180 185 190 Asn Tyr Ser Thr Lys Ile Ser His Leu Tyr Leu Arg Ala
Asp Ala Tyr 195 200 205 Gly Ile Pro Gly His Tyr Val Glu Asp Gly Asn
Asp Val Ile Ala Val 210 215 220 Tyr Glu Lys Met Gln Glu Val Ile Asp
Tyr Val Arg Ser Gly Asn Gly 225 230 235 240 Pro Ala Leu Val Glu Val
Glu Ser Tyr Arg Trp Phe Gly His Ser Thr 245 250 255 Ala Asp Ala Gly
Ala Tyr Arg Thr Lys Glu Glu Val Asp Ala Trp Lys 260 265 270 Ala Lys
Asp Pro Leu Lys Lys Tyr Arg Thr Tyr Leu Thr Glu Asn Lys 275 280 285
Ile Ala Thr Asp Glu Glu Leu Asp Met Ile Glu Lys Glu Val Ala Gln 290
295 300 Glu Ile Glu Asp Ala Val Lys Phe Ala Gln Asp Ser Pro Glu Pro
Glu 305 310 315 320 Leu Ser Val Ala Phe Glu Asp Val Trp Val Asp 325
330 16996DNAStreptococcus mutans 16atggcaagaa aaattttgga ggtcattata
gcaatgttat ctaaaaaaca atatttggat 60atgtttttaa aaatgcagcg tatccgtgat
gtcgatacaa aactcaataa attagttcgt 120cgtggtttcg tacaaggtat
gacacacttt tcagtaggag aagaggcggc ttcggttggt 180gcgattcaag
gcttgactga tcaggatatt atcttttcaa atcaccgtgg acatggtcaa
240accattgcaa aagggattga cattcctgct atgtttgcag aattagccgg
taaggcaacg 300ggttcttcaa aaggtcgtgg tggttctatg cacttggcaa
atcttgaaaa aggaaactat 360gggaccaatg gtattgttgg cgggggttat
gccttagcag tcggtgctgc tttgacacag 420caatatgaca atacgggaaa
tattgttgtc gccttttcag gagactcggc aactaatgaa 480ggctctttcc
atgagtctgt taatttggca gctgtctgga atttaccggt tatcttcttt
540attattaata atcgttatgg tatctcaaca gatatcaatt attctactaa
gatttcacat 600ctttatttac gtgctgatgc ttatggtatt cctggacatt
atgttgaaga tggtaatgat 660gtcattgcag tttatgaaaa aatgcaggaa
gtcattgatt atgtgcgttc aggaaatggg 720ccagctcttg ttgaagtgga
atcttatcgt tggttcggac attctactgc tgatgcagga 780gcttaccgta
caaaagaaga agtagatgct tggaaagcta aagatcctct caagaaatac
840cgcacttatc taacagaaaa taagattgca acagatgagg aacttgatat
gattgaaaaa 900gaagtcgcac aggaaattga ggatgcagtg aaatttgccc
aagatagccc tgaaccagag 960ctttctgtag cttttgaaga tgtttgggta gattag
9961716PRTArtificial sequenceSynthetic polypeptide 17Xaa Xaa Xaa
Gly Xaa Glu Ala Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15
188PRTArtificial sequenceSynthetic polypeptide 18Asp Xaa Xaa Xaa
Pro Xaa Tyr Arg 1 5 1911PRTArtificial sequenceSynthetic polypeptide
19Xaa Gln Xaa Xaa Xaa Ala Xaa Gly Xaa Ala Xaa 1 5 10
2020PRTArtificial sequenceSynthetic polypeptide 20Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Xaa Xaa Gly Xaa Gly Xaa Xaa Xaa 1 5 10 15 Xaa Xaa
Asp Xaa 20 2121PRTArtificial sequenceSynthetic polypeptide 21Phe
Xaa Xaa Val Xaa Xaa Xaa Pro Val Xaa Xaa Xaa Xaa Xaa Asn Asn 1 5 10
15 Xaa Xaa Ala Ile Ser 20 2216PRTArtificial sequenceSynthetic
polypeptide 22Xaa Xaa Xaa Gly Xaa Xaa Xaa Xaa Xaa Xaa Xaa Val Asp
Gly Asn Asp 1 5 10 15 2331PRTArtificial sequenceSynthetic
polypeptide 23Xaa Ala Arg Xaa Gly Xaa Gly Pro Xaa Leu Xaa Glu Xaa
Xaa Xaa Tyr 1 5 10 15 Arg Xaa Xaa Xaa His Xaa Xaa Xaa Asp Asp Xaa
Xaa Xaa Tyr Arg 20 25 30 24327PRTBacillus subtilis 24Met Ser Val
Met Ser Tyr Ile Asp Ala Ile Asn Leu Ala Met Lys Glu 1 5 10 15 Glu
Met Glu Arg Asp Ser Arg Val Phe Val Leu Gly Glu Asp Val Gly 20 25
30 Arg Lys Gly Gly Val Phe Lys Ala Thr Ala Gly Leu Tyr Glu Gln Phe
35 40 45 Gly Glu Glu Arg Val Met Asp Thr Pro Leu Ala Glu Ser Ala
Ile Ala 50 55 60 Gly Val Gly Ile Gly Ala Ala Met Tyr Gly Met Arg
Pro Ile Ala Glu 65 70 75 80 Met Gln Phe Ala Asp Phe Ile Met Pro Ala
Val
Asn Gln Ile Ile Ser 85 90 95 Glu Ala Ala Lys Ile Arg Tyr Arg Ser
Asn Asn Asp Trp Ser Cys Pro 100 105 110 Ile Val Val Arg Ala Pro Tyr
Gly Gly Gly Val His Gly Ala Leu Tyr 115 120 125 His Ser Gln Ser Val
Glu Ala Ile Phe Ala Asn Gln Pro Gly Leu Lys 130 135 140 Ile Val Met
Pro Ser Thr Pro Tyr Asp Ala Lys Gly Leu Leu Lys Ala 145 150 155 160
Ala Val Arg Asp Glu Asp Pro Val Leu Phe Phe Glu His Lys Arg Ala 165
170 175 Tyr Arg Leu Ile Lys Gly Glu Val Pro Ala Asp Asp Tyr Val Leu
Pro 180 185 190 Ile Gly Lys Ala Asp Val Lys Arg Glu Gly Asp Asp Ile
Thr Val Ile 195 200 205 Thr Tyr Gly Leu Cys Val His Phe Ala Leu Gln
Ala Ala Glu Arg Leu 210 215 220 Glu Lys Asp Gly Ile Ser Ala His Val
Val Asp Leu Arg Thr Val Tyr 225 230 235 240 Pro Leu Asp Lys Glu Ala
Ile Ile Glu Ala Ala Ser Lys Thr Gly Lys 245 250 255 Val Leu Leu Val
Thr Glu Asp Thr Lys Glu Gly Ser Ile Met Ser Glu 260 265 270 Val Ala
Ala Ile Ile Ser Glu His Cys Leu Phe Asp Leu Asp Ala Pro 275 280 285
Ile Lys Arg Leu Ala Gly Pro Asp Ile Pro Ala Met Pro Tyr Ala Pro 290
295 300 Thr Met Glu Lys Tyr Phe Met Val Asn Pro Asp Lys Val Glu Ala
Ala 305 310 315 320 Met Arg Glu Leu Ala Glu Phe 325
25984DNABacillus subtilis 25ttaaaactcc gctaattctc tcatcgccgc
ttccacttta tcagggttga ccataaagta 60tttttccatt gtcggcgcat aaggcatagc
cggaatatca ggacctgcaa gccgtttgat 120cggcgcgtct aagtcgaaca
gacaatgctc ggatataatt gcggctactt cgctcatgat 180gctgccttct
tttgtatctt ctgtgaccaa aagaaccttt ccagttttgg acgcagcttc
240gatgatggct tctttatcaa gcgggtaaac tgttcttaaa tccaccacat
gcgctgaaat 300gccatctttt tcgagacgtt ctgcagcttg taaggcgaag
tggacacaca ggccgtatgt 360gatcactgtg atgtcgtcgc cttccctttt
tacgtccgcc ttgccgattg gcaggacata 420atcatcagcc ggaacctcgc
cctttatcag acggtatgcc cgcttgtgct caaaaaacag 480cacggggtct
tcgtcacgaa ctgcggcttt taagagccct ttcgcgtcat atggtgttga
540tggcatgaca attttcagtc cgggctggtt ggcgaaaatt gcttcgactg
attgagaatg 600atacagggct ccgtgcacgc ctccgccgta tggcgctctg
acgacaatcg gacagctcca 660gtcattgttg ctgcggtagc ggattttagc
cgcttcagaa ataatttggt tgactgccgg 720cataatgaaa tcagcaaact
gcatttcagc aatcggtctc attccgtaca ttgccgctcc 780gataccgact
cctgcgattg cagattcagc aagcggcgta tccataacgc gctcttcccc
840aaattgttca tagagtcccg ctgtcgcttt aaacacaccg ccttttcttc
ctacatcttc 900cccaaggacg aaaacgcgag aatctcgttc catttcttct
ttcatcgcca aattgattgc 960atcaatatat gacattactg acat
98426325PRTStreptomyces avermitilis 26Met Ala Glu Lys Met Ala Ile
Ala Lys Ala Ile Asn Glu Ser Leu Arg 1 5 10 15 Lys Ala Leu Glu Ser
Asp Pro Lys Val Leu Ile Met Gly Glu Asp Val 20 25 30 Gly Lys Leu
Gly Gly Val Phe Arg Val Thr Asp Gly Leu Gln Lys Asp 35 40 45 Phe
Gly Glu Glu Arg Val Ile Asp Thr Pro Leu Ala Glu Ser Gly Ile 50 55
60 Val Gly Thr Ala Ile Gly Leu Ala Leu Arg Gly Tyr Arg Pro Val Val
65 70 75 80 Glu Ile Gln Phe Asp Gly Phe Val Phe Pro Ala Tyr Asp Gln
Ile Val 85 90 95 Thr Gln Leu Ala Lys Met His Ala Arg Ala Leu Gly
Lys Ile Lys Leu 100 105 110 Pro Val Val Val Arg Ile Pro Tyr Gly Gly
Gly Ile Gly Ala Val Glu 115 120 125 His His Ser Glu Ser Pro Glu Ala
Leu Phe Ala His Val Ala Gly Leu 130 135 140 Lys Val Val Ser Pro Ser
Asn Ala Ser Asp Ala Tyr Trp Met Met Gln 145 150 155 160 Gln Ala Ile
Gln Ser Asp Asp Pro Val Ile Phe Phe Glu Pro Lys Arg 165 170 175 Arg
Tyr Trp Asp Lys Gly Glu Val Asn Val Glu Ala Ile Pro Asp Pro 180 185
190 Leu His Lys Ala Arg Val Val Arg Glu Gly Thr Asp Leu Thr Leu Ala
195 200 205 Ala Tyr Gly Pro Met Val Lys Val Cys Gln Glu Ala Ala Ala
Ala Ala 210 215 220 Glu Glu Glu Gly Lys Ser Leu Glu Val Val Asp Leu
Arg Ser Met Ser 225 230 235 240 Pro Ile Asp Phe Asp Ala Val Gln Ala
Ser Val Glu Lys Thr Arg Arg 245 250 255 Leu Val Val Val His Glu Ala
Pro Val Phe Leu Gly Thr Gly Ala Glu 260 265 270 Ile Ala Ala Arg Ile
Thr Glu Arg Cys Phe Tyr His Leu Glu Ala Pro 275 280 285 Val Leu Arg
Val Gly Gly Tyr His Ala Pro Tyr Pro Pro Ala Arg Leu 290 295 300 Glu
Glu Glu Tyr Leu Pro Gly Leu Asp Arg Val Leu Asp Ala Val Asp 305 310
315 320 Arg Ser Leu Ala Tyr 325 27978DNAStreptomyces avermitilis
27atggccgaga agatggcgat cgccaaggcg atcaacgagt cgctgcgcaa ggccctggag
60tccgacccca aggttctgat catgggtgag gacgtcggca agctcggtgg cgtcttccgc
120gtcaccgacg gcctgcagaa ggacttcggc gaggagcggg tcatcgacac
cccgctcgcc 180gagtcgggca tcgtcggcac ggcgatcggt ctcgccctgc
gcggctaccg cccggtggtg 240gagatccagt tcgacggctt cgtcttcccg
gcgtacgacc agatcgtcac gcagctcgcg 300aagatgcacg cgcgggcgct
cggcaagatc aagctccccg ttgtcgtccg catcccgtac 360ggcggcggca
tcggcgccgt cgagcaccac tccgagtccc ccgaggcgct cttcgcgcac
420gtggcgggcc tcaaggtggt ctccccgtcc aacgcgtcgg acgcgtactg
gatgatgcag 480caggccatcc agagcgacga cccggtgatc ttcttcgagc
cgaagcggcg ctactgggac 540aagggcgagg tcaacgtcga ggcgatcccc
gacccgctgc acaaggcccg tgtggtgcgt 600gagggcaccg acctgacgct
cgccgcgtac ggcccgatgg tgaaggtctg ccaggaggcc 660gcggccgccg
ccgaggagga gggcaagtcc ctggaggtcg tcgacctgcg ctccatgtcg
720ccgatcgact tcgacgccgt ccaggcctcc gtcgagaaga cccgccgtct
ggtcgtggtg 780cacgaggcgc cggtgttcct gggcacgggc gcggagatcg
ccgcccgcat cacggagcgc 840tgcttctacc acctggaggc acccgtgctg
agggtcggcg gctaccacgc cccgtatccg 900ccggcgcgtc tggaagagga
gtaccttccg ggccttgacc gggtgctcga tgccgtcgac 960cgctcgctgg cgtactga
97828410PRTPseudomonas putida 28Met Asn Glu Tyr Ala Pro Leu Arg Leu
His Val Pro Glu Pro Thr Gly 1 5 10 15 Arg Pro Gly Cys Gln Thr Asp
Phe Ser Tyr Leu Arg Leu Asn Asp Ala 20 25 30 Gly Gln Ala Arg Lys
Pro Ala Ile Asp Val Asp Ala Ala Asp Thr Ala 35 40 45 Asp Leu Ser
Tyr Ser Leu Val Arg Val Leu Asp Glu Gln Gly Asp Ala 50 55 60 Gln
Gly Pro Trp Ala Glu Asp Ile Asp Pro Gln Ile Leu Arg Gln Gly 65 70
75 80 Met Arg Ala Met Leu Lys Thr Arg Ile Phe Asp Ser Arg Met Val
Val 85 90 95 Ala Gln Arg Gln Lys Lys Met Ser Phe Tyr Met Gln Ser
Leu Gly Glu 100 105 110 Glu Ala Ile Gly Ser Gly Gln Ala Leu Ala Leu
Asn Arg Thr Asp Met 115 120 125 Cys Phe Pro Thr Tyr Arg Gln Gln Ser
Ile Leu Met Ala Arg Asp Val 130 135 140 Ser Leu Val Glu Met Ile Cys
Gln Leu Leu Ser Asn Glu Arg Asp Pro 145 150 155 160 Leu Lys Gly Arg
Gln Leu Pro Ile Met Tyr Ser Val Arg Glu Ala Gly 165 170 175 Phe Phe
Thr Ile Ser Gly Asn Leu Ala Thr Gln Phe Val Gln Ala Val 180 185 190
Gly Trp Ala Met Ala Ser Ala Ile Lys Gly Asp Thr Lys Ile Ala Ser 195
200 205 Ala Trp Ile Gly Asp Gly Ala Thr Ala Glu Ser Asp Phe His Thr
Ala 210 215 220 Leu Thr Phe Ala His Val Tyr Arg Ala Pro Val Ile Leu
Asn Val Val 225 230 235 240 Asn Asn Gln Trp Ala Ile Ser Thr Phe Gln
Ala Ile Ala Gly Gly Glu 245 250 255 Ser Thr Thr Phe Ala Gly Arg Gly
Val Gly Cys Gly Ile Ala Ser Leu 260 265 270 Arg Val Asp Gly Asn Asp
Phe Val Ala Val Tyr Ala Ala Ser Arg Trp 275 280 285 Ala Ala Glu Arg
Ala Arg Arg Gly Leu Gly Pro Ser Leu Ile Glu Trp 290 295 300 Val Thr
Tyr Arg Ala Gly Pro His Ser Thr Ser Asp Asp Pro Ser Lys 305 310 315
320 Tyr Arg Pro Ala Asp Asp Trp Ser His Phe Pro Leu Gly Asp Pro Ile
325 330 335 Ala Arg Leu Lys Gln His Leu Ile Lys Ile Gly His Trp Ser
Glu Glu 340 345 350 Glu His Gln Ala Val Thr Ala Glu Leu Glu Ala Ala
Val Ile Ala Ala 355 360 365 Gln Lys Glu Ala Glu Gln Tyr Gly Thr Leu
Ala Asn Gly His Ile Pro 370 375 380 Ser Ala Ala Ser Met Phe Glu Asp
Val Tyr Lys Glu Met Pro Glu His 385 390 395 400 Leu Arg Arg Gln Arg
Gln Glu Leu Gly Val 405 410 291233DNAPseudomonas putida
29tcaaaccccc agttcctggc gttgacggcg caggtgttcg ggcatctcct tgtacacatc
60ctcgaacatc gaggcggcgc tcgggatgtg cccgttagcc agggtgccgt actgctcggc
120ttctttctgt gcggcaatca ccgcagcttc gagctcggcc gtgacggctt
ggtgttcttc 180ttcggaccag tggccgatct tgatcaggtg ctgcttcagg
cgggcgatcg ggtcacccag 240cgggaagtgg ctccagtcat cggcagggcg
gtacttggag gggtcgtccg acgtcgagtg 300cgggccggca cggtaggtga
cccactcgat caggcttggg cccaggccgc ggcgggcgcg 360ctcggcagcc
cagcgcgagg cggcgtacac ggcgacgaag tcgttgccgt caacccgcag
420cgaggcaatg ccgcagccca cgccacggcc ggcgaaggtg gtcgactcgc
caccggcgat 480ggcctggaag gtagaaatcg cccactggtt gttgaccaca
ttgaggatca ccggggcgcg 540gtaaacgtgg gcaaaggtga gggcggtgtg
gaagtccgac tcggcggtgg ctccgtcacc 600gatccacgcc gaagcaatct
tggtatcgcc cttgatcgcc gaggccatgg cccagccgac 660tgcctgcacg
aactgggtcg ccaggttgcc gctgatggtg aagaagccgg cttcgcgcac
720cgagtacatg atcggcaact ggcggccctt gagggggtcg cgctcgttgg
acagcagttg 780gcagatcatc tcgaccagcg atacgtcgcg ggccatcagg
atgctttgct ggcggtaggt 840cgggaagcac atgtcggtgc ggttcagcgc
cagcgcctgg ccactgccga tggcttcttc 900gcccaggctt tgcatgtaga
aggacatctt cttctggcgc tgggcaacca ccatgcggct 960gtcgaagatc
cgcgtcttga gcatggcgcg catgccttga cgaaggatct gtgggtcgat
1020gtcttcggcc caggggcctt gcgcatcacc ttgctcgtcg agcacgcgga
ccaggctgta 1080ggacaggtcg gcagtgtcgg cagcatcgac atcgatcgcg
ggtttacggg cttgacctgc 1140atcgttgagg cgcaggtagg aaaaatcggt
ctggcagcct ggccggccgg tgggctcggg 1200cacatgcaaa cgcagggggg
cgtactcgtt cat 123330327PRTListeria monocytogenes 30Met Pro Val Ile
Ser Tyr Ile Asp Ala Ile Thr Met Ala Leu Lys Glu 1 5 10 15 Glu Met
Glu Arg Asp Asp Lys Val Phe Ile Leu Gly Glu Asp Val Gly 20 25 30
Lys Lys Gly Gly Val Phe Lys Ala Thr Ala Gly Leu Tyr Asp Glu Phe 35
40 45 Gly Glu Asp Arg Val Leu Asp Thr Pro Leu Ala Glu Ser Ala Ile
Ala 50 55 60 Gly Val Gly Ile Gly Ala Ala Met Tyr Gly Tyr Arg Pro
Val Ala Glu 65 70 75 80 Met Gln Phe Ala Asp Phe Ile Met Pro Ala Val
Asn Gln Ile Ile Ser 85 90 95 Glu Ala Ala Arg Ile Arg Tyr Arg Ser
Asn Asn Asp Trp Ser Cys Pro 100 105 110 Met Val Ile Arg Ala Pro Phe
Gly Gly Gly Val His Gly Ala Leu Tyr 115 120 125 His Ser Gln Ser Val
Glu Lys Val Phe Phe Gly Gln Pro Gly Leu Lys 130 135 140 Ile Val Val
Pro Ser Ser Pro Tyr Asp Ala Lys Gly Leu Leu Lys Ala 145 150 155 160
Ala Ile Arg Asp Asn Asp Pro Val Leu Phe Phe Glu His Lys Arg Ala 165
170 175 Tyr Arg Leu Leu Lys Gly Glu Val Pro Glu Thr Asp Tyr Ile Val
Pro 180 185 190 Ile Gly Glu Ala Asn Val Val Arg Glu Gly Asp Asp Ile
Thr Val Ile 195 200 205 Thr Tyr Gly Leu Ala Val Gln Phe Ala Gln Gln
Ala Ala Glu Arg Leu 210 215 220 Ala Ala Glu Gly Val Glu Ala His Ile
Leu Asp Leu Arg Thr Ile Tyr 225 230 235 240 Pro Leu Asp Gln Glu Ala
Ile Ile Glu Ala Thr Lys Lys Thr Gly Lys 245 250 255 Val Leu Leu Val
Thr Glu Asp Asn Lys Gln Gly Ser Ile Ile Ser Glu 260 265 270 Val Ala
Ala Ile Ile Ser Glu His Cys Leu Phe Asp Leu Asp Ala Pro 275 280 285
Ile Ala Arg Leu Ala Gly Pro Asp Thr Pro Ala Met Pro Phe Ala Pro 290
295 300 Thr Met Glu Lys His Phe Met Ile Asn Pro Asp Lys Val Ala Asp
Ala 305 310 315 320 Met Lys Glu Leu Ala Glu Phe 325
31984DNAListeria monocytogenes 31atgccagtca tttcatatat tgatgcaata
accatggcgc ttaaagaaga aatggagcgc 60gatgataaag tatttatttt aggagaagat
gttgggaaaa aaggtggcgt atttaaagcg 120actgctggtc tatatgacga
atttggtgaa gacagagtac ttgatacacc acttgctgaa 180tctgccattg
ccggagttgg aattggcgcg gcgatgtatg gctaccgccc agttgcagaa
240atgcaatttg ctgactttat tatgccagct gtcaaccaaa tcatttcaga
agctgccaga 300attcggtacc gttctaataa cgattggtct tgtccaatgg
ttattcgcgc accttttggc 360ggcggggtac acggggcact ttaccattca
caatctgttg aaaaagtgtt tttcggacaa 420cctggtttga aaatcgttgt
tccttcttca ccatatgatg caaaagggct tttaaaagcg 480gcgattcgcg
ataatgatcc agtgcttttc tttgagcata aacgtgcgta ccgcttgcta
540aaaggcgaag tgccagaaac tgattatatc gttccaatcg gcgaagcaaa
tgttgttcgt 600gaaggtgatg atattacagt aattacttac ggacttgcgg
ttcaatttgc ccaacaagca 660gcagaacgtt tagcagcgga aggcgtagaa
gcacatattc ttgatttacg gacaatctat 720ccactagacc aagaagcaat
tattgaagca acgaaaaaaa caggtaaagt acttcttgta 780acggaagata
acaaacaagg aagtattatc agtgaagtgg cagcaatcat ttcggagcat
840tgtttatttg acttagacgc accgattgct agactcgcag gacctgatac
cccagcgatg 900ccttttgctc caacaatgga aaaacatttt atgatcaatc
cagataaagt ggcggatgca 960atgaaagaat tagcggaatt ttag
98432334PRTStreptomyces avermitilis 32Met Thr Thr Val Ala Leu Lys
Pro Ala Thr Met Ala Gln Ala Leu Thr 1 5 10 15 Arg Ala Leu Arg Asp
Ala Met Ala Ala Asp Pro Ala Val His Val Met 20 25 30 Gly Glu Asp
Val Gly Thr Leu Gly Gly Val Phe Arg Val Thr Asp Gly 35 40 45 Leu
Ala Lys Glu Phe Gly Glu Asp Arg Cys Thr Asp Thr Pro Leu Ala 50 55
60 Glu Ala Gly Ile Leu Gly Thr Ala Val Gly Met Ala Met Tyr Gly Leu
65 70 75 80 Arg Pro Val Val Glu Met Gln Phe Asp Ala Phe Ala Tyr Pro
Ala Phe 85 90 95 Glu Gln Leu Ile Ser His Val Ala Arg Met Arg Asn
Arg Thr Arg Gly 100 105 110 Ala Met Pro Leu Pro Ile Thr Ile Arg Val
Pro Tyr Gly Gly Gly Ile 115 120 125 Gly Gly Val Glu His His Ser Asp
Ser Ser Glu Ala Tyr Tyr Met Ala 130 135 140 Thr Pro Gly Leu His Val
Val Thr Pro Ala Thr Val Ala Asp Ala Tyr 145 150 155 160 Gly Leu Leu
Arg Ala Ala Ile Ala Ser Asp Asp Pro Val Val Phe Leu 165 170 175 Glu
Pro Lys Arg Leu Tyr Trp Ser Lys Asp Ser Trp Asn Pro Asp Glu 180 185
190 Pro Gly Thr Val Glu Pro Ile Gly Arg Ala Val Val Arg Arg Ser Gly
195 200 205 Arg Ser Ala Thr Leu Ile Thr Tyr Gly Pro Ser Leu Pro Val
Cys Leu 210 215 220 Glu Ala Ala Glu Ala Ala Arg Ala Glu Gly Trp Asp
Leu Glu Val Val 225 230 235 240 Asp Leu Arg Ser Leu Val Pro Phe Asp
Asp Glu Thr Val Cys Ala Ser 245 250 255 Val Arg Arg Thr Gly Arg Ala
Val Val Val His Glu Ser Gly Gly Tyr 260 265 270 Gly Gly Pro Gly Gly
Glu Ile Ala Ala Arg Ile Thr Glu Arg Cys Phe 275 280 285 His His Leu
Glu Ala Pro Val Leu Arg Val Ala Gly Phe Asp Ile Pro 290 295 300
Tyr Pro Pro Pro Met Leu Glu Arg His His Leu Pro Gly Val Asp Arg 305
310 315 320 Ile Leu Asp Ala Val Gly Arg Leu Gln Trp Glu Ala Gly Ser
325 330 331005DNAStreptomyces avermitilis 33atgaccaccg ttgccctcaa
gccggccacc atggcgcagg cactcacacg cgcgttgcgt 60gacgccatgg ccgccgaccc
cgccgtccac gtgatgggcg aggacgtcgg cacgctcggc 120ggggtcttcc
gggtcaccga cgggctcgcc aaggagttcg gcgaggaccg ctgcacggac
180acgccgctcg ccgaggcagg catcctcggc acggccgtcg gcatggcgat
gtacgggctg 240cggccggtcg tcgagatgca gttcgacgcg ttcgcgtacc
cggcgttcga gcagctcatc 300agccatgtcg cgcggatgcg caaccgcacc
cgcggggcga tgccgctgcc gatcaccatc 360cgtgtcccct acggcggcgg
aatcggcgga gtcgaacacc acagcgactc ctccgaggcg 420tactacatgg
cgactccggg gctccatgtc gtcacgcccg ccacggtcgc cgacgcgtac
480gggctgctgc gcgccgccat cgcctccgac gacccggtcg tcttcctgga
gcccaagcgg 540ctgtactggt cgaaggactc ctggaacccg gacgagccgg
ggaccgttga accgataggc 600cgcgcggtgg tgcggcgctc gggccggagc
gccacgctca tcacgtacgg gccttccctg 660cccgtctgcc tggaggcggc
cgaggcggcc cgggccgagg gctgggacct cgaagtcgtc 720gatctgcgct
ccctggtgcc cttcgacgac gagacggtgt gcgcgtcggt gcgccggacc
780ggacgcgccg tcgtcgtgca cgagtcgggt ggttacggcg gcccgggcgg
ggagatcgcc 840gcgcggatca ccgagcgctg cttccaccat ctggaggcgc
cggtgctgcg cgtcgccggg 900ttcgacatcc cgtatccgcc gccgatgctg
gagcgccatc atctgcccgg tgtcgaccgg 960atcctggacg cggtggggcg
gcttcagtgg gaggcgggga gctga 100534355PRTMicrococcus luteus 34Met
Ser Glu Arg Met Thr Phe Gly Arg Ala Ile Asn Arg Gly Leu His 1 5 10
15 Arg Ala Leu Ala Asp Asp Pro Lys Val Leu Leu Met Gly Glu Asp Ile
20 25 30 Gly Ala Leu Gly Gly Val Phe Arg Ile Thr Asp Gly Leu Gln
Ala Glu 35 40 45 Phe Gly Glu Asp Arg Val Leu Asp Thr Pro Leu Ala
Glu Ser Gly Ile 50 55 60 Val Gly Thr Ala Ile Gly Leu Ala Met Arg
Gly Tyr Arg Pro Val Val 65 70 75 80 Glu Ile Gln Phe Asp Gly Phe Val
Tyr Pro Ala Phe Asp Gln Ile Val 85 90 95 Ala Asn Leu Ala Lys Leu
Arg Ala Arg Thr Arg Gly Ala Val Pro Met 100 105 110 Pro Val Thr Ile
Arg Ile Pro Phe Gly Gly Gly Ile Gly Ser Pro Glu 115 120 125 His His
Ser Glu Ser Pro Glu Ala Tyr Phe Leu His Thr Ala Gly Leu 130 135 140
Arg Val Val Ser Pro Ser Ser Pro Gln Glu Gly Tyr Asp Leu Ile Arg 145
150 155 160 Ala Ala Ile Ala Ser Glu Asp Pro Val Val Tyr Leu Glu Pro
Lys Arg 165 170 175 Arg Tyr His Asp Lys Gly Asp Val Asp Leu Gly Val
Ala Ile Pro Pro 180 185 190 Met Ser Pro Ala Arg Ile Leu Arg Glu Gly
Arg Asp Ala Thr Leu Val 195 200 205 Ala Tyr Gly Pro Leu Val Lys Thr
Ala Leu Gln Ala Ala Glu Val Ala 210 215 220 Ala Glu Glu Gly Val Glu
Val Glu Val Val Asp Leu Arg Ser Leu Ser 225 230 235 240 Pro Leu Asp
Thr Gly Leu Val Glu Ser Ser Val Arg Arg Thr Gly Arg 245 250 255 Leu
Val Val Ala His Glu Ala Ser Arg Thr Gly Gly Leu Gly Ala Glu 260 265
270 Leu Val Ala Thr Val Ala Glu Arg Ala Phe His Trp Leu Glu Ala Pro
275 280 285 Pro Val Arg Val Thr Gly Met Asp Val Pro Tyr Pro Pro Ser
Lys Leu 290 295 300 Glu His Leu His Leu Pro Asp Leu Asp Arg Ile Leu
Asp Gly Leu Asp 305 310 315 320 Arg Ala Leu Gly Arg Pro Asn Ser Leu
Asp Ser Val Asp Ala Phe Ala 325 330 335 Ala Pro Glu Thr Ala Glu Gln
Phe Leu Ala Ala Gln Asn Ala Gly Glu 340 345 350 Glu Thr Arg 355
351068DNAMicrococcus luteus 35gtgagcgagc gcatgacctt cggccgtgcg
atcaaccgcg gcctgcaccg tgccctggcc 60gacgacccca aggtcctgct catgggcgag
gacatcggcg ccctcggcgg cgtgttccgc 120atcaccgacg gcctgcaggc
cgagttcggc gaggaccggg tgctcgacac cccgctggcc 180gagtccggca
tcgtgggcac ggccatcggc ctggcgatgc gcggctaccg gcccgtcgtc
240gagatccagt tcgacggctt cgtgtacccg gcgttcgacc agatcgtggc
gaacctggcc 300aagctgcgcg cccgcacccg cggcgccgtg ccgatgccgg
tgaccatccg catccccttc 360ggcggcggca tcggctcccc ggagcaccac
tccgagtcgc ccgaggccta cttcctgcac 420accgcgggtc tgcgcgtggt
ctccccgtcc tccccgcagg aggggtacga cctcatccgc 480gccgcgatcg
cctcggagga cccggtggtc tacctcgagc ccaagcgtcg ctaccacgac
540aagggcgacg tggacctggg cgtcgcgatc ccgccgatga gcccggcccg
catcctgcgc 600gagggccgtg acgccacgct cgtggcctac ggcccgctcg
tgaagaccgc cctgcaggcc 660gccgaggtgg cggccgagga gggtgtcgag
gtcgaggtgg tcgacctgcg cagcctgtcc 720ccgctggaca ccggcctcgt
cgagtcctcg gtgcggcgca ccggtcggct cgtcgtggcg 780cacgaggcct
cccgcacggg cggcctcggc gccgagctcg tggccacggt ggccgagcgc
840gcgttccatt ggctcgaggc cccgccggtg cgcgtcaccg gcatggacgt
gccctacccg 900ccgtccaagc tcgagcacct gcacctgccg gacctcgacc
gcatcctcga cggcctggac 960cgtgctctgg gccggccgaa ttcgctggac
tccgtggacg cgttcgccgc ccccgagacc 1020gccgagcagt tcctcgccgc
ccagaacgcc ggggaggaga cccgatga 106836327PRTStaphylococcus aureus
36Met Ala Lys Leu Ser Tyr Leu Glu Ala Ile Arg Gln Ala Gln Asp Leu 1
5 10 15 Ala Leu Gln Gln Asn Lys Asp Val Phe Ile Leu Gly Glu Asp Val
Gly 20 25 30 Lys Lys Gly Gly Val Phe Gly Thr Thr Gln Gly Leu Gln
Gln Gln Tyr 35 40 45 Gly Glu Asp Arg Val Ile Asp Thr Pro Leu Ala
Glu Ser Asn Ile Val 50 55 60 Gly Thr Ala Ile Gly Ala Ala Met Val
Gly Lys Arg Pro Ile Ala Glu 65 70 75 80 Ile Gln Phe Ala Asp Phe Ile
Leu Pro Ala Thr Asn Gln Ile Ile Ser 85 90 95 Glu Ala Ala Lys Met
Arg Tyr Arg Ser Asn Asn Asp Trp Gln Cys Pro 100 105 110 Leu Thr Ile
Arg Ala Pro Phe Gly Gly Gly Val His Gly Gly Leu Tyr 115 120 125 His
Ser Gln Ser Ile Glu Ser Ile Phe Ala Ser Ser Pro Gly Leu Thr 130 135
140 Ile Val Ile Pro Ser Thr Pro Tyr Asp Ala Lys Gly Leu Leu Leu Ser
145 150 155 160 Ser Ile Glu Ser Asn Asp Pro Val Leu Tyr Phe Glu His
Lys Lys Ala 165 170 175 Tyr Arg Phe Leu Lys Glu Glu Val Pro Glu Glu
Tyr Tyr Thr Val Pro 180 185 190 Leu Gly Lys Ala Asp Val Lys Arg Glu
Gly Glu Asp Leu Thr Val Phe 195 200 205 Cys Tyr Gly Leu Met Val Asn
Tyr Cys Leu Gln Ala Ala Asp Ile Leu 210 215 220 Ala Ala Asp Gly Ile
Asn Val Glu Val Val Asp Leu Arg Thr Val Tyr 225 230 235 240 Pro Leu
Asp Lys Glu Thr Ile Ile Asp Arg Ala Lys Asn Thr Gly Lys 245 250 255
Val Leu Leu Val Thr Glu Asp Asn Leu Glu Gly Ser Ile Met Ser Glu 260
265 270 Val Ser Ala Ile Ile Ala Glu His Cys Leu Phe Asp Leu Asp Thr
Pro 275 280 285 Ile Met Arg Leu Ala Ala Pro Asp Val Pro Ser Met Pro
Phe Ser Pro 290 295 300 Val Leu Glu Asn Glu Ile Met Met Asn Pro Glu
Lys Ile Leu Asn Lys 305 310 315 320 Met Arg Glu Leu Ala Glu Phe 325
37984DNAStaphylococcus aureus 37ctagaattct gctaattcac gcattttatt
taagattttt tctggattca tcataatttc 60attttctaat acaggagaaa atggcataga
tggtacatct ggagcagcta aacgcatgat 120tggtgtatct aaatcgaaca
agcaatgctc tgcaataatc gctgacactt ctgacataat 180actaccttct
aaattatctt cagttacaag taaaacttta cctgtatttt tagcacgatc
240aataattgtt tctttatcta atggataaac agttcgtaaa tcaacgactt
caacgttgat 300accgtctgca gctaaaatat ccgctgcttg taaacaataa
ttgaccatta atccataaca 360aaatactgtt aaatcttcac cttcacgttt
aacatctgct tttcctaaag gtacagtgta 420atattcttct ggcacttctt
cctttaagaa acgataagct tttttatgct caaagtacaa 480tactggatca
tttgattcga tagatgataa taaaagccct ttagcatcat acggtgtgga
540aggaataaca attgttaaac ctggtgatga agcaaatata ctttcaatac
tttgtgaatg 600atatagtcct ccgtgaacac cgccaccaaa tggtgcacga
atcgttaatg ggcattgcca 660atcattattt gaacgataac gcattttcgc
agcttcacta ataatttgat ttgtcgcagg 720taaaataaaa tctgcaaatt
gaatttctgc aattggtctt ttacctacca tagctgcacc 780aatggcagtt
ccaacaatat ttgactcagc taatggcgta tcgataactc tgtcttcacc
840atattgttgt tgtagtcctt gagtagtacc aaatacgcca ccttttttac
caacatcttc 900accaagaata aacacatctt tattttgttg taatgctaag
tcttgtgcct ggcgtatcgc 960ctctaaataa gataatttag ccat
98438337PRTStreptococcus mutans 38Met Arg Arg Lys Arg Tyr Met Ser
Glu Thr Lys Val Val Ala Leu Arg 1 5 10 15 Glu Ala Ile Asn Leu Ala
Met Ser Glu Glu Met Arg Lys Asp Glu Lys 20 25 30 Ile Ile Leu Met
Gly Glu Asp Val Gly Ile Tyr Gly Gly Asp Phe Gly 35 40 45 Thr Ser
Val Gly Met Leu Ala Glu Phe Gly Glu Lys Arg Val Lys Asp 50 55 60
Thr Pro Ile Ser Glu Ala Ala Ile Ala Gly Ser Ala Val Gly Ala Ala 65
70 75 80 Gln Thr Gly Leu Arg Pro Ile Val Asp Leu Thr Phe Met Asp
Phe Val 85 90 95 Thr Ile Ala Met Asp Ala Ile Val Asn Gln Gly Ala
Lys Ala Asn Tyr 100 105 110 Met Phe Gly Gly Gly Leu Lys Thr Pro Val
Thr Phe Arg Val Ala Ser 115 120 125 Gly Ser Gly Ile Gly Ser Ala Ala
Gln His Ser Gln Ser Leu Glu Ala 130 135 140 Trp Leu Thr His Ile Pro
Gly Ile Lys Val Val Ala Pro Gly Thr Val 145 150 155 160 Asn Asp Ala
Lys Ala Leu Leu Lys Ser Ala Ile Arg Asp Asn Asn Ile 165 170 175 Val
Ile Phe Met Glu Pro Lys Ala Leu Tyr Gly Lys Lys Glu Glu Val 180 185
190 Asn Leu Asp Pro Asp Phe Tyr Ile Pro Leu Gly Lys Gly Glu Ile Lys
195 200 205 Arg Glu Gly Thr Asp Val Thr Ile Val Ser Tyr Gly Arg Met
Leu Glu 210 215 220 Arg Val Leu Lys Ala Ala Glu Glu Val Ala Ala Glu
Asp Ile Ser Val 225 230 235 240 Glu Val Val Asp Pro Arg Thr Leu Ile
Pro Leu Asp Lys Asp Leu Ile 245 250 255 Ile Asn Ser Val Lys Lys Thr
Gly Lys Val Ile Leu Val Asn Asp Ala 260 265 270 Tyr Lys Thr Gly Gly
Phe Ile Gly Glu Ile Ala Ser Val Ile Thr Glu 275 280 285 Ser Glu Ala
Phe Asp Tyr Leu Asp Ala Pro Val Leu Arg Leu Ala Ser 290 295 300 Glu
Asp Val Pro Val Pro Tyr Ser His Val Leu Glu Thr Ala Ile Leu 305 310
315 320 Pro Asp Val Ala Lys Ile Lys Glu Ala Ile Tyr Lys Gln Val Arg
Lys 325 330 335 Arg 391014DNAStreptococcus mutans 39atgaggagaa
agagatatat gtcagaaaca aaagtagtag ccttacggga agctatcaat 60cttgctatga
gcgaggaaat gcgtaaggac gaaaaaatta ttttgatggg tgaagatgtc
120ggtatttatg gtggtgactt tggaacttct gttggtatgc tggctgaatt
tggtgaaaag 180cgtgttaaag atacccctat ttcagaagca gccattgcag
gatctgcagt aggtgccgct 240caaactggac ttcgtcctat tgttgatttg
acctttatgg actttgtgac tattgccatg 300gatgctattg ttaatcaagg
tgctaaagcc aattatatgt ttggcggcgg acttaaaacg 360cctgtaacct
ttcgtgtggc ctcaggctca ggtatcggct cagcagcgca gcattctcag
420tcactagaag cttggttaac tcatattccg ggaatcaagg tggttgcgcc
tggcacagtc 480aatgatgcta aagccttgct caaatctgct attcgtgata
ataatatcgt tattttcatg 540gaaccaaaag cgctttatgg caaaaaagaa
gaggtcaatt tagatcctga tttttatatt 600ccgcttggta aaggcgaaat
taagcgcgag ggaacagatg ttaccattgt gtcttatggt 660cgtatgctgg
aacgcgttct caaagccgct gaggaagtgg cggctgaaga tatcagtgtt
720gaagttgttg acccgcgtac ccttattccg cttgataaag acttaattat
taattctgtg 780aaaaagacgg gtaaggttat cctagttaat gatgcttata
aaacaggtgg tttcattggt 840gaaatagcat cagtgattac tgaaagcgaa
gcatttgatt atttagatgc accagtgctt 900cgtctcgctt ctgaggatgt
gcctgttccc tattctcatg ttctcgaaac agccatttta 960ccagatgtgg
caaaaattaa agaagctatc tataaacaag tcaggaaaag atag
10144021PRTArtificial sequenceSynthetic polypeptide 40Val Xaa Xaa
Xaa Gly Xaa Asp Val Gly Xaa Xaa Gly Gly Val Phe Xaa 1 5 10 15 Xaa
Thr Xaa Gly Ile 20 4116PRTArtificial sequenceSynthetic polypeptide
41Xaa Gly Xaa Xaa Arg Xaa Xaa Asp Xaa Pro Xaa Xaa Glu Xaa Xaa Ile 1
5 10 15 4215PRTArtificial sequenceSynthetic polypeptide 42Gly Thr
Xaa Xaa Xaa Gly Xaa Arg Pro Xaa Xaa Glu Xaa Gln Phe 1 5 10 15
4319PRTArtificial sequenceSynthetic polypeptide 43Pro Xaa Gly Gly
Xaa Xaa Xaa Xaa Xaa Xaa His Ser Xaa Ser Xaa Glu 1 5 10 15 Ala Xaa
Xaa 4413PRTArtificial sequenceSynthetic polypeptide 44Xaa Asp Pro
Val Xaa Xaa Xaa Glu Xaa Lys Arg Xaa Tyr 1 5 10 4512PRTArtificial
sequenceSynthetic polypeptide 45Xaa Val Xaa Asp Leu Arg Xaa Xaa Xaa
Pro Xaa Asp 1 5 10 4619PRTArtificial sequenceSynthetic polypeptide
46Glu Xaa Cys Xaa Xaa Xaa Leu Xaa Ala Pro Xaa Xaa Arg Xaa Xaa Gly 1
5 10 15 Xaa Xaa Pro 47424PRTBacillus subtilis 47Met Ala Ile Glu
Gln Met Thr Met Pro Gln Leu Gly Glu Ser Val Thr 1 5 10 15 Glu Gly
Thr Ile Ser Lys Trp Leu Val Ala Pro Gly Asp Lys Val Asn 20 25 30
Lys Tyr Asp Pro Ile Ala Glu Val Met Thr Asp Lys Val Asn Ala Glu 35
40 45 Val Pro Ser Ser Phe Thr Gly Thr Ile Thr Glu Leu Val Gly Glu
Glu 50 55 60 Gly Gln Thr Leu Gln Val Gly Glu Met Ile Cys Lys Ile
Glu Thr Glu 65 70 75 80 Gly Ala Asn Pro Ala Glu Gln Lys Gln Glu Gln
Pro Ala Ala Ser Glu 85 90 95 Ala Ala Glu Asn Pro Val Ala Lys Ser
Ala Gly Ala Ala Asp Gln Pro 100 105 110 Asn Lys Lys Arg Tyr Ser Pro
Ala Val Leu Arg Leu Ala Gly Glu His 115 120 125 Gly Ile Asp Leu Asp
Gln Val Thr Gly Thr Gly Ala Gly Gly Arg Ile 130 135 140 Thr Arg Lys
Asp Ile Gln Arg Leu Ile Glu Thr Gly Gly Val Gln Glu 145 150 155 160
Gln Asn Pro Glu Glu Leu Lys Thr Ala Ala Pro Ala Pro Lys Ser Ala 165
170 175 Ser Lys Pro Glu Pro Lys Glu Glu Thr Ser Tyr Pro Ala Ser Ala
Ala 180 185 190 Gly Asp Lys Glu Ile Pro Val Thr Gly Val Arg Lys Ala
Ile Ala Ser 195 200 205 Asn Met Lys Arg Ser Lys Thr Glu Ile Pro His
Ala Trp Thr Met Met 210 215 220 Glu Val Asp Val Thr Asn Met Val Ala
Tyr Arg Asn Ser Ile Lys Asp 225 230 235 240 Ser Phe Lys Lys Thr Glu
Gly Phe Asn Leu Thr Phe Phe Ala Phe Phe 245 250 255 Val Lys Ala Val
Ala Gln Ala Leu Lys Glu Phe Pro Gln Met Asn Ser 260 265 270 Met Trp
Ala Gly Asp Lys Ile Ile Gln Lys Lys Asp Ile Asn Ile Ser 275 280 285
Ile Ala Val Ala Thr Glu Asp Ser Leu Phe Val Pro Val Ile Lys Asn 290
295 300 Ala Asp Glu Lys Thr Ile Lys Gly Ile Ala Lys Asp Ile Thr Gly
Leu 305 310 315 320 Ala Lys Lys Val Arg Asp Gly Lys Leu Thr Ala Asp
Asp Met Gln Gly 325 330 335 Gly Thr Phe Thr Val Asn Asn Thr Gly Ser
Phe Gly Ser Val Gln Ser 340 345 350 Met Gly Ile Ile Asn Tyr Pro Gln
Ala Ala Ile Leu Gln Val Glu Ser 355 360 365 Ile Val Lys Arg Pro Val
Val Met Asp Asn Gly Met Ile Ala Val Arg 370 375 380 Asp Met Val Asn
Leu Cys Leu Ser Leu Asp His Arg Val Leu Asp Gly 385 390 395 400 Leu
Val Cys Gly Arg Phe Leu Gly Arg Val Lys Gln Ile Leu Glu Ser 405 410
415 Ile Asp Glu Lys Thr Ser Val Tyr 420
481275DNABacillus subtilis 48ttagtaaaca gatgtcttct cgtcaatcga
ttctaaaatt tgtttcactc gtccgaggaa 60tcgtccgcac acgagaccgt caagcactct
gtgatctaat gacaggcaca gattaaccat 120gtctctgaca gcaatcatgc
cattgtccat gacaaccggg cgtttgacga tggattctac 180ttgaagaatc
gcagcctgag ggtagttgat aatgcccatc gactgaacag acccgaacga
240acctgtgttg ttgacggtaa acgtgcctcc ctgcatgtca tctgcagtga
gttttccgtc 300tcttactttt ttagctaggc cggtaatgtc tttcgcaatg
cctttaattg ttttttcatc 360agcgttttta atcaccggaa caaataaaga
atcctctgtg gcaactgcaa ttgaaatatt 420gatatccttt ttctgaataa
ttttgtcccc cgcccacatg ctattcattt gcgggaattc 480ttttaacgcc
tgagcgaccg cttttacaaa aaaggcgaag aacgttaaat taaagccttc
540tgtcttctta aaagaatctt ttatactgtt gcgatatgca accatatttg
tgacgtcgac 600ttccatcatc gtccaagcat gcggaatttc tgttttgctt
cgcttcatat tggaagcaat 660tgcttttctt acacctgtga cagggatttc
tttatcaccg gctgcagacg caggatatga 720cgtctcttct tttggctcag
gttttgatgc agacttcggt gcaggagctg ctgttttcag 780ctcctcagga
ttctgttctt gcacgccgcc tgtttcaatt aagcgctgaa tatcttttcg
840tgtgatgcgc ccgccggcac cagttcctgt cacttgatcg aggtcaatgc
cgtgctctcc 900ggccaaacgg agaacagctg gcgagtagcg ctttttattg
ggctgatcgg ctgctccagc 960actttttgca acagggttct cagcggcttc
tgatgctgct ggctgttctt gtttttgttc 1020agccggattc gcgccttctg
tttcaatttt gcaaatcatt tctccgactt gcagggtttg 1080gccttcttct
cccacaagct ctgttatcgt accagtaaaa gaagacggaa cctctgcatt
1140taccttatct gtcatgactt ccgcgatcgg atcgtatttg ttcactttat
caccgggggc 1200gacaagccat ttgctgatcg tcccctctgt tacgctttct
ccaagctgcg gcatcgtcat 1260ttgttcaatt gccat 127549462PRTStreptomyces
avermitilis 49Met Thr Glu Ala Ser Val Arg Glu Phe Lys Met Pro Asp
Val Gly Glu 1 5 10 15 Gly Leu Thr Glu Ala Glu Ile Leu Lys Trp Tyr
Val Gln Pro Gly Asp 20 25 30 Thr Val Thr Asp Gly Gln Val Val Cys
Glu Val Glu Thr Ala Lys Ala 35 40 45 Ala Val Glu Leu Pro Ile Pro
Tyr Asp Gly Val Val Arg Glu Leu Arg 50 55 60 Phe Pro Glu Gly Thr
Thr Val Asp Val Gly Gln Val Ile Ile Ala Val 65 70 75 80 Asp Val Ala
Gly Asp Ala Pro Val Ala Glu Ile Pro Val Pro Ala Gln 85 90 95 Glu
Ala Pro Val Gln Glu Glu Pro Lys Pro Glu Gly Arg Lys Pro Val 100 105
110 Leu Val Gly Tyr Gly Val Ala Glu Ser Ser Thr Lys Arg Arg Pro Arg
115 120 125 Lys Ser Ala Pro Ala Ser Glu Pro Ala Ala Glu Gly Thr Tyr
Phe Ala 130 135 140 Ala Thr Val Leu Gln Gly Ile Gln Gly Glu Leu Asn
Gly His Gly Ala 145 150 155 160 Val Lys Gln Arg Pro Leu Ala Lys Pro
Pro Val Arg Lys Leu Ala Lys 165 170 175 Asp Leu Gly Val Asp Leu Ala
Thr Ile Thr Pro Ser Gly Pro Asp Gly 180 185 190 Val Ile Thr Arg Glu
Asp Val His Ala Ala Val Ala Pro Pro Pro Pro 195 200 205 Ala Pro Gln
Pro Val Gln Thr Pro Ala Ala Pro Ala Pro Ala Pro Val 210 215 220 Ala
Ala Tyr Asp Thr Ala Arg Glu Thr Arg Val Pro Val Lys Gly Val 225 230
235 240 Arg Lys Ala Thr Ala Ala Ala Met Val Gly Ser Ala Phe Thr Ala
Pro 245 250 255 His Val Thr Glu Phe Val Thr Val Asp Val Thr Arg Thr
Met Lys Leu 260 265 270 Val Glu Glu Leu Lys Gln Asp Lys Glu Phe Thr
Gly Leu Arg Val Asn 275 280 285 Pro Leu Leu Leu Ile Ala Lys Ala Leu
Leu Val Ala Ile Lys Arg Asn 290 295 300 Pro Asp Ile Asn Ala Ser Trp
Asp Glu Ala Asn Gln Glu Ile Val Leu 305 310 315 320 Lys His Tyr Val
Asn Leu Gly Ile Ala Ala Ala Thr Pro Arg Gly Leu 325 330 335 Ile Val
Pro Asn Ile Lys Asp Ala His Ala Lys Thr Leu Pro Gln Leu 340 345 350
Ala Glu Ser Leu Gly Glu Leu Val Ser Thr Ala Arg Glu Gly Lys Thr 355
360 365 Ser Pro Thr Ala Met Gln Gly Gly Thr Val Thr Ile Thr Asn Val
Gly 370 375 380 Val Phe Gly Val Asp Thr Gly Thr Pro Ile Leu Asn Pro
Gly Glu Ser 385 390 395 400 Ala Ile Leu Ala Val Gly Ala Ile Lys Leu
Gln Pro Trp Val His Lys 405 410 415 Gly Lys Val Lys Pro Arg Gln Val
Thr Thr Leu Ala Leu Ser Phe Asp 420 425 430 His Arg Leu Val Asp Gly
Glu Leu Gly Ser Lys Val Leu Ala Asp Val 435 440 445 Ala Ala Ile Leu
Glu Gln Pro Lys Arg Leu Ile Thr Trp Ala 450 455 460
501389DNAStreptomyces avermitilis 50atgactgagg cgtccgtgcg
tgagttcaag atgcccgatg tgggtgaggg actcaccgag 60gccgagatcc tcaagtggta
cgtccagccc ggcgacaccg tcaccgacgg ccaggtcgtc 120tgcgaggtcg
agaccgcgaa ggcggccgtg gaactcccca ttccgtacga cggtgtcgta
180cgcgaactcc gtttccccga ggggacgacg gtggacgtgg gacaggtgat
catcgcggtg 240gacgtggccg gcgacgcacc ggtggcggag atccccgtgc
ccgcgcagga ggctccggtc 300caggaggagc ccaagcccga gggccgcaag
cccgtcctcg tgggctacgg ggtggccgag 360tcctccacca agcgccgtcc
gcgcaagagc gcgccggcga gcgagcccgc tgcggagggc 420acgtacttcg
cagcgaccgt tctccagggc atccagggcg agctgaacgg acacggcgcg
480gtgaagcagc gtccgctggc gaagccgccg gtgcgcaagc tggccaagga
cctgggcgtc 540gacctcgcga cgatcacgcc gtcgggcccc gacggcgtca
tcacgcgcga ggacgtgcac 600gcggcggtgg cgccaccgcc gccggcaccc
cagcccgtgc agacgcccgc tgccccggcc 660ccggcgccgg tggccgcgta
cgacacggct cgtgagaccc gtgtccccgt caagggcgtc 720cgcaaggcga
cggcggcggc gatggtcggc tcggcgttca cggcgccgca cgtcacggag
780ttcgtgacgg tggacgtgac gcgcacgatg aagctggtcg aggagctgaa
gcaggacaag 840gagttcaccg gcctgcgggt gaacccgctg ctcctcatcg
ccaaggcgct cctggtcgcg 900atcaagcgga acccggacat caacgcgtcc
tgggacgagg cgaaccagga gatcgtcctc 960aagcactatg tgaacctggg
catcgcggcg gccaccccgc gcggtctgat cgtcccgaac 1020atcaaggacg
cccacgccaa gacgctgccg caactggccg agtcactggg tgagttggtg
1080tcgacggccc gcgagggcaa gacgtccccg acggccatgc agggcggcac
ggtcacgatc 1140acgaacgtcg gcgtcttcgg cgtcgacacg ggcacgccga
tcctcaaccc cggcgagtcc 1200gcgatcctcg cggtcggcgc gatcaagctc
cagccgtggg tccacaaggg caaggtcaag 1260ccccgacagg tcaccacgct
ggcgctcagc ttcgaccatc gcctggtcga cggcgagctg 1320ggctccaagg
tgctggccga cgtggcggcg atcctggagc agccgaagcg gctgatcacc
1380tgggcctag 138951423PRTPseudomonas putida 51Met Gly Thr His Val
Ile Lys Met Pro Asp Ile Gly Glu Gly Ile Ala 1 5 10 15 Gln Val Glu
Leu Val Glu Trp Phe Val Lys Val Gly Asp Ile Ile Ala 20 25 30 Glu
Asp Gln Val Val Ala Asp Val Met Thr Asp Lys Ala Thr Val Glu 35 40
45 Ile Pro Ser Pro Val Ser Gly Lys Val Leu Ala Leu Gly Gly Gln Pro
50 55 60 Gly Glu Val Met Ala Val Gly Ser Glu Leu Ile Arg Ile Glu
Val Glu 65 70 75 80 Gly Ser Gly Asn His Val Asp Val Pro Gln Pro Lys
Pro Val Glu Ala 85 90 95 Pro Ala Ala Pro Ile Ala Ala Lys Pro Glu
Pro Gln Lys Asp Val Lys 100 105 110 Pro Ala Val Tyr Gln Ala Pro Ala
Asn His Glu Ala Ala Pro Ile Val 115 120 125 Pro Arg Gln Pro Gly Asp
Lys Pro Leu Ala Ser Pro Ala Val Arg Lys 130 135 140 Arg Ala Leu Asp
Ala Gly Ile Glu Leu Arg Tyr Val His Gly Ser Gly 145 150 155 160 Pro
Ala Gly Arg Ile Leu His Glu Asp Leu Asp Ala Phe Met Ser Lys 165 170
175 Pro Gln Ser Asn Ala Gly Gln Ala Pro Asp Gly Tyr Ala Lys Arg Thr
180 185 190 Asp Ser Glu Gln Val Pro Val Ile Gly Leu Arg Arg Lys Ile
Ala Gln 195 200 205 Arg Met Gln Asp Ala Lys Arg Arg Val Ala His Phe
Ser Tyr Val Glu 210 215 220 Glu Ile Asp Val Thr Ala Leu Glu Ala Leu
Arg Gln Gln Leu Asn Ser 225 230 235 240 Lys His Gly Asp Ser Arg Gly
Lys Leu Thr Leu Leu Pro Phe Leu Val 245 250 255 Arg Ala Leu Val Val
Ala Leu Arg Asp Phe Pro Gln Ile Asn Ala Thr 260 265 270 Tyr Asp Asp
Glu Ala Gln Ile Ile Thr Arg His Gly Ala Val His Val 275 280 285 Gly
Ile Ala Thr Gln Gly Asp Asn Gly Leu Met Val Pro Val Leu Arg 290 295
300 His Ala Glu Ala Gly Ser Leu Trp Ala Asn Ala Gly Glu Ile Ser Arg
305 310 315 320 Leu Ala Asn Ala Ala Arg Asn Asn Lys Ala Ser Arg Glu
Glu Leu Ser 325 330 335 Gly Ser Thr Ile Thr Leu Thr Ser Leu Gly Ala
Leu Gly Gly Ile Val 340 345 350 Ser Thr Pro Val Val Asn Thr Pro Glu
Val Ala Ile Val Gly Val Asn 355 360 365 Arg Met Val Glu Arg Pro Val
Val Ile Asp Gly Gln Ile Val Val Arg 370 375 380 Lys Met Met Asn Leu
Ser Ser Ser Phe Asp His Arg Val Val Asp Gly 385 390 395 400 Met Asp
Ala Ala Leu Phe Ile Gln Ala Val Arg Gly Leu Leu Glu Gln 405 410 415
Pro Ala Cys Leu Phe Val Glu 420 521272DNAPseudomonas putida
52tcactccacg aacaggcagg cgggttgttc gagcaggcca cgcacggcct ggatgaacag
60ggcggcgtcc atgccatcga ccacgcggtg gtcgaacgag ctggacaggt tcatcatctt
120gcgcacgacg atctggccat caatcaccac cggtcgttcg accatgcggt
tgaccccgac 180gattgccact tccggggtgt tgaccaccgg cgtgctgaca
atgccaccca aggcgccgag 240gctggtcagg gtgatggtcg agccggacag
ctcctcgcgg ctggccttgt tgttacgtgc 300agcgttggcc aggcgcgaaa
tctcgccggc attggcccac aggctgcccg cttcggcgtg 360gcgcagcacg
ggtaccatca ggccgttgtc accctgggtg gcaatgccca catgcaccgc
420gccatggcgg gtgatgatct gcgcttcgtc gtcgtaggtc gcgttgatct
gcgggaagtc 480acgcagcgcc acgacgaggg cgcgcaccag gaatggcagc
aaggtcagtt tgccgcggct 540gtcgccgtgc ttgctgttga gttgctggcg
cagggcttcc agggcggtga cgtcgatttc 600ctcgacataa ctgaagtgcg
cgacccggcg tttggcgtcc tgcatgcgct gggcgatctt 660gcggcgcagg
ccgatcaccg gcacctgctc gctgtcggtg cgcttggcat aaccatcagg
720tgcttgcccg gcattgcttt gcggcttgct catgaaggcg tcgaggtctt
cgtgcagaat 780gcgcccggcc gggccgctac catgcacata acgcagttcg
ataccggcgt ccagggcgcg 840tttgcgcacg gccggcgagg ccagcggctt
gtcgcccggc tggcgcggca cgatgggcgc 900agcttcgtgg ttggcgggcg
cctggtacac ggcgggtttt acgtctttct gcggttccgg 960cttggctgca
atcggggcgg ccggggcctc taccggtttt ggctgaggca cgtccacatg
1020gttgccgctg ccttccactt cgatgcggat cagttcgcta ccgaccgcca
tcacttcccc 1080gggctggcca cccagggcca acaccttgcc gctgaccggc
gaggggattt ccacggtggc 1140cttgtcggtc atgacgtcgg ccaccacctg
gtcctcggcg atgatgtcgc cgaccttgac 1200gaaccattcc accaactcga
cctgcgcgat gccttcgcca atgtccggca tcttgatgac 1260gtgcgtgccc at
127253416PRTListeria monocytogenes 53Met Ala Val Glu Lys Ile Thr
Met Pro Lys Leu Gly Glu Ser Val Thr 1 5 10 15 Glu Gly Thr Ile Ser
Ser Trp Leu Val Lys Pro Gly Asp Thr Val Glu 20 25 30 Lys Tyr Asp
Ala Ile Ala Glu Val Leu Thr Asp Lys Val Thr Ala Glu 35 40 45 Ile
Pro Ser Ser Phe Ser Gly Thr Ile Lys Glu Ile Leu Ala Glu Glu 50 55
60 Asp Glu Thr Leu Glu Val Gly Glu Val Ile Cys Thr Ile Glu Thr Glu
65 70 75 80 Glu Ala Ser Ser Ser Glu Pro Val Val Glu Ala Glu Gln Thr
Glu Pro 85 90 95 Lys Thr Pro Glu Lys Gln Glu Thr Lys Gln Val Lys
Leu Ala Glu Ala 100 105 110 Pro Ala Ser Gly Arg Phe Ser Pro Ala Val
Leu Arg Ile Ala Gly Glu 115 120 125 Asn Asn Ile Asp Leu Ser Thr Val
Glu Gly Thr Gly Lys Gly Gly Arg 130 135 140 Ile Thr Arg Lys Asp Leu
Leu Gln Val Ile Glu Asn Gly Pro Val Ala 145 150 155 160 Pro Lys Arg
Glu Glu Val Lys Ser Ala Pro Gln Glu Lys Glu Ala Thr 165 170 175 Pro
Asn Pro Val Arg Ser Ala Ala Gly Asp Arg Glu Ile Pro Ile Asn 180 185
190 Gly Val Arg Lys Ala Ile Ala Lys His Met Ser Val Ser Lys Gln Glu
195 200 205 Ile Pro His Ala Trp Met Met Val Glu Val Asp Ala Thr Gly
Leu Val 210 215 220 Arg Tyr Arg Asn Thr Val Lys Asp Ser Phe Lys Lys
Glu Glu Gly Tyr 225 230 235 240 Ser Leu Thr Tyr Phe Ala Phe Phe Ile
Lys Ala Val Ala Gln Ala Leu 245 250 255 Lys Glu Phe Pro Gln Leu Asn
Ser Thr Trp Ala Gly Asp Lys Ile Ile 260 265 270 Glu His Ala Asn Ile
Asn Ile Ser Ile Ala Ile Ala Ala Gly Asp Leu 275 280 285 Leu Tyr Val
Pro Val Ile Lys Asn Ala Asp Glu Lys Ser Ile Lys Gly 290 295 300 Ile
Ala Arg Glu Ile Ser Glu Leu Ala Gly Lys Ala Arg Asn Gly Lys 305 310
315 320 Leu Ser Gln Ala Asp Met Glu Gly Gly Thr Phe Thr Val Asn Ser
Thr 325 330 335 Gly Ser Phe Gly Ser Val Gln Ser Met Gly Ile Ile Asn
His Pro Gln 340 345 350 Ala Ala Ile Leu Gln Val Glu Ser Ile Val Lys
Arg Pro Val Ile Ile 355 360 365 Asp Asp Met Ile Ala Val Arg Asp Met
Val Asn Leu Cys Leu Ser Ile 370 375 380 Asp His Arg Ile Leu Asp Gly
Leu Leu Ala Gly Lys Phe Leu Gln Ala 385 390 395 400 Ile Lys Ala Asn
Val Glu Lys Ile Ser Lys Glu Asn Thr Ala Leu Tyr 405 410 415
541251DNAListeria monocytogenes 54gtggcagttg aaaaaatcac catgcccaaa
ttaggggaaa gtgtaacaga aggaacgatt 60agttcatggt tagttaaacc aggcgataca
gtagaaaaat atgatgctat cgcggaagtt 120ttaacagata aagtaacagc
tgaaatccca tcatccttta gtggcactat caaagaaatt 180ttagcagagg
aagatgaaac actagaagta ggcgaagtta tttgtaccat cgaaacagaa
240gaggctagta gttcagagcc tgtagttgaa gcagaacaaa cagaaccaaa
aactccagaa 300aaacaagaaa caaaacaagt gaaattagca gaagcaccag
ccagtggaag attttcacca 360gcggtactgc gtattgctgg agaaaacaat
attgatttat caaccgtaga aggcacaggt 420aaaggtggcc gaattacaag
aaaagattta cttcaagtaa ttgaaaatgg tccagtagct 480ccgaaacgcg
aggaagtgaa gtctgctcca caagaaaaag aagcgacgcc aaatcctgta
540cgttcagcag caggtgacag agaaatccca atcaatggtg taagaaaagc
gattgctaaa 600catatgagcg tgagtaaaca agaaattccg catgcttgga
tgatggtgga agtagatgca 660actggtcttg ttcgctatcg taatacagtt
aaagacagct ttaaaaaaga agaaggttat 720tcattaactt atttcgcctt
tttcatcaaa gccgttgcac aagcattgaa agaattcccg 780caacttaaca
gcacgtgggc aggcgataaa attattgagc atgcgaatat caatatttcg
840attgcgattg cagctggcga tttattgtat gtgccagtta ttaaaaatgc
ggacgaaaaa 900tccattaaag gtattgctcg cgaaataagt gaactagctg
gaaaagcgcg taatggtaaa 960ctgagccaag ccgatatgga aggtgggact
ttcactgtaa atagtactgg ttcatttggc 1020tctgttcaat caatggggat
tattaaccac ccacaagccg ctattcttca agtggaatcc 1080attgttaagc
gcccagtcat tattgacgat atgattgctg tacgagatat ggtcaaccta
1140tgtctatcca tcgatcatcg tattttagac ggcttactag caggtaaatt
cttacaagca 1200attaaagcca atgtcgaaaa gatttctaaa gaaaatacag
cgttgtatta a 125155455PRTStreptomyces avermitilis 55Met Ala Gln Val
Leu Glu Phe Lys Leu Pro Asp Leu Gly Glu Gly Leu 1 5 10 15 Thr Glu
Ala Glu Ile Val Arg Trp Leu Val Gln Val Gly Asp Val Val 20 25 30
Ala Ile Asp Gln Pro Val Val Glu Val Glu Thr Ala Lys Ala Met Val 35
40 45 Glu Val Pro Cys Pro Tyr Gly Gly Val Val Thr Ala Arg Phe Gly
Glu 50 55 60 Glu Gly Thr Glu Leu Pro Val Gly Ser Pro Leu Leu Thr
Val Ala Val 65 70 75 80 Gly Ala Pro Ser Ser Val Pro Ala Ala Ser Ser
Leu Ser Gly Ala Thr 85 90 95 Ser Ala Ser Ser Ala Ser Ser Val Ser
Ser Asp Asp Gly Glu Ser Ser 100 105 110 Gly Asn Val Leu Val Gly Tyr
Gly Thr Ser Ala Ala Pro Ala Arg Arg 115 120 125 Arg Arg Val Arg Pro
Gly Gln Ala Ala Pro Val Val Thr Ala Thr Ala 130 135 140 Ala Ala Ala
Ala Thr Arg Val Ala Ala Pro Glu Arg Ser
Asp Gly Pro 145 150 155 160 Val Pro Val Ile Ser Pro Leu Val Arg Arg
Leu Ala Arg Glu Asn Gly 165 170 175 Leu Asp Leu Arg Ala Leu Ala Gly
Ser Gly Pro Asp Gly Leu Ile Leu 180 185 190 Arg Ser Asp Val Glu Gln
Ala Leu Arg Ala Ala Pro Thr Pro Ala Pro 195 200 205 Thr Pro Thr Met
Pro Pro Ala Pro Thr Pro Ala Pro Thr Pro Ala Ala 210 215 220 Ala Pro
Arg Gly Thr Arg Ile Pro Leu Arg Gly Val Arg Gly Ala Val 225 230 235
240 Ala Asp Lys Leu Ser Arg Ser Arg Arg Glu Ile Pro Asp Ala Thr Cys
245 250 255 Trp Val Asp Ala Asp Ala Thr Ala Leu Met His Ala Arg Val
Ala Met 260 265 270 Asn Ala Thr Gly Gly Pro Lys Ile Ser Leu Ile Ala
Leu Leu Ala Arg 275 280 285 Ile Cys Thr Ala Ala Leu Ala Arg Phe Pro
Glu Leu Asn Ser Thr Val 290 295 300 Asp Met Asp Ala Arg Glu Val Val
Arg Leu Asp Gln Val His Leu Gly 305 310 315 320 Phe Ala Ala Gln Thr
Glu Arg Gly Leu Val Val Pro Val Val Arg Asp 325 330 335 Ala His Ala
Arg Asp Ala Glu Ser Leu Ser Ala Glu Phe Ala Arg Leu 340 345 350 Thr
Glu Ala Ala Arg Thr Gly Thr Leu Thr Pro Gly Glu Leu Thr Gly 355 360
365 Gly Thr Phe Thr Leu Asn Asn Tyr Gly Val Phe Gly Val Asp Gly Ser
370 375 380 Thr Pro Ile Ile Asn His Pro Glu Ala Ala Met Leu Gly Val
Gly Arg 385 390 395 400 Ile Ile Pro Lys Pro Trp Val His Glu Gly Glu
Leu Ala Val Arg Gln 405 410 415 Val Val Gln Leu Ser Leu Thr Phe Asp
His Arg Val Cys Asp Gly Gly 420 425 430 Thr Ala Gly Gly Phe Leu Arg
Tyr Val Ala Asp Cys Val Glu Gln Pro 435 440 445 Ala Val Leu Leu Arg
Thr Leu 450 455 561368DNAStreptomyces avermitilis 56atggcccagg
tgctcgagtt caagctcccc gacctcgggg agggcctgac cgaggccgag 60atcgtccgct
ggctggtgca ggtcggcgac gtcgtggcga tcgaccagcc ggtcgtcgag
120gtggagacgg ccaaggcgat ggtcgaggtg ccgtgcccct acgggggcgt
ggtcaccgcc 180cgcttcggcg aggagggcac ggaactgccc gtgggctcac
cgctgttgac ggtggctgtc 240ggagctccgt cctcggtgcc cgcggcgtcc
tcgctgtccg gggcgacatc ggcgtcctcc 300gcgtcctcgg tgtcatcgga
cgacggcgag tcgtccggca acgtcctggt cggatacggc 360acgtcggccg
cgcccgcgcg ccggcggagg gtgcggccgg gccaggcggc acccgtggtg
420acggcaactg ccgccgcggc cgccacgcgc gtggcggctc ccgagcggag
cgacggcccc 480gtgcccgtga tctccccgct ggtccgcagg ctcgcccggg
agaacggcct ggatctgcgg 540gcgctggcgg gctccgggcc cgacgggctg
atcctgaggt cggacgtcga gcaggcgctg 600cgcgccgcgc ccactcctgc
ccccaccccg accatgcctc cggctcccac tcctgccccc 660acccccgccg
cggcaccccg cggcacccgc atccccctcc gaggggtccg cggtgccgtc
720gccgacaaac tctcccgcag ccggcgtgag atccccgacg cgacctgctg
ggtggacgcc 780gacgccacgg cactcatgca cgcgcgcgtg gcgatgaacg
cgaccggcgg cccgaagatc 840tccctcatcg cgctgctcgc caggatctgc
accgccgcac tggcccgctt ccccgagctc 900aactccaccg tcgacatgga
cgcccgcgag gtcgtacggc tcgaccaggt gcacctgggc 960ttcgccgcgc
agaccgaacg ggggctcgtc gtcccggtcg tgcgggacgc gcacgcgcgg
1020gacgccgagt cgctcagcgc cgagttcgcg cggctgaccg aggccgcccg
gaccggcacc 1080ctcacacccg gggaactgac cggcggcacc ttcacgttga
acaactacgg ggtgttcggc 1140gtcgacggtt ccacgccgat catcaaccac
cccgaggcgg ccatgctggg cgtcggccgc 1200atcatcccca agccgtgggt
gcacgagggc gagctggcgg tgcggcaggt cgtccagctc 1260tcgctcacct
tcgaccaccg ggtgtgcgac ggcggcacgg caggcggttt cctgcgctac
1320gtggcggact gcgtggaaca gccggcggtg ctgctgcgca ccctgtag
136857355PRTMicrococcus luteus 57Met Ser Glu Arg Met Thr Phe Gly
Arg Ala Ile Asn Arg Gly Leu His 1 5 10 15 Arg Ala Leu Ala Asp Asp
Pro Lys Val Leu Leu Met Gly Glu Asp Ile 20 25 30 Gly Ala Leu Gly
Gly Val Phe Arg Ile Thr Asp Gly Leu Gln Ala Glu 35 40 45 Phe Gly
Glu Asp Arg Val Leu Asp Thr Pro Leu Ala Glu Ser Gly Ile 50 55 60
Val Gly Thr Ala Ile Gly Leu Ala Met Arg Gly Tyr Arg Pro Val Val 65
70 75 80 Glu Ile Gln Phe Asp Gly Phe Val Tyr Pro Ala Phe Asp Gln
Ile Val 85 90 95 Ala Asn Leu Ala Lys Leu Arg Ala Arg Thr Arg Gly
Ala Val Pro Met 100 105 110 Pro Val Thr Ile Arg Ile Pro Phe Gly Gly
Gly Ile Gly Ser Pro Glu 115 120 125 His His Ser Glu Ser Pro Glu Ala
Tyr Phe Leu His Thr Ala Gly Leu 130 135 140 Arg Val Val Ser Pro Ser
Ser Pro Gln Glu Gly Tyr Asp Leu Ile Arg 145 150 155 160 Ala Ala Ile
Ala Ser Glu Asp Pro Val Val Tyr Leu Glu Pro Lys Arg 165 170 175 Arg
Tyr His Asp Lys Gly Asp Val Asp Leu Gly Val Ala Ile Pro Pro 180 185
190 Met Ser Pro Ala Arg Ile Leu Arg Glu Gly Arg Asp Ala Thr Leu Val
195 200 205 Ala Tyr Gly Pro Leu Val Lys Thr Ala Leu Gln Ala Ala Glu
Val Ala 210 215 220 Ala Glu Glu Gly Val Glu Val Glu Val Val Asp Leu
Arg Ser Leu Ser 225 230 235 240 Pro Leu Asp Thr Gly Leu Val Glu Ser
Ser Val Arg Arg Thr Gly Arg 245 250 255 Leu Val Val Ala His Glu Ala
Ser Arg Thr Gly Gly Leu Gly Ala Glu 260 265 270 Leu Val Ala Thr Val
Ala Glu Arg Ala Phe His Trp Leu Glu Ala Pro 275 280 285 Pro Val Arg
Val Thr Gly Met Asp Val Pro Tyr Pro Pro Ser Lys Leu 290 295 300 Glu
His Leu His Leu Pro Asp Leu Asp Arg Ile Leu Asp Gly Leu Asp 305 310
315 320 Arg Ala Leu Gly Arg Pro Asn Ser Leu Asp Ser Val Asp Ala Phe
Ala 325 330 335 Ala Pro Glu Thr Ala Glu Gln Phe Leu Ala Ala Gln Asn
Ala Gly Glu 340 345 350 Glu Thr Arg 355 581068DNAMicrococcus luteus
58gtgagcgagc gcatgacctt cggccgtgcg atcaaccgcg gcctgcaccg tgccctggcc
60gacgacccca aggtcctgct catgggcgag gacatcggcg ccctcggcgg cgtgttccgc
120atcaccgacg gcctgcaggc cgagttcggc gaggaccggg tgctcgacac
cccgctggcc 180gagtccggca tcgtgggcac ggccatcggc ctggcgatgc
gcggctaccg gcccgtcgtc 240gagatccagt tcgacggctt cgtgtacccg
gcgttcgacc agatcgtggc gaacctggcc 300aagctgcgcg cccgcacccg
cggcgccgtg ccgatgccgg tgaccatccg catccccttc 360ggcggcggca
tcggctcccc ggagcaccac tccgagtcgc ccgaggccta cttcctgcac
420accgcgggtc tgcgcgtggt ctccccgtcc tccccgcagg aggggtacga
cctcatccgc 480gccgcgatcg cctcggagga cccggtggtc tacctcgagc
ccaagcgtcg ctaccacgac 540aagggcgacg tggacctggg cgtcgcgatc
ccgccgatga gcccggcccg catcctgcgc 600gagggccgtg acgccacgct
cgtggcctac ggcccgctcg tgaagaccgc cctgcaggcc 660gccgaggtgg
cggccgagga gggtgtcgag gtcgaggtgg tcgacctgcg cagcctgtcc
720ccgctggaca ccggcctcgt cgagtcctcg gtgcggcgca ccggtcggct
cgtcgtggcg 780cacgaggcct cccgcacggg cggcctcggc gccgagctcg
tggccacggt ggccgagcgc 840gcgttccatt ggctcgaggc cccgccggtg
cgcgtcaccg gcatggacgt gccctacccg 900ccgtccaagc tcgagcacct
gcacctgccg gacctcgacc gcatcctcga cggcctggac 960cgtgctctgg
gccggccgaa ttcgctggac tccgtggacg cgttcgccgc ccccgagacc
1020gccgagcagt tcctcgccgc ccagaacgcc ggggaggaga cccgatga
106859424PRTStaphylococcus aureus 59Met Glu Ile Thr Met Pro Lys Leu
Gly Glu Ser Val His Glu Gly Thr 1 5 10 15 Ile Glu Gln Trp Leu Val
Ser Val Gly Asp His Ile Asp Glu Tyr Glu 20 25 30 Pro Leu Cys Glu
Val Ile Thr Asp Lys Val Thr Ala Glu Val Pro Ser 35 40 45 Thr Ile
Ser Gly Thr Ile Thr Glu Ile Leu Val Glu Ala Gly Gln Thr 50 55 60
Val Ala Ile Asp Thr Ile Ile Cys Lys Ile Glu Thr Ala Asp Glu Lys 65
70 75 80 Thr Asn Glu Thr Thr Glu Glu Ile Gln Ala Lys Val Asp Glu
His Thr 85 90 95 Gln Lys Ser Thr Lys Lys Ala Ser Ala Thr Val Glu
Gln Thr Phe Thr 100 105 110 Ala Lys Gln Asn Gln Pro Arg Asn Asn Gly
Arg Phe Ser Pro Val Val 115 120 125 Phe Lys Leu Ala Ser Glu His Asp
Ile Asp Leu Ser Gln Val Val Gly 130 135 140 Ser Gly Phe Glu Gly Arg
Val Thr Lys Lys Asp Ile Met Ser Val Ile 145 150 155 160 Glu Asn Gly
Gly Thr Thr Ala Gln Ser Asp Lys Gln Val Gln Thr Lys 165 170 175 Ser
Thr Ser Val Asp Thr Ser Ser Asn Gln Ser Ser Glu Asp Asn Ser 180 185
190 Glu Asn Ser Thr Ile Pro Val Asn Gly Val Arg Lys Ala Ile Ala Gln
195 200 205 Asn Met Val Asn Ser Val Thr Glu Ile Pro His Ala Trp Met
Met Ile 210 215 220 Glu Val Asp Ala Thr Asn Leu Val Asn Thr Arg Asn
His Tyr Lys Asn 225 230 235 240 Ser Phe Lys Asn Lys Glu Gly Tyr Asn
Leu Thr Phe Phe Ala Phe Phe 245 250 255 Val Lys Ala Val Ala Asp Ala
Leu Lys Ala Tyr Pro Leu Leu Asn Ser 260 265 270 Ser Trp Gln Gly Asn
Glu Ile Val Leu His Lys Asp Ile Asn Ile Ser 275 280 285 Ile Ala Val
Ala Asp Glu Asn Lys Leu Tyr Val Pro Val Ile Lys His 290 295 300 Ala
Asp Glu Lys Ser Ile Lys Gly Ile Ala Arg Glu Ile Asn Thr Leu 305 310
315 320 Ala Thr Lys Ala Arg Asn Lys Gln Leu Thr Thr Glu Asp Met Gln
Gly 325 330 335 Gly Thr Phe Thr Val Asn Asn Thr Gly Thr Phe Gly Ser
Val Ser Ser 340 345 350 Met Gly Ile Ile Asn His Pro Gln Ala Ala Ile
Leu Gln Val Glu Ser 355 360 365 Ile Val Lys Lys Pro Val Val Ile Asn
Asp Met Ile Ala Ile Arg Ser 370 375 380 Met Val Asn Leu Cys Ile Ser
Ile Asp His Arg Ile Leu Asp Gly Leu 385 390 395 400 Gln Thr Gly Lys
Phe Met Asn His Ile Lys Gln Arg Ile Glu Gln Tyr 405 410 415 Thr Leu
Glu Asn Thr Asn Ile Tyr 420 601275DNAStaphylococcus aureus
60ctaatatata tttgtatttt ctaaagtata ctgttcgata cgctgtttaa tatgattcat
60aaatttacca gtttgtaaac catctaaaat gcgatgatct attgaaatac ataaatttac
120catacttcga attgcaatca tatcattaat tactactggc tttttaacga
ttgattctac 180ttgtaatatc gctgcttgag ggtgatttat aatccccatt
gatgatactg aaccaaatgt 240accagtatta ttaactgtaa atgttccacc
ttgcatatct tcagttgtca attgcttatt 300acgcgctttt gttgctaaag
tattaatttc tctagctata cctttgattg acttttcgtc 360tgcatgctta
atcacaggta cgtataattt attttcatca gcaacagcaa ttgaaatatt
420aatgtcttta tgtaagacaa tttcatttcc ttgccagcta ctatttaata
aaggatatgc 480ttttaaagca tctgctacag cttttacaaa gaaagcaaag
aacgttagat tatatccttc 540tttattttta aagctgtttt tataatgatt
tctcgtattc acaagatttg tagcatctac 600ttcaatcatc atccatgcat
gtggaatctc tgttacacta ttaaccatat tttgcgcaat 660tgctttacgc
acaccattta ctggtattgt gctgttttca ctattgtctt cagatgattg
720gttacttgat gtatctactg atgttgattt tgtttgaact tgtttgtcag
attgagctgt 780ggtaccacca ttttcaataa ctgacattat atccttctta
gttacacgac cttcaaatcc 840actacctaca acttgtgata aatcaatgtc
atgctctgaa gcgagtttaa atacaacagg 900tgaaaagcga ccattattac
gaggttgatt ttgtttagca gtaaatgtct gttccactgt 960tgcactagct
tttttagtag atttctgagt atgctcatcc acttttgctt gtatctcttc
1020agttgtttca tttgtctttt catcagcagt ttcaatttta cagataattg
tatcaatagc 1080tactgtctgc cccgcttcaa ctaaaatttc tgtaattgtt
cctgatatcg tggaagggac 1140ttcagctgtc actttatctg taataacttc
acataatggt tcatattcat caatatgatc 1200accaacagaa actaaccatt
gttcaatggt accttcatga acactctcac ctaacttagg 1260cattgttatt tccat
127561455PRTStreptococcus mutans 61Met Ala Val Glu Ile Ile Met Pro
Lys Leu Gly Val Asp Met Gln Glu 1 5 10 15 Gly Glu Ile Ile Glu Trp
Lys Lys Gln Glu Gly Asp Glu Val Lys Glu 20 25 30 Gly Glu Ile Leu
Leu Glu Ile Met Ser Asp Lys Thr Asn Met Glu Ile 35 40 45 Glu Ala
Glu Asp Ser Gly Val Leu Leu Lys Ile Val Lys Gly Asn Gly 50 55 60
Gln Val Val Pro Val Thr Glu Val Ile Gly Tyr Ile Gly Gln Ala Gly 65
70 75 80 Glu Val Leu Glu Ile Ala Asp Val Pro Ala Ser Thr Val Pro
Lys Glu 85 90 95 Asn Ser Ala Ala Pro Ala Glu Lys Thr Lys Ala Met
Ser Ser Pro Thr 100 105 110 Val Ala Ala Ala Pro Gln Gly Lys Ile Arg
Ala Thr Pro Ala Ala Arg 115 120 125 Lys Ala Ala Arg Asp Leu Gly Val
Asn Leu Asn Gln Val Ser Gly Thr 130 135 140 Gly Ala Lys Gly Arg Val
His Lys Glu Asp Val Glu Ser Phe Lys Ala 145 150 155 160 Ala Gln Pro
Lys Ala Thr Pro Leu Ala Arg Lys Ile Ala Ile Asp Lys 165 170 175 Gly
Ile Asp Leu Ala Ser Val Ser Gly Thr Gly Phe Gly Gly Lys Ile 180 185
190 Ile Lys Glu Asp Ile Leu Asn Leu Phe Glu Ala Ala Gln Pro Val Asn
195 200 205 Asp Val Ser Asp Pro Ala Lys Glu Ala Ala Ala Leu Pro Glu
Gly Val 210 215 220 Glu Val Ile Lys Met Ser Ala Met Arg Lys Ala Val
Ala Lys Ser Met 225 230 235 240 Val Asn Ser Tyr Leu Thr Ala Pro Thr
Phe Thr Leu Asn Tyr Asp Ile 245 250 255 Asp Met Thr Glu Met Ile Ala
Leu Arg Lys Lys Leu Ile Asp Pro Ile 260 265 270 Met Glu Lys Thr Gly
Phe Lys Val Ser Phe Thr Asp Leu Ile Gly Leu 275 280 285 Ala Val Val
Lys Thr Leu Met Lys Pro Glu His Arg Tyr Leu Asn Ala 290 295 300 Ser
Leu Ile Asn Asp Ala Thr Glu Ile Glu Leu His Gln Phe Val Asn 305 310
315 320 Leu Gly Ile Ala Val Gly Leu Asp Glu Gly Leu Leu Val Pro Val
Val 325 330 335 His Gly Ala Asp Lys Met Ser Leu Ser Asp Phe Val Ile
Ala Ser Lys 340 345 350 Asp Val Ile Lys Lys Ala Gln Thr Gly Lys Leu
Lys Ala Thr Glu Met 355 360 365 Ser Gly Ser Thr Phe Ser Ile Thr Asn
Leu Gly Met Phe Gly Thr Lys 370 375 380 Thr Phe Asn Pro Ile Ile Asn
Gln Pro Asn Ser Ala Ile Leu Gly Val 385 390 395 400 Gly Ala Thr Ile
Gln Thr Pro Thr Val Val Asp Gly Glu Ile Lys Ile 405 410 415 Arg Pro
Ile Met Ala Leu Cys Leu Thr Ile Asp His Arg Leu Val Asp 420 425 430
Gly Met Asn Gly Ala Lys Phe Met Val Asp Leu Lys Lys Leu Met Glu 435
440 445 Asn Pro Phe Thr Leu Leu Ile 450 455 621368DNAStreptococcus
mutans 62atggcagtcg aaattattat gcctaaactt ggtgttgata tgcaggaagg
cgaaatcatc 60gagtggaaaa aacaagaagg tgatgaggtc aaagaagggg agatcctcct
tgagattatg 120tctgacaaga ccaatatgga aatcgaagct gaggattcag
gtgtcctgct taagattgtt 180aaaggaaatg gtcaagttgt tcctgtaact
gaggtcattg gttatattgg tcaagcaggt 240gaagttcttg aaatagctga
tgttcctgca agtacagttc ctaaagaaaa tagtgcagca 300cctgctgaaa
aaacaaaagc aatgtcttct ccgacagttg cagcagcccc tcaaggaaag
360attcgagcaa caccagcagc tcgtaaggcg gctcgtgatc tgggagttaa
cctgaatcag 420gtttcaggga caggcgctaa aggccgtgtt cacaaggaag
atgttgaaag ctttaaagca 480gctcagccta aagcaacacc attagctagg
aaaattgcta tagataaagg tattgatcta 540gccagtgtct caggaacagg
ttttggcggc aaaattatca aggaagatat tttaaatctg 600tttgaggcag
ctcagcctgt taatgatgtg tcagatcctg ctaaagaagc agctgcctta
660ccagagggtg ttgaagtcat taagatgtct gccatgcgta aggcagtggc
taaaagcatg 720gtcaattctt acctgacagc tccaactttt actctcaatt
atgacattga catgactgag 780atgattgcgt tgcgtaaaaa gttaattgat
cctatcatgg aaaaaacagg ttttaaagtt 840agcttcacag atttgattgg
tctggcagtc gtaaaaacct taatgaaacc agaacatcgt 900tacctcaatg
cttcactcat taatgacgcg actgagattg aacttcatca atttgttaac
960cttggtatcg ccgttggact tgatgaagga ctgttagtac ctgttgttca
tggtgcagat
1020aagatgagct tgtcagattt tgttatagct tcaaaggatg tcattaagaa
agctcagacc 1080ggtaaattaa aagccactga aatgtctggt tcaacctttt
ccattacaaa cttggggatg 1140tttggcacta agactttcaa ccccattatc
aatcagccaa attcggctat tttgggtgta 1200ggagcaacta tccaaacgcc
aactgttgtg gatggtgaaa ttaagattcg tccaatcatg 1260gcactgtgct
tgaccatcga tcaccgcttg gttgatggca tgaacggcgc taagttcatg
1320gttgatctta aaaaactgat ggaaaatcca tttacattat tgatttga
13686314PRTArtificial sequenceSynthetic polypeptide 63Pro Xaa Val
Xaa Xaa Xaa Ala Xaa Xaa Xaa Gly Xaa Xaa Leu 1 5 10 648PRTArtificial
sequenceSynthetic polypeptide 64Xaa Xaa Gly Xaa Xaa Gly Xaa Ile 1 5
6516PRTArtificial sequenceSynthetic polypeptide 65Xaa Pro Xaa Xaa
Gly Xaa Arg Xaa Xaa Ala Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15
6613PRTArtificial sequenceSynthetic polypeptide 66Gly Xaa Thr Xaa
Thr Xaa Xaa Xaa Xaa Gly Xaa Xaa Gly 1 5 10 6719PRTArtificial
sequenceSynthetic polypeptide 67Asn Xaa Pro Glu Xaa Ala Xaa Xaa Xaa
Val Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Pro Xaa Val 6812PRTArtificial
sequenceSynthetic polypeptide 68Leu Xaa Xaa Xaa Phe Xaa His Arg Xaa
Xaa Asp Gly 1 5 10 69457PRTBacillus subtilis 69Met Ala Thr Glu Tyr
Asp Val Val Ile Leu Gly Gly Gly Thr Gly Gly 1 5 10 15 Tyr Val Ala
Ala Ile Arg Ala Ala Gln Leu Gly Leu Lys Thr Ala Val 20 25 30 Val
Glu Lys Glu Lys Leu Gly Gly Thr Cys Leu His Lys Gly Cys Ile 35 40
45 Pro Ser Lys Ala Leu Leu Arg Ser Ala Glu Val Tyr Arg Thr Ala Arg
50 55 60 Glu Ala Asp Gln Phe Gly Val Glu Thr Ala Gly Val Ser Leu
Asn Phe 65 70 75 80 Glu Lys Val Gln Gln Arg Lys Gln Ala Val Val Asp
Lys Leu Ala Ala 85 90 95 Gly Val Asn His Leu Met Lys Lys Gly Lys
Ile Asp Val Tyr Thr Gly 100 105 110 Tyr Gly Arg Ile Leu Gly Pro Ser
Ile Phe Ser Pro Leu Pro Gly Thr 115 120 125 Ile Ser Val Glu Arg Gly
Asn Gly Glu Glu Asn Asp Met Leu Ile Pro 130 135 140 Lys Gln Val Ile
Ile Ala Thr Gly Ser Arg Pro Arg Met Leu Pro Gly 145 150 155 160 Leu
Glu Val Asp Gly Lys Ser Val Leu Thr Ser Asp Glu Ala Leu Gln 165 170
175 Met Glu Glu Leu Pro Gln Ser Ile Ile Ile Val Gly Gly Gly Val Ile
180 185 190 Gly Ile Glu Trp Ala Ser Met Leu His Asp Phe Gly Val Lys
Val Thr 195 200 205 Val Ile Glu Tyr Ala Asp Arg Ile Leu Pro Thr Glu
Asp Leu Glu Ile 210 215 220 Ser Lys Glu Met Glu Ser Leu Leu Lys Lys
Lys Gly Ile Gln Phe Ile 225 230 235 240 Thr Gly Ala Lys Val Leu Pro
Asp Thr Met Thr Lys Thr Ser Asp Asp 245 250 255 Ile Ser Ile Gln Ala
Glu Lys Asp Gly Glu Thr Val Thr Tyr Ser Ala 260 265 270 Glu Lys Met
Leu Val Ser Ile Gly Arg Gln Ala Asn Ile Glu Gly Ile 275 280 285 Gly
Leu Glu Asn Thr Asp Ile Val Thr Glu Asn Gly Met Ile Ser Val 290 295
300 Asn Glu Ser Cys Gln Thr Lys Glu Ser His Ile Tyr Ala Ile Gly Asp
305 310 315 320 Val Ile Gly Gly Leu Gln Leu Ala His Val Ala Ser His
Glu Gly Ile 325 330 335 Ile Ala Val Glu His Phe Ala Gly Leu Asn Pro
His Pro Leu Asp Pro 340 345 350 Thr Leu Val Pro Lys Cys Ile Tyr Ser
Ser Pro Glu Ala Ala Ser Val 355 360 365 Gly Leu Thr Glu Asp Glu Ala
Lys Ala Asn Gly His Asn Val Lys Ile 370 375 380 Gly Lys Phe Pro Phe
Met Ala Ile Gly Lys Ala Leu Val Tyr Gly Glu 385 390 395 400 Ser Asp
Gly Phe Val Lys Ile Val Ala Asp Arg Asp Thr Asp Asp Ile 405 410 415
Leu Gly Val His Met Ile Gly Pro His Val Thr Asp Met Ile Ser Glu 420
425 430 Ala Gly Leu Ala Lys Val Leu Asp Ala Thr Pro Trp Glu Val Gly
Gln 435 440 445 Thr Ile Ser Pro Ala Ser Asn Ala Phe 450 455
701374DNABacillus subtilis 70tcagaaagcg ttggatgcgg gtgaaatcgt
ttgcccgacc tcccacggtg ttgcgtccag 60cactttggca agacccgctt cagaaatcat
gtcggtgaca tgcgggccaa tcatatgaac 120gccgagaata tcatctgtat
ctcggtcagc cacgattttg acaaaaccgt cgctttcacc 180gtatacaagc
gcttttccaa tcgccataaa tgggaacttg ccgattttga cattatgccc
240gttcgccttt gcttcgtctt cggttaagcc gacactggca gcttcagggc
ttgagtaaat 300gcacttcggc acaagcgtcg gatcaagcgg atgcggattg
agacctgcaa aatgctcaac 360agcaataatt ccctcatgtg aagcaacgtg
agctaactgc aggccaccga ttacgtctcc 420gattgcataa atatgagatt
ccttcgtttg gcagctttca ttgactgaaa tcatgccatt 480ttcagtaaca
atatcggtgt tctctaggcc gatgccttcg atatttgcct gtctgccgat
540ggaaacaagc attttctcag cagaataggt aacggtttct ccgtcttttt
ccgcttgtat 600gctgatatcg tctgatgttt ttgtcattgt gtcaggcagc
acttttgccc ctgttatgaa 660ctggatgcct tttttcttaa gaagactttc
catttctttt gaaatctcta gatcttcagt 720cggcaatatg cgatccgcgt
attcaataac cgttacctta acgccaaaat catgaagcat 780agacgcccat
tcgataccga taacccctcc gccgacaatg atgattgact gtggcagctc
840ctccatttgg agcgcctcat ctgaagtcag tacagactta ccgtccactt
caagacccgg 900aagcattctc ggtcttgatc ctgttgcaat gatcacttgt
ttcgggatca gcatgtcatt 960ttcttcgcca tttccccgct caacagaaat
tgttcccggc agcggagaga agattgacgg 1020tccaaggata cgtccatatc
cggtgtacac gtcaattttt ccttttttca ttaaatgatt 1080tacacccgct
gcaagcttat caacaacggc ttgcttacgc tgctgcactt tttcaaagtt
1140gagggacacg ccagccgttt ccactccgaa ttgatcggct tcacgagctg
tccggtatac 1200ctctgcgctt ctaagcagcg ctttactcgg gatacagcct
ttatgcagac atgttccccc 1260gagtttttcc ttttccacaa cggctgtttt
taagccgagc tgagcggctc tgatggccgc 1320aacataaccg ccggtaccgc
cgcccagaat gactacgtca tactcagttg ccat 137471462PRTStreptomyces
avermitilis 71Met Ala Asn Asp Ala Ser Thr Val Phe Asp Leu Val Ile
Leu Gly Gly 1 5 10 15 Gly Ser Gly Gly Tyr Ala Ala Ala Leu Arg Gly
Ala Gln Leu Gly Leu 20 25 30 Asp Val Ala Leu Ile Glu Lys Asp Lys
Val Gly Gly Thr Cys Leu His 35 40 45 Arg Gly Cys Ile Pro Thr Lys
Ala Leu Leu His Ala Gly Glu Ile Ala 50 55 60 Asp Gln Ala Arg Glu
Ser Glu Gln Phe Gly Val Lys Ala Thr Phe Glu 65 70 75 80 Gly Ile Asp
Val Pro Ala Val His Lys Tyr Lys Asp Gly Val Ile Ser 85 90 95 Gly
Leu Tyr Lys Gly Leu Gln Gly Leu Ile Ala Ser Arg Lys Val Thr 100 105
110 Tyr Ile Glu Gly Glu Gly Arg Leu Ser Ser Pro Thr Ser Val Asp Val
115 120 125 Asn Gly Gln Arg Val Gln Gly Arg His Val Leu Leu Ala Thr
Gly Ser 130 135 140 Val Pro Lys Ser Leu Pro Gly Leu Ala Ile Asp Gly
Asn Arg Ile Ile 145 150 155 160 Ser Ser Asp His Ala Leu Val Leu Asp
Arg Val Pro Glu Ser Ala Ile 165 170 175 Val Leu Gly Gly Gly Val Ile
Gly Val Glu Phe Ala Ser Ala Trp Lys 180 185 190 Ser Phe Gly Ala Asp
Val Thr Val Ile Glu Gly Leu Lys His Leu Val 195 200 205 Pro Val Glu
Asp Glu Asn Ser Ser Lys Leu Leu Glu Arg Ala Phe Arg 210 215 220 Lys
Arg Gly Ile Lys Phe Asn Leu Gly Thr Phe Phe Ser Lys Ala Glu 225 230
235 240 Tyr Thr Gln Asn Gly Val Lys Val Thr Leu Ala Asp Gly Lys Glu
Phe 245 250 255 Glu Ala Glu Val Leu Leu Val Ala Val Gly Arg Gly Pro
Val Ser Gln 260 265 270 Gly Leu Gly Tyr Glu Glu Gln Gly Val Ala Met
Asp Arg Gly Tyr Val 275 280 285 Leu Val Asp Glu Tyr Met Arg Thr Asn
Val Pro Thr Ile Ser Ala Val 290 295 300 Gly Asp Leu Val Pro Thr Leu
Gln Leu Ala His Val Gly Phe Ala Glu 305 310 315 320 Gly Ile Leu Val
Ala Glu Arg Leu Ala Gly Leu Lys Thr Val Pro Ile 325 330 335 Asp Tyr
Asp Gly Val Pro Arg Val Thr Tyr Cys His Pro Glu Val Ala 340 345 350
Ser Val Gly Ile Thr Glu Ala Lys Ala Lys Glu Ile Tyr Gly Ala Asp 355
360 365 Lys Val Val Ala Leu Lys Tyr Asn Leu Ala Gly Asn Gly Lys Ser
Lys 370 375 380 Ile Leu Asn Thr Ala Gly Glu Ile Lys Leu Val Gln Val
Lys Asp Gly 385 390 395 400 Ala Val Val Gly Val His Met Val Gly Asp
Arg Met Gly Glu Gln Val 405 410 415 Gly Glu Ala Gln Leu Ile Tyr Asn
Trp Glu Ala Leu Pro Ala Glu Val 420 425 430 Ala Gln Leu Ile His Ala
His Pro Thr Gln Asn Glu Ala Met Gly Glu 435 440 445 Ala His Leu Ala
Leu Ala Gly Lys Pro Leu His Ser His Asp 450 455 460
721389DNAStreptomyces avermitilis 72tcagtcgtgc gagtgcagcg
gcttgcccgc gagggccagg tgggcctcgc ccatcgcttc 60gttctgcgtc gggtgggcgt
ggatgagctg ggcgacctcg gccggcagcg cctcccagtt 120gtagatcagc
tgggcttcgc cgacctgctc gcccatacgg tcaccgacca tgtggacgcc
180gaccacggca ccgtccttca cctggacgag cttgatctcg cccgcggtgt
tgaggatctt 240gctcttgccg ttgcccgcca ggttgtactt cagagcgacg
accttgtccg cgccgtagat 300ctccttggcc ttggcctcgg tgatgcccac
ggaggcgacc tcggggtggc agtacgtcac 360ccgcggcacg ccgtcgtagt
cgatcgggac ggtcttcaga ccggccagac gctccgccac 420caggatgccc
tcggcgaagc cgacgtgcgc gagctggagc gtcgggacca ggtcaccgac
480ggcggagatg gtcgggacgt tcgtccgcat gtactcgtcg accaggacgt
agccgcggtc 540catggcgacg ccctgctcct cgtagccgag gccctgcgag
accgggccgc ggccgacggc 600gacgagcagg acctcggcct cgaactcctt
gccgtcggcg agggtgacct tgacaccgtt 660ctgggtgtac tcggccttcg
agaagaaggt gcccaggttg aacttgatgc cgcgcttgcg 720gaacgcgcgc
tcaagaagct tggaggagtt ctcgtcctcg accgggacga ggtgcttgag
780gccctcgatc accgtcacgt cggctccgaa ggacttccac gcggaggcga
actcgacgcc 840gatgacgccg ccgccgagca cgatcgcgga ctccgggacg
cggtccagga ccagcgcgtg 900gtcggaggag atgatgcggt tgccgtcgat
cgccaggccc ggcagcgact tcggcacgga 960gccggtcgcc aggagcacgt
ggcggccctg gacgcgctgg ccgttcacgt cgacggaggt 1020cggggaggac
agacggccct caccctcgat gtacgtcacc ttgcgggagg cgatcagccc
1080ctgcagaccc ttgtacaggc ccgagatgac cccgtccttg tacttgtgga
cggccggtac 1140gtcgatgccc tcgaaggtgg ccttgacgcc gaactgctcg
ctctcgcggg cctggtcggc 1200gatctcgccc gcgtgcagca gcgccttggt
ggggatgcac ccacggtgca ggcaggtacc 1260gccgaccttg tccttctcga
tcagggcgac gtccaggccc agctgcgctc cgcgcagggc 1320cgcggcgtaa
ccaccgctac caccgccgag gatcactagg tcgaaaacgg tgctggcgtc
1380gttcgccac 138973459PRTPseudomonas putida 73Met Gln Gln Ile Ile
Gln Thr Thr Leu Leu Ile Ile Gly Gly Gly Pro 1 5 10 15 Gly Gly Tyr
Val Ala Ala Ile Arg Ala Gly Gln Leu Gly Ile Pro Thr 20 25 30 Val
Leu Val Glu Gly Gln Ala Leu Gly Gly Thr Cys Leu Asn Ile Gly 35 40
45 Cys Ile Pro Ser Lys Ala Leu Ile His Val Ala Glu Gln Phe His Gln
50 55 60 Ala Ser Arg Phe Thr Glu Pro Ser Pro Leu Gly Ile Ser Val
Ala Ser 65 70 75 80 Pro Arg Leu Asp Ile Gly Gln Ser Val Thr Trp Lys
Asp Gly Ile Val 85 90 95 Asp Arg Leu Thr Thr Gly Val Ala Ala Leu
Leu Lys Lys His Gly Val 100 105 110 Lys Val Val His Gly Trp Ala Lys
Val Leu Asp Gly Lys Gln Val Glu 115 120 125 Val Asp Gly Gln Arg Ile
Gln Cys Glu His Leu Leu Leu Ala Thr Gly 130 135 140 Ser Thr Ser Val
Glu Leu Pro Met Leu Pro Leu Gly Gly Pro Val Ile 145 150 155 160 Ser
Ser Thr Glu Ala Leu Ala Pro Lys Ala Leu Pro Gln His Leu Val 165 170
175 Val Val Gly Gly Gly Tyr Ile Gly Leu Glu Leu Gly Ile Ala Tyr Arg
180 185 190 Lys Leu Gly Ala Gln Val Ser Val Val Glu Ala Arg Glu Arg
Ile Leu 195 200 205 Pro Thr Tyr Asp Ser Glu Leu Thr Ala Pro Val Ala
Glu Ser Leu Lys 210 215 220 Lys Leu Gly Ile Ala Leu His Leu Gly His
Ser Val Glu Gly Tyr Glu 225 230 235 240 Asn Gly Cys Leu Leu Ala Ser
Asp Gly Lys Gly Gly Gln Leu Arg Leu 245 250 255 Glu Ala Asp Gln Val
Leu Val Ala Val Gly Arg Arg Pro Arg Thr Lys 260 265 270 Gly Phe Asn
Leu Glu Cys Leu Asp Leu Lys Met Asn Gly Thr Ala Ile 275 280 285 Ala
Ile Asp Glu Arg Cys His Thr Ser Met His Asn Val Trp Ala Ile 290 295
300 Gly Asp Val Ala Gly Glu Pro Met Leu Ala His Arg Ala Met Ala Gln
305 310 315 320 Gly Glu Met Val Ala Glu Ile Ile Ala Gly Lys Ala Arg
Arg Phe Glu 325 330 335 Pro Ala Ala Ile Ala Ala Val Cys Phe Thr Asp
Pro Glu Val Val Val 340 345 350 Val Gly Lys Thr Pro Glu Gln Ala Ser
Gln Gln Ala Leu Asp Cys Ile 355 360 365 Val Ala Gln Phe Pro Phe Ala
Ala Asn Gly Arg Ala Met Ser Leu Glu 370 375 380 Ser Lys Ser Gly Phe
Val Arg Val Val Ala Arg Arg Asp Asn His Leu 385 390 395 400 Ile Val
Gly Trp Gln Ala Val Gly Val Ala Val Ser Glu Leu Ser Thr 405 410 415
Ala Phe Ala Gln Ser Leu Glu Met Gly Ala Cys Leu Glu Asp Val Ala 420
425 430 Gly Thr Ile His Ala His Pro Thr Leu Gly Glu Ala Val Gln Glu
Ala 435 440 445 Ala Leu Arg Ala Leu Gly His Ala Leu His Ile 450 455
741380DNAPseudomonas putida 74tcagatatgc aaggcgtggc ccaacgcgcg
tagggcggct tcttgcaccg cttcacccaa 60cgtggggtgg gcatgaatgg tgccggccac
atcttccagg cacgcgccca tctccagcga 120ttgggcaaac gcggtggaca
gctcggagac cgccacgcca accgcctgcc aacccacgat 180caggtggttg
tcacggcgcg ccaccacccg cacgaaaccg cttttcgact ccaggctcat
240ggcccggcca ttggcggcaa acgggaactg cgcgacgatg cagtccaggg
cctgctggct 300ggcttgttcc ggggtcttgc cgaccaccac cacttccggg
tcggtaaagc acacggcggc 360aatcgctgcc ggctcgaagc gtcgggcctt
gccggcgatg atttcggcga ccatctcgcc 420ttgggccatg gcccggtgcg
ccagcatcgg ttcgccagcg acgtcgccaa tggcccagac 480gttgtgcatg
ctggtatgac agcgctcgtc gatggcaatg gcggtgccgt tcatcttcag
540gtccaggcat tccaggttga agcccttggt gcgtggccgg cggcccacgg
ccaccagtac 600ctgatcggct tcaagacgca gttgcccacc cttgccgtcg
ctggccagca ggcagccatt 660ttcgtagccc tcgacgctgt ggcccaggtg
caacgcgatg cccagtttct tcagcgactc 720ggccaccggg gcggtcaatt
cgctgtcgta ggtcggcagg atgcgttcgc gcgcttccac 780cacactcacc
tgtgcaccca gcttgcgata ggcaatgccc agctccaggc cgatatagcc
840accgccgacc accaccaggt gttgcggcag ggctttcggc gccagggctt
cggtcgagga 900aatcaccggg ccacccagcg gcagcatcgg cagttcgaca
ctggtggaac cggtcgccag 960caacagatgc tcgcactgga tacgctggcc
atcgacctcg acctgcttgc cgtccagtac 1020cttggcccag ccatgcacca
ctttcacccc gtgctttttc agcaaggcgg caacaccggt 1080ggtcagacgg
tcgacaatgc cgtccttcca ggtgacgctc tggccgatgt ccaggcgcgg
1140cgaagccacg ctgatgccca gcggcgaggg ttcggtaaag cgcgaggctt
ggtgaaactg 1200ctcggccacg tggatcagcg ccttggacgg gatgcagccg
atgttcaggc aggtgccgcc 1260cagtgcctgg ccttccacca gtacggtagg
aatgcccagt tgcccggcgc ggatggctgc 1320tacatagccg ccagggccgc
cgccgatgat caacagggta gtctggataa tctgttgcat 138075475PRTListeria
monocytogenes 75Met Ala Lys Glu Tyr Asp Val Val Ile Leu Gly Gly Gly
Thr Gly Gly 1 5 10 15 Tyr Val Ala Ala Ile Gln Ala Ala Lys Asn Gly
Gln Lys Val Ala Val 20 25 30 Val Glu Lys Gly Lys Val Gly Gly Thr
Cys Leu His Arg Gly Cys Ile 35 40 45 Pro Thr Lys Ala Leu Leu Arg
Ser Ala Glu Val Leu Gln Thr Val Lys 50 55 60 Lys Ala Ser Glu Phe
Gly Ile Ser Val Glu Gly Thr Ala Gly Ile Asn 65 70
75 80 Phe Leu Gln Ala Gln Glu Arg Lys Gln Ala Ile Val Asp Gln Leu
Glu 85 90 95 Lys Gly Ile His Gln Leu Phe Lys Gln Gly Lys Ile Asp
Leu Phe Val 100 105 110 Gly Thr Gly Thr Ile Leu Gly Pro Ser Ile Phe
Ser Pro Thr Ala Gly 115 120 125 Thr Ile Ser Val Glu Phe Glu Asp Gly
Ser Glu Asn Glu Met Leu Ile 130 135 140 Pro Lys Asn Leu Ile Ile Ala
Thr Gly Ser Lys Pro Arg Thr Leu Ser 145 150 155 160 Gly Leu Thr Ile
Asp Glu Glu His Val Leu Ser Ser Asp Gly Ala Leu 165 170 175 Asn Leu
Glu Thr Leu Pro Lys Ser Ile Ile Ile Val Gly Gly Gly Val 180 185 190
Ile Gly Met Glu Trp Ala Ser Met Met His Asp Phe Gly Val Glu Val 195
200 205 Thr Val Leu Glu Tyr Ala Asp Arg Ile Leu Pro Thr Glu Asp Lys
Glu 210 215 220 Val Ala Lys Glu Leu Ala Arg Leu Tyr Lys Lys Lys Lys
Leu Asn Met 225 230 235 240 His Thr Ser Ala Glu Val Gln Ala Ala Ser
Tyr Lys Lys Thr Asp Thr 245 250 255 Gly Val Glu Ile Lys Ala Ile Ile
Lys Gly Glu Glu Gln Thr Phe Thr 260 265 270 Ala Asp Lys Ile Leu Val
Ser Val Gly Arg Ser Ala Thr Thr Glu Asn 275 280 285 Ile Gly Leu Gln
Asn Thr Asp Ile Ala Thr Glu Asn Gly Phe Ile Gln 290 295 300 Val Asn
Asp Phe Tyr Gln Thr Lys Glu Ser His Ile Tyr Ala Ile Gly 305 310 315
320 Asp Cys Ile Pro Thr Ile Gln Leu Ala His Val Ala Met Glu Glu Gly
325 330 335 Thr Ile Ala Ala Asn His Ile Ala Gly Lys Ala Ala Glu Lys
Leu Asp 340 345 350 Tyr Asp Leu Val Pro Arg Cys Ile Tyr Thr Ser Thr
Glu Ile Ala Ser 355 360 365 Val Gly Ile Thr Glu Glu Gln Ala Lys Glu
Arg Gly His Glu Val Lys 370 375 380 Lys Gly Lys Phe Phe Phe Arg Gly
Ile Gly Lys Ala Leu Val Tyr Gly 385 390 395 400 Glu Ser Asp Gly Phe
Ile Lys Ile Ile Ala Asp Lys Lys Thr Asp Asp 405 410 415 Ile Leu Gly
Val Ser Met Ile Gly Pro His Val Thr Asp Met Ile Ser 420 425 430 Glu
Ala Ala Leu Ala Gln Val Leu Asn Ala Thr Pro Trp Glu Val Gly 435 440
445 Asn Thr Ile His Pro His Pro Thr Leu Ser Glu Ser Phe Arg Glu Ala
450 455 460 Ala Leu Ala Val Asp Gly Asn Ala Ile His Gly 465 470 475
761428DNAListeria monocytogenes 76gtggcaaaag aatatgatgt agttattctt
ggcggaggaa ctggcggtta cgttgcagca 60attcaagcag ctaagaatgg ccagaaagta
gccgtcgttg aaaaagggaa agttggagga 120acgtgtcttc accgtgggtg
tattccaacg aaagcgttat tacgttcagc ggaagttcta 180caaacggtaa
aaaaagcaag tgaatttggt atttctgtag aaggaactgc cggaatcaat
240tttttacaag cacaagaacg aaaacaagca atagtagatc aattagaaaa
aggtattcac 300caattattta aacaagggaa aattgacttg tttgttggaa
cgggaactat tttgggacca 360tcaatttttt caccaacagc tggaacaatt
tcagttgaat tcgaagatgg ttctgaaaat 420gaaatgctaa ttcctaaaaa
cttaattatc gcaactgggt ccaaaccgcg cacattaagc 480ggtttaacaa
tcgatgagga acatgtttta tcatctgacg gcgcgcttaa cctagaaact
540ttaccaaaat caattattat tgttggcggt ggggttatcg gaatggaatg
ggcttcgatg 600atgcatgatt tcggtgtaga agttacggtg ctagaatatg
cagaccgaat tttgccaaca 660gaagataaag aagtggccaa agaattagca
agactttata aaaagaaaaa attaaacatg 720catacatctg ctgaagttca
agcagctagt tataaaaaaa cagatactgg tgtggaaatt 780aaagcaatca
ttaaaggcga agagcagact ttcacagcag ataaaattct tgtttcagtt
840ggtcgttctg ctactacaga aaacatcggc ttacaaaata cagatatcgc
gaccgaaaac 900ggctttatcc aagtaaatga tttttaccaa acaaaagaaa
gtcacatcta tgcgattgga 960gactgcattc caacgattca actcgcgcac
gttgcaatgg aagaaggaac aattgcagcc 1020aaccatattg ccggaaaagc
agccgaaaaa cttgactacg acttagttcc ccgctgtatt 1080tatacttcta
cagaaatcgc aagtgtcggt atcacagaag aacaagcaaa agaacggggt
1140catgaagtga aaaaaggcaa attcttcttc cgtggtatcg ggaaagcgct
cgtttacgga 1200gaatcagatg gcttcattaa aattattgca gataaaaaaa
cagacgatat cttaggcgtg 1260agcatgattg gaccacacgt tacggacatg
attagcgaag ccgctttagc acaagtttta 1320aatgcaacgc cgtgggaagt
gggcaacacg attcacccgc acccaacttt atcagaaagt 1380tttagagaag
ctgcccttgc tgtggatggc aatgcaattc acggttaa 142877478PRTStreptomyces
avermitilis 77Met Glu Asn Met Asn Thr Pro Asp Val Ile Val Ile Gly
Gly Gly Thr 1 5 10 15 Gly Gly Tyr Ser Ala Ala Leu Arg Ala Ala Ala
Leu Gly Leu Thr Val 20 25 30 Val Leu Ala Glu Arg Asp Lys Val Gly
Gly Thr Cys Leu His Arg Gly 35 40 45 Cys Ile Pro Ser Lys Ala Met
Leu His Ala Ala Glu Leu Val Asp Gly 50 55 60 Ile Ala Glu Ala Arg
Glu Arg Trp Gly Val Lys Ala Thr Leu Asp Asp 65 70 75 80 Ile Asp Trp
Pro Ala Leu Val Ala Thr Arg Asp Asp Ile Val Thr Arg 85 90 95 Asn
His Arg Gly Val Glu Ala His Leu Ala His Ala Arg Val Arg Val 100 105
110 Val Arg Gly Ser Ala Arg Leu Thr Gly Pro Arg Ser Val Arg Val Glu
115 120 125 Gly Ala Pro Asp Asp Leu Pro Gly Gly Ala Gly Asp Phe Thr
Ala Arg 130 135 140 Arg Gly Ile Val Leu Ala Thr Gly Ser Arg Pro Arg
Thr Leu Pro Gly 145 150 155 160 Leu Val Pro Asp Gly Arg Arg Val Val
Thr Ser Asp Asp Ala Leu Phe 165 170 175 Ala Pro Gly Leu Pro Arg Ser
Val Leu Val Leu Gly Gly Gly Ala Ile 180 185 190 Gly Val Glu Tyr Ala
Ser Phe His Arg Ser Met Gly Ala Glu Val Thr 195 200 205 Leu Val Glu
Ala Ala Asp Arg Ile Val Pro Leu Glu Asp Val Asp Val 210 215 220 Ser
Arg His Leu Thr Arg Gly Leu Lys Lys Arg Gly Ile Asp Val Arg 225 230
235 240 Ala Gly Ala Arg Leu Leu Asp Ala Glu Leu Leu Glu Ala Gly Val
Arg 245 250 255 Ala Arg Val Arg Thr Val Arg Gly Glu Ile Arg Thr Leu
Glu Ala Glu 260 265 270 Arg Leu Leu Val Ala Val Gly Arg Ala Pro Val
Thr Asp Gly Leu Asp 275 280 285 Leu Ala Ala Ala Gly Leu Ala Thr Asp
Glu Arg Gly Phe Val Thr Pro 290 295 300 Ser Asp Trp Asp Arg Leu Glu
Thr Ala Val Pro Gly Ile His Val Val 305 310 315 320 Gly Asp Leu Leu
Pro Pro Pro Ser Leu Gly Leu Ala His Ala Ser Phe 325 330 335 Ala Glu
Gly Leu Ser Val Ala Glu Thr Leu Ala Gly Leu Pro Ser Ala 340 345 350
Pro Val Asp Tyr Ala Ala Val Pro Arg Val Thr Tyr Ser Ser Pro Gln 355
360 365 Thr Ala Ser Val Gly Leu Gly Glu Ala Glu Ala Arg Ala Arg Gly
His 370 375 380 Glu Val Asp Val Asn Thr Met Pro Leu Thr Ala Val Ala
Lys Gly Met 385 390 395 400 Val His Gly Arg Gly Gly Met Val Lys Val
Val Ala Glu Glu Gly Gly 405 410 415 Gly Gln Val Leu Gly Val His Leu
Val Gly Pro His Val Ser Glu Met 420 425 430 Ile Ala Glu Ser Gln Leu
Ile Val Gly Trp Asp Ala Gln Pro Ser Asp 435 440 445 Val Ala Arg His
Ile His Ala His Pro Thr Leu Ser Glu Ala Val Gly 450 455 460 Glu Thr
Phe Leu Thr Leu Ala Gly Arg Gly Leu His Gln Gln 465 470 475
781437DNAStreptomyces avermitilis 78gtggagaaca tgaacacacc
ggacgtcatc gtcatcggag gcggcaccgg cggctacagc 60gccgccctgc gcgccgccgc
cctcggtctg accgtggtgc tcgccgagcg ggacaaggtc 120ggcggaacct
gtctgcaccg tggctgcatt ccgagcaagg cgatgctgca cgccgcagaa
180ctggtcgacg gcatcgccga ggcgcgcgag cgctgggggg tgaaggccac
gctggacgac 240atcgactggc ctgcgctcgt cgccacgcgc gacgacatag
tgacgcgcaa ccaccgcggc 300gtggaggcgc acctcgccca cgcgcgcgtg
cgcgtcgtcc ggggcagtgc ccggctgacc 360ggtccgcgca gcgtccgcgt
cgagggtgct ccggacgacc tgccgggcgg cgcgggcgac 420ttcaccgcgc
gccggggcat cgtcctggcg accggctcac ggccgcgtac gctcccgggg
480ctcgtgccgg acgggcggcg cgtggtgacg agcgacgacg cgctgttcgc
gcccggcctc 540ccccgctccg tgctggtcct gggcggcggt gcgatcgggg
tcgagtacgc ctcgttccac 600cgctccatgg gtgcggaggt cactctcgtc
gaggccgccg accggatcgt gccgctcgaa 660gacgtcgacg tcagccgtca
tctgacgcgc ggtctgaaga agcgcggcat cgatgtgcgg 720gcgggggcgc
ggctgctcga cgccgaactc ctggaggcgg gggtacgcgc gcgcgtacgc
780accgtgcggg gcgagatccg cacactggag gccgagcggc tcctggtggc
ggtcgggcgg 840gcgccggtca ccgacgggct ggacctggcc gccgcgggcc
tggcgacgga cgagcggggt 900tttgtgacgc cgtccgactg ggaccgtctg
gagaccgcgg tgcccggcat ccacgtggtg 960ggcgacctgc tgccaccgcc
gtccctggga ctggcccacg cgtcgttcgc cgagggcctg 1020tcggtggccg
agacgctggc cgggctgccg tccgcgcccg tggactacgc ggccgtgccc
1080cgggtcacgt actcgtcgcc gcagaccgcc tccgtggggc tgggcgaggc
ggaggcacgc 1140gcgcgtggac acgaggtgga cgtcaacacg atgccgctga
ccgccgtcgc caagggcatg 1200gtccacggcc ggggcgggat ggtgaaggtc
gtcgccgagg agggcggcgg gcaggtgctc 1260ggcgtgcatc tggtgggccc
ccacgtgtcc gagatgatcg ccgagagcca gctgatcgtc 1320ggctgggacg
cacagccctc cgacgtggcc cggcacatcc acgcgcaccc cacgctgtcc
1380gaggcggtcg gcgaaacgtt tctcacgctc gcgggacggg ggctgcatca gcagtga
143779476PRTMicrococcus luteus 79Met Thr Glu Glu Asn Ser Thr Phe
Ile Pro Ser Leu Thr Ile Ile Gly 1 5 10 15 Gly Gly Pro Gly Gly Tyr
Glu Ala Ala Met Val Ala Ala Lys Leu Gly 20 25 30 Ala Arg Val Thr
Leu Val Glu Arg Gln Gly Val Gly Gly Ala Ala Val 35 40 45 Leu Thr
Asp Val Val Pro Ser Lys Thr Leu Ile Ala Ala Ala Asp Ser 50 55 60
Met Arg Arg Val Gly Ala Ser Val Asp Leu Gly Val Asp Leu Gly Gly 65
70 75 80 Ala Glu Val His Ala Asp Met Gly Arg Val Gly His Arg Ile
Leu Asn 85 90 95 Leu Ala His Glu Gln Ser Ser Asp Ile Arg Ala Gly
Leu Glu Arg Val 100 105 110 Gly Val Arg Val Ile Asp Gly Val Gly Arg
Val Val Gly Pro His Glu 115 120 125 Val Ser Val Arg Ala Leu Asp Asp
Ala Asp Ala Gly Ala Glu Pro Glu 130 135 140 Ile Ile Thr Ser Asp Ala
Ile Leu Val Ala Val Gly Ala Ser Pro Arg 145 150 155 160 Glu Leu Pro
Thr Ala Val Pro Asp Gly Glu Arg Ile Phe Asn Trp Lys 165 170 175 Gln
Val Tyr Asn Leu Lys Glu Leu Pro Glu His Leu Ile Val Val Gly 180 185
190 Ser Gly Val Thr Gly Ala Glu Phe Ala Ser Ala Tyr Asn Arg Leu Gly
195 200 205 Ala Lys Val Thr Leu Val Ser Ser Arg Asp Arg Val Leu Pro
Gly Glu 210 215 220 Asp Ala Asp Ala Ala Glu Leu Leu Glu Lys Val Phe
Glu Gly Asn Gly 225 230 235 240 Leu Arg Val Val Ser Arg Ser Arg Ala
Glu Ser Val Glu Arg Thr Glu 245 250 255 Thr Gly Val Arg Val His Leu
Ser Gly Glu Gly Ala Glu Asp Thr Pro 260 265 270 Ser Ile Glu Gly Ser
His Ala Leu Val Ala Val Gly Gly Val Pro Asn 275 280 285 Thr Ala Gly
Leu Gly Leu Asp Asp Val Gly Val Lys Leu Ala Asp Ser 290 295 300 Gly
His Val Leu Val Asp Gly Val Ser Arg Thr Ser Val Pro Ser Ile 305 310
315 320 Tyr Ala Ala Gly Asp Cys Thr Gly Lys Leu Ala Leu Ala Ser Val
Ala 325 330 335 Ala Met Gln Gly Arg Ile Ala Val Ala His Leu Leu Gly
Asp Ala Leu 340 345 350 Lys Pro Leu Arg Pro His Leu Leu Ala Ser Asn
Ile Phe Thr Ser Pro 355 360 365 Glu Ile Ala Thr Val Gly Val Ser Gln
Ala Gln Val Asp Ser Gly Gln 370 375 380 Tyr Gln Ala Asp Val Leu Arg
Leu Asp Phe His Thr Asn Pro Arg Ala 385 390 395 400 Lys Met Ser Gly
Ala Glu Glu Gly Phe Val Lys Ile Phe Ala Arg Gln 405 410 415 Gly Ser
Gly Thr Val Ile Gly Gly Val Val Val Ser Pro Arg Ala Ser 420 425 430
Glu Leu Ile Tyr Ala Leu Ala Leu Ala Val Thr His Lys Leu His Val 435
440 445 Asp Asp Leu Ala Asp Thr Phe Thr Val Tyr Pro Ser Met Ser Gly
Ser 450 455 460 Ile Ala Glu Ala Ala Arg Arg Leu His Val Arg Val 465
470 475 801431DNAMicrococcus luteus 80gtgaccgagg aaaacagcac
cttcatcccg tccctgacca tcatcggcgg cggccccggc 60ggctacgagg ccgccatggt
ggccgcgaag ctgggcgccc gcgtgaccct ggtcgagcgc 120cagggggtgg
gcggcgcggc cgtcctcacg gacgtggtcc cctccaagac gctgatcgcc
180gccgccgact cgatgcgccg cgtgggcgcc tccgtggacc tgggggtcga
cctcggcggg 240gccgaggtcc acgcggacat gggccgggtc ggccaccgca
tcctgaacct ggcccacgag 300cagtcctcgg acatccgcgc gggcctcgag
cgggtcggtg tccgggtgat cgacggcgtg 360ggccgcgtcg tcggccccca
cgaggtgtcc gtccgcgccc tcgacgacgc cgacgccggc 420gccgagcccg
agatcatcac ctcggacgcg atcctcgtgg ccgtcggcgc gagtccccgg
480gagctgccca ccgccgtccc ggacggcgag cggatcttca actggaagca
ggtctacaac 540ctcaaggagc tgcccgagca cctgatcgtc gtgggctccg
gcgtcaccgg cgccgagttc 600gcctcggcct acaaccgcct cggcgccaag
gtcaccctcg tctcctcgcg cgaccgcgtg 660ctccccggcg aggacgccga
cgccgcagag ctgctcgaga aggtcttcga gggcaacggc 720ctcagggttg
tctcccgctc ccgggccgag tcggtcgagc ggaccgagac cggcgtgcgc
780gtgcacctct ccggcgaggg ggccgaagac accccgtcga tcgagggctc
ccacgcgctg 840gtggccgtcg gcggcgtgcc gaacacggcg ggcctcggcc
tcgacgacgt gggcgtgaag 900ctggccgact ccggccacgt gctcgtggac
ggcgtctccc gcacgtccgt gccgagcatc 960tacgcggcgg gcgactgcac
gggcaagctc gccctcgcct cggtggcggc catgcagggg 1020cgcatcgccg
tggcccacct gctcggcgac gccctcaagc cgctgcgccc gcacctgctg
1080gcctcgaaca tcttcacctc gccggagatc gccaccgtgg gcgtctcgca
ggcgcaggtg 1140gactccggcc agtaccaggc ggacgtgctg cgactggact
tccacaccaa cccccgcgcc 1200aagatgtccg gcgcggagga ggggttcgtg
aagatcttcg cgcgtcaggg ctccggcacc 1260gtgatcggcg gcgtggtggt
ctcgccgcgc gcctccgagc tgatctacgc gctcgcgctc 1320gcggtcacgc
acaagttgca cgtggacgac ctcgcggaca ccttcaccgt gtacccgtcc
1380atgtccgggt cgatcgcgga ggcggcgcgc cgcctccatg tgcgggtgtg a
143181473PRTStaphylococcus aureus 81Met Ser Glu Lys Gln Tyr Asp Leu
Val Val Leu Gly Gly Gly Thr Ala 1 5 10 15 Gly Tyr Val Ala Ala Ile
Arg Ala Ser Gln Leu Gly Lys Lys Val Ala 20 25 30 Ile Val Glu Arg
Gln Leu Leu Gly Gly Thr Cys Leu His Lys Gly Cys 35 40 45 Ile Pro
Thr Lys Ser Leu Leu Lys Ser Ala Glu Val Phe Gln Thr Val 50 55 60
Lys Gln Ala Ala Met Phe Gly Val Asp Val Lys Asp Ala Asn Val Asn 65
70 75 80 Phe Glu Asn Met Leu Ala Arg Lys Glu Asp Ile Ile Asn Gln
Met Tyr 85 90 95 Gln Gly Val Lys His Leu Met Gln His Asn His Ile
Asp Ile Tyr Asn 100 105 110 Gly Thr Gly Arg Ile Leu Gly Thr Ser Ile
Phe Ser Pro Gln Ser Gly 115 120 125 Thr Ile Ser Val Glu Tyr Glu Asp
Gly Glu Ser Asp Leu Leu Pro Asn 130 135 140 Gln Phe Val Leu Ile Ala
Thr Gly Ser Ser Pro Ala Glu Leu Pro Phe 145 150 155 160 Leu Ser Phe
Asp His Asp Lys Ile Leu Ser Ser Asp Asp Ile Leu Ser 165 170 175 Leu
Lys Thr Leu Pro Ser Ser Ile Gly Ile Ile Gly Gly Gly Val Ile 180 185
190 Gly Met Glu Phe Ala Ser Leu Met Ile Asp Leu Gly Val Asp Val Thr
195 200 205 Val Ile Glu Ala Gly Glu Arg Ile Leu Pro Thr Glu Ser Lys
Gln Ala 210 215 220 Ser Gln Leu Leu Lys Lys Ser Leu Ser Ala Arg Gly
Val Lys Phe Tyr 225 230 235 240 Glu Gly Ile Lys Leu Ser Glu Asn Asp
Ile Asn Val Asn Glu Asp Gly 245 250 255 Val Thr Phe Glu Ile Ser Ser
Asp Ile Ile Lys Val Asp Lys Val Leu
260 265 270 Leu Ser Ile Gly Arg Lys Pro Asn Thr Ser Asp Ile Gly Leu
Asn Asn 275 280 285 Thr Lys Ile Lys Leu Ser Thr Ser Gly His Ile Leu
Thr Asn Glu Phe 290 295 300 Gln Gln Thr Glu Asp Lys His Ile Tyr Ala
Ala Gly Asp Cys Ile Gly 305 310 315 320 Lys Leu Gln Leu Ala His Val
Gly Ser Lys Glu Gly Val Val Ala Val 325 330 335 Asp His Met Phe Glu
Gly Asn Pro Ile Pro Val Asn Tyr Asn Met Met 340 345 350 Pro Lys Cys
Ile Tyr Ser Gln Pro Glu Ile Ala Ser Ile Gly Leu Asn 355 360 365 Ile
Glu Gln Ala Lys Ala Glu Gly Met Lys Val Lys Ser Phe Lys Val 370 375
380 Pro Phe Lys Ala Ile Gly Lys Ala Val Ile Asp Ser His Asp Ala Asn
385 390 395 400 Glu Gly Tyr Ser Glu Met Val Ile Asp Gln Ser Thr Glu
Glu Ile Val 405 410 415 Gly Ile Asn Met Ile Gly Pro His Val Thr Glu
Leu Ile Asn Glu Ala 420 425 430 Ser Leu Leu Gln Phe Met Asn Gly Ser
Ala Leu Glu Leu Gly Leu Thr 435 440 445 Thr His Ala His Pro Ser Ile
Ser Glu Val Leu Met Glu Leu Gly Leu 450 455 460 Lys Ala Glu Ser Arg
Ala Ile His Val 465 470 821422DNAStaphylococcus aureus 82ttatacgtga
atagctctac tttctgcttt caatcctaat tccatcaaca cttcagagat 60ggaaggatgt
gcgtgtgttg ttagtcctaa ttctaatgcc gagccattca tgaactgtaa
120cagtgatgcc tcattaatca attctgttac atgtggacca atcatattaa
tacccacaat 180ttcttcagtt gattgatcaa tcaccatttc gctataccct
tcgtttgcgt catggctatc 240aatcactgct ttaccaattg ctttaaatgg
tactttaaaa cttttaactt tcattccctc 300tgcctttgct tgttcaatgt
ttaaaccgat agaagcaatt tcaggttgtg aataaataca 360cttaggcatc
atgttatagt ttactgggat tgggttcccc tcaaacatat gatcaacagc
420cacaacacct tcttttgatc caacatgtgc caattgtaat tttcctatac
aatcaccagc 480tgcataaata tgtttatctt cagtttgttg aaattcgttc
gttaaaatat gtcctgatgt 540agaaagtttt attttagtgt tgtttaaacc
aatatctgat gtgttaggtt ttctaccaat 600cgatagcaac actttatcta
ctttaattat gtctgaagaa atttcaaacg taacaccatc 660ttcgttaaca
tttatatcat tttcagaaag ttttattccc tcatagaatt taacaccacg
720tgctgacaat gattttttta atagttgtga agcttgttta ctttcagttg
gtaaaattct 780ttcacctgct tctataactg ttacgtcaac acctaaatct
atcatcaatg atgcaaattc 840cattccgata acaccaccac caataatacc
aatacttgat ggtaacgtct ttaatgataa 900tatatcatcg ctagataaaa
ttttatcatg atcaaatgat aagaatggca actctgcagg 960cgaagaacca
gttgcaatta atacaaattg gttgggtaat aagtctgatt caccatcttc
1020atattcgaca gaaattgtgc cactttgagg tgaaaatata gatgtaccta
gaatacgtcc 1080cgtgccatta taaatgtcaa tgtgattgtg ttgcattaaa
tgctttacac cttgatacat 1140ttgattaata atgtcttctt ttcgtgccaa
catattttca aaattaacat tagcatcttt 1200gacatcaacg ccaaacattg
ctgcctgttt tactgtttga aatacttcag cagatttaag 1260cagcgattta
gtaggaatac aacctttatg gagacaagta cctcctaata gttgtcgttc
1320tactattgcc actttcttac ctaattgaga cgcacgtatc gcagcaacat
atcctgcagt 1380acctccaccg agaacgacta aatcatattg tttctctgac at
142283581PRTStreptococcus mutans 83Met Ala Val Glu Ile Ile Met Pro
Lys Leu Gly Val Asp Met Gln Glu 1 5 10 15 Gly Glu Ile Ile Glu Trp
Lys Lys Gln Glu Gly Asp Glu Val Lys Glu 20 25 30 Gly Asp Ile Leu
Leu Glu Ile Met Ser Asp Lys Thr Asn Met Glu Ile 35 40 45 Glu Ala
Glu Asp Ser Gly Val Leu Leu Lys Ile Val Lys Gly Asn Gly 50 55 60
Gln Val Val Pro Val Thr Glu Val Ile Gly Tyr Ile Gly Ser Ala Gly 65
70 75 80 Glu Thr Ile Glu Thr Asn Ala Ala Pro Ala Ala Ser Ala Asp
Asp Leu 85 90 95 Lys Ala Ala Gly Leu Glu Val Pro Asp Thr Leu Gly
Glu Ser Ala Ala 100 105 110 Pro Ala Ala Gln Lys Thr Pro Leu Ala Asp
Asp Glu Tyr Asp Met Ile 115 120 125 Val Val Gly Gly Gly Pro Ala Gly
Tyr Tyr Ala Ala Ile Arg Gly Ala 130 135 140 Gln Leu Gly Gly Lys Val
Ala Ile Val Glu Lys Ser Glu Phe Gly Gly 145 150 155 160 Thr Cys Leu
Asn Lys Gly Cys Ile Pro Thr Lys Thr Tyr Leu Lys Asn 165 170 175 Ala
Glu Ile Leu Asp Gly Ile Lys Ile Ala Ala Gly Arg Gly Ile Asn 180 185
190 Phe Ala Ser Thr Asn Tyr Thr Ile Asp Met Asp Lys Thr Val Ala Phe
195 200 205 Lys Asp Thr Val Val Lys Thr Leu Thr Ser Gly Val Gln Gly
Leu Leu 210 215 220 Lys Ala Asn Lys Val Thr Ile Phe Asn Gly Leu Gly
Gln Val Asn Pro 225 230 235 240 Asp Lys Thr Val Thr Val Gly Ser Glu
Thr Ile Lys Gly His Asn Ile 245 250 255 Ile Leu Ala Thr Gly Ser Lys
Val Ser Arg Ile Asn Ile Pro Gly Ile 260 265 270 Asp Ser Pro Leu Val
Leu Thr Ser Asp Asp Ile Leu Asp Leu Arg Glu 275 280 285 Ile Pro Lys
Ser Leu Ala Val Met Gly Gly Gly Val Val Gly Ile Glu 290 295 300 Leu
Gly Leu Val Tyr Ala Ser Tyr Gly Thr Glu Val Thr Val Ile Glu 305 310
315 320 Met Ala Asp Arg Ile Ile Pro Ala Met Asp Lys Glu Val Ser Leu
Glu 325 330 335 Leu Gln Lys Ile Leu Ser Lys Lys Gly Met Asn Ile Lys
Thr Ser Val 340 345 350 Gly Val Ala Glu Ile Val Glu Ala Asn Asn Gln
Leu Thr Leu Lys Leu 355 360 365 Asn Asp Gly Ser Glu Val Val Ala Glu
Lys Ala Leu Leu Ser Ile Gly 370 375 380 Arg Val Pro Gln Leu Ser Gly
Leu Glu Asn Leu Asn Leu Glu Leu Glu 385 390 395 400 Arg Gly Arg Ile
Lys Val Asp Asp Tyr Gln Glu Thr Ser Ile Ser Gly 405 410 415 Ile Tyr
Ala Pro Gly Asp Val Asn Gly Arg Lys Met Leu Ala His Ala 420 425 430
Ala Tyr Arg Met Gly Glu Val Ala Ala Glu Asn Ala Ile Trp Gly Asn 435
440 445 Val Arg Lys Ala Asn Leu Lys Tyr Thr Pro Ala Ala Val Tyr Thr
His 450 455 460 Pro Glu Val Ala Met Cys Gly Ile Thr Glu Glu Gln Ala
Arg Gln Glu 465 470 475 480 Tyr Gly Asn Val Leu Val Gly Lys Ser Ser
Phe Ser Gly Asn Gly Arg 485 490 495 Ala Ile Ala Ser Asn Glu Ala Gln
Gly Phe Val Lys Val Val Ala Asp 500 505 510 Ala Lys Tyr His Glu Ile
Leu Gly Val His Ile Ile Gly Pro Ala Ala 515 520 525 Ala Glu Met Ile
Asn Glu Ala Ser Thr Ile Met Glu Asn Glu Leu Thr 530 535 540 Val Asp
Glu Leu Leu Arg Ser Ile His Gly His Pro Thr Phe Ser Glu 545 550 555
560 Val Met Tyr Glu Ala Phe Ala Asp Val Leu Gly Glu Ala Ile His Asn
565 570 575 Pro Pro Lys Arg Arg 580 841746DNAStreptococcus mutans
84atggcagtcg aaattattat gcctaaactc ggtgttgata tgcaggaagg cgaaatcatc
60gagtggaaaa aacaagaagg tgatgaggtc aaagaagggg atatcctcct tgaaatcatg
120tctgacaaga ccaatatgga aattgaagct gaggattcag gtgtcctgct
caaaattgtt 180aaaggaaatg gtcaagttgt ccctgtgact gaggtcattg
gttatattgg ttctgctggt 240gaaacgattg aaacaaatgc agcgccagca
gcttcagctg atgatctcaa agcagcgggt 300cttgaagttc ctgatacttt
aggcgagtca gcagcaccag cagctcaaaa aactccgctt 360gctgatgatg
agtatgatat gattgtcgtt ggtggtggtc ctgctggtta ttatgctgct
420attcgcggtg cacaattggg cggcaaggtt gctatcgtcg aaaaatcaga
atttggaggg 480acttgtttaa ataaaggctg cattccaact aaaacttatc
ttaagaatgc tgaaatcctt 540gatggcatca aaattgcagc gggtcgcggt
attaattttg cttcaaccaa ctataccatt 600gacatggaca aaacggttgc
ctttaaagat accgttgtta aaacattgac aagtggggtt 660cagggtcttc
ttaaagccaa taaagtgact attttcaatg gtctcggtca ggttaatcct
720gataagacag tgactgtcgg ttcggaaacg attaaaggac ataatattat
ccttgcaaca 780ggttcaaaag tgtctcgtat taatattccg ggaattgatt
cacctcttgt tttaacatcg 840gatgatattc ttgatcttcg tgaaattcca
aagtcacttg ctgttatggg cggtggtgtt 900gtcggcattg aactcggtct
tgtttacgct tcctatggta cagaagtgac tgttattgaa 960atggctgatc
gcattattcc tgctatggac aaggaagtat cgcttgaact gcaaaaaatt
1020ctatccaaga aaggaatgaa cattaagact tctgttggtg tggctgaaat
tgttgaagct 1080aacaatcaat taacgctgaa actcaatgac ggctctgaag
ttgtggctga aaaggccctg 1140ctttctattg gtcgtgtccc acaattaagc
ggtttagaaa atcttaatct ggaacttgaa 1200cgcggtcgca tcaaagtgga
cgattatcag gaaacctcta tttcaggtat ttatgccccg 1260ggtgatgtta
atggaagaaa gatgttagcg catgctgcct atcgtatggg tgaagtagct
1320gccgaaaatg ctatctgggg aaatgttcgt aaggctaacc tgaaatatac
accagcagct 1380gtttacaccc atccagaggt tgctatgtgc ggtattactg
aagaacaagc ccgtcaagaa 1440tatggaaacg tcttagttgg gaaatcctct
ttttcaggaa atggacgtgc gatcgcttct 1500aatgaagcac aaggatttgt
caaagttgtc gcagatgcta aataccatga aattcttgga 1560gtccatatta
ttggaccagc agctgctgag atgattaatg aagcctcaac gattatggaa
1620aatgagttga cggttgatga gctgctacgt tctattcatg gccatcctac
cttctcggag 1680gttatgtatg aagcctttgc agacgtcctt ggcgaagcta
tccataaccc gccaaagcgt 1740cgttaa 17468517PRTArtificial
sequenceSynthetic polypeptide 85Xaa Gly Gly Xaa Xaa Xaa Xaa Xaa Xaa
Cys Xaa Pro Xaa Lys Xaa Xaa 1 5 10 15 Xaa 8617PRTArtificial
sequenceSynthetic polypeptide 86Xaa Ala Thr Gly Xaa Xaa Xaa Xaa Xaa
Leu Xaa Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Gly 8712PRTArtificial
sequenceSynthetic polypeptide 87Xaa Xaa Gly Xaa Gly Xaa Xaa Gly Xaa
Glu Xaa Xaa 1 5 10 8814PRTArtificial sequenceSynthetic polypeptide
88Thr Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Gly Asp Xaa Xaa Xaa 1 5 10
8911PRTArtificial sequenceSynthetic polypeptide 89Xaa Xaa Xaa Xaa
Xaa Xaa Xaa Xaa Pro Xaa Xaa 1 5 10 90317PRTEscherichia coli 90Met
Tyr Thr Lys Ile Ile Gly Thr Gly Ser Tyr Leu Pro Glu Gln Val 1 5 10
15 Arg Thr Asn Ala Asp Leu Glu Lys Met Val Asp Thr Ser Asp Glu Trp
20 25 30 Ile Val Thr Arg Thr Gly Ile Arg Glu Arg His Ile Ala Ala
Gln Asn 35 40 45 Glu Thr Val Ser Thr Met Gly Phe Glu Ala Ala Thr
Arg Ala Ile Glu 50 55 60 Met Ala Gly Ile Glu Lys Asp Gln Ile Gly
Leu Ile Val Val Ala Thr 65 70 75 80 Thr Ser Ala Thr His Ala Phe Pro
Ser Ala Ala Cys Gln Ile Gln Ser 85 90 95 Met Leu Gly Ile Lys Gly
Cys Pro Ala Phe Asp Val Ala Ala Ala Cys 100 105 110 Ala Gly Phe Thr
Tyr Ala Leu Ser Val Ala Asp Gln Tyr Val Lys Ser 115 120 125 Gly Ala
Val Lys Tyr Ala Leu Val Val Gly Ser Asp Val Leu Ala Arg 130 135 140
Thr Cys Asp Pro Thr Asp Arg Gly Thr Ile Ile Ile Phe Gly Asp Gly 145
150 155 160 Ala Gly Ala Ala Val Leu Ala Ala Ser Glu Glu Pro Gly Ile
Ile Ser 165 170 175 Thr His Leu His Ala Asp Gly Ser Tyr Gly Glu Leu
Leu Thr Leu Pro 180 185 190 Asn Ala Asp Arg Val Asn Pro Glu Asn Ser
Ile His Leu Thr Met Ala 195 200 205 Gly Asn Glu Val Phe Lys Val Ala
Val Thr Glu Leu Ala His Ile Val 210 215 220 Asp Glu Thr Leu Ala Ala
Asn Asn Leu Asp Arg Ser Gln Leu Asp Trp 225 230 235 240 Leu Val Pro
His Gln Ala Asn Leu Arg Ile Ile Ser Ala Thr Ala Lys 245 250 255 Lys
Leu Gly Met Ser Met Asp Asn Val Val Val Thr Leu Asp Arg His 260 265
270 Gly Asn Thr Ser Ala Ala Ser Val Pro Cys Ala Leu Asp Glu Ala Val
275 280 285 Arg Asp Gly Arg Ile Lys Pro Gly Gln Leu Val Leu Leu Glu
Ala Phe 290 295 300 Gly Gly Gly Phe Thr Trp Gly Ser Ala Leu Val Arg
Phe 305 310 315 91954DNAEscherichia coli 91atgtatacga agattattgg
tactggcagc tatctgcccg aacaagtgcg gacaaacgcc 60gatttggaaa aaatggtgga
cacctctgac gagtggattg tcactcgtac cggtatccgc 120gaacgccaca
ttgccgcgca aaacgaaacc gtttcaacca tgggctttga agcggcgaca
180cgcgcaattg agatggcggg cattgagaaa gaccagattg gcctgatcgt
tgtggcaacg 240acttctgcta cgcacgcttt cccgagcgca gcttgtcaga
ttcaaagcat gctgggcatt 300aaaggttgcc cggcatttga cgttgcagca
gcctgcgcag gtttcaccta tgcattaagc 360gtagccgatc aatacgtgaa
atctggggcg gtgaagtatg ctctggtcgt cggttccgat 420gtactggcgc
gcacctgcga tccaaccgat cgtgggacta ttattatttt tggcgatggc
480gcgggcgctg cggtgctggc tgcctctgaa gagccgggaa tcatctccac
ccatctgcat 540gccgacggta gctatggtga gttgctgacg ctgcctaatg
ctgaccgtgt gaatccagag 600aattcaattc atctgacgat ggcgggcaac
gaagtcttca aggttgcggt aacggaactg 660gcgcacatcg ttgatgagac
gctggcggca aataatcttg accgttctca actggactgg 720ctggttccgc
atcaggctaa cctgcgtatt atcagtgcaa cggcgaaaaa actcggtatg
780tctatggaca atgtcgtggt gacgctggat cgccacggta atacctctgc
ggcctctgtc 840ccgtgcgcgc tggatgaagc tgtacgcgac gggcgcatta
agccggggca gttggttctg 900cttgaagcct ttggcggtgg attcacctgg
ggctccgcgc tggttcgttt ctag 95492317PRTEscherichia coli 92Met Tyr
Thr Lys Ile Ile Gly Thr Gly Ser Tyr Leu Pro Glu Gln Val 1 5 10 15
Arg Thr Asn Ala Asp Leu Glu Lys Met Val Asp Thr Ser Asp Glu Trp 20
25 30 Ile Val Thr Arg Thr Gly Ile Arg Glu Arg His Ile Ala Ala Pro
Asn 35 40 45 Glu Thr Val Ser Thr Met Gly Phe Glu Ala Ala Thr Arg
Ala Ile Glu 50 55 60 Met Ala Gly Ile Glu Lys Asp Gln Ile Gly Leu
Ile Val Val Ala Thr 65 70 75 80 Thr Ser Ala Thr His Ala Phe Pro Ser
Ala Ala Cys Gln Ile Gln Ser 85 90 95 Met Leu Gly Ile Lys Gly Cys
Pro Ala Phe Asp Val Ala Ala Ala Cys 100 105 110 Ala Gly Phe Thr Tyr
Ala Leu Ser Val Ala Asp Gln Tyr Val Lys Ser 115 120 125 Gly Ala Val
Lys Tyr Ala Leu Val Val Gly Ser Asp Val Leu Ala Arg 130 135 140 Thr
Cys Asp Pro Thr Asp Arg Gly Thr Ile Ile Ile Phe Gly Asp Gly 145 150
155 160 Ala Gly Ala Ala Val Leu Ala Ala Ser Glu Glu Pro Gly Ile Ile
Ser 165 170 175 Thr His Leu His Ala Asp Gly Ser Tyr Gly Glu Leu Leu
Thr Leu Pro 180 185 190 Asn Ala Asp Arg Val Asn Pro Glu Asn Ser Ile
His Leu Thr Met Ala 195 200 205 Gly Asn Glu Val Phe Lys Val Ala Val
Thr Glu Leu Ala His Ile Val 210 215 220 Asp Glu Thr Leu Thr Ala Asn
Asn Leu Asp Arg Ser Gln Leu Asp Trp 225 230 235 240 Leu Val Pro His
Gln Ala Asn Leu Arg Ile Ile Ser Ala Thr Ala Lys 245 250 255 Lys Leu
Gly Met Ser Met Asp Asn Val Val Val Thr Leu Asp Arg His 260 265 270
Gly Asn Thr Ser Ala Ala Ser Val Pro Cys Ala Leu Asp Glu Ala Val 275
280 285 Arg Asp Gly Arg Ile Lys Pro Gly Gln Leu Val Leu Leu Glu Ala
Phe 290 295 300 Gly Gly Gly Phe Thr Trp Gly Ser Ala Leu Val Arg Phe
305 310 315 93954DNAEscherichia coli 93atgtatacga agattattgg
tactggcagc tatctgcccg aacaagtgcg gacaaacgcc 60gatttggaaa aaatggtgga
cacctctgac gagtggattg tcactcgtac cggtatccgc 120gaacgccaca
ttgccgcgcc aaacgaaacc gtttcaacca tgggctttga agcggcgaca
180cgcgcaattg agatggcggg cattgagaaa gaccagattg gcctgatcgt
tgtggcaacg 240acttctgcta cgcacgcttt cccgagcgca gcttgtcaga
ttcaaagcat gttgggcatt 300aaaggttgcc cggcatttga cgttgcagca
gcctgcgcag gtttcaccta tgcattaagc 360gtagccgatc aatacgtgaa
atctggggcg gtgaagtatg ctctggtcgt cggttccgat 420gtactggcgc
gcacctgcga tccaaccgat cgtgggacta ttattatttt tggcgatggc
480gcgggcgctg cggtgctggc tgcctctgaa gagccgggaa tcatctccac
ccatctgcat 540gccgacggta gttatggtga attgctgacg ctgccaaacg
ccgaccgcgt gaatccagag 600aattcaattc atctgacgat ggcgggcaac
gaagtcttca aggttgcggt aacggaactg 660gcgcacatcg ttgatgagac
gctgacggcg aataatcttg
accgttctca actggactgg 720ctggttccgc atcaggctaa cctgcgtatt
atcagtgcaa cggcgaaaaa actcggtatg 780tcgatggaca atgtcgtggt
gacgctggat cgccacggta atacctctgc ggcctctgtc 840ccgtgcgcgc
tggatgaagc tgtacgcgac gggcgcatta agccggggca gttggttctg
900cttgaagcct ttggcggtgg attcacctgg ggctccgcgc tggttcgttt ctag
95494312PRTBacillus subtilis 94Met Lys Ala Gly Ile Leu Gly Val Gly
Arg Tyr Ile Pro Glu Lys Val 1 5 10 15 Leu Thr Asn His Asp Leu Glu
Lys Met Val Glu Thr Ser Asp Glu Trp 20 25 30 Ile Arg Thr Arg Thr
Gly Ile Glu Glu Arg Arg Ile Ala Ala Asp Asp 35 40 45 Val Phe Ser
Ser His Met Ala Val Ala Ala Ala Lys Asn Ala Leu Glu 50 55 60 Gln
Ala Glu Val Ala Ala Glu Asp Leu Asp Met Ile Leu Val Ala Thr 65 70
75 80 Val Thr Pro Asp Gln Ser Phe Pro Thr Val Ser Cys Met Ile Gln
Glu 85 90 95 Gln Leu Gly Ala Lys Lys Ala Cys Ala Met Asp Ile Ser
Ala Ala Cys 100 105 110 Ala Gly Phe Met Tyr Gly Val Val Thr Gly Lys
Gln Phe Ile Glu Ser 115 120 125 Gly Thr Tyr Lys His Val Leu Val Val
Gly Val Glu Lys Leu Ser Ser 130 135 140 Ile Thr Asp Trp Glu Asp Arg
Asn Thr Ala Val Leu Phe Gly Asp Gly 145 150 155 160 Ala Gly Ala Ala
Val Val Gly Pro Val Ser Asp Asp Arg Gly Ile Leu 165 170 175 Ser Phe
Glu Leu Gly Ala Asp Gly Thr Gly Gly Gln His Leu Tyr Leu 180 185 190
Asn Glu Lys Arg His Thr Ile Met Asn Gly Arg Glu Val Phe Lys Phe 195
200 205 Ala Val Arg Gln Met Gly Glu Ser Cys Val Asn Val Ile Glu Lys
Ala 210 215 220 Gly Leu Ser Lys Glu Asp Val Asp Phe Leu Ile Pro His
Gln Ala Asn 225 230 235 240 Ile Arg Ile Met Glu Ala Ala Arg Glu Arg
Leu Glu Leu Pro Val Glu 245 250 255 Lys Met Ser Lys Thr Val His Lys
Tyr Gly Asn Thr Ser Ala Ala Ser 260 265 270 Ile Pro Ile Ser Leu Val
Glu Glu Leu Glu Ala Gly Lys Ile Lys Asp 275 280 285 Gly Asp Val Val
Val Met Val Gly Phe Gly Gly Gly Leu Thr Trp Gly 290 295 300 Ala Ile
Ala Ile Arg Trp Gly Arg 305 310 95939DNABacillus subtilis
95atgaaagctg gaatacttgg tgttggacgt tacattcctg agaaggtttt aacaaatcat
60gatcttgaaa aaatggttga aacttctgac gagtggattc gtacaagaac aggaatagaa
120gaaagaagaa tcgcagcaga tgatgtgttt tcatcacata tggctgttgc
agcagcgaaa 180aatgcgctgg aacaagctga agtggctgct gaggatctgg
atatgatctt ggttgcaact 240gttacacctg atcagtcatt ccctacggtc
tcttgtatga ttcaagaaca actcggcgcg 300aagaaagcgt gtgctatgga
tatcagcgcg gcttgtgcgg gcttcatgta cggggttgta 360accggtaaac
aatttattga atccggaacc tacaagcatg ttctagttgt tggtgtagag
420aagctctcaa gcattaccga ctgggaagac cgcaatacag ccgttctgtt
tggagacgga 480gcaggcgctg cggtagtcgg gccagtcagt gatgacagag
gaatcctttc atttgaacta 540ggagccgacg gcacaggcgg tcagcacttg
tatctgaatg aaaaacgaca tacaatcatg 600aatggacgag aagttttcaa
atttgcagtc cgccaaatgg gagaatcatg cgtaaatgtc 660attgaaaaag
ccggactttc aaaagaggat gtcgactttt tgattccgca tcaggcgaac
720atccgtatca tggaagctgc tcgcgagcgt ttagagcttc ctgtcgaaaa
gatgtctaaa 780actgttcata aatatggaaa tacttctgcc gcatccattc
cgatctctct tgtagaagaa 840ttggaagccg gtaaaatcaa agacggcgat
gtggtcgtta tggtagggtt cggcggagga 900ctaacatggg gcgccattgc
aatccgctgg ggccgataa 93996335PRTStreptomyces avermitilis 96Met Ser
Gly Gly Arg Ala Ala Val Ile Thr Gly Ile Gly Gly Tyr Val 1 5 10 15
Pro Pro Asp Leu Val Thr Asn Asp Asp Leu Ala Gln Arg Leu Asp Thr 20
25 30 Ser Asp Ala Trp Ile Arg Ser Arg Thr Gly Ile Ala Glu Arg His
Val 35 40 45 Ile Ala Pro Gly Thr Ala Thr Ser Asp Leu Ala Val Glu
Ala Gly Leu 50 55 60 Arg Ala Leu Lys Ser Ala Gly Asp Glu His Val
Asp Ala Val Val Leu 65 70 75 80 Ala Thr Thr Thr Pro Asp Gln Pro Cys
Pro Ala Thr Ala Pro Gln Val 85 90 95 Ala Ala Arg Leu Gly Leu Gly
Gln Val Pro Ala Phe Asp Val Ala Ala 100 105 110 Val Cys Ser Gly Phe
Leu Phe Gly Leu Ala Thr Ala Ser Gly Leu Ile 115 120 125 Ala Ala Gly
Val Ala Asp Lys Val Leu Leu Val Ala Ala Asp Ala Phe 130 135 140 Thr
Thr Ile Ile Asn Pro Glu Asp Arg Thr Thr Ala Val Ile Phe Ala 145 150
155 160 Asp Gly Ala Gly Ala Val Val Leu Arg Ala Gly Ala Ala Asp Glu
Pro 165 170 175 Gly Ala Val Gly Pro Leu Val Leu Gly Ser Asp Gly Glu
Leu Ser His 180 185 190 Leu Ile Glu Val Pro Ala Gly Gly Ser Arg Gln
Arg Ser Ser Gly Pro 195 200 205 Thr Thr Asp Pro Asp Asp Gln Tyr Phe
Arg Met Leu Gly Arg Asp Thr 210 215 220 Tyr Arg His Ala Val Glu Arg
Met Thr Asp Ala Ser Gln Arg Ala Ala 225 230 235 240 Glu Leu Ala Asp
Trp Arg Ile Asp Asp Val Asp Arg Phe Ala Ala His 245 250 255 Gln Ala
Asn Ala Arg Ile Leu Asp Ser Val Ala Glu Arg Leu Gly Val 260 265 270
Pro Ala Glu Arg Gln Leu Thr Asn Ile Ala Arg Val Gly Asn Thr Gly 275
280 285 Ala Ala Ser Ile Pro Leu Leu Leu Ser Gln Ala Ala Ala Ala Gly
Arg 290 295 300 Leu Gly Ala Gly His Arg Val Leu Leu Thr Ala Phe Gly
Gly Gly Leu 305 310 315 320 Ser Trp Gly Ala Gly Thr Leu Val Trp Pro
Glu Val Gln Pro Val 325 330 335 971008DNAStreptomyces avermitilis
97atgagcggcg gacgcgcggc ggtgatcacc gggatcgggg gctatgtgcc tcccgatctg
60gtgaccaacg acgatctggc ccagcggctc gacacctccg acgcgtggat ccgctcgcgc
120accgggatcg ccgagcggca tgtgatcgcg cccggcaccg cgacctccga
cctggcggtg 180gaggccggac tgcgggccct gaagtcggcg ggcgacgagc
acgtggacgc ggtcgtcctg 240gccaccacga cgcccgacca gccctgcccg
gcgaccgccc cgcaggtggc cgcacggctg 300ggactcgggc aggtgccggc
gttcgacgtg gccgccgtct gctccggctt cctgttcggc 360ctcgccaccg
cgtccgggct gatcgcggcc ggggtggcgg acaaggtcct gctggtcgcc
420gccgacgcgt tcaccacgat catcaacccc gaggaccgca ccacggccgt
catcttcgcg 480gacggcgcgg gcgcggtggt gctgcgcgcg ggcgccgccg
acgagccggg ggccgtcggc 540ccgctggtgc tcggcagcga cggcgagctg
agccatctca tcgaggtgcc ggcgggcggc 600tcgcgccagc gctcgtccgg
ccccacgacc gacccggacg accagtactt ccggatgctc 660ggccgggaca
cctaccggca cgcggtggag cggatgaccg atgcgtccca gcgggcggcc
720gaactggccg actggcggat cgacgacgtc gaccggttcg cggcgcacca
ggccaacgcc 780cgcatcctcg actcggtcgc ggaacgtctc ggggtccccg
ccgaacggca gttgaccaac 840atcgcccggg tcggcaacac cggcgccgcc
tcgatcccgc tgcttctgtc gcaggcggcc 900gcggccggcc ggctcggcgc
cgggcaccgg gtgctcctga ccgcgttcgg cgggggcctg 960tcctggggcg
cggggactct ggtctggccg gaggtccagc cggtctga
100898313PRTStaphylococcus aureus 98Met Asn Val Gly Ile Lys Gly
Phe Gly Ala Tyr Ala Pro Glu Lys Ile 1 5 10 15 Ile Asp Asn Ala Tyr
Phe Glu Gln Phe Leu Asp Thr Ser Asp Glu Trp 20 25 30 Ile Ser Lys
Met Thr Gly Ile Lys Glu Arg His Trp Ala Asp Asp Asp 35 40 45 Gln
Asp Thr Ser Asp Leu Ala Tyr Glu Ala Ser Val Lys Ala Ile Ala 50 55
60 Asp Ala Gly Ile Gln Pro Glu Asp Ile Asp Met Ile Ile Val Ala Thr
65 70 75 80 Ala Thr Gly Asp Met Pro Phe Pro Thr Val Ala Asn Met Leu
Gln Glu 85 90 95 Arg Leu Gly Thr Gly Lys Val Ala Ser Met Asp Gln
Leu Ala Ala Cys 100 105 110 Ser Gly Phe Met Tyr Ser Met Ile Thr Ala
Lys Gln Tyr Val Gln Ser 115 120 125 Gly Asp Tyr His Asn Ile Leu Val
Val Gly Ala Asp Lys Leu Ser Lys 130 135 140 Ile Thr Asp Leu Thr Asp
Arg Ser Thr Ala Val Leu Phe Gly Asp Gly 145 150 155 160 Ala Gly Ala
Val Ile Ile Gly Glu Val Ser Glu Gly Arg Gly Ile Ile 165 170 175 Ser
Tyr Glu Met Gly Ser Asp Gly Thr Gly Gly Lys His Leu Tyr Leu 180 185
190 Asp Lys Asp Thr Gly Lys Leu Lys Met Asn Gly Arg Glu Val Phe Lys
195 200 205 Phe Ala Val Arg Ile Met Gly Asp Ala Ser Thr Arg Val Val
Glu Lys 210 215 220 Ala Asn Leu Thr Ser Asp Asp Ile Asp Leu Phe Ile
Pro His Gln Ala 225 230 235 240 Asn Ile Arg Ile Met Glu Ser Ala Arg
Glu Arg Leu Gly Ile Ser Lys 245 250 255 Asp Lys Met Ser Val Ser Val
Asn Lys Tyr Gly Asn Thr Ser Ala Ala 260 265 270 Ser Ile Pro Leu Ser
Ile Asp Gln Glu Leu Lys Asn Gly Lys Leu Lys 275 280 285 Asp Asp Asp
Thr Ile Val Leu Val Gly Phe Gly Gly Gly Leu Thr Trp 290 295 300 Gly
Ala Met Thr Ile Lys Trp Gly Lys 305 310 99942DNAStaphylococcus
aureus 99ctattttccc cattttattg tcattgcgcc ccaagttagg ccgccaccga
atccgacaag 60aacaattgta tcatcatctt tgagtttacc attttttaat tcttgatcga
tacttaaagg 120tattgacgca gctgaagtat ttccatattt atttacagaa
acactcattt tgtcttttga 180aatacctaag cgttctctag ctgattccat
aattctaata ttagcttgat gaggaataaa 240taaatctata tcatctgatg
ttaaattcgc tttttcaact acacgtgttg atgcatcacc 300cataattcta
acagcaaatt taaatacttc tcgaccattc attttcagtt taccagtatc
360tttatctaaa tataaatgtt taccacctgt gccatcagaa cccatttcat
aacttataat 420acctctgcct tctgaaactt caccgatgat aaccgcacct
gcaccatctc caaatagaac 480tgcagtagaa cggtcagtta aatctgttat
tttagataat ttatctgcac cgacaactaa 540aatattatga taatctccag
attgaacata ttgtttagct gtaatcattg aatacataaa 600tccagaacat
gctgcaagtt gatccataga ggcaactttg cccgtcccta aacgttcttg
660taacatattt gcgacagttg gaaatggcat atctccagtt gctgtggcaa
caattatcat 720atctatatct tcgggctgaa taccagcgtc agcgattgct
tttacacttg cttcatatgc 780taaatctgaa gtatcttgat cgtcatctgc
ccaatgtctt tctttaattc cagtcatctt 840agaaatccat tcatcagatg
tatctaaaaa ttgctcaaaa taggcattgt caataatctt 900ttctggtgca
tatgcaccaa aacctttaat acccacgttc at 942100325PRTStreptococcus
mutans 100Met Thr Phe Ala Lys Ile Ser Gln Ala Ala Tyr Tyr Val Pro
Ser Gln 1 5 10 15 Val Val Thr Asn Asp Asp Leu Ser Lys Ile Met Asp
Thr Ser Asp Glu 20 25 30 Trp Ile Thr Ser Arg Thr Gly Ile Arg Glu
Arg Arg Ile Ser Gln Ser 35 40 45 Glu Asp Thr Ser Asp Leu Ala Ser
Gln Val Ala Lys Glu Leu Leu Lys 50 55 60 Lys Ala Ser Leu Lys Ala
Lys Glu Ile Asp Phe Ile Ile Val Ala Thr 65 70 75 80 Ile Thr Pro Asp
Ala Met Met Pro Ser Thr Ala Ala Cys Val Gln Ala 85 90 95 Lys Ile
Gly Ala Val Asn Ala Phe Ala Phe Asp Leu Thr Ala Ala Cys 100 105 110
Ser Gly Phe Ile Phe Ala Leu Ser Ala Ala Glu Lys Met Ile Lys Ser 115
120 125 Gly Gln Tyr Gln Lys Gly Leu Val Ile Gly Ala Glu Val Leu Ser
Lys 130 135 140 Ile Ile Asp Trp Ser Asp Arg Thr Thr Ala Val Leu Phe
Gly Asp Gly 145 150 155 160 Ala Gly Gly Val Leu Leu Glu Ala Asp Ser
Ser Glu His Phe Leu Phe 165 170 175 Glu Ser Ile His Ser Asp Gly Ser
Arg Gly Glu Ser Leu Thr Ser Gly 180 185 190 Glu His Ala Val Ser Ser
Pro Phe Ser Gln Val Asp Lys Lys Asp Asn 195 200 205 Cys Phe Leu Lys
Met Asp Gly Arg Ala Ile Phe Asp Phe Ala Ile Arg 210 215 220 Asp Val
Ser Lys Ser Ile Ser Met Leu Ile Arg Lys Ser Asp Met Pro 225 230 235
240 Val Glu Ala Ile Asp Tyr Phe Leu Leu His Gln Ala Asn Ile Arg Ile
245 250 255 Leu Asp Lys Met Ala Lys Lys Ile Gly Ala Asp Arg Glu Lys
Phe Pro 260 265 270 Ala Asn Met Met Lys Tyr Gly Asn Thr Ser Ala Ala
Ser Ile Pro Ile 275 280 285 Leu Leu Ala Glu Cys Val Glu Asn Gly Thr
Ile Glu Leu Asn Gly Ser 290 295 300 His Thr Val Leu Leu Ser Gly Phe
Gly Gly Gly Leu Thr Trp Gly Ser 305 310 315 320 Leu Ile Val Lys Ile
325 101978DNAStreptococcus mutans 101atgacttttg caaagattag
tcaagcagca tattatgtac catcacaggt tgtcaccaat 60gatgatttat ctaaaataat
ggataccagt gatgaatgga ttacaagtcg tacgggaata 120agagagcgcc
gtattagtca atccgaagat accagtgact tagccagtca ggtggccaaa
180gaacttttaa aaaaagcctc attaaaggcg aaagagattg attttattat
tgttgctaca 240attactccgg atgcaatgat gccatcaaca gctgcttgtg
tccaagcgaa aattggtgca 300gtgaatgctt ttgctttcga tttaactgcc
gcctgcagtg gatttatttt tgcactttca 360gctgcggaaa aaatgattaa
atccggtcag taccagaaag gtttagttat cggtgcagaa 420gttctatcta
aaatcatcga ttggtcggat cgaacaacag ctgttctttt tggagatgga
480gctggcggtg ttcttttaga agcagattct tctgaacatt ttttatttga
atctattcat 540tcagatggca gtcgtggtga aagtttgaca tcaggtgaac
acgctgtttc gtcacccttt 600tcacaggttg ataaaaaaga taactgtttt
ctaaaaatgg atggtcgagc tatatttgac 660tttgctattc gtgatgtgtc
aaaaagtatt tcgatgctca ttaggaagtc agatatgcct 720gtagaagcga
ttgattattt cttattacat caggctaata ttcgtatttt ggataaaatg
780gctaaaaaaa ttggcgctga tagagaaaaa tttcctgcta atatgatgaa
gtatggtaat 840accagtgcag caagtattcc tattttatta gccgaatgtg
tcgaaaatgg aactatagag 900ctaaatggtt cacacactgt tctcctgagc
gggttcggtg ggggtttgac atggggcagt 960ttaattgtta aaatttag
978102325PRTLactococcus lactis 102Met Thr Phe Ala Lys Ile Thr Gln
Val Ala His Tyr Val Pro Glu Asn 1 5 10 15 Val Val Ser Asn Asp Asp
Leu Ser Lys Ile Met Asp Thr Asn Asp Glu 20 25 30 Trp Ile Tyr Ser
Arg Thr Gly Ile Lys Asn Arg His Ile Ser Thr Gly 35 40 45 Glu Asn
Thr Ser Asp Leu Ala Ala Lys Val Ala Lys Gln Leu Ile Ser 50 55 60
Asp Ser Asn Leu Ser Pro Glu Thr Ile Asp Phe Ile Ile Val Ala Thr 65
70 75 80 Val Thr Pro Asp Ser Leu Met Pro Ser Thr Ala Ala Arg Val
Gln Ala 85 90 95 Gln Val Gly Ala Val Asn Ala Phe Ala Tyr Asp Leu
Thr Ala Ala Cys 100 105 110 Ser Gly Phe Val Phe Ala Leu Ser Thr Ala
Glu Lys Leu Ile Ser Ser 115 120 125 Gly Ala Tyr Gln Arg Gly Leu Val
Ile Gly Ala Glu Val Phe Ser Lys 130 135 140 Val Ile Asp Trp Ser Asp
Arg Ser Thr Ala Val Leu Phe Gly Asp Gly 145 150 155 160 Ala Ala Gly
Val Leu Ile Glu Ala Gly Ala Ser Gln Pro Leu Ile Ile 165 170 175 Ala
Glu Lys Met Gln Thr Asp Gly Ser Arg Gly Asn Ser Leu Leu Ser 180 185
190 Ser Tyr Ala Asp Ile Gln Thr Pro Phe Ala Ser Val Ser Tyr Glu Ser
195 200 205 Ser Asn Leu Ser Met Glu Gly Arg Ala Ile Phe Asp Phe Ala
Val Arg 210 215 220 Asp Val Pro Lys Asn Ile Gln Ala Thr Leu Glu Lys
Ala Asn Leu Ser 225 230 235 240 Ala Glu Glu Val Asp Tyr Tyr Leu Leu
His Gln Ala Asn Ser Arg Ile 245 250 255 Leu Asp Lys Met Ala Lys Lys
Leu Gly Val Thr Arg Gln Lys Phe Leu 260 265 270 Gln Asn Met Gln Glu
Tyr Gly Asn Thr Ser Ala Ala Ser Ile Pro Ile 275 280 285 Leu Leu Ser
Glu Ser Val Lys Asn Gly Ile Phe Ser Leu Asp Gly Gln 290 295 300 Thr
Lys Val Val Leu Thr Gly Phe Gly Gly Gly Leu Thr Trp Gly Thr 305 310
315 320 Ala Ile Ile Asn Leu 325 103978DNALactococcus lactis
103atgacttttg cgaaaattac gcaagtggca cactatgtgc ctgaaaatgt ggtatctaat
60gatgacttgt ccaaaataat ggatactaat
gatgaatgga tttacagtcg gacagggatt 120aaaaatcgcc atatttcaac
tggagagaac acctcagact tagcagctaa agttgctaag 180cagttgatta
gcgattcaaa tttaagccca gaaacgattg acttcatcat tgttgctaca
240gtaactccgg actcattgat gccttcaacc gcggcacggg ttcaagctca
agtaggagca 300gttaatgctt ttgcttacga tttgactgcg gcttgttcag
gctttgtctt tgctctatca 360acagcggaaa aattaatttc ctcaggagca
tatcaacgag ggcttgtcat tggcgcagaa 420gtcttttcaa aagtaattga
ttggtcagac cgatcaactg ctgttctttt cggagatgga 480gctgctggtg
tgcttattga agctggcgcg agtcaacctc tgattattgc tgaaaaaatg
540caaacagatg gaagtcgtgg gaacagttta ctttctagtt atgctgacat
ccaaactcca 600tttgcctctg tttcatacga aagttcaaac ttgagtatgg
aagggcgagc aatttttgat 660tttgccgtac gtgatgttcc taaaaatatc
caggcaactt tagaaaaagc taatttgtct 720gctgaagaag tagattatta
tctccttcat caagcgaatt caagaatcct tgataaaatg 780gctaaaaagc
ttggtgtgac gcgccaaaag ttccttcaaa atatgcaaga atatggtaac
840acatcggcag caagtatccc tatattgttg tcagaatccg taaaaaatgg
tatatttagt 900ttggacggtc aaacaaaagt cgtcttgaca ggatttggcg
gtggcctcac ttggggtaca 960gcaattatta atttataa
978104317PRTLegionella pneumophila 104Met Lys Asn Ala Val Ile Ser
Gly Thr Gly Ser Tyr Ser Pro Glu Arg 1 5 10 15 Gln Met Thr Asn Ala
Glu Leu Glu Thr Met Leu Asp Thr Ser Asp Glu 20 25 30 Trp Ile Val
Thr Arg Thr Gly Ile Ser Ser Arg Ser Val Ala Gln Glu 35 40 45 His
Glu Thr Thr Ser Tyr Met Ala Ser Arg Ala Ala Glu Gln Ala Leu 50 55
60 Glu Ala Ser Gly Leu Asp Ala Glu Glu Ile Asp Leu Ile Leu Val Ala
65 70 75 80 Thr Cys Thr Pro Asp Tyr Phe Phe Pro Ser Val Ala Cys His
Val Gln 85 90 95 His Ala Leu Gly Ile Lys Arg Pro Ile Pro Ala Phe
Asp Ile Gly Ala 100 105 110 Ala Cys Ser Gly Phe Val Tyr Ala Met Asp
Val Ala Lys Gln Tyr Ile 115 120 125 Ala Thr Gly Ala Ala Lys His Val
Leu Val Val Gly Ser Glu Ser Met 130 135 140 Ser Arg Ala Val Asp Trp
Thr Asp Arg Ser Ile Cys Val Leu Phe Gly 145 150 155 160 Asp Gly Ala
Gly Ala Val Val Leu Ser Ala Ser Asp Arg Gln Gly Ile 165 170 175 Met
Gly Ser Val Leu His Ser Ala Tyr Asp Ser Asp Lys Leu Leu Val 180 185
190 Leu Arg Asn Ser Thr Phe Glu Gln Asp Arg Ala Thr Ile Gly Met Arg
195 200 205 Gly Asn Glu Val Phe Lys Ile Ala Val Asn Ile Met Gly Asn
Ile Val 210 215 220 Asp Glu Val Leu Glu Ala Ser His Leu Lys Lys Ser
Asp Ile Asp Trp 225 230 235 240 Leu Ile Pro His Gln Ala Asn Ile Arg
Ile Ile Gln Ala Ile Ala Lys 245 250 255 Lys Leu Ser Leu Pro Met Ser
His Val Ile Val Thr Ile Gly Asn Gln 260 265 270 Gly Asn Thr Ser Ala
Ala Ser Ile Pro Leu Ala Leu Asp Tyr Ser Ile 275 280 285 Lys Asn Asn
Gln Ile Lys Arg Asp Glu Ile Leu Leu Ile Glu Ser Phe 290 295 300 Gly
Gly Gly Met Thr Trp Gly Ala Met Val Ile Arg Tyr 305 310 315
105954DNALegionella pneumophila 105atgaaaaatg ctgttattag
tggcactgga agttactctc cagagagaca aatgactaat 60gctgaactgg aaaccatgct
tgatactagc gatgaatgga tagttaccag gactggtatt 120agtagtcgta
gtgttgctca agaacatgaa acaacatctt atatggcctc cagagcagca
180gagcaagcac tagaggcatc aggccttgat gctgaagaaa ttgatttgat
attagtagca 240acatgtaccc cggattattt ttttcctagc gttgcctgtc
acgtacaaca tgctttagga 300atcaaaagac ctattccggc ttttgacatt
ggagctgcat gcagcggttt tgtttatgcg 360atggatgtag cgaaacaata
cattgctaca ggggctgcca aacacgttct tgtcgtaggc 420agcgagagca
tgtcaagagc ggtagattgg actgatcgtt ctatttgtgt cttattcgga
480gatggcgcag gcgctgttgt tttaagcgca agtgatcgcc aagggattat
gggtagtgtt 540ttacattctg cctatgactc tgataaatta ctagtccttc
gtaattcaac ttttgaacaa 600gatcgtgcaa cgattggaat gcgaggtaat
gaggtattta aaattgctgt taatattatg 660ggtaatattg ttgatgaagt
gttagaagca agtcatttaa aaaaatctga tattgattgg 720ctgatacctc
atcaagccaa tatacgcatt atacaagcca tagctaaaaa attatctctt
780cctatgtcac atgttattgt tacaattggt aaccaaggca acacatcggc
tgcttctatt 840cccttagcac ttgattattc tattaaaaat aatcagatta
aaagggatga aatattatta 900attgaatcct ttggtggtgg aatgacctgg
ggcgctatgg ttattcgtta ctaa 954106312PRTListeria monocytogenes
106Met Asn Ala Gly Ile Leu Gly Val Gly Lys Tyr Val Pro Glu Lys Ile
1 5 10 15 Val Thr Asn Phe Asp Leu Glu Lys Ile Met Asp Thr Ser Asp
Glu Trp 20 25 30 Ile Arg Thr Arg Thr Gly Ile Glu Glu Arg Arg Ile
Ala Arg Asp Asp 35 40 45 Glu Tyr Thr His Asp Leu Ala Tyr Glu Ala
Ala Lys Val Ala Ile Glu 50 55 60 Asn Ala Gly Leu Thr Pro Asp Asp
Ile Asp Leu Phe Ile Val Ala Thr 65 70 75 80 Val Thr Gln Glu Ala Thr
Phe Pro Ser Val Ala Asn Ile Ile Gln Asp 85 90 95 Arg Leu Gly Ala
Thr Asn Ala Ala Gly Met Asp Val Glu Ala Ala Cys 100 105 110 Ala Gly
Phe Thr Phe Gly Val Val Thr Ala Ala Gln Phe Ile Lys Thr 115 120 125
Gly Ala Tyr Lys Asn Ile Val Val Val Gly Ala Asp Lys Leu Ser Lys 130
135 140 Ile Thr Asn Trp Asp Asp Arg Ala Thr Ala Val Leu Phe Gly Asp
Gly 145 150 155 160 Ala Gly Ala Val Val Met Gly Pro Val Ser Asp Asp
His Gly Leu Leu 165 170 175 Ser Phe Asp Leu Gly Ser Asp Gly Ser Gly
Gly Lys Tyr Leu Asn Leu 180 185 190 Asp Glu Asn Lys Lys Ile Tyr Met
Asn Gly Arg Glu Val Phe Arg Phe 195 200 205 Ala Val Arg Gln Met Gly
Glu Ala Ser Leu Arg Val Leu Glu Arg Ala 210 215 220 Gly Leu Glu Lys
Glu Glu Leu Asp Leu Leu Ile Pro His Gln Ala Asn 225 230 235 240 Ile
Arg Ile Met Glu Ala Ser Arg Glu Arg Leu Asn Leu Pro Glu Glu 245 250
255 Lys Leu Met Lys Thr Val His Lys Tyr Gly Asn Thr Ser Ser Ser Ser
260 265 270 Ile Ala Leu Ala Leu Val Asp Ala Val Glu Glu Gly Arg Ile
Lys Asp 275 280 285 Asn Asp Asn Val Leu Leu Val Gly Phe Gly Gly Gly
Leu Thr Trp Gly 290 295 300 Ala Leu Ile Ile Arg Trp Gly Lys 305 310
107939DNAListeria monocytogenes 107atgaacgcag gaattttagg agtaggtaaa
tacgtacctg aaaaaatagt aacaaatttt 60gatttagaaa aaataatgga tacatccgat
gagtggattc gtactcgaac tggtattgaa 120gaaagaagaa ttgctcgtga
tgacgaatat acgcacgact tagcatacga agcagcaaag 180gtagctattg
agaatgctgg gcttacacca gatgacattg acttatttat tgttgccact
240gtgacgcagg aagcgacttt tccatccgtt gcgaatatta ttcaagaccg
tttaggagca 300acaaatgctg cgggtatgga cgtggaagcg gcatgtgccg
gttttacttt tggcgtagta 360actgcagcac aatttattaa aacaggggca
tacaagaata tcgtcgtagt tggtgcggat 420aaattatcta aaatcactaa
ctgggatgat cgcgcaacag ccgtattatt tggtgatgga 480gcgggagccg
ttgttatggg tccggtttct gatgaccatg gactactttc gtttgactta
540ggctcagatg gatctggcgg caaatacttg aacttagatg aaaataagaa
gatttatatg 600aatggacgtg aagtgttccg ttttgcagtt cgccaaatgg
gagaagcttc gttacgagta 660cttgaacgtg ctggacttga aaaagaagaa
ttggatttac taattcctca ccaagcaaat 720atccgtatca tggaagcttc
tcgcgagcgt ttgaatttac cggaagaaaa actgatgaaa 780acagtgcata
aatacggtaa tacttcgtca tcttctattg ctcttgcgct agttgatgca
840gtcgaagaag gacgcattaa agataatgac aatgtcctgc ttgttggctt
tggcggcgga 900ctaacatggg gcgccctaat cattcgttgg ggtaagtaa
939108325PRTBacillus subtilis subsp. subtilis str. 168 108Met Ser
Lys Ala Lys Ile Thr Ala Ile Gly Thr Tyr Ala Pro Ser Arg 1 5 10 15
Arg Leu Thr Asn Ala Asp Leu Glu Lys Ile Val Asp Thr Ser Asp Glu 20
25 30 Trp Ile Val Gln Arg Thr Gly Met Arg Glu Arg Arg Ile Ala Asp
Glu 35 40 45 His Gln Phe Thr Ser Asp Leu Cys Ile Glu Ala Val Lys
Asn Leu Lys 50 55 60 Ser Arg Tyr Lys Gly Thr Leu Asp Asp Val Asp
Met Ile Leu Val Ala 65 70 75 80 Thr Thr Thr Ser Asp Tyr Ala Phe Pro
Ser Thr Ala Cys Arg Val Gln 85 90 95 Glu Tyr Phe Gly Trp Glu Ser
Thr Gly Ala Leu Asp Ile Asn Ala Thr 100 105 110 Cys Ala Gly Leu Thr
Tyr Gly Leu His Leu Ala Asn Gly Leu Ile Thr 115 120 125 Ser Gly Leu
His Gln Lys Ile Leu Val Ile Ala Gly Glu Thr Leu Ser 130 135 140 Lys
Val Thr Asp Tyr Thr Asp Arg Thr Thr Cys Val Leu Phe Gly Asp 145 150
155 160 Ala Ala Gly Ala Leu Leu Val Glu Arg Asp Glu Glu Thr Pro Gly
Phe 165 170 175 Leu Ala Ser Val Gln Gly Thr Ser Gly Asn Gly Gly Asp
Ile Leu Tyr 180 185 190 Arg Ala Gly Leu Arg Asn Glu Ile Asn Gly Val
Gln Leu Val Gly Ser 195 200 205 Gly Lys Met Val Gln Asn Gly Arg Glu
Val Tyr Lys Trp Ala Ala Arg 210 215 220 Thr Val Pro Gly Glu Phe Glu
Arg Leu Leu His Lys Ala Gly Leu Ser 225 230 235 240 Ser Asp Asp Leu
Asp Trp Phe Val Pro His Ser Ala Asn Leu Arg Met 245 250 255 Ile Glu
Ser Ile Cys Glu Lys Thr Pro Phe Pro Ile Glu Lys Thr Leu 260 265 270
Thr Ser Val Glu His Tyr Gly Asn Thr Ser Ser Val Ser Ile Val Leu 275
280 285 Ala Leu Asp Leu Ala Val Lys Ala Gly Lys Leu Lys Lys Asp Gln
Ile 290 295 300 Val Leu Leu Phe Gly Phe Gly Gly Gly Leu Thr Tyr Thr
Gly Leu Leu 305 310 315 320 Ile Lys Trp Gly Met 325
109978DNABacillus subtilis subsp. subtilis str. 168 109atgtcaaaag
caaaaattac agctatcggc acctatgcgc cgagcagacg tttaaccaat 60gcagatttag
aaaagatcgt tgatacctct gatgaatgga tcgttcagcg cacaggaatg
120agagaacgcc ggattgcgga tgaacatcaa tttacctctg atttatgcat
agaagcggtg 180aagaatctca agagccgtta taaaggaacg cttgatgatg
tcgatatgat cctcgttgcc 240acaaccacat ccgattacgc ctttccgagt
acggcatgcc gcgtacagga atatttcggc 300tgggaaagca ccggcgcgct
ggatattaat gcgacatgcg ccgggctgac atacggcctc 360catttggcaa
atggattgat cacatctggc cttcatcaaa aaattctcgt catcgccgga
420gagacgttat caaaggtaac cgattatacc gatcgaacga catgcgtact
gttcggcgat 480gccgcgggtg cgctgttagt agaacgagat gaagagacgc
cgggatttct tgcgtctgta 540caaggaacaa gcgggaacgg cggcgatatt
ttgtatcgtg ccggactgcg aaatgaaata 600aacggtgtgc agcttgtcgg
ttccggaaaa atggtgcaaa acggacgcga ggtatataaa 660tgggccgcaa
gaaccgtccc tggcgaattt gaacggcttt tacataaagc aggactcagc
720tccgatgatc tcgattggtt tgttcctcac agcgccaact tgcgcatgat
cgagtcaatt 780tgtgaaaaaa caccgttccc gattgaaaaa acgctcacta
gtgttgagca ctacggaaac 840acgtcttcgg tttcaattgt tttggcgctc
gatctcgcag tgaaagccgg gaagctgaaa 900aaagatcaaa tcgttttgct
tttcgggttt ggcggcggat taacctatac aggattgctt 960attaaatggg ggatgtaa
978110331PRTMyxococcus xanthus 110Met Arg Tyr Ala Gln Ile Leu Ser
Thr Gly Arg Tyr Val Pro Glu Lys 1 5 10 15 Val Leu Thr Asn Ala Asp
Val Glu Lys Ile Leu Gly Glu Lys Val Asp 20 25 30 Glu Trp Leu Gln
Gln Asn Val Gly Ile Arg Glu Arg His Met Met Ala 35 40 45 Asp Asp
Gln Ala Thr Ser Asp Leu Cys Val Gly Ala Ala Arg Gln Ala 50 55 60
Leu Glu Arg Ala Gly Thr Lys Pro Glu Glu Leu Asp Leu Ile Ile Ile 65
70 75 80 Ala Thr Asp Thr Pro Asp Tyr Leu Ser Pro Ala Thr Ala Ser
Val Val 85 90 95 Gln Ala Lys Leu Gly Ala Val Asn Ala Gly Thr Tyr
Asp Leu Asn Cys 100 105 110 Ala Cys Ala Gly Trp Val Thr Ala Leu Asp
Val Gly Ser Lys Thr Ile 115 120 125 Ala Ala Asp Asp Ser Tyr Gln Arg
Ile Leu Val Val Gly Ala Tyr Gly 130 135 140 Met Ser Arg Tyr Ile Asn
Trp Lys Asp Lys Lys Thr Ala Thr Leu Phe 145 150 155 160 Ala Asp Gly
Ala Gly Ala Val Val Leu Gly Ala Gly Asp Thr Pro Gly 165 170 175 Phe
Met Gly Ala Lys Leu Leu Ala Asn Gly Glu Tyr His Asp Ala Leu 180 185
190 Gly Val Tyr Thr Gly Gly Thr Asn Arg Pro Ala Thr Ala Glu Ser Leu
195 200 205 Glu Leu Thr Gly Gly Lys Pro Ala Val Gln Phe Val Arg Lys
Phe Pro 210 215 220 Ala Thr Phe Asn Thr Glu Arg Trp Pro Met Leu Leu
Asp Gln Leu Leu 225 230 235 240 Lys Arg Gln Asn Leu Lys Leu Asp Asp
Val Lys Gln Phe Val Phe Thr 245 250 255 Gln Leu Asn Leu Arg Thr Ile
Glu Ala Thr Met Lys Ile Leu Gly Gln 260 265 270 Pro Met Glu Lys Ala
His Tyr Thr Met Asp Lys Trp Gly Tyr Thr Gly 275 280 285 Ser Ala Cys
Ile Pro Met Thr Leu Asp Asp Ala Val Val Gln Gly Lys 290 295 300 Val
Gln Arg Gly Asp Leu Val Ala Leu Cys Ala Ser Gly Gly Gly Leu 305 310
315 320 Ala Met Ala Ser Ala Leu Tyr Arg Trp Thr Ala 325 330
111996DNAMyxococcus xanthus 111atgcgatacg cccagattct ctccactggc
cgctacgtcc ccgagaaggt cctcaccaac 60gctgacgtcg agaagattct cggtgagaag
gtggatgagt ggctccagca gaacgtgggc 120attcgcgaac gccacatgat
ggcggatgac caggccacct ccgacctctg cgtgggcgcc 180gcccgccagg
cgctggagcg cgcgggcacg aagccggagg aactggacct catcatcatc
240gccaccgata ccccggacta tctcagcccc gccacggcct ccgtggtgca
ggcgaagctg 300ggcgcggtga acgccggcac ctacgacctc aactgcgcgt
gcgcgggctg ggtgacggcg 360ctggacgtgg gctcgaagac gattgccgcg
gatgacagct accagcgcat cctcgtcgtg 420ggcgcctacg gcatgtcgcg
ctacatcaac tggaaggaca agaagaccgc caccctgttc 480gcggacggcg
cgggcgcggt cgtgctgggc gcgggtgaca cgcccggctt catgggcgcg
540aagctgctgg ccaacggcga gtaccacgac gcgctgggtg tctacaccgg
cggtacgaac 600cgcccggcca ccgcggagtc gctggagctc acgggcggca
agcccgcggt gcagttcgtc 660cgcaagttcc cggcgacgtt caacaccgag
cgctggccca tgctgctgga ccagctcctc 720aagcggcaga acctgaagct
ggacgacgtg aagcagttcg tcttcacgca gctcaacctg 780cgcaccatcg
aagccaccat gaagatcctg ggccagccga tggagaaggc ccactacacc
840atggacaagt ggggctacac cggttcggcc tgcatcccga tgacgctgga
tgacgcggtg 900gtgcagggca aggtgcagcg cggcgacctg gtggccctgt
gtgccagcgg cggcgggctc 960gccatggcct ccgccctcta ccgctggacg gcctga
996112325PRTStenotrophomonas maltophilia 112Met Ser Lys Arg Ile Tyr
Ser Arg Ile Ala Gly Thr Gly Ser Tyr Leu 1 5 10 15 Pro Glu Lys Val
Leu Thr Asn Ala Asp Leu Glu Lys Met Val Glu Thr 20 25 30 Ser Asp
Glu Trp Ile Gln Ser Arg Thr Gly Ile Arg Glu Arg His Ile 35 40 45
Ala Ala Glu Gly Glu Thr Thr Ser Asp Leu Gly Tyr Asn Ala Ala Leu 50
55 60 Arg Ala Leu Glu Ala Ala Gly Ile Asp Ala Ser Gln Leu Asp Met
Ile 65 70 75 80 Val Val Gly Thr Thr Thr Pro Asp Leu Ile Phe Pro Ser
Thr Ala Cys 85 90 95 Leu Ile Gln Ala Lys Leu Gly Val Ala Gly Cys
Pro Ala Phe Asp Val 100 105 110 Asn Ala Ala Cys Ser Gly Phe Val Phe
Ala Leu Gly Val Ala Asp Lys 115 120 125 Phe Ile Arg Ser Gly Asp Cys
Arg Tyr Val Leu Val Ile Gly Ala Glu 130 135 140 Thr Leu Thr Arg Met
Val Asp Trp Asn Asp Arg Thr Thr Cys Val Leu 145 150 155 160 Phe Gly
Asp Gly Ala Gly Ala Val Val Leu Lys Ala Asp Glu Glu Thr 165 170 175
Gly Ile Leu Ser Thr His Leu His Ser Asp Gly Ser Lys Lys Glu Leu 180
185 190 Leu Trp Asn Pro Val Gly Val Ser Thr Gly Phe Lys Gly Gly Ala
Asn 195 200 205 Gly Gly Gly Thr Ile Asn Met Lys Gly Asn Asp Val
Phe Lys Tyr Ala 210 215 220 Val Lys Ala Leu Asp Ser Val Val Asp Glu Thr
Leu Ala Ala Asn Gly 225 230 235 240 Leu Asp Lys Ser Asp Leu Asp Trp
Leu Ile Pro His Gln Ala Asn Leu 245 250 255 Arg Ile Ile Glu Ala Thr
Ala Lys Arg Leu Asp Met Ser Met Glu Gln 260 265 270 Val Val Val Thr
Val Asp Gln His Gly Asn Thr Ser Ser Gly Ser Val 275 280 285 Pro Leu
Ala Leu Asp Ala Ala Val Arg Ser Gly Lys Val Glu Arg Gly 290 295 300
Gln Leu Leu Leu Leu Glu Ala Phe Gly Gly Gly Phe Thr Trp Gly Ser 305
310 315 320 Ala Leu Leu Arg Tyr 325 113978DNAStenotrophomonas
maltophilia 113atgagcaagc ggatctattc gaggatcgcg ggcaccggta
gctatttgcc ggaaaaagtc 60ctgaccaacg ccgacctgga aaaaatggtc gaaacctcgg
atgagtggat ccagtcgcgc 120accggcattc gtgaacggca catcgcggcc
gaaggcgaaa ccaccagcga tctcggctac 180aacgccgcgc tgcgcgcact
tgaagcggcc ggcatcgacg cttcgcagct cgacatgatc 240gtggtcggta
cgaccacccc tgaccttatt ttcccgtcca ccgcgtgcct gatccaggcc
300aagctcggtg tggccggatg ccccgccttc gacgtcaacg cggcctgttc
gggtttcgtg 360ttcgcgctgg gcgtggccga caaattcatc cgttccggcg
actgccggta cgtgctggtg 420atcggcgccg aaacgctgac ccgcatggtt
gactggaacg atcgcaccac ctgcgtgctg 480ttcggtgatg gtgccggcgc
cgtcgtgctc aaggccgacg aagagaccgg catcctcagc 540acccacctgc
attccgatgg cagcaagaag gagctgttgt ggaacccggt gggtgtctcg
600accggtttca agggcggcgc caacggtggt ggcactatca acatgaaggg
caacgatgtg 660ttcaagtacg ccgtcaaggc gctggactcg gtcgtggacg
agaccttggc tgcgaacggc 720ctggacaagt ccgacctgga ttggctgatt
ccgcaccagg ccaacctacg catcatcgaa 780gccacggcca agcgcctgga
catgtcgatg gaacaggtcg tggtcacggt tgatcagcac 840ggcaacacct
cgtccggctc ggtgccgctg gcgctggacg ctgcagtgcg atcgggcaag
900gtcgagcgcg gccagctgct gttgctggaa gccttcggcg gcggcttcac
ctggggttcg 960gccctgctgc gctattga 978114334PRTBacteroides vulgatus
114Met Glu Lys Ile Asn Ala Val Ile Thr Gly Val Gly Gly Tyr Val Pro
1 5 10 15 Asp Tyr Val Leu Thr Asn Glu Glu Ile Ser Arg Met Val Asp
Thr Asn 20 25 30 Asp Glu Trp Ile Met Thr Arg Ile Gly Val Lys Glu
Arg Arg Ile Leu 35 40 45 Asn Glu Glu Gly Leu Gly Thr Ser Tyr Met
Ala Arg Lys Ala Ala Lys 50 55 60 Gln Leu Met Gln Lys Thr Ala Ser
Asn Pro Asp Asp Ile Asp Ala Val 65 70 75 80 Ile Val Ala Thr Thr Thr
Pro Asp Tyr His Phe Pro Ser Thr Ala Ser 85 90 95 Ile Leu Cys Asp
Lys Leu Gly Leu Lys Asn Ala Phe Ala Phe Asp Leu 100 105 110 Gln Ala
Ala Cys Cys Gly Phe Leu Tyr Leu Met Glu Thr Ala Ala Ser 115 120 125
Leu Ile Ala Ser Gly Arg His Lys Lys Ile Ile Ile Val Gly Ala Asp 130
135 140 Lys Met Ser Ser Met Val Asn Tyr Gln Asp Arg Ala Thr Cys Pro
Ile 145 150 155 160 Phe Gly Asp Gly Ala Ala Ala Cys Met Val Glu Ala
Thr Thr Glu Asp 165 170 175 Tyr Gly Ile Met Asp Ser Ile Leu Arg Thr
Asp Gly Lys Gly Leu Pro 180 185 190 Phe Leu His Met Lys Ala Gly Gly
Ser Val Cys Pro Pro Ser Tyr Phe 195 200 205 Thr Val Asp His Lys Met
His Tyr Leu Tyr Gln Glu Gly Arg Thr Val 210 215 220 Phe Lys Tyr Ala
Val Ser Asn Met Ser Asp Ile Thr Ala Thr Ile Ala 225 230 235 240 Glu
Lys Asn Gly Leu Asn Lys Asp Asn Ile Asp Trp Val Ile Pro His 245 250
255 Gln Ala Asn Leu Arg Ile Ile Asp Ala Val Ala Ser Arg Leu Glu Val
260 265 270 Pro Leu Glu Lys Val Met Ile Asn Ile Gln Arg Tyr Gly Asn
Thr Ser 275 280 285 Gly Ala Thr Leu Pro Leu Cys Leu Trp Asp Tyr Glu
Lys Gln Leu Lys 290 295 300 Lys Gly Asp Asn Leu Ile Phe Thr Ala Phe
Gly Ala Gly Phe Thr Tyr 305 310 315 320 Gly Ala Val Tyr Val Lys Trp
Gly Tyr Asp Gly Ser Lys Arg 325 330 1151005DNABacteroides vulgatus
115atggaaaaaa taaatgcagt aataacagga gtcggtggat atgtaccaga
ttatgtcttg 60actaacgaag agatttcaag aatggtagat accaatgatg aatggattat
gactcgaatc 120ggagttaaag aaagacgtat tctgaatgaa gaaggattag
gtacatcgta tatggcgcgt 180aaggctgcca aacaactgat gcagaaaaca
gcttctaatc cggatgacat tgatgcagta 240atcgtagcaa ctactactcc
tgactatcat ttcccttcca ctgcttctat cctgtgtgat 300aagctgggat
tgaaaaatgc atttgcattt gatttgcagg ctgcctgctg cggctttttg
360tatttaatgg aaactgctgc ttcacttatc gcatcgggaa gacataaaaa
gattattatt 420gtcggtgcag ataagatgtc atctatggta aactaccagg
atcgtgcaac ttgccctatc 480tttggtgatg gtgcagcagc atgtatggtg
gaagctacta cagaagatta tggtattatg 540gattctattc ttcgtacaga
tggtaaggga cttccttttc ttcacatgaa agccggtggt 600tctgtatgtc
ctccttctta tttcactgtt gatcataaga tgcattatct ttatcaggaa
660ggaagaacag tatttaaata tgctgtttcc aatatgtcgg atattacagc
gactattgcc 720gaaaagaatg gtttgaataa agataatatc gactgggtaa
ttcctcatca ggctaatctg 780cgtattattg atgcggtagc ctctcgcttg
gaagttccct tggaaaaggt aatgattaat 840attcagcgat atggtaatac
cagtggtgct acacttccgt tgtgtctttg ggattacgaa 900aagcagctga
agaaaggaga taacctgata tttacagctt tcggcgcagg ttttacctat
960ggagccgttt atgtgaaatg gggttacgat ggtagtaaga gataa
1005116325PRTClostridium acetobutylicum 116Met Asn Ser Val Glu Ile
Ile Gly Thr Gly Ser Tyr Val Pro Glu Lys 1 5 10 15 Ile Val Thr Asn
Glu Asp Met Ser Lys Ile Val Asp Thr Ser Asp Glu 20 25 30 Trp Ile
Ser Ser Arg Thr Gly Ile Lys Glu Arg Arg Ile Ser Ile Asn 35 40 45
Glu Asn Thr Ser Asp Leu Gly Ala Lys Ala Ala Leu Arg Ala Ile Glu 50
55 60 Asp Ser Asn Ile Lys Pro Glu Glu Ile Asp Leu Ile Ile Val Ala
Thr 65 70 75 80 Thr Ser Pro Asp Ser Tyr Thr Pro Ser Val Ala Cys Ile
Val Gln Glu 85 90 95 Lys Ile Gly Ala Lys Asn Ala Ala Cys Phe Asp
Leu Asn Ala Ala Cys 100 105 110 Thr Gly Phe Ile Phe Ala Leu Asn Thr
Ala Ser Gln Phe Ile Lys Thr 115 120 125 Gly Glu Tyr Lys Thr Ala Leu
Val Val Gly Thr Glu Val Leu Ser Lys 130 135 140 Ile Leu Asp Trp Gln
Asp Arg Gly Thr Cys Val Leu Phe Gly Asp Gly 145 150 155 160 Ala Gly
Ala Val Ile Ile Arg Gly Gly Asp Glu Asn Gly Ile Ile Lys 165 170 175
Ala Cys Leu Gly Ser Asp Gly Thr Gly Lys Asp Phe Leu His Cys Pro 180
185 190 Ala Thr Asn Val Ile Asn Pro Phe Ser Asp Glu Lys Gly Leu Ala
Ser 195 200 205 Ser Lys Ile Ser Met Asn Gly Arg Glu Val Phe Lys Phe
Ala Val Lys 210 215 220 Val Met Val Ser Ser Val Lys Lys Val Ile Glu
Asp Ser Gly Leu Asn 225 230 235 240 Ile Glu Asp Ile Asp Tyr Ile Val
Pro His Gln Ala Asn Ile Arg Ile 245 250 255 Ile Glu Phe Ala Ala Lys
Lys Leu Gly Leu Ser Met Asp Lys Phe Phe 260 265 270 Ile Asn Leu Gln
Asn Tyr Gly Asn Thr Ser Gly Ala Thr Ile Pro Leu 275 280 285 Ala Ile
Asp Glu Met Asn Lys Lys Gly Leu Leu Lys Arg Gly Ala Lys 290 295 300
Ile Val Val Val Gly Phe Gly Gly Gly Leu Thr Trp Gly Ser Met Val 305
310 315 320 Leu Lys Trp Thr Lys 325 117978DNAClostridium
acetobutylicum 117gtgaatagtg ttgagattat agggactgga agctatgtcc
cagaaaaaat agttactaat 60gaagatatgt ctaagatagt tgatactagt gatgagtgga
tatcatcaag aacaggtata 120aaggaaagaa gaatatctat aaacgaaaat
acatcagatt taggtgctaa agctgcctta 180agggcaatag aggactcaaa
cataaaacca gaagaaatag atttaataat agttgcaact 240acaagtccag
actcatatac tccatccgta gcttgtattg ttcaggagaa gataggtgcc
300aaaaatgctg cctgttttga tttgaatgcg gcatgtactg gatttatatt
tgctcttaat 360acggcatctc agtttataaa aacaggagag tataaaacag
ctcttgtagt aggaacagag 420gtactatcaa agatacttga ttggcaagat
agaggtacat gtgtactttt tggagatggt 480gcaggtgcgg taattataag
aggcggagat gaaaacggaa ttattaaagc atgtcttggt 540tcagatggta
cgggaaaaga cttcttgcat tgtccagcga ctaatgtgat aaatccattt
600tcggatgaaa aaggtttagc aagcagtaag atttctatga atggaagaga
agtctttaaa 660tttgcagtta aggtaatggt aagctcagtt aaaaaggtta
tagaagatag tggactaaat 720atagaagaca ttgattatat agtacctcat
caggctaaca ttagaataat agagtttgca 780gctaaaaaac ttggattaag
tatggacaaa ttttttataa acctacaaaa ctatggaaat 840acatctggag
cgactatacc actggcaata gatgaaatga ataaaaaagg cttgcttaaa
900agaggtgcta aaatagttgt agttggtttt ggtggaggac ttacttgggg
ttccatggtt 960cttaaatgga ctaaataa 978118332PRTFlavobacterium
johnsoniae 118Met Asn Thr Ile Thr Ala Ala Ile Thr Ala Val Gly Gly
Tyr Val Pro 1 5 10 15 Asp Phe Val Leu Ser Asn Lys Val Leu Glu Thr
Met Val Asp Thr Asn 20 25 30 Asp Glu Trp Ile Thr Thr Arg Thr Gly
Ile Lys Glu Arg Arg Ile Leu 35 40 45 Lys Asp Ala Asp Lys Gly Thr
Ser Tyr Leu Ala Ile Gln Ala Ala Gln 50 55 60 Asp Leu Ile Ala Lys
Ala Asn Ile Asp Pro Leu Glu Ile Asp Met Val 65 70 75 80 Ile Met Ala
Thr Ala Thr Pro Asp Met Met Val Ala Ser Thr Gly Val 85 90 95 Tyr
Val Ala Thr Glu Ile Gly Ala Val Asn Ala Phe Ala Tyr Asp Leu 100 105
110 Gln Ala Ala Cys Ser Ser Phe Leu Tyr Gly Met Ser Thr Ala Ala Ala
115 120 125 Tyr Val Gln Ser Gly Arg Tyr Lys Lys Val Leu Leu Ile Gly
Ala Asp 130 135 140 Lys Met Ser Ser Ile Val Asp Tyr Thr Asp Arg Ala
Thr Cys Ile Ile 145 150 155 160 Phe Gly Asp Gly Ala Gly Ala Val Leu
Phe Glu Pro Asn Tyr Glu Gly 165 170 175 Leu Gly Leu Gln Asp Glu Tyr
Leu Arg Ser Asp Gly Val Gly Arg Asp 180 185 190 Phe Leu Lys Ile Pro
Ala Gly Gly Ser Leu Ile Pro Ala Ser Glu Asp 195 200 205 Thr Val Lys
Asn Arg Gln His Asn Ile Met Gln Asp Gly Lys Thr Val 210 215 220 Phe
Lys Tyr Ala Val Thr Asn Met Ala Asp Ala Ser Glu Leu Ile Leu 225 230
235 240 Gln Arg Asn Asn Leu Thr Asn Gln Asp Val Asp Trp Leu Val Pro
His 245 250 255 Gln Ala Asn Lys Arg Ile Ile Asp Ala Thr Ala Gly Arg
Leu Glu Leu 260 265 270 Glu Glu Ser Lys Val Leu Val Asn Ile Glu Arg
Tyr Gly Asn Thr Thr 275 280 285 Ser Gly Thr Leu Pro Leu Val Leu Ser
Asp Phe Glu Asn Gln Phe Lys 290 295 300 Lys Gly Asp Asn Ile Ile Leu
Ala Ala Phe Gly Gly Gly Phe Thr Trp 305 310 315 320 Gly Ser Ile Tyr
Leu Lys Trp Ala Tyr Asp Lys Lys 325 330 119999DNAFlavobacterium
johnsoniae 119atgaatacaa tcacagccgc aattaccgct gttggaggct
acgttccaga ctttgtgctt 60tcaaacaaag tgttggaaac aatggtagat accaatgacg
aatggattac cactcgtaca 120ggaattaaag aaagaagaat tcttaaagat
gctgataaag gtacatctta ccttgccata 180caagcagcac aggatttaat
agcaaaagct aatattgatc ctcttgaaat tgatatggtt 240attatggcaa
ctgcaacacc agatatgatg gtagcttcaa caggagttta tgttgcaaca
300gaaattggag ctgttaatgc atttgcatac gatttgcagg cagcttgttc
aagtttctta 360tacggaatgt ctactgctgc ggcttatgta caatctggaa
gatataaaaa agttctttta 420attggtgccg ataaaatgtc atcaattgta
gattacacag acagagcaac ttgtattatt 480tttggtgatg gagcaggggc
agttttgttt gagccaaatt acgaaggtct tggtctgcaa 540gacgaatatt
taagaagtga tggtgtagga cgcgattttc ttaaaatacc agctggagga
600tctttaattc cagcttcaga agatactgta aaaaacagac aacacaatat
tatgcaggat 660ggtaaaacag tttttaaata tgctgtaacc aatatggctg
atgccagcga actaatcttg 720caaagaaaca atttaactaa tcaggatgtt
gattggttag tgcctcacca ggcaaacaaa 780cgcatcatcg atgcaactgc
aggaagacta gagttagaag agtctaaagt actagttaat 840atcgaaagat
atggtaatac aacttcagga acattacctt tggtattaag cgattttgaa
900aatcaattca aaaaaggaga taatattatt ttagcagcat ttggaggtgg
attcacttgg 960ggatctattt acctaaaatg ggcttacgat aagaaataa
999120350PRTMicrococcus luteus 120Met Thr Val Thr Leu Lys Gln His
Glu Arg Pro Ala Ala Ser Arg Ile 1 5 10 15 Val Ala Val Gly Ala Tyr
Arg Pro Ala Asn Leu Val Pro Asn Glu Asp 20 25 30 Leu Ile Gly Pro
Ile Asp Ser Ser Asp Glu Trp Ile Arg Gln Arg Thr 35 40 45 Gly Ile
Val Thr Arg Gln Arg Ala Thr Ala Glu Glu Thr Val Pro Val 50 55 60
Met Ala Val Gly Ala Ala Arg Glu Ala Leu Glu Arg Ala Gly Leu Gln 65
70 75 80 Gly Ser Asp Leu Asp Ala Val Ile Val Ser Thr Val Thr Phe
Pro His 85 90 95 Ala Thr Pro Ser Ala Ala Ala Leu Val Ala His Glu
Ile Gly Ala Thr 100 105 110 Pro Ala Pro Ala Tyr Asp Val Ser Ala Ala
Cys Ala Gly Tyr Cys Tyr 115 120 125 Gly Val Ala Gln Ala Asp Ala Leu
Val Arg Ser Gly Thr Ala Arg His 130 135 140 Val Leu Val Val Gly Val
Glu Arg Leu Ser Asp Val Val Asp Pro Thr 145 150 155 160 Asp Arg Ser
Ile Ser Phe Leu Leu Gly Asp Gly Ala Gly Ala Val Ile 165 170 175 Val
Ala Ala Ser Asp Glu Pro Gly Ile Ser Pro Ser Val Trp Gly Ser 180 185
190 Asp Gly Glu Arg Trp Ser Thr Ile Ser Met Thr His Ser Gln Leu Glu
195 200 205 Leu Arg Asp Ala Val Glu His Ala Arg Thr Thr Gly Asp Ala
Ser Ala 210 215 220 Ile Thr Gly Ala Glu Gly Met Leu Trp Pro Thr Leu
Arg Gln Asp Gly 225 230 235 240 Pro Ser Val Phe Arg Trp Ala Val Trp
Ser Met Ala Lys Val Ala Arg 245 250 255 Glu Ala Leu Asp Ala Ala Gly
Val Glu Pro Glu Asp Leu Ala Ala Phe 260 265 270 Ile Pro His Gln Ala
Asn Met Arg Ile Ile Asp Glu Phe Ala Lys Gln 275 280 285 Leu Lys Leu
Pro Glu Ser Val Val Val Ala Arg Asp Ile Ala Asp Ala 290 295 300 Gly
Asn Thr Ser Ala Ala Ser Ile Pro Leu Ala Met His Arg Leu Leu 305 310
315 320 Glu Glu Asn Pro Glu Leu Ser Gly Gly Leu Ala Leu Gln Ile Gly
Phe 325 330 335 Gly Ala Gly Leu Val Tyr Gly Ala Gln Val Val Arg Leu
Pro 340 345 350 121979DNAMicrococcus luteus 121atgaccgtca
ccctgaagca gcacgagcgc cccgcggcca gccgcatcgt ggccgtgggc 60gcctaccgcc
cggcgaacct ggtcccgaac gaggacctca tcggccccat cgactcgtcg
120gacgagtgga tccgccagcg caccggcatc gtcacacgcc agcgcgccac
ggcggaggag 180accgtgcccg tcatggccgt gggcgccgcc cgggaggccc
tcgagcgggc cggcctgcag 240ggctcggacc tggacgccgt gatcgtctcg
accgtcacct tcccgcacgc caccccctcg 300gccgcggccc tcgtggcgca
cgagatcggc gccaccccgg cgcccgccta cgacgtctcc 360gccgcgtgcg
ccggctactg ctacggcgtg gcccaggccg acgcgctcgt gcgctccggc
420accgcgcggc acgtgctcgt ggtcggcgtc gagcgcctct ccgacgtcgt
ggatcccacg 480gaccgctcca tctccttcct gctgggcgac ggcgcgggcg
ccgtgatcgt cgcggcctcg 540gacgagccgg gcatctcccc ctcggtgtgg
ggctcggacg gggagcgctg gtccacgatc 600tccatgacgc actcgcagct
ggagctgcgc gatgccgtgg agcacgcccg caccacgggc 660gacgcctcgg
cgatcaccgg cgcagagggg atgctctggc ccacgctgcg ccaggacggg
720ccctccgtct tccgttgggc cgtgtggtcg atggcgaagg tggcccgcga
ggcccttgac 780gccgcgggcg tggagcccga ggacctcgcc gcgttcatcc
cgcaccaggc caacatgcgg 840atcatcgacg agttcgccaa gcagctgaag
ctgccggagt ccgtcgtcgt ggcccgggac 900atcgcggacg ccggcaacac
gtcggccgcg tccatcccgc tggccatgca ccggctgctg 960gaggagaacc ccgagctct
97912217PRTArtificial sequenceSynthetic polypeptide 122Asp Thr Xaa
Asp Xaa Trp Ile Xaa Xaa Xaa Thr Gly Ile Xaa Xaa Arg 1 5 10 15 Xaa
12318PRTArtificial sequenceSynthetic polypeptide 123Xaa Xaa Asp Xaa
Xaa Ala Xaa Cys
Xaa Gly Phe Xaa Xaa Xaa Xaa Xaa 1 5 10 15 Xaa Ala
12415PRTArtificial sequenceSynthetic polypeptide 124Asp Arg Xaa Thr
Xaa Xaa Xaa Phe Xaa Asp Gly Ala Xaa Xaa Xaa 1 5 10 15
1258PRTArtificial sequenceSynthetic polypeptide 125His Gln Ala Asn
Xaa Arg Ile Xaa 1 5 12619PRTArtificial sequenceSynthetic
polypeptide 126Gly Asn Thr Xaa Ala Ala Ser Xaa Pro Xaa Xaa Xaa Xaa
Xaa Xaa Xaa 1 5 10 15 Xaa Xaa Gly 12713PRTArtificial
sequenceSynthetic polypeptide 127Xaa Xaa Leu Xaa Xaa Phe Gly Gly
Gly Xaa Xaa Trp Gly 1 5 10 1284559DNAArtificial sequencepDG2
plasmid 128ggggaattgt gagcggataa caattcccct gtagaaataa ttttgtttaa
ctttaataag 60gagatatacc atggcgcaac tcactcttct tttagtcggc aattccgacg
ccatcacgcc 120attacttgct aaagctgact ttgaacaacg ttcgcgtctg
cagattattc ctgcgcagtc 180agttatcgcc agtgatgccc ggccttcgca
agctatccgc gccagtcgtg ggagttcaat 240gcgcgtggcc ctggagctgg
tgaaagaagg tcgagcgcaa gcctgtgtca gtgccggtaa 300taccggggcg
ctgatggggc tggcaaaatt attactcaag cccctggagg ggattgagcg
360tccggcgctg gtgacggtat taccacatca gcaaaagggc aaaacggtgg
tccttgactt 420aggggccaac gtcgattgtg acagcacaat gctggtgcaa
tttgccatta tgggctcagt 480tctggctgaa gaggtggtgg aaattcccaa
tcctcgcgtg gcgttgctca atattggtga 540agaagaagta aagggtctcg
acagtattcg ggatgcctca gcggtgctta aaacaatccc 600ttctatcaat
tatatcggct atcttgaagc caatgagttg ttaactggca agacagatgt
660gctggtttgt gacggcttta caggaaatgt cacattaaag acgatggaag
gtgttgtcag 720gatgttcctt tctctgctga aatctcaggg tgaagggaaa
aaacggtcgt ggtggctact 780gttattaaag cgttggctac aaaagagcct
gacgaggcga ttcagtcacc tcaaccccga 840ccagtataac ggcgcctgtc
tgttaggatt gcgcggcacg gtgataaaaa gtcatggtgc 900agccaatcag
cgagcttttg cggtcgcgat tgaacaggca gtgcaggcgg tgcagcgaca
960agttcctcag cgaattgccg ctcgcctgga atctgtatac ccagctggtt
ttgagctgct 1020ggacggtggc aaaagcggaa ctctgcggta gcaggacgct
gccagcgaac tcgcagtttg 1080caagtgacgg tatataaccg aaaagtgact
gagcgcatat gtatacgaag actcgagtct 1140ggtaaagaaa ccgctgctgc
gaaatttgaa cgccagcaca tggactcgtc tactagcgca 1200gcttaattaa
cctaggctgc tgccaccgct gagcaataac tagcataacc ccttggggcc
1260tctaaacggg tcttgagggg ttttttgctg aaacctcagg catttgagaa
gcacacggtc 1320acactgcttc cggtagtcaa taaaccggta aaccagcaat
agacataagc ggctatttaa 1380cgaccctgcc ctgaaccgac gaccgggtca
tcgtggccgg atcttgcggc ccctcggctt 1440gaacgaattg ttagacatta
tttgccgact accttggtga tctcgccttt cacgtagtgg 1500acaaattctt
ccaactgatc tgcgcgcgag gccaagcgat cttcttcttg tccaagataa
1560gcctgtctag cttcaagtat gacgggctga tactgggccg gcaggcgctc
cattgcccag 1620tcggcagcga catccttcgg cgcgattttg ccggttactg
cgctgtacca aatgcgggac 1680aacgtaagca ctacatttcg ctcatcgcca
gcccagtcgg gcggcgagtt ccatagcgtt 1740aaggtttcat ttagcgcctc
aaatagatcc tgttcaggaa ccggatcaaa gagttcctcc 1800gccgctggac
ctaccaaggc aacgctatgt tctcttgctt ttgtcagcaa gatagccaga
1860tcaatgtcga tcgtggctgg ctcgaagata cctgcaagaa tgtcattgcg
ctgccattct 1920ccaaattgca gttcgcgctt agctggataa cgccacggaa
tgatgtcgtc gtgcacaaca 1980atggtgactt ctacagcgcg gagaatctcg
ctctctccag gggaagccga agtttccaaa 2040aggtcgttga tcaaagctcg
ccgcgttgtt tcatcaagcc ttacggtcac cgtaaccagc 2100aaatcaatat
cactgtgtgg cttcaggccg ccatccactg cggagccgta caaatgtacg
2160gccagcaacg tcggttcgag atggcgctcg atgacgccaa ctacctctga
tagttgagtc 2220gatacttcgg cgatcaccgc ttccctcata ctcttccttt
ttcaatatta ttgaagcatt 2280tatcagggtt attgtctcat gagcggatac
atatttgaat gtatttagaa aaataaacaa 2340atagctagct cactcggtcg
ctacgctccg ggcgtgagac tgcggcgggc gctgcggaca 2400catacaaagt
tacccacaga ttccgtggat aagcagggga ctaacatgtg aggcaaaaca
2460gcagggccgc gccggtggcg tttttccata ggctccgccc tcctgccaga
gttcacataa 2520acagacgctt ttccggtgca tctgtgggag ccgtgaggct
caaccatgaa tctgacagta 2580cgggcgaaac ccgacaggac ttaaagatcc
ccaccgtttc cggcgggtcg ctccctcttg 2640cgctctcctg ttccgaccct
gccgtttacc ggatacctgt tccgcctttc tcccttacgg 2700gaagtgtggc
gctttctcat agctcacaca ctggtatctc ggctcggtgt aggtcgttcg
2760ctccaagctg ggctgtaagc aagaactccc cgttcagccc gactgctgcg
ccttatccgg 2820taactgttca cttgagtcca acccggaaaa gcacggtaaa
acgccactgg cagcagccat 2880tggtaactgg gagttcgcag aggatttgtt
tagctaaaca cgcggttgct cttgaagtgt 2940gcgccaaagt ccggctacac
tggaaggaca gatttggttg ctgtgctctg cgaaagccag 3000ttaccacggt
taagcagttc cccaactgac ttaaccttcg atcaaaccac ctccccaggt
3060ggttttttcg tttacagggc aaaagattac gcgcagaaaa aaaggatctc
aagaagatcc 3120tttgatcttt tctactgaac cgctctagat ttcagtgcaa
tttatctctt caaatgtagc 3180acctgaagtc agccccatac gatataagtt
gtaattctca tgttagtcat gccccgcgcc 3240caccggaagg agctgactgg
gttgaaggct ctcaagggca tcggtcgaga tcccggtgcc 3300taatgagtga
gctaacttac attaattgcg ttgcgctcac tgcccgcttt ccagtcggga
3360aacctgtcgt gccagctgca ttaatgaatc ggccaacgcg cggggagagg
cggtttgcgt 3420attgggcgcc agggtggttt ttcttttcac cagtgagacg
ggcaacagct gattgccctt 3480caccgcctgg ccctgagaga gttgcagcaa
gcggtccacg ctggtttgcc ccagcaggcg 3540aaaatcctgt ttgatggtgg
ttaacggcgg gatataacat gagctgtctt cggtatcgtc 3600gtatcccact
accgagatgt ccgcaccaac gcgcagcccg gactcggtaa tggcgcgcat
3660tgcgcccagc gccatctgat cgttggcaac cagcatcgca gtgggaacga
tgccctcatt 3720cagcatttgc atggtttgtt gaaaaccgga catggcactc
cagtcgcctt cccgttccgc 3780tatcggctga atttgattgc gagtgagata
tttatgccag ccagccagac gcagacgcgc 3840cgagacagaa cttaatgggc
ccgctaacag cgcgatttgc tggtgaccca atgcgaccag 3900atgctccacg
cccagtcgcg taccgtcttc atgggagaaa ataatactgt tgatgggtgt
3960ctggtcagag acatcaagaa ataacgccgg aacattagtg caggcagctt
ccacagcaat 4020ggcatcctgg tcatccagcg gatagttaat gatcagccca
ctgacgcgtt gcgcgagaag 4080attgtgcacc gccgctttac aggcttcgac
gccgcttcgt tctaccatcg acaccaccac 4140gctggcaccc agttgatcgg
cgcgagattt aatcgccgcg acaatttgcg acggcgcgtg 4200cagggccaga
ctggaggtgg caacgccaat cagcaacgac tgtttgcccg ccagttgttg
4260tgccacgcgg ttgggaatgt aattcagctc cgccatcgcc gcttccactt
tttcccgcgt 4320tttcgcagaa acgtggctgg cctggttcac cacgcgggaa
acggtctgat aagagacacc 4380ggcatactct gcgacatcgt ataacgttac
tggtttcaca ttcaccaccc tgaattgact 4440ctcttccggg cgctatcatg
ccataccgcg aaaggttttg cgccattcga tggtgtccgg 4500gatctcgacg
ctctccctta tgcgactcct gcattaggaa attaatacga ctcactata
4559
129 5502 DNA Artificial sequence pDG6 plasmid
129
ggggaattgt
gagcggataa caattcccct gtagaaataa ttttgtttaa ctttaataag 60gagatatacc
atggcgcaac tcactcttct tttagtcggc aattccgacg ccatcacgcc
120attacttgct aaagctgact ttgaacaacg ttcgcgtctg cagattattc
ctgcgcagtc 180agttatcgcc agtgatgccc ggccttcgca agctatccgc
gccagtcgtg ggagttcaat 240gcgcgtggcc ctggagctgg tgaaagaagg
tcgagcgcaa gcctgtgtca gtgccggtaa 300taccggggcg ctgatggggc
tggcaaaatt attactcaag cccctggagg ggattgagcg 360tccggcgctg
gtgacggtat taccacatca gcaaaagggc aaaacggtgg tccttgactt
420aggggccaac gtcgattgtg acagcacaat gctggtgcaa tttgccatta
tgggctcagt 480tctggctgaa gaggtggtgg aaattcccaa tcctcgcgtg
gcgttgctca atattggtga 540agaagaagta aagggtctcg acagtattcg
ggatgcctca gcggtgctta aaacaatccc 600ttctatcaat tatatcggct
atcttgaagc caatgagttg ttaactggca agacagatgt 660gctggtttgt
gacggcttta caggaaatgt cacattaaag acgatggaag gtgttgtcag
720gatgttcctt tctctgctga aatctcaggg tgaagggaaa aaacggtcgt
ggtggctact 780gttattaaag cgttggctac aaaagagcct gacgaggcga
ttcagtcacc tcaaccccga 840ccagtataac ggcgcctgtc tgttaggatt
gcgcggcacg gtgataaaaa gtcatggtgc 900agccaatcag cgagcttttg
cggtcgcgat tgaacaggca gtgcaggcgg tgcagcgaca 960agttcctcag
cgaattgccg ctcgcctgga atctgtatac ccagctggtt ttgagctgct
1020ggacggtggc aaaagcggaa ctctgcggta gcaggacgct gccagcgaac
tcgcagtttg 1080caagtgacgg tatataaccg aaaagtgact gagcgcatat
gaaagctggc attcttggtg 1140ttggacgtta cattcctgag aaggttttaa
caaatcatga tcttgaaaaa atggttgaaa 1200cttctgacga gtggattcgt
acaagaacag gaatagaaga aagaagaatc gcagcagatg 1260atgtgttttc
atcacacatg gctgttgcag cagcgaaaaa tgcgctggaa caagctgaag
1320tggctgctga ggatctggat atgatcttgg ttgcaactgt tacacctgat
cagtcattcc 1380ctacggtgtc ttgtatgatt caagaacaac tcggcgcgaa
gaaagcgtgt gctatggata 1440tcagcgcggc ttgtgcgggc ttcatgtacg
gggttgtaac cggtaaacaa tttattgaat 1500ccggaaccta caagcatgtt
ctagttgttg gtgtagagaa gctctcaagc attaccgact 1560gggaagaccg
caatacagcc gttctgtttg gagacggagc aggcgctgcg gtagtcgggc
1620cagtcagtga tgacagagga atcctttcat ttgaactagg agccgacggc
acaggcggtc 1680agcacttgta tctgaatgaa aaacgacata caatcatgaa
tggacgagaa gttttcaaat 1740ttgcagtccg ccaaatggga gaatcatgcg
taaatgtcat tgaaaaagcc ggactttcaa 1800aagaggatgt ggactttttg
attccgcatc aggcgaacat ccgtatcatg gaagctgctc 1860gcgagcgttt
agagcttcct gtcgaaaaga tgtctaaaac tgttcataaa tatggaaata
1920cttctgccgc atccattccg atctctcttg tagaagaatt ggaagccggt
aaaatcaaag 1980acggcgatgt ggtcgttatg gtagggttcg gcggaggact
aacatggggc gccattgcaa 2040tccgctgggg ccgataaaaa aaaggtgagg
tgcactcgag tctggtaaag aaaccgctgc 2100tgcgaaattt gaacgccagc
acatggactc gtctactagc gcagcttaat taacctaggc 2160tgctgccacc
gctgagcaat aactagcata accccttggg gcctctaaac gggtcttgag
2220gggttttttg ctgaaacctc aggcatttga gaagcacacg gtcacactgc
ttccggtagt 2280caataaaccg gtaaaccagc aatagacata agcggctatt
taacgaccct gccctgaacc 2340gacgaccggg tcatcgtggc cggatcttgc
ggcccctcgg cttgaacgaa ttgttagaca 2400ttatttgccg actaccttgg
tgatctcgcc tttcacgtag tggacaaatt cttccaactg 2460atctgcgcgc
gaggccaagc gatcttcttc ttgtccaaga taagcctgtc tagcttcaag
2520tatgacgggc tgatactggg ccggcaggcg ctccattgcc cagtcggcag
cgacatcctt 2580cggcgcgatt ttgccggtta ctgcgctgta ccaaatgcgg
gacaacgtaa gcactacatt 2640tcgctcatcg ccagcccagt cgggcggcga
gttccatagc gttaaggttt catttagcgc 2700ctcaaataga tcctgttcag
gaaccggatc aaagagttcc tccgccgctg gacctaccaa 2760ggcaacgcta
tgttctcttg cttttgtcag caagatagcc agatcaatgt cgatcgtggc
2820tggctcgaag atacctgcaa gaatgtcatt gcgctgccat tctccaaatt
gcagttcgcg 2880cttagctgga taacgccacg gaatgatgtc gtcgtgcaca
acaatggtga cttctacagc 2940gcggagaatc tcgctctctc caggggaagc
cgaagtttcc aaaaggtcgt tgatcaaagc 3000tcgccgcgtt gtttcatcaa
gccttacggt caccgtaacc agcaaatcaa tatcactgtg 3060tggcttcagg
ccgccatcca ctgcggagcc gtacaaatgt acggccagca acgtcggttc
3120gagatggcgc tcgatgacgc caactacctc tgatagttga gtcgatactt
cggcgatcac 3180cgcttccctc atactcttcc tttttcaata ttattgaagc
atttatcagg gttattgtct 3240catgagcgga tacatatttg aatgtattta
gaaaaataaa caaatagcta gctcactcgg 3300tcgctacgct ccgggcgtga
gactgcggcg ggcgctgcgg acacatacaa agttacccac 3360agattccgtg
gataagcagg ggactaacat gtgaggcaaa acagcagggc cgcgccggtg
3420gcgtttttcc ataggctccg ccctcctgcc agagttcaca taaacagacg
cttttccggt 3480gcatctgtgg gagccgtgag gctcaaccat gaatctgaca
gtacgggcga aacccgacag 3540gacttaaaga tccccaccgt ttccggcggg
tcgctccctc ttgcgctctc ctgttccgac 3600cctgccgttt accggatacc
tgttccgcct ttctccctta cgggaagtgt ggcgctttct 3660catagctcac
acactggtat ctcggctcgg tgtaggtcgt tcgctccaag ctgggctgta
3720agcaagaact ccccgttcag cccgactgct gcgccttatc cggtaactgt
tcacttgagt 3780ccaacccgga aaagcacggt aaaacgccac tggcagcagc
cattggtaac tgggagttcg 3840cagaggattt gtttagctaa acacgcggtt
gctcttgaag tgtgcgccaa agtccggcta 3900cactggaagg acagatttgg
ttgctgtgct ctgcgaaagc cagttaccac ggttaagcag 3960ttccccaact
gacttaacct tcgatcaaac cacctcccca ggtggttttt tcgtttacag
4020ggcaaaagat tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc
ttttctactg 4080aaccgctcta gatttcagtg caatttatct cttcaaatgt
agcacctgaa gtcagcccca 4140tacgatataa gttgtaattc tcatgttagt
catgccccgc gcccaccgga aggagctgac 4200tgggttgaag gctctcaagg
gcatcggtcg agatcccggt gcctaatgag tgagctaact 4260tacattaatt
gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct
4320gcattaatga atcggccaac gcgcggggag aggcggtttg cgtattgggc
gccagggtgg 4380tttttctttt caccagtgag acgggcaaca gctgattgcc
cttcaccgcc tggccctgag 4440agagttgcag caagcggtcc acgctggttt
gccccagcag gcgaaaatcc tgtttgatgg 4500tggttaacgg cgggatataa
catgagctgt cttcggtatc gtcgtatccc actaccgaga 4560tgtccgcacc
aacgcgcagc ccggactcgg taatggcgcg cattgcgccc agcgccatct
4620gatcgttggc aaccagcatc gcagtgggaa cgatgccctc attcagcatt
tgcatggttt 4680gttgaaaacc ggacatggca ctccagtcgc cttcccgttc
cgctatcggc tgaatttgat 4740tgcgagtgag atatttatgc cagccagcca
gacgcagacg cgccgagaca gaacttaatg 4800ggcccgctaa cagcgcgatt
tgctggtgac ccaatgcgac cagatgctcc acgcccagtc 4860gcgtaccgtc
ttcatgggag aaaataatac tgttgatggg tgtctggtca gagacatcaa
4920gaaataacgc cggaacatta gtgcaggcag cttccacagc aatggcatcc
tggtcatcca 4980gcggatagtt aatgatcagc ccactgacgc gttgcgcgag
aagattgtgc accgccgctt 5040tacaggcttc gacgccgctt cgttctacca
tcgacaccac cacgctggca cccagttgat 5100cggcgcgaga tttaatcgcc
gcgacaattt gcgacggcgc gtgcagggcc agactggagg 5160tggcaacgcc
aatcagcaac gactgtttgc ccgccagttg ttgtgccacg cggttgggaa
5220tgtaattcag ctccgccatc gccgcttcca ctttttcccg cgttttcgca
gaaacgtggc 5280tggcctggtt caccacgcgg gaaacggtct gataagagac
accggcatac tctgcgacat 5340cgtataacgt tactggtttc acattcacca
ccctgaattg actctcttcc gggcgctatc 5400atgccatacc gcgaaaggtt
ttgcgccatt cgatggtgtc cgggatctcg acgctctccc 5460ttatgcgact
cctgcattag gaaattaata cgactcacta ta 5502
130 5541 DNA Artificial sequence pDG7 plasmid
130
ggggaattgt gagcggataa caattcccct gtagaaataa
ttttgtttaa ctttaataag 60gagatatacc atggcgcaac tcactcttct tttagtcggc
aattccgacg ccatcacgcc 120attacttgct aaagctgact ttgaacaacg
ttcgcgtctg cagattattc ctgcgcagtc 180agttatcgcc agtgatgccc
ggccttcgca agctatccgc gccagtcgtg ggagttcaat 240gcgcgtggcc
ctggagctgg tgaaagaagg tcgagcgcaa gcctgtgtca gtgccggtaa
300taccggggcg ctgatggggc tggcaaaatt attactcaag cccctggagg
ggattgagcg 360tccggcgctg gtgacggtat taccacatca gcaaaagggc
aaaacggtgg tccttgactt 420aggggccaac gtcgattgtg acagcacaat
gctggtgcaa tttgccatta tgggctcagt 480tctggctgaa gaggtggtgg
aaattcccaa tcctcgcgtg gcgttgctca atattggtga 540agaagaagta
aagggtctcg acagtattcg ggatgcctca gcggtgctta aaacaatccc
600ttctatcaat tatatcggct atcttgaagc caatgagttg ttaactggca
agacagatgt 660gctggtttgt gacggcttta caggaaatgt cacattaaag
acgatggaag gtgttgtcag 720gatgttcctt tctctgctga aatctcaggg
tgaagggaaa aaacggtcgt ggtggctact 780gttattaaag cgttggctac
aaaagagcct gacgaggcga ttcagtcacc tcaaccccga 840ccagtataac
ggcgcctgtc tgttaggatt gcgcggcacg gtgataaaaa gtcatggtgc
900agccaatcag cgagcttttg cggtcgcgat tgaacaggca gtgcaggcgg
tgcagcgaca 960agttcctcag cgaattgccg ctcgcctgga atctgtatac
ccagctggtt ttgagctgct 1020ggacggtggc aaaagcggaa ctctgcggta
gcaggacgct gccagcgaac tcgcagtttg 1080caagtgacgg tatataaccg
aaaagtgact gagcgcatat gtcaaaagca aaaattacag 1140ctatcggcac
ctatgcgccg agcagacgtt taaccaatgc agatttagaa aagatcgttg
1200atacctctga tgaatggatc gttcagcgca caggaatgag agaacgccgg
attgcggatg 1260aacatcaatt tacctctgat ttatgcatag aagcggtgaa
gaatctcaag agccgttata 1320aaggaacgct tgatgatgtc gatatgatcc
tcgttgccac aaccacatcc gattacgcct 1380ttccgagtac ggcatgccgc
gtacaggaat atttcggctg ggaaagcacc ggcgcgctgg 1440atattaatgc
gacatgcgcc gggctgacat acggcctcca tttggcaaat ggattgatca
1500catctggcct tcatcaaaaa attctcgtca tcgccggaga gacgttatca
aaggtaaccg 1560attatacgga tcgaacgaca tgcgtactgt tcggcgatgc
cgcgggtgcg ctgttagtag 1620aacgagatga agagacgccg ggatttcttg
cgtctgtaca aggaacaagc gggaacggcg 1680gcgatatttt gtatcgtgcc
ggactgcgaa atgaaataaa cggtgtgcag cttgtcggtt 1740ccggaaaaat
ggtgcaaaac ggacgcgagg tatataaatg ggccgcaaga accgtccctg
1800gcgaatttga acggctttta cataaagcag gactcagctc cgatgatctc
gattggtttg 1860ttcctcacag cgccaacttg cgcatgatcg agtcaatttg
tgaaaaaaca ccgttcccga 1920ttgaaaaaac gctcactagc gttgagcact
acggaaacac gtcttcggtt tcaattgttt 1980tggcgctcga tctcgcagtg
aaagccggga agctgaaaaa agatcaaatc gttttgcttt 2040tcgggtttgg
cggcggatta acctatacag gattgcttat taaatggggg atgtaaagat
2100ctcctaggcg tcactcgagt ctggtaaaga aaccgctgct gcgaaatttg
aacgccagca 2160catggactcg tctactagcg cagcttaatt aacctaggct
gctgccaccg ctgagcaata 2220actagcataa ccccttgggg cctctaaacg
ggtcttgagg ggttttttgc tgaaacctca 2280ggcatttgag aagcacacgg
tcacactgct tccggtagtc aataaaccgg taaaccagca 2340atagacataa
gcggctattt aacgaccctg ccctgaaccg acgaccgggt catcgtggcc
2400ggatcttgcg gcccctcggc ttgaacgaat tgttagacat tatttgccga
ctaccttggt 2460gatctcgcct ttcacgtagt ggacaaattc ttccaactga
tctgcgcgcg aggccaagcg 2520atcttcttct tgtccaagat aagcctgtct
agcttcaagt atgacgggct gatactgggc 2580cggcaggcgc tccattgccc
agtcggcagc gacatccttc ggcgcgattt tgccggttac 2640tgcgctgtac
caaatgcggg acaacgtaag cactacattt cgctcatcgc cagcccagtc
2700gggcggcgag ttccatagcg ttaaggtttc atttagcgcc tcaaatagat
cctgttcagg 2760aaccggatca aagagttcct ccgccgctgg acctaccaag
gcaacgctat gttctcttgc 2820ttttgtcagc aagatagcca gatcaatgtc
gatcgtggct ggctcgaaga tacctgcaag 2880aatgtcattg cgctgccatt
ctccaaattg cagttcgcgc ttagctggat aacgccacgg 2940aatgatgtcg
tcgtgcacaa caatggtgac ttctacagcg cggagaatct cgctctctcc
3000aggggaagcc gaagtttcca aaaggtcgtt gatcaaagct cgccgcgttg
tttcatcaag 3060ccttacggtc accgtaacca gcaaatcaat atcactgtgt
ggcttcaggc cgccatccac 3120tgcggagccg tacaaatgta cggccagcaa
cgtcggttcg agatggcgct cgatgacgcc 3180aactacctct gatagttgag
tcgatacttc ggcgatcacc gcttccctca tactcttcct 3240ttttcaatat
tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga
3300atgtatttag aaaaataaac aaatagctag ctcactcggt cgctacgctc
cgggcgtgag 3360actgcggcgg gcgctgcgga cacatacaaa gttacccaca
gattccgtgg ataagcaggg 3420gactaacatg tgaggcaaaa cagcagggcc
gcgccggtgg cgtttttcca taggctccgc 3480cctcctgcca gagttcacat
aaacagacgc ttttccggtg catctgtggg agccgtgagg 3540ctcaaccatg
aatctgacag tacgggcgaa acccgacagg acttaaagat ccccaccgtt
3600tccggcgggt cgctccctct tgcgctctcc tgttccgacc ctgccgttta
ccggatacct 3660gttccgcctt tctcccttac gggaagtgtg gcgctttctc
atagctcaca cactggtatc 3720tcggctcggt gtaggtcgtt cgctccaagc
tgggctgtaa gcaagaactc cccgttcagc 3780ccgactgctg cgccttatcc
ggtaactgtt cacttgagtc caacccggaa aagcacggta 3840aaacgccact
ggcagcagcc attggtaact gggagttcgc agaggatttg tttagctaaa
3900cacgcggttg ctcttgaagt gtgcgccaaa gtccggctac actggaagga
cagatttggt 3960tgctgtgctc tgcgaaagcc agttaccacg gttaagcagt
tccccaactg acttaacctt 4020cgatcaaacc acctccccag gtggtttttt
cgtttacagg gcaaaagatt acgcgcagaa 4080aaaaaggatc tcaagaagat
cctttgatct tttctactga accgctctag atttcagtgc 4140aatttatctc
ttcaaatgta gcacctgaag tcagccccat
acgatataag ttgtaattct 4200catgttagtc atgccccgcg cccaccggaa
ggagctgact gggttgaagg ctctcaaggg 4260catcggtcga gatcccggtg
cctaatgagt gagctaactt acattaattg cgttgcgctc 4320actgcccgct
ttccagtcgg gaaacctgtc gtgccagctg cattaatgaa tcggccaacg
4380cgcggggaga ggcggtttgc gtattgggcg ccagggtggt ttttcttttc
accagtgaga 4440cgggcaacag ctgattgccc ttcaccgcct ggccctgaga
gagttgcagc aagcggtcca 4500cgctggtttg ccccagcagg cgaaaatcct
gtttgatggt ggttaacggc gggatataac 4560atgagctgtc ttcggtatcg
tcgtatccca ctaccgagat gtccgcacca acgcgcagcc 4620cggactcggt
aatggcgcgc attgcgccca gcgccatctg atcgttggca accagcatcg
4680cagtgggaac gatgccctca ttcagcattt gcatggtttg ttgaaaaccg
gacatggcac 4740tccagtcgcc ttcccgttcc gctatcggct gaatttgatt
gcgagtgaga tatttatgcc 4800agccagccag acgcagacgc gccgagacag
aacttaatgg gcccgctaac agcgcgattt 4860gctggtgacc caatgcgacc
agatgctcca cgcccagtcg cgtaccgtct tcatgggaga 4920aaataatact
gttgatgggt gtctggtcag agacatcaag aaataacgcc ggaacattag
4980tgcaggcagc ttccacagca atggcatcct ggtcatccag cggatagtta
atgatcagcc 5040cactgacgcg ttgcgcgaga agattgtgca ccgccgcttt
acaggcttcg acgccgcttc 5100gttctaccat cgacaccacc acgctggcac
ccagttgatc ggcgcgagat ttaatcgccg 5160cgacaatttg cgacggcgcg
tgcagggcca gactggaggt ggcaacgcca atcagcaacg 5220actgtttgcc
cgccagttgt tgtgccacgc ggttgggaat gtaattcagc tccgccatcg
5280ccgcttccac tttttcccgc gttttcgcag aaacgtggct ggcctggttc
accacgcggg 5340aaacggtctg ataagagaca ccggcatact ctgcgacatc
gtataacgtt actggtttca 5400cattcaccac cctgaattga ctctcttccg
ggcgctatca tgccataccg cgaaaggttt 5460tgcgccattc gatggtgtcc
gggatctcga cgctctccct tatgcgactc ctgcattagg 5520aaattaatac
gactcactat a 5541
131 5582 DNA Artificial sequence pDG8 plasmid
131
ggggaattgt gagcggataa caattcccct gtagaaataa ttttgtttaa
ctttaataag 60gagatatacc atggcgcaac tcactcttct tttagtcggc aattccgacg
ccatcacgcc 120attacttgct aaagctgact ttgaacaacg ttcgcgtctg
cagattattc ctgcgcagtc 180agttatcgcc agtgatgccc ggccttcgca
agctatccgc gccagtcgtg ggagttcaat 240gcgcgtggcc ctggagctgg
tgaaagaagg tcgagcgcaa gcctgtgtca gtgccggtaa 300taccggggcg
ctgatggggc tggcaaaatt attactcaag cccctggagg ggattgagcg
360tccggcgctg gtgacggtat taccacatca gcaaaagggc aaaacggtgg
tccttgactt 420aggggccaac gtcgattgtg acagcacaat gctggtgcaa
tttgccatta tgggctcagt 480tctggctgaa gaggtggtgg aaattcccaa
tcctcgcgtg gcgttgctca atattggtga 540agaagaagta aagggtctcg
acagtattcg ggatgcctca gcggtgctta aaacaatccc 600ttctatcaat
tatatcggct atcttgaagc caatgagttg ttaactggca agacagatgt
660gctggtttgt gacggcttta caggaaatgt cacattaaag acgatggaag
gtgttgtcag 720gatgttcctt tctctgctga aatctcaggg tgaagggaaa
aaacggtcgt ggtggctact 780gttattaaag cgttggctac aaaagagcct
gacgaggcga ttcagtcacc tcaaccccga 840ccagtataac ggcgcctgtc
tgttaggatt gcgcggcacg gtgataaaaa gtcatggtgc 900agccaatcag
cgagcttttg cggtcgcgat tgaacaggca gtgcaggcgg tgcagcgaca
960agttcctcag cgaattgccg ctcgcctgga atctgtatac ccagctggtt
ttgagctgct 1020ggacggtggc aaaagcggaa ctctgcggta gcaggacgct
gccagcgaac tcgcagtttg 1080caagtgacgg tatataaccg aaaagtgact
gagcgcatat gtctaagatc aagccaagca 1140agggcgctcc gtacgcgcgc
atcctgggcg tcggcggtta ccgtccgacc cgtgtggtgc 1200cgaacgaggt
gatcctggag aagatcgact cttccgacga gtggattcgc tctcgctccg
1260gcatcgaaac gcgtcactgg gcgggtccgg aagaaaccgt cgcggcgatg
tctgtggagg 1320cctccggcaa ggcactggcc gacgccggta tcgacgcctc
tcgtatcggt gccgtggtag 1380tctctaccgt gtctcacttc agccagaccc
cggccatcgc caccgagatc gccgaccgcc 1440tgggcacgga caaggccgca
gccttcgaca tctctgccgg ctgcgcgggc ttcggctacg 1500gtctgaccct
ggccaagggc atggtcgtcg aaggttctgc ggagtacgtg ctggtcatcg
1560gcgtggagcg tctgtccgac ctgaccgacc tggaggaccg tgccacggcc
ttcctgttcg 1620gcgacggcgc tggtgcggtc gtggtcggcc cgtcccagga
gccggcaatc ggcccgacgg 1680tctggggctc tgagggcgac aaggccgaaa
cgatcaagca gaccgtttcc tgggaccgct 1740tccgtatcgg cgatgtctcc
gaactgccgc tggactccga gggcaacgtc aagtttcctg 1800cgatcacgca
ggagggccag gcggtgttcc gctgggccgt gttcgagatg gcgaaggtcg
1860cgcagcaggc gctggacgcg gcgggtatca gcccggacga cctggacgtc
tttatcccgc 1920accaggccaa tgtgcgtatc atcgactcta tggtgaaaac
cctgaagctg ccggagcacg 1980tcacggtcgc ccgtgacatc cgcaccaccg
gcaacacctc tgccgcctct attccgctgg 2040cgatggagcg tctgctggcg
accggcgacg cgcgtagcgg cgacaccgcg ctggtcatcg 2100gcttcggtgc
gggtctggtc tacgccgcga cggtcgttac cctgccgtaa ccacctcgag
2160tctggtaaag aaaccgctgc tgcgaaattt gaacgccagc acatggactc
gtctactagc 2220gcagcttaat taacctaggc tgctgccacc gctgagcaat
aactagcata accccttggg 2280gcctctaaac gggtcttgag gggttttttg
ctgaaacctc aggcatttga gaagcacacg 2340gtcacactgc ttccggtagt
caataaaccg gtaaaccagc aatagacata agcggctatt 2400taacgaccct
gccctgaacc gacgaccggg tcatcgtggc cggatcttgc ggcccctcgg
2460cttgaacgaa ttgttagaca ttatttgccg actaccttgg tgatctcgcc
tttcacgtag 2520tggacaaatt cttccaactg atctgcgcgc gaggccaagc
gatcttcttc ttgtccaaga 2580taagcctgtc tagcttcaag tatgacgggc
tgatactggg ccggcaggcg ctccattgcc 2640cagtcggcag cgacatcctt
cggcgcgatt ttgccggtta ctgcgctgta ccaaatgcgg 2700gacaacgtaa
gcactacatt tcgctcatcg ccagcccagt cgggcggcga gttccatagc
2760gttaaggttt catttagcgc ctcaaataga tcctgttcag gaaccggatc
aaagagttcc 2820tccgccgctg gacctaccaa ggcaacgcta tgttctcttg
cttttgtcag caagatagcc 2880agatcaatgt cgatcgtggc tggctcgaag
atacctgcaa gaatgtcatt gcgctgccat 2940tctccaaatt gcagttcgcg
cttagctgga taacgccacg gaatgatgtc gtcgtgcaca 3000acaatggtga
cttctacagc gcggagaatc tcgctctctc caggggaagc cgaagtttcc
3060aaaaggtcgt tgatcaaagc tcgccgcgtt gtttcatcaa gccttacggt
caccgtaacc 3120agcaaatcaa tatcactgtg tggcttcagg ccgccatcca
ctgcggagcc gtacaaatgt 3180acggccagca acgtcggttc gagatggcgc
tcgatgacgc caactacctc tgatagttga 3240gtcgatactt cggcgatcac
cgcttccctc atactcttcc tttttcaata ttattgaagc 3300atttatcagg
gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa
3360caaatagcta gctcactcgg tcgctacgct ccgggcgtga gactgcggcg
ggcgctgcgg 3420acacatacaa agttacccac agattccgtg gataagcagg
ggactaacat gtgaggcaaa 3480acagcagggc cgcgccggtg gcgtttttcc
ataggctccg ccctcctgcc agagttcaca 3540taaacagacg cttttccggt
gcatctgtgg gagccgtgag gctcaaccat gaatctgaca 3600gtacgggcga
aacccgacag gacttaaaga tccccaccgt ttccggcggg tcgctccctc
3660ttgcgctctc ctgttccgac cctgccgttt accggatacc tgttccgcct
ttctccctta 3720cgggaagtgt ggcgctttct catagctcac acactggtat
ctcggctcgg tgtaggtcgt 3780tcgctccaag ctgggctgta agcaagaact
ccccgttcag cccgactgct gcgccttatc 3840cggtaactgt tcacttgagt
ccaacccgga aaagcacggt aaaacgccac tggcagcagc 3900cattggtaac
tgggagttcg cagaggattt gtttagctaa acacgcggtt gctcttgaag
3960tgtgcgccaa agtccggcta cactggaagg acagatttgg ttgctgtgct
ctgcgaaagc 4020cagttaccac ggttaagcag ttccccaact gacttaacct
tcgatcaaac cacctcccca 4080ggtggttttt tcgtttacag ggcaaaagat
tacgcgcaga aaaaaaggat ctcaagaaga 4140tcctttgatc ttttctactg
aaccgctcta gatttcagtg caatttatct cttcaaatgt 4200agcacctgaa
gtcagcccca tacgatataa gttgtaattc tcatgttagt catgccccgc
4260gcccaccgga aggagctgac tgggttgaag gctctcaagg gcatcggtcg
agatcccggt 4320gcctaatgag tgagctaact tacattaatt gcgttgcgct
cactgcccgc tttccagtcg 4380ggaaacctgt cgtgccagct gcattaatga
atcggccaac gcgcggggag aggcggtttg 4440cgtattgggc gccagggtgg
tttttctttt caccagtgag acgggcaaca gctgattgcc 4500cttcaccgcc
tggccctgag agagttgcag caagcggtcc acgctggttt gccccagcag
4560gcgaaaatcc tgtttgatgg tggttaacgg cgggatataa catgagctgt
cttcggtatc 4620gtcgtatccc actaccgaga tgtccgcacc aacgcgcagc
ccggactcgg taatggcgcg 4680cattgcgccc agcgccatct gatcgttggc
aaccagcatc gcagtgggaa cgatgccctc 4740attcagcatt tgcatggttt
gttgaaaacc ggacatggca ctccagtcgc cttcccgttc 4800cgctatcggc
tgaatttgat tgcgagtgag atatttatgc cagccagcca gacgcagacg
4860cgccgagaca gaacttaatg ggcccgctaa cagcgcgatt tgctggtgac
ccaatgcgac 4920cagatgctcc acgcccagtc gcgtaccgtc ttcatgggag
aaaataatac tgttgatggg 4980tgtctggtca gagacatcaa gaaataacgc
cggaacatta gtgcaggcag cttccacagc 5040aatggcatcc tggtcatcca
gcggatagtt aatgatcagc ccactgacgc gttgcgcgag 5100aagattgtgc
accgccgctt tacaggcttc gacgccgctt cgttctacca tcgacaccac
5160cacgctggca cccagttgat cggcgcgaga tttaatcgcc gcgacaattt
gcgacggcgc 5220gtgcagggcc agactggagg tggcaacgcc aatcagcaac
gactgtttgc ccgccagttg 5280ttgtgccacg cggttgggaa tgtaattcag
ctccgccatc gccgcttcca ctttttcccg 5340cgttttcgca gaaacgtggc
tggcctggtt caccacgcgg gaaacggtct gataagagac 5400accggcatac
tctgcgacat cgtataacgt tactggtttc acattcacca ccctgaattg
5460actctcttcc gggcgctatc atgccatacc gcgaaaggtt ttgcgccatt
cgatggtgtc 5520cgggatctcg acgctctccc ttatgcgact cctgcattag
gaaattaata cgactcacta 5580ta 5582
132 5678 DNA Artificial sequence pDG10 plasmid
132
agcgcccaat acgcaaaccg cctctccccg cgcgttggcc gattcattaa
tgcagctggc 60acgacaggtt tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat
gtgagttagc 120tcactcatta ggcaccccag gctttacact ttatgcttcc
ggctcgtatg ttgtgtggaa 180ttgtgagcgg ataacaattt cacacaggaa
acagctatga ccatgattac gccaagctat 240ttaggtgacg cgttagaata
ctcaagctat gcatcaagct tggtaccgag ctcggatcca 300ctagtaacgg
ccgccagtgt gctggaattc aggcttaact tcatgtgaaa agtttgttaa
360aatataaatg agcacgttaa tcatttaaca tagataatta aatagtaaaa
gggagtgtac 420gaccagtgat taagagtttt aatgaaatta tcatgaaggt
aaagagcaaa gaaatgaaaa 480aagttgctgt tgctgtagca caagacgagc
cagtacttga agcagtaaga gatgctaaga 540aaaatggtat tgcagatgct
attcttgttg gagaccatga cgaaatcgtg tcaatcgcgc 600ttaaaatagg
aatggatgta aatgattttg aaatagtaaa cgagcctaac gttaagaaag
660ctgctttaaa ggcagtagag cttgtatcaa ctggaaaagc tgatatggta
atgaagggac 720ttgtaaatac agcaactttc ttaagatctg tattaaacaa
agaagttgga cttagaacag 780gaaaaactat gtctcacgtt gcagtatttg
aaactgagaa atttgataga ctattatttt 840taacagatgt tgctttcaat
acttatcctg aattaaagga aaaaattgat atagtaaaca 900attcagttaa
ggttgcacat gcaataggaa ttgaaaatcc aaaggttgct ccaatttgtg
960cagttgaggt tataaaccct aaaatgccat caacacttga tgcagcaatg
ctttcaaaaa 1020tgagtgacag aggacaaatt aaaggttgtg tagttgacgg
acctttagca cttgatatag 1080ctttatcaga agaagcagca catcataagg
gagtaacagg agaagttgct ggaaaagctg 1140atatcttctt aatgccaaac
atagaaacag gaaatgtaat gtataagact ttaacatata 1200caactgattc
aaaaaatgga ggaatcttag ttggaacttc tgcaccagtt gttttaactt
1260caagagctga cagccatgaa acaaaaatga actctatagc acttgcagct
ttagttgcag 1320gcaataaata aattaaagtt aagtggagga atgttaacat
gtatagatta ctaataatca 1380atcctggctc gacctcaact aaaattggta
tttatgacga tgaaaaagag atatttgaga 1440agactttaag acattcagct
gaagagatag aaaaatataa cactatattt gatcaatttc 1500aattcagaaa
gaatgtaatt ttagatgcgt taaaagaagc aaacatagaa gtaagttctt
1560taaatgctgt agttggaaga ggcggactct taaagccaat agtaagtgga
acttatgcag 1620taaatcaaaa aatgcttgaa gaccttaaag taggagttca
aggtcagcat gcgtcaaatc 1680ttggtggaat tattgcaaat gaaatagcaa
aagaaataaa tgttccagca tacatagttg 1740atccagttgt tgtggatgag
cttgatgaag tttcaagaat atcaggaatg gctgacattc 1800caagaaaaag
tatattccat gcattaaatc aaaaagcagt tgctagaaga tatgcaaaag
1860aagttggaaa aaaatacgaa gatcttaatt taatcgtagt ccacatgggt
ggaggtactt 1920cagtaggtac tcataaagat ggtagagtaa tagaagttaa
taatacactt gatggagaag 1980gtccattctc accagaaaga agtggtggag
ttccaatagg agatcttgta agattgtgct 2040tcagcaacaa atatacttat
gaagaagtaa tgaaaaagat aaacggcaaa ggcggagttg 2100ttagttactt
aaatactatc gattttaagg ctgtagttga taaagctctt gaaggagata
2160agaaatgtgc acttatatat gaagctttca cattccaggt agcaaaagag
ataggaaaat 2220gttcaaccgt tttaaaagga aatgtagatg caataatctt
aacaggcgga attgcgtaca 2280acgagcatgt atgtaatgcc atagaggata
gagtaaaatt catagcacct gtagttagat 2340atggtggaga agatgaactt
cttgcacttg cagaaggtgg acttagagtt ttaagaggag 2400aagaaaaagc
taaggaatac aaataataaa gtcataaata atataatata accagtaccc
2460atgtttataa aacttttgcc ctataaacat gggtattgtc ctgaattctg
cagatatcca 2520tcacactggc ggccgctcga gcatgcatct agagggccca
attcgcccta tagtgagtcg 2580tattacaatt cactggccgt cgttttacaa
cgtcgtgact gggaaaaccc tggcgttacc 2640caacttaatc gccttgcagc
acatccccct ttcgccagct ggcgtaatag cgaagaggcc 2700cgcaccgatc
gcccttccca acagttgcgc agcctatacg tacggcagtt taaggtttac
2760acctataaaa gagagagccg ttatcgtctg tttgtggatg tacagagtga
tattattgac 2820acgccggggc gacggatggt gatccccctg gccagtgcac
gtctgctgtc agataaagtc 2880tcccgtgaac tttacccggt ggtgcatatc
ggggatgaaa gctggcgcat gatgaccacc 2940gatatggcca gtgtgccggt
ctccgttatc ggggaagaag tggctgatct cagccaccgc 3000gaaaatgaca
tcaaaaacgc cattaacctg atgttctggg gaatataaat gtcaggcatg
3060agattatcaa aaaggatctt cacctagatc cttttcacgt agaaagccag
tccgcagaaa 3120cggtgctgac cccggatgaa tgtcagctac tgggctatct
ggacaaggga aaacgcaagc 3180gcaaagagaa agcaggtagc ttgcagtggg
cttacatggc gatagctaga ctgggcggtt 3240ttatggacag caagcgaacc
ggaattgcca gctggggcgc cctctggtaa ggttgggaag 3300ccctgcaaag
taaactggat ggctttcttg ccgccaagga tctgatggcg caggggatca
3360agctctgatc aagagacagg atgaggatcg tttcgcatga ttgaacaaga
tggattgcac 3420gcaggttctc cggccgcttg ggtggagagg ctattcggct
atgactgggc acaacagaca 3480atcggctgct ctgatgccgc cgtgttccgg
ctgtcagcgc aggggcgccc ggttcttttt 3540gtcaagaccg acctgtccgg
tgccctgaat gaactgcaag acgaggcagc gcggctatcg 3600tggctggcca
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga
3660agggactggc tgctattggg cgaagtgccg gggcaggatc tcctgtcatc
tcaccttgct 3720cctgccgaga aagtatccat catggctgat gcaatgcggc
ggctgcatac gcttgatccg 3780gctacctgcc cattcgacca ccaagcgaaa
catcgcatcg agcgagcacg tactcggatg 3840gaagccggtc ttgtcgatca
ggatgatctg gacgaagagc atcaggggct cgcgccagcc 3900gaactgttcg
ccaggctcaa ggcgagcatg cccgacggcg aggatctcgt cgtgacccat
3960ggcgatgcct gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg
attcatcgac 4020tgtggccggc tgggtgtggc ggaccgctat caggacatag
cgttggctac ccgtgatatt 4080gctgaagagc ttggcggcga atgggctgac
cgcttcctcg tgctttacgg tatcgccgct 4140cccgattcgc agcgcatcgc
cttctatcgc cttcttgacg agttcttctg aattattaac 4200gcttacaatt
tcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc
4260atcaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat
ttttctaaat 4320acattcaaat atgtatccgc tcatgagaca ataaccctga
taaatgcttc aataatagca 4380cgtgaggagg gccaccatgg ccaagttgac
cagtgccgtt ccggtgctca ccgcgcgcga 4440cgtcgccgga gcggtcgagt
tctggaccga ccggctcggg ttctcccggg acttcgtgga 4500ggacgacttc
gccggtgtgg tccgggacga cgtgaccctg ttcatcagcg cggtccagga
4560ccaggtggtg ccggacaaca ccctggcctg ggtgtgggtg cgcggcctgg
acgagctgta 4620cgccgagtgg tcggaggtcg tgtccacgaa cttccgggac
gcctccgggc cggccatgac 4680cgagatcggc gagcagccgt gggggcggga
gttcgccctg cgcgacccgg ccggcaactg 4740cgtgcacttc gtggccgagg
agcaggactg acacgtgcta aaacttcatt tttaatttaa 4800aaggatctag
gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt
4860ttcgttccac tgagcgtcag accccgtaga aaagatcaaa ggatcttctt
gagatccttt 4920ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca
ccgctaccag cggtggtttg 4980tttgccggat caagagctac caactctttt
tccgaaggta actggcttca gcagagcgca 5040gataccaaat actgttcttc
tagtgtagcc gtagttaggc caccacttca agaactctgt 5100agcaccgcct
acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga
5160taagtcgtgt cttaccgggt tggactcaag acgatagtta ccggataagg
cgcagcggtc 5220gggctgaacg gggggttcgt gcacacagcc cagcttggag
cgaacgacct acaccgaact 5280gagataccta cagcgtgagc tatgagaaag
cgccacgctt cccgaaggga gaaaggcgga 5340caggtatccg gtaagcggca
gggtcggaac aggagagcgc acgagggagc ttccaggggg 5400aaacgcctgg
tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt
5460tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac gccagcaacg
cggccttttt 5520acggttcctg gccttttgct ggccttttgc tcacatgttc
tttcctgcgt tatcccctga 5580ttctgtggat aaccgtatta ccgcctttga
gtgagctgat accgctcgcc gcagccgaac 5640gaccgagcgc agcgagtcag
tgagcgagga agcggaag 5678
133 3651 DNA Artificial sequence pLS9-111 plasmid
133
tagaaaaact catcgagcat caaatgaaac tgcaatttat tcatatcagg
attatcaata 60ccatattttt gaaaaagccg tttctgtaat gaaggagaaa actcaccgag
gcagttccat 120aggatggcaa gatcctggta tcggtctgcg attccgactc
gtccaacatc aatacaacct 180attaatttcc cctcgtcaaa aataaggtta
tcaagtgaga aatcaccatg agtgacgact 240gaatccggtg agaatggcaa
aagtttatgc atttctttcc agacttgttc aacaggccag 300ccattacgct
cgtcatcaaa atcactcgca tcaaccaaac cgttattcat tcgtgattgc
360gcctgagcga ggcgaaatac gcgatcgctg ttaaaaggac aattacaaac
aggaatcgag 420tgcaaccggc gcaggaacac tgccagcgca tcaacaatat
tttcacctga atcaggatat 480tcttctaata cctggaacgc tgtttttccg
gggatcgcag tggtgagtaa ccatgcatca 540tcaggagtac ggataaaatg
cttgatggtc ggaagtggca taaattccgt cagccagttt 600agtctgacca
tctcatctgt aacatcattg gcaacgctac ctttgccatg tttcagaaac
660aactctggcg catcgggctt cccatacaag cgatagattg tcgcacctga
ttgcccgaca 720ttatcgcgag cccatttata cccatataaa tcagcatcca
tgttggaatt taatcgcggc 780ctcgacgttt cccgttgaat atggctcata
ttcttccttt ttcaatatta ttgaagcatt 840tatcagggtt attgtctcat
gagcggatac atatttgaat gtatttagaa aaataaacaa 900ataggggtca
gtgttacaac caattaacca attctgaaca ttatcgcgag cccatttata
960cctgaatatg gctcataaca ccccttgttt gcctggcggc agtagcgcgg
tggtcccacc 1020tgaccccatg ccgaactcag aagtgaaacg ccgtagcgcc
gatggtagtg tggggactcc 1080ccatgcgaga gtagggaact gccaggcatc
aaataaaacg aaaggctcag tcgaaagact 1140gggcctttcg cccgggctaa
ttagggggtg tcgcccttac gtacgtactc gattgacgcc 1200taggagatct
ttacatcccc catttaataa gcaatcctgt ataggttaat ccgccgccaa
1260acccgaaaag caaaacgatt tgatcttttt tcagcttccc ggctttcact
gcgagatcga 1320gcgccaaaac aattgaaacc gaagacgtgt ttccgtagtg
ctcaacgcta gtgagcgttt 1380tttcaatcgg gaacggtgtt ttttcacaaa
ttgactcgat catgcgcaag ttggcgctgt 1440gaggaacaaa ccaatcgaga
tcatcggagc tgagtcctgc tttatgtaaa agccgttcaa 1500attcgccagg
gacggttctt gcggcccatt tatatacctc gcgtccgttt tgcaccattt
1560ttccggaacc gacaagctgc acaccgttta tttcatttcg cagtccggca
cgatacaaaa 1620tatcgccgcc gttcccgctt gttccttgta cagacgcaag
aaatcccggc gtctcttcat 1680ctcgttctac taacagcgca cccgcggcat
cgccgaacag tacgcatgtc gttcgatccg 1740tataatcggt tacctttgat
aacgtctctc cggcgatgac gagaattttt tgatgaaggc 1800cagatgtgat
caatccattt gccaaatgga ggccgtatgt cagcccggcg catgtcgcat
1860taatatccag cgcgccggtg ctttcccagc cgaaatattc ctgtacgcgg
catgccgtac 1920tcggaaaggc gtaatcggat gtggttgtgg caacgaggat
catatcgaca tcatcaagcg 1980ttcctttata acggctcttg agattcttca
ccgcttctat gcataaatca gaggtaaatt 2040gatgttcatc cgcaatccgg
cgttctctca ttcctgtgcg ctgaacgatc cattcatcag 2100aggtatcaac
gatcttttct aaatctgcat tggttaaacg tctgctcggc gcataggtgc
2160cgatagctgt aatttttgct tttgacatat gtcagcgaaa
gggcgacaca aaatttattc 2220taaatgcata ataaatactg ataacatctt
atagtttgta ttatattttg tattatcgtt 2280gacatgtata attttgatat
caaaaactga ttttcccttt attattttcg agatttattt 2340tcttaattct
ctttaacaaa ctagaaatat tgtatataca aaaaatcata aataatagat
2400gaatagttta attataggtg ttcatcaatc gaaaaagcaa cgtatcttat
ttaaagtgcg 2460ttgctttttt ctcatttata aggttaaata attctcatat
atcaagcaaa gtgacaggcg 2520cccttaaata ttctgacaaa tgctctttcc
ctaaactccc cccataaaaa aacccgccga 2580agcgggtttt tacgttattt
gcggattaac gattactcgt tatcagaacc gcccaggggg 2640cccgagctta
agactggccg tcgttttaca acacagaaag agtttgtaga aacgcaaaaa
2700ggccatccgt caggggcctt ctgcttagtt tgatgcctgg cagttcccta
ctctcgcctt 2760ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg
gctgcggcga gcggtatcag 2820ctcactcaaa ggcggtaata cggttatcca
cagaatcagg ggataacgca ggaaagaaca 2880tgtgagcaaa aggccagcaa
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 2940tccataggct
ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc
3000gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc
ctcgtgcgct 3060ctcctgttcc gaccctgccg cttaccggat acctgtccgc
ctttctccct tcgggaagcg 3120tggcgctttc tcatagctca cgctgtaggt
atctcagttc ggtgtaggtc gttcgctcca 3180agctgggctg tgtgcacgaa
ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3240atcgtcttga
gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta
3300acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag
tggtgggcta 3360actacggcta cactagaaga acagtatttg gtatctgcgc
tctgctgaag ccagttacct 3420tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac caccgctggt agcggtggtt 3480tttttgtttg caagcagcag
attacgcgca gaaaaaaagg atctcaagaa gatcctttga 3540tcttttctac
ggggtctgac gctcagtgga acgacgcgcg cgtaactcac gttaagggat
3600tttggtcatg agcttgcgcc gtcccgtcaa gtcagcgtaa tgctctgctt t
3651
SEQ ID NO: 134; Length: 7310; Type: DNA; Organism: Artificial sequence; Other information: pLS9-114 plasmid
Sequence: 134
ttaataagat
gatcttcttg agatcgtttt ggtctgcgcg taatctcttg ctctgaaaac 60gaaaaaaccg
ccttgcaggg cggtttttcg aaggttctct gagctaccaa ctctttgaac
120cgaggtaact ggcttggagg agcgcagtca ccaaaacttg tcctttcagt
ttagccttaa 180ccggcgcatg acttcaagac taactcctct aaatcaatta
ccagtggctg ctgccagtgg 240tgcttttgca tgtctttccg ggttggactc
aagacgatag ttaccggata aggcgcagcg 300gtcggactga acggggggtt
cgtgcataca gtccagcttg gagcgaactg cctacccgga 360actgagtgtc
aggcgtggaa tgagacaaac gcggccataa cagcggaatg acaccggtaa
420accgaaaggc aggaacagga gagcgcacga gggagccgcc agggggaaac
gcctggtatc 480tttatagtcc tgtcgggttt cgccaccact gatttgagcg
tcagatttcg tgatgcttgt 540caggggggcg gagcctatgg aaaaacggct
ttgccgcggc cctctcactt ccctgttaag 600tatcttcctg gcatcttcca
ggaaatctcc gccccgttcg taagccattt ccgctcgccg 660cagtcgaacg
accgagcgta gcgagtcagt gagcgaggaa gcggaatata tcctgtatca
720catattctgc tgacgcaccg gtgcagcctt ttttctcctg ccacatgaag
cacttcactg 780acaccctcat cagtgccaac atagtaagcc agtatacact
ccgctagcgc tgaggtcccg 840cagccgaacg accgagcgca gcggcgagag
tagggaactg ccaggcatcc tgggcggttc 900tgataacgag taatcgttaa
tccgcaaata acgtaaaaac ccgcttcggc gggttttttt 960atggggggag
tttagggaaa gagcatttgt cagaatattt aagggcgcct gtcactttgc
1020ttgatatatg agaattattt aaccttataa atgagaaaaa agcaacgcac
tttaaataag 1080atacgttgct ttttcgattg atgaacacct ataattaaac
tattcatcta ttatttatga 1140ttttttgtat atacaatatt tctagtttgt
taaagagaat taagaaaata aatctcgaaa 1200ataataaagg gaaaatcagt
ttttgatatc aaaattatac atgtcaacga taatacaaaa 1260tataatacaa
actataagat gttatcagta tttattatgc atttagaata ccttttgtgt
1320cgcccttggg gcatatgaaa gctggcattc ttggtgttgg acgttacatt
cctgagaagg 1380ttttaacaaa tcatgatctt gaaaaaatgg ttgaaacttc
tgacgagtgg attcgtacaa 1440gaacaggaat agaagaaaga agaatcgcag
cagatgatgt gttttcatca cacatggctg 1500ttgcagcagc gaaaaatgcg
ctggaacaag ctgaagtggc tgctgaggat ctggatatga 1560tcttggttgc
aactgttaca cctgatcagt cattccctac ggtgtcttgt atgattcaag
1620aacaactcgg cgcgaagaaa gcgtgtgcta tggatatcag cgcggcttgt
gcgggcttca 1680tgtacggggt tgtaaccggt aaacaattta ttgaatccgg
aacctacaag catgttctag 1740ttgttggtgt agagaagctc tcaagcatta
ccgactggga agaccgcaat acagccgttc 1800tgtttggaga cggagcaggc
gctgcggtag tcgggccagt cagtgatgac agaggaatcc 1860tttcatttga
actaggagcc gacggcacag gcggtcagca cttgtatctg aatgaaaaac
1920gacatacaat catgaatgga cgagaagttt tcaaatttgc agtccgccaa
atgggagaat 1980catgcgtaaa tgtcattgaa aaagccggac tttcaaaaga
ggatgtggac tttttgattc 2040cgcatcaggc gaacatccgt atcatggaag
ctgctcgcga gcgtttagag cttcctgtcg 2100aaaagatgtc taaaactgtt
cataaatatg gaaatacttc tgccgcatcc attccgatct 2160ctcttgtaga
agaattggaa gccggtaaaa tcaaagacgg cgatgtggtc gttatggtag
2220ggttcggcgg aggactaaca tggggcgcca ttgcaatccg ctggggccga
taaaaaaaag 2280gtgaggtgca cacaagatga ctaaaaaacg tgtagttgtt
acaggtcttg gagcattatc 2340tccacttggc aacgacgtcg atacaagttg
gaataacgca atcaacggtg tgtccggaat 2400cggtccgatc actcgtgttg
acgctgaaga atatccggca aaagttgccg ctgaattaaa 2460agattttaat
gttgaagatt atatggataa aaaagaagcc agaaaaatgg accgctttac
2520acaatatgcg gttgtggctg cgaaaatggc ggttgaagat gctgatctta
acattaccga 2580tgagatcgcg ccgagagtcg gtgtttgggt aggctccggt
atcggaggac ttgaaacact 2640agagtctcaa tttgaaatct tcttaacaaa
aggcccaaga cgggtaagcc cgtttttcgt 2700gccaatgatg attcctgaca
tggcgacagg ccagatttct attgcattag gagcaaaagg 2760ggtgaactct
tgtacggtta cagcatgtgc tacaggaacg aactccatcg gtgacgcgtt
2820taaggttatt cagcgcggtg atgcagacgt gatggtcaca ggcggaacag
aagcgctgct 2880gacaagaatg tcattcgccg gctttagtgc caacaaagcg
ctgtctacta atccagatcc 2940gaaaacagcg agccgcccgt ttgataaaaa
ccgtgatggc tttgtcatgg gggaaggtgc 3000agggattatc gttcttgaag
aacttgagca tgccctggcc cgcggcgcta aaatttacgg 3060agaaattgtc
ggctacggct caaccggaga cgcttatcat atcacagcgc cggcccaaga
3120cggtgaaggc ggagcgagag cgatgcaaga agccattaaa gatgcaggca
ttgcacctga 3180agaaattgat tacatcaatg ctcacgggac aagcacgtat
tacaacgaca aatacgaaac 3240aatggcgatt aagaccgttt ttggcgagca
tgcgcataaa cttgcggtaa gctctacaaa 3300atcgatgaca ggccacctct
taggagcagc cggcggtatt gaagccattt tctctatcct 3360ggccattaaa
gaaggcgtga ttccgccgac aatcaatatt caaacacctg acgaagaatg
3420tgatttggat tatgtgcctg atgaagcccg cagacaggaa cttaattatg
ttctcagcaa 3480ctcattagga ttcggcggac acaacgcaac attaatcttt
aaaaaatatc aatcataagt 3540tttttctcga aaatttcatc gtagtttctc
tagtttttta aaaacgaatc cactataata 3600cttgagggga ggtgaattgc
tatggcagac acattagagc gtgtaacgaa aatcatcgta 3660gatcgccttg
gcgttgatga agcagacgtc aaacttgaag catctttcaa ggaagactta
3720ggtgctgatt ccctagatgt agttgagctt gttatggaac ttgaagacga
gtttgatatg 3780gagatttctg acgaagatgc tgaaaagatt gcaacagtcg
gcgacgctgt gaactacata 3840caaaaccagc aataattaat taacctagga
aaaaagggcg acacccctca attagcccgg 3900gcgaaaggcc cagtctttcg
actgagcctt tcgttttatt tgatgcctgg cagttcccta 3960ctctcgcatg
gggagtcccc acactaccat cggcgctacg gcgtttcact tctgagttcg
4020gcatggggtc aggtgggacc accgcgctac tgccgccagg caaacaaggg
gtgttatgag 4080ccatattcag gtataaatgg gctcgcgata atgttcagaa
ttggttaatt ggttgtaaca 4140ctgaccccta tttgtttatt tttctaaata
cattcaaata tgtatccgct catgagacaa 4200taaccctgat aaatgcttca
ataatattga aaaaggaaga atatgagtat tcaacatttc 4260cgtgtcgccc
ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa
4320acgctggtga aagtaaaaga tgctgaagat cagttgggtg cacgagtggg
ttacatcgaa 4380ctggatctca acagcggtaa gatccttgag agttttcgcc
ccgaagaacg ttttccaatg 4440atgagcactt ttaaagttct gctatgtggc
gcggtattat cccgtattga cgccgggcaa 4500gagcaactcg gtcgccgcat
acactattct cagaatgact tggttgagta ctcaccagtc 4560acagaaaagc
atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc
4620atgagtgata acactgcggc caacttactt ctgacaacga tcggaggacc
gaaggagcta 4680accgcttttt tgcacaacat gggggatcat gtaactcgcc
ttgatcgttg ggaaccggag 4740ctgaatgaag ccataccaaa cgacgagcgt
gacaccacga tgcctgtagc gatggcaaca 4800acgttgcgca aactattaac
tggcgaacta cttactctag cttcccggca acaattaata 4860gactggatgg
aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc
4920tggtttattg ctgataaatc cggagccggt gagcgtggtt ctcgcggtat
catcgcagcg 4980ctggggccag atggtaagcc ctcccgtatc gtagttatct
acacgacggg gagtcaggca 5040actatggatg aacgaaatag acagatcgct
gagataggtg cctcactgat taagcattgg 5100taaaggagga aaaaaaaatg
agccatattc aacgggaaac gtcgaggccg cgattaaatt 5160ccaacatgga
tgctgattta tatgggtata aatgggctcg cgataatgtc gggcaatcag
5220gtgcgacaat ctatcgcttg tatgggaagc ccgatgcgcc agagttgttt
ctgaaacatg 5280gcaaaggtag cgttgccaat gatgttacag atgagatggt
cagactaaac tggctgacgg 5340aatttatgcc acttccgacc atcaagcatt
ttatccgtac tcctgatgat gcatggttac 5400tcaccactgc gatccccgga
aaaacagcgt tccaggtatt agaagaatat cctgattcag 5460gtgaaaatat
tgttgatgcg ctggcagtgt tcctgcgccg gttgcactcg attcctgttt
5520gtaattgtcc ttttaacagc gatcgcgtat ttcgcctcgc tcaggcgcaa
tcacgaatga 5580ataacggttt ggttgatgcg agtgattttg atgacgagcg
taatggctgg cctgttgaac 5640aagtctggaa agaaatgcat aaacttttgc
cattctcacc ggattcagtc gtcactcatg 5700gtgatttctc acttgataac
cttatttttg acgaggggaa attaataggt tgtattgatg 5760ttggacgagt
cggaatcgca gaccgatacc aggatcttgc catcctatgg aactgcctcg
5820gtgagttttc tccttcatta cagaaacggc tttttcaaaa atatggtatt
gataatcctg 5880atatgaataa attgcagttt catttgatgc tcgatgagtt
tttctaaagg aggaaaaaaa 5940aatggagaaa aaaatcactg gatataccac
cgttgatata tcccaatggc atcgtaaaga 6000acattttgag gcatttcagt
cagttgctca atgtacctat aaccagaccg ttcagctgga 6060tattacggcc
tttttaaaga ccgtaaagaa aaataagcac aagttttatc cggcctttat
6120tcacattctt gcccgcctga tgaatgctca tccggagttc cgtatggcaa
tgaaagacgg 6180tgagctggtg atatgggata gtgttcaccc ttgttacacc
gttttccatg agcaaactga 6240aacgttttca tcgctctgga gtgaatacca
cgacgatttc cggcagtttc tacacatata 6300ttcgcaagat gtggcgtgtt
acggtgaaaa cctggcctat ttccctaaag ggtttattga 6360gaatatgttt
ttcgtcagcg ccaatccctg ggtgagtttc accagttttg atttaaacgt
6420ggccaatatg gacaacttct tcgcccccgt tttcactatg ggcaaatatt
atacgcaagg 6480cgacaaggtg ctgatgccgc tggcgattca ggttcatcat
gccgtctgtg atggcttcca 6540tgtcggcaga atgcttaatg aattacaaca
gtactgcgat gagtggcagg gcggggcgta 6600aacgccgagg aggaaaaaaa
aatgcgctca cgcaactggt ccagaacctt gaccgaacgc 6660agcggtggta
acggcgcagt ggcggttttc atggcttgtt atgactgttt ttttgtacag
6720tctatgcctc gggcatccaa gcagcaagcg cgttacgccg tgggtcgatg
tttgatgtta 6780tggagcagca acgatgttac gcagcagggc agtcgcccta
aaacaaagtt aggtggctca 6840agtatgggca tcattcgcac atgtaggctc
ggccctgacc aagtcaaatc catgcgggct 6900gctcttgatc ttttcggtcg
tgagttcggt gacgtagcca cctactccca acatcagccg 6960gactccgatt
acctcgggaa cttgctccgt agtaagacat tcatcgcgct tgctgccttc
7020gaccaagaag cggttgttgg cgctctcgcg gcttacgttc tgcccaagtt
tgagcagccg 7080cgtagtgaga tctatatcta tgatctcgca gtctccggcg
agcaccggag gcagggcatt 7140gccaccgcgc tcatcaatct cctcaagcat
gaggccaacg cgcttggtgc ttatgtgatc 7200tacgtgcaag cagattacgg
tgacgatccc gcagtggctc tctatacaaa gttgggcata 7260cgggaagaag
tgatgcactt tgatatcgac ccaagtaccg ccacctaagc
7310
SEQ ID NO: 135; Length: 7804; Type: DNA; Organism: Artificial sequence; Other information: pLS9-115 plasmid
Sequence: 135
ggtggcggta
cttgggtcga tatcaaagtg catcacttct tcccgtatgc ccaactttgt 60atagagagcc
actgcgggat cgtcaccgta atctgcttgc acgtagatca cataagcacc
120aagcgcgttg gcctcatgct tgaggagatt gatgagcgcg gtggcaatgc
cctgcctccg 180gtgctcgccg gagactgcga gatcatagat atagatctca
ctacgcggct gctcaaactt 240gggcagaacg taagccgcga gagcgccaac
aaccgcttct tggtcgaagg cagcaagcgc 300gatgaatgtc ttactacgga
gcaagttccc gaggtaatcg gagtccggct gatgttggga 360gtaggtggct
acgtcaccga actcacgacc gaaaagatca agagcagccc gcatggattt
420gacttggtca gggccgagcc tacatgtgcg aatgatgccc atacttgagc
cacctaactt 480tgttttaggg cgactgccct gctgcgtaac atcgttgctg
ctccataaca tcaaacatcg 540acccacggcg taacgcgctt gctgcttgga
tgcccgaggc atagactgta caaaaaaaca 600gtcataacaa gccatgaaaa
ccgccactgc gccgttacca ccgctgcgtt cggtcaaggt 660tctggaccag
ttgcgtgagc gcattttttt ttcctcctcg gcgtttacgc cccgccctgc
720cactcatcgc agtactgttg taattcatta agcattctgc cgacatggaa
gccatcacag 780acggcatgat gaacctgaat cgccagcggc atcagcacct
tgtcgccttg cgtataatat 840ttgcccatag tgaaaacggg ggcgaagaag
ttgtccatat tggccacgtt taaatcaaaa 900ctggtgaaac tcacccaggg
attggcgctg acgaaaaaca tattctcaat aaacccttta 960gggaaatagg
ccaggttttc accgtaacac gccacatctt gcgaatatat gtgtagaaac
1020tgccggaaat cgtcgtggta ttcactccag agcgatgaaa acgtttcagt
ttgctcatgg 1080aaaacggtgt aacaagggtg aacactatcc catatcacca
gctcaccgtc tttcattgcc 1140atacggaact ccggatgagc attcatcagg
cgggcaagaa tgtgaataaa ggccggataa 1200aacttgtgct tatttttctt
tacggtcttt aaaaaggccg taatatccag ctgaacggtc 1260tggttatagg
tacattgagc aactgactga aatgcctcaa aatgttcttt acgatgccat
1320tgggatatat caacggtggt atatccagtg atttttttct ccattttttt
ttcctccttt 1380agaaaaactc atcgagcatc aaatgaaact gcaatttatt
catatcagga ttatcaatac 1440catatttttg aaaaagccgt ttctgtaatg
aaggagaaaa ctcaccgagg cagttccata 1500ggatggcaag atcctggtat
cggtctgcga ttccgactcg tccaacatca atacaaccta 1560ttaatttccc
ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg
1620aatccggtga gaatggcaaa agtttatgca tttctttcca gacttgttca
acaggccagc 1680cattacgctc gtcatcaaaa tcactcgcat caaccaaacc
gttattcatt cgtgattgcg 1740cctgagcgag gcgaaatacg cgatcgctgt
taaaaggaca attacaaaca ggaatcgagt 1800gcaaccggcg caggaacact
gccagcgcat caacaatatt ttcacctgaa tcaggatatt 1860cttctaatac
ctggaacgct gtttttccgg ggatcgcagt ggtgagtaac catgcatcat
1920caggagtacg gataaaatgc ttgatggtcg gaagtggcat aaattccgtc
agccagttta 1980gtctgaccat ctcatctgta acatcattgg caacgctacc
tttgccatgt ttcagaaaca 2040actctggcgc atcgggcttc ccatacaagc
gatagattgt cgcacctgat tgcccgacat 2100tatcgcgagc ccatttatac
ccatataaat cagcatccat gttggaattt aatcgcggcc 2160tcgacgtttc
ccgttgaata tggctcattt ttttttcctc ctttaccaat gcttaatcag
2220tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct
gactccccgt 2280cgtgtagata actacgatac gggagggctt accatctggc
cccagcgctg cgatgatacc 2340gcgagaacca cgctcaccgg ctccggattt
atcagcaata aaccagccag ccggaagggc 2400cgagcgcaga agtggtcctg
caactttatc cgcctccatc cagtctatta attgttgccg 2460ggaagctaga
gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccatcgctac
2520aggcatcgtg gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg
gttcccaacg 2580atcaaggcga gttacatgat cccccatgtt gtgcaaaaaa
gcggttagct ccttcggtcc 2640tccgatcgtt gtcagaagta agttggccgc
agtgttatca ctcatggtta tggcagcact 2700gcataattct cttactgtca
tgccatccgt aagatgcttt tctgtgactg gtgagtactc 2760aaccaagtca
ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat
2820acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg
gaaaacgttc 2880ttcggggcga aaactctcaa ggatcttacc gctgttgaga
tccagttcga tgtaacccac 2940tcgtgcaccc aactgatctt cagcatcttt
tactttcacc agcgtttctg ggtgagcaaa 3000aacaggaagg caaaatgccg
caaaaaaggg aataagggcg acacggaaat gttgaatact 3060catattcttc
ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg
3120atacatattt gaatgtattt agaaaaataa acaaataggg gtcagtgtta
caaccaatta 3180accaattctg aacattatcg cgagcccatt tatacctgaa
tatggctcat aacacccctt 3240gtttgcctgg cggcagtagc gcggtggtcc
cacctgaccc catgccgaac tcagaagtga 3300aacgccgtag cgccgatggt
agtgtgggga ctccccatgc gagagtaggg aactgccagg 3360catcaaataa
aacgaaaggc tcagtcgaaa gactgggcct ttcgcccggg ctaattgagg
3420ggtgtcgccc ttattcgact ctatagtgaa gttcctattc tctagaaagt
ataggaactt 3480ctgaagtggg gcatatgtct aagatcaagc caagcaaggg
cgctccgtac gcgcgcatcc 3540tgggcgtcgg cggttaccgt ccgacccgtg
tggtgccgaa cgaggtgatc ctggagaaga 3600tcgactcttc cgacgagtgg
attcgctctc gctccggcat cgaaacgcgt cactgggcgg 3660gtccggaaga
aaccgtcgcg gcgatgtctg tggaggcctc cggcaaggca ctggccgacg
3720ccggtatcga cgcctctcgt atcggtgccg tggtagtctc taccgtgtct
cacttcagcc 3780agaccccggc catcgccacc gagatcgccg accgcctggg
cacggacaag gccgcagcct 3840tcgacatctc tgccggctgc gcgggcttcg
gctacggtct gaccctggcc aagggcatgg 3900tcgtcgaagg ttctgcggag
tacgtgctgg tcatcggcgt ggagcgtctg tccgacctga 3960ccgacctgga
ggaccgtgcc acggccttcc tgttcggcga cggcgctggt gcggtcgtgg
4020tcggcccgtc ccaggagccg gcaatcggcc cgacggtctg gggctctgag
ggcgacaagg 4080ccgaaacgat caagcagacc gtttcctggg accgcttccg
tatcggcgat gtctccgaac 4140tgccgctgga ctccgagggc aacgtcaagt
ttcctgcgat cacgcaggag ggccaggcgg 4200tgttccgctg ggccgtgttc
gagatggcga aggtcgcgca gcaggcgctg gacgcggcgg 4260gtatcagccc
ggacgacctg gacgtcttta tcccgcacca ggccaatgtg cgtatcatcg
4320actctatggt gaaaaccctg aagctgccgg agcacgtcac ggtcgcccgt
gacatccgca 4380ccaccggcaa cacctctgcc gcctctattc cgctggcgat
ggagcgtctg ctggcgaccg 4440gcgacgcgcg tagcggcgac accgcgctgg
tcatcggctt cggtgcgggt ctggtctacg 4500ccgcgacggt cgttaccctg
ccgtaaccac tccgtgccgg atcaccccgg tccggaacgg 4560agagcagcac
cgcccgccgc cgacgcggcg ggccgccaca ccctctggac aacaaagaag
4620gagcgccgtc atggccgcca ctcaggaaga gatcgtcgcc ggtctggcgg
agatcgtgaa 4680cgagatcgcc ggcatcccgg tcgaggacgt caagctggac
aagtccttca ccgacgacct 4740ggacgtagac tctctgagca tggtcgaggt
cgtcgtcgcc gccgaagagc gcttcgacgt 4800caagatcccg gacgacgacg
tcaagaacct gaaaacggtc ggcgacgcga cgaagtacat 4860cctggaccac
caggcctgat ccgccgatac tcgggcatga cccgcgtacc gggcagatcc
4920gggcagactg ccccgccgcc cggcggtggc gccgtacgaa tccgtatccc
gttggagaaa 4980gaattcccat gagcagcacc aatcgcaccg tggtcgtcac
cggtatcggc gcaaccaccc 5040cgctgggtgg cgacgcagcc tctacctggg
agggtctggt cgcgggtcgt tccggcgtcc 5100gtccgctgga gcaggagtgg
gctgccgacc aggcggtccg tatcgcagcg ccggcagccg 5160tagacccgtc
cgaggtcatc ccgcgtccgc aggcacgccg tctggaccgc tctgcgcagt
5220tcgcgctgat cgcggcgcag gaggcctgga aggacgccgg ttacgccggc
aaggcgggcg 5280agtctccggc ggaggacggt gcggctcacg tagacccgga
ccgtctgggt gcggtcatcg 5340cctccggcat cggcggcgtg accacgctgc
tggaccagta cgacgtgctg aaggagaagg 5400gcgtccgccg cgtttccccg
cacaccgtcc cgatgctgat gccgaacggt ccgtccgcca 5460acgtcggcct
ggccgtgggt gcccgtgcgg gcgtgcacac cccggtgtct gcctgcgcgt
5520ctggcgccga ggccatcggc tacgccatcg agatgatccg cactggccgt
gcggacgtcg 5580tcgtcgcggg tggcacggag gcggcgatcc acccgctgcc
gattgccgcg ttcggcaaca 5640tgatggcgat gtccaagaac aacgacgacc
cgcagggcgc ctcccgcccg ttcgacacgg 5700cgcgtgacgg cttcgtcctg
ggcgaaggtg ccggcgtcct ggtcctggag tccgccgagc 5760atgcggcagc
gcgcggtgcc cgcgtctacg cggaggcggt cggccagggc atctccgccg
5820acagccacga catcgtgcag ccggagccgg agggccgtgg catctccgca
gcgctgcaaa 5880acctgctgga cggcaacgac ctggacccgg ccgagatcgt
gcacgtcaac gcgcacgcca 5940cctctacccc ggcaggtgac atcgccgagc
tgaaggcgct gcgcaaggtc ctgggcgacg 6000acgtagacca catggccgtc
agcggcacca agtctatgac cggtcacctg ctgggtggcg 6060ctggcggcgt
ggagtccgtg gcgaccgtgc tggcgctgta ccaccgtgtg gctccgccga
6120ccatcaacgt cgagaacctg gacccggagg ccgaggccaa cgcggacatc
gtccgcggtg 6180aggcccgcaa
gctgccggtg gagggccgta tcgccgcgct gaacgactct ttcggcttcg
6240gcggtcacaa cgtggtgctg gcgttccgtt ctgtctgatt aattaaccta
ggaaaatgaa 6300gtgaagttcc tatactttct agagaatagg aacttctata
gtgagtcgaa taagggcgac 6360acaaaattta ttctaaatgc ataataaata
ctgataacat cttatagttt gtattatatt 6420ttgtattatc gttgacatgt
ataattttga tatcaaaaac tgattttccc tttattattt 6480tcgagattta
ttttcttaat tctctttaac aaactagaaa tattgtatat acaaaaaatc
6540ataaataata gatgaatagt ttaattatag gtgttcatca atcgaaaaag
caacgtatct 6600tatttaaagt gcgttgcttt tttctcattt ataaggttaa
ataattctca tatatcaagc 6660aaagtgacag gcgcccttaa atattctgac
aaatgctctt tccctaaact ccccccataa 6720aaaaacccgc cgaagcgggt
ttttacgtta tttgcggatt aacgattact cgttatcaga 6780accgcccagg
gggcccgagc ttaagactgg ccgtcgtttt acaacacaga aagagtttgt
6840agaaacgcaa aaaggccatc cgtcaggggc cttctgctta gtttgatgcc
tggcagttcc 6900ctactctcgc cttccgcttc ctcgctcact gactcgctgc
gctcggtcgt tcggctgcgg 6960cgagcggtat cagctcactc aaaggcggta
atacggttat ccacagaatc aggggataac 7020gcaggaaaga acatgtgagc
aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg 7080ttgctggcgt
ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca
7140agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc
ccctggaagc 7200tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg
gatacctgtc cgcctttctc 7260ccttcgggaa gcgtggcgct ttctcatagc
tcacgctgta ggtatctcag ttcggtgtag 7320gtcgttcgct ccaagctggg
ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc 7380ttatccggta
actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca
7440gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac
agagttcttg 7500aagtggtggg ctaactacgg ctacactaga agaacagtat
ttggtatctg cgctctgctg 7560aagccagtta ccttcggaaa aagagttggt
agctcttgat ccggcaaaca aaccaccgct 7620ggtagcggtg gtttttttgt
ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa 7680gaagatcctt
tgatcttttc tacggggtct gacgctcagt ggaacgacgc gcgcgtaact
7740cacgttaagg gattttggtc atgagcttgc gccgtcccgt caagtcagcg
taatgctctg 7800ctta 780413610460DNAArtificial sequencepKZ4 plasmid
136tcgcgacgcg aggctggatg gccttcccca ttatgattct tctcgcttcc
ggcggcatcg 60ggatgcccgc gttgcaggcc atgctgtcca ggcaggtaga tgacgaccat
cagggacagc 120ttcaaggatc gctcgcggct cttaccagcc taacttcgat
cactggaccg ctgatcgtca 180cggcgattta tgccgcctcg gcgagcacat
ggaacgggtt ggcatggatt gtaggcgccg 240ccctatacct tgtctgcctc
cccgcgttgc gtcgcggtgc atggagccgg gccacctcga 300cctgaatgga
agccggcggc acctcgctaa cggattcacc actccaagaa ttggagccaa
360tcaattcttg cggagaactg tgaatgcgca aaccaaccct tggcagaaca
tatccatcgc 420gtccgccatc tccagcagcc gcacgcggcg catctcgggc
agcgttgggt cctggccacg 480ggtgcgcatg atcgtgctcc tgtcgttgag
gacccggcta ggctggcggg gttgccttac 540tggttagcag aatgaatcac
cgatacgcga gcgaacgtga agcgactgct gctgcaaaac 600gtctgcgacc
tgagcaacaa catgaatggt cttcggtttc cgtgtttcgt aaagtctgga
660aacgcggaag tcagcgccct gcaccattat gttccggatc tgcatcgcag
gatgctgctg 720gctaccctgt ggaacaccta catctgtatt aacgaagcgc
tggcattgac cctgagtgat 780ttttctctgg tcccgccgca tccataccgc
cagttgttta ccctcacaac gttccagtaa 840ccgggcatgt tcatcatcag
taacccgtat cgtgagcatc ctctctcgtt tcatcggtat 900cattaccccc
atgaacagaa atccccctta cacggaggca tcagtgacca aacaggaaaa
960aaccgccctt aacatggccc gctttatcag aagccagaca ttaacgcttc
tggagaaact 1020caacgagctg gacgcggatg aacaggcaga catctgtgaa
tcgcttcacg accacgctga 1080tgagctttac cgcagctgcc tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat 1140gcagctcccg gagacggtca
cagcttgtct gtaagcggat gccgggagca gacaagcccg 1200tcagggcgcg
tcagcgggtg ttggcgggtg tcggggcgca gccatgaccc agtcacgtag
1260cgatagcgga gtgtatactg gcttaactat gcggcatcag agcagattgt
actgagagtg 1320caccatatgc ggtgtgaaat accgcacaga tgcgtaagga
gaaaataccg catcaggcgc 1380tcttccgctt cctcgctcac tgactcgctg
cgctcggtcg ttcggctgcg gcgagcggta 1440tcagctcact caaaggcggt
aatacggtta tccacagaat caggggataa cgcaggaaag 1500aacatgtgag
caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg
1560tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc
aagtcagagg 1620tggcgaaacc cgacaggact ataaagatac caggcgtttc
cccctggaag ctccctcgtg 1680cgctctcctg ttccgaccct gccgcttacc
ggatacctgt ccgcctttct cccttcggga 1740agcgtggcgc tttctcatag
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 1800tccaagctgg
gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt
1860aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc
agcagccact 1920ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
cagagttctt gaagtggtgg 1980cctaactacg gctacactag aaggacagta
tttggtatct gcgctctgct gaagccagtt 2040accttcggaa aaagagttgg
tagctcttga tccggcaaac aaaccaccgc tggtagcggt 2100ggtttttttg
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct
2160ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta
agggattttg 2220gtcatgagat tatcaaaaag gatcttcacc tagatccttt
taaattaaaa atgaagtttt 2280aaatcaatct aaagtatata tgagtaaact
tggtctgaca gttaccaatg cttaatcagt 2340gaggcaccta tctcagcgat
ctgtctattt cgttcatcca tagttgcctg actccccgtc 2400gtgtagataa
ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg
2460cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc
cggaagggcc 2520gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
agtctattaa ttgttgccgg 2580gaagctagag taagtagttc gccagttaat
agtttgcgca acgttgttgc cattgctgca 2640ggcatcgtgg tgtcacgctc
gtcgtttggt atggcttcat tcagctccgg ttcccaacga 2700tcaaggcgag
ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct
2760ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat
ggcagcactg 2820cataattctc ttactgtcat gccatccgta agatgctttt
ctgtgactgg tgagtactca 2880accaagtcat tctgagaata gtgtatgcgg
cgaccgagtt gctcttgccc ggcgtcaaca 2940cgggataata ccgcgccaca
tagcagaact ttaaaagtgc tcatcattgg aaaacgttct 3000tcggggcgaa
aactctcaag gatcttaccg ctgttgagat ccagttcgat gtaacccact
3060cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg
gtgagcaaaa 3120acaggaaggc aaaatgccgc aaaaaaggga ataagggcga
cacggaaatg ttgaatactc 3180atactcttcc tttttcaata ttattgaagc
atttatcagg gttattgtct catgagcgga 3240tacatatttg aatgtattta
gaaaaataaa caaatagggg ttccgcgcac atttccccga 3300aaagtgccac
ctgacgtctt aattaatcag gagagcgttc accgacaaac aacagataaa
3360acgaaaggcc cagtctttcg actgagcctt tcgttttatt tgatgcctgg
cagttcccta 3420ctctcgcatg gggagacccc acactaccat cggcgctacg
gcgtttcact tctgagttcg 3480gcatggggtc aggtgggacc accgcgctac
tgccgccagg caaattctgt tttatcagac 3540cgcttctgcg ttctgattta
atctgtatca ggctgaaaat cttctctcat ccgccaaaac 3600agccaagctg
gagaccgttt aaactcaatg atgatgatga tgatggtcga cggcgctatt
3660cagatcctct tctgagatga gtttttgttc gggcccaagc ttcgaattct
cagatatgca 3720aggcgtggcc caacgcgcgt agggcggctt cttgcaccgc
ttcacccaac gtggggtggg 3780catgaatggt gccggccaca tcttccaggc
acgcgcccat ctccagcgat tgggcaaacg 3840cggtggacag ctcggagacc
gccacgccaa ccgcctgcca acccacgatc aggtggttgt 3900cacggcgcgc
caccacccgc acgaaaccgc ttttcgactc caggctcatg gcccggccat
3960tggcggcaaa cgggaactgc gcgacgatgc agtccagggc ctgctggctg
gcttgttccg 4020gggtcttgcc gaccaccacc acttccgggt cggtaaagca
cacggcggca atcgctgccg 4080gctcgaagcg tcgggccttg ccggcgatga
tttcggcgac catctcgcct tgggccatgg 4140cccggtgcgc cagcatcggt
tcgccagcga cgtcgccaat ggcccagacg ttgtgcatgc 4200tggtatgaca
gcgctcgtcg atggcaatgg cggtgccgtt catcttcagg tccaggcatt
4260ccaggttgaa gcccttggtg cgtggccggc ggcccacggc caccagtacc
tgatcggctt 4320caagacgcag ttgcccaccc ttgccgtcgc tggccagcag
gcagccattt tcgtagccct 4380cgacgctgtg gcccaggtgc aacgcgatgc
ccagtttctt cagcgactcg gccaccgggg 4440cggtcaattc gctgtcgtag
gtcggcagga tgcgttcgcg cgcttccacc acactcacct 4500gtgcacccag
cttgcgatag gcaatgccca gctccaggcc gatatagcca ccgccgacca
4560ccaccaggtg ttgcggcagg gctttcggcg ccagggcttc ggtcgaggaa
atcaccgggc 4620cacccagcgg cagcatcggc agttcgacac tggtggaacc
ggtcgccagc aacagatgct 4680cgcactggat acgctggcca tcgacctcga
cctgcttgcc gtccagtacc ttggcccagc 4740catgcaccac tttcaccccg
tgctttttca gcaaggcggc aacaccggtg gtcagacggt 4800cgacaatgcc
gtccttccag gtgacgctct ggccgatgtc caggcgcggc gaagccacgc
4860tgatgcccag cggcgagggt tcggtaaagc gcgaggcttg gtgaaactgc
tcggccacgt 4920ggatcagcgc cttggacggg atgcagccga tgttcaggca
ggtgccgccc agtgcctggc 4980cttccaccag tacggtagga atgcccagtt
gcccggcgcg gatggctgct acatagccgc 5040cagggccgcc gccgatgatc
aacagggtag tctggataat ctgttgcatg ctcactccac 5100gaacaggcag
gcgggttgtt cgagcaggcc acgcacggcc tggatgaaca gggcggcgtc
5160catgccatcg accacgcggt ggtcgaacga gctggacagg ttcatcatct
tgcgcacgac 5220gatctggcca tcaatcacca ccggtcgttc gaccatgcgg
ttgaccccga cgattgccac 5280ttccggggtg ttgaccaccg gcgtgctgac
aatgccaccc aaggcgccga ggctggtcag 5340ggtgatggtc gagccggaca
gctcctcgcg gctggccttg ttgttacgtg cagcgttggc 5400caggcgcgaa
atctcgccgg cattggccca caggctgccc gcttcggcgt ggcgcagcac
5460gggtaccatc aggccgttgt caccctgggt ggcaatgccc acatgcaccg
cgccatggcg 5520ggtgatgatc tgcgcttcgt cgtcgtaggt cgcgttgatc
tgcgggaagt cacgcagcgc 5580cacgacgagg gcgcgcacca ggaatggcag
caaggtcagt ttgccgcggc tgtcgccgtg 5640cttgctgttg agttgctggc
gcagggcttc cagggcggtg acgtcgattt cctcgacata 5700actgaagtgc
gcgacccggc gtttggcgtc ctgcatgcgc tgggcgatct tgcggcgcag
5760gccgatcacc ggcacctgct cgctgtcggt gcgcttggca taaccatcag
gtgcttgccc 5820ggcattgctt tgcggcttgc tcatgaaggc gtcgaggtct
tcgtgcagaa tgcgcccggc 5880cgggccgcta ccatgcacat aacgcagttc
gataccggcg tccagggcgc gtttgcgcac 5940ggccggcgag gccagcggct
tgtcgcccgg ctggcgcggc acgatgggcg cagcttcgtg 6000gttggcgggc
gcctggtaca cggcgggttt tacgtctttc tgcggttccg gcttggctgc
6060aatcggggcg gccggggcct ctaccggttt tggctgaggc acgtccacat
ggttgccgct 6120gccttccact tcgatgcgga tcagttcgct accgaccgcc
atcacttccc cgggctggcc 6180acccagggcc aacaccttgc cgctgaccgg
cgaggggatt tccacggtgg ccttgtcggt 6240catgacgtcg gccaccacct
ggtcctcggc gatgatgtcg ccgaccttga cgaaccattc 6300caccaactcg
acctgcgcga tgccttcgcc aatgtccggc atcttgatga cgtgcgtgcc
6360cattcagacc tccatgacct ttttcagtgc cgcacctacc cgcgaagggc
cggggaagta 6420agcccattcc tgtgcgtgag ggtagggggt gtcccagccg
gtgacgcgct cgatcggcgc 6480ctccaggtgg tggaagcagt gctcctgcac
cagcgacacc agctcggcac cgaagccgca 6540ggtgcgggtg gcctcgtgca
ccaccacgca acggccagtc tttttcaccg actcgacgat 6600agtgtccagg
tccagcggcc acaggctgcg caggtcgatc acttcggcat cgacgccgct
6660ttcttcggcg gccacctggg ccacgtacac cgtggtgccg taagtcagta
cggtcacgtc 6720attgccaggg cgggtaatgg cggccttgtc cagcggtacg
gtgtaatagc cgtcgggcac 6780ggcgctgtgc gggtgcttcg accatggggt
tacagggcgg tcgtggtggc catcgaacgg 6840gccgttgtac agacgtttgg
gctccaggaa gattaccggg tcgtcgcatt cgatcgaggc 6900aatcagcagg
cctttggcgt cataagggtt ggacggcatc acggtgcgca ggccgcagac
6960ctgggtgaac atcgcttccg ggctctggct gtgagtctgg ccgccataga
tgccgccgcc 7020gcaaggcatg cgcagggtca gcggggcaat gaactcgccg
gccgaccggt aacgcaggcg 7080ggccagctcg gagacgatct ggtcggaggc
cgggtagaag tagtcggcga actggatctc 7140caccaccggg cgcaggccat
aggcgcccat gcctacggcg gtaccgacga tgccgctctc 7200ggagatgggc
gcgtcgaaca cgcgcgattt gccgtacttg ttctgcaggc cttcggtgca
7260gcggaacacg ccgccgaagt aaccgacgtc ctggccgtac accaccacat
tgtcgtcgcg 7320ctcaagcatg acatccatgg ccgagcgcag ggcctggatc
atggtcatgg tagtggtggc 7380catggcggtt tccgggttga tgctgttgtt
gtggtcgttc atctcaaacc cccagttcct 7440ggcgttgacg gcgcaggtgt
tcgggcatct ccttgtacac atcctcgaac atcgaggcgg 7500cgctcgggat
gtgcccgtta gccagggtgc cgtactgctc ggcttctttc tgtgcggcaa
7560tcaccgcagc ttcgagctcg gccgtgacgg cttggtgttc ttcttcggac
cagtggccga 7620tcttgatcag gtgctgcttc aggcgggcga tcgggtcacc
cagcgggaag tggctccagt 7680catcggcagg gcggtacttg gaggggtcgt
ccgacgtcga gtgcgggccg gcacggtagg 7740tgacccactc gatcaggctt
gggcccaggc cgcggcgggc gcgctcggca gcccagcgcg 7800aggcggcgta
cacggcgacg aagtcgttgc cgtcaacccg cagcgaggca atgccgcagc
7860ccacgccacg gccggcgaag gtggtcgact cgccaccggc gatggcctgg
aaggtagaaa 7920tcgcccactg gttgttgacc acattgagga tcaccggggc
gcggtaaacg tgggcaaagg 7980tgagggcggt gtggaagtcc gactcggcgg
tggctccgtc accgatccac gccgaagcaa 8040tcttggtatc gcccttgatc
gccgaggcca tggcccagcc gactgcctgc acgaactggg 8100tcgccaggtt
gccgctgatg gtgaagaagc cggcttcgcg caccgagtac atgatcggca
8160actggcggcc cttgaggggg tcgcgctcgt tggacagcag ttggcagatc
atctcgacca 8220gcgatacgtc gcgggccatc aggatgcttt gctggcggta
ggtcgggaag cacatgtcgg 8280tgcggttcag cgccagcgcc tggccactgc
cgatggcttc ttcgcccagg ctttgcatgt 8340agaaggacat cttcttctgg
cgctgggcaa ccaccatgcg gctgtcgaag atccgcgtct 8400tgagcatggc
gcgcatgcct tgacgaagga tctgtgggtc gatgtcttcg gcccaggggc
8460cttgcgcatc accttgctcg tcgagcacgc ggaccaggct gtaggacagg
tcggcagtgt 8520cggcagcatc gacatcgatc gcgggtttac gggcttgacc
tgcatcgttg aggcgcaggt 8580aggaaaaatc ggtctggcag cctggccggc
cggtgggctc gggcacatgc aaacgcaggg 8640gggcgtactc gttcatggat
ccatggttta ttcctcctta tttaatcgat acattaatat 8700atacctcttt
aatttttaat aataaagtta atcgataatt ccggtcgagt gcccacacag
8760attgtctgat aaattgttaa agagcagtgc cgcttcgctt tttctcagcg
gcgctgtttc 8820ctgtgtgaaa ttgttatccg ctcacaattc cacacattat
acgagccgga tgattaattg 8880tcaacagctc atttcagaat atttgccaga
accgttatga tgtcggcgca aaaaacatta 8940tccagaacgg gagtgcgcct
tgagcgacac gaattatgca gtgatttacg acctgcacag 9000ccataccaca
gcttccgatg gctgcctgac gccagaagca ttggtgcacc gtgcagtcga
9060tgataagctg tcaaaccaga tcaattcgcg ctaactcaca ttaattgcgt
tgcgctcact 9120gcccgctttc cagtcgggaa acctgtcgtg ccagctgcat
taatgaatcg gccaacgcgc 9180ggggagaggc ggtttgcgta ttgggcgcca
gggtggtttt tcttttcacc agtgagacgg 9240gcaacagctg attgcccttc
accgcctggc cctgagagag ttgcagcaag cggtccacgc 9300tggtttgccc
cagcaggcga aaatcctgtt tgatggtggt tgacggcggg atataacatg
9360agctgtcttc ggtatcgtcg tatcccacta ccgagatatc cgcaccaacg
cgcagcccgg 9420actcggtaat ggcgcgcatt gcgcccagcg ccatctgatc
gttggcaacc agcatcgcag 9480tgggaacgat gccctcattc agcatttgca
tggtttgttg aaaaccggac atggcactcc 9540agtcgccttc ccgttccgct
atcggctgaa tttgattgcg agtgagatat ttatgccagc 9600cagccagacg
cagacgcgcc gagacagaac ttaatgggcc cgctaacagc gcgatttgct
9660ggtgacccaa tgcgaccaga tgctccacgc ccagtcgcgt accgtcttca
tgggagaaaa 9720taatactgtt gatgggtgtc tggtcagaga catcaagaaa
taacgccgga acattagtgc 9780aggcagcttc cacagcaatg gcatcctggt
catccagcgg atagttaatg atcagcccac 9840tgacgcgttg cgcgagaaga
ttgtgcaccg ccgctttaca ggcttcgacg ccgcttcgtt 9900ctaccatcga
caccaccacg ctggcaccca gttgatcggc gcgagattta atcgccgcga
9960caatttgcga cggcgcgtgc agggccagac tggaggtggc aacgccaatc
agcaacgact 10020gtttgcccgc cagttgttgt gccacgcggt tgggaatgta
attcagctcc gccatcgccg 10080cttccacttt ttcccgcgtt ttcgcagaaa
cgtggctggc ctggttcacc acgcgggaaa 10140cggtctgata agagacaccg
gcatactctg cgacatcgta taacgttact ggtttcacat 10200tcaccaccct
gaattgactc tcttccgggc gctatcatgc cataccgcga aaggttttgc
10260accattcgat ggtgtcaacg taaatgcatg ccgcttcgcc ttcgcgcgcg
aattgatctg 10320ctgcctcgcg cgtttcggtg atgacggtga aaacctctga
cacatgcagc tcccggagac 10380ggtcacagct tgtctgtaag cggatgccgg
gagcagacaa gcccgtcagg gcgcgtcagc 10440gggtgttggc ggggccggcc
10460
SEQ ID NO 137, length 5544, DNA, Artificial sequence, pGL10.173b vector backbone
137 tcgcgacgcg aggctggatg gccttcccca ttatgattct tctcgcttcc
ggcggcatcg 60ggatgcccgc gttgcaggcc atgctgtcca ggcaggtaga tgacgaccat
cagggacagc 120ttcaaggatc gctcgcggct cttaccagcc taacttcgat
cactggaccg ctgatcgtca 180cggcgattta tgccgcctcg gcgagcacat
ggaacgggtt ggcatggatt gtaggcgccg 240ccctatacct tgtctgcctc
cccgcgttgc gtcgcggtgc atggagccgg gccacctcga 300cctgaatgga
agccggcggc acctcgctaa cggattcacc actccaagaa ttggagccaa
360tcaattcttg cggagaactg tgaatgcgca aaccaaccct tggcagaaca
tatccatcgc 420gtccgccatc tccagcagcc gcacgcggcg catctcgggc
agcgttgggt cctggccacg 480ggtgcgcatg atcgtgctcc tgtcgttgag
gacccggcta ggctggcggg gttgccttac 540tggttagcag aatgaatcac
cgatacgcga gcgaacgtga agcgactgct gctgcaaaac 600gtctgcgacc
tgagcaacaa catgaatggt cttcggtttc cgtgtttcgt aaagtctgga
660aacgcggaag tcagcgccct gcaccattat gttccggatc tgcatcgcag
gatgctgctg 720gctaccctgt ggaacaccta catctgtatt aacgaagcgc
tggcattgac cctgagtgat 780ttttctctgg tcccgccgca tccataccgc
cagttgttta ccctcacaac gttccagtaa 840ccgggcatgt tcatcatcag
taacccgtat cgtgagcatc ctctctcgtt tcatcggtat 900cattaccccc
atgaacagaa atccccctta cacggaggca tcagtgacca aacaggaaaa
960aaccgccctt aacatggccc gctttatcag aagccagaca ttaacgcttc
tggagaaact 1020caacgagctg gacgcggatg aacaggcaga catctgtgaa
tcgcttcacg accacgctga 1080tgagctttac cgcagctgcc tcgcgcgttt
cggtgatgac ggtgaaaacc tctgacacat 1140gcagctcccg gagacggtca
cagcttgtct gtaagcggat gccgggagca gacaagcccg 1200tcagggcgcg
tcagcgggtg ttggcgggtg tcggggcgca gccatgaccc agtcacgtag
1260cgatagcgga gtgtatactg gcttaactat gcggcatcag agcagattgt
actgagagtg 1320caccatatgc ggtgtgaaat accgcacaga tgcgtaagga
gaaaataccg catcaggcgc 1380tcttccgctt cctcgctcac tgactcgctg
cgctcggtcg ttcggctgcg gcgagcggta 1440tcagctcact caaaggcggt
aatacggtta tccacagaat caggggataa cgcaggaaag 1500aacatgtgag
caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg
1560tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc
aagtcagagg 1620tggcgaaacc cgacaggact ataaagatac caggcgtttc
cccctggaag ctccctcgtg 1680cgctctcctg ttccgaccct gccgcttacc
ggatacctgt ccgcctttct cccttcggga 1740agcgtggcgc tttctcatag
ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 1800tccaagctgg
gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt
1860aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc
agcagccact 1920ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
cagagttctt gaagtggtgg 1980cctaactacg gctacactag aaggacagta
tttggtatct gcgctctgct gaagccagtt 2040accttcggaa aaagagttgg
tagctcttga tccggcaaac aaaccaccgc tggtagcggt 2100ggtttttttg
tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct
2160ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta
agggattttg 2220gtcatgagat tatcaaaaag gatcttcacc tagatccttt
taaattaaaa atgaagtttt 2280aaatcaatct aaagtatata tgagtaaact
tggtctgaca gttaccaatg cttaatcagt 2340gaggcaccta tctcagcgat
ctgtctattt cgttcatcca tagttgcctg actccccgtc 2400gtgtagataa
ctacgatacg ggagggctta ccatctggcc ccagtgctgc aatgataccg
2460cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc
cggaagggcc 2520gagcgcagaa gtggtcctgc aactttatcc gcctccatcc
agtctattaa ttgttgccgg 2580gaagctagag taagtagttc gccagttaat
agtttgcgca acgttgttgc cattgctgca 2640ggcatcgtgg tgtcacgctc
gtcgtttggt atggcttcat tcagctccgg ttcccaacga 2700tcaaggcgag
ttacatgatc ccccatgttg tgcaaaaaag cggttagctc cttcggtcct
2760ccgatcgttg tcagaagtaa gttggccgca
gtgttatcac tcatggttat ggcagcactg 2820cataattctc ttactgtcat
gccatccgta agatgctttt ctgtgactgg tgagtactca 2880accaagtcat
tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc ggcgtcaaca
2940cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg
aaaacgttct 3000tcggggcgaa aactctcaag gatcttaccg ctgttgagat
ccagttcgat gtaacccact 3060cgtgcaccca actgatcttc agcatctttt
actttcacca gcgtttctgg gtgagcaaaa 3120acaggaaggc aaaatgccgc
aaaaaaggga ataagggcga cacggaaatg ttgaatactc 3180atactcttcc
tttttcaata ttattgaagc atttatcagg gttattgtct catgagcgga
3240tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac
atttccccga 3300aaagtgccac ctgacgtctt aattaatcag gagagcgttc
accgacaaac aacagataaa 3360acgaaaggcc cagtctttcg actgagcctt
tcgttttatt tgatgcctgg cagttcccta 3420ctctcgcatg gggagacccc
acactaccat cggcgctacg gcgtttcact tctgagttcg 3480gcatggggtc
aggtgggacc accgcgctac tgccgccagg caaattctgt tttatcagac
3540cgcttctgcg ttctgattta atctgtatca ggctgaaaat cttctctcat
ccgccaaaac 3600agccaagctg gagaccgttt aaactcaatg atgatgatga
tgatggtcga cggcgctatt 3660cagatcctct tctgagatga gtttttgttc
gggcccaagc ttcgaattcc catatggtac 3720cagctgcaga tctcgagctc
ggatccatgg tttattcctc cttatttaat cgatacatta 3780atatatacct
ctttaatttt taataataaa gttaatcgat aattccggtc gagtgcccac
3840acagattgtc tgataaattg ttaaagagca gtgccgcttc gctttttctc
agcggcgctg 3900tttcctgtgt gaaattgtta tccgctcaca attccacaca
ttatacgagc cggatgatta 3960attgtcaaca gctcatttca gaatatttgc
cagaaccgtt atgatgtcgg cgcaaaaaac 4020attatccaga acgggagtgc
gccttgagcg acacgaatta tgcagtgatt tacgacctgc 4080acagccatac
cacagcttcc gatggctgcc tgacgccaga agcattggtg caccgtgcag
4140tcgatgataa gctgtcaaac cagatcaatt cgcgctaact cacattaatt
gcgttgcgct 4200cactgcccgc tttccagtcg ggaaacctgt cgtgccagct
gcattaatga atcggccaac 4260gcgcggggag aggcggtttg cgtattgggc
gccagggtgg tttttctttt caccagtgag 4320acgggcaaca gctgattgcc
cttcaccgcc tggccctgag agagttgcag caagcggtcc 4380acgctggttt
gccccagcag gcgaaaatcc tgtttgatgg tggttgacgg cgggatataa
4440catgagctgt cttcggtatc gtcgtatccc actaccgaga tatccgcacc
aacgcgcagc 4500ccggactcgg taatggcgcg cattgcgccc agcgccatct
gatcgttggc aaccagcatc 4560gcagtgggaa cgatgccctc attcagcatt
tgcatggttt gttgaaaacc ggacatggca 4620ctccagtcgc cttcccgttc
cgctatcggc tgaatttgat tgcgagtgag atatttatgc 4680cagccagcca
gacgcagacg cgccgagaca gaacttaatg ggcccgctaa cagcgcgatt
4740tgctggtgac ccaatgcgac cagatgctcc acgcccagtc gcgtaccgtc
ttcatgggag 4800aaaataatac tgttgatggg tgtctggtca gagacatcaa
gaaataacgc cggaacatta 4860gtgcaggcag cttccacagc aatggcatcc
tggtcatcca gcggatagtt aatgatcagc 4920ccactgacgc gttgcgcgag
aagattgtgc accgccgctt tacaggcttc gacgccgctt 4980cgttctacca
tcgacaccac cacgctggca cccagttgat cggcgcgaga tttaatcgcc
5040gcgacaattt gcgacggcgc gtgcagggcc agactggagg tggcaacgcc
aatcagcaac 5100gactgtttgc ccgccagttg ttgtgccacg cggttgggaa
tgtaattcag ctccgccatc 5160gccgcttcca ctttttcccg cgttttcgca
gaaacgtggc tggcctggtt caccacgcgg 5220gaaacggtct gataagagac
accggcatac tctgcgacat cgtataacgt tactggtttc 5280acattcacca
ccctgaattg actctcttcc gggcgctatc atgccatacc gcgaaaggtt
5340ttgcaccatt cgatggtgtc aacgtaaatg catgccgctt cgccttcgcg
cgcgaattga 5400tctgctgcct cgcgcgtttc ggtgatgacg gtgaaaacct
ctgacacatg cagctcccgg 5460agacggtcac agcttgtctg taagcggatg
ccgggagcag acaagcccgt cagggcgcgt 5520cagcgggtgt tggcggggcc ggcc
5544
SEQ ID NO 138, length 1026, DNA, Synechococcus elongatus
138 atgttcggtc ttatcggtca
tctcaccagt ttggagcagg cccgcgacgt ttctcgcagg 60atgggctacg acgaatacgc
cgatcaagga ttggagtttt ggagtagcgc tcctcctcaa 120atcgttgatg
aaatcacagt caccagtgcc acaggcaagg tgattcacgg tcgctacatc
180gaatcgtgtt tcttgccgga aatgctggcg gcgcgccgct tcaaaacagc
cacgcgcaaa 240gttctcaatg ccatgtccca tgcccaaaaa cacggcatcg
acatctcggc cttggggggc 300tttacctcga ttattttcga gaatttcgat
ttggccagtt tgcggcaagt gcgcgacact 360accttggagt ttgaacggtt
caccaccggc aatactcaca cggcctacgt aatctgtaga 420caggtggaag
ccgctgctaa aacgctgggc atcgacatta cccaagcgac agtagcggtt
480gtcggcgcga ctggcgatat cggtagcgct gtctgccgct ggctcgacct
caaactgggt 540gtcggtgatt tgatcctgac ggcgcgcaat caggagcgtt
tggataacct gcaggctgaa 600ctcggccggg gcaagattct gcccttggaa
gccgctctgc cggaagctga ctttatcgtg 660tgggtcgcca gtatgcctca
gggcgtagtg atcgacccag caaccctgaa gcaaccctgc 720gtcctaatcg
acgggggcta ccccaaaaac ttgggcagca aagtccaagg tgagggcatc
780tatgtcctca atggcggggt agttgaacat tgcttcgaca tcgactggca
gatcatgtcc 840gctgcagaga tggcgcggcc cgagcgccag atgtttgcct
gctttgccga ggcgatgctc 900ttggaatttg aaggctggca tactaacttc
tcctggggcc gcaaccaaat cacgatcgag 960aagatggaag cgatcggtga
ggcatcggtg cgccacggct tccaaccctt ggcattggca 1020atttga
1026
SEQ ID NO 139, length 341, PRT, Synechococcus elongatus
139 Met Phe Gly Leu Ile Gly His
Leu Thr Ser Leu Glu Gln Ala Arg Asp 1 5 10 15 Val Ser Arg Arg Met
Gly Tyr Asp Glu Tyr Ala Asp Gln Gly Leu Glu 20 25 30 Phe Trp Ser
Ser Ala Pro Pro Gln Ile Val Asp Glu Ile Thr Val Thr 35 40 45 Ser
Ala Thr Gly Lys Val Ile His Gly Arg Tyr Ile Glu Ser Cys Phe 50 55
60 Leu Pro Glu Met Leu Ala Ala Arg Arg Phe Lys Thr Ala Thr Arg Lys
65 70 75 80 Val Leu Asn Ala Met Ser His Ala Gln Lys His Gly Ile Asp
Ile Ser 85 90 95 Ala Leu Gly Gly Phe Thr Ser Ile Ile Phe Glu Asn
Phe Asp Leu Ala 100 105 110 Ser Leu Arg Gln Val Arg Asp Thr Thr Leu
Glu Phe Glu Arg Phe Thr 115 120 125 Thr Gly Asn Thr His Thr Ala Tyr
Val Ile Cys Arg Gln Val Glu Ala 130 135 140 Ala Ala Lys Thr Leu Gly
Ile Asp Ile Thr Gln Ala Thr Val Ala Val 145 150 155 160 Val Gly Ala
Thr Gly Asp Ile Gly Ser Ala Val Cys Arg Trp Leu Asp 165 170 175 Leu
Lys Leu Gly Val Gly Asp Leu Ile Leu Thr Ala Arg Asn Gln Glu 180 185
190 Arg Leu Asp Asn Leu Gln Ala Glu Leu Gly Arg Gly Lys Ile Leu Pro
195 200 205 Leu Glu Ala Ala Leu Pro Glu Ala Asp Phe Ile Val Trp Val
Ala Ser 210 215 220 Met Pro Gln Gly Val Val Ile Asp Pro Ala Thr Leu
Lys Gln Pro Cys 225 230 235 240 Val Leu Ile Asp Gly Gly Tyr Pro Lys
Asn Leu Gly Ser Lys Val Gln 245 250 255 Gly Glu Gly Ile Tyr Val Leu
Asn Gly Gly Val Val Glu His Cys Phe 260 265 270 Asp Ile Asp Trp Gln
Ile Met Ser Ala Ala Glu Met Ala Arg Pro Glu 275 280 285 Arg Gln Met
Phe Ala Cys Phe Ala Glu Ala Met Leu Leu Glu Phe Glu 290 295 300 Gly
Trp His Thr Asn Phe Ser Trp Gly Arg Asn Gln Ile Thr Ile Glu 305 310
315 320 Lys Met Glu Ala Ile Gly Glu Ala Ser Val Arg His Gly Phe Gln
Pro 325 330 335 Leu Ala Leu Ala Ile 340
SEQ ID NO 140, length 10097, DNA, Artificial sequence, pCL-Ptrc-carB_'tesA plasmid
140 cactatacca attgagatgg
gctagtcaat gataattact agtccttttc ctttgagttg 60tgggtatctg taaattctgc
tagacctttg ctggaaaact tgtaaattct gctagaccct 120ctgtaaattc
cgctagacct ttgtgtgttt tttttgttta tattcaagtg gttataattt
180atagaataaa gaaagaataa aaaaagataa aaagaataga tcccagccct
gtgtataact 240cactacttta gtcagttccg cagtattaca aaaggatgtc
gcaaacgctg tttgctcctc 300tacaaaacag accttaaaac cctaaaggcg
tcggcatccg cttacagaca agctgtgacc 360gtctccggga gctgcatgtg
tcagaggttt tcaccgtcat caccgaaacg cgcgaggcag 420cagatcaatt
cgcgcgcgaa ggcgaagcgg catgcattta cgttgacacc atcgaatggt
480gcaaaacctt tcgcggtatg gcatgatagc gcccggaaga gagtcaattc
agggtggtga 540atgtgaaacc agtaacgtta tacgatgtcg cagagtatgc
cggtgtctct tatcagaccg 600tttcccgcgt ggtgaaccag gccagccacg
tttctgcgaa aacgcgggaa aaagtggaag 660cggcgatggc ggagctgaat
tacattccca accgcgtggc acaacaactg gcgggcaaac 720agtcgttgct
gattggcgtt gccacctcca gtctggccct gcacgcgccg tcgcaaattg
780tcgcggcgat taaatctcgc gccgatcaac tgggtgccag cgtggtggtg
tcgatggtag 840aacgaagcgg cgtcgaagcc tgtaaagcgg cggtgcacaa
tcttctcgcg caacgcgtca 900gtgggctgat cattaactat ccgctggatg
accaggatgc cattgctgtg gaagctgcct 960gcactaatgt tccggcgtta
tttcttgatg tctctgacca gacacccatc aacagtatta 1020ttttctccca
tgaagacggt acgcgactgg gcgtggagca tctggtcgca ttgggtcacc
1080agcaaatcgc gctgttagcg ggcccattaa gttctgtctc ggcgcgtctg
cgtctggctg 1140gctggcataa atatctcact cgcaatcaaa ttcagccgat
agcggaacgg gaaggcgact 1200ggagtgccat gtccggtttt caacaaacca
tgcaaatgct gaatgagggc atcgttccca 1260ctgcgatgct ggttgccaac
gatcagatgg cgctgggcgc aatgcgcgcc attaccgagt 1320ccgggctgcg
cgttggtgcg gatatctcgg tagtgggata cgacgatacc gaagacagct
1380catgttatat cccgccgtta accaccatca aacaggattt tcgcctgctg
gggcaaacca 1440gcgtggaccg cttgctgcaa ctctctcagg gccaggcggt
gaagggcaat cagctgttgc 1500ccgtctcact ggtgaaaaga aaaaccaccc
tggcgcccaa tacgcaaacc gcctctcccc 1560gcgcgttggc cgattcatta
atgcagctgg cacgacaggt ttcccgactg gaaagcgggc 1620agtgagcgca
acgcaattaa tgtaagttag cgcgaattga tctggtttga cagcttatca
1680tcgactgcac ggtgcaccaa tgcttctggc gtcaggcagc catcggaagc
tgtggtatgg 1740ctgtgcaggt cgtaaatcac tgcataattc gtgtcgctca
aggcgcactc ccgttctgga 1800taatgttttt tgcgccgaca tcataacggt
tctggcaaat attctgaaat gagctgttga 1860caattaatca tccggctcgt
ataatgtgtg gaattgtgag cggataacaa tttcacacag 1920gaaacagcgc
cgctgagaaa aagcgaagcg gcactgctct ttaacaattt atcagacaat
1980ctgtgtgggc actcgaccgg aattatcgat taactttatt attaaaaatt
aaagaggtat 2040atattaatgt atcgattaaa taaggaggaa taaaccatga
ccagcgatgt tcacgacgcc 2100acagacggcg tcaccgaaac cgcactcgac
gacgagcagt cgacccgccg catcgccgag 2160ctgtacgcca ccgatcccga
gttcgccgcc gccgcaccgt tgcccgccgt ggtcgacgcg 2220gcgcacaaac
ccgggctgcg gctggcagag atcctgcaga ccctgttcac cggctacggt
2280gaccgcccgg cgctgggata ccgcgcccgt gaactggcca ccgacgaggg
cgggcgcacc 2340gtgacgcgtc tgctgccgcg gttcgacacc ctcacctacg
cccaggtgtg gtcgcgcgtg 2400caagcggtcg ccgcggccct gcgccacaac
ttcgcgcagc cgatctaccc cggcgacgcc 2460gtcgcgacga tcggtttcgc
gagtcccgat tacctgacgc tggatctcgt atgcgcctac 2520ctgggcctcg
tgagtgttcc gctgcagcac aacgcaccgg tcagccggct cgccccgatc
2580ctggccgagg tcgaaccgcg gatcctcacc gtgagcgccg aatacctcga
cctcgcagtc 2640gaatccgtgc gggacgtcaa ctcggtgtcg cagctcgtgg
tgttcgacca tcaccccgag 2700gtcgacgacc accgcgacgc actggcccgc
gcgcgtgaac aactcgccgg caagggcatc 2760gccgtcacca ccctggacgc
gatcgccgac gagggcgccg ggctgccggc cgaaccgatc 2820tacaccgccg
accatgatca gcgcctcgcg atgatcctgt acacctcggg ttccaccggc
2880gcacccaagg gtgcgatgta caccgaggcg atggtggcgc ggctgtggac
catgtcgttc 2940atcacgggtg accccacgcc ggtcatcaac gtcaacttca
tgccgctcaa ccacctgggc 3000gggcgcatcc ccatttccac cgccgtgcag
aacggtggaa ccagttactt cgtaccggaa 3060tccgacatgt ccacgctgtt
cgaggatctc gcgctggtgc gcccgaccga actcggcctg 3120gttccgcgcg
tcgccgacat gctctaccag caccacctcg ccaccgtcga ccgcctggtc
3180acgcagggcg ccgacgaact gaccgccgag aagcaggccg gtgccgaact
gcgtgagcag 3240gtgctcggcg gacgcgtgat caccggattc gtcagcaccg
caccgctggc cgcggagatg 3300agggcgttcc tcgacatcac cctgggcgca
cacatcgtcg acggctacgg gctcaccgag 3360accggcgccg tgacacgcga
cggtgtgatc gtgcggccac cggtgatcga ctacaagctg 3420atcgacgttc
ccgaactcgg ctacttcagc accgacaagc cctacccgcg tggcgaactg
3480ctggtcaggt cgcaaacgct gactcccggg tactacaagc gccccgaggt
caccgcgagc 3540gtcttcgacc gggacggcta ctaccacacc ggcgacgtca
tggccgagac cgcacccgac 3600cacctggtgt acgtggaccg tcgcaacaac
gtcctcaaac tcgcgcaggg cgagttcgtg 3660gcggtcgcca acctggaggc
ggtgttctcc ggcgcggcgc tggtgcgcca gatcttcgtg 3720tacggcaaca
gcgagcgcag tttccttctg gccgtggtgg tcccgacgcc ggaggcgctc
3780gagcagtacg atccggccgc gctcaaggcc gcgctggccg actcgctgca
gcgcaccgca 3840cgcgacgccg aactgcaatc ctacgaggtg ccggccgatt
tcatcgtcga gaccgagccg 3900ttcagcgccg ccaacgggct gctgtcgggt
gtcggaaaac tgctgcggcc caacctcaaa 3960gaccgctacg ggcagcgcct
ggagcagatg tacgccgata tcgcggccac gcaggccaac 4020cagttgcgcg
aactgcggcg cgcggccgcc acacaaccgg tgatcgacac cctcacccag
4080gccgctgcca cgatcctcgg caccgggagc gaggtggcat ccgacgccca
cttcaccgac 4140ctgggcgggg attccctgtc ggcgctgaca ctttcgaacc
tgctgagcga tttcttcggt 4200ttcgaagttc ccgtcggcac catcgtgaac
ccggccacca acctcgccca actcgcccag 4260cacatcgagg cgcagcgcac
cgcgggtgac cgcaggccga gtttcaccac cgtgcacggc 4320gcggacgcca
ccgagatccg ggcgagtgag ctgaccctgg acaagttcat cgacgccgaa
4380acgctccggg ccgcaccggg tctgcccaag gtcaccaccg agccacggac
ggtgttgctc 4440tcgggcgcca acggctggct gggccggttc ctcacgttgc
agtggctgga acgcctggca 4500cctgtcggcg gcaccctcat cacgatcgtg
cggggccgcg acgacgccgc ggcccgcgca 4560cggctgaccc aggcctacga
caccgatccc gagttgtccc gccgcttcgc cgagctggcc 4620gaccgccacc
tgcgggtggt cgccggtgac atcggcgacc cgaatctggg cctcacaccc
4680gagatctggc accggctcgc cgccgaggtc gacctggtgg tgcatccggc
agcgctggtc 4740aaccacgtgc tcccctaccg gcagctgttc ggccccaacg
tcgtgggcac ggccgaggtg 4800atcaagctgg ccctcaccga acggatcaag
cccgtcacgt acctgtccac cgtgtcggtg 4860gccatgggga tccccgactt
cgaggaggac ggcgacatcc ggaccgtgag cccggtgcgc 4920ccgctcgacg
gcggatacgc caacggctac ggcaacagca agtgggccgg cgaggtgctg
4980ctgcgggagg cccacgatct gtgcgggctg cccgtggcga cgttccgctc
ggacatgatc 5040ctggcgcatc cgcgctaccg cggtcaggtc aacgtgccag
acatgttcac gcgactcctg 5100ttgagcctct tgatcaccgg cgtcgcgccg
cggtcgttct acatcggaga cggtgagcgc 5160ccgcgggcgc actaccccgg
cctgacggtc gatttcgtgg ccgaggcggt cacgacgctc 5220ggcgcgcagc
agcgcgaggg atacgtgtcc tacgacgtga tgaacccgca cgacgacggg
5280atctccctgg atgtgttcgt ggactggctg atccgggcgg gccatccgat
cgaccgggtc 5340gacgactacg acgactgggt gcgtcggttc gagaccgcgt
tgaccgcgct tcccgagaag 5400cgccgcgcac agaccgtact gccgctgctg
cacgcgttcc gcgctccgca ggcaccgttg 5460cgcggcgcac ccgaacccac
ggaggtgttc cacgccgcgg tgcgcaccgc gaaggtgggc 5520ccgggagaca
tcccgcacct cgacgaggcg ctgatcgaca agtacatacg cgatctgcgt
5580gagttcggtc tgatctgaga attctagatc tgatcgttgc gggcggggcg
agagtctcgc 5640cccgcccgcg accgcggtga aaatacgaga atattatttg
tattgatctc ctaggcgggg 5700taccgtattt tggatgataa cgaggcgcaa
aaaatggcgg acacgttatt gattctgggt 5760gatagcctga gcgccgggta
tcgaatgtct gccagcgcgg cctggcctgc cttgttgaat 5820gataagtggc
agagtaaaac gtcggtagtt aatgccagca tcagcggcga cacctcgcaa
5880caaggactgg cgcgccttcc ggctctgctg aaacagcatc agccgcgttg
ggtgctggtt 5940gaactgggcg gcaatgacgg tttgcgtggt tttcagccac
agcaaaccga gcaaacgctg 6000cgccagattt tgcaggatgt caaagccgcc
aacgctgaac cattgttaat gcaaatacgt 6060ctgcctgcaa actatggtcg
ccgttataat gaagccttta gcgccattta ccccaaactc 6120gccaaagagt
ttgatgttcc gctgctgccc ttttttatgg aagaggtcta cctcaagcca
6180caatggatgc aggatgacgg tattcatccc aaccgcgacg cccagccgtt
tattgccgac 6240tggatggcga agcagttgca gcctttagta aatcatgact
cataacctag gggtaccgct 6300agcgagctct ctagagaagc ttgggcccga
acaaaaactc atctcagaag aggatctgaa 6360tagcgccgtc gaccatcatc
atcatcatca ttgagtttaa acggtctcca gcttggctgt 6420tttggcggat
gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt
6480ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga
ccccatgccg 6540aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg
ggtctcccca tgcgagagta 6600gggaactgcc aggcatcaaa taaaacgaaa
ggctcagtcg aaagactggg cctttcgttt 6660tatctgttgt ttgtcggtga
acgctctcct gacgcctgat gcggtatttt ctccttacgc 6720atctgtgcgg
tatttcacac cgcatatggt gcactctcag tacaatctgc tctgatgccg
6780catagttaag ccagccccga cacccgccaa cacccgctga cgagcttagt
aaagccctcg 6840ctagatttta atgcggatgt tgcgattact tcgccaacta
ttgcgataac aagaaaaagc 6900cagcctttca tgatatatct cccaatttgt
gtagggctta ttatgcacgc ttaaaaataa 6960taaaagcaga cttgacctga
tagtttggct gtgagcaatt atgtgcttag tgcatctaac 7020gcttgagtta
agccgcgccg cgaagcggcg tcggcttgaa cgaattgtta gacattattt
7080gccgactacc ttggtgatct cgcctttcac gtagtggaca aattcttcca
actgatctgc 7140gcgcgaggcc aagcgatctt cttcttgtcc aagataagcc
tgtctagctt caagtatgac 7200gggctgatac tgggccggca ggcgctccat
tgcccagtcg gcagcgacat ccttcggcgc 7260gattttgccg gttactgcgc
tgtaccaaat gcgggacaac gtaagcacta catttcgctc 7320atcgccagcc
cagtcgggcg gcgagttcca tagcgttaag gtttcattta gcgcctcaaa
7380tagatcctgt tcaggaaccg gatcaaagag ttcctccgcc gctggaccta
ccaaggcaac 7440gctatgttct cttgcttttg tcagcaagat agccagatca
atgtcgatcg tggctggctc 7500gaagatacct gcaagaatgt cattgcgctg
ccattctcca aattgcagtt cgcgcttagc 7560tggataacgc cacggaatga
tgtcgtcgtg cacaacaatg gtgacttcta cagcgcggag 7620aatctcgctc
tctccagggg aagccgaagt ttccaaaagg tcgttgatca aagctcgccg
7680cgttgtttca tcaagcctta cggtcaccgt aaccagcaaa tcaatatcac
tgtgtggctt 7740caggccgcca tccactgcgg agccgtacaa atgtacggcc
agcaacgtcg gttcgagatg 7800gcgctcgatg acgccaacta cctctgatag
ttgagtcgat acttcggcga tcaccgcttc 7860cctcatgatg tttaactttg
ttttagggcg actgccctgc tgcgtaacat cgttgctgct 7920ccataacatc
aaacatcgac ccacggcgta acgcgcttgc tgcttggatg cccgaggcat
7980agactgtacc ccaaaaaaac agtcataaca agccatgaaa accgccactg
cgccgttacc 8040accgctgcgt tcggtcaagg ttctggacca gttgcgtgag
cgcatacgct acttgcatta 8100cagcttacga accgaacagg cttatgtcca
ctgggttcgt gccttcatcc gtttccacgg 8160tgtgcgtcac ccggcaacct
tgggcagcag cgaagtcgag gcatttctgt cctggctggc 8220gaacgagcgc
aaggtttcgg tctccacgca tcgtcaggca ttggcggcct tgctgttctt
8280ctacggcaag gtgctgtgca cggatctgcc ctggcttcag gagatcggaa
gacctcggcc 8340gtcgcggcgc ttgccggtgg tgctgacccc ggatgaagtg
gttcgcatcc tcggttttct 8400ggaaggcgag catcgtttgt tcgcccagct
tctgtatgga acgggcatgc ggatcagtga 8460gggtttgcaa ctgcgggtca
aggatctgga tttcgatcac ggcacgatca tcgtgcggga 8520gggcaagggc
tccaaggatc gggccttgat gttacccgag agcttggcac ccagcctgcg
8580cgagcagggg aattaattcc cacgggtttt gctgcccgca aacgggctgt
tctggtgttg 8640ctagtttgtt atcagaatcg cagatccggc ttcagccggt
ttgccggctg aaagcgctat 8700ttcttccaga attgccatga ttttttcccc
acgggaggcg tcactggctc ccgtgttgtc 8760ggcagctttg attcgataag
cagcatcgcc tgtttcaggc tgtctatgtg tgactgttga 8820gctgtaacaa
gttgtctcag gtgttcaatt
tcatgttcta gttgctttgt tttactggtt 8880tcacctgttc tattaggtgt
tacatgctgt tcatctgtta cattgtcgat ctgttcatgg 8940tgaacagctt
tgaatgcacc aaaaactcgt aaaagctctg atgtatctat cttttttaca
9000ccgttttcat ctgtgcatat ggacagtttt ccctttgata tgtaacggtg
aacagttgtt 9060ctacttttgt ttgttagtct tgatgcttca ctgatagata
caagagccat aagaacctca 9120gatccttccg tatttagcca gtatgttctc
tagtgtggtt cgttgttttt gcgtgagcca 9180tgagaacgaa ccattgagat
catacttact ttgcatgtca ctcaaaaatt ttgcctcaaa 9240actggtgagc
tgaatttttg cagttaaagc atcgtgtagt gtttttctta gtccgttatg
9300taggtaggaa tctgatgtaa tggttgttgg tattttgtca ccattcattt
ttatctggtt 9360gttctcaagt tcggttacga gatccatttg tctatctagt
tcaacttgga aaatcaacgt 9420atcagtcggg cggcctcgct tatcaaccac
caatttcata ttgctgtaag tgtttaaatc 9480tttacttatt ggtttcaaaa
cccattggtt aagcctttta aactcatggt agttattttc 9540aagcattaac
atgaacttaa attcatcaag gctaatctct atatttgcct tgtgagtttt
9600cttttgtgtt agttctttta ataaccactc ataaatcctc atagagtatt
tgttttcaaa 9660agacttaaca tgttccagat tatattttat gaattttttt
aactggaaaa gataaggcaa 9720tatctcttca ctaaaaacta attctaattt
ttcgcttgag aacttggcat agtttgtcca 9780ctggaaaatc tcaaagcctt
taaccaaagg attcctgatt tccacagttc tcgtcatcag 9840ctctctggtt
gctttagcta atacaccata agcattttcc ctactgatgt tcatcatctg
9900agcgtattgg ttataagtga acgataccgt ccgttctttc cttgtagggt
tttcaatcgt 9960ggggttgagt agtgccacac agcataaaat tagcttggtt
tcatgctccg ttaagtcata 10020gcgactaatc gctagttcat ttgctttgaa
aacaactaat tcagacatac atctcaattg 10080gtctaggtga ttttaat
10097
SEQ ID NO 141, length 471, PRT, Euglena gracilis
141 Val Pro Gln Met Ala Glu Gly Phe
Ser Gly Glu Ala Thr Ser Ala Trp 1 5 10 15 Ala Ala Ala Gly Pro Gln
Trp Ala Ala Pro Leu Val Ala Ala Ala Ser 20 25 30 Ser Ala Leu Ala
Leu Trp Trp Trp Ala Ala Arg Arg Ser Val Arg Arg 35 40 45 Pro Leu
Ala Ala Leu Ala Glu Leu Pro Thr Ala Val Thr His Leu Ala 50 55 60
Pro Pro Met Ala Met Phe Thr Thr Thr Ala Lys Val Ile Gln Pro Lys 65
70 75 80 Ile Arg Gly Phe Ile Cys Thr Thr Thr His Pro Ile Gly Cys
Glu Lys 85 90 95 Arg Val Gln Glu Glu Ile Ala Tyr Ala Arg Ala His
Pro Pro Thr Ser 100 105 110 Pro Gly Pro Lys Arg Val Leu Val Ile Gly
Cys Ser Thr Gly Tyr Gly 115 120 125 Leu Ser Thr Arg Ile Thr Ala Ala
Phe Gly Tyr Gln Ala Ala Thr Leu 130 135 140 Gly Val Phe Leu Ala Gly
Pro Pro Thr Lys Gly Arg Pro Ala Ala Ala 145 150 155 160 Gly Trp Tyr
Asn Thr Val Ala Phe Glu Lys Ala Ala Leu Glu Ala Gly 165 170 175 Leu
Tyr Ala Arg Ser Leu Asn Gly Asp Ala Phe Asp Ser Thr Thr Lys 180 185
190 Ala Arg Thr Val Glu Ala Ile Lys Arg Asp Leu Gly Thr Val Asp Leu
195 200 205 Val Val Tyr Ser Ile Ala Ala Pro Lys Arg Thr Asp Pro Ala
Thr Gly 210 215 220 Val Leu His Lys Ala Cys Leu Lys Pro Ile Gly Ala
Thr Tyr Thr Asn 225 230 235 240 Arg Thr Val Asn Thr Asp Lys Ala Glu
Val Thr Asp Val Ser Ile Glu 245 250 255 Pro Ala Ser Pro Glu Glu Ile
Ala Asp Thr Val Lys Val Met Gly Gly 260 265 270 Glu Asp Trp Glu Leu
Trp Ile Gln Ala Leu Ser Glu Ala Gly Val Leu 275 280 285 Ala Glu Gly
Ala Lys Thr Val Ala Tyr Ser Tyr Ile Gly Pro Glu Met 290 295 300 Thr
Trp Pro Val Tyr Trp Ser Gly Thr Ile Gly Glu Ala Lys Lys Asp 305 310
315 320 Val Glu Lys Ala Ala Lys Arg Ile Thr Gln Gln Tyr Gly Cys Pro
Ala 325 330 335 Tyr Pro Val Val Ala Lys Ala Leu Val Thr Gln Ala Ser
Ser Ala Ile 340 345 350 Pro Val Val Pro Leu Tyr Ile Cys Leu Leu Tyr
Arg Val Met Lys Glu 355 360 365 Lys Gly Thr His Glu Gly Cys Ile Glu
Gln Met Val Arg Leu Leu Thr 370 375 380 Thr Lys Leu Tyr Pro Glu Asn
Gly Ala Pro Ile Val Asp Glu Ala Gly 385 390 395 400 Arg Val Arg Val
Asp Asp Trp Glu Met Ala Glu Asp Val Gln Gln Ala 405 410 415 Val Lys
Asp Leu Trp Ser Gln Val Ser Thr Ala Asn Leu Lys Asp Ile 420 425 430
Ser Asp Phe Ala Gly Tyr Gln Thr Glu Phe Leu Arg Leu Phe Gly Phe 435
440 445 Gly Ile Asp Gly Val Asp Tyr Asp Gln Pro Val Asp Val Glu Ala
Asp 450 455 460 Leu Pro Ser Ala Ala Gln Gln 465 470
SEQ ID NO 142, length 1164, DNA, Escherichia coli
142 atggaacagg ttgtcattgt cgatgcaatt
cgcaccccga tgggccgttc gaagggcggt 60gcttttcgta acgtgcgtgc agaagatctc
tccgctcatt taatgcgtag cctgctggcg 120cgtaacccgg cgctggaagc
ggcggccctc gacgatattt actggggttg tgtgcagcag 180acgctggagc
agggttttaa tatcgcccgt aacgcggcgc tgctggcaga agtaccacac
240tctgtcccgg cggttaccgt taatcgcttg tgtggttcat ccatgcaggc
actgcatgac 300gcagcacgaa tgatcatgac tggcgatgcg caggcatgtc
tggttggcgg cgtggagcat 360atgggccatg tgccgatgag tcacggcgtc
gattttcacc ccggcctgag ccgcaatgtc 420gccaaagcgg cgggcatgat
gggcttaacg gcagaaatgc tggcgcgtat gcacggtatc 480agccgtgaaa
tgcaggatgc ctttgccgcg cggtcacacg cccgcgcctg ggccgccacg
540cagtcggccg catttaaaaa tgaaatcatc ccgaccggtg gtcacgatgc
cgacggcgtc 600ctgaagcagt ttaattacga cgaagtgatt cgcccggaaa
ccaccgtgga agccctcgcc 660acgctgcgtc cggcgtttga tccagtaaac
ggtatggtaa cggcgggcac atcttctgca 720ctttccgatg gcgcagctgc
catgctggtg atgagtgaaa gccgcgccca tgaattaggt 780cttaagccgc
gcgctcgtgt gcgttcgatg gcggtcgttg gttgtgaccc atcgattatg
840ggttacggcc cggttccggc ctcgaaactg gcgctgaaaa aagcggggct
ttctgccagc 900gatatcggcg tgtttgaaat gaacgaagcc tttgccgcgc
agatcctgcc atgtattaaa 960gatctgggac taattgagca gattgacgag
aagatcaacc tcaacggtgg cgcgatcgcg 1020ctgggtcatc cgctgggttg
ttccggtgcg cgtatcagca ccacgctgct gaatctgatg 1080gaacgcaaag
acgttcagtt tggtctggcg acgatgtgta tcggtctggg tcagggtatt
1140gcgacggtgt ttgagcgggt ttaa 1164
SEQ ID NO 143, length 387, PRT, Escherichia coli
143 Met
Glu Gln Val Val Ile Val Asp Ala Ile Arg Thr Pro Met Gly Arg 1 5 10
15 Ser Lys Gly Gly Ala Phe Arg Asn Val Arg Ala Glu Asp Leu Ser Ala
20 25 30 His Leu Met Arg Ser Leu Leu Ala Arg Asn Pro Ala Leu Glu
Ala Ala 35 40 45 Ala Leu Asp Asp Ile Tyr Trp Gly Cys Val Gln Gln
Thr Leu Glu Gln 50 55 60 Gly Phe Asn Ile Ala Arg Asn Ala Ala Leu
Leu Ala Glu Val Pro His 65 70 75 80 Ser Val Pro Ala Val Thr Val Asn
Arg Leu Cys Gly Ser Ser Met Gln 85 90 95 Ala Leu His Asp Ala Ala
Arg Met Ile Met Thr Gly Asp Ala Gln Ala 100 105 110 Cys Leu Val Gly
Gly Val Glu His Met Gly His Val Pro Met Ser His 115 120 125 Gly Val
Asp Phe His Pro Gly Leu Ser Arg Asn Val Ala Lys Ala Ala 130 135 140
Gly Met Met Gly Leu Thr Ala Glu Met Leu Ala Arg Met His Gly Ile 145
150 155 160 Ser Arg Glu Met Gln Asp Ala Phe Ala Ala Arg Ser His Ala
Arg Ala 165 170 175 Trp Ala Ala Thr Gln Ser Ala Ala Phe Lys Asn Glu
Ile Ile Pro Thr 180 185 190 Gly Gly His Asp Ala Asp Gly Val Leu Lys
Gln Phe Asn Tyr Asp Glu 195 200 205 Val Ile Arg Pro Glu Thr Thr Val
Glu Ala Leu Ala Thr Leu Arg Pro 210 215 220 Ala Phe Asp Pro Val Asn
Gly Met Val Thr Ala Gly Thr Ser Ser Ala 225 230 235 240 Leu Ser Asp
Gly Ala Ala Ala Met Leu Val Met Ser Glu Ser Arg Ala 245 250 255 His
Glu Leu Gly Leu Lys Pro Arg Ala Arg Val Arg Ser Met Ala Val 260 265
270 Val Gly Cys Asp Pro Ser Ile Met Gly Tyr Gly Pro Val Pro Ala Ser
275 280 285 Lys Leu Ala Leu Lys Lys Ala Gly Leu Ser Ala Ser Asp Ile
Gly Val 290 295 300 Phe Glu Met Asn Glu Ala Phe Ala Ala Gln Ile Leu
Pro Cys Ile Lys 305 310 315 320 Asp Leu Gly Leu Ile Glu Gln Ile Asp
Glu Lys Ile Asn Leu Asn Gly 325 330 335 Gly Ala Ile Ala Leu Gly His
Pro Leu Gly Cys Ser Gly Ala Arg Ile 340 345 350 Ser Thr Thr Leu Leu
Asn Leu Met Glu Arg Lys Asp Val Gln Phe Gly 355 360 365 Leu Ala Thr
Met Cys Ile Gly Leu Gly Gln Gly Ile Ala Thr Val Phe 370 375 380 Glu
Arg Val 385
SEQ ID NO 144, length 2190, DNA, Escherichia coli
144 atgctttaca aaggcgacac
cctgtacctt gactggctgg aagatggcat tgccgaactg 60gtatttgatg ccccaggttc
agttaataaa ctcgacactg cgaccgtcgc cagcctcggc 120gaggccatcg
gcgtgctgga acagcaatca gatctaaaag ggctgctgct gcgttcgaac
180aaagcagcct ttatcgtcgg tgctgatatc accgaatttt tgtccctgtt
cctcgttcct 240gaagaacagt taagtcagtg gctgcacttt gccaatagcg
tgtttaatcg cctggaagat 300ctgccggtgc cgaccattgc tgccgtcaat
ggctatgcgc tgggcggtgg ctgcgaatgc 360gtgctggcga ccgattatcg
tctggcgacg ccggatctgc gcatcggtct gccggaaacc 420aaactgggca
tcatgcctgg ctttggcggt tctgtacgta tgccacgtat gctgggcgct
480gacagtgcgc tggaaatcat tgccgccggt aaagatgtcg gcgcggatca
ggcgctgaaa 540atcggtctgg tggatggcgt agtcaaagca gaaaaactgg
ttgaaggcgc aaaggcggtt 600ttacgccagg ccattaacgg cgacctcgac
tggaaagcaa aacgtcagcc gaagctggaa 660ccactaaaac tgagcaagat
tgaagccacc atgagcttca ccatcgctaa agggatggtc 720gcacaaacag
cggggaaaca ttatccggcc cccatcaccg cagtaaaaac cattgaagct
780gcggcccgtt ttggtcgtga agaagcctta aacctggaaa acaaaagttt
tgtcccgctg 840gcgcatacca acgaagcccg cgcactggtc ggcattttcc
ttaacgatca atatgtaaaa 900ggcaaagcga agaaactcac caaagacgtt
gaaaccccga aacaggccgc ggtgctgggt 960gcaggcatta tgggcggcgg
catcgcttac cagtctgcgt ggaaaggcgt gccggttgtc 1020atgaaagata
tcaacgacaa gtcgttaacc ctcggcatga ccgaagccgc gaaactgctg
1080aacaagcagc ttgagcgcgg caagatcgat ggtctgaaac tggctggcgt
gatctccaca 1140atccacccaa cgctcgacta cgccggattt gaccgcgtgg
atattgtggt agaagcggtt 1200gttgaaaacc cgaaagtgaa aaaagccgta
ctggcagaaa ccgaacaaaa agtacgccag 1260gataccgtgc tggcgtctaa
cacttcaacc attcctatca gcgaactggc caacgcgctg 1320gaacgcccgg
aaaacttctg cgggatgcac ttctttaacc cggtccaccg aatgccgttg
1380gtagaaatta ttcgcggcga gaaaagctcc gacgaaacca tcgcgaaagt
tgtcgcctgg 1440gcgagcaaga tgggcaagac gccgattgtg gttaacgact
gccccggctt ctttgttaac 1500cgcgtgctgt tcccgtattt cgccggtttc
agccagctgc tgcgcgacgg cgcggatttc 1560cgcaagatcg acaaagtgat
ggaaaaacag tttggctggc cgatgggccc ggcatatctg 1620ctggacgttg
tgggcattga taccgcgcat cacgctcagg ctgtcatggc agcaggcttc
1680ccgcagcgga tgcagaaaga ttaccgcgat gccatcgacg cgctgtttga
tgccaaccgc 1740tttggtcaga agaacggcct cggtttctgg cgttataaag
aagacagcaa aggtaagccg 1800aagaaagaag aagacgccgc cgttgaagac
ctgctggcag aagtgagcca gccgaagcgc 1860gatttcagcg aagaagagat
tatcgcccgc atgatgatcc cgatggtcaa cgaagtggtg 1920cgctgtctgg
aggaaggcat tatcgccact ccggcggaag cggatatggc gctggtctac
1980ggcctgggct tccctccgtt ccacggcggc gcgttccgct ggctggacac
cctcggtagc 2040gcaaaatacc tcgatatggc acagcaatat cagcacctcg
gcccgctgta tgaagtgccg 2100gaaggtctgc gtaataaagc gcgtcataac
gaaccgtact atcctccggt tgagccagcc 2160cgtccggttg gcgacctgaa
aacggcttaa 2190
SEQ ID NO 145, length 729, PRT, Escherichia coli
145 Met Leu Tyr Lys Gly Asp
Thr Leu Tyr Leu Asp Trp Leu Glu Asp Gly 1 5 10 15 Ile Ala Glu Leu
Val Phe Asp Ala Pro Gly Ser Val Asn Lys Leu Asp 20 25 30 Thr Ala
Thr Val Ala Ser Leu Gly Glu Ala Ile Gly Val Leu Glu Gln 35 40 45
Gln Ser Asp Leu Lys Gly Leu Leu Leu Arg Ser Asn Lys Ala Ala Phe 50
55 60 Ile Val Gly Ala Asp Ile Thr Glu Phe Leu Ser Leu Phe Leu Val
Pro 65 70 75 80 Glu Glu Gln Leu Ser Gln Trp Leu His Phe Ala Asn Ser
Val Phe Asn 85 90 95 Arg Leu Glu Asp Leu Pro Val Pro Thr Ile Ala
Ala Val Asn Gly Tyr 100 105 110 Ala Leu Gly Gly Gly Cys Glu Cys Val
Leu Ala Thr Asp Tyr Arg Leu 115 120 125 Ala Thr Pro Asp Leu Arg Ile
Gly Leu Pro Glu Thr Lys Leu Gly Ile 130 135 140 Met Pro Gly Phe Gly
Gly Ser Val Arg Met Pro Arg Met Leu Gly Ala 145 150 155 160 Asp Ser
Ala Leu Glu Ile Ile Ala Ala Gly Lys Asp Val Gly Ala Asp 165 170 175
Gln Ala Leu Lys Ile Gly Leu Val Asp Gly Val Val Lys Ala Glu Lys 180
185 190 Leu Val Glu Gly Ala Lys Ala Val Leu Arg Gln Ala Ile Asn Gly
Asp 195 200 205 Leu Asp Trp Lys Ala Lys Arg Gln Pro Lys Leu Glu Pro
Leu Lys Leu 210 215 220 Ser Lys Ile Glu Ala Thr Met Ser Phe Thr Ile
Ala Lys Gly Met Val 225 230 235 240 Ala Gln Thr Ala Gly Lys His Tyr
Pro Ala Pro Ile Thr Ala Val Lys 245 250 255 Thr Ile Glu Ala Ala Ala
Arg Phe Gly Arg Glu Glu Ala Leu Asn Leu 260 265 270 Glu Asn Lys Ser
Phe Val Pro Leu Ala His Thr Asn Glu Ala Arg Ala 275 280 285 Leu Val
Gly Ile Phe Leu Asn Asp Gln Tyr Val Lys Gly Lys Ala Lys 290 295 300
Lys Leu Thr Lys Asp Val Glu Thr Pro Lys Gln Ala Ala Val Leu Gly 305
310 315 320 Ala Gly Ile Met Gly Gly Gly Ile Ala Tyr Gln Ser Ala Trp
Lys Gly 325 330 335 Val Pro Val Val Met Lys Asp Ile Asn Asp Lys Ser
Leu Thr Leu Gly 340 345 350 Met Thr Glu Ala Ala Lys Leu Leu Asn Lys
Gln Leu Glu Arg Gly Lys 355 360 365 Ile Asp Gly Leu Lys Leu Ala Gly
Val Ile Ser Thr Ile His Pro Thr 370 375 380 Leu Asp Tyr Ala Gly Phe
Asp Arg Val Asp Ile Val Val Glu Ala Val 385 390 395 400 Val Glu Asn
Pro Lys Val Lys Lys Ala Val Leu Ala Glu Thr Glu Gln 405 410 415 Lys
Val Arg Gln Asp Thr Val Leu Ala Ser Asn Thr Ser Thr Ile Pro 420 425
430 Ile Ser Glu Leu Ala Asn Ala Leu Glu Arg Pro Glu Asn Phe Cys Gly
435 440 445 Met His Phe Phe Asn Pro Val His Arg Met Pro Leu Val Glu
Ile Ile 450 455 460 Arg Gly Glu Lys Ser Ser Asp Glu Thr Ile Ala Lys
Val Val Ala Trp 465 470 475 480 Ala Ser Lys Met Gly Lys Thr Pro Ile
Val Val Asn Asp Cys Pro Gly 485 490 495 Phe Phe Val Asn Arg Val Leu
Phe Pro Tyr Phe Ala Gly Phe Ser Gln 500 505 510 Leu Leu Arg Asp Gly
Ala Asp Phe Arg Lys Ile Asp Lys Val Met Glu 515 520 525 Lys Gln Phe
Gly Trp Pro Met Gly Pro Ala Tyr Leu Leu Asp Val Val 530 535 540 Gly
Ile Asp Thr Ala His His Ala Gln Ala Val Met Ala Ala Gly Phe 545 550
555 560 Pro Gln Arg Met Gln Lys Asp Tyr Arg Asp Ala Ile Asp Ala Leu
Phe 565 570 575 Asp Ala Asn Arg Phe Gly Gln Lys Asn Gly Leu Gly Phe
Trp Arg Tyr 580 585 590 Lys Glu Asp Ser Lys Gly Lys Pro Lys Lys Glu
Glu Asp Ala Ala Val 595 600 605 Glu Asp Leu Leu Ala Glu Val Ser Gln
Pro Lys Arg Asp Phe Ser Glu 610 615 620 Glu Glu Ile Ile Ala Arg Met
Met Ile Pro Met Val Asn Glu Val Val 625 630 635 640 Arg Cys Leu Glu
Glu Gly Ile Ile Ala Thr Pro Ala Glu Ala Asp Met 645 650 655 Ala Leu
Val Tyr Gly Leu Gly Phe Pro Pro Phe His Gly Gly Ala Phe 660 665 670
Arg Trp Leu Asp Thr Leu Gly Ser Ala Lys Tyr Leu Asp Met Ala Gln 675
680 685 Gln Tyr Gln His Leu Gly Pro Leu Tyr Glu Val Pro Glu Gly Leu
Arg 690 695 700 Asn Lys Ala Arg His Asn Glu Pro Tyr Tyr Pro Pro Val Glu
Pro Ala 705 710 715 720 Arg Pro Val Gly Asp
Leu Lys Thr Ala 725
146 1311 DNA Escherichia coli
146 atgggtcagg
ttttaccgct ggttacccgc cagggcgatc gtatcgccat tgttagcggt 60ttacgtacgc
cttttgcccg tcaggcgacg gcttttcatg gcattcccgc ggttgattta
120gggaagatgg tggtaggcga actgctggca cgcagcgaga tccccgccga
agtgattgaa 180caactggtct ttggtcaggt cgtacaaatg cctgaagccc
ccaacattgc gcgtgaaatt 240gttctcggta cgggaatgaa tgtacatacc
gatgcttaca gcgtcagccg cgcttgcgct 300accagtttcc aggcagttgc
aaacgtcgca gaaagcctga tggcgggaac tattcgagcg 360gggattgccg
gtggggcaga ttcctcttcg gtattgccaa ttggcgtcag taaaaaactg
420gcgcgcgtgc tggttgatgt caacaaagct cgtaccatga gccagcgact
gaaactcttc 480tctcgcctgc gtttgcgcga cttaatgccc gtaccacctg
cggtagcaga atattctacc 540ggcttgcgga tgggcgacac cgcagagcaa
atggcgaaaa cctacggcat cacccgagaa 600cagcaagatg cattagcgca
ccgttcgcat cagcgtgccg ctcaggcatg gtcagacgga 660aaactcaaag
aagaggtgat gactgccttt atccctcctt ataaacaacc gcttgtcgaa
720gacaacaata ttcgcggtaa ttcctcgctt gccgattacg caaagctgcg
cccggcgttt 780gatcgcaaac acggaacggt aacggcggca aacagtacgc
cgctgaccga tggcgcggca 840gcggtgatcc tgatgactga atcccgggcg
aaagaattag ggctggtgcc gctggggtat 900ctgcgcagct acgcatttac
tgcgattgat gtctggcagg acatgttgct cggtccagcc 960tggtcaacac
cgctggcgct ggagcgtgcc ggtttgacga tgagcgatct gacattgatc
1020gatatgcacg aagcctttgc agctcagacg ctggcgaata ttcagttgct
gggtagtgaa 1080cgttttgctc gtgaagcact ggggcgtgca catgccactg
gcgaagtgga cgatagcaaa 1140tttaacgtgc ttggcggttc gattgcttac
gggcatccct tcgcggcgac cggcgcgcgg 1200atgattaccc agacattgca
tgaacttcgc cgtcgcggcg gtggatttgg tttagttacc 1260gcctgtgctg
ccggtgggct tggcgcggca atggttctgg aggcggaata a 1311
147 436 PRT Escherichia coli
147 Met Gly Gln Val Leu Pro Leu Val
Thr Arg Gln Gly Asp Arg Ile Ala 1 5 10 15 Ile Val Ser Gly Leu Arg
Thr Pro Phe Ala Arg Gln Ala Thr Ala Phe 20 25 30 His Gly Ile Pro
Ala Val Asp Leu Gly Lys Met Val Val Gly Glu Leu 35 40 45 Leu Ala
Arg Ser Glu Ile Pro Ala Glu Val Ile Glu Gln Leu Val Phe 50 55 60
Gly Gln Val Val Gln Met Pro Glu Ala Pro Asn Ile Ala Arg Glu Ile 65
70 75 80 Val Leu Gly Thr Gly Met Asn Val His Thr Asp Ala Tyr Ser
Val Ser 85 90 95 Arg Ala Cys Ala Thr Ser Phe Gln Ala Val Ala Asn
Val Ala Glu Ser 100 105 110 Leu Met Ala Gly Thr Ile Arg Ala Gly Ile
Ala Gly Gly Ala Asp Ser 115 120 125 Ser Ser Val Leu Pro Ile Gly Val
Ser Lys Lys Leu Ala Arg Val Leu 130 135 140 Val Asp Val Asn Lys Ala
Arg Thr Met Ser Gln Arg Leu Lys Leu Phe 145 150 155 160 Ser Arg Leu
Arg Leu Arg Asp Leu Met Pro Val Pro Pro Ala Val Ala 165 170 175 Glu
Tyr Ser Thr Gly Leu Arg Met Gly Asp Thr Ala Glu Gln Met Ala 180 185
190 Lys Thr Tyr Gly Ile Thr Arg Glu Gln Gln Asp Ala Leu Ala His Arg
195 200 205 Ser His Gln Arg Ala Ala Gln Ala Trp Ser Asp Gly Lys Leu
Lys Glu 210 215 220 Glu Val Met Thr Ala Phe Ile Pro Pro Tyr Lys Gln
Pro Leu Val Glu 225 230 235 240 Asp Asn Asn Ile Arg Gly Asn Ser Ser
Leu Ala Asp Tyr Ala Lys Leu 245 250 255 Arg Pro Ala Phe Asp Arg Lys
His Gly Thr Val Thr Ala Ala Asn Ser 260 265 270 Thr Pro Leu Thr Asp
Gly Ala Ala Ala Val Ile Leu Met Thr Glu Ser 275 280 285 Arg Ala Lys
Glu Leu Gly Leu Val Pro Leu Gly Tyr Leu Arg Ser Tyr 290 295 300 Ala
Phe Thr Ala Ile Asp Val Trp Gln Asp Met Leu Leu Gly Pro Ala 305 310
315 320 Trp Ser Thr Pro Leu Ala Leu Glu Arg Ala Gly Leu Thr Met Ser
Asp 325 330 335 Leu Thr Leu Ile Asp Met His Glu Ala Phe Ala Ala Gln
Thr Leu Ala 340 345 350 Asn Ile Gln Leu Leu Gly Ser Glu Arg Phe Ala
Arg Glu Ala Leu Gly 355 360 365 Arg Ala His Ala Thr Gly Glu Val Asp
Asp Ser Lys Phe Asn Val Leu 370 375 380 Gly Gly Ser Ile Ala Tyr Gly
His Pro Phe Ala Ala Thr Gly Ala Arg 385 390 395 400 Met Ile Thr Gln
Thr Leu His Glu Leu Arg Arg Arg Gly Gly Gly Phe 405 410 415 Gly Leu
Val Thr Ala Cys Ala Ala Gly Gly Leu Gly Ala Ala Met Val 420 425 430
Leu Glu Ala Glu 435
148 2145 DNA Escherichia coli
148 atggaaatga
catcagcgtt tacccttaat gttcgtctgg acaacattgc cgttatcacc 60atcgacgtac
cgggtgagaa aatgaatacc ctgaaggcgg agtttgcctc gcaggtgcgc
120gccattatta agcaactccg tgaaaacaaa gagttgcgag gcgtggtgtt
tgtctccgct 180aaaccggaca acttcattgc tggcgcagac atcaacatga
tcggcaactg caaaacggcg 240caagaagcgg aagctctggc gcggcagggc
caacagttga tggcggagat tcatgctttg 300cccattcagg ttatcgcggc
tattcatggc gcttgcctgg gtggtgggct ggagttggcg 360ctggcgtgcc
acggtcgcgt ttgtactgac gatcctaaaa cggtgctcgg tttgcctgaa
420gtacaacttg gattgttacc cggttcaggc ggcacccagc gtttaccgcg
tctgataggc 480gtcagcacag cattagagat gatcctcacc ggaaaacaac
ttcgggcgaa acaggcatta 540aagctggggc tggtggatga cgttgttccg
cactccattc tgctggaagc cgctgttgag 600ctggcaaaga aggagcgccc
atcttcccgc cctctacctg tacgcgagcg tattctggcg 660gggccgttag
gtcgtgcgct gctgttcaaa atggtcggca agaaaacaga acacaaaact
720caaggcaatt atccggcgac agaacgcatc ctggaggttg ttgaaacggg
attagcgcag 780ggcaccagca gcggttatga cgccgaagct cgggcgtttg
gcgaactggc gatgacgcca 840caatcgcagg cgctgcgtag tatctttttt
gccagtacgg acgtgaagaa agatcccggc 900agtgatgcgc cgcctgcgcc
attaaacagc gtggggattt taggtggtgg cttgatgggc 960ggcggtattg
cttatgtcac tgcttgtaaa gcggggattc cggtcagaat taaagatatc
1020aacccgcagg gcataaatca tgcgctgaag tacagttggg atcagctgga
gggcaaagtt 1080cgccgtcgtc atctcaaagc cagcgaacgt gacaaacagc
tggcattaat ctccggaacg 1140acggactatc gcggctttgc ccatcgcgat
ctgattattg aagcggtgtt tgaaaatctc 1200gaattgaaac aacagatggt
ggcggaagtt gagcaaaatt gcgccgctca taccatcttt 1260gcttcgaata
cgtcatcttt accgattggt gatatcgccg ctcacgccac gcgacctgag
1320caagttatcg gcctgcattt cttcagtccg gtggaaaaaa tgccgctggt
ggagattatt 1380cctcatgcgg ggacatcggc gcaaaccatc gctaccacag
taaaactggc gaaaaaacag 1440ggtaaaacgc caattgtcgt gcgtgacaaa
gccggttttt acgtcaatcg catcttagcg 1500ccttacatta atgaagctat
ccgcatgttg acccaaggtg aacgggtaga gcacattgat 1560gccgcgctag
tgaaatttgg ttttccggta ggcccaatcc aacttttgga tgaggtagga
1620atcgacaccg ggactaaaat tattcctgta ctggaagccg cttatggaga
acgttttagc 1680gcgcctgcaa atgttgtttc ttcaattttg aacgacgatc
gcaaaggcag aaaaaatggc 1740cggggtttct atctttatgg tcagaaaggg
cgtaaaagca aaaaacaggt cgatcccgcc 1800atttacccgc tgattggcac
acaagggcag gggcgaatct ccgcaccgca ggttgctgaa 1860cggtgtgtga
tgttgatgct gaatgaagca gtacgttgtg ttgatgagca ggttatccgt
1920agcgtgcgtg acggggatat tggcgcggta tttggcattg gttttccgcc
atttctcggt 1980ggaccgttcc gctatatcga ttctctcggc gcgggcgaag
tggttgcaat aatgcaacga 2040cttgccacgc agtatggttc ccgttttacc
ccttgcgagc gtttggtcga gatgggcgcg 2100cgtggggaaa gtttttggaa
aacaactgca actgacctgc aataa 2145
149 714 PRT Escherichia coli
149 Met
Glu Met Thr Ser Ala Phe Thr Leu Asn Val Arg Leu Asp Asn Ile 1 5 10
15 Ala Val Ile Thr Ile Asp Val Pro Gly Glu Lys Met Asn Thr Leu Lys
20 25 30 Ala Glu Phe Ala Ser Gln Val Arg Ala Ile Ile Lys Gln Leu
Arg Glu 35 40 45 Asn Lys Glu Leu Arg Gly Val Val Phe Val Ser Ala
Lys Pro Asp Asn 50 55 60 Phe Ile Ala Gly Ala Asp Ile Asn Met Ile
Gly Asn Cys Lys Thr Ala 65 70 75 80 Gln Glu Ala Glu Ala Leu Ala Arg
Gln Gly Gln Gln Leu Met Ala Glu 85 90 95 Ile His Ala Leu Pro Ile
Gln Val Ile Ala Ala Ile His Gly Ala Cys 100 105 110 Leu Gly Gly Gly
Leu Glu Leu Ala Leu Ala Cys His Gly Arg Val Cys 115 120 125 Thr Asp
Asp Pro Lys Thr Val Leu Gly Leu Pro Glu Val Gln Leu Gly 130 135 140
Leu Leu Pro Gly Ser Gly Gly Thr Gln Arg Leu Pro Arg Leu Ile Gly 145
150 155 160 Val Ser Thr Ala Leu Glu Met Ile Leu Thr Gly Lys Gln Leu
Arg Ala 165 170 175 Lys Gln Ala Leu Lys Leu Gly Leu Val Asp Asp Val
Val Pro His Ser 180 185 190 Ile Leu Leu Glu Ala Ala Val Glu Leu Ala
Lys Lys Glu Arg Pro Ser 195 200 205 Ser Arg Pro Leu Pro Val Arg Glu
Arg Ile Leu Ala Gly Pro Leu Gly 210 215 220 Arg Ala Leu Leu Phe Lys
Met Val Gly Lys Lys Thr Glu His Lys Thr 225 230 235 240 Gln Gly Asn
Tyr Pro Ala Thr Glu Arg Ile Leu Glu Val Val Glu Thr 245 250 255 Gly
Leu Ala Gln Gly Thr Ser Ser Gly Tyr Asp Ala Glu Ala Arg Ala 260 265
270 Phe Gly Glu Leu Ala Met Thr Pro Gln Ser Gln Ala Leu Arg Ser Ile
275 280 285 Phe Phe Ala Ser Thr Asp Val Lys Lys Asp Pro Gly Ser Asp
Ala Pro 290 295 300 Pro Ala Pro Leu Asn Ser Val Gly Ile Leu Gly Gly
Gly Leu Met Gly 305 310 315 320 Gly Gly Ile Ala Tyr Val Thr Ala Cys
Lys Ala Gly Ile Pro Val Arg 325 330 335 Ile Lys Asp Ile Asn Pro Gln
Gly Ile Asn His Ala Leu Lys Tyr Ser 340 345 350 Trp Asp Gln Leu Glu
Gly Lys Val Arg Arg Arg His Leu Lys Ala Ser 355 360 365 Glu Arg Asp
Lys Gln Leu Ala Leu Ile Ser Gly Thr Thr Asp Tyr Arg 370 375 380 Gly
Phe Ala His Arg Asp Leu Ile Ile Glu Ala Val Phe Glu Asn Leu 385 390
395 400 Glu Leu Lys Gln Gln Met Val Ala Glu Val Glu Gln Asn Cys Ala
Ala 405 410 415 His Thr Ile Phe Ala Ser Asn Thr Ser Ser Leu Pro Ile
Gly Asp Ile 420 425 430 Ala Ala His Ala Thr Arg Pro Glu Gln Val Ile
Gly Leu His Phe Phe 435 440 445 Ser Pro Val Glu Lys Met Pro Leu Val
Glu Ile Ile Pro His Ala Gly 450 455 460 Thr Ser Ala Gln Thr Ile Ala
Thr Thr Val Lys Leu Ala Lys Lys Gln 465 470 475 480 Gly Lys Thr Pro
Ile Val Val Arg Asp Lys Ala Gly Phe Tyr Val Asn 485 490 495 Arg Ile
Leu Ala Pro Tyr Ile Asn Glu Ala Ile Arg Met Leu Thr Gln 500 505 510
Gly Glu Arg Val Glu His Ile Asp Ala Ala Leu Val Lys Phe Gly Phe 515
520 525 Pro Val Gly Pro Ile Gln Leu Leu Asp Glu Val Gly Ile Asp Thr
Gly 530 535 540 Thr Lys Ile Ile Pro Val Leu Glu Ala Ala Tyr Gly Glu
Arg Phe Ser 545 550 555 560 Ala Pro Ala Asn Val Val Ser Ser Ile Leu
Asn Asp Asp Arg Lys Gly 565 570 575 Arg Lys Asn Gly Arg Gly Phe Tyr
Leu Tyr Gly Gln Lys Gly Arg Lys 580 585 590 Ser Lys Lys Gln Val Asp
Pro Ala Ile Tyr Pro Leu Ile Gly Thr Gln 595 600 605 Gly Gln Gly Arg
Ile Ser Ala Pro Gln Val Ala Glu Arg Cys Val Met 610 615 620 Leu Met
Leu Asn Glu Ala Val Arg Cys Val Asp Glu Gln Val Ile Arg 625 630 635
640 Ser Val Arg Asp Gly Asp Ile Gly Ala Val Phe Gly Ile Gly Phe Pro
645 650 655 Pro Phe Leu Gly Gly Pro Phe Arg Tyr Ile Asp Ser Leu Gly
Ala Gly 660 665 670 Glu Val Val Ala Ile Met Gln Arg Leu Ala Thr Gln
Tyr Gly Ser Arg 675 680 685 Phe Thr Pro Cys Glu Arg Leu Val Glu Met
Gly Ala Arg Gly Glu Ser 690 695 700 Phe Trp Lys Thr Thr Ala Thr Asp
Leu Gln 705 710
150 789 DNA Escherichia coli
150 atgggttttc tttccggtaa
gcgcattctg gtaaccggtg ttgccagcaa actatccatc 60gcctacggta tcgctcaggc
gatgcaccgc gaaggagctg aactggcatt cacctaccag 120aacgacaaac
tgaaaggccg cgtagaagaa tttgccgctc aattgggttc tgacatcgtt
180ctgcagtgcg atgttgcaga agatgccagc atcgacacca tgttcgctga
actggggaaa 240gtttggccga aatttgacgg tttcgtacac tctattggtt
ttgcacctgg cgatcagctg 300gatggtgact atgttaacgc cgttacccgt
gaaggcttca aaattgccca cgacatcagc 360tcctacagct tcgttgcaat
ggcaaaagct tgccgctcca tgctgaatcc gggttctgcc 420ctgctgaccc
tttcctacct tggcgctgag cgcgctatcc cgaactacaa cgttatgggt
480ctggcaaaag cgtctctgga agcgaacgtg cgctatatgg cgaacgcgat
gggtccggaa 540ggtgtgcgtg ttaacgccat ctctgctggt ccgatccgta
ctctggcggc ctccggtatc 600aaagacttcc gcaaaatgct ggctcattgc
gaagccgtta ccccgattcg ccgtaccgtt 660actattgaag atgtgggtaa
ctctgcggca ttcctgtgct ccgatctctc tgccggtatc 720tccggtgaag
tggtccacgt tgacggcggt ttcagcattg ctgcaatgaa cgaactcgaa 780ctgaaataa
789151262PRTEscherichia coli 151Met Gly Phe Leu Ser Gly Lys Arg Ile
Leu Val Thr Gly Val Ala Ser 1 5 10 15 Lys Leu Ser Ile Ala Tyr Gly
Ile Ala Gln Ala Met His Arg Glu Gly 20 25 30 Ala Glu Leu Ala Phe
Thr Tyr Gln Asn Asp Lys Leu Lys Gly Arg Val 35 40 45 Glu Glu Phe
Ala Ala Gln Leu Gly Ser Asp Ile Val Leu Gln Cys Asp 50 55 60 Val
Ala Glu Asp Ala Ser Ile Asp Thr Met Phe Ala Glu Leu Gly Lys 65 70
75 80 Val Trp Pro Lys Phe Asp Gly Phe Val His Ser Ile Gly Phe Ala
Pro 85 90 95 Gly Asp Gln Leu Asp Gly Asp Tyr Val Asn Ala Val Thr
Arg Glu Gly 100 105 110 Phe Lys Ile Ala His Asp Ile Ser Ser Tyr Ser
Phe Val Ala Met Ala 115 120 125 Lys Ala Cys Arg Ser Met Leu Asn Pro
Gly Ser Ala Leu Leu Thr Leu 130 135 140 Ser Tyr Leu Gly Ala Glu Arg
Ala Ile Pro Asn Tyr Asn Val Met Gly 145 150 155 160 Leu Ala Lys Ala
Ser Leu Glu Ala Asn Val Arg Tyr Met Ala Asn Ala 165 170 175 Met Gly
Pro Glu Gly Val Arg Val Asn Ala Ile Ser Ala Gly Pro Ile 180 185 190
Arg Thr Leu Ala Ala Ser Gly Ile Lys Asp Phe Arg Lys Met Leu Ala 195
200 205 His Cys Glu Ala Val Thr Pro Ile Arg Arg Thr Val Thr Ile Glu
Asp 210 215 220 Val Gly Asn Ser Ala Ala Phe Leu Cys Ser Asp Leu Ser
Ala Gly Ile 225 230 235 240 Ser Gly Glu Val Val His Val Asp Gly Gly
Phe Ser Ile Ala Ala Met 245 250 255 Asn Glu Leu Glu Leu Lys 260
152 861 DNA Escherichia coli
152 atgagtcagg cgctaaaaaa tttactgaca
ttgttaaatc tggaaaaaat tgaggaagga 60ctctttcgcg gccagagtga agatttaggt
ttacgccagg tgtttggcgg ccaggtcgtg 120ggtcaggcct tgtatgctgc
aaaagagacc gtccctgaag agcggctggt acattcgttt 180cacagctact
ttcttcgccc tggcgatagt aagaagccga ttatttatga tgtcgaaacg
240ctgcgtgacg gtaacagctt cagcgcccgc cgggttgctg ctattcaaaa
cggcaaaccg 300attttttata tgactgcctc tttccaggca ccagaagcgg
gtttcgaaca tcaaaaaaca 360atgccgtccg cgccagcgcc tgatggcctc
ccttcggaaa cgcaaatcgc ccaatcgctg 420gcgcacctgc tgccgccagt
gctgaaagat aaattcatct gcgatcgtcc gctggaagtc 480cgtccggtgg
agtttcataa cccactgaaa ggtcacgtcg cagaaccaca tcgtcaggtg
540tggatccgcg caaatggtag cgtgccggat gacctgcgcg ttcatcagta
tctgctcggt 600tacgcttctg atcttaactt cctgccggta gctctacagc
cgcacggcat cggttttctc 660gaaccgggga ttcagattgc caccattgac
cattccatgt ggttccatcg cccgtttaat 720ttgaatgaat ggctgctgta
tagcgtggag agcacctcgg cgtccagcgc acgtggcttt 780gtgcgcggtg
agttttatac ccaagacggc gtactggttg cctcgaccgt tcaggaaggg
gtgatgcgta atcacaatta a 861
153 286 PRT Escherichia coli
153 Met Ser
Gln Ala Leu Lys Asn Leu Leu Thr Leu Leu Asn Leu Glu Lys 1 5 10 15
Ile Glu Glu Gly Leu Phe Arg Gly Gln Ser Glu Asp Leu Gly Leu Arg 20
25 30 Gln Val Phe Gly Gly Gln Val Val Gly Gln Ala Leu Tyr Ala Ala
Lys 35 40 45 Glu Thr Val Pro Glu Glu Arg Leu Val His Ser Phe His Ser Tyr
Phe 50 55 60 Leu Arg Pro Gly Asp Ser Lys Lys Pro Ile Ile Tyr Asp
Val Glu Thr 65 70 75 80 Leu Arg Asp Gly Asn Ser Phe Ser Ala Arg Arg
Val Ala Ala Ile Gln 85 90 95 Asn Gly Lys Pro Ile Phe Tyr Met Thr
Ala Ser Phe Gln Ala Pro Glu 100 105 110 Ala Gly Phe Glu His Gln Lys
Thr Met Pro Ser Ala Pro Ala Pro Asp 115 120 125 Gly Leu Pro Ser Glu
Thr Gln Ile Ala Gln Ser Leu Ala His Leu Leu 130 135 140 Pro Pro Val
Leu Lys Asp Lys Phe Ile Cys Asp Arg Pro Leu Glu Val 145 150 155 160
Arg Pro Val Glu Phe His Asn Pro Leu Lys Gly His Val Ala Glu Pro 165
170 175 His Arg Gln Val Trp Ile Arg Ala Asn Gly Ser Val Pro Asp Asp
Leu 180 185 190 Arg Val His Gln Tyr Leu Leu Gly Tyr Ala Ser Asp Leu
Asn Phe Leu 195 200 205 Pro Val Ala Leu Gln Pro His Gly Ile Gly Phe
Leu Glu Pro Gly Ile 210 215 220 Gln Ile Ala Thr Ile Asp His Ser Met
Trp Phe His Arg Pro Phe Asn 225 230 235 240 Leu Asn Glu Trp Leu Leu
Tyr Ser Val Glu Ser Thr Ser Ala Ser Ser 245 250 255 Ala Arg Gly Phe
Val Arg Gly Glu Phe Tyr Thr Gln Asp Gly Val Leu 260 265 270 Val Ala
Ser Thr Val Gln Glu Gly Val Met Arg Asn His Asn 275 280 285
154 912 DNA Acinetobacter sp. ADP1
154 ttgatatcaa tcagggaaaa acgcgtgaac
aaaaaacttg aagctctctt ccgagagaat 60gtaaaaggta aagtggcttt gatcactggt
gcatctagtg gaatcggttt gacgattgca 120aaaagaattg ctgcggcagg
tgctcatgta ttattggttg cccgaaccca agaaacactg 180gaagaagtga
aagctgcaat tgaacagcaa gggggacagg cctctatttt tccttgtgac
240ctgactgaca tgaatgcgat tgaccagtta tcacaacaaa ttatggccag
tgtcgatcat 300gtcgatttcc tgatcaataa tgcagggcgt tcgattcgcc
gtgccgtaca cgagtcgttt 360gatcgcttcc atgattttga acgcaccatg
cagctgaatt actttggtgc ggtacgttta 420gtgttaaatt tactgccaca
tatgattaag cgtaaaaatg gccagatcat caatatcagc 480tctattggtg
tattggccaa tgcgacccgt ttttctgctt atgtcgcgtc taaagctgcg
540ctggatgcct tcagtcgctg tctttcagcc gaggtactca agcataaaat
ctcaattacc 600tcgatttata tgccattggt gcgtacccca atgatcgcac
ccaccaaaat ttataaatac 660gtgcccacgc tttccccaga agaagccgca
gatctcattg tctacgccat tgtgaaacgt 720ccaaaacgta ttgcgacgca
cttgggtcgt ctggcgtcaa ttacctatgc catcgcacca 780gacatcaata
atattctgat gtcgattgga tttaacctat tcccaagctc aacggctgca
840ctgggtgaac aggaaaaatt gaatctgcta caacgtgcct atgcccgctt
gttcccaggc 900gaacactggt aa 912155303PRTAcinetobacter sp. ADP1
155Met Ile Ser Ile Arg Glu Lys Arg Val Asn Lys Lys Leu Glu Ala Leu
1 5 10 15 Phe Arg Glu Asn Val Lys Gly Lys Val Ala Leu Ile Thr Gly
Ala Ser 20 25 30 Ser Gly Ile Gly Leu Thr Ile Ala Lys Arg Ile Ala
Ala Ala Gly Ala 35 40 45 His Val Leu Leu Val Ala Arg Thr Gln Glu
Thr Leu Glu Glu Val Lys 50 55 60 Ala Ala Ile Glu Gln Gln Gly Gly
Gln Ala Ser Ile Phe Pro Cys Asp 65 70 75 80 Leu Thr Asp Met Asn Ala
Ile Asp Gln Leu Ser Gln Gln Ile Met Ala 85 90 95 Ser Val Asp His
Val Asp Phe Leu Ile Asn Asn Ala Gly Arg Ser Ile 100 105 110 Arg Arg
Ala Val His Glu Ser Phe Asp Arg Phe His Asp Phe Glu Arg 115 120 125
Thr Met Gln Leu Asn Tyr Phe Gly Ala Val Arg Leu Val Leu Asn Leu 130
135 140 Leu Pro His Met Ile Lys Arg Lys Asn Gly Gln Ile Ile Asn Ile
Ser 145 150 155 160 Ser Ile Gly Val Leu Ala Asn Ala Thr Arg Phe Ser
Ala Tyr Val Ala 165 170 175 Ser Lys Ala Ala Leu Asp Ala Phe Ser Arg
Cys Leu Ser Ala Glu Val 180 185 190 Leu Lys His Lys Ile Ser Ile Thr
Ser Ile Tyr Met Pro Leu Val Arg 195 200 205 Thr Pro Met Ile Ala Pro
Thr Lys Ile Tyr Lys Tyr Val Pro Thr Leu 210 215 220 Ser Pro Glu Glu
Ala Ala Asp Leu Ile Val Tyr Ala Ile Val Lys Arg 225 230 235 240 Pro
Lys Arg Ile Ala Thr His Leu Gly Arg Leu Ala Ser Ile Thr Tyr 245 250
255 Ala Ile Ala Pro Asp Ile Asn Asn Ile Leu Met Ser Ile Gly Phe Asn
260 265 270 Leu Phe Pro Ser Ser Thr Ala Ala Leu Gly Glu Gln Glu Lys
Leu Asn 275 280 285 Leu Leu Gln Arg Ala Tyr Ala Arg Leu Phe Pro Gly
Glu His Trp 290 295 300
156 296 PRT Clostridium acetobutylicum
156 Met
Ile Lys Ser Phe Asn Glu Ile Ile Met Lys Val Lys Ser Lys Glu 1 5 10
15 Met Lys Lys Val Ala Val Ala Val Ala Gln Asp Glu Pro Val Leu Glu
20 25 30 Ala Val Arg Asp Ala Lys Lys Asn Gly Ile Ala Asp Ala Ile
Leu Val 35 40 45 Gly Asp His Asp Glu Ile Val Ser Ile Ala Leu Lys
Ile Gly Met Asp 50 55 60 Val Asn Asp Phe Glu Ile Val Asn Glu Pro
Asn Val Lys Lys Ala Ala 65 70 75 80 Leu Lys Ala Val Glu Leu Val Ser
Thr Gly Lys Ala Asp Ile Leu Met 85 90 95 Asn Gly Leu Val Asn Thr
Ala Thr Phe Leu Lys Ile Cys Ile Leu Asn 100 105 110 Lys Glu Val Gly
Leu Arg Thr Gly Lys Thr Met Ser His Val Ala Val 115 120 125 Phe Glu
Thr Glu Thr Ser Asp Arg Leu Ser Phe Leu Thr Asp Val Ala 130 135 140
Phe Asn Thr Tyr Pro Glu Leu Lys Glu Lys Ile Asp Ile Val Asn Asn 145
150 155 160 Ser Val Lys Val Ala His Ala Ile Gly Ile Val Asn Pro Lys
Val Ala 165 170 175 Pro Ile Cys Ala Val Glu Val Ile Asn Pro Lys Met
Pro Ser Thr Leu 180 185 190 Asp Ala Ala Met Leu Ser Lys Met Ser Asp
Arg Gly Gln Ile Lys Gly 195 200 205 Cys Val Val Asp Gly Pro Leu Ala
Leu Asp Ile Ala Leu Ser Glu Glu 210 215 220 Ala Ala His His Lys Gly
Val Thr Gly Glu Val Ala Gly Lys Ala Asp 225 230 235 240 Ile Phe Leu
Met Pro Asn Ile Glu Thr Gly Asn Val Met Tyr Lys Thr 245 250 255 Leu
Thr Tyr Thr Thr Asp Ser Lys Asn Gly Gly Ile Leu Val Gly Thr 260 265
270 Ser Ala Pro Val Val Leu Thr Ser Arg Ala Asp Ser His Glu Thr Lys
275 280 285 Met Asn Ser Ile Ala Leu Ala Ala 290 295
157 355 PRT Clostridium acetobutylicum
157 Met Ser Tyr Lys Leu Leu Ile
Ile Asn Pro Gly Ser Thr Ser Thr Lys 1 5 10 15 Ile Gly Val Tyr Glu
Gly Glu Lys Glu Leu Phe Glu Glu Thr Leu Arg 20 25 30 His Thr Asn
Glu Glu Ile Lys Arg Tyr Asp Thr Ile Tyr Asp Gln Phe 35 40 45 Glu
Phe Arg Lys Glu Val Ile Leu Asn Val Leu Lys Glu Lys Asn Phe 50 55
60 Asp Ile Lys Thr Leu Ser Ala Ile Val Gly Arg Gly Gly Met Leu Arg
65 70 75 80 Pro Val Glu Gly Gly Thr Tyr Ala Val Asn Asp Ala Met Val
Glu Asp 85 90 95 Leu Lys Val Gly Val Gln Gly Pro His Ala Ser Asn
Leu Gly Gly Ile 100 105 110 Ile Ala Lys Ser Ile Gly Asp Glu Leu Asn
Ile Pro Ser Phe Ile Val 115 120 125 Asp Pro Val Val Thr Asp Glu Leu
Ala Asp Val Ala Arg Leu Ser Gly 130 135 140 Val Pro Glu Leu Pro Arg
Lys Ser Lys Phe His Ala Leu Asn Gln Lys 145 150 155 160 Ala Val Ala
Lys Arg Tyr Gly Lys Glu Ser Gly Gln Gly Tyr Glu Asn 165 170 175 Leu
Asn Leu Val Val Val His Met Gly Gly Gly Val Ser Val Gly Ala 180 185
190 His Asn His Gly Lys Val Val Asp Val Asn Asn Ala Leu Asp Gly Asp
195 200 205 Gly Pro Phe Ser Pro Glu Arg Ala Gly Ser Val Pro Ile Gly
Asp Leu 210 215 220 Val Lys Met Cys Phe Ser Gly Lys Tyr Ser Glu Ala
Glu Val Tyr Gly 225 230 235 240 Lys Ala Val Gly Lys Gly Gly Phe Val
Gly Tyr Leu Asn Thr Asn Asp 245 250 255 Val Lys Gly Val Ile Asp Lys
Met Glu Glu Gly Asp Lys Glu Cys Glu 260 265 270 Ser Ile Tyr Lys Ala
Phe Val Tyr Gln Ile Ser Lys Ala Ile Gly Glu 275 280 285 Met Ser Val
Val Leu Glu Gly Lys Val Asp Gln Ile Ile Phe Thr Gly 290 295 300 Gly
Ile Ala Tyr Ser Pro Thr Leu Val Pro Asp Leu Lys Ala Lys Val 305 310
315 320 Glu Trp Ile Ala Pro Val Thr Val Tyr Pro Gly Glu Asp Glu Leu
Leu 325 330 335 Ala Leu Ala Gln Gly Ala Ile Arg Val Leu Asp Gly Glu
Glu Gln Ala 340 345 350 Lys Val Tyr 355 15870DNAArtificial
sequenceSynthetic primer 158aaaaacagca acaatgtgag ctttgttgta
attatattgt aaacatattg attccgggga 60tccgtcgacc 7015968DNAArtificial
sequenceSynthetic primer 159aaacggagcc tttcggctcc gttattcatt
tacgcggctt caactttcct gtaggctgga 60gctgcttc 6816023DNAArtificial
sequenceSynthetic primer 160cgggcaggtg ctatgaccag gac
2316123DNAArtificial sequenceSynthetic primer 161cgcggcgttg
accggcagcc tgg 2316270DNAArtificial sequenceSynthetic primer
162atcattctcg tttacgttat cattcacttt acatcagaga tataccaatg
attccgggga 60tccgtcgacc 7016369DNAArtificial sequenceSynthetic
primer 163gcacggaaat ccgtgcccca aaagagaaat tagaaacgga aggttgcggt
tgtaggctgg 60agctgcttc 6916421DNAArtificial sequenceSynthetic
primer 164caacagcaac ctgctcagca a 2116521DNAArtificial
sequenceSynthetic primer 165aagctggagc agcaaagcgt t
2116632DNAArtificial sequenceSynthetic primer 166ataaaccatg
gatccatgaa cgagtacgcc cc 3216733DNAArtificial sequenceSynthetic
primer 167ccaagcttcg aattctcaga tatgcaaggc gtg 3316836DNAArtificial
sequenceSynthetic primer 168tgaattccat ggcgcaactc actcttcttt tagtcg
3616939DNAArtificial sequenceSynthetic primer 169cagtacctcg
agtcttcgta tacatatgcg ctcagtcac 3917021DNAArtificial
sequenceSynthetic primer 170ccttggggca tatgaaagct g
2117129DNAArtificial sequenceSynthetic primer 171tttagtcatc
tcgagtgcac ctcaccttt 2917270DNAArtificial sequenceSynthetic primer
172gccacattgc cgcgccaaac gaaaccgttt caaccatggc atatgaatat
cctccttagt 60tcctattccg 7017361DNAArtificial sequenceSynthetic
primer 173cgccccagat ttcacgtatt gatcggctac gcttaatgca tgtgtaggct
ggagctgctt 60c 6117420DNAArtificial sequenceSynthetic primer
174ttgacacgtc taaccctggc 2017521DNAArtificial sequenceSynthetic
primer 175ctgtccaggg aacacaaatg c 2117619DNAArtificial
sequenceSynthetic primer 176ttgtgtcgcc ctttcgctg
1917724DNAArtificial sequenceSynthetic primer 177cttacgtacg
tactcgagtg acgc 2417823DNAArtificial sequenceSynthetic primer
178aagtggggca tatgtctaag atc 2317923DNAArtificial sequenceSynthetic
primer 179gtgatccggc tcgaggtggt tac 2318024DNAArtificial
sequenceSynthetic primer 180cttaacttca tgtgaaaagt ttgt
2418124DNAArtificial sequenceSynthetic primer 181acaataccca
tgtttatagg gcaa 24
* * * * *
References