U.S. patent number 11,041,177 [Application Number 16/473,149] was granted by the patent office on 2021-06-22 for modified lysine decarboxylase enzymes.
This patent grant is currently assigned to CATHAY BIOTECH INC. and CIBT AMERICA INC. The grantees listed for this patent are CATHAY BIOTECH INC. and CIBT AMERICA INC. Invention is credited to Ling Chen, Howard Chou, Xiucai Liu, Wenqiang Lu.
United States Patent 11,041,177
Chou, et al.
June 22, 2021
Modified lysine decarboxylase enzymes
Abstract
The invention provides CadA polypeptides with mutations that
increase activity in alkaline pH compared to the wild-type lysine
decarboxylase. The invention also provides methods of generating
such mutant polypeptides, microorganisms genetically modified to
overexpress the mutant polypeptides, and methods of generating such
microorganisms.
Inventors: Chou; Howard (Shanghai, CN), Lu; Wenqiang (Shanghai, CN),
Chen; Ling (Shanghai, CN), Liu; Xiucai (Shanghai, CN)
Applicant:
CATHAY BIOTECH INC. (Shanghai, CN)
CIBT AMERICA INC. (Newark, DE, US)
Assignee:
CATHAY BIOTECH INC. (Shanghai, CN)
CIBT AMERICA INC. (Newark, DE)
Family ID: 1000005631478
Appl. No.: 16/473,149
Filed: December 30, 2016
PCT Filed: December 30, 2016
PCT No.: PCT/CN2016/113519
371(c)(1),(2),(4) Date: June 24, 2019
PCT Pub. No.: WO2018/120026
PCT Pub. Date: July 05, 2018
Prior Publication Data
Document Identifier: US 20200231999 A1
Publication Date: Jul 23, 2020
Current U.S. Class: 1/1
Current CPC Class: C12P 13/001 (20130101); C12Y 401/01018 (20130101);
C12N 9/88 (20130101)
Current International Class: C12P 13/00 (20060101); C12N 9/88
(20060101)
References Cited
Foreign Patent Documents
103484444, Jan 2014, CN
105296456, Feb 2016, CN
2602314, Jun 2013, EP
9617930, Jun 1996, WO
2016119230, Aug 2016, WO
WO 2016/129812, Aug 2016, WO
Other References
International Search Report issued in Application No.
PCT/CN2016/113519 dated Oct. 11, 2017, 3 pages. cited by applicant.
Zhang Kai et al. "Directed evolution by DNA shuffling of lysine
decarboxylase gene cadA and ldc" Chinese Journal of Bioprocess
Engineering, vol. 13, No. 5, Sep. 30, 2015. 6 pages. cited by
applicant.
Communication pursuant to Rule 164(1) EPC in European Application
No. 16925677.3, dated Aug. 26, 2020, 20 pages. cited by applicant.
Zhang Kai, et al. "Directed evolution by DNA shuffling of lysine
decarboxylase gene cadA and ldc", Chinese Journal of Bioprocess
Engineering, Nanjing University of Technology, CN, vol. 13, No. 5,
Sep. 30, 2015 (Sep. 30, 2015), pp. 20-25. cited by applicant.
Usheer Kanjee, et al., "Linkage between the bacterial acid stress
and stringent responses: the structure of the lysine
decarboxylase", The EMBO Journal, vol. 30, No. 5, Mar. 2, 2011
(Mar. 2, 2011), pp. 931-944. cited by applicant.
Wang Chen, et al., "Directed Evolution and Mutagenesis of Lysine
Decarboxylase from Hafnia alvei ASI.1009 to Improve Its Activity
toward Efficient Cadaverine Production", Biotechnology and
Bioprocess Engineering, Korean Society for Biotechnology and
Bioengineering, Seoul, KR, vol. 20, No. 3, Jul. 21, 2015 (Jul. 21,
2015), pp. 439-446. cited by applicant.
Primary Examiner: Desai; Anand U
Attorney, Agent or Firm: Rothwell, Figg, Ernst &
Manbeck, P.C.
Claims
What is claimed is:
1. A product, which is one of the following products I) through
VI): I) a CadA variant polypeptide comprising at least one amino
acid substitution at a glutamic acid residue at one or more
positions selected from the group consisting of positions 291, 344,
355, 463, 482, and 499 as determined with reference to SEQ ID NO:2,
wherein the glutamic acid residue occurs at the surface of the
protein with the side chain oriented towards the external
environment in a segment of the protein that lacks a defined
secondary structure; and wherein the CadA variant polypeptide has
at least 80% identity to any one of SEQ ID NOS:2 to 5; II) a
polynucleotide comprising a nucleic acid sequence encoding a CadA
variant polypeptide of I); III) an expression vector comprising a
polynucleotide of II); IV) a genetically modified host cell
comprising a CadA variant polypeptide of I); V) a genetically
modified host cell comprising a polynucleotide of II), wherein the
nucleic acid sequence encoding the CadA variant polypeptide is
integrated into the host cell chromosome; VI) a genetically
modified host cell comprising an expression vector of III).
2. A product of claim 1, which is I) the CadA variant polypeptide,
wherein the substitution is at glutamic acid residue at one or more
positions selected from the group consisting of positions 291, 344,
355, 463, 482, and 499.
3. The product of claim 1, wherein the amino acid substitution is
E291A/C/D/H/R/V/G/K/N/S, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y/A/H/K/M.
4. The product of claim 3, where the amino acid substitution is
E291A/C/D/H/R/V, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y.
5. A product of claim 1, which is I) the CadA variant polypeptide,
wherein the CadA variant polypeptide has at least 85%, at least 90%,
or at least 95% identity to SEQ ID NO:2.
6. A product of claim 1, which is I) the CadA variant polypeptide,
wherein the CadA variant polypeptide has at least 80%, at least
85%, at least 90%, or at least 95% identity to SEQ ID NO:3, SEQ ID
NO:4, or SEQ ID NO:5.
7. A product of claim 1, which is I) the CadA variant polypeptide,
wherein the CadA variant polypeptide is a conservatively modified
variant polypeptide.
8. A product of claim 1, which is IV) the genetically modified host
cell, wherein the host cell is genetically modified to over express
one or more lysine biosynthesis polypeptides.
9. A product of claim 1, which is IV) the genetically modified host
cell, wherein the host cell is a bacterium.
10. The product of claim 8, wherein the host cell is from the genus
Escherichia, Hafnia, or Corynebacteria.
11. The product of claim 8, wherein the host cell is selected from
the group consisting of Escherichia coli, Hafnia alvei and
Corynebacterium glutamicum.
12. A product of claim 1, which is VI) the genetically modified
host cell, wherein the host cell is a bacterium.
13. A product of claim 1, which is VI) the genetically modified
host cell, wherein the host cell is from the genus Escherichia,
Hafnia, or Corynebacteria.
14. A product of claim 1, which is VI) the genetically modified
host cell, wherein the host cell is selected from the group
consisting of Escherichia coli, Hafnia alvei and Corynebacterium
glutamicum.
15. A product of claim 1, which is V) the genetically modified host
cell, wherein the host cell is a bacterium.
16. The product of claim 14, wherein the host cell is from the
genus Escherichia, Hafnia, or Corynebacteria.
17. The product of claim 14, wherein the host cell is selected from
the group consisting of Escherichia coli, Hafnia alvei and
Corynebacterium glutamicum.
18. A method, which is one of the following methods I) through
III): I) a method of producing cadaverine, the method comprising
culturing a genetically modified host cell of claim 1 IV) under
conditions in which the CadA variant polypeptide is expressed; II)
a method of producing cadaverine, the method comprising culturing a
genetically modified host cell of claim 1 V) under conditions in
which the CadA variant polypeptide is expressed; III) a method of
producing cadaverine, the method comprising culturing a genetically
modified host cell of claim 1 VI) under conditions in which the
CadA variant polypeptide is expressed.
Description
CROSS REFERENCE TO RELATED APPLICATION
This application is a 35 U.S.C. § 371 National Stage of
International Application No. PCT/CN2016/113519, filed 30 Dec.
2016, designating the United States. Each application is
incorporated herein by reference in its entirety.
BACKGROUND OF THE INVENTION
Most enzymes function optimally within a narrow pH range, because
they are amphoteric molecules. The pH of the surrounding
environment directly affects the charges on the acidic and basic
groups of the amino acids that make up the enzyme. These changes in
charge affect the net charge of the enzyme, the pKa of the active
site, and the charge distribution across the surface of the enzyme.
As a result, changes in pH can affect the activity, solubility, and
stability of an enzyme.
The class of proteins known as acid decarboxylases is a group of
enzymes that catalyze the decarboxylation of basic amino
acids (e.g., lysine, arginine, ornithine) in order to generate
polyamines as part of the acid stress response in many
microorganisms. Escherichia coli has several PLP-dependent acid
decarboxylases: CadA, LdcC, AdiA, SpeA, SpeC, SpeF, GadA, and GadB.
All of these enzymes function within a narrow pH range, and the
enzyme activity decreases significantly outside of that pH range
(Kanjee et al., Biochemistry 50, 9388-9398, 2011). It has been
previously observed that these PLP-dependent decarboxylases
dimerize in order to form a complete active site. In some cases,
such as CadA, the dimers form decamers that aggregate into higher
molecular weight protein complexes required for optimal function.
The inhibition of higher molecular weight protein complex formation
(e.g., in conditions outside of the optimal pH) leads to a
significant decrease in function (Kanjee et al., The EMBO Journal
30, 931-944, 2011).
The pKa values of individual amino acids in a protein are important
for determining its biomolecular function, because one of the
dominant reactions in a protein-water solution is the exchange of
protons by certain amino acids with the environment of the protein.
The pKa is a measure of the difference in free energy between the
neutral and charged states, and indicates the propensity of an
amino acid to donate or accept a proton. Certain amino acids have
titratable groups that make them more amenable to accept and donate
protons. These amino acids include aspartate, glutamate, cysteine,
serine, tyrosine, threonine, histidine, lysine, and arginine.
Illustrative pKa values of some amino acids are: aspartate is 4.0,
glutamate is 4.4, cysteine is 8.7, tyrosine is 9.6, histidine is
6.3, lysine is 10.4, and arginine is 13.0 (Nielsen J E & Vriend
G, Proteins 43, 403-412, 2001). These pKa values can vary by 0.5,
depending on the literature source.
Whether a titratable group accepts or donates a proton will depend
on its environment, such as the pH or other amino acids in its
proximity. For example, when the pH is less than the pKa of the
titratable group, then the group will more likely accept a proton.
Conversely, when the pH is greater than the pKa of the titratable
group, then the group will more likely donate a proton. When a
titratable group of an amino acid either accepts or donates a
proton, the amino acid can become either positively charged,
negatively charged, or neutral depending on the charge it started
with before the proton exchange happened. Charged groups can
interact with other charged groups when the two groups are brought
into proximity of one another. Like charges repel each other and
opposite charges attract each other. Neutrally charged groups that
are protonated can still interact with other groups through
hydrogen bond interactions.
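The relationship between pH, pKa, and protonation state described above can be made concrete with the Henderson-Hasselbalch equation. The following is a minimal sketch, not part of the original disclosure: it uses the illustrative free-side-chain pKa values quoted earlier and assumes an isolated titratable group, so it ignores the shifts in pKa that neighboring residues can cause inside a folded protein. The function name `fraction_deprotonated` is hypothetical.

```python
# Illustrative side-chain pKa values quoted in the text (Nielsen & Vriend, 2001).
# Assumption: each group is treated as isolated; real in-protein pKa values shift.
PKA = {
    "aspartate": 4.0,
    "glutamate": 4.4,
    "histidine": 6.3,
    "cysteine": 8.7,
    "tyrosine": 9.6,
    "lysine": 10.4,
    "arginine": 13.0,
}


def fraction_deprotonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of a group that has donated its proton."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Compare glutamate near the CadA optimum (pH 5.0) and at alkaline pH 8.0.
    for ph in (5.0, 8.0):
        frac = fraction_deprotonated(PKA["glutamate"], ph)
        print(f"glutamate at pH {ph}: {frac:.3f} deprotonated")
```

Under these assumptions, a glutamate side chain is mostly deprotonated (negatively charged) already at pH 5.0 and almost entirely so at pH 8.0, while a lysine side chain remains largely protonated at both values.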
An understanding of the pKa values of the titratable groups of a
protein is not only important for understanding how pH affects
polypeptide folding and enzyme activity, but also protein-protein
interactions (Jensen J E, Curr Pharm Biotechnol 9, 96-102, 2008),
especially in the cases of the acid decarboxylases that undergo
significant changes in their quaternary structure as a result of a
change in the pH of the environment. Few studies have evaluated
the effect of mutations at amino acids with titratable groups on
the function of the acid decarboxylases.
Based on prior literature (Kanjee, et al. The EMBO Journal 30,
931-944, 2011), CadA transitions from a state that consists of
decamers and high-order oligomers to a state that is composed
mostly of dimers when the pH of the environment changes. The
formation of decamers is a prerequisite for the formation of
high-order oligomers. It has been shown that CadA functions
optimally at a pH of 5.0-5.5 (Lemonnier M & Lane D,
Microbiology 144, 751-760, 1998). At this acidic pH, Kanjee et al.
show using EM that CadA exists mainly as high-order oligomers. When
the pH is increased above 6.0 or when the inhibitor ppGpp is
present, the high-order oligomers do not form and decarboxylase
function is significantly reduced. However, there is an absence of
literature that describes how the CadA high-order oligomers form
and the amino acid residues that play a role in their
formation.
BRIEF SUMMARY OF ASPECTS OF THE DISCLOSURE
This invention is based, in part, on the discovery of mutations
that provide the ability to stabilize the chemical interactions
necessary for quaternary structure formation, and increase
stability to allow a mutant acid decarboxylase protein to function
across a wider pH range. The ability to function across a wider pH
range is important in maintaining a high reaction rate without the
need to add additional chemicals to maintain pH. The maintenance of
a high reaction rate across a wide pH range would enable lysine to
be converted into cadaverine faster and reduce the amount of
utilities required. The tolerance of the protein for alkaline pH
eliminates the need to add additional chemicals to maintain pH.
These chemicals used to maintain pH oftentimes form salts (e.g.,
SO₄²⁻ or Cl⁻) that go either into the wastewater or
must be removed from the process using additional purification
steps. Therefore, a mutant acid decarboxylase that functions at a
wider pH range than wild-type would decrease the cost and the
environmental footprint of the overall process by reducing the
amount of salts formed during the process.
In one aspect, the invention thus provides a CadA variant
polypeptide comprising at least one amino acid substitution at a
glutamic acid residue in a region corresponding to amino acids 276
to 509 as determined with reference to SEQ ID NO:2, where the
glutamic acid residue occurs at the surface of the protein with the
side chain oriented towards the external environment in a segment
of the protein that lacks a defined secondary structure; and
wherein the CadA variant polypeptide has at least 70% identity to
any one of SEQ ID NOS:2 to 5. In some embodiments, the substitution
is at a glutamic acid residue at a position selected from the group
consisting of positions 291, 344, 355, 463, 482, and 499 as
determined with reference to SEQ ID NO:2. In some embodiments, the
amino acid substitution is E291A/C/D/H/R/V/G/K/N/S,
E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or E482C/F/I/L/S/W/Y/A/H/K/M. In
some embodiments, the amino acid substitution is E291A/C/D/H/R/V,
E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or E482C/F/I/L/S/W/Y. In some
embodiments a CadA variant polypeptide comprises substitutions of
glutamic acid residues at at least two positions selected from the
group consisting of positions 291, 344, 355, 463, 482, and 499. In
some embodiments, a CadA variant polypeptide comprises
substitutions of glutamic acid residues at at least three positions
selected from the group consisting of positions 291, 344, 355, 463,
482, and 499. In some embodiments, a CadA variant polypeptide
comprises substitutions of glutamic acid residues at four or five
positions selected from the group consisting of positions 291, 344,
355, 463, 482, and 499; or at all six of the positions. In some
embodiments, a CadA variant polypeptide as described above in this
paragraph has at least 70% identity to SEQ ID NO:2. In some
embodiments, the CadA variant polypeptide has at least 75%, at
least 80%, at least 85%, at least 90%, or at least 95% identity to
SEQ ID NO:2. In some embodiments, a CadA variant polypeptide has at
least 70% identity to SEQ ID NO:3, SEQ ID NO:4, or SEQ ID NO:5. In
some embodiments, the CadA variant polypeptide has at least 75%, at
least 80%, at least 85%, at least 90%, or at least 95% identity to
SEQ ID NO:3, SEQ ID NO:4, or SEQ ID NO:5.
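The slash-separated shorthand used in this paragraph (e.g., E291A/C/D/H/R/V) denotes a set of alternative single substitutions at one position. As a minimal sketch of how such shorthand could be expanded into individual mutation labels, the hypothetical helper below parses the wild-type residue, the position, and each replacement residue; it is an illustration only and not a tool described in the disclosure.

```python
import re
from typing import List

# Pattern: one wild-type letter, a position number, then one or more
# single-letter replacements separated by "/".
_SHORTHAND = re.compile(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)")


def expand_substitutions(shorthand: str) -> List[str]:
    """Expand e.g. 'E291A/C/D' into ['E291A', 'E291C', 'E291D']."""
    match = _SHORTHAND.fullmatch(shorthand.strip())
    if match is None:
        raise ValueError(f"not a valid substitution shorthand: {shorthand!r}")
    wild_type, position, replacements = match.groups()
    return [f"{wild_type}{position}{aa}" for aa in replacements.split("/")]


if __name__ == "__main__":
    for item in ("E291A/C/D/H/R/V", "E482C/F/I/L/S/W/Y"):
        print(item, "->", expand_substitutions(item))
```

Because the pattern requires exactly one letter between slashes, a malformed entry with two adjacent letters and no separator is rejected rather than silently misread.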
In a further aspect, the invention provides a genetically modified
host cell comprising a CadA variant polypeptide as described
herein, e.g., in the preceding paragraph. In typical embodiments,
the genetically modified host cell is genetically modified to over
express one or more lysine biosynthesis polypeptides. In some
embodiments, the host cell is a bacterium. In further embodiments,
the host cell is from the genus Escherichia, Hafnia, or
Corynebacteria. In some embodiments, the genetically modified host
cell is Escherichia coli. In some embodiments, the genetically
modified host cell is Hafnia alvei. In some embodiments, the
genetically modified host cell is Corynebacterium glutamicum.
In an additional aspect, the invention provides a polynucleotide
comprising a nucleic acid sequence encoding a CadA variant
polypeptide as described herein, e.g., in the preceding paragraphs
in this section. In further aspects, the invention additionally
provides an expression vector comprising a polynucleotide encoding
the CadA variant, and/or a genetically modified host cell
comprising the expression vector. In some embodiments, the host
cell is a bacterium, e.g., from the genus Escherichia, Hafnia, or
Corynebacteria. In some embodiments, the genetically modified host
cell is Escherichia coli. In some embodiments, the genetically
modified host cell is Hafnia alvei. In some embodiments, the
genetically modified host cell is Corynebacterium glutamicum.
In a further aspect, the invention provides a genetically modified
host cell comprising a polynucleotide that comprises a nucleic acid
sequence encoding a CadA variant polypeptide as described herein,
e.g., in the preceding paragraphs in this section, wherein the
nucleic acid sequence encoding the CadA variant polypeptide is
integrated into the host cell chromosome. In some embodiments, the
host cell is a bacterium, e.g., from the genus Escherichia, Hafnia,
or Corynebacteria. In some embodiments, the genetically modified
host cell is Escherichia coli. In some embodiments, the genetically
modified host cell is Hafnia alvei. In some embodiments, the
genetically modified host cell is Corynebacterium glutamicum.
In another aspect, the invention provides a method of producing
cadaverine, the method comprising culturing a genetically modified
host cell as described herein, e.g., in the preceding paragraphs in
this section, under conditions in which the CadA variant polypeptide is
expressed.
DESCRIPTION OF THE FIGURES
FIG. 1 shows an alignment of the E. coli CadA polypeptide sequence
SEQ ID NO:2 with CadA homologs from Salmonella enterica
(WP_001540636.1, SEQ ID NO:5), Klebsiella multispecies
(WP_012968785, SEQ ID NO:3), and Enterobacteriaceae multispecies
(WP_002892486.1, SEQ ID NO:4).
DETAILED DESCRIPTION OF ASPECTS OF THE DISCLOSURE
Before the present invention is described, it is to be understood
that this invention is not limited to particular embodiments
described, as such may, of course, vary. It is also to be
understood that the terminology used herein is for the purpose of
describing particular embodiments only, and is not intended to be
limiting.
Unless defined otherwise, all technical and scientific terms used
herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this invention belongs. Although
any methods and materials similar or equivalent to those described
herein can be used in the practice or testing of the present
invention, the preferred methods and materials are now described.
All publications and accession numbers mentioned herein are
incorporated herein by reference to disclose and describe the
methods and/or materials in connection with which the publications
are cited.
Terminology
As used in the context of the present disclosure, a "CadA
polypeptide" refers to an Escherichia coli CadA polypeptide having
the amino acid sequence of SEQ ID NO:2, or a biologically active
variant thereof that has activity, i.e., catalyzes the
decarboxylation of L-lysine to produce cadaverine. Biologically
active variants include alleles, mutants, fragments, and
interspecies homologs of the E. coli CadA polypeptide. CadA has
been well characterized structurally and functionally. The protein
data bank ID for the structure of CadA is 3N75. Illustrative CadA
polypeptides from other species include CadA from Klebsiella (e.g.,
SEQ ID NO:3), Enterobacteriaceae (e.g., SEQ ID NO:4), and
Salmonella enterica (e.g., SEQ ID NO:5). Additional CadA
polypeptides from other species include Serratia sp.,
WP_033635725.1; and Raoultella ornithinolytica, YP_007874766.1. In
some embodiments, a "CadA polypeptide" has at least 60% amino acid
sequence identity, typically at least 65%, 70%, 75%, 80%, 85%, 90%
identity; often at least 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or
99% or greater amino acid sequence identity, over a region of at
least about 200, 300, 400, 500, or more, amino acids; or over the
length of the CadA polypeptide of SEQ ID NO:2. In some embodiments,
a "CadA polypeptide" comprises a region that has at least 80%, at
least 85%, at least 90%, at least 95%, or at least 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98% or 99% or greater amino acid sequence
identity over a region comprising amino acid residues that
correspond to amino acids 261-509 of SEQ ID NO:2 where a native
glutamate present in the region is substituted with another
non-naturally occurring amino acid as described herein. In some
embodiments, a "CadA polypeptide" has at least 60% amino acid
sequence identity, often at least 65%, 70%, 75%, 80%, 85%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater, amino
acid sequence identity, preferably over a region of at least about
200, 300, 400, 500, or more, amino acids, or over the length of the
CadA polypeptide of SEQ ID NO:3, SEQ ID NO:4, or SEQ ID NO:5.
A "CadA polynucleotide" as used herein refers to a polynucleotide
that encodes a CadA polypeptide. A nucleic acid or polynucleotide
that encodes a CadA refers to a gene, pre-mRNA, mRNA, and the like,
including nucleic acids encoding variants, alleles, fragments,
mutants, and interspecies homologs of the particular amino acid
sequences described herein.
As used herein, the term "alkaline pH" refers to a solution or
surrounding environment having a pH of greater than 7.5. In one
embodiment, alkaline pH refers to a solution or surrounding
environment having a pH of at least 8.0, at least 8.5, or higher.
The term "enhanced" or "improved" in the context of the production
of an amino acid derivative, e.g., cadaverine, as used herein
refers to an increase in the production of the amino acid
derivative produced by a host cell that expresses a CadA variant
polypeptide of the invention in comparison to a control counterpart
cell, such as a cell of the wildtype strain or a cell of the same
strain that expresses the wildtype CadA protein. In one embodiment,
activity of the CadA variant is improved by at least 10%, 15%, 20%,
30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, or greater, compared to
CadA activity of a counterpart cell expressing a wildtype CadA,
where activity is assessed by measuring the production of an amino
acid derivative, typically cadaverine, produced by the host cell
and control cell under identical conditions. For example, activity
of a CadA variant polypeptide of the invention can be assessed by
evaluating an aliquot of a culture of host cells transformed with a
polynucleotide encoding the variant CadA polypeptide compared to a
corresponding aliquot from a culture of counterpart host cells of
the same strain that expresses wildtype CadA. By way of further
illustration, the activity of a CadA variant polypeptide of the
invention compared to the counterpart wildtype CadA can be
determined by evaluating the production of cadaverine by cells
transformed with either a vector comprising a nucleic acid sequence
encoding the variant CadA polypeptide (variant host cells) or a
vector comprising a nucleic acid encoding the wildtype CadA
polypeptide (control host cells). Variant and control host cells
are grown under conditions to express CadA, and an aliquot is
incubated with lysine-HCl and PLP at a final concentration of 120
g/L and 0.1 mM, respectively, at pH 8.0 for a period of time, e.g.,
2 hours. Cadaverine production is measured following incubation. An
exemplary assay is provided in the Examples section.
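The percent improvement referred to above is a relative comparison of cadaverine produced by variant and control cells under identical conditions. A minimal sketch of that arithmetic follows; the function name and the yield values in the example are hypothetical placeholders, not measurements from the disclosure.

```python
def percent_improvement(variant_yield: float, wildtype_yield: float) -> float:
    """Relative increase of the variant over the wildtype control, in percent.

    Both yields must be measured under identical conditions and in the
    same units (e.g., g/kg cadaverine after a fixed incubation time).
    """
    if wildtype_yield <= 0:
        raise ValueError("wildtype yield must be positive")
    return 100.0 * (variant_yield - wildtype_yield) / wildtype_yield


if __name__ == "__main__":
    # Hypothetical yields chosen only to illustrate the calculation.
    improvement = percent_improvement(variant_yield=12.0, wildtype_yield=10.0)
    print(f"improvement over wildtype: {improvement:.1f}%")
```

A variant would meet the "at least 10%" threshold in this definition whenever the returned value is 10.0 or greater.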
The terms "numbered with reference to", or "corresponding to," or
"determined with reference to" when used in the context of the
numbering of a given amino acid or polynucleotide sequence, refers
to the numbering of the residues of a specified reference sequence
when the given amino acid or polynucleotide sequence is compared to
the reference sequence. For example, a position of a variant CadA
polypeptide sequence "corresponds to" a position in SEQ ID NO:2
when the variant polypeptide is aligned with SEQ ID NO:2 in a
maximal alignment.
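To illustrate what "corresponds to" means in practice, the sketch below maps a position numbered in the reference sequence to the matching position in a variant, given the two sequences as rows of a pairwise alignment with "-" as the gap character. The function name and the toy sequences are hypothetical; a real analysis would take the aligned rows from an alignment program.

```python
from typing import Optional


def map_reference_position(ref_aligned: str, var_aligned: str,
                           ref_pos: int, gap: str = "-") -> Optional[int]:
    """Return the 1-based variant position aligned to 1-based ref_pos.

    Returns None if the reference residue is aligned to a gap in the variant.
    """
    if len(ref_aligned) != len(var_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    var_count = 0
    for ref_char, var_char in zip(ref_aligned, var_aligned):
        if ref_char != gap:
            ref_count += 1
        if var_char != gap:
            var_count += 1
        if ref_char != gap and ref_count == ref_pos:
            return var_count if var_char != gap else None
    raise ValueError("ref_pos lies beyond the end of the reference sequence")


if __name__ == "__main__":
    # Toy alignment: the variant has one extra residue before reference position 3.
    reference = "MK-VE"
    variant = "MKAVE"
    print(map_reference_position(reference, variant, 4))
```

In the toy alignment, the residue at reference position 4 sits at variant position 5, which is why residue numbering in the disclosure is always stated with reference to SEQ ID NO:2 rather than to the variant's own numbering.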
The terms "wild type", "native", and "naturally occurring" with
respect to a CadA polypeptide are used herein to refer to a CadA
protein that has a sequence that occurs in nature.
In the context of this invention, the term "mutant" with respect to
a mutant polypeptide or mutant polynucleotide is used
interchangeably with "variant". A "non-naturally" occurring CadA
variant refers to a variant or mutant CadA polypeptide that is not
present in a cell in nature and that is produced by genetic
modification, e.g., using genetic engineering technology or
mutagenesis techniques, of a native CadA polynucleotide or
polypeptide. A "variant" CadA polypeptide in the context of this
disclosure includes any non-naturally occurring CadA polypeptide
that comprises at least one amino acid substitution, where the at
least one amino acid substitution is a substitution of a glutamic
acid residue at positions 291, 344, 355, 463, 482, or 499, as
determined with reference to SEQ ID NO:2. A variant CadA
polypeptide of the invention may also have additional mutations
relative to SEQ ID NO:2, including further substitutions,
insertions, or deletions.
The terms "polynucleotide" and "nucleic acid" are used
interchangeably and refer to a single or double-stranded polymer of
deoxyribonucleotide or ribonucleotide bases read from the 5' to the
3' end. A nucleic acid as used in the present invention will
generally contain phosphodiester bonds, although in some cases,
nucleic acid analogs may be used that may have alternate backbones,
comprising, e.g., phosphoramidate, phosphorothioate,
phosphorodithioate, or O-methylphosphoroamidite linkages (see
Eckstein, Oligonucleotides and Analogues: A Practical Approach,
Oxford University Press); positive backbones; non-ionic backbones,
and non-ribose backbones. Nucleic acids or polynucleotides may also
include modified nucleotides that permit correct read-through by a
polymerase. "Polynucleotide sequence" or "nucleic acid sequence"
includes both the sense and antisense strands of a nucleic acid as
either individual single strands or in a duplex. As will be
appreciated by those in the art, the depiction of a single strand
also defines the sequence of the complementary strand; thus the
sequences described herein also provide the complement of the
sequence. Unless otherwise indicated, a particular nucleic acid
sequence also implicitly encompasses variants thereof (e.g.,
degenerate codon substitutions) and complementary sequences, as
well as the sequence explicitly indicated. The nucleic acid may be
DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid
may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine,
cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine,
isoguanine, etc.
The term "substantially identical," used in the context of this
disclosure for two nucleic acids or polypeptides, refers to a
sequence that has at least 50% sequence identity with a reference
sequence. Percent identity can be any integer from 50% to 100%.
Some embodiments include at least: 50%, 55%, 60%, 65%, 70%, 75%,
80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%,
compared to a reference sequence using the programs described
herein; preferably BLAST using standard parameters, as described
below.
Two nucleic acid sequences or polypeptide sequences are said to be
"identical" if the sequence of nucleotides or amino acid residues,
respectively, in the two sequences is the same when aligned for
maximum correspondence as described below. The terms "identical" or
percent "identity," in the context of two or more nucleic acids or
polypeptide sequences, refer to two or more sequences or
subsequences that are the same or have a specified percentage of
amino acid residues or nucleotides that are the same, when compared
and aligned for maximum correspondence over a comparison window, as
measured using one of the following sequence comparison algorithms
or by manual alignment and visual inspection.
For sequence comparison, typically one sequence acts as a reference
sequence, to which test sequences are compared. When using a
sequence comparison algorithm, test and reference sequences are
entered into a computer, subsequence coordinates are designated, if
necessary, and sequence algorithm program parameters are
designated. Default program parameters can be used, or alternative
parameters can be designated. The sequence comparison algorithm
then calculates the percent sequence identities for the test
sequences relative to the reference sequence, based on the program
parameters.
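As a rough illustration of the percent identity calculation, the sketch below counts identical residues across the columns of an existing pairwise alignment. It assumes the alignment has already been produced by a program such as BLASTP and that identity is reported over the alignment length (gapped columns count in the denominator); other tools use different denominators, so this is one convention rather than the definitive one. The function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences share a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both rows are gaps (possible in multiple alignments).
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy aligned rows, not real CadA sequences.
    print(f"{percent_identity('MKVE', 'MKAE'):.1f}%")
```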
An algorithm that may be used to determine whether a variant CadA
polypeptide has sequence identity to SEQ ID NO:2, or another
polypeptide reference sequence such as SEQ ID NO:3, SEQ ID NO:4, or
SEQ ID NO:5, is the BLAST algorithm, which is described in Altschul
et al., 1990, J. Mol. Biol. 215:403-410. Software for performing
BLAST analyses is publicly available through the National Center
for Biotechnology Information (on the worldwide web at
ncbi.nlm.nih.gov/). Illustrative software for performing protein
sequence alignments includes ClustalW2 and BLASTP. For amino acid
sequences, the BLASTP program uses as defaults a word size (W) of
3, an expect threshold (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915
(1992)). In the present disclosure, polypeptide sequence identity
is typically determined using BLASTP Align Sequence with the
default parameters.
A "comparison window," as used herein, includes reference to a
segment of any one of the number of contiguous positions selected
from the group consisting of from 20 to 600, usually about 50 to
about 200, more usually about 100 to about 150 in which a sequence
may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned.
Methods of alignment of sequences for comparison are well-known in
the art. Optimal alignment of sequences for comparison can be
conducted, e.g., by the local homology algorithm of Smith &
Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment
algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970),
by the search for similarity method of Pearson & Lipman, Proc.
Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized
implementations of these algorithms (GAP, BESTFIT, FASTA, and
TFASTA in the Wisconsin Genetics Software Package, Genetics
Computer Group, 575 Science Dr., Madison, Wis.), or by manual
alignment and visual inspection. Optimal alignments are typically
conducted using BLASTP with default parameters.
Nucleic acid or protein sequences that are substantially identical
to a reference sequence include "conservatively modified variants."
With respect to particular nucleic acid sequences, conservatively
modified variants refers to those nucleic acids which encode
identical or essentially identical amino acid sequences, or where
the nucleic acid does not encode an amino acid sequence, to
essentially identical sequences. Because of the degeneracy of the
genetic code, a large number of functionally identical nucleic
acids encode any given protein. For instance, the codons GCA, GCC,
GCG and GCU all encode the amino acid alanine. Thus, at every
position where an alanine is specified by a codon, the codon can be
altered to any of the corresponding codons described without
altering the encoded polypeptide. Such nucleic acid variations are
"silent variations," which are one species of conservatively
modified variations. Every nucleic acid sequence herein which
encodes a polypeptide also describes every possible silent
variation of the nucleic acid. One of skill will recognize that
each codon in a nucleic acid (except AUG, which is ordinarily the
only codon for methionine) can be modified to yield a functionally
identical molecule. Accordingly, each silent variation of a nucleic
acid which encodes a polypeptide is implicit in each described
sequence.
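The degeneracy described above can be illustrated with the standard genetic code. The following sketch, which assumes the standard codon table written in UCAG order and uses hypothetical helper names, lists the synonymous codons for a given codon and counts how many RNA coding sequences encode the same polypeptide as a short toy sequence (the count includes the original sequence).

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order ("*" marks stop codons).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(codon: str) -> list:
    """All codons (including the input) that encode the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(mrna: str) -> int:
    """Number of coding sequences that encode the same polypeptide as mrna."""
    if len(mrna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(mrna), 3):
        total *= len(synonymous_codons(mrna[i:i + 3]))
    return total


print(synonymous_codons("GCA"))            # the four alanine codons
print(synonymous_codons("AUG"))            # methionine has a single codon
print(count_silent_variants("AUGGCAGAA"))  # toy sequence encoding Met-Ala-Glu
```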
The term "polypeptide" as used herein includes reference to
polypeptides containing naturally occurring amino acids and amino
acid backbones as well as non-naturally occurring amino acids and
amino acid analogs.
As to amino acid sequences, one of skill will recognize that an
individual substitution in a nucleic acid, peptide, polypeptide,
or protein sequence that alters a single amino acid or a small
percentage of amino acids in the encoded sequence is a
"conservatively modified variant" where the alteration results in
the substitution of an amino acid with a chemically similar amino
acid. Conservative substitution tables providing functionally
similar amino acids are well known in the art. Examples of amino
acid groups defined in this manner can include: a "charged/polar
group" including Glu (Glutamic acid or E), Asp (Aspartic acid or
D), Asn (Asparagine or N), Gln (Glutamine or Q), Lys (Lysine or K),
Arg (Arginine or R) and His (Histidine or H); an "aromatic or
cyclic group" including Pro (Proline or P), Phe (Phenylalanine or
F), Tyr (Tyrosine or Y) and Trp (Tryptophan or W); and an
"aliphatic group" including Gly (Glycine or G), Ala (Alanine or A),
Val (Valine or V), Leu (Leucine or L), Ile (Isoleucine or I), Met
(Methionine or M), Ser (Serine or S), Thr (Threonine or T) and Cys
(Cysteine or C). Within each group, subgroups can also be
identified. For example, the group of charged/polar amino acids can
be sub-divided into sub-groups including: the "positively-charged
sub-group" comprising Lys, Arg and His; the "negatively-charged
sub-group" comprising Glu and Asp; and the "polar sub-group"
comprising Asn and Gln. In another example, the aromatic or cyclic
group can be sub-divided into sub-groups including: the "nitrogen
ring sub-group" comprising Pro, His and Trp; and the "phenyl
sub-group" comprising Phe and Tyr. In a further example, the
aliphatic group can be sub-divided into sub-groups including: the
"large aliphatic non-polar sub-group" comprising Val, Leu and Ile;
the "aliphatic slightly-polar sub-group" comprising Met, Ser, Thr
and Cys; and the "small-residue sub-group" comprising Gly and Ala.
Examples of conservative mutations include amino acid substitutions
of amino acids within the sub-groups above, such as, but not
limited to: Lys for Arg or vice versa, such that a positive charge
can be maintained; Glu for Asp or vice versa, such that a negative
charge can be maintained; Ser for Thr or vice versa, such that a
free -OH can be maintained; and Gln for Asn or vice versa, such
that a free -NH2 can be maintained. The following six groups each
contain amino acids that further provide illustrative conservative
substitutions for one another: 1) Ala, Ser, Thr; 2) Asp, Glu; 3)
Asn, Gln; 4) Arg, Lys; 5) Ile, Leu, Met, Val; and 6) Phe, Tyr, and
Trp (see, e.g., Creighton, Proteins (1984)). In some embodiments,
conservative substitutions are employed in generating CadA variants
having substitutions at sites other than a glutamate residue.
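The six illustrative groups listed above lend themselves to a simple lookup. The sketch below encodes them as sets of one-letter codes, reading the sixth group as Phe, Tyr, and Trp; the name `is_conservative` is hypothetical, and the function reflects only this six-group scheme, not the broader group and sub-group definitions given earlier in the paragraph.

```python
# The six illustrative conservative-substitution groups, as one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AST"),   # 1) Ala, Ser, Thr
    set("DE"),    # 2) Asp, Glu
    set("NQ"),    # 3) Asn, Gln
    set("RK"),    # 4) Arg, Lys
    set("ILMV"),  # 5) Ile, Leu, Met, Val
    set("FYW"),   # 6) Phe, Tyr, Trp
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS
    )


print(is_conservative("K", "R"))  # both in group 4
print(is_conservative("E", "K"))  # groups 2 and 4
```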
The term "promoter," as used herein, refers to a polynucleotide
sequence capable of driving transcription of a nucleic acid
sequence in a cell. Thus, promoters used in the polynucleotide
constructs of the invention include cis- and trans-acting
transcriptional control elements and regulatory sequences that are
involved in regulating or modulating the timing and/or rate of
transcription of a gene. For example, a promoter can be a
cis-acting transcriptional control element, including an enhancer,
a repressor binding sequence and the like. These cis-acting
sequences typically interact with proteins or other biomolecules to
carry out (turn on/off, regulate, modulate, etc.) gene
transcription. Most often the core promoter sequences lie within
1-2 kb of the translation start site, more often within 1 kb and
often within 500 bp or 200 bp or fewer, of the translation start
site. By convention, promoter sequences are usually provided as the
sequence on the coding strand of the gene they control. In the
context of this application, a promoter is typically referred to by
the name of the gene for which it naturally regulates expression. A
promoter used in an expression construct of the invention is
referred to by the name of the gene. Reference to a promoter by
name includes a wild type, native promoter as well as variants of
the promoter that retain the ability to induce expression.
Reference to a promoter by name is not restricted to a particular
species, but also encompasses a promoter from a corresponding gene
in other species.
A "constitutive promoter" in the context of this invention refers
to a promoter that is capable of initiating transcription under
most conditions in a cell, e.g., in the absence of an inducing
molecule. An "inducible promoter" initiates transcription in the
presence of an inducer molecule.
A polynucleotide is "heterologous" to an organism or a second
polynucleotide sequence if it originates from a foreign species,
or, if from the same species, is modified from its original form.
For example, when a polynucleotide encoding a polypeptide sequence
is said to be operably linked to a heterologous promoter, it means
that the polynucleotide coding sequence encoding the polypeptide is
derived from one species whereas the promoter sequence is derived
from another, different species; or, if both are derived from the
same species, the coding sequence is not naturally associated with
the promoter (e.g., is a genetically engineered coding sequence,
e.g., from a different gene in the same species, or an allele from
a different species). Similarly, a polypeptide is "heterologous" to
a host cell if the native wildtype host cell does not produce the
polypeptide.
The term "exogenous" refers generally to a polynucleotide sequence
or polypeptide that does not naturally occur in a wild-type cell or
organism, but is typically introduced into the cell by molecular
biological techniques, i.e., engineering to produce a recombinant
microorganism. Examples of "exogenous" polynucleotides include
vectors, plasmids, and/or man-made nucleic acid constructs encoding
a desired protein.
The term "endogenous" refers to naturally-occurring polynucleotide
sequences or polypeptides that may be found in a given wild-type
cell or organism. In this regard, it is also noted that even though
an organism may comprise an endogenous copy of a given
polynucleotide sequence or gene, the introduction of a plasmid or
vector encoding that sequence, such as to over-express or otherwise
regulate the expression of the encoded protein, represents an
"exogenous" copy of that gene or polynucleotide sequence. Any of
the pathways, genes, or enzymes described herein may utilize or
rely on an "endogenous" sequence, which may be provided as one or
more "exogenous" polynucleotide sequences, or both.
"Recombinant nucleic acid" or "recombinant polynucleotide" as used
herein refers to a polymer of nucleic acids wherein at least one of
the following is true: (a) the sequence of nucleic acids is foreign
to (i.e., not naturally found in) a given host cell; (b) the
sequence may be naturally found in a given host cell, but in an
unnatural (e.g., greater than expected) amount; or (c) the sequence
of nucleic acids comprises two or more subsequences that are not
found in the same relationship to each other in nature. For
example, regarding instance (c), a recombinant nucleic acid
sequence can have two or more sequences from unrelated genes
arranged to make a new functional nucleic acid.
The term "operably linked" refers to a functional relationship
between two or more polynucleotide (e.g., DNA) segments. Typically,
it refers to the functional relationship of a transcriptional
regulatory sequence to a transcribed sequence. For example, a
promoter or enhancer sequence is operably linked to a DNA or RNA
sequence if it stimulates or modulates the transcription of the DNA
or RNA sequence in an appropriate host cell or other expression
system. Generally, promoter transcriptional regulatory sequences
that are operably linked to a transcribed sequence are physically
contiguous to the transcribed sequence, i.e., they are cis-acting.
However, some transcriptional regulatory sequences, such as
enhancers, need not be physically contiguous or located in close
proximity to the coding sequences whose transcription they
enhance.
The term "expression cassette" or "DNA construct" or "expression
construct" refers to a nucleic acid construct that, when introduced
into a host cell, results in transcription and/or translation of an
RNA or polypeptide, respectively. In the case of expression of
transgenes, one of skill will recognize that the inserted
polynucleotide sequence need not be identical, but may be only
substantially identical to a sequence of the gene from which it was
derived. As explained herein, these substantially identical
variants are specifically covered by reference to a specific
nucleic acid sequence. One example of an expression cassette is a
polynucleotide construct that comprises a polynucleotide sequence
encoding a polypeptide of the invention operably linked to
a promoter, e.g., its native promoter, where the expression
cassette is introduced into a heterologous microorganism. In some
embodiments, an expression cassette comprises a polynucleotide
sequence encoding a polypeptide of the invention where the
polynucleotide is targeted to a position in the genome of a
microorganism such that expression of the polynucleotide sequence
is driven by a promoter that is present in the microorganism.
The term "host cell" as used in the context of this invention
refers to a microorganism and includes an individual cell or cell
culture that can be or has been a recipient of any recombinant
vector(s) or isolated polynucleotide(s) of the invention. Host
cells include progeny of a single host cell, and the progeny may
not necessarily be completely identical (in morphology or in total
DNA complement) to the original parent cell due to natural,
accidental, or deliberate mutation and/or change. A host cell
includes cells into which a recombinant vector or a polynucleotide
of the invention has been introduced, including by transformation,
transfection, and the like.
The term "isolated" refers to a material that is substantially or
essentially free from components that normally accompany it in its
native state. For example, an "isolated polynucleotide," as used
herein, may refer to a polynucleotide that has been isolated from
the sequences that flank it in its naturally-occurring or genomic
state, e.g., a DNA fragment that has been removed from the
sequences that are normally adjacent to the fragment, such as by
cloning into a vector. A polynucleotide is considered to be
isolated if, for example, it is cloned into a vector that is not a
part of the natural environment, or if it is artificially
introduced in the genome of a cell in a manner that differs from
its naturally-occurring state. Alternatively, an "isolated peptide"
or an "isolated polypeptide" and the like, as used herein, may
refer to a polypeptide molecule that is free of other components of
the cell, i.e., it is not associated with in vivo cellular
substances.
The invention employs various routine recombinant nucleic acid
techniques. Generally, the nomenclature and the laboratory
procedures in recombinant DNA technology described below are
commonly employed in the art. Many manuals that provide direction
for performing recombinant DNA manipulations are available, e.g.,
Sambrook & Russell, Molecular Cloning, A Laboratory Manual (3rd
Ed, 2001); and Current Protocols in Molecular Biology (Ausubel, et
al., John Wiley and Sons, New York, 2009-2016).
Summary of Certain Aspects of the Disclosure
In one aspect, the invention provides a variant CadA polypeptide
that comprises a mutation at a glutamic acid (glutamate) that
resides at a position in one of the sections of the protein in the
core domain that are without a defined secondary structure (α,
alpha helix; β, beta sheet; η, strand) (see Kanjee et al., The EMBO
Journal 30, 931-944, 2011) within those domains. These sections are
the amino acids 276-299 between β10 and β11 that includes η4 and
α11, the amino acids 314-326 between β11 and α12 that includes β12,
the amino acids 344-357 between η6 and β14, the amino acids 454-483
between β16 and β18 that includes β17, and the amino acids 494-509
between β19 and α17, wherein the positions of the amino acids
are defined with reference to SEQ ID NO:2.
The ability of a variant CadA of the present invention to tolerate
alkaline pH also allows the use of alternative nitrogen sources
that have higher pH values, such as urea and ammonia (a 1 M solution
has a pH of 11.6) in fermentation reactions to generate the desired
product, e.g., polyamines. These alternative nitrogen sources
generate less salt waste byproduct.
CadA Polypeptide Variants
CadA is a member of the subclass of Fold Type I pyridoxal
5'-phosphate (PLP)-dependent decarboxylases. This class of proteins
typically contains a N-terminal wing domain, a core domain, and a
C-terminal domain. The core domain has a linker region, a
PLP-binding subdomain, and subdomain 4. For CadA, the N-terminal
wing domain (corresponding to residues 1 to 129 as determined with
reference to SEQ ID NO:2) has a flavodoxin-like fold composed of
five-stranded parallel beta-sheets sandwiched between two sets of
amphipathic alpha-helices. The core domain (residues 130 to 563 as
determined with reference to SEQ ID NO:2) includes: a linker
region, amino acid residues 130 to 183 of SEQ ID NO:2, that form a
short helical bundle; the PLP-binding subdomain, amino acids 184 to
417 of SEQ ID NO:2 that form a seven-stranded beta-sheet core
surrounded by three sets of alpha-helices; and subdomain 4, amino
acids 418 to 563 that form a four stranded antiparallel beta-sheet
core with three alpha-helices facing outward. The C-terminal domain
corresponds to amino acid residues 564 to 715 as determined with
reference to SEQ ID NO:2 that form two sets of beta sheets with an
alpha-helical outer surface (Kanjee et al., The EMBO Journal 30,
931-944 2011).
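For convenience, the domain boundaries recited above can be captured in a small lookup. The sketch below is illustrative only; the names `CADA_DOMAINS` and `domain_of` are hypothetical, and the residue ranges are those stated in the preceding paragraph with reference to SEQ ID NO:2.

```python
# Domain boundaries as recited in the text (numbering per SEQ ID NO:2).
CADA_DOMAINS = [
    (1, 129, "N-terminal wing domain"),
    (130, 183, "core domain: linker region"),
    (184, 417, "core domain: PLP-binding subdomain"),
    (418, 563, "core domain: subdomain 4"),
    (564, 715, "C-terminal domain"),
]


def domain_of(position: int) -> str:
    """Return the name of the domain containing a 1-based residue position."""
    for start, end, name in CADA_DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"position {position} is outside residues 1 to 715")


for residue in (100, 291, 499, 600):
    print(residue, domain_of(residue))
```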
CadA protein forms a two-fold symmetric dimer that completes the
active site of each monomer. Five dimers associate to form a
decamer that consists of a double-ringed structure with five-fold
symmetry. The decamer associates with other decamers to form
higher-order oligomers. It has been shown that in acidic conditions
(pH 5), CadA predominantly exists in the oligomeric state, and fewer
oligomers and decamers are found as the environment becomes more
basic. It was estimated that 25% of the enzymes exist as dimers and
75% exist as decamers at pH 6.5, while 95% of the enzymes exist as
dimers at pH 8.0 (Kanjee et al., The EMBO Journal 30, 931-944
2011). This decrease in oligomer formation coincides with the
decrease in decarboxylase activity observed as the pH of the
environment of the enzyme increases above 5.0.
Illustrative CadA polypeptides from E. coli, Salmonella enterica,
Klebsiella, and Enterobacteriaceae are provided in SEQ ID NOS:2-5,
which share greater than 90% sequence identity with one
another.
CadA polypeptides of the present invention comprise at least one
substitution of another amino acid for a glutamate at a position in
one of the segments of the protein in the core domain that are
without a defined secondary structure (α, alpha helix;
β, beta sheet; η, strand) (see Kanjee et al., The EMBO Journal 30,
931-944, 2011); and where
the glutamate is at the surface of the protein where the side chain
is oriented toward the external environment. In the present
disclosure, the amino acid that is substituted for the glutamate
does not occur at that position in a native CadA sequence. A
variant CadA polypeptide in accordance with the invention thus can
comprise at least one amino acid substitution at position E291,
E344, E355, E463, E482, or E499 as determined with reference to
SEQ ID NO:2. In some embodiments a variant CadA polypeptide
comprises more than one amino acid substitution, e.g., 2, 3, or
more substitutions, at positions E291, E344, E355, E463, E482, and
E499 as determined with reference to SEQ ID NO:2. In some
embodiments, the amino acid that is substituted for a glutamate is
selected from the group of amino acids consisting of alanine,
arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic
acid, glycine, histidine, isoleucine, leucine, lysine, methionine,
phenylalanine, proline, serine, threonine, tryptophan, tyrosine, or
valine, where the mutant does not have the same amino acid as the
wild-type sequence (SEQ ID NO: 2, 3, 4, or 5) at the same position.
In some embodiments, the amino acid that is substituted for a
glutamate has the ability to donate a hydrogen for hydrogen bond
formation. For example, C, Y, K, and N have pKa values greater
than 7, so their protonation state does not change when the pH
increases from 5 to 8. Therefore, any hydrogen bond formed at these
positions is more stable compared to when glutamate, which has an
acidic pKa, is present at those positions. The sulfur of M can act
as either a nucleophile or an electrophile and does not need a
proton to interact with other amino acid groups. M may thus further
stabilize a protein-protein interaction at that site. In some
embodiments, the amino acid that is substituted for a glutamate is
selected from the group consisting of C, H, K, S, A, F, L, M, N, R,
V, Y, D, G, I, P, Q, T, and W. In some embodiments, the amino acid
that is substituted for a glutamate is selected from the group
consisting of C, H, K, S, A, F, L, M, N, R, V, and Y. In some
embodiments, the amino acid that is substituted for a glutamate is
selected from the group consisting of C, Y, K, S, H, R, M, and N.
In some embodiments, the amino acid that is substituted for a
glutamate is selected from the group consisting of C, H, K, and
S.
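The pH argument above can be made concrete with the Henderson-Hasselbalch relationship, under which the protonated fraction of an ionizable group is 1/(1 + 10^(pH - pKa)). The sketch below is illustrative only: the side-chain pKa values are approximate textbook figures for free amino acids and are assumptions of this example, since the pKa of a residue in a folded protein can differ substantially. Asparagine is omitted because its side chain has no titratable group for which such a value could be assigned here.

```python
# Approximate textbook side-chain pKa values (assumptions for illustration only).
APPROX_SIDE_CHAIN_PKA = {
    "E": 4.25,   # glutamate carboxyl
    "C": 8.3,    # cysteine thiol
    "Y": 10.1,   # tyrosine phenol
    "K": 10.5,   # lysine amine
}


def fraction_protonated(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch: fraction of the group carrying its proton at this pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


for residue, pka in APPROX_SIDE_CHAIN_PKA.items():
    at_ph5 = fraction_protonated(pka, 5.0)
    at_ph8 = fraction_protonated(pka, 8.0)
    print(f"{residue}: protonated fraction {at_ph5:.3f} at pH 5, {at_ph8:.3f} at pH 8")
```

Under these assumed values, the glutamate side chain is largely deprotonated across pH 5 to 8, whereas lysine and tyrosine remain almost fully protonated; cysteine, with an assumed pKa near 8.3, lies closer to its transition at pH 8.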
In some embodiments, the variant CadA polypeptide is a variant of
CadA from E. coli in which at least one, two, three, four, five,
or all six of the glutamates at positions E291, E344, E355, E463,
E482, and E499 are substituted with another amino acid.
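As a consistency check offered by way of example, each of the six glutamate positions can be mapped to the core-domain section, recited in the Summary, in which it resides. The names used below are hypothetical; the ranges and positions are those given in the text with reference to SEQ ID NO:2.

```python
# Core-domain sections without defined secondary structure (per SEQ ID NO:2).
UNSTRUCTURED_SEGMENTS = [(276, 299), (314, 326), (344, 357), (454, 483), (494, 509)]

# Glutamate positions named in the text.
TARGET_GLUTAMATES = [291, 344, 355, 463, 482, 499]


def segment_containing(position: int):
    """Return the (start, end) section containing the position, or None."""
    for start, end in UNSTRUCTURED_SEGMENTS:
        if start <= position <= end:
            return (start, end)
    return None


for pos in TARGET_GLUTAMATES:
    print(f"E{pos} -> section {segment_containing(pos)}")
```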
In some embodiments, a variant CadA polypeptide of the invention
has at least 60% amino acid sequence identity, often at least 65%,
70%, 75%, 80%, or 85% identity; and typically at least 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid
sequence identity, over a region of 500 or more amino acids in
length, or over the length of, the CadA polypeptide of SEQ ID NO:2;
and has a substitution at a glutamic acid residue at at least one
of positions E291, E344, E355, E463, E482, or E499 as determined
with reference to SEQ ID NO:2. In some embodiments, the
substitution is E291A/C/D/H/R/V/G/K/N/S,
E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or E482C/F/I/L/S/W/Y/A/H/K/M. In
other embodiments, the substitution is E291A/C/D/H/R/V,
E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or E482C/F/I/L/S/W/Y.
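The slash-separated shorthand used above (e.g., E291A/C/D/H/R/V) denotes a set of alternative single substitutions at one position. The following sketch, using a hypothetical helper named `expand_substitutions`, expands such shorthand into individual substitutions and rejects entries in which a replacement equals the wild-type residue.

```python
import re

# Wild-type letter, position number, then one or more slash-separated replacements.
_SHORTHAND = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def expand_substitutions(shorthand: str) -> list:
    """Expand e.g. 'E291A/C' into ['E291A', 'E291C']."""
    match = _SHORTHAND.match(shorthand)
    if match is None:
        raise ValueError(f"unrecognized substitution shorthand: {shorthand!r}")
    wild_type, position, replacements = match.group(1), match.group(2), match.group(3)
    expanded = []
    for replacement in replacements.split("/"):
        if replacement == wild_type:
            raise ValueError("replacement must differ from the wild-type residue")
        expanded.append(f"{wild_type}{position}{replacement}")
    return expanded


print(expand_substitutions("E291A/C/D/H/R/V"))
```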
In some embodiments, a variant CadA polypeptide of the invention
has at least 60% amino acid sequence identity, often at least 65%,
70%, 75%, 80%, or 85% identity; and typically at least 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid
sequence identity, over a region of 500 or more amino acids in
length, or over the length of, the CadA polypeptide of SEQ ID NO:3;
and has a substitution at a glutamic acid residue at at least one
of positions E291, E344, E355, E463, or E482 as determined with
reference to SEQ ID NO:3. In some embodiments, the substitution is
E291A/C/D/H/R/V/G/K/N/S, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y/A/H/K/M. In other embodiments, the substitution
is E291A/C/D/H/R/V, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y.
In some embodiments, a variant CadA polypeptide of the invention
has at least 60% amino acid sequence identity, often at least 65%,
70%, 75%, 80%, or 85% identity; and typically at least 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid
sequence identity, over a region of 500 or more amino acids in
length, or over the length of, the CadA polypeptide of SEQ ID NO:4;
and has a substitution at a glutamic acid residue at at least one
of positions E291, E344, E355, E463, or E482 as determined with
reference to SEQ ID NO:4. In some embodiments, the substitution is
E291A/C/D/H/R/V/G/K/N/S, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y/A/H/K/M. In other embodiments, the substitution
is E291A/C/D/H/R/V, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y.
In some embodiments, a variant CadA polypeptide of the invention
has at least 60% amino acid sequence identity, often at least 65%,
70%, 75%, 80%, or 85% identity; and typically at least 90%, 91%,
92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or greater amino acid
sequence identity, over a region of 500 or more amino acids in
length, or over the length of, the CadA polypeptide of SEQ ID NO:5;
and has a substitution at a glutamic acid residue at at least one
of positions E291, E355, E463, or E482 as determined with reference
to SEQ ID NO:5. In some embodiments, the substitution is
E291A/C/D/H/R/V/G/K/N/S, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y/A/H/K/M. In other embodiments, the substitution
is E291A/C/D/H/R/V, E355C/F/H/K/L/M/N/P/Q/R/S/T/V/Y, or
E482C/F/I/L/S/W/Y.
Nucleic Acids Encoding CadA Variant Polypeptides
Isolation or generation of CadA polynucleotide sequences can be
accomplished by a number of techniques. In some embodiments,
oligonucleotide probes based on the sequences disclosed here
can be used to identify the desired polynucleotide in a cDNA or
genomic DNA library from a desired bacterial species. Desired
substitutions may be introduced into the CadA-encoding
polynucleotide sequence using appropriate primers, e.g., as
illustrated in the Examples section, to incorporate the desired
changes into the polynucleotide sequence. For instance, PCR may be
used to amplify the sequences of the genes directly from mRNA, from
cDNA, from genomic libraries or cDNA libraries and to introduce
desired substitutions.
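At the sequence level, introducing a desired substitution amounts to replacing the codon at the target residue. The sketch below is illustrative only and does not represent the primer-based procedure of the Examples; the function name, the check that the original codon encodes glutamate (GAA or GAG), and the toy coding sequence are all assumptions of this example.

```python
GLU_CODONS = {"GAA", "GAG"}  # DNA codons for glutamate in the standard code


def substitute_codon(cds: str, residue_number: int, new_codon: str) -> str:
    """Replace the glutamate codon at a 1-based residue number with new_codon."""
    cds = cds.upper()
    new_codon = new_codon.upper()
    if len(new_codon) != 3 or set(new_codon) - set("ACGT"):
        raise ValueError("new_codon must be three DNA bases")
    start = (residue_number - 1) * 3
    if residue_number < 1 or start + 3 > len(cds):
        raise ValueError("residue number is outside the coding sequence")
    original = cds[start:start + 3]
    if original not in GLU_CODONS:
        raise ValueError(f"residue {residue_number} is encoded by {original}, not a Glu codon")
    return cds[:start] + new_codon + cds[start + 3:]


# Toy coding sequence Met-Glu-Ala (hypothetical, not a CadA sequence).
print(substitute_codon("ATGGAAGCT", 2, "TGT"))  # Glu codon replaced by a Cys codon
```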
Appropriate primers and probes for identifying a CadA
polynucleotide in bacteria can be generated from comparisons of the
sequences provided herein or generated based on a CadA
polynucleotide sequence from another bacterium. For a general
overview of PCR see PCR Protocols: A Guide to Methods and
Applications (Innis, M., Gelfand, D., Sninsky, J. and White, T.,
eds.), Academic Press, San Diego (1990). Illustrative primer
sequences are shown in the Table of Primers in the Examples
section.
Nucleic acid sequences encoding an acid decarboxylase polypeptide
for use in the disclosure include genes and gene products
identified and characterized by techniques such as hybridization
and/or sequence analysis using illustrative nucleic acid sequences,
e.g., a cadA polynucleotide sequence of SEQ ID NO:1. In some
embodiments, a host cell is genetically modified by introducing a
nucleic acid sequence having at least 60% identity, or at least
70%, 75%, 80%, 85%, or 90% identity, or 95% identity, or greater,
to an acid decarboxylase polynucleotide, e.g., a cadA
polynucleotide of SEQ ID NO:1, wherein the nucleic acid comprises a
codon that encodes the desired amino acid to be substituted.
Nucleic acid sequences encoding a CadA variant protein in
accordance with the invention that confers increased production of
an amino acid derivative, e.g., cadaverine, to a host cell, may
additionally be codon-optimized for expression in a desired host
cell. Methods and databases that can be employed are known in the
art. For example, preferred codons may be determined in relation to
codon usage in a single gene, a set of genes of common function or
origin, highly expressed genes, the codon frequency in the
aggregate protein coding regions of the whole organism, codon
frequency in the aggregate protein coding regions of related
organisms, or combinations thereof (see, e.g., Henaut and Danchin in
"Escherichia coli and Salmonella," Neidhardt, et al. Eds., ASM
Press, Washington D.C. (1996), pp. 2047-2066; Nucleic Acids Res.
20:2111-2118; Nakamura et al., 2000, Nucl. Acids Res. 28:292).
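One of the approaches mentioned above, determining preferred codons from codon usage in a gene or set of genes, can be sketched as a simple tally. The example below assumes the standard genetic code and hypothetical function names, and uses a toy input sequence rather than real usage data; practical codon optimization typically also considers factors not modeled here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code for DNA codons, enumerated in TCAG order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(coding_sequences) -> Counter:
    """Tally in-frame codons across one or more coding sequences."""
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


def preferred_codons(coding_sequences) -> dict:
    """Map each observed amino acid to its most frequently used codon."""
    by_amino_acid = defaultdict(Counter)
    for codon, n in codon_counts(coding_sequences).items():
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is not None and amino_acid != "*":
            by_amino_acid[amino_acid][codon] = n
    return {aa: c.most_common(1)[0][0] for aa, c in by_amino_acid.items()}


# Toy reference set (hypothetical): Ala-Ala-Ala-Glu-Glu-Glu with mixed codons.
print(preferred_codons(["GCTGCTGCCGAAGAAGAG"]))
```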
Preparation of Recombinant Vectors
Recombinant vectors for expression of a variant CadA protein can be
prepared using methods well known in the art. For example, a DNA
sequence encoding a CadA variant polypeptide can be combined with
transcriptional and other regulatory sequences which will direct
the transcription of the sequence from the gene in the intended
cells, e.g., bacterial cells such as H. alvei, E. coli, or C.
glutamicum. In some embodiments, an expression vector that
comprises an expression cassette that comprises the gene encoding
the CadA variant polypeptide further comprises a promoter operably
linked to the nucleic acid sequence encoding the CadA variant
polypeptide. In other embodiments, a promoter and/or other
regulatory elements that direct transcription of the cadA
polynucleotide encoding a variant CadA polypeptide are endogenous
to the host cell and an expression cassette comprising the cadA
gene is introduced, e.g., by homologous recombination, such that
the exogenous gene is operably linked to an endogenous promoter and
its expression is driven by the endogenous promoter.
As noted above, expression of the polynucleotide encoding a CadA
variant polypeptide can be controlled by a number of regulatory
sequences including promoters, which may be either constitutive or
inducible; and, optionally, repressor sequences, if desired.
Examples of suitable promoters, especially in a bacterial host
cell, are the promoters obtained from the E. coli lac operon and
other promoters derived from genes involved in the metabolism of
other sugars, e.g., galactose and maltose. Additional examples
include promoters such as the trp promoter, bla promoter,
bacteriophage lambda PL, and T5. In addition, synthetic promoters,
such as the tac promoter (U.S. Pat. No. 4,551,433), can be used.
Further examples of promoters include those of the Streptomyces
coelicolor agarase gene (dagA), Bacillus subtilis levansucrase gene
(sacB), Bacillus licheniformis alpha-amylase gene (amyL), Bacillus
stearothermophilus maltogenic amylase gene (amyM), Bacillus
amyloliquefaciens alpha-amylase gene (amyQ), Bacillus licheniformis
penicillinase gene (penP), and Bacillus subtilis xylA and xylB genes.
Suitable promoters are also described in Ausubel and Sambrook &
Russell, both supra. Additional promoters include promoters
described by Jensen & Hammer, Appl. Environ. Microbiol. 64:82,
1998; Shimada, et al., J. Bacteriol. 186:7112, 2004; and Miksch et
al., Appl. Microbiol. Biotechnol. 69:312, 2005.
In some embodiments, a promoter that influences expression of a
cadA gene encoding a CadA variant polypeptide of the invention may
be modified to increase expression. For example, an endogenous CadA
promoter may be replaced by a promoter that provides for increased
expression compared to the native promoter.
An expression vector may also comprise additional sequences that
influence expression of a polynucleotide encoding the CadA variant
polypeptide. Such sequences include enhancer sequences, a ribosome
binding site, or other sequences such as transcription termination
sequences, and the like.
A vector expressing a polynucleotide encoding a CadA variant
polypeptide of the invention may be an autonomously replicating
vector, i.e., a vector which exists as an extrachromosomal entity,
the replication of which is independent of chromosomal replication,
e.g., a plasmid, an extrachromosomal element, a minichromosome, or
an artificial chromosome. The vector may contain any means for
assuring self-replication. Alternatively, the vector may be one
which, when introduced into the host, is integrated into the genome
and replicated together with the chromosome(s) into which it has
been integrated. Thus, an expression vector may additionally
contain an element(s) that permits integration of the vector into
the host's genome.
An expression vector of the invention preferably contains one or
more selectable markers which permit easy selection of transformed
hosts. For example, an expression vector may comprise a gene that
confers antibiotic resistance (e.g., ampicillin, kanamycin,
chloramphenicol or tetracycline resistance) to the recombinant host
organism, e.g., a bacterial cell such as E. coli, H. alvei, or C.
glutamicum.
Although any suitable expression vector may be used to incorporate
the desired sequences, readily available bacterial expression
vectors include, without limitation: plasmids such as pSC101,
pBR322, pBBR1MCS-3, pUR, pET, pEX, pMR100, pCR4, pBAD24, p15a,
pACYC, pUC, e.g., pUC18 or pUC19, or plasmids derived from these
plasmids; and bacteriophages, such as M13 phage and λ phage.
One of ordinary skill in the art, however, can readily determine
through routine experimentation whether any particular expression
vector is suited for any given host cell. For example, the
expression vector can be introduced into the host cell, which is
then monitored for viability and expression of the sequences
contained in the vector.
Expression vectors of the invention may be introduced into the host
cell using any number of well-known methods, including calcium
chloride-based methods, electroporation, or any other method known
in the art.
Host Cells
The present invention provides for a genetically modified host cell
that is engineered to express a CadA variant polypeptide of the
invention. A genetically modified host strain of the present
invention typically comprises at least one additional genetic
modification to enhance production of an amino acid or amino acid
derivative relative to a control strain that does not have the one
additional genetic modification, e.g., a wildtype strain or a cell
of the same strain without the one additional genetic modification.
An "additional genetic modification to enhance production of an
amino acid or amino acid derivative" can be any genetic
modification. In some embodiments, the genetic modification is the
introduction of a polynucleotide that expresses an enzyme involved
in the synthesis of the amino acid or amino acid derivative. In
some embodiments, the host cell comprises multiple modifications to
increase production, relative to a wildtype host cell, of an amino
acid or amino acid derivative.
In some aspects, genetic modification of a host cell to express a
CadA variant polypeptide is performed in conjunction with modifying
the host cell to overexpress one or more lysine biosynthesis
polypeptides.
In some embodiments, a host cell may be genetically modified to
express one or more polypeptides that affect lysine biosynthesis.
Examples of lysine biosynthesis polypeptides include the E. coli
genes SucA, Ppc, AspC, LysC, Asd, DapA, DapB, DapD, ArgD, DapE,
DapF, LysA, Ddh, PntAB, CyoABE, GadAB, YbjE, GdhA, GltA, SucC,
GadC, AcnB, PflB, ThrA, AceA, AceB, GltB, AceE, SdhA, MurE, SpeE,
SpeG, PuuA, PuuP, and YgjG, or the corresponding genes from other
organisms. Such genes are known in the art (see, e.g., Shah et al.,
J. Med. Sci. 2:152-157, 2002; Anastassiadia, S. Recent Patents on
Biotechnol. 1: 11-24, 2007). See, also, Kind, et al., Appl.
Microbiol. Biotechnol. 91: 1287-1296, 2011 for a review of genes
involved in cadaverine production. Illustrative genes encoding
lysine biosynthesis polypeptides are provided below.
Protein | Gene | EC Number | GenBank Accession No.
α-ketoglutarate dehydrogenase (SucA) | sucA | 1.2.4.2 | YP_489005.1
Phosphoenolpyruvate carboxylase (PPC) | ppc | 4.1.1.31 | AAC76938.1
aspartate transaminase (AspC) | aspC | 2.6.1.1 | AAC74014.1
aspartate kinase (LysC) | lysC | 2.7.2.4 | NP_418448.1
aspartate semialdehyde dehydrogenase (Asd) | asd | 1.2.1.11 | AAC76458.1
dihydrodipicolinate synthase (DapA) | dapA | 4.3.3.7 | NP_416973.1
dihydrodipicolinate reductase (DapB) | dapB | 1.17.1.8 | AAC73142.1
tetrahydrodipicolinate succinylase (DapD) | dapD | 2.3.1.117 | AAC73277.1
N-succinyldiaminopimelate aminotransferase (ArgD) | argD | 2.6.1.11 | AAC76384.1
N-succinyl-L-diaminopimelate deacylase (DapE) | dapE | 3.5.1.18 | AAC75525.1
diaminopimelate epimerase (DapF) | dapF | 5.1.1.7 | AAC76812.2
diaminopimelate decarboxylase (LysA) | lysA | 4.1.1.20 | AAC75877.1
meso-diaminopimelate dehydrogenase (Ddh) | ddh | NA | P04964.1
pyridine nucleotide transhydrogenase (PntAB) | pntAB | NA | AAC74675.1, AAC74674.1
cytochrome O oxidase (CyoABE) | cyoABE | 1.10.3.10 | AAC73535.1, AAC73534.1, AAC73531.1
glutamate decarboxylase (GadAB) | gadAB | 4.1.1.15 | AAC76542.1, AAC74566.1
L-amino acid efflux transporter (YbjE) | ybjE | NA | AAC73961.2
glutamate dehydrogenase (GdhA) | gdhA | 1.4.1.4 | AAC74831.1
citrate synthase (GltA) | gltA | 2.3.3.1/2.3.3.16 | AAC73814.1
succinyl-coA synthase (SucC) | sucC | 6.2.1.5 | AAC73822.1
glutamate-GABA antiporter (GadC) | gadC | NA | AAC74565.1
aconitase B (AcnB) | acnB | 4.2.1.99 | AAC73229.1
pyruvate-formate lyase (PflB) | pflB | NA | AAC73989.1
aspartate kinase/homoserine dehydrogenase (ThrA) | thrA | 2.7.2.4 | AAC73113.1
isocitrate lyase (AceA) | aceA | 4.1.3.1 | AAC76985.1
malate synthase (AceB) | aceB | 2.3.3.9 | AAC76984.1
glutamate synthase (GltB) | gltB | 1.4.1.13 | AAC76244.2
pyruvate dehydrogenase (AceE) | aceE | 1.2.4.1 | AAC73225.1
succinate dehydrogenase (SdhA) | sdhA | 1.3.5.1 | AAC73817.1
UDP-N-acetylmuramoyl-L-alanyl-D-glutamate:meso-diaminopimelate ligase (MurE) | murE | 6.3.2.13 | AAC73196.1
putrescine/cadaverine aminopropyltransferase (SpeE) | speE | 2.5.1.16 | AAC73232.1
spermidine acetyltransferase (SpeG) | speG | NA | AAC74656.1
glutamate-putrescine/glutamate-cadaverine ligase (PuuA) | puuA | NA | AAC74379.2
putrescine importer (PuuP) | puuP | NA | AAC74378.2
putrescine/cadaverine aminotransferase (YgjG) | ygjG | 2.6.1.82 | AAC76108.3
In some embodiments, a host cell may be genetically modified to
attenuate or reduce the expression of one or more polypeptides that
affect lysine biosynthesis. Examples of such polypeptides include
the E. coli genes Pck, Pgi, DeaD, CitE, MenE, PoxB, AceA, AceB,
AceE, RpoC, and ThrA, or the corresponding genes from other
organisms. Such genes are known in the art (see, e.g., Shah et al.,
J. Med. Sci. 2:152-157, 2002; Anastassiadia, S. Recent Patents on
Biotechnol. 1: 11-24, 2007). See, also, Kind, et al., Appl.
Microbiol. Biotechnol. 91: 1287-1296, 2011 for a review of genes
attenuated to increase cadaverine production. Illustrative genes
encoding polypeptides whose attenuation increases lysine
biosynthesis are provided below.
Protein | Gene | EC Number | GenBank Accession No.
PEP carboxykinase (Pck) | pck | 4.1.1.49 | NP_417862
Glucose-6-phosphate isomerase (Pgi) | pgi | 5.3.1.9 | NP_418449
DEAD-box RNA helicase (DeaD) | deaD | | NP_417631
citrate lyase (CitE) | citE | 4.1.3.6/4.1.3.34 | NP_415149
o-succinylbenzoate-CoA ligase (MenE) | menE | 6.2.1.26 | NP_416763
pyruvate oxidase (PoxB) | poxB | 1.2.2.2 | NP_415392
isocitrate lyase (AceA) | aceA | 4.1.3.1 | NP_418439
malate synthase A (AceB) | aceB | 2.3.3.9 | NP_418438
pyruvate dehydrogenase (aceE) | aceE | 1.2.4.1 | NP_414656
RNA polymerase b' subunit (RpoC) | rpoC | 2.7.7.6 | NP_418415
aspartokinase I (ThrA) | thrA | 2.7.2.4/1.1.1.3 | NP_414543
Nucleic acids encoding a lysine biosynthesis polypeptide may be
introduced into the host cell along with a polynucleotide encoding
a CadA variant polypeptide, e.g., encoded on a single expression
vector, or introduced in multiple expression vectors at the same
time. Alternatively, the host cell may be genetically modified to
overexpress one or more lysine biosynthesis polypeptides before or
after the host cell is genetically modified to express a CadA variant
polypeptide.
A host cell engineered to express a CadA variant polypeptide is
typically a bacterial host cell. In typical embodiments, the
bacterial host cell is a Gram-negative bacterial host cell. In some
embodiments of the invention, the bacterium is an enteric
bacterium. In some embodiments of the invention, the bacterium is a
species of the genus Corynebacterium, Escherichia, Pseudomonas,
Zymomonas, Shewanella, Salmonella, Shigella, Enterobacter,
Citrobacter, Cronobacter, Erwinia, Serratia, Proteus, Hafnia,
Yersinia, Morganella, Edwardsiella, or Klebsiella.
In some embodiments, the host cells are members of the
genus Escherichia, Hafnia, or Corynebacterium. In some embodiments,
the host cell is an Escherichia coli, Hafnia alvei, or
Corynebacterium glutamicum host cell. In some embodiments, the host
cell is Escherichia coli. In some embodiments, the host cell is
Hafnia alvei. In some embodiments, the host cell is Corynebacterium
glutamicum.
In some embodiments, the host cell is a Gram-positive bacterial
host cell, such as a Bacillus sp., e.g., Bacillus subtilis or
Bacillus licheniformis; or another Bacillus sp. such as B.
alcalophilus, B. aminovorans, B. amyloliquefaciens, B.
caldolyticus, B. circulans, B. stearothermophilus, B.
thermoglucosidasius, B. thuringiensis or B. vulgatis.
Host cells modified in accordance with the invention can be
screened for increased production of lysine or a lysine derivative,
such as cadaverine, as described herein.
In some embodiments, a CadA variant polypeptide of the present
invention may be recovered from a host cell that expresses the
variant polypeptide. In some embodiments, the recovered variant
protein may be immobilized onto a solid substrate or inert material
to form an immobilized enzyme. In one embodiment, the immobilized
enzyme may have greater operational stability than the soluble
form of the fusion protein.
Methods of Producing Lysine or a Lysine Derivative.
A host cell genetically modified to overexpress a CadA variant
polypeptide of the invention can be employed to produce lysine or a
derivative of lysine. In some embodiments, the host cell produces
cadaverine. Thus, for example, to produce cadaverine, a host cell
genetically modified to express a CadA variant polypeptide as
described herein can be cultured under conditions suitable to allow
expression of the polypeptide and expression of genes that encode
the enzymes that are used to produce lysine and/or cadaverine. A
host cell modified in accordance with the invention to express a
CadA variant polypeptide provides a higher yield of cadaverine
relative to a counterpart host cell that expresses native CadA.
Host cells may be cultured using well known techniques (see, e.g.,
the illustrative conditions provided in the examples section).
In some embodiments, host cells are cultured using nitrogen sources
that are not salts, such as ammonia or urea, rather than ammonium
salts (e.g., ammonium sulfate or ammonium chloride). Host cells may
be cultured at an alkaline
pH during cell growth or enzyme production.
The lysine or lysine derivative can then be separated and purified
using known techniques. Lysine or lysine derivatives, e.g.,
cadaverine, produced in accordance with the invention may then be
used in any known process, e.g., to produce a polyamide.
In some embodiments, lysine may be converted to caprolactam using
chemical catalysts or by using enzymes and chemical catalysts.
The present invention will be described in greater detail by way of
specific examples. The following examples are offered for
illustrative purposes, and are not intended to limit the invention
in any manner. Those of skill in the art will readily recognize a
variety of noncritical parameters, which can be changed or modified
to yield essentially the same results.
EXAMPLES
We hypothesized that the decamer-decamer interface contains
segments of the protein in the core domain, and is composed of
sections of the protein without a defined secondary structure
(α, alpha helix; β, beta sheet; η, strand) (see
Kanjee et al.) within those domains. These sections are the amino
acids 276-299 between β10 and β11 that include η4
and α11, the amino acids 314-326 between β11 and
α12 that include β12, the amino acids 344-357 between
η6 and β14, the amino acids 454-483 between β16 and
β18 that include β17, and the amino acids 494-509
between β19 and α17. These examples show that glutamate
residues at these positions can be substituted to increase
production of cadaverine by host cells that are genetically
modified to express the CadA variant polypeptide.
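As a rough illustration of the hypothesis above, the five segments can be written as inclusive residue ranges and a residue position checked against them. The following Python sketch is not part of the original disclosure: the names INTERFACE_SEGMENTS and segment_containing are hypothetical, and only the residue ranges and the tested positions are taken from the text.

```python
# Inclusive residue ranges of the predicted decamer-decamer interface
# segments, as listed in the paragraph above.
INTERFACE_SEGMENTS = [(276, 299), (314, 326), (344, 357), (454, 483), (494, 509)]


def segment_containing(position):
    """Return the (start, end) segment that contains position, or None."""
    for start, end in INTERFACE_SEGMENTS:
        if start <= position <= end:
            return (start, end)
    return None


# Glutamate positions that are mutated to cysteine in Example 2.
for pos in (291, 344, 355, 463, 482, 499):
    print(pos, segment_containing(pos))
```

Each of the six glutamate positions falls inside one of the listed segments, which is consistent with their description as predicted interfacial residues.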
Example 1: Construction of Plasmid Vectors that Encode CadA
Wild-type E. coli cadA (SEQ ID NO: 1), which encodes the lysine
decarboxylase CadA (SEQ ID NO: 2), was
amplified from E. coli MG1655 K12 genomic DNA using the PCR
primers cadA-F and cadA-R (FIG. 1), digested using the restriction
enzymes SacI and XbaI, and ligated into pUC18 to generate the
plasmid pCIB60. The 5' sequence upstream of the cadA gene was
optimized using the PCR primers cadA-F2 and cadA-R2 to create
pCIB71.
Example 2: Construction of Plasmid Vectors that Encode CadA with
Cysteine Mutations at the Predicted Interfacial Amino Acid
Residues
Primer pairs were designed to mutate the codon at amino acid position
291, 344, 355, 463, 482, or 499 of the cadA gene in pCIB71 to a
cysteine codon using Quickchange PCR. The mutations were verified
using DNA sequencing, and the plasmids carrying the cysteine
mutations were labeled pCIB71-E291C, pCIB71-E344C, pCIB71-E355C,
pCIB71-E463C, pCIB71-E482C, and pCIB71-E499C.
Example 3: Lysine Decarboxylase Activity of Mutant CadA
Polypeptides with Cysteine Mutations at the Predicted Interfacial
Amino Acid Residues
H. alvei was transformed with pCIB71-E291C, pCIB71-E344C,
pCIB71-E355C, pCIB71-E463C, pCIB71-E482C, or pCIB71-E499C. Three
single colonies from each transformation were grown overnight at
37° C. in 4 mL of LB medium with ampicillin (100 µg/mL).
The following day, 0.7 mL of each overnight culture was added to
0.3 mL of lysine-HCl and PLP to a final concentration of 120 g/L
and 0.1 mM, respectively. The final mixture was adjusted to pH 8.0
with 1 M NaOH. Each mixture was incubated at 37° C. for 2
hours. Cadaverine production from each sample was quantified using
NMR, and yield was calculated by dividing the molar amount of
cadaverine produced by the molar amount of lysine added. The
average yield from each sample relative to the average yield from
H. alvei transformed with pCIB71 after 2 hours is presented in
Table 2.
TABLE 2 Relative cadaverine yield at pH 8 by H. alvei strains
expressing plasmids encoding CadA polypeptides with mutations at
the predicted interfacial amino acid residues.
Plasmid | Relative Yield (%)
pCIB71 | 100
pCIB71-E291C | 188
pCIB71-E344C | 110
pCIB71-E355C | 170
pCIB71-E463C | 120
pCIB71-E482C | 137
pCIB71-E499C | 112
As shown in Table 2, several mutations improved the activity of the
CadA polypeptide at pH 8.0. The mutations E291C, E355C, and E482C
significantly increased relative yield by more than 30%. The
mutations E344C, E463C, and E499C increased yield by 10% to
30%.
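The grouping described for Table 2 can be reproduced with a short script. This is only an illustrative sketch: the function name classify and the wording of the category labels are hypothetical, while the yields are copied from Table 2 and the 10% and 30% cut-offs come from the paragraph above.

```python
# Relative cadaverine yields (%) at pH 8.0, copied from Table 2.
TABLE_2 = {
    "E291C": 188, "E344C": 110, "E355C": 170,
    "E463C": 120, "E482C": 137, "E499C": 112,
}


def classify(relative_yield):
    """Bin a relative yield by its percent increase over pCIB71 (100%)."""
    increase = relative_yield - 100
    if increase > 30:
        return "more than 30%"
    if increase >= 10:
        return "10% to 30%"
    return "less than 10%"


groups = {}
for mutation, value in TABLE_2.items():
    groups.setdefault(classify(value), []).append(mutation)

for label, mutations in groups.items():
    print(label, mutations)
```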
Example 4: Construction of Plasmid Vectors that Encode CadA with
Cysteine Mutations at Predicted Interfacial Amino Acid Residues
with Deviant pKa's and Analysis of Lysine Decarboxylase of the
Mutated Proteins
This example shows that modification of these predicted interfacial
amino acid residues with deviant pKa's did not enhance cadaverine
production.
Primer pairs were designed to mutate the codon at amino acid position
279, 288, 319, 323, 346, 353, 357, or 470 of the cadA gene in pCIB71
to a cysteine codon using Quickchange PCR. The mutations were verified
using DNA sequencing, and the plasmids carrying the cysteine
mutations were labeled pCIB71-E279C, pCIB71-R288C, pCIB71-K319C,
pCIB71-D323C, pCIB71-K346C, pCIB71-R353C, pCIB71-K357C, and
pCIB71-D470C.
H. alvei was transformed with pCIB71-E279C, pCIB71-R288C,
pCIB71-K319C, pCIB71-D323C, pCIB71-K346C, pCIB71-R353C,
pCIB71-K357C, and pCIB71-D470C. Three single colonies from each
transformation were grown overnight at 37° C. in 4 mL of LB
medium with ampicillin (100 µg/mL). The following day, 0.7 mL of
each overnight culture was added to 0.3 mL of lysine-HCl and PLP to
a final concentration of 120 g/L and 0.1 mM, respectively. The
final mixture was adjusted to pH 8.0 with 1 M NaOH. Each mixture was
incubated at 37° C. for 2 hours. Cadaverine production from
each sample was quantified using NMR, and yield was calculated by
dividing the molar amount of cadaverine produced by the molar
amount of lysine added. The average yield from each sample relative
to the average yield from H. alvei transformed with pCIB71 after 2
hours is presented in Table 3. As shown in Table 3, all of the
mutations decreased cadaverine yield at pH 8.0.
TABLE 3 Relative cadaverine yield at pH 8 by H. alvei strains
expressing plasmids encoding CadA polypeptides with mutations at
the predicted interfacial amino acid residues with deviant pKa's.
Plasmid | Relative Yield (%)
pCIB71 | 100
pCIB71-E279C | 60
pCIB71-R288C | 67
pCIB71-K319C | 82
pCIB71-D323C | 85
pCIB71-K346C | 81
pCIB71-R353C | 75
pCIB71-K357C | 79
pCIB71-D470C | 60
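The yield calculation used in Examples 3 and 4 (moles of cadaverine produced divided by moles of lysine added, then expressed relative to the pCIB71 control) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: it assumes that the lysine-HCl is L-lysine monohydrochloride (182.65 g/mol) and that the reaction volume is 1.0 mL (0.7 mL culture plus 0.3 mL substrate); the cadaverine amounts are hypothetical placeholders, and all function names are illustrative.

```python
LYSINE_HCL_MW = 182.65  # g/mol; assumes L-lysine monohydrochloride


def lysine_added_mmol(conc_g_per_l=120.0, volume_ml=1.0):
    """Millimoles of lysine-HCl present (g/L x mL = mg; mg / (g/mol) = mmol)."""
    return conc_g_per_l * volume_ml / LYSINE_HCL_MW


def molar_yield(cadaverine_mmol, lysine_mmol):
    """Molar amount of cadaverine produced per molar amount of lysine added."""
    return cadaverine_mmol / lysine_mmol


def relative_yield_percent(sample_yield, control_yield):
    """Yield of a sample as a percentage of the control (pCIB71) yield."""
    return 100.0 * sample_yield / control_yield


lysine = lysine_added_mmol()  # about 0.657 mmol in the assumed 1.0 mL

# Hypothetical NMR results in mmol cadaverine, for illustration only.
control = molar_yield(0.20, lysine)
variant = molar_yield(0.30, lysine)
print(round(relative_yield_percent(variant, control)))
```

Because both yields share the same denominator, the relative yield depends only on the ratio of cadaverine produced, so the assumed reaction volume affects the absolute yields but not the relative values reported in the tables.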
Example 5: Construction of Plasmid Vectors that Encode CadA with a
Mutation at E291
Primer pairs were designed to modify the amino acid at position 291
of the cadA gene in pCIB71 using Quickchange PCR. The mutations
were verified using DNA sequencing, and the plasmids carrying each
mutation at amino acid position 291 were labeled pCIB71-E291X,
where X is the amino acid that replaced the glutamate residue.
Example 6: Construction of Plasmid Vectors that Encode CadA with a
Mutation at E355
Primer pairs were designed to modify the amino acid at position 355
of the cadA gene in pCIB71 using Quickchange PCR. The mutations
were verified using DNA sequencing, and the plasmids carrying each
mutation at amino acid position 355 were labeled pCIB71-E355X,
where X is the amino acid that replaced the glutamate residue.
Example 7: Construction of Plasmid Vectors that Encode CadA with a
Mutation at E482
Primer pairs were designed to modify the amino acid at position 482
of the cadA gene in pCIB71 using Quickchange PCR. The mutations
were verified using DNA sequencing, and the plasmids carrying each
mutation at amino acid position 482 were labeled pCIB71-E482X,
where X is the amino acid that replaced the glutamate residue.
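In Examples 5 to 7, X stands for each amino acid that replaces the glutamate residue, and Table 8 lists 19 variants at each position. The naming scheme can be sketched as below; the function name variant_plasmid_names is hypothetical and the script simply enumerates the standard amino acids other than the wild-type residue.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def variant_plasmid_names(backbone, wild_type, position):
    """Return plasmid labels such as pCIB71-E291A for every non-wild-type residue."""
    return [
        f"{backbone}-{wild_type}{position}{aa}"
        for aa in AMINO_ACIDS
        if aa != wild_type
    ]


for position in (291, 355, 482):
    names = variant_plasmid_names("pCIB71", "E", position)
    print(position, len(names), names[0], names[-1])
```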
Example 8: Lysine Decarboxylase Activity of Mutant CadA
Polypeptides with an E291X Mutation
H. alvei was transformed with pCIB71-E291X. Three single colonies
from each transformation were grown overnight at 37° C. in 4
mL of LB medium with ampicillin (100 µg/mL). The following day,
0.7 mL of each overnight culture was added to 0.3 mL of lysine-HCl
and PLP to a final concentration of 120 g/L and 0.1 mM,
respectively. The final mixture was adjusted to pH 8.0 with 1 M
NaOH. Each mixture was incubated at 37° C. for 2 hours.
Cadaverine production from each sample was quantified using NMR,
and yield was calculated by dividing the molar amount of cadaverine
produced by the molar amount of lysine added. The yield from each
sample relative to the average yield from H. alvei transformed with
pCIB71 after 2 hours is presented in Table 4.
TABLE 4 Relative cadaverine yield at pH 8 by H. alvei strains
expressing plasmids encoding CadA polypeptides with mutations at
amino acid position 291.
Plasmid | Relative Yield (%)
pCIB71 | 100
pCIB71-E291A | 126
pCIB71-E291C | 148
pCIB71-E291D | 135
pCIB71-E291F | 82
pCIB71-E291G | 120
pCIB71-E291H | 131
pCIB71-E291I | 98
pCIB71-E291K | 111
pCIB71-E291L | 100
pCIB71-E291M | 97
pCIB71-E291N | 120
pCIB71-E291P | 4
pCIB71-E291Q | 98
pCIB71-E291R | 127
pCIB71-E291S | 108
pCIB71-E291T | 96
pCIB71-E291V | 135
pCIB71-E291W | 74
pCIB71-E291Y | 80
As shown in Table 4, several mutations at amino acid position 291
improved the activity of the CadA polypeptide at pH 8.0. The
mutations E291A, E291C, E291D, E291H, E291R, and E291V increased
relative yield by more than 25%. The mutations E291G, E291K, E291N,
and E291S also increased yield. The mutations E291I, E291L, E291M,
E291Q, and E291T had little effect on yield. The remaining
mutations E291F, E291P, E291W, and E291Y decreased yield.
Example 9: Lysine Decarboxylase Activity of Mutant CadA
Polypeptides with an E355X Mutation
H. alvei was transformed with pCIB71-E355X. Three single colonies
from each transformation were grown overnight at 37° C. in 4
mL of LB medium with ampicillin (100 µg/mL). The following day,
0.7 mL of each overnight culture was added to 0.3 mL of lysine-HCl
and PLP to a final concentration of 120 g/L and 0.1 mM,
respectively. The final mixture was adjusted to pH 8.0 with 1 M
NaOH. Each mixture was incubated at 37° C. for 2 hours.
Cadaverine production from each sample was quantified using NMR,
and yield was calculated by dividing the molar amount of cadaverine
produced by the molar amount of lysine added. The yield from each
sample relative to the average yield from H. alvei transformed with
pCIB71 after 2 hours is presented in Table 5.
TABLE 5 Relative cadaverine yield at pH 8 by H. alvei strains
expressing plasmids encoding CadA polypeptides with mutations at
amino acid position 355.
Plasmid | Relative Yield (%)
pCIB71 | 100
pCIB71-E355A | 0
pCIB71-E355C | 157
pCIB71-E355D | 97
pCIB71-E355F | 149
pCIB71-E355G | 0
pCIB71-E355H | 147
pCIB71-E355I | 104
pCIB71-E355K | 139
pCIB71-E355L | 136
pCIB71-E355M | 141
pCIB71-E355N | 135
pCIB71-E355P | 150
pCIB71-E355Q | 151
pCIB71-E355R | 152
pCIB71-E355S | 143
pCIB71-E355T | 148
pCIB71-E355V | 145
pCIB71-E355W | 111
pCIB71-E355Y | 141
As shown in Table 5, the majority of mutations at amino acid
position 355 improved yield at pH 8.0, with the exception of E355A,
E355D, E355G, E355I, and E355W. The mutations E355D, E355I, and
E355W had little effect on yield. The mutations E355A and E355G
decreased yield.
Example 10: Lysine Decarboxylase Activity of Mutant CadA
Polypeptides with an E482X Mutation
H. alvei was transformed with pCIB71-E482X. Three single colonies
from each transformation were grown overnight at 37° C. in 4
mL of LB medium with ampicillin (100 µg/mL). The following day,
0.7 mL of each overnight culture was added to 0.3 mL of lysine-HCl
and PLP to a final concentration of 120 g/L and 0.1 mM,
respectively. The final mixture was adjusted to pH 8.0 with 1 M
NaOH. Each mixture was incubated at 37° C. for 2 hours.
Cadaverine production from each sample was quantified using NMR,
and yield was calculated by dividing the molar amount of cadaverine
produced by the molar amount of lysine added. The yield from each
sample relative to the average yield from H. alvei transformed with
pCIB71 after 2 hours is presented in Table 6.
TABLE 6 Relative cadaverine yield at pH 8 by H. alvei strains
expressing plasmids encoding CadA polypeptides with mutations at
amino acid position 482.
Plasmid | Relative Yield (%)
pCIB71 | 100
pCIB71-E482A | 122
pCIB71-E482C | 140
pCIB71-E482D | 94
pCIB71-E482F | 135
pCIB71-E482G | 86
pCIB71-E482H | 119
pCIB71-E482I | 147
pCIB71-E482K | 112
pCIB71-E482L | 128
pCIB71-E482M | 116
pCIB71-E482N | 93
pCIB71-E482P | 69
pCIB71-E482Q | 73
pCIB71-E482R | 89
pCIB71-E482S | 126
pCIB71-E482T | 74
pCIB71-E482V | 85
pCIB71-E482W | 126
pCIB71-E482Y | 138
As shown in Table 6, several mutations at amino acid position 482
improved the activity of the CadA polypeptide at pH 8.0. The
mutations E482C, E482F, E482I, E482L, E482S, E482W, and E482Y
increased relative yield by more than 25%. The mutations E482A,
E482H, E482K, and E482M also increased yield. The mutations E482D
and E482N had little effect on yield. The remaining mutations
E482G, E482P, E482Q, E482R, E482T, and E482V decreased yield.
Example 11: In Vitro Kinetic Analysis of Mutant CadA Polypeptides
at Alkaline pH Conditions
100 mL samples of H. alvei transformed with pCIB71,
pCIB71-E291C, pCIB71-E355C, or pCIB71-E482C were lysed with a
French press. The lysed samples were centrifuged, and the
supernatant was separated from the pellet in order to perform in
vitro experiments. Each reaction was performed in Tris-HCl buffer
(50 mM Tris-HCl at either pH 6 or pH 8, 25 mM NaCl, 2 mM EDTA) with
120 g/L lysine-HCl and 0.1 mM PLP. The reaction rate of each lysed
sample was measured using NMR by sampling the amount of lysine
converted in the presence of PLP into cadaverine every 1.6 minutes
for a total of 20 minutes, and taking the slope of the linear
portion of the yield curve. The samples were diluted so that the
reaction rate U (mmol/min/mL) of each sample was 4. The kinetic
constants Vmax and Km for lysine of each lysed sample were measured
using the same U at an initial pH of either 6 or 8. The results
of the kinetic analysis of the samples are shown in Table
7.
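Taking the slope of the linear portion of a yield curve, as described above, can be illustrated with an ordinary least-squares fit. The sketch below is an assumption-laden illustration rather than the authors' analysis: the time course is hypothetical (a fractional yield that rises at 0.02 per minute and then levels off), the number of points treated as linear is chosen by hand, and the function name least_squares_slope is illustrative. Only the sampling interval of 1.6 minutes over 20 minutes comes from the text.

```python
def least_squares_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    denominator = sum((t - mean_t) ** 2 for t in times)
    return numerator / denominator


# Sampling times in minutes: every 1.6 minutes within 20 minutes (0 to 19.2).
times = [round(1.6 * i, 1) for i in range(13)]

# Hypothetical fractional yields, not data from the patent: linear for the
# first six samples, then constant.
plateau = 0.02 * times[5]
yields = [0.02 * t for t in times[:6]] + [plateau] * 7

linear_points = 6  # assumed extent of the linear portion
rate = least_squares_slope(times[:linear_points], yields[:linear_points])
print(round(rate, 4))
```

Fitting all 13 points instead of the first six would give a lower slope, which is why the fit is restricted to the linear portion.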
TABLE 7 Kinetic analysis of normalized Vmax of lysed samples of
H. alvei expressing plasmids encoding wild-type or mutant CadA
polypeptides under different pH conditions.
pH | pCIB71 | pCIB71-E291C | pCIB71-E355C | pCIB71-E482C
6 | 100% | 100% | 100% | 100%
8 | 73% | 88% | 91% | 90%
As shown in Table 7, wild-type CadA (pCIB71) lost 27% of activity
at pH 8 compared to pH 6. Surprisingly, the mutant CadA
polypeptides (pCIB71-E291C, pCIB71-E355C, and pCIB71-E482C) showed
significantly higher activity at pH 8 compared to wild-type CadA
polypeptide despite there being no significant difference in
activity between the wild-type and mutants at pH 6.
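The activity lost at pH 8 follows directly from the normalized values in Table 7. The short sketch below recomputes it; the dictionary and function names are hypothetical, and the percentages are copied from Table 7.

```python
# Normalized Vmax at pH 8 (% of the pH 6 value), copied from Table 7.
TABLE_7_PH8 = {
    "pCIB71": 73,
    "pCIB71-E291C": 88,
    "pCIB71-E355C": 91,
    "pCIB71-E482C": 90,
}


def activity_lost(percent_retained):
    """Percentage of pH 6 activity that is lost at pH 8."""
    return 100 - percent_retained


losses = {plasmid: activity_lost(value) for plasmid, value in TABLE_7_PH8.items()}
wild_type_loss = losses["pCIB71"]

for plasmid, loss in losses.items():
    # Third column: percentage points of activity preserved relative to wild type.
    print(plasmid, loss, wild_type_loss - loss)
```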
TABLE 8 Table of plasmids and strains used in Examples.
Host | Protein(s) Overexpressed | Plasmid
Hafnia alvei | CadA | pCIB71
Hafnia alvei | CadA E291A | pCIB71-E291A
Hafnia alvei | CadA E291C | pCIB71-E291C
Hafnia alvei | CadA E291D | pCIB71-E291D
Hafnia alvei | CadA E291F | pCIB71-E291F
Hafnia alvei | CadA E291G | pCIB71-E291G
Hafnia alvei | CadA E291H | pCIB71-E291H
Hafnia alvei | CadA E291I | pCIB71-E291I
Hafnia alvei | CadA E291K | pCIB71-E291K
Hafnia alvei | CadA E291L | pCIB71-E291L
Hafnia alvei | CadA E291M | pCIB71-E291M
Hafnia alvei | CadA E291N | pCIB71-E291N
Hafnia alvei | CadA E291P | pCIB71-E291P
Hafnia alvei | CadA E291Q | pCIB71-E291Q
Hafnia alvei | CadA E291R | pCIB71-E291R
Hafnia alvei | CadA E291S | pCIB71-E291S
Hafnia alvei | CadA E291T | pCIB71-E291T
Hafnia alvei | CadA E291V | pCIB71-E291V
Hafnia alvei | CadA E291W | pCIB71-E291W
Hafnia alvei | CadA E291Y | pCIB71-E291Y
Hafnia alvei | CadA E355A | pCIB71-E355A
Hafnia alvei | CadA E355C | pCIB71-E355C
Hafnia alvei | CadA E355D | pCIB71-E355D
Hafnia alvei | CadA E355F | pCIB71-E355F
Hafnia alvei | CadA E355G | pCIB71-E355G
Hafnia alvei | CadA E355H | pCIB71-E355H
Hafnia alvei | CadA E355I | pCIB71-E355I
Hafnia alvei | CadA E355K | pCIB71-E355K
Hafnia alvei | CadA E355L | pCIB71-E355L
Hafnia alvei | CadA E355M | pCIB71-E355M
Hafnia alvei | CadA E355N | pCIB71-E355N
Hafnia alvei | CadA E355P | pCIB71-E355P
Hafnia alvei | CadA E355Q | pCIB71-E355Q
Hafnia alvei | CadA E355R | pCIB71-E355R
Hafnia alvei | CadA E355S | pCIB71-E355S
Hafnia alvei | CadA E355T | pCIB71-E355T
Hafnia alvei | CadA E355V | pCIB71-E355V
Hafnia alvei | CadA E355W | pCIB71-E355W
Hafnia alvei | CadA E355Y | pCIB71-E355Y
Hafnia alvei | CadA E482A | pCIB71-E482A
Hafnia alvei | CadA E482C | pCIB71-E482C
Hafnia alvei | CadA E482D | pCIB71-E482D
Hafnia alvei | CadA E482F | pCIB71-E482F
Hafnia alvei | CadA E482G | pCIB71-E482G
Hafnia alvei | CadA E482H | pCIB71-E482H
Hafnia alvei | CadA E482I | pCIB71-E482I
Hafnia alvei | CadA E482K | pCIB71-E482K
Hafnia alvei | CadA E482L | pCIB71-E482L
Hafnia alvei | CadA E482M | pCIB71-E482M
Hafnia alvei | CadA E482N | pCIB71-E482N
Hafnia alvei | CadA E482P | pCIB71-E482P
Hafnia alvei | CadA E482Q | pCIB71-E482Q
Hafnia alvei | CadA E482R | pCIB71-E482R
Hafnia alvei | CadA E482S | pCIB71-E482S
Hafnia alvei | CadA E482T | pCIB71-E482T
Hafnia alvei | CadA E482V | pCIB71-E482V
Hafnia alvei | CadA E482W | pCIB71-E482W
Hafnia alvei | CadA E482Y | pCIB71-E482Y
TABLE 9 Table of primer sequences used in Examples.
Name | Sequence (5'-3')
cadA-F | GGCGAGCTCACACAGGAAACAGACCATGAACGTTATTGCAATATTGAATCAC
cadA-R | GGCTCTAGACCACTTCCCTTGTACGAGC
cadA-F2 | ATTTCACACAGGAAACAGCTATGAACGTTATTGCAATATTGAAT
cadA-R2 | AGCTGTTTCCTGTGTGAAAT
E291A-F | GCGCGTGAAAAGCACACCAAACGCAACCTGGCC
E291A-R | CGTTTGGTGTGCTTTTCACGCGCTTAGCAATG
E291C-F | AAGCGCGTGAAATGTACACCAAACGCAACCTGGCCGGTAC
E291C-R | GTTGCGTTTGGTGTACATTTCACGCGCTTAGCAATGGTAG
E291D-F | GCGCGTGAAAGATACACCAAACGCAACCTGGCC
E291D-R | CGTTTGGTGTATCTTTCACGCGCTTAGCAATG
E291F-F | GCGCGTGAAATTCACACCAAACGCAACCTGGCC
E291F-R | CGTTTGGTGTGAATTTCACGCGCTTAGCAATG
E291G-F | GCGCGTGAAAGGTACACCAAACGCAACCTGGCC
E291G-R | CGTTTGGTGTACCTTTCACGCGCTTAGCAATG
E291H-F | GCGCGTGAAACATACACCAAACGCAACCTGGCC
E291H-R | CGTTTGGTGTATGTTTCACGCGCTTAGCAATG
E291I-F | GCGCGTGAAAATCACACCAAACGCAACCTGGCC
E291I-R | CGTTTGGTGTGATTTTCACGCGCTTAGCAATG
E291K-F | GCGCGTGAAAAAGACACCAAACGCAACCTGGCC
E291K-R | CGTTTGGTGTCTTTTTCACGCGCTTAGCAATG
E291L-F | GCGCGTGAAACTAACACCAAACGCAACCTGGCC
E291L-R | CGTTTGGTGTTAGTTTCACGCGCTTAGCAATG
E291M-F | GCGCGTGAAAATGACACCAAACGCAACCTGGCC
E291M-R | CGTTTGGTGTCATTTTCACGCGCTTAGCAATG
E291N-F | GCGCGTGAAAAACACACCAAACGCAACCTGGCC
E291N-R | CGTTTGGTGTGTTTTTCACGCGCTTAGCAATG
E291P-F | GCGCGTGAAACCAACACCAAACGCAACCTGGCC
E291P-R | CGTTTGGTGTTGGTTTCACGCGCTTAGCAATG
E291Q-F | GCGCGTGAAACAAACACCAAACGCAACCTGGCC
E291Q-R | CGTTTGGTGTTTGTTTCACGCGCTTAGCAATG
E291R-F | GCGCGTGAAACGTACACCAAACGCAACCTGGCC
E291R-R | CGTTTGGTGTACGTTTCACGCGCTTAGCAATG
E291S-F | GCGCGTGAAATCAACACCAAACGCAACCTGGCC
E291S-R | CGTTTGGTGTTGATTTCACGCGCTTAGCAATG
E291T-F | GCGCGTGAAAACTACACCAAACGCAACCTGGCC
E291T-R | CGTTTGGTGTAGTTTTCACGCGCTTAGCAATG
E291V-F | GCGCGTGAAAGTAACACCAAACGCAACCTGGCC
E291V-R | CGTTTGGTGTTACTTTCACGCGCTTAGCAATG
E291W-F | GCGCGTGAAATGGACACCAAACGCAACCTGGCC
E291W-R | CGTTTGGTGTCCATTTCACGCGCTTAGCAATG
E291Y-F | GCGCGTGAAATACACACCAAACGCAACCTGGCC
E291Y-R | CGTTTGGTGTGTATTTCACGCGCTTAGCAATG
E355A-F | CGGTGGCCGTGTAGCAGGGAAAGTGATTTACGAAACCCAG
E355A-R | AATCACTTTCCCTGCTACACGGCCACCGCTCATACCGCAT
E355C-F | CGGTGGCCGTGTATGTGGGAAAGTGATTTACGAAACCCAG
E355C-R | AATCACTTTCCCACATACACGGCCACCGCTCATACCGCAT
E355D-F | CGGTGGCCGTGTAGACGGGAAAGTGATTTACGAAACCCAG
E355D-R | AATCACTTTCCCGTCTACACGGCCACCGCTCATACCGCAT
E355F-F | CGGTGGCCGTGTATTCGGGAAAGTGATTTACGAAACCCAG
E355F-R | AATCACTTTCCCGAATACACGGCCACCGCTCATACCGCAT
E355G-F | CGGTGGCCGTGTAGGAGGGAAAGTGATTTACGAAACCCAG
E355G-R | AATCACTTTCCCTCCTACACGGCCACCGCTCATACCGCAT
E355H-F | CGGTGGCCGTGTACATGGGAAAGTGATTTACGAAACCCAG
E355H-R | AATCACTTTCCCATGTACACGGCCACCGCTCATACCGCAT
E355I-F | CGGTGGCCGTGTAATCGGGAAAGTGATTTACGAAACCCAG
E355I-R | AATCACTTTCCCGATTACACGGCCACCGCTCATACCGCAT
E355K-F | CGGTGGCCGTGTAAAAGGGAAAGTGATTTACGAAACCCAG
E355K-R | AATCACTTTCCCTTTTACACGGCCACCGCTCATACCGCAT
E355L-F | CGGTGGCCGTGTACTGGGGAAAGTGATTTACGAAACCCAG
E355L-R | AATCACTTTCCCCAGTACACGGCCACCGCTCATACCGCAT
E355M-F | CGGTGGCCGTGTAATGGGGAAAGTGATTTACGAAACCCAG
E355M-R | AATCACTTTCCCCATTACACGGCCACCGCTCATACCGCAT
E355N-F | CGGTGGCCGTGTAAACGGGAAAGTGATTTACGAAACCCAG
E355N-R | AATCACTTTCCCGTTTACACGGCCACCGCTCATACCGCAT
E355P-F | CGGTGGCCGTGTACCAGGGAAAGTGATTTACGAAACCCAG
E355P-R | AATCACTTTCCCTGGTACACGGCCACCGCTCATACCGCAT
E355Q-F | CGGTGGCCGTGTACAAGGGAAAGTGATTTACGAAACCCAG
E355Q-R | AATCACTTTCCCTTGTACACGGCCACCGCTCATACCGCAT
E355R-F | CGGTGGCCGTGTACGTGGGAAAGTGATTTACGAAACCCAG
E355R-R | AATCACTTTCCCACGTACACGGCCACCGCTCATACCGCAT
E355S-F | CGGTGGCCGTGTATCAGGGAAAGTGATTTACGAAACCCAG
E355S-R | AATCACTTTCCCTGATACACGGCCACCGCTCATACCGCAT
E355T-F | CGGTGGCCGTGTAACAGGGAAAGTGATTTACGAAACCCAG
E355T-R | AATCACTTTCCCTGTTACACGGCCACCGCTCATACCGCAT
E355V-F | CGGTGGCCGTGTAGTAGGGAAAGTGATTTACGAAACCCAG
E355V-R | AATCACTTTCCCTACTACACGGCCACCGCTCATACCGCAT
E355W-F | CGGTGGCCGTGTATGGGGGAAAGTGATTTACGAAACCCAG
E355W-R | AATCACTTTCCCCCATACACGGCCACCGCTCATACCGCAT
E355Y-F | CGGTGGCCGTGTATACGGGAAAGTGATTTACGAAACCCAG
E355Y-R | AATCACTTTCCCGTATACACGGCCACCGCTCATACCGCAT
E482A-F | CATCGATAACGCCCACATGTATCTTGACCCGATCAAAGTC
E482A-R | GATACATGTGGGCGTTATCGATGTTTTTGAAGCCGTGCC
E482C-F | CATCGATAACTGCCACATGTATCTTGACCCGATCAAAGTC
E482C-R | GATACATGTGGCAGTTATCGATGTTTTTGAAGCCGTGCC
E482D-F | CATCGATAACGATCACATGTATCTTGACCCGATCAAAGTC
E482D-R | GATACATGTGATCGTTATCGATGTTTTTGAAGCCGTGCC
E482F-F | CATCGATAACTTTCACATGTATCTTGACCCGATCAAAGTC
E482F-R | GATACATGTGAAAGTTATCGATGTTTTTGAAGCCGTGCC
E482G-F | CATCGATAACGGTCACATGTATCTTGACCCGATCAAAGTC
E482G-R | GATACATGTGACCGTTATCGATGTTTTTGAAGCCGTGCC
E482H-F | CATCGATAACCATCACATGTATCTTGACCCGATCAAAGTC
E482H-R | GATACATGTGATGGTTATCGATGTTTTTGAAGCCGTGCC
E482I-F | CATCGATAACATTCACATGTATCTTGACCCGATCAAAGTC
E482I-R | GATACATGTGAATGTTATCGATGTTTTTGAAGCCGTGCC
E482K-F | CATCGATAACAAACACATGTATCTTGACCCGATCAAAGTC
E482K-R | GATACATGTGTTTGTTATCGATGTTTTTGAAGCCGTGCC
E482L-F | CATCGATAACCTGCACATGTATCTTGACCCGATCAAAGTC
E482L-R | GATACATGTGCAGGTTATCGATGTTTTTGAAGCCGTGCC
E482M-F | CATCGATAACATGCACATGTATCTTGACCCGATCAAAGTC
E482M-R | GATACATGTGCATGTTATCGATGTTTTTGAAGCCGTGCC
E482N-F | CATCGATAACAACCACATGTATCTTGACCCGATCAAAGTC
E482N-R | GATACATGTGGTTGTTATCGATGTTTTTGAAGCCGTGCC
E482P-F | CATCGATAACCCGCACATGTATCTTGACCCGATCAAAGTC
E482P-R | GATACATGTGCGGGTTATCGATGTTTTTGAAGCCGTGCC
E482Q-F | CATCGATAACCAGCACATGTATCTTGACCCGATCAAAGTC
E482Q-R | GATACATGTGCTGGTTATCGATGTTTTTGAAGCCGTGCC
E482R-F | CATCGATAACCGTCACATGTATCTTGACCCGATCAAAGTC
E482R-R | GATACATGTGACGGTTATCGATGTTTTTGAAGCCGTGCC
E482S-F | CATCGATAACAGCCACATGTATCTTGACCCGATCAAAGTC
E482S-R | GATACATGTGGCTGTTATCGATGTTTTTGAAGCCGTGCC
E482T-F | CATCGATAACACCCACATGTATCTTGACCCGATCAAAGTC
E482T-R | GATACATGTGGGTGTTATCGATGTTTTTGAAGCCGTGCC
E482V-F | CATCGATAACGTGCACATGTATCTTGACCCGATCAAAGTC
E482V-R | GATACATGTGCACGTTATCGATGTTTTTGAAGCCGTGCC
E482W-F | CATCGATAACTGGCACATGTATCTTGACCCGATCAAAGTC
E482W-R | GATACATGTGCCAGTTATCGATGTTTTTGAAGCCGTGCC
E482Y-F | CATCGATAACTATCACATGTATCTTGACCCGATCAAAGTC
E482Y-R | GATACATGTGATAGTTATCGATGTTTTTGAAGCCGTGCC
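One way to check which substitution a mutagenic forward primer encodes is to compare it, codon by codon, with the matching stretch of the wild-type coding sequence. The sketch below is illustrative only and is not a procedure described in the patent: the function name substitutions is hypothetical, the wild-type fragment is a 40-nucleotide stretch of SEQ ID NO: 1 that begins in frame at codon 287, the primer is E291C-F from Table 9, and the translation uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitutions(wild_type, primer, first_codon_number):
    """List the amino acid changes encoded by primer relative to wild_type.

    Both sequences must have the same length and start on a codon boundary;
    trailing bases that do not fill a codon are ignored.
    """
    if len(wild_type) != len(primer):
        raise ValueError("sequences must be the same length")
    changes = []
    for i in range(0, len(primer) - 2, 3):
        wt_codon, mut_codon = wild_type[i:i + 3], primer[i:i + 3]
        if wt_codon != mut_codon:
            number = first_codon_number + i // 3
            changes.append(f"{CODON_TABLE[wt_codon]}{number}{CODON_TABLE[mut_codon]}")
    return changes


WILD_TYPE = "AAGCGCGTGAAAGAAACACCAAACGCAACCTGGCCGGTAC"  # SEQ ID NO: 1, from codon 287
E291C_F = "AAGCGCGTGAAATGTACACCAAACGCAACCTGGCCGGTAC"    # primer E291C-F (Table 9)

print(substitutions(WILD_TYPE, E291C_F, 287))
```

The comparison assumes that the primer and the wild-type fragment are already aligned and in frame; a primer that starts at a different offset would first need to be positioned against the coding sequence.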
Illustrative Sequences:
Escherichia coli cadA nucleic acid sequence SEQ ID NO: 1
ATGAACGTTATTGCAATATTGAATCACATGGGGGTTTATTTTAAAGAAGA
ACCCATCCGTGAACTTCATCGCGCGCTTGAACGTCTGAACTTCCAGATTG
TTTACCCGAACGACCGTGACGACTTATTAAAACTGATCGAAAACAATGCG
CGTCTGTGCGGCGTTATTTTTGACTGGGATAAATATAATCTCGAGCTGTG
CGAAGAAATTAGCAAAATGAACGAGAACCTGCCGTTGTACGCGTTCGCTA
ATACGTATTCCACTCTCGATGTAAGCCTGAATGACCTGCGTTTACAGATT
AGCTTCTTTGAATATGCGCTGGGTGCTGCTGAAGATATTGCTAATAAGAT
CAAGCAGACCACTGACGAATATATCAACACTATTCTGCCTCCGCTGACTA
AAGCACTGTTTAAATATGTTCGTGAAGGTAAATATACTTTCTGTACTCCT
GGTCACATGGGCGGTACTGCATTCCAGAAAAGCCCGGTAGGTAGCCTGTT
CTATGATTTCTTTGGTCCGAATACCATGAAATCTGATATTTCCATTTCAG
TATCTGAACTGGGTTCTCTGCTGGATCACAGTGGTCCACACAAAGAAGCA
GAACAGTATATCGCTCGCGTCTTTAACGCAGACCGCAGCTACATGGTGAC
CAACGGTACTTCCACTGCGAACAAAATTGTTGGTATGTACTCTGCTCCAG
CAGGCAGCACCATTCTGATTGACCGTAACTGCCACAAATCGCTGACCCAC
CTGATGATGATGAGCGATGTTACGCCAATCTATTTCCGCCCGACCCGTAA
CGCTTACGGTATTCTTGGTGGTATCCCACAGAGTGAATTCCAGCACGCTA
CCATTGCTAAGCGCGTGAAAGAAACACCAAACGCAACCTGGCCGGTACAT
GCTGTAATTACCAACTCTACCTATGATGGTCTGCTGTACAACACCGACTT
CATCAAGAAAACACTGGATGTGAAATCCATCCACTTTGACTCCGCGTGGG
TGCCTTACACCAACTTCTCACCGATTTACGAAGGTAAATGCGGTATGAGC
GGTGGCCGTGTAGAAGGGAAAGTGATTTACGAAACCCAGTCCACTCACAA
ACTGCTGGCGGCGTTCTCTCAGGCTTCCATGATCCACGTTAAAGGTGACG
TAAACGAAGAAACCTTTAACGAAGCCTACATGATGCACACCACCACTTCT
CCGCACTACGGTATCGTGGCGTCCACTGAAACCGCTGCGGCGATGATGAA
AGGCAATGCAGGTAAGCGTCTGATCAACGGTTCTATTGAACGTGCGATCA
AATTCCGTAAAGAGATCAAACGTCTGAGAACGGAATCTGATGGCTGGTTC
TTTGATGTATGGCAGCCGGATCATATCGATACGACTGAATGCTGGCCGCT
GCGTTCTGACAGCACCTGGCACGGCTTCAAAAACATCGATAACGAGCACA
TGTATCTTGACCCGATCAAAGTCACCCTGCTGACTCCGGGGATGGAAAAA
GACGGCACCATGAGCGACTTTGGTATTCCGGCCAGCATCGTGGCGAAATA
CCTCGACGAACATGGCATCGTTGTTGAGAAAACCGGTCCGTATAACCTGC
TGTTCCTGTTCAGCATCGGTATCGATAAGACCAAAGCACTGAGCCTGCTG
CGTGCTCTGACTGACTTTAAACGTGCGTTCGACCTGAACCTGCGTGTGAA
AAACATGCTGCCGTCTCTGTATCGTGAAGATCCTGAATTCTATGAAAACA
TGCGTATTCAGGAACTGGCTCAGAATATCCACAAACTGATTGTTCACCAC
AATCTGCCGGATCTGATGTATCGCGCATTTGAAGTGCTGCCGACGATGGT
AATGACTCCGTATGCTGCATTCCAGAAAGAGCTGCACGGTATGACCGAAG
AAGTTTACCTCGACGAAATGGTAGGTCGTATTAACGCCAATATGATCCTT
CCGTACCCGCCGGGAGTTCCTCTGGTAATGCCGGGTGAAATGATCACCGA
AGAAAGCCGTCCGGTTCTGGAGTTCCTGCAGATGCTGTGTGAAATCGGCG
CTCACTATCCGGGCTTTGAAACCGATATTCACGGTGCATACCGTCAGGCT
GATGGCCGCTATACCGTTAAGGTATTGAAAGAAGAAAGCAAAAAATAA
CadA polypeptide sequence SEQ ID NO: 2
MNVIAILNHMGVYFKEEPIRELHRALERLNFQIVYPNDRDDLLKLIENNA
RLCGVIFDWDKYNLELCEEISKMNENLPLYAFANTYSTLDVSLNDLRLQI
SFFEYALGAAEDIANKIKQTTDEYINTILPPLTKALFKYVREGKYTFCTP
GHMGGTAFQKSPVGSLFYDFFGPNTMKSDISISVSELGSLLDHSGPHKEA
EQYIARVFNADRSYMVTNGTSTANKIVGMYSAPAGSTILIDRNCHKSLTH
LMMMSDVTPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH
AVITNSTYDGLLYNTDFIKKTLDVKSIHFDSAWVPYTNFSPIYEGKCGMS
GGRVEGKVIYETQSTHKLLAAFSQASMIHVKGDVNEETFNEAYMMHTTTS
PHYGIVASTETAAAMMKGNAGKRLINGSIERAIKFRKEIKRLRTESDGWF
FDVWQPDHIDTTECWPLRSDSTWHGFKNIDNEHMYLDPIKVTLLTPGMEK
DGTMSDFGIPASIVAKYLDEHGIVVEKTGPYNLLFLFSIGIDKTKALSLL
RALTDFKRAFDLNLRVKNMLPSLYREDPEFYENMRIQELAQNIHKLIVHH
NLPDLMYRAFEVLPTMVMTPYAAFQKELHGMTEEVYLDEMVGRINANMIL
PYPPGVPLVMPGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQA
DGRYTVKVLKEESKK
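The relationship between SEQ ID NO: 1 and SEQ ID NO: 2 can be spot-checked with a short translation. The Python sketch below is a minimal example, assuming the standard genetic code and reading frame 0; it translates only the first 48 and last 48 nucleotides of SEQ ID NO: 1 as copied from the text. The function and variable names are hypothetical and chosen for this illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
RESIDUES = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): r for c, r in zip(product(BASES, repeat=3), RESIDUES)}


def translate(dna: str) -> str:
    """Translate in frame 0, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 48 and last 48 nucleotides of SEQ ID NO: 1, copied from the text.
CADA_START = "ATGAACGTTATTGCAATATTGAATCACATGGGGGTTTATTTTAAAGAA"
CADA_END = "GATGGCCGCTATACCGTTAAGGTATTGAAAGAAGAAAGCAAAAAATAA"

n_terminus = translate(CADA_START)  # compare with the start of SEQ ID NO: 2
c_terminus = translate(CADA_END)    # "*" marks the stop codon
print(n_terminus)
print(c_terminus)
```

Passing the full 2148-nucleotide sequence to the same function would be expected to give the 715-residue polypeptide followed by a stop symbol.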
Polypeptide from Klebsiella homologous to E. coli CadA SEQ ID NO: 3
MNVIAIMNHMGVYFKEEPIRELHRALERLDFRIVYPNDRDDLLKLIENNS
RLCGVIFDWDKYNLELCEEISKMNEYMPLYAFANTYSTLDVSLNDLRMQV
RFFEYALGAAEDIANKIKQNTDEYIDTILPPLTKALFKYVREGKYTFCTP
GHMGGTAFQKSPVGSIFYDFFGPNTMKSDISISVSELGSLLDHSGPHKEA
EEYIARVFNAERSYMVTNGTSTANKIVGMYSAPAGSTVLIDRNCHKSLTH
LMMMSDITPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH
AVITNSTYDGLLYNTDFIKKTLDVKSIHFDSAWVPYTNFSPIYEGKCGMS
GGRVEGKVIYETQSTHKLLAAFSQASMIHVKGDVNEETFNEAYMMHTTTS
PHYGIVASTETAAAMMKGNAGKRLIDGSIERSIKFRKEIKRLKGESDGWF
FDVWQPEHIDGPECWPLRSDSAWHGFKNIDNEHMYLDPIKVTLLTPGMKK
DGTMDDFGIPASIVAKYLDEHGIVVEKTGPYNLLFLFSIGIDKTKALSLL
RALTDFKRAFDLNLRVKNMLPSLYREDPEFYENMRIQDLAQNIHKLIEHH
NLPDLMFRAFEVLPSMVMTPYAAFQKELHGQTEEVYLEEMVGRVNANMIL
PYPPGVPLVMPGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQA
DGRYTVKVLKEENNK
Polypeptide from Enterobacteriaceae homologous to E. coli CadA SEQ ID NO: 4
MNVIAIMNHMGVYFKEEPIRELHRALERLDFRIVYPNDRDDLLKLIENNS
RLCGVIFDWDKYNLELCEEISKMNEYMPLYAFANTYSTLDVSLNDLRMQV
RFFEYALGAAEDIANKIKQNTDEYIDTILPPLTKALFKYVREGKYTFCTP
GHMGGTAFQKSPVGSIFYDFFGSNTMKSDISISVSELGSLLDHSGPHKEA
EEYIARVFNAERSYMVTNGTSTANKIVGMYSAPAGSTVLIDRNCHKSLTH
LMMMSDITPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH
AVITNSTYDGLLYNTDFIKKTLDVKSIHFDSAWVPYTNFSPIYEGKCGMS
GGRVEGKVIYETQSTHKLLAAFSQASMIHVKGDVNEETFNEAYMMHTTTS
PHYGIVASTETAAAMMKGNAGKRLIDGSIERSIKFRKEIKRLKGESDGWF
FDVWQPEHIDGPECWPLRSDSAWHGFKNIDNEHMYLDPIKVTLLTPGMKK
DGTMDDFGIPASIVAKYLDEHGIVVEKTGPYNLLFLFSIGIDKTKALSLL
RALTDFKRAFDLNLRVKNMLPSLYREDPEFYENMRIQDLAQNIHKLIEHH
NLPDLMFRAFEVLPSMVMTPYAAFQKELHGQTEEVYLEEMVGRVNANMIL
PYPPGVPLVMPGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQA
DGRYTVKVLKEENNK
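Homologous polypeptides such as SEQ ID NO: 3 and SEQ ID NO: 4 can be compared position by position. The Python sketch below is an illustrative example only: it compares the fourth 50-residue block (residues 151 to 200) of the two sequences as printed above and lists any differing positions. The helper name and the block-wise approach are assumptions made for brevity; a full comparison would use the complete sequences or a proper alignment tool.

```python
# Residues 151-200 of SEQ ID NO: 3 and SEQ ID NO: 4, copied from the text.
SEQ3_BLOCK = "GHMGGTAFQKSPVGSIFYDFFGPNTMKSDISISVSELGSLLDHSGPHKEA"
SEQ4_BLOCK = "GHMGGTAFQKSPVGSIFYDFFGSNTMKSDISISVSELGSLLDHSGPHKEA"
BLOCK_START = 151  # 1-based position of the first residue in each block


def differences(a: str, b: str, start: int):
    """List (position, residue_in_a, residue_in_b) for equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = differences(SEQ3_BLOCK, SEQ4_BLOCK, BLOCK_START)
identity = 100.0 * (len(SEQ3_BLOCK) - len(diffs)) / len(SEQ3_BLOCK)
print(diffs, f"{identity:.1f}% identical over this block")
```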
Polypeptide from Salmonella enterica homologous to E. coli CadA SEQ ID NO: 5
MNVIAIMNHMGVYFKEEPIRELHRALEGLNFRIVYPNDREDLLKLIENNS
RLCGVIFDWDKYNLELCEEISKLNEYMPLYAFANSYSTLDVSLNDLRMQV
RFFEYALGAATDIAAKIRQNTDEYIDNILPPLTKALFKYVREGKYTFCTP
GHMGGTAFQKSPVGSIFYDFFGPNTMKSDISISVSELGSLLDHSGPHKEA
EEYIARVFNAERSYMVTNGTSTANKIVGMYSAPAGSTVLIDRNCHKSLTH
LMMMSDITPIYFRPTRNAYGILGGIPQSEFQHATIAKRVKETPNATWPVH
AVITNSTYDGLLYNTDYIKKTLDVKSIHFDSAWVPYTNFSPIYQGKCGMS
GDRVEGKIIYETQSTHKLLAAFSQASMIHVKGDINEETFNEAYMMHTTTS
PHYGIVASTETAAAMMKGNAGKRLINGSIERAIKFRKEIKRLKSESDGWF
FDVWQPEHIDGAECWPLRSDSAWHGFKNIDNEHMYLDPIKVTILTPGMKK
DGTMDEFGIPASLVAKYLDERGIIVEKTGPYNLLFLFSIGIDKTKALSLL
RALTEFKRAFDLNLRVKNILPALYREAPEFYENMRIQELAQNIHKLVEHH
NLPDLMYRAFEVLPKMVMTPYTAFQKELHGETEEVYLEEMVGRVNANMIL
PYPPGVPLVMPGEMITEESRPVLEFLQMLCEIGAHYPGFETDIHGAYRQA
DGRYTVKVLKENTK
SEQUENCE LISTINGS
1
123 1 2148 DNA Escherichia coli 1 atgaacgtta ttgcaatatt gaatcacatg
ggggtttatt ttaaagaaga acccatccgt 60gaacttcatc gcgcgcttga acgtctgaac
ttccagattg tttacccgaa cgaccgtgac 120gacttattaa aactgatcga
aaacaatgcg cgtctgtgcg gcgttatttt tgactgggat 180aaatataatc
tcgagctgtg cgaagaaatt agcaaaatga acgagaacct gccgttgtac
240gcgttcgcta atacgtattc cactctcgat gtaagcctga atgacctgcg
tttacagatt 300agcttctttg aatatgcgct gggtgctgct gaagatattg
ctaataagat caagcagacc 360actgacgaat atatcaacac tattctgcct
ccgctgacta aagcactgtt taaatatgtt 420cgtgaaggta aatatacttt
ctgtactcct ggtcacatgg gcggtactgc attccagaaa 480agcccggtag
gtagcctgtt ctatgatttc tttggtccga ataccatgaa atctgatatt
540tccatttcag tatctgaact gggttctctg ctggatcaca gtggtccaca
caaagaagca 600gaacagtata tcgctcgcgt ctttaacgca gaccgcagct
acatggtgac caacggtact 660tccactgcga acaaaattgt tggtatgtac
tctgctccag caggcagcac cattctgatt 720gaccgtaact gccacaaatc
gctgacccac ctgatgatga tgagcgatgt tacgccaatc 780tatttccgcc
cgacccgtaa cgcttacggt attcttggtg gtatcccaca gagtgaattc
840cagcacgcta ccattgctaa gcgcgtgaaa gaaacaccaa acgcaacctg
gccggtacat 900gctgtaatta ccaactctac ctatgatggt ctgctgtaca
acaccgactt catcaagaaa 960acactggatg tgaaatccat ccactttgac
tccgcgtggg tgccttacac caacttctca 1020ccgatttacg aaggtaaatg
cggtatgagc ggtggccgtg tagaagggaa agtgatttac 1080gaaacccagt
ccactcacaa actgctggcg gcgttctctc aggcttccat gatccacgtt
1140aaaggtgacg taaacgaaga aacctttaac gaagcctaca tgatgcacac
caccacttct 1200ccgcactacg gtatcgtggc gtccactgaa accgctgcgg
cgatgatgaa aggcaatgca 1260ggtaagcgtc tgatcaacgg ttctattgaa
cgtgcgatca aattccgtaa agagatcaaa 1320cgtctgagaa cggaatctga
tggctggttc tttgatgtat ggcagccgga tcatatcgat 1380acgactgaat
gctggccgct gcgttctgac agcacctggc acggcttcaa aaacatcgat
1440aacgagcaca tgtatcttga cccgatcaaa gtcaccctgc tgactccggg
gatggaaaaa 1500gacggcacca tgagcgactt tggtattccg gccagcatcg
tggcgaaata cctcgacgaa 1560catggcatcg ttgttgagaa aaccggtccg
tataacctgc tgttcctgtt cagcatcggt 1620atcgataaga ccaaagcact
gagcctgctg cgtgctctga ctgactttaa acgtgcgttc 1680gacctgaacc
tgcgtgtgaa aaacatgctg ccgtctctgt atcgtgaaga tcctgaattc
1740tatgaaaaca tgcgtattca ggaactggct cagaatatcc acaaactgat
tgttcaccac 1800aatctgccgg atctgatgta tcgcgcattt gaagtgctgc
cgacgatggt aatgactccg 1860tatgctgcat tccagaaaga gctgcacggt
atgaccgaag aagtttacct cgacgaaatg 1920gtaggtcgta ttaacgccaa
tatgatcctt ccgtacccgc cgggagttcc tctggtaatg 1980ccgggtgaaa
tgatcaccga agaaagccgt ccggttctgg agttcctgca gatgctgtgt
2040gaaatcggcg ctcactatcc gggctttgaa accgatattc acggtgcata
ccgtcaggct 2100gatggccgct ataccgttaa ggtattgaaa gaagaaagca aaaaataa
2148 2 715 PRT Escherichia coli 2 Met Asn Val Ile Ala Ile Leu Asn His
Met Gly Val Tyr Phe Lys Glu1 5 10 15Glu Pro Ile Arg Glu Leu His Arg
Ala Leu Glu Arg Leu Asn Phe Gln 20 25 30Ile Val Tyr Pro Asn Asp Arg
Asp Asp Leu Leu Lys Leu Ile Glu Asn 35 40 45Asn Ala Arg Leu Cys Gly
Val Ile Phe Asp Trp Asp Lys Tyr Asn Leu 50 55 60Glu Leu Cys Glu Glu
Ile Ser Lys Met Asn Glu Asn Leu Pro Leu Tyr65 70 75 80Ala Phe Ala
Asn Thr Tyr Ser Thr Leu Asp Val Ser Leu Asn Asp Leu 85 90 95Arg Leu
Gln Ile Ser Phe Phe Glu Tyr Ala Leu Gly Ala Ala Glu Asp 100 105
110Ile Ala Asn Lys Ile Lys Gln Thr Thr Asp Glu Tyr Ile Asn Thr Ile
115 120 125Leu Pro Pro Leu Thr Lys Ala Leu Phe Lys Tyr Val Arg Glu
Gly Lys 130 135 140Tyr Thr Phe Cys Thr Pro Gly His Met Gly Gly Thr
Ala Phe Gln Lys145 150 155 160Ser Pro Val Gly Ser Leu Phe Tyr Asp
Phe Phe Gly Pro Asn Thr Met 165 170 175Lys Ser Asp Ile Ser Ile Ser
Val Ser Glu Leu Gly Ser Leu Leu Asp 180 185 190His Ser Gly Pro His
Lys Glu Ala Glu Gln Tyr Ile Ala Arg Val Phe 195 200 205Asn Ala Asp
Arg Ser Tyr Met Val Thr Asn Gly Thr Ser Thr Ala Asn 210 215 220Lys
Ile Val Gly Met Tyr Ser Ala Pro Ala Gly Ser Thr Ile Leu Ile225 230
235 240Asp Arg Asn Cys His Lys Ser Leu Thr His Leu Met Met Met Ser
Asp 245 250 255Val Thr Pro Ile Tyr Phe Arg Pro Thr Arg Asn Ala Tyr
Gly Ile Leu 260 265 270Gly Gly Ile Pro Gln Ser Glu Phe Gln His Ala
Thr Ile Ala Lys Arg 275 280 285Val Lys Glu Thr Pro Asn Ala Thr Trp
Pro Val His Ala Val Ile Thr 290 295 300Asn Ser Thr Tyr Asp Gly Leu
Leu Tyr Asn Thr Asp Phe Ile Lys Lys305 310 315 320Thr Leu Asp Val
Lys Ser Ile His Phe Asp Ser Ala Trp Val Pro Tyr 325 330 335Thr Asn
Phe Ser Pro Ile Tyr Glu Gly Lys Cys Gly Met Ser Gly Gly 340 345
350Arg Val Glu Gly Lys Val Ile Tyr Glu Thr Gln Ser Thr His Lys Leu
355 360 365Leu Ala Ala Phe Ser Gln Ala Ser Met Ile His Val Lys Gly
Asp Val 370 375 380Asn Glu Glu Thr Phe Asn Glu Ala Tyr Met Met His
Thr Thr Thr Ser385 390 395 400Pro His Tyr Gly Ile Val Ala Ser Thr
Glu Thr Ala Ala Ala Met Met 405 410 415Lys Gly Asn Ala Gly Lys Arg
Leu Ile Asn Gly Ser Ile Glu Arg Ala 420 425 430Ile Lys Phe Arg Lys
Glu Ile Lys Arg Leu Arg Thr Glu Ser Asp Gly 435 440 445Trp Phe Phe
Asp Val Trp Gln Pro Asp His Ile Asp Thr Thr Glu Cys 450 455 460Trp
Pro Leu Arg Ser Asp Ser Thr Trp His Gly Phe Lys Asn Ile Asp465 470
475 480Asn Glu His Met Tyr Leu Asp Pro Ile Lys Val Thr Leu Leu Thr
Pro 485 490 495Gly Met Glu Lys Asp Gly Thr Met Ser Asp Phe Gly Ile
Pro Ala Ser 500 505 510Ile Val Ala Lys Tyr Leu Asp Glu His Gly Ile
Val Val Glu Lys Thr 515 520 525Gly Pro Tyr Asn Leu Leu Phe Leu Phe
Ser Ile Gly Ile Asp Lys Thr 530 535 540Lys Ala Leu Ser Leu Leu Arg
Ala Leu Thr Asp Phe Lys Arg Ala Phe545 550 555 560Asp Leu Asn Leu
Arg Val Lys Asn Met Leu Pro Ser Leu Tyr Arg Glu 565 570 575Asp Pro
Glu Phe Tyr Glu Asn Met Arg Ile Gln Glu Leu Ala Gln Asn 580 585
590Ile His Lys Leu Ile Val His His Asn Leu Pro Asp Leu Met Tyr Arg
595 600 605Ala Phe Glu Val Leu Pro Thr Met Val Met Thr Pro Tyr Ala
Ala Phe 610 615 620Gln Lys Glu Leu His Gly Met Thr Glu Glu Val Tyr
Leu Asp Glu Met625 630 635 640Val Gly Arg Ile Asn Ala Asn Met Ile
Leu Pro Tyr Pro Pro Gly Val 645 650 655Pro Leu Val Met Pro Gly Glu
Met Ile Thr Glu Glu Ser Arg Pro Val 660 665 670Leu Glu Phe Leu Gln
Met Leu Cys Glu Ile Gly Ala His Tyr Pro Gly 675 680 685Phe Glu Thr
Asp Ile His Gly Ala Tyr Arg Gln Ala Asp Gly Arg Tyr 690 695 700Thr
Val Lys Val Leu Lys Glu Glu Ser Lys Lys705 710
715 3 715 PRT Artificial Klebsiella 3 Met Asn Val Ile Ala Ile Met Asn His
Met Gly Val Tyr Phe Lys Glu1 5 10 15Glu Pro Ile Arg Glu Leu His Arg
Ala Leu Glu Arg Leu Asp Phe Arg 20 25 30Ile Val Tyr Pro Asn Asp Arg
Asp Asp Leu Leu Lys Leu Ile Glu Asn 35 40 45Asn Ser Arg Leu Cys Gly
Val Ile Phe Asp Trp Asp Lys Tyr Asn Leu 50 55 60Glu Leu Cys Glu Glu
Ile Ser Lys Met Asn Glu Tyr Met Pro Leu Tyr65 70 75 80Ala Phe Ala
Asn Thr Tyr Ser Thr Leu Asp Val Ser Leu Asn Asp Leu 85 90 95Arg Met
Gln Val Arg Phe Phe Glu Tyr Ala Leu Gly Ala Ala Glu Asp 100 105
110Ile Ala Asn Lys Ile Lys Gln Asn Thr Asp Glu Tyr Ile Asp Thr Ile
115 120 125Leu Pro Pro Leu Thr Lys Ala Leu Phe Lys Tyr Val Arg Glu
Gly Lys 130 135 140Tyr Thr Phe Cys Thr Pro Gly His Met Gly Gly Thr
Ala Phe Gln Lys145 150 155 160Ser Pro Val Gly Ser Ile Phe Tyr Asp
Phe Phe Gly Pro Asn Thr Met 165 170 175Lys Ser Asp Ile Ser Ile Ser
Val Ser Glu Leu Gly Ser Leu Leu Asp 180 185 190His Ser Gly Pro His
Lys Glu Ala Glu Glu Tyr Ile Ala Arg Val Phe 195 200 205Asn Ala Glu
Arg Ser Tyr Met Val Thr Asn Gly Thr Ser Thr Ala Asn 210 215 220Lys
Ile Val Gly Met Tyr Ser Ala Pro Ala Gly Ser Thr Val Leu Ile225 230
235 240Asp Arg Asn Cys His Lys Ser Leu Thr His Leu Met Met Met Ser
Asp 245 250 255Ile Thr Pro Ile Tyr Phe Arg Pro Thr Arg Asn Ala Tyr
Gly Ile Leu 260 265 270Gly Gly Ile Pro Gln Ser Glu Phe Gln His Ala
Thr Ile Ala Lys Arg 275 280 285Val Lys Glu Thr Pro Asn Ala Thr Trp
Pro Val His Ala Val Ile Thr 290 295 300Asn Ser Thr Tyr Asp Gly Leu
Leu Tyr Asn Thr Asp Phe Ile Lys Lys305 310 315 320Thr Leu Asp Val
Lys Ser Ile His Phe Asp Ser Ala Trp Val Pro Tyr 325 330 335Thr Asn
Phe Ser Pro Ile Tyr Glu Gly Lys Cys Gly Met Ser Gly Gly 340 345
350Arg Val Glu Gly Lys Val Ile Tyr Glu Thr Gln Ser Thr His Lys Leu
355 360 365Leu Ala Ala Phe Ser Gln Ala Ser Met Ile His Val Lys Gly
Asp Val 370 375 380Asn Glu Glu Thr Phe Asn Glu Ala Tyr Met Met His
Thr Thr Thr Ser385 390 395 400Pro His Tyr Gly Ile Val Ala Ser Thr
Glu Thr Ala Ala Ala Met Met 405 410 415Lys Gly Asn Ala Gly Lys Arg
Leu Ile Asp Gly Ser Ile Glu Arg Ser 420 425 430Ile Lys Phe Arg Lys
Glu Ile Lys Arg Leu Lys Gly Glu Ser Asp Gly 435 440 445Trp Phe Phe
Asp Val Trp Gln Pro Glu His Ile Asp Gly Pro Glu Cys 450 455 460Trp
Pro Leu Arg Ser Asp Ser Ala Trp His Gly Phe Lys Asn Ile Asp465 470
475 480Asn Glu His Met Tyr Leu Asp Pro Ile Lys Val Thr Leu Leu Thr
Pro 485 490 495Gly Met Lys Lys Asp Gly Thr Met Asp Asp Phe Gly Ile
Pro Ala Ser 500 505 510Ile Val Ala Lys Tyr Leu Asp Glu His Gly Ile
Val Val Glu Lys Thr 515 520 525Gly Pro Tyr Asn Leu Leu Phe Leu Phe
Ser Ile Gly Ile Asp Lys Thr 530 535 540Lys Ala Leu Ser Leu Leu Arg
Ala Leu Thr Asp Phe Lys Arg Ala Phe545 550 555 560Asp Leu Asn Leu
Arg Val Lys Asn Met Leu Pro Ser Leu Tyr Arg Glu 565 570 575Asp Pro
Glu Phe Tyr Glu Asn Met Arg Ile Gln Asp Leu Ala Gln Asn 580 585
590Ile His Lys Leu Ile Glu His His Asn Leu Pro Asp Leu Met Phe Arg
595 600 605Ala Phe Glu Val Leu Pro Ser Met Val Met Thr Pro Tyr Ala
Ala Phe 610 615 620Gln Lys Glu Leu His Gly Gln Thr Glu Glu Val Tyr
Leu Glu Glu Met625 630 635 640Val Gly Arg Val Asn Ala Asn Met Ile
Leu Pro Tyr Pro Pro Gly Val 645 650 655Pro Leu Val Met Pro Gly Glu
Met Ile Thr Glu Glu Ser Arg Pro Val 660 665 670Leu Glu Phe Leu Gln
Met Leu Cys Glu Ile Gly Ala His Tyr Pro Gly 675 680 685Phe Glu Thr
Asp Ile His Gly Ala Tyr Arg Gln Ala Asp Gly Arg Tyr 690 695 700Thr
Val Lys Val Leu Lys Glu Glu Asn Asn Lys705 710
715 4 715 PRT Artificial Enterobacteriaceae 4 Met Asn Val Ile Ala Ile Met
Asn His Met Gly Val Tyr Phe Lys Glu1 5 10 15Glu Pro Ile Arg Glu Leu
His Arg Ala Leu Glu Arg Leu Asp Phe Arg 20 25 30Ile Val Tyr Pro Asn
Asp Arg Asp Asp Leu Leu Lys Leu Ile Glu Asn 35 40 45Asn Ser Arg Leu
Cys Gly Val Ile Phe Asp Trp Asp Lys Tyr Asn Leu 50 55 60Glu Leu Cys
Glu Glu Ile Ser Lys Met Asn Glu Tyr Met Pro Leu Tyr65 70 75 80Ala
Phe Ala Asn Thr Tyr Ser Thr Leu Asp Val Ser Leu Asn Asp Leu 85 90
95Arg Met Gln Val Arg Phe Phe Glu Tyr Ala Leu Gly Ala Ala Glu Asp
100 105 110Ile Ala Asn Lys Ile Lys Gln Asn Thr Asp Glu Tyr Ile Asp
Thr Ile 115 120 125Leu Pro Pro Leu Thr Lys Ala Leu Phe Lys Tyr Val
Arg Glu Gly Lys 130 135 140Tyr Thr Phe Cys Thr Pro Gly His Met Gly
Gly Thr Ala Phe Gln Lys145 150 155 160Ser Pro Val Gly Ser Ile Phe
Tyr Asp Phe Phe Gly Ser Asn Thr Met 165 170 175Lys Ser Asp Ile Ser
Ile Ser Val Ser Glu Leu Gly Ser Leu Leu Asp 180 185 190His Ser Gly
Pro His Lys Glu Ala Glu Glu Tyr Ile Ala Arg Val Phe 195 200 205Asn
Ala Glu Arg Ser Tyr Met Val Thr Asn Gly Thr Ser Thr Ala Asn 210 215
220Lys Ile Val Gly Met Tyr Ser Ala Pro Ala Gly Ser Thr Val Leu
Ile225 230 235 240Asp Arg Asn Cys His Lys Ser Leu Thr His Leu Met
Met Met Ser Asp 245 250 255Ile Thr Pro Ile Tyr Phe Arg Pro Thr Arg
Asn Ala Tyr Gly Ile Leu 260 265 270Gly Gly Ile Pro Gln Ser Glu Phe
Gln His Ala Thr Ile Ala Lys Arg 275 280 285Val Lys Glu Thr Pro Asn
Ala Thr Trp Pro Val His Ala Val Ile Thr 290 295 300Asn Ser Thr Tyr
Asp Gly Leu Leu Tyr Asn Thr Asp Phe Ile Lys Lys305 310 315 320Thr
Leu Asp Val Lys Ser Ile His Phe Asp Ser Ala Trp Val Pro Tyr 325 330
335Thr Asn Phe Ser Pro Ile Tyr Glu Gly Lys Cys Gly Met Ser Gly Gly
340 345 350Arg Val Glu Gly Lys Val Ile Tyr Glu Thr Gln Ser Thr His
Lys Leu 355 360 365Leu Ala Ala Phe Ser Gln Ala Ser Met Ile His Val
Lys Gly Asp Val 370 375 380Asn Glu Glu Thr Phe Asn Glu Ala Tyr Met
Met His Thr Thr Thr Ser385 390 395 400Pro His Tyr Gly Ile Val Ala
Ser Thr Glu Thr Ala Ala Ala Met Met 405 410 415Lys Gly Asn Ala Gly
Lys Arg Leu Ile Asp Gly Ser Ile Glu Arg Ser 420 425 430Ile Lys Phe
Arg Lys Glu Ile Lys Arg Leu Lys Gly Glu Ser Asp Gly 435 440 445Trp
Phe Phe Asp Val Trp Gln Pro Glu His Ile Asp Gly Pro Glu Cys 450 455
460Trp Pro Leu Arg Ser Asp Ser Ala Trp His Gly Phe Lys Asn Ile
Asp465 470 475 480Asn Glu His Met Tyr Leu Asp Pro Ile Lys Val Thr
Leu Leu Thr Pro 485 490 495Gly Met Lys Lys Asp Gly Thr Met Asp Asp
Phe Gly Ile Pro Ala Ser 500 505 510Ile Val Ala Lys Tyr Leu Asp Glu
His Gly Ile Val Val Glu Lys Thr 515 520 525Gly Pro Tyr Asn Leu Leu
Phe Leu Phe Ser Ile Gly Ile Asp Lys Thr 530 535 540Lys Ala Leu Ser
Leu Leu Arg Ala Leu Thr Asp Phe Lys Arg Ala Phe545 550 555 560Asp
Leu Asn Leu Arg Val Lys Asn Met Leu Pro Ser Leu Tyr Arg Glu 565 570
575Asp Pro Glu Phe Tyr Glu Asn Met Arg Ile Gln Asp Leu Ala Gln Asn
580 585 590Ile His Lys Leu Ile Glu His His Asn Leu Pro Asp Leu Met
Phe Arg 595 600 605Ala Phe Glu Val Leu Pro Ser Met Val Met Thr Pro
Tyr Ala Ala Phe 610 615 620Gln Lys Glu Leu His Gly Gln Thr Glu Glu
Val Tyr Leu Glu Glu Met625 630 635 640Val Gly Arg Val Asn Ala Asn
Met Ile Leu Pro Tyr Pro Pro Gly Val 645 650 655Pro Leu Val Met Pro
Gly Glu Met Ile Thr Glu Glu Ser Arg Pro Val 660 665 670Leu Glu Phe
Leu Gln Met Leu Cys Glu Ile Gly Ala His Tyr Pro Gly 675 680 685Phe Glu Thr Asp
Ile His Gly Ala Tyr Arg Gln Ala Asp Gly Arg Tyr 690 695 700Thr Val
Lys Val Leu Lys Glu Glu Asn Asn Lys705 710 715 5 714 PRT Salmonella
enterica 5 Met Asn Val Ile Ala Ile Met Asn His Met Gly Val Tyr Phe
Lys Glu1 5 10 15Glu Pro Ile Arg Glu Leu His Arg Ala Leu Glu Gly Leu
Asn Phe Arg 20 25 30Ile Val Tyr Pro Asn Asp Arg Glu Asp Leu Leu Lys
Leu Ile Glu Asn 35 40 45Asn Ser Arg Leu Cys Gly Val Ile Phe Asp Trp
Asp Lys Tyr Asn Leu 50 55 60Glu Leu Cys Glu Glu Ile Ser Lys Leu Asn
Glu Tyr Met Pro Leu Tyr65 70 75 80Ala Phe Ala Asn Ser Tyr Ser Thr
Leu Asp Val Ser Leu Asn Asp Leu 85 90 95Arg Met Gln Val Arg Phe Phe
Glu Tyr Ala Leu Gly Ala Ala Thr Asp 100 105 110Ile Ala Ala Lys Ile
Arg Gln Asn Thr Asp Glu Tyr Ile Asp Asn Ile 115 120 125Leu Pro Pro
Leu Thr Lys Ala Leu Phe Lys Tyr Val Arg Glu Gly Lys 130 135 140Tyr
Thr Phe Cys Thr Pro Gly His Met Gly Gly Thr Ala Phe Gln Lys145 150
155 160Ser Pro Val Gly Ser Ile Phe Tyr Asp Phe Phe Gly Pro Asn Thr
Met 165 170 175Lys Ser Asp Ile Ser Ile Ser Val Ser Glu Leu Gly Ser
Leu Leu Asp 180 185 190His Ser Gly Pro His Lys Glu Ala Glu Glu Tyr
Ile Ala Arg Val Phe 195 200 205Asn Ala Glu Arg Ser Tyr Met Val Thr
Asn Gly Thr Ser Thr Ala Asn 210 215 220Lys Ile Val Gly Met Tyr Ser
Ala Pro Ala Gly Ser Thr Val Leu Ile225 230 235 240Asp Arg Asn Cys
His Lys Ser Leu Thr His Leu Met Met Met Ser Asp 245 250 255Ile Thr
Pro Ile Tyr Phe Arg Pro Thr Arg Asn Ala Tyr Gly Ile Leu 260 265
270Gly Gly Ile Pro Gln Ser Glu Phe Gln His Ala Thr Ile Ala Lys Arg
275 280 285Val Lys Glu Thr Pro Asn Ala Thr Trp Pro Val His Ala Val
Ile Thr 290 295 300Asn Ser Thr Tyr Asp Gly Leu Leu Tyr Asn Thr Asp
Tyr Ile Lys Lys305 310 315 320Thr Leu Asp Val Lys Ser Ile His Phe
Asp Ser Ala Trp Val Pro Tyr 325 330 335Thr Asn Phe Ser Pro Ile Tyr
Gln Gly Lys Cys Gly Met Ser Gly Asp 340 345 350Arg Val Glu Gly Lys
Ile Ile Tyr Glu Thr Gln Ser Thr His Lys Leu 355 360 365Leu Ala Ala
Phe Ser Gln Ala Ser Met Ile His Val Lys Gly Asp Ile 370 375 380Asn
Glu Glu Thr Phe Asn Glu Ala Tyr Met Met His Thr Thr Thr Ser385 390
395 400Pro His Tyr Gly Ile Val Ala Ser Thr Glu Thr Ala Ala Ala Met
Met 405 410 415Lys Gly Asn Ala Gly Lys Arg Leu Ile Asn Gly Ser Ile
Glu Arg Ala 420 425 430Ile Lys Phe Arg Lys Glu Ile Lys Arg Leu Lys
Ser Glu Ser Asp Gly 435 440 445Trp Phe Phe Asp Val Trp Gln Pro Glu
His Ile Asp Gly Ala Glu Cys 450 455 460Trp Pro Leu Arg Ser Asp Ser
Ala Trp His Gly Phe Lys Asn Ile Asp465 470 475 480Asn Glu His Met
Tyr Leu Asp Pro Ile Lys Val Thr Ile Leu Thr Pro 485 490 495Gly Met
Lys Lys Asp Gly Thr Met Asp Glu Phe Gly Ile Pro Ala Ser 500 505
510Leu Val Ala Lys Tyr Leu Asp Glu Arg Gly Ile Ile Val Glu Lys Thr
515 520 525Gly Pro Tyr Asn Leu Leu Phe Leu Phe Ser Ile Gly Ile Asp
Lys Thr 530 535 540Lys Ala Leu Ser Leu Leu Arg Ala Leu Thr Glu Phe
Lys Arg Ala Phe545 550 555 560Asp Leu Asn Leu Arg Val Lys Asn Ile
Leu Pro Ala Leu Tyr Arg Glu 565 570 575Ala Pro Glu Phe Tyr Glu Asn
Met Arg Ile Gln Glu Leu Ala Gln Asn 580 585 590Ile His Lys Leu Val
Glu His His Asn Leu Pro Asp Leu Met Tyr Arg 595 600 605Ala Phe Glu
Val Leu Pro Lys Met Val Met Thr Pro Tyr Thr Ala Phe 610 615 620Gln
Lys Glu Leu His Gly Glu Thr Glu Glu Val Tyr Leu Glu Glu Met625 630
635 640Val Gly Arg Val Asn Ala Asn Met Ile Leu Pro Tyr Pro Pro Gly
Val 645 650 655Pro Leu Val Met Pro Gly Glu Met Ile Thr Glu Glu Ser
Arg Pro Val 660 665 670Leu Glu Phe Leu Gln Met Leu Cys Glu Ile Gly
Ala His Tyr Pro Gly 675 680 685Phe Glu Thr Asp Ile His Gly Ala Tyr
Arg Gln Ala Asp Gly Arg Tyr 690 695 700Thr Val Lys Val Leu Lys Glu
Asn Thr Lys705 710 6 52 DNA Artificial Sequence Primer cadA-F
6ggcgagctca cacaggaaac agaccatgaa cgttattgca atattgaatc ac
52728DNAArtificial SequencePrimer cadA-R 7ggctctagac cacttccctt
gtacgagc 28844DNAArtificial SequencePrimer cadA-F2 8atttcacaca
ggaaacagct atgaacgtta ttgcaatatt gaat 44920DNAArtificial
SequencePrimer cadA-R2 9agctgtttcc tgtgtgaaat 201033DNAArtificial
SequencePrimer E291A-F 10gcgcgtgaaa agcacaccaa acgcaacctg gcc
331132DNAArtificial SequencePrimer E291A-R 11cgtttggtgt gcttttcacg
cgcttagcaa tg 321240DNAArtificial SequencePrimer E291C-F
12aagcgcgtga aatgtacacc aaacgcaacc tggccggtac 401340DNAArtificial
SequencePrimer E291C-R 13gttgcgtttg gtgtacattt cacgcgctta
gcaatggtag 401433DNAArtificial SequencePrimer E291D-F 14gcgcgtgaaa
gatacaccaa acgcaacctg gcc 331532DNAArtificial SequencePrimer
E291D-R 15cgtttggtgt atctttcacg cgcttagcaa tg 321633DNAArtificial
SequencePrimer E291F-F 16gcgcgtgaaa ttcacaccaa acgcaacctg gcc
331732DNAArtificial SequencePrimer E291F-R 17cgtttggtgt gaatttcacg
cgcttagcaa tg 321833DNAArtificial SequencePrimer E291G-F
18gcgcgtgaaa ggtacaccaa acgcaacctg gcc 331932DNAArtificial
SequencePrimer E291G-R 19cgtttggtgt acctttcacg cgcttagcaa tg
322033DNAArtificial SequencePrimer E291H-F 20gcgcgtgaaa catacaccaa
acgcaacctg gcc 332132DNAArtificial SequencePrimer E291H-R
21cgtttggtgt atgtttcacg cgcttagcaa tg 322233DNAArtificial
SequencePrimer E291I-F 22gcgcgtgaaa atcacaccaa acgcaacctg gcc
332332DNAArtificial SequencePrimer E291I-R 23cgtttggtgt gattttcacg
cgcttagcaa tg 322433DNAArtificial SequencePrimer E291K-F
24gcgcgtgaaa aagacaccaa acgcaacctg gcc 332532DNAArtificial
SequencePrimer E291K-R 25cgtttggtgt ctttttcacg cgcttagcaa tg
322633DNAArtificial SequencePrimer E291L-F 26gcgcgtgaaa ctaacaccaa
acgcaacctg gcc 332732DNAArtificial SequencePrimer E291L-R
27cgtttggtgt tagtttcacg cgcttagcaa tg 322833DNAArtificial
SequencePrimer E291M-F 28gcgcgtgaaa atgacaccaa acgcaacctg gcc
332932DNAArtificial SequencePrimer E291M-R 29cgtttggtgt cattttcacg
cgcttagcaa tg 323033DNAArtificial SequencePrimer E291N-F
30gcgcgtgaaa aacacaccaa acgcaacctg gcc 333132DNAArtificial
SequencePrimer E291N-R 31cgtttggtgt gtttttcacg cgcttagcaa tg
323233DNAArtificial SequencePrimer E291P-F 32gcgcgtgaaa ccaacaccaa
acgcaacctg gcc 333332DNAArtificial SequencePrimer E291P-R
33cgtttggtgt tggtttcacg cgcttagcaa tg 323433DNAArtificial
SequencePrimer E291Q-F 34gcgcgtgaaa caaacaccaa acgcaacctg gcc
333532DNAArtificial SequencePrimer E291Q-R 35cgtttggtgt ttgtttcacg
cgcttagcaa tg 323633DNAArtificial SequencePrimer E291R-F
36gcgcgtgaaa cgtacaccaa acgcaacctg gcc 333732DNAArtificial
SequencePrimer E291R-R 37cgtttggtgt acgtttcacg cgcttagcaa tg
323833DNAArtificial SequencePrimer E291S-F 38gcgcgtgaaa tcaacaccaa
acgcaacctg gcc 333932DNAArtificial SequencePrimer E291S-R
39cgtttggtgt tgatttcacg cgcttagcaa tg 324033DNAArtificial
SequencePrimer E291T-F 40gcgcgtgaaa actacaccaa acgcaacctg gcc
334132DNAArtificial SequencePrimer E291T-R 41cgtttggtgt agttttcacg
cgcttagcaa tg 324233DNAArtificial SequencePrimer E291V-F
42gcgcgtgaaa gtaacaccaa acgcaacctg gcc 334332DNAArtificial
SequencePrimer E291V-R 43cgtttggtgt tactttcacg cgcttagcaa tg
324433DNAArtificial SequencePrimer E291W-F 44gcgcgtgaaa tggacaccaa
acgcaacctg gcc 334532DNAArtificial SequencePrimer E291W-R
45cgtttggtgt ccatttcacg cgcttagcaa tg 324633DNAArtificial
SequencePrimer E291Y-F 46gcgcgtgaaa tacacaccaa acgcaacctg gcc
334732DNAArtificial SequencePrimer E291Y-R 47cgtttggtgt gtatttcacg
cgcttagcaa tg 324840DNAArtificial SequencePrimer E355A-F
48cggtggccgt gtagcaggga aagtgattta cgaaacccag 404940DNAArtificial
SequencePrimer E355A-R 49aatcactttc cctgctacac ggccaccgct
cataccgcat 405040DNAArtificial SequencePrimer E355C-F 50cggtggccgt
gtatgtggga aagtgattta cgaaacccag 405140DNAArtificial SequencePrimer
E355C-R 51aatcactttc ccacatacac ggccaccgct cataccgcat
405240DNAArtificial SequencePrimer E355D-F 52cggtggccgt gtagacggga
aagtgattta cgaaacccag 405340DNAArtificial SequencePrimer E355D-R
53aatcactttc ccgtctacac ggccaccgct cataccgcat 405440DNAArtificial
SequencePrimer E355F-F 54cggtggccgt gtattcggga aagtgattta
cgaaacccag 405540DNAArtificial SequencePrimer E355F-R 55aatcactttc
ccgaatacac ggccaccgct cataccgcat 405640DNAArtificial SequencePrimer
E355G-F 56cggtggccgt gtaggaggga aagtgattta cgaaacccag
405740DNAArtificial SequencePrimer E355G-R 57aatcactttc cctcctacac
ggccaccgct cataccgcat 405840DNAArtificial SequencePrimer E355H-F
58cggtggccgt gtacatggga aagtgattta cgaaacccag 405940DNAArtificial
SequencePrimer E355H-R 59aatcactttc ccatgtacac ggccaccgct
cataccgcat 406040DNAArtificial SequencePrimer E355I-F 60cggtggccgt
gtaatcggga aagtgattta cgaaacccag 406140DNAArtificial SequencePrimer
E355I-R 61aatcactttc ccgattacac ggccaccgct cataccgcat
406240DNAArtificial SequencePrimer E355K-F 62cggtggccgt gtaaaaggga
aagtgattta cgaaacccag 406340DNAArtificial SequencePrimer E355K-R
63aatcactttc ccttttacac ggccaccgct cataccgcat 406440DNAArtificial
SequencePrimer E355L-F 64cggtggccgt gtactgggga aagtgattta
cgaaacccag 406540DNAArtificial SequencePrimer E355L-R 65aatcactttc
cccagtacac ggccaccgct cataccgcat 406640DNAArtificial SequencePrimer
E355M-F 66cggtggccgt gtaatgggga aagtgattta cgaaacccag
406740DNAArtificial SequencePrimer E355M-R 67aatcactttc cccattacac
ggccaccgct cataccgcat 406840DNAArtificial SequencePrimer E355N-F
68cggtggccgt gtaaacggga aagtgattta cgaaacccag 406940DNAArtificial
SequencePrimer E355N-R 69aatcactttc ccgtttacac ggccaccgct
cataccgcat 407040DNAArtificial SequencePrimer E355P-F 70cggtggccgt
gtaccaggga aagtgattta cgaaacccag 407140DNAArtificial SequencePrimer
E355P-R 71aatcactttc cctggtacac ggccaccgct cataccgcat
407240DNAArtificial SequencePrimer E355Q-F 72cggtggccgt gtacaaggga
aagtgattta cgaaacccag 407340DNAArtificial SequencePrimer E355Q-R
73aatcactttc ccttgtacac ggccaccgct cataccgcat 407440DNAArtificial
SequencePrimer E355R-F 74cggtggccgt gtacgtggga aagtgattta
cgaaacccag 407540DNAArtificial SequencePrimer E355R-R 75aatcactttc
ccacgtacac ggccaccgct cataccgcat 407640DNAArtificial SequencePrimer
E355S-F 76cggtggccgt gtatcaggga aagtgattta cgaaacccag
407740DNAArtificial SequencePrimer E355S-R 77aatcactttc cctgatacac
ggccaccgct cataccgcat 407840DNAArtificial SequencePrimer E355T-F
78cggtggccgt gtaacaggga aagtgattta cgaaacccag 407940DNAArtificial
SequencePrimer E355T-R 79aatcactttc cctgttacac ggccaccgct cataccgcat
408040DNAArtificial SequencePrimer E355V-F 80cggtggccgt gtagtaggga
aagtgattta cgaaacccag 408140DNAArtificial SequencePrimer E355V-R
81aatcactttc cctactacac ggccaccgct cataccgcat 408240DNAArtificial
SequencePrimer E355W-F 82cggtggccgt gtatggggga aagtgattta
cgaaacccag 408340DNAArtificial SequencePrimer E355W-R 83aatcactttc
ccccatacac ggccaccgct cataccgcat 408440DNAArtificial SequencePrimer
E355Y-F 84cggtggccgt gtatacggga aagtgattta cgaaacccag
408540DNAArtificial SequencePrimer E355Y-R 85aatcactttc ccgtatacac
ggccaccgct cataccgcat 408640DNAArtificial SequencePrimer E482A-F
86catcgataac gcccacatgt atcttgaccc gatcaaagtc 408739DNAArtificial
SequencePrimer E482A-R 87gatacatgtg ggcgttatcg atgtttttga agccgtgcc
398840DNAArtificial SequencePrimer E482C-F 88catcgataac tgccacatgt
atcttgaccc gatcaaagtc 408939DNAArtificial SequencePrimer E482C-R
89gatacatgtg gcagttatcg atgtttttga agccgtgcc 399040DNAArtificial
SequencePrimer E482D-F 90catcgataac gatcacatgt atcttgaccc
gatcaaagtc 409139DNAArtificial SequencePrimer E482D-R 91gatacatgtg
atcgttatcg atgtttttga agccgtgcc 399240DNAArtificial SequencePrimer
E482F-F 92catcgataac tttcacatgt atcttgaccc gatcaaagtc
409339DNAArtificial SequencePrimer E482F-R 93gatacatgtg aaagttatcg
atgtttttga agccgtgcc 399440DNAArtificial SequencePrimer E482G-F
94catcgataac ggtcacatgt atcttgaccc gatcaaagtc 409539DNAArtificial
SequencePrimer E482G-R 95gatacatgtg accgttatcg atgtttttga agccgtgcc
399640DNAArtificial SequencePrimer E482H-F 96catcgataac catcacatgt
atcttgaccc gatcaaagtc 409739DNAArtificial SequencePrimer E482H-R
97gatacatgtg atggttatcg atgtttttga agccgtgcc 399840DNAArtificial
SequencePrimer E482I-F 98catcgataac attcacatgt atcttgaccc
gatcaaagtc 409939DNAArtificial SequencePrimer E482I-R 99gatacatgtg
aatgttatcg atgtttttga agccgtgcc 3910040DNAArtificial SequencePrimer
E482K-F 100catcgataac aaacacatgt atcttgaccc gatcaaagtc
4010139DNAArtificial SequencePrimer E482K-R 101gatacatgtg
tttgttatcg atgtttttga agccgtgcc 3910240DNAArtificial SequencePrimer
E482L-F 102catcgataac ctgcacatgt atcttgaccc gatcaaagtc
4010339DNAArtificial SequencePrimer E482L-R 103gatacatgtg
caggttatcg atgtttttga agccgtgcc 3910440DNAArtificial SequencePrimer
E482M-F 104catcgataac atgcacatgt atcttgaccc gatcaaagtc
4010539DNAArtificial SequencePrimer E482M-R 105gatacatgtg
catgttatcg atgtttttga agccgtgcc 3910640DNAArtificial SequencePrimer
E482N-F 106catcgataac aaccacatgt atcttgaccc gatcaaagtc
4010739DNAArtificial SequencePrimer E482N-R 107gatacatgtg
gttgttatcg atgtttttga agccgtgcc 3910840DNAArtificial SequencePrimer
E482P-F 108catcgataac ccgcacatgt atcttgaccc gatcaaagtc
4010939DNAArtificial SequencePrimer E482P-R 109gatacatgtg
cgggttatcg atgtttttga agccgtgcc 3911040DNAArtificial SequencePrimer
E482Q-F 110catcgataac cagcacatgt atcttgaccc gatcaaagtc
4011139DNAArtificial SequencePrimer E482Q-R 111gatacatgtg
ctggttatcg atgtttttga agccgtgcc 3911240DNAArtificial SequencePrimer
E482R-F 112catcgataac cgtcacatgt atcttgaccc gatcaaagtc
4011339DNAArtificial SequencePrimer E482R-R 113gatacatgtg
acggttatcg atgtttttga agccgtgcc 3911440DNAArtificial SequencePrimer
E482S-F 114catcgataac agccacatgt atcttgaccc gatcaaagtc
4011539DNAArtificial SequencePrimer E482S-R 115gatacatgtg
gctgttatcg atgtttttga agccgtgcc 3911640DNAArtificial SequencePrimer
E482T-F 116catcgataac acccacatgt atcttgaccc gatcaaagtc
4011739DNAArtificial SequencePrimer E482T-R 117gatacatgtg
ggtgttatcg atgtttttga agccgtgcc 3911840DNAArtificial SequencePrimer
E482V-F 118catcgataac gtgcacatgt atcttgaccc gatcaaagtc
4011939DNAArtificial SequencePrimer E482V-R 119gatacatgtg
cacgttatcg atgtttttga agccgtgcc 3912040DNAArtificial SequencePrimer
E482W-F 120catcgataac tggcacatgt atcttgaccc gatcaaagtc
4012139DNAArtificial SequencePrimer E482W-R 121gatacatgtg
ccagttatcg atgtttttga agccgtgcc 3912240DNAArtificial SequencePrimer
E482Y-F 122catcgataac tatcacatgt atcttgaccc gatcaaagtc
4012339DNAArtificial SequencePrimer E482Y-R 123gatacatgtg
atagttatcg atgtttttga agccgtgcc 39
* * * * *